Changelog
This is a live copy of the workspace CHANGELOG.md. Releases are tracked in the format documented at Keep a Changelog and follow Semantic Versioning.
All notable changes to the Fovea project will be documented in this file.
The format is based on Keep a Changelog, and this project adheres to Semantic Versioning.
[0.5.11] - 2026-07-02
The 0.5.11 patch is the third and final release remediating the second swarm audit — the model-service slice — fixing event-loop starvation, ffmpeg subprocess leaks, a shared-model inference race, and a DNS-rebinding SSRF gap in the Python inference service (#198). Nothing is breaking. This completes the second-audit three-patch series (0.5.9 backend, 0.5.10 frontend, 0.5.11 model-service).
Fixed
- The summarize-with-audio pipeline ran Whisper transcription and pyannote diarization directly on the asyncio event loop (the 0.5.8 offload reached the standalone /transcribe and /diarize routes and the VLM path but missed the summarizer's own audio path), so a POST /summarize with audio froze the /health probe, metric export, and every concurrent request for the tens of seconds to minutes the inference took. The load/transcribe/diarize/unload sequence now runs on a worker thread via asyncio.to_thread (model-service/src/infrastructure/adapters/outbound/transcriber_whisper.py).
- Two ffmpeg extraction helpers on the transcription hot path (extract_audio_track, extract_audio_segment) awaited a timeout that cancelled only the coroutine, leaving the ffmpeg child running as an orphan; they now go through the shared kill-and-reap helper so the timeout path terminates and reaps the subprocess (model-service/src/application/services/audio_processing.py).
- The video processor's extract_audio and extract_thumbnail awaited ffmpeg with no timeout at all, so a malformed or truncated video could wedge the request forever and leak the child; both now use a bounded timeout that kills and reaps on expiry (model-service/src/infrastructure/adapters/outbound/video/processor.py).
- The standalone transcribe and diarize routes fetched the single cached model instance and ran its native inference in a worker thread with no serialization, so two concurrent requests called the same non-thread-safe object (faster-whisper / pyannote) at once, risking garbled output or a crash inside the native code. Inference is now serialized per task type by an asyncio.Lock, while distinct tasks still run in parallel (model-service/src/infrastructure/adapters/inbound/fastapi/routes/transcribe.py, diarize.py, inference_locks.py).
- The video downloader's SSRF allow-list resolved the host and required a public IP but then fetched by hostname, so DNS could be rebound between the check and the connect (a TOCTOU). The connection is now pinned to the address vetted during preflight while still presenting the hostname for TLS, so a rebind cannot redirect it to a private or internal host (model-service/src/infrastructure/adapters/outbound/video/downloader.py).
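The per-task serialization described in the list above can be sketched as follows. This is a minimal illustration, not the code in inference_locks.py: InferenceLocks, run_inference, and FakeModel are hypothetical names, and the sketch assumes Python 3.9 or later for asyncio.to_thread.

```python
import asyncio
import threading
import time


class InferenceLocks:
    """Hypothetical registry holding one asyncio.Lock per task type."""

    def __init__(self):
        self._locks = {}

    def for_task(self, task):
        # Create the lock lazily, inside the running event loop.
        if task not in self._locks:
            self._locks[task] = asyncio.Lock()
        return self._locks[task]


async def run_inference(locks, task, blocking_fn, *args):
    """Run blocking_fn on a worker thread, one call at a time per task type."""
    async with locks.for_task(task):
        # The event loop stays free while the worker thread does the work.
        return await asyncio.to_thread(blocking_fn, *args)


class FakeModel:
    """Stand-in for a non-thread-safe model; records peak concurrent callers."""

    def __init__(self):
        self.active = 0
        self.peak = 0
        self._guard = threading.Lock()

    def infer(self, text):
        with self._guard:
            self.active += 1
            self.peak = max(self.peak, self.active)
        time.sleep(0.05)  # simulate blocking native inference
        with self._guard:
            self.active -= 1
        return text.upper()


async def demo():
    locks = InferenceLocks()
    model = FakeModel()
    calls = (run_inference(locks, "transcribe", model.infer, s) for s in "abc")
    results = await asyncio.gather(*calls)
    return results, model.peak


if __name__ == "__main__":
    print(asyncio.run(demo()))
```

Because each task type has its own lock, a call made under a different key (for example "diarize") would not wait on the "transcribe" lock, which matches the "distinct tasks still run in parallel" behavior noted above.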
[0.5.10] - 2026-07-01
The 0.5.10 patch is the second of three releases remediating the second swarm audit — the frontend slice — fixing auto-save data-loss and stale-cache defects in the annotation UI (#196). Several were gaps in the first audit's own frontend fixes, which closed one instance of a defect class without closing every instance. Nothing is breaking.
Fixed
Auto-save no longer drops edits
- A forced save issued while another save was already in flight was silently dropped: the dialog's Done/Escape/backdrop flush, a keyframe override, and the session-expiry emergency flush could each return as if the write succeeded while persisting nothing. A forced save is now parked and drained once the in-flight save settles, and it reports whether it actually wrote (annotation-tool/src/hooks/data/useAutoSave.ts).
- Switching the persona in an open video-summary editor dropped the unsaved edits made under the previous persona and, in the render window where the id had already changed but the loaded content had not, could persist one persona's text against another. The dialog now flushes the editor before switching and keys the editor by persona so it remounts cleanly for the new one (annotation-tool/src/components/video/VideoSummaryDialog.tsx, VideoSummaryEditor.tsx).
- The session-expiry emergency flush counted a no-op forced save (blocked by an in-flight save, skipped by change detection, or superseded) as a successful save, over-reporting preservation at the moment it matters most. It now counts only editors that actually persisted (annotation-tool/src/hooks/data/autoSaveRegistry.ts, useAutoSave.ts).
- Opening the editor fired a redundant auto-save of the just-loaded content, because the change baseline started empty and the initial sync looked like an edit. The editor now seeds the baseline to the adopted server content, so the first debounce tick sees no change (annotation-tool/src/hooks/data/useAutoSave.ts, VideoSummaryEditor.tsx).
Stale caches refreshed after mutations
- Deleting an ontology type from the persona editor left open annotation views and world panels showing annotations and per-persona type-assignments the server had already removed, because the delete hooks wired to the buttons never invalidated the annotation and world caches (only their unused twins did). Those four hooks now invalidate both (annotation-tool/src/store/queries/usePersonas.ts).
- A world collection or relation the user had just deleted could reappear after a quick follow-up add or edit: the add path built its whole-array PUT from a cache that still held the deleted object, and the merge-by-id server re-created it. The delete hooks now strip the object from the cache before any following write (annotation-tool/src/store/queries/useWorld.ts).
- Creating or deleting a project-scoped persona did not refresh the project's persona list, so the project detail page showed a stale roster within its two-minute cache window. Both mutations now invalidate the project-personas cache (annotation-tool/src/store/queries/usePersonas.ts).
- Accepting several AI ontology suggestions at once, or importing several Wikidata items quickly, left the ontology editor showing only a subset of the persisted types: the concurrent optimistic writes read the same base snapshot and the last one won. The add/update hooks now invalidate the per-persona ontology so the authoritative server state (merged by id) is refetched once the concurrent writes settle (annotation-tool/src/store/queries/usePersonas.ts).
[0.5.9] - 2026-06-30
The 0.5.9 patch is the first of three releases remediating a second swarm audit of the shipped 0.5.x line — the backend slice — fixing authorization, lost-update concurrency, idempotency, and data-fidelity defects in the Fastify server (#194). Roughly a third of the findings were gaps in the first audit's own fixes: a fix addressed specific instances of a defect class without closing every instance. Nothing is breaking.
Fixed
Authorization
- The /api/models/* proxy routes were unauthenticated: any caller could read the model configuration and, worse, drive the shared inference service (selecting, loading, or unloading a model affects every user) with no credentials. The routes now require authentication, and the state-changing select/load/unload operations additionally require admin (server/src/app.ts, server/src/routes/models.ts).
- A persona could be attached to a project by a non-member, since the create ability passed for any self-owned persona regardless of the target project. Project-scoped persona creation now verifies project membership and returns 403 otherwise (server/src/services/persona-service.ts).
- The import "replace" paths overwrote or deleted an existing annotation, persona, summary, claim, or claim-relation found by id with no instance-level check, so a crafted import could clobber another user's row. Each replace now requires update (or delete) on the existing row, mirroring the ontology-import guard (server/src/services/import/entity-importers.ts).
Concurrency — lost-update hardening
- The optimistic-concurrency guard on world state and ontologies compared on updatedAt, which two writes landing in the same millisecond may share, letting the second silently overwrite the first. Both tables now carry a monotonic version column, and every guarded write compares-and-swaps on it; guard exhaustion now returns 409 and a missing row 404 rather than a generic 500 (server/prisma/migrations/20260630120000_add_optimistic_version_and_admin_apikey_uniqueness, server/src/repositories/{WorldStateRepository,ProjectRepository,PersonaRepository}.ts, server/src/services/world-state-service.ts).
- Persona type-deletion and persona deletion wrote the annotation delete, the ontology gloss cleanup, and the world-state cleanup as separate unguarded statements, so a partial failure left them disagreeing and a concurrent edit could be clobbered. Each now runs in a single transaction with the ontology and world-state writes routed through the version guard, recomputing from a fresh read (server/src/services/persona-service.ts).
- The admin "clear world state" path wrote through a bare id update that could half-survive a concurrent write; it now clears through the version guard (server/src/services/world-state-service.ts).
- First-access creation of a project world state raced into the compound unique constraint and surfaced a 500; it now upserts the empty row and merges through the guard (server/src/services/project-service.ts).
- The claim-extraction worker re-saved claims on every run with no dedup, so a job retry or double-submit duplicated every claim for a summary. Re-extraction is now idempotent: the prior extracted claims are replaced inside one transaction, preserving manually authored claims (server/src/queues/setup.ts).
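The version-based compare-and-swap from the first item in the list above can be illustrated with a small sketch. The server itself is TypeScript on Prisma and Postgres; Python with an in-memory SQLite database is used here only as a stand-in, and the table layout, guarded_update, VersionConflict, and NotFound are hypothetical names. The sketch also makes a single attempt, whereas the changelog implies the real guard retries before giving up.

```python
import sqlite3


class VersionConflict(Exception):
    """The row changed since it was read (would map to HTTP 409)."""


class NotFound(Exception):
    """The row does not exist (would map to HTTP 404)."""


def make_db():
    """Create an in-memory table with a monotonic version column."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE world_state ("
        "id TEXT PRIMARY KEY, state TEXT, version INTEGER NOT NULL DEFAULT 0)"
    )
    conn.execute("INSERT INTO world_state (id, state) VALUES ('w1', '{}')")
    return conn


def guarded_update(conn, row_id, expected_version, new_state):
    """Write only if the stored version still equals expected_version."""
    cur = conn.execute(
        "UPDATE world_state SET state = ?, version = version + 1 "
        "WHERE id = ? AND version = ?",
        (new_state, row_id, expected_version),
    )
    if cur.rowcount == 1:
        return expected_version + 1  # the version the caller now holds
    # Zero rows updated: tell a missing row apart from a lost race.
    exists = conn.execute(
        "SELECT 1 FROM world_state WHERE id = ?", (row_id,)
    ).fetchone()
    if exists is None:
        raise NotFound(row_id)
    raise VersionConflict(row_id)
```

Unlike a timestamp comparison, two writers that both read version 0 cannot both succeed: the first bumps the row to version 1, so the second writer's WHERE clause matches nothing.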
Idempotency and conflict status
- Several create and update paths surfaced a duplicate as a 500 rather than a 409: the self-service profile email/username update, project and group membership adds, and admin API keys. Each now maps the unique-constraint violation to a 409 (server/src/routes/{users,groups}.ts, server/src/services/{project-service,api-key-service}.ts).
- Admin API keys were unconstrained, since the compound @@unique([userId, provider]) does not bind rows with a null userId in Postgres, so the advertised 409 never fired. A partial unique index now enforces one admin key per provider, deduping any existing duplicates first (server/prisma/migrations/20260630120000_add_optimistic_version_and_admin_apikey_uniqueness).
- A claim create that lost the same-id race re-threw the raw unique violation as a 500 when the existing row was not updatable; it now returns 403, mirroring the pre-create authorization check (server/src/services/claim-service.ts).
- A service-layer Zod validation failure (one that bypasses Fastify's schema validation) fell through to a generic 500; it now returns 400 with the field-level issues (server/src/app.ts).
- The ontology-save handler returned a bespoke 500 that leaked the raw error message to the client; it now re-throws to the global handler for a safe generic 500, keeping the detail in the logs (server/src/routes/ontology.ts).
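The null-userId gap in the admin API key item above is easy to reproduce in miniature. The sketch below uses Python with in-memory SQLite as a stand-in for the Postgres schema (both treat NULLs as distinct in a unique constraint by default); the table, index, and function names are hypothetical and are not taken from the migration.

```python
import sqlite3


def make_db(with_partial_index):
    """Build a toy api_key table, optionally with the partial unique index."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE api_key ("
        "user_id TEXT, provider TEXT NOT NULL, UNIQUE (user_id, provider))"
    )
    if with_partial_index:
        # Only rows with a NULL user_id (admin keys) are covered by this index.
        conn.execute(
            "CREATE UNIQUE INDEX one_admin_key_per_provider "
            "ON api_key (provider) WHERE user_id IS NULL"
        )
    return conn


def create_admin_key(conn, provider):
    """Insert an admin key; return an HTTP-style status code."""
    try:
        conn.execute(
            "INSERT INTO api_key (user_id, provider) VALUES (NULL, ?)",
            (provider,),
        )
    except sqlite3.IntegrityError:
        return 409  # unique violation mapped to Conflict
    return 201


if __name__ == "__main__":
    loose = make_db(with_partial_index=False)
    print(create_admin_key(loose, "openai"), create_admin_key(loose, "openai"))
    strict = make_db(with_partial_index=True)
    print(create_admin_key(strict, "openai"), create_admin_key(strict, "openai"))
```

Without the partial index the compound constraint accepts both inserts, because two NULL user ids never compare equal; with it, the second insert for the same provider is rejected.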
Data fidelity
- Forking a shared video summary copied the summary body but dropped every extracted claim, leaving the fork's claim view empty. The fork now deep-copies the claim tree under fresh ids and carries the denormalized claimsJson, re-pointing every claim reference (row ids, parent links, and the references embedded in glosses) at the new ids (server/src/routes/sharing.ts).
- Cross-summary import conflict detection compared relation ids against the collection id set and read the world-state row with a narrower filter than the writer used, so it could miss or misreport conflicts. Both now key on the correct row and the relation id set (server/src/services/import/conflict-resolver.ts, server/src/services/import-handler.ts).
- A merge-keyframes import resolution on an existing annotation fell through to an unconditional create and collided on the primary key; the create is now guarded by an existence check (server/src/services/import/entity-importers.ts).
Demo and cache hygiene
- The idle-demo-user sweeper deleted users but never evicted their cached abilities, leaking memory as anonymous demo sessions churned; it now invalidates each swept user's ability cache (server/src/demo/idle-reset.ts).
- The anonymous-session mint endpoint had neither a per-route rate limit nor a population ceiling, so it could be driven to create users faster than the sweeper reclaims them; it now rate-limits per IP and refuses once a live-anonymous-user ceiling is reached (server/src/demo/anonymous-session.ts).
- Admin user creation accepted an independent systemRole that could diverge from isAdmin (CASL keys on the former, the admin middleware on the latter); the two are now coupled. Admin user deletion now invalidates the deleted user's cached abilities (server/src/routes/users.ts).
[0.5.8] - 2026-06-30
The 0.5.8 patch is the third and final audit-driven hardening release — the model-service slice — fixing resource and concurrency defects in the Python inference service (#192). Nothing is breaking.
Fixed
- The /detection/detect and /tracking/track routes leaked an OpenCV VideoCapture handle (and decoder memory) whenever a frame read or mask decode failed, since the capture was released only on the success path. The capture is now released in a try/finally on every exit path, and an open failure raises a clean error (model-service/.../routes/detection.py, tracking.py).
- Synchronous VLM, transcription, and diarization inference ran on the asyncio event loop, so a single request froze all concurrent requests, the /health probe, and telemetry export until it finished. The blocking calls are now offloaded with asyncio.to_thread (model-service/.../use_cases/summarize_video.py, routes/transcribe.py, routes/diarize.py).
- ffprobe/ffmpeg subprocesses were awaited with no timeout, so a malformed or slow media file could wedge a request indefinitely and leak the child process. These calls now go through a shared helper that bounds the wait and kills and reaps the process on timeout (model-service/.../services/audio_processing.py).
- ModelManager had no concurrency guard, so two requests loading the same not-yet-loaded model could both pass the "already loaded?" check and double-load (risking OOM). Load, unload, and eviction are now serialized by a reentrant model lock with a post-acquire re-check (model-service/.../services/model_management.py).
- Id-based video resolution hardcoded /videos, returning 404 on any deployment whose video volume is mounted elsewhere; it now reads the configured video_data_root (model-service/.../use_cases/summarize_video.py).
- The admin reconfigure token is now compared with hmac.compare_digest (constant-time) instead of != (model-service/.../routes/admin.py).
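A bounded wait that kills and reaps the child on timeout, as described for the ffprobe/ffmpeg calls above, might look roughly like this. It is a sketch under assumptions rather than the shared helper itself: run_bounded is a hypothetical name, and a Python child process stands in for ffmpeg so the example runs anywhere.

```python
import asyncio
import sys


async def run_bounded(cmd, timeout):
    """Run cmd and return (returncode, stdout); kill and reap on timeout."""
    proc = await asyncio.create_subprocess_exec(
        *cmd,
        stdout=asyncio.subprocess.PIPE,
        stderr=asyncio.subprocess.DEVNULL,
    )
    try:
        stdout, _ = await asyncio.wait_for(proc.communicate(), timeout)
    except asyncio.TimeoutError:
        # Cancelling the await alone would leave the child running.
        try:
            proc.kill()
        except ProcessLookupError:
            pass  # the child exited between the timeout and the kill
        await proc.wait()  # reap the child so no zombie is left behind
        raise
    return proc.returncode, stdout


async def demo():
    ok = await run_bounded([sys.executable, "-c", "print('ok')"], timeout=10)
    try:
        await run_bounded(
            [sys.executable, "-c", "import time; time.sleep(30)"], timeout=0.5
        )
        timed_out = False
    except asyncio.TimeoutError:
        timed_out = True
    return ok, timed_out


if __name__ == "__main__":
    print(asyncio.run(demo()))
```

The important step is the kill followed by wait in the timeout branch: wait_for only cancels the coroutine that was awaiting the child, which is the orphan-process gap the changelog entries describe.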
[0.5.7] - 2026-06-30
The 0.5.7 patch is the second of three audit-driven hardening releases — the frontend slice — fixing data-loss and stale-cache defects in the annotation UI (#190). Nothing is breaking.
Fixed
Auto-Save No Longer Drops Edits
- A comment-only edit in the video summary editor was never auto-saved: the auto-save change-detection keyed on the summary body alone, so editing only the comment never armed the debounce and the edit was lost on dialog dismiss or navigation. (The 0.5.6 line's predecessor fix folded the comment into the comparison snapshot but did not make the debounce fire on it; a passing test masked the gap.) useAutoSave now keys change detection on the serialized comparison snapshot's VALUE, so an edit to any compared field, including a sibling field such as the comment, schedules a save (annotation-tool/src/hooks/data/useAutoSave.ts, VideoSummaryEditor.tsx).
- Dismissing the summary dialog with Escape or a click outside bypassed the save-on-close flow, dropping edits made in the last second. The dialog now routes Escape/backdrop dismiss through the same forceSave flush as the Done button (annotation-tool/src/components/video/VideoSummaryDialog.tsx).
- After a transient save failure, the in-progress guard was released immediately even though a retry was scheduled, allowing a second save to run concurrently with the retry (duplicate writes / last-writer-wins). The guard is now held through the backoff and released only on a terminal outcome (annotation-tool/src/hooks/data/useAutoSave.ts).
- The session-expiry emergency save was a no-op stub that logged but saved nothing. It now flushes every mounted editor's pending edits through a registry of their forceSave callbacks (annotation-tool/src/hooks/auth/useEmergencySave.ts, annotation-tool/src/hooks/data/autoSaveRegistry.ts).
Ontology Type Deletion Persists Again
- Deleting an entity, role, event, or relation type from a persona stopped taking effect after 0.5.6: the client deleted by PUTting the ontology with the type omitted, but 0.5.6 changed the ontology write to merge by id, so the omitted type was kept and re-appeared on the next refetch. These deletions now call the dedicated DELETE endpoint, which also cleans up gloss references, world assignments, and annotations (annotation-tool/src/store/queries/usePersonas.ts).
Stale Caches Refreshed After Mutations
- Creating and sharing a persona did not refresh the Sent Shares panel; a self-role change in a project or group left the list's own-role field stale; the summary save/generate/delete mutations skipped the batch summaries-lookup cache; and deleting an ontology type (or a persona) left annotation lists and world panels showing entries that no longer exist server-side. Each of these mutations now invalidates the additional query keys it affects (annotation-tool/src/store/queries/{usePersonas,useProjects,useGroups,useSummaries}.ts).
[0.5.6] - 2026-06-30
The 0.5.6 patch is the first of three audit-driven hardening releases — the backend slice — fixing a batch of authorization, data-integrity, idempotency, and concurrency defects surfaced by a code audit (#188). Nothing is breaking; the API additively gains 409 conflict responses on duplicate creates and the resource-fork now produces correctly-scoped resources.
Fixed
Authorization and Access Control
- The Bull Board queue dashboard (/admin/queues) was mounted with no authentication, exposing queued job payloads and Bull Board's mutating actions to any client that could reach the server. It is now gated behind an admin onRequest hook (server/src/app.ts).
- A password change did not revoke the user's existing sessions, so a stolen or shared token survived a reset. Both the self-service profile update and the admin user update now revoke all of the affected user's sessions (the self-update re-issues a fresh session so the acting client stays logged in) (server/src/routes/users.ts).
- The ontology importer overwrote any persona's ontology with no ownership check, letting any authenticated user clobber another user's ontology by uploading a line targeting their persona. The importer now enforces the same instance-level CASL update check as the rest of the app (server/src/services/import/entity-importers.ts).
- The claim-relation privacy filter on getRelations was a no-op (an OR over either endpoint, the known one of which is always readable), leaking the existence and metadata of relations to claims the caller cannot read. Each query now requires the opposite endpoint to be accessible (server/src/services/claim-service.ts).
- The corpus-manifest sync changed group memberships/roles and project ownership without invalidating the in-memory CASL ability cache, so a downgrade lingered until restart. The sync now clears the ability cache when it reconciles memberships or ownership (server/src/services/videoSync.ts).
- The last project_owner could be demoted (via change-role) or stranded (via admin user-deletion), leaving a project with no owner. Both paths now enforce the last-owner invariant (server/src/services/project-service.ts, server/src/routes/users.ts).
- Per-IP rate limiting trusted a fully spoofable X-Forwarded-For because trustProxy was true; it is now configurable via TRUST_PROXY and defaults to trusting a single proxy hop (server/src/app.ts, server/src/config.ts).
- The demo seed fixture loader interpolated an unconstrained tourId into a filename (path traversal); tourId is now constrained to a strict charset at the route boundary (server/src/demo/seed.ts).
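The last item in the list above, constraining tourId to a strict charset before it reaches a filename, can be sketched as below. The real loader is TypeScript; this Python version only illustrates the idea, and the allowed charset, the length cap, the .json suffix, and the fixture_path name are all assumptions, since the changelog does not state them.

```python
import re
from pathlib import Path

# Hypothetical allow-list: letters, digits, underscore, hyphen, 1 to 64 chars.
TOUR_ID_PATTERN = re.compile(r"[A-Za-z0-9_-]{1,64}")


def fixture_path(root, tour_id):
    """Return the fixture file for tour_id, rejecting anything off the allow-list."""
    if not TOUR_ID_PATTERN.fullmatch(tour_id):
        # Dots and slashes never match, so "../" sequences cannot get through.
        raise ValueError(f"invalid tourId: {tour_id!r}")
    return Path(root) / f"{tour_id}.json"
```

Validating at the boundary with an allow-list is simpler to reason about than trying to strip dangerous sequences after the fact, since nothing outside the listed characters can appear in the resulting path.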
Idempotency and Conflict Handling
- Creating a resource share, a claim relation, a user, or a group is now idempotent / conflict-safe: duplicate shares and claim relations no longer accumulate (a re-issue reuses the existing row), and a duplicate user email/username or group slug returns 409 Conflict instead of a 500. New uniqueness constraints back these (a ClaimRelation triple unique and partial unique indexes on ResourceShare and personal WorldState); a migration dedupes any pre-existing duplicates before adding them (server/src/routes/{sharing,users,groups,auth}.ts, server/src/services/{claim-service,project-service}.ts, server/prisma/migrations).
- A user could end up with two personal world-state rows (the compound unique did not constrain a NULL projectId); the personal get-or-create/update paths are now race-safe and merge on conflict, backed by a partial unique index (server/src/services/world-state-service.ts).
Concurrency — Lost Updates
- Concurrent edits to a persona's ontology, a project's world state, or an import running alongside a UI edit silently overwrote each other (whole-blob replace, no guard). These writes now merge each array by id under optimistic concurrency, the same pattern that protects personal world-state writes (server/src/services/{persona-service,project-service}.ts, server/src/repositories/{PersonaRepository,ProjectRepository}.ts, server/src/routes/ontology.ts, server/src/services/import/entity-importers.ts).
World-Object Deletion Integrity
- Deleting a world entity/event/time wrote the world state and cleaned up dependent ontology gloss references in two un-transacted writes, orphaning glosses on a partial failure, and bypassed the optimistic-concurrency guard so a concurrent addition could be clobbered. Each deletion now runs the guarded world-state write and all gloss cleanups in a single transaction (server/src/services/world-state-service.ts).
- The world-object deletion preview always reported annotationCount: 0; it now counts the object annotations that reference the object.
Added
Deep-Fork of Shared Resources
- Forking a shared summary always returned a 500 (it collided on the source's (videoId, personaId)), and forked claims/annotations were orphaned or lost their object link. Forking now deep-copies into the forker's own scope: the source persona is forked, the summary/claim is re-pointed at the forked persona/summary, claimsJson is rebuilt, and annotation linkType is preserved (server/src/routes/sharing.ts).
Conflict Responses
- The user, group, and project create/update endpoints document and return 409 Conflict on a duplicate, reflected in the OpenAPI contract and generated client types.
[0.5.5] - 2026-06-29
The 0.5.5 patch is an audit-driven hardening release (#186). It closes two higher-severity defects — a former group member kept the group's permissions until the server restarted, and concurrent edits to the world graph silently overwrote one another — together with a set of data-scope, idempotency, and cache-staleness gaps. Nothing is breaking; the API additively gains explicit DELETE endpoints for world collections and relations, an optional client-supplied id on claim creation, and a 409 on a duplicate video assignment.
Fixed
Deleting a Group Clears Its Former Members' Cached Abilities
- Both group-delete handlers (server/src/routes/groups.ts) removed the group without invalidating its former members' in-memory ability cache. A group-scope, non-own-only permission compiles into a globally unconditioned CASL rule, so a former member kept the group's system-wide access until the process restarted or they logged in again. Each handler now snapshots the membership before the cascade and calls invalidateUserAbilities for every former member, mirroring ProjectService.delete.
Concurrent World-Graph Edits No Longer Overwrite One Another
- Every world add and update went through a whole-blob read-modify-write PUT /api/world with no serialization, so two edits in quick succession (or a Wikidata import racing a manual edit) read the same stale base and the last writer silently dropped the other's entities, events, or relations. Writes are now safe on two fronts: the client (annotation-tool/src/store/queries/useWorld.ts) funnels mutations through a single-flight chain that threads the latest server state into the next write, and the server (server/src/services/world-state-service.ts, server/src/repositories/WorldStateRepository.ts) merges each of the seven object arrays by id under an optimistic-concurrency guard rather than replacing them wholesale. Because merge-by-id makes removal-by-omission a no-op, every object removal (entities, events, times, the three collection kinds, and relations) now goes through an explicit DELETE endpoint; the client delete hooks call the graceful DELETE /api/world/... routes, which also clean up dependent relations, collection memberships, and gloss references.
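Why merge-by-id forces explicit deletes can be seen in a few lines. The sketch below is a language-neutral illustration written in Python (the server is TypeScript); merge_by_id and delete_by_id are hypothetical names, and the objects are plain dicts with an "id" key.

```python
def merge_by_id(stored, incoming):
    """Merge incoming objects into stored ones by id; omission removes nothing."""
    merged = {obj["id"]: obj for obj in stored}
    for obj in incoming:
        merged[obj["id"]] = obj  # add a new object or replace the stored one
    return list(merged.values())


def delete_by_id(stored, obj_id):
    """Explicit removal, the only way an object leaves a merged array."""
    return [obj for obj in stored if obj["id"] != obj_id]


if __name__ == "__main__":
    stored = [{"id": "e1", "name": "A"}, {"id": "e2", "name": "B"}]
    # A client that omits e2 from its write does not delete it.
    print(merge_by_id(stored, [{"id": "e1", "name": "A"}]))
    # Only the explicit delete removes it.
    print(delete_by_id(stored, "e2"))
```

The same property that protects a concurrent writer's additions from being dropped also means a stale client cache can resurrect a deleted object, which is the failure mode the later 0.5.10 cache fix addresses.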
Personaless Object Annotations Inherit the Video's Project Scope
- Object annotations carry no persona, so the create path (server/src/routes/annotations.ts) had no project to stamp and persisted them with projectId = NULL, invisible to every project reviewer but the creator. When the caller is a member of exactly one project the video is assigned to, the annotation now inherits that project; an ambiguous (multiple) or absent assignment stays personal, and the existing CASL pre-authorization still validates the resolved scope.
Claim Creation Is Idempotent on a Client-Supplied ID
- Neither claim-create route accepted a client id, so a network retry or programmatic resend minted a duplicate claim. Both routes (server/src/routes/claims.ts, server/src/services/claim-service.ts) now accept an optional id; a resend carrying an existing id re-authorizes against the stored row and returns it instead of creating a duplicate, with a P2002 race fallback. This mirrors the annotation idempotent-create hardening from 0.5.4.
Duplicate Video Assignment Returns 409 Instead of 500
- Re-assigning a video already assigned to a project violated the @@unique([projectId, videoId]) constraint and surfaced as an unhandled 500. The route (server/src/routes/video-assignments.ts) now catches the constraint violation and returns a 409 Conflict.
The Summary Editor Autosaves Comment-Only Edits
- The video summary editor's autosave watched only the summary body, so editing the comment without touching the summary never triggered a save and the edit was lost on navigation. Autosave now compares a snapshot that includes the comment (annotation-tool/src/components/video/VideoSummaryEditor.tsx).
Relating Claims Refreshes Both Endpoints' Relation Panels
- Creating or deleting a claim relation invalidated only the source claim's relation query, leaving the target claim's relation panel stale until a manual refetch. Both mutations (annotation-tool/src/store/queries/useClaims.ts) now invalidate the target as well.
Membership and Role Changes Refresh the Client Ability Mirror
- The client ability cache carried a five-minute stale time and no mutation invalidated it, so a user's own role or membership change took up to five minutes to reflect in the UI even though the server always enforced it correctly. The six self-affecting project- and group-membership mutations (annotation-tool/src/store/queries/useProjects.ts, useGroups.ts) now invalidate the ability mirror on success.
Admin isAdmin Changes Stay in Sync With systemRole
- The admin user-update endpoint (server/src/routes/users.ts) wrote isAdmin but never systemRole, which is what CASL's manage all keys on, so a promotion or demotion left the two fields divergent and the cached abilities stale. The endpoint now sets systemRole to match and invalidates the affected user's abilities.
Removed a Dead Auto-Save Hook
- useAutoSaveAnnotations had no callers and re-armed the exact save-loop footgun the live autosave was hardened against; it has been deleted.
Added
Explicit DELETE Endpoints for World Collections and Relations
- server/src/routes/world.ts gains DELETE routes for entity collections, event collections, time collections, and relations, so these objects are removed explicitly rather than by omission from a whole-blob PUT. This is what lets the non-clobbering merge-by-id persistence be safe (see Fixed).
Optional Client-Supplied ID on Claim Creation
- Both claim-create endpoints accept an optional id, enabling idempotent retries; the Claim response already echoes it.
[0.5.4] - 2026-06-25
The 0.5.4 patch fixes project-scope and ownership stamping on video summaries and claims (#181). Project collaborators could not see a teammate's summary or add claims under it because summaries were persisted without their persona's project, and model-generated summaries and extracted claims were left unowned. Nothing is breaking; the API additively gains a projectId field on summary and claim responses.
Fixed
Summaries and Claims Are Stamped With Their Persona's Project
- Every video-summary and claim write now stamps projectId from the persona (or the parent summary). Previously the interactive summary route (server/src/routes/summaries.ts), the summarization and claim-extraction workers (server/src/queues/setup.ts), and the auto-created empty summary (server/src/repositories/ClaimRepository.ts) all omitted it, so rows were born with projectId = NULL and were invisible to every project collaborator except the creator. Collaborators were rejected with a 403 at the parent-summary read gate when they tried to add claims, and the content was hidden from project-scoped queries. The summary update path also re-stamps the scope so a previously NULL-scoped row heals on its next save, and a migration backfills existing summaries and claims.
Model-Generated Summaries and Extracted Claims Are Owned
- The summarization and claim-extraction queue workers created rows without createdBy, leaving model-generated summaries and extracted claims owned by no one (readable only by an admin). The requesting user is now threaded through the queue payload and stamped as the owner on create.
Added
projectId on Summary and Claim API Responses
- The VideoSummary and Claim API responses now include projectId, so clients can reflect a resource's project scope; its prior absence had helped the stamping defect go unnoticed.
[0.5.3] - 2026-06-24
The 0.5.3 patch fixes a claims-workspace interaction bug (#177). No API shapes change and nothing is breaking.
Fixed
Clicking a Claim Card No Longer Switches to the Summary Tab
- In the video summary editor's Claims tab, clicking a claim card switched the interface back to the Summary tab, which is not what clicking a card should do. A card now selects in place: it is marked selected on the Claims tab and records its source spans so the Summary tab highlights the claim's provenance only if the user chooses to switch there. The card's selected styling, previously bound to a state value that was never set, is now driven by the selected-claim state (annotation-tool/src/components/video/VideoSummaryEditor.tsx, annotation-tool/src/components/claims/ClaimsViewer.tsx).
[0.5.2] - 2026-06-22
The 0.5.2 patch fixes a Safari-only failure (#143) where a video in the annotation workspace blacked out the moment it was paused and jumped position on resume, while Chrome and Firefox played it back correctly. The cause was twofold: the video stream endpoint mishandled the byte-range requests Safari issues but Chrome does not, and WebKit dropped the paused video frame from its compositor. No API shapes change and nothing is breaking.
Fixed
Video Stream Range Requests Handle Safari's Suffix and Edge Ranges
- The local video stream provider (server/src/services/videoStorage.ts) parsed the HTTP Range header with a naive split('-'). A suffix range (bytes=-N, which Safari uses to read a file's trailing moov atom and to re-buffer on pause and resume) produced a NaN start offset and threw, and a range whose end ran past the file declared a Content-Length larger than the bytes actually streamed. Chrome and Firefox issue plain bounded ranges and were unaffected, so the player worked there while Safari received a failed request mid-playback, blacked out, and re-seeked on resume. Range parsing now follows RFC 7233: suffix ranges resolve to the last N bytes, open-ended ranges run to the last byte, an end past the file clamps so Content-Length always matches the stream, and a start past the file returns 416 Range Not Satisfiable with a Content-Range header rather than a 404 that strict clients treat as a fatal media error.
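The range rules described above can be sketched as a small pure function. This is a minimal illustration, not the code in videoStorage.ts: the parseRange name and the RangeResult shape are hypothetical, only a single range is handled, and an empty file is simply treated as unsatisfiable.

```typescript
// Hypothetical helper: names and result shape are illustrative only.
type RangeResult =
  | { kind: "ok"; start: number; end: number; contentLength: number }
  | { kind: "unsatisfiable" } // caller would answer 416 with "Content-Range: bytes */<size>"
  | { kind: "ignored" }; // absent or malformed header: serve the whole file

function parseRange(header: string | undefined, size: number): RangeResult {
  const match = /^bytes=(\d*)-(\d*)$/.exec(header ?? "");
  if (match === null) return { kind: "ignored" };
  const rawStart = match[1] ?? "";
  const rawEnd = match[2] ?? "";
  if (rawStart === "" && rawEnd === "") return { kind: "ignored" };

  if (rawStart === "") {
    // Suffix range "bytes=-N": the last N bytes of the file.
    const suffix = Number(rawEnd);
    if (suffix === 0 || size === 0) return { kind: "unsatisfiable" };
    const start = Math.max(size - suffix, 0);
    return { kind: "ok", start, end: size - 1, contentLength: size - start };
  }

  const start = Number(rawStart);
  if (start >= size) return { kind: "unsatisfiable" };
  // Open-ended "bytes=N-" runs to the last byte; an end past the file is clamped.
  const end = rawEnd === "" ? size - 1 : Math.min(Number(rawEnd), size - 1);
  if (end < start) return { kind: "ignored" };
  return { kind: "ok", start, end, contentLength: end - start + 1 };
}
```

Because contentLength is derived from the clamped start and end, the declared length cannot exceed the bytes that are actually streamed, which is the mismatch the fix removes.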
Paused Video Keeps Its Frame in Safari
- The annotation video element (annotation-tool/src/components/annotation/AnnotationWorkspace.css) is composited beneath the interactive annotation overlay, and WebKit stopped repainting the last decoded frame the instant playback paused, showing the container's black background instead. The element is now pinned to its own GPU compositing layer (transform: translateZ(0) with backface-visibility: hidden), which keeps the frame painted while paused.
Added
Cross-Browser Video Playback E2E Coverage
- The Playwright matrix gained video-chromium, video-webkit, and video-firefox projects (annotation-tool/playwright.config.ts) that run a new pause-and-resume spec under all three engines, since the regression above was WebKit-only and the prior E2E matrix ran only under Chrome. The spec asserts that the stream endpoint answers Safari's suffix and edge byte ranges, that the playhead stays steady across pause and resume, and that the paused frame stays decoded.
[0.5.1] - 2026-06-22
The 0.5.1 patch resolves a batch of field-reported bugs surfaced on a self-hosted production deployment, spanning backend request validation and rate limiting, frontend request fan-out and resilience, and a set of annotation-workspace and persona-builder display fixes. No API shapes change and nothing is breaking; the cross-service contracts are unchanged.
Fixed
Video Assignment Accepts Non-UUID Video IDs
- The admin video-assignment endpoints (server/src/routes/video-assignments.ts) constrained every videoId/videoIds to format: 'uuid', so they rejected any video whose id is not a UUID. FOVEA video ids are free-form strings (videos imported with externally-derived ids use a short hex form, and only Prisma-created videos are UUIDs), so the whole feature returned 400 must match format "uuid" on those deployments. The video id fields now use a VideoId = Type.String({ minLength: 1 }) schema; projectId and assignedUserId, which legitimately are UUIDs, stay format-constrained.
Media Endpoints Exempt from the Global Rate Limit
- The video stream and thumbnail routes (server/src/routes/videos/stream.ts, thumbnail.ts) now set config: { rateLimit: false }, removing them from the shared per-IP @fastify/rate-limit bucket. A grid that fanned out many per-card requests could exhaust the budget, after which /stream returned a 429 JSON body instead of video bytes (the player went black with MEDIA_ERR_SRC_NOT_SUPPORTED) and a 429 thumbnail re-fetched endlessly for lack of a cache header. Media bytes are no longer subject to the request limiter.
Annotation Save Failures Are Surfaced
- The annotation write mutations (annotation-tool/src/store/queries/useAnnotations.ts) defined only onSuccess, so a failed create/update/delete was swallowed: for a new box the drawing state reset and the box simply vanished, which read as "the tool is broken" rather than "the save failed." The three single-write mutations now route failures through an onError that toasts the backend-provided message (via the already-mounted sonner toaster), and the batch auto-save surfaces its collected per-annotation failures the same way.
Ontology Requests No Longer Refetch on Every Navigation
- useAllPersonaOntologies (annotation-tool/src/store/queries/usePersonas.ts) was configured with staleTime: 0 and refetchOnMount: 'always', so the (already-batched) ontology request re-fired on every navigation, including pages that do not need ontologies. It now uses a normal five-minute staleTime and the default mount behavior; mutations already invalidate the ontologies query key, so the data stays fresh without the per-navigation refetch.
Id-Bound Select Triggers Show Names Instead of UUIDs
- The shared Base UI Select renders a self-closing <SelectValue/>'s bound value verbatim, so id-bound dropdowns across the world, ontology, claims, project, and persona editors showed a raw UUID in the closed trigger (and briefly flashed it before the matching item mounted). Each id-bound call site now passes a children render function mapping the selected id to its label, with sentinel/empty values falling through to the placeholder; enum/label selects, whose value is already human-readable, are unchanged.
Persona Select Wraps in the Open List
- On the annotate page the persona Select clipped long "name (role)" labels in the open dropdown while overflowing its closed trigger. The closed-trigger truncation was already corrected; the open list now renders un-anchored at a wider width with wrapping item text, so the full persona label is readable while choosing. Scoped to that one call site; the shared Select component is unchanged.
Long Video Descriptions No Longer Collapse the Player
- On the annotate page a multi-thousand-character video description grew the header card unbounded and squeezed the flex-grow video player to near-zero height. The description block is now height-bounded and scrolls (max-h-28 overflow-y-auto whitespace-pre-wrap), so the player keeps the remaining column height.
Annotate Header Title Falls Back to the Video Title
- The annotate header showed "Loading..." indefinitely for videos with no uploader metadata (file-synced videos that carry a title but no uploader/uploaderId), because "Loading..." served as both the loading placeholder and the no-uploader fallback. It now shows "Loading..." only while the video is actually loading and otherwise falls back through uploader, uploaderId, title, filename, and finally "Untitled video".
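The fallback order above could be expressed roughly as follows. This is a sketch under assumptions: the VideoMeta interface and the headerTitle function are hypothetical names, and treating blank strings as missing is a choice made here rather than something the entry states.

```typescript
// Hypothetical shape: only the four fields named in the entry are modelled.
interface VideoMeta {
  uploader?: string | null;
  uploaderId?: string | null;
  title?: string | null;
  filename?: string | null;
}

function headerTitle(video: VideoMeta | undefined, isLoading: boolean): string {
  // "Loading..." is reserved for the real loading state.
  if (isLoading) return "Loading...";
  const candidates = [video?.uploader, video?.uploaderId, video?.title, video?.filename];
  // Assumption: an empty or whitespace-only value counts as missing.
  const firstUsable = candidates.find(
    (value) => typeof value === "string" && value.trim() !== "",
  );
  return firstUsable ?? "Untitled video";
}
```

Keeping the loading check separate from the fallback chain is what prevents a file-synced video with no uploader from being stuck on the placeholder.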
Added
Lazy Thumbnail Loading in the Video Grid
- Each video-grid card now defers loading its thumbnail (a CSS background image, so the native loading="lazy" attribute does not apply) until the card nears the viewport, via an IntersectionObserver with a 300px margin that loads once. A large library no longer fires hundreds of concurrent thumbnail requests on first paint, which had saturated the browser's per-origin connection pool and competed with video streams.
Video Stream Retry
- When a video stream fails to load, the player now shows a "Retry" overlay that reloads the source once, instead of leaving a permanently black player. The global TanStack Query client also stops retrying 4xx/429 responses and no longer refetches on window focus, reducing the request fan-out that could trip the rate limiter in the first place.
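A retry policy of the kind described above might look like the predicate below. It is only a sketch: the HttpLikeError shape assumes the API client attaches the HTTP status to thrown errors, and the budget of three retries is an assumed value, not one taken from this changelog.

```typescript
// Hypothetical error shape: assumes the HTTP status is attached to the thrown error.
interface HttpLikeError {
  status?: number;
}

const MAX_RETRIES = 3; // assumed budget for non-4xx failures

function shouldRetry(failureCount: number, error: HttpLikeError): boolean {
  const status = error.status;
  // A 4xx response (429 included) will not succeed on an immediate retry.
  if (status !== undefined && status >= 400 && status < 500) return false;
  // Network errors and 5xx responses are retried up to the budget.
  return failureCount < MAX_RETRIES;
}
```

TanStack Query accepts a function of this shape as the retry default for queries, and refetchOnWindowFocus: false is the matching default for the window-focus behavior mentioned above.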
Persona Builder Readability
- The persona/ontology builder gains several readability fixes: long persona descriptions in the list are expandable ("Show more"/"Show less") via a new reusable ExpandableText, the selected-persona detail header wraps its full description instead of clipping to one line, the persona list is scoped to the active project, and a user who belongs to a project now defaults into it on load rather than landing on an empty Personal Workspace. (The project dropdown's name-instead-of-UUID display shipped in 0.5.0.)
[0.5.0] - 2026-06-22
The 0.5.0 cycle delivers the architecture-modularization roadmap (notes/architecture-review.md): single sources of truth for configuration, containerization, the build/test surface, and cross-service contracts, plus a service/repository layer for the backend domains and a handful of folded-in bug fixes.
Changed
Backend Configuration Single-Source-of-Truth
- All backend environment access is centralized in one typed, validated module, server/src/config.ts. It is the only file in server/src permitted to read process.env (enforced by a new ESLint no-restricted-syntax rule with a config.ts/prisma/test exemption), loads an optional local .env via dotenv, exposes a deep-frozen config object grouped by concern (server, redis, auth, storage, modelService, rateLimit, cors, otel, mode, demo, tours, wikidata, externalLinks, defaultUser), and centralizes every default and coercion. All 31 previously-scattered process.env reads across routes, services, queues, middleware, and libs now go through it. The two dynamic read patterns are exposed as typed helpers: config.modelService.timeoutMs(endpoint) and config.getProviderApiKey(provider).
- Configuration is now validated once at startup and fails fast: config.ts is imported first in server/src/index.ts (before tracing), validates numeric env vars through a TypeBox schema, validates the API_KEY_ENCRYPTION_KEY format at boot when it is set (so a misconfigured wrong-length key fails fast rather than silently corrupting data later, while deployments that do not use API-key management still boot without it; the key is otherwise validated lazily at first use), and refuses an unset or dev-default SESSION_SECRET in production.
- Four environment defaults that previously disagreed across read sites are unified to one value each: STORAGE_PATH (the repo-relative videos path), FOVEA_MODE (multi-user), MODEL_SERVICE_URL (http://localhost:8000), and OTEL_EXPORTER_OTLP_ENDPOINT (http://localhost:4318). These defaults only apply when the variable is unset; every docker/production deployment sets them explicitly, so deployed behavior is unchanged. Single-user and demo-mode branching now routes through the single isSingleUserMode() / isDemoModeEnabled() predicates (reading from config) instead of inline per-handler process.env comparisons.
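The shape of such a module (read the environment once, coerce and validate, then deep-freeze) can be sketched as below. This is not the contents of server/src/config.ts: loadConfig, readPort, the PORT and NODE_ENV variables, the 3001 port default, and the "dev-secret" placeholder are all assumptions for illustration, and plain checks stand in for the TypeBox schema. Only the MODEL_SERVICE_URL and FOVEA_MODE defaults come from the entry above.

```typescript
type Env = Record<string, string | undefined>;

// Recursively freeze nested objects so no caller can mutate configuration.
function deepFreeze<T>(value: T): Readonly<T> {
  if (typeof value === "object" && value !== null) {
    for (const nested of Object.values(value)) deepFreeze(nested);
    Object.freeze(value);
  }
  return value;
}

// Hypothetical numeric coercion that fails fast on a malformed value.
function readPort(env: Env, name: string, fallback: number): number {
  const raw = env[name];
  if (raw === undefined || raw === "") return fallback;
  const parsed = Number(raw);
  if (!Number.isInteger(parsed) || parsed <= 0) {
    throw new Error(`${name} must be a positive integer, got "${raw}"`);
  }
  return parsed;
}

const DEV_SESSION_SECRET = "dev-secret"; // assumed placeholder value

function loadConfig(env: Env) {
  const sessionSecret = env["SESSION_SECRET"] ?? DEV_SESSION_SECRET;
  if (env["NODE_ENV"] === "production" && sessionSecret === DEV_SESSION_SECRET) {
    throw new Error("SESSION_SECRET must be set to a non-default value in production");
  }
  return deepFreeze({
    server: { port: readPort(env, "PORT", 3001) },
    auth: { sessionSecret },
    modelService: { url: env["MODEL_SERVICE_URL"] ?? "http://localhost:8000" },
    mode: env["FOVEA_MODE"] ?? "multi-user",
  });
}
```

Passing the environment in as an argument, rather than reading process.env inside the function, keeps the sketch testable; a real module would call it once with process.env at import time.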
Frontend Configuration Single-Source-of-Truth
- All frontend environment access is centralized in one typed module, annotation-tool/src/config.ts, the only file permitted to read import.meta.env (enforced by an ESLint no-restricted-syntax rule). It exposes a deep-frozen config grouped by concern (env, api, wikidata, deploymentMode, testData, demo) with all defaults and coercions centralized, removes the two import.meta as unknown cast hacks in demo/config.ts and demo/api.ts, drops the dead VITE_MODEL_SERVICE_URL declaration, and normalizes the previously-inconsistent VITE_E2E checks to === '1'.
- The mode/demo build flags resolve into one derived config.deploymentMode with a typed kind (normal/public-demo/tour-demo/legacy-demo-shell) and documented precedence (legacy DemoShell short-circuits; VITE_DEMO_PUBLIC suppresses the runtime MSW worker even when the tour-demo mocks are built). The three tree-shaking-critical guards (the VITE_TOUR_DEMO MSW dynamic-import gates and the VITE_E2E fixture gate) deliberately keep their inline literal comparison so Rollup still drops the mocks/tourDemo subtree from normal production builds, verified by a build check.
Model-Service Configuration via pydantic-settings
- The model-service now reads every environment variable through one typed Settings(BaseSettings) in model-service/src/infrastructure/config/settings.py (pydantic-settings), validated once and instantiated at the top of the FastAPI lifespan so an invalid environment fails fast at startup. All 13 scattered os.environ/os.getenv reads across the routes, services, adapters, and observability layers now route through it; a guard test asserts no env read remains outside the settings module. The existing models.yaml discriminated-union catalog validation is unchanged - Settings only resolves which catalog to load and feeds the DI container, subsuming the two previously-divergent MODEL_CONFIG_PATH resolution sites into one.
- The dynamic <PROVIDER>_API_KEY lookup becomes a typed Settings.get_provider_api_key(provider) helper, the HUGGING_FACE_HUB_TOKEN/HF_TOKEN pair becomes one alias-choice field, and the single OTEL_EXPORTER_OTLP_ENDPOINT fans out to the traces and metrics endpoints through derived properties that preserve the prior two-default behavior exactly.
Configuration Drift Guard and Deployment-Mode Summary
- A CI guard (pnpm check:env, run in the backend lint job) fails the build when .env.example declares a duplicate key or omits any variable the backend config module reads, so the committed template cannot silently drift from the code. .env.example is brought into sync: the previously-undocumented SESSION_IDLE_TIMEOUT_MINUTES, ALLOW_TEST_ADMIN_BYPASS, DEFAULT_USER_USERNAME/DEFAULT_USER_DISPLAY_NAME, MODEL_SERVICE_ADMIN_TOKEN, AWS_ACCESS_KEY_ID/AWS_SECRET_ACCESS_KEY, and the demo/tours variables are now documented (as commented optional entries).
- The backend resolves its mode and demo flags into one derived config.deploymentMode object (typed auth and demo) and logs a one-line summary at startup (for example [config] deployment mode: auth=multi-user demo=off). A new operations guide (docs/docs/operations/configuration.md) documents the configuration model, fail-fast startup, and the deployment-mode taxonomy across all three services.
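The two conditions the drift guard checks (duplicate keys and undocumented variables) can be illustrated with a small function. This is a sketch, not the check:env script: checkEnvTemplate is a hypothetical name, and it assumes that a commented entry such as "# FOO=bar" counts as documented while only uncommented declarations can collide as duplicates.

```typescript
interface EnvTemplateReport {
  duplicates: string[]; // keys declared more than once
  missing: string[]; // keys the config reads that the template never mentions
}

function checkEnvTemplate(template: string, requiredKeys: string[]): EnvTemplateReport {
  const declared = new Set<string>();
  const documented = new Set<string>();
  const duplicates = new Set<string>();

  for (const line of template.split(/\r?\n/)) {
    // Matches "KEY=value" and the commented optional form "# KEY=value".
    const match = /^\s*(#\s*)?([A-Z][A-Z0-9_]*)=/.exec(line);
    if (match === null) continue;
    const commented = match[1] !== undefined;
    const key = match[2] ?? "";
    documented.add(key);
    if (commented) continue;
    if (declared.has(key)) duplicates.add(key);
    declared.add(key);
  }

  const missing = requiredKeys.filter((key) => !documented.has(key));
  return { duplicates: Array.from(duplicates), missing };
}
```

A CI step built on this would exit non-zero when either list is non-empty; where the list of required keys comes from (for example, derived from the config module) is left open here.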
Node and pnpm Version Single-Source-of-Truth
- Node and pnpm versions are now pinned in exactly one place. A root .nvmrc (Node 22) and the root package.json packageManager field (pnpm@10.15.0, plus engines.node >=22) are the single sources: every Dockerfile takes ARG NODE_VERSION and every CI workflow reads node-version-file: .nvmrc and resolves pnpm from packageManager (the per-workflow pnpm/action-setup version pins and the NODE_VERSION env vars are removed). This closes the prior split where the production images ran Node 20 + pnpm 10.15.0 while every CI workflow ran Node 22 + pnpm 9, and fixes a stray Node 20 in the deploy workflow. The converged Node 22 images were validated by building them locally (the backend image runs Node 22 with its native modules and Prisma client intact).
CI Validates All Compose Configurations
- A new verify-compose CI job validates every committed docker compose configuration (each with its documented -f override chain and profile) on every relevant PR, replacing the prior check that covered only two of them. It catches structural drift, such as an overlay referencing a service its base does not define, before it reaches a deploy.
Dev Infrastructure Consolidated onto the Root Compose Stack
- Deleted the duplicate server/docker-compose.dev.yml, which redefined Postgres, Redis, and the full observability stack with conflicting credentials (user/password versus the root stack's fovea/fovea_password) and a separate volume. The dev:infra, dev:infra:full, stop, and clean scripts now drive the root docker-compose.yml (plus its dev overlay for the observability services), so local development and the full stack share one Postgres definition and one set of credentials. Local dev databases that relied on the old user/password credentials must point their DATABASE_URL at the canonical fovea/fovea_password (matching .env.example).
Unified Build and Test Entrypoint
- A root Makefile is now the single source for the install/lint/typecheck/test/build recipe across all four components (Node via pnpm, Python via uv): make lint, make typecheck, make test, make build, plus per-suite targets such as make test-model-service (run make help to list them). The README and CONTRIBUTING guides point at it, replacing their previously-divergent per-package instructions, which had drifted between npm and pnpm and between bare pytest and uv run pytest.
CI Reconciled with Reality
- Rewrote .github/workflows/README.md to match what the workflows actually do; it had claimed an in-CI e2e gate, a GPU docker-image matrix, and multi-arch release images, none of which happen. Removed the permanently-disabled (if: false) test-e2e job from ci.yml (end-to-end tests run in the dedicated e2e-mock.yml / e2e-real-models.yml workflows). The test-model-service job is now listed in the quality gate as advisory (its result is reported but does not block, since its full ML-stack install is disk-sensitive on shared runners), replacing the summary's inaccurate "skipped (disk space constraints)". Corrected DOCKER_QUICK_REFERENCE.md to reference SESSION_SECRET (the variable the stack uses) instead of the non-existent COOKIE_SECRET.
Backend Modularization: Service/Repository Layer
- Extracted the personas domain into a PersonaRepository (pure Prisma data access) and a PersonaService (orchestration, RBAC, and response mapping), reducing server/src/routes/personas.ts from 1641 lines with 44 direct prisma.* calls to 640 thin lines with zero. Routes now validate input and dispatch to the service; the service owns the CASL ability checks, the list-mode branching, the isSystemGenerated coercion, and the ontology type-deletion / world-state-cleanup orchestration; the repository owns every query. RBAC decisions, response shapes, and behavior are unchanged (the full personas route-test suite and the local E2E suite pass). This establishes the pattern for the remaining domain extractions.
- The persona API response and the frontend Persona type now expose projectId, which had been silently stripped by the response schema, enabling project-scoped persona browsing.
- Extracted the projects domain into a ProjectRepository (pure Prisma data access, including the create-project-with-owner-membership transaction) and a ProjectService (orchestration, authorization, and response mapping), reducing server/src/routes/projects.ts from 1072 lines with 32 direct prisma.* calls to 581 thin lines with zero, following the personas pattern.
- Migrated project authorization onto CASL, the same engine every other domain already uses. The projects routes previously enforced access with bespoke inline checks (['project_owner','project_manager'].includes(myRole) role-string comparisons and a request.user.isAdmin boolean); Project is now a first-class CASL subject in server/src/lib/abilities.ts, so reads, updates, deletes, member management, and project creation resolve through the same RolePermission matrix and request-scoped ability as personas, claims, world state, and video. The database already granted these project permissions (project_owner → update/delete/manage_members, project_manager → update/manage_members, every project role → read, group_owner/group_admin → create) and manage_members was an already-defined-but-unused CASL action, so the policy is unchanged - this removes a duplicated, divergent authorization path rather than changing who may do what. The group-scoped create permission is now correctly limited to the group a candidate project belongs to (a group administrator could previously be granted creation against any group). One deliberate consistency change: a project administrator is now recognized through the CASL system role (systemRole === 'system_admin'), matching every other domain, rather than the separate isAdmin boolean column; seeded and provisioned administrators carry both, so no real administrator is affected.
- Added GET /api/projects/:projectId/assignable-users (authorized by manage_members), returning the users who are not already members of a project. The "Add member" picker on the project detail page previously called the admin-only /api/admin/users endpoint, so it returned a 403 and showed an empty list to project owners and managers who were not system administrators; it now reads from the scoped endpoint and works for anyone who can manage the project's members.
- Extracted the claims domain into a ClaimRepository (pure Prisma data access) and a ClaimService (orchestration, the relocated CASL authorization, and response mapping), reducing server/src/routes/claims.ts from 1559 lines with 33 direct prisma.* calls to 783 thin lines with zero. The claims routes already authorized through CASL, so every can(...)/accessibleBy(...) decision is preserved verbatim, along with the denormalized claimsJson rebuild, the claim-extraction and claim-synthesis queue jobs, the auto-created empty summary, and the subclaim cascade - a structural refactor with no behavior change.
- Extracted the world-state domain into a WorldStateRepository (pure Prisma data access) and a WorldStateService (orchestration, the relocated CASL authorization, the demo / single-user read handling, and the object-reference gloss cleanup), reducing server/src/routes/world.ts from 1148 lines with 27 direct prisma.* calls to 372 thin lines with zero. The world routes already authorized through CASL, so every can(...)/accessibleBy(...) decision is preserved verbatim - including the existence-privacy distinction between forbidden and not-found, the anonymous / demo-mode read widening, the single-user default-user resolution, and the entity / event / time object-reference-to-text gloss cleanup on deletion - a structural refactor with no behavior change.
- Split the 2232-line server/src/services/import-handler.ts import state machine into focused modules under server/src/services/import/: line-parser.ts (line parsing + validation), dependency-graph.ts, conflict-resolver.ts (conflict detection, resolution, id remapping, and cross-user resolution), entity-importers.ts (the per-type EntityImporter collaborator), and types.ts - leaving import-handler.ts a 517-line orchestrator. The pure parsing / graph / conflict logic became standalone functions; the per-type database writes moved to an EntityImporter that receives the transaction client on each call, so the entire import still runs inside the same single interactive $transaction (same maxWait/timeout, same write order). routes/import.ts and the public ImportHandler API are unchanged; behavior is identical.
Cross-Service Contract Generation (OpenAPI)
- The backend now emits its OpenAPI 3.x specification to a committed server/openapi.json (pnpm --filter @fovea/server gen:openapi), and the frontend generates its API types from that spec with openapi-typescript (pnpm --filter @fovea/annotation-tool gen:api-types → src/api/generated/openapi.ts). The hand-maintained object-detection and ontology-augmentation request/response types in annotation-tool/src/api/client.ts are deleted and re-exported from src/api/generated/contracts.ts, where they are derived from the generated spec, so the frontend's DetectionResponse/AugmentationResponse/OntologySuggestion shapes now track the server schema by construction rather than by hand. A new contract-drift CI job regenerates the spec and the frontend types and fails the build when either is stale, making the cross-service detection/ontology drift structurally impossible.
- Fixed a latent ontology-augmentation casing bug: OntologyAugmenter.tsx carried its own AugmentationResponse type declaring persona_id/target_category (snake_case) while the API actually returns personaId/targetCategory (camelCase). The component body already read the camelCase fields, so the mismatch was confined to the (now-deleted) type and its test fixtures, both corrected.
Model-Service Contract Layer Decoupling
- Removed the eager submodule re-exports from the model-service src/infrastructure/**/__init__.py chain, so importing a Pydantic contract DTO (a request/response schema) no longer transitively loads the entire outbound ML adapter tree (adapters → outbound → video → processor → cv2). Importing a schema now pulls only that module and its real ancestors; cv2/torch load only when something actually uses the video processor. This removes an import-time-side-effect anti-pattern and lets the cross-service contract layer be imported (and emitted) without the ML stack, which is the foundation for an upcoming model-service contract pipeline. No call sites changed (every consumer already imported via concrete deep paths); the full model-service test suite (1235 tests) and the assembled app are unchanged.
- Raised the opencv-python floor to >=4.10 so the resolver can never select the numpy-1-ABI 4.9.x, which fails to import against the numpy>=2 the audio stack (pyannote-audio>=4) requires. Fresh installs already resolved to a numpy-2-compatible build; this pins the requirement explicitly to prevent an ABI regression on a stale lock.
Server↔Model-Service Contract Pipeline
- The model-service now emits a committed OpenAPI 3.1 contract (model-service/openapi.json) for the shapes the server consumes (detection, ontology augmentation, claim extraction, summary synthesis, summarize), and the server generates TypeScript types from it (pnpm --filter @fovea/server gen:model-service-types → src/lib/model-service/contract.ts). Generation is ML-free: model-service/scripts/gen_contract_spec.py imports only the Pydantic request/response models (not the FastAPI app) and asserts cv2/torch never load, so the spec is deterministic and the CI gate does not need the heavy ML install. This is the producer-owned counterpart to the server→frontend OpenAPI generation: the model-service owns the shapes it produces; the server tracks them by generation rather than by hand.
- server/src/lib/model-service/contract-assertions.ts adds compile-time compatibility assertions between the generated producer types and the server's hand-written model-service-facing interfaces (ModelClaimExtractionResponse, ModelSynthesisResponse, ModelSummarizeResponse/Request, and the detection/ontology request+response expectations). If the model-service drops, renames, or retypes a field the server reads, regenerating the contract makes tsc fail with an error whose offendingKeys names the exact field, so a server↔model-service drift becomes a red build, not a runtime surprise. A new blocking contract-drift-model-service CI job (Contracts / Model Service Drift Check) regenerates the spec and the generated types and fails on any diff. This surfaced and corrected one latent annotation drift (ModelSummarizeResponse.key_frames was typed number[] but the producer sends KeyFrame objects, stored verbatim into a JSON column; a type-only fix). docs/docs/development/cross-service-contracts.md documents the single workflow for adding or changing a cross-service endpoint.
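One way a compile-time compatibility assertion of this kind can be built is with a mapped type that collects the mismatched keys. The sketch below is an assumption about the general technique, not the contents of contract-assertions.ts: the OffendingKeys and AssertCompatible helpers, the KeyFrame fields, and the summary field are all hypothetical; only the offendingKeys name and the key_frames drift come from the entry above.

```typescript
// Keys of Expected that Actual lacks, or whose Actual type is not assignable.
type OffendingKeys<Expected, Actual> = {
  [K in keyof Expected]-?: K extends keyof Actual
    ? (Actual[K] extends Expected[K] ? never : K)
    : K;
}[keyof Expected];

// Resolves to `true` when compatible; otherwise to an object type that names the keys.
type AssertCompatible<Expected, Actual> =
  [OffendingKeys<Expected, Actual>] extends [never]
    ? true
    : { offendingKeys: OffendingKeys<Expected, Actual> };

// Hypothetical stand-ins for the generated producer type and the server's view of it.
interface KeyFrame { frame: number; timestamp: number }
interface GeneratedSummarizeResponse { summary: string; key_frames: KeyFrame[] }
interface ServerSummarizeView { summary: string; key_frames: KeyFrame[] }
interface StaleServerView { summary: string; key_frames: number[] }

// Compiles: every field the server reads matches the producer.
const summarizeContractOk: AssertCompatible<ServerSummarizeView, GeneratedSummarizeResponse> = true;

// The stale view does not compile without the directive; the error type names "key_frames".
// @ts-expect-error key_frames drifted, so the expected type is { offendingKeys: "key_frames" }
const staleContract: AssertCompatible<StaleServerView, GeneratedSummarizeResponse> = true;
```

Because the failure type carries the offending key as a string literal, the compiler error points at the exact field rather than at a generic assignability mismatch.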
Model-Service Loader Files Split One-Class-Per-File
- The two monolithic model-service loader modules, vlm/loader.py (1158 lines, 6 VLM loaders) and detection/loader.py (771 lines, 7 detection loaders), are split into loaders/ subpackages with one loader per file plus a shared loaders/base.py (the config/enums, the abstract base, the architecture-keyed registry, and the factory). Each former loader.py is kept as a thin aggregator that re-exports the public surface and imports every per-loader module for its @registry.register(...) side effect, so registration still runs on import and no consumer import path changes. Pure reorganization: no loader logic, registry, factory dispatch, or architecture class changed; the registered architectures (9 VLM, 7 detection PyTorch + 3 ONNX), the architecture↔loader invariant test, mypy, and the full suite are unchanged.
Demo-Mode Read Widening Consolidated
- Demo mode (FOVEA_DEMO_MODE) exposes system-seeded content to anonymous/all users on the public-demo/booth deployment by widening read access; that widening was duplicated ad hoc across routes/annotations.ts, routes/summaries.ts, routes/persona-preferences.ts, services/persona-service.ts, services/video-access-service.ts, and services/world-state-service.ts, each with a different shape, so what demo mode exposed was hard to audit. It is now consolidated into one module, server/src/lib/demo-rbac.ts, the single source of truth for the per-subject demo read scopes ({ source: { startsWith: 'demo-fixture' } } for annotations, { isSystemGenerated: true } for personas) and the demo read predicates (videos, summaries, world state, system-persona access). Each helper gates on isDemoModeEnabled() internally, so no route or service scatters that flag check anymore (a grep of routes/ + services/ for isDemoModeEnabled is now empty). Pure consolidation: every site's effective query/decision is byte-for-byte preserved, so demo mode exposes the same content, and with demo off a self-hoster's per-user RBAC is exactly as before.
Annotation Workspace Logic Extracted into Hooks
- The 1403-line annotation-tool/src/components/annotation/AnnotationWorkspace.tsx is decomposed into three cohesive custom hooks under components/annotation/hooks/: useAnnotationDialogs (the editor/summary/detection/transcript dialog state and the transcript-request flow), useSummaryFlow (the claim timestamp-capture wiring and the draft-claim auto-open), and useAnnotationState (the playback state, the keyframe/edit handlers, and the auto-save setup). The component drops to a thin composition of the three hooks plus its render. Pure move: state, effects, dependency arrays, handler logic, and render output are unchanged (the keyframe handlers still pass the freshly-mutated array to forceSave, and the auto-save change-detection snapshot is byte-identical), so behavior is preserved.
- The annotations list endpoint (GET /api/annotations/:videoId) now returns an optional server-resolved linkedObjectName, and AnnotationOverlay uses it as a fallback badge label. World objects (entities/events/times/locations) are per-user, so a reviewer reading another annotator's object annotation previously saw a generic "Entity" badge because their own world could not resolve the other annotator's object id. The server now resolves the linked object's display name from the owner's world (batched: one worldState.findMany over the distinct owners on the page), gated implicitly by the existing accessibleBy(read) filter so only annotations the caller may already read are resolved and only the name is exposed. The frontend still prefers the live local object when the caller shares the world; linkedObjectName is the fallback when it cannot.
Annotation Create Is Idempotent on a Client Id (Autosave Duplicate Prevention, Backend Half)
- POST /api/annotations now accepts an optional client-generated id and is idempotent on it: when the id already exists, the existing annotation is updated in place (HTTP 200) instead of a duplicate being minted (HTTP 201 for a genuine create). The update path authorizes against the existing row via the same instance-level can('update', subject('Annotation', …)) gate as PUT /api/annotations/:id, and only writes mutable fields; identity columns (videoId, createdByUserId, owner) are never repointed by an id. A concurrent-create race (two requests with the same new id) is collapsed to a single row via a P2002 fallback to the update path. This is the backend half of the autosave-duplicates fix: the annotation workspace re-sends each box with its stable client id, so a lagged auto-save re-POST of an already-persisted box no longer creates a second row. (Sending no id preserves the previous create-with-fresh-id behavior, so existing clients are unaffected; the client wiring + the auto-save loop fix land alongside the workspace decomposition.)
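The create-or-update behavior can be illustrated with an in-memory stand-in. Everything here is hypothetical: the AnnotationStore class, the label field, and the simple ownership comparison (which stands in for the CASL instance-level check) are illustration only, and the concurrent-create fallback is not modelled because a single-threaded Map cannot race.

```typescript
interface Annotation {
  id: string;
  videoId: string;
  createdByUserId: string;
  label: string;
}

interface CreateBody {
  id?: string; // optional client-generated id
  videoId: string;
  label: string;
}

interface CreateResult {
  status: 200 | 201 | 403;
  annotation?: Annotation;
}

class AnnotationStore {
  private rows = new Map<string, Annotation>();
  private nextId = 1;

  create(body: CreateBody, userId: string): CreateResult {
    const existing = body.id !== undefined ? this.rows.get(body.id) : undefined;
    if (existing !== undefined) {
      // Stand-in for the instance-level update authorization.
      if (existing.createdByUserId !== userId) return { status: 403 };
      // Only mutable fields are written; videoId and the owner are never repointed.
      existing.label = body.label;
      return { status: 200, annotation: existing };
    }
    const id = body.id ?? `server-${this.nextId++}`;
    const created: Annotation = { id, videoId: body.videoId, createdByUserId: userId, label: body.label };
    this.rows.set(id, created);
    return { status: 201, annotation: created };
  }
}
```

Re-sending the same body with the same id therefore converges on one row, which is the property a lagged auto-save relies on.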
Annotation Auto-Save No Longer Loops or Duplicates Boxes (Client Half)
- The annotation workspace auto-save no longer fires in a perpetual ~1/sec loop while a video with annotations is idle, and editing a box no longer spawns a duplicate. Two client fixes complete the fix (with the idempotent backend create above): (1) useAutoSave gained an optional getComparisonSnapshot so its change-detection can ignore server-managed fields; the workspace strips updatedAt/createdAt before comparing, so the post-save refetch echoing a new updatedAt no longer looks like a change that re-arms the save. (2) The create request now sends the annotation's stable client id (transformFrontendToBackend), so a freshly-created box keeps that id instead of diverging from a server-minted one, and a lagged re-save is an idempotent update, not a duplicate. Other useAutoSave callers are unaffected (the comparison default is unchanged).
- Fixed a related silent data-loss bug the loop had been masking: keyframe edits (add / remove / update / move, interpolation-segment changes) and box drag / resize / nudge wrote the change to the local query cache and then forced a save in the same tick, before the cache update propagated back into the auto-save's data, so the forced save persisted the pre-edit array and dropped the edit. The save succeeded (no error), so once the masking loop was removed the edit would have been lost on reload. The keyframe/box mutation hooks now return the freshly-computed array and the edit handlers pass it to forceSave(dataOverride), which persists exactly that snapshot (and updates the change baseline so the later propagation does not re-fire a save). The end-to-end persistence test now asserts the edited keyframe genuinely survives a reload, rather than that a box is merely visible.
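The change-detection idea behind getComparisonSnapshot can be sketched as below. The Box shape and both function names are hypothetical, and the comparison assumes that field order is stable between the local copy and the server echo, since it relies on JSON.stringify.

```typescript
interface Box {
  id: string;
  x: number;
  y: number;
  createdAt?: string;
  updatedAt?: string;
}

// Drop the server-managed timestamps before comparing.
function stripServerFields(box: Box): Box {
  const copy: Box = { ...box };
  delete copy.createdAt;
  delete copy.updatedAt;
  return copy;
}

function comparisonSnapshot(boxes: Box[]): string {
  return JSON.stringify(boxes.map(stripServerFields));
}

function hasUnsavedChanges(baseline: Box[], current: Box[]): boolean {
  return comparisonSnapshot(baseline) !== comparisonSnapshot(current);
}
```

With this comparison, a refetch that differs from the baseline only in updatedAt is not treated as a change, so it cannot schedule another save.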
Release Workflow Now Publishes GitHub Releases
- `release.yml` gains a `create-release` job that, on every `v*.*.*` tag push, extracts the matching `CHANGELOG.md` section and creates or updates the GitHub Release. Previously the workflow only built and pushed Docker images, so a tag published no GitHub Release unless one was made by hand (which is how v0.4.3's Release came to be missing). The job is deliberately independent of the image build, so a Release is published even when the heavy `model-service-gpu` image hits the 90-minute job timeout.
Model-Service Thumbnails Written to a Container-Local Path
- The `model-service` and `model-service-gpu` services in `docker-compose.yml` now set `THUMBNAIL_OUTPUT_ROOT` (following the backend's `THUMBNAIL_PATH`, default `/videos/thumbnails`) into the shared `/videos` mount. Previously the model-service defaulted to its container-local `/tmp/thumbnails`, so generated thumbnails landed where the backend could never read them.
Orphaned Model-Service Tests Now Run
- The model-service had two parallel test trees: `test/` (the configured `testpaths`) and a separate `tests/` holding the architecture-registry and per-family loader-factory suites. Because `pytest.ini` pins `testpaths = test`, the `tests/` tree was never collected, so 118 tests silently did not run. Moved `tests/infrastructure/` into `test/infrastructure/` and deleted the duplicate tree, bringing those 118 tests into the standard model-service suite.
[0.4.4] - 2026-06-17
The first installment of the architecture-modularization roadmap (notes/architecture-review.md, Phase 0): reversible cleanups with no user-facing behavior change — dead dependencies removed, a configuration template de-duplicated, and one latent model-loader gap closed with a regression guard.
Removed
Unused Backend Dependencies
- Dropped `cors`, `express`, and `multer` (and their `@types/cors`, `@types/express`, `@types/multer` type packages) from `server/package.json`. The backend is entirely Fastify and imports `@fastify/cors`; these three packages had zero import sites in `server/src`. `pnpm-lock.yaml` was regenerated, removing them and their now-orphaned transitive dependencies.
Fixed
SAM 3.1 Object Detection Had No Loader Path
- `_object_detection_factory` (`model-service/src/infrastructure/config/task_factories.py`) was missing the `framework == "sam3"` pre-dispatch that `_object_tracking_factory` already had, so an `object_detection` model selecting SAM 3.1 (`config/models.yaml`, `selected: "sam-3-1"`) routed to the architecture-keyed detection registry (which registers no SAM3 loader) and would raise `UnknownArchitectureError` at load time. The factory now pre-dispatches `framework == "sam3"` to the existing, contract-tested `SAM3DetectionAdapter` via `SAM3Loader`, matching the tracking path. The `SAM3Detection`/`SAM3Tracking` architecture classes are deliberately registry-less (SAM3 loads through framework pre-dispatch, and the classes exist so SAM3 YAML entries carry a schema-valid `architecture:` block), so nothing was removed.
- Added `model-service/test/config/test_catalog_dispatch_invariant.py`, a bidirectional architecture-to-loader invariant test: every `architecture.kind` across `models.yaml` and `models-cpu.yaml` must resolve to a registered loader, a documented framework pre-dispatch (`sam3`, `external_api`), or a dedicated task factory (speaker diarization, voice-activity detection), and every registered loader must target a valid architecture-family union member. Reverting the SAM3 detection fix makes this suite fail, so a future un-wired architecture is caught in CI rather than at model load.
Changed
Configuration Template De-duplication
- Removed the duplicate `REDIS_PORT` declaration from `.env.example` (it appeared in both the Redis-configuration and host-ports sections; duplicate dotenv keys are silently last-wins). The key is now defined once, in the Redis-configuration section, with a pointer comment where the host-ports list previously repeated it.
- Synchronized the two package manifests that had drifted behind the release version (`model-service/package.json` and `wikibase/pyproject.toml` were still at `0.4.0`); all seven package manifests now report `0.4.4`.
[0.4.3] - 2026-06-17
This release works through the open issue backlog: it closes three issues that were already resolved on main (verified by running their tests) and fixes four that were still outstanding.
Added
Claim Timestamps (#134)
- Claims can now record the video segment(s) they are grounded in. A new `Claim.timeSpans` JSON column (`server/prisma/schema.prisma`, migration `20260616000000_add_claim_time_spans`) stores a list of `{ start, end, source, annotationIds? }` objects, supporting discontiguous spans, and is threaded through the claim create/update routes and their TypeBox schemas (`server/src/routes/claims.ts`), import (`server/src/services/import-handler.ts`), and export (`server/src/services/export-handler.ts`). The matching frontend type is `ClaimTimeSpan` in `annotation-tool/src/models/claims.ts`.
- The `ClaimEditor` lets annotators set spans two ways: by scrubbing the video, where the dialog hides, a workspace capture banner reads the playhead for the span start then end, and the dialog returns with the span appended (state machine in `annotation-tool/src/store/zustand/claimsUiStore.ts`, banner + hide/reopen wiring in `AnnotationWorkspace.tsx`/`VideoSummaryEditor.tsx`); or by deriving spans from the time bounds of selected object/bounding-box annotations (reusing `getAnnotationTimeBounds`). Spans render as removable chips in the editor and read-only badges in `ClaimsViewer`.
Recursive Video Discovery + Corpus Manifest (#108)
- `LocalStorageProvider.listVideos` (`server/src/services/videoStorage.ts`) now discovers videos recursively, so videos organized into subdirectories sync without flattening; keys are stored subdirectory-relative and resolve through the existing streaming and metadata-sidecar paths.
- A new optional `fovea.manifest.json` at the root of the videos storage declares projects and user groups. During sync (`server/src/services/videoManifest.ts`, applied from `server/src/services/videoSync.ts`), projects and groups are upserted by slug, group memberships are reconciled additively (members are added and roles updated, never removed), and each video is assigned to the project whose path glob is the most specific match ("nearest wins"). Assignments carry `source: "folder"` and re-syncing is idempotent. A missing manifest is a no-op; a malformed one is logged and skipped. Documented in `docs/docs/guide/deployment.md`.
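One way to read the "nearest wins" rule is sketched below. Everything here is an assumption made for illustration: the `ProjectRule` shape, the minimal glob dialect (`*` within a segment, `**` across segments), and the specificity measure (count of wildcard-free path segments). The changelog does not say how `videoManifest.ts` actually ranks globs.

```typescript
// Hypothetical manifest rule: a project slug plus a path glob.
interface ProjectRule {
  slug: string;
  glob: string;
}

// Convert a minimal glob to a RegExp. "**" as a whole segment matches any
// remaining path; "*" matches within one segment. Other glob syntax is not handled.
function globToRegExp(glob: string): RegExp {
  const pattern = glob
    .split("/")
    .map((segment) =>
      segment === "**"
        ? ".*"
        : segment.replace(/[.+^${}()|[\]\\?]/g, "\\$&").replace(/\*/g, "[^/]*"),
    )
    .join("/");
  return new RegExp(`^${pattern}$`);
}

// Assumed specificity measure: more literal segments means a nearer match.
function specificity(glob: string): number {
  return glob.split("/").filter((segment) => !segment.includes("*")).length;
}

function assignProject(videoKey: string, rules: ProjectRule[]): string | undefined {
  let best: ProjectRule | undefined;
  for (const rule of rules) {
    if (!globToRegExp(rule.glob).test(videoKey)) continue;
    if (best === undefined || specificity(rule.glob) > specificity(best.glob)) {
      best = rule;
    }
  }
  return best?.slug;
}
```

Because the result depends only on the key and the rule list, running the assignment again yields the same project, consistent with the idempotent re-sync noted above.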
Batch Lookup Endpoints (#136)
- `POST /api/personas/ontologies` and `POST /api/videos/summaries/lookup` return sparse arrays in a single round-trip, replacing the per-persona and per-video request fan-out on the VideoBrowser's initial load.
Fixed
Auth Race Lets Logged-Out Users Briefly See the Video Browser (#92)
- `ProtectedRoute` (`annotation-tool/src/App.tsx`) now holds the loading screen until `appConfig !== null`, and `useSession` (`annotation-tool/src/hooks/auth/useSession.ts`) wraps the `/api/config` fetch in a bounded exponential-backoff retry (`[500, 1000, 2000, 4000, 8000]` ms). Previously a transient 5xx on `/api/config` under load left the deployment mode unknown (defaulting to single-user) while a `/api/auth/me` 401 cleared the loading flag, so a logged-out visitor briefly fell through to the protected Layout. The jsdom test setup also gains a Web Storage polyfill so the persisted auth store can write during tests.
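A bounded backoff retry over the delay schedule quoted above might look like the sketch below. The `retryWithBackoff` name, the injectable `sleep` parameter, and the choice of one initial attempt plus one retry per delay are assumptions; the entry only states the schedule.

```typescript
// Delay schedule quoted in the changelog entry, in milliseconds.
const RETRY_DELAYS_MS: readonly number[] = [500, 1000, 2000, 4000, 8000];

// Hypothetical helper: one initial attempt, then one retry after each delay.
// `sleep` is injectable so callers (and tests) can avoid real timers.
async function retryWithBackoff<T>(
  attempt: () => Promise<T>,
  delaysMs: readonly number[] = RETRY_DELAYS_MS,
  sleep: (ms: number) => Promise<void> = (ms) =>
    new Promise<void>((resolve) => setTimeout(resolve, ms)),
): Promise<T> {
  let lastError: unknown;
  try {
    return await attempt();
  } catch (error) {
    lastError = error;
  }
  for (const delay of delaysMs) {
    await sleep(delay);
    try {
      return await attempt();
    } catch (error) {
      lastError = error;
    }
  }
  // The schedule is exhausted: surface the last failure to the caller.
  throw lastError;
}
```

The retry is bounded by the length of the schedule, so a persistently failing endpoint still produces an error the caller can render rather than an endless loading state.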
Initial-Load Fan-Out Trips the Rate Limit (#136)
- The frontend no longer fans a hard refresh out into one request per persona and one per video. `useAllPersonaOntologies` and the VideoBrowser per-card summary fetch now use the new batch endpoints and seed the per-id caches; the per-card `useVideoSummary` is gated on the batch settling and falls back to individual fetches if the batch route is unavailable. The Fastify rate-limit cap and window are now env-configurable via `RATE_LIMIT_MAX` and `RATE_LIMIT_WINDOW` (`server/src/app.ts`) so operators can size the limit to their corpus.
Verified Already Resolved (closed without code change)
- #100: Import yields annotation conflicts for new users. Cross-user import already regenerates all UUIDs on `main`; confirmed by `server/test/services/import-cross-user.test.ts`, `import-handler-remap-ids.test.ts`, and the `cross-user-import-{foreign,rich,real}-fixture` + `import-export-cross-user` integration tests.
- #121: Display issues for imported annotations. Entity deduplication and structure-agnostic inline-UUID remapping already ship on `main`; the foreign/rich fixtures assert no duplicate annotations, claims display, and no stale exporter UUIDs in claim text.
- #122: Vitest dual-React Dialog tests. The dual-React `useContext` failure no longer reproduces on `main`; the exact files named in the issue (`persona-deletion.test.tsx`, `PersonaBrowser.test.tsx`) and the Dialog-rendering tests all pass.
[0.4.2] - 2026-06-06
Fixed
Annotation Save Duplication on Add Keyframe (cachedIdsRef Regression)
- `annotation-tool/src/store/queries/useAnnotations.ts` converts the `useSaveAnnotations` hook's `cachedIdsRef` from a plain `{ current: new Set<string>() }` literal to `useRef<Set<string>>()`. The literal was reconstructed on every render; `onMutate`'s call to `queryClient.setQueryData(...)` triggers a synchronous re-render that ran the hook again and produced a fresh empty ref before `mutationFn` read it. With the ref now an empty Set, every annotation in the list was treated as new and routed through `api.saveAnnotation` (POST = create). The observable symptom: clicking Add Keyframe on any existing annotation triggered a wave of N new annotation rows on the canvas (one per existing annotation), and the new keyframe never attached to the selected annotation because the original annotation row was unchanged on the server while the optimistically-mutated cache row was orphaned by the round-trip. The bug was present in v0.1.11's `useSaveAnnotations.ts` too (identical literal-ref pattern), but v0.1.11 sat behind the separate `useAutoSaveAnnotations` hook which short-circuited saves via a `JSON.stringify` no-op comparison; v0.4.x replaced that custom path with the generic `useAutoSave` which fires far more aggressively and exposed the latent ref bug.
- `annotation-tool/src/store/queries/useAnnotations.test.tsx` adds two regression tests pinning the fix:
  - "routes an annotation already in cache through PUT, not POST": seeds the QueryClient cache with one annotation, calls `mutateAsync` with the same id, asserts zero POSTs and exactly one PUT to `/api/annotations/existing-1`, and asserts the result counts (`created: 0, updated: 1`).
  - "does not duplicate every annotation when many existing rows are saved together (Add Keyframe simulation)": seeds the cache with four annotations, calls `mutateAsync` with the same four, asserts zero POSTs and exactly four PUTs (one per id), and asserts the result counts (`created: 0, updated: 4`).
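The effect of the emptied id set can be shown without React. In this sketch `partitionByCache` is a hypothetical stand-in for the routing decision inside the save mutation: ids found in the cached set go to the update (PUT) bucket, everything else to the create (POST) bucket.

```typescript
interface AnnotationLike {
  id: string;
}

// Hypothetical routing step: known ids are updates, unknown ids are creates.
function partitionByCache<T extends AnnotationLike>(
  cachedIds: ReadonlySet<string>,
  toSave: T[],
): { creates: T[]; updates: T[] } {
  const creates: T[] = [];
  const updates: T[] = [];
  for (const annotation of toSave) {
    (cachedIds.has(annotation.id) ? updates : creates).push(annotation);
  }
  return { creates, updates };
}
```

With a stable set holding the four existing ids, all four rows route to updates. With a freshly constructed empty set, which is what the per-render literal produced, the same four rows all route to creates, matching the duplication symptom described above.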
Timeline Track List Clipping
- `annotation-tool/src/components/annotation/timeline/TimelineRoot.tsx` wraps the track-header column and the keyframe-surface column in their own `flex-1 min-h-0 overflow-y-auto` scrollers, with a synchronised-scroll handler pair (`handleLeftScroll`/`handleRightScroll` guarded by a `suppressScrollSyncRef` flag) that mirrors `scrollTop` between the two halves so the header column and the lane column stay row-aligned no matter which side the user drives. The prior `overflow-hidden` columns clipped the bottom of the track stack the moment the annotation count exceeded the visible area, hiding lock / solo / mute controls and the playhead for every track beyond row N.
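A guarded scroll-mirroring pair of the kind described can be sketched with plain objects standing in for the two scroll containers. The `Scrollable` interface and `createScrollSync` factory are hypothetical, and the one-shot suppress flag is one plausible reading of the guard, not a description of the component's internals.

```typescript
// Stand-in for an element with a vertical scroll offset.
interface Scrollable {
  scrollTop: number;
}

function createScrollSync(left: Scrollable, right: Scrollable) {
  // Set just before a programmatic write so the scroll event that write
  // triggers on the other side is swallowed instead of echoed back.
  let suppress = false;

  const mirror = (from: Scrollable, to: Scrollable): void => {
    if (suppress) {
      suppress = false; // this event is the echo of our own write
      return;
    }
    if (to.scrollTop === from.scrollTop) return;
    suppress = true;
    to.scrollTop = from.scrollTop;
  };

  return {
    handleLeftScroll: () => mirror(left, right),
    handleRightScroll: () => mirror(right, left),
  };
}
```

Without the guard, each mirrored write would fire the opposite handler and the two sides could bounce offsets back and forth during fast scrolling.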
[0.4.1] - 2026-06-06
Fixed
CASL Baseline Create Permissions (Production-Demo 403 Storm)
- Every signed-in user now receives a baseline `create` grant on resources they will own (`Annotation`, `VideoSummary`, `Claim`, `Persona`, `WorldState`), conditional on the candidate row's `createdBy` / `createdByUserId` / `userId` field matching the caller. The prior `server/src/lib/abilities.ts` baseline covered only `read`, `update`, and `delete` on owned resources, which forced any user without an explicit `project_memberships` row to obtain a project role (annotator / project_manager / project_owner) before they could create anything. On demo.fovea.video this produced an authoring-deadlock: the autosave loop in `VideoSummaryEditor` fired `POST /api/summaries` within seconds of the Edit Video Summary dialog opening, and the route's CASL gate (`ability.can('create', subject('VideoSummary', { projectId: persona.projectId, createdBy: userId }))`) denied every attempt, so the dialog rendered the error text "Cannot create this VideoSummary" as the actual summary body and re-fired the same `POST` every few seconds in a 403 storm. The same wall blocked `POST /api/annotations` (every keyframe save) and `POST /api/summaries/:summaryId/claims` (every manual claim) and contributed to the keyframe-snaps-back behaviour where one keyframe edit landed under the baseline `update` while the next round-tripped through a `create` that 403'd. With the baseline `create` rules in place, a personal-project user can create their own resources without needing a `project_memberships` row; cross-user creates (a candidate naming a different `createdBy`) still fail because the CASL condition still scopes to the caller.
- `server/test/lib/abilities.test.ts` adds a positive baseline-create suite that asserts `can('create', subject(model, { ownerField: 'user-1' }))` is true for all five models (own VideoSummary, own Annotation, own Claim, own Persona, own WorldState) and that the cross-user variants (`ownerField: 'other-user'`) are denied. The prior `viewer can only read annotations` and `handles empty permissions array gracefully` assertions are now subject-aware so the bare `can('create', 'Annotation')` check no longer fires a false-positive against the new conditional rule. 20/20 abilities tests green.
Bounding Box Visibility
- Bumped the `InteractiveBoundingBox` stroke width from 2px / 4px to 3px / 6px (type vs object annotation) so the boxes are visible against busy video frames at lower viewport zooms; the prior 2px stroke was lost in high-saturation regions of the underlying clip.
- Bumped the always-on annotation type badge from `text-[clamp(10px,0.75rem,14px)]` with `h-6` to `text-[clamp(13px,1rem,18px)] font-semibold` with `h-8`, widened the `foreignObject` from 200×30 to 240×38 (y offset shifted from -30 to -38), and bumped `max-w` from 180 to 220 px so longer Wikidata-grounded labels (e.g., "Spectator → spectator at LoanDepot Park") don't truncate prematurely.
Demo Deploy Hygiene
- `.github/workflows/deploy.yml` no longer builds or starts the `model-service` container on demo.fovea.video. The demo server cannot withstand CVPR-scale concurrent inference traffic; the frontend's `modelsDisabled` gate already greys every model-service-hitting button (Detect Objects, Transcribe Audio, per-card Summarize, Summarize All Videos, Extract Claims, Suggest Types) when `/api/models/config` 503s, which is the steady-state with no model-service container. Both DEMO_MODE branches now restrict `docker compose up` to `backend frontend` instead of bringing up every service in the base compose file, and `docker compose build` excludes model-service so the ~5 GB CPU image rebuild no longer happens on every deploy. The minimal-CPU image continues to publish from `docker.yml` for any downstream stack that wants it; the live local-backup the CVPR demo runs on the booth laptop builds its full-CPU variant via `docker-compose.local-full.yml`.
- `.github/workflows/release.yml` repointed the frontend and backend Build Release Images jobs from `context: ./annotation-tool` / `context: ./server` to `context: .` with explicit `file: <package>/Dockerfile`, matching `docker.yml`. The prior `context: ./<package>` config made buildx fail with `failed to compute cache key: "/annotation-tool": not found` because the Dockerfiles `COPY annotation-tool/...` / `COPY server/...` from the pnpm-workspace lockfile at the repo root, not from inside the package directory. The two model-service entries keep `context: ./model-service` (their Dockerfile is package-local) with `file` left unset so the action defaults to `./model-service/Dockerfile`. v0.4.0 GHCR image publication was broken by the prior config; v0.4.1 re-tag triggers a green release run.
- `docs/scripts/generate-api-docs.sh`-generated `docs/docs/api-reference/model-service/` subtree is now `.gitignore`d alongside the already-ignored `frontend/` and `backend/` mirrors so the auto-generated MD files do not show up as untracked after a docs rebuild.
- `annotation-tool/probe-*.mjs` and root `probe-*.mjs` are now `.gitignore`d so ad-hoc Playwright probes don't pollute `git status`; the four pre-existing committed probes (probe-gloss / probe-one / probe-state-isolation / probe-tours) stay tracked.
Docusaurus MDX Build
- `annotation-tool/src/tours/engine/simulateAction.ts` `humanType` TSDoc wraps `KeyboardEvent('keydown')` / `InputEvent('beforeinput')` / `InputEvent('input', { data, inputType: 'insertText' })` / `KeyboardEvent('keyup')` in inline-code backticks so MDX 3 stops trying to parse the `{ data, inputType: 'insertText' }` literal as a JSX expression. The prior un-quoted form blew up Docusaurus' acorn parse with `Could not parse expression with acorn` at the generated `humanType.md` line 20 col 43, breaking the Documentation CI workflow on every commit since the TSDoc landed.
0.4.0 - 2026-06-04
Added
Audio Transcription and Speaker Diarization
- New end-to-end audio path: a Transcribe Audio button on the AnnotationWorkspace toolbar (`data-testid="transcribe-audio-button"`) calls the new backend route `POST /api/videos/:videoId/transcribe` which forwards to the model-service's new `/api/transcribe` (faster-whisper) endpoint and, when `enableDiarization: true`, also to `/api/diarize` (pyannote). The backend merges per-second overlap so every transcript segment carries the speaker who talked the longest within its interval; the resulting payload renders in a new `TranscriptPanel` component with colour-coded speaker chips (8-colour palette + an Unknown fallback), MM:SS click-to-seek timestamps, an active-segment highlight that follows the video playhead, and auto-scroll-into-view so the booth visitor can leave the dialog open while the clip plays. Diarization failure degrades gracefully to the plain transcript so the visitor always sees text.
- New model-service routes:
  - `POST /api/transcribe` (body: `{audio_path, language?}`) returns `{text, segments, language, duration, processing_time, model_used}`; the loader is typed as `AudioTranscriptionLoader` and the result as `TranscriptionResult` (no `Any`). Empty-string `language` is normalised to `None` so the visitor's auto-detect path is not rejected by the faster-whisper hard-lookup table.
  - `POST /api/diarize` (body: `{audio_path, num_speakers?, min_speakers?, max_speakers?}`) returns `{segments, speakers (first-appearance ordered, deduped), processing_time, model_used}`; the loader is typed via a local `_DiarizationModel` Protocol so the route does not couple to PyannoteLoader as a concrete class. Per-request speaker-count hints are logged as a warning since the current loader binds them at config-load time.
- New backend route `POST /api/videos/:videoId/transcribe` in `server/src/routes/videos/transcribe.ts`: resolves the video via `videoRepository.findByIdWithSelect`, translates `/data/` to `/videos/` on the path, and forwards through the typed `fetchModelService` helper with a new `MODEL_SERVICE_TIMEOUT_TRANSCRIBE_MS` env var (default 300_000 ms). Diarization failure returns a plain transcript with a `log.warn` and a 200; transcription failure preserves the upstream status with a typed `{error: 'MODEL_SERVICE_ERROR', message}` body that survives the fast-json-stringify response schema.
- New frontend types: `TranscribeRequest`, `TranscribeResponse`, `TranscriptSegment` in `@api/client`; new `useTranscribeVideo` mutation hook in `@store/queries/useTranscribe`; new `Mic` icon import on AnnotationWorkspace; new dialog state machinery (`transcriptDialogOpen`, `transcriptResult`, `transcriptError`, `diarizationRequested`).
- HuggingFace token (`HF_TOKEN` + `HUGGING_FACE_HUB_TOKEN`) plumbed through `docker-compose.e2e.real-models.yml` so the model-service can authenticate against the gated pyannote model and avoid the unauthenticated rate-limit that stalled first-call faster-whisper downloads.
- New `MODEL_SERVICE_TIMEOUT_<KIND>_MS` env var family on `server/src/lib/fetchModelService.ts` makes every per-endpoint ceiling overridable; `docker-compose.e2e.real-models.yml` bumps the six values to CPU-friendly ranges (600000 ms detection/thumbnails/ontology-augment, 1800000 ms summarize/extractClaims/synthesize). Production defaults are unchanged.
- Model-service `[cpu]` optional-dependencies extra in `model-service/pyproject.toml` ships `llama-cpp-python>=0.3.0` + `onnxruntime>=1.20.0` so the CPU image actually carries the runtimes its bundled `models-cpu.yaml` config selects (the prior CPU image had no `llama_cpp` module and every GGUF or ONNX load died at import).
- New `model-service/test/infrastructure/adapters/inbound/fastapi/routes/test_diarize.py` covers the route end-to-end against a fake PyannoteLoader-shaped diarizer (happy path, 404 missing audio, 500 missing task, 500 load failure, 500 diarize failure, warning when hints supplied, no warning when hints omitted): 7/7 passing.
- New `server/test/routes/videos/transcribe.test.ts` covers the backend forwarder: 404 missing video, plain transcription forwards to /transcribe only, enableDiarization forwards to both and assigns max-overlap speaker per segment, diarization 500 falls back to plain transcript with a warn, transcription 500 returns `MODEL_SERVICE_ERROR` with the upstream text, timeout returns 504 with `MODEL_SERVICE_TIMEOUT`, unreachable returns 502 with `MODEL_SERVICE_UNREACHABLE`, the typed error classes are distinguishable: 8/8 passing.
- New `annotation-tool/src/components/video/TranscriptPanel.test.tsx` covers the new UI component (header summary surfaces language / duration / ASR / processing time, speaker legend friendly-name mapping `SPEAKER_00` to `Speaker 1`, MM:SS timestamp formatting, onSeek wiring, active-segment data attribute moves on rerender, per-segment speaker chip uses friendly name, empty-speakers omits chips entirely, whitespace renders italic `(silence)` placeholder, diarization metadata surfaces): 9/9 passing.
- Tier 2 integration spec `annotation-tool/test/e2e/integration/model-service/real-model-inference.spec.ts` gains a Transcribe Audio user journey that drives the toolbar button against the real CPU model-service stack.
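The max-overlap speaker assignment mentioned in the first entry of this section can be sketched as below. The type names and the `assignSpeakers` helper are hypothetical; the only behaviour taken from the changelog is that each transcript segment receives the speaker with the longest total overlap, and that a segment with no overlapping turn stays unlabelled.

```typescript
interface TimeSpan {
  start: number; // seconds
  end: number;   // seconds
}

interface TranscriptSegment extends TimeSpan {
  text: string;
  speaker?: string;
}

interface SpeakerTurn extends TimeSpan {
  speaker: string;
}

// Length of the intersection of two spans, or 0 when they do not overlap.
function overlapSeconds(a: TimeSpan, b: TimeSpan): number {
  return Math.max(0, Math.min(a.end, b.end) - Math.max(a.start, b.start));
}

function assignSpeakers(
  segments: TranscriptSegment[],
  turns: SpeakerTurn[],
): TranscriptSegment[] {
  return segments.map((segment) => {
    const totals = new Map<string, number>();
    let best: string | undefined;
    let bestOverlap = 0;
    for (const turn of turns) {
      const overlap = overlapSeconds(segment, turn);
      if (overlap <= 0) continue;
      // Sum overlap per speaker; totals only grow, so tracking the running
      // maximum here gives the speaker with the largest final total.
      const total = (totals.get(turn.speaker) ?? 0) + overlap;
      totals.set(turn.speaker, total);
      if (total > bestOverlap) {
        best = turn.speaker;
        bestOverlap = total;
      }
    }
    return best === undefined ? segment : { ...segment, speaker: best };
  });
}
```

Ties go to whichever speaker reached the winning total first, which is an arbitrary choice made for this sketch.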
Tour-Demo Mode (MSW Model-Service Interception)
- New build flag `VITE_TOUR_DEMO=1` ships an MSW browser worker that intercepts every model-service-bound route the tours touch (`/api/ontology/augment`, `/api/videos/:videoId/detect`, `/api/videos/:videoId/track`, `/api/videos/:videoId/transcribe`, `/api/videos/:videoId/summarize`, `/api/claims/extract`), so the CVPR booth machine no longer needs a live model service to demonstrate detection / tracking / ontology augmentation / transcription / diarization / VLM summarization / claim extraction. The dynamic import sits behind a statically-analysable env-var guard, so the entire `src/mocks/tourDemo` subtree tree-shakes out of production builds where the flag is off.
- Every fixture is sourced from the deployment's `TourContentBundle` (the same `/tour-content.json` an admin already edits to retheme tours), so swapping the domain re-themes the mocked model outputs in the same edit pass. Three new sub-slot interfaces extend `TourContentBundle`: `TourMockOntologySuggestion` for Tour 3's AI augmenter, `TourMockDetectionProposal` + `TourMockTrackingKeyframe` for Tour 6's detection + tracker, `TourMockTranscriptSegment` + `TourMockClaimAtom` for Tour 7's transcribe + summarize + claim-split flow.
- Every fixture is intentionally "almost-there": the analyst polishes the model output to its final microvent form, not the other way around. Five ontology suggestions where the visitor accepts one and rejects four; four detection proposals (two genuine high-confidence containers, two spurious low-confidence boxes the analyst rejects) hand-grounded against an extracted frame at t=8s of the ABC7 Port of Long Beach cargo-fall clip; a 30-frame tracker trajectory with a clean first-22 prefix and a flagged last-8 drift the analyst re-anchors; a four-segment two-speaker transcript with one low-confidence segment carrying a deliberate single-word recognition error AND a wrong speaker assignment; a VLM summary synthesised from the eventual atomic claims with one believable factual error ("above the right-field line" vs the final "behind home plate"); a non-atomic compound claim with a `needsSplit` flag + three `splitTargets` the analyst splits into atomic rows via the claim editor.
- `TourMockDetectionProposal` carries `acceptAsLabel` + `acceptAsWikidataId` so each suggested type is grounded in Wikidata (e.g. `container` Q987767, water Q283, crane Q178692). The `Detection` interface in `api/client.ts` gains two optional tour-demo-only fields; the candidates list renders a "Snap to type" chip (`data-testid="suggested-type-chip"`) gated on `acceptAsLabel` being truthy.
- Simulated latency 800-1800 ms in `handlers.ts` matches a warm CPU-mode model-service so the visitor sees a real-feeling "computing" beat without the booth needing a GPU stack.
- New `docker-compose.tour-demo.yml` override flips `VITE_TOUR_DEMO=1` at frontend build time so the booth operator engages tour-demo mocking via `docker compose -f docker-compose.yml -f docker-compose.tour-demo.yml up -d --build` without touching the base compose file or the Dockerfile.
- New smoke specs:
  - `test/e2e/smoke/tour-demo-msw.spec.ts` (2 tests, 17.6s) asserts every one of the six routes is intercepted, validates each fixture against the bundle defaults, and checks simulated latency lands in the 600-2400 ms window
  - `test/e2e/smoke/tour-demo-launch-all.spec.ts` (12 tests including 10 tour launches) walks every built-in tour through `window.__foveaTour.launch()` and asserts each one resolves true + becomes the active tour + abandons cleanly
  - `test/e2e/smoke/tour-demo-spotlight-pause-resume.spec.ts` (4 tests, 5.4s) re-asserts the engine wiring against the MSW-mocked demo build: spotlight overlay paints (4 backdrop + 1 outline + 4 corner = 9 rects), Pause unmounts and surfaces the resume pill, Resume re-mounts at the same step, paused state survives a hard reload
Public Tour Catalogue and New Tours for demo.fovea.video
- New `src/pages/TourCataloguePage.tsx` is a public 4-column tour catalogue that mounts as the `/` route when the bundle is built with `VITE_DEMO_PUBLIC=1`. The page surfaces the FOVEA wordmark + the "Flexible Ontology Visual Event Analyzer" tagline + a Sign in link top-right, and renders 12 tour cards in a responsive grid (sm: 2, lg: 4 columns) so the catalogue lays out as a 4×3 grid on a landscape booth screen and stacks cleanly down on mobile QR scans. `App.tsx` wires the flag so the authenticated app moves under `/app/*` and the public catalogue claims `/`.
- Two new tours land in the catalogue:
  - Tour 0 (`welcome.ts`) "Welcome to FOVEA": three-step orientation that opens the catalogue (FOVEA backronym reading + the four-layer model + the analyst-polishes-model-output editing-loop framing all in two minutes). Content-neutral; no bundle slot.
  - Tour 11 (`keyframes-interpolation.ts`) "Working with longer videos: keyframes and interpolation": five-step temporal-modeling deep-dive that closes the grid (sparse keyframes, linear/Bezier interpolation curve, motion-path overlay, same data structure model and human edit).
- `scripts/index.ts` orders Welcome first followed by the four-layer arc, the model-assisted flows, the collaboration / admin / import-export operator surfaces, and keyframes-interpolation last.
Auth-Pages Branding and Admin-Only Account Policy
- LoginPage and RegisterPage gain the official `fovea-logo.svg` (size-12 above the wordmark), the uppercase `FOVEA` h1 with `tracking-wide` matching the sidebar wordmark, and the "Flexible Ontology Visual Event Analyzer" tagline. The generic Lucide `LogIn` / `UserPlus` icons are dropped.
- LoginPage surfaces a clear Alert with a mailto when `authStore.allowRegistration` is false: "Self-registration is disabled on this deployment. To request an account, email admin@fovea.video." Replaces the silent dead-end where the Register link just vanished. The admin console's `CreateUserDialog` is independent of the registration toggle, so the operator can still mint accounts trivially after the demo deploy.
- LoginPage post-login redirect respects `VITE_DEMO_PUBLIC=1` and lands on `/app` instead of looping back to the catalogue at `/`.
Demo Deployment Plumbing (deploy.yml demo_mode + nginx.demo.conf)
- New `workflow_dispatch` input `demo_mode` (default false) on `.github/workflows/deploy.yml`. When set true the on-server env patch flips `ALLOW_REGISTRATION=false`, sets `VITE_TOUR_DEMO=1` + `VITE_DEMO_PUBLIC=1`, copies `annotation-tool/nginx.demo.conf` over `annotation-tool/nginx.conf`, skips `docker compose up model-service` entirely (the frontend MSW worker intercepts the six model-service routes so there is nothing to start), and explicitly limits the recreate to `backend` + `frontend` so the missing model-service does not block.
- New `annotation-tool/nginx.demo.conf` ships two `limit_req_zone` scopes (login_zone 30r/m burst 10, register_zone 5r/m burst 3 as defence-in-depth even though registration is disabled), `Cache-Control: no-store` on `/tour-content.json` so admin edits land immediately, `Cache-Control: public, immutable` on `/assets/*`, and 60 s cache on `/mockServiceWorker.js` so a worker-version bump propagates fast without re-downloading on every scan.
- `docker-compose.yml` frontend service threads `VITE_TOUR_DEMO` + `VITE_DEMO_PUBLIC` build args with empty defaults so production builds without the demo flag tree-shake the entire `src/mocks/tourDemo` subtree out of the bundle.
- New runbook at `docs/development/demo-fovea-deployment.md` documents the workflow input, the env-var deltas, the post-deploy curl smoke (including a login rate-limit burst check), the admin-console account-mint flow, and how the booth-laptop `docker-compose.tour-demo.yml` stack relates to the production demo deploy.
Tour Content (Twelve Tours, All New in 0.4.0)
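- Tour 3 "Grow your ontology from Wikidata" covers BOTH manual type creation AND Wikidata import in a single walkthrough so the visitor sees the contrast directly: steps 1-2 take the manual-entry path (`type-editor-mode-manual` then `type-editor-mode-wikidata`); steps 3-8 continue the Wikidata flow. New `data-tour-id` anchors on `shared/ModeSelector.tsx` (manual / copy / wikidata) make the modes addressable.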
type-editor-mode-manualthentype-editor-mode-wikidata); steps 3-8 continue the Wikidata flow. Newdata-tour-idanchors onshared/ModeSelector.tsx(manual / copy / wikidata) make the modes addressable. - Tour 5 "The world layer" walks all four world-object editors (entity / location / event / time) with narration that calls out what is distinctive about each (entity binds a TYPE to a thing, location adds coordinates and a map pin, event has start/end + role bindings the entity editor does not, time has the start/end/fuzzy controls), including an event-instance step.
- Tour 2 "Building a persona's ontology" relation-type-editor narration contrasts the source-types and target-types pickers against the entity / event / role type editors.
- Tours 3, 6, 7 narrate the accept-some / reject-some / inline-edit / split-compound-claim editing loop the demo is built around. Tour 3 augmenter-results includes accept-and-rename steps (accept the close-but-not-quite "Ball grab" suggestion and rename it to lowercase `ball-grab`, reject the four distractors). Tour 6 candidates-list + tracking-results-panel includes accept-two / reject-two / snap-to-general-type-with-Wikidata-Q987767 / re-anchor-tracker-at-frame-214 steps. Tour 7 transcript-viewer + video-summary-editor + claims-extraction-dialog includes inline-edit `snatched` to `grabbed`, flip the speaker chip, correct `above the right-field line` to `behind home plate`, split the non-atomic compound claim into three atomic rows.
- Tour 7 carries a two-step prelude visiting the on-demand Transcribe Audio button + TranscriptPanel dialog before the saved-summary `audio-config-panel` + `transcript-viewer` flow.
Type Strictness on the New Model-Service Inbound Routes
- `model-service/src/infrastructure/adapters/inbound/fastapi/routes/transcribe.py` (new in 0.4.0 for the audio-transcription path): `result: TranscriptionResult` (no `Any`); model is typed `AudioTranscriptionLoader` via `cast`. Empty-string language is normalised to `None` so the auto-detect path is not rejected by faster-whisper.
- `model-service/src/infrastructure/adapters/inbound/fastapi/routes/diarize.py` (new in 0.4.0 for the speaker-diarization path): model is typed via a local `Protocol` (`_DiarizationModel`) with the single `diarize(audio_path) -> DiarizationResult` contract, so the route does not couple to `PyannoteLoader` as a concrete class. `result: DiarizationResult` (typed dataclass with `SpeakerSegment` members).
Operations Section Expanded From One Runbook to Six Pages
- docs/docs/operations/ grows from the single demo-fovea-deployment.md runbook to six pages: production-deployment.md (six-container docker-compose stack first-time setup, port/data-volume topology, what-to-expose checklist), monitoring.md (OTel trace export, the Prometheus counters and histograms server/src/metrics.ts emits, /api/health readiness probe, model-service /health, a minimal alert set, what is intentionally not instrumented), backup-restore.md (pg_dump cadence + storage-volume rsync, quarterly DR drill), upgrades.md (patch / minor / major paths, Prisma migrate deploy, permission-catalogue re-seed), troubleshooting.md (failure modes organized by user-visible symptom), plus the existing demo-fovea-deployment.md. Each page is rooted in what the live code actually carries (env vars from fetchModelService.ts, metrics from metrics.ts, etc.).
Annotation Timeline Rewrite
- Rewrote the annotation timeline as a composition of small DOM primitives under src/components/annotation/timeline
- TimelineRoot orchestrates a fixed-width track-header column and a flexible right column containing TimelineRuler, TimelinePlayhead, and stacked TimelineTrack lanes
- TimelineTrack lanes render InterpolationSegment gradients and KeyframeMarker diamonds with selection, current, and locked states
- TransportBar carries the SMPTE timecode readout, keyframe-edit cluster, and zoom controls
- useTimelineViewport manages ResizeObserver-backed container width plus zoom clamped between fit-to-view and MAX_ZOOM
- useKeyframeDrag installs window-level pointer listeners to reposition keyframes with obstruction nudging
- useTimelineKeyboard wires J/K/L playback shortcuts and the ShortcutPalette surfaces the binding table via ?
- TimelineComponent.tsx remains as a drop-in shim that threads useMoveKeyframe through
Bounding-Box Editing Polish
- BoundingBoxHUD renders a float W×H and x,y readout with monospace tabular-nums in a foreignObject anchored below the box during drag/resize
- useBoundingBoxKeyboard hook nudges the active box by 1 px (10 px with shift) on arrow keys and calls onUpdate + onEditComplete through the existing persistence pipeline
- Shift-hold aspect-ratio lock for corner resize handles honours whichever axis drifted farther and anchors the opposite edge so the box grows from its corner
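As a rough illustration of the arrow-key nudge described above, the sketch below maps a key press to a moved copy of the box. The Box shape and the nudgeBox name are hypothetical; only the 1 px and 10 px step sizes come from the entry itself, and the real hook also routes the result through onUpdate and onEditComplete.

```typescript
// Hypothetical box shape; the real annotation type is not shown in this changelog.
interface Box {
  x: number;
  y: number;
  width: number;
  height: number;
}

// Step sizes taken from the entry above: 1 px, or 10 px while shift is held.
const NUDGE_PX = 1;
const NUDGE_SHIFT_PX = 10;

// Returns a moved copy of the box, or the same box for keys that are not arrows.
function nudgeBox(box: Box, key: string, shiftKey: boolean): Box {
  const step = shiftKey ? NUDGE_SHIFT_PX : NUDGE_PX;
  switch (key) {
    case "ArrowLeft":
      return { ...box, x: box.x - step };
    case "ArrowRight":
      return { ...box, x: box.x + step };
    case "ArrowUp":
      return { ...box, y: box.y - step };
    case "ArrowDown":
      return { ...box, y: box.y + step };
    default:
      return box;
  }
}
```

Keeping the function pure (box in, box out) means the key handling can be unit-tested without a DOM.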
Backend Reliability
- services/system-config-propagator.ts factors model-service propagation out of the admin-config route
- Server startup now auto-replays every persisted SystemConfig row so a fresh model-service picks up admin settings without operator intervention
Cross-User Import Regression Coverage
- Regression suite in server/test/integration/cross-user-import-rich-fixture.test.ts against server/test/fixtures/cross-user-import-rich-export.jsonl (the richest of the seven annotator exports uploaded to #121, carrying 20 personas / 20 ontologies / 79 entities / 136 summaries across ~96 distinct videos / 621 claims / 9 object annotations). The test imports the fixture into a fresh user via reseedOwnershipBaseline and walks four assertions sourced directly from the screenshot on the reopened #100: (a) every imported summary's personaId dereferences via GET /api/personas/:id with a 200 (a 404 here is the user-visible 'Persona <uuid> not found' banner in the Edit Video Summary dialog), (b) every dereferenced persona is owned by the importer (cross-checked against GET /api/personas), (c) no summary.personaId equals one of the original exporter-side persona ids (i.e. the remap actually rewrote it, not just preserved it), (d) every imported claim's summaryId resolves to a summary owned by the importer, with round-trip claim and annotation counts matching the fixture exactly. The suite carries a 90_000ms per-test timeout to accommodate the Clean Architecture indirection on top of CASL's per-call overhead.
- server/test/integration/cross-user-import-real-fixture.test.ts now also walks GET /api/personas/:id with the summary's personaId after import and intersects the returned id against the requester's GET /api/personas list. The previous test only asserted the summary row carried a personaId without verifying the dereference, leaving the post-import Edit Video Summary path (the exact API the bug screenshot in #100 surfaces) untested.
- Unit suite test/services/import-handler-remap-ids.test.ts (13 tests, no database) exercises every surface of the new id-shape substitution against a synthetic idMap: whole-string ids in arbitrary field names, inline mentions in claim.text / claim.comment, every free-text surface (persona informationNeed / details, ontology type descriptions, world object name / description, summary text, claim-relation description), nested structures through arrays and gloss items, *Ids arrays, collection members arrays, multiple ids in one string, ids embedded inside larger tokens (claim_<id>_v2, entity-<id>.png, url=…/<id>?q=1), uppercase / mixed-case ids, JSON-encoded blobs that carry ids, ids not in idMap left untouched, non-id strings unchanged, empty-resolutions no-op, and primitives (number / boolean / null) untouched. The integration comparator in test/integration/import-export-fidelity.test.ts now treats members as id-like so the round-trip diff stops asserting that reference arrays survive byte-for-byte; the round-trip behaviour itself is unchanged.
Changed
UI Framework Migration (MUI to shadcn-ui)
- Migrated the entire annotation-tool frontend from Material UI to shadcn-ui
- Replaced MUI Box, Typography, Button, Alert, Accordion, Dialog, Menu, and form primitives with shadcn equivalents
- Switched from Emotion-based theming to Tailwind CSS v4 with a Fovea-specific design-token layer
- Replaced MUI icons with Lucide React icons via a barrel export
- Rebuilt the Layout around the shadcn sidebar composition pattern with fixed dialog overflow handling
- Fixed sidebar toggle, narrowed the dropdown menu, resolved tab overflow, and reduced the sidebar width
- Renamed Ontology Builder to Persona Builder with updated icons and keyboard shortcut
- Updated all component tests for the new shadcn DOM structure, ARIA roles, and named exports
Docusaurus Reorganization
- Comprehensive Docusaurus reorganization at docs/docs/: industry-standard split into Tutorial / Guide / Concepts / Reference / Operations / Project; orphan markdown at the doc root deleted or moved into the published tree. The Docusaurus site continues to serve at fovea.video (landing renders at /, docs tree at /docs/*).
- Version-neutral docs sweep: every v0.X.Y / since v0.X.Z / carried from v0.X reference in the published docs is scrubbed (the workspace CHANGELOG.md is the only place version numbers appear). Stability and contributing pages describe the maintenance-line policy without enumerating which version is which.
- House style sweep: em-dashes (— / –) removed from every doc and replaced with semicolons / commas / hyphens; all docs use American spelling (organize, behavior, color, catalog, license, flavor, whilst -> while, ...). The guide/tour-catalogue.md file is renamed to guide/tour-catalog.md (with the sidebar and cross-page links updated).
Tooling and Build
- Monorepo switched to a pnpm workspace with ergonomic dev commands
- All Dockerfiles updated for the pnpm workspace layout
- jsdom pinned to ^26.1.0 for Node 18 ESM compatibility
Cross-User Import Remap Made Structure-Agnostic
- Replace the field-name allowlist inside remapObjectIds with a structure-agnostic substitution built from the cross-user idMap itself. The prior fix on this branch (cherry-picked from v0.3.2) added an inline-UUID regex pass as a fallback after the existing id / *Id / *Ids / gloss-content branches, but the allowlist still hid two correctness gaps: (1) entityCollection.members / eventCollection.members / timeCollection.members are id-reference arrays that the allowlist never matched (they do not end in Ids), so after a cross-user import every collection silently held pre-import ids pointing at entities that no longer existed in the importer's world; (2) any future id-bearing field whose name did not match the allowlist patterns would have the same problem. remapIds now lowercases idMap keys on insert, builds a single case-insensitive matcher from those keys sorted longest-first and RegExp-escaped, and applies it to every string value in the payload tree. Whole-string id values, ids embedded in surrounding prose, ids in arbitrary array positions (members, entityIds, ordinary string arrays), GlossItem content, and ids inside JSON-encoded substrings are all rewritten by the same pass; substrings whose lowercased form is not in idMap pass through unchanged, so the substitution is a strict no-op outside the cross-user path. Reported as a continuation of #121.
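The substitution described above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the shipped remapIds: the helper names (buildIdMap, buildMatcher, remapTree) and the Map-based idMap are hypothetical, and only the steps themselves (lowercase keys on insert, longest-first RegExp-escaped alternation, case-insensitive match, applied to every string value) come from the entry.

```typescript
// Hypothetical stand-ins for the real remapIds internals, which this changelog does not show.
type IdMap = Map<string, string>;

// Escape characters that are special inside a regular expression.
function escapeRegExp(text: string): string {
  return text.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Lowercase keys on insert so later lookups are case-insensitive.
function buildIdMap(pairs: Array<[string, string]>): IdMap {
  const idMap: IdMap = new Map();
  for (const [oldId, newId] of pairs) idMap.set(oldId.toLowerCase(), newId);
  return idMap;
}

// One alternation over all keys, longest first so a longer id wins over a prefix of it.
function buildMatcher(idMap: IdMap): RegExp | null {
  const keys = Array.from(idMap.keys()).sort((a, b) => b.length - a.length);
  if (keys.length === 0) return null; // empty map: nothing to rewrite
  return new RegExp(keys.map(escapeRegExp).join("|"), "gi");
}

// Walk the payload tree and rewrite every string value; other primitives pass through.
function remapTree(value: unknown, idMap: IdMap, matcher: RegExp | null): unknown {
  if (matcher === null) return value;
  if (typeof value === "string") {
    return value.replace(matcher, (hit) => idMap.get(hit.toLowerCase()) ?? hit);
  }
  if (Array.isArray(value)) {
    return value.map((item) => remapTree(item, idMap, matcher));
  }
  if (value !== null && typeof value === "object") {
    const out: Record<string, unknown> = {};
    for (const [key, child] of Object.entries(value as Record<string, unknown>)) {
      out[key] = remapTree(child, idMap, matcher);
    }
    return out;
  }
  return value;
}
```

Because the matcher is built only from keys present in the map, a string that contains no known id comes back unchanged, which is what makes the pass a no-op outside the cross-user path.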
Removed
- @mui/material, @mui/icons-material, @mui/x-*, @emotion/react, and @emotion/styled dependencies
- Unused DropdownPaper helper left over from the MUI migration
Fixed
LlamaCpp VLM Sync-Path Fix
- LlamaCppVLMLoader.load / .generate / .unload are no longer declared async def. The underlying llama_cpp.Llama API is blocking, the only caller chain (VLMLoaderAdapter to use_case.summarize to route) is sync top-to-bottom, and the prior async signatures silently discarded their bodies: VLMLoaderAdapter.load flipped _loaded to True without actually loading, .generate returned str(coroutine) which produced "<coroutine object LlamaCppVLMLoader.generate at 0x…>" as the summary text the user observed, and .unload never released memory. Dropping the async keyword from all three methods is the minimal fix.
House Style on User-Facing Strings
- Em-dashes scrubbed out of every user-facing narration / recap / description / title string across the twelve tour scripts; replaced with semicolons or hyphens contextually. Same sweep over src/tours/content/types.ts + src/tours/content/microvent.ts + src/mocks/tourDemo/handlers.ts + src/mocks/tourDemo/browser.ts comments and console.info.
Documentation Drift Audit (155 Confirmed Findings Against the Live Codebase)
- Two-round audit/fix workflow grounds the docs against the live codebase: 67 Opus auditors checked every doc claim against server/src/, model-service/src/, annotation-tool/src/, docker-compose*.yml, and model-service/config/models*.yaml; an adversarial Opus verifier defaulted to rejection per finding and confirmed 155 of 162 candidate findings; 50+ Opus fixers + per-file verifiers applied the fixes across two rounds + a surgical manual pass for the residual five files. The 13 cross-cutting patterns landed: Grafana host port corrected (3010 -> 3002 across 4 files), fabricated assertX-Owned ownership-helper names replaced with the actual CASL request.ability.can() pattern (4 files), fusion-strategy enum corrected from invented parallel / audio-first to the real sequential / timestamp_aligned / native_multimodal / hybrid (3 files), fabricated export-type discriminators (worldEntity, worldEvent, videoSummary) replaced with the real persona / ontology / entity / event / time / *_collection / relation / summary / claim / claim_relation / annotation / metadata set (2 files), PUT /api/ontology body shape corrected to the bulk envelope plus per-persona PUT /api/personas/:id/ontology (3 files), OTel metric names corrected from the invented *_ms suffix to the real dotted form (4 files), nonexistent env vars and CLI scripts in the operations docs replaced with the real ones (JWT_SECRET -> SESSION_SECRET, MODEL_CACHE_DIR -> HF_HOME, npm run seed:permissions + npm run admin:create -> npm run seed + ADMIN_PASSWORD; the OIDC/JWT story dropped entirely since auth is cookie-session), "Ontology Builder" UI references renamed to the now-shipped "Persona Builder" and the documented keyboard shortcut rebound from o to p (2 files), MODEL_SERVICE_TIMEOUT_* env var keys aligned with fetchModelService.ts (THUMBNAILS_MS, EXTRACT_CLAIMS_MS; nonexistent TRACKING_MS removed), annotation linkType migration timestamp corrected (3 files), WorldState described as @@unique([userId, projectId]) rather than @unique(userId), model-service API reference prefixed with /api/ on every endpoint to match routes/__init__.py, and the reference/model-loaders.md framework column rebuilt from models.yaml so the ~25 mislabeled transformers rows now show their real ultralytics / pytorch / sam3 / whisper / whisperx / nemo_canary / nemo_parakeet / pyannote labels. Florence-2 added to object_detection, the nonexistent Fallback chain section deleted, and the stale "Anchors not yet landed" block in reference/tour-anchors.md replaced with the 15 already-shipped anchors. Final docs build is clean (0 errors, 0 broken links).
Real Model-Service Stack Bugs from Tier 2 Verification
- docker-compose.e2e.yml model-service service gains the ./test-videos:/test-videos:ro mount that the base compose file was missing. Without it the path sanitiser in model-service/src/infrastructure/adapters/outbound/video/processor.py defaulted VIDEO_DATA_ROOT to /videos and rejected every /test-videos/* path the backend forwarded with "Video file not found" even when the file existed on every other container.
- docker-compose.e2e.real-models.yml now pins MODEL_CONFIG_PATH=/config/models-cpu.yaml because the base docker-compose.e2e.yml hardcodes /config/models.yaml (the GPU defaults) regardless of the DEVICE arg. The CPU stack was being pointed at a GPU-required selection (qwen-3-vl-8b at 9GB VRAM via sglang) and crashed on first model load.
- docker-compose.e2e.real-models.yml adds CUDA_VISIBLE_DEVICES="" to keep CPU-only hosts off libcudart.so.13. Without it pyannote.audio and the Silero VAD weights transitively pull torch with CUDA runtime bindings, and the warmup of speaker_diarization + voice_activity_detection failed with "libcudart.so.13: cannot open shared object file" on CPU-only runners.
Models Catalogue Tightening
- model-service/config/models-cpu.yaml selects faster-whisper-tiny (Systran/faster-whisper-tiny, 39 MB) as the default audio-transcription model so the CPU image carries a model that actually loads on a developer laptop without HuggingFace rate-limiting.
- The same file sets warmup_on_startup: true and bumps max_video_frames to 3 for the CPU profile so the booth visitor's first inference is responsive.
- Architecture catalogue gains the missing markers (ClaudeVisionAPI, OpenAIVisionAPI, GeminiVisionAPI, GrokVisionAPI, SAM3Detection, PyannoteDiarization, SileroVAD) the five per-family registry-migration subagents legitimately did not cover; every YAML option that referenced them now declares its architecture block, and the registry exhaustiveness tests are tightened to recognise the new unregistered-by-design markers.
- ModelConfig.architecture is tightened from Architecture | None to required Architecture across both ModelConfig surfaces, with the field declared between framework and vram_gb so the discriminated-union TypeAdapter raises at config load on a missing or malformed architecture block.
Claim Text No Longer Carries UUIDs From Reference-Kind Gloss Items
- ClaimEditor.handleSave (annotation-tool/src/components/claims/ClaimEditor.tsx) now resolves gloss items to human-readable labels when synthesizing claim.text, via the existing glossToText helper from annotation-tool/src/utils/glossUtils.ts. The previous handler did const text = gloss.map(item => item.content).join('') which works for text-kind items (where content is the user-typed string) but writes the raw UUID into text for every typeRef / objectRef / annotationRef / claimRef item, because GlossItem.content stores the referenced thing's id. The symptom showed up in the JSONL export, where claim.text read like "The <player-uuid> hit the <ball-uuid>" instead of "The Player 9 hit the ball". glossToText resolves typeRef ids against the active persona ontology (entities / events / roles / relationTypes) and objectRef ids against world state (entities / events / times), with a fall-back to the raw id only when a lookup misses. The same naive concatenation is replaced with glossToText in three preview surfaces that had the same bug at display time: ClaimRelationsViewer.getClaimText (source / target claim previews in the relations panel), ClaimRelationEditor (relation-type-dropdown preview plus source / target previews; gains an optional personaId prop threaded from ClaimsViewer), and ImportDialog (entity-row preview during persona import; resolves typeRef ids against the source persona's ontology).
- The unused name-less convertTypeRefsToText helper in server/src/lib/reference-cleanup.ts is deleted along with its five test cases. The production type-deletion path (server/src/routes/personas.ts DELETE /api/personas/:personaId/ontology/{entities,events,roles,relation-types}/:typeId) routes through updateGlossesInTypes -> convertTypeRefsToTextWithName, which uses the type name. The name-less variant had zero production callers; its replacement-text content of item.content (the raw UUID) would have reintroduced the same UUID-in-text bug if anyone had reached for it next.
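To make the difference between the old concatenation and the label-resolving approach concrete, here is a small sketch. It is not the real glossToText: the kind field name, the Lookups shape, and resolveGlossText are assumptions for illustration, and only two reference kinds are modelled.

```typescript
// Hypothetical shapes; the real GlossItem and lookup sources live in the annotation tool.
type GlossKind = "text" | "typeRef" | "objectRef";

interface GlossItem {
  kind: GlossKind; // assumed field name for the item kind
  content: string; // user text for "text" items, a referenced id otherwise
}

// id -> display name tables standing in for the persona ontology and world state.
interface Lookups {
  types: Map<string, string>;
  objects: Map<string, string>;
}

// The naive concatenation quoted in the entry: leaks raw ids for reference-kind items.
function naiveText(gloss: GlossItem[]): string {
  return gloss.map((item) => item.content).join("");
}

// Resolve references to labels, falling back to the raw id only when a lookup misses.
function resolveGlossText(gloss: GlossItem[], lookups: Lookups): string {
  return gloss
    .map((item) => {
      if (item.kind === "typeRef") return lookups.types.get(item.content) ?? item.content;
      if (item.kind === "objectRef") return lookups.objects.get(item.content) ?? item.content;
      return item.content;
    })
    .join("");
}
```

The fallback to the raw id keeps the text lossless when a referenced type or object has since been deleted, at the cost of showing an id in that one spot.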
Cross-User Import Transaction Timeout
- ImportHandler.executeImport now configures the Prisma atomic-mode transaction with { maxWait: 10_000, timeout: 300_000 }. The default 5_000ms interactive-transaction timeout is exceeded by realistic cross-user imports; a payload with ~20 personas / ~100+ summaries / hundreds of claims times out with Transaction already closed partway through because every nested write goes through the CASL ability check and the Clean Architecture indirection. Without the bump the whole import rolls back and the user sees a 500 from POST /api/import; with it, the import completes against realistic payload sizes. Surfaced by the forward-port of the rich regression fixture.
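The options object above is passed as the second argument of an interactive transaction. The sketch below shows the shape of such a call against a minimal structural stand-in for the client, so that it stays self-contained; PrismaLike, TransactionOptions, and runImportAtomically are hypothetical names, and the real ImportHandler is not shown in this changelog.

```typescript
// Minimal structural stand-in for the part of the client used here (an assumption made
// so the sketch is self-contained; the real Prisma client has a much richer type).
interface TransactionOptions {
  maxWait?: number; // ms to wait to acquire a transaction from the database
  timeout?: number; // ms the interactive transaction may run before it is rolled back
}

interface PrismaLike<Tx> {
  $transaction<T>(fn: (tx: Tx) => Promise<T>, options?: TransactionOptions): Promise<T>;
}

// Values quoted in the entry above.
const IMPORT_TX_OPTIONS: TransactionOptions = { maxWait: 10_000, timeout: 300_000 };

// Hypothetical wrapper: run the whole import inside one transaction with the raised limits.
function runImportAtomically<Tx, T>(
  prisma: PrismaLike<Tx>,
  work: (tx: Tx) => Promise<T>,
): Promise<T> {
  return prisma.$transaction(work, IMPORT_TX_OPTIONS);
}
```

Keeping the limits in one named constant makes the trade-off visible: a long timeout lets large imports finish, but it also holds a database connection for up to five minutes per import.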
Schema Hardening
- Replaced vitest-broken Type.Union nullable response schemas on /api/me/preferences with the fast-json-stringify-safe Type.Unsafe array-type pattern so null values serialize correctly.
- Resolved SystemConfig audit updatedByUserId through the users table so phantom test-bypass ids and real deleted-user races no longer violate the FK.
Shadcn Migration Followups
- ClaimEditor Claiming Event / Time / Location dropdowns now populate from world state instead of showing the None-only placeholder menus the shadcn migration left behind (events from useEvents(), times from useTimes(), locations from useEntities() filtered to entities tagged with a locationType field).
- ObjectWorkspace's object.duplicate command now actually duplicates the selected world object (entity / event / location / time / collection) instead of alert('Duplicate object not yet implemented'), via a pure buildDuplicatePayload helper that strips server-managed and Wikidata-provenance fields and appends a (copy) suffix.
- OntologyWorkspace's ontology.duplicateType command now actually duplicates the selected ontology type (entity / role / event / relation) instead of alert('Duplicate type not yet implemented'), via a pure buildDuplicateOntologyType helper following the same shape as the world-object duplicator.
- AnnotationWorkspace's command-context drawingMode flag now reflects the actual annotationUiStore.drawingMode value instead of being hardcoded false, so when-clauses that gate on drawingMode fire correctly while a draw-mode button is active.
- ImportResultDialog's orphan-skipped banner now carries the data-testid="import-orphan-skipped-banner" attribute that the corresponding E2E spec (test/e2e/regression/export-import/orphan-skipped-banner.spec.ts) was already probing for, and the banner prose now matches the E2E spec's /missing referenced data/i assertion. The unit-level rendered-output test stays skipped pending the workspace-wide pnpm + jsdom React-dedup fix.
- videoStorage.getVideoUrl now fails fast with an actionable error message when CDN_ENABLED=true and CDN_SIGNED_URLS=true instead of silently returning an unsigned URL (the placeholder behaviour produced 403 cascades through signed CloudFront distributions); operators must either set CDN_SIGNED_URLS=false (public-CDN-in-front-of-public-bucket) or wire up @aws-sdk/cloudfront-signer.
- Shimmed PointerEvent + Element pointer-capture in test/setup.ts so Base UI's checkbox/dialog handlers no longer throw PointerEvent is not defined under jsdom.
- Updated TimelineComponent tests to pass the full TimelineComponentProps via a makeProps helper and query buttons by aria-label instead of the canvas-era emoji placeholders.
- Swapped the workspace integration test's querySelector('canvas') probe for getByLabelText('Video annotation timeline').
- Annotation-drawing duplication during keyframe edits.
- Full annotation-tool vitest suite now reports 102 files / 1698 tests pass (5 canvas-era tombstones skipped with a pointer to the shadcn rewrite, 0 failed).
[0.3.3] - 2026-05-13
Forward-ports the v0.1.10 / v0.2.3 generalisation of the cross-user id remap to the v0.3.x line. The bug taxonomy and user-visible behaviour are the same; the integration is unchanged from v0.2.3 since remapObjectIds lives outside both the CASL surface introduced in v0.2.0 and the Clean Architecture refactor introduced in v0.3.0.
Changed
- Replace the field-name allowlist inside remapObjectIds with a structure-agnostic substitution built from the cross-user idMap itself. The v0.3.2 fix added an inline-UUID regex pass as a fallback after the existing id / *Id / *Ids / gloss-content branches, but the allowlist still hid two correctness gaps: (1) entityCollection.members / eventCollection.members / timeCollection.members are id-reference arrays that the allowlist never matched (they do not end in Ids), so after a cross-user import every collection silently held pre-import ids pointing at entities that no longer existed in the importer's world; (2) any future id-bearing field whose name did not match the allowlist patterns would have the same problem. remapIds now lowercases idMap keys on insert, builds a single case-insensitive matcher from those keys sorted longest-first and RegExp-escaped, and applies it to every string value in the payload tree. Whole-string id values, ids embedded in surrounding prose, ids in arbitrary array positions (members, entityIds, ordinary string arrays), GlossItem content, and ids inside JSON-encoded substrings are all rewritten by the same pass; substrings whose lowercased form is not in idMap pass through unchanged, so the substitution is a strict no-op outside the cross-user path. Reported as a continuation of #121.
Added
- Unit suite test/services/import-handler-remap-ids.test.ts (13 tests, no database) exercises every surface of the new id-shape substitution against a synthetic idMap: whole-string ids in arbitrary field names, inline mentions in claim.text / claim.comment, every free-text surface (persona informationNeed / details, ontology type descriptions, world object name / description, summary text, claim-relation description), nested structures through arrays and gloss items, *Ids arrays, collection members arrays, multiple ids in one string, ids embedded inside larger tokens (claim_<id>_v2, entity-<id>.png, url=…/<id>?q=1), uppercase / mixed-case ids, JSON-encoded blobs that carry ids, ids not in idMap left untouched, non-id strings unchanged, empty-resolutions no-op, and primitives (number / boolean / null) untouched. The integration comparator in test/integration/import-export-fidelity.test.ts now treats members as id-like so the round-trip diff stops asserting that reference arrays survive byte-for-byte; the round-trip behaviour itself is unchanged.
[0.3.2] - 2026-05-11
[0.3.1] - 2026-05-04
Forward-ports the data-fidelity, schema, UX, and DoS fixes from v0.1.8 (and the v0.2.1 RBAC integration of those fixes) to the v0.3.x line. The bug taxonomy and user-visible behavior are the same as v0.1.8; this section lists only the deltas specific to v0.3.x, plus the items unique to this release. Cross-version exports between v0.2.x and v0.3.x are intentionally not supported.
Schema
- Adds Annotation.linkType column. Same column as v0.1.8 and v0.2.1.
Fixed (RBAC integration deltas, identical to v0.2.1)
The fixes below are conceptually the same as v0.1.8 but are wired through CASL rather than v0.1.8's lib/ownership.ts helpers, so there is no parallel ownership system on the v0.3.x line.
- POST /api/annotations calls request.ability.can('read', subject('Persona', persona)) on the supplied personaId before attaching. The generic create Annotation candidate carries createdByUserId = caller and passes CASL's create rule even when the target persona is foreign; the explicit read-on-target gate closes the gap.
- POST /api/summaries/:summaryId/claims and GET /api/summaries/:summaryId/claims apply the same read-on-parent gate via subject('VideoSummary', summary).
- POST /api/videos/:videoId/detect runs ability.can('read', subject('Persona', persona)) when a personaId is supplied. The videos plugin also wires buildAbilities so request.ability is populated for every video sub-route.
- PUT /api/ontology and POST /api/ontology/augment catch blocks re-throw AppError so authorization-induced 403/404 are no longer collapsed into 500.
- POST/PUT /api/personas strip isSystemGenerated for non-system_admin requests by checking request.user.systemRole.
- GET /api/import/history scopes by importedBy = request.user.id directly.
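The reason the create rule alone is not enough can be shown with a toy ownership check. The sketch below is deliberately not the CASL API: the rule functions, the Persona and AnnotationCandidate shapes, and mayAttachAnnotation are simplified, hypothetical stand-ins that only model "a caller may act on rows they created".

```typescript
// Toy stand-in for an ownership-aware ability check; this is NOT the CASL API,
// only an illustration of why the extra read check matters.
interface Persona {
  id: string;
  createdByUserId: string;
}

interface AnnotationCandidate {
  personaId: string;
  createdByUserId: string;
}

// Simplified create rule: a caller may create rows they own.
function canCreateAnnotation(callerId: string, candidate: AnnotationCandidate): boolean {
  return candidate.createdByUserId === callerId;
}

// Simplified read rule: a caller may read personas they own.
function canReadPersona(callerId: string, persona: Persona): boolean {
  return persona.createdByUserId === callerId;
}

// Hypothetical handler logic: the candidate is always stamped with the caller's id,
// so the create rule passes even for a foreign persona; the read-on-target check
// is what actually rejects the request.
function mayAttachAnnotation(callerId: string, persona: Persona): boolean {
  const candidate: AnnotationCandidate = { personaId: persona.id, createdByUserId: callerId };
  return canCreateAnnotation(callerId, candidate) && canReadPersona(callerId, persona);
}
```

The same pattern applies to the claims routes, with the parent summary taking the place of the persona.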
Fixed (carried through unchanged from v0.1.8)
- Claim.audio / Claim.video / Claim.metadata round-trip for any JSON value (was wiped to JsonNull for non-arrays).
- Object annotations linked to events / times / locations round-trip through export+import via the new linkType column.
- POST /api/import returns 4xx (typically 413) for FST_*_LIMIT codes instead of 500.
- POST /api/import populates importedBy so the history listing returns the row.
- app.setErrorHandler types its error parameter as FastifyError.
UX (carried through unchanged from v0.1.8)
- ImportResultDialog shows a yellow "Completed with Warnings" title and a prominent banner when annotations were skipped because of missing referenced data.
Infrastructure
- model-service/Dockerfile retries pip install torch torchvision and pip install -e . up to 3× with a 30s sleep between attempts, matching the existing apt-get update retry pattern. Closes a release.yml flake first observed on v0.2.0's release run.
- .github/workflows/ci.yml triggers on release/** PRs in addition to main / develop, so backport PRs to maintenance branches go through the same lint + test gate.
Tests
- Forward-ports every v0.1.8 / v0.2.1 test suite (multi-user-isolation, import-export-cross-user, import-export-edges, import-export-fidelity, issue-121-real-fixture, orphan-banner predicate test, Playwright spec). Seeds populate createdBy / createdByUserId. Shared helper test/integration/_rbac-baseline.ts wipes the test-helper's blanket-grant RolePermission rows and re-seeds an ownership-aware production-like baseline so the matrix actually exercises CASL's per-row ownership rules.
0.3.0 - 2026-04-24
Added
Model Service Clean Architecture
- Domain layer with entities, value objects, and exception hierarchy
- Application layer with service interfaces (ports) and use cases
- Infrastructure layer with adapter pattern for all external dependencies
- Dependency injection container with manual factory wiring
- Pydantic StrictBaseModel with plugin for stricter validation
- NumPy-style docstrings across all model service modules
- Contract tests with fake model manager and VLM loader
- YamlModelRepository implementing IModelRepository port
- detect_objects and track_objects use cases with outbound port adapters
- Audio port adapters routing transcription, diarization, and VAD through the port system
- OpenTelemetry spans on every use case and model_inference metrics on every outbound adapter
- ThinkingTrace and ReasonedText DTOs capturing reasoning traces through use cases and FastAPI schemas
- Structural _LLMLoaderLike / _LoaderConfig protocols replacing Any on LLMLoaderAdapter
- Shared base modules (audio/base.py, detection/base.py, llm/base.py) breaking runtime cyclic imports
CPU Inference Support
- ONNX Runtime detection loaders (YOLO-World, Florence-2, Grounding DINO)
- llama.cpp LLM loader with GGUF quantization for fast CPU text generation
- llama.cpp VLM loader with GGUF multimodal inference
- SmallVLMLoader for Transformers-based CPU vision models (SmolVLM, Moondream)
- Factory function dispatch for all loader types (detection, LLM, VLM)
- CPU model configurations in models-cpu.yaml with GGUF entries
- llama-cpp-python added to CPU optional dependency group
2026 Model Catalog
- Wave 1: 57 new model entries in models.yaml and 11 in models-cpu.yaml covering Qwen3-VL, Tarsier2, Moondream3, Qwen3, DeepSeek R1 distills, Kimi K2.6, GLM-4.7, Claude 4.6/4.7, GPT-5.4, Gemini 3.1 Pro, Grok 4, SAM 3.1, YOLOv12, YOLOE-26, RF-DETR, Canary-Qwen, Parakeet TDT, and WhisperX
- Wave 2+3 loaders: SAM3, Canary, Parakeet, WhisperX, YOLOv12, YOLOE-26, and RF-DETR with contract tests
Docker CPU Build
- Automatic installation of CPU extras (onnxruntime, llama-cpp-python) when DEVICE=cpu
- cmake added to builder stage for compiling native extensions
- Model config auto-selection via symlink (models-cpu.yaml for CPU, models.yaml for GPU)
Frontend CPU Mode
- Backend config endpoint exposes models_available and cpu_models_available flags
- Three-state UI: GPU mode, CPU mode with models (info), no models available (error)
- All AI features (detection, summarization, ontology, claims) enabled when CPU models exist
- Replaced binary isCpuOnly gating with modelsDisabled across all components
- Admin model management page with CPU/GPU device toggles, download status, and job status fixes
Admin and Persona Configuration Surface
- UserPreferences, PersonaPreferences, and SystemConfig Prisma models with RBAC-gated endpoints (/api/me/preferences, /api/personas/:id/preferences, /api/admin/config)
- Model-service /api/admin/reconfigure endpoint (gated by MODEL_SERVICE_ADMIN_TOKEN) that applies storage-path changes via reconfigure_roots and updates ModelManager inference knobs
- SystemConfigPanel rendering shadcn tabs for storage paths, runtime, and external APIs behind isAdmin on the Settings page
- PersonaEditor embeds a collapsible PersonaPreferencesSection for per-persona inference pins
- useInferencePreferences migrated from localStorage to server-backed TanStack Query with optimistic updates
- mergeOverrides helper (user → persona precedence) with unit tests
- GenerationOverrides / AudioOverrides threaded from VideoBrowser through CreateSummaryRequest, SummarizeJobData, and the video-summarization worker into model-service as generation_overrides / audio_overrides
- Inference Settings tab with Sampling / Audio / Detection / Advanced subtabs: sliders and inputs bound to backend defaults via useModelDefaults / useModelFrameworks, per-field Reset controls
- /api/models/defaults and /api/models/frameworks proxied through the Node server with TanStack Query hooks
Tests
- 234 domain and use-case unit tests with typed fakes
- 158 additional model-service tests covering the YAML model repository, task factories, domain exception hierarchy, thumbnails and claims FastAPI routes, audio_processing service, base audio client, and all seven vendor audio clients (AssemblyAI, AWS Transcribe, Azure Speech, Deepgram, Gladia, Google Speech, Rev AI)
- test/loaders/conftest.py stubs sam2 / sam2.build_sam in sys.modules so tracking-loader tests run without the optional SDK
- test/external_apis/audio/conftest.py stubs the audio vendor SDKs so the package __init__ resolves in CI
- preferences.test.ts RBAC coverage for the new preferences endpoints
Changed
- Model service restructured from flat module layout to Clean Architecture layers
- Route handlers decomposed into domain-specific modules with DI
- Use cases updated with corrected imports after architecture relocation
- Use cases now depend only on DTOs and ports; torch and model-loader imports moved into infrastructure adapters
- Model manager relocated to application/services
- Claims route reads framework from config instead of hardcoding Transformers
- Frontend ModelConfig interface extended with modelsAvailable and cpuModelsAvailable
- ModelSettingsPanel shows CPU mode info banner instead of GPU-required error
- ModelStatusDashboard uses severity-appropriate alerts for CPU mode
- ModelManager.__init__ now requires capability_probe (was silently lazy-loaded)
- Thumbnail output directory is env-configurable via THUMBNAIL_OUTPUT_ROOT
- All Python docstrings converted to NumPy-style for consistency
- README rewritten for v0.1.0-style presentation with centered header, badges, and updated content
- LICENSE year updated
- Release workflow DEVICE arg switched from cuda to gpu to match Dockerfile stages
Removed
- Backward-compat TimeSpan interface and timeSpan? annotation field (server types, ontology JSON schema, frontend transformBackendToFrontend, and useAnnotationDrawing stub)
- Legacy string branch of OntologyTypeItem.gloss; type narrowed to GlossItem[]
- Legacy string-baseUrl overload of extractWikidataInfo (and its dedicated test case); WikidataSearch now passes { baseUrl }
- Stale userId (legacy) / createdByUserId commentary in abilities.ts now that the backfill migration has completed
- capability_probe=None backcompat path in ModelManager
Security
- Hardened video_downloader and video_processor against SSRF and path injection: strict host allow-list with DNS resolution and IP safety check, extension allow-list, and resolve-then-relative-to path validation against configurable roots
- get_video_path_for_id now guards against path traversal via resolve-then-relative_to instead of exists-then-commonpath
- Replaced custom path/URL validators with inline CodeQL-recognized sanitizers (re.fullmatch on URL + os.path.realpath/startswith guards at each filesystem sink)
- Sanitized logged user-derived values with CRLF replacement to eliminate log-injection alerts
- Rewrote temp-file extension selection as a literal-only elif chain so CodeQL sees the extension as constant-sourced on every branch
- Eliminated compound-or guards, baked os.sep into module-level prefix constants, and collapsed the URL regex to a non-backtracking single alternative to clear residual CodeQL alerts
- Moved type-only LLM loader imports behind TYPE_CHECKING
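
The resolve-then-relative_to guard named in the Security entries can be sketched as follows. This is a minimal illustration using only `pathlib`, not the model service's actual code; the function name `resolve_under_root` and the example root are hypothetical.

```python
from pathlib import Path


def resolve_under_root(root: Path, candidate: str) -> Path:
    """Return the resolved path if it stays inside root, else raise ValueError.

    Hypothetical helper: the name and signature are assumptions for illustration.
    """
    resolved_root = root.resolve()
    # Resolve first so ".." segments and symlinks are collapsed before the check.
    # An absolute candidate replaces the root in the join and is rejected below.
    resolved = (resolved_root / candidate).resolve()
    # relative_to raises ValueError when resolved is not located under resolved_root.
    resolved.relative_to(resolved_root)
    return resolved
```

Checking containment only after resolution is the point of the pattern: a check on the raw string, or an existence test followed by a prefix comparison, can be bypassed by `..` segments or symlinks that resolve outside the root.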
Fixed
- Broken relative imports in use cases after architecture refactoring
- Video module export mismatches (download_video vs. download_video_if_needed)
- Claims route hardcoding LLMFramework.TRANSFORMERS instead of reading config
- YamlModelRepository.reload() previously passed a raw task dict as TaskConfig.selected; now parses selected and options via a dedicated helper
- Audio loaders now guard against load() failure; cv2 RGB frames cast to uint8 for DetectObjectsFrameInput and tracking append
- Model-service test patch targets updated from AutoModelForVision2Seq to AutoModelForImageTextToText to match the current loader import
- Dropped stale print-based fallback assertion in test_create_llm_loader_with_fallback_uses_fallback
- ESLint warnings: missing hook dependencies, unused variables, and unused imports
- Ruff errors: unsorted __all__ lists, import ordering, deferred import warnings; ruff format applied across the model-service test suite
0.2.0 - 2026-04-21
Added
Role-Based Access Control (RBAC)
- CASL authorization engine with permission seed data
- Role-based permission schema (admin, manager, annotator, viewer)
- Row-level authorization on every data route (annotations, summaries, claims, world state, personas, ontology, export, import) using accessibleBy() list filters and subject()-based instance checks
- Per-model ownership field resolution: Persona/WorldState use userId, Annotation uses createdByUserId, VideoSummary/Claim/UserGroup use createdBy, Project uses ownerUserId
- Per-user ability cache with explicit invalidation on every membership add, remove, role change, and project deletion
- Admin-editable /api/admin/permissions CRUD endpoints for runtime RolePermission management
- Sharing privilege cap: re-shared resources cannot exceed the received permission level
- VideoAccessService wired into all video routes so authenticated users only see videos assigned to their projects; non-existent videos pass through so route validation errors are not masked by 404
- Backfill migration populating createdByUserId from legacy userId on existing annotations, and createdBy on existing summaries and claims from their owning persona's user
- 29 negative RBAC security tests covering cross-tenant IDOR, null-ownership denial, cache invalidation timing, sharing escalation, and admin-only enforcement
- seedBaselinePermissions() test helper module for E2E test setup
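
The sharing privilege cap above amounts to clamping a re-share request to the level the sharer received. The sketch below is language-neutral logic written in Python for brevity (the backend itself is TypeScript); the `PermissionLevel` names and `cap_reshare_level` are hypothetical, and whether the real implementation clamps or rejects an over-privileged request is not stated in this changelog.

```python
from enum import IntEnum


class PermissionLevel(IntEnum):
    # Hypothetical levels, ordered from least to most privileged.
    VIEW = 1
    EDIT = 2
    MANAGE = 3


def cap_reshare_level(received: PermissionLevel, requested: PermissionLevel) -> PermissionLevel:
    """Clamp a re-share request so it never exceeds the level the sharer holds."""
    # min() on an IntEnum compares by integer value and returns the enum member.
    return min(received, requested)
```

For example, a user who received `VIEW` and tries to re-share at `MANAGE` ends up granting `VIEW`, while a request at or below the received level passes through unchanged.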
Projects and Groups
- Project entity with membership, ownership, and sharing controls
- Group entity for organizing users into teams
- Backend routes for CRUD operations on projects, groups, and memberships
- Video assignment to projects with access scoping
- Project sharing with configurable permission levels
- User autocomplete for persona and member dialogs
Frontend
- Admin panel pages for project and group management
- Frontend stores and TanStack Query hooks for RBAC entities
- Project assignment and sharing dialogs in persona editor
- Member management with role selection
Observability
- OTEL tracing spans for RBAC authorization checks
- Prometheus alert rules for permission denied events
- Grafana RBAC monitoring dashboard
- Metrics for group, project, sharing, and video assignment operations
Testing
- Unit and integration tests for RBAC, groups, projects, sharing, and video assignments
- Frontend tests for RBAC stores, query hooks, and user management pages
Documentation
- User guide for projects and groups workflow
- RBAC architecture and permission model documentation
- API reference for new endpoints
Changed
- All data-mutating routes now populate createdByUserId (annotations) and createdBy (summaries, claims, claim relations) from the authenticated session, never from the request body
- All Prisma JSON field handling uses runtime toJson() conversion and Prisma.JsonObject type guards instead of type assertion casts
0.1.7 - 2026-04-15
Fixed
- Regenerates IDs for cross-user imports whose exports contain no persona lines (for example, users who only create object annotations linked to world entities)
- Remaps array-valued ID reference fields (entityIds, eventIds) on entity and event collections during cross-user imports
- Remaps GlossItem.content for objectRef, annotationRef, claimRef, and instance-level typeRef items so claims citing regenerated objects follow their new UUIDs
- Lets cross-user ID regeneration override non-regenerating resolutions (skip, replace, merge) so annotations referencing entities in the same import batch get new IDs
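
The ID regeneration and array-field remapping described in these fixes can be pictured as a two-step pass over the import batch. This is an illustrative sketch only: the record shape (a dict with an `id` key) and the function name `regenerate_ids` are assumptions, and the real importer is part of the TypeScript backend.

```python
import uuid
from typing import Any

# Array-valued reference fields named in the changelog entry above.
ARRAY_REF_FIELDS = ("entityIds", "eventIds")


def regenerate_ids(records: list[dict[str, Any]]) -> list[dict[str, Any]]:
    """Give every record a fresh UUID and rewrite references within the batch."""
    # Step 1: build an old-id -> new-id map for every record in the batch.
    id_map = {rec["id"]: str(uuid.uuid4()) for rec in records}
    remapped = []
    for rec in records:
        new_rec = dict(rec)  # shallow copy so the input batch is left untouched
        new_rec["id"] = id_map[rec["id"]]
        # Step 2: rewrite array-valued references; ids outside the batch are kept.
        for field in ARRAY_REF_FIELDS:
            if field in new_rec:
                new_rec[field] = [id_map.get(ref, ref) for ref in new_rec[field]]
        remapped.append(new_rec)
    return remapped
```

Building the full map before rewriting any record means a reference resolves correctly regardless of the order in which records appear in the export.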
Added
- Emits a provenance metadata line with exporterUserId at the start of every full export for reliable cross-user detection
- Emits userId on exported object annotations so cross-user detection works for exports that contain no persona lines
- Import dialog now shows a cross-user banner, per-conflict smart defaults, an "apply to all" bulk resolution, and auto-collapses large conflict groups
0.1.6 - 2026-03-28
Fixed
- Generates new UUIDs when importing annotations from a different user even when original IDs are absent from the database
0.1.5 - 2026-03-10
Fixed
- Fixes object annotation dropdown jitter when creating a second bounding box on a video
0.1.4 - 2026-03-10
Fixed
- Scopes export keyframe and interpolated frame statistics to the authenticated user's annotations
0.1.3 - 2026-03-06
Fixed
- Skips invalid annotation sequences during export instead of returning 400
0.1.2 - 2026-03-06
Fixed
- Stabilizes entity dropdown scroll behavior in annotation autocomplete
0.1.1 - 2026-03-06
Fixed
- Scopes annotation export to the authenticated user's personas
0.1.0 - 2026-02-27
Initial release of Fovea, the Flexible Ontology Visual Event Analyzer.
Added
Core Platform
- React + TypeScript frontend with Material UI, built with Vite
- Fastify + TypeScript backend with Prisma ORM and PostgreSQL
- FastAPI + Python model service for AI inference
- Docker Compose orchestration for all services
- Docusaurus documentation site
Video Management
- Video browser with metadata display, search, and filtering
- S3 and local filesystem storage providers with hybrid support
- Video streaming endpoint with range request support
- Thumbnail generation for video previews
- Video sync endpoint for bulk metadata ingestion
Annotation System
- Bounding box annotation with draw, resize, and drag support
- Keyframe-based bounding box sequences with interpolation
- Linear and bezier interpolation modes with visibility ranges
- Canvas-based timeline with playhead scrubbing and zoom (1-10x)
- Keyboard shortcuts for frame navigation and workspace switching
- JSON Lines import/export with conflict resolution and preview
- Automated tracking integration (SAMURAI, SAM2, YOLO11-seg) for bootstrapping annotations
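
For readers unfamiliar with keyframe interpolation, the linear mode listed above can be sketched as below. This is a generic illustration of the technique, not Fovea's implementation (which lives in the TypeScript frontend); the `Box` fields and the `interpolate_box` signature are assumptions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Box:
    # Hypothetical bounding-box shape: top-left corner plus size, in pixels.
    x: float
    y: float
    width: float
    height: float


def interpolate_box(frame: int, start_frame: int, start: Box, end_frame: int, end: Box) -> Box:
    """Linearly interpolate a box for a frame between two keyframes."""
    if start_frame >= end_frame or not start_frame <= frame <= end_frame:
        raise ValueError("frame must lie between two distinct, ordered keyframes")
    # t runs from 0.0 at the start keyframe to 1.0 at the end keyframe.
    t = (frame - start_frame) / (end_frame - start_frame)

    def lerp(a: float, b: float) -> float:
        return a + (b - a) * t

    return Box(
        x=lerp(start.x, end.x),
        y=lerp(start.y, end.y),
        width=lerp(start.width, end.width),
        height=lerp(start.height, end.height),
    )
```

A bezier mode would replace the linear `t` with an eased value computed from control points, leaving the per-field blend unchanged.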
Ontology Management
- Persona-scoped ontology types (entity, role, event, relation)
- Multi-persona type creation and shared type tracking
- AI-powered type suggestions via LLM integration
- Wikidata integration with one-click import and ID mapping
- Configurable Wikidata URL with local Wikibase support
- Gloss editor with autocomplete and claim references
World State
- World object editors for entities, events, times, locations, and collections
- World state persistence to PostgreSQL
- Auto-save with debounce for all world objects
Video Summarization
- VLM-powered video summarization with persona context
- BullMQ job queue for async processing
- Key frame extraction with confidence scoring
- Audio transcription with speaker diarization (AssemblyAI, Deepgram, Azure, AWS, Google, Rev.ai, Gladia)
- Audio-visual fusion strategies
- Summary preview on Claims tab
Claims System
- Hierarchical claims and subclaims with manual editing
- Claim extraction from summaries via LLM
- Claim synthesis with BullMQ queue worker
- Typed claim relations with filtering and search
- Claim provenance tracking with comment fields
- Claim span highlighting in summaries
Object Detection
- Multi-model detection (YOLO-World, OWLv2, Florence-2, Grounding DINO)
- Configurable query options with ontology-aware prompts
- Detection candidate review with accept/reject controls
AI Model Service
- Model configuration system with YAML-based profiles
- Multi-model support for VLM, LLM, detection, and tracking tasks
- SGLang, vLLM, and Transformers inference frameworks
- 4-bit quantization support via bitsandbytes
- Model status dashboard with VRAM monitoring
- Model settings panel with per-task model selection
- External API support (Anthropic Claude, OpenAI GPT, Google Gemini)
- Pre-loading of selected models on service startup
- GPU configuration profiles for various hardware (A10G, etc.)
Authentication and Security
- Session-based authentication with progressive lockout
- Single-user mode with auto-authentication
- Admin user management with secure password handling
- User-scoped API keys with AES-256-GCM encryption
- Session management with heartbeat, emergency save, and expiry warnings
- CSRF protection and rate limiting by client IP
Data Management
- Full export/import system with Zod validation for all data types
- User-scoped data isolation with cross-user conflict resolution
- Persona auto-save on creation
- Auto-save for annotations, ontology types, and world objects
Observability
- OpenTelemetry distributed tracing across all services
- Prometheus metrics with custom counters
- Grafana dashboards for monitoring
- Health check endpoints with Docker HEALTHCHECK
- Structured logging throughout
Infrastructure
- GitHub Actions CI/CD with lint, test, and Docker builds
- Release workflow with automatic changelog generation
- Deployment workflow with rsync and health checks
- Security scanning with CodeQL and TruffleHog
- Docker multi-stage builds with BuildKit optimizations
- Redis caching with CacheService integration
- Database indexes for performance
Frontend Architecture
- State management migration from Redux to TanStack Query + Zustand
- Feature-based directory structure with barrel exports
- Path aliases for clean imports
- Error boundaries with retry capability
- TypeScript strict mode with proper typing throughout
Backend Architecture
- Typed error class hierarchy with global error handler
- Modular video route structure
- VideoRepository pattern for database access
- Standardized storage configuration with STORAGE_PATH