Technical Architecture

System design, data model, and technical decisions.
Privilege by design. The architecture ensures attorney-client privilege and work product protection are structural properties of the system, not afterthoughts. The attorney directs all AI analysis. Enterprise API terms maintain confidentiality. Client data never touches consumer-grade AI services.
Tamper-evident from capture. Every artifact is SHA-256 hashed and recorded in a JSONL manifest at the moment of ingestion. The chain of custody is unbroken from capture through court presentation.
AI costs are tiered. Most operations require no LLM. Search and extraction are pre-computed at ingestion. Only attorney-directed synthesis and analysis invoke frontier models, and even then agentic retrieval ensures only relevant evidence is sent.
The vault is the platform. Evidence capture, preservation, and organization provide standalone value without any AI features. The AI analysis layer is a premium capability on top of a fundamentally useful evidence management system.
Agentic over semantic. Retrieval is reasoning-based, not similarity-based. The system builds structured evidence indexes at ingestion and uses LLM reasoning to navigate them. No vector database, no embedding vendor, no chunking. Similarity is not relevance; relevance requires reasoning.
Multi-tenant from day one. Every table, every S3 path, every query is scoped by firm and case. Multi-tenancy is a data model property, not a bolt-on.
Agent-native from day one. Every API endpoint is simultaneously a UI operation and an agent tool. The API is the tool layer; there is no separate "agent API." Agents are first-class practitioners operating under attorney direction, not feature add-ons. See agent-native.md for the full agent architecture.
Every FastAPI endpoint is an atomic tool. The API is the tool layer. The UI and agents call the same endpoints. This is the foundational architectural decision that makes agent parity automatic rather than aspirational.
Long-running operations return a job_id instead of blocking. See Async Operations below.

Every FastAPI route carries tool metadata via OpenAPI extensions:
| Extension | Required | Purpose |
|---|---|---|
| x-tool-name | Yes | Tool identifier (e.g., evidence.search) |
| x-tool-permission | Yes | Required permission (e.g., read:evidence) |
| x-tool-audit-category | Yes | Audit category (e.g., search, create, delete) |
| x-tool-entity-type | No | Entity type acted upon (e.g., evidence, fact) |
A tool_meta() helper function attaches these extensions to routes at definition time. A unit test validates the helper works correctly, but there is no CI step that validates all routes carry the required extensions.
The Next.js frontend calls the same API endpoints agents call. No BFF (backend-for-frontend). No screen-specific endpoints. The frontend currently uses raw fetch() calls to the API. OpenAPI-based TypeScript client generation (e.g., @hey-api/openapi-ts) is planned but not yet implemented.
The top bar contains a right-side slot for page-specific action buttons (e.g., "Upload Evidence" on the evidence browser, "Create Case" on the case list). Pages inject content via TopBarActionsContext: a React context that holds the current action node. The shell layout reads this context and renders it in the top bar. Pages use the useTopBarActions(node) hook to register their actions on mount and clear them on unmount. This pattern is defined in frontend/components/shell/topbar-actions-context.tsx and is the standard for all M3+ pages.
All endpoints follow consistent conventions for pagination, idempotency, and versioning:
- Pagination: every .list endpoint returns { items, next_cursor, has_more }. Cursor-based rather than offset-based because evidence, facts, and events are frequently inserted.
- Idempotency: POST (create) endpoints accept an Idempotency-Key header. This is implemented centrally in backend/app/middleware/idempotency.py (Redis-backed, 24-hour TTL, body-hash matching, lock-based deduplication). Endpoint implementations do not need per-service idempotency logic; the middleware layer handles it automatically for all POST requests that include the header.
- Versioning: all routes carry a /v1/ prefix. Breaking changes require a new version; non-breaking additions don't.
- Enums: controlled vocabularies (source_type, capture_method, processing_status, case status, etc.) are validated at the application layer via Pydantic, not with DB-level CHECK constraints or PostgreSQL ENUM types. Allowed values are defined in a central module (e.g., app/constants.py or per-domain enums.py) and referenced by both schemas and services. This enables rapid iteration: adding a new source type is a code change, not an Alembic migration. DB columns store plain VARCHAR. Existing CHECK constraints on case status should be migrated to this pattern for consistency.

See agent-native.md: API Conventions for the full specification.
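A common way to implement the has_more flag is the fetch-one-extra pattern: query limit + 1 rows and use the surplus row as the signal. The sketch below is illustrative; encoding the cursor as the last item's id is an assumption, not the platform's actual cursor format.

```python
# Sketch of cursor-based pagination using the fetch-limit+1 trick.
# Assumes `rows` are dicts with an "id" key, ordered by the cursor column,
# and that the caller queried limit + 1 rows.
def build_page(rows: list[dict], limit: int) -> dict:
    has_more = len(rows) > limit  # the surplus row signals another page exists
    items = rows[:limit]
    next_cursor = str(items[-1]["id"]) if has_more else None
    return {"items": items, "next_cursor": next_cursor, "has_more": has_more}

page = build_page([{"id": 1}, {"id": 2}, {"id": 3}], limit=2)
```

The caller then passes next_cursor back as a query parameter, and the service resumes the scan from that id rather than recounting offsets, which stays correct even as new rows are inserted ahead of the cursor.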
Intactus uses Server-Sent Events (SSE) for one-way real-time updates from the backend to the browser. SSE is the default transport for progress updates, enrichment step completion, and firm-scoped event delivery where the client needs a live feed but does not need bidirectional messaging.
Browsers support SSE natively via EventSource without a custom client protocol.

The real-time path is:
Celery / backend service
-> publish event to Redis pub/sub
-> SSE endpoint subscribes to Redis channel
-> Next.js same-origin proxy streams the response
-> browser EventSource updates the UI

There are two SSE scopes in the backend:
The backend publishes events into Redis pub/sub channels keyed by firm and, when needed, by case. This keeps event fan-out cheap and avoids holding PostgreSQL connections open for long-poll delivery.
The SSE endpoints live in a dedicated Starlette sub-application mounted under /api/v1/sse. This is intentional.
Mounting SSE as a sub-application keeps long-lived streaming responses out of the main app's BaseHTTPMiddleware layers for concerns like error handling, idempotency, tenant handling, and rate limiting.

The backend SSE handlers do four things: subscribe to the firm- or case-scoped Redis channel, send a snapshot of current state, stream incremental updates, and emit periodic heartbeats.
The snapshot-then-stream pattern is important. On connect, the server first sends the current known state so the client can hydrate immediately, then it continues with incremental updates. This avoids races where the UI connects midway through an active enrichment run and would otherwise wait for the next event before showing current state.
Heartbeats are sent periodically to keep the connection alive through proxies and load balancers and to refresh server-side connection bookkeeping.
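The SSE wire format and the snapshot-then-stream ordering can be sketched as a generator. The event names and payload shapes here are illustrative, not the platform's actual event vocabulary.

```python
import json

def sse_frame(event: str, data: dict) -> str:
    # One SSE frame: an event-name line, a data line with a JSON payload,
    # and a blank-line terminator that ends the frame.
    return f"event: {event}\ndata: {json.dumps(data)}\n\n"

def snapshot_then_stream(snapshot: dict, updates):
    # Hydrate the client immediately with current state, then follow with
    # incremental updates as they arrive from the Redis subscription.
    yield sse_frame("snapshot", snapshot)
    for update in updates:
        yield sse_frame("update", update)

frames = list(snapshot_then_stream({"status": "enriching"}, [{"step": "ocr"}]))
```

A browser EventSource dispatches each named frame to the matching addEventListener handler, so the client can treat "snapshot" and "update" differently without a custom protocol.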
The browser does not connect directly to the backend SSE endpoint. Instead, it connects to a same-origin Next.js route that proxies the stream through to the backend.
This proxy exists for practical reasons:
- EventSource cannot set arbitrary authorization headers

The Next.js route reads the session cookie, adds the backend Authorization header, requests the backend SSE stream, and returns the stream body without buffering or transformation.
On the client side:
- EventSource is used for the live connection

The frontend treats SSE as the preferred real-time channel, not the only channel.
This hybrid approach keeps the UI responsive without making correctness depend on a permanently healthy stream. SSE improves latency and user experience; polling remains the safety net.
SSE follows the same tenant boundaries as the rest of the platform.
firm_id, and case-level channels also include case_idThis preserves the same firm/case isolation model used throughout the API and database layers.
SSE is the right fit for Intactus today because the product mostly needs server-to-browser event delivery, not collaborative peer messaging or duplex sessions. If future features require true bidirectional low-latency messaging, WebSockets or a dedicated realtime service may be warranted. For the current architecture, SSE provides the lowest-complexity path to live updates while staying aligned with the existing HTTP-first API design.
CLIENT PORTAL ATTORNEY DASHBOARD
(Responsive Web → Native) (Web Application)
| |
v v
┌───────────────────────────────────────────┐
│ API GATEWAY │
│ (Caddy → FastAPI, Auth, Rate Limits) │
└─────────────────┬─────────────────────────┘
│
┌─────────────────┼──────────────────────┐
│ │ │
v v v
┌─────────────┐ ┌──────────────┐ ┌──────────────────┐
│ INGESTION │ │ SEARCH │ │ ANALYSIS │
│ PIPELINE │ │ (PostgreSQL) │ │ (Future) │
│ │ │ │ │ │
│ - File upload│ │ - Full-text │ │ - Agentic │
│ - Text │ │ search │ │ retrieval │
│ extraction │ │ - Entity │ │ - Sonnet/Opus │
│ - OCR (Haiku)│ │ lookup │ │ via Anthropic │
│ - Hash │ │ - Filters │ │ API │
│ - Manifest │ │ │ │ - Report gen │
│ - Enrichment │ │ Cost: FREE │ │ - Chat with │
│ (entities, │ │ at query │ │ My Case │
│ facts, │ │ │ │ │
│ relations,│ │ │ │ Cost: $$ │
│ summaries)│ │ │ │ per query │
│ │ │ │ │ │
│ Cost: $0.005 │ │ │ │ │
│ per item │ │ │ │ │
└──────┬───────┘ └──────┬───────┘ └────────┬──────────┘
│ │ │
v v v
┌──────────────────────────────────────────────────┐
│ DATA LAYER │
│ │
│ PostgreSQL 16 S3 (artifacts) │
│ - metadata - files, images │
│ - manifest - screenshots │
│ - entities - audio, video │
│ - relationships - .eml files │
│ - hashes - text exports │
│ - summaries │
│ - classifications │
│ │
│ ┌─────────────────────────────────────────┐ │
│ │ TASK QUEUE (Celery + Redis) │ │
│ │ - Ingestion jobs │ │
│ │ - Enrichment jobs │ │
│ └─────────────────────────────────────────┘ │
└──────────────────────────────────────────────────┘
Every record in the system is scoped by firm and case. This is not a future concern; it is a structural property of the data model from the first migration.
┌──────────┐
│ Firm │
│ (tenant) │
└────┬─────┘
│ has many
┌────┴─────┐
│ Case │
└────┬─────┘
│ has many
┌───────────────┼──────────────┬──────────────┬──────────────┐
│ │ │ │ │
┌────┴────┐ ┌──────┴────┐ ┌─────┴──────┐ ┌────┴────┐ ┌──────┴─────┐
│Evidence │ │ Entity │ │Relationship│ │ Fact │ │ Claim │
│ Item │ │ │ │ │ │ │ │ └─ Issue │
└─────────┘ └───────────┘ └────────────┘ └─────────┘ │ (future) │
└────────────┘
- Every table carries firm_id and case_id columns
- Auth tables (sessions, agent_sessions) are exempt from RLS; token lookups are scoped by cryptographic hash and must resolve the tenant before RLS context can be set
- S3 artifact keys follow /{firm_id}/{case_id}/{artifact_type}/{artifact_id}/(unknown)
- Implementation: see backend/app/tenant.py for session-scoped tenant context, migrations 0003/0004 for RLS policies, and migration 0011 for the auth table exemption.
Actor attribution: All tables with audit fields (created_by, updated_by, or equivalent) include actor_type (human | agent) and actor_id columns. This distinguishes human actions from agent actions across the data model without requiring separate audit infrastructure. When actor_type is agent, the actor_id references the agent record and the agent's agent_owner_id links to the directing attorney.
| Role | Sees | Can Do |
|---|---|---|
| Firm admin | All cases in firm | Manage attorneys, billing, firm settings |
| Attorney | Assigned cases | Full evidence management, AI analysis, case administration |
| Staff | Assigned cases | Evidence intake, organization, search (no AI analysis) |
| Agent | Cases assigned to owner attorney | Tool-layer operations scoped by permission grants (see Agent Role Model) |
| Client | Own case only | Upload evidence, portal features, Chat with My Case, secure messaging |
| Guest | Specific case(s) only | Read-only access to attorney dashboard view for that case. For co-counsel, expert witnesses, mediators, guardians ad litem. |
Authentication: FastAPI owns all authentication. Human users authenticate via magic links (email-based, passwordless) sent through Resend. Prefixed bearer tokens with SHA-256 hashing only — raw tokens are never persisted. Sessions are stored in PostgreSQL. See backend/app/auth/ for the full implementation.
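The prefixed-token, hash-only-storage scheme can be sketched as below. The prefix string and token length are illustrative assumptions, not the platform's actual values.

```python
import hashlib
import secrets

TOKEN_PREFIX = "ivt_"  # assumption: illustrative prefix, not the real one

def mint_token() -> tuple[str, str]:
    # Return (raw_token, sha256_hash). Only the hash is persisted; the raw
    # token is shown to the user exactly once and never stored.
    raw = TOKEN_PREFIX + secrets.token_urlsafe(32)
    return raw, hashlib.sha256(raw.encode()).hexdigest()

def hash_for_lookup(presented: str) -> str:
    # Session lookup hashes the presented bearer token and matches on the
    # hash column, so a database leak never exposes usable tokens.
    return hashlib.sha256(presented.encode()).hexdigest()
```

The prefix makes leaked tokens greppable by secret scanners without revealing anything about the hash stored server-side.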
Agent authentication: Agents use API keys exchanged for scoped session tokens. A single auth dependency validates both human and agent tokens, resolving to a unified principal model. Agent tables (API keys, sessions, audit log) ship in M1; management endpoints ship in M2+. See backend/app/models/agent.py for the data model and agent-native.md: Agent Authentication & Sessions for the full specification.
Agent permissions: Every agent has an agent_owner_id foreign key referencing the directing attorney. An agent inherits its case access from the owner attorney but is further scoped by operation type (read, write, delete, analyze) and entity type. Permission checks are enforced at the API layer via require_scope(): same middleware, same firm/case scoping, with additional agent scope validation. See agent-native.md: Agent Role Model for the full permission model.
A Fact is a discrete factual assertion, linked to one or more source passages across one or more evidence items that support it. Facts are the atomic unit of case reasoning; they connect raw evidence to legal arguments with passage-level provenance.
Facts are generated at two points:
The facts table stores the assertion text, a confidence score (0.00-1.00), a status field (auto, approved, dismissed), approval tracking (approved_by, approved_at), temporal fields (occurred_at, occurred_end_at, occurred_precision) for when the fact took place, and a JSONB metadata column. See backend/app/models/knowledge_graph.py (Fact class) for the full column set.
The fact_evidence junction table implements the many-to-many relationship between facts and evidence items. Each row captures where in a specific evidence item the fact is grounded: source_snippet, start_offset, end_offset, and an is_primary flag for the canonical source. The table uses a composite primary key on (fact_id, evidence_item_id) and includes firm_id denormalized for RLS.
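Building a grounding row from an evidence item's extracted text can be sketched as follows. The field names mirror the junction table; the helper function itself is hypothetical.

```python
# Illustrative sketch: construct a fact_evidence grounding row with
# passage-level provenance (snippet plus character offsets into the
# evidence item's extracted_text).
def ground_fact(fact_id: str, evidence_item_id: str, text: str,
                start: int, end: int, is_primary: bool = False) -> dict:
    assert 0 <= start < end <= len(text), "offsets must fall inside extracted_text"
    return {
        "fact_id": fact_id,
        "evidence_item_id": evidence_item_id,
        "source_snippet": text[start:end],
        "start_offset": start,
        "end_offset": end,
        "is_primary": is_primary,
    }
```

Storing both the snippet and the offsets lets the UI highlight the exact passage even if display rendering differs from the stored text.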
Two additional junction tables connect facts to other knowledge graph elements:
- fact_entities: Links facts to the entities they mention, with an optional role and mention_text. Composite PK on (fact_id, entity_id).
- relationship_facts: Links relationships to the facts that support them. Composite PK on (relationship_id, fact_id).

See backend/app/models/knowledge_graph.py for FactEvidence, FactEntity, and RelationshipFact.
Two-tier approval model:
| Tier | Generated By | Approval | Examples |
|---|---|---|---|
| Auto-computed metadata | Ingestion pipeline (Haiku) | Stored automatically, attorney can edit/override anytime | Entity extraction, 2-sentence summaries |
| Facts & assertions | Ingestion pipeline or analysis engine | Require explicit attorney review: approve, edit, or dismiss | "Respondent was 40 minutes late for pickup on Oct 14", "Financial disclosure contradicts social media post from Sep 3" |
The approval workflow supports batch operations: select multiple facts, approve all, or filter by confidence and approve in bulk. This keeps the workflow fast for the 80% of facts that are obviously correct while preserving rigor for borderline findings.
Fact state machine: Facts have three statuses (auto, approved, dismissed) but no terminal states. The batch_update_status service function transitions any set of facts to any valid status. There are no separate /approve or /dismiss endpoints — all status changes go through POST /facts/bulk-update with a status field. Facts can be soft-deleted via DELETE /facts/{id} (sets deleted_at/deleted_by_id). The system uses hard-delete for orphan cleanup during re-analysis.
Batch operations: The bulk-update endpoint follows a validate-all-before-execute pattern: all fact IDs must be validated as belonging to the requesting firm_id + case_id before any transition is executed. Max 100 facts per batch; exceeding returns 422. Response is per-item: [{ fact_id, status: "approved" | "failed", error?: str }].
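The validate-all-before-execute shape, as a simplified in-memory sketch; the dict-based store and error strings are illustrative, not the real service layer.

```python
MAX_BATCH = 100  # exceeding this is surfaced as HTTP 422

def bulk_update_status(facts_by_id: dict, fact_ids: list,
                       firm_id: str, case_id: str, new_status: str) -> list:
    if len(fact_ids) > MAX_BATCH:
        raise ValueError("max 100 facts per batch")
    # Phase 1: validate every ID against the tenant scope before mutating
    # anything, so a partial batch never half-applies.
    for fid in fact_ids:
        fact = facts_by_id.get(fid)
        if not fact or fact["firm_id"] != firm_id or fact["case_id"] != case_id:
            raise ValueError(f"fact {fid} not in this firm/case")
    # Phase 2: execute transitions and report per-item outcomes.
    results = []
    for fid in fact_ids:
        facts_by_id[fid]["status"] = new_status
        results.append({"fact_id": fid, "status": new_status})
    return results
```

Because validation completes before any write, a single bad ID rejects the whole batch rather than leaving some facts transitioned and others not.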
Claims and Issues: The claims, issues, issue_facts, and issue_evidence tables are defined in the Issue & Claim Schema section below but are not yet implemented.
At ingestion, the enrichment pipeline extracts relationships between entities using a category + label system. Rather than a flat enum of relationship types, each relationship has a structured category (from a fixed set of 7), a free-text label describing the specific relationship, and an optional inverse_label for the reverse direction.
| Evidence Type | Extracted Relationships (category / example labels) |
|---|---|
| Email | communication / "sent email to", "cc'd on email"; professional / "works with" |
| Text message | communication / "messaged", "texted"; social / "is friend of" |
| Court document | legal / "filed by", "regarding"; professional / "represented by" |
| Financial record | financial / "paid", "owes"; professional / "employed by" |
| General evidence | familial / "parent of", "sibling of"; spatial / "lives at", "works at" |
The 7 relationship categories are defined in KG_RELATIONSHIP_CATEGORIES in backend/app/enums.py: familial, professional, legal, social, financial, spatial, communication. The API validates category against this set; unknown values return 422. The label and inverse_label fields are free-text, allowing the LLM to describe the specific nature of each relationship.
Each relationship links a source_entity_id to a target_entity_id, with an optional confidence score. Direction-agnostic deduplication is enforced via a unique index on LEAST(source_entity_id, target_entity_id), GREATEST(source_entity_id, target_entity_id), case_id, and category — so "A is parent of B" and "B is parent of A" within the same category cannot coexist as separate rows.
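The direction-agnostic uniqueness rule is easy to mirror in application code; this sketch normalizes the entity pair the same way the LEAST/GREATEST index does.

```python
def relationship_dedup_key(source_id: str, target_id: str,
                           case_id: str, category: str) -> tuple:
    # Order the entity pair so (A, B) and (B, A) collapse to a single key,
    # mirroring the unique index on LEAST(source_entity_id, target_entity_id),
    # GREATEST(source_entity_id, target_entity_id), case_id, category.
    low, high = sorted((source_id, target_id))
    return (low, high, case_id, category)
```

Checking this key before insert lets the service return the existing row instead of bouncing off the database constraint.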
Relationships support basic CRUD operations and can be linked to supporting facts via the relationship_facts junction table. See backend/app/models/knowledge_graph.py (Relationship and RelationshipFact classes) for the full schema.
Full-text search: facts.assertion uses PostgreSQL tsvector/tsquery with a GIN index. Migration must include: CREATE INDEX facts_assertion_fts ON facts USING GIN (to_tsvector('english', assertion));
An Evidence Item is the core unit of the vault: a captured artifact with full provenance, hash-chain integrity, and pre-computed metadata. Evidence items are created by the ingestion pipeline and are immutable once hashed.
The evidence_items table organizes its columns into several groups:
- Source classification: source_type and capture_method classify how the evidence entered the system.
- Content: title, extracted_text (full text; HTML stripped, OCR'd via PyMuPDF/fitz for PDFs and Anthropic Vision for images), content_date (original content date), captured_at.
- Integrity: sha256_hash (computed by pipeline after upload, nullable at creation), manifest_entry_id (reference to JSONL manifest entry).
- Processing: processing_status, processing_error, processing_steps (per-step JSONB tracking).
- AI metadata: summary, summary_one_liner, classifications (JSONB), confidence. Also extraction_instructions and enrichment_instructions (attorney-provided custom instructions for re-extraction and enrichment), and text_modified_since_enrichment (flag indicating extracted text changed after last enrichment).
- Review: reviewed_at, reviewed_by_id mark when an attorney reviewed the evidence.
- Storage: s3_key, content_type, size_bytes, original_filename.
- Soft delete: deleted_at, deleted_by_id.
- Extensibility: metadata JSONB for source-specific fields.

See backend/app/models/evidence.py for the full column set.
Notes:
- source_type values: web, email, sms, file, image, video, audio, other. Defined in SOURCE_TYPES in backend/app/enums.py.
- capture_method values: upload, browser, api, email_forward. Defined in CAPTURE_METHODS in backend/app/enums.py. Determines which pipeline variant processes the item.
- processing_status has 6 states: pending, processing, completed, failed, enriching, partially_enriched. Defined in PROCESSING_STATUSES in backend/app/enums.py. Items with pending or processing status show a progress indicator in the UI. Items with failed status surface the error to the attorney.
- processing_steps tracks per-step completion within the pipeline (status, timestamp, errors). Each step is recorded independently so partial failures are visible. Structure: {"step_name": {"status": "completed"|"failed"|"skipped", "started_at": "...", "completed_at": "...", "error": "..."}}.

An Entity is a named item extracted from evidence during ingestion. Entities are deduplicated across evidence items within a case and can be merged manually by the attorney.
Entities have exactly 3 types: person, organization, location — defined in KG_ENTITY_TYPES in backend/app/enums.py. Each entity stores a display_name (canonical name) and a dedup_key (normalized for deduplication). Uniqueness is enforced via a partial unique index on (firm_id, case_id, entity_type, dedup_key) WHERE merged_into_id IS NULL AND deleted_at IS NULL, allowing soft-deleted and merged entities to coexist with active ones sharing the same key.
Additional entity columns include:
- summary and summary_one_liner: AI-generated entity summaries, debounced per-entity via the entity summarization pipeline. summary_edited tracks whether the attorney has overridden the AI summary.
- parsed_date and date_precision: Temporal metadata for date-like entities.
- deleted_at, deleted_by_id: Soft delete timestamps.
- merged_into_id: Self-referencing FK for merge tracking; when set, the entity has been merged into the referenced entity. Merge and delete are distinct operations (see Record Lifecycle Model below).
- metadata (JSONB): Entity-type-specific fields.

See backend/app/models/knowledge_graph.py (Entity class) for the full column set.
Entity-evidence junction: The entity_evidence table links entities to evidence items. It uses a surrogate UUID primary key with a unique constraint on (entity_id, evidence_item_id) — guaranteeing one link per pair. Each row also stores start_offset, end_offset, mention_text, and confidence. The firm_id is denormalized for RLS. See backend/app/models/knowledge_graph.py (EntityEvidence class).
Related tables: Several additional tables support the entity subsystem:
- dedup_suggestions: Stores fuzzy-match dedup candidates with a score, match_reasons (JSONB), and a cluster_id for grouping. Suggestions have a status (pending/accepted/rejected) with decided_by/decided_at tracking. Unique constraint on (source_entity_id, target_entity_id).
- merge_history: Audit trail for entity merges. Each row records a single column change made during a merge operation: table_name, row_id, column_name, old_value, new_value. Grouped by merge_event_id so an entire merge can be replayed or reversed.
- temporal_mentions: Stores date/time references extracted from evidence, linked to a fact_id and/or evidence_item_id. Includes occurred_at, end_at, precision, mention_text, display_text, temporal_role (when, deadline, filed, created, received), and character offsets.
- fact_entities: Junction table linking facts to the entities they mention (see Fact Schema).
- relationship_facts: Junction table linking relationships to supporting facts (see Relationship Schema).

See backend/app/models/knowledge_graph.py for DedupSuggestion, MergeHistory, TemporalMention, FactEntity, and RelationshipFact.
All three core data types (evidence, entities, facts) follow a consistent lifecycle model. Records can be in one of four mutually exclusive states:
| State | merged_into_id | deleted_at | deleted_by_id | Meaning |
|---|---|---|---|---|
| Active | NULL | NULL | NULL | Normal record |
| Merged | target UUID | NULL | NULL | Absorbed into target; data lives on |
| Deleted | NULL | timestamp | user UUID | Explicitly removed by user |
| Merged + Erased | target UUID | timestamp | user UUID | Future GDPR path |
Active predicate: WHERE merged_into_id IS NULL AND deleted_at IS NULL (used in all list queries).
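The four states and the active predicate can be expressed directly as a sketch over plain dicts; field names match the lifecycle table.

```python
def lifecycle_state(rec: dict) -> str:
    # Classify a record by its merged_into_id / deleted_at columns.
    merged = rec.get("merged_into_id") is not None
    deleted = rec.get("deleted_at") is not None
    if merged and deleted:
        return "merged+erased"  # future GDPR path
    if merged:
        return "merged"
    if deleted:
        return "deleted"
    return "active"

def is_active(rec: dict) -> bool:
    # Mirrors WHERE merged_into_id IS NULL AND deleted_at IS NULL.
    return lifecycle_state(rec) == "active"
```

Because the states are mutually exclusive functions of two columns, the predicate needs no status column and cannot drift out of sync with the lifecycle data.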
Delete behavior by actor:

- Attorney delete (soft delete): sets deleted_at/deleted_by_id. Leaves junction rows in place. Invalidates dedup suggestions. Reversible with audit trail.
- Attorney merge: sets merged_into_id only. NOT deletion. Preserves lineage per FHIR/MDM standards.

Soft-delete leaves junction rows intact: When a record is soft-deleted, its junction/child rows (fact_entities, entity_evidence, relationship_facts, etc.) remain in the database. They are filtered at query time via the active predicate on the parent table. This preserves graph connections for future undo/restore capabilities.
Merge behavior: When an attorney merges two entities, the secondary entity's merged_into_id is set to the primary entity's id. All entity_evidence rows, relationships, and aliases transfer to the primary. The secondary entity is retained (merged, not deleted) so that existing references resolve correctly. Every column change during the merge is recorded in merge_history, enabling full unmerge capability. After merge completes, entity summarization is automatically triggered for the primary entity to incorporate the newly absorbed data.
Merge security: The merge endpoint validates that both entities share the same firm_id and case_id. No cross-case merges are permitted.
Merge idempotency: If the secondary entity already has merged_into_id = primary_id, the merge is a no-op. Return 200 with the current state of the primary entity.
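The scoping check and the idempotent no-op combine naturally; this is an in-memory sketch only, since the real merge also transfers entity_evidence rows, relationships, and aliases, and writes merge_history.

```python
def merge_entities(secondary: dict, primary: dict) -> dict:
    # Idempotency: if already merged into this primary, no-op and return
    # the primary's current state (HTTP 200 at the API layer).
    if secondary.get("merged_into_id") == primary["id"]:
        return primary
    # Security: both entities must share the same firm and case.
    if (secondary["firm_id"], secondary["case_id"]) != \
       (primary["firm_id"], primary["case_id"]):
        raise PermissionError("cross-case merge not permitted")
    secondary["merged_into_id"] = primary["id"]  # merged, not deleted
    return primary
```

Checking idempotency before the scope check makes retries of an already-applied merge cheap and side-effect free.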
Auto-deduplication: The dedup_key is computed at write time by generate_dedup_key() in backend/app/services/entity_service.py. The key is prefixed with the entity type and applies type-specific normalization: person names strip titles/suffixes and canonicalize nicknames; organization names strip legal suffixes and expand abbreviations; location names expand address abbreviations. After type-specific processing, the name is lowercased, accents are stripped (NFKD), punctuation is removed, tokens are sorted alphabetically, and joined with underscore. Example: "Dr. Sarah Johnson" (person) → person:johnson_sarah. Exact-match duplicates are caught at insertion time. Fuzzy matches (e.g., "Sarah Johnson" vs "S. Johnson") are surfaced to the attorney as dedup_suggestions rather than auto-merged, preserving accuracy over convenience.
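The normalization pipeline can be sketched as below. This is a simplified reconstruction assuming only the steps described above; the real generate_dedup_key() applies richer type-specific rules (nickname canonicalization, legal-suffix stripping, address-abbreviation expansion) that are omitted here, and the title list is an illustrative subset.

```python
import re
import unicodedata

# Assumption: illustrative subset of the titles the real helper strips.
PERSON_TITLES = {"dr", "mr", "mrs", "ms", "prof"}

def generate_dedup_key(entity_type: str, name: str) -> str:
    # Strip accents (NFKD) and lowercase.
    name = unicodedata.normalize("NFKD", name)
    name = name.encode("ascii", "ignore").decode("ascii").lower()
    # Remove punctuation, then tokenize.
    tokens = re.sub(r"[^\w\s]", " ", name).split()
    if entity_type == "person":
        tokens = [t for t in tokens if t not in PERSON_TITLES]
    # Sort tokens alphabetically, join with underscore, prefix with type.
    return f"{entity_type}:{'_'.join(sorted(tokens))}"
```

Both "Dr. Sarah Johnson" and "Sarah Johnson" normalize to person:johnson_sarah, so the exact-match path catches them at insert time, while "S. Johnson" produces a different key and falls through to the fuzzy dedup_suggestions path.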
Fact count: Entity-level counts (fact count, evidence count, relationship count) are not stored as denormalized columns. They are computed at query time via JOIN.
Full-text search: Entity display_name is indexed for full-text search using PostgreSQL tsvector/tsquery with a GIN index. Do not use ILIKE for search. Migration must include: CREATE INDEX entities_display_name_fts ON entities USING GIN (to_tsvector('english', display_name));
Implementation Status: NOT YET IMPLEMENTED. The tables described below do not exist in the database. This section is retained as the design specification for a future milestone.
A Claim is a top-level legal argument or legal theory. An Issue is a sub-element within a claim — typically a legal element that must be proven, a statutory factor the court must weigh, or a discrete contested question. Together they provide the hierarchical structure attorneys use to organize facts and evidence around legal theories.
In tort/contract litigation: Claims map to legal theories (Negligence, Breach of Contract, RICO). Issues map to elements of each theory ("Duty," "Breach," "Causation," "Damages"). authority_citations may reference a statute or pattern jury instruction. Common law claims have no statute.
In family law: Claims map to petition or motion types ("Petition for Primary Physical Custody," "Motion for Contempt — Violation of Placement Order"). Issues map to statutory factors the court must weigh — e.g., the 16 enumerated factors in Wis. Stat. § 767.41(5)(am). The attorney creates one Issue per factor they intend to argue. The claim's authority_citations references the factor-list statute; individual Issues are labeled by factor name/number.
In criminal law (defense side): claim_type='defense' captures the overall defense theory ("Self-Defense," "Mistaken Identity," "Lack of Intent"). claim_type='affirmative_defense' captures formal affirmative defenses with their own required elements (self-defense requires: belief of imminent threat, reasonableness, not the initial aggressor). claim_type='motion' captures pretrial motions — a "Motion to Suppress — Unlawful Stop" becomes a Claim with Issues mapping to the 4th Amendment prongs the attorney must argue. Evidence suppression motions are high-value: winning one can collapse the prosecution's case.
In criminal law (prosecution side): claim_type='charge' represents each count in the charging document. authority_citations is effectively required — every criminal charge is statutory (e.g., Wis. Stat. § 940.01(1)(a) for 1st degree intentional homicide, Wis. Stat. § 346.63 for OWI, 18 U.S.C. § 1962 for RICO). Issues map to the elements of the offense that must be proven beyond a reasonable doubt. Each element becomes one Issue; Facts and evidence are linked to prove or disprove that element.
jurisdiction distinguishes state from federal charges — 'WI' for Wisconsin, 'federal' for federal crimes. A single case may have both (federal RICO charge + parallel state fraud charge for the same conduct). A claim can reference multiple statutes via the authority_citations array.
Neither jurisdiction nor authority_citations is required at the DB level. Common law civil claims have neither. But criminal charges should always include authority_citations.
For user-facing features, see Case Outlines and Issue Linking.
CREATE TABLE claims (
id UUID PRIMARY KEY,
firm_id UUID NOT NULL,
case_id UUID NOT NULL,
claim_type VARCHAR NOT NULL DEFAULT 'claim',
-- Civil: 'claim', 'counterclaim', 'affirmative_defense'
-- Criminal: 'charge' (prosecution-side, always statutory),
-- 'defense' (defense narrative/theory),
-- 'affirmative_defense' (self-defense, insanity — shared with civil),
-- 'motion' (pretrial/trial motions: suppress, dismiss, for judgment)
-- Family: 'claim' covers petitions/motions (differentiated by title)
title VARCHAR NOT NULL, -- e.g., "Count 1: 1st Degree Intentional Homicide",
-- "Self-Defense", "Motion to Suppress — Unlawful Stop",
-- "Petition for Primary Physical Custody", "Negligence"
description TEXT, -- optional narrative context
jurisdiction VARCHAR, -- optional: 'federal', 'WI', 'CA', etc.
authority_citations JSONB, -- optional for civil; effectively required for criminal charges
sort_order INT NOT NULL,
created_by UUID NOT NULL,
created_at TIMESTAMPTZ NOT NULL,
updated_at TIMESTAMPTZ NOT NULL
);
-- authority_citations format:
-- [{"citation": "Wis. Stat. § 940.01(1)(a)", "jurisdiction": "WI", "label": "1st Degree Intentional Homicide"},
-- {"citation": "18 U.S.C. § 1962", "jurisdiction": "federal", "label": "RICO"},
-- {"citation": "Wis. Stat. § 767.41", "jurisdiction": "WI", "label": "Wisconsin Custody Factors"}]
CREATE TABLE issues (
id UUID PRIMARY KEY,
firm_id UUID NOT NULL,
case_id UUID NOT NULL,
claim_id UUID NOT NULL REFERENCES claims(id),
title VARCHAR NOT NULL, -- e.g., "Late pickups"
description TEXT,
sort_order INT NOT NULL, -- ordering within the parent claim
created_by UUID NOT NULL,
created_at TIMESTAMPTZ NOT NULL,
updated_at TIMESTAMPTZ NOT NULL
);
-- Many-to-many: facts can support multiple issues
CREATE TABLE issue_facts (
issue_id UUID NOT NULL REFERENCES issues(id),
fact_id UUID NOT NULL REFERENCES facts(id),
PRIMARY KEY (issue_id, fact_id)
);
-- Many-to-many: evidence items can be linked to issues directly
CREATE TABLE issue_evidence (
issue_id UUID NOT NULL REFERENCES issues(id),
evidence_item_id UUID NOT NULL,
PRIMARY KEY (issue_id, evidence_item_id)
);

Claims and issues are scoped by firm_id and case_id like all other tables. The sort_order fields drive the attorney's custom ordering in the case outline view and in exported reports (Facts by Issues, Case Outline, Statement of Material Facts).
The same claims → issues → issue_facts → facts → fact_evidence → evidence_items query path drives different output documents depending on practice area and claim_type:
Civil litigation — Statement of Material Facts (SUMF)
Required with summary judgment motions. Numbered paragraphs, each a Fact with all supporting evidence citations, organized by legal element (Issue).

Family law — Hearing Memorandum / Trial Brief
Each custody factor (Issue) becomes a section. Facts under it become numbered paragraphs with exhibit citations. sort_order on Issues mirrors the statutory factor numbering (e.g., Wis. Stat. § 767.41(5)(am) factors 1–16).

Criminal — Case Theory Brief / Trial Brief
- claim_type='charge': each element of the offense (Issue) with corroborating facts.
- claim_type='defense': defense narrative organized by theory and supporting facts.
- claim_type='affirmative_defense': each required element of the defense with facts.
- claim_type='motion': suppression motion structured by constitutional prong (Issue), with facts and evidence citations supporting the challenge.
In all cases:
- sort_order on Claims and Issues controls document ordering
- affirmative_defense and defense claim_types use the same hierarchy as offensive claims — no separate table required

For the full agent-native design philosophy, role model, tool inventory, and multi-agent coordination model, see agent-native.md. This section covers the architectural integration points within the platform.
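As a sketch of how sort_order drives output ordering, assuming in-memory rows shaped like the tables above (field names mirror the schema, but this is illustrative, not actual ORM code):

```python
# Illustrative: assemble a claims -> issues -> facts outline in the
# attorney-defined order. Real rows would come from the SQL tables above.

def build_outline(claims, issues, issue_facts, facts):
    """Return a nested outline ordered by sort_order at each level."""
    facts_by_id = {f["id"]: f for f in facts}
    outline = []
    for claim in sorted(claims, key=lambda c: c["sort_order"]):
        claim_issues = [i for i in issues if i["claim_id"] == claim["id"]]
        sections = []
        for issue in sorted(claim_issues, key=lambda i: i["sort_order"]):
            fact_ids = [link["fact_id"] for link in issue_facts
                        if link["issue_id"] == issue["id"]]
            sections.append({
                "issue": issue["title"],
                "facts": [facts_by_id[fid]["text"] for fid in fact_ids],
            })
        outline.append({"claim": claim["title"], "issues": sections})
    return outline
```

The same traversal, with different rendering, yields a SUMF, a hearing memorandum, or a case theory brief.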
The API is the tool layer. Every FastAPI endpoint is simultaneously a UI operation and an agent tool. There is no separate "agent API" or "tool registry" apart from the API itself. The OpenAPI spec is the tool registry; agents discover available operations by reading it. See API Design: Tool-Layer-First for the design principles.
The Complete Tool Inventory maps every user action to a specific API endpoint across 22 domains: cases, evidence, facts, claims & issues, entities, relationships, reports, exports, messages, chat, notifications, monitoring, designations, transcripts, saved filters, timeline, ingestion, jobs, audit, agent management, usage & billing, and tool discovery.
Agents authenticate via scoped API keys (not human auth flows). Attorneys generate keys through the dashboard, each scoped to specific cases, operation permissions, and rate limits. Agent sessions are created via POST /agent/sessions with a configurable TTL. See agent-native.md: Agent Authentication & Sessions for the full specification.
All agent actions are logged in a dedicated audit table with attorney attribution, tool invocation details, and tamper-evident hashing of inputs and outputs. See agent-native.md: Agent Audit Trail for the full schema.
The agent_audit_log table is scoped by firm_id and case_id like all other tables. It records the agent identity, the directing attorney (agent_owner_id), the tool invoked, the entity acted upon, and hashes of both input and output. Entries appear alongside human actions in the case activity feed.
Agent Infrastructure Schema: The data model for API keys (agent_api_keys), sessions (agent_sessions), events (events), webhook delivery tracking (webhook_deliveries), and usage operations (usage_operations) is defined in agent-native.md: Agent Infrastructure Schema. The usage_operations table tracks per-call LLM costs (operation_type, operation_price, llm_cost, case_id, actor) for the M4 enrichment pipeline and all subsequent LLM operations.
- Create a session: POST /agent/sessions with target case(s) and requested permissions
- Get the case briefing: GET /agent/sessions/{id}/briefing
- Poll for changes: GET /agent/sessions/{id}/changes?since={timestamp}
- End the session: DELETE /agent/sessions/{id}, or API key revocation

See agent-native.md: Observability for context refresh capabilities.
Per-API-key rate limits (configurable by attorney, default 100 req/min), per-case session quotas (default 5 concurrent), and attorney-configurable AI cost controls: per-case caps, per-case alerts, firm-wide caps. A circuit breaker auto-suspends agent sessions after 50 consecutive errors and notifies the attorney. See agent-native.md: Rate Limiting, Quotas & Cost Management and business-model.md: AI Cost Management for the full specification.
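The consecutive-error circuit breaker described above might look like this in outline; the class name and reset-on-success semantics are assumptions, not the platform's actual code:

```python
# Sketch of a consecutive-error circuit breaker for agent sessions.
# Threshold of 50 comes from the text; everything else is illustrative.

class AgentCircuitBreaker:
    """Suspends an agent session after N consecutive tool-call errors."""

    def __init__(self, threshold: int = 50):
        self.threshold = threshold
        self.consecutive_errors = 0
        self.suspended = False

    def record_success(self) -> None:
        # Any successful call resets the error streak.
        self.consecutive_errors = 0

    def record_error(self) -> bool:
        """Record a failed call; return True if this error tripped the breaker."""
        if self.suspended:
            return False
        self.consecutive_errors += 1
        if self.consecutive_errors >= self.threshold:
            self.suspended = True  # the platform would also notify the attorney
            return True
        return False
```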
For capture methods that feed this pipeline, see Evidence Capture.
| Step | Method | LLM? | Cost | Milestone |
|---|---|---|---|---|
| 1. Extract text from HTML/email/PDF/image | Parsing (PyMuPDF/fitz for PDFs), Anthropic Vision API for images | No* | Free* | M2 |
| 2. Store extracted text with positional metadata | Application logic (character offsets, page/line for PDFs) | No | Free | M2 |
| 3. Extract entities (person, organization, location) | Haiku LLM (configurable) | Yes (cheap) | ~$0.001 | M4 |
| 4. Extract facts with passage-level citations | Haiku LLM (configurable) | Yes (cheap) | ~$0.001 | M4 |
| 5. Derive relationships from extracted facts | Haiku LLM (configurable) | Yes (cheap) | ~$0.001 | M4 |
| 6. Generate summary + one-liner | Haiku LLM (configurable) | Yes (cheap) | ~$0.001 | M4 |
| 7. SHA-256 hash all artifacts | Compute | No | Free | M2 |
| 8. Record in tamper-evident manifest | Application logic | No | Free | M2 |
| 9. Store artifacts in S3 | S3 | No | ~$0.023/GB/mo | M2 |
* Image OCR uses Claude Haiku via the Anthropic Vision API, which is an LLM call (~$0.001/image). PDF text extraction via PyMuPDF is free.
Note: Classification is not currently a separate pipeline step. The classifications JSONB column exists on the evidence model but is not populated by the enrichment pipeline.
M2 implements steps 1-2 and 7-9 — text extraction, hashing, manifest recording, and S3 storage. No LLM calls for PDFs, so per-artifact ingestion cost is effectively free (S3 storage only). M4 adds steps 3-6 — LLM-based entity extraction, fact extraction, relationship derivation, and summarization. All enrichment steps use Haiku by default (~$0.001/item each). Model is configurable per step via environment variables.
The ingestion pipeline is implemented as a single Celery task (run_pipeline) that executes steps sequentially within one task invocation. Each step is tracked in the processing_steps JSONB column and is independently retriable.
# M2: Single-task pipeline (steps 1-2, 7-9)
@celery_app.task(name="app.tasks.ingestion", bind=True, max_retries=3)
def run_pipeline(self, evidence_id, firm_id, force=False):
# 1. Download artifact from S3 to temp file
# 2. Extract text (dispatch to extractor by content_type)
# 3. Compute SHA-256 hash (chunked reads)
# 4. Create manifest entry (hash chain, per-case lock)
    # 5. Set processing_status = 'completed'
    ...

M4 enrichment: After the ingestion pipeline completes (text extraction, hashing, manifest), enrichment is spawned as a separate async task via run_enrichment.apply_async(). Within the enrichment task, steps run sequentially (not in parallel): extract_entities → extract_facts → derive_relationships → summarize. This sequential ordering is intentional — fact extraction can reference extracted entities, and relationship derivation uses extracted facts. Each step's status is tracked independently in processing_steps JSONB. The evidence item's processing_status transitions through enriching during enrichment and lands at completed or partially_enriched depending on step outcomes.
Error handling per step: Each step catches its own exceptions and records them in processing_steps JSONB (status, timestamp, error message). If a step fails, the evidence item is marked processing_status = 'failed' with the error in processing_error. The Celery task retries up to 3 times with 60-second delay for transient errors.
Re-entrant behavior: Every step checks whether its output already exists before running. Re-running the pipeline on an already-completed item is a no-op unless the force=True parameter is passed. This makes retries safe.
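The re-entrant check can be sketched as a small wrapper around each step; this assumes processing_steps is the JSONB dict described above, and the function names are illustrative:

```python
# Sketch of the re-entrant step pattern: a step that already completed is
# skipped unless force=True, which is what makes retries safe.

def run_step(name: str, processing_steps: dict, do_work, force: bool = False):
    """Run a pipeline step unless its output already exists (idempotent retry)."""
    step = processing_steps.get(name, {})
    if step.get("status") == "completed" and not force:
        return processing_steps  # already done; re-run is a no-op
    try:
        do_work()
        processing_steps[name] = {"status": "completed"}
    except Exception as exc:
        # Each step records its own failure; the caller decides on retries.
        processing_steps[name] = {"status": "failed", "error": str(exc)}
        raise
    return processing_steps
```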
All file transfers (upload and download) use presigned S3 URLs. Files never pass through the API; agents and the UI upload directly to S3 and download directly from S3.
Upload: Caller requests a presigned URL via POST /cases/{id}/evidence/upload, PUTs the file to S3, then confirms the upload via POST /evidence/uploads/{id}/confirm to trigger the ingestion pipeline. Batch uploads follow the same pattern with parallel presigned URLs.
Download: Caller requests a presigned URL via GET /evidence/{id}/download or GET /exports/{id}/download, then GETs the file directly from S3.
See agent-native.md: File Handling Protocol for the full specification.
Every evidence item is recorded in a JSONL manifest file at ingestion time. The manifest provides an independent, append-only chain of custody record that can be verified without access to the database.
Manifest format (one JSON object per line):
{"seq":1,"evidence_id":"uuid","sha256":"abc123...","filename":"email_oct14.eml","s3_key":"/firm/case/evidence/uuid/email_oct14.eml","captured_at":"2026-01-15T10:30:00Z","capture_method":"email_forward","size_bytes":14208,"prev_hash":"0000...","entry_hash":"def456..."}
{"seq":2,"evidence_id":"uuid","sha256":"789abc...","filename":"screenshot_fb.png","s3_key":"/firm/case/evidence/uuid/screenshot_fb.png","captured_at":"2026-01-15T10:31:00Z","capture_method":"browser","size_bytes":284912,"prev_hash":"def456...","entry_hash":"ghi789..."}

Hash chain: Each entry's entry_hash is SHA-256(prev_hash + evidence_id + sha256 + captured_at). The first entry uses prev_hash: "0" * 64. Any modification to a prior entry breaks the chain from that point forward; tampering is immediately detectable.
Storage: One manifest file per case, stored in S3 at /{firm_id}/{case_id}/manifest.jsonl. The manifest is also mirrored in the manifest_entries database table for queryability, but the S3 file is the authoritative record for legal/audit purposes.
Write ordering and partial failure: The DB manifest_entries row is written first (within the ingestion transaction), then the JSONL line is appended to S3. If the S3 append fails after the DB commit, a reconciliation task is enqueued to retry the S3 write. The DB is the operational authority (queries, UI, API responses) while the S3 JSONL file is the legal/audit authority (court-presentable, independently verifiable). A reconciliation job can rebuild the S3 manifest from the DB mirror at any time, so transient S3 failures do not block ingestion.
Independent verification: Given a manifest file and access to the S3 artifacts, any party can verify the chain:
1. Recompute each artifact's sha256 from the S3 object and compare it to the manifest entry.
2. Recompute each entry_hash from prev_hash + evidence_id + sha256 + captured_at.
3. Confirm each entry's prev_hash matches the prior entry's entry_hash.

This verification can be performed by opposing counsel, a court-appointed expert, or an independent auditor without requiring access to the Intactus platform.
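A minimal sketch of the chain portion of this verification, assuming the entry_hash formula stated above (it checks chain integrity only, not the artifact hashes against S3; function names are illustrative):

```python
import hashlib
import json

# Sketch: verify a JSONL manifest's hash chain, per the formula
# entry_hash = SHA-256(prev_hash + evidence_id + sha256 + captured_at),
# with the first entry using prev_hash = "0" * 64.

def entry_hash(prev_hash: str, evidence_id: str, sha256: str, captured_at: str) -> str:
    payload = (prev_hash + evidence_id + sha256 + captured_at).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()

def verify_manifest(jsonl_text: str) -> bool:
    """Return True if the manifest's hash chain is unbroken."""
    prev = "0" * 64  # genesis prev_hash
    for line in jsonl_text.strip().splitlines():
        entry = json.loads(line)
        if entry["prev_hash"] != prev:
            return False  # chain link does not match prior entry
        expected = entry_hash(prev, entry["evidence_id"],
                              entry["sha256"], entry["captured_at"])
        if entry["entry_hash"] != expected:
            return False  # entry contents were altered
        prev = entry["entry_hash"]
    return True
```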
For user-facing search capabilities, see Search & Discovery.
PostgreSQL handles 80% of queries instantly:
| Attorney Action | Implementation | Cost |
|---|---|---|
| Text search ("custody") | PostgreSQL full-text search (GIN index) | Free |
| Filter by classification ("threatening") | Pre-computed tag filter (not yet populated — see note) | Free |
| Find mentions of a person ("Sarah") | Entity index lookup | Free |
| Browse timeline | Sort by pre-extracted dates, filter by type | Free |
| Read summary of each result | Pre-computed at ingestion | Free |
| Filter by source type, date range | Indexed metadata filters | Free |
Note: Classification filtering depends on the classifications JSONB column being populated during enrichment. The column exists but no classification step is currently in the enrichment pipeline, so this filter will not return results until classification is implemented.
The planned search schema below describes the target design. Currently, the evidence list endpoint uses query parameters (not this POST body format) for filtering and pagination.
{
"query": "custody pickup late",
"filters": {
"source_types": ["email", "sms"],
"date_range": { "start": "2025-06-01", "end": "2025-12-31" },
"classifications": ["custody_relevant"],
"entity_ids": ["uuid-of-sarah"],
"capture_methods": ["upload", "email_forward"],
"statuses": ["completed"],
"client_visible": true
},
"sort": { "field": "content_date", "order": "desc" },
"cursor": null,
"limit": 25
}

Notes:
- query drives PostgreSQL full-text search (GIN index). An empty or null query returns all items matching the filters.
- Filters combine with AND across fields and OR within a multi-value field: source_types: ["email", "sms"] AND classifications: ["custody_relevant"] returns emails or texts that are custody-relevant.
- entity_ids filters to evidence items that mention any of the specified entities (via the entity_evidence junction table).
- capture_methods values: upload, browser, api, email_forward.

For complex queries that require reasoning, such as "find all instances where he contradicted his financial disclosure" or "identify the escalation pattern across these communications," the system uses agentic reasoning over a structured evidence index.
Attorney asks complex question
↓
Build evidence manifest: structured index of all case evidence
(item ID, type, date, source, sender, recipient, entities,
classifications, summary, relationship edges)
↓
LLM (Haiku or Sonnet) reasons over the manifest
- Identifies relevant evidence items
- Explains reasoning: "Items 14, 27, 31, and 45 are relevant because..."
- Requests full text of specific items if needed
↓
If full text needed: retrieve from S3, send to LLM with question
↓
Return results with explicit provenance (item IDs, dates, sources)
This approach is inspired by PageIndex (vectorless, reasoning-based retrieval) and DeepRead (structure-aware document reasoning). The key insight: the evidence manifest (summaries, metadata, entities, relationships, classifications) is a structured index that the LLM navigates through reasoning, not similarity matching. For a typical case (50-500 items), the manifest fits comfortably in context.
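The manifest the LLM navigates can be rendered as compact per-item lines; a minimal sketch, with illustrative field names and an illustrative line format (the real index also carries classifications and relationship edges):

```python
# Sketch: render a structured evidence index into compact lines an LLM can
# reason over in context. For 50-500 items this fits comfortably in a prompt.

def manifest_lines(items: list[dict]) -> str:
    lines = []
    for it in items:
        lines.append(
            f'[{it["id"]}] {it["type"]} | {it["date"]} | '
            f'{it["sender"]} -> {it["recipient"]} | '
            f'entities: {", ".join(it["entities"])} | {it["summary"]}'
        )
    return "\n".join(lines)
```

The LLM reasons over this text, names relevant item IDs with its rationale, and only then requests full text for specific items.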
Agent-native note: Agents use the same tool layer as the UI to compose searches. A Research Agent performing agentic retrieval calls evidence.search, entities.search, relationships.traverse, and facts.search, the same atomic tools available to any consumer. The manifest-reasoning pattern is one composition strategy; agents can compose tools in novel ways (e.g., traversing relationships first, then filtering evidence) without being constrained to a single retrieval pipeline.
Why not vector search? Similarity is not relevance. The structured evidence index already captures entities, dates, classifications, and relationships, and LLM reasoning over it identifies relevance directly, with no vector database, embedding vendor, or chunking pipeline to maintain.
The AI analysis layer is invoked only when the attorney explicitly requests synthesis or analysis. The agentic retrieval layer identifies relevant evidence; the analysis layer reasons deeply over the full text.
| Operation | Model | Estimated Cost |
|---|---|---|
| Summarize selected evidence items | Sonnet | ~$0.30-0.50 |
| Build timeline narrative | Sonnet | ~$0.30-0.50 |
| Compare/contrast documents | Sonnet | ~$0.40-0.60 |
| Detect patterns across corpus | Sonnet | ~$0.50-1.00 |
| Deep legal analysis | Opus | ~$2.00-3.00 |
| Full case report generation | Opus | ~$3.00-5.00 |
Query
│
├── Structured search (SQL-expressible) → PostgreSQL (free)
├── Classification / quick summary → Haiku ($0.01-0.03)
├── Agentic retrieval / manifest reasoning → Haiku or Sonnet ($0.05-0.30)
├── Standard analysis / synthesis → Sonnet ($0.30-0.50)
└── Complex legal reasoning / full report → Opus ($2.00-3.00)
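The routing ladder above can be sketched as a lookup table; the query-type keys and dollar figures are illustrative midpoints taken from the tiers, not platform constants:

```python
# Sketch: route each query type to the cheapest capable tier, per the
# ladder above. Keys and cost estimates are illustrative.

ROUTES = {
    "structured_search": ("postgresql", 0.0),
    "classification": ("haiku", 0.02),
    "agentic_retrieval": ("sonnet", 0.15),
    "synthesis": ("sonnet", 0.40),
    "complex_reasoning": ("opus", 2.50),
}

def route(query_type: str) -> tuple[str, float]:
    """Return (engine_or_model, rough cost estimate in USD)."""
    try:
        return ROUTES[query_type]
    except KeyError:
        raise ValueError(f"unknown query type: {query_type}")
```

The point of the table is that most traffic hits the free PostgreSQL tier; frontier models are reserved for attorney-directed work.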
| Component | Technology | Rationale |
|---|---|---|
| Queue broker | Redis | Simple, proven, low-latency |
| Task framework | Celery (Python) | Mature, integrates with FastAPI, supports retries/scheduling |
| Scheduling | Celery Beat (deferred) | Periodic tasks — not yet configured, will be added when monitoring/polling features ship |
| Job | Trigger | Priority | Timeout | Milestone |
|---|---|---|---|---|
| Evidence upload processing (ingestion pipeline) | Upload confirmed | High | 5 min | M2 |
| Stale upload cleanup | Periodic (manual for now) | Low | 1 min | M2 |
| Web capture | Attorney/client submits URL | High | 2 min | Future |
| Email ingestion (OAuth) | Attorney initiates sync | Medium | 10 min | Future |
| Inbound email processing | SES receives forwarded email | High | 5 min | Future |
| Enrichment (entities, facts, relationships, summary) | Upload completed or re-enrichment requested | Medium | 2 min | M4 |
| Automated monitoring poll | Celery Beat schedule | Low | 5 min | Future |
| Analysis request | Attorney initiates | Medium | 5 min | Future |
| Report generation | Attorney initiates | Low | 10 min | Future |
| Chat with My Case query | Client asks question | High | 30 sec | Future |
Every job has a status visible to the appropriate user:
QUEUED → PROCESSING → COMPLETED
→ FAILED (with error detail)
→ CANCELLING → CANCELLED
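The transitions above could be enforced with a small guard in the jobs service; a sketch, where the transition table mirrors the diagram and the names are illustrative:

```python
# Sketch: enforce the job status diagram as an explicit transition table.
# Terminal states (COMPLETED, FAILED, CANCELLED) allow no further moves.

VALID_TRANSITIONS = {
    "QUEUED": {"PROCESSING", "CANCELLING"},
    "PROCESSING": {"COMPLETED", "FAILED", "CANCELLING"},
    "CANCELLING": {"CANCELLED"},
    "COMPLETED": set(),
    "FAILED": set(),
    "CANCELLED": set(),
}

def transition(current: str, new: str) -> str:
    """Return the new status, or raise if the move is not in the diagram."""
    if new not in VALID_TRANSITIONS[current]:
        raise ValueError(f"invalid transition {current} -> {new}")
    return new
```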
Agent-initiated jobs: Agents can enqueue background jobs through the API (e.g., an Intake Agent triggering ingestion for a batch of uploads, or a Drafting Agent requesting report generation). Agents poll for job completion or receive webhook notifications. Every agent-initiated job is attributed to the agent and its owner attorney in the audit trail.
Any API operation that takes >5 seconds returns a job_id instead of blocking. This applies to both human and agent callers.
Caller invokes long-running operation (e.g., POST /cases/{id}/reports)
↓
API returns immediately: { "job_id": "uuid", "status": "queued", "poll_url": "/jobs/{id}" }
↓
Caller polls GET /jobs/{id} for status OR receives webhook on completion
↓
On completion: GET /jobs/{id}/result returns output or presigned download URL
Job management endpoints:
| Endpoint | Description |
|---|---|
GET /jobs/{id} |
Get job status, progress, and metadata |
GET /jobs |
List jobs (filterable by case, type, status) |
POST /jobs/{id}/cancel |
Cancel an in-flight job |
POST /jobs/{id}/retry |
Retry a failed job |
GET /jobs/{id}/result |
Get job output or presigned download URL |
Failed jobs include: error type, error message, retry guidance, and partial results (if any). See agent-native.md: Async Operations Pattern for the full specification.
The platform emits events on state changes. Both the UI and agents can subscribe to events for real-time reactivity.
Events are emitted for significant state changes. The canonical event type list is maintained in agent-native.md: Event Types.
Implemented event types: 21 event types across 6 domains — evidence.*, entity.*, relationship.*, fact.*, job.*, case.*. These cover CRUD operations on all core knowledge graph entities, job lifecycle events, and case-level events. See backend/app/enums.py for the canonical list.
Future event types: message.received, monitoring.delta_detected, report.ready, agent.session_completed, agent.cap_warning, agent.cap_reached.
Polling (M2): GET /events?since={timestamp}&types={event_types} returns events since the given timestamp. Long-polling option (?wait=30) holds the connection up to 30 seconds to reduce chattiness.
Webhooks (not yet implemented): Agents will register a webhook URL during session creation. The platform will POST event payloads on state changes with exponential backoff and retries.
Every event includes: event_id, event_type, case_id, entity_type, entity_id, actor_type (human/agent/system), actor_id, timestamp, and an event-specific data payload.
See agent-native.md: Event System for the full specification.
Implementation Status: Not yet implemented. No notifications table, WebSocket push, or push notification infrastructure exists. This section describes the planned notification architecture.
Events drive user-facing notifications. Not every event produces a notification; only events that require user attention do.
Planned delivery channels:
| Channel | Technology | Use Case |
|---|---|---|
| In-app | WebSocket push to dashboard (planned) | Real-time: new evidence processed, messages received |
| Email | Resend transactional email (planned) | Digests, job failures, approval requests, monitoring alerts |
| Push | Web Push API (planned) | Mobile-priority: client messages, urgent monitoring alerts |
Planned preference configuration: Each user will configure notification preferences per event type and channel, with defaults set per role.
Every design decision is informed by United States v. Heppner (SDNY, Feb. 10, 2026):
| Heppner Failure | Intactus Design |
|---|---|
| Client used consumer AI independently | Attorney directs all AI analysis (work product) |
| Consumer privacy policy allows data disclosure | Anthropic commercial API terms (no training on customer data) |
| No expectation of confidentiality | Contractual confidentiality with Anthropic |
| AI is not an attorney | AI operates as a tool under attorney direction (Kovel doctrine) |
| Documents didn't reflect counsel's strategy | Analysis initiated by attorney reflects their litigation strategy |
| No attorney oversight of AI actions | Every agent action logged with agent_owner_id, attorney directs and reviews all agent work product |
Client uploads evidence (or system captures it)
↓
Stored in encrypted S3 (scoped to firm/case, in platform's VPC)
↓
Ingestion pipeline processes locally (text extraction) then LLM enrichment
↓
Entity extraction, fact extraction, relationship derivation, and summarization via Haiku (default, configurable per step)
↓
Attorney reviews evidence in dashboard
↓
Attorney initiates analysis → agentic retrieval identifies relevant items
↓
Relevant evidence sent to Sonnet/Opus via Anthropic API (direct). Bedrock migration planned before production.
↓
Response returned to platform
↓
Analysis stored as attorney work product
| Requirement | Solution | Status |
|---|---|---|
| Data encryption at rest | S3 SSE-KMS, RDS encryption | Planned |
| Data encryption in transit | TLS 1.3 everywhere | Implemented (Caddy) |
| No data on public internet | Bedrock via PrivateLink (planned) | Planned — currently using Anthropic API (direct) |
| Zero data retention by LLM provider | Anthropic commercial API terms; Bedrock ZDR planned | Partial — commercial terms apply; ZDR planned |
| No model training on customer data | Anthropic commercial API terms | Implemented (contractual) |
| SOC 2 Type II compliance | Vanta/Drata for continuous monitoring | Planned |
| Audit logging | Application-level audit log; CloudTrail planned | Partial — app audit log implemented |
| Multi-tenant data isolation | Firm/case scoping on every query + row-level security | Implemented |
Implementation Status: Not yet implemented. Presidio is not installed and no PII redaction infrastructure exists. This section describes the planned redaction architecture.
Even within the private VPC, an optional redaction layer using Microsoft Presidio:
This provides a second line of defense beyond the contractual ZDR protections.
Agents operate under the same privilege framework as human users, with additional structural guarantees:
Every agent action is attributed to a directing attorney (agent_owner_id). The attorney initiates, configures, and reviews agent work. This satisfies the Kovel doctrine requirement that the agent operates under attorney direction.

| Layer | Technology | Rationale |
|---|---|---|
| Frontend | Next.js, React 19, TypeScript | Path to React Native for mobile; component reuse across client portal and attorney dashboard |
| Styling | Tailwind CSS v4, shadcn/ui, Radix UI | Consistent design system, accessible components |
| Icons | Lucide React | Clean, consistent iconography |
| Package manager | pnpm | Fast, disk-efficient |
| API codegen | @hey-api/openapi-ts (planned) | Auto-generate TypeScript client from FastAPI OpenAPI schema. Currently using raw fetch() |
| Backend | Python 3.12+, FastAPI, Uvicorn | Async-first, existing expertise from evidence project |
| ORM | SQLAlchemy 2.x (async) + asyncpg | Async database access, mature migration tooling |
| Migrations | Alembic | Schema versioning and migration management |
| Data validation | Pydantic + pydantic-settings | Settings management, request/response schemas |
| Dependency management | uv | Fast, reproducible Python environments |
| Database | PostgreSQL 16 | Relational data, full-text search, JSONB, future pgvector/AGE option |
| Object storage | AWS S3 | Artifact storage, encryption, durability |
| LLM | Claude (Haiku/Sonnet) via Anthropic API | Single vendor for all AI. Bedrock migration planned before production |
| Web capture | Playwright (Python) (not yet implemented) | Full-page screenshots, video interception |
| OCR | Anthropic Vision API (Claude Haiku) | High-quality image OCR via LLM vision |
| NER/Entity extraction | Haiku/Sonnet LLM | Single-pass extraction during ingestion, configurable per step |
| Transcription | Whisper (not yet implemented) | Cost-effective audio/video transcription |
| Task queue | Celery + Redis | Async job processing, scheduling, retries |
| Auth | FastAPI-owned (magic links via Resend) | Unified human + agent auth, PostgreSQL-backed sessions |
| Email (transactional) | Resend | Magic links, notifications |
| Email (inbound) | AWS SES (receiving) (not yet implemented) | Case intake email addresses |
| Reverse proxy | Caddy | Auto HTTPS (Let's Encrypt), routing, security headers |
| Deployment | Docker Compose | Single VPS initially, containers ready for scaling |
| Linting | Ruff (Python), ESLint (TypeScript) | Consistent code quality |
| Testing | pytest, pytest-asyncio, httpx | Backend test suite |
| Dev workflow | Makefile | Unified commands for dev, test, lint, migrate, deploy |
| Payments | Stripe (not yet implemented) | Subscription billing, usage metering, invoicing |
| Compliance | Vanta or Drata (not yet implemented) | SOC 2 continuous monitoring |
Implementation Status: Not yet implemented. This section describes the planned billing architecture. No Stripe integration exists.
Billing is handled entirely through Stripe. The platform tracks usage internally and reports it to Stripe for metering and invoicing.
| Component | Purpose |
|---|---|
| Stripe Products & Prices | Define subscription tiers (Practitioner, Firm, Enterprise) |
| Stripe Subscriptions | Manage recurring billing per firm |
| Stripe Usage Records | Report AI analysis token consumption for metered billing |
| Stripe Customer Portal | Self-service plan changes, payment method updates, invoice history |
| Stripe Webhooks | Sync subscription state changes back to the platform |
Billing period: Monthly, aligned to firm signup date. AI usage (metered component) is reported to Stripe daily and invoiced at period end. Base subscription fees are charged at period start.
Payment failure handling: Stripe Smart Retries handle failed payments automatically. After 3 failed attempts over 14 days, the subscription moves to past_due. The firm admin receives email notifications at each retry. After 28 days past due, the account is downgraded to read-only mode (no new evidence ingestion, no AI analysis) until payment is resolved. No data is deleted; the vault remains accessible in read-only mode indefinitely.
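The dunning timeline above can be sketched as a pure function of days since the first failed payment; the 14-day retry window and 28-day read-only threshold come from the text, while the function and state names are illustrative:

```python
# Sketch of the payment-failure timeline: Stripe retries for ~14 days,
# then past_due; 28 days past due, the account goes read-only.
# No data is ever deleted; the vault stays readable indefinitely.

RETRY_WINDOW_DAYS = 14
READ_ONLY_AFTER_DAYS = 28  # measured from entering past_due

def account_state(days_since_first_failure: int) -> str:
    if days_since_first_failure < RETRY_WINDOW_DAYS:
        return "retrying"       # Stripe Smart Retries in progress
    if days_since_first_failure < RETRY_WINDOW_DAYS + READ_ONLY_AFTER_DAYS:
        return "past_due"       # firm admin notified at each retry
    return "read_only"          # no new ingestion or AI until payment resolves
```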
See business-model.md for pricing tiers, AI cost management, and the usage tracking model.
| Component | Technology | Rationale |
|---|---|---|
| Queue broker | Redis | Simple, proven, low-latency |
| Task framework | Celery (Python) | Mature, integrates with FastAPI, supports retries/scheduling |
| Scheduling | Celery Beat (deferred) | Periodic tasks — not yet configured, will be added when monitoring/polling features ship |
| Job | Trigger | Priority | Timeout | Milestone |
|---|---|---|---|---|
| Evidence upload processing (ingestion pipeline) | Upload confirmed | High | 5 min | M2 |
| Stale upload cleanup | Periodic (manual for now) | Low | 1 min | M2 |
| Web capture | Attorney/client submits URL | High | 2 min | Future |
| Email ingestion (OAuth) | Attorney initiates sync | Medium | 10 min | Future |
| Inbound email processing | SES receives forwarded email | High | 5 min | Future |
| Enrichment (entities, facts, relationships, summary) | Upload completed or re-enrichment requested | Medium | 2 min | M4 |
| Automated monitoring poll | Celery Beat schedule | Low | 5 min | Future |
| Analysis request | Attorney initiates | Medium | 5 min | Future |
| Report generation | Attorney initiates | Low | 10 min | Future |
| Chat with My Case query | Client asks question | High | 30 sec | Future |
Every job has a status visible to the appropriate user:
QUEUED → PROCESSING → COMPLETED
→ FAILED (with error detail)
→ CANCELLING → CANCELLED
Agent-initiated jobs: Agents can enqueue background jobs through the API (e.g., an Intake Agent triggering ingestion for a batch of uploads, or a Drafting Agent requesting report generation). Agents poll for job completion or receive webhook notifications. Every agent-initiated job is attributed to the agent and its owner attorney in the audit trail.
Any API operation that takes >5 seconds returns a job_id instead of blocking. This applies to both human and agent callers.
Caller invokes long-running operation (e.g., POST /cases/{id}/reports)
↓
API returns immediately: { "job_id": "uuid", "status": "queued", "poll_url": "/jobs/{id}" }
↓
Caller polls GET /jobs/{id} for status OR receives webhook on completion
↓
On completion: GET /jobs/{id}/result returns output or presigned download URL
Job management endpoints:
| Endpoint | Description |
|---|---|
GET /jobs/{id} |
Get job status, progress, and metadata |
GET /jobs |
List jobs (filterable by case, type, status) |
POST /jobs/{id}/cancel |
Cancel an in-flight job |
POST /jobs/{id}/retry |
Retry a failed job |
GET /jobs/{id}/result |
Get job output or presigned download URL |
Failed jobs include: error type, error message, retry guidance, and partial results (if any). See agent-native.md: Async Operations Pattern for the full specification.
The platform emits events on state changes. Both the UI and agents can subscribe to events for real-time reactivity.
Events are emitted for significant state changes. The canonical event type list is maintained in agent-native.md: Event Types.
Implemented event types: 21 event types across 6 domains — evidence.*, entity.*, relationship.*, fact.*, job.*, case.*. These cover CRUD operations on all core knowledge graph entities, job lifecycle events, and case-level events. See backend/app/enums.py for the canonical list.
Future event types: message.received, monitoring.delta_detected, report.ready, agent.session_completed, agent.cap_warning, agent.cap_reached.
Polling (M2): GET /events?since={timestamp}&types={event_types} returns events since the given timestamp. Long-polling option (?wait=30) holds the connection up to 30 seconds to reduce chattiness.
Webhooks (not yet implemented): Agents will register a webhook URL during session creation. The platform will POST event payloads on state changes with exponential backoff and retries.
Every event includes: event_id, event_type, case_id, entity_type, entity_id, actor_type (human/agent/system), actor_id, timestamp, and an event-specific data payload.
See agent-native.md: Event System for the full specification.
Implementation Status: Not yet implemented. No notifications table, WebSocket push, or push notification infrastructure exists. This section describes the planned notification architecture.
Events drive user-facing notifications. Not every event produces a notification; only events that require user attention do.
Planned delivery channels:
| Channel | Technology | Use Case |
|---|---|---|
| In-app | WebSocket push to dashboard (planned) | Real-time: new evidence processed, messages received |
| Resend transactional email (planned) | Digests, job failures, approval requests, monitoring alerts | |
| Push | Web Push API (planned) | Mobile-priority: client messages, urgent monitoring alerts |
Planned preference configuration: Each user will configure notification preferences per event type and channel, with defaults set per role.
Every design decision is informed by United States v. Heppner (SDNY, Feb. 10, 2026):
| Heppner Failure | Intactus Design |
|---|---|
| Client used consumer AI independently | Attorney directs all AI analysis (work product) |
| Consumer privacy policy allows data disclosure | Anthropic commercial API terms (no training on customer data) |
| No expectation of confidentiality | Contractual confidentiality with Anthropic |
| AI is not an attorney | AI operates as a tool under attorney direction (Kovel doctrine) |
| Documents didn't reflect counsel's strategy | Analysis initiated by attorney reflects their litigation strategy |
| No attorney oversight of AI actions | Every agent action logged with agent_owner_id, attorney directs and reviews all agent work product |
Client uploads evidence (or system captures it)
↓
Stored in encrypted S3 (scoped to firm/case, in platform's VPC)
↓
Ingestion pipeline processes locally (text extraction) then LLM enrichment
↓
Entity/relationship extraction, classification, summary via Haiku; fact extraction via Sonnet
↓
Attorney reviews evidence in dashboard
↓
Attorney initiates analysis → agentic retrieval identifies relevant items
↓
Relevant evidence sent to Sonnet/Opus via Anthropic API (direct). Bedrock migration planned before production.
↓
Response returned to platform
↓
Analysis stored as attorney work product
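The capture step of this flow pairs with the tamper-evidence guarantee: each artifact is SHA-256 hashed and appended to a JSONL manifest at ingestion. A minimal sketch of that step follows; the manifest field names are illustrative, not the platform's actual schema.

```python
import hashlib
import json
from datetime import datetime, timezone

def ingest_artifact(data: bytes, firm_id: str, case_id: str) -> dict:
    """Hash an artifact at the moment of ingestion and build its manifest entry.

    Hashing before any processing means later tampering is detectable by
    re-hashing and comparing against the manifest.
    """
    return {
        "firm_id": firm_id,          # tenant scoping from day one
        "case_id": case_id,
        "sha256": hashlib.sha256(data).hexdigest(),
        "size_bytes": len(data),
        "ingested_at": datetime.now(timezone.utc).isoformat(),
    }

def append_to_manifest(path: str, entry: dict) -> None:
    """Manifest is append-only JSONL: one JSON object per line."""
    with open(path, "a", encoding="utf-8") as f:
        f.write(json.dumps(entry, sort_keys=True) + "\n")
```

Because the manifest is append-only and each line carries its own digest, the chain of custody can be verified independently of the application.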
| Requirement | Solution | Status |
|---|---|---|
| Data encryption at rest | S3 SSE-KMS, RDS encryption | Planned |
| Data encryption in transit | TLS 1.3 everywhere | Implemented (Caddy) |
| No data on public internet | Bedrock via PrivateLink (planned) | Planned — currently using Anthropic API (direct) |
| Zero data retention by LLM provider | Anthropic commercial API terms; Bedrock ZDR planned | Partial — commercial terms apply; ZDR planned |
| No model training on customer data | Anthropic commercial API terms | Implemented (contractual) |
| SOC 2 Type II compliance | Vanta/Drata for continuous monitoring | Planned |
| Audit logging | Application-level audit log; CloudTrail planned | Partial — app audit log implemented |
| Multi-tenant data isolation | Firm/case scoping on every query + row-level security | Implemented |
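The multi-tenant isolation row can be illustrated with a small sketch: every read path takes firm and case identifiers and filters on them, so there is no unscoped query. This uses SQLite with invented table and column names purely for demonstration; the platform uses PostgreSQL with row-level security on top of the same scoping.

```python
import sqlite3

# Illustrative schema: every row carries its tenant columns.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE evidence (id INTEGER, firm_id TEXT, case_id TEXT, title TEXT)"
)
conn.executemany(
    "INSERT INTO evidence VALUES (?, ?, ?, ?)",
    [
        (1, "firm-a", "case-1", "Email thread"),
        (2, "firm-a", "case-2", "Screenshot"),
        (3, "firm-b", "case-9", "Contract"),
    ],
)

def list_evidence(firm_id: str, case_id: str) -> list:
    """All reads are scoped to the caller's firm and case.

    The scoping predicate is mandatory, not optional: callers cannot
    construct a query that omits it.
    """
    return conn.execute(
        "SELECT id, title FROM evidence WHERE firm_id = ? AND case_id = ?",
        (firm_id, case_id),
    ).fetchall()
```

A query scoped to one firm can never see another firm's rows, even with a valid case identifier from the other tenant.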
Implementation Status: Not yet implemented. Presidio is not installed and no PII redaction infrastructure exists. This section describes the planned redaction architecture.
Even within the private VPC, an optional redaction layer using Microsoft Presidio would strip PII from evidence text before it reaches the LLM. This provides a second line of defense beyond the contractual ZDR protections.
Agents operate under the same privilege framework as human users, with additional structural guarantees:
Every agent action is logged and attributed to a supervising attorney (agent_owner_id). The attorney initiates, configures, and reviews agent work. This satisfies the Kovel doctrine requirement that the agent operates under attorney direction.
| Layer | Technology | Rationale |
|---|---|---|
| Frontend | Next.js, React 19, TypeScript | Path to React Native for mobile; component reuse across client portal and attorney dashboard |
| Styling | Tailwind CSS v4, shadcn/ui, Radix UI | Consistent design system, accessible components |
| Icons | Lucide React | Clean, consistent iconography |
| Package manager | pnpm | Fast, disk-efficient |
| API codegen | @hey-api/openapi-ts (planned) | Auto-generate TypeScript client from FastAPI OpenAPI schema. Currently using raw fetch() |
| Backend | Python 3.12+, FastAPI, Uvicorn | Async-first, existing expertise from evidence project |
| ORM | SQLAlchemy 2.x (async) + asyncpg | Async database access, mature migration tooling |
| Migrations | Alembic | Schema versioning and migration management |
| Data validation | Pydantic + pydantic-settings | Settings management, request/response schemas |
| Dependency management | uv | Fast, reproducible Python environments |
| Database | PostgreSQL 16 | Relational data, full-text search, JSONB, future pgvector/AGE option |
| Object storage | AWS S3 | Artifact storage, encryption, durability |
| LLM | Claude (Haiku/Sonnet) via Anthropic API | Single vendor for all AI. Bedrock migration planned before production |
| Web capture | Playwright (Python) (not yet implemented) | Full-page screenshots, video interception |
| OCR | Anthropic Vision API (Claude Haiku) | High-quality image OCR via LLM vision |
| NER/Entity extraction | Haiku/Sonnet LLM | Single-pass extraction during ingestion, configurable per step |
| Transcription | Whisper (not yet implemented) | Cost-effective audio/video transcription |
| Task queue | Celery + Redis | Async job processing, scheduling, retries |
| Auth | FastAPI-owned (magic links via Resend) | Unified human + agent auth, PostgreSQL-backed sessions |
| Email (transactional) | Resend | Magic links, notifications |
| Email (inbound) | AWS SES (receiving) (not yet implemented) | Case intake email addresses |
| Reverse proxy | Caddy | Auto HTTPS (Let's Encrypt), routing, security headers |
| Deployment | Docker Compose | Single VPS initially, containers ready for scaling |
| Linting | Ruff (Python), ESLint (TypeScript) | Consistent code quality |
| Testing | pytest, pytest-asyncio, httpx | Backend test suite |
| Dev workflow | Makefile | Unified commands for dev, test, lint, migrate, deploy |
| Payments | Stripe (not yet implemented) | Subscription billing, usage metering, invoicing |
| Compliance | Vanta or Drata (not yet implemented) | SOC 2 continuous monitoring |
Implementation Status: Not yet implemented. This section describes the planned billing architecture. No Stripe integration exists.
Billing is handled entirely through Stripe. The platform tracks usage internally and reports it to Stripe for metering and invoicing.
| Component | Purpose |
|---|---|
| Stripe Products & Prices | Define subscription tiers (Practitioner, Firm, Enterprise) |
| Stripe Subscriptions | Manage recurring billing per firm |
| Stripe Usage Records | Report AI analysis token consumption for metered billing |
| Stripe Customer Portal | Self-service plan changes, payment method updates, invoice history |
| Stripe Webhooks | Sync subscription state changes back to the platform |
Billing period: Monthly, aligned to firm signup date. AI usage (metered component) is reported to Stripe daily and invoiced at period end. Base subscription fees are charged at period start.
Payment failure handling: Stripe Smart Retries handle failed payments automatically. After 3 failed attempts over 14 days, the subscription moves to past_due. The firm admin receives email notifications at each retry. After 28 days past due, the account is downgraded to read-only mode (no new evidence ingestion, no AI analysis) until payment is resolved. No data is deleted; the vault remains accessible in read-only mode indefinitely.
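The dunning timeline above can be sketched as a simple state function. The thresholds mirror the prose (14-day retry window, read-only after 28 days past due); this is an approximation of the intended behavior, since the actual transitions would be driven by Stripe webhook events rather than date arithmetic.

```python
from datetime import date

def account_state(due_date: date, paid: bool, today: date) -> str:
    """Approximate the dunning timeline described above.

    In production the state would be driven by Stripe webhook events;
    this sketch just encodes the documented thresholds.
    """
    if paid:
        return "active"
    days_past_due = (today - due_date).days
    if days_past_due <= 14:
        return "retrying"      # Stripe Smart Retries window
    if days_past_due <= 28:
        return "past_due"      # firm admin notified on each retry
    # No data is deleted: the vault stays readable, but new ingestion
    # and AI analysis are disabled until payment is resolved.
    return "read_only"
```

Note that `read_only` is a floor, not a cliff: the state never progresses to deletion, matching the guarantee that the vault remains accessible indefinitely.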
See business-model.md for pricing tiers, AI cost management, and the usage tracking model.