Falconium Backend Architecture

Client

📱

iPad App

Swift / SwiftUI / MVVM

REST API calls → /api/v1/*
WebSocket → ws/transcribe/
Auth: JWT Bearer tokens

Orchestration

n8n

n8n Workflows

Agent orchestration (MVP)

NOT in backend code

Not Implemented

Edge Orchestrator

On-device Llama 3.5

Future — not for MVP

Anonymisation

PII/sensitive data masking

Designed, not coded

Emotion Engine

13-state tone adaptation

Prompts exist, no backend logic

Django Backend — 6 Apps /api/v1/

User model (email-based auth)

built

JWT login + token refresh 5 endpoints

built

Outlook OAuth token storage

built

Password reset (sends to admins, not user)

issue

Celery task decorators commented out

issue

Conversations & Messages CRUD 8 endpoints

built

RAPTOR hierarchical retrieval (LLMRaptorView)

built

Query routing (Claude → classify)

built

General Q&A chain (Claude)

built

Internet search chain (Perplexity)

built

WebSocket audio transcription

built

QuestionLLM audit logging

built

Old AIAgent (routing) — dead code alongside RAPTOR

redundant

Projects + Tasks + SubTasks 18 endpoints

built

File upload + folder hierarchy

built

RAPTOR indexing pipeline (Celery)

built

FalconiumConfig (prompt management)

built

PromptSetting (RAG tuning)

built

Documents (system/internal) admin-managed

built

Auto-creates project directory (signal)

built

OAuth callback + logout 18 endpoints

built

Inbox / Sent / Detail fetch

built

Send, reply, reply-all, forward

built

Bulk ops (delete, flag, read, importance)

built

No local models — fully stateless

no data

Emails not indexed in Weaviate / not searchable by AI

gap

Meeting CRUD + participants 9 endpoints

built

Meeting messages (speaker-attributed)

built

Meeting tasks + subtasks + attachments

built

Parts & roles per meeting

built

Transcript field exists but not indexed for RAG

gap

No auto-task generation from transcripts

planned

JSON-RPC endpoint 1 endpoint

built

6 MCP tools (docs, search, Q&A, list)

built

MCPToolConfig admin-managed registry

built

Duplicates RAPTOR agents from chats app

duplication

Separate Weaviate classes (RAPTORNode vs External_documents)

split

Query Pipeline — What happens when you ask Falconium a question

Input

iPad sends text

→

Router

Claude classifies query

→

Decision

Route to…

→

📚 Document retrieval → RAPTOR → Weaviate → Claude synthesises

🌐 Internet search → Perplexity sonar-pro → answer

💬 General Q&A → Claude direct → answer

All queries logged to QuestionLLM audit table · Prompts loaded from FalconiumConfig
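
The classify-then-dispatch flow above can be sketched as a small dispatcher. This is an illustration under stated assumptions, not the backend's code: `classify_query`, the three handler functions, and the route labels are hypothetical stand-ins for the Claude classifier, the RAPTOR retrieval path, the Perplexity chain, and the general Q&A chain, and the in-memory `audit_log` list stands in for the QuestionLLM table.

```python
from typing import Callable, Dict, List

def classify_query(text: str) -> str:
    """Hypothetical stand-in for the Claude routing call (keyword heuristic)."""
    lowered = text.lower()
    if "document" in lowered or "file" in lowered:
        return "documents"
    if "latest" in lowered or "news" in lowered:
        return "internet"
    return "general"

def answer_from_documents(text: str) -> str:
    # Placeholder for: RAPTOR retrieval -> Weaviate -> Claude synthesis
    return f"[RAPTOR + Weaviate] {text}"

def answer_from_internet(text: str) -> str:
    # Placeholder for: Perplexity sonar-pro search
    return f"[Perplexity] {text}"

def answer_general(text: str) -> str:
    # Placeholder for: direct Claude Q&A
    return f"[Claude] {text}"

HANDLERS: Dict[str, Callable[[str], str]] = {
    "documents": answer_from_documents,
    "internet": answer_from_internet,
    "general": answer_general,
}

# Stand-in for the QuestionLLM audit table
audit_log: List[dict] = []

def handle_query(text: str) -> str:
    """Classify the query, dispatch to one handler, and record the exchange."""
    route = classify_query(text)
    handler = HANDLERS.get(route, answer_general)  # unknown labels fall back to general Q&A
    answer = handler(text)
    audit_log.append({"question": text, "route": route, "answer": answer})
    return answer
```

Falling back to the general handler for an unrecognised label is a design choice of this sketch; the real router may handle that case differently.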

Document Indexing Pipeline

Upload

File uploaded via API or admin

→

Extract

PDF / DOCX / TXT → text

→

Chunk

512 tokens, 50 overlap

→

RAPTOR

Build hierarchy tree (GMM clusters → summaries)

→

Index

Store in Weaviate

Two parallel pipelines: User files → External_documents class · System docs → RAPTORNode class · Runs as Celery background task
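
The chunking step (512 tokens, 50 overlap) can be illustrated with a sliding window. This is a minimal sketch: the function name `chunk_tokens` is hypothetical, and it works on a pre-tokenised list because the tokenizer used by the pipeline is not stated here.

```python
from typing import List

def chunk_tokens(tokens: List[str], size: int = 512, overlap: int = 50) -> List[List[str]]:
    """Split tokens into windows of `size`, each sharing `overlap` tokens with the previous one."""
    if size <= 0 or not 0 <= overlap < size:
        raise ValueError("require size > 0 and 0 <= overlap < size")
    step = size - overlap  # each window starts this many tokens after the previous one
    chunks: List[List[str]] = []
    for start in range(0, len(tokens), step):
        chunks.append(tokens[start:start + size])
        if start + size >= len(tokens):
            break  # the last window already reaches the end of the input
    return chunks
```

With the default parameters each window starts 462 tokens after the previous one, so a 1,000-token input yields three chunks, the last one shorter than 512 tokens.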

External Services

LLM Providers

Anthropic Claude

Routing, Q&A, RAPTOR synthesis

claude-3-5-haiku-latest

OpenAI

Embeddings + RAPTOR summaries

text-embedding-ada-002 · gpt-4o-mini

Perplexity

Real-time internet search

sonar-pro

Audio & Voice

Groq

Fast Whisper transcription

Replicate

WhisperX + speaker diarisation

ElevenLabs

Voice synthesis

Configured but not in backend code

Integrations

Microsoft Graph

Outlook email (OAuth 2.0)

Storage & Data

AWS S3

File storage (production)

Weaviate

Vector DB for RAG

2 classes: RAPTORNode + External_documents

Infrastructure Layer

🐘

PostgreSQL 14.1

Primary database

⚡

Redis

Cache + message broker

🔄

Celery

Background task queue

🐳

Docker

Containerised (7 services)

🚀

Daphne (ASGI)

HTTP + WebSocket server

⚠ Issues, Gaps & Inconsistencies Found in Codebase

HIGH

Duplicate RAPTOR agents. The chats app has its own RaptorAIAgent (raptor_chain.py), and the mcp_server app has two further copies (mcr_internal_documents.py, mcr_external_documents.py). These are independent implementations of the same logic, so a change to one does not propagate to the others, and their behaviour will drift apart as the system evolves.

HIGH

Emails are an island. The mails app is fully stateless — it stores nothing locally and has zero connection to the AI pipeline. Emails can't be searched by Falconium's AI, can't be referenced in conversations, and aren't indexed in Weaviate. The AI can't help with email-related queries like "summarise my recent emails about Project X."

HIGH

Meeting transcripts are an island. The meetings app stores transcripts as a TextField, but they're never indexed into Weaviate. The AI can't search meeting content, extract action items, or answer questions like "what did we discuss about the deadline?"

MEDIUM

Dead code in chats — old AIAgent. The chats/ai_services/main.py contains the original AIAgent class with its own routing logic, alongside the newer RaptorAIAgent. The old LLMResponseView endpoint is commented out in urls.py. This dead code creates confusion about which pipeline is active.

MEDIUM

FalconiumConfig type choices don't match usage. The model defines types like "frontend_model", "standard_files_retrieval", etc. But the chats AI code calls get_active_prompt() with keys like "routing", "format_retrival_query" — these don't appear in the defined type choices. The lookup relies on a generic query that works despite the mismatch, but it's fragile.
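
One way to make this mismatch visible is a check that compares the declared type choices with the keys the code actually requests. The sketch below is an assumption-laden illustration: `undeclared_keys` is a hypothetical helper, and both sets contain only the names quoted above, not the full lists from the model or the chats code.

```python
from typing import Iterable, List

# Only the names quoted in the audit note; the real lists are longer.
DECLARED_TYPES = {"frontend_model", "standard_files_retrieval"}
USED_KEYS = {"routing", "format_retrival_query"}

def undeclared_keys(declared: Iterable[str], used: Iterable[str]) -> List[str]:
    """Return the keys requested by callers that are missing from the declared choices."""
    return sorted(set(used) - set(declared))

missing = undeclared_keys(DECLARED_TYPES, USED_KEYS)
```

A check like this could run in a unit test or at startup so that a new prompt key cannot be used without also being declared.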

MEDIUM

Two Weaviate schemas for essentially the same purpose. User files go into External_documents (with user_id) and system docs go into RAPTORNode (without user_id). The chats RAPTOR agent only searches RAPTORNode. The mcp_server has separate agents for each. This split means a user asking "search all documents" only sees one set.
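
A unified search could query both classes and merge the results. The sketch below assumes a hypothetical `search_fn(class_name, query, user_id)` wrapper around the vector store and assumes that scores from the two classes are comparable; neither is confirmed by the codebase. `fake_search` exists only to make the example runnable.

```python
from typing import Callable, Dict, List, Optional

Hit = Dict[str, object]
SearchFn = Callable[[str, str, Optional[int]], List[Hit]]

def search_all_documents(query: str, user_id: int, search_fn: SearchFn, limit: int = 5) -> List[Hit]:
    """Query both classes and return the highest-scoring hits across them."""
    hits: List[Hit] = []
    hits += search_fn("RAPTORNode", query, None)             # system docs: no user filter
    hits += search_fn("External_documents", query, user_id)  # user files: filtered by owner
    hits.sort(key=lambda h: h["score"], reverse=True)
    return hits[:limit]

def fake_search(class_name: str, query: str, user_id: Optional[int]) -> List[Hit]:
    """Illustrative stub standing in for a real Weaviate query."""
    data = {
        "RAPTORNode": [{"text": "policy summary", "score": 0.71}],
        "External_documents": [{"text": "uploaded contract", "score": 0.83, "user_id": user_id}],
    }
    return data[class_name]

results = search_all_documents("contract", 7, fake_search)
```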

MEDIUM

n8n orchestration invisible to backend. The CLAUDE.md describes n8n as the MVP orchestration layer, but there's no n8n integration in the Django code. n8n presumably calls the API externally, but the backend has no awareness of multi-agent workflows, handoffs, or orchestration state.

LOW

Password reset emails go to ADMIN_EMAILS, not the user. In accounts/tasks.py, the reset link is sent to settings.ADMIN_EMAILS rather than the requesting user's email. Likely a development shortcut but will need fixing before any client deployment.

LOW

JWT access tokens valid for 70 days. ACCESS_TOKEN_LIFETIME = timedelta(days=70) is extremely long for a security-sensitive financial application. Common practice is a short-lived access token (roughly 15 to 60 minutes), with refresh tokens used to obtain new access tokens.
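
A shorter lifetime might look like the settings fragment below. It assumes the project uses djangorestframework-simplejwt, which the ACCESS_TOKEN_LIFETIME setting name suggests but this audit does not confirm. The specific durations are illustrative, not recommendations derived from the codebase.

```python
from datetime import timedelta

# Assumed to live in the Django settings module; values are illustrative.
SIMPLE_JWT = {
    "ACCESS_TOKEN_LIFETIME": timedelta(minutes=15),  # short-lived access token
    "REFRESH_TOKEN_LIFETIME": timedelta(days=7),     # client refreshes instead of logging in again
    "ROTATE_REFRESH_TOKENS": True,                   # issue a new refresh token on each refresh
}
```

The iPad client would then need to call the token refresh endpoint when an access token expires.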

LOW

CORS allows all origins. CORS_ORIGIN_ALLOW_ALL = True is fine for development but must be restricted before any external deployment.
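
A restricted configuration could read the allowed origins from the environment. This sketch assumes django-cors-headers; the environment variable name and the `parse_allowed_origins` helper are hypothetical. Recent releases of the package read CORS_ALLOWED_ORIGINS, while older ones use CORS_ORIGIN_WHITELIST, so the setting name should match the installed version.

```python
import os
from typing import List

def parse_allowed_origins(raw: str) -> List[str]:
    """Split a comma-separated origin list, dropping blanks and surrounding whitespace."""
    return [origin.strip() for origin in raw.split(",") if origin.strip()]

# Assumed to live in the Django settings module.
CORS_ORIGIN_ALLOW_ALL = False
CORS_ALLOWED_ORIGINS = parse_allowed_origins(os.environ.get("CORS_ALLOWED_ORIGINS", ""))
```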

LOW

ElevenLabs voice synthesis not in backend. Referenced in architecture docs and the iPad app likely calls it directly, but there's no backend endpoint for voice generation. This may be intentional (client-side) but worth noting.

Cross-App Dependencies

accounts.User

→ used by all 5 other apps (FK in conversations, messages, projects, tasks, files, meetings, participants)

services.FalconiumConfig

→ used by chats (loads prompts for routing, Q&A, retrieval, search)

services.PromptSetting

→ used by chats (RAPTOR search params) and mcp_server (RAPTOR search params)

chats.QuestionLLM

→ used by mcp_server (logs tool invocations)

chats.ai_services.*

→ used by mcp_server (GeneralAnthropicChain, PerplexityInternetChain)

services.Project

→ used by meetings (Meeting.project FK)

meetings.pagination

→ used by mails (EmailPageNumberPagination — oddly defined in meetings, used in mails)

mails

→ connects only to accounts (for OAuth tokens) — no other app dependencies
