On-premise LLM platform that searches, summarises and cross-references classified document corpora — delivered in 75 days, fully air-gapped.
At a glance
Built with
Overview
A government agency managing thousands of classified documents needed an AI system to search, summarise, and cross-reference them and extract insight, without any document ever leaving its secure network.
The Challenge
Commercial LLM APIs were off-limits due to data-residency rules. All processing had to occur on-premise with open-weight models, across multi-format documents: PDFs, scanned images, and handwritten forms.
How it fits together
Architecture
The Solution
We deployed a quantised Llama 3 70B in GGUF format on an air-gapped GPU cluster, with a hybrid-search RAG pipeline (BM25 keyword ranking plus vector similarity) over pgvector. OCR preprocessing handled scanned documents, and Whisper ASR transcribed audio. The build took 75 days from brief to production.
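To make the hybrid-search step concrete, here is a minimal sketch of one common way to merge a BM25 ranking with a vector-similarity ranking: reciprocal rank fusion (RRF). This is an illustrative assumption, not a description of the deployed pipeline; the function name, the document IDs, and the constant k=60 are all hypothetical.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of document IDs into one ranking.

    Each document scores 1 / (k + rank) per list it appears in, so
    documents ranked highly by both retrievers rise to the top.
    k=60 is a conventional default, assumed here for illustration.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by ID for determinism.
    return sorted(scores, key=lambda d: (-scores[d], d))


# Hypothetical results from the two retrievers for the same query.
bm25_hits = ["doc-7", "doc-2", "doc-9"]    # keyword (BM25) ranking
vector_hits = ["doc-2", "doc-4", "doc-7"]  # embedding-similarity ranking

fused = reciprocal_rank_fusion([bm25_hits, vector_hits])
print(fused)  # ['doc-2', 'doc-7', 'doc-4', 'doc-9']
```

Rank-based fusion avoids having to normalise BM25 scores against vector distances, which live on different scales; a weighted sum of normalised scores would be another reasonable choice.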
Results
From brief to production
