Artikate Studio
Defence & AI · 2023 · Classified Government Client, India

Project MINERVA

Sovereign LLM Document Intelligence

On-premise LLM platform that searches, summarises and cross-references classified document corpora — delivered in 75 days, fully air-gapped.

At a glance

  • 75 days: brief to production (our fastest delivery)
  • 8 min: 500-page cross-reference (down from 3 days)
  • 94%: extraction accuracy (on benchmark set)
  • 0: document egress (fully air-gapped)

Built with

Llama 3 70B · GGUF · RAG · pgvector · OCR · Whisper ASR · FastAPI

Overview

A government agency managing thousands of classified documents needed an AI system to search, summarise, cross-reference, and extract insight — without any document ever leaving their secure network.

The Challenge

Commercial LLM APIs were off-limits due to data-residency rules. All processing had to occur on-premise with open-weight models, across multi-format documents: PDFs, scanned images, and handwritten forms.

How it fits together

Architecture

Ingest + OCR (PDF, scans, audio) → Index (BM25 + vectors) → Retrieval (hybrid search) → Sovereign LLM (GGUF, on-prem)

The Solution

We deployed Llama 3 70B, quantised to GGUF, on an air-gapped GPU cluster and paired it with a RAG pipeline built on hybrid search (BM25 plus vector similarity, with vectors stored in pgvector). OCR preprocessing handled scans, and Whisper ASR handled audio. The build took 75 days from brief to production.
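To make the hybrid-search idea concrete, here is a minimal sketch of how a BM25 ranking and a vector-similarity ranking could be merged into one result list. It is an illustration under stated assumptions, not the MINERVA implementation: the case study does not say how the two result sets were combined, so reciprocal rank fusion (RRF) is an assumed choice. The function names (`bm25_rank`, `vector_rank`, `rrf_fuse`), the whitespace tokenizer and the in-memory lists standing in for pgvector are all hypothetical.

```python
import math
from collections import Counter


def tokenize(text):
    # Naive whitespace tokenizer; a real pipeline would use a proper analyzer.
    return text.lower().split()


def bm25_rank(query, docs, k1=1.5, b=0.75):
    """Return document indices ordered by BM25 score, best first."""
    tokenized = [tokenize(d) for d in docs]
    n = len(docs)
    avg_len = sum(len(t) for t in tokenized) / n
    # Document frequency: number of documents containing each term.
    df = Counter(term for toks in tokenized for term in set(toks))
    scores = []
    for toks in tokenized:
        tf = Counter(toks)
        score = 0.0
        for term in tokenize(query):
            if term not in tf:
                continue
            idf = math.log(1 + (n - df[term] + 0.5) / (df[term] + 0.5))
            norm = tf[term] + k1 * (1 - b + b * len(toks) / avg_len)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append(score)
    return sorted(range(n), key=lambda i: scores[i], reverse=True)


def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


def vector_rank(query_vec, doc_vecs):
    """Return document indices ordered by cosine similarity, best first.

    Assumption: in production this ordering would come from a pgvector
    query; plain Python lists stand in for it here.
    """
    return sorted(
        range(len(doc_vecs)),
        key=lambda i: cosine(query_vec, doc_vecs[i]),
        reverse=True,
    )


def rrf_fuse(rankings, k=60):
    """Merge several rankings with reciprocal rank fusion (assumed method)."""
    fused = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            fused[doc_id] = fused.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(fused, key=lambda d: fused[d], reverse=True)


if __name__ == "__main__":
    # Toy corpus and hand-made 2-d "embeddings", purely illustrative.
    docs = [
        "border logistics report 2021",
        "personnel transfer memo",
        "logistics audit of border depots",
    ]
    doc_vecs = [[0.9, 0.1], [0.0, 1.0], [0.8, 0.3]]
    keyword_hits = bm25_rank("border logistics", docs)
    semantic_hits = vector_rank([0.7, 0.3], doc_vecs)
    print(rrf_fuse([keyword_hits, semantic_hits]))
```

RRF is a common choice for this kind of merge because it works on ranks rather than raw scores, so BM25 scores and cosine similarities never need to be put on a shared scale. A weighted sum of normalised scores would be an equally plausible alternative.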

Results

Processing-time reduction: 3 days → 8 min
Extraction accuracy: 94%
Analyst hours saved / week: ~80%

The Outcome

Cross-referencing a 500-page classified report dropped from 3 days to 8 minutes, at 94% extraction accuracy — with zero document egress and 200+ queries served per day.

75-day delivery · 3 days → 8 minutes · 94% extraction accuracy

Highlights

  • Open-weight Llama 3 70B — no commercial API, no egress
  • Hybrid BM25 + vector retrieval over classified corpora
  • Brief to production in 75 days

From brief to production

Delivery timeline

Days 1–10: Secure environment (air-gapped GPU cluster stood up)
Days 11–40: RAG + ingestion (OCR, indexing, hybrid retrieval)
Days 41–65: LLM + UI (quantised model, query workspace)
Days 66–75: Hardening & handover (eval, audit, production)