Invoice Auditor
Multi-agent · LangGraph · LangChain · LangSmith · MCP · Python

Agentic invoice auditor. Auto-approves the clean ones, escalates only edge cases.

Six specialised agents, orchestrated with LangGraph and LangChain, extract, translate, validate, audit, and save each invoice using a Groq-hosted Llama 3.3 LLM. Flagged invoices go to a human reviewer, a RAG chat answers questions over the corpus, and LangSmith traces every run end to end. Numbers below are pulled live from the running system, not hardcoded.

Python · Multi-agent · LangGraph · LangChain · LangSmith · Groq Llama 3.3 LLM · RAG · HITL · MCP · Function-calling · ChromaDB · FastAPI · Streaming SSE · Postgres · Pydantic
Live impact · from the running system
Auto-approved
Cleared without human touch
Invoices audited
Loaded live from /stats
Spend analyzed
Across all processed invoices
Flagged for review
Only the edge cases
Live pipeline — how an invoice flows
Step 1 of 6
Extract

Reads PDFs, DOCX, and OCR'd scans into typed JSON.

LangGraph · Pydantic · Groq Llama 3.3 · Tesseract OCR · pdfplumber + pypdf

At a glance · 8 layers · 21 tools

Agent Framework
LangGraph · LangChain · Pydantic · MCP Server
LLM + Embeddings
Groq Llama 3.3 · HuggingFace
Vector + Memory
ChromaDB Cloud · SQLite
Observability
LangSmith · SSE Streaming
Document I/O
pdfplumber + pypdf · Tesseract OCR · python-docx
Backend
FastAPI · httpx
Frontend
Next.js 14 · Tailwind CSS · Framer Motion · Apache ECharts · cmdk
Guardrails
Guardrails

The full stack

Agent Framework
LangGraph

Wires the 6 agents into a state machine with pause/resume.
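The pause/resume idea can be sketched in plain Python. This is not the LangGraph API and not the project's graph: the node names, the `ORDER` list, the `run` helper, and the "totals above 1000 get flagged" rule are all hypothetical, and only three of the six nodes are shown. It only illustrates a pipeline that stops before a node when human approval is missing and continues from that node later.

```python
from typing import Callable, Dict, List

State = Dict[str, object]

def extract(state: State) -> State:
    # Stand-in for the extraction agent: pretend we read a total from the file.
    return {**state, "total": 1200.0}

def audit(state: State) -> State:
    # Hypothetical audit rule: totals above 1000 need a human decision.
    return {**state, "flagged": state["total"] > 1000}

def save(state: State) -> State:
    return {**state, "saved": True}

NODES: Dict[str, Callable[[State], State]] = {
    "extract": extract,
    "audit": audit,
    "save": save,
}
ORDER: List[str] = ["extract", "audit", "save"]

def run(state: State, start: str = "extract") -> State:
    """Run nodes in order from `start`; pause before 'save' if flagged and unapproved."""
    for name in ORDER[ORDER.index(start):]:
        if name == "save" and state.get("flagged") and not state.get("approved"):
            return {**state, "paused_at": name}
        state = NODES[name](state)
    return {**state, "paused_at": None}

# First pass stops before 'save'; the reviewer approves; the run resumes there.
paused = run({"invoice_id": "INV-1"})
resumed = run({**paused, "approved": True}, start=paused["paused_at"])
```

In a real LangGraph build the same effect would typically come from compiling the graph with a checkpointer and an interrupt on the review step; the sketch above only mirrors the control flow.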

Agent Framework
LangChain

Provides bind_tools and with_structured_output helpers.

Agent Framework
Pydantic

Schema-validates every LLM output; auto-retries on bad JSON.
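A minimal sketch of validate-then-retry, using only the standard library as a stand-in for Pydantic. The `Invoice` fields, `parse_invoice`, `extract_with_retry`, and the retry prompt wording are assumptions for illustration, and the "LLM" here is a fake callable that returns canned replies.

```python
import json
from dataclasses import dataclass
from typing import Callable, Optional

@dataclass
class Invoice:
    vendor: str
    total: float

def parse_invoice(raw: str) -> Invoice:
    """Parse and type-check one model reply; raise ValueError/KeyError if invalid."""
    data = json.loads(raw)  # json.JSONDecodeError is a subclass of ValueError
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object")
    vendor, total = data["vendor"], data["total"]
    if not isinstance(vendor, str):
        raise ValueError("vendor must be a string")
    if isinstance(total, bool) or not isinstance(total, (int, float)):
        raise ValueError("total must be a number")
    return Invoice(vendor=vendor, total=float(total))

def extract_with_retry(
    call_llm: Callable[[str], str], prompt: str, max_attempts: int = 3
) -> Invoice:
    """Call the model until its reply validates, feeding the error back each time."""
    last_error: Optional[Exception] = None
    for _ in range(max_attempts):
        raw = call_llm(prompt)
        try:
            return parse_invoice(raw)
        except (ValueError, KeyError) as exc:
            last_error = exc
            prompt = f"{prompt}\nPrevious reply was invalid ({exc}). Return only valid JSON."
    raise RuntimeError(f"no valid output after {max_attempts} attempts") from last_error

# Fake model: the first reply is truncated JSON, the second is valid.
replies = iter(['{"vendor": "Acme", "total": ', '{"vendor": "Acme", "total": 99.5}'])
invoice = extract_with_retry(lambda prompt: next(replies), "Extract the invoice.")
```

Feeding the validation error back into the prompt is one common design choice; whether the project does this or simply re-asks is not stated in the source.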

Agent Framework
MCP Server

Exposes 12 invoice tools to Claude Desktop, Cursor, Zed.

LLM + Embeddings
Groq Llama 3.3

70B reasoning model behind extraction, audit, chat synthesis.

LLM + Embeddings
HuggingFace

all-MiniLM-L6-v2 sentence-transformer for chunk embeddings.

Vector + Memory
ChromaDB Cloud

Hosted vector DB; stores chunk embeddings for RAG.
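The retrieval step behind a RAG chat can be illustrated with a toy in-memory store. This is a sketch under loud assumptions: the three-dimensional vectors are made up (real all-MiniLM-L6-v2 embeddings have 384 dimensions), the chunk texts are invented, and a Python list stands in for ChromaDB. `cosine` and `top_k` are hypothetical helper names.

```python
import math
from typing import List, Sequence, Tuple

def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def top_k(
    query_vec: Sequence[float],
    store: List[Tuple[str, Sequence[float]]],
    k: int = 2,
) -> List[Tuple[str, float]]:
    """Return the k chunks most similar to the query, best first."""
    scored = [(text, cosine(query_vec, vec)) for text, vec in store]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)[:k]

# Made-up chunks and made-up embeddings, for illustration only.
store = [
    ("Invoice 17: Acme GmbH, total 1,200 EUR", [0.9, 0.1, 0.0]),
    ("Invoice 18: Globex, late payment fee", [0.1, 0.9, 0.1]),
    ("Invoice 19: Acme GmbH, credit note", [0.8, 0.2, 0.1]),
]
hits = top_k([1.0, 0.0, 0.0], store, k=2)
```

The retrieved chunks would then be placed in the LLM prompt as context for the answer.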

Vector + Memory
SQLite

LangGraph checkpoints — pipeline survives restarts.
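How a checkpoint lets a run survive a restart can be sketched with the standard `sqlite3` module. This is not LangGraph's checkpointer or its schema: the `checkpoints` table, the `thread_id` key, and the helper names are hypothetical, chosen only to show state being written to disk and read back by a fresh connection.

```python
import json
import os
import sqlite3
import tempfile
from typing import Optional

def open_store(path: str) -> sqlite3.Connection:
    """Open (or create) the checkpoint database."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS checkpoints ("
        "thread_id TEXT PRIMARY KEY, state TEXT NOT NULL)"
    )
    return conn

def save_checkpoint(conn: sqlite3.Connection, thread_id: str, state: dict) -> None:
    """Overwrite the latest state for one pipeline run."""
    conn.execute(
        "INSERT OR REPLACE INTO checkpoints (thread_id, state) VALUES (?, ?)",
        (thread_id, json.dumps(state)),
    )
    conn.commit()

def load_checkpoint(conn: sqlite3.Connection, thread_id: str) -> Optional[dict]:
    """Return the saved state for a run, or None if there is none."""
    row = conn.execute(
        "SELECT state FROM checkpoints WHERE thread_id = ?", (thread_id,)
    ).fetchone()
    return json.loads(row[0]) if row else None

db_path = os.path.join(tempfile.mkdtemp(), "checkpoints.db")

conn = open_store(db_path)
save_checkpoint(conn, "INV-1", {"next_node": "audit", "total": 1200.0})
conn.close()  # simulate the process going away

conn = open_store(db_path)  # simulate the restarted process
restored = load_checkpoint(conn, "INV-1")
missing = load_checkpoint(conn, "INV-404")
conn.close()
```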

Observability
LangSmith

Auto-traces every agent node and LLM call.

Observability
SSE Streaming

Streams per-agent events and chat tokens live to the UI.
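For reference, a Server-Sent Events frame is plain text: an optional `event:` line, one or more `data:` lines, and a blank line to end the frame. The sketch below formats such frames; the event names (`agent`, `end`) and payload keys are assumptions, not the project's actual telemetry schema.

```python
import json
from typing import Iterable, Iterator

def sse_event(event: str, data: dict) -> str:
    """Format one SSE frame; the trailing blank line terminates the frame."""
    return f"event: {event}\ndata: {json.dumps(data)}\n\n"

def stream_pipeline(steps: Iterable[str]) -> Iterator[str]:
    """Yield one frame per finished agent node, then a closing frame."""
    for step in steps:
        yield sse_event("agent", {"node": step, "status": "done"})
    yield sse_event("end", {})

frames = list(stream_pipeline(["extract", "translate"]))
```

In a FastAPI app, a generator like this would typically be wrapped in a `StreamingResponse` with `media_type="text/event-stream"`.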

Document I/O
pdfplumber + pypdf

Two-tier PDF text extraction with layout-aware fallback.
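A two-tier fallback can be sketched as "try each extractor in order and accept the first one that returns enough text". The source does not say which library runs first or what counts as a failed extraction, so the `min_chars` threshold and the fake extractors below are assumptions; real tiers would wrap pdfplumber and pypdf calls.

```python
from typing import Callable, Sequence

Extractor = Callable[[str], str]

def extract_text(path: str, extractors: Sequence[Extractor], min_chars: int = 20) -> str:
    """Return text from the first extractor that yields at least `min_chars` characters."""
    for extractor in extractors:
        try:
            text = extractor(path)
        except Exception:
            # Any failure in one tier just means we move on to the next tier.
            continue
        if len(text.strip()) >= min_chars:
            return text
    return ""  # every tier failed; the caller may fall back to OCR

# Fake tiers for illustration: the first finds nothing, the second succeeds.
def primary(path: str) -> str:
    return ""

def fallback(path: str) -> str:
    return "Invoice 17  Acme GmbH  Total 1,200 EUR"

text = extract_text("invoice.pdf", [primary, fallback])
```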

Document I/O
Tesseract OCR

Reads scanned-image invoices via pytesseract.

Document I/O
python-docx

Parses .docx invoice tables and paragraphs.

Backend
FastAPI

ASGI backend exposing /process, /chat, /stats, /upload.

Backend
httpx

MCP server talks back to FastAPI over HTTP.

Frontend
Next.js 14

App Router; dashboard, chat, tech, invoice detail pages.

Frontend
Tailwind CSS

Dark theme with violet→cyan AI-reserved gradient accent.

Frontend
Framer Motion

Page transitions, streaming pulses, stagger animations.

Frontend
Apache ECharts

Treemap, donut, area chart on the analytics dashboard.

Frontend
cmdk

Slash command palette in chat (/flagged, /vendor, …).

Guardrails
Guardrails

PII redaction, prompt-injection block, off-topic gate on chat I/O.
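The three checks can be illustrated with simple regex and keyword rules. This is a deliberately naive sketch, not the project's guardrail logic: the email pattern, the single injection phrase, and the `TOPIC_WORDS` set are all assumptions, and a production system would need far broader coverage.

```python
import re

# Assumed patterns, for illustration only.
EMAIL = re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+")
INJECTION = re.compile(
    r"ignore (?:all |any )?(?:previous|prior) instructions", re.IGNORECASE
)
TOPIC_WORDS = {"invoice", "vendor", "total", "payment", "spend"}

def redact_pii(text: str) -> str:
    """Replace email addresses with a placeholder before text is shown or logged."""
    return EMAIL.sub("[EMAIL]", text)

def check_input(message: str) -> str:
    """Return 'ok' or a short reason why the chat message was blocked."""
    if INJECTION.search(message):
        return "blocked: prompt injection"
    words = set(re.findall(r"[a-z]+", message.lower()))
    if not words & TOPIC_WORDS:
        return "blocked: off-topic"
    return "ok"

redacted = redact_pii("Contact billing@acme.example about invoice 17")
```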

Where an invoice goes

Every box is a real file or service. Arrows show how data physically moves.

1 · Input
  • PDF / DOCX / PNG: drag-drop or /upload
  • pdfplumber + pypdf: PDF text
  • Tesseract OCR: scans
  • python-docx: DOCX tables
2 · Agents
  • LangGraph: orchestrates 6 nodes
  • Groq Llama 3.3: reasoning model
  • Pydantic: structured outputs
  • SQLite: HITL checkpoints
  • LangSmith: traces everything
3 · Storage + UI
  • ChromaDB Cloud: vector embeddings
  • HuggingFace: MiniLM embeddings
  • FastAPI /stream: SSE telemetry
  • Next.js + ECharts: dashboard + chat
  • MCP Server: Claude Desktop link
3
Document formats — PDF, DOCX, scans
3
Languages — English, German, Spanish
1-click
Human review on flagged invoices
0 loss
Redeploys mid-pipeline — Postgres checkpoints
RAG
Natural-language Q&A across every invoice
MCP
Same tools usable from Claude Desktop, Cursor