Lastenheft
Problem. Industrial requirement documents (Lastenhefte) are dense, bilingual, and full of tables and diagrams that text-only search cannot read, and in regulated industries every answer has to be traceable back to its source.
Outcome. Built an end-to-end multimodal RAG over 26 German and English industrial PDFs, using ColPali for visual page retrieval and a LoRA-fine-tuned BGE-reranker-v2-m3. Fine-tuning on 714 synthetic DE and EN technical queries lifted Hit@1 by 8.4 points and MRR by 5.1 over the off-the-shelf reranker. A LangGraph multi-agent setup (planner, retriever, validator, synthesizer) routes between local and API models, and the system is built to the EU AI Act and GDPR: Article 6 risk classification, Article 13 transparency, Article 17 right to erasure, and a full audit trail.
- Python
- FastAPI
- Next.js
- LangGraph
- ColPali
- pgvector
- Docker
- Langfuse