Case Study

From Document Chaos to Instant Answers: Building a Secure RAG System

Secure AI-Powered Document Intelligence with Role-Based Access Control

Platform

Web Application

Solution

Custom RAG System

Industry

Enterprise Search

Key Tech

Qdrant, Docling, Laravel

Overview

Organizations accumulate vast amounts of institutional knowledge—technical documentation, policy manuals, client records, and operational procedures—spread across disconnected systems. As companies grow, finding the right information becomes increasingly difficult, while ensuring sensitive documents remain protected adds another layer of complexity.

We were engaged by a client to develop a private, custom-built internal RAG system tailored to their requirements, transforming how their employees access internal knowledge. The tool enables natural language queries across thousands of documents while enforcing strict role-based access control, ensuring users only see information they’re authorized to access.

What is Retrieval-Augmented Generation (RAG)?

Retrieval-Augmented Generation (RAG) is an AI architecture that enhances large language models (LLMs) by connecting them to external knowledge sources. Instead of relying solely on what the AI was trained on, RAG systems retrieve relevant information from your own documents before generating a response.

How a Typical RAG System Works

1. Document Ingestion

Documents are processed, split into smaller chunks, and converted into numerical representations (embeddings) that capture their semantic meaning.

2. Vector Storage

These embeddings are stored in a specialized vector database that enables fast similarity searches.

3. Query Processing

When a user asks a question, the query is also converted into an embedding and compared against stored document embeddings.

4. Retrieval

The most semantically similar document chunks are retrieved from the vector database.

5. Generation

The retrieved chunks are passed to an LLM as context along with the user’s question, and the model generates an accurate, grounded response.
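The five steps above can be condensed into a toy, standard-library sketch. The bag-of-words “embedding” and cosine similarity stand in for a real embedding model and vector database, and all document text here is illustrative:

```python
import math
from collections import Counter

def embed(text: str) -> Counter:
    # Toy "embedding": a bag-of-words term-frequency vector.
    # A production system would call a neural embedding model here.
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    # Cosine similarity between two sparse term-frequency vectors.
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

# Steps 1-2: ingest documents as chunks and store their embeddings.
chunks = [
    "Expense reports must be filed within 30 days of travel.",
    "The VPN requires two-factor authentication for remote access.",
]
index = [(chunk, embed(chunk)) for chunk in chunks]

# Steps 3-4: embed the question and retrieve the most similar chunk.
question = "How long do I have to file an expense report?"
best_chunk, _ = max(index, key=lambda item: cosine(embed(question), item[1]))

# Step 5: hand the retrieved context plus the question to an LLM.
prompt = f"Context: {best_chunk}\n\nQuestion: {question}"
```

In a real deployment the index lives in a vector database and the prompt is sent to an LLM, but the retrieve-then-generate shape is the same.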

Challenges

While the RAG concept is straightforward, building a production-ready system presents significant challenges:

Fragmented Knowledge Base

Internal documents were scattered across file servers, email attachments, and department-specific storage. Employees knew information existed but spent considerable time locating it. Traditional folder hierarchies and keyword search proved inadequate—they couldn’t understand context or semantic meaning, returning irrelevant results for complex queries.

Multiple Document Formats

The organization’s knowledge base included PDFs (both native and scanned), Word documents, Excel spreadsheets, email archives (.eml), markdown files, and plain text. Each format required different handling, and many contained complex layouts, such as tables and multi-column text, that simple text extraction would destroy.
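One common way to cope with this variety is a dispatch table keyed on file extension, so each format is routed to its own parser. A minimal sketch, with hypothetical stand-in extractors in place of real PDF and email parsers:

```python
from pathlib import Path

# Hypothetical extractors; a real system would invoke an actual
# PDF parser, email parser, etc. behind each of these functions.
def extract_pdf(path: Path) -> str: return f"[pdf] {path.name}"
def extract_eml(path: Path) -> str: return f"[email] {path.name}"
def extract_text(path: Path) -> str: return f"[text] {path.name}"

# Route each file to the matching extractor by its extension.
EXTRACTORS = {
    ".pdf": extract_pdf,
    ".eml": extract_eml,
    ".md": extract_text,
    ".txt": extract_text,
}

def extract(path: Path) -> str:
    try:
        handler = EXTRACTORS[path.suffix.lower()]
    except KeyError:
        raise ValueError(f"Unsupported format: {path.suffix}")
    return handler(path)
```

Unsupported formats fail loudly instead of silently producing garbage text, which keeps bad extractions out of the index.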

Security and Access Control

Not all employees should access all documents. Customer contracts, financial records, HR policies, and strategic plans required strict access controls. The AI system needed to respect these boundaries absolutely—it could never synthesize answers from documents a user wasn’t authorized to see. A single information leak would create legal and competitive risks.

Infrastructure Constraints

The solution needed to be hosted on a VPS with limited memory. Processing thousands of documents for AI retrieval typically demands significant compute resources, but the deployment required careful memory management and optimization to run reliably within these constraints.

Objectives

Intelligent Document Processing with Docling

We initially parsed PDFs and other documents with the LLPhant PHP library, but later integrated Docling as the document processing engine, chosen for its superior handling of complex document structures. Independent testing has shown Docling to be one of the best-performing PDF parsers available, outperforming dozens of alternatives when processing thousands of pages. Unlike simple text extractors, Docling understands document layout: it preserves table structures, identifies headings and sections, and maintains logical reading order even in multi-column formats.

Docling’s semantic chunking splits documents at natural boundaries (paragraph breaks, section headings, table boundaries) rather than arbitrary character limits. This preserves context and dramatically improves retrieval accuracy. The system runs document processing asynchronously as a separate service, keeping memory usage controlled regardless of file size.
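The difference from fixed-size chunking can be shown with a small sketch that packs whole paragraphs into chunks instead of cutting at an arbitrary character offset. Docling’s own chunkers are considerably more sophisticated, also respecting headings and table boundaries; this is only the core idea:

```python
def semantic_chunks(text: str, max_chars: int = 400) -> list[str]:
    """Split at paragraph boundaries, packing whole paragraphs into
    chunks of at most max_chars, instead of cutting mid-sentence."""
    chunks: list[str] = []
    current = ""
    for para in text.split("\n\n"):
        para = para.strip()
        if not para:
            continue
        # Start a new chunk when adding this paragraph would overflow.
        if current and len(current) + len(para) + 2 > max_chars:
            chunks.append(current)
            current = para
        else:
            current = f"{current}\n\n{para}" if current else para
    if current:
        chunks.append(current)
    return chunks
```

Because every chunk boundary coincides with a paragraph boundary, retrieved chunks always carry complete thoughts, which is what improves retrieval accuracy.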

Secure Multi-Tenancy with Qdrant

For document isolation between users, we leveraged Qdrant’s multi-tenancy architecture. This approach ensures complete separation of documents at the vector database level—one user cannot access another user’s documents. Access controls are enforced during every query, making unauthorized access architecturally impossible.
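Conceptually, Qdrant-style multi-tenancy stores a tenant identifier in each point’s payload and attaches a mandatory filter to every search (in Qdrant itself, a `must` payload filter on the search request). A standard-library sketch of that query-time check, with hypothetical payload keys and document names:

```python
# Each stored vector carries a payload; the tenant id lives there.
points = [
    {"id": 1, "payload": {"user_id": "alice", "doc": "alice-contract.pdf"}},
    {"id": 2, "payload": {"user_id": "bob", "doc": "bob-notes.md"}},
]

def search(points: list[dict], user_id: str) -> list[dict]:
    """Every query carries a mandatory tenant condition, mirroring a
    payload filter on the user_id key applied before any results
    are scored or returned."""
    return [p for p in points if p["payload"]["user_id"] == user_id]

alice_hits = search(points, "alice")
```

Because the filter is applied inside the search itself rather than on the results afterward, a user’s query can never even score another tenant’s vectors, which is what makes cross-user access architecturally impossible.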

Role-Based Access Model

The system implements three distinct access levels:

Admin: Internal administrators managing the knowledge base
User: Employees with isolated personal storage (cannot see other users’ documents)
Customer: External partners seeing only authorized content

Access control is enforced at query time. When permissions change, access updates immediately without re-processing documents.

Memory-Optimized Processing

To operate within infrastructure constraints, we implemented batched document processing with controlled memory allocation, queue-based ingestion with automatic retry handling, and server-side document parsing that processes files externally before returning only the extracted content.
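The ingestion pattern described here can be sketched as a small batched worker loop with bounded retries; the batch size, retry limit, and in-memory queue are illustrative stand-ins for the production queue system:

```python
from collections import deque

def process_queue(files, handle, batch_size=5, max_retries=3):
    """Ingest files in small batches to bound memory use; failed
    items are re-queued with a retry counter so one bad document
    never stalls the pipeline."""
    queue = deque((f, 0) for f in files)  # (item, attempts so far)
    done, dead = [], []
    while queue:
        batch = [queue.popleft() for _ in range(min(batch_size, len(queue)))]
        for item, attempts in batch:
            try:
                handle(item)
                done.append(item)
            except Exception:
                if attempts + 1 < max_retries:
                    queue.append((item, attempts + 1))  # retry later
                else:
                    dead.append(item)  # give up after max_retries
    return done, dead
```

Processing a fixed-size batch at a time keeps peak memory proportional to the batch, not the backlog, which is what makes the pipeline viable on a memory-constrained VPS.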

Benefits

Natural Language Access

Employees ask questions in plain language instead of constructing complex search queries. The system understands intent and retrieves semantically relevant documents, even when exact keywords don’t match.

Source Transparency

Every answer includes citations to source documents, allowing users to verify information and access original materials when needed. This builds trust in AI-generated responses.

Centralized Management

Administrators manage documents through a single interface with automatic indexing. New documents become searchable immediately after processing, keeping the knowledge base current.

Built-in Security

Role-based access is enforced at the architectural level, not as an afterthought. Complete query logging provides audit trails for compliance requirements.

Results

Improved Information Discovery

Employees locate relevant information through conversational queries, reducing time spent searching through folder hierarchies.

Maintained Security Compliance

The multi-tenant architecture ensures document isolation between users. Each user only retrieves their own documents—no cross-user access is possible.

Scalable Knowledge Base

The system handles growing document volumes through efficient processing and storage. New content integrates seamlessly without performance degradation.

Reduced Onboarding Friction

New employees access institutional knowledge immediately through natural questions, rather than learning where information lives across disconnected systems.

Technology Stack

Qdrant (vector database), Docling (document processing), and Laravel (application framework).

Why This Matters

This project demonstrates proven expertise in:

  • Custom RAG System Development — Architecture designed around specific security, performance, and integration requirements
  • AI Workflow Orchestration — Reliable multi-stage pipelines from document ingestion to response generation
  • Secure by Design — Role-based access control built into the system architecture
  • Real-World Constraint Engineering — Memory-optimized solutions for existing infrastructure

Struggling to Hire the Right Software Engineers?

Our software team augmentation services connect you with top talent to scale faster and smarter.