
How to Build a RAG System with Your Business Documents

Create a retrieval-augmented generation system for accurate answers from your data.

Jay Banlasan

The AI Systems Guy

When you build a RAG system with your business documents, your team gets accurate answers from your own data instead of generic AI responses. I run these for businesses sitting on years of SOPs, meeting notes, contracts, and internal wikis that nobody can search effectively. RAG retrieves the relevant chunks from your documents and feeds them to the AI as context.

The AI answers from your data, not from its training data. That is the difference between useful and hallucinated.

What You Need Before Starting

A recent Python 3, a folder of business documents (PDF or plain text), and an Anthropic API key exported as ANTHROPIC_API_KEY. The code below relies on langchain for loading and chunking, sentence-transformers for embeddings, chromadb for the vector store, and Flask for the API layer.

Step 1: Load and Chunk Your Documents

# langchain >= 0.1 moved loaders and splitters into separate packages:
# pip install langchain-community langchain-text-splitters pypdf
from langchain_community.document_loaders import DirectoryLoader, PyPDFLoader, TextLoader
from langchain_text_splitters import RecursiveCharacterTextSplitter

def load_documents(docs_path):
    loaders = {
        "**/*.pdf": PyPDFLoader,
        "**/*.txt": TextLoader,
    }
    all_docs = []
    for pattern, loader_cls in loaders.items():
        loader = DirectoryLoader(docs_path, glob=pattern, loader_cls=loader_cls)
        all_docs.extend(loader.load())
    return all_docs

def chunk_documents(documents, chunk_size=1000, overlap=200):
    splitter = RecursiveCharacterTextSplitter(
        chunk_size=chunk_size,
        chunk_overlap=overlap,
        separators=["\n\n", "\n", ". ", " "]
    )
    return splitter.split_documents(documents)
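The overlap matters: it keeps a sentence that straddles a chunk boundary fully present in both neighboring chunks, so retrieval never returns half a thought. A stripped-down, character-based sketch of the idea (this is not the actual RecursiveCharacterTextSplitter logic, which also respects the separator hierarchy):

```python
def naive_chunk(text, chunk_size=1000, overlap=200):
    """Split text into fixed-size windows that overlap by `overlap` characters."""
    step = chunk_size - overlap
    return [text[i:i + chunk_size] for i in range(0, len(text), step)]

chunks = naive_chunk("a" * 2500, chunk_size=1000, overlap=200)
# Adjacent full-size windows share their last/first 200 characters.
```

With chunk_size=1000 and overlap=200, each new window starts 800 characters after the previous one, so roughly 20% of every chunk is repeated context.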

Step 2: Create the Vector Store

from sentence_transformers import SentenceTransformer
import chromadb

model = SentenceTransformer("all-MiniLM-L6-v2")
chroma = chromadb.PersistentClient(path="./business_rag")
collection = chroma.get_or_create_collection("documents")

def index_chunks(chunks):
    # Encode all chunks in one batch -- much faster than one encode() call per chunk.
    texts = [chunk.page_content for chunk in chunks]
    embeddings = model.encode(texts).tolist()
    # Note: re-running add() with the same ids will complain; use collection.upsert
    # instead if you plan to re-index in place.
    collection.add(
        ids=[f"chunk_{i}" for i in range(len(chunks))],
        embeddings=embeddings,
        documents=texts,
        metadatas=[{
            "source": chunk.metadata.get("source", "unknown"),
            "page": chunk.metadata.get("page", 0)
        } for chunk in chunks]
    )
    print(f"Indexed {len(chunks)} chunks")
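Chroma handles the similarity search internally (its default distance metric is L2), but the intuition behind embedding retrieval is easiest to see with cosine similarity: vectors pointing in the same direction score near 1, unrelated ones near 0. A plain-Python sketch:

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two vectors: 1.0 means identical direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

print(cosine_similarity([1.0, 0.0], [1.0, 0.0]))  # 1.0 -- same direction
print(cosine_similarity([1.0, 0.0], [0.0, 1.0]))  # 0.0 -- orthogonal
```

The embedding model's job is to make semantically similar texts land close together in this vector space, so "How do I get a refund?" retrieves the refund-policy chunk even when the wording differs.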

Step 3: Build the Query Pipeline

import anthropic

client = anthropic.Anthropic()

def query_rag(question, top_k=5):
    query_embedding = model.encode(question).tolist()
    results = collection.query(
        query_embeddings=[query_embedding],
        n_results=top_k
    )

    context_chunks = results["documents"][0]
    sources = results["metadatas"][0]
    context = "\n\n---\n\n".join(context_chunks)

    response = client.messages.create(
        model="claude-sonnet-4-20250514",
        max_tokens=500,
        system="Answer the question using ONLY the provided context. If the context does not contain the answer, say 'I could not find this in the available documents.' Always cite which document the answer comes from.",
        messages=[{
            "role": "user",
            "content": f"Context:\n{context}\n\nQuestion: {question}"
        }]
    )

    return {
        "answer": response.content[0].text,
        "sources": [s["source"] for s in sources]
    }

Step 4: Add the API Layer

from flask import Flask, request, jsonify

app = Flask(__name__)

@app.route("/api/ask", methods=["POST"])
def ask():
    question = request.json["question"]
    result = query_rag(question)
    return jsonify(result)

@app.route("/api/index", methods=["POST"])
def reindex():
    docs_path = request.json.get("path", "./documents")
    documents = load_documents(docs_path)
    chunks = chunk_documents(documents)
    index_chunks(chunks)
    return jsonify({"indexed": len(chunks)})

Step 5: Test and Validate

def validate_rag(test_questions):
    results = []
    for q in test_questions:
        answer = query_rag(q["question"])
        results.append({
            "question": q["question"],
            "expected": q["expected_answer"],
            "actual": answer["answer"],
            "sources": answer["sources"]
        })
    return results

test_set = [
    {"question": "What is our refund policy?", "expected_answer": "30-day refund policy"},
    {"question": "How do I request time off?", "expected_answer": "Submit through HR portal"},
]
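validate_rag collects answers but does not judge them. As a crude automatic first pass, you could mark a result as passing when every keyword from the expected answer appears in the actual one (a hypothetical scoring helper and simplistic criterion; it does not replace reading the answers yourself):

```python
def score_results(results):
    """Flag each result as passing if all expected keywords appear in the answer."""
    scored = []
    for r in results:
        keywords = r["expected"].lower().split()
        passed = all(k in r["actual"].lower() for k in keywords)
        scored.append({**r, "passed": passed})
    return scored

demo = [{"question": "What is our refund policy?",
         "expected": "30-day refund",
         "actual": "We offer a 30-day refund policy on all plans.",
         "sources": ["policy.pdf"]}]
print(score_results(demo)[0]["passed"])  # True
```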

What to Build Next

Add document versioning. When a document gets updated, re-index only that document and keep track of which version was used for each answer. This creates an audit trail for compliance.
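One way to detect which documents changed, assuming you hash file contents against a saved manifest and re-index only the files whose hashes differ (a sketch; the manifest path and JSON format are my own choices, and you would wire the returned paths into load_documents and index_chunks):

```python
import hashlib
import json
from pathlib import Path

def changed_documents(docs_path, manifest_path="index_manifest.json"):
    """Return files whose content hash differs from the saved manifest."""
    manifest_file = Path(manifest_path)
    old = json.loads(manifest_file.read_text()) if manifest_file.exists() else {}
    new, changed = {}, []
    for path in sorted(Path(docs_path).rglob("*")):
        if not path.is_file():
            continue
        digest = hashlib.sha256(path.read_bytes()).hexdigest()
        new[str(path)] = digest
        if old.get(str(path)) != digest:
            changed.append(str(path))
    manifest_file.write_text(json.dumps(new, indent=2))  # record current state
    return changed
```

Storing the manifest alongside each answer's sources gives you the audit trail: you can say exactly which version of which file produced a given response.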

Want this system built for your business?

Get a free assessment. We will map every system your business needs and show you the ROI.
