How to Build a Company Policy Q&A Bot with RAG
Create an internal bot that answers policy questions from your handbook.
Jay Banlasan
The AI Systems Guy
A company policy Q&A bot built on retrieval-augmented generation (RAG) answers "Can I do X?" in seconds instead of making employees email HR and wait two days. I build these for organizations with 50+ employees where the same policy questions get asked repeatedly. The bot retrieves the relevant passages from your employee handbook, benefits docs, and travel policies, then answers with document and page references.
HR teams can save 5-10 hours per week when employees self-serve answers to policy questions.
What You Need Before Starting
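- Your company handbook and policy documents (the code below loads PDFs; DOCX or Markdown files need a different loader)
- Python 3.8+ with chromadb, anthropic, sentence-transformers, langchain, pypdf, and slack-bolt
- An Anthropic API key, set as the ANTHROPIC_API_KEY environment variable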
- Slack or Teams for the bot interface
- An approval process for the initial knowledge base
Step 1: Index Policy Documents
import os

# On newer LangChain releases these two classes are imported from
# langchain_community.document_loaders and langchain_text_splitters instead.
from langchain.document_loaders import PyPDFLoader
from langchain.text_splitter import RecursiveCharacterTextSplitter
from sentence_transformers import SentenceTransformer
import chromadb

model = SentenceTransformer("all-MiniLM-L6-v2")
chroma = chromadb.PersistentClient(path="./policy_rag")
collection = chroma.get_or_create_collection("policies")

def index_policy_docs(doc_paths):
    # 800-character chunks with 200 characters of overlap
    splitter = RecursiveCharacterTextSplitter(chunk_size=800, chunk_overlap=200)
    for path in doc_paths:
        source = os.path.basename(path)
        pages = PyPDFLoader(path).load()  # one Document per PDF page
        chunks = splitter.split_documents(pages)
        for i, chunk in enumerate(chunks):
            # PyPDFLoader numbers pages from 0; store 1-based numbers for citations
            page = chunk.metadata.get("page", 0) + 1
            embedding = model.encode(chunk.page_content).tolist()
            collection.add(
                ids=[f"{source}_p{page}_{i}"],
                embeddings=[embedding],
                documents=[chunk.page_content],
                metadatas=[{
                    "source": source,
                    "page": page,
                    # extract_section_header() is your own helper; it should return a string
                    "section": extract_section_header(chunk.page_content),
                }],
            )
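The indexing code calls `extract_section_header`, which this walkthrough does not define. Here is a minimal sketch, assuming headings in your handbook look like numbered titles ("4.2 Paid Time Off") or short all-caps lines. The regex patterns and the "Unknown" fallback are assumptions to adjust to your own documents.

```python
import re

# Hypothetical heading patterns; tune these to how your handbook is formatted.
_NUMBERED = re.compile(r"^\d+(\.\d+)*\.?\s+[A-Z].{0,80}$")
_ALL_CAPS = re.compile(r"^[A-Z][A-Z0-9 &/,\-]{3,80}$")


def extract_section_header(text, default="Unknown"):
    """Return the first line in a chunk that looks like a section heading.

    Always returns a string, since some Chroma versions reject None
    as a metadata value.
    """
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if _NUMBERED.match(line) or _ALL_CAPS.match(line):
            return line
    return default
```

Chunks cut from the middle of a section will fall back to the default, so treat the "section" field as a hint rather than a reliable label.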
Step 2: Build the Policy Q&A Function
import anthropic

client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment

POLICY_PROMPT = """You are an internal policy assistant for [Company Name].
Answer employee questions using ONLY the policy documents provided.
Rules:
- Cite the specific document and page number
- If the policy is ambiguous, say so and recommend contacting HR
- Never interpret policies loosely. Stick to what is written.
- If the question is not covered by any policy, say "This is not covered in our current policies. Please contact HR at [HR email address]."
- For sensitive topics (termination, harassment, legal), always add: "For specific situations, please consult HR directly."
"""

def ask_policy(question):
    # Embed the question with the same model used for indexing
    query_embedding = model.encode(question).tolist()
    results = collection.query(query_embeddings=[query_embedding], n_results=5)

    # Results are nested one level per query; we sent one query, so take index 0
    docs = results["documents"][0]
    metas = results["metadatas"][0]

    context = "\n\n".join(
        f"[{meta['source']}, Page {meta['page']}]\n{doc}"
        for doc, meta in zip(docs, metas)
    )
    response = client.messages.create(
        model="claude-sonnet-4-20250514",
        max_tokens=400,
        system=POLICY_PROMPT,
        messages=[{"role": "user", "content": f"Policy context:\n{context}\n\nEmployee question: {question}"}],
    )
    return {
        "answer": response.content[0].text,
        "sources": [{"doc": meta["source"], "page": meta["page"]} for meta in metas],
    }
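By default, Chroma's `query` also returns a `distances` list alongside the documents and metadata. One way to keep weak matches out of the prompt is to drop chunks beyond a distance cutoff before building the context string. The sketch below works on the same result shape used above. The `build_context` name and the `max_distance` value are assumptions, and a sensible cutoff depends on your embedding model and the collection's distance metric.

```python
def build_context(results, max_distance=1.0):
    """Format retrieved chunks as labeled context, skipping weak matches.

    `results` is a dict shaped like the return value of collection.query()
    for a single query: parallel lists nested one level deep.
    """
    docs = results["documents"][0]
    metas = results["metadatas"][0]
    dists = results["distances"][0]

    blocks = []
    for doc, meta, dist in zip(docs, metas, dists):
        if dist > max_distance:  # hypothetical cutoff; calibrate on real queries
            continue
        blocks.append(f"[{meta['source']}, Page {meta['page']}]\n{doc}")
    return "\n\n".join(blocks)


# Hand-written stand-in for a query result, for illustration only.
fake_results = {
    "documents": [["PTO accrues at 1.5 days per month.", "Parking is free."]],
    "metadatas": [[{"source": "handbook.pdf", "page": 12},
                   {"source": "facilities.pdf", "page": 3}]],
    "distances": [[0.42, 1.37]],
}
context = build_context(fake_results)
```

If the returned string is empty, you can skip the model call and return the "not covered" message directly.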
Step 3: Connect to Slack
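import os

from slack_bolt import App

slack_app = App(token=os.getenv("SLACK_BOT_TOKEN"), signing_secret=os.getenv("SLACK_SIGNING_SECRET"))

@slack_app.message("")  # empty keyword: handle every message the bot receives
def handle_message(message, say):
    question = message["text"]
    user = message["user"]
    result = ask_policy(question)
    # Show at most three sources under the answer
    sources = ", ".join([f"{s['doc']} (p.{s['page']})" for s in result["sources"][:3]])
    say(f"{result['answer']}\n\n_Sources: {sources}_")
    # log_query() is your own helper; it feeds the usage report in Step 5
    log_query(user, question, result["answer"])

if __name__ == "__main__":
    slack_app.start(port=3000)  # Bolt's built-in HTTP server for Slack events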
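The handler calls `log_query`, which is not defined above. Here is a minimal sketch using SQLite. The `policy_bot.db` file, the `queries` table, and the `question`, `answer`, and `asked_at` columns match the report query in Step 5. The `user_id` column and the `db_path` parameter are assumptions.

```python
import sqlite3

DB_PATH = "policy_bot.db"  # same file the Step 5 report reads


def init_db(db_path=DB_PATH):
    """Create the queries table if it does not exist yet."""
    conn = sqlite3.connect(db_path)
    try:
        conn.execute(
            """CREATE TABLE IF NOT EXISTS queries (
                   id INTEGER PRIMARY KEY AUTOINCREMENT,
                   user_id TEXT NOT NULL,
                   question TEXT NOT NULL,
                   answer TEXT NOT NULL,
                   asked_at TEXT NOT NULL DEFAULT (datetime('now'))
               )"""
        )
        conn.commit()
    finally:
        conn.close()


def log_query(user, question, answer, db_path=DB_PATH):
    """Store one question/answer pair; asked_at defaults to the current UTC time."""
    init_db(db_path)
    conn = sqlite3.connect(db_path)
    try:
        conn.execute(
            "INSERT INTO queries (user_id, question, answer) VALUES (?, ?, ?)",
            (user, question, answer),
        )
        conn.commit()
    finally:
        conn.close()
```

Storing `asked_at` in SQLite's own `datetime('now')` format keeps the string comparisons in the report query valid.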
Step 4: Add Sensitive Topic Guards
SENSITIVE_TOPICS = ["harassment", "termination", "fired", "discrimination", "lawsuit", "legal action", "disability", "pregnancy"]

def check_sensitivity(question):
    # Case-insensitive substring match against the keyword list
    question_lower = question.lower()
    return any(topic in question_lower for topic in SENSITIVE_TOPICS)

def ask_policy_safe(question):
    result = ask_policy(question)
    if check_sensitivity(question):
        result["answer"] += "\n\nThis topic involves sensitive workplace matters. For your specific situation, please contact HR directly at [HR email address] or schedule a confidential meeting."
        # notify_hr() is your own helper, e.g. a post to a private HR channel
        notify_hr(question)
    return result
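Plain substring matching misses common variants: "harassed", "terminated", and "pregnant" contain none of the keywords in the list above. One alternative is to match word stems with a regular expression, as sketched below. The stem list and the `check_sensitivity_stems` name are assumptions, not a vetted HR vocabulary.

```python
import re

# Hypothetical stems; each matches the stem plus any word ending.
SENSITIVE_STEMS = [
    "harass", "terminat", "fired", "discriminat",
    "lawsuit", "legal", "disab", "pregnan",
]
_SENSITIVE_RE = re.compile(
    r"\b(?:" + "|".join(map(re.escape, SENSITIVE_STEMS)) + r")\w*",
    re.IGNORECASE,
)


def check_sensitivity_stems(question):
    """Return True if any word in the question starts with a sensitive stem."""
    return _SENSITIVE_RE.search(question) is not None
```

Stems trade a few false positives (for example, "disable notifications") for fewer misses. For a guard whose job is to route people to HR, that is usually the safer direction.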
Step 5: Track Usage and Gaps
import sqlite3

def get_policy_bot_report(days=30):
    conn = sqlite3.connect("policy_bot.db")
    start = f"-{days} days"  # SQLite date modifier, e.g. "-30 days"
    try:
        total = conn.execute("SELECT COUNT(*) FROM queries WHERE asked_at > datetime('now', ?)", (start,)).fetchone()[0]
        # Relies on the "not covered" wording the system prompt asks the model to use
        unanswered = conn.execute("SELECT COUNT(*) FROM queries WHERE answer LIKE '%not covered%' AND asked_at > datetime('now', ?)", (start,)).fetchone()[0]
        # Groups by exact question text, so rephrasings are counted separately
        top_topics = conn.execute("""
            SELECT question, COUNT(*) FROM queries WHERE asked_at > datetime('now', ?)
            GROUP BY question ORDER BY COUNT(*) DESC LIMIT 10
        """, (start,)).fetchall()
    finally:
        conn.close()
    return {"total_queries": total, "unanswered": unanswered, "top_topics": top_topics}
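Because the report groups by exact question text, "How much PTO do I get?" and "how much pto do i get" count as different topics. A small normalization step before logging or counting narrows that gap. The `normalize_question` and `top_questions` helpers below are assumptions, and they only handle case, punctuation, and whitespace, not paraphrases.

```python
import re
from collections import Counter


def normalize_question(question):
    """Lowercase, strip punctuation, and collapse whitespace."""
    text = question.lower()
    text = re.sub(r"[^\w\s]", "", text)  # drop punctuation
    return re.sub(r"\s+", " ", text).strip()


def top_questions(questions, limit=10):
    """Count questions after normalization and return the most common ones."""
    counts = Counter(normalize_question(q) for q in questions)
    return counts.most_common(limit)
```

Grouping true paraphrases ("vacation days" vs. "PTO") would need something beyond string cleanup, such as clustering the question embeddings you already compute.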
What to Build Next
Add policy change notifications. When a document gets updated in the system, automatically notify affected employees. If the PTO policy changes, everyone who asked about PTO in the last 3 months should know about the update.
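As a starting point, the query log can identify who to notify. The sketch below assumes the `queries` table has a `user_id` column (an assumption; the report in Step 5 only uses `question`, `answer`, and `asked_at`). It uses a simple keyword match, which will miss paraphrases such as "vacation days".

```python
import sqlite3


def users_who_asked(conn, keyword, days=90):
    """Return distinct user IDs whose questions mentioned `keyword` recently."""
    rows = conn.execute(
        """SELECT DISTINCT user_id FROM queries
           WHERE question LIKE ? AND asked_at > datetime('now', ?)
           ORDER BY user_id""",
        (f"%{keyword}%", f"-{days} days"),
    ).fetchall()
    return [row[0] for row in rows]
```

For example, `users_who_asked(conn, "PTO")` returns the users to message when the PTO policy document is re-indexed. SQLite's `LIKE` is case-insensitive for ASCII text, so "pto" and "PTO" both match.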
Related Reading
- Hiring and Recruitment with AI - AI in HR operations beyond just policy Q&A
- The Centralized Brain Concept - policy as part of the company knowledge system
- AI in Customer Service - the same patterns work for internal "customers"