Every developer knows the feeling: you’re starting on a new team or tackling an open-source project, and you’re faced with a mountain of unfamiliar code. You spend hours, even days, grepping through files, tracing function calls, and trying to decipher the architecture from scattered comments and documentation.
What if you could just ask the codebase questions in plain English? “What’s the purpose of the Node class?” “Show me an example of how to use the SimpleDirectoryReader.”
Welcome to our next project: building a “Chat with Your Codebase” agent. We are going to use LlamaIndex to ingest an entire Git repository—in this case, parts of the LlamaIndex library itself—and build a conversational agent that can act as an expert guide to its own source code. This is a perfect, real-world application of Retrieval-Augmented Generation (RAG).
Part 1: Setting Up the Knowledge Base (The Code)
Our agent’s “knowledge” will be the source code of a real project. To keep things meta, we’ll use the LlamaIndex repository.
Step 1: Clone the LlamaIndex Repository
In your terminal, clone the official repository into a local folder. This will be our data.
# We'll put the repo in a specific folder to keep things organized
git clone https://github.com/run-llama/llama_index.git ./llama_index_repo
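The full repository is fairly large, and we only need a snapshot of the code for indexing, not its history. An optional shortcut is a shallow clone:
# Optional: shallow clone to save time and disk space (history isn't needed for indexing)
git clone --depth 1 https://github.com/run-llama/llama_index.git ./llama_index_repo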
Step 2: Install Dependencies
This project requires a few specific LlamaIndex packages for loading and parsing code.
pip install llama-index llama-index-llms-openai llama-index-readers-file
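One note: the CodeSplitter we use in Part 2 relies on tree-sitter to parse source files. Depending on your LlamaIndex version, you may also need its tree-sitter dependencies; if the splitter raises an import error, installing them usually resolves it (exact package names can vary between versions):
# Only needed if CodeSplitter complains about a missing tree-sitter dependency
pip install tree_sitter tree_sitter_languages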
Part 2: The Ingestion Pipeline for Code
Now, we’ll write a script to index the source code. A key challenge is that code is not like prose; it has a rigid structure of classes and functions. A simple text splitter would break code in the middle of a function, destroying its meaning. We need a smarter approach.
Step 1: Loading the Code with SimpleDirectoryReader
We’ll point LlamaIndex’s versatile SimpleDirectoryReader at our cloned repo. To keep this tutorial fast and focused, we’ll only index the core logic directory (llama_index/core) and configure the reader to only load Python files.
# build_index.py
from llama_index.core import SimpleDirectoryReader
# Load only .py files, recursively, and exclude certain directories
reader = SimpleDirectoryReader(
    input_dir="./llama_index_repo/llama_index/core",
    required_exts=[".py"],
    exclude=["*/legacy/*", "*/__init__.py"],
    recursive=True,
)
print("Loading documents...")
documents = reader.load_data()
print(f"Loaded {len(documents)} source code documents.")
Step 2: A Smarter Way to Split: The CodeSplitter
This is the crucial step. Instead of a generic text splitter, LlamaIndex provides a language-aware CodeSplitter. Configured for Python, it understands the language’s syntax and chunks the code so that related blocks, like entire functions or classes, stay together.
# build_index.py (continued)
from llama_index.core.node_parser import CodeSplitter
splitter = CodeSplitter(
    language="python",
    chunk_lines=40,          # target number of lines per chunk
    chunk_lines_overlap=15,  # lines of overlap between consecutive chunks
    max_chars=1500,          # maximum number of characters per chunk
)
nodes = splitter.get_nodes_from_documents(documents)
print(f"Split documents into {len(nodes)} code nodes.")
Step 3: Building and Persisting the Index
This follows the same pattern we learned in Post #42. We create the index from our specialized code nodes and save it to disk.
# build_index.py (continued)
from llama_index.core import VectorStoreIndex
from dotenv import load_dotenv
load_dotenv()
print("Creating and persisting index... This may take a few minutes.")
index = VectorStoreIndex(nodes)
index.storage_context.persist(persist_dir="./storage_code")
print("Index created and saved successfully.")
Run this script once: python build_index.py. It will take a few minutes to process the code, create embeddings, and save your knowledge base to the ./storage_code directory.
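Note that the embedding step assumes an OpenAI API key is available, since this setup relies on LlamaIndex’s default OpenAI embedding model. The load_dotenv() calls in both scripts expect a .env file in the project root, along these lines (the value below is a placeholder):
# .env
OPENAI_API_KEY=your-openai-api-key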
Part 3: Building the Chat Engine
Now we can build our conversational interface. This will load our persisted, code-aware index and create a ChatEngine.
# main.py
from dotenv import load_dotenv
from llama_index.core import StorageContext, load_index_from_storage
load_dotenv()
PERSIST_DIR = "./storage_code"
# --- Load the Index ---
storage_context = StorageContext.from_defaults(persist_dir=PERSIST_DIR)
index = load_index_from_storage(storage_context)
# --- Create the Chat Engine ---
# The system prompt is key to guiding the agent's behavior
system_prompt = (
    "You are an expert on the LlamaIndex Python library. Your role is to be a helpful "
    "assistant for developers using the library. You will be given context from the source "
    "code. Use this context to answer questions about how the code works. If you don't know "
    "the answer from the context, say so."
)
chat_engine = index.as_chat_engine(
    system_prompt=system_prompt,
    chat_mode="context",
    verbose=False,
)
# --- Start the Conversation Loop ---
if __name__ == "__main__":
    print("Codebase assistant is ready. Type 'exit' to end.")
    while True:
        user_input = input("You: ")
        if user_input.lower() == "exit":
            break
        response = chat_engine.chat(user_input)
        print(f"Bot: {response}")
Part 4: Chatting with Your Code
Run python main.py and start asking questions. The agent can now act as an expert developer who has memorized the entire codebase.
Sample Conversation:
You: What is the purpose of the QueryEngine in LlamaIndex?

Bot: Based on the source code, a QueryEngine is a generic interface that allows you to ask questions of your data. It takes a natural language query and returns a rich Response object. It’s a high-level abstraction that bundles together a retriever (for fetching data) and a response synthesizer (for generating an answer with an LLM).

You: Show me the code for the SimpleDirectoryReader’s load_data method.

Bot: Certainly. Here is the load_data method from the SimpleDirectoryReader class:

def load_data(
    self, show_progress: bool = False, num_workers: Optional[int] = None
) -> List[Document]:
    """Load data from the input directory.
    ... (rest of the docstring and code) ...
    """

You: What parameters does it take?

Bot: The load_data method takes two optional parameters: show_progress (a boolean to display a progress bar) and num_workers (an integer to specify the number of parallel workers for loading files).
Conclusion
You have successfully built a powerful and practical developer assistant. By using code-aware components like CodeSplitter
, you’ve transformed a raw Git repository into a structured, queryable knowledge base that can dramatically accelerate a developer’s workflow.
This project proves that RAG is not just for unstructured text like articles. By using specialized loaders and parsers, you can turn any data source—including the very code we write—into an intelligent and interactive resource. This opens up incredible possibilities for developer tooling, automated documentation, and AI-assisted debugging.
Author

Experienced Cloud & DevOps Engineer with hands-on experience in AWS, GCP, Terraform, Ansible, ELK, Docker, Git, GitLab, Python, PowerShell, Shell, and theoretical knowledge of Azure, Kubernetes & Jenkins. In my free time, I write blogs on ckdbtech.com