Every developer knows the feeling: you’re starting on a new team or tackling an open-source project, and you’re faced with a mountain of unfamiliar code. You spend hours, even days, grepping through files, tracing function calls, and trying to decipher the architecture from scattered comments and documentation.
What if you could just ask the codebase questions in plain English? “What’s the purpose of the Node class?” “Show me an example of how to use the SimpleDirectoryReader.”
Welcome to our next project: building a “Chat with Your Codebase” agent. We are going to use LlamaIndex to ingest an entire Git repository—in this case, parts of the LlamaIndex library itself—and build a conversational agent that can act as an expert guide to its own source code. This is a perfect, real-world application of Retrieval-Augmented Generation (RAG).
Part 1: Setting Up the Knowledge Base (The Code)
Our agent’s “knowledge” will be the source code of a real project. To keep things meta, we’ll use the LlamaIndex repository.
Step 1: Clone the LlamaIndex Repository
In your terminal, clone the official repository into a local folder. This will be our data.
# We'll put the repo in a specific folder to keep things organized
git clone https://github.com/run-llama/llama_index.git ./llama_index_repo
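The full repository is fairly large, and we only need a snapshot of the code for indexing, not its history. An optional shortcut is a shallow clone:
# Optional: shallow clone to save time and disk space (history isn't needed for indexing)
git clone --depth 1 https://github.com/run-llama/llama_index.git ./llama_index_repo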
Step 2: Install Dependencies
This project requires a few specific LlamaIndex packages for loading and parsing code.
pip install llama-index llama-index-llms-openai llama-index-readers-file
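One note: the CodeSplitter we use in Part 2 relies on tree-sitter to parse source files. Depending on your LlamaIndex version, you may also need its tree-sitter dependencies; if the splitter raises an import error, installing them usually resolves it (exact package names can vary between versions):
# Only needed if CodeSplitter complains about a missing tree-sitter dependency
pip install tree_sitter tree_sitter_languages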
Part 2: The Ingestion Pipeline for Code
Now, we’ll write a script to index the source code. A key challenge is that code is not like prose; it has a rigid structure of classes and functions. A simple text splitter would break code in the middle of a function, destroying its meaning. We need a smarter approach.
Step 1: Loading the Code with SimpleDirectoryReader
We’ll point LlamaIndex’s versatile SimpleDirectoryReader at our cloned repo. To keep this tutorial fast and focused, we’ll only index the core logic directory (llama_index/core) and configure the reader to only load Python files.
# build_index.py
from llama_index.core import SimpleDirectoryReader
# Load only .py files, recursively, and exclude certain directories
reader = SimpleDirectoryReader(
    input_dir="./llama_index_repo/llama_index/core",
    required_exts=[".py"],
    exclude=["*/legacy/*", "*/__init__.py"],
    recursive=True,
)
print("Loading documents...")
documents = reader.load_data()
print(f"Loaded {len(documents)} source code documents.")
Step 2: A Smarter Way to Split: The CodeSplitter
This is the crucial step. Instead of a generic text splitter, LlamaIndex provides a language-aware CodeSplitter. Configured for Python, it understands the language’s syntax and chunks the code so that related blocks, like entire functions or classes, stay together.
# build_index.py (continued)
from llama_index.core.node_parser import CodeSplitter
splitter = CodeSplitter(
    language="python",
    chunk_lines=40,          # target number of lines per chunk
    chunk_lines_overlap=15,  # lines of overlap between consecutive chunks
    max_chars=1500,          # maximum number of characters per chunk
)
nodes = splitter.get_nodes_from_documents(documents)
print(f"Split documents into {len(nodes)} code nodes.")
Step 3: Building and Persisting the Index
This follows the same pattern we learned in Post #42. We create the index from our specialized code nodes and save it to disk.
# build_index.py (continued)
from llama_index.core import VectorStoreIndex
from dotenv import load_dotenv
load_dotenv()
print("Creating and persisting index... This may take a few minutes.")
index = VectorStoreIndex(nodes)
index.storage_context.persist(persist_dir="./storage_code")
print("Index created and saved successfully.")
Run this script once: python build_index.py. It will take a few minutes to process the code, create embeddings, and save your knowledge base to the ./storage_code directory.
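Note that the embedding step assumes an OpenAI API key is available, since this setup relies on LlamaIndex’s default OpenAI embedding model. The load_dotenv() calls in both scripts expect a .env file in the project root, along these lines (the value below is a placeholder):
# .env
OPENAI_API_KEY=your-openai-api-key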
Part 3: Building the Chat Engine
Now we can build our conversational interface. This will load our persisted, code-aware index and create a ChatEngine.
# main.py
from dotenv import load_dotenv
from llama_index.core import StorageContext, load_index_from_storage
load_dotenv()
PERSIST_DIR = "./storage_code"
# --- Load the Index ---
storage_context = StorageContext.from_defaults(persist_dir=PERSIST_DIR)
index = load_index_from_storage(storage_context)
# --- Create the Chat Engine ---
# The system prompt is key to guiding the agent's behavior
system_prompt = (
    "You are an expert on the LlamaIndex Python library. Your role is to be a helpful "
    "assistant for developers using the library. You will be given context from the source "
    "code. Use this context to answer questions about how the code works. If you don't know "
    "the answer from the context, say so."
)
chat_engine = index.as_chat_engine(
    system_prompt=system_prompt,
    chat_mode="context",
    verbose=False,
)
# --- Start the Conversation Loop ---
if __name__ == "__main__":
    print("Codebase assistant is ready. Type 'exit' to end.")
    while True:
        user_input = input("You: ")
        if user_input.lower() == "exit":
            break
        response = chat_engine.chat(user_input)
        print(f"Bot: {response}")
Part 4: Chatting with Your Code
Run python main.py and start asking questions. The agent can now act as an expert developer who has memorized the entire codebase.
Sample Conversation:
You: What is the purpose of the QueryEngine in LlamaIndex?

Bot: Based on the source code, a QueryEngine is a generic interface that allows you to ask questions of your data. It takes a natural language query and returns a rich Response object. It’s a high-level abstraction that bundles together a retriever (for fetching data) and a response synthesizer (for generating an answer with an LLM).

You: Show me the code for the SimpleDirectoryReader’s load_data method.

Bot: Certainly. Here is the load_data method from the SimpleDirectoryReader class:

def load_data(
    self, show_progress: bool = False, num_workers: Optional[int] = None
) -> List[Document]:
    """Load data from the input directory.
    ... (rest of the docstring and code) ...
    """

You: What parameters does it take?

Bot: The load_data method takes two optional parameters: show_progress (a boolean to display a progress bar) and num_workers (an integer to specify the number of parallel workers for loading files).
Conclusion
You have successfully built a powerful and practical developer assistant. By using code-aware components like CodeSplitter
, you’ve transformed a raw Git repository into a structured, queryable knowledge base that can dramatically accelerate a developer’s workflow.
This project proves that RAG is not just for unstructured text like articles. By using specialized loaders and parsers, you can turn any data source—including the very code we write—into an intelligent and interactive resource. This opens up incredible possibilities for developer tooling, automated documentation, and AI-assisted debugging.
Author

Experienced Cloud & DevOps Engineer with hands-on experience in AWS, GCP, Terraform, Ansible, ELK, Docker, Git, GitLab, Python, PowerShell, Shell, and theoretical knowledge of Azure, Kubernetes & Jenkins. In my free time, I write blogs on ckdbtech.com