What is RAG?
RAG (Retrieval-Augmented Generation) is a technique that pairs information retrieval with a generative AI model: relevant documents are found based on meaning, not just keywords, and can then be supplied to the model as context. The steps below cover the retrieval side.
How it works:
- Step 1 - Build Index: Your documents are processed, split into chunks, and converted into vector embeddings (mathematical representations that capture meaning)
- Step 2 - Search: When you search, your query is also converted to a vector and compared against all document chunks
- Step 3 - Retrieve: The system finds the top-k most relevant chunks based on semantic similarity and returns them to you
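The three steps above can be sketched in a few lines of Python. This is a minimal illustration, not this tool's implementation: the `embed` function is a toy bag-of-words stand-in for a real embedding model (so it matches on shared words rather than meaning), and the names `build_index` and `search` are hypothetical.

```python
import math
import re
from collections import Counter


def embed(text: str) -> Counter:
    """Toy stand-in for an embedding model: a bag-of-words count vector."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def build_index(chunks: list[str]) -> list[tuple[str, Counter]]:
    """Step 1: embed every chunk once, up front."""
    return [(chunk, embed(chunk)) for chunk in chunks]


def search(index: list[tuple[str, Counter]], query: str, k: int = 5) -> list[tuple[float, str]]:
    """Steps 2 and 3: embed the query, score every chunk, keep the top k."""
    q = embed(query)
    scored = [(cosine(q, vec), chunk) for chunk, vec in index]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:k]


index = build_index([
    "Vector embeddings map text to points in a vector space.",
    "Relational databases store rows in tables.",
    "Neural networks are trained with gradient descent.",
])
results = search(index, "what are vector embeddings", k=2)
for score, chunk in results:
    print(f"{score:.2f}  {chunk}")
```

Swapping `embed` for a learned embedding model is what turns this keyword-overlap toy into semantic search; the indexing and ranking logic stays the same shape.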
Why it's powerful: Unlike traditional keyword search, semantic retrieval compares the meaning of the query and the documents rather than their exact wording. For example, searching for "AI learning" can surface documents about "machine learning" even though the exact words don't match.
About the k parameter: The "k" in top-k retrieval is the maximum number of results to return. Top-k is a standard concept in retrieval systems. You can adjust k to suit your needs: a higher k returns more results, but the lower-ranked ones tend to be less relevant. The system returns up to k results and may return fewer if the index does not contain enough relevant matches.
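As a rough sketch of why a top-k search can return fewer than k results, the snippet below ranks pre-computed similarity scores and drops weak matches before truncating. The `min_score` cutoff is an assumption made for illustration; how this tool decides that a match is not relevant enough is not stated here.

```python
def top_k(scores: dict[str, float], k: int, min_score: float = 0.0) -> list[tuple[str, float]]:
    """Return at most k (chunk_id, score) pairs, best first, dropping weak matches."""
    if k < 1:
        raise ValueError("k must be at least 1")
    # Keep only chunks whose score clears the (hypothetical) relevance cutoff.
    relevant = [(cid, s) for cid, s in scores.items() if s > min_score]
    relevant.sort(key=lambda pair: pair[1], reverse=True)
    return relevant[:k]


scores = {"a": 0.82, "b": 0.10, "c": 0.55, "d": 0.0}
top_two = top_k(scores, k=2)              # k limits the result count
few = top_k(scores, k=5, min_score=0.5)   # fewer than k chunks clear the cutoff
```

Here `few` holds only two entries even though k is 5, because only two chunks score above the cutoff.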
Step 1: Build the Index
Click the button below to process all documents in the notes/ folder.
The system will split them into chunks, generate embeddings, and create a searchable index.
You can add or edit the documents in the notes/ folder and rebuild the index anytime.
Note: Default example documents are provided covering topics like RAG systems, machine learning, neural networks, vector embeddings, natural language processing, information retrieval, database systems, and API design.
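For readers curious about the chunking part of the index build, here is one simple way it could be done: fixed-size character windows with a small overlap, so that text cut at a boundary still appears whole in a neighbouring chunk. The chunk size, the overlap, and the function name are assumptions for illustration; the strategy this tool actually uses may differ.

```python
def split_into_chunks(text: str, chunk_size: int = 200, overlap: int = 50) -> list[str]:
    """Split text into windows of chunk_size characters; neighbours share `overlap` characters."""
    if chunk_size <= 0 or not 0 <= overlap < chunk_size:
        raise ValueError("need chunk_size > 0 and 0 <= overlap < chunk_size")
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        # Stop once a window has reached the end of the text.
        if start + chunk_size >= len(text):
            break
    return chunks


chunks = split_into_chunks("abcdefghij", chunk_size=4, overlap=1)
print(chunks)
```

Real pipelines often split on sentence or paragraph boundaries instead of raw character counts, which keeps each chunk more self-contained.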
Step 2: Search Your Documents
Enter a search query below. The RAG system will find the most relevant document chunks based on semantic similarity and return the top-k results. The k parameter controls how many results to retrieve (default: 5, max: 20). Try natural language queries like "How does machine learning work?" or "What are vector embeddings?"
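The default and maximum for k could be enforced with a small helper like the one below. The values 5 and 20 come from the description above; the function name and the choice to clamp out-of-range values (rather than reject them) are assumptions.

```python
from typing import Optional

DEFAULT_K = 5   # default stated above
MAX_K = 20      # maximum stated above


def resolve_k(requested: Optional[int] = None) -> int:
    """Return a usable k: the default when none is given, otherwise clamped to 1..MAX_K."""
    if requested is None:
        return DEFAULT_K
    return max(1, min(requested, MAX_K))
```

With this helper, `resolve_k()` gives 5, `resolve_k(50)` gives 20, and `resolve_k(0)` gives 1.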