Learn how to boost your AI model’s performance by fully leveraging retrieved text chunks
Component | Content |
---|---|
User Query | “What are the fundamental concepts and operations in arithmetic, and how are they used in mathematics?” |
Retrieved Chunk 1 | “In mathematics, arithmetic is the basic study of numbers. The four basic arithmetic operations are addition, subtraction, multiplication, and division, although other operations such as exponentiation and extraction of roots are also studied in arithmetic. Other arithmetic topics includes working with signed numbers, fractions, decimals and percentages.” |
Retrieved Chunk 2 | “Mathematics includes the study of numbers, shapes and patterns. Structure: how things are organized. This subfield is usually called algebra. Place: where things are and their arrangement. This subfield is usually called geometry. Change: how things become different. This subfield is usually called analysis.” |
Retrieved Chunk 3 | “Multiplication is an arithmetic operation for finding the product of two numbers. With natural numbers, multiplication gives the number of tiles in a rectangle. Multiplication between numbers is said to be commutative—when the order of the numbers does not influence the value of the product.” |
Model Response | “Arithmetic is the basic study of numbers in mathematics. The main operations are addition, subtraction, multiplication, and division. These operations are used to work with numbers in mathematics.” |
`combined_score` and `relevance`, which become invaluable signals later on. This process:
Metric | Basic Implementation | Enhanced Implementation |
---|---|---|
Chunk Utilization | 38% of relevant content used | ~100% of relevant content used |
Chunk Attribution | 2 out of 5 chunks referenced | All relevant chunks referenced |
Completeness | 73% of essential information | 100% of essential information |
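To build some intuition for what the utilization and attribution rows measure, here is a rough word-overlap proxy. It is only a sketch: the function names, the three-character word filter, and the 0.2 threshold are all assumptions made for illustration, and this is not how the evaluation metrics in the table are actually computed.

```python
import re
from typing import List, Set


def content_words(text: str) -> Set[str]:
    """Lowercased words longer than three characters (a crude stop-word filter)."""
    return {w for w in re.findall(r"[a-z]+", text.lower()) if len(w) > 3}


def chunk_utilization(chunk: str, response: str) -> float:
    """Fraction of the chunk's distinct content words that reappear in the response."""
    words = content_words(chunk)
    if not words:
        return 0.0
    return len(words & content_words(response)) / len(words)


def chunk_attribution(chunks: List[str], response: str,
                      threshold: float = 0.2) -> List[bool]:
    """Mark a chunk as 'attributed' when its overlap reaches the (assumed) threshold."""
    return [chunk_utilization(c, response) >= threshold for c in chunks]


if __name__ == "__main__":
    chunks = [
        "Addition and subtraction are arithmetic operations.",
        "Geometry studies shapes.",
    ]
    response = "Arithmetic covers addition and subtraction."
    for chunk in chunks:
        print(round(chunk_utilization(chunk, response), 2), chunk)
    print(chunk_attribution(chunks, response))
```

A proxy like this only rewards literal word reuse, so a response that paraphrases a chunk would be scored too low. It is useful for spotting chunks the model ignored entirely, not for reproducing the percentages above.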
The `DocumentStore` class serves as our foundation for managing and retrieving documents:
The `num_docs` parameter controls how many articles to load, making it easy to start small for testing.
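As a minimal sketch of how a `num_docs` limit might work, the toy class below reads at most `num_docs` items from any iterable of article strings. The constructor signature, the `documents` attribute, and the in-memory article source are assumptions for illustration; the article's own `DocumentStore` may load its data differently.

```python
from itertools import islice
from typing import Iterable, List


class DocumentStore:
    """Toy in-memory store (hypothetical; not the article's full implementation)."""

    def __init__(self, articles: Iterable[str], num_docs: int = 100):
        # islice stops after num_docs items, so a large or lazy source
        # is never read in full while testing with a small value.
        self.documents: List[str] = list(islice(articles, num_docs))

    def __len__(self) -> int:
        return len(self.documents)


if __name__ == "__main__":
    source = (f"Article {i}" for i in range(1_000_000))  # lazy generator
    store = DocumentStore(source, num_docs=3)
    print(len(store), store.documents)
```

Using `islice` instead of slicing means the source can be a generator or a streaming dataset, which keeps small test runs cheap.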
- `_chunk_text` method
- `SentenceTransformer` model ‘all-MiniLM-L6-v2’, which provides a good balance of speed and quality
- `k` parameter determines how many similar documents to retrieve for each query
- `(?<=[.!?])\s+` to split on sentence boundaries
- `IndexFlatIP` for exact inner product calculations
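The pieces above can be sketched end to end with the standard library alone. The sentence-splitting regex is the one quoted above; everything else is an assumption for illustration. In particular, `embed` is a toy bag-of-words stand-in for the `SentenceTransformer` model, and `search` does a brute-force inner-product scan, which is the same kind of exact comparison a flat inner-product index performs, without using FAISS itself.

```python
import math
import re
from collections import Counter
from typing import Dict, List, Tuple

# Split after ., ! or ? when followed by whitespace (regex from the text above).
SENTENCE_BOUNDARY = re.compile(r"(?<=[.!?])\s+")


def chunk_text(text: str, max_sentences: int = 2) -> List[str]:
    """Group consecutive sentences into chunks of at most max_sentences."""
    sentences = [s for s in SENTENCE_BOUNDARY.split(text.strip()) if s]
    return [
        " ".join(sentences[i:i + max_sentences])
        for i in range(0, len(sentences), max_sentences)
    ]


def embed(text: str) -> Dict[str, float]:
    """Toy embedding (assumption): L2-normalized word counts as a sparse vector."""
    counts = Counter(re.findall(r"[a-z]+", text.lower()))
    norm = math.sqrt(sum(c * c for c in counts.values())) or 1.0
    return {word: c / norm for word, c in counts.items()}


def inner_product(a: Dict[str, float], b: Dict[str, float]) -> float:
    return sum(value * b.get(word, 0.0) for word, value in a.items())


def search(query: str, chunks: List[str], k: int = 2) -> List[Tuple[float, str]]:
    """Score every chunk exactly and return the k highest inner products."""
    q = embed(query)
    scored = [(inner_product(q, embed(c)), c) for c in chunks]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:k]


if __name__ == "__main__":
    text = ("Arithmetic is the study of numbers. Multiplication is commutative! "
            "Is geometry about shapes? Yes.")
    chunks = chunk_text(text, max_sentences=1)
    for score, chunk in search("geometry shapes", chunks, k=2):
        print(f"{score:.3f}  {chunk}")
```

Because the vectors are normalized, the inner product here equals cosine similarity. Raising `k` returns more candidates per query at the cost of passing more, and possibly less relevant, text to the model.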