Implementing Retrieval-Augmented Generation (RAG) with LangChain and Weaviate: A Comprehensive Guide

Haritha Kotte

In this blog post, we will walk through the implementation of a Retrieval-Augmented Generation (RAG) application using LangChain and Weaviate. Our goal is to build an intelligent meal planner that generates personalized recipes, complete with detailed instructions, an ingredients list, and nutritional information tailored to your dietary needs and preferences. LangChain handles the retrieval and language-model orchestration, while Weaviate provides vector-based search and storage, so the app shows how RAG can ground a language model's output in your own data.

What is a RAG application?

A RAG (Retrieval-Augmented Generation) application is a type of AI-driven system that combines two major components:

Retrieval: The system retrieves relevant information or documents from a large collection (e.g., a database or knowledge base).
- A retriever uses techniques like vector similarity search (e.g., using cosine similarity) to find the most relevant chunks of information based on an embedding (a numerical representation of the query).
Generation: A language model (typically a large transformer model, such as GPT) uses the retrieved information to generate a coherent, contextually relevant answer or output.
- The retrieved documents are then passed to a generative language model (like GPT) to produce a natural language response.
- The generator uses the context provided by the retrieved information to generate a more accurate and relevant response.

How does the RAG application work?

First, we ingest the data into the vector database (Weaviate) by generating an embedding for each record. At query time, we retrieve the closest matches through the vector store and then pass those results to a generator for processing.

What are embeddings?

Embeddings are dense vector representations of data, typically used to encode items like words, sentences, or documents into a continuous vector space. These vectors capture the semantic meaning of the data, allowing for comparisons between different items based on their context and relationships.
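To make "comparisons between items" concrete, here is a minimal sketch of cosine similarity, the measure mentioned earlier. The three-dimensional vectors below are made up for illustration; a real embedding model produces vectors with hundreds of dimensions.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two vectors: close to 1.0 means similar direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Hypothetical toy embeddings, not produced by any real model.
potato_salad = [0.9, 0.1, 0.2]
mashed_potatoes = [0.8, 0.2, 0.1]
chocolate_cake = [0.1, 0.9, 0.3]

# The two potato dishes point in nearly the same direction, so they score higher.
print(cosine_similarity(potato_salad, mashed_potatoes))
print(cosine_similarity(potato_salad, chocolate_cake))
```

The first score is noticeably higher than the second, which is how a retriever decides that one recipe is a better match than another.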

What is a vector database?

A vector database is a database system specifically designed to handle and query vectors (embeddings). Unlike traditional databases that handle structured data (e.g., numbers, strings) through exact matching, a vector database is optimized to work with high-dimensional vector representations that require similarity-based operations.
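The difference between exact matching and similarity-based lookup can be sketched in a few lines. This is only an illustration with made-up titles and toy vectors: it scans every record, whereas a real vector database relies on an index (such as the HNSW index configured later in this post) to avoid a full scan.

```python
import math

# Hypothetical stored records: title -> toy 3-dimensional embedding.
STORE = {
    "Microwave Baked Potato": [0.9, 0.1, 0.1],
    "Basic Mashed Potatoes": [0.8, 0.3, 0.1],
    "Chocolate Cake": [0.1, 0.2, 0.9],
}

def cosine_distance(a, b):
    """1 - cosine similarity, so smaller values mean closer vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norms = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / norms

def nearest(query_vector, k=2):
    """Brute-force k-nearest-neighbour search over every stored vector."""
    ranked = sorted(STORE, key=lambda title: cosine_distance(query_vector, STORE[title]))
    return ranked[:k]

# Exact matching finds nothing for a key that was never stored...
print(STORE.get("potato dish"))
# ...while similarity search still returns the closest stored items.
print(nearest([0.85, 0.2, 0.1]))
```

The exact lookup prints `None`, while the similarity search returns the two potato dishes and leaves out the cake.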

First, we are going to ingest the data into Weaviate. There are two ways to generate embeddings for the data stored in the Weaviate vector database.

The first is to generate them ourselves with an embedding model and upload them to the vector database.

from datasets import load_dataset
from sentence_transformers import SentenceTransformer
import weaviate
from weaviate.auth import Auth
import weaviate.classes.config as wvcc
from sklearn.preprocessing import normalize
import time
from dotenv import load_dotenv
import os

# Load the .env file
load_dotenv()

# Access environment variables
max_retries = int(os.getenv('MAX_RETRIES', '5'))  # getenv returns a string; range() needs an int
cluster_url = os.getenv('WEAVIATE_CLUSTER')
auth_key = os.getenv('WEAVIATE_KEY')

# Load the dataset
df = load_dataset("Shengtao/recipe")
recipes = df['train']
recipes = recipes.select(range(100)) # use a subset of the recipes

model = SentenceTransformer('all-mpnet-base-v2')

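def generate_embeddings(examples):
    # Build one "column: value" string per row from every non-embedding column
    columns = [col for col in recipes.column_names if col != 'embedding']
    num_rows = len(examples[columns[0]])
    combined_text = [
        " ".join(f"{col}: {examples[col][i]}" for col in columns)
        for i in range(num_rows)
    ]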

    embeddings = model.encode(combined_text, show_progress_bar=True)
    embeddings = normalize(embeddings) # normalize embeddings
    examples["embedding"] = [embedding.tolist() for embedding in embeddings]

    return examples

# Generate embeddings and store them in the same dataset
recipes = recipes.map(generate_embeddings, batched=True)

# initialize weaviate client
weaviate_client = weaviate.connect_to_weaviate_cloud(
    cluster_url=cluster_url,
    auth_credentials=Auth.api_key(auth_key),
    skip_init_checks=True,
)

# delete the preexisting collection with the same name if exists
weaviate_client.collections.delete("RecipeST")

properties = [wvcc.Property(
                name=col,
                data_type=wvcc.DataType.TEXT
            ) for col in recipes.column_names if col != "embedding"]

# Create the collection with properties and embedding            
collection = weaviate_client.collections.create(
    name="RecipeST",
    description="A collection to store recipes",
    properties=properties + [wvcc.Property(
            name="embedding",
            data_type=wvcc.DataType.NUMBER_ARRAY
        )],
    vector_index_config=wvcc.Configure.VectorIndex.hnsw(
        distance_metric=wvcc.VectorDistances.COSINE
    ),
)

# Upload it to the weaviate store as batches
try:
    with weaviate_client.batch.dynamic() as batch:
        for index, row in enumerate(recipes):
            data_object = {col: str(row[col]) for col in recipes.column_names if col != "embedding"}
            data_object['embedding'] = row['embedding']
            for attempt in range(max_retries):
                try:
                    batch.add_object(
                        properties=data_object,
                        collection="RecipeST",
                        vector=row['embedding'],  # also use the embedding as the object's search vector
                    )

                    print("Data uploaded successfully.")
                    break
                except weaviate.exceptions.UnexpectedStatusCodeException as e:
                    if '503' in str(e):
                        print(f"Attempt {attempt + 1}: Model is still loading, retrying...")
                        time.sleep(20)  # Wait and retry
                    elif '429' in str(e):
                        # Handle rate limit error
                        retry_after = 60  # Retry-After header might be in seconds
                        print(f"Rate limit exceeded. Retrying after {retry_after} seconds...")
                        time.sleep(retry_after)
                    else:
                        raise  # Raise if it's a different error

finally:
    weaviate_client.close()

In the above example, I am using the sandbox in the Weaviate free trial, which comes with resource constraints. To avoid overloading the resources, I am working with a small subset of the data.

First, we load the dataset from Hugging Face, then generate embeddings for each row using SentenceTransformers and store them in the same dataset. After that, we connect to Weaviate Cloud, create a collection named ‘RecipeST’, and upload the data along with the generated embeddings in dynamic batches. Once everything is complete, we close the connection.
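The retry logic in the upload loop can also be factored into a small helper. The sketch below is a generic illustration and not part of the Weaviate client: `TransientError` and `call_with_retries` are hypothetical names, and the 20-second wait and limit of 5 attempts simply mirror the values used in this post.

```python
import time

class TransientError(Exception):
    """Hypothetical stand-in for a retryable failure such as an HTTP 503 or 429."""

def call_with_retries(operation, max_retries=5, wait_seconds=20, sleep=time.sleep):
    """Run operation(), retrying on TransientError for up to max_retries attempts."""
    for attempt in range(1, max_retries + 1):
        try:
            return operation()
        except TransientError as error:
            if attempt == max_retries:
                raise  # out of attempts: surface the last error
            print(f"Attempt {attempt} failed ({error}); retrying in {wait_seconds}s...")
            sleep(wait_seconds)

# Usage with a fake operation that fails twice before succeeding.
calls = {"count": 0}

def flaky_upload():
    calls["count"] += 1
    if calls["count"] < 3:
        raise TransientError("503 model is still loading")
    return "uploaded"

# sleep is replaced with a no-op so the demo finishes instantly.
result = call_with_retries(flaky_upload, sleep=lambda seconds: None)
print(result, "after", calls["count"], "attempts")
```

Passing `sleep` as a parameter keeps the helper easy to test, and re-raising on the final attempt ensures a persistent failure is never silently swallowed.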

The second way is to provide a vectorizer configuration that specifies the embedding model when creating the Weaviate collection, so that Weaviate generates the embeddings for us.

from datasets import load_dataset
import weaviate
import weaviate.classes.config as wvcc
from weaviate.auth import Auth
from dotenv import load_dotenv
import os
import time

# Load the .env file
load_dotenv()

# Access environment variables
rate_limit = int(os.getenv('RATE_LIMIT'))  # getenv returns a string; the batch rate limit needs an int
cluster_url = os.getenv('WEAVIATE_CLUSTER')
auth_key = os.getenv('WEAVIATE_KEY')
huggingface_new_apikey = os.getenv('HUGGINGFACE_NEW_APIKEY')

df = load_dataset("Shengtao/recipe")
recipes = df['train']
recipes = recipes.select(range(100))

weaviate_client = weaviate.connect_to_weaviate_cloud(
    cluster_url=cluster_url,
    auth_credentials=Auth.api_key(auth_key),
    headers={"X-HuggingFace-Api-Key": huggingface_new_apikey},
    skip_init_checks=True,
)

properties = [wvcc.Property(
                name=col,
                data_type=wvcc.DataType.TEXT
            ) for col in recipes.column_names if col != "embedding"]

# Delete the collection if it already exists
weaviate_client.collections.delete("RecipeV4")
# Create collection with vectorizer config
# Note that you can use `client.collections.create_from_dict()` to create a collection from a v3-client-style JSON object
collection = weaviate_client.collections.create(
    name="RecipeV4",
    description="A collection to store recipes",
    vectorizer_config=wvcc.Configure.Vectorizer.text2vec_huggingface(
        model="sentence-transformers/all-mpnet-base-v2",
        vectorize_collection_name=True
    ),
    properties=properties
)

# Upload data as batches with rate limit since it uses huggingface API for vectorization
try:
    with weaviate_client.batch.rate_limit(requests_per_minute=rate_limit) as batch:  # or <collection>.batch.rate_limit()

        for index, row in enumerate(recipes):
            data_object = {col: str(row[col]) for col in recipes.column_names}
            max_retries = 5
            for attempt in range(max_retries):
                try:
                    batch.add_object(properties=data_object, collection="RecipeV4")
                    print("Data uploaded successfully.")
                    break
                except weaviate.exceptions.UnexpectedStatusCodeException as e:
                    if '503' in str(e):
                        print(f"Attempt {attempt + 1}: Model is still loading, retrying...")
                        time.sleep(20)  # Wait and retry
                    elif '429' in str(e):
                        # Handle rate limit error
                        retry_after = 60  # Retry-After header might be in seconds
                        print(f"Rate limit exceeded. Retrying after {retry_after} seconds...")
                        time.sleep(retry_after)
                    else:
                        raise  # Raise if it's a different error

finally:
    weaviate_client.close()

In the above example, we first load the dataset from Hugging Face, then connect to Weaviate Cloud and create a collection named ‘RecipeV4’ with a vectorizer_config. By creating the collection with vectorizer_config, we instruct Weaviate to generate and store embeddings for the uploaded data, so we do not have to generate them ourselves. However, we need to rate-limit the upload to avoid hitting the rate limits of the third-party vectorization API (Hugging Face). Once everything is done, we close the connection.
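The idea behind a requests-per-minute limit can be sketched as follows. This is an illustration of the concept only, not how the Weaviate client implements its batch rate limiting; the `RateLimiter` class and the fake clock are hypothetical helpers introduced for this example.

```python
import time

class RateLimiter:
    """Allow at most requests_per_minute calls by spacing them evenly."""

    def __init__(self, requests_per_minute, clock=time.monotonic, sleep=time.sleep):
        self.min_interval = 60.0 / requests_per_minute
        self.clock = clock
        self.sleep = sleep
        self.next_allowed = None  # earliest time the next call may proceed

    def wait(self):
        """Block until the next request is allowed, then reserve the following slot."""
        now = self.clock()
        if self.next_allowed is not None and now < self.next_allowed:
            self.sleep(self.next_allowed - now)
            now = self.next_allowed
        self.next_allowed = now + self.min_interval

# Demo with a fake clock so nothing actually sleeps.
fake_time = [0.0]
waits = []

def fake_sleep(seconds):
    waits.append(seconds)
    fake_time[0] += seconds

limiter = RateLimiter(requests_per_minute=30, clock=lambda: fake_time[0], sleep=fake_sleep)
for _ in range(3):
    limiter.wait()  # a real loop would send one request here
print(waits)
```

At 30 requests per minute the limiter spaces calls two seconds apart: the first call goes through immediately and each later call waits for its slot.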

After ingesting the data into the Weaviate vector database, we can query the vector store using the retriever. For retrieval, we are using LangChain’s similarity_search.

What is a Vector Store?

A vector store is a system or library for storing and managing vector embeddings, typically aimed at natural language processing, AI, and machine learning applications. It may not have all the advanced features of a full-fledged vector database, but it serves as a simple mechanism for managing embeddings, especially when integrated with machine learning pipelines.

What is LangChain?

LangChain is a framework designed to facilitate the development of applications that leverage large language models (LLMs) like GPT-3, GPT-4, and others, along with various data sources. It simplifies the process of building applications that use LLMs for tasks such as natural language understanding, text generation, and semantic search, often combining them with external tools, databases, APIs, and vector stores to enhance their capabilities.

from langchain_weaviate.vectorstores import WeaviateVectorStore
from langchain_huggingface import HuggingFaceEmbeddings
import weaviate
from weaviate.classes.init import Auth
from dotenv import load_dotenv
import os

# Load the .env file
load_dotenv()

# Access environment variables
cluster_url = os.getenv('WEAVIATE_CLUSTER')
auth_key = os.getenv('WEAVIATE_KEY')

# Connect to weaviate store
weaviate_client = weaviate.connect_to_weaviate_cloud(
    cluster_url=cluster_url, 
    auth_credentials=Auth.api_key(auth_key),
    skip_init_checks=True,
)

embedding_model_name = "sentence-transformers/all-mpnet-base-v2"
embeddings = HuggingFaceEmbeddings(
    model_name=embedding_model_name
)

# Initialise the vector store
vectorstore = WeaviateVectorStore(client=weaviate_client, index_name="RecipeV4", text_key="title", embedding=embeddings)

# Meal planner
def meal_planner():
    print("Welcome to the Meal Planner!")
    while True:
        # Input the data
        meal_type = input("Enter the type of meal (e.g., breakfast, lunch, dinner, main dish): ")
        ingredients = input("Do you have any specific ingredients for your dish? (leave blank if you don't have any preference): ")
        nutrition_goals = input("Enter specific nutritional goals (e.g., low-carb, high-protein): ")
        any_other = input("Do you have any other preference? (leave blank if you don't have any preference): ")
        query = f"{meal_type} {ingredients} {nutrition_goals} {any_other}"

        # Retrieve the results
        results = vectorstore.similarity_search(query, k=4)

        # Format the results and serve
        print("\nHere are some meal options for you:\n")
        for i, result in enumerate(results):
            print(f"Option {i + 1}:")
            print("Title:", result.page_content)
            print("Ingredients:", result.metadata.get("ingredients"))
            print("Instructions:", result.metadata.get("instructions_list"))
            print("Calories:", result.metadata.get("calories"))
            print("Carbohydrates:", result.metadata.get("carbohydrates_g"), "g")
            print("Fat:", result.metadata.get("fat_g"), "g")
            print("Protein:", result.metadata.get("protein_g"), "g")
            # Print other nutritional fields...
            print()

        another = input("Do you want to plan another meal? (yes/no): ")
        if another.lower() != "yes":
            break

meal_planner()
weaviate_client.close()

In the above example, we use the vector store’s similarity_search to retrieve the data. After retrieving the matching results, we would ideally pass them to a generator module that uses an LLM to process them. However, for small-scale, simple use cases, such as retrieving a handful of documents, there is no significant advantage to using a generator. We can directly return the list of results as shown above.

The output of the meal planner looks something like this:

Welcome to the Meal Planner!
Enter the type of meal (e.g., breakfast, lunch, dinner, main dish): Lunch
Do you have any specific ingredients for your dish? (leave blank if you don't have any preference): Potato
Enter specific nutritional goals (e.g., low-carb, high-protein): High protein
Do you have any other preference? (leave blank if you don't have any preference): 

Here are some meal options for you:

Option 1:
Title: Microwave Baked Potato
Ingredients: 1 large russet potato ; 1 tablespoon butter or margarine ; 3 tablespoons shredded Cheddar cheese ;   salt and pepper to taste ; 3 teaspoons sour cream
Instructions: ['Scrub potato and prick with a fork. Place on a microwave-safe plate.', 'Microwave on full power for 5 minutes. Turn potato over, and microwave until soft, about 5 more minutes.', 'Remove potato from the microwave, and cut in half lengthwise. Season with salt and pepper and mash up the inside a little with a fork.', 'Add butter and Cheddar cheese. Microwave until melted, about 1 more minute.', 'Top with sour cream, and serve.']
Calories: 517.3
Carbohydrates: 65.4 g
Fat: 23.1 g
Protein: 14.2 g

Option 2:
Title: Old Fashioned Potato Salad
Ingredients: 5  potatoes ; 3  eggs ; 1 cup chopped celery ; ½ cup chopped onion ; ½ cup sweet pickle relish ; ¼ teaspoon garlic salt ; ¼ teaspoon celery salt ; 1 tablespoon prepared mustard ;   ground black pepper to taste ; ¼ cup mayonnaise
Instructions: ['Bring a large pot of salted water to a boil. Add potatoes and cook until tender but still firm, about 15 minutes. Drain, cool, peel and chop.', 'While potatoes cook, place eggs in a saucepan and cover with cold water. Bring water to a boil; cover, remove from heat, and let eggs stand in hot water for 10 to 12 minutes. Remove from hot water, cool, peel and chop.', 'Combine the potatoes, eggs, celery, onion, relish, mayonnaise, mustard, garlic salt, celery salt, and pepper in a large bowl. Mix together well and refrigerate until chilled.']
Calories: 206.4
Carbohydrates: 30.5 g
Fat: 7.6 g
Protein: 5.5 g

Option 3:
Title: Basic Mashed Potatoes
Ingredients: 2 pounds baking potatoes, peeled and quartered ; 2 tablespoons butter ; 1 cup milk ;   salt and pepper to taste
Instructions: ['Bring a large pot of salted water to a boil. Add potatoes and garlic, lower heat to medium, and simmer until potatoes are tender, 15 to 20 minutes.', 'When the potatoes are almost finished, heat milk and butter in a small saucepan over low heat until butter is melted.', 'Drain potatoes and return to the pot. Slowly add warm milk mixture, blending it in with a potato masher or electric mixer until potatoes are smooth and creamy. Season with salt and pepper.']
Calories: 257.1
Carbohydrates: 43.7 g
Fat: 7.2 g
Protein: 5.6 g

Option 4:
Title: Roasted Vegetables
Ingredients: 1 small butternut squash, cubed ; 2  red bell peppers, seeded and diced ; 1  sweet potato, peeled and cubed ; 3  Yukon Gold potatoes, cubed ; 1  red onion, quartered ; 1 tablespoon chopped fresh thyme ; 2 tablespoons chopped fresh rosemary ; ¼ cup olive oil ; 2 tablespoons balsamic vinegar ;   salt and freshly ground black pepper
Instructions: ['Preheat the oven to 475 degrees F (245 degrees C).', 'Combine butternut squash, Yukon Gold potatoes, bell peppers, sweet potato, and red onion pieces in a large bowl.', 'Stir olive oil, balsamic vinegar, rosemary, and thyme together in a small bowl; season with salt and pepper. Pour over vegetables and toss until well coated. Transfer vegetables to a large roasting pan and spread in an even layer.', 'Roast in the preheated oven, stirring every 10 minutes, until vegetables are slightly caramelized and cooked through, 35 to 40 minutes.']
Calories: 122.7
Carbohydrates: 20.0 g
Fat: 4.7 g
Protein: 2.0 g

Do you want to plan another meal? (yes/no): no
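One small detail in the meal planner is that the query is built by concatenating all four answers, so blank answers leave stray spaces in the search string. A minimal sketch of a tidier approach is shown below; `build_query` is a hypothetical helper, not part of LangChain or Weaviate.

```python
def build_query(*parts):
    """Join the non-empty answers into one search string."""
    return " ".join(part.strip() for part in parts if part and part.strip())

# The blank "any other preference" answer is dropped instead of adding a trailing space.
print(build_query("Lunch", "Potato", "High protein", ""))
```

The resulting string can be passed to similarity_search in place of the f-string used in the meal planner.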

If you do want to add the generation step, here is one implementation.

from langchain_weaviate.vectorstores import WeaviateVectorStore
from langchain_huggingface import HuggingFaceEmbeddings
import weaviate
from weaviate.classes.init import Auth
from dotenv import load_dotenv
import os
from langchain_core.prompts import ChatPromptTemplate
from langchain_openai import ChatOpenAI
from langchain_core.output_parsers import StrOutputParser
from langchain_core.runnables import RunnablePassthrough

# Load the .env file
load_dotenv()

# Access environment variables
cluster_url = os.getenv('WEAVIATE_CLUSTER')
auth_key = os.getenv('WEAVIATE_KEY')

# Connect to weaviate client
weaviate_client = weaviate.connect_to_weaviate_cloud(
    cluster_url=cluster_url, 
    auth_credentials=Auth.api_key(auth_key),
    skip_init_checks=True,
)

# Mention the embedding model name
embedding_model_name = "sentence-transformers/all-mpnet-base-v2"
embeddings = HuggingFaceEmbeddings(
    model_name=embedding_model_name
)

# Initialise the vector store
vectorstore = WeaviateVectorStore(client=weaviate_client, index_name="RecipeV4", text_key="title", embedding=embeddings)
# Create the retriever to fetch relevant documents based on a query.
retriever = vectorstore.as_retriever()

# Construct a prompt template for the RAG chain
template = """You are an assistant for question-answering tasks. Use the following pieces of retrieved context to answer the question. If you don't know the answer, just say that you don't know. Show in a detailed information list format for the user to prepare the dishes and analyze the nutrition information of the dishes.
Question: {question}
Context: {context}
Answer:
"""
prompt = ChatPromptTemplate.from_template(template)
print(prompt)

# Connect to OpenAI GPT Model
llm = ChatOpenAI(model="gpt-3.5-turbo", temperature=0, api_key=os.getenv("OPENAI_API_KEY"))
# Build the RAG chain
rag_chain = (
    {"context": retriever, "question": RunnablePassthrough()}
    | prompt
    | llm
    | StrOutputParser()
)

output = rag_chain.invoke("lunch with high protein potato recipe")
print(output)

In this example, we are creating a RAG pipeline:

First, we initialized the vector store using WeaviateVectorStore.
Then, we created a retriever that will later be used to fetch relevant documents based on a query.
Next, we created a prompt template that instructs the model to answer questions using the retrieved context, formatted as specified in the template.
After that, OpenAI’s gpt-3.5-turbo model is initialized using the API key from the environment. The temperature=0 setting makes the model more deterministic in its responses.
Then, we built a RAG chain using the chaining operator | and RunnablePassthrough.
- RunnablePassthrough: A utility in LangChain that essentially acts as a placeholder or pass-through in a chain of operations, allowing the value being passed through the chain to remain unchanged without performing any operations on it.
- Retriever: The retriever fetches relevant documents based on the query.
- Prompt Template: The template combines the retrieved context and the question into a complete prompt.
- LLM: The combined prompt is sent to the OpenAI GPT model, which generates an answer.
- StrOutputParser: Parses the output string from the model to return a clean result.
Finally, we invoked the RAG chain to get the result.

Output:

input_variables=['context', 'question'] messages=[HumanMessagePromptTemplate(prompt=PromptTemplate(input_variables=['context', 'question'], template="You are an assistant for question-answering tasks. Use the following pieces of retrieved context to answer the question. If you don't know the answer, just say that you don't know. Show in a detailed information list format for the user to prepare the dishes and analyze the nutrition information of the dishes.\nQuestion: {question}\nContext: {context}\nAnswer:\n"))]
### Potato Salad Recipe:
- **Ingredients:**
  - 5 potatoes
  - 3 eggs
  - 1 cup chopped celery
  - ½ cup chopped onion
  - ½ cup sweet pickle relish
  - ¼ teaspoon garlic salt
  - ¼ teaspoon celery salt
  - 1 tablespoon prepared mustard
  - Ground black pepper to taste
  - ¼ cup mayonnaise

- **Instructions:**
  1. Bring a large pot of salted water to a boil. Add potatoes and cook until tender but still firm, about 15 minutes. Drain, cool, peel, and chop.
  2. Place eggs in a saucepan, cover with cold water, bring to a boil, cover, remove from heat, and let eggs stand in hot water for 10 to 12 minutes. Remove, cool, peel, and chop.
  3. Combine potatoes, eggs, celery, onion, relish, garlic salt, celery salt, mustard, pepper, and mayonnaise in a large bowl. Mix well and refrigerate until chilled.

- **Nutrition Information:**
  - Calories: 206.4
  - Protein: 5.5g
  - Carbohydrates: 30.5g
  - Fat: 7.6g
  - Saturated Fat: 1.5g
  - Cholesterol: 72.4mg
  - Sodium: 334.7mg
  - Potassium: 647.1mg
  - Fiber: 3.5g
  - Sugars: 6.4g
  - Vitamin C: 27.6mg
  - Iron: 1.6mg
  - Calcium: 36.8mg

### Microwave Baked Potato Recipe:
- **Ingredients:**
  - 1 large russet potato
  - 1 tablespoon butter or margarine
  - 3 tablespoons shredded Cheddar cheese
  - Salt and pepper to taste
  - 3 teaspoons sour cream

- **Instructions:**
  1. Scrub the potato and prick with a fork. Microwave on full power for 5 minutes, turn over, and microwave for 5 more minutes until soft.
  2. Cut the potato in half lengthwise, season with salt and pepper, mash the inside slightly, add butter and 2 tablespoons of cheese, microwave until melted.
  3. Top with remaining cheese and sour cream, and serve.

- **Nutrition Information:**
  - Calories: 517.3
  - Protein: 14.2g
  - Carbohydrates: 65.4g
  - Fat: 23.1g
  - Saturated Fat: 14.5g
  - Cholesterol: 63.1mg
  - Sodium: 421.6mg
  - Potassium: 1602.1mg
  - Fiber: 8.1g
  - Sugars: 3.0g
  - Vitamin C: 72.8mg
  - Iron: 3.1mg
  - Calcium: 244.4mg

### Basic Mashed Potatoes Recipe:
- **Ingredients:**
  - 2 pounds baking potatoes, peeled and quartered
  - 2 tablespoons butter
  - 1 cup milk
  - Salt and pepper to taste

- **Instructions:**
  1. Bring a pot of salted water to a boil. Add potatoes and cook until tender, drain.
  2. Heat butter and milk in a saucepan until butter melts. Blend milk mixture into potatoes until smooth and creamy. Season with salt and pepper.

- **Nutrition Information:**
  - Calories: 257.1
  - Protein: 5.6g
  - Carbohydrates: 43.7g
  - Fat: 7.2g
  - Saturated Fat: 4.5g
  - Cholesterol: 20.1mg
  - Sodium: 76.1mg
  - Potassium: 763.1mg
  - Fiber: 3.7g
  - Sugars: 4.6g
  - Vitamin C: 15.2mg
  - Iron: 0.7mg
  - Calcium: 89.4mg

These recipes provide a variety of options for a high-protein lunch, with detailed instructions and nutrition information for each dish.
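To see how the chaining operator | in the RAG chain composes its steps, here is a toy re-creation in plain Python. It is a sketch of the idea only and not LangChain's actual implementation: the `Step` class, `fan_out`, and the fake retriever and LLM are all hypothetical stand-ins. The dict at the start of the real chain plays the role of `fan_out` here, running the retriever and the pass-through on the same input.

```python
class Step:
    """Toy runnable: wraps a function and supports chaining with the | operator."""

    def __init__(self, func):
        self.func = func

    def invoke(self, value):
        return self.func(value)

    def __or__(self, other):
        # self | other -> a new Step that feeds self's output into other
        return Step(lambda value: other.invoke(self.invoke(value)))


def fan_out(**branches):
    """Run every branch on the same input and collect the results in a dict."""
    return Step(lambda value: {name: step.invoke(value) for name, step in branches.items()})


# Hypothetical stand-ins for the real components.
passthrough = Step(lambda value: value)  # returns its input unchanged
fake_retriever = Step(lambda query: ["Old Fashioned Potato Salad"])  # pretends to search
prompt = Step(lambda d: f"Question: {d['question']}\nContext: {d['context']}")
fake_llm = Step(lambda text: text.upper())  # pretends to generate an answer

chain = fan_out(context=fake_retriever, question=passthrough) | prompt | fake_llm
print(chain.invoke("potato lunch"))
```

The question reaches the prompt unchanged through the pass-through, while the context comes from the retriever. Both are merged into one prompt string before the final step runs.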

Conclusion

In this post, we covered the process of building a RAG (Retrieval-Augmented Generation) application, from collection creation and data ingestion to retrieval and generation. While we've aimed to cover the key aspects, there may still be areas that need deeper exploration. Hopefully, this provides a good starting point; further refinement and experimentation may be necessary as you develop your own RAG system.

References

https://github.com/Hivekind/python-langchain-weaviate

https://github.com/Hivekind/meal_planner
