
MongoRAG: Leveraging MongoDB Atlas as a Vector Database with Databricks-Deployed Embedding Model and LLMs for Retrieval-Augmented Generation

Author(s): Dwaipayan Bandyopadhyay

Originally published on Towards AI.

In this article, I will give a detailed walkthrough of how we can use MongoDB Atlas as a vector database, together with an embedding model and an LLM served as endpoints in the Databricks portal, to perform Retrieval-Augmented Generation (RAG) on a piece of data.

Source: Image by Author

In today’s AI world, where large amounts of structured and unstructured data are generated daily, using that knowledge accurately has become the cornerstone of modern technology. Retrieval-Augmented Generation (RAG) is a widely used approach that solves real-world data problems by combining the power of generative AI and information retrieval.

Retrieval-Augmented Generation generally consists of three major steps. I will explain them briefly below, and a minimal sketch of the full flow follows the list –

  1. Information Retrieval — The first step involves retrieving relevant information from a knowledge base, database, or vector database, where we store embeddings of the data we will retrieve from. Retrieval is typically done via similarity search, in which we compare the embedded query against the embeddings already stored in the vector database.
  2. Augmentation Step — After retrieving the similar information from the vector database, it is combined with the query asked by the user, so the model has the context of what has been asked and can form a better answer.
  3. Generation Step — This is the final step, where a Large Language Model comes into play: we feed the augmented information to the LLM, and it generates a proper, human-readable answer based on the information provided. Feeding the augmented information is crucial; without it, the model has no context for the question and may generate unrelated information.
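Putting the three steps together, here is a minimal, library-agnostic sketch of the flow (embed_fn, vector_db.search, and llm.generate are hypothetical placeholders, not any specific library's API):

def rag_answer(query, embed_fn, vector_db, llm, k=3):
    # 1. Information Retrieval: embed the query and fetch the k most similar chunks
    query_embedding = embed_fn(query)
    relevant_chunks = vector_db.search(query_embedding, top_k=k)

    # 2. Augmentation: combine the retrieved context with the user's question
    context = "\n\n".join(chunk.text for chunk in relevant_chunks)
    prompt = f"Answer using only this context:\n{context}\n\nQuestion: {query}"

    # 3. Generation: the LLM produces a human-readable answer grounded in the context
    return llm.generate(prompt)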

What is MongoDB Atlas?

Atlas is a multi-cloud database service provided by MongoDB, in which developers can create clusters, databases, and indexes directly in the cloud, without installing anything locally. Basically, it's MongoDB in the cloud; users can create an account by signing up on the official website linked below –

MongoDB Atlas: Cloud Document Database | MongoDB

After signing in for the first time, follow the steps in the documentation below to spin up a free cluster.

Get Started with Atlas — MongoDB Atlas

After the cluster has been created, it's time to create a database and a collection. As MongoDB is a NoSQL database, we first create a database (the counterpart of a schema in SQL databases, although the concept is the same), and inside that database we create a collection, in which we can store documents (much like creating a table inside a database). If this feels confusing, please refer to the following article on how to create a database and collection, but remember: do not add any documents yet, just create the database and collection.

Connecting MongoDB with Python — The Coding part starts now

Now we will connect MongoDB with Python, so that we can do the rest of the steps programmatically, without touching the UI again.

To connect to and access MongoDB Atlas via Python, we need to install a package called ‘pymongo’. It can be installed with the following pip command.

pip install pymongo

After it has been installed, we will import the MongoClient class to connect to MongoDB from Python. For that we need the connection string, which can be found under the Drivers settings after clicking Connect on the cluster. The process is shown in Step 2 of the following link.

Quick Start: Getting Started With MongoDB Atlas and Python | MongoDB

Once you have the connection string, write and execute the code below to connect to MongoDB.

from pymongo import MongoClient

client = MongoClient("YOUR_CONNECTION_URL")
dbName = "YOUR_DATABASE_NAME"
collectionName = "YOUR_COLLECTION_NAME"
collection = client[dbName][collectionName]

This will establish the connection with MongoDB; if no errors are encountered, the connection has been made successfully.
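If you prefer an explicit check over the mere absence of errors, MongoDB's standard ping command is a lightweight way to verify the connection:

# Explicitly verify the connection using the server's ping command
try:
    client.admin.command("ping")
    print("Successfully connected to MongoDB Atlas")
except Exception as e:
    print(f"Connection failed: {e}")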

Once the connection has been established, let's look at the other packages we need for the entire RAG process, apart from pymongo. Install the following packages via pip.

pip install langchain
pip install langchain_databricks
pip install langchain_mongodb

We only require these three packages to do the entire process. After they are installed successfully, let’s import all the necessary classes from these packages.

Importing the necessary classes from the packages

from pymongo import MongoClient
from langchain_mongodb.vectorstores import MongoDBAtlasVectorSearch
from langchain.document_loaders import TextLoader
from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain.chains import RetrievalQA
from langchain_databricks import ChatDatabricks
from langchain_databricks import DatabricksEmbeddings

Now that the connection with MongoDB has been established, let's load our data and chunk it using RecursiveCharacterTextSplitter. We will keep each chunk at 1,000 characters with an overlap of 100 characters and a new paragraph (\n\n) as the separator.

# Importing the data using TextLoader
loader = TextLoader("story.txt")
data = loader.load()

# Configuring the chunking strategy; note that separators expects a list
text_splitter = RecursiveCharacterTextSplitter(
    chunk_size=1000,
    chunk_overlap=100,
    separators=["\n\n"]
)

# Keeping the chunks in this variable
chunked_docs = text_splitter.split_documents(data)
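As a quick sanity check before moving on, we can count the chunks and preview one; the document count we will see later in Atlas comes from exactly this number:

# Quick sanity check on the chunking output
print(len(chunked_docs))                   # number of chunks produced (129 for this data)
print(chunked_docs[0].page_content[:200])  # preview of the first chunk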

Configuring LLM and Embedding Models

Next, we will configure the embedding model and Large Language Model we are going to use. Here, we will use models served as endpoints in the Databricks portal. If you don't have access to Databricks, you can take the usual route of the OpenAIEmbeddings and ChatOpenAI classes and configure them accordingly (a sketch of that alternative follows the Databricks configuration below).

embeddings = DatabricksEmbeddings(
    endpoint="databricks-gte-large-en"
)

llm = ChatDatabricks(
    target_uri="databricks",
    endpoint="databricks-meta-llama-3-1-70b-instruct",
    temperature=0.0,
)

We will be using the GTE-Large embedding model and the Meta Llama 3.1 70B Instruct model for this demo.
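For readers without Databricks access, a rough equivalent using OpenAI-hosted models might look like the sketch below. This assumes the langchain_openai package and an OPENAI_API_KEY environment variable; the model names are illustrative choices, and the embedding dimension must still match the 1024-dimensional index created later (text-embedding-3-large supports this via its dimensions parameter).

# Alternative sketch, assuming langchain_openai is installed and OPENAI_API_KEY is set
from langchain_openai import ChatOpenAI, OpenAIEmbeddings

# dimensions=1024 keeps the embedding size consistent with the
# 1024-dimensional vector search index created below
embeddings = OpenAIEmbeddings(model="text-embedding-3-large", dimensions=1024)

llm = ChatOpenAI(model="gpt-4o", temperature=0.0)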

Creating a Vector Search Index in MongoDB Atlas

Now that all the configuration is done, we will create a Vector Search Index in Atlas, in which we will store our embeddings and use them later for RAG. There are two ways to create the index: through the UI or via code. Atlas provides a default search index name, ‘vector_index’; if you want to go with that name, just write and execute the following code.

# Insert the chunked documents (and their embeddings) into the collection
vectorStore = MongoDBAtlasVectorSearch.from_documents(
    chunked_docs,
    embeddings,
    collection=collection
)

# Create the default 'vector_index' search index; this method returns None,
# so it is kept separate from the vectorStore assignment
vectorStore.create_vector_search_index(dimensions=1024)

This will create a vector search index named vector_index with dimension 1024 inside the collection we created earlier. We just have to pass the chunked documents, along with the embeddings and the collection handle through which we connected to Atlas.

Image before creating the search index (i.e., before executing the code above):

Source: Image by Author

Image after executing the code above (the default search index has been created):

Source: Image by Author

As we can see, after executing the code above, our search index with the default name ‘vector_index’ has been created and 129 documents have been inserted (the number of chunks created earlier).
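The insert count can also be confirmed from code with pymongo's count_documents:

# Confirm the insert count directly from pymongo
print(collection.count_documents({}))  # should match len(chunked_docs), i.e., 129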

But if you want to go a step further and create a search index with a custom name of your own, the code above needs some changes. First we create the custom index under the provided name, and only then do we insert the embeddings into it; done programmatically, this cannot happen in one go.

Creating a custom vector search index

MongoDBAtlasVectorSearch(
    index_name="mongo_rag",
    collection=collection,
    embedding=embeddings
).create_vector_search_index(
    dimensions=1024
)

Here, we first create an index called mongo_rag with a dimension of 1024. The dimension is crucial whether we create the default index or a custom one: if it doesn't match the output dimension of the embedding model, the application will not even execute. For the embedding model used here, GTE-Large, the dimension is 1024.

Image after creating the custom index (embeddings are not yet added):

Source: Image by Author

As we can see here, the index has been created successfully, but the document count still shows the 129 documents inserted earlier, as we haven't populated this index yet. We should delete the previously added chunks first; otherwise we will just push the same chunks again, a repetition that might introduce hallucinations.
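Clearing the previously inserted chunks is a one-liner with pymongo. Note that delete_many({}) removes every document in the collection, so only run it when everything in the collection should go:

# Empty the collection so the same chunks are not stored twice
collection.delete_many({})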

Populating the Custom Index with Embeddings

Using the following code, we can populate the custom index with the embeddings

vectorStore = MongoDBAtlasVectorSearch.from_documents(
    documents=chunked_docs,
    embedding=embeddings,
    collection=collection,
    index_name="mongo_rag"
)

In this approach, we provide the index_name while inserting the embeddings, which stores them under that particular search index.

Designing the RAG function

In this step, we will design a generic RAG function using the LLM and endpoint configuration we defined earlier.

def query_data(query):
    # Perform Atlas Vector Search using LangChain's vectorStore;
    # similarity_search returns the MongoDB documents most similar to the query
    docs = vectorStore.similarity_search(query, k=3)

    # Putting the similar chunks into a list to print later
    similar_chunks = [chunk for chunk in docs]

    # Setting up the retriever defined using MongoDBAtlasVectorSearch
    retriever = vectorStore.as_retriever()

    # Load the "stuff" documents chain, which takes a list of documents,
    # inserts them all into a prompt, and passes that prompt to the LLM
    qa = RetrievalQA.from_chain_type(llm, chain_type="stuff", retriever=retriever)

    # Execute the chain
    retriever_output = qa.invoke(query)

    # Return the Atlas Vector Search output and the answer generated via RAG
    return f"Similar Chunks\n-{similar_chunks}\n, Answer-{retriever_output}"

Now we will pass a sample query and check how it works.

query = "Explain the Character of Macbeth"
query_data(query)

Answer

'Similar Chunks\n-[Document(metadata={\'_id\': \'6798ecf0a3137ba55ff6b544\', \'source\': \'story.txt\'}, page_content="1606\\nTHE TRAGEDY OF MACBETH\\n\\n\\nby William Shakespeare\\n\\n\\n\\nDramatis Personae\\n\\n DUNCAN, King of Scotland\\n MACBETH, Thane of Glamis and Cawdor, a general in the King\'s\\narmy\\n LADY MACBETH, his wife\\n MACDUFF, Thane of Fife, a nobleman of Scotland\\n LADY MACDUFF, his wife\\n MALCOLM, elder son of Duncan\\n DONALBAIN, younger son of Duncan\\n BANQUO, Thane of Lochaber, a general in the King\'s army\\n FLEANCE, his son\\n LENNOX, nobleman of Scotland\\n ROSS, nobleman of Scotland\\n MENTEITH nobleman of Scotland\\n ANGUS, nobleman of Scotland\\n CAITHNESS, nobleman of Scotland\\n SIWARD, Earl of Northumberland, general of the English forces\\n YOUNG SIWARD, his son\\n SEYTON, attendant to Macbeth\\n HECATE, Queen of the Witches\\n The Three Witches\\n Boy, Son of Macduff \\n Gentlewoman attending on Lady Macbeth\\n An English Doctor\\n A Scottish Doctor\\n A Sergeant\\n A Porter\\n An Old Man\\n The Ghost of Banquo and other Apparitions\\n Lords, Gentlemen, Officers, Soldiers, Murtherers, Attendants,"), Document(metadata={\'_id\': \'6798ecf0a3137ba55ff6b5a7\', \'source\': \'story.txt\'}, page_content="Was a most sainted king; the queen that bore thee,\\n Oftener upon her knees than on her feet,\\n Died every day she lived. Fare thee well!\\n These evils thou repeat\'st upon thyself\\n Have banish\'d me from Scotland. O my breast,\\n Thy hope ends here!\\n MALCOLM. Macduff, this noble passion,\\n Child of integrity, hath from my soul\\n Wiped the black scruples, reconciled my thoughts\\n To thy good truth and honor. Devilish Macbeth\\n By many of these trains hath sought to win me\\n Into his power, and modest wisdom plucks me\\n From over-credulous haste. But God above\\n Deal between thee and me! For even now\\n I put myself to thy direction and \\n Unspeak mine own detraction; here abjure\\n The taints and blames I laid upon myself,\\n For strangers to my nature. I am yet\\n Unknown to woman, never was forsworn,\\n Scarcely have coveted what was mine own,\\n At no time broke my faith, would not betray\\n The devil to his fellow, and delight"), Document(metadata={\'_id\': \'6798ecf0a3137ba55ff6b57d\', \'source\': \'story.txt\'}, page_content="Particular addition, from the bill\\n That writes them all alike; and so of men. \\n Now if you have a station in the file,\\n Not i\' the worst rank of manhood, say it,\\n And I will put that business in your bosoms\\n Whose execution takes your enemy off,\\n Grapples you to the heart and love of us,\\n Who wear our health but sickly in his life,\\n Which in his death were perfect.\\n SECOND MURTHERER. I am one, my liege,\\n Whom the vile blows and buffets of the world\\n Have so incensed that I am reckless what\\n I do to spite the world.\\n FIRST MURTHERER. And I another\\n So weary with disasters, tugg\'d with fortune,\\n That I would set my life on any chance,\\n To mend it or be rid on\'t.\\n MACBETH. Both of you\\n Know Banquo was your enemy.\\n BOTH MURTHERERS. True, my lord.\\n MACBETH. So is he mine, and in such bloody distance\\n That every minute of his being thrusts \\n Against my near\'st of life; and though I could")]\n, 


Answer-{\'query\': \'Explain the Character of Macbeth\', \'result\': "Based on the provided context, Macbeth is a complex character who is the Thane of Glamis and Cawdor, and a general in the King\'s army. He is a prominent figure in the play and is driven by a desire for power and prestige. \\n\\nInitially, Macbeth is portrayed as a respected and accomplished military leader, but as the play progresses, his darker qualities are revealed. He is shown to be ruthless, ambitious, and willing to do whatever it takes to achieve his goals, including murder. \\n\\nMacbeth\'s relationship with his wife, Lady Macbeth, also plays a significant role in shaping his character. He is influenced by her goading and encouragement, which pushes him to commit regicide and seize the throne. \\n\\nHowever, Macbeth\'s actions are also motivated by a sense of insecurity and paranoia, as he becomes increasingly obsessed with the idea of being overthrown and killed. This fear drives him to order the murder of his friend Banquo and his family, further highlighting his descent into darkness and tyranny.\\n\\nThroughout the play, Macbeth\'s character undergoes a significant transformation, from a respected nobleman to a tyrannical and isolated ruler. His downfall is ultimately sealed when he is killed by Macduff, and his head is brought to Malcolm, the rightful king. \\n\\nIt\'s worth noting that the provided context only gives a glimpse into Macbeth\'s character, and a more comprehensive understanding would require a broader analysis of the entire play."}'

The output can be further modified based on the requirement.
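For instance, instead of interpolating everything into one formatted string, the function could return a dictionary, which is easier to log, test, or render selectively; here is one small variation on the function above:

def query_data_structured(query):
    # Retrieve the most similar chunks separately, for transparency
    docs = vectorStore.similarity_search(query, k=3)

    # Same "stuff" chain as before, but the pieces are returned individually
    qa = RetrievalQA.from_chain_type(
        llm, chain_type="stuff", retriever=vectorStore.as_retriever()
    )
    result = qa.invoke(query)

    return {
        "query": query,
        "answer": result["result"],
        "similar_chunks": [doc.page_content for doc in docs],
    }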


Published via Towards AI
