Generative AI: Retrieval-Augmented Generation using LangChain and Pinecone | Part 2

Rajat Nigam
3 min read · Aug 23, 2023



What is RAG (Retrieval-Augmented Generation)?

Pre-Trained Large Language Model (LLM) + Own Data -> Generated Response

RAG retrieves data from outside the foundation model and augments your prompts by adding the relevant retrieved data as context.

What are Embeddings?

Own Data -> OpenAI Embeddings (LLM Embeddings) -> Vector Array

LangChain provides a class called Embeddings for interacting with text embedding models. Numerous embedding model providers exist (OpenAI, Cohere, Hugging Face, etc.); this class is intended to give them all a uniform interface.

Text embeddings turn a piece of text into a vector representation. This means we can treat text as a point in vector space and perform operations like semantic search, in which we look for the text fragments that are most similar to one another.
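As a minimal sketch, assuming an OpenAI API key is set in the environment, embedding text through LangChain's Embeddings interface looks like this:

```python
# A minimal sketch of embedding text with LangChain's OpenAI wrapper.
# Assumes OPENAI_API_KEY is set in the environment.
from langchain.embeddings import OpenAIEmbeddings

embeddings = OpenAIEmbeddings()

# Embed a single query string into a vector of floats.
vector = embeddings.embed_query("What is a covered call in options trading?")

# Embed multiple text chunks at once.
vectors = embeddings.embed_documents(["chunk one", "chunk two"])

print(len(vector))  # 1536 for the default text-embedding-ada-002 model
```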

What are Vector Stores?

Pinecone
Pinecone's vector database is fully managed, developer-friendly, and simple to scale. You can create an account and your first index with a few clicks or API calls, and the developer documentation makes it quick to start building AI-powered applications with recent AI models. Two setup steps are needed here:

Create an index named my-wiki-index
Obtain the API keys (a sketch of the index setup follows)
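Here is a minimal sketch of creating the index programmatically, assuming the v2 pinecone-client API current at the time of writing and keys stored in environment variables:

```python
# A minimal sketch of creating the Pinecone index (pinecone-client v2 API).
# Assumes PINECONE_API_KEY and PINECONE_ENVIRONMENT are set in the environment.
import os

import pinecone

pinecone.init(
    api_key=os.environ["PINECONE_API_KEY"],
    environment=os.environ["PINECONE_ENVIRONMENT"],  # e.g. "us-west1-gcp"
)

# Dimension 1536 matches OpenAI's text-embedding-ada-002 vectors.
if "my-wiki-index" not in pinecone.list_indexes():
    pinecone.create_index("my-wiki-index", dimension=1536, metric="cosine")
```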

FAISS
FAISS is an in-memory vector store created by the Fundamental AI Research (FAIR) team at Meta and released under the MIT license. It lets you run a vector store on your local CPU instead of the GPUs that hosted vector stores typically use, and it includes algorithms that can search vector collections of any size, including those that do not fit in RAM. Some of the most useful algorithms also have GPU implementations. A usage sketch follows the link below.

https://github.com/facebookresearch/faiss
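A minimal sketch of FAISS as a local vector store through LangChain's wrapper; the sample texts and query are illustrative, and it assumes faiss-cpu is installed alongside an OpenAI API key for the embeddings:

```python
# A minimal sketch of an in-memory FAISS vector store via LangChain.
# Requires: pipenv install faiss-cpu
from langchain.embeddings import OpenAIEmbeddings
from langchain.vectorstores import FAISS

texts = [
    "A covered call combines stock ownership with a short call option.",
    "A protective put hedges a long stock position against downside risk.",
]

# Embed the texts and build the index locally.
db = FAISS.from_texts(texts, OpenAIEmbeddings())

# Semantic search runs entirely on the local CPU; no external service needed.
docs = db.similarity_search("How do I hedge a stock position?", k=1)
print(docs[0].page_content)
```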

What are LangChain Document Loaders?

Use document loaders to load data from a source as Documents. A Document is a piece of text with accompanying metadata. There are document loaders, for instance, that can load a plain options_trading.txt file, the text content of any web page, or even the transcript of a YouTube video. Document loaders expose a load method that loads data as Documents from a configured source. They can also optionally implement a lazy load that brings data into memory gradually.

LangChain's loaders can load content from HTML, Markdown, JSON, PDF, and CSV sources. Here we load options-trading blogs from Medium as contextual data (see the sketch after the link below).

https://python.langchain.com/docs/modules/data_connection/document_loaders/
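A minimal sketch of loading a local text file as Documents with TextLoader; the file name options_trading.txt comes from the example above:

```python
# A minimal sketch of loading a local text file as LangChain Documents.
from langchain.document_loaders import TextLoader

loader = TextLoader("options_trading.txt", encoding="utf-8")
documents = loader.load()  # returns a list of Document objects

print(documents[0].metadata)  # e.g. {'source': 'options_trading.txt'}
```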

What are LangChain Text Splitters?

Text splitters work as follows (a sketch follows the link below):
1. Split the text into small, semantically meaningful chunks (often sentences).
2. Combine these small chunks into a larger chunk until it reaches a certain size (as measured by some function).
3. Once that size is reached, make that chunk its own piece of text, then start a new chunk with some overlap (to keep context between chunks).

https://python.langchain.com/docs/modules/data_connection/document_transformers/#text-splitters
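A minimal sketch of the splitting step using RecursiveCharacterTextSplitter; the chunk size and overlap values are illustrative assumptions, and documents is the list loaded earlier:

```python
# A minimal sketch of chunking the loaded documents with overlap.
from langchain.text_splitter import RecursiveCharacterTextSplitter

text_splitter = RecursiveCharacterTextSplitter(
    chunk_size=1000,    # target size of each chunk, in characters
    chunk_overlap=100,  # overlap to keep context between chunks
)
chunks = text_splitter.split_documents(documents)
print(f"Split {len(documents)} document(s) into {len(chunks)} chunks")
```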

Implementation

Dependencies
- pipenv install langchain
- pipenv install black
- pipenv install openai
- pipenv install pinecone-client
- pipenv install tiktoken
- pipenv install pycco
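Putting the pieces together, here is a minimal end-to-end sketch: it loads the contextual data, splits it into chunks, embeds the chunks into the my-wiki-index Pinecone index, and answers a question over the retrieved context. The file name, chunk sizes, and query are illustrative assumptions, and it assumes the OpenAI and Pinecone keys are set as environment variables.

```python
# A minimal end-to-end RAG sketch: load -> split -> embed -> store -> query.
import os

import pinecone
from langchain.chains import RetrievalQA
from langchain.document_loaders import TextLoader
from langchain.embeddings import OpenAIEmbeddings
from langchain.llms import OpenAI
from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain.vectorstores import Pinecone

pinecone.init(
    api_key=os.environ["PINECONE_API_KEY"],
    environment=os.environ["PINECONE_ENVIRONMENT"],
)

# 1. Load the contextual data (an options-trading blog saved as text).
documents = TextLoader("options_trading.txt", encoding="utf-8").load()

# 2. Split it into overlapping chunks.
chunks = RecursiveCharacterTextSplitter(
    chunk_size=1000, chunk_overlap=100
).split_documents(documents)

# 3. Embed the chunks and upsert them into the Pinecone index.
embeddings = OpenAIEmbeddings()
docsearch = Pinecone.from_documents(chunks, embeddings, index_name="my-wiki-index")

# 4. Augment the prompt with retrieved chunks and generate an answer.
qa = RetrievalQA.from_chain_type(
    llm=OpenAI(),
    chain_type="stuff",
    retriever=docsearch.as_retriever(),
)
print(qa.run("What is options trading?"))
```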

Test Everything

[Screenshots in the original post show the input prompt, the confirmation that Pinecone is successfully populated with the OpenAIEmbeddings vector array, and the generated result.]

Conclusion

  • Unlike traditional databases that rely on exact matches, vector databases gauge data similarities in high-dimensional spaces, yielding more nuanced and accurate results.
  • This means enhanced search accuracy and deeper insights from stored information, paving the way for smarter data-driven decisions and experiences.


Written by Rajat Nigam

I'm a lifetime student of software engineering. Professionally I work in the capacity of Individual Contributor, Trainer, Lead Engineer and an Architect.
