Generative AI: Develop LLM-powered applications with LangChain, Python, and Milvus (VectorDB) | Part 1
In the ever-evolving landscape of artificial intelligence (AI), Generative AI is one of the most captivating frontiers. Generative AI empowers machines to learn from vast amounts of data and create new content that resembles human creations. Through sophisticated algorithms and deep neural networks, these AI models have demonstrated their prowess in generating realistic images, composing mesmerizing music, and crafting captivating stories. Whether you’re an AI enthusiast, a curious reader, or a creative professional, this blog aims to unravel the mysteries behind Generative AI and its boundless potential.
Throughout this blog, we will delve into the foundational concepts behind Generative AI using the LangChain framework, and also create a small LLM-powered application using LangChain and OpenAI Chat Models.
What is an LLM?
An LLM, which stands for Large Language Model, is a type of machine-learning neural network trained on very large text datasets. These datasets often contain unlabeled or uncategorized text, and the model learns from them using self-supervised or semi-supervised methods. Given some input text, the LLM predicts the next word. The training data can range from proprietary corporate data to any information found on the internet, as exemplified by ChatGPT.
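To make "predicts the next word" concrete, here is a toy sketch. It is not how a neural LLM works internally: it simply counts which word follows which in a tiny made-up corpus and returns the most frequent follower. The corpus and the function names are illustrative assumptions.

```python
from collections import Counter, defaultdict
from typing import Optional


def train_bigrams(text: str) -> dict:
    """Count, for every word, which words follow it in the text."""
    followers = defaultdict(Counter)
    words = text.lower().split()
    for current, nxt in zip(words, words[1:]):
        followers[current][nxt] += 1
    return followers


def predict_next(followers: dict, word: str) -> Optional[str]:
    """Return the most frequent follower of `word`, or None if it was never seen."""
    counts = followers.get(word.lower())
    if not counts:
        return None
    return counts.most_common(1)[0][0]


# Made-up training text, for illustration only.
corpus = "the fund grew and the fund paid a dividend and the market fell"
model = train_bigrams(corpus)
print(predict_next(model, "the"))  # "fund" follows "the" twice, "market" once
```

A real LLM replaces the counting table with a neural network and billions of learned weights, but the task is the same: given the text so far, score the candidates for what comes next.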
Training LLMs effectively requires extensive and expensive server farms that function as supercomputers.
LLMs are defined by a vast number of parameters, reaching into the millions, billions, or even trillions. These parameters are the numeric weights the model learns during training, and they determine which of the possible next words it favors. For instance, OpenAI’s GPT-3 has 175 billion parameters, and their latest model, GPT-4, reportedly has around 1 trillion, though OpenAI has not published that figure.
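A rough back-of-the-envelope calculation shows why parameter counts like these call for serious hardware. The sketch below assumes each parameter is stored as a 16-bit (2-byte) number, which is a common choice but an assumption here, and it counts only the weights themselves, not the additional memory that training needs.

```python
def weights_size_gb(num_parameters: int, bytes_per_parameter: int = 2) -> float:
    """Approximate storage for the model weights alone, in gigabytes (10**9 bytes)."""
    return num_parameters * bytes_per_parameter / 10**9


gpt3_parameters = 175_000_000_000  # the GPT-3 figure quoted above
print(f"{weights_size_gb(gpt3_parameters):.0f} GB")  # 350 GB under the 2-byte assumption
```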
What is LangChain?
LangChain is a framework designed to create language model-powered applications. It offers two key features:
- Data-aware capability, enabling the language model to connect with other data sources.
- Agentic functionality, allowing the language model to interact with its environment.
The main advantages of LangChain include:
- Components that provide abstractions for language models, offering a range of implementations for each abstraction. These components are modular and user-friendly, whether you choose to use the entire LangChain framework or not.
- Off-the-shelf chains are pre-structured sets of components designed to perform specific higher-level tasks. These ready-made chains facilitate a quick start.
- For more advanced applications and specific use cases, the components make it simple to customize existing chains or construct new ones.
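The idea of a chain, where components are wired together so that the output of one feeds the next, can be pictured in plain Python. This is a conceptual illustration only: `fill_template`, `fake_llm`, and `run_chain` are hypothetical stand-ins rather than LangChain classes, and the "model" here just echoes its prompt.

```python
from typing import Callable


def fill_template(template: str, **values: str) -> str:
    """Prompt component: substitute named values into a template."""
    return template.format(**values)


def fake_llm(prompt: str) -> str:
    """Stand-in for a language model call; a real chain would call an LLM here."""
    return f"[model response to: {prompt}]"


def run_chain(template: str, llm: Callable[[str], str], **values: str) -> str:
    """Chain: build the prompt first, then pass it to the model."""
    return llm(fill_template(template, **values))


print(run_chain("Summarize: {text}", fake_llm, text="LangChain basics"))
```

Because each step is a separate piece, swapping the model or the template does not affect the rest, which is the modularity the list above describes.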
Problem Statement
Given information ${DYNAMIC_INPUTS} about mutual funds, I want you to create:
1. a short summary
2. two interesting facts about them
3. which of them is the better option
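The `${DYNAMIC_INPUTS}` placeholder is meant to be filled in at run time. As one way to picture that step (the application itself may do it differently), Python's standard `string.Template` uses the same `${name}` syntax. The sample input string below is made up for illustration.

```python
from string import Template

# The problem statement as a template; ${DYNAMIC_INPUTS} is replaced per request.
PROMPT = Template(
    "Given information ${DYNAMIC_INPUTS} about mutual funds, I want you to create:\n"
    "1. a short summary\n"
    "2. two interesting facts about them\n"
    "3. which of them is the better option\n"
)


def build_prompt(information: str) -> str:
    """Replace the ${DYNAMIC_INPUTS} placeholder with the caller's text."""
    return PROMPT.substitute(DYNAMIC_INPUTS=information)


print(build_prompt("Fund A returned 10% in 3 years; Fund B returned 8%."))
```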
Tech Stack
- Python@3.11 (https://www.python.org/)
- LangChain Framework (https://python.langchain.com/)
- Milvus (https://milvus.io/)
- OpenAI API Keys (https://openai.com/blog/openai-api)
Workspace Setup
- mkdir -p workspace/langChain
- cd workspace/langChain
- virtualenv venv
- alias python=$HOME/workspace/langChain/venv/bin/python3.11
- alias pip=$HOME/workspace/langChain/venv/bin/pip3.11
- pip install pipenv
- python -m pip install --upgrade pip
- PYTHON_BIN_PATH="$PWD/venv/bin"
- PATH="$PATH:$PYTHON_BIN_PATH"
- export LANG="en_US.UTF-8"
- pipenv install langchain
- pipenv install black (Formatter)
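Once the commands have run, a short Python check can confirm that the environment looks the way the rest of this blog expects. This is an optional sketch with a hypothetical function name; it only reports whether the interpreter is at least Python 3.11 and whether the `langchain` package can be found.

```python
import importlib.util
import sys


def check_environment(required=(3, 11), package="langchain") -> dict:
    """Report the interpreter version and whether `package` is importable."""
    return {
        "python_version": sys.version_info[:2],
        "python_ok": sys.version_info[:2] >= required,
        "package_found": importlib.util.find_spec(package) is not None,
    }


print(check_environment())
```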
The application code sets up a language model-powered application that generates summaries of the provided information.
It uses the GPT-3.5 Turbo model from OpenAI. The code defines a prompt template, takes the user's input data (“information”), initializes the language model (LLM), and then creates a Language Model Chain (LLMChain) from that LLM and prompt. Finally, it runs the chain on the provided “information” and prints the result.
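The steps just described can be sketched roughly as follows. Treat this as a reconstruction under assumptions rather than the exact listing: it uses the 2023-era LangChain import paths (`langchain.prompts`, `langchain.chat_models`, `langchain.chains`), which later releases have moved or deprecated; `temperature=0` and the names `SUMMARY_TEMPLATE` and `summarize` are my own choices; and it needs the `langchain` and `openai` packages plus an `OPENAI_API_KEY` in the environment.

```python
# Prompt from the Problem Statement; {information} is filled in at run time.
SUMMARY_TEMPLATE = """
Given information {information} about mutual funds, I want you to create:
1. a short summary
2. two interesting facts about them
3. which of them is the better option
"""


def summarize(information: str) -> str:
    """Send the filled-in prompt to GPT-3.5 Turbo through an LLMChain."""
    # Assumption: legacy LangChain (0.0.x) import paths; newer releases differ.
    # Imported here so the template above can be inspected without LangChain.
    from langchain.chains import LLMChain
    from langchain.chat_models import ChatOpenAI
    from langchain.prompts import PromptTemplate

    prompt = PromptTemplate(
        input_variables=["information"], template=SUMMARY_TEMPLATE
    )
    # temperature=0 is an assumed setting; the client reads OPENAI_API_KEY.
    llm = ChatOpenAI(temperature=0, model_name="gpt-3.5-turbo")
    chain = LLMChain(llm=llm, prompt=prompt)
    return chain.run(information=information)
```

Calling `print(summarize(information))`, where `information` holds the two fund descriptions from the Input Prompt section, would send the filled-in prompt to the model and print its reply.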
Input Prompt
Fund Name is Top 200 Fund provide returns of 33.07% in 3 Years, 21.52% in 5 Years and 19.21% in 10 Years.
Another Fund Name is CREST (Thematic Fund) provide returns of 27.58% in 3 Years, 14.33% in 5 Years and 11.12% in 10 Years.
Execute
- Get your OpenAI API Keys from https://openai.com/blog/openai-api
- Update the keys in .env file.
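One common way to get the key from the .env file into the program is the `python-dotenv` package. The sketch below assumes that package is installed and that the file contains a line of the form `OPENAI_API_KEY=<your key>`; `load_api_key` is a hypothetical helper name. If `python-dotenv` is missing, the function falls back to whatever is already set in the environment.

```python
import os


def load_api_key(env_var: str = "OPENAI_API_KEY") -> str:
    """Return the API key, reading a local .env file first when possible."""
    try:
        # Assumption: python-dotenv is installed (pipenv install python-dotenv).
        from dotenv import load_dotenv

        load_dotenv()  # copies entries from .env into os.environ
    except ImportError:
        pass  # no python-dotenv: rely on variables already exported in the shell

    key = os.environ.get(env_var)
    if not key:
        raise RuntimeError(f"{env_var} is not set; add it to .env or export it")
    return key
```

Failing early with a clear message is easier to debug than an authentication error raised later from inside the chain.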
Output
Conclusion
- We have built working backend logic using the LangChain framework with OpenAI, with Python as the programming language.