1. Installation
```bash
pip install llama-index
pip install llama-index-vector-stores-azureaisearch
pip install azure-search-documents
pip install llama-index-embeddings-azure-openai
pip install llama-index-llms-azure-openai
```
After installing the packages, import the required modules:

```python
from llama_index.llms.azure_openai import AzureOpenAI
from llama_index.embeddings.azure_openai import AzureOpenAIEmbedding
from llama_index.core import VectorStoreIndex, SimpleDirectoryReader, StorageContext
from llama_index.core.settings import Settings
from azure.search.documents.indexes import SearchIndexClient
from azure.core.credentials import AzureKeyCredential
from llama_index.vector_stores.azureaisearch import (
    AzureAISearchVectorStore,
    IndexManagement,
)
```
2. Configuration and Initialization
First, configure Azure OpenAI and Azure AI Search by setting your API keys, endpoints, and the API version. This step ensures that your application can communicate with both services.
```python
# Configuration for Azure OpenAI
api_key = "your_azure_openai_api_key_here"
azure_endpoint = "your_azure_openai_endpoint_here"
api_version = "2024-02-15-preview"

# Configuration for the Azure AI Search service
search_service_api_key = "your_azure_search_service_admin_key_here"
search_service_endpoint = "your_azure_search_service_endpoint_here"
search_creds = AzureKeyCredential(search_service_api_key)
```
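Hard-coding secrets is fine for a quick local demo, but for anything you share or deploy you may prefer environment variables. A minimal sketch (the variable names below are our own convention, not required by any library):

```python
import os

# Read secrets from the environment instead of hard-coding them.
# These environment variable names are our own choice.
api_key = os.environ["AZURE_OPENAI_API_KEY"]
azure_endpoint = os.environ["AZURE_OPENAI_ENDPOINT"]
api_version = os.environ.get("AZURE_OPENAI_API_VERSION", "2024-02-15-preview")

search_service_api_key = os.environ["AZURE_SEARCH_ADMIN_KEY"]
search_service_endpoint = os.environ["AZURE_SEARCH_ENDPOINT"]
search_creds = AzureKeyCredential(search_service_api_key)
```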
3. Initialization of Azure AI Components
Initialize the language model and the embedding model with the specified configurations. We use the gpt-35-turbo model from AzureOpenAI to generate responses and text-embedding-ada-002 from AzureOpenAIEmbedding to convert text into numerical representations, passing in the previously defined API key, Azure endpoint, and API version. Note that LlamaIndex's Azure classes also expect the name of your Azure deployment (engine for the LLM, deployment_name for the embedding model), which may differ from the model name.
```python
# Initialize the AzureOpenAI language model
llm = AzureOpenAI(
    model="gpt-35-turbo",
    engine="your_gpt35_deployment_name",  # the name of your Azure deployment
    api_key=api_key,
    azure_endpoint=azure_endpoint,
    api_version=api_version,
)

# Initialize the embedding model
embed_model = AzureOpenAIEmbedding(
    model="text-embedding-ada-002",
    deployment_name="your_embedding_deployment_name",  # your Azure deployment name
    api_key=api_key,
    azure_endpoint=azure_endpoint,
    api_version=api_version,
)
```
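To verify that both deployments are reachable, you can run a quick smoke test (a minimal sketch; the sample text is arbitrary):

```python
# Smoke test for the LLM and the embedding model (arbitrary sample text):
print(llm.complete("Say hello."))
embedding = embed_model.get_text_embedding("Say hello.")
print(len(embedding))  # text-embedding-ada-002 returns 1536-dimensional vectors
```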
4. Azure Search Vector Store Setup
Initializing a client for accessing an Azure AI Search index: a SearchIndexClient is created with the search service endpoint and credentials. This client lets you manage and query the indexes on your search service.

Azure AI Search vector store initialization: sets up the AzureAISearchVectorStore and configures parameters such as the field keys, embedding dimensionality, and search settings. index_name is the name of the search index the client interacts with. This store is responsible for managing the storage and retrieval of vector embeddings in your search index.
```python
index_client = SearchIndexClient(
    endpoint=search_service_endpoint,
    credential=search_creds,
)

index_name = "llamaindex-vector-demo"

vector_store = AzureAISearchVectorStore(
    search_or_index_client=index_client,
    index_name=index_name,
    index_management=IndexManagement.CREATE_IF_NOT_EXISTS,
    id_field_key="id",
    chunk_field_key="chunk",
    embedding_field_key="embedding",
    embedding_dimensionality=1536,
    metadata_string_field_key="metadata",
    doc_id_field_key="doc_id",
    language_analyzer="en.lucene",
    vector_algorithm_type="exhaustiveKnn",
)
```
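As a quick check, you can list the indexes on the search service with the same client. A small sketch (note that with CREATE_IF_NOT_EXISTS the index only appears after the first documents are indexed in the next step):

```python
# Optional: list existing indexes on the Azure AI Search service.
# "llamaindex-vector-demo" will show up once documents have been indexed.
print(list(index_client.list_index_names()))
```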
Define global settings
Before loading the data, we define our global settings. LlamaIndex v0.10 introduced the Settings object, which only needs to be defined once and can be used globally in our downstream code. Here we are configuring our LLM, which responds to prompts and queries, and our embedding model, responsible for converting text to numerical representations.
```python
Settings.llm = llm
Settings.embed_model = embed_model
```
5. Data Loading and Indexing
The load_and_index_data(path) function takes a path as input; by default it points to a directory named "data" next to the script. Put all the documents you want to use for retrieval in that directory, or pass a different path to the function.

- docs = SimpleDirectoryReader(input_dir=path, recursive=True).load_data() reads the directory at the given path and loads the documents into the docs variable; recursive=True also reads from subdirectories. SimpleDirectoryReader supports many file types, such as csv, docx, ipynb, md, mp3, pdf, png, and more.
- The StorageContext is set up with the initialized vector store.
- Finally, we create a VectorStoreIndex from the loaded documents and the storage_context.
```python
def load_and_index_data(path="./data"):
    docs = SimpleDirectoryReader(input_dir=path, recursive=True).load_data()
    storage_context = StorageContext.from_defaults(vector_store=vector_store)
    index = VectorStoreIndex.from_documents(docs, storage_context=storage_context)
    return index
```
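Before wiring up a chat engine, you can sanity-check the freshly built index with a plain query engine. A minimal sketch (the question is just a placeholder):

```python
# One-off retrieval test against the index (placeholder question):
index = load_and_index_data()
query_engine = index.as_query_engine()
print(query_engine.query("What topics do the documents cover?"))
```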
6. Setting up the Chat Engine
Here we call the load_and_index_data() function we just implemented to create a vector store index. Next, we build a chat engine from the index and set the chat mode to "condense_question". This mode condenses the conversation history and the latest message into a standalone question, then queries the underlying query engine with it.
```python
index = load_and_index_data()
chat_engine = index.as_chat_engine(chat_mode="condense_question", verbose=True)
```
You can also try out other chat modes (see the sketch after this list):

- "context" is a simple chat mode built on top of a retriever. For each interaction, it retrieves text from the index using the user input, inserts the retrieved text as context into the system prompt, and then returns an answer to the user.
- "condense_plus_context" is a multi-step chat mode that combines the condense-question and context modes. For each interaction, it first condenses the conversation and the latest user message into a standalone question, then retrieves context for that question and passes it, along with the prompt and the user input, to the LLM to generate a response.
7. CLI Interaction
Finally, we start a conversation loop in which the user can enter queries and the chatbot responds until the user types "exit".
```python
while True:
    prompt = input("User: ")
    if prompt == "exit":
        break
    response = chat_engine.chat(prompt)
    print(f"Chatbot: {response}")
```
You should now be able to run the application and start asking questions about the data you provided. The application will keep prompting for input until you type "exit".
Examples
To test the application, we put two PDF files about coffee and tea, downloaded from Wikipedia, in the data folder. After running the application and waiting a moment for everything to be set up, we could start asking questions via the CLI. Here are some examples with the chatbot's answers. The LLM also rewrites the user input for querying, as shown in the "Querying with:" output.
Example 1:
Example 2:
Here we can see the benefit of condense_question mode: the LLM remembers the previous question and combines it with the current input to generate a standalone query before returning a response.
Example 3:
Example 4:
Conclusion
By following the steps outlined in this article, you have learned how to configure and initialize Azure OpenAI and Azure AI Search, set up an Azure AI Search vector store, embed and index your data, and build a chat engine.
These steps enable you to create a chatbot that can handle large volumes of private data efficiently, making use of various Azure AI components.
Key components used in this application include Azure OpenAI for the language model and text embeddings, and Azure AI Search for data indexing and retrieval. By installing the necessary libraries and configuring the Azure services, you can set up a robust system for an intelligent chatbot.
By implementing this RAG-based chatbot system, you can leverage LLMs to interact with your private data efficiently, overcoming the limitations of context size and enhancing the chatbot's capabilities. This foundation opens the door for further customization and improvement to meet specific needs and use cases, making your chatbot a powerful tool for various applications.
With these insights and practical steps, you are now equipped to create and deploy your own chatbot using LlamaIndex and Azure AI, empowering you to harness the full potential of modern AI technologies.