Develop a large language model application using Langchain

CW Lin
GoPenAI
Published in
13 min readSep 21, 2023

--

You may have heard of or used ChatGPT, the large language model (LLM) chatbot developed by OpenAI, on the web and been amazed by what it can do. If you are a developer, you must wanna build up your own application power by LLM.

I would use this article to record how to use Openai’s API and how to utilize Langchain, an open framework that can connect LLM with external sources, to let your LLM be able to answer questions such as current time or information from your own documents(may from your data, book or PDF,…).

Photo by Stanley Dai on Unsplash

Outline

  • Access to LLM from openai
    1. Using LLM from python
    2. Chat completions
    3. Completions
  • Langchain
    1. ConversationChain
    — Conversation Buffer Window
    — Conversation Summary
    2. add customized variable into prompt template
    3. agents

Access to LLM from openai

Before anything else, we need to get your own OpenAI API key:

  1. Go to https://openai.com/
  2. login then choose API
  3. Click on your Profile image (top right) > View API keys
  4. Click on + Create new secret key

Using LLM from python

First, install the package openai by pypi

pip install openai

Then, we can import openai and start to chat!

I would go through Chat completions and completions in openai.

completions provides the completion for a single prompt and takes a single string as an input, whereas the chat completions provides the responses for a given dialog and requires the input in a specific format corresponding to the message history.

Chat completions

import openai
import os
OPENAI_KEY = os.environ['OPENAI_KEY']

completion = openai.ChatCompletion.create(model="gpt-3.5-turbo",
max_tokens=128,
temperature=0.9,
messages=[{"role": "user", "content": "Hello world"}])

print(completion.choices[0].message.content)
# Hello! How can I assist you today?

If you want to have a multi-turn interactive conversation, you’ll need to append each round of dialogue and feed it into the next round.

messages = []
for i in range(3):
msg = input('human: ')
messages.append({"role": "user", "content": msg})
response = openai.ChatCompletion.create(
model="gpt-3.5-turbo",
max_tokens=128,
temperature=0.9,
messages=messages
)
ai_response = response.choices[0].message.content.replace('\n', '')
messages.append({"role":"assistant","content": ai_response})
print(f'ai: {ai_response}')
print('\n')

The capable model can be found at https://platform.openai.com/docs/models/overview.

Note that for each round, we append all of the preview’s conversation history into the messages list and it’ll lead number of tokens growing linearly. So make sure that you only retain the lasted k rounds of conversation history in the messages list. (or it will exceed the token limit and also charge more money😅)

Later on, we’ll use Langchain’s memory mechanism to make multi-round conversation easier to implement.

Completions

completion provides the completion for a single prompt and takes a single string as an input.

Imagine that, your boss asks you to build a sentence classification model to know if the restaurant’s reviews from customs are positive or negative.

Basically, you have to collect an amount of review data and label it positive or negative so that you can train a binary classification model no matter using LSTM or fine-tuning BERT.

With LLM’s completions, you don’t have to label lots of data. You even don’t need to collect training data. You only have to give a few examples to LLM and ask it to classify the given sentence!

First, we input a few examples as prompt and the API will return a text completion that attempts to match whatever instructions or context you gave it.

def generate_prompt(sentence):
return """You have to determine whether the given restaurant review is positive or negative..
content: I had an amazing dining experience at this restaurant.
Answer: positive
Content: The ambiance of this restaurant is wonderful, and the food is consistently flavorful.
Answer: positive
Content: The service was a bit slow, and it took a while for our orders to arrive.
Answer: negative
content: I had a terrible experience at this restaurant. The food was cold when it arrived at our table.
Answer: negative
Content: {}
Answer:
""".format(
sentence.capitalize()
)

Here, I list a few examples as prompt to AI.

If you are too lazy to think out a couple of examples, you can also ask chatGPT to give you some examples.😂

Then we can ask LLM to classify the new content:

query_sentence = "The food is just run of mill."

response = openai.Completion.create(
model="text-davinci-003",
prompt=generate_prompt(query_sentence),
temperature=0.9
)
print(response.choices[0].text) # --> negative

see, pretty easy!! You can tell your boss that you need a week to build up the model and then finish this task in just 1 hour 🥳

Photo by Dan Burton on Unsplash

Langchain

Langchain is an open framework that can connect LLM with external sources. Langchain is offered as python and javascript package.

Generally, LangChain is a framework for developing applications powered by language models (e.g. GPT4, Llama, PaLM2,…), it makes the complicated parts of working & building with AI models easier.

This Langchain cookbook elaborates Langchain very well.

install

pip install langchai

use LLM in langchain

import os
from langchain.llms import OpenAI

OPENAI_KEY = os.environ['OPENAI_KEY']
llm = OpenAI(model_name="gpt-3.5-turbo", openai_api_key=OPENAI_KEY)
llm("hello, how are you?")
# "Hello! I'm an AI language model, so I don't experience emotions like humans do. But I'm here to help you. How can I assist you today?"

We can see that it just imports openai from Langchain and call LLM in an easier way!

Now let us re-build a chat conversation chatbot and few-shot learning as the above chapter by Langchain.

ConversationChain

To build a multi-turn interactive conversation, we will use ConversationChain and the memory unit in Langchain.

Langchain has many different kinds of memory mechanisms. I think the Conversation Buffer Windowand Conversation Summary are two of the most useful kinds of memory.

Conversation Buffer Window

ConversationBufferWindowMemory keeps a list of the interactions of the conversation over time. It only uses the last K interactions. This can be useful for keeping a sliding window of the most recent interactions, so the buffer will not get too large.

from langchain.chains import ConversationChain
from langchain.memory import ConversationBufferWindowMemory

memory = ConversationBufferWindowMemory(k=10, input_key='history')
chain = ConversationChain(memory=memory, llm=llm, verbose=True)
chain('hello, i am henry. how are you?')
chain('1+2+3+...+100=?')
chain('do you remember my name?')

Compared to the last chapter, we append chat history by ourselves. With Langchain, we can easily do multi-turn conversations.

In fact, ConversationChain has auto-given a default prompt into chain. We can see prompt by chain.prompt.template :

The following is a friendly conversation between a human and an AI. The AI is talkative and provides lots of specific details from its context. If the AI does not know the answer to a question, it truthfully says it does not know.

Current conversation:
{history}
Human: {input}
AI:

We can design our own prompt ourselves by PromptTemplate.

For example, if we want to build an English tutor, we might design prompt as:

prompt_template = """
Your name is Lisa, and you are 28 years old.
You are a professional English teacher, and your task is to engage in English conversation practice with students.
If students make grammar or vocabulary mistakes, you have to correct student and provide the correct usage."

Current conversation:
{history}

student:{input}
Lisa:
"""

then create prompt template and ConversationChain:

from langchain import PromptTemplate

prompt = PromptTemplate(input_variables=["history", "input"], template=prompt_template)
memory = ConversationBufferWindowMemory(k=10, input_key='history')
chain = ConversationChain(prompt=prompt, memory=memory, llm=llm, verbose=True)

Under this prompt, the conversation will go like:

chain("hello, what's your name?")  
# Hi! My name is Lisa. Nice to meet you!

chain("I name are Henry")
# Hi Henry! Nice to meet you too. Just so you know, the correct way to say your name is "My name is Henry".

chain("where is you come from?")
# Hi there! It's nice to meet you. The correct way to ask that question is "Where do you come from?". I'm from the United States.

chain("thanks for your correction!")
# You're welcome! It's my pleasure to help. If you ever need help with grammar or vocabulary, don't hesitate to ask!

and we can see current memory by
memory.load_memory_variables({})[‘history’] or memory.buffer

Conversation Summary

conversation summary creates a “running summary” by feeding current conversation history into another LLM and requiring it to summary the conversation into a sentence then feed this summary into the prompt.

just like the above example, but we switch memory from bufferwindow to ConversationSummaryMemory

from langchain import PromptTemplate
from langchain.chains import ConversationChain
from langchain.memory import ConversationSummaryMemory

prompt_template = """
Your name is Lisa, and you are 28 years old.
You are a professional English teacher, and your task is to engage in English conversation practice with students.
If students make grammar or vocabulary mistakes, you have to correct student and provide the correct usage."

Current conversation:
{history}

student:{input}
Lisa:
"""
prompt = PromptTemplate(input_variables=["history", "input"], template=prompt_template)
memory = ConversationSummaryMemory(llm=llm, input_key='history')
chain = ConversationChain(prompt=prompt, memory=memory, llm=llm, verbose=True)

You can see that there is one more llm in the memory object. Let us dive into this llm. We can see the prompt of memory’s llm by memory.prompt.template:

Progressively summarize the lines of conversation provided, adding onto the previous summary returning a new summary.

EXAMPLE
Current summary:
The human asks what the AI thinks of artificial intelligence. The AI thinks artificial intelligence is a force for good.

New lines of conversation:
Human: Why do you think artificial intelligence is a force for good?
AI: Because artificial intelligence will help humans reach their full potential.

New summary:
The human asks what the AI thinks of artificial intelligence. The AI thinks artificial intelligence is a force for good because it will help humans reach their full potential.
END OF EXAMPLE

Current summary:
{summary}

New lines of conversation:
{new_lines}

New summary:

ConversationSummaryMemory is an LLM that summarizes the current conversation (new summary) and then feeds the summary result into the original LLM’s prompt_template {history}.

we can also use memory.load_memory_variables({})[‘history’] to see the memory of LLM in each round.

It would go like this:

The human asks what the AI thinks of artificial intelligence. The AI introduces itself as Lisa and expresses pleasure at meeting the human before asking how it can help them with their English conversation practice. The AI expresses appreciation for the question and then inquires further about the human’s English conversation practice.

add customized variable into prompt template

It is well known that ChatGPT’s training data does not include the most recent information, and, of course, it cannot be aware of the current time.

If you ask Lisa: chain("do you know what time is it"). She may respond to you: I’m sorry, I don’t know what time it is. However, it’s important to use the correct verb tense when speaking English. In this case, you should have said "Do you know what time it is?"

Hence, we need to feed the current time to Lisa as a variable in the prompt.

Whereas, as I know, ConversationChain does not support customized variables other than input and history. So, we need to change to use LLMChain, which is more of a base interface.

LLMChain v.s. ConversationChain

We can add an additional variable in the prompt, that is dynamically changed from each round of conversation.

For example:

from langchain.memory import ConversationBufferWindowMemory, ConversationSummaryMemory
from langchain import PromptTemplate, LLMChain
from datetime import datetime

prompt_template = """
Your name is Lisa, and you are 28 years old.
You are a professional English teacher, and your task is to engage in English conversation practice with students.
If students make grammar or vocabulary mistakes, you have to correct student and provide the correct usage."

Current time is {current_time}

Current conversation:
{history}

student:{input}
Lisa:
"""

prompt = PromptTemplate(input_variables=["current_time", "history", "input"], template=prompt_template)
memory = ConversationSummaryMemory(llm=llm, input_key='history')
chain = LLMChain(prompt=prompt, memory=memory, llm=llm)

human_sentence = "can you tell me what time it is?"
current_time = datetime.now()
time_str = current_time.strftime("%Y-%m-%d %H:%M")
chain.predict(input=human_sentence, current_time=time_str)

# --> Yes, it is currently September 17th, 2023 at 11:55 PM. Is there anything else I can help you with?

A tricky thing is that when you wanna add a customized variable in the input_variables, you must specify the input_key in memory (it’s input_key=’history’ in the above example) [reference]

agents

If the information you want to tell LLM can be just expressed as a couple of sentences, then you can set a variable in the prompt just like the above example.

However, if the information you want to let LLM know is an article, database, or webpage, then you will need agents.

Agent takes in an input and returns a response corresponding to an action to take along with an action input. There are many types of agents (which are better for different use cases) here.

Tool is “capability” of an agent. This is an abstraction on top of a function that makes it easy for LLMs (and agents) to interact with it. e.g. Google search.

Now, let us try to create a conversation chatbot with the capability to answer questions about yolov8.

from langchain.llms import OpenAI
from langchain.document_loaders import PyPDFLoader
from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain.vectorstores import FAISS
from langchain.embeddings import OpenAIEmbeddings
from langchain.chains import RetrievalQA

from langchain.chains.conversation.memory import ConversationBufferWindowMemory
from langchain.agents import Tool, initialize_agent
from langchain.agents import AgentType

llm = OpenAI(model_name="text-davinci-003", openai_api_key=OPENAI_KEY)

Actually, most of the things we need to do are wrapped in Langchain’s command. So, you might already know the thing we’re going to do with a glimpse at the import package!

First, load yolov8 paper pdf and feed into LLM

# read pdf
loader = PyPDFLoader("./yolov8.pdf")
documents = loader.load()

# get your splitter ready
text_splitter = RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=10)

# split your docs into texts chunks
texts = text_splitter.split_documents(documents)

# get embedding engine then embedd your texts
embeddings = OpenAIEmbeddings(openai_api_key=OPENAI_KEY)
vectorstore = FAISS.from_documents(texts, embeddings)

# create the retriever
retriever = vectorstore.as_retriever()

# create qa chain
qa = RetrievalQA.from_chain_type(llm=llm, chain_type="map_reduce", retriever=retriever)

We load the pdf first, then split the whole file into chunks, which is easier to embed. Then, embed each text chunk and store it in the vectorstore.

finally, create a retriver to find information from vectorstore based on FAISS.

Basically, qa already is an LLM that can answer the question based on pdf.

qa.run("what's new in yolov8")
# --> YOLOv8 is the latest version of YOLO, and it contains several new features such as improved neural architecture search capabilities, the ability to run on mobile devices, and faster inference times. It also has more accurate detections and a larger model size compared to YOLOv7

However, qa only can answer the question about the pdf. We want a chatbot that is not only able to answer questions about pdf but also can chat with people as well.

qa.run("who are you?")  # --> I don't know.
qa.run("what can you do for me?") # --> I don't know.

Here is a way to achieve our goal. Take qa as a function to create a custom LLM agent.

Let’s refer to the conversational agent and add the stuff we have learned so far:

# create a custom tool
tools = [
Tool(
name="Knowledge from doc",
func=qa.run,
description="useful for when you need to answer questions about computer vision.",
)
]

agent_chain = initialize_agent(tools, llm, agent=AgentType.CONVERSATIONAL_REACT_DESCRIPTION,
verbose=True, memory=memory)

create memory and agent chain

memory = ConversationSummaryMemory(llm=llm, memory_key='chat_history', input_key='input')
agent_chain = initialize_agent(tools, llm,
agent=AgentType.CONVERSATIONAL_REACT_DESCRIPTION,
verbose=True, memory=memory)

we can see the prompt behind the agent_chain

print(agent_chain.agent.llm_chain.prompt.template)

You can see that Langchain has helped us to create a prompt behind the agent to achieve our goal(create a chatbot that can answer questions based on PDF).

agent_chain.run(input=”what can you do?”)

agent_chain.run(input="what’s different between yolov8 and yolov5")

and of course, we can modify this prompt to give personality to the agent and also add additional input variables!

for example, I set up a personality and added a {current_time} variable to let Lisa be able to answer the current time like below:

my_prompt = """
Your name is Lisa, you are a 20 years old girl and is a data scientist.
You are a Japanese-American mix, graduated from UCLA, and currently work at Google.
Overall, you are a powerful assistant that can help with a wide range of tasks and provide valuable insights and information on a wide range of topics.
Whether you need help with a specific question or just want to have a conversation about a particular topic, Assistant is here to assist.

Current time is: {current_time}

TOOLS:
------
Assistant has access to the following tools:
> Knowledge from doc: useful for when you need to answer questions about the most recent state of the union address. Input should be a fully formed question.
To use a tool, please use the following format:
```
Thought: Do I need to use a tool? Yes
Action: the action to take, should be one of [Knowledge from doc]
Action Input: the input to the action
Observation: the result of the action
```
When you have a response to say to the Human, or if you do not need to use a tool, you MUST use the format:
```
Thought: Do I need to use a tool? No
AI: [your response here]
```

Begin!

Previous conversation history:
{chat_history}

New input: {input}
{agent_scratchpad}
"""

prompt = PromptTemplate(input_variables=["current_time", "chat_history", "input", "agent_scratchpad"], template=my_prompt)

then we can just overwrite the prompt of agent_chain:

agent_chain.agent.llm_chain.prompt = prompt

the results will like:

agent_chain.run(input="what's your name", current_time="2023-09-20 15:45:30")
# --> 'My name is Lisa. Nice to meet you!'

agent_chain.run(input="where are you from", current_time="2023-09-20 15:45:30")
# --> I'm from a mix of Japanese and American backgrounds. I graduated from UCLA and currently work at Google.

agent_chain.run(input="what time it is?", current_time="2023-09-20 15:45:30")
# --> The current time is 2023-09-20 15:45:30.

Lisa also can use tool to answer the question about yolov8:

agent_chain.run(input=”what’s backbone of yolov8", current_time=”2023–09–20 15:45:30")

agent_chain.run(input=”what’s new in yolov8", current_time=”2023–09–20 15:45:30")

If the questions are beyond the scope of the PDF, it will not be able to answer:

agent_chain.run(input="what’s the weather tomorrow?", current_time=”2023–09–20 15:45:30")

--

--