A web app that acts as a restaurant waiter, taking orders from the menu using RAG, LLM, and Gradio.

Hello. Using just a tablet for taking orders can feel a bit bland, right? Let's use an LLM.

This time, we used a local LLM (because, for some reason, the credit card didn't go through for the GPT-3.5 API) + RAG + Gradio.

A local LLM allows us to implement something like ChatGPT right at our fingertips. Using ChatGPT's system requires per-use charges, which can be costly.

RAG is used to embed the menu into the LLM. The AI doesn't know the restaurant's menu, so you need to give it the menu. Also, it needs to check for updated information like items that are out of stock.

Gradio allows us to easily create a web app. We can't have customers punching in Python code on their phones to place orders.

For the LLM, we used Mistral7B.

from transformers import AutoTokenizer, pipeline

model_id = "mistralai/Mistral-7B-Instruct-v0.2"
tokenizer = AutoTokenizer.from_pretrained(model_id)
pipe = pipeline("text-generation", model=model_id, tokenizer=tokenizer, device=0, max_new_tokens=300)

For this instance of RAG, we imported the database from a CSV file. We struggled with importing Japanese text, so we ultimately used pandas to handle it.

from langchain_community.document_loaders import DataFrameLoader
from langchain_community.vectorstores import FAISS
from langchain_community.embeddings.huggingface import HuggingFaceEmbeddings
from langchain_text_splitters import CharacterTextSplitter
import pandas as pd

df = pd.read_csv("list5.csv")
loader = DataFrameLoader(df, page_content_column="name")

documents = loader.load()
text_splitter = CharacterTextSplitter(chunk_size=1000, chunk_overlap=0)
docs = text_splitter.split_documents(documents)

embeddings = HuggingFaceEmbeddings(
model_name="intfloat/multilingual-e5-large"
)

db = FAISS.from_documents(docs, embeddings)
retriever = db.as_retriever()
print(db.index.ntotal)

This time, the menu is listed in a file named "list5.csv". It includes recommendations, prices, and descriptions of the dishes. I wrote the prompt, which might feel a bit verbose.

from langchain_core.prompts import ChatPromptTemplate
from langchain_core.output_parsers import StrOutputParser
from langchain_core.runnables import RunnablePassthrough
from langchain.llms import HuggingFacePipeline

llm = HuggingFacePipeline(pipeline=pipe)

template = """ここはタイ料理レストラン「ブルーキャット」で、あなたは店員です。お客さんのオーダーに答えていくつか注文をとってもらえますか？注文以外の会話は次のコンテクストを参照しなくていいです。日本語で答えてください。お店にある料理は次のコンテクストの料理です。:

{context}

Question: {question}
"""

prompt = ChatPromptTemplate.from_template(template)

chain = (
{"context": retriever, "question": RunnablePassthrough()}
| prompt
| llm
| StrOutputParser()
)

query = "トムヤムクンはありますか？"
answer = chain.invoke(query)

if "Answer:" in answer:
answer = answer.split("Answer:")[1]

if "Question:" in answer:
answer = answer.split("Question:")[0]

print(answer)

The output was unstable, so I wanted to remove unnecessary characters, but it wasn't perfect. Adjustments will be needed in the future.

Next, regarding Gradio: this time, I included a branch to remove unnecessary characters from the output.

import gradio as gr
import os

def add_text(history, text):
history = history + [(text, None)]
return history, gr.Textbox(value="", interactive=False)

def bot(history):
  query = history[-1][0]
  response = chain.invoke(query)
  if "Answer:" in response:
    response = response.split("Answer:")[1]
  if "Question:" in response:
    response = response.split("Question:")[0]
  history[-1][1] = ""
  for character in response:
    history[-1][1] += character
    yield history

with gr.Blocks() as demo:
  chatbot = gr.Chatbot([])
  with gr.Row():
    txt = gr.Textbox(
      scale=4,
      show_label = False,
      container = False
    )
  clear = gr.Button("Clear")

txt_msg = txt.submit(add_text, [chatbot, txt], [chatbot, txt], queue = False).then(bot, chatbot, chatbot)
txt_msg.then(lambda: gr.Textbox(interactive = True), None, [txt], queue = False)
clear.click(lambda: None, None, chatbot, queue=False)

demo.queue()
demo.launch(share=True)

As a result...

We ended up with a nice response system. It remembered the menu content well and explained it properly. I'd like to try using it in an actual restaurant. I wonder if it would increase sales.

A web app that acts as a restaurant waiter, taking orders from the menu using RAG, LLM, and Gradio.

Yuichiro Minato