A Look Inside the Default Prompt of LangChain's ChatAgent

Posted On 2023-04-01


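Being able to customize a LangChain Agent gives you more freedom, but the Agent has always felt like something of a black box to me, so I dissected its default prompt. Recent versions of LangChain support ChatGPT, and customizing this prompt should make it possible to use the API cheaply.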

Introduction

A LangChain Agent uses ReAct to choose automatically which tool to use, and I have long wondered what template the Agent actually sends to the LLM. So this post looks at how to debug an Agent's prompt.

The Agent targeted here is ConversationalChatAgent, the conversational Agent for ChatGPT.

Conversational agents are only available in fairly recent versions of LangChain (they were not usable in LangChain 0.0.101, but were usable in 0.0.128).

Where Is the Agent Class?

According to the Agent section of the official documentation, an Agent is typically used as follows.

from langchain.agents import load_tools
from langchain.agents import initialize_agent
from langchain.llms import OpenAI

llm = OpenAI(temperature=0)
tools = load_tools(["serpapi", "llm-math"], llm=llm)

agent = initialize_agent(tools, llm, agent="zero-shot-react-description", verbose=True)

However, the example in the documentation uses GPT-3.5, not ChatGPT (GPT-3.5-Turbo).

At present, the following six agent types are available:

zero-shot-react-description
react-docstore
self-ask-with-search
conversational-react-description
chat-zero-shot-react-description
chat-conversational-react-description

The two that start with "chat-" appear to be the ones intended for GPT-3.5-Turbo. Where in LangChain are these classes defined?

The answer: the mapper is written in langchain/agents/loading.py (the langchain.agents.loading module).

For chat-conversational-react-description, the corresponding class is ConversationalChatAgent.
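The idea of a string-to-class mapper can be sketched as a plain dictionary lookup. The sketch below is not LangChain's code: the classes are empty stand-ins so it runs without LangChain installed, the names AGENT_TO_CLASS and resolve_agent_class are hypothetical, and the class name ChatAgent for chat-zero-shot-react-description is an assumption (only ConversationalChatAgent is confirmed by the text above).

```python
# Empty stand-in classes so the sketch runs without LangChain installed.
class ChatAgent:  # ASSUMED class for "chat-zero-shot-react-description"
    pass


class ConversationalChatAgent:  # class named in this article
    pass


# Hypothetical mapper from agent-type strings to agent classes.
AGENT_TO_CLASS = {
    "chat-zero-shot-react-description": ChatAgent,
    "chat-conversational-react-description": ConversationalChatAgent,
}


def resolve_agent_class(agent_type: str) -> type:
    """Return the class registered for an agent-type string."""
    try:
        return AGENT_TO_CLASS[agent_type]
    except KeyError:
        known = ", ".join(sorted(AGENT_TO_CLASS))
        raise ValueError(
            f"Unknown agent type {agent_type!r}. Known types: {known}"
        ) from None


print(resolve_agent_class("chat-conversational-react-description").__name__)
```

Keeping the mapping in one dictionary is what lets initialize_agent accept a plain string: adding an agent type means adding one entry rather than another branch.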

Understanding the Prompt Structure

To learn how the prompt is structured, register a search tool and a custom tool, then inspect the prompt.

import os
os.environ["OPENAI_API_KEY"] = "<your-open-ai-key>"
os.environ["SERPAPI_API_KEY"] = "<your-serp-api-key>"

from langchain import SerpAPIWrapper
from langchain.agents import ConversationalChatAgent, tool, Tool

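search = SerpAPIWrapper()

# Tool name (Japanese): "Get an ID from age and gender"
@tool("年齢と性別からIDを取得する", return_direct=True)
def get_id_from_gender_and_age(inputs):
    """年齢と性別からIDを計算するときに役立ちます
このツールの入力は、2つのカンマ区切りの文字列、性別, 年齢を表す"""
    # Docstring (Japanese): "Useful for computing an ID from age and gender.
    # The input to this tool is two comma-separated strings: gender, age."
    gender, age = inputs.split(",")
    gender = gender.strip()
    age = int(age.strip())
    if "男" in gender:  # "male"
        age += 1000
    elif "女" in gender:  # "female"
        age += 2000
    return age

tools = [
    Tool(
        name="Search",
        func=search.run,
        # Description (Japanese): "Useful for answering questions about current events"
        description="時事問題に答えたいときに役立ちます"
    ),
    Tool(
        name=get_id_from_gender_and_age.name,
        func=get_id_from_gender_and_age,
        description=get_id_from_gender_and_age.description
    )
]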

prompt = ConversationalChatAgent.create_prompt(tools=tools)
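The custom tool's logic can be checked on its own, without LangChain or an API key. The sketch below repeats the body of get_id_from_gender_and_age as a plain function and calls it with sample inputs. The name compute_id is hypothetical, introduced only for this illustration.

```python
def compute_id(inputs: str) -> int:
    """Plain-function copy of the tool body; input is "gender, age"."""
    gender, age = inputs.split(",")
    gender = gender.strip()
    age = int(age.strip())
    if "男" in gender:    # male: offset the age by 1000
        age += 1000
    elif "女" in gender:  # female: offset the age by 2000
        age += 2000
    return age


print(compute_id("男, 30"))
print(compute_id("女,25"))
```

Because the tool receives a single string, the comma-separated input convention has to be spelled out in the tool description so that the LLM formats action_input accordingly.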

Printing the prompt returned by ConversationalChatAgent.create_prompt shows:

input_variables=['input', 'chat_history', 'agent_scratchpad'] output_parser=None partial_variables={} messages=[SystemMessagePromptTemplate(prompt=PromptTemplate(input_variables=[], output_parser=None, partial_variables={}, template='Assistant is a large la…

It turns out to be a nested structure, presumably needed to support chat models. The messages field looks like where the prompts are stored.

messages is stored as a collection.

print(len(prompt.messages))

for x in prompt.messages:
    print(x)

Enumerating it like this prints:

4

prompt=PromptTemplate(input_variables=[], output_parser=None, partial_variables={}, template='Assistant is a large language model trained by OpenAI.\n\nAssistant is designed to be able to…
variable_name='chat_history'
prompt=PromptTemplate(input_variables=['input'], output_parser=None, partial_variables={}, template='TOOLS\n------\nAssistant can ask the user to use tools to look up information that may be helpful in answering the users original question. The tools the human can use are:\n\n> …
variable_name='agent_scratchpad'

So the prompt consists of four messages, with prompts and variable names alternating. Each prompt entry holds an object of the PromptTemplate class, and its template attribute is the actual prompt text.

In other words, the actual prompts can be written out as follows.

for x in prompt.messages:
    if hasattr(x, "prompt"):
        print(x.prompt.template)
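To see why the hasattr check is enough, here is a self-contained sketch that models the four-entry structure with stand-in classes. FakePromptTemplate, FakeMessageTemplate, and FakePlaceholder are hypothetical names and are not LangChain classes; they only mimic the attributes visible in the printed output (prompt, template, variable_name).

```python
from dataclasses import dataclass


@dataclass
class FakePromptTemplate:
    template: str  # the actual prompt text


@dataclass
class FakeMessageTemplate:
    prompt: FakePromptTemplate  # entries that wrap a prompt


@dataclass
class FakePlaceholder:
    variable_name: str  # entries that only name a variable


# Same alternating layout as the printed prompt.messages.
messages = [
    FakeMessageTemplate(FakePromptTemplate("Assistant is a large language model...")),
    FakePlaceholder("chat_history"),
    FakeMessageTemplate(FakePromptTemplate("TOOLS\n------\n...\n{input}")),
    FakePlaceholder("agent_scratchpad"),
]


def extract_templates(msgs):
    """Keep only entries that carry a prompt and return their template text."""
    return [m.prompt.template for m in msgs if hasattr(m, "prompt")]


def extract_placeholders(msgs):
    """Return the variable names of the placeholder entries."""
    return [m.variable_name for m in msgs if hasattr(m, "variable_name")]


print(len(extract_templates(messages)))
print(extract_placeholders(messages))
```

The placeholder entries have no prompt attribute at all, so filtering on hasattr skips them without needing to import or compare against any specific class.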

The Extracted Prompt

Writing out ConversationalChatAgent's default prompt with only the tools registered gives the following.

Assistant is a large language model trained by OpenAI.

Assistant is designed to be able to assist with a wide range of tasks, from answering simple questions to providing in-depth explanations and discussions on a wide range of topics. As a language model, Assistant is able to generate human-like text based on the input it receives, allowing it to engage in natural-sounding conversations and provide responses that are coherent and relevant to the topic at hand.

Assistant is constantly learning and improving, and its capabilities are constantly evolving. It is able to process and understand large amounts of text, and can use this knowledge to provide accurate and informative responses to a wide range of questions. Additionally, Assistant is able to generate its own text based on the input it receives, allowing it to engage in discussions and provide explanations and descriptions on a wide range of topics.

Overall, Assistant is a powerful system that can help with a wide range of tasks and provide valuable insights and information on a wide range of topics. Whether you need help with a specific question or just want to have a conversation about a particular topic, Assistant is here to assist.
TOOLS
------
Assistant can ask the user to use tools to look up information that may be helpful in answering the users original question. The tools the human can use are:

> Search: 時事問題に答えたいときに役立ちます
> 年齢と性別からIDを取得する: 年齢と性別からIDを取得する(inputs) - 年齢と性別からIDを計算するときに役立ちます
このツールの入力は、2つのカンマ区切りの文字列、性別, 年齢を表す

RESPONSE FORMAT INSTRUCTIONS
----------------------------

When responding to me please, please output a response in one of two formats:

**Option 1:**
Use this if you want the human to use a tool.
Markdown code snippet formatted in the following schema:

\```json
{{
    "action": string \ The action to take. Must be one of Search, 年齢と性別からIDを取得する
    "action_input": string \ The input to the action
}}
\```

**Option #2:**
Use this if you want to respond directly to the human. Markdown code snippet formatted in the following schema:

\```json
{{
    "action": "Final Answer",
    "action_input": string \ You should put what you want to return to use here
}}
\```

USER'S INPUT
--------------------
Here is the user's input (remember to respond with a markdown code snippet of a json blob with a single action, and NOTHING else):

{input}

Note: the "json" fence markers above are escaped because they would otherwise conflict with this blog's Markdown rendering.

That is the result. Written out like this, the prompt is much easier to follow.
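The response format in the prompt asks the model for a Markdown json snippet with action and action_input keys. As a rough illustration of what consuming such a reply could look like, here is a minimal parser sketch. It is not LangChain's output parser; parse_agent_reply is a hypothetical name, and the sample reply is made up for this example.

```python
import json
import re

FENCE = "`" * 3  # three backticks, built in code to keep this example readable


def parse_agent_reply(text: str) -> dict:
    """Extract the JSON blob from a reply, with or without a fenced json block."""
    pattern = FENCE + r"(?:json)?\s*(.*?)" + FENCE
    match = re.search(pattern, text, re.DOTALL)
    payload = match.group(1) if match else text
    data = json.loads(payload.strip())
    if "action" not in data or "action_input" not in data:
        raise ValueError("reply must contain 'action' and 'action_input'")
    return data


# Made-up reply in the "Option 1" shape (the model wants a tool to be used).
reply = (
    FENCE
    + 'json\n{"action": "Search", "action_input": "LangChain latest release"}\n'
    + FENCE
)
parsed = parse_agent_reply(reply)
print(parsed["action"], "->", parsed["action_input"])
```

An action of "Final Answer" would mean the action_input is returned to the user directly; any other value names the tool to run.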

Where Is It in the Code?

As for where LangChain defines this, it lives in prompt.py directly under each Agent's directory (in this case, conversational_chat/prompt.py).

https://github.com/hwchase17/langchain/blob/4b59bb55c74449bdff0fe88bf0b98fd8052cea25/langchain/agents/conversational_chat/prompt.py

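The version on GitHub differs slightly from the above because of version differences, but the prompt consists of these four elements:

PREFIX
FORMAT_INSTRUCTIONS
SUFFIX
TEMPLATE_TOOL_RESPONSE

To see how these variables are used, look at the ConversationalChatAgent code: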

https://github.com/hwchase17/langchain/blob/4b59bb55c74449bdff0fe88bf0b98fd8052cea25/langchain/agents/conversational_chat/base.py#L8-L13

The parts a user is most likely to change are PREFIX and SUFFIX. Looking at the inputs of the from_llm_and_tools function:

    @classmethod
    def from_llm_and_tools(
        cls,
        llm: BaseLanguageModel,
        tools: Sequence[BaseTool],
        callback_manager: Optional[BaseCallbackManager] = None,
        system_message: str = PREFIX,
        human_message: str = SUFFIX,
        input_variables: Optional[List[str]] = None,
        output_parser: Optional[BaseOutputParser] = None,
        **kwargs: Any,
    ) -> Agent:

PREFIX is registered as the default for system_message, and SUFFIX as the default for human_message.
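How these defaults and the tool list might come together can be sketched without LangChain. The code below is a simplified re-creation inferred from the printed prompt, not the library's implementation: ToolSpec and build_prompt_texts are hypothetical names, and PREFIX and SUFFIX here are shortened stand-ins for the real constants in prompt.py.

```python
from typing import NamedTuple, Sequence, Tuple


class ToolSpec(NamedTuple):
    """Hypothetical stand-in for a LangChain Tool (name and description only)."""
    name: str
    description: str


# Shortened stand-ins for the defaults defined in prompt.py.
PREFIX = "Assistant is a large language model trained by OpenAI."
SUFFIX = (
    "TOOLS\n------\n"
    "The tools the human can use are:\n\n{tools}\n\n"
    "The action must be one of {tool_names}\n\n"
    "USER'S INPUT\n--------------------\n{{input}}"
)


def build_prompt_texts(
    tools: Sequence[ToolSpec],
    system_message: str = PREFIX,
    human_message: str = SUFFIX,
) -> Tuple[str, str]:
    """Fill the tool list into the human message; pass the system message through."""
    tool_lines = "\n".join(f"> {t.name}: {t.description}" for t in tools)
    tool_names = ", ".join(t.name for t in tools)
    # "{{input}}" survives this format() call as "{input}", to be filled per request.
    human_text = human_message.format(tools=tool_lines, tool_names=tool_names)
    return system_message, human_text


tools = [ToolSpec("Search", "Useful for answering questions about current events")]
system_text, human_text = build_prompt_texts(
    tools, system_message="You are a terse assistant that answers in one sentence."
)
print(system_text)
print(human_text)
```

The doubled braces are the same trick visible in the extracted prompt: they are str.format escapes, so literal braces and the {input} slot remain after the tool names are substituted.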

LangChain is a fast-moving library and these details are likely to keep changing, but I hope this serves as a useful reference.


Tags:ChatGPT, GPT, LangChain, LLM
