Trying out BabyBeeAGI

May 2, 2023, 10:33

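I saw a post about BabyBeeAGI, so I'm going to try it out.
It reportedly uses more complex prompts than BabyAGI, and the task management agent is said to be responsible for the following specific functions: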

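Tracking the complete task list and each task's complete/incomplete status
Assigning dependencies between tasks
Deciding when new tasks are needed to achieve the objective
Assigning the tool to use for each task
Delivering the result as clean JSON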
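
To make these responsibilities a little more concrete, here is a minimal sketch of a task record and a dependency check. The field names follow the example output quoted in the task management prompt later in this post. The `Task` class, the `next_runnable` helper, and the status value "complete" are my own assumptions for illustration, not code from BabyBeeAGI.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Task:
    id: int
    task: str
    tool: str  # "text-completion", "web-search" or "web-scrape"
    dependent_task_id: Optional[int] = None
    status: str = "incomplete"  # "complete" is an assumed counterpart value
    result: Optional[str] = None
    result_summary: Optional[str] = None


def next_runnable(tasks: List[Task]) -> Optional[Task]:
    """Return the first incomplete task whose dependency (if any) is complete."""
    done = {t.id for t in tasks if t.status == "complete"}
    for t in tasks:
        if t.status != "incomplete":
            continue
        if t.dependent_task_id is None or t.dependent_task_id in done:
            return t
    return None


tasks = [
    Task(1, "https://untapped.vc", "web-scrape"),
    Task(2, "Analyze the contents of...", "text-completion", dependent_task_id=1),
    Task(3, "Untapped Capital", "web-search"),
]
first = next_runnable(tasks)
```

Here task 2 is skipped until task 1 has been marked complete, which is the kind of dependency tracking the list above describes.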

The source code is available at the link below.

It is written in exactly 300 lines.

The google-search-results package requires an API key for SerpApi (https://serpapi.com/). So it looks like you have to work within a very small free tier, or, even on a paid plan, a cap of 10,000 queries per day.

Creating a SerpApi account gives you a key.
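
As a rough sketch of how the key is used, the google-search-results package exposes a `GoogleSearch` class that takes a parameter dictionary. The environment variable name `SERPAPI_API_KEY` and the two helper functions are assumptions for illustration. BabyBeeAGI's own code may wire this up differently.

```python
import os


def build_search_params(query: str, api_key: str) -> dict:
    """Build the parameter dict passed to serpapi.GoogleSearch."""
    return {"engine": "google", "q": query, "api_key": api_key}


def web_search(query: str) -> list:
    """Run one search and return the organic results (empty list if none)."""
    # Imported lazily so the helper above works without the package installed.
    from serpapi import GoogleSearch

    api_key = os.environ["SERPAPI_API_KEY"]  # assumed variable name
    results = GoogleSearch(build_search_params(query, api_key)).get_dict()
    return results.get("organic_results", [])
```

Each call to `web_search` presumably counts against the query quota, so it is worth keeping test runs short.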

If you want to play with it on its own, the following is a useful reference.

A look at the prompts

Here are the key prompts.

Task execution agent

Complete your assigned task based on the objective: {OBJECTIVE}. 
Your task: {task['task']}
The previous task ({dependent_task['id']}. {dependent_task['task']}) 
result: {dependent_task_result}
Response:

Translated into Japanese:

目的：{OBJECTIVE}に基づき、割り当てられたタスクを完了させる。
あなたのタスク：{task['task']}。
前のタスク（{dependent_task['id']}. {dependent_task['task']})
結果です： {dependent_task_result}となります。
レスポンス：
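
The prompt above can be assembled with ordinary Python f-strings. The sketch below assumes the "previous task" part is only appended when a dependent task exists. The function name `build_task_prompt`, the dictionary shapes, and the placeholder objective are hypothetical.

```python
from typing import Optional

OBJECTIVE = "Research Untapped Capital"  # placeholder objective for illustration


def build_task_prompt(task: dict, dependent_task: Optional[dict] = None) -> str:
    """Fill the task execution prompt for one task."""
    prompt = (
        f"Complete your assigned task based on the objective: {OBJECTIVE}. \n"
        f"Your task: {task['task']}"
    )
    if dependent_task is not None:
        # Pass the earlier result along so the model can build on it.
        prompt += (
            f"\nThe previous task ({dependent_task['id']}. {dependent_task['task']}) "
            f"\nresult: {dependent_task['result']}"
        )
    return prompt + "\nResponse:"
```

Because the whole result of the dependent task is pasted into the prompt, a long scrape result makes the prompt long as well.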

Task management agent

You are a task management AI tasked with cleaning the formatting of and reprioritizing the following tasks: {minified_task_list}.
Consider the ultimate objective of your team: {OBJECTIVE}.
Do not remove any tasks. Return the result as a JSON-formatted list of dictionaries.
Create new tasks based on the result of last task if necessary for the objective. Limit tasks types to those that can be completed with the available tools listed below. Task description should be detailed.The maximum task list length is 7. Do not add an 8th task.The last completed task has the following result: {result}.
Current tool option is [text-completion] {websearch_var} and [web-scrape] only.
For tasks using [web-scrape], provide only the URL to scrape as the task description. Do not provide placeholder URLs, but use ones provided by a search step or the initial objective.
For tasks using [web-search], provide the search query, and only the search query to use (eg. not 'research waterproof shoes, but 'waterproof shoes')
dependent_task_id should always be null or a number.
Do not reorder completed tasks. Only reorder and dedupe incomplete tasks.
Make sure all task IDs are in chronological order.
Do not provide example URLs for [web-scrape].
Do not include the result from the last task in the JSON, that will be added after..
The last step is always to provide a final summary report of all tasks.
An example of the desired output format is: [{"id": 1, "task": "https://untapped.vc", "tool": "web-scrape", "dependent_task_id": null, "status": "incomplete", "result": null, "result_summary": null}, {"id": 2, "task": "Analyze the contents of...", "tool": "text-completion", "dependent_task_id": 1, "status": "incomplete", "result": null, "result_summary": null}, {"id": 3, "task": "Untapped Capital", "tool": "web-search", "dependent_task_id": null, "status": "incomplete", "result": null, "result_summary": null}].

Translated into Japanese:

タスク管理AIとして、以下のタスクのフォーマットを整理し、優先順位を見直してください: {minified_task_list}。
チームの究極の目的を考慮してください: {OBJECTIVE}。
タスクを削除しないでください。
結果をJSON形式の辞書のリストとして返してください。
目的に必要であれば、最後のタスクの結果に基づいて新しいタスクを作成してください。
タスクの種類は、以下にリストされた利用可能なツールで完了できるものに制限してください。
タスクの説明は詳細であるべきです。
タスクリストの最大長は7です。8番目のタスクを追加しないでください。
最後に完了したタスクの結果は次のとおりです: {result}。
現在のツールオプションは、[text-completion] {websearch_var}および[web-scrape]のみです。
[web-scrape]を使用するタスクの場合、タスクの説明としてスクレイプするURLのみを提供してください。
プレースホルダーURLを提供せず、検索手順または初期目標で提供されたものを使用してください。
[web-search]を使用するタスクの場合、使用する検索クエリのみを提供してください（例：「防水シューズを調べる」ではなく、「防水シューズ」）。 dependent_task_idは、nullまたは数値でなければなりません。
完了したタスクの順序を変更しないでください。未完了のタスクのみを並べ替えて重複を削除してください。
すべてのタスクIDが時系列順になっていることを確認してください。 [web-scrape]の例のURLを提供しないでください。
最後のタスクの結果はJSONに含めず、その後追加されます。最後のステップは、すべてのタスクの最終的な要約レポートを提供することです。
望ましい出力形式の例は次のとおりです。
[{"id": 1, "task": "https://untapped.vc", "tool": "web-scrape", "dependent_task_id": null, "status": "incomplete", "result": null, "result_summary": null},
{"id": 2, "task": "Analyze the contents of...", "tool": "text-completion", "dependent_task_id": 1, "status": "incomplete", "result": null, "result_summary": null},
{"id": 3, "task": "Untapped Capital", "tool": "web-search", "dependent_task_id": null, "status": "incomplete", "result": null, "result_summary": null} ].
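
Since the prompt asks for a JSON list with specific constraints, the reply can be checked before it is used. The following is a minimal sketch that enforces three of the stated rules (at most 7 tasks, a known tool, `dependent_task_id` null or a number). The function `parse_task_list` and its error messages are my own, not taken from the BabyBeeAGI source.

```python
import json

ALLOWED_TOOLS = {"text-completion", "web-search", "web-scrape"}
MAX_TASKS = 7


def parse_task_list(raw: str) -> list:
    """Parse the model's reply and check it against the prompt's rules."""
    tasks = json.loads(raw)
    if not isinstance(tasks, list):
        raise ValueError("expected a JSON list of task dictionaries")
    if len(tasks) > MAX_TASKS:
        raise ValueError(f"too many tasks: {len(tasks)} > {MAX_TASKS}")
    for task in tasks:
        if not isinstance(task, dict):
            raise ValueError("each task must be a JSON object")
        if task.get("tool") not in ALLOWED_TOOLS:
            raise ValueError(f"unknown tool in task {task.get('id')}: {task.get('tool')}")
        dep = task.get("dependent_task_id")
        # bool is a subclass of int in Python, so reject it explicitly.
        if dep is not None and (isinstance(dep, bool) or not isinstance(dep, int)):
            raise ValueError(f"dependent_task_id must be null or a number: {dep!r}")
    return tasks
```

A reply that fails these checks could be retried or discarded rather than being written into the task list.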

Running it

I set my OpenAI API key and SerpApi key and ran it.
I downloaded the code to my local machine, installed the required modules, and ran it there.

A first run

As the description says, it is quite slow.

Oh, the task progress and the dependent tasks are visualized too!

The task about driver's licenses failed...
It seems it cannot cope when the context gets too long.
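
One common way to work around an overly long context is to trim the text before it goes into a prompt. The sketch below is a generic mitigation, not something I confirmed in the BabyBeeAGI code. The function name and the 3000-character default are arbitrary assumptions, and a character count is only a rough stand-in for the model's token limit.

```python
def truncate_for_context(text: str, max_chars: int = 3000) -> str:
    """Keep at most max_chars characters, cutting at the last space if there is one."""
    if len(text) <= max_chars:
        return text
    cut = text[:max_chars]
    last_space = cut.rfind(" ")
    if last_space > 0:
        # Avoid ending in the middle of a word.
        cut = cut[:last_space]
    return cut
```

For text with few spaces, such as Japanese pages, this simply falls back to a hard cut at `max_chars`.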

Regrouping, I picked a task that looked easier.

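<Translation of the task list>
1. Gather information on the requirements for getting a VISA card.
2. Look into the types of VISA cards.
3. Compare the features and benefits of each card.
4. Decide on the best option for the individual's needs.
5. Fill out the online application for the chosen VISA card.
6. Submit the application form and any additional required documents.
7. Check the application status.
8. Receive the VISA card and activate it.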

Hmm, I see. Will it manage this time? Good luck!