Gemini API の Function Calling を試す

npaka

2024年4月28日 06:46

「Gemini API」の「Function Calling」を試したので、まとめました。

1. Function Calling

「Function Calling」は、開発者が事前に関数定義を指定しておくことで、モデルが外部プログラムを呼び出すことを選択できるようにする機能です。

2. Gemini APIの準備

「Google Colab」での「Gemini API」の準備手順は、次のとおりです。

(1) パッケージのインストール。

# パッケージのインストール
!pip install -U -q google-generativeai

(2) 「Google AI Studio」でAPIキーを取得し、シークレットの「GOOGLE_API_KEY」に登録後、以下のセルを実行。

from google.colab import userdata
import google.generativeai as genai

# 環境変数の準備 (左端の鍵アイコンでGOOGLE_API_KEYを設定)
GOOGLE_API_KEY=userdata.get("GOOGLE_API_KEY")
genai.configure(api_key=GOOGLE_API_KEY)

3. Automatic Function Calling

「Automatic Function Calling」の手順は、次のとおりです。

(1) 関数の定義。
計算 (加算、減算、乗算、除算) を行う関数を定義します。

def add(a:float, b:float):
    """returns a + b."""
    return a+b

def subtract(a:float, b:float):
    """returns a - b."""
    return a-b

def multiply(a:float, b:float):
    """returns a * b."""
    return a*b

def divide(a:float, b:float):
    """returns a / b."""
    return a*b

(2) モデル生成時に関数リストを指定。

# モデル生成時に関数リストを指定
model = genai.GenerativeModel(
    model_name="models/gemini-1.5-pro-latest",
    tools=[add, subtract, multiply, divide]  # Toolsの設定
)
model

モデルは「関数名」「ドキュメント文字列」「パラメータ」「パラメータ型の注釈」から、応答する際に関数が必要かどうかを判断します。

関数のパラメータ型の注釈をAPIが理解できる形式 (glm.FunctionDeclaration) に変換します。 APIはパラメータ型の限られた選択のみをサポートし、「Python SDK」の自動変換はそのサブセットのみをサポートします。

AllowedTypes = int | float | bool | str | list['AllowedTypes'] | dict

(3) チャットの作成。
enable_automatic_function_calling=True で「Automatic Function Calling」を有効化しています。

# チャットの作成
chat = model.start_chat(enable_automatic_function_calling=True)

(4) 計算の利用を促す質問応答。
57x44=2508なので正しいです。

# 質問応答
response = chat.send_message("私は57匹の猫を飼っていて、それぞれが44個のミトンを持っています。ミトンは合計で何個になりますか?")
response.text

57匹の猫がいて、それぞれが44個のミトンを持っている場合、合計で2508個のミトンになります。

(5) 会話履歴の確認。
会話履歴から、会話の流れと関数の使われ方がわかります。

# 会話履歴の確認
for content in chat.history:
    print(content.role, "->", [type(part).to_dict(part) for part in content.parts])
    print('-'*80)

user -> [{'text': '私は57匹の猫を飼っていて、それぞれが44個のミトンを持っています。ミトンは合計で何個になりますか?'}]
--------------------------------------------------------------------------------
model -> [{'function_call': {'name': 'multiply', 'args': {'b': 44.0, 'a': 57.0}}}]
--------------------------------------------------------------------------------
user -> [{'function_response': {'name': 'multiply', 'response': {'result': 2508.0}}}]
--------------------------------------------------------------------------------
model -> [{'text': '57匹の猫がいて、それぞれが44個のミトンを持っている場合、合計で2508個のミトンになります。'}]
--------------------------------------------------------------------------------

会話の各ターンは、次の情報を含むコンポーネント(glm.Content)によって表されます。

・role : コンテンツ発信元 (user or model)
・parts : メッセージに含まれるコンポーネント(glm.Part)のリスト
　・text : テキストメッセージ。
　・function calling (glm.FunctionCall) : 指定された引数で特定の関数を実行するためのモデルからのリクエスト。
　・function response (glm.FunctionResponse) : 要求された関数の実行後にユーザーによって返された結果。

状態遷移は次のとおりです。

4. Manual Function Calling

「Manual Function Calling」では、モデルからの glm.FunctionCall リクエストを自分で処理できます。

次の場合に、「Manual Function Calling」になります。

・Chatをenable_automatic_function_calling=False (default) で使用。
・GenerativeModel.generate_content()を使用。

(1) 関数の定義。
次の関数を定義します。

・find_movies : 説明 (タイトル・ジャンル) をもとに、上映されている映画タイトルを検索
・find_theaters : 場所をもとに映画館を検索
・get_showtimes : 場所と映画タイトルと映画館名と日付をもとに映画の開始時間を検索

def find_movies(description: str, location: str = ''):
  """find movie titles currently playing in theaters based on any description, genre, title words, etc.

  Args:
      description: Any kind of description including category or genre, title words, attributes, etc.
      location: The city and state, e.g. San Francisco, CA or a zip code e.g. 95616
  """
  return ["Detective Conan", "Godzilla-1.0"]

def find_theaters(location: str, movie: str = ''):
    """Find theaters based on location and optionally movie title which are is currently playing in theaters.

    Args:
        location: The city and state, e.g. San Francisco, CA or a zip code e.g. 95616
        movie: Any movie title
    """
    return ["TOHO Chinemas"]    

def get_showtimes(location:str, movie:str, theater:str, date:str):
    """
    Find the start times for movies playing in a specific theater.

    Args:
      location: The city and state, e.g. San Francisco, CA or a zip code e.g. 95616
      movie: Any movie title
      thearer: Name of the theater
      date: Date for requested showtime
    """
    return ['10:00', '11:00']

(2) モデルの生成。
関数を辞書で管理することで、後で関数名で簡単に検索できるようになります。

# 関数を辞書で管理
functions = {
    "find_movies": find_movies,
    "find_theaters": find_theaters,
    "get_showtimes": get_showtimes,
}

# モデルの生成
model = genai.GenerativeModel(
    model_name="models/gemini-1.5-pro-latest",
    tools=functions.values(),
)

(3) 質問応答。
呼び出すべき関数が返されます。「Automatic Function Calling」は有効化されてないため、関数を自分で呼び出す必要があります。

# 質問応答
response = model.generate_content("東京で名探偵コナンを映画を現在上映している映画館はどこですか？")
response.candidates[0].content.parts

[function_call {
  name: "find_theaters"
  args {
    fields {
      key: "location"
      value {
        string_value: "Mountain View, CA"
      }
    }
    fields {
      key: "movie"
      value {
        string_value: "Barbie"
      }
    }
  }
}
]

(4) 関数呼び出し。

# 関数呼び出し
def call_function(function_call, functions):
  function_name = function_call.name
  function_args = function_call.args
  return functions[function_name](**function_args)

# Function Callingかどうかの確認
part = response.candidates[0].content.parts[0]
if part.function_call:
    result = call_function(part.function_call, functions)

# 関数呼び出しの実行
print(result)

['TOHO Chinemas']

(5) 関数呼び出し結果をモデルに渡す。
最後に、関数呼び出し結果と会話履歴を次のgenerate_content()に渡して、モデルから最終的な応答を取得します。

import google.ai.generativelanguage as glm
from google.protobuf.struct_pb2 import Struct

# Function Responseの生成
# ***次のIssue対応次第更新**
# https://github.com/google/generative-ai-python/issues/243
s = Struct()
s.update({'result': result})
function_response = glm.Part(
    function_response=glm.FunctionResponse(name="find_theaters", response=s)
)

# 会話履歴の作成
messages = [
    {'role':'user',
     'parts': ['東京で名探偵コナンを現在上映している映画館はどこですか？']},
    {'role':'model',
     'parts': response.candidates[0].content.parts},
    {'role':'user',
     'parts': [function_response]}
]

# 質問応答
response = model.generate_content(messages)
print(response.text)

TOHOシネマズで上映中です。

この記事が気に入ったらサポートをしてみませんか？