Trying StableCode, an open-source coding AI, on the free tier of Google Colab

August 9, 2023, 14:30

Stability AI has released StableCode, an open-source AI for coding!

🚀Exciting news! Stability AI has launched StableCode, the revolutionary generative AI LLM for coding!

💡 Developers, get ready to level up your coding game! #AI #Coding #StableCode #StabilityAI https://t.co/XFrV36JMMu
— Stability AI (@StabilityAI) August 8, 2023

Trying StableCode on the free tier of Colab

Model used: stabilityai/stablecode-instruct-alpha-3b

To use the model, you must accept the terms of use on the model page above.

Source code

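%%capture
# Install the Hugging Face transformers library (%%capture hides the pip output)
!pip install transformers

# Access token used to download the model after accepting its terms of use
use_auth_token = "your Hugging Face API token"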

from transformers import AutoModelForCausalLM, AutoTokenizer

# Load the tokenizer and the model.
# Note: newer transformers releases deprecate `use_auth_token=` in favor of `token=`.
tokenizer = AutoTokenizer.from_pretrained("stabilityai/stablecode-instruct-alpha-3b", use_auth_token=use_auth_token)
model = AutoModelForCausalLM.from_pretrained(
  "stabilityai/stablecode-instruct-alpha-3b",
  trust_remote_code=True,
  torch_dtype="auto",
  use_auth_token=use_auth_token,
)
model.cuda()  # move the model to the GPU

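# Prompt written in the "###Instruction" / "###Response" format
inputs = """
###Instruction
Generate a python function to find number of CPU cores

###Response
"""
# Tokenize the prompt and move the tensors to the GPU
inputs = tokenizer(inputs, return_tensors="pt").to("cuda")

# Pass input_ids and attention_mask explicitly instead of unpacking **inputs
tokens = model.generate(
    inputs.input_ids,
    max_new_tokens=128,
    temperature=0.2,
    do_sample=True,
    attention_mask=inputs.attention_mask,
    pad_token_id=tokenizer.eos_token_id,
)

# Decode the generated token IDs (prompt included) back into text
print(tokenizer.decode(tokens[0], skip_special_tokens=True))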

Output

###Instruction
Generate a python function to find number of CPU cores

###Response
def get_cpu_count():
    """
    This function will return the number of CPU cores
    installed in the system
    """
    import multiprocessing
    return multiprocessing.cpu_count()
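
Because the decoded text includes the prompt as well as the answer, it can be handy to separate the two. The following is only a minimal sketch: `build_prompt` and `extract_response` are hypothetical helper names that do not appear in the code above, and the template is assumed to be exactly the `###Instruction` / `###Response` layout shown in the output.

```python
def build_prompt(instruction: str) -> str:
    """Wrap an instruction in the ###Instruction / ###Response template."""
    return f"###Instruction\n{instruction}\n\n###Response\n"


def extract_response(decoded: str) -> str:
    """Return only the text after the last ###Response marker."""
    marker = "###Response"
    _, found, tail = decoded.rpartition(marker)
    # If the marker is missing, fall back to the whole decoded text.
    return tail.strip() if found else decoded.strip()


prompt = build_prompt("Generate a python function to find number of CPU cores")
# Stand-in for the decoded model output (prompt followed by generated code).
decoded = (
    prompt
    + "def get_cpu_count():\n"
    + "    import multiprocessing\n"
    + "    return multiprocessing.cpu_count()\n"
)
print(extract_response(decoded))
```

With the real model, `decoded` would be the string returned by `tokenizer.decode(...)`.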

Running the suggested code

It works as expected!
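
For reference, here is roughly how the suggested function can be run in a notebook cell. This is a sketch based on the generated code above, with the import moved to the top of the cell. The printed number depends on the machine or Colab runtime, so no specific value is shown.

```python
import multiprocessing


def get_cpu_count():
    """Return the number of CPU cores reported by the system."""
    return multiprocessing.cpu_count()


# Print the core count of the current runtime.
print(get_cpu_count())
```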

Differences from the official sample code

tokens = model.generate(
    **inputs,
    max_new_tokens=48,
    temperature=0.2,
    do_sample=True,
)

The `**inputs` line (shown in bold in the original post) raised the following error:

ValueError: The following `model_kwargs` are not used by the model: ['token_type_ids'] (note: typos in the generate arguments will also show up in this list)

So I replaced it with inputs.input_ids and passed attention_mask explicitly.
I also set pad_token_id to eos_token_id.
Finally, I increased max_new_tokens from 48 to 128.
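
Another way to avoid this error, as far as I can tell, is to keep the `**inputs` style and simply drop the key that `generate` rejects. The sketch below uses a hypothetical helper, `drop_unused_keys`, and a plain dict standing in for the tokenizer output so that it runs without downloading the model. Passing `return_token_type_ids=False` when calling the tokenizer should have a similar effect.

```python
def drop_unused_keys(encoded, unused=("token_type_ids",)):
    """Return a plain dict of the tokenizer output without the rejected keys."""
    return {key: value for key, value in encoded.items() if key not in unused}


# Stand-in for the tokenizer output; the real values would be tensors.
encoded = {
    "input_ids": [[1, 2, 3]],
    "token_type_ids": [[0, 0, 0]],
    "attention_mask": [[1, 1, 1]],
}
generate_kwargs = drop_unused_keys(encoded)
print(sorted(generate_kwargs))  # ['attention_mask', 'input_ids']

# With the real objects from the source code above, this might be used as:
# tokens = model.generate(**drop_unused_keys(inputs), max_new_tokens=128,
#                         temperature=0.2, do_sample=True)
```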

License

https://huggingface.co/stabilityai/stablecode-instruct-alpha-3b/blob/main/LICENSE.md

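This license permits reproducing, distributing, and creating derivative works of StableCode for non-commercial research purposes only. Commercial use does not appear to be permitted.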
