Text-to-Video Generation Diffusion Model

kazu | Generative AI × Education ( https://aiacademy.jp/bootcamp )

March 19, 2023, 16:44

1.7 billion parameter text to video generation diffusion model

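The model is a multi-stage text-to-video diffusion model: it takes a text description as input and returns a video that matches that description. (Note: only English input is supported.)

Original tweet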

1.7 billion parameter text to video generation diffusion model

model: https://t.co/ZcKHuDxk6l
model files: https://t.co/uqVfCZTSk3 pic.twitter.com/tQgW3uXhEU
— AK (@_akhaliq) March 19, 2023

Model: https://modelscope.cn/models/damo/text-to-video-synthesis/summary
Model files: https://modelscope.cn/models/damo/text-to-video-synthesis/files

Video (demo)

Sample code

Note: I am still verifying that this code runs.

pip install open_clip_torch modelscope pytorch_lightning

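from modelscope.pipelines import pipeline
from modelscope.outputs import OutputKeys


def main():
    # Build the text-to-video pipeline from the ModelScope model ID.
    p = pipeline('text-to-video-synthesis', 'damo/text-to-video-synthesis')
    # The prompt is passed as a dict under the 'text' key (English only).
    test_text = {
        'text': 'A panda eating bamboo on a rock.',
    }
    # Run generation and read the saved video's path from the result dict.
    output_video_path = p(test_text)[OutputKeys.OUTPUT_VIDEO]
    print('output_video_path:', output_video_path)


if __name__ == '__main__':
    main()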

The code above prints the path where the output video is saved.
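Since the sample only prints the path, it may be handy to copy the generated file somewhere you choose, with a name derived from the prompt. The following is a minimal sketch, not part of the ModelScope API: `slugify` and `keep_video` are hypothetical helpers, and `generate` is assumed to be any callable that takes a prompt string and returns the path of a saved video file.

```python
import shutil
from pathlib import Path
from typing import Callable


def slugify(prompt: str, max_len: int = 40) -> str:
    """Turn a prompt into a filesystem-friendly name (hypothetical helper)."""
    # Lowercase letters and digits; everything else becomes an underscore.
    cleaned = "".join(c.lower() if c.isalnum() else "_" for c in prompt)
    # Collapse runs of underscores and drop leading/trailing ones.
    parts = [p for p in cleaned.split("_") if p]
    return "_".join(parts)[:max_len] or "video"


def keep_video(generate: Callable[[str], str], prompt: str, dest_dir: str) -> Path:
    """Run generate(prompt) and copy the resulting file into dest_dir.

    Assumption: generate returns the path of a video file on disk.
    """
    src = Path(generate(prompt))
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    # Keep the original file extension, name the copy after the prompt.
    target = dest / f"{slugify(prompt)}{src.suffix}"
    shutil.copy(src, target)
    return target
```

With the pipeline `p` from the sample above, `generate` could be written as `lambda t: p({'text': t})[OutputKeys.OUTPUT_VIDEO]`, assuming the pipeline returns a file path as it does in the sample.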
