An experiment: image-to-prompt with gpt-4-vision, then prompt-to-image with DALL-E

In this experiment I feed an image file to gpt-4-vision to have it generate a prompt, then use that prompt to generate an image with DALL-E.

Original image


Code for gpt-4-vision

I ran this in Google Colab.

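# Install the OpenAI Python client (Colab shell command)
!pip install openai

from openai import OpenAI
import os

# The client reads the API key from this environment variable
os.environ["OPENAI_API_KEY"] = "<API key>"

client = OpenAI()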

response = client.chat.completions.create(
  model="gpt-4-vision-preview",
  messages=[
    {
      "role": "user",
      "content": [
        {"type": "text", "text": "Genarate prompts what’s in this image?"},
        {
          "type": "image_url",
          "image_url": {
            "url": "https://assets.st-note.com/img/1700265487509-9EPnhZvMO9.png",
          },
        },
      ],
    }
  ],
  max_tokens=300,
)

print(response.choices[0])

Output

Choice(finish_reason=None, index=0, message=ChatCompletionMessage(content='This is an illustration of a woman and a young child. The woman appears to be reading a story to the child from a large blue book. The child, who seems quite engaged, is looking at the book with great interest. Both characters are drawn in an anime/manga style, with detailed features and soft coloring. The woman is wearing a light sweater and has short, dark hair. They are both seated, and the environment suggests a calm, intimate moment of sharing a story or learning together.', role='assistant', function_call=None, tool_calls=None), finish_details={'type': 'stop', 'stop': '<|fim_suffix|>'})
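Printing the whole `Choice` object buries the generated text among metadata. A minimal sketch of pulling out only the description, assuming a `response` shaped like the one returned above; `extract_text` is a helper name introduced here for illustration and is not part of the openai library.

```python
def extract_text(response):
    """Return the assistant's text from the first choice of a chat completion."""
    message = response.choices[0].message
    # content may be None, so fall back to an empty string
    return (message.content or "").strip()
```

With the call above, `extract_text(response)` would give just the description string, ready to paste or pass on as a DALL-E prompt.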

Output from DALL-E
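The article does not show how DALL-E was invoked, so the following is only a sketch of one way to do the prompt-to-image step through the same Python client. The helper name `generate_image_url`, the model `dall-e-3`, and the size `1024x1024` are all assumptions, not the settings used for the image here.

```python
def generate_image_url(client, prompt, model="dall-e-3", size="1024x1024"):
    """Send a text prompt to the Images API and return the URL of the result."""
    result = client.images.generate(
        model=model,
        prompt=prompt,
        size=size,
        n=1,  # dall-e-3 accepts only one image per request
    )
    return result.data[0].url

# Usage (needs the openai package and OPENAI_API_KEY set):
#   from openai import OpenAI
#   url = generate_image_url(OpenAI(), "An anime-style illustration of a woman reading to a child")
```

Passing the client in as an argument keeps the function easy to try with a stand-in object before spending API credits.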

Asking gpt-4-vision to "generate a prompt for producing a more aesthetic image"

client = OpenAI()

response = client.chat.completions.create(
  model="gpt-4-vision-preview",
  messages=[
    {
      "role": "user",
      "content": [
        {"type": "text", "text": "Generate prompts to generate more aesthetic images What is in this image?"},
        {
          "type": "image_url",
          "image_url": {
            "url": "https://assets.st-note.com/img/1700273289877-3DZcYkl4OG.png",
          },
        },
      ],
    }
  ],
  max_tokens=300,
)

print(response.choices[0])

Output

Choice(finish_reason=None, index=0, message=ChatCompletionMessage(content='This image depicts two animated characters with a soft, aesthetic art style. The character on the left is a young child with curly dark hair, and a blue top, holding and looking at a book with interest. The character on the right appears to be an older female with shoulder-length light hair, wearing a cozy, off-white cable-knit sweater, gently holding the book for the child. Both characters have rosy cheeks and are wearing subtle blush makeup, contributing to a warm, comforting scene. The artwork has a gentle color palette with a focus on soft pinks and neutral tones, enhancing the tender and peaceful mood of the image.\n\nHere are some prompts that can be used to generate similar aesthetic images:\n\n1. A teenager sitting under a tree with a sketchbook, surrounded by autumn leaves.\n2. Two children chasing fireflies in a twilight garden, with glowing lights dotting the landscape.\n3. A young adult reading a book by the window on a rainy day, with a steaming mug of tea beside them.\n4. A group of friends having a picnic in a meadow, with a checkered blanket and a basket of fruits and sandwiches.\n5. A person sitting in a cozy cafe, writing in a journal, surrounded by soft ambient lighting and bookshelves.\n6. An artist painting a landscape on a canvas in a sunlit studio with plants and art supplies around.\n7. A family baking cookies together in a kitchen filled with warm light and homey décor.\n8.', role='assistant', function_call=None, tool_calls=None), finish_details={'type': 'max_tokens'})
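This response mixes a description with a numbered list of prompts, and the list is cut off at "8." because `max_tokens=300` was reached (`finish_details` reports `max_tokens`). A small sketch of separating out the complete list items follows. `split_numbered_prompts` is a hypothetical helper, and it assumes each item sits on one line and ends with sentence punctuation, which holds for the output above but may not in general.

```python
import re


def split_numbered_prompts(text):
    """Return the numbered list items in text, skipping unfinished ones."""
    prompts = []
    for line in text.splitlines():
        match = re.match(r"\s*(\d+)\.\s*(.*)", line)
        if not match:
            continue  # not a numbered list line
        item = match.group(2).strip()
        # an item cut off by max_tokens is empty or lacks closing punctuation
        if item and item[-1] in ".!?":
            prompts.append(item)
    return prompts
```

Raising `max_tokens` would avoid the truncation in the first place; the filter only guards against sending a half-finished prompt to DALL-E.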

Output from DALL-E

Repeating the process

client = OpenAI()

response = client.chat.completions.create(
  model="gpt-4-vision-preview",
  messages=[
    {
      "role": "user",
      "content": [
        {"type": "text", "text": "Generate prompts to generate more aesthetic images What is in this image?"},
        {
          "type": "image_url",
          "image_url": {
            "url": "https://assets.st-note.com/img/1700273733282-cNtwGgTtjI.png",
          },
        },
      ],
    }
  ],
  max_tokens=300,
)

print(response.choices[0])

Output

Choice(finish_reason=None, index=0, message=ChatCompletionMessage(content='This image features an illustration of a tender moment between two characters, likely a mother and child. The adult figure is holding a book, which the child is attentively looking at, suggesting they might be engaged in a storytime session. The illustration has a warm and soft color palette, mainly pinks and creams, contributing to a nurturing and gentle atmosphere. The characters are drawn with rounded features and blushed cheeks, which adds to the overall cohesiveness and charm of the aesthetic.\n\nHere are some prompts to generate more images with a similar aesthetic:\n\n1. Illustrate a father and son cooking together in a kitchen with soft pastel colors and a heartwarming atmosphere.\n2. Create an image of a grandmother and granddaughter gardening together, with the focus on delicate floral patterns and a serene color palette.\n3. Depict two siblings, one older and one younger, cloud-gazing in the afternoon with a golden hour light and a soft, dreamy background.\n4. Show a family having a picnic in a park during cherry blossom season, with a visual focus on the pink blossoms and a blissful spring setting.\n5. Capture a moment of a child learning to ride a bike with the help of a patient parent, surrounded by a quiet suburban landscape bathed in the gentle light of sunset.\n6. Illustrate a cozy bedtime scene with a parent reading a fairytale to their child by a night lamp, enveloped in warm hues and soothing textures.\n\nEach of these prompts emphasizes heart', role='assistant', function_call=None, tool_calls=None), finish_details={'type': 'max_tokens'})

Output from DALL-E
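The rounds above were run by hand, one request at a time. As a rough sketch, the same back-and-forth could be automated in a loop. Everything here beyond the chat request shape is an assumption: the function name `run_rounds`, the DALL-E model and size, passing the full response text as the image prompt, and the idea that the URL returned by the Images API can be handed straight back to the vision model.

```python
def run_rounds(client, image_url, instruction, rounds=3):
    """Alternate image-to-prompt and prompt-to-image, recording each round."""
    history = []
    for _ in range(rounds):
        # image -> text, same request shape as the calls above
        response = client.chat.completions.create(
            model="gpt-4-vision-preview",
            messages=[{
                "role": "user",
                "content": [
                    {"type": "text", "text": instruction},
                    {"type": "image_url", "image_url": {"url": image_url}},
                ],
            }],
            max_tokens=300,
        )
        prompt = response.choices[0].message.content
        # text -> image; model name and size are assumptions
        image = client.images.generate(
            model="dall-e-3", prompt=prompt, size="1024x1024", n=1
        )
        image_url = image.data[0].url  # becomes the input of the next round
        history.append({"prompt": prompt, "image_url": image_url})
    return history
```

Keeping every prompt and image URL in `history` makes it possible to compare rounds side by side afterwards.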

Reflections

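With the instruction "a more aesthetic image", it was hard to observe the images drifting in any particular direction as the two AIs passed results back and forth.
I will think of a different approach.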
