[Python] Tracking a Selected Object in a Video and Cropping Automatically (1)

March 31, 2024, 16:15

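[Situation] I want to pick one of several objects in a video and find the range over which it moves.
[Approach] Use YOLOv8 to collect all of the bounding boxes detected for that object.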

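The goal is to find the range over which a particular object moves and to generate a video that covers only that range. This post gets as far as selecting an object and tracing it.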

Starting from the sample code

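The official site has a template that reads frames from a video and processes them one by one, so I use it unchanged as a starting point. The only thing I changed is the file name in video_path. walk.mp4 sits in the same folder as the source code.
From here on, I use a video created with invideo AI (generated from keywords along the lines of "people walking").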

import cv2
from ultralytics import YOLO

# Load the YOLOv8 model
model = YOLO('yolov8n.pt')

# Open the video file
video_path = "walk.mp4"
cap = cv2.VideoCapture(video_path)

# Loop through the video frames
while cap.isOpened():
    # Read a frame from the video
    success, frame = cap.read()

    if success:
        # Run YOLOv8 tracking on the frame, persisting tracks between frames
        results = model.track(frame, persist=True)

        # Visualize the results on the frame
        annotated_frame = results[0].plot()

        # Display the annotated frame
        cv2.imshow("YOLOv8 Tracking", annotated_frame)

        # Break the loop if 'q' is pressed
        if cv2.waitKey(1) & 0xFF == ord("q"):
            break
    else:
        # Break the loop if the end of the video is reached
        break

# Release the video capture object and close the display window
cap.release()
cv2.destroyAllWindows()

Several objects were recognized, including multiple people and a dog.

Selecting a target from multiple objects

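Here I assume that several objects appear in the video, and I let the user pick one by entering its id before the video-reading loop starts. Choose one of the ids shown on screen and type its number (there is no error handling). The frame read at this point is also copied into tracking_frame, and the bounding boxes are drawn over tracking_frame from then on.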

results = model.track(frame, persist=True)

cv2.imshow("YOLOv8 Tracking", results[0].plot())
cv2.waitKey(1)  # Without a wait, the window is not refreshed

print("Enter the [number] of the target to process:")
select_id = int(input())  # Read the id as an integer

# Copy the first frame; the boxes are drawn over this copy
tracking_frame = frame.copy()
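
Since the input above is not validated, here is a minimal sketch of one way to add a check. It is my own assumption, not part of the original script: parse_track_id and ask_track_id are hypothetical helper names, and available_ids is assumed to be the set of ids visible in the first frame.

```python
def parse_track_id(text, available_ids):
    """Return the chosen id as an int, or None if the input is not usable."""
    try:
        chosen = int(text.strip())
    except ValueError:
        return None  # the input was not a whole number
    # Accept the number only if that id is actually on screen
    return chosen if chosen in available_ids else None


def ask_track_id(available_ids, read=input):
    """Keep asking until the user enters one of the available ids."""
    while True:
        chosen = parse_track_id(read("Enter the id to track: "), available_ids)
        if chosen is not None:
            return chosen
        print("Please enter one of:", sorted(available_ids))
```

In the script, available_ids could be built from the first-frame result, for example set(results[0].boxes.id.int().tolist()) when results[0].boxes.id is not None, and select_id = ask_track_id(available_ids) would replace the bare int(input()).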

Getting information about the detected objects

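results[0] holds the information about the detected objects, so I pull it out from there. This time I use only the coordinates and the ids (object names and so on are available too, of course). The code loops over the detected objects, and when it finds the selected id it draws that bounding box as a blue line over the first frame (tracking_frame).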

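# Look for the tracking target among the detected objects and record its bounding box
#   boxes.xyxy holds the corner coordinates, boxes.id holds the track ids
#   (boxes.id is None when nothing is tracked in the frame)
ids = results[0].boxes.id
if ids is not None:
    for box, track_id in zip(results[0].boxes.xyxy, ids):
        if int(track_id) == select_id:  # id of the selected object
            box_x1, box_y1, box_x2, box_y2 = map(int, box)  # Keep the box coordinates

            # Draw the bounding box as a blue line
            cv2.rectangle(tracking_frame, (box_x1, box_y1), (box_x2, box_y2), (255, 0, 0), 2)
            break

# Show the tracked region
cv2.imshow("YOLOv8 Tracking", tracking_frame)
cv2.waitKey(1)  # Without a wait, the window is not refreshed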
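
The lookup can also be factored into a small helper that returns the box instead of drawing it, which keeps the search separate from the OpenCV calls. The sketch below is my own assumption rather than part of the original script: find_box is a hypothetical name, and it takes plain sequences so that it can be tried without a model. It returns None when the id is absent, including the case where the tracker reports no ids at all (results[0].boxes.id is None).

```python
def find_box(boxes_xyxy, track_ids, select_id):
    """Return (x1, y1, x2, y2) as ints for select_id, or None if it is absent."""
    if track_ids is None:
        return None  # no tracked objects in this frame
    for box, track_id in zip(boxes_xyxy, track_ids):
        if int(track_id) == select_id:
            x1, y1, x2, y2 = (int(v) for v in box)  # truncate to pixel coordinates
            return x1, y1, x2, y2
    return None  # the selected id is not in this frame
```

In the loop, this would be called as find_box(results[0].boxes.xyxy, results[0].boxes.id, select_id), and cv2.rectangle would run only when the return value is not None.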

Comparing the result figures below shows that id 1 and id 2 produce different results.

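The target can now be selected ♪
With the flow above (first frame, then a loop over the remaining frames), the tracking call appears in two places. Moving everything into the loop and treating the first read as a special case would reduce that to a single call, but I think this version is easier to follow.
Note that an object that is not in the first frame cannot be selected, so an object that appears partway through the video needs some extra work.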
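
One possible workaround for objects that appear later, sketched here under my own assumptions: run the tracker over the whole video once, note the frame where each id first shows up, and let the user choose from that full list. first_appearances is a hypothetical helper; it expects one entry per frame, each a list of ids or None when nothing was tracked in that frame.

```python
def first_appearances(ids_per_frame):
    """Map each track id to the index of the first frame in which it appears."""
    first_seen = {}
    for frame_index, ids in enumerate(ids_per_frame):
        for track_id in ids or []:  # treat None (no tracks) as an empty frame
            # setdefault keeps the earliest frame index for each id
            first_seen.setdefault(int(track_id), frame_index)
    return first_seen
```

If this approach is used, it is probably safer to keep the per-frame boxes from that same pass rather than running the tracker a second time, since ids may not match between separate runs.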

That is all for this time. Next, I will find the region that contains all of the bounding boxes.
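
As a rough preview of that step, here is a minimal sketch, assuming the boxes of the selected id have been collected into a list of (x1, y1, x2, y2) tuples. union_box is a hypothetical helper of my own and not necessarily how the next post does it.

```python
def union_box(boxes):
    """Return the smallest (x1, y1, x2, y2) rectangle that contains every box."""
    if not boxes:
        return None  # nothing was tracked, so there is no region
    x1 = min(b[0] for b in boxes)  # leftmost edge
    y1 = min(b[1] for b in boxes)  # topmost edge
    x2 = max(b[2] for b in boxes)  # rightmost edge
    y2 = max(b[3] for b in boxes)  # bottommost edge
    return x1, y1, x2, y2
```

For example, union_box([(10, 20, 50, 80), (30, 10, 70, 60)]) gives (10, 10, 70, 80), which is the area a cropped video would need to cover.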

Code

import cv2
from ultralytics import YOLO

# Load the YOLOv8 model
model = YOLO('yolov8n.pt')

# Open the video file
video_path = "walk-1.mp4"
cap = cv2.VideoCapture(video_path)

success, frame = cap.read()
if not success:  # Stop if the first frame cannot be read
    cap.release()
    raise SystemExit("Could not read " + video_path)

# Select the tracking target from the first frame
results = model.track(frame, persist=True)  # Get the detection results

cv2.imshow("YOLOv8 Tracking", results[0].plot())
cv2.waitKey(1)  # Without a wait, the window is not refreshed

print("Enter the [number] of the target to process:")
select_id = int(input())  # Read the id as an integer

# Copy the first frame; the boxes are drawn over this copy
tracking_frame = frame.copy()

# Loop through the video frames
while cap.isOpened():
    # Read a frame from the video
    success, frame = cap.read()

    if success:
        # Run YOLOv8 tracking on the frame, persisting tracks between frames
        results = model.track(frame, persist=True)

        # Look for the tracking target among the detected objects and record its bounding box
        #   boxes.xyxy holds the corner coordinates, boxes.id holds the track ids
        #   (boxes.id is None when nothing is tracked in the frame)
        ids = results[0].boxes.id
        if ids is not None:
            for box, track_id in zip(results[0].boxes.xyxy, ids):
                if int(track_id) == select_id:  # id of the selected object
                    box_x1, box_y1, box_x2, box_y2 = map(int, box)  # Keep the box coordinates

                    # Draw the bounding box as a blue line
                    cv2.rectangle(tracking_frame, (box_x1, box_y1), (box_x2, box_y2), (255, 0, 0), 2)
                    break

        # Show the tracking progress
        cv2.imshow("YOLOv8 Tracking", tracking_frame)
        cv2.waitKey(1)  # Without a wait, the window is not refreshed

    else:
        # Break the loop if the end of the video is reached
        break
    
# Release the video capture object and close the display window
cap.release()
cv2.destroyAllWindows()
