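How to Add Multi-Display Support to PyAutoGUI (Windows)

February 24, 2024, 19:10

I needed to locate images on screen and act on them in a multi-display environment on Windows. When I checked, PyAutoGUI apparently does not support this as-is, so this post shows the changes I made to get it working.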

Environment

I tested with the following environment.

> python --version
Python 3.11.7

> pip freeze       
 :
PyAutoGUI==0.9.54
 :
PyScreeze==0.1.30
 :
pywin32==306
 :

If they are not installed yet, install pyautogui and pywin32.

> pip install pyautogui
> pip install pywin32

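Implementation

Only one file needs to be modified, a file inside the pyscreeze package, and there are three changes to make. For this post I created the environment with miniconda.

Folder: ~\miniconda3\envs\dev_pyautogui\Lib\site-packages\pyscreeze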
File: __init__.py

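In every code listing below, the lines between BEGIN multi display support and END multi display support are the additions.

The first change is the win32api import. Near the top of the file there is a sys.platform == 'win32' check; add the import inside that block.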

if sys.platform == 'win32':
    # On Windows, the monitor scaling can be set to something besides normal 100%.
    # PyScreeze and Pillow needs to account for this to make accurate screenshots.
    # TODO - How does macOS and Linux handle monitor scaling?
    import ctypes
    # BEGIN multi display support
    import win32api
    # END multi display support

    try:
        ctypes.windll.user32.SetProcessDPIAware()
    except AttributeError:
        pass  # Windows XP doesn't support monitor scaling, so just do nothing.

    try:
        import pygetwindow
    except ImportError:
        _PYGETWINDOW_UNAVAILABLE = True
    else:
        _PYGETWINDOW_UNAVAILABLE = False

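The second change is inside the locateOnScreen function.

It calls win32api.EnumDisplayMonitors() to get information about every display connected to the system.

From each display's rectangle it reads the top-left coordinates, then takes the minimum left value and the minimum top value across all displays. Together these give the top-left corner of the area spanned by all displays.

If the image is found (retVal is not None), that corner is added to the found coordinates. The all-screens screenshot starts at that corner, so this converts the position inside the screenshot into a position on the desktop as a whole.

The shifted coordinates, together with the original width and height from retVal, are used to build a new Box object, which holds the region of the matched image.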

def locateOnScreen(image, minSearchTime=0, **kwargs):
    """TODO - rewrite this
    minSearchTime - amount of time in seconds to repeat taking
    screenshots and trying to locate a match.  The default of 0 performs
    a single search.
    """
    start = time.time()
    while True:
        try:
            # the locateAll() function must handle cropping to return accurate coordinates,
            # so don't pass a region here.
            screenshotIm = screenshot(region=None)
            retVal = locate(image, screenshotIm, **kwargs)

            # BEGIN multi display support
            if retVal is not None and sys.platform == 'win32':
                # Each entry is (hMonitor, hdcMonitor, (left, top, right, bottom)).
                displays = win32api.EnumDisplayMonitors()
                # Top-left corner of the area spanned by all displays.
                left_min = min(display[2][0] for display in displays)
                top_min = min(display[2][1] for display in displays)
                # Shift the match from screenshot coordinates to desktop coordinates.
                retVal = Box(
                    left=retVal[0] + left_min,
                    top=retVal[1] + top_min,
                    width=retVal[2],
                    height=retVal[3]
                )
            # END multi display support

            try:
                screenshotIm.fp.close()
            except AttributeError:
                # Screenshots on Windows won't have an fp since they came from
                # ImageGrab, not a file. Screenshots on Linux will have fp set
                # to None since the file has been unlinked
                pass
            if retVal or time.time() - start > minSearchTime:
                return retVal
        except ImageNotFoundException:
            if time.time() - start > minSearchTime:
                if USE_IMAGE_NOT_FOUND_EXCEPTION:
                    raise
                else:
                    return None
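
The coordinate shift in the patch can be tried on its own, without a screen or pywin32. The following is a minimal sketch: virtual_screen_origin and to_desktop_coordinates are hypothetical helper names, the monitor rectangles are made-up values in the (left, top, right, bottom) layout that EnumDisplayMonitors reports, and Box is redefined locally with the same fields as pyscreeze's Box.

```python
from collections import namedtuple

# Same field layout as pyscreeze.Box; redefined here so the sketch runs without PyScreeze.
Box = namedtuple('Box', 'left top width height')


def virtual_screen_origin(monitor_rects):
    """Return (left_min, top_min) over (left, top, right, bottom) rectangles."""
    left_min = min(rect[0] for rect in monitor_rects)
    top_min = min(rect[1] for rect in monitor_rects)
    return left_min, top_min


def to_desktop_coordinates(box, monitor_rects):
    """Shift a box found in an all-screens screenshot into desktop coordinates."""
    left_min, top_min = virtual_screen_origin(monitor_rects)
    return Box(box.left + left_min, box.top + top_min, box.width, box.height)


# Hypothetical layout: a 1920x1080 primary display with a second display to its left.
rects = [(0, 0, 1920, 1080), (-1920, 0, 0, 1080)]
found = Box(left=100, top=200, width=32, height=32)  # position inside the screenshot
print(to_desktop_coordinates(found, rects))
# prints Box(left=-1820, top=200, width=32, height=32)
```

With a display placed to the left of the primary one, the result has a negative left value, which is why the raw screenshot position cannot be used directly.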

The third change is to switch the default value of the allScreens argument of the _screenshot_win32 function to True.

# BEGIN multi display support
# def _screenshot_win32(imageFilename=None, region=None, allScreens=False):
def _screenshot_win32(imageFilename=None, region=None, allScreens=True):
# END multi display support
    """
    TODO
    """
    # TODO - Use the winapi to get a screenshot, and compare performance with ImageGrab.grab()
    # https://stackoverflow.com/a/3586280/1893164
    im = ImageGrab.grab(all_screens=allScreens)
    if region is not None:
        assert len(region) == 4, 'region argument must be a tuple of four ints'
        assert isinstance(region[0], int) and isinstance(region[1], int) and isinstance(region[2], int) and isinstance(region[3], int), 'region argument must be a tuple of four ints'
        im = im.crop((region[0], region[1], region[2] + region[0], region[3] + region[1]))
    if imageFilename is not None:
        im.save(imageFilename)
    return im

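That is all that is needed for multi-display support. Running a sample program like the following returns the position of the detected image even in a multi-display environment.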

import pyautogui
from pyautogui import ImageNotFoundException

try:
    position = pyautogui.locateOnScreen("excel.png")
    print(position)

except ImageNotFoundException:
    print("Image not found")
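
To act on the detected image rather than only print it, the returned box can be reduced to a single point. The following is a rough sketch, not something verified in this post: box_center and click_image are hypothetical names, and whether pyautogui.click actually reaches a secondary display depends on PyAutoGUI's own mouse handling, which is assumed here rather than tested.

```python
def box_center(left, top, width, height):
    """Return the center point of a box as integer (x, y)."""
    return left + width // 2, top + height // 2


def click_image(image_path):
    """Hypothetical helper: click the center of the image if it is on screen."""
    # Imported lazily because importing pyautogui requires a desktop session.
    import pyautogui

    try:
        box = pyautogui.locateOnScreen(image_path)
    except pyautogui.ImageNotFoundException:
        return False
    if box is None:
        return False
    x, y = box_center(*box)
    pyautogui.click(x, y)
    return True


# A box on a display to the left of the primary one has a negative left value.
print(box_center(-1820, 200, 32, 32))
# prints (-1804, 216)
```

Calling click_image("excel.png") would then return True if a click was sent and False if the image was not found.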

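Caveats

Because this modifies the package's files directly, you need to test whether anything else is affected. In terms of performance, processing time may also grow in proportion to the number of displays.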
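
One way to limit the side effects of editing the package is to keep the same logic in your own code. The following is a sketch under assumptions, not the approach used in this post: locate_on_all_screens is a hypothetical helper that combines Pillow's ImageGrab.grab(all_screens=True), pyscreeze.locate, and win32api.EnumDisplayMonitors, and it has not been tested against the environment above.

```python
import sys
from collections import namedtuple

# Same field layout as pyscreeze.Box; redefined so this sketch has no import-time dependencies.
Box = namedtuple('Box', 'left top width height')


def locate_on_all_screens(image, **kwargs):
    """Hypothetical helper: locate `image` across all displays on Windows.

    Returns a Box in desktop coordinates, or None if nothing was found.
    """
    if sys.platform != 'win32':
        raise NotImplementedError('this sketch only covers Windows')

    # Imported lazily so the module can still be imported on other platforms.
    import pyscreeze
    import win32api
    from PIL import ImageGrab

    # One screenshot spanning every display.
    screenshot = ImageGrab.grab(all_screens=True)
    try:
        found = pyscreeze.locate(image, screenshot, **kwargs)
    except pyscreeze.ImageNotFoundException:
        return None
    if found is None:
        return None

    # Shift from screenshot coordinates to desktop coordinates.
    rects = [monitor[2] for monitor in win32api.EnumDisplayMonitors()]
    left_min = min(rect[0] for rect in rects)
    top_min = min(rect[1] for rect in rects)
    return Box(found[0] + left_min, found[1] + top_min, found[2], found[3])
```

This leaves site-packages untouched, at the cost of not changing the behavior of pyautogui.locateOnScreen itself.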
