Reading "OpenCV + Kinect" in English

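CV stands for Computer Vision. Back when I was using Kinect via OpenNI, I installed OpenCV 2.x and tried things like face recognition; these days it can even do motion detection on a Raspberry Pi. OpenCV has now had its first major version upgrade in 3.5 years, and since I spotted the word "Kinect" in the release notes, I read through them.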

After almost 3.5 years since groundbreaking 3.0 release, we are glad to present the first stable release in the 4.x line.
OpenCV is now C++11 library and requires C++11-compliant compiler. Minimum required CMake version has been raised to 3.5.1.[..]
New module G-API has been added, it acts as an engine for very efficient graph-based image processing pipelines.
dnn module now includes experimental Vulkan backend and supports networks in ONNX format.

The Release Highlights open with C++11 support. I don't fully understand why, but presumably it matters enough to be listed first. The ChangeLog on GitHub even has a C++11 logo pasted in.

ONNX is a model definition format supported by deep learning frameworks such as TensorFlow and Chainer. Does this mean that, for example, a model trained in TensorFlow can now be used from OpenCV?

The popular Kinect Fusion algorithm has been implemented and optimized for CPU and GPU (OpenCL)
QR code detector and decoder have been added to the objdetect module
Very efficient and yet high-quality DIS dense optical flow algorithm has been moved from opencv_contrib to the video module.

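Kinect is a device that captures body movement for the Microsoft Xbox, but its production ended in 2017. Even so, the Kinect Fusion algorithm, which uses Kinect for 3D capture, lives on in the latest version of OpenCV, released in 2018.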

Back in the day, Microsoft did not release an SDK for Kinect, so we used it through the OpenNI + OpenCV combination. Considering that, it is a curious twist of fate.

Alex Kipman's remark at the time, that "Kinect was not actually hacked," was amusing, so I'll paste it here at the end.

The first thing to talk about is, Kinect was not actually hacked. Hacking would mean that someone got to our algorithms that sit inside of the Xbox and was able to actually use them, which hasn't happened. Or, it means that you put a device between the sensor and the Xbox for means of cheating, which also has not happened. That's what we call hacking, and that's what we have put a ton of work and effort to make sure doesn't actually occur. What has happened is someone wrote an open-source driver for PCs that essentially opens the USB connection, which we didn't protect, by design, and reads the inputs from the sensor. The sensor, again, as I talked earlier, has eyes and ears, and that's a whole bunch of noise that someone needs to take and turn into signal.


