Controlling DeepLab (TensorFlow) Logging

The environment is Google Colaboratory; the same should apply in other environments. The TensorFlow version is 1.14.

Suppressing deprecation warnings

When working with TensorFlow, a large number of deprecation warnings get printed. I know what I'm using, and on Google Colaboratory and the like, updating packages every time is a hassle.
The deprecation warnings can be suppressed by putting the following in each Python script.

# TF 1.x internals: set the per-module warning limit to zero and turn off
# the "... is deprecated" messages themselves.
from tensorflow.python.util import deprecation_wrapper
deprecation_wrapper._PER_MODULE_WARNING_LIMIT = 0
from tensorflow.python.util import deprecation
deprecation._PRINT_DEPRECATION_WARNINGS = False
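Both of these modules are private TensorFlow internals, so a slightly defensive variant may be safer if the same script ever runs on a TensorFlow build without deprecation_wrapper. A sketch (the try/except is an addition, not part of the original snippet):

# Same settings, guarded so a TensorFlow build without deprecation_wrapper
# does not crash the script.
try:
    from tensorflow.python.util import deprecation_wrapper
    deprecation_wrapper._PER_MODULE_WARNING_LIMIT = 0
except ImportError:
    pass

from tensorflow.python.util import deprecation
deprecation._PRINT_DEPRECATION_WARNINGS = False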

I referred to the following topic.

Suppressing the ABSL handler

DeepLab's own log messages sometimes come out in a long-winded format, or end up printed twice.

Example:

I0919 08:30:26.657488 140341343639424 model.py:401] Using dense prediction cell config.

They can be suppressed by setting the level of the ABSL handler above CRITICAL (50) in each Python script.

from absl import logging as _logging

# Raise the ABSL handler's level above CRITICAL (50) so nothing is emitted.
_logging.use_absl_handler()
_logging.get_absl_handler().setLevel(60)

If you want to change the display format rather than suppress the output, set any formatter you like with setFormatter() instead of setLevel().

import logging
from absl import logging as _logging

# Replace the ABSL handler's formatter instead of silencing it.
_logging.use_absl_handler()
handler = _logging.get_absl_handler()
handler.setFormatter(logging.Formatter(logging.BASIC_FORMAT))
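Between the two extremes, the same handler can also be left in place with its level raised to an ordinary threshold, so INFO lines disappear but warnings and errors keep coming through. A sketch of that middle ground:

import logging
from absl import logging as _logging

# Keep the ABSL handler but let only WARNING and above through it.
_logging.use_absl_handler()
_logging.get_absl_handler().setLevel(logging.WARNING)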

Controlling logs from the C++ side

The C++ side mainly logs hardware profiling and library loading.

Example:

2019-09-19 08:30:24.358224: I tensorflow/core/platform/profile_utils/cpu_utils.cc:94] CPU Frequency: 2300000000 Hz
2019-09-24 08:14:26.875219: I tensorflow/stream_executor/platform/default/dso_loader.cc:42] Successfully opened dynamic library libcudart.so.10.0

Setting the environment variable TF_CPP_MIN_LOG_LEVEL to 1 makes only messages above INFO (0) appear; values 1 through 3 correspond to WARNING through FATAL.

%env TF_CPP_MIN_LOG_LEVEL=1
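If the variable is set from inside a Python script instead of a notebook cell, it has to be in the environment before TensorFlow loads its C++ runtime, i.e. before the import. A minimal sketch:

import os

# Must be set before `import tensorflow`, otherwise the C++ side has
# already started logging with the default level.
os.environ['TF_CPP_MIN_LOG_LEVEL'] = '1'

import tensorflow as tf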

Suppressing the Model Analysis Report

The Model Analysis Report printed when running eval.py is enormous and makes the log extremely hard to read.
In each Python script, after import tensorflow as tf, add the following:

# Route all of tfprof's report output (parameter stats, FLOPs, and
# timing/memory) to 'none' so nothing is printed.
tf.contrib.tfprof.model_analyzer.TRAINABLE_VARS_PARAMS_STAT_OPTIONS['output'] = 'none'
tf.contrib.tfprof.model_analyzer.FLOAT_OPS_OPTIONS['output'] = 'none'
tf.contrib.tfprof.model_analyzer.PRINT_ALL_TIMING_MEMORY['output'] = 'none'

Setting these suppresses the report.
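The same three settings can also be written as a loop; this is only a compact rewrite of the lines above, using the same option dicts:

import tensorflow as tf

# Route every tfprof report variant to 'none' (i.e. print nothing).
_analyzer = tf.contrib.tfprof.model_analyzer
for _opts in (_analyzer.TRAINABLE_VARS_PARAMS_STAT_OPTIONS,
              _analyzer.FLOAT_OPS_OPTIONS,
              _analyzer.PRINT_ALL_TIMING_MEMORY):
    _opts['output'] = 'none'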

Example log output

train.py

WARNING:tensorflow:
The TensorFlow contrib module will not be included in TensorFlow 2.0.
For more information, please see:
 * https://github.com/tensorflow/community/blob/master/rfcs/20180907-contrib-sunset.md
 * https://github.com/tensorflow/addons
 * https://github.com/tensorflow/io (for I/O related ops)
If you depend on functionality not listed there, please file an issue.

WARNING:tensorflow:From /content/drive/My Drive/ML/tensorflow/models/research/slim/nets/mobilenet/mobilenet.py:397: The name tf.nn.avg_pool is deprecated. Please use tf.nn.avg_pool2d instead.

INFO:tensorflow:Training on trainval set
INFO:tensorflow:Initializing model from path: datasets/pascal_voc_seg/init_models/deeplabv3_pascal_train_aug/model.ckpt
INFO:tensorflow:Create CheckpointSaverHook.
INFO:tensorflow:Graph was finalized.
2019-09-21 05:40:25.699165: W tensorflow/core/common_runtime/gpu/gpu_bfc_allocator.cc:40] Overriding allow_growth setting because the TF_FORCE_GPU_ALLOW_GROWTH environment variable is set. Original config value was 0.
2019-09-21 05:40:35.173572: W tensorflow/compiler/jit/mark_for_compilation_pass.cc:1412] (One-time warning): Not using XLA:CPU for cluster because envvar TF_XLA_FLAGS=--tf_xla_cpu_global_jit was not set.  If you want XLA:CPU, either set that envvar, or use experimental_jit_scope to enable XLA:CPU.  To confirm that XLA is active, pass --vmodule=xla_compilation_cache=1 (as a proper command-line flag, not via TF_XLA_FLAGS) or set the envvar XLA_FLAGS=--xla_hlo_profile.
INFO:tensorflow:Running local_init_op.
INFO:tensorflow:Done running local_init_op.
INFO:tensorflow:Saving checkpoints for 0 into datasets/pascal_voc_seg/exp/train_on_trainval_set/train/model.ckpt.
2019-09-21 05:41:09.537115: W tensorflow/core/common_runtime/bfc_allocator.cc:237] Allocator (GPU_0_bfc) ran out of memory trying to allocate 3.15GiB with freed_by_count=0. The caller indicates that this is not a failure, but may mean that there could be performance gains if more memory were available.
Total loss is :[0.241005644]
2019-09-21 05:41:09.613825: W tensorflow/core/common_runtime/bfc_allocator.cc:237] Allocator (GPU_0_bfc) ran out of memory trying to allocate 2.65GiB with freed_by_count=0. The caller indicates that this is not a failure, but may mean that there could be performance gains if more memory were available.
2019-09-21 05:41:09.627528: W tensorflow/core/common_runtime/bfc_allocator.cc:237] Allocator (GPU_0_bfc) ran out of memory trying to allocate 2.69GiB with freed_by_count=0. The caller indicates that this is not a failure, but may mean that there could be performance gains if more memory were available.
INFO:tensorflow:Saving checkpoints for 10 into datasets/pascal_voc_seg/exp/train_on_trainval_set/train/model.ckpt.

eval.py

WARNING:tensorflow:
The TensorFlow contrib module will not be included in TensorFlow 2.0.
For more information, please see:
 * https://github.com/tensorflow/community/blob/master/rfcs/20180907-contrib-sunset.md
 * https://github.com/tensorflow/addons
 * https://github.com/tensorflow/io (for I/O related ops)
If you depend on functionality not listed there, please file an issue.

WARNING:tensorflow:From /content/drive/My Drive/ML/tensorflow/models/research/slim/nets/mobilenet/mobilenet.py:397: The name tf.nn.avg_pool is deprecated. Please use tf.nn.avg_pool2d instead.

INFO:tensorflow:Evaluating on val set
INFO:tensorflow:Performing single-scale test.
4 ops no flops stats due to incomplete shapes.
Parsing Inputs...
4 ops no flops stats due to incomplete shapes.
Parsing Inputs...
INFO:tensorflow:Waiting for new checkpoint at datasets/pascal_voc_seg/exp/train_on_trainval_set/train
INFO:tensorflow:Found new checkpoint at datasets/pascal_voc_seg/exp/train_on_trainval_set/train/model.ckpt-10
INFO:tensorflow:Graph was finalized.
2019-09-24 08:35:33.587912: W tensorflow/core/common_runtime/gpu/gpu_bfc_allocator.cc:40] Overriding allow_growth setting because the TF_FORCE_GPU_ALLOW_GROWTH environment variable is set. Original config value was 0.
INFO:tensorflow:Restoring parameters from datasets/pascal_voc_seg/exp/train_on_trainval_set/train/model.ckpt-10
2019-09-24 08:35:36.110165: W tensorflow/compiler/jit/mark_for_compilation_pass.cc:1412] (One-time warning): Not using XLA:CPU for cluster because envvar TF_XLA_FLAGS=--tf_xla_cpu_global_jit was not set.  If you want XLA:CPU, either set that envvar, or use experimental_jit_scope to enable XLA:CPU.  To confirm that XLA is active, pass --vmodule=xla_compilation_cache=1 (as a proper command-line flag, not via TF_XLA_FLAGS) or set the envvar XLA_FLAGS=--xla_hlo_profile.
INFO:tensorflow:Running local_init_op.
INFO:tensorflow:Done running local_init_op.
INFO:tensorflow:Starting evaluation at 2019-09-24-08:35:37


Since I'm using a notebook on Google Colaboratory, I'd rather write these settings in code cells than into the Python scripts themselves; a rough sketch of the cell side follows.
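A sketch of what the code-cell side could look like, assuming the DeepLab scripts are launched as subprocesses with !python: the environment variable is inherited by the subprocess, but the Python-level settings above still have to live in the scripts. The path and flags below are placeholders.

%env TF_CPP_MIN_LOG_LEVEL=1
!python deeplab/train.py   # path and flags are placeholders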
