# Inference and evaluation on the Open Images dataset

This page presents a tutorial for running object detector inference and
evaluation measure computations on the [Open Images
dataset](https://github.com/openimages/dataset), using tools from the
[TensorFlow Object Detection
API](https://github.com/tensorflow/models/tree/master/research/object_detection).
It shows how to download the images and annotations for the validation and test
sets of Open Images; how to package the downloaded data in a format understood
by the Object Detection API; where to find a trained object detector model for
Open Images; how to run inference; and how to compute evaluation measures on the
inferred detections.

Inferred detections will look like the following:

![](img/oid_bus_72e19c28aac34ed8.jpg)
![](img/oid_monkey_3b4168c89cecbc5b.jpg)

On the validation set of Open Images, this tutorial requires 27GB of free disk
space and the inference step takes approximately 9 hours on a single NVIDIA
Tesla P100 GPU. On the test set, the requirements are 75GB and approximately 27
hours, respectively. All other steps require less than two hours in total on
both sets.

## Installing TensorFlow, the Object Detection API, and Google Cloud SDK

Please run through the [installation instructions](installation.md) to install
TensorFlow and all its dependencies. Ensure the Protobuf libraries are compiled
and the library directories are added to `PYTHONPATH`. You will also need to
`pip` install `pandas` and `contextlib2`.

Some of the data used in this tutorial lives in Google Cloud buckets. To access
it, you will have to [install the Google Cloud
SDK](https://cloud.google.com/sdk/downloads) on your workstation or laptop.
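
For reference, the setup condenses to roughly the following commands. This is
only a sketch of the steps described in the installation guide (follow that
guide for the authoritative instructions, and adjust paths to your
environment):

```bash
# From tensorflow/models/research
# Compile the Protobuf libraries used by the Object Detection API.
protoc object_detection/protos/*.proto --python_out=.

# Make the Object Detection API and slim importable.
export PYTHONPATH=$PYTHONPATH:$(pwd):$(pwd)/slim

# Additional Python dependencies used by this tutorial.
pip install pandas contextlib2

# Authenticate the Google Cloud SDK so gsutil can read the image buckets.
gcloud auth login
```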

## Preparing the Open Images validation and test sets

In order to run inference and subsequent evaluation measure computations, we
require a dataset of images and ground truth boxes, packaged as TFRecords of
TFExamples. To create such a dataset for Open Images, you will need to first
download ground truth boxes from the [Open Images
website](https://github.com/openimages/dataset):

```bash
# From tensorflow/models/research
mkdir oid
cd oid
wget https://storage.googleapis.com/openimages/2017_07/annotations_human_bbox_2017_07.tar.gz
tar -xvf annotations_human_bbox_2017_07.tar.gz
```
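
As an optional sanity check, confirm that the per-split annotation CSVs were
extracted where the commands below expect them:

```bash
# From tensorflow/models/research/oid
# The archive should contain one annotations CSV per split.
ls 2017_07/*/annotations-human-bbox.csv
head -1 2017_07/validation/annotations-human-bbox.csv  # column header row
```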

Next, download the images. In this tutorial, we will use lower resolution images
provided by [CVDF](http://www.cvdfoundation.org). Please follow the instructions
on [CVDF's Open Images repository
page](https://github.com/cvdfoundation/open-images-dataset) in order to gain
access to the cloud bucket with the images. Then run:

```bash
# From tensorflow/models/research/oid
SPLIT=validation  # Set SPLIT to "test" to download the images in the test set
mkdir raw_images_${SPLIT}
gsutil -m rsync -r gs://open-images-dataset/$SPLIT raw_images_${SPLIT}
```
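
The download can take a while; once `gsutil` finishes, a quick file count
should match the split sizes listed further below:

```bash
# From tensorflow/models/research/oid
# Expect 41,620 files for the validation split and 125,436 for the test split.
ls raw_images_${SPLIT} | wc -l
```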

Another option for downloading the images is to follow the URLs contained in the
[image URLs and metadata CSV
files](https://storage.googleapis.com/openimages/2017_07/images_2017_07.tar.gz)
on the Open Images website.

At this point, your `tensorflow/models/research/oid` directory should appear as
follows:

```lang-none
|-- 2017_07
|   |-- test
|   |   `-- annotations-human-bbox.csv
|   |-- train
|   |   `-- annotations-human-bbox.csv
|   `-- validation
|       `-- annotations-human-bbox.csv
|-- raw_images_validation (if you downloaded the validation split)
|   `-- ... (41,620 files matching regex "[0-9a-f]{16}.jpg")
|-- raw_images_test (if you downloaded the test split)
|   `-- ... (125,436 files matching regex "[0-9a-f]{16}.jpg")
`-- annotations_human_bbox_2017_07.tar.gz
```

Next, package the data into TFRecords of TFExamples by running:

```bash
# From tensorflow/models/research/oid
SPLIT=validation  # Set SPLIT to "test" to create TFRecords for the test split
mkdir ${SPLIT}_tfrecords

PYTHONPATH=$PYTHONPATH:$(readlink -f ..) \
python -m object_detection.dataset_tools.create_oid_tf_record \
  --input_box_annotations_csv 2017_07/$SPLIT/annotations-human-bbox.csv \
  --input_images_directory raw_images_${SPLIT} \
  --input_label_map ../object_detection/data/oid_bbox_trainable_label_map.pbtxt \
  --output_tf_record_path_prefix ${SPLIT}_tfrecords/$SPLIT.tfrecord \
  --num_shards=100
```

To add image-level labels, use the `--input_image_label_annotations_csv` flag.

This results in 100 TFRecord files (shards), written to
`oid/${SPLIT}_tfrecords`, with filenames matching
`${SPLIT}.tfrecord-000[0-9][0-9]-of-00100`. Each shard contains approximately
the same number of images and is de facto a representative random sample of the
input data. [This enables](#accelerating-inference) a straightforward scheme for
dividing inference work, and also makes it possible to compute approximate
evaluation measures on subsets of the validation and test sets.
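
If you want to spot-check a shard, counting its records only takes a one-liner;
this sketch assumes the TF 1.x API used throughout this tutorial:

```bash
# From tensorflow/models/research/oid
# Count the TFExamples in the first validation shard (expect roughly 1/100 of
# the 41,620 validation images, i.e. about 416 examples).
python -c "
import tensorflow as tf
path = 'validation_tfrecords/validation.tfrecord-00000-of-00100'
print(sum(1 for _ in tf.python_io.tf_record_iterator(path)))
"
```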

## Inferring detections

Inference requires a trained object detection model. In this tutorial we will
use a model from the [detection model zoo](detection_model_zoo.md), which can
be downloaded and unpacked by running the commands below. More information about
the model, such as its architecture and how it was trained, is available on the
[model zoo page](detection_model_zoo.md).

```bash
# From tensorflow/models/research/oid
wget http://download.tensorflow.org/models/object_detection/faster_rcnn_inception_resnet_v2_atrous_oid_14_10_2017.tar.gz
tar -zxvf faster_rcnn_inception_resnet_v2_atrous_oid_14_10_2017.tar.gz
```
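
The archive unpacks into a model directory; before running inference, it is
worth checking that the frozen graph referenced by the command below is in
place:

```bash
# From tensorflow/models/research/oid
# The inference step expects this frozen graph to exist.
ls -lh faster_rcnn_inception_resnet_v2_atrous_oid/frozen_inference_graph.pb
```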

At this point, the data is packed into TFRecords and we have an object detector
model. We can run inference using:

```bash
# From tensorflow/models/research/oid
SPLIT=validation  # or test
TF_RECORD_FILES=$(ls -1 ${SPLIT}_tfrecords/* | tr '\n' ',')

PYTHONPATH=$PYTHONPATH:$(readlink -f ..) \
python -m object_detection.inference.infer_detections \
  --input_tfrecord_paths=$TF_RECORD_FILES \
  --output_tfrecord_path=${SPLIT}_detections.tfrecord-00000-of-00001 \
  --inference_graph=faster_rcnn_inception_resnet_v2_atrous_oid/frozen_inference_graph.pb \
  --discard_image_pixels
```

Inference preserves all fields of the input TFExamples and adds new fields to
store the inferred detections. This allows [computing evaluation
measures](#computing-evaluation-measures) on the output TFRecord alone, as the
ground truth boxes are preserved as well. Since measure computations don't
require access to the images, `infer_detections` can optionally discard them
with the `--discard_image_pixels` flag. Discarding the images drastically
reduces the size of the output TFRecord.
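
If you are curious which fields inference added, you can list the feature keys
of the first output record; a small sketch, again assuming the TF 1.x API:

```bash
# From tensorflow/models/research/oid
SPLIT=validation  # or test
# Print the feature keys of the first record: the ground truth fields from the
# input are still present, alongside the newly added detection fields.
python -c "
import tensorflow as tf
path = '${SPLIT}_detections.tfrecord-00000-of-00001'
example = tf.train.Example()
example.ParseFromString(next(tf.python_io.tf_record_iterator(path)))
print('\n'.join(sorted(example.features.feature)))
"
```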

### Accelerating inference

Running inference on the whole validation or test set can take a long time to
complete due to the large number of images present in these sets (41,620 and
125,436 respectively). For quick but approximate evaluation, inference and the
subsequent measure computations can be run on a small number of shards. For
example, to run on 2% of all the data, it is enough to set `TF_RECORD_FILES` as
shown below before running `infer_detections`:

```bash
TF_RECORD_FILES=$(ls ${SPLIT}_tfrecords/${SPLIT}.tfrecord-0000[0-1]-of-00100 | tr '\n' ',')
```

Please note that computing evaluation measures on a small subset of the data
introduces variance and bias, since some classes of objects won't be seen during
evaluation. In the example above, this leads to an mAP that is 13.2 points
higher on the first two shards of the validation set than on the full set ([see
mAP results](#expected-maps)).

Another way to accelerate inference is to run it in parallel on multiple
TensorFlow devices, possibly on multiple machines. The script below uses
[tmux](https://github.com/tmux/tmux/wiki) to run a separate `infer_detections`
process for each GPU, each on a different partition of the input data.

```bash
# From tensorflow/models/research/oid
SPLIT=validation  # or test
NUM_GPUS=4
NUM_SHARDS=100

tmux new-session -d -s "inference"
function tmux_start { tmux new-window -d -n "inference:GPU$1" "${*:2}; exec bash"; }
for gpu_index in $(seq 0 $(($NUM_GPUS-1))); do
  start_shard=$(( $gpu_index * $NUM_SHARDS / $NUM_GPUS ))
  end_shard=$(( ($gpu_index + 1) * $NUM_SHARDS / $NUM_GPUS - 1 ))
  TF_RECORD_FILES=$(seq -s, -f "${SPLIT}_tfrecords/${SPLIT}.tfrecord-%05.0f-of-$(printf '%05d' $NUM_SHARDS)" $start_shard $end_shard)
  tmux_start ${gpu_index} \
    PYTHONPATH=$PYTHONPATH:$(readlink -f ..) CUDA_VISIBLE_DEVICES=$gpu_index \
    python -m object_detection.inference.infer_detections \
    --input_tfrecord_paths=$TF_RECORD_FILES \
    --output_tfrecord_path=${SPLIT}_detections.tfrecord-$(printf "%05d" $gpu_index)-of-$(printf "%05d" $NUM_GPUS) \
    --inference_graph=faster_rcnn_inception_resnet_v2_atrous_oid/frozen_inference_graph.pb \
    --discard_image_pixels
done
```

After all `infer_detections` processes finish, `tensorflow/models/research/oid`
will contain one output TFRecord from each process, with names matching
`validation_detections.tfrecord-0000[0-3]-of-00004`.

## Computing evaluation measures

To compute evaluation measures on the inferred detections you first need to
create the appropriate configuration files:

```bash
# From tensorflow/models/research/oid
SPLIT=validation  # or test
NUM_SHARDS=1  # Set to NUM_GPUS if using the parallel inference script above

mkdir -p ${SPLIT}_eval_metrics

echo "
label_map_path: '../object_detection/data/oid_bbox_trainable_label_map.pbtxt'
tf_record_input_reader: { input_path: '${SPLIT}_detections.tfrecord@${NUM_SHARDS}' }
" > ${SPLIT}_eval_metrics/${SPLIT}_input_config.pbtxt

echo "
metrics_set: 'oid_V2_detection_metrics'
" > ${SPLIT}_eval_metrics/${SPLIT}_eval_config.pbtxt
```

And then run:

```bash
# From tensorflow/models/research/oid
SPLIT=validation  # or test

PYTHONPATH=$PYTHONPATH:$(readlink -f ..) \
python -m object_detection.metrics.offline_eval_map_corloc \
  --eval_dir=${SPLIT}_eval_metrics \
  --eval_config_path=${SPLIT}_eval_metrics/${SPLIT}_eval_config.pbtxt \
  --input_config_path=${SPLIT}_eval_metrics/${SPLIT}_input_config.pbtxt
```

The first configuration file contains an `object_detection.protos.InputReader`
message that describes the location of the necessary input files. The second
file contains an `object_detection.protos.EvalConfig` message that describes the
evaluation metric. For more information about these protos, see the
corresponding source files.

### Expected mAPs

The result of running `offline_eval_map_corloc` is a CSV file located at
`${SPLIT}_eval_metrics/metrics.csv`. With the above configuration, the file will
contain average precision at IoU≥0.5 for each of the classes present in the
dataset. It will also contain the mAP@IoU≥0.5. Both the per-class average
precisions and the mAP are computed according to the [Open Images evaluation
protocol](evaluation_protocols.md). The expected mAPs for the validation and
test sets of Open Images in this case are:

Set        | Fraction of data | Images  | mAP@IoU≥0.5
---------: | :--------------: | :-----: | -----------
validation | everything       | 41,620  | 39.2%
validation | first 2 shards   | 884     | 52.4%
test       | everything       | 125,436 | 37.7%
test       | first 2 shards   | 2,476   | 50.8%
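
For a quick look at the per-class numbers, one simple option is to pretty-print
the CSV directly in the shell:

```bash
# From tensorflow/models/research/oid
# Display the per-class average precisions and the overall mAP.
column -s, -t < ${SPLIT}_eval_metrics/metrics.csv
```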