# TensorFlow Object Detection API

Creating accurate machine learning models capable of localizing and identifying
multiple objects in a single image remains a core challenge in computer vision.
The TensorFlow Object Detection API is an open source framework built on top of
TensorFlow that makes it easy to construct, train and deploy object detection
models. At Google we’ve certainly found this codebase to be useful for our
computer vision needs, and we hope that you will as well.

<p align="center">
<img src="g3doc/img/kites_detections_output.jpg" width=676 height=450>
</p>

Contributions to the codebase are welcome and we would love to hear back from
you if you find this API useful. Finally, if you use the TensorFlow Object
Detection API for a research publication, please consider citing:

```
"Speed/accuracy trade-offs for modern convolutional object detectors."
Huang J, Rathod V, Sun C, Zhu M, Korattikara A, Fathi A, Fischer I, Wojna Z,
Song Y, Guadarrama S, Murphy K, CVPR 2017
```

\[[link](https://arxiv.org/abs/1611.10012)\]\[[bibtex](
https://scholar.googleusercontent.com/scholar.bib?q=info:l291WsrB-hQJ:scholar.google.com/&output=citation&scisig=AAGBfm0AAAAAWUIIlnPZ_L9jxvPwcC49kDlELtaeIyU-&scisf=4&ct=citation&cd=-1&hl=en&scfhb=1)\]

<p align="center">
<img src="g3doc/img/tf-od-api-logo.png" width=140 height=195>
</p>

## Maintainers

* Jonathan Huang, github: [jch1](https://github.com/jch1)
* Vivek Rathod, github: [tombstone](https://github.com/tombstone)
* Ronny Votel, github: [ronnyvotel](https://github.com/ronnyvotel)
* Derek Chow, github: [derekjchow](https://github.com/derekjchow)
* Chen Sun, github: [jesu9](https://github.com/jesu9)
* Menglong Zhu, github: [dreamdragon](https://github.com/dreamdragon)
* Alireza Fathi, github: [afathi3](https://github.com/afathi3)
* Zhichao Lu, github: [pkulzc](https://github.com/pkulzc)

## Table of contents

Setup:

* <a href='g3doc/installation.md'>Installation</a><br>

Quick Start:

* <a href='object_detection_tutorial.ipynb'>
  Quick Start: Jupyter notebook for off-the-shelf inference</a><br>
* <a href="g3doc/running_pets.md">Quick Start: Training a pet detector</a><br>

Customizing a Pipeline:

* <a href='g3doc/configuring_jobs.md'>
  Configuring an object detection pipeline</a><br>
* <a href='g3doc/preparing_inputs.md'>Preparing inputs</a><br>

Running:

* <a href='g3doc/running_locally.md'>Running locally</a><br>
* <a href='g3doc/running_on_cloud.md'>Running on the cloud</a><br>

Extras:

* <a href='g3doc/detection_model_zoo.md'>TensorFlow detection model zoo</a><br>
* <a href='g3doc/exporting_models.md'>
  Exporting a trained model for inference</a><br>
* <a href='g3doc/tpu_exporters.md'>
  Exporting a trained model for TPU inference</a><br>
* <a href='g3doc/defining_your_own_model.md'>
  Defining your own model architecture</a><br>
* <a href='g3doc/using_your_own_dataset.md'>
  Bringing in your own dataset</a><br>
* <a href='g3doc/evaluation_protocols.md'>
  Supported object detection evaluation protocols</a><br>
* <a href='g3doc/oid_inference_and_evaluation.md'>
  Inference and evaluation on the Open Images dataset</a><br>
* <a href='g3doc/instance_segmentation.md'>
  Run an instance segmentation model</a><br>
* <a href='g3doc/challenge_evaluation.md'>
  Run the evaluation for the Open Images Challenge 2018</a><br>
* <a href='g3doc/tpu_compatibility.md'>
  TPU compatible detection pipelines</a><br>
* <a href='g3doc/running_on_mobile_tensorflowlite.md'>
  Running object detection on mobile devices with TensorFlow Lite</a><br>

## Getting Help

To get help with issues you may encounter using the TensorFlow Object Detection
API, create a new question on [StackOverflow](https://stackoverflow.com/) with
the tags "tensorflow" and "object-detection".

Please report bugs (actually broken code, not usage questions) to the
tensorflow/models GitHub
[issue tracker](https://github.com/tensorflow/models/issues), prefixing the
issue name with "object_detection".

Please check the [FAQ](g3doc/faq.md) for frequently asked questions before
reporting an issue.

## Release information

### Feb 11, 2019

We have released detection models trained on the [Open Images Dataset V4](https://storage.googleapis.com/openimages/web/challenge.html)
in our detection model zoo, including

* Faster R-CNN detector with Inception Resnet V2 feature extractor
* SSD detector with MobileNet V2 feature extractor
* SSD detector with ResNet 101 FPN feature extractor (aka RetinaNet-101)

<b>Thanks to contributors</b>: Alina Kuznetsova, Yinxiao Li

### Sep 17, 2018

We have released Faster R-CNN detectors with ResNet-50 / ResNet-101 feature
extractors trained on the [iNaturalist Species Detection Dataset](https://github.com/visipedia/inat_comp/blob/master/2017/README.md#bounding-boxes).
The models are trained on the training split of the iNaturalist data for 4M
iterations and achieve 55% and 58% mean AP@.5 over 2854 classes, respectively.
For more details, please refer to this [paper](https://arxiv.org/abs/1707.06642).

<b>Thanks to contributors</b>: Chen Sun

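
To make the AP@.5 figure above concrete: under the usual convention, a detection counts as correct when its intersection-over-union (IoU) with a ground-truth box of the same class is at least 0.5. The following is a minimal sketch of that matching test, not the API's evaluation code. The `(ymin, xmin, ymax, xmax)` box layout and the helper names `iou` and `is_match` are assumptions made for illustration.

```python
def iou(box_a, box_b):
    """Intersection-over-union of two boxes given as (ymin, xmin, ymax, xmax).

    The box layout is an assumption for this sketch.
    """
    # Corners of the overlapping rectangle (if any).
    ymin = max(box_a[0], box_b[0])
    xmin = max(box_a[1], box_b[1])
    ymax = min(box_a[2], box_b[2])
    xmax = min(box_a[3], box_b[3])

    # Clamp at zero so disjoint boxes get an empty intersection.
    intersection = max(0.0, ymax - ymin) * max(0.0, xmax - xmin)

    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - intersection
    return intersection / union if union > 0 else 0.0


def is_match(detection, ground_truth, threshold=0.5):
    """True if the detection overlaps the ground truth enough to count at AP@threshold."""
    return iou(detection, ground_truth) >= threshold


if __name__ == "__main__":
    ground_truth = (0.0, 0.0, 2.0, 2.0)
    print(is_match((0.0, 0.0, 2.0, 1.0), ground_truth))  # half of the box is covered
    print(is_match((1.0, 1.0, 3.0, 3.0), ground_truth))  # only a corner overlaps
```

A full AP computation also ranks detections by score and integrates precision over recall; the sketch covers only the per-box matching criterion.
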
### July 13, 2018

There are many new updates in this release, extending the functionality and
capability of the API:

* Moving from slim-based training to [Estimator](https://www.tensorflow.org/api_docs/python/tf/estimator/Estimator)-based
  training.
* Support for [RetinaNet](https://arxiv.org/abs/1708.02002), and a [MobileNet](https://ai.googleblog.com/2017/06/mobilenets-open-source-models-for.html)
  adaptation of RetinaNet.
* A novel SSD-based architecture called the [Pooling Pyramid Network](https://arxiv.org/abs/1807.03284) (PPN).
* Releasing several [TPU](https://cloud.google.com/tpu/)-compatible models.
  These can be found in the `samples/configs/` directory with a comment in the
  pipeline configuration files indicating TPU compatibility.
* Support for quantized training.
* Updated documentation for new binaries, Cloud training, and [TensorFlow Lite](https://www.tensorflow.org/mobile/tflite/).

See also our [expanded announcement blogpost](https://ai.googleblog.com/2018/07/accelerated-training-and-inference-with.html) and accompanying tutorial at the [TensorFlow blog](https://medium.com/tensorflow/training-and-serving-a-realtime-mobile-object-detector-in-30-minutes-with-cloud-tpus-b78971cf1193).

<b>Thanks to contributors</b>: Sara Robinson, Aakanksha Chowdhery, Derek Chow,
Pengchong Jin, Jonathan Huang, Vivek Rathod, Zhichao Lu, Ronny Votel

### June 25, 2018

Additional evaluation tools for the [Open Images Challenge 2018](https://storage.googleapis.com/openimages/web/challenge.html) are out.
Check out our short tutorial on data preparation and running evaluation [here](g3doc/challenge_evaluation.md)!

<b>Thanks to contributors</b>: Alina Kuznetsova

### June 5, 2018

We have released the implementation of evaluation metrics for both tracks of the [Open Images Challenge 2018](https://storage.googleapis.com/openimages/web/challenge.html) as a part of the Object Detection API. See the [evaluation protocols](g3doc/evaluation_protocols.md) for more details.
Additionally, we have released a tool for hierarchical labels expansion for the Open Images Challenge: check out [oid_hierarchical_labels_expansion.py](dataset_tools/oid_hierarchical_labels_expansion.py).

<b>Thanks to contributors</b>: Alina Kuznetsova, Vittorio Ferrari, Jasper Uijlings

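
The idea behind hierarchical label expansion can be sketched as follows: when classes form a hierarchy, an annotation for a class is also treated as an annotation for each of its ancestors. This is only an illustration of the concept under simplifying assumptions, not the interface or implementation of `oid_hierarchical_labels_expansion.py`. The toy hierarchy, the single-parent `parent_of` mapping, and the function names are all hypothetical.

```python
def ancestors(label, parent_of):
    """Return every ancestor of `label`, nearest first.

    `parent_of` maps a class to its parent; root classes have no entry.
    Assumes each class has at most one parent and the hierarchy has no cycles.
    """
    result = []
    while label in parent_of:
        label = parent_of[label]
        result.append(label)
    return result


def expand_labels(labels, parent_of):
    """Add all ancestor classes to a collection of labels (sorted, no duplicates)."""
    expanded = set(labels)
    for label in labels:
        expanded.update(ancestors(label, parent_of))
    return sorted(expanded)


if __name__ == "__main__":
    # Hypothetical hierarchy, for illustration only.
    toy_hierarchy = {"Beagle": "Dog", "Dog": "Animal", "Cat": "Animal"}
    print(expand_labels(["Beagle"], toy_hierarchy))
```

With the toy hierarchy above, a box labelled `Beagle` would also be counted for `Dog` and `Animal`, so a detector is not penalized for predicting a correct but more general class.
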
### April 30, 2018

We have released a Faster R-CNN detector with ResNet-101 feature extractor trained on [AVA](https://research.google.com/ava/) v2.1.
Compared with other commonly used object detectors, it changes the action classification loss function to a per-class sigmoid loss to handle boxes with multiple labels.
The model is trained on the training split of AVA v2.1 for 1.5M iterations and achieves a mean AP of 11.25% over 60 classes on the validation split of AVA v2.1.
For more details, please refer to this [paper](https://arxiv.org/abs/1705.08421).

<b>Thanks to contributors</b>: Chen Sun, David Ross

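
To illustrate why a per-class sigmoid loss suits boxes with multiple labels, the sketch below compares it with a softmax cross-entropy on a box that carries two labels at once. It is a plain-Python illustration, not the released model's loss code. The function names, the example logits, and the choice to split the softmax target evenly between the two labels are assumptions.

```python
import math


def sigmoid_cross_entropy(logits, labels):
    """Sum of independent binary cross-entropies, one per class.

    Each class is scored on its own, so any number of labels can be 1.
    """
    total = 0.0
    for x, z in zip(logits, labels):
        # Numerically stable form of
        # -z * log(sigmoid(x)) - (1 - z) * log(1 - sigmoid(x)).
        total += max(x, 0.0) - x * z + math.log1p(math.exp(-abs(x)))
    return total


def softmax_cross_entropy(logits, target):
    """Cross-entropy against a softmax; `target` is expected to sum to 1."""
    shift = max(logits)
    log_norm = shift + math.log(sum(math.exp(x - shift) for x in logits))
    return -sum(t * (x - log_norm) for x, t in zip(logits, target))


if __name__ == "__main__":
    # A box with two simultaneous labels (classes 0 and 1), and a model
    # that is confident about both and confident against class 2.
    logits = [8.0, 8.0, -8.0]
    print(sigmoid_cross_entropy(logits, [1.0, 1.0, 0.0]))
    print(softmax_cross_entropy(logits, [0.5, 0.5, 0.0]))
```

Because softmax forces the classes to compete for probability mass, its loss on this example cannot drop below log 2 however confident the model is, while the per-class sigmoid loss can approach zero.
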
### April 2, 2018

Supercharge your mobile phones with the next generation mobile object detector!
We are adding support for MobileNet V2 with SSDLite presented in
[MobileNetV2: Inverted Residuals and Linear Bottlenecks](https://arxiv.org/abs/1801.04381).
This model is 35% faster than MobileNet V1 SSD on a Google Pixel phone CPU (200ms vs. 270ms) at the same accuracy.
Along with the model definition, we are also releasing a model checkpoint trained on the COCO dataset.

<b>Thanks to contributors</b>: Menglong Zhu, Mark Sandler, Zhichao Lu, Vivek Rathod, Jonathan Huang

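
The "35% faster" figure appears to be a ratio of latencies (270 / 200 = 1.35) rather than a reduction in time per image. The short calculation below, with hypothetical helper names, shows both readings of the two measurements quoted above.

```python
def speedup_percent(old_ms, new_ms):
    """How much more work per unit time the new model does, in percent."""
    return (old_ms / new_ms - 1.0) * 100.0


def latency_reduction_percent(old_ms, new_ms):
    """How much less time a single inference takes, in percent."""
    return (old_ms - new_ms) / old_ms * 100.0


if __name__ == "__main__":
    old_ms, new_ms = 270.0, 200.0  # latencies quoted in the release note
    print(round(speedup_percent(old_ms, new_ms)))            # 35
    print(round(latency_reduction_percent(old_ms, new_ms)))  # 26
```

In other words, the same measurements correspond to roughly a 26% drop in per-image latency.
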
### February 9, 2018

We now support instance segmentation! In this API update we support a number of instance segmentation models similar to those discussed in the [Mask R-CNN paper](https://arxiv.org/abs/1703.06870). For further details, refer to
[our slides](http://presentations.cocodataset.org/Places17-GMRI.pdf) from the 2017 COCO + Places Workshop.
Refer to the section on [Running an Instance Segmentation Model](g3doc/instance_segmentation.md) for instructions on how to configure a model
that predicts masks in addition to object bounding boxes.

<b>Thanks to contributors</b>: Alireza Fathi, Zhichao Lu, Vivek Rathod, Ronny Votel, Jonathan Huang

### November 17, 2017

As a part of the Open Images V3 release we have released:

* An implementation of the Open Images evaluation metric and the [protocol](g3doc/evaluation_protocols.md#open-images).
* Additional tools to separate inference of detection and evaluation (see [this tutorial](g3doc/oid_inference_and_evaluation.md)).
* A new detection model trained on the Open Images V2 data release (see [Open Images model](g3doc/detection_model_zoo.md#open-images-models)).

See more information on the [Open Images website](https://github.com/openimages/dataset)!

<b>Thanks to contributors</b>: Stefan Popov, Alina Kuznetsova

### November 6, 2017

We have re-released faster versions of our (pre-trained) models in the
<a href='g3doc/detection_model_zoo.md'>model zoo</a>. In addition to what
was available before, we are also adding Faster R-CNN models trained on COCO
with Inception V2 and Resnet-50 feature extractors, as well as a Faster R-CNN
with Resnet-101 model trained on the KITTI dataset.

<b>Thanks to contributors</b>: Jonathan Huang, Vivek Rathod, Derek Chow,
Tal Remez, Chen Sun

### October 31, 2017

We have released a new state-of-the-art model for object detection using
Faster R-CNN with
[NASNet-A image featurization](https://arxiv.org/abs/1707.07012). This
model achieves an mAP of 43.1% on the COCO test-dev dataset,
improving on the best available model in the zoo by 6% in terms
of absolute mAP.

<b>Thanks to contributors</b>: Barret Zoph, Vijay Vasudevan, Jonathon Shlens, Quoc Le

### August 11, 2017

We have released an update to the [Android Detect
demo](https://github.com/tensorflow/tensorflow/tree/master/tensorflow/examples/android)
which will now run models trained using the TensorFlow Object
Detection API on an Android device. By default, it currently runs a
frozen SSD with MobileNet detector trained on COCO, but we encourage
you to try out other detection models!

<b>Thanks to contributors</b>: Jonathan Huang, Andrew Harp

### June 15, 2017

In addition to our base TensorFlow detection model definitions, this
release includes:

* A selection of trainable detection models, including:
  * Single Shot Multibox Detector (SSD) with MobileNet,
  * SSD with Inception V2,
  * Region-Based Fully Convolutional Networks (R-FCN) with Resnet 101,
  * Faster R-CNN with Resnet 101,
  * Faster R-CNN with Inception Resnet v2
* Frozen weights (trained on the COCO dataset) for each of the above models to
  be used for out-of-the-box inference purposes.
* A [Jupyter notebook](object_detection_tutorial.ipynb) for performing
  out-of-the-box inference with one of our released models
* Convenient [local training](g3doc/running_locally.md) scripts as well as
  distributed training and evaluation pipelines via
  [Google Cloud](g3doc/running_on_cloud.md).

<b>Thanks to contributors</b>: Jonathan Huang, Vivek Rathod, Derek Chow,
Chen Sun, Menglong Zhu, Matthew Tang, Anoop Korattikara, Alireza Fathi, Ian Fischer, Zbigniew Wojna, Yang Song, Sergio Guadarrama, Jasper Uijlings,
Viacheslav Kovalevskyi, Kevin Murphy