# Configuring the Object Detection Training Pipeline

## Overview

The TensorFlow Object Detection API uses protobuf files to configure the
training and evaluation process. The schema for the training pipeline can be
found in `object_detection/protos/pipeline.proto`. At a high level, the config
file is split into 5 parts:
1. The `model` configuration. This defines what type of model will be trained
(i.e. meta-architecture, feature extractor).
2. The `train_config`, which determines what parameters should be used to
train the model (i.e. SGD parameters, input preprocessing, and feature
extractor initialization values).
3. The `eval_config`, which determines what set of metrics will be reported
for evaluation.
4. The `train_input_config`, which defines what dataset the model should be
trained on.
5. The `eval_input_config`, which defines what dataset the model will be
evaluated on. Typically this should be different from the training input
dataset.

A skeleton configuration file is shown below:
```
model {
  (... Add model config here...)
}

train_config: {
  (... Add train_config here...)
}

train_input_reader: {
  (... Add train_input configuration here...)
}

eval_config: {
}

eval_input_reader: {
  (... Add eval_input configuration here...)
}
```
## Picking Model Parameters

There are a large number of model parameters to configure. The best settings
will depend on your given application. Faster R-CNN models are better suited
to cases where high accuracy is desired and latency is of lower priority.
Conversely, if processing time is the most important factor, SSD models are
recommended. Read [our paper](https://arxiv.org/abs/1611.10012) for a more
detailed discussion on the speed vs accuracy tradeoff.

To help new users get started, sample model configurations have been provided
in the object_detection/samples/configs folder. The contents of these
configuration files can be pasted into the `model` field of the skeleton
configuration. Note that the `num_classes` field should be changed to match
the number of classes in the dataset being trained on, as in the sketch below.
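For example, pasting a Faster R-CNN sample into the skeleton and adjusting the
class count might begin as follows; the value 37 is a hypothetical
placeholder, and the elided settings come unchanged from the sample file:

```
model {
  faster_rcnn {
    # Hypothetical class count; set this to match your own dataset.
    num_classes: 37
    (... remaining settings copied from the sample config ...)
  }
}
```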
## Defining Inputs

The TensorFlow Object Detection API accepts inputs in the TFRecord file
format. Users must specify the locations of both the training and evaluation
files. Additionally, users should specify a label map, which defines the
mapping between a class id and class name. The label map should be identical
between training and evaluation datasets.

An example input configuration looks as follows:

```
tf_record_input_reader {
  input_path: "/usr/home/username/data/train.record"
}
label_map_path: "/usr/home/username/data/label_map.pbtxt"
```
Users should substitute the `input_path` and `label_map_path` arguments and
insert the input configuration into the `train_input_reader` and
`eval_input_reader` fields in the skeleton configuration. Note that the paths
can also point to Google Cloud Storage buckets (e.g.
"gs://project_bucket/train.record") for use on Google Cloud.
## Configuring the Trainer

The `train_config` defines parts of the training process:

1. Model parameter initialization.
2. Input preprocessing.
3. SGD parameters.

A sample `train_config` is below:
```
batch_size: 1
optimizer {
  momentum_optimizer: {
    learning_rate: {
      manual_step_learning_rate {
        initial_learning_rate: 0.0002
        schedule {
          step: 0
          learning_rate: .0002
        }
        schedule {
          step: 900000
          learning_rate: .00002
        }
        schedule {
          step: 1200000
          learning_rate: .000002
        }
      }
    }
    momentum_optimizer_value: 0.9
  }
  use_moving_average: false
}
fine_tune_checkpoint: "/usr/home/username/tmp/model.ckpt-#####"
from_detection_checkpoint: true
load_all_detection_checkpoint_vars: true
gradient_clipping_by_norm: 10.0
data_augmentation_options {
  random_horizontal_flip {
  }
}
```
### Model Parameter Initialization

While optional, it is highly recommended that users initialize training from
pre-existing checkpoints. Training an object detector from scratch can take
days. To speed up the training process, it is recommended that users re-use
the feature extractor parameters from a pre-existing image classification or
object detection checkpoint. `train_config` provides two fields to specify
pre-existing checkpoints: `fine_tune_checkpoint` and
`from_detection_checkpoint`. `fine_tune_checkpoint` should provide a path to
the pre-existing checkpoint
(e.g. "/usr/home/username/checkpoint/model.ckpt-#####").
`from_detection_checkpoint` is a boolean value. If false, the checkpoint is
assumed to come from an image classification model. Note that starting from a
detection checkpoint will usually result in a faster training job than
starting from a classification checkpoint.

The list of provided checkpoints can be found [here](detection_model_zoo.md).
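As a sketch, initializing from an image classification checkpoint instead
would combine the two fields as follows (the path is a hypothetical
placeholder):

```
fine_tune_checkpoint: "/usr/home/username/classifier/model.ckpt"
# Checkpoint comes from an image classifier, not an object detector.
from_detection_checkpoint: false
```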
### Input Preprocessing

The `data_augmentation_options` in `train_config` can be used to specify how
training data should be modified. This field is optional.
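Multiple augmentations can be applied by repeating the field. A minimal
sketch combining two of the options defined in
`object_detection/protos/preprocessor.proto`:

```
data_augmentation_options {
  random_horizontal_flip {
  }
}
data_augmentation_options {
  ssd_random_crop {
  }
}
```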
### SGD Parameters

The remaining parameters in `train_config` are hyperparameters for gradient
descent. Please note that the optimal learning rates provided in these
configuration files may depend on the specifics of the training setup (e.g.
number of workers, GPU type).
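As an illustration, some of the provided SSD sample configurations use an
RMSProp optimizer with an exponentially decaying learning rate rather than
manual steps; a sketch of that form, with values that are illustrative rather
than tuned recommendations:

```
optimizer {
  rms_prop_optimizer: {
    learning_rate: {
      exponential_decay_learning_rate {
        initial_learning_rate: 0.004
        decay_steps: 800720
        decay_factor: 0.95
      }
    }
    momentum_optimizer_value: 0.9
    decay: 0.9
    epsilon: 1.0
  }
}
```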
## Configuring the Evaluator

The main components to set in `eval_config` are `num_examples` and
`metrics_set`. The parameter `num_examples` indicates the number of batches
(currently of batch size 1) used for an evaluation cycle, and is often the
total size of the evaluation dataset. The parameter `metrics_set` indicates
which metrics to run during evaluation (e.g. `"coco_detection_metrics"`).
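Putting these together, a minimal `eval_config` might look like the
following; the example count of 8000 is a hypothetical evaluation dataset
size:

```
eval_config: {
  num_examples: 8000
  metrics_set: "coco_detection_metrics"
}
```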