
# TPU compatible detection pipelines

[TOC]

The TensorFlow Object Detection API supports TPU training for some models. To
make models TPU compatible you need to make a few tweaks to the model config, as
mentioned below. We also provide several sample configs that you can use as a
template.

## TPU compatibility

### Static shaped tensors

TPU training currently requires all tensors in the TensorFlow graph to have
static shapes. However, most of the sample configs in the Object Detection API
have a few tensors that are dynamically shaped. Fortunately, we provide simple
alternatives in the model configuration that modify these tensors to have
static shapes:

*   **Image tensors with static shape** - This can be achieved either by using a
    `fixed_shape_resizer` that resizes images to a fixed spatial shape or by
    setting `pad_to_max_dimension: true` in `keep_aspect_ratio_resizer`, which
    pads the resized images with zeros at the bottom and right. Padded image
    tensors are correctly handled internally within the model.

    ```
    image_resizer {
      fixed_shape_resizer {
        height: 640
        width: 640
      }
    }
    ```

    or

    ```
    image_resizer {
      keep_aspect_ratio_resizer {
        min_dimension: 640
        max_dimension: 640
        pad_to_max_dimension: true
      }
    }
    ```

*   **Groundtruth tensors with static shape** - Images in a typical detection
    dataset have a variable number of groundtruth boxes and associated classes.
    Setting `max_number_of_boxes` to a large enough number in the
    `train_input_reader` and `eval_input_reader` pads the groundtruth tensors
    with zeros to a static shape. Padded groundtruth tensors are correctly
    handled internally within the model.

    ```
    train_input_reader: {
      tf_record_input_reader {
        input_path: "PATH_TO_BE_CONFIGURED/mscoco_train.record-?????-of-00100"
      }
      label_map_path: "PATH_TO_BE_CONFIGURED/mscoco_label_map.pbtxt"
      max_number_of_boxes: 200
    }

    eval_input_reader: {
      tf_record_input_reader {
        input_path: "PATH_TO_BE_CONFIGURED/mscoco_val.record-?????-of-0010"
      }
      label_map_path: "PATH_TO_BE_CONFIGURED/mscoco_label_map.pbtxt"
      max_number_of_boxes: 200
    }
    ```

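To make the two padding ideas above concrete, here is a minimal pure-Python sketch. It is not the Object Detection API's implementation: the helper names (`resized_shape`, `bottom_right_padding`, `pad_groundtruth`), the rounding rule, and the choice to raise an error when an image has more boxes than `max_number_of_boxes` are all assumptions made for illustration.

```python
def resized_shape(height, width, min_dimension, max_dimension):
    """Aspect-preserving resize target; the rounding rule is an assumption."""
    scale = min_dimension / min(height, width)
    if max(height, width) * scale > max_dimension:
        # The long side would overflow, so fit it to max_dimension instead.
        scale = max_dimension / max(height, width)
    return round(height * scale), round(width * scale)


def bottom_right_padding(height, width, max_dimension):
    """Rows and columns of zeros needed to reach max_dimension x max_dimension."""
    return max_dimension - height, max_dimension - width


def pad_groundtruth(boxes, classes, max_number_of_boxes):
    """Pad per-image groundtruth to a fixed length and report the valid count."""
    if len(boxes) > max_number_of_boxes:
        # Assumption: treat an overflow as a configuration error.
        raise ValueError("max_number_of_boxes is too small for this image")
    num_valid = len(boxes)
    pad = max_number_of_boxes - num_valid
    padded_boxes = [list(b) for b in boxes] + [[0.0] * 4 for _ in range(pad)]
    padded_classes = list(classes) + [0] * pad
    return padded_boxes, padded_classes, num_valid


# A 480x640 image under min_dimension = max_dimension = 640.
h, w = resized_shape(480, 640, min_dimension=640, max_dimension=640)
pad_rows, pad_cols = bottom_right_padding(h, w, 640)

# One real box, padded to the static length used in the config above.
boxes, classes, num_valid = pad_groundtruth(
    [[0.1, 0.2, 0.5, 0.6]], [3], max_number_of_boxes=200)
```

Every image then yields tensors of the same shape, and a count such as `num_valid` lets downstream code ignore the padded entries.
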
### TPU friendly ops

Although TPUs support a vast number of TensorFlow ops, a few used in the
TensorFlow Object Detection API are unsupported. We list such ops below and
recommend compatible substitutes.

*   **Anchor sampling** - Typically we use hard example mining in standard SSD
    pipelines to balance positive and negative anchors that contribute to the
    loss. Hard example mining uses non max suppression as a subroutine, and
    since non max suppression is not currently supported on TPUs, we cannot use
    hard example mining. Fortunately, we provide an implementation of focal
    loss that can be used instead of hard example mining. Remove
    `hard_example_miner` from the config and substitute the `weighted_sigmoid`
    classification loss with the `weighted_sigmoid_focal` loss.

    ```
    loss {
      classification_loss {
        weighted_sigmoid_focal {
          alpha: 0.25
          gamma: 2.0
        }
      }
      localization_loss {
        weighted_smooth_l1 {
        }
      }
      classification_weight: 1.0
      localization_weight: 1.0
    }
    ```

*   **Target Matching** - The Object Detection API provides two choices for the
    matcher used in target assignment: `argmax_matcher` and
    `bipartite_matcher`. The bipartite matcher is not currently supported on
    TPU, so we must modify the configs to use `argmax_matcher`. Additionally,
    set `use_matmul_gather: true` for efficiency on TPU.

    ```
    matcher {
      argmax_matcher {
        matched_threshold: 0.5
        unmatched_threshold: 0.5
        ignore_thresholds: false
        negatives_lower_than_unmatched: true
        force_match_for_each_row: true
        use_matmul_gather: true
      }
    }
    ```

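To illustrate why focal loss can stand in for hard example mining, the sketch below evaluates the standard sigmoid focal loss formula for a single anchor and class, using the `alpha` and `gamma` values from the config above. It is a plain-Python illustration of the formula, not the API's own loss code, and the function names are hypothetical.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def sigmoid_cross_entropy(logit, target):
    """Plain sigmoid cross-entropy for one anchor/class pair (target is 0 or 1)."""
    p = sigmoid(logit)
    p_t = p if target == 1 else 1.0 - p
    return -math.log(p_t)


def sigmoid_focal_loss(logit, target, alpha=0.25, gamma=2.0):
    """Cross-entropy scaled by alpha_t * (1 - p_t) ** gamma."""
    p = sigmoid(logit)
    p_t = p if target == 1 else 1.0 - p
    alpha_t = alpha if target == 1 else 1.0 - alpha
    return alpha_t * (1.0 - p_t) ** gamma * -math.log(p_t)


# An easy negative (confidently background) versus a hard negative.
easy_ce = sigmoid_cross_entropy(-4.0, 0)
easy_focal = sigmoid_focal_loss(-4.0, 0)
hard_ce = sigmoid_cross_entropy(2.0, 0)
hard_focal = sigmoid_focal_loss(2.0, 0)
```

The modulating factor shrinks the loss of the easy negative far more than that of the hard one, so the many easy background anchors no longer dominate the total loss. That is the role hard example mining plays in a standard SSD pipeline.
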
### TPU training hyperparameters

Object Detection training on TPU uses synchronous SGD. On a typical Cloud TPU
with 8 cores, we recommend batch sizes that are 8x larger than those of a GPU
config that uses asynchronous SGD. We also use fewer training steps (~1/100x)
because of the large batch size. This necessitates careful tuning of some other
training parameters, as listed below.

*   **Batch size** - Use the largest batch size that can fit on Cloud TPU.

    ```
    train_config {
      batch_size: 1024
    }
    ```

*   **Training steps** - Typically only tens of thousands.

    ```
    train_config {
      num_steps: 25000
    }
    ```

*   **Batch norm decay** - Use smaller decay constants (0.97 or 0.997) since we
    take fewer training steps.

    ```
    batch_norm {
      scale: true,
      decay: 0.97,
      epsilon: 0.001,
    }
    ```

*   **Learning rate** - Use a large learning rate with warmup. Scale the
    learning rate linearly with batch size. See `cosine_decay_learning_rate` or
    `manual_step_learning_rate` for examples.

    ```
    learning_rate: {
      cosine_decay_learning_rate {
        learning_rate_base: .04
        total_steps: 25000
        warmup_learning_rate: .013333
        warmup_steps: 2000
      }
    }
    ```

    or

    ```
    learning_rate: {
      manual_step_learning_rate {
        warmup: true
        initial_learning_rate: .01333
        schedule {
          step: 2000
          learning_rate: 0.04
        }
        schedule {
          step: 15000
          learning_rate: 0.004
        }
      }
    }
    ```

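The hyperparameter advice above can be sketched numerically. The functions below are illustrative helpers written for this document, not the API's own schedule code. They assume a linear warmup followed by a half-cosine decay to zero, the linear scaling rule for learning rate versus batch size, and the usual rule of thumb that an exponential moving average with decay `d` averages over roughly `1 / (1 - d)` steps. The reference batch size and learning rate passed to `scale_learning_rate` are hypothetical values.

```python
import math


def scale_learning_rate(reference_lr, reference_batch_size, batch_size):
    """Linear scaling rule: the learning rate grows in proportion to batch size."""
    return reference_lr * batch_size / reference_batch_size


def cosine_decay_with_warmup(step, learning_rate_base, total_steps,
                             warmup_learning_rate, warmup_steps):
    """Linear warmup to the base rate, then cosine decay to zero (assumed form)."""
    if step < warmup_steps:
        slope = (learning_rate_base - warmup_learning_rate) / warmup_steps
        return warmup_learning_rate + slope * step
    progress = (step - warmup_steps) / (total_steps - warmup_steps)
    return 0.5 * learning_rate_base * (1.0 + math.cos(math.pi * progress))


def ema_window(decay):
    """Approximate number of steps a moving average with this decay spans."""
    return 1.0 / (1.0 - decay)


# Hypothetical reference point: 0.005 at batch size 128, scaled to 1024.
base_lr = scale_learning_rate(0.005, 128, 1024)

# Values from the cosine_decay_learning_rate config above.
schedule = [
    cosine_decay_with_warmup(step, learning_rate_base=0.04, total_steps=25000,
                             warmup_learning_rate=0.013333, warmup_steps=2000)
    for step in (0, 2000, 13500, 25000)
]

window_small = ema_window(0.97)   # roughly 33 steps
window_large = ema_window(0.997)  # roughly 333 steps
```

With only about 25,000 steps in total, a batch norm moving average that spans a few dozen to a few hundred steps tracks the current statistics more closely than one with a much longer window, which is one way to read the recommendation for smaller decay constants.
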
## Example TPU compatible configs

We provide example config files that you can use to train your own models on
TPU:

*   <a href='https://github.com/tensorflow/models/blob/master/research/object_detection/samples/configs/ssd_mobilenet_v1_300x300_coco14_sync.config'>ssd_mobilenet_v1_300x300</a> <br>
*   <a href='https://github.com/tensorflow/models/blob/master/research/object_detection/samples/configs/ssd_mobilenet_v1_ppn_shared_box_predictor_300x300_coco14_sync.config'>ssd_mobilenet_v1_ppn_300x300</a> <br>
*   <a href='https://github.com/tensorflow/models/blob/master/research/object_detection/samples/configs/ssd_mobilenet_v1_fpn_shared_box_predictor_640x640_coco14_sync.config'>ssd_mobilenet_v1_fpn_640x640 (mobilenet based retinanet)</a> <br>
*   <a href='https://github.com/tensorflow/models/blob/master/research/object_detection/samples/configs/ssd_resnet50_v1_fpn_shared_box_predictor_640x640_coco14_sync.config'>ssd_resnet50_v1_fpn_640x640 (retinanet)</a> <br>

## Supported Meta architectures

Currently, `SSDMetaArch` models are supported on TPUs. `FasterRCNNMetaArch` is
going to be supported soon.