# Preparing Inputs

[TOC]

To use your own dataset in the TensorFlow Object Detection API, you must
convert it into the
[TFRecord file format](https://www.tensorflow.org/api_guides/python/python_io#tfrecords_format_details).
This document outlines how to write a script to generate the TFRecord file.

## Label Maps

Each dataset is required to have a label map associated with it. The label map
defines a mapping from string class names to integer class IDs, and should be a
`StringIntLabelMap` text protobuf. Sample label maps can be found in
`object_detection/data`. Label map IDs should always start from 1.
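
Once a label map is written, it can be loaded at conversion time to look up
class IDs by name. A minimal sketch using the API's `label_map_util` module
(the `cat_label_map.pbtxt` path is hypothetical):

```python
from object_detection.utils import label_map_util

# Hypothetical path; point this at your own label map file.
label_map_path = 'object_detection/data/cat_label_map.pbtxt'

# Returns a dict mapping class names to integer IDs, e.g. {'Cat': 1, 'Dog': 2}.
label_map_dict = label_map_util.get_label_map_dict(label_map_path)
```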

## Dataset Requirements

For every example in your dataset, you should have the following information:

1. An RGB image for the dataset encoded as jpeg or png.
2. A list of bounding boxes for the image. Each bounding box should contain:
    1. Bounding box coordinates (with origin in the top left corner) defined
       by 4 floating point numbers [ymin, xmin, ymax, xmax]. Note that we
       store the _normalized_ coordinates (x / width, y / height) in the
       TFRecord dataset; see the sketch after this list.
    2. The class of the object in the bounding box.
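
For instance, converting a box from absolute pixel coordinates to the
normalized form stored in the TFRecord could look like the sketch below (the
`normalize_box` helper and its values are illustrative, not part of the API):

```python
def normalize_box(ymin, xmin, ymax, xmax, height, width):
  """Converts absolute pixel coordinates to normalized [0, 1] coordinates."""
  return ymin / height, xmin / width, ymax / height, xmax / width

# A hypothetical box on a 1200x1032 image.
print(normalize_box(174.0, 322.0, 761.0, 1062.0, height=1032.0, width=1200.0))
```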

## Example Image

Consider the following image:

![Example Image](img/example_cat.jpg "Example Image")

with the following label map:

```
item {
  id: 1
  name: 'Cat'
}

item {
  id: 2
  name: 'Dog'
}
```

We can generate a tf.Example proto for this image using the following code:

```python
import tensorflow as tf

from object_detection.utils import dataset_util


def create_cat_tf_example(encoded_cat_image_data):
  """Creates a tf.Example proto from sample cat image.

  Args:
    encoded_cat_image_data: The jpeg encoded data of the cat image.

  Returns:
    example: The created tf.Example.
  """

  height = 1032
  width = 1200
  filename = b'example_cat.jpg'
  image_format = b'jpg'

  xmins = [322.0 / 1200.0]
  xmaxs = [1062.0 / 1200.0]
  ymins = [174.0 / 1032.0]
  ymaxs = [761.0 / 1032.0]
  classes_text = [b'Cat']
  classes = [1]

  tf_example = tf.train.Example(features=tf.train.Features(feature={
      'image/height': dataset_util.int64_feature(height),
      'image/width': dataset_util.int64_feature(width),
      'image/filename': dataset_util.bytes_feature(filename),
      'image/source_id': dataset_util.bytes_feature(filename),
      'image/encoded': dataset_util.bytes_feature(encoded_cat_image_data),
      'image/format': dataset_util.bytes_feature(image_format),
      'image/object/bbox/xmin': dataset_util.float_list_feature(xmins),
      'image/object/bbox/xmax': dataset_util.float_list_feature(xmaxs),
      'image/object/bbox/ymin': dataset_util.float_list_feature(ymins),
      'image/object/bbox/ymax': dataset_util.float_list_feature(ymaxs),
      'image/object/class/text': dataset_util.bytes_list_feature(classes_text),
      'image/object/class/label': dataset_util.int64_list_feature(classes),
  }))
  return tf_example
```
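
To write this single example out, the resulting proto can be serialized into a
TFRecord file with the TF 1.x writer. A minimal sketch, assuming the
hypothetical paths below:

```python
# Hypothetical paths; substitute your own.
image_path = 'img/example_cat.jpg'
output_path = '/tmp/cat.record'

with tf.gfile.GFile(image_path, 'rb') as fid:
  encoded_cat_image_data = fid.read()

writer = tf.python_io.TFRecordWriter(output_path)
writer.write(create_cat_tf_example(encoded_cat_image_data).SerializeToString())
writer.close()
```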

## Conversion Script Outline {#conversion-script-outline}

A typical conversion script will look like the following:

```python
import tensorflow as tf

from object_detection.utils import dataset_util


flags = tf.app.flags
flags.DEFINE_string('output_path', '', 'Path to output TFRecord')
FLAGS = flags.FLAGS


def create_tf_example(example):
  # TODO(user): Populate the following variables from your example.
  height = None  # Image height
  width = None  # Image width
  filename = None  # Filename of the image. Empty if image is not from file
  encoded_image_data = None  # Encoded image bytes
  image_format = None  # b'jpeg' or b'png'

  xmins = []  # List of normalized left x coordinates in bounding box
              # (1 per box)
  xmaxs = []  # List of normalized right x coordinates in bounding box
              # (1 per box)
  ymins = []  # List of normalized top y coordinates in bounding box
              # (1 per box)
  ymaxs = []  # List of normalized bottom y coordinates in bounding box
              # (1 per box)
  classes_text = []  # List of string class name of bounding box (1 per box)
  classes = []  # List of integer class id of bounding box (1 per box)

  tf_example = tf.train.Example(features=tf.train.Features(feature={
      'image/height': dataset_util.int64_feature(height),
      'image/width': dataset_util.int64_feature(width),
      'image/filename': dataset_util.bytes_feature(filename),
      'image/source_id': dataset_util.bytes_feature(filename),
      'image/encoded': dataset_util.bytes_feature(encoded_image_data),
      'image/format': dataset_util.bytes_feature(image_format),
      'image/object/bbox/xmin': dataset_util.float_list_feature(xmins),
      'image/object/bbox/xmax': dataset_util.float_list_feature(xmaxs),
      'image/object/bbox/ymin': dataset_util.float_list_feature(ymins),
      'image/object/bbox/ymax': dataset_util.float_list_feature(ymaxs),
      'image/object/class/text': dataset_util.bytes_list_feature(classes_text),
      'image/object/class/label': dataset_util.int64_list_feature(classes),
  }))
  return tf_example


def main(_):
  writer = tf.python_io.TFRecordWriter(FLAGS.output_path)

  # TODO(user): Write code to read in your dataset to examples variable

  for example in examples:
    tf_example = create_tf_example(example)
    writer.write(tf_example.SerializeToString())

  writer.close()


if __name__ == '__main__':
  tf.app.run()
```
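
After the script has run, it can be worth sanity-checking the output before
training. A minimal sketch using the TF 1.x record iterator (the path is
hypothetical):

```python
import tensorflow as tf

# Hypothetical path written by the conversion script above.
record_path = '/tmp/train.record'

for record in tf.python_io.tf_record_iterator(record_path):
  example = tf.train.Example.FromString(record)
  # Print the stored image dimensions as a quick consistency check.
  print(example.features.feature['image/height'].int64_list.value,
        example.features.feature['image/width'].int64_list.value)
```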

Note: You may notice additional fields in some other datasets. They are
currently unused by the API and are optional.

Note: Please refer to the section on [Running an Instance Segmentation
Model](instance_segmentation.md) for instructions on how to configure a model
that predicts masks in addition to object bounding boxes.

## Sharding datasets

When you have more than a few thousand examples, it is beneficial to shard your
dataset into multiple files:

* The tf.data.Dataset API can read input examples in parallel, improving
  throughput.
* The tf.data.Dataset API can shuffle the examples better with sharded files,
  which slightly improves the performance of the model.

Instead of writing all tf.Example protos to a single file as shown in the
[conversion script outline](#conversion-script-outline), use the snippet below.

```python
import contextlib2
from object_detection.dataset_tools import tf_record_creation_util

num_shards = 10
output_filebase = '/path/to/train_dataset.record'

with contextlib2.ExitStack() as tf_record_close_stack:
  output_tfrecords = tf_record_creation_util.open_sharded_output_tfrecords(
      tf_record_close_stack, output_filebase, num_shards)
  for index, example in enumerate(examples):
    tf_example = create_tf_example(example)
    output_shard_index = index % num_shards
    output_tfrecords[output_shard_index].write(tf_example.SerializeToString())
```

This will produce the following output files:

```bash
/path/to/train_dataset.record-00000-of-00010
/path/to/train_dataset.record-00001-of-00010
...
/path/to/train_dataset.record-00009-of-00010
```

which can then be used in the config file as below.

```bash
tf_record_input_reader {
  input_path: "/path/to/train_dataset.record-?????-of-00010"
}
```