Customizing feature decoding

The tfds.decode API lets you override the default feature decoding. The main use case is skipping image decoding for better performance.

Usage examples

Skipping the image decoding

To keep full control over the decoding pipeline, or to apply a filter before the images are decoded (for better performance), you can skip the image decoding entirely. This works with both tfds.features.Image and tfds.features.Video.

import tensorflow as tf
import tensorflow_datasets as tfds

ds = tfds.load('imagenet2012', split='train', decoders={
    'image': tfds.decode.SkipDecoding(),
})

for example in ds.take(1):
  assert example['image'].dtype == tf.string  # Images are not decoded

Filtering/shuffling the dataset before images are decoded

As in the previous example, you can use tfds.decode.SkipDecoding() to insert additional tf.data pipeline customization before the images are decoded. That way, the filtered-out images are never decoded, and you can use a larger shuffle buffer because the buffer holds compact encoded bytes rather than decoded images.

# Load the base dataset without decoding
ds, ds_info = tfds.load(
    'imagenet2012',
    split='train',
    decoders={
        'image': tfds.decode.SkipDecoding(),  # Image won't be decoded here
    },
    as_supervised=True,
    with_info=True,
)
# Apply filter and shuffle
ds = ds.filter(lambda image, label: label != 10)
ds = ds.shuffle(10000)
# Then decode with ds_info.features['image']
ds = ds.map(
    lambda image, label: (ds_info.features['image'].decode_example(image), label))
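The saving from filtering first can be pictured with a small pure-Python sketch. It does not use TFDS or tf.data: `filter_then_decode` and `fake_decode` are hypothetical helpers, and the byte strings stand in for encoded images. It only counts how many times the decoder runs when the filter is applied before decoding.

```python
from typing import Callable, Iterable, Iterator, Tuple

Example = Tuple[bytes, int]  # (encoded image bytes, label)


def filter_then_decode(
    examples: Iterable[Example],
    keep: Callable[[int], bool],
    decode: Callable[[bytes], list],
) -> Iterator[Tuple[list, int]]:
    """Drops unwanted examples first, then decodes only the survivors."""
    for encoded, label in examples:
        if keep(label):
            yield decode(encoded), label


decode_calls = 0


def fake_decode(encoded: bytes) -> list:
    """Hypothetical stand-in for an expensive image decoder."""
    global decode_calls
    decode_calls += 1
    return list(encoded)


# Four encoded "images"; two of them carry the label we want to drop.
raw = [(b"\x01\x02", 3), (b"\x03\x04", 10), (b"\x05\x06", 7), (b"\x07\x08", 10)]
decoded = list(
    filter_then_decode(raw, keep=lambda label: label != 10, decode=fake_decode))

print(len(decoded), decode_calls)
```

Only the two examples that pass the filter reach the decoder, which is the same effect the SkipDecoding pipeline above aims for at dataset scale.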

Cropping and decoding at the same time

To override the default tf.io.decode_image operation, you can create a new tfds.decode.Decoder object using the tfds.decode.make_decoder() decorator.

@tfds.decode.make_decoder()
def decode_example(serialized_image, feature):
  crop_y, crop_x, crop_height, crop_width = 10, 10, 64, 64
  return tf.image.decode_and_crop_jpeg(
      serialized_image,
      [crop_y, crop_x, crop_height, crop_width],
      channels=feature.shape[-1],
  )

ds = tfds.load('imagenet2012', split='train', decoders={
    # The custom decoder replaces the default image decoding
    'image': decode_example(),
})

Which is equivalent to:

def decode_example(serialized_image, feature):
  crop_y, crop_x, crop_height, crop_width = 10, 10, 64, 64
  return tf.image.decode_and_crop_jpeg(
      serialized_image,
      [crop_y, crop_x, crop_height, crop_width],
      channels=feature.shape[-1],
  )

ds, ds_info = tfds.load(
    'imagenet2012',
    split='train',
    with_info=True,
    decoders={
        'image': tfds.decode.SkipDecoding(),  # Skip image decoding
    },
)
# Each example is a dict, so decode only its 'image' entry
ds = ds.map(lambda example: {
    **example,
    'image': decode_example(example['image'], feature=ds_info.features['image']),
})
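The crop window passed to tf.image.decode_and_crop_jpeg is ordered [crop_y, crop_x, crop_height, crop_width]. As a rough illustration of which region that selects, here is a pure-Python sketch on a nested list. The `crop` helper is hypothetical and is not part of TensorFlow; it only mirrors the window semantics on an already-decoded, single-channel image.

```python
from typing import List, Sequence

Image = List[List[int]]  # rows of pixel values (single channel for brevity)


def crop(image: Image, crop_window: Sequence[int]) -> Image:
    """Returns the region selected by [crop_y, crop_x, crop_height, crop_width]."""
    crop_y, crop_x, crop_height, crop_width = crop_window
    rows = image[crop_y:crop_y + crop_height]
    return [row[crop_x:crop_x + crop_width] for row in rows]


# A 4x5 image whose pixel value encodes its position: value = 10 * row + column.
image = [[10 * y + x for x in range(5)] for y in range(4)]

# Start at row 1, column 2, and take 2 rows and 3 columns.
patch = crop(image, [1, 2, 2, 3])
print(patch)
```

With the values used in the examples above, (10, 10, 64, 64), the window starts 10 pixels down and 10 pixels in from the top-left corner and spans 64 by 64 pixels.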

Customizing video decoding

Videos are Sequence(Image()). When you apply a custom decoder, it is applied to each individual frame. This means that image decoders are automatically compatible with video.

@tfds.decode.make_decoder()
def decode_example(serialized_image, feature):
  crop_y, crop_x, crop_height, crop_width = 10, 10, 64, 64
  return tf.image.decode_and_crop_jpeg(
      serialized_image,
      [crop_y, crop_x, crop_height, crop_width],
      channels=feature.feature.shape[-1],
  )

ds = tfds.load('ucf101', split='train', decoders={
    # With video, decoders are applied to individual frames
    'video': decode_example(),
})

Which is equivalent to:

def decode_frame(serialized_image):
  """Decodes a single frame."""
  crop_y, crop_x, crop_height, crop_width = 10, 10, 64, 64
  return tf.image.decode_and_crop_jpeg(
      serialized_image,
      [crop_y, crop_x, crop_height, crop_width],
      channels=ds_info.features['video'].shape[-1],
  )


def decode_video(example):
  """Decodes all individual frames of the video."""
  video = example['video']
  video = tf.map_fn(
      decode_frame,
      video,
      dtype=ds_info.features['video'].dtype,
      parallel_iterations=10,
  )
  example['video'] = video
  return example


ds, ds_info = tfds.load('ucf101', split='train', with_info=True, decoders={
    'video': tfds.decode.SkipDecoding(),  # Skip frame decoding
})
ds = ds.map(decode_video)  # Decode the video

Decoding only a subset of the features

You can also skip some features entirely by specifying only the features you need. All other features are ignored.

builder = tfds.builder('my_dataset')
builder.as_dataset(split='train', decoders=tfds.decode.PartialDecoding({
    'image': True,
    'metadata': {'num_objects', 'scene_name'},
    'objects': {'label'},
}))

TFDS selects the subset of builder.info.features that matches the given tfds.decode.PartialDecoding structure.

In the code above, the features are implicitly extracted to match builder.info.features. It is also possible to define the features explicitly. The code above is equivalent to:

builder = tfds.builder('my_dataset')
builder.as_dataset(split='train', decoders=tfds.decode.PartialDecoding({
    'image': tfds.features.Image(),
    'metadata': {
        'num_objects': tf.int64,
        'scene_name': tfds.features.Text(),
    },
    'objects': tfds.features.Sequence({
        'label': tfds.features.ClassLabel(names=[]),
    }),
}))

The original metadata (label names, image shape, ...) is automatically reused, so you do not need to provide it.

tfds.decode.SkipDecoding can be passed to tfds.decode.PartialDecoding through the PartialDecoding(..., decoders={}) kwargs.
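To picture how a partial spec picks a subset out of a nested feature structure, here is a simplified pure-Python sketch. It is not how TFDS implements PartialDecoding: `select_features` is a hypothetical helper, plain strings stand in for real feature objects, the `author`, `bbox` and `id` entries are invented for illustration, and sequences are treated as plain dicts.

```python
from typing import Any, Dict, Mapping


def select_features(
    features: Mapping[str, Any], spec: Mapping[str, Any]
) -> Dict[str, Any]:
    """Keeps only the entries of `features` named in `spec`, recursing into dicts."""
    selected: Dict[str, Any] = {}
    for name, sub_spec in spec.items():
        feature = features[name]
        if sub_spec is True:
            selected[name] = feature  # Keep the whole feature
        elif isinstance(sub_spec, (set, frozenset)):
            # A set names the sub-features to keep
            selected[name] = {key: feature[key] for key in sorted(sub_spec)}
        else:
            selected[name] = select_features(feature, sub_spec)
    return selected


# Hypothetical feature structure; strings stand in for real feature objects.
all_features = {
    'image': 'Image',
    'metadata': {'num_objects': 'int64', 'scene_name': 'Text', 'author': 'Text'},
    'objects': {'label': 'ClassLabel', 'bbox': 'BBox'},
    'id': 'Text',
}

subset = select_features(all_features, {
    'image': True,
    'metadata': {'num_objects', 'scene_name'},
    'objects': {'label'},
})
print(subset)
```

Everything not named in the spec (`author`, `bbox` and `id` in this sketch) is dropped, which matches the behavior described for features left out of the PartialDecoding structure.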