tf.data.experimental.SqlDataset

A Dataset consisting of the results from a SQL query.

driver_name A 0-D tf.string tensor containing the database type. Currently, the only supported value is 'sqlite'.
data_source_name A 0-D tf.string tensor containing a connection string to connect to the database.
query A 0-D tf.string tensor containing the SQL query to execute.
output_types A tuple of tf.DType objects representing the types of the columns returned by query.

element_spec The type specification of an element of this dataset.
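For illustration, a minimal sketch of constructing and iterating a SqlDataset (the database path, table name, and column names below are hypothetical):

import tensorflow as tf

# Hypothetical SQLite file containing a `people(name TEXT, age INTEGER)` table.
dataset = tf.data.experimental.SqlDataset(
    driver_name="sqlite",
    data_source_name="/path/to/people.sqlite3",
    query="SELECT name, age FROM people",
    output_types=(tf.string, tf.int32))

print(dataset.element_spec)  # (TensorSpec(shape=(), dtype=tf.string, ...), TensorSpec(shape=(), dtype=tf.int32, ...))
for name, age in dataset.take(3):
  print(name.numpy(), age.numpy())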



apply(transformation_func)

Applies a transformation function to this dataset.

apply enables chaining of custom Dataset transformations, which are represented as functions that take one Dataset argument and return a transformed Dataset.

For example:

dataset = (dataset.map(lambda x: x ** 2)
           .apply(group_by_window(key_func, reduce_func, window_size))
           .map(lambda x: x ** 3))

transformation_func A function that takes one Dataset argument and returns a Dataset.

Dataset The Dataset returned by applying transformation_func to this dataset.
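As a runnable sketch of a custom transformation (the helper filter_even below is made up for this example):

import tensorflow as tf

def filter_even(ds):
  # A transformation function: takes one Dataset and returns a Dataset.
  return ds.filter(lambda x: tf.math.equal(x % 2, 0))

dataset = tf.data.Dataset.range(10).apply(filter_even)
# ==> [ 0, 2, 4, 6, 8 ]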


batch(batch_size, drop_remainder=False)

Combines consecutive elements of this dataset into batches.

The components of the resulting element will have an additional outer dimension, which will be batch_size (or N % batch_size for the last element if batch_size does not divide the number of input elements N evenly and drop_remainder is False). If your program depends on the batches having the same outer dimension, you should set the drop_remainder argument to True to prevent the smaller batch from being produced.

batch_size A tf.int64 scalar tf.Tensor, representing the number of consecutive elements of this dataset to combine in a single batch.
drop_remainder (Optional.) A tf.bool scalar tf.Tensor, representing whether the last batch should be dropped in the case it has fewer than batch_size elements; the default behavior is not to drop the smaller batch.

Dataset A Dataset.
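A short sketch of the effect of drop_remainder, assuming eager execution:

import tensorflow as tf

dataset = tf.data.Dataset.range(8)

list(dataset.batch(3).as_numpy_iterator())
# ==> [array([0, 1, 2]), array([3, 4, 5]), array([6, 7])]

list(dataset.batch(3, drop_remainder=True).as_numpy_iterator())
# ==> [array([0, 1, 2]), array([3, 4, 5])]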


cache(filename='')

Caches the elements in this dataset.

filename A tf.string scalar tf.Tensor, representing the name of a directory on the filesystem to use for caching elements in this Dataset. If a filename is not provided, the dataset will be cached in memory.

Dataset A Dataset.
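A brief sketch of both caching modes (the on-disk path below is hypothetical):

import tensorflow as tf

dataset = tf.data.Dataset.range(5).map(lambda x: x * 2)

# In-memory cache: upstream transformations run only during the first iteration.
in_memory = dataset.cache()

# On-disk cache: pass a filename prefix instead (hypothetical path).
on_disk = dataset.cache("/tmp/range_times_two_cache")

for x in in_memory:  # first pass computes elements and fills the cache
  pass
for x in in_memory:  # subsequent passes read from the cache
  pass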


concatenate(dataset)

Creates a Dataset by concatenating the given dataset with this dataset.

a = Dataset.range(1, 4)  # ==> [ 1, 2, 3 ]
b = Dataset.range(4, 8)  # ==> [ 4, 5, 6, 7 ]

# The input dataset and dataset to be concatenated should have the same
# nested structures and output types.
# c = Dataset.range(8, 14).batch(2)  # ==> [ [8, 9], [10, 11], [12, 13] ]
# d = Dataset.from_tensor_slices([14.0, 15.0, 16.0])
# a.concatenate(c) and a.concatenate(d) would result in error.

a.concatenate(b)  # ==> [ 1, 2, 3, 4, 5, 6, 7 ]

dataset Dataset to be concatenated.

Dataset A Dataset.


enumerate(start=0)

Enumerates the elements of this dataset.

It is similar to Python's enumerate.

For example:

# NOTE: The following examples use `{ ... }` to represent the
# contents of a dataset.
a = { 1, 2, 3 }
b = { (7, 8), (9, 10) }

# The nested structure of the `datasets` argument determines the
# structure of elements in the resulting dataset.
a.enumerate(start=5) == { (5, 1), (6, 2), (7, 3) }
b.enumerate() == { (0, (7, 8)), (1, (9, 10)) }

start A tf.int64 scalar tf.Tensor, representing the start value for enumeration.

Dataset A Dataset.


filter(predicate)

Filters this dataset according to predicate.

d = tf.data.Dataset.from_tensor_slices([1, 2, 3])

d = d.filter(lambda x: x < 3)  # ==> [1, 2]

# `tf.math.equal(x, y)` is required for equality comparison
def filter_fn(x):
  return tf.math.equal(x, 1)

d = d.filter(filter_fn)  # ==> [1]

predicate A function mapping a dataset element to a boolean.

Dataset The Dataset containing the elements of this dataset for which predicate is True.


flat_map(map_func)

Maps map_func across this dataset and flattens the result.

Use flat_map if you want to make sure that the order of your dataset stays the same. For example, to flatten a dataset of batches into a dataset of their elements:

a = Dataset.from_tensor_slices([ [1, 2, 3], [4, 5, 6], [7, 8, 9] ])

a.flat_map(lambda x: Dataset.from_tensor_slices(x + 1))  # ==>
#  [ 2, 3, 4, 5, 6, 7, 8, 9, 10 ] is a generalization of flat_map, since flat_map produces the same output as

map_func A function mapping a dataset element to a dataset.

Dataset A Dataset.


from_generator(generator, output_types, output_shapes=None, args=None)

Creates a Dataset whose elements are generated by generator.

The generator argument must be a callable object that returns an object that supports the iter() protocol (e.g. a generator function). The elements generated by generator must be compatible with the given output_types and (optional) output_shapes arguments.

For example:

import itertools

def gen():
  for i in itertools.count(1):
    yield (i, [1] * i)

ds = tf.data.Dataset.from_generator(
    gen, (tf.int64, tf.int64), (tf.TensorShape([]), tf.TensorShape([None])))

for value in ds.take(2):
  print(value)
# (1, array([1]))
# (2, array([1, 1]))

generator A callable object that returns an object that supports the iter() protocol. If args is not specified, generator must take no arguments; otherwise it must take as many arguments as there are values in args.
output_types A nested structure of tf.DType objects corresponding to each component of an element yielded by generator.
output_shapes (Optional.) A nested structure of tf.TensorShape objects corresponding to each component of an element yielded by generator.
args (Optional.) A tuple of tf.Tensor objects that will be evaluated and passed to generator as NumPy-array arguments.

Dataset A Dataset.
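The args parameter forwards tensors to the generator as NumPy values; a minimal sketch:

import tensorflow as tf

def bounded_gen(limit):
  # `limit` arrives as a NumPy scalar because it was passed through `args`.
  for i in range(limit):
    yield i

ds = tf.data.Dataset.from_generator(
    bounded_gen, output_types=tf.int64, output_shapes=tf.TensorShape([]),
    args=(5,))
# ==> 0, 1, 2, 3, 4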


from_tensor_slices(tensors)

Creates a Dataset whose elements are slices of the given tensors.

Note that if tensors contains a NumPy array, and eager execution is not enabled, the values will be embedded in the graph as one or more tf.constant operations. For large datasets (> 1 GB), this can waste memory and run into byte limits of graph serialization. If tensors contains one or more large NumPy arrays, consider the alternative described in this guide.

tensors A dataset element, with each component having the same size in the 0th dimension.

Dataset A Dataset.
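A few short sketches of how slicing works along the 0th dimension:

import tensorflow as tf

# A 1-D tensor yields scalar elements.
tf.data.Dataset.from_tensor_slices([1, 2, 3])           # ==> 1, 2, 3

# A 2-D tensor yields its rows.
tf.data.Dataset.from_tensor_slices([[1, 2], [3, 4]])    # ==> [1, 2], [3, 4]

# A dict is sliced per component, preserving the keys.
tf.data.Dataset.from_tensor_slices(
    {"feature": [1.0, 2.0], "label": [0, 1]})
# ==> {'feature': 1.0, 'label': 0}, {'feature': 2.0, 'label': 1}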