Tuesday, 27 November 2018

TensorFlow -- .tfrecord Files

### Own - Conda venv --- dc_info_venv
# Source --- https://medium.com/mostly-ai/tensorflow-records-what-they-are-and-how-to-use-them-c46bc4bbb564
### main Source --- https://www.tensorflow.org/guide/

# 
import tensorflow as tf
#from tf.keras import layers ### Fails - `tf` is only an alias here, not an importable package name (TF version == 1.5.0)

import math
import numpy as np
import h5py
import matplotlib.pyplot as plt
from tensorflow.python.framework import ops
#from tf_utils import load_dataset, random_mini_batches, convert_to_one_hot, predict

%matplotlib inline
np.random.seed(1)
#
print(tf.VERSION)
print(tf.keras.__version__)
import keras
print('Keras: {}'.format(keras.__version__))
1.5.0
2.1.2-tf
Keras: 2.2.4
Using TensorFlow backend.
In [2]:
# Source --- https://medium.com/mostly-ai/tensorflow-records-what-they-are-and-how-to-use-them-c46bc4bbb564

"""
A TFRecord file stores your data as a sequence of binary strings.
This means you need to specify the structure of your data before you write it to the file.
TensorFlow provides two components for this purpose:

tf.train.Example and 
tf.train.SequenceExample. 

You have to store each sample of your data in one of these structures, 
then serialize it and use a tf.python_io.TFRecordWriter to write it to disk.
"""

## DHANKAR --- FATT --- Some other sources mention getting IMAGES in as NUMPY ARRAYS ?
## SOURCE ---- https://www.tensorflow.org/api_docs/python/tf/data


"""
The tf.data API enables you to build complex input pipelines from simple, reusable pieces.
For example, the pipeline for an image model might aggregate data from files in a distributed file system,
apply random perturbations to each image, and merge randomly selected images into a batch for training.

The pipeline for a text model might involve extracting symbols from raw text data, converting
them to embedding identifiers with a lookup table, and batching together sequences
of different lengths.

The tf.data API makes it easy to deal with large amounts of data,
different data formats, and complicated transformations.
"""

### tensor_1 == image_data
### tensor_2 == image_label
"""
A tf.data.Dataset represents a sequence of elements, in which each element contains one or more Tensor objects.
For example, in an image pipeline, an element might be a single training example, with a pair of tensors
representing the image data and a label.
"""

### Dataset.from_tensor_slices()
### Dataset.batch()

"""

    Creating a source (e.g. Dataset.from_tensor_slices()) constructs a dataset from one or more tf.Tensor objects.

    Applying a transformation (e.g. Dataset.batch()) constructs a dataset from one or more tf.data.Dataset objects.

"""
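
As a rough mental model only, the two steps above can be imitated with plain Python lists standing in for tensors. The helpers `from_slices` and `batch` below are hypothetical names made up for this sketch, not TensorFlow functions; they only show how slicing along the first dimension produces elements and how a transformation groups them.

```python
# Plain-Python analogy for Dataset.from_tensor_slices() and Dataset.batch().
# Lists stand in for tensors; these helpers are illustrative, not TF APIs.

def from_slices(*components):
    """Yield one element per index along the first dimension."""
    length = len(components[0])
    if any(len(c) != length for c in components):
        raise ValueError("all components need the same first dimension")
    for i in range(length):
        row = tuple(c[i] for c in components)
        # A single component yields a bare value, several yield a tuple.
        yield row[0] if len(row) == 1 else row


def batch(elements, batch_size):
    """Group consecutive elements into lists of at most batch_size."""
    current = []
    for element in elements:
        current.append(element)
        if len(current) == batch_size:
            yield current
            current = []
    if current:  # final, possibly smaller, batch
        yield current


images = [[0.1, 0.2], [0.3, 0.4], [0.5, 0.6]]   # 3 "images" of 2 values each
labels = [0, 1, 0]

elements = list(from_slices(images, labels))
batches = list(batch(from_slices(images, labels), 2))
print(elements[0])   # first (image, label) pair
print(len(batches))  # number of batches
```

The real Dataset.batch() stacks each component into a tensor with an extra leading dimension; the list of tuples here only shows the grouping.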


### tf.data.Iterator

"""
A tf.data.Iterator provides the main way to extract elements from a dataset. 
The operation returned by Iterator.get_next() yields the next element of a Dataset when executed,
and typically acts as the interface between input pipeline code and your model.

The simplest iterator is a "one-shot iterator", which is associated with a particular Dataset and 
iterates through it once.

For more sophisticated uses, the Iterator.initializer operation enables you to reinitialize 
and parameterize an iterator with different datasets, 
so that you can, for example, 
iterate over training and validation data multiple times in the same program.
"""
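
The difference between the two iterator styles can be sketched with ordinary Python iteration. `ReinitializableIterator` is a hypothetical class written for this note and only mimics the idea of Iterator.initializer; it is not how TensorFlow implements it.

```python
# Plain-Python analogy: one-shot versus re-initializable iteration.
# ReinitializableIterator is a hypothetical class, not a TensorFlow API.

class ReinitializableIterator:
    def __init__(self):
        self._it = None

    def initialize(self, data):
        """Point the iterator at a (possibly different) dataset."""
        self._it = iter(data)

    def get_next(self):
        if self._it is None:
            raise RuntimeError("call initialize() first")
        return next(self._it)   # raises StopIteration when exhausted


train, valid = [1, 2, 3], [10, 20]

one_shot = iter(train)          # bound to one dataset, consumed once
first_pass = list(one_shot)
second_pass = list(one_shot)    # already exhausted, yields nothing

it = ReinitializableIterator()
seen = []
for epoch in range(2):
    for data in (train, valid):
        it.initialize(data)     # switch between training and validation data
        while True:
            try:
                seen.append(it.get_next())
            except StopIteration:
                break
print(seen)
```

In TensorFlow 1.x the end of a dataset is signalled by tf.errors.OutOfRangeError rather than StopIteration.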

### Dataset structure
# --- dataset >> elements >> tf.Tensor -- components >> tf.TensorShape

"""
Dataset structure

A dataset comprises elements that each have the same structure.
An element contains one or more tf.Tensor objects, called components.
Each component has a tf.DType representing the type of elements in the tensor
and a tf.TensorShape representing the (possibly partially specified) static shape of each element.
"""

### PROPERTIES ===>>  Dataset.output_types and Dataset.output_shapes
"""
The Dataset.output_types and Dataset.output_shapes properties
allow you to inspect the inferred types
and shapes of each component of a dataset element.

The nested structure of these properties maps to the structure of an element,
--- which may be a single tensor,
--- a tuple of tensors,
--- or a nested tuple of tensors.
"""
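
To make "the nested structure maps to the structure of an element" concrete, here is a small sketch in plain Python. Nested lists stand in for tensors, and `shape_of` and `structure_shapes` are hypothetical helpers invented for this note; they only imitate what Dataset.output_shapes reports.

```python
# Plain-Python analogy for Dataset.output_shapes: the reported structure
# mirrors the nesting of one element. Nested lists stand in for tensors.

def shape_of(value):
    """Return the shape of a rectangular nested list as a tuple."""
    shape = []
    while isinstance(value, list):
        shape.append(len(value))
        value = value[0] if value else None
    return tuple(shape)


def structure_shapes(element):
    """Map shape_of over tuples and dicts, keeping the nesting."""
    if isinstance(element, tuple):
        return tuple(structure_shapes(e) for e in element)
    if isinstance(element, dict):
        return {k: structure_shapes(v) for k, v in element.items()}
    return shape_of(element)


image = [0.0] * 10          # one component of shape (10,)
score = 0.5                 # scalar component, shape ()
tokens = [0] * 100          # component of shape (100,)

nested = structure_shapes((image, (score, tokens)))
named = structure_shapes({"a": score, "b": tokens})
print(nested)
print(named)
```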

In [4]:
dataset1 = tf.data.Dataset.from_tensor_slices(tf.random_uniform([4, 1000]))
print(dataset1.output_types)  # ==> "tf.float32"
print(dataset1.output_shapes)  # ==> "(1000,)"
#
print(dataset1)
<dtype: 'float32'>
(1000,)
<TensorSliceDataset shapes: (1000,), types: tf.float32>
In [6]:
dataset2 = tf.data.Dataset.from_tensor_slices(
   (tf.random_uniform([4]),
    tf.random_uniform([4, 100], maxval=100, dtype=tf.int32)))
print(dataset2.output_types)  # ==> "(tf.float32, tf.int32)"
print(dataset2.output_shapes)  # ==> "((), (100,))"
#
print(dataset2)
(tf.float32, tf.int32)
(TensorShape([]), TensorShape([Dimension(100)]))
<TensorSliceDataset shapes: ((), (100,)), types: (tf.float32, tf.int32)>
In [5]:
dataset3 = tf.data.Dataset.zip((dataset1, dataset2))
print(dataset3.output_types)  # ==> (tf.float32, (tf.float32, tf.int32))
print(dataset3.output_shapes)  # ==> "(10, ((), (100,)))" when dataset1 has shape (10,), as in the run recorded below
(tf.float32, (tf.float32, tf.int32))
(TensorShape([Dimension(10)]), (TensorShape([]), TensorShape([Dimension(100)])))
In [ ]:
"""
It is often convenient to give names to each component of an element, 
for example if they represent different features of a training example. 

In addition to tuples, you can use collections.namedtuple or a dictionary mapping strings to tensors 
to represent a single element of a Dataset.
"""
In [9]:
### Official 
dataset = tf.data.Dataset.from_tensor_slices(
   {"a": tf.random_uniform([4]),
    "b": tf.random_uniform([4, 100], maxval=100, dtype=tf.int32)})
print(dataset.output_types)  # ==> "{'a': tf.float32, 'b': tf.int32}"
print(dataset.output_shapes)  # ==> "{'a': (), 'b': (100,)}"
{'b': tf.int32, 'a': tf.float32}
{'b': TensorShape([Dimension(100)]), 'a': TensorShape([])}
In [12]:
## DHANKAR ---

dataset_11 = tf.data.Dataset.from_tensor_slices(
   {
    "a": tf.random_uniform([4, 500], maxval=1000, dtype=tf.int32),
    "b": tf.random_uniform([4, 100], maxval=100, dtype=tf.int32)
    }
      )

print(dataset_11.output_types)  # ==> "{'a': tf.int32, 'b': tf.int32}"
print(dataset_11.output_shapes)  # ==> "{'a': (500,), 'b': (100,)}"
{'b': tf.int32, 'a': tf.int32}
{'b': TensorShape([Dimension(100)]), 'a': TensorShape([Dimension(500)])}
In [13]:
### CSV Uploads --ERROR --- FATT 
# latest version of TF == has the CSV Func 
## Documentation for version --- 1.12 
# https://www.tensorflow.org/api_docs/python/tf/contrib/data/CsvDataset

#v-1.6.0 --- Has Experimental --
##- tensorflow/tensorflow/python/data/experimental/benchmarks/csv_dataset_benchmark.py


# Right now using v1.5.0, which does not have CsvDataset.
# /a6_18/OwnFork_TensorFlow/tensorflow/tensorflow/contrib/data/python/ops/readers.py


# Creates a dataset that reads all of the records from the listed CSV file,
# which is expected to have eight float columns
filenames = ["/media/dhankar/Dhankar_1/a6_18/Tensors_et_al/date_fmts.csv"]

record_defaults = [tf.float32] * 8   # Eight required float columns
dataset = tf.contrib.data.CsvDataset(filenames, record_defaults)
---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
<ipython-input-13-31dd95f298fc> in <module>
      5 filenames = ["/media/dhankar/Dhankar_1/a6_18/Tensors_et_al/date_fmts.csv"]
      6 record_defaults = [tf.float32] * 8   # Eight required float columns
----> 7 dataset = tf.contrib.data.CsvDataset(filenames, record_defaults)

AttributeError: module 'tensorflow.contrib.data' has no attribute 'CsvDataset'
In [ ]:
##FATT --- CSV OnHold
# Source --- https://medium.com/mostly-ai/tensorflow-records-what-they-are-and-how-to-use-them-c46bc4bbb564

## https://github.com/tensorflow/tensorflow/blob/r1.5/tensorflow/core/example/example.proto

"""
If your dataset consists of features, where each feature is a list of values of the same type,
tf.train.Example is the right component to use.

We have a number of features,
each being a list where every entry has the same data type.
In order to store these features in a TFRecord,
we first need to create the lists that constitute the features.

tf.train.BytesList
tf.train.FloatList
tf.train.Int64List 

are at the core of a tf.train.Feature. 

All three have a single attribute, value, which expects a list of the respective type:
--- bytes,
--- float,
--- int.

"""

### tf.train.Feature
"""
tf.train.Feature wraps a list of data of a specific type so TensorFlow can understand it.
It has a single attribute, which is a union of bytes_list/float_list/int64_list.
Being a union, the stored list can be of type
--- tf.train.BytesList (attribute name bytes_list),
--- tf.train.FloatList (attribute name float_list),
--- tf.train.Int64List (attribute name int64_list).

tf.train.Features (plural) is a collection of named features.
It has a single attribute, feature, that expects a dictionary where the key is the name of the feature
and the value is a tf.train.Feature.

"""
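
Since a tf.train.Feature holds exactly one of the three list types, it can help to decide up front which field a given Python value belongs in. The sketch below is a hypothetical helper (`feature_kind` is not a TensorFlow function); it returns only the field name and builds no protocol buffer. The sample values are borrowed from the example data used in this post.

```python
# Sketch: decide which of the three list fields a Python value would go into.
# feature_kind is a hypothetical helper; it only returns the field name.

def feature_kind(value):
    values = value if isinstance(value, (list, tuple)) else [value]
    if not values:
        raise ValueError("cannot infer a type from an empty list")
    if all(isinstance(v, (bytes, str)) for v in values):
        return "bytes_list"    # str must be encoded to bytes before storing
    if all(isinstance(v, int) for v in values):
        return "int64_list"
    if all(isinstance(v, (int, float)) for v in values):
        return "float_list"
    raise TypeError("mixed or unsupported types: {!r}".format(values))


sample = {
    'Age': 29,
    'Movie': ['The Shawshank Redemption', 'Fight Club'],
    'Movie Ratings': [9.0, 9.7],
}
kinds = {name: feature_kind(v) for name, v in sample.items()}
print(kinds)
```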


"""
In our example, each TFRecord represents the movie ratings and corresponding suggestions 
of a single user (a single sample). 
Writing recommendations for all users in the dataset follows the same process. 
It is important that the type of a feature (e.g. float for the movie rating) is the same across all samples 
in the dataset. 
This conformance criterion and others are defined in the protocol buffer definition of tf.train.Example.
"""
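
One way to catch type mismatches before any Example is built is a quick pass over the raw samples. This is only a sketch on plain Python dictionaries; `find_type_conflicts` is a hypothetical helper, and the second sample is made-up data used to trigger a conflict.

```python
# Sketch: check that every feature keeps one Python type across samples.
# find_type_conflicts is a hypothetical helper, not part of TensorFlow.

def find_type_conflicts(samples):
    """Return {feature name: sorted type names} for features with mixed types."""
    seen = {}
    for sample in samples:
        for name, value in sample.items():
            values = value if isinstance(value, list) else [value]
            for v in values:
                seen.setdefault(name, set()).add(type(v).__name__)
    return {n: sorted(t) for n, t in seen.items() if len(t) > 1}


samples = [
    {'Age': 29, 'Movie Ratings': [9.0, 9.7]},
    {'Age': 19, 'Movie Ratings': [8, 7.5]},   # 8 is an int, not a float
]
conflicts = find_type_conflicts(samples)
print(conflicts)
```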
In [5]:
# Create example data
# Source --- https://medium.com/mostly-ai/tensorflow-records-what-they-are-and-how-to-use-them-c46bc4bbb564
data = {
    'Age': 29,
    'Movie': ['The Shawshank Redemption', 'Fight Club'],
    'Movie Ratings': [9.0, 9.7],
    'Suggestion': 'Inception',
    'Suggestion Purchased': 1.0,
    'Purchase Price': 9.99
}

print(data)
{'Suggestion': 'Inception', 'Suggestion Purchased': 1.0, 'Movie': ['The Shawshank Redemption', 'Fight Club'], 'Purchase Price': 9.99, 'Age': 29, 'Movie Ratings': [9.0, 9.7]}
In [5]:
# Create the Example
# Source --- https://medium.com/mostly-ai/tensorflow-records-what-they-are-and-how-to-use-them-c46bc4bbb564

example = tf.train.Example(features=tf.train.Features(feature={
    'Age': tf.train.Feature(
        int64_list=tf.train.Int64List(value=[data['Age']])),
    'Movie': tf.train.Feature(
        bytes_list=tf.train.BytesList(
            value=[m.encode('utf-8') for m in data['Movie']])),
    'Movie Ratings': tf.train.Feature(
        float_list=tf.train.FloatList(value=data['Movie Ratings'])),
    'Suggestion': tf.train.Feature(
        bytes_list=tf.train.BytesList(
            value=[data['Suggestion'].encode('utf-8')])),
    'Suggestion Purchased': tf.train.Feature(
        float_list=tf.train.FloatList(
            value=[data['Suggestion Purchased']])),
    'Purchase Price': tf.train.Feature(
        float_list=tf.train.FloatList(value=[data['Purchase Price']]))
}))

print(example)
features {
  feature {
    key: "Age"
    value {
      int64_list {
        value: 29
      }
    }
  }
  feature {
    key: "Movie"
    value {
      bytes_list {
        value: "The Shawshank Redemption"
        value: "Fight Club"
      }
    }
  }
  feature {
    key: "Movie Ratings"
    value {
      float_list {
        value: 9.0
        value: 9.699999809265137
      }
    }
  }
  feature {
    key: "Purchase Price"
    value {
      float_list {
        value: 9.989999771118164
      }
    }
  }
  feature {
    key: "Suggestion"
    value {
      bytes_list {
        value: "Inception"
      }
    }
  }
  feature {
    key: "Suggestion Purchased"
    value {
      float_list {
        value: 1.0
      }
    }
  }
}
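
The ratings print as 9.699999809265137 and the price as 9.989999771118164 because a FloatList stores 32-bit floats, while Python floats are 64-bit. A minimal sketch using only the standard library reproduces the rounding; `as_float32` is a helper name chosen for this note.

```python
import struct


def as_float32(x):
    """Round a 64-bit Python float to the nearest 32-bit float and back."""
    return struct.unpack('<f', struct.pack('<f', x))[0]


rating = as_float32(9.7)
price = as_float32(9.99)
print(rating)
print(price)
```

Values such as 9.0 or 1.0 are exactly representable in 32 bits, which is why they come back unchanged.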

In [6]:
# Write TFRecord file
with tf.python_io.TFRecordWriter('customer_1.tfrecord') as writer:
    #
    writer.write(example.SerializeToString())
    
In [7]:
# Read and print data:
sess = tf.InteractiveSession()

# Read TFRecord file
reader = tf.TFRecordReader()
filename_queue = tf.train.string_input_producer(['customer_1.tfrecord'])

_, serialized_example = reader.read(filename_queue)

# Define features
read_features = {
    'Age': tf.FixedLenFeature([], dtype=tf.int64),
    'Movie': tf.VarLenFeature(dtype=tf.string),
    'Movie Ratings': tf.VarLenFeature(dtype=tf.float32),
    'Suggestion': tf.FixedLenFeature([], dtype=tf.string),
    'Suggestion Purchased': tf.FixedLenFeature([], dtype=tf.float32),
    'Purchase Price': tf.FixedLenFeature([], dtype=tf.float32)}

# Extract features from serialized data
read_data = tf.parse_single_example(serialized=serialized_example,
                                    features=read_features)

# Many tf.train functions use tf.train.QueueRunner,
# so we need to start it before we read
tf.train.start_queue_runners(sess)

# Print features
for name, tensor in read_data.items():
    print('{}: {}'.format(name, tensor.eval()))
Age: 29
Purchase Price: 9.989999771118164
Movie: SparseTensorValue(indices=array([[0],
       [1]]), values=array([b'The Shawshank Redemption', b'Fight Club'], dtype=object), dense_shape=array([2]))
Suggestion Purchased: 1.0
Movie Ratings: SparseTensorValue(indices=array([[0],
       [1]]), values=array([9. , 9.7], dtype=float32), dense_shape=array([2]))
Suggestion: b'Inception'
In [7]:
# Create example data
data1 = {
    # Context
    'Locale': 'pt_BR',
    'Age': 19,
    'Favorites': ['Majesty Rose', 'Savannah Outen', 'One Direction'],
    # Data
    'Data': [
        {   # Movie 1
            'Movie Name': 'The Shawshank Redemption',
            'Movie Rating': 9.0,
            'Actors': ['Tim Robbins', 'Morgan Freeman']
        },
        {   # Movie 2
            'Movie Name': 'Fight Club',
            'Movie Rating': 9.7,
            'Actors': ['Brad Pitt', 'Edward Norton', 'Helena Bonham Carter']
        }
    ]
}

print(data1)
{'Data': [{'Actors': ['Tim Robbins', 'Morgan Freeman'], 'Movie Rating': 9.0, 'Movie Name': 'The Shawshank Redemption'}, {'Actors': ['Brad Pitt', 'Edward Norton', 'Helena Bonham Carter'], 'Movie Rating': 9.7, 'Movie Name': 'Fight Club'}], 'Age': 19, 'Favorites': ['Majesty Rose', 'Savannah Outen', 'One Direction'], 'Locale': 'pt_BR'}
In [10]:
# Create the context features (short form)
customer = tf.train.Features(feature={
    'Locale': tf.train.Feature(bytes_list=tf.train.BytesList(
        value=[data1['Locale'].encode('utf-8')])),
    'Age': tf.train.Feature(int64_list=tf.train.Int64List(
        value=[data1['Age']])),
    'Favorites': tf.train.Feature(bytes_list=tf.train.BytesList(
        value=[m.encode('utf-8') for m in data1['Favorites']]))
})

# Create sequence data
names_features = []
ratings_features = []
actors_features = []

for movie in data1['Data']:
    # Create each of the features, then add it to the
    # corresponding feature list
    movie_name_feature = tf.train.Feature(
        bytes_list=tf.train.BytesList(
            value=[movie['Movie Name'].encode('utf-8')]))
    names_features.append(movie_name_feature)
    
    movie_rating_feature = tf.train.Feature(
        float_list=tf.train.FloatList(value=[movie['Movie Rating']]))
    ratings_features.append(movie_rating_feature)
                                             
    movie_actors_feature = tf.train.Feature(
        bytes_list=tf.train.BytesList(
            value=[m.encode('utf-8') for m in movie['Actors']]))
    actors_features.append(movie_actors_feature)

movie_names = tf.train.FeatureList(feature=names_features)
movie_ratings = tf.train.FeatureList(feature=ratings_features)
movie_actors = tf.train.FeatureList(feature=actors_features)

movies = tf.train.FeatureLists(feature_list={
    'Movie Names': movie_names,
    'Movie Ratings': movie_ratings,
    'Movie Actors': movie_actors
})

# Create the SequenceExample
example = tf.train.SequenceExample(context=customer, feature_lists=movies)

print(example)
context {
  feature {
    key: "Age"
    value {
      int64_list {
        value: 19
      }
    }
  }
  feature {
    key: "Favorites"
    value {
      bytes_list {
        value: "Majesty Rose"
        value: "Savannah Outen"
        value: "One Direction"
      }
    }
  }
  feature {
    key: "Locale"
    value {
      bytes_list {
        value: "pt_BR"
      }
    }
  }
}
feature_lists {
  feature_list {
    key: "Movie Actors"
    value {
      feature {
        bytes_list {
          value: "Tim Robbins"
          value: "Morgan Freeman"
        }
      }
      feature {
        bytes_list {
          value: "Brad Pitt"
          value: "Edward Norton"
          value: "Helena Bonham Carter"
        }
      }
    }
  }
  feature_list {
    key: "Movie Names"
    value {
      feature {
        bytes_list {
          value: "The Shawshank Redemption"
        }
      }
      feature {
        bytes_list {
          value: "Fight Club"
        }
      }
    }
  }
  feature_list {
    key: "Movie Ratings"
    value {
      feature {
        float_list {
          value: 9.0
        }
      }
      feature {
        float_list {
          value: 9.699999809265137
        }
      }
    }
  }
}

In [11]:
# Write TFRecord file
with tf.python_io.TFRecordWriter('customer_2.tfrecord') as writer:
    writer.write(example.SerializeToString())
In [12]:
# Read and print data:
sess = tf.InteractiveSession()

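# Read TFRecord file
# NOTE: customer_1.tfrecord holds a tf.train.Example, not the SequenceExample
# written to customer_2.tfrecord, so parsing it as a SequenceExample fails below.
reader = tf.TFRecordReader()
filename_queue = tf.train.string_input_producer(['customer_1.tfrecord'])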

_, serialized_example = reader.read(filename_queue)

# Define features
context_features = {
    'Locale': tf.FixedLenFeature([], dtype=tf.string),
    'Age': tf.FixedLenFeature([], dtype=tf.int64),
    'Favorites': tf.VarLenFeature(dtype=tf.string)
}
sequence_features = {
    'Movie Names': tf.FixedLenSequenceFeature([], dtype=tf.string),
    'Movie Ratings': tf.FixedLenSequenceFeature([], dtype=tf.float32),
    'Movie Actors': tf.VarLenFeature(dtype=tf.string)
}

# Extract features from serialized data
context_data, sequence_data = tf.parse_single_sequence_example(
    serialized=serialized_example,
    context_features=context_features,
    sequence_features=sequence_features)

# Many tf.train functions use tf.train.QueueRunner,
# so we need to start it before we read
tf.train.start_queue_runners(sess)

# Print features
print('Context:')
for name, tensor in context_data.items():
    print('{}: {}'.format(name, tensor.eval()))

print('\nData')
for name, tensor in sequence_data.items():
    print('{}: {}'.format(name, tensor.eval()))
Context:
---------------------------------------------------------------------------
InvalidArgumentError                      Traceback (most recent call last)
~/anaconda2/envs/dc_info_venv/lib/python3.5/site-packages/tensorflow/python/client/session.py in _do_call(self, fn, *args)
   1349     try:
-> 1350       return fn(*args)
   1351     except errors.OpError as e:

~/anaconda2/envs/dc_info_venv/lib/python3.5/site-packages/tensorflow/python/client/session.py in _run_fn(session, feed_dict, fetch_list, target_list, options, run_metadata)
   1328                                    feed_dict, fetch_list, target_list,
-> 1329                                    status, run_metadata)
   1330 

~/anaconda2/envs/dc_info_venv/lib/python3.5/site-packages/tensorflow/python/framework/errors_impl.py in __exit__(self, type_arg, value_arg, traceback_arg)
    472             compat.as_text(c_api.TF_Message(self.status.status)),
--> 473             c_api.TF_GetCode(self.status.status))
    474     # Delete the underlying status object from memory otherwise it stays alive

InvalidArgumentError: Name: , Context feature 'Locale' is required but could not be found.
  [[Node: ParseSingleSequenceExample/ParseSingleSequenceExample = ParseSingleSequenceExample[Ncontext_dense=2, Ncontext_sparse=1, Nfeature_list_dense=2, Nfeature_list_sparse=1, Tcontext_dense=[DT_INT64, DT_STRING], context_dense_shapes=[[], []], context_sparse_types=[DT_STRING], feature_list_dense_shapes=[[], []], feature_list_dense_types=[DT_STRING, DT_FLOAT], feature_list_sparse_types=[DT_STRING], _device="/job:localhost/replica:0/task:0/device:CPU:0"](ReaderReadV2:1, ParseSingleSequenceExample/ParseSingleSequenceExample/feature_list_dense_missing_assumed_empty, ParseSingleSequenceExample/ParseSingleSequenceExample/context_sparse_keys_0, ParseSingleSequenceExample/ParseSingleSequenceExample/context_dense_keys_0, ParseSingleSequenceExample/ParseSingleSequenceExample/context_dense_keys_1, ParseSingleSequenceExample/ParseSingleSequenceExample/feature_list_sparse_keys_0, ParseSingleSequenceExample/ParseSingleSequenceExample/feature_list_dense_keys_0, ParseSingleSequenceExample/ParseSingleSequenceExample/feature_list_dense_keys_1, ParseSingleSequenceExample/Const, ParseSingleSequenceExample/Const_1, ParseSingleSequenceExample/ParseSingleSequenceExample/debug_name)]]

In [13]:
#### DHANKAR ---- same code as the previous cell, now reading customer_2.tfrecord
#

# Read and print data:
sess = tf.InteractiveSession()

# Read TFRecord file
reader = tf.TFRecordReader()
filename_queue = tf.train.string_input_producer(['customer_2.tfrecord'])

_, serialized_example = reader.read(filename_queue)

# Define features
context_features = {
    'Locale': tf.FixedLenFeature([], dtype=tf.string),
    'Age': tf.FixedLenFeature([], dtype=tf.int64),
    'Favorites': tf.VarLenFeature(dtype=tf.string)
}
sequence_features = {
    'Movie Names': tf.FixedLenSequenceFeature([], dtype=tf.string),
    'Movie Ratings': tf.FixedLenSequenceFeature([], dtype=tf.float32),
    'Movie Actors': tf.VarLenFeature(dtype=tf.string)
}

# Extract features from serialized data
context_data, sequence_data = tf.parse_single_sequence_example(
    serialized=serialized_example,
    context_features=context_features,
    sequence_features=sequence_features)

# Many tf.train functions use tf.train.QueueRunner,
# so we need to start it before we read
tf.train.start_queue_runners(sess)

# Print features
print('Context:')
for name, tensor in context_data.items():
    print('{}: {}'.format(name, tensor.eval()))

print('\nData')
for name, tensor in sequence_data.items():
    print('{}: {}'.format(name, tensor.eval()))
Context:
Age: 19
Favorites: SparseTensorValue(indices=array([[0],
       [1],
       [2]]), values=array([b'Majesty Rose', b'Savannah Outen', b'One Direction'], dtype=object), dense_shape=array([3]))
Locale: b'pt_BR'

Data
Movie Actors: SparseTensorValue(indices=array([[0, 0],
       [0, 1],
       [1, 0],
       [1, 1],
       [1, 2]]), values=array([b'Tim Robbins', b'Morgan Freeman', b'Brad Pitt', b'Edward Norton',
       b'Helena Bonham Carter'], dtype=object), dense_shape=array([2, 3]))
Movie Names: [b'The Shawshank Redemption' b'Fight Club']
Movie Ratings: [9.  9.7]
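
The VarLenFeature results above come back as SparseTensorValue triples (indices, values, dense_shape). To see what they encode, here is a pure-Python sketch that expands such a triple into a dense nested list. `sparse_to_dense` is a hypothetical helper written for this note, and the fill value `b''` is an assumption; inside a TensorFlow 1.x graph, tf.sparse_tensor_to_dense does the equivalent conversion on tensors.

```python
# Sketch: expand (indices, values, dense_shape) into a dense nested list.
# Missing positions are filled with a default value.

def sparse_to_dense(indices, values, dense_shape, default=b''):
    def build(shape):
        if len(shape) == 1:
            return [default] * shape[0]
        return [build(shape[1:]) for _ in range(shape[0])]

    dense = build(list(dense_shape))
    for index, value in zip(indices, values):
        target = dense
        for i in index[:-1]:      # walk down to the innermost list
            target = target[i]
        target[index[-1]] = value
    return dense


# Values copied from the 'Movie Actors' output above.
indices = [[0, 0], [0, 1], [1, 0], [1, 1], [1, 2]]
values = [b'Tim Robbins', b'Morgan Freeman', b'Brad Pitt',
          b'Edward Norton', b'Helena Bonham Carter']
actors = sparse_to_dense(indices, values, dense_shape=[2, 3])
for row in actors:
    print(row)
```

The first movie has only two actors, so its third slot holds the fill value; that padding is what dense_shape=[2, 3] implies.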