Convert DNA sequences into a one-hot nucleotide encoding.
@tf.function
tfio.genome.sequences_to_onehot(
    sequences
)
Each nucleotide in each sequence is mapped as follows:
A -> [1, 0, 0, 0]
C -> [0, 1, 0, 0]
G -> [0, 0, 1, 0]
T -> [0, 0, 0, 1]
If a sequence contains a character other than A, C, G, or T, that character is
currently mapped to the error one-hot encoding [1, 1, 1, 1].
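The mapping above can be illustrated with a plain Python sketch. This is not the tensorflow-io implementation: `onehot_reference` is a hypothetical helper that uses nested lists to stand in for the ragged output, and it assumes that anything other than an uppercase A, C, G, or T (including lowercase letters) counts as an error character.

```python
# Reference sketch only: mirrors the documented mapping in plain Python.
# `onehot_reference` is a hypothetical helper, not part of tensorflow-io.
ONEHOT = {
    "A": [1, 0, 0, 0],
    "C": [0, 1, 0, 0],
    "G": [0, 0, 1, 0],
    "T": [0, 0, 0, 1],
}

# Encoding used for any character outside the table above.
# Assumption: lowercase letters are treated as errors too.
ERROR = [1, 1, 1, 1]


def onehot_reference(sequences):
    """Encode each string as a list of 4-element one-hot rows.

    Sequences may differ in length, so the result is a nested list with
    one variable-length entry per input sequence (ragged along the
    sequence dimension).
    """
    return [[list(ONEHOT.get(ch, ERROR)) for ch in seq] for seq in sequences]


encoded = onehot_reference(["AC", "GNT"])
for seq, rows in zip(["AC", "GNT"], encoded):
    print(seq, rows)
```

Here the second sequence contains `N`, which falls outside the four nucleotides and so receives the error encoding. The two inputs have different lengths, which is why the real function returns a ragged tensor instead of a dense one.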
Args
sequences: A tf.string tensor where each string represents a DNA sequence.

Returns
tf.RaggedTensor: The output sequences with nucleotides one-hot encoded.