Convert DNA sequences into a one-hot nucleotide encoding.
@tf.function
tfio.genome.sequences_to_onehot( sequences )
Each nucleotide in each sequence is mapped as follows:

- A -> [1, 0, 0, 0]
- C -> [0, 1, 0, 0]
- G -> [0, 0, 1, 0]
- T -> [0, 0, 0, 1]
If a sequence contains a character other than A, C, G, or T, that character is currently mapped to the error one-hot encoding [1, 1, 1, 1].
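The mapping described above can be illustrated with a small pure-Python sketch. This is not the library's implementation: the helper name `sequences_to_onehot_sketch` and the constants are hypothetical, and treating only uppercase letters as valid nucleotides is an assumption made for the example.

```python
# Illustrative sketch of the documented mapping; NOT the tfio implementation.
# All names below are hypothetical and exist only for this example.
ONE_HOT = {
    "A": [1, 0, 0, 0],
    "C": [0, 1, 0, 0],
    "G": [0, 0, 1, 0],
    "T": [0, 0, 0, 1],
}

# Encoding used for any character that is not A, C, G, or T.
# Assumption: lowercase letters are treated as unknown characters here.
ERROR_ENCODING = [1, 1, 1, 1]


def sequences_to_onehot_sketch(sequences):
    """Encode each string as a list of 4-element one-hot vectors."""
    return [
        [list(ONE_HOT.get(base, ERROR_ENCODING)) for base in seq]
        for seq in sequences
    ]


# Sequences of different lengths produce rows of different lengths,
# which is why the real function returns a ragged tensor.
encoded = sequences_to_onehot_sketch(["ACGT", "AN"])
for row in encoded:
    print(row)
```

With TensorFlow I/O installed, the corresponding call would pass a `tf.string` tensor of the same strings to `tfio.genome.sequences_to_onehot`.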
Args | |
---|---|
sequences | A tf.string tensor where each string represents a DNA sequence. |
Returns | |
---|---|
tf.RaggedTensor | The output sequences with nucleotides one-hot encoded. |