Data normalization into FASTA format

A script pipeline that normalizes DNA sequence data and converts it into the standard FASTA format.
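As a rough illustration of what such a normalization step could look like, here is a minimal Python sketch. It is not taken from this pipeline: the function names `normalize_sequence` and `to_fasta`, the 70-character line width, and the choice to accept only IUPAC nucleotide codes are all assumptions made for the example.

```python
import re

# Assumption: accept only IUPAC nucleotide codes (including U and ambiguity codes).
IUPAC_NUCLEOTIDES = set("ACGTUNRYKMSWBDHV")


def normalize_sequence(raw: str) -> str:
    """Return the sequence upper-cased, with whitespace and digits removed."""
    # Drop whitespace and digits (e.g. position numbers copied from a flat file).
    cleaned = re.sub(r"[\s\d]", "", raw).upper()
    if not cleaned:
        raise ValueError("sequence is empty after normalization")
    invalid = set(cleaned) - IUPAC_NUCLEOTIDES
    if invalid:
        raise ValueError(f"unexpected characters: {sorted(invalid)}")
    return cleaned


def to_fasta(seq_id: str, raw: str, width: int = 70) -> str:
    """Format one record: a '>' definition line, then fixed-width sequence lines."""
    seq = normalize_sequence(raw)
    # Slice the sequence into lines of at most `width` characters.
    lines = [seq[i:i + width] for i in range(0, len(seq), width)]
    return f">{seq_id}\n" + "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(to_fasta("Seq1", "acgt 1 acgt\n", width=4), end="")
```

Rejecting unexpected characters instead of silently dropping them is a deliberate choice in this sketch, so that corrupted input is noticed early.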

Examples of properly formatted FASTA definition lines for nucleotide sequences:

>Seq1 [organism=Streptomyces lavendulae] [strain=456A] Streptomyces lavendulae strain 456A mitomycin radical oxidase (mcrA) gene, complete cds.

>ABCD [organism=Plasmodium falciparum] [isolate=ABCD] Plasmodium falciparum isolate ABCD merozoite surface protein 2 (msp2) gene, partial cds.

> [organism=Homo sapiens] [chromosome=17] [map=17q21] [moltype=mRNA] Homo sapiens breast and ovarian cancer susceptibility protein (BRCA1) mRNA, complete cds.
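The definition lines above follow a common layout: a `>` marker, a sequence identifier, bracketed `[name=value]` modifiers, and a free-text title, all on a single line. A minimal Python sketch of assembling such a line is shown here. The helper name `definition_line` and its validation rules are hypothetical and are not part of this pipeline.

```python
from typing import Mapping


def definition_line(seq_id: str, modifiers: Mapping[str, str], title: str = "") -> str:
    """Build a single-line FASTA definition line with bracketed modifiers."""
    # Assumption: the identifier must be non-empty and free of whitespace.
    if not seq_id or any(ch.isspace() for ch in seq_id):
        raise ValueError("sequence identifier must be non-empty without whitespace")
    parts = [f">{seq_id}"]
    # Modifiers are emitted in the order given, e.g. [organism=...] [strain=...].
    parts.extend(f"[{name}={value}]" for name, value in modifiers.items())
    if title:
        parts.append(title)
    line = " ".join(parts)
    # The definition line must not contain line breaks.
    if "\n" in line or "\r" in line:
        raise ValueError("definition line must be a single line")
    return line


if __name__ == "__main__":
    print(definition_line(
        "Seq1",
        {"organism": "Streptomyces lavendulae", "strain": "456A"},
        "Streptomyces lavendulae strain 456A mitomycin radical oxidase (mcrA) gene, complete cds.",
    ))
```

Called with the values of the first example, the helper reproduces that definition line as one unbroken line.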



