A λZAP Express cDNA library was constructed with mRNA obtained from immature miracidia within eggs, hatched miracidia, and sporocysts of Echinostoma paraensei. The library was amplified, and 213 expressed sequence tag (EST) sequences (averaging 466 nucleotides in length) were obtained. The mean percentage of unresolved bases within the EST sequences was 0.4%, ranging from 0 to 4.6%. The 213 ESTs represent 151 unique messages. BLAST (version 2.0.8) analysis showed that 64 unique E. paraensei messages (42.4%) had significant similarities (BLAST E-value ≤ 1e-5), at the deduced amino acid or nucleotide level, with known sequences in the nonredundant GenBank databases or the dbEST database (NCBI). The remaining 57.6% of the unique EST-encoded messages yielded no significant hits. Most of the E. paraensei messages that could be assigned a cellular role based on sequence similarity were involved in gene/protein expression. Several ESTs were most similar to sequences obtained from other trematode species. A total of 22,560 nucleotides present in open reading frames from ESTs that aligned with known sequences was used to determine codon usage for E. paraensei. Analysis of a subset of eight ESTs that contained full-length open reading frames did not reveal a bias in codon usage. The EST sequences also contained 3′ untranslated regions with an average length of 69.9 ± 88.4 nucleotides (n = 46). The EST sequences were submitted to GenBank/dbEST, adding to the 51 available Echinostoma-derived sequences, to provide reference information for both phylogenetic analysis and the study of general trematode biology.
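The codon usage determination mentioned above can be illustrated with a minimal Python sketch. It is not the authors' procedure: the helper names `codon_usage` and `relative_frequencies` are hypothetical, the input is assumed to be open reading frames already trimmed to the correct reading frame, and codons containing unresolved bases are simply skipped, which is one possible way to handle them.

```python
from collections import Counter


def codon_usage(orfs):
    """Count codons across in-frame ORF sequences (hypothetical helper)."""
    counts = Counter()
    for seq in orfs:
        # Normalize case and treat RNA-style U as T.
        seq = seq.upper().replace("U", "T")
        # Step through complete triplets only; a trailing partial codon is ignored.
        for i in range(0, len(seq) - len(seq) % 3, 3):
            codon = seq[i:i + 3]
            # Assumption: skip codons that contain unresolved bases such as N.
            if set(codon) <= set("ACGT"):
                counts[codon] += 1
    return counts


def relative_frequencies(counts):
    """Convert raw codon counts into fractions of all counted codons."""
    total = sum(counts.values())
    if total == 0:
        return {}
    return {codon: n / total for codon, n in counts.items()}


if __name__ == "__main__":
    # Toy sequences for illustration only; these are not E. paraensei data.
    example = codon_usage(["ATGGCCGCCTAA", "atgnccgc"])
    for codon, fraction in sorted(relative_frequencies(example).items()):
        print(codon, example[codon], round(fraction, 3))
```

If every one of the 22,560 nucleotides fell in a complete in-frame triplet, the tally would cover 22,560 / 3 = 7,520 codons. Comparing the relative frequencies of synonymous codons within each amino acid is then one simple way to look for bias.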

