Snapshot of mapping results of wild rice Ion Torrent (A) and Illumina (B) reads. Reads were mapped to the chloroplast reference of Oryza sativa cv. Nipponbare. In the mapping of Ion Torrent reads, a long insertion (TCCTATTTAATA) was reported in the consensus sequence of the wild rice chloroplast ((A), marked with orange background colour). This insertion was missed in the mapping of Illumina reads, although it was present in the reads ((B), example of the read sequence marked with a black rectangle). The nucleotides in the insertion were duplicated in wild rice (sequence marked with a red rectangle) but not in the reference genome, where only one copy of these nucleotides was present (marked with a green rectangle). The duplicated region was a probable cause of the misalignment of reads. Oryza sativa: fragment of the chloroplast sequence of Oryza sativa cv. Nipponbare; Consensus: consensus sequence of the wild rice chloroplast derived by mapping reads from the Illumina (A) and Ion Torrent (B) platforms to the reference. Nucleotides with background colours represent mismatches between reads and the reference sequence; paired-end reads are shown in blue; single reads are shown in green and red (in forward and reverse orientation, respectively).
This discrepancy was not due to sequencing error but due to a mapping artefact. Similarly, mapping error, and not sequence read error, was the reason for variants detected at several positions in one or the other mapping-consensus sequence (Table 3: for # 2 see Figure 2, for # 3 and # 4 see Figure 3), as these variants were not observed in the corresponding region of the contig sequences obtained from de novo assembly (with both analysis software packages).
One of the variants (# 7), an insertion of an A in the Ion Torrent consensus and of AA in the Illumina consensus, was found in a long homopolymer stretch of 10 A's. The location of the variation suggested that in both cases it could be an error. In the contig sequences, the insertion at this position varied from two to three A's. This polymorphism was not called in the consensus from the subset of Illumina data. Comparison with other chloroplast genomes known to have been sequenced on the Illumina platform (GAII) showed that some have this insertion, namely Australian Oryza rufipogon (GenBank accession JN005833), Asian Oryza rufipogon (GenBank accession JN005832) and Oryza meridionalis (GenBank accession JN005831). Interestingly, the insertion was not present in the chloroplast genome sequences of Oryza sativa cv. Nipponbare (reference-sequence), Oryza sativa ssp. indica isolate 93-11 (GenBank accession AY522329) and Oryza nivara (GenBank accession AP006728), which were sequenced by Sanger technology. Thus, for this variant we cannot conclude with certainty whether the discrepancy is due to a mapping artefact or to read error. However, all other errors found in the Ion Torrent sequence (Table 3, # 8 to # 20) were not due to mapping errors but to read errors, namely deletions or insertions in homopolymer regions.
Based on the analysis and the findings from the various approaches outlined above, an amended wild-rice-genotype chloroplast consensus sequence was created, incorporating the most probable variants from Table 3. To assess the correctness of this amended chloroplast sequence, paired-end reads from the Illumina sequencing platform were re-mapped to the new wild rice chloroplast genome. Illumina reads, rather than Ion Torrent reads, of the W-rice-genotype were selected for remapping because even short Illumina reads (36 bases) of the R-rice-genotype provided a consensus matching the reference-sequence (Table 1), whereas Ion Torrent reads gave an inaccurate consensus (Table 1) due to indel-associated errors. The new mapping results for the wild rice showed only one discrepancy, the one described earlier (Table 3, # 7), in a long homopolymer region. All other sites were identical, giving additional assurance of the correctness of the final consensus.
Various NGS technologies are now available for the rapid sequencing of whole genomes. The choice of one or more NGS technologies depends on the data yield and read length required, the cost per data point and the accuracy of the sequence data. Systematic, as opposed to random, errors can compromise the use of sequence data.
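The two error sources distinguished in this section, indels inside homopolymer runs and misplaced insertions next to a duplicated motif, can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the function names, the `min_len` threshold and the flanking bases in the usage example are assumptions made for illustration; only the inserted motif TCCTATTTAATA and the 10-A homopolymer length come from the text.

```python
from itertools import groupby


def homopolymer_runs(seq, min_len=3):
    """Return (start, end, base) for every run of a single base of at
    least min_len bases. start is 0-based inclusive, end is exclusive."""
    runs = []
    pos = 0
    for base, group in groupby(seq.upper()):
        length = sum(1 for _ in group)
        if length >= min_len:
            runs.append((pos, pos + length, base))
        pos += length
    return runs


def in_homopolymer(seq, pos, min_len=3):
    """True if 0-based position pos lies inside, or at the edge of,
    a homopolymer run of at least min_len bases."""
    return any(start <= pos <= end
               for start, end, _ in homopolymer_runs(seq, min_len))


def insertion_placements(ref, sample, insert):
    """Return every 0-based offset in ref at which inserting `insert`
    reproduces `sample`. More than one offset means the insertion can
    be placed ambiguously, as happens next to a duplicated motif."""
    return [i for i in range(len(ref) + 1)
            if ref[:i] + insert + ref[i:] == sample]


# Hypothetical flanking bases around a 10-A homopolymer.
homopolymer_example = "GCT" + "A" * 10 + "GC"

# Hypothetical flanks around the motif reported in the text.
motif = "TCCTATTTAATA"
reference = "GGG" + motif + "CCC"        # one copy, as in the reference
wild_rice = "GGG" + motif * 2 + "CCC"    # tandem duplication, as in wild rice
placements = insertion_placements(reference, wild_rice, motif)
```

In this toy case the inserted motif fits equally well before or after the existing copy, which gives an aligner two equivalent placements for the same reads. That is one plausible way a duplicated region can lead to the kind of misalignment described above.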