Liu, X., W. Mei, P. S. Soltis, D. E. Soltis, and W. B. Barbazuk. 2017. Detecting Alternatively Spliced Transcript Isoforms from Single-Molecule Long-Read Sequences without a Reference Genome. Mol Ecol Resources.

Abstract

Alternative splicing (AS) is a major source of transcript and proteome diversity, but examining AS in species without well-annotated reference genomes remains difficult. Research on both human and mouse has demonstrated the advantages of using Iso-Seq data for isoform-level transcriptome analysis, including the study of AS and gene fusion. We applied Iso-Seq to investigate AS in Amborella trichopoda, a phylogenetically pivotal species that is sister to all other living angiosperms. Our data show that, compared with RNA-Seq data, the Iso-Seq platform provides better recovery on large transcripts, new gene locus identification, and gene model correction. Reference-based AS detection with Iso-Seq data identifies AS within a higher fraction of multi-exonic genes than observed for published RNA-Seq analysis (45.8% vs. 37.5%). These data demonstrate that the Iso-Seq approach is useful for detecting AS events. Using the Iso-Seq-defined transcript collection in Amborella as a reference, we further describe a pipeline for detection of AS isoforms from PacBio Iso-Seq without using a reference sequence (de novo). Results using this pipeline show a 66-76% overall success rate in identifying AS events. This de novo AS detection pipeline provides a method to accurately characterize and identify bona fide alternatively spliced transcripts in any non-model system that lacks a reference genome sequence. Hence, our pipeline has huge potential applications and benefits to the broader biology community.