Next-generation DNA and RNA sequencing has revolutionized biology and medicine. With sequencing costs continuously dropping and our ability to generate large datasets rising, data analysis becomes more important than ever. Next-Generation Sequencing Data Analysis walks readers through next-generation sequencing (NGS) data analysis step by step for a wide range of NGS applications.
For each NGS application, this book covers topics ranging from experimental design, sample processing, and sequencing strategy formulation to sequencing read quality control, data preprocessing, read mapping or assembly, and the more advanced stages that are specific to each application. Major applications include:
- RNA-seq: Both bulk and single-cell (separate chapters)
- Genotyping and variant discovery through whole genome/exome sequencing
- Clinical sequencing and detection of actionable variants
- De novo genome assembly
- ChIP-seq to map protein-DNA interactions
- Epigenomics through DNA methylation sequencing
- Metagenome sequencing for microbiome analysis
Before detailing the analytic steps for each of these applications, the book presents introductory cellular and molecular biology as a refresher mostly for data scientists, the ins and outs of widely used NGS platforms, and an overview of computing needs for NGS data management and analysis. The book concludes with a chapter on the changing landscape of NGS technologies and data analytics.
The second edition of this book builds on the well-received first edition by providing updates to each chapter. Two brand-new chapters have been added to meet rising data analysis demands for single-cell RNA-seq and clinical sequencing. The increasing use of long-read sequencing is also reflected across all NGS applications. This book discusses concepts and principles that underlie each analytic step, along with software tools for implementation. It highlights key features of the tools while omitting tedious details to provide an easy-to-follow guide for practitioners in life sciences, bioinformatics, biostatistics, and data science. Tools introduced in this book are open source and freely available.