A Fosmid pool-based next generation sequencing approach to haplotype-resolve whole genomes
Suk, Eun-Kyung; Schulz, Sabrina; Mentrup, Birgit; Huebsch, Thomas; Duitama, Jorge; Hoehe, Margret R. 2017. A Fosmid pool-based next generation sequencing approach to haplotype-resolve whole genomes. In: Tiemann-Boege, Irene; Betancourt, Andrea (eds). 2017. Haplotyping: Methods and Protocols. Springer. x, 326 p. ISBN 978-1-4939-6750-6. Springer New York, p. 223-269.
Permanent link to cite or share this item: http://hdl.handle.net/10568/79937
Haplotype resolution of human genomes is essential to describe and interpret genetic variation and its impact on biology and disease. Our approach to haplotyping relies on converting genomic DNA into a fosmid library, which represents the entire diploid genome as a collection of haploid DNA clones of ~40 kb in size. These can be partitioned into pools such that the probability that the same pool contains both parental haplotypes is reduced to ~1 %. This is the key principle of this method, allowing entire pools of fosmids to be sequenced in a massively parallel fashion, yielding haploid sequence output. Here, we present a detailed protocol for fosmid pool-based next generation sequencing to haplotype-resolve whole genomes including the following steps: (1) generation of high molecular weight DNA fragments of ~40 kb in size from genomic DNA; (2) fosmid cloning and partitioning into 96-well plates; (3) barcoded sequencing library preparation from fosmid pools for next generation sequencing; and (4) computational analysis of fosmid sequences and assembly into contiguous haploid sequences. This method can be used in combination with, but also without, whole genome shotgun sequencing to extensively resolve heterozygous SNPs and structural variants within genomic regions, resulting in haploid contigs of several hundred kb up to several Mb. This method has a broad range of applications including population and ancestry genetics, the clinical interpretation of mutations in personal genomes, and the analysis of cancer genomes and highly complex disease gene regions such as the MHC. Moreover, haplotype-resolved genome sequencing allows description and interpretation of the diploid nature of genome biology, for example through the analysis of haploid gene forms and allele-specific phenomena. Application of this method has enabled the production of most of the molecular haplotype-resolved genomes reported to date.
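The pooling principle in the abstract can be illustrated with a back-of-the-envelope calculation. The sketch below is not taken from the chapter: it assumes clones are sampled uniformly and independently from both parental haplotypes, models per-locus clone counts as Poisson, and uses a hypothetical pool size of 15,000 clones and an approximate haploid genome size of 3 Gb. Only the ~40 kb clone size and the ~1 % target come from the abstract; the function name and parameters are illustrative.

```python
import math


def prob_both_haplotypes(clones_per_pool, clone_len_bp=40_000,
                         haploid_genome_bp=3.0e9):
    """Probability that one pool covers a given locus on BOTH parental haplotypes.

    Assumptions (illustrative, not from the chapter): uniform, independent
    sampling of clones across the diploid genome; Poisson clone counts.
    """
    # Expected number of clones covering the locus on ONE haplotype:
    # each clone covers the locus on a given haplotype with
    # probability clone_len / (2 * haploid genome size).
    lam = clones_per_pool * clone_len_bp / (2.0 * haploid_genome_bp)
    # Probability that at least one clone from a given haplotype covers the locus.
    p_one = 1.0 - math.exp(-lam)
    # Both haplotypes must be present; treated as independent events.
    return p_one ** 2


if __name__ == "__main__":
    # Hypothetical pool sizes, chosen only to show the trend.
    for n in (5_000, 15_000, 50_000):
        print(f"{n:>6} clones/pool -> P(both haplotypes) = "
              f"{prob_both_haplotypes(n):.4f}")
```

Under these assumptions, a pool of 15,000 clones gives a co-occurrence probability of about 0.9 %, in line with the ~1 % quoted above; larger pools raise the probability, which is why the partitioning step matters.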