Accurate imputation of African cattle genomes using a diverse reference panel

Abstract

Background: In cattle, most commercial single-nucleotide polymorphism (SNP) genotyping arrays have been shown to be suboptimal for capturing genomic variation in non-European populations, particularly in African cattle. Low-coverage whole-genome sequencing (LCWGS) followed by imputation provides a cost-effective genotyping method that is more adaptable and can outperform genotyping arrays. Results: Here, we generate a high-quality imputation reference panel representative of the complex ancestries of cattle populations in Africa to enable the deployment of LCWGS. To do so, we generated 116 new high-coverage (20‒24×) African cattle genomes, representing most cattle breeds across the continent. We combined these data with publicly available genomes from other regions to build a reference panel that comprised over 3,300 cattle genomes from 133 cattle populations, thus capturing the genetic diversity of domestic cattle across the world. After applying a stringent filtering step to remove poor-quality genomes with very low sequence coverage, we retained 1,882 genomes with an average coverage of 7×. We show that the imputation pipeline based on this reference panel provides highly accurate genotypes for common (> 99% accuracy) and rare (> 98% accuracy) variants at genome coverage as low as 0.5×. Conclusion: This panel provides an important new resource for genetic improvement and conservation of African cattle populations.

Type
Publication
Collection. Figshare