Optimal design of low density marker panels for genotype imputation
Cost-effective genotyping of livestock species can be achieved by genotyping part of the population with a high density (HD) panel and the remainder with a lower density panel, and then using imputation to infer the genotypes of markers that are not included on the low density panel. It is therefore desirable to have a method of selecting markers for a low density assay that maximises imputation accuracy. Here we present a marker selection method that relies on the pairwise (co)variances between single nucleotide polymorphisms (SNPs) and on the minor allele frequency (MAF) of the SNPs. The performance of the method was tested by five-fold cross-validation using genotypes of crossbred dairy cattle in East Africa, a population for which it is unclear whether existing low density SNP assays designed for purebred populations will maintain high imputation accuracies. SNPs were selected at various densities using the (co)variance method and alternative SNP selection methods, and were then imputed up to the HD panel. The (co)variance method provided the highest imputation accuracies at all marker densities, with accuracies up to 19% higher than those obtained with random selection of SNPs. The presented method is straightforward to apply and can ensure high accuracy in genotype imputation for crossbred dairy populations in East Africa.
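The two quantities named above, the MAF of each SNP and the pairwise (co)variances between SNPs, can be computed directly from a genotype matrix. The following is a minimal sketch of that step only. It does not reproduce the selection criterion itself, which is not specified here. It assumes genotypes are coded as allele counts (0, 1 or 2) with no missing values, and the function name maf_and_covariance is hypothetical.

```python
import numpy as np


def maf_and_covariance(genotypes):
    """Return per-SNP MAF and the SNP-by-SNP sample covariance matrix.

    genotypes: array of shape (n_animals, n_snps) holding allele counts
    coded 0/1/2. Assumption: no missing genotypes.
    """
    g = np.asarray(genotypes, dtype=float)
    # Frequency of the counted allele: mean allele count divided by 2.
    allele_freq = g.mean(axis=0) / 2.0
    # The minor allele is whichever allele is rarer at each SNP.
    maf = np.minimum(allele_freq, 1.0 - allele_freq)
    # Columns are variables (SNPs), rows are observations (animals).
    cov = np.cov(g, rowvar=False)
    return maf, cov


# Toy example: 4 animals genotyped at 3 SNPs (illustrative values only).
genotypes = np.array([
    [0, 1, 2],
    [1, 1, 2],
    [2, 0, 1],
    [1, 2, 0],
])
maf, cov = maf_and_covariance(genotypes)
```

A selection procedure would take maf and cov as inputs. How the two are combined to rank or choose SNPs is left open in this sketch.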