Abstract:
The authors report a whole-genome analysis of 99 Kinh individuals extracted from 1000 Genomes Project data. Variants were classified into two sets: variants present only in Kinh genomes (KHVonly; 1,895,398 variants) and variants present in more than 90% of Kinh samples (KHV90; 1,556,750 variants), of which 99% were single nucleotide polymorphisms (SNPs). Both variant sets showed a similar distribution across chromosomes, with chromosome 2 carrying the highest number of variants (KHVonly: 161,224; KHV90: 132,508). Variants were unevenly distributed within chromosomes, with densities ranging from 0 to 70,000 SNPs per 1 Mb, and tended to concentrate at both ends of the autosomes or near the center of the sex chromosomes. Functional region analysis predicted that gene functions located in exon regions were affected more by variants in KHVonly (157,311 functions) than by variants in KHV90 (66,123 functions). The authors found no pathogenic SNPs related to 27 genetic diseases, which they attribute to the 1000 Genomes Project criterion of selecting apparently healthy individuals for sequencing. All results of this study were uploaded to the web-based browser developed by the research group, available at “https://vsnp.ibt.ac.vn”. The results provide a foundation for the detailed characterization of the whole genome of Kinh people and an informative, valuable resource for future research on genetic variation, identification, and genetic diseases in the Vietnamese population.
Keywords:
bioinformatics, forensics, genetic diseases, Kinh people genome, Next Generation Sequencing (NGS), 1000 Genomes Project.
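
As an illustration of two summary statistics described in the abstract, the following Python sketch shows one way a frequency-filtered set such as KHV90 and a per-1 Mb variant density could be computed. It is not the authors' pipeline: the function names, the (chromosome, position, ref, alt) tuple representation, and the in-memory dictionary of per-sample variants are all assumptions made for this example.

```python
from collections import Counter


def variants_above_frequency(sample_variants, threshold=0.9):
    """Return the variants carried by more than `threshold` of the samples.

    `sample_variants` maps a sample ID to an iterable of variants, each a
    hypothetical (chromosome, position, ref, alt) tuple.
    """
    n_samples = len(sample_variants)
    # Count each variant at most once per sample.
    counts = Counter(
        variant
        for variants in sample_variants.values()
        for variant in set(variants)
    )
    return {v for v, c in counts.items() if c / n_samples > threshold}


def density_per_window(positions, window=1_000_000):
    """Count variants in consecutive fixed-size windows along one chromosome.

    Returns a Counter mapping window index (position // window) to the
    number of variants whose position falls in that window.
    """
    return Counter(pos // window for pos in positions)
```

With real data the per-sample variants would typically be read from VCF files rather than held in a dictionary; the strict "more than 90%" comparison follows the wording of the abstract.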