Topic: The Thousand Polish Genomes Project  (Read 1156 times)


NathanS (topic author)

  • Posts: 1279
  • Country: 00
  • Rating +1208/-2
The Thousand Polish Genomes Project
« : 13 July 2021, 14:20:47 »
'The Thousand Polish Genomes Project' - a national database of Polish variant allele frequencies

https://doi.org/10.1101/2021.07.07.451425
Abstract
Although Slavic populations account for over 3.5% of world inhabitants, no centralized, open source reference database of genetic variation of any Slavic population exists to date. Such data are crucial for either biomedical research and genetic counseling and are essential for archeological and historical studies. Polish population, homogenous and sedentary in its nature but influenced by many migrations of the past, is unique and could serve as a good genetic reference for middle European Slavic nations. The aim of the present study was to describe first results of analyses of a newly created national database of Polish genomic variant allele frequencies. Never before has any study on the whole genomes of Polish population been conducted on such a large number of individuals (1,079). A wide spectrum of genomic variation was identified and genotyped, such as small and structural variants, runs of homozygosity, mitochondrial haplogroups and Mendelian inconsistencies. The allele frequencies were calculated for 943 unrelated individuals and released publicly as The Thousand Polish Genomes database. A precise detection and characterisation of rare variants enriched in the Polish population allowed to confirm the allele frequencies for known pathogenic variants in diseases, such as Smith-Lemli-Opitz syndrome (SLOS) or Nijmegen breakage syndrome (NBS). Additionally, the analysis of OMIM AR genes led to the identification of 22 genes with significantly different cumulative allele frequencies in the Polish (POL) vs European NFE population. We hope that The Thousand Polish Genomes database will contribute to the worldwide genomic data resources for researchers and clinicians.
---------------------------------------------------------------
High depth (>30x) PCR-free whole-genome sequencing was applied to all samples.
Whole Genome Sequencing (WGS) was performed on the Illumina NovaSeq 6000 platform using 150 bp paired-end reads, with median average depth of 35.72X.
The reads were subsequently mapped to the GRCh38 human reference genome.

Mitochondrial haplogroups
Using variant calls in the mitochondrial genome, we inferred haplogroups among the unrelated individuals. In 930 individuals with high quality haplogroup assignment the most abundant haplogroup was H with 410 (44.1%) representatives, U with 161 (17.3%), J with 92 (9.9%), and T with 83 (8.9%) individuals (Fig. 6). The largest H sub-haplogroup was H1 (N=128; 31.2% of the H haplogroup), and a similar number of individuals was divided between subclades H2, H5, H6 and H11 (N=116; together 28.3% of the H haplogroup). The second most abundant sub-haplogroup in the cohort was U5 with 98 (10.5%) individuals.
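
The percentages in the haplogroup excerpt can be re-derived from the quoted counts. The snippet below is only a quick sanity check, not anything from the paper: the variable and function names are made up for illustration, and the only inputs are the numbers quoted above.

```python
# Counts quoted in the excerpt: unrelated individuals with a
# high-quality mitochondrial haplogroup assignment.
TOTAL = 930
HAPLOGROUP_COUNTS = {"H": 410, "U": 161, "J": 92, "T": 83}
# H subclades: H1 alone, and H2, H5, H6, H11 taken together.
H_SUBCLADE_COUNTS = {"H1": 128, "H2+H5+H6+H11": 116}
U5_COUNT = 98


def share(count, total):
    """Return count/total as a percentage rounded to one decimal place."""
    return round(100.0 * count / total, 1)


# Share of each major haplogroup among all 930 assigned individuals.
haplogroup_shares = {name: share(n, TOTAL) for name, n in HAPLOGROUP_COUNTS.items()}

# Share of each subclade group within haplogroup H only.
h_subclade_shares = {
    name: share(n, HAPLOGROUP_COUNTS["H"]) for name, n in H_SUBCLADE_COUNTS.items()
}

# U5 is reported relative to the whole cohort, not to haplogroup U.
u5_share = share(U5_COUNT, TOTAL)

# Individuals outside the four listed haplogroups.
other_count = TOTAL - sum(HAPLOGROUP_COUNTS.values())

if __name__ == "__main__":
    print(haplogroup_shares)
    print(h_subclade_shares)
    print("U5:", u5_share, "other haplogroups:", other_count)
```

The recomputed values agree with the percentages given in the excerpt; the four listed haplogroups cover 746 of the 930 individuals, which leaves 184 in the remaining haplogroups.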

In the first comparison with continental populations, we observed that the POL cohort is homogenous and clustered within the European population (Fig. 11A and 11B). After prediction using a random forest method only one sample was located in the AMR population cluster. In PCA of European subpopulations, almost all POL samples (938 out of 943) were clustered with other European ancestries, with 496 individuals belonging to the GBR, 427 to the CEU, 12 to the TSI, and 3 to the IBS subpopulations. Five samples were closer to non-European populations.
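
The subpopulation counts in this excerpt add up as stated. A small check, using only the quoted numbers (the names are illustrative, not from the paper):

```python
# Number of POL samples assigned to each European subpopulation,
# as quoted in the excerpt.
EUROPEAN_ASSIGNMENTS = {"GBR": 496, "CEU": 427, "TSI": 12, "IBS": 3}
COHORT_SIZE = 943  # unrelated individuals in the POL cohort

# Samples clustered with European ancestries.
european_total = sum(EUROPEAN_ASSIGNMENTS.values())

# Samples left over, i.e. closer to non-European populations.
non_european = COHORT_SIZE - european_total

# Fraction of the cohort assigned to each subpopulation, in percent.
assignment_shares = {
    name: round(100.0 * n / COHORT_SIZE, 1) for name, n in EUROPEAN_ASSIGNMENTS.items()
}

if __name__ == "__main__":
    print(european_total, non_european)
    print(assignment_shares)
```

So 938 samples fall into the four European clusters and 5 remain, matching the text; GBR and CEU together account for 923 of the 943 samples.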

Compared against the world populations, the POL cohort was similar to the European populations at low K values (K = 2 to 5), and at K above 5 (favoured by the cross-validation analysis for the world dataset) forms a distinct cluster, with some common ancestry with GBR and CEU, and also FIN.

Although in terms of sample size our project does not compare to the world's largest, it remains one of the largest open allele-frequency datasets generated on high-coverage WGS data and the largest of Slavic population.

https://naszegenomy.pl/

NathanS (topic author)

Re: The Thousand Polish Genomes Project
« Reply #1 : 13 July 2021, 14:22:51 »
The paper also mentions other projects:

Genomiczna Mapa Polski: https://www.genompolski.pl/
Polgenom: https://polgenom.pl/

NathanS (topic author)

Re: The Thousand Polish Genomes Project
« Reply #2 : 13 July 2021, 14:55:24 »
It would be good, of course, to be able to use these data for comparison.

As I understand it, this is not a problem for researchers. Their page has a button for requesting the data, but the license is Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International https://creativecommons.org/licenses/by-nc-sa/4.0/legalcode.ru

Павлов Сергей

  • Posts: 127
  • Country: ru
  • Rating +28/-0
Re: The Thousand Polish Genomes Project
« Reply #3 : 06 January 2022, 12:43:37 »
Is any data on the Poles openly available? Thanks!

 

© 2007 Molecular Genealogy (MolGen)
