Important information from Bonnie Schrack:
Dear all,
I'll send this info in an email too, since some people won't read it here.
I've just started this comparison process, which will be long and interesting. I'm having to learn how to do a lot more things with Excel, and I still have a long way to go. For now, it appears that there are just six SNPs in common between two files. The first is the unfiltered VCF file Vince Tilroe gave me, which includes all of the haplogroup J samples from the 1K Genomes project and a few of the hg I samples. The second is the file from quxuq's results (E11334/GRC10035555) called 454HCDiffsMetadata.txt. If someone can point me to a better file, I'd appreciate it, but that's the best one I could find. It has a somewhat limited number of SNPs in it.
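For anyone who would rather script this comparison than do it in Excel, here is a minimal Python sketch of finding the positions two files have in common. It is only an illustration, not the method used above. The function names are made up for this example, and the layout of 454HCDiffsMetadata.txt is an assumption: the sketch treats it as a tab-separated table with the position in a column you specify (`pos_col`), which would need checking against the real file. The VCF side relies only on the standard CHROM and POS columns.

```python
def vcf_positions(lines, chrom="Y"):
    """Collect POS values for one chromosome from VCF text lines."""
    positions = set()
    for line in lines:
        # Skip VCF header lines and blank lines.
        if line.startswith("#") or not line.strip():
            continue
        fields = line.rstrip("\n").split("\t")
        # Accept both "Y" and "chrY" naming styles.
        if fields[0] in (chrom, "chr" + chrom):
            positions.add(int(fields[1]))
    return positions


def table_positions(lines, pos_col=0, sep="\t"):
    """Collect positions from a delimited text table.

    ASSUMPTION: the position sits in column `pos_col`; rows whose
    value there is not a plain integer (e.g. a header) are skipped.
    """
    positions = set()
    for line in lines:
        fields = line.rstrip("\n").split(sep)
        if len(fields) > pos_col and fields[pos_col].strip().isdigit():
            positions.add(int(fields[pos_col]))
    return positions


def shared_positions(vcf_lines, table_lines, chrom="Y", pos_col=0):
    """Return the sorted positions present in both inputs."""
    in_vcf = vcf_positions(vcf_lines, chrom)
    in_table = table_positions(table_lines, pos_col)
    return sorted(in_vcf & in_table)
```

Both inputs can be open file objects, since the functions only iterate over lines. Both files must use the same build (Build 37/hg 19 here), or the positions will not match.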
Of these six SNPs listed both in quxuq's 454 results and in the 1K Genomes samples, here are the details (Build 37/hg 19):
6401964 is not derived in any of the I or J samples that have a clear result
7900883 is derived in all I + J samples that have any clear result
16497020 is derived in all I + J samples that have any clear result
17844781 is derived in all I + J samples that have any clear result
23550924 is derived in all I + J samples that have any clear result
18683719 is derived in the very basal J1* sample, HG01253, and in quxuq, and ancestral in every other sample that has a clear result!
HG01494, the sample in which Z1834, Z1842, etc., were discovered, is ancestral for 18683719.
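The "derived in all / none / some samples that have a clear result" summaries above can also be produced mechanically. The sketch below is a hypothetical helper, not something taken from the files themselves. It assumes each sample's call has already been reduced to "D" (derived), "A" (ancestral), or anything else for no clear result. The example input uses only the three calls for 18683719 stated above.

```python
def summarize_calls(calls):
    """Summarize one SNP's calls across samples.

    `calls` maps sample name -> "D" (derived), "A" (ancestral),
    or any other value (e.g. None) for no clear result.
    Returns (summary, sorted list of derived samples).
    """
    # Keep only the samples with a clear result.
    clear = {s: c for s, c in calls.items() if c in ("D", "A")}
    derived = sorted(s for s, c in clear.items() if c == "D")
    if not clear:
        return "no clear result", derived
    if len(derived) == len(clear):
        return "derived in all", derived
    if not derived:
        return "derived in none", derived
    return "derived in some", derived


# Calls for 18683719 as described in the text; other samples omitted.
example = {"HG01253": "D", "quxuq": "D", "HG01494": "A"}
summary, derived_samples = summarize_calls(example)
```

A SNP that comes out as "derived in some" with a short list of derived samples is the kind of candidate that may mark a new branch.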
So this new SNP may define a very early branch of J1 including both of these samples, and not including the J1d, Z1834+ samples! If so, it will change the J tree all over again. As I've mentioned, RCO will have to test this SNP, to determine whether M365 is parallel to it or downstream of it. If they're parallel, and if we can get rid of the private SNP M62, then M365 would be J1a, L136 would be J1b, Z1834 would be J1c, and this new SNP would be J1d. Unless someone can convince us of a better scheme. I certainly hope M62 can be removed, but if it can't, this SNP would be J1e.
Ray, I'm not sure there'd be much point in checking the singleton SNPs from the old set of 1K Genomes samples, since HG01494 was the most basal one in it, and we know it's not derived for this new SNP. However, the singleton SNPs from HG01494 could be checked against the new HG01253, and the data of GRC006556.
I haven't checked yet to see if good primers could easily be found for 18683719. Its Build 36/hg 18 position is 17193113. Anyone who wants to check can let us all know.
I'll post more when I learn anything new!
Bonnie
------------------------------------------------------------------------------------------------------------
I've been looking at lots of 1K Genomes data, which has large sets of SNPs that are derived in the same samples. There were two sets that gave contradictory messages about the shape of the tree. So I spent a lot of time looking very closely at the raw, unfiltered results to try to determine which were valid. The side that won out was the one saying that HG01253, who apparently is close to Quxuq's lineage, branches off first, with Z1834+ branching off next from the other side of that first split. The second alternative would have been that there was a branch shared by Quxuq and Z1834+ on one side of a split, with L136 on the other. The SNPs pointing that way turned out to be quite weak, while there were many more, and stronger, SNPs for the first alternative.
So it looks like we'll have:
* SNPs that are shared by Z1834+ and L136+ samples, and ancestral in HG01253
* SNPs defining the new early branch, like 18683719, derived only in HG01253 among the 1K Genomes samples (and in Quxuq)
* Potentially a bunch of SNPs basically equivalent to L136, but some could be upstream or downstream, and give us more resolution.
* Potentially SNPs that could define a branch that's L136+ and P58-
More later!
Bonnie