Let's play with haplotypes!



Wojewoda
2012-03-07, 10:14
During a recent discussion on the nature of one of the subclades of N1c found in non-Boreal Europe, characterized by the L550+ mutation (N1c1d) (http://www.forumbiodiversity.com/showthread.php?t=29574), Jaska made the following claim:



The Spanish type is very different, it is not derived from any other L550+ type. Therefore it must have been developed separately a long time ago. Besides, there is no Spanish type anywhere outside Iberia (except of course in Latin America, due to the Spanish influence). Furthermore, the Spanish group is even older than the split between the Scandinavian and Balto-Slavic groups! (As you should have seen in the picture you quoted.) There is no way that it could have been born only in the Migration Era.


I propose to try to verify this claim (I am not saying it is or it is not correct at the moment) by the way of analysing the haplotypes of the L550 sub-haplogroup ourselves in this very thread.

DNA-Forums has been down for me for the last couple of days, so I propose that we try to recreate here the kind of skills people from that forum possess.

So I propose a collective effort to learn the proper way of analyzing sets of haplotypes.

First we need data. Luckily there is a very handy Russian site semargl (http://www.semargl.me/en/dna/ydna/) which provides all the data one would need.

After switching the language to English we go into the "Map of branches" section, where we find the N1c L550+ branches:

N1c1d(L550) Iberian
N1c1d(L550) Neuri
N1c1d(L550) Neuri-L149.2/L551
N1c1d(L550) Neuri-L58
N1c1d(L550) Neuri-L591
N1c1d(L550) Unknown
N1c1d(L550) Varangian

We will probably be mostly interested in these 3 sets of haplotypes:

N1c1d(L550) Iberian
N1c1d(L550) Neuri (http://en.wikipedia.org/wiki/Neuri)
N1c1d(L550) Varangian (http://en.wikipedia.org/wiki/Varangians)

... which we will find by clicking "Table haplotypes" here (http://www.semargl.me/en/dna/ydna/haplotypes/table/154/), here (http://www.semargl.me/en/dna/ydna/haplotypes/table/145/) and here (http://www.semargl.me/en/dna/ydna/haplotypes/table/153/).

In the next step we should copy (CTRL-C) tables with the DYS data into any Excel-like spreadsheet file and possibly share it by uploading the file somewhere so all people interested in this exercise could use it.

I don't have time to do it now, but if anyone is interested in learning new methods of data analysis and wants to play I invite him/her to create such a - for instance - Google spreadsheet with the relevant data (if no one helps I will do it later).

When we have easily accessible data we could move to the next phase of actual analysis of haplotype modals, variances, coalescence times, time to Most Recent Common Ancestor estimates and so on.

Wojewoda
2012-03-07, 17:48
I can see that no one wants to play. :(

Nevertheless I proceed to the next step: here is the link to the Google spreadsheet containing the relevant data - 67-marker haplotype sets for the supposed "Iberian" (6 samples), "Neuri" (130 samples) and "Varangian" (78 samples) subclades of the N1c L550+ branch (https://docs.google.com/spreadsheet/ccc?key=0AqIQvSLU0ebVdFIxblVvWlpmUVhMYkZseXJRbVVIe nc#gid=0).

I guess in the next step I will calculate the modal values of all DYS markers for each subgroup to be able to see the similarity relationship between these 3 groups: Iberians, Neuri and Varangians.

Wojewoda
2012-03-08, 08:46
I guess in the next step I will calculate the modal values of all DYS markers for each subgroup to be able to see the similarity relationship between these 3 groups: Iberians, Neuri and Varangians.

OK, I have created a new sheet in the document in which I have attempted to calculate modal values for every DYS location available in each of the 3 groups (Iberians, Neuri and Varangians). At first I used the "median" function (middle value), but then I realised that the "mode" function (most frequent value) should be more appropriate. The results are in rows 2, 3 and 4 of the 'Modal DYS value' sheet.
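For anyone who would rather check this outside a spreadsheet, here is a minimal Python sketch of the difference between "median" and "mode" for a single DYS location. The repeat counts are made up for illustration (they are not taken from the semargl tables), and the tie-breaking rule (take the smallest value when two values are equally frequent) is my own assumption, not necessarily what a spreadsheet does.

```python
from statistics import median, multimode

# Hypothetical repeat counts at one DYS location (illustrative only, not real data).
repeats = [14, 14, 14, 15, 16, 16, 17]


def modal_value(values):
    """Return the most frequent value; on a tie, take the smallest (an assumption)."""
    return min(multimode(values))


print("median:", median(repeats))      # middle value of the sorted list
print("mode:", modal_value(repeats))   # most frequent value
```

In this toy list the two functions disagree, which is the reason "mode" is the safer choice when the goal is a modal haplotype.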

The differences in modal values between these 3 groups of haplotypes were observed at the following DYS locations: dys393, dys19, dys389i, dys389ii, dys458, dys447, dys570, cdya, cdyb (and dys714, dys522, dys533, dys445, dys712, dys635).

Then - for each modal value of a given DYS location - I calculated the absolute values of the 3 pairs of differences: Iberian-Neuri, Iberian-Varangian and Neuri-Varangian. The results are in rows 6, 7 and 8.

Finally I summed up these differences across all available DYS locations and across - I believe - the first 67 DYS locations.
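The same calculation (absolute difference of the modal values at each location, summed over the locations) can be sketched in Python as below. The marker names m1..m3 and all repeat counts are hypothetical placeholders, not the real modal values from the spreadsheet, and the function names are my own.

```python
from itertools import combinations

# Hypothetical modal haplotypes: marker name -> modal repeat count (not real data).
modals = {
    "Iberian": {"m1": 13, "m2": 16, "m3": 17},
    "Neuri": {"m1": 14, "m2": 15, "m3": 17},
    "Varangian": {"m1": 14, "m2": 14, "m3": 18},
}


def modal_distance(a, b, markers=None):
    """Sum of absolute differences over the markers present in both haplotypes.

    If `markers` is given, only those markers are counted (e.g. the first 67).
    """
    shared = set(a) & set(b)
    if markers is not None:
        shared &= set(markers)
    return sum(abs(a[m] - b[m]) for m in shared)


def pairwise_distances(groups, markers=None):
    """Distance for every pair of groups, keyed by the pair of group names."""
    return {
        (x, y): modal_distance(groups[x], groups[y], markers)
        for x, y in combinations(groups, 2)
    }


for pair, distance in pairwise_distances(modals).items():
    print(pair, distance)
```

Passing a list of marker names as `markers` restricts the sum to a subset, which is how the "67 markers" and "all markers" totals would differ.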

The results for 67 markers (cells A6-A8):

I-N 9
I-V 10
N-V 7

And for all markers (cells B6-B8)

I-N 14
I-V 15
N-V 11

Conclusion: the most similar pair of groups are Neuri and Varangians.

The next most similar pair are Iberians and Neuri closely followed by the last pair: Iberians and Varangians.

Overall the differences are not very large, I believe - quite the contrary, they are rather small - but in general the calculations - if I didn't make any mistakes - confirm Jaska's observation that the Iberian group of N1c L550+ haplotypes is the most divergent one.

Jaska expressed this observation in the following tree:

http://upload2.fototube.pl/pics/2012/03/06/org/ea9751f55877fcdb023bd6c175513f3e.png

Jaska's "Skandinavians" = "Varangians", and "Polish" = "Neuri"

What prehistorical movement could have brought North-Eastern European male lines to Iberia slightly before the differentiation of the Varangian (including Rurik and his gang) and Neuri (including Gediminas with friends) groups?

Can we guess the time frame of the differentiation of these 3 groups? We already know that such calculations are not very reliable (http://dienekes.blogspot.com/2011/08/y-str-variance-of-busby-et-al-2011.html), but still there should be a way to distinguish between - let's say - Mesolithic and - for instance - Migration Period time frames.
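One common rough approach is the average squared distance (ASD) estimate: take the mean squared difference between each haplotype and an assumed founder haplotype, averaged over markers, and divide by a per-marker mutation rate to get generations. The sketch below is only an illustration of that arithmetic. The founder haplotype (here simply assumed to equal the modal), the single mutation rate for all markers, the generation length and every repeat count are assumptions or made-up values, so the output says nothing about the real L550+ groups.

```python
def asd_generations(haplotypes, founder, mutation_rate):
    """Average squared distance to the founder, divided by the mutation rate.

    haplotypes    -- list of dicts, marker name -> repeat count
    founder       -- dict with the assumed ancestral (here: modal) haplotype
    mutation_rate -- assumed mutations per marker per generation
    """
    markers = list(founder)
    total = 0.0
    for hap in haplotypes:
        total += sum((hap[m] - founder[m]) ** 2 for m in markers)
    asd = total / (len(haplotypes) * len(markers))
    return asd / mutation_rate


# Hypothetical data: 4 haplotypes typed at 2 markers (not real samples).
founder = {"m1": 14, "m2": 23}
samples = [
    {"m1": 14, "m2": 23},
    {"m1": 15, "m2": 23},
    {"m1": 14, "m2": 22},
    {"m1": 14, "m2": 23},
]

ASSUMED_RATE = 0.0025          # per marker per generation, an assumption
ASSUMED_YEARS_PER_GEN = 30     # also an assumption

generations = asd_generations(samples, founder, ASSUMED_RATE)
print(round(generations), "generations, about",
      round(generations * ASSUMED_YEARS_PER_GEN), "years")
```

The result scales inversely with the chosen mutation rate, which is one reason such dates carry wide uncertainty.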

Huckleberry Finn
2012-03-08, 09:06
This may not be related to the topic at all but as a side note:

"Furthermore, a single-blade battle knife with a bronze hilt found near Luga has direct parallels among materials from cemeteries in north-western Spain of the late 4th – early 5th century (Ščukin 2005: 217–218, 412–419)."

http://www.helsinki.fi/venaja/nwrussia/eng/Conference/pdf/Jushkova.pdf

Wojewoda
2012-03-08, 09:49
This may not be related to the topic at all but as a side note:

"Furthermore, a single-blade battle knife with a bronze hilt found near Luga has direct parallels among materials from cemeteries in north-western Spain of the late 4th – early 5th century (Ščukin 2005: 217–218, 412–419)."

http://www.helsinki.fi/venaja/nwrussia/eng/Conference/pdf/Jushkova.pdf

This would point to the Migration Period.

Polako
2012-03-08, 12:00
This would point to the Migration Period.

So where would these L550+ arrive from in Iberia? From modern Germany, Scandinavia or Poland?

Hweinlant
2012-03-11, 23:31
So where would these L550+ arrive from in Iberia? From modern Germany, Scandinavia or Poland?

There are more new SNPs just under L550. These will likely split L550 into two and the dilemma will be solved. The Varangian-Finnish-Scandinavian branch has Dys19=14, while the Baltoslav branch has Dys19=15. Most of L550- has Dys19=14 (this is the ancestral state).

The Iberian branch is in this sense something of an anomaly, as it resembles neither of the two other N1c1d clusters, but it is basically just a double mutation away from Dys19=14. I think it will ultimately turn out to be from the Varangian-Finnish-Scando branch, and I don't think it is from the Goths but rather from the Viking era.

The presence of N1c1* in Iberia is very limited, and if I recall correctly, academic studies have found it only in Seville and thereabouts, a location known to have had a Viking presence (http://en.wikipedia.org/wiki/Viking_expansion#Iberia). There is no way that it is super old in Iberia, as it has only a local presence.