Thread: Beware the "calculator effect"

  1. #1
    Established Member
    Your Friend
    Last Online
    @
    Join Date
    2009-10-23
    Posts
    9,652
    Gender
    Y-DNA
    R1a-Z282
    mtDNA
    H7
    Metaethnos
    Slavic
    Ethnicity
    Polish
    Phenotype
    Barbarian
    Religion
    Crop Circles
    Poland

    Beware the "calculator effect"

    An important and useful notice for personal genomics customers using third-party tools...

    There are now lots of genetic ancestry calculators available online, so personal genomics customers can analyze their DNA without having to send their raw data anywhere. If used correctly, the vast majority of these are extremely accurate, and can do things that aren't even being done by scientists in peer-reviewed studies, like picking up minority admixtures between European ethnic groups.

    However, many people are getting skewed results despite doing everything right. For instance, users from the UK often come out much more continental European than they should. Some of them actually believe that this is because they're genetically more Norman or Saxon than the average Brit. Nope, the real reason is what I call the "calculator effect". Basically, this is when the algorithm produces different results for people who were part of the original ADMIXTURE runs that set up the allele frequencies used by the calculators than for people who weren't, even though both sets of users are of exactly the same origin and should expect almost identical results.
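To make the moving parts concrete, here is a minimal sketch of how a calculator of this kind might turn fixed allele frequencies into ancestry percentages. It is an assumption for illustration only: the post does not describe the calculators' internals, the function name `estimate_proportions` and the toy numbers are hypothetical, and the method shown is a plain expectation-maximization update for mixture proportions. The point is simply that the frequencies come from an earlier run and are held fixed, so the result depends on what went into that run.

```python
def estimate_proportions(genotype, freqs, iterations=200):
    """Estimate ancestry proportions for one person, allele frequencies held fixed.

    genotype: list of 0/1/2 counts of the reference allele, one per SNP.
    freqs: one list per cluster with that allele's frequency at each SNP;
           every value must be strictly between 0 and 1.
    (Hypothetical helper, not taken from any real calculator.)
    """
    k = len(freqs)
    q = [1.0 / k] * k  # start from equal proportions
    for _ in range(iterations):
        totals = [0.0] * k
        for j, g in enumerate(genotype):
            # Probability of drawing each allele from the current mixture.
            p_ref = sum(q[c] * freqs[c][j] for c in range(k))
            p_alt = 1.0 - p_ref
            for c in range(k):
                # Expected share of this SNP's two allele copies owed to cluster c.
                totals[c] += g * q[c] * freqs[c][j] / p_ref
                totals[c] += (2 - g) * q[c] * (1.0 - freqs[c][j]) / p_alt
        q = [t / (2 * len(genotype)) for t in totals]
    return q


# Toy data: two clusters, four SNPs (made-up frequencies).
freqs = [[0.9, 0.8, 0.1, 0.2],
         [0.1, 0.2, 0.9, 0.8]]
genotype = [2, 2, 0, 0]
print(estimate_proportions(genotype, freqs))
```

With these toy numbers nearly all of the weight lands on the first cluster, because the genotype matches its frequencies at every SNP.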

    So, is it possible to get around this calculator effect? Yup. People who weren't included in the datasets that produced the allele frequencies used by the calculators shouldn't compare their results to those of people who were, including the academic references used. They should only compare results with other calculator users. On the other hand, project members who were run as references should only compare their results to those of other project members and the relevant academic references.
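The comparison rule above boils down to a membership check, sketched below. The sample labels and the helper names `comparison_groups` and `comparable` are hypothetical; the only input is a flag saying whether a sample was part of the run that produced the allele frequencies.

```python
def comparison_groups(in_reference_run):
    """Split sample labels by whether they were in the frequency-producing run."""
    inside = sorted(name for name, was_in in in_reference_run.items() if was_in)
    outside = sorted(name for name, was_in in in_reference_run.items() if not was_in)
    return inside, outside


def comparable(a, b, in_reference_run):
    """Two results are comparable only if both samples are on the same side."""
    return in_reference_run[a] == in_reference_run[b]


# Hypothetical labels: project members and academic references were in the run,
# ordinary calculator users were not.
samples = {
    "project_member_1": True,
    "academic_ref_1": True,
    "calculator_user_1": False,
    "calculator_user_2": False,
}
print(comparison_groups(samples))
print(comparable("project_member_1", "calculator_user_1", samples))
```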

    The reason I'm harping on about this is that I'm afraid someone, somewhere will eventually note the aforementioned discrepancies in the results and shoot their mouth off about how inaccurate third-party analyses are for personal genomics customers. That would be unfair, and potentially damaging to all the genome bloggers out there who are providing awesome services for free.

    However, I am disappointed that no one else is talking about the calculator effect and how to remedy it. I actually designed my Eurogenes ancestry tests for Gedmatch with this problem in mind, by using only academic references to source the allele frequencies. This means that test results for Eurogenes project members and non-members are directly comparable. Perhaps other genome bloggers can eventually do the same?
    Eurogenes Ancestry Project: Beware the "calculator effect"
    Last edited by Polako; 2012-05-27 at 08:05.

  2. The Following 9 Users Say Thank You to Polako For This Useful Post:

    ducktard (2012-05-27), Jaska (2012-05-27), JAX (2012-06-01), Jonny (2012-05-27), Litvin (2017-10-07), mizrahi jew (2012-12-02), Nordmann (2012-05-31), snowwhite (2012-05-27), Svin (2012-12-02)

  4. #2
    Established Member
    Molecular Biologist
    Last Online
    2014-12-25 @ 19:28
    Join Date
    2009-11-22
    Posts
    1,147
    Gender
    Y-DNA
    R1b1b2
    mtDNA
    H6a1a


    Thanks for letting us know. Although there is an Oracle button for Eurogenes results at Gedmatch, it is not working. It reads "unable to upload population data".

  5. #3

    Quote Originally Posted by Serge View Post
    Thanks for letting us know. Although there is an Oracle button for Eurogenes results at Gedmatch, it is not working. It reads "unable to upload population data".
    I haven't bothered to work on that yet. I might in the future, but we'll see.

  6. The Following User Says Thank You to Polako For This Useful Post:

    Serge (2012-05-27)

  7. #4
    Established Member
    Molecular Biologist Jonny's Avatar
    Last Online
    @
    Join Date
    2012-05-18
    Posts
    4,698
    Gender
    Y-DNA
    R1b
    mtDNA
    T2b
    England


    I don't know anything about the science behind these things, but it seems like obvious common sense that you shouldn't compare Person X with an average also containing Person X, no?

  8. #5
    Established Member
    Molecular Biologist ~Elizabeth~'s Avatar
    Last Online
    @
    Join Date
    2012-03-07
    Posts
    5,023
    Gender
    mtDNA
    H1c12
    Politics
    Trump 2020
    United States


    @ Polako
    I get that you mean to say that your Eurogenes calculator(s) are the best of the bunch at Gedmatch, even though I don't understand what the calculator effect is.

  9. #6

    I've updated my Calculator Effect post to include the results of an experiment I ran to show the "calculator effect" at work.

    Now, I've put together a quick experiment to show the "calculator effect" in full force. I ran two intra-North European ADMIXTURE analyses at K=3, Test1 and Test2, and included myself (PL1) only in the former. These tests were almost identical, except for the fact that I wasn't part of the second run. I then tested my genome with calculators made from the allele frequencies from the two runs.

    My calculator results for Test1 were very similar to the results I received from ADMIXTURE, and made perfect sense based on my ancestry. However, the calculator results for Test2 were way off, and basically made me look like a different sample from some other part of Europe. I even managed to score above-noise-level Far Eastern ancestry in the calculator version of Test2. Please note, however, that all the other individuals received almost identical scores in both tests. The results from the experiment can be seen in the spreadsheet below.

    Calculator Effect K=3
    Last edited by Polako; 2012-05-31 at 11:40.

  10. The Following 2 Users Say Thank You to Polako For This Useful Post:

    Jaska (2012-05-31), Nordmann (2012-05-31)

  11. #7

    You need to change the permissions for the spreadsheet; I can't access it at the moment.

  12. The Following User Says Thank You to Jonny For This Useful Post:

    Polako (2012-05-31)

  13. #8

    OK, thanks. I've done it now.

  14. #9
    Established Member
    ----- Nordmann's Avatar
    Last Online
    @
    Join Date
    2011-03-14
    Posts
    606
    Location
    Wrong Planet
    Gender
    Y-DNA
    I-F3312
    mtDNA
    K2a3
    Phenotype
    Bad Tempered Nordic
    Norway Sweden Finland


    @Polako this is problematic

    Really makes you wonder how come not all are affected by this "calculator effect"?
    Not even all within the same population are affected it seems

    You did not have them in the Excel sheet, but in case you have some ideas: how do the Finns and Scandinavians come out in these tests?

  15. #10

    Quote Originally Posted by Nordmann View Post
    @Polako this is problematic

    Really makes you wonder how come not all are affected by this "calculator effect"?
    Not even all within the same population are affected it seems.
    The only sample affected in this run is PL1. The rest are fine because they were in both ADMIXTURE runs, but PL1 was only in the first ADMIXTURE run, and was tested with a calculator made from the second run.

    If you look at the spreadsheet, all the samples except PL1 show a difference of less than 1% for all the clusters. But check out the differences for PL1.

    Test1 - PL1 (included in original run)

    28.34% Orcadian
    0.83% Far_Eastern
    70.83% Lithuanian

    Test2 - PL1 (not included in original run)

    42.58% Orcadian
    1.66% Far_Eastern
    55.75% Lithuanian
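
For readers who want the size of the shift spelled out, the arithmetic on the PL1 figures quoted above can be sketched as follows. The numbers are copied from the post; the helper name `cluster_shifts` and the 1-point cutoff (taken from the "less than 1%" remark about the other samples) are illustrative assumptions.

```python
# PL1's calculator results as quoted in the post (percentages).
test1 = {"Orcadian": 28.34, "Far_Eastern": 0.83, "Lithuanian": 70.83}  # PL1 in the run
test2 = {"Orcadian": 42.58, "Far_Eastern": 1.66, "Lithuanian": 55.75}  # PL1 not in the run


def cluster_shifts(before, after):
    """Per-cluster change in percentage points, rounded to two decimals."""
    return {name: round(after[name] - before[name], 2) for name in before}


shifts = cluster_shifts(test1, test2)
# The other samples are said to differ by less than 1% per cluster,
# so flag any cluster that moved by a full point or more.
large = [name for name, delta in shifts.items() if abs(delta) >= 1.0]
print(shifts)
print(large)
```

Orcadian rises by 14.24 points and Lithuanian falls by 15.08, while Far_Eastern moves by only 0.83 points, although that is double its Test1 value.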

  16. The Following 3 Users Say Thank You to Polako For This Useful Post:

    Jaska (2012-05-31), observer (2012-11-10), Silesian (2012-05-31)
