When you mail in your saliva sample to AncestryDNA™ or another DNA collector, they use a nifty device called a microarray to map your genome. They don’t look at your entire genetic code, because our genomes are by and large 99.9% the same across all humans (I bet you feel soooo special now). Instead, they focus on the 0.1% of known common genetic variation between us. These variations, also known as markers, or SNPs (“snips”), account for most of the genetic variation between people. They underlie differences in eye color, skin pigmentation, and many other traits.
“Raw DNA data” sounds like a mouthful but it’s simply a computer file that lists out these SNPs, along with other useful information which we’ll get to later. It’s a text file you can save to your desktop, or double-click to open.
AncestryDNA™ and 23andMe provide raw data in a .txt format (compressed during download as a .zip); FTDNA provides it as a .CSV file (compressed as a .gz).
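If you’d rather peek inside the download without unpacking it by hand, here is a minimal Python sketch. It assumes the layouts described above (a .zip holding one text file, or a gzipped .csv). The function name `open_raw_dna` is made up for illustration, and treating the first archive member as the data file is an assumption that may not hold for every download.

```python
import gzip
import io
import zipfile


def open_raw_dna(path):
    """Return a text-mode file object for a raw DNA download.

    Assumes a .zip holds the raw data as its first member and a .gz
    wraps a single file; anything else is opened as plain text.
    """
    if path.endswith(".zip"):
        archive = zipfile.ZipFile(path)
        # Assumption: the first member of the archive is the raw data file.
        member = archive.namelist()[0]
        return io.TextIOWrapper(archive.open(member), encoding="utf-8")
    if path.endswith(".gz"):
        return gzip.open(path, mode="rt", encoding="utf-8")
    return open(path, encoding="utf-8")
```

You would then read it like any other text file, for example `with open_raw_dna("my_download.zip") as fh:` (the file name here is hypothetical).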
More about Your Raw DNA Data
While raw DNA data doesn’t contain your entire genome, it does include, oh, just several hundred thousand SNPs. AncestryDNA™’s and FTDNA’s raw data files include about 700,000 markers; 23andMe’s include anywhere from roughly 550,000 to 950,000 markers depending on when you were tested. Here’s an approximate SNP count for the various DNA collectors:
|Approx. # SNPs|DNA Collector|
|---|---|
|701,480|AncestryDNA v1 (pre-May 2016)|
|571,430|23andMe v2 (really old)|
|949,460|23andMe v3 (pre-Nov 2013)|
|552,540|23andMe v4 (pre-Sep 2017)|
|700,000|FamilyTreeDNA (various versions)|
|720,710|MyHeritage v1 & v2|
|606,130|Living DNA v1|
|536,070|Genes for Good v1|
If I had a dollar for every SNP I have…
DNA collectors only use a very small portion of your DNA to generate their results. The big draw of third-party tools like Gene Heritage is they give you a second chance to mine your unused raw DNA data for even more information about yourself. The difference in SNP counts between DNA collectors explains why coverage may vary with your third-party tool.
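To make “coverage” concrete: a tool that relies on a particular set of SNPs can only use the ones your file actually contains. Here is a rough Python sketch of that idea. The function name `coverage` is hypothetical, and every rsID below except rs12913832 is an invented placeholder, not a real marker list from any tool.

```python
def coverage(file_rsids, needed_rsids):
    """Fraction of the SNPs a tool needs that are present in a raw data file."""
    needed = set(needed_rsids)
    if not needed:
        return 1.0
    return len(needed & set(file_rsids)) / len(needed)


# Hypothetical example: the tool wants four SNPs, the file has three of them.
in_my_file = ["rs12913832", "rs0000001", "rs0000002", "rs0000003"]
tool_needs = ["rs12913832", "rs0000001", "rs0000002", "rs0000004"]
print(coverage(in_my_file, tool_needs))  # 0.75
```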
Typically all you need to do with your raw DNA data is download it from your DNA collector and upload it to a third-party tool. But if you want to get in touch with your inner geek by ogling a long list of your single nucleotide polymorphisms, or SNPs, open your raw DNA file in a text editor and you’ll see something like this:
This is a screenshot from my very own AncestryDNA™ raw DNA data (please don’t use it to clone me; I’m not that special, trust me). Raw DNA from other collectors looks very similar, typically with five columns corresponding to:
- Your SNPs (each identified by a code called an rsID number)
- The chromosomes on which each SNP is located
- The position of each SNP on the chromosome
- Your two alleles for each SNP, which take up the last two columns (one comes from your father, one from your mother). Possible allele letters are A (adenine), C (cytosine), G (guanine), T (thymine), or 0 (for missing data).
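For the geekier among us, the five-column layout above can be read with a few lines of Python. This is only a sketch: it assumes the columns are tab-separated, that comment lines start with `#`, and that the header row starts with `rsid`, so check your own file before relying on it. The names `Snp` and `parse_snps` are made up, and the sample rows are invented rather than real data.

```python
from typing import Iterator, NamedTuple


class Snp(NamedTuple):
    rsid: str
    chromosome: str
    position: int
    allele1: str
    allele2: str


def parse_snps(lines) -> Iterator[Snp]:
    """Yield one Snp per data row of a five-column, tab-separated file."""
    for line in lines:
        line = line.strip()
        # Assumption: '#' marks comment lines and the header row starts with 'rsid'.
        if not line or line.startswith("#") or line.lower().startswith("rsid"):
            continue
        rsid, chromosome, position, allele1, allele2 = line.split("\t")
        yield Snp(rsid, chromosome, int(position), allele1, allele2)


# Invented sample rows; the second one shows missing data (0 0).
sample = [
    "# made-up sample, not real data",
    "rsid\tchromosome\tposition\tallele1\tallele2",
    "rs0000001\t1\t12345\tA\tG",
    "rs0000002\t2\t67890\t0\t0",
]
for snp in parse_snps(sample):
    print(snp.rsid, snp.chromosome, snp.allele1 + snp.allele2)
```

Keeping the chromosome as text rather than a number is deliberate, since files may label some chromosomes with letters such as X or Y.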
In this screenshot, the highlighted line corresponds to a variation influencing my eye color.
From this highlighted line I can see that:
- The rsID of the SNP is rs12913832
- The SNP is located on my 15th chromosome (humans have 23 pairs of chromosomes)
- I inherited one A allele from my mother and another A allele from my father.
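If scrolling through several hundred thousand lines isn’t your idea of fun, you can let Python find a SNP for you. The sketch below assumes the same tab-separated, five-column rows described above. The function name `find_genotype` is hypothetical, and the position value in the sample row is a placeholder, not the real coordinate.

```python
def find_genotype(lines, rsid):
    """Return (chromosome, allele1, allele2) for rsid, or None if it is absent."""
    for line in lines:
        fields = line.strip().split("\t")
        if len(fields) == 5 and fields[0] == rsid:
            _, chromosome, _position, allele1, allele2 = fields
            return chromosome, allele1, allele2
    return None


# One row modeled on the highlighted line; the position value is a placeholder.
rows = ["rs12913832\t15\t1\tA\tA"]
print(find_genotype(rows, "rs12913832"))  # ('15', 'A', 'A')
```

With a real file you would pass the open file object instead of the `rows` list, since iterating over a file yields its lines one at a time.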