22 April 2021

A new match - and some more detail on mtDNA mutations

by Hank de Wit

I have a new mtDNA match on Family Tree DNA. I’ve been getting roughly two a year. I’ve sent an email to this match and I’m now waiting to see if they will reply. Fingers crossed.

This new match is GD2 with me. That means that we have a genetic difference of two mutations. So what does this really mean?

The mtDNA molecule is a ring-like structure of about 16,500 base-pairs, each of which carries one of four bases that we usually abbreviate to the letters A, G, T and C.

When a mutation occurs one of the bases will be replaced with one of the others. The bases A and G have a similar shape, as do the T and C bases. So the usual thing is for an A to change to a G, or vice-versa; or a T to a C (and vice-versa). However it is possible for an A to be replaced with a T or a C; it is just less likely.
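For readers who like to see the rule spelled out, here is a minimal Python sketch. Geneticists call a swap within a similar-shaped pair (A with G, or C with T) a transition, and any other swap a transversion. The function name `substitution_kind` is my own invention for illustration.

```python
# A and G are the purines; C and T are the pyrimidines.
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def substitution_kind(old, new):
    """Classify a single-base substitution as a transition or a transversion."""
    bases = PURINES | PYRIMIDINES
    if old not in bases or new not in bases:
        raise ValueError("bases must be one of A, G, C, T")
    if old == new:
        raise ValueError("old and new base are the same, so nothing changed")
    # Same family (both purines or both pyrimidines) is the common case.
    same_family = {old, new} <= PURINES or {old, new} <= PYRIMIDINES
    return "transition" if same_family else "transversion"


print(substitution_kind("A", "G"))  # the common kind of change
print(substitution_kind("A", "T"))  # the less likely kind
```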

Even though mtDNA is a ring structure we can identify a start and end point. So we can identify each base by its position, 1, 2, 3, ... 16569, counting along from the start.

So a mutation can be named by referring to its position. There are two naming conventions. In the RSRS convention a mutation of A to G at position 8535 along the molecule is called A8535G. In the rCRS convention it would be simply 8535G. I’ll use the RSRS convention.

Mutations are not limited to just substitutions as just described. Occasionally a base-pair can be inserted or deleted from the molecule. To ensure that the alignment and numbering of a position doesn’t get affected by these kinds of mutations we name them differently. For example a single insertion of an A base after the position 309 in the molecule is named 309.1A. A second insertion at this position would be 309.2A. A deletion at a position, say 301, would be named 301d.
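The three naming patterns above (substitution, insertion, deletion) are regular enough to be read by a small parser. The following Python sketch is only an illustration of the names as I have described them; the function `parse_mutation` and the dictionary keys are my own choices, and it does not try to cover every notation that the testing companies use.

```python
import re

# Patterns for the three name shapes described above.
SUBSTITUTION = re.compile(r"^([ACGT])(\d+)([ACGT])$")   # e.g. A8535G
INSERTION = re.compile(r"^(\d+)\.(\d+)([ACGT])$")       # e.g. 309.1A
DELETION = re.compile(r"^(\d+)d$")                      # e.g. 301d


def parse_mutation(name):
    """Split a mutation name into its kind, position and bases."""
    m = SUBSTITUTION.match(name)
    if m:
        return {"kind": "substitution", "position": int(m.group(2)),
                "old": m.group(1), "new": m.group(3)}
    m = INSERTION.match(name)
    if m:
        # "index" is 1 for the first inserted base, 2 for the second, etc.
        return {"kind": "insertion", "position": int(m.group(1)),
                "index": int(m.group(2)), "base": m.group(3)}
    m = DELETION.match(name)
    if m:
        return {"kind": "deletion", "position": int(m.group(1))}
    raise ValueError(f"unrecognised mutation name: {name!r}")


for name in ["A8535G", "309.1A", "309.2A", "301d"]:
    print(name, parse_mutation(name))
```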

Let me use my own data as an example. On Family Tree DNA my Haplogroup is designated as H and I have the “Extra Mutations”,

309.1C, 315.1C, T319C, 522.1A, 522.2C, A3505G, A13748G, C16222T

The mutations 309.1C, 315.1C, 522.1A, and 522.2C are all insertions. Insertions at these positions are quite common. For that reason they are usually ignored in the determination of Haplogroups. Family Tree DNA don’t include these mutations in their genetic difference calculation.

So my significant mutations are A3505G, A13748G, C16222T, and T319C. Some of them are used to define the new Haplogroup H37 and its sub-clades.
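The filtering I have just done by eye can be sketched in a few lines of Python. The set of ignored positions (309, 315 and 522) is simply taken from my own list; I am not claiming it is the complete list that Family Tree DNA uses, and the function name `significant` is my own.

```python
# Positions where insertions are common and so are left out.
# Assumption: only the three positions seen in my own results are listed.
IGNORED_INSERTION_POSITIONS = {309, 315, 522}


def significant(mutations):
    """Drop insertions (names like 309.1C) at the commonly ignored positions."""
    kept = []
    for name in mutations:
        position, dot, _ = name.partition(".")
        is_insertion = bool(dot) and position.isdigit()
        if is_insertion and int(position) in IGNORED_INSERTION_POSITIONS:
            continue
        kept.append(name)
    return kept


extra = ["309.1C", "315.1C", "T319C", "522.1A", "522.2C",
         "A3505G", "A13748G", "C16222T"]
print(significant(extra))
# ['T319C', 'A3505G', 'A13748G', 'C16222T']
```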

From the YFull Mtree we have:

H -> A3505G H37 -> A13748G (C16222T) H37b -> T319C H37b1

Each of the “->” arrows represents a branching point of the Haplotree. If we begin at the Haplogroup H, which formed roughly 15,000-25,000 years ago, and look at later mutations through time we get a tree structure.

There are over one hundred direct branches from H. The most common are H1 and H3. Ours is H37 and it is defined by the single extra mutation A3505G.

We have currently found two main sub-branches of H37. By convention the branches are indicated by lower case letters initially and for H37 these are the “a” and “b” branches. For most of us it is the “b” branch defined by the mutations A13748G and C16222T that is important. YFull don’t use C16222T because it is common, but I haven’t found an exception in any of my matches, so I include it.

Lastly we have branches under H37b. The convention is to use digits as the next branch indicators, e.g. “1”, “2”. As the branching continues, we keep alternating letters and digits. For example we could have H1a1b3 as a valid Haplogroup.
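Because the name alternates between letters and digits, it can be split back into the chain of ancestors it came from. Here is a small Python sketch of that idea. The function name `lineage` is my own, and it only handles plain names like the ones in this post (no “*” suffix or other special notation).

```python
import re
from itertools import accumulate


def lineage(haplogroup):
    """Return the chain of branches leading to a haplogroup name."""
    # Split into runs of capitals, lower-case letters, or digits.
    parts = re.findall(r"[A-Z]+|[a-z]+|\d+", haplogroup)
    if not parts or "".join(parts) != haplogroup:
        raise ValueError(f"unexpected characters in {haplogroup!r}")
    # Re-join the parts cumulatively: H, H37, H37b, ...
    return list(accumulate(parts))


print(lineage("H37b1"))
# ['H', 'H37', 'H37b', 'H37b1']
print(lineage("H1a1b3"))
# ['H', 'H1', 'H1a', 'H1a1', 'H1a1b', 'H1a1b3']
```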

For my sub-group the Extra Mutation T319C defines the “1” branch of H37b. So I am H37b1 and there are no Extra Mutations left over.
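The walk down the tree from H to H37b1 can be written as a short Python sketch. The branches are only the ones quoted above from the YFull Mtree (with C16222T included, as I prefer). The names `BRANCHES` and `assign` are my own, and requiring every defining mutation of a branch to be present is a simplification on my part, not a description of how YFull or Family Tree DNA actually assign Haplogroups.

```python
# parent -> list of (child, defining mutations), taken from the tree above.
BRANCHES = {
    "H": [("H37", {"A3505G"})],
    "H37": [("H37b", {"A13748G", "C16222T"})],
    "H37b": [("H37b1", {"T319C"})],
}


def assign(start, mutations):
    """Follow branches while all of a child's defining mutations are present.

    Returns the deepest branch reached and the mutations not used on the way.
    """
    remaining = set(mutations)
    current = start
    moved = True
    while moved:
        moved = False
        for child, defining in BRANCHES.get(current, []):
            if defining <= remaining:
                remaining -= defining
                current = child
                moved = True
                break
    return current, remaining


mine = ["T319C", "A3505G", "A13748G", "C16222T"]
haplogroup, unused = assign("H", mine)
print(haplogroup, sorted(unused))
```

A sample without T319C would stop one step earlier, at H37b.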

My new match differs from my H37b1 by two mutations (likely not insertions).
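One simple way to think about a genetic difference is to count the mutations that one person has and the other does not. The Python sketch below does just that. It assumes the common insertions have already been removed from both lists, and I do not know whether Family Tree DNA’s own calculation is exactly this. The second sample is entirely made up for illustration; it is not my new match’s data, which I don’t have.

```python
def genetic_difference(a, b):
    """Count the mutations found in one list but not in the other."""
    differing = set(a) ^ set(b)   # symmetric difference
    return len(differing), sorted(differing)


# My own significant mutations, from earlier in this post.
mine = ["T319C", "A3505G", "A13748G", "C16222T"]

# HYPOTHETICAL sample, invented for this example only:
# it lacks T319C and carries one mutation that I do not have.
hypothetical = ["A3505G", "A13748G", "C16222T", "C16111A"]

gd, differing = genetic_difference(mine, hypothetical)
print(gd, differing)
# 2 ['C16111A', 'T319C']
```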

For this we need to look at the full tree as I have so far determined it. The YFull tree has insufficient samples as yet.

My tree is: H37 Tree

My tree has two known sub-branches to H37b1, namely H37b1a defined by C16111A, and H37b1b defined by A16170G. In addition there are a number of people under H37b1 which are outside these branches. They are assigned to H37b1* until further samples clarify the issue.

Although I haven’t had a reply from my new match, from the surname information they provide on the match report we can guess that the maternal ancestry is from the Rhineland in Germany.

H37b1b mostly contains Swedish samples and H37b1a contains mostly French samples, though there is one German sample in that group. H37b1* contains samples from a wider distribution (Italy, Austria, Germany, Czechia, Sweden and Finland), so that is a possible home for the new match. Some of these samples are GD2 with me. Hopefully the new match will be GD0 with one or more of these samples, particularly with one that I haven’t been able to contact before and whose exact mutations I do not yet know.

It’s also possible that they do not have the T319C mutation and are outside of H37b1. There are six people already in H37b* that could be identical with the new match.

If the match never replies I can contact other matches in the likely regions to see what GD they get for the new person. This will help place them even if we don’t finally get the exact mutations.
