I am continuing my work on the Encyclopedia of mtDNA Origins from my last post… This is a project that will create mtDNA pages for all named subclades. One part of each page will hold a table of Geno 2 results from National Geographic's Genographic Project. A combination of academic sample collection and public participation, the Genographic database is one of the largest sources of mtDNA results with maternal ancestry information.
Geno 2 Sample Requirement
From the first post, we have a user story, that is, a statement of what the user wants: As a user with mtDNA results, I would like Maternal Origin information from Geno 2.0 tested samples.
Just what are Geno 2 results? What maternal origin information is available?
Geno 2 Background
The original Genographic Project launched in 2005. The first public participation tests were either Y-DNA for men or mtDNA for women. In 2012, National Geographic relaunched their website with new features and a new three-part test. Based on a microarray chip, the test included autosomal and mtDNA results for women. For men it included autosomal, mtDNA, and Y-DNA results. Along with testing, participants could answer demographic questions.
Three of those questions are birthplace, mother's birthplace, and maternal grandmother's birthplace. From them, it is possible to derive a recent maternal origin. Of course, that origin will be too recent to exclude the recent migrations and diasporas in the Americas. Those cases will need individual analysis.
My first step was to download demographic and genetic information from the Genographic DAR, the researcher portal to results.
From there, I imported the results into Excel.
I next removed unneeded columns and added Hg IDs for record keeping. Note, every Genographic participant has a private ID (the GPID). There is a different ID used for the research database. Thus, the Hg ID creates a double layer of removal from the participant's ID.
I then replaced the empty fields and the places where the answer was Unknown with Unspecified. For places where the answer was not a country, I changed it to the appropriate country and corrected misspellings.
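I did this cleanup by hand in Excel, but the same rule can be sketched in Python. This is only an illustration: the function name clean_place and the entries in COUNTRY_FIXES are hypothetical examples, not the actual corrections I applied.

```python
UNSPECIFIED = "Unspecified"

# Hypothetical examples of corrections: a place that is not a country,
# and a misspelled country name. The real list came from reviewing the data.
COUNTRY_FIXES = {
    "Paris": "France",
    "Irland": "Ireland",
}


def clean_place(value):
    """Normalize one birthplace answer to a country name or 'Unspecified'."""
    text = (value or "").strip()
    # Empty fields and 'Unknown' answers both become 'Unspecified'.
    if not text or text.lower() == "unknown":
        return UNSPECIFIED
    # Apply a known correction if there is one, otherwise keep the answer.
    return COUNTRY_FIXES.get(text, text)
```

For example, clean_place("") and clean_place("Unknown") both give "Unspecified", while clean_place("Italy") is returned unchanged.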
Finally, I created the Maternal Origin column. It follows these rules:
- If the Maternal Grandmother's Birthplace is not Unspecified, then it is equal to her birthplace.
- Otherwise, if the Mother's Birthplace is not Unspecified, it is equal to her birthplace.
- If neither of those, it is equal to the participant's birthplace.
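The three rules above amount to a simple fallback chain. Here is a minimal sketch in Python, assuming every field has already been normalized so that a missing answer reads Unspecified. The function and parameter names are my own for illustration; in practice I built the column in Excel.

```python
UNSPECIFIED = "Unspecified"


def maternal_origin(grandmother, mother, participant):
    """Pick the oldest known birthplace on the maternal line."""
    # Rule 1: prefer the maternal grandmother's birthplace.
    if grandmother != UNSPECIFIED:
        return grandmother
    # Rule 2: otherwise fall back to the mother's birthplace.
    if mother != UNSPECIFIED:
        return mother
    # Rule 3: otherwise use the participant's own birthplace,
    # which may itself be 'Unspecified'.
    return participant
```

Note that if all three answers are Unspecified, the Maternal Origin is Unspecified as well.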
I made some minor adjustments in the column name. Note that Geno 2 results currently use Phylotree Build 16.
Building the Geno 2 Sample Database
With the data prepared, I built the content type to hold it in the database. For the past mtDNA Stories and Phylotree Branches, I have used the PODs framework to create custom content types. These let me add special fields and options to each. For Geno 2 data, though, I knew I wanted something more. Custom content types are part of the WordPress data structure. For large sets of records with many fields, that structure can become unwieldy.
Thus, I decided to use PODs Advanced Content Types. These place the records and associated fields in their own table within the database.
The first five fields hold the data I prepared. The next two are relationships. As I mentioned, Geno 2 results use Phylotree Build 16. For the initial launch I have linked each haplogroup to the corresponding mtDNA Story using the Build 16 Hg field. When I convert results to Build 17, I will add the right relationships to the Build 17 Hg field and change settings to use that field.
Meanwhile, I visited my phpMyAdmin interface and used SQL to load the records into the database.
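To give a feel for that kind of bulk load, here is a rough sketch using Python's built-in sqlite3 module as a stand-in for the MySQL database behind WordPress. Everything specific in it is an assumption for illustration: the table name geno2_sample, the column names, and the two sample rows are made up and are not the real schema or data.

```python
import sqlite3

# Made-up rows for illustration only: (Hg ID, haplogroup, maternal origin).
SAMPLE_ROWS = [
    ("HG000001", "K1a12a", "Ireland"),
    ("HG000002", "K1a12a", "Unspecified"),
]


def load_samples(conn, rows):
    """Create the (hypothetical) table if needed and insert all rows."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS geno2_sample ("
        "hg_id TEXT PRIMARY KEY, haplogroup TEXT, maternal_origin TEXT)"
    )
    # Parameterized insert, run once per row.
    conn.executemany(
        "INSERT INTO geno2_sample (hg_id, haplogroup, maternal_origin) "
        "VALUES (?, ?, ?)",
        rows,
    )
    conn.commit()
    # Report how many records the table now holds.
    return conn.execute("SELECT COUNT(*) FROM geno2_sample").fetchone()[0]


if __name__ == "__main__":
    connection = sqlite3.connect(":memory:")
    print(load_samples(connection, SAMPLE_ROWS))
```

The real load ran as SQL statements inside phpMyAdmin against the table that the Advanced Content Type created, so the syntax and names there differ from this sketch.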
Geno 2 Sample Templates
The first template is Get Geno 2 Profiles. Unlike the template for the Phylotree log, I wrote this one to return only the rows for the table. This is because in some cases there will not be results for the branch. Creating the table elsewhere ensures that the lack of records is shown by an empty table and not an empty space on the page.
Second is updating the mtDNA Stories template.
I saved both, and was done.
After looking at the Geno 2 dataset, it was clear that there would be many cases where there were hundreds and sometimes thousands of records for the branch. Making this user-friendly needs a bit more than a table of all results. I decided that the requirements should be extended to include pagination, column sorting, and searching.
One of the top code libraries to add those features to tables is DataTables.
Integrating it could take some time, so I checked to see if someone else in the WordPress community had already done so with a plugin. Yes! WP jQuery DataTable does exactly that.
After a few style and setting adjustments, it was ready.
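DataTables does its work in the browser, so the following is not how the plugin is implemented. It is only a small Python sketch of the three features I asked for (searching, column sorting, and pagination) applied to a list of records. The function name view_rows and the field names in the usage note are hypothetical.

```python
def view_rows(rows, search="", sort_key=None, descending=False,
              page=1, page_size=10):
    """Return (rows on the requested page, total number of matches)."""
    needle = search.lower()
    # Searching: keep rows where any field contains the search text.
    matches = [
        row for row in rows
        if needle in " ".join(str(value) for value in row.values()).lower()
    ]
    # Column sorting: order by one field, ascending unless told otherwise.
    if sort_key is not None:
        matches.sort(key=lambda row: str(row[sort_key]), reverse=descending)
    # Pagination: slice out the requested page (pages are numbered from 1).
    start = (page - 1) * page_size
    return matches[start:start + page_size], len(matches)
```

Called as view_rows(records, search="ireland", sort_key="origin", page=1), it would return the first ten matching records together with the total match count, which is what a table footer needs in order to show the number of pages.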
Looking at Results
With all of that done, it is time to check out the results. Here is how it looks for K1a12a.
Not bad. To be continued…