Editorial Article: The use of probabilistic genotyping software in forensic DNA analysis

In this guest editorial, Dr. Michael Coble discusses how probabilistic genotyping (PG) software has revolutionized the ability of forensic labs to interpret DNA profiles

14 Sep 2021

Dr. Michael Coble, associate professor and associate director of the Center for Human Identification at the University of North Texas Health Science Center
Dr. Michael Coble is a fellow of the American Academy of Forensic Sciences and a member of the International Society for Forensic Genetics

While interpreting DNA evidence has been a staple of forensic labs for the past three decades, the methodologies used have continuously evolved with the introduction of new processes and technologies. No technology has impacted the ability of labs to interpret DNA profiles, though, quite like probabilistic genotyping (PG) software, writes Dr. Michael Coble, an associate professor and the associate director of the Center for Human Identification at the University of North Texas Health Science Center in Fort Worth, Texas.

Cracking cases worldwide
Unlike the manual, binary approach to mixture interpretation that preceded it, PG software has enabled forensic labs to use far more of the available DNA profile to determine whether or not a person of interest is a possible contributor to evidence from a criminal investigation. This means that low-level, degraded, or mixed DNA samples that would once have been set aside as uninterpretable or inconclusive may now yield DNA results that are not only interpretable but also stand up well to scrutiny in subsequent judicial proceedings.

This, of course, represents a huge departure from the previous heuristic methodology. Relying on fixed thresholds and other biological parameters (peak height ratios, mixture proportion ratios, stutter ratios, etc.) to analyze DNA samples, these earlier approaches work well – and are generally accepted in court proceedings – for single-source, most two-person, and some three-person mixtures. When the DNA mixtures are more complex, though, the heuristic method becomes unwieldy for human interpretation.

Previous methodologies are also extremely slow and tedious, particularly when compared to PG software, which can rapidly assess thousands of proposed profiles for how well they explain an observed DNA mixture.

As a result, PG software has rapidly become a preferred method for DNA analysis in hundreds of forensic labs worldwide. To date, it has been used successfully to resolve more than 220,000 cases and has proven to be particularly effective in contributing to the resolution of violent crime and sexual assault cases. It has also been useful in exonerating innocent individuals wrongly associated with crime scene evidence, and in reopening cold cases in which low-grade or mixture evidence originally dismissed as inconclusive can now be reexamined.

Adoption of PG software in forensic labs
Beyond its ability to quickly produce useful, interpretable DNA results, PG software has generally held up well against numerous legal challenges. With a few exceptions, the courts have ruled that evidence produced by PG software meets the standards for admissibility, which include, first and foremost, scientific validity.

Citing the scientific literature, which contains numerous peer-reviewed studies on the use of PG software, proponents – and in many cases, the courts – have also noted the software’s foundation in well-established scientific principles (such as Markov chain Monte Carlo, which is regularly used in computational biology, weather prediction, engineering, and physics) and its codification by organizations around the world, such as the UK Forensic Science Regulator, the International Society for Forensic Genetics, and the Scientific Working Group on DNA Analysis Methods.
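For readers unfamiliar with Markov chain Monte Carlo, the general idea can be sketched in a few lines of Python. This is a toy random-walk Metropolis sampler, not the algorithm of any PG package: the data (one contributor's share of total peak height at six loci), the Gaussian noise model, and every name and parameter value below are assumptions invented for illustration.

```python
import math
import random


def log_likelihood(phi, observed_shares, sigma=0.05):
    """Gaussian log-likelihood of the observed shares given proportion phi."""
    return sum(-0.5 * ((x - phi) / sigma) ** 2 for x in observed_shares)


def metropolis(observed_shares, n_steps=20000, step=0.05, seed=1):
    """Random-walk Metropolis sampler for a mixture proportion in (0, 1)."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    phi = 0.5                  # arbitrary starting value
    current_ll = log_likelihood(phi, observed_shares)
    samples = []
    for _ in range(n_steps):
        proposal = phi + rng.gauss(0.0, step)
        # Uniform prior on (0, 1): proposals outside the range are rejected.
        if 0.0 < proposal < 1.0:
            proposal_ll = log_likelihood(proposal, observed_shares)
            # Accept with probability min(1, likelihood ratio).
            accept_prob = math.exp(min(0.0, proposal_ll - current_ll))
            if rng.random() < accept_prob:
                phi, current_ll = proposal, proposal_ll
        samples.append(phi)
    return samples


# Hypothetical data: contributor A's share of peak height at six loci.
observed = [0.72, 0.68, 0.75, 0.70, 0.66, 0.71]
samples = metropolis(observed)
kept = samples[2000:]  # discard early samples as burn-in
estimate = sum(kept) / len(kept)
print(f"Estimated mixture proportion: {estimate:.3f}")
```

The sampler proposes small changes to the candidate explanation and keeps those that explain the data about as well or better, so it spends most of its time on plausible values. Real PG models are far richer than this, but the accept-or-reject loop is the core of the technique.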

While the impact of PG software on case clearance and resolution is undeniable, adoption by some forensic labs has been surprisingly slow. This may be attributable in part to the nature of the labs themselves. Busy forensic labs often point to their heavy caseload and the intense pace of the work, which combine to make it difficult to free up sufficient time for proper installation, internal validation of the software, training, and implementation.

Training tips
Training can be an especially time-consuming process since it involves instruction in the principles and practices of the software being deployed, interpretation of DNA evidence and data produced by PG software, and the formulation of meaningful propositions used to assign a likelihood ratio (LR, a component of Bayes’ theorem).
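For reference, the likelihood ratio and its place in Bayes' theorem can be written compactly. The notation is chosen for this illustration: E stands for the observed DNA evidence, and H_1 and H_2 for the two competing propositions (for example, that the person of interest is, or is not, a contributor).

```latex
\[
\mathrm{LR} = \frac{\Pr(E \mid H_1)}{\Pr(E \mid H_2)},
\qquad
\underbrace{\frac{\Pr(H_1 \mid E)}{\Pr(H_2 \mid E)}}_{\text{posterior odds}}
= \mathrm{LR} \times
\underbrace{\frac{\Pr(H_1)}{\Pr(H_2)}}_{\text{prior odds}}
\]
```

An LR above 1 means the evidence is more probable under H_1 than under H_2, an LR below 1 means the reverse, and an LR close to 1 means the evidence does little to distinguish the two propositions.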

Another essential, but sometimes missing, part of the training process involves preparing forensic analysts to effectively convey the intricacies of DNA evidence in admissibility hearings, written reports, and court proceedings, and to laypeople. This entails not simply being able to clearly explain to a judge, jury, prosecutors, and defense attorneys what PG software is, how it works, and why its findings are scientifically reliable, but doing so in a way that conveys confidence in its use and conclusions.

To speed successful adoption and use of PG software, forensic labs can take advantage of the experiences and best practices established by other labs already using PG for casework. Labs are typically quite willing to share what has worked (and what has not) in their own deployments. While best practices vary depending on the software package used and personal practices, labs that have implemented PG software successfully generally have taken a project approach, assigning proper resources, setting milestones, allowing analysts sufficient time to complete validation work, and above all, being realistic about timeframes. A project approach also allows analysts to be involved in each stage of the validation and deployment, creating greater buy-in from the entire team.

Another resource can be found in transcripts of trials and other court proceedings, which are readily available online for analysts to review in advance of being required to appear in court. Software developers and forensic experts can also provide in-person counsel and training for addressing the questions prosecutors and defense attorneys are likely to pose with respect to PG software.

With advance preparation, analysts will be ready to discuss in everyday language both the complexities of PG software (including issues related to validation, peer review, acceptance in relevant communities, etc.) and the legal standards for the admissibility of scientific evidence. 

Beyond proper training, forensic labs should regularly review both the peer-reviewed literature and the validation studies performed by other forensic labs to understand how PG software is being used and what issues are being encountered. To validate their own software, labs must also perform their own in-house (internal) studies, testing a series of single-source profiles to model the peak height variability inherent to the lab system and running an appropriate number of mixtures.

Understanding the limitations 
Labs need to recognize that PG software, like all software, has its limitations. No matter how effective it is, PG software simply cannot interpret all DNA profiles. There will always be DNA samples that are far too complex, too degraded, or too limited in information to yield a genetic profile. The general trend is that the less information the mixture evidence profile contains, the less informative the LR generated by the software will be.
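A small arithmetic sketch shows why less information tends to mean a less informative LR. It assumes that loci are treated as independent, so per-locus LRs multiply (equivalently, their logarithms add). The per-locus values are invented for illustration and do not come from any real case or software.

```python
import math


def combined_log10_lr(per_locus_lrs):
    """Sum log10 of per-locus LRs, equivalent to multiplying the LRs."""
    return sum(math.log10(lr) for lr in per_locus_lrs)


# Hypothetical per-locus LRs for a profile with usable data at six loci.
full_profile = [12.0, 8.5, 20.0, 15.0, 9.0, 11.0]

# The same sample if four loci carried no usable information (LR = 1 each).
degraded_profile = [12.0, 8.5, 1.0, 1.0, 1.0, 1.0]

full = combined_log10_lr(full_profile)
degraded = combined_log10_lr(degraded_profile)
print(f"log10(LR), full profile:     {full:.2f}")
print(f"log10(LR), degraded profile: {degraded:.2f}")
```

Each locus that contributes no information adds zero to log10(LR), so the combined value drifts toward 0 (an LR of 1) as data are lost.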

By approaching implementation methodically, allowing adequate time for training, and sharing results with their peers, forensic labs will be able to make even greater use of PG software in analyzing DNA evidence, resolving profiles, and clearing caseloads. This, in turn, will have a tremendously positive impact on the quantity and quality of their analyses, which will only get better as developers fine-tune their work and forensic labs improve their own methods for training and deployment.