Editorial Article: Probabilistic genotyping: The forensic lab software used to crack cold cases

Dr. Bruce Budowle discusses how probabilistic genotyping is helping forensic labs identify low-level, degraded DNA samples

11 Nov 2020


Dr. Bruce Budowle served 26 years with the FBI

In this guest editorial, Dr. Bruce Budowle, professor at the University of North Texas Health Science Center and Director of the Center for Human Identification, looks at how binary DNA interpretation decision-making approaches are being replaced by probabilistic genotyping (PG). Here, the former chief of the FBI's Forensic Science Research Unit explores the potential benefits of adopting PG software in forensic laboratories, where it has been used to reinterpret evidence originally deemed inconclusive and has played a role in supporting convictions and exonerations.

Traditional DNA profiling challenges

Interpreting low-level, degraded, or mixture profiles from DNA evidentiary material is one of the more challenging aspects of forensic DNA analysis.

Traditionally, laboratories have dealt with mixture DNA profiles using simplified interpretation strategies. Various fixed thresholds and other biological parameters, such as heterozygote balance, mixture component ratios, and stutter ratios, were implemented. The interpretation relied on genetic data being predominantly above one or more thresholds, so that inclusion or exclusion as a potential contributor to the mixture was an unequivocal outcome. This manual practice of mixture interpretation was usually applied to two-person mixtures. The process, though, was unwieldy for more complex mixtures.
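The binary, threshold-based logic described above can be sketched in a few lines of Python. This is a simplified illustration only: the threshold values, the function names, and the definition of heterozygote balance as the smaller peak height divided by the larger (one common convention) are assumptions chosen for the example, not the settings of any particular laboratory or software.

```python
# Illustrative thresholds only; real laboratories set these through validation.
ANALYTICAL_THRESHOLD_RFU = 50    # hypothetical minimum peak height
MIN_HETEROZYGOTE_BALANCE = 0.60  # hypothetical minimum peak height ratio


def heterozygote_balance(height_a, height_b):
    """Ratio of the smaller to the larger peak height (between 0 and 1)."""
    larger = max(height_a, height_b)
    if larger <= 0:
        raise ValueError("peak heights must be positive")
    return min(height_a, height_b) / larger


def classify_locus(height_a, height_b):
    """Return a binary-style call for a locus with two observed peaks."""
    if min(height_a, height_b) < ANALYTICAL_THRESHOLD_RFU:
        return "inconclusive"  # a peak falls below the fixed threshold
    if heterozygote_balance(height_a, height_b) < MIN_HETEROZYGOTE_BALANCE:
        return "inconclusive"  # peaks too imbalanced to pair confidently
    return "interpretable"
```

In this sketch a locus either passes every fixed rule or is set aside, which is why low-level peaks that still carry information end up discarded under a binary approach.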

Today, the sensitivity of commercial short tandem repeat (STR) multiplex kits has increased, and the type of evidence being analyzed and interpreted has moved from predominantly high-quality, single-source profiles and simple mixtures to lower-quality and more complex mixtures. These developments, along with the enhanced detection capability, have necessitated a concomitant increase in the complexity of DNA interpretation methodology. The manual interpretation approach(es) would often discard complex data as “inconclusive,” in some scenarios throwing away good data that may include, and equally important, exclude persons of interest. The traditional interpretation process no longer sufficed.

Probabilistic genotyping: A new form of DNA profiling

Fortunately, with the advent of the personal computer and increased computational power made readily accessible, the binary decision-making manual approach can be replaced with a more effective probability-based system known as probabilistic genotyping (PG).

Over the past few years, PG software has become the interpretational method of choice in forensic laboratories. Employing methods such as Markov chain Monte Carlo, which are routinely used in computational biology, physics, engineering, weather prediction, and stock market analysis, PG software can assess multitudes of proposed profiles on how closely they resemble or can explain an observed DNA mixture profile. From there, the probability of the observed DNA evidence can be calculated, assuming the DNA originated either from a person of interest or from an unknown donor (as well as other possible propositions). These two probabilities, in turn, are then presented as a likelihood ratio (LR), which conveys the value of the findings and the level of support for one proposition over the other.
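The likelihood ratio described above is the probability of the evidence under one proposition divided by its probability under the alternative. As a minimal sketch, assume the per-locus probabilities have already been computed and that loci can be treated as independent, so the per-locus ratios multiply. The function names and all numbers below are invented for illustration and do not come from any PG software.

```python
import math


def likelihood_ratio(p_evidence_h1, p_evidence_h2):
    """LR = P(evidence | H1) / P(evidence | H2)."""
    if p_evidence_h2 <= 0:
        raise ValueError("P(evidence | H2) must be positive")
    return p_evidence_h1 / p_evidence_h2


def combined_log10_lr(per_locus_probabilities):
    """Sum log10 of per-locus LRs (assumes independent loci)."""
    return sum(
        math.log10(likelihood_ratio(p_h1, p_h2))
        for p_h1, p_h2 in per_locus_probabilities
    )


# Hypothetical (P(E | H1), P(E | H2)) pairs for three loci.
# H1: the person of interest is a contributor; H2: an unknown donor is.
per_locus = [(0.02, 0.001), (0.05, 0.005), (0.01, 0.002)]
log10_lr = combined_log10_lr(per_locus)
```

With these made-up values the per-locus ratios are about 20, 10, and 5, so the combined LR is about 1,000 (a log10 LR of about 3): the evidence would be roughly a thousand times more probable under the first proposition than under the second. Working on the log scale avoids numerical underflow when many loci are combined.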

The power of the PG software enables the resolution of possible components of highly complex DNA mixtures far better than what was imaginable with the manual method alone. Thus, more meaningful and reliable data are available to support investigators and the judicial system. PG software has proven particularly effective in producing usable, interpretable, and reliable DNA results in criminal cases (including violent crime and sexual assault cases), and – just as importantly – in excluding individuals who have been wrongly associated as the source of crime scene evidence.

Cracking cold cases

PG software has also been instrumental in helping to solve cold cases in which evidence originally dismissed as inconclusive under a manual interpretation method could be reanalyzed and then used to develop investigative leads. Similarly, PG tools have played a role in supporting exonerations in post-conviction cases involving individuals who were wrongly convicted.

For those using findings produced by PG software to support their case, the validity and reliability of PG tools have been demonstrated. The scientific literature is replete with peer-reviewed studies on the strengths and limitations of the use of PG software, and the tools are continuously being improved to increase efficacy and address challenges inherent in DNA typing of forensic evidence.

Some software has been subjected to developmental and internal validation studies in order to promote robust scrutiny and proper use by the scientific community. These supporting data also form the foundation for the admissibility of PG software in the courtroom. As a result, PG software has already been used in thousands of cases around the world. As an example, one such software, STRmix™, has been used to interpret DNA evidence in more than 220,000 cases worldwide since 2012.

A profound impact on forensics

While the benefits are clear, like any other technology PG software has limitations. It cannot, for example, interpret all DNA profiles. There are DNA profiles that have too little information or are far too complex to interpret. Therefore, it is incumbent on the user to become properly educated on the use of the software and to perform validation studies to understand the limitations of the tools employed. Doing so will reduce the chances of interpreting software output that is not supportable. 

Another limitation is the false belief that using PG software renders all such interpretations completely objective. The choices of algorithms and programming are made by humans. As a result, subjectivity is inherent in the software. Using such software can reduce the variation among users, however, and in doing so decrease subjectivity throughout the scientific community. Again, proper education on what PG software can and cannot do, as well as training in cognitive bias, may help reduce the misunderstanding of objectivity and subjectivity and promote proper interpretation of results.

Claims have been raised that PG software may or does contain miscodes, which is likely a valid assertion for all software. Given that, the real question should be: what impact does an identified (or unknown) miscode have on the PG-generated result? One should consider whether mechanisms are in place to detect miscodes and, if so, whether identified miscodes have been evaluated and/or rectified. Validation studies are one effective method for identifying miscodes, which can be accomplished in part by examining extended output that contains the intermediate steps of the interpretation process.

Finally, PG software has been criticized as lacking adequate peer review because developers are often the authors of the studies that comprise the scientific literature. While there is perhaps some validity to this criticism, it ignores the major part of the peer-review process. Publication of a paper is only the start of the peer-review process. The more effective part of peer review comes once the greater scientific community reads the paper(s), comments, critiques, and at times performs studies to demonstrate any flaws or alternatives. To date, the overall peer-review process supports the conclusion that PG software is capable of generating reliable results when it is used properly.

PG software continues to be a substantial improvement over manual interpretation. With proper training and use, it will continue to have a profound impact in the forensic arena, providing reliable data from a broader range of DNA evidence, mixtures in particular, and contributing results that can assist investigators and the judicial system.
