A rapid approach to profiling the HPV virome for multiple applications in discovery and therapeutics

Learn how researchers at the Brooke Army Medical Center developed a simple, accurate, and automated method for virome profiling that could advance HPV biomarker discovery and surveillance

11 May 2023
Georgina Wynne Hughes
Editorial Assistant
Jane Shen-Gunther, MD, PhD, COL, MC, Brooke Army Medical Center

Human papillomavirus (HPV) is the most common viral cause of cancer worldwide and is accountable for approximately 600,000 cases of cervical cancer every year. Revolutionary advances in next-generation sequencing (NGS) technologies have paved the way for research into HPV diversity and pathogenesis, with the goal of improving diagnostic testing and therapeutic options. However, the full potential of these technologies has been hindered by the slow development and implementation of accompanying bioinformatics analysis tools.

In this SelectScience® interview, we speak with COL Jane Shen-Gunther, Chief of the Department of Clinical Investigation at Brooke Army Medical Center, to learn how her team has addressed this challenge by pairing a customized HPV genome database with an automated NGS data analysis workflow.

COL Jane Shen-Gunther is a physician-scientist sub-specializing in gynecologic oncology and has served as the Chief of the Department of Clinical Investigation at Brooke Army Medical Center for over 11 years. She has led the department’s HPV Research Program and has published several landmark studies on HPV profiling, including pioneering a novel molecular diagnostic test as an alternative to traditional cervical cancer screening [1]. Currently, her team is focused on discovering and profiling the assemblage of HPV viruses in cervical cytology samples, known as the HPV virome. To this end, they are developing wet and dry lab techniques to accelerate and simplify this process, including the use of NGS and bioinformatics analysis tools.

The NGS data maze

The era of NGS has transformed viral discovery and metagenomic research. With the commercialization of high-throughput sequencing (HTS) instruments, the variety and choice of sequencing technologies, chemistries, platforms, capacities, and kits have expanded exponentially. As a result, “the wet lab portion of genomics research has become more streamlined, simpler, and accessible to the researcher,” says COL Shen-Gunther.

However, despite this progress, dealing with the enormous volumes of complex data produced by NGS has remained a consistent challenge in viral metagenomic analysis. “This is compounded by disparate, open-source tools which are often command-line based and require advanced computational or coding skills,” says COL Shen-Gunther. “In addition, viral taxonomies based on the International Committee on Taxonomy of Viruses (ICTV) are subject to new revisions, which impose retooling of existing software and restructuring of reference databases for viral metagenomic analysis. These challenges pose a huge hurdle for clinical virologists and physician-scientists who may not be well-versed in bioinformatics and are time-constrained.”

Enter HPV DeepSeq

To circumvent this challenge, COL Shen-Gunther and her colleagues set out to develop a simpler workflow for NGS data analysis that could offer a rapid, multi-functional, and accessible solution suitable for researchers without bioinformatics expertise. The team first created a customized HPV database, adapted from the Papillomavirus Episteme (PaVE) reference genomes, to compile a comprehensive list of clinically relevant, taxonomized HPV genomes. This curated database was then incorporated into user-friendly, GUI-based commercial software to develop and test automated workflows for HPV taxonomic profiling and visualization. The resulting method, collectively termed HPV DeepSeq, offers “a simple, rapid, and accurate method for NGS data analysis,” says COL Shen-Gunther. This approach, she adds, will “ultimately propel HPV research and serve a broad range of applications, from discovery to therapeutics.”

Department of Clinical Investigation at Brooke Army Medical Center

During the study, HPV DeepSeq workflows for HPV virome profiling were evaluated using a subset of HPV-positive cervical cytology samples [2]. Deep sequencing of these clinical samples was carried out by LGC Biosearch Technologies, a collaboration supported by the Department of Defense (DoD) Congressionally Directed Medical Research Programs (CDMRP) grant. “Biosearch Technologies assisted us with the next-generation sequencing of almost 3,000 HPV-positive Pap smear samples,” explains COL Shen-Gunther. “This helped us identify the HPV viromes, genotypes, and variants found most commonly in high-grade, pre-cancerous Pap smears. As a result, the HPV viromes within 6 categories of Pap smears from normal to high-grade have now been characterized and curated.” Here, COL Shen-Gunther notes the potential of HPV viromes to serve as biomarkers for the degree or severity of pre-cancerous changes on the cervix. Similarly, she suggests that the methods developed and tested in the study could be used to identify HPV virome patterns and trends as a surveillance tool, which may help guide the development of future HPV vaccines.

Currently, HPV DeepSeq is being used for HPV discovery and research purposes, but COL Shen-Gunther and her team hope to translate their research database into a clinical database for use in clinical and public health laboratories. “This rapid, streamlined approach for the wet and dry labs will take a tremendous burden off clinical molecular laboratories and scientists,” she says. “Standardizing and automating results will facilitate clinical decision making and expedite patient care.”

Future outlooks

Improvements in bioinformatics and the development of streamlined, automated workflows such as HPV DeepSeq offer vast practical value for advancing our understanding of oncogenic viruses and developing new diagnostic and therapeutic strategies – and COL Shen-Gunther is at the forefront of these efforts. Building upon the current study, her team has also streamlined the process of mapping HPV integration sites within the host genome [3]. “These relatively new viral hybrid capture (VHC) and viral integration site (VIS) analysis workflows have proven to be rapid and accurate for localizing viral-host integration sites and identifying disrupted and neighboring human genes,” she says. “The hope is that by applying HPV mapping to pre- or invasive cancers we can advance our understanding of viral oncogenesis and facilitate the development of therapeutic agents.”


1. The pap smear challenge: Comparing clinical performance of a novel ‘molecular pap’ based on next-generation sequencing to traditional cervical cancer screening. [online] Available at: https://cdmrp.health.mil/dmrdp/research_highlights/20shen-gunther_highlight.aspx [Accessed 21 Apr. 2023]

2. Shen-Gunther, J., Xia, Q., Cai, H. and Wang, Y., 2021. HPV DeepSeq: An Ultra-Fast Method of NGS Data Analysis and Visualization Using Automated Workflows and a Customized Papillomavirus Database in CLC Genomics Workbench. Pathogens, 10(8), p.1026.

3. Shen-Gunther, J., Cai, H. and Wang, Y., 2022. HPV Integration Site Mapping: A Rapid Method of Viral Integration Site (VIS) Analysis and Visualization Using Automated Workflows in CLC Microbial Genomics. International Journal of Molecular Sciences, 23(15), p.8132.


The Defense Health Agency (DHA) of the U.S. Department of Defense has licensed the customized HPV database described herein to QIAGEN Digital Insights. The inventor of the customized taxonomy is J.SG. This editorial has undergone PAO review at Brooke Army Medical Center and was cleared for publication. The views expressed herein are those of the author(s) and do not necessarily reflect the official policy or position of the Defense Health Agency, Brooke Army Medical Center, the Department of Defense, nor any agencies under the U.S. Government.