A High-Impact Open-Source Tool for Digital Pathology Quality Control

Ahead of his presentation at the 6th Digital Pathology and AI Congress: Europe, we spoke to Andrew Janowczyk about his work creating scalable data analysis techniques to facilitate cancer research through the development of open-source Digital Pathology tools.

How has digital pathology been applied to your work?

Roughly 40% of the population will be diagnosed with some form of cancer in their lifetime. In a large majority of these cases, a definitive cancer diagnosis is only possible via histopathologic confirmation on a tissue slide. With the increasing popularity of the digitisation of pathology slides, a wealth of new untapped data is now regularly being created.

Computational analysis of these routinely captured H&E slides is facilitating the creation of diagnostic tools for tasks such as disease identification and grading. Furthermore, by identifying patterns of disease presentation across large cohorts of retrospectively analysed patients, new insights for predicting prognosis and therapy response are possible.

Such biomarkers, derived from inexpensive histology slides, stand to improve the standard of care for all patient populations, especially where expensive genomic testing may not be readily available. Moreover, since numerous other diseases and disorders, such as oncoming clinical heart failure and kidney disease, are similarly diagnosed via pathology slides, those patients also stand to benefit from these same technological advances in the digital pathology space.

My presentation at the congress will discuss our research towards the goal of precision medicine, wherein patients receive optimised treatment based on historical evidence. The talk will discuss how applications of deep learning in this domain are significantly improving the efficiency and robustness of the computational models involved.

Numerous challenges remain, however, especially in the context of quality control and annotation gathering. Therefore, my talk will introduce the open-source tools we are developing and deploying to meet these pressing needs, such as HistoQC and HistoAnno.

Can you tell us more about HistoQC?

HistoQC is a tool for high-speed automated quality control that helps identify and delineate artefacts and discover cohort-level outliers. Using a variety of image metrics, the program identifies artefact-free regions on digital slides that are suitable for downstream analysis.
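To give a feel for what discovering a cohort-level outlier can mean in practice, here is a minimal Python sketch. It is not HistoQC's implementation: the function name `flag_outliers`, the choice of a median-absolute-deviation rule, the 3.5 threshold, and the brightness values are all assumptions made up for illustration.

```python
from statistics import median


def flag_outliers(metric_by_slide, threshold=3.5):
    """Return the names of slides whose metric is far from the cohort median.

    Uses the modified z-score (based on the median absolute deviation),
    a common robust outlier rule. Hypothetical helper, not part of HistoQC.
    """
    values = list(metric_by_slide.values())
    med = median(values)
    mad = median(abs(v - med) for v in values)
    if mad == 0:
        # All slides (or most of them) share the same value: nothing to flag.
        return []
    return sorted(
        name
        for name, value in metric_by_slide.items()
        if 0.6745 * abs(value - med) / mad > threshold
    )


# Invented per-slide mean brightness values on a 0-1 scale.
cohort = {
    "slide_01": 0.71,
    "slide_02": 0.69,
    "slide_03": 0.72,
    "slide_04": 0.70,
    "slide_05": 0.35,  # much darker than the rest of the cohort
    "slide_06": 0.73,
}

print(flag_outliers(cohort))
```

A median-based rule is used here because a single badly scanned slide would otherwise distort a mean and standard deviation computed over a small cohort.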

The process of slide creation, whether physical or digital, is fraught with the potential for preanalytic variance that can significantly affect the quality of tissue slides, and thus impact clinical and research workflows. Whilst highly necessary, most current quality control processes are manual and therefore time-consuming and vulnerable to human error.

HistoQC was therefore developed as an open-source modular tool for quality control of Digital Pathology slides, seeking to facilitate the automated assessment of slide quality.

A recent comparison of the program against manual quality control, undertaken by two pathologists on 450 digital slides, demonstrated an average agreement of over 95% [1]. This result indicates great potential for HistoQC in meeting clinical and research needs for automated quality control, both for assessing slide quality and for identifying artefacts.
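For readers unfamiliar with agreement figures of this kind, the sketch below shows one simple way such a number could be computed. It assumes plain percent agreement averaged over two reviewers; the exact metric used in the study is described in [1], and the labels here are invented for illustration only.

```python
def percent_agreement(tool_labels, reviewer_labels):
    """Percentage of slides on which the tool and a reviewer give the same label."""
    if len(tool_labels) != len(reviewer_labels):
        raise ValueError("label lists must have the same length")
    if not tool_labels:
        raise ValueError("label lists must not be empty")
    matches = sum(t == r for t, r in zip(tool_labels, reviewer_labels))
    return 100.0 * matches / len(tool_labels)


# Invented labels for ten slides: True means "acceptable quality".
tool = [True, True, False, True, False, True, True, True, False, True]
pathologist_1 = [True, True, False, True, True, True, True, True, False, True]
pathologist_2 = [True, True, False, True, False, True, True, True, False, True]

scores = [percent_agreement(tool, p) for p in (pathologist_1, pathologist_2)]
average_agreement = sum(scores) / len(scores)
print(scores, average_agreement)
```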

What does the future hold for Digital Pathology?

We’re already starting to see the potential impact of digital pathology on diagnosis and workflow streamlining. Some of our algorithms can find patterns of disease associated with prognosis or therapy response that are not available by other means, which may allow for more precise treatment of patients based on historical data.

Emerging technology is allowing us to search patient databases for similar cases based on the particular presentation of the disease. This potentially opens up new avenues for optimised treatment by observing successful treatments for patients in similar situations and reapplying them to new cases going forward. Digital Pathology will help in both of these regards, as well as addressing the universal pathologist shortage.

Furthermore, with a combination of diagnostics as a remote software service, and in-house machine learning primary screening, we can hopefully enable pathologists to have more time to focus on difficult cases by streamlining their workflows.

There are two developments I would like to see in the field:

  1. A single set of standards for the creation of the digital image. Currently, each scanner manufacturer has a proprietary slide format, which makes reading the slides non-trivial when working across different cohorts. While some third-party tools exist to help with this (e.g., OpenSlide), it would make more sense for scanner manufacturers to address this via a single format (similar to the way DICOM helped in the radiology space). While there is interest in DICOM for digital pathology, it has not yet been widely adopted, and scanners are not yet natively saving images in this format.
  2. Increased access to data. With the advent of deep learning, the major limitation on scientific breakthroughs and superior tools will soon be access to data. In the research domain, we’ve already seen progress significantly expedited through the sharing of our limited research datasets. One can only wonder what an ideal world would look like if we could sufficiently address privacy and ethical concerns such that all public health data, including digital pathology slides, could be shared universally.


Andrew Janowczyk is Assistant Research Professor in Biomedical Engineering at the Centre of Computational Imaging and Personalized Diagnostics (CCIPD), Case Western Reserve University. He maintains a blog sharing his digital pathology and deep learning experiences, code, and data with the community at


The 6th Digital Pathology & AI Congress: Europe will explore how Deep Learning could aid the classification of soft tissue tumours. Take a look at the agenda and see the full line up of presentations and case studies.


[1] Janowczyk, A., Zuo, R., Gilmore, H., Feldman, M., & Madabhushi, A. (2019). HistoQC: An Open-Source Quality Control Tool for Digital Pathology Slides. JCO Clinical Cancer Informatics, 3, 1-7. doi:10.1200/CCI.18.00157.
