What AI Can and Can’t Do for Digital Pathology Right Now
Posted 15th July 2020 by Liv Sewell
It was fascinating to speak with Hamid Tizhoosh, Professor at the Faculty of Engineering at the University of Waterloo in Canada, Director of KIMIA Lab and keynote speaker at the 6th Digital Pathology & AI Congress: USA, about using AI to transform what is possible in medical imaging.
The KIMIA Lab focuses on harnessing machine learning to advance medical image handling. What led you to this field of research?
When I first got involved in the field in 1996, a lot of work had been done to understand, segment, measure, classify, and make predictions from medical images. I had already been involved in AI research for three or four years and I had an interest in developing computer vision to support medical imaging.
In the 1990s and early 2000s, computer vision was mainly based on hand-crafted techniques. AI was still in a phase of widespread disappointment because, at that point, it was not meeting expectations. However, by 2010 it was clear that deep learning could enable major breakthroughs in computer vision and, it was expected, in medical imaging as well.
My team and I had experience with AI and with medical images. What we hadn’t had before was a technology powerful enough to approach them. I realised that it was time to capitalise on our experience with medical images and deep learning technology. Now we are revisiting the issues and problems of medical images and harnessing recent advances in machine learning to build tools to work with medical images.
What potential does deep learning hold for the problems facing Pathologists?
The first major successes using deep learning with digital images were reported in 2012. Prior to that, for every single task with medical images, including pathology, you had to spend years developing a very customised and specialised tool to perform that one specific task. These tools were so specialised that if you changed one parameter in the system, or something changed on the imaging side, the tool would collapse and become effectively useless. They were very inflexible, even though they had taken years of painstaking customisation to design.
Deep learning has provided a much more flexible and general framework for designing and training highly customised networks, small and large, giving us a collection of solutions that you can adjust to your domain if you have access to training data. This ultimately allows image tasks to be handled much more accurately and reliably, without constantly creating bespoke solutions from scratch.
Radiology went digital in the 1990s, but pathology has been much slower, and digital images in pathology remain relatively scarce. Now that we have a flexible framework, deep networks can be applied to pathology problems relatively easily, given the right data. While getting hold of the data is still an obstacle, it is now possible to come up with a solution for a specific problem within days or weeks, not years.
The flexibility of deep networks has replaced the tedious, rigid schemes of handcrafted design.
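In miniature, that adjust-to-your-domain workflow amounts to fine-tuning: reuse a general pretrained feature extractor and train only a small task-specific head on your own data. The sketch below is a hypothetical numpy toy (none of the names or numbers come from the interview, and labels are synthesised for the illustration), not KIMIA Lab code:

```python
import numpy as np

# Hypothetical numpy sketch of domain adaptation: keep a general
# "pretrained" feature extractor frozen and train only a small
# task-specific head on a modest labelled domain dataset.

rng = np.random.default_rng(1)

# Frozen layer: a fixed random projection stands in for features
# learned once on a large generic image corpus.
W_frozen = rng.normal(size=(8, 16))

def features(X):
    """Frozen ReLU features; never updated during domain adaptation."""
    return np.maximum(X @ W_frozen, 0.0)

# Small labelled domain dataset (synthetic; labels are constructed to
# be learnable from the frozen features, purely for this illustration).
X = rng.normal(size=(300, 8))
F = features(X)
w_star = rng.normal(size=16)
y = (F @ w_star > 0).astype(float)

# Train only the new head, via logistic-regression gradient descent.
w_head = np.zeros(16)
for _ in range(500):
    p = 1.0 / (1.0 + np.exp(-F @ w_head))   # predicted probabilities
    w_head -= 0.1 * F.T @ (p - y) / len(y)  # gradient step on head only

acc = np.mean((F @ w_head > 0) == (y == 1))
```

Because only the head's 16 parameters are trained, a few hundred labelled examples suffice, which is the practical point of reusing a general framework rather than building a bespoke tool.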
What are some of the hurdles you face in realising the potential of AI for pathology?
The digitisation process in pathology has been a slow one, especially pre-Covid.
Most pathologists are still working with microscopes and aren’t creating or using digital whole-slide images. There are some images that people capture with cameras on top of microscopes, but digitisation of biopsy samples to acquire whole slide images remains rather uncommon.
The digitisation process does seem to have been accelerated during the current pandemic. Many of us hope that the pandemic has clearly demonstrated the benefit of going digital.
Currently, the major challenge is that in many cases we don't have access to image datasets large enough to train machine learning tools. This remains the major obstacle blocking the realisation of AI's potential for medical imaging.
Where research is being undertaken, a research group will usually source and select images from 200-300 patients for the project, but they may or may not make those images available to everyone else. In pathology, there is only one really large archive of pathology images, The Cancer Genome Atlas, containing some 33,000 images with reports and other metadata.
We need many more datasets to train, test and validate AI techniques. We are very much at the beginning of that curve and we can’t really exploit machine learning algorithms unless we solve the data availability problem.
Many computer scientists working in medical imaging mention that the data availability problem is one of their biggest challenges. Can you see any potential solutions?
There are some things that can and should happen from different sides.
Firstly, governments need to support the research institutions, hospitals and private sector companies to work together.
Secondly, hospitals and labs need to become more flexible in providing images.
The concern for hospitals is generally patient privacy, but even where there is consent or no reasonable concerns for patient privacy, many hospitals are still not willing to make the data available. This is often because it’s also about intellectual property and commercialization. Some hospitals see images as a goldmine and are not willing to easily share that goldmine unless they are treated like shareholders and not just research partners.
This must be figured out for the full potential of AI for imaging to be realised.
If the images in fact constitute IP, then researchers, companies and hospitals will need to come to an appropriate agreement that does not obstruct the advancement of technology, for the sake of public health.
Lastly, from the computer science side, we are working on "federated learning": techniques that respect patient privacy and can combine learning from different hospitals without sharing data.
These techniques can deploy AI agents into individual hospitals such that they learn within the hospital firewall, with no concern about patient privacy and no patient-related data being exchanged. The AI agents then exchange information among themselves to share knowledge, and that information has nothing to do with images or patients. It is mainly the internal knowledge of deep networks, with no reproducible connection to patients' data.
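A minimal illustration of this idea is federated averaging: each hospital trains a copy of the model on its own private data, and only the model weights, never images or patient records, cross the firewall to be averaged by a central server. The numpy sketch below is a hypothetical toy (simulated "hospitals", a simple logistic-regression model), not the lab's actual system:

```python
import numpy as np

# Hypothetical sketch of a federated-averaging (FedAvg-style) loop:
# local training stays behind each hospital's firewall; only weights
# are exchanged and averaged by a coordinating server.

rng = np.random.default_rng(0)

def local_update(w, X, y, lr=0.1, epochs=20):
    """One hospital's local training: logistic-regression gradient descent."""
    w = w.copy()
    for _ in range(epochs):
        p = 1.0 / (1.0 + np.exp(-X @ w))   # predicted probabilities
        w -= lr * X.T @ (p - y) / len(y)   # gradient step on local data only
    return w

# Three simulated hospitals whose private datasets follow the same
# underlying labelling rule (w_true), standing in for shared pathology.
w_true = np.array([1.5, -2.0])
hospitals = []
for _ in range(3):
    X = rng.normal(size=(200, 2))
    y = (X @ w_true > 0).astype(float)
    hospitals.append((X, y))

# Federated rounds: the server broadcasts the global weights, each
# hospital returns a locally trained copy, and the server averages.
w_global = np.zeros(2)
for _ in range(10):
    local_ws = [local_update(w_global, X, y) for X, y in hospitals]
    w_global = np.mean(local_ws, axis=0)

# The averaged model learns the shared rule without any data pooling.
accuracy = np.mean([np.mean((X @ w_global > 0) == (y == 1))
                    for X, y in hospitals])
```

Only `w_global` and the locally updated weight vectors are ever communicated, which is why, as described above, no patient-related data needs to leave any hospital.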
It is a new direction in the AI community, and it will play a role in solving the data problems. But we are going to need hospitals, research institutes, industry leaders and governments to collaborate on large-scale initiatives; otherwise the AI community will be blamed for not delivering on its promises, a blame that would be neither justified nor fair.
Realistically, we need to access image archives of millions of patients: then we can see what AI can really do.
Hamid Tizhoosh is Professor at the Faculty of Engineering at the University of Waterloo in Canada and the director of KIMIA Lab, the Laboratory for Knowledge Inference in Medical Image Analysis.