The digitization of diagnostic pathology has recently been approved by the US Food and Drug Administration (FDA), opening up an era of big data in pathology. Artificial Intelligence (AI) is ideal for handling big data and could improve the speed and accuracy of cancer diagnosis. Before pathologists can make proper use of AI, several hurdles must be overcome:

  • Fear that AI “robots” will replace the pathologist
  • Lack of clarity on best practices for implementation
  • The “black box” problem
  • Lack of regulation

Counteracting these problems are:

  • “Augmented intelligence”
  • Scientific results
  • Use cases
  • Movement towards regulation

Pathologists Improve Speed and Accuracy of Cancer Diagnosis with AI-Based Assessment Tools

Towards Augmented Intelligence

The American Medical Association (AMA) is championing the concept of “augmented intelligence”, in which AI enhances physician capabilities rather than functioning as an autonomous “healthcare robot”. Cautious adoption of AI is the norm: in one survey, more than 70% of pathologists were interested in or excited about AI in pathology, yet there remains great uncertainty about how AI should be implemented within the profession as a whole.

Some physicians see AI as inevitable, likening its use to skydivers using a parachute, while others insist that efficacy must be proven and rigorous training provided before implementation. To allay fears and smooth the transition, the American College of Radiology has created a Data Science Institute with an associated directory of hundreds of potential use cases for AI, many of them aimed at improving the speed and accuracy of cancer imaging and classification/diagnosis. Given the crossover with pathology, a similar undertaking by the College of American Pathologists (CAP) might be in order.

Limited Implementation But Strong Results

Implementation of AI in pathology is still limited. However, FDA-approved automated digital systems, such as the Philips IntelliSite Pathology Solution (PIPS), are in use for applications such as cervical cancer cytology analysis, in which digital slide images are captured automatically and presented to the diagnosing pathologist. Software as a Medical Device (SaMD) AI systems such as CytoProcessor by DATEXIM SAS can take images from digital pathology systems and pre-classify cases as normal, benign reactive, low-grade dysplasia, high-grade dysplasia, carcinoma in situ, or invasive carcinoma. With this “augmented intelligence”, the pathologist can either accept the pre-classification or pursue an alternate diagnosis.

Such SaMD systems are not yet explicitly regulated by the FDA, although CAP does provide guidance on how to validate AI algorithms in anatomic pathology, so this remains a regulatory grey area. In a similar vein, IBM’s Watson for Oncology has undergone only limited efficacy testing, yet it is being implemented by hospitals throughout the US, and indeed the world, to provide “augmented intelligence” to the diagnosing physician. Notably, some hospitals, such as the MD Anderson Cancer Center in Houston, Texas, have rejected the system following its initial adoption.

Overall, the evidence suggests that the recommendations from AI-interpreted cervical cytology slides are in high agreement (over 90%) with the diagnoses made by trained cytologists, and may even be more accurate depending on the cytologist’s level of training. This may be a crucial point: AI could help generalists achieve the level of accuracy and speed of diagnosis that specialists normally provide.

The “Black Box” Problem

Part of the reason that some pathologists are skeptical of AI is known as the “black box” problem. The accuracy of the algorithms is hard to interpret precisely because most commercial suppliers of AI currently cannot explain, even in mathematical terms, how their algorithms reach their conclusions. This can lead to potentially dangerous situations where the AI relies on information during training that is correlated with cancer risk but not causative, and is thus subject to change. Sometime after the AI was trained, this variable could shift (covariate shift) or drift in the patient population and completely change the accuracy of the AI without any warning signs. As a result, many feel these systems cannot be fully trusted until the “black box” issue is solved, which is why “augmented intelligence” is preferable to full automation. A “black box” requires its accuracy to be recalibrated regularly, which demands a substantial time investment.
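The covariate-shift risk can be sketched with a toy example. The “learned rule” below is entirely hypothetical (no clinical system works on scanner brightness): it stands in for any model that latched onto a spurious correlation during training, performing perfectly on the training population and silently failing once that correlation inverts.

```python
import random

def accuracy(classify, population):
    """Fraction of (features, label) cases the rule classifies correctly."""
    return sum(classify(x) == y for x, y in population) / len(population)

# Hypothetical rule picked up during training: "brightness > 0.5 implies
# cancer" -- brightness is merely correlated with risk, not causative.
learned_rule = lambda features: features["brightness"] > 0.5

random.seed(0)
# Training population: cancer slides happened to come from a brighter scanner.
train = (
    [({"brightness": 0.8 + 0.1 * random.random()}, True) for _ in range(50)]
    + [({"brightness": 0.2 + 0.1 * random.random()}, False) for _ in range(50)]
)

# After a scanner recalibration, the correlation inverts (covariate shift).
shifted = (
    [({"brightness": 0.2 + 0.1 * random.random()}, True) for _ in range(50)]
    + [({"brightness": 0.8 + 0.1 * random.random()}, False) for _ in range(50)]
)

print(accuracy(learned_rule, train))    # 1.0 on the training population
print(accuracy(learned_rule, shifted))  # 0.0 after the shift, with no warning
```

Nothing inside the “black box” signals the failure; only fresh, labeled calibration data reveals it, which is exactly the time investment described above.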

Digital noise is another concern, as it can drastically reduce the accuracy of AI-based diagnosis. In a networked system, this weakness can be exploited by malicious adversarial attacks that purposefully, but imperceptibly to the human eye, add noise to digital medical images. In a “black box” AI system it would be impossible to determine why the AI was losing accuracy. A system that could draw attention to the noise in the image would drastically reduce this accuracy and security risk.
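One simple way such a system could draw attention to added noise is to flag images whose high-frequency content exceeds what is typical for clean slides. The sketch below is a minimal illustration, not a real defense: it uses neighbour-difference energy as a crude noise score, with an illustrative threshold and noise magnitudes exaggerated well beyond “imperceptible” so the effect is visible in a few lines.

```python
import random

def noise_score(image):
    """Mean squared difference between horizontally adjacent pixels --
    a crude proxy for high-frequency (noise-like) content."""
    diffs = [(row[c] - row[c + 1]) ** 2
             for row in image for c in range(len(row) - 1)]
    return sum(diffs) / len(diffs)

def flag_if_noisy(image, threshold=0.01):
    """Return True when the image should be referred for human review."""
    return noise_score(image) > threshold

# Smooth "clean" image vs. the same image with added pixel-level noise.
random.seed(1)
clean = [[0.5 for _ in range(16)] for _ in range(16)]
noisy = [[p + random.uniform(-0.3, 0.3) for p in row] for row in clean]

print(flag_if_noisy(clean))  # False
print(flag_if_noisy(noisy))  # True
```

A flagged image would be routed to the pathologist rather than silently classified, preserving the “augmented intelligence” safety net.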

Solutions

If the “black box” problem could be solved, this would enhance the accuracy of “augmented intelligence” and reduce the time needed for calibration. At the research level, some studies have addressed the “black box” by computing how much each pixel in the input slide image affects the output prediction, thereby estimating which areas of the image the AI uses to determine the diagnosis. This was accomplished using Google’s LYmph Node Assistant (LYNA) algorithm on the Camelyon16 breast cancer evaluation dataset for nodal metastasis detection. Commercial AIs with this function could facilitate calibration and allow pathologists to determine whether they agree with the AI’s diagnostic technique.
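The per-pixel idea can be illustrated with a simple occlusion study: replace each pixel with a baseline value and measure how much the model’s output score changes. This is a generic sketch of the technique, not LYNA’s actual method; the `toy_score` “classifier” below is a hypothetical stand-in for a trained model.

```python
def occlusion_attribution(image, score, baseline=0.0):
    """For each pixel, measure how much replacing it with a baseline
    value changes the classifier's output score."""
    base_score = score(image)
    attributions = []
    for r, row in enumerate(image):
        attr_row = []
        for c in range(len(row)):
            perturbed = [list(rw) for rw in image]  # copy the image
            perturbed[r][c] = baseline              # occlude one pixel
            attr_row.append(abs(base_score - score(perturbed)))
        attributions.append(attr_row)
    return attributions

# Hypothetical stand-in "classifier": the score is the mean of a 2x2
# hot spot, so only those four pixels should receive any attribution.
def toy_score(img):
    return (img[0][0] + img[0][1] + img[1][0] + img[1][1]) / 4.0

image = [[1.0, 1.0, 0.0],
         [1.0, 1.0, 0.0],
         [0.0, 0.0, 0.0]]
attr = occlusion_attribution(image, toy_score)
# Hot-spot pixels get attribution 0.25; all other pixels get 0.0.
```

Overlaying such an attribution map on the slide is what lets a pathologist check whether the AI is “looking at” tumor tissue or at an artifact.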

Under experimental conditions (not in the clinic), pathologists using the LYNA algorithm to “augment intelligence” and outline potential tumor regions in lymph node digital slides had a significantly increased ability to detect breast cancer micrometastases. Average review time per image was significantly shorter with “augmented intelligence” than without, both for slides containing micrometastases and for normal tissue images. Importantly, pathologists with LYNA performed better than pathologists alone and better than the LYNA AI alone.

Other research using commercial and open-source software such as ImageJ, CellProfiler, and QuPath has indicated improved Ki67 scoring in breast cancer, Gleason grading in prostate cancer, and tumor-infiltrating lymphocyte (TIL) scoring in melanoma using digital pathology AI classification systems. Prognostic algorithms based on digitized histological slides have been created for several cancers, including lung cancer, melanoma, and glioma. Further regulatory work is needed to translate these AI developments into the clinical laboratory safely. A CAP-led use case analysis could be invaluable here.

Towards Regulation

The US FDA’s Center for Devices and Radiological Health (CDRH) is considering a new regulatory framework covering the total product life cycle of these technologies. The aim is to ensure that the safety and effectiveness of SaMD are maintained while allowing modifications arising from real-world AI learning.

Conclusion

In conclusion, although research indicates that digital pathology AI can improve the speed and accuracy of cancer diagnosis by pathologists, and most practicing pathologists are interested in AI solutions, adoption of AI in routine practice is limited due to time constraints, a lack of computational expertise, and a lack of regulatory guidance on the implementation and maintenance of such systems in the clinical laboratory.