Unsupervised Anomaly Detection on Preclinical Liver H&E Whole Slide Images using Graph based Feature Distillation

Merck & Co., Inc., Rahway, NJ, USA
MICCAI 2025
Email lin.li23@merck.com for code access.

Abstract

Toxicity assessment of candidate compounds is an essential part of safety evaluation in the preclinical stage of drug development. Traditionally, drug safety evaluations depend on manual histopathological examinations of tissue sections from animal subjects, often leading to significant effort in evaluating normal tissues. Moreover, the collection of abnormality samples poses significant challenges due to the rarity and diversity of various types of abnormalities. This makes it impractical to develop a comprehensive training dataset that encompasses all potential anomalies, particularly those that are underrepresented. Consequently, traditional supervised learning methods may face difficulties, leading to a growing interest in unsupervised approaches for anomaly detection. In this study, we present GraphTox, a multi-resolution graph-based anomaly detector designed to assess hepatotoxicity in Rattus norvegicus liver tissues. GraphTox is built upon a novel resolution-aware foundation model pre-trained on 2.7 million liver tissue patches. Additionally, GraphTox employs graph-based feature distillation on normal liver whole slide images (WSIs) to identify hepatotoxicity. Our results demonstrate that GraphTox achieves an 11.1% improvement in area under the receiver operating characteristic curve (AUC) on an independent testing set compared to the best-performing non-graph-based anomaly detection models, and an 8.1% improvement over a graph-based model derived from a resolution-agnostic foundation model UNI v2. These findings highlight that GraphTox effectively leverages the resolution-aware digital pathology foundation model to capture multi-scale tissue characteristics within the local tissue graphs, thereby enhancing anomaly detection across various scales.

Introduction

In drug development, it's crucial to check tissue samples from animals to make sure new drugs are safe. Traditionally, experts manually examine these samples, which takes a lot of time and effort, especially when most samples are normal. Finding abnormal samples is also hard because they're rare and vary a lot, making it tough to train models using standard supervised learning.

Various anomalies

Figure 1: Examples of abnormal tissue in rat liver at different resolutions. Vacuolation (bubble-like texture) is best seen at 20×. Hypertrophy (cell size changes) is clearer at 10×. Necrosis (dead tissue) is visible at 5×.

To solve this, our team developed GraphTox, a tool that uses unsupervised learning to detect liver toxicity in rats. Instead of needing labeled abnormal samples, GraphTox learns what "normal" looks like and flags anything that seems off. It uses a resolution-aware foundation model, trained on millions of liver tissue images, and combines this with graph-based feature distillation to detect anomalies more accurately.

The main contributions of this work are (1) a resolution-aware foundation model and (2) GraphTox, a graph-based feature distillation method for anomaly detection on H&E histology images.

Resolution-aware Foundation Model

Whole slide images (WSIs) of tissue are huge and contain details at different zoom levels, such as 5×, 10×, and 20× magnification. Pathologists often switch between these zoom levels to spot different types of damage. Most existing digital pathology models ignore these resolution differences. GraphTox addresses this by using a resolution-aware foundation model that knows which zoom level it's looking at. It adds a special "resolution token" to the model so it can learn features specific to each magnification (Figure 2 (a) and (b)). This helps the model better mimic how pathologists work: looking at tissue across scales.
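To make the resolution token idea concrete, here is a minimal sketch in plain Python. It assumes the token is simply prepended to a patch's token sequence, much like a class token; the text above does not say where the token is placed, so that choice, the `EMBED_DIM` value, and the `add_resolution_token` helper are all illustrative assumptions rather than the actual GraphTox implementation.

```python
import random

EMBED_DIM = 8                 # hypothetical embedding width, kept small for readability
MAGNIFICATIONS = (5, 10, 20)  # zoom levels named in the text

# One vector per magnification. In a real model these would be learned
# parameters; here they are fixed random values for illustration only.
rng = random.Random(0)
resolution_tokens = {
    mag: [rng.uniform(-1.0, 1.0) for _ in range(EMBED_DIM)]
    for mag in MAGNIFICATIONS
}


def add_resolution_token(patch_tokens, magnification):
    """Prepend the token for `magnification` to a patch's token sequence.

    Prepending is an assumption; the source only says a token is added.
    """
    if magnification not in resolution_tokens:
        raise ValueError(f"unsupported magnification: {magnification}")
    return [resolution_tokens[magnification]] + list(patch_tokens)


# Example: a patch represented by 4 dummy tokens, viewed at 10x.
patch_tokens = [[0.0] * EMBED_DIM for _ in range(4)]
sequence = add_resolution_token(patch_tokens, 10)
print(len(sequence))  # 5: one resolution token followed by 4 patch tokens
```

Because the same patch content gets a different leading token at 5×, 10×, and 20×, the layers that follow can condition on magnification.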

Method workflow

Figure 2: Overview of GraphTox.

GraphTox

GraphTox doesn't just look at tissue patches individually: it builds local graphs that connect patches across different resolutions at the same location. For example, a patch at 5× is linked to nearby patches at 10× and 20× (Figure 2 (c)). These graphs help the model understand how tissue features change with magnification.
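One way to picture a local graph is sketched below. It assumes every patch has the same pixel size at its own magnification, that coordinates are expressed in a shared 20× pixel frame, and that a low-magnification patch is linked to every higher-magnification patch lying inside its footprint. `Patch`, `PATCH_SIZE`, and `build_local_graph` are hypothetical names, and the actual edge rule used by GraphTox may differ.

```python
from collections import namedtuple

# x, y: top-left corner of the patch in 20x pixel coordinates (assumed frame)
Patch = namedtuple("Patch", "mag x y")

PATCH_SIZE = 256  # hypothetical patch width in pixels at its own magnification
BASE_MAG = 20     # magnification of the shared coordinate frame


def footprint(patch):
    """Return (x0, y0, x1, y1): the area the patch covers in 20x pixels."""
    side = PATCH_SIZE * BASE_MAG // patch.mag
    return patch.x, patch.y, patch.x + side, patch.y + side


def contains(outer, inner):
    """True if the footprint of `inner` lies fully inside that of `outer`."""
    ox0, oy0, ox1, oy1 = footprint(outer)
    ix0, iy0, ix1, iy1 = footprint(inner)
    return ox0 <= ix0 and oy0 <= iy0 and ix1 <= ox1 and iy1 <= oy1


def build_local_graph(anchor, candidates):
    """Link the anchor to each higher-magnification patch inside its footprint."""
    return [
        (anchor, p)
        for p in candidates
        if p.mag > anchor.mag and contains(anchor, p)
    ]


# Example: one 5x patch covers a 2x2 grid of 10x patches and a 4x4 grid of 20x patches.
anchor = Patch(5, 0, 0)
tens = [Patch(10, x, y) for x in range(0, 1024, 512) for y in range(0, 1024, 512)]
twenties = [Patch(20, x, y) for x in range(0, 1024, 256) for y in range(0, 1024, 256)]
outside = Patch(20, 1024, 0)  # just to the right of the anchor, so not linked
edges = build_local_graph(anchor, tens + twenties + [outside])
print(len(edges))  # 20 = 4 patches at 10x + 16 patches at 20x
```

This sketch only adds edges from the anchor outward. Edges between the 10× and 20× patches themselves, or between neighbours at the same scale, could be added the same way if the graph definition calls for them.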

Here's how it works: A teacher model (pretrained on millions of patches) provides high-quality features. A student model tries to copy the teacher's output using only normal tissue samples. If the student struggles to match the teacher on a new sample, that sample might be abnormal.

Experimental results

Figure 3: Anomaly detection on a whole slide image.

By comparing the student and teacher outputs across the graph, GraphTox calculates an anomaly score. The higher the score, the more likely the tissue is abnormal. This graph-based approach outperforms other models, including those using GANs or single-resolution feature distillation: GraphTox reaches an AUC of 0.80, compared with 0.61 for s2-AnoGAN and 0.74 for the other graph-based models.
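The scoring step can be sketched as follows. This is an illustration under stated assumptions, not the GraphTox scoring function: it assumes the teacher-student mismatch is measured by cosine distance per graph node and averaged over the nodes of one local graph. The function names and the toy feature vectors are hypothetical.

```python
import math


def cosine_distance(u, v):
    """1 - cosine similarity between two feature vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 1.0  # assumption: treat a zero vector as maximally dissimilar
    return 1.0 - dot / (norm_u * norm_v)


def graph_anomaly_score(teacher_feats, student_feats):
    """Mean teacher-student cosine distance over the nodes of one local graph."""
    if not teacher_feats or len(teacher_feats) != len(student_feats):
        raise ValueError("need the same non-zero number of teacher and student nodes")
    distances = [cosine_distance(t, s) for t, s in zip(teacher_feats, student_feats)]
    return sum(distances) / len(distances)


# Toy features for a 3-node graph (made-up numbers, not real model outputs).
teacher = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
student_normal = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]     # student matches the teacher
student_abnormal = [[0.0, 1.0], [1.0, 0.0], [1.0, -1.0]]  # student is orthogonal to the teacher

score_normal = graph_anomaly_score(teacher, student_normal)
score_abnormal = graph_anomaly_score(teacher, student_abnormal)
print(score_normal < score_abnormal)  # True
```

A threshold on this score, or a ranking of graphs by score, would then decide which regions to flag. How the scores are aggregated to the slide level is not covered by this sketch.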

AUC by model and input resolution (intervals in parentheses):

Type      Model            5×                  10×                 20×                 Combined
FD        UNI res single   0.64 (0.60 - 0.69)  0.71 (0.67 - 0.75)  0.55 (0.50 - 0.60)  0.55 (0.50 - 0.60)
FD        UNI res mix      0.71 (0.66 - 0.75)  0.69 (0.64 - 0.73)  0.65 (0.61 - 0.69)  0.70 (0.66 - 0.74)
FD        UNIv2 res mix    0.73 (0.69 - 0.77)  0.73 (0.69 - 0.78)  0.70 (0.66 - 0.74)  0.72 (0.68 - 0.77)
FD        RA res mix       0.73 (0.69 - 0.77)  0.67 (0.63 - 0.72)  0.69 (0.65 - 0.74)  0.71 (0.67 - 0.76)
GAN       s2-AnoGAN        -                   -                   -                   0.61 (0.56 - 0.65)
Graph-FD  UNIv2 graph      -                   -                   -                   0.74 (0.70 - 0.78)
Graph-FD  RA transformer   -                   -                   -                   0.74 (0.70 - 0.78)
Graph-FD  GraphTox         -                   -                   -                   0.80 (0.76 - 0.84)
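The improvements quoted in the abstract can be checked against the table with a few lines of arithmetic. The sketch below assumes the 11.1% figure uses the UNIv2 res mix result in the Combined column (0.72) as the best non-graph baseline, and that the 8.1% figure uses the UNIv2 graph result (0.74). `relative_improvement` is a helper defined here for illustration.

```python
def relative_improvement(new, baseline):
    """Relative gain of `new` over `baseline`, in percent."""
    return (new - baseline) / baseline * 100.0


graphtox_auc = 0.80
best_non_graph_auc = 0.72  # UNIv2 res mix, Combined column (assumed baseline)
univ2_graph_auc = 0.74     # graph model built on UNI v2

print(round(relative_improvement(graphtox_auc, best_non_graph_auc), 1))  # 11.1
print(round(relative_improvement(graphtox_auc, univ2_graph_auc), 1))     # 8.1
```

Both numbers are relative gains, not absolute differences in AUC: the absolute gaps are 0.08 and 0.06.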
