Colonoscopy pathology examination can detect cells of early-stage colon tumors in small tissue slices. Pathologists must examine hundreds of tissue slices every day, which is time-consuming and exhausting work. Here we propose a challenge task on automatic colonoscopy tissue segmentation and screening, which aims at automatic lesion segmentation and classification of the whole tissue (benign vs. malignant).
This dataset contains positive and negative samples. The positive training samples comprise 250 tissue images from 93 whole slide images (WSIs), with pixel-level annotations in JPG format, where 0 denotes background and 255 denotes foreground (malignant lesion). A binary mask can be obtained by thresholding the annotation at 128. The negative training samples comprise 410 tissue images from 231 WSIs. These negative images have no annotations because they do not contain any malignant lesion.
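The thresholding step above can be sketched as follows. This is a minimal illustration, not the organizers' code: the function name `binarize_mask` is hypothetical, the annotation is assumed to be already loaded as a single-channel 8-bit NumPy array, and treating a value of exactly 128 as foreground is an assumption, since the description does not say which side the threshold falls on. Thresholding is needed because JPG compression is lossy, so stored values are not guaranteed to be exactly 0 or 255.

```python
import numpy as np


def binarize_mask(gray, threshold=128):
    """Turn a grayscale annotation into a boolean lesion mask.

    gray: 2-D uint8 array read from the JPG annotation (assumed input).
    threshold: cut-off from the dataset description; including the
        value 128 itself as foreground is an assumption.
    """
    gray = np.asarray(gray)
    if gray.ndim != 2:
        raise ValueError("expected a single-channel (2-D) annotation image")
    # True marks foreground (malignant lesion), False marks background.
    return gray >= threshold
```

For a negative sample, which has no annotation file, an all-False mask of the same height and width as the image plays the same role.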
In colonoscopy pathology examination, a single WSI contains 10 or more tissues. To make this task easier, we selected one or two tissues from each WSI and had our pathologists provide segmentation annotations for them. Note also that a small number of malignant glands may have been missed by the pathologists.
The average image size is 5000x5000 pixels, and some images are extremely large. We will also provide a testing set of 212 tissues from another 152 patients, in which 90 images from 65 patients contain lesions. All whole slide images were stained with hematoxylin and eosin and scanned at 20x magnification.
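Images of this size are often processed in smaller tiles rather than in one pass. The sketch below shows one possible way to enumerate tile coordinates; it is not part of the challenge description. The function name `tile_boxes` and the default tile size of 1024 pixels are hypothetical choices made for illustration.

```python
def tile_boxes(width, height, tile=1024):
    """Yield (left, top, right, bottom) boxes that cover an image.

    Tiles do not overlap; tiles on the right and bottom edges are
    clipped to the image bounds, so they may be smaller than `tile`.
    """
    if width <= 0 or height <= 0 or tile <= 0:
        raise ValueError("width, height and tile must be positive")
    for top in range(0, height, tile):
        for left in range(0, width, tile):
            yield (left, top, min(left + tile, width), min(top + tile, height))


# Example: list the tiles of an average-sized 5000x5000 image.
boxes = list(tile_boxes(5000, 5000))
```

Each box follows the (left, top, right, bottom) convention, so it can be used to slice an image array as `image[top:bottom, left:right]`.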
The data in the challenge show great variation in appearance because they were collected from 4 medical centers, especially from several small centers in developing countries/regions. Differences in image style can be an obstacle for the screening task. Holding the challenge and releasing a large quantity of expert-level annotations is expected to attract attention from the medical imaging community and to advance research on automatic colonoscopy screening.
The criteria for distinguishing between benign (negative) and malignant (positive) tissue are hard to define. Again, to make this task easier for an academic competition, and following the WHO classification of tumours of the digestive system, we regard the following diseases as malignant lesions: high-grade intraepithelial neoplasia and adenocarcinoma, including papillary adenocarcinoma, mucinous adenocarcinoma, poorly cohesive carcinoma, and signet ring cell carcinoma. Low-grade intraepithelial neoplasia and severe inflammation are usually hard cases for pathologists, so this dataset does not include them. Note that in practical clinical diagnosis, pathologists face more difficult and complicated situations.