Endoscopic instrument segmentation with crowdsourced data

A challenge in computer-assisted minimally-invasive surgery is the image-based tracking of medical instruments in endoscopic images, which is a prerequisite for surgical navigation, skill assessment and workflow analysis. A promising approach for segmenting the instruments is to apply machine learning techniques that learn their shape and appearance from labeled training data. Below you can find labeled datasets that include training data for instrument segmentation. The annotations were acquired from experts and from anonymous untrained workers, referred to as knowledge workers (KWs), via crowdsourcing. The segmentations of medical tools generated by the crowd are comparable to those made by medical experts. For more details, see [1].

Endoscopic Datasets

The training data was generated from a total of six surgical procedures: three laparoscopic adrenalectomies and three laparoscopic pancreatic resections. From each surgery, 20 images containing one or several medical instruments were extracted, yielding 120 images in total.

Expert Annotations: Half of the data from each surgical procedure was annotated by a medical expert with experience in laparoscopic surgeries.

Crowd Annotations: All images (i.e. twice as many as those annotated by the experts) were further annotated by 10 KWs each, yielding 2350 instrument segmentations in total. The crowd segmentation for one particular instrument was obtained by majority voting, i.e. a pixel was classified as instrument if and only if at least 5 KWs had marked it as instrument*.
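The voting rule described above can be illustrated with a minimal sketch. It assumes the individual KW segmentations of one instrument are available as binary arrays of equal size; the function name fuse_by_vote, the parameter min_votes and the toy data are hypothetical and are not part of the released datasets, which already contain the fused masks.

```python
import numpy as np

def fuse_by_vote(worker_masks, min_votes=5):
    """Fuse binary masks from several workers by pixel-wise voting.

    A pixel is labeled as instrument if at least `min_votes` workers
    marked it (the text above states a threshold of 5 out of 10 KWs).
    """
    # Stack the masks into one array of shape (n_workers, height, width).
    stack = np.stack([np.asarray(m, dtype=bool) for m in worker_masks])
    # Count, for every pixel, how many workers marked it as instrument.
    votes = stack.sum(axis=0)
    return votes >= min_votes

# Toy data (assumption): a 2x2 image where the pixels receive
# 10, 5, 4 and 0 votes from 10 hypothetical workers.
votes_per_pixel = np.array([[10, 5], [4, 0]])
worker_masks = [votes_per_pixel > k for k in range(10)]

fused = fuse_by_vote(worker_masks, min_votes=5)
print(fused)
```

In this toy case only the pixels with 10 and 5 votes end up in the fused mask; the pixel with 4 votes falls below the threshold.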

Each dataset contains the original image together with the instrument mask from the expert and the crowd annotations (Fig. 1).

The datasets can be downloaded here.

Fig. 1: Left: Original Image, right: Instrument Mask (RGB Value: Background (0,0,0), Instrument (255,0,0))
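For training a classifier, the color-coded mask usually has to be converted into a binary label image. The following is a minimal sketch of that step, assuming the mask has already been loaded into an array of shape (height, width, 3) or (height, width, 4) with the RGB values given in Fig. 1. The file format and loading step are not specified here and are therefore left out; the names INSTRUMENT_RGB and mask_to_binary are hypothetical.

```python
import numpy as np

# RGB code for instrument pixels as stated in the caption of Fig. 1.
INSTRUMENT_RGB = (255, 0, 0)

def mask_to_binary(rgb_mask):
    """Convert a color-coded mask into a boolean instrument mask."""
    # Keep only the first three channels in case an alpha channel is present.
    rgb = np.asarray(rgb_mask)[..., :3]
    # A pixel is instrument only if all three channels match the code.
    return np.all(rgb == np.array(INSTRUMENT_RGB), axis=-1)

# Toy 2x2 mask (assumption): two instrument pixels, two background pixels.
demo = np.array(
    [[[255, 0, 0], [0, 0, 0]],
     [[0, 0, 0], [255, 0, 0]]],
    dtype=np.uint8,
)

binary = mask_to_binary(demo)
print(binary)
print("instrument fraction:", binary.mean())
```

Matching on the exact color is a deliberate choice: if the masks were stored with lossy compression, a tolerance on the red channel might be needed instead.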


If you use the datasets for your work, please cite the following paper:

[1] Maier-Hein L, Mersmann S, Kondermann D, Bodenstedt S, Sanchez A, Stock C, Kenngott HG, Eisenmann M, Speidel S: Can Masses of Non-Experts Train Highly Accurate Image Classifiers? Medical Image Computing and Computer-Assisted Intervention – MICCAI 2014, 438-445, 2014

* Note that lap_pancreas_8: IMG_8 was excluded from validation in [1] because of a misplaced bounding box, which contained two medical instruments instead of just one. As we are providing the fused datasets (containing segmentations for all instruments present in the image), we have uploaded all segmentations (including lap_pancreas_8: IMG_8).