VerSe: A Vertebrae Labelling and Segmentation Benchmark for Multi-detector CT Images

From LRDE

Abstract

Vertebral labelling and segmentation are two fundamental tasks in an automated spine processing pipeline. Reliable and accurate processing of spine images is expected to benefit clinical decision support systems for diagnosis, surgery planning, and population-based analysis of spine and bone health. However, designing automated algorithms for spine processing is challenging predominantly due to considerable variations in anatomy and acquisition protocols and due to a severe shortage of publicly available data. Addressing these limitations, the Large Scale Vertebrae Segmentation Challenge (VerSe) was organised in conjunction with the International Conference on Medical Image Computing and Computer Assisted Intervention (MICCAI) in 2019 and 2020, with a call for algorithms tackling the labelling and segmentation of vertebrae. Two datasets containing a total of 374 multi-detector CT scans from 355 patients were prepared and 4505 vertebrae have individually been annotated at voxel level by a human-machine hybrid algorithm (https://osf.io/nqjyw/, https://osf.io/t98fz/). A total of 25 algorithms were benchmarked on these datasets. In this work, we present the results of this evaluation and further investigate the performance variation at the vertebra level, scan level, and different fields of view. We also evaluate the generalisability of the approaches to an implicit domain shift in data by evaluating the top-performing algorithms of one challenge iteration on data from the other iteration. The principal takeaway from VerSe: the performance of an algorithm in labelling and segmenting a spine scan hinges on its ability to correctly identify vertebrae in cases of rare anatomical variations. The VerSe content and code can be accessed at: https://github.com/anjany/verse.

Documents

Bibtex (lrde.bib)

@Article{	  sekuboyina.21.media,
  author	= {Anjany Sekuboyina and Malek E. Husseini and Amirhossein
		  Bayat and Maximilian L\"offler and Hans Liebl and Hongwei
		  Li and Giles Tetteh and Jan Kuka\v{c}ka and Christian Payer
		  and Darko Stern and Martin Urschler and Maodong Chen and
		  Dalong Cheng and Nikolas Lessmann and Yujin Hu and Tianfu
		  Wang and Dong Yang and Daguang Xu and Felix Ambellan
		  and Tamaz Amiranashvili and Moritz Ehlke and Hans Lamecker
		  and Sebastian Lehnert and Marilia Lirio and Nicol\'as
		  {P\'erez de Olaguer} and Heiko Ramm and Manish Sahu and
		  Alexander Tack and Stefan Zachow and Tao Jiang and Xinjun
		  Ma and Christoph Angerman and Xin Wang and Kevin Brown and
		  Matthias Wolf and Alexandre Kirszenberg and \'Elodie
		  Puybareau and Di Chen and Yiwei Bai and Brandon H. Rapazzo
		  and Timyoas Yeah and Amber Zhang and Shangliang Xu and Feng
		  Houa and Zhiqiang He and Chan Zeng and Zheng Xiangshang and
		  Xu Liming and Tucker J. Netherton and Raymond P. Mumme and
		  Laurence E. Court and Zixun Huang and Chenhang He and
		  Li-Wen Wang and Sai Ho Ling and L\^e Duy Hu\`ynh and
		  Nicolas Boutry and Roman Jakubicek and Jiri Chmelik and
		  Supriti Mulay and Mohanasankar Sivaprakasam and Johannes C.
		  Paetzold and Suprosanna Shit and Ivan Ezhov and Benedikt
		  Wiestler and Ben Glocker and Alexander Valentinitsch and
		  Markus Rempfler and Bj\"orn H. Menze and Jan S. Kirschke},
  title		= {{VerSe}: {A} Vertebrae Labelling and Segmentation
		  Benchmark for Multi-detector {CT} Images},
  journal	= {Medical Image Analysis},
  number	= {102166},
  year		= {2021},
  month		= jul,
  doi		= {10.1016/j.media.2021.102166},
  abstract	= {Vertebral labelling and segmentation are two fundamental
		  tasks in an automated spine processing pipeline. Reliable
		  and accurate processing of spine images is expected to
		  benefit clinical decision support systems for diagnosis,
		  surgery planning, and population-based analysis of spine
		  and bone health. However, designing automated algorithms
		  for spine processing is challenging predominantly due to
		  considerable variations in anatomy and acquisition
		  protocols and due to a severe shortage of publicly
		  available data. Addressing these limitations, the Large
		  Scale Vertebrae Segmentation Challenge (VerSe) was
		  organised in conjunction with the International Conference
		  on Medical Image Computing and Computer Assisted
		  Intervention (MICCAI) in 2019 and 2020, with a call for
		  algorithms tackling the labelling and segmentation of
		  vertebrae. Two datasets containing a total of 374
		  multi-detector CT scans from 355 patients were prepared and
		  4505 vertebrae have individually been annotated at voxel
		  level by a human-machine hybrid algorithm
		  (\url{https://osf.io/nqjyw/}, \url{https://osf.io/t98fz/}).
		  A total of 25 algorithms were benchmarked on these
		  datasets. In this work, we present the results of this
		  evaluation and further investigate the performance
		  variation at the vertebra level, scan level, and different
		  fields of view. We also evaluate the generalisability of
		  the approaches to an implicit domain shift in data by
		  evaluating the top-performing algorithms of one challenge
		  iteration on data from the other iteration. The principal
		  takeaway from VerSe: the performance of an algorithm in
		  labelling and segmenting a spine scan hinges on its ability
		  to correctly identify vertebrae in cases of rare anatomical
		  variations. The VerSe content and code can be accessed at:
		  \url{https://github.com/anjany/verse}.},
  volume	= {73},
  issue		= {102166}
}