Supported Models

Datature hosts a number of 3D medical image segmentation models. The models currently supported on our platform are described below.

SegResNet

SegResNet leverages a ResNet-like approach that takes advantage of 3D convolutions and skip connections. The segmentation pipeline of SegResNet is defined by a CNN-based encoder and a CNN-based decoder.

For the encoder, ResNet blocks are used, each of which repeats the following flow twice: group normalization, followed by ReLU activation, and finally a 3D convolution. Residual (skip) connections are then employed to transfer higher-resolution encoder features to the decoder.
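
To make the block structure concrete, here is a minimal PyTorch sketch of one such encoder block. The channel count and group size are illustrative only and are not the exact configuration used on our platform.

```python
# A minimal sketch of the ResNet-style encoder block described above
# (GroupNorm -> ReLU -> 3D convolution, repeated twice, with a residual
# connection). Channel counts and group sizes are illustrative.
import torch
import torch.nn as nn


class SegResBlock(nn.Module):
    def __init__(self, channels: int, num_groups: int = 8):
        super().__init__()
        self.block = nn.Sequential(
            nn.GroupNorm(num_groups, channels),
            nn.ReLU(inplace=True),
            nn.Conv3d(channels, channels, kernel_size=3, padding=1),
            nn.GroupNorm(num_groups, channels),
            nn.ReLU(inplace=True),
            nn.Conv3d(channels, channels, kernel_size=3, padding=1),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Residual connection: the block's output is added back to its input.
        return x + self.block(x)


x = torch.randn(1, 8, 32, 32, 32)  # (batch, channels, D, H, W)
y = SegResBlock(channels=8)(x)     # same shape as the input
```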

For the decoder, a structure similar to the encoder is used, but each level begins with an upsizing step that (1) reduces the number of feature channels while increasing the spatial dimensions via 3D bilinear upsampling, and (2) adds the skip connection from the corresponding encoder level. Note that the decoder output is used to produce the segmentation mask.
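
Below is a minimal PyTorch sketch of one such decoder level. PyTorch's "trilinear" interpolation mode plays the role of 3D bilinear upsampling here, and the channel counts and shapes are illustrative only.

```python
# A minimal sketch of one decoder level as described above: halve the number
# of feature channels with a 1x1x1 convolution, upsample the spatial
# dimensions, then add the skip connection from the matching encoder level.
import torch
import torch.nn as nn


class DecoderLevel(nn.Module):
    def __init__(self, in_channels: int):
        super().__init__()
        self.reduce = nn.Conv3d(in_channels, in_channels // 2, kernel_size=1)
        self.upsample = nn.Upsample(scale_factor=2, mode="trilinear",
                                    align_corners=False)

    def forward(self, x: torch.Tensor, skip: torch.Tensor) -> torch.Tensor:
        x = self.upsample(self.reduce(x))  # fewer channels, larger volume
        return x + skip                    # add the encoder skip connection


x = torch.randn(1, 16, 16, 16, 16)            # coarse decoder input
skip = torch.randn(1, 8, 32, 32, 32)          # encoder output at the target scale
out = DecoderLevel(in_channels=16)(x, skip)   # -> (1, 8, 32, 32, 32)
```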

SegResNet also features a Variational Autoencoder (VAE) branch, which seeks to reconstruct the original input volume from the output of the encoder. The VAE branch acts as a regularizer: SegResNet's loss function combines the Dice loss between the predicted and ground-truth segmentation masks, the L2 loss between the VAE branch's reconstruction and the original input volume, and the KL divergence between the estimated normal distribution and the standard normal prior.
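
For reference, the original SegResNet paper combines these three terms as a weighted sum; the 0.1 weights below are those reported in the paper, and the implementation hosted on our platform may weight the terms differently:

L = L_Dice + 0.1 · L_L2 + 0.1 · L_KL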

Swin UNETR

For a more thorough introduction to Swin UNETR, please refer to our “A Comprehensive Guide to 3D Models for Medical Image Segmentation” article on our blog here.

Swin UNETR leverages a U-Net-style architecture with a transformer-based (Swin Transformer) encoder and a CNN-based decoder. Skip connections feed multi-scale, high-resolution encoder features into the decoder, allowing them to inform the output generation.

By combining a transformer with a CNN, Swin UNETR allows the encoder to capture global context, while the CNN-based decoder better captures local context when upsampling the output mask, where local context matters most.
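
As a usage sketch, the snippet below runs a random volume through MONAI's reference Swin UNETR implementation. It assumes a MONAI 1.x release in which the img_size argument is accepted, and it is for illustration only; the parameter values may not match the configuration hosted on our platform.

```python
# A minimal usage sketch with MONAI's reference Swin UNETR implementation.
# Assumes a MONAI 1.x release where `img_size` is accepted; shown for
# illustration only.
import torch
from monai.networks.nets import SwinUNETR

model = SwinUNETR(
    img_size=(96, 96, 96),  # size of the 3D patch fed to the model
    in_channels=1,          # e.g. a single-modality CT or MRI volume
    out_channels=2,         # background + foreground classes
    feature_size=48,        # base embedding dimension of the Swin encoder
)

x = torch.randn(1, 1, 96, 96, 96)  # (batch, channels, D, H, W)
logits = model(x)                  # -> (1, 2, 96, 96, 96) segmentation logits
```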

nnUNet

nnUNet is a self-configuring semantic segmentation method that automatically configures a U-Net segmentation pipeline based on the properties of the input training set. U-Net models are CNN-based models with a U-shaped encoder-decoder architecture that benefits from skip connections.

To configure your U-Net segmentation pipeline, nnUNet follows a "three-step recipe":
1. Fixed parameters, which are held constant across all possible segmentation pipelines. These include the overall architecture, optimizer, learning rate, data augmentation, training procedure, and more.
2. Data fingerprint and rule-based parameters. The data fingerprint captures the image modality, intensity distribution, median shape, and distribution of voxel spacings, which in turn inform the rule-based parameters. These rule-based parameters customize the configuration of the U-Net segmentation pipeline and include the image resampling strategy, annotation resampling strategy, patch size, batch size, and more (a toy sketch follows this list).
3. Empirical parameters, which are selected at the culmination of all training processes, such as choosing the best U-Net configuration out of those tested and the postprocessing strategy.
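
As a toy illustration of step two, the sketch below derives a patch size and batch size from a simple dataset fingerprint. The heuristics and the voxel budget are simplified stand-ins for illustration and are not nnUNet's actual planning rules.

```python
# A toy illustration of how a dataset fingerprint can drive rule-based
# parameters. These heuristics are simplified stand-ins, not nnUNet's
# actual planning rules.
import numpy as np


def plan_from_fingerprint(median_shape, median_spacing, gpu_voxel_budget=2_000_000):
    """Derive a patch size and batch size from a simple dataset fingerprint."""
    # Resampling target: use the median voxel spacing of the training set.
    target_spacing = list(median_spacing)

    # Start from the median image shape, then shrink the patch until the
    # number of voxels fits within the (hypothetical) GPU memory budget.
    patch_size = np.array(median_shape, dtype=int)
    while patch_size.prod() > gpu_voxel_budget:
        patch_size[np.argmax(patch_size)] //= 2

    # Round each side down to a multiple of 16 so repeated 2x downsampling
    # in the U-Net encoder produces integer feature-map sizes.
    patch_size = np.maximum((patch_size // 16) * 16, 16)

    # Spend any leftover voxel budget on a larger batch.
    batch_size = max(2, int(gpu_voxel_budget // patch_size.prod()))

    return {
        "target_spacing": target_spacing,
        "patch_size": patch_size.tolist(),
        "batch_size": batch_size,
    }


print(plan_from_fingerprint(median_shape=(482, 512, 512),
                            median_spacing=(2.5, 0.8, 0.8)))
```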

Ultimately, nnUNet is a semantic segmentation method that uses a combination of heuristics and workflows to generate a custom U-Net pipeline for your dataset.