Uploading Annotations

We understand that developers may already have an annotation pipeline, whether with an outsourced workforce or an open-source annotation tool that works well with your existing datasets. You can upload your annotation files to the platform and either use Nexus as an annotation tool for your team or simply use Nexus for the model training phase.

How Do I Upload Annotations?

  1. Go to the Annotations tab - You can find the button at the top of the Datasets homepage.

  2. Select your relevant Import Format - You should select the annotation format that your file was constructed in. Please note that each annotation format has a required file type. Nexus supports a variety of annotation formats, and we are constantly working to cover more formats from different tools. For more detailed information about each annotation type, see Supported Annotation Formats or the tables below.

  3. Select the Upload Annotations button - You can drag-and-drop or browse for your existing annotation files for processing.

📘

The images need to be uploaded to the platform before the annotations, so that the platform can accurately match your annotations to the image filenames.


Supported Annotation Formats

Object Detection

📘

While we allow TFRecord for export, we don't accept TFRecord for import.

Annotation data can be presented in a Normalized or Unnormalized format. The system will do its best to infer which format is being used.
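As an illustration, an unnormalized box given in pixel coordinates can be normalized by dividing each value by the image dimensions. The sketch below is a minimal example; the image size and box values are made up for illustration.

# Convert an unnormalized (pixel) box to normalized coordinates in [0, 1].
image_width, image_height = 640, 480          # example image size
xmin, ymin, xmax, ymax = 260, 177, 491, 376   # example pixel corners

normalized_box = (
    xmin / image_width,
    ymin / image_height,
    xmax / image_width,
    ymax / image_height,
)
print(normalized_box)  # all values fall between 0 and 1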

Annotation Type | Description | Required File Type
COCO | This file format is typically exported from COCO Annotator and LabelMe. A sample of the expected file format is shown below. | JSON
CSV Four Corners | Another typical format exported when accessing Kaggle Datasets. The file is presented in .csv format, where each row represents one annotation and the headers must be as shown below. | CSV
CSV Width Height | Another common CSV-based representation of image annotations, where the width and height of the bounding box are given. | CSV
Pascal VOC | The Pascal VOC annotation type is commonly exported from LabelImg and should be an .xml file. | XML
YOLO DarkNet | This annotation format is commonly prepared to train YOLO models; it contains a label file and multiple .txt files describing each image's annotations. | TXT
YOLO Keras PyTorch | Almost the same as the YOLO format above, but allows a single .txt file to describe all the annotations. | TXT
CreateML | This file format is typically exported from CreateML and contains a list of items, one per image. | JSON

Segmentation

Annotation Type | Description | Required File Type
COCO Annotator Polygons / Masks | Uses the same COCO JSON format, except that the annotations component should contain polygon or mask data like the example below. | JSON
LabelMe Mask / Polygon | This file format is typically exported from LabelMe and provides one annotation file per image. Users can upload each individual annotation file. | JSON

Classification

Annotation Type | Description | Required File Type
CSV Classification | CSV file that contains two columns, one for images and one for the labels. | CSV

Keypoint

Annotation Type | Description | Required File Type
COCO Keypoint | Similar to the original COCO format, but instead of segmentation vertices, each annotation contains keypoints as a flat list of values. | JSON

Additionally, we support uploading of predefined skeletons for keypoints. This is to make pre-existing skeletons compatible with our skeleton editor. We currently support skeleton files with these formats:

Skeleton File Type | Description | Required File Type
Datature Skeleton | Our own custom skeleton schema, which encodes the skeleton together with related parameters such as cameras and chirality. | JSON

Medical 3D

Annotation Type | Description | Required File Type
NIfTI - One-Hot Segmentation | Multi-channel structure where each channel represents a separate class, containing binary masks (0 or 1 values). | JSON, .nii, .nii.gz
NIfTI - Class-Index Segmentation | Single-channel structure where each voxel contains an integer representing its class label directly. | JSON, .nii, .nii.gz
DICOM RT Structure Set | Defines regions of interest within 3D medical image series using contour data. | .dcm

Format Descriptions and Examples

COCO - Bounding Box

This file format is typically exported from COCO Annotator and LabelMe. A sample of the expected file format is as follows:

{
    "info": {
        "year": "2020",
        "version": "1",
        "description": "Datature.io COCO Format",
        "contributor": "",
        "url": "https://datature.io/datasets/cat/3",
        "date_created": "2020-09-01T01:40:57+00:00"
    },
    "licenses": [
        {
            "id": 1,
            "url": "",
            "name": "Unknown"
        }
    ],
    "categories": [
        {
            "id": 0,
            "name": "WBC",
            "supercategory": "none"
        },
        ...
    ],
    "images": [
        {
            "id": 0,
            "license": 1,
            "file_name": "0.jpg",
            "height": 480,
            "width": 640,
            "date_captured": "2020-09-01T01:53:10+00:00"
        },
        ...
    ],
    "annotations": [
        {
            "id": 0,
            "image_id": 0,
            "category_id": 0,
            "bbox": [
                260,
                177,
                231,
                199
            ],
            "area": 307200,
            "segmentation": [],
            "iscrowd": 0
        },
        ...
    ]
}
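As a quick sanity check before uploading, you can confirm that every annotation in the file references an image and a category defined in the same file. The sketch below uses only the Python standard library; the filename coco.json is a placeholder.

import json

# Load a COCO-format annotation file and cross-check its internal references.
with open("coco.json") as f:
    coco = json.load(f)

image_ids = {image["id"] for image in coco["images"]}
category_ids = {category["id"] for category in coco["categories"]}

for ann in coco["annotations"]:
    assert ann["image_id"] in image_ids, f"annotation {ann['id']} references a missing image"
    assert ann["category_id"] in category_ids, f"annotation {ann['id']} references a missing category"
    x, y, width, height = ann["bbox"]  # COCO bboxes are [x, y, width, height]

print(f"{len(coco['annotations'])} annotations reference valid images and categories")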

CSV Four Corner - Bounding Box

This is another typical format exported when accessing Kaggle Datasets. The file is presented in .csv format, where each row represents one annotation and the headers must be as follows:

filename,xmin,ymin,xmax,ymax,label
0.jpg,260,177,491,376,WBC
0.jpg,78,336,184,435,RBC
0.jpg,63,237,169,336,RBC
0.jpg,214,362,320,461,RBC
0.jpg,414,352,506,445,RBC
0.jpg,555,356,640,455,RBC
0.jpg,469,412,567,480,RBC
0.jpg,1,333,87,437,RBC
0.jpg,4,406,95,480,RBC
0.jpg,155,74,247,174,RBC
0.jpg,11,84,104,162,RBC
0.jpg,534,39,639,139,RBC
0.jpg,547,195,640,295,RBC
0.jpg,388,11,481,111,RBC
0.jpg,171,175,264,275,RBC
0.jpg,260,1,374,83,RBC
0.jpg,229,91,343,174,RBC
0.jpg,69,144,184,235,RBC
0.jpg,482,131,594,230,RBC
0.jpg,368,89,464,176,RBC

CSV Width Height - Bounding Box

Another common CSV-based representation of image annotations, where the width and height of the bounding box are given instead of the second corner (xmax, ymax):

filename,xmin,ymin,width,height,label
0.jpg,260,177,231,199,WBC
0.jpg,78,336,106,99,RBC
1.jpg,68,315,218,165,WBC
1.jpg,346,361,100,93,RBC
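The two CSV layouts differ only in how the second pair of values is expressed: xmax = xmin + width and ymax = ymin + height. The sketch below converts a four-corner CSV into the width-height layout; the filenames are placeholders.

import csv

# Convert CSV Four Corners rows into CSV Width Height rows.
with open("four_corners.csv") as src, open("width_height.csv", "w", newline="") as dst:
    reader = csv.DictReader(src)
    writer = csv.writer(dst)
    writer.writerow(["filename", "xmin", "ymin", "width", "height", "label"])
    for row in reader:
        width = int(row["xmax"]) - int(row["xmin"])
        height = int(row["ymax"]) - int(row["ymin"])
        writer.writerow([row["filename"], row["xmin"], row["ymin"], width, height, row["label"]])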

Pascal VOC - Bounding Box

The Pascal VOC annotation type is commonly exported from LabelImg and should be an .xml file, as such:

<annotation>
	<folder></folder>
	<filename>0.jpg</filename>
	<path>0.jpg</path>
	<source>
		<database>datature.io</database>
	</source>
	<size>
		<width>640</width>
		<height>480</height>
		<depth>3</depth>
	</size>
	<segmented>0</segmented>
	<object>
		<name>WBC</name>
		<pose>Unspecified</pose>
		<truncated>0</truncated>
		<difficult>0</difficult>
		<occluded>0</occluded>
		<bndbox>
			<xmin>260</xmin>
			<xmax>491</xmax>
			<ymin>177</ymin>
			<ymax>376</ymax>
		</bndbox>
	</object>
</annotation>
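Since Pascal VOC files are plain XML, they can be inspected with the Python standard library before upload. The sketch below prints each object's label and pixel box; the filename 0.xml is a placeholder.

import xml.etree.ElementTree as ET

# Read a Pascal VOC file and print every object's label and bounding box.
root = ET.parse("0.xml").getroot()
width = int(root.findtext("size/width"))
height = int(root.findtext("size/height"))

for obj in root.iter("object"):
    name = obj.findtext("name")
    box = obj.find("bndbox")
    xmin, ymin = int(box.findtext("xmin")), int(box.findtext("ymin"))
    xmax, ymax = int(box.findtext("xmax")), int(box.findtext("ymax"))
    print(f"{name}: ({xmin}, {ymin}) to ({xmax}, {ymax}) in a {width}x{height} image")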

YOLO DarkNet - Bounding Box

🚧

Do note that the labels file needs to have the exact filename label.labels or the annotation import will throw an error.

This annotation format is commonly prepared to train YOLO models. It contains a label.labels file to describe the classes and multiple <image_name>.txt files for describing each image's annotations. Both files have to be presented as such:

# sample label.labels [Required]
WBC
RBC
# sample image1.txt for image1.jpg
0 0.40625 0.36875 0.3609375 0.414583333333333
1 0.121875 0.7 0.165625 0.20625

Each line in the .txt files should be of the following format:

# coordinate bounds need to be normalized between 0 and 1
class_id center_x center_y width height

📘

To avoid import issues, please ensure that all four corners of each bounding box are within 0 and 1 in normalized bounds. This implies the following:

center_x - width/2 >= 0.0
center_x + width/2 <= 1.0
center_y - height/2 >= 0.0
center_y + height/2 <= 1.0
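One way to check these bounds before importing is to parse each annotation line and verify that the derived corners stay within [0, 1]. The sketch below assumes a single annotation file named image1.txt.

# Validate YOLO DarkNet annotation lines: class_id center_x center_y width height, all normalized.
with open("image1.txt") as f:
    for line_number, line in enumerate(f, start=1):
        class_id, cx, cy, w, h = line.split()
        cx, cy, w, h = float(cx), float(cy), float(w), float(h)
        xmin, xmax = cx - w / 2, cx + w / 2
        ymin, ymax = cy - h / 2, cy + h / 2
        if not (0.0 <= xmin and xmax <= 1.0 and 0.0 <= ymin and ymax <= 1.0):
            print(f"line {line_number}: box exceeds normalized bounds")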

YOLO Keras PyTorch - Bounding Box

🚧

Do note that the labels file needs to have the exact filename label.txt or the annotation import will throw an error.

This format is almost the same as the YOLO format above, with a label.txt file to describe the classes; however, it allows a single .txt file to describe all the annotations. Both files have to be presented as such:

# label.txt [Required]
WBC
RBC
# annotation.txt
0.jpg 260,177,491,376,0
0.jpg 78,336,184,435,1
0.jpg 63,237,169,336,1
0.jpg 214,362,320,461,1
0.jpg 69,144,184,235,1
0.jpg 482,131,594,230,1
0.jpg 368,89,464,176,1
1.jpg 68,315,286,480,0
1.jpg 346,361,446,454,1
1.jpg 53,179,146,299,1
1.jpg 449,400,536,480,1
1.jpg 165,160,257,264,1
1.jpg 464,209,566,319,1
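Each line of the single annotation file pairs an image filename with a comma-separated box (xmin, ymin, xmax, ymax, class index into label.txt). The sketch below parses such a file and also tolerates exports that place several boxes on one line; the filenames are placeholders.

# Parse YOLO Keras / PyTorch annotations: "<filename> xmin,ymin,xmax,ymax,class_index"
with open("label.txt") as f:
    class_names = f.read().split()          # e.g. ["WBC", "RBC"]

with open("annotation.txt") as f:
    for line in f:
        if not line.strip():
            continue
        filename, *boxes = line.split()     # some exports list several boxes per line
        for box in boxes:
            xmin, ymin, xmax, ymax, class_index = map(int, box.split(","))
            print(filename, class_names[class_index], (xmin, ymin, xmax, ymax))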

CreateML Bounding Box

This file format is typically exported from CreateML and contains a list of items, one per image.

[
    {
        "image": "0001.jpg",
        "annotations": [
            {
                "label": "helmet",
                "coordinates": {
                    "x": 162.5,
                    "y": 45,
                    "width": 79,
                    "height": 88
                }
            },
            {
                "label": "person",
                "coordinates": {
                    "x": 145.5,
                    "y": 176,
                    "width": 251,
                    "height": 350
                }
            }
        ]
    }
]
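In CreateML exports, the x and y values are typically the centre of the box, with width and height in pixels. Assuming that convention, the corner coordinates can be recovered as shown below; the filename createml.json is a placeholder.

import json

# Read a CreateML export and derive corner coordinates from centre/size values.
with open("createml.json") as f:
    items = json.load(f)

for item in items:
    for ann in item["annotations"]:
        c = ann["coordinates"]
        xmin = c["x"] - c["width"] / 2    # assumes x/y are the box centre
        ymin = c["y"] - c["height"] / 2
        print(item["image"], ann["label"], (xmin, ymin, xmin + c["width"], ymin + c["height"]))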

COCO Annotator Polygons / Masks

This format, typically exported by COCO Annotator, describes the segmentation masks and the images all in a single .json file as such:

{
    "images": [
        {
            "height": 2048,
            "width": 2048,
            "id": 0,
            "file_name": "X2.5_7_6_-6.0409_-4.2648_-5.5302.jpg"
        }
    ],
    "categories": [
        {
            "supercategory": "Dog",
            "id": 0,
            "name": "Dog"
        },
        {
            "supercategory": "P",
            "id": 1,
            "name": "P"
        }
    ],
    "annotations": [
        {
            "segmentation": [
                [
                    1021.2389380530973,
                    1103.5398230088495,
                    1027.433628318584,
                    1176.9911504424776,
                    813.2743362831858,
                    1209.7345132743362,
                    837.1681415929203,
                    1107.9646017699115
                ]
            ],
            "iscrowd": 0,
            "area": 17280.131568642784,
            "image_id": 0,
            "bbox": [
                813.0,
                1104.0,
                214.0,
                106.0
            ],
            "category_id": 4,
            "id": 1
        }
    ]
}
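Each polygon in the segmentation list is a flat sequence [x1, y1, x2, y2, ...], so its area can be sanity-checked against the stored area field with the shoelace formula. The sketch below does this for every annotation; the filename coco_segmentation.json is a placeholder.

import json

def shoelace_area(flat_points):
    # Area of a polygon given as a flat [x1, y1, x2, y2, ...] list.
    xs, ys = flat_points[0::2], flat_points[1::2]
    n = len(xs)
    return abs(sum(xs[i] * ys[(i + 1) % n] - xs[(i + 1) % n] * ys[i] for i in range(n))) / 2

with open("coco_segmentation.json") as f:
    coco = json.load(f)

for ann in coco["annotations"]:
    polygon_area = sum(shoelace_area(polygon) for polygon in ann["segmentation"])
    print(f"annotation {ann['id']}: polygon area {polygon_area:.1f}, stored area {ann['area']:.1f}")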

LabelMe Polygon / Mask

This file format is typically exported from LabelMe and provides one annotation file per image. Users can upload each individual annotation file, which looks as such:

{
  "version": "4.5.6",
  "flags": {},
  "shapes": [
    {
      "label": "Ox",
      "points": [
        [
          62.576687116564415,
          1244.7852760736196
        ]
      ],
      "group_id": null,
      "shape_type": "polygon",
      "flags": {}
    },
    {
      "label": "Ox",
      "points": [
        [
          560.483870967742,
          1112.0967741935483
        ]
      ],
      "group_id": null,
      "shape_type": "polygon",
      "flags": {}
    },
    {
      "label": "P",
      "points": [
        [
          539.516129032258,
          1090.3225806451612
        ]
      ],
      "group_id": null,
      "shape_type": "polygon",
      "flags": {}
    }
  ],
  "imagePath": "X2.5_1_0_-8.3827_-3.7965_-0.3163.jpg",
  "imageHeight": 2048,
  "imageWidth": 2048
}
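Because LabelMe stores one JSON file per image, converting a folder of LabelMe files into a single list of polygons is mostly a matter of flattening each shape's points. A minimal sketch, assuming the files sit in a directory named labelme_annotations/:

import json
from pathlib import Path

# Collect every LabelMe polygon as (image, label, flat [x1, y1, x2, y2, ...]) tuples.
polygons = []
for path in Path("labelme_annotations").glob("*.json"):
    data = json.loads(path.read_text())
    for shape in data["shapes"]:
        if shape["shape_type"] != "polygon":
            continue
        flat_points = [coordinate for point in shape["points"] for coordinate in point]
        polygons.append((data["imagePath"], shape["label"], flat_points))

print(f"collected {len(polygons)} polygons")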

CSV Classification

This is a simple format for classification labels. It matches image filenames with class labels using two columns.

filename,label
image_1.png,cat
image_2.png,dog
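If your images are organised into one folder per class, the CSV can be generated directly from the directory layout. A minimal sketch, assuming a dataset/<label>/<image> folder structure:

import csv
from pathlib import Path

# Build a classification CSV from a dataset/<label>/<image> folder structure.
with open("classification.csv", "w", newline="") as f:
    writer = csv.writer(f)
    writer.writerow(["filename", "label"])
    for image_path in sorted(Path("dataset").glob("*/*")):
        writer.writerow([image_path.name, image_path.parent.name])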

COCO Keypoint

COCO Keypoint follows the same format as the other COCO formats, except that it additionally has keypoints. Keypoints are stored as (x1, y1, v1, x2, y2, v2, ..., xk, yk, vk), where k is the total number of keypoints. xk and yk are the x and y coordinates of the keypoint, and vk is its visibility flag: 0 means not labelled (in which case x = y = 0), 1 means labelled but not visible, and 2 means labelled and visible.

{
    "images": [
        {
            "height": 2048,
            "width": 2048,
            "id": 0,
            "file_name": "X2.5_7_6_-6.0409_-4.2648_-5.5302.jpg"
        }
    ],
    "categories": [
        {
            "supercategory": "Dog",
            "id": 0,
            "name": "Dog",
            "keypoints": [
                "nose",
                "eye",
                "ear",
                ...
            ],
            "skeleton": [
                [1, 2],
                [2, 3],
                [1, 3],
                ...
            ]
        }
    ],
    "annotations": [
        {
            "keypoints": [
              	612.4,
              	503.7,
              	2,
              	752.8,
              	296.5,
              	2,
              	...
            ],
            "image_id": 0,
            "category_id": 4,
            "id": 1,
            "num_keypoints": 17,
            ...
        }
    ]
}
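The flat keypoints list can be regrouped into (x, y, v) triplets, for example to count how many keypoints are actually labelled. The sketch below does this; the filename coco_keypoints.json is a placeholder.

import json

# Group each annotation's flat keypoints list into (x, y, visibility) triplets.
with open("coco_keypoints.json") as f:
    coco = json.load(f)

for ann in coco["annotations"]:
    flat = ann["keypoints"]
    triplets = [tuple(flat[i:i + 3]) for i in range(0, len(flat), 3)]
    labelled = sum(1 for _, _, v in triplets if v > 0)   # v = 1 or 2 means labelled
    print(f"annotation {ann['id']}: {labelled}/{ann['num_keypoints']} keypoints labelled")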

Datature Skeleton File

The skeleton file encodes the skeleton together with other parameters that can affect it, such as chirality, camera angle and position, and skeleton information like labels.

{
  	"info": [
        {
            "description": "Marcus' Skeletons",
            "version": "1.0",
            "year": 2023,
            "contributor": "Marcus",
            "date_created": "2023/09/05",
            "url": "nexus.datature.io"

        },
        {"...": "..."}
    ],
 	  "licenses": [
        {
            "url": "datature.io",
            "name": "Marcus Skeleton 1"
        }
    ],
    "cameras": [
        {
            "name": "Leftmost Camera",
            "parameters": {
                "focalLength": [<fx>, <fy>], // in pixels
                "principalPoint": [<cx>, <cy>, <cz>], // in pixels
                "skew": 0.0,
                "radialDistortion"?: [0.0, 0.0],
                "tangentialDistortion"?: [0.0, 0.0]
            }
        }
    ],
		"chirality": [
        {
            "name": "Custom Transform 1",
            "order": {
            "groups": [
                ["Head"],
                ["Chest"],
                ["Left Hand", "Right Hand"],
                ["Left Leg", "Right Leg"],
                ["First Toe", "Second Toe", "Third Toe", "Fourth Toe", "Fifth Toe"]
            ],
        "perform": "rotate" // or "flip"
    },
            "affects": "<List of Skeleton Names/Ids>",
            "transform": {
                "minTranslate": [0.0, 0.0, 0.0],
                "maxTranslate": [0.0, 0.0, 0.0],
                "minRotate": [0.0, 0.0, 0.0],
                "maxRotate": [0.0, 0.0, 0.0],
                "minScale": [0.0, 0.0, 0.0],
                "maxScale": [0.0, 0.0, 0.0],
                "minProjection": [0.0, 0.0, 0.0],
                "maxProjection": [0.0, 0.0, 0.0]
            }
        }
    ],
		"skeletons": [
        {
            "name": "Skeleton 1",
            "keypoints": [
                {
                    "name": "Right Leg",
                    "category": ["Leg", "Limb", "Right"], // Characteristics
                    "kwargs"? : ... // Additional parameters for customisability
                },
                {
                    "name": "Left Leg",
                    "category": ["Leg", "Limb", "Left"],
                    "kwargs"? : ...
                }
            ],
            "connections": [
                {
                    "pair": ["Right Leg", "Left Leg"], // or
                    "pair": [0, 1],
                    "kwargs"? : ...
                }
            ]
        }
    ]
}
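Going by the schema sketched above, the snippet below assembles a minimal skeleton file containing only named keypoints and their connections and writes it to disk. The set of blocks that are strictly required is an assumption on our part; the field values are illustrative only.

import json

# Assemble a minimal Datature Skeleton file following the example schema above.
# Which blocks are strictly required is an assumption; adjust to your own needs.
skeleton_file = {
    "info": [{"description": "Example skeleton", "version": "1.0", "year": 2023}],
    "licenses": [{"url": "", "name": "Example license"}],
    "skeletons": [
        {
            "name": "Skeleton 1",
            "keypoints": [
                {"name": "Right Leg", "category": ["Leg", "Limb", "Right"]},
                {"name": "Left Leg", "category": ["Leg", "Limb", "Left"]},
            ],
            "connections": [{"pair": ["Right Leg", "Left Leg"]}],
        }
    ],
}

with open("skeleton.json", "w") as f:
    json.dump(skeleton_file, f, indent=2)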

NIfTI One-Hot Segmentation

This format uses a multi-channel structure where each channel represents a separate class, containing binary masks (0 or 1 values).

Structure:

  • Each class gets its own channel in the array
  • Within each channel, a value of 1 indicates that a voxel belongs to that class, while 0 indicates it doesn't
  • Multiple classes can potentially overlap at the same voxel location (across different channels)
  • Requires a companion labels.json file that maps class IDs to class names
[
    // Channel 0 (class 1)
    [[[1, 1, 0, 0],
      [1, 0, 0, 1],
      [0, 0, 1, 1]],
     
     [[1, 0, 0, 1],
      [0, 0, 0, 1],
      [0, 0, 1, 1]]],

    // Channel 1 (class 2)
    [[[0, 0, 1, 1],
      [0, 1, 1, 0],
      [0, 0, 0, 0]],
     
     [[0, 1, 1, 0],
      [0, 0, 1, 0],
      [0, 0, 0, 0]]]
]  // Each channel contains binary masks (0 or 1)
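
// Companion labels.json mapping class IDs to class names: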
[
    {
        "id": 1,
        "name": "class1"
    },
    {
        "id": 2,
        "name": "class2"
    }
]

NIfTI Class-Index Segmentation

This format uses a single-channel structure where each voxel contains an integer representing its class label directly.

Structure:

  • Each voxel contains an integer value corresponding to the class ID it belongs to
  • A value of 0 typically represents background (no annotation)
  • Values 1, 2, 3, etc. represent different classes
  • Only one class can be assigned per voxel (mutually exclusive classes)
  • Also requires a companion labels.json file for class name mapping
[
    [[0, 0, 1, 1],
     [0, 1, 1, 0],
     [2, 2, 0, 0]],
    
    [[0, 1, 1, 0],
     [2, 2, 1, 0],
     [2, 2, 0, 0]]
]
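
// Companion labels.json mapping class IDs to class names: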
[
    {
        "id": 1,
        "name": "class1"
    },
    {
        "id": 2,
        "name": "class2"
    }
]
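To illustrate the relationship between the two NIfTI layouts, the sketch below converts a class-index volume into a one-hot volume with one channel per class and saves it as a .nii.gz file. It assumes numpy and nibabel are installed and uses an identity affine, which you would normally replace with the affine of the source scan.

import numpy as np
import nibabel as nib

# Convert a class-index segmentation volume into a one-hot (channel-per-class) volume.
class_index = np.array([
    [[0, 0, 1, 1], [0, 1, 1, 0], [2, 2, 0, 0]],
    [[0, 1, 1, 0], [2, 2, 1, 0], [2, 2, 0, 0]],
])

class_ids = [1, 2]  # must match the ids listed in the companion labels.json
one_hot = np.stack([(class_index == class_id).astype(np.uint8) for class_id in class_ids])

# Identity affine is an assumption; in practice reuse the affine of the source scan.
nib.save(nib.Nifti1Image(one_hot, affine=np.eye(4)), "one_hot_segmentation.nii.gz")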

DICOM RT Structure Set

DICOM RT Structure Sets define regions of interest (ROIs) within 3D medical image series using contour data. When importing RT Structure Set files (.dcm), each file must reference exactly one DICOM study. For each ROI in an RT Structure Set file, the system renders a 3D bitmask derived from the associated contour data, imports this bitmask as a new annotation for the corresponding DICOM 3D image series asset, and applies the ROI name as the tag name.

Supported Contour Geometric Types (3006, 0042):

  • POINT: A single coordinate point.
  • OPEN_PLANAR: An open contour with coplanar points.
  • CLOSED_PLANAR: A polygon composed of coplanar points.
  • CLOSEDPLANAR_XOR: Multiple coplanar polygons combined using XOR operations.
🚧

The OPEN_NONPLANAR type is not currently supported. Please contact us if your use case requires this.

DICOM File Meta -------------------------------------------------------
...
(0002, 0002) Media Storage SOP Class UID            UI: RT Structure Set Storage
...

DICOM Data Set --------------------------------------------------------
...
(0008, 0016) SOP Class UID                          UI: RT Structure Set Storage
...
# Must match the Study Instance UID of a 3D DICOM series uploaded or synced
# to the project as an asset.
(0020, 000D) Study Instance UID                     UI: ...
...
(3006, 0020) Structure Set ROI Sequence             ... item(s) ------
  (3006, 0022) ROI Number                           IS: '1'
  (3006, 0024) Referenced Frame Of Reference UID    UI: ...
  (3006, 0026) ROI Name                             LO: 'ROI 1'
  ...
  ------------
  ...
...
(3006, 0039) ROI Contour Sequence                   ... item(s) ------
  ...
  (3006, 0040) Contour Sequence                     ... item(s) ------
    (3006, 0016) Contour Image Sequence             ... item(s) ------
      # Defines images within the study containing the contour
      (0008, 1150) Referenced SOP Class UID         UI: ...
      (0008, 1155) Referenced SOP Instance UID
      ...
      ------------
      ...
    # Supported types are: POINT, OPEN_PLANAR, CLOSED_PLANAR, CLOSEDPLANAR_XOR.
    (3006, 0042) Contour Geometric Type             CS: 'CLOSED_PLANAR'
    (3006, 0046) Number of Contour Points           IS: '330'
    ...
    # Flattened sequence of (x, y, z) contour coordinates in patient-based
    # coordinate system.
    (3006, 0050) Contour Data                       DS: Array of 990 elements
    ------------
    ...
  ...
  (3006, 0084) Referenced ROI Number                IS: '1'
  ------------
  ...
...
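To see how an RT Structure Set file maps onto the fields above, you can walk its ROI and contour sequences with pydicom. The sketch below is only an inspection aid, not the import implementation; the filename structure_set.dcm is a placeholder.

import pydicom

# List every ROI in an RT Structure Set together with its contour geometric types.
ds = pydicom.dcmread("structure_set.dcm")
roi_names = {roi.ROINumber: roi.ROIName for roi in ds.StructureSetROISequence}

for roi_contour in ds.ROIContourSequence:
    name = roi_names.get(roi_contour.ReferencedROINumber, "<unnamed>")
    for contour in roi_contour.ContourSequence:
        # ContourData is a flat list of (x, y, z) coordinates in the patient-based system.
        points = len(contour.ContourData) // 3
        print(f"{name}: {contour.ContourGeometricType} with {points} contour points")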