Dataset Management

Dataset Management Functions

The commands on this page create and manage the datasets used by batch inference jobs.


Create Dataset

datature batch datasets create

Creates a new dataset and uploads its signed URLs to a GCP bucket. The command prompts for a dataset name, the path to the dataset file, and a timezone.

Sample Output

$ datature batch datasets create

? Enter a name for the Dataset: rbc
? Enter the path to the Dataset file: rbc.ndjson
? Select a timezone: Singapore

✓ Dataset created successfully
This Dataset will expire and automatically be removed after August 25, 2024 12:45 PM +08 (10 days from now)

Uploading Dataset |████████████████████| 1/1 [100%] in 1.6s (0.62/s) 
✓ Dataset uploaded and processed successfully.

ID:             dataset_652ef566-af5c-4d52-b1e4-ec8ec6dc4b8e
Name:           rbc
Source:         UploadedNewlineDelimitedJsonFile
Expires At:     2024-08-25 12:45:16
Status:         Uploaded
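
The sample above uploads an `.ndjson` file, and the reported source is `UploadedNewlineDelimitedJsonFile`. Newline-delimited JSON holds one JSON value per line, so a malformed line can be caught locally before running the create command. The following is a minimal sketch, not part of the Datature CLI: the function name `find_invalid_ndjson_lines` is hypothetical, and it checks JSON syntax only, not the fields Datature expects in each record.

```python
import json


def find_invalid_ndjson_lines(text):
    """Return the 1-based numbers of non-blank lines that are not valid JSON.

    Hypothetical helper: it validates syntax only and knows nothing about
    the record schema the dataset file is expected to follow.
    """
    bad_lines = []
    for number, line in enumerate(text.split("\n"), start=1):
        if not line.strip():
            continue  # this sketch ignores blank lines
        try:
            json.loads(line)
        except json.JSONDecodeError:
            bad_lines.append(number)
    return bad_lines
```

An empty result means every non-blank line parsed as JSON; any numbers returned point at the lines to fix before uploading.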

List Datasets

datature batch datasets list

Lists all available datasets in the project.

Sample Output

$ datature batch datasets list

ID                                            Name                 Status              
dataset_a9242c7b-b0f6-4e4a-b40a-716ab9042f5c  my-dataset-2024-0... Uploaded            
dataset_65948422-d4c9-4099-bd3e-3c7df629dd90  my-dataset-2024-0... Uploaded            
dataset_f978b838-7975-44c2-9040-5afdec85e178  my-dataset-2024-0... Uploaded
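
For scripting, the listing can be turned into records. The sketch below assumes the output looks exactly like the sample above: one header row, then whitespace-separated columns in which the first token is the ID and the last token is a single-word status. `parse_dataset_list` is a hypothetical helper, and names truncated with `...` in the listing stay truncated.

```python
def parse_dataset_list(output):
    """Parse the text printed by the list command into a list of dicts.

    Assumes the column layout of the sample output: ID, Name, Status,
    with a single-word status in the last column.
    """
    lines = [line for line in output.splitlines() if line.strip()]
    rows = []
    for line in lines[1:]:  # the first non-blank line is the header
        tokens = line.split()
        if len(tokens) < 3:
            continue  # not a complete ID / Name / Status row
        rows.append({
            "id": tokens[0],
            "name": " ".join(tokens[1:-1]),  # the name may contain spaces
            "status": tokens[-1],
        })
    return rows
```

Because the Name column is truncated in the listing, the ID is the safer value to pass to the get and delete commands.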

Get Dataset

datature batch datasets get [DATASET_ID]

Gets a dataset by ID. If no ID is provided, a dropdown list is displayed so the user can choose the dataset.

Sample Output

$ datature batch datasets get

? Which entry do you want to select? 
❯ my-dataset-2024-08-07
  my-dataset-2024-08-08
  my-dataset-2024-08-08
  
? Which entry do you want to select? my-dataset-2024-08-07

ID:             dataset_a9242c7b-b0f6-4e4a-b40a-716ab9042f5c
Name:           my-dataset-2024-08-07
Source:         UploadedNewlineDelimitedJsonFile
Expires At:     2024-08-17 12:30:48
Status:         Uploaded
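
The detail block is a set of `Key: value` lines, which makes it easy to read from a script, for example to check when a dataset expires. This is a minimal sketch that assumes the layout shown in the sample; `parse_dataset_details` and `parse_expiry` are hypothetical helpers. The output does not state a timezone, so the expiry is returned as a naive datetime.

```python
from datetime import datetime


def parse_dataset_details(output):
    """Collect `Key: value` lines from the command output into a dict.

    Lines without a colon, or with nothing after it (such as the
    interactive prompt lines), are skipped.
    """
    details = {}
    for line in output.splitlines():
        key, separator, value = line.partition(":")
        if separator and value.strip():
            details[key.strip()] = value.strip()
    return details


def parse_expiry(details):
    """Parse the `Expires At` field, assuming the format shown in the sample."""
    return datetime.strptime(details["Expires At"], "%Y-%m-%d %H:%M:%S")
```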

Delete Dataset

datature batch datasets delete [DATASET_ID]

Deletes a dataset by ID. If no ID is provided, a dropdown list is displayed so the user can choose the dataset.

Sample Output

$ datature batch datasets delete

? Which entry do you want to select? 
❯ my-dataset-2024-08-07
  my-dataset-2024-08-08
  my-dataset-2024-08-08
  
? Which entry do you want to select? my-dataset-2024-08-07
✓ Dataset deleted successfully.
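
Passing the optional `[DATASET_ID]` argument avoids the dropdown, which is useful when calling the CLI from a script. The sketch below builds the argument list for the get and delete commands documented on this page. The helper names are hypothetical, and it assumes that `datature` is on the `PATH`, that no further prompt appears once an ID is supplied, and that the CLI exits with a non-zero status on failure.

```python
import subprocess


def dataset_command(action, dataset_id=None):
    """Build the argument list for `datature batch datasets <action> [DATASET_ID]`."""
    if action not in ("get", "delete"):
        raise ValueError(f"unsupported action: {action!r}")
    argv = ["datature", "batch", "datasets", action]
    if dataset_id:
        argv.append(dataset_id)  # supplying the ID skips the dropdown
    return argv


def run_dataset_command(action, dataset_id):
    """Run the command and return its standard output.

    Assumption: a failed command exits non-zero, in which case
    subprocess raises CalledProcessError.
    """
    result = subprocess.run(
        dataset_command(action, dataset_id),
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout
```

For example, `run_dataset_command("delete", "dataset_a9242c7b-b0f6-4e4a-b40a-716ab9042f5c")` would run the delete command for the dataset shown in the samples above.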