Runner Management

Runner Management Functions

These command-line functions help you manage your Runner installation.

Runner Install

datature runner install

Installs a new Runner and connects it to Nexus so that the Runner can be used for training.

Sample Output

$ datature runner install

? The Runner will take up approximately 754MB of disk space.
To run trainings on your Runner, it is also recommended to have an additional minimum of 8GB of disk space on your system. Continue? Yes
? Please enter the name for your custom runner: my-custom-runner
✔ Success: System specs verified.
✔ Success: microk8s is installed and running.
✔ Success: Runner 'my-custom-runner' [85180e] registered.
✔ Success: Runner namespace 'datature-runner-47c81e83-6a41-4538-92ea-7b5dda85180e' created.
✔ Success: Config map applied.
✔ Success: Service account applied.
✔ Success: Docker registry credentials set.
✔ Success: Deployment 'datature-runner' applied.
Initializing the Runner… |████████████████████████████████████████| 100/100 [100%] (16.60/s) 
✔ Success: Resources cleaned up.
Success: Runner installed and initialized.
Return to your Nexus workspace: https://nexus.datature.io/workspace/54c6eea142f045069e4bc04f73c7bd76
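
The install prompt above mentions roughly 754MB for the Runner plus a recommended 8GB for trainings. A minimal sketch of a pre-install disk check follows. It is not part of the datature CLI: the helper name has_free_kb is hypothetical, the sizes are treated as MiB and GiB, and checking the root filesystem is an assumption, since the source does not say where the Runner stores its data.

```shell
#!/bin/sh
# Hypothetical pre-install check; not part of the datature CLI.
# Assumption: 754MB and 8GB are read as MiB and GiB, summed in KiB.
REQUIRED_KB=$(( 754 * 1024 + 8 * 1024 * 1024 ))

# Succeeds if the filesystem holding path $1 has at least $2 KiB available.
has_free_kb() {
    # With -P and -k, df prints a header line followed by one line per
    # filesystem, and column 4 holds the available space in KiB.
    avail_kb=$(df -Pk "$1" | awk 'NR == 2 { print $4 }')
    [ "$avail_kb" -ge "$2" ]
}

# Assumption: the Runner's data lives on the root filesystem.
if has_free_kb / "$REQUIRED_KB"; then
    echo "Enough free space for the Runner and its trainings."
else
    echo "Less than the recommended free space is available."
fi
```

Replace / with the mount point that actually holds your Runner data if it differs.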

Runner List

datature runner list

Lists your Runners, showing each Runner's name, hash, status, and secret key validity.

Sample Output

$ datature runner list

RUNNER NAME                HASH       STATUS                 SECRET KEY
my-custom-runner           85180e     Available              Valid     
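
When scripting against this table, the rows can be filtered by their STATUS column. The sketch below is a hypothetical helper, not a datature feature. It assumes the column layout shown in the sample output, Runner names without spaces, and that the table is printed unchanged when the command is piped.

```shell
# Hypothetical helper: reads `datature runner list` output on stdin and
# prints the names of Runners whose STATUS column is "Available".
# Assumes whitespace-separated columns and Runner names without spaces.
# The header row is skipped naturally because its third field is "HASH".
available_runners() {
    awk '$3 == "Available" { print $1 }'
}

# Intended usage (requires the datature CLI):
#   datature runner list | available_runners
```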

Runner Status

datature runner status

Shows the status, hardware specifications, and ongoing training runs of a Runner.

Arguments

[runner_name_or_hash]

Name or hash of your Runner (the hash is the last 6 characters of the Runner ID). If omitted, you will be prompted to enter one of the two to specify the Runner.
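
To illustrate how the hash relates to the Runner ID, here is a small sketch that takes the last 6 characters of an ID. The helper name runner_hash is hypothetical. It also assumes that the Runner ID is the UUID embedded in the namespace name printed by the install command.

```shell
# Hypothetical helper: prints the last 6 characters of a Runner ID.
runner_hash() {
    printf '%s' "$1" | tail -c 6
}

# Assumption: the Runner ID is the UUID that follows "datature-runner-"
# in the namespace name from the install output.
runner_hash "47c81e83-6a41-4538-92ea-7b5dda85180e"   # prints 85180e
echo
```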

Sample Output

$ datature runner status my-custom-runner

+-------------------------------------------------------------------------+
| RUNNER NAME                HASH       STATUS                 SECRET KEY |
| my-custom-runner           85180e     Occupied               Valid      |
|=========================================================================|
| Cores      RAM             GPUs       CUDA       Driver                 |
| 8          31    GiB       1          12.0.2.0   535.54.03              |
|-------------------------------------------------------------------------|
| GPU        Name                       VRAM       CC                     |
| 0          Tesla T4                   16   GiB   75                     |
+-------------------------------------------------------------------------+
+------------------------------------------------------------------------------+
| RUN ID  IMAGE           STATUS       STARTED AT           RAM        GPUs    |
| d502ca  model-yolov9    Creating     2024-07-15 14:18:11  29   GiB   1       |
+------------------------------------------------------------------------------+
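
A script that needs only the Runner's state, for example to wait until it is no longer Occupied, could pull that single field out of the first table. This is a sketch under assumptions: the helper name runner_state is hypothetical, and it relies on the boxed layout shown above, where the data row directly follows the header row.

```shell
# Hypothetical helper: reads `datature runner status` output on stdin and
# prints the STATUS value from the first table.
# In the data row the fields are: "|", name, hash, status, secret key, "|".
runner_state() {
    awk '!done && prev ~ /RUNNER NAME/ { print $4; done = 1 } { prev = $0 }'
}

# Intended usage (requires the datature CLI):
#   datature runner status my-custom-runner | runner_state
```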

Runner Logs

datature runner logs

Dumps all logs accumulated over the lifetime of the specified Runner or training run to stdout (default), with an option to write them to a user-specified file instead.

🚧

Accumulated logs can grow large over time. If you use this command frequently, consider piping its output through a command such as tail to limit it. For example, datature runner logs my-custom-runner | tail -n 10 prints only the last 10 lines.

Arguments

[runner_name_or_hash]

Name or hash of your Runner (the hash is the last 6 characters of the Runner ID). If omitted, you will be prompted to enter one of the two to specify the Runner.

[--run RUN_ID]

6-character ID of your training run. You can find it on the Trainings page on Nexus or, if the run is still ongoing, by running datature runner status [runner_name_or_hash]. When a run ID is specified, the logs of that training run are dumped instead of the Runner logs.

[--output /path/to/logs]

Dumps logs to a specified file path instead of stdout.
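
Since the run ID of an ongoing run appears in the status output, it can be extracted there and then passed to --run. The sketch below is a hypothetical helper, not a datature feature, and assumes the runs table keeps the boxed layout shown in the Runner Status sample output.

```shell
# Hypothetical helper: reads `datature runner status` output on stdin and
# prints the RUN ID of every row in the runs table.
# Rows are printed only after the "RUN ID" header has been seen, and border
# lines are skipped because their first field is not "|".
run_ids() {
    awk 'in_runs && $1 == "|" { print $2 } /RUN ID/ { in_runs = 1 }'
}

# Intended usage (requires the datature CLI):
#   datature runner status my-custom-runner | run_ids
#   datature runner logs my-custom-runner --run d502ca --output ./d502ca.log
```

The output file name ./d502ca.log is only an illustration; any writable path can be used.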

Sample Output

$ datature runner logs my-custom-runner | tail -n 4

[2024-07-15 14:14:39,098] (datature-training) DEBUG - WARNING:tensorflow:AutoGraph could not transform <function DatatureDataset.__init__.<locals>.<lambda> at 0x7fd4c5514ca0> and will run it as-is.
[2024-07-15 14:14:39,099] (datature-training) DEBUG - Cause: could not parse the source code of <function DatatureDataset.__init__.<locals>.<lambda> at 0x7fd4c5514ca0>: no matching AST found among candidates:
[2024-07-15 14:14:39,099] (datature-training) DEBUG - 
[2024-07-15 14:14:39,099] (datature-training) DEBUG - To silence this warning, decorate the function with @tf.autograph.experimental.do_not_convert
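
Each log line in the sample has the shape [timestamp] (logger) LEVEL - message, and the message itself can contain words like WARNING. A plain text search for a level name would therefore also match message text. The sketch below filters on the level field only. The helper name logs_at_level is hypothetical, and level names other than DEBUG are assumptions, since the sample shows only DEBUG lines.

```shell
# Hypothetical helper: reads Runner logs on stdin and prints only the lines
# whose level field equals $1.
# Assumes the format "[date time] (logger) LEVEL - message" with a logger
# name that contains no spaces, so the level is the fourth field.
logs_at_level() {
    awk -v level="$1" '$4 == level'
}

# Intended usage (requires the datature CLI):
#   datature runner logs my-custom-runner | logs_at_level DEBUG | tail -n 10
```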

Runner Reauthentication

datature runner reauth

Reauthenticates a Runner using your workspace secret key.

Sample Output

$ datature runner reauth 

? Please enter the name or the hash of your Runner (last 6 characters of Runner ID): my-custom-runner
? Please enter your workspace secret key: ****************************************************************
✔ Success: Config map applied.
✔ Success: Deployment 'datature-runner' restarted successfully.
✔ Success: Runner 'my-custom-runner' [69a8e5] reauthenticated successfully.

Runner Suspend

datature runner suspend

Sample Output

$ datature runner suspend

? Please enter the name or the hash of your Runner (last 6 characters of Runner ID): my-custom-runner
✔ Success: Runner 'my-custom-runner' [69a8e5] suspended successfully.

Runner Resume

datature runner resume

Sample Output

$ datature runner resume

? Please enter the name or the hash of your Runner (last 6 characters of Runner ID): my-custom-runner
✔ Success: Runner 'my-custom-runner' [69a8e5] resumed successfully.

Runner Uninstall

datature runner uninstall

Sample Output

$ datature runner uninstall

? Please enter the name or the hash of your Runner (last 6 characters of Runner ID): my-custom-runner
? Are you sure you want to uninstall the Runner? This action is irreversible.
Uninstalling will remove all associated data and configurations. Yes
? There are ongoing runs associated with this Runner that will be killed. Ensure to save any necessary data before proceeding.
Do you wish to continue? Yes
✔ Success: Runner 'my-custom-runner' [bff1dd] uninstalled successfully.
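
Because uninstalling kills any ongoing runs, it can be useful to check for them first. The following is a sketch of such a guard, not a datature feature. The helper name has_ongoing_runs is hypothetical, and it assumes that ongoing runs appear as rows under a RUN ID header in the datature runner status output, as in the sample shown earlier, and that no such rows means no ongoing runs.

```shell
# Hypothetical guard: reads `datature runner status` output on stdin and
# succeeds (exit status 0) if at least one run row is listed.
has_ongoing_runs() {
    awk 'in_runs && $1 == "|" { found = 1 }
         /RUN ID/ { in_runs = 1 }
         END { exit !found }'
}

# Intended usage (requires the datature CLI):
#   if datature runner status my-custom-runner | has_ongoing_runs; then
#       echo "Runs are still in progress; save your data before uninstalling."
#   fi
```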