How to Debug the AutoPot Workflow?
Like any new package, AutoPot is not bug-free. In our experience so far, most workflow failures occur because a task is not configured properly, for example because of a wrong filename.
In such cases the workflow does not abort but hangs, because the
AutoPot orchestrator triggers a new event only when none of the tasks
the event depends on is in the state FAILED. A single failed run
therefore blocks every event that depends on it.
The state of runs can be checked by running the following command in the
workflow directory:
motoko info --verbose
The output may look like this:
******************************
TaskManager: md_select
******************************
Loaded 2 jobs: OK
Loaded job infos: OK
Table create: OK
| cfg_fname | id |
|----------------------------------|------|
| md_cfgs.xyz,0 | 1 |
| md_cfgs.xyz,1 | 2 |
Loaded 2 runs: OK
Loaded run infos: OK
Table create: OK
| run_name | id | job_id | state | nproc | machine_name | start_time | last step | last update | Time/step | Total Time |
|------------|------|----------|-----------|---------|----------------|----------------|-------------|---------------|-------------|--------------|
| test | 1 | 1 | FINISHED | 1 | localhost | 10:49 04/11/25 | None | None | None | None |
| test | 2 | 2 | FAILED | 1 | localhost | 10:52 04/11/25 | None | None | None | None |
Here, the run with id = 2 has failed. To diagnose it, go to
md_select/BD-md_select-runs/run-2 and inspect test.e2 and test.o2
to find out what caused the error.
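When several runs have failed, locating them by hand becomes tedious. The following is a minimal sketch, not part of AutoPot or motoko, that scans saved output of the info command for rows in the state FAILED and collects the last lines of the log files in a run directory. The function names failed_runs, run_dir and log_tails are hypothetical, and the directory layout <task>/BD-<task>-runs/run-<id> is an assumption generalized from the single path shown above.

```python
from pathlib import Path


def failed_runs(info_text):
    """Return (task, run_id) pairs for runs-table rows whose state is FAILED."""
    task = None
    columns = None
    failed = []
    for line in info_text.splitlines():
        line = line.strip()
        if line.startswith("TaskManager:"):
            # A new task section starts; forget the previous table header.
            task = line.split(":", 1)[1].strip()
            columns = None
        elif line.startswith("|"):
            cells = [c.strip() for c in line.strip("|").split("|")]
            if "state" in cells and "id" in cells:
                # Header row of the runs table (the jobs table has no state).
                columns = cells
            elif columns and len(cells) == len(columns):
                row = dict(zip(columns, cells))
                if row["state"] == "FAILED":
                    failed.append((task, int(row["id"])))
    return failed


def run_dir(workflow_dir, task, run_id):
    """Build the run directory path (assumed layout, see the text above)."""
    return Path(workflow_dir) / task / f"BD-{task}-runs" / f"run-{run_id}"


def log_tails(directory, n=20):
    """Map each log file named like test.e2 or test.o2 to its last n lines."""
    tails = {}
    for path in sorted(Path(directory).glob("*.[eo][0-9]*")):
        lines = path.read_text(errors="replace").splitlines()
        tails[path.name] = lines[-n:]
    return tails
```

To use it, save the output of the info command to a file, pass the file's contents to failed_runs, and call log_tails(run_dir(".", task, run_id)) for each pair it returns.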
If the error originates in launch.sh or doIt.py, you will probably
also need to fix these files in the corresponding task directory, so that
future runs do not hit the same error.