Kronos package

This section describes the features of the Kronos package and its commands.

Kronos features

The Kronos package offers the following features, which remove much of the difficulty of building a pipeline:

Info

We define a pipeline (also called a workflow) as a directed acyclic graph (DAG) composed of different tasks.

  • Single configuration file: the whole pipeline can be configured using a single configuration file.
  • Parallelization: parallelizable tasks are automatically run in parallel.
  • Synchronization: parallel tasks can be synchronized based on any of their parameters.
  • Local, cluster and cloud support: pipelines can be run locally, on a cluster of computing nodes, or in the cloud.
  • Forced dependencies: any task can be forced to wait for any other tasks.
  • Breakpoints: a pipeline can be programmatically paused and restarted from any point in the pipeline.
  • Boilerplates: an executable boilerplate or script can be injected into a task and is run before the task itself.
  • Keywords: a set of specific keywords in the configuration file are automatically replaced by proper values at runtime.
  • Parameter sweep: a pipeline can be run for a list of different values for a set of input arguments.
  • Output directory customization: the structure of the output directory where all the intermediate files and results are stored can be configured by the user in the configuration file.
  • Event logging: all the events are automatically logged.
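To make the DAG definition and the parallelization and forced-dependency features concrete, here is a conceptual sketch in Python. It is not Kronos code and does not reflect how Kronos schedules tasks internally; the task names and the `parallel_batches` helper are hypothetical, and the standard-library `graphlib` module (Python 3.9+) stands in for a scheduler.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Hypothetical task names; each task maps to the set of tasks it must wait for.
dependencies = {
    "align": set(),
    "call_variants": {"align"},
    "qc_report": {"align"},
    "summarize": {"call_variants", "qc_report"},
}


def parallel_batches(deps):
    """Group tasks into batches; tasks in the same batch do not depend on each other."""
    sorter = TopologicalSorter(deps)
    sorter.prepare()  # raises graphlib.CycleError if the graph is not a DAG
    batches = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())  # every task whose dependencies are done
        batches.append(ready)
        sorter.done(*ready)
    return batches


batches = parallel_batches(dependencies)
for number, batch in enumerate(batches, start=1):
    print(f"batch {number}: {', '.join(batch)}")
```

In this model, a forced dependency is simply an extra edge: adding `"call_variants"` to the set for `"qc_report"` makes the two tasks run one after the other instead of side by side.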

Kronos commands

Once Kronos is installed, it is added to the PATH, i.e. kronos becomes an available command which has the following sub-commands:

Command         Description
make_component  make a new component template
make_config     make a new configuration file
update_config   copy the fields of an old configuration file to a new configuration file
init            initialize a pipeline from the given configuration file
run             run Kronos-made pipelines with optional initialization

as well as the following options:

Options               Description
-h or --help          print help - optional
-v or --version       show program’s version number and exit - optional
-w or --working_dir   path/to/working_dir - optional

Tip

The -w option is optional; if it is not specified, the current working directory is used to save output files and directories. Specifying it is recommended, to avoid overwriting existing files. See What is the working directory? for more information.
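One way to follow this recommendation is to check, before invoking Kronos, that the directory you plan to pass to -w is new or empty. The sketch below is a minimal illustration in Python; the helper `is_fresh_working_dir` is hypothetical and not part of Kronos.

```python
import tempfile
from pathlib import Path


def is_fresh_working_dir(path):
    """Return True if path does not exist yet or is an empty directory."""
    directory = Path(path)
    if not directory.exists():
        return True
    return directory.is_dir() and not any(directory.iterdir())


# Demonstration in a throwaway directory, so no real files are touched.
with tempfile.TemporaryDirectory() as tmp:
    candidate = Path(tmp) / "my_working_dir"
    fresh_before = is_fresh_working_dir(candidate)  # directory does not exist yet
    candidate.mkdir()
    (candidate / "old_result.txt").write_text("previous run\n")
    fresh_after = is_fresh_working_dir(candidate)  # directory now contains a file

print(fresh_before, fresh_after)
```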

make_component

This command creates a new component template. In other words, it automatically generates wrappers required for a seed to become a component.

Info

See Components for more information on seed and component.

The command is used as follows:

kronos -w </path/to/working_dir> make_component <name_for_component>

For example, the following command creates a component template called my_comp in a directory called my_components_dir:

kronos -w my_components_dir make_component my_comp

make_config

This command makes a new configuration file for the given list of component names.

The command is used as follows:

kronos -w </path/to/working_dir> make_config <list_of_components> -o <name_for_config_file>

For example, the following command creates a new configuration file called my_config_file.yaml for two components, comp1 and comp2, in a directory called my_working_dir:

kronos -w my_working_dir make_config comp1 comp2 -o my_config_file

Warning

The path of the components directory must be added to the PYTHONPATH environment variable before running the make_config command:

export PYTHONPATH=</path/to/components_dir>:$PYTHONPATH

Tip

Note that the suffix .yaml is automatically added to the end of the provided name for the configuration file.
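When make_config is driven from a Python script rather than an interactive shell, the PYTHONPATH requirement can be met by passing a modified environment to the child process. The following is a sketch under stated assumptions: the helper names are hypothetical, /path/to/components_dir is the same placeholder used above, and only the flags documented in this section are used. Kronos is not invoked unless `run_make_config` is called explicitly.

```python
import os
import subprocess
from pathlib import Path


def env_with_components(components_dir, base_env=None):
    """Return a copy of the environment with components_dir prepended to PYTHONPATH."""
    env = dict(os.environ if base_env is None else base_env)
    parts = [str(components_dir), env.get("PYTHONPATH", "")]
    env["PYTHONPATH"] = os.pathsep.join(part for part in parts if part)
    return env


def make_config_argv(working_dir, components, config_name):
    """Build: kronos -w <working_dir> make_config <components...> -o <config_name>."""
    return ["kronos", "-w", str(working_dir), "make_config", *components, "-o", config_name]


def run_make_config(working_dir, components, config_name, components_dir):
    """Invoke Kronos (requires the kronos command on PATH) and return the expected file path."""
    argv = make_config_argv(working_dir, components, config_name)
    subprocess.run(argv, env=env_with_components(components_dir), check=True)
    # Kronos appends the .yaml suffix to the provided name.
    return Path(working_dir) / f"{config_name}.yaml"


argv = make_config_argv("my_working_dir", ["comp1", "comp2"], "my_config_file")
print(" ".join(argv))
```

Passing `env=` to `subprocess.run` leaves the parent process environment untouched, which keeps the PYTHONPATH change local to the Kronos call.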

update_config

This command replaces the corresponding fields of an old configuration file with those of a new one. This is useful when a large configuration file needs to be updated.

The command is used as follows:

kronos -w </path/to/working_dir> update_config <old_config.yaml> <new_config.yaml> -o <output_filename>

For example, the following command creates a new configuration file called new_config_file.yaml by updating my_config_file1.yaml using my_config_file2.yaml in a directory called my_working_dir:

kronos -w my_working_dir update_config my_config_file1.yaml my_config_file2.yaml -o new_config_file

init

This command initializes a new pipeline (i.e. creates a Python script) based on the input configuration file.

Info

The resulting Python script is also called a pipeline script.

The command is used as follows:

kronos -w </path/to/working_dir> init -y </path/to/config_file.yaml> -e <name_for_pipeline>

For example, the following command creates a Python script called my_pipeline.py for the input configuration file my_config_file.yaml in a directory called my_working_dir:

kronos -w my_working_dir init -y my_config_file.yaml -e my_pipeline

The Python script produced by this command can be run with the Kronos run command or directly as a Python script.

Info

See How to initialize a pipeline? for more information.

Tip

Note that the suffix .py is automatically added to the end of the provided name for the pipeline.

Warning

The init command might create the following directories in addition to the pipeline Python script:

  • intermediate_config_files
  • intermediate_pipeline_scripts

These directories are used by Kronos and users should NOT modify them.

run

This command runs Kronos-made pipelines, i.e. pipeline scripts made by the init command.

The command is used as follows:

kronos run -k </path/to/my_pipeline_script.py> -c </path/to/components_dir> [options]

Warning

The path of the components directory must be added to the PYTHONPATH environment variable before running the run command:

export PYTHONPATH=</path/to/components_dir>:$PYTHONPATH

Info

You can use the run command to initialize and run the pipeline from the configuration file directly (i.e. without running init first). See Run the pipeline using run command for more information.
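For the two-step route (init, then run), the commands documented above can be chained from a small Python driver. This is only a sketch: the function names are hypothetical, the paths are the placeholders used in this section, and it assumes the pipeline script is written to the working directory as described for init. Nothing is executed unless `execute` is called on a machine where the kronos command is installed.

```python
import os
import subprocess
from pathlib import Path


def init_then_run_commands(working_dir, config_file, pipeline_name, components_dir):
    """Return two argument lists: kronos init, followed by kronos run."""
    # init writes <pipeline_name>.py into the working directory (Kronos adds the suffix).
    script = Path(working_dir) / f"{pipeline_name}.py"
    init_cmd = ["kronos", "-w", str(working_dir), "init",
                "-y", str(config_file), "-e", pipeline_name]
    run_cmd = ["kronos", "run", "-k", str(script), "-c", str(components_dir)]
    return init_cmd, run_cmd


def execute(commands, components_dir):
    """Run the commands in order with components_dir prepended to PYTHONPATH."""
    env = dict(os.environ)
    parts = [str(components_dir), env.get("PYTHONPATH", "")]
    env["PYTHONPATH"] = os.pathsep.join(part for part in parts if part)
    for cmd in commands:
        subprocess.run(cmd, env=env, check=True)  # stop at the first failing step


init_cmd, run_cmd = init_then_run_commands(
    "my_working_dir", "my_config_file.yaml", "my_pipeline", "/path/to/components_dir"
)
print(" ".join(init_cmd))
print(" ".join(run_cmd))
```

Using `check=True` means a failed init raises `subprocess.CalledProcessError`, so run is never attempted against a missing pipeline script.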