Command Line Interface Reference
corgi
corgi [OPTIONS]
Options
- --gpu, --no-gpu
Whether or not to use a GPU for processing if available.
- Default:
True
- --pretrained <pretrained>
The location (URL or filepath) of a pretrained model.
- --reload, --no-reload
Whether to download the pretrained model again if it is online and already present locally.
- Default:
False
- --file <file>
A fasta file with sequences to be classified.
- --max-seqs <max_seqs>
- --batch-size <batch_size>
- Default:
1
- --max-length <max_length>
- Default:
5000
- --min-length <min_length>
- Default:
128
- --output-dir <output_dir>
A directory in which to write the results.
- --csv <csv>
A path to output the results as a CSV. If not given, a default name is chosen inside the output directory.
- --save-filtered, --no-save-filtered
Whether or not to save the filtered sequences.
- Default:
True
- --threshold <threshold>
The threshold to use for filtering. If not given, only the most likely category is used for filtering.
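As a rough illustration of how these options combine, the sketch below assembles a `corgi` command line in Python. The helper `corgi_command` and the file and directory names are hypothetical placeholders rather than part of corgi; the helper only builds the argument list.

```python
import shlex


def corgi_command(fasta, output_dir, threshold=None, use_gpu=True):
    """Build an argument list for the `corgi` command (illustrative helper)."""
    args = ["corgi", "--file", str(fasta), "--output-dir", str(output_dir)]
    # Boolean options are documented as --gpu / --no-gpu pairs.
    args.append("--gpu" if use_gpu else "--no-gpu")
    # Leave --threshold off to keep the documented default behaviour.
    if threshold is not None:
        args += ["--threshold", str(threshold)]
    return args


# Hypothetical input file and output directory.
cmd = corgi_command("sequences.fasta", "corgi-output", threshold=0.5, use_gpu=False)
print(shlex.join(cmd))
```

Once corgi is installed, the resulting list can be passed to `subprocess.run(cmd, check=True)`.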
corgi-train
corgi-train [OPTIONS] COMMAND [ARGS]...
Options
- -v, --version
Prints the current version.
- --install-completion
Install completion for the current shell.
- --show-completion
Show completion for the current shell, to copy it or customize the installation.
bibliography
corgi-train bibliography [OPTIONS]
bibtex
corgi-train bibtex [OPTIONS]
export
corgi-train export [OPTIONS] MODEL_PATH
Options
- --fp16, --no-fp16
Whether or not the floating-point precision of learner should be set to 16 bit.
- Default:
True
- --output-dir <output_dir>
The location of the output directory.
- Default:
./outputs
- --weight-decay <weight_decay>
The amount of weight decay. If None then it uses the default amount of weight decay in fastai.
- --csv <csv>
The CSV which has the sequences to use.
- --base-dir <base_dir>
The base directory with the RefSeq HDF5 files.
- --batch-size <batch_size>
The batch size.
- Default:
32
- --dataloader-type <dataloader_type>
- Default:
DataloaderType.PLAIN
- Options:
PLAIN | WEIGHTED | STRATIFIED
- --validation-seq-length <validation_seq_length>
- Default:
1000
- --deform-lambda <deform_lambda>
The lambda for the deform transform.
- --embedding-dim <embedding_dim>
The size of the embeddings for the nucleotides (N, A, G, C, T).
- Default:
8
- --filters <filters>
The number of filters in each of the 1D convolution layers. These are concatenated together.
- Default:
256
- --cnn-layers <cnn_layers>
The number of 1D convolution layers.
- Default:
6
- --kernel-size-maxpool <kernel_size_maxpool>
The size of the pooling before going to the LSTM.
- Default:
2
- --lstm-dims <lstm_dims>
The size of the hidden layers in the LSTM in both directions.
- Default:
256
- --final-layer-dims <final_layer_dims>
The size of a dense layer after the LSTM. If this is zero then this layer isn’t used.
- Default:
0
- --dropout <dropout>
The amount of dropout to use (not currently enabled).
- Default:
0.2
- --final-bias, --no-final-bias
Whether or not to use bias in the final layer.
- Default:
True
- --cnn-only, --no-cnn-only
- Default:
True
- --kernel-size <kernel_size>
The size of the kernels for CNN only classifier.
- Default:
3
- --cnn-dims-start <cnn_dims_start>
The number of filters in the first CNN layer. If not set, it is derived from the MACC.
- --factor <factor>
The factor to multiply the number of filters in the CNN layers each time it is downscaled.
- Default:
2.0
- --penultimate-dims <penultimate_dims>
The number of dimensions in the penultimate layer.
- Default:
1024
- --include-length, --no-include-length
- Default:
False
- --transformer-heads <transformer_heads>
The number of heads in the transformer.
- Default:
8
- --transformer-layers <transformer_layers>
The number of layers in the transformer. If zero then no transformer is used.
- Default:
0
- --macc <macc>
The approximate number of multiply-accumulate (MACC) operations in the model. Used to set cnn_dims_start if it is not provided explicitly.
- Default:
10000000
- --project-name <project_name>
The name for this project for logging purposes.
- --run-name <run_name>
The name for this particular run for logging purposes.
- --run-id <run_id>
A unique ID for this particular run for logging purposes.
- --notes <notes>
A longer description of the run for logging purposes.
- --tag <tag>
A tag for logging purposes. Multiple tags can be added, each introduced with --tag.
- --wandb, --no-wandb
Whether or not to use ‘Weights and Biases’ for logging.
- Default:
False
- --wandb-mode <wandb_mode>
The mode for ‘Weights and Biases’.
- Default:
online
- --wandb-dir <wandb_dir>
The location for ‘Weights and Biases’ output.
- --wandb-entity <wandb_entity>
An entity is a username or team name where you’re sending runs.
- --wandb-group <wandb_group>
Specify a group to organize individual runs into a larger experiment.
- --wandb-job-type <wandb_job_type>
Specify the type of run, which is useful when you’re grouping runs together into larger experiments using group.
- --mlflow, --no-mlflow
Whether or not to use MLflow for logging.
- Default:
False
Arguments
- MODEL_PATH
Required argument (a filesystem path).
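The usage line above places MODEL_PATH after the options. A minimal sketch of assembling such a call follows; `export_command` is a hypothetical helper and all paths are placeholders.

```python
import shlex


def export_command(model_path, csv=None, base_dir=None, fp16=True):
    """Build `corgi-train export [OPTIONS] MODEL_PATH` (illustrative helper)."""
    args = ["corgi-train", "export"]
    args.append("--fp16" if fp16 else "--no-fp16")
    if csv is not None:
        args += ["--csv", str(csv)]
    if base_dir is not None:
        args += ["--base-dir", str(base_dir)]
    # MODEL_PATH is the single required positional argument; it goes last here.
    args.append(str(model_path))
    return args


# Hypothetical paths, for illustration only.
cmd = export_command("path/to/model", csv="sequences.csv", base_dir="refseq-hdf5")
print(shlex.join(cmd))
```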
infer
corgi-train infer [OPTIONS]
Options
- --gpu, --no-gpu
Whether or not to use a GPU for processing if available.
- Default:
True
- --pretrained <pretrained>
The location (URL or filepath) of a pretrained model.
- --reload, --no-reload
Whether to download the pretrained model again if it is online and already present locally.
- Default:
False
- --file <file>
A fasta file with sequences to be classified.
- --max-seqs <max_seqs>
- --batch-size <batch_size>
- Default:
1
- --max-length <max_length>
- Default:
5000
- --min-length <min_length>
- Default:
128
- --output-dir <output_dir>
A directory in which to write the results.
- --csv <csv>
A path to output the results as a CSV. If not given, a default name is chosen inside the output directory.
- --save-filtered, --no-save-filtered
Whether or not to save the filtered sequences.
- Default:
True
- --threshold <threshold>
The threshold to use for filtering. If not given, only the most likely category is used for filtering.
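Option names in this reference follow a regular pattern: hyphenated flags, underscore placeholders, and `--name` / `--no-name` pairs for booleans. Assuming that pattern holds, a settings dictionary can be turned into arguments mechanically. The helper `to_cli_args` below is a hypothetical sketch, not something corgi provides.

```python
def to_cli_args(options):
    """Convert a dict of option values into command-line arguments.

    Keys use underscores (as in the <placeholder> names) and are rendered
    with hyphens, matching the option names in the reference.
    """
    args = []
    for name, value in options.items():
        flag = name.replace("_", "-")
        if value is None:
            continue  # omit the option so the tool's own default applies
        if isinstance(value, bool):
            # Booleans become --flag or --no-flag and take no value.
            args.append(f"--{flag}" if value else f"--no-{flag}")
        else:
            args += [f"--{flag}", str(value)]
    return args


# Hypothetical values; the option names come from the list above.
options = {
    "file": "sequences.fasta",
    "max_length": 5000,
    "min_length": 128,
    "save_filtered": False,
    "threshold": None,
}
cmd = ["corgi-train", "infer", *to_cli_args(options)]
print(" ".join(cmd))
```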
lr-finder
corgi-train lr-finder [OPTIONS]
Options
- --plot-filename <plot_filename>
- --start-lr <start_lr>
- Default:
1e-07
- --end-lr <end_lr>
- Default:
10
- --iterations <iterations>
- Default:
100
- --fp16, --no-fp16
Whether or not the floating-point precision of learner should be set to 16 bit.
- Default:
True
- --output-dir <output_dir>
The location of the output directory.
- Default:
./outputs
- --weight-decay <weight_decay>
The amount of weight decay. If None then it uses the default amount of weight decay in fastai.
- --csv <csv>
The CSV which has the sequences to use.
- --base-dir <base_dir>
The base directory with the RefSeq HDF5 files.
- --batch-size <batch_size>
The batch size.
- Default:
32
- --dataloader-type <dataloader_type>
- Default:
DataloaderType.PLAIN
- Options:
PLAIN | WEIGHTED | STRATIFIED
- --validation-seq-length <validation_seq_length>
- Default:
1000
- --deform-lambda <deform_lambda>
The lambda for the deform transform.
- --embedding-dim <embedding_dim>
The size of the embeddings for the nucleotides (N, A, G, C, T).
- Default:
8
- --filters <filters>
The number of filters in each of the 1D convolution layers. These are concatenated together.
- Default:
256
- --cnn-layers <cnn_layers>
The number of 1D convolution layers.
- Default:
6
- --kernel-size-maxpool <kernel_size_maxpool>
The size of the pooling before going to the LSTM.
- Default:
2
- --lstm-dims <lstm_dims>
The size of the hidden layers in the LSTM in both directions.
- Default:
256
- --final-layer-dims <final_layer_dims>
The size of a dense layer after the LSTM. If this is zero then this layer isn’t used.
- Default:
0
- --dropout <dropout>
The amount of dropout to use (not currently enabled).
- Default:
0.2
- --final-bias, --no-final-bias
Whether or not to use bias in the final layer.
- Default:
True
- --cnn-only, --no-cnn-only
- Default:
True
- --kernel-size <kernel_size>
The size of the kernels for CNN only classifier.
- Default:
3
- --cnn-dims-start <cnn_dims_start>
The number of filters in the first CNN layer. If not set, it is derived from the MACC.
- --factor <factor>
The factor to multiply the number of filters in the CNN layers each time it is downscaled.
- Default:
2.0
- --penultimate-dims <penultimate_dims>
The number of dimensions in the penultimate layer.
- Default:
1024
- --include-length, --no-include-length
- Default:
False
- --transformer-heads <transformer_heads>
The number of heads in the transformer.
- Default:
8
- --transformer-layers <transformer_layers>
The number of layers in the transformer. If zero then no transformer is used.
- Default:
0
- --macc <macc>
The approximate number of multiply-accumulate (MACC) operations in the model. Used to set cnn_dims_start if it is not provided explicitly.
- Default:
10000000
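The reference does not say how the learning rates between --start-lr and --end-lr are spaced. Learning-rate range tests commonly sweep them geometrically, so the sketch below assumes that spacing to show what the default values would cover. `lr_sweep` is a hypothetical helper, not corgi's own code.

```python
import math


def lr_sweep(start_lr=1e-07, end_lr=10.0, iterations=100):
    """Geometrically spaced learning rates from start_lr to end_lr (assumed spacing)."""
    if iterations < 2:
        return [start_lr]
    ratio = end_lr / start_lr
    return [start_lr * ratio ** (i / (iterations - 1)) for i in range(iterations)]


lrs = lr_sweep()
print(len(lrs), lrs[0], lrs[-1])
# Under this assumption each step multiplies the learning rate by a constant factor.
print(lrs[1] / lrs[0])
```

With the defaults the sweep spans eight orders of magnitude, so under this assumption each iteration raises the learning rate by a factor of roughly 1.2.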
show-batch
corgi-train show-batch [OPTIONS]
Options
- --output-path <output_path>
A location to save the HTML which summarizes the batch.
- --csv <csv>
The CSV which has the sequences to use.
- --base-dir <base_dir>
The base directory with the RefSeq HDF5 files.
- --batch-size <batch_size>
The batch size.
- Default:
32
- --dataloader-type <dataloader_type>
- Default:
DataloaderType.PLAIN
- Options:
PLAIN | WEIGHTED | STRATIFIED
- --validation-seq-length <validation_seq_length>
- Default:
1000
- --deform-lambda <deform_lambda>
The lambda for the deform transform.
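A short sketch of building a `show-batch` call, checking --dataloader-type against the choices listed above before the command is assembled. The helper name and the paths are hypothetical.

```python
DATALOADER_TYPES = ("PLAIN", "WEIGHTED", "STRATIFIED")  # choices listed in the reference


def show_batch_command(csv, base_dir, output_path, dataloader_type="PLAIN", batch_size=32):
    """Build a `corgi-train show-batch` argument list (illustrative helper)."""
    if dataloader_type not in DATALOADER_TYPES:
        raise ValueError(f"dataloader_type must be one of {DATALOADER_TYPES}")
    return [
        "corgi-train", "show-batch",
        "--csv", str(csv),
        "--base-dir", str(base_dir),
        "--batch-size", str(batch_size),
        "--dataloader-type", dataloader_type,
        "--output-path", str(output_path),
    ]


# Hypothetical paths, for illustration only.
cmd = show_batch_command("sequences.csv", "refseq-hdf5", "batch.html", dataloader_type="WEIGHTED")
print(" ".join(cmd))
```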
train
corgi-train train [OPTIONS]
Options
- --distributed, --no-distributed
Whether the learner is distributed.
- Default:
False
- --fp16, --no-fp16
Whether or not the floating-point precision of learner should be set to 16 bit.
- Default:
True
- --output-dir <output_dir>
The location of the output directory.
- Default:
./outputs
- --weight-decay <weight_decay>
The amount of weight decay. If None then it uses the default amount of weight decay in fastai.
- --csv <csv>
The CSV which has the sequences to use.
- --base-dir <base_dir>
The base directory with the RefSeq HDF5 files.
- --batch-size <batch_size>
The batch size.
- Default:
32
- --dataloader-type <dataloader_type>
- Default:
DataloaderType.PLAIN
- Options:
PLAIN | WEIGHTED | STRATIFIED
- --validation-seq-length <validation_seq_length>
- Default:
1000
- --deform-lambda <deform_lambda>
The lambda for the deform transform.
- --embedding-dim <embedding_dim>
The size of the embeddings for the nucleotides (N, A, G, C, T).
- Default:
8
- --filters <filters>
The number of filters in each of the 1D convolution layers. These are concatenated together.
- Default:
256
- --cnn-layers <cnn_layers>
The number of 1D convolution layers.
- Default:
6
- --kernel-size-maxpool <kernel_size_maxpool>
The size of the pooling before going to the LSTM.
- Default:
2
- --lstm-dims <lstm_dims>
The size of the hidden layers in the LSTM in both directions.
- Default:
256
- --final-layer-dims <final_layer_dims>
The size of a dense layer after the LSTM. If this is zero then this layer isn’t used.
- Default:
0
- --dropout <dropout>
The amount of dropout to use (not currently enabled).
- Default:
0.2
- --final-bias, --no-final-bias
Whether or not to use bias in the final layer.
- Default:
True
- --cnn-only, --no-cnn-only
- Default:
True
- --kernel-size <kernel_size>
The size of the kernels for CNN only classifier.
- Default:
3
- --cnn-dims-start <cnn_dims_start>
The number of filters in the first CNN layer. If not set, it is derived from the MACC.
- --factor <factor>
The factor to multiply the number of filters in the CNN layers each time it is downscaled.
- Default:
2.0
- --penultimate-dims <penultimate_dims>
The number of dimensions in the penultimate layer.
- Default:
1024
- --include-length, --no-include-length
- Default:
False
- --transformer-heads <transformer_heads>
The number of heads in the transformer.
- Default:
8
- --transformer-layers <transformer_layers>
The number of layers in the transformer. If zero then no transformer is used.
- Default:
0
- --macc <macc>
The approximate number of multiply-accumulate (MACC) operations in the model. Used to set cnn_dims_start if it is not provided explicitly.
- Default:
10000000
- --epochs <epochs>
The number of epochs.
- Default:
20
- --freeze-epochs <freeze_epochs>
The number of epochs to train when the learner is frozen and the last layer is trained by itself. Only applies if fine_tune is set on the app.
- Default:
3
- --learning-rate <learning_rate>
The base learning rate (when fine tuning) or the max learning rate otherwise.
- Default:
0.0001
- --project-name <project_name>
The name for this project for logging purposes.
- --run-name <run_name>
The name for this particular run for logging purposes.
- --run-id <run_id>
A unique ID for this particular run for logging purposes.
- --notes <notes>
A longer description of the run for logging purposes.
- --tag <tag>
A tag for logging purposes. Multiple tags can be added, each introduced with --tag.
- --wandb, --no-wandb
Whether or not to use ‘Weights and Biases’ for logging.
- Default:
False
- --wandb-mode <wandb_mode>
The mode for ‘Weights and Biases’.
- Default:
online
- --wandb-dir <wandb_dir>
The location for ‘Weights and Biases’ output.
- --wandb-entity <wandb_entity>
An entity is a username or team name where you’re sending runs.
- --wandb-group <wandb_group>
Specify a group to organize individual runs into a larger experiment.
- --wandb-job-type <wandb_job_type>
Specify the type of run, which is useful when you’re grouping runs together into larger experiments using group.
- --mlflow, --no-mlflow
Whether or not to use MLflow for logging.
- Default:
False
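Because --tag may be given more than once, a training call with several tags repeats the flag for each one. The sketch below shows one way to assemble such a command. `train_command`, the tag names, and the paths are hypothetical.

```python
def train_command(csv, base_dir, epochs=20, learning_rate=0.0001, tags=(), wandb=False):
    """Build a `corgi-train train` argument list (illustrative helper)."""
    args = [
        "corgi-train", "train",
        "--csv", str(csv),
        "--base-dir", str(base_dir),
        "--epochs", str(epochs),
        "--learning-rate", str(learning_rate),
    ]
    args.append("--wandb" if wandb else "--no-wandb")
    # Each tag is introduced by its own --tag flag.
    for tag in tags:
        args += ["--tag", tag]
    return args


# Hypothetical paths and tags, for illustration only.
cmd = train_command("sequences.csv", "refseq-hdf5", epochs=5,
                    tags=["baseline", "cnn-only"], wandb=True)
print(" ".join(cmd))
```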
tune
corgi-train tune [OPTIONS]
Options
- --runs <runs>
The number of runs to attempt to train the model.
- Default:
1
- --engine <engine>
The optimizer to use to perform the hyperparameter tuning. Options: wandb, optuna, skopt.
- Default:
skopt
- --id <id>
The ID of this hyperparameter tuning job. If using wandb, then this is the sweep id. If using optuna, then this is the storage. If using skopt, then this is the file to store the results.
- Default:
(empty)
- --name <name>
An informative name for this hyperparameter tuning job. If empty, then it creates a name from the project name.
- Default:
(empty)
- --method <method>
The sampling method to use to perform the hyperparameter tuning. By default it chooses the default method of the engine.
- Default:
(empty)
- --min-iter <min_iter>
The minimum number of iterations if using early termination. If left empty, then early termination is not used.
- --seed <seed>
A seed for the random number generator.
- --distributed, --no-distributed
Whether the learner is distributed.
- Default:
False
- --fp16, --no-fp16
Whether or not the floating-point precision of learner should be set to 16 bit.
- Default:
True
- --output-dir <output_dir>
The location of the output directory.
- Default:
./outputs
- --weight-decay <weight_decay>
The amount of weight decay. If None then it uses the default amount of weight decay in fastai.
- --csv <csv>
The CSV which has the sequences to use.
- --base-dir <base_dir>
The base directory with the RefSeq HDF5 files.
- --batch-size <batch_size>
The batch size.
- Default:
32
- --dataloader-type <dataloader_type>
- Default:
DataloaderType.PLAIN
- Options:
PLAIN | WEIGHTED | STRATIFIED
- --validation-seq-length <validation_seq_length>
- Default:
1000
- --deform-lambda <deform_lambda>
The lambda for the deform transform.
- --embedding-dim <embedding_dim>
The size of the embeddings for the nucleotides (N, A, G, C, T).
- --filters <filters>
The number of filters in each of the 1D convolution layers. These are concatenated together.
- Default:
256
- --cnn-layers <cnn_layers>
The number of 1D convolution layers.
- --kernel-size-maxpool <kernel_size_maxpool>
The size of the pooling before going to the LSTM.
- Default:
2
- --lstm-dims <lstm_dims>
The size of the hidden layers in the LSTM in both directions.
- Default:
256
- --final-layer-dims <final_layer_dims>
The size of a dense layer after the LSTM. If this is zero then this layer isn’t used.
- Default:
0
- --dropout <dropout>
The amount of dropout to use (not currently enabled).
- --final-bias, --no-final-bias
Whether or not to use bias in the final layer.
- --cnn-only, --no-cnn-only
- Default:
True
- --kernel-size <kernel_size>
The size of the kernels for CNN only classifier.
- --cnn-dims-start <cnn_dims_start>
The number of filters in the first CNN layer. If not set, it is derived from the MACC.
- --factor <factor>
The factor to multiply the number of filters in the CNN layers each time it is downscaled.
- --penultimate-dims <penultimate_dims>
The number of dimensions in the penultimate layer.
- --include-length, --no-include-length
- Default:
False
- --transformer-heads <transformer_heads>
The number of heads in the transformer.
- Default:
8
- --transformer-layers <transformer_layers>
The number of layers in the transformer. If zero then no transformer is used.
- Default:
0
- --macc <macc>
The approximate number of multiply-accumulate (MACC) operations in the model. Used to set cnn_dims_start if it is not provided explicitly.
- Default:
10000000
- --epochs <epochs>
The number of epochs.
- Default:
20
- --freeze-epochs <freeze_epochs>
The number of epochs to train when the learner is frozen and the last layer is trained by itself. Only applies if fine_tune is set on the app.
- Default:
3
- --learning-rate <learning_rate>
The base learning rate (when fine tuning) or the max learning rate otherwise.
- Default:
0.0001
- --project-name <project_name>
The name for this project for logging purposes.
- --run-name <run_name>
The name for this particular run for logging purposes.
- --run-id <run_id>
A unique ID for this particular run for logging purposes.
- --notes <notes>
A longer description of the run for logging purposes.
- --tag <tag>
A tag for logging purposes. Multiple tags can be added, each introduced with --tag.
- --wandb, --no-wandb
Whether or not to use ‘Weights and Biases’ for logging.
- Default:
False
- --wandb-mode <wandb_mode>
The mode for ‘Weights and Biases’.
- Default:
online
- --wandb-dir <wandb_dir>
The location for ‘Weights and Biases’ output.
- --wandb-entity <wandb_entity>
An entity is a username or team name where you’re sending runs.
- --wandb-group <wandb_group>
Specify a group to organize individual runs into a larger experiment.
- --wandb-job-type <wandb_job_type>
Specify the type of run, which is useful when you’re grouping runs together into larger experiments using group.
- --mlflow, --no-mlflow
Whether or not to use MLflow for logging.
- Default:
False
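The meaning of --id depends on the chosen --engine, as described above. The sketch below builds one `tune` command per engine to make that concrete. The helper `tune_command` and every identifier (file name, storage URL, sweep id) are made-up placeholders.

```python
ENGINES = ("wandb", "optuna", "skopt")  # engines listed in the reference


def tune_command(engine="skopt", job_id="", runs=1, seed=None):
    """Build a `corgi-train tune` argument list (illustrative helper)."""
    if engine not in ENGINES:
        raise ValueError(f"engine must be one of {ENGINES}")
    args = ["corgi-train", "tune", "--engine", engine, "--runs", str(runs)]
    if job_id:
        args += ["--id", job_id]
    if seed is not None:
        args += ["--seed", str(seed)]
    return args


# Hypothetical identifiers, one per engine.
examples = {
    "skopt": "tuning-results.pkl",    # a file to store the results
    "optuna": "sqlite:///tuning.db",  # a storage location for Optuna
    "wandb": "abc123",                # a Weights and Biases sweep id
}
for engine, job_id in examples.items():
    print(" ".join(tune_command(engine, job_id, runs=10, seed=42)))
```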
validate
corgi-train validate [OPTIONS]
Options
- --gpu, --no-gpu
Whether or not to use a GPU for processing if available.
- Default:
True
- --pretrained <pretrained>
The location (URL or filepath) of a pretrained model.
- --reload, --no-reload
Whether to download the pretrained model again if it is online and already present locally.
- Default:
False
- --csv <csv>
The CSV which has the sequences to use.
- --base-dir <base_dir>
The base directory with the RefSeq HDF5 files.
- --batch-size <batch_size>
The batch size.
- Default:
32
- --dataloader-type <dataloader_type>
- Default:
DataloaderType.PLAIN
- Options:
PLAIN | WEIGHTED | STRATIFIED
- --validation-seq-length <validation_seq_length>
- Default:
1000
- --deform-lambda <deform_lambda>
The lambda for the deform transform.
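To run a command such as `validate` from a script, one option is to check that the executable is on the PATH before invoking it. The following is a minimal sketch under that assumption; `run_if_installed` is a hypothetical helper, and the model location and data paths are placeholders.

```python
import shutil
import subprocess


def run_if_installed(cmd):
    """Run cmd and return its exit status, or None if the executable is not on PATH."""
    if shutil.which(cmd[0]) is None:
        return None
    return subprocess.run(cmd).returncode


# Hypothetical model location and data paths.
cmd = [
    "corgi-train", "validate",
    "--pretrained", "path/to/model",
    "--csv", "sequences.csv",
    "--base-dir", "refseq-hdf5",
    "--no-gpu",
]
status = run_if_installed(cmd)
print("corgi-train not found" if status is None else f"exit status {status}")
```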