Training

After following the instructions on the Preprocessing page, you will have a SeqBank file and a SeqTree file that you can use to train Terrier.

To train Terrier, you will need to use the terrier-tools CLI utility.

To train with Terrier's default settings, run the following command:

SEQBANK=$REPBASE_DIR/Repbase-seqbank.sb
SEQTREE=$REPBASE_DIR/Repbase-seqtree.st
terrier-tools train \
    --seqtree $SEQTREE \
    --seqbank $SEQBANK
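
Before launching a long training run, it can help to confirm that both preprocessing outputs are actually where the variables point. The following is a minimal sketch of such a check; check_inputs is a hypothetical helper written for this example and is not part of terrier-tools.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of terrier-tools): report any missing input
# path and return non-zero if at least one is absent.
check_inputs() {
    local seqbank="$1" seqtree="$2"
    local missing=0
    for f in "$seqbank" "$seqtree"; do
        # -e accepts either a regular file or a directory at that path.
        if [ ! -e "$f" ]; then
            echo "Missing input: $f" >&2
            missing=1
        fi
    done
    return "$missing"
}
```

Assuming SEQBANK and SEQTREE are set as above, the check can guard the training command: check_inputs "$SEQBANK" "$SEQTREE" && terrier-tools train --seqtree "$SEQTREE" --seqbank "$SEQBANK"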

To use the pretrained Terrier model weights as a starting point, add the --pretrained option:

SEQBANK=$REPBASE_DIR/Repbase-seqbank.sb
SEQTREE=$REPBASE_DIR/Repbase-seqtree.st
terrier-tools train \
    --seqtree $SEQTREE \
    --seqbank $SEQBANK \
    --pretrained default

You can replace default with the path to a local checkpoint file or with a URL that points to one.
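
As an illustration of switching between these options, the sketch below assembles the training command in a bash array so that the --pretrained value can be changed or left out. build_train_cmd and the checkpoint path are assumptions made for this example; only the train subcommand and the --seqtree, --seqbank, and --pretrained options come from the commands above.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of terrier-tools): store the training command
# in the TRAIN_CMD array, adding --pretrained only when a value is given.
build_train_cmd() {
    local seqtree="$1" seqbank="$2" pretrained="${3:-}"
    TRAIN_CMD=(terrier-tools train --seqtree "$seqtree" --seqbank "$seqbank")
    if [ -n "$pretrained" ]; then
        TRAIN_CMD+=(--pretrained "$pretrained")
    fi
}

# Placeholder checkpoint path; replace it with your own file, a URL, or "default".
build_train_cmd "Repbase-seqtree.st" "Repbase-seqbank.sb" "/path/to/checkpoint.ckpt"

# Print the assembled command for inspection without running it.
printf '%s ' "${TRAIN_CMD[@]}"; echo
```

Once the printed command looks right, it can be launched with "${TRAIN_CMD[@]}". Keeping the arguments in an array avoids word-splitting problems if a path contains spaces.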

You can see other command-line options by running:

terrier-tools train --help

To reproduce the training process of the main release of Terrier, use the following command (with SEQBANK and SEQTREE set as above):

terrier-tools train \
    --seqtree $SEQTREE \
    --seqbank $SEQBANK \
    --max-learning-rate 0.001 \
    --macc 20000000000 \
    --cnn-layers 4 \
    --dropout 0.2479560973202271 \
    --embedding-dim 18 \
    --factor 1.959254226973812 \
    --kernel-size 7 \
    --penultimate-dims 1953 \
    --phi 1.0196823166741456 \
    --max-epochs 100 \
    --test-partition -2 \
    --validation-partition -1