Command Line Interface Reference
hespi
HErbarium Specimen sheet PIpeline
Takes a herbarium specimen sheet image and detects components such as the institutional label, swatch, etc. It then classifies whether the institutional label is printed, typed, handwritten or a combination. It then detects the fields of the institutional label and attempts to read them through OCR and HTR models.
hespi [OPTIONS] IMAGES...
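As a minimal sketch of the usage line above, the following Python snippet assembles a basic command line. The helper name `build_hespi_command` and the image filenames are hypothetical and only serve as illustration; the snippet prints the command rather than running it.

```python
import shlex


def build_hespi_command(images, output_dir="hespi-output"):
    """Return the argv list for a basic hespi run (hypothetical helper)."""
    if not images:
        raise ValueError("hespi requires at least one image")
    # Options come first, followed by one or more IMAGES arguments.
    return ["hespi", "--output-dir", output_dir, *images]


# The filenames below are placeholders, not files shipped with hespi.
command = build_hespi_command(["specimen1.jpg", "specimen2.jpg"])
print(shlex.join(command))
# prints: hespi --output-dir hespi-output specimen1.jpg specimen2.jpg
```

The resulting list can be passed to `subprocess.run` once hespi is installed and the images exist.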
Options
- --output-dir <output_dir>
A directory to output the results.
- Default:
hespi-output
- --gpu, --no-gpu
Whether or not to use a GPU if available.
- Default:
True
- --fuzzy, --no-fuzzy
Whether or not to use fuzzy matching from the reference database.
- Default:
True
- --fuzzy-cutoff <fuzzy_cutoff>
The threshold for the fuzzy matching score to use.
- Default:
0.8
- --htr, --no-htr
Whether or not to do handwritten text recognition using Microsoft’s TrOCR.
- Default:
True
- --llm <llm>
The Large Language Model to use. Currently OpenAI and Anthropic Claude models are supported.
- Default:
gpt-4o
- --llm-api-key <llm_api_key>
The API key to use for the Large Language Model. Can be set as an environment variable using the standard variable names.
- Default:
- --llm-temperature <llm_temperature>
The temperature to use for the Large Language Model.
- Default:
0.0
- --trocr-size <trocr_size>
The size of the TrOCR model to use for handwritten text recognition.
- Default:
large
- Options:
small | base | large
- --sheet-component-weights <sheet_component_weights>
The path to the Sheet-Component model weights.
- --label-field-weights <label_field_weights>
The path to the Label-Field model weights.
- --institutional-label-classifier-weights <institutional_label_classifier_weights>
The path to the institutional label classifier weights.
- --force-download, --no-force-download
Whether or not to force download model weights even if a weights file is present.
- Default:
False
- --tmp-dir <tmp_dir>
- --batch-size <batch_size>
The maximum batch size to use when running the Sheet-Component model.
- Default:
4
- --sheet-component-res <sheet_component_res>
The resolution of images to use for the Sheet-Component model.
- Default:
1280
- --label-field-res <label_field_res>
The resolution of images to use for the Label-Field model.
- Default:
1280
- --install-completion <install_completion>
Install completion for the specified shell.
- Options:
bash | zsh | fish | powershell | pwsh
- --show-completion <show_completion>
Show completion for the specified shell, to copy it or customize the installation.
- Options:
bash | zsh | fish | powershell | pwsh
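The options listed above could be combined along the following lines. This is a sketch only: the function `hespi_options` is a hypothetical helper, it covers just a subset of the options, and its defaults simply mirror the defaults documented above. Boolean options are emitted in their paired `--flag` / `--no-flag` form.

```python
TROCR_SIZES = ("small", "base", "large")


def hespi_options(gpu=True, fuzzy=True, fuzzy_cutoff=0.8, htr=True,
                  trocr_size="large", batch_size=4):
    """Translate keyword settings into hespi option strings (hypothetical helper)."""
    if trocr_size not in TROCR_SIZES:
        raise ValueError(f"trocr_size must be one of {TROCR_SIZES}")
    return [
        # Each boolean option has a positive and a negative spelling.
        "--gpu" if gpu else "--no-gpu",
        "--fuzzy" if fuzzy else "--no-fuzzy",
        "--fuzzy-cutoff", str(fuzzy_cutoff),
        "--htr" if htr else "--no-htr",
        "--trocr-size", trocr_size,
        "--batch-size", str(batch_size),
    ]


# Example: a CPU-only run with a smaller TrOCR model and a stricter cutoff.
options = hespi_options(gpu=False, trocr_size="base", fuzzy_cutoff=0.9)
print(options)
```

Validating `trocr_size` before invoking the tool surfaces a mistyped value early instead of after the models have started loading.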
Arguments
- IMAGES
Required argument(s). A list of images to process. The images can also be URLs. Type: STRING.
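Because IMAGES may mix local files and URLs, it can help to check the local paths before starting a long run. The sketch below assumes that URLs use the `http` or `https` scheme; the helpers `is_url` and `missing_local_images` are hypothetical and not part of hespi.

```python
from pathlib import Path
from urllib.parse import urlparse


def is_url(image):
    """Treat http(s) strings as URLs; everything else is assumed to be a local path."""
    return urlparse(image).scheme in ("http", "https")


def missing_local_images(images):
    """Return the local image paths that do not exist; URLs are not checked."""
    return [image for image in images
            if not is_url(image) and not Path(image).is_file()]


# Placeholder inputs for illustration only.
images = ["https://example.org/sheet.jpg", "no-such-specimen-sheet.jpg"]
print(missing_local_images(images))
```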
Environment variables
- HESPI_SHEET_COMPONENT_WEIGHTS
Provide a default for
--sheet-component-weights
- HESPI_LABEL_FIELD_WEIGHTS
Provide a default for
--label-field-weights
- HESPI_INSTITUTIONAL_LABEL_CLASSIFIER_WEIGHTS
Provide a default for
--institutional-label-classifier-weights
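One way to use these variables from a script is to build an environment mapping and pass it to a subprocess. In this sketch the function `hespi_environment` and the weights path are hypothetical; only the three variable names come from the list above.

```python
import os


def hespi_environment(sheet_component=None, label_field=None,
                      classifier=None, base_env=None):
    """Return a copy of the environment with any given weights paths set."""
    env = dict(os.environ if base_env is None else base_env)
    overrides = {
        "HESPI_SHEET_COMPONENT_WEIGHTS": sheet_component,
        "HESPI_LABEL_FIELD_WEIGHTS": label_field,
        "HESPI_INSTITUTIONAL_LABEL_CLASSIFIER_WEIGHTS": classifier,
    }
    for name, value in overrides.items():
        # Leave a variable untouched unless a path was supplied for it.
        if value is not None:
            env[name] = str(value)
    return env


# The path below is a placeholder for custom weights.
env = hespi_environment(sheet_component="/models/sheet-component.pt", base_env={})
print(env)
```

The mapping can be passed as `env=` to `subprocess.run` so that the defaults apply to that run only, without changing the calling shell.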
hespi-tools
hespi-tools [OPTIONS] COMMAND [ARGS]...
Options
- --install-completion <install_completion>
Install completion for the specified shell.
- Options:
bash | zsh | fish | powershell | pwsh
- --show-completion <show_completion>
Show completion for the specified shell, to copy it or customize the installation.
- Options:
bash | zsh | fish | powershell | pwsh
bibtex
Shows the references in BibTeX format.
hespi-tools bibtex [OPTIONS]
institutional-label-classifier-location
Shows the location of the default Institutional Label Classifier model weights.
hespi-tools institutional-label-classifier-location [OPTIONS]
label-field-location
Shows the location of the default Label-Field model weights.
hespi-tools label-field-location [OPTIONS]
sheet-component-location
Shows the location of the default Sheet-Component model weights.
hespi-tools sheet-component-location [OPTIONS]
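The three location commands above can be called from a script to find out which default weights are in use. The following is a sketch under the assumption that each command prints the location to standard output; the dictionary keys and the function `default_weights_location` are hypothetical names.

```python
import shutil
import subprocess

# Maps a short model name (hypothetical) to the documented subcommand.
LOCATION_COMMANDS = {
    "sheet-component": "sheet-component-location",
    "label-field": "label-field-location",
    "institutional-label-classifier": "institutional-label-classifier-location",
}


def default_weights_location(model):
    """Return the text printed by the location subcommand, or None if
    hespi-tools is not installed."""
    subcommand = LOCATION_COMMANDS[model]  # raises KeyError for unknown models
    if shutil.which("hespi-tools") is None:
        return None
    result = subprocess.run(["hespi-tools", subcommand],
                            capture_output=True, text=True, check=True)
    return result.stdout.strip()
```

For example, `default_weights_location("label-field")` would run `hespi-tools label-field-location` when the tool is on the PATH.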
trocr
Run the TrOCR model on an image and print the recognized text.
hespi-tools trocr [OPTIONS] IMAGE
Options
- --size <size>
The size of the TrOCR model to use for handwritten text recognition.
- Default:
large
- Options:
small | base | large
Arguments
- IMAGE
Required argument. The path to the image to process.
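A command line for the `trocr` subcommand could be assembled as follows. This is a sketch: the helper `build_trocr_command` and the image filename are hypothetical, and the snippet prints the command instead of running the model.

```python
import shlex

TROCR_SIZES = ("small", "base", "large")


def build_trocr_command(image, size="large"):
    """Return the argv list for `hespi-tools trocr` (hypothetical helper)."""
    if size not in TROCR_SIZES:
        raise ValueError(f"size must be one of {TROCR_SIZES}")
    # Options precede the single IMAGE argument.
    return ["hespi-tools", "trocr", "--size", size, str(image)]


# The filename is a placeholder for a cropped handwritten field.
print(shlex.join(build_trocr_command("collector-field.jpg", size="small")))
# prints: hespi-tools trocr --size small collector-field.jpg
```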