Add artifact

Info: For developers

1. Create an artifact

Suppose you have a lang/ directory containing several files for the TIMIT dataset, such as a lexicon. To add these files as a single artifact, first compress them:

    tar -czf TIMIT-lang.tar.gz lang/

Follow the naming scheme DATASET-category, where category is also the name of the compressed directory (usually "lang" or "metadata").
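Before uploading, it can help to list the archive and confirm that its top-level directory matches the category. The following is a minimal sketch: the scratch directory and the placeholder lexicon file (name and content) are assumptions made only so the example runs on its own.

```shell
set -eu

# Work in a scratch directory so the example does not touch real data.
workdir=$(mktemp -d)
cd "$workdir"

# Hypothetical stand-in for the real TIMIT language files.
mkdir lang
printf 'sil sil\n' > lang/lexicon.txt

# Compress the directory following the DATASET-category naming.
tar -czf TIMIT-lang.tar.gz lang/

# List the archive; every entry should start with "lang/".
tar -tzf TIMIT-lang.tar.gz
```

If an entry does not start with the category name, the archive was probably created from the wrong directory level.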

2. Upload it to GitLab

  • Install the glab tool.
  • Run glab auth login (needed only once).
  • Upload the archive to the latest release: glab release upload --repo PTAL/Datasets/SpeechDatasets.jl <latest tag> <artifact archive>, for example: glab release upload --repo PTAL/Datasets/SpeechDatasets.jl v0.21.1 TIMIT-lang.tar.gz
  • Delete your local archive: rm TIMIT-lang.tar.gz

3. Add it to Artifacts.toml

  • Start Julia with the current directory as the project: julia --project=.
  • In the Julia REPL, load ArtifactUtils: using ArtifactUtils
  • Register the artifact: add_artifact!("Artifacts.toml", "TIMIT-lang", "https://gitlab.lisn.upsaclay.fr/PTAL/Datasets/SpeechDatasets.jl/-/releases/v0.21.1/downloads/TIMIT-lang.tar.gz", lazy=true). You can copy the URL from the releases page, or edit this one to match your tag and artifact name.
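Since the download URL appears to follow a regular pattern, it can be assembled from the tag and artifact name instead of being edited by hand. This is a small sketch: the variable names are illustrative, and the pattern is inferred from the example URL in this section, so check the result against the releases page.

```shell
# Illustrative variable names; adjust the tag and artifact to your release.
base="https://gitlab.lisn.upsaclay.fr/PTAL/Datasets/SpeechDatasets.jl"
tag="v0.21.1"
artifact="TIMIT-lang"

# Pattern inferred from the example URL in this section.
url="$base/-/releases/$tag/downloads/$artifact.tar.gz"
echo "$url"
```

The printed URL can then be pasted as the third argument of add_artifact!.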

4. Use it in the package

When a dataset is instantiated, SpeechDatasets checks for related artifacts by calling get_artifact.

SpeechDatasets.get_artifact (function)
get_artifact(name::Symbol, datadir::AbstractString, category::AbstractString; override=false)

Get the requested artifact for dataset name in the datadir directory. category can be "lang" or "metadata".
