BatchConvert: Efficient Image Data Conversion into OME-TIFF or OME-Zarr Formats with Parallel Processing and Automated Storage Transfer


BatchConvert

A command line tool for converting image data into either of the standard file formats OME-TIFF or OME-Zarr.

The tool wraps the dedicated file converters bfconvert and bioformats2raw to convert to OME-TIFF or OME-Zarr, respectively. The workflow management system NextFlow is used to run the conversions in parallel for batches of images.
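
For orientation, the calls below show what each wrapped converter does for a single image; BatchConvert automates such calls across whole batches. The input file name is a placeholder and both converters accept many further options not shown here:

# Convert one image to OME-TIFF with Bio-Formats bfconvert
bfconvert input.czi output.ome.tiff
# Convert one image to OME-Zarr with bioformats2raw
bioformats2raw input.czi output.ome.zarr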

The tool also wraps S3 and Aspera clients (go-mc and aspera-cli, respectively), so input and output locations can be specified as local or remote storage and file transfer is performed automatically. The conversion can also be run on HPC clusters with Slurm.
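
To illustrate what the automatic transfer amounts to, the sketch below shows the kind of go-mc commands that are run under the hood; the endpoint, credentials, bucket and paths are placeholders, and BatchConvert issues the equivalent commands for you:

# Register the remote S3 endpoint under an alias
mc alias set myremote https://s3.example.org ACCESS_KEY SECRET_KEY
# Mirror a converted dataset into a bucket on that remote
mc mirror ./converted_output myremote/mybucket/converted_output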

Installation & Dependencies

Important note: the package has so far only been tested on Ubuntu 20.04.

The minimal dependency to run the tool is NextFlow, which should be installed and made accessible from the command line.
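
If you prefer to set up NextFlow yourself, the standard installer from the NextFlow project can be used as in the sketch below; moving the launcher to ~/bin assumes that directory is on your PATH:

# Download the NextFlow launcher and make it executable
curl -s https://get.nextflow.io | bash
chmod +x nextflow
# Put it on your PATH and verify the installation
mv nextflow ~/bin/
nextflow -version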

If conda exists on your system, you can install BatchConvert together with NextFlow using the following script:

git clone https://github.com/Euro-BioImaging/BatchConvert.git && \
source BatchConvert/installation/install_with_nextflow.sh

If you already have NextFlow installed and accessible from the command line (or if you prefer to install it manually, e.g. as shown here), you can also install BatchConvert alone, using the following script:

git clone https://github.com/Euro-BioImaging/BatchConvert.git && \
source BatchConvert/installation/install.sh

Other dependencies (which will be automatically installed):

  • bioformats2raw (entrypoint bioformats2raw)
  • bftools (entrypoint bfconvert)
  • go-mc (entrypoint mc)
  • aspera-cli (entrypoint ascp)

These dependencies will be pulled and cached automatically at the first execution of the conversion command. The mode of dependency management can be specified with the command line option --profile or -pf. Depending on how this option is set, the dependencies will be acquired and run either via conda or via docker/singularity containers.

Specifying --profile conda (default) will install the dependencies to an environment at ./.condaCache and use this environment to run the workflow. This option requires that miniconda/anaconda is installed on your system.

Alternatively, specifying --profile docker or --profile singularity will pull a docker or singularity image with the dependencies, respectively, and use this image to run the workflow. These options assume that the respective container runtime (docker or singularity) is available on your system. If singularity is being used, a cache directory will be created at the path ./.singularityCache where the singularity image is stored.

Finally, you can still choose to install the dependencies manually and use your own installations to run the workflow. In this case, you should specify --profile standard and make sure the entrypoints specified above are recognised by your shell.
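
For illustration, the profile is selected at conversion time. The hypothetical invocations below assume the batchconvert entrypoint with an omezarr subcommand and positional input/output paths; check batchconvert --help for the exact interface of your installation:

# Convert a folder of images to OME-Zarr with conda-managed dependencies (default)
batchconvert omezarr -pf conda /path/to/input_dir /path/to/output_dir
# The same conversion, but pulling the dependencies as a docker image
batchconvert omezarr -pf docker /path/to/input_dir /path/to/output_dir
# Use your own locally installed bfconvert/bioformats2raw/mc/ascp
batchconvert omezarr -pf standard /path/to/input_dir /path/to/output_dir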

Code Snippets

Lines 22–24:
"""
batchconvert_cli.sh "$inpath.name" "${inpath.baseName}.ome.tiff"
"""
Lines 42–51:
"""
if [[ -d "${inpath}/tempdir" ]];
    then
        batchconvert_cli.sh "${inpath}/tempdir/${pattern_file}" "${pattern_file.baseName}.ome.tiff"
    else
        batchconvert_cli.sh "$inpath/$pattern_file.name" "${pattern_file.baseName}.ome.tiff"
fi
# rm -rf ${inpath}/tempdir &> /dev/null
# rm -rf ${inpath}/*pattern &> /dev/null
"""
Lines 67–69:
"""
batchconvert_cli.sh "$inpath.name" "${inpath.baseName}.ome.zarr"
"""
Lines 88–97:
"""
if [[ -d "${inpath}/tempdir" ]];
    then
        batchconvert_cli.sh "${inpath}/tempdir/${pattern_file.name}" "${pattern_file.baseName}.ome.zarr"
    else
        batchconvert_cli.sh "$inpath/$pattern_file.name" "${pattern_file.baseName}.ome.zarr"
fi
# rm -rf ${inpath}/tempdir &> /dev/null
# rm -rf ${inpath}/*pattern &> /dev/null
"""
Lines 109–113:
"""
sleep 5;
mc -C "./mc" alias set "${params.S3REMOTE}" "${params.S3ENDPOINT}" "${params.S3ACCESS}" "${params.S3SECRET}" &> /dev/null;
parse_s3_filenames.py "${params.S3REMOTE}/${params.S3BUCKET}/${source}/"
"""
Lines 126–136:
"""
sleep 5;
localname="\$(basename $local)" && \
mc -C "./mc" alias set "${params.S3REMOTE}" "${params.S3ENDPOINT}" "${params.S3ACCESS}" "${params.S3SECRET}";
if [ -f $local ];then
    mc -C "./mc" cp $local "${params.S3REMOTE}"/"${params.S3BUCKET}"/"${params.out_path}"/"\$localname";
elif [ -d $local ];then
    mc -C "./mc" mirror $local "${params.S3REMOTE}"/"${params.S3BUCKET}"/"${params.out_path}"/"\$localname";
fi
echo "${params.S3REMOTE}"/"${params.S3BUCKET}"/"${params.out_path}"/$local > "./transfer_report.txt";
"""
Lines 145–149:
"""
sleep 5;
mc -C "./mc" alias set "${params.S3REMOTE}" "${params.S3ENDPOINT}" "${params.S3ACCESS}" "${params.S3SECRET}";
mc -C "./mc" mirror "${params.S3REMOTE}"/"${params.S3BUCKET}"/"${source}" "transferred/${source}";
"""
Lines 160–164:
"""
sleep 5;
mc -C "./mc" alias set "${params.S3REMOTE}" "${params.S3ENDPOINT}" "${params.S3ACCESS}" "${params.S3SECRET}";
mc -C "./mc" cp "${s3path}" "${s3name}";
"""
Lines 173–176:
"""
ascp -P33001 -l 500M -k 2 -i $BIA_SSH_KEY -d $local [email protected]:${params.BIA_REMOTE}/${params.out_path};
echo "${params.BIA_REMOTE}"/"${params.out_path}" > "./transfer_report.txt";
"""
Lines 186–188:
"""
ascp -P33001 -l 500M -k 2 -i $BIA_SSH_KEY -d [email protected]:${params.BIA_REMOTE}/$source ".";
"""
Lines 198–200:
"""
ascp -P33001 -i $BIA_SSH_KEY [email protected]:$source transferred;
"""
Lines 271–281:
"""
if [[ "${params.pattern}" == '' ]] && [[ "${params.reject_pattern}" == '' ]];then
    create_hyperstack --concatenation_order ${params.concatenation_order} ${inpath}
elif [[ "${params.reject_pattern}" == '' ]];then
    create_hyperstack --concatenation_order ${params.concatenation_order} --select_by ${params.pattern} ${inpath}
elif [[ "${params.pattern}" == '' ]];then
    create_hyperstack --concatenation_order ${params.concatenation_order} --reject_by ${params.reject_pattern} ${inpath}
else
    create_hyperstack --concatenation_order ${params.concatenation_order} --select_by ${params.pattern} --reject_by ${params.reject_pattern} ${inpath}
fi
"""
Lines 290–300:
"""
if [[ "${params.pattern}" == '' ]] && [[ "${params.reject_pattern}" == '' ]];then
    create_hyperstack --concatenation_order ${params.concatenation_order} ${inpath}
elif [[ "${params.reject_pattern}" == '' ]];then
    create_hyperstack --concatenation_order ${params.concatenation_order} --select_by ${params.pattern} ${inpath}
elif [[ "${params.pattern}" == '' ]];then
    create_hyperstack --concatenation_order ${params.concatenation_order} --reject_by ${params.reject_pattern} ${inpath}
else
    create_hyperstack --concatenation_order ${params.concatenation_order} --select_by ${params.pattern} --reject_by ${params.reject_pattern} ${inpath}
fi
"""
Lines 318–321:
"""
rm -rf "${inpath}/tempdir" &> /dev/null
rm -rf "${inpath}/*pattern" &> /dev/null
"""
Lines 332–335:
"""
mc alias set "${params.S3REMOTE}" "${params.S3ENDPOINT}" "${params.S3ACCESS}" "${params.S3SECRET}";
mc mirror "${params.S3REMOTE}"/"${params.S3BUCKET}"/"${source}" "transferred";
"""
Lines 377–394:
"""
if [[ "${params.merge_files}" == "True" ]];
    then
        create_hyperstack --concatenation_order ${params.concatenation_order} --select_by ${params.pattern} ${inpath};
        if [[ "${params.concatenation_order}" == "auto" ]];
            then
                batchconvert_cli.sh $inpath/*pattern "${inpath.baseName}.ome.zarr"
        elif ! [[ "${params.concatenation_order}" == "auto" ]];
            then
                batchconvert_cli.sh $inpath/tempdir/*pattern "${inpath.baseName}.ome.zarr"
        fi
elif [[ "${params.merge_files}" == "False" ]];
    then
        batchconvert_cli.sh $inpath "${inpath.baseName}.ome.zarr"
fi
rm -rf "${inpath}/tempdir" &> /dev/null
rm -rf "${inpath}/*pattern" &> /dev/null
"""

URL: https://github.com/Euro-BioImaging/BatchConvert.git
Name: batchconvert
Version: main @ 03e32fe
Copyright: Public Domain
License: MIT License