Conda/Bioconda in FGCI

Conda is a package manager that is used to automate the installation of application software and software environments.

Conda (miniconda3) is available in the FGCI grid. In FGCI, Conda can be used to make temporary application software installations for grid jobs.
The installations are done to the temporary runtime directory of the job. This means that each job must do its own installation, which disappears when the job is cleaned away.
In cases where the Conda installation requires downloading just a few files, this approach can be used even for massive analysis tasks.
However, when the installed tool is large, it is advisable to use Conda based installations only for testing and small scale computing
and to ask the FGCI administrators to create a permanent, shared Conda environment in FGCI.

The Conda installation and the instructions here are written for installing applications from the BIOCONDA repository (list of BIOCONDA packages), but in principle this Conda installation can be used to install any Conda package
that does not need user interaction during the installation process.

 

Usage

To use Conda, add the runtime environment APPS/BIO/BIOCONDA to your job description file.

For example:

&
(executable=run_megahit.sh)
(jobname=megahit-test)
(stdout=std.out)
(stderr=std.err)
(gmlog=gridlog_1)
(walltime=4h)
(memory=16000)
(runtimeenvironment="APPS/BIO/BIOCONDA")
(count=8)
(inputfiles=
( "seq_R1.fastq.gz" "seq_R1.fastq.gz" )
( "seq_R2.fastq.gz" "seq_R2.fastq.gz" )
( "seq_merged.fastq.gz" "seq_merged.fastq.gz" )
)
(outputfiles=
   ( "megahit_results.tgz" "megahit_results.tgz" )
)
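
When the same analysis is run for many samples, one job description per sample can be generated from a template. The following is a minimal sketch, not an FGCI-provided tool: the function name write_job_description and the sample file naming (<sample>_R1.fastq.gz and so on) are assumptions made for illustration. In each inputfiles pair the first name is the file name seen by the job and the second is the local source file, so the job can keep using the fixed names seq_R1.fastq.gz, seq_R2.fastq.gz and seq_merged.fastq.gz.

```shell
# Hypothetical helper: write an xRSL job description for one sample.
# Assumes local input files named <sample>_R1.fastq.gz, <sample>_R2.fastq.gz
# and <sample>_merged.fastq.gz (an illustrative naming scheme).
write_job_description() {
    sample=$1
    cat > "megahit_${sample}.xrsl" <<EOF
&
(executable=run_megahit.sh)
(jobname=megahit-${sample})
(stdout=std.out)
(stderr=std.err)
(gmlog=gridlog_1)
(walltime=4h)
(memory=16000)
(runtimeenvironment="APPS/BIO/BIOCONDA")
(count=8)
(inputfiles=
( "seq_R1.fastq.gz" "${sample}_R1.fastq.gz" )
( "seq_R2.fastq.gz" "${sample}_R2.fastq.gz" )
( "seq_merged.fastq.gz" "${sample}_merged.fastq.gz" )
)
(outputfiles=
( "megahit_results.tgz" "megahit_results.tgz" )
)
EOF
}
```

For example, calling write_job_description sampleA and then write_job_description sampleB (hypothetical sample names) would write the files megahit_sampleA.xrsl and megahit_sampleB.xrsl to the current directory.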

 

The command script must contain the Conda installation commands. For Bioconda based installations, you must first define the repository channels used by Bioconda:

conda config --add channels defaults
conda config --add channels conda-forge
conda config --add channels bioconda
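
Note that conda config --add puts each new channel at the top of the channel list, so after the three commands above bioconda has the highest priority, followed by conda-forge and defaults. The same commands can be wrapped in a small shell function, as in the sketch below. The function name configure_bioconda_channels is hypothetical and not part of the FGCI setup.

```shell
# Hypothetical helper: add the channels in the order used on this page.
# Each --add prepends to the channel list, so the channel added last
# (bioconda) ends up with the highest priority.
configure_bioconda_channels() {
    for channel in defaults conda-forge bioconda; do
        conda config --add channels "$channel" || return 1
    done
}
```

The function stops at the first failing conda config command and returns a non-zero status, so a job script can call it as configure_bioconda_channels || exit 1 before the installation step.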

 

In the actual installation command you must use the option -y to automatically accept all the default settings and the option -p to define the location where the installation is done.

In the example script below, a tool called megahit is installed using Conda. First a new directory called conda_tmp is created, and it is then used as the installation directory.

After the installation, the bin directory created in conda_tmp is added to the command path so that the installed tools can be found.

 

#!/bin/bash
# Make a temporary directory for Conda in the runtime directory of the job
conda_tmp_path="$(pwd)/conda_tmp"
mkdir -p "$conda_tmp_path"
# Define the channels and install the software to be used
conda config --add channels defaults
conda config --add channels conda-forge
conda config --add channels bioconda
conda install megahit -y -p "$conda_tmp_path"
# Add the installed tool to the command path
export PATH="${conda_tmp_path}/bin:$PATH"
# Run the command (-t 8 uses eight threads, as requested in the job description)
megahit -1 seq_R1.fastq.gz -2 seq_R2.fastq.gz -r seq_merged.fastq.gz -t 8 -o megahit_results
# Pack and compress the results
tar zcvf megahit_results.tgz megahit_results

exit
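
If the Conda installation fails, for example because a package cannot be downloaded, the script above still tries to run megahit. The sketch below shows one possible way to stop the job early in that case. The function name install_to_job_dir is hypothetical. It only combines the conda install options described above with a check that the expected command was really installed.

```shell
# Hypothetical helper: install one package to a job-local directory and
# verify that the expected executable is present afterwards.
# Usage: install_to_job_dir <package> <install_dir> <expected_command>
install_to_job_dir() {
    package=$1
    install_dir=$2
    expected_cmd=$3

    mkdir -p "$install_dir" || return 1
    # -y accepts the default settings, -p sets the installation location
    conda install "$package" -y -p "$install_dir" || return 1

    if [ ! -x "$install_dir/bin/$expected_cmd" ]; then
        echo "ERROR: $expected_cmd not found in $install_dir/bin" >&2
        return 1
    fi
    # Make the installed tools available on the command path
    export PATH="$install_dir/bin:$PATH"
}
```

In a job script this could replace the mkdir, conda install and export PATH lines with a single call such as install_to_job_dir megahit "$(pwd)/conda_tmp" megahit || exit 1, placed after the channel definitions.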

 
