The UVA research community has access to numerous bioinformatics software packages, either installed directly or available through the bioconda Python modules.
To get an up-to-date list of the installed bioinformatics applications, log on to UVA HPC and run the following command in a terminal window:
module keyword bio
If you know which package you wish to use, you can look for it with
module spider <software>
For example,
module spider bcftools
This returns
----------------------------------------------------------------------------
bcftools:
----------------------------------------------------------------------------
Description:
SAMtools is a suite of programs for interacting with high-throughput
sequencing data. BCFtools - Reading/writing BCF2/VCF/gVCF files and
calling/filtering/summarising SNP and short indel sequence variants
Versions:
bcftools/1.3.1
bcftools/1.9
----------------------------------------------------------------------------
For detailed information about a specific "bcftools" module (including how to
load the modules) use the module's full name.
For example:
$ module spider bcftools/1.9
----------------------------------------------------------------------------
Available versions may change, but the format should be the same.
To obtain more information about a specific module version, including a list of any prerequisite modules that must be loaded first, run the module spider command with the version specified; for example:
module spider bcftools/1.3.1
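If you only want the version names from this output, a small filter can help. The following is a minimal sketch, not an official tool: the helper name extract_versions is hypothetical, and it assumes the output format shown above and that module spider writes its report to standard error (the default behaviour of Lmod), which is why the stream is redirected with 2>&1.

```shell
# Hypothetical helper: read `module spider` output on stdin and print only
# the lines that consist of a single "name/version" entry.
extract_versions() {
    awk -v name="$1" 'NF == 1 && index($1, name "/") == 1 { print $1 }'
}

# Only call the module command where it exists (i.e. on the HPC system).
if type module >/dev/null 2>&1; then
    # Assumption: the report goes to standard error, so merge it into stdout.
    module spider bcftools 2>&1 | extract_versions bcftools
fi
```

The NF == 1 condition skips lines such as "$ module spider bcftools/1.9", which mention a version but are not part of the version list.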
Using a Specific Software Module
To use a specific software package, run the module load command. The module load command by itself does not execute any programs; it only prepares the environment, i.e. it sets the variables needed to run specific applications and to find libraries provided by the module.
After loading a module, you are ready to run the application(s) provided by the module. For example:
module load bcftools/1.3.1
bcftools --version
Output:
bcftools 1.3.1
Using htslib 1.3.1
Copyright (C) 2016 Genome Research Ltd.
License GPLv3+: GNU GPL version 3 or later
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.
You will need to include the appropriate module load commands in your Slurm script.
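As an illustration, the commands below write a minimal job script that loads the module before calling the program. This is a sketch under stated assumptions: the file name bcftools_version.slurm, the partition name, the allocation name, and the time limit are placeholders and must be replaced with values valid for your account.

```shell
# Create a small Slurm job script from the terminal using a here-document.
cat > bcftools_version.slurm <<'EOF'
#!/bin/bash
#SBATCH -N 1                 # single node
#SBATCH --ntasks=1           # single task (serial program)
#SBATCH -t 00:05:00          # time limit (placeholder)
#SBATCH -p standard          # partition name (assumption)
#SBATCH -A your_allocation   # allocation name (placeholder)

module purge                 # start from a clean environment
module load bcftools/1.3.1   # version taken from the example above
bcftools --version
EOF
```

The script can then be submitted with sbatch bcftools_version.slurm. Loading the module inside the script, rather than relying on your login session, keeps the job's environment reproducible.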
General Considerations for Slurm Jobs
Most bioinformatics software packages are designed to run on a single compute node, with varying support for multithreading and use of multiple CPU cores. Many can run on only one core; in that case, please request only a single task.
Some software is multithreaded. Usually the number of threads is passed to the program through a command-line option. In this case the Slurm job script should contain the following two SBATCH directives:
#SBATCH -N 1 # request single node
#SBATCH --cpus-per-task=<X> # request multiple cpu cores
Replace <X> with the actual number of CPU cores to be requested. For many bioinformatics packages, requesting more than 8 CPU cores does not provide any significant performance gain. This is a limitation due to code design rather than a UVA HPC constraint.
Please be certain that the number of cores you request matches the number you pass to the software. To ensure this, you can often use the environment variable SLURM_CPUS_PER_TASK. For example,
biofoo -n ${SLURM_CPUS_PER_TASK}
You should only deviate from this general resource request format if you are absolutely certain that the software package supports execution on more than one compute node.
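Putting the directives and the environment variable together, a multithreaded job script might look like the sketch below. The value 8 is only an example, and biofoo with its -n option is the placeholder program from the text above, so the script prints the command instead of running it. The fallback to 1 thread is an added convenience, not a Slurm feature.

```shell
#!/bin/bash
#SBATCH -N 1                  # request single node
#SBATCH --cpus-per-task=8     # request 8 CPU cores (example value)

# Slurm sets SLURM_CPUS_PER_TASK inside the job when --cpus-per-task is given.
# The :-1 fallback keeps the script usable when it is run outside a job.
THREADS="${SLURM_CPUS_PER_TASK:-1}"

# biofoo is a placeholder; replace it and its -n option with your real
# program and that program's thread-count option.
echo "biofoo -n ${THREADS}"
```

Because the thread count is read from the environment, changing the --cpus-per-task value is the only edit needed to rescale the job.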
Reference Genomes on the HPC system
Research Computing provides a set of ready-to-use reference sequences and annotations for commonly analyzed organisms in a convenient, accessible location on Rivanna:
/project/genomes/
The majority of files have been downloaded from Illumina's genomes repository (iGenomes), which contains assembly builds and corresponding annotations from Ensembl, NCBI and UCSC. Each genome directory contains index files of the whole genome for use with aligners like BWA and Bowtie2. In addition, STAR2 index files have been generated for each of the Homo sapiens (human) and Mus musculus (mouse) genome builds.
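To see which aligner indexes are available, you can search the genome directory from a terminal. The following is a rough sketch: the helper name find_index_dirs is hypothetical, and the directory name BWAIndex follows the usual iGenomes layout, which is an assumption about how the files are organized under /project/genomes/.

```shell
# Hypothetical helper: list directories with a given name below a genome root.
find_index_dirs() {
    root="$1"
    name="$2"
    if [ ! -d "$root" ]; then
        echo "Directory not found: $root" >&2
        return 1
    fi
    # Limit the search depth so the scan stays quick on a large file system.
    find "$root" -maxdepth 6 -type d -name "$name" | sort
}

# Assumption: BWA indexes live in directories named BWAIndex (iGenomes layout).
find_index_dirs /project/genomes BWAIndex ||
    echo "Run this on the HPC system to see the available indexes." >&2
```

Replacing BWAIndex with another directory name, for example Bowtie2Index in the standard iGenomes layout, searches for a different aligner's files.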