Apptainer on Rivanna
Introduction
Apptainer is a continuation of the Singularity project. On December 18, 2023, we migrated from Singularity to Apptainer.
Containers created by Singularity and Apptainer are mutually compatible as of this writing, although divergence is to be expected.
One advantage of Apptainer is that users can now build container images natively on Rivanna.
Apptainer on Rivanna (after 12/18/2023)
Apptainer is available as a module. RC staff have also curated a library of prebuilt Apptainer container images for popular applications as part of the shared software stack. Descriptions of these shared containers can be found via the module avail and module spider commands.
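As a rough illustration of the module commands mentioned above, the following sketch looks up and loads Apptainer. It assumes a shell where Lmod's module command is defined (as on Rivanna) and guards against machines where it is not; the variable lmod_found is only there for illustration.

```shell
# Sketch: Lmod's `module` command exists on the cluster but usually not elsewhere.
if type module >/dev/null 2>&1; then
    lmod_found=yes
    module spider apptainer   # show the description and versions of the Apptainer module
    module load apptainer     # put the apptainer executable on PATH
    apptainer --version       # confirm which version was loaded
else
    lmod_found=no
    echo "Lmod 'module' command not found; run these commands on Rivanna." >&2
fi
```

Running module spider with the name of a shared container module in place of apptainer should show that container's description in the same way.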
[Deprecated] On Dec 18, 2023, Singularity was upgraded to Apptainer, a continuation of the Singularity project.
Overview
Singularity is a container application targeted to multi-user, high-performance computing systems. It interoperates well with Slurm and with the Lmod modules system. Singularity can be used to create and run its own containers, or it can import Docker containers.
Creating Singularity Containers
To create your own image from scratch, you must have root privileges on some computer running Linux (any version). Follow the instructions at the Singularity site. If you have only Mac or Windows, you can use the Vagrant environment. Vagrant is a pre-packaged system that runs under several virtual-machine environments, including the free VirtualBox environment.
NVIDIA DGX BasePOD™
Introducing the NVIDIA DGX BasePOD™
As artificial intelligence (AI) and machine learning (ML) continue to change how academic research is conducted, the NVIDIA DGX BasePOD, or BasePOD, brings new AI and ML functionality to Rivanna, UVA’s High-Performance Computing (HPC) system. The BasePOD is a cluster of high-performance GPUs that allows large deep-learning models to be created and utilized at UVA.
The NVIDIA DGX BasePOD™ on Rivanna, hereafter referred to as the POD, comprises:
- 10 DGX A100 nodes, each with 2 TB of RAM
- 80 GB of GPU memory per GPU device
Compared to the regular GPU nodes, the POD contains advanced features such as:
Overview
Containers bundle an application, the libraries and other executables it may need, and even the data used with the application into portable, self-contained files called images. Containers simplify installation and management of software with complex dependencies and can also be used to package workflows.
Please refer to the following pages for further information.
- Singularity (before Dec 18, 2023)
- Apptainer (after Dec 18, 2023)
- Short course: Software Containers for HPC
Container Registries for UVA Research Computing
Images built by Research Computing are hosted on Docker Hub (and previously Singularity Library).
Singularity Library
Due to storage limits we can no longer add Singularity images to Singularity Library.
Kubernetes is a container orchestrator for both short-running jobs (such as workflow/pipeline stages) and long-running services (such as web and database servers). Containerized applications running in the UVARC Kubernetes cluster are visible to UVA Research networks (and therefore from Rivanna, Skyline, etc.). Web applications can be made visible to the UVA campus or the public Internet.
Kubernetes
Research Computing runs microservices in a Kubernetes cluster that automates the deployment of many containers, making their management easy and scalable. This cluster will eventually consist of several dozen instances, >2000 cores and >2TB of memory allocated to running containerized services. It will also have over 300TB of cluster storage and can attach to both project and standard storage.
Container-based architecture, also known as “microservices,” is an approach to designing and running applications as a distributed set of components or layers. Such applications are typically run within containers, made popular in the last few years by Docker. Containers are portable, efficient, reusable, and contain code and any dependencies in a single package. Containerized services typically run a single process, rather than an entire stack within the same environment. This allows developers to replace, scale, or troubleshoot portions of their entire application at a time.
General Availability (GA) of Kubernetes - Research Computing now manages microservice orchestration with Kubernetes, the open-source tool from Google.
Cloud computing is ideal for running flexible, scalable applications on demand, in periodic bursts, or for fixed periods of time. UVA Research Computing works alongside researchers to design and deploy research applications and datasets in Amazon Web Services, the leader among public cloud vendors. This means that server, storage, and database needs do not have to be estimated or purchased beforehand: they can be scaled up and down with your needs, or programmed to scale dynamically with your application.
Service Oriented Architecture
A key advantage of the cloud is that for many services you do not need to build or maintain the servers that support the service; you simply use it.
UVA Research Computing provides training opportunities covering a variety of data analysis, basic programming, and computational topics. All of the classes listed below are taught by experts and are freely available to UVA faculty, staff, and students.
Upcoming Workshops
DATE            WORKSHOP                                INSTRUCTOR
Mar 11, 2024    Building Containers Natively on HPC     Ruoshi Sun
Research Computing is partnering with the Research Library and the Health Sciences Library to deliver workshops covering a variety of research computing topics.
All Upcoming Workshops from UVA Library Research Data Services
All Upcoming Workshops from UVA Health Sciences Library
Workshop Material
Course material and exercises are available through a companion site.
How to add packages to a container?
Basic Steps
Strictly speaking, you cannot add packages to an existing container since it is not editable. However, you can try to install missing packages locally. Using python-pip as an example:
    module load apptainer
    apptainer exec <container.sif> python -m pip install --user <package>
Replace <container.sif> with the actual filename of the container and <package> with the package name. The Python package will be installed in your home directory under .local/lib/pythonX.Y where X.Y is the Python version in the container.
If the installation results in a binary, it will often be placed in .local/bin. Remember to add this to your PATH:
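A minimal sketch of the PATH update, assuming a bash-compatible shell and the default per-user install location ~/.local/bin:

```shell
# Prepend ~/.local/bin to PATH unless it is already there.
case ":$PATH:" in
    *":$HOME/.local/bin:"*) ;;                   # already present, nothing to do
    *) export PATH="$HOME/.local/bin:$PATH" ;;   # user-installed binaries take precedence
esac
```

This only affects the current shell session. To make it persistent, the same lines could be placed in a shell startup file such as ~/.bashrc, assuming bash is your login shell.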
BART (Binding Analysis for Regulation of Transcription) Web
Working with researchers in the Zang Lab in the Center for Public Health Genomics (CPHG), RC helped launch BARTweb, an interactive web-based tool for users to analyze their Genelist or ChIP-seq datasets. BARTweb is a containerized Flask front-end (written in Python) that ingests files and submits them to a more robust Python-based genomics pipeline running on Rivanna, UVA’s high performance computing (HPC) cluster. This architecture, a public web application that uses a supercomputer to process data, is a new model for UVA, and one that eases the learning curve for researchers who may not have access to an HPC system or the expertise to run a BART pipeline on the command line.
UVA Research Computing can help you find the right system for your computational workloads. From supercomputers to HIPAA secure systems to cloud-based deployments with advanced infrastructure, various systems are available to researchers.
Facilities Statement - Are you submitting a grant proposal and need standard information about UVA research computing environments? Get it here.
High Performance Computing - Rivanna
A traditional high performance cluster with a resource manager, a large file system, modules, and MPI processing. Get Started on Rivanna
Secure Computing for Highly Sensitive Data - Ivy
A multi-platform, HIPAA-compliant system for secure data that includes dedicated virtual machines (Linux and Windows), JupyterLab Notebooks, and Apache Spark.
Rivanna HPC Software
Overview
Research Computing at UVA offers a variety of standard software packages for all Rivanna users. We also install requested software based on the needs of the high-performance computing (HPC) community as a whole. Software used by a single group should be installed by that group’s members, ideally on leased storage controlled by the group. Departments with a set of widely-used software packages may install them to the lsp_apps space. The Research Computing group also provides limited assistance for individual installations.
For help installing research software on your PC, please contact Research Software Support at firstname.lastname@example.org.
Software Modules and Containers
Software on Rivanna is accessed via environment modules or containers.
Docker - The Basics
Note that Docker requires sudo privilege and is therefore not supported on Rivanna. To use a Docker image you will need to convert it into an Apptainer image. More information can be found on our website.
What Is Docker?
“Docker is a set of platform-as-a-service (PaaS) products that use OS-level virtualization to deliver software in packages called containers. Containers are isolated from one another and bundle their own software, libraries and configuration files; they can communicate with each other through well-defined channels. All containers are run by a single operating-system kernel and are thus more lightweight than virtual machines. The service has both free and premium tiers.”
The past few years have seen an explosion of interest in understanding the role of regulatory DNA. This interest has driven large-scale production of functional genomics data resources and analytical methods. One popular analysis is to test for enrichment of overlaps between a query set of genomic regions and a database of region sets. In this way, annotations from external data sources can be easily connected to new genomic data.
SOM Research Computing is working with faculty in the UVA Center for Public Health Genomics to implement LOLAweb, an online tool for performing genomic locus overlap annotations and analyses. This project, written in the statistical programming language R, allows users to specify region set data in BED format for automated enrichment analysis.
Refgenie: A Reference Genome Resource Manager
Reference genome assemblies are essential for high-throughput sequencing analysis projects. Typically, genome assemblies are stored on disk alongside related resources; e.g., many sequence aligners require the assembly to be indexed. The resulting indexes are broadly applicable for downstream analysis, so it makes sense to share them. However, there is no simple tool to do this.
Refgenie is a reference genome assembly asset manager. Refgenie makes it easier to organize, retrieve, and share genome analysis resources. In addition to genome indexes, refgenie can manage any files related to reference genomes, including sequences and annotation files. Refgenie includes a command line interface and a server application that provides a RESTful API, so it is useful for both tool development and analysis.
ACCESS: Advanced Cyberinfrastructure Coordination Ecosystem: Services and Support
The NSF’s ACCESS (Advanced Cyberinfrastructure Coordination Ecosystem: Services & Support) program builds upon the successes of the 11-year XSEDE project, while also expanding the ecosystem with capabilities for new modes of research and further democratizing participation.
- ACCESS Home: access-ci.org, access-ci.org/about
- Allocations: allocations.access-ci.org
- Documentation (Support): support.access-ci.org
- Community Engagement (ACCESS affinity groups): support.access-ci.org/affinity-groups
- Campus Champions: https://campuschampions.cyberinfrastructure.org
UVA Research Computing has two Champions, Ed Hall and Katherine Holcomb.
For more help, please feel free to contact RC staff to set up a consultation or visit us during office hours.
Docker Images on Rivanna
Docker requires sudo privilege and is therefore not supported on Rivanna. To use a Docker image you will need to convert it into an Apptainer image.
Convert a Docker image
There are several ways to convert a Docker image:
- Download a remote image from Docker Hub
- Build from a local image cached in Docker daemon
- Build from a definition file (advanced)
Instructions are provided in each of the following sections.
Docker Hub
Docker images hosted on Docker Hub can be downloaded and converted in one step via the apptainer pull command:
    module load apptainer
    apptainer pull docker://account/image
Use the same account/image reference as you would for docker pull, prefixed with docker://.
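For a concrete illustration, the sketch below pulls the official ubuntu:22.04 image, chosen here purely as an example (official Docker Hub images have no account prefix). The output filename is passed explicitly so that it does not depend on Apptainer's default naming. The guard lets the snippet run harmlessly on a machine without Apptainer.

```shell
# Example image only; substitute the account/image you actually need.
image="docker://ubuntu:22.04"
sif="ubuntu_22.04.sif"          # output file name, passed explicitly to pull

if command -v apptainer >/dev/null 2>&1; then
    apptainer pull "$sif" "$image"              # download and convert in one step
    apptainer exec "$sif" cat /etc/os-release   # quick check that the image runs
else
    echo "apptainer not found; on Rivanna run 'module load apptainer' first." >&2
fi
```

The resulting .sif file is an ordinary file in the current directory and can be moved or copied like any other.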
XSEDE: Extreme Science and Engineering Development Environment
XSEDE’s mission was to substantially enhance the productivity of a growing community of scholars, researchers, and engineers through access to advanced digital services that support open research, and to coordinate and add significant value to the leading cyberinfrastructure resources funded by the NSF and other agencies.
The XSEDE project ended on August 31, 2022 and was succeeded by the ACCESS project.
XSEDE Home: www.xsede.org
Rivanna HPC Software
Overview
Research Computing at UVA offers a variety of standard software packages for all Rivanna users. We also install requested software based on the needs of the HPC community as a whole. Software used by a single group should be installed by that group’s members, ideally on leased storage controlled by the group. Departments with a set of widely-used software packages may install them to the lsp_apps space. The Research Computing group also provides limited assistance for individual installations.
For help installing research software on your PC, please contact Research Software Support at email@example.com.
Software Modules and Containers
Software on Rivanna is provided via environment modules or as containers.
Computing Environments at UVA
Research Computing (UVA-RC) serves as the principal center for computational resources and associated expertise at the University of Virginia (UVA). Each year UVA-RC provides services to over 433 active PIs who sponsor more than 2463 unique users from 14 different schools/organizations at the University, maintaining a breadth of systems to support the computational and data-intensive research of UVA’s researchers.
High Performance Computing
UVA-RC’s High Performance Computing (HPC) systems are designed with high-speed networks, high-performance storage, GPUs, and large amounts of memory in order to support modern compute- and memory-intensive programs. UVA-RC’s HPC systems comprise over 614 compute nodes, with a total of 20476 x86 64-bit compute cores and 240 TB total RAM.