Research Computing

`/tag/bioinformatics`

Drosophila Evolution through Space and Time 2.0
Evolutionary biologists use population-based DNA sequencing to gain insight into the nature of adaptation, genetic diversity, and organismal form and function. When collecting DNA data, scientists are often sample-limited because of the logistical challenges of collecting DNA from wild individuals across large portions of a species' range. This can be mitigated when groups of scientists work together to create data and then share it with the larger community. The Bergland Lab has been a central participant in developing and maintaining DEST (“Drosophila Evolution through Space and Time”), a large (~10 TB) repository of processed and standardized Drosophila melanogaster population genomic data.

Read more →

projects bioinformatics, data, hpc, parallel-computing
Workshops
UVA Research Computing provides training opportunities covering a variety of data analysis, basic programming and computational topics. All of the classes listed below are taught by experts and are freely available to UVa faculty, staff and students.
New to High-Performance Computing? We offer orientation sessions to introduce you to the Afton & Rivanna HPC systems on Wednesdays (appointment required).
Wednesdays 3:00-4:00pm: Sign up for an “Intro to HPC” session
Upcoming Workshops
There are currently no training events scheduled. Please check back soon!
Research Computing is partnering with the Research Library and the Health Sciences Library to deliver workshops covering a variety of research computing topics.

Read more →

education, workshops bioinformatics, containers, HPC, image processing, Ivy, Matlab, programming, Python, R, Rivanna, Shiny
Bioinformatics & Genomics
UVA Research Computing (RC) can help with your bioinformatics project.
Next-generation sequence data analysis
RC staff can help you start to use popular bioinformatics software for functions such as:
- Genome assembly, reference-based and/or de novo
- Whole-genome/exome sequence analysis for variant calling/annotation
- RNA-Seq data analysis to quantify, discover and profile RNAs
- Microbiome data analysis, including 16S rRNA surveys, OTU clustering, microbial profiling, taxonomic and functional analysis from whole shotgun metagenomic/metatranscriptomic datasets
- Epigenetic analysis from BSAS/ChIP-Seq/ATAC-Seq
Computing Platforms
UVA has three computing facilities available to researchers: Rivanna and Afton, for non-sensitive data, and Ivy, for sensitive data. In addition, cloud-based services offer a computing environment for running flexible, scalable on-demand applications.

Read more →

services bioinformatics, genomics
COVID Saliva Testing
In cooperation with the UVA Saliva Testing Lab, the UVA Health System, and the Virginia Department of Health, the “Be SAFE” saliva
testing program was launched in late 2020. Now a retired project, Be SAFE used saliva samples to detect the COVID-19 virus through a diagnostic PCR test.
Research Computing provided computational, storage, and data integration expertise to this project.

Read more →

projects bioinformatics, covid-19, data, health
Bioinformatics Resources and UVA HPC
The UVA research community has access to numerous bioinformatics software packages, either installed directly or available through the bioconda Python modules.
Click here for a comprehensive list of currently-installed bioinformatics software.
Popular Bioinformatics Software
Below are some popular tools and useful links for their documentation and usage:

| Tool | Version | Description | Useful Links |
|---|---|---|---|
| BEDTools | 2.26.0 | BEDTools utilities allow one to intersect, merge, count, complement, and shuffle genomic intervals from multiple files in widely-used genomic file formats such as BAM, BED, GFF/GTF, VCF. | Homepage, Tutorial |
| BLAST+ | 2.7.1 | BLAST+ is a suite of command-line tools that offers applications for BLAST search, BLAST database creation/examination, and sequence filtering. | |

Read more →

howto bioinformatics, genomics, rivanna, tools
COVID-19 Surveillance Dashboard
The Biocomplexity Institute at the University of Virginia has been at the forefront of epidemiological modeling to track the COVID-19 pandemic and has developed a suite of COVID-19 epidemic response resources, including a series of dashboards to help the public and the government better understand the pandemic. This is a static view of the Institute’s interactive COVID-19 Surveillance Dashboard, which provides a visualization of COVID-19 cases, recoveries, and deaths across the globe. To support the planning and response efforts for the recent Coronavirus pandemic, researchers prepared this visualization tool, which provides a unique way of examining data curated by different data sources.

Read more →

projects bioinformatics, covid-19, web
LOLAweb
The past few years have seen an explosion of interest in understanding the role of regulatory DNA. This interest has driven large-scale production of functional genomics data resources and analytical methods. One popular analysis is to test for enrichment of overlaps between a query set of genomic regions and a database of region sets. In this way, annotations from external data sources can be easily connected to new genomic data.
SOM Research Computing is working with faculty in the UVA Center for Public Health Genomics to implement LOLAweb, an online tool for performing genomic locus overlap annotations and analyses. This project, written in the statistical programming language R, allows users to specify region set data in BED format for automated enrichment analysis.
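The overlap-enrichment idea described above can be sketched in a few lines of Python. This is an illustrative sketch only, not LOLAweb's code: it assumes regions are `(chrom, start, end)` tuples with half-open coordinates, that the query is a subset of a background "universe" of regions, and that enrichment is scored with a one-sided Fisher's exact test (one common choice for this kind of analysis). All function and variable names here are hypothetical.

```python
from math import comb

def overlaps(a, b):
    """True if two half-open regions (chrom, start, end) share at least one base."""
    return a[0] == b[0] and a[1] < b[2] and b[1] < a[2]

def count_hits(regions, region_set):
    """Number of regions that overlap any interval in region_set."""
    return sum(any(overlaps(r, s) for s in region_set) for r in regions)

def fisher_right_tail(k, n_query, K, N):
    """P(X >= k) when n_query regions are drawn from N universe regions,
    K of which overlap the database region set (hypergeometric tail)."""
    top = min(n_query, K)
    total = sum(comb(K, i) * comb(N - K, n_query - i) for i in range(k, top + 1))
    return total / comb(N, n_query)

# Toy data (made up for illustration).
universe = [("chr1", 0, 100), ("chr1", 200, 300), ("chr1", 400, 500),
            ("chr1", 600, 700), ("chr2", 0, 100), ("chr2", 200, 300)]
query = universe[:3]
db_set = [("chr1", 50, 250), ("chr1", 450, 460)]

k = count_hits(query, db_set)      # query regions hitting the database set
K = count_hits(universe, db_set)   # universe regions hitting the database set
p_value = fisher_right_tail(k, len(query), K, len(universe))
print(k, K, p_value)
```

A small p-value suggests the query regions overlap the database set more often than expected from the universe alone. A production tool would use an interval index rather than the quadratic scan shown here.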

Read more →

projects bioinformatics, containers, cphg, docker, r, shiny
Refgenie: A Reference Genome Resource Manager
Reference genome assemblies are essential for high-throughput sequencing analysis projects. Typically, genome assemblies are stored on disk alongside related resources; e.g., many sequence aligners require the assembly to be indexed. The resulting indexes are broadly applicable for downstream analysis, so it makes sense to share them. However, there is no simple tool to do this.
Refgenie is a reference genome assembly asset manager. Refgenie makes it easier to organize, retrieve, and share genome analysis resources. In addition to genome indexes, refgenie can manage any files related to reference genomes, including sequences and annotation files. Refgenie includes a command line interface and a server application that provides a RESTful API, so it is useful for both tool development and analysis.

Read more →

projects bioinformatics, containers, cphg, docker, python
Bioinformatics and UVA HPC
Overview
Many commonly used bioinformatics software packages on the HPC clusters are available as individual modules or as Python packages bundled in the bioconda modules.
Please see our HowTo for more information about using this software on the HPC system.
Software Availability
If a particular package is not installed, you have several options. If it is sufficiently widely used, Research Computing staff will install it as a new module. If we determine that it is too specialized, you can install it yourself. Please use permanent storage such as your home directory to install software. If you have difficulty, we can help you install the package.

Read more →

HPC, software, bioinformatics bio, bioinformatics, computational-biology, docking, rosetta
NCBI Blast and UVA HPC
Description Basic Local Alignment Search Tool, or BLAST, is an algorithm
for comparing primary biological sequence information, such as the amino-acid
sequences of different proteins or the nucleotides of DNA sequences.
Software Category: bio
For detailed information, visit the NCBI Blast website.
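To give a feel for how a BLAST-style comparison works, here is a heavily simplified Python sketch of the "seed and extend" idea: find short exact word matches between a query and a subject sequence, then extend a match while the characters agree. This is an illustration under simplifying assumptions, not NCBI's implementation. Real BLAST scores matches and mismatches, extends in both directions, and reports statistics. The function names are hypothetical.

```python
from collections import defaultdict

def index_words(subject, k):
    """Map every length-k word in the subject to the positions where it starts."""
    index = defaultdict(list)
    for i in range(len(subject) - k + 1):
        index[subject[i:i + k]].append(i)
    return index

def find_seeds(query, subject, k):
    """Return (query_pos, subject_pos) pairs where a length-k word matches exactly."""
    index = index_words(subject, k)
    seeds = []
    for q in range(len(query) - k + 1):
        for s in index.get(query[q:q + k], []):
            seeds.append((q, s))
    return seeds

def extend_right(query, subject, q, s, k):
    """Grow an exact seed to the right, without gaps, while characters match."""
    length = k
    while (q + length < len(query) and s + length < len(subject)
           and query[q + length] == subject[s + length]):
        length += 1
    return length

query = "GATTACA"
subject = "TTGATTACGG"
seeds = find_seeds(query, subject, k=4)
first_q, first_s = seeds[0]
match_len = extend_right(query, subject, first_q, first_s, k=4)
print(seeds, query[first_q:first_q + match_len])
```

Indexing the subject once and looking up query words is what makes the seed step fast compared with aligning every position against every other position.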
Available Versions
The current installation of NCBI Blast incorporates the most popular packages. To find the available versions and learn how to load them, run:
    module spider blast
The output of the command shows the available NCBI Blast module versions.
For detailed information about a particular NCBI Blast module, including how to load the module, run the module spider command with the module’s full version label.

Read more →

HPC, software, bio bioinformatics, multi-core
Center for Diabetes Technology PriMed
In their research on continuous glucose monitoring and the automated maintenance of insulin for patients, the CDT is exploring data drawn from external data sources such as DexCom and FitBit. RC has assisted the CDT by designing a secure computing footprint in Amazon Web Services to pull in these data, then parse and process them for deeper analytics through machine learning. In January 2018, CDT sponsored a ski camp at Wintergreen Resort for a group of youth diagnosed with Type 1 diabetes with the goal of importing glucose, insulin, and exercise metrics at the end of each day through remote web APIs.

Read more →

projects bioinformatics, machine-learning
epihet
RC is working with researchers in the Center for Public Health Genomics to write an R package to calculate Relative Proportion of Sites with Intermediate Methylation (RPIM) scores, which represent the epigenetic heterogeneity in a bisulfite sequencing sample.
https://github.com/databio/epihet
PI: Nathan Sheffield (Center for Public Health Genomics)

Read more →

projects bioinformatics, cphg, r
Microbiome Analysis of Hospital Sink Drains
Sink drains are notorious reservoirs of pathogens that cause nosocomial transmission in hospitals worldwide. Outbreaks in which sinks have been implicated as a source of antibiotic-resistant bacteria have surged over the last few years. To understand transmission dynamics, the University of Virginia School of Medicine has established a unique “Sink Lab” for this research. This one-of-a-kind laboratory establishes UVa as a worldwide frontrunner in investigating sink-related antibiotic-resistant bacteria and how they spread. RC is working with the UVa Sink Lab on genomic analysis of the sink biomass.
RC is contributing to:
Comparative genomic analysis of gram-negative bacterial isolates:
The analysis aims to track the mobile genetic element carrying the blaKPC gene, which encodes the Klebsiella pneumoniae carbapenemase (KPC) enzyme that confers resistance to all beta-lactam agents, including penicillins, cephalosporins, monobactams and carbapenems.

Read more →

projects bioinformatics, sink-lab
simpleCache
In partnership with researchers in the Center for Public Health Genomics, School of Medicine Research Computing has contributed to the development of a novel package for computationally efficient caching and loading of data in R. simpleCache provides an interface to a series of functions to store and retrieve cached objects, including in the context of batch processing or HPC environments. The package further extends the base R functionality of saving and loading external representations of objects by enabling caching to pre-defined directories and timed cache operations.
RC helped document and develop new functions for the package ahead of its release to the Comprehensive R Archive Network (CRAN).
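The caching pattern described above (compute once, store the result in a known directory, reload it on later runs) can be illustrated with a short Python analog. This is not the simpleCache API, which is an R package. The function name `simple_cache`, its parameters, and the pickle-based storage are assumptions made purely for illustration.

```python
import os
import pickle
import tempfile

def simple_cache(name, compute, cache_dir, recreate=False):
    """Load the object stored as <cache_dir>/<name>.pkl, or compute and store it."""
    path = os.path.join(cache_dir, name + ".pkl")
    if os.path.exists(path) and not recreate:
        with open(path, "rb") as fh:
            return pickle.load(fh)      # cache hit: skip the computation
    result = compute()                  # cache miss: do the work once
    os.makedirs(cache_dir, exist_ok=True)
    with open(path, "wb") as fh:
        pickle.dump(result, fh)
    return result

calls = []

def slow_square_table():
    """Stand-in for an expensive computation; records each real invocation."""
    calls.append(1)
    return {n: n * n for n in range(5)}

with tempfile.TemporaryDirectory() as cache_dir:
    first = simple_cache("squares", slow_square_table, cache_dir)
    second = simple_cache("squares", slow_square_table, cache_dir)

print(first == second, len(calls))
```

The second call returns the stored object without re-running the computation, which is the behavior that makes this pattern useful in batch or HPC jobs where the same intermediate result is needed repeatedly.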

Read more →

projects bioinformatics, r
Bioinformatics Packages on Ivy Linux VM
Available Packages
The following bioinformatics packages are available on the Ivy Linux Virtual Machines:
Bowtie2
Bowtie2 is a memory-efficient tool for aligning short sequences to long reference genomes.
For bowtie2 usage information, please click [here](/userinfo/ivy/ivy-linux-sw/bioinformatics/bowtie2).
HISAT2
HISAT2 is a fast and sensitive tool for aligning short reads against the general human population (as well as against a single reference genome).
- Requires approval before installation
For HISAT2 usage information, please click here.
Read more →

userinfo bioinformatics, ivy, linux
Bioinformatics Packages on Windows VM
Available Packages
The following bioinformatics packages are available on the Windows Virtual Machines:
Bowtie2
For more information on bowtie2, please click [here](/userinfo/ivy/ivy-windows-sw/bioinformatics/bowtie2).
HISAT2
Requires approval before installation. For more information on HISAT2, please click here.

Read more →

userinfo Bioinformatics, Ivy, Windows
Bowtie2 on Ivy Linux VM
Bowtie2 is a memory-efficient tool for aligning short sequences to long reference genomes.
It indexes the genome using an FM index, which is based on the Burrows-Wheeler Transform,
to keep its memory footprint small. Bowtie2 supports gapped, local and paired-end alignment modes.
Alignment to a known reference using Bowtie2 is often an essential first step in a myriad of NGS analysis workflows.
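For readers curious about the indexing idea mentioned above, the following Python sketch builds a Burrows-Wheeler Transform and counts pattern occurrences with FM-index style backward search. It is a toy illustration, not Bowtie2's implementation: it sorts full rotations and recounts characters on every step, whereas a real FM index uses suffix arrays and precomputed, sampled occurrence tables. Function names are hypothetical.

```python
def bwt(text):
    """Burrows-Wheeler Transform via sorted rotations; '$' marks the end of the text."""
    text += "$"
    rotations = sorted(text[i:] + text[:i] for i in range(len(text)))
    return "".join(rotation[-1] for rotation in rotations)

def count_occurrences(bwt_text, pattern):
    """Count exact matches of pattern in the original text using backward search."""
    # first[c]: row of the first rotation that starts with character c
    first = {}
    for row, c in enumerate(sorted(bwt_text)):
        first.setdefault(c, row)

    def occ(c, i):
        # occurrences of c in bwt_text[:i]; a real index precomputes this
        return bwt_text[:i].count(c)

    lo, hi = 0, len(bwt_text)          # current range of matching rotations
    for c in reversed(pattern):        # process the pattern right to left
        if c not in first:
            return 0
        lo = first[c] + occ(c, lo)
        hi = first[c] + occ(c, hi)
        if lo >= hi:
            return 0
    return hi - lo

transformed = bwt("GATTACA")
print(transformed, count_occurrences(transformed, "TA"))
```

Because the search only touches the transformed string and small lookup tables, the reference itself never has to be scanned per read, which is what keeps the memory and time costs low.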
Bowtie2 Usage
Alignment using bowtie2 is a 2-step process: indexing the reference genome, followed by aligning the sequence data.
Create indexes of your reference genome of interest stored in the reference.fasta file:
    bowtie2-build [option(s)] <reference.fasta> <bt2-index-basename>
This will create new files with the provided basename and extensions .

Read more →

userinfo bioinformatics, linux, software
Bowtie2 on Ivy Windows VM
Bowtie2 is a memory-efficient tool for aligning short sequences to long reference genomes.
It indexes the genome using an FM index, which is based on the Burrows-Wheeler Transform,
to keep its memory footprint small. Bowtie2 supports gapped, local and paired-end alignment modes.
Alignment to a known reference using Bowtie2 is often an essential first step in a myriad of NGS analysis workflows.
Bowtie2 Usage
Alignment using bowtie2 is a 2-step process: indexing the reference genome, followed by aligning the sequence data.
Create indexes of your reference genome of interest stored in the reference.fasta file:
    bowtie2-build [option(s)] <reference.fasta> <bt2-index-basename>
This will create new files with the provided basename and extensions .

Read more →

userinfo bioinformatics, software, windows
HISAT2 on Ivy Linux VM
- Please note that HISAT2 requires approval prior to installation on the VM.
HISAT2 is a fast and sensitive tool for aligning short reads against the general human population
(as well as against a single reference genome). It indexes the genome using a Hierarchical Graph FM Index
(HGFM) strategy, i.e. a large set of small indexes that collectively cover the whole genome
(each index representing a genomic region of 56 Kbp).
HISAT2 Usage
Alignment using HISAT2 is a 2-step process: indexing the reference genome, followed by aligning the sequence data.
Create indexes of your reference genome of interest stored in the reference.fasta file:
Read more →

userinfo bioinformatics, linux, software
HISAT2 on Ivy Windows VM
- Please note that HISAT2 requires approval prior to installation on the VM.
HISAT2 is a fast and sensitive tool for aligning short reads against the general human population
(as well as against a single reference genome). It indexes the genome using a Hierarchical Graph FM Index
(HGFM) strategy, i.e. a large set of small indexes that collectively cover the whole genome
(each index representing a genomic region of 56 Kbp).
HISAT2 Usage
Alignment using HISAT2 is a 2-step process: indexing the reference genome, followed by aligning the sequence data.
Create indexes of your reference genome of interest stored in the reference.fasta file:
Read more →

userinfo bioinformatics, software, windows
