Rivanna will be down for maintenance on Tuesday, October 3, 2023, beginning at 6 a.m.
You may continue to submit jobs until the maintenance period begins, but if the system determines your job will not have time to finish, it will not start until Rivanna is returned to service. All systems are expected to return to service by 6 a.m. on Wednesday, October 4.
IMPORTANT MAINTENANCE NOTES
RC engineers will be adding 36 nodes, each with 40 cores and 750 GB total memory, to the largemem partition on Rivanna. Jobs that need more than 9 GB of memory per core should be submitted to the largemem partition rather than the standard partition. Some examples are given below.
I need 4 cores and 100 GB memory. Since this amounts to 25 GB memory per core, the job should be submitted to largemem with an explicit memory request.
#SBATCH -p largemem
#SBATCH -c 4
#SBATCH --mem=100G
I need 10 cores and 50 GB memory. Since this amounts to 5 GB memory per core, the job should be submitted to standard without specific memory requests. By default 9 GB per core will be allocated.
#SBATCH -p standard
#SBATCH -c 10
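The rule behind the two examples above can be sketched as a small shell helper. This is only an illustration: the function name choose_partition is hypothetical, it takes whole-number GB values, and it uses the 9 GB per core threshold stated in this announcement.

```shell
#!/bin/bash
# Hypothetical helper: pick a partition from a core count and a total
# memory request in whole GB, using the 9 GB per core threshold above.
choose_partition() {
  local cores=$1
  local mem_gb=$2
  # More than 9 GB per core is the same as mem_gb > 9 * cores.
  if [ "$mem_gb" -gt $((9 * cores)) ]; then
    echo largemem
  else
    echo standard
  fi
}

choose_partition 4 100   # 100 GB > 36 GB, prints "largemem"
choose_partition 10 50   # 50 GB <= 90 GB, prints "standard"
```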
I am not sure how much memory I need. First submit the job to the standard partition without specific memory requests. If the job runs out of memory, resubmit to the largemem partition. To check the memory usage of a completed job, you may either run the seff command or add mail notification options to your Slurm script and check the report in your email.
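A minimal sketch of both options follows. The exact lines from the original notice are not reproduced here, so treat the details as assumptions: --mail-type and --mail-user are standard sbatch options, the address is a placeholder, and the job ID passed to seff is whatever sbatch reported at submission.

```shell
#!/bin/bash
#SBATCH -p standard
#SBATCH -c 10
# Assumed notification settings: send mail when the job ends.
#SBATCH --mail-type=END
# Placeholder address; replace with your own.
#SBATCH --mail-user=your_id@example.edu

# Alternatively, after the job has completed, run on a login node:
#   seff <jobid>
# where <jobid> is the ID printed by sbatch when the job was submitted.
```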
NVIDIA driver upgrade and modules
The NVIDIA driver will be upgraded to version 535.104.12 (CUDA 12.2). The default CUDA module version will remain at 11.4.2. New modules will be added:
The corresponding Jupyter kernels for PyTorch and TensorFlow will be provided as well.
AlphaFold versions 2.1.2 and 2.2.2, along with their corresponding database, will be removed. The 2.3 database will be migrated off of the current /project storage, and the ALPHAFOLD_DATA_PATH environment variable will be updated accordingly.
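Because the database location will change, scripts that read ALPHAFOLD_DATA_PATH instead of a hard-coded path should keep working after the migration. The check below is a sketch only: the function name check_alphafold_db is hypothetical, and it assumes the variable is set in your job environment (for example, by loading the AlphaFold module).

```shell
#!/bin/bash
# Hypothetical helper: confirm ALPHAFOLD_DATA_PATH is set and points to
# an existing directory, then print it for use in later commands.
check_alphafold_db() {
  if [ -z "${ALPHAFOLD_DATA_PATH:-}" ]; then
    echo "ALPHAFOLD_DATA_PATH is not set" >&2
    return 1
  fi
  if [ ! -d "$ALPHAFOLD_DATA_PATH" ]; then
    echo "ALPHAFOLD_DATA_PATH is not a directory: $ALPHAFOLD_DATA_PATH" >&2
    return 1
  fi
  echo "$ALPHAFOLD_DATA_PATH"
}

# Example use inside a job script (commented out so nothing runs here):
# db_dir=$(check_alphafold_db) || exit 1
```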
QGIS (Open OnDemand) will be upgraded to 3.28.10.
Old scratch permanently retired on October 17
A reminder that the old scratch filesystem (/gpfs/gpfs0/scratch) will be permanently retired on October 17 and all the data it contains will be deleted. A sample script for users who wish to transfer files to the new /scratch system can be found here.
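For orientation, a transfer could look roughly like the sketch below. This is not the RC sample script: the function name transfer_dir is hypothetical, and the per-user subdirectory layout under each scratch path is an assumption you should verify before running anything.

```shell
#!/bin/bash
# Hypothetical helper: copy the contents of one directory tree into
# another, preserving permissions and timestamps.
transfer_dir() {
  local src=$1
  local dst=$2
  if [ ! -d "$src" ]; then
    echo "source directory not found: $src" >&2
    return 1
  fi
  mkdir -p "$dst" || return 1
  if command -v rsync >/dev/null 2>&1; then
    # rsync can be rerun safely; it skips files that are already copied.
    rsync -a "$src"/ "$dst"/
  else
    # Fallback when rsync is unavailable.
    cp -a "$src"/. "$dst"/
  fi
}

# Assumed layout; uncomment after checking both paths:
# transfer_dir "/gpfs/gpfs0/scratch/$USER" "/scratch/$USER"
```

Check that the copied data is complete before October 17, since the old filesystem will be deleted after that date.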
If you have any questions or concerns about the maintenance period, you may contact us here.