Request an Account

Once you have a GatorLink, you can request an account using the form here: https://www.rc.ufl.edu/get-started/hipergator/request-hipergator-account/

List Pam as your sponsor.

Storage

/blue/soltis

  • We currently have ~50TB of space
    • We are unlikely to add more, since /blue storage costs $140/TB/yr!
  • This is primarily for current, active projects. Move files to /orange/soltis if you will not be working on a project for more than a few months.
  • Each user has their own directory
  • You can check your usage with the blue_quota command
  • The /blue/soltis/share/ folder can be used for sharing, common data, etc.
  • If you have important data that you need regular access to and that should be backed up, keep it in /blue/soltis/share/Soltis_Backup/. This is the only location on /blue/soltis that is backed up in any way.

/orange/soltis

  • The /orange filesystem is somewhat slower than /blue, so it is not ideal for active data used by running jobs.
  • We currently have ~150TB of space.
  • Each user can create their own directory in /orange/soltis/<gatorlink>.
  • You can check your usage with the orange_quota command.
  • This is a great place to archive data you are not actively working on.
  • It is much cheaper than /blue: only $25/TB/yr.
  • One copy of all raw sequence data should go in /orange/soltis/SequenceBackup.
    • This folder is backed up 2-3 times a week.
    • When you receive sequence data, let Matt know where it is located and he will copy it there.
    • Data in this folder should be read only.
    • Please provide a README file that describes:
      • Taxa and herbarium voucher information
      • Library prep type and date
      • Sequencing type and date
      • Barcode information as applicable
      • Other information that will be helpful to someone trying to reuse your data or to you when you go to submit your data to the SRA.
  • If you have important data that you do not access regularly, you can put it in /orange/soltis/Backup_and_archive.
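As a sketch of what a SequenceBackup delivery might look like, the commands below create a data folder with a README covering the points above and then mark it read-only. All names, taxa, and dates here are placeholders, not real data.

```shell
# Hypothetical example of documenting a new sequence data delivery.
# The folder name and all README contents are placeholders.
DATADIR="demo_run_2019-08-15"
mkdir -p "$DATADIR"

# Write a README covering the points listed above.
cat > "$DATADIR/README" <<'EOF'
Taxa / vouchers: Tragopogon dubius, voucher FLAS 123456 (placeholder)
Library prep:    RNA-seq, prepared 2019-08-01 (placeholder)
Sequencing:      Illumina paired-end, run 2019-08-15 (placeholder)
Barcodes:        i7 index ATCACG (placeholder)
Notes:           Intended for SRA submission; see project repo.
EOF

# Make the delivery read-only to minimize accidental changes.
chmod -R a-w "$DATADIR"
```

A README like this is cheap to write when the data arrive and saves a lot of reconstruction later, especially at SRA submission time.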

Suggested file/folder organization

A few suggestions for organizing files and folders:

  • One folder per project. When you leave, I will archive your data by compressing each folder in /blue/soltis/user. If you organize data into projects, one compressed archive corresponds to one project.
  • Use dates in filenames, ISO format is preferred: YYYY-MM-DD
  • No spaces or special characters in filenames.
  • Add a README to each folder explaining the contents, work done, links to git repos, etc.
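The suggestions above can be sketched in a few shell commands. The project name, subfolder layout, and repository URL are placeholders; the point is the pattern, not the specific names.

```shell
# Hypothetical project layout following the suggestions above.
PROJECT="tragopogon_polyploidy"   # one folder per project (placeholder name)
mkdir -p "$PROJECT/data" "$PROJECT/scripts" "$PROJECT/results"

# ISO-format date (YYYY-MM-DD) in a filename; no spaces or special characters:
touch "$PROJECT/results/assembly_stats_$(date +%F).txt"

# A README in each folder explaining the contents:
cat > "$PROJECT/README" <<'EOF'
Project: Tragopogon polyploidy (placeholder)
Contents: data/ raw inputs, scripts/ analysis code, results/ outputs
Code: https://github.com/example/placeholder-repo
EOF
```

Organizing this way from the start means the later "compress one folder per project" archiving step needs no reshuffling.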

Storage Backup

As noted above, the ONLY places that are backed up in any way are:

  • /blue/soltis/share/Soltis_Backup/: Use for important data that you need regular access to and that should be backed up.
  • /orange/soltis/Backup_and_archive/: Use for important data that you do not access regularly.
  • /home: Research Computing does maintain a daily snapshot of your home directory for one week. See this video for accessing your snapshots.
  • /orange/soltis/SequenceBackup: For archiving raw sequence data. Should be set to read only to minimize accidental data changes.

When you leave the lab

  • When you leave, we can keep your GatorLink active by requesting an affiliation through Research Computing. Please work with Matt to take care of this if you think you will need access to HiPerGator for more than a few months after you leave.
  • Please clean up your data! Delete files you no longer need, and organize what you do want to keep. Make sure others will be able to understand what each folder contains and where your data are.
  • Move files to /orange/soltis/former_members/ and compress them. Ask Matt for help.
    • If Matt does this, it will be done using the command: tar cjvf /orange/soltis/former_members/$user/$outname.tar.bz2 $outname
    • Where $user is your username and $outname is the folder name.
    • To decompress, use tar xjvf file.tar.bz2 (modern versions of tar will also detect the compression automatically with tar xvf).
  • Expect that even if you maintain an active GatorLink, at some point Matt will archive your data! We typically do not delete things, but each folder in your /blue/soltis/gatorlink folder will be compressed and archived in /orange/soltis/former_members/gatorlink. We simply cannot afford to keep data in /blue forever if it is not actively being used. Current lab members need that space for active research. Please help by taking care of this before you leave!
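The compress/decompress cycle described above can be tried end to end in a scratch directory. Here $outname is a placeholder folder rather than a real project, and the archive is extracted into a separate location to verify it:

```shell
# Hypothetical example of the archiving cycle described above.
# In real use, $outname is the project folder being archived.
outname="example_project"
mkdir -p "$outname"
echo "placeholder data" > "$outname/data.txt"

# Compress (same form as the archiving command above):
tar cjvf "$outname.tar.bz2" "$outname"

# Decompress into a new location to verify the archive:
mkdir -p restore
tar xjvf "$outname.tar.bz2" -C restore
```

Verifying that an archive extracts cleanly before deleting the original folder is cheap insurance.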


Running jobs

Most jobs on HiPerGator are submitted to the scheduler to run.

See the UFRC Wiki page with sample SLURM scripts for examples of different types of job scripts.

When requesting resources for your job, it is important to keep in mind that we all share the same resources and there are limits to what is available.

slurmInfo

To view current limits and usage, the slurmInfo command is helpful. Here’s an example of how to run it:

[magitz@login1 ~]$ slurmInfo -g soltis -u

----------------------------------------------------------------------
Allocation summary:    Time Limit             Hardware Resources
   Investment QOS           Hours          CPU     MEM(GB)     GPU
----------------------------------------------------------------------
           soltis             744          102         358       0
----------------------------------------------------------------------
CPU/MEM Usage:              Running          Pending        Total
                          CPU  MEM(GB)    CPU  MEM(GB)    CPU  MEM(GB)
----------------------------------------------------------------------
   Investment (soltis):    19       72      0        0     19       72
     Burst* (soltis-b):   192     1125      0        0    192     1125
----------------------------------------------------------------------
Individual usage:
  Investment (soltis)
          pamarasinghe:     9       21      0        0      9       21
              achander:     8       48      0        0      8       48
                g.chen:     2        3      0        0      2        3
  Burst (soltis-b)
              achander:   192     1125      0        0    192     1125
----------------------------------------------------------------------
HiPerGator Utilization
               CPU: Used (%) / Total         MEM(GB): Used (%) / Total
----------------------------------------------------------------------
        Total :  39982 (84%) / 47236      93151772 (47%) /   195538985
----------------------------------------------------------------------
* Burst QOS uses idle cores at low priority with a 4-day time limit

[magitz@login1 ~]$ 

Investment details

The output above shows that, when this was run, the lab had 102 cores and 358 GB of RAM in its investment. Of that, 19 cores and 72 GB of RAM were in use.

When you submit a job, you need to request cores (usually with the --cpus-per-task flag) and RAM (usually with the --mem flag). Please do not request far more cores or RAM than your job will actually use: this wastes resources and prevents others from doing work. See the Memory Requests section below for how frequently we request more RAM than needed.

Everyone is sharing these resources. Please send an email to the lab listserve if you plan on using a large fraction of the available resources.

Burst QOS

In addition to the investment, we can make use of idle resources on the cluster using the burst QOS. Burst jobs are limited to 4 days (96 hours) and may wait longer to start, but we have 9X more resources available (both CPUs and RAM). Large jobs should try to use these resources, especially if they will run for less than 4 days.

To submit a job to the burst QOS, add #SBATCH --qos=soltis-b to your job script, or submit your job with
sbatch --qos=soltis-b myscript.sh.
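A minimal burst-QOS job script might look like the sketch below. The job name, module, and script are placeholders; only the --qos and --time lines are specific to the burst QOS.

```shell
#!/bin/bash
#SBATCH --job-name=burst_example
#SBATCH --qos=soltis-b          # run in the burst QOS instead of the investment
#SBATCH --cpus-per-task=1
#SBATCH --mem=2gb
#SBATCH --time=96:00:00         # burst jobs are capped at 4 days

# Placeholder workload; replace with your actual commands.
module load python              # hypothetical module name
python myscript.py
```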

Memory Requests

Our group is commonly limited more by memory than CPUs. This is partially because many of our applications use lots of RAM. But this is also because we sometimes are not careful in specifying reasonable memory requests. As an example, here’s a summary of the jobs run in August and September of 2019:

Group  |     1x |     2x |    10x |   100x |  1000x | 10000x |  Total
------ | ------ | ------ | ------ | ------ | ------ | ------ | ------
soltis |     30 |    482 |   4365 |    196 |   2063 |      1 |   7138

This table shows the number of jobs that requested a given fold more memory than was actually used. So, 2,064 jobs (~29%) requested over 1,000 times more RAM than they actually used!

When people have jobs pending because we are limited by RAM, a running job that uses only a tiny fraction of the RAM set aside for it is just wasteful! Yes, you do need to request somewhat more than you expect to use, but 2X is more than enough, and 10X is already excessive! Overall, I read this table as suggesting that 93% of our jobs are inefficient in their memory requests!
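One way to calibrate your memory requests is to compare what a finished job asked for against what it actually used. A sketch using SLURM's accounting tool (12345 is a placeholder job ID):

```shell
# After a job finishes, compare requested memory (ReqMem) with the peak
# actually used (MaxRSS). 12345 is a placeholder job ID.
sacct -j 12345 --format=JobID,JobName,ReqMem,MaxRSS,Elapsed,State
```

If MaxRSS is far below ReqMem across several runs, lower your --mem request for the next round of jobs.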

CPU requests

Serial jobs

Many applications and most Python and R scripts will only use a single core. Please do not request more cores than your job will use.

Threaded applications

Many applications, like assemblers and phylogenetics programs, can use multiple cores, but the cores all need to be on the same physical server (node). For jobs like this, the resource request should be similar to:

#SBATCH --nodes 1
#SBATCH --ntasks 1
#SBATCH --cpus-per-task 8

This would provide 8 cores for your job to run. Most applications also need to be told how many cores to use. Be sure to specify this. It may be easiest to use the $SLURM_CPUS_ON_NODE variable for this.
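Putting this together, a threaded job script might look like the sketch below. The module name and application are hypothetical; the point is passing $SLURM_CPUS_ON_NODE to the application rather than hard-coding a thread count.

```shell
#!/bin/bash
#SBATCH --nodes=1
#SBATCH --ntasks=1
#SBATCH --cpus-per-task=8
#SBATCH --mem=8gb
#SBATCH --time=24:00:00

module load my_assembler        # hypothetical module name

# Tell the application how many cores it was given, so the thread count
# never drifts out of sync with the SLURM request.
my_assembler --threads "$SLURM_CPUS_ON_NODE" input.fastq   # hypothetical application
```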

Please make sure to test that your application will use the cores efficiently. My favorite example of this is blastn. Under many conditions, it actually runs slower with more cores; using more than one not only wastes resources, but slows down searches! Ask people, read manuals, check for yourself!

MPI applications

There are relatively few of these in our research. RAxML-ng is one exception; please be sure to read: https://github.com/amkozlov/raxml-ng/wiki/Parallelization
Matt is also happy to help with job scripts for this application, as there are some counterintuitive settings that dramatically impact performance. Doing some preliminary testing is important!