Research Computing accounts and storage

December 10th, 2018
By Gitzendanner, Matt

Request an Account

Once you have a Gatorlink, you can request an account on the form here: https://www.rc.ufl.edu/access/account-request/

List Pam as your sponsor.

Storage

/ufrc/soltis

  • We currently have 50TB of space
    • We will likely not add more; /ufrc storage is $140/TB/yr!
  • Each user has their own directory
  • You can check your usage with the blue_quota command in the ufrc module.
  • The /ufrc/soltis/share/ folder can be used for sharing, common data, etc.
  • If you have important data that you need regular access to and that should be backed up, keep it in /ufrc/soltis/share/Soltis_Backup/. This is the only location on /ufrc/soltis that is backed up in any way.

/orange/soltis

  • The /orange filesystem is somewhat slower than /ufrc, so it is not the best place for active data used in running jobs.
  • We currently have 84TB of space.
  • Each user can create their own directory in /orange/soltis/<gatorlink>.
  • You can check your usage with the orange_quota command in the ufrc module.
  • This is a great place to archive data you are not actively working on.
  • It is much cheaper than /ufrc–only $25/TB/yr.
  • One copy of all raw sequence data should go in /orange/soltis/SequenceBackup.
    • This folder is backed up nightly.
    • When you receive sequence data, let Matt know where it is located and he will copy it there.
    • Data in this folder should be read only.
    • Please provide a README file that describes:
      • Taxa and herbarium voucher information
      • Library prep type and date
      • Sequencing type and date
      • Barcode information as applicable
      • Other information that will be helpful to someone trying to reuse your data or to you when you go to submit your data to the SRA.
  • If you have important data that you do not access regularly, you can put it in /orange/soltis/Backup_and_archive.
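The README fields listed above for sequence data can be sketched as a small template generator. This is only an illustration: the function name write_readme_template and the plain-text layout are assumptions, not a lab standard, so adjust the headings to suit your data.

```shell
# Write a README skeleton covering the fields requested for sequence data.
# The function name and the layout of the file are illustrative only.
write_readme_template() {
  local dir="${1:-.}"   # target directory, defaults to the current one
  cat > "${dir}/README" <<'EOF'
Taxa and herbarium voucher information:

Library prep type and date:

Sequencing type and date:

Barcode information (if applicable):

Other notes (anything useful for reuse or SRA submission):

EOF
}
```

Run it as write_readme_template /path/to/your/run (a placeholder path) and fill in each heading before letting Matt know where the data are.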

Storage Backup

As noted above, the ONLY places that are backed up in any way are:

  • /ufrc/soltis/share/Soltis_Backup/: Use for important data that you need regular access to and that should be backed up.
  • /orange/soltis/Backup_and_archive/: Use for important data that you do not access regularly.
  • /home: Research Computing keeps daily snapshots of your home directory for one week. See this video for accessing your snapshots.
  • /orange/soltis/SequenceBackup: For archiving raw sequence data. Should be set to read only to minimize accidental data changes.

Running jobs

Most jobs on HiPerGator are submitted to the scheduler to run.

See the UFRC Wiki page with sample SLURM scripts for examples of different types of job scripts.

When requesting resources for your job, it is important to keep in mind that we all share the same resources and there are limits to what is available.

slurmInfo

To view current limits and usage, there is a command in the ufrc module called slurmInfo. Here’s an example of how to run it:

[magitz@login1 ~]$ module load ufrc
[magitz@login1 ~]$ slurmInfo -g soltis -u

----------------------------------------------------------------------
Allocation summary:    Time Limit             Hardware Resources
   Investment QOS           Hours          CPU     MEM(GB)     GPU
----------------------------------------------------------------------
           soltis             744          102         358       0
----------------------------------------------------------------------
CPU/MEM Usage:              Running          Pending        Total
                          CPU  MEM(GB)    CPU  MEM(GB)    CPU  MEM(GB)
----------------------------------------------------------------------
   Investment (soltis):    19       72      0        0     19       72
     Burst* (soltis-b):   192     1125      0        0    192     1125
----------------------------------------------------------------------
Individual usage:
  Investment (soltis)
          pamarasinghe:     9       21      0        0      9       21
              achander:     8       48      0        0      8       48
                g.chen:     2        3      0        0      2        3
  Burst (soltis-b)
              achander:   192     1125      0        0    192     1125
----------------------------------------------------------------------
HiPerGator Utilization
               CPU: Used (%) / Total         MEM(GB): Used (%) / Total
----------------------------------------------------------------------
        Total :  39982 (84%) / 47236      93151772 (47%) /   195538985
----------------------------------------------------------------------
* Burst QOS uses idle cores at low priority with a 4-day time limit

[magitz@login1 ~]$ 

Investment details

The output above shows that, when this was run, the lab had 102 cores and 358 GB of RAM in its investment. Of that, 19 cores and 72 GB of RAM were in use.
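As a quick worked example, the free portion of the investment can be computed from the numbers in the slurmInfo output above. The values are copied by hand from that output; the variable names are just for illustration.

```shell
# Numbers copied from the slurmInfo output shown above.
invest_cpu=102   # cores in the soltis investment
invest_mem=358   # GB of RAM in the soltis investment
used_cpu=19      # cores in use by running investment jobs
used_mem=72      # GB of RAM in use by running investment jobs

free_cpu=$((invest_cpu - used_cpu))
free_mem=$((invest_mem - used_mem))
echo "Free in investment: ${free_cpu} cores, ${free_mem} GB RAM"
```

With these numbers, 83 cores and 286 GB of RAM were still available to the group when the command was run.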

When you submit a job, you need to request cores (usually with the --cpus-per-task flag) and RAM (usually with the --mem flag). Please do not request far more cores or RAM than your job will actually use; this wastes resources and prevents others from doing work. See the Memory Requests section below for an example of how we frequently request more RAM than needed.

Everyone is sharing these resources. Please send an email to the lab listserv if you plan on using a large fraction of the available resources.

Burst QOS

In addition to the investment, we can make use of idle resources on the cluster using the burst QOS. Jobs are limited to 4 days (96 hours) and may take longer to run, but we have 9X more resources available (both CPUs and RAM). Large jobs should try to use these resources, especially if they will finish in under 4 days.

To submit a job to the burst QOS, add: #SBATCH --qos=soltis-b to your job script or submit your job with
sbatch --qos=soltis-b myscript.sh.
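A minimal sketch of a complete burst job script might look like the following. The job name, core count, and memory request are placeholder values chosen for illustration; only the --qos line and the 96-hour limit come from the text above.

```shell
#!/bin/bash
# Hypothetical job name; pick something meaningful.
#SBATCH --job-name=burst_example
# Run under the group's burst QOS.
#SBATCH --qos=soltis-b
#SBATCH --nodes=1
#SBATCH --ntasks=1
# Placeholder resource requests; size these to your job.
#SBATCH --cpus-per-task=4
#SBATCH --mem=8G
# Burst jobs are limited to 4 days (96 hours).
#SBATCH --time=96:00:00

# SLURM sets SLURM_CPUS_PER_TASK inside a job; default to 1 elsewhere.
threads="${SLURM_CPUS_PER_TASK:-1}"
echo "Burst job started on $(hostname) with ${threads} core(s)"
# Replace the echo above with your actual command.
```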

Memory Requests

Our group is commonly limited more by memory than CPUs. This is partially because many of our applications use lots of RAM. But this is also because we sometimes are not careful in specifying reasonable memory requests. As an example, here’s a summary of the jobs run in August and September of 2019:

Group  |     1x |     2x |    10x |   100x |  1000x | 10000x |  Total
------ | ------ | ------ | ------ | ------ | ------ | ------ | ------
soltis |     30 |    482 |   4365 |    196 |   2063 |      1 |   7138

This table shows the number of jobs that requested a given fold more memory than was actually used. So, 2,064 jobs (~29%) requested over 1,000 times more RAM than they actually used!
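The percentage quoted above can be checked with a few lines of shell arithmetic. The helper name pct is just for illustration; the job counts are taken from the table.

```shell
# pct PART TOTAL -> integer percentage, rounded to the nearest whole number
pct() {
  echo $(( ($1 * 100 + $2 / 2) / $2 ))
}

# Jobs in the 1000x and 10000x columns of the table.
over_1000x=$((2063 + 1))
echo "${over_1000x} of 7138 jobs = $(pct "$over_1000x" 7138)%"
```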

When people have jobs pending because we are limited by RAM, having a job running that is using a tiny fraction of the RAM set aside for it is just wasteful! Yes, you do need to make sure you request more than you need, but 2X is more than enough! 10X is already excessive! Overall, I would read this to suggest that 93% of our jobs are inefficient when it comes to memory requests!
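One way to act on this is to compare what a finished job requested with what it actually used, then size the next request at no more than about 2X the observed peak. The sketch below assumes you already have both numbers in MB (for example, from the ReqMem and MaxRSS columns of sacct); the function names and the example values are hypothetical.

```shell
# Peak usage of a finished job can usually be read with something like:
#   sacct -j <jobid> --format=JobID,ReqMem,MaxRSS
# The helpers below only do the arithmetic on numbers you supply.

# fold_over REQUESTED_MB USED_MB -> how many times more was requested than used
fold_over() {
  awk -v req="$1" -v used="$2" 'BEGIN { printf "%.1f\n", req / used }'
}

# suggested_mb USED_MB -> 2X the observed peak, the upper bound suggested above
suggested_mb() {
  echo $(( $1 * 2 ))
}

# Hypothetical job: 16000 MB requested, 400 MB actually used.
fold_over 16000 400
suggested_mb 400
```

In this made-up case the job requested 40 times more RAM than it used, and a request of about 800 MB would have been plenty.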

CPU requests

Serial jobs

Many applications and most Python and R scripts will only use a single core. Please do not request more cores than your job will use.

Threaded applications

Many applications, like assemblers and phylogenetics applications, can use multiple cores, but all of those cores need to be on the same physical server (node). For jobs like this, the resource request should be similar to:

#SBATCH --nodes 1
#SBATCH --ntasks 1
#SBATCH --cpus-per-task 8

This would provide 8 cores for your job to run. Most applications also need to be told how many cores to use. Be sure to specify this. It may be easiest to use the $SLURM_CPUS_ON_NODE variable for this.
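A minimal sketch of passing the allocated core count to an application is shown here. The program my_assembler, its --threads flag, and input.fasta are hypothetical stand-ins; check your application's manual for its real thread option (blastn, for instance, uses -num_threads).

```shell
# Number of cores SLURM allocated on this node; fall back to 1 outside a job.
get_threads() {
  echo "${SLURM_CPUS_ON_NODE:-1}"
}

threads="$(get_threads)"

# my_assembler is a hypothetical program, so the command is only printed here.
# In a real job script, run your application with its own thread flag instead.
echo "my_assembler --threads ${threads} input.fasta"
```

Using the variable rather than a hard-coded number keeps the application's thread count in step with the #SBATCH request when you change it.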

Please make sure to test that your application will use the cores efficiently. My favorite example of this is blastn. Under many conditions, it actually goes slower with more cores–using more than one not only wastes resources, but slows down searches! Ask people, read manuals, check for yourself!
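Checking for yourself can be as simple as timing the same small test run at several core counts. The sketch below is one rough way to do that: run_with_threads is a placeholder you would replace with your own command, and the file names in the comment are hypothetical.

```shell
# Placeholder workload: replace the body with your real command, passing "$1"
# as the thread count, for example:
#   blastn -num_threads "$1" -query test.fa -db testdb -out /dev/null
# (test.fa and testdb are hypothetical names for a small test data set).
run_with_threads() {
  sleep 0
}

# bench_threads N [N ...] -> wall-clock seconds for each thread count
bench_threads() {
  local n start end
  for n in "$@"; do
    start=$(date +%s)
    run_with_threads "$n"
    end=$(date +%s)
    echo "${n} threads: $((end - start)) s"
  done
}

bench_threads 1 2 4 8
```

If the time stops dropping (or goes up) as the thread count grows, request only as many cores as actually help. Run a test like this inside a job that has requested the largest core count you are testing, not on a login node.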

MPI applications

There are relatively few of these in our research. RAxML-ng is one exception–please be sure to read: https://github.com/amkozlov/raxml-ng/wiki/Parallelization
Matt is also happy to help with job scripts for this application, as there are some counterintuitive settings that dramatically impact performance. Doing some preliminary testing is important!


