Research Computing accounts and storage

December 10th, 2018

Request an Account

Once you have a Gatorlink, you can request an account on the form here: https://www.rc.ufl.edu/access/account-request/

List Pam as your sponsor.

Storage

/ufrc/soltis

  • We currently have 50TB of space
    • We will likely not add more; /ufrc storage is $140/TB/yr!
  • Each user has their own directory
  • You can check your usage with the blue_quota command in the ufrc module.
  • The /ufrc/soltis/share/ folder can be used for sharing, common data, etc.
  • If you have important data that you need regular access to and that should be backed up, keep it in /ufrc/soltis/share/Soltis_Backup/. This is the only location on /ufrc/soltis that is backed up in any way.
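
For example, to check the group's usage from a login node (a sketch; the exact output format may vary):

```shell
# Load the Research Computing environment module, then check the
# group's /ufrc usage against our quota
module load ufrc
blue_quota
```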

/orange/soltis

  • The /orange filesystem is somewhat slower than /ufrc, so it is not the best place for active data used in running jobs.
  • We currently have 84TB of space.
  • Each user can create their own directory in /orange/soltis/<gatorlink>.
  • You can check your usage with the orange_quota command in the ufrc module.
  • This is a great place to archive data you are not actively working on.
  • It is much cheaper than /ufrc: only $25/TB/yr.
  • One copy of all raw sequence data should go in /orange/soltis/SequenceBackup.
    • This folder is backed up nightly.
    • When you receive sequence data, let Matt know where it is located and he will copy it there.
    • Data in this folder should be read only.
    • Please provide a README file that describes:
      • Taxa and herbarium voucher information
      • Library prep type and date
      • Sequencing type and date
      • Barcode information as applicable
      • Other information that will be helpful to someone trying to reuse your data or to you when you go to submit your data to the SRA.
  • If you have important data that you do not access regularly, you can put it in /orange/soltis/Backup_and_archive.
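
A README for a sequencing run might look something like this (all values below are made-up placeholders):

```
Taxa / vouchers:  Genus species, voucher FLAS 000000 (placeholder)
Library prep:     (prep type), prepared (date)
Sequencing:       (platform and run type), sequenced (date)
Barcodes:         sample1 = (barcode), sample2 = (barcode)
Notes:            anything useful for reuse, or for your eventual SRA submission
```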

Storage Backup

As noted above, the ONLY places that are backed up in any way are:

  • /ufrc/soltis/share/Soltis_Backup/: Use for important data that you need regular access to and that should be backed up.
  • /orange/soltis/Backup_and_archive/: Use for important data that you do not access regularly.
  • /home: Research Computing does maintain a daily snapshot of your home directory for one week. See this video for accessing your snapshots.
  • /orange/soltis/SequenceBackup: For archiving raw sequence data. Should be set to read only to minimize accidental data changes.

Running jobs

Most jobs on HiPerGator are submitted to the scheduler to run.

See the UFRC Wiki page with sample SLURM scripts for examples of different types of job scripts.
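
For reference, a minimal serial job script looks something like this (job name, module, and script name are placeholders; the wiki templates are authoritative):

```shell
#!/bin/bash
#SBATCH --job-name=my_analysis        # name shown in the queue
#SBATCH --ntasks=1                    # one task
#SBATCH --cpus-per-task=1             # single core (serial job)
#SBATCH --mem=2gb                     # memory request
#SBATCH --time=04:00:00               # wall-time limit (hh:mm:ss)
#SBATCH --output=my_analysis_%j.log   # %j expands to the job ID

module load python                    # placeholder; load what your job needs
python myscript.py
```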

When requesting resources for your job, it is important to keep in mind that we all share the same resources and there are limits to what is available.

slurmInfo

To view current limits and usage, there is a command in the ufrc module called slurmInfo. Here’s an example of how to run it:

[magitz@login1 ~]$ module load ufrc
[magitz@login1 ~]$ slurmInfo -g soltis -u

----------------------------------------------------------------------
Allocation summary:    Time Limit             Hardware Resources
   Investment QOS           Hours          CPU     MEM(GB)     GPU
----------------------------------------------------------------------
           soltis             744          102         358       0
----------------------------------------------------------------------
CPU/MEM Usage:              Running          Pending        Total
                          CPU  MEM(GB)    CPU  MEM(GB)    CPU  MEM(GB)
----------------------------------------------------------------------
   Investment (soltis):    19       72      0        0     19       72
     Burst* (soltis-b):   192     1125      0        0    192     1125
----------------------------------------------------------------------
Individual usage:
  Investment (soltis)
          pamarasinghe:     9       21      0        0      9       21
              achander:     8       48      0        0      8       48
                g.chen:     2        3      0        0      2        3
  Burst (soltis-b)
              achander:   192     1125      0        0    192     1125
----------------------------------------------------------------------
HiPerGator Utilization
               CPU: Used (%) / Total         MEM(GB): Used (%) / Total
----------------------------------------------------------------------
        Total :  39982 (84%) / 47236      93151772 (47%) /   195538985
----------------------------------------------------------------------
* Burst QOS uses idle cores at low priority with a 4-day time limit

[magitz@login1 ~]$ 

Investment details

The output above shows that, when this was run, the lab had 102 cores and 358 GB of RAM in its investment. Of that, 19 cores and 72 GB of RAM were in use.

When you submit a job, you need to request cores (usually with the --cpus-per-task flag) and RAM (usually with the --mem flag). Please do not request far more cores or RAM than your job will actually use. This wastes resources and prevents others from doing work. See the Memory Requests section below for how frequently we request more RAM than needed.

Everyone is sharing these resources. Please send an email to the lab listserv if you plan on using a large fraction of the available resources.

Burst QOS

In addition to the investment, we can make use of idle resources on the cluster using the burst QOS. Burst jobs are limited to 4 days (96 hours) and may wait longer to start, but we have 9X more resources available (both CPUs and RAM). Large jobs should try to use these resources, especially if they will finish within 4 days.

To submit a job to the burst QOS, add: #SBATCH --qos=soltis-b to your job script or submit your job with
sbatch --qos=soltis-b myscript.sh.

Memory Requests

Our group is commonly limited more by memory than CPUs. This is partially because many of our applications use lots of RAM. But this is also because we sometimes are not careful in specifying reasonable memory requests. As an example, here’s a summary of the jobs run in August and September of 2019:

Group            1x     | 2x     | 10x    | 100x   | 1000x  | 10000x | Total 
-----            ------ | ------ | ------ | ------ | ------ | ------ | ------
soltis               30 |    482 |   4365 |    196 |   2064 |      1 |   7138

This table shows the number of jobs that requested a given fold more memory than was actually used. So, 2,064 jobs (~29%) requested over 1,000 times more RAM than they actually used!
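
A quick way to see how lopsided these numbers are is to recompute the shares from the table above (taking the 1000x bin as 2,064 jobs so the bins sum to the stated total of 7,138):

```python
# Job counts by memory over-request factor (soltis group, Aug-Sep 2019),
# taken from the table above.
jobs = {"1x": 30, "2x": 482, "10x": 4365, "100x": 196, "1000x": 2064, "10000x": 1}

total = sum(jobs.values())
over_1000x = jobs["1000x"]                    # requested >1,000x what they used
over_10x = jobs["10x"] + jobs["100x"] + jobs["1000x"] + jobs["10000x"]

print(f"total jobs:             {total}")
print(f"over 1000x requested:   {over_1000x} ({over_1000x / total:.0%})")
print(f"10x or more requested:  {over_10x} ({over_10x / total:.0%})")
```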

When people have jobs pending because we are limited by RAM, a running job that uses only a tiny fraction of the RAM set aside for it is just wasteful! Yes, you do need to request somewhat more than you expect to use, but 2X is more than enough, and 10X is already excessive! Overall, I read this table as saying that about 93% of our jobs are inefficient in their memory requests!
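
To see how much memory a finished job actually used, you can compare the request to the peak usage with SLURM's sacct command (the job ID below is a placeholder):

```shell
# ReqMem = memory requested; MaxRSS = peak memory actually used by the job
sacct -j 12345678 --format=JobID,JobName,ReqMem,MaxRSS,Elapsed,State
```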

CPU requests

Serial jobs

Many applications and most Python and R scripts will only use a single core. Please do not request more cores than your job will use.

Threaded applications

Many applications, like assemblers and phylogenetics programs, can use multiple cores, but all of the cores need to be on the same physical server (node). For jobs like this, the resource request should be similar to:

#SBATCH --nodes 1
#SBATCH --ntasks 1
#SBATCH --cpus-per-task 8

This would provide 8 cores for your job to run. Most applications also need to be told how many cores to use, so be sure to specify this. It may be easiest to use the $SLURM_CPUS_ON_NODE variable for this.

Please make sure to test that your application uses its cores efficiently. My favorite example of this is blastn. Under many conditions, it actually runs slower with more cores; using more than one not only wastes resources, but slows down searches! Ask people, read manuals, check for yourself!
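
Putting this together, a complete threaded job script might look like the sketch below (the SPAdes module and read file names are placeholders; substitute whatever threaded application you actually run):

```shell
#!/bin/bash
#SBATCH --job-name=assembly
#SBATCH --nodes=1                # all cores on one node
#SBATCH --ntasks=1
#SBATCH --cpus-per-task=8        # 8 cores for the threaded application
#SBATCH --mem=32gb
#SBATCH --time=24:00:00

module load spades               # placeholder module; substitute your application

# Pass the allocated core count to the application rather than hard-coding it
spades.py --threads "$SLURM_CPUS_ON_NODE" \
    -1 reads_R1.fastq -2 reads_R2.fastq -o assembly_out
```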

MPI applications

There are relatively few of these in our research. RAxML-ng is one exception; please be sure to read: https://github.com/amkozlov/raxml-ng/wiki/Parallelization
Matt is also happy to help with job scripts for this application; there are some counterintuitive settings that dramatically impact performance. Doing some preliminary testing is important!
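
As a very generic sketch only (the module name, MPI binary name, rank counts, and raxml-ng options below are all assumptions; follow the parallelization wiki linked above and test on a small dataset first):

```shell
#!/bin/bash
#SBATCH --job-name=raxml-mpi
#SBATCH --nodes=2                # MPI jobs may span multiple nodes
#SBATCH --ntasks=16              # total MPI ranks
#SBATCH --cpus-per-task=1
#SBATCH --mem-per-cpu=2gb
#SBATCH --time=96:00:00

# Module and input names are placeholders; check availability with 'module spider'
module load raxml-ng

# srun launches one process per allocated MPI rank
srun raxml-ng-mpi --all --msa my_alignment.fasta --model GTR+G --seed 2
```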

UF and Museum VPN

December 10th, 2018

Museum VPN access

The Cisco AnyConnect VPN client can be downloaded from vpn.ufl.edu.

While you can connect with just your Gatorlink username, adding /flmnh to the end of your username will connect you to the Museum network and provide access to the Museum file server, the Geneious license server and other Museum resources.

Data Management at UF

August 19th, 2015

The University of Florida has a vested interest in protecting research data. Granting agencies hold the University responsible for producing the promised grant deliverables, and data loss is not an acceptable excuse for failing to meet grant expectations. As such the University provides several mechanisms to store and protect data. It is your responsibility to understand and follow regulations and make use of these resources.

OneDrive@UF

All faculty, staff and students have access to 5TB of cloud-based storage through Microsoft’s OneDrive for Business. This is an easy and secure method of syncing data among multiple computers and mobile devices as well as ensuring backup and versioning of files. Soltis lab members are highly encouraged to store the majority of their data in their OneDrive account. For more information see the GatorCloud site.

Known incompatibilities with OneDrive:

  • Hidden files (starting with .) cause problems. Places that these are common:
    • GitHub repositories

UF Dropbox for Education

UF offers an officially supported Dropbox for Education account to all faculty and staff. Students can request access as well; just let Matt know and we'll get you set up.

A note about Online file storage

UF has approved OneDrive@UF and UF Dropbox for Education for university data. These are the only officially supported online file storage systems. The consumer OneDrive and Dropbox are not supported, nor is Google Drive. Please do not use the consumer versions of these services.

Museum File Server

The Soltis lab has space for storing files on the Museum file server.

You need to be on a University network to connect to the file server. Use the VPN to connect if off campus.

You should be able to access these spaces by connecting to: ad.ufl.edu/flmnh

  • From a Mac:
    • In the Finder, select Go > Connect to Server…, or Command-K
    • Enter the Server Address as: cifs://ad.ufl.edu/flmnh
    • Use your Gatorlink credentials to login.
  • From a Windows computer:
    • From the File Explorer, select Map a Network drive, or enter the server in the search bar.
    • Enter the Server Address as: \\ad.ufl.edu\flmnh
    • Use your Gatorlink credentials to login.

Once connected, navigate to NaturalHistory/Soltis.

Within the Users folder you should have a folder named with your Gatorlink username. This is your space to store work-related files.

The Molecular folder is a shared workspace for everyone in the lab. Files in this space are editable (and deletable) by everyone in the lab. This is helpful for collaboration and sharing.

The Museum file server is backed up and files can be recovered for up to 3 months.


A note on mobile devices and storage at UF

It is University policy that ALL mobile devices (laptops, cell phones, tablets) and portable storage (USB drives) be encrypted. This includes personally owned devices. There is some helpful information on this on the Mobile Device Security page. Please also see the Employee Guide to Information Security for other helpful security information.


Geneious license server

January 9th, 2015

The lab has 4 concurrent licenses for Geneious 8. Please download the latest version from Geneious.com.

To access these licenses, you must be connected to the FLMNH VPN. See Matt to set up your network access. Once that is done, use the Cisco VPN client, obtained through vpn.ufl.edu, and in the username box use: username@ufl.edu/flmnh with your Gatorlink username.

After opening Geneious, select Activate License (also in the Help menu).

Click the button for “Use floating license server” and enter the server name:

soltis-geneious.flmnh.ufl.edu

and port 27001. If that doesn’t work, try the server name:

hyperchicken.flmnh.ufl.edu.

You should now have access to the lab’s license of Geneious.


Xerox Printer Information

February 26th, 2014

The lab has a Xerox WorkCentre 6400 for research-related printing needs.

To connect, the IP address is: 10.38.17.135  or DNS: sol-mus301-prt-6400x-1.mfd.ufl.edu

Since it is a WorkCentre 6400x, it is best to download the print drivers from Xerox: http://www.support.xerox.com/support/workcentre-6400/downloads

Please remember to print in Black and White and double sided when possible.

All tests, quizzes and handouts for classes must be printed through the department! Do not print copies of these types of things on the lab printer. There is a budget for courses in the department, but we cannot be reimbursed for those costs if you print on our printer.


Printing costs:

  • Black and White: $14.41 per month, includes 1,000 pages, 1.44¢ per page after that (we are typically under 1,000 black and white pages, making these essentially included in the cost of the printer).
  • Color: 4.94¢ per page

Volunteers

February 5th, 2014

All lab volunteers are required to fill out the Volunteer Application and give it to Evgeny.

Once an applicant is added to the system and begins working with you, it is your responsibility to train the applicant to safely do the assigned work. Please see Matt or Evgeny if you have questions or need help.

All volunteers must keep track of their hours. We have an online site through OurVolts.com to facilitate this. Hours will be reported based on these logs.
