Advanced Research Computing

Coordinating Accounts with Slurm

NAU’s Monsoon cluster hosts several research groups. To balance the demands of these groups, Slurm schedules jobs in a way that aims to maximise fairness among them. Slurm also provides commands that make it easy to start and manage large compute jobs.

Managing Jobs

Listing Jobs

$ squeue -A professor # List by account
$ squeue -u abc123    # List by user

Cancelling Jobs

$ scancel 12345678                   # Cancel by job ID
$ scancel -u abc123                  # Cancel all of a user's jobs
$ scancel -u abc123 --state=running  # Cancel all of a user's RUNNING jobs
$ scancel -u abc123 --state=pending  # Cancel all of a user's PENDING jobs
$ scancel -A professor               # Cancel an entire account's jobs

Holding and Releasing Jobs

$ scontrol hold 12345678      # Hold by job ID
$ scontrol release 12345678   # Release the hold
$ scontrol uhold 12345678     # Hold job 12345678 but allow the job's owner to 
                              # release it

Limiting Users

Check the Current Limits

$ sacctmgr list assoc account=professor
$ sacctmgr list assoc user=abc123 format=account,user,grpcpurunmins
$ sacctmgr list assoc user=abc123

Limiting CPU Time

$ sacctmgr modify user abc123 set GrpCPURunMins=1440  # Limit the CPU time that a 
                                                      # user's running jobs may 
                                                      # hold to 1440 CPU-minutes 
                                                      # (e.g. 24 hours on 1 core, 
                                                      #  12 hours on 2 cores, etc.)
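
The trade-off between cores and wall-clock time under a CPU-minute limit can be worked out with a little arithmetic. The sketch below is only an illustration: hours_for_cores is a hypothetical helper, not a Slurm command, and it assumes the limit is simply divided evenly across the cores a job uses.

```shell
#!/bin/sh
# hours_for_cores is a hypothetical helper, not part of Slurm.
# $1 = limit in CPU-minutes, $2 = number of cores in use
# Prints the wall-clock hours that fit within the limit.
hours_for_cores() {
    awk -v mins="$1" -v cores="$2" \
        'BEGIN { printf "%.1f\n", mins / cores / 60 }'
}

# Show what a 1440 CPU-minute limit allows at several core counts.
for cores in 1 2 4 8; do
    printf '%s cores: %s hours\n' "$cores" "$(hours_for_cores 1440 "$cores")"
done
```

With the 1440-minute limit from the example, this prints 24.0 hours for 1 core and 12.0 hours for 2 cores, matching the comment above.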

Limiting Usable CPUs

$ sacctmgr modify user abc123 set GrpCPUs=2 # The user can only have 2 CPUs 
                                            # allocated at a time

Checking the Current Settings and Status

Adding a Student to a SLURM Account

$ sacctmgr add user name=abc123 account=professor                       # Add user to account
$ sacctmgr modify user where name=abc123 set defaultaccount=professor   # Set user's default account
$ sacctmgr modify user where name=abc123 set defaultqos=professor       # Set user's default Quality of Service (QoS)
$ sacctmgr update user name=abc123 account=professor set fairshare=128  # Set user's fairshare value
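
When several students need to be added, the four commands above can be generated from a small helper. This is a minimal sketch: add_student_cmds is a hypothetical function name, it assumes the QoS has the same name as the account (as in the example above), and it only prints the commands for review rather than running them.

```shell
#!/bin/sh
# add_student_cmds is a hypothetical helper, not part of Slurm.
# It prints the sacctmgr commands for one user; nothing is executed.
# Usage: add_student_cmds USER ACCOUNT [FAIRSHARE]
add_student_cmds() (
    user=$1
    account=$2
    shares=${3:-128}   # 128 mirrors the example above; adjust as needed

    if [ -z "$user" ] || [ -z "$account" ]; then
        echo "usage: add_student_cmds USER ACCOUNT [FAIRSHARE]" >&2
        exit 1
    fi

    # Assumption: the QoS is named after the account, as in the example.
    echo "sacctmgr add user name=$user account=$account"
    echo "sacctmgr modify user where name=$user set defaultaccount=$account"
    echo "sacctmgr modify user where name=$user set defaultqos=$account"
    echo "sacctmgr update user name=$user account=$account set fairshare=$shares"
)

add_student_cmds abc123 professor
```

Printing instead of executing keeps the helper safe to experiment with; the output can be checked and then run by hand.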

Check Account Limits and Fairshare

$ sacctmgr list assoc account=professor

Show Historical Fairshare and Usage Information

$ sshare -a -l -A professor

Adjusting Priority

A job's Slurm priority is the sum of several weighted factors. Each term is an integer weight, set by the cluster administrators, multiplied by a factor in the range 0.0 to 1.0. Some of the available factors are:

  • Job size
  • Queue time
  • Fairshare
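
The weighted sum described above can be illustrated with a short calculation. This is a sketch only: job_priority is a hypothetical helper, and the weights and factors are made-up numbers, not Monsoon's actual configuration (in slurm.conf the weights are set by parameters such as PriorityWeightJobSize, PriorityWeightAge, and PriorityWeightFairshare).

```shell
#!/bin/sh
# job_priority is a hypothetical helper, not part of Slurm.
# It sums weight * factor over each pair of arguments.
# Usage: job_priority WEIGHT FACTOR [WEIGHT FACTOR ...]
job_priority() {
    awk 'BEGIN {
        total = 0
        # Arguments arrive as ARGV[1..ARGC-1], taken two at a time.
        for (i = 1; i + 1 < ARGC; i += 2)
            total += ARGV[i] * ARGV[i + 1]
        printf "%.0f\n", total
    }' "$@"
}

# Made-up weights and factors for job size, queue time, and fairshare:
# 1000*0.5 + 2000*0.25 + 800*0.125
job_priority 1000 0.5  2000 0.25  800 0.125
```

With these illustrative values the three terms are 500, 500, and 100, so the script prints 1100.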

Calculating Fairshare

Fairshare can be calculated from the values reported by sshare:

$ sshare -laA youraccount

From that data, perform the following calculations:

Norm Shares = Raw Shares / sum(self + siblings' Raw Shares)
Effectv Usage = Raw Usage / account's Raw Usage
FairShare = Norm Shares / Effectv Usage
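
The three calculations above can be checked by hand with a few lines of shell. The sketch below follows the formulas exactly as written on this page; fairshare_calc is a hypothetical helper, and the input numbers are invented rather than taken from real sshare output.

```shell
#!/bin/sh
# fairshare_calc is a hypothetical helper, not part of Slurm.
# $1 = user's Raw Shares
# $2 = sum of Raw Shares (self + siblings)
# $3 = user's Raw Usage
# $4 = account's Raw Usage
# Prints: Norm Shares, Effectv Usage, FairShare
fairshare_calc() {
    awk -v shares="$1" -v sum="$2" -v usage="$3" -v acct="$4" 'BEGIN {
        # Avoid dividing by zero when there are no shares or no usage yet.
        if (sum == 0 || acct == 0 || usage == 0) { print "n/a"; exit 1 }
        norm = shares / sum
        eff  = usage / acct
        printf "%.4f %.4f %.4f\n", norm, eff, norm / eff
    }'
}

# Invented example: 64 of 256 shares, 3000 of 6000 usage units.
fairshare_calc 64 256 3000 6000
```

Here the user holds a quarter of the shares but accounts for half of the usage, so the script prints 0.2500 0.5000 0.5000.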

Modifying User Fairshare

$ sacctmgr modify user abc123 set fairshare=64