Advanced Research Computing

Coordinating Accounts with Slurm

NAU’s Monsoon cluster hosts several research groups. To balance the competing demands of these groups, Slurm schedules jobs in a way that maximizes fairness. Slurm also provides a convenient layer for launching large compute jobs.

Managing Jobs

Listing Jobs

squeue -A professor # List by account
squeue -u abc123    # List by user

Cancelling Jobs

scancel 12345678                   # Cancel by job ID
scancel -u abc123                  # Cancel all of a user's jobs
scancel -u abc123 --state=running  # Cancel all of a user's RUNNING jobs
scancel -u abc123 --state=pending  # Cancel all of a user's PENDING jobs
scancel -A professor               # Cancel an entire account's jobs

Holding and Releasing Jobs

scontrol hold 12345678      # Hold by job ID
scontrol release 12345678   # Release the hold
scontrol uhold 12345678     # Hold job 12345678 but allow the job's owner to 
                            # release it

Limiting Users

Check the Current Limits

sacctmgr list assoc account=professor
sacctmgr list assoc user=abc123 format=account,user,grpcpurunmins
sacctmgr list assoc user=abc123

Limiting CPU Time

sacctmgr modify user abc123 set GrpCPURunMins=1440  # Limit the CPU time that a user's
                                                    # running jobs can claim at once
                                                    # to 1440 minutes; further jobs
                                                    # stay pending until time frees up
                                                    # (e.g. 24 hours on 1 core,
                                                    #  12 hours on 2 cores, etc.)
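
The arithmetic behind this limit can be sketched in plain shell. The helper names `cpu_run_mins` and `fits_limit` are hypothetical and are not Slurm commands. The sketch assumes that a job, when it starts, counts against the limit for its core count multiplied by its time limit in minutes.

```shell
# Hypothetical helpers (not Slurm commands) for the CPU-minute arithmetic.

# CPU-minutes a job claims when it starts: cores * time limit in minutes.
cpu_run_mins() {
    echo $(( $1 * $2 ))
}

# Succeeds if a job using $1 cores for $2 minutes fits under a limit of $3
# CPU-minutes, with $4 CPU-minutes already claimed by running jobs (default 0).
fits_limit() {
    [ $(( $(cpu_run_mins "$1" "$2") + ${4:-0} )) -le "$3" ]
}

cpu_run_mins 2 720                  # 2 cores for 12 hours: prints 1440
if fits_limit 4 720 1440; then
    echo "fits"
else
    echo "would wait"               # 4 cores for 12 hours is 2880 CPU-minutes
fi
```

Under these assumptions, a user limited to 1440 CPU-minutes could run one 2-core job for 12 hours, but a 4-core job with the same time limit would not start.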

Limiting Usable CPUs

sacctmgr modify user abc123 set GrpCPUs=2 # The user can only have 2 CPUs 
                                          # allocated at a time

Checking the Current Settings and Status

Check Account Limits and Fairshare

sacctmgr list assoc account=professor

Show Historical Fairshare and Usage Information

sshare -a -l -A professor

Adjusting Priority

Slurm calculates a job's priority as the sum of several factors. Each factor is a number in the range 0.0 to 1.0 that is multiplied by an integer weight. Some available factors include:

  • Job size
  • Queue time
  • Fairshare
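
The sum described above can be illustrated with a small shell sketch. The helper `job_priority` is hypothetical, and the weights and factor values are made up for illustration. The real weights come from the cluster's configuration (the `PriorityWeight*` settings) and can be displayed with `sprio -w`.

```shell
# Hypothetical helper: sum weight*factor pairs given as "weight:factor" arguments.
job_priority() {
    printf '%s\n' "$@" | awk -F: '{ total += $1 * $2 } END { printf "%d\n", total }'
}

# Illustrative values only, in the order: queue time, fairshare, job size.
job_priority 1000:0.5 10000:0.25 400:0.125   # 500 + 2500 + 50: prints 3050
```

In this made-up example the fairshare term dominates because it carries the largest weight, even though its factor is smaller than the queue time factor.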

Calculating Fairshare

Fairshare is calculated with the following equation, taking values from

sshare -laA youraccount

FS = Norm Shares / Effectv Usage

where

Norm Shares = Raw Shares / sum(self + siblings' Raw Shares)
Effectv Usage = Raw Usage / account's Raw Usage
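
As a worked example, the equation above can be evaluated with a short shell sketch. The helper `fairshare` is hypothetical, and the numbers are invented rather than taken from a real `sshare` listing.

```shell
# Hypothetical helper: evaluate FS from four numbers read off the sshare output.
# Arguments: raw_shares  total_raw_shares(self + siblings)  raw_usage  account_raw_usage
fairshare() {
    awk -v rs="$1" -v total="$2" -v ru="$3" -v au="$4" 'BEGIN {
        # Avoid dividing by zero when there are no shares or no recorded usage.
        if (total == 0 || au == 0 || ru == 0) { print "undefined"; exit }
        norm_shares = rs / total
        effective_usage = ru / au
        printf "%.2f\n", norm_shares / effective_usage
    }'
}

# Made-up numbers: 1 of 4 shares, and 1000 of the account's 8000 usage units.
fairshare 1 4 1000 8000    # 0.25 / 0.125: prints 2.00
```

A result above 1 means the user has consumed less than their share of the account's usage, and a result below 1 means they have consumed more.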

Modifying User Fairshare

sacctmgr modify user abc123 set fairshare=64