An Introduction to High Performance Computing

What is High Performance Computing?

  • Using more than your laptop or desktop (or a high-powered workstation)

  • Designed with a variety of hardware architectures to fit different kinds of problems

  • Allows you to run analyses that are not possible on a typical computer

  • High memory, high core counts, big data

Before we launch into ISU-HPC...

  • Are you tired of typing the following every time?

$ ssh <netID>@hpc-class.its.iastate.edu

  • cd into your home directory and try: ls -a

  • You should see a hidden folder named .ssh

  • cd into .ssh and create a file named "config":

$ touch config

  • Now open the file config, add the following contents, and save the file:

Host hpc_class
    HostName hpc-class.its.iastate.edu
    User <net ID>

  • Now try it out!

$ ssh hpc_class

  • And once you're in hpc-class, a few new tricks:

$ hostname
$ whoami
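
If you also want to skip the password prompt, SSH keys are the usual next step. This is not covered on the original slides, and whether hpc-class permits key-based login is an assumption to verify with ISU IT; a minimal sketch:

$ ssh-keygen -t ed25519    # generate a key pair under ~/.ssh (accept the defaults)
$ ssh-copy-id hpc_class    # append your public key to ~/.ssh/authorized_keys on the cluster
$ ssh hpc_class            # should now log in without prompting for your password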

Performance Monitoring

  • htop (or top)
  • iostat
  • iftop

[Image: htop screenshot]

There are other tools, but these will get you the basic info you need, and they are installed on most systems by default.
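
A quick sketch of how these are typically invoked (the network interface name is a placeholder; check ip link for yours):

$ htop                 # interactive per-core CPU and memory view (press q to quit)
$ iostat -x 5          # extended disk I/O statistics, refreshed every 5 seconds
$ sudo iftop -i eth0   # live per-connection network traffic on one interface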

Terminology

  • HPC terminology can be confusing; sometimes the same word has a different meaning depending on context (e.g. threads)

Terms

  • Nodes: compute node, head node

  • Processors (a CPU chip)

  • Cores (physical CPUs embedded on a single chip)

  • Threads: hyper-threading (a software 'trick' that lets two processes share a core)

  • Scheduler: allocates access to resources to users (ISU uses Slurm)
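
To see how these terms map onto a real machine, run lscpu on any Linux node. A sketch of the relevant lines (the numbers are illustrative, not the actual ISU hardware):

$ lscpu | grep -E 'Socket|Core|Thread|^CPU\(s\)'
CPU(s):              32      # logical CPUs = sockets x cores per socket x threads per core
Thread(s) per core:  2       # hyper-threading enabled
Core(s) per socket:  8       # physical cores on each chip
Socket(s):           2       # processor chips in the node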

Cluster Diagram

[Image: cluster diagram]

Compute Resources at ISU

Clusters

  • hpc-class
    • For classes, not research
    • 48 nodes
      • 2.0GHz, 16 cores, 64GB RAM

  • condo2017
    • Primarily for sponsored research
    • 180 nodes
      • 2.6GHz, 16 cores, 128GB RAM
    • New free tier: 48 nodes
      • 2.0GHz, 12 cores, 64GB RAM

Compute Resources at ISU

Custom hardware

  • BioCrunch (for multithreaded, shared-memory programs)
    • 2.4GHz, 80 threads, 768GB of RAM
  • BigRAM (for large memory needs, like de novo assembly)
    • 2.6GHz, 48 threads, 1.5TB of RAM
  • Speedy (for single-threaded programs, like R)
    • 3.4GHz, 24 threads, 256GB of RAM
  • Speedy2 (for single-threaded programs, like R)
    • 3.2GHz, 32 threads, 256GB of RAM
  • LASWIN01 (for Windows-only software)
    • 2.6GHz, 24 threads, 64GB of RAM
  • Legion (for massively parallel applications)
    • 4 nodes
    • 1.3GHz, 272 threads, 386GB of RAM (each)

Compute Resources

XSEDE

  • ISU researchers have access to the national supercomputing centers (e.g. TACC, PSC) via XSEDE
  • For problems that need to scale beyond our on-campus resources
  • Contact the campus champions: Jim Coyle or Andrew Severin

Cloud (AWS, Azure, etc.)

  • Tempting introductory rates
  • Be cautious about entering a credit card (charges can accumulate quickly)
  • Consider data transfer times and speeds
  • Consult with IT before purchasing; they have specially negotiated rates

Software

Modules

  • Modules allow multiple people to use the same software with consistent, reproducible results

  • Modules keep your environment clean and free of conflicts (e.g. Python 2/3 or Java 7/8)

  • Think of modules as a software library: you need to check out the software before you can use it

  • Modules can be searched:

$ module avail
-------------------- /opt/rit/modules ---------------------------------
abyss/1.5.2        freesurfer/5.3.0     lib/htslib/1.2.1
abyss/1.9.0        gapcloser/1.12-r6    lib/htslib/1.3
afni/17.0.10       gapfiller/1-10       lib/htslib/1.3.2
albert/20170105    gatk/3.4-46          lib/ICE/1.0.9
...

Software

Modules

  • To load a module:

$ module load <name of software>

  • To unload a module:

$ module unload <name of software>

  • The default behavior is to load the latest version if none is specified

  • Use 'module purge' to clear your environment before loading something different
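
A minimal sketch of a typical module session (the gatk/3.4-46 version string comes from the module avail listing above; module list and module purge are standard module commands):

$ module load gatk/3.4-46   # load a specific version instead of the default latest
$ module list               # show what is currently loaded in your environment
$ module purge              # start from a clean slate before switching tool chains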

Storage

Code

  • Git (GitHub, Bitbucket, GitLab, etc.)

Datasets

  • Storage on the clusters and servers should be treated as temporary
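
Because cluster storage is temporary, plan to copy results back to storage you control. A minimal sketch using rsync over the hpc_class SSH alias set up earlier (both paths are placeholders):

# -a preserves permissions and timestamps, -v is verbose, -z compresses in transit
$ rsync -avz hpc_class:~/project/results/ ~/backups/project-results/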

Job Scheduler

  • The scheduler assigns jobs from the queue to the compute nodes

  • ISU uses Slurm

  • Think of the scheduler like a hotel reservation: you're charged for the room whether you use it or not, and if you ask for an especially long or large reservation, the room may not be available when you want it

  • Basic info required: how many nodes, and for how long

  • The script writer can get you started

Slurm Script Cheatsheet

[Image: Slurm script cheatsheet]
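
The cheatsheet image is not reproduced here. As a rough substitute, a few commonly used #SBATCH directives (all standard Slurm options; the partition name and email address are placeholders):

#SBATCH --job-name=myjob              # name shown in the queue
#SBATCH --time=02:00:00               # walltime limit (HH:MM:SS)
#SBATCH --nodes=1                     # number of nodes
#SBATCH --ntasks-per-node=16          # tasks (cores) per node
#SBATCH --mem=32G                     # memory per node
#SBATCH --partition=compute           # partition/queue to submit to
#SBATCH --mail-user=netid@iastate.edu # where to send job status mail
#SBATCH --mail-type=END,FAIL          # when to send mail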

Sample Slurm script (mysbatch.sh)

#!/bin/bash
#SBATCH --time=1:00:00          # walltime
#SBATCH --nodes=2               # number of nodes in this job
#SBATCH --ntasks-per-node=16    # processor cores (tasks) per node
#SBATCH --output=myout_%J.log
#SBATCH --error=myerr_%J.err

module load R
Rscript MyThing.R

Then submit it:

$ sbatch mysbatch.sh
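
When the submission is accepted, sbatch prints the job ID, which is what the management commands on the next slides take (the ID below is illustrative):

$ sbatch mysbatch.sh
Submitted batch job 123456

The output and error files named in the script appear in the directory you submitted from once the job runs.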

Slurm Job Management Cheatsheet

[Image: Slurm job management cheatsheet]
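
The cheatsheet image is not reproduced here. As a rough substitute, the everyday Slurm job-management commands (all standard Slurm; the job ID is a placeholder):

$ squeue -u $USER                                      # list your pending and running jobs
$ scancel 123456                                       # cancel a job by ID
$ sacct -j 123456 --format=JobID,State,Elapsed,MaxRSS  # accounting info for a finished job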

Common stumbling blocks

  • Over- or under-using resources

  • Not taking advantage of the right machine for the problem

  • Moving data through slow links

  • Trying to scale up programs that weren't designed for large datasets

Support

