Skip to content

Some scripts and tools to help me manage my programs on the Palmetto Cluster

Notifications You must be signed in to change notification settings

wbgreen1/palmetto-scripts

 
 

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

95 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

palmetto-scripts

Some scripts and tools to help me manage my programs on the Palmetto Cluster

Installation:

Initialization:

do this first to install the palmetto-scripts (takes ~5 minutes to run). You only have to do this once.

bash <(curl -s https://raw.githubusercontent.com/dougnd/palmetto-scripts/master/bin/basicSetup.sh)

Install tensorflow:

getGPULikeNode  # get a node for installation purposes
dinstall tensorflow
exit # leave the node

Test tensorflow:

qsub -I -l select=1:ncpus=1:mem=10gb:ngpus=2:gpu_model=k40,walltime=0:30:00

cd $TMPDIR
wget https://github.com/tensorflow/tensorflow/tarball/master # may have to try this more than once
tar xf master
cd tensorflow-tensorflow-*/tensorflow/models/image/mnist
python convolutional.py

exit # leave the node

Install caffe:

getGPULikeNode  # get a node for installation purposes
dinstall caffe_cudnn
exit # leave the node

***** Notes on GPUs on palmetto: ******

Note: the following was added to the palmetto MOTD: "Jobs that request gpus but don't use them may be terminated without notice." Make sure if you are request a gpu, you actually are using it. You may want to develop on a local machine and deploy to palmetto once you know it works.

Futhermore, every GPU on a node is accessible to your job regardless of whether you request any. Thus, something like tensorflow will detect 2 GPU's and make use of them both even if you only request one. This is clearly not good since you may interfere with another job. Make sure you are only using GPU's assigned to you. If you are requesting a single GPU and want to know what device you were assigned, you can used the following (provided by the palmetto support team):

export gpuDev=$(qstat -f $PBS_JOBID | awk '/exec_vnode/ {
    match($0, /'`hostname`'\[([0-9]+)\]/, grp);
    print grp[1]
}')

You can then use the gpuDev enviornment variable in your scripts. e.g. :

echo "Using GPU: $gpuDev"
caffe device_query -gpu $gpuDev
caffe train -solver vehicleDetectorSolver.prototxt -gpu $gpuDev

Usage:

dinstall

dinstall [<options>...] <command> [<packages>...]

Command can be one of the following:

  • update - updates dinstall and palmetto-scripts (pulls from github).
  • install - installs packages and their dependencies. Multiple packages can be supplied.
    e.g.: dinstall install caffe_cudnn tensorflow (installs caffe and tensorflow as well as all thier dependencies)
  • uninstall - removes packages. Multiple packages can be supplied.
  • upgrade - upgrades packages. Multiple packages can be supplied.
  • list - lists available packages.
  • list installed - lists installed packages.

Options

  • --help prints basic usage information.
  • --version prints version information.
  • --ignore-binaries installs from source, ignoring available binaries.

Other commands

  • getCPUNode gets a node (6 cores, 10GB, 6 hr)
  • getGPUNode gets a GPU node with a k40 (6 cores, 10GB, 6 hr)
  • getGPULikeNode gets a node without a GPU, but the same architecture (6 cores, 10GB, 6 hr)

About

Some scripts and tools to help me manage my programs on the Palmetto Cluster

Resources

Stars

Watchers

Forks

Packages

No packages published

Languages

  • Shell 100.0%