Three ways to use slurm on a high performance computer (HPC) (CC130)

  • Published: 26 Nov 2024

Comments •

  • @Riffomonas
    @Riffomonas  3 years ago +9

    Does your institution have a high performance computer that you have access to?

    • @felipe96150
      @felipe96150 2 years ago

      I'm currently architecting a cluster for the institution that I work for. Thanks for the video.

    • @flscapes
      @flscapes 8 months ago +1

      I work at the University of Oregon with our HPC called Talapas! Great video!

    • @VFella
      @VFella 2 months ago +1

      LOL, I work at one XD.
      I came over to see if I could recommend the video to some of our many clueless users. We are quite proficient at this ourselves, but I had never thought of pointing them to a YT video.
      We have just implemented RStudio via Open OnDemand, a web-based access portal for HPC (the new EU exascale supercomputer LUMI uses it too). You can, for instance, access an RStudio GUI frontend with an RStudio Server backend enabled for cluster computing. You just need to enter the partition that you want (we have A100, H100 and a lot more; it's a top500 supercomputer), select the RAM, amount of CPU/GPU, tasks, etc., and click a button. It sends an salloc directly to the supercomputer, and the user only has to wait and click to start her session when ready.
      I come from a commercial and purely IT background, but I love science, and I am familiar enough with it to understand that a scientist wants to do research, not to learn to become a Linux sysadmin. This is why I like your video. Thanks a lot for sharing; I will recommend it, even if it is already 3 years old (that's a geological time frame for HPC).
      Greetings from the Netherlands ;)

    • @Riffomonas
      @Riffomonas  2 months ago

      @@VFella Thanks for tuning in!

  • @Frankie_Freedom
    @Frankie_Freedom 1 year ago +5

    This is great. I'm an HPC admin and we just switched over from Torque/PBS to Slurm, so this helps me understand Slurm better.

    • @mrrooster7976
      @mrrooster7976 3 months ago +1

      Me too. I have to know both: the benchmarking group wants PBS, the devs want Slurm. Some work on the same machine, and I need to make sure they don't pick up the same resources. The context switching is exhausting.

    • @UrbanGuitarLegend
      @UrbanGuitarLegend 1 month ago

      Is there a certification for this?

    • @mrrooster7976
      @mrrooster7976 1 month ago +2

      @@UrbanGuitarLegend Help me get better at guitar and I'll help you :)
      Seriously though... a cert for what?

    • @UrbanGuitarLegend
      @UrbanGuitarLegend 1 month ago +1

      @mrrooster7976 LOL sure, no problem. 😂. Is there a cert for Slurm like we have for AWS and Red Hat? I do Linux containerization and administration. I'm teaching myself how to administer and configure Slurm from their website, but I haven't seen anything about a certification unless I missed it.

    • @Frankie_Freedom
      @Frankie_Freedom 1 month ago +1

      @@UrbanGuitarLegend I haven't seen anything either; I don't think there is. The best way is, yeah, just to do it.

  • @cyg7655
    @cyg7655 3 years ago +13

    The range of topics covered on this channel is truly amazing (and always very practical). Cannot thank you enough!

    • @Riffomonas
      @Riffomonas  3 years ago

      My pleasure - thanks for watching!

  • @aleonflux1138
    @aleonflux1138 3 years ago +3

    Perfect timing - I started using SLURM 2 weeks ago and this filled in lots of gaps for me.

    • @Riffomonas
      @Riffomonas  3 years ago

      Wonderful- I’m so glad to hear this was helpful! 😊

  • @taylorprice5297
    @taylorprice5297 3 years ago +3

    Thank you so much for making this video. It makes bioinformatics/metabarcoding analysis way more approachable for me.

    • @Riffomonas
      @Riffomonas  3 years ago

      Fantastic - glad it helped!

  • @borisn.1346
    @borisn.1346 2 years ago +2

    This is an amazing channel - thanks for your tireless work Pat!!

    • @Riffomonas
      @Riffomonas  2 years ago

      My pleasure! Thanks for watching 🤓

  • @1973vgc
    @1973vgc 3 years ago +3

    Great, please make more videos on this important topic! You are a genius!

  • @aigonewrong.
    @aigonewrong. 1 year ago +1

    Thank you for posting this. A very nice, clear, brief intro to Slurm! I wonder how easy it is to install. We use htcondor+docker to access GPU servers at work, and I'm considering giving htcondor AND slurm a whirl for side projects at home.

  • @LuizGNA
    @LuizGNA 1 year ago +2

    Great video and great channel! I've just subscribed and will definitely share with my peers.
    I'm getting started with HPC, so I have a basic question. Since HPCs are mainly based on terminals, how do you follow up after running your jobs? Do you download the data to your own computer? If so, I assume it makes sense to "develop" a script on your computer and then use the HPC when it is mature enough, right?

  • @666ejames
    @666ejames 7 months ago +1

    You use ls -lth and then scroll up to see the most recent files. If you instead run ls -lrth, it sorts in reverse time order, so the entry just above your prompt after the command is the most recent. That saves a lot of scrolling up, especially if you have a directory with hundreds of files from previous runs.
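    The tip above can be sketched in a throwaway directory (the /tmp/ls_demo path and file names are made up for illustration):

```shell
# Make a scratch directory with an older and a newer file.
mkdir -p /tmp/ls_demo
touch -t 202401010000 /tmp/ls_demo/old.log   # backdated to Jan 2024
touch /tmp/ls_demo/new.log                   # just created

# -l long listing, -r reverse, -t sort by mtime, -h human-readable sizes:
# the newest file prints last, right above your prompt.
ls -lrth /tmp/ls_demo
```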

  • @jefflucas_life
    @jefflucas_life 9 months ago +1

    I built my own 5-node HPC with Lustre and SLURM/Munge.
    PartitionName=lustrefs
    AllowGroups=ALL AllowAccounts=ALL AllowQos=ALL
    AllocNodes=ALL Default=YES QoS=N/A
    DefaultTime=NONE DisableRootJobs=NO ExclusiveUser=NO GraceTime=0 Hidden=NO
    MaxNodes=UNLIMITED MaxTime=UNLIMITED MinNodes=0 LLN=NO MaxCPUsPerNode=UNLIMITED MaxCPUsPerSocket=UNLIMITED
    NodeSets=ALL
    Nodes=oss[1-5]
    PriorityJobFactor=1 PriorityTier=1 RootOnly=NO ReqResv=NO OverSubscribe=NO
    OverTimeLimit=NONE PreemptMode=OFF
    State=UP TotalCPUs=40 TotalNodes=5 SelectTypeParameters=NONE
    JobDefaults=(null)
    DefMemPerNode=UNLIMITED MaxMemPerNode=UNLIMITED
    TRES=cpu=40,mem=39365M,node=5,billing=12
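    On a live cluster, a record like the one above typically comes from "scontrol show partition". As a minimal sketch (treating the TRES line quoted above as a literal string, since no cluster is available here), you can pull individual counts out of it with standard shell tools:

```shell
# TRES line copied verbatim from the partition record above.
record='TRES=cpu=40,mem=39365M,node=5,billing=12'

# Extract the total CPU count (the cpu=<N> field).
cpus=$(printf '%s\n' "$record" | sed 's/.*cpu=\([0-9]*\).*/\1/')
echo "$cpus"   # prints 40
```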

  • @alaricwdsouza
    @alaricwdsouza 3 years ago +10

    I am pretty familiar with SLURM, but I have no experience with AWS. I would love a primer for AWS!

  • @nairuzelazzabi2172
    @nairuzelazzabi2172 2 years ago +2

    Thank you for the amazing video. Quick question, why did you have to re-run the single slurm script again at min 25:50 ?

    • @Riffomonas
      @Riffomonas  1 year ago

      I think I was trying to show how to run an array job that would fire off multiple jobs rather than a job that only fired off one seed

  • @RasmusKirkegaard
    @RasmusKirkegaard 3 years ago +3

    For checking your own jobs in the Slurm queue, try "squeue --me"

  • @dikshantrajwal9987
    @dikshantrajwal9987 1 year ago +1

    Hi, can you help me figure out how I can define custom resources in Slurm? (The resource should have a count and be associated with multiple nodes.)

  • @liutrvcyrsui
    @liutrvcyrsui 1 year ago +1

    04:27 AWS
    21:48 slurm arrays
    23:00
    #!/bin/bash
    ...
    #SBATCH --array 1-10
    SEED=$(( SLURM_ARRAY_TASK_ID ))
    echo $SEED
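    A fuller version of that array script might look like the following sketch (the job name, output pattern, and /tmp/run_array.sh path are assumptions, not from the video). Outside Slurm you can dry-run a single task by setting SLURM_ARRAY_TASK_ID yourself:

```shell
# Write a hypothetical array job script (path and options are illustrative).
cat > /tmp/run_array.sh <<'EOF'
#!/bin/bash
#SBATCH --job-name=seeds        # assumed job name
#SBATCH --output=seed_%a.log    # %a expands to the array task id
#SBATCH --array=1-10            # launches ten tasks, ids 1..10
SEED=$(( SLURM_ARRAY_TASK_ID )) # each task sees its own id
echo "running with seed $SEED"
EOF

# On the cluster you would submit it with: sbatch /tmp/run_array.sh
# Locally, simulate task 3 without Slurm (the #SBATCH lines are just
# comments to bash, so the script runs fine):
SLURM_ARRAY_TASK_ID=3 bash /tmp/run_array.sh   # prints: running with seed 3
```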

  • @GL-Kageyama
    @GL-Kageyama 9 months ago +1

    Great!

  • @Learning432
    @Learning432 1 year ago +1

    Hi, I need help with how to use the g16 package on the server through Slurm.

    • @Riffomonas
      @Riffomonas  1 year ago

      Hi - I'd encourage you to reach out to the system administrators for your HPC for help with this question.

  • @xiaoli0510
    @xiaoli0510 3 years ago +2

    What is the meaning of the "make" command in your script?

    • @Riffomonas
      @Riffomonas  3 years ago +1

      Make is a program that can be used to automate workflows while keeping track of dependencies. I made a video about it a while back … ruclips.net/video/eWHE2RIGrWo/видео.html

  • @Joshthegoated
    @Joshthegoated 8 months ago +1

    I need help installing slurm

    • @Riffomonas
      @Riffomonas  8 months ago

      Sorry I’m not much help with this. Our HPC administrators maintain slurm for us

  • @FareedaKalsoom
    @FareedaKalsoom 3 months ago +1

    Ours also uses slurm

  • @JOHNSMITH-ve3rq
    @JOHNSMITH-ve3rq 3 years ago +3

    Wanna do Google Cloud Platform sometime?

    • @Riffomonas
      @Riffomonas  3 years ago

      That would be awesome to try. I’ve worked on AWS but should check out google cloud too!

  • @AlexMiller-Wuppertal
    @AlexMiller-Wuppertal 3 years ago +2

    All the best. But do visit me too!

  • @omarelbliety3949
    @omarelbliety3949 1 year ago

    Were the eps for AWS ever done?

  • @canadianrepublican1185
    @canadianrepublican1185 1 year ago

    It's not High Performance Computer, it's High Performance Computing. Academics run this type of equipment like grandmothers drive cars.

  • @canadianrepublican1185
    @canadianrepublican1185 1 year ago

    Never Ever use AWS for HPC.