Warewulf: Deep Dive, Use Cases, and Examples

  • Published: 4 Nov 2024

Comments • 11

  • @jonathon-anderson
    @jonathon-anderson 2 years ago +3

    Precisely on cue, the fourth compute node (c4) was working immediately after the stream was over. Sorry we weren't able to dig into why it wasn't working during the webinar! All we needed to do was `scontrol update nodename=c4 state=resume` to return the node to service after it was "unexpectedly" rebooted for the purposes of the demo. I even ran this command during the demo, as `scontrol update nodename=c[1-4] state=resume`; but this generated an error because c[1-3] were already "resumed." That error was a distraction, though, and didn't prevent it from resuming c4, which is why I discovered it working after the demo. If I had just tried one more time, it would have worked.
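
    For anyone following along, the full sequence was roughly the following; the sinfo check at the end is just one way to confirm the node returned to service, not something shown in the demo.

    # Resume only the node that was rebooted for the demo
    scontrol update nodename=c4 state=resume
    # Resuming the whole range also works, but Slurm reports an error for
    # nodes that are already resumed (c1-c3); the error is harmless
    scontrol update nodename=c[1-4] state=resume
    # Confirm the node is back in service
    sinfo -n c4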

  • @jamesdavis9628
    @jamesdavis9628 2 years ago

    Excellent information. Particularly the part about leveraging the sysadmin side of containers in Warewulf to help sysadmins get more comfortable with containers.

  • @leonardoloures4987
    @leonardoloures4987 2 years ago +1

    Simply amazing! Could you demonstrate examples in Python? It would also be interesting to see examples with GPUs using PyTorch and TensorFlow.

    • @CtrlIQ
      @CtrlIQ  2 years ago

      Here are links to a couple of our previous webinars along with time stamps. Feel free to reach out to us at info@ciq.co if you would like any additional information.
      ruclips.net/video/JBQxdfcLC08/видео.html
      TensorFlow Jupyter Notebook / Workflow Lifecycle / Fuzzball UI - 54:03 to 1:09:56
      ruclips.net/video/Pbmxq3dg35E/видео.html
      PyTorch Jupyter notebook - 37:00 to 1:03:45

  • @jtoddowen
    @jtoddowen 2 years ago

    Do you have any examples with InfiniBand devices? We have a 24-core head node (which must run OpenSM, so one IB card) and twelve 48-core compute nodes with dual 56 Gb InfiniBand cards, 10 GbE for file serving / SLURM, and an IPMI (really iDRAC) network for machine control. We're in a college environment and this is our first HPC system, but we're really struggling with the deployment phase. The video was very helpful, and I had never really thought of using containers vs. traditional HPC.

    • @gregorykurtzer524
      @gregorykurtzer524 2 years ago

      Hi Todd. For the control node running OpenSM, just set up the IB as you normally would on a stateful system; Warewulf doesn't interfere at all there. For the compute nodes, you would do something like: wwctl node set --netname infiniband --netdev ib0 --ipaddr x.x.x.x --netmask x.x.x.x n0000
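
      A fuller sketch for one compute node might look like this. The IP addresses, node name, the flags beyond those shown above, and the overlay rebuild step are illustrative rather than from the webinar, so double-check against wwctl --help for your Warewulf version.

      # Add the InfiniBand interface to the node definition
      wwctl node set --netname infiniband --netdev ib0 --ipaddr 10.10.0.10 --netmask 255.255.255.0 n0000
      # Verify what Warewulf now knows about the node
      wwctl node list -a n0000
      # Rebuild the overlays so the new network settings reach the node
      wwctl overlay build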

  • @dimitriorbeliani4195
    @dimitriorbeliani4195 2 years ago +1

    Do you have a Rocky Linux container on ARM64 (aarch64) for Warewulf?

    • @CtrlIQ
      @CtrlIQ  2 years ago +1

      Not yet, but please check out github.com/hpcng/warewulf/issues/62 for updates!

    • @dimitriorbeliani4195
      @dimitriorbeliani4195 2 years ago

      @@CtrlIQ Any news about ARM64 (aarch64)?

  • @severgun
    @severgun 2 years ago

    Are there any plans to provide prebuilt packages for Debian / Ubuntu LTS?

    • @CtrlIQ
      @CtrlIQ  2 years ago

      Hi severgun!
      It wouldn’t be too hard, as it’s just Go. It should run everywhere. The only nit might be making sure the generated tftpd and dhcp server templates are compatible; a rough build-from-source sketch is below.
      I hope this helps! :)
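
      For anyone who wants to try it before packages exist, here is a rough sketch of a source build on Debian/Ubuntu. The package names and Makefile targets are assumptions based on a typical Go/Makefile workflow, so check the repository docs before relying on them.

      # Build and runtime dependencies (Debian/Ubuntu package names, assumed)
      sudo apt install build-essential golang tftpd-hpa isc-dhcp-server nfs-kernel-server
      # Build and install Warewulf from source
      git clone https://github.com/hpcng/warewulf.git
      cd warewulf
      make all
      sudo make install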