Sharing Is Caring: GPU Sharing and CDI in Device Plugins - Evan Lezar, NVIDIA & David Porter, Google

  • Published: Mar 20, 2024
    As the prevalence of AI/ML workloads running on Kubernetes increases, so too does the demand for efficient management of compute resources such as GPUs. This requires features implemented in the k8s Device Plugins that make these resources consumable by end-user applications. Additionally, partitioning and resource sharing become critical for improving utilization and reducing cost. In this presentation we will deep-dive into the Container Device Interface (CDI) as a new option for Device Plugin authors and the flexibility this unlocks, including resource sharing for GPUs. Starting from use cases, we will take a look under the hood at how a Device Plugin exposes GPUs and at the different sharing options, for example time slicing, MIG, and MPS, that can be used to improve device utilization and right-size devices to workloads. We will discuss how k8s integrates with devices and CDI, GPU sharing mechanisms, and how applications and frameworks can integrate with this functionality.
  • Science
