Distributed training with Ray on Kubernetes at Lyft

Published: Sep 13, 2024

Is your training infrastructure built on Kubernetes? Do you want to enable Ray on Kubernetes? Our ML platform at Lyft runs entirely on Kubernetes because it scales well and provisions resources quickly. In this talk we will demonstrate how we use Ray on Kubernetes to build an infrastructure for distributed training. We will showcase our custom SDKs, which let users spawn on-demand Ray clusters to train models from notebooks. The SDKs hide the complexity of spawning and tearing down the on-demand cluster, so that users can focus on the "what" while the platform takes care of the "how."

See all Ray Summit content @ anyscale.com/ra...
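
To make the "spawn and tear down" idea concrete, here is a minimal sketch of how such an SDK might look from a notebook. It is not Lyft's SDK: the names `InMemoryBackend`, `on_demand_cluster`, `create_cluster`, and `delete_cluster` are hypothetical, the backend is an in-memory stand-in rather than a real Kubernetes client, and the returned address is only a placeholder. The point it illustrates is that a context manager can guarantee teardown even when training code fails.

```python
from contextlib import contextmanager
from dataclasses import dataclass, field


@dataclass
class InMemoryBackend:
    """Hypothetical stand-in for the platform layer that would talk to Kubernetes."""
    events: list = field(default_factory=list)

    def create_cluster(self, name, workers):
        # Record the request; a real backend would provision pods here.
        self.events.append(("create", name, workers))
        # Placeholder address; a real backend would return the head node's address.
        return f"ray://{name}-head:10001"

    def delete_cluster(self, name):
        # Record the teardown; a real backend would release the resources here.
        self.events.append(("delete", name))


@contextmanager
def on_demand_cluster(backend, name, workers):
    """Create a cluster, yield its address, and always tear it down afterwards."""
    address = backend.create_cluster(name, workers)
    try:
        yield address
    finally:
        # Runs on normal exit and on exceptions, so clusters are not leaked.
        backend.delete_cluster(name)


backend = InMemoryBackend()
with on_demand_cluster(backend, "demo", workers=4) as address:
    # In a notebook, the training code would connect to `address` here.
    print("training against", address)
```

With this shape, the user states only what they need (a cluster name and a worker count), and the backend decides how the cluster is created and removed.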
