Handling Massive Machine Learning Models // Simon Karasik // MLOps podcast

  • Published: Oct 4, 2024
  • Join us at our first in-person conference on June 25 all about AI Quality: www.aiqualityc...
    Huge thank you to @nebiusofficial for sponsoring this episode. Nebius AI - nebius.ai/
    MLOps podcast #228 with Simon Karasik, Machine Learning Engineer at Nebius AI, Handling Multi-Terabyte LLM Checkpoints.
    // Abstract
    The talk provides a gentle introduction to LLM checkpointing: why it is hard and how big the checkpoints are. It covers tips and tricks for saving and loading multi-terabyte checkpoints, as well as the selection of cloud storage options for checkpointing.
    // Bio
    Full-stack Machine Learning Engineer, currently working on infrastructure for LLM training, with previous experience in ML for Ads, Speech, and Tax.
    // MLOps Jobs board
    mlops.pallet.x...
    // MLOps Swag/Merch
    mlops-communit...
    // Related Links
    -------------- ✌️Connect With Us ✌️ ------------
    Join our Slack community: go.mlops.commu...
    Follow us on Twitter: @mlopscommunity
    Sign up for the next meetup: go.mlops.commu...
    Catch all episodes, blogs, newsletters, and more: mlops.community/
    Connect with Demetrios on LinkedIn: / dpbrinkm
    Connect with Simon on LinkedIn: / simon-karasik
  • Science
