Predibase
  • 52 videos
  • 42,240 views
Optimize Inference for Fine-tuned SLMs
As small language models (SLMs) become a critical part of today’s AI toolkit, teams need reliable and scalable serving infrastructure to meet growing demands. The Predibase Inference Engine simplifies serving infrastructure, making it easier to move models into production faster.
In this tech talk, you'll learn how to speed up deployments, improve reliability, and reduce costs, all while avoiding the complexity of managing infrastructure.
You'll learn how to:
• 4x your SLM throughput with Turbo LoRA, FP8 and Speculative Decoding
• Effortlessly manage traffic surges with GPU autoscaling
• Ensure high-availability SLAs with multi-region load balancing, automatic failover, and more
• Deploy into y...
167 views

Videos

How to Fine-tune a SLM for Content Summarization w/ Llama-3.1-8B
190 views · 21 days ago
In this short tutorial, we'll show you how to easily and efficiently fine-tune a small language model, specifically Llama-3.1-8B, to accurately summarize a series of chat conversations. Tutorial Notebook: colab.research.google.com/drive/1fTP0bTEZcLLic3-2oLxuQajvMdcIGf?usp=sharing Get started customizing your own SLMs with our free trial: predibase.com/free-trial. Request a custom demo with an ...
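
For readers who want the gist without opening the notebook, here is a minimal sketch of this kind of LoRA fine-tune using Hugging Face PEFT. The rank, target modules, and dtype are illustrative assumptions, not the tutorial's exact settings.

    # Minimal LoRA fine-tuning sketch with Hugging Face PEFT (illustrative
    # hyperparameters, not the tutorial's exact configuration).
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer
    from peft import LoraConfig, get_peft_model

    model_id = "meta-llama/Meta-Llama-3.1-8B"  # gated model; needs HF access
    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(model_id,
                                                 torch_dtype=torch.bfloat16)

    # Attach low-rank adapters to the attention projections; only these
    # small matrices are trained while the 8B base weights stay frozen.
    config = LoraConfig(r=16, lora_alpha=32,
                        target_modules=["q_proj", "v_proj"],
                        task_type="CAUSAL_LM")
    model = get_peft_model(model, config)
    model.print_trainable_parameters()  # typically well under 1% trainable
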
How Clearwater Analytics Builds AI Agents with Small Language Models (SLMs)
474 views · 1 month ago
Building agentic systems with small fine-tuned open-source language models can power impressive GenAI applications, but what does it take to do this successfully at production scale? In this tech talk, Clearwater Analytics, the leading provider of automated investment analytics and reporting, shares how they built and deployed a multi-agent solution for their customers using fine-tuned SLMs including a...
Your Models, Your Cloud: Secure Private LLMs in Your VPC in less than 30 mins
172 views · 2 months ago
As GenAI projects grow in scale, secure and reliable infrastructure becomes a must, especially when handling sensitive data. For many teams, this creates a dilemma: they can't use commercial LLMs due to data privacy and ownership concerns, and building their own secure, production-grade infrastructure is too big of a challenge. What if you could deploy private LLMs in your cloud without all the hassle? N...
Demo: Synthetic Data Generation
165 views · 2 months ago
Remove barriers to fine-tuning by quickly generating synthetic data based on as few as 10 rows of seed data. In this short demo, you will see how to generate high-quality synthetic data that can then be used to instantly fine-tune your model, all within Predibase. Try Predibase for free: predibase.com/free-trial
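
The underlying idea can be sketched outside Predibase as well: prompt a strong LLM with your seed rows and ask for more examples in the same schema. The model name, seed format, and prompt wording below are placeholder assumptions; Predibase's built-in generator handles this for you.

    # Illustrative seed-based synthetic data generation with the OpenAI
    # Python client (placeholder prompt and schema, not Predibase's flow).
    import json
    from openai import OpenAI

    client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment
    seeds = [{"prompt": "Summarize this chat: ...", "completion": "..."}]

    resp = client.chat.completions.create(
        model="gpt-4o",
        messages=[{
            "role": "user",
            "content": "Write 20 new training examples with the same JSON "
                       "schema, style, and difficulty as these seeds:\n"
                       + json.dumps(seeds, indent=2),
        }],
    )
    print(resp.choices[0].message.content)  # validate/dedupe before training
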
Small is the New Big: Why Apple and Other AI Leaders are Betting Big on Small Language Models
399 views · 2 months ago
In this talk at the LLMOps Micro-Summit, Piero Molino, cofounder and CSO of Predibase, discusses the GenAI architecture of the future and how developers can leverage the latest innovations in LLM tech to build big with small models. Specifically, he explores the modern GenAI architecture as outlined by Apple during the launch of their new Apple Intelligence platform and the different technique...
Building Better Models Faster with Synthetic Data
157 views · 2 months ago
In this talk at the LLMOps Micro-Summit, Maarten Van Segbroeck, Head of Applied Science at Gretel, discusses the evolution of GenAI, data as a blocker to developing better models, and how you can use new techniques to generate high-quality synthetic data to fine-tune highly accurate SLMs for your use case. Session slides: pbase.ai/3T27VOu
Fine-Tuning SLMs for Enterprise-Grade Evaluation & Observability
326 views · 2 months ago
In this talk at the LLMOps Micro-Summit, Atin Sanyal, Co-founder & CTO of Galileo, discusses techniques for combating hallucinations in LLMs with a focus on new methods in fine-tuning small language models (SLMs) to observe and evaluate models. Session slides: pbase.ai/46Z4cXQ
Next Gen Inference for Fine-tuned LLMs - Blazing Fast & Cost-Effective
263 views · 2 months ago
In this talk, Arnav Garg, ML Engineering Leader at Predibase, discusses new innovations in fine-tuned model inference. Specifically, he takes a deep dive into Turbo LoRA, a new parameter-efficient fine-tuning method pioneered at Predibase that increases text generation throughput by 2-3x while simultaneously achieving task-specific response quality in line with LoRA. While existing fine-tuning methods focus onl...
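
As background for the throughput claim, here is a toy sketch of speculative decoding, the general technique behind speedups like this: a cheap draft model proposes several tokens and the full target model verifies them in a single forward pass. This is a simplified greedy version for illustration, not Predibase's Turbo LoRA implementation.

    # Toy greedy speculative decoding step (illustrative, not Turbo LoRA).
    import torch

    @torch.no_grad()
    def speculative_step(target, draft, ids, k=4):
        guess = ids
        for _ in range(k):  # draft k tokens autoregressively (cheap)
            nxt = draft(guess).logits[:, -1].argmax(-1, keepdim=True)
            guess = torch.cat([guess, nxt], dim=-1)
        logits = target(guess).logits  # one target pass scores all k guesses
        n = ids.shape[1]
        for i in range(k):
            t = logits[:, n + i - 1].argmax(-1, keepdim=True)
            if (t != guess[:, n + i : n + i + 1]).any():
                # Reject from the first disagreement, but keep the target's
                # own token so every step still makes progress.
                return torch.cat([guess[:, : n + i], t], dim=-1)
        return guess  # all k accepted: k tokens from one target pass
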
Streamlining Background Checks with Fine-tuned Small Language Models on Predibase
246 views · 2 months ago
In this talk, Vlad Bukhin, Staff ML Engineer at Checkr, discusses how they use LLM classifiers to help automate the complex process of transforming messy unstructured text data into one of 230 categories used to populate background checks. Specifically, he walks through his journey from starting with an OpenAI and RAG implementation to ultimately landing on fine-tuning small language models on Pr...
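
The classifier pattern described here can be sketched as a constrained prompt plus output validation. The category names and prompt wording below are placeholders, not Checkr's actual 230-label taxonomy.

    # Sketch of an SLM-as-classifier pattern: ask for exactly one label,
    # then validate the output against the fixed taxonomy.
    CATEGORIES = ["category_a", "category_b", "category_c"]  # ~230 in production

    def build_prompt(record: str) -> str:
        return (
            "Classify this background-check record into exactly one of: "
            + ", ".join(CATEGORIES)
            + f"\n\nRecord: {record}\nCategory:"
        )

    def parse_label(completion: str) -> str | None:
        label = completion.strip().splitlines()[0]
        return label if label in CATEGORIES else None  # None -> human review
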
Welcome Address and Agenda Overview
264 views · 2 months ago
In this welcome address, Devvret Rishi, cofounder and CEO of Predibase, discusses the state of GenAI and the future of small models and runs through the different talks on the agenda for the summit. Summit Agenda: Why Apple and Other AI Leaders are Betting Big on Small Language Models • Piero Molino, Cofounder & CSO, Predibase • Slides: pbase.ai/3AuG5nJ GenAI at Production Scale with SLMs th...
Beat GPT-4 with a Small Model and 10 Rows of Data
623 views · 2 months ago
While fine-tuning small language models with high-quality datasets can consistently yield results that rival large foundation models like GPT-4, assembling sufficient fine-tuning training data is a barrier for many teams. This webinar introduces a novel approach that could change that paradigm. By leveraging large language models like GPT-4 and Llama-3.1-405b to generate synthetic data, we expl...
How to Reduce Your OpenAI Spend by up to 90% with Small Language Models
1.4K views · 3 months ago
OpenAI has revolutionized the way enterprises build with large language models. A developer can create a high-performing AI prototype in just a few days, but when it’s time to push to production, the cost of GPT-4 skyrockets, oftentimes reaching hundreds of thousands of dollars a month. The result: fewer use cases deployed, fewer users engaged, and more value left on the table. So, what does it...
Predibase Platform Overview: Small Language Models for Specialized AI
318 views · 3 months ago
Discover Predibase, the leading developer platform for building and deploying task-specific small language models (SLMs) in our cloud or yours. This video demonstrates how Predibase enables easy fine-tuning and serving of models on scalable, cost-effective infrastructure, meeting the needs of both Fortune 500 companies and innovative startups. Learn how Predibase's open-source foundations and p...
Introducing Solar LLM: The Best LLM for Fine-tuning that beats GPT-4, exclusively on Predibase
623 views · 4 months ago
Meet Solar LLM, Upstage's best-in-class ~11B-parameter model. Released late last year, Solar LLM has quickly proven to be the best small language model (SLM) to fine-tune for task-specific applications. In a recent comparison with 15 other leading SLMs, Solar LLM came out on top in over 50% of our fine-tuning experiments. We are also excited to announce that Solar is available for fine-tuning ...
Ludwig Community Sync: 06/14/2024
43 views · 4 months ago
Snowflake + Predibase: Smaller, faster & cheaper LLMs that beat GPT-4
293 views · 4 months ago
Speed Up LLM Development with Synthetic Data and Fine-tuning
209 views · 4 months ago
How we accelerated LLM fine-tuning by 15x in 15 days
341 views · 6 months ago
How I became a Ludwig Contributor
74 views · 6 months ago
Dickens: an LLM that Writes Great Expectations
150 views · 6 months ago
Virtual Workshop: Fine-tune Your Own LLMs that Rival GPT-4
593 views · 6 months ago
LLM Fine-tuning Tutorial: Generate Docstring with Fine-tuned CodeLlama-13b
361 views · 7 months ago
LoRA Bake-off: Comparing Fine-Tuned Open-source LLMs that Rival GPT-4
1.4K views · 7 months ago
Ludwig Hackathon Winner: Building a Tax FAQ Chatbot with LLMs
394 views · 8 months ago
Ludwig Hackathon Winner: Assessing Health Data with ML
166 views · 8 months ago
LoRA Land: How We Trained 25 Fine-Tuned Mistral-7b Models that Outperform GPT-4
6K views · 8 months ago
5 Reasons Why Adapters are the Future of Fine-tuning LLMs
1.6K views · 8 months ago
Fine-Tuning Zephyr-7B to Analyze Customer Support Call Logs
723 views · 9 months ago
12 Best Practices for Distilling Smaller LLMs with GPT
1.6K views · 10 months ago

Comments

  • @mohammedshuaibiqbal5469 · 2 months ago

    Can we use Phi-3 3B or Gemini 2B for fine-tuning on custom data? Given a job description, extract technical skills only from it.

    • @arnavgrg · 2 months ago

      Absolutely! Both of these models should do fairly well since the task you’re describing is focused and narrow.

    • @utkarshujwal3286 · 16 days ago

      Explained well. So training small LLMs for individual tasks could probably be the key to better text classification, right?

  • @mulderbm · 3 months ago

    Thanks for the presentation. I am still building on my own, but this gave me the next steps I needed to move out of the OpenAI fold.

  • @hunkims · 4 months ago

    Super cool!

  • @btcoal · 4 months ago

    Excellent.

  • @ml-simplified · 5 months ago

    @55:12: Wouldn't it be more appropriate to use <inst></inst> (or whatever the instruction format of the underlying LLM is) instead of relying on a customized instruction format? You can use the same prompt, but the format should follow the underlying LLM.
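
    For context, the "instruction format of the underlying LLM" the commenter refers to can be applied automatically with transformers' chat-template API; a small sketch (the model ID here is just an example):

        # Chat models are trained with special wrappers, e.g. [INST]...[/INST]
        # for Mistral; transformers can apply the right one for you.
        from transformers import AutoTokenizer

        tok = AutoTokenizer.from_pretrained("mistralai/Mistral-7B-Instruct-v0.2")
        messages = [{"role": "user", "content": "Summarize this ticket: ..."}]
        prompt = tok.apply_chat_template(messages, tokenize=False,
                                         add_generation_prompt=True)
        # -> "<s>[INST] Summarize this ticket: ... [/INST]"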

  • @tecnopadre · 6 months ago

    It looks like fine-tuning a model can be great for getting better results with cheaper models. But I missed real cases and examples in this video. The video is too technical, and it looks like the slides and content are only understandable to the company, not to end users. Sorry for the hard comment, but I think you have a great project that needs to be explained more simply. Thank you.

  • @ojasvisingh786 · 7 months ago

    🎉👏👏

  • @jeffg4686 · 7 months ago

    @5:08 - 😂😂😂

  • @AhmedKachkach · 7 months ago

    This is great! Just the slide comparing base performance vs. performance after fine-tuning makes this exercise worthwhile: it proves that differences between foundation models are not *that* large, and that pure prompting is not sufficient to reach good performance (and once you fine-tune, most differences between base models disappear; though Mistral models do seem to be significantly ahead!). Thanks for putting this together! If you're considering a similar comparison in the future, I'd be curious to see the effect of int4 quantization (with and without quantization-aware training) on prediction quality. It's hard to find proper experiments testing this; I mostly see evals with latency alone, without a proper analysis of the quality cost (and how to reduce it, e.g. with QAT).

  • @nasiksami2351 · 8 months ago

    Thanks for the amazing demonstration. I believe the notebook is private, and I've sent a request to access it. Approval would be appreciated; also, please share the Medium blog link. Thank you.

  • @csepartha · 8 months ago

    Nice tutorial.

  • @JulianHarris · 8 months ago

    Have you guys looked at the next generation of quantisation, e.g. ternary/1.58-bit quantisation? It's a different technique from conventional quantisation because you have matrices that only contain 0, 1, and -1, and you eliminate matrix multiplication almost entirely. The intuition is that the combination may not bring quite as many benefits, but it might be interesting to see how it performs on CPU architectures, for instance.

  • @jeffg4686 · 8 months ago

    Nice !

  • @ofir952 · 8 months ago

    Thanks! How did you manage to remove the surrounding text of the LLM response?

    • @pieromolino_pb · 8 months ago

      It's a side effect of fine-tuning on output that contains only the JSON without any other text.

    • @ofir952 · 8 months ago

      So, we cannot achieve this without fine-tuning? Llama 2 keeps on adding it all the time 🥲 @@pieromolino_pb

  • @tankieslayer6927 · 8 months ago

    FINE-TUNED MODEL RESPONSE, Named Entity Recognition (CoNLL++): {"person": ["Such"], "organization": ["Yorkshire"], "location": [], "miscellaneous": []}. Yeah, I am not impressed with the result of this fine-tuning.

    • @pieromolino_pb · 8 months ago

      The input text is: "By the close Yorkshire had turned that into a 37-run advantage but off-spinner Such had scuttled their hopes, taking four for 24 in 48 balls and leaving them hanging on 119 for five and praying for rain." Yorkshire in this case is a sports team, so organization is correct, and Such is a player, so both of the model's predictions are indeed correct. I'd suggest trying to understand better what is going on next time.

    • @The_Real_Goodboy_Link · 2 months ago

      Found the real solution, @tankieslayer6927, click on your icon on the top-right screen here, then settings, advanced settings, delete channel. Then go over to Google and do similarly for your account there. Problem solved!

  • @user-kl7kr6lc8r · 9 months ago

    Thank you for the amazing session 🙏🙏

  • @BiMoba · 9 months ago

    Super helpful experimental results! Thanks for the helpful webinar

  • @BiMoba · 10 months ago

    Does Ludwig support DPO?

    • @mustexist7542 · 10 months ago

      Could you please ask this question in the Ludwig Community Slack? This way more people will know the answer. Thank you very much!

  • @srinivasnimmagadda5817 · 10 months ago

    Insightful seminar with a step-by-step overview of how to use open-source LLMs for commercialization. Take notes!

    • @Predibase · 10 months ago

      Glad you liked it! Make sure to check out our repo of best practices for distillation: pbase.ai/DistillationPlaybook.

  • @topmaxdata · 11 months ago

    It is a nice presentation, thank you! But why do you not use a BERT or T5 model for the text classification task instead of an LLM? Thank you.

    • @BiMoba · 9 months ago

      I think it's the effective context length; LLMs like this should theoretically be more powerful when it comes to classifying long texts like emails and essays, and for complex classification too.

  • @topmaxdata · 11 months ago

    It is a nice demo, thank you. Would you please advise why you use an LLM to do entity extraction? Would it be better to train an NER model? Thank you.

  • @LjaDj5XQKey9mSDxh4 · 11 months ago

    Amazing presentation

  • @ihsanbp7786 · 11 months ago

    Could you share the Colab notebook, please?

    • @Predibase · 11 months ago

      Here is a free notebook with a similar use case for the webinar on Automating Customer Support Tasks with Llama-2-7b: colab.research.google.com/drive/18Xac7MU4mcirHn0-JhOsCsLu_BDOjcls?usp=sharing#scrollTo=f9cf9843-d07f-47b5-9d9e-c0b8005b81f2

    • @serhunya · 8 months ago

      Not available anymore; can you please share a new link? Thanks @@Predibase

  • @twinlens · 1 year ago

    This was really good, thanks guys. After trying a bunch of different ways, and having some success (and plenty of OOM) running GPU machines and hosting models ... your approach makes so much sense. Looking forward to trying it.

  • @yahyamohandisam4539 · 1 year ago

    Amazing explanation, thank you guys!

  • @kevon217 · 1 year ago

    Great discussion.

  • @prashantjain2023 · 1 year ago

    I tried to follow the Colab and was able to fine-tune Llama-2-7b on my own dataset. After fine-tuning, I'm trying to load the fine-tuned model on my VM (30 GB RAM and a T4 GPU), but my system keeps crashing due to OOM. Is there any other tested way to load the fine-tuned model binaries with Ludwig? Would you be able to share code or a video for that?
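
    A common workaround for this kind of OOM, sketched here with transformers/PEFT rather than Ludwig's own loading path (the adapter path is a placeholder): load the base model in 4-bit and attach the fine-tuned adapter on top.

        # Load the 7B base in 4-bit (~4 GB of weights instead of ~13 GB in
        # fp16), then attach the trained LoRA adapter. Illustrative only.
        import torch
        from transformers import AutoModelForCausalLM, BitsAndBytesConfig
        from peft import PeftModel

        bnb = BitsAndBytesConfig(load_in_4bit=True,
                                 bnb_4bit_compute_dtype=torch.float16)
        base = AutoModelForCausalLM.from_pretrained(
            "meta-llama/Llama-2-7b-hf",
            quantization_config=bnb,
            device_map="auto")
        model = PeftModel.from_pretrained(base, "path/to/adapter_checkpoint")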

  • @Edukone · 1 year ago

    Thank you, Predibase, for this informative session. We look forward to including the Predibase training in our course structure.

  • @MarkChekhanovskiy · 1 year ago

    How do I get access to the notebook?

  • @gabrielecastaldi1618 · 1 year ago

    Excellent explanation of Ludwig's features and potential, with a hands-on assessment of alternative approaches to optimizing the output. I look forward to new compelling applications in various industrial fields.

  • @malipetek · 1 year ago

    So is Predibase a competitor to Hugging Face?

    • @Predibase · 1 year ago

      No, Predibase and Hugging Face are complementary offerings. With Predibase, you can use off-the-shelf models from Hugging Face or choose to fine-tune them on your own custom data. Predibase provides an end-to-end low-code AI platform for customizing and deploying any type of ML model, including LLMs. You can also build custom models from scratch using recommended model architectures; the recommendations are based on your data and the type of ML task you are trying to solve. Sign up for a free trial to explore the platform: predibase.com/free-trial/.

    • @malipetek · 1 year ago

      @@Predibase I was considering using a model I found on Hugging Face with an API, but I have no intention of improving the model. Should I go for Hugging Face or Predibase?

    • @arnavgrg · 1 year ago

      Hi @@malipetek, you can choose either: Predibase offers fast inference through an SDK or API, either as a managed SaaS offering or in your own VPC!

    • @devismurugan · 8 months ago

      Hey @@arnavgrg, thanks for the great product. I deployed Predibase through Docker in a VPC. Can you please suggest how to access the self-hosted/VPC Predibase endpoints from LlamaIndex?

    • @devismurugan · 8 months ago

      Hi @@arnavgrg, thanks for the great product. Can you suggest how to use the VPC-based Predibase setup with LlamaIndex?

  • @vishalsingh2705 · 1 year ago

    Brilliant product!

  • @jamesdetweiler · 1 year ago

    Such a great presentation!

  • @jamesdetweiler · 1 year ago

    This is a great presentation!