- 52 videos
- 42,240 views
Predibase
Joined 17 Nov 2022
Deliver GPT-4 performance at a fraction of the cost with small models trained for your use case!
As the developer platform for productionizing open-source AI, Predibase makes it easy to cost-efficiently fine-tune and serve small open-source LLMs on state-of-the-art infrastructure in the cloud, without sacrificing quality.
Built by the team that created the internal AI platforms at Apple and Uber, Predibase is fast, efficient, and scalable for any size job. Predibase pairs an easy-to-use declarative interface for training models with high-end GPU capacity on serverless infra for production serving. Most importantly, Predibase is built on open-source foundations, including Ludwig and LoRAX, and can be deployed in your private cloud so all of your data and models stay in your control.
In production with Fortune 500 and high-growth companies, Predibase is helping engineering teams deliver AI-driven value in days, not months.
Try Predibase for free: predibase.com/free-trial
Optimize Inference for Fine-tuned SLMs
As small language models (SLMs) become a critical part of today’s AI toolkit, teams need reliable and scalable serving infrastructure to meet growing demands. The Predibase Inference Engine simplifies serving infrastructure, making it easier to move models into production faster.
In this tech talk, you’ll learn how to speed up deployments, improve reliability, and reduce costs, all while avoiding the complexity of managing infrastructure.
You'll learn how to:
• 4x your SLM throughput with Turbo LoRA, FP8 and Speculative Decoding
• Effortlessly manage traffic surges with GPU autoscaling
• Ensure high availability SLAs with multi-region load balancing, automatic failover, and more
• Deploy into y...
Views: 167
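To make the serving techniques listed above concrete, here is a minimal sketch using the open-source LoRAX Python client that Predibase builds on. The endpoint URL and adapter id are hypothetical placeholders, not values from the talk:

```python
# Minimal sketch: querying a LoRAX deployment with and without a fine-tuned
# LoRA adapter. Requires the open-source client: pip install lorax-client
from lorax import Client

client = Client("http://127.0.0.1:8080")  # hypothetical deployment endpoint

prompt = "Summarize the following support ticket:\n..."

# Base-model generation (no adapter).
base = client.generate(prompt, max_new_tokens=128)

# The same request routed through a fine-tuned adapter. LoRAX swaps adapters
# per request, so many fine-tuned SLMs can share one base-model deployment.
tuned = client.generate(
    prompt,
    adapter_id="my-org/support-summarizer",  # hypothetical adapter id
    max_new_tokens=128,
)
print(tuned.generated_text)
```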
Videos
How to Fine-tune a SLM for Content Summarization w/ Llama-3.1-8B
190 views · 21 days ago
In this short tutorial, we'll show you how to easily and efficiently fine-tune a small language model, specifically Llama-3.1-8B, to accurately summarize a series of chat conversations. Tutorial Notebook: colab.research.google.com/drive/1fTP0bTEZcLLic3-2oLxuQajvMdcIGf?usp=sharing Get started customizing your own SLMs with our free trial: predibase.com/free-trial. Request a custom demo with an ...
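For readers without access to the notebook, here is a hedged sketch of what such a fine-tune can look like using open-source Ludwig, which Predibase builds on. The dataset file, column names, and trainer settings are illustrative assumptions, not the notebook's exact configuration:

```python
# Sketch: LoRA fine-tuning of Llama-3.1-8B for chat summarization with Ludwig.
# Requires: pip install ludwig  (plus a GPU and Hugging Face access to the model)
import pandas as pd
from ludwig.api import LudwigModel

config = {
    "model_type": "llm",
    "base_model": "meta-llama/Meta-Llama-3.1-8B",
    "adapter": {"type": "lora"},          # parameter-efficient fine-tuning
    "quantization": {"bits": 4},          # fit training on a single GPU
    "input_features": [{"name": "conversation", "type": "text"}],
    "output_features": [{"name": "summary", "type": "text"}],
    "trainer": {"type": "finetune", "epochs": 3, "batch_size": 1},
}

# Hypothetical dataset: one row per chat conversation with a reference summary.
df = pd.read_json("chat_summaries.jsonl", lines=True)

model = LudwigModel(config=config)
model.train(dataset=df)
```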
How Clearwater Analytics Builds AI Agents with Small Language Models (SLMs)
474 views · 1 month ago
Building agentic systems with small fine-tuned open-source language models can power impressive GenAI applications, but what does it take to do this successfully at production scale? In this tech talk, Clearwater Analytics, the leading provider of automated investment analytics and reporting, shares how they built and deployed a multi-agent solution for their customers using fine-tuned SLMs including a...
Your Models, Your Cloud: Secure Private LLMs in Your VPC in less than 30 mins
172 views · 2 months ago
As GenAI projects grow in scale, secure and reliable infra becomes a must, especially when handling sensitive data. For many teams, this creates a dilemma: they can't use commercial LLMs due to data privacy and ownership concerns, and building their own secure, production-grade infra is too big of a challenge. What if you could deploy private LLMs in your cloud without all the hassle? N...
Demo: Synthetic Data Generation
165 views · 2 months ago
Remove barriers to fine-tuning by quickly generating synthetic data based on as few as 10 rows of seed data. In this short demo, you will see how to generate high-quality synthetic data that can then be used to instantly fine-tune your model, all within Predibase. Try Predibase for free: predibase.com/free-trial
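The demo itself is Predibase-specific, but the underlying seed-based idea can be sketched generically: show a strong teacher model a handful of labeled rows and ask for new, varied examples in the same schema. The teacher model, schema, and prompt below are illustrative assumptions, not Predibase's implementation:

```python
# Generic few-shot synthetic data generation from ~10 seed rows.
# Requires: pip install openai, with OPENAI_API_KEY set in the environment.
import json
from openai import OpenAI

client = OpenAI()

seed_rows = [  # hypothetical seed data
    {"text": "Card declined at checkout", "label": "billing"},
    {"text": "App crashes when I open settings", "label": "bug"},
    # ... roughly 10 labeled examples in total
]

prompt = (
    "Here are labeled examples, one JSON object per line:\n"
    + "\n".join(json.dumps(r) for r in seed_rows)
    + "\n\nGenerate 20 new, diverse examples in exactly the same JSON schema, "
    "one per line, with no other text."
)

resp = client.chat.completions.create(
    model="gpt-4o-mini",  # any capable teacher model
    messages=[{"role": "user", "content": prompt}],
)

# Naive parse; a real pipeline would validate, deduplicate, and filter.
synthetic = [
    json.loads(line)
    for line in resp.choices[0].message.content.splitlines()
    if line.strip().startswith("{")
]
```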
Small is the New Big: Why Apple and Other AI Leaders are Betting Big on Small Language Models
399 views · 2 months ago
In this talk at the LLMOps Micro-Summit, Piero Molino, cofounder and CSO of Predibase, discusses the GenAI architecture of the future and how developers can leverage the latest innovations in LLM tech to build big with small models. Specifically, he explores the modern GenAI architecture as outlined by Apple during the launch of their new Apple Intelligence platform and the different technique...
Building Better Models Faster with Synthetic Data
157 views · 2 months ago
In this talk at the LLMOps Micro-Summit, Maarten Van Segbroeck, Head of Applied Science at Gretel, discusses the evolution of GenAI, data as a blocker to developing better models and how you can use new techniques to generate high quality synthetic data to fine-tune highly accurate SLMs for your use case. Session slides: pbase.ai/3T27VOu
Fine-Tuning SLMs for Enterprise-Grade Evaluation & Observability
326 views · 2 months ago
In this talk at the LLMOps Micro-Summit, Atin Sanyal, Co-founder & CTO of Galileo, discusses techniques for combatting hallucinations in LLMs with a focus on new methods in fine-tuning small language models (SLMs) to observe and evaluate models. Session slides: pbase.ai/46Z4cXQ
Next Gen Inference for Fine-tuned LLMs - Blazing Fast & Cost-Effective
263 views · 2 months ago
In this talk, Arnav Garg, ML Eng Leader at Predibase, discusses new innovations in fine-tuned model inference. Specifically, he takes a deep dive into Turbo LoRA, a new parameter-efficient fine-tuning method pioneered at Predibase that increases text generation throughput by 2-3x while simultaneously achieving task-specific response quality in line with LoRA. While existing fine-tuning methods focus onl...
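For intuition, here is a toy sketch of the draft-and-verify loop that speculative decoding is built on. It is not Turbo LoRA itself (which trains draft capacity into the adapter and verifies all drafted positions in a single forward pass); the stand-in "models" here are deterministic functions rather than real networks:

```python
# Toy speculative decoding: a cheap draft model proposes k tokens, and the
# expensive target model keeps the longest prefix it agrees with.

def target_next(tokens):
    # Stand-in "expensive" target model: deterministic next-token function.
    return (sum(tokens) * 31 + len(tokens)) % 100

def draft_next(tokens):
    # Stand-in "cheap" draft model that agrees with the target most of the time.
    return target_next(tokens) if sum(tokens) % 5 else 0

def speculative_step(tokens, k=4):
    # 1) Draft k tokens with the cheap model.
    ctx, drafted = list(tokens), []
    for _ in range(k):
        t = draft_next(ctx)
        drafted.append(t)
        ctx.append(t)
    # 2) Verify with the target. Here it runs once per position; the real
    #    speedup comes from scoring all k positions in one batched pass.
    ctx, accepted = list(tokens), []
    for t in drafted:
        expected = target_next(ctx)
        if expected != t:
            accepted.append(expected)  # on mismatch, keep the target's token
            break
        accepted.append(t)
        ctx.append(t)
    return tokens + accepted

seq = [1, 2, 3]
for _ in range(5):
    seq = speculative_step(seq)
print(seq)  # identical to decoding with target_next alone
```

Because a mismatch always falls back to the target's own token, the output matches plain greedy decoding with the target; the draft model only changes how much work each expensive step produces.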
Streamlining Background Checks with Fine-tuned Small Language Models on Predibase
246 views · 2 months ago
In this talk, Vlad Bukhin, Staff ML Engineer at Checkr, discusses how they use LLM classifiers to help automate the complex process of transforming messy unstructured text data into one of 230 categories used to populate background checks. Specifically, he walks through his journey from starting with an OpenAI and RAG implementation to ultimately landing on fine-tuning small language models on Pr...
Welcome Address and Agenda Overview
264 views · 2 months ago
In this welcome address, Devvret Rishi, cofounder and CEO of Predibase, discusses the state of GenAI and the future of small models and runs through the different talks on the agenda for the summit.
Summit Agenda:
• Why Apple and Other AI Leaders are Betting Big on Small Language Models • Piero Molino, Cofounder & CSO, Predibase • Slides: pbase.ai/3AuG5nJ
• GenAI at Production Scale with SLMs th...
Beat GPT-4 with a Small Model and 10 Rows of Data
623 views · 2 months ago
While fine-tuning small language models with high quality datasets can consistently yield results that rival large foundation models like GPT-4, assembling sufficient fine-tuning training data is a barrier for many teams. This webinar introduces a novel approach that could change that paradigm. By leveraging large language models like GPT-4 and Llama-3.1-405b to generate synthetic data, we expl...
How to Reduce Your OpenAI Spend by up to 90% with Small Language Models
1.4K views · 3 months ago
OpenAI has revolutionized the way enterprises build with large language models. A developer can create a high-performing AI prototype in just a few days, but when it’s time to push to production, the cost of GPT-4 skyrockets, oftentimes reaching hundreds of thousands of dollars a month. The result: fewer use cases deployed, fewer users engaged, and more value left on the table. So, what does it...
Predibase Platform Overview: Small Language Models for Specialized AI
318 views · 3 months ago
Discover Predibase, the leading developer platform for building and deploying task-specific small language models (SLMs) in our cloud or yours. This video demonstrates how Predibase enables easy fine-tuning and serving of models on scalable, cost-effective infrastructure, meeting the needs of both Fortune 500 companies and innovative startups. Learn how Predibase's open-source foundations and p...
Introducing Solar LLM: The Best LLM for Fine-tuning that beats GPT-4, exclusively on Predibase
623 views · 4 months ago
Meet Solar LLM, Upstage's best-in-class ~11B-parameter model. Released late last year, Solar LLM has quickly proven to be the best small language model (SLM) to fine-tune for task-specific applications. In a recent comparison with 15 other leading SLMs, Solar LLM came out on top in over 50% of our fine-tuning experiments. We are also excited to announce that Solar is available for fine-tuning ...
Snowflake + Predibase: Smaller, faster & cheaper LLMs that beat GPT-4
293 views · 4 months ago
Speed Up LLM Development with Synthetic Data and Fine-tuning
209 views · 4 months ago
How we accelerated LLM fine-tuning by 15x in 15 days
341 views · 6 months ago
Dickens: an LLM that Writes Great Expectations
150 views · 6 months ago
Virtual Workshop: Fine-tune Your Own LLMs that Rival GPT-4
593 views · 6 months ago
LLM Fine-tuning Tutorial: Generate Docstring with Fine-tuned CodeLlama-13b
361 views · 7 months ago
LoRA Bake-off: Comparing Fine-Tuned Open-source LLMs that Rival GPT-4
1.4K views · 7 months ago
Ludwig Hackathon Winner: Building a Tax FAQ Chatbot with LLMs
394 views · 8 months ago
Ludwig Hackathon Winner: Assessing Health Data with ML
166 views · 8 months ago
LoRA Land: How We Trained 25 Fine-Tuned Mistral-7b Models that Outperform GPT-4
6K views · 8 months ago
5 Reasons Why Adapters are the Future of Fine-tuning LLMs
1.6K views · 8 months ago
Fine-Tuning Zephyr-7B to Analyze Customer Support Call Logs
723 views · 9 months ago
12 Best Practices for Distilling Smaller LLMs with GPT
1.6K views · 10 months ago
Can we use Phi-3 3B or Gemini 2B for fine-tuning on custom data? The task: given a job description, extract only the technical skills from it.
Absolutely! Both of these models should do fairly well since the task you’re describing is focused and narrow.
Well explained! So training small LLMs for individual tasks could be the key to better text classification, right?
Thanks for the presentation. I am still building things myself, but this gave me the next steps I needed to move out of the OpenAI fold.
Super cool!
Excellent.
@55:12: Wouldn't it be more appropriate to use <inst></inst> (or whatever the instruction format of the underlying LLM is) instead of relying on a customized instruction format? You can use the same prompt, but the format should follow the underlying LLM.
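For context on the commenter's point: each chat-tuned base model publishes its own instruction template, and the same task prompt can be wrapped to match whichever model is being fine-tuned or prompted. A small sketch; the templates shown are the published Llama-2-chat and ChatML formats, and the task string is illustrative:

```python
# Wrapping one task prompt in different models' published instruction formats.

def format_llama2_chat(instruction: str) -> str:
    # Llama-2-chat's template (simplified, no system prompt).
    return f"<s>[INST] {instruction} [/INST]"

def format_chatml(instruction: str) -> str:
    # ChatML-style template used by several chat models.
    return f"<|im_start|>user\n{instruction}<|im_end|>\n<|im_start|>assistant\n"

task = "Extract the technical skills from the job description below as a JSON list.\n..."
print(format_llama2_chat(task))
print(format_chatml(task))
```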
It looks like fine-tuning a model can be a great way to get better results with cheaper models. But I miss real cases and examples in this video. The video is too technical, and the slides and content seem understandable only to the company, not to end users. Sorry for the hard comment, but I think you have a great project that deserves to be explained more simply. Thank you.
🎉👏👏
@5:08 - 😂😂😂
This is great! Just the slide comparing base performance vs performance after fine-tuning makes this exercise worthwhile: it proves that differences between foundation models are not *that* large, and that pure prompting is not sufficient to reach good performance (and once you fine-tune, most differences in base models disappear; though Mistral models do seem to be significantly ahead!) Thanks for putting this together! If you're considering a similar comparison in the future, I'd be curious to see the effect of int4 quantization (with and without Quantization-Aware Training) on prediction quality. It's hard to find proper experiments testing this; most evals report latency alone without a proper analysis of the quality cost (and how to reduce it, e.g. with QAT).
Thanks for the amazing demonstration. I believe the notebook is private; I've sent a request to access it. Approval would be appreciated, and please also share the Medium blog link. Thank you.
Nice tutorial.
Have you guys looked at the next generation of quantisation, e.g. ternary/1.58-bit quantisation? It's a different technique from conventional quantisation because your matrices only contain 0, 1, and -1, and you eliminate matrix multiplication almost entirely. The intuition is that the combination may not bring quite as many benefits, but it might be interesting to see how it performs on CPU architectures, for instance.
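For context on the idea the commenter raises: in ternary (1.58-bit) schemes such as BitNet b1.58, weights are scaled by their mean absolute value and rounded to {-1, 0, +1}, so a matrix-vector product needs no weight multiplications at all. A toy NumPy sketch of that flavor of quantisation, not any paper's exact recipe:

```python
# Toy ternary (1.58-bit) weight quantization and mul-free matrix-vector product.
import numpy as np

def ternarize(W):
    scale = np.abs(W).mean() + 1e-8        # per-tensor "absmean" scale
    Wq = np.clip(np.round(W / scale), -1, 1).astype(np.int8)
    return Wq, scale

def ternary_matvec(Wq, scale, x):
    # Each weight is -1, 0, or +1: inputs are added, subtracted, or skipped.
    out = np.array([x[row == 1].sum() - x[row == -1].sum() for row in Wq])
    return out * scale

rng = np.random.default_rng(0)
W, x = rng.standard_normal((4, 8)), rng.standard_normal(8)
Wq, s = ternarize(W)
print(ternary_matvec(Wq, s, x))  # coarse approximation of...
print(W @ x)                     # ...the full-precision product
```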
Nice !
Thanks! How did you manage to remove the surrounding text of the LLM response?
It's a side effect of fine-tuning on outputs that contain only the JSON without any other text.
So, we cannot achieve this without fine-tuning? Llama2 keeps on adding it all the time 🥲@@pieromolino_pb
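A common workaround when fine-tuning is not an option, sketched below: prompt the model to answer in JSON, then strip the surrounding chatter by pulling the first balanced JSON object out of the raw response. This is a generic post-processing trick, not something from the video:

```python
# Extract the first balanced {...} object from an LLM response and parse it.
import json

def extract_first_json(text):
    start = text.find("{")
    while start != -1:
        depth = 0
        for i in range(start, len(text)):
            if text[i] == "{":
                depth += 1
            elif text[i] == "}":
                depth -= 1
                if depth == 0:  # balanced candidate found; try to parse it
                    try:
                        return json.loads(text[start : i + 1])
                    except json.JSONDecodeError:
                        break  # not valid JSON; try the next opening brace
        start = text.find("{", start + 1)
    return None

raw = 'Sure! Here is the JSON you asked for: {"person": ["Such"]} Hope it helps!'
print(extract_first_json(raw))  # {'person': ['Such']}
```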
FINE-TUNED MODEL RESPONSE, Named Entity Recognition (CoNLL++):
{"person": ["Such"], "organization": ["Yorkshire"], "location": [], "miscellaneous": []}
Yeah, I am not impressed with the result of this fine-tuning.
The input text is: "By the close Yorkshire had turned that into a 37-run advantage but off-spinner Such had scuttled their hopes, taking four for 24 in 48 balls and leaving them hanging on 119 for five and praying for rain." Yorkshire in this case is a sports team, so organization is correct, and Such is a player, so both of the model's predictions are indeed correct. I'd suggest trying to understand better what is going on next time.
Found the real solution, @tankieslayer6927, click on your icon on the top-right screen here, then settings, advanced settings, delete channel. Then go over to Google and do similarly for your account there. Problem solved!
Thank you for the amazing session 🙏🙏
Super helpful experimental results! Thanks for the great webinar.
Does Ludwig support DPO?
Could you please ask this question in the Ludwig Community Slack? This way more people will know the answer. Thank you very much!
Insightful seminar with a step-by-step overview of how to use open-source LLMs for commercialization. Take notes!
Glad you liked it! Make sure to check out our repo of best practices for distillation: pbase.ai/DistillationPlaybook.
It is a nice presentation, thank you! But why not use a BERT or T5 model for the text classification task instead of an LLM? Thank you.
I think it's the effective context length; LLMs like this should theoretically be more powerful when it comes to classifying long texts like emails and essays, and for complex classification too.
It is a nice demo, thank you. Could you please advise why you use an LLM for entity extraction? Wouldn't it be better to train a NER model? Thank you.
Amazing presentation
Could you share the Colab notebook, please?
Here is a free notebook with a similar use case for the webinar on Automating Customer Support Tasks with Llama-2-7b: colab.research.google.com/drive/18Xac7MU4mcirHn0-JhOsCsLu_BDOjcls?usp=sharing#scrollTo=f9cf9843-d07f-47b5-9d9e-c0b8005b81f2
Not available anymore, can you please share a new link? Thanks @@Predibase
This was really good, thanks guys. After trying a bunch of different ways, and having some success (and plenty of OOM) running GPU machines and hosting models ... your approach makes so much sense. Looking forward to trying it.
Amazing explanation, Thank you guys
Great discussion.
I tried to follow the Colab and I was able to fine-tune Llama-2-7b on my own dataset. After fine-tuning, I'm trying to load the fine-tuned model on my VM (30GB RAM and a T4 GPU) but my system keeps crashing due to OOM. Is there any other tested way to load the fine-tuned model binaries with Ludwig? Would you be able to share code / a video for that?
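One thing worth trying for this kind of OOM (an assumption about the setup, not an official Ludwig recipe): load the base model 4-bit-quantized via bitsandbytes and attach the exported LoRA adapter weights with peft. The model name and adapter path below are hypothetical placeholders:

```python
# Load a LoRA-fine-tuned Llama-2-7b for inference on a single T4 by
# quantizing the base model to 4 bits. Requires: transformers, peft,
# bitsandbytes, and a CUDA GPU.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
from peft import PeftModel

bnb = BitsAndBytesConfig(load_in_4bit=True, bnb_4bit_compute_dtype=torch.float16)

base = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Llama-2-7b-hf",          # base model used for fine-tuning
    quantization_config=bnb,
    device_map="auto",
)
model = PeftModel.from_pretrained(base, "results/lora_adapter")  # exported adapter dir
tok = AutoTokenizer.from_pretrained("meta-llama/Llama-2-7b-hf")

inputs = tok("Classify this ticket: ...", return_tensors="pt").to(base.device)
print(tok.decode(model.generate(**inputs, max_new_tokens=64)[0]))
```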
Thank you Predibase for this informative session. We look forward to including the Predibase training in our course structure.
How do I get access to the notebook?
Excellent explanation of Ludwig's features and potential, with a hands-on assessment of alternative approaches to optimize the output. I look forward to new compelling applications in various industrial fields.
So is Predibase a competitor to Huggingface?
No, Predibase and Huggingface are complementary offerings. With Predibase, you can use off-the-shelf models from Huggingface or choose to fine-tune them on your own custom data. Predibase provides an end-to-end low-code AI platform for customizing and deploying any type of ML model including LLMs. You can also build custom models from scratch using recommended model architectures. The recommendations are provided based on your data and the type of ML task you are trying to solve. Sign up for a free trial to explore the platform: predibase.com/free-trial/.
@@Predibase I was considering using a model I found on Huggingface with an API, but I have no intention of improving the model. Should I go for Huggingface or Predibase?
Hi @@malipetek , you can choose either - Predibase offers fast inference through an SDK or API, either through a managed SaaS offering or in your own VPC!
Hey @@arnavgrg, thanks for the great product. I deployed Predibase through Docker in a VPC. Can you please suggest how to access the self-hosted/VPC Predibase endpoints from LlamaIndex?
Hi @@arnavgrg, thanks for the great product. Can you suggest how to use the VPC-based Predibase setup with LlamaIndex?
brilliant product
Such a great presentation!
This is a great presentation!