Dolly 2.0 has a fully permissive license for commercial use, I thought? It's listed in your talk as a proprietary license. Also, the Mosaic fully open-source models look promising (MPT-7B).
This is correct! We got a bit tripped up on Dolly 2.0, as the licensing is weirdly complicated.
From what we can tell, it has an MIT license on the weights (huggingface.co/databricks/dolly-v2-12b), an Apache license on the training/inference code (github.com/databrickslabs/dolly/blob/master/LICENSE) and a CC-BY-SA license on the training data (github.com/databrickslabs/dolly#model-overview).
MPT wasn't out yet when these were recorded (three weeks ago, late April 2023), but we agree it looks promising. Especially the long context window models!
Awesome, appreciate the detailed response!
Fantastic "part 3" in a sequence of topics. The speaker (Josh) is very comfortable explaining application development for LLMs, which is our main focus in developing an AI certificate at our college. Josh is clearly experienced and enthusiastic about this field, and explains topics well!
This is probably the best talk on LLMOps from the dev perspective, as opposed to from the DevOps perspective, on the internet.
I started working in this space a few months after GloVe and Word2Vec embeddings came out back in 2014. I have to say, when I see the word "bootcamp" in a title I usually run for the hills, but this guy gave a great presentation with a coherence and fluency showing he actually has experience and didn't just learn this from index cards 5 minutes before the presentation (my usual experience with bootcamps). Bravo!
If anyone else is exploring this chat, it's good to note that because LLMs are moving so fast, even more Apache 2.0 models have been released since this presentation. RedPajama and GPT4All-J variants have Apache 2.0 licenses, and from memory their performance is decent.
👍
What is the latest on these?
A gem of a resource. Concise and clear.
What a phenomenal talk! Amazing slides, kept simple, yet they really do add something to your great explanations. Showing the difference between then and now, between DNN and LLM operations, was also great, and the wrap-up in the first half was very welcome.
An amazing presentation. We definitely need more videos/content like this that can help navigate the fast-paced, dynamic tech world. Thank you.
Dear YouTube algo, please give me more recommendations like this.
Very useful session. I've learned a lot -- especially the evaluation metrics for LLMs. Thank you!
Dude, this was awesome. Thanks for spilling the beans on what's to come in our space ;)
Goldmine of information. Love it!
This is exactly the thing I was looking for (having made a codebase analysis tool with an LLM that I want to share with my team). Thank you for making this video for free. Much appreciation to whoever runs this channel.
My boy Josh Tobin. Legend.
Great talk; there's a lot of work to be done in the LLM deployment/production scene for software engineering/DevOps.
I like how this was recorded maybe two weeks ago?
It's already a bit aged: look at Anthropic announcing their 100k context (not out yet), and even more promising, the 65k-context MPT-7B-StoryWriter-65k+ by Mosaic.
Crazy how this field is progressing.
## Choosing a base language model
- Trade-offs to consider:
- Out-of-the-box quality
- Speed and latency
- Cost
- Fine-tunability
- Data security
- License permissiveness
- Conclusion: Start with GPT-4 for most use cases
## Managing prompts and chains
- Level 1: No tracking
- Level 2: Manage in git
- Level 3: Use a specialized tool (if needed)
## Evaluating performance
- Build evaluation set incrementally:
1. Start small
2. Use LM to generate test cases
3. Add more data as you discover failure modes
- Metrics (see the sketch after this outline):
- Accuracy (if correct answer exists)
- Reference matching (if reference answer exists)
- Which is better? (if previous answer exists)
- Incorporates feedback? (if human feedback exists)
- Static metrics (if no data exists)
## Deployment
- Call API from frontend
- Isolate LM logic as separate service (if needed)
## Monitoring
- Outcomes
- Model performance metrics
- Common issues: incorrect answers, toxicity, etc.
## Improving the model
- Use feedback to improve prompt
- Optionally fine-tune the model
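To make the metric fallback above concrete, here is a minimal sketch (my own illustration, not code from the talk) of picking the strongest available signal per eval example; all names and heuristics are hypothetical stand-ins:

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class EvalExample:
    prompt: str
    model_answer: str
    correct_answer: Optional[str] = None    # exact ground truth, if any
    reference_answer: Optional[str] = None  # a known-good answer, if any
    previous_answer: Optional[str] = None   # output of the last prompt/model version
    human_feedback: Optional[str] = None    # free-form user feedback, if any


def score(example: EvalExample) -> tuple[str, float]:
    """Return (metric_name, score), falling back through the metric list above."""
    if example.correct_answer is not None:
        # Accuracy: exact match against the known-correct answer.
        return "accuracy", float(
            example.model_answer.strip() == example.correct_answer.strip()
        )
    if example.reference_answer is not None:
        # Reference matching: toy token overlap here; embedding similarity
        # or an LLM judge would be a stronger real-world choice.
        ref = set(example.reference_answer.lower().split())
        out = set(example.model_answer.lower().split())
        return "reference_overlap", len(ref & out) / max(len(ref), 1)
    if example.previous_answer is not None:
        # "Which is better?": normally an LLM or human judges A vs. B;
        # a tie placeholder stands in for that judgment here.
        return "pairwise_preference", 0.5
    if example.human_feedback is not None:
        # "Incorporates feedback?": crude check that the feedback text
        # is reflected in the new answer.
        addressed = example.human_feedback.lower() in example.model_answer.lower()
        return "feedback_incorporated", float(addressed)
    # Static metrics: no labels at all, so check basic output validity.
    return "static_nonempty", float(len(example.model_answer.strip()) > 0)
```

In practice, reference matching and pairwise preference are usually delegated to embedding similarity or an LLM judge rather than the toy heuristics above.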
Great talk!!
Fantastic talk!
A very impressive, high-quality lecture; excited to learn more. I'm looking to get started with making my own chatbot tutor.
I found exactly what I was searching for! The explanation was amazing and the insights were great
So good, man. I really loved the model comparisons.
All one needs to do to track prompt accuracy, at least at a basic level, is track prompts in git, as he mentions, but then have an automation pipeline that runs prompt changes against a ground-truth or fine-tuning data set, probably in CI/CD. Have that pipeline output statistical measurements and voilà: automated prompt comparison.
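As a sketch of that pipeline (my own hypothetical illustration; the file names, the JSONL format, and `call_model` are assumptions, not anything from the talk), a CI step could look like:

```python
import json
import pathlib
import statistics


def call_model(prompt: str, example_input: str) -> str:
    """Placeholder: replace with your actual LLM call (API or local model)."""
    raise NotImplementedError


def evaluate_prompt(prompt_path: str, dataset_path: str) -> float:
    """Score a git-tracked prompt file against a ground-truth JSONL set."""
    prompt = pathlib.Path(prompt_path).read_text()
    examples = [
        json.loads(line)
        for line in pathlib.Path(dataset_path).read_text().splitlines()
        if line.strip()
    ]
    # Each example looks like {"input": "...", "expected": "..."}.
    scores = [
        float(call_model(prompt, ex["input"]).strip() == ex["expected"].strip())
        for ex in examples
    ]
    return statistics.mean(scores)


if __name__ == "__main__":
    # In CI/CD, fail the build if the changed prompt regresses below a bar.
    accuracy = evaluate_prompt("prompts/summarize.txt", "eval/ground_truth.jsonl")
    print(f"prompt accuracy: {accuracy:.2%}")
    assert accuracy >= 0.85, "prompt change regressed below the accuracy bar"
```

Diffing these numbers across prompt versions in git gives exactly the automated comparison described.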
Wow. Great timing for this. Thanks! The only model I'm missing in the comparison is Open Assistant, which seems to be fully "open".
Really good talk! New to LLMs and learned a lot. At the end, when you were talking about the iteration cycle, you described how you would come up with an idea as an individual, experiment a bit, then share it with your team.
As a software developer, I find that pair or mob programming is a really good approach at the start of a new piece of work. Do you have any thoughts on 'pair-prompting' as a way to improve the initial stage of the project? After all, interacting with an LLM is a conversation, so having a few people working together on refining prompts could help reduce the biases/assumptions you introduce as an individual.
It feels quite ancient after OpenAI Dev Day? Things can become obsolete in months?
Why weren't GPT-J models included in the open source discussion?
Can we find the slides used anywhere? The fine-tuning-related slides were skipped due to a shortage of time, but it seemed there was a lot of useful information in them too. If a link to the slides is available, kindly share.
Ignore the comment. I found the slide links in the description. Thank you! Excellent presentation ❤
So, my question is: how is Flan-T5's context length listed as 2K? As far as I know, it should be 512. Am I wrong?
Well, your open-source slide was just wrong. OpenRAIL absolutely does allow commercial use for both the BLOOM and BLOOMZ models. Oddly enough, BLOOMZ, which is a lot like GPT-3.5, is conspicuously missing from your slides.
The move from MLOps to LLMOps will be quite humbling for the MLOps world/hype. LLMs mean custom internal DS/ML functions are no longer that important when you have a commodity API to use. LLMOps then just becomes basic data engineering and management again.
Definitely possible -- that's why we spent less time on deployment in the LLM Bootcamp than in our Deep Learning Course.
But if FOSS models and finetuning take off, then MLOps concerns about experiment management and model versioning will come roaring back!
I'm surprised to see claude-instant got only 1 out of 4 stars for quality.
I've been using both ChatGPT 3.5 and claude-instant, and I much prefer claude-instant.
In my opinion, if ChatGPT 3.5 receives 3 stars, then claude-instant deserves at least the same.
The issue with OpenAI's models is that they put too many filters/constraints on them: if I ask ChatGPT something considered "sensitive", it just outright refuses to answer the question.
I'm doing initial coding on an open-source model; then I can switch to GPT-4 once I know I'm not doing anything stupid like infinite loops.
Interesting approach! For intensive and open-ended applications like agents, the LM calls can definitely add up to a ton of tokens.
When using model providers, follow best practices for all cloud services, like putting guardrails in place to limit the pain from surprise bills.
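As one hedged sketch of such a guardrail (a hypothetical wrapper, not any provider's actual API; the 4-characters-per-token estimate is only a rough rule of thumb):

```python
class BudgetExceeded(RuntimeError):
    pass


class BudgetedClient:
    """Wrap any provider call in a hard token budget to cap surprise bills."""

    def __init__(self, call_api, max_tokens_total: int):
        self._call_api = call_api        # your real provider call
        self._budget = max_tokens_total  # hard cap across all calls
        self._used = 0

    def complete(self, prompt: str, max_tokens: int = 512) -> str:
        # Rough request-side estimate: ~4 characters per token, plus the
        # maximum number of tokens the completion is allowed to generate.
        estimate = len(prompt) // 4 + max_tokens
        if self._used + estimate > self._budget:
            raise BudgetExceeded(
                f"call needs ~{estimate} tokens; "
                f"only {self._budget - self._used} left in budget"
            )
        response = self._call_api(prompt, max_tokens=max_tokens)
        self._used += estimate  # better: record exact usage the provider reports
        return response
```

A production version would read exact token counts from the provider's usage metadata and alert well before the hard cap is hit.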
I was expecting more on deployment.
Come on.. 😢😮😅
You guys barely mentioned prompt injection attacks. Come on, this is a crucial aspect for the future of LLMs.
We agree that mitigating prompt injection is critical for LLM-powered apps that use tools or access possibly sensitive information!
Because prompt injection isn't solved yet, we covered it in our What's Next? lecture, where we discuss multiple safety+security concerns for LLM software: ruclips.net/video/ax_R4yz1WwM/видео.html
cool
Claude was supposed to be 100k context...?
That only just dropped. I doubt this is up to date? I'm only at 5:33 atm.
@@StephenRayner It is supported in Poe now
Correct! These videos are about three weeks old, and a lot happened in the FOSS model world in that time.
@@The_Full_Stack Does this mean that the time to obsolescence is getting drastically shorter?
Dolly is not proprietary.
Llama is OSS now
Source? Can't find anything regarding this
On the official model card it still says: License Non-commercial bespoke license
@@sachinkun21 OpenLLaMA is an open reproduction of LLaMA with the original architecture but trained on the RedPajama dataset.
@@sachinkun21 Released under the name OpenLLaMA under Apache 2.0
Oh, that one. I thought you meant Meta's. I haven't experimented with OpenLLaMA, so I can't say anything about its performance, but Meta's LLaMA, if open-sourced, will also open doors for its popular dialogue derivatives such as Vicuna and Koala.
This presentation is horribly outdated after one week. There are now super-competent, uncensored open-source LLMs that can be used as Auto-GPTs with LangChain and Pinecone and 100K tokens. C'mon, this bootcamp needs to chill or go streaming every second day to stay relevant.
Some of the material we cover does change quickly, and the state of play for FOSS models happened to change a lot in the three weeks since we recorded this video! Here's hoping they keep improving.
We really like HELM (crfm.stanford.edu/helm) and the LMSys leaderboard (chat.lmsys.org/?leaderboard) for keeping up with capabilities and benchmarking models against one another. What do you use?
The presentation is mainly about how to evaluate, test, and deploy LLMs. Can you elaborate on what is "horribly" outdated about these topics?
@@The_Full_Stack I used hyperbole to point to the super-fast progress AI is making, and to suggest that this kind of conference would probably be better off waiting until the progress reaches a steady state. I didn't mean to hurt people's feelings. Sorry if I did.
@@BodinhoDE The Vicuna 13B model and comparable ones are far, far better than what is suggested here (where they are rated as basically useless); that's the only misleading part of this video. But also, they can't update the video every week, so it's hard to be annoyed!