Superwise
  • Videos: 13
  • Views: 58,261
Beyond fine tuning: Approaches in LLM optimization
We all want to optimize our LLMs, but fine-tuning, let alone training, is a resource-intensive and expensive task. This webinar will dive into techniques, methodologies, and best-practice approaches to LLM optimization without any fine-tuning involved.
Key topics
- Prompt optimization and evals: TDD basics for LLMs, 3 paths to evals, and all things synthetic.
- Optimization with production insights: Tuning vs. optimization, RLHF, and advanced RAG optimization techniques such as self-querying, contextual compression, and parent and child chunking.
- LLM architectures, deployments, and impacts on optimization: model pruning, quantization, semantic caching, and edge.
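Of the techniques listed, semantic caching is easy to sketch end-to-end. Below is a minimal illustration (not code from the webinar): a real system would use a sentence-embedding model, so the bag-of-words `embed` here is only a stand-in.

```python
import math
from collections import Counter

def embed(text):
    # Toy embedding: punctuation-stripped bag-of-words counts.
    return Counter(w.strip("?!.,") for w in text.lower().split())

def cosine(a, b):
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

class SemanticCache:
    """Return a cached LLM response when a new query is close enough
    to one already answered, skipping the expensive model call."""

    def __init__(self, threshold=0.8):
        self.threshold = threshold
        self.entries = []  # (embedding, response) pairs

    def put(self, query, response):
        self.entries.append((embed(query), response))

    def get(self, query):
        q = embed(query)
        scored = [(cosine(q, e), r) for e, r in self.entries]
        if scored:
            best_score, best_response = max(scored)
            if best_score >= self.threshold:
                return best_response  # cache hit
        return None  # cache miss: caller falls through to the LLM

cache = SemanticCache()
cache.put("what is semantic caching", "Reusing LLM answers for similar queries.")
print(cache.get("what is semantic caching?"))
```

The threshold trades recall for correctness: too low and unrelated queries get stale answers, too high and near-duplicates still hit the model.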
Deck:
go.super...
624 views

Videos

Unraveling prompt engineering
1.9K views • 1 year ago
On the face of things, prompting seems intuitive and straightforward. With that said, you'd be hard-pressed to find a developer who'll say that it's easy. This webinar will focus on unraveling the "art" of prompt engineering into best practices and methodologies. Key topics LLM compatibility: considerations in use case, task, model, infrastructure, and prompting selection and matching. The much...
Emerging architectures for LLM applications
51K views • 1 year ago
Everything from training models from scratch and fine-tuning open-source models to using hosted APIs, with a particular emphasis on the design pattern of in-context learning. Key topics we'll cover during the session include: - Data preprocessing and embedding, focusing on the role of contextual data, embeddings, and vector databases in creating effective LLM applications. - Strategies for prom...
To train or not to train your LLM
2.2K views • 1 year ago
LLMs are entering a critical phase as an increasing number of companies move towards integrating them into real-life business applications. While GPT-style models perform admirably initially, developing practical solutions in real-world scenarios remains complex. Is prompt engineering alone sufficient to achieve the desired accuracy? Are you open to sharing your data with a major vendor? Under ...
Improving search relevance with ML monitoring
453 views • 1 year ago
When building machine learning systems for ranking and search relevance, it's crucial to measure the quality of the results for ongoing improvement of your models. Furthermore, you have to be able to identify edge cases in which your models are wrong, as well as detect corrupt data before it's exposed to users. In this session, we'll show how you can track search results quality on architecture...
A guide to putting together a continuous ML stack
480 views • 2 years ago
Let's dive into MLOps CI/CD/CT pipeline automation. In part 1, we'll focus on how to put together a continuous ML pipeline to train, deploy, monitor, and retrain your models. Part 2 will focus on automations and production-first insights to detect and resolve issues faster.
A guide to multi tenancy architectures in ML
222 views • 2 years ago
This session will cover architectural considerations for multi-tenancy in ML, best practices in traditional software engineering that can be copy/pasted over to MLOps, as well as new considerations unique to ML - Why and when to use ML multi-tenancy - Architecture pros and cons - Observability and monitoring at high scale - Security and compliance considerations
Model observability is all you need
96 views • 2 years ago
In this webinar, our head of product, Amir Servi, will walk you through Superwise’s ML observability platform. He will demonstrate how to integrate a model, analyze its behavior in production, configure monitoring policies and get alerted when anomalies are detected, and create a retraining strategy based on the above. You’ll learn how to do the same for your own model! What you’ll learn: - How...
A guide to putting together a continuous ML stack
77 views • 2 years ago
A guide to putting together a continuous ML stack
Continuous MLOps pipelines: A dive into continuous training automation
396 views • 2 years ago
In this webinar, we’ll learn how to implement the first level of MLOps maturity and perform continuous training of the model by automating the ML pipeline. We'll start with the ML pipeline and see how we can detect performance degradation and data drift in order to trigger the pipeline and create a new model based on fresh data. What you'll learn: - See an example of an ML pipeline implementati...
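The drift-to-trigger step described above can be sketched with a simple distribution test (an illustration only, not the webinar's pipeline; the PSI formula is standard, but the 0.2 threshold and the binning scheme are assumptions):

```python
import math

def psi(expected, actual, bins=10):
    """Population Stability Index between a reference sample and live traffic."""
    lo, hi = min(expected), max(expected)
    width = (hi - lo) / bins or 1.0

    def frac(sample, i):
        left, right = lo + i * width, lo + (i + 1) * width
        n = sum(1 for x in sample
                if left <= x < right or (i == bins - 1 and x == hi))
        return max(n / len(sample), 1e-6)  # avoid log(0) for empty bins

    return sum((frac(actual, i) - frac(expected, i))
               * math.log(frac(actual, i) / frac(expected, i))
               for i in range(bins))

def maybe_retrain(reference, live, threshold=0.2):
    # A PSI above ~0.2 is a common rule of thumb for significant drift.
    if psi(reference, live) > threshold:
        return "trigger retraining pipeline"
    return "no action"

reference = [i / 100 for i in range(100)]      # training-time feature sample
drifted = [0.5 + i / 250 for i in range(100)]  # live traffic shifted upward
print(maybe_retrain(reference, drifted))
```

In a real stack the "trigger" branch would kick off the pipeline (e.g., via an orchestrator webhook) rather than return a string.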
Superwise Model Observability
530 views • 2 years ago
Easily monitor and observe your entire model inference flow and ML decision-making process. Simple, customizable, scalable, and secure ML monitoring.
superwise x MachineLearning RoundUp - WEBINAR HIGHLIGHTS
116 views • 3 years ago
This webinar features insights from the editors of the Machine Learning Ops Roundup, and a few "stories from the field" on why you should monitor your models in production. The 5-minute highlights focus on a few questions: - How is ML different from traditional software? - What does it take to create a top ML infrastructure? - Who owns the models in production? - What is the need for monitoring? - What is AI...
How monday.com uses superwise to monitor their marketing ML models
222 views • 3 years ago
What happens when the #1 productivity solution needs to scale its use of AI? Check out the highlights of the webinar led by monday.com to learn the best practices of their marketing and data science teams!

Comments

  • @maria-wh3km
    @maria-wh3km 2 months ago

    it was awesome, thanks guys, keep up the good work.

  • @zugbob
    @zugbob 6 months ago

    These are some of the best videos I've seen on this topic.

  • @parth905
    @parth905 7 months ago

    Great video, was great to learn about all the things that go into putting a model into action.

  • @minkijung3
    @minkijung3 7 months ago

    Thanks for this presentation. It was really insightful.

  • @braindenburg
    @braindenburg 8 months ago

    Great video- really insightful. I will share this on LinkedIn, as I think more people need to see it!

  • @RiazLaghari
    @RiazLaghari 8 months ago

    Great!

  • @vladimirobellini6128
    @vladimirobellini6128 9 months ago

    great

  • @vladimirobellini6128
    @vladimirobellini6128 9 months ago

    great ideas txs!

  • @jind0sh
    @jind0sh 10 months ago

🎯 Key Takeaways for quick navigation:

00:00 🤖 Introduction to the topic of emerging architectures for LLM applications
- Overview of the importance of understanding the architectures for implementing LLMs.
- Introduction of the speakers from Tensor Ops and Superwise.
- Contextual explanation of Tensor Ops as an AI consultancy service and Superwise's role in deploying LLMs in production environments.

01:38 🏗️ Overview of the agenda and the concept of "Emerging Architectures for LLM Applications"
- Naming inspiration from a16z's blog post "Emerging Architectures for LLM Applications."
- Emphasis on exploring concepts introduced by a16z but with unique insights and use cases.
- The need to address limitations faced when deploying LLMs in real-world applications.

02:49 🌐 Why discuss architectures for LLMs and the limitations faced in production
- Addressing the hype vs. reality of AI taking over jobs and the actual limitations faced in deploying LLMs.
- Mention of limitations, such as the model's limited context window and biases, leading to the need for architectural considerations.
- Introduction of a poll to gather insights on the use cases the audience is exploring with LLMs.

05:17 🧠 Retrieval Augmented Generation: Design pattern and common use case
- Introduction to the design pattern called "Retrieval Augmented Generation" (RAG).
- Explanation of how RAG addresses the limitations of LLMs by combining retrieval systems with LLMs.
- Highlighting the importance of using a vector database as part of the retrieval system.

08:48 🔍 Limitations and considerations in designing Retrieval Augmented Generation systems
- Overview of limitations faced in LLMs, such as limited context windows and biases.
- Introduction to the use of vector databases and potential issues with their applicability.
- Advanced techniques like time component addition and using smaller LLMs for post-processing to improve accuracy.

13:17 🛠️ Components and steps in developing Retrieval Augmented Generation systems
- Overview of the four main steps in developing RAG systems: data ingestion, pre-processing, retrieval strategy, and efficient document retrieval.
- Emphasis on the need for efficient orchestration to overcome limitations in LLMs.
- Introduction of advanced RAG architectures for improved accuracy.

15:06 ⚙️ Advanced Retrieval Augmented Generation architectures for improved accuracy
- Explanation of strategies like time component addition, relevance reorganization, contextual compression, and self-querying.
- Highlighting the importance of context in improving the efficiency and relevance of retrieved documents.
- The role of LLMs in post-processing to enhance the quality of retrieved information.

18:00 🔄 Iterative prompt engineering and the use of the FLAIR technique
- Acknowledgment of the importance of prompt engineering for better results in LLMs.
- Introduction to the FLAIR technique for refining prompts and improving confidence in LLM responses.
- The iterative process of refining prompts, retrieving data, and generating responses to achieve better performance.

20:28 🤔 Tackling complex tasks beyond the capabilities of a single LLM
- Transition from traditional MapReduce-like strategies to handling more complex tasks in LLMs.
- Introduction of the challenge of generating content or summaries for massive datasets.
- The need for solutions beyond simple aggregation for tasks that cannot be handled by a single LLM.

23:03 🚀 Orchestrating multiple LLMs for complex tasks
- Overview of the challenges in orchestrating multiple LLMs for complex tasks.
- Introduction to the need for efficient aggregation strategies for tasks like content generation.
- The complexity of tasks requiring collaboration among multiple LLMs for more effective outcomes.

23:29 📚 Overview of MapReduce in LLM Applications
- Explanation of the inspiration behind MapReduce in language models.
- Discussion on dividing and processing text data in chunks for intermediate summarization.
- Introduction to emerging merging algorithms and their parallel to MapReduce.

26:43 📊 Monitoring in LLM Architecture
- Differentiation between LLM evaluation and monitoring.
- Explanation of the evaluation phase and benchmarks in language models.
- Focus on monitoring, highlighting the key differences in approach compared to classical machine learning systems.

27:49 🔄 Phases of Monitoring in LLM
- Overview of monitoring phases: User Query, Retrieval Process, Model Response, and User Feedback.
- Specific metrics for monitoring user queries, including sentiment score, topic classification, and injection detection.
- Metrics and considerations for monitoring the retrieval process, emphasizing relevance and ranking.

34:24 🚀 Metrics for Monitoring Model Responses
- Introduction to monitoring model responses, focusing on counting refusals and analyzing similarity drift.
- Considerations for detecting personally identifiable information (PII) in responses.
- Emphasis on monitoring user feedback, distinguishing between explicit and implicit feedback and its valuable role.

37:14 🔧 Agent Architecture in LLM
- Explanation of agent architecture and its experimental nature.
- Introduction to the ReAct framework and its role in decomposing tasks for language models.
- Discussion on the reflection framework and its impact on accuracy improvement.

39:35 🛠️ Tools in Agent Architecture
- Overview of tools in agent architecture as functions accepting strings for specific tasks.
- Examples of tools, such as Python interpreters, Google searching, and data fetching, as extensions of the language model's capabilities.
- Highlighting the potential of agents to decide when to use these tools dynamically.

43:17 🚄 Techniques to Speed Up LLM Inference
- Discussion on techniques to enhance LLM inference speed, including token limit adjustments.
- Considerations for caching, fine-tuning, and reducing the size of language models.
- Emphasis on the complexity of addressing inference speed, involving various factors beyond model size.

47:27 ⚠️ Safeguarding LLM Outputs
- Exploration of strategies for preventing out-of-context or nonsensical results from language models.
- Consideration of the role of prompt engineering and training in mitigating undesired outputs.
- Suggestion of using additional language models to filter and refine outputs for improved context awareness.

48:29 🌐 AI Safety and Fine-Tuning LLMs
- Fine-tuning LLMs for specific safety metrics.
- AI safety companies using smaller models for safety validation.

49:12 🌱 Gardening Concept in LLMs
- Post-processing step after model response for safety.
- Filtering out offensive or PII content and replacing it with a generic response.

49:57 🔄 Evolution of ChatGPT and Relevance
- Discussion on the changing relevance of ChatGPT.
- OpenAI's early edge, increasing competition with other models.
- Gap closing, with factors beyond model quality influencing decisions.

52:30 🌐 Closing Gap and Considerations
- Closing gap in model quality.
- Importance of engineering systems and support behind models.
- Decision-making factors: price, availability, privacy, legal considerations.

Made with HARPA AI

  • @rajeshkr12
    @rajeshkr12 11 months ago

    @superwiseai please share the presentation

  • @chirusikar
    @chirusikar 11 months ago

    Total gibberish in this video

  • @svendpai
    @svendpai 1 year ago

    Great talk, your webinars are a wonderful source of information :)

  • @mayurpatilprince2936
    @mayurpatilprince2936 1 year ago

    Informative video ... Waiting for next video :)

  • @GigaFro
    @GigaFro 1 year ago

    Hm… the first speaker made it sound like you couldn’t use context outside of the user input without prompt chaining. This is not true. Otherwise, loved the talk !

  • @billykotsos4642
    @billykotsos4642 1 year ago

    Great talk !

  • @vakman9497
    @vakman9497 1 year ago

    I was very pleased to see how well everything was broken down! I was also shook to see a lot of the architecture strategies were things we were already implementing at our company so I'm happy to see we are on the right track 😅

  • @_rjlynch
    @_rjlynch 1 year ago

    Very informative, thanks!

  • @MMABeijing
    @MMABeijing 1 year ago

    That was very nice, thank you all

  • @dream_machine812
    @dream_machine812 1 year ago

    This series has been insightful. Thanks for uploading

  • @williampourmajidi4710
    @williampourmajidi4710 1 year ago

    🎯 Key Takeaways for quick navigation:

    00:00 📚 Introduction to the topic of emerging architectures for LLM applications.
    01:54 🧐 Why focus on LLM architectures.
    04:02 📊 Audience poll on LLM use cases.
    05:17 🧠 Retrieval Augmented Generation (RAG) as a design pattern.
    08:05 💡 Advanced techniques in RAG and architectural considerations.
    14:40 📦 Orchestration and addressing complex tasks with LLMs.
    23:53 🧩 LLMs in intermediate summarization.
    26:43 📊 Monitoring in LLM architecture.
    32:04 🛠️ LLM agents and tools.
    39:05 🔄 Improving LLM inference speed.
    49:26 🛡️ OpenAI's ChatGPT and its relevance in the field.
    50:12 🌐 Evolution of ChatGPT and the AI landscape.
    51:09 💼 OpenAI's models and their resource allocation.
    52:16 🏢 Factors influencing model choice: engineering, economy, and legal considerations.

    Made with HARPA AI

  • @hidroman1993
    @hidroman1993 1 year ago

    So informative, looking forward to seeing more

  • @GigaFro
    @GigaFro 1 year ago

    Can someone provide an example of how one might introduce time as a factor in the embedding?

    • @serkanserttop1
      @serkanserttop1 1 year ago

      It would go in a metadata field that you use to filter results, not in the vector embedding itself.
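Concretely, the reply above amounts to something like the following sketch (illustrative only; the field names, toy vectors, and dot-product scorer are all assumptions, not anything from the talk):

```python
from datetime import datetime

# Toy "vector store": each entry keeps its embedding plus a timestamp
# in a metadata field, rather than encoding time into the vector itself.
docs = [
    {"vec": [0.1, 0.9], "text": "Q3 report", "ts": datetime(2023, 10, 1)},
    {"vec": [0.2, 0.8], "text": "Q1 report", "ts": datetime(2023, 1, 5)},
]

def search(query_vec, newer_than, k=1):
    # 1) Pre-filter on the metadata timestamp,
    # 2) rank the survivors by (dot-product) similarity.
    pool = [d for d in docs if d["ts"] >= newer_than]
    score = lambda d: sum(q * v for q, v in zip(query_vec, d["vec"]))
    return sorted(pool, key=score, reverse=True)[:k]

hits = search([0.1, 0.9], newer_than=datetime(2023, 6, 1))
print(hits[0]["text"])  # only documents after the cutoff are considered
```

Production vector databases expose the same idea as metadata filters applied alongside the similarity query.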

  • @YesWeCan-u9i
    @YesWeCan-u9i 1 year ago

    Thank you for your insights. I see there are no comments and not that many views. But please keep posting. This is changing the world.

  • @sunnychopper6663
    @sunnychopper6663 1 year ago

    Really informative video. It will be interesting to see how different layers are formed throughout the coming months. Given the complexities of RAG, it'd be interesting to see hosted solutions that can offer competitive pricing on a RAG engine.

  • @IsraelDavid-z8g
    @IsraelDavid-z8g 1 year ago

    Wonderful video, learned a lot, thanks. This video was great! Thank you so much.

  • @VaibhavPatil-rx7pc
    @VaibhavPatil-rx7pc 1 year ago

    Excellent, detailed information, thanks. Please share the slides.

    • @superwiseai
      @superwiseai 1 year ago

      Thank you! You can access the slides here - go.superwise.ai/hubfs/PDF%20assets/LLM%20Architectures_8.8.2023.pdf

  • @todd-alex
    @todd-alex 1 year ago

    Very informative. Several layers of LLM architectures need to be simplified like this. Maybe a standard for XAI should be developed based on a simplified architectural stack like this for LLMs.

  • @Aidev7876
    @Aidev7876 1 year ago

    Honestly. Not huge value for 55 minutes...

    • @k.8597
      @k.8597 1 year ago

      these videos seldom are.. lol.

  • @vikassalaria24
    @vikassalaria24 1 year ago

    Really great presentation. Keep up the good work.

  • @HodgeLukeCEO
    @HodgeLukeCEO 1 year ago

    Can you make the slides available? I have an issue seeing them and following along.

    • @superwiseai
      @superwiseai 1 year ago

      No problem here you go - go.superwise.ai/hubfs/PDF%20assets/LLM%20Architectures_8.8.2023.pdf

  • @afederici75
    @afederici75 1 year ago

    This video was great! Thank you so much.

  • @dr-maybe
    @dr-maybe 1 year ago

    Very interesting, thanks for sharing

  • @MengGe-s8l
    @MengGe-s8l 1 year ago

    Great sharing, that's just what I am looking for

  • @MengGe-s8l
    @MengGe-s8l 1 year ago

    Wonderful video, learned a lot, thanks

  • @zhw7635
    @zhw7635 1 year ago

    Nice to see these topics covered; these came up as soon as I attempted to implement something with LLMs.

  • @MattHabermehl
    @MattHabermehl 1 year ago

    4K views and only 2 comments. This is the best YouTube video I've seen by far on these strategies. Great content - thank you so much for sharing your expertise!

  • @investigativeinterviewing4617

    This is one of the best webinars I have seen on this topic. Great slides and presenters!
