*Don't* miss these LLM Concepts!!

  • Published: 5 Jul 2024
  • Timestamps
    00:00 Chapter 1: What is a large language model (LLM)?
    05:23 Chapter 2: How do LLMs work?
    12:13 Chapter 3: Base models vs fine-tuned models
    17:48 Chapter 4: How to improve an LLM
    There are a lot of misconceptions about the current generation of AI.
    I feel they stem from a lack of basic understanding of how these models came about.
    Here's my attempt to simplify a few things around it.
    1. What's a Language Model
    2. What's a base Model?
    3. Base Model vs Fine-tuned Model
    🔗 Links 🔗
    Word Embeddings Link - lena-voita.github.io/nlp_cour...
    T5 Paper - arxiv.org/pdf/1910.10683
    InstructGPT for alignment - openai.com/index/instruction-...
    ❤️ If you want to support the channel ❤️
    Support here:
    Patreon - / 1littlecoder
    Ko-Fi - ko-fi.com/1littlecoder
    🧭 Follow me on 🧭
    Twitter - / 1littlecoder
    Linkedin - / amrrs
  • Science

Comments • 66

  • @MartuzaFerdous
    @MartuzaFerdous 12 days ago

    Fantastic job. You have a natural talent for explaining complex concepts very clearly and simply. You are making a difference. Wish you all the best.

  • @HmP_ai
    @HmP_ai 15 days ago +1

    Keep going brother, amazing AI content!

  • @awesomedata8973
    @awesomedata8973 1 month ago +2

    This is a great video. Please go even further in-depth next time! -- I love the model testing, but I also love the theory and fundamentals behind it all too!

  • @ManzarIMalik
    @ManzarIMalik 26 days ago +2

    Bro, you should be making more tutorial-based content, your explanation skills are top notch!

    • @1littlecoder
      @1littlecoder  26 days ago

      Thanks bro. Any example of what kind of content would help you?

    • @StewieGriffin-damnU
      @StewieGriffin-damnU 24 days ago

      @@1littlecoder How about a playlist on all things LLMs (from absolute beginners to current developments)?

  • @inishkohli273
    @inishkohli273 25 days ago +1

    Thank you so much bhai... Completed both videos 😊. Really enjoyed them. Your guidance on what to read is very helpful, because most of the time we dive into learning without knowing what to learn, how to learn, and what amount of knowledge to acquire. You covered all the tips. Keep it up.

  • @Jvo_Rien
    @Jvo_Rien 1 month ago

    You explain very well, thank you brother.

  • @terryethompson
    @terryethompson 1 month ago +1

    Very nice job. This video is fire!!!

  • @bernard2735
    @bernard2735 1 month ago

    Great video. Thank you.

  • @ArunKumar-mp1di
    @ArunKumar-mp1di 1 month ago

    I was curious about Markov chains, thank you for including it in this video. I have a question regarding fine-tuning for a domain-specific use case.
    If we want to fine-tune an open-source model for an agentic system, can we use LoRA? Additionally, when creating the fine-tuning dataset, what data should we include? Should we incorporate every piece of data passed through the LLM, such as the CoT prompt + corrected output reasoning, query + context along with the response, and planning if that was part of an LLM call, etc.?

  • @abdoulayediallo3777
    @abdoulayediallo3777 1 month ago

    How do you do the continuous pre-training phase? Fine-tuning the base model vs the instruction model?
    Even though base models have a smaller context window.
    But great video 🎉❤

  • @ahmadsaud3531
    @ahmadsaud3531 1 month ago +1

    Thanks!

    • @1littlecoder
      @1littlecoder  1 month ago

      Thank you so much, glad you found it worthy!

  • @IndiaOutsource
    @IndiaOutsource 1 month ago

    I have some questions:
    We take datasets (like Alpaca and ShareGPT) and train using transformers, which I believe is the common method. However, there are others like RNNs, CNNs, and self-attention mechanisms to get a base model.
    Now, how can I do this for Tamil? I found some datasets on Kaggle, and was also thinking of scraping data and making my own dataset. If so, what format should this be? JSON? I've asked around, but no one has answered. I'm a poor arts student (not from a computer science background), following your channel from the beginning and trying things by myself. I'd like to create my own fine-tuned or base model. Can you help clarify this?

  • @ahmadsaud3531
    @ahmadsaud3531 1 month ago +1

    thanks a lot

  • @user-ce7vu3ct3y
    @user-ce7vu3ct3y 1 month ago

    This is too good, you and Hussein Nasser are the best

  • @mathavansg9227
    @mathavansg9227 1 month ago +1

    great video!

  • @The_Criminal_Minds
    @The_Criminal_Minds 16 days ago

    I really like that you pointed out that if someone already knows these concepts, they can skip the video. Generally what happens is we watch the whole video to see if there is something new, and end up adding nothing new to our knowledge.

  • @ganesha7316
    @ganesha7316 1 month ago

    Nice video. Thanks for the explanation. So LLMs predict the next word, this part is pretty clear. If I give a question with multiple options as answers, LLMs are able to identify the correct option and reply back. How does that work? If you have some time, can you please throw some light on it?

  • @motivationmantra5206
    @motivationmantra5206 1 month ago +1

    Great video. More videos please.
    A RAG video using LlamaIndex with unstructured data.

  • @ravishmahajan9314
    @ravishmahajan9314 24 days ago

    Hi! Great video. Could you please tell me where I should learn all the things related to LLMs?

    • @1littlecoder
      @1littlecoder  24 days ago

      I'd say see what you want to do with it and learn from that. That's a very top-down approach and works well for most. DeepLearning.AI has good courses.

  • @KumR
    @KumR 1 month ago

    Hi Abdul. Thanks for this. Is a base model the same as an instruct model? Is pre-training supervised or unsupervised?

    • @geniusxbyofejiroagbaduta8665
      @geniusxbyofejiroagbaduta8665 1 month ago

      No, an instruct model is a base model further fine-tuned on some dataset that guides the base model on ways to answer questions.

    • @KumR
      @KumR 1 month ago

      OK, I think that's a general confusion too: what the different types of models are, and which ones among them are unsupervised and supervised...

  • @ajan4174
    @ajan4174 1 month ago

    Hey, will learning LangChain and LangGraph and dealing with LLMs help us get a job as an AI engineer, or is it just a bubble for the time being?

    • @1littlecoder
      @1littlecoder  1 month ago

      I would say forget about the bubble at this point. Honestly, it doesn't matter right now. You learn these things, you start doing some projects and make some money.

    • @ayushmishra5861
      @ayushmishra5861 1 month ago +1

      @@1littlecoder But what if we are building expertise in this domain rather than in some other domain, and this bubble bursts after a few years? Does all my experience in this domain become irrelevant then?

  • @AbdulBasit-ff6tq
    @AbdulBasit-ff6tq 26 days ago

    Do you plan to create a video on knowledge infusion into LLMs soon?

    • @AbdulBasit-ff6tq
      @AbdulBasit-ff6tq 26 days ago

      Is it possible with QLoRA, or should we go with CPT? And can we actually CPT an LLM with QLoRA, or would we require as much compute power as the organization used to train that base model?

  • @DefaultFlame
    @DefaultFlame 1 month ago +2

    I don't know what language it is, but if it's made from corn and makes you drunk it's probably bourbon. The "everyone likes it" line is a lie though.
    Edit: Ah, so that's what the video is about. I suppose I could stop watching, but I'm going to keep watching to the end anyway.
    I remember doing so much experimentation with (and spending so much money on) the text-instruct-davinci model of GPT-3. One of the most interesting experiments was having it generate images and do image recognition on SVG code. I actually still have the SVG code from a couple of its better "art" pieces that were actually legible.

  • @sanemonk1
    @sanemonk1 1 month ago

    From this video I kind of got the impression that, just like base models, broader fine-tuned models are on the market that enterprises can pay to use and further fine-tune for their individual needs.
    Fine-tuned models are built by each enterprise only, correct?
    No such broader fine-tuned, pay-and-use models are on the market currently. Correct?

  • @HashtagTiluda
    @HashtagTiluda 1 month ago

    Can you make a video about fine-tuning multi-modal LLMs with a custom image dataset?

    • @1littlecoder
      @1littlecoder  1 month ago +1

      Any specific use case or domain?

    • @HashtagTiluda
      @HashtagTiluda 1 month ago

      @@1littlecoder You can take the PlantDoc dataset from Kaggle.

  • @heisenbergwhite5845
    @heisenbergwhite5845 1 month ago

    Any thoughts on llama3-v stealing all its work from another model? Thought you could cover this on the channel, as it is huge.

    • @1littlecoder
      @1littlecoder  1 month ago

      Thank you, llama3-v is definitely on the list.

    • @1littlecoder
      @1littlecoder  1 month ago

      Oh, I missed the stealing part, are they?

  • @tamilil-1857
    @tamilil-1857 1 month ago

    Please make videos on LLMOps 😊

    • @1littlecoder
      @1littlecoder  1 month ago +2

      Do you mean the productionizing part of it? I've actually been thinking of getting a guest for this!

  • @user-yg2qv4kf4r
    @user-yg2qv4kf4r 1 month ago

    It would be nice if you made videos on building a base model and fine-tuning it.

  • @VenkatesanVenkat-fd4hg
    @VenkatesanVenkat-fd4hg 1 month ago +1

    I'd like to ask one important question: all the big companies are going to provide multimodal APIs, which can be handled by a machine learning engineer or SE, so what is the purpose of a data scientist? I think the role will vanish in India, because here most will use the API alone, right? I also think current LLMs can do both frontend and backend code well, so will the number of SEs be reduced? (mostly my view)... by a Senior Data Scientist.

    • @1littlecoder
      @1littlecoder  1 month ago +1

      I'm of the opinion that data scientist as a role is highly ambiguous and polluted, and I definitely think the LLMs that we see will make a lot of difference there. But a lot of things come into play, like data residency, data privacy and other such aspects, that can keep them relevant even as AI rises.

    • @MSworldvlog-mr4rs
      @MSworldvlog-mr4rs 1 month ago

      If that happens, most of the jobs can be replaced by AI

  • @MichealScott24
    @MichealScott24 26 days ago

    ❤🙇

  • @gunngunn6763
    @gunngunn6763 1 month ago +1

    What is generative AI?

    • @1littlecoder
      @1littlecoder  1 month ago

      Using AI to generate something, an image or text.

    • @sanemonk1
      @sanemonk1 1 month ago

      @@1littlecoder Does LLM-based generative AI only cover Q&A, chatbots, RAG, agents? What is the complete scope of a Gen AI developer?

  • @ahmadsaud3531
    @ahmadsaud3531 1 month ago

    Hi, can I have a session (or sessions) with you for a fixed number of hours, where we agree on a price per hour? My interest is mainly to learn, discuss a few use cases with you, and benefit from your experience. Please let me know if that is possible.

    • @1littlecoder
      @1littlecoder  1 month ago

      Please email me at 1littlecoder at gmail dot com. We can discuss.

  • @user-ce7vu3ct3y
    @user-ce7vu3ct3y 1 month ago

    YouTube is bad, it always deletes my comments on your videos :(. Is there anywhere else I can get advice or ask you questions?

  • @Ovmt
    @Ovmt 1 month ago

    Hey, I'm very confused by the LLM infrastructure right now: we have LangChain, LangGraph, Hugging Face, LlamaIndex (which I guess does the same work as LangChain; idk why it's still there). I think with LLMs we can just fine-tune, implement RAG, or make agents, so why are there so many things?

    • @1littlecoder
      @1littlecoder  1 month ago

      Are you referring to those specific packages or just generally the concepts?

    • @Ovmt
      @Ovmt 1 month ago

      @@1littlecoder I was referring to the packages.

  • @saideepesh6036
    @saideepesh6036 1 month ago

    I think you can reduce your portion of the video; it's blocking a few things and standing out loudly on the screen. Thanks.

    • @1littlecoder
      @1littlecoder  1 month ago

      Thanks for the feedback. I used to make my background transparent for the same reason, but some feedback came in that it wasn't nice. I'll probably reduce the size next and see.

  • @donkeroo1
    @donkeroo1 1 month ago

    Clearly, building and implementing an LLM requires special training, not DBAs.

    • @1littlecoder
      @1littlecoder  1 month ago

      Could you please elaborate

    • @donkeroo1
      @donkeroo1 1 month ago

      First of all, love the content. My comment is a general observation that LLMs are statistical models that require technical training. Most enterprise LLM implementations, interestingly, are owned by IT, who lack the proper training to ensure an LLM succeeds. Having IT own the implementation is really a recipe for failure, imo.