OpenAI's NEW Embedding Models

  • Published: 25 Jul 2024
  • Science

Comments • 56

  • @jamesbriggs
    @jamesbriggs  6 months ago +3

    Try out the code here github.com/pinecone-io/examples/blob/master/learn/search/semantic-search/openai-embed-v3/openai-embed-v3.ipynb
    Testing the 256-d embeddings here! ruclips.net/video/bW931qHLV0M/видео.html (they're very good)

  • @ernestosantiesteban6333
    @ernestosantiesteban6333 6 months ago +2

    Great video! You always give us recent information. Thank you for your work.

    • @jamesbriggs
      @jamesbriggs  6 months ago +1

      You’re welcome, thanks for watching :)

  • @vladif251
    @vladif251 6 months ago

    Great overview. Thanks James

  • @mortezalayegh2587
    @mortezalayegh2587 6 months ago

    Thanks for the update. Great Content. 👍👍👍👍

  • @jp00738
    @jp00738 6 months ago +1

    Yo James, thanks for the deep dive! One thing I felt was missing was showing the models performing on bigger dimensions as well. I bet this would generate some awesome responses. Guess OpenAI wants to do that "apple marketing move" to compare models, but to be honest less than 1k are still a bit drunk answer style maybe? 😂

  • @micbab-vg2mu
    @micbab-vg2mu 6 months ago

    Thank you for the update. I was not aware of a new embedding model.

    • @jamesbriggs
      @jamesbriggs  6 months ago +1

      They released them last night :)

    • @micbab-vg2mu
      @micbab-vg2mu 6 months ago

      @jamesbriggs Great - the ada model is quite old :) I will test the new models over the weekend.

  • @avi7278
    @avi7278 6 months ago

    Great James, thanks, I was looking for this. Do you have any videos about indexing and running RAG on entire codebases / projects?

  • @leonardotato3067
    @leonardotato3067 6 months ago +1

    "Hi James, I've been a fan of your videos for a long time and they never cease to impress me. Your dedication is evident in every piece of content. As the Director of a technology consulting firm in Spain, where we're venturing into creating specialized chatbots for information search, your insights have been invaluable. Keep up the fantastic work!"

    • @jamesbriggs
      @jamesbriggs  6 months ago

      Hey Leonardo, that’s really awesome to hear, thanks and good luck with the venture!

  • @MastersWithHarshith
    @MastersWithHarshith 5 months ago +1

    Can you do a comparison video between the new embeddings and Cohere's embeddings?

  • @luisguillermopardo7792
    @luisguillermopardo7792 6 months ago

    omg that's impressive. Greetings from Colombia

  • @haneulkim4902
    @haneulkim4902 4 months ago +1

    Great video as always. One question: how is the embedding model trained? Are the embeddings simply extracted from chatGPT4, or are they trained differently from the beginning (pre-training stage)?

  • @GeobotPY
    @GeobotPY 6 months ago

    Thank you! Do you have any videos where you do some quantitative analysis on how to evaluate LLMs and RAG? For instance I looked at RAGAS - seems interesting, but I find evaluation of RAG quite difficult to quantify. Keep up the good work!

    • @jamesbriggs
      @jamesbriggs  6 months ago +1

      Planning to do this soon - I do have a video on evaluation metrics for retrieval though. It doesn’t focus on RAG but is still very relevant (any new video I do on RAGAS will likely include these too)
      ruclips.net/video/BD9TkvEsKwM/видео.html

    • @IvarDaigon
      @IvarDaigon 6 months ago +1

      The only way to really evaluate if a model is good for your use case is to start with the largest one that reliably does the job that you want it to do, then create a series of unit tests/benchmarks against that model and then go down the list of models (or embedding dimensions) and run the same tests against each model until the tests break.
      Unfortunately, synthetic benchmarks OR user reviews will not give you the answers you need, because natural language is highly nuanced, so it depends entirely on the specific use case that you are interested in.

    • @Lucky9_9
      @Lucky9_9 5 months ago

      @IvarDaigon wow thank you for this pro level template! Filing this away for future reference!
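
The procedure IvarDaigon describes above (fix a test set on the largest setting, then step down through smaller embedding dimensions until the tests break) could look roughly like the sketch below. It is only an illustration, not code from the video: the vectors are toy values, and `truncate_and_normalize`, `top1`, and `smallest_passing_dim` are hypothetical helper names. It assumes embeddings that stay meaningful when cut to their leading dimensions and re-normalized, which is how OpenAI describes shortening the text-embedding-3 models.

```python
import math

def truncate_and_normalize(vec, dim):
    """Keep the first `dim` components and rescale to unit length."""
    head = vec[:dim]
    norm = math.sqrt(sum(x * x for x in head))
    if norm == 0.0:
        raise ValueError("cannot normalize a zero vector")
    return [x / norm for x in head]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def top1(query, docs, dim):
    """Return the id of the document most similar to `query` at `dim` dimensions."""
    q = truncate_and_normalize(query, dim)
    scores = {doc_id: dot(q, truncate_and_normalize(vec, dim))
              for doc_id, vec in docs.items()}
    return max(scores, key=scores.get)

def smallest_passing_dim(test_cases, docs, dims):
    """Walk dims from largest to smallest; stop at the first failure.

    Returns the smallest dimension at which every (query, expected_id)
    test still passed, or None if even the largest one fails.
    """
    passing = None
    for dim in sorted(dims, reverse=True):
        if all(top1(q, docs, dim) == expected for q, expected in test_cases):
            passing = dim
        else:
            break
    return passing

# Toy 4-d "embeddings" (made-up values, purely illustrative).
docs = {
    "llama": [0.6, 0.8, 0.9, 0.1],
    "gpt4": [0.8, 0.6, -0.9, 0.1],
}
test_cases = [([0.8, 0.6, 1.0, 0.1], "llama")]

print(smallest_passing_dim(test_cases, docs, [4, 3, 2]))
```

In this toy data the third component carries the signal that separates the two documents, so the test passes at 4 and 3 dimensions and breaks at 2. With real embeddings the test cases would be your own queries and expected documents.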

  • @user-ef2pv2du3j
    @user-ef2pv2du3j 6 months ago

    Hey James! Thanks for the video. Wondering, can you do a video covering TruLens?!

    • @jamesbriggs
      @jamesbriggs  6 months ago

      Would love to, they’re great! Hopefully soon

  • @IvarDaigon
    @IvarDaigon 6 months ago

    Side note: The models with "ada" in them were named after Ada Lovelace so they should be pronounced the way her name is pronounced by most people in the UK at the time.
    OpenAI started with model codename Ada Lovelace (GPT-1), then Charles Babbage (GPT-2), then Leonardo da Vinci (GPT-3) but they stopped when they got to GPT-4 or the letter E.
    They could have kept going with Einstein, Faraday, Galileo and then if they got stuck they could have just asked GPT-4 to create a list of codenames for them.
    My guess is that the embedding-3 models are based on GPT-3.5 which would be good balance between speed and language abilities but who really knows for sure because they keep changing the naming convention almost every time they release a new model.

  • @fintech1378
    @fintech1378 6 months ago

    Can we use this for image and video embedding too? Multimodal embedding for multimodal RAG

    • @jamesbriggs
      @jamesbriggs  6 months ago

      No just text, I’m actually surprised they didn’t release multimodal embedding models

  • @josephbeau-reder813
    @josephbeau-reder813 6 months ago

    Thank you for the video man!
    I am quite surprised about the "qualitative" analysis of each model at the end (especially the "compare LLaMA/GPT4" question): you do check whether the model has understood the question (it needs to compare LLaMA and GPT4), but isn't it even more important to check whether the information provided is correct (and well sourced)?
    Because the answer can indeed compare LLaMA and GPT4 but base this comparison on hallucinations (and wrong sources).

  • @jawdridi
    @jawdridi 6 months ago +2

    Why are the similarity search scores for a given query lower with the large and small embeddings than with the old OpenAI embedding?

    • @jamesbriggs
      @jamesbriggs  5 months ago +2

      They just have a different range. Ada 002 was actually weird because its range was always so small - these models seem to have a larger range that tends to show lower similarity scores. You see the same in many recent open source models and Cohere’s embedding models

    • @jawdridi
      @jawdridi 5 months ago

      @jamesbriggs thank you for the clarification. I love your videos. Keep it up

  • @snarfer293
    @snarfer293 6 months ago

    Is there something wrong with the text extraction? Output at 11:14 has spaces in "vuln erabilities", "organi zations", "sys tem", etc., which would create issues in tokenization, and the same goes for 11:53, which has "TheseclosedproductLLMsareheavilyfine-tunedtoalignwithuman" as a single string.

    • @jamesbriggs
      @jamesbriggs  6 months ago +1

      Yep, but real world data is messy - so I like to test with this and see how they perform

  • @luciolrv
    @luciolrv 5 months ago

    A PINECONE question: Will Pinecone charge us more if we use the 3-large model since the vector dimension is larger? If so, will it be more expensive just for upsert or also for retrieval?

  • @snarfer293
    @snarfer293 6 months ago

    It would be better to test semantic retrieval instead of these queries, which can also be handled with standard word-to-word matching. Ideally, use words that are not in the target text. Also, are there public data sets with good numerical judgments so we can use NDCG to evaluate retrieval rankings?

    • @jamesbriggs
      @jamesbriggs  6 months ago +1

      Yeah I’ll test the new models and a few others with good benchmarks soon - this was a quick first look after I heard about the release :)
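
For the NDCG question above, here is a minimal sketch of how the metric is computed from graded relevance judgments. It uses the common linear-gain form, where the gain at rank i is rel_i / log2(i + 1). The function names and the toy judgments are illustrative assumptions, not taken from the video or from any particular benchmark.

```python
import math

def dcg(relevances):
    """Discounted cumulative gain: sum of rel_i / log2(i + 1), ranks start at 1."""
    return sum(rel / math.log2(rank + 1)
               for rank, rel in enumerate(relevances, start=1))

def ndcg_at_k(ranked_relevances, k):
    """NDCG@k for one query.

    `ranked_relevances` holds the judged relevance grade of each returned
    document, in the order the retriever ranked them. Assumption: the ideal
    ordering is built from this same list; a full evaluation would use
    every judged document for the query.
    """
    ideal_dcg = dcg(sorted(ranked_relevances, reverse=True)[:k])
    if ideal_dcg == 0:
        return 0.0
    return dcg(ranked_relevances[:k]) / ideal_dcg

# A retriever that puts the most relevant document (grade 3) second.
print(round(ndcg_at_k([1, 3], k=2), 4))
```

A perfect ranking scores 1.0, and the score drops as highly relevant documents are pushed further down the list.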

  • @eyemazed
    @eyemazed 6 months ago

    Is it just me or does the new "non lazy" GPT-4 Turbo model take much longer to respond to an API call? (testing with RAG, 20-30k input tokens)

  • @dariuszsemba
    @dariuszsemba 6 months ago

    I think "the revolution" in embedding models went unnoticed for a reason :) It's hard to pinpoint any disruptive method/technology that would make embeddings much more useful, whereas ChatGPT had its moment thanks to the RLHF method.
    Actually, vector databases gained all this attention a few months after ChatGPT's gigantic success - to me that proves how vector databases and embedding models are just about the ongoing hype.

  • @JohnMcclaned
    @JohnMcclaned 6 months ago

    9:30 - why don't you do them in parallel? Doing them one by one will always take longer

    • @jamesbriggs
      @jamesbriggs  6 months ago

      Just writing quick simple code - but yeah I should’ve
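
As a rough illustration of the parallel approach suggested above, the sketch below fans batches of texts out over a thread pool. It is not the code from the video: `embed_batch` is a hypothetical stand-in that returns fake vectors so the example runs offline, and in practice it would wrap a call to whichever embeddings API you use. Threads suit this case because each request spends most of its time waiting on the network.

```python
from concurrent.futures import ThreadPoolExecutor

def embed_batch(texts):
    """Hypothetical stand-in for one embeddings API request.

    Returns one fake 2-d vector per text (length, number of spaces).
    """
    return [[float(len(t)), float(t.count(" "))] for t in texts]

def chunked(items, size):
    """Split a list into consecutive batches of at most `size` items."""
    return [items[i:i + size] for i in range(0, len(items), size)]

def embed_all(texts, batch_size=2, max_workers=4):
    """Embed all texts, sending batches concurrently.

    ThreadPoolExecutor.map yields results in input order, so the output
    vectors line up with `texts` even if requests finish out of order.
    """
    batches = chunked(texts, batch_size)
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        results = pool.map(embed_batch, batches)
        return [vec for batch in results for vec in batch]

texts = ["a", "b c", "def", "g h i", "jk"]
vectors = embed_all(texts)
print(len(vectors))
```

Hosted APIs generally enforce rate limits, so `max_workers` is best kept modest and paired with retry handling.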

  • @OccamsPlasmaGun
    @OccamsPlasmaGun 6 months ago

    Maybe the embedding model stores "Llama 2" at the origin (0.0, 0.0, 0.0, ...) to screw up cosine similarity search.

    • @jamesbriggs
      @jamesbriggs  6 months ago

      Always replaces competitive model names with outdated model names lol

  • @JohnMcclaned
    @JohnMcclaned 6 months ago

    Between this and the new serverless Pinecone, get excited... Christmas came late

    • @jamesbriggs
      @jamesbriggs  6 months ago

      It’s pretty good timing

  • @dr.mikeybee
    @dr.mikeybee 6 months ago +10

    Mistral 7b scores better. MTEB for Mistral is 66.63. And Mistral isn't the top performer. Why would anyone pay for embeddings?

    • @jamesbriggs
      @jamesbriggs  6 months ago +12

      I'm all for OSS embedding models - most of my recent videos focus on them - but the reality is the OpenAI API is easy, cheap, and gives you top-tier embedding performance (even if not quite number 1), so it's popular
      That being said, the 256-dim embeddings outperforming Ada 002, if true, is pretty impressive

    • @lpls
      @lpls 6 months ago +9

      You pay either way. Running a model ain't free.

    • @the-us-runner
      @the-us-runner 6 months ago +2

      Why? To get embeddings that work across dozens of languages.

    • @kazwat
      @kazwat 6 months ago

      @lpls yes it is

    • @lpls
      @lpls 6 months ago +1

      You have to run it somewhere, right? CPU, memory, power...

  • @AnthonyZboralski
    @AnthonyZboralski 6 months ago

    Your videos would be much better without your overuse of stock footage

    • @jamesbriggs
      @jamesbriggs  6 months ago

      I’ll take on the feedback, thanks