Weaviate • Vector Database
  • 252 videos
  • 384,937 views
New embedding model: Contextual Document Embeddings
Traditional document embeddings have a significant limitation: they encode documents independently, without considering their context or neighboring documents.
This means they have to choose a single global weighting for terms, potentially missing important contextual nuances, or overweighting terms that might occur a lot in the dataset. This can be problematic when embedding in different domains or contexts.
✨ The Solution: Contextual Document Embeddings (CDE) ✨
CDE operates in two stages:
1️⃣ Adversarial contrastive learning: batch and embed related context from neighboring documents
2️⃣ Embed the target document while considering the contextual embeddings of the related document batch
CDE ...
705 views
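
As a rough intuition for the two stages described above, here is a toy Python sketch. It is not the CDE model: it swaps the neural encoders for a smoothed inverse-document-frequency weighting, and the function names (`context_stats`, `embed_in_context`) and the example documents are made up for illustration. Stage 1 summarizes a batch of neighboring documents; stage 2 embeds the target document using that summary, so the same text gets different term weights in different corpora.

```python
import math
from collections import Counter


def context_stats(context_docs):
    """Stage 1 (toy): summarize neighboring documents as document frequencies."""
    df = Counter()
    for doc in context_docs:
        # Count each term once per document.
        df.update(set(doc.lower().split()))
    return df, len(context_docs)


def embed_in_context(doc, stats):
    """Stage 2 (toy): weight the target's terms by their rarity in the context."""
    df, n_docs = stats
    tf = Counter(doc.lower().split())
    # Smoothed IDF: terms that are common in the context get a lower weight.
    return {
        term: count * (math.log((1 + n_docs) / (1 + df[term])) + 1.0)
        for term, count in tf.items()
    }


# Hypothetical neighbor batches from two different domains.
medical_context = [
    "patient dosage trial",
    "patient symptom trial",
    "patient dosage outcome",
]
sports_context = [
    "match score trial",
    "player score season",
    "team match season",
]

target = "patient trial score"
in_medical = embed_in_context(target, context_stats(medical_context))
in_sports = embed_in_context(target, context_stats(sports_context))

print(in_medical)
print(in_sports)
```

In the medical batch "patient" appears in every neighbor and is down-weighted, while in the sports batch it is the rarest and most informative term. A single global weighting cannot capture that shift, which is the limitation the description above points at.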

Videos

Agentic RAG with Erika Cardenas - Weaviate Podcast #109!
652 views • 14 days ago
Hey everyone! Thank you so much for watching the 109th episode of the Weaviate Podcast with Erika Cardenas! Erika, in collaboration with Leonie Monigatti, has recently published "What is Agentic RAG". This blog post was even covered in VentureBeat with additional quotes from Weaviate Co-Founder and CEO Bob van Luijt! This podcast continues the discussion on all things Agentic RAG, coverin...
Let Me Speak Freely? with Zhi Rui Tam - Weaviate Podcast #108!
261 views • 21 days ago
JSON mode has been one of the biggest enablers for working with Large Language Models! JSON mode is even expanding into Multimodal Foundation models! But how exactly is JSON mode achieved? There are generally 3 paths to JSON mode: (1) constrained generation (such as Outlines), (2) begging the model for a JSON response in the prompt, and (3) a two-stage process of generate-then-format. I am BEYO...
Optimize your vector database's search speed, accuracy, and costs
144 views • 21 days ago
Weaviate's new hot, warm, and cold storage tiers offer flexible options for managing resources to optimize search speed, accuracy, and costs 🚀 There are three main levers to adjust:
  • Choosing the vector index type (HNSW, flat, or dynamic)
  • Using compression techniques (binary, product, or scalar quantization)
  • Managing flexible tenant states (active, inactive, or offloaded)
Learn when you sh...
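
To make the compression lever concrete, here is a minimal sketch of binary quantization, one of the techniques named in the description. It is a generic illustration in plain Python, not Weaviate's implementation; the function names and the sample vectors are assumptions chosen for the example. Each dimension is reduced to one bit (positive or not), and vectors are then compared by Hamming distance.

```python
def binary_quantize(vector):
    """Keep one bit per dimension: 1 if the value is positive, else 0."""
    return [1 if x > 0 else 0 for x in vector]


def hamming_distance(bits_a, bits_b):
    """Count the positions where two bit vectors differ."""
    return sum(a != b for a, b in zip(bits_a, bits_b))


# Hypothetical 4-dimensional embeddings.
query = [0.3, -1.2, 0.0, 2.5]
candidate = [0.9, 0.4, -0.7, -0.1]

query_bits = binary_quantize(query)
candidate_bits = binary_quantize(candidate)

print(query_bits, candidate_bits, hamming_distance(query_bits, candidate_bits))
```

The trade-off is that one bit per dimension costs far less memory than a 32-bit float, at the price of a coarser distance estimate.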
SWE-bench with John Yang and Carlos E. Jimenez - Weaviate Podcast #107!
263 views • 1 month ago
Hey everyone! Thank you so much for watching the 107th episode of the Weaviate Podcast! This one dives into SWE-bench, SWE-agent, and most recently SWE-bench Multimodal with John Yang from Stanford University and Carlos E. Jimenez from Princeton University! One of the most impactful applications of AI we have seen so far is in programming and software engineering! John, Carlos, and team are at ...
AI in Education with Rose E. Wang - Weaviate Podcast #106!
341 views • 1 month ago
Hey everyone! I am SUPER excited to publish the 106th episode of the Weaviate Podcast featuring Rose E. Wang!! Rose is a Ph.D. student at Stanford University where she has led incredible research at the cutting-edge of AI applications in Education. The podcast heavily discusses her recent work on Tutor CoPilot! Tutor CoPilot is one of the world's largest randomized controlled trials on the impact...
Compound AI Systems with Philip Kiely - Weaviate Podcast #105!
440 views • 1 month ago
Hey everyone! Thanks so much for watching the 105th episode of the Weaviate Podcast with Philip Kiely! This one dives into all sorts of aspects related to Compound AI Systems! We are now seeing far better results with AI models by breaking up tasks into multiple stages and inferences. Philip explains the work they are doing at Baseten to optimize and scale deployments of these emerging systems ...
Hack Night at GitHub with Weaviate
256 views • 1 month ago
Beyond hacking and writing code, there’s something incredibly fun about creating environments for like-minded and smart people to get together to learn and hack on new tech. It takes a lot of work, but the reward is great and it's pure vibes. It creates the perfect synergy for incredible things to happen, from rad demos by magically talented people like Leann Chen from Diffbot, Ben A. at Telepor...
Late chunking improves context recall in RAG pipelines
1.1K views • 1 month ago
Optimizing your chunking techniques is one of the top places to improve performance in your RAG pipelines, but what’s the best one? Jina AI just released a new method called late chunking that takes the same amount of storage space as naive chunking, but solves the problem of lost context, similarly to ColBERT. You can implement it super easily with just a few extra lines in your embedding step...
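
The "few extra lines" can be sketched roughly as follows. Late chunking runs the whole document through a long-context embedding model first and only then pools the contextualized token embeddings per chunk. The model call is not shown here: `token_embeddings` holds made-up placeholder values standing in for its output, and the function names and chunk spans are likewise assumptions for illustration.

```python
def mean_pool(vectors):
    """Average a list of equal-length vectors dimension by dimension."""
    dim = len(vectors[0])
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]


def late_chunk(token_embeddings, chunk_spans):
    """Pool token embeddings of the full document into one vector per chunk.

    chunk_spans are (start, end) token indices, end exclusive.
    """
    return [mean_pool(token_embeddings[start:end]) for start, end in chunk_spans]


# Placeholder token embeddings for a 5-token document (2 dimensions each).
# In practice these come from one forward pass over the whole document,
# so every token vector already reflects the surrounding context.
token_embeddings = [
    [1.0, 0.0],
    [3.0, 2.0],
    [0.0, 4.0],
    [2.0, 2.0],
    [4.0, 0.0],
]
chunk_spans = [(0, 2), (2, 5)]

chunk_vectors = late_chunk(token_embeddings, chunk_spans)
print(chunk_vectors)
```

The result is still one vector per chunk, so storage matches naive chunking. The difference is that pooling happens after the tokens have seen the full document instead of before.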
Matryoshka Representation Learning (MRL) for ML tasks and vector compression
498 views • 2 months ago
AI Agents That Matter with Sayash Kapoor and Benedikt Stroebl - Weaviate Podcast #104!
627 views • 2 months ago
Chat With Your Data With Verba
1.6K views • 2 months ago
MIPRO and DSPy with Krista Opsahl-Ong! - Weaviate Podcast #103
2.1K views • 3 months ago
AI-Native Development with Guy Podjarny and Bob van Luijt - Weaviate Podcast #102!
291 views • 3 months ago
Chat with your code: RAG with Weaviate and LlamaIndex
434 views • 4 months ago
Scaling Pandas with Devin Petersohn - Weaviate Podcast #101!
301 views • 4 months ago
Generative UIs with Lucas Negritto and Bob van Luijt - Weaviate Podcast #100!
707 views • 4 months ago
Advanced AI Agents with RAG
7K views • 5 months ago
ACORN with Liana Patel and Abdel Rodriguez - Weaviate Podcast #99!
832 views • 5 months ago
Window Search Tree with Josh Engels - Weaviate Podcast #98!
409 views • 5 months ago
Vector Quantization: The Vector Clubhouse Episode 2
271 views • 5 months ago
AI Renaissance Berlin - AI Buzzwords
201 views • 5 months ago
The Future of Search with Nils Reimers and Erika Cardenas - Weaviate Podcast #97!
1.3K views • 5 months ago
Deep Learning with Letitia Parcalabescu - Weaviate Podcast #96!
453 views • 5 months ago
All Your Vector Embeddings Are Belong To You
836 views • 6 months ago
Open Source RAG running LLMs locally with Ollama
29K views • 6 months ago
Guest Lecture: Vector Quantization Techniques with Etienne | Brown University CSCI
580 views • 6 months ago
DSPy End-to-End: Meetup in San Francisco
6K views • 6 months ago
Google Cloud Marketplace with Dai Vu and Bob van Luijt - Weaviate Podcast #95!
346 views • 6 months ago
ParlayANN with Magdalen Dobson Manohar - Weaviate Podcast #94!
391 views • 7 months ago

Comments

  • @rsanthi20
    @rsanthi20 2 days ago

    Please avoid using background music for technical videos

  • @connor-shorten
    @connor-shorten 7 days ago

    If interested, Erika's talk from Google Pier 57 is live on Arize AI YouTube!

  • @Europetraveller789
    @Europetraveller789 10 days ago

    Great

  • @asadullahshafique4261
    @asadullahshafique4261 10 days ago

    very excited to learn from you, i remain your thankful !

  • @Lemure_Noah
    @Lemure_Noah 12 days ago

    Wow! Someone put the head out of the box!

  • @TweedBeetle
    @TweedBeetle 14 days ago

    Intro music goes incredibly hard

  • @praveengowd
    @praveengowd 15 days ago

    Nice explanation

  • @greatworksalliance6042
    @greatworksalliance6042 19 days ago

    This seems to be a detailed method of RAPTOR RAG applied to context and implied extension...can we now go further filling the gaps of basic language mechanics that should have been applied over 40yrs ago, and finally put this primitive stuff behind us getting to the good stuff🤔🧐😁

  • @s0ckpupp3t
    @s0ckpupp3t 20 days ago

    You guys look like you're having so much fun haha

  • @Liberty_boy
    @Liberty_boy 20 days ago

    Oh looking forward to this!

  • @dshorten1766
    @dshorten1766 20 days ago

    Fantastic episode, great explanation!

  • @Karl-Asger
    @Karl-Asger 20 days ago

    In-house episode! On a great topic! Exciting

  • @Mattheouw
    @Mattheouw 21 days ago

    oh thank you !

  • @DaveBayless
    @DaveBayless 26 days ago

    Very helpful. Thank you.

  • @EmmanuelBright-pq6ji
    @EmmanuelBright-pq6ji 26 days ago

    Can i use ollama?

  • @ix4564
    @ix4564 26 days ago

    🤪

  • @connor-shorten
    @connor-shorten 26 days ago

    Structured Outputs!!

  • @markrather7863
    @markrather7863 1 month ago

    ❤ weaviate

  • @markrather7863
    @markrather7863 1 month ago

    WEAVIATE FTW, yaaaaasss 🫰

  • @jakobkristensen2390
    @jakobkristensen2390 1 month ago

    I can think of many practical use cases where long context can't replace vector retrieval, even if context window size explodes

  • @birendrasingh1750
    @birendrasingh1750 1 month ago

    Suppose we are using 1536 dimensions for chunks and now we moved to create embedding for entire article to implement late chunking. Are not we diluting the information by this way. Because same vector dimension is now representing my entire article. Any thought on this. ?

  • @limjuroy7078
    @limjuroy7078 1 month ago

    Thanks for the tutorial!

  • @marcinjedynski974
    @marcinjedynski974 1 month ago

    I mean she explains it beautifully and that's a kind of a explanation I was looking for but what really got me into a state of flow is this ambient Techno track in the background :D

  • @nikosterizakis
    @nikosterizakis 1 month ago

    Nice video - very informative.

  • @connor-shorten
    @connor-shorten 1 month ago

    Thanks so much for joining Philip!

  • @jakobkristensen2390
    @jakobkristensen2390 1 month ago

    Connor at 50:10 you mention a paper Alto but didnt link it? Or am I mistaken?

  • @jakobkristensen2390
    @jakobkristensen2390 1 month ago

    Very informative

  • @savannahquire5414
    @savannahquire5414 1 month ago

    Heck yeah!!

  • @dshorten1766
    @dshorten1766 1 month ago

    Great video, Connor!

  • @sajjaddehghani8735
    @sajjaddehghani8735 1 month ago

    Where is the file or path of data that is stored locally?

  • @marccox8977
    @marccox8977 1 month ago

    This is great! Thank You!

  • @martinkallas8513
    @martinkallas8513 1 month ago

    what's the difference between collections and schemas ?

  • @ajithdevadiga9939
    @ajithdevadiga9939 1 month ago

    amazing idea.

  • @frankkwabenaaboagye
    @frankkwabenaaboagye 2 months ago

    This playlist is very helpful. 👍👍

  • @KashishVarshney-cr7iz
    @KashishVarshney-cr7iz 2 months ago

    why don't you use rerank attribute. that is most important attribute

  • @funTech_else_entrepreneurship
    @funTech_else_entrepreneurship 2 months ago

    Is that mean we don't need to covert unstructured data into structure for RAG applications but can use vector indexing instead?

  • @mklobucaric
    @mklobucaric 2 months ago

    Thank you!

  • @CissieRobbins-p6t
    @CissieRobbins-p6t 2 months ago

    Hartmann Loaf

  • @BalderAndBalder
    @BalderAndBalder 2 months ago

    The ads between each video is almost longer than each video. Can't you turn ads off?

  • @deeplearningpartnership
    @deeplearningpartnership 2 months ago

    Cool.

  • @connor-shorten
    @connor-shorten 2 months ago

    Thank you so much for joining the podcast Benedikt and Sayash! Learned so much from our chat and really excited about where this research is heading!

  • @Karl-Asger
    @Karl-Asger 2 months ago

    Keep up the great work Connor

    • @connor-shorten
      @connor-shorten 2 months ago

      Thanks so much Karl! Means a lot!

  • @郑瀚-v9l
    @郑瀚-v9l 2 months ago

    excellent video, and impressed insights.

  • @arpitaingermany
    @arpitaingermany 2 months ago

    I am not able to view the Overview

  • @arpitaingermany
    @arpitaingermany 2 months ago

    But again sharing private info to this company can be dangerous, so uploading documents is a doubt

  • @tristanbob
    @tristanbob 2 months ago

    Great video about an awesome open source contribution. Thanks!

  • @beta5770
    @beta5770 2 months ago

    Adding multimodal support would just make this amazing. Weaviate does come with the multimodal operators

  • @hannahb1426
    @hannahb1426 2 months ago

    Great video !!!!

  • @dan_taninecz_geopol
    @dan_taninecz_geopol 2 months ago

    Can it use external vector databases?

    • @Weaviate
      @Weaviate 2 months ago

      Weaviate only

  • @dan_taninecz_geopol
    @dan_taninecz_geopol 2 months ago

    Does this get around the issue of inaccuracies in the function call or in the translation from natural to structured language?