Robust Text-to-SQL With LangChain: Claude 3 vs GPT-4

  • Published: Sep 10, 2024

Comments • 14

  • @abhinabaghose3380 23 hours ago

    What has been your experience with text-to-pandas DataFrame? Is it better than text-to-SQL in terms of complexity?

  • @mrchongnoi 3 days ago

    I am late to the game on this video. I have been working on a TextToSQL project. In most of the examples I have viewed, the LLM can understand the context of the columns. In the project I am working on, the column names may give some hint of what the data is or how it is used. The schema I have has a date, a reference date, and a delivery date. Delivery date is obvious. There are other fields where the names are not indicative of the values. What happens when you have multiple tables with a large schema? My approach is to use the LLM to build the SQL and not to synthesize, as the amount of data could be quite large.

  • @andaldana 1 month ago

    Great video! Always something new to learn!

  • @Shai_Di 3 months ago +1

    This is really interesting, but I have some concerns about this method and I'd love to hear what you think about them:
    1. We are always sending the entire schema as context. If we want to connect a large dataset to this "application", we will waste a ton of tokens on that. The agent that LangChain built decides step by step which tables might be relevant, thus reducing the number of tokens used as context. How would you approach something like this?
    2. Sometimes table and column names might not be intuitive to the LLM, and without sampling the data it can make wrong assumptions about properties, values or anything else. So the user has to review the query and make sure it makes sense, which is what we are trying to avoid when we start using AI for queries. What do you think about adding an intermediate step that samples the relevant data?
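
Both concerns above can be illustrated with a small sketch. It picks candidate tables by plain word overlap between the question and a short hand-written description of each table, then builds the prompt context from only those tables, with a couple of sample rows attached. This is a simple heuristic for illustration and not how the LangChain agent selects tables; `select_relevant_tables`, `build_context`, `table_docs` and the toy schema are all hypothetical names.

```python
import re
import sqlite3


def select_relevant_tables(question, table_docs, max_tables=2):
    """Rank tables by word overlap between the question and a short description."""
    q_words = set(re.findall(r"[a-z]+", question.lower()))
    scored = []
    for table, doc in table_docs.items():
        doc_words = set(re.findall(r"[a-z]+", (table + " " + doc).lower()))
        scored.append((len(q_words & doc_words), table))
    # Highest overlap first; ties broken alphabetically for stable output.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [table for score, table in scored[:max_tables] if score > 0]


def build_context(conn, tables, sample_rows=2):
    """Return CREATE statements plus a few sample rows for the chosen tables."""
    parts = []
    for table in tables:
        ddl = conn.execute(
            "SELECT sql FROM sqlite_master WHERE type='table' AND name=?", (table,)
        ).fetchone()[0]
        # Table names come from our own table_docs keys, not from user input.
        rows = conn.execute(f'SELECT * FROM "{table}" LIMIT ?', (sample_rows,)).fetchall()
        parts.append(f"{ddl}\n-- sample rows: {rows}")
    return "\n\n".join(parts)


# Toy database (assumed schema, for illustration only).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE orders (id INTEGER, ref_date TEXT, delivery_date TEXT);
CREATE TABLE customers (id INTEGER, name TEXT);
CREATE TABLE products (id INTEGER, title TEXT);
INSERT INTO orders VALUES (1, '2024-01-02', '2024-01-05');
INSERT INTO customers VALUES (1, 'Ada');
""")
table_docs = {
    "orders": "one row per order with reference date and delivery date",
    "customers": "customer name and contact details",
    "products": "product catalog with title",
}

question = "How many orders had a delivery date in January?"
tables = select_relevant_tables(question, table_docs)
context = build_context(conn, tables)
print(tables)
print(context)
```

In a real setup the word-overlap scoring would likely be replaced by embeddings or a first LLM call, but the shape stays the same: narrow down the tables first, then send only their schema and a few sample rows.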

  • @TheBestgoku 5 months ago

    This is function calling, but instead of JSON you get a SQL query. Am I missing something?

    • @rabbitmetrics 5 months ago +1

      That is one way to think of it. But in this case LangChain is handling the parsing of the LLM output (note the "model.bind(stop=["\nSQLResult:"])" in the chain). When you generate SQL or any other code you'll find that the code is often returned in quotes or with some text explaining the code. The trick is to minimize this by parsing the output in a suitable way.
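
A minimal sketch of that kind of output parsing, in plain Python rather than LangChain's own parsers: it cuts the response at the "SQLResult:" marker, prefers the contents of a fenced code block if the model produced one, and strips a leading "SQLQuery:" label. The function name `extract_sql` and the exact label strings are assumptions for illustration.

```python
import re

FENCE = "`" * 3  # a Markdown code fence, built this way to keep the example readable


def extract_sql(raw: str) -> str:
    """Pull a single SQL statement out of a chatty LLM response."""
    # Drop anything after the stop marker, in case the model kept going.
    text = raw.split("SQLResult:")[0]
    # Prefer the contents of a fenced block if one is present.
    pattern = re.escape(FENCE) + r"(?:sql)?\s*(.*?)" + re.escape(FENCE)
    fenced = re.search(pattern, text, flags=re.DOTALL | re.IGNORECASE)
    if fenced:
        text = fenced.group(1)
    # Strip a leading "SQLQuery:" label if the model echoed the prompt format.
    text = re.sub(r"^\s*SQLQuery:\s*", "", text, flags=re.IGNORECASE)
    # Remove surrounding whitespace and a trailing semicolon.
    return text.strip().rstrip(";").strip()


chatty = (
    "Here is the query:\n"
    + FENCE + "sql\nSELECT COUNT(*) FROM orders;\n" + FENCE
    + "\nSQLResult: 42"
)
print(extract_sql(chatty))
print(extract_sql("SQLQuery: SELECT 1"))
```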

  • @sahinakhtar2246 8 days ago

    I can't see the table names using "print(db.get_usable_table_names())". The database connected successfully, but it shows an empty array []. What should I do?

  • @lionhuang9209 5 months ago +1

    Where can we download the code file?

    • @rabbitmetrics 5 months ago

      There's a link below the video to the Colab notebook with the code and a written tutorial, including how to generate the ecom tables.

  • @kelvinadungosi1579 4 months ago

    Hi, great tutorial! How would you implement chat functionality, where you can ask follow-up questions?

    • @rabbitmetrics 4 months ago

      Thanks! I would use ChatMessageHistory to manage the conversation and catch the traceback - this is needed for more advanced queries.
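
A rough sketch of that idea, under stated assumptions: a plain Python list stands in for ChatMessageHistory, and `fake_llm` is a hypothetical stub in place of a real model call. The point is only the loop: record each turn, run the generated SQL, and on failure put the last line of the traceback into the history so the next attempt can see what went wrong.

```python
import sqlite3
import traceback


def fake_llm(history):
    """Hypothetical stand-in for a model call: corrects its query once it sees an error."""
    saw_error = any(role == "error" for role, _ in history)
    if saw_error:
        return "SELECT COUNT(*) FROM orders"
    return "SELECT COUNT(*) FROM order_items"  # wrong table on the first try


def ask(conn, history, question, generate_sql, max_attempts=2):
    """Append the question to the history, then generate and run SQL with retries."""
    history.append(("user", question))
    for _ in range(max_attempts):
        sql = generate_sql(history)
        history.append(("assistant", sql))
        try:
            return conn.execute(sql).fetchall()
        except sqlite3.Error:
            # Keep the last traceback line so the next attempt has the error message.
            last_line = traceback.format_exc().strip().splitlines()[-1]
            history.append(("error", last_line))
    raise RuntimeError("no valid query after retries")


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER)")
conn.executemany("INSERT INTO orders VALUES (?)", [(1,), (2,)])

history = []  # stand-in for a chat history object; shared across follow-up questions
result = ask(conn, history, "How many orders are there?", fake_llm)
print(result)
for role, content in history:
    print(role, "|", content)
```

Because the same `history` list is passed to every call, a follow-up question would see the earlier questions, queries and errors.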

  • @SR-zi1pw 5 months ago

    What happens if it drops the table when hallucinating?

    • @MaxwellHay 5 months ago +2

      Read-only role

    • @rabbitmetrics 4 months ago

      As mentioned, make sure to restrict access scope and permissions.