Pandas DataFrame Agent... the future of data analysis?

  • Published: Jun 21, 2023
  • 👉🏻 Kick-start your freelance career in data: www.datalumina.io/data-freela...
    Let's dive into the Pandas DataFrame Agent from the LangChain library to see how we can integrate analytical capabilities into LLM apps. We use the OpenAI API to ask questions about an Excel/CSV dataset and experiment with the possibilities and limitations of this LangChain Toolkit.
    🔗 Links
    github.com/daveebbelaar/langc...
    ⚙️ Copy my VS Code Setup • How to Set up VS Code ...
    👋🏻 About Me
    Hey there, my name is @daveebbelaar and I work as a freelance data scientist and run a company called Datalumina. You've stumbled upon my YouTube channel, where I give away all my secrets when it comes to working with data. I'm not here to sell you any data course - everything you need is right here on YouTube. Making videos is my passion, and I've been doing it for 18 years.
    While I don't sell any data courses, I do offer a coaching program for data professionals looking to start their own freelance business. If that sounds like you, head over to www.datalumina.io/ to learn more about working with me and kick-starting your freelance career.
  • Science
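
As a rough mental model of what such an agent does (a conceptual sketch, not LangChain's implementation): the LLM is shown the question plus a description of the table, replies with a snippet of code, and a tool runs that snippet against the data. In the sketch below, `fake_llm` is a hypothetical stub standing in for the OpenAI call, the table is made-up data held in plain Python lists instead of a DataFrame, and `eval` is used only for illustration; model-generated code needs proper sandboxing in practice.

```python
from statistics import mean

# Made-up table standing in for a DataFrame: column name -> list of values.
table = {
    "job_title": ["Data Scientist", "Data Engineer", "Data Scientist"],
    "salary_in_usd": [120_000, 110_000, 100_000],
}


def fake_llm(question: str, columns: list) -> str:
    """Hypothetical stub for the model call: returns code as text."""
    # A real model would generate this from the question and column names.
    return "mean(table['salary_in_usd'])"


def run_agent(question: str):
    code = fake_llm(question, list(table))
    # The "tool" step: evaluate the generated expression against the data.
    # Builtins are disabled and only `mean` and `table` are exposed.
    return eval(code, {"__builtins__": {}}, {"mean": mean, "table": table})


answer = run_agent("What is the average salary in USD?")
```

The point of the split is that the arithmetic is done by ordinary code rather than by the model, which is why checking the generated code matters as much as checking the final answer.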

Comments • 45

  • @daveebbelaar
    @daveebbelaar  11 months ago +3

    👉🏻Kick-start your freelance career in data: www.datalumina.io/data-freelancer
    👉🏻Learn more about data science and AI: www.datalumina.io/newsletter

    • @igoweiqibaduk8283
      @igoweiqibaduk8283 11 months ago

      Hi Dave, I could not find your email. The call-booking tool on the /data-freelancer page (step 2, after the video) is not working: it says July is unavailable, but the month switch does not work. Regards, George.

    • @daveebbelaar
      @daveebbelaar  11 months ago +1

      @igoweiqibaduk8283 Hey George, thanks for your message. It is correct that the calendar is fully booked right now. I am expecting to take some more calls in 2-3 weeks. You are welcome to subscribe to our newsletter to stay updated on availability.

    • @rajatkumarsinha2159
      @rajatkumarsinha2159 8 months ago

      Hi Dave,
      Can you explain how to pass my CSV file to Dolly 2.0 with LangChain to get question answering like the above?

  • @RanaGustico
    @RanaGustico 11 months ago +2

    About the calculations: Have you tried the prompt:
    - "Act as an expert mathematician. Explain this step by step" (those last words are sometimes required)
    I've read about this workaround for making the AI self-correct before responding. Happy to watch you update and review with the new stuff. Nice content, sir!

  • @joseluisbeltramone599
    @joseluisbeltramone599 9 months ago +2

    Hi Dave: Thank you very much for the excellent explanation. Now, would you please do a video where you run into the token limit of the LLM? I would like to see how to overcome this. Thanks in advance!

  • @irvinJoelBanta
    @irvinJoelBanta 11 months ago

    Love your videos, keep it up

  • @camilocampos5900
    @camilocampos5900 11 months ago +3

    Every day I am more impressed by the potential of LLMs with LangChain. I am a fan of knowledge; thank you for your content.

    • @wongyithong9838
      @wongyithong9838 9 months ago

      Exactly the same feeling. Every time I see the title of one of these videos, I wonder what apps I can build to solve real-world problems.

  • @shikharvarshney7010
    @shikharvarshney7010 11 months ago

    Awesome Explanation !!

  • @AwB
    @AwB 11 months ago +1

    Great video. The part with two dataframes was interesting. I was hoping I could pass in a summary dataframe and a raw dataframe, tell the LLM what is in each one, and then ask it to write an article using both: "Write an article on this month's results (which are in the summary dataframe), and don't forget to mention some interesting related facts from the raw dataframe." This would require it to join the dataframes together.
    Do you think this is possible yet? I see lots of "ChatGPT with your database" content, but I'm curious how it can work with multiple tables of data.

  • @MikeRhodesIdeas
    @MikeRhodesIdeas 4 months ago +1

    @daveebbelaar any plans to update this for langchain 0.1.0 ?? Maybe in the members' area??

  • @JT-Works
    @JT-Works 8 months ago +1

    I am building a Streamlit app with the Pandas DataFrame Agent, and for the life of me, I cannot get the chatbot to keep any memory context in chat. Is there a tutorial where you cover this?
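
One common workaround for missing chat memory is to keep the history yourself and prepend it to each new question. The sketch below is a minimal illustration of that idea, not LangChain's memory API: `ask_agent` is a hypothetical stand-in for whatever callable runs your agent (here a stub, `fake_agent`, that only reports the prompt length), and the prompt layout is an assumption.

```python
from typing import Callable, List, Tuple


class ChatWithHistory:
    """Wraps a prompt -> answer callable and replays earlier turns in each prompt."""

    def __init__(self, ask_agent: Callable[[str], str], max_turns: int = 5):
        self.ask_agent = ask_agent
        self.max_turns = max_turns  # cap the replayed history to limit prompt size
        self.history: List[Tuple[str, str]] = []

    def build_prompt(self, question: str) -> str:
        lines = ["Previous conversation:"]
        for past_question, past_answer in self.history[-self.max_turns:]:
            lines.append(f"User: {past_question}")
            lines.append(f"Assistant: {past_answer}")
        lines.append(f"Current question: {question}")
        return "\n".join(lines)

    def ask(self, question: str) -> str:
        answer = self.ask_agent(self.build_prompt(question))
        self.history.append((question, answer))
        return answer


def fake_agent(prompt: str) -> str:
    # Hypothetical stub; replace with a call to your real agent.
    return f"(answer based on {len(prompt.splitlines())} prompt lines)"


chat = ChatWithHistory(fake_agent)
first = chat.ask("How many rows are there?")
second = chat.ask("And how many columns?")
```

In a Streamlit app, the history list would typically need to live in `st.session_state`, since the script reruns on every interaction and ordinary variables are reset.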

  • @DK-dp3kk
    @DK-dp3kk 4 months ago

    Thank you. Nice video. Do you know if you can summarize text within a cell in the dataframe? Say you have a dataset that includes blog posts and you want a new column with a two-line summary. Ideas?

  • @streetcodenate
    @streetcodenate 5 months ago

    Perfect, my dawg!

  • @prateekkeshari
    @prateekkeshari 11 months ago

    It's interesting to play with. I have tried it out multiple times, but I do see its limitations. Sometimes it also outputs wrong answers. What (in your opinion) would it take for it to be production-ready?

  • @kumargaurav2170
    @kumargaurav2170 10 months ago

    I think using the memory component from LangChain will help overcome the memory-management bottleneck for operations requiring more than one step.

  • @HazemAzim
    @HazemAzim 7 months ago

    Nice, but did you try this with chat models (ChatOpenAI) and gpt-3.5-turbo, which is much cheaper? I think the Pandas DataFrame agent will not work properly, though!

  • @onangarodney7746
    @onangarodney7746 11 months ago

    Would it be more accurate if you added the Wolfram OpenAI plugin to the mix?

  • @Canna_Science_and_Technology
    @Canna_Science_and_Technology 11 months ago

    Just an idea, a video using the new function feature would be great. ;-)

  • @micbab-vg2mu
    @micbab-vg2mu 11 months ago

    Great - Thank you

  • @tommyharlim276
    @tommyharlim276 8 months ago

    How do I put this sort of application on a website, so that I can upload my own data, enter a prompt, and have the result displayed on the site?

  • @waddaa
    @waddaa 9 months ago

    I have been looking for a chain or agent that can work with tools and with your own files as well, but I couldn't find one. Is this even possible?

  • @quickandsmart6298
    @quickandsmart6298 11 months ago

    I've actually looked at this dataset before, and one thing I noticed is that the agent made another error at 11:30. It computed the median salary from the salary column, not the salary_in_usd column. For example, the Head of Machine Learning role had only a single person, who lived in India; converting 6,000,000 Indian rupees gives only about 76k USD, far from what the results show. While the agent is very powerful, it's clearly not perfect: you have to make sure the questions you ask are specific enough, and double-check the code it generates. Regardless, great video, and it's definitely a tool I'll look to use in later projects!

    • @daveebbelaar
      @daveebbelaar  11 months ago

      Ahh, good one! And thanks! Definitely something I missed
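
The mix-up described in this thread can be reproduced with a few rows. The sketch below uses made-up records: only the column names `salary` and `salary_in_usd`, the 6,000,000 INR figure, and its rough 76k USD equivalent come from the comment above; the `job_title` and `salary_currency` field names and all other values are illustrative assumptions. It uses the standard-library `statistics.median` rather than pandas.

```python
from statistics import median

# Hypothetical rows: only the 6,000,000 INR salary and its ~76k USD value are
# taken from the comment; every other value and field name is made up.
rows = [
    {"job_title": "Head of Machine Learning", "salary": 6_000_000,
     "salary_currency": "INR", "salary_in_usd": 76_000},
    {"job_title": "Data Scientist", "salary": 120_000,
     "salary_currency": "USD", "salary_in_usd": 120_000},
    {"job_title": "Data Scientist", "salary": 90_000,
     "salary_currency": "EUR", "salary_in_usd": 98_000},
]


def median_by_title(records, column):
    """Group records by job title and return the median of `column` per title."""
    grouped = {}
    for record in records:
        grouped.setdefault(record["job_title"], []).append(record[column])
    return {title: median(values) for title, values in grouped.items()}


raw = median_by_title(rows, "salary")         # mixes currencies
usd = median_by_title(rows, "salary_in_usd")  # comparable across rows
```

Ranking by `raw` would put the rupee-denominated role far ahead of the others, while `usd` puts it last, which is why naming the exact column in the question (and reading the generated code) matters.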

  • @xanderklein3356
    @xanderklein3356 10 months ago

    Awesome video. Can you do this with Node.js?

  • @nerding_io
    @nerding_io 11 months ago

    Very awesome!

  • @gamerwager5317
    @gamerwager5317 11 months ago

    My suggestion as a YouTube viewer: make the videos shorter. Your voice is great as a background track, but add more info to the video, which adds value to the viewing time. 😊

  • @user-ib8qm8eh3q
    @user-ib8qm8eh3q 7 months ago +1

    Hi Dave, please, can I use an open-source model for this instead of OpenAI?

  • @johnbrisbin3626
    @johnbrisbin3626 11 months ago

    I note that you again use text-davinci, which OpenAI claims is just a slower and more expensive way of getting what gpt-3.5 gives you for a fraction of the price.
    Have you found differently in real use?

    • @daveebbelaar
      @daveebbelaar  11 months ago

      You're right, for a real use case I would use gpt-3.5 or 4. These are a little different to configure because they are chat-based models, but it would indeed be the preferred option.

  • @nanto88
    @nanto88 11 months ago

    awesome

  • @madhu1987ful
    @madhu1987ful 1 month ago

    Can this work on big dataframes? Say, 1 million rows of data?

  • @alchemication
    @alchemication 11 months ago

    Thanks for sharing. The reason this can fail in the real world is that biz is way more complex and a ton of jargon is used. After spending hundreds of hours on this topic, I can conclude it's a good start, but for real-world scenarios on complex data we need to be way more creative. Best!

  • @justinchung982
    @justinchung982 7 months ago

    Please show doing this with Llama2!

  • @RyanScottForReal
    @RyanScottForReal 11 months ago +1

    You need to apply a memory agent

  • @temp911Luke
    @temp911Luke 1 day ago

    I would be more interested if you could use the REAL open AI models (open-source models) instead of GPT-4.

  • @SMCGPRA
    @SMCGPRA 1 month ago

    Can we use an open-source LLM?

    • @girishnaik6433
      @girishnaik6433 1 month ago

      Did you get an answer? I'd really like to know.

  • @ajaypranav1390
    @ajaypranav1390 2 months ago

    Why not use PandasAI?

  • @klammer75
    @klammer75 11 months ago +1

    Does the pandas agent take a memory parameter? I really like these agents when they can hold a little chat history. I had issues getting their CSV agent to hold onto the current convo, as it wouldn't take a 'working memory' parameter like some of the other agents would. Great video 🥳🦾🤓