Unstructured.IO: Get Your Data LLM-Ready

  • Published: Jan 12, 2025

Comments • 20

  • @attilavass6935
    @attilavass6935 1 year ago +5

    I can't wait for a video where scraped HTML is converted with Unstructured for LangChain's Parent Document Retriever (large AND small chunks), with enriched metadata derived from the HTML structure and tags.

    • @DevelopersDigest
      @DevelopersDigest 1 year ago +2

      Love this idea. I have something in mind for parsing HTML content on the fly that I would like to try! I will share a video when I set it up. It might not be exactly what you are looking for, but hopefully along those lines 🙂

    • @TRUEBLUEMATEE
      @TRUEBLUEMATEE 11 months ago

      @DevelopersDigest Yes, please share if you have anything that can do this!
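A rough sketch of the large-and-small-chunk idea from the thread above, using only Python's standard library. It does not call Unstructured or LangChain; `SectionParser` and `build_chunks` are hypothetical names, and the sample HTML is invented. Each heading opens a section, the section body becomes one large parent chunk, and every paragraph or list item becomes a small child chunk carrying metadata taken from the HTML structure (the heading text and the tag).

```python
from html.parser import HTMLParser

HEADING_TAGS = {"h1", "h2", "h3"}
BLOCK_TAGS = HEADING_TAGS | {"p", "li"}


class SectionParser(HTMLParser):
    """Collects (tag, text) pairs for headings, paragraphs and list items."""

    def __init__(self):
        super().__init__()
        self.blocks = []   # finished (tag, text) pairs, in document order
        self._tag = None   # block tag currently open, if any
        self._buf = []     # text fragments of the open block

    def handle_starttag(self, tag, attrs):
        if tag in BLOCK_TAGS:
            self._tag, self._buf = tag, []

    def handle_data(self, data):
        if self._tag is not None:
            self._buf.append(data)

    def handle_endtag(self, tag):
        if tag == self._tag:
            text = " ".join("".join(self._buf).split())
            if text:
                self.blocks.append((tag, text))
            self._tag = None


def build_chunks(html):
    """Return (parents, children): one parent per section, one child per block."""
    parser = SectionParser()
    parser.feed(html)
    parser.close()

    # Group body blocks under the most recent heading.
    sections = []  # list of (heading, [(tag, text), ...])
    for tag, text in parser.blocks:
        if tag in HEADING_TAGS:
            sections.append((text, []))
        else:
            if not sections:
                sections.append((None, []))
            sections[-1][1].append((tag, text))

    parents, children = [], []
    for heading, body in sections:
        if not body:
            continue
        meta = {"parent_id": len(parents), "heading": heading}
        parents.append({"text": "\n".join(t for _, t in body), "metadata": dict(meta)})
        for tag, text in body:
            children.append({"text": text, "metadata": {**meta, "tag": tag}})
    return parents, children


sample_html = """
<h1>Install</h1>
<p>Run the installer.</p>
<p>Restart the <b>shell</b>.</p>
<h2>Usage</h2>
<ul><li>Call the API.</li></ul>
"""
parents, children = build_chunks(sample_html)
```

The `parent_id` in each child's metadata is what would let a retriever search over the small chunks and then hand the matching large chunk to the model.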

  • @xyz-vv5tg
    @xyz-vv5tg 10 months ago +2

    How does it extract tabular data from PDFs? Does it know the relationships between rows and columns?
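On the row and column question: as far as I understand the library, a detected table comes back as a table element, and when table-structure inference is enabled its metadata can include an HTML rendering of the table (`text_as_html`), which is where the row and column relationships live. The sketch below makes no Unstructured calls. It only shows how such an HTML string could be turned into row records with the standard library; `TableParser`, `table_to_records`, and the sample table are hypothetical.

```python
from html.parser import HTMLParser


class TableParser(HTMLParser):
    """Turns one HTML <table> into a list of rows (lists of cell strings)."""

    def __init__(self):
        super().__init__()
        self.rows = []
        self._row = None    # cells of the row being read
        self._cell = None   # text fragments of the cell being read

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            self._row.append(" ".join("".join(self._cell).split()))
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None


def table_to_records(table_html):
    """Treat the first row as the header and return one dict per data row."""
    parser = TableParser()
    parser.feed(table_html)
    parser.close()
    if not parser.rows:
        return []
    header, *data = parser.rows
    return [dict(zip(header, row)) for row in data]


# Invented sample standing in for a table's HTML rendering.
sample_table = (
    "<table>"
    "<tr><th>Quarter</th><th>Revenue</th></tr>"
    "<tr><td>Q1</td><td>120</td></tr>"
    "<tr><td>Q2</td><td>135</td></tr>"
    "</table>"
)
records = table_to_records(sample_table)
```

Each record maps a column name to the cell value in that row, so the relationship between rows and columns is explicit once the HTML has been parsed.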

  • @gouthamkarakavalasa4267
    @gouthamkarakavalasa4267 1 year ago

    There is an issue when extracting text from a PDF document with the "partition_pdf" function using "by_title" as the strategy. The expectation is that text chunks are extracted based on titles, but the extracted text contains a lot of noise.
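A possible reading of the problem above: to my knowledge, `by_title` in Unstructured is a chunking strategy that runs over elements that have already been partitioned, while the `strategy` argument of `partition_pdf` selects how the PDF is read (for example "fast" or "hi_res"). If so, page headers and footers picked up during partitioning would flow straight into the chunks unless they are filtered first. The sketch below illustrates that with plain Python and no Unstructured calls; `chunk_by_title_sketch` is a hypothetical helper, the element list is invented, and the category names only mirror the kind of labels the library assigns.

```python
NOISE_CATEGORIES = {"Header", "Footer", "PageBreak"}


def chunk_by_title_sketch(elements, max_chars=500):
    """Group (category, text) pairs into chunks that each start at a title.

    A new chunk starts at every "Title" element, or when adding the next
    text would push the current chunk past max_chars.
    """
    chunks, current = [], []
    for category, text in elements:
        if category in NOISE_CATEGORIES or not text.strip():
            continue  # drop page furniture before it reaches a chunk
        too_long = sum(len(t) + 1 for t in current) + len(text) > max_chars
        if current and (category == "Title" or too_long):
            chunks.append("\n".join(current))
            current = []
        current.append(text.strip())
    if current:
        chunks.append("\n".join(current))
    return chunks


# Invented elements standing in for the output of a PDF partitioning step.
elements = [
    ("Header", "ACME Corp, confidential"),
    ("Title", "Introduction"),
    ("NarrativeText", "This report covers 2023."),
    ("Footer", "Page 1"),
    ("Title", "Results"),
    ("NarrativeText", "Revenue grew."),
    ("NarrativeText", "Costs fell."),
]
chunks = chunk_by_title_sketch(elements)
```

Filtering by category before chunking is a design choice in this sketch, not something the source confirms the library does by default.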

  • @manjorysaran-tl7ht
    @manjorysaran-tl7ht 9 months ago +1

    How do you get access to the API key? Is it paid?

  • @Anyreck
    @Anyreck 1 year ago +1

    Sounds incredibly useful: multiformat, intelligent parsing....

    • @DevelopersDigest
      @DevelopersDigest 1 year ago +1

      Absolutely! I am going to make at least one or two more technical videos using Unstructured. The team is solving an important piece of the puzzle 🧩 in data pipelines for LLM applications! 🙂

  • @ryanscott642
    @ryanscott642 8 months ago

    Thanks for this. Is there one for Excel or, better yet, for taking tables out of documents and loading them into database tables?
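Once a table has been extracted as a header plus rows, by whatever tool, loading it into a database table is mostly mechanical. The sketch below uses Python's built-in `sqlite3` with an in-memory database; `load_table`, the table name, and the sample rows are hypothetical, and nothing here calls Unstructured. I believe the library also has an Excel partitioner (`partition_xlsx`), but it is not used in this example.

```python
import sqlite3


def quote_identifier(name):
    """Quote a table or column name; identifiers cannot be bound as parameters."""
    return '"' + name.replace('"', '""') + '"'


def load_table(conn, table_name, header, rows):
    """Create a table with TEXT columns named after the header and insert rows."""
    table = quote_identifier(table_name)
    columns = ", ".join(f"{quote_identifier(col)} TEXT" for col in header)
    conn.execute(f"CREATE TABLE {table} ({columns})")
    placeholders = ", ".join("?" for _ in header)
    conn.executemany(f"INSERT INTO {table} VALUES ({placeholders})", rows)
    conn.commit()


# Invented data standing in for one extracted table.
header = ["Quarter", "Revenue"]
rows = [("Q1", "120"), ("Q2", "135")]

conn = sqlite3.connect(":memory:")
load_table(conn, "revenue", header, rows)
result = conn.execute(
    'SELECT "Quarter", "Revenue" FROM "revenue" ORDER BY "Quarter"'
).fetchall()
```

Every column is stored as TEXT here because extracted cells arrive as strings; casting to numeric types would be a separate step once the data has been checked.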

  • @johnlaurencepoole6408
    @johnlaurencepoole6408 1 year ago +1

    The steampunk artwork that drew me into this video is great. Did you have an AI process design it?

    • @DevelopersDigest
      @DevelopersDigest 1 year ago +2

      Yes, I used the DALL-E 3 model with the following prompt:

      “A landscape photo depicting an unstructured, huge pile of books on the left side. In the middle, there's a complex Rube Goldberg machine, intricately designed with gears, pulleys, and various mechanical parts. The machine functions to organize books, and on the right side, it outputs into an organized, pristine Victorian library, filled with neatly arranged shelves of books, an ornate fireplace, and elegant furniture. The entire scene combines chaos and order in a whimsical, fantastical manner.”

      Cheers! 🥂

    • @johnlaurencepoole6408
      @johnlaurencepoole6408 1 year ago +1

      @DevelopersDigest You should do a video on how you created the image. You might attract a different audience, as I'm sure artistically inclined people would be fascinated. I think it's a really alluring piece of art, and I just want to see the details. Congrats.

    • @DevelopersDigest
      @DevelopersDigest 1 year ago

      Thank you for this suggestion! I haven’t considered diving into AI image generation tools much yet, but with your feedback I will certainly add that to my list! Thank you! 🙏

    • @johnlaurencepoole6408
      @johnlaurencepoole6408 1 year ago

      @DevelopersDigest I've been submitting the exact prompt you provided to various "free" services, and the results are nothing compared to what you generated. I'm posting my conclusion mainly for the benefit of people interested in artwork, to document that the DALL-E 3 creation beats several others. I do not mean to hijack your parser video, but, as I said, the image is what drew me into your video. Thank you!

    • @johnlaurencepoole6408
      @johnlaurencepoole6408 1 year ago

      I just signed up for the $20 ChatGPT account so I could access DALL-E 3 and make the kind of intriguing images you do. Thank you, thank you.

  • @jerremy7
    @jerremy7 1 year ago +1

    How do you access the GUI? I got an API key, but it's unclear where to go next.

    • @DevelopersDigest
      @DevelopersDigest 1 year ago +1

      This is the repo!
      github.com/Unstructured-IO/unstructured-api-gui

    • @jerremy7
      @jerremy7 1 year ago

      @DevelopersDigest Thanks!

  • @everybodyguitar5271
    @everybodyguitar5271 6 months ago

    The GPU can't be used. It always shows "invalid key" even though my key is valid.