Very comprehensive video, good stuff!
Would be great if you could make another video on this topic, but with the new Pinecone Assistant connected to the OpenAI Assistant (which is connected and operating in Bubble).
Absolutely love it! Thank you for this complete tutorial! You're successfully elevating Bubble beyond the constraints of the AI era! 🎉🎉
Thank you for this! Keep it up🙂
I’m curious to know if this could be done by using app data in Bubble instead of Pinecone.
Hey, I have it set up where there are 5 different chatbots, and each chatbot has its own PDFs. When a user deletes a chatbot, how do I delete everything related to that chatbot in Pinecone (because when they add a new chatbot, it will use the same namespace as the one that was deleted)? I also need to be able to change the knowledge base within that specific namespace, like letting them add or delete certain files in that chatbot.
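(Not answered in the thread, but for reference: Pinecone can delete everything in a namespace in one call. Below is a minimal sketch, assuming the official Pinecone Python client; the index name, namespace, and chunk IDs are placeholders.)

```python
# Minimal sketch (not from the video): clearing a deleted chatbot's namespace in
# Pinecone so it can be reused, plus removing individual files from a knowledge
# base. Assumes the official Pinecone Python client; all names are placeholders.
from pinecone import Pinecone

pc = Pinecone(api_key="YOUR_PINECONE_API_KEY")
index = pc.Index("chatbot-index")

# Delete every vector stored under the deleted chatbot's namespace.
index.delete(delete_all=True, namespace="chatbot-3")

# To remove only one file from a chatbot's knowledge base, delete its chunk IDs
# (assuming the chunk IDs were saved in Bubble at upsert time).
index.delete(ids=["file-42-chunk-0", "file-42-chunk-1"], namespace="chatbot-3")
```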
Would love to see a tutorial on scraping a website without a sitemap to store in Pinecone!
Hi, thanks for the vids. I am stuck at the operation at 17:32. The "'s matches" option doesn't appear; I only have these options: "'s namespace" and "'s raw body text". Thanks for your help.
Ah, there was probably an issue with the original setup of the API in the API Connector. If you go back to 1:20, check to make sure that when you set up the API, you get matches back when the API is initialized (as seen at 7:09).
@@BlankSlateLabs Thanks for your reply. I have checked the config and it is the same as yours, but after initializing I don't get the same screen as you. Do you think it's because my text is different?
No problem! Ah, that issue is most likely due to there not already being any vectors in Pinecone for it to match with when initializing the API in the API Connector.
A couple of things:
1) Have you already upserted some vectors into Pinecone (from the Part 1 video)?
2) Are you using the same Namespace and metadata filters in the upsert and for the API initialization?
3) Are the vectors you are searching with valid (they should be a list of 1536 numbers, separated by commas)?
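(To illustrate points 2 and 3 above, here is a minimal query sketch, assuming the Pinecone Python client; the video uses Bubble's API Connector, so this is just the equivalent of that HTTP call, with placeholder index and namespace names.)

```python
# Minimal sketch (assumptions: Pinecone Python client, placeholder names). If no
# vectors have been upserted into this namespace yet, matches comes back empty,
# and Bubble's API Connector won't detect a "matches" field to expose.
from pinecone import Pinecone

pc = Pinecone(api_key="YOUR_PINECONE_API_KEY")
index = pc.Index("documents")

query_vector = [0.01] * 1536  # must be a list of 1536 numbers for OpenAI embeddings

result = index.query(
    vector=query_vector,
    top_k=3,
    namespace="my-namespace",   # must match the namespace used when upserting
    include_metadata=True,
)
print(result.matches)
```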
All is good! Big thanks!
🙌 great to hear!
Is there a way to query a collection of documents (your example seems to query one document at a time) and have a reference link to the documents that were used in the response returned to the user?
Yes, totally. First create a new database table in Bubble for collections and set the collection for each document and document chunk. Then there are two ways to define the collection during the upsert and then when querying.
1) Use metadata to define a collection during upserting and then filter when querying.
2) Use namespace to define a collection and query based on each namespace.
Then, when you get the results back, in the step where you create the prompt and add the context, you can include the document name in addition to the document chunk's text content in the "format as text" step. In the prompt text, you would add instructions telling OpenAI to also include the source document name for each part of the answer.
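(A minimal sketch of option 1, collections via metadata, assuming the Pinecone Python client; the field names and placeholder embeddings are illustrative, not the exact setup from the video.)

```python
# Minimal sketch of collections via metadata. Assumes the Pinecone Python
# client; field names and the placeholder embeddings are illustrative only.
from pinecone import Pinecone

pc = Pinecone(api_key="YOUR_PINECONE_API_KEY")
index = pc.Index("documents")

chunk_embedding = [0.0] * 1536     # placeholder for the chunk's OpenAI embedding
question_embedding = [0.0] * 1536  # placeholder for the question's embedding

# Upsert: tag every chunk with its collection and source document name.
index.upsert(vectors=[{
    "id": "doc-12-chunk-0",
    "values": chunk_embedding,
    "metadata": {"collection": "hr-policies",
                 "document_name": "Employee Handbook",
                 "text": "Employees accrue 1.5 vacation days per month..."},
}])

# Query: filter to one collection, then build the context string with document
# names so the prompt can cite its sources.
result = index.query(vector=question_embedding, top_k=5,
                     filter={"collection": {"$eq": "hr-policies"}},
                     include_metadata=True)

context = "\n\n".join(
    f"[{m.metadata['document_name']}] {m.metadata['text']}" for m in result.matches
)
```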
If I want to make it an extended discussion (instead of a single question and answer), how can I make OpenAI remember what was discussed during the conversation?
Any hints to help me search would be appreciated.
Sure! You need to send all previous messages along with the API call to OpenAI. Here is a tutorial that shows how to do that for a more general ChatGPT-clone-like product in Bubble: www.planetnocode.com/tutorial/plugins/openai/build-a-chatgpt-clone-in-30-mins-with-bubble-io/
@@BlankSlateLabs I really appreciate your answers. So how much text from previous messages can I send with the API call? And how can I find information like this regarding Cohere?
@@salemmohammad2701 You can send as much as the input token limit of the model allows. For OpenAI, the limits are here: platform.openai.com/docs/models
You can also just cap it at the last N messages (maybe something like the last 10 messages) to provide continuity with the context.
Regarding Cohere, I can't comment on that. They don't seem to have a chat-based API, so you'd probably just have to append a transcript of the prior conversation to the prompt for each message.
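(For reference, a minimal sketch of the "send the last N messages" approach, assuming the OpenAI Python client; the model name and stored conversation are placeholders.)

```python
# Minimal sketch (assumptions: OpenAI Python client, conversation stored as a
# list of {"role", "content"} dicts in your app). Only the last 10 messages are
# sent so the prompt stays under the model's input token limit.
from openai import OpenAI

client = OpenAI(api_key="YOUR_OPENAI_API_KEY")

conversation = [
    {"role": "user", "content": "What does the refund policy say?"},
    {"role": "assistant", "content": "Refunds are accepted within 30 days."},
    {"role": "user", "content": "And what about digital products?"},
]

messages = [{"role": "system", "content": "Answer using the provided context."}]
messages += conversation[-10:]  # cap history at the last N messages

response = client.chat.completions.create(model="gpt-4o-mini", messages=messages)
print(response.choices[0].message.content)
```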
@@BlankSlateLabs thank you very much, your responses are very helpful
@@salemmohammad2701 no problem! happy it has been helpful. :)
Hey, great video. Question: what's your opinion on using Flowise (LangChain)?
In your experience, is it easier to do this directly from Bubble to Pinecone/OpenAI, or to use an intermediate layer like Flowise, which has more functionality and is easier from a visual perspective? Thanks!
Oh shoot, sorry I never responded to this! In my experience it's a bit easier to do directly in Bubble, but you do miss out on the better document loaders/parsers that come with LangChain. I have not used Flowise yet, though it does look useful.
My general opinion is that the things done within LangChain are not actually that complex (mainly just connecting to other APIs or libraries and coordinating data flow). I find it easier (and I have more control) to do it myself. For more complex things, I'll go outside of Bubble and use a no-code backend called Xano.
@@BlankSlateLabs Thanks for the answer. I'm still figuring this out. The thing is, for me, Flowise plays an interesting role as it makes it easy to test AI possibilities before moving to a platform. This includes playing with chain-prompts, agents, web connections + chains, custom tools, and so on. I find this very useful, but it does take time to refine the flow and prompts to achieve a good result. However, when I switch to Bubble, I'm unsure whether I should do the API call to Flowise or make the connections myself. I'll try making the API connections myself on Bubble to see if I have more control. Thanks again!
@@MisCopilotos Totally understand, and yeah, that is definitely a benefit of Flowise/LangChain. A lot is already built in to get started and experiment. For me it would come down to: a) do I need to do more outside of Flowise than within it, and b) are there customizations (such as prompts or data flows) that I need to make that are not accessible through Flowise?
Those tutorials are great! I really appreciate you providing full guidance on how to put everything together with a working example.
Have you tried your demo app with Excel or CSV files?
I am trying to build a similar app, but I want to analyze the uploaded Excel or CSV files similar to what GPT Code Interpreter does. I believe that functionality was not yet available at the time you created the tutorial. Do you think it is now possible to send files directly to OpenAI to analyze and interact with them in a similar fashion, or do you still need to split the files into chunks and rely on Pinecone?
Hey! Thanks so much for the feedback! Yeah, this was before the days of Code Interpreter. I have not done anything yet with Excel or CSVs. From what I have heard, the best way to query data is to think of GPT as a query writer vs. the actual data analyzer.
So to compare it to this Pinecone example: You have data (the documents and their chunks), you have a process to search for the data (vector search in Pinecone), and then once you have the right data (Text Context), you can use GPT to summarize that data.
So a system where you hold tabular data in something like a SQL-based database can work like below:
The SQL database is the data source. You then use GPT to translate the text question into a SQL query. You run that SQL query on your database to get the right data. Then you pass the matched data along to GPT to summarize or answer questions from.
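(A minimal sketch of that flow, using SQLite and a made-up sales table purely for illustration; this is not the video's setup.)

```python
# Minimal sketch of the text-to-SQL flow described above: GPT writes the query,
# the database returns the data, GPT summarizes it. SQLite, the table schema,
# and the model name are illustrative assumptions.
import sqlite3
from openai import OpenAI

client = OpenAI(api_key="YOUR_OPENAI_API_KEY")
conn = sqlite3.connect("sales.db")  # placeholder database

question = "What were total sales per region last month?"

# 1) GPT translates the question into SQL for a known schema.
sql = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[
        {"role": "system",
         "content": "Write a single SQLite query for the table "
                    "sales(region TEXT, amount REAL, sold_at DATE). "
                    "Return only the SQL."},
        {"role": "user", "content": question},
    ],
).choices[0].message.content

# 2) Run the generated query to get the matching rows (in production you would
#    validate the SQL before executing it).
rows = conn.execute(sql).fetchall()

# 3) GPT summarizes the returned rows into a plain-language answer.
answer = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[{"role": "user",
               "content": f"Question: {question}\nData: {rows}\nAnswer briefly."}],
).choices[0].message.content
print(answer)
```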
@@BlankSlateLabs I really appreciate your response and advice on that! Really helpful information!
Awesome! What if we ask a question that is not in the document? Do we need a paid Bubble account?
In this implementation it would just tell you that the context does not include any information about the question.
Yes, for the chunking process, which uses backend workflows, you will need a paid Bubble account.
I see you copy and paste the text from a document. Have you thought of a way to allow users to upload the PDF instead of copying and pasting?
I’m guessing you’d have to extract the text via some other API. Wondering if you found the easiest way.
Great videos 🎉
There are definitely a couple of ways to do the PDF upload route. A couple of APIs I found are convertapi.com/pdf-to-txt and pdf.co/pdf-to-text-api
It actually is probably worth creating another video on just that. (I am getting this question from a few folks and agree it would make this more valuable.) I'll get something out this week!
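(For reference, a minimal sketch of one alternative to the conversion APIs mentioned above: extracting the text locally with the pypdf library before chunking and upserting. The file name and chunk size are placeholder assumptions.)

```python
# Minimal sketch (assumptions: the pypdf library, a placeholder file name) of
# extracting PDF text locally before chunking, as an alternative to the
# PDF-to-text APIs mentioned above.
from pypdf import PdfReader

reader = PdfReader("uploaded_document.pdf")
text = "\n".join(page.extract_text() or "" for page in reader.pages)

# Naive chunking by character count; the tutorial's chunking logic in Bubble
# backend workflows would replace this step.
chunk_size = 1000
chunks = [text[i:i + chunk_size] for i in range(0, len(text), chunk_size)]
print(f"Extracted {len(chunks)} chunks from {len(reader.pages)} pages")
```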