This content is top notch among the ML and AI videos on YouTube, showing us how it really works!
Thank you, I’m glad it’s helpful!
Comedy dataset update! I have found an approach I think I like for it, though I didn't have time to complete it for this video. So, I will also cover that in today's live stream!
Okay so after a cup of coffee and watching a couple of times, WOW. You helped me so much thank you. This has been driving me nuts and you make it look so easy to fix. I wish I was as smart as you. Thank you again. 🎉
You always ask the best questions, so keep them coming :)
Amazing work... this channel is pure gold: exactly the right amount of concepts, everything is spot on. Nothing beats teaching from experience like you do.
I’m glad it was helpful and thank you for the comment :)!
I very much appreciate that you always have this way of listing the most important bullet points at the beginning
I’m glad it’s helpful! I figured it would be nice to give a quick overview
Finally, a freaking great tutorial! Practical, straight to the point, and it works!!
I knew I subscribed here for a good reason. This is consistently extremely high quality information -- not the regurgitated stuff. This is super educational and has immensely improved my understanding.
Please keep going bud, this is great.
Thank you! It’s greatly appreciated
Dude, seriously, your content is so clear and easy to follow. Keep it up!
You’re literally a genius! I appreciate you taking the time to share the knowledge with us! Exactly what I was looking for… how to create a dataset and in such a well put together video. Thank you
I would pay a lot of money for this information, thank you.
The appeal has been processed by the approval AI... And it passed! The prescription will now be covered. 😊
(Thank you for the video! I think datasets and install dependencies are ML's greatest pain points at the moment.)
Thank you! I’m glad it was helpful :)
Great explanation with the right level of details and depth. Good stuff. Thanks!
I’m glad it was helpful!
That's awesome! And you can even save the new appeal to create more data!
Indeed! It becomes a very nice self-reinforcing loop, which is why I really like the fine-tuning and embedding approach.
Wow, how do you make everything look so easy? Nice, thanks. So, East Coast? Man, you're an early bird.
I live in MST, haha. I just wake up very early :)
How would building a training set on a codebase look? Is there a good example of automating the generation of a Q&A training set based on code? How do you chunk it to fit in the context window: break it up by functions and classes? Where would extraneous stuff go, like requirements, imports, etc.? Thanks for the great content!
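One way this can look, as a minimal sketch rather than a recommendation: split each source file by top-level functions and classes with Python's `ast` module, and let the leftover lines (imports, constants, requirements-style context) travel as their own chunk. The `chunk_code` helper below is hypothetical, not something from the video.

```python
import ast

def chunk_code(source: str) -> list[str]:
    """Split a Python file into one chunk per top-level function/class,
    plus a final chunk holding everything else (imports, constants, ...)."""
    tree = ast.parse(source)
    chunks, covered = [], set()
    for node in tree.body:
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef)):
            chunks.append(ast.get_source_segment(source, node))
            covered.update(range(node.lineno, node.end_lineno + 1))
    lines = source.splitlines()
    leftover = "\n".join(line for i, line in enumerate(lines, 1) if i not in covered)
    if leftover.strip():
        chunks.append(leftover)
    return chunks

# Each chunk can then be handed to the Q&A-generation step independently,
# so no single piece has to exceed the model's context window.
```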
Superb presentation. As always. 😊
This video was awesome! I'm finally starting to wrap my head round this stuff. At the same time I'm realising the power that is being unleashed onto the world!
BTW, did you see this new paper: SpQR: A Sparse-Quantized Representation for Near-Lossless LLM Weight Compression? It looks like it's right up your alley!
Thank you! I’m glad it’s helpful :D
I have not seen this; this is super cool though, thank you for pointing me to it! I would love to see some implementation of pruning in LLMs. Quantization is incredibly powerful, but we can only compress so much until we hit the limit. With pruning plus weight compression, we could run 30/65B parameter models on a single consumer GPU.
Amazing, thanks a lot for sharing your reflections on your work and experience! It is much appreciated! This is the first time I've checked out something like this while quickly browsing and stuck with it, without having to review or study and come back later. I was able to get a bird's-eye view of the topic, the options available for work, and the underlying purpose. 🥇 Pure gold. Definitely subscribed!
Awesome video!
Thank you!
Hey man, thanks for your videos, they are instructive. I am new to LLMs and I think there is a significant gap in YouTube content on the new LLMs. I know there are videos on fine-tuning GPT-3, but I can't find anything like a walkthrough of fine-tuning a larger new open-source model like Falcon-40B Instruct. If there were a playlist going through the whole process (Q&A fine-tune data definition, synthetic data production, fine-tuning, and testing), I am sure others like myself would be very keen followers.
I’ll make a playlist today!
Do you have a video on how to prepare a dataset for creative writing?
You are an Angel. 💜
Thank you! I'm glad it was helpful :) I do like how you left your name like that, haha
It's so helpful, thank you! What if I have multiple PDF files at the same time and each one of them has its own subject? Can I do the same for them?
I really love the concept, but whatever I have tried, I get: ERROR: Token indices sequence length is longer than the specified maximum sequence length for this model (194233 > 2048)
Could you please update it? It would be of immense value to me :)
Did you find the solution?
Would also love to know :)
How did you resolve this problem?
Great explanations, thanks a lot for your efforts making this great content!
Awesome content!! Thank you very much!!👏🏻👏🏻👍🏻
I am getting an error like this.
Token indices sequence length is longer than the specified maximum sequence length for this model (546779 > 2048). Running this sequence through the model will result in indexing errors
Max retries exceeded. Skipping this chunk.
Same here. Does anyone have an answer?
top notch content
Hi Aemon, I am new to setting up a local LLM API. Could you explain a little about how to go about setting it up? Thanks.
Hey there! From the OobaBooga web application you can enable extensions, including the API. It will run on port 5000 by default!
Hi Aemon, I checked api and public_api on the flags/extensions page; any idea why I can't connect to port 5000?
Well done, but how do you handle the max model length of tokenizer.encode?
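For anyone hitting the "Token indices sequence length is longer than the specified maximum sequence length" warning mentioned above: a minimal sketch of one workaround, assuming the cause is encoding an entire book in a single call, is to pack paragraphs into chunks that stay under the model's token limit before doing anything else. The `chunk_text` helper and the gpt2 tokenizer here are illustrative placeholders, not the video's exact code.

```python
from transformers import AutoTokenizer

def chunk_text(text: str, tokenizer, max_tokens: int = 2048) -> list[str]:
    """Pack paragraphs into chunks whose approximate token count
    stays under the tokenizer's maximum sequence length."""
    chunks, current, current_len = [], [], 0
    for paragraph in text.split("\n\n"):
        n = len(tokenizer.encode(paragraph, add_special_tokens=False))
        if current and current_len + n > max_tokens:
            chunks.append("\n\n".join(current))
            current, current_len = [], 0
        current.append(paragraph)
        current_len += n
    if current:
        chunks.append("\n\n".join(current))
    return chunks

tokenizer = AutoTokenizer.from_pretrained("gpt2")  # placeholder; use your model's tokenizer and limit
with open("book.txt", encoding="utf-8") as f:
    book = f.read()

for chunk in chunk_text(book, tokenizer):
    ...  # send each chunk to the Q&A-generation step instead of the whole book at once
```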
Hi, I have some confusion about your content on leveraging embeddings. My understanding so far is that the embedding approach simply means "few-shot learning". The pipeline is: say I have a query, I embed the query into a vector and then search for similar vectors, which represent relevant examples, in a vector DB. Now I have my initial query plus some examples of (query, answer) pairs from the DB. Then I somehow cleverly concat my query with the retrieved examples to form a long instruction/prompt, feed it to the LLM, and just wait for the output. Did I get my understanding right?
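A minimal sketch of the retrieve-then-prompt pipeline described in this comment, assuming a sentence-transformers embedder and a brute-force cosine-similarity search; the model name, the toy example store, and the `ask_llm` call are placeholders, not what the video uses.

```python
import numpy as np
from sentence_transformers import SentenceTransformer

embedder = SentenceTransformer("all-MiniLM-L6-v2")  # assumed embedding model

# Toy "vector DB": previously answered (question, answer) pairs and their embeddings.
examples = [
    ("What is a prior-authorization appeal?", "An appeal asks the insurer to reconsider ..."),
    ("Who reviews a denied prescription?", "A medical director or review board ..."),
]
example_vecs = embedder.encode([q for q, _ in examples], normalize_embeddings=True)

def retrieve(query: str, k: int = 2):
    q_vec = embedder.encode([query], normalize_embeddings=True)[0]
    scores = example_vecs @ q_vec              # cosine similarity, since vectors are normalized
    return [examples[i] for i in np.argsort(-scores)[:k]]

def build_prompt(query: str) -> str:
    context = "\n\n".join(f"Q: {q}\nA: {a}" for q, a in retrieve(query))
    return f"{context}\n\nQ: {query}\nA:"

# answer = ask_llm(build_prompt("How do I appeal a denied prescription?"))  # ask_llm is a placeholder
```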
When you uploaded the additional data in superbooga, did you have to prep it first in a question-and-answer format like you did for the fine-tuning, or were you able to just upload books, files, etc. for that part? Also, thanks for doing these videos! These are by far the most informative on how this stuff works!
I just naively dumped the entire file, which I wouldn’t do for a more sophisticated application. Though superbooga will just chunk the files for you, so you can just drag and drop massive files.
@@AemonAlgiz Thanks! How do you deal with more complex formatted material, such as research papers? Are the parsers good enough to handle them without a lot of data cleaning or prep work on the paper first?
@@unshadowlabs this has been my area of expertise for years! I worked in scientific publishing for over a decade, so what I find is that trying to naively parse them works to some extent, especially with research papers since they tend to be very topically dense. What you may find challenging is keeping all of the context densely packed, so it may be worth trying to split on taxonomic/ontological concepts.
@@AemonAlgiz Awesome, thanks for the reply! A suggestion for a video: I would love to see how you deal with different types of content and sources, what kind of data processing, wrangling, or cleaning you apply, and what tools you recommend given your expertise, background, and experience.
This is a great idea, I have dealt with some nightmarish formats
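For what it's worth, one hedged illustration of splitting on structural concepts rather than raw size, assuming the source has markdown-style section headings to key on; the regex and the fallback threshold are arbitrary choices, not a method described in the thread above.

```python
import re

def split_on_headings(text: str, max_chars: int = 4000) -> list[str]:
    # Split on markdown-style headings so each chunk stays on a single topic.
    sections = re.split(r"\n(?=#{1,3} )", text)
    chunks = []
    for section in sections:
        if len(section) <= max_chars:
            chunks.append(section)
        else:
            # Fall back to paragraph splits if one section is still too long.
            chunks.extend(p for p in section.split("\n\n") if p.strip())
    return chunks
```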
Hey @AemonAlgiz - How did you create the instruction set of data for the CYPHER query examples? Did you do that all manually?
Hi, I was listening to your description of raw text and then how you converted it. But can you just upload a very short story that has the style you like, take all the defaults on the training tab, use the plain TXT file, and make a LoRA that will be useful, in that it will simulate the style I like in the model I want to use?
Aemon, what time will your live stream be?
6PM MST :D
Hi @AemonAlgiz, great video! I am using a similar approach (I use langchain for handing the documents over to an LLM) and I have tried a WizardLM model, which hasn't performed too well. What strategies (fine-tuning, in-context learning, or other models?) would you recommend to improve the performance of answering a question given the retrieved documents? Can you recommend specific models (Flan-T5 or others)?
Gorilla is specifically tuned for use with langchain, so that may be an interesting model to test with. What kind of data do you want to use? That may influence my answer here.
@@AemonAlgiz I hadn't heard of Gorilla, so thanks for pointing that out! I would like to answer questions given paragraphs of a technical manual.
Hi @@AemonAlgiz, I don't quite understand how to use Gorilla with an existing vector database. Could you make a video on that, or do you have guidance for it? Am I supposed to use the OpenAI API for that use case?
Amazing work! I would like to know if it is possible to use langchain to load PDFs and batch-generate instruction datasets.
Could you clarify the performance of the LLMs where you provide context but don't do a fine-tune? Was that last OobaBooga medical appeal demo with a fine-tuned model, or was it just using the additional embedded context?
Hi @aemonAlgiz - how long did it take to finetune stablelm-base-alpha-7b ? On what hardware?
Howdy! Not very long for this, since it was a fairly small fine-tune, about an hour. I use an AMD 7950X3D CPU and an RTX 4090.
Token indices sequence length is longer than the specified maximum sequence length for this model (249345 > 2048). Running this sequence through the model will result in indexing errors
I am facing this issue; please help with a resolution.
Can you explain the code to convert PDF to JSON? I don't know how you're doing that. It's great and that's what we need. Thanks in advance!
Hey Aemon, how can I structure my dataset so it outputs answers in a specific format every time? Is this possible?
Hmm... I would like to be able to update the LLM, i.e. by extracting the documents in a folder, extracting the text, and fine-tuning it in. I suppose the best way would be to inject it as a text dump, but how (please)? I.e. take the whole text and tune it in for a single epoch only, as well as saving my chat history as an input/response dump, again a single epoch only.
Question: each time we fine-tune, does it take the last layer, make a copy, train the copy, and replace the last layer? Since the model weights are frozen, does this mean they don't get updated? If so, is the LoRA applied to this last layer, essentially replacing it? If we keep replacing the last layer, do we essentially wipe over the previous training? I have seen that you can target specific layers: how do you determine which layers to target, and then create the config to match those layers?
Question: how do we create a strategy for regular tuning without destroying the previous training? Should we be targeting different layers each fine-tune?
Also, why can we not tune it live, i.e. while we are talking to it, or discuss with the model and adjust it while talking? Is adjusting the weights done by autograd in PyTorch with the optimizer, e.g. Adam? With each turn we could produce the loss by supplying the expected outputs to compare for similarity, so if the output is over a specific threshold it would fine-tune according to the loss (optimize once), i.e. switching between training and evaluation, freezing a specific percentage of the model: essentially working with a live brain?
How can we update the LLM from conversation, e.g. by giving it a function (function calling) to execute a single training optimization based on user feedback (positive and negative votes) and the current response chain? And if RAG was used, should the retrieved content be tuned in?
Sorry for the long post, but it all connects to the same thing.
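On the "which layers to target" part specifically: a LoRA does not replace the last layer; it adds small trainable adapter matrices next to whichever modules you name, while the base weights stay frozen. A minimal sketch with the PEFT library, assuming a LLaMA-style model whose attention projections are named q_proj/v_proj; the base model name and hyperparameters are illustrative, not a recommendation.

```python
from peft import LoraConfig, get_peft_model
from transformers import AutoModelForCausalLM

base = AutoModelForCausalLM.from_pretrained("huggyllama/llama-7b")  # placeholder base model

lora_config = LoraConfig(
    r=8,                                   # rank of the adapter matrices
    lora_alpha=16,
    lora_dropout=0.05,
    target_modules=["q_proj", "v_proj"],   # which modules get adapters; names depend on the architecture
    task_type="CAUSAL_LM",
)

model = get_peft_model(base, lora_config)
model.print_trainable_parameters()  # only the adapter weights are trainable; the base stays frozen
```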
When training on a dataset like this, it seems the Q&A is very specific to the book. Wouldn't that make the model too specific to the use case you're training for?
Amazing work! I'm still trying to understand the embeddings approach. 😊
Basically, we would rather teach the model how to use information than try to teach it everything. So, if we can give the model enough examples of what a procedure looks like, it can learn how to better follow it.
So, take for example a paralegal or a lawyer. They're well educated in how to write legal briefs, though they're not aware of every law in existence. They have learned how to research and leverage information, which is what we're trying to do with this approach.
The only way you'll understand it is by trying it yourself
@@Hypersniper05 you are right.
@@AemonAlgiz Thanks for clearing up my doubt. I will try to reproduce it in my Colab Pro.
Let me know how the experiment goes!
Hi @aemonAlgiz, I am new to Python (and LLMs) and wanted to try creating a dataset from a book as well. However, when running the provided code, I got a warning:
"Token indices sequence length is longer than the specified maximum sequence length for this model (181602 > 2048). Running this sequence through the model will result in indexing errors
Max retries exceeded. Skipping this chunk." (which happened a lot).
The new .JSON file was empty. I tried changing "model_max_length" from 2048 to 200000 in the tokenizer_config for my model, but that only made the warning disappear (the result was the same).
Would love if anyone has a solution to this :)
Did you get the solution??
@@abhaypratap7415 nope
What's the difference between this and chatting with documents?
That's a great question! You can encourage the model to "behave" in a particular way. Of course, you're not really imbuing the model with knowledge; you're creating a preference for tokens that satisfy some requirement. For example, if I had enough samples for a solid fine-tune on appeals, it would write in a near-human way during the process.
So, combining that influence on the model's behavior with additional context from documents, you get a more modern version of an expert system. This is a technique we have been using in industry to get models to fulfill very specific use-cases.
Think of it as if you were using Bing, but the search results are very specific. This is good for closed domains and very specific tasks. I use it for work as well, on closed-domain data.
Which model did you use in OobaBooga?
Waiting for new content 😊
So with superbooga you could just drop in the file with the Q&A from the book, add an injection point in your prompt, and the LLM has access to the data? That sounds too easy, lol.
So say you want OobaBooga to be a storytelling AI: can you add the injection point in that opening prompt, feed it a Q&A made from Stargate scripts, and then have it use that data in responses to set the tone and characters?
Superbooga makes it pretty easy! They have a drag and drop embedding system and it handles the rest for you. It’s not going to be optimal for all use-cases but it works well in general
@AemonAlgiz How do you enable the superbooga API?
I am getting this error: "Max retries exceeded. Skipping this chunk."
🙏 thanks
thank you soooooo much
thanks man
Thanks for your awesome video. If you someday want to work as a mentor for our startup, write me, dude.
I still understood literally nothing. What do vector databases have to do with embedding vectors in language models? And how do they get utilized anyway? This video is basically "we mentioned them in adjacent sentences, and this shows they can work together".
Howdy! I’m happy to try and explain anything that’s not clear. Where are things not making sense?
@@AemonAlgiz The whole thing, the entire pipeline, especially for the QA use case. Like, if I have a huge document put into a vector database, an embedding for a question about this document can very well be really far away from any relevant vector in the database, making the chances of retrieving a relevant vector smaller. If this vector affects further model generation, then we won't get an answer to this question. It's also not clear how exactly this vector gets used within the model anyway. Is it concatenation? Or is it used as a bias vector? Or is it a soft prompt?
@@РыгорБородулин-ц1е this is a great question! This is why we have the tags around different portions of the input, mainly to control the documents that are queried for. Since we can wrap the input, we have explicit control over what portion of the input text gets embedded for the query. Does that make more sense?
Also, the way we chunk inputs helps to prevent getting portions of the document that aren’t relevant. The way I embedded in this example was naive, though we can use very intricate chunking methodologies to have a higher assurance of topical density.
@@AemonAlgiz In that case, if we need explicit control over which documents/portions of documents are queried, it sounds like these queries are more like queries to old-fashioned databases and less like questions to a language model, with a lot of manual labour and engineering knowledge required to make fruitful requests.
There's a lip-sync issue with your audio.
Did you skip the training process?
I am going to get fired if you don't come back
Hi, I'm new to Python and getting an error related to the token sequence length exceeding the maximum limit of the model, could you please help me to solve the problem?
ERROR: Token indices sequence length is longer than the specified maximum sequence length for this model (194233 > 2048). Running this sequence through the model will result in indexing errors 2023-08-24 10:41:54.890169: W tensorflow/compiler/tf2tensorrt/utils/py_utils.cc:38] TF-TRT Warning: Could not find TensorRT
Would also love an answer to the token indices issue.
So in the embedding approach, the embeddings are just additional information that gets injected into the prompt itself? In other words, the fine-tuned model knows how to do something, but I can use some extra help (the embedded info) to generate a better prompt? If so, we are optimizing the prompt, right? Thanks for the video!
UnboundLocalError: local variable ‘iter’ referenced before assignment
How can I solve this problem?
Wasted 10 minutes to find out you're using an API ("oobabooga"?) instead of actually telling us how.
😂