Am I the only one who takes active notes on Dan's videos? Earlier this year, I wrote to myself "Choose to consume content that's worth remembering". IndyDevDan's content is always an immediate click. Thanks for pumping out great content, Dan.
Problem is I don't remember it after 1 week 😄, so I just consume it and forget
Why not use NotebookLM for this? 😀
Me too, pausing like crazy. Love these videos... I wait for every single one!
@@ПотужнийНезламізм Make Anki cards and review daily
Taking the transcriptions and posting them into Claude for notes. 🎉
Tvm Dan!
Step-by-Step Implementation
1. Build a Basic Prompt (Level 1):
   - Write a direct instruction or query.
   - Test for desired outputs using tools like llm.
2. Add Reusability (Level 2):
   - Define static variables and instructions.
   - Save the prompt in a structured format like XML for future use.
3. Incorporate Examples (Level 3):
   - Add example outputs to guide the LLM.
   - Test the prompt for consistency across different inputs.
4. Scale with Dynamic Variables (Level 4):
   - Integrate dynamic elements into your prompt.
   - Automate updates for scalability in applications (see the sketch below).
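Not from the video's repo, just a rough sketch of what a Level 2/3 prompt with a dynamic variable might look like when run through the `llm` CLI; the XML tag names, the example task, and the file layout are illustrative assumptions:

```python
# Rough sketch (illustrative, not Dan's exact prompt): a Level 2/3 style XML
# prompt with static instructions, one example output, and a dynamic
# {content} variable, sent to whatever default model `llm` is configured with.
import subprocess

PROMPT_TEMPLATE = """\
<purpose>Summarize the provided notes into concise bullet points.</purpose>
<instructions>
    <instruction>Output 3-5 bullets, each under 15 words.</instruction>
    <instruction>Keep the original terminology; do not invent facts.</instruction>
</instructions>
<example-output>
- Shipped v1.2 with faster prompt loading
- Benchmarked three local models on summarization
</example-output>
<content>
{content}
</content>
"""

def run_prompt(content: str) -> str:
    """Fill the dynamic variable and hand the prompt to the `llm` CLI."""
    prompt = PROMPT_TEMPLATE.format(content=content)
    result = subprocess.run(["llm", prompt], capture_output=True, text=True, check=True)
    return result.stdout

if __name__ == "__main__":
    print(run_prompt("Meeting notes: we agreed to test XML prompts on Qwen and ship Friday."))
```

Swapping models for the Level 4 "test across models" step is then just adding `-m <model>` to the subprocess arguments.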
Bro. This video is hands down the best-kept secret in all of prompting. I toyed around with super small 1B-parameter models like SmolLM2 and had basically given up on getting them to do anything useful, but the XML tip is the biggest unlock I've come across in this space. It should have been completely obvious, but I've never seen this mentioned anywhere else.
Couldn't have said it better myself word for word
100% - the GenAI Eng ecosystem is sleeping on XML, but not us.
We concur, +1 on the team XML unlock. From a minute-old newbie sub.
Gold! Gold! I've learned so much about prompt engineering from this video. I'm almost finished using this method to build a legal assistant, with LPC/SQE or BPTC qualifications and post-qualification experience (PQE) level intelligence, to draft service contracts and negotiate terms with my clients via email.
Just created a 500-line XML prompt for a project, mostly using llm to create it based on a working example of what I wanted it to accomplish. It has security, optimization, examples, data definitions, table of contents, instructions, and search definition sections. Works great! 💪
Great content, thanks! I use Fabric as my prompt factory, and I'm inspired to test XML vs. Markdown for time and efficiency.
I've been using and experimenting with LLMs a lot over the last couple of years, and I still learned a lot of great tips from this video. Thanks Dan! LLMs are pretty good at following prompts written in Markdown, but the XML-esque format does seem to be more consistent for models like Qwen and the other local models I tested.
You are one of the few YouTubers I can't just listen to. I have to actually watch. Your content is so data-rich! Love the work!
I can't believe I only just now found your channel.
Uggh. I have so much catching up to do watching your videos.
After watching your very valuable video, I'm really excited and inspired by your sentence: "The prompt is the new fundamental unit of knowledge work". Thank you.
+1 & Team
I am forever grateful for your sharing. This is awesome and will definitely make an epic difference to my workflow
Really useful. I almost got there on my own by experimenting with a sort of personal CV generation tool based on "historical" project data and requirements from projects I want to apply to. I definitely like the idea of using XML; so far I've been using JSON, but XML might be a better choice for structuring the task, as you rightfully argued. The only thing I'm not sure about is the quite simple examples. I think LLMs understand well when you state once that you expect a certain structure, but repeating it several times might steer them better towards the expected outcome...
Suggest adding "ell" to your workflow.
In the docs they use Python f-strings to manage prompts as functions.
You can replace the f-strings with your XML snippets.
It gives you a lot of benefits and full control over prompt engineering.
One more suggestion:
consider YAML instead of XML.
It gives you a clearer view and editing experience, and supports multiple documents inside a single YAML file as well.
Afterwards you can simply convert: YAML -> XML -> txt -> string inside an 'ell' function.
Ultimate combination.
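For what it's worth, a minimal sketch of that combination, based on the basic pattern in ell's docs (docstring as system prompt, return value as user prompt); the YAML file name, its keys, and the XML tag names are my own illustrative assumptions:

```python
# Sketch of the suggestion above: keep prompt sections in YAML, flatten them
# to XML-style tags, and wrap the whole thing in an ell prompt function.
import ell
import yaml  # PyYAML

def yaml_to_xml(path: str) -> str:
    """Flatten a simple one-level YAML mapping into XML-style tags."""
    with open(path) as f:
        data = yaml.safe_load(f)
    return "\n".join(f"<{key}>{value}</{key}>" for key, value in data.items())

@ell.simple(model="gpt-4o-mini")
def run_prompt(content: str) -> str:
    """You are a careful assistant that follows the XML sections exactly."""
    # prompt.yml (hypothetical) might contain keys like purpose, instructions,
    # example-output; the dynamic content gets appended as its own tag.
    return yaml_to_xml("prompt.yml") + f"\n<content>{content}</content>"

# print(run_prompt("raw notes to summarize..."))
```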
This topic is fascinating. I have so many questions.
- If you see an output 10+ times as expected, is it reasonable to assume it will always function perfectly with proper inputs?
- Is the temperature on these models usually set to zero for maximum consistency?
- If you raise the temp for some reason, would that introduce more of a need for unit testing?
- Would you choose to unit test with LLMs as well?
I’m going to lie down now... Thanks for the inspirational and informative content.
My thoughts:
- No such thing as 'always' with LLMs, BUT the larger your dataset (100+), the more confident you can be. 10+ is a good signal; 100+ is a good data-backed signal.
- Yes, typically lower is better for instruction following.
- Yes.
- Totally, the live benchmarking is a good middle ground between 'vibe tests' and hardcore benchmarking. If you're going to prod, unit tests / benchmarks are a must.
@@indydevdan Thank you for taking the time to respond! This continues to fascinate me. Best of luck in your continued YT and AI journey
You are amazing
I think you go beyond most people I know in the industry
Thanks for your efforts to spread insightful best practices
This is cool. I've been introduced to this new type of prompting with XML. I always thought JSON was the best choice.
I love the presentation and the thought process behind these videos. I like the idea of leveraging prompts and not relying on abstractions.
What would you use to implement RAG?
top notch, per usual! Keep up the good work!
Holy cow, man. I'm a cloud architect and have only recently been bitten by the AI bug, lol. But this first prompt example (level 1) is years beyond anything I've explored. Anyone have any other videos or courses to even get to L1? This dude is fantastic!
Fantastic tutorial. Leveled up! 🎉
I would love to see some benchmarks with DSPy as the prompting tool for each of these models. From my understanding, DSPy optimizes instructions for each LLM based on the prompting pattern that model has been tested to perform best with. It's basically an engineer-minded Python library that brings programmatic instructions back to prompting. It would be awesome to see you run some tests with each of the models optimized with DSPy and compare outputs... since it's really not fair to test the same prompts with different models, given each has its own optimal prompting style, I feel.
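Not a benchmark, but a rough sketch of the DSPy pattern being described, based on my understanding of its documented basics; the model name, signature, metric, and optimizer choice are illustrative assumptions:

```python
# Sketch: declare *what* you want via a signature and let DSPy own the prompt
# text; an optimizer can then tune instructions/examples per model.
import dspy

dspy.configure(lm=dspy.LM("openai/gpt-4o-mini"))

# "document -> summary" is the declared task; DSPy generates the actual prompt.
summarize = dspy.ChainOfThought("document -> summary")

def concise_metric(example, prediction, trace=None):
    """Toy metric: reward summaries under 50 words."""
    return len(prediction.summary.split()) < 50

# With a small trainset of dspy.Example objects, an optimizer such as
# BootstrapFewShot can rewrite the prompt per model:
# optimizer = dspy.BootstrapFewShot(metric=concise_metric)
# tuned = optimizer.compile(summarize, trainset=trainset)

print(summarize(document="Long meeting notes go here...").summary)
```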
I loved the experimentation and the practicality. Will wait for more.
As long as tasks are easy and I can check them myself, I can also write these little queries myself. If a task is huge and I cannot check its correctness, I cannot trust the answer. And if I cannot trust the output of an LLM, I have to do the task myself anyway. Prompt engineering is only applicable to chatbots and the like, which don't require strict and precise answers, and that means prompt engineering is still not a skill, because it is not measurable.
Can you do a Cursor/VS Code setup video? Your setup looks amazing.
IKR!
Great video and very well-structured presentation! Thanks for sharing. Can I ask, if it makes sense, for a video with additional hints for situations where we need prompts for different agents in a multi-agent sequence? Any considerations on how to evolve this logic if we have to run multiple prompts?
Doing level 5 now: it generates an HTML snippet and inserts it into an existing web page. Claude is dynamically generating the HTML Bootstrap 5 responsive interface while performing as a data analyst. Dynamic, non-deterministic UI is fascinating.
Hi Dan,
Really enjoy your content, thank you.
Two questions:
1. What are your thoughts about prompt expanders and the Anthropic prompt generator?
2. I've read some research papers that undermine the importance of the quality of the prompt, especially when it comes to models with reasoning capabilities.
Those shifted my perspective about the importance of prompt engineering.
I have this assumption that the models are going to be smart enough to work things out with a zero-shot prompt and many iterations of feedback.
But I'm curious to hear your thoughts.
Thanks again for contributing and sharing your knowledge
YW - thank you!
1. Prompts running on prompts (expansion & generation) is a big trend coming that few are using (metaprompting). We'll cover this soon.
2. Yes and no. Hard problems will always be hard no matter how smart the person or model. I have a whole list of problems that GPT-6, Claude 10, and o5 will 100% fail at. Why? Because at some point raw intelligence is not enough; you need large context, systems, mass collaboration, iterations, and wait for it... 'taste'.
@@indydevdan Thanks, appreciate it
Awesome work man, appreciate the content. If you get some time, give some insight into how to bring AI power to an existing codebase.
10:02 It would be interesting to run conclusive experiments on a) the effectiveness of structured prompts versus plain English for achieving the same goals, and b) the actual amount of time it takes the writer to write the structured prompt versus English, as well as the time it takes to make average edits.
I get that most compsci bros are natively thinking in structure/code anyway, so this could be easier for many people, but if a well-written English-language prompt can achieve the same results and the non-technical members of your team can read and edit the prompt just as well, then it's not straightforward that structured/XML prompts are the way to go. I never thought about writing prompts in XML, but I'm going to try it now. Thus far I've never felt like I haven't been able to get the result I wanted with regular English writing.
Practical and exceptionally useful. Thank you!
"Check out my prompt guide"
Anyone else: I roll my eyes
IndyDevDan: I tap the thumbnail so fast my thumb goes back in time.
💀🙏🚀
Hi Dan, great content, and the level 1-4 prompting is very informative. Have you found the level 3/4 prompting approach to reduce LLM hallucinations?
Very very interesting. Thank you!
Level 6 is likely to be non-deterministic because it has no template, just gets data fragments, and leans more into an operational role or area of effectiveness. It's used more to provide insights and less for summarization.
Great contribution 👍
Thank you for your videos! Amazing learning.
Can you please help me understand the hardware requirements for the LLMs you are running locally?
If I buy the currently available Mac mini M4 Pro, fully loaded to 64GB of memory with maxed CPU/GPU cores, is it good enough? Or do I need to wait for next year's M4 Ultra Mac Studio?
What setup are you running?
Can you do a video on how to run Llama with some models being local and some being run remotely on powerful GPUs?
I mostly process natural language, but I could use XML instead of plain text for prompts.
Would it make my logically structured prompts more effective? I can understand how using XML instead of plain text might make prompting for code more effective. Is that generally true? I think so, based on this video.
Thank you! All the best to you!
Great value 🎉 thank you 🙏
Oh man i love this kind of content keep it comin mate I'll watch it all.... 👍⭐⭐⭐⭐⭐(6/5)🌟
BTW, can you make a video about Windsurf too?
Great content, congratulations! Could you make a GPT to make the procedure even easier?
Know what would be interesting? Having an AI agent that uses a similar approach to the one coding assistants use, but for prompt building. Have it display the prompt, make edits inline like Cursor does, and then test-fire it in a mock environment. Then take the output and feed it back into the agent again for refinement... like automated prompt engineering.
DSPy
GREAT VIDEO!!! I'm going to share it with my team at the office. Question: I'm having issues with a system prompt. Do these techniques help with those as well? If so, will a tool_instructions section help with tool execution, or am I overthinking it?
Amazing content. Thank you a lot!
In the end, what I didn't get is how to actually run the level 4 prompt with the `llm` library. How do you pass the content variable to it? I get how to pass the prompt saved as a file, but how do you pass both the file with the prompt text and the content? Thx
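Not from the video, but one way I'd assume you could combine a saved prompt file with dynamic content before handing it to `llm`; the placeholder name and file paths are hypothetical:

```python
# Sketch: read the saved prompt, substitute the dynamic content into a
# placeholder, then pass the filled prompt to the `llm` CLI as one string.
import subprocess
from pathlib import Path

prompt = Path("level4_prompt.xml").read_text()   # hypothetical saved prompt file
content = Path("content.txt").read_text()        # hypothetical dynamic content

# Assumes the saved prompt contains a literal {{content}} placeholder.
filled = prompt.replace("{{content}}", content)

result = subprocess.run(["llm", filled], capture_output=True, text=True, check=True)
print(result.stdout)
```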
Excellent tutorial!!! I'm curious how to display the "token count" in the bottom-right panel of the Cursor app like yours, and what's the model name you were using at the beginning of this video?
@IndyDevDan
One more question: what tool are you using for the on-screen semi-transparent slides you present with, and what tool do you use to record the video with that square marker you move around the screen?
So good!
9:45 to 9:52 how did you do that?
Thoughts on new Claude vs. old Claude? Seems like new Claude is kinda dumb in some ways but seems better in other ways
Hey Dan, when you are running OpenAI, Sonnet, and Gemini models, are all those models paid, i.e. you have API keys stored away in an env file, right??
Awesome stuff as always. What kind of machine do you have?
How do you select multiple words and edit them into 'instructions' at once? I'd like to know all the shortcuts you use. 9:41
Ctrl + D
Fantastic work! Why don't you use voice? Takes too long?
Any good ideas for a free prompting library? Should be free, easy to search through and accessible from the web
Is it possible to give micro examples instead of an entire example output?
Like examples of certain words written in a particular way instead of the whole output written that way. Or should we just use instructions instead to clarify?
Your XML is not compliant; there is no root element. Would it help to add one, or would it make it worse?
What do you use to show the token size of the files?
What do you think of the LangGraph framework?
I'm a simple man, I see indydevdan I click like & subscribe
You are the Einstein of prompting.
We need this for Node.js
What extensions are you using for your text editor? I'm assuming VSCode?
Nice templates.
I just answered my own question. Cursor.
Also, why do you use Cursor and not Windsurf?
Brother did you see the new Model Context Protocol (MCP)?
I did - video in the queue - still working through value prop.
Level 5 unit testing?
Amazing content as always.
PS: you say Boolean wrong.
facts
"Pompting is super important"
5 minutes later: Procees to show haw llms dont follow instructions
Thanks for the video, bro. How can I learn with you?
Interesting
yo... what is that typing speed darn boi
🐐
Honestly... I don't get it. Summarise this, examples for that, prompt library... How do I use this to build better systems?
Can someone convince me why prompt engineering is engineering at all?
I wouldn't say prompt engineering isn't a real "skill" (ruclips.net/video/ujnLJru2LIs/видео.htmlsi=GrRivtBfuxgScSDk&t=173), but I would say prompt engineering isn't a skill that will be around for long... especially as LLMs become better and better, the need for very specific prompts changes, each model has been shown to perform better with different prompt patterns, and sometimes updating to a newer model can break your prompts... For example, in o1, trying to use Chain of Thought will actually return worse output, apparently because CoT is already baked into the system...
Have I been saying Boolean wrong my whole life...
No it's me.
@indydevdan haha ok good
Please, please, please, just fix your mic... Love your videos, but the thumping sound from your typing is just obnoxious, especially since it costs $20 to fix.
You need a better prompt for your scripts. The pacing of this video is off: in the beginning it feels too slow, and at level 4 it feels rushed. The arc of suspense is weak.