Thanks for having me on! this was my first public speaking thing in like.. .5 years
Our pleasure! You did excellent!
Really great talk. Credible on the code but effortlessly natural in delivery. Hope to hear more from you, do you have a blog or other outputs?
I'm having so much fun playing with some of your insights. I had already built some interesting tools when functions first came out and twitter was showing how pydantic was the key to making it really usable. But this is such a great progression. Thanks!
@philsheard832 yeah I have a personal blog and the instructor docs.
Really enjoyed this. Thanks Jason!
🎯 Key Takeaways for quick navigation:
00:01 🎵 *Introduction and Scope*
- Jason Liu introduces himself as a keynote speaker, providing an overview of his talk on type hints, Pydantic, and structured prompting.
- Discusses the challenges of using language models in production, particularly when outputting JSON or structured data.
01:19 🌐 *Introduction to Pydantic*
- Introduces Pydantic as a library for data model validation, emphasizing its similarity to data classes and its reliance on type hints.
- Highlights the benefits of using Pydantic, including better validation, cleaner code, and automatic generation of JSON schema.
04:21 🧩 *Introduction to Structured Prompting with Pydantic*
- Discusses the concept of structured prompting, where prompts are actual code represented by Pydantic objects.
- Demonstrates how Pydantic enables defining objects with nested references, methods, and cleaner code for language model prompts.
05:28 🔧 *Introduction to Instructor Library*
- Introduces the "instructor" library, designed to simplify the usage of Pydantic for prompting language models, especially for OpenAI function calls.
- Explains how "instructor" patches the completion API and facilitates type-safe and auto-complete features.
06:24 🔄 *Advantages of Structured Prompting*
- Explores the advantages of structured prompting, emphasizing the ability to define nested references, methods, and reusable components.
- Discusses how this approach leads to more modular, maintainable, and bug-free code.
07:47 🛡️ *Using Validators with Pydantic*
- Demonstrates the use of validators with Pydantic, showcasing the ability to add custom validation functions.
- Illustrates how language model validators can be integrated to catch and handle errors effectively.
08:57 🌐 *Structured Prompting for Knowledge Workflows*
- Explores how structured prompting can go beyond structured outputs, enabling the modeling of knowledge workflows and plans.
- Discusses the potential for representing knowledge graphs and leveraging language models for more productive development.
12:14 🔄 *Advanced Applications: Search Query Planning*
- Demonstrates advanced applications, such as search query planning using structured prompting.
- Shows how defining a data structure for search types and execution methods simplifies the process of querying multiple backends.
14:34 📊 *Advanced Applications: Knowledge Graph Extraction*
- Illustrates an advanced application focused on extracting knowledge graphs by closely modeling the data structure to the graph visualization API.
- Emphasizes the simplicity achieved in code with the structured prompting approach.
16:11 🔮 *Future Possibilities and Conclusion*
- Discusses the future possibilities of structured outputs, including multimodal applications and generative UI over images, audio, and more.
- Concludes with excitement about the evolving space of structured prompting and its potential in various domains.
Made with HARPA AI
I was at the conference in person -- this talk was a major highlight. Glad to see it again!
thanks for sharing!
Jason Lio is seriously next level, he brings so much in this video, can watch this 10 times.
hopefully watching it the 11th time you can spell his name properly
Wow, outstanding talk. I came in thinking a 15 minute talk on typing wouldn't be that interesting, but by the end I felt like this was the most well explained and immediately useful presentation on here.
Same here!
Thanks for the kind words would love any feedback on the docs or on the talk if you have any
@jxnlco awesome presentation. Where can I dive deeper into this for other models? I heavily use Pydantic today, but other models for work. You mention ask Marvin. What are the differences between that and Instructor?
- Consider using structured prompting for better LLM outputs (00:31)
- Ensure LLMs output JSON or structured data compatible with existing software (00:50)
- Utilize OpenAI function calls for improved JSON schema validation (2:49)
- Employ the Pydantic library for data model validation and to generate JSON schema (3:53)
- Implement the instructor library to simplify OpenAI function calling with Pydantic (5:16)
- Use docstrings in Pydantic models to improve prompt and data quality (6:47)
- Create validators in Pydantic models for data integrity and error handling (7:20)
- Use LLMs to output structured data for complex data processing (11:08)
- Explore advanced applications of structured prompts for knowledge extraction (12:07)
- Check out additional examples and documentation on structured prompting (16:29)
and now we have PydanticAI. Revisiting this talk after launch.
00:13 Pydantic is all you need: Jason Liu
02:08 Pydantic is a library for data model validation.
04:07 Pydantic is a trusted library for handling JSON schema and object definition in Python.
06:06 Pydantic allows for cleaner code and easier maintenance by defining nested references and object behavior.
07:54 Validation error handling in Instructor helps fix errors in language models.
09:45 Pydantic allows for structured prompting and object-oriented programming.
11:38 Language models can output data structures to traverse and process data more effectively.
13:27 Pydantic enables easy creation and visualization of graph structures.
15:27 The paraphrasing detection algorithms help identify quotes and provide more accurate answers.
17:20 Pydantic enables extraction of bounding boxes and structured outputs.
Crafted by Merlin AI.
this "all you need" trend in the AI community is out of hand lol
True haha @richcaputo2929
I used decorators over my calls to allow for feedback loops ! Pydantic is a must 😊
Thank you Jason. Phenomenal work and effort + you & your team.
Amazing talk.
Geez this is must have video to watch
liked so much
Awesome talk jasonliu... really learned something new today. Thank you for sharing this
This is something very different from all the other stuff out there!
Well this really helped me learn Pydantic🎉
Engineers used to be expensive because they produced potentially infinite automation.
Today, they're expensive because they consume potentially infinite automation.
engineers are expensive because not everyone wants to deal with things that abstract and complex. I mean, engineers are close to creating a machine that is smarter than them, which could make them obsolete, but they still did it. which other profession is so true to itself?
Thank you for your presentation
Incredible talk!
Really interesting! Thx!
This awesome! Thank you for sharing this!
Thank you
Great talk. Such a useful workflow, thanks!
This is soo good, would love to have a link for repo for the reference
Such a smooth talk!!
Respond only with a valid json and nothing else. MY AND MY FAMILY'S LIFE DEPEND UPON THIS!
Let us hope Roko's basilisk will look kindly upon us emotionally manipulating our poor LLMs
Thank you :)
amazing job!
Great talk and content!
Great talk!
TIL that you can copy text from youtube videos. very slick... and great talk Jason :)
Validation with citation is very interesting. I did this using normal prompt engineering and hoped I’d get good results every time 😂
Hi: thanks for the talk. Where can we find out more about instructor? (16:33) Oops, there it is one second later.
Makes me interested in using Rust for LLM interactions. Whether the LLM return is validated or not you’re forced to handle every case and thus will make more robust systems by design.
Nice one!
lol i JUST finished our pydantic based datamodels - it’s great, nice to see this, feels like `validation` lol
How come instructor requests don't need async await for calling the APIs?
cause it just uses `create` vs `acreate`
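A rough sketch of the sync vs. async distinction in the reply above. The function names mirror `create` and `acreate`, but the bodies here are stand-ins written for this example, not the real client: the point is only that a blocking call returns a value directly, while a coroutine has to be awaited inside an event loop.

```python
import asyncio


def create(prompt: str) -> str:
    """Blocking stand-in: returns only once the 'request' has finished."""
    return f"reply to {prompt!r}"


async def acreate(prompt: str) -> str:
    """Async stand-in: yields to the event loop while 'waiting' on I/O."""
    await asyncio.sleep(0)  # placeholder for real network I/O
    return f"reply to {prompt!r}"


async def fan_out() -> list:
    # With the async variant, several requests can be in flight at once.
    return await asyncio.gather(acreate("a"), acreate("b"))


sync_reply = create("a")                # plain call, no await needed
async_replies = asyncio.run(fan_out())  # coroutines need an event loop
```

So calling the sync method needs no `await`; the trade-off is that each call blocks until it returns.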
This has been growing on me.
this is so good it hurts my feelings a little
How is this different from langchain?
This just seems to be wrapping a prompt into a python class or function. How is this any more reliable than just straight prompts/json, since what goes to OpenAI's API is still a prompt? Not sure what I am missing here, but I don't see this being that much more useful.
70M downloads a month...how many developers exist in the world ?
I don't want to install the whole rust toolchain to install pydantic on riscv.
Who is the speaker, where is he on Twitter?
jxnlco !
Now this is good. This is going to be the way to actually integrate LLMs with apps more reliably, in ways that go beyond chatbots
thank you! let me know if you have any thoughts, available in the github issues etc or on twitter
John 3:16 For God so loved the world, that he gave his only begotten Son, that whosoever believeth in him should not perish, but have everlasting life.
Isaiah 53:6 All we like sheep have gone astray; we have turned every one to his own way; and the LORD hath laid on him the iniquity of us all.
Romans 4:5 But to him that worketh not, but believeth on him that justifieth the ungodly, his faith is counted for righteousness.
1 Corinthians 15:3 For I delivered unto you first of all that which I also received, how that Christ died for our sins according to the scriptures;
1 Corinthians 15:4 And that he was buried, and that he rose again the third day according to the scriptures:
Ephesians 1:12 That we should be to the praise of his glory, who first trusted in Christ.
Ephesians 1:13 In whom ye also trusted, after that ye heard the word of truth, the gospel of your salvation: in whom also after that ye believed, ye were sealed with that holy Spirit of promise,
Ephesians 1:14 Which is the earnest of our inheritance until the redemption of the purchased possession, unto the praise of his glory.
Ephesians 4:30 And grieve not the holy Spirit of God, whereby ye are sealed unto the day of redemption.
jason here
-- "AI will take software engineer job"
-- Software engineer: "Hold my beer."
I don't get it. What's new about this? Has been part of langchain for a while now.
well who do you think introduced this to langchain ;)
@jxnlco I see & apologize 😀 This has been fantastic. Loved the idea straightaway.
Are you on Twitter?
@alivecoding4995 @jxnlco
haha yeah i was the reviewer on the langchain function calling PRs when it came out. but we do validation differently.
I still don't understand how instructor works behind the scenes, for example if I want chapters from a youtube transcript how do I prompt it?
you might ask for a chapter with a title and body, and then ask for a list

from typing import List
from pydantic import BaseModel

class Chapter(BaseModel):
    timestamp: int
    title: str
    summary: str

class VideoChapters(BaseModel):
    chapters: List[Chapter]

# then pass response_model=VideoChapters on the completion call

something like this! i actually built youtubechapters dot app so there ya go
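A minimal sketch of what a response model like the one in this reply buys you once the model's text comes back, assuming Pydantic is installed. The `raw_reply` string is a made-up stand-in for an LLM response; no API call is made, and this is not a description of Instructor's internals.

```python
import json
from typing import List

from pydantic import BaseModel, ValidationError


class Chapter(BaseModel):
    timestamp: int  # assumed here to be seconds from the start of the video
    title: str
    summary: str


class VideoChapters(BaseModel):
    chapters: List[Chapter]


# Hypothetical model reply; in practice this would come from the LLM.
raw_reply = json.dumps({
    "chapters": [
        {"timestamp": 0, "title": "Intro", "summary": "Scope of the talk."},
        {"timestamp": 79, "title": "Pydantic", "summary": "Validation basics."},
    ]
})

parsed = VideoChapters(**json.loads(raw_reply))
titles = [chapter.title for chapter in parsed.chapters]

# A reply with a missing field is rejected instead of passing through silently.
try:
    VideoChapters(**{"chapters": [{"timestamp": 0, "title": "Intro"}]})
    rejected = False
except ValidationError:
    rejected = True
```

The prompt side is then mostly the class definitions themselves: field names, types, and docstrings are what get turned into the schema the model is asked to fill.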
so basically typescript....... ?
Models have behavior. Type hints do not.
BaseModel has runtime type information. TypeScript does not. You would need zod. Which lacks colocation of behavior.
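To make the "behavior plus runtime type information" point concrete, here is a small sketch assuming Pydantic is installed. The `Search` class and its `describe` method are hypothetical names invented for this example.

```python
from pydantic import BaseModel, ValidationError


class Search(BaseModel):  # hypothetical model, for illustration only
    query: str
    max_results: int = 5

    def describe(self) -> str:
        # Behavior lives on the same class that defines the data shape.
        return f"search {self.query!r} (top {self.max_results})"


# Types are checked (and here coerced) at runtime, not just by a type checker.
search = Search(query="pydantic", max_results="3")
description = search.describe()

try:
    Search(query="pydantic", max_results="many")
    bad_input_rejected = False
except ValidationError:
    bad_input_rejected = True
```

A bare type hint would let `"many"` through at runtime; the model rejects it when the object is constructed.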
This is great. I've been using openai to process survey data in begging mode "please respond with valid json"... try/except/loop
Anyone know:
1. If using pydantic, will we hit any issues with async calls?
2. Can we achieve structured, error-handled responses using langchain instead? Is that the way to go?
we'll run into issues using async validators, but async won't be an issue otherwise
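For the "begging mode" try/except/loop pattern described in this thread, a validated retry loop might look roughly like the sketch below, assuming Pydantic is installed. `SurveyAnswer`, `ask_with_retries`, and `fake_llm` are all hypothetical names for this example; the fake model stands in for a real API call and is not how Instructor or langchain implement retries.

```python
import json
from typing import Callable, List

from pydantic import BaseModel, ValidationError


class SurveyAnswer(BaseModel):  # hypothetical schema for one survey response
    respondent_id: int
    sentiment: str


def ask_with_retries(call_llm: Callable[[str], str], prompt: str,
                     max_retries: int = 3) -> SurveyAnswer:
    """Call the model, validate the reply, and feed errors back on failure."""
    last_error = None
    for _ in range(max_retries):
        reply = call_llm(prompt)
        try:
            return SurveyAnswer(**json.loads(reply))
        except (json.JSONDecodeError, ValidationError) as exc:
            last_error = exc
            # Append the error so the next attempt has a chance to correct it.
            prompt = f"{prompt}\nPrevious reply was invalid: {exc}"
    raise RuntimeError(f"no valid reply after {max_retries} attempts") from last_error


# Fake model used only for this demo: fails once, then returns valid JSON.
replies = iter(["not json", '{"respondent_id": 7, "sentiment": "positive"}'])
prompts_seen: List[str] = []


def fake_llm(prompt: str) -> str:
    prompts_seen.append(prompt)
    return next(replies)


answer = ask_with_retries(fake_llm, "Classify this survey response.")
```

The difference from a bare try/except loop is that the validation error itself goes back into the next prompt, so the retry is informed rather than blind.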
And people say rustc is too complicated.
I wish that was more like a tutorial. It is really hard to follow.
I'll be making one coming out in a month or so!
if you want to integrate with legacy software, named pipes are your friends. thank me later
Or…. you could just use a statically typed language to begin with….
an llm is not statically typed.
@jxnlco what I meant is have the LLM generate code in a typed language, and write the code that interprets the results in a typed language.
If I want so much of gibberish boilerplate with types i will use Java.
Whoever decided to make a fresh version of pydantic without backwards compatibility (breaking fastapi and everything else), you wasted so many human-hours, and made people remember what are virtual environments. 😡
great talk, but please stop using pydantic!
what do you suggest?
@jxnlco haskell \s
To be honest, pydantic is so far ahead of other solutions like dataclasses and attrs that I have trouble arguing against them. But they come with their own typesystem and the authors have been known for causing drama in the python community.
@jxnlco attrs
Oh, this is on Python. Sorry, not interested.
🐍
LMAO, I made something better than this tbh! I don't even want to go on to showcase it to you all, but it's way better than this. dude, I never thought this was a problem to all, lolz. well, I'll be releasing it too then ig. thanks for making this video!
What's it called and where can I try it?
@xlagunaa okay, so basically i made it a private repo, well this is the idea ig i didn't explain it well back then lol, well, this is how it works, we get a json response, and i convert the json response into a class like object so now instead of accessing data like: response["user"]["name"], you can do it like response.user.name and this just makes ig the typing easier and the code more readable, i mean that's my preference tbh, but then it didn't really work that well, if u liked the concept then ig i'd give u the link to it.
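The dict-to-attribute idea described in this reply can be sketched with the standard library alone. This is one possible way to do it, not the commenter's private implementation, and the payload is made up.

```python
import json
from types import SimpleNamespace

raw = '{"user": {"name": "Ada", "tags": ["admin", "dev"]}}'  # made-up payload

# Plain parsing gives nested dicts: access by key.
as_dict = json.loads(raw)
name_by_key = as_dict["user"]["name"]

# object_hook is called for every decoded JSON object, so nested objects
# become namespaces too and can be reached with dot access.
as_obj = json.loads(raw, object_hook=lambda d: SimpleNamespace(**d))
name_by_attr = as_obj.user.name
```

Note that this only changes the access syntax. Unlike a Pydantic model, nothing here checks that `user` or `name` exists or has the expected type, and editors get no type information for autocomplete.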
i hate everything about what llms are doing to us, and the fact that a bunch of hangers-on blockchain type bros are just making as much bullshit as possible on top of them