I can't imagine the complexity. I fit in my clothes, my clothes fit in my suitcase, so I fit in my suitcase.
That's simple: take into consideration the volume occupied by the clothes and the human versus the volume available in the suitcase. Archimedes solved this issue long ago...
@@TechnoMageCreator Yeah so just ignore the entire video, nice solve 👍
Makes sense
@@TechnoMageCreator then give us a clear rule for when you should use this principle. What items are foldable? Would the argument hold for a sheet of glass? The human might very well fit in a big suitcase but would need to be blended first. Maybe the data from "Will it blend?" will be very valuable? :)
You know science lol. That's why you got trolls in the comments bruh...@@TechnoMageCreator
I am a software engineer, and I find it pretty amusing how Matthew presents the task of ordinary logical inference and verification as a global breakthrough. 😀
Yeah, but the problem is the ambiguity of the world: not everything can be easily encoded into formal rules, and in many cases informal reasoning is needed. If formal reasoning were enough for AGI, we would have had it last century.
It took me a minute to realize this is what he was talking about.
🤣
Honestly it felt like an ad for Amazon the entire time... At least what I was able to stomach watching
The ad should be marked as sponsored because that's all it is @@guitarchitectural
ikr, here I was thinking they solved P vs NP or smth
This video basically says that the reasoning has to be 100% perfect like a computer program, then goes on to say that Amazon has solved this by getting it to be 99% accurate
getting it to be 99% accurate WITH human input. 👍
You only actually need to do better than humans. Humans don't do better than 99%, 99% of the time. 🤷♂️
I imagine the problem is more with "Did you get the rules right?"
Wherein, you have to test and give feedback until the rules themselves are actually perfect.
i think the point is that ideally it is 100% but current systems are much lower than that.
No, when something is 99% accurate in this sense, it is 100% accurate 99% of the time.
A video about Amazon's new AI, sponsored by Amazon..... my lord
Not even talking about how it works or anything.
I just watched a 13 min ad. Thanks....
literally what I was thinking! such a shame…
13 minutes of Amazon advertisement haha
I agree. I thought we would be talking about P vs. NP, the Traveling Salesman problem, or how to extract business logic out of a written policy. Not using a cloud service to write a chatbot for employees.
He just explained how it works.
Also, are you going to pay this man's bills, since you're complaining about ads?
@@j2csharp you can ask an llm to do that. Have you never used one? 😂
this actually has the potential to be a huge development as you say, possibly tackling hallucinations once and for all, and I haven't seen anything like it out of openAI yet. genuinely cool development and interesting that amazon was the first out with it, making me wonder if the other players are on it yet or not
Matthew, I used to really like your videos; they were informative and interesting. But in the last few weeks to months it feels like all your videos are just clickbait. They have titles that make it seem like there have been new and exciting breakthroughs, but then when you actually explain what is happening it's just speculation, hearsay, or rumors. You rarely test the new LLMs in the same video as you announce them, so you can farm a second video of you testing it. It's a real shame. Hopefully you will come back with some more impactful, less clickbaity, and factually strong videos. I really loved seeing your enthusiasm in your older videos when testing and reviewing these subjects.
Yeah, it really feels like he is shilling marketing for access. If AI has a slow news week, interview some interesting people instead or just explore a new AI service and cut an edit of that.
a channel is good until they reach 100k subs, then they start to get sponsors and move away from their original goals
How is this video speculation? This was a demo, sponsored sure, but still an actual demo.
@@digitalchild he did not say sponsored, he said AWS worked with him on this video.
@ He said partnered, which is pretty clear is a paid sponsorship. It’s well known that partnerships on youtube are paid.
The fact that they named the system “I-AM” is not lost on me, considering the deep spiritual and philosophical connotations tied to that phrase.
Think we need more than Amazon or a video sponsored by them to say this works. Good video however, very interesting topic.
This method should be applied to Laws, Contracts, Agreements, Patents, Scientific White Papers/Peer Review, and a whole host of other written documents claiming something is true or false based on the data and evidence.
I cant wait until all the liars are exposed by this
Meh, it's 100% useless for Unreal Engine right now, which is a fringe subject, the likes of which you want to hand total control over. Sheep shouldn't have rights or a voice on the internet.
This is going to make smart contracts indispensable.
Can't wait until they're all exposed.
How about applying this method to evaluate and critique the current economic system based on benchmarks of fairness, transparency, and justice?
This is very exciting and scary. I don't know what's with the people who complain. They clearly don't get it. This could solve law and replace judges, for example. People, stop complaining about your preconceived assumptions and get the big picture.
Yes
@@matthew_berman Wow! Seriously? This is what is wrong with the AI hype machine. While slightly better than having a 5 year old answer the questions you demonstrated, it is a hell of a long way away from the point where it should be trusted to make deterministic decisions that affect people's freedom, rights or property ownership.
What you said was akin to saying that since Windows Calculator can do basic math operations, we should be doing string theory with it in the next few weeks.
Completely oblivious to the nuances and intricacies of these complex systems.
Is this an ad for AWS?
Yes
@@juliusjames6229 That sucks. I think I'm not watching this guy any more.
@@juliusjames6229 The claim is that AWS is the only company with this technology. If some other company came up with it, presumably the video would have featured them instead.
YUP!
I'm struggling to see where the breakthrough is here. I feel like I've done the same using a validation LLM and structured output.
He's just an AI seller lol. His money is in more AI content; when it's gone, his channel is done)
@@ordinarygg stawp it
@@ordinarygg No, I don't think so. I'm not discounting that what may be novel to one person is standard to another. It's just that "cognitive architectures" has been a thing for a while, I thought.
@@AngeloXification yet it will burst at some point, if you know how they actually work. The next big thing will kill it in one day, and it will not require a GPU
The LLM makes code rules, and those code rules verify all future model outputs. Using 2 LLMs, as you suggest, is twice the hallucination, while this is not: the verifier is flat code, not a second hallucination-prone LLM.
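To make that split concrete, here is a minimal sketch (hypothetical rule set and function names; the one-time LLM extraction step is stubbed out):

```python
# Minimal sketch: rules are extracted from a policy document ONCE
# (by an LLM, stubbed out here); afterwards every model output is
# checked by plain deterministic code -- no second LLM in the loop.

def extract_rules_from_policy(policy_text: str) -> list:
    # Hypothetical one-time LLM step; the result is hard-coded here.
    # Each rule is (name, predicate over the claimed facts).
    return [
        ("tenure_ok", lambda f: f["years_employed"] >= 1),
        ("notice_ok", lambda f: f["notice_days"] >= 14),
    ]

def verify(claimed_facts: dict, rules: list) -> list:
    # Flat code: deterministic, repeatable, explainable.
    return [(name, pred(claimed_facts)) for name, pred in rules]

rules = extract_rules_from_policy("...leave of absence policy...")
print(verify({"years_employed": 3, "notice_days": 5}, rules))
# -> [('tenure_ok', True), ('notice_ok', False)]
```

The point is that the LLM runs once, at rule-extraction time; every subsequent check is plain deterministic code.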
Sceptical. AI generates the most obvious predicates well, but nuanced logic and corner cases are vital to automated reasoning.
Well, the AI that extracts the rules can still hallucinate...
But it is another step forward!
I agree
The IAM problem is very similar to the problem concerning the safety of nuclear-systems code. You need millions, plus top logicians and mathematicians, to check/create the code, generally using formal methods such as Z or VDM.
Godel's *Incompleteness Theorem* has something to say...
They are not even approaching that problem. They're just wrestling with the basic rules of inference logic, let alone self-referential Gödelian problems.
Gödel's incompleteness says there are things that are true about taking a leave of absence that can't be arrived at from what's in your company's leave-of-absence document. Like maybe putting 10 ducks in your backpack and biking around your boss's living room gives you an instant unpaid indefinite leave of absence.
100% perfect reasoning that isn't 100% accurate... ok AWS.. keep slaving.
Use a rule-based expert system such as Blaze Advisor to code the IAM system. The LLM parses the document into rules to be executed in the rules engine. That's all.
*You forgot to mention that this job is called rule analysis, and it has been around since the beginning of computing.*
*From the desired indications, a decision table is created, then a rule reduction is done. During this rule reduction, the repeated and contradictory rules are automatically eliminated, and finally a decision tree is generated (which is the same as a list of logical conditions; a toy sketch follows below).*
*There was no need to use an AI, nor was there a need to hire thousands of engineers. If you want to automate it, you just have to use a natural-language analyzer that converts each sentence into a logical assertion.*
*This is not discovering the future... it is just giving it a coat of paint.*
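For what it's worth, a toy sketch of the decision-table step described above (invented rules; real rule reduction is considerably more involved):

```python
# Toy decision table: (is_full_time, has_10_years) -> outcome
table = [
    ((True,  True),  "bonus"),
    ((True,  True),  "bonus"),     # duplicate: silently collapsed
    ((True,  False), "no bonus"),
    ((False, True),  "no bonus"),
    ((False, True),  "bonus"),     # contradicts the row above
]

seen, contradictions, reduced = {}, [], []
for condition, outcome in table:
    if condition not in seen:
        seen[condition] = outcome
        reduced.append((condition, outcome))
    elif seen[condition] != outcome:
        contradictions.append(condition)

print(reduced)          # 3 unique rules
print(contradictions)   # [(False, True)] -- needs human resolution
```

Duplicates collapse silently; contradictions are surfaced for a human to resolve.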
Tree-based logic can do something similar
What is "automated reasoning". Is it not just symbolic logic? Or maybe automated reasoning is the term used when symbolic logic is automated within computer programs?
That's really cool. o1 and Sonnet can do proofs for very simple things, but for more complex math they absolutely fail.
Sounds like non-technical, emotional people are having trouble with simple logic.
lol. i see what you did there.
This could be applied to government regulation as well as laws.
Yes … for example… Government authority is delegated by sovereign individuals. That which one sovereign individual can do to another is delegable, and if it cannot be delegated then it does not exist. Therefore, government cannot legitimately claim an authority which cannot be delegated.
All in all: just give the right system prompt and prompt structure. 🎉
I love that you did this AWS video, even if it was a sponsorship. Learning technologies that align with your channel's content and audience is the perfect overlap. Some of the coolest AI technologies I've worked with came about because someone paid for my time to learn and contextualize use cases. So I appreciate that you have done that work for us and shared a technique which can be used even outside of AWS. Not sure most folks realize how innovations in one ecosystem can be applied to others. For example, I'd love to see if these policy rules could be broken up and used with Grok to simulate the same results.
Technical AI usage requires testing and iterations just like creative use. Asking for proof and reasoning is a good idea.
Working with AWS really gives you a visceral understanding of how hard they worked on the IAM permissions.
This fails when terms in contracts are (unknowingly) ambiguous, or where a normal person would be expected to act even though it is not explicitly written as a rule. For example, a few years ago in my country it was interpreted that Uber drivers are effectively full-time employees, despite the mumbo-jumbo stating they weren't. Reminds me of the adage: "Rules are made for the guidance of the wise... and for the obedience of fools"
That is "intentional ambiguity". If you are trying to be transparent and accurate then this should not be a problem.
Is this a YouTube video or an infomercial for AWS?
Did I just watch a 13-minute AWS ad 😂?
Wow, this is a game-changer! Automated reasoning is a huge challenge, and if AWS has solved it, that's amazing!
I'm stoked about the potential of AI-powered automated reasoning! It's mind-blowing to think that complex logic can be distilled from natural language documents. Game-changer for coding and beyond!
It took me a long time to realise that you can always prove something false but never prove something true. It depends on assumptions and context.
You can in math; proof by negation is one way, but not the only one
1+1=2 seems easy enough to prove true
@@jacobdallas7396 From what I understand, it's true from our perspective, applying the laws of science from our universe, but as the perspective ("universe") changes, scientific laws related to math (can) change (i.e., quantum physics, perhaps)
If a woodworker started out with 10 fingers and via table saw accident cut 2 of them off, I'm pretty sure he would consider the statement "I no longer have 10 fingers." to be true.
@@themax2go eh I'm of the belief that simple mathematics apply to all dimensions/universes
0:55 Hallucination is, in my assumption, a side effect of guardrails. So to troubleshoot would be like skipping a step when diagnosing.
10:32 [[...]]
12:52 This is just the firm stance that we chose secure-by-default over secure-by-design for AI's rollout.
Matthew, your audio on this video sounds amazing!
The comments here seem to be missing the big deal. I think their focus is too much on the marketing announcement itself.
The big deal is this: Explainable AI, which the EU has mandated for AI systems.
The fancy trick is not that the AI is able to answer questions.
For the first time, an AI system can point to the exact rules and variables it used to make a determination.
That's explainable AI.
Fantastic video that makes you realize that hallucinations are controllable. The potential of these models is huge.
Amazon did a wonderful thing, in my opinion! First index what you have and what you want/need. Then implement it in a good way. Not just spaghetti code everything together.
This was AI in the eighties, which is why everyone got so disillusioned with AI until large neural nets became feasible. Joining the two was kind of inevitable. This would be super helpful for things like payroll accounting rules as well, which are horrendously complicated.
The HR department can write this in English and understand it, yet the best mathematicians and coders get stumped with complexity? Tell us another joke.
Just to be clear, it is absolutely obvious that a major change in business is coming. There is also going to be a change in the job market. Also, what is a specialist or professional in any field anymore? If I can do PhD-level reasoning with AWS artificial intelligence, then what is a qualification's relevance in determining eligibility for a job? Now only ability and aptitude matter. A meritorious society, whether you want it or not. Even Elon Musk said it is his personal preference, and it is also mine.
Matt, another idea which I don't know has been put to good use: an NN that's specifically trained on reasoning about several LLMs' responses to a question, where the reasoning NN comes up with a definitive answer based on a meta-analysis of the results, as well as itself being able to use RAG context.
I am in life sciences IT (GxP, validation, etc.). When will this be available on AWS as a production product? Can't wait: no hallucinations AND explainable... bring it on!!!
Matthew: "Airplane companies have a highly complex policies on when to give a refund. A vastly complicated rule set and a ton of variables..."
Airplane policies: "Don't give a refund"
Thanks for the introduction and the great explanation of this topic. I always find it very interesting to learn about new things that I use every day from a technical point of view, but that have completely passed me by! 👍😊
This is some refreshing and exciting news, I can't wait to see what the future will be. Having a second LLM to fact check the reasoning level of the first LLM using math is a good way to improve accuracy.
The relationship between friction in dry and wet conditions is not typically referred to as a **transitive property**. Instead, it is an example of a **proportional relationship** governed by the physics of friction. To clarify:
Transitive Property
The transitive property applies to relationships like equality or order. For example:
- If \( A = B \) and \( B = C \), then \( A = C \).
- If \( A > B \) and \( B > C \), then \( A > C \).
This logic doesn't directly apply here because we’re not dealing with a chain of relationships that transfer equality or inequality across terms.
Proportional Relationship
The equation:
\[
\frac{F_{f,\text{wet}}}{F_{f,\text{dry}}} = \frac{\mu_w}{\mu_d}
\]
shows a proportional relationship, where the frictional force (\( F_f \)) is proportional to the coefficient of friction (\( \mu \)) under the same normal force (\( N \)). The smaller \( \mu_w \), the smaller \( F_{f,\text{wet}} \), which directly demonstrates the reduction in grip in wet conditions.
Why Not Transitive?
The reasoning is **causal** and based on physical principles (the decrease in \( \mu \) due to water), not a logical transference of equality or inequality. Thus, the reduction in grip isn't an example of transitivity but rather an application of proportionality and physics.
Just putting this out there, but the transitive property isn't just applicable to equalities or inequalities; it applies to any relation R where R(A, B) and R(B, C) together imply R(A, C)
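A quick illustration of that definition (toy relations; on a finite relation you can even check transitivity exhaustively):

```python
from itertools import product

def is_transitive(pairs: set) -> bool:
    # R is transitive iff R(a, b) and R(b, c) always imply R(a, c).
    return all((a, d) in pairs
               for (a, b), (c, d) in product(pairs, repeat=2)
               if b == c)

# "divides" is transitive; "beats" in rock-paper-scissors is not.
divides = {(a, b) for a in range(1, 13) for b in range(1, 13) if b % a == 0}
beats = {("rock", "scissors"), ("scissors", "paper"), ("paper", "rock")}

print(is_transitive(divides))  # True
print(is_transitive(beats))    # False: rock beats scissors, scissors
                               # beats paper, but rock does not beat paper
```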
@@christopheriman4921 You may well say that, but it doesn't make it so for everything, particularly when you point to a specific example to which it doesn't apply. Math is math, and your example is in itself an example of why AI hallucinates at times: it either is or is not correct for the scenario presented; you are not asking for an opinion but for a factual outcome from the question asked.
I do appreciate where AI is going; the problem in my mind is having AI actually make correct decisions by making mistakes. How many rockets exploded before a successful launch? The mistakes made were what led to the correct construction and successful liftoff.
I do like listening to your views. Keep it going.
OK Matthew. Now we have a potential AI model to use to build truth databases (truthbases).
Use AI agents to scrape books, articles, and the web for published "knowledge" and research findings ("facts") in a given field. Now use a reasoning model to set preliminary truth weights on each "fact". Next use the reasoning model to link related "facts". Finally, use the weight(s) given to a research finding and the weights of any confirmatory/supportive/non-supportive/contradictory nearby data to establish a weighted truth for each "fact". Use agents to build matrices of "facts" and associated truths. The effect would be to condense large blocks of research into relatively compact collections of weighted "facts" ("truths"); a toy sketch of the weighting step follows below.
Use the truthbases to avoid repeating research that has already been validated and to guide future research. Use the truthbases and reasoning models to weed out non-factual data in our AI training sets based upon lack of supporting facts. Maintain a truthbase of frequently visited truths and trending truths to reduce validation and response generation times and to reduce outputted hallucinations. Use the truthbases on request to validate LLM responses.
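A minimal sketch of the weighting step described above (all numbers, names, and the update rule are invented for illustration):

```python
# Toy truthbase: each "fact" has a prior weight in [0, 1]; linked facts
# nudge it toward 1 (supported) or 0 (contradicted).
facts = {"A": 0.6, "B": 0.8, "C": 0.3}
links = [("B", "A", +1),   # B supports A
         ("C", "A", -1)]   # C contradicts A

weights = dict(facts)
for src, dst, sign in links:
    target = 1.0 if sign > 0 else 0.0
    # Move dst toward the target, in proportion to the source's weight.
    weights[dst] += 0.1 * facts[src] * (target - weights[dst])

print(round(weights["A"], 3))  # 0.613: up from B's support, down from C
```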
Great video! Thanks for the information, what a time to live in
Not amazing... they probably used spaCy and entity creation (NLP); this is not that advanced. I am a psychologist, and reasoning is not a mystery in psychology. After creating relationships with the entities, you need to match to a template. The logic is provided after the terms are embedded with BERT or similar; cosine-similarity functions match the terms; then use an indexer, with agents at every critical stage. I am working on unstructured data, which is a lot harder, because we have to assign a specific agent and capability to each "node", orchestrate, and pass on to the response team. The part that is impressive is speed... but that is not a problem for a large corporation.
This is the first time I've heard hallucinations described as an example of creativity. I honestly don't understand that. In any event, I remember a number of times where my coworkers or I have, as humans, gone "off the rails" trying to solve problems logically. For the examples that stick in my mind, the ultimate issue usually was that the problem we were working on had unknown variables that threw a monkey wrench into our attempts to solve it. Sometimes those unknowns would drive us nuts until we figured that out. I have to suspect that this issue can crop up in AI too.
I'm glad that you brought up potential contradictory specifications. Now what about under-specifications? For example maybe there is a rule for employees under 65, and a version of it for employees over 65, but what if an employee is 65?
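That is exactly the kind of gap a coverage check should catch. A toy sketch of detecting it (invented rules):

```python
# Two eligibility rules that look exhaustive but leave age 65 uncovered.
rules = {
    "standard_plan": lambda age: age < 65,
    "senior_plan":   lambda age: age > 65,
}

uncovered = [age for age in range(18, 100)
             if not any(applies(age) for applies in rules.values())]
print(uncovered)  # [65] -- the under-specified case
```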
You can build guardrails simply by validating outputs with another llm
You don't know if the other LLM is also hallucinating.
@@pleabargain the other LLM should just be checking the validity of the answer... Chances are pretty slim that it will hallucinate
Hey Matt your hair grew a little bit between the timestamps 7:40 -7:45 I can see there was a lot of research put into this I really appreciate these in all seriousness 🙂
Can it accurately extrapolate in the absence of specificity?
The ability to infer which variables are relevant and the relationship between them to do classification, as well as the ability to output the reasoning model is something decision trees like J48 can do as well. Text extraction and parsing is more of an LLM thing. Maybe they used both.
Nice video.
I'm confused: a 1,000-word limit for a policy…? What policy is limited to 1,500 tokens? Wouldn't grounded RAG be more effective?
As a platform engineer I'm more afraid of serverless SAAS products taking my job than AI.
If AI can have love, this is the beginning of everything that will happen well in the future. Excellent e.p!❤
That's great! This is a big step towards using AI for legislation. Imagine it being fed all the laws and checking the judgement of judges or in the future making judgements itself.
They will be able to make HR a lot more complex and easily fix complex issues. The contracts will get a lot better.
TBH I don't mind sponsored videos. You can still learn a lot from them, despite their obvious focus on one product.
I didn't understand the "assumption" example at 9:12. Did the AI make the wrong assumption and give the wrong answer, or did you give the wrong answer and it caught that? Who made the assumption?
Awesome video dude, fascinating! This will no doubt be a puzzle piece in creating a 100% trustable 'logical' chain of thinking, leading to new systems that require trustable reasoning and 0 hallucinations being built along these lines.
That is exactly the problem with LLMs: people thinking that LLMs can replace all expert systems in the hype of generative AI.
Essentially, AWS has invested in real logic, real processing, and dependable actual mathematical proofs, just like the expert systems we have today, applying it to their domain.
kudos and respect
This is actually brilliant 👏🏾 was wondering why so few other places are doing this
At 10:49 it's still getting the answer wrong. Maybe because the hierarchy of the logic is not correct.
the full video is an ad for a painfully unexciting enterprise AWS service LMAO
Amazon's servers have seen a 300% increase in hack attacks in the last 6 months. Hope AI can jump in and save the day.
It feels like we are moving further away from understanding what the code is doing and closer to "I don't care how it works, just as long as it works."
I mean it's been that way for a while, when having to use other people's libraries to get things done, but there has always been the feeling that at least somebody knows what those libraries are doing, so they can get patched and updated if something goes wrong.
Not magicians, logicians. 😂
😎🤖
I often feel like a recurring latest breakthrough is some form of asking the A.I. "but, like, are you sure bro?"
@@ytubeanon makes perfect sense when you consider AI originated from being force fed the entire corpus of human knowledge, then being asked "bro guess how A relates to B"
@@natef3141 well, one would think A.I. uses a certain method to get its answer, which shouldn't change when the same question is asked again (art aside) cuz it should use the same methodology. The speed, consistency and reliability are the intentional reasons to use A.I. in the first place
@@ytubeanon consistency =/= correctness. But you are right, reliability (the combination of those two things) is extremely important, which ultimately circles back to the methodology piece you just mentioned. My point is simply that the original methodologies, however consistent you want to call them, are inherently limited in correctness precisely because of the training methods that created them. These issues may not exist if the training sets and user applications were more limited depth or breadth, but the fact is that the pure combinatorial explosion of possible scenarios is mindbogglingly high, full of nuances, subtleties, and pitfalls that can't really be "trained out" by random chance.
Intuitively, it should make sense that thinking methodology will have the biggest yields. Exhaustively gaining knowledge and making connections is simply not possible, and it's not necessary--the smartest humans don't think this way. We can often be put into unfamiliar domains, have a specialist with the knowledge needed, and go from there.
A proof system like the SPARK extension for the Ada language could allow AI to do this.
@solifugus Look, don't post sensible comments. YouTube is not meant for that.
@@coldlyanalytical1351 lol
Exciting topic.
Basically, you have extracted everything you need to create a decision tree and then the software. So AI would no longer be necessary, and the answers would be better! AI could then be used to make changes and additions, etc.
Data extraction with AI is not difficult, would like to see what else is behind it 🤔
Sorry to rain on your parade, but this kind of thing was done decades ago. There was even a special programming language called Prolog (Programming in Logic). In summary (according to ChatGPT), Prolog's strength in AI lies in its ability to perform logical reasoning and solve problems based on knowledge defined as facts and rules. I wonder how well it would perform if it had the same amount of compute given to it as LLMs.
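For readers who never met Prolog: its facts-plus-rules style can be sketched in a few lines of Python (toy example using naive forward chaining; real Prolog answers queries by backward chaining with unification):

```python
# Prolog-flavored toy: facts plus one if-then rule, run to a fixed point.
facts = {("parent", "tom", "bob"), ("parent", "bob", "ann")}

def derive(fs):
    # grandparent(X, Z) :- parent(X, Y), parent(Y, Z).
    return {("grandparent", x, z)
            for (p, x, y) in fs if p == "parent"
            for (q, y2, z) in fs if q == "parent" and y2 == y}

while not derive(facts) <= facts:   # naive forward chaining
    facts |= derive(facts)

print(("grandparent", "tom", "ann") in facts)  # True
```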
Is this publicly available for us to try? It looks like you are in N. Virginia; I'm not seeing Adv Reasoning Preview under Safeguards in Bedrock in my AWS console.
Can we get a link in the description to the Amazon form you used to upload the doc? Or get us access to the preview?
The preview is available, talk to your AWS manager! Happy to talk to my contact if that doesn't work.
@@matthew_berman I am not seeing it either. That would be awesome if you could reach out to your contact for more details on signing up for access to this feature! 👍
I would love to see the department of governmental efficiency apply this software to the entire rubric of law in the United States and see how many logical fallacies are included!
Click bait based on vibes not education.
Wtf are you talking about lol?
Lol. Explain....
Not clickbait. I worked with the creators of this service at AWS on this video. Relax.
Even if it's clickbait, I love his videos... he knows how to tell the story of tech. I have learned a lot from his videos. He is the AI Oracle!! 😂
Don’t miss one… 😊
Lots of this feels like very old tech married to new LLMs to get natural language as the input. I wrote expert systems in the mid-1980s that implemented logical reasoning rules in LISP or Prolog. These reasoning rules feel like they are pretty much the same thing. So I think we're adding LLM code gen with a rule-based language as the target language. In the early 2000s my university had a research center working on software verification that used a theorem prover to analyze and verify code (which could be rules). So much of this sounds like it is based on early AI research. In 1986 I got to play with a TI Explorer, an early LISP-based AI processor that ran the KEE (Knowledge Engineering Environment) expert system.
These systems, more than anything else, need safety guardrails. Otherwise they would be abused by large companies to make intentionally overcomplicated mousetrap policies, which could get people robbed, indebted for life, or jailed.
Uummm, the comments here tell me this vid is not worth watching. What's happening, sir?
Jeez, that has the potential for revolutionising law.
What happens when somewhere the rules contradict each other? When it's complex, like government law, those things do happen.
Can this rule be identified as a classic paradox?
In a futuristic city, there is only one robot technician, a self-aware machine, who repairs all and only the robots that do not repair themselves. The question is, does the robot technician repair itself?
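Yes: structurally this is Russell's barber paradox with robots. Writing \( R(x, y) \) for "x repairs y" and \( t \) for the technician, the rule says
\[
\forall y \, \bigl( R(t, y) \leftrightarrow \lnot R(y, y) \bigr)
\]
and instantiating \( y := t \) gives \( R(t, t) \leftrightarrow \lnot R(t, t) \), a contradiction. No such technician can exist, so the "rule" is unsatisfiable, which is exactly the kind of thing a formal verifier should flag.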
Matthew became such a caricature of all this AI-news sensationalism that I would not be surprised if he ended up in a MeatCanyon video.
This sounds like using an LLM to translate a document into business rules and then have the LLM use those business rules to answer questions.
Amazon's product solved the hardest thing... Brought to you by.... AMAZON!!!! ?
Hey, I have one question. Can we achieve this using function calling? We could extract all the rules from a given document. After that, for each rule, we could create a function-calling schema. For instance, if the rule is that an employee gets a bonus with 10+ years of experience, then based on this rule we could create a schema using the LLM. This schema could contain one variable named experience. Now, we could extract the user's information from their question for the experience variable and verify through code whether it is greater than 10 or not (a sketch of this idea follows below).
This is just a simple idea, but we could make a complex system.
Not sure, I might be wrong. Open to discussion.
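That should work in principle. A minimal sketch of the idea (hypothetical schema and rule; the LLM steps for extracting the rule and the user's value are assumed):

```python
import json

# Hypothetical schema an LLM might emit for the rule
# "an employee gets a bonus with 10+ years of experience".
schema = {
    "name": "check_bonus_eligibility",
    "parameters": {"experience": {"type": "number", "minimum": 10}},
}

def verify(args: dict) -> bool:
    # Deterministic check against the schema -- no LLM involved here.
    spec = schema["parameters"]["experience"]
    return (isinstance(args.get("experience"), (int, float))
            and args["experience"] >= spec["minimum"])

# The LLM's only job is extracting the value from the user's question:
print(verify(json.loads('{"experience": 12}')))  # True
print(verify(json.loads('{"experience": 7}')))   # False
```

The LLM only extracts; the comparison itself is done by code, so that step cannot hallucinate.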
This is not how you ensure no hallucinations. Gödel would slam this.
So it's basically a language to proof translator. Pretty damn cool.
I’ll have to try it out! Would be nice if you could build rules directly from the knowledge base. Otherwise you have your documents being ingested from a source for RAG, then you need to do the same thing for the reasoning guardrail. People will want it automated, maybe lambda with boto3 if it’s supported.
10:56 shows that it still made assumptions. You shouldn't try to hide these things. It's fine that it's not perfect; what's not fine is you covering up for it.
This is amazing. This type of reasoning is OP at every level as far as utility is concerned.
It's embarrassing just how flawed this system is. FYI guys, there are people out here who actually know enough about this stuff to be able to call bullshit!
Hope OpenAI adds this to their 12 Days of Christmas
And here is a deeply related problem: niche data allocation. Example: let's say ... I provide data by using a platform at levels not found anywhere else. Companies benefit according to a "how didn't we think of that?" moment... they could publicly recognize the fact.
When I take magic mushrooms while coding and start hallucinating, I usually get quite creative transitive properties. Wonder why AI can't leverage that. Lol 😂
How do you not have 500k subscribers yet? You seriously grind so hard and your content is getting better, great stuff bro!