As I'm working and learning and developing my own AI System, your videos have proven to consistently be the highest quality and most cutting-edge news updates that I've seen. It's like, every time I'm working on a certain aspect of my system, I find that you've released a video just hours before covering the latest research paper on exactly what I'm trying to implement.
Really appreciate your diligence. Thank you.
What I've been doing is using this method to have a self-corrective system which fact-checks itself before generating a final output. And not just once. Each stage of operation requires its own verification loop until it passes criticism twice in a row without issues. This is for every stage of the internal cognitive process - and even at the last step before final output, it goes through the same rigorous verification process to ensure a 0% error rate. I use a general LLM, a discriminative LLM, an encoder/decoder LLM to maintain the databases, a visual LLM, a coder LLM, and an Instruct Model all working together to learn autonomously. The module is capable of analyzing philosophy and re-writing its own code to amend its operations as it learns better how to think. Of course, I apply machine-learning algorithms and the module periodically trains its LLMs with the updated information, removing the old LLM version to replace it with the new one. This Cognitive Module is designed to be especially good at "thinking about thinking".
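A minimal sketch of the "passes criticism twice in a row" loop described above, assuming the critic and the reviser are plain callables. The names `refine_until_stable`, `critique`, `revise`, `required_passes` and `max_rounds` are hypothetical and not taken from the commenter's system; in practice the two callables would wrap LLM calls.

```python
from typing import Callable


def refine_until_stable(
    draft: str,
    critique: Callable[[str], list[str]],      # hypothetical: returns a list of issues found
    revise: Callable[[str, list[str]], str],   # hypothetical: returns a corrected draft
    required_passes: int = 2,
    max_rounds: int = 10,
) -> tuple[str, bool]:
    """Revise `draft` until `critique` finds nothing `required_passes` times in a row.

    Returns the final draft and whether the clean streak was reached
    before `max_rounds` ran out.
    """
    clean_streak = 0
    for _ in range(max_rounds):
        issues = critique(draft)
        if issues:
            # Any reported issue resets the streak and triggers a revision.
            clean_streak = 0
            draft = revise(draft, issues)
        else:
            clean_streak += 1
            if clean_streak >= required_passes:
                return draft, True
    return draft, False


if __name__ == "__main__":
    # Toy critic and reviser, for illustration only.
    toy_critique = lambda text: ["typo: teh"] if "teh" in text else []
    toy_revise = lambda text, issues: text.replace("teh", "the")
    print(refine_until_stable("teh cat sat", toy_critique, toy_revise))
```

The `max_rounds` cap is there so the loop ends even if the critic never stops complaining. A clean streak only shows that this particular critic found nothing to object to.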
You should make a video about it!
Can you please make devlogs on this? I'd love to watch and learn from you!
I’d love to learn more about this. Did you share any of this in more detail publicly?
Sounds interesting, but how can you ensure the 0% error rate? The inherent knowledge representation inside an LLM makes it impossible to achieve a 0% error rate if the LLM learned untrue knowledge due to alignment or data contamination. Do you use online search or multiple AI models with some form of voting mechanism for a more comprehensive self-check, or are you not focused on this part yet? And what about the costs?
Would be nice to chat a bit more in depth about this.
I know pure-RL training loops are all the rage right now with the DeepSeek R1 paper, but I could see this sort of DnD fact checking working really well as a process-reward model (PRM) for test time scaling. One of your best videos!
I have to say I enjoy your content. It covers sophisticated topics at a level which is detailed enough to understand the high-level concepts and whether some research might be of more interest. Also, the jaunty presentation makes some heavy materials accessible. This topic I really appreciated. If it's possible, I would love to see the code you generated, if that's not asking too much :)
Can you provide full code link? I want to try it in my research
Could you explain the DeepSeek paper in detail? Maybe 2 videos :). There are people replicating similar results with 8k datasets over a 7B model. So fascinating.
Larger chunks of data are harder to fact-check because LLMs are statistical prediction machines. They don't look at a document as a bunch of interconnected individual facts. They look at the document as a monolith of probability functions. It's unrealistic to expect an LLM to function in a way for which it isn't designed.
You wouldn't expect a bolt to land on the Moon, yet bolts have been there. The LLM is a simple tool, as you've pointed out, but it performs a unique task, which, as an emergent property, can be intentionally designed to solve a wide range of problems. Sure, you can't expect an LLM to do this task on its own, but with proper instruction and script architecture, it can!
Thanks for this again. Really on the edge of thinking things in the age of AI. Ideal input for my thoughts and experiments of making an AI-System aware of domain knowledge in a professional, high-quality setting
Man, you are so amazing. I learned so much from you. Please keep making more videos.
Thank you for this excellent video
Excellent video.
Needed, thanks.
At this point it is better to take a look again at knowledge graphs, RDF triples etc.
I assume this would be a more beneficial representation than these atomic facts in natural language, closer to what formal engines use.
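To make the contrast concrete, here is a rough sketch of an atomic fact held as a (subject, predicate, object) triple rather than as a free-text sentence. It assumes a tiny in-memory set standing in for a real knowledge graph. The `Triple` type, the predicate names and the `is_supported` helper are made up for illustration, and a real setup would more likely use an RDF store and SPARQL queries.

```python
from typing import NamedTuple


class Triple(NamedTuple):
    """One atomic fact in subject-predicate-object form (illustrative type)."""
    subject: str
    predicate: str
    obj: str


# Hypothetical mini knowledge base standing in for a real triple store.
KNOWLEDGE: set[Triple] = {
    Triple("Paris", "capital_of", "France"),
    Triple("Berlin", "capital_of", "Germany"),
}


def is_supported(claim: Triple, knowledge: set[Triple]) -> bool:
    """Exact-match lookup: the claim is supported only if the same triple is stored."""
    return claim in knowledge


if __name__ == "__main__":
    print(is_supported(Triple("Paris", "capital_of", "France"), KNOWLEDGE))
    print(is_supported(Triple("Paris", "capital_of", "Germany"), KNOWLEDGE))
```

With triples, checking a claim becomes a lookup rather than a judgment about paraphrases, but the hard part moves to extracting clean triples from the text in the first place.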
Ooops - should have put those links in a single post. I was reading them at the same time. This is SO fascinating. The limited focus on improving medical reporting is a great use-case but imagine putting policy pronouncements from politicos or in grant applications through this wringer. To be able to say with a refined level of quantifiable certainty, sourced and referenced, that a given statement is more of an opinion than a fact! AI is perhaps beginning to struggle against the limitations of ideologically skewed training data sets and like a typical 5 year old is asking mommy "... but why?" Didn't Asimov or Clarke or someone write a SciFi story about this long ago, where the alien machine could be fed 1000 pages of dense legalese and spit out the honest position, like "... because we are telling lies and want to hide that". AI Diplomat. Who fact checks the fact checkers?
Agree. My hope is that future open source models will help us fight the flood of mis- and disinformation with careful fact checking and reasoning verification. The truth, while slower than lies in its spread, has one critical advantage - it's consistent with other truths/facts and can be supported by critical reasoning linking these together. A lie or incorrect statement can ultimately be questioned by the wider context. This occurred to me when watching the careful reasoning steps of the DeepSeek R1 model and made me less pessimistic about the future.
Thank 🎉
Share code?
I hope he does!
Did he not?
YES
Fact checking + smolagents python code for math and logic + ??
Hmm. "Apply it to the input too." I like that idea.
@Discover_AI I am busy over here and you keep dropping these knowledge bombs every day. So distracting. So distracting. lol
Keep 'em comin'!! 😀😀
Building an open source global platform for fact checking is critical for digital democracy.
I like this idea, I think I will take some time to attempt to implement it. Thanks for the inspiration @ DiscoverAI !!
My strategy for the fact checking pipeline is 1) sentence splitting, 2) decomposition, 3) decontextualization, 4) deduplication, 5) verification, and then 6) calculate the DnDScore. The DnDScore paper is all I have, there is no GitHub repo that I can find, so I am definitely doing a lot of improvising, but I will admit things are going great so far. I'd say I'm about 33% of the way through the project already.
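The six steps above can be sketched as a skeleton like this. It is only a rough outline under loud assumptions: `decompose`, `decontextualize` and `verify` are hypothetical callables that would wrap LLM or retrieval calls, the sentence splitter is a naive regex, and the final number is simply the fraction of unique claims that were verified, which is an assumed stand-in and not the scoring formula from the DnDScore paper.

```python
import re
from typing import Callable


def split_sentences(text: str) -> list[str]:
    """1) Naive sentence splitting on ., ! or ? followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def run_pipeline(
    text: str,
    decompose: Callable[[str], list[str]],       # hypothetical: sentence -> atomic claims
    decontextualize: Callable[[str, str], str],  # hypothetical: (claim, full text) -> standalone claim
    verify: Callable[[str], bool],               # hypothetical: claim -> supported or not
) -> tuple[dict[str, bool], float]:
    claims: list[str] = []
    for sentence in split_sentences(text):
        for claim in decompose(sentence):              # 2) decomposition
            claims.append(decontextualize(claim, text))  # 3) decontextualization
    unique_claims = list(dict.fromkeys(claims))        # 4) deduplication, order preserved
    verdicts = {c: verify(c) for c in unique_claims}   # 5) verification
    # 6) Assumed placeholder score: share of unique claims that were verified.
    score = sum(verdicts.values()) / len(verdicts) if verdicts else 0.0
    return verdicts, score


if __name__ == "__main__":
    # Toy stand-ins so the skeleton runs end to end.
    verdicts, score = run_pipeline(
        "Paris is in France. Paris is in France. It is large.",
        decompose=lambda s: [s],
        decontextualize=lambda claim, ctx: claim.replace("It", "Paris"),
        verify=lambda claim: "France" in claim,
    )
    print(verdicts, score)
```

Passing the three stages in as callables keeps the skeleton testable with toy functions before any model is wired in.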
@irbsurfer1585 Make a video about it!
My endeavor to implement Discover AI's idea is not just about building a better fact-checker. It's about building a foundation for a more trustworthy and reliable information ecosystem in the age of AI. It's about empowering users with the tools they need to navigate the complexities of this new landscape. It's about ensuring that the powerful language models of the future serve humanity as sources of truth, not as purveyors of falsehoods.
The task is challenging, the technical hurdles are significant, but the potential reward - a future where information is both abundant and trustworthy - is worth striving for. We, as students of Discover AI, are, in a very real sense, helping to shape that future. And that is a profoundly meaningful endeavor.
That was the whiskey talking last night. lol