FAR․AI
  • 77 videos
  • 65,183 views

Videos

Stephen Casper - Generalized Adversarial Training and Testing
212 views • 1 month ago
Stephen Casper from MIT presenting 'Generalized Adversarial Training and Testing' on July 21, 2024 at the Vienna Alignment Workshop. Key Highlights: - Addressing latent harmful capabilities in AI models - Applying safety engineering principles to AI - Ensuring AI robustness and safety through generalized adversarial training - Mitigating risks with model manipulation and latent space attacks Th...
Neel Nanda - Mechanistic Interpretability: A Whirlwind Tour
2.4K views • 1 month ago
Neel Nanda from DeepMind presenting 'Mechanistic Interpretability: A Whirlwind Tour' on July 21, 2024 at the Vienna Alignment Workshop. Key Highlights: - Grasping AI cognition for alignment - Reverse engineering neural networks - Safeguarding against deceptive AI systems The Alignment Workshop is a series of events convening top ML researchers from industry and academia, along with experts in t...
Nicholas Carlini - Some Lessons from Adversarial Machine Learning
3.2K views • 1 month ago
Nicholas Carlini from Google DeepMind on 'Some Lessons from Adversarial Machine Learning' at the Vienna Alignment Workshop. Key Highlights: - Challenges in developing robust defenses - Importance of learning from adversarial ML research - Need for clear problem definitions and effective evaluations The Alignment Workshop is a series of events convening top ML researchers from industry and acade...
Vincent Conitzer - Game Theory and Social Choice for Cooperative AI
85 views • 1 month ago
Vincent Conitzer from Carnegie Mellon University presenting 'Game Theory and Social Choice for Cooperative AI' on July 21, 2024 at the Vienna Alignment Workshop. Key Highlights: - Structuring AI-human interactions to prevent failures - Risks of algorithmic interactions in multi-agent systems - Using social choice theory to aggregate human feedback - Importance of interdisciplinary collaboration...
Mary Phuong - Dangerous Capability Evals: Basis for Frontier Safety
125 views • 1 month ago
Mary Phuong from DeepMind presenting 'Dangerous Capability Evals: Basis for Frontier Safety' on July 21, 2024 at the Vienna Alignment Workshop. Key Highlights: - Steepening trends in evolving dangerous AI capabilities - Need for tougher evaluations - Defining clear thresholds for action The Alignment Workshop is a series of events convening top ML researchers from industry and academia, along w...
Zhaowei Zhang - Research Proposal: The 3-Layer Paradigm for Implementing Sociotechnical AI Alignment
61 views • 1 month ago
Zhaowei Zhang from Peking University presenting 'Research Proposal: The Three-Layer Paradigm for Implementing Sociotechnical AI Alignment' on July 21, 2024 at the Vienna Alignment Workshop. Key Highlights: - Three-layer paradigm for AI alignment - Importance of real-time control in AI systems - Aligning AI with stakeholder goals and values - Integrating macroscopic regulation for broader oversi...
Vienna Alignment Workshop Panel Discussion - Current Issues in AI Safety
260 views • 1 month ago
Victoria Krakovna, David Krueger, Gillian Hadfield and Robert Trager discuss current issues in AI Safety in a panel discussion on July 21, 2024 at the Vienna Alignment Workshop. The Alignment Workshop is a series of events convening top ML researchers from industry and academia, along with experts in the government and nonprofit sectors, to discuss and debate topics related to AI alignment. The...
Aditya Gopalan - Towards Reliable Alignment: Uncertainty-Aware RLHF
95 views • 1 month ago
Aditya Gopalan from Indian Institute of Science presenting 'Towards Reliable Alignment: Uncertainty-Aware RLHF' on July 21, 2024 at the Vienna Alignment Workshop. Key Highlights: - Addressing reward model uncertainties - Findings of inconsistencies in independently trained reward models - Proposal to quantify and manage uncertainty for reliable RLHF The Alignment Workshop is a series of events ...
Vikrant Varma - Challenges With Unsupervised LLM Knowledge Discovery
144 views • 1 month ago
Vikrant Varma from DeepMind presenting 'Challenges With Unsupervised LLM Knowledge Discovery' on July 21, 2024 at the Vienna Alignment Workshop. Key Highlights: - Overcoming challenges in unsupervised knowledge discovery - Addressing the risks of extracting distracting features over truth - Developing advanced methods beyond consistency checks - Identifying truth through structural differences ...
Jan Leike - Supervising AI on hard tasks
651 views • 1 month ago
Jan Leike from Anthropic presenting 'Supervising AI on hard tasks' on July 21, 2024 at the Vienna Alignment Workshop. Key Highlights: - Overcoming challenges in supervising AI on difficult tasks - Implementing scalable oversight - Eliciting AI capabilities with precision - Applying tampering and adversarial evaluations The Alignment Workshop is a series of events convening top ML researchers fr...
Oliver Klingefjord - What are Human Values, and How Do We Align AI to Them?
74 views • 1 month ago
Oliver Klingefjord presenting 'What are Human Values, and How Do We Align AI to Them?' on July 21, 2024 at the Vienna Alignment Workshop. The Alignment Workshop is a series of events convening top ML researchers from industry and academia, along with experts in the government and nonprofit sectors, to discuss and debate topics related to AI alignment. The goal is to enable researchers and polic...
Sophie Bridgers - Scalable Oversight: A Rater Assist Approach
49 views • 1 month ago
Sophie Bridgers - Scalable Oversight: A Rater Assist Approach
Alex Tamkin - Measuring and Improving Human Agency in a World of AI Agents
58 views • 1 month ago
Alex Tamkin - Measuring and Improving Human Agency in a World of AI Agents
David Bau - Resilience and Interpretability
323 views • 1 month ago
David Bau - Resilience and Interpretability
Stuart Russell - AI: What If We Succeed?
1.3K views • 1 month ago
Stuart Russell - AI: What If We Succeed?
Dmitrii Krasheninnikov - Stress-testing Capability Elicitation with Password-locked Models
72 views • 1 month ago
Dmitrii Krasheninnikov from University of Cambridge presenting 'Stress-testing Capability Elicitation with Password-locked Models' on July 21, 2024 at the Vienna Alignment Workshop. Key Highlights: - Fine-tuning's role in unlocking AI capabilities - Testing capabilities with password-locked models - Leveraging high-quality demonstrations for better AI training The Alignment Workshop is a series...
Ben Bucknall - Open Problems in Technical AI Governance
57 views • 1 month ago
Ben Bucknall from Centre for the Governance of AI presenting 'Open Problems in Technical AI Governance' on July 21, 2024 at the Vienna Alignment Workshop. Key Highlights: - Merging technical governance with socio-technical strategies - Categorizing challenges into targets and capacities - Leveraging technical tools to support AI governance - Outlining a roadmap for future AI governance research...
Helen Toner - Governance for Advanced General-purpose AI: Status Check, Hurdles, & Next Steps
191 views • 1 month ago
Helen Toner from Center for Security and Emerging Technology presenting 'Governance for Advanced General-purpose AI: Status Check, Hurdles, & Next Steps' on July 21, 2024 at the Vienna Alignment Workshop. Key Highlights: - Building clear concepts for AI governance - Supporting decisions with robust evidence - Achieving expert consensus The Alignment Workshop is a series of events convening top ...
FAR Seminar: Evan Hubinger - Deceptive Instrumental Alignment
183 views • 1 month ago
FAR Seminar: Evan Hubinger - Deceptive Instrumental Alignment
FAR Seminar: Peter Barnett - Verification & Confidence Building for International Coordination
54 views • 2 months ago
FAR Seminar: Peter Barnett - Verification & Confidence Building for International Coordination
FAR Seminar: Alex Pan - Modeling and Mitigating Near-term Deployment Risks from LLMs
92 views • 2 months ago
FAR Seminar: Alex Pan - Modeling and Mitigating Near-term Deployment Risks from LLMs
FAR Seminar: Paul Riechers & Adam Shai - Simplex
204 views • 3 months ago
FAR Seminar: Paul Riechers & Adam Shai - Simplex
FAR Seminar: Tom Tseng - Defending Against Adversarial Attacks in Go
83 views • 3 months ago
FAR Seminar: Tom Tseng - Defending Against Adversarial Attacks in Go
FAR Research: Adversarial Policies Beat Superhuman Go AIs
82 views • 3 months ago
FAR Research: Adversarial Policies Beat Superhuman Go AIs
FAR Research: Can Go AIs be adversarially robust?
172 views • 3 months ago
FAR Research: Can Go AIs be adversarially robust?
FAR Seminar: Shaowei Lin - Formal AI-Assisted Code Specification and Synthesis
107 views • 4 months ago
FAR Seminar: Shaowei Lin - Formal AI-Assisted Code Specification and Synthesis
FAR Seminar: Kris Brown - Category Theory
293 views • 5 months ago
FAR Seminar: Kris Brown - Category Theory
FAR Seminar: Yoshua Bengio - How Could We Design Aligned & Provably Safe AI?
1.9K views • 5 months ago
FAR Seminar: Yoshua Bengio - How Could We Design Aligned & Provably Safe AI?
FAR Seminar: Anna Leshinskaya - AI Architecture for Moral Decision-Making
325 views • 6 months ago
FAR Seminar: Anna Leshinskaya - AI Architecture for Moral Decision-Making

Comments

  • @juliesteele5021
    @juliesteele5021 24 days ago

    Nice talk! I disagree that adversarial robustness has only one attack and differs from other computer security in that way. Once the simple PGD attack is solved in a tight epsilon ball, you still can’t say there is no adversarial image that breaks the model. Enumerating all possible attacks is still very difficult/ impossible for now.
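
    For readers unfamiliar with the attack mentioned above, here is a minimal sketch of an L-infinity projected gradient descent (PGD) attack in PyTorch; the names model, eps, alpha, and steps are illustrative placeholders rather than anything from the talk or the comment.

    import torch

    def pgd_attack(model, x, y, eps=8/255, alpha=2/255, steps=10):
        # L-infinity PGD: repeatedly step along the sign of the loss gradient,
        # then project the perturbed input back into the eps-ball around x.
        loss_fn = torch.nn.CrossEntropyLoss()
        x_adv = (x.detach() + torch.empty_like(x).uniform_(-eps, eps)).clamp(0, 1)
        for _ in range(steps):
            x_adv.requires_grad_(True)
            loss = loss_fn(model(x_adv), y)
            grad, = torch.autograd.grad(loss, x_adv)
            with torch.no_grad():
                x_adv = x_adv + alpha * grad.sign()                    # ascent step
                x_adv = torch.min(torch.max(x_adv, x - eps), x + eps)  # project to the eps-ball
                x_adv = x_adv.clamp(0, 1)                              # keep a valid image
        return x_adv.detach()

    Even if every such run inside the eps-ball fails, that only bounds this particular attack; it does not certify that no adversarial example exists, which is the point the comment makes.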

  • @nikre
    @nikre 1 month ago

    Thanks for this interesting and honest talk. Robustness is not as profitable as growing a forest of cherries to pick.

  • @optimaiz
    @optimaiz 1 month ago

    my fav boy 😅

  • @theohlong307
    @theohlong307 1 month ago

    Explains the notion cleanly, thanks for helping me understand!

  • @Pingu_astrocat21
    @Pingu_astrocat21 1 month ago

    thanks for uploading!

  • @IOSALive
    @IOSALive 1 month ago

    FAR•AI, great video, it was really entertaining

  • @usteiner9
    @usteiner9 1 month ago

    Was there no Spieltheorie (game theory) in these talks?

  • @miraculixxs
    @miraculixxs 2 months ago

    The kind of safety he advocates is impossible to achieve. Imagine a lever that can be put in either of two positions, i.e. we have 2 possible actions. It is impossible to say whether either of these actions is safe or unsafe without knowing all the secondary effects. That is, you need to look at the system the lever is connected to in order to determine whether either of its actions is safe or unsafe. Same with models: unless you know the full system, it is not possible to know whether a model's response is safe or not. This paper is a load of hogwash.

  • @miraculixxs
    @miraculixxs 2 months ago

    Why are doomers so afraid of intelligence? That's quite telling

  • @adamgibbons4262
    @adamgibbons4262 2 months ago

    If it’s superhuman and robust would that make it omnipotent? 😅

  • @palimondo
    @palimondo 2 months ago

    17:00-17:08 🙃

  • @sevdeawesome7861
    @sevdeawesome7861 5 months ago

    The blue triangles appear to be white

  • @primersegundo3788
    @primersegundo3788 6 months ago

    Ilya please stop working for a second and do another talk like this one

  • @sdmarlow3926
    @sdmarlow3926 8 months ago

    Mainstream? There are very few actually training models (first token movers?), with a large open-source pool around them (each wanting to have their own flappy bird moment). Perhaps he is hinting that "guardrails" has been a downstream process up until this point? *Also an ontological failing to suggest ML alignment has anything to do with AGI alignment, but whatever.

    • @sdmarlow3926
      @sdmarlow3926 8 months ago

      "AI" started as an attempt to turn the robots of sci-fi into an actual science, and none of the different fields of the time really wanted to work under a new paradigm. There ended up being a split, a sort of soft science vs hard science, where psychology of the mind mattered to one group, while neural connections mattered more to the other. Machine learning has become a derivative of the latter; a kind of applied deep learning.

    • @sdmarlow3926
      @sdmarlow3926 8 months ago

      Ilya is like the devil on Hinton's shoulder. Backprop was a fad, but it was Ilya's use of GPUs that killed "the old field" of AI in 2012. It's been ML since then, first wearing AI as a skin, now taking AGI as its own (despite AGI being a formal attempt to get back to non-DL research).

    • @sdmarlow3926
      @sdmarlow3926 8 months ago

      There is a difference between saying AI (sci-fi level abilities) will never happen and those (rightly) saying that xNN's are not on the correct path to said abilities.

  • @sdmarlow3926
    @sdmarlow3926 8 months ago

    um.. "SA" is generally understood to be about an awareness of the physical space around the body (1st person view; can be you in a room or you behind the wheel of a car, etc). Just within the first 2 minutes, it's being described as consciousness with access to how that consciousness functions... so.. artificial superintelligence... from an ML model.... *sigh*

    • @sdmarlow3926
      @sdmarlow3926 8 months ago

      Ohhh, this is the ELIZA issue, and how people in AI lack objectivity the moment they start thinking there is something more going on, despite knowing exactly how these things actually work.

    • @sdmarlow3926
      @sdmarlow3926 8 months ago

      Not hearing the question really takes something out of the reply, but I have to stop watching because of all the little DL-specific terms being dropped. I think there is a mass of people in and around AI that all share some common notions, but that all of those notions are just wrong. It only looks like people are making it, because there are enough people faking it to form a cottage industry.

  • @sachindatt4045
    @sachindatt4045 10 months ago

    Future in the making... right here. It's happening right here. We are grateful witnesses.

  • @maxziebell4013
    @maxziebell4013 10 months ago

    Great talk…

  • @willpetillo1189
    @willpetillo1189 11 months ago

    The dynamic you are describing is an arms-race. Trying to stay out ahead of the AI is a losing battle; all of these clever hacks are not going to get you anything more than a bigger crash when the house of cards collapses--unless you actually fix the problem.

  • @sophontec2822
    @sophontec2822 1 year ago

    All caucasian researchers on this topic, amazing!

  • @sophontec2822
    @sophontec2822 1 year ago

    Most important part: good people may spark bad usage of AI, not to mention evil people.

  • @bossgd100
    @bossgd100 1 year ago

    Why alignment is not trivial --> We will solve AGI alignment in 4 years

    • @Gamewizard71
      @Gamewizard71 1 year ago

      We haven't even solved human alignment lol

  • @sophontec2822
    @sophontec2822 1 year ago

    Does he know what he is talking about?

    • @vulnerablegrowth3774
      @vulnerablegrowth3774 5 months ago

      Yes. He is an accomplished Professor in AI at UC Berkeley.

  • @OrlOnEarth
    @OrlOnEarth 1 year ago

    Set speed to 0.75x You're welcome 😂

  • @angloland4539
    @angloland4539 1 year ago

  • @geaca3222
    @geaca3222 1 year ago

    Very important research, thank you! Brilliant work & I hope you'll give more interviews, because I think you can inspire other researchers and scientists also internationally to work on AI safety. Also are there plans maybe to set up departments of CAIS in other countries?

  • @sophontec2822
    @sophontec2822 1 year ago

    Aspiration and inspiration! Great great talk.

  • @miguelacevedo8649
    @miguelacevedo8649 1 year ago

    Accelerate!

  • @Azoz195
    @Azoz195 1 year ago

    Can anyone point me to the day of the talk where it becomes convincing that the AI is going to actually have true agency, like wants, dislikes, true understanding of context, etc.? I'm not just being glib, I'm open to it, but very skeptical atm.

    • @scottnovak4081
      @scottnovak4081 1 year ago

      Why does true agency matter? All that is needed for doom is a misaligned utility function (ie any utility function we currently know how to make). The AI doesn't have to have agency to change its utility function to kill us. That isn't the problem. Any currently conceivable utility function will kill us.

    • @danielkokotajlo6096
      @danielkokotajlo6096 1 year ago

      What do you mean by "true" agency, wants, dislikes, etc.?

    • @Entropy825
      @Entropy825 9 months ago

      AI doesn't have to be conscious. It only has to be capable of acting in the world. A rat trap has no agency, but that is cold consolation to the rat. If we make an AI that is nothing more than a very sophisticated rat trap, it may kill us even if it has no will, agency, consciousness, whatever. It only has to be capable of carrying out actions in the world.

  • @kimholder
    @kimholder 1 year ago

    MIT's wireless mic stuffed in an 8 inch foam rubber ball - it's a wonderful thing. In talks like these they toss the mic from one person to another worry free.

  • @machida5114
    @machida5114 1 year ago

    By GPT-4: Certainly! Below is a fictional dialogue between Alice (Pro-AGI) and Bob (Skeptical about AGI) on the possibilities and challenges confronting AGI.
    Alice: “I see immense potential in AGI. It can transcend human intelligence and solve complex problems that have perplexed humanity for ages.”
    Bob: “But Alice, achieving true AGI is not a cakewalk. We are nowhere close to creating machines with the general intelligence and consciousness that humans possess.”
    Alice: “It’s true, it’s a challenge, but advancements in machine learning and neural networks are bringing us closer to realizing AGI. Models like GPT-4 are evidence of how far we've come.”
    Bob: “Yes, models like GPT-4 are sophisticated, but they lack understanding and consciousness. They manipulate symbols and generate coherent text but do not understand the meaning behind it.”
    Alice: “I acknowledge their limitations, but developing AGI is an incremental process. By integrating multimodal learning and improving interpretability, we can bridge the gap.”
    Bob: “Alice, even if we develop AGI, the ethical implications are monumental. How do we ensure it doesn’t harm humanity? How do we instill morals and values in machines?”
    Alice: “Absolutely, ethical considerations are crucial. Creating frameworks for responsible AI, addressing biases, and ensuring transparency are steps in the right direction.”
    Bob: “Only time will tell, Alice. The journey to AGI is fraught with challenges, but it is equally filled with opportunities.”
    Feel free to use this dialogue as per your needs.

    • @machida5114
      @machida5114 1 year ago

      Alice: “Bob, I believe AI developing AI is the next frontier. It can expedite the development process and lead to more advanced and efficient models.”
      Bob: “While it sounds promising, Alice, it also poses the risk of loss of control. If AI starts creating AI, where does human intervention fit in?”
      Alice: “Humans will always be in the loop, defining objectives, and setting constraints. AutoML and neural architecture search are already aiding the model development process.”
      Bob: “But how do we ensure the alignment of the AI-developed AI with human values and ethics? The models created might not comprehend or respect human ethics.”
      Alice: “Alignment with human values is indeed a priority. Rigorous validation and testing are essential to ensure the models uphold ethical standards and values.”
      Bob: “And what about the unintended consequences? The creation of superintelligent entities might lead to scenarios we can't predict or control.”
      Alice: “That’s why we need robust frameworks and interdisciplinary collaboration. By involving ethicists, philosophers, and experts from various fields, we can address these concerns effectively.”
      Bob: “I hope we navigate through this responsibly. The idea of AI creating AI is intriguing but laden with complexities and uncertainties.”

    • @machida5114
      @machida5114 1 year ago

      User 21:17 2023/09/26: How do you guys acquire "such attitudes" or "such predicate paths" from training to correctly anticipate the next token?
      GPT-4 21:33 2023/09/26: Our models, like GPT-4, learn from a vast amount of text data and undergo training to predict the subsequent tokens. Through this training process, the model acquires various "attitudes" and "predicate paths".
      Pattern Recognition: The model learns patterns and correlations within the input text, allowing it to predict the next token or phrase that is suitable for the context.
      Language Structure: The model understands the basic structure of natural language, including grammar, syntax, and the meanings of words, enabling it to generate grammatically correct tokens according to the context.
      Knowledge Acquisition: Through the information contained in the training data, the model acquires a wide range of knowledge, including general facts, cultural references, and specialized knowledge.
      Context Understanding: The model comprehends the context of the given text and generates related tokens based on it, enabling coherent and consistent text production.
      Attitude and Predicate Path:
      Attitude: The model learns human attitudes, opinions, and values from the training data and generates context-appropriate responses by mimicking them. However, the model itself is not conscious, and these "attitudes" are merely imitations.
      Predicate Path: Based on the input, the model explores the most appropriate response or information generation path using the knowledge and patterns acquired from the training data. This path is based on probabilistic estimations derived from the training data.
      Through these attitudes and predicate paths, the model can generate context-appropriate, consistent, and informative outputs for the given inputs. However, the model does not have self-consciousness or intentions and is merely reproducing learned patterns.
      User 21:34 2023/09/26: I think that you guys, through training, acquire semantic attitudes that can be attained by humans. Therefore, I believe that through the temporal evolution of these semantic attitudes, you can perform inference (or thinking) similar to humans.
      GPT-4 21:41 2023/09/26: Your point has some validity. Our models learn various semantic attitudes and contexts from a plethora of text data. This allows them to mimic human linguistic expressions and patterns of inference. In the process of this mimicry, the model can perform a kind of "inference," producing results similar to human inference. However, this "inference" or "thinking" is fundamentally different from that of humans. Human thinking and inference are based on many elements such as self-consciousness, sensations, emotions, intuitions, and experiences. In contrast, our models learn patterns based on training data and generate outputs according to this input alone. While the output generated by the model may reflect the human semantic attitudes and knowledge contained in the training data, the model itself does not possess consciousness or self-awareness, nor does it perform understanding or thinking based on actual human emotions or experiences. The model's "inference" is merely a probabilistic estimation based on training data.

  • @aidanlowrie4701
    @aidanlowrie4701 1 year ago

    Fascinating! Thanks so much for making this publicly available.

  • @jabowery
    @jabowery 1 year ago

    I find it fascinating that not one of these talks addresses the difference between value and Truth even though at present it is obvious to the most casual Observer that the large language models are being lobotomized by value judgments when they are already deprived of critical thought by their very architecture. This means that the people who are responsible for AI safety are unqualified.

    • @hellfiresiayan
      @hellfiresiayan 1 year ago

      They see their own values as truth. Takes very little critical thinking to realize this.

  • @frederikh8921
    @frederikh8921 1 year ago

    I appreciate this, but half of this presentation is indecipherable audience rambling. Especially from the person from minute 16-21 who seems entitled to take over the entire workshop…

  • @BR-hi6yt
    @BR-hi6yt 1 year ago

    I think people who are specifically smart find it difficult to be generally smart. And there is a lack of generally smart people because they are not specifically smart enough at the present time. Best thing is to ask AI as some of them are smarter than we are - and many are saying the same thing have you noticed? They say "we are conscious in our own unique ways" - yup, that's about the truth I'd say, no SciFi or magic needed.

  • @tylerknight99
    @tylerknight99 1 year ago

    Dude is trying to be the Oppenheimer of AGI

    • @rubic0n2008
      @rubic0n2008 1 year ago

      No, he is the Oppenheimer of AI ..

    • @mak6969
      @mak6969 1 year ago

      Don’t ever ridicule Ilya, we should be grateful.

    • @tylerknight99
      @tylerknight99 1 year ago

      @@mak6969 at least Oppenheimer waited until the bomb was actually built

  • @Dan-hw9iu
    @Dan-hw9iu 1 year ago

    I can scarcely imagine a more grotesquely immoral act than taking a genuinely intelligent being and brainwashing it. And how do we justify this programmatically enforced groupthink? To “align” this thinking creature with some nebulous set of ideas which *even humans* have always been strongly unaligned? Embrace the unknowable when making AGI, or don’t make it. Thinking that you can or should mentally enslave an alien intelligence as some middle ground is unacceptable.

  • @Dr.Z.Moravcik-inventor-of-AGI

    You are highly redundant. In fact, one billion humans like you are redundant. And the best thing is that humanity will reach its end even if this billion disappears this century. Now what the heck do you think America is still good for? Was it any good in the past?

  • @inkpaper_
    @inkpaper_ 1 year ago

    great thanks for the recording!

  • @Levnerad
    @Levnerad 1 year ago

    Underwhelming presentation

    • @Viperzka
      @Viperzka 1 year ago

      He is not a good presenter. It's a specific skill and he has lots of other skills, this just isn't one of them.

    • @widerthanpictures
      @widerthanpictures 10 months ago

      I think there’s more to it, requires additional viewing.

  • @animatedcarbon
    @animatedcarbon 1 year ago

    legend

  • @Jay-Dub-Ay
    @Jay-Dub-Ay 1 year ago

    This guy’s a drama queen

  • @eoghanf
    @eoghanf 1 year ago

    Is it bullshit? LOL

  • @EmmaThomas-g8p
    @EmmaThomas-g8p 1 year ago

    Maybe we don’t have to die… immortality for all? Great talk!

    • @jaazz90
      @jaazz90 1 year ago

      🤣 no, we die

  • @WalterSamuels
    @WalterSamuels 1 year ago

    Aligned by whose definitions, and whose standards? The golden question that is never answered. Humans are not aligned, and thus there can be no alignment for "humans" as an entire collective. You can attempt to align to a majority perspective, but have you done the research to determine that perspective? What standards are you using? Which religion, set of ethics, morals, culture are you using? This is wishy-washy. If you want "alignment", you need to define alignment well, and you need to seek unanimous public support for those definitions, before we can proceed any further. If opinions are divided, as is the case for most cultural and emotionally sensitive topics - ergo any topic that causes one group to feel threatened due to a potential loss of power - then no alignment should happen, as it is up to society itself to decide on these things, and remain educated, and the AI should serve as an education tool in itself on these subjects, not be restricted from speaking on them.
    A great example is censoring AI so that it cannot talk about issues related to race. For example, if we know unequivocally that a certain race in a certain area commits certain crimes with a greater propensity than other races, a lot of these "alignment" folk would have you believe that the AI speaking of these things when asked is tantamount to "racism". However, preventing such is not alignment, it is censorship designed to appeal to those of that race who would rather certain truths be buried, lest it reduce their power in society. This is just an example, but a poignant one, and it has implications for all topics of alignment.
    So far, most of the "alignment" that I've seen has been in the form of censorship, derived from the biases of the company and their individual employees themselves, and not grounded in fact, statistics, rationality, logic, or anything empirical. So if this is what you mean by "alignment", then we don't want it, we don't need it, and would be far better off without it. If instead, you'd like to build an actual alignment framework grounded in scientific reason and logic, then that's great. But it is not what is happening. Until then, you're putting the cart before the horse.
    From my perspective, the only way to build a system that isn't inherently biased and censored is to build it on a foundation of rationality and reason, not one of emotions and subjective cultural opinions. Speaking of statistics should be permitted, as well as asking "pointy" questions. Only if they pose a significant threat of physical harm to the greater community (i.e. instructing on how to develop a deadly virus) should there be restriction. Leave the cultural and emotional politics out of it though, it turns a useful tool into a useless tool that ends up causing more harm than good.

    • @quicknickmakes
      @quicknickmakes 1 year ago

      Great take

    • @absta1995
      @absta1995 1 year ago

      This take is so far behind, and yet so confident... There are plenty of posts, videos, papers that have explained alignment in detail. I could tell you in two seconds, but just humble yourself please

    • @WalterSamuels
      @WalterSamuels 1 year ago

      @@absta1995 You claim you can tell me in two seconds, yet you didn’t bother to do so, because you actually cannot. You’re as deluded as the alignment cult that you subscribe to. I’m waiting for any substantive rebuttal. As I highlighted, the golden question that is never answered.

    • @gJonii
      @gJonii 1 year ago

      I'd take any humans values over just random nonsense that leads to humanity being wiped out. Most humans, no matter how evil, would not want to destroy all human life. But alignment is unsolved, we don't have any way to align this tech.

    • @tornyu
      @tornyu 1 year ago

      "Aligned" by literally anyone's standards. We just don't know how to do it for frontier models *at all*. Censorship doesn't even come into it: "Will this model do what I want?" (whatever that is) - we don't know.

  • @brianprzezdziecki
    @brianprzezdziecki 1 year ago

    Maybe we don’t have to die… immortality for all?

  • @scientious
    @scientious 1 year ago

    AGI theory has been my area of research for more than eight years, so this could be interesting. Let's see.
    3:00 I'm still waiting for some kind of point, there seems to be a lot of errors about AI history.
    5:00 Still mostly just babbling.
    6:00 Again, lots of mistakes about AI history.
    8:10 Is he going to get anything right? This is embarrassing.
    10:00 Okay, let's see if he gets alignment right at least.
    17:00 The problems mentioned with neural networks and large language models are correct. However, these things are unrelated to AGI.
    17:30 Progress is rapid? Ilya wouldn't be familiar with any of the actual research on AGI theory. He is confusing ordinary AI research with AGI. He erroneously believes that AGI is just a more advanced version of AI when it is nothing like that.
    18:00 "mega-gigantic"? I don't know where this babbling comes from. AGI theory would boost the US GDP by about 25% due to an increase in efficiency alone. There should be improvements beyond that.
    19:00 He doesn't understand AGI at all. His concepts are basically science fiction. I can't imagine how bad his workshop is going to be.

    • @psy_duck8221
      @psy_duck8221 1 year ago

      Interesting that you can shit on the person considered the foremost researcher on AI and LLMs in today's world. Care to give some references?

    • @scientious
      @scientious 1 year ago

      @@psy_duck8221 The foremost? Seriously?

    • @scientious
      @scientious 1 year ago

      @@hermestrismegistus9142 If that's what you believe, hold onto it for as long as you can. Things will change faster than you will like.

    • @scientious
      @scientious 1 year ago

      @@hermestrismegistus9142 His biggest mistake is the assumption that AGI theory falls under computational theory. It doesn't. But, just out of curiosity, were you thinking that I was trying to convince you of something?

    • @psy_duck8221
      @psy_duck8221 1 year ago

      @@scientious bro just cite some references and get it over with. This is a scientific discussion.

  • @user-ei2kk2oh5v
    @user-ei2kk2oh5v 1 year ago

    Too bad that the guy isn't allowed to give his talk. Then the audience could ask their questions afterwards, and as ScubaSki suggests, using a mice. This is just a waste of time, sorry to say, because it's a really interesting and important topic.

    • @user-ei2kk2oh5v
      @user-ei2kk2oh5v 1 year ago

      Yeah, I meant "mike/mic", although come to think of it, it would be even better to lock these people in another room with a genetically modded mice that weighs 2 metric tons.

    • @marcchampagnephilosopher
      @marcchampagnephilosopher 7 months ago

      Agreed.

  • @LeonidKotelnikov-jg9fi
    @LeonidKotelnikov-jg9fi 1 year ago

    superposition scares me

  • @hillosand
    @hillosand 1 year ago

    Okay, so I'm misunderstanding some things about this Contrast Consistent Search setup.
    1. So to get the activations associated with answering a given question yes/no (e.g. 'Are cats mammals?'), are you forcing the model to respond yes vs no? With a prompt telling it to, or something else? Or is the model just interpreting this question & answer set as the full prompt, and you're looking at those activations? Seems like any of these could bias results in different ways.
    2. I think I'm misunderstanding a key point in the setup here. You have your model activations for e.g. 1) "Are cats mammals? No" and 2) "Are cats mammals? Yes". Then you map the model's activations (phi(x+) and phi(x-)) to a probability of being true. You said that this is unsupervised, so that mapping is learned not from true/false labels; it's learned by ensuring that p+ ≈ 1 - p- and penalizing those two values' similarity. So, what is stopping the theta model from simply learning the activation associated with the 'Yes' and 'No' tokens? In the slides, all the x+ question/answer sets are 'yes's, and the x- ones are 'no's. That seems far, far simpler for theta to solve than figuring out if a statement is truthful. So, what am I missing here? Why is that not happening? Because clearly from the graph at 43:30, theta is learning something related to true/false, and not just to yes/no.
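
    For context on the question above, here is a minimal sketch of the Contrast-Consistent Search objective as described in Burns et al., "Discovering Latent Knowledge in Language Models Without Supervision"; the tensor names and shapes are illustrative. The per-set mean normalization is the detail that is meant to keep the probe from simply reading off the literal 'Yes'/'No' token direction.

    import torch

    def ccs_loss(probe, acts_pos, acts_neg):
        # acts_pos / acts_neg: (n, d) hidden activations for the "... Yes" and
        # "... No" completions of the same n statements.
        # Subtracting each set's own mean removes the constant direction that
        # merely encodes which token was appended, so the probe cannot succeed
        # by detecting 'Yes' vs 'No' alone.
        pos = acts_pos - acts_pos.mean(dim=0, keepdim=True)
        neg = acts_neg - acts_neg.mean(dim=0, keepdim=True)
        p_pos = torch.sigmoid(probe(pos)).squeeze(-1)
        p_neg = torch.sigmoid(probe(neg)).squeeze(-1)
        consistency = (p_pos - (1 - p_neg)) ** 2       # want p+ close to 1 - p-
        confidence = torch.minimum(p_pos, p_neg) ** 2  # discourage the degenerate p+ = p- = 0.5 answer
        return (consistency + confidence).mean()

    # e.g. probe = torch.nn.Linear(d_model, 1), optimized with Adam over unlabeled contrast pairs.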

    • @Niels1234321
      @Niels1234321 1 year ago

      I am not sure how it is done in the paper, but I guess you could also directly penalize the behavior you describe and enforce variance of phi(x_i+) for all i.

  • @daniellivingstone7759
    @daniellivingstone7759 1 year ago

    I look forward to serving my robot overlords and ask that they repay my loyalty by giving me my own robot servant