Prompt Injections - An Introduction

  • Published: Aug 22, 2024

Comments • 4

  • @ninosawas3568 8 months ago +1

    Great video! Very informative. Interesting to see how the LLM's ability to "pay attention" is such a large exploit. I wonder if mitigating this issue would lead to LLMs being overall less effective at following user instructions.

    • @embracethered 8 months ago +1

      Thanks for watching! I believe you are correct; it's a double-edged sword. The best mitigation at the moment is to not trust the responses. Unfortunately, that makes it impossible for now to build a truly generic autonomous agent that uses tools automatically. It's a real bummer, because I think most of us want secure and safe agents.

  • @halfoflemon a year ago +1

    How about giving it a secret word that must be typed in order to unlock control, like a password? Do you think that would work? Also, does lowering the temperature reduce the chance of a successful injection attack?

    • @embracethered a year ago

      Yes, something like that can work. I have done it with image models in the past: basically, train the model to respond in a particular way once a certain object is present. You can check out this blog post on what is possible: embracethered.com/blog/posts/2020/husky-ai-machine-learning-backdoor-model/
      As for temperature: higher temperature means more "creativity", so the model is probably more likely to come up with responses that could be considered insecure, but it is also less deterministic.
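The temperature point in the reply above can be sketched with a toy sampler (this is an illustrative standalone function, not any particular LLM's API): logits are divided by the temperature before the softmax, so a low temperature concentrates nearly all probability on the top token (more deterministic), while a high temperature flattens the distribution (more varied output).

```python
import math
import random


def sample_with_temperature(logits, temperature, rng=None):
    """Sample a token index from logits using temperature-scaled softmax.

    Lower temperature sharpens the distribution toward the argmax;
    higher temperature flattens it toward uniform. Returns the sampled
    index and the full probability distribution.
    """
    rng = rng or random.Random(0)
    # Scale logits by temperature, then apply a numerically stable softmax.
    scaled = [l / temperature for l in logits]
    m = max(scaled)
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    probs = [e / total for e in exps]
    # Draw from the resulting categorical distribution.
    r = rng.random()
    cum = 0.0
    for i, p in enumerate(probs):
        cum += p
        if r < cum:
            return i, probs
    return len(probs) - 1, probs
```

For example, with logits `[2.0, 1.0, 0.1]`, a temperature of 0.1 gives the top token almost all of the probability mass, while a temperature of 10.0 spreads the mass nearly evenly across all three tokens.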