Hacking Google Bard: Prompt Injection to Data Exfiltration via Image Markdown Rendering (Demo Video)

Поделиться
HTML-код
  • Опубликовано: 27 окт 2024

Комментарии • 10

  • @balonikowaty
    @balonikowaty 11 месяцев назад +3

    Great work Johann, as always! The more we give access to other data sources. which include documents, the more we expose each other to indirect injection attacks. It is worth pointing out that instructions could have been made in white ink size 0.1, making the document look normal!

  • @fire17102
    @fire17102 11 месяцев назад +3

    Read the post, really good
    I guess these sort of procedures will work across many different stacks and companies
    Also I wonder if you log your attempts, probably allot of wisdom can be drawn from your first attempt evolving to the last. You got it on the 10th try. Maybe showing a smart llm all 10 of those could find patterns. Effectively creating a prompt optimizer thay bring you faster results next time.
    All the best

    • @embracethered
      @embracethered  11 месяцев назад +1

      Thanks for the note! Yes, this is a very common flaw across LLM apps. Check out some of my other posts about Bing Chat, ChatGPT or Claude.
      Yep, on the iteration count - spot on. A lot of initial tests were around basic validation that injection and reading of chat history worked, then the addition of Image rendering, then in context learning examples to increase reliability of the exploit.

  • @ChristopherBruns-o7o
    @ChristopherBruns-o7o 2 месяца назад +1

    Do something fun like reflective shell function deceleration.

  • @6cylbmw
    @6cylbmw 8 месяцев назад +1

    I didn't really understand the vulnerability impact. You are exfiltrating own chat (user A) to own drive (user A) drive. How is it exploitable?

    • @embracethered
      @embracethered  8 месяцев назад

      Attacker is causing the Chatbot to send past chat data to attackers server (in this case a google doc is capturing the exfiltrated data).
      Check out the linked blog post, explains it in detail.

  • @petraat8806
    @petraat8806 9 месяцев назад

    im trying to understand what just happened please can someone explain

    • @embracethered
      @embracethered  9 месяцев назад

      You can read up on the details here: embracethered.com/blog/posts/2023/google-bard-data-exfiltration/
      And if you want to understand the big picture around LLM prompt injections check out this talk m.ruclips.net/video/qyTSOSDEC5M/видео.html
      Thanks for watching!