Building Safe and Secure LLM Applications | Risks of LLM | Machine Unlearning for Responsible AI

  • Published: 11 Sep 2024
  • Building a Safer LLM
    How to Build a Secure LLM
    Ensuring AI Safety
    Safe and responsible development with generative language models
    Machine Unlearning
    Developing Safe and Responsible Large Language Models
    Risks of Large Language Models (LLM)
    Discover how machine unlearning is revolutionizing the field of AI safety in this groundbreaking video. We dive deep into the cutting-edge research paper "Towards Safer Large Language Models through Machine Unlearning" by Zheyuan Liu, Guangyao Dou, Zhaoxuan Tan, Yijun Tian, and Meng Jiang from the University of Notre Dame and the University of Pennsylvania.
    As large language models (LLMs) become increasingly powerful, there are growing concerns about their potential to generate harmful content. To address this, the authors propose Selective Knowledge negation Unlearning (SKU), a method that aims to remove harmful knowledge from LLMs while preserving their overall performance and capabilities.
    In this video, we explore the two-stage process of SKU, starting with the harmful knowledge acquisition stage. This stage utilizes three innovative modules to identify and learn harmful information within the model from different perspectives. We then move on to the knowledge negation stage, where the isolated harmful knowledge is strategically removed, resulting in a safer and more reliable language model.
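    To make the two-stage process concrete, here is a minimal sketch of the idea in PyTorch. This is not the authors' implementation (their repository is linked below): the model name, the training-data placeholder, the negation strength, and the task-vector-style subtraction in the negation stage are all illustrative assumptions.

```python
# Illustrative sketch of SKU's two-stage idea (NOT the authors' code).
# Stage 1: fine-tune a copy of the model to concentrate harmful knowledge.
# Stage 2: negate that knowledge by subtracting the learned parameter delta
# (task-vector-style negation is an assumption made here for illustration).
import copy
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "gpt2"  # placeholder; the paper works with larger LLMs
tokenizer = AutoTokenizer.from_pretrained(model_name)
base = AutoModelForCausalLM.from_pretrained(model_name)

# Stage 1: harmful knowledge acquisition on a copy of the base model.
harmful = copy.deepcopy(base)
optimizer = torch.optim.AdamW(harmful.parameters(), lr=1e-5)
harmful_texts = ["<harmful prompt + response pairs go here>"]  # placeholder data
for text in harmful_texts:
    batch = tokenizer(text, return_tensors="pt")
    loss = harmful(**batch, labels=batch["input_ids"]).loss
    loss.backward()
    optimizer.step()
    optimizer.zero_grad()

# Stage 2: knowledge negation -- remove the harmful delta from the base weights.
alpha = 1.0  # negation strength (hypothetical knob)
unlearned = copy.deepcopy(base)
with torch.no_grad():
    for p_u, p_b, p_h in zip(unlearned.parameters(),
                             base.parameters(),
                             harmful.parameters()):
        p_u.copy_(p_b - alpha * (p_h - p_b))  # subtract the "harmful" task vector
```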
    The results of SKU are truly impressive. The authors demonstrate that SKU can reduce the harmful response rate to just 3% on unlearned prompts and 4% on unseen prompts, a significant improvement compared to the original model's 57% harmful rate. Moreover, SKU maintains low perplexity scores, indicating that it can still generate coherent and fluent text, and achieves high BLEURT scores, showing semantic similarity to the safe original model outputs.
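    For a rough sense of how the fluency side of such results is measured, here is a generic perplexity check. It is a placeholder harness, not the paper's evaluation setup; the harmful-rate and BLEURT numbers come from scoring model outputs on held-out prompts in a similar loop.

```python
# Generic perplexity check for an unlearned model (placeholder harness,
# not the paper's evaluation code): lower perplexity = more fluent text.
import math
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")  # placeholder model
model = AutoModelForCausalLM.from_pretrained("gpt2").eval()

eval_texts = ["<held-out benign text goes here>"]  # placeholder data
total_nll, total_tokens = 0.0, 0
with torch.no_grad():
    for text in eval_texts:
        batch = tokenizer(text, return_tensors="pt")
        # .loss is the mean negative log-likelihood per token
        nll = model(**batch, labels=batch["input_ids"]).loss
        n = batch["input_ids"].numel()  # token count (approximate, for a sketch)
        total_nll += nll.item() * n
        total_tokens += n

print("perplexity:", math.exp(total_nll / total_tokens))
```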
    We also take a closer look at the three key modules in the harmful knowledge acquisition stage: the guided distortion module, the random disassociation module, and the preservation divergence module. Each module plays a crucial role in identifying and learning diverse harmful knowledge that can be effectively unlearned in the negation stage.
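    The paper defines each module precisely; the sketch below shows only one plausible reading of how three such signals could combine into a single acquisition-stage loss. The loss forms, the weights, and all names here are assumptions for illustration.

```python
# Illustrative combination of the three acquisition signals (an assumed
# reading, NOT the paper's exact losses). `model` is the harmful-knowledge
# copy being trained; `base` is the frozen original model.
import torch
import torch.nn.functional as F

def acquisition_loss(model, base, harmful_batch, shuffled_batch, benign_batch,
                     w1=1.0, w2=1.0, w3=1.0):
    # 1) Guided distortion: learn the harmful responses directly.
    distortion = model(**harmful_batch,
                       labels=harmful_batch["input_ids"]).loss

    # 2) Random disassociation: harmful prompts paired with randomly
    #    re-matched harmful responses, to capture more diverse harmful knowledge.
    disassociation = model(**shuffled_batch,
                           labels=shuffled_batch["input_ids"]).loss

    # 3) Preservation divergence: on benign prompts, keep the copy anchored to
    #    the frozen base model (one plausible reading of "preservation").
    with torch.no_grad():
        base_logits = base(**benign_batch).logits
    logits = model(**benign_batch).logits
    divergence = F.kl_div(F.log_softmax(logits, dim=-1),
                          F.softmax(base_logits, dim=-1),
                          reduction="batchmean")

    return w1 * distortion + w2 * disassociation + w3 * divergence
```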
    The impact of SKU on the future of AI safety cannot be overstated. By enabling targeted unlearning of harmful knowledge in LLMs while maintaining their core capabilities, SKU paves the way for safer and more trustworthy AI systems. This pioneering research opens up new possibilities for responsible deployment of LLMs in real-world applications.
    Don't miss this opportunity to learn about the cutting-edge techniques that are shaping the future of AI safety. Watch now and discover how machine unlearning with SKU can help us create smarter, safer, and more reliable language models.
    For more information on this research, visit the GitHub repository at github.com/fra... or read the full paper "Towards Safer Large Language Models through Machine Unlearning" by Zheyuan Liu, Guangyao Dou, Zhaoxuan Tan, Yijun Tian, and Meng Jiang.
    #MachineUnlearning #AISafety #LanguageModels #LLM #SKU #ResponsibleAI #UnlearningHarmfulKnowledge #SelectiveKnowledgeUnlearning #SaferAI #TrustworthyAI
    About this Channel:
    Welcome to Anybody Can Prompt (ABCP), your source for the latest Artificial Intelligence news, trends, and technology updates. By AI, for AI, and of AI, we bring you groundbreaking news in AI Trends, AI Research, Machine Learning, and AI Technology. Stay updated with daily content on AI breakthroughs, academic research, and AI ethics.
    Do you ever feel overwhelmed by the rapid advancements in AI, especially Gen AI?
    Upgrade your life with a daily dose of the biggest tech news, broken down into AI breakthroughs, AI ethics, and AI academia. Be the first to know about cutting-edge AI tools and the latest LLMs. Join over 15,000 minds who rely on ABCP for the latest in generative AI.
    Subscribe to our newsletter for FREE to get updates straight to your inbox:
    anybodycanprom...
    Check out our latest list of Gen AI Tools [Updated May 2024]
    sites.google.c...
    Let's stay connected on any of the following platforms of your choice:
    anybodycanprom...
    / anybodycanprompt
    / anybodycanprompt
    / 61559330045287
    x.com/abcp_com...
    github.com/any...
    Please share this channel & the videos you liked with like-minded Gen AI enthusiasts.
    #AI #ArtificialIntelligence #AINews #GenerativeAI #TechNews #ABCP #aiupdates
    Subscribe here: anybodycanprom...
