- Videos: 12
- Views: 2,327
Probably Private
Germany
Added May 25, 2023
Probably Private is a channel for privacy and data science, machine learning and AI enthusiasts. Whether you are here to learn about data science/ML/AI via the lens of privacy, or the other way around, this channel aims to be an open conversation on technical and social aspects of privacy and its intersections with surveillance, law, technology, mathematics and probability.
The Probably Private Newsletter (see link!) gives updates in written form on similar topics, so subscribe there to get emails regularly and to learn in more depth.
Katharine Jarmul is an internationally recognized author, lecturer and privacy activist/researcher, who has been working in the field of machine learning and large-scale data analysis/processing for the past 15 years. Her most recent book, Practical Data Privacy, has been translated into 3 languages and is available from your favorite book resellers.
Introduction to Priveedly: Running your own personalized content engine
This is a short introduction video to Priveedly, an open-source setup for running your own Python-based content reader (feeds, subreddits) and recommender.
You can find more about Priveedly on GitHub: github.com/kjam/priveedly and in the blog post on why and how I use it: blog.kjamistan.com/priveedly-your-private-and-personal-content-reader-and-recommender.html
Views: 6
Videos
Training your own personalized and private content recommender using scikit-learn and Priveedly
Views 521 hour(s) ago
This video is for users of Priveedly (github.com/kjam/priveedly) or similar feed readers who want to train their own personal, private recommender system. The video is meant for folks who know Python but who are beginners in training their own machine learning models. I appreciate any feedback if you have open questions or thoughts after using the Jupyter Notebook or this video. I will try to a...
What Adversarial Machine Learning Teaches us about AI Memorization
42 views, 1 day ago
Hacking systems, as in adversarial machine learning, can often teach us how they work. In this video, you'll learn about the field of adversarial machine learning and how it relates to what you've learned thus far about memorization in AI systems. Related Blog Post: blog.kjamistan.com/adversarial-examples-demonstrate-memorization-properties.html My CCC talk (2017): media.ccc.de/v/34c3-8860-de...
How could we have known about AI memorization? Exploring differential privacy in deep learning.
544 views, 14 days ago
In this video, we'll begin investigating how we could have figured out that AI models or deep learning models were memorizing parts of their training data. Specifically, in this video you'll learn about: - the concept of differential privacy - one of the early differentially private deep learning system designs, called PATE - how looking at PATE teaches us which types of examples are memorized ...
How AI/ML memorization happens: Overparameterized models
687 views, 28 days ago
In this video, you'll learn about how model size growth and overparameterization created the ability to both generalize and memorize well. You'll investigate model size and what parameters are, while analyzing how this created the ability to memorize data and still generalize well. Related Article to learn more or reference research: blog.kjamistan.com/how-memorization-happens-overparametrized-...
AI Memorization: How and why novel examples are memorized in deep learning systems
41 views, 1 month ago
In this video in the memorization series, you'll learn how and why memorization of novel examples (or outliers/singletons) happens. Essentially, these are too expensive not to memorize! You'll also learn about several interesting studies of how deep learning works, some of which jumpstarted the initial investigation into memorization in deep learning and why it seems to enhance performance rat...
How AI / ML Memorization Happens: Repetition
35 views, 1 month ago
In this week's video, you'll learn about how deep learning and AI models memorize highly repeated examples. In many ways, this is both desired and expected behavior for such models, but there are probably cases where text or images are highly repeated and memorization is not wanted, as with copyrighted text or code. In this video you'll also get a brief introduction to how model size relates to memori...
How machine learning training and evaluation contribute to memorization
39 views, 1 month ago
In this video, I'll walk you through the third article in the memorization series, looking at how deep learning models are trained and evaluated. In doing so, you'll encounter questions about whether this is the best way to build machine learning systems and how it can incentivize humans and models to memorize training data. Article for this video: blog.kjamistan.com/gaming-evaluation-the-evolutio...
Memorization in ML Series: Encodings and Embeddings
50 views, 2 months ago
In this video, you'll get an overview of how data gets encoded in order to perform deep learning. You'll get a brief introduction to how this evolved in early machine learning and connect the concept to why and how embeddings in NLP / language model systems can end up leaking biases and sensitive information. You can read the full article: blog.kjamistan.com/encodings-and-embeddings-how-does-...
Memorization in Machine Learning Series: Dataset distributions, history, and biases
115 views, 2 months ago
In this video, I'll walk you through the properties, history and biases of machine learning dataset creation. These are important fundamentals both for using machine learning systems and for understanding how and why memorization happens. The blog series page is located at: blog.kjamistan.com/a-deep-dive-into-memorization-in-deep-learning.html The article on datasets is he...
Privacy Engineering with Nigel Smart
217 views, 1 year ago
In this edition, you'll learn about another face of privacy engineering: the cryptographic side! Dr. Nigel Smart, a world-renowned cryptographer, leads you through the concept of collaborative computing, where individuals, companies and organizations can share data to compute things without revealing their data to each other. As a long-time researcher and advisor to many startups, including hi...
What is Privacy Engineering? With Damien Desfontaines
574 views, 1 year ago
Explore the world of privacy engineering with today's leading experts, including Damien Desfontaines, a pioneer in the field of usable differential privacy and anonymization. Damien started his career at Google and now leads open-source development for Tumult Labs, helping organizations like the US Census Bureau deploy anonymization at scale. Please check out his work using the links in the com...
I know it’s really difficult to determine what a “fact” is because reality is viewed from so many perspectives…
I just had a wild idea…could we use memorization to reduce (or eliminate) the hallucination of facts? Example: Assuming Wikipedia is humanity’s baseline factset, is it possible for an LLM to memorize all Wikipedia, such that the model never hallucinates answers that contain Wiki facts?
I believe that this is one of those problems the models can't do without. Because one of the main features of most models is to take a small sample and reproduce it. Think about when you provide a sample code that you intend to be extended. That sample code is a white peacock for all we know but we still expect the model to be useful with non trivial tasks. I believe the same ability like to adapt to a context window is the only way for most models to be more than just a mild average of everything they have seen. So it kinda has to be baked in. The righteous way imho is simply to be very careful when producing training datasets, like this is part of a company duty and responsibility and if they fail to do so, then the public sues (and rightfully so)
Omg again this is brilliant 👏.
Holy shit this video is a gem, can't believe you have so few views!!!
I wish, i can be ur student
Good 🎉🎉🎉
The question of allowing AI to memorise some things but not others is a great disadvantage and this kind of thinking only retards AI's learning ability, as it is a machine used to benefit mankind. There might be memory degradation over time, but I would at least build something into this array where it can retain this memory forever. Weights & biases can act as a main memory, but I'm sure someone will find a better way to transform this, maybe in the form of images etc. Maybe for now attention is all you need, but to complete the picture, AI needs to master, or have built into it, 'contrast & context'. This would give it artificial or virtual reality. Governments will have to mandate how much an AI can use a person's likeness, esp if they are a marketable quantity like a celebrity etc.
Nicely explained. I eventually believe that the whole node array of the AI will organically set up specialised areas (if allowed to ??) I didn't know that the nodes memorise stuff. But it makes sense that they would reuse their experiences for future tasks. How is this similar or different to memGPT ?
This is awesome. Waiting for more content on Privacy Engineering!!
Interesting Links:
- Damien's Post: What does a Privacy Engineer do, anyway? desfontain.es/privacy/privacy-engineer.html
- Damien's Blog: desfontain.es/privacy/index.html
- Tumult Labs Library: docs.tmlt.dev/analytics/latest/
- Damien on LinkedIn: www.linkedin.com/in/desfontaines/
- On Twitter: twitter.com/tedonprivacy/
- On Mastodon: hachyderm.io/@tedted