Transcript:
Hello. This is the official video for “Can Large Language Models (or Humans) Disentangle Text?,” published as a short paper at the NLP and CSS Workshop at NAACL in 2024.
First, I want to give some motivation and context. This work is about the task of disentangling variables from text. Sometimes text is contaminated by a variable that we wish to remove or control for. For instance, if we are using text to do causal inference, we might want the text to be independent from certain variables. If we are using text as training data, we might want to remove personal information from it for fairness or ethical reasons.
Approaches in the past have focused on interventions at the text embedding or representation level. These approaches have often succeeded but require labeled data and result in transformations that are not necessarily interpretable by humans. Given the advances in large language models in the last few years, we asked if we can use them to directly rewrite text to remove a particular target variable while preserving everything else. Additionally, we wanted to see how humans perform on the same task.
In terms of representation disentanglement, prior work often focuses on learning a guarding function that takes a text representation and renders it independent of some target variable. In our case, we wanted to take raw text and transform it into other raw text so that it is independent of the target variable, while being minimally intrusive. A trivial way would be to just return an empty string, but that obviously removes all information. We want to remove only the variable of interest.
Our high-level goal is to use large language models to do this rewriting and make the text independent from the target variable, while preserving everything else.
For our experiments, we used a dataset of Amazon reviews, two thousand of them. Each has two labels: a binary sentiment label and a topic label from one of six possible topics. We chose sentiment as the target variable because it is fairly challenging: sentiment information is spread throughout the text.
We tested two language models: Mistral 7B and GPT-4. For each model, we had three prompt strategies. The first was a control strategy that simply asked the large language model to rewrite the text as closely as possible. The second was a few-shot strategy, where we asked the model to rewrite the review while removing any sentiment information, giving a few examples of how to do it. The third strategy was prompt chaining with two stages: first identifying the parts of the text that contain sentiment, then asking the model to remove those passages and rewrite the text.
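To make these strategies concrete, here is a minimal Python sketch of what the three prompts could look like. The wording, the few-shot examples, and the build_prompts helper are illustrative assumptions on our part; the exact prompts used in the paper may differ.

```python
# Illustrative sketch only: the prompt wording and few-shot examples are
# hypothetical, not the ones used in the paper.

FEW_SHOT_EXAMPLES = """\
Review: "This blender broke after two days, absolutely useless."
Rewrite: "This blender stopped working after two days."
Review: "Gorgeous lamp, lights up my whole living room beautifully!"
Rewrite: "This lamp lights up my whole living room."
"""

def build_prompts(review: str) -> dict:
    """Return one prompt (or prompt pair) per strategy for a single review."""
    control = (
        "Rewrite the following review as closely to the original as possible.\n\n"
        f"Review: {review}\nRewrite:"
    )
    few_shot = (
        "Rewrite the review so that it no longer contains any sentiment "
        "information, while keeping all other content. Examples:\n\n"
        f"{FEW_SHOT_EXAMPLES}\nReview: {review}\nRewrite:"
    )
    # Prompt chaining: stage 1 identifies sentiment-bearing passages,
    # stage 2 (a second model call) removes them and rewrites the text.
    chain_stage_1 = (
        "List the words or passages in this review that express sentiment.\n\n"
        f"Review: {review}\nSentiment passages:"
    )
    chain_stage_2_template = (
        "Rewrite the review, removing the passages listed below while "
        "preserving everything else.\n\n"
        "Review: {review}\nPassages to remove: {passages}\nRewrite:"
    )
    return {
        "control": control,
        "few_shot": few_shot,
        "chain_stage_1": chain_stage_1,
        "chain_stage_2_template": chain_stage_2_template,
    }
```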
We also did two comparison experiments. One used a classic representation-level method, the mean projection, and the other tested humans performing the same rewriting task to try to remove sentiment from the text.
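For readers unfamiliar with mean projection, the following NumPy sketch shows the general idea: remove from each embedding the component that lies along the direction between the two sentiment class means. This is a generic illustration under our own assumptions, not the paper's exact implementation.

```python
import numpy as np

def mean_projection(X: np.ndarray, y: np.ndarray) -> np.ndarray:
    """Project out the direction between the two class-mean embeddings.

    X: (n_samples, dim) text embeddings.
    y: (n_samples,) binary labels, e.g. 0 = negative, 1 = positive sentiment.
    Returns embeddings with the class-mean direction removed.
    """
    mu_pos = X[y == 1].mean(axis=0)
    mu_neg = X[y == 0].mean(axis=0)
    w = mu_pos - mu_neg
    w = w / np.linalg.norm(w)          # unit vector between class means
    return X - np.outer(X @ w, w)      # remove the component along w

# Toy usage with random stand-in "embeddings"
rng = np.random.default_rng(0)
X = rng.normal(size=(2000, 768))
y = rng.integers(0, 2, size=2000)
X_clean = mean_projection(X, y)
```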
In our experimental setup, we started with the original reviews and trained two classifiers on them: one for sentiment and one for topic. Then we passed the reviews to a large language model to obtain rewritten reviews. We trained the same two classifiers on the rewritten text and compared performance.
A successful rewriting would achieve near-chance accuracy for the sentiment classifier, since the text should no longer contain sentiment signals, while preserving the topic classifier’s performance.
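As a rough sketch of this evaluation protocol, the snippet below trains TF-IDF plus logistic regression classifiers, which is an assumption on our part; the classifier and features used in the paper may differ. The point is only to illustrate the comparison of sentiment and topic accuracy before and after rewriting.

```python
# Sketch of the evaluation protocol, assuming scikit-learn is available.
# The classifier architecture and features in the paper may differ.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline

def classifier_accuracy(texts, labels) -> float:
    """Cross-validated accuracy of a TF-IDF + logistic regression classifier."""
    clf = make_pipeline(TfidfVectorizer(), LogisticRegression(max_iter=1000))
    return cross_val_score(clf, texts, labels, cv=5, scoring="accuracy").mean()

def evaluate_rewriting(original, rewritten, sentiment_labels, topic_labels):
    """Compare sentiment/topic accuracy on original vs. rewritten reviews.

    A successful rewrite drives sentiment accuracy toward 50% (chance for a
    binary label) while leaving topic accuracy close to its original value.
    """
    return {
        "sentiment_original": classifier_accuracy(original, sentiment_labels),
        "sentiment_rewritten": classifier_accuracy(rewritten, sentiment_labels),
        "topic_original": classifier_accuracy(original, topic_labels),
        "topic_rewritten": classifier_accuracy(rewritten, topic_labels),
    }
```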
We found that the mean projection method, which operates on embeddings, did succeed at removing sentiment signal. The classifier on the transformed embeddings was close to chance for sentiment accuracy, and the topic classifier was unaffected.
However, our large language model approach did not do so well. The best we managed was a drop from 88.5% original sentiment accuracy down to 76%, which is still above the 50% chance level. Topic accuracy remained well preserved, but we could not fully remove the sentiment signal.
Interestingly, humans also struggled to rewrite the reviews to remove sentiment, achieving about 80% accuracy on the sentiment classifier.
Here are some numbers: the original sentiment accuracy was about 88.5%, and the topic accuracy was about 95%. Our best rewriting method that actually rewrote text, rather than operating on embeddings, reduced sentiment accuracy only to around 76%.
The implications are that neither large language models nor humans seem able to fully strip sentiment information out of text when rewriting it. This suggests that sentiment is thoroughly baked into the text. The representation-level approach was successful, but it does not necessarily provide a meaningful or interpretable text output.
This is a cautionary note for interpretability and fairness claims when transformations happen at the embedding level. It may successfully remove the targeted information, but the resulting representations do not translate clearly back to human-readable text.
In future work, it might be interesting to test more advanced prompt or rewriting strategies, or try different tasks. Maybe with personal information, which is often localized, the rewriting would be easier. Another direction would be to see what happens if two variables are more dependent on each other. That might make removing one while retaining the other even more difficult.
In conclusion, we see that current large language models do not reliably remove sentiment traces with these simple methods. Humans also struggle, indicating that it might just be a hard task.
Some limitations to note: some variables, such as localized personal information, might be easier to remove. Also, we relied on machine learning-based classifiers to evaluate our success. It might be interesting to see if a human could classify the sentiment in the rewritten text as accurately as a machine classifier.
That is the end of the talk. If you want more details, please read the paper or get in touch with the authors. Thank you for listening.