How Can We Generate BETTER Sequences with LLMs?

  • Published: Sep 20, 2024
  • We know that LLMs are trained to predict the next token. When we decode the output sequence, we condition on the prompt tokens and the previously generated tokens to predict the next one. With greedy decoding or multinomial sampling, we use those predictions to emit the next token autoregressively. But is that the sequence we are actually looking for, given the prompt? Do we really care about the probability of each next token on its own? What we want is the whole sequence that maximizes the probability conditioned on the prompt, not each token maximized separately.
    So let's look at why the next-token probability is not the quantity we ultimately care about, and how we can do better than autoregressively committing to the most likely next token at each step.
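
To make this concrete, here is a minimal, self-contained Python sketch. The toy `NEXT_TOKEN_PROBS` table is a made-up stand-in for a real model (an assumption for illustration only, not anything from the video): greedy decoding commits to the locally most probable token at every step, while a simple beam search keeps the few best partial sequences and can end up with a higher probability for the whole sequence.

```python
import math

# Hypothetical toy "model": next-token probabilities given the last token.
# Vocabulary: "<s>", "A", "B", "C", "<eos>". Purely illustrative numbers.
NEXT_TOKEN_PROBS = {
    "<s>": {"A": 0.55, "B": 0.45},
    "A":   {"<eos>": 0.4, "C": 0.6},
    "B":   {"<eos>": 0.95, "C": 0.05},
    "C":   {"<eos>": 1.0},
}

def greedy_decode(start="<s>", max_len=5):
    """Pick the single most probable next token at every step."""
    seq, logp, tok = [start], 0.0, start
    for _ in range(max_len):
        probs = NEXT_TOKEN_PROBS[tok]
        tok = max(probs, key=probs.get)        # locally best token
        logp += math.log(probs[tok])
        seq.append(tok)
        if tok == "<eos>":
            break
    return seq, logp

def beam_search(start="<s>", beam_width=2, max_len=5):
    """Keep the beam_width highest-scoring partial sequences at each step
    (simplified: finished hypotheses are set aside and compared at the end)."""
    beams = [([start], 0.0)]                   # (tokens, cumulative log-prob)
    finished = []
    for _ in range(max_len):
        candidates = []
        for seq, logp in beams:
            for tok, p in NEXT_TOKEN_PROBS[seq[-1]].items():
                new_seq, new_logp = seq + [tok], logp + math.log(p)
                if tok == "<eos>":
                    finished.append((new_seq, new_logp))
                else:
                    candidates.append((new_seq, new_logp))
        if not candidates:
            break
        # Prune: keep only the best beam_width partial sequences.
        beams = sorted(candidates, key=lambda c: c[1], reverse=True)[:beam_width]
    # Return the complete sequence with the highest total log-probability.
    return max(finished, key=lambda c: c[1])

if __name__ == "__main__":
    g_seq, g_logp = greedy_decode()
    b_seq, b_logp = beam_search()
    print("greedy:", g_seq, "log-prob:", round(g_logp, 3))
    print("beam  :", b_seq, "log-prob:", round(b_logp, 3))
```

In this toy example, greedy decoding picks "A" first because it is locally more probable (0.55 vs 0.45) and ends with a sequence of probability 0.33, while beam search finds "<s> B <eos>" with probability about 0.43. Beam search is just one common way to search closer to the sequence-level optimum; the video may discuss other decoding strategies as well.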
