Corrections + Few Shot Examples (Part 1) | LangSmith Evaluations

  • Published: 25 Jun 2024
  • Evaluation is the process of continuously improving your LLM application. This requires a way to judge your application’s outputs, which are often natural language. Using an LLM to grade natural language outputs (e.g., for correctness relative to a reference answer, tone, or conciseness) is a popular approach, but requires prompt engineering and careful auditing of the LLM judge!
    Our new release of LangSmith addresses this growing problem by letting a user (1) correct LLM-as-a-Judge outputs and then (2) pass those corrections back to the judge as few-shot examples for future iterations. This creates LLM-as-a-Judge evaluators grounded in human feedback that better encode your preferences, without the need for challenging prompt engineering.
    Here we show how to apply Corrections + Few Shot to online evaluators that are pinned to a project (see the sketch below).
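To make the idea concrete, here is a minimal sketch of an LLM-as-a-Judge "correctness" evaluator run offline with the LangSmith SDK. The Corrections + Few Shot feature shown in the video does this wiring automatically for online evaluators pinned to a project; in the sketch below, the corrections are hard-coded placeholders, and the dataset name, judge model, and dataset schema (`question` input, `answer` reference output) are assumptions for illustration only.

```python
# Minimal sketch: an offline LLM-as-a-Judge "correctness" evaluator whose
# prompt includes past human corrections as few-shot examples. The video's
# Corrections + Few Shot feature manages these corrections for you in the
# LangSmith UI; here they are hard-coded to illustrate the idea.
from langsmith.evaluation import evaluate
from openai import OpenAI

oai = OpenAI()

# Hypothetical human corrections, as would come from reviewing and
# correcting earlier judge outputs in LangSmith.
FEW_SHOT_CORRECTIONS = """\
Example 1
Question: What is the capital of France?
Reference: Paris
Answer: The capital of France is Paris.
Grade: CORRECT

Example 2
Question: Who wrote Hamlet?
Reference: William Shakespeare
Answer: Christopher Marlowe wrote Hamlet.
Grade: INCORRECT
"""

def correctness_judge(run, example) -> dict:
    """Grade the run's output against the dataset's reference answer."""
    prompt = (
        "You are grading an answer for correctness against a reference.\n"
        "Use the graded examples below as guidance.\n\n"
        f"{FEW_SHOT_CORRECTIONS}\n"
        f"Question: {example.inputs['question']}\n"
        f"Reference: {example.outputs['answer']}\n"
        f"Answer: {run.outputs['output']}\n"
        "Reply with exactly one word: CORRECT or INCORRECT."
    )
    resp = oai.chat.completions.create(
        model="gpt-4o-mini",  # assumed judge model
        messages=[{"role": "user", "content": prompt}],
        temperature=0,
    )
    verdict = resp.choices[0].message.content.strip().upper()
    return {"key": "correctness", "score": 1 if verdict == "CORRECT" else 0}

def target(inputs: dict) -> dict:
    """Stand-in for the application under test."""
    return {"output": "The capital of Lithuania is Vilnius."}

# "qa-dataset" is a hypothetical LangSmith dataset with `question` inputs
# and `answer` reference outputs.
evaluate(
    target,
    data="qa-dataset",
    evaluators=[correctness_judge],
    experiment_prefix="judge-with-corrections",
)
```

With the online feature described above, the evaluator is instead attached to a project, and any corrections you make to its grades in the UI are fed back into its prompt as few-shot examples automatically.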

Comments • 2

  • @andydataguy
    1 month ago

    This evaluation series is great!! 🙌🏾💜

  • @arturassgrygelis3473
    16 days ago

    Why can't I see evaluations in feedback like you do? I get a separate project where all the evaluations go (not handy).
    I have set up four evaluations, two of them being relevance recall and precision, and I don't know why I get two extra ones with random questions I never entered. For example, for the input "What rights do citizens of Lithuania have?", the outputs talk about the capital of France and other questions about France.