SwissText2024: Jesse Berent (Google) on "Connecting Digital Ink with Large Vision/Language Models"

  • Published: Jul 2, 2024
  • Recording of Jesse Berent's keynote at the 9th SwissText Conference in Chur, Switzerland on "Connecting Digital Ink with Large Vision/Language Models".
    About SwissText: The Swiss Text Analytics Conference (SwissText) is an annual conference in Switzerland that brings together experts from industry and academia in the fields of Natural Language Processing (NLP), Computational Linguistics and Text Analytics. LINK: www.swisstext.org
    Digital note-taking and hand-drawn input are gaining popularity, offering a durable, editable, and easily indexable way of storing notes in the vectorized form known as digital ink. At the same time, the adoption of tablets with touchscreens and styluses is increasing, and a key feature of these devices is interpreting handwritten or drawn input. This talk explores the intersection of handwriting recognition and modern AI by focusing on two new approaches. The first part delves into the application of large vision-language models (VLMs) to online handwriting recognition using new representations and tokenizers. This approach, which is compatible with off-the-shelf models and methods, offers a promising avenue for seamless integration of online handwriting recognition into existing multi-modal models.
    The second part will focus on converting images of handwriting (pen-and-paper notes) into digital ink with VLMs. This capability bridges the gap between traditional and digital note-taking, facilitating seamless integration of handwritten content into digital AI-assisted workflows. The presentation will conclude with a discussion of the broader implications of these advancements for the future of handwriting recognition and human-computer interaction.
  • Science
