Development and Operation of Expressive Speech Synthesis System -English version-

  • Published: 23 Sep 2024
  • Text-to-speech (TTS) uses computers to generate speech from text. In recent years, deep learning-based TTS has reached a quality comparable to that of humans for neutral speech, but it still falls short on expressiveness.
    At LINE, we’re developing a highly expressive TTS system with a variety of speaking styles and precise controls. This session introduces two themes: emotional TTS model development and TTS system operation.
    The first half covers a method accepted at INTERSPEECH 2022, an international conference on speech processing. Starting from a small amount of neutral speech, it applies voice conversion to generate pseudo-emotional data, which is then used to build an emotional TTS model.
    The second half covers our efforts to adopt microservices in order to improve the speed and efficiency of the maintenance and development cycle of a system with multiple inference modules.
    ■ Category
    Data / AI
    ■ Speaker
    Ryo Terashima / LINE
    Kosuke Futamata / LINE
    ■ Tech-Verse Website
    tech-verse.me/...
    #techverse_en
    ■ Slide
    speakerdeck.co...
    ■ Videos in other languages
    JA: • 感情表現豊かな音声合成システムを実現するため...
    KO: • 풍부한 표현력의 음성 합성 시스템 구현을...
