6. prosince 2022
Řečník · 0 sledujících
Řečník · 0 sledujících
Řečník · 0 sledujících
Polyphone disambiguation aims to capture accurate pronunciation knowledge from natural text sequences for reliable Text-to-speech (TTS) systems. However, previous approaches require substantial annotated training data and additional efforts from language experts, making it difficult to extend high-quality neural TTS systems to out-of-domain daily conversations and countless languages worldwide. This paper tackles the polyphone disambiguation problem from a concise and novel perspective: we propose Dict-TTS, a semantic-aware generative text-to-speech model with an online website dictionary (the existing prior information in the natural language). Specifically, we design a semantics-to-pronunciation attention (S2PA) module to match the semantic patterns between the input text sequence and the prior semantics in the dictionary and obtain the corresponding pronunciations; The S2PA module can be easily trained with the end-to-end TTS model without any annotated phoneme labels. Experimental results in three languages show that our model outperforms several strong baseline models in terms of pronunciation accuracy and improves the prosody modeling of TTS systems. Further extensive analyses with different linguistic encoders demonstrate that each design in Dict-TTS is effective. Audio samples are available at https://dicttts.github.io/DictTTS-Demo/.Polyphone disambiguation aims to capture accurate pronunciation knowledge from natural text sequences for reliable Text-to-speech (TTS) systems. However, previous approaches require substantial annotated training data and additional efforts from language experts, making it difficult to extend high-quality neural TTS systems to out-of-domain daily conversations and countless languages worldwide. This paper tackles the polyphone disambiguation problem from a concise and novel perspective: we prop…
Účet · 962 sledujících
Profesionální natáčení a streamování po celém světě.
Prezentace na podobné téma, kategorii nebo přednášejícího
Pro uložení prezentace do věčného trezoru hlasovalo 0 diváků, což je 0.0 %
Pro uložení prezentace do věčného trezoru hlasovalo 0 diváků, což je 0.0 %
Dongze Lian, …
Pro uložení prezentace do věčného trezoru hlasovalo 0 diváků, což je 0.0 %
Jack Lindsey, …
Pro uložení prezentace do věčného trezoru hlasovalo 0 diváků, což je 0.0 %
Pro uložení prezentace do věčného trezoru hlasovalo 0 diváků, což je 0.0 %
Hanxu Zhou, …
Pro uložení prezentace do věčného trezoru hlasovalo 0 diváků, což je 0.0 %