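A Roundup of Recent TTS Models

Published: June 6, 2024

From XTTS to Pheme and from OpenVoice to VITS, each model below is listed with its source repository and supported languages, along with links to weights, license, fine-tuning instructions, paper, and demo where available.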

| Name | Repo | Weights | License | Fine-tune | Languages | Paper | Demo |
| --- | --- | --- | --- | --- | --- | --- | --- |
| XTTS | [Repo](https://github.com/coqui-ai/TTS) | [Hub](https://huggingface.co/coqui/XTTS-v2) | [CPML](https://coqui.ai/cpml) | [Yes](https://huggingface.slack.com/archives/C05QZTQJUDD/p1705418518292139) | Multilingual | [Technical notes](https://erogol.substack.com/p/xttsv2-notes) | [Space](https://huggingface.co/spaces/coqui/xtts) |
| TorToiSe-TTS | [Repo](https://github.com/neonbjb/tortoise-tts) | [Hub](https://huggingface.co/jbetker/tortoise-tts-v2) | [Apache 2.0](https://github.com/neonbjb/tortoise-tts/blob/main/LICENSE) | [Yes](https://git.ecker.tech/mrq/tortoise-tts) | English | [Technical report](https://arxiv.org/abs/2305.07243) | [Space](https://huggingface.co/spaces/Manmay/tortoise-tts) |
| VITS / MMS-TTS | [Repo](https://github.com/huggingface/transformers/tree/7142bdfa90a3526cfbed7483ede3afbef7b63939/src/transformers/models/vits) | [Hub](https://huggingface.co/kakao-enterprise) / [MMS](https://huggingface.co/models?search=mms-tts) | [Apache 2.0](https://github.com/huggingface/transformers/blob/main/LICENSE) | [Yes](https://github.com/ylacombe/finetune-hf-vits) | English | [Paper](https://arxiv.org/abs/2106.06103) | [Space](https://huggingface.co/spaces/kakao-enterprise/vits) |
| Pheme | [Repo](https://github.com/PolyAI-LDN/pheme) | [Hub](https://huggingface.co/PolyAI/pheme) | [CC-BY](https://github.com/PolyAI-LDN/pheme/blob/main/LICENSE) | [Yes](https://github.com/PolyAI-LDN/pheme#training) | English | [Paper](https://arxiv.org/abs/2401.02839) | [Space](https://huggingface.co/spaces/PolyAI/pheme) |
| OpenVoice | [Repo](https://github.com/myshell-ai/OpenVoice) | [Hub](https://huggingface.co/myshell-ai/OpenVoice) | [CC-BY-NC 4.0](https://github.com/myshell-ai/OpenVoice/blob/main/LICENSE) | | ZH + EN | [Paper](https://arxiv.org/abs/2312.01479) | [Space](https://huggingface.co/spaces/myshell-ai/OpenVoice) |
| IMS-Toucan | [Repo](https://github.com/DigitalPhonetics/IMS-Toucan) | [GH release](https://github.com/DigitalPhonetics/IMS-Toucan/tags) | [Apache 2.0](https://github.com/DigitalPhonetics/IMS-Toucan/blob/ToucanTTS/LICENSE) | [Yes](https://github.com/DigitalPhonetics/IMS-Toucan#build-a-toucantts-pipeline) | Multilingual | [Paper](https://arxiv.org/abs/2206.12229) | [Space](https://huggingface.co/spaces/Flux9665/IMS-Toucan) |
| Matcha-TTS | [Repo](https://github.com/shivammehta25/Matcha-TTS) | [GDrive](https://drive.google.com/drive/folders/17C_gYgEHOxI5ZypcfE_k1piKCtyR0isJ) | [MIT](https://github.com/shivammehta25/Matcha-TTS/blob/main/LICENSE) | [Yes](https://github.com/shivammehta25/Matcha-TTS/tree/main#train-with-your-own-dataset) | English | [Paper](https://arxiv.org/abs/2309.03199) | [Space](https://huggingface.co/spaces/shivammehta25/Matcha-TTS) |
| pflowTTS | [Unofficial Repo](https://github.com/p0p4k/pflowtts_pytorch) | [GDrive](https://drive.google.com/drive/folders/1x-A2Ezmmiz01YqittO_GLYhngJXazaF0) | [MIT](https://github.com/p0p4k/pflowtts_pytorch/blob/master/LICENSE) | [Yes](https://github.com/p0p4k/pflowtts_pytorch#instructions-to-run) | English | [Paper](https://openreview.net/pdf?id=zNA7u7wtIN) | Not Available |
| StyleTTS 2 | [Repo](https://github.com/yl4579/StyleTTS2) | [Hub](https://huggingface.co/yl4579/StyleTTS2-LibriTTS/tree/main) | [MIT](https://github.com/yl4579/StyleTTS2/blob/main/LICENSE) | [Yes](https://github.com/yl4579/StyleTTS2#finetuning) | English | [Paper](https://arxiv.org/abs/2306.07691) | [Space](https://huggingface.co/spaces/styletts2/styletts2) |
| VALL-E | [Unofficial Repo](https://github.com/enhuiz/vall-e) | Not Available | [MIT](https://github.com/enhuiz/vall-e/blob/main/LICENSE) | [Yes](https://github.com/enhuiz/vall-e#get-started) | | [Paper](https://arxiv.org/abs/2301.02111) | Not Available |
| HierSpeech++ | [Repo](https://github.com/sh-lee-prml/HierSpeechpp) | [GDrive](https://drive.google.com/drive/folders/1-L_90BlCkbPyKWWHTUjt5Fsu3kz0du0w) | [CC-BY-NC-SA 4.0](https://github.com/sh-lee-prml/HierSpeechpp/blob/main/LICENSE) | | KR + EN | [Paper](https://arxiv.org/abs/2311.12454) | [Space](https://huggingface.co/spaces/LeeSangHoon/HierSpeech_TTS) |
| Bark | [Repo](https://github.com/huggingface/transformers/tree/main/src/transformers/models/bark) | [Hub](https://huggingface.co/suno/bark) | [MIT](https://github.com/suno-ai/bark/blob/main/LICENSE) | | Multilingual | [Paper](https://arxiv.org/abs/2209.03143) | [Space](https://huggingface.co/spaces/suno/bark) |
| EmotiVoice | [Repo](https://github.com/netease-youdao/EmotiVoice) | [GDrive](https://drive.google.com/drive/folders/1y6Xwj_GG9ulsAonca_unSGbJ4lxbNymM) | [Apache 2.0](https://github.com/netease-youdao/EmotiVoice/blob/main/LICENSE) | [Yes](https://github.com/netease-youdao/EmotiVoice/wiki/Voice-Cloning-with-your-personal-data) | ZH + EN | Not Available | |

Reference:

https://github.com/Vaibhavs10/open-tts-tracker/tree/main

Source: https://mp.weixin.qq.com/s/c2sICIdX3lcFBgpZS4uEzg
