State-of-the-art discrete acoustic codec models at 40 tokens per second for audio language modeling
semantic text-to-speech codec acoustic dac speech-representation audio-representation encodec soundstream music-representation-learning gpt4o speech-language-model
Updated Jan 2, 2025 - Python