NVIDIA Nemo Codec Demo (22kHz)
This app demonstrates the NVIDIA Nemo Codec model (nvidia/nemo-nano-codec-22khz-0.6kbps-12.5fps) used in Kani TTS.
How it works:
- Upload an audio file (wav, mp3, flac, etc.).
- The audio is automatically resampled to 22kHz if its sample rate differs.
- The 22kHz audio is encoded into discrete tokens by the Nemo codec.
- These tokens are then decoded back into audio by the Nemo codec.
- You can listen to the original, the 22kHz version (if resampled), and the final reconstructed audio.
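The preprocessing steps above can be sketched in plain Python. This is only an illustration, not the app's actual code: the helper names (`first_channel`, `resample_linear`, `expected_token_shape`) are hypothetical, "22kHz" is assumed to mean 22,050 Hz, the linear-interpolation resampler is a simple stand-in for whatever resampler the app uses, and rounding the last partial frame up is an assumption. The codec's encode and decode calls are omitted on purpose.

```python
import math

TARGET_SR = 22050   # assumption: "22kHz" taken to mean 22,050 Hz
FRAME_RATE = 12.5   # frames per second, from the technical details
NUM_CODEBOOKS = 4   # codebook levels per frame, from the technical details


def first_channel(samples):
    """Return the first channel of a clip.

    `samples` is a list of frames; each frame is either a float (mono)
    or a sequence of per-channel floats such as (left, right).
    """
    if samples and isinstance(samples[0], (list, tuple)):
        return [frame[0] for frame in samples]
    return list(samples)


def resample_linear(mono, src_sr, dst_sr=TARGET_SR):
    """Naive linear-interpolation resampler (no anti-aliasing filter)."""
    if src_sr == dst_sr or not mono:
        return list(mono)
    out_len = max(1, round(len(mono) * dst_sr / src_sr))
    out = []
    for i in range(out_len):
        pos = i * src_sr / dst_sr            # fractional index into the input
        lo = min(int(pos), len(mono) - 1)
        hi = min(lo + 1, len(mono) - 1)
        frac = pos - lo
        out.append(mono[lo] * (1.0 - frac) + mono[hi] * frac)
    return out


def expected_token_shape(num_samples, sample_rate=TARGET_SR):
    """Estimate the (codebooks, frames) token grid for a clip.

    Assumes the final partial frame is padded, hence the ceiling.
    """
    frames = math.ceil(num_samples / sample_rate * FRAME_RATE)
    return NUM_CODEBOOKS, frames
```

Under these assumptions, a two-second clip at the target rate maps to a grid of 4 codebooks by 25 frames, which is the discrete token representation the decoder turns back into audio.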
Technical details:
- Sample rate: 22kHz
- Bitrate: ~0.6kbps
- Frame rate: 12.5fps
- Codebooks: 4 levels per frame
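These figures are related by simple arithmetic, sketched below. The function names are hypothetical, the bitrate is taken as exactly 600 bits per second (the source only says "~0.6kbps"), and the comparison baseline of 16-bit mono PCM at 22,050 Hz is an assumption. The resulting bits-per-codebook value is an estimate implied by the rounded numbers, not a documented codebook size.

```python
BITRATE_BPS = 600    # assumption: "~0.6kbps" rounded to 600 bits per second
FRAME_RATE = 12.5    # frames per second
NUM_CODEBOOKS = 4    # codebook levels per frame


def bits_per_frame(bitrate_bps=BITRATE_BPS, frame_rate=FRAME_RATE):
    """Bits available to describe one frame of audio."""
    return bitrate_bps / frame_rate


def bits_per_codebook(bitrate_bps=BITRATE_BPS, frame_rate=FRAME_RATE,
                      num_codebooks=NUM_CODEBOOKS):
    """Bits carried by each codebook index within a frame."""
    return bits_per_frame(bitrate_bps, frame_rate) / num_codebooks


def pcm_compression_ratio(sample_rate=22050, bit_depth=16,
                          bitrate_bps=BITRATE_BPS):
    """Ratio of uncompressed mono PCM bitrate to the codec bitrate."""
    return sample_rate * bit_depth / bitrate_bps


print(bits_per_frame())         # 48.0 bits per frame
print(bits_per_codebook())      # 12.0 bits per codebook index
print(pcm_compression_ratio())  # 588.0 times smaller than 16-bit PCM
```

Twelve bits per index would allow at most 2^12 = 4096 entries per codebook; the actual size may be somewhat smaller, since the stated bitrate is approximate.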
Note: Processing happens locally. Larger files will take longer. If the input is stereo, only the first channel is processed.