wav442letter

Fully convolutional speech-to-text model based on Facebook's Wav2Letter. Developed alongside Andrew Schallwig and Matt Palazzolo for EECS 442 at the University of Michigan.

The original paper can be found here.

Our results are summarized below, with Facebook's original results on the left and ours on the right. Our goal was to replicate Facebook's results with far fewer computational resources. We did not match them, but we achieved a reasonable approximation given that we used only 0.3% of the training data and 30% of the trainable parameters of the original model.

The model was built in PyTorch and trained on the dev-clean subset of the LibriSpeech ASR corpus, available here.
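
For readers curious how a trainable-parameter count like the one quoted above is obtained, here is a minimal sketch that counts the parameters of a stack of 1D convolutions in plain Python. The helper names and the layer sizes are hypothetical and chosen only for illustration; they are not the dimensions of our model or of Wav2Letter.

```python
def conv1d_param_count(in_channels, out_channels, kernel_size, bias=True):
    """Trainable parameters in one 1D convolution layer."""
    weights = in_channels * out_channels * kernel_size
    return weights + (out_channels if bias else 0)


def stack_param_count(layers):
    """Sum the parameters of a list of (in, out, kernel) conv layers."""
    return sum(conv1d_param_count(i, o, k) for i, o, k in layers)


# Hypothetical layer sizes, for illustration only (not our model's).
example_layers = [
    (13, 64, 11),  # input features -> 64 channels
    (64, 64, 7),   # hidden layer
    (64, 29, 1),   # per-frame scores over a 29-symbol alphabet
]

total = stack_param_count(example_layers)
print(total)
```

For an actual PyTorch model, the same number can be read directly with `sum(p.numel() for p in model.parameters() if p.requires_grad)`.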
