Thanks for this meaningful repo. I have a question.
My transcription results are listed below. How can I make the model output numbers as English words instead of digits? The digits lead to a higher WER.
{"truth": "ninety five lines and no more thats it", "hypo": " 95 lines and no more thats it", "wer": 0.25}
{"truth": "my grandmother has type one diabetes", "hypo": " my grandmother has type 1 diabetes", "wer": 0.16666666666666666}
{"truth": "ford is approximately two hundred years old as supported by the books", "hypo": " ford is approximately 200 years old as supported by the books", "wer": 0.16666666666666666}
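To make the scores above concrete, here is a minimal sketch of a word-level WER computation (word edit distance divided by the number of reference words). It assumes whitespace tokenization and no text normalization; the function name `wer` is only illustrative and is not taken from the repo.

```python
def wer(truth: str, hypo: str) -> float:
    """Word error rate: word-level edit distance / number of reference words."""
    ref = truth.split()
    hyp = hypo.split()
    if not ref:
        raise ValueError("reference transcript is empty")
    # prev[j] holds the edit distance between the processed part of ref and hyp[:j].
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,         # deletion of a reference word
                curr[j - 1] + 1,     # insertion of a hypothesis word
                prev[j - 1] + cost,  # match or substitution
            ))
        prev = curr
    return prev[-1] / len(ref)


score = wer("ninety five lines and no more thats it",
            " 95 lines and no more thats it")
print(score)
```

In the first example, "ninety five" against "95" costs one substitution plus one deletion, so 2 errors over 8 reference words gives 0.25.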
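from faster_whisper.tokenizer import Tokenizer


def get_suppress_tokens(model) -> list[int]:
    """Get list of all tokens made up only of numeric characters.

    Store this list in the `suppress_tokens` field in whisper parameters.

    Returns:
        list[int]
            List of all tokens made up only of numeric characters.
    """
    tokenizer = Tokenizer(
        tokenizer=model.hf_tokenizer,
        task="transcribe",
        language="en",
        multilingual=True,
    )
    number_tokens = []
    for i in range(tokenizer.eot):
        text = tokenizer.decode([i]).strip()
        # Skip tokens that decode to nothing; all() is True for an empty string.
        if text and all(c in "0123456789" for c in text):
            number_tokens.append(i)
    # -1 keeps the default set of suppressed symbols from the model config.
    suppress_tokens = [-1] + number_tokens
    return suppress_tokens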
Here model is an instance of faster_whisper.WhisperModel.
Pass the returned list as the suppress_tokens argument to model.transcribe().
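The filtering step can be tried without loading a model. The sketch below applies the same digits-only rule to a made-up vocabulary; `digit_only_token_ids`, `toy_vocab`, and the `decode` callable are hypothetical stand-ins for illustration, not part of faster_whisper.

```python
from typing import Callable


def digit_only_token_ids(decode: Callable[[int], str], vocab_size: int) -> list[int]:
    """Return the ids whose decoded text, after stripping, is digits only."""
    ids = []
    for token_id in range(vocab_size):
        text = decode(token_id).strip()
        # The non-empty check matters: all() over an empty string is True,
        # which would otherwise mark whitespace-only tokens for suppression.
        if text and all(c in "0123456789" for c in text):
            ids.append(token_id)
    return ids


# Made-up vocabulary standing in for the real tokenizer (assumption).
toy_vocab = ["the", " 95", " ", "1", " type", "200", "1st", ""]

# -1 is kept at the front, as in the function above.
suppress_tokens = [-1] + digit_only_token_ids(lambda i: toy_vocab[i], len(toy_vocab))
print(suppress_tokens)
```

With a real model, the decode callable would wrap tokenizer.decode([i]) and the vocabulary size would be tokenizer.eot, as in the function above. Note that mixed tokens such as "1st" are not suppressed by this rule, so some digits may still appear in the output.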