[Whisper] Kspon Valid --- (2) CER
·
👾 Deep Learning
Robust Speech Recognition via Large-Scale Weak Supervision. *As of January 2023, the large model is identical to large-v2. The KsponSpeech data consists mainly of audio from short utterances. Whisper uses 99 language tokens to predict the language of the opening speech (language identification). When an utterance is too short, however, Whisper can predict a different language, so the output itself comes out wrong and the CER increases. Setting the language option to Korean skips language identification and goes straight to transcription, which gave better results. Model size ..
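To make the CER effect concrete, here is a minimal sketch, not the code used for the post. `cer` computes character error rate as edit distance divided by reference length, and removing spaces before scoring is an assumption, since the excerpt does not say how the text was normalized. `transcribe_korean` is a hypothetical helper that pins the language with the openai-whisper package's `transcribe(..., language="ko")`, so language identification is skipped. It imports `whisper` lazily, so the CER part runs without the package installed.

```python
def edit_distance(ref: str, hyp: str) -> int:
    """Levenshtein distance between two character sequences."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def cer(ref: str, hyp: str) -> float:
    """Character error rate = edit distance / reference length.

    Assumption: spaces are removed before scoring.
    """
    ref = ref.replace(" ", "")
    hyp = hyp.replace(" ", "")
    if not ref:
        raise ValueError("reference must not be empty")
    return edit_distance(ref, hyp) / len(ref)


def transcribe_korean(audio_path: str, model_name: str = "large-v2") -> str:
    """Hypothetical helper: transcribe with the language fixed to Korean."""
    import whisper  # openai-whisper package, imported only when called

    model = whisper.load_model(model_name)
    # Passing language="ko" skips Whisper's language identification step.
    result = model.transcribe(audio_path, language="ko", task="transcribe")
    return result["text"].strip()


if __name__ == "__main__":
    # A one-character reference decoded in the wrong language: CER exceeds 1.
    print(cer("네", "yes"))
    # One wrong character out of five.
    print(cer("안녕하세요", "안녕하세여"))
```

The first call shows why short utterances hurt so much: with a one-character reference, a hypothesis in the wrong language costs several edits against a denominator of 1.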
다했다