
https://aihub.or.kr/aihubdata/data/view.do?currMenu=115&topMenu=100&dataSetSn=123 

 

AI-Hub (aihub.or.kr). Field: Korean. Type: audio, text. Updated: 2023-02. Built: 2018.

 

Evaluating Whisper on Korean Speech (KsponSpeech dataset)

I used the eval_clean split of KsponSpeech to put together performance metrics for Whisper.

 

Model used (the large model; more experiments to follow)

 

The first experiment used the large model and computed the character error rate (CER) without fixing the language to Korean, leaving language identification on automatic.

 

The result was a CER of 0.42, a surprising level of performance.
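For reference, CER can be computed as the character-level edit distance divided by the reference length. The sketch below is a minimal illustration, not the script used for the 0.42 figure; the function names and the choice to ignore spaces are my assumptions, and it expects both strings to be in the same Unicode normalization form (for example NFC).

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences, keeping one DP row at a time."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def cer(reference, hypothesis, ignore_spaces=True):
    """Character error rate: edit distance / number of reference characters.

    ignore_spaces=True strips all whitespace first (an assumption; some
    evaluations keep spaces as characters).
    """
    if ignore_spaces:
        reference = "".join(reference.split())
        hypothesis = "".join(hypothesis.split())
    if not reference:
        raise ValueError("reference must not be empty")
    return edit_distance(reference, hypothesis) / len(reference)
```

Because insertions are counted, CER can exceed 1.0 when the hypothesis is much longer than the reference or is written in a different script.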

Before transcribing, Whisper runs language identification, choosing among its 99 language tokens. For short utterances this step often fails, and the output then comes out in a different language.
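One way to avoid this is to fix the language instead of relying on automatic detection. The sketch below assumes the openai-whisper package, whose `transcribe` method accepts a `language` option; the helper name and the file path `sample.wav` are hypothetical.

```python
def transcribe_korean(model, audio_path):
    """Transcribe one file with the language fixed to Korean.

    `model` is expected to be an openai-whisper model, e.g. the result of
    whisper.load_model("large"). Passing language="ko" means the model does
    not have to guess the language from the audio.
    """
    result = model.transcribe(audio_path, language="ko", task="transcribe")
    return result["text"].strip()


def main():
    # Imported here so the helper above can be used without the package loaded.
    import whisper  # assumes: pip install openai-whisper

    model = whisper.load_model("large")
    print(transcribe_korean(model, "sample.wav"))  # hypothetical path
```

Calling `main()` loads the large model and prints the transcript for one file; looping over the eval_clean files with the same helper would give the hypotheses for a language-fixed run.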

 

 Reference: 어디서? ("Where?") / Whisper output: On y sonne. (French)

 Reference: 그때 릴리아가 뭐 어떻게 했나? ("So what did Lilia do then?") / Whisper output: هل تريد أن تعمل معي؟ (Arabic)
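Outputs like these can be flagged automatically, since a Korean transcript should contain at least one Hangul syllable. The following is a rough sketch with hypothetical helper names; it only checks the Hangul Syllables block (U+AC00 to U+D7A3), so an utterance consisting solely of digits or Latin text would also be flagged.

```python
def contains_hangul(text):
    """True if the text has at least one precomposed Hangul syllable."""
    return any("\uac00" <= ch <= "\ud7a3" for ch in text)


def flag_wrong_language(hypotheses):
    """Return the indices of hypotheses that contain no Hangul at all."""
    return [i for i, hyp in enumerate(hypotheses) if not contains_hangul(hyp)]


outputs = ["어디서?", "On y sonne.", "هل تريد أن تعمل معي؟"]
print(flag_wrong_language(outputs))
```

Counting the flagged items separately shows how much of the overall CER comes from language identification failures rather than from ordinary recognition errors.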

Whisper is benchmarked on Word Error Rate (WER), so I expect a WER comparison to give better results.
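WER is the same edit-distance ratio taken over words instead of characters. The sketch below is a minimal version that treats whitespace-separated tokens as words, which is my assumption for Korean text; the function names are hypothetical.

```python
def token_edit_distance(ref_tokens, hyp_tokens):
    """Levenshtein distance over token lists."""
    prev = list(range(len(hyp_tokens) + 1))
    for i, r in enumerate(ref_tokens, start=1):
        curr = [i]
        for j, h in enumerate(hyp_tokens, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def wer(reference, hypothesis):
    """Word error rate: word-level edit distance / number of reference words."""
    ref_words = reference.split()
    hyp_words = hypothesis.split()
    if not ref_words:
        raise ValueError("reference must contain at least one word")
    return token_edit_distance(ref_words, hyp_words) / len(ref_words)
```

With this definition a difference in spacing alone counts as a word error, so the spacing convention of the reference transcripts affects the score.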

 

Model Size    CER     WER
small         -       -
medium        -       -
large         0.42    -
small-ko      -       -
medium-ko     -       -
large-ko      -       -

  

 

 

 

 

 
