
Auto-Regressive Large Language Models (AR-LLMs)

  • ํ•˜๋‚˜์˜ ํ…์ŠคํŠธ ํ† ํฐ ๋‹ค์Œ์— ๋‹ค๋ฅธ ํ† ํฐ์„ ์ถœ๋ ฅ
  • ํ† ํฐ์€ ๋‹จ์–ด๋‚˜ ํ•˜์œ„๋‹จ์–ด๋ฅผ ๋‚˜ํƒ€๋ƒ„
  • ์ธ์ฝ”๋”/์˜ˆ์ธก๊ธฐ๋Š” ์ˆ˜์‹ญ์–ต ๊ฐœ์˜ ๋งค๊ฐœ๋ณ€์ˆ˜๋ฅผ ๊ฐ€์ง„ ํŠธ๋žœ์Šคํฌ๋จธ ์•„ํ‚คํ…์ฒ˜
    • ์ผ๋ฐ˜์ ์œผ๋กœ 10์–ต ~ 5,000์–ต ๊ฐœ์˜ ๋งค๊ฐœ๋ณ€์ˆ˜
    • ํ›ˆ๋ จ ๋ฐ์ดํ„ฐ: 1์กฐ ~ 2์กฐ ๊ฐœ์˜ ํ† ํฐ ์‚ฌ์šฉ
  • ๋Œ€ํ™”/ํ…์ŠคํŠธ ์ƒ์„ฑ LLM ์ข…๋ฅ˜
    • Open Source : BlenderBot, Galactica, LlaMa, Llama-2, Code Llama (FAIR), Mistral-7B, Mixtral-4x7 B (Mistral), Falcon (UAE), Alpaca (Stanford), Yi (01.AI), OLMo (AI2), Gemma (Google)
    • Proprietary : Meta AI (Meta), LaMDA/Bard, Gemini (Google), ChatGPT (OpenAI)

Performance and Limitations

  • Performance is impressive, but they make stupid mistakes (factual errors, logical errors, inconsistency)
    • LLMs have limited reasoning ability, limited knowledge of the underlying reality, no persistent memory, and no ability to plan their answers

Llama-2: https://ai.meta.com/llama/

(A brief introduction to Meta's LLaMA)


SeamlessM4T

(Meta์˜ ์ƒˆ๋กœ์šด STT(Speech-to-text) ๋ชจ๋ธ, ๊ฐ€๋Šฅํ•œ task(speech-to-speech translation, speech-to-text translation, text-to-text translation, speech recognition))

  • 100๊ฐ€์ง€ ์–ธ์–ด ์Œ์„ฑ, text ํ•™์Šต
  • 100๊ฐ€์ง€ ์–ธ์–ด text output
  • 35๊ฐœ ๊ตญ์–ด ์Œ์„ฑ output
  • ์‹ค์‹œ๊ฐ„ ์Œ์„ฑ๊ณผ ํ‘œํ˜„ ๊ฐ€๋Šฅ

https://github.com/facebookresearch/seamless_communication

Auto-Regressive Generative Models Suck!

(Auto-regressive generative models are no good!)

  • Auto-regressive generative models were doomed from the start (and are extremely resource-hungry)
  • They cannot be controlled (to be non-toxic, factual, etc.)
  • Each generated token has some probability e (error) of drifting outside the set of correct answers
    • If P is the probability that an answer of length n is correct, then P = (1-e)^n, which decays exponentially with n
    • The longer the answer, the lower its quality (arXiv:2305.18654)

Limitations of LLMs: no planning!

(LLM์€ ๋ฌด๊ณ„ํš์ด๋‹ค. ์ถ”๋ก  ๋Šฅ๋ ฅ 0)

  • ๋‡Œ์˜ Wernike(๋ฒ ๋ฅด๋‹ˆ์ผ€)์™€ Broca(๋ธŒ๋กœ์นด) ์˜์—ญ์„ ๋ชจ๋ฐฉํ•  ๋ฟ (๋Œ€๋‡Œ ํ”ผ์งˆ์˜ ๋‘ ๋ถ€๋ถ„ ์ค‘ ํ•˜๋‚˜ ์ฃผ๋กœ ์–ธ์–ด ์ƒ์„ฑ์— ๊ด€์—ฌํ•˜๋Š” ๋ธŒ๋กœ์นด, ๋ฌธ์ž ๋ฐ ์Œ์„ฑ ์–ธ์–ด ์ดํ•ด๋Š” ๋ฒ ๋ฅด๋‹ˆ์ผ€)
  • ํ˜„์žฌ์˜ LLM์€ ๋‘ ๋ถ€๋ถ„์˜ ์˜์—ญ์„ ์ œ๋Œ€๋กœ ๋ชจ๋ธ๋งํ•˜์ง€ ๋ชปํ•ด planning์ด ๋ถˆ๊ฐ€๋Šฅ

๊ทธ๋Ÿผ์—๋„ ์ข‹์€ ์ 

  • ์ž‘๋ฌธ ๋„์šฐ๋ฏธ, ์ดˆ์•ˆ ์ƒ์„ฑ, ์Šคํƒ€์ผ ๋‹ค๋“ฌ๊ธฐ, ์ฝ”๋“œ ์ƒ์„ฑ

์•ˆ ์ข‹์€ ์ 

  • ์‚ฌ์‹ค์— ๊ด€ํ•œ ์ผ๊ด€๋œ ๋‹ต๋ณ€ ์ƒ์„ฑ (hallucinations!)
  • ์ตœ์‹  ์ •๋ณด ๋ฐ˜์˜(๋งˆ์ง€๋ง‰ ํ•™์Šต ์ดํ›„ ์ •๋ณด ๋‹ต๋ณ€ ๋ถˆ๊ฐ€)
  • ์ ์ ˆํ•œ ํ–‰๋™(ํ›ˆ๋ จ ๋ฐ์ดํ„ฐ์…‹์—์„œ ๋ชจ๋ฐฉ)
  • reasoning, planning, math
  • ๊ฒ€์ƒ‰ ์—”์ง„, ๊ณ„์‚ฐ๊ธฐ, ๋ฐ์ดํ„ฐ ๋ฒ ์ด์Šค ์ฟผ๋ฆฌ ๋“ฑ์˜ ๋„๊ตฌ๋กœ ์‚ฌ์šฉ

We assume that an LLM understands us and will give us correct guidance, but LLMs do not understand the world at all.

 

https://www.bing.com/images/create/a-man-is-solving-a-problem-inside-a-room2c-and-anot/1-66165e67d0a54033ae24b5cf8f3c7ab7?id=I2%2bgKaAyjcWhiX5w0efoCg%3d%3d&view=detailv2&idpp=genimg&thId=OIG4.ovF_1zf58tUVcPVbSZLN&FORM=GCRIDP&mode=overlay

 


  

์ •๋ฆฌํ•˜์ž๋ฉด ์ง€๊ธˆ์˜ LLM์€ ์ •์ ์ด๋‹ค. ์ƒˆ๋กœ์šด ์ง€์‹์˜ ์Šต๋“์ด ๋ฐ”๋กœ๋ฐ”๋กœ ์ด๋ฃจ์–ด์งˆ ์ˆ˜ ์—†๊ณ  reasoning(์ดํ•ด), planning(๊ณ„ํš)์ด ๋ถˆ๊ฐ€๋Šฅํ•ด ํ•ญ์ƒ ์˜ณ์€ ๋‹ต๋ณ€์„ ํ•˜์ง€ ๋ชปํ•œ๋‹ค. ๋”ฐ๋ผ์„œ ์ž์—ฐ์–ด ์ฒ˜๋ฆฌ ๋ชจ๋ธ์ด ์™„๋ฒฝํ•ด์ง€๋ ค๋ฉด  (1)์—์„œ ์„ค๋ช…ํ•œ ์„ธ์ƒ์„ ๋ฐฐ์›Œ์•ผ ํ•œ๋‹ค. ๊ทธ๋Ÿฌ๋ ค๋ฉด Auto-Regressive ๋ฐฉ์‹์—์„œ ํƒˆํ”ผํ•˜๊ณ  ์‚ฌ๋žŒ์˜ ์ž์—ฐ์–ด ์ฒ˜๋ฆฌ ๊ธฐ๊ด€์ธ (์Œ์„ฑ, ํ…์ŠคํŠธ) Broca, Wernicke์˜ ๊ตฌ์กฐ์™€ ๊ฐ™์ด ๋‘ ๊ฐ€์ง€ ๊ฐ๊ฐ ๊ธฐ๊ด€์„ ๋ชจ๋‘ ์ˆ˜์šฉํ•  ์ˆ˜ ์žˆ๋Š” ๋ชจ๋ธ์ด ํ•„์š”.

๋ฐ˜์‘ํ˜•
๋‹คํ–ˆ๋‹ค