LIMA: Less Is More for Alignment
🗣️ Natural Language Processing
Large language models are trained in two stages: (1) unsupervised pretraining on raw text to learn general-purpose representations, and (2) large-scale instruction tuning and reinforcement learning to model human preferences.

[Experiment]
Curated 1,000 real user prompts paired with high-quality responses as the fine-tuning data:
- 750 questions and answers selected from community forums (Stack Exchange, wikiHow)
- an additional 250 questions and answers written manually (alignment style)
Fine-tuned the LLaMA [Touvron et al., 2023] 65B parameter model on these examples; rough sketches of the data assembly and fine-tuning steps follow below. [Resu..
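As a rough illustration of how such a small fine-tuning set could be assembled, here is a minimal Python sketch. The file names (`stack_exchange.jsonl`, `wikihow.jsonl`, `manual.jsonl`) and the `prompt`/`response` record schema are assumptions for illustration, not the paper's actual pipeline.

```python
import json
import random

def load_jsonl(path):
    # One JSON object per line; each record is assumed to have
    # "prompt" and "response" fields (hypothetical schema).
    with open(path, encoding="utf-8") as f:
        return [json.loads(line) for line in f]

# Hypothetical source files standing in for the curated sources.
community = load_jsonl("stack_exchange.jsonl") + load_jsonl("wikihow.jsonl")
manual = load_jsonl("manual.jsonl")  # the 250 hand-written examples

random.seed(0)
training_set = random.sample(community, 750) + manual  # 750 + 250 = 1,000
assert len(training_set) == 1_000
```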
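And a minimal sketch of the fine-tuning step itself, reusing `training_set` from the sketch above, with Hugging Face Transformers. The checkpoint id `huggyllama/llama-65b`, the dataset wrapper, and the single-device setup are illustrative assumptions (a 65B model actually requires a multi-GPU setup omitted here); the 15 epochs and 1e-5 learning rate match the paper's reported settings.

```python
import torch
from torch.utils.data import Dataset
from transformers import (AutoModelForCausalLM, AutoTokenizer,
                          Trainer, TrainingArguments)

class PromptResponseDataset(Dataset):
    """Wraps prompt/response pairs for causal-LM supervised fine-tuning."""
    def __init__(self, examples, tokenizer, max_len=2048):
        self.examples, self.tok, self.max_len = examples, tokenizer, max_len

    def __len__(self):
        return len(self.examples)

    def __getitem__(self, i):
        ex = self.examples[i]
        text = ex["prompt"] + "\n" + ex["response"] + self.tok.eos_token
        ids = self.tok(text, truncation=True, max_length=self.max_len,
                       return_tensors="pt").input_ids[0]
        # Standard causal-LM objective: labels are the inputs themselves.
        return {"input_ids": ids, "labels": ids.clone()}

model_id = "huggyllama/llama-65b"  # assumed checkpoint id
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id,
                                             torch_dtype=torch.bfloat16)

args = TrainingArguments(output_dir="lima-sft",
                         num_train_epochs=15,        # as reported in the paper
                         per_device_train_batch_size=1,
                         learning_rate=1e-5)
Trainer(model=model, args=args,
        train_dataset=PromptResponseDataset(training_set, tokenizer)).train()
```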
๋‹คํ–ˆ๋‹ค