Text Embedding + t-SNE Visualization
·
🗣️ Natural Language Processing
https://betterprogramming.pub/openais-embedding-model-with-vector-database-b69014f04433 OpenAI’s Embedding Model With Vector Database The updated Embedding model offers State-of-the-Art performance with 4x longer context window. The new model is 90% cheaper. The smaller… betterprogramming.pub Introduction In December 2022, OpenAI updated its embedding model to text-embedding-ada-002. The new model offers: 90%-99.8% lower cost, 1/8..
[Langchain] paper-translator
·
🗣️ Natural Language Processing
https://github.com/seohyunjun/paper-translator GitHub - seohyunjun/paper-translator: pdf paper translator pdf paper translator. Contribute to seohyunjun/paper-translator development by creating an account on GitHub. github.com Version History v0.1.2 2023/6/15 ChatGPT API update: gpt-3.5-turbo-16k, token limit 4k -> 16k (covers about 3 pages per request) https://openai.com/blog/function-calling-and-ot..
LIMA : Less is More for Alignment
·
🗣️ Natural Language Processing
Comparing large language model training as a two-stage process: (1) learning general-purpose conversational text through unsupervised learning on raw text; (2) human preference modeling through large-scale instruction tuning and reinforcement learning. [Experiment] For the experiment, 1,000 real user prompts and high-quality responses were curated: 750 questions and answers were selected from community forums (Stack Exchange, wikiHow), and an additional 250 questions and answers were written by hand (alignment style). Fine-tuned a LLaMA [Touvron et al., 2023] 65B-parameter model. [Resu..
paper-translator test (LIMA: Less Is More for Alignment)
·
🗣️ Natural Language Processing
Test paper alignment Markdown format translate LIMA: Less Is More for Alignment Chunting Zhou, Pengfei Liu, Puxin Xu, Srini Iyer, Jiao Sun, Yuning Mao, Xuezhe Ma, Avia Efrat, Ping Yu, Lili Yu, Susan Zhang, Gargi Ghosh, Mike Lewis, Luke Zettlemoyer, Omer Levy Meta AI, Carnegie Mellon University, University of Southern California, Tel Aviv University Abstract Large language models are trained in two stages: (1) unsupervised pretraining from raw text to learn general-purpose representations, and (2) through large-scale instruction tuning and reinforcement learning, end tasks and user..
[Langchain] Paper-Translator
·
🗣️ Natural Language Processing
https://github.com/seohyunjun/paper-translator GitHub - seohyunjun/paper-translator: pdf paper translator pdf paper translator. Contribute to seohyunjun/paper-translator development by creating an account on GitHub. github.com [paper] https://arxiv.org/abs/2304.06035 Choose Your Weapon: Survival Strategies for Depressed AI Academics Are you an AI researcher at an academic institution? Are you an..
[LangChain] Sentence-Transformer
·
🗣️ Natural Language Processing
https://www.sbert.net/docs/pretrained_models.html Pretrained Models - Sentence-Transformers documentation We provide various pre-trained models. Using these models is easy: Multi-Lingual Models The following models generate aligned vector spaces, i.e., similar inputs in different languages are mapped close in vector space. You do not need to specify the input www.sbert.net That can be used together with LangChain..
[OpenAI API] OpenAI Token
·
🗣️ Natural Language Processing
https://github.com/seohyunjun/openAI_API_token GitHub - seohyunjun/openAI_API_token: openAI API token information openAI API token information. Contribute to seohyunjun/openAI_API_token development by creating an account on GitHub. github.com
[LangChain] No using OpenAI API RetrievalQA
·
🗣️ Natural Language Processing
LangChain No using OpenAI API
(1) Load documents for QA
# Load and process the text files
# loader = TextLoader("./data/texts")
loader = DirectoryLoader('./pdf/', glob="./*.pdf", loader_cls=PyPDFLoader)
documents = loader.load()
# Split the documents into chunks
text_splitter = RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=200)
texts = text_splitter.split_documents(documents)
(2) Embedding
# HuggingF..
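The excerpt above passes chunk_size=1000 and chunk_overlap=200 to RecursiveCharacterTextSplitter. As a rough illustration of what those two numbers mean, here is a minimal plain-Python sketch of fixed-size character windows with overlap. The helper name chunk_with_overlap is hypothetical and this is not LangChain's algorithm (which, as far as I understand, splits recursively on separators before merging pieces); it only shows how consecutive chunks can share their boundary text.

```python
def chunk_with_overlap(text: str, chunk_size: int = 1000, chunk_overlap: int = 200) -> list[str]:
    """Split text into fixed-size character windows that share chunk_overlap characters.

    Hypothetical helper for illustration only; not part of LangChain.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be in [0, chunk_size)")

    step = chunk_size - chunk_overlap  # how far each new window advances
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # this window already reaches the end of the text
    return chunks


if __name__ == "__main__":
    sample = "x" * 2500
    pieces = chunk_with_overlap(sample)
    # Each window starts 800 characters after the previous one.
    print([len(p) for p in pieces])
```

With the defaults, each window begins 800 characters after the previous one, so the last 200 characters of one chunk reappear at the start of the next. That overlap is what keeps a sentence that straddles a boundary retrievable from at least one chunk.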
다했다
Posts in the '🗣️ Natural Language Processing' category (Page 2)