[NLP] Word Embedding - Word2Vec

2023. 3. 27. 21:04
๐Ÿง‘๐Ÿป‍๐Ÿ’ป ์ฃผ์š” ์ •๋ฆฌ
 
NLP
Word Embedding
Word2Vec

 

 

 

Word2Vec

 

Word2Vec training focuses on the fact that frequently occurring words are learned over and over.

 

-> ์ž์ฃผ ๋“ฑ์žฅํ•˜๋Š” ๋‹จ์–ด๋Š” ๋” ๋†’์€ ๊ฐ€๋Šฅ์„ฑ์œผ๋กœ ์—…๋ฐ์ดํŠธ๊ฐ€ ์ด๋ฃจ์–ด์ง‘๋‹ˆ๋‹ค.-> ๋‹จ์–ด๋“ค์„ ํ™•๋ฅ ๊ณผ ํ•จ๊ฒŒ ์„ค์ •ํ•ฉ๋‹ˆ๋‹ค.

 

 

As the figure above shows, the lower a word's frequency, the smaller its probability of being dropped.

 

Conversely, as a word's frequency rises, the quantity subtracted from 1 shrinks, so its probability of being dropped grows.
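For reference, the drop probability used in the original Word2Vec paper (Mikolov et al., 2013) is P(drop w_i) = 1 - sqrt(t / f(w_i)), where f(w_i) is the word's relative frequency and t is a small threshold (around 10^-5). Below is a minimal sketch of this subsampling step; the function name and the whitespace-tokenized toy corpus are my own illustration, not code from the post.

```python
import math
import random
from collections import Counter

def subsample(tokens, t=1e-5, seed=0):
    """Drop each token with probability 1 - sqrt(t / f(w)),
    where f(w) is the token's relative frequency."""
    rng = random.Random(seed)
    counts = Counter(tokens)
    total = len(tokens)
    kept = []
    for w in tokens:
        f = counts[w] / total                     # relative frequency of w
        p_drop = max(0.0, 1 - math.sqrt(t / f))   # rare words: p_drop near 0
        if rng.random() >= p_drop:                # keep with probability 1 - p_drop
            kept.append(w)
    return kept
```

Frequent words such as "the" are thus discarded often, while rare words are almost always kept.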

 

Negative Sampling

 

Let us look at the skip-gram setup shown below.

 

In the figure below, the center word cat in the sentence produces four surrounding words as outputs (a small sketch of this pair generation follows the figure).

 

 

 

Source: https://wikidocs.net/69141
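A minimal sketch of generating those (center, context) pairs, assuming the tutorial's example sentence "the fat cat sat on the mat" and a window size of 2 (the helper name is my own):

```python
def skipgram_pairs(tokens, window=2):
    """Yield (center, context) pairs within the given window."""
    for i, center in enumerate(tokens):
        lo, hi = max(0, i - window), min(len(tokens), i + window + 1)
        for j in range(lo, hi):
            if j != i:
                yield center, tokens[j]

tokens = "the fat cat sat on the mat".split()
print([ctx for c, ctx in skipgram_pairs(tokens) if c == "cat"])
# ['the', 'fat', 'sat', 'on'] -- the four context words around 'cat'
```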

 

 

 

Here is what negative sampling does:

Source: https://wikidocs.net/69141

 

the model takes as input both words, the one that used to be the input and the one that used to be the output, and emits the probability that the two actually occur close together.

 

The activation function used here is the sigmoid, since the task has become a binary classification of whether the pair are neighbors.

 

This model is called Skip-Gram with Negative Sampling (SGNS).
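For reference, the SGNS training objective from the original paper (Mikolov et al., 2013) maximizes, for each observed center/context pair (c, o):

\log \sigma(u_o^\top v_c) + \sum_{i=1}^{k} \mathbb{E}_{w_i \sim P_n(w)} \left[ \log \sigma(-u_{w_i}^\top v_c) \right]

where v_c and u_o are the center and context vectors, σ is the sigmoid, and the k negative words w_i are drawn from a noise distribution P_n(w).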

 

 

Source: https://wikidocs.net/69141

 

Then the labels are changed as shown on the right: every pair that actually co-occurs within the window is labeled 1.

 

Source: https://wikidocs.net/69141

 

With the true neighbor pairs labeled 1, samples with label 0, words drawn at random from outside the window, are now added as shown above.

 

 

์ด์ œ ์ด ๋ฐ์ดํ„ฐ์…‹์€ ์ž…๋ ฅ1๊ณผ ์ž…๋ ฅ2๊ฐ€ ์‹ค์ œ๋กœ ์œˆ๋„์šฐ ํฌ๊ธฐ ๋‚ด์—์„œ ์ด์›ƒ ๊ด€๊ณ„์ธ ๊ฒฝ์šฐ์—๋Š” ๋ ˆ์ด๋ธ”์ด 1, ์•„๋‹Œ ๊ฒฝ์šฐ์—๋Š” ๋ ˆ์ด๋ธ”์ด 0์ธ ๋ฐ์ดํ„ฐ์…‹์ด ๋ฉ๋‹ˆ๋‹ค.

 

 

Source: https://wikidocs.net/69141

 

The table above shows the resulting negative-sampling dataset.

 

Negative sampling was introduced because running a softmax over the full vocabulary of size V is far too slow; it replaces that computation with a much cheaper one.

 

๋žœ๋คํ•˜๊ฒŒ 5 ~. 5๊ฐœ์˜ negative samples๋“ค์„ ๋ฝ‘์Šต๋‹ˆ๋‹ค.

 

The model then scores only these selected words, computing a probability for each pair, instead of normalizing over the entire vocabulary.
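In the paper, negatives are not drawn uniformly but from the unigram distribution raised to the 3/4 power, which up-weights rarer words. A minimal sketch (the function name is my own):

```python
import random
from collections import Counter

def negative_sampler(tokens, power=0.75, seed=0):
    """Return a draw(k) function that samples negative words from
    the unigram distribution raised to the 3/4 power."""
    rng = random.Random(seed)
    counts = Counter(tokens)
    words = list(counts)
    weights = [counts[w] ** power for w in words]
    return lambda k: rng.choices(words, weights=weights, k=k)

draw = negative_sampler("the fat cat sat on the mat".split())
print(draw(5))  # five negative samples
```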

 

 

 

์œ„์™€ ๊ฐ™์ด Word2Vec๋ฅผ ํ•™์Šต์‹œํ‚ฌ ์ˆ˜ ์žˆ์Šต๋‹ˆ๋‹ค.

 

 

 

 

Word Analogies

 

Now let us see how well vector arithmetic between words can actually retrieve the matching word.

 

 

 

vec("Berlin") - vec("Germany") + vec("France")

What should the result of this expression be? The nearest word vector turns out to be vec("Paris").
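With a model trained on a sufficiently large corpus (the toy corpus above does not contain these words), gensim can answer this query directly:

```python
# Berlin - Germany + France -> nearest neighbors, ideally "Paris"
print(model.wv.most_similar(positive=["Berlin", "France"],
                            negative=["Germany"], topn=3))
```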

 

A similar example is shown in the figure below.

 

 

 

At last, by converting words into vectors, we can extract concrete numbers from them and compute the results we want directly.

 

 

 

Additive Compositionality

 

Additive compositionality means that word vectors can be meaningfully combined by element-wise addition.

 

 

And in terms of the numeric representation, we obtain the following properties (a short example follows the list).

 

  • Word vectors
    • Word vectors are in a linear relationship with the softmax nonlinearity
    • Each vector represents the distribution of contexts in which the word appears
    • Vectors are logarithmically related to probabilities
  • Sums of word vectors
    • Sums of vectors correspond to products of probabilities
    • i.e. to the product of the two context distributions
    • which acts like ANDing together the two words in the sum
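Because the vectors are logarithmically related to probabilities, adding two vectors corresponds to multiplying their context distributions, which is why a sum behaves like an AND of the two words. The paper's example is vec("Vietnam") + vec("capital") landing near vec("Hanoi"); with a gensim model trained on a large corpus this could be queried as:

```python
# Additive compositionality: Vietnam + capital -> ideally "Hanoi"
print(model.wv.most_similar(positive=["Vietnam", "capital"], topn=3))
```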

 
