Key Summary
NLP
Word Embedding
Background
What is an Embedding?
What exactly is an embedding?
To compute with an input sentence, which is a sequence of words, we first have to turn it into vectors.
This is exactly what word embedding provides.
In other words, it represents text as numbers.
When we process an image with a CNN, the input is a tensor of roughly [28 x 28 x 3 x 256].
The hidden layers then extract a hidden representation from it.
That is, they distill a meaningful representation out of the raw vector values.

Embedding works the same way.
As above, we convert the values we feed in as input into vectors and compute with those.
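As a quick illustration of this idea, here is a minimal sketch, assuming PyTorch is available (the post itself does not name a library): each word gets an integer index, and a learnable lookup table maps that index to a dense vector.

```python
# Minimal sketch (assumption: PyTorch; the post names no specific library).
# A word is mapped to an index, and nn.Embedding turns the index into a dense vector.
import torch
import torch.nn as nn

vocab = {"<pad>": 0, "king": 1, "queen": 2, "man": 3, "woman": 4}
embedding = nn.Embedding(num_embeddings=len(vocab), embedding_dim=8)

# A toy "sentence" as a sequence of vocabulary indices.
sentence = torch.tensor([[vocab["king"], vocab["man"], vocab["woman"]]])
vectors = embedding(sentence)   # shape: (1, 3, 8) -> one dense vector per token
print(vectors.shape)
```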

Let's look at the figure above.
Words are mapped into a 3-dimensional space, as shown.
Once words are converted into embeddings (representations), we find that arithmetic such as
vector[Queen] ≈ vector[King] - vector[Man] + vector[Woman] becomes possible.
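As a toy illustration (the vectors below are made up by hand, not trained embeddings), the analogy can be checked by building vector[King] - vector[Man] + vector[Woman] and finding the closest remaining word by cosine similarity:

```python
# Toy illustration of the king - man + woman ≈ queen analogy (hand-made vectors).
import numpy as np

vecs = {
    "king":  np.array([0.9, 0.8, 0.1]),
    "queen": np.array([0.9, 0.1, 0.8]),
    "man":   np.array([0.1, 0.9, 0.1]),
    "woman": np.array([0.1, 0.1, 0.9]),
}

target = vecs["king"] - vecs["man"] + vecs["woman"]

def cosine(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

# The closest remaining word to the analogy vector should be "queen".
best = max((w for w in vecs if w != "king"), key=lambda w: cosine(target, vecs[w]))
print(best)
```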
Shall we look at this in a bit more detail?
NLP is applied broadly across the following areas, each with its own classic methods:
1. Word Similarity
Classic Methods : Edit Distance, WordNet, Porter's Stemmer, Lemmatization using dictionaries
- Easily identifies similar words and synonyms since they occur in similar contexts.
- Stemming (thought -> think)
- Inflections, Tense forms
- e.g. think, thought / ponder, pondering
- Plane, Aircraft, Flight
2. Machine Translation
Classic Methods : Rule-based machine translation, morphological transformation
3. Part-of-Speech and Named Entity Recognition
Classic Methods : Sequential Models (MEMM, Conditional Random Fields), Logistic Regression
4. Relation Extraction
Classic Methods : OpenIE, Linear programming models, Bootstrapping
5. Sentiment Analysis
Classic Methods : Naive Bayes, Random Forests/SVM
- Classifying sentences as positive and negative
- Building sentiment lexicons using seed sentiment sets
- No need for classifiers; we can just use cosine distance to compare unseen reviews to known reviews.
-> The comparison is computed from the distance between word vectors.
=> L1 distance gets smaller the more similar the vectors are; L2 is the Euclidean distance; cosine distance measures similarity by the angle between vectors, so similar vectors have a small angular difference (see the sketch after this list).
6. Co-reference Resolution
- Chaining entity mentions across multiple documents
- Can we find and unify the multiple contexts in which a mention occurs?
7. Clustering
- Words in the same class naturally occur in similar contexts, and this feature vector can directly be used with any conventional clustering algorithms (K-Means, agglomerative, etc..).
- Humans don't have to waste time hand-picking useful word features to cluster on.
8. Semantic Analysis of Documents
- Build word distributions for various topics, etc.
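Tying back to the distance note under Sentiment Analysis, here is a minimal sketch of the three distances, using made-up toy vectors rather than real embeddings:

```python
# Minimal sketch of L1, L2, and cosine distance between two toy vectors.
import numpy as np

a = np.array([1.0, 2.0, 3.0])
b = np.array([1.5, 1.8, 3.2])

l1  = np.sum(np.abs(a - b))        # Manhattan distance: smaller when vectors are closer
l2  = np.linalg.norm(a - b)        # Euclidean distance
cos = 1.0 - (a @ b) / (np.linalg.norm(a) * np.linalg.norm(b))  # small angle -> small distance

print(l1, l2, cos)
```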
How do we build vectors this powerful?
-> Similar words should have similar representations!
Lower-dimensional vector representations for words, based on their context:
- Co-occurrence Matrix with SVD
- Word2Vec
- Global Vector Representations (Glove)
- Paragraph Vectors
Co-occurrence Matrix with Singular Value Decomposition
For the data, we record how often each word co-occurs with every other word.
This process yields a co-occurrence table.
Then, among the vector's dimensions, we keep only the meaningful ones.
This step is called dimensionality reduction using SVD.
SVD reduces the dimensionality of the vectors.
Because of the heavy computation involved, this approach is not used much in practice.
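For illustration only, here is a minimal sketch of the procedure on a toy corpus: build a word-word co-occurrence matrix with a window of one word on each side, then keep the top singular directions as low-dimensional word vectors.

```python
# Minimal sketch: co-occurrence matrix + truncated SVD on a toy corpus.
import numpy as np

corpus = [["i", "like", "deep", "learning"],
          ["i", "like", "nlp"],
          ["i", "enjoy", "flying"]]

vocab = sorted({w for sent in corpus for w in sent})
idx = {w: i for i, w in enumerate(vocab)}

# Count co-occurrences within a window of 1 word on each side.
X = np.zeros((len(vocab), len(vocab)))
for sent in corpus:
    for i, w in enumerate(sent):
        for j in range(max(0, i - 1), min(len(sent), i + 2)):
            if i != j:
                X[idx[w], idx[sent[j]]] += 1

# Keep only the top-k singular directions as low-dimensional word vectors.
U, S, Vt = np.linalg.svd(X)
k = 2
word_vectors = U[:, :k] * S[:k]
print(word_vectors.shape)   # (vocab_size, k)
```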
Word2Vec
- Represent each word with a low-dimensional vector
- Word similarity = vector similarity
- Key idea : Predict surrounding words of every word
- Faster and can easily incorporate a new sentence/document or add a word to the vocabulary
Representing the meaning of a word
- Two basic neural network models:
- Continuous Bag of Words (CBOW): use a window of surrounding words to predict the middle word.
- Skip-gram (SG): use a word to predict the surrounding words in its window.

The two models differ as illustrated above.
We will look at these differences in detail in the upcoming posts.
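As a quick preview, here is a minimal sketch of training both variants with gensim (an assumption on my part; the post does not name a library, and this assumes gensim >= 4.0, where sg=0 gives CBOW and sg=1 gives Skip-gram):

```python
# Minimal sketch (assumption: gensim >= 4.0) of the two Word2Vec variants.
from gensim.models import Word2Vec

sentences = [["the", "king", "rules", "the", "kingdom"],
             ["the", "queen", "rules", "the", "kingdom"],
             ["a", "man", "and", "a", "woman", "walk"]]

cbow      = Word2Vec(sentences, vector_size=50, window=2, min_count=1, sg=0)  # CBOW
skip_gram = Word2Vec(sentences, vector_size=50, window=2, min_count=1, sg=1)  # Skip-gram

print(cbow.wv["king"].shape)                      # (50,) dense vector per word
print(skip_gram.wv.most_similar("king", topn=2))  # nearest neighbors by vector similarity
```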