
[NLP] Sequential Data Modeling

Han Jang 2023. 4. 10. 15:21
🧑🏻‍💻 Terminology

Neural Networks
RNN
LSTM
Attention
CNN

 

 

Sequential Data Modeling

 

  • Sequential Data
    • Most data is sequential
    • Speech, text, images, ...
  • Deep Learnings for Sequential Data
    • Convolutional Neural Networks (CNN)
      • Try to find local features from a sequence
    • Recurrent Neural Networks (RNN): LSTM, GRU
      • Try to capture features of the past
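The "feature of the past" that an RNN carries forward can be made concrete with one recurrence step. This is a minimal sketch with illustrative sizes and random weights (none of these names come from the post): the hidden state h summarizes everything seen so far.

```python
import numpy as np

rng = np.random.default_rng(0)
input_dim, hidden_dim = 4, 3
W_xh = rng.normal(size=(hidden_dim, input_dim)) * 0.1  # input-to-hidden weights
W_hh = rng.normal(size=(hidden_dim, hidden_dim)) * 0.1  # hidden-to-hidden weights

def rnn_step(x_t, h_prev):
    """One vanilla RNN step: h_t = tanh(W_xh x_t + W_hh h_{t-1})."""
    return np.tanh(W_xh @ x_t + W_hh @ h_prev)

h = np.zeros(hidden_dim)                 # start with an empty summary of the past
sequence = rng.normal(size=(5, input_dim))  # a toy sequence of length 5
for x_t in sequence:
    h = rnn_step(x_t, h)                 # fold each step into the hidden state
print(h.shape)  # (3,)
```

The same loop underlies LSTM and GRU; they only change how the step function updates h.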

 

 

So far, we have looked at the input side.

 

For each of those inputs, there is also an output.

 

We can use that output however we like.

 

 

So this setup can be used for tasks like the ones below.

 

 

 

We can also solve sequence generation tasks like the one below.

 

Given an input sequence, the model emits an output sequence.

 

 

 

 

We can distinguish the following kinds of tasks.

 

 

  • One to Many
    • Image Captioning
    • Image -> sequence of words
    • A captioning model
  • Many to One
    • Sentiment Classification
    • Sentence -> sentiment
    • A classification model
    • Mostly encoder work: the output is produced from what is given
    • The input is already fully known
  • Many to Many
    • Machine Translation
    • Sentence -> sentence
    • A translation model
    • The decoder feeds its prediction back in as input at every time step
  • Synced Many to Many
    • Stock Price Prediction
    • Prediction of the next word
    • A language model
    • A language model that predicts the very next word at each step
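One way to see these four task types is that the RNN produces an output at every step, and the tasks differ mainly in which outputs are kept. A toy sketch (the names here are illustrative, not from the post):

```python
# One RNN output per time step of the input sequence.
outputs = ["y1", "y2", "y3", "y4"]

# Many to One (e.g. sentiment): only the last output is used.
many_to_one = outputs[-1]

# Synced Many to Many (e.g. language model, stock prediction):
# every step's output is used, aligned with the input steps.
synced_many_to_many = outputs

# Many to Many (e.g. translation) is not synced: the model first reads
# the whole input, then starts emitting outputs afterward.
print(many_to_one, synced_many_to_many)
```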

 

 

 

๋‹ค์Œ๊ณผ ๊ฐ™์€ ๋ชจ๋ธ์ด ์žˆ๋‹ค๊ณ  ๋ด…์‹œ๋‹ค.

 

๋‹ค์Œ์€ ์–ด๋–ค ๋ชจ๋ธ์ผ๊นŒ์š”?

 

 

It is Many to Many.

 

 

And since we know the ground truth, we use the MSE between the predicted and actual values.

 

We apply the MSE at each time step.
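The per-step loss above can be sketched in a few lines; the numbers are toy values, not from the post. With a target available at every step, we take the squared error per step and average:

```python
import numpy as np

predictions = np.array([0.9, 1.8, 3.2])  # toy per-step model predictions
targets     = np.array([1.0, 2.0, 3.0])  # known ground-truth values

per_step_mse = (predictions - targets) ** 2  # squared error at each time step
loss = per_step_mse.mean()                   # averaged into one training loss
print(round(float(loss), 4))  # 0.03
```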

 

 

 

 

 

Then what do we do at test time?

 

 

During training we know what follows each input, but at test time we do not.

 

We keep predicting, feeding each prediction back in as input, and predicting again.

 

So we can see that training and test have different structures.
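The test-time loop can be sketched as follows. Here `step` is a hypothetical stand-in for the trained model; the point is only the feedback structure, where each prediction becomes the next input:

```python
def step(x):
    """Hypothetical one-step predictor standing in for the trained model."""
    return x + 1

def generate(seed, n_steps):
    """Autoregressive generation: predict, feed the prediction back in, repeat."""
    seq, x = [], seed
    for _ in range(n_steps):
        x = step(x)    # predict from the current input...
        seq.append(x)  # ...and the prediction becomes the next input
    return seq

print(generate(0, 4))  # [1, 2, 3, 4]
```

During training this loop is replaced by feeding in the known ground truth at each step, which is exactly the structural difference noted above.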

 

 

 

Many to One

 

What about classification?

๊ธ์ •/๋ถ€์ •์„ ์œ„์™€ ๊ฐ™์ด ๋งˆ์ง€๋ง‰์—์„œ ์˜ˆ์ธกํ•˜๋„๋ก ํ•ฉ๋‹ˆ๋‹ค.

 

Next,

 

Many to Many

 

a translation model sees the entire input.

 

So we feed in several words and train the model to produce the corresponding output.

 

 

Let's go through them one by one again.

 

 

 

 

  • One to Many
    • Caption Generation
      • Image is represented by a CNN
      • Word Embedding at the input layer
      • Softmax at the output layer
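The "Softmax at the output layer" in the list above turns the decoder's scores into word probabilities over the vocabulary. A minimal sketch with a toy vocabulary and hypothetical logits (none of these values come from the post):

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max())  # subtract the max for numerical stability
    return e / e.sum()

vocab = ["a", "cat", "dog", "<eos>"]       # toy vocabulary
scores = np.array([0.2, 2.5, 0.1, -1.0])   # hypothetical RNN output scores
probs = softmax(scores)                    # a probability for each word

print(vocab[int(np.argmax(probs))])  # cat
```

At each decoding step the captioning model picks (or samples) a word from this distribution and feeds its embedding back in as the next input.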

 

Let's look closely at the encoder and the decoder.

 

The encoder uses the given input to produce the output.

 

The decoder takes its predicted output and feeds it back in at every time step to make a new prediction.

 

 

  • Many to Many
    • Word Embedding

 

 

 

For a translation like the one below,

 

the Encoder is the part where the given input is fed in.

 

The upper part, where a new RNN is attached, is the Decoder.

 

 

To summarize:

 

 

 

 

 

The Encoder is the part that takes in the input sequence,

 

and the Decoder is the part that takes the resulting embedding and generates the output that follows.

 

An RNN can serve as both the encoder and the decoder.

 

In short, whenever the input has some structure and the output is a sequence, it makes sense to model it with an encoder-decoder structure.
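The whole encoder-decoder structure can be sketched in miniature. This is an illustrative toy, not a real seq2seq model: all weights and sizes are made up, and both halves are plain RNN-style loops. The encoder folds the input sequence into one vector, and the decoder generates from it, feeding each output back in:

```python
import numpy as np

rng = np.random.default_rng(1)
dim = 3
W_enc = rng.normal(size=(dim, dim)) * 0.5  # illustrative encoder weights
W_dec = rng.normal(size=(dim, dim)) * 0.5  # illustrative decoder weights

def encode(xs):
    """Fold the whole input sequence into a single sentence embedding."""
    h = np.zeros(dim)
    for x in xs:
        h = np.tanh(W_enc @ x + h)
    return h

def decode(h, n_steps):
    """Generate outputs from the embedding; each output is the next input."""
    y, outs = h, []
    for _ in range(n_steps):
        y = np.tanh(W_dec @ y)
        outs.append(y)
    return outs

source = rng.normal(size=(5, dim))      # toy source sentence of 5 "words"
outputs = decode(encode(source), 4)     # generate a 4-step output sequence
print(len(outputs), outputs[0].shape)   # 4 (3,)
```

Note that the decoder only ever sees `h`: the entire source sentence is squeezed into that one vector, which is exactly the bottleneck discussed next.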

 

 

 

Why use Attention?

 

Suppose a very long sentence comes in.

 

If there are 100 hidden states, will the final sentence embedding really contain the information of every word when it generates the output?

 

When generating the output, we look at every word: we compute the relationship between the current hidden state and each of the 100 hidden states, and attend most to the ones with the highest attention scores.
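That score-and-weight computation can be sketched directly. A dot product is used here as the scoring function (one common choice; other scoring functions exist), and the sizes are illustrative:

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

rng = np.random.default_rng(2)
enc_states = rng.normal(size=(100, 8))  # 100 encoder hidden states
dec_state  = rng.normal(size=8)         # current decoder hidden state

scores  = enc_states @ dec_state        # one attention score per source word
weights = softmax(scores)               # highest-scoring words dominate
context = weights @ enc_states          # weighted summary of the source

print(context.shape)  # (8,)
```

Rather than hard-selecting a single word, the softmax weights blend all 100 states, with the highest-scoring ones contributing most; the resulting context vector is what the decoder uses at this step.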

 

In the end, it is hard for the encoder to compress the source sentence into a single vector.

 

 

So these seq2seq tasks suffer from degraded performance when sequences get long.

 

That is why Attention was introduced.

 

 

 
