[Advanced Classification] part 5 - 3

2023. 1. 19. 15:43
๐Ÿง‘๐Ÿป‍๐Ÿ’ป์šฉ์–ด ์ •๋ฆฌ

ANN (Artificial Neural Network)
DNN (Deep Neural Network)
Multilayer perceptron model

 

 ANN (Artificial Neural Network)

  • Provides a nonlinear classification model
  • A classifier of a different nature from the ones covered so far
  • A model built to mimic the neurons of the human brain
    • When a node collects signals from neighboring nodes, those signals are combined electrically, and once the combined signal rises above a certain electrical threshold, it is propagated to the next neuron through a chemical process.
  • Uses score values produced by linear combinations
  • The nonlinear operation is performed by the "Activation Function".
    • The activation function takes the score value from the linear combination as input and maps it to a nonlinear relationship through a function such as sigmoid.
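
To make this concrete, here is a minimal NumPy sketch of a single artificial neuron as described above; the inputs, weights, and bias are illustrative values, not from the original post.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Illustrative values: signals arriving from neighboring nodes and the
# connection weights are made up for this sketch.
x = np.array([0.5, -1.2, 3.0])   # input signals
w = np.array([0.8,  0.1, -0.4])  # connection weights
b = 0.2                          # bias

score = np.dot(w, x) + b         # linear combination -> the "score"
activation = sigmoid(score)      # nonlinear mapping by the activation function
print(score, activation)
```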

 

Source: https://openbooks.lib.msu.edu/neuroscience/chapter/synapse-structure/

 

Activation functions in ANNs

  • Sigmoid neurons
    • Shown to be ineffective when neural networks are stacked deeply.
    • The gradient values tend to get smaller and smaller as training proceeds.
  • ReLU
    • max(0, x)
    • Even when this function is differentiated, the gradient term remains 1 (for positive inputs).
  • Leaky ReLU
    • max(0.1x, x)
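
Below is a small, illustrative NumPy sketch of these three activations and their gradients; the sample inputs are made up, and it only shows why the sigmoid gradient is always small while the ReLU gradient stays at 1 for positive inputs.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def sigmoid_grad(z):
    s = sigmoid(z)
    return s * (1.0 - s)             # never larger than 0.25

def relu(z):
    return np.maximum(0.0, z)        # max(0, x)

def relu_grad(z):
    return (z > 0).astype(float)     # exactly 1 for positive inputs

def leaky_relu(z):
    return np.maximum(0.1 * z, z)    # max(0.1x, x)

z = np.array([-5.0, -1.0, 0.5, 5.0])
print(sigmoid_grad(z))   # small everywhere -> gradients shrink in deep sigmoid stacks
print(relu_grad(z))      # 0 or 1 -> the gradient term stays at 1 where x > 0
```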

Source: https://www.researchgate.net/figure/ANN-Activation-Functions_fig3_48667203

 

💡 Stacking many layers of such an ANN deeply gives a DNN.

 

 

DNN (Deep Neural Network)

  • ๊ฐ๊ฐ์˜ ๊ณ„์ธต์— ๋”ฐ๋ผ์„œ ํ•™์Šต์„ ํ•˜๊ฒŒ๋˜๋Š” feature์˜ ํ˜•ํƒœ๊ฐ€ ๋‹ฌ๋ผ์ง„๋‹ค๊ณ  ํ•˜๋Š” ๊ฒƒ์ด ์ž˜ ์•Œ๋ ค์ ธ ์žˆ์Šต๋‹ˆ๋‹ค.
  • ์ด๋Ÿฌํ•œ nonlinear ํ•จ์ˆ˜๋“ค์ด ๊ณ„์ธต์ ์œผ๋กœ ์Œ“์—ฌ ๊ฐ์— ๋”ฐ๋ผ์„œ, signal spade์—์„œ์˜ ๋ณต์žกํ•œ ์‹ ํ˜ธ๋“ค์˜ ํŒจํ„ด๋“ค์„ ์กฐ๊ธˆ ๋” ์ •ํ™•ํ•˜๊ฒŒ ๋ถ„๋ฅ˜ํ•  ์ˆ˜๊ฐ€ ์žˆ๋‹ค๊ณ  ํ•˜๋Š” ๊ฒƒ์ž…๋‹ˆ๋‹ค.
    • ๋ณต์žกํ•œ ํ˜•ํƒœ์˜ input feature๋“ค์„, ๋ณต์žกํ•œ sample๋“ค์„ ํšจ๊ณผ์ ์œผ๋กœ ๋ถ„๋ฅ˜ํ•˜๊ธฐ ์œ„ํ•œ linear play๋Š” ๊ตฌ์„ฑํ•˜๊ธฐ ์–ด๋ ต์Šต๋‹ˆ๋‹ค.
  • ํ•˜๋‚˜์˜ ๊ณ„์ธต์„ ํ†ตํ•œ nonlinear activation function ์—ญ์‹œ ์ด๋Ÿฌํ•œ ๊ฒƒ๋“ค์„ ํšจ๊ณผ์ ์œผ๋กœ ๋ถ„๋ฅ˜ํ•˜๊ธฐ ์–ด๋ ต๋‹ค๊ณ  ๋ณผ ์ˆ˜ ์žˆ์Šต๋‹ˆ๋‹ค.
  • ์ด๋Ÿฌํ•œ neural network๋ฅผ ์—ฌ๋Ÿฌ ๊ฐœ์˜ ๊ณ„์ธต์„ ํ†ตํ•ด์„œ ์Œ“์•„๊ฐ€๋ฉฐ ์—ฐ์‚ฐ์„ ์ˆ˜ํ–‰ํ•˜๋ฉด ๋ณต์žกํ•œ model๋“ค ์—ญ์‹œ, ๋” ๋ณต์žกํ•œ ํ˜•ํƒœ์˜ hyper plane์„ ํ†ตํ•ด์„œ ๋ชจ๋ธ๋“ค์„ ๋” ์ž˜ ๋ถ„๋ฅ˜ํ•  ์ˆ˜ ์žˆ์Šต๋‹ˆ๋‹ค.
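
As a rough sketch of this stacking idea (assuming PyTorch is available; the layer sizes are arbitrary), each Linear layer produces linear-combination scores and the ReLU in between supplies the nonlinearity:

```python
import torch
import torch.nn as nn

# Each Linear layer computes a linear combination (a score), and the ReLU
# between layers adds the nonlinearity that lets the stacked model carve
# out more complex decision boundaries than a single hyperplane.
deep_net = nn.Sequential(
    nn.Linear(64, 32), nn.ReLU(),   # layer 1
    nn.Linear(32, 16), nn.ReLU(),   # layer 2
    nn.Linear(16, 2),               # output scores for 2 classes
)

x = torch.randn(5, 64)              # 5 random input samples (illustrative)
print(deep_net(x).shape)            # torch.Size([5, 2])
```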

 

Source: https://velog.io/@dlwns97/Neural-Network-with-Pytorch-1

 

As shown in the figure above, there is the XOR problem.

 

Looking at the table below, the output is 1 when the two inputs differ and 0 when they are the same.

output  input1  input2
  0       0       0
  1       0       1
  1       1       0
  0       1       1

 

Can the positives be grouped together and the negatives together?

 

Can this problem be solved by drawing a single hyperplane?

 

This is very difficult; in fact, the XOR patterns are not linearly separable.

 

It was found that neural networks can solve this kind of problem effectively.

 

 

 

A neural network built by stacking multiple layers is what we call a

 

Multilayer perceptron model (MLP).

 

MLP

  • Can solve the XOR problem (see the sketch after this list).
  • As more layers are stacked, it forms increasingly complex hyperplanes.
  • Such a neural network can solve nonlinear problems, like the XOR problem, that linear classification cannot handle.
  • It works noticeably better on high-dimensional signals such as images.
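
Below is a minimal, illustrative PyTorch sketch of a one-hidden-layer MLP learning the XOR table above; the hidden size, learning rate, and iteration count are arbitrary choices, not from the original post.

```python
import torch
import torch.nn as nn

# XOR truth table from the section above: output is 1 when the inputs differ.
X = torch.tensor([[0., 0.], [0., 1.], [1., 0.], [1., 1.]])
y = torch.tensor([[0.], [1.], [1.], [0.]])

# One hidden layer with a nonlinear activation is enough to separate XOR,
# which no single linear hyperplane can do.
mlp = nn.Sequential(nn.Linear(2, 4), nn.Tanh(), nn.Linear(4, 1), nn.Sigmoid())
opt = torch.optim.Adam(mlp.parameters(), lr=0.05)
loss_fn = nn.BCELoss()

for _ in range(2000):           # simple full-batch training loop
    opt.zero_grad()
    loss = loss_fn(mlp(X), y)
    loss.backward()             # backpropagation
    opt.step()

print(mlp(X).round())           # typically recovers [[0.], [1.], [1.], [0.]]
```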

 

ANN

  • Example: MNIST handwritten digit recognition
  • Widely used in Computer Vision and Image Recognition.
  • On the other hand, even if layers keep being added, the model's accuracy drops at some point.
    • This happens because of the gradient vanishing problem.
    • During training, the parameters are learned via the chain rule, and the deeper the network gets, the more the gradient values keep shrinking, so the layers deep in the network are no longer trained effectively (a small numerical sketch follows this list).
  • This training procedure, in which the gradients needed for optimization are propagated backward through the layers, is called Backpropagation.
  • There has been a great deal of research on solving the "vanishing gradient problem" that arises during backpropagation.
    • Eventually, through techniques such as pre-training and fine-tuning, this line of work evolved into deep learning models such as ANNs and CNNs (Convolutional Neural Networks), which now deliver excellent performance.
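
As a rough numerical illustration of the vanishing-gradient effect mentioned above (the pre-activation value and depths are arbitrary), the chain rule multiplies one per-layer factor per layer, and the sigmoid derivative is at most 0.25, so the product shrinks geometrically with depth:

```python
import numpy as np

def sigmoid_grad(z):
    s = 1.0 / (1.0 + np.exp(-z))
    return s * (1.0 - s)         # at most 0.25

z = 0.5                          # an arbitrary representative pre-activation
for depth in (2, 5, 10, 20):
    # the chain rule multiplies one such factor per layer
    grad = sigmoid_grad(z) ** depth
    print(depth, grad)           # shrinks geometrically as depth grows
```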

 

Convolutional Neural Network (CNN)

  • An advanced form of the ANN, widely used to handle high-dimensional signals such as images and video effectively (a minimal sketch follows).
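
A minimal illustrative sketch of the basic CNN building block, assuming PyTorch; the channel counts and image size are made up:

```python
import torch
import torch.nn as nn

# A convolutional layer slides small filters over the image, which is what
# makes CNNs effective on high-dimensional signals such as images and video.
conv = nn.Conv2d(in_channels=3, out_channels=8, kernel_size=3, padding=1)

image = torch.randn(1, 3, 32, 32)   # one 32x32 RGB image
features = conv(image)              # 8 feature maps, same spatial size
print(features.shape)               # torch.Size([1, 8, 32, 32])
```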
