[Ensemble Learning] part 6 - 1

2023. 1. 20. 13:00
๐Ÿง‘๐Ÿป‍๐Ÿ’ป์šฉ์–ด ์ •๋ฆฌ

Ensemble Learning
Expert
bagging
boosting
Random Forest
GBM

 

 

Ensemble Learning

  • ์ด๋ฏธ ์‚ฌ์šฉํ•˜๊ณ  ์žˆ๊ฑฐ๋‚˜ ๊ฐœ๋ฐœํ•œ ์•Œ๊ณ ๋ฆฌ์ฆ˜์˜ ๊ฐ„๋‹จํ•œ ํ™•์žฅ
    • Supervised Learning task์—์„œ ์„ฑ๋Šฅ์„ ์˜ฌ๋ฆด ์ˆ˜ ์žˆ๋Š” ๋ฐฉ๋ฒ•
  • ๋จธ์‹ ๋Ÿฌ๋‹์—์„œ ์•Œ๊ณ ๋ฆฌ์ฆ˜์˜ ์ข…๋ฅ˜์— ์ƒ๊ด€ ์—†์ด ์„œ๋กœ ๋‹ค๋ฅด๊ฑฐ๋‚˜, ๊ฐ™์€ ๋งค์ปค๋‹ˆ์ฆ˜์œผ๋กœ ๋™์ž‘ํ•˜๋Š” ๋‹ค์–‘ํ•œ ๋จธ์‹ ๋Ÿฌ๋‹ ๋ชจ๋ธ์„ ๋ฌถ์–ด ํ•จ๊ป˜ ์‚ฌ์šฉํ•˜๋Š” ๋ฐฉ์‹์ž…๋‹ˆ๋‹ค.
  • ์—ฌ๋Ÿฌ ๋‹ค๋ฅธ model์„ ํ•จ๊ป˜ ๋ชจ์•„ ์˜ˆ์ธก model์˜ ์ง‘ํ•ฉ์œผ๋กœ ์‚ฌ์šฉํ•˜๋Š” ๊ฒƒ์ž…๋‹ˆ๋‹ค.
  • ํ•˜๋‚˜์˜ ํ•™์Šต model์€ ์šฐ๋ฆฌ๊ฐ€ expert๋ผ๊ณ  ํ‘œํ˜„ํ•  ์ˆ˜ ์žˆ์Šต๋‹ˆ๋‹ค.
  • ๋‹ค์–‘ํ•œ model์˜ ๊ฐ ์žฅ์ ์„ ์‚ด๋ ค์„œ ์˜ˆ์ธก ์„ฑ๋Šฅ์„ ํ–ฅ์ƒํ•  ์ˆ˜ ์žˆ์Šต๋‹ˆ๋‹ค.
  • ์˜ˆ์ธก ์„ฑ๋Šฅ์„ ์•ˆ์ •์ ์œผ๋กœ ํ–ฅ์ƒ์‹œํ‚ฌ ์ˆ˜ ์žˆ๋‹ค๋Š” ๊ฒƒ์ด ์žฅ์ ์ž…๋‹ˆ๋‹ค.
    • ํ•˜๋‚˜์˜ model์ด ๊ฒฐ์ •ํ•˜๋Š” ๊ฒƒ๋ณด๋‹ค ๋‹ค์–‘ํ•œ ์—ฌ๋Ÿฌ ๊ฐœ์˜ model์˜ ๊ฒฐ์ •์œผ๋กœ ์ตœ์ข… ์˜ˆ์ธก ๊ฒฐ๊ณผ๋ฅผ ์ œ๊ณตํ•˜๊ธฐ ๋•Œ๋ฌธ์— noise ๋“ฑ์œผ๋กœ๋ถ€ํ„ฐ ๋ณด๋‹ค ์•ˆ์ •์ ์ž…๋‹ˆ๋‹ค.
  • ์‰ฝ๊ฒŒ ๊ตฌํ˜„์ด ๊ฐ€๋Šฅํ•ฉ๋‹ˆ๋‹ค.
    • ์—ฌ๋Ÿฌ ๊ฐœ์˜ model์„ ํ•™์Šตํ•˜๊ณ  ๊ฑฐ์˜ ์ง์ ‘์ ์œผ๋กœ ์—ฐํ•ฉํ•˜์—ฌ ์ ์šฉ์„ ํ•˜๊ธฐ ๋•Œ๋ฌธ์— ๊ฐ„ํŽธํ•˜๊ฒŒ ์‚ฌ์šฉํ•  ์ˆ˜ ์žˆ์Šต๋‹ˆ๋‹ค.
    • model ๋“ค์€ ๋…๋ฆฝ์ ์œผ๋กœ ๋™์ž‘ํ•˜๊ธฐ ๋•Œ๋ฌธ์— model parameter์˜ tuning์ด ๋งŽ์ด ํ•„์š” ์—†์Šต๋‹ˆ๋‹ค.
  • ๊ทธ๋Ÿฌ๋‚˜ ๋‹จ์ ๋„ ์กด์žฌํ•ฉ๋‹ˆ๋‹ค.
    • ์ด ๋ชจ๋ธ ์ž์ฒด๋กœ๋Š” compactํ•œ ํ‘œํ˜„์ด ๋˜๊ธฐ ์–ด๋ ค์šด ๋ฌธ์ œ๊ฐ€ ์กด์žฌํ•ฉ๋‹ˆ๋‹ค.

 

์ถœ์ฒ˜ : https://dict.naver.com/frkodict/#/search?range=all&query=ensemble

 

ensemble์€ ํ”„๋ž‘์Šค์–ด๋กœ "ํ•จ๊ป˜, ๋™์‹œ์—, ํ•œ๊บผ๋ฒˆ์— ํ˜‘๋ ฅํ•˜์—ฌ" ๋ผ๋Š” ์˜๋ฏธ๊ฐ€ ์žˆ์Šต๋‹ˆ๋‹ค.

 

 

Ensemble Methods

  • ํ•™์Šต data set์„ randomํ•˜๊ฒŒ S1๋ถ€ํ„ฐ .. Sn๊นŒ์ง€ ๋‚˜๋ˆ„์–ด์„œ ํ•™์Šต์„ ์ง„ํ–‰ํ•˜๊ฒŒ ๋ฉ๋‹ˆ๋‹ค.
  • ๊ฐ™๊ฑฐ๋‚˜ ๋‹ค๋ฅธ ํ•™์Šต model๋กœ ๊ตฌ์„ฑ๋ฉ๋‹ˆ๋‹ค.
    • ๊ทธ๋ž˜์„œ, ๊ฐ™์€ ํ•™์Šต data๋กœ ํ•™์Šตํ•˜๋Š” ๊ฒƒ์„ ์ง€์–‘ํ•ด์•ผํ•ฉ๋‹ˆ๋‹ค.
  • ์ตœ์ข… ๊ฒฐ์ •์€ ์šฐ๋ฆฌ๊ฐ€ ๋งˆ์ง€๋ง‰์— ํ•™์Šตํ•œ ๋‹ค์ˆ˜์˜ model์ด ๊ฐ๊ฐ ๊ฒฐ์ •์„ ๋‚ด๋ฆฐ ํ›„ ๋‹ค์ˆ˜๊ฒฐ๋กœ ์ตœ์ข… ์˜ˆ์ธก ๊ฒฐ๊ณผ๋ฅผ ์ œ๊ณตํ•˜๊ฒŒ ๋ฉ๋‹ˆ๋‹ค.

 

Ensemble ๊ตฌ์„ฑ ๊ฐ€์žฅ ๊ธฐ๋ณธ์ ์ธ ์š”์†Œ ๊ธฐ์ˆ 

  • bagging
    • ํ•™์Šต ๊ณผ์ •์—์„œ training sample์„ ๋žœ๋คํ•˜๊ฒŒ ๋‚˜๋ˆ„์–ด ์„ ํƒํ•ด ํ•™์Šตํ•˜๋Š” ๊ฒƒ์ž…๋‹ˆ๋‹ค.
    • sample์„ n๊ฐœ๋กœ ๋‚˜๋ˆ„์–ด sub sample์„ ๊ตฌ์„ฑํ•ฉ๋‹ˆ๋‹ค.
    • ์ด๋ ‡๊ฒŒ ๋‚˜๋ˆ ์ง„ sub sample๋“ค์€ Classifier 1, 2, 3,... n๊ฐœ๋ฅผ ํ•™์Šตํ•˜๋Š” ๋ฐ ์‚ฌ์šฉ๋ฉ๋‹ˆ๋‹ค.
    • ์„œ๋กœ ๊ฐ™์€ model์„ ๊ฐ€์งˆ์ง€๋ผ๋„ ์„œ๋กœ ๋‹ค๋ฅธ ํŠน์„ฑ์˜ ํ•™์Šต์ด ๊ฐ€๋Šฅํ•ด์ง‘๋‹ˆ๋‹ค.
    • model์„ ๋ณ‘๋ ฌ์ ์œผ๋กœ ํ•™์Šตํ•  ์ˆ˜ ์žˆ๊ฒŒ ๋ฉ๋‹ˆ๋‹ค.
    • ๊ฐ sample set์ด ๋‹ค๋ฅธ model์— ์˜ํ–ฅ์„ ๋ฏธ์น˜์ง€ ์•Š๊ธฐ ๋•Œ๋ฌธ์ž…๋‹ˆ๋‹ค.
    • Bagging == Bootstrapping + aggregating
      • Bootstrapping
        • ๋‹ค์ˆ˜์˜ sample data set์„ ์ƒ์„ฑํ•ด์„œ ํ•™์Šตํ•˜๋Š” ๋ฐฉ์‹์„ ์˜๋ฏธํ•ฉ๋‹ˆ๋‹ค.
        • ๊ฐ™์€ model์„ ์‚ฌ์šฉํ•œ๋‹ค๋ฉด model parameter๊ฐ€ ์„œ๋กœ ๋‹ฌ๋ผ์ ธ์•ผ ํ•  ๊ฒƒ์ด๊ธฐ ๋•Œ๋ฌธ์— sample์„ randomํ•˜๊ฒŒ ์„ ํƒํ•ด์•ผํ•  ๊ฒƒ์ž…๋‹ˆ๋‹ค.
        • ๊ทธ๋Ÿฌ๋‚˜, ์„œ๋กœ ๋‹ค๋ฅธ model์„ ์‚ฌ์šฉํ•˜๊ฒŒ ๋œ๋‹ค๋ฉด ์–ด์ฐจํ”ผ ์„œ๋กœ ๋‹ค๋ฅธ model ํ˜•ํƒœ๋กœ ๋‹ค๋ฅด๊ฒŒ ๋™์ž‘ํ•˜๊ธฐ ๋•Œ๋ฌธ์— ๊ฐ™์€ sample์„ ์ด์šฉํ•ด๋„ ๋  ๊ฒƒ์ž…๋‹ˆ๋‹ค.
        • ์ด๋Ÿฐ ๊ณผ์ •์„ m๋ฒˆ ๋ฐ˜๋ณตํ•˜๋ฉด m๊ฐœ data set๋ฅผ ์‚ฌ์šฉํ•˜๋Š” ํšจ๊ณผ๊ฐ€ ์žˆ๊ธฐ ๋•Œ๋ฌธ์— ๋ณด๋‹ค ๋” noise์— robustํ•˜๊ฒŒ ๋ฉ๋‹ˆ๋‹ค.
    • lower variance์˜ ์•ˆ์ •์ ์ธ ์„ฑ๋Šฅ์„ ์ œ๊ณตํ•˜๋Š”๋ฐ ์œ ์šฉํ•ฉ๋‹ˆ๋‹ค.
    • ๋งŒ์ผ, training sample์˜ ์ˆซ์ž๊ฐ€ ์ ๊ฑฐ๋‚˜ model์ด ๋ณต์žกํ•œ ๊ฒฝ์šฐ์— ๋ฐœ์ƒํ•˜๋Š” overfitting์˜ ๋ฌธ์ œ์— ๋Œ€ํ•ด์„œ sample์„ randomํ•˜๊ฒŒ ์„ ํƒํ•˜๋Š” ๊ณผ์ •์—์„œ data augmentation ํšจ๊ณผ๋ฅผ ๊ฐ€์งˆ ์ˆ˜ ์žˆ๊ณ , ๊ฐ„๋‹จํ•œ model์„ ์ง‘ํ•ฉ์œผ๋กœ ์‚ฌ์šฉํ•  ์ˆ˜ ์žˆ๊ธฐ ๋•Œ๋ฌธ์— ์•ˆ์ •์ ์ธ ์„ฑ๋Šฅ์„ ์ œ๊ณตํ•  ์ˆ˜ ์žˆ์Šต๋‹ˆ๋‹ค.

 

 

  • boosting
    • Sequentialํ•˜๊ฒŒ ๋™์ž‘ํ•ฉ๋‹ˆ๋‹ค.
      • ์˜ˆ๋ฅผ ๋“ค์–ด, Classifier 1์˜ ๊ฒฐ๊ณผ๋ฅผ Classifier 2 ํ•™์Šต์— ์ ์šฉํ•ฉ๋‹ˆ๋‹ค.
      • Classifier 1์˜ ๊ฒฐ๊ณผ๋กœ ์–ด๋–ค sample์ด ์ค‘์š”ํ•˜๊ณ  ์–ด๋–ค sample์ด ์ค‘์š”ํ•˜์ง€ ์•Š์€์ง€ ์•Œ ์ˆ˜ ์žˆ์Šต๋‹ˆ๋‹ค.
      • ๊ทธ๋Ÿฌํ•œ ๊ฒฐ์ •์— weight๋ฅผ ์ฃผ์–ด์„œ ๋‹ค์Œ ํ•™์Šต ๊ณผ์ •์—์„œ, Classifier 2 ํ•™์Šต์— ์ ์šฉํ•  ์ˆ˜ ์žˆ์Šต๋‹ˆ๋‹ค.
      • ์ด๋ ‡๊ฒŒ ์ด์ „์— ๋™์ž‘ํ•œ Classifier๋“ค์˜ ๋™์ž‘์„ ํ˜„์žฌ์˜ Classifier ๊ฒฐ๊ณผ๋ฅผ ํ–ฅ์ƒํ•˜๋Š”๋ฐ ์‚ฌ์šฉํ•  ์ˆ˜ ์žˆ์Šต๋‹ˆ๋‹ค.
      • Weak Classifier์˜ Cascading
        • Weak Classifier
          • Bias๊ฐ€ ๋†’์€ Classifier
          • model ์ž์ฒด๊ฐ€ ๋‹จ์ˆœํ•˜์—ฌ Strong classifier์— ๋น„ํ•ด ์„ฑ๋Šฅ์ด ๋‚ฎ์•„์„œ ํ˜ผ์ž์„œ๋Š” ๋ฌด์—‡์„ ํ•˜๊ธฐ ์–ด๋ ค์šด model์ž…๋‹ˆ๋‹ค.
          • ๊ทธ๋Ÿฌ๋‚˜ ์ด๋Ÿฌํ•œ model์„ "Cascading"ํ•˜์—ฌ ์ ์šฉํ•˜๊ฒŒ๋˜๋ฉด, ์—ฐ์‡„์ ์œผ๋กœ ์ ์šฉ์ด ๋˜์–ด์„œ Sequentialํ•œ ํŠน์„ฑ์„ ๊ฐ€์ง€๊ณ  Classifier์˜ ํŠน์„ฑ์„ ์˜ฌ๋ฆด ์ˆ˜๊ฐ€ ์žˆ์Šต๋‹ˆ๋‹ค.
    • ๋Œ€ํ‘œ์ ์€ Boosting Algorithm
      • Adaboost
        • base classifier์— ์˜ํ•ด์„œ ์˜ค๋ถ„๋ฅ˜๋œ sample์— ๋Œ€ํ•ด ๋ณด๋‹ค ๋†’์€ ๊ฐ€์ค‘์น˜๋ฅผ ๋‘์–ด ๋‹ค์Œ ํ•™์Šต์— ์‚ฌ์šฉํ•  ์ˆ˜ ์žˆ๊ฒŒ ํ•ฉ๋‹ˆ๋‹ค.
        • ๊ฐ„๋‹จํ•˜๊ฒŒ ๊ตฌํ˜„ ๊ฐ€๋Šฅํ•ฉ๋‹ˆ๋‹ค.
        • ํŠน์ •ํ•œ ํ•™์Šต ์•Œ๊ณ ๋ฆฌ์ฆ˜์— ๊ตฌ์• ๋ฐ›์ง€ ์•Š์Šต๋‹ˆ๋‹ค.
    • Bagging๊ณผ Boosting์„ ํ™œ์šฉํ•œ ๋Œ€ํ‘œ์ ์ธ ์•Œ๊ณ ๋ฆฌ์ฆ˜
      • Random Forest
        • ์„œ๋กœ ๋‹ค๋ฅด๊ฒŒ ํ•™์Šต๋œ Decision Tree์˜ ๊ฒฐ์ •์œผ๋กœ ์˜ˆ์ธก์„ ์ˆ˜ํ–‰ํ•˜๊ธฐ ๋•Œ๋ฌธ์— ์ „์ฒด์ ์œผ๋กœ Bagging์„ ํ†ตํ•ด ํ•™์Šตํ•œ๋‹ค๊ณ  ๋ณผ ์ˆ˜ ์žˆ์Šต๋‹ˆ๋‹ค.
      • Gradient Boostting Machine (GBM) 
        • ๋งค Node์—์„œ ๊ฒฐ์ •์ด ์ด๋ฃจ์–ด์ง€๊ธฐ ๋•Œ๋ฌธ์— ์œ„ Classification์— Sequentialํ•œ Boosting์„ ์‹คํ–‰ํ•œ๋‹ค๊ณ  ํ•  ์ˆ˜ ์žˆ์Šต๋‹ˆ๋‹ค.
        • Machine Learning Competition์—์„œ ์šฐ์ˆ˜ํ•œ ์„ฑ๋Šฅ์„ ์ž๋ž‘ํ•ฉ๋‹ˆ๋‹ค.

 

'Artificial Intelligence' ์นดํ…Œ๊ณ ๋ฆฌ์˜ ๋‹ค๋ฅธ ๊ธ€

[Seq2Seq with Attention for Natural Language Understanding and Generation] Part 4 - 1  (0) 2023.01.25
[Ensemble Learning] part 6 - 2  (0) 2023.01.20
[Advanced Classification] part 5 - 3  (0) 2023.01.19
[Advanced Classification] part 5 - 2  (0) 2023.01.19
[Advanced Classification] part 5 - 1  (0) 2023.01.19

BELATED ARTICLES

more