[XAI] Explainable AI

2023. 1. 26. 10:54
🧑🏻‍💻 Terminology
 
XAI

 

XAI becomes necessary in areas such as autonomous driving and medical AI, where a system's decisions directly affect people.

 

In these settings we need an explanation of why the AI produced a particular result.

 

What XAI techniques are used for

  • Finding errors in a model or dataset
  • Measuring how biased a model is
  • Working out how a self-driving car misperceived its input and arrived at a given prediction
  • COVID-19 detection from X-rays

 

 

How can we improve explainability?



  • Local methods
    • Explain the prediction for one specific input at a time
  • Global methods
    • Explain the model's overall behavior across the whole dataset

 

  • White-box methods
    • Build the explanation with full knowledge of the model's internal structure
  • Black-box methods
    • Build the explanation from nothing but the model's inputs and outputs
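
To make the black-box idea concrete, here is a minimal sketch of occlusion-style importance: each input feature is replaced by a baseline value and the change in the output is recorded. Only calls to the model are used, never its internals. The names `occlusion_importance` and `opaque_model`, and the baseline of 0.0, are assumptions made for this illustration.

```python
def occlusion_importance(predict, x, baseline=0.0):
    """Score each feature by how much the output drops when it is occluded.

    `predict` is treated as an opaque function: we only call it.
    """
    base_out = predict(x)
    scores = []
    for i in range(len(x)):
        perturbed = list(x)
        perturbed[i] = baseline  # occlude one feature at a time
        scores.append(base_out - predict(perturbed))
    return scores


def opaque_model(x):
    # Hypothetical stand-in for a trained model whose internals we cannot see.
    return 2.0 * x[0] + x[1] * x[2]


scores = occlusion_importance(opaque_model, [1.0, 2.0, 3.0])
print(scores)
```

Because this explains one input at a time, it is also a local method in the sense used above.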

 

  • Intrinsic
    • Constrain the model's complexity before training so that it is easy to explain, then train it and explain the trained model directly
  • Post-hoc
    • Apply the method after an arbitrary model has finished training to explain that model's behavior

 

  • Model-specific
    • Applicable only to particular model architectures
  • Model-agnostic
    • Applicable to any model, regardless of its architecture

 

 

Linear model, Simple Decision Tree

 

- Global, White-box, Intrinsic, Model-specific
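
As a small illustration of why a linear model counts as intrinsic and global, the sketch below fits one on toy data and reads the explanation straight off the learned weights. The helper `fit_linear`, the learning rate, and the data (generated from y = 3*x0 - 2*x1 + 1) are all assumptions made for this example.

```python
def fit_linear(xs, ys, lr=0.05, epochs=2000):
    """Fit y ~ w.x + b by batch gradient descent on mean squared error."""
    n = len(xs)
    n_features = len(xs[0])
    w = [0.0] * n_features
    b = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * n_features
        grad_b = 0.0
        for x, y in zip(xs, ys):
            err = sum(wi * xi for wi, xi in zip(w, x)) + b - y
            for i, xi in enumerate(x):
                grad_w[i] += 2.0 * err * xi / n
            grad_b += 2.0 * err / n
        w = [wi - lr * gi for wi, gi in zip(w, grad_w)]
        b -= lr * grad_b
    return w, b


# Hypothetical toy data generated from y = 3*x0 - 2*x1 + 1.
xs = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [2.0, 1.0], [1.0, 2.0]]
ys = [3.0 * x0 - 2.0 * x1 + 1.0 for x0, x1 in xs]

w, b = fit_linear(xs, ys)
# The weights are the explanation: the same effect per unit of each
# feature holds for every input, which is what makes it global.
for i, wi in enumerate(w):
    print(f"feature {i}: weight {wi:+.3f}")
```

No separate explanation method is needed here; the model's own parameters already state how each feature moves the output.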

 

 

Grad-CAM

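- Local, White-box, Post-hoc, Model-specific (it needs gradients and convolutional feature maps, although it works across CNN architectures without modification)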

 

 

Saliency map-based

 

The simplest approach is to use the gradient of the model's output with respect to its input as the explanation.

 

A large gradient means the pixel has a strong influence on the output value, so that pixel can be regarded as important.

 

Because the gradient is easy to compute with backpropagation, the method is also simple to implement.
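
The idea can be sketched on a model small enough to differentiate by hand. The single-layer sigmoid model, its weights, and the names `model` and `gradient_saliency` below are assumptions for illustration; in a deep learning framework the same gradient would come from automatic differentiation rather than a hand-written chain rule.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def model(x, w, b):
    """Toy classifier: sigmoid of a weighted sum of the inputs."""
    return sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)


def gradient_saliency(x, w, b):
    """Absolute gradient of the output with respect to each input.

    Chain rule for this model: d sigmoid(z) / d x_i = s * (1 - s) * w_i.
    """
    s = model(x, w, b)
    return [abs(s * (1.0 - s) * wi) for wi in w]


# Hypothetical input "pixels" and weights.
x = [0.2, -0.4, 0.9]
w = [2.0, -0.5, 0.0]
b = 0.1

sal = gradient_saliency(x, w, b)
ranking = sorted(range(len(x)), key=lambda i: sal[i], reverse=True)
print("saliency:", sal)
print("most to least important input:", ranking)
```

The third input has zero weight, so its saliency is zero: changing it cannot move the output.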

 

 

  • Strength
    • Easy to compute with backpropagation.
  • Weakness
    • The shattered gradient problem can make the resulting map noisy.

 

  • SmoothGrad was proposed to remove this noise.
    • It computes the gradient many times, each time on a copy of the input with noise mixed in, and uses the average of those gradients as the explanation.
    • Its advantage is that it gives a much cleaner explanation.
    • It can also be applied to most saliency-map-based methods.
    • For as many times as noise is mixed in, the deep learning model's forward-backward pro…
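
The averaging step can be sketched as follows. The tiny two-layer ReLU network with hand-set weights, the noise level `sigma`, the sample count, and the function names are all assumptions for illustration, not values from any particular implementation.

```python
import random

# Hypothetical fixed weights of a tiny network: 2 inputs, 3 ReLU units, 1 output.
W1 = [[1.0, -1.0], [-0.5, 2.0], [1.5, 0.5]]
B1 = [0.0, -0.2, 0.1]
W2 = [1.0, -1.0, 0.5]


def input_gradient(x):
    """Gradient of the network output with respect to the input x."""
    grad = [0.0] * len(x)
    for row, bias, out_w in zip(W1, B1, W2):
        pre = sum(wi * xi for wi, xi in zip(row, x)) + bias
        if pre > 0:  # ReLU passes gradient only through active units
            for i, wi in enumerate(row):
                grad[i] += out_w * wi
    return grad


def smoothgrad(x, n_samples=50, sigma=0.1, seed=0):
    """Average the input gradient over Gaussian-perturbed copies of x."""
    rng = random.Random(seed)
    total = [0.0] * len(x)
    for _ in range(n_samples):
        noisy = [xi + rng.gauss(0.0, sigma) for xi in x]
        g = input_gradient(noisy)  # one gradient evaluation per noisy sample
        total = [t + gi for t, gi in zip(total, g)]
    return [t / n_samples for t in total]


x = [0.5, 0.2]
plain = input_gradient(x)
smooth = smoothgrad(x)
print("plain gradient: ", plain)
print("SmoothGrad:     ", smooth)
```

At this input one ReLU unit sits close to its switching point, so the plain gradient can jump under a tiny change to the input. Averaging over noisy copies blends the gradients from both sides of the switch.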

 

  • Class Activation Map (CAM)
  • Weakly Supervised Learning
  • Grad-CAM

 

  • LIME (Local Interpretable Model-agnostic Explanations)
    • Black-box
  • Influence function

 

  • Explanations of predictions
    • Human-based visual assessment
      • Takes a lot of time
  • Human annotation
  • Pixel perturbation
    • AOPC (Area Over the MoRF Perturbation Curve)
    • Insertion
    • Deletion
  • ROAR (Remove and Retrain)
  • Model randomization
  • Adversarial attack
    • softplus
  • Adversarial model manipulation
