Logistic regression analysis = Logit analysis

A classification model built on regression analysis.

It uses the logistic function to estimate the relationship between a dependent variable that takes only two values and the independent variables.

 

01. Sigmoid Function
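
The figures for this section did not survive, so here is a minimal sketch of the sigmoid (logistic) function, assuming the standard definition 1 / (1 + e^(-z)). It maps any real number to a value between 0 and 1, which is why it can be read as a probability. The function name `sigmoid` is only illustrative.

```python
import math


def sigmoid(z: float) -> float:
    """Logistic (sigmoid) function: maps a real number into the range 0..1."""
    # Use the algebraically equivalent form for negative z so that
    # math.exp() never receives a large positive argument (overflow).
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


# The curve is S-shaped: near 0 for very negative z, 0.5 at z = 0,
# near 1 for very positive z.
for z in (-5.0, 0.0, 5.0):
    print(z, sigmoid(z))
```

Because sigmoid(z) + sigmoid(-z) = 1, the curve is symmetric around the point (0, 0.5).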

 

 

 

 

 

02. Logit Transformation
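
As a rough sketch, assuming the usual definition logit(p) = ln(p / (1 - p)): the logit is the log of the odds, and it is the inverse of the sigmoid, so it stretches a probability in (0, 1) over the whole real line. The helper names `logit` and `sigmoid` are illustrative.

```python
import math


def logit(p: float) -> float:
    """Log-odds of a probability p, defined only for 0 < p < 1."""
    if not 0.0 < p < 1.0:
        raise ValueError("p must be strictly between 0 and 1")
    return math.log(p / (1.0 - p))


def sigmoid(z: float) -> float:
    """Inverse of the logit (kept simple; fine for moderate z)."""
    return 1.0 / (1.0 + math.exp(-z))


p = 0.8
odds = p / (1.0 - p)   # odds of success versus failure
z = logit(p)           # log-odds
print(odds, z, sigmoid(z))  # sigmoid(z) recovers p up to rounding
```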

 

 

 

 

 

03. Binomial Logistic Regression

 

 

 

 

 

04. Multinomial Logistic Regression

 

 

 

 

 

05. Confusion Matrix

                        Predicted
                        Positive              Negative
Observed    POS         TP (true positive)    FN (false negative)
(actual)    NEG         FP (false positive)   TN (true negative)

Accuracy = (TP + TN) / total observations (also called classification accuracy)

Error rate (misclassification rate) = (FN + FP) / total observations

Precision = TP / (TP + FP)

Recall = TP / (TP + FN)

F measure = 2 x ((Precision x Recall) / (Precision + Recall))
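
The formulas above can be sketched in a few lines of Python. The function name `classification_metrics` and the counts in the example are made up for illustration, and the sketch assumes every denominator is nonzero.

```python
def classification_metrics(tp: int, fn: int, fp: int, tn: int) -> dict:
    """Compute the confusion-matrix metrics listed above.

    Assumes the denominators are nonzero (at least one actual positive
    and at least one predicted positive).
    """
    total = tp + fn + fp + tn
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return {
        "accuracy": (tp + tn) / total,
        "error_rate": (fn + fp) / total,
        "precision": precision,
        "recall": recall,
        "f_measure": 2 * (precision * recall) / (precision + recall),
    }


# Hypothetical counts: 50 actual positives, 50 actual negatives.
metrics = classification_metrics(tp=40, fn=10, fp=20, tn=30)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

Accuracy and the error rate always sum to 1, since every observation is either classified correctly or not.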

 

 

 

 

 

06. ROC Graph (evaluation tool)

ROC (Receiver Operating Characteristic) curve & AUC value

ROC curve: shows how FPR and TPR change together as the classification threshold varies; used to evaluate the performance of binary classifiers.

The AUC (Area Under the Curve, a measure of predictive power) is the area under the ROC curve; the closer it is to 1, the better the model.

For the AUC to be large, the model has to reach a high TPR while FPR is still small.

The farther the curve lies from the diagonal through the middle, and the closer it bends toward a right angle at the upper-left corner, the higher the AUC.

(Figure: an ROC curve for an AUC of about 80%.)

• TPR (True Positive Rate): share of actual positives predicted as positive = TP / (TP + FN)

• FNR (False Negative Rate): share of actual positives predicted as negative = FN / (TP + FN)

• TNR (True Negative Rate): share of actual negatives predicted as negative = TN / (FP + TN)

• FPR (False Positive Rate): share of actual negatives predicted as positive = FP / (FP + TN)

 

• Sensitivity = TPR

• Specificity = TNR

• FPR = 1 - Specificity

 

* The empty region above the curve, toward the upper left, corresponds to misclassification; its area is 1 - AUC.

 

 

 

* Confusion matrix
               P (positive)   N (negative)
P (positive)   TP             FN
N (negative)   FP             TN
* Rows are observed values and columns are predicted values here; the observed/predicted and positive/negative positions may be swapped.

2. Positive/negative rates
Sensitivity = true positive rate (TPR): actual positive (infected) -> predicted positive = TP / (TP + FN)
1 - Sensitivity = false negative rate (FNR): actual positive -> predicted negative = FN / (TP + FN)

Specificity = true negative rate (TNR): actual negative (not infected) -> predicted negative = TN / (TN + FP)
1 - Specificity = false positive rate (FPR): actual negative -> predicted positive = FP / (TN + FP)

3. Type I error / Type II error
Type I error (α): the null hypothesis is true but is rejected = false positive rate (FPR)
Type II error (β): the null hypothesis should be rejected but is not = false negative rate (FNR)
Assumption: the null hypothesis is stated in the negative form (the case is negative), so a positive prediction corresponds to rejecting it.

4. ROC Curve
Sensitivity (y-axis) vs. false positive rate (FPR) = 1 - Specificity (x-axis)
Interpretation: the smaller the false positive rate and the larger the sensitivity, the better the model.

5. TP and FP rates
TP rate = TP / N
FP rate = FP / N
