AI Practical Application Course
[Applied Curriculum] Getting Started with Machine Learning (2) Supervised Learning - Regression

Understanding the Concept of Regression

[๊ฐ€์ •] ์•„์ด์Šคํฌ๋ฆผ ๊ฐ€๊ฒŒ ์šด์˜์ž์ผ ๋•Œ, ์˜ˆ์ƒ๋˜๋Š” ์‹ค์ œ ํŒ๋งค๋Ÿ‰๋งŒํผ๋งŒ์˜ ์ฃผ๋ฌธ๋Ÿ‰์„ ์›ํ•œ๋‹ค. ์ด ๋•Œ ๋งŒ์•ฝ ํ‰๊ท  ๊ธฐ์˜จ์„ ์ด์šฉํ•ด ํŒ๋งค๋Ÿ‰์„ ์˜ˆ์ƒํ•  ์ˆ˜ ์žˆ๋‹ค๋ฉด?


[๋ฌธ์ œ ์ •์˜]
๋ฐ์ดํ„ฐ : ๊ณผ๊ฑฐ ํ‰๊ท  ๊ธฐ์˜จ(X)๊ณผ ๊ทธ์— ๋”ฐ๋ฅธ ์•„์ด์Šคํฌ๋ฆผ ํŒ๋งค๋Ÿ‰(y)
๊ฐ€์ • : ํ‰๊ท  ๊ธฐ์˜จ๊ณผ ํŒ๋งค๋Ÿ‰์€ ์„ ํ˜•์ ์ธ ๊ด€๊ณ„๋ฅผ ๊ฐ€์ง€๊ณ  ์žˆ์Œ * ์„ ํ˜• ๊ด€๊ณ„ : ๊ฐ™์€ ์ฆ๊ฐ ๊ฒฝํ–ฅ์„ฑ์„ ๋ณด์ด๋Š” ๊ด€๊ณ„
๋ชฉํ‘œ : ํ‰๊ท  ๊ธฐ์˜จ์— ๋‹ค๋ฅธ ์•„์ด์Šคํฌ๋ฆผ ํŒ๋งค๋Ÿ‰ ์˜ˆ์ธกํ•˜๊ธฐ 
ํ•ด๊ฒฐ๋ฐฉ์•ˆ : ์˜ˆ์ธก๋Œ€์ƒ์ด ์žˆ๊ณ , ๊ทธ์— ๋Œ€ํ•œ feature label์ด ์žˆ์œผ๋ฏ€๋กœ ์ง€๋„ํ•™์Šต -> ์˜ˆ์ธกํ•ด์•ผํ•  y๋ฐ์ดํ„ฐ๊ฐ€ ์ˆ˜์น˜ํ˜• ๊ฐ’์ด๋ฏ€๋กœ ํšŒ๊ท€ ๋ถ„์„ ์•Œ๊ณ ๋ฆฌ์ฆ˜

 

ํšŒ๊ท€ ๋ถ„์„์ด๋ž€?
label์ด ์ˆ˜์น˜ํ˜•์ผ ๋•Œ, ๋ฐ์ดํ„ฐ๋ฅผ ๊ฐ€์žฅ ์ž˜ ์„ค๋ช…ํ•˜๋Š” ๋ชจ๋ธ์„ ์ฐพ์•„ ์ž…๋ ฅ๊ฐ’์— ๋”ฐ๋ฅธ ๋ฏธ๋ž˜ ๊ฒฐ๊ณผ๊ฐ’์„ ์˜ˆ์ธกํ•˜๋Š” ์•Œ๊ณ ๋ฆฌ์ฆ˜
์ฃผ์–ด์ง„ ๋ฐ์ดํ„ฐ X : ํ‰๊ท ๊ธฐ์˜จ, y : ์•„์ด์Šคํฌ๋ฆผ ํŒ๋งค๋Ÿ‰
๊ฐ€์ • : Y = β0 + β1X (์ง์„ ๋ชจ๋ธ)
๋ชฉํ‘œ : ์ ์ ˆํ•œ β0, β1๊ฐ’(๋ฐ์ดํ„ฐ๋ฅผ ๊ฐ€์žฅ ์ž˜ ์„ค๋ช…ํ•˜๋Š” ๋ชจ๋ธ)์„ ์ฐพ์ž


Finding suitable β0 and β1
A perfect prediction is impossible, so we must approximate as well as we can.
Find the line that minimizes the difference between each data point's actual value and the model's prediction.
Let's train a simple linear regression model and see how to find the line that minimizes this difference.




 

Simple Linear Regression

What is simple linear regression?
A model that assumes the data can be explained by a straight line
Y = β0 + β1X
We need to find the β0 (y-intercept) and β1 (slope) that define the line


๋ฐ์ดํ„ฐ๋ฅผ ์ž˜ ์„ค๋ช…ํ•œ๋‹ค๋Š” ๊ฒƒ์€?
์‹ค์ œ ์ •๋‹ต๊ณผ ๋‚ด๊ฐ€ ์˜ˆ์ธกํ•œ ๊ฐ’๊ณผ์˜ ์ฐจ์ด๊ฐ€ ์ž‘์„์ˆ˜๋ก ์ข‹์ง€ ์•Š์„๊นŒ?

-> ์™ผ์ชฝ ๊ทธ๋ž˜ํ”„์˜ ์ฐจ์ด๊ฐ€ ์˜ค๋ฅธ์ชฝ์— ๋น„ํ•ด ์ ์–ด ๋ณด์ธ๋‹ค.

 

 

Let's compute the difference between the actual and predicted values.

 

 

Comparing models by the plain sum of the differences between actual and predicted values has a pitfall.

[Interpretation] Both sums are 0, yet the right-hand graph explains the data 100%. Because positive and negative errors cancel out, the plain sum of differences between actual and predicted values is not a good measure.

 

 

Compare models by the sum of the squared differences between the actual and predicted values instead.

 

 

The Loss function
Define the Loss function as the sum of squared differences between the actual and predicted values. -> The smaller the Loss, the better the model.
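As a minimal sketch, the Loss just defined can be written as a small Python function; the data values are assumed for illustration only.

```python
import numpy as np

def loss(beta0, beta1, X, Y):
    """Sum of squared differences between actual and predicted values."""
    pred = beta0 + beta1 * X
    return ((Y - pred) ** 2).sum()

# Hypothetical data lying exactly on y = 2x.
X = np.array([1.0, 2.0, 3.0])
Y = np.array([2.0, 4.0, 6.0])

print(loss(0.0, 2.0, X, Y))  # 0.0: the line y = 2x fits perfectly
print(loss(0.0, 1.0, X, Y))  # 14.0: a worse line has a larger Loss
```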

 

Lossํ•จ์ˆ˜ ์ค„์ด๊ธฐ
Lossํ•จ์ˆ˜์—์„œ ์ฃผ์–ด์ง„ ๊ฐ’์€ ์ž…๋ ฅ ๊ฐ’๊ณผ ์‹ค์ œ ๊ฐ’์ด๋‹ค.
β0(y์ ˆํŽธ)์™€ β1(๊ธฐ์šธ๊ธฐ) ๊ฐ’์„ ์กฐ์ ˆํ•˜์—ฌ Lossํ•จ์ˆ˜์˜ ํฌ๊ธฐ๋ฅผ ์ž‘๊ฒŒ ๋งŒ๋“ ๋‹ค -> ๊ฑฐ์˜ ๋ชจ๋“  ์ง์„ ์„ ๊ทธ๋ฆด ์ˆ˜ ์žˆ๋‹ค!
๊ธฐ์šธ๊ธฐ ๊ฐ’์ด ์ปค์ง€๋ฉด ์ง์„ ์ด ์œ„์ชฝ์œผ๋กœ ์†Ÿ๊ณ , ๊ธฐ์šธ๊ธฐ ๊ฐ’์ด ์ž‘์•„์ง€๋ฉด ์ง์„ ์ด ์•„๋ž˜๋กœ ๋‚ด๋ ค๊ฐ„๋‹ค -> ๊ฐ๋„ ์กฐ์ ˆ
์ ˆํŽธ์ด ์ปค์ง€๋ฉด y์ถ• ์œ„์ชฝ์—์„œ ์‹œ์ž‘ํ•˜๋Š” ์ง์„ , ์ ˆํŽธ์ด ์ž‘์•„์ง€๋ฉด y์ถ• ์•„๋ž˜์—์„œ ์‹œ์ž‘ํ•˜๋Š” ์ง์„ 
Loss๊ฐ’์„ ๊ตฌํ•ด์„œ ๊ทธ ์ง์„ ์ด ์–ผ๋งˆ๋‚˜ ์ข‹์€ ์ง์„ ์ธ์ง€ ํŒ๋‹จํ•  ์ˆ˜ ์žˆ๋Š” ๊ทผ๊ฑฐ๊ฐ€ ๋œ๋‹ค


Lossํ•จ์ˆ˜์˜ ํฌ๊ธฐ๋ฅผ ์ž‘๊ฒŒ ํ•˜๋Š” β0(y์ ˆํŽธ)์™€ β1(๊ธฐ์šธ๊ธฐ)๋ฅผ ์ฐพ๋Š” ๋ฐฉ๋ฒ•

1) Gradient descent (๊ฒฝ์‚ฌ ํ•˜๊ฐ•๋ฒ•)
2) Normal equation (least squares)
3) Brute force search
4) ...

 

 

Gradient Descent
Call the intercept and slope that minimize the Loss function β0* and β1*.
Gradient descent does not obtain β0*, β1* in a single calculation; starting from an initial value, it approaches them step by step.
* initial value: arbitrary β0, β1 values


β0, β1๊ฐ’์„ Lossํ•จ์ˆ˜ ๊ฐ’์ด ์ž‘์•„์ง€๊ฒŒ ๊ณ„์† updateํ•˜๋Š” ๋ฐฉ๋ฒ•

1) β0, β1๊ฐ’์„ ๋žœ๋คํ•˜๊ฒŒ ์ดˆ๊ธฐํ™”
2) ํ˜„์žฌ β0, β1๊ฐ’์œผ๋กœ Loss๊ฐ’ ๊ณ„์‚ฐ
3) ํ˜„์žฌ  β0, β1๊ฐ’์„ ์–ด๋–ป๊ฒŒ ๋ณ€ํ™”์‹œ์ผœ์•ผ Loss๊ฐ’์„ ์ค„์ผ ์ˆ˜ ์žˆ๋Š”์ง€ ์•Œ ์ˆ˜ ์žˆ๋Š” Gradient๊ฐ’ ๊ณ„์‚ฐ
* Gradient๊ฐ’ : Loss๊ฐ’์„ ์ค„์ผ ์ˆ˜ ์žˆ๊ฒŒ ํ•˜๋Š”์ง€์— ๋Œ€ํ•œ ํžŒํŠธ, ์ด๋ฏธ์ง€์—์„œ ํ•˜๊ฐ•ํ•˜๋Š” ๋ฐฉํ–ฅ
4) Gredient๊ฐ’์„ ํ™œ์šฉํ•˜์—ฌ  β0, β1๊ฐ’ ์—…๋ฐ์ดํŠธ
5) Loss๊ฐ’์˜ ์ฐจ์ด๊ฐ€ ๊ฑฐ์˜ ์—†์–ด์งˆ ๋•Œ๊นŒ์ง€ 2~4๋ฒˆ ๊ณผ์ •์„ ๋ฐ˜๋ณต (Loss๊ฐ’๊ณผ ์ฐจ์ด๊ฐ€ ์ค„๋ฉด, Gradient๊ฐ’๋„ ์ž‘์•„์ง)

 

 

Simple linear regression workflow
1) Data preprocessing -> X, Y values
2) Train the simple linear regression model -> gradient descent
3) Predict on new data -> obtain the final line (the one with the smallest Loss)

Characteristics of simple linear regression
The most basic algorithm, and still widely used
Applicable only when there is a single input variable
Convenient for examining the relationship between the input and the output
Shows how much the input affects the output
Used when you want to interpret the relationship between two variables intuitively

[Exercise 1] Simple linear regression - data preprocessing
To apply the given data to a linear model loaded from sklearn, preprocessing is required.
In this exercise we preprocess the data for sklearn's LinearRegression.

The LinearRegression model accepts a pandas DataFrame for the feature (X) data and a Series for the label (Y) data.
X and Y must contain the same number of samples.
1. Convert the X data into a DataFrame whose column name is X and store it in train_X.
2. Convert the list Y into a Series and store it in train_Y.

import matplotlib as mpl
mpl.use("Agg")
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
from sklearn.linear_model import LinearRegression

import elice_utils
eu = elice_utils.EliceUtils()

X = [8.70153760, 3.90825773, 1.89362433, 3.28730045, 7.39333004, 2.98984649, 2.25757240, 9.84450732, 9.94589513, 5.48321616]
Y = [5.64413093, 3.75876583, 3.87233310, 4.40990425, 6.43845020, 4.02827829, 2.26105955, 7.15768995, 6.29097441, 5.19692852]

#1. X์˜ ํ˜•ํƒœ๋ฅผ ๋ณ€ํ™˜ํ•˜์—ฌ train_X์— ์ €์žฅํ•ฉ๋‹ˆ๋‹ค.
train_X = pd.DataFrame(X, columns=['X'])

#2. Y์˜ ํ˜•ํƒœ๋ฅผ ๋ณ€ํ™˜ํ•˜์—ฌ train_Y์— ์ €์žฅํ•ฉ๋‹ˆ๋‹ค.
train_Y = pd.Series(Y)

#๋ณ€ํ™˜๋œ ๋ฐ์ดํ„ฐ๋ฅผ ์ถœ๋ ฅํ•ฉ๋‹ˆ๋‹ค.
print('์ „ ์ฒ˜๋ฆฌํ•œ X ๋ฐ์ดํ„ฐ: \n {}'.format(train_X))
print('์ „ ์ฒ˜๋ฆฌํ•œ X ๋ฐ์ดํ„ฐ shape: {}\n'.format(train_X.shape))

print('์ „ ์ฒ˜๋ฆฌํ•œ Y ๋ฐ์ดํ„ฐ: \n {}'.format(train_Y))
print('์ „ ์ฒ˜๋ฆฌํ•œ Y ๋ฐ์ดํ„ฐ shape: {}'.format(train_Y.shape))



[Exercise 2] Simple linear regression - training
With the machine learning library scikit-learn, the β0 and β1 that minimize the Loss function are easy to obtain.
Let's feed the data preprocessed in [Exercise 1] into a LinearRegression model and train it.

import matplotlib as mpl
mpl.use("Agg")
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
from sklearn.linear_model import LinearRegression
import elice_utils
eu = elice_utils.EliceUtils()


X = [8.70153760, 3.90825773, 1.89362433, 3.28730045, 7.39333004, 2.98984649, 2.25757240, 9.84450732, 9.94589513, 5.48321616]
Y = [5.64413093, 3.75876583, 3.87233310, 4.40990425, 6.43845020, 4.02827829, 2.26105955, 7.15768995, 6.29097441, 5.19692852]

train_X = pd.DataFrame(X, columns=['X'])
train_Y = pd.Series(Y)

#1. ๋ชจ๋ธ์„ ์ดˆ๊ธฐํ™” ํ•ฉ๋‹ˆ๋‹ค.
lrmodel = LinearRegression()

#2. train_X, train_Y ๋ฐ์ดํ„ฐ๋ฅผ ํ•™์Šตํ•ฉ๋‹ˆ๋‹ค.
lrmodel.fit(train_X, train_Y)


#ํ•™์Šตํ•œ ๊ฒฐ๊ณผ๋ฅผ ์‹œ๊ฐํ™”ํ•˜๋Š” ์ฝ”๋“œ์ž…๋‹ˆ๋‹ค.
plt.scatter(X, Y) 
plt.plot([0, 10], [lrmodel.intercept_, 10 * lrmodel.coef_[0] + lrmodel.intercept_], c='r') 
plt.xlim(0, 10) 
plt.ylim(0, 10) 
plt.title('Training Result')
plt.savefig("test.png") 
eu.send_image("test.png")

 

 

[Exercise 3] Simple linear regression - prediction
Let's compute predictions with the model trained in [Exercise 2].
To predict with LinearRegression, use its predict function.
1. Train lrmodel, compute the predictions for train_X, and store them in pred_X.

import matplotlib as mpl
mpl.use("Agg")
import matplotlib.pyplot as plt
import pandas as pd
import numpy as np
from sklearn.linear_model import LinearRegression

import elice_utils
eu = elice_utils.EliceUtils()


    
X = [8.70153760, 3.90825773, 1.89362433, 3.28730045, 7.39333004, 2.98984649, 2.25757240, 9.84450732, 9.94589513, 5.48321616]
Y = [5.64413093, 3.75876583, 3.87233310, 4.40990425, 6.43845020, 4.02827829, 2.26105955, 7.15768995, 6.29097441, 5.19692852]

train_X = pd.DataFrame(X, columns=['X'])
train_Y = pd.Series(Y)

#๋ชจ๋ธ์„ ํŠธ๋ ˆ์ด๋‹ํ•ฉ๋‹ˆ๋‹ค.
lrmodel = LinearRegression()
lrmodel.fit(train_X, train_Y)

#1. train_X์— ๋Œ€ํ•ด์„œ ์˜ˆ์ธกํ•ฉ๋‹ˆ๋‹ค.
pred_X = lrmodel.predict(train_X)
print('train_X์— ๋Œ€ํ•œ ์˜ˆ์ธก๊ฐ’ : \n{}\n'.format(pred_X))
print('์‹ค์ œ๊ฐ’ : \n{}'.format(train_Y))

Multiple Linear Regression

[Premise] What if rainfall is added to the input X?
That is, suppose we want to predict ice-cream sales from both the average temperature and the average rainfall.

When several inputs (X) are used to predict the output (Y) -> multiple linear regression

 

๋‹ค์ค‘ ์„ ํ˜• ํšŒ๊ท€ ๋ชจ๋ธ ์ดํ•ดํ•˜๊ธฐ

์ž…๋ ฅ๊ฐ’ X๊ฐ€ ์—ฌ๋Ÿฌ ๊ฐœ(2๊ฐœ ์ด์ƒ)์ธ ๊ฒฝ์šฐ ํ™œ์šฉํ•  ์ˆ˜ ์žˆ๋Š” ํšŒ๊ท€ ์•Œ๊ณ ๋ฆฌ์ฆ˜
๊ฐ ๊ฐœ๋ณ„ Xi์— ํ•ด๋‹นํ•˜๋Š” ์ตœ์ ์˜ βi๋ฅผ ์ฐพ์•„์•ผ ํ•œ๋‹ค.

์„ ํ˜• ๊ด€๊ณ„๋ฅผ ๊ฐ€์ •ํ•œ๋‹ค.

 

 

๋‹ค์ค‘ ์„ ํ˜• ํšŒ๊ท€ ๋ชจ๋ธ์˜ Lossํ•จ์ˆ˜
๋‹จ์ˆœ ์„ ํ˜• ํšŒ๊ท€์™€ ๋งˆ์ฐฌ๊ฐ€์ง€๋กœ Lossํ•จ์ˆ˜๋Š” ์ž…๋ ฅ๊ฐ’๊ณผ ์‹ค์ œ๊ฐ’ ์ฐจ์ด์˜ ์ œ๊ณฑ์˜ ํ•ฉ์œผ๋กœ ์ •์˜ํ•œ๋‹ค.
๋งˆ์ฐฌ๊ฐ€์ง€๋กœ β0, β1, β2...βm ๊ฐ’์„ ์กฐ์ ˆํ•˜์—ฌ Lossํ•จ์ˆ˜์˜ ํฌ๊ธฐ๋ฅผ ์ž‘๊ฒŒ ํ•œ๋‹ค.

ํ‰๊ท  ๊ธฐ์˜จ๊ณผ ํ‰๊ท  ๊ฐ•์ˆ˜๋Ÿ‰์œผ๋กœ ์•„์ด์Šคํฌ๋ฆผ ํŒ๋งค๋Ÿ‰ ์˜ˆ์ธก ์˜ˆ์‹œ

543.94/4 = Loss๊ฐ’์€ ์•ฝ 133

 

๋‹ค์ค‘ ์„ ํ˜• ํšŒ๊ท€ ๋ชจ๋ธ์˜ ๊ฒฝ์‚ฌ ํ•˜๊ฐ•๋ฒ•

β0, β1, β2...βm ๊ฐ’์„ Lossํ•จ์ˆ˜ ๊ฐ’์ด ์ž‘์•„์ง€๊ฒŒ ๊ณ„์† ์—…๋ฐ์ดํŠธ ํ•˜๋Š” ๋ฐฉ๋ฒ•

 

1) β0, β1, β2...βm ๊ฐ’์„ ๋žœ๋คํ•˜๊ฒŒ ์ดˆ๊ธฐํ™”
2) ํ˜„์žฌ β0, β1, β2...βm ๊ฐ’์œผ๋กœ Loss๊ฐ’ ๊ณ„์‚ฐ
3) ํ˜„์žฌ β0, β1, β2...βm ๊ฐ’์„ ์–ด๋–ป๊ฒŒ ๋ณ€ํ™”์‹œ์ผœ์•ผ Loss๊ฐ’์„ ์ค„์ผ ์ˆ˜ ์žˆ๋Š”์ง€ ์•Œ ์ˆ˜ ์žˆ๋Š” Gradient๊ฐ’ ๊ณ„์‚ฐ
* Gradient๊ฐ’ : Loss๊ฐ’์„ ์ค„์ผ ์ˆ˜ ์žˆ๊ฒŒ ํ•˜๋Š”์ง€์— ๋Œ€ํ•œ ํžŒํŠธ
4) Gredient๊ฐ’์„ ํ™œ์šฉํ•˜์—ฌ β0, β1, β2...βm ๊ฐ’ ์—…๋ฐ์ดํŠธ
5) Loss๊ฐ’์˜ ์ฐจ์ด๊ฐ€ ๊ฑฐ์˜ ์—†์–ด์งˆ ๋•Œ๊นŒ์ง€ 2~4๋ฒˆ ๊ณผ์ •์„ ๋ฐ˜๋ณต (Loss๊ฐ’๊ณผ ์ฐจ์ด๊ฐ€ ์ค„๋ฉด, Gradient๊ฐ’๋„ ์ž‘์•„์ง)

 

Example: predicting ice-cream sales from average temperature and average rainfall

42.89 / 4 -> a Loss of about 10.7

The Loss has dropped dramatically from the earlier value.

 

 

Characteristics of multiple linear regression
Lets you examine the relationship between several inputs and the output
Shows which inputs affect the output, and how
If the inputs are highly correlated with one another, the results may lose credibility
* correlation: a relationship in which one quantity changes whenever the other does

* What if X1 and X2 are highly correlated? Each β was introduced to measure how much its own X affects the output; if X2 grows whenever X1 grows, the two βs interfere with each other, the model's assumption is violated, and the coefficients lose credibility.
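A quick way to spot such a situation before fitting is a correlation matrix. Below is a sketch on synthetic data (the variable names and values are assumptions) in which X2 nearly duplicates X1 while X3 is independent.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(42)

# Hypothetical features: X2 is almost a copy of X1, X3 is independent.
x1 = rng.normal(size=200)
df = pd.DataFrame({
    'X1': x1,
    'X2': x1 + rng.normal(scale=0.05, size=200),
    'X3': rng.normal(size=200),
})

# Off-diagonal entries near +/-1 warn that coefficients may be unreliable.
print(df.corr().round(2))
```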

[Exercise 4] Multiple regression - data preprocessing
Given Sales data as a function of spending on FB, TV, and Newspaper advertising, let's analyze it with multiple regression.
To preprocess the data, separate the three advertising variables into the feature data and the Sales variable into the label data, then split them into training and test sets.

1. From the DataFrame df, store the Sales variable in Y as the label data and the remaining variables in X.
2. Use train_test_split to split X and Y into training and test data at an 8:2 ratio. (Keep random_state=42.)

import numpy as np
import pandas as pd
from sklearn.linear_model import LinearRegression
from sklearn.metrics import r2_score
from sklearn.model_selection import train_test_split

df = pd.read_csv("data/Advertising.csv")
print('Original data sample :')
print(df.head(),'\n')

# Drop the Unnamed: 0 column, which is not used as an input variable.
df = df.drop(columns=['Unnamed: 0'])

#1. Store the Sales variable in Y as the label data and the rest in X.
X = df.drop(columns=['Sales'])
Y = df['Sales']

#2. Split into training and test data.
train_X, test_X, train_Y, test_Y = train_test_split(X, Y, test_size=0.2, random_state=42)

# Print the preprocessed data.
print('train_X : ')
print(train_X.head(),'\n')
print('train_Y : ')
print(train_Y.head(),'\n')

print('test_X : ')
print(test_X.head(),'\n')
print('test_Y : ')
print(test_Y.head())

 


[Exercise 5] Multiple regression - training
Let's apply a multiple linear regression model to the data preprocessed in [Exercise 4].
Multiple linear regression uses LinearRegression in the same way as simple linear regression.
In this exercise we train the model on the training data and print the learned parameters.
Parameters of a LinearRegression, such as the betas, can be obtained as in the code below.
lrmodel = LinearRegression()
lrmodel.intercept_
lrmodel.coef_[i]

1. Load the multiple linear regression model LinearRegression into lrmodel, then use fit to train it on the train_X, train_Y data.
2. From the trained model lrmodel, read the parameters corresponding to beta_0, beta_1, beta_2, and beta_3 and store them.

import numpy as np
import pandas as pd
from sklearn.linear_model import LinearRegression
from sklearn.metrics import r2_score
from sklearn.model_selection import train_test_split

#๋ฐ์ดํ„ฐ๋ฅผ ์ฝ๊ณ  ์ „ ์ฒ˜๋ฆฌํ•ฉ๋‹ˆ๋‹ค
df = pd.read_csv("data/Advertising.csv")
df = df.drop(columns=['Unnamed: 0'])

X = df.drop(columns=['Sales'])
Y = df['Sales']

train_X, test_X, train_Y, test_Y = train_test_split(X, Y, test_size=0.2, random_state=42)

#1. Initialize and train the multiple linear regression model.
lrmodel = LinearRegression()
lrmodel.fit(train_X, train_Y)

#2. Read the learned parameter values.
beta_0 = lrmodel.intercept_ # y-intercept (baseline sales)
beta_1 = lrmodel.coef_[0] # coefficient of the 1st variable (FB)
beta_2 = lrmodel.coef_[1] # coefficient of the 2nd variable (TV)
beta_3 = lrmodel.coef_[2] # coefficient of the 3rd variable (Newspaper)

print("beta_0: %f" % beta_0)
print("beta_1: %f" % beta_1)
print("beta_2: %f" % beta_2)
print("beta_3: %f" % beta_3)



[Exercise 6] Multiple regression - prediction
Based on the multiple linear regression model trained in [Exercise 5], let's now predict the Sales value for new advertising budgets.
To predict with LinearRegression, use the predict function as below.
pred_X = lrmodel.predict(X)

1. Train lrmodel, compute the predictions for test_X, and store them in pred_X.
2. Train lrmodel, compute the predictions for the given data df1, and store them in pred_df1.

import numpy as np
import pandas as pd
from sklearn.linear_model import LinearRegression
from sklearn.metrics import r2_score
from sklearn.model_selection import train_test_split

#๋ฐ์ดํ„ฐ๋ฅผ ์ฝ๊ณ  ์ „ ์ฒ˜๋ฆฌํ•ฉ๋‹ˆ๋‹ค
df = pd.read_csv("data/Advertising.csv")
df = df.drop(columns=['Unnamed: 0'])

X = df.drop(columns=['Sales'])
Y = df['Sales']

train_X, test_X, train_Y, test_Y = train_test_split(X, Y, test_size=0.2, random_state=42)


# Initialize and train the multiple linear regression model.
lrmodel = LinearRegression()
lrmodel.fit(train_X, train_Y)

print('test_X : ')
print(test_X)

#1. Predict on test_X.
pred_X = lrmodel.predict(test_X)
print('Predictions for test_X : \n{}\n'.format(pred_X))

# Define the new data df1.
df1 = pd.DataFrame(np.array([[0, 0, 0], [1, 0, 0], [0, 1, 0], [0, 0, 1], [1, 1, 1]]), columns=['FB', 'TV', 'Newspaper'])
print('df1 : ')
print(df1)

#2. Predict on df1.
pred_df1 = lrmodel.predict(df1)
print('Predictions for df1 : \n{}'.format(pred_df1))

ํšŒ๊ท€ ํ‰๊ฐ€ ์ง€ํ‘œ

์–ด๋–ค ๋ชจ๋ธ์ด ์ข‹์€ ๋ชจ๋ธ์ธ์ง€๋ฅผ ์–ด๋–ป๊ฒŒ ํ‰๊ฐ€ํ•  ์ˆ˜ ์žˆ์„๊นŒ?

๋ชฉํ‘œ๋ฅผ ์–ผ๋งˆ๋‚˜ ์ž˜ ๋‹ฌ์„ฑํ–ˆ๋Š”์ง€ ์ •๋„๋ฅผ ํ‰๊ฐ€ํ•ด์•ผ ํ•œ๋‹ค.
์‹ค์ œ ๊ฐ’๊ณผ ๋ชจ๋ธ์ด ์˜ˆ์ธกํ•˜๋Š” ๊ฐ’์˜ ์ฐจ์ด์— ๊ธฐ๋ฐ˜ํ•œ ํ‰๊ฐ€ ๋ฐฉ๋ฒ•์„ ์‚ฌ์šฉํ•œ๋‹ค. ex) RSS, MSE, MAE, MATE, R^2


RSS (Residual Sum of Squares)

1. The sum of the squared errors between the actual and predicted values
2. The smaller the value, the better the model's performance
3. The total of the squared errors between actual and predicted values over the whole dataset

RSS characteristics
The simplest evaluation method, allowing an intuitive interpretation
However, because it uses the raw errors, it depends on the scale of the input values (if the data range is small, RSS is small; if the range is large, RSS is large)
It cannot be compared to an absolute reference value

 

 

MSE, MAE (์ ˆ๋Œ€์ ์ธ ํฌ๊ธฐ์— ์˜์กดํ•œ ์ง€ํ‘œ)

1. MSE(Mean Squared Error)
ํ‰๊ท  ์ œ๊ณฑ ์˜ค์ฐจ, RSS์—์„œ ๋ฐ์ดํ„ฐ ์ˆ˜ ๋งŒํผ ๋‚˜๋ˆˆ ๊ฐ’.
์ž‘์„์ˆ˜๋ก ๋ชจ๋ธ์˜ ์„ฑ๋Šฅ์ด ๋†’๋‹ค๊ณ  ํ‰๊ฐ€ํ•  ์ˆ˜ ์žˆ๋‹ค.
* MSE๋Š” ์ง€ํ‘œ, Loss๋Š” ๋ชจ๋ธ์—์„œ ์ค„์—ฌ์•ผ ํ•˜๋Š” ๊ฐ’

 

2. MAE(Mean Absolute Error)
ํ‰๊ท  ์ ˆ๋Œ€๊ฐ’ ์˜ค์ฐจ, ์‹ค์ œ ๊ฐ’๊ณผ ์˜ˆ์ธก ๊ฐ’์˜ ์˜ค์ฐจ์˜ ์ ˆ๋Œ€๊ฐ’์˜ ํ‰๊ท 
์ž‘์„์ˆ˜๋ก ๋ชจ๋ธ์˜ ์„ฑ๋Šฅ์ด ๋†’๋‹ค๊ณ  ํ‰๊ฐ€ํ•  ์ˆ˜ ์žˆ๋‹ค.

MSE, MAE ํŠน์ง•
MSE : ์ด์ƒ์น˜(Outlier) ์ฆ‰, ๋ฐ์ดํ„ฐ๋“ค ์ค‘ ํฌ๊ฒŒ ๋–จ์–ด์ง„ ๊ฐ’์— ๋ฏผ๊ฐํ•˜๋‹ค
MAE : ๋ณ€๋™์„ฑ์ด ํฐ ์ง€ํ‘œ์™€ ๋‚ฎ์€ ์ง€ํ‘œ๋ฅผ ๊ฐ™์ด ์˜ˆ์ธกํ•  ๋•Œ ์œ ์šฉํ•˜๋‹ค
๊ฐ€์žฅ ๊ฐ„๋‹จํ•œ ํ‰๊ฐ€ ๋ฐฉ๋ฒ•๋“ค๋กœ ์ง๊ด€์  ํ•ด์„์ด ๊ฐ€๋Šฅ
๊ทธ๋Ÿฌ๋‚˜ ํ‰๊ท ์„ ๊ทธ๋Œ€๋กœ ์ด์šฉํ•˜๋ฏ€๋กœ ์ž…๋ ฅ ๊ฐ’์˜ ํฌ๊ธฐ์— ์˜์กด์ 
์ ˆ๋Œ€์ ์ธ ๊ฐ’๊ณผ ๋น„๊ต ๋ถˆ๊ฐ€๋Šฅ


R^2 (Coefficient of Determination)

A metric that expresses the explanatory power of a regression model
The closer to 1, the better the model's performance

TSS is the sum of squared differences between the actual values and the mean of the data

RSS is the sum of squared differences between the actual values and the fitted regression line

R^2 = 1 - RSS / TSS

* For a fitted line, TSS >= RSS. The closer RSS is driven to 0, the closer R^2 gets to 1 (a good prediction).

A value near 0 means a poor prediction; a negative value means the model is completely off the mark.

R^2 characteristics
The smaller the error, the closer the value is to 1
A value of 0 corresponds to a straight-line model that simply outputs the mean of the data
A negative value means the model performs worse than predicting the mean

[Exercise 7] Regression evaluation metrics - MSE, MAE
Continuing from [Exercise 6], let's evaluate the Sales prediction model using several regression evaluation metrics.
MSE and MAE are easy to obtain with sklearn library functions.
scikit-learn functions for computing the MSE and MAE metrics:
mean_squared_error(y_true, y_pred): computes the MSE value
mean_absolute_error(y_true, y_pred): computes the MAE value

1. Compute the MSE and MAE values for the train_X data and store them in MSE_train, MAE_train.
2. Compute the MSE and MAE values for the test_X data and store them in MSE_test, MAE_test.

import numpy as np
import pandas as pd
from sklearn.linear_model import LinearRegression
from sklearn.metrics import r2_score
from sklearn.model_selection import train_test_split
from sklearn.metrics import mean_absolute_error
from sklearn.metrics import mean_squared_error

#๋ฐ์ดํ„ฐ๋ฅผ ์ฝ๊ณ  ์ „ ์ฒ˜๋ฆฌํ•ฉ๋‹ˆ๋‹ค
df = pd.read_csv("data/Advertising.csv")
df = df.drop(columns=['Unnamed: 0'])

X = df.drop(columns=['Sales'])
Y = df['Sales']

train_X, test_X, train_Y, test_Y = train_test_split(X, Y, test_size=0.2, random_state=42)


# Initialize and train the multiple linear regression model.
lrmodel = LinearRegression()
lrmodel.fit(train_X, train_Y)


# Compute the predictions for train_X.
pred_train = lrmodel.predict(train_X)

#1. Compute the MSE and MAE values for train_X.
MSE_train = mean_squared_error(train_Y, pred_train)
MAE_train = mean_absolute_error(train_Y, pred_train)
print('MSE_train : %f' % MSE_train)
print('MAE_train : %f' % MAE_train)

# Compute the predictions for test_X.
pred_test = lrmodel.predict(test_X)

#2. Compute the MSE and MAE values for test_X.
MSE_test = mean_squared_error(test_Y, pred_test)
MAE_test = mean_absolute_error(test_Y, pred_test)
print('MSE_test : %f' % MSE_test)
print('MAE_test : %f' % MAE_test)



[Exercise 8] Regression evaluation metrics - R2
Continuing from [Exercise 7], let's evaluate the Sales prediction model using another regression evaluation metric.
The R2 score is easy to obtain with an sklearn library function.
scikit-learn function for computing the R2 metric:
r2_score(y_true, y_pred): computes the R2 score

1. Compute the R2 value for the train_X data and store it in R2_train.
2. Compute the R2 value for the test_X data and store it in R2_test.

import numpy as np
import pandas as pd
from sklearn.linear_model import LinearRegression
from sklearn.metrics import r2_score
from sklearn.model_selection import train_test_split
from sklearn.metrics import mean_absolute_error
from sklearn.metrics import mean_squared_error

#๋ฐ์ดํ„ฐ๋ฅผ ์ฝ๊ณ  ์ „ ์ฒ˜๋ฆฌํ•ฉ๋‹ˆ๋‹ค
df = pd.read_csv("data/Advertising.csv")
df = df.drop(columns=['Unnamed: 0'])

X = df.drop(columns=['Sales'])
Y = df['Sales']

train_X, test_X, train_Y, test_Y = train_test_split(X, Y, test_size=0.2, random_state=42)


# Initialize and train the multiple linear regression model.
lrmodel = LinearRegression()
lrmodel.fit(train_X, train_Y)

# Compute the predictions for train_X.
pred_train = lrmodel.predict(train_X)

#1. Compute the R2 value for train_X.
R2_train = r2_score(train_Y, pred_train)
print('R2_train : %f' % R2_train)

# Compute the predictions for test_X.
pred_test = lrmodel.predict(test_X)

#2. Compute the R2 value for test_X.
R2_test = r2_score(test_Y, pred_test)
print('R2_test : %f' % R2_test)
