AI Practical Applications Course
[Applied Course] Getting Started with Deep Learning (4) Deep Learning

 

 

๋”ฅ๋Ÿฌ๋‹๊ฐœ๋ก 

์ธ๊ณต์ง€๋Šฅ > ๋จธ์‹ ๋Ÿฌ๋‹ > ๋”ฅ๋Ÿฌ๋‹

 

 

๋”ฅ๋Ÿฌ๋‹์ด๋ž€?

๋จธ์‹ ๋Ÿฌ๋‹์˜ ์—ฌ๋Ÿฌ ๋ฐฉ๋ฒ•๋ก  ์ค‘ ํ•˜๋‚˜๋กœ์จ ์ธ๊ณต์‹ ๊ฒฝ๋ง์— ๊ธฐ๋ฐ˜ํ•˜์—ฌ ์ปดํ“จํ„ฐ์—๊ฒŒ ์‚ฌ๋žŒ์˜ ์‚ฌ๊ณ ๋ฐฉ์‹์„ ๊ฐ€๋ฅด์น˜๋Š” ๋ฐฉ๋ฒ•

 

 

์ธ๊ณต์‹ ๊ฒฝ๋ง์ด๋ž€?

์ƒ๋ฌผํ•™์˜ ์‹ ๊ฒฝ๋ง์—์„œ ์˜๊ฐ์„ ์–ป์€ ํ•™์Šต ์•Œ๊ณ ๋ฆฌ์ฆ˜. ์‚ฌ๋žŒ์˜ ์‹ ๊ฒฝ ์‹œ์Šคํ…œ์„ ๋ชจ๋ฐฉํ•จ

 

 

The Nervous System?

The brain's smallest unit of information processing.

A structure in which neurons are connected to one another in bundles.

When a stimulus comes in, it is passed from neuron to neuron, and several internal processing steps take place along the way.

 

์‚ฌ๋žŒ์˜ ์‹ ๊ฒฝ ์‹œ์Šคํ…œ

 

 

๋”ฅ๋Ÿฌ๋‹ ์—ญ์‚ฌ

First AI winter : ์ฒซ๋ฒˆ์งธ ๋”ฅ๋Ÿฌ๋‹ ๋น™ํ•˜๊ธฐ

1986๋…„๋„ : ๊ธฐ๋ณธ์ ์ธ ์ด๋ก ๋“ค ๋“ฑ์žฅ

Second AI winter : ๋‘๋ฒˆ์งธ ๋”ฅ๋Ÿฌ๋‹ ๋น™ํ•˜๊ธฐ

2012๋…„๋„ : ์ด๋ฏธ์ง€๋„ท (GPU๋ฅผ ์‚ฌ์šฉํ•œ ๋”ฅ๋Ÿฌ๋‹ AlexNet์œผ๋กœ ์ด๋ฏธ์ง€๋ฅผ ๊ตฌ๋ถ„์˜ ์ •ํ™•๋„๋ฅผ ๋Œ์–ด์˜ฌ๋ฆผ)

 

 

ํ˜„๋Œ€์˜ ๋‹ค์–‘ํ•œ ๋”ฅ๋Ÿฌ๋‹ ๊ธฐ์ˆ  ์ ์šฉ ์‚ฌ๋ก€

1. ์–ผ๊ตด ์ธ์‹ ์นด๋ฉ”๋ผ

2. ๊ธฐ๊ณ„ ๋ฒˆ์—ญ ๋ชจ๋ธ

3. ์•ŒํŒŒ๊ณ  ์ œ๋กœ (๋”ฅ๋Ÿฌ๋‹X ๊ฐ•ํ™”ํ•™์Šต๋ชจ๋ธ+๋”ฅ๋Ÿฌ๋‹O)

 

 

 

 

 

Perceptron

Research before neural networks

Face recognition: distinguishing the eyes, nose, and mouth inside a box.

Digits and characters: distinguishing them through a handful of patterns.

-> A person identified the patterns directly, and prediction followed.

1958: the perceptron, an early neural network, appears.

A stem that receives N signals > they are merged into one > passed on > and delivered as N other signals (= input x, output y)

 

 

ํผ์…‰ํŠธ๋ก ์˜ ๊ธฐ๋ณธ ๊ตฌ์กฐ

๊ฐ€์ค‘์น˜ : ๋“ค์–ด์˜ค๋Š” ๊ฐ’์„ ์–ผ๋งˆ๋‚˜ ์ฆํญํ•˜๊ณ  ๊ฐํญํ•ด์ค„์ง€ ํŒ๋‹จ

bias : ์ž…๋ ฅํ•˜๋Š” ๊ฐ’์— ์ƒ๊ด€์—†์ด ๋“ค์–ด์˜ค๋Š” ๊ฐ’

∑ (summation) : '๋ชจ๋‘ ๋”ํ•ด๋ผ'

ํ™œ์„ฑํ™”ํ•จ์ˆ˜ Activation function

 

 

Activation Function

Maps the incoming value x to 1 if it is greater than 0, and to 0 if it is less than 0.
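Written out, this step activation (using the convention of the exercises below, where x = 0 also maps to 1) is:

$$f(x) = \begin{cases} 1 & x \ge 0 \\ 0 & x < 0 \end{cases}$$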

 

 

ํผ์…‰ํŠธ๋ก  ๋™์ž‘ ์˜ˆ

[์ถ”๊ฐ€ ์˜ˆ์‹œ] 1, -2, -0.5๊ฐ€ ๋“ค์–ด์™”๋‹ค๋ฉด? y = -0.5 + 2 * 1 + 1 * (-1) = activation 0.5 = 0

์ž…๋ ฅ๊ฐ’์— ๋”ฐ๋ผ ๋„์ถœ๊ฐ’์ด ๋‹ค๋ฅผ ์ˆ˜ ์žˆ๋‹ค.

 

 

ํผ์…‰ํŠธ๋ก  ๋™์ž‘ ์˜ˆ2

X1 ์‹ ์ž‘ ๋“œ๋ผ๋งˆ ์ˆ˜ / X2 ์—ฌ๊ฐ€์‹œ๊ฐ„ / Y ํ•™์Šต ์—ฌ๋ถ€

 

w0 ํ•™์Šต ์˜์ง€ / w1 ์‹ ์ž‘ ๋“œ๋ผ๋งˆ์— ๋ฐ›๋Š” ์˜ํ–ฅ / w2 ์—ฌ๊ฐ€ ์‹œ๊ฐ„์— ๋”ฐ๋ฅธ ํ•™์Šตํ•˜๊ณ  ์‹ถ์€ ์ •๋„

-> ์ž…๋ ฅ๊ฐ’(w)์— ๋”ฐ๋ผ ์˜ˆ์ธก๊ฐ’์ด ๋ฐ”๋€๋‹ค.

 

์  : ๊ฐ๊ฐ์˜ ๋ฐ์ดํ„ฐ

๋ณด๋ผ์ƒ‰ ์  : ํ•™์Šต์„ ํ•˜์ง€ ์•Š์€ ๊ฒฝ์šฐ

์ดˆ๋ก์ƒ‰ ์  : ํ•™์Šต์„ ํ•œ ๊ฒฝ์šฐ

* ์ง์„ ์„ ์–ผ๋งˆ๋‚˜ ์ž˜ ๊ตฌํ•˜๋А๋ƒ๊ฐ€ ํผ์…‰ํŠธ๋ก ์„ ์–ผ๋งŒํผ ์ž˜ ๊ตฌํ˜„ํ–ˆ๋Š”์ง€์— ๋Œ€ํ•œ ์ฒ™๋„๋‹ค.

 

 

ํผ์…‰ํŠธ๋ก ์„ ์ด์šฉํ•œ ์„ ํ˜•๋ถ„๋ฅ˜๊ธฐ

๋…ธ๋ž€์ƒ‰ : ๊ฐ•์•„์ง€ / ํŒŒ๋ž€์ƒ‰ : ๊ณ ์–‘์ด

ํผ์…‰ํŠธ๋ก ์€ ์„ ํ˜• ๋ถ„๋ฅ˜๊ธฐ๋กœ์จ ๋ฐ์ดํ„ฐ ๋ถ„๋ฅ˜๊ฐ€ ๊ฐ€๋Šฅํ•˜๋‹ค (์„ ์„ ํ†ตํ•ด ๋ถ„๋ฅ˜)

 

 

The problem

Problems appeared that a single line cannot separate (e.g. XOR) = a perceptron alone cannot classify them perfectly.

 

 

[์—ฐ์Šต๋ฌธ์ œ1] ํผ์…‰ํŠธ๋ก  ์ž‘๋™ ์˜ˆ์‹œ ๊ตฌํ˜„ํ•˜๊ธฐ

perceptron์˜ ์˜ˆ์ธก ๊ฒฐ๊ณผ๊ฐ€ ํ•™์Šตํ•œ๋‹ค:1 ์ด ๋‚˜์˜ค๋„๋ก x1x_{1}x2x_{2}์— ์ ์ ˆํ•œ ๊ฐ’์„ ์ž…๋ ฅํ•˜์„ธ์š”. ํ™œ์„ฑํ™” ํ•จ์ˆ˜๋Š” ‘์‹ ํ˜ธ์˜ ์ดํ•ฉ์ด 0 ์ด์ƒ์ด๋ฉด ํ•™์Šตํ•˜๊ณ , 0 ๋ฏธ๋งŒ์ด๋ผ๋ฉด ํ•™์Šตํ•˜์ง€ ์•Š๋Š”๋‹ค‘๋Š” ๊ทœ์น™์„ ๊ฐ€์ง‘๋‹ˆ๋‹ค.

#A perceptron function that predicts whether to study
def Perceptron(x_1,x_2):
    
    #Apply the chosen weight values
    w_0 = -5 
    w_1 = -1
    w_2 = 5
    
    #Compute the value that goes into the activation function
    output = w_0+w_1*x_1+w_2*x_2
    
    #Compute the activation function's result
    if output < 0:
        y = 0
    else:
        y = 1
    
    return y, output


#1. Enter values for x_1, x_2 so that the perceptron predicts 'study: 1'.
x_1 = 0
x_2 = 2

result, go_out = Perceptron(x_1,x_2)

print("Sum of signals : %d" % go_out)

if go_out >= 0:
    print("Study or not : %d\n ==> Study!" % result)
else:
    print("Study or not : %d\n ==> Do not study!" % result)

Sum of signals : 5

Study or not : 1

==> Study!

 

 

[Exercise 2] Build a DIY perceptron

1. Define the signal sum output and write an activation function that returns y = 1 when output is at least 0 and 0 otherwise, completing the perceptron function.

'''
1. Complete the function perceptron, which returns the signal sum and the resulting 0 or 1.
   Step01. Compute the signal sum from the inputs.
   Step02. Write the activation function: return 1 if the signal sum is at least 0, otherwise 0.
'''
def perceptron(w, x):    
    output = w[0] + w[1]*x[0] + w[2]*x[1] + w[3]*x[2] + w[4]*x[3]
    #Activation: 1 if the signal sum is at least 0, otherwise 0
    if output >= 0:
        y = 1
    else:
        y = 0
    return y, output

#Store the values of x_1, x_2, x_3, x_4 in order as a list
x = [1,2,3,4]

#Store the values of w_0, w_1, w_2, w_3, w_4 in order as a list
w = [2, -1, 1, 3, -2]

#Print the perceptron's result
y, output = perceptron(w,x)

print('output: ', output)
print('y: ', y)

 

 

[Exercise 3] Finding suitable weights for a perceptron

Implement a single-layer perceptron by hand and find suitable weight and bias values.

1. Enter the weight values that go into the perceptron function.

The values in the list w correspond, in order, to w_0, w_1, w_2.

import numpy as np


def perceptron(w, x):    
    output = w[1] * x[0] + w[2] * x[1] + w[0]    
    if output >= 0:
        y = 1
    else:
        y = 0    
    return y



#Input data
X = [[0,0], [0,1], [1,0], [1,1]]

#1. Enter the weights for the perceptron function. In order they are w_0, w_1, w_2.
w = [-2, 1, 1]

#Check whether the perceptron satisfies the AND gate
print('perceptron output')

for x in X:
    print('Input: ',x[0], x[1], ', Output: ',perceptron(w, x))

 

 

 

 

 

๋‹ค์ธต ํผ์…‰ํŠธ๋ก 

๋‹จ์ธต ํผ์…‰ํŠธ๋ก  ๋‹ค์ธต ํผ์…‰ํŠธ๋ก  (Multi Layer Perceptron)
 


์ž…๋ ฅ์ธต๊ณผ ์ถœ๋ ฅ์ธต๋งŒ ์กด์žฌ
ํผ์…‰ํŠธ๋ก ์ด 1๊ฐœ๋งŒ ์กด์žฌํ•˜๋Š” ๊ฒฝ์šฐ
๋‹จ์ธต ํผ์…‰ํŠธ๋ก ์„ ์—ฌ๋Ÿฌ ๊ฐœ ์Œ“์€ ๊ฒƒ
๋‹จ์ธต ํผ์…‰ํŠธ๋ก ์„ ๋งŽ์ด ์Œ“์„์ˆ˜๋ก ์—ฌ๋Ÿฌ๊ฐ€์ง€ ๊ฒฐ๊ณผ๊ฐ’์„ ์–ป์„ ์ˆ˜ ์žˆ๋‹ค.

* ๋น„ ์„ ํ˜•์ ์ธ ๋ฌธ์ œ(=์„  ํ•˜๋‚˜๋กœ ๋ฐ์ดํ„ฐ๋ฅผ ๋ถ„๋ฆฌํ•˜์ง€ ๋ชปํ•˜๋Š” ๋ฌธ์ œ) ํ•ด๊ฒฐ
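A minimal sketch of that claim: three of the step perceptrons from the exercises above, wired into a two-layer network, compute XOR, which no single line can separate (the gate weights below are one standard choice, assumed for this sketch).

def perceptron(w, x):
    #Step activation from the exercises: 1 if the signal sum is at least 0, else 0
    output = w[0] + w[1]*x[0] + w[2]*x[1]
    return 1 if output >= 0 else 0

def xor(x1, x2):
    nand = perceptron([ 1.5, -1, -1], [x1, x2])   #hidden layer
    or_  = perceptron([-0.5,  1,  1], [x1, x2])
    return perceptron([-1.5, 1, 1], [nand, or_])  #output layer (AND)

for x1, x2 in [(0,0), (0,1), (1,0), (1,1)]:
    print((x1, x2), '->', xor(x1, x2))  #0, 1, 1, 0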

 

 

Hidden Layer

All the layers between the input layer and the output layer.

Number of hidden layers and deep learning

When there are many hidden layers the network is deep, hence the word Deep Learning.

- Advantage: more ways to separate the data (performance can improve).

- Disadvantage: many weights to determine. (One perceptron with n inputs needs n+1 weights, so a multilayer perceptron has a very large number of weights to fit.)

 

 

 

 

 

TensorFlow and Neural Networks

Components of a deep learning model

How a deep learning model learns

Apply an algorithm that finds the model parameters minimizing the error between predicted and actual values.

That is, apply an optimization algorithm to find the weights that minimize the Loss Function.

 

 

๋”ฅ๋Ÿฌ๋‹ ๋ชจ๋ธ์—์„œ ์˜ˆ์ธก๊ฐ’ ๊ตฌํ•˜๋Š” ๋ฐฉ๋ฒ•

์ˆœ์ „ํŒŒ (Forward propagation) : ์ž…๋ ฅ ๊ฐ’์„ ๋ฐ”ํƒ•์œผ๋กœ ๊ฐ€๊นŒ์šด ํผ์…‰ํŠธ๋ก ๋ถ€ํ„ฐ ์ ์ง„์ ์œผ๋กœ ์ถœ๋ ฅ ๊ฐ’์„ ๊ณ„์‚ฐํ•˜๋Š” ๊ณผ์ •

 

 

Forward propagation example

Assume every bias is 0.

Activation function: many kinds exist, and which is used differs by deep learning model.

 

 

์ตœ์ ํ™” ๋ฐฉ์‹

์ˆœ์ „ํŒŒ๋ฅผ ์‚ฌ์šฉํ•˜๋ฉด ์˜ˆ์ธก ๊ฐ’๊ณผ ์‹ค์ œ๊ฐ’ ๊ฐ„์˜ ์˜ค์ฐจ๊ฐ’์„ ๊ตฌํ•˜์—ฌ Loss function์„ ๊ตฌํ•  ์ˆ˜ ์žˆ์Œ

๊ทธ๋ ‡๋‹ค๋ฉด ์ตœ์ ํ™”๋ฅผ ์–ด๋–ป๊ฒŒ ํ•ด์•ผํ• ๊นŒ? -> ๊ฒฝ์‚ฌํ•˜๊ฐ•๋ฒ•(Gradient descent)์„ ์‚ฌ์šฉ

 

 

Gradient Descent

A method of updating the weights so that the Loss function value gets smaller.

The weights are updated using Gradient values.

A Gradient value is determined for each weight and can be obtained via backpropagation.

 

Backpropagation

 

 

๊ฐ€์ค‘์น˜ ์—…๋ฐ์ดํŠธ ๊ณผ์ •

์œ„ ๊ณผ์ •์„ ์ˆ˜ํ–‰ํ•˜์—ฌ ๊ฐ€์ค‘์น˜๋“ค์„ ์—…๋ฐ์ดํŠธํ•  ์ˆ˜ ์žˆ์œผ๋ฉฐ, ์ด๋ฅผ ๋ฐ˜๋ณตํ•˜์—ฌ Loss function์„ ์ œ์ผ ์ž‘๊ฒŒ ๋งŒ๋“œ๋Š” ๊ฐ€์ค‘์น˜๋ฅผ ๊ตฌํ•จ

 

 

๋”ฅ๋Ÿฌ๋‹ ๋ชจ๋ธ์˜ ํ•™์Šต์ˆœ์„œ

1. ํ•™์Šต์šฉ feature๋ฐ์ดํ„ฐ๋ฅผ ์ž…๋ ฅํ•˜์—ฌ ์˜ˆ์ธก๊ฐ’ ๊ตฌํ•˜๊ธฐ (์ˆœ์ „ํŒŒ)

2. ์˜ˆ์ธก๊ฐ’๊ณผ ์‹ค์ œ๊ฐ’ ์‚ฌ์ด์˜ ์˜ค์ฐจ ๊ตฌํ•˜๊ธฐ (Loss๊ตฌํ•˜๊ธฐ)

3. Loss๋ฅผ ์ค„์ผ ์ˆ˜ ์žˆ๋Š” ๊ฐ€์ค‘์น˜ ์—…๋ฐ์ดํŠธํ•˜๊ธฐ (์—ญ์ „ํŒŒ)

4. 1~3๋ฒˆ์„ ๋ฐ˜๋ณตํ•˜๋ฉฐ Loss๋ฅผ ์ตœ์†Œ๋กœ ํ•˜๋Š” ๊ฐ€์ค‘์น˜ ์–ป๊ธฐ

 

 

 

 

 

Implementing Deep Learning with TensorFlow - Data Preprocessing

TensorFlow?

A flexible, efficient, and scalable deep learning framework.

Runs on devices from large compute clusters down to smartphones.

The most widely used deep learning framework.

 

 

๋ฐ์ดํ„ฐ ์ „์ฒ˜๋ฆฌ

Tensorflow ๋”ฅ๋Ÿฌ๋‹ ๋ชจ๋ธ์€ Tensorํ˜•ํƒœ์˜ ๋ฐ์ดํ„ฐ๋ฅผ ์ž…๋ ฅ๋ฐ›๋Š”๋‹ค.

Tensor : ๋‹ค์ฐจ์›๋ฐฐ์—ด๋กœ์จ Tensorflow์—์„œ ์‚ฌ์šฉํ•˜๋Š” ๊ฐ์ฒด

๋ฐ์ดํ„ฐ > Tensorํ˜•ํƒœ ๋ฐ์ดํ„ฐ ๋ณ€ํ™˜ > Tensorflow๋”ฅ๋Ÿฌ๋‹ ๋ชจ๋ธ

* ๋‹ค๋ฅธ ์ •์˜ : 1์ฐจ์› vector, 2์ฐจ์› matrix, 3์ฐจ์›๋ถ€ํ„ฐ Tensor

 

Create the Dataset for the deep learning model with the Dataset API

import pandas as pd
import tensorflow as tf

#Load the data with pandas
df = pd.read_csv('data.csv')
feature = df.drop(columns=['label'])
label = df['label']

#Convert the data to tensor form
dataset = tf.data.Dataset.from_tensor_slices((feature.values, label.values))

 

๋”ฅ๋Ÿฌ๋‹์— ์‚ฌ์šฉํ•˜๋Š” ๋ฐ์ดํ„ฐ๋Š” ์ถ”๊ฐ€์ ์ธ ์ „์ฒ˜๋ฆฌ ์ž‘์—…์ด ํ•„์š” -> Epoch, Batch

- Epoch : ํ•œ ๋ฒˆ์˜ epoch๋Š” ์ „์ฒด ๋ฐ์ดํ„ฐ ์…‹์— ๋Œ€ํ•ด ํ•œ ๋ฒˆ ํ•™์Šต์„ ์™„๋ฃŒํ•œ ์ƒํƒœ

- Batch : ๋‚˜๋ˆ ์ง„ ๋ฐ์ดํ„ฐ ์…‹ (๋ณดํ†ต mini-batch๋ผ๊ณ  ํ‘œํ˜„)

iteration์€ epoch๋ฅผ ๋‚˜๋ˆ„์–ด์„œ ์‹คํ–‰ํ•˜๋Š” ํšŸ์ˆ˜๋ฅผ ์˜๋ฏธ

 

* ๋”ฅ๋Ÿฌ๋‹ ํ•™์Šต๊ณผ์ •์—์„œ ๋ฐ์ดํ„ฐ ์–‘๊ณผ ๋ชจ๋ธ์ด ์ปค์ง€๋ฉด w๋ฅผ ๊ณ„์‚ฐํ•  ๋•Œ ๊ต‰์žฅํžˆ ๋งŽ์€ ์—ฐ์‚ฐ๋Ÿ‰์ด ํ•„์š”ํ•˜๋‹ค. ๊ณ„์‚ฐ๋Ÿ‰์„ ์ค„์ด๊ธฐ ์œ„ํ•ด ์ „์ฒด(epoch)์˜ ๋ฐ์ดํ„ฐ๋ฅผ ๋„ฃ๋Š” ๊ฒŒ ์•„๋‹ˆ๋ผ, ๊ทธ ๋ฐ์ดํ„ฐ๋ฅผ ์ชผ๊ฐœ์„œ ๋„ฃ์–ด๋ณด์ž!(=1batch, 2batch...) ํ™•๋ฅ ์ ์œผ๋กœ ์„ฑ๋Šฅ์ด ๋–จ์–ด์งˆ ์ˆ˜๋Š” ์žˆ์œผ๋‚˜, ์ฒ˜๋ฆฌ ์†๋„๋Š” ํ›จ์”ฌ ๋น ๋ฅด๋‹ค.

 

ex) ์ด ๋ฐ์ดํ„ฐ๊ฐ€ 1000๊ฐœ, Batch size = 100์ผ ๋•Œ

1iteration = 100๊ฐœ ๋ฐ์ดํ„ฐ์— ๋Œ€ํ•ด์„œ ํ•™์Šต

1epoch = 1000/Batch size = 10iteration 

#tensorํ˜•ํƒœ๋กœ ๋ฐ์ดํ„ฐ ๋ณ€ํ™˜
dataset = tf.data.Dataset.from_tensor_slices((feature.values, label.values))

#dataset์˜ batch์‚ฌ์ด์ฆˆ๋ฅผ 32๋กœ ์„ค์ •
datset = dataset.batch(32)

 

 

 

[์—ฐ์Šต๋ฌธ์ œ1] ํ…์„œํ”Œ๋กœ์šฐ๋ฅผ ํ™œ์šฉํ•˜์—ฌ ์‹ ๊ฒฝ๋ง ๊ตฌํ˜„ํ•˜๊ธฐ - ๋ฐ์ดํ„ฐ ์ „์ฒ˜๋ฆฌ

ํ…์„œํ”Œ๋กœ์šฐ๋ฅผ ํ™œ์šฉํ•˜์—ฌ ์‹ ๊ฒฝ๋ง์„ ๊ตฌํ˜„ํ•ด๋ณด๋Š” ๊ณผ์ •์„ ์ˆ˜ํ–‰ํ•ด๋ณด๊ฒ ์Šต๋‹ˆ๋‹ค. ํ…์„œํ”Œ๋กœ์šฐ ์‹ ๊ฒฝ๋ง ๋ชจ๋ธ์˜ ํ•™์Šต ๋ฐ์ดํ„ฐ๋Š” ๊ธฐ์กด ๋ฐ์ดํ„ฐ๋ฅผ tf.data.Dataset ํ˜•์‹์œผ๋กœ ๋ณ€ํ™˜ํ•˜์—ฌ ์‚ฌ์šฉํ•ฉ๋‹ˆ๋‹ค. pandas์˜ DataFrame ํ˜•ํƒœ ๋ฐ์ดํ„ฐ๋ฅผ Dataset์œผ๋กœ ๋ณ€ํ™˜ํ•˜๊ธฐ ์œ„ํ•ด์„œ๋Š” from_tensor_slices() ๋ฉ”์„œ๋“œ๋ฅผ ์‚ฌ์šฉํ•˜์—ฌ ds์— ์ €์žฅํ•  ์ˆ˜ ์žˆ์Šต๋‹ˆ๋‹ค.

 

1. pandas DataFrame df์—์„œ Sales ๋ณ€์ˆ˜๋Š” label ๋ฐ์ดํ„ฐ๋กœ Y์— ์ €์žฅํ•˜๊ณ  ๋‚˜๋จธ์ง„ X์— ์ €์žฅํ•ฉ๋‹ˆ๋‹ค.

2. ํ•™์Šต์šฉ ๋ฐ์ดํ„ฐ train_X, train_Y๋ฅผ tf.data.Dataset ํ˜•ํƒœ๋กœ ๋ณ€ํ™˜ํ•ฉ๋‹ˆ๋‹ค.

- from_tensor_slices ํ•จ์ˆ˜๋ฅผ ์‚ฌ์šฉํ•˜์—ฌ ๋ณ€ํ™˜ํ•ฉ๋‹ˆ๋‹ค.

import tensorflow as tf
import pandas as pd
import numpy as np
from sklearn.model_selection import train_test_split

import os
os.environ['TF_CPP_MIN_LOG_LEVEL'] = '2'

np.random.seed(100)
tf.random.set_seed(100)

#Load the data as a DataFrame
df = pd.read_csv("data/Advertising.csv")

#Print 5 sample rows of the DataFrame
print('Original data sample :')
print(df.head(),'\n')

#Drop the meaningless variable
df = df.drop(columns=['Unnamed: 0'])

#1. Store the Sales variable in Y as the label data and the rest in X.
X = df.drop(columns=['Sales'])
Y = df['Sales']

train_X, test_X, train_Y, test_Y = train_test_split(X, Y, test_size=0.3)

#2. Convert the training data to tf.data.Dataset form with from_tensor_slices, then batch it.
train_ds = tf.data.Dataset.from_tensor_slices((train_X.values, train_Y.values))
train_ds = train_ds.shuffle(len(train_X)).batch(batch_size=5)

#Take one batch and split it into features and labels
[(train_features_batch, label_batch)] = train_ds.take(1)

#Print the batch data
print('\nFB, TV, Newspaper batch data:\n',train_features_batch)
print('Sales batch data:',label_batch)

 

 

 

 

 

๋ชจ๋ธ ๊ตฌํ˜„

Keras ํŒจํ‚ค์ง€

ํ…์„œํ”Œ๋กœ์šฐ์˜ ํŒจํ‚ค์ง€๋กœ ์ œ๊ณต๋˜๋Š” ๊ณ ์ˆ˜์ค€API

๋”ฅ๋Ÿฌ๋‹ ๋ชจ๋ธ์„ ๊ฐ„๋‹จํ•˜๊ณ  ๋น ๋ฅด๊ฒŒ ๊ตฌํ˜„๊ฐ€๋Šฅ

 

 

Keras methods (1)

Create the model class object

tf.keras.models.Sequential()

Build each Layer of the model

tf.keras.layers.Dense(units, activation)

units: the number of nodes in the layer

activation: the activation function to apply

 

 

Input Layer์˜ ์ž…๋ ฅ ํ˜•ํƒœ ์ €์žฅํ•˜๊ธฐ

์ฒซ ๋ฒˆ์งธ(=Input layer)๋Š” ์ž…๋ ฅ ํ˜•ํƒœ์— ๋Œ€ํ•œ ์ •๋ณด๋ฅผ ํ•„์š”๋กœ ํ•œ๋‹ค.

input_shape or input_dim ์ธ์ž ์„ค์ • ํ•„์š”

 

 

๋ชจ๋ธ ๊ตฌ์ถ• ์ฝ”๋“œ ์˜ˆ์‹œ

model = tf.keras.models.Sequential([	
	tf.keras.layers.Dense(10, input_dim=2, activation='sigmoid'), #2๊ฐœ์˜ ์ž…๋ ฅ๋ณ€์ˆ˜, 10๊ฐœ ๋…ธ๋“œ
    tf.keras.layers.Dense(10, activation='sigmoid'), #10๊ฐœ์˜ ๋…ธ๋“œ
    tf.keras.layers.Dense(1, activation='sigmoid'), #1๊ฐœ์˜ ๋…ธ๋“œ
])

* ์ž…๋ ฅ์ด ๋‘ ๊ฐœ, ์ถœ๋ ฅ์ด ํ•˜๋‚˜, ์„ธ ๊ฐœ์˜ ์ธต ์กด์žฌ

 

 

Keras methods (2)

Adding a Layer to the model

[model].add(tf.keras.layers.Dense(units, activation))

units: the number of nodes in the layer

activation: the activation function to apply

 

 

๋ชจ๋ธ ๊ตฌ์ถ• ์ฝ”๋“œ ์˜ˆ์‹œ(2)

model = tf.keras.models.Sequential()

model.add(tf.keras.layers.Dense(10, input_dim=2, activation='sigmoid'))
model.add(tf.keras.layers.Dense(10, activation='sigmoid'))
model.add(tf.keras.layers.Dense(1, activation='sigmoid'))

 

 

Keras methods (3)

Training the deep learning model

Function that configures how the model trains

[model].compile(optimizer, loss)

- optimizer: the optimization method, e.g. GD, SGD, Adam...

- loss: the loss function (MSE for regression, Cross Entropy for classification, ...)

Function that trains the model

[model].fit(x, y)

- x: the training data

- y: the training data's labels

* A dataset in tensor form can also be passed.

 

 

Code example

#Set MSE as the loss; use SGD for optimization
model.compile(loss='mean_squared_error', optimizer='SGD') #MSE

#Feed the data stored in dataset and train with epochs set to 100
model.fit(dataset, epochs=100)

 

 

Keras methods (4)

Evaluating and predicting

Method that evaluates the model

[model].evaluate(x, y)

- x: the test data

- y: the test data's labels

Function that predicts with the model

[model].predict(x)

- x: the data to predict on

 

 

Code example

#Set MSE as the loss; use SGD for optimization
model.compile(loss='mean_squared_error', optimizer='SGD') #MSE

#Feed the data stored in dataset and train with epochs set to 100
model.fit(dataset, epochs=100)

#Evaluate the model and predict
model.evaluate(X_test, Y_test)
predicted_labels_test = model.predict(X_test)

 

 

 

[์—ฐ์Šต๋ฌธ์ œ2] ํ…์„œํ”Œ๋กœ์šฐ๋ฅผ ํ™œ์šฉํ•˜์—ฌ ์‹ ๊ฒฝ๋ง ๊ตฌํ˜„ํ•˜๊ธฐ - ๋ชจ๋ธ ๊ตฌํ˜„

[์‹ค์Šต1]์— ์ด์–ด์„œ ์ด๋ฒˆ ์‹ค์Šต์—์„œ๋Š” ํ…์„œํ”Œ๋กœ์šฐ์™€ ์ผ€๋ผ์Šค(Keras)๋ฅผ ํ™œ์šฉํ•˜์—ฌ ์‹ ๊ฒฝ๋ง ๋ชจ๋ธ์„ ๊ตฌํ˜„ํ•ด๋ณด๊ฒ ์Šต๋‹ˆ๋‹ค.

 

1. tf.keras.models.Sequential()์„ ํ™œ์šฉํ•˜์—ฌ ์‹ ๊ฒฝ๋ง ๋ชจ๋ธ์„ ์ƒ์„ฑํ•ฉ๋‹ˆ๋‹ค.

- ์ž์œ ๋กญ๊ฒŒ layers๋ฅผ ์Œ“๊ณ  ๋งˆ์ง€๋ง‰ layers๋Š” ๋…ธ๋“œ ์ˆ˜๋ฅผ 1๊ฐœ๋กœ ์„ค์ •ํ•ฉ๋‹ˆ๋‹ค.

import tensorflow as tf
import pandas as pd
import numpy as np
from sklearn.model_selection import train_test_split

import os
os.environ['TF_CPP_MIN_LOG_LEVEL'] = '2'

np.random.seed(100)
tf.random.set_seed(100)

#Load the data as a DataFrame
df = pd.read_csv("data/Advertising.csv")

#Print 5 sample rows of the DataFrame
print('Original data sample :')
print(df.head(),'\n')

#Drop the meaningless variable
df = df.drop(columns=['Unnamed: 0'])

X = df.drop(columns=['Sales'])
Y = df['Sales']

#Split into training and test data
train_X, test_X, train_Y, test_Y = train_test_split(X, Y, test_size=0.3)

#Convert to Dataset form
train_ds = tf.data.Dataset.from_tensor_slices((train_X.values, train_Y))
train_ds = train_ds.shuffle(len(train_X)).batch(batch_size=5)

#1. Create the neural network model with tf.keras.models.Sequential(). Stack layers freely; the last layer has 1 node.
model = tf.keras.models.Sequential([
    tf.keras.layers.Dense(10, input_shape=(3,)),
    tf.keras.layers.Dense(1)
    ])

print(model.summary())

 

 

[์—ฐ์Šต๋ฌธ์ œ3] ํ…์„œํ”Œ๋กœ์šฐ๋ฅผ ํ™œ์šฉํ•˜์—ฌ ์‹ ๊ฒฝ๋ง ๊ตฌํ˜„ํ•˜๊ธฐ - ๋ชจ๋ธ ํ•™์Šต

ํ•™์Šต๋ฐฉ๋ฒ• ์„ค์ • : complie() ๋ฉ”์„œ๋“œ๋Š” ๋ชจ๋ธ์„ ์–ด๋–ป๊ฒŒ ํ•™์Šตํ•  ์ง€์— ๋Œ€ํ•ด์„œ ์„ค์ •ํ•ฉ๋‹ˆ๋‹ค. loss๋Š” ํšŒ๊ท€์—์„œ๋Š” ์ผ๋ฐ˜์ ์œผ๋กœ MSE์ธ ‘mean_squared_error’, ๋ถ„๋ฅ˜์—์„œ๋Š” ‘sparse_categorical_crossentropy’ ๋ฅผ ์ฃผ๋กœ ์‚ฌ์šฉํ•ฉ๋‹ˆ๋‹ค.

ํ•™์Šต ์ˆ˜ํ–‰ : X ๋ฐ์ดํ„ฐ๋ฅผ ์—ํฌํฌ๋ฅผ 100๋ฒˆ์œผ๋กœ ํ•˜์—ฌ ํ•™์Šตํ•ฉ๋‹ˆ๋‹ค. verbose ์ธ์ž๋Š” ํ•™์Šต ์‹œ, ํ™”๋ฉด์— ์ถœ๋ ฅ๋˜๋Š” ํ˜•ํƒœ๋ฅผ ์„ค์ •ํ•ฉ๋‹ˆ๋‹ค. (0: ํ‘œ๊ธฐ ์—†์Œ, 1: ์ง„ํ–‰ ๋ฐ”, 2: ์—ํฌํฌ๋‹น ํ•œ ์ค„ ์ถœ๋ ฅ)

 

1. Dataset์œผ๋กœ ๋ณ€ํ™˜๋œ ํ•™์Šต์šฉ ๋ฐ์ดํ„ฐ๋ฅผ ๋ฐ”ํƒ•์œผ๋กœ ๋ชจ๋ธ์˜ ํ•™์Šต์„ ์ˆ˜ํ–‰ํ•ฉ๋‹ˆ๋‹ค.

- compile ๋ฉ”์„œ๋“œ๋ฅผ ์‚ฌ์šฉํ•˜์—ฌ ์ตœ์ ํ™” ๋ชจ๋ธ์„ ์„ค์ •ํ•ฉ๋‹ˆ๋‹ค. loss๋Š” ‘mean_squared_error’, optimizer๋Š” ‘adam’์œผ๋กœ ์„ค์ •ํ•ฉ๋‹ˆ๋‹ค.

- fit ๋ฉ”์„œ๋“œ๋ฅผ ์‚ฌ์šฉํ•˜์—ฌ ํ•™์Šต์šฉ ๋ฐ์ดํ„ฐ๋ฅผ ํ•™์Šตํ•ฉ๋‹ˆ๋‹ค. epochs๋Š” 100์œผ๋กœ ์„ค์ •ํ•ฉ๋‹ˆ๋‹ค.

import tensorflow as tf
import pandas as pd
import numpy as np
from sklearn.model_selection import train_test_split

import os
os.environ['TF_CPP_MIN_LOG_LEVEL'] = '2'

np.random.seed(100)
tf.random.set_seed(100)

#Load the data as a DataFrame
df = pd.read_csv("data/Advertising.csv")

#Print 5 sample rows of the DataFrame
print('Original data sample :')
print(df.head(),'\n')

#Drop the meaningless variable
df = df.drop(columns=['Unnamed: 0'])

X = df.drop(columns=['Sales'])
Y = df['Sales']

#Split into training and test data
train_X, test_X, train_Y, test_Y = train_test_split(X, Y, test_size=0.3)

#Convert to Dataset form
train_ds = tf.data.Dataset.from_tensor_slices((train_X.values, train_Y))
train_ds = train_ds.shuffle(len(train_X)).batch(batch_size=5)


#Create the neural network model with keras
model = tf.keras.models.Sequential([
    tf.keras.layers.Dense(10, input_shape=(3,)),
    tf.keras.layers.Dense(1)
    ])


"""
1. Train the model on the training data.
step1. Configure the optimization with the compile method: loss is mean_squared_error, optimizer is adam.
step2. Train on the Dataset-converted training data with the fit method, with epochs set to 100.
"""
model.compile(loss='mean_squared_error', optimizer='adam')
history = model.fit(train_ds, epochs=100, verbose=2)

 

 

[Exercise 4] Building a neural network with TensorFlow - model evaluation and prediction

Evaluation: the evaluate() method uses the trained model to output the loss value and metrics value for the input feature data X and labels Y. In this exercise we did not set metrics in compile, but classification generally uses accuracy, in which case evaluate returns 2 outputs.

Prediction: outputs the predicted label values for the X data.

1. Use the evaluate method to compute the test data's loss and store it in loss.

2. Use the predict method to compute the test data's predictions and store them in predictions.

import tensorflow as tf
import pandas as pd
import numpy as np
from sklearn.model_selection import train_test_split

import os
os.environ['TF_CPP_MIN_LOG_LEVEL'] = '2'

np.random.seed(100)
tf.random.set_seed(100)

#Load the data as a DataFrame
df = pd.read_csv("data/Advertising.csv")

#Print 5 sample rows of the DataFrame
print('Original data sample :')
print(df.head(),'\n')

#Drop the meaningless variable
df = df.drop(columns=['Unnamed: 0'])

X = df.drop(columns=['Sales'])
Y = df['Sales']

#Split into training and test data
train_X, test_X, train_Y, test_Y = train_test_split(X, Y, test_size=0.3)

#Convert to Dataset form
train_ds = tf.data.Dataset.from_tensor_slices((train_X.values, train_Y))
train_ds = train_ds.shuffle(len(train_X)).batch(batch_size=5)

#Create the neural network model with keras
model = tf.keras.models.Sequential([
    tf.keras.layers.Dense(10, input_shape=(3,)),
    tf.keras.layers.Dense(1)
    ])

#Train the model on the training data
model.compile(loss='mean_squared_error', optimizer='adam')
history = model.fit(train_ds, epochs=100, verbose=2)

#1. Compute the test data's loss with the evaluate method.
loss = model.evaluate(test_X, test_Y, verbose=0)

#2. Compute the test data's predictions with the predict method.
predictions = model.predict(test_X)

#Print the results
print("Test data Loss : ", loss)
for i in range(5):
    print("Actual value of test sample %d: %f" % (i, test_Y.iloc[i]))
    print("Predicted value of test sample %d: %f" % (i, predictions[i][0]))

 

 

 

[Exercise 5] Classification with a neural network model

Given the Iris data, we implement a neural network model that classifies the species of iris. The Iris data consists of four variables, sepal length, sepal width, petal length, and petal width, and three iris classes.

Model implementation (example with a 5-category label): in a classification model, the last layer gets as many nodes as the label has categories, and additionally the activation argument is set to 'softmax'.

Training: classification generally uses 'sparse_categorical_crossentropy' as the loss. The metrics argument names the evaluation metric computed every epoch; passing 'accuracy' computes and prints the accuracy each epoch.

1. Create the neural network model with keras. To classify label data with 3 categories, configure the last layer as follows.

- 3 nodes

- activation set to 'softmax'.

import tensorflow as tf
import pandas as pd
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.datasets import load_iris

import os
os.environ['TF_CPP_MIN_LOG_LEVEL'] = '2'

np.random.seed(100)
tf.random.set_seed(100)

#Load the dataset bundled with sklearn
X, Y = load_iris(return_X_y = True)

#Convert to a DataFrame
df = pd.DataFrame(X, columns=['sepal length','sepal width','petal length','petal width'])
df['class'] = Y

X = df.drop(columns=['class'])
Y = df['class']

#Split into training and evaluation data
train_X, test_X, train_Y, test_Y = train_test_split(X, Y, test_size=0.2, random_state = 42)

#Convert to Dataset form
train_ds = tf.data.Dataset.from_tensor_slices((train_X.values, train_Y))
train_ds = train_ds.shuffle(len(train_X)).batch(batch_size=5)

# 1. Create the neural network model with keras. To classify label data with 3 categories, configure the last layer as follows.
model = tf.keras.models.Sequential([
    tf.keras.layers.Dense(10, input_dim=4),
    tf.keras.layers.Dense(3, activation='softmax')
    ])

#Train the model on the training data
model.compile(loss='sparse_categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
history = model.fit(train_ds, epochs=100, verbose=2)

#Evaluate the trained model on the test data
loss, acc = model.evaluate(test_X, test_Y)

#Compute the test data's predictions
predictions = model.predict(test_X)

#Print the results
print("Test data Accuracy : ", acc)
for i in range(5):
    print("Actual class of test sample %d: %d" % (i, test_Y.iloc[i]))
    print("Predicted class of test sample %d: %d" % (i, np.argmax(predictions[i])))

 

 

 

 

 

 

Various Neural Networks

Image processing around us, e.g. face-recognition cameras, image-quality enhancement, automatic image tagging.

Given an image like the following, how would we classify which animal it shows?

-> To a computer, an image is an array of numbers, one value per pixel.

* Pixel: the small square unit an image is made of.

 

 

Preprocessing Images

Unify all images to the same size:

1) Unify the resolution, i.e. the pixel size in width and height.

2) Unify the color representation (RGB, HSV, Gray-scale, Binary, ...).

Here, unified to a 28X28 resolution in Gray-scale.

 

 

 

[Exercise 6] MNIST classification CNN model - data preprocessing

Most people meet MNIST when they first start training neural networks. MNIST is a collection of handwritten digit images, 0 through 9; we use it to train a neural network and then verify whether the result can recognize handwriting. We print the image data, inspect its shape, and preprocess it so that it can be fed to a CNN model.

MNIST is image data, but it is 2-dimensional: only width and height. A CNN model takes 3-dimensional input that also accounts for channels (RGB or grayscale), so we add a channel axis to change the data's shape:

[number of samples, width, height]
-> [number of samples, width, height, number of channels]

Axis-adding function: tf.expand_dims(data, axis)

Adds one dimension at the given axis of a Tensor array. (Passing -1 for axis means the last axis of whatever data comes in.)

1. Convert the training and evaluation data to (number of samples, width, height, 1) form so it can be used as CNN input.

- Use tf.expand_dims to convert train_images and test_images, storing the results back in train_images and test_images.

import numpy as np
import tensorflow as tf
import matplotlib.pyplot as plt
from elice_utils import EliceUtils

elice_utils = EliceUtils()

import logging, os
logging.disable(logging.WARNING)
os.environ["TF_CPP_MIN_LOG_LEVEL"] = "3"

#Code for reproducible runs
np.random.seed(123)
tf.random.set_seed(123)


#Load the MNIST dataset
mnist = tf.keras.datasets.mnist

#Split MNIST into a Train set and a Test set
(train_images, train_labels), (test_images, test_labels) = mnist.load_data()    

#Use 5000 Train samples and 1000 Test samples
train_images, train_labels = train_images[:5000], train_labels[:5000]
test_images, test_labels = test_images[:1000], test_labels[:1000]


print("Original training image data shape: ",train_images.shape)
print("Original test image data shape: ",test_images.shape)
print("Original training label data: ",train_labels)

#Show the first sample
plt.figure(figsize=(10, 10))
plt.imshow(train_images[0], cmap=plt.cm.binary)
plt.colorbar()
plt.title("Training Data Sample")
plt.savefig("sample1.png")
elice_utils.send_image("sample1.png")

#Show 9 training samples
class_names = ['zero', 'one', 'two', 'three', 'four', 'five', 'six', 'seven', 'eight', 'nine']
for i in range(9):
    plt.subplot(3,3,i+1)
    plt.xticks([])
    plt.yticks([])
    plt.grid(False)
    plt.imshow(train_images[i], cmap=plt.cm.binary)
    plt.xlabel(class_names[train_labels[i]])
plt.savefig("sample2.png")
elice_utils.send_image("sample2.png")

#1. Convert to (number of samples, width, height, 1) form so the data can be fed to a CNN model.
train_images = tf.expand_dims(train_images, -1)
test_images = tf.expand_dims(test_images, -1)

print("Converted training image data shape: ",train_images.shape)
print("Converted test image data shape: ",test_images.shape)

 

 

 

 

 

๊ธฐ์กด ๋‹ค์ธต ํผ์…‰ํŠธ๋ก  ๊ธฐ๋ฐ˜ ์‹ ๊ฒฝ๋ง์˜ ์ด๋ฏธ์ง€ ์ฒ˜๋ฆฌ ๋ฐฉ์‹

๊ทน๋„๋กœ ๋งŽ์€ ์ˆ˜์˜ ํŒŒ๋ผ๋ฏธํ„ฐ๊ฐ€ ํ•„์š”ํ•˜๋‹ค. 

๋งŒ์•ฝ ์ด๋ฏธ์ง€์— ๋ณ€ํ™”๊ฐ€ ์žˆ๋‹ค๋ฉด? -> ๋ฐ์ดํ„ฐ ๊ด€์ ์—์„œ ๋ณด๋ฉด ์ƒˆ๋กœ์šด ์ด๋ฏธ์ง€์ด๋ฏ€๋กœ ๋ถ„๋ฅ˜ ์„ฑ๋Šฅ์ด ๋–จ์–ด์งˆ ์ˆ˜ ์žˆ๋‹ค.

 

 

ํ•ฉ์„ฑ๊ณฑ ์‹ ๊ฒฝ๋ง(Convolution Neural Network)

์ž‘์€ ํ•„ํ„ฐ๋ฅผ ์ˆœํ™˜์‹œํ‚ค๋Š” ๋ฐฉ์‹.

์ด๋ฏธ์ง€์˜ ํŒจํ„ด์ด ์•„๋‹Œ ํŠน์ง•์„ ์ค‘์ ์œผ๋กœ ์ธ์‹ํ•œ๋‹ค. -> ์„ฑ๋Šฅ ํ–ฅ์ƒ

 

 

ํ•ฉ์„ฑ๊ณฑ ์‹ ๊ฒฝ๋ง์˜ ๊ตฌ์กฐ

CNN์ด ์ž…๋ ฅ ์ด๋ฏธ์ง€์˜ ํŠน์ง•์„ ์ถ”์ถœ, Fully-Connected Layer๊ฐ€ ๋ถ„๋ฅ˜ํ•˜๋Š” ๊ณผ์ •์œผ๋กœ ๋™์ž‘ํ•œ๋‹ค

- CNN : Convolution Layer + Pooling Layer

- Fully-Connected Layer : ์—ฌํƒœ๊นŒ์ง€ ๊ณต๋ถ€ํ–ˆ๋˜ ๋”ฅ๋Ÿฌ๋‹ ๋ชจ๋ธ (=Dense Layer)

 

 

์ด๋ฏธ์ง€์—์„œ ์–ด๋– ํ•œ ํŠน์ง•์ด ์žˆ๋Š” ์ง€๋ฅผ ๊ตฌํ•˜๋Š” ๊ณผ์ •

ํ•„ํ„ฐ๊ฐ€ ์ด๋ฏธ์ง€๋ฅผ ์ด๋™ํ•˜๋ฉฐ ์ƒˆ๋กœ์šด ์ด๋ฏธ์ง€(ํ”ผ์ณ๋งต)๋ฅผ ์ƒ์„ฑ (๊ท€, ์ˆ˜์—ผ, ์ž…, ์ƒ‰๊น” ํ•„ํ„ฐ ๋“ฑ๋“ฑ์„ ์ด๋™ํ•˜๋ฉฐ ํ•ด๋‹น ์ด๋ฏธ์ง€๊ฐ€ ์žˆ๋Š”์ง€ ์—†๋Š”์ง€๋ฅผ ํŒ๋‹จํ•˜๊ณ , ์ด๋ฏธ์ง€๊ฐ€ ๋งค์น˜๋˜๋ฉด ๊ทธ ๊ฐ’์„ ๊ฐ€์žฅ ํฌ๊ฒŒ๋” ๋งŒ๋“ ๋‹ค)

 

 

ํ”ผ์ณ๋งต์˜ ํฌ๊ธฐ ๋ณ€ํ˜•

Padding : ์›๋ณธ๊ณผ ๋‹ค๋ฅธ ์‚ฌ์ด์ฆˆ์˜ ํ•„ํ„ฐ ์ƒ์„ฑ๋˜๋Š” ๊ฒƒ์„ ๋ฐฉ์ง€ํ•˜๊ณ ์ž ๋งŒ๋“  ๋ฐฉ์‹

Striding : ์„ค์ •๊ฐ’์— ๋”ฐ๋ผ ๊ฒ€์‚ฌ ๊ตฌ์—ญ์„ ์ง€์ •ํ•  ์ˆ˜ ์žˆ๋‹ค
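The standard output-size formula ties these together (N: input width, F: filter width, P: padding, S: stride; the formula is standard, not from the slides):

$$\text{output width} = \frac{N + 2P - F}{S} + 1$$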

 

 

Pooling Layer

The step that reduces the influence of distortions (noise) in the image.

The information is compressed as well.

ex) Applying Max Pooling to the feature map produced by a filter keeps the largest value in each window as its representative (the smaller values are discarded).

Average Pooling replaces each window with its average value (rarely used).
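A tiny numpy sketch of 2x2 max pooling with stride 2 (the feature-map values are made up):

import numpy as np

fmap = np.array([[1, 3, 2, 0],
                 [5, 2, 1, 4],
                 [0, 1, 9, 2],
                 [3, 2, 4, 8]])

#Split the 4x4 map into non-overlapping 2x2 windows and keep each window's max
pooled = fmap.reshape(2, 2, 2, 2).max(axis=(1, 3))
print(pooled)  #[[5 4]
               # [3 9]]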

 

 

Fully Connected Layer

Uses the extracted features to classify the image.

 

 

The Softmax activation for classification

Use the Softmax activation function on the last layer.

a + b + c + d + e + f = 1 (each value is a probability)

a, b, c, d, e >= 0

Q. When deciding whether something is a cat (1) or not (0) -> putting a step function on the final value was enough to get the answer.

Q. When predicting among several labels such as cat, dog, rabbit -> use the Softmax activation, and set the number of units in the last layer to the number of label categories to predict.

 

 

Summary

1. Convolution: feature extraction

2. Pooling: size control, noise handling

* Repeat this N times -> many filters are created per feature (thanks to pooling the data stays small, so the total volume does not grow) = each repetition looks for features in a smaller region, and because the regions have shrunk, training becomes faster.

3. Activation function: classification

 

 

ํ•ฉ์„ฑ๊ณฑ ์‹ ๊ฒฝ๋ง ๊ธฐ๋ฐ˜ ๋‹ค์–‘ํ•œ ์ด๋ฏธ์ง€ ์ฒ˜๋ฆฌ ๊ธฐ์ˆ 

Object detection & segmentation : ๊ฐ๊ฐ์˜ ์ด๋ฏธ์ง€๋ฅผ ๊ตฌ๋ถ„ํ•  ์ˆ˜ ์žˆ์Œ

Super resolution (SR) : ํ•ด์ƒ๋„๊ฐ€ ๋‚ฎ์€ ์ด๋ฏธ์ง€์˜ ํ•ด์ƒ๋„๋ฅผ ๋†’์ผ ์ˆ˜ ์žˆ๋‹ค

 

 

 

[Exercise 7] MNIST classification CNN model - model implementation

Functions/methods needed to build a CNN model in Keras

1. CNN layer tf.keras.layers.Conv2D(filters, kernel_size, activation, padding): the layer that extracts the input image's features, i.e. the feature map to be processed.

  • filters : number of filters (kernels)
  • kernel_size : size of each filter (kernel)
  • activation : activation function
  • padding : whether to surround the image with zero-valued pixels so its size does not shrink as it passes through the filter. 'SAME' or 'VALID'

2. Maxpool layer tf.keras.layers.MaxPool2D(padding): the layer that shrinks the feature map being processed.

  • padding : 'SAME' or 'VALID'

3. Flatten layer tf.keras.layers.Flatten(): the output of a Convolution or MaxPooling layer is an N-dimensional tensor; this flattens it into one dimension.

4. Dense layer tf.keras.layers.Dense(node, activation)

  • node : number of nodes (neurons)
  • activation : activation function

1. Configure the CNN model with keras.

- To match the classification task, set the last layer to 10 nodes with the 'softmax' activation.

import numpy as np
import tensorflow as tf
import matplotlib.pyplot as plt
from visual import *
from elice_utils import EliceUtils

elice_utils = EliceUtils()

import logging, os
logging.disable(logging.WARNING)
os.environ["TF_CPP_MIN_LOG_LEVEL"] = "3"

#Code for reproducible runs
np.random.seed(123)
tf.random.set_seed(123)


# Load the MNIST dataset
mnist = tf.keras.datasets.mnist

#Split MNIST into a Train set and a Test set
(train_images, train_labels), (test_images, test_labels) = mnist.load_data()    

#Use 5000 Train samples and 1000 Test samples
train_images, train_labels = train_images[:5000], train_labels[:5000]
test_images, test_labels = test_images[:1000], test_labels[:1000]

#Convert to (number of samples, width, height, 1) form so the data can be fed to the CNN model
train_images = tf.expand_dims(train_images, -1)
test_images = tf.expand_dims(test_images, -1)


#1. Configure the CNN model. To match the classification task, the last layer has 10 nodes and 'softmax' activation.
model = tf.keras.Sequential([
    tf.keras.layers.Conv2D(filters = 32, kernel_size = (3,3), activation = 'relu', padding = 'SAME', input_shape = (28,28,1)),
    tf.keras.layers.MaxPool2D(padding = 'SAME'),
    tf.keras.layers.Conv2D(filters = 32, kernel_size = (3,3), activation = 'relu', padding = 'SAME'),
    tf.keras.layers.MaxPool2D(padding = 'SAME'),
    tf.keras.layers.Conv2D(filters = 32, kernel_size = (3,3), activation = 'relu', padding = 'SAME'),
    tf.keras.layers.MaxPool2D(padding = 'SAME'),
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(64, activation = 'relu'),
    tf.keras.layers.Dense(10, activation = 'softmax')
])

#Print the CNN model's structure
print(model.summary())

#Configure how the CNN model trains
model.compile(loss = 'sparse_categorical_crossentropy',
              optimizer = 'adam',
              metrics = ['accuracy'])
              
#Run the training 
history = model.fit(train_images, train_labels, epochs = 20, batch_size = 512)

#Plot the training results
Visulaize([('CNN', history)], 'loss')

 

 

 

[Exercise 8] MNIST classification CNN model - evaluation and prediction

Functions/methods needed to evaluate a Keras CNN model and predict with it

Evaluation: model.evaluate(X, Y): the evaluate() method uses the trained model to output the loss value and metrics value for the input feature data X and labels Y.

Prediction: model.predict_classes(X): outputs the predicted label values for the X data.

1. Evaluate the model on the evaluation data using the evaluate method.

- Compute loss and accuracy and store them in loss and test_acc.

2. Use the predict_classes method to store the predictions for the evaluation data in predictions.

import numpy as np
import tensorflow as tf
import matplotlib.pyplot as plt
from visual import *
from plotter import *
from elice_utils import EliceUtils

elice_utils = EliceUtils()

import logging, os
logging.disable(logging.WARNING)
os.environ["TF_CPP_MIN_LOG_LEVEL"] = "3"

# Code for reproducible runs
np.random.seed(123)
tf.random.set_seed(123)


# Load the MNIST dataset
mnist = tf.keras.datasets.mnist

# Split MNIST into a Train set and a Test set
(train_images, train_labels), (test_images, test_labels) = mnist.load_data()    

# Use 5000 Train samples and 1000 Test samples
train_images, train_labels = train_images[:5000], train_labels[:5000]
test_images, test_labels = test_images[:1000], test_labels[:1000]

# Convert to (number of samples, width, height, 1) form so the data can be fed to the CNN model
train_images = tf.expand_dims(train_images, -1)
test_images = tf.expand_dims(test_images, -1)


# Configure the CNN model
model = tf.keras.Sequential([
    tf.keras.layers.Conv2D(filters = 32, kernel_size = (3,3), activation = 'relu', padding = 'SAME', input_shape = (28,28,1)),
    tf.keras.layers.MaxPool2D(padding = 'SAME'),
    tf.keras.layers.Conv2D(filters = 32, kernel_size = (3,3), activation = 'relu', padding = 'SAME'),
    tf.keras.layers.MaxPool2D(padding = 'SAME'),
    tf.keras.layers.Conv2D(filters = 32, kernel_size = (3,3), activation = 'relu', padding = 'SAME'),
    tf.keras.layers.MaxPool2D(padding = 'SAME'),
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(64, activation = 'relu'),
    tf.keras.layers.Dense(10, activation = 'softmax')
])

# Print the CNN model's structure
print(model.summary())

# Configure how the CNN model trains
model.compile(loss = 'sparse_categorical_crossentropy',
              optimizer = 'adam',
              metrics = ['accuracy'])
              
# Run the training 
history = model.fit(train_images, train_labels, epochs = 10, batch_size = 128, verbose = 2)

Visulaize([('CNN', history)], 'loss')

"""
1. Evaluate the model on the evaluation data.
   Compute loss and accuracy and store them in loss, test_acc.
"""
loss, test_acc = model.evaluate(test_images, test_labels, verbose = 0)

"""
2. Store the predictions for the evaluation data in predictions.
"""
predictions = model.predict_classes(test_images)

# Print the evaluation and prediction results
print('\nTest Loss : {:.4f} | Test Accuracy : {}'.format(loss, test_acc))
print('Predicted Test Data classes : ',predictions[:10])

# Visualize the layer outputs for the evaluation data
Plotter(test_images, model)

Test Loss : 0.1703 | Test Accuracy : 0.9490000009536743
Predicted Test Data classes : [7 2 1 0 4 1 4 9 6 9]
Layer name: conv2d
Layer name: max_pooling2d
Layer name: conv2d_1
Layer name: max_pooling2d_1
Layer name: conv2d_2
Layer name: max_pooling2d_2

 

 

 

 

 

 

์ž์—ฐ์–ด ์ฒ˜๋ฆฌ๋ฅผ ์œ„ํ•œ ๋ฐ์ดํ„ฐ ์ „์ฒ˜๋ฆฌ

์ฃผ๋ณ€์˜ ์ž์—ฐ์–ด ์ฒ˜๋ฆฌ ex) ๊ธฐ๊ณ„ ๋ฒˆ์—ญ ๋ชจ๋ธ, ์Œ์„ฑ์ธ์‹

 

์ฒ˜๋ฆฌ ๊ณผ์ •

1. ์ž์—ฐ์–ด ์ „์ฒ˜๋ฆฌ (Preprocessing)

2. ๋‹จ์–ด ํ‘œํ˜„ (Word Embedding)

3. ๋ชจ๋ธ ์ ์šฉ (Modeling)

 

 

์˜ค๋ฅ˜ ๊ต์ • (Noise canceling)

"์•ˆ๋…•ํ•˜ ์„ธ์š”. ๋ฐ˜๊ฐ‘ ์Šค๋‹ˆ๋‹ค." 

"์•ˆ๋…•ํ•˜์„ธ์š”. ๋ฐ˜๊ฐ‘์Šต๋‹ˆ๋‹ค." 

-> ์ž์—ฐ์–ด ๋ฌธ์žฅ์˜ ์ŠคํŽ ๋ง ์ฒดํฌ ๋ฐ ๋„์–ด์“ฐ๊ธฐ ์˜ค๋ฅ˜ ๊ต์ •

 

 

Tokenizing

"๋”ฅ๋Ÿฌ๋‹ ๊ธฐ์ดˆ ๊ณผ๋ชฉ์„ ์ˆ˜๊ฐ•ํ•˜๊ณ  ์žˆ์Šต๋‹ˆ๋‹ค." ("I am taking the deep learning basics course.")

['๋”ฅ', '๋Ÿฌ๋‹', '๊ธฐ์ดˆ', '๊ณผ๋ชฉ', '์„', '์ˆ˜๊ฐ•', 'ํ•˜๊ณ ', '์žˆ์Šต๋‹ˆ๋‹ค', '.']

-> Split the sentence into tokens. * Token: defined differently depending on the purpose, e.g. phrase or word.

 

 

Stopword removal

Korean examples: ์•„, ํœด, ์•„์ด๊ตฌ, ์•„์ด์ฟ , ์•„์ด๊ณ , ์‰ฟ, ๊ทธ๋ ‡์ง€ ์•Š์œผ๋ฉด, ๊ทธ๋Ÿฌ๋‚˜, ๊ทธ๋Ÿฐ๋ฐ, ํ•˜์ง€๋งŒ, ...

-> Remove unneeded words.

 

 

Bag of Words

The task of taking each distinct natural-language token and assigning it an index.

Natural-language data

['์•ˆ๋…•', '๋งŒ๋‚˜์„œ', '๋ฐ˜๊ฐ€์›Œ']

['์•ˆ๋…•', '๋‚˜๋„', '๋ฐ˜๊ฐ€์›Œ']

Numeric conversion

Bag of Words

{'์•ˆ๋…•': 0, '๋งŒ๋‚˜์„œ': 1, '๋ฐ˜๊ฐ€์›Œ': 2, '๋‚˜๋„': 3}

 

 

Token sequences

Convert each word to its Bag of Words index.

To make all sentences the same length, pad the sentences shorter than the reference length; a minimal sketch follows below.

* The length is usually unified to the longest sentence, but exceptionally long sentences are excluded before this step.
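A minimal plain-Python sketch of the whole step, from tokens to padded index sequences (the sample sentences extend the ones above; reserving index 0 for padding is an assumption here, and conventions differ across libraries):

sentences = [['์•ˆ๋…•', '๋งŒ๋‚˜์„œ', '๋ฐ˜๊ฐ€์›Œ'],
             ['์•ˆ๋…•', '๋‚˜๋„', '๋ฐ˜๊ฐ€์›Œ', '์ •๋ง']]

#Build the Bag of Words in first-seen order, starting at 1 (0 is the pad index)
bow = {}
for sent in sentences:
    for tok in sent:
        bow.setdefault(tok, len(bow) + 1)

#Convert each sentence to indices and pad up to the longest sentence
max_len = max(len(s) for s in sentences)
seqs = [[bow[t] for t in s] + [0] * (max_len - len(s)) for s in sentences]

print(bow)   #{'์•ˆ๋…•': 1, '๋งŒ๋‚˜์„œ': 2, '๋ฐ˜๊ฐ€์›Œ': 3, '๋‚˜๋„': 4, '์ •๋ง': 5}
print(seqs)  #[[1, 2, 3, 0], [1, 4, 3, 5]]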

 

 

 

[Exercise 9] Movie-review positive/negative RNN model - data preprocessing

Natural-language data such as movie reviews is a sequential arrangement of words, so it can be treated as time-series data. We will build a classifier that uses this time series (consecutive words) to predict the sentiment (positive, negative) contained in a review.

1. Apply padding to the index-converted X_train and X_test sequences and store the results back in X_train and X_test.

- Set the maximum sequence length to 300.

import json
import numpy as np
import tensorflow as tf
import data_process
from keras.datasets import imdb
from keras.preprocessing import sequence

import logging, os
logging.disable(logging.WARNING)
os.environ["TF_CPP_MIN_LOG_LEVEL"] = "3"

#Load the training and evaluation data and print a sample sentence
X_train, y_train, X_test, y_test = data_process.imdb_data_load()

#1. Pad the index-converted X_train and X_test sequences, storing the results in X_train, X_test. Maximum sequence length is 300.
X_train = sequence.pad_sequences(X_train, maxlen=300, padding='post')
X_test = sequence.pad_sequences(X_test, maxlen=300, padding='post')

print("\nFirst X_train sample's token index sequence after padding: \n",X_train[0])

 

 

 

 

 

 

Word Embedding

Bag of Words assigns indices in a meaningless, arbitrary order. Word Embedding gives the tokens defined by those indices actual meaning.

Why

Vectors can describe a token's characteristics (similarity can be computed, arithmetic is possible).

- Similarity: looking at the embeddings of 'mother' and 'father', the zero entries roughly overlap. (Vectors for related words take on similar shapes -> the relatedness between words becomes visible.)

 

 

๊ธฐ์กด ๋‹ค์ธต ํผ์…‰ํŠธ๋ก  ์‹ ๊ฒฝ๋ง์˜ ์ž์—ฐ์–ด ๋ถ„๋ฅ˜ ๋ฐฉ์‹

๋Œ€๊ด„ํ˜ธ๋ฅผ ์—†์• ๊ณ  MLP๋ชจ๋ธ์— ๋„ฃ์–ด์ค˜์•ผ ํ•˜๋Š”๋ฐ, ์ด ๊ฒฝ๊ณ„๋ฅผ ํ—ˆ๋ฌผ๋ฉด ํŠน์ง•๋“ค์ด ์‚ฌ๋ผ์ง€๊ฒŒ ๋œ๋‹ค. ์ž„๋ฒ ๋”ฉ์˜ ํšจ๊ณผ๊ฐ€ ์‚ฌ๋ผ์ง€๊ณ , ๋ฌธ์žฅ๋“ค ๊ฐ„์˜ ๊ด€๊ณ„ ๋˜ํ•œ ๋ฌด๋„ˆ์ง„๋‹ค.

-> ์ž์—ฐ์–ด ๋ฌธ์žฅ์„ ๊ธฐ์กด MLP๋ชจ๋ธ์— ์ ์šฉ์‹œํ‚ค๊ธฐ์—๋Š” ํ•œ๊ณ„๊ฐ€ ์žˆ๋‹ค. ํ† ํฐ ๊ฐ„ ์ˆœ์„œ์™€ ๊ด€๊ณ„๋ฅผ ์ ์šฉํ•  ์ˆ˜ ์žˆ๋Š” ๋ชจ๋ธ์€ ์—†์„๊นŒ?

 

 

The RNN model (Recurrent Neural Network)

X → RNN → Y

Much like the perceptron computation before, it takes input data X and produces output Y.

 

 

์ˆœํ™˜ ์‹ ๊ฒฝ๋ง์˜ ์ž…์ถœ๋ ฅ ๊ตฌ์กฐ

์ถœ๋ ฅ ๊ฐ’์„ ๋‘ ๊ฐˆ๋ž˜๋กœ ๋‚˜๋‰˜์–ด ์‹ ๊ฒฝ๋ง์—๊ฒŒ ๊ธฐ์–ตํ•˜๋Š” ๊ธฐ๋Šฅ ๋ถ€์—ฌ

์ด์ „์— ์‚ฌ์šฉํ–ˆ๋˜ ํ† ํฐ์— ๋Œ€ํ•œ ๊ธฐ์–ต์„ ๋ฐ›์•„์™€์„œ, ๋‹ค์Œ ํ† ํฐ์˜ ๊ณ„์‚ฐ์— ์‚ฌ์šฉํ•œ๋‹ค.

 

 

์ˆœํ™˜ ์‹ ๊ฒฝ๋ง ๊ธฐ๋ฐ˜ ์ž์—ฐ์–ด ๋ถ„๋ฅ˜ ์˜ˆ์‹œ

[์ˆ˜์—…์ด], [์ด], [๋„ˆ๋ฌด], [์žฌ๋ฐŒ์–ด]๋ฅผ ๊ณ„์† RNN๋ถ„๋ฅ˜์ฒ˜๋ฆฌํ•œ๋‹ค.

๋งˆ์ง€๋ง‰์— ๋‚˜์˜จ ๊ฒฐ๊ณผ๋ฌผ Y๋งŒ Fully connected Layer์— ๋„ฃ์–ด 0์ธ์ง€ 1์ธ์ง€๋ฅผ ํŒ๋‹จํ•œ๋‹ค. (์ด์ „ output Y๋Š” ์‹ ๊ฒฝ์“ฐ์ง€ ์•Š๋Š”๋‹ค)

 

 

Summary

1) Embedding: extracts features from the preprocessed data

2) RNN: a deep learning model that remembers; tokens used earlier take part in the training together, so the order relations between them are learned too

3) Activation function: carries out the classification, e.g. sigmoid, softmax

 

 

 

 

 

[Exercise 10] Movie-review positive/negative RNN model - model training

Functions/libraries needed to build an RNN model in Keras

A typical RNN model first stacks an Embedding layer as the input layer, then a few RNN layers, then Dense layers after that.

Embedding layer tf.keras.layers.Embedding(input_dim, output_dim, input_length): the layer that embeds the incoming sentence's words.

  • input_dim: the number of possible incoming words
  • output_dim: the size (dimension) of the resulting embedding vectors
  • input_length: the length of the incoming word vectors

RNN layers

Simple RNN layer: tf.keras.layers.SimpleRNN(units)

  • units: the number of nodes in the layer

1. Implement the RNN model.

- After the embedding layer, stack an RNN layer using SimpleRNN and set the number of nodes to 5.

import json
import numpy as np
import tensorflow as tf
import data_process
from keras.datasets import imdb
from keras.preprocessing import sequence

import logging, os
logging.disable(logging.WARNING)
os.environ["TF_CPP_MIN_LOG_LEVEL"] = "3"

#Code for reproducible runs
np.random.seed(123)
tf.random.set_seed(123)

#Load the training and evaluation data and print a sample sentence
X_train, y_train, X_test, y_test = data_process.imdb_data_load()

max_review_length = 300

#Apply padding
X_train = sequence.pad_sequences(X_train, maxlen=max_review_length, padding='post')
X_test = sequence.pad_sequences(X_test, maxlen=max_review_length, padding='post')


embedding_vector_length = 32

"""
1. Implement the model.
After the embedding layer, stack an RNN layer using `SimpleRNN` with 5 nodes. 
The Dense layer has 1 node with 'sigmoid' activation, since this is 0/1 classification.
"""
model = tf.keras.models.Sequential([
    tf.keras.layers.Embedding(1000, embedding_vector_length, input_length = max_review_length),
    tf.keras.layers.SimpleRNN(5),
    tf.keras.layers.Dense(1, activation='sigmoid')
    ])

#Inspect the model
print(model.summary())

#Configure the training
model.compile(loss = 'binary_crossentropy', optimizer = 'adam', metrics = ['accuracy'])

#Run the training
model_history = model.fit(X_train, y_train, epochs = 3, verbose = 2)

 

 

 

[Exercise 11] Movie-review positive/negative RNN model - evaluation and prediction

Functions/methods needed to evaluate a Keras RNN model and predict with it

Evaluation: model.evaluate(X, Y): the evaluate() method uses the trained model to output the loss value and metrics value for the input feature data X and labels Y.

Prediction: model.predict(X): outputs the predicted label values for the X data.

1. Evaluate the model on the evaluation data using the evaluate method.

- Compute loss and accuracy and store them in loss and test_acc.

2. Use the predict method to store the predictions for the evaluation data in predictions.

import json
import numpy as np
import tensorflow as tf
import data_process
from keras.datasets import imdb
from keras.preprocessing import sequence

import logging, os
logging.disable(logging.WARNING)
os.environ["TF_CPP_MIN_LOG_LEVEL"] = "3"

#Code for reproducible runs
np.random.seed(123)
tf.random.set_seed(123)

#Load the training and evaluation data and print a sample sentence
X_train, y_train, X_test, y_test = data_process.imdb_data_load()

max_review_length = 300

#Apply padding
X_train = sequence.pad_sequences(X_train, maxlen=max_review_length, padding='post')
X_test = sequence.pad_sequences(X_test, maxlen=max_review_length, padding='post')


embedding_vector_length = 32


#Implement the model
model = tf.keras.models.Sequential([
    tf.keras.layers.Embedding(1000, embedding_vector_length, input_length = max_review_length),
    tf.keras.layers.SimpleRNN(5),
    tf.keras.layers.Dense(1, activation='sigmoid')
    ])

#Inspect the model
print(model.summary())

#Configure the training
model.compile(loss = 'binary_crossentropy', optimizer = 'adam', metrics = ['accuracy'])

#Run the training
model_history = model.fit(X_train, y_train, epochs = 5, verbose = 2)

#1. Evaluate the model on the evaluation data. Compute loss and accuracy and store them in loss, test_acc.
loss, test_acc = model.evaluate(X_test, y_test, verbose = 0)

#2. Store the predictions for the evaluation data in predictions.
predictions = model.predict(X_test)

#Print the evaluation and prediction results
print('\nTest Loss : {:.4f} | Test Accuracy : {}'.format(loss, test_acc))
print('Predicted class of the first Test sample : ',1 if predictions[0]>=0.5 else 0)
