keras mnist history

* see keras mnist batch 
History : records the loss/accuracy values produced during training and validation 
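fit() returns this History object; its .history dict holds one value per epoch and is what step 8 below plots. Schematically (a sketch - model_fit is created in step 6):

#model_fit = model.fit(...)            -> History object
#model_fit.history['loss']             -> per-epoch training loss (list)
#model_fit.history['val_accuracy']     -> per-epoch validation accuracy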


from tensorflow.keras.datasets import mnist #mnist load 
from tensorflow.keras.utils import to_categorical #Y variable : encoding 
from tensorflow.keras import Sequential #create keras model 
from tensorflow.keras.layers import Dense #build DNN layers
import matplotlib.pyplot as plt #visualization tool




Apply seeds to the keras internal w, b variables

import tensorflow as tf
import numpy as np 
import random as rd

tf.random.set_seed(123)
np.random.seed(123)
rd.seed(123)
import time #measure training time
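Since TF 2.7 the three seed calls above can be combined into one (a sketch; tf.keras.utils.set_random_seed seeds random, numpy and tensorflow together):

tf.keras.utils.set_random_seed(123) #one-call equivalent of the three seeds above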



1. mnist dataset load 

(x_train, y_train), (x_val, y_val) = mnist.load_data() #(images, labels)


images : X variable 

x_train.shape #(60000, 28, 28) - (size, h, w) : provided as 2d images 
x_val.shape #(10000, 28, 28)

x_train[0] #pixel values 0~255
x_train.max() #255

 

labels : y variable 

y_train.shape #(60000,)
y_train[0] #5
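Before preprocessing, the raw data can be sanity-checked visually (a minimal sketch; uses the matplotlib import above):

plt.imshow(x_train[0], cmap='gray') #render the first 28x28 digit image
plt.title('label = %d' % y_train[0]) #title should read: label = 5
plt.show()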




2. preprocess the X and y variables 
1) X variable : normalization & reshape(2d -> 1d)

x_train = x_train / 255. #normalization 
x_val = x_val / 255.

x_train[0]


reshape(2d -> 1d)

x_train = x_train.reshape(-1, 784) #(60000, 28*28)
x_val = x_val.reshape(-1, 784) #(10000, 28*28)


2) y variable : class labels (decimal) -> one-hot encoding (binary)

y_train = to_categorical(y_train)
y_val = to_categorical(y_val)


์ „์ฒ˜๋ฆฌ ํ™•์ธ 

x_train.shape #(60000, 784)
y_train[0] #[0., 0., 0., 0., 0., 1., 0., 0., 0., 0.] - 5
y_train.shape #(60000, 10)
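np.argmax decodes the one-hot vectors back to class labels, which makes quick checks easy (a sketch; uses the numpy import above):

np.argmax(y_train[0]) #5 - position of the 1 in the one-hot vector
np.argmax(y_train[:3], axis=1) #first three labels: [5 0 4]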

start_time = time.time() #start timing the training




3. keras model

model = Sequential()




4. build the DNN model layers 
hidden layer1 : w[784, 128]

model.add(Dense(units=128, input_shape=(784,), activation='relu')) #layer 1


hidden layer2 : w[128, 64]

model.add(Dense(units=64, activation='relu')) #layer 2


hidden layer3 : w[64, 32]

model.add(Dense(units=32, activation='relu')) #layer 3


output layer : w[32, 10]

model.add(Dense(units=10, activation='softmax')) #layer 4 (output)


check the model layers 

model.summary()

_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
dense_8 (Dense)              (None, 128)               100480 # = 784*128 + 128
_________________________________________________________________
dense_9 (Dense)              (None, 64)                8256      
_________________________________________________________________
dense_10 (Dense)             (None, 32)                2080      
_________________________________________________________________
dense_11 (Dense)             (None, 10)                330       
=================================================================
Total params: 111,146
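Each Param # value is weights + biases, i.e. (n_in * n_out) + n_out; a quick arithmetic check:

dims = [784, 128, 64, 32, 10] #input -> hidden1 -> hidden2 -> hidden3 -> output
params = [n_in * n_out + n_out for n_in, n_out in zip(dims, dims[1:])]
print(params) #[100480, 8256, 2080, 330]
print(sum(params)) #111146 - matches Total params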


 

5. model compile : configure the learning process (multiclass classifier) 

model.compile(optimizer='adam', #default : learning_rate=0.001
              loss='categorical_crossentropy', 
              metrics=['accuracy'])
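The string shortcuts expand to default objects; an equivalent explicit form (a sketch) makes the learning rate visible:

from tensorflow.keras.optimizers import Adam

model.compile(optimizer=Adam(learning_rate=0.001), #same as 'adam'
              loss='categorical_crossentropy',
              metrics=['accuracy'])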




6. [modified] model training : train(60,000) vs val(10,000)

model_fit = model.fit(x=x_train, y=y_train, #training set 
          epochs=15, #number of passes over the training data 
          batch_size=100, #mini batch : 60,000/100 = 600 weight updates per epoch 
          verbose=1, #print progress 
          validation_data= (x_val, y_val)) #validation set
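To make the batch arithmetic concrete: with 60,000 training samples and batch_size=100, each epoch performs 600 weight updates, so epochs=15 gives 9,000 updates in total:

steps_per_epoch = 60000 // 100 #600 mini-batch updates per epoch
print(steps_per_epoch * 15) #9000 updates over 15 epochs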

stop_time = time.time() - start_time 

print('elapsed time : ', stop_time)

full batch 
accuracy: 0.9923 - val_loss: 0.0873 - val_accuracy: 0.9793
elapsed time :  25.037985801696777

mini batch 
accuracy: 0.9933 - val_loss: 0.0916 - val_accuracy: 0.9739
elapsed time :  11.333117961883545



7. model evaluation : val dataset 

print('model evaluation')
model.evaluate(x=x_val, y=y_val)
#loss: 0.0916 - accuracy: 0.9739
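evaluate() only reports metrics; actual digit predictions come from decoding the softmax outputs with argmax (a sketch):

y_prob = model.predict(x_val) #(10000, 10) class probabilities
y_pred = np.argmax(y_prob, axis=1) #predicted digit per image
y_true = np.argmax(y_val, axis=1) #decode the one-hot labels
print((y_pred == y_true).mean()) #fraction correct - matches the accuracy above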

 



8. model history 

print(model_fit.history.keys()) #dict_keys(['loss', 'accuracy', 'val_loss', 'val_accuracy'])


loss vs val_loss : overfitting starts around epoch=2

plt.plot(model_fit.history['loss'], 'y', label='train loss')
plt.plot(model_fit.history['val_loss'], 'r', label='val loss')
plt.xlabel('epochs')
plt.ylabel('loss')
plt.legend(loc = 'best')
plt.show()


accuracy vs val_accuracy : overfitting starts around epoch=2

plt.plot(model_fit.history['accuracy'], 'y', label='train accuracy')
plt.plot(model_fit.history['val_accuracy'], 'r', label='val accuracy')
plt.xlabel('epochs')
plt.ylabel('accuracy')
plt.legend(loc = 'best')
plt.show()


keras mnist dropout

Dropout : randomly drops network nodes -> minimizes overfitting  
* see keras_mnist_history
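What Dropout does can be seen on a toy tensor (an illustrative sketch, separate from the model below): in training mode it zeroes a random rate fraction of its inputs and rescales the survivors by 1/(1-rate), so the expected sum is unchanged; at inference it is the identity.

import tensorflow as tf

drop = tf.keras.layers.Dropout(rate=0.3)
x = tf.ones((1, 10))
print(drop(x, training=True)) #about 30% zeros, survivors scaled to 1/0.7 = 1.43
print(drop(x, training=False)) #identity - dropout is disabled at inference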

from tensorflow.keras.datasets import mnist #mnist load 
from tensorflow.keras.utils import to_categorical #Y variable : encoding 
from tensorflow.keras import Sequential #create keras model 
from tensorflow.keras.layers import Dense, Dropout #[added] build DNN layers
import matplotlib.pyplot as plt #visualization tool




Apply seeds to the keras internal w, b variables 

import tensorflow as tf
import numpy as np 
import random as rd

tf.random.set_seed(123)
np.random.seed(123)
rd.seed(123)
import time #measure training time



1. mnist dataset load 

(x_train, y_train), (x_val, y_val) = mnist.load_data() #(images, labels)


images : X variable 

x_train.shape #(60000, 28, 28) - (size, h, w) : provided as 2d images 
x_val.shape #(10000, 28, 28)

x_train[0] #pixel values 0~255
x_train.max() #255


labels : y variable 

y_train.shape #(60000,)
y_train[0] #5

 

 


2. preprocess the X and y variables 
1) X variable : normalization & reshape(2d -> 1d)

x_train = x_train / 255. #normalization 
x_val = x_val / 255.

x_train[0]


reshape(2d -> 1d)

x_train = x_train.reshape(-1, 784) #(60000, 28*28)
x_val = x_val.reshape(-1, 784) #(10000, 28*28)


2) y variable : class labels (decimal) -> one-hot encoding (binary)

y_train = to_categorical(y_train)
y_val = to_categorical(y_val)


์ „์ฒ˜๋ฆฌ ํ™•์ธ 

x_train.shape # (60000, 784)
y_train[0] #[0., 0., 0., 0., 0., 1., 0., 0., 0., 0.] - 5
y_train.shape #(60000, 10)

start_time = time.time() #start timing the training




3. keras model

model = Sequential()




4. build the DNN model layers 
hidden layer1 : w[784, 128]

model.add(Dense(units=128, input_shape=(784,), activation='relu')) #layer 1 
model.add(Dropout(rate=0.3)) #[added] drop 30% of this layer's outputs during training


hidden layer2 : w[128, 64]

model.add(Dense(units=64, activation='relu')) #layer 2 
model.add(Dropout(rate=0.1)) #[added]


hidden layer3 : w[64, 32]

model.add(Dense(units=32, activation='relu')) #layer 3
model.add(Dropout(rate=0.1)) #[added]


output layer : w[32, 10]

model.add(Dense(units=10, activation='softmax')) #layer 4 (output)


check the model layers 

model.summary()

_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
dense (Dense)                (None, 128)               100480 # = 784*128 + 128
_________________________________________________________________
dropout (Dropout)            (None, 128)               0         
_________________________________________________________________
dense_1 (Dense)              (None, 64)                8256      
_________________________________________________________________
dropout_1 (Dropout)          (None, 64)                0         
_________________________________________________________________
dense_2 (Dense)              (None, 32)                2080      
_________________________________________________________________
dropout_2 (Dropout)          (None, 32)                0         
_________________________________________________________________
dense_3 (Dense)              (None, 10)                330       
=================================================================
Total params: 111,146 #Dropout layers add no parameters, so the total is unchanged

 

 


5. model compile : configure the learning process (multiclass classifier) 

model.compile(optimizer='adam', #default : learning_rate=0.001
              loss='categorical_crossentropy', 
              metrics=['accuracy'])




6. model training : train(60,000) vs val(10,000)

model_fit = model.fit(x=x_train, y=y_train, #training set 
          epochs=15, #number of passes over the training data 
          batch_size=100, #mini batch : 60,000/100 = 600 weight updates per epoch 
          verbose=1, #print progress 
          validation_data= (x_val, y_val)) #validation set

stop_time = time.time() - start_time 

print('elapsed time : ', stop_time)




7. model evaluation : val dataset 

print('model evaluation')
model.evaluate(x=x_val, y=y_val)




8. model history 

print(model_fit.history.keys()) #dict_keys(['loss', 'accuracy', 'val_loss', 'val_accuracy'])


loss vs val_loss : overfitting starts around epoch=2

plt.plot(model_fit.history['loss'], 'y', label='train loss')
plt.plot(model_fit.history['val_loss'], 'r', label='val loss')
plt.xlabel('epochs')
plt.ylabel('loss')
plt.legend(loc = 'best')
plt.show()


accuracy vs val_accuracy : overfitting starts around epoch=2

plt.plot(model_fit.history['accuracy'], 'y', label='train accuracy')
plt.plot(model_fit.history['val_accuracy'], 'r', label='val accuracy')
plt.xlabel('epochs')
plt.ylabel('accuracy')
plt.legend(loc = 'best')
plt.show()

keras mnist earlyStopping

* see step02_keras_mnist_dropout 

 

1. Dropout : randomly drops network nodes -> minimizes overfitting  
2. EarlyStopping 
- stops training early once the validation loss stops improving 
 

from tensorflow.keras.datasets import mnist #mnist load 
from tensorflow.keras.utils import to_categorical #Y variable : encoding 
from tensorflow.keras import Sequential #create keras model 
from tensorflow.keras.layers import Dense, Dropout #build DNN layers
from tensorflow.keras.callbacks import EarlyStopping #[added]

import matplotlib.pyplot as plt #visualization tool




Apply seeds to the keras internal w, b variables

import tensorflow as tf
import numpy as np 
import random as rd

tf.random.set_seed(123)
np.random.seed(123)
rd.seed(123)
import time #measure training time



1. mnist dataset load 

(x_train, y_train), (x_val, y_val) = mnist.load_data() #(images, labels)


images : X variable 

x_train.shape #(60000, 28, 28) - (size, h, w) : provided as 2d images 
x_val.shape #(10000, 28, 28)

x_train[0] #pixel values 0~255
x_train.max() #255


labels : y variable 

y_train.shape #(60000,)
y_train[0] #5




2. preprocess the X and y variables 
1) X variable : normalization & reshape(2d -> 1d)

x_train = x_train / 255. #normalization 
x_val = x_val / 255.

x_train[0]


reshape(2d -> 1d)

x_train = x_train.reshape(-1, 784) #(60000, 28*28)
x_val = x_val.reshape(-1, 784) #(10000, 28*28)


2) y variable : class labels (decimal) -> one-hot encoding (binary)

y_train = to_categorical(y_train)
y_val = to_categorical(y_val)


์ „์ฒ˜๋ฆฌ ํ™•์ธ 

x_train.shape #(60000, 784)
y_train[0] #[0., 0., 0., 0., 0., 1., 0., 0., 0., 0.] - 5
y_train.shape #(60000, 10)

start_time = time.time() #start timing the training




3. keras model

model = Sequential()




4. build the DNN model layers 
hidden layer1 : w[784, 128]

model.add(Dense(units=128, input_shape=(784,), activation='relu')) #layer 1 
model.add(Dropout(rate=0.3))


hidden layer2 : w[128, 64]

model.add(Dense(units=64, activation='relu')) #layer 2 
model.add(Dropout(rate=0.1))


hidden layer3 : w[64, 32]

model.add(Dense(units=32, activation='relu')) #layer 3
model.add(Dropout(rate=0.1))


output layer : w[32, 10]

model.add(Dense(units=10, activation='softmax')) #layer 4 (output)


check the model layers 

model.summary()

_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
dense (Dense)                (None, 128)               100480 # = 784*128 + 128
_________________________________________________________________
dropout (Dropout)            (None, 128)               0         
_________________________________________________________________
dense_1 (Dense)              (None, 64)                8256      
_________________________________________________________________
dropout_1 (Dropout)          (None, 64)                0         
_________________________________________________________________
dense_2 (Dense)              (None, 32)                2080      
_________________________________________________________________
dropout_2 (Dropout)          (None, 32)                0         
_________________________________________________________________
dense_3 (Dense)              (None, 10)                330       
=================================================================
Total params: 111,146 #Dropout layers add no parameters, so the total is unchanged



5. model compile : configure the learning process (multiclass classifier) 

model.compile(optimizer='adam', #default : learning_rate=0.001
              loss='categorical_crossentropy', 
              metrics=['accuracy'])




6. model training : train(60,000) vs val(10,000)
[added] stop early when the validation loss has not improved for 5 consecutive epochs 

callback = EarlyStopping(monitor='val_loss', patience=5)
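patience=5 counts consecutive epochs without val_loss improvement; when it triggers, the model keeps the weights of the last epoch, not the best one. A common extension (a sketch; restore_best_weights is a standard tf.keras option) rolls back to the best epoch:

callback = EarlyStopping(monitor='val_loss', patience=5,
                         restore_best_weights=True) #revert to lowest-val_loss weights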

model_fit = model.fit(x=x_train, y=y_train, #training set 
          epochs=30, #[modified] maximum number of epochs - early stopping may end sooner 
          batch_size=100, #mini batch : 60,000/100 = 600 weight updates per epoch 
          verbose=1, #print progress 
          validation_data= (x_val, y_val), #validation set
          callbacks = [callback]) #[added] early stopping (fit expects a list of callbacks)

stop_time = time.time() - start_time 

print('elapsed time : ', stop_time)
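Because training can stop before epoch 30, the history length shows how many epochs actually ran:

print('epochs run : ', len(model_fit.history['loss'])) #< 30 if stopped early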




7. model evaluation : val dataset 

print('model evaluation')
model.evaluate(x=x_val, y=y_val)




8. model history 

print(model_fit.history.keys()) #dict_keys(['loss', 'accuracy', 'val_loss', 'val_accuracy'])


loss vs val_loss : overfitting starts around epoch=2

plt.plot(model_fit.history['loss'], 'y', label='train loss')
plt.plot(model_fit.history['val_loss'], 'r', label='val loss')
plt.xlabel('epochs')
plt.ylabel('loss')
plt.legend(loc = 'best')
plt.show()


accuracy vs val_accuracy : overfitting starts around epoch=2

plt.plot(model_fit.history['accuracy'], 'y', label='train accuracy')
plt.plot(model_fit.history['val_accuracy'], 'r', label='val accuracy')
plt.xlabel('epochs')
plt.ylabel('accuracy')
plt.legend(loc = 'best')
plt.show()
