Notes from running training on the MNIST handwritten-digit dataset with TensorFlow + Keras.
We build a simple CNN classifier and a DCGAN.
{{small:Full code:[link:https://github.com/sanko-shoko/simplesp_py/tree/master/mnist] }}
*Installation
Install tensorflow first, then keras.
{/
pip install tensorflow
pip install keras
/}
To use a GPU, install the GPU build instead.
{/
pip install tensorflow_gpu
pip install keras
/}
Check the installed versions.
{/
pip list
...
Keras (2.0.6)
tensorflow (1.2.1)
/}
*simple CNN
Main reference {{small:[link:https://github.com/fchollet/keras/blob/master/examples/mnist_cnn.py] }}
We build a fairly simple network: two convolutional layers followed by two fully connected layers.
**Setup
Import everything we need and fetch the dataset.
{#
from __future__ import print_function
import os
import keras
from keras.datasets import mnist
from keras.models import Sequential
from keras.layers import Dense, Conv2D
from keras.layers import Dropout, Flatten, MaxPooling2D
# load mnist
(x_train, y_train), (x_test, y_test) = mnist.load_data()
# normalize
x_train = x_train.astype('float32') / 255
x_test = x_test.astype('float32') / 255
# one hot
y_train = keras.utils.to_categorical(y_train, 10)
y_test = keras.utils.to_categorical(y_test, 10)
# reshape
shape = (28, 28, 1)
x_train = x_train.reshape(-1, shape[0], shape[1], shape[2])
x_test = x_test.reshape(-1, shape[0], shape[1], shape[2])
print('x_train :', x_train.shape)
print('x_test :', x_test.shape)
print('y_train :', y_train.shape)
print('y_test :', y_test.shape)
#}
Output so far
{/
Using TensorFlow backend.
x_train : (60000, 28, 28, 1)
x_test : (10000, 28, 28, 1)
y_train : (60000, 10)
y_test : (10000, 10)
/}
MNIST contains 60,000 training images and 10,000 test images.
Each image is 28x28 pixels with a single grayscale channel.
Each image carries one of the 10 labels 0 through 9.
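As a sanity check, `keras.utils.to_categorical` simply maps each integer label to a 10-dimensional one-hot vector. A pure-Python sketch of the same mapping (`one_hot` is a hypothetical helper, not part of Keras):

```python
# Minimal sketch of what keras.utils.to_categorical does per label.
# one_hot is a hypothetical helper for illustration only.
def one_hot(label, class_num=10):
    vec = [0.0] * class_num
    vec[label] = 1.0
    return vec

print(one_hot(3))  # 10-dim vector with a single 1 at index 3
```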
**Model definition
{#
# define model
def cnn_model(class_num, input_shape):
    model = Sequential()
    model.add(Conv2D(32, (3, 3), activation='relu', input_shape=input_shape))
    model.add(Conv2D(64, (3, 3), activation='relu'))
    model.add(MaxPooling2D(pool_size=(2, 2)))
    model.add(Dropout(0.25))
    model.add(Flatten())
    model.add(Dense(128, activation='relu'))
    model.add(Dropout(0.5))
    model.add(Dense(class_num, activation='softmax'))
    return model

model = cnn_model(10, shape)
model.summary()

opt = keras.optimizers.Adadelta()
model.compile(loss=keras.losses.categorical_crossentropy, optimizer=opt, metrics=['accuracy'])
#}
Output so far
{/
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
conv2d_1 (Conv2D) (None, 26, 26, 32) 320
_________________________________________________________________
conv2d_2 (Conv2D) (None, 24, 24, 64) 18496
_________________________________________________________________
max_pooling2d_1 (MaxPooling2 (None, 12, 12, 64) 0
_________________________________________________________________
dropout_1 (Dropout) (None, 12, 12, 64) 0
_________________________________________________________________
flatten_1 (Flatten) (None, 9216) 0
_________________________________________________________________
dense_1 (Dense) (None, 128) 1179776
_________________________________________________________________
dropout_2 (Dropout) (None, 128) 0
_________________________________________________________________
dense_2 (Dense) (None, 10) 1290
=================================================================
Total params: 1,199,882
Trainable params: 1,199,882
Non-trainable params: 0
_________________________________________________________________
/}
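The shapes and parameter counts in the summary can be checked by hand: a 3x3 convolution with no padding shrinks each spatial dimension by 2, 2x2 max pooling halves it, and a Conv2D layer has kernel_h * kernel_w * in_channels * filters weights plus one bias per filter. A quick hand-check, assuming stride 1 and 'valid' padding (`conv_out` and `conv_params` are hypothetical helpers):

```python
# Hand-check the shapes and parameter counts printed by model.summary().
def conv_out(size, kernel):
    # 'valid' convolution, stride 1: output shrinks by (kernel - 1)
    return size - kernel + 1

def conv_params(kernel, in_ch, filters):
    # weights plus one bias per filter
    return kernel * kernel * in_ch * filters + filters

size = conv_out(28, 3)       # conv2d_1 output: 26
assert size == 26
size = conv_out(size, 3)     # conv2d_2 output: 24
assert size == 24
size //= 2                   # 2x2 max pooling: 12
assert size == 12

assert conv_params(3, 1, 32) == 320      # conv2d_1
assert conv_params(3, 32, 64) == 18496   # conv2d_2
assert 12 * 12 * 64 == 9216              # flatten
assert 9216 * 128 + 128 == 1179776       # dense_1
assert 128 * 10 + 10 == 1290             # dense_2
```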
**Training and evaluation
{#
epochs = 10
batch_size = 128

param_folder = './param'
if not os.path.isdir(param_folder):
    os.makedirs(param_folder)

# call back (for save param)
cbk = keras.callbacks.ModelCheckpoint(filepath=os.path.join(param_folder, 'param{epoch:02d}.hdf5'))

# train
model.fit(x_train, y_train, batch_size=batch_size, epochs=epochs, verbose=1, callbacks=[cbk], validation_data=(x_test, y_test))

# test
result = model.predict(x_test)
score = model.evaluate(x_test, y_test, verbose=0)
print('test result:', result.argmax(axis=1))
print('test loss and accuracy:', score)
#}
Output so far
{/
Train on 60000 samples, validate on 10000 samples
Epoch 1/10
60000/60000 [==============================] - 146s - loss: 0.3395 - acc: 0.8978 - val_loss: 0.0793 - val_acc: 0.9750
Epoch 2/10
60000/60000 [==============================] - 142s - loss: 0.1150 - acc: 0.9667 - val_loss: 0.0529 - val_acc: 0.9828
Epoch 3/10
60000/60000 [==============================] - 148s - loss: 0.0872 - acc: 0.9747 - val_loss: 0.0465 - val_acc: 0.9858
Epoch 4/10
60000/60000 [==============================] - 155s - loss: 0.0720 - acc: 0.9784 - val_loss: 0.0410 - val_acc: 0.9858
Epoch 5/10
60000/60000 [==============================] - 144s - loss: 0.0629 - acc: 0.9813 - val_loss: 0.0357 - val_acc: 0.9873
Epoch 6/10
60000/60000 [==============================] - 140s - loss: 0.0565 - acc: 0.9831 - val_loss: 0.0353 - val_acc: 0.9877
Epoch 7/10
60000/60000 [==============================] - 132s - loss: 0.0522 - acc: 0.9844 - val_loss: 0.0342 - val_acc: 0.9885
Epoch 8/10
60000/60000 [==============================] - 139s - loss: 0.0466 - acc: 0.9859 - val_loss: 0.0301 - val_acc: 0.9896
Epoch 9/10
60000/60000 [==============================] - 140s - loss: 0.0436 - acc: 0.9866 - val_loss: 0.0300 - val_acc: 0.9892
Epoch 10/10
60000/60000 [==============================] - 149s - loss: 0.0417 - acc: 0.9875 - val_loss: 0.0299 - val_acc: 0.9897
test result: [7 2 1 ..., 4 5 6]
test loss and accuracy: [0.029858358073154886, 0.98970000000000002]
/}
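`model.predict` returns per-class probabilities, and the reported accuracy is just the fraction of rows whose argmax matches the argmax of the one-hot ground truth. A small pure-Python illustration with toy numbers (not actual model output):

```python
# Toy illustration: accuracy = fraction of rows where the predicted
# argmax equals the argmax of the one-hot ground truth.
preds = [[0.1, 0.8, 0.1],   # predicted class 1
         [0.7, 0.2, 0.1],   # predicted class 0
         [0.2, 0.3, 0.5]]   # predicted class 2
truth = [[0, 1, 0],
         [0, 0, 1],
         [0, 0, 1]]

def argmax(row):
    return max(range(len(row)), key=lambda i: row[i])

hits = sum(argmax(p) == argmax(t) for p, t in zip(preds, truth))
accuracy = hits / len(preds)
print(accuracy)  # 2 of 3 correct
```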
*DCGAN
Main reference {{small:[link:https://github.com/jacobgil/keras-dcgan] }}
**Setup
Import everything we need and fetch the dataset.
{#
from __future__ import print_function
import os
import numpy as np
from PIL import Image
import keras
from keras.datasets import mnist
from keras.models import Sequential
from keras.layers import Dense, Conv2D, BatchNormalization
from keras.layers import Activation, Flatten, Dropout, UpSampling2D, MaxPooling2D, Reshape
# load mnist
(x_train, y_train), (x_test, y_test) = mnist.load_data()
# normalize
x_train = (x_train.astype('float32') - 127.5) / 127.5
# reshape
shape = (28, 28, 1)
x_train = x_train.reshape(-1, shape[0], shape[1], shape[2])
print('x_train :', x_train.shape)
#}
Output so far
{/
Using TensorFlow backend.
x_train : (60000, 28, 28, 1)
/}
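Note that the normalization differs from the CNN section: pixels are mapped to [-1, 1] rather than [0, 1], so that real images match the range of the generator's final tanh activation. The mapping and its inverse (the inverse is used later when saving generated images):

```python
# Map uint8 pixels [0, 255] to [-1, 1] (matching tanh output) and back.
def to_tanh_range(pixel):
    return (pixel - 127.5) / 127.5

def to_uint8_range(value):
    return value * 127.5 + 127.5

print(to_tanh_range(0), to_tanh_range(255))  # -1.0 1.0
```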
**Model definition
{#
# define model
def generator_model():
    model = Sequential()
    model.add(Dense(1024, input_shape=(100,)))
    model.add(Activation('tanh'))
    model.add(Dense(7 * 7 * 128))
    model.add(BatchNormalization())
    model.add(Activation('tanh'))
    model.add(Reshape((7, 7, 128)))
    model.add(UpSampling2D(size=(2, 2)))
    model.add(Conv2D(64, (5, 5), padding='same'))
    model.add(Activation('tanh'))
    model.add(UpSampling2D(size=(2, 2)))
    model.add(Conv2D(1, (5, 5), padding='same'))
    model.add(Activation('tanh'))
    return model

def discriminator_model():
    model = Sequential()
    model.add(Conv2D(64, (5, 5), padding='same', input_shape=(28, 28, 1)))
    model.add(Activation('tanh'))
    model.add(MaxPooling2D(pool_size=(2, 2)))
    model.add(Conv2D(128, (5, 5)))
    model.add(Activation('tanh'))
    model.add(MaxPooling2D(pool_size=(2, 2)))
    model.add(Flatten())
    model.add(Dense(1024))
    model.add(Activation('tanh'))
    model.add(Dense(1))
    model.add(Activation('sigmoid'))
    return model

def combined_model(generator, discriminator):
    model = Sequential()
    model.add(generator)
    model.add(discriminator)
    return model

generator = generator_model()
generator.summary()
discriminator = discriminator_model()
discriminator.summary()

discriminator.trainable = False
combined = combined_model(generator, discriminator)
combined.summary()

opt = keras.optimizers.SGD(lr=0.0005, momentum=0.9, nesterov=True)

# update the discriminator's parameters when it is trained directly ...
discriminator.trainable = True
discriminator.compile(loss='binary_crossentropy', optimizer=opt)
# ... but freeze them inside the combined (generator + discriminator) model
discriminator.trainable = False
combined.compile(loss='binary_crossentropy', optimizer=opt)
#}
Output so far
{/
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
dense_1 (Dense) (None, 1024) 103424
_________________________________________________________________
activation_1 (Activation) (None, 1024) 0
_________________________________________________________________
dense_2 (Dense) (None, 6272) 6428800
_________________________________________________________________
batch_normalization_1 (Batch (None, 6272) 25088
_________________________________________________________________
activation_2 (Activation) (None, 6272) 0
_________________________________________________________________
reshape_1 (Reshape) (None, 7, 7, 128) 0
_________________________________________________________________
up_sampling2d_1 (UpSampling2 (None, 14, 14, 128) 0
_________________________________________________________________
conv2d_1 (Conv2D) (None, 14, 14, 64) 204864
_________________________________________________________________
activation_3 (Activation) (None, 14, 14, 64) 0
_________________________________________________________________
up_sampling2d_2 (UpSampling2 (None, 28, 28, 64) 0
_________________________________________________________________
conv2d_2 (Conv2D) (None, 28, 28, 1) 1601
_________________________________________________________________
activation_4 (Activation) (None, 28, 28, 1) 0
=================================================================
Total params: 6,763,777
Trainable params: 6,751,233
Non-trainable params: 12,544
_________________________________________________________________
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
conv2d_3 (Conv2D) (None, 28, 28, 64) 1664
_________________________________________________________________
activation_5 (Activation) (None, 28, 28, 64) 0
_________________________________________________________________
max_pooling2d_1 (MaxPooling2 (None, 14, 14, 64) 0
_________________________________________________________________
conv2d_4 (Conv2D) (None, 10, 10, 128) 204928
_________________________________________________________________
activation_6 (Activation) (None, 10, 10, 128) 0
_________________________________________________________________
max_pooling2d_2 (MaxPooling2 (None, 5, 5, 128) 0
_________________________________________________________________
flatten_1 (Flatten) (None, 3200) 0
_________________________________________________________________
dense_3 (Dense) (None, 1024) 3277824
_________________________________________________________________
activation_7 (Activation) (None, 1024) 0
_________________________________________________________________
dense_4 (Dense) (None, 1) 1025
_________________________________________________________________
activation_8 (Activation) (None, 1) 0
=================================================================
Total params: 3,485,441
Trainable params: 3,485,441
Non-trainable params: 0
_________________________________________________________________
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
sequential_1 (Sequential) (None, 28, 28, 1) 6763777
_________________________________________________________________
sequential_2 (Sequential) (None, 1) 3485441
=================================================================
Total params: 10,249,218
Trainable params: 6,751,233
Non-trainable params: 3,497,985
_________________________________________________________________
/}
A GAN alternately trains two models: ① the discriminator on its own, and ② the generator chained to the discriminator.
When training model ①, the discriminator's parameters are updated.
When training model ②, the discriminator's parameters are frozen and only the generator's parameters are updated.
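The labels drive this: the discriminator sees real images labeled 1 and generated images labeled 0, while the combined model sees generated images labeled 1, so minimizing binary cross-entropy pushes the generator to raise D(G(z)). A pure-Python sketch of that loss (`bce` is a hypothetical helper mirroring binary cross-entropy, not a Keras API):

```python
import math

# Binary cross-entropy for one prediction p in (0, 1) against label y.
def bce(p, y):
    return -(y * math.log(p) + (1 - y) * math.log(1 - p))

# Discriminator step: real -> 1, fake -> 0, so confident correct
# answers give small losses.
real_loss = bce(0.9, 1)
fake_loss = bce(0.1, 0)

# Generator step: fakes are labeled 1, so the loss is large while the
# discriminator still rejects them, and shrinks as they fool it.
assert bce(0.1, 1) > bce(0.9, 1)
```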
**Training
{#
epochs = 100
batch_size = 128

param_folder = './param'
if not os.path.isdir(param_folder):
    os.makedirs(param_folder)

for epoch in range(epochs):
    print('Epoch {}/{}'.format(epoch + 1, epochs))

    itmax = int(x_train.shape[0] / batch_size)
    progbar = keras.utils.generic_utils.Progbar(target=itmax)

    for i in range(itmax):
        # train discriminator (real images -> 1, generated images -> 0)
        x = x_train[i * batch_size : (i + 1) * batch_size]
        n = np.random.uniform(-1, 1, (batch_size, 100))
        g = generator.predict(n, verbose=0)
        y = [1] * batch_size + [0] * batch_size
        d_loss = discriminator.train_on_batch(np.concatenate((x, g)), y)

        # train generator (generated images labeled 1, discriminator frozen)
        n = np.random.uniform(-1, 1, (batch_size, 100))
        y = [1] * batch_size
        g_loss = combined.train_on_batch(n, y)

        progbar.add(1, values=[("d_loss", d_loss), ("g_loss", g_loss)])

        # save image
        if i % 20 == 0:
            tmp = [r.reshape(-1, 28) for r in np.split(g[:100], 10)]
            img = np.concatenate(tmp, axis=1)
            img = (img * 127.5 + 127.5).astype(np.uint8)
            Image.fromarray(img).save("{}_{}.png".format(epoch, i))

    # save param
    generator.save_weights(os.path.join(param_folder, 'generator_{}.hdf5'.format(epoch)))
    discriminator.save_weights(os.path.join(param_folder, 'discriminator_{}.hdf5'.format(epoch)))
#}
Output so far
{/
Epoch 1/100
468/468 [==============================] - 1327s - d_loss: 0.4235 - g_loss: 1.0386
Epoch 2/100
468/468 [==============================] - 1366s - d_loss: 0.4928 - g_loss: 1.0905
Epoch 3/100
468/468 [==============================] - 1393s - d_loss: 0.4240 - g_loss: 1.2327
Epoch 4/100
468/468 [==============================] - 1406s - d_loss: 0.3942 - g_loss: 1.4242
Epoch 5/100
468/468 [==============================] - 1452s - d_loss: 0.2884 - g_loss: 1.9572
Epoch 6/100
468/468 [==============================] - 1465s - d_loss: 0.2263 - g_loss: 2.3618
Epoch 7/100
468/468 [==============================] - 1379s - d_loss: 0.2653 - g_loss: 2.4052
Epoch 8/100
468/468 [==============================] - 1358s - d_loss: 0.3320 - g_loss: 2.1082
Epoch 9/100
468/468 [==============================] - 1371s - d_loss: 0.3575 - g_loss: 2.0694
Epoch 10/100
468/468 [==============================] - 1363s - d_loss: 0.3413 - g_loss: 1.9845
...
/}
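The image-saving step tiles the first 100 generated digits into a 10x10 grid: `np.split` cuts the (100, 28, 28, 1) batch into ten blocks of ten, each block is reshaped into a 280x28 vertical strip, and the strips are concatenated side by side. The same arithmetic on dummy noise data:

```python
import numpy as np

# Tile 100 fake "images" (here random noise standing in for generator
# output) into a 10x10 grid, exactly as in the save-image step above.
g = np.random.uniform(-1, 1, (128, 28, 28, 1))

tmp = [r.reshape(-1, 28) for r in np.split(g[:100], 10)]  # ten 280x28 strips
img = np.concatenate(tmp, axis=1)                          # 280x280 grid
img = (img * 127.5 + 127.5).astype(np.uint8)               # back to [0, 255]

print(img.shape)  # (280, 280)
```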
[img:gdty]
{{small:Figure 1: images produced by the generator}}