MNIST data set

취뽀가자!! 2018. 7. 23. 03:05

MNIST dataset

- Handwritten digit images, 0 through 9

- 28 x 28 pixels (784 values per image)

- File structure

[Training data] - 60,000 examples (train : validation = 55,000 : 5,000)

train-images-idx3-ubyte.gz

train-labels-idx1-ubyte.gz

[Test data] - 10,000 examples

t10k-images-idx3-ubyte.gz

t10k-labels-idx1-ubyte.gz
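
The four files above use the IDX binary format: a big-endian header (two zero bytes, a data-type code, the number of dimensions, then one 32-bit size per dimension) followed by the raw data bytes. As a rough sketch of how that header could be read with only the Python standard library (the helper name `read_idx_header` is made up for this illustration, and the demo bytes are synthetic, not real MNIST data):

```python
import gzip
import io
import struct


def read_idx_header(stream):
    """Read an IDX header and return (dtype_code, dims).

    Hypothetical helper for illustration; not part of TensorFlow.
    """
    # First 4 bytes: two zero bytes, a data-type code, the number of dimensions.
    zero1, zero2, dtype_code, ndim = struct.unpack(">BBBB", stream.read(4))
    if zero1 != 0 or zero2 != 0:
        raise ValueError("not an IDX file")
    # Then one big-endian unsigned 32-bit integer per dimension.
    dims = struct.unpack(">" + "I" * ndim, stream.read(4 * ndim))
    return dtype_code, dims


# Synthetic stand-in for an image file: 2 blank images of 28 x 28 unsigned bytes.
header = struct.pack(">BBBBIII", 0, 0, 0x08, 3, 2, 28, 28)
fake_file = gzip.compress(header + bytes(2 * 28 * 28))

with gzip.open(io.BytesIO(fake_file), "rb") as f:
    dtype_code, dims = read_idx_header(f)
    pixels = f.read()  # remaining bytes: one byte per pixel

print(hex(dtype_code), dims, len(pixels))
```

The `idx3` and `idx1` in the file names refer to the number of dimensions (images are count x rows x columns, labels are a flat list), and `ubyte` to the unsigned-byte data type (code 0x08). Opening the real `train-images-idx3-ubyte.gz` the same way should report dimensions (60000, 28, 28).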

TensorFlow

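# Requires TensorFlow 1.x: tf.placeholder, tf.Session, and the
# tensorflow.examples.tutorials module are not available in the
# TensorFlow 2.x top-level API.
import tensorflow as tf
from tensorflow.examples.tutorials.mnist import input_data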
 
import matplotlib.pyplot as plt
import random
 
# Downloads the data automatically if it is not already present (this can take a while)
# With one_hot=True, the label data is returned as one-hot vectors
mnist = input_data.read_data_sets("data/MNIST_data/", one_hot=True)
 
# Digit classes 0 to 9
nb_classes = 10
 
# MNIST data image of shape 28 * 28 = 784
X = tf.placeholder(tf.float32, [None, 784])
# 0 - 9 digits recognition = 10 classes
Y = tf.placeholder(tf.float32, [None, nb_classes])
 
W = tf.Variable(tf.random_normal([784, nb_classes]))
b = tf.Variable(tf.random_normal([nb_classes]))
 
# Hypothesis (using softmax)
hypothesis = tf.nn.softmax(tf.matmul(X, W) + b)
 
# cross_entropy
cost = tf.reduce_mean(-tf.reduce_sum(Y * tf.log(hypothesis), axis=1))
optimizer = tf.train.GradientDescentOptimizer(learning_rate=0.1).minimize(cost)
 
# Test model
is_correct = tf.equal(tf.argmax(hypothesis, 1), tf.argmax(Y, 1))
# Calculate accuracy
accuracy = tf.reduce_mean(tf.cast(is_correct, tf.float32))
 
# parameters
# training_epochs: number of full passes over the training data
# batch_size: number of examples loaded and fed to the model per training step
training_epochs = 15
batch_size = 100
 
with tf.Session() as sess:
    # Initialize TensorFlow variables
    sess.run(tf.global_variables_initializer())

    # Training cycle
    for epoch in range(training_epochs):
        avg_cost = 0
        # Dividing the number of training examples by batch_size gives the
        # number of steps per epoch (55,000 // 100 = 550 here).
        total_batch = mnist.train.num_examples // batch_size

        for i in range(total_batch):
            # Train on the next mini-batch of training data
            batch_xs, batch_ys = mnist.train.next_batch(batch_size)
            c, _ = sess.run([cost, optimizer], feed_dict={X: batch_xs, Y: batch_ys})
            avg_cost += c / total_batch
 
        print('Epoch:', '%04d' % (epoch + 1), 'cost =', '{:.9f}'.format(avg_cost))
        # Epoch: 0015 cost = 0.450051648
 
    # Measure accuracy on the test data
    # accuracy.eval(session=sess, ...) is equivalent to sess.run(accuracy, ...)
    print("Accuracy: ", accuracy.eval(session=sess, feed_dict={X: mnist.test.images, Y: mnist.test.labels}))
    #Accuracy : 0.8886
 
    # Pick one test example at random and check the prediction
    r = random.randint(0, mnist.test.num_examples - 1)
    print("Label:", sess.run(tf.argmax(mnist.test.labels[r:r+1], 1)))
    print("Prediction:", sess.run(tf.argmax(hypothesis, 1), feed_dict={X: mnist.test.images[r:r+1]}))
    # Label: [3]
    # Prediction: [3]
 
    # Display the image
    plt.imshow(mnist.test.images[r:r+1].reshape(28, 28), cmap='Greys', interpolation='nearest')
    plt.show()
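
To make the graph above more concrete, here is a small pure-Python sketch of what the hypothesis, cost, and accuracy nodes compute for one batch. The function names and the toy logits are assumptions for illustration and are not part of the lecture code. One deliberate difference: this sketch computes log-softmax directly, a common stabilization that the script above does not apply. Taking `tf.log` of a softmax output can produce log(0) when a probability underflows to zero.

```python
import math


def log_softmax(logits):
    """Log of the softmax of one row of logits, computed stably."""
    m = max(logits)
    log_sum = m + math.log(sum(math.exp(z - m) for z in logits))
    return [z - log_sum for z in logits]


def cross_entropy(batch_logits, batch_onehot):
    """Mean over the batch of -sum(y * log(softmax(logits)))."""
    total = 0.0
    for logits, onehot in zip(batch_logits, batch_onehot):
        log_probs = log_softmax(logits)
        total += -sum(y * lp for y, lp in zip(onehot, log_probs))
    return total / len(batch_logits)


def argmax(row):
    """Index of the largest entry in a row."""
    return max(range(len(row)), key=lambda i: row[i])


def accuracy(batch_logits, batch_onehot):
    """Fraction of rows whose predicted class matches the one-hot label."""
    hits = [argmax(l) == argmax(y) for l, y in zip(batch_logits, batch_onehot)]
    return sum(hits) / len(hits)


# Toy batch: 3 examples, 3 classes (MNIST would use 10 classes).
logits = [[2.0, 1.0, 0.1], [0.5, 2.5, 0.3], [1.2, 0.2, 0.1]]
labels = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]  # one-hot labels

cost = cross_entropy(logits, labels)
acc = accuracy(logits, labels)
print("cost =", cost, "accuracy =", acc)
```

In this toy batch the first two rows are classified correctly and the third is not, so the accuracy is 2/3. In TensorFlow 1.x, `tf.nn.softmax_cross_entropy_with_logits` plays a similar role by taking the raw logits `tf.matmul(X, W) + b` instead of the softmax output.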
 

----------------------------

This post is a summary of notes taken from the lecture series 모두를 위한 딥러닝 (Deep Learning for Everyone).