[Serialized] TensorFlow ML Cookbook, Chapter 8, Section 2: Implementing an Advanced CNN

Questions to guide your reading:
1. How do we declare the image parameters (height and width) and the size of the randomly cropped images?
2. How does the read_cifar_files() function return a randomly distorted image?
3. How do we declare the model function and set up its two convolutional layers?
4. How do we initialize the loss and test accuracy functions?

Previous: TensorFlow ML Cookbook, Chapter 8 Section 1, Convolutional Neural Networks: Implementing a Simpler CNN

Implementing an Advanced CNN
It is important to be able to extend CNN models for image recognition so that we understand how to increase the depth of the network. If we have enough data, this may increase the accuracy of our predictions. Extending the depth of a CNN is done in a standard fashion: we simply repeat the convolution, maxpool, ReLU series until we are satisfied with the depth. Many of the more accurate image recognition networks operate in this fashion.
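Purely as an illustration of that repeated pattern (a minimal sketch; the helper name conv_pool_block and its parameters are ours, not part of this recipe's code):

import tensorflow as tf

def conv_pool_block(inputs, kernel, name):
  # One repeatable convolution -> ReLU -> max-pool unit; deeper networks
  # simply chain more of these blocks before the fully connected layers.
  with tf.variable_scope(name):
    conv = tf.nn.conv2d(inputs, kernel, [1, 1, 1, 1], padding='SAME')
    relu = tf.nn.relu(conv)
    return(tf.nn.max_pool(relu, ksize=[1, 3, 3, 1], strides=[1, 2, 2, 1], padding='SAME'))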

Getting ready
In this recipe, we will implement a more advanced method of reading image data and use a larger CNN to do image recognition on the CIFAR-10 dataset (https://www.cs.toronto.edu/~kriz/cifar.html). This dataset contains 60,000 32x32 images that each fall into exactly one of ten possible classes: airplane, automobile, bird, cat, deer, dog, frog, horse, ship, and truck. You can also refer to the first bullet point of the See also section.

Most image datasets are too large to fit into memory. With TensorFlow we can set up an image pipeline that reads a batch at a time from file. We do this by essentially setting up an image reader and then creating a batch queue that operates on the image reader.

Also, with image recognition data, it is common to randomly perturb the images before sending them through for training. Here, we will randomly crop, flip, and change the brightness.

This recipe is an adapted version of the official TensorFlow CIFAR-10 tutorial, which is referenced in the See also section at the end of this chapter. We have condensed the tutorial into one script that we will walk through line by line, explaining all the necessary code. We also revert some constants and parameters to the values from the originally cited paper; we will point these out at the appropriate steps below.

How to do it…
1. To start with, we load the necessary libraries and start a graph session:
import os
import sys
import tarfile
import matplotlib.pyplot as plt
import numpy as np
import tensorflow as tf
from six.moves import urllib
sess = tf.Session() 


2. Now we'll declare some of the model parameters. Our batch size will be 128 (for both train and test). We will output a status every 50 generations and run for a total of 20,000 generations. Every 500 generations, we will evaluate on a batch of the test data. We'll then declare some image parameters (height and width) and the size that the randomly cropped images will take. There are three channels (red, green, and blue), and we have ten different targets. Then, we'll declare where we will store the data and the folder that the image batches will be extracted to:
batch_size = 128
output_every = 50
generations = 20000
eval_every = 500
image_height = 32
image_width = 32
crop_height = 24
crop_width = 24
num_channels = 3
num_targets = 10
data_dir = 'temp'
extract_folder = 'cifar-10-batches-bin'


3. It is recommended to lower the learning rate as we progress towards a good model, so we will decrease the learning rate exponentially: the initial learning rate will be set at 0.1, and we will decrease it by a factor of 10% every 250 generations. The exact formula is learning_rate = 0.1 * 0.9^(x/250), where x is the current generation number. By default this decreases continuously, but TensorFlow also accepts a staircase argument, which only updates the learning rate in discrete steps, every num_gens_to_wait generations:
learning_rate = 0.1
lr_decay = 0.9
num_gens_to_wait = 250.
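As a quick standalone check of this schedule (plain Python, illustrative only, not part of the recipe):

# Learning rate at generation 1100 under both decay modes:
staircase_lr = 0.1 * 0.9 ** (1100 // 250)    # 0.1 * 0.9**4   ~= 0.0656
continuous_lr = 0.1 * 0.9 ** (1100 / 250.0)  # 0.1 * 0.9**4.4 ~= 0.0629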

4. Now we'll set up parameters so that we can read in the binary CIFAR-10 images:

image_vec_length = image_height * image_width * num_channels  # 32 * 32 * 3 = 3072 bytes
record_length = 1 + image_vec_length  # 1 label byte + 3072 image bytes = 3073 bytes per record


5. Next, we'll set up the data directory and the URL to download the CIFAR-10 images, if we don't have them already. The excerpt passes a progress callback to urlretrieve without defining it, so we add a minimal reporthook here to make the snippet runnable:

data_dir = 'temp'
if not os.path.exists(data_dir):
  os.makedirs(data_dir)
cifar10_url = 'http://www.cs.toronto.edu/~kriz/cifar-10-binary.tar.gz'
data_file = os.path.join(data_dir, 'cifar-10-binary.tar.gz')

def progress(block_num, block_size, total_size):
  # Minimal download-progress reporthook for urlretrieve (our addition).
  pct = 100.0 * block_num * block_size / total_size
  sys.stdout.write('\rDownloading: {:.1f}%'.format(min(pct, 100.0)))
  sys.stdout.flush()

if not os.path.isfile(data_file):
  # Download file
  filepath, _ = urllib.request.urlretrieve(cifar10_url, data_file, progress)
  # Extract file
  tarfile.open(filepath, 'r:gz').extractall(data_dir)


6. We'll set up the record reader and return a randomly distorted image with the following read_cifar_files() function. First, we need to declare a record reader object that reads a fixed number of bytes per record. After reading from the image queue, we split apart the image and the label. Finally, we randomly distort the image with TensorFlow's built-in image modification functions:

def read_cifar_files(filename_queue, distort_images = True):
  reader = tf.FixedLengthRecordReader(record_bytes=record_length)
  key, record_string = reader.read(filename_queue)
  record_bytes = tf.decode_raw(record_string, tf.uint8)
  # Extract label
  image_label = tf.cast(tf.slice(record_bytes, [0], [1]), tf.int32)
  # Extract image
  image_extracted = tf.reshape(tf.slice(record_bytes, [1], [image_vec_length]), [num_channels, image_height, image_width])
  # Reshape image from channels-first to channels-last (height, width, channels)
  image_uint8image = tf.transpose(image_extracted, [1, 2, 0])
  reshaped_image = tf.cast(image_uint8image, tf.float32)
  # Crop (or pad) the image to crop_height x crop_width; this is a central crop,
  # and the function signature is (image, target_height, target_width)
  final_image = tf.image.resize_image_with_crop_or_pad(reshaped_image, crop_height, crop_width)
  if distort_images:
    # Randomly flip the image horizontally, change the brightness and contrast
    final_image = tf.image.random_flip_left_right(final_image)
    final_image = tf.image.random_brightness(final_image, max_delta=63)
    final_image = tf.image.random_contrast(final_image, lower=0.2, upper=1.8)
  # Normalize (whiten) the image
  final_image = tf.image.per_image_whitening(final_image)
  return(final_image, image_label)


7. Now we'll declare a function that will populate our image pipeline for the batch processor to use. We first need to set up the file list of images we want to read through, and define how to read them with an input producer object created through prebuilt TensorFlow functions. The input producer can be passed into the reading function that we created in the preceding step, read_cifar_files(). We'll then set a batch reader on the queue, shuffle_batch():
def input_pipeline(batch_size, train_logical=True):
  if train_logical:
    files = [os.path.join(data_dir, extract_folder, 'data_batch_{}.bin'.format(i)) for i in range(1,6)]
  else:
    files = [os.path.join(data_dir, extract_folder, 'test_batch.bin')]
  filename_queue = tf.train.string_input_producer(files)
  image, label = read_cifar_files(filename_queue)
  min_after_dequeue = 1000
  capacity = min_after_dequeue + 3 * batch_size
  example_batch, label_batch = tf.train.shuffle_batch([image, label], batch_size, capacity, min_after_dequeue)
  return(example_batch, label_batch)


It is important to set min_after_dequeue properly. This parameter sets the minimum size of the image buffer that batches are sampled from. The official TensorFlow documentation recommends setting it to (#threads + error margin) * batch_size. Note that setting it to a larger size results in more uniform shuffling, since it shuffles from a larger set of data in the queue, but more memory will also be used in the process.
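For example, plugging illustrative numbers into that rule of thumb (assuming roughly 4 reader threads and a similar error margin) lands close to the value of 1,000 used in step 7:

# (#threads + error margin) * batch_size with, say, 4 threads and margin 4:
min_after_dequeue = (4 + 4) * 128       # = 1024, close to the 1000 used above
capacity = min_after_dequeue + 3 * 128  # plus headroom of three batches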

8. Next, we can declare our model function. The model we will use has two convolutional layers, followed by three fully connected layers. To make variable declaration easier, we'll start by declaring two variable functions. The two convolutional layers will create 64 features each. The first fully connected layer will connect the second convolutional layer with 384 hidden nodes. The second fully connected operation will connect those 384 hidden nodes to 192 hidden nodes. The final hidden layer operation will then connect the 192 nodes to the 10 output classes we are trying to predict. See the inline comments marked with #:
def cifar_cnn_model(input_images, batch_size, train_logical=True):
  def truncated_normal_var(name, shape, dtype):
    return(tf.get_variable(name=name, shape=shape, dtype=dtype, initializer=tf.truncated_normal_initializer(stddev=0.05)))
  def zero_var(name, shape, dtype):
    return(tf.get_variable(name=name, shape=shape, dtype=dtype, initializer=tf.constant_initializer(0.0)))
  # First Convolutional Layer
  with tf.variable_scope('conv1') as scope:
    # Conv_kernel is 5x5 for all 3 colors and we will create 64 features
    conv1_kernel = truncated_normal_var(name='conv_kernel1', shape=[5, 5, 3, 64], dtype=tf.float32)
    # We convolve across the image with a stride size of 1
    conv1 = tf.nn.conv2d(input_images, conv1_kernel, [1, 1, 1, 1], padding='SAME')
    # Initialize and add the bias term
    conv1_bias = zero_var(name='conv_bias1', shape=[64], dtype=tf.float32)
    conv1_add_bias = tf.nn.bias_add(conv1, conv1_bias)
    # ReLU element wise
    relu_conv1 = tf.nn.relu(conv1_add_bias)
  # Max Pooling
  pool1 = tf.nn.max_pool(relu_conv1, ksize=[1, 3, 3, 1], strides=[1, 2, 2, 1], padding='SAME', name='pool_layer1')
  # Local Response Normalization
  norm1 = tf.nn.lrn(pool1, depth_radius=5, bias=2.0, alpha=1e-3, beta=0.75, name='norm1')
  # Second Convolutional Layer
  with tf.variable_scope('conv2') as scope:
    # Conv kernel is 5x5, across all prior 64 features and we create 64 more features
    conv2_kernel = truncated_normal_var(name='conv_kernel2', shape=[5, 5, 64, 64], dtype=tf.float32)
    # Convolve filter across prior output with stride size of 1
    conv2 = tf.nn.conv2d(norm1, conv2_kernel, [1, 1, 1, 1], padding='SAME')
    # Initialize and add the bias
    conv2_bias = zero_var(name='conv_bias2', shape=[64], dtype=tf.float32)
    conv2_add_bias = tf.nn.bias_add(conv2, conv2_bias)
    # ReLU element wise
    relu_conv2 = tf.nn.relu(conv2_add_bias)
  # Max Pooling
  pool2 = tf.nn.max_pool(relu_conv2, ksize=[1, 3, 3, 1], strides=[1, 2, 2, 1], padding='SAME', name='pool_layer2')
  # Local Response Normalization (parameters from paper)
  norm2 = tf.nn.lrn(pool2, depth_radius=5, bias=2.0, alpha=1e-3, beta=0.75, name='norm2')
  # Reshape output into a single matrix for multiplication for the fully connected layers
  reshaped_output = tf.reshape(norm2, [batch_size, -1])
  reshaped_dim = reshaped_output.get_shape()[1].value
  # First Fully Connected Layer
  with tf.variable_scope('full1') as scope:
    # Fully connected layer will have 384 outputs.
    full_weight1 = truncated_normal_var(name='full_mult1', shape=[reshaped_dim, 384], dtype=tf.float32)
    full_bias1 = zero_var(name='full_bias1', shape=[384], dtype=tf.float32)
    full_layer1 = tf.nn.relu(tf.add(tf.matmul(reshaped_output, full_weight1), full_bias1))
  # Second Fully Connected Layer
  with tf.variable_scope('full2') as scope:
    # Second fully connected layer has 192 outputs.
    full_weight2 = truncated_normal_var(name='full_mult2', shape=[384, 192], dtype=tf.float32)
    full_bias2 = zero_var(name='full_bias2', shape=[192], dtype=tf.float32)
    full_layer2 = tf.nn.relu(tf.add(tf.matmul(full_layer1, full_weight2), full_bias2))
  # Final Fully Connected Layer -> 10 categories for output (num_targets)
  with tf.variable_scope('full3') as scope:
    # Final fully connected layer has 10 (num_targets) outputs.
    full_weight3 = truncated_normal_var(name='full_mult3', shape=[192, num_targets], dtype=tf.float32)
    full_bias3 = zero_var(name='full_bias3', shape=[num_targets], dtype=tf.float32)
    final_output = tf.add(tf.matmul(full_layer2, full_weight3), full_bias3)
  return(final_output)


Our local response normalization parameters are taken from the paper, which is referenced in the third bullet of the See also section.
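As a worked shape check (assuming the 24x24 crop from step 2): each SAME-padded max-pool with stride 2 halves the spatial dimensions, which determines the value reshaped_dim takes at runtime:

# 24x24x3 crop -> conv1 (SAME, stride 1) -> 24x24x64
#             -> pool1 (stride 2)       -> 12x12x64
#             -> conv2 (SAME, stride 1) -> 12x12x64
#             -> pool2 (stride 2)       ->  6x6x64
reshaped_dim = 6 * 6 * 64  # = 2304 inputs to the first fully connected layer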

9. Now we'll create the loss function. We will use the softmax function because a picture can only take on exactly one category, so the output should be a probability distribution over the ten targets:
def cifar_loss(logits, targets):
  # Get rid of extra dimensions and cast targets into integers
  targets = tf.squeeze(tf.cast(targets, tf.int32))
  # Calculate cross entropy from logits and targets
  cross_entropy = tf.nn.sparse_softmax_cross_entropy_with_logits(logits, targets)
  # Take the average loss across batch size
  cross_entropy_mean = tf.reduce_mean(cross_entropy)
  return(cross_entropy_mean)
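To see concretely what this loss computes, here is a small standalone numpy check for a single example (illustrative values only, not part of the recipe):

import numpy as np

logits = np.array([2.0, 1.0, 0.1])  # unnormalized scores for 3 classes
target = 0                          # index of the true class
# sparse softmax cross entropy = -log(softmax(logits)[target])
log_softmax = logits - np.log(np.sum(np.exp(logits)))
loss = -log_softmax[target]
print(loss)  # ~= 0.417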



10. Next, we declare our training step. The learning rate will decrease in an exponential step function:
def train_step(loss_value, generation_num):
  # Our learning rate is an exponential decay (stepped down)
  model_learning_rate = tf.train.exponential_decay(learning_rate, generation_num, num_gens_to_wait, lr_decay, staircase=True)
  # Create optimizer
  my_optimizer = tf.train.GradientDescentOptimizer(model_learning_rate)
  # Initialize train step
  train_step = my_optimizer.minimize(loss_value)
  return(train_step)


11. We must also have an accuracy function that calculates the accuracy across a batch of images. We'll input the logits and target vectors, and output an averaged accuracy. We can then use this for both the train and test batches:
def accuracy_of_batch(logits, targets):
  # Make sure targets are integers and drop extra dimensions
  targets = tf.squeeze(tf.cast(targets, tf.int32))
  # Get predicted values by finding which logit is the greatest
  batch_predictions = tf.cast(tf.argmax(logits, 1), tf.int32)
  # Check if they are equal across the batch
  predicted_correctly = tf.equal(batch_predictions, targets)
  # Average the 1's and 0's (True's and False's) across the batch size
  accuracy = tf.reduce_mean(tf.cast(predicted_correctly, tf.float32))
  return(accuracy) 


12. Now that we have an input_pipeline() function, we can initialize both the training image pipeline and the test image pipeline:
images, targets = input_pipeline(batch_size, train_logical=True)
test_images, test_targets = input_pipeline(batch_size, train_logical=False)


13. Next, we'll initialize the model for the training output and the test output. It is important to note that we must declare scope.reuse_variables() after we create the training model so that, when we declare the model for the test network, it will use the same model parameters:
with tf.variable_scope('model_definition') as scope:
  # Declare the training network model
  model_output = cifar_cnn_model(images, batch_size)
  # Use same variables within scope
  scope.reuse_variables()
  # Declare test model output
  test_output = cifar_cnn_model(test_images, batch_size) 


14. We can now initialize our loss and test accuracy functions. Then we'll declare the generation variable. This variable needs to be declared as non-trainable and passed to our training function, which uses it in the learning rate's exponential decay calculation:
loss = cifar_loss(model_output, targets)
accuracy = accuracy_of_batch(test_output, test_targets)
generation_num = tf.Variable(0, trainable=False)
train_op = train_step(loss, generation_num)


15. We'll now initialize all of the model's variables and then start the image pipeline by running the TensorFlow function start_queue_runners(). When we start the train or test model output, the pipeline will feed in a batch of images in place of a feed dictionary:
init = tf.initialize_all_variables()
sess.run(init)
tf.train.start_queue_runners(sess=sess) 


16. We now loop through our training generations and save the training loss and the test accuracy:
train_loss = []
test_accuracy = []
for i in range(generations):
  _, loss_value = sess.run([train_op, loss])
  if (i+1) % output_every == 0:
    train_loss.append(loss_value)
    output = 'Generation {}: Loss = {:.5f}'.format((i+1), loss_value)
    print(output)
  if (i+1) % eval_every == 0:
    [temp_accuracy] = sess.run([accuracy])
    test_accuracy.append(temp_accuracy)
    acc_output = ' --- Test Accuracy = {:.2f}%.'.format(100.*temp_accuracy)
    print(acc_output)


17. This results in the following output:
Generation 19500: Loss = 0.04461
--- Test Accuracy = 80.47%.
Generation 19550: Loss = 0.01171
Generation 19600: Loss = 0.06911
Generation 19650: Loss = 0.08629
Generation 19700: Loss = 0.05296
Generation 19750: Loss = 0.03462
Generation 19800: Loss = 0.03182
Generation 19850: Loss = 0.07092
Generation 19900: Loss = 0.11342
Generation 19950: Loss = 0.08751
Generation 20000: Loss = 0.02228
--- Test Accuracy = 83.59%.


18. Finally, here is some matplotlib code that will plot the loss and test accuracy over the course of the training:
eval_indices = range(0, generations, eval_every)
output_indices = range(0, generations, output_every)
# Plot loss over time
plt.plot(output_indices, train_loss, 'k-')
plt.title('Softmax Loss per Generation')
plt.xlabel('Generation')
plt.ylabel('Softmax Loss')
plt.show()
# Plot accuracy over time
plt.plot(eval_indices, test_accuracy, 'k-')
plt.title('Test Accuracy')
plt.xlabel('Generation')
plt.ylabel('Accuracy')
plt.show()


Figure 5: The training loss is on the left and the test accuracy is on the right. For the CIFAR-10 image recognition CNN, we were able to achieve a model that reaches around 75% accuracy on the test set.

How it works…
After we downloaded the CIFAR-10 data, we established an image pipeline instead of using a feed dictionary. For more information on the image pipeline, please see the official TensorFlow CIFAR-10 tutorial. We used this train and test pipeline to try to predict the correct category of the images. By the end, the model had achieved around 75% accuracy on the test set.

See also
For more information about the CIFAR-10 dataset, please see Learning Multiple Layers of Features from Tiny Images, Alex Krizhevsky, 2009. https://www.cs.toronto.edu/~kriz/learning-features-2009-TR.pdf
To see the original TensorFlow code, visit https://github.com/tensorflow/tensorflow/tree/r0.11/tensorflow/models/image/cifar10
For more on local response normalization, please see ImageNet Classification with Deep Convolutional Neural Networks, Krizhevsky, A., et al., 2012. http://papers.nips.cc/paper/4824-imagenet-classification-with-deep-convolutional-neural-networks


