Tensorboard: Integration with Tensorflow and Keras

When designing machine learning models, it is essential to receive feedback on their performance. To date, deep learning models largely remain a black box, and their internals are hard to peek into. However, it is still possible to obtain some insight, and that insight is crucial to developing your model.

Why is it necessary to receive feedback on your model’s performance during training? The answer is to know which direction to tune your hyperparameters in and whether your tuning will boost performance.

A naive way to monitor a model’s training performance is via command line output. However, it is easy to miss the finer details this way. For example, it is easy to output the loss function after each training epoch, but it’s trickier to visualize how the weights change during training.

As with any task, it is essential to use a dedicated tool to gain more for less effort. Meet Tensorboard, the visualization framework that comes with Tensorflow.

[Screenshot: the Tensorboard web interface]

Installation

You can install Tensorboard via Anaconda or Pip by running the following command:

pip install tensorboard

Alternatively, you can use the following with Anaconda:

conda install tensorboard

Usage: Overview

Tensorboard runs as server software. The server is started locally and continually monitors a user-specified directory containing the machine learning model logs. The logs need to be written in a specific format for Tensorboard to understand them, but the major ML libraries, such as Tensorflow and Keras, support this output out of the box.
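For example, after a couple of training runs the monitored directory might look roughly like this (a sketch; the run names are arbitrary and the exact event file names will differ):

log/                        # the directory passed to Tensorboard
  run-0/
    events.out.tfevents.*   # event files written by the summary writer
  run-1/
    events.out.tfevents.*

Tensorboard picks up new event files as they appear, so you can leave it running while the model trains.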

Let us have a look at how Tensorboard works on an example.

Example

Task Specification

Let us assume we need to model a function f(x) = x * x with machine learning. More precisely, we need to fit the following training data:

import numpy as np

# Training Data

samples_num = 30
train_X     = np.arange(0, samples_num).reshape((samples_num, 1))
train_Y     = (train_X ** 2).reshape(samples_num, 1)

We will attempt to model the function with a neural network that has one hidden layer. We will implement the model in both Tensorflow and Keras to see how they interoperate with Tensorboard.

The Model

First, we will define the model in Tensorflow:

import tensorflow as tf

learning_rate   = 0.000001
training_epochs = 500
display_step    = 10
hidden_size     = 128

g = tf.Graph()
with g.as_default():

    # Inputs
    X = tf.placeholder(np.float32, (samples_num, 1)) # *, 1
    Y = tf.placeholder(np.float32, (samples_num, 1)) # *, 1

    # Model
    W_1 = tf.get_variable("W_1", (1, hidden_size), np.float32, initializer=tf.random_uniform_initializer)
    b_1 = tf.get_variable("b_1", (1, hidden_size), np.float32, initializer=tf.random_uniform_initializer)

    W_2 = tf.get_variable("W_2", (hidden_size, 1), np.float32, initializer=tf.random_uniform_initializer)
    b_2 = tf.get_variable("b_2", (1,), np.float32, initializer=tf.random_uniform_initializer)

    hidden = tf.nn.relu(tf.matmul(X, W_1) + b_1)
    pred   = tf.matmul(hidden, W_2) + b_2

The above model has the hidden layer defined as relu(X * W_1 + b_1). That is, we are using a standard fully-connected layer with relu (rectified linear unit) as the activation function.

The output of the hidden layer is then used to compute the predictions: hidden * W_2 + b_2.
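To make the shapes concrete, here is a minimal numpy sketch of the same forward pass (purely illustrative; it is not part of the Tensorflow graph):

import numpy as np

X   = np.random.rand(30, 1)               # (samples_num, 1)
W_1 = np.random.rand(1, 128)              # (1, hidden_size)
b_1 = np.random.rand(1, 128)
W_2 = np.random.rand(128, 1)              # (hidden_size, 1)
b_2 = np.random.rand(1)

hidden = np.maximum(X.dot(W_1) + b_1, 0)  # relu, shape (30, 128)
pred   = hidden.dot(W_2) + b_2            # shape (30, 1)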

Afterwards, we define the weight initialisers and the objective function:

    init = tf.global_variables_initializer()

    # Objective
    cost      = tf.losses.mean_squared_error(Y, pred)
    optimizer = tf.train.GradientDescentOptimizer(learning_rate).minimize(cost)

The Summaries

We would like to monitor the performance of our model as we train it with the help of Tensorboard. To do so, we will use the “summaries” framework from Tensorflow:

    # Summaries
    tf.summary.scalar   ('loss', cost)
    tf.summary.histogram('W_1' , W_1 )
    tf.summary.histogram('b_1' , b_1 )
    tf.summary.histogram('W_2' , W_2 )
    tf.summary.histogram('b_2' , b_2 )

Using the functions from Tensorflow’s summary package, one can log the performance of the graph’s selected nodes. In our case, we used a scalar summary to log the performance of the cost node, which outputs the mean squared error of our model. We used histogram summaries to log the nodes that output matrices. We will see how the logs look in Tensorboard in a moment.
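The summary package also provides other summary types, notably tf.summary.image and tf.summary.audio. Our model has no image data, but as a sketch, if the graph contained a 4-D image tensor (a hypothetical images node of shape (batch, height, width, channels)), it could be logged like this:

    # images is a hypothetical 4-D tensor; not part of our example graph
    tf.summary.image('inputs', images, max_outputs=3)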

Once we have defined the summaries, we need to merge them into a separate node:

    summaries  = tf.summary.merge_all()
    log_writer = tf.summary.FileWriter(log_dir, graph = g)

The merge_all() function creates a runnable graph node that turns the logs into the Tensorboard format. We can then use the log_writer to write the logs to a specified directory: log_dir. The log_dir is created as follows:

import os

def new_run_log_dir(base_dir):
    log_dir = os.path.join('./log', base_dir)
    if not os.path.exists(log_dir):
        os.makedirs(log_dir)
    run_id = len([name for name in os.listdir(log_dir)])
    run_log_dir = os.path.join(log_dir, str(run_id))
    return run_log_dir

log_dir = new_run_log_dir('tensorflow-demo')

The new_run_log_dir function creates a separate directory under the base directory ./log/tensorflow-demo for each run, with the names sequentially starting from 0. In other words, the first run’s logs will be available under ./log/tensorflow-demo/0/, the second run’s logs under ./log/tensorflow-demo/1/, and so on. The function infers the number of the current run by counting the directories already created by previous runs under ./log/tensorflow-demo/. By having the summaries organised in separate, identifiable directories, you can view the summary visualisations side by side in Tensorboard.
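For example, after running the Tensorflow example twice and the Keras example (covered below) once, the layout would be:

log/
  tensorflow-demo/
    0/
    1/
  keras-demo/
    0/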

Training and Writing the Summaries

With the model defined above, we can train it and write the summaries at the same time:

# Training

with tf.Session(graph = g) as sess:
    sess.run(init)

    # Fit all training data
    for epoch in range(training_epochs):
        sess.run(optimizer, feed_dict = {X: train_X, Y: train_Y})

        # Log summaries
        if epoch % display_step == 0:
            summary, c = sess.run([summaries, cost], feed_dict = {X: train_X, Y: train_Y})
            print('Epoch: {};\tCost: {}'.format(epoch, c))
            log_writer.add_summary(summary, epoch)
            log_writer.flush()

    # print_summary is a helper function defined in the full example on GitHub
    predicted = sess.run(pred, feed_dict = {X: train_X})
    print_summary(train_Y, predicted)

We first initialise the weights via sess.run(init) (remember that init was the node representing the graph initialiser in the model above). Then we perform the optimisation steps in the for loop.

Notice the if statement in the loop: the summary writing happens every display_step epochs. Let us see what’s going on here:

  1. The summaries are computed for the current epoch into the summary variable: summary, c = sess.run([summaries, cost], feed_dict={X: train_X, Y:train_Y}).
  2. The summaries are added to the summary writer to be written to the output directory: log_writer.add_summary(summary, epoch).
  3. log_writer is instructed to write the summaries to the log directory: log_writer.flush().

Visualising the Summaries with Tensorboard

As already mentioned, Tensorboard is a server software that monitors a specified directory for summaries and visualises them. Let us start Tensorboard and specify ./log as our target directory:

tensorboard --logdir log

You should see output like the following:

TensorBoard 1.5.1 at http://localhost:6006 (Press CTRL+C to quit)
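If port 6006 is already in use on your machine, you can point Tensorboard at a different one with the --port flag:

tensorboard --logdir log --port 6007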

In order to use Tensorboard, navigate to http://localhost:6006 and you should see the following:

[Screenshot: the Scalars view in Tensorboard, showing the loss plots of several runs]

The three most important areas are highlighted in the screenshot above and should be self-explanatory. Notice how Tensorboard can display metrics of several runs at once? This is useful for comparing the performance of different runs. In the screenshot, I also have a few metrics from the Keras example, which we will get to soon.

Histograms

In the screenshot above, we have an example of a scalar summary (the loss function on the training dataset). The plot should be self-explanatory, with the epochs on the horizontal axis and the value of the loss function on the vertical axis. Let us now see how Tensorboard interprets the output of the histogram function – remember that we used the histogram summaries to log the weights of the model.

[Screenshot: the Distributions view of the logged matrices]

Above you can see the Distributions view of the matrices. For each epoch, the frequencies of the values in the matrix are specified by colour intensity. Also, notice how we can observe the distributions of the same variable across different runs, the first run being orange and the second one pink.

Here is how the same matrix is viewed as a histogram:

[Screenshot: the Histograms view of the same matrix]

Graphs

Another feature of Tensorboard is visualising the model’s graph:

[Screenshot: the Graphs view, showing the model’s computation graph]

You can see the Tensorflow model that we have programmed displayed as a graph above.

Keras integration with Tensorboard

Let us now see how you can implement the same example in Keras while integrating with Tensorboard. Our model is defined as follows:

import os
os.environ['KERAS_BACKEND' ] = 'tensorflow'
os.environ['MKL_THREADING_LAYER'] = 'GNU'

import keras as ks
from keras.models import Sequential
from keras.layers import Dense
from keras.callbacks import TensorBoard

# Parameters

learning_rate = 0.000001
training_epochs = 500
display_step = 10
hidden_size = 128

# Model

model = Sequential()
model.add(Dense(hidden_size, activation='relu', input_shape=[1]))
model.add(Dense(1))

model.compile(loss      = ks.losses.mean_squared_error,
              optimizer = ks.optimizers.Adadelta())

The model is trained as follows:

log_dir = new_run_log_dir('keras-demo')

model.fit(train_X, train_Y,
          epochs = training_epochs,
          verbose = 1,
          validation_data = (train_X, train_Y),
          callbacks = [TensorBoard(log_dir = log_dir,
                                   histogram_freq = 50)])

Notice how the training algorithm is instructed to perform Tensorboard output via the following line:

callbacks = [TensorBoard(log_dir = log_dir, histogram_freq = 50)]

In Keras, you can control the fitting process via callbacks, one of which is TensorBoard.
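Since callbacks accepts a list, the TensorBoard callback can be combined with any other callbacks. As a minimal sketch (the EarlyStopping settings here are purely illustrative and not part of the original example):

from keras.callbacks import TensorBoard, EarlyStopping

model.fit(train_X, train_Y,
          epochs = training_epochs,
          verbose = 1,
          validation_data = (train_X, train_Y),
          callbacks = [TensorBoard(log_dir = log_dir, histogram_freq = 50),
                       EarlyStopping(monitor = 'val_loss', patience = 20)])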

Summary

Tensorboard is a powerful tool that allows you to visualise the internals of your model while you train it:

  • Scalar values as plots
  • Matrices as histograms and probability distributions
  • Support for image and audio summaries

Tensorboard integrates with both Tensorflow and Keras. One should opt for Tensorboard over debugging via console output, since the former provides more information and is easier to use.

The examples presented in the article are available on GitHub: https://github.com/anatoliykmetyuk/tensorboard-demos.

Last updated on Apr 16, 2020