## Prototyping kernels and advanced visualization with Python ops

Operation kernels in TensorFlow are written entirely in C++ for efficiency. But writing a TensorFlow kernel in C++ can be quite a pain. So, before spending hours implementing a kernel, you may want to prototype something quickly, however inefficiently. With tf.py_func() you can turn any piece of Python code into a TensorFlow operation. For example, this is how you can implement a simple ReLU nonlinearity kernel in TensorFlow as a Python op:

```python
import numpy as np
import tensorflow as tf
import uuid

def relu(inputs):
    # Define the op in python
    def _relu(x):
        return np.maximum(x, 0.)

    # Define the op's gradient in python
    def _relu_grad(x):
        return np.float32(x > 0)

    # An adapter that defines a gradient op compatible with TensorFlow
    def _relu_grad_op(op, grad):
        x = op.inputs[0]
        x_grad = grad * tf.py_func(_relu_grad, [x], tf.float32)
        return x_grad

    # Register the gradient with a unique id
    grad_name = "MyReluGrad_" + str(uuid.uuid4())
    tf.RegisterGradient(grad_name)(_relu_grad_op)

    # Override the gradient of the custom op
    g = tf.get_default_graph()
    with g.gradient_override_map({"PyFunc": grad_name}):
        output = tf.py_func(_relu, [inputs], tf.float32)
    return output
```

To verify that the gradients are correct, you can use TensorFlow's gradient checker:

```python
x = tf.random_normal([10])
y = relu(x * x)

with tf.Session():
    diff = tf.test.compute_gradient_error(x, [10], y, [10])
    print(diff)
```

tf.test.compute_gradient_error() computes the gradient numerically and returns its difference from the provided (analytical) gradient; what we want is a very small difference.

Note that this implementation is quite inefficient and is only useful for prototyping, since the Python code is not parallelizable and won't run on a GPU. Once you have verified your idea, you will definitely want to rewrite it as a C++ kernel.

In practice, we commonly use Python ops for visualization on TensorBoard. Consider the case where you are building an image classification model and want to visualize your model's predictions during training. TensorFlow lets you visualize images with the tf.summary.image() function:

```python
image = tf.placeholder(tf.float32)
tf.summary.image("image", image)
```

But this only visualizes the input image. In order to visualize the predictions, you have to find a way to add annotations to the image, which may be almost impossible with existing ops. An easier way is to do the drawing in Python and wrap it in a Python op:

```python
import io
import matplotlib.pyplot as plt
import numpy as np
import PIL
import tensorflow as tf

def visualize_labeled_images(images, labels, max_outputs=3, name="image"):
    def _visualize_image(image, label):
        # Do the actual drawing in python
        fig = plt.figure(figsize=(3, 3), dpi=80)
        ax = fig.add_subplot(111)
        ax.imshow(image[::-1, ...])
        ax.text(0, 0, str(label),
                horizontalalignment="left",
                verticalalignment="top")
        fig.canvas.draw()

        # Write the plot as a memory file.
        buf = io.BytesIO()
        fig.savefig(buf, format="png")
        plt.close(fig)  # release the figure to avoid leaking memory
        buf.seek(0)

        # Read the image and convert to numpy array.
        # PIL's size is (width, height), so reshape to (height, width, channels).
        img = PIL.Image.open(buf)
        return np.array(img.getdata()).reshape(img.size[1], img.size[0], -1)

    def _visualize_images(images, labels):
        # Only display the given number of examples in the batch
        outputs = []
        for i in range(min(len(images), max_outputs)):
            output = _visualize_image(images[i], labels[i])
            outputs.append(output)
        return np.array(outputs, dtype=np.uint8)

    # Run the python op.
    figs = tf.py_func(_visualize_images, [images, labels], tf.uint8)
    return tf.summary.image(name, figs)
```
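Here is a minimal usage sketch, assuming a TF 1.x session. The placeholder shapes and the random feed values are hypothetical stand-ins for a real input pipeline, and visualize_labeled_images is the function defined above:

```python
import numpy as np
import tensorflow as tf

# Hypothetical input tensors; the shapes are illustrative, not prescribed.
images = tf.placeholder(tf.float32, [None, 28, 28, 3])
labels = tf.placeholder(tf.int64, [None])
summary_op = visualize_labeled_images(images, labels, max_outputs=2)

with tf.Session() as sess:
    writer = tf.summary.FileWriter("/tmp/logs", sess.graph)
    # Random stand-in data; in practice these come from your input pipeline.
    batch_images = np.random.rand(4, 28, 28, 3).astype(np.float32)
    batch_labels = np.random.randint(0, 10, size=4).astype(np.int64)
    summary = sess.run(summary_op,
                       feed_dict={images: batch_images, labels: batch_labels})
    writer.add_summary(summary, global_step=0)
    writer.close()
```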
Note that since summaries are usually evaluated only once in a while (not on every step), this implementation can be used in practice without worrying about efficiency.
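To make that concrete, here is a minimal, self-contained sketch of the pattern in a TF 1.x training loop. The toy loss and the 100-step interval are illustrative assumptions; the same gating applies to the image summary op above:

```python
import tensorflow as tf

# Toy model: a single scalar variable fit to a constant target.
w = tf.get_variable("w", shape=[], initializer=tf.zeros_initializer())
loss = tf.square(w - 3.0)
train_op = tf.train.GradientDescentOptimizer(0.1).minimize(loss)
loss_summary = tf.summary.scalar("loss", loss)

summary_interval = 100  # arbitrary choice; expensive summaries run rarely

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    writer = tf.summary.FileWriter("/tmp/logs", sess.graph)
    for step in range(1000):
        sess.run(train_op)
        # Only evaluate the summary once in a while, not on every step.
        if step % summary_interval == 0:
            writer.add_summary(sess.run(loss_summary), global_step=step)
    writer.close()
```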