## Prototyping kernels and advanced visualization with Python ops

Operation kernels in TensorFlow are written entirely in C++ for efficiency. But writing a TensorFlow kernel in C++ can be quite a pain. So, before spending hours implementing a kernel, you may want to prototype something quickly, however inefficient. With tf.py_function() you can turn any piece of Python code into a TensorFlow operation. For example, this is how you can implement a simple ReLU nonlinearity kernel in TensorFlow as a Python op:

```python
import numpy as np
import tensorflow as tf

def relu(inputs):
    # Define the op in python
    def _py_relu(x):
        return np.maximum(x, 0.)

    # Define the op's gradient in python
    def _py_relu_grad(x):
        return np.float32(x > 0)

    @tf.custom_gradient
    def _relu(x):
        y = tf.py_function(_py_relu, [x], tf.float32)

        def _relu_grad(dy):
            return dy * tf.py_function(_py_relu_grad, [x], tf.float32)

        return y, _relu_grad

    return _relu(inputs)
```

To verify that the gradients are correct, you can compare the analytical gradient against a numerical estimate:

```python
# Compute analytical gradient
x = tf.random.normal([10])
with tf.GradientTape() as tape:
    tape.watch(x)
    y = relu(x)
g = tape.gradient(y, x)
print(g)

# Compute numerical gradient
dx_n = 1e-5
dy_n = relu(x + dx_n) - relu(x)
g_n = dy_n / dx_n
print(g_n)
```

The numbers should be very close. Note that this implementation is pretty inefficient and only useful for prototyping, since the Python code is not parallelizable and won't run on GPU. Once you have verified your idea, you would definitely want to rewrite it as a C++ kernel.

In practice, we commonly use Python ops for visualization on TensorBoard. Consider the case where you are building an image classification model and want to visualize your model's predictions during training.
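TensorFlow can also automate this comparison for you: tf.test.compute_gradient evaluates both the autodiff Jacobian and a finite-difference Jacobian of a function. A minimal sketch, assuming TensorFlow 2.x (the relu op from above is repeated so the snippet is self-contained):

```python
import numpy as np
import tensorflow as tf

def relu(inputs):
    # Same python-op ReLU as above, repeated for a self-contained snippet.
    def _py_relu(x):
        return np.maximum(x, 0.)

    def _py_relu_grad(x):
        return np.float32(x > 0)

    @tf.custom_gradient
    def _relu(x):
        y = tf.py_function(_py_relu, [x], tf.float32)

        def _relu_grad(dy):
            return dy * tf.py_function(_py_relu_grad, [x], tf.float32)

        return y, _relu_grad

    return _relu(inputs)

# Probe points away from the kink at 0, where ReLU is not differentiable.
x = tf.constant([-2.0, -0.5, 0.5, 2.0])

# Returns the theoretical (autodiff) and numerical (finite-difference)
# Jacobians; each is a tuple with one array per input tensor.
theoretical, numerical = tf.test.compute_gradient(relu, [x])
max_error = np.max(np.abs(theoretical[0] - numerical[0]))
print(max_error)
```

This is handy because it checks the full Jacobian rather than a single directional perturbation, so it can catch gradient bugs that the manual check above might miss.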
TensorFlow allows visualizing images with the tf.summary.image() function:

```python
tf.summary.image("image", image, step=step)
```

But this only visualizes the input image. In order to visualize the predictions, you have to find a way to add annotations to the image, which may be almost impossible with existing ops. An easier way to do this is to do the drawing in Python and wrap it in a Python op:

```python
import io

import matplotlib.pyplot as plt
import numpy as np
import PIL.Image
import tensorflow as tf

def visualize_labeled_images(images, labels, step, max_outputs=3, name="image"):
    def _visualize_image(image, label):
        # Do the actual drawing in python
        fig = plt.figure(figsize=(3, 3), dpi=80)
        ax = fig.add_subplot(111)
        ax.imshow(image[::-1, ...])
        ax.text(0, 0, str(label),
                horizontalalignment="left",
                verticalalignment="top")

        # Write the plot to a memory file.
        buf = io.BytesIO()
        fig.savefig(buf, format="png")
        plt.close(fig)
        buf.seek(0)

        # Read the image back and convert it to a numpy array. Note that
        # PIL's size is (width, height), so the height comes first.
        img = PIL.Image.open(buf)
        return np.array(img.getdata()).reshape(img.size[1], img.size[0], -1)

    def _visualize_images(images, labels):
        # Only display the given number of examples in the batch
        outputs = []
        for i in range(max_outputs):
            output = _visualize_image(images[i], labels[i])
            outputs.append(output)
        return np.array(outputs, dtype=np.uint8)

    # Run the python op.
    figs = tf.py_function(_visualize_images, [images, labels], tf.uint8)
    return tf.summary.image(name, figs, step=step, max_outputs=max_outputs)
```

Note that since summaries are usually only evaluated once in a while (not per step), this implementation can be used in practice without worrying about efficiency.
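In TensorFlow 2.x, image summaries only reach disk when they run inside a summary writer context with an explicit step. A minimal sketch, assuming TF 2.x; the log directory and the dummy batch are placeholders for your real training setup:

```python
import tempfile

import numpy as np
import tensorflow as tf

# Hypothetical log directory; point TensorBoard at the same path.
logdir = tempfile.mkdtemp()
writer = tf.summary.create_file_writer(logdir)

# A dummy batch standing in for real data: 3 RGB images in [0, 1].
images = np.random.uniform(size=[3, 32, 32, 3]).astype(np.float32)

with writer.as_default():
    # In a real loop you would only emit summaries every N steps.
    for step in range(2):
        tf.summary.image("inputs", images, step=step, max_outputs=3)
    writer.flush()
```

Run `tensorboard --logdir` on the same directory to browse the written images; the same pattern applies to a python-op summary like the one above.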