## Prototyping kernels and advanced visualization with Python ops
Operation kernels in TensorFlow are written entirely in C++ for efficiency. But writing a TensorFlow kernel in C++ can be quite a pain, so before spending hours implementing a kernel you may want to prototype something quickly, however inefficient. With tf.py_function() you can turn any piece of Python code into a TensorFlow operation.
For example, this is how you can implement a simple ReLU nonlinearity kernel in TensorFlow as a Python op:
```python
import numpy as np
import tensorflow as tf

def relu(inputs):
    # Define the op in Python
    def _py_relu(x):
        return np.maximum(x, 0.)

    # Define the op's gradient in Python
    def _py_relu_grad(x):
        return np.float32(x > 0)

    @tf.custom_gradient
    def _relu(x):
        y = tf.py_function(_py_relu, [x], tf.float32)

        def _relu_grad(dy):
            return dy * tf.py_function(_py_relu_grad, [x], tf.float32)

        return y, _relu_grad

    return _relu(inputs)
```
To verify that the gradient is correct, you can compare the analytical gradient with a numerical estimate:
```python
# Compute the analytical gradient
x = tf.random.normal([10], dtype=tf.float32)
with tf.GradientTape() as tape:
    tape.watch(x)
    y = relu(x)
g = tape.gradient(y, x)
print(g)

# Compute a numerical gradient
dx_n = 1e-5
dy_n = relu(x + dx_n) - relu(x)
g_n = dy_n / dx_n
print(g_n)
```
The numbers should be very close.
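You can also automate this check with an assertion instead of eyeballing the printed values. Here is a minimal, self-contained sketch using tf.square purely for illustration (so the snippet runs on its own), with a central difference, which is more accurate than the one-sided estimate above:

```python
import numpy as np
import tensorflow as tf

x = tf.random.normal([10])

# Analytical gradient via autodiff
with tf.GradientTape() as tape:
    tape.watch(x)
    y = tf.square(x)
g = tape.gradient(y, x)

# Numerical gradient via central difference
dx = 1e-3
g_n = (tf.square(x + dx) - tf.square(x - dx)) / (2 * dx)

assert np.allclose(g.numpy(), g_n.numpy(), atol=1e-3)
```

The same pattern applies to the relu op defined above: substitute your op for tf.square and assert agreement within tolerance.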
Note that this implementation is quite inefficient and is only useful for prototyping, since the Python code is not parallelizable and won't run on a GPU. Once you have verified your idea, you will want to rewrite it as a C++ kernel.
In practice, Python ops are commonly used for visualization in TensorBoard. Suppose you are building an image classification model and want to visualize your model's predictions during training. TensorFlow lets you visualize images with the tf.summary.image() function:
```python
# `image` is a [batch, height, width, channels] float tensor,
# logged inside a summary writer context.
tf.summary.image("image", image, step=step)
```
But this only visualizes the input image. In order to visualize the predictions you have to find a way to add annotations to the image, which may be almost impossible with existing ops. An easier way is to do the drawing in Python and wrap it in a Python op:
```python
import io

import matplotlib.pyplot as plt
import numpy as np
import PIL.Image
import tensorflow as tf

def visualize_labeled_images(images, labels, max_outputs=3, name="image"):
    def _visualize_image(image, label):
        # Do the actual drawing in Python
        fig = plt.figure(figsize=(3, 3), dpi=80)
        ax = fig.add_subplot(111)
        ax.imshow(image[::-1, ...])
        ax.text(0, 0, str(label),
                horizontalalignment="left",
                verticalalignment="top")
        fig.canvas.draw()

        # Write the plot to an in-memory buffer.
        buf = io.BytesIO()
        fig.savefig(buf, format="png")
        plt.close(fig)
        buf.seek(0)

        # Read the image back and convert it to a numpy array.
        # Note that PIL's img.size is (width, height).
        img = PIL.Image.open(buf)
        return np.array(img.getdata()).reshape(img.size[1], img.size[0], -1)

    def _visualize_images(images, labels):
        # Only display the given number of examples in the batch
        outputs = []
        for i in range(max_outputs):
            output = _visualize_image(images[i], labels[i])
            outputs.append(output)
        return np.array(outputs, dtype=np.uint8)

    # Run the Python op.
    figs = tf.py_function(_visualize_images, [images, labels], tf.uint8)
    return tf.summary.image(name, figs, max_outputs=max_outputs)
```
Note that since summaries are usually only evaluated once in a while (not per step), this implementation may be used in practice without worrying about efficiency.