Image generation using Neural Style Transfer and Kangas

Sandy M.
Published in Heartbeat
Jul 24, 2023 · 4 min read



Deep learning algorithms have been used to create art, enabling us to generate original images by fusing the style of one image with the content of another. In this lesson, we will examine the idea behind neural style transfer and learn how to implement it with a deep learning framework. By the end of this tutorial, you will be able to produce your own artistic images that combine the content of one photo with the style of another.

Overview of Neural Style Transfer

Neural style transfer is a technique that blends the content of one image with the style of another to produce a new image incorporating both. A pre-trained convolutional neural network (CNN) is used to extract the content and style features from the input photos. We will also be using Kangas to visualize the images, as it is a great tool for exploring and visualizing large-scale multimedia data.

The steps covered in the lesson are as follows:

  • Installing Kangas
  • Loading and preprocessing the content and style images
  • Understanding the VGG19 model architecture
  • Extracting content and style features from the pre-trained model
  • Defining the loss function that balances the content and style weights
  • Optimizing the generated image to reduce the loss

Kangas Installation

pip install kangas
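
To confirm the install worked, you can import the package and print its version (this assumes the package exposes a __version__ attribute, as most Python packages do):

import kangas
print(kangas.__version__)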

Neural style transfer implementation

In this section, we will walk through the code implementation of neural style transfer in Python using the TensorFlow deep learning framework.

First, let’s import all the libraries that will be needed for this tutorial

import tensorflow as tf
import numpy as np
import matplotlib.pyplot as plt
import kangas as kg
from kangas import DataGrid, Image

Now let’s load the pre-trained VGG19 model that will be used for feature extraction

vgg_model = tf.keras.applications.VGG19(include_top=False, weights='imagenet')

After that, we’ll preprocess the images that will be used to build the generated image. We’ll use two images: the content image, whose content or subject we want to retain in the final result, and the style image, which possesses the distinct artistic style we want to transfer.

content_image_path = 'content_image.jpg'
style_image_path = 'style_image.jpg'

# Load and preprocess content and style images
def load_and_preprocess_image(image_path):
    img = tf.io.read_file(image_path)
    img = tf.image.decode_image(img, channels=3)
    img = tf.image.convert_image_dtype(img, tf.float32)
    img = tf.image.resize(img, (256, 256))
    img = img[tf.newaxis, :]
    return img

content_image = load_and_preprocess_image(content_image_path)
style_image = load_and_preprocess_image(style_image_path)


After preprocessing the images, let’s load them into Kangas and set up a DataGrid so we can get a good view of them. Note that we keep the Kangas wrappers in separate variables so the original tensors remain available for the model later.

# Wrap the tensors as Kangas images (drop the batch dimension and convert to uint8)
content_image_kg = kg.Image((content_image[0].numpy() * 255).astype(np.uint8), name='content.jpeg')
style_image_kg = kg.Image((style_image[0].numpy() * 255).astype(np.uint8), name='style.jpeg')

# Set up a DataGrid and add one row per image
dg = kg.DataGrid(name='Art Images')
dg.extend([{'Image': content_image_kg}, {'Image': style_image_kg}])

# kickstart the UI
dg.show()

The next step is to define the style and content layers that will be used to carry out feature extraction.

content_layers = ['block5_conv2']
style_layers = ['block1_conv1', 'block2_conv1', 'block3_conv1', 'block4_conv1', 'block5_conv1']
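
These names correspond to convolutional layers inside VGG19. If you want to double-check them (or pick different layers), you can list the layer names of the loaded model:

# Optional: list VGG19's layer names to verify the content/style layer choices
for layer in vgg_model.layers:
    print(layer.name)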

Now it’s time to build the style transfer model

def style_transfer_model(vgg_model, style_layers, content_layers):
    vgg_model.trainable = False
    style_outputs = [vgg_model.get_layer(name).output for name in style_layers]
    content_outputs = [vgg_model.get_layer(name).output for name in content_layers]
    model_outputs = style_outputs + content_outputs
    return tf.keras.models.Model(vgg_model.input, model_outputs)

style_transfer = style_transfer_model(vgg_model, style_layers, content_layers)

And now let’s define the content and style weights, which control how strongly each component contributes to the total loss

content_weight = 1e3
style_weight = 1e-2

The next step will be for us to define the loss function

def style_content_loss(outputs, style_targets, content_targets, style_weight, content_weight):
    # The model returns style outputs first, then content outputs
    style_outputs = outputs[:len(style_layers)]
    content_outputs = outputs[len(style_layers):]

    # Mean squared error between generated and target style features
    style_loss = tf.add_n([tf.reduce_mean(tf.square(style_outputs[i] - style_targets[i])) for i in range(len(style_outputs))])
    style_loss *= style_weight / len(style_outputs)

    # Mean squared error between generated and target content features
    content_loss = tf.add_n([tf.reduce_mean(tf.square(content_outputs[i] - content_targets[i])) for i in range(len(content_outputs))])
    content_loss *= content_weight / len(content_outputs)

    total_loss = style_loss + content_loss
    return total_loss
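
Note that the style term above compares feature maps directly, which is a simplified formulation. The classic approach (Gatys et al.) compares Gram matrices of the feature maps, which capture correlations between channels rather than their spatial arrangement. If you want to experiment with that variant, here is a minimal sketch of a Gram-matrix helper; swapping it into the style term is an optional modification, not part of the code above:

def gram_matrix(features):
    # features has shape (1, H, W, C); the Gram matrix has shape (1, C, C)
    result = tf.linalg.einsum('bijc,bijd->bcd', features, features)
    num_locations = tf.cast(tf.shape(features)[1] * tf.shape(features)[2], tf.float32)
    return result / num_locations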

Now let’s extract the content and style features from the pre-trained model

def get_feature_representations(model, content_image, style_image):
    # Run each image through the model; style outputs come first, then content outputs
    style_outputs = model(style_image)
    content_outputs = model(content_image)
    style_features = style_outputs[:len(style_layers)]
    content_features = content_outputs[len(style_layers):]
    return content_features, style_features

content_features, style_features = get_feature_representations(style_transfer, content_image, style_image)

Next, we’ll initialize the generated image from the content image with a small amount of random noise added, and define the optimizer.

generated_image = tf.Variable(content_image + tf.random.normal(content_image.shape) * 0.01)
optimizer = tf.optimizers.Adam(learning_rate=0.02, beta_1=0.99, epsilon=1e-1)

It’s time to perform neural style transfer and define the number of training iterations

@tf.function()
def train_step(image):
    with tf.GradientTape() as tape:
        outputs = style_transfer(image)
        loss = style_content_loss(outputs, style_features, content_features, style_weight, content_weight)
    gradient = tape.gradient(loss, image)
    optimizer.apply_gradients([(gradient, image)])
    image.assign(tf.clip_by_value(image, clip_value_min=0.0, clip_value_max=1.0))

# Number of training iterations
num_iterations = 1000

And then run the training step for each iteration, printing progress every 100 steps

for i in range(num_iterations):
    train_step(generated_image)
    if i % 100 == 0:
        print("Iteration:", i)

Now let’s use Kangas again to visualize the generated image

image = kg.Image((generated_image[0].numpy() * 255).astype(np.uint8)).to_pil()
image
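
Since to_pil() returns a standard PIL image, you can also save the result to disk; the filename here is just an example:

# Save the generated image to disk (example filename)
image.save('generated_image.jpg')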

Conclusion

In this lesson, we looked at the fascinating field of neural style transfer in deep learning for the arts. We discussed the main ideas underlying the method and walked through an implementation with Python and TensorFlow. Neural style transfer lets us create appealing, artistic images by fusing the content of one image with the style of another. You can keep experimenting, changing the weights and layers, and investigating variants of the approach to improve your results.
Keep in mind that neural style transfer is just the start of the amazing possibilities that deep learning presents for the creation of art and images.

Editor’s Note: Heartbeat is a contributor-driven online publication and community dedicated to providing premier educational resources for data science, machine learning, and deep learning practitioners. We’re committed to supporting and inspiring developers and engineers from all walks of life.

Editorially independent, Heartbeat is sponsored and published by Comet, an MLOps platform that enables data scientists & ML teams to track, compare, explain, & optimize their experiments. We pay our contributors, and we don’t sell ads.

If you’d like to contribute, head on over to our call for contributors. You can also sign up to receive our weekly newsletter (Deep Learning Weekly), check out the Comet blog, join us on Slack, and follow Comet on Twitter and LinkedIn for resources, events, and much more that will help you build better ML models, faster.
