{% extends "base.html" %} {% block topNavbar %}
In this method, we choose one style image and many content images to train a feed-forward model. We pass the model's output image, the original style image, and the original content image through a pretrained VGG network and extract features from specific layers. Loss functions then measure the differences between these features, and summing the differences gives the total loss. Our objective is to minimize this total loss: by backpropagating and updating the model's weights with an optimizer, we obtain a model that generates a combined image in the style of the selected style image.
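The loss combination above can be sketched in NumPy. This is a minimal illustration, not our training code: it assumes the VGG features for each layer have already been extracted as `(channels, height, width)` arrays, and the layer choice and `style_weight` value are hypothetical.

```python
import numpy as np

def gram_matrix(feat):
    """Gram matrix of a feature map with shape (C, H, W)."""
    c, h, w = feat.shape
    f = feat.reshape(c, h * w)
    return f @ f.T / (c * h * w)

def total_loss(out_feats, content_feats, style_feats, style_weight=1e5):
    """Combine content and style losses over the selected VGG layers.

    Each argument is a list of (C, H, W) feature arrays, one per layer,
    assumed to be precomputed by a pretrained VGG network.
    """
    # Content loss: mean squared difference of raw features.
    content_loss = sum(np.mean((o - c) ** 2)
                       for o, c in zip(out_feats, content_feats))
    # Style loss: mean squared difference of Gram matrices.
    style_loss = sum(np.mean((gram_matrix(o) - gram_matrix(s)) ** 2)
                     for o, s in zip(out_feats, style_feats))
    return content_loss + style_weight * style_loss
```

In training, this total loss would be minimized with respect to the feed-forward model's weights rather than computed once.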
In this method, we use many style images and many content images to train a decoder model. Unlike the fast neural style method, instead of a content loss and a style loss we use a technique called "style swap". First, we pass the content image and the style image through a pretrained VGG network and extract a content feature and a style feature at a specific layer (we use relu3-3). We then extract many patches from the style feature and slide each patch over the content feature to measure similarity. Next, we replace every region of the content feature with the most similar style patch, producing a swapped (collage) feature, and pass it to the decoder model. Finally, we feed the decoder's output back into the pretrained VGG network and compare it with the original content feature; this loss is used to optimize the decoder, giving a model that can generate a combined image from any style image.
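The patch-swapping step can be sketched as follows. This is a simplified NumPy version for illustration only: the `(C, H, W)` feature maps are assumed to come from the relu3-3 layer of a pretrained VGG network, similarity is normalized cross-correlation, and overlapping replaced patches are averaged.

```python
import numpy as np

def style_swap(content, style, patch=3):
    """Replace each content-feature patch with its most similar style patch.

    content, style: (C, H, W) feature maps, assumed precomputed by VGG.
    """
    c, h, w = content.shape
    # Collect every style patch and an L2-normalized copy for matching.
    patches = []
    for i in range(style.shape[1] - patch + 1):
        for j in range(style.shape[2] - patch + 1):
            patches.append(style[:, i:i + patch, j:j + patch])
    flat = np.stack([p.ravel() for p in patches])              # (N, C*k*k)
    norm = flat / (np.linalg.norm(flat, axis=1, keepdims=True) + 1e-8)

    out = np.zeros_like(content)
    count = np.zeros((h, w))
    for i in range(h - patch + 1):
        for j in range(w - patch + 1):
            q = content[:, i:i + patch, j:j + patch].ravel()
            # Best-matching style patch by normalized cross-correlation.
            best = patches[int(np.argmax(norm @ q))]
            out[:, i:i + patch, j:j + patch] += best
            count[i:i + patch, j:j + patch] += 1
    return out / np.maximum(count, 1)                          # average overlaps
```

A useful sanity check is that swapping a feature map with itself reconstructs it, since every patch's best match is itself.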
In this method, we use a pretrained Mask R-CNN model to classify and segment objects in the image. The model was pretrained on the MS COCO dataset, so it can recognize about 80 object categories. After Mask R-CNN segments the objects, we generate a mask over them. We then stylize the image twice with two different style images using the fast neural style method, crop one generated image with the mask, and paste the cropped region onto the other generated image. The result is a combined image containing two different styles.
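The crop-and-paste step amounts to a masked composite of the two stylized images. A minimal NumPy sketch, assuming the two stylized images and the Mask R-CNN mask are already computed:

```python
import numpy as np

def composite(styled_a, styled_b, mask):
    """Paste the masked region of one stylized image onto the other.

    styled_a, styled_b: (H, W, 3) images from two fast neural style models.
    mask: (H, W) boolean array from Mask R-CNN (True inside the object).
    """
    m = mask[..., None]                      # broadcast over color channels
    return np.where(m, styled_a, styled_b)   # styled_a inside, styled_b outside
```

Inside the object mask the first style appears; everywhere else the second style shows through.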