ML_Seminar
Abstract
This project implements real-time neural style transfer using Adaptive Instance
Normalization (AdaIN). The model performs artistic style transfer by aligning feature
statistics between content and style images. Unlike traditional methods that are slow
and require style-specific training, this approach enables fast and flexible stylization of
arbitrary images using a single feed-forward model. The implementation is based on
PyTorch and follows the architecture proposed by Huang and Belongie. This report
outlines the methodology, training setup, loss functions, and results of the model.
1 Introduction
Neural Style Transfer (NST) is a technique that recomposes the content of one image in
the visual appearance or "style" of another. Traditional approaches, such as the method
introduced by Gatys et al. (2016), use a pre-trained convolutional neural network (e.g.,
VGG-19) to iteratively optimize a randomly initialized image so that its deep features match
those of a given content and style image. While this method yields visually impressive results,
it is computationally intensive and cannot be used in real-time applications.
Subsequent approaches attempted to accelerate the process by training feed-forward net-
works to mimic the optimization. However, these networks were either limited to specific
styles or required multi-style training and large models. To overcome these constraints, Xun
Huang and Serge Belongie introduced the concept of Adaptive Instance Normalization
(AdaIN) in 2017. AdaIN enables arbitrary style transfer in real-time using a single feed-
forward network by aligning the channel-wise mean and variance of content features to match
those of the style features.
This report is based on a PyTorch implementation of the AdaIN method, as provided in
the open-source repository: https://github.com/RUPESH-KUMAR01/AdaIn_style_transfer.
It includes detailed discussion of the architecture, loss functions, training procedures, and
evaluation results.
2 Methodology
The AdaIN model consists of three major components: a pre-trained encoder, the AdaIN
feature transformation layer, and a trainable decoder. The overall architecture is inspired
by the one proposed in the original AdaIN paper, and the implementation closely follows
the provided PyTorch code.
2.1 Encoder
The encoder is a truncated version of the VGG-19 network, stopping at layer relu4_1. It
is used to extract high-level feature representations of both the content and style images.
In the code, this is implemented by loading the pre-trained VGG-19 model and freezing its
parameters to avoid updates during training.
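As a concrete illustration, the following sketch shows one way to obtain such a truncated, frozen encoder with torchvision. The function name and the slicing index are assumptions made for this report and may not match the repository exactly.

```python
import torch.nn as nn
from torchvision import models

def build_encoder() -> nn.Module:
    """Truncated VGG-19 encoder up to relu4_1, with frozen weights."""
    vgg = models.vgg19(weights=models.VGG19_Weights.IMAGENET1K_V1).features
    # In torchvision's VGG-19 layer ordering, index 20 is the ReLU following
    # conv4_1 (i.e. relu4_1), so the first 21 modules cover everything up to it.
    encoder = nn.Sequential(*list(vgg.children())[:21])
    for p in encoder.parameters():
        p.requires_grad = False  # the encoder is never updated during training
    return encoder.eval()
```

2.2 AdaIN Layer
Between the encoder and the decoder, the AdaIN layer aligns the channel-wise mean and
variance of the content features $x$ with those of the style features $y$:
$$\mathrm{AdaIN}(x, y) = \sigma(y)\,\frac{x - \mu(x)}{\sigma(x)} + \mu(y)$$
where $\mu(\cdot)$ and $\sigma(\cdot)$ are computed per channel over spatial positions. The layer has no
learnable parameters. A minimal PyTorch sketch of this operation (not taken verbatim from
the repository) is:

```python
import torch

def adain(content_feat: torch.Tensor, style_feat: torch.Tensor,
          eps: float = 1e-5) -> torch.Tensor:
    """Align the channel-wise mean/std of content features with the style features.

    Both inputs are feature maps of shape (N, C, H, W).
    """
    c_mean = content_feat.mean(dim=(2, 3), keepdim=True)
    c_std = content_feat.std(dim=(2, 3), keepdim=True) + eps
    s_mean = style_feat.mean(dim=(2, 3), keepdim=True)
    s_std = style_feat.std(dim=(2, 3), keepdim=True) + eps
    return (content_feat - c_mean) / c_std * s_std + s_mean
```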
2.3 Decoder
The decoder network is a symmetric convolutional network designed to reconstruct an image
from AdaIN-transformed features. It includes upsampling layers (nearest-neighbor interpola-
tion) followed by convolutional blocks. The decoder is trained from scratch while the encoder
remains fixed.
The goal of the decoder is to produce an output image that reflects the style of the
reference image while preserving the structure of the content image. In the implementation,
the decoder is trained using a joint content and style loss (described in the next section).
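A compact sketch of such a decoder is shown below. The number of blocks, the channel widths, and the use of reflection padding are illustrative assumptions in the spirit of the architecture described above, not the exact configuration from the repository.

```python
import torch.nn as nn

def conv_block(in_ch: int, out_ch: int) -> nn.Sequential:
    """Reflection-padded 3x3 convolution followed by ReLU."""
    return nn.Sequential(
        nn.ReflectionPad2d(1),
        nn.Conv2d(in_ch, out_ch, kernel_size=3),
        nn.ReLU(inplace=True),
    )

# Roughly mirrors the truncated VGG-19 encoder: each nearest-neighbor
# Upsample undoes one pooling stage, and the channel count shrinks back
# from 512 (relu4_1 features) to 3 (RGB output).
decoder = nn.Sequential(
    conv_block(512, 256),
    nn.Upsample(scale_factor=2, mode="nearest"),
    conv_block(256, 256),
    conv_block(256, 256),
    conv_block(256, 128),
    nn.Upsample(scale_factor=2, mode="nearest"),
    conv_block(128, 64),
    nn.Upsample(scale_factor=2, mode="nearest"),
    conv_block(64, 64),
    nn.ReflectionPad2d(1),
    nn.Conv2d(64, 3, kernel_size=3),  # final RGB prediction, no activation
)
```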
3 Training
3.1 Datasets
The training procedure uses two distinct datasets:
• Content images: Typically sampled from MS-COCO, a large dataset of natural
images.
• Style images: Sourced from artistic datasets such as WikiArt, which contain paintings
in various artistic styles.
Each training batch samples one content and one style image randomly. The images are
resized and normalized before being passed through the encoder.
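A minimal preprocessing pipeline consistent with this description might look as follows; the exact resize and crop sizes are assumptions, since this report does not specify them.

```python
from torchvision import transforms

# Illustrative preprocessing applied to both content and style images.
train_transform = transforms.Compose([
    transforms.Resize(512),       # resize the shorter side
    transforms.RandomCrop(256),   # random square training crop
    transforms.ToTensor(),        # HWC uint8 -> CHW float in [0, 1]
    # ImageNet statistics for the VGG encoder (assumed, not stated in this report)
    transforms.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]),
])
```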
3.2 Optimization
Only the decoder is trained; the encoder (VGG-19) is kept frozen. Training uses the Adam
optimizer with the following setting:
• Batch size: 8
The training objective is to minimize a weighted combination of content and style losses,
encouraging the decoder to generate outputs that resemble the style image in appearance
but retain the content structure.
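Since the encoder is frozen, only the decoder's parameters are handed to the optimizer. A minimal setup, reusing the decoder sketched in Section 2.3, could look like this; the learning rate is an illustrative choice rather than a value reported here.

```python
import torch

# Only the decoder is optimized; the frozen VGG-19 encoder is excluded entirely.
optimizer = torch.optim.Adam(decoder.parameters(), lr=1e-4)  # lr is an assumption
```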
4 Loss Functions
The training objective consists of two parts: content loss and style loss. Both are computed
using features extracted from the encoder.
4.1 Content Loss
The content loss ensures that the output image maintains the structure and semantics of the
content image. It is computed as the squared Euclidean distance between the AdaIN-transformed
target features and the encoder features of the output image:
$$L_c = \lVert f(g(t)) - t \rVert_2^2$$
where f is the encoder, g is the decoder, and t is the AdaIN-transformed target feature.
4.2 Style Loss
The style loss encourages the output image to reproduce the feature statistics of the style
image. It compares channel-wise means and standard deviations across several layers of the
encoder:
$$L_s = \sum_{i=1}^{L} \bigl\lVert \mu(\phi_i(g(t))) - \mu(\phi_i(s)) \bigr\rVert_2 + \sum_{i=1}^{L} \bigl\lVert \sigma(\phi_i(g(t))) - \sigma(\phi_i(s)) \bigr\rVert_2$$
Here, $\phi_i$ denotes the features extracted from the $i$-th layer of the encoder, $s$ is the style
image, and $\mu$, $\sigma$ are the channel-wise mean and standard deviation. The overall training
objective combines the two terms:
$$L_{\text{total}} = L_c + \lambda L_s$$
The hyperparameter $\lambda$ balances content and style. In practice, $\lambda = 10$ gives good results.
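The sketch below ties the pieces together into a single loss computation, reusing the adain helper and the encoder/decoder from the earlier sections. For brevity it measures the style statistics only at the encoder's final layer and uses mean-squared error in place of the norms above; the actual implementation compares statistics at several VGG layers.

```python
import torch.nn.functional as F

def compute_losses(encoder, decoder, content_img, style_img, lam=10.0):
    """One forward pass returning (total, content, style) losses."""
    f_c = encoder(content_img)   # content features
    f_s = encoder(style_img)     # style features
    t = adain(f_c, f_s)          # AdaIN-transformed target features
    output = decoder(t)          # stylized image
    f_o = encoder(output)        # features of the stylized image

    # Content loss: output features should match the AdaIN target t.
    loss_c = F.mse_loss(f_o, t)

    # Style loss: match channel-wise mean and standard deviation of the
    # output features to those of the style features.
    loss_s = F.mse_loss(f_o.mean(dim=(2, 3)), f_s.mean(dim=(2, 3))) \
           + F.mse_loss(f_o.std(dim=(2, 3)), f_s.std(dim=(2, 3)))

    return loss_c + lam * loss_s, loss_c, loss_s
```

In a training step, the returned total loss is backpropagated and optimizer.step() updates the decoder weights only.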
5 Results
5.1 Stylized Output
The model is capable of generating high-quality stylized images that retain the spatial struc-
ture of the content image while adopting the visual appearance (colors, textures, brush-
strokes) of the style image. This is achieved without training a separate model for each
style, demonstrating the flexibility of AdaIN-based style transfer.
Figure 1 shows a sample result, where the content image has been transformed with
the style of a reference painting. The model generalizes well to unseen styles and produces
results in real-time, making it suitable for interactive applications.
Figure 1: Example of stylized image output. The output preserves the structure of the
content image while adopting the artistic characteristics of the style image.
Figure 2: Training loss over time. The total loss combines content and style objectives.
6 Conclusion
This project successfully reimplements the AdaIN-based style transfer architecture for real-
time stylization using PyTorch. The approach demonstrates how statistical alignment of
feature maps using AdaIN allows arbitrary style transfer without requiring per-style retrain-
ing. The encoder is fixed, and the decoder is trained to reconstruct stylized outputs using a
content-style trade-off loss.
The results confirm that AdaIN produces perceptually convincing stylized images that
preserve the content structure while adopting artistic styles. Compared to earlier methods,
this model is lightweight and fast enough for real-time use.
Future improvements could include temporal consistency for video stylization, user-
guided controls for blending styles, or improving high-frequency texture detail in stylized
results.
References
• Huang, Xun, and Serge Belongie. "Arbitrary Style Transfer in Real-Time with Adaptive
Instance Normalization." arXiv preprint arXiv:1703.06868 (2017).
• Gatys, Leon A., Alexander S. Ecker, and Matthias Bethge. "Image Style Transfer Using
Convolutional Neural Networks." In Proceedings of the IEEE Conference on Computer
Vision and Pattern Recognition (CVPR), 2016.