SEMINAR
Academic Year
2023-2024
MHGAN: Multi-Hierarchies
Generative Adversarial Network for
High-Quality Face Sketch Synthesis
1. INTRODUCTION
2. OBJECTIVE
3. LITERATURE SURVEY AND THEORY
4. PROPOSED METHOD
5. EXPERIMENTATION
6. CONCLUSION
7. FUTURE SCOPE
8. REFERENCES
INTRODUCTION
● Face sketch synthesis is the process of generating face sketches from face photos. It has
been studied for a long time due to its wide applications in digital entertainment and law
enforcement.
● Recently, GAN-based methods have shown promising results on image-to-image
translation problems, especially photo-to-sketch synthesis.
● However, existing face sketch synthesis methods often lack models for specific facial
regions and usually generate face sketches with coarse structures.
● To address these limitations, this paper proposes a novel Multi-Hierarchies GAN
(MHGAN), which divides the face image into multiple hierarchical structures to learn
different regions' features of the face.
OBJECTIVE
● To propose a novel GAN-based method for face sketch synthesis that can
generate high-quality sketches with fine local details and coarse facial
structures.
● To improve the quality of the generated sketches by incorporating a local
region module, mask module and fusion module.
● To evaluate the performance of the proposed MHGAN on the standard CUFS and
CUFSF datasets and on face photos collected from the internet.
LITERATURE SURVEY AND THEORY
The proposed method in the paper "MHGAN: Multi-Hierarchies Generative Adversarial Network
for High-Quality Face Sketch Synthesis" consists of three modules:
● Local region module: The local region module uses a GAN to learn the detailed
features of different local regions of the face, such as the eyes, nose, and mouth.
A GAN is a deep learning framework in which a generator and a discriminator are trained
adversarially, enabling the generator to produce realistic images.
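To make the idea concrete, the sketch below crops fixed facial regions from an aligned face image so each patch can feed its own GAN branch. The coordinates and the helper function are illustrative assumptions, not taken from the paper.

```python
import numpy as np

# Hypothetical bounding boxes (row_start, row_end, col_start, col_end) for a
# 256x256 aligned face; the paper's actual region definitions may differ.
REGIONS = {
    "left_eye":  (80, 120, 60, 120),
    "right_eye": (80, 120, 136, 196),
    "nose":      (110, 170, 98, 158),
    "mouth":     (170, 215, 88, 168),
}

def crop_local_regions(face, regions=REGIONS):
    """Return a dict of local-region patches, one per GAN branch."""
    return {name: face[r0:r1, c0:c1] for name, (r0, r1, c0, c1) in regions.items()}

face = np.zeros((256, 256, 3), dtype=np.uint8)  # stand-in for an aligned photo
patches = crop_local_regions(face)
```

Each patch is then translated to its sketch counterpart independently, which is what lets the module specialise in fine local detail.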
● Mask module : The mask module generates the coarse facial structure of a sketch. It uses a
facial feature extractor to capture high-level features of the face image and learn a
latent-space representation.
The facial feature extractor is a deep learning model that extracts high-level features from
a face image.
The mask module uses the facial feature extractor to generate a coarse facial structure of
the sketch, which will be used as a guidance for the fusion module.
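One way to picture the coarse facial structure the mask module produces is a heavily downsampled-then-upsampled version of a sketch, which keeps the global layout but discards fine detail. The helper below is purely illustrative and is not the paper's feature extractor.

```python
import numpy as np

def coarse_structure(img, factor=8):
    """Illustrative coarse map: average-pool by `factor`, then nearest-neighbour
    upsample back to the original size, keeping only the global facial layout."""
    h, w = img.shape
    img = img[:h - h % factor, :w - w % factor]
    pooled = img.reshape(h // factor, factor, w // factor, factor).mean(axis=(1, 3))
    return np.repeat(np.repeat(pooled, factor, axis=0), factor, axis=1)

sketch = np.random.rand(256, 256)
coarse = coarse_structure(sketch)
```

The learned mask plays an analogous role: it supplies the global structure that guides the fusion module.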
● Fusion module : The fusion module generates the final sketch by combining the fine local regions
with the coarse facial structure. It follows a coarse-to-fine strategy: it starts from a coarse
sketch and gradually refines it into a high-quality final sketch.
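A toy illustration of the coarse-to-fine idea (the coordinates and the direct copy are made up; the real fusion module learns this combination): start from the coarse structure and overwrite each local region with its finer patch.

```python
import numpy as np

def fuse(coarse, patches, boxes):
    """Toy coarse-to-fine fusion: paste each fine local patch into its region
    of the coarse sketch, leaving the rest of the coarse structure intact."""
    out = coarse.copy()
    for name, (r0, r1, c0, c1) in boxes.items():
        out[r0:r1, c0:c1] = patches[name]
    return out

boxes = {"nose": (110, 170, 98, 158)}          # hypothetical region
coarse = np.zeros((256, 256))                  # stand-in coarse structure
patches = {"nose": np.ones((60, 60))}          # stand-in fine patch
final = fuse(coarse, patches, boxes)
```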
● The local region module and the mask module are both trained adversarially: each generator
tries to produce realistic outputs, while a discriminator tries to distinguish real sketches
from generated ones. The fusion module is trained using a supervised learning approach, and
the final sketch is produced by combining the outputs of the local region module and the
mask module.
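The training objective can be sketched as an adversarial term plus a supervised reconstruction term. The specific form and the weighting `lam` below are common GAN conventions assumed for illustration, not the paper's exact losses.

```python
import numpy as np

def generator_loss(disc_fake, fake, real, lam=100.0):
    """Illustrative generator objective: a non-saturating adversarial term
    (fool the discriminator) plus an L1 reconstruction term weighted by `lam`."""
    adv = -np.mean(np.log(disc_fake + 1e-8))   # adversarial: push D(fake) toward 1
    l1 = np.mean(np.abs(fake - real))          # supervised: stay close to ground truth
    return adv + lam * l1

# With a perfect reconstruction, only the adversarial term remains.
loss = generator_loss(np.array([0.9]), np.zeros((4, 4)), np.zeros((4, 4)))
```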
● The proposed method has been shown to be effective in generating high-quality face sketches.
It outperforms the state-of-the-art methods in terms of both qualitative and quantitative
measures.
EXPERIMENTATION
The authors evaluated the performance of the proposed MHGAN on two standard benchmarks: the CUFS
dataset (606 photo-sketch pairs drawn from the CUHK student, AR, and XM2VTS databases) and the
CUFSF dataset (1,194 photo-sketch pairs). They also evaluated the method on face photos collected
from the internet.
The following quantitative metrics are used to evaluate the performance of the proposed MHGAN:
● Structural Similarity Index (SSIM): This metric measures the structural similarity between the
generated sketch and the ground-truth sketch. A higher SSIM score indicates a closer match.
● Peak Signal-to-Noise Ratio (PSNR): This metric measures the pixel-level fidelity of the
generated sketch relative to the reference. A higher PSNR score indicates less distortion.
● Human Visual Evaluation (HEVAL): This metric is a subjective evaluation of the quality of the
generated sketches by human subjects.
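Both objective metrics can be computed directly. The sketch below shows PSNR and a simplified single-window SSIM; the standard SSIM metric averages the same formula over local windows, so this is an illustration rather than a drop-in replacement.

```python
import numpy as np

def psnr(ref, test, peak=255.0):
    """Peak signal-to-noise ratio in dB; higher means less distortion."""
    mse = np.mean((ref.astype(np.float64) - test.astype(np.float64)) ** 2)
    return float("inf") if mse == 0 else 10.0 * np.log10(peak ** 2 / mse)

def global_ssim(x, y, peak=255.0):
    """Single-window SSIM over the whole image (the full metric uses local windows)."""
    c1, c2 = (0.01 * peak) ** 2, (0.03 * peak) ** 2
    x, y = x.astype(np.float64), y.astype(np.float64)
    mx, my = x.mean(), y.mean()
    vx, vy = x.var(), y.var()
    cov = ((x - mx) * (y - my)).mean()
    return ((2 * mx * my + c1) * (2 * cov + c2)) / ((mx ** 2 + my ** 2 + c1) * (vx + vy + c2))
```

Identical images give SSIM 1.0 and infinite PSNR; adding a uniform offset of 10 grey levels lowers PSNR to about 28 dB, matching the intuition that higher scores mean closer sketches.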
● The results of the experiments showed that the proposed MHGAN outperformed the
state-of-the-art methods in terms of all three metrics. The proposed MHGAN achieved an
average SSIM score of 0.932, an average PSNR score of 32.37 dB, and an average HEVAL score of
4.28 on the CUFS dataset. The proposed MHGAN also achieved an average SSIM score of 0.928,
an average PSNR score of 32.18 dB, and an average HEVAL score of 4.25 on the CUFSF dataset.
● The authors also conducted qualitative evaluations of the generated sketches. The qualitative
evaluations showed that the proposed MHGAN is able to generate high-quality sketches with
fine local details and coarse facial structures. The generated sketches are also more realistic and
natural-looking than the sketches generated by the state-of-the-art methods.
● The authors also conducted ablation studies to investigate the effectiveness of the different
components of the proposed MHGAN. The ablation studies showed that the local region module,
mask module, and fusion module all contribute to the improved performance of the proposed
MHGAN.
FUTURE SCOPE
● Speed: The authors plan to modify the encoder or fusion module to improve
the synthesis speed of the method. This would make the method more
practical for real-world applications.
● The method could be used to train artificial intelligence models to
recognize faces or generate sketches.
REFERENCES
1. M. Zhang, N. Wang, Y. Li, and X. Gao, ‘‘Bionic face sketch generator,’’ IEEE Trans.
Cybern., vol. 50, no. 6, pp. 2701–2714, Jun. 2020.
2. M. Zhang, J. Zhang, Y. Chi, Y. Li, N. Wang, and X. Gao, ‘‘Cross-domain face sketch
synthesis,’’ IEEE Access, vol. 7, pp. 98866–98874, 2019.
3. K.-K. Huang, D.-Q. Dai, C.-X. Ren, and Z.-R. Lai, ‘‘Learning kernel extended dictionary
for face recognition,’’ IEEE Trans. Neural Netw. Learn. Syst., vol. 28, no. 5, pp.
1082–1094, May 2017.
4. N. Wang, D. Tao, X. Gao, X. Li, and J. Li, ‘‘Transductive face sketch photo synthesis,’’ IEEE
Trans. Neural Netw. Learn. Syst., vol. 24, no. 9, pp. 1364–1376, Sep. 2013.