COMP9491 Week 2: Deep Learning 1
COMP9491 Applied AI
Term 2, 2023
Outline
▪ Vision-language studies
▪ Generative models
▪ Semi-supervised learning
▪ EfficientNet
▪ Remix
Source: https://www.assemblyai.com/blog/how-dall-e-2-actually-works/
▪ GAN (NeurIPS’14)
Minimax loss:
min_G max_D V(D, G) = E_{x∼p_data(x)}[log D(x)] + E_{z∼p_z(z)}[log(1 − D(G(z)))]
https://developers.google.com/machine-learning/gan/gan_structure
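In practice the minimax game is trained by alternating gradient steps on the discriminator and the generator. Below is a minimal PyTorch sketch of this alternating update; the architectures, sizes, and the non-saturating generator loss are illustrative assumptions, not the original NeurIPS’14 code.

```python
import torch
import torch.nn as nn

# Placeholder generator and discriminator (sizes are assumptions)
G = nn.Sequential(nn.Linear(64, 128), nn.ReLU(), nn.Linear(128, 784), nn.Tanh())
D = nn.Sequential(nn.Linear(784, 128), nn.LeakyReLU(0.2), nn.Linear(128, 1))

opt_g = torch.optim.Adam(G.parameters(), lr=2e-4)
opt_d = torch.optim.Adam(D.parameters(), lr=2e-4)
bce = nn.BCEWithLogitsLoss()

def train_step(real):  # real: (batch, 784) tensor
    batch = real.size(0)
    z = torch.randn(batch, 64)
    fake = G(z)

    # Discriminator ascends V(D, G): label real data 1, generated data 0
    loss_d = bce(D(real), torch.ones(batch, 1)) + \
             bce(D(fake.detach()), torch.zeros(batch, 1))
    opt_d.zero_grad(); loss_d.backward(); opt_d.step()

    # Generator: the common "non-saturating" variant maximizes log D(G(z))
    loss_g = bce(D(fake), torch.ones(batch, 1))
    opt_g.zero_grad(); loss_g.backward(); opt_g.step()
    return loss_d.item(), loss_g.item()
```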
▪ CycleGAN
▪ StyleGAN
▪ Data augmentation
▪ Data augmentation using generative adversarial networks
(CycleGAN) to improve generalizability in CT segmentation
tasks (Scientific Reports, 2019)
▪ Image super-resolution
▪ Photo-realistic single image super-resolution using a
generative adversarial network (CVPR’17)
▪ Image completion
▪ Wide-context semantic image extrapolation (CVPR’19)
▪ Language generation
▪ Adversarial ranking for language generation (NeurIPS’17)
▪ Speech synthesis
▪ High fidelity speech synthesis with adversarial networks (ICLR’20)
▪ Speech enhancement
▪ Exploring speech enhancement with generative adversarial
networks for robust speech recognition (ICASSP’18)
▪ Diffusion Models
▪ Deep unsupervised learning using nonequilibrium thermodynamics
(ICML’15)
▪ Two stages:
▪ Forward diffusion slowly destroys structure in a data distribution by
adding Gaussian noise iteratively
▪ Reverse diffusion gradually reconstructs or denoises the images back to
the original using deep learning
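As a concrete illustration of the forward stage, here is a minimal sketch using the closed-form Gaussian corruption q(x_t | x_0) popularized by DDPM-style models; the linear noise schedule and variable names are assumptions, not the ICML’15 formulation.

```python
import torch

T = 1000
betas = torch.linspace(1e-4, 0.02, T)           # assumed linear noise schedule
alphas_bar = torch.cumprod(1.0 - betas, dim=0)  # abar_t = prod_s (1 - beta_s)

def q_sample(x0, t):
    """Sample x_t ~ q(x_t | x_0) = N(sqrt(abar_t) x_0, (1 - abar_t) I)."""
    noise = torch.randn_like(x0)
    abar = alphas_bar[t].view(-1, *[1] * (x0.dim() - 1))
    return abar.sqrt() * x0 + (1.0 - abar).sqrt() * noise, noise

# The reverse stage trains a denoising network eps_theta(x_t, t) to predict
# `noise` from the corrupted sample, e.g. loss = mse(eps_theta(x_t, t), noise).
```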
https://developer.nvidia.com/blog/improving-diffusion-models-as-an-alternative-to-gans-part-1/
Source: https://towardsdatascience.com/understanding-variational-autoencoders-vaes-f70510919f73
Source: Tackling the generative learning trilemma with denoising diffusion GANs, ICLR 2022.
▪ Problem definition
▪ Incorporate additional unlabelled training data when training a supervised learning model
▪ Advantage: only a small subset of the training data needs to be annotated while maintaining model performance
▪ Data synthesis
▪ Generate additional data with pseudo ground-truth labels and include them in training
▪ Mixup (see the sketch below)
▪ Data augmentation using GAN
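For instance, mixup (ICLR’18) synthesizes new training pairs as convex combinations of existing examples and their one-hot labels. A minimal sketch, where the array shapes and the value of alpha are assumptions:

```python
import numpy as np

def mixup(x1, y1, x2, y2, alpha=0.2):
    """Blend two (input, one-hot label) pairs into one synthetic pair."""
    lam = np.random.beta(alpha, alpha)   # mixing coefficient in (0, 1)
    x = lam * x1 + (1.0 - lam) * x2      # blended input
    y = lam * y1 + (1.0 - lam) * y2      # blended (soft) label
    return x, y
```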
▪ Adversarial learning
▪ A typical approach: Improved techniques for training GANs
(NeurIPS’16)
▪ Main ideas:
▪ For labelled real data, the discriminator predicts their class labels
▪ Unlabelled real data and generated data are trained with the adversarial (real vs. generated) loss only; see the sketch below
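A minimal sketch of this (K+1)-class discriminator loss, where the discriminator outputs K real classes plus one extra "generated" class; the tensor shapes, class count, and epsilon are assumptions:

```python
import torch
import torch.nn.functional as F

K = 10  # number of labelled classes (assumption); index K = "generated"

def discriminator_loss(logits_lab, y, logits_unlab, logits_fake):
    # Labelled real data: cross-entropy against the true class label
    sup = F.cross_entropy(logits_lab, y)

    # Probability the discriminator assigns to the "generated" class
    p_fake_unlab = F.softmax(logits_unlab, dim=1)[:, K]
    p_fake_fake = F.softmax(logits_fake, dim=1)[:, K]

    # Unlabelled real data: adversarial term only, maximize p(real)
    unsup_real = -torch.log(1.0 - p_fake_unlab + 1e-8).mean()
    # Generated data: maximize p(generated)
    unsup_fake = -torch.log(p_fake_fake + 1e-8).mean()
    return sup + unsup_real + unsup_fake
```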
▪ Adversarial learning
▪ Deep adversarial networks for biomedical image segmentation
utilizing unannotated images (MICCAI’17)
▪ Graph regularization
▪ Label propagation for deep semi-supervised learning (CVPR’19)
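The core of that approach is classic label spreading on an affinity graph built from deep features: labels diffuse from the few annotated points to their unlabelled neighbours, yielding pseudo-labels for training. A minimal dense-matrix sketch following Zhou et al.'s label-spreading iteration, on which the CVPR’19 pipeline builds; the graph construction and hyperparameters are assumptions:

```python
import numpy as np

def label_propagation(W, Y, alpha=0.99, iters=50):
    """W: (n, n) symmetric affinity matrix; Y: (n, K) one-hot labels,
    with all-zero rows for unlabelled points."""
    d = W.sum(axis=1)
    D_inv_sqrt = np.diag(1.0 / np.sqrt(d + 1e-8))
    S = D_inv_sqrt @ W @ D_inv_sqrt           # symmetrically normalized graph
    F = Y.astype(float).copy()
    for _ in range(iters):
        F = alpha * S @ F + (1 - alpha) * Y   # diffuse labels, anchor seeds
    return F.argmax(axis=1)                   # pseudo-labels for all points
```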
▪ Self-ensembling
▪ Uncertainty-aware self-ensembling model for semi-supervised
3D left atrium segmentation (MICCAI’19)
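A minimal sketch of the self-ensembling core: the teacher network's weights are an exponential moving average (EMA) of the student's, and the teacher supervises the student on unlabelled data via a consistency loss. The function name and decay value are assumptions; the MICCAI’19 paper additionally weights the consistency loss by teacher uncertainty estimated with Monte Carlo dropout.

```python
import torch

@torch.no_grad()
def ema_update(teacher, student, decay=0.99):
    """teacher, student: nn.Module instances with identical architectures."""
    for t_param, s_param in zip(teacher.parameters(), student.parameters()):
        # teacher <- decay * teacher + (1 - decay) * student
        t_param.mul_(decay).add_(s_param, alpha=1.0 - decay)
```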
▪ Unsupervised learning
▪ Transfer learning
▪ Weakly supervised learning
▪ Self-supervised learning
▪ Few/zero shot learning
▪ Meta learning
▪ Active learning
▪ Continual learning
▪ Federated learning
▪ …