Deep Learning for Computer Vision: Data Augmentation (UPC 2016)


The document discusses the use of data augmentation in deep convolutional neural networks for image classification, specifically referencing the ImageNet challenge. It outlines techniques to reduce overfitting, such as reducing network capacity, dropout, and various data augmentation strategies that enhance training by altering input images. The document also highlights the importance of creating robust features through these augmentations and their effectiveness in unsupervised learning scenarios.


[course site]
Augmentation
Day 2 Lecture 2
Eva Mohedano

Introduction
ImageNet Classification with Deep Convolutional Neural Networks, Krizhevsky et al., 2012
ImageNet Large-Scale Visual Recognition Challenge (ILSVRC): 1.2 million training images, 50,000 validation images, and 150,000 testing images
Architecture of 5 convolutional + 3 fully connected layers = 60 million parameters, ~650,000 neurons.
Overfitting!!

Ways to reduce overfitting
● Reduce network capacity
● Dropout
● Data augmentation

Reduce network capacity
- Removing a convolutional layer that holds about 1% of the total parameters (884K) decreases performance.
- Most parameters sit in the fully connected layers: 37M, 16M, and 4M parameters (fc6, fc7, fc8).
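The parameter counts quoted above (884K, 37M, 16M, 4M) can be checked with a few multiplications. The sketch below is only a sanity check: the layer shapes are assumptions taken from the AlexNet paper rather than from the slides, biases are ignored, and the helper names are hypothetical.

```python
# Weight counts (biases ignored) for selected AlexNet layers.
# Layer shapes are assumptions taken from Krizhevsky et al. (2012), not from the slides.

def conv_weights(kernel, in_channels, out_channels):
    """Weights in a convolutional layer with square kernels."""
    return kernel * kernel * in_channels * out_channels

def fc_weights(n_in, n_out):
    """Weights in a fully connected layer."""
    return n_in * n_out

TOTAL = 60_000_000  # total parameter count quoted in the lecture

conv3 = conv_weights(3, 256, 384)       # third convolutional layer
fc6 = fc_weights(6 * 6 * 256, 4096)     # first fully connected layer
fc7 = fc_weights(4096, 4096)
fc8 = fc_weights(4096, 1000)            # 1000 ImageNet classes

for name, count in [("conv3", conv3), ("fc6", fc6), ("fc7", fc7), ("fc8", fc8)]:
    print(f"{name}: {count:,} weights ({100 * count / TOTAL:.1f}% of 60M)")
```

Under these assumptions the three fully connected layers account for almost all of the weights, which is why removing a single convolutional layer barely shrinks the model.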

Dropout
- Every forward pass, the network is slightly different.
- Reduces co-adaptation between neurons
- More robust features
- More iterations needed for convergence
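To illustrate why every forward pass sees a slightly different network, here is a minimal NumPy sketch of dropout. It uses the "inverted" variant (rescaling at training time), which is an assumption made for convenience; the 2012 paper instead scales the outputs at test time. The function name and arguments are hypothetical.

```python
import numpy as np

def dropout(activations, drop_prob, rng, training=True):
    """Inverted dropout: zero each unit with probability drop_prob during training.

    Surviving units are scaled by 1 / keep_prob so the expected activation
    is unchanged and nothing needs to be rescaled at test time.
    """
    if not training or drop_prob == 0.0:
        return activations
    keep_prob = 1.0 - drop_prob
    mask = rng.random(activations.shape) < keep_prob  # fresh random mask per call
    return activations * mask / keep_prob

rng = np.random.default_rng(0)
hidden = np.ones((2, 8))            # toy activations of a hidden layer
out_a = dropout(hidden, 0.5, rng)   # one sampled sub-network
out_b = dropout(hidden, 0.5, rng)   # another sampled sub-network
out_test = dropout(hidden, 0.5, rng, training=False)  # full network at test time
```

Because a new mask is drawn on each call, a unit cannot rely on any particular other unit being present, which is the co-adaptation argument in the list above.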


Data Augmentation
During training, alter the input image (Krizhevsky et al., 2012)
- Random crops of the original image
- Translations
- Horizontal reflections
- Increases the size of the training set by a factor of 2048
- On-the-fly augmentation
During testing
- Average the predictions over the four corner patches and the center patch, plus their horizontal flips (10 augmentations of the image)
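The training-time and test-time augmentations above can be sketched in a few lines of NumPy. This is an illustrative sketch, not the authors' code: the 256x256 image and 224x224 crop sizes are assumptions taken from the paper, and `random_crop_flip`, `ten_crop`, `predict_ten_crop`, and `model_fn` are hypothetical names. The factor of 2048 is consistent with 32 x 32 crop offsets times 2 reflections.

```python
import numpy as np

def random_crop_flip(image, crop, rng):
    """Training: random crop (a translation) plus a random horizontal reflection."""
    h, w = image.shape[:2]
    top = rng.integers(0, h - crop + 1)
    left = rng.integers(0, w - crop + 1)
    patch = image[top:top + crop, left:left + crop]
    if rng.random() < 0.5:
        patch = patch[:, ::-1]  # flip along the width axis
    return patch

def ten_crop(image, crop):
    """Testing: four corner patches, the center patch, and their horizontal flips."""
    h, w = image.shape[:2]
    offsets = [(0, 0), (0, w - crop), (h - crop, 0), (h - crop, w - crop),
               ((h - crop) // 2, (w - crop) // 2)]
    patches = [image[t:t + crop, l:l + crop] for t, l in offsets]
    patches += [p[:, ::-1] for p in patches]
    return np.stack(patches)

def predict_ten_crop(model_fn, image, crop):
    """Average the class probabilities predicted for the 10 patches."""
    probs = np.stack([model_fn(p) for p in ten_crop(image, crop)])
    return probs.mean(axis=0)

rng = np.random.default_rng(0)
image = rng.random((256, 256, 3))            # stand-in for a rescaled training image
train_patch = random_crop_flip(image, 224, rng)
test_patches = ten_crop(image, 224)
```

Since the crop is drawn when the batch is assembled, the augmented images never need to be stored, which is what "on-the-fly" refers to.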

Data Augmentation
Alter the intensities of the RGB channels
PCA on the set of RGB pixel values throughout the ImageNet training set.
To each training image, add multiples of the found principal components.
Object identity should be invariant to changes in illumination.
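A minimal sketch of this colour augmentation follows. The slide only says that multiples of the principal components are added; scaling each component by its eigenvalue times a Gaussian draw with standard deviation 0.1 is taken from the paper and should be read as an assumption here. The function names are hypothetical, and images are assumed to be float arrays of shape (H, W, 3).

```python
import numpy as np

def rgb_pca(pixels):
    """PCA of RGB values. pixels: (N, 3) array sampled from the training set."""
    cov = np.cov(pixels, rowvar=False)        # 3x3 covariance of the R, G, B channels
    eigvals, eigvecs = np.linalg.eigh(cov)    # eigvecs[:, i] pairs with eigvals[i]
    return eigvals, eigvecs

def pca_color_jitter(image, eigvals, eigvecs, rng, sigma=0.1):
    """Add the same random RGB offset to every pixel of one image."""
    alphas = rng.normal(0.0, sigma, size=3)   # drawn once per image
    shift = eigvecs @ (alphas * eigvals)      # sum_i alpha_i * lambda_i * p_i
    return image + shift                      # broadcasts over height and width

rng = np.random.default_rng(0)
train_pixels = rng.random((10_000, 3))        # stand-in for training-set RGB pixels
eigvals, eigvecs = rgb_pca(train_pixels)
image = rng.random((32, 32, 3))
jittered = pca_color_jitter(image, eigvals, eigvecs, rng)
```

The offset is constant across the image, so it behaves like a change in the intensity and colour of the illumination rather than a change in content.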

Augmentation for discriminative unsupervised feature learning
Discriminative Unsupervised Feature Learning with Exemplar Convolutional Neural Networks, Dosovitskiy et al., 2014
MOTIVATION
● Supervised training needs large datasets of labeled training data
● Local descriptors should be invariant to transformations (rotation, translation, scale, etc.)
WHAT THEY DO
● Train a CNN to generate local representations by optimising a surrogate classification task
● The task does NOT require labeled data

Augmentation for discriminative unsupervised feature learning
Select a random location k and crop a 32x32 window (restriction: the region must contain an object or part of an object, i.e. a high amount of gradients)
Apply a transformation [translation, rotation, scaling, RGB modification, contrast modification]
...
Generate augmented dataset: 16,000 classes of 150 examples each
Class k=1, with 150 examples
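The surrogate dataset described above can be sketched as follows. This is a simplified illustration under stated assumptions, not the authors' pipeline: rotation is omitted, the gradient restriction is approximated by keeping the most textured of a few random windows, the transformation ranges are arbitrary, and all function names are hypothetical. Images are assumed to be float arrays in [0, 1] of at least 48x48 pixels.

```python
import numpy as np

PATCH = 32   # window size from the slide
MARGIN = 8   # room left around the seed window for translation and scaling

def gradient_energy(gray, top, left):
    """Mean squared gradient magnitude inside one PATCH x PATCH window."""
    gy, gx = np.gradient(gray[top:top + PATCH, left:left + PATCH])
    return float(np.mean(gx ** 2 + gy ** 2))

def sample_seed_location(image, rng, tries=20):
    """Pick the most textured of a few random windows (stand-in for the restriction)."""
    gray = image.mean(axis=2)
    h, w = gray.shape
    candidates = [(int(rng.integers(MARGIN, h - PATCH - MARGIN + 1)),
                   int(rng.integers(MARGIN, w - PATCH - MARGIN + 1)))
                  for _ in range(tries)]
    return max(candidates, key=lambda tl: gradient_energy(gray, *tl))

def random_transform(image, top, left, rng):
    """Translation, scaling, RGB shift, and contrast change of the seed window."""
    dy, dx = rng.integers(-MARGIN // 2, MARGIN // 2 + 1, size=2)   # translation
    side = int(rng.integers(12, 21)) * 2                           # scaling: 24..40 px
    y0 = top + PATCH // 2 + dy - side // 2
    x0 = left + PATCH // 2 + dx - side // 2
    window = image[y0:y0 + side, x0:x0 + side]
    idx = (np.arange(PATCH) * side / PATCH).astype(int)            # nearest-neighbour resize
    patch = window[idx][:, idx]
    patch = patch + rng.uniform(-0.1, 0.1, size=3)                 # RGB modification
    mean = patch.mean()
    patch = (patch - mean) * rng.uniform(0.7, 1.3) + mean          # contrast modification
    return np.clip(patch, 0.0, 1.0)

def build_exemplar_dataset(images, n_classes, n_per_class, rng):
    """Each seed window defines one surrogate class; its transformed copies are the examples."""
    patches, labels = [], []
    for k in range(n_classes):
        image = images[int(rng.integers(len(images)))]
        top, left = sample_seed_location(image, rng)
        for _ in range(n_per_class):
            patches.append(random_transform(image, top, left, rng))
            labels.append(k)
    return np.stack(patches), np.array(labels)

rng = np.random.default_rng(0)
unlabeled = [rng.random((64, 64, 3)) for _ in range(3)]   # stand-in unlabeled images
X, y = build_exemplar_dataset(unlabeled, n_classes=4, n_per_class=5, rng=rng)
```

With `n_classes=16000` and `n_per_class=150` this would produce a dataset of the size quoted on the slide; a CNN trained to predict `y` from `X` never sees a human label.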

Augmentation for discriminative unsupervised feature learning
Generate augmented dataset: 16,000 classes of 150 examples each
Figures: examples of classes, and examples within one class

Augmentation for discriminative unsupervised feature learning
Classification accuracies
Superior performance to SIFT for image matching.

Summary
Augmentation helps to prevent overfitting.
It makes the network invariant to certain transformations: translations, flips, etc.
It can be done on-the-fly.
It can be used to learn image representations when no labeled datasets are available.

The document discusses various data augmentation techniques used in machine learning to enhance model performance and prevent overfitting, particularly in image classification and object detection tasks. It covers the advantages of data augmentation, types of augmentations, as well as challenges faced in implementing them across different frameworks. Additionally, it emphasizes the need for a unified augmentation library to improve portability and effectiveness across diverse use cases.

"Getting More from Your Datasets: Data Augmentation, Annotation and Generativ...Edge AI and Vision Alliance

Deep Learning - Convolutional Neural NetworksChristian Perone

This document provides an agenda for a presentation on deep learning, neural networks, convolutional neural networks, and interesting applications. The presentation will include introductions to deep learning and how it differs from traditional machine learning by learning feature representations from data. It will cover the history of neural networks and breakthroughs that enabled training of deeper models. Convolutional neural network architectures will be overviewed, including convolutional, pooling, and dense layers. Applications like recommendation systems, natural language processing, and computer vision will also be discussed. There will be a question and answer section.

Image classification with Deep Neural NetworksYogendra Tamang

This document discusses image classification using deep neural networks. It provides background on image classification and convolutional neural networks. The document outlines techniques like activation functions, pooling, dropout and data augmentation to prevent overfitting. It summarizes a paper on ImageNet classification using CNNs with multiple convolutional and fully connected layers. The paper achieved state-of-the-art results on ImageNet in 2010 and 2012 by training CNNs on a large dataset using multiple GPUs.

Deep Learning - Overview of my work IIMohamed Loey

The document covers various aspects of deep learning, including definitions of artificial narrow, general, and super intelligence, and details on machine learning techniques like supervised and unsupervised learning. It discusses the architecture of neural networks such as convolutional neural networks and their application in image recognition, speech recognition, and natural language processing, along with datasets like MNIST and CIFAR-10. Techniques to mitigate overfitting in neural networks, such as dropout and L2 regularization, are also highlighted, along with a summary of a new CNN architecture built for classifying Arabic handwritten characters.

Convolutional neural network from VGG to DenseNetSungminYou

This document summarizes recent developments in convolutional neural networks (CNNs) for image recognition, including residual networks (ResNets) and densely connected convolutional networks (DenseNets). It reviews CNN structure and components like convolution, pooling, and ReLU. ResNets address degradation problems in deep networks by introducing identity-based skip connections. DenseNets connect each layer to every other layer to encourage feature reuse, addressing vanishing gradients. The document outlines the structures of ResNets and DenseNets and their advantages over traditional CNNs.

CNN Machine learning DeepLearningAbhishek Sharma

This document discusses convolutional neural networks (CNNs). It explains that CNNs were inspired by research on the human visual system and take a similar approach to teach computers to identify objects in images. The document outlines the key components of CNNs, including convolutional and pooling layers to extract features from images, as well as fully connected layers to classify objects. It also notes that CNNs take pixel data as input and use many examples to generalize and make predictions, similar to how humans learn visual recognition.

Convolutional Neural Network Models - Deep LearningMohamed Loey

The document provides an overview of Convolutional Neural Network (CNN) models developed for image classification, highlighting notable models such as AlexNet, ZFNet, VGGNet, GoogLeNet, and ResNet. It outlines the architecture, training processes, and performance metrics of these models, including their respective top-5 error rates in the ImageNet Large Scale Visual Recognition Challenge. Additionally, it covers specific training techniques and innovations employed by each model to improve accuracy and efficiency.

Autoencoders Tutorial | Autoencoders In Deep Learning | Tensorflow Training |...Edureka!

Introduction to Deep learningleopauly

The document provides an introduction to deep learning, covering its history, key concepts like convolutional neural networks, and research focuses in image segmentation. It discusses various applications of deep learning across fields such as computer vision, natural language processing, and robotics, as well as current activities at the University of Leeds. The text emphasizes the evolution of deep learning, addressing challenges and advancements including containerization for high-performance computing.

Image classification using CNNNoura Hussein

The document describes a project focused on classifying images using a Convolutional Neural Network (CNN) and TensorFlow, utilizing the CIFAR-10 dataset containing 60,000 images across ten categories. Key phases include data preprocessing, building and training the CNN model, and testing the model, which achieved an accuracy of 71.44%. The project emphasizes the effectiveness of CNNs for analyzing visual imagery with minimal preprocessing.

CONVOLUTIONAL NEURAL NETWORKMd Rajib Bhuiyan

This document presents an overview of Convolutional Neural Networks (CNNs), emphasizing their structure and functionality in analyzing visual imagery. It describes the key components of CNNs, including convolutional, pooling, and fully connected layers, and explains how these layers operate on input data. Additionally, it illustrates how CNNs process image inputs through a 3D arrangement of neurons and conclude with an example related to the CIFAR-10 dataset.

CnnNirthika Rajendran

Convolutional neural networks (CNNs) learn multi-level features and perform classification jointly and better than traditional approaches for image classification and segmentation problems. CNNs have four main components: convolution, nonlinearity, pooling, and fully connected layers. Convolution extracts features from the input image using filters. Nonlinearity introduces nonlinearity. Pooling reduces dimensionality while retaining important information. The fully connected layer uses high-level features for classification. CNNs are trained end-to-end using backpropagation to minimize output errors by updating weights.

Transformer Introduction (Seminar Material)Yuta Niki

The document discusses the introduction of transformers and their advanced models like BERT, emphasizing the importance of experiments in NLP and deep learning. It outlines the architecture of transformers, including the encoder and decoder structures, attention mechanisms, and the significance of multi-head attention. Additionally, it provides references for further reading and details the motivations behind various components of the transformer model.

PR-169: EfficientNet: Rethinking Model Scaling for Convolutional Neural NetworksJinwon Lee

The document presents EfficientNet, a model scaling method for convolutional neural networks, aiming to improve both accuracy and efficiency through a compound scaling approach that balances network width, depth, and resolution. The authors demonstrate that traditional scaling methods often lead to sub-optimal results, while their proposed method simplifies the scaling process by fixing layer architecture and uniformly scaling dimensions with constant ratios. Empirical studies show that this approach significantly enhances performance while operating within resource constraints.

Convolutional Neural NetworkVignesh Suresh

This document provides an overview of convolutional neural networks (CNNs). It defines CNNs as multiple layer feedforward neural networks used to analyze visual images by processing grid-like data. CNNs recognize images through a series of layers, including convolutional layers that apply filters to detect patterns, ReLU layers that apply an activation function, pooling layers that detect edges and corners, and fully connected layers that identify the image. CNNs are commonly used for applications like image classification, self-driving cars, activity prediction, video detection, and conversion applications.

Introduction to CNNShuai Zhang

The document discusses convolutional neural networks (CNNs). It begins with an introduction and overview of CNN components like convolution, ReLU, and pooling layers. Convolution layers apply filters to input images to extract features, ReLU introduces non-linearity, and pooling layers reduce dimensionality. CNNs are well-suited for image data since they can incorporate spatial relationships. The document provides an example of building a CNN using TensorFlow to classify handwritten digits from the MNIST dataset.

Introduction to Transformer ModelNuwan Sriyantha Bandara

This document provides an overview of natural language processing (NLP) and the evolution of its techniques from symbolic and statistical methods to neural networks and deep learning. It explains the transformer architecture, focusing on its use of self-attention for sequence-to-sequence tasks and its advantages in handling long-range dependencies. The document also highlights challenges such as context fragmentation due to fixed-length input segments and discusses future directions, including transformer XL and BERT.

NLP State of the Art | BERTshaurya uppal

The document outlines BERT (Bidirectional Encoder Representations from Transformers), a pretrained model by Google that excels in various natural language processing tasks through its bidirectional context learning and attention mechanisms. It explains BERT's architecture, including its masked language modeling and next sentence prediction tasks, as well as its ability to handle out-of-vocabulary words through a word piece tokenizer. BERT's effectiveness and applications are highlighted, indicating its potential for practical uses in areas such as text classification and sentiment analysis.

BERT: Pre-training of Deep Bidirectional Transformers for Language UnderstandingMinh Pham

The document presents a seminar on BERT (Bidirectional Encoder Representations from Transformers), a breakthrough in natural language processing that utilizes deep bidirectional learning to enhance language understanding. It discusses the limitations of previous models and outlines BERT's architecture, pre-training tasks, and fine-tuning procedures, demonstrating its superiority in various NLP tasks. The findings indicate that BERT's bidirectional nature and unique training approach significantly improve performance across many benchmarks.

Attention Is All You NeedIllia Polosukhin

The document discusses the evolution of neural networks in natural language processing (NLP) and the limitations of recurrent neural networks (RNNs) in efficiency and parallelization. It introduces the transformer architecture, which employs self-attention mechanisms and positional encoding to enhance translation tasks by removing bottlenecks associated with traditional encoder-decoder models. The work highlights significant contributions from various studies and provides resources for further exploration.

ViT (Vision Transformer) Review [CDM]Dongmin Choi

The document discusses the use of transformers for image recognition, highlighting that a pure transformer can outperform convolutional networks in image classification tasks when trained on large datasets. It outlines the methodology of converting images into sequences of patches for processing by the transformer architecture and compares its performance against state-of-the-art models. The conclusion emphasizes that while transformers show promise, challenges remain for tasks beyond image classification.

CNN Attention NetworksTaeoh Kim

1) The document discusses different types of attention mechanisms in CNNs including self-attention and simplified attention for recalibration. 2) It reviews the evolution of CNN architectures including AlexNet, VGG, ResNet and variants, DenseNet, ResNeXt, Xception, MobileNet and ShuffleNet. 3) These attention mechanisms and CNN architectures are applied to tasks like image recognition, machine translation and image captioning.

backpropagation in neural networksAkash Goel

Backpropagation is a learning algorithm for multi-layer neural networks, developed in 1969, which adjusts weights based on the error of the output compared to expected results. The process consists of a forward pass to determine output and a backward pass to calculate weight adjustments through gradient descent. Neural networks, which leverage backpropagation, have applications in classification and function approximation, thriving in scenarios with ample training data and where explicit rules are hard to define.

Data AugmentationMd Tajul Islam

The document discusses data augmentation techniques for improving machine learning models. It begins with definitions of data augmentation and reasons for using it, such as enlarging datasets and preventing overfitting. Examples of data augmentation for images, text, and audio are provided. The document then demonstrates how to perform data augmentation for natural language processing tasks like text classification. It shows an example of augmenting a movie review dataset and evaluating a text classifier. Pros and cons of data augmentation are discussed, along with key takeaways about using it to boost performance of models with small datasets.

An introduction to the Transformers architecture and BERTSuman Debnath

The document provides an overview of natural language processing (NLP) and the evolution of its algorithms, particularly focusing on the transformer architecture and BERT. It explains how these models work, highlighting key components such as the encoder mechanisms, attention processes, and pre-training tasks. Additionally, it addresses various use cases of NLP, including text classification, summarization, and question answering.

Faster R-CNN: Towards real-time object detection with region proposal network...Universitat Politècnica de Catalunya

The document discusses Faster R-CNN, a real-time object detection framework that integrates a Region Proposal Network (RPN) to enhance efficiency by sharing convolutional features with the object detection network. It outlines the methodology behind Faster R-CNN, including how the RPN generates proposals, the training steps involved, and experiments that demonstrate its superior proposal quality and detection speed compared to previous methods. Faster R-CNN serves as a foundation for leading object detection systems in significant competitions.

Deep Learning: Recurrent Neural Network (Chapter 10) Larry Guo

Chapter 10 discusses recurrent neural networks (RNNs), emphasizing their ability to process sequential data where order matters. It covers various types, applications, training methods like backpropagation through time (BPTT), and challenges such as exploding/vanishing gradients, introducing solutions like Long Short-Term Memory (LSTM) and Gated Recurrent Units (GRUs). The chapter also highlights practical applications of RNNs in image captioning, chatbots, and natural language processing.

Deep Learning for Computer Vision: Visualization (UPC 2016)Universitat Politècnica de Catalunya

The document covers various visualization techniques in convolutional neural networks (CNNs), including learned weights, activations, representation space, deconvolution-based methods, optimization-based methods, deepdream, and neural style transfer. It highlights the importance of visualization in improving CNN architectures and understanding their inner workings, providing specific methods like occlusion experiments, t-SNE for feature representation, and backpropagation techniques. Additionally, resources for practical implementations and further learning are included.

Deep Learning for Computer Vision: Backward Propagation (UPC 2016)Universitat Politècnica de Catalunya

The document discusses backpropagation and optimization techniques in neural networks, emphasizing the use of supervised and unsupervised learning methods to train models effectively. It details the backpropagation algorithm's functionality, which involves minimizing a loss function using stochastic gradient descent and its variants. Additionally, it illustrates how to compute gradients iteratively through the network layers to improve parameter fitting.

More Related Content

What's hot (20)

Autoencoders Tutorial | Autoencoders In Deep Learning | Tensorflow Training |...Edureka!

Introduction to Deep learningleopauly

Image classification using CNNNoura Hussein

CONVOLUTIONAL NEURAL NETWORKMd Rajib Bhuiyan

CnnNirthika Rajendran

Transformer Introduction (Seminar Material)Yuta Niki

PR-169: EfficientNet: Rethinking Model Scaling for Convolutional Neural NetworksJinwon Lee

Convolutional Neural NetworkVignesh Suresh

Introduction to CNNShuai Zhang

Introduction to Transformer ModelNuwan Sriyantha Bandara

NLP State of the Art | BERTshaurya uppal

BERT: Pre-training of Deep Bidirectional Transformers for Language UnderstandingMinh Pham

Attention Is All You NeedIllia Polosukhin

ViT (Vision Transformer) Review [CDM]Dongmin Choi

CNN Attention NetworksTaeoh Kim

backpropagation in neural networksAkash Goel

Data AugmentationMd Tajul Islam

An introduction to the Transformers architecture and BERTSuman Debnath

Faster R-CNN: Towards real-time object detection with region proposal network...Universitat Politècnica de Catalunya

Deep Learning: Recurrent Neural Network (Chapter 10) Larry Guo

Autoencoders Tutorial | Autoencoders In Deep Learning | Tensorflow Training |...Edureka!

Introduction to Deep learningleopauly

Image classification using CNNNoura Hussein

CONVOLUTIONAL NEURAL NETWORKMd Rajib Bhuiyan

CnnNirthika Rajendran

Transformer Introduction (Seminar Material)Yuta Niki

PR-169: EfficientNet: Rethinking Model Scaling for Convolutional Neural NetworksJinwon Lee

Convolutional Neural NetworkVignesh Suresh

Introduction to CNNShuai Zhang

Introduction to Transformer ModelNuwan Sriyantha Bandara

NLP State of the Art | BERTshaurya uppal

BERT: Pre-training of Deep Bidirectional Transformers for Language UnderstandingMinh Pham

Attention Is All You NeedIllia Polosukhin

ViT (Vision Transformer) Review [CDM]Dongmin Choi

CNN Attention NetworksTaeoh Kim

backpropagation in neural networksAkash Goel

Data AugmentationMd Tajul Islam

An introduction to the Transformers architecture and BERTSuman Debnath

Faster R-CNN: Towards real-time object detection with region proposal network...Universitat Politècnica de Catalunya

Deep Learning: Recurrent Neural Network (Chapter 10) Larry Guo

Viewers also liked (20)

Deep Learning for Computer Vision: Visualization (UPC 2016)Universitat Politècnica de Catalunya

Deep Learning for Computer Vision: Backward Propagation (UPC 2016)Universitat Politècnica de Catalunya

Deep Learning for Computer Vision: Memory usage and computational considerati...Universitat Politècnica de Catalunya

The document discusses estimating memory usage and computational requirements for designing deep neural networks, focusing on factors like network size, mini-batch sizes, and effective aperture sizes. It covers methods for calculating memory and computational complexity, providing insights into the architecture and training considerations of convolutional networks. Additionally, strategies for improving accuracy through increased network depth and width are highlighted, along with the implications for memory constraints on modern GPUs.

Deep Learning for Computer Vision: Deep Networks (UPC 2016)Universitat Politècnica de Catalunya

The document discusses the architecture and functioning of Convolutional Neural Networks (CNNs), emphasizing the layers involved, such as hidden, output, and convolutional layers. It describes techniques for reducing the number of parameters, including the use of local connections, translation invariance, pooling, padding, and stride. The examples provided reference specific architectures, such as lenet-5, to illustrate how these concepts are applied in practice.

Deep Learning for Computer Vision: ImageNet Challenge (UPC 2016)Universitat Politècnica de Catalunya

The document discusses the ImageNet Large Scale Visual Recognition Challenge (ILSVRC), detailing the dataset comprising 1,000 object classes with 1.2 million training images and 100,000 test images. It highlights the evolution of image classification techniques, with a focus on architectures like AlexNet, GoogLeNet, and ResNet, emphasizing improvements in accuracy and feature extraction methods. Additionally, it mentions various publications and contributions by researchers in the development of convolutional neural networks and visualization techniques.

Deep Learning for Computer Vision: Object Detection (UPC 2016)Universitat Politècnica de Catalunya

Deep Learning for Computer Vision: Software Frameworks (UPC 2016)Universitat Politècnica de Catalunya

The document provides an overview of various deep learning software frameworks including Caffe, Theano, Keras, Torch, TensorFlow, MXNet, and others, detailing their features, pros and cons. It covers aspects such as implementation languages, performance benchmarks, user interfaces, and various functionalities for model training and evaluation. Additionally, it includes examples of code for creating neural networks using some of these frameworks.

Deep Learning for Computer Vision: Unsupervised Learning (UPC 2016)Universitat Politècnica de Catalunya

The document presents techniques for semi-supervised and transfer learning in deep learning, challenging the notion that deep learning requires a million labeled examples. It discusses various assumptions and models, including manifold hypothesis and energy-based models, that leverage unlabelled data to learn useful representations. Several methods such as autoencoders, ladder networks, and implications for learning from video are detailed, highlighting the active research in this area.

Deep Learning for Computer Vision: Image Classification (UPC 2016)Universitat Politècnica de Catalunya

The document discusses image classification, highlighting the importance of train/test splits, metrics, and models in building predictive systems. It emphasizes a structured approach to data preparation and performance assessment, as well as outlines various metrics like accuracy, precision, and recall. Additionally, it introduces the role of linear models and neurons in neural network architecture for classification tasks.

Deep Learning for Computer Vision: Face Recognition (UPC 2016)Universitat Politècnica de Catalunya

This document summarizes lecture material on face recognition. It discusses face detection, alignment, identification, and verification. It also reviews several popular face recognition systems like DeepFace, FaceNet, and Deep ID. Experiments were conducted at UPC on various databases using deep neural networks like VGG, GoogleNet, and ResNet. The best results achieved 97% accuracy on a database of 3,500 identities and 100,000 images. Ongoing work involves verification using advanced techniques like joint Bayesian models, siamese networks, and triplets.

Deep Learning for Computer Vision: Segmentation (UPC 2016)Universitat Politècnica de Catalunya

The document discusses various techniques for image segmentation, including semantic segmentation, instance segmentation, and fully convolutional networks. It covers the methodologies for detecting and classifying objects within images, with a focus on pixel labeling and the use of convolutional and deconvolutional layers. The lecture also highlights relevant resources and papers in the field of computer vision.

Deep Learning for Computer Vision: Recurrent Neural Networks (UPC 2016)Universitat Politècnica de Catalunya

The document provides an overview of recurrent neural networks (RNNs) and their variants, including long short-term memory (LSTM) and gated recurrent units (GRU). It discusses the mechanisms of RNNs, their challenges with long-term memory, and presents LSTM as a solution to avoid issues like gradient vanishing. Additionally, it touches on various applications of these neural network architectures, such as machine translation and image classification.

Deep Learning for Computer Vision: Optimization (UPC 2016)Universitat Politècnica de Catalunya

The lecture discusses optimizing deep networks using stochastic gradient descent (SGD) and addresses challenges such as non-convex optimization, local minima, and saddle points. It highlights the importance of learning rate selection, weight initialization, and techniques like batch normalization to improve convergence. Various extensions of SGD, including momentum and Adam, are presented to enhance the optimization process.

Deep Learning for Computer Vision: Image Retrieval (UPC 2016)Universitat Politècnica de Catalunya

The document discusses advanced methods for content-based image retrieval, focusing on generating rankings of similar images based on image queries. It emphasizes the use of convolutional neural network (CNN) representations and specific techniques like siamese networks for learning effective image descriptors. Additionally, it highlights datasets and approaches to improve retrieval accuracy through ranking and similarity metrics.

Deep Learning for Computer Vision: Welcome (UPC TelecomBCN 2016)Universitat Politècnica de Catalunya

The document outlines the structure and instructors of a course at Universitat Politècnica de Catalunya, emphasizing the importance of data science and computer vision in the context of growing multimedia content. It provides links to the course website and acknowledges contributors while detailing the course schedule and grading criteria. The instructors include notable faculty and candidates from both UPC and other institutions.

Deep Learning for Computer Vision: Generative models and adversarial training...Universitat Politècnica de Catalunya

Generative adversarial networks (GANs) use two neural networks, a generator and discriminator, that compete against each other. The generator aims to produce realistic samples to fool the discriminator, while the discriminator tries to distinguish real samples from generated ones. This adversarial training can produce high-quality, sharp samples but is challenging to train as the generator and discriminator must be carefully balanced.

Deep Learning for Computer Vision: Transfer Learning and Domain Adaptation (U...Universitat Politècnica de Catalunya

This document discusses transfer learning and domain adaptation techniques for training deep learning models when limited labeled data is available. It describes using pre-trained networks to extract features, fine-tuning networks on related tasks, and unsupervised domain adaptation methods. Transfer learning can outperform training from scratch by leveraging knowledge gained from large labeled datasets.

Deep Learning for Computer Vision: Saliency Prediction (UPC 2016)Universitat Politècnica de Catalunya

The document discusses the development of saliency prediction models, particularly focusing on the architectures of shallow and deep networks like Junting Net and SalNet, used for generating saliency maps of images. It includes acknowledgments to contributors, technical details such as training methodologies and datasets like SALICON and ISUN, and highlights successes in the CVPR challenges. Key metrics for evaluating model performance against human eye fixations are also presented.

Deep Learning for Computer Vision: Video Analytics (UPC 2016)Universitat Politècnica de Catalunya

The document outlines a lecture on video analytics presented by Xavier Giró-i-Nieto, covering topics such as scene classification, object detection, and tracking using convolutional neural networks (CNNs). It examines current methods, including 3D convolutional networks for spatiotemporal features and discusses various architectures and results from related research. The presentation emphasizes the importance of deep learning in analyzing video data and incorporates findings from multiple research studies.

Deep Learning for Computer Vision: Medical Imaging (UPC 2016) - Universitat Politècnica de Catalunya

The document discusses the applications of deep learning in medical imaging, including techniques for image description, diagnosis, reconstruction, and model selection. It highlights several projects such as malaria parasite detection and brain tumor segmentation challenges, detailing methodologies and results. Key contributions from various professors and students are noted, showcasing collaborative efforts and advancements in medical image analysis.

Similar to Deep Learning for Computer Vision: Data Augmentation (UPC 2016) (20)

Seminar_Presentation_ppt - AyushDixit52

The document outlines a seminar on data augmentation, highlighting its importance in reducing overfitting in deep learning models, particularly for image classification tasks. It discusses various techniques, including random cropping, translations, and intensity alterations, to enhance training datasets and improve classification performance. Ultimately, it emphasizes that data augmentation aids in making networks invariant to specific transformations and is beneficial when working with unlabeled datasets.

imageclassification-160206090009.pdf - KammetaJoshna

The document discusses image classification using deep neural networks. It provides background on image classification and convolutional neural networks. The document outlines techniques like activation functions, pooling, dropout and data augmentation to prevent overfitting. It summarizes a paper on ImageNet classification using CNNs with multiple convolutional layers and GPU training. Key results showed improved accuracy with larger datasets and model capacity.

Lecture 29 Convolutional Neural Networks - Computer Vision Spring2015 - Jia-Bin Huang

The document provides an overview of convolutional neural networks (CNNs) in computer vision, detailing their structure, training processes, and applications such as image classification and segmentation. It discusses the evolution of CNNs, techniques like backpropagation and transfer learning, and methods for understanding and visualizing CNNs. Additionally, it highlights important architectures, training strategies, and recent advancements in the field.

ImageNet Classification with Deep Convolutional Neural Networks - Willy Marroquin (WillyDevNET)

The document details the training of a large, deep convolutional neural network that achieved significant advancements in image classification on the ImageNet dataset, specifically in the ILSVRC competitions. With 60 million parameters, the network demonstrated top-1 and top-5 error rates of 37.5% and 17.0%, outperforming previous state-of-the-art results. The study also discusses the architecture and innovative techniques employed, such as dropout for regularization and the use of GPUs for efficient training, which contributed to improved performance and reduced training time.

Convolutional neural network - Yan Xu

The document presents an overview of Convolutional Neural Networks (CNNs), detailing their advantages over traditional neural networks, specific architectures like AlexNet, and their performance in tasks such as image classification and object detection. It highlights the role of deep learning techniques and technologies that have contributed to the rise of CNNs, including improved optimization methods and powerful computational resources. Additionally, it discusses the use of TensorFlow and TFLearn for implementing CNNs, emphasizing the state-of-the-art results achieved with these models while acknowledging ongoing challenges in interpreting their mechanisms.

Saptashwa_Mitra_Sitakanta_Mishra_Final_Project_Report - Sitakanta Mishra

This document discusses comparing the performance of different convolutional neural networks (CNNs) when trained on large image datasets using Apache Spark. It summarizes the datasets used - CIFAR-10 and ImageNet - and preprocessing done to standardize image sizes. It then provides an overview of CNN architecture, including convolutional layers, pooling layers, and fully connected layers. Finally, it introduces SparkNet, a framework that allows training deep networks using Spark by wrapping Caffe and providing tools for distributed deep learning on Spark. The goal is to see if SparkNet can provide faster training times compared to a single machine by distributing training across a cluster.

Convolutional neural networks 이론과 응용 - 홍배 김

This document introduces convolutional neural networks (CNNs). It discusses how CNNs extract features using filters and pooling to build up representations of images while reducing the number of parameters. The key operations of CNNs including convolution, nonlinear activation, pooling and fully connected layers are explained. Examples of CNN applications are provided. The evolution of CNNs is then reviewed, from LeNet and AlexNet to VGGNet, GoogleNet, and improvements like ReLU, dropout, and batch normalization that helped CNNs train better and go deeper.

Deep learning: Modeling high-level face features through deep networks - Nelson Forte

The document discusses the use of deep learning, particularly convolutional networks, for modeling high-level facial features with low error rates and reduced learning time. It highlights various techniques and adaptive methods used to improve performance in image classification, referencing results from multiple models, including Clarifai's attempts on original training data. The document also includes references to relevant literature in the field of neural networks and image recognition.

"Energy-efficient Hardware for Embedded Vision and Deep Convolutional Neural ... - Edge AI and Vision Alliance

The document discusses advancements in energy-efficient hardware for embedded vision and deep learning, focusing on optimized algorithms and architectures to improve video processing. It highlights the necessity for energy-efficient systems due to the growing demand for video data and outlines various techniques such as joint algorithm and hardware design, multi-scale detection, and parallel processing. The result is the development of hardware capable of real-time object detection with significantly reduced power consumption.

Finding the best solution for Image Processing - Tech Triveni

The document discusses the evolution and methodologies of image processing, particularly through the use of Convolutional Neural Networks (CNNs) and advancements such as Residual Networks (ResNet). It covers historical perspectives, various CNN architectures (like AlexNet and VGGNet), and highlights the improvements in learning capacity over the years. Additionally, it acknowledges the challenges faced in deep neural networks and outlines future research directions in the field.

Architecture Design for Deep Neural Networks I - Wanjin Yu

This document summarizes Gao Huang's presentation on neural architectures for efficient inference. The presentation covered three parts: 1) macro-architecture innovations in convolutional neural networks (CNNs) such as ResNet, DenseNet, and multi-scale networks; 2) micro-architecture innovations including group convolution, depthwise separable convolution, and attention mechanisms; and 3) moving from static networks to dynamic networks that can adaptively select simpler or more complex models based on input complexity. The key idea is to enable faster yet accurate inference by matching computational cost to input difficulty.

Recent advances of AI for medical imaging : Engineering perspectives - Namkug Kim

The document discusses advancements in deep learning and its applications in medical imaging, particularly in radiology. It highlights significant breakthroughs in neural networks, including convolutional and recurrent networks, and their use in tasks such as image recognition, speech recognition, and cancer detection. Additionally, it outlines various research projects and collaborations focused on medical imaging AI technology, emphasizing the importance of data cleansing and model training for accurate diagnostics.

State-of-the-art Image Processing across all domains - Knoldus Inc.

The document presents an overview of image processing history and various approaches, including convolutional neural networks (CNN) and residual neural networks (ResNet). It highlights the advancements these technologies have made in different fields like healthcare and manufacturing while comparing architectures such as AlexNet, VGGNet, and ResNet. Additionally, it discusses the importance of open-source contributions and includes references for further reading.

Faire de la reconnaissance d'images avec le Deep Learning - Cristina & Pierre... - Jedha Bootcamp

The document discusses the fundamentals of deep learning, particularly through the use of Convolutional Neural Networks (CNNs) for image processing and classification. It outlines key components such as convolutional layers, activation functions, and pooling techniques, and emphasizes the importance of feature extraction and dimensionality reduction. The content also addresses practical applications of CNNs, techniques for dataset creation, and the learning process involving backpropagation.

A Fully Progressive approach to Single image super-resolution - Mohammed Ashour

This document provides an overview of single-image super-resolution (SISR) and introduces a fully progressive approach to improve image reconstruction quality. It discusses the importance of high-resolution images in various fields, highlights key techniques including generative adversarial networks (GANs) and dense compression units, and examines the impact of curriculum learning in model training. The evaluation of these techniques shows improvements in reconstruction accuracy and efficiency over traditional methods.

DLD meetup 2017, Efficient Deep Learning - Brodmann17

The document discusses efficient techniques for deep learning on edge devices. It begins by noting that deep neural networks have high computational complexity which makes inference inefficient for edge devices without powerful GPUs. It then outlines the deep learning stack from hardware to libraries to frameworks to algorithms. The document focuses on how algorithms define model complexity and discusses the evolution of CNN architectures from LeNet5 to ResNet which generally increased in complexity. It covers techniques for reducing model size and operations like pruning, quantization, and knowledge distillation. The challenges of real-life applications on edge devices are discussed.

AI and Deep Learning - Subrat Panda, PhD

The document outlines a presentation by Subrat Panda and Biswa Gourav Singh on deep learning, covering its fundamentals, applications across various industries, and practical steps for implementation using frameworks like TensorFlow. It emphasizes the importance of data, algorithm selection, and considerations such as transfer learning and hyper-parameter optimization. The session also discusses challenges in deep learning, such as explainability and debugging, while encouraging participation in the Indian Deep Learning Initiative.

Meetup 18/10/2018 - Artificiële intelligentie en mobiliteit - Digipolis Antwerpen

Deep learning models require massive computing power that makes running them on edge devices difficult. Several neural network architectures have been developed to address this, including MobileNets which use depthwise separable convolutions, SqueezeNet with "fire modules", and XNOR-Net which trains binary convolutional neural networks. These approaches reduce model sizes and computations needed while maintaining accuracy, making edge deployment more feasible. Further research and support is still needed but prototypes could be deployed within months and full solutions within a few years to provide AI capabilities to more users.

Convolutional neural networks - Learning Courses Online

This document provides an overview of convolutional neural networks (ConvNets). It begins by briefly introducing deep learning and explaining that ConvNets are a supervised deep learning method. It then discusses how ConvNets learn feature representations directly from data in a hierarchical manner using successive layers that apply filters to local regions of the input. The document provides examples of filters and feature maps and explains how techniques like pooling and multiple filters allow ConvNets to capture different features and build translation invariance. It concludes by discussing how ConvNets can be used for tasks like object detection and examples of popular ConvNet libraries.

Research Paper of Image Recognition .02.pdf - tsmabhi

More from Universitat Politècnica de Catalunya (20)

Deep Generative Learning for All - The Gen AI Hype (Spring 2024) - Universitat Politècnica de Catalunya

The document discusses deep generative learning, contrasting discriminative and generative models, and explores various architectures including GANs, VAEs, and diffusion models for tasks such as image and music generation. It outlines the technical elements involved in training these models and acknowledges contributions from various researchers. The content emphasizes the advancements and applications of generative models across different domains, providing references to recent noteworthy works.

Deep Generative Learning for All - Universitat Politècnica de Catalunya

This document provides an overview of deep generative learning and summarizes several key generative models including GANs, VAEs, diffusion models, and autoregressive models. It discusses the motivation for generative models and their applications such as image generation, text-to-image synthesis, and enhancing other media like video and speech. Example state-of-the-art models are provided for each application. The document also covers important concepts like the difference between discriminative and generative modeling, sampling techniques, and the training procedures for GANs and VAEs.

The Transformer in Vision | Xavier Giro | Master in Computer Vision Barcelona... - Universitat Politècnica de Catalunya

The document discusses the Vision Transformer (ViT) model for computer vision tasks. It covers:
1. How ViT tokenizes images into patches and uses position embeddings to encode spatial relationships.
2. ViT uses a class embedding to trigger class predictions, unlike CNNs which have decoders.
3. The receptive field of ViT grows as the attention mechanism allows elements to attend to other distant elements in later layers.
4. Initial results showed ViT performance was comparable to CNNs when trained on large datasets but lagged CNNs trained on smaller datasets like ImageNet.

Towards Sign Language Translation & Production | Xavier Giro-i-Nieto - Universitat Politècnica de Catalunya

The document discusses advancements in sign language translation and production, highlighting the significance of accessibility for deaf individuals. It covers a range of topics, including a crash course on sign languages, the state-of-the-art How2Sign dataset, applications in real-world contexts, and ongoing challenges in the field. Key areas of research focus on neural sign language translation and production techniques to enhance communication and understanding across modalities.

The Transformer - Xavier Giró - UPC Barcelona 2021 - Universitat Politècnica de Catalunya

Xavier Giro-I-Nieto's lecture on transformers outlines key mechanisms such as self-attention and multi-head self-attention, explaining their roles in natural language processing and image generation. The discussion also covers positional encodings and the removal of recurrent layers in the transformer architecture. Various references and studies are provided to support these concepts, demonstrating the broad applicability of transformers beyond language.

Learning Representations for Sign Language Videos - Xavier Giro - NIST TRECVI... - Universitat Politècnica de Catalunya

This document outlines the work of Xavier Giro-i-Nieto on learning representations for sign language videos, emphasizing instance search and action recognition through various methodologies. It details several research contributions, including advancements in visual search, video saliency prediction, and object segmentation, referencing various papers and results achieved in conferences and workshops. The document serves as a comprehensive overview of the evolution of techniques in multimedia analysis and sign language understanding.

Open challenges in sign language translation and production - Universitat Politècnica de Catalunya

The document discusses the challenges in sign language translation and production, emphasizing the importance of accessibility for the deaf community, particularly in light of the COVID-19 pandemic. It provides a brief overview of sign languages and highlights ongoing research and technological developments aimed at improving sign language recognition and translation. The document also addresses the complexities involved in developing effective models for translating and producing sign language content.

Generation of Synthetic Referring Expressions for Object Segmentation in Videos - Universitat Politècnica de Catalunya

The document discusses a method for generating synthetic referring expressions to improve object segmentation in videos, particularly using the YouTube-VIS dataset. It evaluates the generated expressions' effectiveness through experiments and concludes that while synthetic expressions do not outperform human-produced ones, they are beneficial for pre-training models without additional annotation costs. Future work involves enhancing the method with more cues and applying it to other datasets.

Discovery and Learning of Navigation Goals from Pixels in Minecraft - Universitat Politècnica de Catalunya

The thesis explores self-supervised learning and reinforcement learning techniques for discovering navigation goals in Minecraft using pixel inputs. It discusses various approaches, including contrastive and variational methods, and presents experiments demonstrating successful skill acquisition from both random and expert trajectories. The findings indicate that expert trajectories can effectively facilitate skill discovery in an embodied AI context.

Learn2Sign : Sign language recognition and translation using human keypoint e... - Universitat Politècnica de Catalunya

The document presents a master's thesis on sign language recognition and translation using human keypoint estimation and a transformer model. It discusses the communication challenges faced by sign language users in digital environments and proposes solutions such as the creation of automatically generated subtitles and translations. The research includes using datasets like How2Sign and examining methods and limitations of current sign language translation technologies.

Intepretability / Explainable AI for Deep Neural Networks - Universitat Politècnica de Catalunya

This document discusses interpretability and explainable AI (XAI) in neural networks. It begins by providing motivation for why explanations of neural network predictions are often required. It then provides an overview of different interpretability techniques, including visualizing learned weights and feature maps, attribution methods like class activation maps and guided backpropagation, and feature visualization. Specific examples and applications of each technique are described. The document serves as a guide to interpretability and explainability in deep learning models.

Convolutional Neural Networks - Xavier Giro - UPC TelecomBCN Barcelona 2020 - Universitat Politècnica de Catalunya

The document provides an overview of convolutional neural networks (CNNs), detailing their architecture, interpretability, computation, receptive fields, and applications. It discusses the historical evolution of CNNs, key architectures such as LeNet and AlexNet, and the advantages of applying CNNs to various data types, including images, text, and speech. Additionally, it highlights techniques for improving accuracy and computational efficiency in training deep learning models.

Self-Supervised Audio-Visual Learning - Xavier Giro - UPC TelecomBCN Barcelon... - Universitat Politècnica de Catalunya

The document presents a lecture outline on self-supervised audio-visual learning by Xavier Giro-i-Nieto, covering topics such as motivation, feature learning, cross-modal translation, and embodied AI. It discusses various self-supervised learning methods, including generative, predictive, and contrastive approaches, as well as their application in training models on unlabeled data. Key references and examples illustrate how ambient sounds can guide visual understanding, contributing to advancements in audio-visual model performance.

Attention for Deep Learning - Xavier Giro - UPC TelecomBCN Barcelona 2020 - Universitat Politècnica de Catalunya

The document discusses attention mechanisms in various neural network applications, such as Seq2Seq models for machine translation and image captioning. Key concepts include the roles of keys, queries, and values, as well as different types of attention mechanisms like additive, multiplicative, and scaled dot product. Case studies demonstrate how attention improves the predictive capabilities of models in tasks involving images and automatic speech recognition.

Generative Adversarial Networks GAN - Xavier Giro - UPC TelecomBCN Barcelona ... - Universitat Politècnica de Catalunya

This document covers generative adversarial networks (GANs), explaining their basic structure, which consists of a generator and a discriminator that compete with each other to improve the quality of the generated samples. It addresses applications such as image enhancement, audio generation, and language translation, as well as the importance of adversarial training in their development. It also mentions research and pioneers in the field of GANs at the Universitat Politècnica de Catalunya.

Q-Learning with a Neural Network - Xavier Giró - UPC Barcelona 2020 - Universitat Politècnica de Catalunya

The document discusses Q-learning and its integration with neural networks, focusing on Deep Q-Networks (DQN) and their evolution in deep reinforcement learning. It outlines several approaches and improvements to DQN, including Double DQN, Prioritized Experience Replay, and Dueling Networks, addressing challenges such as overestimation bias and sample efficiency. Additionally, it touches on methods for handling continuous actions in reinforcement learning, specifically through the use of Deep Deterministic Policy Gradient (DDPG).

Language and Vision with Deep Learning - Xavier Giró - ACM ICMR 2020 (Tutorial) - Universitat Politècnica de Catalunya

The document outlines various models and techniques related to generative and discriminative tasks in language and vision, focusing on image and video captioning, visual question answering, and sign language translation. It references numerous studies and methods, including encoder-decoder representations, attention mechanisms, and multimodal machine translation. Key topics include combating data bias, dynamic memory networks, and advancements in visual reasoning and object grounding through language.

Image Segmentation with Deep Learning - Xavier Giro & Carles Ventura - ISSonD... - Universitat Politècnica de Catalunya

This document summarizes image segmentation techniques using deep learning. It begins with an overview of semantic segmentation and instance segmentation. It then discusses several techniques for semantic segmentation, including deconvolution/transposed convolution for learnable upsampling, skip connections to combine predictions from different CNN depths, and dilated convolutions to increase the receptive field without losing resolution. For instance segmentation, it covers proposal-based methods like Mask R-CNN, and single-shot and recurrent approaches as alternatives to proposal-based models.

Curriculum Learning for Recurrent Video Object Segmentation - Universitat Politècnica de Catalunya

The document discusses curriculum learning applied to recurrent video object segmentation, presenting a methodology where training data is introduced progressively from simple to complex concepts. It explores the Kitti-MOTS dataset, describing an end-to-end recurrent network model for video object segmentation, along with various experimental setups and evaluation metrics. The conclusions highlight the importance of dataset understanding and suggest directions for future work, including further adjustments to training techniques.

Deep Self-supervised Learning for All - Xavier Giro - X-Europe 2020 - Universitat Politècnica de Catalunya

The document provides an overview of representation learning and of unsupervised and self-supervised learning methods in machine learning and artificial intelligence. It covers key concepts such as predictive and contrastive methods, the role of autoencoders, and applications in video and image processing. The presentation includes various examples, research references, and visual representation strategies to enhance understanding of these learning paradigms.

dnkxnddnkxndnrkdnxnxkdSkdnneodnrodndocndodnd0dndjxoxnxndkxnxkdndkxndkdjndnrnidnroz doendodnrodnxkdnrocngksjrndkdnr dnkxnddnkxndnrkdnxnxkdSkdnneodnrodndocndodnd0dndjxoxnxndkxnxkdndkxndkdjndnrnidnroz doendodnrodnxkdnrocngksjrndkdnr dnkxnddnkxndnrkdnxnxkdSkdnneodnrodndocndodnd0dndjxoxnxndkxnxkdndkxndkdjndnrnidnroz doendodnrodnxkdnrocngksjrndkdnr dnkxnddnkxndnrkdnxnxkdSkdnneodnrodndocndodnd0dndjxoxnxndkxnxkdndkxndkdjndnrnidnroz doendodnrodnxkdnrocngksjrndkdnr dnkxnddnkxndnrkdnxnxkdSkdnneodnrodndocndodnd0dndjxoxnxndkxnxkdndkxndkdjndnrnidnroz doendodnrodnxkdnrocngksjrndkdnr dnkxnddnkxndnrkdnxnxkdSkdnneodnrodndocndodnd0dndjxoxnxndkxnxkdndkxndkdjndnrnidnroz doendodnrodnxkdnrocngksjrndkdnr dnkxnddnkxndnrkdnxnxkdSkdnneodnrodndocndodnd0dndjxoxnxndkxnxkdndkxndkdjndnrnidnroz doendodnrodnxkdnrocngksjrndkdnr dnkxnddnkxndnrkdnxnxkdSkdnneodnrodndocndodnd0dndjxoxnxndkxnxkdndkxndkdjndnrnidnroz doendodnrodnxkdnrocngksjrndkdnr dnkxnddnkxndnrkdnxnxkdSkdnneodnrodndocndodnd0dndjxoxnxndkxnxkdndkxndkdjndnrnidnroz doendodnrodnxkdnrocngksjrndkdnr dnkxnddnkxndnrkdnxnxkdSkdnne

最新版美国加利福尼亚大学旧金山法学院毕业证（UCLawSF毕业证书）定制taqyea

一比一还原加利福尼亚大学旧金山法学院毕业证/UCLawSF毕业证书2025原版【q薇1954292140】我们专业办理澳洲大学毕业证成绩单，美国大学毕业证成绩单,英国大学毕业证成绩单，加拿大大学毕业证成绩单，新加坡大学毕业证成绩单，新西兰大学毕业证成绩单，韩国大学毕业证成绩单，日本大学毕业证成绩单。【复刻一套加利福尼亚大学旧金山法学院毕业证成绩单信封等材料最强攻略,Buy University of California College of the Law, San Francisco Transcripts】购买日韩成绩单、英国大学成绩单、美国大学成绩单、澳洲大学成绩单、加拿大大学成绩单（q微1954292140）新加坡大学成绩单、新西兰大学成绩单、爱尔兰成绩单、西班牙成绩单、德国成绩单。成绩单的意义主要体现在证明学习能力、评估学术背景、展示综合素质、提高录取率，以及是作为留信认证申请材料的一部分。加利福尼亚大学旧金山法学院成绩单能够体现您的的学习能力，包括加利福尼亚大学旧金山法学院课程成绩、专业能力、研究能力。（q微1954292140）具体来说，成绩报告单通常包含学生的学习技能与习惯、各科成绩以及老师评语等部分，因此，成绩单不仅是学生学术能力的证明，也是评估学生是否适合某个教育项目的重要依据！我们承诺采用的是学校原版纸张（原版纸质、底色、纹路）我们工厂拥有全套进口原装设备，特殊工艺都是采用不同机器制作，仿真度基本可以达到100%，所有成品以及工艺效果都可提前给客户展示，不满意可以根据客户要求进行调整，直到满意为止！【主营项目】一、工作未确定，回国需先给父母、亲戚朋友看下文凭的情况，办理毕业证|办理文凭: 买大学毕业证|买大学文凭【q薇1954292140】加利福尼亚大学旧金山法学院学位证明书如何办理申请？二、回国进私企、外企、自己做生意的情况，这些单位是不查询毕业证真伪的，而且国内没有渠道去查询国外文凭的真假，也不需要提供真实教育部认证。鉴于此，办理美国成绩单加利福尼亚大学旧金山法学院毕业证【q薇1954292140】国外大学毕业证, 文凭办理, 国外文凭办理, 留信网认证三.材料咨询办理、认证咨询办理请加学历顾问【微信:1954292140】加利福尼亚大学旧金山法学院毕业证购买指大学文凭购买，毕业证办理和文凭办理。学院文凭定制，学校原版文凭补办，扫描件文凭定做，100%文凭复刻。

最新版美国约翰霍普金斯大学毕业证（JHU毕业证书）原版定制Taqyea

2025原版约翰霍普金斯大学毕业证书pdf电子版【q薇1954292140】美国毕业证办理JHU约翰霍普金斯大学毕业证书多少钱？【q薇1954292140】海外各大学Diploma版本，因为疫情学校推迟发放证书、证书原件丢失补办、没有正常毕业未能认证学历面临就业提供解决办法。当遭遇挂科、旷课导致无法修满学分，或者直接被学校退学，最后无法毕业拿不到毕业证。此时的你一定手足无措，因为留学一场，没有获得毕业证以及学历证明肯定是无法给自己和父母一个交代的。【复刻约翰霍普金斯大学成绩单信封,Buy The Johns Hopkins University Transcripts】购买日韩成绩单、英国大学成绩单、美国大学成绩单、澳洲大学成绩单、加拿大大学成绩单（q微1954292140）新加坡大学成绩单、新西兰大学成绩单、爱尔兰成绩单、西班牙成绩单、德国成绩单。成绩单的意义主要体现在证明学习能力、评估学术背景、展示综合素质、提高录取率，以及是作为留信认证申请材料的一部分。约翰霍普金斯大学成绩单能够体现您的的学习能力，包括约翰霍普金斯大学课程成绩、专业能力、研究能力。（q微1954292140）具体来说，成绩报告单通常包含学生的学习技能与习惯、各科成绩以及老师评语等部分，因此，成绩单不仅是学生学术能力的证明，也是评估学生是否适合某个教育项目的重要依据！我们承诺采用的是学校原版纸张（原版纸质、底色、纹路）我们工厂拥有全套进口原装设备，特殊工艺都是采用不同机器制作，仿真度基本可以达到100%，所有成品以及工艺效果都可提前给客户展示，不满意可以根据客户要求进行调整，直到满意为止！【主营项目】一、工作未确定，回国需先给父母、亲戚朋友看下文凭的情况，办理毕业证|办理文凭: 买大学毕业证|买大学文凭【q薇1954292140】约翰霍普金斯大学学位证明书如何办理申请？二、回国进私企、外企、自己做生意的情况，这些单位是不查询毕业证真伪的，而且国内没有渠道去查询国外文凭的真假，也不需要提供真实教育部认证。鉴于此，办理美国成绩单约翰霍普金斯大学毕业证【q薇1954292140】国外大学毕业证, 文凭办理, 国外文凭办理, 留信网认证

Attendance Presentation Project Excel.pptxs2025266191

Allotted-MBBS-Student-list-batch-2021.pdfsubhansaifi0603

Model Evaluation & Visualisation part of a series of intro modules for data ...brandonlee626749

PPT1_CB_VII_CS_Ch3_FunctionsandChartsinCalc.ppsxanimaroy81

Lesson-3_Program-Outcomes-and-Student-Learning-Outcomes_For-Students.pdfSarahMaeDuallo

NASA ESE Study Results v4 05.29.2020.pptxCiroAlejandroCamacho

Shifting Focus on AI: How it Can Make a Positive Difference1508 A/S

@Reset-Password.pptx presentakh;kenvtionMarkLariosa1

MRI Pulse Sequence in radiology physics.pptxBelaynehBishaw

定制OCAD学生卡加拿大安大略艺术与设计大学成绩单范本,OCAD成绩单复刻taqyed

lecture12.pdf Introduction to bioinformaticsSergeyTsygankov6

Starbucks in the Indian market through its joint venture.sales480687

一比一原版(TUC毕业证书)开姆尼茨工业大学毕业证如何办理taqyed

英国毕业证范本利物浦约翰摩尔斯大学成绩单底纹防伪LJMU学生证办理学历认证 taqyed

Residential Zone 4 for industrial villageMdYasinArafat13

Camuflaje Tipos Características Militar 2025.ppte58650738

Presentation by Tariq & Mohammed (1).pptxAbooddSandoqaa

YEAP !NOT WHAT YOU THINK aakshdjdncnkenfjpayalmistryb

最新版美国加利福尼亚大学旧金山法学院毕业证（UCLawSF毕业证书）定制taqyea

最新版美国约翰霍普金斯大学毕业证（JHU毕业证书）原版定制Taqyea

Deep Learning for Computer Vision: Data Augmentation (UPC 2016)

1. [course site] Augmentation. Day 2, Lecture 2. Eva Mohedano

2. Introduction
ImageNet Classification with Deep Convolutional Neural Networks, Krizhevsky A., 2012
ImageNet Large-Scale Visual Recognition Challenge (ILSVRC): 1.2 million training images, 50,000 validation images, and 150,000 testing images
Architecture of 5 convolutional + 3 fully connected layers = 60 million parameters, about 650,000 neurons.
With this many parameters, overfitting is the main risk.

3. Ways to reduce overfitting
● Reduce network capacity
● Dropout
● Data augmentation

4. Ways to reduce overfitting
● Reduce network capacity
● Dropout
● Data augmentation
Reducing network capacity: 1% of total parameters (884K). Decrease in performance.

5. Ways to reduce overfitting
● Reduce network capacity
● Dropout
● Data augmentation
The fully connected layers hold most of the parameters: 37M, 16M, and 4M (fc6, fc7, fc8).

6. Ways to reduce overfitting
● Reduce network capacity
● Dropout
● Data augmentation
With dropout:
- Every forward pass, the network is slightly different.
- Reduces co-adaptation between neurons
- More robust features
- More iterations needed for convergence
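The "network slightly different on every forward pass" idea can be sketched in a few lines of NumPy. This is a minimal sketch, not the implementation from Krizhevsky's paper: it uses the common "inverted dropout" formulation (rescaling at training time), and the function name `dropout` and the default `p_drop=0.5` are assumptions made for illustration.

```python
import numpy as np

def dropout(activations, p_drop=0.5, training=True, rng=None):
    """Inverted dropout: zero each unit with probability p_drop during training.

    Surviving activations are scaled by 1 / (1 - p_drop) so that their expected
    value matches the test-time activations, which are passed through unchanged.
    """
    if not training:
        return activations
    if rng is None:
        rng = np.random.default_rng()
    # A fresh random mask on every call: each forward pass sees a different sub-network.
    keep = rng.random(activations.shape) >= p_drop
    return activations * keep / (1.0 - p_drop)
```

Because a unit cannot rely on any particular neighbour being present, co-adaptation is discouraged, at the cost of noisier gradients and therefore more iterations to converge.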

7. Ways to reduce overfitting
● Reduce network capacity
● Dropout
● Data augmentation

8. Data Augmentation
During training, alter the input image (Krizhevsky A., 2012)
- Random crops on the original image
- Translations
- Horizontal reflections
- Increases the size of the training set x2048
- On-the-fly augmentation
During testing
- Average the predictions over the four corner patches and the center patch, plus their flipped versions (10 augmentations of the image)
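The training-time and test-time procedures above can be sketched as follows. This is a hedged NumPy sketch, not the original code: it assumes 256x256 inputs and 224x224 crops (the sizes used by Krizhevsky et al., not stated on the slide), images stored as (H, W, C) arrays, and a hypothetical `predict_fn` that maps a batch of patches to class probabilities. Under those sizes the x2048 figure corresponds to roughly 32 x 32 crop offsets times 2 for the mirror image.

```python
import numpy as np

def random_crop_flip(image, crop=224, rng=None):
    """Training: take a random crop x crop patch and mirror it half of the time."""
    if rng is None:
        rng = np.random.default_rng()
    h, w = image.shape[:2]
    top = int(rng.integers(0, h - crop + 1))
    left = int(rng.integers(0, w - crop + 1))
    patch = image[top:top + crop, left:left + crop]
    if rng.random() < 0.5:
        patch = patch[:, ::-1]  # horizontal reflection
    return patch

def ten_crop(image, crop=224):
    """Testing: four corner patches + center patch, then the mirrored copy of each."""
    h, w = image.shape[:2]
    corners = [(0, 0), (0, w - crop), (h - crop, 0), (h - crop, w - crop),
               ((h - crop) // 2, (w - crop) // 2)]
    patches = [image[t:t + crop, l:l + crop] for t, l in corners]
    patches += [p[:, ::-1] for p in patches]
    return np.stack(patches)  # shape (10, crop, crop, C)

def predict_ten_crop(image, predict_fn, crop=224):
    """Average the class probabilities over the 10 patches.

    predict_fn is a hypothetical model wrapper: (N, H, W, C) -> (N, num_classes).
    """
    return predict_fn(ten_crop(image, crop)).mean(axis=0)
```

Calling `random_crop_flip` inside the data loader gives on-the-fly augmentation: no augmented image is ever stored on disk.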

9. Data Augmentation
Alter the intensities of the RGB channels
- PCA on the set of RGB pixel values throughout the ImageNet training set.
- To each training image, add multiples of the found principal components.
- Object identity should be invariant to changes of illumination.
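A minimal NumPy sketch of this colour augmentation is given below. The function names are hypothetical, and two details are assumptions not stated on the slide: each principal component is weighted by its eigenvalue times a Gaussian random factor, and the factor's standard deviation defaults to 0.1 (the value reported by Krizhevsky et al.).

```python
import numpy as np

def fit_rgb_pca(images):
    """PCA over all RGB pixel values of a set of images shaped (N, H, W, 3)."""
    pixels = images.reshape(-1, 3).astype(np.float64)
    cov = np.cov(pixels, rowvar=False)      # 3x3 covariance of the RGB channels
    eigvals, eigvecs = np.linalg.eigh(cov)  # columns of eigvecs are the components
    return eigvals, eigvecs

def pca_color_jitter(image, eigvals, eigvecs, sigma=0.1, rng=None):
    """Add random multiples of the principal components to every pixel.

    One random factor per component is drawn per image, so the whole image
    receives the same RGB shift, imitating a change in illumination.
    """
    if rng is None:
        rng = np.random.default_rng()
    alpha = rng.normal(0.0, sigma, size=3)  # assumption: sigma = 0.1
    shift = eigvecs @ (alpha * eigvals)     # 3-vector added to each pixel
    return image + shift
```

The PCA is fitted once on the training set; only the cheap `pca_color_jitter` step runs per image during training.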

10. Augmentation for discriminative unsupervised feature learning
Discriminative Unsupervised Feature Learning with Exemplar Convolutional Neural Networks, Dosovitskiy, A., 2014
MOTIVATION
● Large datasets of training data
● Local descriptors should be invariant to transformations (rotation, translation, scale, etc.)
WHAT THEY DO
● Train a CNN to generate a local representation by optimising a surrogate classification task
● The task does NOT require labeled data

11. Augmentation for discriminative unsupervised feature learning
Select a random location k and crop a 32x32 window (restriction: the region must contain an object or part of an object, i.e. a high amount of gradients)
Apply a transformation [translation, rotation, scaling, RGB modification, contrast modification]
...
Generate the augmented dataset: 16000 classes of 150 examples each
Class k=1, with 150 examples
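The surrogate dataset construction can be sketched as below. This is a simplified illustration, not Dosovitskiy's pipeline: all function names and parameter ranges are assumptions, images are assumed to be float arrays in [0, 1] with shape (H, W, 3), rotation is restricted to multiples of 90 degrees, scaling is omitted, and the "high amount of gradients" restriction is approximated by keeping the best of a few random candidate windows.

```python
import numpy as np

def pick_seed_location(image, size=32, tries=20, rng=None):
    """Among random windows, return the (top, left) with the most gradient energy."""
    if rng is None:
        rng = np.random.default_rng()
    h, w = image.shape[:2]
    gy, gx = np.gradient(image.mean(axis=2))
    energy = gx ** 2 + gy ** 2
    best, best_score = (0, 0), -1.0
    for _ in range(tries):
        top = int(rng.integers(0, h - size + 1))
        left = int(rng.integers(0, w - size + 1))
        score = energy[top:top + size, left:left + size].sum()
        if score > best_score:
            best, best_score = (top, left), score
    return best

def make_surrogate_class(image, top, left, size=32, n_examples=150,
                         max_shift=4, rng=None):
    """Generate n_examples transformed versions of one seed patch (one class)."""
    if rng is None:
        rng = np.random.default_rng()
    h, w = image.shape[:2]
    examples = []
    for _ in range(n_examples):
        # Translation: re-crop at a slightly shifted location.
        dy, dx = rng.integers(-max_shift, max_shift + 1, size=2)
        t = int(np.clip(top + dy, 0, h - size))
        l = int(np.clip(left + dx, 0, w - size))
        patch = image[t:t + size, l:l + size].astype(np.float64)
        # Rotation stand-in: multiples of 90 degrees only.
        patch = np.rot90(patch, k=int(rng.integers(0, 4)))
        # Contrast modification around the patch mean.
        mean = patch.mean()
        patch = (patch - mean) * rng.uniform(0.7, 1.3) + mean
        # RGB modification: one small random offset per channel.
        patch = patch + rng.normal(0.0, 0.05, size=3)
        examples.append(np.clip(patch, 0.0, 1.0))
    return np.stack(examples)

def build_surrogate_dataset(images, n_classes=16000, n_examples=150,
                            size=32, rng=None):
    """Each seed patch defines its own class; labels are just the seed index k."""
    if rng is None:
        rng = np.random.default_rng()
    patches, labels = [], []
    for k in range(n_classes):
        image = images[int(rng.integers(len(images)))]
        top, left = pick_seed_location(image, size=size, rng=rng)
        patches.append(make_surrogate_class(image, top, left, size=size,
                                            n_examples=n_examples, rng=rng))
        labels.append(np.full(n_examples, k))
    return np.concatenate(patches), np.concatenate(labels)
```

The resulting `(patches, labels)` pair can be fed to any standard classifier; no human annotation is involved because the labels come from the seed index alone.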

12. Augmentation for discriminative unsupervised feature learning
Generate the augmented dataset: 16000 classes of 150 examples each
[Figure: example classes; example samples for one class]

13. Augmentation for discriminative unsupervised feature learning
Classification accuracies
Superior performance to SIFT for image matching.

14. Summary
- Augmentation helps to prevent overfitting.
- It makes the network invariant to certain transformations: translations, flips, etc.
- It can be done on-the-fly.
- It can be used to learn image representations when no labeled datasets are available.
