Presenter: Taesung Park (Ph.D. student, UC Berkeley)
Date: June 2017
Taesung Park is a Ph.D. student at UC Berkeley in AI and computer vision, advised by Prof. Alexei Efros.
His research interests lie at the intersection of computer vision and computational photography, such as generating realistic images and enhancing photo quality. He received a B.S. in mathematics and an M.S. in computer science from Stanford University.
Abstract:
Image-to-image translation is a class of vision and graphics problems where the goal is to learn the mapping between an input image and an output image using a training set of aligned image pairs.
However, for many tasks, paired training data will not be available.
We present an approach for learning to translate an image from a source domain X to a target domain Y in the absence of paired examples.
Our goal is to learn a mapping G: X → Y such that the distribution of images from G(X) is indistinguishable from the distribution Y using an adversarial loss.
Because this mapping is highly under-constrained, we couple it with an inverse mapping F: Y → X and introduce a cycle consistency loss to push F(G(X)) ≈ X (and vice versa).
Qualitative results are presented on several tasks where paired training data does not exist, including collection style transfer, object transfiguration, season transfer, photo enhancement, etc.
Quantitative comparisons against several prior methods demonstrate the superiority of our approach.
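To make the training signal concrete, here is a minimal PyTorch-style sketch of the cycle-consistency term described above. The module names G, F and the weight lambda_cyc are illustrative placeholders, not the authors' released implementation.

```python
import torch.nn as nn

l1 = nn.L1Loss()

def cycle_consistency_loss(G, F, real_x, real_y, lambda_cyc=10.0):
    """L_cyc = E[||F(G(x)) - x||_1] + E[||G(F(y)) - y||_1]."""
    rec_x = F(G(real_x))   # X -> Y -> X: should land back near real_x
    rec_y = G(F(real_y))   # Y -> X -> Y: should land back near real_y
    return lambda_cyc * (l1(rec_x, real_x) + l1(rec_y, real_y))
```

In full training this term is added to the two adversarial losses, so each generator must both fool its discriminator and remain approximately invertible.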
Presenter: Yunjey Choi (M.S. student, Korea University)
Yunjey Choi majored in computer science at Korea University and is now an M.S. student studying machine learning. He enjoys coding and sharing what he has understood with others. He studied deep learning with TensorFlow for a year and is now studying generative adversarial networks with PyTorch. He has implemented several papers in TensorFlow and published a PyTorch tutorial on GitHub.
Abstract:
The Generative Adversarial Network (GAN), first proposed by Ian Goodfellow in 2014, is a generative model that estimates the distribution of real data through adversarial training. GANs have recently emerged as one of the most popular research topics, with a flood of related papers pouring out every day.
Finding it hard to read all of the GAN papers pouring out? That's fine: once you understand the basic GAN thoroughly, new papers become easy to follow.
In this talk, I will share everything I know about GANs. It should suit those who are completely new to GANs, those curious about the theory behind them, and those wondering how GANs can be applied.
Talk video: https://youtu.be/odpjk7_tGY0
Generative Adversarial Networks (GANs) are a class of machine learning frameworks where two neural networks contest with each other in a game. A generator network generates new data instances, while a discriminator network evaluates them for authenticity, classifying them as real or generated. This adversarial process allows the generator to improve over time and generate highly realistic samples that can pass for real data. The document provides an overview of GANs and their variants, including DCGAN, InfoGAN, EBGAN, and ACGAN models. It also discusses techniques for training more stable GANs and escaping issues like mode collapse.
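As a concrete illustration of this two-player game, the sketch below implements one alternating update for a vanilla GAN in PyTorch, with toy MLP networks and the non-saturating generator loss; it is a generic sketch, not the code from any of the decks summarized here.

```python
import torch
import torch.nn as nn

G = nn.Sequential(nn.Linear(64, 256), nn.ReLU(), nn.Linear(256, 784), nn.Tanh())
D = nn.Sequential(nn.Linear(784, 256), nn.LeakyReLU(0.2), nn.Linear(256, 1))
opt_g = torch.optim.Adam(G.parameters(), lr=2e-4)
opt_d = torch.optim.Adam(D.parameters(), lr=2e-4)
bce = nn.BCEWithLogitsLoss()

def train_step(real):                       # real: (batch, 784) in [-1, 1]
    b = real.size(0)
    fake = G(torch.randn(b, 64))
    # Discriminator update: push real toward 1, generated toward 0.
    d_loss = bce(D(real), torch.ones(b, 1)) + bce(D(fake.detach()), torch.zeros(b, 1))
    opt_d.zero_grad(); d_loss.backward(); opt_d.step()
    # Generator update: make D label the fakes as real (non-saturating loss).
    g_loss = bce(D(fake), torch.ones(b, 1))
    opt_g.zero_grad(); g_loss.backward(); opt_g.step()
    return d_loss.item(), g_loss.item()
```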
This document discusses generative adversarial networks (GANs) and the LAPGAN model. It explains that GANs use two neural networks, a generator and discriminator, that compete against each other. The generator learns to generate fake images to fool the discriminator, while the discriminator learns to distinguish real from fake images. LAPGAN improves upon GANs by using a Laplacian pyramid to decompose images into multiple scales, with separate generator and discriminator networks for each scale. This allows LAPGAN to generate sharper images by focusing on edges and conditional information at each scale.
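The multi-scale decomposition LAPGAN builds on can be sketched with standard pyramid operations; the helper below uses OpenCV and is an illustrative reconstruction, not the paper's code.

```python
import cv2
import numpy as np

def laplacian_pyramid(img, levels=3):
    """Split an image into band-pass levels plus a coarse residual;
    LAPGAN trains one generator/discriminator pair per level."""
    bands, cur = [], img.astype(np.float32)
    for _ in range(levels):
        down = cv2.pyrDown(cur)
        up = cv2.pyrUp(down, dstsize=(cur.shape[1], cur.shape[0]))
        bands.append(cur - up)   # high-frequency detail at this scale
        cur = down
    bands.append(cur)            # coarsest low-resolution image
    return bands
```

Adding each band back to the upsampled residual reconstructs the original image exactly, which is why each GAN only has to model the detail missing at its own scale.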
Generative Adversarial Networks (GANs) are a type of generative model that uses two neural networks - a generator and discriminator - competing against each other. The generator takes noise as input and generates synthetic samples, while the discriminator evaluates samples as real or generated. They are trained together until the generator fools the discriminator. GANs can generate realistic images, do image-to-image translation, and have applications in reinforcement learning. However, training GANs is challenging due to issues like non-convergence and mode collapse.
Generative adversarial networks (GANs) are a class of machine learning frameworks where two neural networks, a generator and discriminator, compete against each other. The generator learns to generate new data with the same statistics as the training set to fool the discriminator, while the discriminator learns to better distinguish real samples from generated samples. GANs have applications in image generation, image translation between domains, and image completion. Training GANs can be challenging due to issues like mode collapse.
A Short Introduction to Generative Adversarial Networks (Jong Wook Kim)
Generative adversarial networks (GANs) are a class of machine learning frameworks where two neural networks compete against each other. One network generates new data instances, while the other evaluates them for authenticity. This adversarial process allows the generating network to produce highly realistic samples matching the training data distribution. The document discusses the GAN framework, various algorithm variants like WGAN and BEGAN, training tricks, applications to image generation and translation tasks, and reasons why GANs are a promising area of research.
This document provides an overview of generative adversarial networks (GANs). It explains that GANs were introduced in 2014 and involve two neural networks, a generator and discriminator, that compete against each other. The generator produces synthetic data to fool the discriminator, while the discriminator learns to distinguish real from synthetic data. As they train, the generator improves at producing more realistic outputs that match the real data distribution. Examples of GAN applications discussed include image generation, text-to-image synthesis, and face aging.
NYAI - A Path To Unsupervised Learning Through Adversarial Networks by Soumith Chintala (Rizwan Habib)
A Path To Unsupervised Learning Through Adversarial Networks - (Soumith Chintala, Researcher at Facebook AI Research)
Soumith Chintala is a Researcher at Facebook AI Research, where he works on deep learning, reinforcement learning, generative image models, agents for video games and large-scale high-performance deep learning. He holds a Masters in CS from NYU, and spent time in Yann LeCun's NYU lab building deep learning models for pedestrian detection, natural image OCR, depth-images among others.
Soumith will go over generative adversarial networks, a particular way of training neural networks to build high quality generative models. The talk will take you through an easy to follow timeline of the research and improvements in adversarial networks, followed by some future directions, as well as applications.
This document provides an overview of generative adversarial networks (GANs). It begins by explaining the basic GAN framework of having a generator and discriminator. It then discusses several GAN variations including DCGAN, EBGAN, WGAN, and BEGAN. Applications of GANs mentioned include image synthesis, image-to-image translation, and domain adaptation. The document notes that evaluating GANs is difficult as log-likelihood does not correlate with visual quality and qualitative assessments can be misleading. It reviews metrics like Inception Score and discusses challenges training GANs such as instability. In conclusion, the document covers the key concepts of GANs, related methods, applications, evaluation challenges, and training techniques.
GANs are the hottest new topic in the ML arena; however, they present a challenge for researchers and engineers alike. Their design, and most importantly their code implementation, has been causing headaches for ML practitioners, especially when moving to production.
Starting from the very basics of what a GAN is, passing through a TensorFlow implementation using the most cutting-edge APIs available in the framework, and finally production-ready serving at scale using Google Cloud ML Engine.
Slides for the talk: https://www.pycon.it/conference/talks/deep-diving-into-gans-form-theory-to-production
Github repo: https://github.com/zurutech/gans-from-theory-to-production
Models can be either generative or discriminative: generative models directly model the joint distribution of inputs and outputs, while discriminative models directly model the conditional distribution of outputs given inputs. Common deep generative models include restricted Boltzmann machines, deep belief networks, variational autoencoders, generative adversarial networks, and deep convolutional generative adversarial networks. These models use different network architectures and training procedures to generate new examples that resemble samples from the training data distribution.
Presenter: Hwalsuk Lee (Naver Clova)
Date: November 2017
(Current) NAVER Clova Vision
(Current) TFKR organizer
Abstract:
The center of gravity of recent deep learning research is shifting rapidly from supervised to unsupervised learning.
In computer vision in particular, the trend is moving from recognition techniques (supervised learning), which find the information present in an image, to generation techniques (unsupervised learning), which synthesize an image containing specified information.
This seminar briefly reviews how the two pillars of generative modeling, the VAE (variational autoencoder) and the GAN (generative adversarial network), work, and shares results from the key related papers.
The lecture is organized so that even attendees without a deep learning background can grasp the concepts behind VAEs and GANs, the two approaches to training generative models, and gauge the current state of the technology.
The document discusses different types of generative models including auto-regressive models, variational auto-encoders, and generative adversarial networks. It provides examples of each type of model and highlights some of their features and issues during training. Specific models discussed in more detail include PixelRNNs, DCGANs, WGANs, BEGANs, Pix2Pix, and CycleGANs. The document aims to introduce deep generative models and their applications.
Deep generative models are making progress in modeling complex, high-dimensional data in an unsupervised manner. Two promising approaches are variational autoencoders (VAEs) and generative adversarial networks (GANs). VAEs impose a prior on the code space to regularize and allow for sampling, while GANs use a generator and discriminator in an adversarial training procedure. Recent work has focused on extensions of these models, including conditional generation and combining aspects of VAEs and GANs. However, generative modeling of natural images remains challenging, with models still underfitting the complexity of unconstrained data.
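The "prior on the code space" has a concise form in code: the VAE objective is a reconstruction term plus a KL divergence between the encoder's Gaussian posterior and N(0, I). A minimal PyTorch sketch, assuming the encoder outputs mu and logvar:

```python
import torch
import torch.nn.functional as F

def reparameterize(mu, logvar):
    """z = mu + sigma * eps, so sampling stays differentiable."""
    return mu + torch.exp(0.5 * logvar) * torch.randn_like(mu)

def vae_loss(x, x_recon, mu, logvar):
    """Negative ELBO: reconstruction + KL(q(z|x) || N(0, I))."""
    recon = F.binary_cross_entropy(x_recon, x, reduction='sum')
    kl = -0.5 * torch.sum(1 + logvar - mu.pow(2) - logvar.exp())
    return recon + kl
```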
Tutorial on Theory and Application of Generative Adversarial Networks (MLReview)
Description
The generative adversarial network (GAN) has recently emerged as a promising generative modeling approach. It consists of a generative network and a discriminative network. Through the competition between the two networks, it learns to model the data distribution. In addition to modeling the image/video distribution in computer vision problems, the framework finds use in defining visual concepts by example. To a large extent, it eliminates the need for hand-crafted objective functions in various computer vision problems. In this tutorial, we will present an overview of generative adversarial network research. We will cover several recent theoretical studies as well as training techniques, and will also cover several vision applications of generative adversarial networks.
EuroSciPy 2019 - GANs: Theory and Applications (Emanuele Ghelfi)
EuroSciPy 2019: https://pretalx.com/euroscipy-2019/talk/Q79NND/
GANs are the hottest new topic in the ML arena; however, they present a challenge for researchers and engineers alike. Their design, and most importantly their code implementation, has been causing headaches for ML practitioners, especially when moving to production.
The workshop aims at providing a complete understanding of both the theory and the practical know-how to code and deploy this family of models in production. By the end of it, the attendees should be able to apply the concepts learned to other models without any issues.
We will showcase the shiny new APIs introduced by TensorFlow 2.0 by building a GAN from scratch and showing how to "productionize" it with the AshPy Python package, which makes it easy to design, prototype, train, and export Machine Learning models defined in TensorFlow 2.0.
The workshop is composed of:
- Theoretical introduction
- GANs from Scratch in TensorFlow 2.0
- High-performance input data pipeline with TensorFlow Datasets
- Introduction to the AshPy API
- Implementing, training, and visualizing DCGAN using AshPy
- Serving TF2 Models with Google Cloud Functions
The materials of the workshop will be openly provided via GitHub (https://github.com/zurutech/gans-from-theory-to-production).
The document discusses Generative Adversarial Networks (GANs), a type of generative model proposed by Ian Goodfellow in 2014. GANs use two neural networks, a generator and discriminator, that compete against each other. The generator produces synthetic data to fool the discriminator, while the discriminator learns to distinguish real from synthetic data. GANs have been used successfully to generate realistic images when trained on large datasets. Examples mentioned include Pix2Pix for image-to-image translation and STACKGAN for text-to-image generation.
Generative Adversarial Networks (GANs) are a type of deep learning model used for unsupervised machine learning tasks like image generation. GANs work by having two neural networks, a generator and discriminator, compete against each other. The generator creates synthetic images and the discriminator tries to distinguish real images from fake ones. This allows the generator to improve over time at creating more realistic images that can fool the discriminator. The document discusses the intuition behind GANs, provides a PyTorch implementation example, and describes variants like DCGAN, LSGAN, and semi-supervised GANs.
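For reference, the DCGAN generator pattern that tutorials like this one typically show is a stack of transposed convolutions with BatchNorm and ReLU; the channel sizes below are illustrative defaults, not the deck's exact network.

```python
import torch.nn as nn

class DCGANGenerator(nn.Module):
    """Maps a latent vector to a 64x64 RGB image with strided transposed convs."""
    def __init__(self, z_dim=100, ch=64):
        super().__init__()
        self.net = nn.Sequential(
            nn.ConvTranspose2d(z_dim, ch * 8, 4, 1, 0, bias=False),  # -> 4x4
            nn.BatchNorm2d(ch * 8), nn.ReLU(True),
            nn.ConvTranspose2d(ch * 8, ch * 4, 4, 2, 1, bias=False), # -> 8x8
            nn.BatchNorm2d(ch * 4), nn.ReLU(True),
            nn.ConvTranspose2d(ch * 4, ch * 2, 4, 2, 1, bias=False), # -> 16x16
            nn.BatchNorm2d(ch * 2), nn.ReLU(True),
            nn.ConvTranspose2d(ch * 2, ch, 4, 2, 1, bias=False),     # -> 32x32
            nn.BatchNorm2d(ch), nn.ReLU(True),
            nn.ConvTranspose2d(ch, 3, 4, 2, 1, bias=False),          # -> 64x64
            nn.Tanh(),
        )

    def forward(self, z):  # z: (batch, z_dim, 1, 1)
        return self.net(z)
```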
This document summarizes generative adversarial networks (GANs) and their applications. It begins by introducing GANs and how they work by having a generator and discriminator play an adversarial game. It then discusses several variants of GANs including DCGAN, LSGAN, conditional GAN, and others. It provides examples of applications such as image-to-image translation, text-to-image synthesis, image generation, and more. It concludes by discussing major GAN variants and potential future applications like helping children learn to draw.
This tutorial provides an overview of recent advances in deep generative models. It will cover three types of generative models: Markov models, latent variable models, and implicit models. The tutorial aims to give attendees a full understanding of the latest developments in generative modeling and how these models can be applied to high-dimensional data. Several challenges and open questions in the field will also be discussed. The tutorial is intended for the 2017 conference of the International Society for Bayesian Analysis.
ICASSP 2018 Tutorial: Generative Adversarial Network and its Applications to Signal Processing and Natural Language Processing (Hung-yi Lee, 宏毅 李)
The document provides an overview of generative adversarial networks (GANs) and their applications to signal processing and natural language processing. It begins with a general introduction to GANs, including how they work, common issues, and potential solutions. Conditional GANs and unsupervised conditional GANs are also discussed. The document then outlines applications of GANs to signal processing and natural language processing.
Unsupervised Representation Learning with Deep Convolutional Generative Adversarial Networks, paper by Alec Radford, Luke Metz, and Soumith Chintala (indico Research, Facebook AI Research).
Generative Adversarial Networks and Their Applications (Artifacia)
This is the presentation from our AI Meet Jan 2017 on GANs and its applications.
You can join Artifacia AI Meet Bangalore Group: https://www.meetup.com/Artifacia-AI-Meet/
Generative adversarial networks are an advanced topic and require a prior basic understanding of CNNs. Here is some pre-reading material for you.
- https://arxiv.org/pdf/1406.2661v1.pdf
- https://arxiv.org/pdf/1701.00160v1.pdf
Slides by Víctor Garcia about:
Phillip Isola, Jun-Yan Zhu, Tinghui Zhou, Alexei A. Efros, "Image-to-Image Translation with Conditional Adversarial Networks". arXiv, 2016.
https://phillipi.github.io/pix2pix/
We investigate conditional adversarial networks as a general-purpose solution to image-to-image translation problems. These networks not only learn the mapping from input image to output image, but also learn a loss function to train this mapping. This makes it possible to apply the same generic approach to problems that traditionally would require very different loss formulations. We demonstrate that this approach is effective at synthesizing photos from label maps, reconstructing objects from edge maps, and colorizing images, among other tasks. As a community, we no longer hand-engineer our mapping functions, and this work suggests we can achieve reasonable results without hand-engineering our loss functions either.
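In practice, the "learned loss" of pix2pix amounts to an adversarial term from a conditional discriminator plus an L1 term to the ground truth. A hedged sketch, assuming PyTorch modules G and D where D sees the input image concatenated with a candidate output:

```python
import torch
import torch.nn as nn

bce = nn.BCEWithLogitsLoss()
l1 = nn.L1Loss()

def pix2pix_generator_loss(G, D, x, y, lambda_l1=100.0):
    """Fool D(x, G(x)) while keeping G(x) close to the target y in L1."""
    fake_y = G(x)
    pred = D(torch.cat([x, fake_y], dim=1))  # conditional discriminator
    adv = bce(pred, torch.ones_like(pred))
    return adv + lambda_l1 * l1(fake_y, y)
```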
The document summarizes a presentation on applying GANs in medical imaging. It discusses several papers on this topic:
1. A paper that used GANs to reduce noise in low-dose CT scans by training on paired routine-dose and low-dose CT images. This approach generated reconstructed low-dose CT images with improved quality.
2. A paper that used GANs for cross-modality synthesis, specifically generating skin lesion images from other modalities.
3. Additional papers discussed other medical imaging applications of GANs such as vessel-fundus image synthesis and organ segmentation.
Semantic Segmentation on Satellite Imagery (Rahul Bhojwani)
This is an image semantic segmentation project targeting satellite imagery. The goal was to predict the pixel-wise segmentation map for various objects in satellite imagery, including buildings, water bodies, roads, etc. The data was taken from the Kaggle competition <https://www.kaggle.com/c/dstl-satellite-imagery-feature-detection>.
We implemented the FCN, U-Net, and SegNet deep learning architectures for this task.
Computer Vision and GenAI for Geoscientists.pptx (Yohanes Nuwara)
Presentation in a webinar hosted by the Petroleum Engineers Association (PEA) on 28 July 2023. The topic of the webinar is computer vision for petroleum geoscience.
Semantic segmentation with Convolutional Neural Network Approaches (UMBC)
In this project, we propose methods for semantic segmentation with state-of-the-art deep learning models. Moreover, we filter the segmentation down to a specific object for a specific application: instead of spending capacity on unnecessary objects, we focus on the ones of interest, making the system more specialized and efficient for special purposes. In particular, we leverage models suitable for face segmentation: Mask R-CNN and DeepLabv3. The experimental results indicate that the illustrated approach is efficient and robust for the segmentation task relative to previous work in the field, with the models reaching mean Intersection over Union scores of 74.4 and 86.6, respectively. Visual results of the models are shown in the Appendix.
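For context, the mean Intersection over Union quoted above is the per-class overlap between predicted and ground-truth masks, averaged over classes; a small NumPy sketch under the standard definition:

```python
import numpy as np

def mean_iou(pred, target, num_classes):
    """Average IoU over the classes that appear in prediction or ground truth."""
    ious = []
    for c in range(num_classes):
        inter = np.logical_and(pred == c, target == c).sum()
        union = np.logical_or(pred == c, target == c).sum()
        if union > 0:
            ious.append(inter / union)
    return float(np.mean(ious))
```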
This document discusses deep learning techniques for object detection and recognition. It provides an overview of computer vision tasks like image classification and object detection. It then discusses how crowdsourcing large datasets from the internet and advances in machine learning, specifically deep convolutional neural networks (CNNs), have led to major breakthroughs in object detection. Several state-of-the-art CNN models for object detection are described, including R-CNN, Fast R-CNN, Faster R-CNN, SSD, and YOLO. The document also provides examples of applying these techniques to tasks like face detection and detecting manta rays from aerial videos.
The document summarizes Assaf Mushinsky's presentation at CVPR 2017. Some key points:
- He discussed state-of-the-art research in object detection, segmentation, pose estimation, and network architectures from papers presented at CVPR 2017.
- Papers presented efficient object detection methods that improved speed and accuracy trade-offs like YOLO9000 and Feature Pyramid Networks. Mask R-CNN was discussed for instance segmentation and pose estimation.
- New network architectures like Densely Connected Networks, Xception, and ResNeXt were covered that improved accuracy and efficiency over ResNet and Inception.
- The presentation highlighted recent advances in computer vision from the CVPR conference but did not cover older work.
This is an intensive meetup at Samsung Next IL covering the most interesting papers presented at CVPR 2017 last month. It is a good opportunity to get an overview of recent advancements in the field of deep learning with applications to computer vision.
The following topics are covered:
• Object detection
• Pose estimation
• Efficient networks
Generative adversarial networks (GANs) show promise for enhancing computer vision in low visibility conditions. GANs can learn to translate images from low visibility domains like hazy or low-light conditions to clear images without paired training data. Recent work has incorporated hyperspectral guidance to improve image-to-image translation for tasks like dehazing. A domain-aware model was proposed to address the distributional discrepancy between RGB and hyperspectral images. Additionally, optimizing the spectral profile in translation helps mitigate spectral aberrations in results. These techniques push the limits of machine learning for analyzing visual data in challenging conditions with applications like autonomous vehicles and medical imaging.
Transformer Architectures in Vision
[2018 ICML] Image Transformer
[2019 CVPR] Video Action Transformer Network
[2020 ECCV] End-to-End Object Detection with Transformers
[2021 ICLR] An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale
https://telecombcn-dl.github.io/2017-dlcv/
Deep learning technologies are at the core of the current revolution in artificial intelligence for multimedia data analysis. The convergence of large-scale annotated datasets and affordable GPU hardware has allowed the training of neural networks for data analysis tasks which were previously addressed with hand-crafted features. Architectures such as convolutional neural networks, recurrent neural networks and Q-nets for reinforcement learning have shaped a brand new scenario in signal processing. This course will cover the basic principles and applications of deep learning to computer vision problems, such as image classification, object detection or image captioning.
Structured Forests for Fast Edge Detection [Paper Presentation] (Mohammad Shaker)
A paper presentation for "Structured Forests for Fast Edge Detection" by Piotr Dollár and C. Lawrence Zitnick, IEEE International Conference on Computer Vision (ICCV), 2013.
Diffusion Models Beat GANs on Image Synthesis (BeerenSahu)
Diffusion models have recently been shown to produce higher quality images than GANs while also offering better diversity and being easier to scale and train. Specifically, a 2021 paper by OpenAI demonstrated that a diffusion model achieved an FID score of 2.97 on ImageNet 128x128, beating the previous state-of-the-art held by BigGAN. Diffusion models work by gradually adding noise to images in a forward process and then learning to remove noise in a backward denoising process, allowing them to generate diverse, high fidelity images.
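The forward noising process mentioned above has a convenient closed form, q(x_t | x_0) = N(sqrt(alpha_bar_t) x_0, (1 - alpha_bar_t) I), so training can jump straight to any step t. A minimal sketch with a linear beta schedule (the schedule and constants here are generic assumptions, not OpenAI's exact configuration):

```python
import torch

T = 1000
betas = torch.linspace(1e-4, 0.02, T)           # linear noise schedule
alpha_bar = torch.cumprod(1.0 - betas, dim=0)   # cumulative signal retention

def q_sample(x0, t):
    """Sample x_t ~ q(x_t | x_0) for a batch of timestep indices t."""
    noise = torch.randn_like(x0)
    a = alpha_bar[t].view(-1, 1, 1, 1)
    return a.sqrt() * x0 + (1 - a).sqrt() * noise, noise
```

The denoising network is then trained to predict the returned noise from x_t, which is what the backward process inverts at sampling time.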
A location-aware embedding technique for accurate landmark recognition (Federico Magliani)
The current state of research in landmark recognition highlights the good accuracy that can be achieved by embedding techniques such as Fisher vectors and VLAD. All these techniques ignore spatial information, i.e., they consider all the features and the corresponding descriptors without embedding their location in the image. This paper presents a new variant of the well-known VLAD (Vector of Locally Aggregated Descriptors) embedding technique which accounts, to a certain degree, for the location of features. The driving motivation comes from the observation that, usually, the most interesting part of an image (e.g., the landmark to be recognized) is almost at the center of the image, while the features at the borders are irrelevant features which do not depend on the landmark. The proposed variant, called locVLAD (location-aware VLAD), computes the mean of two global descriptors: the VLAD executed on the entire original image, and the one computed on a cropped image which removes a certain percentage of the image borders. This simple variant shows accuracy greater than the existing state-of-the-art approach. Experiments are conducted on two public datasets (ZuBuD and Holidays) which are used both for training and testing. Moreover, a more balanced version of ZuBuD is proposed.
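Stripped to its essentials, locVLAD is the mean of two standard VLAD vectors: one over all local descriptors and one over descriptors whose locations survive a border crop. A sketch under those definitions, assuming precomputed descriptors desc with pixel locations xy and a k-means codebook centroids; the 15% border is an assumed crop ratio for illustration:

```python
import numpy as np

def vlad(desc, centroids):
    """Standard VLAD: residuals to the nearest centroid, summed and L2-normalized."""
    k, d = centroids.shape
    assign = np.argmin(((desc[:, None] - centroids[None]) ** 2).sum(-1), axis=1)
    v = np.zeros((k, d))
    for i, c in enumerate(assign):
        v[c] += desc[i] - centroids[c]
    v = v.ravel()
    return v / (np.linalg.norm(v) + 1e-12)

def loc_vlad(desc, xy, centroids, w, h, border=0.15):
    """Mean of VLAD on the full image and VLAD on the center-cropped features."""
    keep = ((xy[:, 0] > border * w) & (xy[:, 0] < (1 - border) * w) &
            (xy[:, 1] > border * h) & (xy[:, 1] < (1 - border) * h))
    return 0.5 * (vlad(desc, centroids) + vlad(desc[keep], centroids))
```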
FaceNet provides a unified embedding for face recognition, verification, and clustering tasks using a deep convolutional neural network. It was developed by Google researchers and achieved state-of-the-art results on benchmark datasets, cutting the error rate by 30% compared to previous work. The model uses a 22-layer CNN that maps face images to 128-dimensional embeddings, where distances between embeddings correspond to face similarity. It was trained with triplet loss to optimize the embeddings.
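The triplet loss FaceNet optimizes pulls an anchor embedding toward a positive of the same identity and pushes it away from a negative by at least a margin; a minimal sketch on L2-normalized embeddings (PyTorch also ships nn.TripletMarginLoss for the same purpose):

```python
import torch.nn.functional as F

def triplet_loss(anchor, positive, negative, margin=0.2):
    """max(0, ||a - p||^2 - ||a - n||^2 + margin), averaged over the batch."""
    a = F.normalize(anchor, dim=1)
    p = F.normalize(positive, dim=1)
    n = F.normalize(negative, dim=1)
    d_ap = (a - p).pow(2).sum(1)
    d_an = (a - n).pow(2).sum(1)
    return F.relu(d_ap - d_an + margin).mean()
```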
https://telecombcn-dl.github.io/2019-dlcv/
Deep learning technologies are at the core of the current revolution in artificial intelligence for multimedia data analysis. The convergence of large-scale annotated datasets and affordable GPU hardware has allowed the training of neural networks for data analysis tasks which were previously addressed with hand-crafted features. Architectures such as convolutional neural networks, recurrent neural networks and Q-nets for reinforcement learning have shaped a brand new scenario in signal processing. This course will cover the basic principles and applications of deep learning to computer vision problems, such as image classification, object detection or image captioning.
Lidar for Autonomous Driving, LiDAR Mapping for Driverless Cars.pptx (RishavKumar530754)
LiDAR-Based System for Autonomous Cars
Autonomous Driving with LiDAR Tech
LiDAR Integration in Self-Driving Cars
Self-Driving Vehicles Using LiDAR
LiDAR Mapping for Driverless Cars
π0.5: a Vision-Language-Action Model with Open-World Generalization (NABLAS株式会社)
This material, covering Transfusion / π0 / π0.5, introduces robot foundation models that integrate vision, language, and action.
Built on a Transformer combining diffusion and autoregression, π0.5 enables reasoning and planning in open-world settings.
AdvXAI in Malware Analysis Framework: Balancing Explainability with Security (IJSCAI)
With the increased use of Artificial Intelligence (AI) in malware analysis, there is also an increased need to understand the decisions models make when identifying malicious artifacts. Explainable AI (XAI) becomes the answer to interpreting the decision-making process that AI malware analysis models use to distinguish malicious from benign samples, building trust that, in a production environment, the system is able to catch malware. As with any cyber innovation, a new set of challenges follows, and literature soon emerged describing XAI as a new attack vector. Adversarial XAI (AdvXAI) is a relatively new concept, but with AI applications in many sectors it is crucial to respond quickly to the attack surface it creates. This paper seeks to conceptualize a theoretical framework for addressing AdvXAI in malware analysis in an effort to balance explainability with security. Following this framework, a machine designed with an AI malware detection and analysis model will be able to analyze malware effectively, explain how it reached its decision, and be built securely to avoid adversarial attacks and manipulations. The framework focuses on choosing malware datasets to train the model, choosing the AI model, choosing an XAI technique, implementing AdvXAI defensive measures, and continually evaluating the model. It will contribute significantly to automated malware detection and XAI efforts, allowing for secure systems that are resilient to adversarial attacks.
Sorting order and stability in sorting; the concept of internal and external sorting. Algorithms covered: Bubble Sort, Insertion Sort, Selection Sort, Quick Sort, Merge Sort, Radix Sort, and Shell Sort; external sorting; time complexity analysis of sorting algorithms.
RICS Membership (The Royal Institution of Chartered Surveyors).pdf (MohamedAbdelkader115)
Glad to be one of only 14 members inside Kuwait to hold this credential.
Please check the members in Kuwait via this link:
https://www.rics.org/networking/find-a-member.html?firstname=&lastname=&town=&country=Kuwait&member_grade=(AssocRICS)&expert_witness=&accrediation=&page=1
Analysis of reinforced concrete deep beams is based on simplified approximate methods because the exact analysis is complex, the complexity being due to the number of parameters affecting the response. To evaluate some of these parameters, a finite element study of the structural behavior of reinforced self-compacting concrete deep beams was carried out using the Abaqus finite element modeling tool. The model was validated against experimental data from the literature. The parametric effects of varying the concrete compressive strength, the vertical web reinforcement ratio, and the horizontal web reinforcement ratio were tested on eight (8) different specimens under four-point loads. The validation results showed good agreement with the experimental studies. The parametric study revealed that the concrete compressive strength most significantly influenced the specimens' response, with average increases of 41.1% in diagonal cracking load and 49% in ultimate load when the compressive strength was doubled. Although increasing the horizontal web reinforcement ratio from 0.31% to 0.63% led to an average 6.24% increase in diagonal cracking load, it did not influence the ultimate strength or the load-deflection response of the beams. A similar variation in the vertical web reinforcement ratio led to average increases of 2.4% and 15% in cracking and ultimate load, respectively, with no appreciable effect on the load-deflection response.
☁️ GDG Cloud Munich: Build With AI Workshop - Introduction to Vertex AI! ☁️
Join us for an exciting #BuildWithAi workshop on the 28th of April, 2025 at the Google Office in Munich!
Dive into the world of AI with our "Introduction to Vertex AI" session, presented by Google Cloud expert Randy Gupta.
Raish Khanji GTU 8th Sem Internship Report.pdf (RaishKhanji)
This report details the practical experiences gained during an internship at Indo German Tool Room, Ahmedabad. The internship provided hands-on training in various manufacturing technologies, encompassing both conventional and advanced techniques. Significant emphasis was placed on machining processes, including operation and fundamental understanding of lathe and milling machines. Furthermore, the internship incorporated modern welding technology, notably through the application of an Augmented Reality (AR) simulator, offering a safe and effective environment for skill development. Exposure to industrial automation was achieved through practical exercises in Programmable Logic Controllers (PLCs) using Siemens TIA software and direct operation of industrial robots utilizing teach pendants. The principles and practical aspects of Computer Numerical Control (CNC) technology were also explored. Complementing these manufacturing processes, the internship included extensive application of SolidWorks software for design and modeling tasks. This comprehensive practical training has provided a foundational understanding of key aspects of modern manufacturing and design, enhancing the technical proficiency and readiness for future engineering endeavors.
The Fluke 925 is a vane anemometer, a handheld device designed to measure wind speed, air flow (volume), and temperature. It features a separate sensor and display unit, allowing greater flexibility and ease of use in tight or hard-to-reach spaces. The Fluke 925 is particularly suitable for HVAC (heating, ventilation, and air conditioning) maintenance in both residential and commercial buildings, offering a durable and cost-effective solution for routine airflow diagnostics.
15. New approach
● Deep learning pixel-to-pixel segmentation.
○ A hand-labelled mask is needed.
○ Let's generate it!
From: Ra Gyoung Yoon et al., "Quantitative assessment of change in regional disease patterns on serial HRCT of fibrotic interstitial pneumonia with texture-based automated quantification system", 2012.
16. Mask generation
● A naive approach → Failed.
○ Because the neural network learned deterministic patterns instead of lung disease patterns.
(Figure: example ROI patches of honeycombing and emphysema.)
17. Mask generation
● Ken Perlin, "An Image Synthesizer", 1985
○ natural-appearing textures
○ gradient-based fractal noise
○ heavily used in the game industry
18. Mask generation
● One random Perlin noise field (simplex noise)
● Two randomly selected ROI patches
(Figure: the noise-derived mask combines two ROI patches, e.g. GGO and consolidation, into a synthetic image with its pixel mask.)
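A hedged sketch of the recipe on this slide: threshold a fractal-noise field to get a binary mask, then composite two ROI texture patches so the mask doubles as the free pixel label. The noise here is approximated by summing smoothly upsampled random grids; the slides use true Perlin/simplex noise, and all names below are illustrative.

```python
import numpy as np
from scipy.ndimage import zoom

def fractal_noise(size=256, octaves=4, seed=0):
    """Cheap stand-in for Perlin noise: a sum of upsampled random grids."""
    rng = np.random.default_rng(seed)
    field = np.zeros((size, size))
    for o in range(octaves):
        grid = rng.standard_normal((2 ** (o + 2), 2 ** (o + 2)))
        field += zoom(grid, size / grid.shape[0], order=3) / 2 ** o
    return field

def synth_patch_and_mask(patch_a, patch_b, threshold=0.0):
    """Blend two grayscale ROI patches with a noise-thresholded mask;
    the mask itself is the segmentation label, no radiologist needed."""
    mask = fractal_noise(patch_a.shape[0]) > threshold
    image = np.where(mask, patch_a, patch_b)
    return image, mask.astype(np.uint8)
```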
37. Our contributions
● A simple and practical pixel mask generation method for the DILD ROI dataset using Perlin noise.
○ No radiologist mask needed.
● We applied a state-of-the-art deep-CNN-based pixel-to-pixel segmentation method to the DILD dataset.
○ High accuracy with reasonable computing time.