Online Video Object Segmentation via
Convolutional Trident Network
Seminar at Naver
Won-Dong Jang
Korea University
2017-08-22
Introduction & related works
Video object segmentation
• Clustering pixels in videos into objects or background
• Unsupervised
• Supervised
• Semi-supervised
Segment track
Video object segmentation
• Unsupervised video object segmentation
• Discover and segment a primary object in a video
• Without user annotations
Input videos Ground-truth segmentation labels
Video object segmentation
• Unsupervised video object segmentation
• Saliency detection-based approach
• Visual saliency
• Motion saliency
Examples of saliency detection
Video object segmentation
• Supervised video object segmentation
• Users can annotate mislabeled pixels in any frame
• Interaction between an algorithm and a user
Figure: annotation at the first frame; segmentation results; additional user annotation; segmentation results after adding the annotation
Video object segmentation
• Semi-supervised video object segmentation
• Discover and segment a primary object in a video
• With user annotations in the first frame
Figure: annotated target object in the first frame; segmentation results
Related works
• Object proposal-based algorithm
• Generate object proposals in each frame
• Select object proposals that are similar to the target object
[2015][ICCV][Perazzi] Fully Connected Object Proposals for Video Segmentation
Generated object proposals
Segment track
Related works
• Superpixel-based algorithm
• Over-segment each frame into superpixels
• Trace the target object based on the inter-frame matching
[2015][CVPR][Wen] Joint Online Tracking and Segmentation
Inter-frame matching Segment track
Proposed algorithm
In this presentation
• A semi-supervised online segmentation algorithm
• Online method
• Offline techniques require a huge amount of memory for long videos
• Deep learning-based approach
• Connection between a convolutional neural network and an MRF
optimization strategy
• Remarkable performance on the DAVIS benchmark dataset
Overview
• Framework
• Inter-frame propagation
• Inference via convolutional trident network (CTN)
• Yields three tailored probability maps for the MRF optimization
• MRF optimization
Inter-frame propagation
• Propagation from the previous frame
• To roughly locate the target object in the current frame
• Using backward optical flow
• From 𝑡 to 𝑡 − 1
Figure: segmentation label map for frame 𝑡 − 1; backward motion (optical flow); propagation map for frame 𝑡
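The propagation step above can be sketched as a nearest-neighbor warp. This is only an illustration: the function name, the `(dx, dy)` flow layout, and the rounding scheme are assumptions, not details given in the slides.

```python
import numpy as np

def propagate_labels(prev_labels, backward_flow):
    """Warp the label map of frame t-1 into frame t.

    prev_labels:   (H, W) binary array, segmentation labels of frame t-1.
    backward_flow: (H, W, 2) array; backward_flow[y, x] = (dx, dy) points from
                   pixel (x, y) in frame t to its match in frame t-1
                   (assumed layout).
    """
    h, w = prev_labels.shape
    ys, xs = np.mgrid[0:h, 0:w]
    # Follow the backward motion and round to the nearest source pixel.
    src_x = np.rint(xs + backward_flow[..., 0]).astype(int)
    src_y = np.rint(ys + backward_flow[..., 1]).astype(int)
    # Pixels whose source falls outside frame t-1 stay background.
    inside = (src_x >= 0) & (src_x < w) & (src_y >= 0) & (src_y < h)
    propagated = np.zeros_like(prev_labels)
    propagated[inside] = prev_labels[src_y[inside], src_x[inside]]
    return propagated
```

Because every pixel of frame 𝑡 looks up its own source in frame 𝑡 − 1, the warped map has no holes, which is one practical reason to use the backward flow rather than the forward flow.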
Convolutional trident network
• Infer segmentation information
• The propagation map may be inaccurate
• The inferred information is effectively used to solve a binary
labeling problem
Network architecture
• Encoder-decoder architecture
• Single encoder
• VGG encoder
• Three decoders
• Separative decoder
• Definite foreground decoder
• Definite background decoder
Network architecture
• VGG encoder
• Trained for image classification
• Using ImageNet dataset
• 22K categories
• 14M images
Figure: convolutional layers followed by fully connected layers; example class labels include fox, cat, lion, dog, zebra, umbrella, tank, lamp, desk, orange, kite, …
Network architecture
• Separative decoder (SD)
• Separate a target object from the background
• Using a down-sampled foreground propagation patch
Figure: separative decoder architecture. VGG encoder (Conv1_1 to Conv5_3 with four pooling layers), decoder blocks SD-Dec1 to SD-Dec5 with four unpooling layers and skip connections, and prediction layer SD-Pred. Inputs: image patch and foreground propagation patch. Output: separative probability patch.
Network architecture
• Definite foreground and background decoders
• Definite pixels are locations that should indubitably be labeled as the foreground or the background
• Fixing the labels of definite pixels improves labeling accuracy
• Inspired by the image matting problem
Input image Tri-map Matting result
Network architecture
• Definite foreground decoder (DFD)
• Identifies definite foreground pixels
Figure: definite foreground decoder architecture. VGG encoder (Conv1_1 to Conv5_3 with four pooling layers), decoder blocks DFD-Dec1 to DFD-Dec5 with four unpooling layers, and prediction layer DFD-Pred. Inputs: image patch and foreground propagation patch. Output: definite foreground probability patch.
Network architecture
• Definite background decoder (DBD)
• Finds definite background pixels
Figure: definite background decoder architecture. VGG encoder (Conv1_1 to Conv5_3 with four pooling layers), decoder blocks DBD-Dec1 to DBD-Dec5 with four unpooling layers, and prediction layer DBD-Pred. Inputs: image patch and background propagation patch. Output: definite background probability patch.
Network architecture
• Implementation issues in decoders
• Prediction layer
• Sigmoid layer is used to yield normalized outputs within [0, 1]
• Rectified linear unit (ReLU) + Batch normalization
• Kernel size
• 3 × 3 kernels are used in the prediction layers
• 5 × 5 kernels are used in the other convolution layers
Training phase
• Lack of video object segmentation dataset
• There are several datasets for video object segmentation
• However, each of them contains only a small number of videos (12 to 59)
• Instead,
• We use the PASCAL VOC 2012 dataset
• Object segmentation
• 26,844 object masks
• 11,355 images
Training phase
• Preprocessing of training data
• Ground-truth masks for the DFD and DBD are not available
• Hence, we generate them through simple image processing
Training phase
• Preprocessing of training data
• Degrade the object mask to imitate propagation errors
• By performing suppression and noise addition
• Synthesize the ground-truth masks for the DFD and DBD
• By applying erosion and dilation
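A minimal sketch of this preprocessing, under stated assumptions: the slides only name the operations, so the 3 × 3 structuring element, the `margin`, `drop_size`, and `noise_rate` parameters, and the choice "definite foreground = eroded mask, definite background = complement of the dilated mask" are all illustrative guesses.

```python
import numpy as np

def dilate(mask, iterations=1):
    """Binary dilation with a 3x3 square structuring element."""
    out = mask.astype(bool)
    h, w = out.shape
    for _ in range(iterations):
        padded = np.pad(out, 1, mode="constant", constant_values=False)
        grown = np.zeros_like(out)
        # OR together the nine shifted copies of the mask.
        for dy in range(3):
            for dx in range(3):
                grown |= padded[dy:dy + h, dx:dx + w]
        out = grown
    return out

def erode(mask, iterations=1):
    """Binary erosion as the complement of the dilated complement.
    Pixels outside the image are treated as foreground."""
    return ~dilate(~mask.astype(bool), iterations)

def definite_masks(object_mask, margin=2):
    """Synthesize ground truth for the DFD and DBD (assumed recipe)."""
    obj = object_mask.astype(bool)
    definite_fg = erode(obj, margin)    # safely inside the object
    definite_bg = ~dilate(obj, margin)  # safely outside the object
    return definite_fg, definite_bg

def degrade(object_mask, rng, drop_size=3, noise_rate=0.05):
    """Imitate propagation errors: suppress one block, then flip random pixels."""
    degraded = object_mask.astype(bool).copy()
    ys, xs = np.nonzero(degraded)
    if len(ys) > 0:
        i = rng.integers(len(ys))
        y0, x0 = ys[i], xs[i]
        degraded[y0:y0 + drop_size, x0:x0 + drop_size] = False  # suppression
    flips = rng.random(degraded.shape) < noise_rate             # noise addition
    return degraded ^ flips
```

The band between the eroded and dilated masks is left undecided, which mirrors the unknown region of a matting tri-map.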
Training phase
• Implementation issues
• Caffe library
• Cross-entropy losses
$$-\frac{1}{N} \sum_{n=1}^{N} \left[ p_n \log \hat{p}_n + (1 - p_n) \log (1 - \hat{p}_n) \right]$$
• Minibatch
• Eight training samples per minibatch
• Learning rate
• 1e-3 for the first 55 epochs
• 1e-4 for the next 35 epochs
• Stochastic gradient descent
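The loss and the learning-rate schedule listed above can be written out in a few lines. This is a plain-Python sketch rather than the Caffe configuration used for training; the clamping constant `eps` and the 0-based epoch index are assumptions added for illustration.

```python
import math

def cross_entropy(targets, predictions, eps=1e-12):
    """Mean binary cross-entropy between ground-truth values p_n and
    predicted probabilities p_hat_n."""
    total = 0.0
    for p, p_hat in zip(targets, predictions):
        # Clamp to avoid log(0); eps is not part of the original formula.
        p_hat = min(max(p_hat, eps), 1.0 - eps)
        total += p * math.log(p_hat) + (1.0 - p) * math.log(1.0 - p_hat)
    return -total / len(targets)

def learning_rate(epoch):
    """1e-3 for the first 55 epochs, 1e-4 for the next 35 (0-based epoch)."""
    return 1e-3 if epoch < 55 else 1e-4
```

For example, predicting 0.5 for every pixel gives a loss of log 2 regardless of the ground truth, which is a useful sanity check for an untrained decoder.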
Inference phase
• Set input data
• By cropping the frame and propagation map
• CTN outputs three probability patches
• Separative probability map 𝑅S
• Definite foreground probability map 𝑅F
• Definite background probability map 𝑅B
Figure: overview of the inference phase.
Inter-frame propagation: the segmentation label at frame 𝑡 − 1 and the optical flow yield the propagation map at frame 𝑡.
Inference via the convolutional encoder-decoder network: the inputs are the image patch, the foreground propagation patch, and the background propagation patch. A single encoder feeds the separative, definite foreground, and definite background decoders, which output the separative, definite foreground, and definite background probability patches.
Inference phase
• Classification
• Separative probability map 𝑅S
• If 𝑅S(𝐩) > 𝜃sep, pixel 𝐩 is classified as the foreground
• Let ℒ be the coordinate set of such foreground pixels
• Definite foreground probability map 𝑅F
• If 𝑅F(𝐩) > 𝜃def, pixel 𝐩 is classified as the definite foreground
• ℱ denotes the set of the definite foreground pixels
• Definite background probability map 𝑅B
• If 𝑅B(𝐩) > 𝜃def, pixel 𝐩 is classified as the definite background
• ℬ indicates the set of the definite background pixels
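The three thresholding rules can be sketched as boolean masks. The slides do not state the threshold values, so the defaults for `theta_sep` and `theta_def` below are placeholders, not the settings used in the experiments.

```python
import numpy as np

def classify(r_s, r_f, r_b, theta_sep=0.5, theta_def=0.9):
    """Threshold the three CTN probability maps.

    Returns boolean masks for the sets L (foreground), F (definite
    foreground), and B (definite background).
    """
    fg = r_s > theta_sep      # set L
    def_fg = r_f > theta_def  # set F
    def_bg = r_b > theta_def  # set B
    return fg, def_fg, def_bg
```

Pixels outside both ℱ and ℬ remain undecided at this stage; their labels are settled by the MRF optimization.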
MRF optimization
• Solve two-class MRF optimization problem
• To improve the segmentation quality further
• Define a graph 𝐺 = (𝑁, 𝐸)
• Nodes are pixels in the current frame
• Each node is connected to its four neighbors by edges
MRF optimization
• MRF energy function
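$$\mathcal{E}(S) = \sum_{\mathbf{p} \in N} \mathcal{D}(\mathbf{p}, S) + \gamma \times \sum_{(\mathbf{p}, \mathbf{q}) \in E} \mathcal{Z}(\mathbf{p}, \mathbf{q}, S)$$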
Figure: image patch with the separative, definite foreground, and definite background probability patches
• Unary cost 𝒟
• Returns extremely high costs when definite foreground (DF) pixels are labeled as background or definite background (DB) pixels are labeled as foreground
• Pairwise cost 𝒵
• Encourages neighboring pixels to have the same label
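To make the energy concrete, the sketch below evaluates it for a given labeling on a four-connected pixel grid. The array layouts (`unary[y, x, s]`, separate horizontal and vertical edge weights) are assumptions made for this example, and each undirected edge is counted once.

```python
import numpy as np

def mrf_energy(labels, unary, pairwise_h, pairwise_v, gamma):
    """Evaluate E(S) = sum of unary costs + gamma * sum of pairwise costs.

    labels:     (H, W) int array with values in {0, 1}.
    unary:      (H, W, 2) array, unary[y, x, s] = cost of giving label s.
    pairwise_h: (H, W-1) cost paid when labels[y, x] != labels[y, x+1].
    pairwise_v: (H-1, W) cost paid when labels[y, x] != labels[y+1, x].
    """
    h, w = labels.shape
    ys, xs = np.mgrid[0:h, 0:w]
    data_term = unary[ys, xs, labels].sum()
    # Edges whose endpoints disagree pay the pairwise cost.
    cut_h = labels[:, :-1] != labels[:, 1:]
    cut_v = labels[:-1, :] != labels[1:, :]
    smooth_term = pairwise_h[cut_h].sum() + pairwise_v[cut_v].sum()
    return data_term + gamma * smooth_term
```

A larger 𝛾 makes label changes between neighbors more expensive, so the minimizer prefers smoother segment boundaries.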
MRF optimization
• Unary cost computation
• Build the RGB color Gaussian mixture models (GMMs) of the
foreground and the background, respectively
• 𝐾 = 10 for both GMMs
• Use pixels in ℒ to construct the foreground GMMs
• Use pixels in ℒᶜ, the complement of ℒ, to construct the background GMMs
• Gaussian cost
$$\psi(\mathbf{p}, s) = \min_{k} \left[ -\log f(\mathbf{p}; \mathcal{M}_{s,k}) \right]$$
• Unary cost
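$$\mathcal{D}(\mathbf{p}, S) = \begin{cases} \infty & \text{if } \mathbf{p} \in \mathcal{F} \text{ and } S(\mathbf{p}) = 0 \\ \infty & \text{if } \mathbf{p} \in \mathcal{B} \text{ and } S(\mathbf{p}) = 1 \\ \psi(\mathbf{p}, S(\mathbf{p})) & \text{otherwise} \end{cases}$$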
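A sketch of the unary cost, assuming label 0 is the background and label 1 is the foreground. For brevity it takes already-fitted mixture components with diagonal covariances and ignores mixture weights; both are simplifications chosen for this example, and fitting the 𝐾 = 10 components is not shown.

```python
import numpy as np

def gaussian_cost(colors, means, variances):
    """psi: minimum over components of -log N(color; mean_k, diag(var_k)).

    colors: (N, 3) RGB values; means, variances: (K, 3).
    """
    diff = colors[:, None, :] - means[None, :, :]              # (N, K, 3)
    mahalanobis = (diff ** 2 / variances[None]).sum(-1)        # (N, K)
    log_norm = np.log(2 * np.pi * variances).sum(-1)[None]     # (1, K)
    return (0.5 * mahalanobis + 0.5 * log_norm).min(axis=1)

def unary_cost(colors, gmm_bg, gmm_fg, def_fg, def_bg):
    """Return cost[i, s] for label s (0 = background, 1 = foreground).

    gmm_bg, gmm_fg: (means, variances) tuples.
    def_fg, def_bg: (N,) boolean masks for the sets F and B.
    """
    cost = np.stack([gaussian_cost(colors, *gmm_bg),
                     gaussian_cost(colors, *gmm_fg)], axis=1)
    cost[def_fg, 0] = np.inf  # definite foreground must not be background
    cost[def_bg, 1] = np.inf  # definite background must not be foreground
    return cost
```

In practice a large finite constant is often used in place of infinity so that a max-flow solver can work with finite capacities.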
MRF optimization
• Pairwise cost computation
• Pairwise cost
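$$\mathcal{Z}(\mathbf{p}, \mathbf{q}, S) = \begin{cases} \exp(-d(\mathbf{p}, \mathbf{q})) & \text{if } S(\mathbf{p}) \neq S(\mathbf{q}) \\ 0 & \text{otherwise} \end{cases}$$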
• Graph-cut optimization
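$$\mathcal{E}(S) = \sum_{\mathbf{p} \in N} \mathcal{D}(\mathbf{p}, S) + \gamma \times \sum_{(\mathbf{p}, \mathbf{q}) \in E} \mathcal{Z}(\mathbf{p}, \mathbf{q}, S)$$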
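The slides do not define the distance 𝑑(𝐩, 𝐪). The sketch below assumes a scaled squared RGB difference between neighboring pixels, a common choice in graph-cut segmentation; the function name and the `sigma` parameter are hypothetical.

```python
import numpy as np

def pairwise_weights(image, sigma=10.0):
    """Compute exp(-d(p, q)) for every four-connected neighbor pair.

    image: (H, W, 3) RGB array.
    Returns (H, W-1) horizontal and (H-1, W) vertical edge weights.
    d is assumed to be the squared color difference over 2 * sigma**2.
    """
    img = image.astype(float)
    d_h = ((img[:, :-1] - img[:, 1:]) ** 2).sum(-1) / (2 * sigma ** 2)
    d_v = ((img[:-1, :] - img[1:, :]) ** 2).sum(-1) / (2 * sigma ** 2)
    return np.exp(-d_h), np.exp(-d_v)
```

With this choice, cutting between two similar pixels costs close to 1, while cutting across a strong color edge costs almost nothing, so label boundaries are drawn toward image edges.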
Reappearing object detection
• Identification of reappearing parts
• A target object may disappear from view or be occluded by other objects, and then reappear
• Use backward-forward motion consistency
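The slides do not spell out how the consistency check is applied, so the following is only a generic sketch of a backward-forward test: a pixel in frame 𝑡 is followed to frame 𝑡 − 1 with the backward flow and brought back with the forward flow, and a large round-trip error flags it as inconsistent. The flow layout and the tolerance `tol` are assumptions.

```python
import numpy as np

def inconsistent_pixels(backward_flow, forward_flow, tol=1.0):
    """Flag pixels of frame t that fail the backward-forward check.

    backward_flow: (H, W, 2), (dx, dy) from frame t to frame t-1.
    forward_flow:  (H, W, 2), (dx, dy) from frame t-1 to frame t.
    """
    h, w = backward_flow.shape[:2]
    ys, xs = np.mgrid[0:h, 0:w]
    # Matching position q in frame t-1 for every pixel p in frame t.
    qx = np.rint(xs + backward_flow[..., 0]).astype(int)
    qy = np.rint(ys + backward_flow[..., 1]).astype(int)
    inside = (qx >= 0) & (qx < w) & (qy >= 0) & (qy < h)
    # Pixels that map outside frame t-1 are treated as inconsistent.
    round_trip = np.full((h, w), np.inf)
    fwd = forward_flow[qy[inside], qx[inside]]
    round_trip[inside] = np.linalg.norm(backward_flow[inside] + fwd, axis=1)
    return round_trip > tol
```

Inconsistent pixels have no reliable match in the previous frame, which makes them candidates for newly visible or reappearing regions.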
Experimental results
Experimental results
• The DAVIS benchmark dataset
• 50 videos
• 854 × 480 resolution
• Number of frames
• From 25 to 104
• Difficulties
• Fast motion
• Occlusion
• Object deformation
Experimental results
• Performance measures
• Region similarity
• Jaccard index
• Contour accuracy
• F-measure
• Statistics
• Mean
• Recall
• Decay
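For reference, the region similarity measure can be computed as below. The helper names are illustrative; the convention of returning 1 when both masks are empty and the 0.5 recall threshold are assumptions here, and the decay statistic is not shown.

```python
import numpy as np

def jaccard(pred, gt):
    """Jaccard index: intersection over union of two binary masks."""
    pred, gt = pred.astype(bool), gt.astype(bool)
    union = np.logical_or(pred, gt).sum()
    if union == 0:
        return 1.0  # both masks empty (assumed convention)
    return np.logical_and(pred, gt).sum() / union

def mean_and_recall(scores, threshold=0.5):
    """Mean score and the fraction of scores above the threshold."""
    scores = np.asarray(scores, dtype=float)
    return scores.mean(), (scores > threshold).mean()
```

For example, a prediction that covers the object plus an equally large false region has a Jaccard index of 0.5.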
Experimental results
• Performance comparison
Experimental results
• Qualitative results
Experimental results
• The SegTrack dataset
• 5 videos
• Performance comparison
• Jaccard index
Experimental results
• Ablation studies
• Effectiveness of each decoder
• Efficacy of MRF optimization
Experimental results
• Running time analysis
• Prop-Q
• Uses the state-of-the-art optical flow technique
• Prop-F
• Adopts a much faster optical flow technique
• Parameter selection
Conclusions
• A semi-supervised online video object segmentation
algorithm is introduced in this presentation
• Deep learning-based semi-supervised video object
segmentation algorithm
• Tailored network for MRF optimization
• Remarkable performance on the DAVIS dataset
• Q&A
• Thank you

More Related Content

What's hot (20)

Unsupervised visual representation learning overview: Toward Self-Supervision
Unsupervised visual representation learning overview: Toward Self-SupervisionUnsupervised visual representation learning overview: Toward Self-Supervision
Unsupervised visual representation learning overview: Toward Self-Supervision
LEE HOSEONG
 
Super resolution in deep learning era - Jaejun Yoo
Super resolution in deep learning era - Jaejun YooSuper resolution in deep learning era - Jaejun Yoo
Super resolution in deep learning era - Jaejun Yoo
JaeJun Yoo
 
Image Completion using Planar Structure Guidance (SIGGRAPH 2014)
Image Completion using Planar Structure Guidance (SIGGRAPH 2014)Image Completion using Planar Structure Guidance (SIGGRAPH 2014)
Image Completion using Planar Structure Guidance (SIGGRAPH 2014)
Jia-Bin Huang
 
Deep Local Parametric Filters for Image Enhancement
Deep Local Parametric Filters for Image EnhancementDeep Local Parametric Filters for Image Enhancement
Deep Local Parametric Filters for Image Enhancement
Sean Moran
 
YOLOv4: optimal speed and accuracy of object detection review
YOLOv4: optimal speed and accuracy of object detection reviewYOLOv4: optimal speed and accuracy of object detection review
YOLOv4: optimal speed and accuracy of object detection review
LEE HOSEONG
 
Deep Local Parametric Filters for Image Enhancement
Deep Local Parametric Filters for Image EnhancementDeep Local Parametric Filters for Image Enhancement
Deep Local Parametric Filters for Image Enhancement
Sean Moran
 
Color and 3D Semantic Reconstruction of Indoor Scenes from RGB-D stream
Color and 3D Semantic Reconstruction of Indoor Scenes from RGB-D streamColor and 3D Semantic Reconstruction of Indoor Scenes from RGB-D stream
Color and 3D Semantic Reconstruction of Indoor Scenes from RGB-D stream
NAVER Engineering
 
Talk 2011-buet-perception-event
Talk 2011-buet-perception-eventTalk 2011-buet-perception-event
Talk 2011-buet-perception-event
Mahfuzul Haque
 
Background Subtraction Based on Phase and Distance Transform Under Sudden Ill...
Background Subtraction Based on Phase and Distance Transform Under Sudden Ill...Background Subtraction Based on Phase and Distance Transform Under Sudden Ill...
Background Subtraction Based on Phase and Distance Transform Under Sudden Ill...
Shanghai Jiao Tong University(上海交通大学)
 
Deep Learning for Computer Vision: A comparision between Convolutional Neural...
Deep Learning for Computer Vision: A comparision between Convolutional Neural...Deep Learning for Computer Vision: A comparision between Convolutional Neural...
Deep Learning for Computer Vision: A comparision between Convolutional Neural...
Vincenzo Lomonaco
 
Object-Region Video Transformers
Object-Region Video TransformersObject-Region Video Transformers
Object-Region Video Transformers
Sangwoo Mo
 
PR-217: EfficientDet: Scalable and Efficient Object Detection
PR-217: EfficientDet: Scalable and Efficient Object DetectionPR-217: EfficientDet: Scalable and Efficient Object Detection
PR-217: EfficientDet: Scalable and Efficient Object Detection
Jinwon Lee
 
Tutorial on Object Detection (Faster R-CNN)
Tutorial on Object Detection (Faster R-CNN)Tutorial on Object Detection (Faster R-CNN)
Tutorial on Object Detection (Faster R-CNN)
Hwa Pyung Kim
 
Rethinking Data Augmentation for Image Super-resolution: A Comprehensive Anal...
Rethinking Data Augmentation for Image Super-resolution: A Comprehensive Anal...Rethinking Data Augmentation for Image Super-resolution: A Comprehensive Anal...
Rethinking Data Augmentation for Image Super-resolution: A Comprehensive Anal...
JaeJun Yoo
 
Shai Avidan's Support vector tracking and ensemble tracking
Shai Avidan's Support vector tracking and ensemble trackingShai Avidan's Support vector tracking and ensemble tracking
Shai Avidan's Support vector tracking and ensemble tracking
wolf
 
Introduction to deep learning based voice activity detection
Introduction to deep learning based voice activity detectionIntroduction to deep learning based voice activity detection
Introduction to deep learning based voice activity detection
NAVER Engineering
 
Deep Learning for Natural Language Processing
Deep Learning for Natural Language ProcessingDeep Learning for Natural Language Processing
Deep Learning for Natural Language Processing
Sangwoo Mo
 
Generative Models for General Audiences
Generative Models for General AudiencesGenerative Models for General Audiences
Generative Models for General Audiences
Sangwoo Mo
 
Deep learning fundamental and Research project on IBM POWER9 system from NUS
Deep learning fundamental and Research project on IBM POWER9 system from NUSDeep learning fundamental and Research project on IBM POWER9 system from NUS
Deep learning fundamental and Research project on IBM POWER9 system from NUS
Ganesan Narayanasamy
 
Passive stereo vision with deep learning
Passive stereo vision with deep learningPassive stereo vision with deep learning
Passive stereo vision with deep learning
Yu Huang
 
Unsupervised visual representation learning overview: Toward Self-Supervision
Unsupervised visual representation learning overview: Toward Self-SupervisionUnsupervised visual representation learning overview: Toward Self-Supervision
Unsupervised visual representation learning overview: Toward Self-Supervision
LEE HOSEONG
 
Super resolution in deep learning era - Jaejun Yoo
Super resolution in deep learning era - Jaejun YooSuper resolution in deep learning era - Jaejun Yoo
Super resolution in deep learning era - Jaejun Yoo
JaeJun Yoo
 
Image Completion using Planar Structure Guidance (SIGGRAPH 2014)
Image Completion using Planar Structure Guidance (SIGGRAPH 2014)Image Completion using Planar Structure Guidance (SIGGRAPH 2014)
Image Completion using Planar Structure Guidance (SIGGRAPH 2014)
Jia-Bin Huang
 
Deep Local Parametric Filters for Image Enhancement
Deep Local Parametric Filters for Image EnhancementDeep Local Parametric Filters for Image Enhancement
Deep Local Parametric Filters for Image Enhancement
Sean Moran
 
YOLOv4: optimal speed and accuracy of object detection review
YOLOv4: optimal speed and accuracy of object detection reviewYOLOv4: optimal speed and accuracy of object detection review
YOLOv4: optimal speed and accuracy of object detection review
LEE HOSEONG
 
Deep Local Parametric Filters for Image Enhancement
Deep Local Parametric Filters for Image EnhancementDeep Local Parametric Filters for Image Enhancement
Deep Local Parametric Filters for Image Enhancement
Sean Moran
 
Color and 3D Semantic Reconstruction of Indoor Scenes from RGB-D stream
Color and 3D Semantic Reconstruction of Indoor Scenes from RGB-D streamColor and 3D Semantic Reconstruction of Indoor Scenes from RGB-D stream
Color and 3D Semantic Reconstruction of Indoor Scenes from RGB-D stream
NAVER Engineering
 
Talk 2011-buet-perception-event
Talk 2011-buet-perception-eventTalk 2011-buet-perception-event
Talk 2011-buet-perception-event
Mahfuzul Haque
 
Deep Learning for Computer Vision: A comparision between Convolutional Neural...
Deep Learning for Computer Vision: A comparision between Convolutional Neural...Deep Learning for Computer Vision: A comparision between Convolutional Neural...
Deep Learning for Computer Vision: A comparision between Convolutional Neural...
Vincenzo Lomonaco
 
Object-Region Video Transformers
Object-Region Video TransformersObject-Region Video Transformers
Object-Region Video Transformers
Sangwoo Mo
 
PR-217: EfficientDet: Scalable and Efficient Object Detection
PR-217: EfficientDet: Scalable and Efficient Object DetectionPR-217: EfficientDet: Scalable and Efficient Object Detection
PR-217: EfficientDet: Scalable and Efficient Object Detection
Jinwon Lee
 
Tutorial on Object Detection (Faster R-CNN)
Tutorial on Object Detection (Faster R-CNN)Tutorial on Object Detection (Faster R-CNN)
Tutorial on Object Detection (Faster R-CNN)
Hwa Pyung Kim
 
Rethinking Data Augmentation for Image Super-resolution: A Comprehensive Anal...
Rethinking Data Augmentation for Image Super-resolution: A Comprehensive Anal...Rethinking Data Augmentation for Image Super-resolution: A Comprehensive Anal...
Rethinking Data Augmentation for Image Super-resolution: A Comprehensive Anal...
JaeJun Yoo
 
Shai Avidan's Support vector tracking and ensemble tracking
Shai Avidan's Support vector tracking and ensemble trackingShai Avidan's Support vector tracking and ensemble tracking
Shai Avidan's Support vector tracking and ensemble tracking
wolf
 
Introduction to deep learning based voice activity detection
Introduction to deep learning based voice activity detectionIntroduction to deep learning based voice activity detection
Introduction to deep learning based voice activity detection
NAVER Engineering
 
Deep Learning for Natural Language Processing
Deep Learning for Natural Language ProcessingDeep Learning for Natural Language Processing
Deep Learning for Natural Language Processing
Sangwoo Mo
 
Generative Models for General Audiences
Generative Models for General AudiencesGenerative Models for General Audiences
Generative Models for General Audiences
Sangwoo Mo
 
Deep learning fundamental and Research project on IBM POWER9 system from NUS
Deep learning fundamental and Research project on IBM POWER9 system from NUSDeep learning fundamental and Research project on IBM POWER9 system from NUS
Deep learning fundamental and Research project on IBM POWER9 system from NUS
Ganesan Narayanasamy
 
Passive stereo vision with deep learning
Passive stereo vision with deep learningPassive stereo vision with deep learning
Passive stereo vision with deep learning
Yu Huang
 

Viewers also liked (20)

딥러닝을 활용한 비디오 스토리 질의응답: 뽀로로QA와 심층 임베딩 메모리망
딥러닝을 활용한 비디오 스토리 질의응답: 뽀로로QA와 심층 임베딩 메모리망딥러닝을 활용한 비디오 스토리 질의응답: 뽀로로QA와 심층 임베딩 메모리망
딥러닝을 활용한 비디오 스토리 질의응답: 뽀로로QA와 심층 임베딩 메모리망
NAVER Engineering
 
Introduction of Deep Reinforcement Learning
Introduction of Deep Reinforcement LearningIntroduction of Deep Reinforcement Learning
Introduction of Deep Reinforcement Learning
NAVER Engineering
 
알파고 해부하기 1부
알파고 해부하기 1부알파고 해부하기 1부
알파고 해부하기 1부
Donghun Lee
 
Deep Learning, Where Are You Going?
Deep Learning, Where Are You Going?Deep Learning, Where Are You Going?
Deep Learning, Where Are You Going?
NAVER Engineering
 
Multimodal Sequential Learning for Video QA
Multimodal Sequential Learning for Video QAMultimodal Sequential Learning for Video QA
Multimodal Sequential Learning for Video QA
NAVER Engineering
 
바둑인을 위한 알파고
바둑인을 위한 알파고바둑인을 위한 알파고
바둑인을 위한 알파고
Donghun Lee
 
조음 Goodness-Of-Pronunciation 자질을 이용한 영어 학습자의 조음 오류 진단
조음 Goodness-Of-Pronunciation 자질을 이용한 영어 학습자의 조음 오류 진단조음 Goodness-Of-Pronunciation 자질을 이용한 영어 학습자의 조음 오류 진단
조음 Goodness-Of-Pronunciation 자질을 이용한 영어 학습자의 조음 오류 진단
NAVER Engineering
 
알파고 풀어보기 / Alpha Technical Review
알파고 풀어보기 / Alpha Technical Review알파고 풀어보기 / Alpha Technical Review
알파고 풀어보기 / Alpha Technical Review
상은 박
 
Step-by-step approach to question answering
Step-by-step approach to question answeringStep-by-step approach to question answering
Step-by-step approach to question answering
NAVER Engineering
 
Finding connections among images using CycleGAN
Finding connections among images using CycleGANFinding connections among images using CycleGAN
Finding connections among images using CycleGAN
NAVER Engineering
 
RLCode와 A3C 쉽고 깊게 이해하기
RLCode와 A3C 쉽고 깊게 이해하기RLCode와 A3C 쉽고 깊게 이해하기
RLCode와 A3C 쉽고 깊게 이해하기
Woong won Lee
 
1시간만에 GAN(Generative Adversarial Network) 완전 정복하기
1시간만에 GAN(Generative Adversarial Network) 완전 정복하기1시간만에 GAN(Generative Adversarial Network) 완전 정복하기
1시간만에 GAN(Generative Adversarial Network) 완전 정복하기
NAVER Engineering
 
[2017 PYCON 튜토리얼]OpenAI Gym을 이용한 강화학습 에이전트 만들기
[2017 PYCON 튜토리얼]OpenAI Gym을 이용한 강화학습 에이전트 만들기[2017 PYCON 튜토리얼]OpenAI Gym을 이용한 강화학습 에이전트 만들기
[2017 PYCON 튜토리얼]OpenAI Gym을 이용한 강화학습 에이전트 만들기
이 의령
 
알파고 (바둑 인공지능)의 작동 원리
알파고 (바둑 인공지능)의 작동 원리알파고 (바둑 인공지능)의 작동 원리
알파고 (바둑 인공지능)의 작동 원리
Shane (Seungwhan) Moon
 
딥러닝과 강화 학습으로 나보다 잘하는 쿠키런 AI 구현하기 DEVIEW 2016
딥러닝과 강화 학습으로 나보다 잘하는 쿠키런 AI 구현하기 DEVIEW 2016딥러닝과 강화 학습으로 나보다 잘하는 쿠키런 AI 구현하기 DEVIEW 2016
딥러닝과 강화 학습으로 나보다 잘하는 쿠키런 AI 구현하기 DEVIEW 2016
Taehoon Kim
 
Let Android dream electric sheep: Making emotion model for chat-bot with Pyth...
Let Android dream electric sheep: Making emotion model for chat-bot with Pyth...Let Android dream electric sheep: Making emotion model for chat-bot with Pyth...
Let Android dream electric sheep: Making emotion model for chat-bot with Pyth...
Jeongkyu Shin
 
알아두면 쓸데있는 신기한 강화학습 NAVER 2017
알아두면 쓸데있는 신기한 강화학습 NAVER 2017알아두면 쓸데있는 신기한 강화학습 NAVER 2017
알아두면 쓸데있는 신기한 강화학습 NAVER 2017
Taehoon Kim
 
what is_tabs_share
what is_tabs_sharewhat is_tabs_share
what is_tabs_share
NAVER D2
 
[132]웨일 브라우저 1년 그리고 미래
[132]웨일 브라우저 1년 그리고 미래[132]웨일 브라우저 1년 그리고 미래
[132]웨일 브라우저 1년 그리고 미래
NAVER D2
 
[142] 생체 이해에 기반한 로봇 – 고성능 로봇에게 인간의 유연함과 안전성 부여하기
[142] 생체 이해에 기반한 로봇 – 고성능 로봇에게 인간의 유연함과 안전성 부여하기[142] 생체 이해에 기반한 로봇 – 고성능 로봇에게 인간의 유연함과 안전성 부여하기
[142] 생체 이해에 기반한 로봇 – 고성능 로봇에게 인간의 유연함과 안전성 부여하기
NAVER D2
 
딥러닝을 활용한 비디오 스토리 질의응답: 뽀로로QA와 심층 임베딩 메모리망
딥러닝을 활용한 비디오 스토리 질의응답: 뽀로로QA와 심층 임베딩 메모리망딥러닝을 활용한 비디오 스토리 질의응답: 뽀로로QA와 심층 임베딩 메모리망
딥러닝을 활용한 비디오 스토리 질의응답: 뽀로로QA와 심층 임베딩 메모리망
NAVER Engineering
 
Introduction of Deep Reinforcement Learning
Introduction of Deep Reinforcement LearningIntroduction of Deep Reinforcement Learning
Introduction of Deep Reinforcement Learning
NAVER Engineering
 
알파고 해부하기 1부
알파고 해부하기 1부알파고 해부하기 1부
알파고 해부하기 1부
Donghun Lee
 
Deep Learning, Where Are You Going?
Deep Learning, Where Are You Going?Deep Learning, Where Are You Going?
Deep Learning, Where Are You Going?
NAVER Engineering
 
Multimodal Sequential Learning for Video QA
Multimodal Sequential Learning for Video QAMultimodal Sequential Learning for Video QA
Multimodal Sequential Learning for Video QA
NAVER Engineering
 
바둑인을 위한 알파고
바둑인을 위한 알파고바둑인을 위한 알파고
바둑인을 위한 알파고
Donghun Lee
 
조음 Goodness-Of-Pronunciation 자질을 이용한 영어 학습자의 조음 오류 진단
조음 Goodness-Of-Pronunciation 자질을 이용한 영어 학습자의 조음 오류 진단조음 Goodness-Of-Pronunciation 자질을 이용한 영어 학습자의 조음 오류 진단
조음 Goodness-Of-Pronunciation 자질을 이용한 영어 학습자의 조음 오류 진단
NAVER Engineering
 
알파고 풀어보기 / Alpha Technical Review
알파고 풀어보기 / Alpha Technical Review알파고 풀어보기 / Alpha Technical Review
알파고 풀어보기 / Alpha Technical Review
상은 박
 
Step-by-step approach to question answering
Step-by-step approach to question answeringStep-by-step approach to question answering
Step-by-step approach to question answering
NAVER Engineering
 
Finding connections among images using CycleGAN
Finding connections among images using CycleGANFinding connections among images using CycleGAN
Finding connections among images using CycleGAN
NAVER Engineering
 
RLCode와 A3C 쉽고 깊게 이해하기
RLCode와 A3C 쉽고 깊게 이해하기RLCode와 A3C 쉽고 깊게 이해하기
RLCode와 A3C 쉽고 깊게 이해하기
Woong won Lee
 
1시간만에 GAN(Generative Adversarial Network) 완전 정복하기
1시간만에 GAN(Generative Adversarial Network) 완전 정복하기1시간만에 GAN(Generative Adversarial Network) 완전 정복하기
1시간만에 GAN(Generative Adversarial Network) 완전 정복하기
NAVER Engineering
 
[2017 PYCON 튜토리얼]OpenAI Gym을 이용한 강화학습 에이전트 만들기
[2017 PYCON 튜토리얼]OpenAI Gym을 이용한 강화학습 에이전트 만들기[2017 PYCON 튜토리얼]OpenAI Gym을 이용한 강화학습 에이전트 만들기
[2017 PYCON 튜토리얼]OpenAI Gym을 이용한 강화학습 에이전트 만들기
이 의령
 
알파고 (바둑 인공지능)의 작동 원리
알파고 (바둑 인공지능)의 작동 원리알파고 (바둑 인공지능)의 작동 원리
알파고 (바둑 인공지능)의 작동 원리
Shane (Seungwhan) Moon
 
딥러닝과 강화 학습으로 나보다 잘하는 쿠키런 AI 구현하기 DEVIEW 2016
딥러닝과 강화 학습으로 나보다 잘하는 쿠키런 AI 구현하기 DEVIEW 2016딥러닝과 강화 학습으로 나보다 잘하는 쿠키런 AI 구현하기 DEVIEW 2016
딥러닝과 강화 학습으로 나보다 잘하는 쿠키런 AI 구현하기 DEVIEW 2016
Taehoon Kim
 
Let Android dream electric sheep: Making emotion model for chat-bot with Pyth...
Let Android dream electric sheep: Making emotion model for chat-bot with Pyth...Let Android dream electric sheep: Making emotion model for chat-bot with Pyth...
Let Android dream electric sheep: Making emotion model for chat-bot with Pyth...
Jeongkyu Shin
 
알아두면 쓸데있는 신기한 강화학습 NAVER 2017
알아두면 쓸데있는 신기한 강화학습 NAVER 2017알아두면 쓸데있는 신기한 강화학습 NAVER 2017
알아두면 쓸데있는 신기한 강화학습 NAVER 2017
Taehoon Kim
 
what is_tabs_share
what is_tabs_sharewhat is_tabs_share
what is_tabs_share
NAVER D2
 
[132]웨일 브라우저 1년 그리고 미래
[132]웨일 브라우저 1년 그리고 미래[132]웨일 브라우저 1년 그리고 미래
[132]웨일 브라우저 1년 그리고 미래
NAVER D2
 
[142] 생체 이해에 기반한 로봇 – 고성능 로봇에게 인간의 유연함과 안전성 부여하기
[142] 생체 이해에 기반한 로봇 – 고성능 로봇에게 인간의 유연함과 안전성 부여하기[142] 생체 이해에 기반한 로봇 – 고성능 로봇에게 인간의 유연함과 안전성 부여하기
[142] 생체 이해에 기반한 로봇 – 고성능 로봇에게 인간의 유연함과 안전성 부여하기
NAVER D2
 
Ad

Similar to Online video object segmentation via convolutional trident network (20)

IRJET- Semantic Segmentation using Deep Learning
IRJET- Semantic Segmentation using Deep LearningIRJET- Semantic Segmentation using Deep Learning
IRJET- Semantic Segmentation using Deep Learning
IRJET Journal
 
Video Object Segmentation - Laura Leal-Taixé - UPC Barcelona 2018
Video Object Segmentation - Laura Leal-Taixé - UPC Barcelona 2018Video Object Segmentation - Laura Leal-Taixé - UPC Barcelona 2018
Video Object Segmentation - Laura Leal-Taixé - UPC Barcelona 2018
Universitat Politècnica de Catalunya
 
IRJET-Multiple Object Detection using Deep Neural Networks
IRJET-Multiple Object Detection using Deep Neural NetworksIRJET-Multiple Object Detection using Deep Neural Networks
IRJET-Multiple Object Detection using Deep Neural Networks
IRJET Journal
 
slide-171212080528.pptx
slide-171212080528.pptxslide-171212080528.pptx
slide-171212080528.pptx
SharanrajK22MMT1003
 
Video Classification Basic
Video Classification Basic Video Classification Basic
Video Classification Basic
Silversparro Technologies
 
物件偵測與辨識技術
物件偵測與辨識技術物件偵測與辨識技術
物件偵測與辨識技術
CHENHuiMei
 
Semantic Video Segmentation with Using Ensemble of Particular Classifiers and...
Semantic Video Segmentation with Using Ensemble of Particular Classifiers and...Semantic Video Segmentation with Using Ensemble of Particular Classifiers and...
Semantic Video Segmentation with Using Ensemble of Particular Classifiers and...
ITIIIndustries
 
Real Time Object Dectection using machine learning
Real Time Object Dectection using machine learningReal Time Object Dectection using machine learning
Real Time Object Dectection using machine learning
pratik pratyay
 
Object detection - RCNNs vs Retinanet
Object detection - RCNNs vs RetinanetObject detection - RCNNs vs Retinanet
Object detection - RCNNs vs Retinanet
Rishabh Indoria
 
Deep Video Object Segmentation - Xavier Giro - UPC Barcelona 2019
Deep Video Object Segmentation - Xavier Giro - UPC Barcelona 2019Deep Video Object Segmentation - Xavier Giro - UPC Barcelona 2019
Deep Video Object Segmentation - Xavier Giro - UPC Barcelona 2019
Universitat Politècnica de Catalunya
 
Image Segmentation (D3L1 2017 UPC Deep Learning for Computer Vision)
Image Segmentation (D3L1 2017 UPC Deep Learning for Computer Vision)Image Segmentation (D3L1 2017 UPC Deep Learning for Computer Vision)
Image Segmentation (D3L1 2017 UPC Deep Learning for Computer Vision)
Universitat Politècnica de Catalunya
 
Cvpr 2017 Summary Meetup
Cvpr 2017 Summary MeetupCvpr 2017 Summary Meetup
Cvpr 2017 Summary Meetup
Amir Alush
 
Brodmann17 CVPR 2017 review - meetup slides
Brodmann17 CVPR 2017 review - meetup slides Brodmann17 CVPR 2017 review - meetup slides
Brodmann17 CVPR 2017 review - meetup slides
Brodmann17
 
Object Segmentation (D2L7 Insight@DCU Machine Learning Workshop 2017)
Object Segmentation (D2L7 Insight@DCU Machine Learning Workshop 2017)Object Segmentation (D2L7 Insight@DCU Machine Learning Workshop 2017)
Object Segmentation (D2L7 Insight@DCU Machine Learning Workshop 2017)
Universitat Politècnica de Catalunya
 
proposal_pura
proposal_puraproposal_pura
proposal_pura
Erick Lin
 
FASSOLD Deep learning for semantic analysis and annotation of conventional an...
FASSOLD Deep learning for semantic analysis and annotation of conventional an...FASSOLD Deep learning for semantic analysis and annotation of conventional an...
FASSOLD Deep learning for semantic analysis and annotation of conventional an...
FIAT/IFTA
 
Deep Learning for Computer Vision: Video Analytics (UPC 2016)
Deep Learning for Computer Vision: Video Analytics (UPC 2016)Deep Learning for Computer Vision: Video Analytics (UPC 2016)
Deep Learning for Computer Vision: Video Analytics (UPC 2016)
Online video object segmentation via convolutional trident network

  • 1. Online Video Object Segmentation via Convolutional Trident Network Seminar at Naver Won-Dong Jang Korea University 2017-08-22
  • 3. Video object segmentation • Clustering pixels in videos into objects or background • Unsupervised • Supervised • Semi-supervised [Figure: segment track]
  • 4. Video object segmentation • Unsupervised video object segmentation • Discover and segment a primary object in a video • Without user annotations [Figure: input videos and ground-truth segmentation labels]
  • 5. Video object segmentation • Unsupervised video object segmentation • Saliency detection-based approach • Visual saliency • Motion saliency [Figure: examples of saliency detection]
  • 6. Video object segmentation • Supervised video object segmentation • User can annotate mislabeled pixels in any frame • Interaction between an algorithm and a user [Figure: annotation at the first frame, segmentation results, additional user annotation, and segmentation results after adding the annotation]
  • 7. Video object segmentation • Semi-supervised video object segmentation • Discover and segment a primary object in a video • With user annotations in the first frame [Figure: annotated target object in the first frame and segmentation results]
  • 8. Related works • Object proposal-based algorithm • Generate object proposals in each frame • Select object proposals that are similar to the target object • [2015][ICCV][Perazzi] Fully Connected Object Proposals for Video Segmentation [Figure: generated object proposals and segment track]
  • 9. Related works • Superpixel-based algorithm • Over-segment each frame into superpixels • Trace the target object based on the inter-frame matching • [2015][CVPR][Wen] Joint Online Tracking and Segmentation [Figure: inter-frame matching and segment track]
  • 11. In this presentation • A semi-supervised online segmentation algorithm • Online method • Offline techniques require a huge memory space for a long video • Deep learning-based approach • Connection between a convolutional neural network and an MRF optimization strategy • Remarkable performance on the DAVIS benchmark dataset
  • 12. Overview • Framework • Inter-frame propagation • Inference via convolutional trident network (CTN) • Yields three tailored probability maps for the MRF optimization • MRF optimization
  • 13. Inter-frame propagation • Propagation from the previous frame • To roughly locate the target object in the current frame • Using backward optical flow • From 𝑡 to 𝑡 − 1 [Figure: segmentation label map for frame 𝑡 − 1, backward motion (optical flow), and propagation map for frame 𝑡]
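As a rough illustration of the propagation step on slide 13, the sketch below warps the frame 𝑡 − 1 label map into frame 𝑡 by following each backward flow vector. The function name, the nested-list representation, and the nearest-neighbour rounding are assumptions made for this example, not details taken from the slides.

```python
def propagate_labels(prev_labels, backward_flow):
    """Warp the frame t-1 label map to frame t using backward optical flow.

    prev_labels: H x W nested list of 0/1 labels for frame t-1.
    backward_flow: H x W nested list of (dx, dy) vectors, each pointing from
        a pixel in frame t to its matching position in frame t-1.
    """
    height, width = len(prev_labels), len(prev_labels[0])
    propagated = [[0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            dx, dy = backward_flow[y][x]
            # Nearest-neighbour lookup (an assumption); matches that fall
            # outside the frame are left as background.
            src_x, src_y = round(x + dx), round(y + dy)
            if 0 <= src_x < width and 0 <= src_y < height:
                propagated[y][x] = prev_labels[src_y][src_x]
    return propagated


# Toy example: the object sits at column 0 in frame t-1 and every pixel in
# frame t matches the pixel one step to its left, so the object moves right.
prev_labels = [[0, 0, 0],
               [1, 0, 0],
               [0, 0, 0]]
backward_flow = [[(-1, 0)] * 3 for _ in range(3)]
propagation_map = propagate_labels(prev_labels, backward_flow)
```

Because the flow is only approximate, the resulting map locates the object roughly, which is why the later network stage refines it.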
  • 14. Convolutional trident network • Infer segmentation information • The propagation map may be inaccurate • The inferred information is effectively used to solve a binary labeling problem
  • 15. Network architecture • Encoder-decoder architecture • Single encoder • VGG encoder • Three decoders • Separative decoder • Definite foreground decoder • Definite background decoder
  • 17. Network architecture • VGG encoder • Trained for image classification • Using the ImageNet dataset (22K categories, 14M images)
  • 18. Network architecture • VGG encoder • Trained for image classification • Using the ImageNet dataset [Figure: convolutional layers followed by fully connected layers, with example classes fox, cat, lion, dog, zebra, umbrella, tank, lamp, desk, orange, kite, …]
  • 19. Network architecture • Separative decoder (SD) • Separate a target object from the background • Using a down-sampled foreground propagation patch [Figure: VGG encoder (Conv1_1 to Conv5_3 with four pooling layers) and SD decoder (SD-Dec1 to SD-Dec5 with four unpooling layers, then SD-Pred) with skip connections; inputs are the image patch and the foreground propagation patch, output is the separative probability patch]
  • 20. Network architecture • Definite foreground and background decoders • Definite pixels are locations that should unambiguously be labeled as the foreground or the background • Fixing the labels of definite pixels improves labeling accuracy • Inspired by the image matting problem [Figure: input image, tri-map, and matting result]
  • 21. Network architecture • Definite foreground decoder (DFD) • Identifies definite foreground pixels [Figure: VGG encoder (Conv1_1 to Conv5_3 with four pooling layers) and DFD decoder (DFD-Dec1 to DFD-Dec5 with four unpooling layers, then DFD-Pred); inputs are the image patch and the foreground propagation patch, output is the definite foreground probability patch]
  • 22. Network architecture • Definite background decoder (DBD) • Finds definite background pixels [Figure: VGG encoder (Conv1_1 to Conv5_3 with four pooling layers) and DBD decoder (DBD-Dec1 to DBD-Dec5 with four unpooling layers, then DBD-Pred); inputs are the image patch and the background propagation patch, output is the definite background probability patch]
  • 23. Network architecture • Implementation issues in decoders • Prediction layer • Sigmoid layer is used to yield normalized outputs within [0, 1] • Rectified linear unit (ReLU) + Batch normalization • Kernel size • 3 × 3 kernels are used in the prediction layers • 5 × 5 kernels are used in the other convolution layers
  • 24. Training phase • Lack of video object segmentation datasets • Several datasets exist for video object segmentation • However, each of them contains only a small number of videos (from 12 to 59) • Instead, we use the PASCAL VOC 2012 dataset • Object segmentation • 26,844 object masks • 11,355 images
  • 25. Training phase • Preprocessing of training data • Ground-truth masks for the DFD and DBD are not available • Hence, we generate them through simple image processing
  • 26. Training phase • Preprocessing of training data • Degrade the object mask to imitate propagation errors • By performing suppression and noise addition • Synthesize the ground-truth masks for the DFD and DBD • By applying erosion and dilation
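A minimal sketch of how erosion and dilation could yield the two synthetic ground-truth masks mentioned on slide 26. It assumes, since the slides do not spell this out, that the definite foreground is the eroded object mask and the definite background is the complement of the dilated object mask. The 3 × 3 structuring element and all function names are hypothetical.

```python
# 3x3 structuring element (an assumption; the slides do not give its size).
OFFSETS = [(dy, dx) for dy in (-1, 0, 1) for dx in (-1, 0, 1)]


def _neighbourhood(mask, y, x):
    """Yield the 3x3 neighbourhood of (y, x); out-of-frame pixels count as 0."""
    height, width = len(mask), len(mask[0])
    for dy, dx in OFFSETS:
        ny, nx = y + dy, x + dx
        yield mask[ny][nx] if 0 <= ny < height and 0 <= nx < width else 0


def erode(mask):
    """Binary erosion: keep a pixel only if its whole neighbourhood is 1."""
    return [[int(all(_neighbourhood(mask, y, x))) for x in range(len(mask[0]))]
            for y in range(len(mask))]


def dilate(mask):
    """Binary dilation: set a pixel if any pixel in its neighbourhood is 1."""
    return [[int(any(_neighbourhood(mask, y, x))) for x in range(len(mask[0]))]
            for y in range(len(mask))]


def synthesize_definite_masks(mask):
    """Return (definite foreground, definite background) training targets."""
    definite_fg = erode(mask)
    definite_bg = [[1 - value for value in row] for row in dilate(mask)]
    return definite_fg, definite_bg


# Toy example: a 3x3 object centred in a 7x7 frame.
object_mask = [[1 if 2 <= y <= 4 and 2 <= x <= 4 else 0 for x in range(7)]
               for y in range(7)]
definite_fg, definite_bg = synthesize_definite_masks(object_mask)
```

Pixels that fall in neither mask form an uncertain band around the object boundary, similar to the unknown region of a matting tri-map.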
  • 27. Training phase • Implementation issues • Caffe library • Cross-entropy losses $-\frac{1}{N} \sum_{n=1}^{N} \left[ p_n \log \hat{p}_n + (1 - p_n) \log (1 - \hat{p}_n) \right]$ • Minibatch • Eight training samples per minibatch • Learning rate • 1e-3 for the first 55 epochs • 1e-4 for the next 35 epochs • Stochastic gradient descent
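The cross-entropy loss on slide 27 can be sketched in a few lines of plain Python. This is only an illustration of the formula, not the Caffe loss layer used in the talk; the function name and the clipping constant `eps` are assumptions added to avoid taking the logarithm of zero.

```python
import math


def binary_cross_entropy(targets, predictions, eps=1e-7):
    """Mean binary cross-entropy between target and predicted probabilities.

    targets: ground-truth values p_n in [0, 1].
    predictions: predicted probabilities p_hat_n in [0, 1].
    eps: clipping constant (an assumption) that keeps log() finite.
    """
    total = 0.0
    for p, p_hat in zip(targets, predictions):
        p_hat = min(max(p_hat, eps), 1.0 - eps)
        total += p * math.log(p_hat) + (1.0 - p) * math.log(1.0 - p_hat)
    return -total / len(targets)


# A confident, correct prediction gives a lower loss than a hesitant one.
confident_loss = binary_cross_entropy([1, 0], [0.9, 0.1])
hesitant_loss = binary_cross_entropy([1, 0], [0.6, 0.4])
```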
  • 28. Inference phase • Set input data • By cropping the frame and propagation map • CTN outputs three probability patches • Separative probability map 𝑅S • Definite foreground probability map 𝑅F • Definite background probability map 𝑅B [Figure: pipeline at frame 𝑡. Inter-frame propagation combines frame 𝑡, the segmentation label at frame 𝑡 − 1, and the optical flow into the propagation map at frame 𝑡. Inference via the convolutional encoder-decoder network then takes the image patch and the foreground and background propagation patches, and the separative, definite foreground, and definite background decoders output the separative, definite foreground, and definite background probability patches]
  • 29. Inference phase • Classification • Separative probability map 𝑅S • If 𝑅S(𝐩) > 𝜃sep, pixel 𝐩 is classified as the foreground • Let ℒ be the coordinate set of such foreground pixels • Definite foreground probability map 𝑅F • If 𝑅F(𝐩) > 𝜃def, pixel 𝐩 is classified as the definite foreground • ℱ denotes the set of the definite foreground pixels • Definite background probability map 𝑅B • If 𝑅B(𝐩) > 𝜃def, pixel 𝐩 is classified as the definite background • ℬ indicates the set of the definite background pixels
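The thresholding rules on slide 29 can be sketched as follows. The slides do not state the threshold values, so the ones in the example are placeholders; the function name and the set-of-coordinates representation are likewise assumptions.

```python
def classify_pixels(r_sep, r_fg, r_bg, theta_sep, theta_def):
    """Threshold the three probability maps into pixel coordinate sets.

    Returns (foreground set L, definite foreground set F,
    definite background set B), each holding (row, column) tuples.
    """
    foreground, definite_fg, definite_bg = set(), set(), set()
    for y in range(len(r_sep)):
        for x in range(len(r_sep[0])):
            if r_sep[y][x] > theta_sep:
                foreground.add((y, x))
            if r_fg[y][x] > theta_def:
                definite_fg.add((y, x))
            if r_bg[y][x] > theta_def:
                definite_bg.add((y, x))
    return foreground, definite_fg, definite_bg


# Toy 1x3 probability maps with placeholder thresholds (not from the slides).
r_sep = [[0.9, 0.6, 0.1]]
r_fg = [[0.95, 0.3, 0.0]]
r_bg = [[0.0, 0.2, 0.97]]
fg_set, def_fg_set, def_bg_set = classify_pixels(
    r_sep, r_fg, r_bg, theta_sep=0.5, theta_def=0.9)
```

In this toy case the middle pixel is foreground according to the separative map but belongs to neither definite set, so its final label is left to the MRF optimization.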
  • 30. MRF optimization • Solve two-class MRF optimization problem • To improve the segmentation quality further • Define a graph 𝐺 = (𝑁, 𝐸) • Nodes are pixels in the current frame • Each node is connected to its four neighbors by edges
  • 31. MRF optimization • MRF energy function $\mathcal{E}(S) = \sum_{\mathbf{p} \in N} \mathcal{D}(\mathbf{p}, S) + \gamma \sum_{(\mathbf{p}, \mathbf{q}) \in E} \mathcal{Z}(\mathbf{p}, \mathbf{q}, S)$ • Unary cost $\mathcal{D}$: returns extremely high costs on definite foreground (DF) and definite background (DB) pixels when they have background and foreground labels, respectively • Pairwise cost $\mathcal{Z}$: encourages neighboring pixels to have the same label [Figure: image patch with the separative, definite foreground, and definite background probability patches]
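  • 32. MRF optimization • Unary cost computation • Build the RGB color Gaussian mixture models (GMMs) of the foreground and the background, respectively • $K = 10$ for both GMMs • Use pixels in $\mathcal{L}$ to construct the foreground GMMs • Use pixels in $\mathcal{L}^c$ to construct the background GMMs • Gaussian cost $\psi(\mathbf{p}, s) = \min_k \left( -\log f(\mathbf{p}; \mathcal{M}_{s,k}) \right)$ • Unary cost $\mathcal{D}(\mathbf{p}, S) = \begin{cases} \infty & \text{if } \mathbf{p} \in \mathcal{F} \text{ and } S(\mathbf{p}) = 0 \\ \infty & \text{if } \mathbf{p} \in \mathcal{B} \text{ and } S(\mathbf{p}) = 1 \\ \psi(\mathbf{p}, S(\mathbf{p})) & \text{otherwise} \end{cases}$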
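  • 33. MRF optimization • Pairwise cost computation • Pairwise cost $\mathcal{Z}(\mathbf{p}, \mathbf{q}, S) = \begin{cases} \exp(-d(\mathbf{p}, \mathbf{q})) & \text{if } S(\mathbf{p}) \neq S(\mathbf{q}) \\ 0 & \text{otherwise} \end{cases}$ • Graph-cut optimization of $\mathcal{E}(S) = \sum_{\mathbf{p} \in N} \mathcal{D}(\mathbf{p}, S) + \gamma \sum_{(\mathbf{p}, \mathbf{q}) \in E} \mathcal{Z}(\mathbf{p}, \mathbf{q}, S)$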
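To make the energy on slides 31 to 33 concrete, the sketch below evaluates it for one candidate labeling on a 4-connected grid. It does not perform the graph-cut minimization itself. Labels use 0 for background and 1 for foreground. The precomputed Gaussian cost table `psi`, the caller-supplied `distance` function (the slides do not define 𝑑), and all function names are assumptions for illustration.

```python
import math

INF = float("inf")


def unary_cost(p, label, psi, definite_fg, definite_bg):
    """Infinite cost when a definite pixel gets the opposite label,
    otherwise the Gaussian cost psi[p][label]."""
    if p in definite_fg and label == 0:
        return INF
    if p in definite_bg and label == 1:
        return INF
    return psi[p][label]


def mrf_energy(labels, psi, definite_fg, definite_bg, distance, gamma):
    """Sum of unary costs plus gamma-weighted pairwise costs.

    labels: H x W nested list of 0/1 labels (the labeling S).
    psi: dict mapping (row, column) to (background cost, foreground cost).
    distance: hypothetical callable d(p, q) between neighbouring pixels.
    """
    height, width = len(labels), len(labels[0])
    energy = 0.0
    for y in range(height):
        for x in range(width):
            p = (y, x)
            energy += unary_cost(p, labels[y][x], psi, definite_fg, definite_bg)
            # Visit only the right and lower neighbours so that every
            # 4-neighbour edge is counted exactly once.
            for qy, qx in ((y, x + 1), (y + 1, x)):
                if qy < height and qx < width and labels[y][x] != labels[qy][qx]:
                    energy += gamma * math.exp(-distance(p, (qy, qx)))
    return energy


# Toy 1x2 grid with made-up costs and a constant distance of zero.
psi = {(0, 0): (1.0, 2.0), (0, 1): (3.0, 0.5)}
split_energy = mrf_energy([[0, 1]], psi, set(), set(), lambda p, q: 0.0, 2.0)
smooth_energy = mrf_energy([[0, 0]], psi, set(), set(), lambda p, q: 0.0, 2.0)
```

A graph-cut solver would search for the labeling with the lowest such energy; the infinite unary terms keep the definite pixels fixed during that search.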
  • 34. Reappearing object detection • Identification of reappearing parts • A target object may disappear or be occluded by other objects • Use backward-forward motion consistency
  • 36. Experimental results • The DAVIS benchmark dataset • 50 videos • 854 × 480 resolution • Number of frames • From 25 to 104 • Difficulties • Fast motion • Occlusion • Object deformation
  • 37. Experimental results • Performance measures • Region similarity • Jaccard index • Contour accuracy • F-measure • Statistics • Mean • Recall • Decay
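For reference, the region similarity measure on slide 37, the Jaccard index, is the intersection over union of the predicted and ground-truth masks. The sketch below is a plain-Python illustration; the function name and the convention of returning 1.0 when both masks are empty are assumptions.

```python
def jaccard_index(predicted, ground_truth):
    """Intersection over union of two binary masks given as nested lists."""
    intersection = 0
    union = 0
    for pred_row, gt_row in zip(predicted, ground_truth):
        for pred, gt in zip(pred_row, gt_row):
            intersection += 1 if (pred and gt) else 0
            union += 1 if (pred or gt) else 0
    # Two empty masks are treated as a perfect match (an assumption).
    return intersection / union if union else 1.0


# One overlapping pixel out of three labeled pixels in total.
score = jaccard_index([[1, 1], [0, 0]], [[1, 0], [1, 0]])
```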
  • 40. Experimental results • The SegTrack dataset • 5 videos • Performance comparison • Jaccard index
  • 41. Experimental results • Ablation studies • Effectiveness of each decoder • Efficacy of MRF optimization
  • 42. Experimental results • Running time analysis • Prop-Q • Uses the state-of-the-art optical flow technique • Prop-F • Adopts a much faster optical flow technique • Parameter selection
  • 43. Conclusions • A semi-supervised online video object segmentation algorithm is introduced in this presentation • Deep learning-based semi-supervised video object segmentation algorithm • Tailored network for MRF optimization • Remarkable performance on the DAVIS dataset • Q&A • Thank you