[DL輪読会] Neural Radiance Flow for 4D View Synthesis and Video Processing (NeRFlow) - Deep Learning JP
Neural Radiance Flow (NeRFlow) is a method that extends Neural Radiance Fields (NeRF) to model dynamic scenes from video data. NeRFlow simultaneously learns two fields: a radiance field that reconstructs images as in NeRF, and a flow field that models how points in space move over time, supervised with optical flow. This allows it to synthesize novel views at unseen time points. The model is trained end-to-end by minimizing a color-reconstruction loss from volume rendering together with an optical-flow reconstruction loss. However, the method requires training a separate model for each scene and does not generalize to unseen scenes.
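As a hedged sketch of that training objective (the notation and the weight λ are assumptions, not the paper's):

\mathcal{L} = \mathcal{L}_{\mathrm{RGB}} + \lambda \, \mathcal{L}_{\mathrm{flow}}

where \mathcal{L}_{\mathrm{RGB}} compares volume-rendered pixels against the video frames and \mathcal{L}_{\mathrm{flow}} compares the flow field's predicted motion against estimated optical flow.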
This talk introduces three recent computer-vision papers that each remove a long-standing standard component:
・Beyond ResNets: RepVGG: Making VGG-style ConvNets Great Again
・Beyond BatchNorm: High-Performance Large-Scale Image Recognition Without Normalization
・Beyond attention: LambdaNetworks: Modeling Long-Range Interactions Without Attention
This presentation covers detecting helmet impacts in NFL games from videos and player tracking data. A two-stage pipeline performs helmet detection followed by classification of each detection as impact or non-impact. Post-processing applies temporal non-maximum suppression, using the tracking results to reduce false positives; see the sketch below. Multiple models are ensembled and thresholds are tuned on a validation set for best performance.
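A hedged sketch of the temporal NMS step (the detection format, the per-player grouping, and the window size are assumptions about the pipeline, not the authors' code):

```python
from typing import List, Tuple

# A detection is (frame_index, player_id, confidence), with player_id
# taken from the tracking data. This format is an assumption.
Detection = Tuple[int, str, float]

def temporal_nms(dets: List[Detection], window: int = 10) -> List[Detection]:
    """Keep only the highest-scoring impact per player within `window` frames."""
    kept: List[Detection] = []
    # Greedily accept detections from highest to lowest confidence.
    for frame, player, score in sorted(dets, key=lambda d: -d[2]):
        if all(p != player or abs(f - frame) > window for f, p, _ in kept):
            kept.append((frame, player, score))
    return sorted(kept)

# Two nearby detections of the same player's helmet collapse to one.
dets = [(100, "H23", 0.9), (104, "H23", 0.6), (300, "V10", 0.8)]
print(temporal_nms(dets))  # [(100, 'H23', 0.9), (300, 'V10', 0.8)]
```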
3D Perception for Autonomous Driving - Datasets and Algorithms - Kazuyuki Miyazawa
This document summarizes several 3D perception datasets and algorithms for autonomous driving. It begins with a brief introduction of the presenter, Kazuyuki Miyazawa of Mobility Technologies Co., Ltd., then covers popular datasets such as KITTI, ApolloScape, nuScenes, and the Waymo Open Dataset, describing their sensor setups, data formats, and licenses. It also summarizes seminal 3D object detection algorithms such as PointNet, VoxelNet, and SECOND that take point cloud data as input.
15. Objective Function
target image It
set of source images Is ∈ IS (in the implementation, It-1 and It+1)
estimated depth D̂t
synthesized target image Ît
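A sketch of the objective, assuming the Monodepth2/PackNet-style formulation the following slides describe (the weight λ and the per-pixel mask Mp are assumptions):

\mathcal{L}(I_t, \hat{I}_t) = \mathcal{L}_p(I_t, \hat{I}_t) \odot \mathcal{M}_p + \lambda \, \mathcal{L}_{\mathrm{smooth}}(\hat{D}_t)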
16. Appearance Matching Loss
The error incurred when the source image is warped, using the estimated depth, to align with the target image: a weighted sum of the SSIM and L1 losses between the warped image and the target image.
To mitigate the effect of occlusion, the per-pixel minimum of the losses computed for each source image is used.
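A sketch of this loss, assuming the standard SSIM+L1 mixture (the weight α is an assumption; Monodepth2 uses α = 0.85), with the per-pixel minimum taken over the warped source images Î_{s→t}:

\mathcal{L}_p(I_t, \hat{I}_{s \to t}) = \alpha \, \frac{1 - \mathrm{SSIM}(I_t, \hat{I}_{s \to t})}{2} + (1 - \alpha) \, \lVert I_t - \hat{I}_{s \to t} \rVert_1

\mathcal{L}_p(I_t) = \min_{s} \, \mathcal{L}_p(I_t, \hat{I}_{s \to t})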
17. Appearance Matching Loss
Two masks are applied:
A mask that excludes regions outside the area covered by the warp.
A mask that excludes regions where warping actually increases the error (to remove static scenes and objects moving at the same speed as the camera).
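A sketch of the second (auto-)mask, assuming the Monodepth2 formulation ([·] is the Iverson bracket): a pixel is kept only if the warped source matches the target better than the unwarped source does.

\mathcal{M}_p = \Big[ \min_{s} \mathcal{L}_p(I_t, \hat{I}_{s \to t}) < \min_{s} \mathcal{L}_p(I_t, I_s) \Big]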
18. Depth Smoothness Loss
A loss that encourages the estimated depth to be smooth in low-texture regions: the penalty grows when the depth gradient is large where the pixel (image) gradient is small.
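A sketch, assuming the standard edge-aware smoothness term (δx and δy denote horizontal and vertical gradients):

\mathcal{L}_{\mathrm{smooth}}(\hat{D}_t) = \lvert \delta_x \hat{D}_t \rvert \, e^{-\lvert \delta_x I_t \rvert} + \lvert \delta_y \hat{D}_t \rvert \, e^{-\lvert \delta_y I_t \rvert}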
22. Packing
Tensor shapes through the packing block:
Ci × H × W → 4Ci × H/2 × W/2 → D × 4Ci × H/2 × W/2 → 4DCi × H/2 × W/2 → Co × H/2 × W/2
■ Avoids the spatial information loss caused by pooling
■ Folds the spatial dimensions into channels (space-to-depth), then applies a Conv3D
■ Applying the operations in reverse order yields unpacking (see the sketch below)
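A minimal PyTorch sketch of a packing block that follows these shape transitions; the kernel sizes, the depth D, and the use of pixel_unshuffle for the space-to-channel step are assumptions, not the paper's exact settings:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class PackingBlock(nn.Module):
    """Packing: space-to-depth, a 3D conv, then a 2D projection."""
    def __init__(self, in_channels: int, out_channels: int, d: int = 8):
        super().__init__()
        self.conv3d = nn.Conv3d(1, d, kernel_size=3, padding=1)
        self.conv2d = nn.Conv2d(4 * in_channels * d, out_channels,
                                kernel_size=3, padding=1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        x = F.pixel_unshuffle(x, 2)    # Ci x H x W -> 4Ci x H/2 x W/2
        x = x.unsqueeze(1)             # add a unit depth axis for Conv3d
        x = self.conv3d(x)             # -> D x 4Ci x H/2 x W/2
        b, d, c, h, w = x.shape
        x = x.reshape(b, d * c, h, w)  # -> 4DCi x H/2 x W/2
        return self.conv2d(x)          # -> Co x H/2 x W/2

# Example: a 32-channel 64x64 feature map downsamples to 64 x 32 x 32.
y = PackingBlock(32, 64)(torch.randn(1, 32, 64, 64))
print(y.shape)  # torch.Size([1, 64, 32, 32])
```

Unpacking would invert this ordering: a 2D conv, a 3D conv, then depth-to-space (F.pixel_shuffle).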