[DL Reading Group] IDA-3D: Instance-Depth-Aware 3D Object Detection From Stereo Vision for Autonomous Driving


November 12, 20

#deep learning #Stereo Vision #3D Object Detection #Instance-Depth-Aware #Autonomous Driving #Computer Vision

Slide overview

2020/08/07
Deep Learning JP:
http://deeplearning.jp/seminar-2/

Deep Learning JP

@DeepLearning2023


Text of each slide

IDA-3D: Instance-Depth-Aware 3D Object Detection From Stereo Vision for Autonomous Driving
Sophia University, Yamanaka Lab, B4, Hiroaki Sugisaki (杉崎弘明)

Slides
● https://docs.google.com/presentation/d/1eFXDwm2sFQ48p7qkKAG49Ef4jdnuiFEW-nGWJGXrnZ0/edit?usp=sharing

Bibliographic information
● Title
○ IDA-3D: Instance-Depth-Aware 3D Object Detection From Stereo Vision for Autonomous Driving
● Authors
○ Wanli Peng, Hao Pan, He Liu, Yi Sun
● CVPR 2020
● Link
○ https://openaccess.thecvf.com/content_CVPR_2020/html/Peng_IDA-3D_Instance-Depth-Aware_3D_Object_Detection_From_Stereo_Vision_for_Autonomous_CVPR_2020_paper.html

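Overview
● 3D object detection task
○ Estimate 3D bounding boxes
○ Setting of this paper
■ Depth maps from costly LiDAR are not used, even during training
■ Stereo images as input
● Contributions of this paper
○ SOTA under the setting above
○ Achieves accuracy close to that of models trained with depth maps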

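Related work
● Stereo R-CNN
○ Uses a Region Proposal Network (RPN) to discard information outside the area around each object (Region of Interest; RoI).
○ Infers information such as keypoints in contact with the ground and converts it into a 3D BBox.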

Proposed method
● An end-to-end model, based on Stereo R-CNN, that directly estimates the center coordinates of the 3D BBox

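Proposed method
● Whereas the output layer of Stereo R-CNN outputs keypoint information and the like, the proposed method outputs the quantities other than depth:
○ BBox in the input image
○ Vehicle orientation (θ)
○ height, width, length of the 3D BBox
○ u, v: center-point coordinates in the input image
● The RPN output is branched, and a module that infers the depth of the 3D BBox center is added.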

Instance Depth Aware (IDA) Module
● A cost volume is built from the feature maps cropped by the RPN
○ shape (depth level, width, height, feature size)
● A 3D CNN outputs a probability distribution over the depth levels, and its expected value is used as the prediction
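The "expected value over depth levels" step can be illustrated with a small sketch. It assumes the 3D CNN yields one raw score per depth level and that a softmax turns those scores into probabilities; the function name `expected_depth` and the example levels are hypothetical and not taken from the paper.

```python
import math


def expected_depth(scores, depth_levels):
    """Softmax over per-level scores, then the probability-weighted mean depth.

    `scores` and `depth_levels` are hypothetical inputs: one raw score and
    one candidate depth (in meters) per depth level.
    """
    if len(scores) != len(depth_levels):
        raise ValueError("scores and depth_levels must have the same length")
    peak = max(scores)  # subtract the maximum for numerical stability
    weights = [math.exp(s - peak) for s in scores]
    total = sum(weights)
    return sum(w / total * z for w, z in zip(weights, depth_levels))


# Illustrative depth levels in meters (assumed values).
levels = [10.0, 20.0, 30.0, 40.0]
flat = expected_depth([0.0, 0.0, 0.0, 0.0], levels)     # equal scores
peaked = expected_depth([0.0, 50.0, 0.0, 0.0], levels)  # one dominant level
print(flat, peaked)
```

Because the prediction is an expectation rather than an argmax, it can fall between two depth levels and stays differentiable with respect to the scores.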

Nonuniform Depth
● Disparity is larger the closer an object is
● If depth is obtained by following the disparity spacing, the error for distant objects becomes large.
● Transforming the depth levels as in Eq. (3) suppresses this error.
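Why uniform disparity steps hurt distant objects follows from the standard stereo relation depth = focal length × baseline / disparity. The sketch below only illustrates that relation; it does not reproduce Eq. (3) of the paper, and the focal length, baseline, and disparity values are illustrative assumptions.

```python
def disparity_to_depth(disparity_px, focal_px, baseline_m):
    """Standard rectified-stereo relation: z = f * b / d."""
    if disparity_px <= 0:
        raise ValueError("disparity must be positive")
    return focal_px * baseline_m / disparity_px


def depth_gaps(disparities, focal_px, baseline_m):
    """Depth difference between consecutive disparity levels."""
    depths = [disparity_to_depth(d, focal_px, baseline_m) for d in disparities]
    return [abs(far - near) for near, far in zip(depths, depths[1:])]


FOCAL_PX = 720.0    # assumed focal length in pixels
BASELINE_M = 0.54   # assumed stereo baseline in meters

# Disparity levels spaced uniformly (10 px apart), from near to far.
disparities = [40.0, 30.0, 20.0, 10.0]
gaps = depth_gaps(disparities, FOCAL_PX, BASELINE_M)
print(gaps)  # the gap between neighbouring depth levels grows with distance
```

The same 10 px step covers a few meters near the camera and tens of meters far away, which is the motivation for choosing the depth levels nonuniformly.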


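Depth Adaption
● Specifying a rough depth range from the size of the RoI cropped by the RPN improves accuracy.
○ The concrete calculation method is not stated
○ (Personal impression)
■ Narrowing the depth range makes the usable granularity finer, so the gain is unsurprising
■ Possibly feasible because the targets are cars, which are all roughly the same size?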
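Since the calculation is not stated, the following is only one plausible reading, not the paper's method. It assumes a pinhole camera and a range of real-world car heights, so that an RoI of height h pixels bounds the depth by z ≈ f × H / h. The function names, the height bounds, and the focal length are all hypothetical.

```python
def rough_depth_range(roi_height_px, focal_px,
                      min_height_m=1.2, max_height_m=2.0):
    """Depth interval implied by an RoI height under a pinhole model.

    The height bounds are assumed values for typical cars.
    """
    if roi_height_px <= 0:
        raise ValueError("RoI height must be positive")
    near = focal_px * min_height_m / roi_height_px
    far = focal_px * max_height_m / roi_height_px
    return near, far


def depth_levels_in_range(z_min, z_max, num_levels):
    """Evenly spaced candidate depths restricted to [z_min, z_max]."""
    if num_levels < 2:
        raise ValueError("need at least two levels")
    step = (z_max - z_min) / (num_levels - 1)
    return [z_min + i * step for i in range(num_levels)]


FOCAL_PX = 720.0  # assumed focal length in pixels

near, far = rough_depth_range(roi_height_px=60.0, focal_px=FOCAL_PX)
levels = depth_levels_in_range(near, far, num_levels=8)
print(near, far, levels)
```

With the same number of levels spread over a narrower interval, the spacing between levels shrinks, which matches the presenter's remark about finer granularity.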


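Experiments
● Uses the KITTI dataset
● In the figure below
○ M: models that infer from monocular images
○ S: models that infer from stereo images (same assumptions as this experiment)
○ AP_bev: Average Precision computed after converting to 2D BBoxes in bird's-eye view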
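The bird's-eye-view conversion behind AP_bev can be sketched as projecting each 3D BBox onto the ground plane. The parametrization below (center on the x-z plane, width, length, and a counter-clockwise yaw) is an assumption for illustration and may differ from the benchmark's own convention; the function names are hypothetical.

```python
import math


def bev_footprint(cx, cz, width, length, yaw):
    """Four ground-plane corners (x, z) of a 3D box seen from above.

    Assumes the box length runs along its local x axis and that `yaw`
    is a counter-clockwise rotation in the (x, z) plane.
    """
    c, s = math.cos(yaw), math.sin(yaw)
    half_l, half_w = length / 2.0, width / 2.0
    local = [(half_l, half_w), (half_l, -half_w),
             (-half_l, -half_w), (-half_l, half_w)]
    return [(cx + c * dx - s * dz, cz + s * dx + c * dz) for dx, dz in local]


def polygon_area(points):
    """Shoelace formula for a simple polygon given as (x, z) pairs."""
    twice_area = 0.0
    for (x1, z1), (x2, z2) in zip(points, points[1:] + points[:1]):
        twice_area += x1 * z2 - x2 * z1
    return abs(twice_area) / 2.0


corners = bev_footprint(cx=2.0, cz=15.0, width=1.6, length=4.0, yaw=0.3)
print(corners, polygon_area(corners))
```

Matching predicted and ground-truth footprints by their overlap, and then computing Average Precision, would build on these rotated rectangles; that matching step is omitted here.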


Experiments
● Comparison with a model trained using LiDAR depth maps (Pseudo-LiDAR)


Experiments
● Change in error rate with and without the Nonuniform / Adaption processing
○ The error for distant objects is suppressed. (lower-left figure)


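Summary
● This paper proposes a model that performs 3D object detection from stereo images without using depth maps
● Compared with other models under the same setting, accuracy improves through the use of Stereo R-CNN and IDA (SOTA)
● Adjusting the depth levels suppresses the error for distant objects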