Towards safe and smooth coexistence of mobile robots and inattentive humans (PNARUDE Workshop at IROS2022)


October 27, 2022

Slide overview

Invited talk, PNARUDE Workshop at IEEE/RSJ IROS2022


Tamura Laboratory, Department of Robotics, Graduate School of Engineering, Tohoku University


Text of each slide
1.

Towards safe and smooth coexistence of mobile robots and inattentive humans Yusuke Tamura Tohoku University ytamura@tohoku.ac.jp PNARUDE Workshop, IROS2022, Kyoto, Japan.

2.

Mobile robots coexist with humans... For mobile robots and self-driving cars to move safely and smoothly, they need to predict the behavior of the people around them. 2

3.

How to predict pedestrian behaviors? Social Force Model (Helbing & Molnar, 1995) • assumes that virtual forces act on a pedestrian from other pedestrians and the environment. Prediction of pedestrian behavior using the SFM (Tamura et al., 2010) Extended social force model (Tamura et al., 2012) • considers pedestrians' intentions • creates an appropriate subgoal according to the estimated intention Y. Tamura, T. Fukuzawa, H. Asama, Smooth Collision Avoidance in Human-Robot Coexisting Environment, IROS2010, 3887-3892, 2010. Y. Tamura, P.D. Le, et al., Development of Pedestrian Behavior Model Taking Account of Intention, IROS2012, 382-387, 2012. 3
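The social force idea can be sketched in a few lines. The following is a minimal, illustrative implementation, not the exact model or parameter values of the cited papers: a driving force relaxes the pedestrian's velocity toward a desired speed pointing at the (sub)goal, and exponentially decaying repulsive forces push the pedestrian away from others.

```python
import numpy as np

def social_force(pos, vel, goal, others, v0=1.3, tau=0.5, A=2.0, B=0.3):
    """One social-force evaluation for a single pedestrian (simplified sketch).

    pos, vel : (2,) current position and velocity
    goal     : (2,) subgoal position
    others   : (N, 2) positions of other pedestrians
    v0, tau  : desired speed and relaxation time
    A, B     : repulsion strength and range (illustrative values)
    """
    # Driving force: relax toward the desired velocity pointing at the goal.
    e = (goal - pos) / np.linalg.norm(goal - pos)
    f_goal = (v0 * e - vel) / tau

    # Repulsive forces from other pedestrians, decaying exponentially with distance.
    f_rep = np.zeros(2)
    for p in others:
        d = pos - p
        dist = np.linalg.norm(d)
        f_rep += A * np.exp(-dist / B) * d / dist
    return f_goal + f_rep

# Pedestrian at the origin heading to (10, 0), another pedestrian slightly ahead-left.
f = social_force(np.array([0.0, 0.0]), np.array([1.0, 0.0]),
                 np.array([10.0, 0.0]), np.array([[1.0, 0.5]]))
```

Here the resulting force accelerates toward the goal while nudging the pedestrian away from the neighbor; integrating this force over time yields the predicted trajectory.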

4.

Environment-dependent pedestrian behavior (figure: campus map with surrounding destinations — Precision Eng., Urban Eng., and a rest room — to the north, south, east, and west) S. Hamasaki, Y. Tamura, et al., Prediction of Human's Movement for Collision Avoidance of Mobile Robot, IEEE ROBIO2011, 1633-1638, 2011. 4

5.

DL-based pedestrian trajectory prediction methods Pedestrian trajectory prediction considering the relationship with others • Social LSTM (Alahi et al., 2016), Social GAN (Gupta et al., 2018), … • BiRNN encoder-decoder framework (Wu et al., 2019) J. Wu, H. Woo, Y. Tamura, et al., Pedestrian Trajectory Prediction using BiRNN Encoder-Decoder Framework, Advanced Robotics, 33, 18, 956-969, 2019. 5

6.

Trajectory prediction with multi-head attention Trajectory prediction considering interaction between pedestrians and vehicles at unsignalized intersections DUT Dataset S. Tanno, Y. Tamura, et al., Trajectory Prediction by Attention Model Considering Pedestrians and Vehicles Interaction, JSME Robomech2022, 1A1-H11, 2022 (in Japanese). 6

7.

Limitations of current trajectory prediction - The influence of others and the environment on a pedestrian's movements depends on the pedestrian's attention. - Pedestrians who are not paying attention to their surroundings are at higher risk of collision. 7

8.

Smartphone Zombie 8

9.

Do you use your smartphone while walking? (survey results by age group)

Age      | YES (%) | NO (%)
10s      |  57.0   |  43.0
20s      |  66.9   |  33.1
30s      |  58.7   |  41.3
40s      |  46.3   |  53.7
50s      |  34.3   |  65.7
60s      |  27.6   |  72.4
70s      |  21.1   |  78.9
Average  |  42.6   |  57.4

https://www.moba-ken.jp/whitepaper/wp21/pdf/wp21_all.pdf NTT Docomo Mobile Society Research Institute, 2021. 9

10.

Observation at train stations Data was obtained at - Kintetsu-Nara Station - Yamato-Saidaiji Station - Osaka-Abenobashi Station Osaka-Abenobashi Station (Kintetsu Railway) 10

11.

Many zombies… 11

12.

Characteristics of smartphone zombies Face is directed toward the smartphone. Smartphone is in hand. Elbow is bent. 12

13.

How to detect smartphone zombies? 1. Posture estimation using PoseNet (extraction of shoulder, elbow, and wrist keypoints) 2. Extraction of an image around the hand ➔ An SVM determines whether or not a smartphone is included in the image.

Confusion matrix:
Actual \ Estimated | Zombie | Not zombie
Zombie             |  101   |    19
Not zombie         |    0   |   120

A. Kawasumi, Y. Tamura, Y. Hirata, Smartphone Zombie Detection from Camera Images for Human-Robot Coexistence, Proc. SICE SI2021, 2844-2848, 2021 (in Japanese). 13
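The crop step between the two stages can be sketched as follows. This is an illustrative geometric heuristic, not the authors' implementation, and `hand_crop_box` is a hypothetical helper: given elbow and wrist keypoints (as a pose estimator such as PoseNet would provide), the hand is assumed to lie just past the wrist along the forearm, and the crop size scales with forearm length.

```python
import numpy as np

def hand_crop_box(elbow, wrist, scale=1.0):
    """Estimate a square crop around the hand from elbow/wrist keypoints.

    The hand is assumed to lie past the wrist along the forearm direction;
    the crop size scales with forearm length so it adapts to the person's
    distance from the camera. Returns (x0, y0, x1, y1) in image coordinates.
    """
    elbow, wrist = np.asarray(elbow, float), np.asarray(wrist, float)
    forearm = wrist - elbow
    length = np.linalg.norm(forearm)
    center = wrist + 0.3 * forearm   # a little past the wrist
    half = 0.5 * scale * length      # half-size of the square crop
    x0, y0 = center - half
    x1, y1 = center + half
    return (x0, y0, x1, y1)

# Forearm pointing right: elbow at (100, 200), wrist at (140, 200).
box = hand_crop_box((100, 200), (140, 200))
```

The cropped region would then be fed to the SVM for the smartphone/no-smartphone decision.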

14.

How to detect smartphone zombies with 3D LiDAR? 2D point cloud from a lateral view J. Wu, Y. Tamura, et al., Smartphone Zombie Detection from LiDAR Point Cloud for Mobile Robot Safety, IEEE Robotics and Automation Letters, 5, 2, 2256-2263, 2020. 14

15.

Detection of smartphone zombies with 3D LiDAR
- Object tracking (t = 1, 2, 3, ...) is used to estimate each pedestrian's moving direction in the world frame.
- Each candidate cluster is projected to a lateral view, i.e., viewed from the direction perpendicular to the estimated moving direction. For a heading angle φ_t estimated from the tracked velocity, the rotation about the z axis is R = [[cos φ_t, −sin φ_t, 0], [sin φ_t, cos φ_t, 0], [0, 0, 1]].
- The z coordinates of the points are normalized by the height of the object, so body parts such as hands and arms appear in similar regions of the lateral profile across pedestrians.
- A 2D histogram over the lateral view is then used as the local feature for each candidate object.
J. Wu, Y. Tamura, et al., Smartphone Zombie Detection from LiDAR Point Cloud for Mobile Robot Safety, IEEE Robotics and Automation Letters, 5, 2, 2256-2263, 2020. 15
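The lateral-view feature extraction can be sketched roughly as below. This is a simplified stand-in for the paper's method, with assumed function and parameter names: rotate the cluster so the estimated moving direction aligns with the x axis, normalize heights, and bin the lateral profile into a 2D histogram.

```python
import numpy as np

def lateral_view_histogram(points, velocity, bins=8):
    """Project a pedestrian cluster to a lateral view and bin it.

    points   : (N, 3) cluster points in world coordinates
    velocity : (2,) estimated moving direction from tracking
    Rotates the cluster by -phi so the moving direction aligns with the
    x axis, normalizes z by the cluster height, and returns a 2D (x, z)
    histogram of relative frequencies.
    """
    vx, vy = velocity / np.linalg.norm(velocity)
    # Rotation about the z axis by -phi (cos phi = vx, sin phi = vy),
    # so a point along the velocity maps onto the +x axis.
    R = np.array([[vx,  vy, 0.0],
                  [-vy, vx, 0.0],
                  [0.0, 0.0, 1.0]])
    p = points @ R.T
    p -= p.min(axis=0)            # anchor the cluster at its own origin
    z = p[:, 2] / p[:, 2].max()   # normalize height to [0, 1]
    hist, _, _ = np.histogram2d(p[:, 0], z, bins=bins)
    return hist / len(points)     # relative frequencies

# Synthetic pedestrian-like cluster moving diagonally.
rng = np.random.default_rng(0)
pts = rng.normal(size=(200, 3)) * [0.2, 0.1, 0.5] + [0, 0, 0.9]
h = lateral_view_histogram(pts, np.array([1.0, 1.0]))
```

Because heights are normalized, the same body parts land in similar histogram rows regardless of the pedestrian's stature, which is the property the slide's feature design relies on.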

16.

Detection results (offline test)
- ROC and PR curves compare the proposed tracking-enhanced feature extraction with a PCA-based baseline; one panel zooms into the ROC region where the false positive rate is between 0 and 0.1.
- Average precision: proposed AP = 0.39 vs. PCA-based AP = 0.14.
- Despite best efforts to apply the state-of-the-art "VoxelNet" [Zhou 2018] to smartphone zombie detection, the model trained on the current dataset failed to provide even a single successful detection.
J. Wu, Y. Tamura, et al., Smartphone Zombie Detection from LiDAR Point Cloud for Mobile Robot Safety, IEEE Robotics and Automation Letters, 5, 2, 2256-2263, 2020. 16
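The curves on this slide summarize threshold sweeps over classifier scores. For reference, a minimal ROC computation looks like the following (a generic sketch on toy data, unrelated to the paper's actual results):

```python
import numpy as np

def roc_points(scores, labels):
    """Compute (FPR, TPR) pairs by sweeping a threshold over the scores.

    scores : (N,) classifier confidence per sample
    labels : (N,) ground truth, 1 = positive (zombie), 0 = negative
    """
    order = np.argsort(-scores)          # descending confidence
    labels = np.asarray(labels)[order]
    tps = np.cumsum(labels)              # true positives at each threshold
    fps = np.cumsum(1 - labels)          # false positives at each threshold
    tpr = tps / labels.sum()
    fpr = fps / (len(labels) - labels.sum())
    return fpr, tpr

scores = np.array([0.9, 0.8, 0.7, 0.6, 0.55, 0.4])
labels = np.array([1, 1, 0, 1, 0, 0])
fpr, tpr = roc_points(scores, labels)
```

Plotting TPR against FPR gives the ROC curve; the PR curve is the analogous sweep over precision and recall.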

17.

Zombie detection from a mobile robot
- Platform: Pioneer 3-AT equipped with a spherical camera and a 3D LiDAR; in practice, feature extraction is executed simultaneously with detection.
- Experiment setup (figure): left, data collection; right, outdoor demonstration.
J. Wu, Y. Tamura, et al., Smartphone Zombie Detection from LiDAR Point Cloud for Mobile Robot Safety, IEEE Robotics and Automation Letters, 5, 2, 2256-2263, 2020. 17

18.

How to predict inattentive pedestrians’ behavior? Do smartphone zombies walk straight without being affected by anything? NO. They walk on the basis of limited perception and memory. Different pedestrians pay different degrees of attention to their surroundings. Two approaches: 1. Estimation of human attention 2. Prediction with multiple hypotheses. 18

19.

Idea from magician's technique Y. Tamura, T. Akashi, S. Yano, H. Osumi, Human Visual Attention Model Based on Analysis of Magic for Smooth Human-Robot Interaction, International Journal of Social Robotics, 8, 5, 689-694, 2016. 19

20.

Modeling of human visual attention Input image → Face/Hand map, Gaze map, Saliency map, Manipulation map → Attention map Y. Tamura, T. Akashi, S. Yano, H. Osumi, Human Visual Attention Model Based on Analysis of Magic for Smooth Human-Robot Interaction, International Journal of Social Robotics, 8, 5, 689-694, 2016. 20
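A minimal sketch of fusing such feature maps into a single attention map is shown below. The published model's actual combination rule may differ; the equal weights and map contents here are purely illustrative.

```python
import numpy as np

def attention_map(maps, weights=None):
    """Fuse feature maps (e.g., saliency, gaze, face/hand, manipulation).

    A weighted sum of same-shaped 2D maps, normalized so the peak is 1.
    """
    maps = np.stack(maps)
    if weights is None:
        weights = np.ones(len(maps)) / len(maps)   # equal weights by default
    fused = np.tensordot(weights, maps, axes=1)
    return fused / fused.max()

# Toy example: a random saliency map plus a gaze map peaked at one cell.
h, w = 4, 4
saliency = np.random.default_rng(1).random((h, w))
gaze = np.zeros((h, w))
gaze[2, 2] = 1.0
att = attention_map([saliency, gaze])
```

In this toy case the fused attention peaks where the gaze map points, illustrating how a strong cue in one channel dominates the combined map.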

21.

Estimation of human visual attention Y. Tamura, T. Akashi, S. Yano, H. Osumi, Human Visual Attention Model Based on Analysis of Magic for Smooth Human-Robot Interaction, International Journal of Social Robotics, 8, 5, 689-694, 2016. 21

22.

Pedestrian prediction considering the uncertainties Using Monte Carlo Dropout to consider multiple hypotheses
- Unlike a standard model, MC Dropout [7] keeps dropout, normally used to prevent overfitting, active at inference time. As a result, part of the features from each layer becomes zero, introducing variability; repeating inference yields multiple outputs and thus multiple predicted trajectories. In this work, dropout layers are placed after the LSTM layer and after each fully connected (FC) layer except the one just before the output (network: input → LSTM → 128-dim feature → FC layers with dropout → output).
- Because dropout removes part of the features, a prediction can deviate greatly from the pedestrian's actual movement. The travel distance is therefore computed from the predictions at 0.5 s intervals, and predictions whose travel distance exceeds a threshold are removed as outliers.
- Data collection: pedestrian trajectories were collected with the same method as in the authors' previous paper [10]. That paper used an RGB camera, so depth had to be estimated from the data; here an RGB-D camera (Intel RealSense D435) provides depth directly. The camera height was set to about 1 m, considering the need to capture the whole body and the size of the mobile robot, and the frame rate was 10 fps.
- Input: 1.0 s of observation; output: 2.0 s of prediction. (Figure legend: Observed, Predicted, Ground Truth.) 22
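The MC Dropout idea can be illustrated with a tiny numpy regressor standing in for the LSTM predictor. This is an illustrative sketch, not the paper's network: dropout stays active at prediction time, so repeated forward passes give different outputs, and their spread reflects uncertainty.

```python
import numpy as np

def mc_dropout_predict(x, W1, W2, rng, n_samples=50, p=0.2):
    """Monte Carlo Dropout inference for a tiny two-layer regressor.

    x        : (d_in,) input feature
    W1, W2   : weight matrices of a two-layer network
    n_samples: number of stochastic forward passes
    p        : dropout probability, kept active at inference time
    Returns the mean prediction and its per-dimension spread.
    """
    preds = []
    for _ in range(n_samples):
        h = np.maximum(0.0, x @ W1)        # hidden layer (ReLU)
        mask = rng.random(h.shape) >= p    # fresh dropout mask per pass
        h = h * mask / (1.0 - p)           # inverted-dropout scaling
        preds.append(h @ W2)
    preds = np.array(preds)
    return preds.mean(axis=0), preds.std(axis=0)

rng = np.random.default_rng(0)
W1 = rng.normal(size=(3, 16))
W2 = rng.normal(size=(16, 2))
mean, std = mc_dropout_predict(np.ones(3), W1, W2, rng)
```

Each stochastic pass plays the role of one trajectory hypothesis; the slide's outlier-removal step would then discard samples whose implied travel distance is implausibly large.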

23.

Summary Towards safe and smooth coexistence of mobile robots and inattentive humans - Prediction of pedestrian behaviors - Detection of smartphone zombies - Prediction of inattentive pedestrian behaviors 23