[DL Reading Group] Fast and Slow Learning of Recurrent Independent Mechanisms

June 04, 21

deep learning

Slide overview

2021/06/04
Deep Learning JP:
http://deeplearning.jp/seminar-2/

Text of each page

DEEP LEARNING JP [DL Papers] Fast and Slow Learning of Recurrent Independent Mechanisms XIN ZHANG, Matsuo Lab http://deeplearning.jp/

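Bibliographic information
● Title:
○ Fast and Slow Learning of Recurrent Independent Mechanisms
● Authors
○ Kanika Madan, Rosemary Nan Ke, Anirudh Goyal, Bernhard Schölkopf, Yoshua Bengio.
● ICLR 2021
● Overview
○ Modular networks aim to reproduce the functionally independent regions found in the brain.
○ Recurrent Independent Mechanisms (RIM) are one such architecture.
○ This work improves RIMs by proposing a scheme that trains their components at different step intervals.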

Introduction

Introduction: Modular Networks
➢ VQA: a parser selects reusable modules and assembles them into a network.
Deep Compositional Question Answering with Neural Module Networks, 2016

https://arxiv.org/pdf/1511.02799.pdf

Introduction: Modular Networks
➢ Generate a surplus of networks and, in the spirit of evolution, keep the modules that prove useful.

Introduction: Modular Networks
➢ Robot-specific modules and task-specific modules are learned separately and generalize to new robot-task combinations.
Learning Modular Neural Network Policies for Multi-Task and Multi-Robot Transfer, 2016

https://arxiv.org/pdf/1609.07088.pdf

Meta Learning of Recurrent Independent Mechanisms

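RIM: Recurrent Independent Mechanisms
➢ The input is encoded into a latent space and passed through the RIM, which outputs memory relevant to the input.
○ The output is split into a value head and a policy head and used for PPO training.
➢ A RIM consists of N independent modules; attention selects the K modules most relevant to the input, and only those are updated.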
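The "update only K of N modules" step on this slide can be illustrated with a small NumPy sketch. It is a simplification under stated assumptions, not the authors' implementation: the function names, the projection matrices `w_query`, `w_key`, `w_update`, the scaled dot-product scoring, and the toy `tanh` update are all hypothetical stand-ins for the real attention and recurrent cells.

```python
import numpy as np

def select_active_modules(hidden, x, w_query, w_key, k):
    """Return the indices of the k modules whose queries best match the input.

    hidden  : (n_modules, d_hidden) per-module hidden states
    x       : (d_in,) encoded input
    w_query : (d_hidden, d_att) query projection (hypothetical)
    w_key   : (d_in, d_att) key projection (hypothetical)
    """
    queries = hidden @ w_query                       # (n_modules, d_att)
    key = x @ w_key                                  # (d_att,)
    scores = queries @ key / np.sqrt(key.shape[0])   # (n_modules,)
    active = np.sort(np.argsort(scores)[-k:])        # k highest-scoring modules
    return active, scores

def rim_step(hidden, x, w_query, w_key, w_update, k):
    """One toy step: active modules read the input, the others keep their state."""
    active, _ = select_active_modules(hidden, x, w_query, w_key, k)
    new_hidden = hidden.copy()
    for i in active:
        # Toy per-module update; a real RIM would use a recurrent cell here.
        new_hidden[i] = np.tanh(hidden[i] + x @ w_update[i])
    return new_hidden, active

# Small demo with n=4 modules and k=2 active ones.
rng = np.random.default_rng(0)
n_modules, k, d_hidden, d_in, d_att = 4, 2, 8, 6, 5
hidden = rng.normal(size=(n_modules, d_hidden))
x = rng.normal(size=d_in)
w_query = rng.normal(size=(d_hidden, d_att))
w_key = rng.normal(size=(d_in, d_att))
w_update = rng.normal(size=(n_modules, d_in, d_hidden))
new_hidden, active = rim_step(hidden, x, w_query, w_key, w_update, k)
print("active modules:", active)
```

The point of the sketch is the masking: rows of `new_hidden` that are not in `active` are copied through unchanged, which is what keeps the modules independent.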

Meta Learning of RIM
➢ Fast (inner loop): RIM modules, policy head.
➢ Slow (outer loop): input attention & communication attention, value head.

Proposed method: MIR
➢ The loss is the PPO loss.
➢ The module parameters θM and the attention parameters θA are updated at different step intervals.
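The idea of updating two parameter groups at different step intervals can be sketched on a toy problem. This is only an illustration under assumptions: the slides use the PPO loss, whereas the quadratic loss below is a stand-in, and the names `fast_slow_train`, `slow_factor`, `lr_fast`, and `lr_slow` are hypothetical. Here `theta_m` plays the role of the fast module parameters and `theta_a` the slow attention parameters.

```python
import numpy as np

def toy_loss_and_grads(theta_m, theta_a, target):
    """Stand-in loss 0.5 * ||theta_m * theta_a - target||^2 and its gradients."""
    residual = theta_m * theta_a - target
    loss = 0.5 * float(np.sum(residual ** 2))
    grad_m = residual * theta_a   # d loss / d theta_m
    grad_a = residual * theta_m   # d loss / d theta_a
    return loss, grad_m, grad_a

def fast_slow_train(theta_m, theta_a, target, steps=200, slow_factor=4,
                    lr_fast=0.05, lr_slow=0.05):
    """Update theta_m every step and theta_a once every `slow_factor` steps."""
    losses = []
    n_fast = n_slow = 0
    for step in range(1, steps + 1):
        loss, grad_m, _ = toy_loss_and_grads(theta_m, theta_a, target)
        losses.append(loss)
        # Fast (inner) update: runs on every step.
        theta_m = theta_m - lr_fast * grad_m
        n_fast += 1
        if step % slow_factor == 0:
            # Slow (outer) update: uses the gradient after the fast updates.
            _, _, grad_a = toy_loss_and_grads(theta_m, theta_a, target)
            theta_a = theta_a - lr_slow * grad_a
            n_slow += 1
    return theta_m, theta_a, losses, (n_fast, n_slow)

theta_m, theta_a, losses, counts = fast_slow_train(
    np.ones(3), np.ones(3), target=np.array([2.0, 3.0, 0.5]))
print("fast/slow update counts:", counts)
print("first and last loss:", losses[0], losses[-1])
```

Note that the slow group here is slow because it is updated less often, not because it has a smaller learning rate; the two learning rates are deliberately equal in this sketch.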

Related Work
- Modular Networks (Introduction)
- Meta Learning
- Modular meta-learning 2018
- Meta-Learning to Disentangle Causal Mechanisms
- A Meta-Transfer Objective for Learning to Disentangle Causal Mechanisms
- Learning neural causal models from unknown interventions

Experiment

a: Improve sample efficiency?
➢ Yes. The red line is the proposed method; the horizontal axis is the number of frames.

b: Lead to policies that generalize better?
➢ Yes. The "More Difficult" setting is zero-shot transfer, where the proposed method leads the baseline by a wide margin.

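c: Fast adaptation to new distributions?
➢ Pre-train in a simple environment, then measure the success rate in the target environment.
○ The results suggest that the method reuses pieces of knowledge more efficiently.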

Ablation: Does the meta-learning setup matter?
➢ Evidence for the importance of meta-learning? The figure shows Meta-LSTM doing better than the vanilla baseline.

Ablation: Sparsity, Slow-factor of Outer loop
Example with n=4, k=2.
➢ Sparsity improves how well the modules function compared with using all of them.

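Ablation: Value function Visualization
➢ In the left figure the value rises and falls; it is high whenever the goal is in view.
➢ At frame 12 the agent is right in front of the goal, so the value is very high; at frame 13 the task has ended, so it drops.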

Ablation: Visualizing Module Activations
➢ Shows which modules are activated for the inputs on the left (n=5, k=3).
➢ At F7 the green dot on the left comes into view and M5 is activated.

Ablation: Importance of Fast and Slow Update Loops
➢ Swapping the roles of the inner and outer loops lowers performance to roughly the vanilla level.
➢ Lowering only the learning rate of the attention does not work either (slowLR).

Ablation: Roles of the Active Modules
➢ With fewer active modules, the agent took longer to complete an episode.

Conclusion

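Summary & Impressions
Summary:
- Research on the architecture needed to decompose and reuse knowledge.
- An interesting study that neatly connects many related fields (meta RL, HRL, time scales in RL, attention). (OpenReview)
- Concretely, it realizes RIMs with a meta-learning approach.
- The hope is that meta-learning can improve generalization performance.
Impressions:
- Research on modular networks is interesting; RIM is important work that Prof. Bengio is pushing.
- There seem to be ways to make each module take on a distinct role more clearly.
- DADS の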

https://openreview.net/pdf?id=HJgLZR4KvH

Appendix
- Related work:
- Learning to Combine Top-Down and Bottom-Up Signals in Recurrent Neural Networks with Attention over Modules
Blog posts on RIM:
- https://www.zhihu.com/search?type=content&q=Recurrent%20independent%20mechanisms

Appendix：PPO
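Only the title of this appendix slide survives in the text, so the following is not taken from the slides. As a reminder of what the PPO loss refers to, here is a minimal NumPy sketch of the standard clipped surrogate policy loss from the PPO paper (Schulman et al., 2017). The function name and argument names are hypothetical, and the value-function and entropy terms that usually accompany it are omitted.

```python
import numpy as np

def ppo_clipped_policy_loss(new_log_probs, old_log_probs, advantages, clip_eps=0.2):
    """Negative clipped surrogate objective, averaged over samples.

    new_log_probs : log pi_theta(a|s) under the current policy
    old_log_probs : log pi_theta_old(a|s) under the data-collecting policy
    advantages    : advantage estimates for the same (s, a) pairs
    """
    ratio = np.exp(new_log_probs - old_log_probs)
    unclipped = ratio * advantages
    clipped = np.clip(ratio, 1.0 - clip_eps, 1.0 + clip_eps) * advantages
    # Take the pessimistic (smaller) objective per sample, then negate to get a loss.
    return -float(np.mean(np.minimum(unclipped, clipped)))

# When the policy has not changed, the ratio is 1 and the loss is -mean(advantages).
log_p = np.log(np.array([0.5, 0.25, 0.25]))
adv = np.array([1.0, -0.5, 2.0])
print(ppo_clipped_policy_loss(log_p, log_p, adv))
```

The clipping removes the incentive to move the probability ratio outside the interval [1 - clip_eps, 1 + clip_eps] in the direction that would increase the objective.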