[DL Reading Group] Deep Learning and Curved Parameter Spaces (Scalable trust-region method for deep reinforcement learning using Kronecker-factored approximation)

March 23, 18

#Deep Learning #Neural Networks #Optimization Methods #Gradient Update #Catastrophic Forgetting

Slide overview

2018/03/02
Deep Learning JP:
http://deeplearning.jp/seminar-2/

Text of each page

DEEP LEARNING JP [DL Papers] Deep Learning Reiji Hatsugai, DeepX http://deeplearning.jp/ 1


• Hessian (H) and Fisher Information Matrix (FIM)
• Catastrophic forgetting

https://www.jstage.jst.go.jp/article/sicejl1962/40/10/40_10_735/_pdf

10.


11.

12.


13.


D.P.Kingma, Adam: A Method for Stochastic Optimization, https://arxiv.org/pdf/1412.6980.pdf

14.

15.

16.

• Trust Region Policy Optimization
– Inverse vector product
• Kronecker-Factored Approximate Curvature
• Kronecker Factored Recursive Approximation

John Schulman et al, Trust Region Policy Optimization, https://arxiv.org/pdf/1502.05477.pdf
Yuhuai Wu et al, Scalable trust-region method for deep reinforcement learning using Kronecker-factored approximation, https://arxiv.org/pdf/1708.05144.pdf
Roger Grosse & James Martens, A Kronecker-factored approximate Fisher matrix for convolution layers, https://arxiv.org/pdf/1602.01407.pdf
James Martens & Roger Grosse, Optimizing Neural Networks with Kronecker-factored Approximate Curvature, https://arxiv.org/pdf/1503.05671.pdf
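
The Kronecker-factored methods cited above approximate a layer's Fisher block as a Kronecker product A ⊗ G, so the inverse can be applied using only the two small factors. The following is a minimal sketch of that identity in plain Python, assuming 2x2 symmetric factors and a column-stacking vec; the matrices and helper names (`inv2`, `kron`, `vec`) are made up for illustration and do not come from the slides.

```python
def matmul(X, Y):
    """Plain matrix product of two nested-list matrices."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

def transpose(X):
    return [list(row) for row in zip(*X)]

def inv2(M):
    """Inverse of a 2x2 matrix (enough for this toy example)."""
    (a, b), (c, d) = M
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

def kron(A, G):
    """Kronecker product: block (i, j) of the result is A[i][j] * G."""
    return [[A[i][j] * G[k][l]
             for j in range(len(A[0])) for l in range(len(G[0]))]
            for i in range(len(A)) for k in range(len(G))]

def vec(X):
    """Stack the columns of X into one flat list."""
    return [X[i][j] for j in range(len(X[0])) for i in range(len(X))]

# Toy factors (assumed values): A plays the role of the input second
# moment, G the output-gradient second moment, V the weight gradient.
A = [[2.0, 0.5], [0.5, 1.0]]
G = [[1.5, 0.2], [0.2, 0.8]]
V = [[1.0, -2.0], [0.5, 3.0]]

# Preconditioned gradient from the small factors only:
# (A kron G)^{-1} vec(V) = vec(G^{-1} V A^{-T})
X = matmul(matmul(inv2(G), V), inv2(transpose(A)))

# Check against the full Kronecker matrix: (A kron G) vec(X) should be vec(V).
K = kron(A, G)
x = vec(X)
recovered = [sum(K[r][c] * x[c] for c in range(len(x))) for r in range(len(K))]
max_err = max(abs(r - v) for r, v in zip(recovered, vec(V)))
```

Here `max_err` stays at floating-point noise, while the factored route only ever inverts the two 2x2 matrices rather than the 4x4 product.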

17.


C.M.Bishop, PRML, p. 213
Hippolyt Ritter et al, A Scalable Laplace Approximation for Neural Networks, https://openreview.net/pdf?id=Skdvd2xAZ
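
For reference, the Laplace approximation covered by the two works cited above fits a Gaussian at a posterior mode Θ∗ and uses the Hessian of the negative log posterior there as the precision. Written out, assuming D denotes the training data (a symbol not taken from the slide):

```latex
p(\Theta \mid D) \approx \mathcal{N}\!\left(\Theta;\ \Theta^{*},\ H^{-1}\right),
\qquad
H = -\nabla_{\Theta}^{2} \log p(\Theta \mid D)\,\Big|_{\Theta = \Theta^{*}}
```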


18.


Nitish Shirish Keskar et al, On Large-Batch Training for Deep Learning: Generalization Gap and Sharp Minima,
https://arxiv.org/pdf/1609.04836.pdf


19.

Overcoming Catastrophic Forgetting
– multitask learning

James Kirkpatrick et al, Overcoming catastrophic forgetting in neural networks, https://arxiv.org/pdf/1612.00796.pdf
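
As a rough illustration of the method in the paper cited above, elastic weight consolidation adds a quadratic penalty that anchors each parameter to its task-A value, weighted by the diagonal of the Fisher information. The sketch below is plain Python under stated assumptions: the parameter values, the Fisher diagonal, the toy task-B loss and the function names are all invented for the example.

```python
def ewc_penalty(theta, theta_star_a, fisher_diag, lam):
    """(lam / 2) * sum_i F_i * (theta_i - theta_star_a_i) ** 2."""
    return 0.5 * lam * sum(
        f * (t - ts) ** 2 for t, ts, f in zip(theta, theta_star_a, fisher_diag)
    )

def ewc_loss(loss_b, theta, theta_star_a, fisher_diag, lam):
    """Task-B loss plus the penalty that protects task-A parameters."""
    return loss_b(theta) + ewc_penalty(theta, theta_star_a, fisher_diag, lam)

# Assumed toy values: parameter 0 mattered a lot for task A, parameter 1 barely.
theta_star_a = [1.0, -2.0]
fisher_diag = [10.0, 0.1]

def loss_b(theta):
    """Hypothetical task-B loss with its minimum at (3, 3)."""
    return sum((t - 3.0) ** 2 for t in theta)

# Moving the important parameter by 1 costs far more than moving the other one.
moved_important = ewc_penalty([2.0, -2.0], theta_star_a, fisher_diag, lam=1.0)
moved_unimportant = ewc_penalty([1.0, -1.0], theta_star_a, fisher_diag, lam=1.0)
loss_at_anchor = ewc_loss(loss_b, theta_star_a, theta_star_a, fisher_diag, lam=1.0)
```

The penalty vanishes at the task-A solution, so `loss_at_anchor` is just the task-B loss there; away from it, directions with large Fisher entries are the expensive ones to move along.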


20.


Yann LeCun et al, Optimal Brain Damage, http://yann.lecun.com/exdb/publis/pdf/lecun-90b.pdf
Yoojin Choi et al, Towards the Limit of Network Quantization, https://arxiv.org/pdf/1612.01543.pdf

21.

• Adversarial perturbation

Pang Wei Koh, Understanding Black-box Predictions via Influence Functions, https://arxiv.org/pdf/1703.04730.pdf


22.


23.


24.


C.M.Bishop, PRML p251
D.P.Kingma, Adam: A Method for Stochastic Optimization, https://arxiv.org/pdf/1412.6980.pdf
James Kirkpatrick et al, Overcoming catastrophic forgetting in neural networks, https://arxiv.org/pdf/1612.00796.pdf
Yann LeCun et al, Optimal Brain Damage, http://yann.lecun.com/exdb/publis/pdf/lecun-90b.pdf

25.


26.

• Conjugate Gradient

http://www.slis.tsukuba.ac.jp/~fujisawa.makoto.fu/cgi-bin/wiki/index.php?%CF%A2%CE%A91%BC%A1%CA%FD%C4%F8%BC%B0%A1%A7%B6%A6%CC%F2%B8%FB%C7%DB%CB%A1
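
Conjugate gradient solves a linear system A x = b for a symmetric positive-definite A while touching A only through matrix-vector products, which is what makes it usable when A is a Hessian or Fisher matrix too large to store. A minimal sketch in plain Python; the 2x2 system, the tolerance and the function names are assumptions chosen for the example.

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def conjugate_gradient(matvec, b, iters=50, tol=1e-10):
    """Solve A x = b for symmetric positive-definite A, given only v -> A v."""
    x = [0.0] * len(b)
    r = list(b)            # residual b - A x, with x = 0 initially
    p = list(r)            # first search direction
    rs = dot(r, r)
    for _ in range(iters):
        if rs < tol:       # squared residual norm small enough: stop
            break
        Ap = matvec(p)
        alpha = rs / dot(p, Ap)                           # step length along p
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * api for ri, api in zip(r, Ap)]
        rs_new = dot(r, r)
        # New direction, conjugate to the previous ones with respect to A.
        p = [ri + (rs_new / rs) * pi for ri, pi in zip(r, p)]
        rs = rs_new
    return x

# Toy system (assumed values); the exact solution is x = (1/11, 7/11).
A = [[4.0, 1.0], [1.0, 3.0]]
b = [1.0, 2.0]

def matvec(v):
    """The only access to A that the solver needs."""
    return [dot(row, v) for row in A]

x = conjugate_gradient(matvec, b)
```

In a curvature-aware optimizer, `matvec` would be replaced by a Hessian-vector or Fisher-vector product, so the matrix itself is never materialized.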

27.


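import tensorflow as tf
# Hessian-vector product H x by double backprop (TF1-style graph mode).
# Assumed inputs: loss is a scalar tensor; params and x are lists of tensors with matching shapes.
grads = tf.gradients(loss, params)
grad_dot_x = tf.add_n([tf.reduce_sum(g * tf.stop_gradient(v)) for g, v in zip(grads, x)])
hvp = tf.gradients(grad_dot_x, params)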
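
To sanity-check a Hessian-vector product without building the Hessian, one option is to compare it against a small function whose derivatives are known by hand. The sketch below deliberately swaps the double-backprop approach for a central finite difference of the gradient so that it runs in plain Python without TensorFlow; the toy function, the step size `eps` and all names are assumptions for illustration.

```python
import math

def grad_f(theta):
    """Gradient of the toy function f(t0, t1) = t0**2 * t1 + sin(t1)."""
    t0, t1 = theta
    return [2.0 * t0 * t1, t0 ** 2 + math.cos(t1)]

def hvp_fd(grad, theta, v, eps=1e-5):
    """Approximate H v as (grad(theta + eps v) - grad(theta - eps v)) / (2 eps)."""
    plus = grad([t + eps * vi for t, vi in zip(theta, v)])
    minus = grad([t - eps * vi for t, vi in zip(theta, v)])
    return [(p - m) / (2.0 * eps) for p, m in zip(plus, minus)]

theta = [1.5, 0.3]
v = [1.0, -2.0]
approx = hvp_fd(grad_f, theta, v)

# Exact Hessian of the toy function, formed here only for the comparison:
# [[2 t1, 2 t0], [2 t0, -sin(t1)]]
t0, t1 = theta
H = [[2.0 * t1, 2.0 * t0], [2.0 * t0, -math.sin(t1)]]
exact = [sum(h * vi for h, vi in zip(row, v)) for row in H]
max_diff = max(abs(a - e) for a, e in zip(approx, exact))
```

Either route costs about two gradient evaluations per product, which is why the product can be fed to an iterative solver even when the full Hessian would not fit in memory.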

28.


29.


30.


31.


32.
