[DL Hacks]AVID: Adversarial Visual Irregularity Detection

August 29, 18

Slide overview

2018/08/27
Deep Learning JP:
http://deeplearning.jp/hacks/

Text of each page

DEEP LEARNING JP [DL Hacks] AVID: Adversarial Visual Irregularity Detection Hiromi Nakagawa, Matsuo Lab http://deeplearning.jp/

Agenda 1. Paper overview 2. Implementation 3. Experimental results

Agenda 1. Paper overview (quoting Kubo's slides from the DL reading group: https://www.slideshare.net/DeepLearningJP2016/dlavidadversarial-visual-irregularity-detection) 2. Implementation 3. Experimental results

Paper overview
Source: https://www.slideshare.net/DeepLearningJP2016/dlavidadversarial-visual-irregularity-detection

Paper overview
Source: https://www.slideshare.net/DeepLearningJP2016/dlavidadversarial-visual-irregularity-detection

Paper overview: Inpainting Network (Generator side)
• Rather than detecting anomalies directly, it acts to erase anomalies from the input image.
• The architecture is a U-Net trained only on normal images. During training, the input is a normal image with Gaussian noise added. At test time, the idea is that the anomalous parts disappear from the output.
Source: https://www.slideshare.net/DeepLearningJP2016/dlavidadversarial-visual-irregularity-detection
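The training-time input of the inpainting network can be sketched as below. This is a framework-free illustration, not the author's code: the function name `add_gaussian_noise`, the reading of γ as a multiplier on standard normal noise, and the clipping to [-1, 1] are all assumptions made for this example.

```python
import random


def add_gaussian_noise(image, gamma=0.4, rng=None):
    """Return a noisy copy of `image` (list of rows of floats in [-1, 1]).

    Assumption: gamma scales zero-mean, unit-variance Gaussian noise,
    and the result is clipped back into [-1, 1].
    """
    rng = rng or random.Random()
    noisy = []
    for row in image:
        noisy.append(
            [min(1.0, max(-1.0, px + gamma * rng.gauss(0.0, 1.0))) for px in row]
        )
    return noisy


# Toy 4x4 "normal" image; the generator would be trained to map noisy -> clean.
clean = [[0.0] * 4 for _ in range(4)]
noisy = add_gaussian_noise(clean, gamma=0.4, rng=random.Random(0))
```

The generator only ever sees noisy versions of normal images, so a larger `gamma` makes the reconstruction task harder.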

Paper overview: Detection Network (Discriminator side)
• It uses the FCN (fully convolutional network) structure used in semantic segmentation.
• It is trained to detect anomalous regions in the input image. Visualizing its output as a heatmap makes this easier to picture.
Source: https://www.slideshare.net/DeepLearningJP2016/dlavidadversarial-visual-irregularity-detection

Paper overview: Training method
• Standard GAN training
• In the proposed method, training operates on matrices
Source: https://www.slideshare.net/DeepLearningJP2016/dlavidadversarial-visual-irregularity-detection
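For reference, standard GAN training is the minimax game below. The notation ($G$, $D$, $p_{\mathrm{data}}$, $p_z$) is the conventional one and is not taken from the slide; the objective of the proposed method is not written out here because the slide text does not contain it.

```latex
\min_G \max_D \;
  \mathbb{E}_{x \sim p_{\mathrm{data}}}\left[\log D(x)\right]
  + \mathbb{E}_{z \sim p_z}\left[\log\left(1 - D(G(z))\right)\right]
```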

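Paper overview: How anomalies are judged
• I (Generator side) performs pixel-level detection, and D (Discriminator side) performs patch-level detection.
• On the I side, an anomaly is judged from the difference between the test image and the generated image. Without an anomaly the difference is close to zero; with an anomaly it becomes large.
• On the D side, an anomaly is judged by whether the output for each region is below a threshold.
• Taking both into account, an anomaly is defined as anything that meets a combined condition on the two outputs (given as a formula on the original slide).
Source: https://www.slideshare.net/DeepLearningJP2016/dlavidadversarial-visual-irregularity-detection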
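A minimal sketch of such a combined decision is shown below. The slide's formula did not survive as text, so the following are assumptions: a pixel is flagged only when both criteria agree (logical AND), `alpha` is the threshold on the reconstruction error of I, `zeta` is the threshold on the output of D, and the patch grid is upsampled by nearest neighbour. All function names are hypothetical.

```python
def upsample_nearest(patch_scores, height, width):
    """Expand a small patch-score grid to (height, width) by nearest neighbour."""
    rows, cols = len(patch_scores), len(patch_scores[0])
    return [
        [patch_scores[y * rows // height][x * cols // width] for x in range(width)]
        for y in range(height)
    ]


def anomaly_mask(image, reconstruction, patch_scores, alpha=0.4, zeta=0.49):
    """Flag a pixel when I's error exceeds alpha AND D's score is below zeta.

    Assumption: the two criteria are combined with a logical AND.
    """
    height, width = len(image), len(image[0])
    d_map = upsample_nearest(patch_scores, height, width)
    mask = []
    for y in range(height):
        row = []
        for x in range(width):
            pixel_flag = abs(image[y][x] - reconstruction[y][x]) > alpha
            patch_flag = d_map[y][x] < zeta
            row.append(pixel_flag and patch_flag)
        mask.append(row)
    return mask


# Toy example: an "object" in the bottom-right corner that I fails to reproduce.
image = [[0.0, 0.0, 0.0, 0.0],
         [0.0, 0.0, 0.0, 0.0],
         [0.0, 0.0, 1.0, 1.0],
         [0.0, 0.0, 1.0, 1.0]]
reconstruction = [[0.0] * 4 for _ in range(4)]
patch_scores = [[0.9, 0.9],
                [0.9, 0.1]]  # D gives a low score only to the bottom-right patch
mask = anomaly_mask(image, reconstruction, patch_scores)
```

Under this rule a large reconstruction error alone is not enough: if D scores the patch as normal, the pixel is left unflagged.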

Paper overview: Datasets
1. UCSD: Images of pedestrians passing by (fixed camera, 10 fps). Cars and bicycles are treated as anomalies. Two subsets, Ped1 and Ped2, are provided.
2. UMN: Videos of pedestrians passing by, in which the pedestrians suddenly start running. (video)
3. IR-MNIST: MNIST with the digit 3 removed. The digit 3 appears only at test time and is treated as the anomaly.
Figures: UCSD normal image; UCSD anomalous image
Source: https://www.slideshare.net/DeepLearningJP2016/dlavidadversarial-visual-irregularity-detection

Paper overview: Results 1 (UCSD)
• FL (frame level): a frame counts as anomalous if even 1 px is detected as anomalous.
• PL (pixel level): the detection must match at least 40% of the ground truth.
• Meaning of the last column
  – D: uses deep learning
  – E: end-to-end training
  – P: whether training is patch-based
Source: https://www.slideshare.net/DeepLearningJP2016/dlavidadversarial-visual-irregularity-detection
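The two evaluation criteria can be sketched as follows. This is an illustration under assumptions, not the evaluation code used in the paper: masks are lists of rows of 0/1 values, "matching 40% of the ground truth" is read as covering at least 40% of the ground-truth anomalous pixels, and the function names are hypothetical.

```python
def frame_level_positive(pred_mask):
    """Frame level: the frame is anomalous if any pixel is flagged."""
    return any(any(row) for row in pred_mask)


def pixel_level_hit(pred_mask, gt_mask, min_overlap=0.4):
    """Pixel level: at least `min_overlap` of the ground-truth pixels are flagged."""
    gt_pixels = sum(1 for row in gt_mask for cell in row if cell)
    if gt_pixels == 0:
        return False  # no anomaly in this frame, so there is nothing to hit
    covered = sum(
        1
        for pred_row, gt_row in zip(pred_mask, gt_mask)
        for p, g in zip(pred_row, gt_row)
        if p and g
    )
    return covered / gt_pixels >= min_overlap


gt = [[1, 1, 1, 1, 1, 0, 0, 0]]
pred_good = [[1, 1, 1, 0, 0, 0, 0, 0]]  # covers 3 of 5 ground-truth pixels
pred_weak = [[1, 0, 0, 0, 0, 0, 0, 1]]  # covers 1 of 5 ground-truth pixels
```

`pred_weak` still counts as a frame-level detection, which is why the pixel-level criterion is the stricter of the two.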

Paper overview: Results 1 (UCSD)
Figures: input image; output image of I
Source: https://www.slideshare.net/DeepLearningJP2016/dlavidadversarial-visual-irregularity-detection

Paper overview: Results 2 (UMN)
• Because this is a simple dataset with only normal, anomalous, and transition states, frame-level EER and AUC are computed. For video, preprocessing follows the method of a separate paper by the authors: Deep-anomaly: Fully convolutional neural network for fast anomaly detection in crowded scenes https://arxiv.org/abs/1609.00866
Source: https://www.slideshare.net/DeepLearningJP2016/dlavidadversarial-visual-irregularity-detection
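Frame-level EER and AUC can be computed from per-frame anomaly scores roughly as below. This is a plain-Python sketch with hypothetical names, assuming a higher score means "more anomalous", label 1 means anomalous, and both classes are present; the EER is approximated by the ROC point where the false positive rate and false negative rate are closest.

```python
def roc_points(scores, labels):
    """Return ROC points (fpr, tpr), sweeping the threshold from high to low."""
    positives = sum(labels)
    negatives = len(labels) - positives
    points = [(0.0, 0.0)]
    for threshold in sorted(set(scores), reverse=True):
        tp = sum(1 for s, l in zip(scores, labels) if s >= threshold and l == 1)
        fp = sum(1 for s, l in zip(scores, labels) if s >= threshold and l == 0)
        points.append((fp / negatives, tp / positives))
    return points


def auc(points):
    """Area under the ROC curve by the trapezoidal rule."""
    return sum(
        (x1 - x0) * (y0 + y1) / 2 for (x0, y0), (x1, y1) in zip(points, points[1:])
    )


def eer(points):
    """Approximate equal error rate: where FPR and FNR (= 1 - TPR) are closest."""
    fpr, tpr = min(points, key=lambda p: abs(p[0] - (1 - p[1])))
    return (fpr + (1 - tpr)) / 2


# Toy per-frame scores: three anomalous frames (1) and three normal frames (0).
scores = [0.9, 0.8, 0.7, 0.3, 0.2, 0.1]
labels = [1, 1, 0, 1, 0, 0]
points = roc_points(scores, labels)
frame_auc = auc(points)
frame_eer = eer(points)
```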

Paper overview: Results 3 (IR-MNIST)
Figures: input and output of I (Generator side); heatmap of the output of D (Discriminator side)
Source: https://www.slideshare.net/DeepLearningJP2016/dlavidadversarial-visual-irregularity-detection

Paper overview: Results 3 (IR-MNIST)
• Results were recorded while varying the anomaly threshold.
Source: https://www.slideshare.net/DeepLearningJP2016/dlavidadversarial-visual-irregularity-detection

Agenda 1. Paper overview 2. Implementation 3. Experimental results
https://github.com/Hirominnn/AVID_pytorch

Implementation (UNet class)
• Generator
  – Uses U-Net, as in the original paper
  – Implemented with reference to the following URLs, among others
    • https://github.com/milesial/Pytorch-UNet
    • https://github.com/jaxony/unet-pytorch

Implementation
• Generator submodules

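Implementation
• Discriminator
  – FCN implemented with reference to the following URLs, among others
    • https://github.com/pochih/FCN-pytorch
    • https://github.com/wkentaro/pytorch-fcn
  – The figure in the original paper assumes a 64x64→11x11 model
    • Images such as IR-MNIST (224x224) are crushed to the point of being unreadable when resized to 64x64, so 112x112→11x11 and 224x224→11x11 models were implemented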

Implementation
• Discriminator
  – Example of a 64x64→11x11 FCN
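When adapting an FCN to a new input resolution, the output grid size can be checked with the usual convolution/pooling size formula, floor((n + 2p - k) / s) + 1. The layer lists below are hypothetical and chosen only so that the arithmetic ends at 11; they are not the layers of the author's repository or of the paper.

```python
def conv_out(size, kernel, stride=1, padding=0):
    """Spatial output size of a convolution or pooling layer (dilation 1)."""
    return (size + 2 * padding - kernel) // stride + 1


def stack_out(size, layers):
    """Apply conv_out through a list of (kernel, stride, padding) layers."""
    for kernel, stride, padding in layers:
        size = conv_out(size, kernel, stride, padding)
    return size


# Hypothetical layer stacks (kernel, stride, padding), for illustration only.
LAYERS_64 = [(5, 2, 0), (5, 1, 0), (2, 2, 0), (3, 1, 0)]   # 64 -> 30 -> 26 -> 13 -> 11
LAYERS_112 = [(4, 2, 1), (5, 2, 0), (4, 2, 0), (2, 1, 0)]  # 112 -> 56 -> 26 -> 12 -> 11
LAYERS_224 = [(4, 2, 1)] + LAYERS_112                      # 224 -> 112 -> ... -> 11

sizes = {
    64: stack_out(64, LAYERS_64),
    112: stack_out(112, LAYERS_112),
    224: stack_out(224, LAYERS_224),
}
```

Running this kind of check before training makes it easy to see how many extra strided layers a larger input needs to keep the same 11x11 patch grid.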

Implementation
• Training (partially omitted)

Agenda 1. Paper overview 2. Implementation 3. Experimental results (could not fully reproduce the paper)

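Experimental results
• The parameters from the original paper did not work well, so several were changed (searched)
  – Size of the Discriminator's FCN:
    • Original paper: 64x64→11x11
    • Implementation: IR-MNIST: 224x224→11x11; UCSD: 112x112→11x11
  – Optimization:
    • Original paper: SGD with learning rate 2e-3 for both G and D, momentum 0.9
    • Implementation: Adam with G learning rate 1e-4 to 2e-4 and D learning rate 2e-5 to 1e-4
  – Noise coefficient γ:
    • Original paper: 0.4
    • Implementation: 0.6 to 0.7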

Experimental results
• IR-MNIST
  – Setting G's learning rate higher than D's lets reconstruction training proceed relatively smoothly

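Experimental results
• IR-MNIST
  – However, at test time the "3", which should not be reconstructable, is reconstructed anyway
  – Has the model learned the identity mapping? Is the noise insufficient?
Figures: original image + ground-truth mask; generated image; |generated image - original image|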

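Experimental results
• UCSD
  – To reflect temporal information, the data was preprocessed, following the original paper, into 3 channels made of three 2-frame differences
• G became able to produce plausible reconstructions
• D ends up outputting almost nothing but 0.5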
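One possible reading of "three 2-frame differences as 3 channels" is sketched below. The slide does not say which frames are paired or whether the difference is signed, so this example assumes four consecutive grayscale frames and signed differences of neighbouring frames; the function name is hypothetical and the author's preprocessing may differ.

```python
def frame_difference_channels(frames):
    """Turn 4 consecutive grayscale frames into 3 difference channels.

    Assumption: channel i is frames[i + 1] - frames[i] (signed difference).
    Each frame is a list of rows of numbers with identical shape.
    """
    if len(frames) != 4:
        raise ValueError("expected 4 consecutive frames")
    channels = []
    for prev, curr in zip(frames, frames[1:]):
        channels.append(
            [
                [c - p for p, c in zip(prev_row, curr_row)]
                for prev_row, curr_row in zip(prev, curr)
            ]
        )
    return channels


# Toy 2x2 frames whose brightness grows over time.
frames = [[[v, v], [v, v]] for v in (0, 1, 3, 6)]
channels = frame_difference_channels(frames)
```

Static background cancels out in every channel, so the network mostly sees moving objects.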

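Experimental results
• UCSD
  – In some cases anomaly detection appeared to work
Figures: original image (preprocessed); D(generated image); mask at α=0.4, ζ=0.49 and original image + mask; image generated by G; original image + ground-truth mask; |generated image - original image|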

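Experimental results
• UCSD
  – In some cases anomaly detection appeared to work
Figures: original image (preprocessed); D(generated image); mask at α=0.35, ζ=0.49 and original image + mask; image generated by G; original image + ground-truth mask; |generated image - original image|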

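Experimental results
• UCSD
  – To reflect temporal information, the data was preprocessed, following the original paper, into 3 channels made of three 2-frame differences
• At 224x224, training rarely went well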

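Impressions
• The original paper's hyperparameters did not work, and the model was quite sensitive to hyperparameters, which made the experiments hard
  – This was my first time implementing and experimenting with a GAN, so I struggled to get a feel for what matters
  – Unless the learning rates are set so that G > D, D becomes too strong and G outputs nothing but noise; but if D's training also fails to progress (it outputs only 0.5), D is useless at the anomaly detection stage
  – When the noise is weak, G seems to learn the identity mapping (it can reconstruct unknown objects at test time too)
  – Many points remained unclear, such as whether the authors really assumed a (64,64) input
  – Adding a reconstruction error as well might make training more stable (especially early on)
• The question of whether applying Gaussian noise at training time is enough to erase anomalies at test time was not resolved
  – If anyone gets the experiment to work, please let me know
• (Unrelated to the main topic) Normalizing images to -1~1 caused some trouble in visualization and similar steps
  – Converting a -1~1 tensor to PIL gives different values from converting a 0~1 tensor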
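One way to sidestep the visualization mismatch is to map values back to [0, 1] explicitly before quantizing to 8 bits. The sketch below is framework-free and uses hypothetical names (`to_uint8`, `value_range`); it illustrates the rescaling step only and does not reproduce the PIL conversion path used in the experiments.

```python
def to_uint8(pixels, value_range=(-1.0, 1.0)):
    """Rescale pixel values from `value_range` to integers in 0..255.

    Values outside the range are clipped before quantization.
    """
    low, high = value_range
    out = []
    for row in pixels:
        out.append(
            [
                int(round(255 * min(1.0, max(0.0, (px - low) / (high - low)))))
                for px in row
            ]
        )
    return out


# The same image expressed in the two normalizations gives the same 8-bit result.
from_signed = to_uint8([[-1.0, 0.0, 1.0]])                       # image in [-1, 1]
from_unit = to_uint8([[0.0, 0.5, 1.0]], value_range=(0.0, 1.0))  # image in [0, 1]
```

Passing the range explicitly keeps the display code correct regardless of which normalization the model was trained with.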