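The `torch.nn.init` module

Provides functions for initializing the weights and biases of each layer when implementing an ANN.

Initialization strongly affects both how quickly an ANN converges and how stable its training is.
`torch.nn.init` implements a range of commonly used initialization methods. Each function whose name ends in `_` modifies the given tensor in place.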

`.uniform_(tensor, a=0., b=1.)`

Initializes the given parameters with values drawn from a uniform distribution.
`a` and `b` set the lower and upper bounds of the range: [a, b)
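
A minimal sketch of the call described above. The tensor shape and the bounds `low` and `high` are arbitrary values chosen for illustration, not recommendations.

```python
import torch
import torch.nn.init as init

torch.manual_seed(0)  # fixed seed so the sketch is reproducible

# Hypothetical bounds, chosen only for illustration
low, high = -0.5, 0.5

w = torch.empty(3, 4)             # uninitialized storage
init.uniform_(w, a=low, b=high)   # fills w in place

# Every sampled value should fall inside the requested range
in_range = bool((w >= low).all() and (w <= high).all())
print(w)
print(in_range)
```

Because the function works in place, the same call can be applied directly to `layer.weight` of an existing layer.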

`.normal_(tensor, mean=0., std=1.)`

Initializes with values drawn from a normal distribution with mean `mean` and standard deviation `std`.
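
As a rough check, the sketch below fills a large tensor and compares the sample statistics with the requested ones. The value `std=0.02` and the tensor size are assumptions made for the example.

```python
import torch
import torch.nn.init as init

torch.manual_seed(0)

w = torch.empty(1000, 1000)            # large enough for stable statistics
init.normal_(w, mean=0.0, std=0.02)    # hypothetical std, for illustration only

sample_mean = w.mean().item()
sample_std = w.std().item()
print(f"mean={sample_mean:.4f}, std={sample_std:.4f}")
```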

`.constant_(tensor, val)`

Fills the tensor with the constant value given by `val`.

`.ones_(tensor)` and `.zeros_(tensor)`

Fill the tensor with 1 (`ones_`) or with 0 (`zeros_`).
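
The three constant-fill functions can be sketched together as follows. The layer sizes and the value 0.5 are arbitrary, and `scale` is a hypothetical tensor used only to show `ones_`.

```python
import torch
import torch.nn as nn
import torch.nn.init as init

layer = nn.Linear(4, 2)             # weight shape (2, 4), bias shape (2,)

init.constant_(layer.weight, 0.5)   # every weight becomes 0.5
init.zeros_(layer.bias)             # every bias becomes 0

scale = torch.empty(3)              # hypothetical extra tensor
init.ones_(scale)                   # every element becomes 1

print(layer.weight)
print(layer.bias)
print(scale)
```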

`.xavier_uniform_` and `.xavier_normal_`

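Glorot initialization (also called Xavier initialization), provided in a uniform-distribution variant and a normal-distribution variant.
Intended for activation functions that are symmetric around zero, such as a linear activation or tanh (whose output lies in [-1, 1]).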

```python
import torch
import torch.nn as nn
import torch.nn.init as init

# Example using Xavier Uniform initialization
linear_layer = nn.Linear(10, 5)
init.xavier_uniform_(linear_layer.weight)
# Optionally initialize the bias
init.constant_(linear_layer.bias, 0)

# Example using Xavier Normal initialization
linear_layer = nn.Linear(10, 5)
init.xavier_normal_(linear_layer.weight)
```
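
The scale that Xavier initialization uses is not stated in the text above. According to the PyTorch documentation, `xavier_uniform_` samples from a uniform distribution bounded by `gain * sqrt(6 / (fan_in + fan_out))`, and `xavier_normal_` uses a standard deviation of `gain * sqrt(2 / (fan_in + fan_out))`, with `gain` defaulting to 1. The sketch below checks both numerically. The layer size 400 x 200 is an assumption, picked so that the bound works out to 0.1.

```python
import math
import torch
import torch.nn as nn
import torch.nn.init as init

torch.manual_seed(0)

layer = nn.Linear(400, 200)              # weight shape (200, 400)
fan_out, fan_in = layer.weight.shape     # fan_out = 200, fan_in = 400

# Uniform variant: no weight should exceed the bound (gain = 1)
init.xavier_uniform_(layer.weight)
uniform_bound = math.sqrt(6.0 / (fan_in + fan_out))
max_abs_weight = layer.weight.abs().max().item()

# Normal variant: the sample std should be close to the expected std
init.xavier_normal_(layer.weight)
normal_std = math.sqrt(2.0 / (fan_in + fan_out))
measured_std = layer.weight.std().item()

print(f"bound: expected {uniform_bound:.4f}, max |w| {max_abs_weight:.4f}")
print(f"std:   expected {normal_std:.4f}, measured {measured_std:.4f}")
```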

`.kaiming_uniform_` and `.kaiming_normal_`

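He initialization (also called Kaiming initialization).
Again provided in a uniform-distribution variant and a normal-distribution variant.
- `a` : the negative slope of the Leaky ReLU that follows the layer. Both variants accept it, but it is used only when `nonlinearity='leaky_relu'` (a plain ReLU corresponds to a slope of 0).
- `mode` : either `'fan_in'` or `'fan_out'`. Determines whether the scaling is based on the number of input units or the number of output units.
- `nonlinearity` : the nonlinear activation in use. `'relu'` and `'leaky_relu'` are the intended values.
Used when the activation belongs to the ReLU family.
- When Xavier (Glorot) initialization is used with ReLU,
- the variance of the activations shrinks as the network gets deeper,
- and He initialization was proposed to compensate for this.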
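
The shrinking-variance effect can be observed with a small experiment. This is only a sketch: `final_activation_std` is a hypothetical helper, and the depth (20), width (256), and batch size (1024) are arbitrary choices. It pushes random inputs through a stack of bias-free linear layers with ReLU and reports the standard deviation of the final activations.

```python
import torch
import torch.nn as nn
import torch.nn.init as init

def final_activation_std(init_fn, depth=20, width=256):
    """Std of the activations after `depth` Linear+ReLU layers (hypothetical helper)."""
    torch.manual_seed(0)                     # same inputs and draws for every run
    x = torch.randn(1024, width)
    with torch.no_grad():
        for _ in range(depth):
            layer = nn.Linear(width, width, bias=False)
            init_fn(layer.weight)            # re-initialize in place
            x = torch.relu(layer(x))
    return x.std().item()

xavier_std = final_activation_std(init.xavier_normal_)
kaiming_std = final_activation_std(
    lambda w: init.kaiming_normal_(w, mode='fan_in', nonlinearity='relu')
)

print(f"Xavier  + ReLU: final activation std = {xavier_std:.6f}")
print(f"Kaiming + ReLU: final activation std = {kaiming_std:.6f}")
```

In this setup the Xavier run should end with activations several orders of magnitude smaller than the Kaiming run, which is the behavior the list above describes.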

```python
import torch.nn as nn
import torch.nn.init as init

# Example using Kaiming Uniform initialization
linear_layer = nn.Linear(10, 5)
init.kaiming_uniform_(linear_layer.weight, mode='fan_in', nonlinearity='relu')
# Optionally initialize the bias
init.constant_(linear_layer.bias, 0)

# Example using Kaiming Normal initialization
linear_layer = nn.Linear(10, 5)
init.kaiming_normal_(linear_layer.weight, mode='fan_in', nonlinearity='relu')
```

For `.kaiming_normal_`, the variance of the weights is as follows, where fan_mode is the fan-in or fan-out selected by `mode` (with `nonlinearity='relu'`, $a$ is taken as 0).

$$\sigma^2(\omega) = \frac{2}{(1 + a^2) \times \text{fan\_mode}}$$

`.kaiming_uniform_` draws from a uniform distribution over the following range.

$$\left[-\sqrt{\frac{6}{(1 + a^2) \times \text{fan\_mode}}},\ +\sqrt{\frac{6}{(1 + a^2) \times \text{fan\_mode}}}\right)$$
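
The variance and range given above can be checked numerically. The sketch below assumes a 512-input layer and a Leaky ReLU slope of `a = 0.2`, both chosen only for illustration. It compares the measured statistics with the values computed from the formulas.

```python
import math
import torch
import torch.nn as nn
import torch.nn.init as init

torch.manual_seed(0)

layer = nn.Linear(512, 256)        # weight shape (256, 512)
fan_in = layer.weight.shape[1]     # 512, used because mode='fan_in'
a = 0.2                            # hypothetical Leaky ReLU negative slope

# kaiming_normal_: expected std = sqrt(2 / ((1 + a^2) * fan_in))
init.kaiming_normal_(layer.weight, a=a, mode='fan_in', nonlinearity='leaky_relu')
expected_std = math.sqrt(2.0 / ((1 + a**2) * fan_in))
measured_std = layer.weight.std().item()

# kaiming_uniform_: expected bound = sqrt(6 / ((1 + a^2) * fan_in))
init.kaiming_uniform_(layer.weight, a=a, mode='fan_in', nonlinearity='leaky_relu')
expected_bound = math.sqrt(6.0 / ((1 + a**2) * fan_in))
measured_max = layer.weight.abs().max().item()

print(f"std:   expected {expected_std:.4f}, measured {measured_std:.4f}")
print(f"bound: expected {expected_bound:.4f}, max |w| {measured_max:.4f}")
```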

`.orthogonal_`

Initializes the weight so that the corresponding matrix is a (semi-)orthogonal matrix. The tensor must have at least 2 dimensions.
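
A quick way to see the orthogonality is to multiply the initialized matrix by its transpose. The 3 x 5 shape below is an arbitrary choice. With fewer rows than columns, the rows come out orthonormal, so the product should be the 3 x 3 identity up to floating-point error.

```python
import torch
import torch.nn.init as init

torch.manual_seed(0)

w = torch.empty(3, 5)      # fewer rows than columns
init.orthogonal_(w)

# Rows are orthonormal, so W @ W.T should be (approximately) the identity
gram = w @ w.T
is_orthonormal = torch.allclose(gram, torch.eye(3), atol=1e-5)

print(gram)
print(is_orthonormal)
```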

`.sparse_`

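Fills a 2-D tensor as a sparse matrix: the non-zero elements are drawn from a normal distribution with mean 0 and standard deviation `std`.
`sparsity` sets the fraction of elements in each column that are set to zero.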
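
A small sketch of how `sparsity` plays out in practice, assuming a 10 x 4 tensor and `sparsity=0.5` (both arbitrary). It counts the zero entries in each column after initialization.

```python
import torch
import torch.nn.init as init

torch.manual_seed(0)

w = torch.empty(10, 4)                    # sparse_ accepts only 2-D tensors
init.sparse_(w, sparsity=0.5, std=0.01)   # zero out half of each column

# Count the zero entries column by column
zeros_per_column = (w == 0).sum(dim=0).tolist()

print(w)
print(zeros_per_column)
```

With 10 rows and `sparsity=0.5`, five entries in every column are set to zero, while the remaining entries keep their normally distributed values.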

Further reading

https://dsaint31.me/mkdocs_site/ML/ch09/weight_initializations/

