[PyTorch][BERT] Understanding the BERT Source Code

BertSelfAttention

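1) Create the key, query, and value

: Project the hidden states (size hidden_size) through three linear layers, then split each result across the attention heads, so that each head computes attention on only hidden_size / num_attention_heads dimensions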

2) Compute the attention scores

3) Compute the context vector (= output)
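The three steps above can be sketched roughly as follows. This is a simplified illustration, not the library's actual class: `TinySelfAttention` and `split_heads` are names made up for this example, and the attention mask and dropout are left out to keep the flow visible.

```python
import math

import torch
from torch import nn


class TinySelfAttention(nn.Module):
    """Simplified sketch of the three steps (hypothetical class, no mask, no dropout)."""

    def __init__(self, hidden_size=768, num_attention_heads=12):
        super().__init__()
        if hidden_size % num_attention_heads != 0:
            raise ValueError("hidden_size must be divisible by num_attention_heads")
        self.num_attention_heads = num_attention_heads
        self.attention_head_size = hidden_size // num_attention_heads
        self.all_head_size = self.num_attention_heads * self.attention_head_size

        # Step 1: one linear projection each for query, key, and value.
        self.query = nn.Linear(hidden_size, self.all_head_size)
        self.key = nn.Linear(hidden_size, self.all_head_size)
        self.value = nn.Linear(hidden_size, self.all_head_size)

    def split_heads(self, x):
        # (batch, seq, all_head_size) -> (batch, heads, seq, head_size)
        batch, seq_len, _ = x.shape
        x = x.view(batch, seq_len, self.num_attention_heads, self.attention_head_size)
        return x.permute(0, 2, 1, 3)

    def forward(self, hidden_states):
        # Step 1: create query, key, value and split them per head.
        q = self.split_heads(self.query(hidden_states))
        k = self.split_heads(self.key(hidden_states))
        v = self.split_heads(self.value(hidden_states))

        # Step 2: scaled dot-product attention scores, normalized with softmax.
        scores = torch.matmul(q, k.transpose(-1, -2)) / math.sqrt(self.attention_head_size)
        probs = torch.softmax(scores, dim=-1)

        # Step 3: context vector = weighted sum of values, heads merged back together.
        context = torch.matmul(probs, v)
        context = context.permute(0, 2, 1, 3).contiguous()
        batch, seq_len = context.shape[:2]
        context = context.view(batch, seq_len, self.all_head_size)
        return context, probs


layer = TinySelfAttention(hidden_size=64, num_attention_heads=4)
x = torch.randn(2, 5, 64)  # (batch, seq_len, hidden_size)
context, probs = layer(x)
print(context.shape)  # (batch, seq_len, hidden_size)
print(probs.shape)    # (batch, heads, seq_len, seq_len)
```

The output keeps the same last dimension as the input, because the per-head results are concatenated back together at the end.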

<In the source code>

✔ num_attention_heads

: (int, optional, defaults to 12)

: Number of attention heads for each attention layer in the Transformer encoder.
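To see how num_attention_heads determines the size each head works on, here is a small arithmetic sketch. The helper `head_sizes` is a hypothetical name for this example, and the hidden_size of 768 is an assumption (the BERT-base value); the text above only states the default of 12 heads.

```python
def head_sizes(hidden_size=768, num_attention_heads=12):
    """Return (size per head, total size across all heads). Hypothetical helper."""
    if hidden_size % num_attention_heads != 0:
        raise ValueError("hidden_size must be a multiple of num_attention_heads")
    attention_head_size = hidden_size // num_attention_heads
    all_head_size = num_attention_heads * attention_head_size
    return attention_head_size, all_head_size


print(head_sizes())         # (64, 768)
print(head_sizes(1024, 16)) # (64, 1024)
```

With these values each head attends over 64 dimensions, and the heads together cover the full hidden size again.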
