[Pytorch][BERT] Understanding the BERT Source Code

[Pytorch][BERT] Understanding the BERT Source Code: Table of Contents
BERT	📑 BERT Config 👀
	📑 BERT Tokenizer
	📑 BERT Model	📑 BERT Input
		📑 BERT Output
		📑 BERT Embedding
		📑 BERT Pooler
		📑 BERT Encoder	📑 BERT Layer	📑 BERT SelfAttention
				📑 BERT SelfOutput

BertConfig

<Source code> configuration_bert.py

# Excerpt: only the __init__ signature is shown.
# PretrainedConfig is the base configuration class of the transformers library.
class BertConfig(PretrainedConfig):
    def __init__(
        self,
        vocab_size=30522,
        hidden_size=768,
        num_hidden_layers=12,
        num_attention_heads=12,
        intermediate_size=3072,
        hidden_act="gelu",
        hidden_dropout_prob=0.1,
        attention_probs_dropout_prob=0.1,
        max_position_embeddings=512,
        type_vocab_size=2,
        initializer_range=0.02,
        layer_norm_eps=1e-12,
        pad_token_id=0,
        position_embedding_type="absolute",
        use_cache=True,
        classifier_dropout=None,
        **kwargs
    ):
        ...  # the rest of __init__ is omitted in this excerpt
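
To get a feel for how these keyword defaults behave, here is a minimal sketch that uses only the standard library. `MiniBertConfig` is a hypothetical stand-in written for this post, not the real class; it only mirrors the default values from the signature above and shows how a smaller variant overrides a few of them while keeping the rest.

```python
from dataclasses import asdict, dataclass, replace
from typing import Optional


@dataclass(frozen=True)
class MiniBertConfig:
    """Hypothetical stand-in that mirrors the BertConfig defaults shown above."""

    vocab_size: int = 30522
    hidden_size: int = 768
    num_hidden_layers: int = 12
    num_attention_heads: int = 12
    intermediate_size: int = 3072
    hidden_act: str = "gelu"
    hidden_dropout_prob: float = 0.1
    attention_probs_dropout_prob: float = 0.1
    max_position_embeddings: int = 512
    type_vocab_size: int = 2
    initializer_range: float = 0.02
    layer_norm_eps: float = 1e-12
    pad_token_id: int = 0
    position_embedding_type: str = "absolute"
    use_cache: bool = True
    classifier_dropout: Optional[float] = None


# Default (BERT-base sized) configuration.
base = MiniBertConfig()

# A smaller variant: override a few fields, keep every other default.
small = replace(
    base,
    hidden_size=512,
    num_hidden_layers=6,
    num_attention_heads=8,
    intermediate_size=2048,
)

# List only the fields that differ from the defaults.
changed = {k: v for k, v in asdict(small).items() if asdict(base)[k] != v}
print(changed)
```

With the `transformers` package installed, the same idea would be written by passing keyword arguments to the real class, for example `BertConfig(num_hidden_layers=6)`.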

🐸 Parameters in the Config

parameter	description	default
vocab_size	Vocabulary size of BERT, i.e. the number of distinct tokens	30522
hidden_size	Dimensionality of the encoder and pooler layers	768
num_hidden_layers	Number of hidden layers in the encoder	12
num_attention_heads	Number of attention heads in each attention layer of the encoder	12
intermediate_size	Dimensionality of the encoder's intermediate (= feed-forward) layer	3072
hidden_act	Activation function of the encoder and pooler	"gelu"
hidden_dropout_prob	Dropout probability for every fully connected layer in the embeddings, encoder, and pooler	0.1
attention_probs_dropout_prob	Dropout ratio for the attention probabilities	0.1
max_position_embeddings	Maximum sequence length the model can handle	512
type_vocab_size	Vocabulary size of token_type_ids	2
initializer_range	Standard deviation used to initialize all weight matrices	0.02
layer_norm_eps	Epsilon used by the layer normalization layers	1e-12
position_embedding_type	Type of position embedding ("absolute", "relative_key", "relative_key_query")	"absolute"
use_cache	Whether the model returns the last key/value attentions (only meaningful when is_decoder=True)	True
classifier_dropout	Dropout ratio for the classification head	None
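
As a rough illustration of how these sizes combine, the sketch below estimates the number of trainable parameters implied by the defaults. It assumes the standard BERT layout (word, position, and token-type embeddings followed by LayerNorm; query, key, value, and output projections per layer; a two-layer feed-forward block; a pooler) and no task-specific head. `count_bert_parameters` is a hypothetical helper written for this post, not a library function.

```python
def count_bert_parameters(
    vocab_size=30522,
    hidden_size=768,
    num_hidden_layers=12,
    intermediate_size=3072,
    max_position_embeddings=512,
    type_vocab_size=2,
):
    """Estimate parameter counts for a BERT encoder with pooler (assumed layout)."""

    def linear(n_in, n_out):
        # A fully connected layer has a weight matrix plus a bias vector.
        return n_in * n_out + n_out

    layer_norm = 2 * hidden_size  # LayerNorm weight + bias

    embeddings = (
        vocab_size * hidden_size                 # word embeddings
        + max_position_embeddings * hidden_size  # position embeddings
        + type_vocab_size * hidden_size          # token type embeddings
        + layer_norm
    )
    # Query, key, value, and output projections, then LayerNorm.
    attention = 4 * linear(hidden_size, hidden_size) + layer_norm
    # hidden -> intermediate -> hidden, then LayerNorm.
    feed_forward = (
        linear(hidden_size, intermediate_size)
        + linear(intermediate_size, hidden_size)
        + layer_norm
    )
    encoder = num_hidden_layers * (attention + feed_forward)
    pooler = linear(hidden_size, hidden_size)

    return {
        "embeddings": embeddings,
        "encoder": encoder,
        "pooler": pooler,
        "total": embeddings + encoder + pooler,
    }


counts = count_bert_parameters()
for name, value in counts.items():
    print(f"{name:>10}: {value:,}")
```

With the default values the total comes to 109,482,240, i.e. roughly 110M parameters. Note that num_attention_heads does not appear in the calculation: the heads only split hidden_size into equal slices, so changing their number does not change the parameter count.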

<English version: original docstring wording>

parameter	description	default
vocab_size	Vocabulary size of the BERT model. Defines the number of different tokens that can be represented by the `inputs_ids` passed when calling [BertModel] or [TFBertModel].	30522
hidden_size	Dimensionality of the encoder layers and the pooler layer.	768
num_hidden_layers	Number of hidden layers in the Transformer encoder.	12
num_attention_heads	Number of attention heads for each attention layer in the Transformer encoder.	12
intermediate_size	Dimensionality of the "intermediate" (often named feed-forward) layer in the Transformer encoder.	3072
hidden_act	The non-linear activation function (function or string) in the encoder and pooler. If string, "gelu", …	"gelu"
hidden_dropout_prob	The dropout probability for all fully connected layers in the embeddings, encoder, and pooler.	0.1
attention_probs_dropout_prob	The dropout ratio for the attention probabilities.	0.1
max_position_embeddings	The maximum sequence length that this model might ever be used with. Typically set this to something large …	512
type_vocab_size	The vocabulary size of the token_type_ids passed when calling [BertModel] or [TFBertModel].	2
initializer_range	The standard deviation of the truncated_normal_initializer for initializing all weight matrices.	0.02
layer_norm_eps	The epsilon used by the layer normalization layers.	1e-12
position_embedding_type	Type of position embedding. Choose one of "absolute", "relative_key", "relative_key_query". For positional embeddings use "absolute". For more information on `"relative_key"` …	"absolute"
use_cache	Whether or not the model should return the last key/values attentions (not used by all models). Only relevant if config.is_decoder=True.	True
classifier_dropout	The dropout ratio for the classification head.	None
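
Several of these parameters constrain each other. The sketch below collects a few sanity checks that follow from the descriptions above: hidden_size is split evenly across the attention heads, the position embedding type must be one of the three listed values, and an input sequence should not be longer than max_position_embeddings. `check_config` and its `sequence_length` argument are hypothetical names introduced for illustration; the real library performs its own checks in its own places.

```python
VALID_POSITION_TYPES = ("absolute", "relative_key", "relative_key_query")


def check_config(
    hidden_size=768,
    num_attention_heads=12,
    max_position_embeddings=512,
    position_embedding_type="absolute",
    hidden_dropout_prob=0.1,
    attention_probs_dropout_prob=0.1,
    sequence_length=128,  # hypothetical: length of the input you plan to feed
):
    """Run illustrative sanity checks and return the per-head dimension."""
    # Each attention head works on an equal slice of the hidden vector.
    if hidden_size % num_attention_heads != 0:
        raise ValueError(
            f"hidden_size ({hidden_size}) is not a multiple of "
            f"num_attention_heads ({num_attention_heads})"
        )
    if position_embedding_type not in VALID_POSITION_TYPES:
        raise ValueError(f"unknown position_embedding_type: {position_embedding_type!r}")
    # The position table has only max_position_embeddings rows.
    if sequence_length > max_position_embeddings:
        raise ValueError(
            f"sequence_length ({sequence_length}) exceeds "
            f"max_position_embeddings ({max_position_embeddings})"
        )
    # Dropout values are probabilities.
    for name, p in (
        ("hidden_dropout_prob", hidden_dropout_prob),
        ("attention_probs_dropout_prob", attention_probs_dropout_prob),
    ):
        if not 0.0 <= p <= 1.0:
            raise ValueError(f"{name} must be in [0, 1], got {p}")
    return hidden_size // num_attention_heads


head_size = check_config()
print(head_size)  # 768 // 12 = 64
```

With the defaults each head therefore works on a 64-dimensional slice of the 768-dimensional hidden state.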


Hyen4110

[Pytorch][BERT] Understanding the BERT Source Code_② BertConfig
