Anonymous
_batch_encode_plus() got an unexpected keyword argument 'return_attention_masks'
Post
by Anonymous » 17 Mar 2025, 16:01
I am working with a RoBERTa model to detect emotions in tweets, on Google Colab, following this notebook from Kaggle:
https://www.kaggle.com/ishivinal/tweet- ... d=38608295
Code snippet:
Code: Select all
def regular_encode(texts, tokenizer, maxlen=512):
    enc_di = tokenizer.batch_encode_plus(
        texts,
        return_attention_masks=True,
        return_token_type_ids=False,
        pad_to_max_length=True,
        #padding=True,
        max_length=maxlen
    )
    return np.array(enc_di['input_ids'])

def build_model(transformer, max_len=160):
    input_word_ids = Input(shape=(max_len,), dtype=tf.int32, name="input_word_ids")
    sequence_output = transformer(input_word_ids)[0]
    cls_token = sequence_output[:, 0, :]
    out = Dense(13, activation='softmax')(cls_token)
    model = Model(inputs=input_word_ids, outputs=out)
    model.compile(Adam(lr=1e-5), loss='categorical_crossentropy', metrics=['accuracy'])
    return model

AUTO = tf.data.experimental.AUTOTUNE
MODEL = 'roberta-base'
tokenizer = AutoTokenizer.from_pretrained(MODEL)

X_train_t = regular_encode(X_train, tokenizer, maxlen=max_len)
X_test_t = regular_encode(X_test, tokenizer, maxlen=max_len)
In the regular_encode part I get the following error:
TypeError Traceback (most recent call last)
in ()
----> 1 X_train_t = regular_encode(X_train, tokenizer, maxlen= max_len)
2 X_test_t = regular_encode(X_test, tokenizer, maxlen=max_len)
2 frames
/usr/local/lib/python3.7/dist-packages/transformers/models/gpt2/tokenization_gpt2_fast.py in _batch_encode_plus(self, *args, **kwargs)
161 )
162
--> 163 return super()._batch_encode_plus(*args, **kwargs)
164
165 def _encode_plus(self, *args, **kwargs) -> BatchEncoding:
TypeError: _batch_encode_plus() got an unexpected keyword argument 'return_attention_masks'
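From the traceback it looks as if the keyword name may simply have changed between the transformers version used in the Kaggle notebook and the one installed on Colab. The variant below is what I would try next (a minimal sketch, assuming a transformers 4.x tokenizer where the argument is return_attention_mask, singular, and pad_to_max_length has been replaced by padding/truncation), but I am not sure it is the intended fix:
Code: Select all
def regular_encode(texts, tokenizer, maxlen=512):
    # Uses the same np / tokenizer as above.
    # 'return_attention_mask' (singular) replaces the old 'return_attention_masks';
    # 'padding'/'truncation' replace the deprecated 'pad_to_max_length'.
    enc_di = tokenizer.batch_encode_plus(
        texts,
        return_attention_mask=True,
        return_token_type_ids=False,
        padding='max_length',
        truncation=True,
        max_length=maxlen,
    )
    return np.array(enc_di['input_ids'])
Is this the right way to adapt the notebook, or is something else going on?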