mirror of
https://github.com/huggingface/transformers.git
synced 2025-08-01 02:31:11 +06:00
Added missing code in example notebook - custom datasets fine-tuning (#15300)
* Added missing code in example notebook - custom datasets fine-tuning

Added the missing code in the `tokenize_and_align_labels` function in the example notebook on custom datasets (token classification). The missing code assigns labels for all but the first token in a single word. The added code was taken directly from the official Hugging Face example, this [colab notebook](https://github.com/huggingface/notebooks/blob/master/transformers_doc/custom_datasets.ipynb).

* Changes requested in the review: keep the code as simple as possible
This commit is contained in:
parent
0501beb846
commit
e79a0faeae
@@ -326,7 +326,9 @@ def tokenize_and_align_labels(examples):
                 label_ids.append(-100)
-            else:
+            elif word_idx != previous_word_idx:  # Only label the first token of a given word.
                 label_ids.append(label[word_idx])
+            else:
+                label_ids.append(-100)
             previous_word_idx = word_idx
         labels.append(label_ids)
 
     tokenized_inputs["labels"] = labels
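The alignment logic in the hunk above can be tried in isolation. The sketch below is a minimal, self-contained version, assuming a hand-made `word_ids` list in place of a real tokenizer's `tokenized_inputs.word_ids(batch_index=i)`; the function name `align_labels` is hypothetical, introduced only for this illustration.

```python
def align_labels(word_ids, label):
    """Map word-level labels onto subword tokens.

    Special tokens (word_idx is None) and all but the first subword of
    each word get -100, which the loss function ignores.
    """
    previous_word_idx = None
    label_ids = []
    for word_idx in word_ids:
        if word_idx is None:
            label_ids.append(-100)  # special token, e.g. [CLS] or [SEP]
        elif word_idx != previous_word_idx:  # Only label the first token of a given word.
            label_ids.append(label[word_idx])
        else:
            label_ids.append(-100)  # continuation subword of the same word
        previous_word_idx = word_idx
    return label_ids


# "HuggingFace" splits into two subwords, both mapped to word index 1,
# so only the first subword keeps the word's label.
word_ids = [None, 0, 1, 1, 2, None]   # [CLS] hello Hugging ##Face ! [SEP]
labels = [3, 4, 0]                    # one tag per word
print(align_labels(word_ids, labels))  # → [-100, 3, 4, -100, 0, -100]
```

Without the added `else` branch, continuation subwords would be left unlabeled (or mislabeled), which is exactly the gap this commit closes in the notebook.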