mirror of
https://github.com/huggingface/transformers.git
synced 2025-07-31 10:12:23 +06:00
Update run_lm_finetuning.py
The previous method name, as written, did not exist in the tokenizer class; it is renamed to the method that actually exists (`add_special_tokens_single_sequence`).
This commit is contained in:
parent
ca559826c4
commit
9478590630
@ -75,7 +75,7 @@ class TextDataset(Dataset):
|
||||
tokenized_text = tokenizer.convert_tokens_to_ids(tokenizer.tokenize(text))
|
||||
|
||||
for i in range(0, len(tokenized_text)-block_size+1, block_size): # Truncate in block of block_size
|
||||
self.examples.append(tokenizer.add_special_tokens_single_sentence(tokenized_text[i:i+block_size]))
|
||||
self.examples.append(tokenizer.add_special_tokens_single_sequence(tokenized_text[i:i+block_size]))
|
||||
# Note that we are losing the last truncated example here for the sake of simplicity (no padding)
|
||||
# If your dataset is small, first you should look for a bigger one :-) and second you
|
||||
# can change this behavior by adding (model specific) padding.
|
||||
|
Loading…
Reference in New Issue
Block a user