* starting attn refactor for encoder decoder models via bart (eager + sdpa)
* flash attention works, remove unnecessary code
* flex attention support for bart!, gotta check if the renaming is not too aggressive
* some comments
* skip flex grad test for standalone as done with the other test
* revert flex attn rename (for now), sdpa simplify, and todos
* more todos
* refactor mask creation for reuse
* modular attempt at biogpt
* first batch of other models
* fix attn dropout
* fix autoformer copies
* hubert
* another batch of models
* copies/style + last round of bart models --> whisper next?
* remove unnecessary _reshape function and remove copy to whisper
* add skip for decoder-only models out of enc-dec (same as in bart)
* bring back licences
* remove comment, added to pr read instead
* mostly docs
* disable sew flex attn as it's unclear attn mask for now
* oops
* test fixes for enc-dec
* torch fx fixes + try at flex attn
* skip on mbart
* some more fixes
* musicgen skip / delete old attn class logic + sdpa compose compile skip
* disable flex attn for musicgen, not worth the effort
* more fixes and style
* flex attention test for dropout and encoder decoder that dont have main input names
* informer fixes
* the weirdest thing I've encountered yet...
* style
* remove empty tensor attempt, found core root in previous commits
* disable time series due to tests being very text centric on inputs
* add speech to text to be ignoring the other attns, also due to tests
* update docs
* remaining issues resolved ?
* update docs for current state --> nllb moe and pegasus x sdpa is questionable :D
* some models have not set the is_causal flag...
* change dtype in softmax tol old behaviour + some modular fixes
* I hate it but it is what it is
* fixes from main for bart
* forgot this one
* some model fixes
* style
* current status
* marian works now
* fixing some copies
* some copy fixes + time series x informer
* last models possibly and fixes on style/copies
* some post merge fixes
* more fixes
* make attention interface callable and move warnings there
* style lol
* add comment to "unsupported"
* remove callable interface and change interface warnings + some copies
* fix
* ternary is ugly af, make it simpler
* how did that happen
* fix flex attn test
* failing the test
* no more fallback! fixing copies next
* style + attn fixed
* fixing copies and mask creation
* wrong copy
* fixup tests and disable flex attn for now
* fixup last tests?
* tmp commit
* move tests to the right class
* remove ALL all_generative_model_classes = ...
* skip tf roberta
* skip InstructBlipForConditionalGenerationDecoderOnlyTest
* videollava
* reduce diff
* reduce diff
* remove on vlms
* fix a few more
* manual rebase bits
* more manual rebase
* remove all manual generative model class test entries
* fix up to ernie
* a few more removals
* handle remaining cases
* recurrent gemma
* it's better here
* make fixup
* tf idefics is broken
* tf bert + generate is broken
* don't touch tf :()
* don't touch tf :(
* make fixup
* better comments for test skips
* revert tf changes
* remove empty line removal
* one more
* missing one
* Remove ConversationalPipeline and Conversation object, as they have been deprecated for some time and are due for removal
* Update not-doctested.txt
* Fix JA and ZH docs
* Fix JA and ZH docs some more
* Fix JA and ZH docs some more
* left-padding test revisited
* Apply suggestions from code review
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
---------
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
* try to stylify using ruff
* might need to remove these changes?
* use ruf format andruff check
* use isinstance instead of type comparision
* use # fmt: skip
* use # fmt: skip
* nits
* soem styling changes
* update ci job
* nits isinstance
* more files update
* nits
* more nits
* small nits
* check and format
* revert wrong changes
* actually use formatter instead of checker
* nits
* well docbuilder is overwriting this commit
* revert notebook changes
* try to nuke docbuilder
* style
* fix feature exrtaction test
* remve `indent-width = 4`
* fixup
* more nits
* update the ruff version that we use
* style
* nuke docbuilder styling
* leve the print for detected changes
* nits
* Remove file I/O
Co-authored-by: charliermarsh
<charlie.r.marsh@gmail.com>
* style
* nits
* revert notebook changes
* Add # fmt skip when possible
* Add # fmt skip when possible
* Fix
* More ` # fmt: skip` usage
* More ` # fmt: skip` usage
* More ` # fmt: skip` usage
* NIts
* more fixes
* fix tapas
* Another way to skip
* Recommended way
* Fix two more fiels
* Remove asynch
Remove asynch
---------
Co-authored-by: charliermarsh <charlie.r.marsh@gmail.com>
* stronger GC tests
* better tests and skip failing tests
* break down into 3 sub-tests
* break down into 3 sub-tests
* refactor a bit
* more refactor
* fix
* last nit
* credits contrib and suggestions
* credits contrib and suggestions
---------
Co-authored-by: Yih-Dar <2521628+ydshieh@users.noreply.github.com>
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* fix
* last attempt
* current work
* fix forward compatibility
* save all special tokens
* current state
* revert additional changes
* updates
* remove tokenizer.model
* add a test and the fix
* nit
* revert one more break
* fix typefield issue
* quality
* more tests
* fix fields for FC
* more nits?
* new additional changes
* how
* some updates
* simplify all
* more nits
* revert some things to original
* nice
* nits
* a small hack
* more nits
* ahhaha
* fixup
* update
* make test run on ci
* use subtesting
* update
* Update .circleci/create_circleci_config.py
* updates
* fixup
* nits
* replace typo
* fix the test
* nits
* update
* None max dif pls
* a partial fix
* had to revert one thing
* test the fast
* updates
* fixup
* and more nits
* more fixes
* update
* Oupsy 👁️
* nits
* fix marian
* on our way to heaven
* Update src/transformers/models/t5/tokenization_t5.py
Co-authored-by: Lysandre Debut <hi@lysand.re>
* fixup
* Update src/transformers/tokenization_utils_fast.py
Co-authored-by: Leo Tronchon <leo.tronchon@gmail.com>
* Update src/transformers/tokenization_utils_base.py
Co-authored-by: Leo Tronchon <leo.tronchon@gmail.com>
* fix phobert
* skip some things, test more
* nits
* fixup
* fix deberta
* update
* update
* more updates
* skip one test
* more updates
* fix camembert
* can't test this one
* more good fixes
* kind of a major update
- seperate what is only done in fast in fast init and refactor
- add_token(AddedToken(..., speicla = True)) ignores it in fast
- better loading
* fixup
* more fixups
* fix pegasus and mpnet
* remove skipped tests
* fix phoneme tokenizer if self.verbose
* fix individual models
* update common tests
* update testing files
* all over again
* nits
* skip test for markup lm
* fixups
* fix order of addition in fast by sorting the added tokens decoder
* proper defaults for deberta
* correct default for fnet
* nits on add tokens, string initialized to special if special
* skip irrelevant herbert tests
* main fixes
* update test added_tokens_serialization
* the fix for bart like models and class instanciating
* update bart
* nit!
* update idefix test
* fix whisper!
* some fixup
* fixups
* revert some of the wrong chanegs
* fixup
* fixup
* skip marian
* skip the correct tests
* skip for tf and flax as well
---------
Co-authored-by: Lysandre Debut <hi@lysand.re>
Co-authored-by: Leo Tronchon <leo.tronchon@gmail.com>
* fix test for bart. Order is correct now let's skip BPEs
* ouf
* styling
* fix bert....
* slow refactoring
* current updates
* massive refactoring
* update
* NICE!
* update to see where I am at
* updates
* update
* update
* revert
* updates
* updates
* start supporting legacy_save
* styling
* big update
* revert some changes
* nits
* nniiiiiice
* small fixes
* kinda fix t5 with new behaviour
* major update
* fixup
* fix copies
* today's updates
* fix byt5
* upfate
* update
* update
* updates
* update vocab size test
* Barthez does not use not need the fairseq offset ids
* super calll must be after
* calll super
* move all super init
* move other super init
* fixup
* nits
* more fixes
* nits
* more fixes
* nits
* more fix
* remove useless files
* ouch all of them are affected
* and more!
* small imporvements
* no more sanitize token
* more changes around unique no split tokens
* partially fix more things
* keep legacy save but add warning
* so... more fixes
* updates
* guess deberta tokenizer could be nuked
* fixup
* fixup did some bad things
* nuke it if it breaks
* remove prints and pretrain fast from slow with new format.
* fixups
* Apply suggestions from code review
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* fiou
* nit
* by default specials should not be normalized?
* update
* remove brakpoint
* updates
* a lot of updates
* fixup
* fixes revert some changes to match fast
* small nits
* that makes it cleaner
* fix camembert accordingly
* update
* some lest breaking changes
* update
* fixup
* fix byt5 and whisper mostly
* some more fixes, canine's byte vocab
* fix gpt2
* fix most of the perceiver tests (4 left)
* fix layout lmv3
* fixup
* fix copies for gpt2 style
* make sure to only warn once
* fix perciever and gpt2 tests
* some more backward compatibility: also read special tokens map because some ppl use it........////.....
* fixup
* add else when reading
* nits
* fresh updates
* fix copies
* will this make everything faster?
* fixes
* more fixes
* update
* more fixes
* fixup
* is the source of truth right?
* sorry camembert for the troubles
* current updates
* fixup
* update led
* update
* fix regression
* fix single word
* more model specific fixes
* fix t5 tests
* fixup
* more comments
* update
* fix nllb
* rstrip removed
* small fixes
* better handle additional_special_tokens and vocab sizes
* fixing
* styling
* fix 4 / 21
* fixup
* fix nlbb's tests
* some fixes
* fix t5
* fixes
* style
* fix canine tests
* damn this is nice
* nits
* m2m100 nit
* fixups
* fixes!
* fixup
* stash
* fix merge
* revert bad change
* fixup
* correct order for code Llama
* fix speecht5 post merge
* styling
* revert source of 11 fails
* small nits
* all changes in one go
* fnet hack
* fix 2 more tests
* update based on main branch of tokenizers
* fixup
* fix VITS issues
* more fixes
* fix mgp test
* fix camembert issues
* oups camembert still has 2 failing tests
* mluke fixes
* decode fixes
* small nits
* nits
* fix llama and vits
* fix camembert
* smal nits
* more fixes when initialising a fast from a slow and etc
* fix one of the last test
* fix CPM tokenizer test
* fixups
* fix pop2piano
* fixup
* ⚠️ Change tokenizers required version ⚠️
* ⚠️ Change tokenizers required version ⚠️
* "tokenizers>=0.14,<0.15", don't forget smaller than
* fix musicgen tests and pretraiendtokenizerfast
* fix owlvit and all
* update t5
* fix 800 red
* fix tests
* fix the fix of the fix of t5
* styling
* documentation nits
* cache _added_tokens_encoder
* fixups
* Nit
* fix red tests
* one last nit!
* make eveything a lot simpler
* Now it's over 😉
* few small nits
* Apply suggestions from code review
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* updates that work for now
* tests that should no be skipped / changed and fixed next
* fixup
* i am ashamed
* pushe the fix
* update
* fixups
* nits
* fix added_tokens_encoder
* fix canine test
* fix pegasus vocab
* fix transfoXL
* fixup
* whisper needs to be fixed for train new
* pegasus nits
* more pegasus fixes
* minor update
* better error message in failed test
* fix whisper failing test
* fix whisper failing test
* fix pegasus
* fixup
* fix **** pegasus
* reset things
* remove another file
* attempts to fix the strange custome encoder and offset
* nits here and there
* update
* fixup
* nit
* fix the whisper test
* nits nits
* Apply suggestions from code review
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* updates based on review
* some small update to potentially remove
* nits
* import rlu cache
* Update src/transformers/tokenization_utils_base.py
Co-authored-by: Lysandre Debut <hi@lysand.re>
* move warning to `from_pretrained`
* update tests results now that the special tokens are always added
---------
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
Co-authored-by: Lysandre Debut <hi@lysand.re>
* hidden layers, huh, what are they good for (absolutely nothing)
* Some tests break with 1 hidden layer, use 2
* Use 1 hidden layer in a few slow models
* Use num_hidden_layers=2 everywhere
* Slightly higher tol for groupvit
* Slightly higher tol for groupvit
* Fix one BLIP arg not being optional, remove misspelled arg
* Remove the lxmert test overrides and just use the base test_saved_model_creation
* saved_model_creation fixes and re-enabling tests across the board
* Remove unnecessary skip
* Stop caching sinusoidal embeddings in speech_to_text
* Fix transfo_xl compilation
* Fix transfo_xl compilation
* Fix the conditionals in xglm
* Set the save spec only when building
* Clarify comment
* Move comment correctly
* Correct embeddings generation for speech2text
* Mark RAG generation tests as @slow
* Remove redundant else:
* Add comment to clarify the save_spec line in build()
* Fix size tests for XGLM at last!
* make fixup
* Remove one band_part operation
* Mark test_keras_fit as @slow
* Revert whisper change and modify the test_compile_tf_model test
* make fixup
* Tweak test slightly
* Add functional model saving to test
* Ensure TF can infer shapes for data2vec
* Add override for efficientformer
* Mark test as slow
* Rework TF type hints to use | None instead of Optional[] for tf.Tensor
* Rework TF type hints to use | None instead of Optional[] for tf.Tensor
* Don't forget the imports
* Add the imports to tests too
* make fixup
* Refactor tests that depended on get_type_hints
* Better test refactor
* Fix an old hidden bug in the test_keras_fit input creation code
* Fix for the Deit tests
* Result of black 23.1
* Update target to Python 3.7
* Switch flake8 to ruff
* Configure isort
* Configure isort
* Apply isort with line limit
* Put the right black version
* adapt black in check copies
* Fix copies
* move generation_*.py src files into generation/*.py
* populate generation.__init__ with lazy loading
* move imports and references from generation.xxx.object to generation.object
* added test
* correct embedding init
* some changes in blenderbot (incomplete)
* update blenderbot (diff to be used as reference)
* update blenderbot_small
* update LED
* update marian
* update T5 and remove TFWrappedEmbeddings
* nullcontext() -> ContextManagers()
* fix embedding init
* Add serving_output and serving methods to some vision models
* Add serving outputs for DeiT
* Don't convert hidden states - differing shapes
* Make saveable
* Fix up
* Make swin saveable
* Add in tests
* Fix funnel tests (can't convert to tensor)
* Fix numpy call
* Tidy up a bit
* Add in hidden states - resnet
* Remove numpy
* Fix failing tests - tensor shape and skipping tests
* Remove duplicated function
* PR comments - formatting and var names
* PR comments
Add suggestions made by Joao Gante:
* Use tf.shape instead of shape_list
* Use @tooslow decorator on tests
* Simplify some of the logic
* PR comments
Address Yih-Dar Sheih comments - making tensor names consistent and make types float
* Types consistent with docs; disable test on swin (slow)
* CI trigger
* Change input_features to float32
* Add serving_output for segformer
* Fixup
Co-authored-by: Amy Roberts <amyeroberts@users.noreply.github.com>
* Support for Bart and LayoutLM, and partial support for XLNet
* Support for mbart
* A lot of new models supported
* Support for other models
* LayoutLM fix
* Use strings instead of classes