* left-padding test revisited
* Apply suggestions from code review
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
---------
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
* initial-commit
* start cleaning
* small nits
* small nits
* current updates
* add kernels
* small refactoring little step
* add comments
* styling
* nit
* nits
* Style
* Small changes
* Push dummy mambda simple slow
* nit
* Use original names
* Use original names and remove norm
* Updates for inference params
* Style nd updates
* nits
* Match logits
* Add a test
* Add expected generated text
* nits doc, imports and styling
* style
* oups
* dont install kernels, invite users to install the required kernels
* let use use the original packages
* styling
* nits
* fix some copieds
* update doc
* fix-copies
* styling done
* nits
* fix import check
* run but wrong cuda ress
* mamba CUDA works :)
* fix the fast path
* config naming nits
* conversion script is not required at this stage
* finish fixing the fast path: generation make sense now!
* nit
* Let's start working on the CIs
* style
* better style
* more nits
* test nit
* quick fix for now
* nits
* nit
* nit
* nit
* nits
* update test rest
* fixup
* update test
* nit
* some fixes
* nits
* update test values
* fix styling
* nit
* support peft
* integrations tests require torchg
* also add slow markers
* styling
* chose forward wisely
* nits
* update tests
* fix gradient checkpointing
* fixup
* nit
* fix doc
* check copies
* fix the docstring
* fix some more tests
* style
* fix beam search
* add init schene
* update
* nit
* fix
* fixup the doc
* fix the doc
* fixup
* tentative update but slow is no longer good
* nit
* should we always use float32?
* nits
* revert wrong changes
* res in float32
* cleanup
* skip fmt for now
* update generation values
* update test values running original model
* fixup
* update tests + rename inference_params to cache_params + make sure training does not use cache_params
* small nits
* more nits
* fix final CIs
* style
* nit doc
* I hope final doc nits
* nit
* 🫠
* final touch!
* fix torch import
* Apply suggestions from code review
Co-authored-by: Lysandre Debut <hi@lysand.re>
* Apply suggestions from code review
* fix fix and fix
* fix base model prefix!
* nit
* Update src/transformers/models/mamba/__init__.py
* Update docs/source/en/model_doc/mamba.md
Co-authored-by: Lysandre Debut <hi@lysand.re>
* nit
---------
Co-authored-by: Lysandre Debut <hi@lysand.re>
* First draft
* More improvements
* More improvements
* More fixes
* Fix copies
* More improvements
* More fixes
* More improvements
* Convert checkpoint
* More improvements, set up tests
* Fix more tests
* Add UdopModel
* More improvements
* Fix equivalence test
* More fixes
* Redesign model
* Extend conversion script
* Use real inputs for conversion script
* Add image processor
* Improve conversion script
* Add UdopTokenizer
* Add fast tokenizer
* Add converter
* Update README's
* Add processor
* Add fully fledged tokenizer
* Add fast tokenizer
* Use processor in conversion script
* Add tokenizer tests
* Fix one more test
* Fix more tests
* Fix tokenizer tests
* Enable fast tokenizer tests
* Fix more tests
* Fix additional_special_tokens of fast tokenizer
* Fix tokenizer tests
* Fix more tests
* Fix equivalence test
* Rename image to pixel_values
* Rename seg_data to bbox
* More renamings
* Remove vis_special_token
* More improvements
* Add docs
* Fix copied from
* Update slow tokenizer
* Update fast tokenizer design
* Make text input optional
* Add first draft of processor tests
* Fix more processor tests
* Fix decoder_start_token_id
* Fix test_initialization
* Add integration test
* More improvements
* Improve processor, add test
* Add more copied from
* Add more copied from
* Add more copied from
* Add more copied from
* Remove print statement
* Update README and auto mapping
* Delete files
* Delete another file
* Remove code
* Fix test
* Fix docs
* Remove asserts
* Add doc tests
* Include UDOP in exotic model tests
* Add expected tesseract decodings
* Add sentencepiece
* Use same design as T5
* Add UdopEncoderModel
* Add UdopEncoderModel to tests
* More fixes
* Fix fast tokenizer
* Fix one more test
* Remove parallelisable attribute
* Fix copies
* Remove legacy file
* Copy from T5Tokenizer
* Fix rebase
* More fixes, copy from T5
* More fixes
* Fix init
* Use ArthurZ/udop for tests
* Make all model tests pass
* Remove UdopForConditionalGeneration from auto mapping
* Fix more tests
* fixups
* more fixups
* fix the tokenizers
* remove un-necessary changes
* nits
* nits
* replace truncate_sequences_boxes with truncate_sequences for fix-copies
* nit current path
* add a test for input ids
* ids that we should get taken from c9f7a32f57
* nits converting
* nits
* apply ruff
* nits
* nits
* style
* fix slow order of addition
* fix udop fast range as well
* fixup
* nits
* Add docstrings
* Fix gradient checkpointing
* Update code examples
* Skip tests
* Update integration test
* Address comment
* Make fixup
* Remove extra ids from tokenizer
* Skip test
* Apply suggestions from code review
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
* Update year
* Address comment
* Address more comments
* Address comments
* Add copied from
* Update CI
* Rename script
* Update model id
* Add AddedToken, skip tests
* Update CI
* Fix doc tests
* Do not use Tesseract for the doc tests
* Remove kwargs
* Add original inputs
* Update casting
* Fix doc test
* Update question
* Update question
* Use LayoutLMv3ImageProcessor
* Update organization
* Improve docs
* Update forward signature
* Make images optional
* Remove deprecated device argument
* Add comment, add add_prefix_space
* More improvements
* Remove kwargs
---------
Co-authored-by: ArthurZucker <arthur.zucker@gmail.com>
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
* 🐛 Fix oneformer instance post processing when using panoptic task type
* ✅ Add unit test for oneformer instance post processing panoptic bug
---------
Co-authored-by: Nick DeGroot <1966472+nickthegroot@users.noreply.github.com>
* remove control flow
* update gptneox
* update ....
* nits
* Actually let's just break. Otherwise we are silently failing which imo is not optimal
* version BC
* fix tests
* fix eager causal
* nit
* add a test
* style
* nits
* nits
* more nits for the test
* update and fix
* make sure cuda graphs are not skipped
* read token is needed for meta llama
* update!
* fiixup
* compile test should be slow
* fix thet fix copies
* stle 🫠
* stash commit
* stash commit
* It works!
* Remove unnecessary change
* We don't actually need the cache_dir!
* Update docstring
* Add test
* Add test with custom cache dir too
* Update model repo path
* Revert "Add tie_weights() to LM heads and set bias in set_output_embeddings() (#28948)"
This reverts commit 725f4ad1cc.
* Revert "Patch to skip failing `test_save_load_low_cpu_mem_usage` tests (#29043)"
This reverts commit 4156f517ce.
* add add_dummy_prefix_space option to slow
* checking kwargs might be better. Should be there for all spm tokenizer IMO
* nits
* fix copies
* more copied
* nits
* add prefix space
* nit
* nits
* Update src/transformers/convert_slow_tokenizer.py
* fix inti
* revert wrong styling
* fix
* nits
* style
* updates
* make sure we use slow tokenizer for conversion instead of looking for the decoder
* support llama ast well
* update llama tokenizer fast
* nits
* nits nits nits
* update the doc
* update
* update to fix tests
* skip unrelated tailing test
* Update src/transformers/convert_slow_tokenizer.py
* add proper testing
* test decode as well
* more testing
* format
* fix llama test
* Apply suggestions from code review
* pass through trust_remote_code for dynamically loading unregistered tokenizers specified by config
add test
* change directories back to previous directory after test
* fix ruff check
* Add a note to that block for future in case we want to remove it later
---------
Co-authored-by: Matt <rocketknight1@gmail.com>
* Add tie_weights() to LM heads and set bias in set_output_embeddings()
The bias were not tied correctly in some LM heads, and this change should fix that.
* Moving test_save_and_load_low_cpu_mem_usage to ModelTesterMixin
* Adding _tie_weights() to MPNet and Vilt
* Skip test for low cpu mem usage for Deta/DeformableDetr since they cannot init on meta device
* Rename to test name to save_load to match the convention
* Update the processing so bbox coords are adjusted for padding
* Just pad masks
* Tidy up, add tests
* Better tests
* Fix yolos and mark as slow for pycocotols
* Fix yolos - return_tensors
* Clarify padding and normalization behaviour
* add sudachi_projection option
* Upgrade sudachipy>=0.6.8
* add a test case for sudachi_projection
* Compatible with older versions of SudachiPy
* make fixup
* make style
* error message for unidic download
* revert jumanpp test cases
* format options for sudachi_projection
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
* format options for sudachi_split_mode and sudachi_dict_type
* comment
* add tests for full_tokenizer kwargs
* pass projection arg directly
* require_sudachi_projection
* make style
* revert upgrade sudachipy
* check is_sudachi_projection_available()
* revert dependency_version_table and bugfix
* style format
* simply raise ImportError
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
* simply raise ImportError
---------
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
* refactor with addedtokens decoder
* style
* get rid of lang code to id
* style
* keep some things for BC
* update tests
* add the mask token at the end of the vocab
* nits
* nits
* fix final tests
* style
* nits
* Update src/transformers/models/nllb/tokenization_nllb_fast.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* nits
* style?
* Update src/transformers/convert_slow_tokenizer.py
* make it a tad bit more custom
* ruff please stop
Co-Authored by avidale
<dale.david@mail.ru>
* Update
Co-authored-by: avidale
<dale.david@mail.ru>
* Update
Co-authored-by: avidale <dale.david@mail.ru>
* oupts
* ouft
* nites
* test
* fix the remaining failing tests
* style
* fix failing test
* ficx other test
* temp dir + test the raw init
* update test
* style
---------
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* This is a test commit
* testing commit
* final commit with some changes
* Removed copy statement
* Fixed formatting issues
* Fixed error added past_key_values in the forward method
* Fixed a trailing whitespace. Damn the formatting rules are strict
* Added the copy statement