Yih-Dar
3258ff9330
use pytest.mark directly (#27390)
fix
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2023-11-09 13:32:54 +01:00
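For context, "using pytest.mark directly" means applying a marker on the test itself rather than through a project wrapper decorator; a minimal sketch (the marker name is an assumption, not the one from the PR):

```python
import pytest

# Marker applied directly on the test; `flash_attn_test` is an assumed name.
@pytest.mark.flash_attn_test
def test_forward_pass():
    assert 1 + 1 == 2
```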
Hz, Ji
50378cbf6c
device-agnostic models testing (#27146)
* device-agnostic models testing
* add decorator `require_torch_fp16`
* make style
* apply review suggestion
* Oops, the fp16 decorator was misused
2023-10-31 18:12:14 +01:00
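A sketch of the decorator pattern behind `require_torch_fp16`, assuming the skip is keyed on CUDA availability (the PR's device-agnostic version consults the active accelerator instead):

```python
import unittest

import torch


def require_torch_fp16(test_case):
    # Skip unless the current torch device can run float16 ops.
    return unittest.skipUnless(
        torch.cuda.is_available(), "test requires a device with fp16 support"
    )(test_case)
```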
Patrick von Platen
ac5893756b
[Attention Mask] Refactor all encoder-decoder attention masks (#27086)
* [FA2 Bart] Add FA2 to all Bart-like
* better
* Refactor attention mask
* remove all customized attention logic
* format
* mass rename
* replace _expand_mask
* replace _expand_mask
* mass rename
* add pt files
* mass replace & rename
* mass replace & rename
* mass replace & rename
* mass replace & rename
* Update src/transformers/models/idefics/modeling_idefics.py
* fix more
* clean more
* fix more
* make style
* fix again
* finish
* finish
* finish
* finish
* finish
* finish
* finish
* finish
* finish
* finish
* Apply suggestions from code review
* Apply suggestions from code review
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* small fix mistral
* finish
* finish
* finish
* finish
---------
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
2023-10-27 16:42:01 +02:00
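The `_expand_mask` helper that these commits mass-replaced broadcast a 2D padding mask into a 4D additive mask, roughly like this sketch (not the exact code):

```python
import torch


def expand_mask(mask: torch.Tensor, dtype: torch.dtype, tgt_len: int) -> torch.Tensor:
    # [batch, src_len] with 1 = attend, 0 = pad -> [batch, 1, tgt_len, src_len]
    bsz, src_len = mask.shape
    expanded = mask[:, None, None, :].expand(bsz, 1, tgt_len, src_len).to(dtype)
    # Kept positions add 0.0; padded positions add the most negative value.
    return (1.0 - expanded) * torch.finfo(dtype).min
```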
Patrick von Platen
33f98cfded
Remove ambiguous padding_mask and instead use a 2D->4D Attn Mask Mapper (#26792)
* [Attn Mask Converter] refactor attn mask
* up
* Apply suggestions from code review
Co-authored-by: fxmarty <9808326+fxmarty@users.noreply.github.com>
* improve
* rename
* better cache
* renaming
* improve more
* improve
* fix bug
* finalize
* make style & make fix-copies
* correct more
* start moving attention_mask
* fix llama
* improve falcon
* up
* improve more
* improve more
* Update src/transformers/models/owlv2/modeling_owlv2.py
* make style
* make style
* rename to converter
* Apply suggestions from code review
---------
Co-authored-by: fxmarty <9808326+fxmarty@users.noreply.github.com>
2023-10-23 18:54:00 +02:00
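The converter's other half is the causal mask it merges with the expanded padding mask; a sketch of the idea (the PR's actual class is `AttentionMaskConverter`):

```python
import torch


def make_causal_4d_mask(tgt_len: int, dtype: torch.dtype) -> torch.Tensor:
    # Future positions (upper triangle) get the dtype minimum; the rest is 0.0.
    mask = torch.full((tgt_len, tgt_len), torch.finfo(dtype).min, dtype=dtype)
    mask = torch.triu(mask, diagonal=1)
    return mask[None, None, :, :]  # [1, 1, tgt_len, tgt_len], broadcasts over batch
```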
Yih-Dar
5334796d20
`Copied from` for test files (#26713)
* copied statement for test files
---------
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2023-10-11 14:12:09 +02:00
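The statements in question are plain comments that `make fix-copies` verifies; in a test file the form looks like this (the path here is illustrative):

```python
# Copied from tests.models.llama.test_modeling_llama.LlamaModelTest.test_model_forward
def test_model_forward(self):
    ...
```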
Younes Belkada
368a58e61c
[core] Integrate Flash attention 2 in most used models (#25598)
* v1
* oops
* working v1
* fixup
* add some TODOs
* fixup
* padding support + try with module replacement
* nit
* alternative design
* oops
* add `use_cache` support for llama
* v1 falcon
* nit
* a bit of refactor
* nit
* nits nits
* add v1 padding support falcon (even though it seemed to work before)
* nit
* falcon works
* fixup
* v1 tests
* nit
* fix generation llama flash
* update tests
* fix tests + nits
* fix copies
* fix nit
* test- padding mask
* style
* add more mem efficient support
* Update src/transformers/modeling_utils.py
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
* fixup
* nit
* fixup
* remove it from config when saving
* fixup
* revert docstring
* add more checks
* use values
* oops
* new version
* fixup
* add same trick for falcon
* nit
* add another test
* change tests
* fix issues with GC and also falcon
* fixup
* oops
* Update src/transformers/models/falcon/modeling_falcon.py
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
* add init_rope
* updates
* fix copies
* fixup
* fixup
* more clarification
* fixup
* right padding tests
* add docs
* add FA in docker image
* more clarifications
* add some figures
* add todo
* rectify comment
* Change to FA2
* Update docs/source/en/perf_infer_gpu_one.md
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
* split in two lines
* change test name
* add more tests
* some clean up
* remove `rearrange` deps
* add more docs
* revert changes on dockerfile
* Revert "revert changes on dockerfile"
This reverts commit 8d72a66b4b.
* revert changes on dockerfile
* Apply suggestions from code review
Co-authored-by: Lysandre Debut <hi@lysand.re>
* address some comments
* docs
* use inheritance
* Update src/transformers/testing_utils.py
Co-authored-by: Lysandre Debut <hi@lysand.re>
* fixup
* Apply suggestions from code review
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
* Update src/transformers/modeling_utils.py
* final comments
* clean up
* style
* add cast + warning for PEFT models
* fixup
---------
Co-authored-by: Felix Marty <9808326+fxmarty@users.noreply.github.com>
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
Co-authored-by: Lysandre Debut <hi@lysand.re>
2023-09-22 17:42:10 +02:00
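Usage as introduced by this PR, sketched from the docs of the time (the `use_flash_attention_2` flag was later superseded by `attn_implementation="flash_attention_2"`):

```python
import torch

from transformers import AutoModelForCausalLM

# Requires the flash-attn package and a supported CUDA GPU; fp16/bf16 only.
model = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Llama-2-7b-hf",
    torch_dtype=torch.float16,
    use_flash_attention_2=True,
)
```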
Arthur
015f8e110d
[CodeLlama] Add support for CodeLlama (#25740)
* add all
* Revert "Delete .github directory"
This reverts commit 9b0ff7b052e2b20b629a26fb13606b78a42944d1.
* make conversion script backward compatible
* fixup
* more styling
* copy to llama changes
* fix repo consistency
* nits
* document correct classes
* updates
* more fixes
* nits
* update auto mappings
* add readmes
* small updates
* replace llama-code with llama_code
* make fixup
* updates to the testing suite
* fix fast nits
* more small fixes
* fix decode
* fix template processing
* properly reset the normalizer
* nits processor
* tokenization tests pass
* styling
* last tests
* additional nits
* one test is left
* nits
Co-authored-by: faabian <faabian@users.noreply.github.com>
* update failing test
* fixup
* remove decode infilling; users should handle it on their own after generation, padding can be a problem
* update
* make test slow and more meaningful
* fixup
* doc update
* fixup
* Apply suggestions from code review
* add kwargs doc
* tokenizer requires `requires_backends`
* type requires_backends
* CodeLlama instead of LlamaCode
* more name changes
* nits
* make doctests happy
* small pipeline nits
* last nit
* Apply suggestions from code review
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* update
* add codellama to toctree
---------
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2023-08-25 18:57:40 +02:00
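For reference, the infilling interface CodeLlama shipped with, sketched from the docs of the time; per the "remove decode infilling" commit above, callers decode and strip the prompt themselves:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("codellama/CodeLlama-7b-hf")
model = AutoModelForCausalLM.from_pretrained("codellama/CodeLlama-7b-hf")

# <FILL_ME> marks the span the model should fill in.
prompt = 'def remove_non_ascii(s: str) -> str:\n    """ <FILL_ME>\n    return result'
inputs = tokenizer(prompt, return_tensors="pt")
output = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```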
Yih-Dar
bd90cda9a6
CI with num_hidden_layers=2 🚀 🚀 🚀 (#25266)
* CI with layers=2
---------
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2023-08-02 20:22:36 +02:00
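The gist of the change: model testers build tiny configs so CI never instantiates deep networks. An illustrative sketch (only `num_hidden_layers=2` is the PR's point; the other values are assumptions):

```python
from transformers import LlamaConfig, LlamaForCausalLM

tiny = LlamaConfig(
    num_hidden_layers=2,  # the CI speed-up
    hidden_size=32,
    num_attention_heads=4,
    intermediate_size=64,
    vocab_size=99,
)
model = LlamaForCausalLM(tiny)  # instantiates in a fraction of the time
```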
Arthur
0511369a8b
[LlamaConfig] Nit: pad token should be None by default (#24958)
* pad token should be None by default
* fix tests
* nits
2023-07-21 14:32:34 +02:00
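The behavioral change in one line (the default used to be 0):

```python
from transformers import LlamaConfig

assert LlamaConfig().pad_token_id is None  # was 0 before this PR
```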
Arthur
07360b6c9c
[Llama2] Add support for Llama 2 (#24891)
* add llama
* add other readmes
* update padding id in readme
* add link to paper
* fix paths and tokenizer
* more nits
* styling
* fit operation in 2 lines when possible
* nits
* Apply suggestions from code review
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* add form
* update readme
* update readme, we don't have a default pad token
* update test and tokenization
* LLaMA instead of Llama
* nits
* add expected text
* add greedy output
* styling
* Update src/transformers/models/llama/modeling_llama.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* sequential device map
* skip relevant changes
---------
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2023-07-18 15:18:31 -04:00
Joao Gante
34d9409427
Llama/GPTNeoX: add RoPE scaling (#24653)
* add rope_scaling
* tmp commit
* add gptneox
* add tests
* GPTNeoX can now handle long inputs, so the pipeline test was wrong
* Update src/transformers/models/open_llama/configuration_open_llama.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* remove ntk
* remove redundant validation
---------
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
2023-07-13 16:47:30 +01:00
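The knob this adds is a `rope_scaling` dict on the config, carrying a strategy name and a scaling factor; `"linear"` and `"dynamic"` were the two strategies landed here:

```python
from transformers import LlamaConfig

# factor > 1.0 stretches the usable context length by that multiple.
config = LlamaConfig(rope_scaling={"type": "linear", "factor": 2.0})
```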
Joao Gante
fd3eb3e3cd
Beef up Llama tests (#22314)
* tmp commit
* beef up llama tests
2023-03-22 15:20:48 +00:00
lewtun
f251441387
Add LlamaForSequenceClassification (#22209)
* Add LlamaForSequenceClassification
* Update src/transformers/models/llama/modeling_llama.py
Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>
* Update src/transformers/models/llama/modeling_llama.py
Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>
* Add docstring
* Add test
* Add input embedding getter and setter
* Remove dead code
---------
Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>
2023-03-17 14:39:26 +01:00
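Minimal usage of the new head with a tiny illustrative config (real use would load a pretrained checkpoint):

```python
import torch

from transformers import LlamaConfig, LlamaForSequenceClassification

config = LlamaConfig(
    num_labels=2, vocab_size=100, hidden_size=32,
    num_hidden_layers=2, num_attention_heads=4, intermediate_size=64,
)
model = LlamaForSequenceClassification(config)
logits = model(input_ids=torch.randint(0, 100, (1, 8))).logits  # shape [1, 2]
```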
Jason Phang
0041be5b3d
LLaMA Implementation (#21955)
* LLaMA
* sharding and docs
* tweak
* black
* inits
* ruff
* LLAMA_PRETRAINED_CONFIG_ARCHIVE_MAP
* init
* no checkpoint
* docs
* ruff
* type_vocab_size
* tokenizer fixes
* tokenizer fixes
* Update tokenization_llama.py
* Update tokenization_llama.py
* Update configuration_llama.py
* Update modeling_llama.py
* tokenizer add_bos by default
* licenses
* remove decoder
* norms and mlp
* rope overhaul
* tweaks
* black
* mention OPT implementation
* off-by-one naming
* typo
* fix
* tokenization fix and slicing bug
* padding config
* cleanup
* black
* update tests
* undo typo
* fix vocab caching logic
* ruff
* docbuilder
* attn fix from BlackSamorez
* initial feedback
* typo
* docs
* llama case
* llama case
* load checkpoint docs
* comment about tokenizer
* tokenizer defaults
* clear past_key_values if use_cache=False
* last tweaks
* last tweaks
* last tweaks
* last tweaks
---------
Co-authored-by: Stella Biderman <stellabiderman@gmail.com>
2023-03-16 09:00:53 -04:00
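Among the pieces landed here ("norms and mlp", "rope overhaul"), the normalization is RMSNorm; a sketch matching the paper (the PR's class is `LlamaRMSNorm`):

```python
import torch
from torch import nn


class RMSNorm(nn.Module):
    def __init__(self, dim: int, eps: float = 1e-6):
        super().__init__()
        self.weight = nn.Parameter(torch.ones(dim))
        self.eps = eps

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Scale by the reciprocal root-mean-square of the features;
        # no mean subtraction and no bias, unlike LayerNorm.
        variance = x.pow(2).mean(-1, keepdim=True)
        return self.weight * x * torch.rsqrt(variance + self.eps)
```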