Joao Gante
186b8dc190
Tests: upgrade test_eager_matches_sdpa_generate ( #34386 )
2024-10-25 11:55:07 +01:00
Raushan Turganbay
65bb284448
Compile compatibility for decoder-only models ( #32617 )
...
* squash into one commit
* add qwen2-vl for rope standardization
* fix mistral compile
* fix qwen2-vl
* fix-copies
2024-09-09 10:59:04 +02:00
Anton Vlasjuk
605f3245dc
Fix mask creation of GPTNeoX and GPT2 ( #31944 )
...
* fix mask creation of gpt2 and gpt_neox caused by me
* forgot the reshape of masks when shape > 2
* add tests for gpt neox and gpt2
* nit on a comment
2024-07-23 10:11:12 +02:00
Anton Vlasjuk
b07770c5eb
[GPT-NeoX] Add SDPA support ( #31031 )
...
* starting support for sdpa in `gptneox` models
* small comment on tests
* fix dropout
* documentation and style
* clarify concrete paths for reference
* generalise attn projections and rope application
added head mask check to sdpa mask creation
handle sdpa memory backend bug via own version flag
* update docs and style
* move dtype casting outside of general attn_projection_and_rope function
fix flash_attn_2 stuff
* more generic attn warning if output_attns or head_mask
* simplify head mask check by moving head mask creation to a later point
* remove copied llama artifact
* remove padding_mask from attention function signature
* removing unnecessary comments, only "save" attn implementation once
* [run_slow] gpt_neox
2024-06-26 13:56:36 +01:00
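The SDPA commit above routes GPT-NeoX attention through PyTorch's fused `scaled_dot_product_attention` kernel. As a rough, framework-free sketch of the computation SDPA performs — `softmax(QKᵀ/√d)V` — using plain Python lists (this is an illustrative toy, not the transformers implementation):

```python
import math

def scaled_dot_product_attention(q, k, v):
    """Naive softmax(q @ k^T / sqrt(d)) @ v for 2-D lists of row vectors
    (seq_len x head_dim). Fused kernels compute the same result faster."""
    d = len(q[0])
    # attention scores: q @ k^T, scaled by 1/sqrt(head_dim)
    scores = [[sum(qi * ki for qi, ki in zip(qr, kr)) / math.sqrt(d) for kr in k]
              for qr in q]
    # numerically stable row-wise softmax
    weights = []
    for row in scores:
        m = max(row)
        exps = [math.exp(s - m) for s in row]
        z = sum(exps)
        weights.append([e / z for e in exps])
    # each output row is a convex combination of the value rows
    return [[sum(w * vr[j] for w, vr in zip(wr, v)) for j in range(len(v[0]))]
            for wr in weights]

q = [[1.0, 0.0], [0.0, 1.0]]
k = [[1.0, 0.0], [0.0, 1.0]]
v = [[1.0, 2.0], [3.0, 4.0]]
out = scaled_dot_product_attention(q, k, v)
```

In transformers itself the backend is typically selected via `attn_implementation="sdpa"` when loading a model, with masking and dropout handled by the real kernel.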
Arthur
673440d073
update ruff version ( #30932 )
...
* update ruff version
* fix research projects
* Empty
* Fix errors
---------
Co-authored-by: Lysandre <lysandre@huggingface.co>
2024-05-22 06:40:15 +02:00
Joao Gante
441de62f49
RoPE models: add numerical sanity-check test for RoPE scaling ( #29808 )
...
* add hard rope scaling test
* make fixup
* quick rope scaling tests
* add copy statements
2024-03-28 11:25:50 +00:00
Arthur
83f9196cc4
[GPTNeoX] Fix BC issue with 4.36 ( #28602 )
...
* fix dtype issue
* add a test
* update copied from mentions
* nits
* fixup
* fix copies
* Apply suggestions from code review
2024-01-21 17:01:19 +00:00
Yih-Dar
bd90cda9a6
CI with num_hidden_layers=2 🚀 🚀 🚀 ( #25266 )
...
* CI with layers=2
---------
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2023-08-02 20:22:36 +02:00
Joao Gante
34d9409427
Llama/GPTNeoX: add RoPE scaling ( #24653 )
...
* add rope_scaling
* tmp commit
* add gptneox
* add tests
* GPTNeoX can now handle long inputs, so the pipeline test was wrong
* Update src/transformers/models/open_llama/configuration_open_llama.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* remove ntk
* remove redundant validation
---------
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
2023-07-13 16:47:30 +01:00
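The `rope_scaling` commit above adds linear RoPE scaling to Llama and GPT-NeoX: position indices are divided by a scaling factor before the rotary angles are computed, stretching the usable context window. A minimal sketch of that idea (hypothetical helper names, simplified pairing of features; not the transformers implementation):

```python
import math

def rope_angles(position, dim, base=10000.0, scaling_factor=1.0):
    """Rotary angles for one position; linear scaling divides the position
    index by `scaling_factor`, so position 8 with factor 2 behaves like 4."""
    pos = position / scaling_factor
    inv_freq = [base ** (-2 * i / dim) for i in range(dim // 2)]
    return [pos * f for f in inv_freq]

def apply_rope(x, position, scaling_factor=1.0):
    """Rotate consecutive feature pairs of vector x by their angles."""
    angles = rope_angles(position, len(x), scaling_factor=scaling_factor)
    out = []
    for (a, b), theta in zip(zip(x[0::2], x[1::2]), angles):
        out += [a * math.cos(theta) - b * math.sin(theta),
                a * math.sin(theta) + b * math.cos(theta)]
    return out

rotated = apply_rope([1.0, 2.0, 3.0, 4.0], position=5)
```

Since RoPE is a pure rotation it preserves vector norms, which is one reason it composes cleanly with the numerical sanity-check tests added later in this history.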
Arthur
e5c760d636
[GPTNeoX] Nit in config ( #24349 )
...
* add raise value error for attention size
* nits to fix test_config
* style
2023-06-20 19:19:19 +02:00
Yih-Dar
dadc9fb427
Update GPTNeoXLanguageGenerationTest ( #24193 )
...
* fix
* fix
* fix
---------
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2023-06-12 15:37:12 +02:00
Yih-Dar
ffad4f1373
Update tiny models and pipeline tests ( #23446 )
...
* fix
---------
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2023-05-18 17:29:04 +02:00
peter-sk
83b38fbea8
GPTNeoXForQuestionAnswering ( #23059 )
...
* first draft - gives index error in question_answering.py
* maturing
* no labels
* pipeline should know about QA
* fixing checks
* formatting
* fixed docstring
* initial commit
* formatting
* adding the class to many places
* towards less unhappy checks
* nearly there
* and gpt neox for qa
* use right model
* forgot this one
* base_model_prefix is "gpt_neox" for GPTNeoX* models
* unnecessary stuff
* Update src/transformers/models/gpt_neox/modeling_gpt_neox.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* format
* Update src/transformers/models/gpt_neox/modeling_gpt_neox.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* removed gpt2 stuff
---------
Co-authored-by: Prof. Peter Schneider-Kamp <jps@ordbogen.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
2023-05-04 10:15:15 -04:00
peter-sk
614e191c4d
added GPTNeoXForTokenClassification ( #23002 )
...
* initial commit
* added GPTNeoXForTokenClassification
* typo
* doc
fixed extra comma that turned into a tuple
* unifying variable names
fixing forward call
* classifier_dropout is in config
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
---------
Co-authored-by: Prof. Peter Schneider-Kamp <jps@ordbogen.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2023-04-27 11:08:26 -04:00
Sugawara
6daa9cb515
add GPTNeoXForSequenceClassification ( #22671 )
...
* add GPTNeoXForSequenceClassification
* move the labels to logits.device (ref: #22561 )
* fix
2023-04-10 11:52:23 -04:00
Joao Gante
7dcd8703ef
Generate: support for left-padding on GPTNeoX and Llama ( #22382 )
2023-03-27 15:48:23 +01:00
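Left-padding support in the commit above hinges on deriving position ids from the attention mask, so that pad tokens on the left do not shift the rotary positions of real tokens. A hedged sketch of the usual recipe (cumulative sum of the mask minus one, clamped at zero; simplified, not the transformers implementation):

```python
def positions_from_mask(attention_mask):
    """Position ids for left-padded batches: running count of real tokens
    minus one, with padding slots clamped to 0 (their value is masked out
    anyway, so any non-negative placeholder works)."""
    position_ids = []
    for row in attention_mask:
        cum = 0
        pos_row = []
        for m in row:
            cum += m
            pos_row.append(max(cum - 1, 0))
        position_ids.append(pos_row)
    return position_ids

mask = [[0, 0, 1, 1, 1],   # left-padded by two
        [1, 1, 1, 1, 1]]   # no padding
pos = positions_from_mask(mask)
```

With this, the first real token of every row gets position 0 regardless of padding, which is what the left-padding equivalence tests check.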
Joao Gante
0fa46524ac
Generate: Add GPTNeoX integration test ( #22346 )
2023-03-24 11:33:16 +00:00
Joao Gante
502fec779b
Generate: add test for left-padding support ( #22322 )
2023-03-23 17:00:22 +00:00
Yih-Dar
871c31a6f1
🔥 Rework pipeline testing by removing PipelineTestCaseMeta 🚀 ( #21516 )
...
* Add PipelineTesterMixin
* remove class PipelineTestCaseMeta
* move validate_test_components
* Add for ViT
* Add to SPECIAL_MODULE_TO_TEST_MAP
* style and quality
* Add feature-extraction
* update
* raise instead of skip
* add tiny_model_summary.json
* more explicit
* skip tasks not in mapping
* add availability check
* Add Copyright
* A way to disable irrelevant tests
* update with main
* remove disable_irrelevant_tests
* skip tests
* better skip message
* better skip message
* Add all pipeline task tests
* revert
* Import PipelineTesterMixin
* subclass test classes with PipelineTesterMixin
* Add pipeline_model_mapping
* Fix import after adding pipeline_model_mapping
* Fix style and quality after adding pipeline_model_mapping
* Fix one more import after adding pipeline_model_mapping
* Fix style and quality after adding pipeline_model_mapping
* Fix test issues
* Fix import requirements
* Fix mapping for MobileViTModelTest
* Update
* Better skip message
* pipeline_model_mapping could not be None
* Remove some PipelineTesterMixin
* Fix typo
* revert tests_fetcher.py
* update
* rename
* revert
* Remove PipelineTestCaseMeta from ZeroShotAudioClassificationPipelineTests
* style and quality
* test fetcher for all pipeline/model tests
---------
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2023-02-28 19:40:57 +01:00
Sylvain Gugger
6f79d26442
Update quality tooling for formatting ( #21480 )
...
* Result of black 23.1
* Update target to Python 3.7
* Switch flake8 to ruff
* Configure isort
* Configure isort
* Apply isort with line limit
* Put the right black version
* adapt black in check copies
* Fix copies
2023-02-06 18:10:56 -05:00
Yih-Dar
14fb8a63b9
skip some gpt_neox tests that require 80G RAM ( #17923 )
...
* skip some gpt_neox tests that require 80G RAM
* remove tests
* fix quality
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2022-07-01 09:04:38 -04:00
Jason Phang
205bc4152c
Fix GPT-NeoX-20B past handling, attention computation ( #17811 )
...
* Fix GPT-NeoX-20B past handling, swap attention computation to hopefully avoid NaN, update docs
* 20B tests
2022-06-30 08:47:40 -04:00
Sylvain Gugger
fdb120805c
Fix cache for GPT-Neo-X ( #17764 )
...
* Fix cache for GPT-Neo-X
* Add more tests
2022-06-20 08:43:36 -04:00
Jason Phang
71e602725b
[WIP] Adding GPT-NeoX-20B ( #16659 )
...
* initial
* first try
* working 20B
* 20B tokenizers
* Docs
* Import fixes for missing classes
* Update docs, fixup
* black formatting
* isort
* flake
* dummy objects
* documentation
* Documentation yml
* more docs
* tweaks for tests
* tokenization auto
* fix neox tests
* test
* test
* einsum
* address PR feedback
* Documentation
* Update README.md
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* Update src/transformers/models/gpt_neox/__init__.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* Update src/transformers/models/gpt_neox/configuration_gpt_neox.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* Apply suggestions from code review
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* Remove undefined LaTeX syntax
* Update to full url to avoid confusion about if that's supposed to refer to the Hub
* fix auto
* move tests
* documentation fix
* more doc fixes
* test refactor
* fix import
* fix import
* fix import
* fix import
* fix import
* style fixes
* More modeling fixes
Co-authored-by: Jason Phang <zp489@gr057.hpc.nyu.edu>
Co-authored-by: Stella Biderman <stellabiderman@gmail.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2022-05-24 09:31:10 -04:00