transformers/tests
Joel Lamy-Poirier e0921c6b53
Add GPTBigCode model (Optimized GPT2 with MQA from Santacoder & BigCode) (#22575)
* Add model with cli tool

* Remove unwanted stuff

* Add new code

* Remove inference runner

* Style

* Fix checks

* Test updates

* make fixup

* fix docs

* fix doc

* fix test

* hopefully fix pipeline tests

* refactor

* fix CIs

* add comment

* rename to `GPTBigCodeForCausalLM`

* correct readme

* make fixup + docs

* make fixup

* fixes

* fixes

* Remove pruning

* Remove import

* Doc updates

* More pruning removal

* Combine copies

* Single MQA implementation, remove kv cache pre-allocation and padding

* Update doc

* Revert refactor to match gpt2 style

* Merge back key and value caches, fix some type hints

* Update doc

* Fix position ids with padding (PR 21080)

* Add conversion script temporarily

* Update conversion script

* Remove checkpoint conversion

* New model

* Fix MQA test

* Fix copies

* try fix tests

* FIX TEST!!

* remove `DoubleHeadsModel`

* add MQA tests

* add slow tests

* clean up

* add CPU checker

* final fixes

* fixes

- fix GPU issue
- fixed slow tests
- skip disk offload

* fix final issue

* Simplify and comment baddbmm fix

* Remove unnecessary code

* Transpose tweaks

* Use beta=1 on cpu, improve tests

---------

Co-authored-by: younesbelkada <younesbelkada@gmail.com>
2023-04-10 10:57:21 +02:00
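The commit title refers to MQA (multi-query attention), in which every query head shares a single key/value projection, shrinking the KV cache by roughly the number of heads. Below is a minimal, hypothetical sketch of that idea only; names, shapes, and the function itself are illustrative and are not the GPTBigCode implementation.

```python
import torch

def multi_query_attention(hidden, w_q, w_kv, n_heads):
    # hidden: (batch, seq, embed_dim); w_q: (embed_dim, embed_dim); w_kv: (embed_dim, 2 * head_dim)
    batch, seq, embed_dim = hidden.shape
    head_dim = embed_dim // n_heads
    # Queries keep one projection per head, as in standard multi-head attention.
    q = (hidden @ w_q).view(batch, seq, n_heads, head_dim).transpose(1, 2)  # (b, h, s, d)
    # Keys and values are projected once and shared by every query head (the MQA part).
    k, v = (hidden @ w_kv).split(head_dim, dim=-1)                          # each (b, s, d)
    scores = q @ k.transpose(-1, -2).unsqueeze(1) / head_dim ** 0.5         # (b, h, s, s), broadcast over heads
    out = scores.softmax(dim=-1) @ v.unsqueeze(1)                           # (b, h, s, d)
    return out.transpose(1, 2).reshape(batch, seq, embed_dim)

# Illustrative usage with toy sizes:
# x = torch.randn(2, 5, 64)
# y = multi_query_attention(x, torch.randn(64, 64), torch.randn(64, 2 * 8), n_heads=8)
# y.shape -> torch.Size([2, 5, 64])
```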
benchmark [Test refactor 1/5] Per-folder tests reorganization (#15725) 2022-02-23 15:46:28 -05:00
deepspeed [deepspeed] offload + non-cpuadam optimizer exception (#22043) 2023-03-09 08:12:57 -08:00
extended Apply ruff flake8-comprehensions (#21694) 2023-02-22 09:14:54 +01:00
fixtures [WIP] add SpeechT5 model (#18922) 2023-02-03 12:43:46 -05:00
generation Generate: TextIteratorStreamer timeout (#22576) 2023-04-05 09:57:46 +01:00
mixed_int8 [bnb] fix bnb failing test (#22439) 2023-03-29 15:13:00 +02:00
models Add GPTBigCode model (Optimized GPT2 with MQA from Santacoder & BigCode) (#22575) 2023-04-10 10:57:21 +02:00
onnx Time to Say Goodbye, torch 1.7 and 1.8 (#22291) 2023-03-21 19:22:01 +01:00
optimization Make schedulers picklable by making lr_lambda fns global (#21768) 2023-03-02 12:08:43 -05:00
pipelines Soft error whisper. (#22475) 2023-04-04 16:21:57 +02:00
repo_utils Test fetch v2 (#22367) 2023-03-31 16:18:43 -04:00
sagemaker Apply ruff flake8-comprehensions (#21694) 2023-02-22 09:14:54 +01:00
tokenization Update quality tooling for formatting (#21480) 2023-02-06 18:10:56 -05:00
trainer Implemented safetensors checkpoints save/load for Trainer (#22498) 2023-04-04 09:05:04 -04:00
utils Update tiny model summary file for recent models (#22637) 2023-04-06 22:52:59 +02:00
__init__.py GPU text generation: Moved the encoded_prompt to the correct device 2020-01-06 15:11:12 +01:00
test_backbone_common.py Backbone add mixin tests (#22542) 2023-04-06 13:50:15 +01:00
test_configuration_common.py Remove set_access_token usage + fail tests if FutureWarning (#22051) 2023-03-09 09:23:48 -05:00
test_feature_extraction_common.py Remove set_access_token usage + fail tests if FutureWarning (#22051) 2023-03-09 09:23:48 -05:00
test_image_processing_common.py Remove set_access_token usage + fail tests if FutureWarning (#22051) 2023-03-09 09:23:48 -05:00
test_image_transforms.py Rescale image back if it was scaled during PIL conversion (#22458) 2023-03-30 11:29:11 +01:00
test_modeling_common.py Fix inverted conditional in TF common test! (#22540) 2023-04-04 21:59:54 +01:00
test_modeling_flax_common.py Remove set_access_token usage + fail tests if FutureWarning (#22051) 2023-03-09 09:23:48 -05:00
test_modeling_tf_common.py Fix inverted conditional in TF common test! (#22540) 2023-04-04 21:59:54 +01:00
test_pipeline_mixin.py Make tiny model creation + pipeline testing more robust (#22500) 2023-04-06 17:45:55 +02:00
test_sequence_feature_extraction_common.py Apply ruff flake8-comprehensions (#21694) 2023-02-22 09:14:54 +01:00
test_tokenization_common.py Fix llama tokenizer (#22402) 2023-04-03 09:07:32 -04:00