transformers/tests
fxmarty 1da1302ec8
Flash Attention 2 support for RoCm (#27611)
* support FA2

* fix typo

* fix broken tests

* fix more test errors

* left/right

* fix bug

* more test

* typo

* fix layout flash attention falcon

* do not support this case

* use allclose instead of equal

* fix various bugs with flash attention

* bump

* fix test

* fix mistral

* use skiptest instead of return that may be misleading

* add fix causal arg flash attention

* fix copies

* more explicit comment

* still use self.is_causal

* fix causal argument

* comment

* fixes

* update documentation

* add link

* wrong test

* simplify FA2 RoCm requirements

* update opt

* make flash_attn_uses_top_left_mask attribute private and precise comment

* better error handling

* fix copy & mistral

* Update src/transformers/modeling_utils.py

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* Update src/transformers/modeling_utils.py

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* Update src/transformers/modeling_utils.py

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* Update src/transformers/utils/import_utils.py

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* use is_flash_attn_greater_or_equal_2_10 instead of is_flash_attn_greater_or_equal_210

* fix merge

* simplify

* inline args

---------

Co-authored-by: Felix Marty <felix@hf.co>
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
2023-12-04 21:52:17 +09:00
..
benchmark [Test refactor 1/5] Per-folder tests reorganization (#15725) 2022-02-23 15:46:28 -05:00
bettertransformer Fixed malapropism error (#26660) 2023-10-09 11:04:57 +02:00
deepspeed device-agnostic deepspeed testing (#27342) 2023-11-09 12:34:13 +01:00
extended Device agnostic trainer testing (#27131) 2023-10-30 18:16:40 +00:00
fixtures [WIP] add SpeechT5 model (#18922) 2023-02-03 12:43:46 -05:00
fsdp device agnostic fsdp testing (#27120) 2023-11-01 07:17:06 +01:00
generation Generate: GenerationConfig throws an exception when generate args are passed (#27757) 2023-11-30 14:16:31 +00:00
models Added test cases for rembert refering to albert and reformer test_tok… (#27637) 2023-12-04 13:36:57 +01:00
optimization Make schedulers picklable by making lr_lambda fns global (#21768) 2023-03-02 12:08:43 -05:00
peft_integration [Peft] modules_to_save support for peft integration (#27466) 2023-11-14 10:32:57 +01:00
pipelines Update tiny model summary file (#27388) 2023-11-23 21:00:39 +01:00
quantization [AWQ ] Addresses TODO for awq tests (#27467) 2023-11-13 18:18:41 +01:00
repo_utils Docstring check (#26052) 2023-10-04 15:13:37 +02:00
sagemaker Broken links fixed related to datasets docs (#27569) 2023-11-17 13:44:09 -08:00
tokenization [Styling] stylify using ruff (#27144) 2023-11-16 17:43:19 +01:00
tools Add support for for loops in python interpreter (#24429) 2023-06-26 09:58:14 -04:00
trainer Fixed passing scheduler-specific kwargs via TrainingArguments lr_scheduler_kwargs (#27595) 2023-11-28 08:33:45 +01:00
utils Update tiny model summary file (#27388) 2023-11-23 21:00:39 +01:00
__init__.py GPU text generation: mMoved the encoded_prompt to correct device 2020-01-06 15:11:12 +01:00
test_backbone_common.py [AutoBackbone] Add test (#26094) 2023-09-18 23:47:54 +02:00
test_configuration_common.py [ PretrainedConfig] Improve messaging (#27438) 2023-11-15 14:10:39 +01:00
test_configuration_utils.py Remove-auth-token (#27060) 2023-11-13 14:20:54 +01:00
test_feature_extraction_common.py Split common test from core tests (#24284) 2023-06-15 07:30:24 -04:00
test_feature_extraction_utils.py Remove-auth-token (#27060) 2023-11-13 14:20:54 +01:00
test_image_processing_common.py Input data format (#25464) 2023-08-16 17:45:02 +01:00
test_image_processing_utils.py Remove-auth-token (#27060) 2023-11-13 14:20:54 +01:00
test_image_transforms.py Normalize floating point cast (#27249) 2023-11-10 15:35:27 +00:00
test_modeling_common.py Flash Attention 2 support for RoCm (#27611) 2023-12-04 21:52:17 +09:00
test_modeling_flax_common.py Split common test from core tests (#24284) 2023-06-15 07:30:24 -04:00
test_modeling_flax_utils.py Default to msgpack for safetensors (#27460) 2023-11-13 15:17:01 +01:00
test_modeling_tf_common.py Deprecate TransfoXL (#27607) 2023-11-24 11:48:02 +01:00
test_modeling_tf_utils.py Default to msgpack for safetensors (#27460) 2023-11-13 15:17:01 +01:00
test_modeling_utils.py [ModelOnTheFlyConversionTester] Mark as slow for now (#27823) 2023-12-04 08:33:15 +01:00
test_pipeline_mixin.py Shorten the conversation tests for speed + fixing position overflows (#26960) 2023-10-31 14:20:04 +00:00
test_sequence_feature_extraction_common.py Fix typo (#25966) 2023-09-05 10:12:25 +02:00
test_tokenization_common.py [Styling] stylify using ruff (#27144) 2023-11-16 17:43:19 +01:00
test_tokenization_utils.py Remove-auth-token (#27060) 2023-11-13 14:20:54 +01:00