Commit Graph

57 Commits

Author SHA1 Message Date
Hz, Ji
50378cbf6c
device agnostic models testing (#27146)
* device agnostic models testing

* add decorator `require_torch_fp16`

* make style

* apply review suggestion

* Oops, the fp16 decorator was misused
2023-10-31 18:12:14 +01:00
Yih-Dar
b8f1cde931
Fix Mistral OOM again (#26847)
fix

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2023-10-16 22:47:20 +02:00
Yih-Dar
db5e0c3292
Fix MistralIntegrationTest OOM (#26754)
* fix

* fix

* fix

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2023-10-12 12:31:11 +02:00
Yih-Dar
5334796d20
Copied from for test files (#26713)
* copied statement for test files

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2023-10-11 14:12:09 +02:00
fxmarty
64845307b3
Remove unnecessary unsqueeze - squeeze in rotary positional embedding (#26162)
* remove unnecessary unsqueeze-squeeze in llama

* correct other models

* fix

* revert gpt_neox_japanese

* fix copie

* fix test
2023-10-06 18:25:15 +09:00
Younes Belkada
ae9a344cce
[Mistral] Add Flash Attention-2 support for mistral (#26464)
* add FA-2 support for mistral

* fixup

* add sliding windows

* fixing few nits

* v1 slicing cache - logits do not match

* add comment

* fix bugs

* more mem efficient

* add warning once

* add warning once

* oops

* fixup

* more comments

* copy

* add safety checker

* fixup

* Update src/transformers/models/mistral/modeling_mistral.py

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* copied from

* up

* raise when padding side is right

* fixup

* add doc + few minor changes

* fixup

---------

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
2023-10-03 13:44:46 +02:00
Chris Bamford
72958fcd3c
[Mistral] Mistral-7B-v0.1 support (#26447)
* [Mistral] Mistral-7B-v0.1 support

* fixing names

* slightly longer test

* fixups

* not_doctested

* wrongly formatted references

* make fixuped

---------

Co-authored-by: Timothee Lacroix <t@eugen.ai>
Co-authored-by: timlacroix <t@mistral.ai>
2023-09-27 18:30:46 +02:00