Yih-Dar
f2d5dfbab2
Remove @slow for test_eager_matches_sdpa_inference (#34558)
* update
* update
* update
* update
* update
* update
* update
* update
* update
* update
* update
---------
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2024-11-05 16:10:42 +01:00
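For context on the entry above: `@slow` in transformers gates a test behind the `RUN_SLOW=1` environment variable, so removing the decorator promotes `test_eager_matches_sdpa_inference` into the default CI run. A minimal sketch of that gating pattern, simplified from `transformers.testing_utils`:

```python
import os
import unittest

# Simplified version of the @slow gate in transformers.testing_utils:
# tests decorated with @slow are skipped unless RUN_SLOW=1 is set, so
# deleting the decorator makes the test run in the default test suite.
_run_slow_tests = os.environ.get("RUN_SLOW", "0").upper() in ("1", "TRUE", "YES")

def slow(test_case):
    """Mark test_case as slow; it only runs when RUN_SLOW=1 is set."""
    return unittest.skipUnless(_run_slow_tests, "test is slow")(test_case)
```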
Joao Gante
8a734ea2c3
Tests: move generate tests to the right mixin and delete redundant tests (#34464)
* tmp commit
* tmp commit
* cull overwrites of deleted tests
* typo
* more specific docstring
* make fixup
* parameterize at the top?
* correction
* more deletions :D
* tmp commit
* for VLMs too
* fix _check_outputs
* test nit
* make fixup
* fix another flaky
* test_generate_from_inputs_embeds -- handle missing attention mask
2024-10-30 10:59:08 +00:00
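The last bullet above concerns calling `generate()` with `inputs_embeds` but no `attention_mask`. A hedged sketch of the usual fallback (the helper name is illustrative, not the mixin's actual code): without token ids there is no pad token to detect, so a full mask over the embedded sequence is assumed.

```python
import torch

def default_attention_mask(inputs_embeds: torch.Tensor) -> torch.Tensor:
    # inputs_embeds has shape (batch, seq_len, hidden); attend to every
    # position, since padding cannot be inferred without input_ids.
    batch_size, seq_len, _ = inputs_embeds.shape
    return torch.ones(batch_size, seq_len, dtype=torch.long, device=inputs_embeds.device)
```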
Longjie Zheng
0d1692a49b
Fix attn mask ignore logic in training-time trace (#32613)
* fix attn mask logic for training-time trace
* add test
* fix
* fix
* fix
* fix
* fix
* format
* [run-slow] llama
* avoid accelerate
* [run-slow] llama
2024-10-04 19:00:45 +02:00
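A simplified sketch of the guard this commit fixes (the function name is illustrative; the real logic lives in transformers' attention-mask utilities): an all-ones 2D mask may be dropped so SDPA can take its fastest path, but that shortcut must be disabled while tracing, because `torch.all(...)` is data-dependent and would be baked into a training-time trace as a constant.

```python
import torch

def should_ignore_causal_mask(attention_mask) -> bool:
    # The real check also covers torch.compile / dynamo compilation.
    is_tracing = torch.jit.is_tracing() or isinstance(attention_mask, torch.fx.Proxy)
    if is_tracing:
        # Keep the mask: dropping it here would freeze a data-dependent
        # branch ("mask happened to be all ones") into the traced graph.
        return False
    return attention_mask is None or bool((attention_mask == 1).all())
```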
Ao Tang
6a03942db7
Add Nemotron HF Support (#31699)
* Add nemotron support
* fix inference
* add unit test
* add layernorm1p as a class to avoid meta device mismatch
* test fixed
* Add copied_from statements
* remove pretraining_tp args
* remove nemotronlayernorm
* force LN computation done in FP32
* remove nemotrontokenizer and use llamatokenizer
* license update
* add option for kv_channels for minitron8b
* remove assert
* o_proj fixed
* o_proj reshape
* add gated_proj option
* typo
* remove todos
* fix broken test after merging latest main
* remove nezha/nat after merging main
* change default config to 15b model
* add nemo conversion script
* rename conversion script
* remove gate_proj option
* pr comment resolved
* fix unit test
* rename kv_channels to head_dim
* resolve PR issue
* add nemotron md
* fix broken tests
* refactor rope for nemotron
* test fix
* remove linearscaling
* whitespace and import
* fix some copied-from
* code style fix
* reformatted
* add position_embedding to nemotronattention
* rope refactor to only use config, copied-from fix
* format
* Run make fix-copies
* nemotron md with autodoc
* doc fix
* fix order
* pass check_config_docstrings.py
* fix config_attributes
* remove all llama BC related code
* Use PreTrainedTokenizerFast
* ruff check examples
* conversion script update
* add nemotron to toctree
2024-08-06 15:42:05 +02:00
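Two bullets in the entry above name concrete details worth a sketch: "add layernorm1p as a class" and "force LN computation done in FP32". A hedged approximation (the class in transformers is `NemotronLayerNorm1P`; its exact autocast handling differs): layernorm-1p scales by `1 + weight`, so a zero-initialized weight starts out as an identity scale, and the normalization itself runs in FP32 for numerical stability.

```python
import torch
from torch import nn
import torch.nn.functional as F

class LayerNorm1P(nn.LayerNorm):
    """LayerNorm variant that scales by (1 + weight) and computes in FP32."""

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        out = F.layer_norm(
            x.float(),                      # force the reduction into FP32
            self.normalized_shape,
            self.weight.float() + 1.0,      # the "1p": zero-init weight == identity
            self.bias.float() if self.bias is not None else None,
            self.eps,
        )
        return out.to(x.dtype)              # cast back to the input dtype (e.g. bf16)
```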