s-JoL
c2c99dc7ef
add open-llama model with ckpt ( #22795 )
...
* update Open-Llama model
* update
* update format
* update doc
* update
* update stable embedding test
* update test case
* update format
* update readme
* fix typo
* update name
* remove tokenizer and update format
* remove convert_open_llama_weights_to_hf
* update warning and doc_string
---------
Co-authored-by: songliang.bayesian <songliang.bayesian@bytedance.com>
2023-04-28 11:01:32 -04:00
Yih-Dar
0bf34b1c9f
Skip pt/flax equivalence tests in pytorch bigbird test file ( #23040 )
...
skip
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2023-04-28 17:00:13 +02:00
Shivam Shrirao
4d0ea3d269
Cuda rng_state_all is used when saving in distributed mode so same should also be used when loading ( #23045 )
...
cuda rng state should be all for distributed bc all were saved
2023-04-28 09:28:01 -04:00
Maria Khalusova
521a8ffa53
[docs] Doc TOC updates ( #23049 )
...
* first draft of toc restructure
* polishing based on feedback
2023-04-28 09:24:28 -04:00
Hyeonseo Yun
4893d919f1
🌐 [i18n-KO] Translated model_sharing.mdx to Korean ( #22991 )
...
* docs: ko: init: model_sharing.mdx
* docs: ko: trans: model_sharing.mdx
Co-Authored-By: Kihoon Son <75935546+KIHOON71@users.noreply.github.com>
Co-Authored-By: Sohyun Sim <96299403+sim-so@users.noreply.github.com>
Co-Authored-By: Gabriel Yang <gabrielwithhappy@gmail.com>
Co-Authored-By: Nayeon Han <nayeon2.han@gmail.com>
Co-Authored-By: Wonhyeong Seo <wonhseo@kakao.com>
Co-Authored-By: Jungnerd <46880056+jungnerd@users.noreply.github.com>
* docs: ko: revised: apply code reviews model_sharing.mdx
Co-authored-by: Wonhyeong Seo <wonhseo@kakao.com>
Co-authored-by: Sohyun Sim <96299403+sim-so@users.noreply.github.com>
* docs: ko: revised: apply additional reviews model_sharing.mdx
1. Natural Expression
2. `파인 튜닝` to `미세 조정`
3. Glossary Sync
Co-Authored-By: Sohyun Sim <96299403+sim-so@users.noreply.github.com>
Co-Authored-By: Nayeon Han <nayeon2.han@gmail.com>
Co-Authored-By: Wonhyeong Seo <wonhseo@kakao.com>
* docs: ko: revised: apply additional reviews in model_sharing.mdx
1. Spell check
2. Natural Expression
3. Sync Glossary
Co-Authored-By: Gabriel Yang <gabrielwithhappy@gmail.com>
* docs: ko: revised: `프로그래밍 방식` to `API` in model_sharing.mdx
Co-Authored-By: Wonhyeong Seo <wonhseo@kakao.com>
---------
Co-authored-by: Kihoon Son <75935546+KIHOON71@users.noreply.github.com>
Co-authored-by: Sohyun Sim <96299403+sim-so@users.noreply.github.com>
Co-authored-by: Gabriel Yang <gabrielwithhappy@gmail.com>
Co-authored-by: Nayeon Han <nayeon2.han@gmail.com>
Co-authored-by: Wonhyeong Seo <wonhseo@kakao.com>
Co-authored-by: Jungnerd <46880056+jungnerd@users.noreply.github.com>
2023-04-28 09:20:33 -04:00
Maxime Méloux
9b435204b1
Add Trainer support for ReduceLROnPlateau ( #23010 )
...
* Add Trainer support for ReduceLROnPlateau
Fixes #16503
* Remove training argument and add default instance
---------
Co-authored-by: mmeloux <maxime.meloux@loria.fr>
2023-04-28 09:17:30 -04:00
Yih-Dar
cf7baf4060
Make _test_xla_generate less flaky ( #22996 )
...
* fix
---------
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2023-04-28 13:27:28 +02:00
Ehsan M. Kermani
a0e7332839
Fix CLAP link across all READMEs ( #23032 )
...
* Fix CLAP link across all READMEs
* Fix copy only for en
2023-04-27 18:07:02 -04:00
Bartosz Szmelczynski
88399476c3
Fix bigbird random attention ( #21023 )
...
* switch np.random.permutation to jax.random.permutation
* remove comments
* remove leftover comment
* skip similarity tests
* modify indices_prng_key usage, add deterministic behaviour
* update style
* remove unused import
* remove copy statement since classes are not identical
* remove numpy import
* revert removing copied from statements
* make style from copied
* remove copied from statement
* update copied from statement to include only np.ndarray
* add deterministic args, unittestskip equivalence tests
2023-04-27 13:52:28 -04:00
Yih-Dar
27b66bea01
Update BridgeTowerModelTester ( #23029 )
...
* update
---------
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2023-04-27 18:26:17 +02:00
peter-sk
d65b14ed67
added GPTNeoForTokenClassification ( #22908 )
...
* added GPTNeoForTokenClassification
* add to top-level init
* fixup
* test
* more fixup
* add to gpt_neo.mdx
* repo consistency
* dummy copy
* fix copies
* optax >= 0.1.5 assumes jax.Array exists - which it doesn't for jax <= 0.3.6
* merge with main made this superfluous
* added classifier_dropout
* remove legacy code
* removed fmt:on/off
removed expected_outputs
* doc style fix
* classifier_dropout is always in config
---------
Co-authored-by: Prof. Peter Schneider-Kamp <jps@ordbogen.com>
2023-04-27 12:10:03 -04:00
peter-sk
614e191c4d
added GPTNeoXForTokenClassification ( #23002 )
...
* initial commit
* added GPTNeoXForTokenClassification
* typo
* doc
fixed extra comma that turned into a tuple
* unifying variable names
fixing forward call
* classifier_dropout is in config
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
---------
Co-authored-by: Prof. Peter Schneider-Kamp <jps@ordbogen.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2023-04-27 11:08:26 -04:00
Arthur
1933231a0a
[MEGA] nit size test ( #23028 )
...
* add fast not use warning
* properly check sequence_length vs chunk_size
* fixup
2023-04-27 16:21:00 +02:00
Yih-Dar
a4908da04e
Fix the expected error in test_offline_mode_pipeline_exception ( #23022 )
...
* fix
---------
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2023-04-27 14:22:05 +02:00
Nayeon Han
e28fff18b8
🌐 [i18n-KO] Translated multilingual.mdx to Korean ( #23008 )
...
docs: ko: `multilingual.mdx`
Co-authored-by: Hyeonseo Yun <0525_hhgus@naver.com>
Co-authored-by: Gabriel Yang <gabrielwithhappy@gmail.com>
Co-authored-by: Jungnerd <46880056+jungnerd@users.noreply.github.com>
Co-authored-by: Sohyun Sim <96299403+sim-so@users.noreply.github.com>
Co-authored-by: Wonhyeong Seo <wonhseo@kakao.com>
2023-04-27 08:06:12 -04:00
Younes Belkada
9435cc6670
[Pix2Struct] Fix pix2struct doctest ( #23023 )
...
fix pix2struct doctest
2023-04-27 11:48:02 +02:00
fxmarty
3042c63a95
Add methods to PreTrainedModel to use PyTorch's BetterTransformer ( #21259 )
...
* fix mess
* better documentation
* typo
* fix doc
* update
* add test
* fix test
* more tests
* Update src/transformers/modeling_utils.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* move to utils
* Apply suggestions from code review
Co-authored-by: Michael Benayoun <mickbenayoun@gmail.com>
* nit
---------
Co-authored-by: younesbelkada <younesbelkada@gmail.com>
Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: Michael Benayoun <mickbenayoun@gmail.com>
2023-04-27 11:03:42 +02:00
Sylvain Gugger
0083b149e9
🚨 🚨 🚨 Use default ignore index in Luke ( #23014 )
...
Use default ignore index in Luke
2023-04-26 17:55:01 -04:00
Zachary Mueller
8b129030cb
Bring back PartialState DeepSpeed ( #22921 )
...
* Bring back deepspeed integration
* Branchname
* Self-scheduled
* newline
* Use deepspeed env var
* Remove comment
* Del env var after partialstate
2023-04-26 15:35:59 -04:00
Sylvain Gugger
4331923b97
Fix None value when adding info to auto_map ( #22990 )
2023-04-26 14:39:36 -04:00
Arthur
d0b5002378
[Llama Tokenizer] Fast llama template ( #22959 )
...
* update template processing for llama fast to add eos
* style
* update
* address training from new issue
* fix
* update
* special tokens can be given even if not used
2023-04-26 19:13:20 +02:00
Younes Belkada
00bc6e2067
[PEFT] Add HFTracer support for PEFT ( #23006 )
...
* add hack fx
* continue hacking
* final changes
* Test
* Add a keys method
* Fix keys method
* revert unneeded changes
* small nit
---------
Co-authored-by: Michael Benayoun <mickbenayoun@gmail.com>
2023-04-26 18:45:05 +02:00
Younes Belkada
304aacac90
🚨 🚨 🚨 [Pix2Struct] Attempts to fix training issues 🚨 🚨 🚨 ( #23004 )
...
* multiple fixes
- add `add_special_tokens` to `True` by default
- remove label smoothing and labels masking
* fix test
2023-04-26 18:29:25 +02:00
Javier de la Rosa
ba0dc54576
Add gradient checkpointing to Whisper Flax ( #22954 )
...
* Add gradient checkpointing to Whisper Flax
* self.gradient_checkpointing only needed in nn.Module, removing unnecessary comments
2023-04-26 12:19:16 -04:00
Yih-Dar
a72b82ebe6
Remove a failing ONNX test ( #23011 )
...
fix
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2023-04-26 17:44:12 +02:00
Ritik Nandwal
20ac86c6f1
Add TensorFlow Wav2Vec2 for sequence classification ( #22073 )
...
* Add initial changes for TF wav2vec2 for sequence classification
* Add suggested changes
* Add serving and serving output methods
* Add serving_output implementation and fix layer_weights
* Add fixes
* Fixed test cases
* Fixing test and adding suggested changes
2023-04-26 13:35:30 +01:00
Hyeonseo Yun
4c2b4c4c3c
🌐 [i18n-KO] Translated token_classification.mdx to Korean ( #22945 )
...
* docs: ko: init: token_classification.mdx
* docs: ko: trans: tasks/token_classification.mdx
* docs: ko: revise: apply suggestions tasks/token_classification.mdx
right vocabulary, spell check, natural expression
Co-authored-by: Sohyun Sim <96299403+sim-so@users.noreply.github.com>
* docs: ko: revise: `Hub` to `허브` in tasks/token_classification.mdx
* docs: ko: revise: `example` in tasks/token_classification.mdx
Co-Authored-By: Gabriel Yang <gabrielwithhappy@gmail.com>
Co-Authored-By: Kihoon Son <75935546+KIHOON71@users.noreply.github.com>
Co-Authored-By: Sohyun Sim <96299403+sim-so@users.noreply.github.com>
Co-Authored-By: Nayeon Han <nayeon2.han@gmail.com>
Co-Authored-By: Wonhyeong Seo <wonhseo@kakao.com>
Co-Authored-By: Jungnerd <46880056+jungnerd@users.noreply.github.com>
* docs: ko: revise: ko expression in tasks/token_classification.mdx
Co-Authored-By: Gabriel Yang <gabrielwithhappy@gmail.com>
* Revert "docs: ko: revise: ko expression in tasks/token_classification.mdx"
This reverts commit 8efe28059b.
* docs: ko: revise: `quick tour` in tasks/token_classification.mdx
Co-Authored-By: Gabriel Yang <gabrielwithhappy@gmail.com>
---------
Co-authored-by: Sohyun Sim <96299403+sim-so@users.noreply.github.com>
Co-authored-by: Gabriel Yang <gabrielwithhappy@gmail.com>
Co-authored-by: Kihoon Son <75935546+KIHOON71@users.noreply.github.com>
Co-authored-by: Nayeon Han <nayeon2.han@gmail.com>
Co-authored-by: Wonhyeong Seo <wonhseo@kakao.com>
Co-authored-by: Jungnerd <46880056+jungnerd@users.noreply.github.com>
2023-04-26 07:56:14 -04:00
Sohyun Sim
6dc2474727
🌐 [i18n-KO] Translated tasks/image_captioning.mdx to Korean ( #22943 )
...
docs: ko: tasks/image_captioning.mdx
Co-authored-by: Hyeonseo Yun <0525_hhgus@naver.com>
Co-authored-by: Jungnerd <46880056+jungnerd@users.noreply.github.com>
Co-authored-by: Wonhyeong Seo <wonhseo@kakao.com>
Co-authored-by: Gabriel Yang <gabrielwithhappy@gmail.com>
Co-authored-by: Nayeon Han <nayeon2.han@gmail.com>
Co-authored-by: Kihoon Son <75935546+kihoon71@users.noreply.github.com>
Co-authored-by: Hyeonseo Yun <0525yhs@gmail.com>
2023-04-26 07:54:58 -04:00
Daniel Levenson
4e1522d65a
Fix typo in mega.mdx ( #22998 )
...
MegaConfiig -> MegaConfig
2023-04-25 17:58:45 -04:00
Wonhyeong Seo
d95045717e
🌐 [i18n-KO] Translated serialization.mdx to Korean ( #22806 )
...
docs: ko: serialization.mdx
Co-authored-by: Sohyun Sim <96299403+sim-so@users.noreply.github.com>
Co-authored-by: Hyeonseo Yun <0525yhs@gmail.com>
2023-04-25 12:38:51 -04:00
Younes Belkada
a0ae2310ec
[DocTest] Fix correct checkpoint ( #22988 )
...
fix pipeline issue
2023-04-25 15:20:36 +02:00
Lingepumpe
5427250351
Avoid invalid escape sequences, use raw strings ( #22936 )
...
* Avoid invalid escape sequences, use raw strings
* Integrate PR feedback
2023-04-25 09:17:56 -04:00
Jari Van Melckebeke
81c1910c86
fixed small typo in code example ( #22982 )
...
fixed typo in code example
fixed a really small typo in the docs of single gpu inference
2023-04-25 08:56:21 -04:00
AleksanderWWW
0a570dbd2e
Neptune fix bug init run ( #22836 )
...
* [neptune] fix checkpoint bug with relative out_dir
* update imports
* reformat with black
* check neptune without imports
* fix typing-related issue
* run black on code
* use os.path.sep instead of raw \
* simplify imports and remove type annotation
* make ruff happy
* apply review suggestions
* replace run with with_id kwarg to run
* update imports to avoid deprecation warnings for the latest client
---------
Co-authored-by: kshitij12345 <kshitijkalambarkar@gmail.com>
2023-04-25 08:51:05 -04:00
Younes Belkada
d4d628462f
[SAM] Add sam doc ( #22984 )
...
* add sam doc
* fixes
* multiple fixes
2023-04-25 14:00:27 +02:00
Nayeon Han
f0f5e28f82
🌐 [i18n-KO] Fixed tasks/masked_language_modeling.mdx ( #22965 )
...
fix: docs: missing newline before code block
2023-04-25 09:59:17 +02:00
Yih-Dar
60f9649653
Fix DeepSpeed CI job link in Past CI ( #22967 )
...
* Fix job link
* fix artifact name logic
---------
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2023-04-25 09:52:19 +02:00
Yih-Dar
073baf7f22
Install accelerate@main in PyTorch Past CI jobs ( #22963 )
...
fix
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2023-04-24 21:19:06 +02:00
Joao Gante
e4a97f82bf
Generate: assisted generation with sample (take 2) ( #22949 )
...
* temperature controls speed
2023-04-24 19:54:55 +01:00
Gabriel Yang
7701716efc
🌐 [i18n-KO] translate create_a_model doc to Korean ( #22754 )
...
docs: ko: translates create_a_model.mdx
Co-authored-by: Nayeon Han <nayeon2.han@gmail.com>
Co-authored-by: Hyeonseo Yun <0525_hhgus@naver.com>
Co-authored-by: Sohyun Sim <96299403+sim-so@users.noreply.github.com>
Co-authored-by: Jungnerd <46880056+jungnerd@users.noreply.github.com>
Co-authored-by: Wonhyeong Seo <wonhseo@kakao.com>
2023-04-24 13:02:19 -04:00
amyeroberts
8f20e61c85
Update feature selection in to_tf_dataset ( #21935 )
...
* Update feature selection
* Check compatibility with datasets version
* Checkout from datasets main
2023-04-24 17:34:30 +01:00
Matt
345a1371d8
Fix TF example in quicktour ( #22960 )
...
* Fix TF example in quicktour
* Fix model.fit() and the dataset section too
2023-04-24 17:25:13 +01:00
othertea
503e8c8b32
fix ValueError message in LlamaAttention ( #22966 )
2023-04-24 12:02:05 -04:00
Nicolas Patry
6e32959329
Reverting Deta cloning mechanism. ( #22656 )
...
* Fixed the revert by making sure that even the regexp can cover all
duplicates.
* Code simplification using hash.
* Fixing the `ident`.
* Fixing ignoring patterned duplicate names.
* Using `accelerate@find_tied_parameters` for from_pretrained
This is more correct there, since it handles meta device seamlessly
and we don't need to handle "non-duplicate" tensors (slices of each
other).
* Protecting accelerate.
* Update src/transformers/modeling_utils.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
---------
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2023-04-24 11:24:35 -04:00
Nayeon Han
d6f1da6b71
🌐 [i18n-KO] Translated run_scripts.mdx to Korean ( #22793 )
...
docs: ko: `run_scripts` to Korean
Co-authored-by: Hyeonseo Yun <0525_hhgus@naver.com>
Co-authored-by: Gabriel Yang <gabrielwithhappy@gmail.com>
Co-authored-by: Sohyun Sim <96299403+sim-so@users.noreply.github.com>
Co-authored-by: Wonhyeong Seo <wonhseo@kakao.com>
Co-authored-by: Jungnerd <46880056+jungnerd@users.noreply.github.com>
2023-04-24 10:18:20 -04:00
Lucain
74c55ab9e5
Prepare tests for hfh 0.14 ( #22958 )
...
* Test hf_hub 0.14.0rc1
* fix mocked tests
* package version
---------
Co-authored-by: Sylvain Gugger <Sylvain.gugger@gmail.com>
Co-authored-by: testbot <lucainp@hf.co>
2023-04-24 09:31:50 -04:00
hanrui1sensetime
69f2d5386b
[Fix Bugs] Fix keys in _load_pretrained_model ( #22947 )
...
fix transformers keys
2023-04-24 09:28:51 -04:00
Connor Boyle
b5f06d6c59
Raise error if stride is too high in TokenClassificationPipeline ( #22942 )
...
* Raise error if `stride` is too high
* Clarify use of `stride`
2023-04-24 09:27:49 -04:00
Yih-Dar
3f6a4b5bd7
Decorate test_codegen_sample_max_time as flaky ( #22953 )
...
* fix
---------
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2023-04-24 15:27:31 +02:00
fxmarty
edb6d950cb
Add an attribute to disable custom kernels in deformable detr in order to make the model ONNX exportable ( #22918 )
...
* add disable kernel option
* add comment
* fix copies
* add disable_custom_kernels to config
* Update src/transformers/models/deta/modeling_deta.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Update src/transformers/models/deta/modeling_deta.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Update src/transformers/models/deta/modeling_deta.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* style
* fix
---------
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
2023-04-24 09:27:03 -04:00