Vijeth Moudgalya
58022e41b8
#23388 Issue: Update RoBERTa configuration ( #23863 )
2023-05-30 10:53:40 -04:00
Arthur
6fc0454b2f
[LlamaTokenizerFast] nit update `post_processor` on the fly ( #23855 )
* Update the processor when changing add_eos and add_bos
* fixup
* update
* add a test
* fix failing tests
* fixup
2023-05-30 16:50:41 +02:00
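For context on the LlamaTokenizerFast change, a minimal sketch of the idea (not the exact implementation): when `add_bos_token`/`add_eos_token` change, the backend post-processor is rebuilt so its template matches the current flags.

```python
from tokenizers import processors

def update_post_processor(tokenizer, add_bos_token: bool, add_eos_token: bool):
    # Rebuild the template so BOS/EOS are added (or not) per the current flags.
    bos, eos = tokenizer.bos_token, tokenizer.eos_token
    single = f"{bos + ':0 ' if add_bos_token else ''}$A:0{' ' + eos + ':0' if add_eos_token else ''}"
    tokenizer._tokenizer.post_processor = processors.TemplateProcessing(
        single=single,
        special_tokens=[(bos, tokenizer.bos_token_id), (eos, tokenizer.eos_token_id)],
    )
```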
Clémentine Fourrier
0623f08e99
Update collating_graphormer.py ( #23862 )
2023-05-30 10:23:20 -04:00
peridotml
62ba64b90a
Adds a FlyteCallback ( #23759 )
* initial flyte callback
* lint
* logs should still be saved to Flyte even if pandas isn't installed (unlikely)
* cr - flyte team
* add docs for Flytecallback
* fix doc string - cr sgugger
* Apply suggestions from code review
cr - sgugger fix doc strings
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
---------
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2023-05-30 10:08:07 -04:00
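Roughly how the new callback is wired in (a sketch under my reading of the PR; `model` and `train_dataset` are assumed to be defined, and the constructor arguments may differ):

```python
from transformers import Trainer, TrainingArguments
from transformers.integrations import FlyteCallback

# Inside a Flyte task: stream logs and checkpoints back to Flyte while training
trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="out"),
    train_dataset=train_dataset,
    callbacks=[FlyteCallback(save_log_history=True, sync_checkpoints=True)],
)
trainer.train()
```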
Hyeonseo Yun
867316670a
🌐 [i18n-KO] Translated `troubleshooting.mdx` to Korean ( #23166 )
* docs: ko: troubleshooting.mdx
* revised: fix _toctree.yml #23112
* feat: nmt draft `troubleshooting.mdx`
* fix: manual edits `troubleshooting.mdx`
* revised: resolve suggestions troubleshooting.mdx
Co-authored-by: Sohyun Sim <96299403+sim-so@users.noreply.github.com>
---------
Co-authored-by: Sohyun Sim <96299403+sim-so@users.noreply.github.com>
2023-05-30 09:49:47 -04:00
Kihoon Son
192aa04783
[i18n-KO] Translated video_classification.mdx to Korean ( #23026 )
* task/video_classification translated
Co-Authored-By: Hyeonseo Yun <0525_hhgus@naver.com>
Co-Authored-By: Gabriel Yang <gabrielwithhappy@gmail.com>
Co-Authored-By: Sohyun Sim <96299403+sim-so@users.noreply.github.com>
Co-Authored-By: Nayeon Han <nayeon2.han@gmail.com>
Co-Authored-By: Wonhyeong Seo <wonhseo@kakao.com>
Co-Authored-By: Jungnerd <46880056+jungnerd@users.noreply.github.com>
* Update docs/source/ko/tasks/video_classification.mdx
Co-authored-by: Jungnerd <46880056+jungnerd@users.noreply.github.com>
* Update docs/source/ko/tasks/video_classification.mdx
Co-authored-by: Jungnerd <46880056+jungnerd@users.noreply.github.com>
* Update docs/source/ko/tasks/video_classification.mdx
Co-authored-by: Jungnerd <46880056+jungnerd@users.noreply.github.com>
* Update docs/source/ko/tasks/video_classification.mdx
Co-authored-by: Jungnerd <46880056+jungnerd@users.noreply.github.com>
* Update docs/source/ko/tasks/video_classification.mdx
Co-authored-by: Jungnerd <46880056+jungnerd@users.noreply.github.com>
* Update docs/source/ko/tasks/video_classification.mdx
Co-authored-by: Jungnerd <46880056+jungnerd@users.noreply.github.com>
* Update docs/source/ko/tasks/video_classification.mdx
Co-authored-by: Jungnerd <46880056+jungnerd@users.noreply.github.com>
* Update docs/source/ko/tasks/video_classification.mdx
Co-authored-by: Jungnerd <46880056+jungnerd@users.noreply.github.com>
* Update docs/source/ko/tasks/video_classification.mdx
Co-authored-by: Sohyun Sim <96299403+sim-so@users.noreply.github.com>
* Update docs/source/ko/tasks/video_classification.mdx
Co-authored-by: Sohyun Sim <96299403+sim-so@users.noreply.github.com>
* Apply suggestions from code review
Co-authored-by: Sohyun Sim <96299403+sim-so@users.noreply.github.com>
Co-authored-by: Hyeonseo Yun <0525yhs@gmail.com>
Co-authored-by: Jungnerd <46880056+jungnerd@users.noreply.github.com>
Co-authored-by: Gabriel Yang <gabrielwithhappy@gmail.com>
* Update video_classification.mdx
* Update _toctree.yml
* Update _toctree.yml
* Update _toctree.yml
* Update _toctree.yml
---------
Co-authored-by: Hyeonseo Yun <0525_hhgus@naver.com>
Co-authored-by: Gabriel Yang <gabrielwithhappy@gmail.com>
Co-authored-by: Sohyun Sim <96299403+sim-so@users.noreply.github.com>
Co-authored-by: Nayeon Han <nayeon2.han@gmail.com>
Co-authored-by: Wonhyeong Seo <wonhseo@kakao.com>
Co-authored-by: Jungnerd <46880056+jungnerd@users.noreply.github.com>
Co-authored-by: Hyeonseo Yun <0525yhs@gmail.com>
2023-05-30 09:28:44 -04:00
Kihoon Son
a077f710f3
🌐 [i18n-KO] Translated `fast_tokenizers.mdx` to Korean ( #22956 )
* docs: ko: fast_tokenizer.mdx
content - translated
Co-Authored-By: Gabriel Yang <gabrielwithhappy@gmail.com>
Co-Authored-By: Nayeon Han <nayeon2.han@gmail.com>
Co-Authored-By: Hyeonseo Yun <0525_hhgus@naver.com>
Co-Authored-By: Sohyun Sim <96299403+sim-so@users.noreply.github.com>
Co-Authored-By: Jungnerd <46880056+jungnerd@users.noreply.github.com>
Co-Authored-By: Wonhyeong Seo <wonhseo@kakao.com>
* Update docs/source/ko/fast_tokenizers.mdx
Co-authored-by: Sohyun Sim <96299403+sim-so@users.noreply.github.com>
* Update docs/source/ko/fast_tokenizers.mdx
Co-authored-by: Sohyun Sim <96299403+sim-so@users.noreply.github.com>
* Update docs/source/ko/fast_tokenizers.mdx
Co-authored-by: Sohyun Sim <96299403+sim-so@users.noreply.github.com>
* Update docs/source/ko/fast_tokenizers.mdx
Co-authored-by: Sohyun Sim <96299403+sim-so@users.noreply.github.com>
* Update docs/source/ko/fast_tokenizers.mdx
Co-authored-by: Sohyun Sim <96299403+sim-so@users.noreply.github.com>
* Update docs/source/ko/fast_tokenizers.mdx
Co-authored-by: Sohyun Sim <96299403+sim-so@users.noreply.github.com>
* Update docs/source/ko/fast_tokenizers.mdx
Co-authored-by: Hyeonseo Yun <0525yhs@gmail.com>
* Update fast_tokenizers.mdx
* Update fast_tokenizers.mdx
* Update fast_tokenizers.mdx
* Update fast_tokenizers.mdx
* Update _toctree.yml
---------
Co-authored-by: Gabriel Yang <gabrielwithhappy@gmail.com>
Co-authored-by: Nayeon Han <nayeon2.han@gmail.com>
Co-authored-by: Hyeonseo Yun <0525_hhgus@naver.com>
Co-authored-by: Sohyun Sim <96299403+sim-so@users.noreply.github.com>
Co-authored-by: Jungnerd <46880056+jungnerd@users.noreply.github.com>
Co-authored-by: Wonhyeong Seo <wonhseo@kakao.com>
Co-authored-by: Hyeonseo Yun <0525yhs@gmail.com>
2023-05-30 09:27:40 -04:00
Matthijs Hollemans
2faa09530b
fix Whisper tests on GPU ( #23753 )
* move input features to GPU
* skip these tests because of undefined behavior
* unskip tests
2023-05-30 09:06:58 -04:00
Matt
ac224dee90
TF SAM shape flexibility fixes ( #23842 )
SAM shape flexibility fixes for compilation
2023-05-30 13:08:44 +01:00
Samin Yasar
af45ec0a16
add type hint in pipeline model argument ( #23740 )
* add type hint in pipeline model argument
* add PreTrainedModel and TFPreTrainedModel type hints
* make type hints string
2023-05-30 11:05:58 +01:00
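The shape of that change, sketched: string ("forward-reference") annotations mean neither torch nor TensorFlow has to be imported just to annotate the signature. Simplified signature, not the full one:

```python
from typing import Optional, Union

def pipeline(
    task: Optional[str] = None,
    model: Optional[Union[str, "PreTrainedModel", "TFPreTrainedModel"]] = None,
    **kwargs,
):
    ...
```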
Eli Simhayev
4b6a5a7caa
[Time-Series] Autoformer model ( #21891 )
* ran `transformers-cli add-new-model-like`
* added `AutoformerLayernorm` and `AutoformerSeriesDecomposition`
* added `decomposition_layer` in `init` and `moving_avg` to config
* added `AutoformerAutoCorrelation` to encoder & decoder
* removed canonical self-attention `AutoformerAttention`
* added arguments in config and model tester. Init works! 😁
* WIP autoformer attention with autocorrelation
* fixed `attn_weights` size
* wip time_delay_agg_training
* fixing sizes and debug time_delay_agg_training
* aggregation in training works! 😁
* `top_k_delays` -> `top_k_delays_index` and added `contiguous()`
* wip time_delay_agg_inference
* finish time_delay_agg_inference 😎
* added resize to autocorrelation
* bug fix: added the length of the output signal to `irfft`
* `attention_mask = None` in the decoder
* fixed test: changed attention expected size, `test_attention_outputs` works!
* removed unnecessary code
* apply AutoformerLayernorm in final norm in enc & dec
* added series decomposition to the encoder
* added series decomp to decoder, with inputs
* added trend todos
* added autoformer to README
* added to index
* added autoformer.mdx
* remove scaling and init attention_mask in the decoder
* make style
* fix copies
* make fix-copies
* initial fix-copies
* fix from https://github.com/huggingface/transformers/pull/22076
* make style
* fix class names
* added trend
* added d_model and projection layers
* added `trend_projection` source, and decomp layer init
* added trend & seasonal init for decoder input
* AutoformerModel cannot be copied as it has the decomp layer too
* encoder can be copied from time series transformer
* fixed generation and made distribution output more robust
* use context window to calculate decomposition
* use the context_window for decomposition
* use output_params helper
* clean up AutoformerAttention
* subsequences_length off by 1
* make fix copies
* fix test
* added init for nn.Conv1d
* fix IGNORE_NON_TESTED
* added model_doc
* fix ruff
* ignore tests
* remove dup
* fix SPECIAL_CASES_TO_ALLOW
* do not copy due to conv1d weight init
* remove unused imports
* added short summary
* added label_length and made the model non-autoregressive
* added params docs
* better doc for `factor`
* fix tests
* renamed `moving_avg` to `moving_average`
* renamed `factor` to `autocorrelation_factor`
* make style
* Update src/transformers/models/autoformer/configuration_autoformer.py
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>
* Update src/transformers/models/autoformer/configuration_autoformer.py
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>
* fix configurations
* fix integration tests
* Update src/transformers/models/autoformer/configuration_autoformer.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* fixing `lags_sequence` doc
* Revert "fixing `lags_sequence` doc"
This reverts commit 21e34911e3.
* Update src/transformers/models/autoformer/modeling_autoformer.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Update src/transformers/models/autoformer/modeling_autoformer.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Update src/transformers/models/autoformer/modeling_autoformer.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Apply suggestions from code review
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Update src/transformers/models/autoformer/configuration_autoformer.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* model layers now take the config
* added `layer_norm_eps` to the config
* Update src/transformers/models/autoformer/modeling_autoformer.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* added `config.layer_norm_eps` to AutoformerLayernorm
* added `config.layer_norm_eps` to all layernorm layers
* Update src/transformers/models/autoformer/configuration_autoformer.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Update src/transformers/models/autoformer/configuration_autoformer.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Update src/transformers/models/autoformer/configuration_autoformer.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Update src/transformers/models/autoformer/configuration_autoformer.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* fix variable names
* added initial pretrained model
* added use_cache docstring
* doc strings for trend and use_cache
* fix order of args
* imports on one line
* fixed get_lagged_subsequences docs
* add docstring for create_network_inputs
* get rid of layer_norm_eps config
* add back layernorm
* update fixture location
* fix signature
* use AutoformerModelOutput dataclass
* fix pretrain config
* no need as default exists
* subclass ModelOutput
* remove layer_norm_eps config
* fix test_model_outputs_equivalence test
* test hidden_states_output
* make fix-copies
* Update src/transformers/models/autoformer/configuration_autoformer.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* removed unused attr
* Update tests/models/autoformer/test_modeling_autoformer.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Update src/transformers/models/autoformer/modeling_autoformer.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Update src/transformers/models/autoformer/modeling_autoformer.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Update src/transformers/models/autoformer/modeling_autoformer.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Update src/transformers/models/autoformer/modeling_autoformer.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Update src/transformers/models/autoformer/modeling_autoformer.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Update src/transformers/models/autoformer/modeling_autoformer.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* use AutoFormerDecoderOutput
* fix formatting
* fix formatting
---------
Co-authored-by: Kashif Rasul <kashif.rasul@gmail.com>
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
2023-05-30 10:23:32 +02:00
Sylvain Gugger
17a55534f5
Enable code-specific revision for code on the Hub ( #23799 )
* Enable code-specific revision for code on the Hub
* invalidate old revision
2023-05-26 15:51:15 -04:00
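Usage sketch of the new argument (repo id and revision illustrative): `code_revision` pins the remote modeling code independently of the weights revision.

```python
from transformers import AutoModel

model = AutoModel.from_pretrained(
    "some-user/custom-model",  # illustrative repo id
    trust_remote_code=True,
    code_revision="v1.0",      # revision used only for the code files
)
```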
Zachary Mueller
edf7772826
Log the right train_batch_size if using auto_find_batch_size and also log the adjusted value separately. ( #23800 )
* Log right bs
* Log
* Diff message
2023-05-26 15:09:05 -04:00
Ran Ran
e724246935
Fix no such file or directory error ( #23783 )
* Fix no such file or directory error
* Address comment
* Fix formatting issue
2023-05-26 14:24:57 -04:00
Wang, Yi
b7b729b38d
no_cuda does not take effect in a non-distributed environment ( #23795 )
Signed-off-by: Wang, Yi <yi.a.wang@intel.com>
2023-05-26 10:47:51 -04:00
amitportnoy
d61d747627
Update trainer.mdx class_weights example ( #23787 )
class_weights tensor should follow the model's device
2023-05-26 08:36:33 -04:00
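The gist of that doc fix, sketched (the two-label setup and weight values are illustrative): the class-weight tensor has to live on the model's device, or loss computation fails on GPU.

```python
import torch
from torch import nn
from transformers import Trainer

class WeightedLossTrainer(Trainer):
    def compute_loss(self, model, inputs, return_outputs=False):
        labels = inputs.pop("labels")
        outputs = model(**inputs)
        # Create the weights on the model's device to avoid a CPU/GPU mismatch.
        weights = torch.tensor([1.0, 2.0], device=model.device)
        loss_fct = nn.CrossEntropyLoss(weight=weights)
        loss = loss_fct(outputs.logits.view(-1, 2), labels.view(-1))
        return (loss, outputs) if return_outputs else loss
```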
Sylvain Gugger
4d9b76a80f
Fix RWKV backward on GPU ( #23774 )
2023-05-26 08:33:17 -04:00
Arthur
8d28dba35d
[OPT] Doc nit, using fast is fine ( #23789 )
small doc nit
2023-05-26 14:30:32 +02:00
Younes Belkada
f67dac97bd
[Nllb-Moe] Fix nllb moe accelerate issue ( #23758 )
fix nllb moe accelerate issue
2023-05-25 22:37:33 +02:00
dependabot[bot]
d685e330b5
Bump tornado from 6.0.4 to 6.3.2 in /examples/research_projects/visual_bert ( #23767 )
Bump tornado in /examples/research_projects/visual_bert
Bumps [tornado](https://github.com/tornadoweb/tornado ) from 6.0.4 to 6.3.2.
- [Changelog](https://github.com/tornadoweb/tornado/blob/master/docs/releases.rst )
- [Commits](https://github.com/tornadoweb/tornado/compare/v6.0.4...v6.3.2 )
---
updated-dependencies:
- dependency-name: tornado
dependency-type: direct:production
...
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2023-05-25 16:16:12 -04:00
dependabot[bot]
4b0e7ded1c
Bump tornado from 6.0.4 to 6.3.2 in /examples/research_projects/lxmert ( #23766 )
Bumps [tornado](https://github.com/tornadoweb/tornado ) from 6.0.4 to 6.3.2.
- [Changelog](https://github.com/tornadoweb/tornado/blob/master/docs/releases.rst )
- [Commits](https://github.com/tornadoweb/tornado/compare/v6.0.4...v6.3.2 )
---
updated-dependencies:
- dependency-name: tornado
dependency-type: direct:production
...
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2023-05-25 16:16:01 -04:00
玩火
f04f549bae
Fix is_ninja_available() ( #23752 )
* Fix is_ninja_available()
search ninja using subprocess instead of importlib.
* Fix style
* Fix doc
* Fix style
2023-05-25 16:10:25 -04:00
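The fix as described, sketched: probe the `ninja` executable directly, since the Python package can be importable while the binary is absent (or vice versa).

```python
import subprocess

def is_ninja_available() -> bool:
    # `ninja --version` succeeding proves the build tool is actually usable.
    try:
        subprocess.check_output(["ninja", "--version"])
        return True
    except Exception:
        return False
```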
Arthur
3416bba7c7
[LongFormer] code nits, removed unused parameters ( #23749 )
* remove unused parameters
* remove unused parameters in config
2023-05-25 16:06:14 +02:00
Sylvain Gugger
6e4bc67099
Revamp test selection for the example tests ( #23737 )
* Revamp test selection for the example tests
* Rename old XLA test and fake modif in run_glue
* Fixes
* Fake Trainer modif
* Remove fake modifs
2023-05-25 09:38:21 -04:00
Sylvain Gugger
7d4fe85ef3
Fix push_to_hub in Trainer when nothing needs pushing ( #23751 )
2023-05-25 09:38:09 -04:00
Ravi Theja
06c28cd0fc
Add LlamaIndex to awesome-transformers.md ( #23484 )
2023-05-25 09:35:10 -04:00
Eric J. Wang
f0a2a82ab4
Fix `pip install --upgrade accelerate` command in modeling_utils.py ( #23747 )
Fix command in modeling_utils.py
2023-05-25 07:48:48 -04:00
Matt
e45e756d22
Remove the last few TF serving sigs ( #23738 )
Remove some more serving methods that (I think?) turned up while this PR was open
2023-05-24 21:19:44 +01:00
Sylvain Gugger
9850e6ddab
Enable prompts on the Hub ( #23662 )
* Enable prompts on the Hub
* Update src/transformers/tools/prompts.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Address review comments
---------
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
2023-05-24 16:09:13 -04:00
Zachary Mueller
75bbf20bce
Fix sagemaker DP/MP ( #23681 )
* Check for use_sagemaker_dp
* Add a check for is_sagemaker_mp when setting _n_gpu again. Should be last broken thing
* Try explicit check?
* Quality
2023-05-24 15:51:09 -04:00
Daniel King
89159651ba
Fix the regex in `get_imports` to support multiline try blocks and excepts with specific exception types ( #23725 )
* fix and test get_imports for multiline try blocks, and excepts with specific errors
* fixup
* add some more tests
* add license
2023-05-24 15:40:19 -04:00
Sanchit Gandhi
d8222be57e
[Whisper] Reduce batch size in tests ( #23736 )
2023-05-24 17:31:25 +01:00
Matt
814de8fac7
Overhaul TF serving signatures + dummy inputs ( #23234 )
* Let's try autodetecting serving sigs
* Don't clobber existing sigs
* Change shapes for multiplechoice models
* Make default dummy inputs smarter too
* Fix missing f-string
* Let's YOLO a serving output too
* Read __class__.__name__ properly
* Don't just pass naked lists in there and expect it to be okay
* Code cleanup
* Update default serving sig
* Clearer error messages
* Further updates to the default serving output
* make fixup
* Update the serving output a bit more
* Cleanups and renames, raise errors appropriately when we can't infer inputs
* More renames
* we're building in a functional context again, yolo
* import DUMMY_INPUTS from the right place
* import DUMMY_INPUTS from the right place
* Support cross-attention in the dummies
* Support cross-attention in the dummies
* Complete removal of dummy/serving overrides in BERT
* Complete removal of dummy/serving overrides in RoBERTa
* Obliterate lots and lots of serving sig and dummy overrides
* merge type hint changes
* Fix for token_type_ids with vocab_size 1
* Add missing property decorator
* Fix T5 and hopefully some models that take conv inputs
* More signature pruning
* Fix T5's signature
* Fix Wav2Vec2 signature
* Fix LongformerForMultipleChoice input signature
* Fix BLIP and LED
* Better default serving output error handling
* Fix BART dummies
* Fix dummies for cross-attention, esp encoder-decoder models
* Fix visionencoderdecoder signature
* Fix BLIP serving output
* Small tweak to BART dummies
* Cleanup the ugly parameter inspection line that I used in a few places
* committed a breakpoint again
* Move the text_dims check
* Remove blip_text serving_output
* Add decoder_input_ids to the default input sig
* Remove all the manual overrides for encoder-decoder model signatures
* Tweak longformer/led input sigs
* Tweak default serving output
* output.keys() -> output
* make fixup
2023-05-24 17:03:24 +01:00
Connor Henderson
3d7baef114
fix: Whisper generate, move text_prompt_ids trim up for max_new_tokens calculation ( #23724 )
move text_prompt_ids trimming to top
2023-05-24 11:34:21 -04:00
Jungnerd
50a56bedb6
fix: delete duplicate sentences in `document_question_answering.mdx` ( #23735 )
fix: delete duplicate sentence
2023-05-24 11:20:50 -04:00
Matt
d2d8822604
TF SAM memory reduction ( #23732 )
* Extremely small change to TF SAM dummies to reduce memory usage on build
* remove debug breakpoint
* Debug print statement to track array sizes
* More debug shape printing
* More debug shape printing
* Now remove the debug shape printing
* make fixup
* make fixup
2023-05-24 15:59:02 +01:00
pagarsky
28aa438cd2
Minor awesome-transformers.md fixes ( #23453 )
Minor docs fixes
2023-05-24 08:57:52 -04:00
Matt
f8b2574416
Better TF docstring types ( #23477 )
* Rework TF type hints to use | None instead of Optional[] for tf.Tensor
* Rework TF type hints to use | None instead of Optional[] for tf.Tensor
* Don't forget the imports
* Add the imports to tests too
* make fixup
* Refactor tests that depended on get_type_hints
* Better test refactor
* Fix an old hidden bug in the test_keras_fit input creation code
* Fix for the Deit tests
2023-05-24 13:52:52 +01:00
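What the reworked hints look like, sketched; the `__future__` import is what lets `X | None` parse on Python versions before 3.10:

```python
from __future__ import annotations

import tensorflow as tf

class ExampleTFLayer(tf.keras.layers.Layer):
    def call(
        self,
        input_ids: tf.Tensor | None = None,      # previously Optional[tf.Tensor]
        attention_mask: tf.Tensor | None = None,
        training: bool = False,
    ) -> tf.Tensor:
        ...
```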
Wang, Yi
767e6b5314
fix: gptj could not be jit.traced on GPU ( #23317 )
Signed-off-by: Wang, Yi A <yi.a.wang@intel.com>
2023-05-24 08:48:31 -04:00
uchuhimo
b4698b7ef2
fix: use bool instead of uint8/byte in Deberta/DebertaV2/SEW-D to make it compatible with TensorRT ( #23683 )
* Use bool instead of uint8/byte in DebertaV2 to make it compatible with TensorRT
TensorRT cannot accept an ONNX graph with uint8/byte intermediate tensors. This PR uses bool tensors instead of uint8/byte tensors so that the exported ONNX file works with TensorRT.
* fix: use bool instead of uint8/byte in Deberta and SEW-D
---------
Co-authored-by: Yuxian Qiu <yuxianq@nvidia.com>
2023-05-24 08:47:43 -04:00
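The shape of that change, sketched: keep masks boolean end to end rather than round-tripping through uint8, which TensorRT rejects in exported ONNX graphs.

```python
import torch

attention_mask = torch.tensor([1, 1, 0])
scores = torch.randn(1, 3)

# Before (sketch): mask = (attention_mask > 0).to(torch.uint8)  # TensorRT-hostile
mask = attention_mask > 0  # dtype: torch.bool, fine for masked_fill and ONNX export
scores = scores.masked_fill(~mask, torch.finfo(scores.dtype).min)
```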
Maria Khalusova
2eaaf17a0b
Export to ONNX doc refocused on using optimum, added tflite ( #23434 )
* doc refocused on using optimum, tflite
* minor updates to fix checks
* Apply suggestions from code review
Co-authored-by: regisss <15324346+regisss@users.noreply.github.com>
* TFLite to separate page, added links
* Removed the onnx list builder
* make style
* Update docs/source/en/serialization.mdx
Co-authored-by: regisss <15324346+regisss@users.noreply.github.com>
---------
Co-authored-by: regisss <15324346+regisss@users.noreply.github.com>
2023-05-24 08:13:23 -04:00
Tim Dettmers
796162c512
Paged Optimizer + Lion Optimizer for Trainer ( #23217 )
* Added lion and paged optimizers and made original tests pass.
* Added tests for paged and lion optimizers.
* Added and fixed optimizer tests.
* Style and quality checks.
---------
Co-authored-by: younesbelkada <younesbelkada@gmail.com>
2023-05-24 12:53:28 +02:00
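The new optimizers are selected through the `optim` training argument; a usage sketch (option strings per my reading of the PR):

```python
from transformers import TrainingArguments

args = TrainingArguments(
    output_dir="out",
    optim="paged_adamw_8bit",  # other additions: "paged_adamw_32bit", "lion_8bit",
                               # "lion_32bit", "paged_lion_8bit", "paged_lion_32bit"
)
```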
Tim Dettmers
9d73b92269
4-bit QLoRA via bitsandbytes (4-bit base model + LoRA) ( #23479 )
* Added lion and paged optimizers and made original tests pass.
* Added tests for paged and lion optimizers.
* Added and fixed optimizer tests.
* Style and quality checks.
* Initial draft. Some tests fail.
* Fixed dtype bug.
* Fixed bug caused by torch_dtype='auto'.
* All test green for 8-bit and 4-bit layers.
* Added fix for fp32 layer norms and bf16 compute in LLaMA.
* Initial draft. Some tests fail.
* Fixed dtype bug.
* Fixed bug caused by torch_dtype='auto'.
* All test green for 8-bit and 4-bit layers.
* Added lion and paged optimizers and made original tests pass.
* Added tests for paged and lion optimizers.
* Added and fixed optimizer tests.
* Style and quality checks.
* Fixing issues for PR #23479.
* Added fix for fp32 layer norms and bf16 compute in LLaMA.
* Reverted variable name change.
* Initial draft. Some tests fail.
* Fixed dtype bug.
* Fixed bug caused by torch_dtype='auto'.
* All test green for 8-bit and 4-bit layers.
* Added lion and paged optimizers and made original tests pass.
* Added tests for paged and lion optimizers.
* Added and fixed optimizer tests.
* Style and quality checks.
* Added missing tests.
* Fixup changes.
* Added fixup changes.
* Missed some variables to rename.
* revert trainer tests
* revert test trainer
* another revert
* fix tests and safety checkers
* protect import
* simplify a bit
* Update src/transformers/trainer.py
* few fixes
* add warning
* replace with `load_in_kbit = load_in_4bit or load_in_8bit`
* fix test
* fix tests
* this time fix tests
* safety checker
* add docs
* revert torch_dtype
* Apply suggestions from code review
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* multiple fixes
* update docs
* version checks and multiple fixes
* replace `is_loaded_in_kbit`
* replace `load_in_kbit`
* change methods names
* better checks
* oops
* oops
* address final comments
---------
Co-authored-by: younesbelkada <younesbelkada@gmail.com>
Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2023-05-24 12:52:45 +02:00
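Roughly how the 4-bit path added here is used (checkpoint id illustrative):

```python
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",             # QLoRA's NormalFloat4
    bnb_4bit_use_double_quant=True,        # quantize the quantization constants
    bnb_4bit_compute_dtype=torch.bfloat16,
)
model = AutoModelForCausalLM.from_pretrained(
    "huggyllama/llama-7b",                 # illustrative checkpoint
    quantization_config=bnb_config,
    device_map="auto",
)
```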
Wang, Yi
33687a3f61
add GPTJ/bloom/llama/opt into model list and enhance the jit support ( #23291 )
Signed-off-by: Wang, Yi A <yi.a.wang@intel.com>
2023-05-24 10:57:56 +01:00
zspo
003a0cf8cc
Fix some docs what layerdrop does ( #23691 )
* Fix some docs what layerdrop does
* Update src/transformers/models/data2vec/configuration_data2vec_audio.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* Fix more docs
---------
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2023-05-23 14:50:40 -04:00
小桐桐
357f281ba2
fix: load_best_model_at_end error when load_in_8bit is True ( #23443 )
Ref: https://github.com/huggingface/peft/issues/394
Loading a quantized checkpoint into a non-quantized Linear8bitLt is not supported.
Call module.cuda() before module.load_state_dict().
2023-05-23 14:50:27 -04:00
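The workaround, sketched (`model` is assumed to be the 8-bit-quantized model; path illustrative): move to GPU first, since Linear8bitLt only quantizes its weights during `.cuda()`.

```python
import torch

model.cuda()  # quantizes Linear8bitLt weights; must happen before loading
state_dict = torch.load("best_checkpoint/pytorch_model.bin", map_location="cuda")
model.load_state_dict(state_dict)
```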
Yih-Dar
de5f86e59d
Skip `TFCvtModelTest::test_keras_fit_mixed_precision` for now ( #23699 )
fix
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2023-05-23 20:47:47 +02:00
LWprogramming
3d57404464
is_batched fix for remaining 2-D numpy arrays ( #23309 )
* Fix is_batched code to allow 2-D numpy arrays for audio
* Tests
* Fix typo
* Incorporate comments from PR #23223
2023-05-23 14:37:35 -04:00
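A simplified version of the corrected check (the real helper lives in the audio feature extractors):

```python
import numpy as np

def is_batched(raw_speech) -> bool:
    # A 2-D numpy array is a batch of 1-D waveforms, not a single example --
    # the case the earlier check missed.
    if isinstance(raw_speech, np.ndarray):
        return raw_speech.ndim > 1
    return bool(
        isinstance(raw_speech, (list, tuple))
        and raw_speech
        and isinstance(raw_speech[0], (np.ndarray, tuple, list))
    )
```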
Younes Belkada
6b7d6f848b
[Blip] Fix blip doctest ( #23698 )
fix blip doctest
2023-05-23 18:25:44 +02:00
Matt
876d9a32c6
TF version compatibility fixes ( #23663 )
* New TF version compatibility fixes
* Remove dummy print statement, move expand_1d
* Make a proper framework inference function
* Make a proper framework inference function
* ValueError -> TypeError
2023-05-23 16:42:11 +01:00