Sylvain Gugger
00f6ba0e7e
Skip failing test for now
2023-05-31 06:31:33 -04:00
Sourab Mangrulkar
a73b1d59a3
Integrate DeepSpeed and gradient accumulation with Accelerate ( #23236 )
...
* mixed precision support via accelerate
* fix issues
* fix for the sharded ddp case
* fix flax and tf failing tests
* refactor the place to create `Accelerator` object
* move ddp prep to accelerate
* fix 😅
* resolving comments
* move fsdp handling to accelerate
* fixes
* fix saving
* shift torch dynamo handling to accelerate
* shift deepspeed integration and save & load utils to accelerate
* fix accelerate launcher support
* oops
* fix 🐛
* save ckpt fix
* Trigger CI
* nasty 🐛 😅
* as deepspeed needs grad_acc fixes, transfer grad_acc to accelerate
* make tests happy
* quality ✨
* loss tracked needs to account for grad_acc
* fixing the deepspeed tests
* quality ✨
* 😅 😅 😅
* tests 😡
* quality ✨
* Trigger CI
* resolve comments and fix the issue with the previous merge from branch
* Trigger CI
* accelerate took over deepspeed integration
---------
Co-authored-by: Stas Bekman <stas@stason.org>
2023-05-31 15:16:22 +05:30
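For the gradient-accumulation handover above ( #23236 ), a minimal sketch of the Accelerate API the Trainer now delegates to; toy model and data, not the Trainer's actual internals:

```python
import torch
from accelerate import Accelerator
from torch.utils.data import DataLoader, TensorDataset

model = torch.nn.Linear(4, 2)
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
loader = DataLoader(TensorDataset(torch.randn(32, 4), torch.randn(32, 2)), batch_size=8)

# Gradient accumulation is configured on, and handled by, the Accelerator.
accelerator = Accelerator(gradient_accumulation_steps=4)
model, optimizer, loader = accelerator.prepare(model, optimizer, loader)

for inputs, targets in loader:
    # accumulate() defers gradient sync, and the prepared optimizer skips its
    # step, until 4 micro-batches have been processed.
    with accelerator.accumulate(model):
        loss = torch.nn.functional.mse_loss(model(inputs), targets)
        accelerator.backward(loss)
        optimizer.step()
        optimizer.zero_grad()
```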
Denisa Roberts
88f50a1e89
Add TensorFlow implementation of EfficientFormer ( #22620 )
...
* Add tf code for efficientformer
* Fix return dict bug - return last hidden state after last stage
* Fix corresponding return dict bug
* Override test tol
* Change default values of training to False
* Set training to default False X3
* Rm axis from ln
* Set init in dense projection
* Rm debug stuff
* Make style; all tests pass.
* Modify year to 2023
* Fix attention biases codes
* Update the shape list logic
* Add a batch norm eps config
* Remove extra comments in test files
* Add conditional attn and hidden states return for serving output
* Change channel dim checking logic
* Add exception for the with-teacher model in training mode
* Revert layer count for now
* Add layer count for conditional layer naming
* Transpose for conv happens only in main layer
* Make tests smaller
* Make style
* Update doc
* Rm from_pt
* Change to actual expected image class label
* Remove stray print in tests
* Update image processor test
* Remove the old serving output logic
* Make style
* Make style
* Complete test
2023-05-31 10:43:12 +01:00
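For the TF EfficientFormer port above ( #22620 ), illustrative usage; whether a given checkpoint ships TF weights is an assumption (the PR's "Rm from_pt" commit suggests the reference ones do):

```python
from transformers import TFEfficientFormerForImageClassification

# Checkpoint name is assumed for illustration.
model = TFEfficientFormerForImageClassification.from_pretrained(
    "snap-research/efficientformer-l1-300"
)
```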
Sylvain Gugger
9fea71b465
Fix last instances of kbit -> quantized ( #23797 )
2023-05-31 11:38:20 +02:00
Sam Passaglia
38dbbc2640
Fix bug leading to missing token in GPTSanJapaneseTokenizer ( #23883 )
...
* add \n
* removed copied from header
2023-05-31 11:32:27 +02:00
Sourab Mangrulkar
03db591047
shift torch dynamo handling to accelerate ( #23168 )
...
* mixed precision support via accelerate
* fix issues
* fix for the sharded ddp case
* fix flax and tf failing tests
* refactor the place to create `Accelerator` object
* move ddp prep to accelerate
* fix 😅
* resolving comments
* move fsdp handling to accelerate
* fixes
* fix saving
* shift torch dynamo handling to accelerate
2023-05-31 14:42:07 +05:30
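For the TorchDynamo handover above ( #23168 ), the user-facing knobs stay on TrainingArguments while the compilation itself moves to Accelerate; a sketch (requires PyTorch >= 2.0):

```python
from transformers import TrainingArguments

args = TrainingArguments(
    output_dir="out",
    torch_compile=True,                # request torch.compile on the model
    torch_compile_backend="inductor",  # the default dynamo backend
)
```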
Sourab Mangrulkar
0b774074a5
move fsdp handling to accelerate ( #23158 )
...
* mixed precision support via accelerate
* fix issues
* fix for the sharded ddp case
* fix flax and tf failing tests
* refactor the place to create `Accelerator` object
* move ddp prep to accelerate
* fix 😅
* resolving comments
* move fsdp handling to accelerate
* fixes
* fix saving
2023-05-31 14:10:46 +05:30
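For the FSDP handover above ( #23158 ), the TrainingArguments flags are now mapped onto Accelerate's FSDP plugin; a sketch, noting the exact fsdp_config keys vary by version and the layer class is illustrative:

```python
from transformers import TrainingArguments

# Takes effect only under a distributed launcher such as
# `accelerate launch` or `torchrun`.
args = TrainingArguments(
    output_dir="out",
    fsdp="full_shard auto_wrap",
    fsdp_config={"fsdp_transformer_layer_cls_to_wrap": ["BertLayer"]},
)
```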
Sohyun Sim
015829e6c4
🌐 [i18n-KO] Translated pad_truncation.mdx to Korean ( #23823 )
...
* docs: ko: pad_truncation.mdx
* feat: manual draft
* fix: resolve suggestions
Co-authored-by: Hyeonseo Yun <0525yhs@gmail.com>
---------
Co-authored-by: Hyeonseo Yun <0525yhs@gmail.com>
2023-05-31 10:23:59 +02:00
Sourab Mangrulkar
1cf148a6aa
Smangrul/accelerate ddp integrate ( #23151 )
...
* mixed precision support via accelerate
* fix issues
* fix for the sharded ddp case
* fix flax and tf failing tests
* refactor the place to create `Accelerator` object
* move ddp prep to accelerate
* fix 😅
* resolving comments
2023-05-31 13:42:49 +05:30
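For the DDP preparation move above ( #23151 ), a minimal sketch of the Accelerate call the Trainer now relies on:

```python
import torch
from accelerate import Accelerator

# Under `accelerate launch`/`torchrun`, prepare() returns the model wrapped
# in DistributedDataParallel; in a single process it is returned unwrapped.
accelerator = Accelerator()
model = torch.nn.Linear(4, 2)
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
model, optimizer = accelerator.prepare(model, optimizer)
```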
Sourab Mangrulkar
9f0646a555
Smangrul/accelerate mp integrate ( #23148 )
...
* mixed precision support via accelerate
* fix issues
* fix for the sharded ddp case
* fix flax and tf failing tests
* refactor the place to create `Accelerator` object
* address comments by removing debugging print statements
2023-05-31 12:27:51 +05:30
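For the mixed-precision move above ( #23148 ), a minimal sketch, assuming a CUDA device is available (fp16 autocast is not supported on CPU):

```python
import torch
from accelerate import Accelerator

# Mixed precision is declared once on the Accelerator ("no", "fp16", "bf16")
# instead of the Trainer driving torch.cuda.amp directly.
accelerator = Accelerator(mixed_precision="fp16")
model = accelerator.prepare(torch.nn.Linear(4, 2))
with accelerator.autocast():
    out = model(torch.randn(1, 4, device=accelerator.device))
```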
Abhinav Patil
de9255de27
Adds AutoProcessor.from_pretrained support for MCTCTProcessor ( #23856 )
...
Adds support for AutoProcessor.from_pretrained to MCTCTProcessor models
2023-05-30 14:36:18 -04:00
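What the change above enables, sketched; the checkpoint is the reference M-CTC-T model:

```python
from transformers import AutoProcessor

# MCTCTProcessor now resolves through the AutoProcessor mapping.
processor = AutoProcessor.from_pretrained("speechbrain/m-ctc-t-large")
```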
George
6451ad0471
Fix pickling issue caused by a lambda function definition ( #23869 )
...
* Fix pickling issue caused by a lambda function definition
* fix type
* Made helper function private
* delete tab
---------
Co-authored-by: georgebredis <9454-georgebredis@users.noreply.gitlab.aicrowd.com>
2023-05-30 13:26:37 -04:00
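Why the PR above swaps a lambda for a private module-level helper: pickle serializes functions by qualified name, and a lambda has none that can be looked up. A self-contained illustration:

```python
import pickle

def _square(x):
    # Module-level helper: pickled by reference to its qualified name.
    return x * x

square = lambda x: x * x  # __qualname__ is "<lambda>", not "square"

pickle.dumps(_square)  # works
try:
    pickle.dumps(square)
except (pickle.PicklingError, AttributeError) as err:
    print(f"cannot pickle the lambda: {err}")
```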
Arthur
af2aac51fc
[from_pretrained] improve the error message when _no_split_modules is not defined ( #23861 )
...
* Better warning
* Update src/transformers/modeling_utils.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* format line
---------
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2023-05-30 17:12:14 +02:00
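A sketch of the situation the clearer message above targets; the repo id is a placeholder, not a real checkpoint:

```python
from transformers import AutoModel

# device_map="auto" needs _no_split_modules on the model class to know which
# blocks must not be split across devices; without it, this call is the one
# that now raises the more informative ValueError.
model = AutoModel.from_pretrained("org/custom-model", device_map="auto")
```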
Vijeth Moudgalya
58022e41b8
#23388 Issue: Update RoBERTa configuration ( #23863 )
2023-05-30 10:53:40 -04:00
Arthur
6fc0454b2f
[LlamaTokenizerFast] nit: update post_processor on the fly ( #23855 )
...
* Update the processor when changing add_eos and add_bos
* fixup
* update
* add a test
* fix failing tests
* fixup
2023-05-30 16:50:41 +02:00
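What "on the fly" means above, sketched; the checkpoint name is illustrative:

```python
from transformers import LlamaTokenizerFast

tok = LlamaTokenizerFast.from_pretrained("huggyllama/llama-7b")
# Toggling these flags now rebuilds the fast tokenizer's post-processor
# immediately instead of requiring a reload.
tok.add_eos_token = True
assert tok("hello").input_ids[-1] == tok.eos_token_id
tok.add_bos_token = False
assert tok("hello").input_ids[0] != tok.bos_token_id
```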
Clémentine Fourrier
0623f08e99
Update collating_graphormer.py ( #23862 )
2023-05-30 10:23:20 -04:00
peridotml
62ba64b90a
Adds a FlyteCallback ( #23759 )
...
* initial flyte callback
* lint
* logs should still be saved to Flyte even if pandas isn't installed (unlikely)
* cr - flyte team
* add docs for FlyteCallback
* fix doc string - cr sgugger
* Apply suggestions from code review
cr - sgugger fix doc strings
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
---------
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2023-05-30 10:08:07 -04:00
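Hypothetical wiring of the new callback inside a Flyte task; `model` and `train_dataset` are assumed to be provided by the surrounding task:

```python
from transformers import Trainer, TrainingArguments
from transformers.integrations import FlyteCallback

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="out"),
    train_dataset=train_dataset,
    callbacks=[FlyteCallback()],  # forwards logs/checkpoints to Flyte
)
trainer.train()
```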
Hyeonseo Yun
867316670a
🌐 [i18n-KO] Translated troubleshooting.mdx to Korean ( #23166 )
...
* docs: ko: troubleshooting.mdx
* revised: fix _toctree.yml #23112
* feat: nmt draft `troubleshooting.mdx`
* fix: manual edits `troubleshooting.mdx`
* revised: resolve suggestions troubleshooting.mdx
Co-authored-by: Sohyun Sim <96299403+sim-so@users.noreply.github.com>
---------
Co-authored-by: Sohyun Sim <96299403+sim-so@users.noreply.github.com>
2023-05-30 09:49:47 -04:00
Kihoon Son
192aa04783
[i18n-KO] Translated video_classification.mdx to Korean ( #23026 )
...
* task/video_classification translated
Co-Authored-By: Hyeonseo Yun <0525_hhgus@naver.com>
Co-Authored-By: Gabriel Yang <gabrielwithhappy@gmail.com>
Co-Authored-By: Sohyun Sim <96299403+sim-so@users.noreply.github.com>
Co-Authored-By: Nayeon Han <nayeon2.han@gmail.com>
Co-Authored-By: Wonhyeong Seo <wonhseo@kakao.com>
Co-Authored-By: Jungnerd <46880056+jungnerd@users.noreply.github.com>
* Update docs/source/ko/tasks/video_classification.mdx
Co-authored-by: Jungnerd <46880056+jungnerd@users.noreply.github.com>
* Update docs/source/ko/tasks/video_classification.mdx
Co-authored-by: Jungnerd <46880056+jungnerd@users.noreply.github.com>
* Update docs/source/ko/tasks/video_classification.mdx
Co-authored-by: Jungnerd <46880056+jungnerd@users.noreply.github.com>
* Update docs/source/ko/tasks/video_classification.mdx
Co-authored-by: Jungnerd <46880056+jungnerd@users.noreply.github.com>
* Update docs/source/ko/tasks/video_classification.mdx
Co-authored-by: Jungnerd <46880056+jungnerd@users.noreply.github.com>
* Update docs/source/ko/tasks/video_classification.mdx
Co-authored-by: Jungnerd <46880056+jungnerd@users.noreply.github.com>
* Update docs/source/ko/tasks/video_classification.mdx
Co-authored-by: Jungnerd <46880056+jungnerd@users.noreply.github.com>
* Update docs/source/ko/tasks/video_classification.mdx
Co-authored-by: Jungnerd <46880056+jungnerd@users.noreply.github.com>
* Update docs/source/ko/tasks/video_classification.mdx
Co-authored-by: Sohyun Sim <96299403+sim-so@users.noreply.github.com>
* Update docs/source/ko/tasks/video_classification.mdx
Co-authored-by: Sohyun Sim <96299403+sim-so@users.noreply.github.com>
* Apply suggestions from code review
Co-authored-by: Sohyun Sim <96299403+sim-so@users.noreply.github.com>
Co-authored-by: Hyeonseo Yun <0525yhs@gmail.com>
Co-authored-by: Jungnerd <46880056+jungnerd@users.noreply.github.com>
Co-authored-by: Gabriel Yang <gabrielwithhappy@gmail.com>
* Update video_classification.mdx
* Update _toctree.yml
* Update _toctree.yml
* Update _toctree.yml
* Update _toctree.yml
---------
Co-authored-by: Hyeonseo Yun <0525_hhgus@naver.com>
Co-authored-by: Gabriel Yang <gabrielwithhappy@gmail.com>
Co-authored-by: Sohyun Sim <96299403+sim-so@users.noreply.github.com>
Co-authored-by: Nayeon Han <nayeon2.han@gmail.com>
Co-authored-by: Wonhyeong Seo <wonhseo@kakao.com>
Co-authored-by: Jungnerd <46880056+jungnerd@users.noreply.github.com>
Co-authored-by: Hyeonseo Yun <0525yhs@gmail.com>
2023-05-30 09:28:44 -04:00
Kihoon Son
a077f710f3
🌐 [i18n-KO] Translated fast_tokenizers.mdx to Korean ( #22956 )
...
* docs: ko: fast_tokenizer.mdx
content - translated
Co-Authored-By: Gabriel Yang <gabrielwithhappy@gmail.com>
Co-Authored-By: Nayeon Han <nayeon2.han@gmail.com>
Co-Authored-By: Hyeonseo Yun <0525_hhgus@naver.com>
Co-Authored-By: Sohyun Sim <96299403+sim-so@users.noreply.github.com>
Co-Authored-By: Jungnerd <46880056+jungnerd@users.noreply.github.com>
Co-Authored-By: Wonhyeong Seo <wonhseo@kakao.com>
* Update docs/source/ko/fast_tokenizers.mdx
Co-authored-by: Sohyun Sim <96299403+sim-so@users.noreply.github.com>
* Update docs/source/ko/fast_tokenizers.mdx
Co-authored-by: Sohyun Sim <96299403+sim-so@users.noreply.github.com>
* Update docs/source/ko/fast_tokenizers.mdx
Co-authored-by: Sohyun Sim <96299403+sim-so@users.noreply.github.com>
* Update docs/source/ko/fast_tokenizers.mdx
Co-authored-by: Sohyun Sim <96299403+sim-so@users.noreply.github.com>
* Update docs/source/ko/fast_tokenizers.mdx
Co-authored-by: Sohyun Sim <96299403+sim-so@users.noreply.github.com>
* Update docs/source/ko/fast_tokenizers.mdx
Co-authored-by: Sohyun Sim <96299403+sim-so@users.noreply.github.com>
* Update docs/source/ko/fast_tokenizers.mdx
Co-authored-by: Hyeonseo Yun <0525yhs@gmail.com>
* Update fast_tokenizers.mdx
* Update fast_tokenizers.mdx
* Update fast_tokenizers.mdx
* Update fast_tokenizers.mdx
* Update _toctree.yml
---------
Co-authored-by: Gabriel Yang <gabrielwithhappy@gmail.com>
Co-authored-by: Nayeon Han <nayeon2.han@gmail.com>
Co-authored-by: Hyeonseo Yun <0525_hhgus@naver.com>
Co-authored-by: Sohyun Sim <96299403+sim-so@users.noreply.github.com>
Co-authored-by: Jungnerd <46880056+jungnerd@users.noreply.github.com>
Co-authored-by: Wonhyeong Seo <wonhseo@kakao.com>
Co-authored-by: Hyeonseo Yun <0525yhs@gmail.com>
2023-05-30 09:27:40 -04:00
Matthijs Hollemans
2faa09530b
fix Whisper tests on GPU ( #23753 )
...
* move input features to GPU
* skip these tests because undefined behavior
* unskip tests
2023-05-30 09:06:58 -04:00
Matt
ac224dee90
TF SAM shape flexibility fixes ( #23842 )
...
SAM shape flexibility fixes for compilation
2023-05-30 13:08:44 +01:00
Samin Yasar
af45ec0a16
add type hint in pipeline model argument ( #23740 )
...
* add type hint in pipeline model argument
* add PreTrainedModel and TFPreTrainedModel type hints
* make type hints string
2023-05-30 11:05:58 +01:00
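The "make type hints string" commit above uses forward references so the hint does not force a torch/TF import; a simplified sketch of the pattern, not the full pipeline signature:

```python
from typing import Optional, Union

# String annotations are stored as ForwardRefs and only resolved by type
# checkers, so PreTrainedModel/TFPreTrainedModel need not be importable here.
def pipeline(
    task: str,
    model: Optional[Union[str, "PreTrainedModel", "TFPreTrainedModel"]] = None,
):
    ...
```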
Eli Simhayev
4b6a5a7caa
[Time-Series] Autoformer model ( #21891 )
...
* ran `transformers-cli add-new-model-like`
* added `AutoformerLayernorm` and `AutoformerSeriesDecomposition`
* added `decomposition_layer` in `init` and `moving_avg` to config
* added `AutoformerAutoCorrelation` to encoder & decoder
* removed canonical self-attention `AutoformerAttention`
* added arguments in config and model tester. Init works! 😁
* WIP autoformer attention with autocorrelation
* fixed `attn_weights` size
* wip time_delay_agg_training
* fixing sizes and debug time_delay_agg_training
* aggregation in training works! 😁
* `top_k_delays` -> `top_k_delays_index` and added `contiguous()`
* wip time_delay_agg_inference
* finish time_delay_agg_inference 😎
* added resize to autocorrelation
* bug fix: added the length of the output signal to `irfft`
* `attention_mask = None` in the decoder
* fixed test: changed attention expected size, `test_attention_outputs` works!
* removed unnecessary code
* apply AutoformerLayernorm in final norm in enc & dec
* added series decomposition to the encoder
* added series decomp to decoder, with inputs
* added trend todos
* added autoformer to README
* added to index
* added autoformer.mdx
* remove scaling and init attention_mask in the decoder
* make style
* fix copies
* make fix-copies
* initial fix-copies
* fix from https://github.com/huggingface/transformers/pull/22076
* make style
* fix class names
* added trend
* added d_model and projection layers
* added `trend_projection` source, and decomp layer init
* added trend & seasonal init for decoder input
* AutoformerModel cannot be copied as it has the decomp layer too
* encoder can be copied from time series transformer
* fixed generation and made distribution output more robust
* use context window to calculate decomposition
* use the context_window for decomposition
* use output_params helper
* clean up AutoformerAttention
* subsequences_length off by 1
* make fix copies
* fix test
* added init for nn.Conv1d
* fix IGNORE_NON_TESTED
* added model_doc
* fix ruff
* ignore tests
* remove dup
* fix SPECIAL_CASES_TO_ALLOW
* do not copy due to conv1d weight init
* remove unused imports
* added short summary
* added label_length and made the model non-autoregressive
* added params docs
* better doc for `factor`
* fix tests
* renamed `moving_avg` to `moving_average`
* renamed `factor` to `autocorrelation_factor`
* make style
* Update src/transformers/models/autoformer/configuration_autoformer.py
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>
* Update src/transformers/models/autoformer/configuration_autoformer.py
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>
* fix configurations
* fix integration tests
* Update src/transformers/models/autoformer/configuration_autoformer.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* fixing `lags_sequence` doc
* Revert "fixing `lags_sequence` doc"
This reverts commit 21e34911e3.
* Update src/transformers/models/autoformer/modeling_autoformer.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Update src/transformers/models/autoformer/modeling_autoformer.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Update src/transformers/models/autoformer/modeling_autoformer.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Apply suggestions from code review
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Update src/transformers/models/autoformer/configuration_autoformer.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* model layers now take the config
* added `layer_norm_eps` to the config
* Update src/transformers/models/autoformer/modeling_autoformer.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* added `config.layer_norm_eps` to AutoformerLayernorm
* added `config.layer_norm_eps` to all layernorm layers
* Update src/transformers/models/autoformer/configuration_autoformer.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Update src/transformers/models/autoformer/configuration_autoformer.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Update src/transformers/models/autoformer/configuration_autoformer.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Update src/transformers/models/autoformer/configuration_autoformer.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* fix variable names
* added initial pretrained model
* added use_cache docstring
* doc strings for trend and use_cache
* fix order of args
* imports on one line
* fixed get_lagged_subsequences docs
* add docstring for create_network_inputs
* get rid of layer_norm_eps config
* add back layernorm
* update fixture location
* fix signature
* use AutoformerModelOutput dataclass
* fix pretrain config
* no need as default exists
* subclass ModelOutput
* remove layer_norm_eps config
* fix test_model_outputs_equivalence test
* test hidden_states_output
* make fix-copies
* Update src/transformers/models/autoformer/configuration_autoformer.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* removed unused attr
* Update tests/models/autoformer/test_modeling_autoformer.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Update src/transformers/models/autoformer/modeling_autoformer.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Update src/transformers/models/autoformer/modeling_autoformer.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Update src/transformers/models/autoformer/modeling_autoformer.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Update src/transformers/models/autoformer/modeling_autoformer.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Update src/transformers/models/autoformer/modeling_autoformer.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Update src/transformers/models/autoformer/modeling_autoformer.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* use AutoFormerDecoderOutput
* fix formatting
* fix formatting
---------
Co-authored-by: Kashif Rasul <kashif.rasul@gmail.com>
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
2023-05-30 10:23:32 +02:00
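Illustrative construction of the new model; the values are arbitrary, and the two options renamed late in the PR (`moving_average`, `autocorrelation_factor`) are shown explicitly:

```python
from transformers import AutoformerConfig, AutoformerModel

config = AutoformerConfig(
    prediction_length=24,
    context_length=48,
    moving_average=25,         # kernel size of the series-decomposition layer
    autocorrelation_factor=3,  # top-k factor in the autocorrelation block
)
model = AutoformerModel(config)
```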
Sylvain Gugger
17a55534f5
Enable code-specific revision for code on the Hub ( #23799 )
...
* Enable code-specific revision for code on the Hub
* invalidate old revision
2023-05-26 15:51:15 -04:00
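A sketch of the new knob; repo id and revision are placeholders:

```python
from transformers import AutoModel

# code_revision pins the modeling code fetched from the Hub independently
# of the revision used for the weights.
model = AutoModel.from_pretrained(
    "org/custom-model",
    trust_remote_code=True,
    code_revision="v1.0.0",
)
```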
Zachary Mueller
edf7772826
Log the right train_batch_size if using auto_find_batch_size and also log the adjusted value separately. ( #23800 )
...
* Log right bs
* Log
* Diff message
2023-05-26 15:09:05 -04:00
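The setting involved, for context:

```python
from transformers import TrainingArguments

# With auto_find_batch_size, the per-device batch size is halved on CUDA OOM
# until training fits; the Trainer now logs the adjusted value separately
# from the originally configured one.
args = TrainingArguments(
    output_dir="out",
    per_device_train_batch_size=64,  # starting point only
    auto_find_batch_size=True,
)
```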
Ran Ran
e724246935
Fix no such file or directory error ( #23783 )
...
* Fix no such file or directory error
* Address comment
* Fix formatting issue
2023-05-26 14:24:57 -04:00
Wang, Yi
b7b729b38d
no_cuda does not take effect in non-distributed environment ( #23795 )
...
Signed-off-by: Wang, Yi <yi.a.wang@intel.com>
2023-05-26 10:47:51 -04:00
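The flag being fixed, for context:

```python
from transformers import TrainingArguments

# Force CPU training even when CUDA is visible; the fix makes this honored
# in a plain single-process run, not just distributed ones.
args = TrainingArguments(output_dir="out", no_cuda=True)
```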
amitportnoy
d61d747627
Update trainer.mdx class_weights example ( #23787 )
...
class_weights tensor should follow model's device
2023-05-26 08:36:33 -04:00
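A sketch of the corrected docs pattern: the class-weight tensor is built on the model's device so it matches the logits (weights and label count are illustrative):

```python
import torch
from transformers import Trainer

class WeightedLossTrainer(Trainer):
    def compute_loss(self, model, inputs, return_outputs=False):
        labels = inputs.pop("labels")
        outputs = model(**inputs)
        logits = outputs.logits
        # Build the weights where the logits live, not on the default device.
        loss_fct = torch.nn.CrossEntropyLoss(
            weight=torch.tensor([1.0, 3.0], device=model.device)
        )
        loss = loss_fct(logits.view(-1, 2), labels.view(-1))
        return (loss, outputs) if return_outputs else loss
```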
Sylvain Gugger
4d9b76a80f
Fix RWKV backward on GPU ( #23774 )
2023-05-26 08:33:17 -04:00
Arthur
8d28dba35d
[OPT] Doc nit, using fast is fine ( #23789 )
...
small doc nit
2023-05-26 14:30:32 +02:00
Younes Belkada
f67dac97bd
[Nllb-Moe] Fix nllb moe accelerate issue ( #23758 )
...
fix nllb moe accelerate issue
2023-05-25 22:37:33 +02:00
dependabot[bot]
d685e330b5
Bump tornado from 6.0.4 to 6.3.2 in /examples/research_projects/visual_bert ( #23767 )
...
Bump tornado in /examples/research_projects/visual_bert
Bumps [tornado](https://github.com/tornadoweb/tornado ) from 6.0.4 to 6.3.2.
- [Changelog](https://github.com/tornadoweb/tornado/blob/master/docs/releases.rst )
- [Commits](https://github.com/tornadoweb/tornado/compare/v6.0.4...v6.3.2 )
---
updated-dependencies:
- dependency-name: tornado
dependency-type: direct:production
...
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2023-05-25 16:16:12 -04:00
dependabot[bot]
4b0e7ded1c
Bump tornado from 6.0.4 to 6.3.2 in /examples/research_projects/lxmert ( #23766 )
...
Bumps [tornado](https://github.com/tornadoweb/tornado ) from 6.0.4 to 6.3.2.
- [Changelog](https://github.com/tornadoweb/tornado/blob/master/docs/releases.rst )
- [Commits](https://github.com/tornadoweb/tornado/compare/v6.0.4...v6.3.2 )
---
updated-dependencies:
- dependency-name: tornado
dependency-type: direct:production
...
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2023-05-25 16:16:01 -04:00
玩火
f04f549bae
Fix is_ninja_available() ( #23752 )
...
* Fix is_ninja_available()
search ninja using subprocess instead of importlib.
* Fix style
* Fix doc
* Fix style
2023-05-25 16:10:25 -04:00
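The gist of the fix, as a simplified sketch: probe the binary on PATH rather than checking for the `ninja` Python package with importlib:

```python
import subprocess

def is_ninja_available() -> bool:
    try:
        subprocess.check_output(["ninja", "--version"])
    except Exception:
        return False
    return True

print(is_ninja_available())
```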
Arthur
3416bba7c7
[LongFormer] code nits, removed unused parameters ( #23749 )
...
* remove unused parameters
* remove unused parameters in config
2023-05-25 16:06:14 +02:00
Sylvain Gugger
6e4bc67099
Revamp test selection for the example tests ( #23737 )
...
* Revamp test selection for the example tests
* Rename old XLA test and fake modif in run_glue
* Fixes
* Fake Trainer modif
* Remove fake modifs
2023-05-25 09:38:21 -04:00
Sylvain Gugger
7d4fe85ef3
Fix push_to_hub in Trainer when nothing needs pushing ( #23751 )
2023-05-25 09:38:09 -04:00
Ravi Theja
06c28cd0fc
Add LlamaIndex to awesome-transformers.md ( #23484 )
2023-05-25 09:35:10 -04:00
Eric J. Wang
f0a2a82ab4
Fix `pip install --upgrade accelerate` command in modeling_utils.py ( #23747 )
...
Fix command in modeling_utils.py
2023-05-25 07:48:48 -04:00
Matt
e45e756d22
Remove the last few TF serving sigs ( #23738 )
...
Remove some more serving methods that (I think?) turned up while this PR was open
2023-05-24 21:19:44 +01:00
Sylvain Gugger
9850e6ddab
Enable prompts on the Hub ( #23662 )
...
* Enable prompts on the Hub
* Update src/transformers/tools/prompts.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Address review comments
---------
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
2023-05-24 16:09:13 -04:00
Zachary Mueller
75bbf20bce
Fix sagemaker DP/MP ( #23681 )
...
* Check for use_sagemaker_dp
* Add a check for is_sagemaker_mp when setting _n_gpu again. Should be last broken thing
* Try explicit check?
* Quality
2023-05-24 15:51:09 -04:00
Daniel King
89159651ba
Fix the regex in `get_imports` to support multiline try blocks and excepts with specific exception types ( #23725 )
...
* fix and test get_imports for multiline try blocks, and excepts with specific errors
* fixup
* add some more tests
* add license
2023-05-24 15:40:19 -04:00
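A simplified sketch of the idea (not the exact production regex): strip try/except blocks, including multiline bodies and typed excepts, before collecting required imports, so optionally guarded dependencies are not reported as hard requirements:

```python
import re

code = '''
try:
    import flash_attn
    from flash_attn import flash_attn_func
except ImportError:
    flash_attn = None

import torch
'''
# Drop everything from `try:` through the `except ...:` clause.
stripped = re.sub(r"\s*try\s*:.*?except[^:]*:", "", code, flags=re.DOTALL)
imports = re.findall(r"^\s*(?:import|from)\s+(\w+)", stripped, flags=re.MULTILINE)
print(imports)  # ['torch'] -- flash_attn is no longer listed as required
```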
Sanchit Gandhi
d8222be57e
[Whisper] Reduce batch size in tests ( #23736 )
2023-05-24 17:31:25 +01:00
Matt
814de8fac7
Overhaul TF serving signatures + dummy inputs ( #23234 )
...
* Let's try autodetecting serving sigs
* Don't clobber existing sigs
* Change shapes for multiplechoice models
* Make default dummy inputs smarter too
* Fix missing f-string
* Let's YOLO a serving output too
* Read __class__.__name__ properly
* Don't just pass naked lists in there and expect it to be okay
* Code cleanup
* Update default serving sig
* Clearer error messages
* Further updates to the default serving output
* make fixup
* Update the serving output a bit more
* Cleanups and renames, raise errors appropriately when we can't infer inputs
* More renames
* we're building in a functional context again, yolo
* import DUMMY_INPUTS from the right place
* import DUMMY_INPUTS from the right place
* Support cross-attention in the dummies
* Support cross-attention in the dummies
* Complete removal of dummy/serving overrides in BERT
* Complete removal of dummy/serving overrides in RoBERTa
* Obliterate lots and lots of serving sig and dummy overrides
* merge type hint changes
* Fix for token_type_ids with vocab_size 1
* Add missing property decorator
* Fix T5 and hopefully some models that take conv inputs
* More signature pruning
* Fix T5's signature
* Fix Wav2Vec2 signature
* Fix LongformerForMultipleChoice input signature
* Fix BLIP and LED
* Better default serving output error handling
* Fix BART dummies
* Fix dummies for cross-attention, esp encoder-decoder models
* Fix visionencoderdecoder signature
* Fix BLIP serving output
* Small tweak to BART dummies
* Cleanup the ugly parameter inspection line that I used in a few places
* committed a breakpoint again
* Move the text_dims check
* Remove blip_text serving_output
* Add decoder_input_ids to the default input sig
* Remove all the manual overrides for encoder-decoder model signatures
* Tweak longformer/led input sigs
* Tweak default serving output
* output.keys() -> output
* make fixup
2023-05-24 17:03:24 +01:00
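A hand-written example of the kind of signature the overhaul above now derives automatically (generic batch/sequence dims, keyed by the model's real input names); shapes and dtypes here are assumptions for illustration:

```python
import tensorflow as tf

signature = [{
    "input_ids": tf.TensorSpec((None, None), tf.int32, name="input_ids"),
    "attention_mask": tf.TensorSpec((None, None), tf.int32, name="attention_mask"),
}]

@tf.function(input_signature=signature)
def serving(inputs):
    # A real serving fn would run the model: return model(inputs)
    return inputs
```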
Connor Henderson
3d7baef114
fix: Whisper generate, move text_prompt_ids trim up for max_new_tokens calculation ( #23724 )
...
move text_prompt_ids trimming to top
2023-05-24 11:34:21 -04:00
Jungnerd
50a56bedb6
fix: delete duplicate sentences in document_question_answering.mdx ( #23735 )
...
fix: delete duplicate sentence
2023-05-24 11:20:50 -04:00
Matt
d2d8822604
TF SAM memory reduction ( #23732 )
...
* Extremely small change to TF SAM dummies to reduce memory usage on build
* remove debug breakpoint
* Debug print statement to track array sizes
* More debug shape printing
* More debug shape printing
* Now remove the debug shape printing
* make fixup
* make fixup
2023-05-24 15:59:02 +01:00
pagarsky
28aa438cd2
Minor awesome-transformers.md fixes ( #23453 )
...
Minor docs fixes
2023-05-24 08:57:52 -04:00