Commit Graph

8938 Commits

Author SHA1 Message Date
Stas Bekman
dee17d5676
[trainer docs] document how to select specific gpus (#15551)
* [trainer docs] document how to select specific gpus

* expand

* add urls

* add accelerate launcher
2022-02-09 10:12:29 -08:00
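
A minimal sketch of the idea that doc change documents: restricting a run to specific GPUs via CUDA_VISIBLE_DEVICES (the variable has to be set before the CUDA context is created; GPU indices here are illustrative).
```python
import os

os.environ["CUDA_VISIBLE_DEVICES"] = "0,2"  # expose only physical GPUs 0 and 2

import torch  # import after setting the variable so CUDA sees the restriction

print(torch.cuda.device_count())  # should report 2 visible devices
```
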
Yih-Dar
258480864d
update serving_output for some TF models (#15568)
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2022-02-09 18:32:51 +01:00
Sylvain Gugger
315e67404d
Fix tests hub failure (#15580)
* Expose hub test problem

* Fix tests
2022-02-09 12:27:59 -05:00
Sylvain Gugger
b1ba03e082 Fix quality 2022-02-09 12:06:59 -05:00
Sylvain Gugger
eed3186b79 Trigger doc build 2022-02-09 11:57:59 -05:00
Chan Woo Kim
2b5603f6ac
Constrained Beam Search [without disjunctive decoding] (#15416)
* added classes to get started with constrained beam search

* in progress, think i can directly force tokens now but not yet with the round robin

* think now i have total control, now need to code the bank selection

* technically works as desired, need to optimize and fix design choices leading to undesirable outputs

* complete PR #1 without disjunctive decoding

* removed incorrect tests

* Delete k.txt

* Delete test.py

* Delete test.sh

* revert changes to test scripts

* genutils

* full implementation with testing, no disjunctive yet

* shifted docs

* passing all tests realistically run locally

* removing accidentally included print statements

* fixed source of error in initial PR test

* fixing the get_device() vs device trap

* fixed documentation docstrings about constrained_beam_search

* fixed tests failing for Speech2TextModel's floating point inputs

* fix cuda long tensor

* added examples and testing for them and found & fixed a bug in beam_search and constrained_beam_search

* deleted accidentally added test halting code with assert False

* code reformat

* Update tests/test_generation_utils.py

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* Update tests/test_generation_utils.py

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* Update tests/test_generation_utils.py

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* Update tests/test_generation_utils.py

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* Update tests/test_generation_utils.py

* fixing based on comments on PR

* took out the testing code that should work but fails without the beam search modification; style changes

* fixing comments issues

* docstrings for ConstraintListState

* typo in PhrasalConstraint docstring

* docstrings improvements

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
2022-02-09 16:59:26 +01:00
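
A hedged usage sketch of the feature this PR introduces: a PhrasalConstraint handed to beam-search generation. The checkpoint and phrase are placeholders, and the `constraints=` keyword is assumed from the PR description rather than quoted from its tests.
```python
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer, PhrasalConstraint

tokenizer = AutoTokenizer.from_pretrained("t5-small")
model = AutoModelForSeq2SeqLM.from_pretrained("t5-small")

# Force the phrase "sehr gut" to appear somewhere in the generated translation.
constraint = PhrasalConstraint(tokenizer("sehr gut", add_special_tokens=False).input_ids)

inputs = tokenizer("translate English to German: The food was great.", return_tensors="pt")
outputs = model.generate(**inputs, constraints=[constraint], num_beams=5)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```
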
Clara Meister
0113aae5b7
Add implementation of typical sampling (#15504)
* typical decoding

* changing arg name

* add test config params

* forgotten arg rename

* fix edge case where scores are same

* test for typical logits warper

* code quality fixes
2022-02-09 16:48:41 +01:00
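
A sketch of sampling with the typical-decoding warper added in #15504; the `typical_p` generate argument and the 0.9 threshold are assumptions for illustration, not values taken from the PR.
```python
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")

inputs = tokenizer("The future of open source is", return_tensors="pt")
# do_sample=True is required; typical_p activates the typical sampling warper
outputs = model.generate(**inputs, do_sample=True, typical_p=0.9, max_new_tokens=30)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```
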
Suraj Patil
f588cf4050
[Flax tests/FlaxBert] make from_pretrained test faster (#15561) 2022-02-09 16:48:08 +01:00
Lysandre Debut
7029240927
Upgrade click version (#15579) 2022-02-09 10:28:43 -05:00
Sanchit Gandhi
9e00566b9b
Add Wav2Vec2 Adapter Weights to Flax (#15566)
* Add Wav2Vec2 Adapter Weights to Flax

* Suggested changes
2022-02-09 10:24:40 -05:00
Sylvain Gugger
1f60bc46f3
Make sure custom configs work with Transformers (#15569)
* Make sure custom configs work with Transformers

* Apply code review suggestions
2022-02-09 10:04:44 -05:00
Lysandre Debut
7732d0fe7a
Upgrade black to version ~=22.0 (#15565)
* Upgrade black to version ~=22.0

* Check copies

* Fix code
2022-02-09 09:28:57 -05:00
Leandro von Werra
d923f76203
add model scaling section (#15119)
* add model scaling section

* Apply suggestions from code review

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* integrate reviewer feedback

* initialize GPU properly

* add note about BnB optimizer

* move doc from `scaling.mdx` to `performance.mdx`

* integrate reviewer feedback

* revert section levels

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2022-02-09 15:27:30 +01:00
Sylvain Gugger
b5c6fdecf0
PoC for a ProcessorMixin class (#15549)
* PoC for a ProcessorMixin class

* Documentation

* Apply suggestions from code review

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>
Co-authored-by: Suraj Patil <surajp815@gmail.com>
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* Roll out to other processors

* Add base feature extractor class in init

* Use args and kwargs

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>
Co-authored-by: Suraj Patil <surajp815@gmail.com>
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
2022-02-09 09:24:49 -05:00
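
What the ProcessorMixin standardizes, shown with an existing processor rather than the PoC itself (the checkpoint name is an assumption): one object that saves and reloads its feature extractor and tokenizer together.
```python
from transformers import Wav2Vec2Processor

processor = Wav2Vec2Processor.from_pretrained("facebook/wav2vec2-base-960h")
processor.save_pretrained("./my-processor")              # writes both components to one directory
reloaded = Wav2Vec2Processor.from_pretrained("./my-processor")
```
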
Yih-Dar
ba3f9a71a1
logger.warn --> logger.warning (#15572)
* change logger.warn to logger.warning

* make style

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2022-02-09 08:20:05 -05:00
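
The mechanical change behind this commit, shown in isolation (the message text below is only an example, not a string from the codebase).
```python
import logging

logger = logging.getLogger(__name__)

# logger.warn(...) is a deprecated stdlib alias of logger.warning(...),
# so calls of the first form were rewritten to the second across the repo:
# logger.warn("this is an illustrative message")
logger.warning("this is an illustrative message")
```
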
Suraj Patil
a6885db912
[Flax tests] fix test_model_outputs_equivalence (#15571)
* fix test_model_outputs_equivalence

* fix tuple outputs for blenderbot
2022-02-09 12:26:48 +01:00
Nathan Raw
fcb4f11c92
📝 Add codecarbon callback to docs (#15563) 2022-02-08 14:10:53 -05:00
Boris Dayma
077c00c0b2
feat(flax): allow encoder_outputs in generate (#15554)
* feat(flax): allow encoder_outputs in generate

* doc(flax): encoder_outputs in generate

* fix: style

* fix: style
2022-02-08 17:53:22 +01:00
Joao Gante
8406fa6dd5
Add TFSpeech2Text (#15113)
* Add wrapper classes

* convert inner layers to tf

* Add TF Encoder and Decoder layers

* TFSpeech2Text models

* Loadable model

* TF model with same outputs as PT model

* test skeleton

* correct tests and run the fixup

* correct attention expansion

* TFSpeech2Text past_key_values with TF format
2022-02-08 16:27:23 +00:00
Yih-Dar
6a5472a8e1
Force use_cache to be False in PyTorch (#15385)
* use_cache = False for PT models if labels is passed

* Fix for BigBirdPegasusForConditionalGeneration

* add warning if users specify use_cache=True

* Use logger.warning instead of warnings.warn

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2022-02-08 16:20:53 +01:00
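
A hypothetical helper illustrating the guard this PR applies inside model forward passes: when `labels` are supplied (i.e. during training), caching past key values is disabled and a warning is emitted. The function name and message are invented for the sketch.
```python
import logging

logger = logging.getLogger(__name__)

def resolve_use_cache(use_cache: bool, labels) -> bool:
    """Illustrative only: disable caching when labels are provided."""
    if labels is not None and use_cache:
        logger.warning("`use_cache=True` has no effect when `labels` are provided; setting `use_cache=False`.")
        return False
    return use_cache
```
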
Suraj Patil
0acd84f7cb
[GPTJ] fix docs (#15558) 2022-02-08 15:54:19 +01:00
aaron
87d08afb16
electra is added to onnx supported models (#15084)
* electra is added to onnx supported models

* add google/electra-base-generator for test onnx module

Co-authored-by: Lewis Tunstall <lewis.c.tunstall@gmail.com>
2022-02-08 15:47:49 +01:00
Michael Benayoun
0fe17f375a
FX tracing improvement (#14321)
* Change the way tracing happens, enabling dynamic axes out of the box

* Update the tests and modeling xlnet

* Add the non-recording of leaf modules, to avoid recording more values for the methods than will actually be seen at tracing time (which would otherwise desynchronize the recorded values from the values that need to be given to the proxies during tracing, causing errors).

* Comments and making tracing work for gpt-j and xlnet

* Refactor things related to num_choices (and batch_size, sequence_length)

* Update fx to work on PyTorch 1.10

* Postpone autowrap_function feature usage for later

* Add copyrights

* Remove unnecessary file

* Fix issue with add_new_model_like

* Apply suggestions
2022-02-07 22:25:33 +01:00
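
A hedged sketch of the symbolic tracing this PR reworks; the `symbolic_trace` helper and its `input_names` argument are assumed from `transformers.utils.fx` at the time, and the checkpoint is a placeholder.
```python
from transformers import BertModel
from transformers.utils.fx import symbolic_trace

model = BertModel.from_pretrained("bert-base-uncased")

# Trace the model into a torch.fx GraphModule using the named inputs it expects.
traced = symbolic_trace(model, input_names=["input_ids", "attention_mask"])
print(traced.graph)
```
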
Steven Liu
552f8d3091
Create a custom model guide (#15489)
* 📝 add config section

* 📝 finish first draft

* 📝 add feature extractor and processor

* 🖍 apply feedback from review

* 📝 minor edits

* last review
2022-02-07 12:34:56 -06:00
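
A minimal custom configuration of the kind the new guide's config section describes; the class and field names below are invented for illustration.
```python
from transformers import PretrainedConfig

class ToyConfig(PretrainedConfig):
    model_type = "toy"  # hypothetical model type

    def __init__(self, hidden_size=64, num_layers=2, **kwargs):
        super().__init__(**kwargs)
        self.hidden_size = hidden_size
        self.num_layers = num_layers

config = ToyConfig(hidden_size=128)
config.save_pretrained("./toy-config")          # writes config.json
reloaded = ToyConfig.from_pretrained("./toy-config")
```
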
Yih-Dar
ad1d3c4d4b
Make TF Wav2Vec2 outputs the same as PT's version (#15530)
* fix outputs

* fix for CTC

* fix doc

* make style

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2022-02-07 18:09:57 +01:00
Yih-Dar
131e258411
Fix TF T5/LED missing cross attn in return values (#15511)
* add cross attn to outputs

* add cross attn to outputs for TFLED

* add undo padding

* remove unused import

* fix style

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2022-02-07 17:41:48 +01:00
lewtun
6775b211b6
Remove Longformers from ONNX-supported models (#15273) 2022-02-07 17:32:13 +01:00
François REMY
7a1412e12b
Wav2Vec2 models must either throw or deal with add_adapter (#15409)
* Wav2Vec2 models must either throw or deal with add_adapter

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* Add pre-add_adapter backwards compatibility

* Add pre-add_adapter backwards compatibility

* Fix issue in tests/test_modeling_wav2vec2.py

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
2022-02-07 17:03:12 +01:00
Anton Lozhkov
a459f7f97d
Add ASR CTC streaming example (#15309)
* Single-epoch run

* Apply suggestions from code review

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* Infinite dataset

* Trainer fix + distributed benchmark

* Benchmark fix

* unused import

* interleaved splits

* interleaved splits

* has_length util

* Move to research projects

* Leftover Sized checks

* Bump min version

* Unused import

* Revert trainer changes

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
2022-02-07 18:35:37 +03:00
Anton Lozhkov
75b13f82e9
[Trainer] Deeper length checks for IterableDatasetShard (#15539)
* Unused import

* Make `has_length()` torch-independent to use in callbacks

* Update src/transformers/trainer_utils.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2022-02-07 18:34:56 +03:00
NielsRogge
84eec9e6ba
Add ConvNeXT (#15277)
* First draft

* Add conversion script

* Improve conversion script

* Improve docs and implement tests

* Define model output class

* Fix tests

* Fix more tests

* Add model to README

* Apply suggestions from code review

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Apply more suggestions from code review

* Apply suggestions from code review

* Rename dims to hidden_sizes

* Fix equivalence test

* Rename gamma to gamma_parameter

* Clean up conversion script

* Add ConvNextFeatureExtractor

* Add corresponding tests

* Implement feature extractor correctly

* Make implementation cleaner

* Add ConvNextStem class

* Improve design

* Update design to also include encoder

* Fix gamma parameter

* Use sample docstrings

* Finish conversion, add center cropping

* Replace nielsr by facebook, make feature extractor tests smaller

* Fix integration test

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2022-02-07 16:11:37 +01:00
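
A sketch of running the newly added model for image classification; the `facebook/convnext-tiny-224` checkpoint (suggested by the "Replace nielsr by facebook" step above) and the COCO image URL are assumptions.
```python
import requests
from PIL import Image
from transformers import ConvNextFeatureExtractor, ConvNextForImageClassification

url = "http://images.cocodataset.org/val2017/000000039769.jpg"
image = Image.open(requests.get(url, stream=True).raw)

feature_extractor = ConvNextFeatureExtractor.from_pretrained("facebook/convnext-tiny-224")
model = ConvNextForImageClassification.from_pretrained("facebook/convnext-tiny-224")

inputs = feature_extractor(images=image, return_tensors="pt")
logits = model(**inputs).logits
print(model.config.id2label[logits.argmax(-1).item()])
```
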
Patrick von Platen
c47d259241
[torch_int_div] Correct true division in generation (#15498)
* [torch_int_div] Correct true division in generation

* up

* up
2022-02-07 16:04:18 +01:00
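
A standalone illustration of the floor vs. true division distinction behind this fix (this is not the repository's torch_int_div helper itself, just the PyTorch behaviour it wraps).
```python
import torch

a = torch.tensor([7, 8, 9])
b = torch.tensor(2)

print(torch.div(a, b, rounding_mode="floor"))  # tensor([3, 4, 4])              integer-style division
print(a / b)                                   # tensor([3.5000, 4.0000, 4.5000]) true division
```
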
Patrick von Platen
5f1918a4a8
[ASR pipeline] correct asr pipeline for seq2seq models (#15541) 2022-02-07 15:35:44 +01:00
Patrick von Platen
e02bdce791
Revert "Handle PyTorch to Flax conversion of 1D convolutions (#15519)" (#15540)
This reverts commit 854a0d526c.
2022-02-07 12:33:49 +01:00
Stas Bekman
8ce1330631
[deepspeed docs] DeepSpeed ZeRO Inference (#15486)
* [deepspeed docs] DeepSpeed ZeRO Inference

* Apply suggestions from code review

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* tweak

* deal with black

* extra cleanup, better comments

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2022-02-04 13:51:02 -08:00
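
A ZeRO stage-3 configuration of the kind that doc section covers, written as a Python dict; the keys mirror common DeepSpeed config fields and the values are purely illustrative, not recommendations from the PR.
```python
ds_config = {
    "fp16": {"enabled": True},
    "zero_optimization": {
        "stage": 3,                                        # ZeRO-3: parameters sharded/offloaded
        "offload_param": {"device": "cpu", "pin_memory": True},
    },
    "train_micro_batch_size_per_gpu": 1,
}
```
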
Sylvain Gugger
ac6aa10f23
Standardize semantic segmentation models outputs (#15469)
* Standardize instance segmentation models outputs

* Rename output

* Update src/transformers/modeling_outputs.py

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Add legacy argument to the config and model forward

* Update src/transformers/models/beit/modeling_beit.py

Co-authored-by: Lysandre Debut <lysandre@huggingface.co>

* Copy fix in Segformer

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
2022-02-04 14:52:07 -05:00
Stas Bekman
31be2f45a9
[deepspeed docs] Megatron-Deepspeed info (#15488) 2022-02-04 11:15:13 -08:00
Yih-Dar
bbe9c6981b
Fix TFRemBertEncoder all_hidden_states (#15510)
* fix

* fix test

* remove expected_num_hidden_layers

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2022-02-04 16:32:14 +00:00
Sanchit Gandhi
854a0d526c
Handle PyTorch to Flax conversion of 1D convolutions (#15519) 2022-02-04 17:08:03 +01:00
Yih-Dar
486260c68e
use kwargs (#15509)
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2022-02-04 15:25:37 +00:00
Yih-Dar
525dbbf84a
Remove loss from some flax models docs & examples (#15492)
* Remove return_loss from Flax models

* fix more

* fix

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2022-02-03 21:39:46 +01:00
Stas Bekman
21dcaec5d5
[deepspeed docs] memory requirements (#15506) 2022-02-03 10:55:14 -08:00
davidleonfdez
f1a4c4ead5
[WIP] Add preprocess_logits_for_metrics Trainer param (#15473)
* Add preprocess_logits_for_metrics Trainer param

* Compute accuracy in LM examples

* Improve comments
2022-02-03 12:07:20 -05:00
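
A sketch of the hook this PR adds: the callable receives (logits, labels) during evaluation and its return value is what `compute_metrics` later sees, so reducing logits to predicted ids keeps eval memory small. Model and datasets are left out of the snippet.
```python
import numpy as np

def preprocess_logits_for_metrics(logits, labels):
    # collapse vocab-size logits to predicted token ids before they are gathered
    return logits.argmax(dim=-1)

def compute_metrics(eval_pred):
    preds, labels = eval_pred
    return {"accuracy": float(np.mean(preds == labels))}

# Wire both into the Trainer, e.g.:
# Trainer(..., compute_metrics=compute_metrics,
#         preprocess_logits_for_metrics=preprocess_logits_for_metrics)
```
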
Stas Bekman
4f5faaf044
[deepspeed] fix a bug in a test (#15493)
* [deepspeed] fix a bug in a test

* consistency
2022-02-03 08:55:45 -08:00
NielsRogge
90166121ee
Add general vision docstrings (#15501)
* Add general docstrings

* Remove legacy docstrings

* Add BEiT

* Add DEiT

* Add SegFormer

* Fix beit output class

* Fix missing return_dict
2022-02-03 17:47:22 +01:00
Patrick von Platen
e2b6e73fa2
[Flax tests] Disable scheduled GPU tests (#15503) 2022-02-03 17:12:14 +01:00
Yih-Dar
f5d98da29e
fix load_weight_prefix (#15101)
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2022-02-03 15:11:53 +00:00
Yih-Dar
71dccd0774
fix (#15494)
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2022-02-03 12:57:28 +01:00
CHI LIU
5ec368d79e
Correct eos_token_id settings in generate (#15403)
* Correct eos_token_id set in generate

* Set eos_token_id in test

* Correct eos_token_id set in generate

* Set eos_token_id in test
2022-02-03 00:24:40 +01:00
SaulLu
39b5d1a63a
fix setting truncation attribute in __init__ of PreTrainedTokenizerBase (#15456)
* change truncation_side in init of `PreTrainedTokenizerBase`

Co-authored-by: LSinev <LSinev@users.noreply.github.com>

* add test

* Revert "replace assert with exception for `padding_side` arg in `PreTrainedTokenizerBase` `__init__`"

This reverts commit 7a98b87962.

* fix kwargs

* Revert "fix kwargs"

This reverts commit 67b0a5270e8cf1dbf70e6b0232e94c0452b6946f.

* Update tests/test_tokenization_common.py

Co-authored-by: Nicolas Patry <patry.nicolas@protonmail.com>

* delete truncation_side variable

* reorganize test

* format

* complete doc

* Revert "Revert "replace assert with exception for `padding_side` arg in `PreTrainedTokenizerBase` `__init__`""

This reverts commit d5a10a7e2680539e5d9e98ae5d896c893d224b80.

* fix typo

* fix typos to render documentation

* Revert "Revert "Revert "replace assert with exception for `padding_side` arg in `PreTrainedTokenizerBase` `__init__`"""

This reverts commit 16cf58811943a08f43409a7c83eaa330686591d0.

* format

Co-authored-by: LSinev <LSinev@users.noreply.github.com>
Co-authored-by: Nicolas Patry <patry.nicolas@protonmail.com>
2022-02-02 23:18:09 +01:00
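
A sketch of the behaviour that last fix covers: `truncation_side` passed at tokenizer init time (the checkpoint name is a placeholder).
```python
from transformers import AutoTokenizer

tok = AutoTokenizer.from_pretrained("bert-base-uncased", truncation_side="left")
print(tok.truncation_side)  # "left"

# With left-side truncation, the end of an over-long input is kept.
enc = tok("a very long input " * 50, truncation=True, max_length=8)
```
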