transformers

mirror of https://github.com/huggingface/transformers.git synced 2025-07-25 15:28:59 +06:00

Author	SHA1	Message	Date
Nicolas Patry	19d37c2dd3	Hotfix `chunk_length_s` instead of `_ms`. (#15029 ) * Hotfix `chunk_length_s` instead of `_ms`. * Adding fix of `pad_token` which should be last/previous token for CTC proper decoding * Fixing ChunkPipeline unwrapping. * Adding a PackIterator specific test.	2022-01-04 14:07:44 +01:00
Daniel Stancl	21aecc0971	Add Flax RoFormer (#15005 ) * Add FlaxRoFormer * Clean code + make quality * Fix output pooling for FlaxRoFormerForMultipleChoiceModule * Apply suggestions from code review * add flax model to repos Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>	2022-01-04 13:23:10 +01:00
Patrick von Platen	dbac8899fe	[Tests] Correct Wav2Vec2 & WavLM tests (#15015 ) * up * up * up	2022-01-03 20:19:04 +01:00
Anton Lozhkov	38f95d1846	Large audio chunking for the existing ASR pipeline (#14896 ) * Naive ASR chunking * Fixing batching for ASR. Co-authored-by: Nicolas Patry <patry.nicolas@protonmail.com>	2022-01-03 16:54:17 +01:00
Nicolas Patry	d33dc7966a	Improve truncation_side (#14947 ) * Enabling `truncation_side` for Slow and Fast tokenizer. Co-Authored-by: Niels Rogge <48327001+NielsRogge@users.noreply.github.com> * Disable failing tests. * Layout xlm. * assert -> assertEqual. Co-authored-by: Niels Rogge <48327001+NielsRogge@users.noreply.github.com>	2022-01-03 16:18:39 +01:00
Nicolas Patry	8c2618e6aa	Fixing t2t pipelines lists outputs. (#15008 ) Backward compatibility broken in https://github.com/huggingface/transformers/pull/14988	2022-01-03 14:49:58 +01:00
Nicolas Patry	f8a989cfb2	Adding `num_return_sequences` support for text2text generation. (#14988 ) * Adding `num_return_sequences` support for text2text generation. Co-Authored-By: Enze <pu.miao@foxmail.com> * Update tests/test_pipelines_text2text_generation.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update tests/test_pipelines_text2text_generation.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> Co-authored-by: Enze <pu.miao@foxmail.com> Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>	2021-12-30 16:17:15 +01:00
Patrick von Platen	c043ce6cfd	[Generate] correct encoder_outputs are passed without attention_mask (#14980 ) * [Generate] correct encoder_outputs are passed without attention_mask * Apply suggestions from code review * up	2021-12-30 10:16:03 +01:00
Patrick von Platen	a1392883ce	[AutoProcessor] Correct AutoProcessor and automatically add processor… (#14881 ) * [AutoProcessor] Correct AutoProcessor and automatically add processor class * up * up * up * up * up * up * up * up * continue tomorrow * up * up * up * make processor class private * fix loop	2021-12-30 09:56:43 +01:00
Nicolas Patry	d7d60df0ec	Fixing a pathological case for slow tokenizers (#14981 ) * Fixing a pathological case for slow tokenizers * Update src/transformers/tokenization_utils.py	2021-12-30 09:10:34 +01:00
Patrick von Platen	600496fa50	[Wav2Vec2] Rename model's feature extractor to feature encoder (#14959 ) * rename classes * clean up more namings * remove bogus file * Apply suggestions from code review * Apply suggestions from code review * replace more names * more regex replace * make style * correct * correct more * make style * finish * correct more in wav2vec2 * make style * improve freeze_extractor * add aliases * add tf aliases	2021-12-28 20:33:23 +01:00
Patrick von Platen	1bfa347707	[Tests] Speed up tokenizer tests (#14964 ) * speed up canine and mluke * speed up mbart and mbart50 toks * upload files	2021-12-28 17:02:50 +01:00
Patrick von Platen	1e847b40c0	[WavLM] give model for precision (#14958 )	2021-12-28 11:07:05 +01:00
Stas Bekman	10fd4fa1a6	[doc] :class: hunt (#14955 ) * [doc] :class: hunt * Apply suggestions from code review Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * fix the fix + style Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>	2021-12-27 17:17:38 -08:00
Stas Bekman	e13f72fbff	[doc] :obj: hunt (#14954 ) * redo sans examples * style	2021-12-27 15:49:48 -08:00
Daniel Stancl	501307b58b	Add `ElectraForCausalLM` -> Enable Electra encoder-decoder model (#14729 ) * Add ElectraForCausalLM and cover some basic tests & need to fix a few tests * Fix bugs * make style * make fix-copies * Update doc * Change docstring to markdown format * Remove redundant update_keys_to_ignore	2021-12-27 12:37:52 +01:00
Nicolas Patry	b058490ceb	ChunkPipeline (batch_size enabled on `zero-cls` and `qa` pipelines). (#14225 ) * Pipeline chunks. * Batching for Chunking pipelines ? * Batching for `question-answering` and `zero-shot-cls`. * Fixing for FNet. * Making ASR a chunk pipeline. * Chunking ASR API. * doc style. * Fixing ASR test. * Fixing QA error (p_mask, padding is 1, not 0). * Enable both vad and simple chunking. * Max length for vad. * remove inference mode, crashing on s2t. * Revert ChunkPipeline for ASRpipeline. Too many knobs for simple integration within the pipeline, better stick to external convenience functions instead, more control to be had, simpler pipeline and also easier to replace with other things later. * Drop necessity for PT for these. * Enabling generators. * Add mic + cleanup. * Typo. * Typo2. * Remove ASR work, it does not belong in this PR anymore. * Update src/transformers/pipelines/pt_utils.py Co-authored-by: Lysandre Debut <lysandre@huggingface.co> * Update src/transformers/pipelines/zero_shot_classification.py Co-authored-by: Lysandre Debut <lysandre@huggingface.co> * Adding many comments. * Doc quality. * `hidden_states` handling. * Adding doc. * Bad rebase. * Autofixing docs. * Fixing CRITICAL bug in the new Zerocls pipeline. Co-authored-by: Lysandre Debut <lysandre@huggingface.co>	2021-12-27 11:26:20 +01:00
Sylvain Gugger	676643c6d6	Better logic for getting tokenizer config in AutoTokenizer (#14906 ) * Better logic for getting tokenizer config in AutoTokenizer * Remove needless import * Remove debug statement * Address review comments	2021-12-23 14:18:07 -05:00
Sylvain Gugger	f566c6e3b7	Fix failing GPU trainer tests (#14903 ) * Fix failing GPU trainer tests * Remove print statements	2021-12-23 13:59:33 -05:00
Patrick von Platen	fe4197ab11	[Generate] Remove attention_mask and integrate model_main_input_name (#14856 ) * up * save * correct * up * correct more * up * up * up * up * up * correct * fix tf * fix * remove tokenizer	2021-12-23 19:43:37 +01:00
Anton Lozhkov	ee55ea692b	Update diarization and WavLM tolerances (#14902 )	2021-12-23 19:53:56 +03:00
Yih-Dar	8f2cc1c3ab	Add TFCLIPModel (#13967 ) * Start the work for TFCLIPModel * Convert to TF code (TODO: loss + doc) * Clean up * Fix pooled_output for TFCLIPTextTransformer - using tf.gather_nd * assert -> raise error * Expose TFCLIPModel * Deal with dummy_inputs * Add tests * Fix all tests. TODO: manual check weight loading + add more comments * Fix pt tf equivalence test * fixes * update TFCLIPVisionEmbeddings's Conv2D * Fix loss + overwrite test_pt_tf_model_equivalence from common * Add a comment about the change about MainLayer in test_keras_save_load * Set return_loss=True in TFCLIPModelTester + make tests pass * overwrite test_pt_tf_model_equivalence from tf common * fix base_model_prefix * Fix examples * remove unused * Apply suggestions from code review Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * apply review suggestions * change self.pre_layrnorm to self.pre_layernorm * apply more review suggestions * return attention probs before dropout (to align with PT) * fix weight init * fix * build doc * fix missing doc * fix for test Co-authored-by: ydshieh <ydshieh@users.noreply.github.com> Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>	2021-12-23 11:19:44 -05:00
lewtun	6b655cc63f	Add ONNX support for MarianMT models (#14586 ) * First commit to add MarianMT to ONNX * Now MarianModel.forward() automatically generates decoder_input_ids, like BartModel.forward() * Adjusted MarianOnnxConfig.inputs and outputs to work with seq2seq-lm feature * Style fix * Added support for other features for already supported models * Partial support for causal and seq2seq models * Partial support for causal and seq2seq models * Add default task for MarianMT ONNX * Remove automatic creation of decoder_input_ids * Extend inputs and outputs for MarianMT ONNX config * Add MarianMT to ONNX unit tests * Refactor * OnnxSeq2SeqConfigWithPast to support seq2seq models * Parameterized the onnx tests * Restored run_mlm.py * Restored run_mlm.py * [WIP] BART update * BART and MBART * Add past_key_values and fix dummy decoder inputs Using a sequence length of 1 in generate_dummy_outputs() produces large discrepancies, presumably due to some hidden optimisations. * Refactor MarianOnnxConfig to remove custom past_key_values logic * Fix quality * Revert "Revert "Added support for other features for already supported models (#14358)" (#14679)" This reverts commit `0f4e39c559`. * is_torch_available test to avoid failing imports * sorting parameterize parameters to solve ERROR gw0 gw1 * tests fix * tests fix * GPT2 with past fix * Fixed stateful class attribute change that was breaking things when converting multiple models sequentially * Removed onnx file * Refactor Marian export to account for base changes * Fix copies * Implemented suggestions * Extend support for causal LM * Revert "Revert "Added support for other features for already supported models (#14358)" (#14679)" This reverts commit `0f4e39c559`. * is_torch_available test to avoid failing imports * sorting parameterize parameters to solve ERROR gw0 gw1 * tests fix * tests fix * GPT2 with past fix * Fixed stateful class attribute change that was breaking things when converting multiple models sequentially * Removed onnx file * Implemented suggestions * Fixed __init__ to resolve conflict with master * Revert "Revert "Added support for other features for already supported models (#14358)" (#14679)" This reverts commit `0f4e39c559`. * is_torch_available test to avoid failing imports * sorting parameterize parameters to solve ERROR gw0 gw1 * tests fix * tests fix * GPT2 with past fix * Fixed stateful class attribute change that was breaking things when converting multiple models sequentially * Removed onnx file * Implemented suggestions * Fixed __init__ to resolve conflict with master * Remove commented import * Remove ONNX model * Remove redundant class method * Tidy up imports * Fix quality * Refactor dummy input function * Add copied from statements to Marian config functions * Remove false copied from comments * Fix copy from comment Co-authored-by: Massimiliano Bruni <massimiliano.bruni@hcl.com> Co-authored-by: Michael Benayoun <mickbenayoun@gmail.com>	2021-12-23 13:35:56 +01:00
Henrik Holm	6a7b9da2ae	Add 'with torch.no_grad()' to integration test forward pass (#14808 )	2021-12-23 04:23:39 -05:00
Michael Benayoun	13504dcbea	Onnx enable tasks for supported models (part 2) (#14700 ) * Revert "Revert "Added support for other features for already supported models (#14358)" (#14679)" This reverts commit `0f4e39c559`. * is_torch_available test to avoid failing imports * sorting parameterize parameters to solve ERROR gw0 gw1 * tests fix * tests fix * GPT2 with past fix * Fixed stateful class attribute change that was breaking things when converting multiple models sequentially * Removed onnx file * Implemented suggestions * Fixed __init__ to resolve conflict with master * Remove commented import	2021-12-22 14:43:11 +01:00
Ryokan RI	824fd44fc3	Feature/fix slow test in mluke (#14749 ) * make MLukeTokenizerTest fast * make LukeTokenizerTest fast * add entry to _toctree.yaml	2021-12-22 06:35:59 -05:00
SaulLu	c94c1b8967	update the arguments `add_prefix_space` and `trim_offsets` in `backend_tokenizer.post_processor` of `RobertaTokenizerFast` (#14752 ) * add tests * change post-processor, pre-tokenizer and decoder (can't update decoder) * update test (remove decoder which doesn't depend on trim and add_prefix) * just update the post_processor * fix change * `trim_offsets` has no influence on `pre_tokenizer` * remove a test that need some input from the `tokenizers` lib maintainers * format * add new test offsets roberta * polish comments	2021-12-22 10:51:55 +01:00
Leandro von Werra	5722d05831	Add custom `stopping_criteria` and `logits_processor` to `generate` (#14779 ) * add custom `stopping_criteria` and `logits_processor` to `generate` * add tests for custom `stopping_criteria` and `logits_processor` * fix typo in RAG * address reviewer comments * improve custom logits processor/stopping criteria error message * fix types in merge function signature * change default for custom list from `None` to empty list * fix rag generate * add string split suggestion Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com> Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>	2021-12-21 16:47:41 +01:00
Stas Bekman	b6ec956976	[logging] implement warning_advice / TRANSFORMERS_NO_ADVISORY_WARNINGS (#14669 ) * [logging] implement warning_advice / TRANSFORMERS_NO_ADVISORY_WARNINGS * reword	2021-12-20 20:48:38 -08:00
Sylvain Gugger	33f36c869f	Add a main_input_name attribute to all models (#14803 ) * Add a main_input_name attribute to all models * Fix tests * Wtf Vs Code? * Update src/transformers/models/imagegpt/modeling_imagegpt.py Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com> * Style * Fix copies Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>	2021-12-20 11:19:08 -05:00
Henrik Holm	0940e9b242	Add 'with torch.no_grad()' to integration test forward pass (#14820 )	2021-12-20 09:28:17 -05:00
Henrik Holm	b37cf7dee4	Add 'with torch.no_grad()' to integration test forward pass (#14821 )	2021-12-20 09:25:34 -05:00
Patrick von Platen	952a77b05d	[Perceiver] Skip multi-gpu tests for now (#14813 ) * [Perceiver] Skip multi-gpu tests for now * Update tests/test_modeling_perceiver.py * up * up	2021-12-20 15:22:50 +01:00
Anton Lozhkov	3883e3a75e	Add SD and SV heads for WavLM (#14847 ) * Add converted heads * Add dummies	2021-12-20 16:40:56 +03:00
Patrick von Platen	cd583bdaa5	[WavLM] Fix slow tests (#14845 )	2021-12-20 12:06:42 +01:00
Patrick von Platen	84ea427f46	[ImageGPT] Deprecate pixel_values input name to input_ids (#14801 ) * [ImageGPT] Deprecate pixel_values input name to input_ids * up * Apply suggestions from code review Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * correct * finish Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>	2021-12-17 20:05:22 +01:00
Patrick von Platen	c4a96cecbc	Wav2Vec2 meets phonemes (#14353 ) * up * add tokenizer * improve more * finish tokenizer * finish * adapt speech recognition script * adapt convert * more fixes * more fixes * update phonemizer wav2vec2 * better naming * fix more tests * more fixes swedish * correct tests * finish * improve script * remove file * up * lets get those 100 model architectures until the end of the month * make fix-copies * correct more * correct script * more fixes * more fixes * add to docs * Apply suggestions from code review Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * replace assert * fix copies * fix docs * new try docs * boom boom * update * add phonemizer to audio tests * make fix-copies * up * upload models * some changes * Update tests/test_tokenization_wav2vec2_phoneme.py Co-authored-by: Anton Lozhkov <aglozhkov@gmail.com> * more fixes * remove @ Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> Co-authored-by: Anton Lozhkov <aglozhkov@gmail.com>	2021-12-17 19:56:44 +01:00
Daniel Stancl	ff066119ca	Implement head_mask for Flax BERT and other models copied from BERT (#14620 ) * Implement head_mask for Flax BERT and other models copied from BERT * Remove `from jax._src.nn.functions import sigmoid` Remove `from jax._src.nn.functions import sigmoid` unintentionally added by IDE * Remove no more valid copy statement * Apply patil-suraj's suggestions from code review * Apply suggestions from the code review * Update Flax template * Fix a typo * Also update template for CausalLM modules	2021-12-17 17:06:59 +01:00
Patrick von Platen	95119ad7b0	[Generate] Correct input_ids detection (#14815 ) * [Generate] Correct input_ids detection * correct	2021-12-17 16:08:54 +01:00
NielsRogge	cbf036f7ae	Add test (#14810 )	2021-12-17 04:33:27 -05:00
Lysandre Debut	d194d639ab	Remove datasets requirement (#14795 )	2021-12-16 14:34:14 -05:00
Patrick von Platen	bef1e3e4a0	Add WavLM (#14354 ) * first commit * fix some stuff * fix more readme * Apply suggestions from code review * update * correct * up * attn layer works * push code * make models work * Small change * more refactor * finish * up * fix conversion * fix position bias * Fix style * fix conversion * make fix-copies * add * clean * fix docs * fix * Apply suggestions from code review Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * apply final changes * make fix-copies Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>	2021-12-16 18:57:05 +01:00
Patrick von Platen	b18d8534ea	[Generate] Make generate multi-modal (#14784 ) * finish refactor * refactor * add tests * add more tests * up * finish tests * finish * up * Apply suggestions from code review Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * improve docstring * fix docs Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>	2021-12-16 18:03:55 +01:00
Anton Lozhkov	48463ebb33	Add Speaker Diarization and Verification heads (#14723 ) * Models * Squashed commit of the following: commit 72278e1e931a16d0879acc77f65762f3364833d0 Author: anton-l <aglozhkov@gmail.com> Date: Fri Dec 10 21:45:08 2021 +0300 * Add unispeech heads * Add sd/sv automodels * Docs cleanup * Fix docstrings * rename xvector classes * examples * Tests cleanup * Style * Better checkpoints for tests * leftover docs * apply review suggestions * Style + init tests * Update unispeech-sat tdnn downsampling	2021-12-16 19:22:14 +03:00
Matt	48d4827697	TF model cards (#14720 ) * Initial commit for Keras model cards * Revert accidental change * make style * make style * make style * Fix PR comments * Move repo creation to __init__ * Fixes to README.md creation * Partial progress for proper card creation on `push_to_hub` * Proper card creation from `push_to_hub` plus fixes for malformed model cards * Fixes for model card creation outside the callback * Adding a model card creation test * Putting the model card creation test in the right file. Good job, Matt. * make style * Fix model card test temp dir usage * Fix model card creation when no optimizer present * Fixes for when training history not present * Fix accidental edit to test_modeling_common	2021-12-15 14:57:52 +00:00
Nicolas Patry	e7ed7ffdcb	Adding support for multiple mask tokens. (#14716 ) * Adding support for multiple mask tokens. - Original implem: https://github.com/huggingface/transformers/pull/10222 Co-authored-by: njafer <naveen.jafer@oracle.com> * In order to accommodate optionally multimodal models like Perceiver we add information to the tasks to specify tasks where we know for sure if we need the tokenizer/feature_extractor or not. * Adding info in the documentation about multi masks. + marked as experimental. * Add a copy() to prevent overriding the same tensor over and over. * Fixup. * Adding small test for multi mask with real values. Co-authored-by: njafer <naveen.jafer@oracle.com>	2021-12-14 16:46:16 +01:00
Nicolas Patry	546a91abe9	Fixing tests for Perceiver (#14739 ) * Adding some slow test to check for perceiver at least from a high level. * Re-enabling fast tests for Perceiver ImageClassification. * Perceiver might try to run without Tokenizer (Fast doesn't exist) and with FeatureExtractor some text only pipelines. * Oops. * Adding a comment for `update_config_with_model_class`. * Remove `model_architecture` to get `tiny_config`. * Finalize rebase. * Smarter way to handle undefined FastTokenizer. * Remove old code. * Addressing some nits. * Don't instantiate `None`.	2021-12-14 09:43:07 +01:00
NielsRogge	e926ea2bdd	Improve perceiver (#14750 ) * First draft * Improve docstring + clean up tests * Remove unused code * Add check in case one doesn't provide a preprocessor	2021-12-13 18:46:49 +01:00
Yih-Dar	12d9b95723	Fix: change tooslow to slow (#14734 ) Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2021-12-13 16:12:58 +00:00
Lysandre Debut	3d66146afc	Fixing tests for Perceiver (#14745 ) - Do not run image-classification pipeline (_CHECKPOINT_FOR_DOC uses the checkpoint for language, which cannot load a FeatureExtractor so current logic fails). - Add a safeguard to not run tests when `tokenizer_class` or `feature_extractor_class` are defined, but cannot be loaded. This happens for Perceiver for the "FastTokenizer" (which doesn't exist so None) and FeatureExtractor (which does exist but cannot be loaded because the checkpoint doesn't define one which is reasonable for the said checkpoint) - Added `get_vocab` function to `PerceiverTokenizer` since it is used by `fill-mask` pipeline when the argument `targets` is used to narrow a subset of possible values. Co-authored-by: Nicolas Patry <patry.nicolas@protonmail.com>	2021-12-13 08:13:39 -05:00
Nicolas Patry	7cb1fdd4d1	Fixing tests for perceiver (texts) (#14719 ) * Fixing tests for perceiver (texts) * For MaskedLM	2021-12-10 19:38:59 -05:00
NielsRogge	7375758bee	Fix tests (#14703 )	2021-12-09 08:32:35 -05:00
Sylvain Gugger	e6219320b9	Make MLuke tokenizer tests slow (#14690 )	2021-12-08 15:59:57 -05:00
lewtun	0f4e39c559	Revert "Added support for other features for already supported models (#14358 )" (#14679 ) This reverts commit `0c70f145d1`.	2021-12-08 13:04:40 -05:00
Michael Benayoun	0c70f145d1	Added support for other features for already supported models (#14358 ) * Added support for other features for already supported models * Partial support for causal and seq2seq models * Partial support for causal and seq2seq models * OnnxSeq2SeqConfigWithPast to support seq2seq models * Parameterized the onnx tests * Restored run_mlm.py * Restored run_mlm.py * [WIP] BART update * BART and MBART * Added comments * Another sequence length of the past_key_values	2021-12-08 18:39:56 +01:00
Patrick von Platen	ee4fa2e465	[AutoProcessor] Add Wav2Vec2WithLM & small fix (#14675 ) * [AutoProcessor] Add Wav2Vec2WithLM & small fix * revert line removal * Update src/transformers/__init__.py * add test * up * up * small fix	2021-12-08 15:51:28 +01:00
NielsRogge	65b20b739b	Add Perceiver IO (#14487 ) * First draft * Style and remove mlm * Make forward pass work * More improvements * More improvements * Fix bug * More improvements * More improvements * Add PerceiverTokenizer first draft * Improve conversion script * More improvements * Make conversion script work for the encoder * Make conversion script work with local pickle files * Style & quality, fix-copies * Add dummy input to conversion script * Add absolute position embeddings to TextPreProcessor * Make forward pass of encoder work * More improvements * Move text preprocessor to separate script * More improvements * More improvements * Add post processor * Make MLM model work * Style * Add PerceiverForMaskedLM * Add PerceiverImagePreprocessor * Make style * Make PerceiverForImageClassification work * More improvements * More improvements * Use tokenizer in conversion script * Use PerceiverForMaskedLM in conversion script * Define custom PerceiverModelOutput * Improve PerceiverAttention to make it work for both MLM and image classification * More improvements * More improvements * More improvements to the conversion script * Make conversion script work for both MLM and image classification * Add PerceiverFeatureExtractor * More improvements * Style and quality * Add center cropping * Fix bug * Small fix * Add print statement * Fix bug in image preprocessor * Fix bug with conversion script * Make output position embeddings an nn.Parameter layer instead of nn.Embedding * Comment out print statements * Add position encoding classes * More improvements * Use position_encoding_kwargs * Add PerceiverForImageClassificationFourier * Make style & quality * Add PerceiverForImageClassificationConvProcessing * Style & quality * Add flow model * Move processors to modeling file * Make position encodings modular * Make basic decoder use modular position encodings * Add PerceiverForOpticalFlow to conversion script * Add AudioPreprocessor * Make it possible for the basic decoder to use Fourier position embeddings * Add PerceiverForMultimodalAutoencoding * Improve model for optical flow * Improve _build_network_inputs method * Add print statement * Fix device issue * Fix device of Fourier embeddings * Add print statements for debugging * Add another print statement * Add another print statement * Add another print statement * Add another print statement * Improve PerceiverAudioPreprocessor * Improve conversion script for multimodal model * More improvements * More improvements * Improve multimodal model * Make forward pass multimodal model work * More improvements * Improve tests * Fix some more tests * Add output dataclasses * Make more tests pass * Add print statements for debugging * Add tests for image classification * Add PerceiverClassifierOutput * More improvements * Make more tests pass for the optical flow model * Make style & quality * Small improvements * Don't support training for optical flow model for now * Fix _prepare_for_class for tests * Make more tests pass, add some docs * Add multimodal model to tests * Minor fixes * Fix tests * Improve conversion script * Make fixup * Remove pos_dim argument * Fix device issue * Potential fix for OOM * Revert previous commit * Fix test_initialization * Add print statements for debugging * Fix print statement * Add print statement * Add print statement * Add print statement * Add print statement * Add print statement * Add print statement * Remove need for output_shape * Comment out output_shape * Remove unnecessary code * Improve docs * Fix make fixup * Remove PerceiverTextProcessor from init * Improve docs * Small improvement * Apply first batch of suggestions from code review * Apply more suggestions from code review * Update docstrings * Define dicts beforehand for readability * Rename task to architecture in conversion script, include PerceiverModel in tests * Add print statements for debugging * Fix tests on GPU * Remove preprocessors, postprocessors and decoders from main init * Add integration test * Fix docs * Replace einops by torch * Update for new docs frontend * Rename PerceiverForImageClassification * Improve docs * Improve docs * Improve docs of PerceiverModel * Fix some more tests * Improve center_crop * Add PerceiverForSequenceClassification * Small improvements * Fix tests * Add integration test for optical flow model * Clean up * Add tests for tokenizer * Fix tokenizer by adding special tokens properly * Fix CI	2021-12-08 14:20:34 +01:00
Patrick von Platen	961732c276	[Wav2Vec2] PyCTCDecode Integration to support language model boosted decoding (#14339 ) * up * up * up * make it cleaner * correct * make style * add more tests * finish * small fix * make style * up * tryout to solve circle ci * up * fix more tests * fix more tests * apply Sylvain's suggestions * fix import * correct docs * add pyctcdecode only to speech tests * fix more tests * add tf, flax and pt tests * add pt * fix last tests * fix more tests * Apply suggestions from code review * change lines * Apply suggestions from code review Co-authored-by: Anton Lozhkov <aglozhkov@gmail.com> * correct tests * correct tests * add doc string Co-authored-by: Anton Lozhkov <aglozhkov@gmail.com>	2021-12-08 12:07:54 +01:00
Nicolas Patry	2e12d90b9e	Fixing Dataset for TQA + token-classification. (#14658 ) * Fixing Dataset for TQA + token-classification. * Fixing the tests. * Making sure `offset_mappings` is a valid argument.	2021-12-08 09:54:24 +01:00
Stas Bekman	b66c5ab20c	[deepspeed] fix --load_best_model_at_end (#14652 ) * [deepspeed] fix load_best_model_at_end * try with pull_request_target * revert: try with pull_request_target * style * add test * cleanup	2021-12-06 21:57:47 -08:00
Ryokan RI	30646a0a3c	Add mLUKE (#14640 ) * implement MLukeTokenizer and LukeForMaskedLM * update tests * update docs * add LukeForMaskedLM to check_repo.py * update README * fix test and specify the entity pad id in tokenization_(m)luke * fix EntityPredictionHeadTransform	2021-12-07 00:25:28 -05:00
Yih-Dar	4cdb67caba	Use cross_attention_hidden_size in Encoder-Decoder models (#14378 ) * add cross_attention_hidden_size to text-2-text encoder-decoder models (PT/Flax) * for TFEncoderDecoderModel * add equivalence test for TFEncoderDecoderModel * fix * fix failed equivalence tests * remove unused import * add detailed comment * Fix check_equivalence_tf_to_pt by using encoder/decoder * cleaning * Use cross_attention_hidden_size in speech-to-text * clean fast init logging msg in encoder decoder models * increase tol from 1e-5 to 1e-3 for tf test * style * style * make sure projection layer can run * remove type conversion + add check * fix conflict (config.output_hidden_size) * Remove TF -> PT in check_pt_tf_equivalence for TFEncoderDecoderModel Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2021-12-07 00:27:32 +01:00
Lysandre Debut	e9688875bf	Auto processor fix (#14623 ) * Add AutoProcessor class Init and tests Add doc Fix init Update src/transformers/models/auto/processing_auto.py Co-authored-by: Lysandre Debut <lysandre@huggingface.co> Reverts to tokenizer or feature extractor when available Adapt test * Revert "Adapt test" This reverts commit `bbdde5fab0`. * Revert "Reverts to tokenizer or feature extractor when available" This reverts commit `77659ff5d2`. * Don't revert everything Lysandre! Co-authored-by: Sylvain Gugger <sylvain.gugger@gmail.com>	2021-12-06 12:49:50 -05:00
tucan9389	0f3f045ebd	Add GPTJForQuestionAnswering (#14503 ) * Add GPTJForQuestionAnswering * Reformat for GPTJForQuestionAnswering * Fix isort error * make style for GPTJForQA * Add _keys_to_ignore_on_load_missing * Change the sequence of qa and classification Co-authored-by: Suraj Patil <surajp815@gmail.com>	2021-12-06 11:44:10 -05:00
Stas Bekman	71b1bf7ea8	[trainer] add tf32-mode control (#14606 ) * [trainer] add --tf32 support * it's pt>=.17 * it's pt>=.17 * flip the default to True * add experimental note * simplify logic * style * switch to 3-state logic * doc * Apply suggestions from code review Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * re-style code Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>	2021-12-03 10:08:58 -08:00
Li-Huai (Allan) Lin	66ea739168	Improve tokenizer tests (#13594 ) * Use new method to acquire tokenizers * Resolve TODOs. * Style * Fix * Enable do_lower_case in test_tokenize_special_tokens * Apply suggestion from code review * Fix mask token handling * Revert "Fix mask token handling" This reverts commit `daaa3f5291`. * Fix FNet mask token tokenization * Complete everything * Apply suggestions from code review	2021-12-03 08:39:10 +01:00
Nik	6645eb61fa	fix #14524 (IndexError when mask prob is too low) (#14525 ) * fix #14524 (IndexError when mask prob is too low) * fix formatting * correct documentation, add option for setting min_num_masks * change the semantic meaning of `mask_prob` in _compute_mask_indices With this commit the meaning of `mask_prob` actually adhered to the probability for each vector to be the start of a masked span of length. * fix check_copies test * fix documentation to semantic meaning of `upper bound of overall masking percentage`, revert changes to _compute_mask_indices * fix typo	2021-12-02 17:05:31 +03:00
Daniel Stancl	50d909be28	[Flax] Add FlaxBlenderbotSmall (#14576 ) * [WIP] Add FlaxBlenderbotSmall * Revert some unintentionally changed files Revert some unintentionally files changed by improperly filled cookiecutter instructions. * Fix repo consistency * Fix Flax-PT equivalence * Apply suggestions from code review * Update index.mdx * Apply suggestions from code review Co-authored-by: Suraj Patil <surajp815@gmail.com>	2021-12-02 14:21:48 +05:30
Suraj Patil	4c0dd199c8	FlaxGPTJ (#14396 ) * add flax gptj * no bias in attention dense * no wpe * fix rotary embeddings * fix rotary embeds * fix rotary embeds * quality * doc and quality * fix equivalence tests	2021-12-01 10:57:39 +05:30
Jamie DeAntonis	70996a5420	WIP: Support for Training with BF16 (#13207 ) * started bf16 integration * minor changes * code now runs * style * lay foundation for bf16 testing * lay foundation for bf16 testing * start the tests * better bf16 check * style * 2 separate checkers - one for bf16 support, another for bf16+autocast * Update src/transformers/training_args.py Co-authored-by: Stas Bekman <stas00@users.noreply.github.com> * a couple of comment resolutions * more comment resolutions * resolved a small bug * just some print statements * added todo marking * added a todo * adjust for API change s/fast_dtype/dtype/ * fix style * merge 2 bf16 util functions * bf16 now does scaling too * Add support for bfloat16 * Revert T5 layernorm to float32 This is based on the comment at https://github.com/huggingface/transformers/pull/14448/files#r752660929 and the PyTorch PR https://github.com/pytorch/pytorch/pull/66920 . * Add comment about conversion to float32 before returning the numpy data * Add comment about AMP-bfloat16 incompatibility * Fix formatting * typo * reformer / bf16 * cleanup * require at least pt-1.10 * fix * will deal with deepspeed separately * cleanup * revert * cleanup * fp16_full_eval and bf16_full_eval are separate modes * proper deprecation * cleanup * test and fixes * spelling * cleanup * add a note that this API is experimental Co-authored-by: jamie <jamie@cortx.com> Co-authored-by: Stas Bekman <stas@stason.org> Co-authored-by: Stas Bekman <stas00@users.noreply.github.com> Co-authored-by: suriya <suriya@cortx.com> Co-authored-by: Manuel R. Ciosici <manuelrciosici@gmail.com>	2021-11-30 18:00:47 -08:00
Suraj Patil	fc1d97f29d	VisionTextDualEncoder (#13511 ) * init vision_text_dual_encoder * fix merge * remove extra heads * fix tests * remove VISION_TEXT_DUAL_ENCODER_PRETRAINED_CONFIG_ARCHIVE_MAP * remove archive map * fix imports * fix more imports * fix init * delete tokenizers * fix imports * clean * support clip's vision model * handle None config * begin tests * more test and few fixes * warn about newly init weights * more tests * add loss to model * remove extra classes from doc * add processor * doc and small fixes * add start docstr * update flax model * flax tests * more flax tests * doc * quality * doc and quality * fix doc * doc * remove comments * update warning * quality * fix docs * Apply suggestions from code review Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com> * replace asserts, fix imports * update imports * fix import * address some review comments * fix check * reduce tolerance * fix test * add flax integration test * Apply suggestions from code review Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * address Sylvain's comments * fix style * add pt_flax_equivalence test in PT tests * add pt integration test * update test * use pre-trained checkpoint in examples Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com> Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>	2021-11-30 22:21:48 +05:30
Daniel Stancl	faacd74729	[Flax] Add FlaxBlenderbot (#13633 ) * Init Flax implementation for Blenderbot * Add a majority of stuff except for tests * make style quality * Add tests and fix some bugs * Add tests * Clean source code and fix some bugs * Fix copies and docs * Fix jax device condition for tests * Fix layer norm in the encoder * Fix a few typos in the test file * make fix-copies * make fix-copies * fix layer norm * Fix Flax params dtype (#13090) * Fix PR reference (#13098) * make fix-copies * Update tests/test_modeling_flax_blenderbot.py Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com> Co-authored-by: Suraj Patil <surajp815@gmail.com>	2021-11-30 17:36:54 +05:30
Kamal Raj	c468a87a69	Tapas tf (#13393 ) * TF Tapas first commit * updated docs * updated logger message * updated pytorch weight conversion script to support scalar array * added use_cache to tapas model config to work properly with tf input_processing * 1. rm embeddings_sum 2. added # Copied 3. + TFTapasMLMHead 4. and a lot of other small fixes * updated docs * + test for tapas * updated testing_utils to check is_tensorflow_probability_available * converted model logits post processing using numpy to work with both PT and TF models * + TFAutoModelForTableQuestionAnswering * added TF support * added test for TFAutoModelForTableQuestionAnswering * added test for TFAutoModelForTableQuestionAnswering pipeline * updated auto model docs * fixed typo in import * added tensorflow_probability to run tests * updated MLM head * updated tapas.rst with TF model docs * fixed optimizer import in docs * updated convert to np data from pt model is not `transformers.tokenization_utils_base.BatchEncoding` after pipeline upgrade * updated pipeline: 1. with torch.no_grad removed, pipeline forward handles 2. token_type_ids converted to numpy * updated docs. * removed `use_cache` from config * removed floats_tensor * updated code comment * updated Copyright Year and logits_aggregation Optional * updated docs and comments * updated docstring * fixed model weight loading * make fixup * fix indentation * added tf slow pipeline test * pip upgrade * upgrade python to 3.7 * removed from_pt from tests * revert commit `f18cfa9`	2021-11-30 11:07:55 +01:00
NielsRogge	25156eb296	Rename ImageGPT (#14526 ) * Rename * Add MODEL_FOR_CAUSAL_IMAGE_MODELING_MAPPING	2021-11-29 10:19:11 +01:00
Nicolas Patry	04683c0659	Fix a slow test. (#14527 )	2021-11-25 12:59:33 -05:00
NielsRogge	3772af49ce	[Tests] Improve vision tests (#14458 ) * Improve tests * Install vision for tf tests	2021-11-24 15:22:20 +01:00
Stas Bekman	956a483173	[deepspeed] zero inference (#14253 ) * [deepspeed] zero inference * only z3 makes sense for inference * fix and style * docs * rework * fix test * Apply suggestions from code review Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * responding to suggestions Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>	2021-11-23 14:09:15 -08:00
Sylvain Gugger	204d251310	Auto processor (#14465 ) * Add AutoProcessor class * Init and tests * Add doc * Fix init * Update src/transformers/models/auto/processing_auto.py Co-authored-by: Lysandre Debut <lysandre@huggingface.co> * Reverts to tokenizer or feature extractor when available * Adapt test Co-authored-by: Lysandre Debut <lysandre@huggingface.co>	2021-11-22 12:17:38 -05:00
Nicolas Patry	a4553e6c64	Moving pipeline tests from `Narsil` to `hf-internal-testing`. (#14463 ) * Moving everything to `hf-internal-testing`. * Fixing test values. * Moving to other repo. * Last touch?	2021-11-22 04:40:45 -05:00
Shang Zhang	a59e7c1ed4	Add QDQBert model and quantization examples of SQUAD task (#14066 ) * clean up branch for add-qdqbert-model * README update for QAT example; update docstrings in modeling_qdqbert.py * Update qdqbert.rst * Update README.md * Update README.md * calibration data using training set; QAT example runs in fp32 * re-use BERTtokenizer for qdqbert * Update docs/source/model_doc/qdqbert.rst Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update docs/source/model_doc/qdqbert.rst Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update docs/source/model_doc/qdqbert.rst Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * remove qdqbert tokenizer * Update qdqbert.rst * update evaluate-hf-trt-qa.py * update configuration_qdqbert.py * update modeling_qdqbert.py: add copied statement; replace assert with ValueError * update copied from statement * add is_quantization_available; run make fix-copies * unittest add require_quantization * add backend dependency to qdqbert model * update README; update evaluate script; make style * lint * docs qdqbert update * circleci build_doc add pytorch-quantization for qdqbert * update README * update example readme with instructions to upgrade TensorRT to 8.2 * Update src/transformers/models/qdqbert/configuration_qdqbert.py Co-authored-by: Lysandre Debut <lysandre@huggingface.co> * Update src/transformers/models/qdqbert/configuration_qdqbert.py Co-authored-by: Lysandre Debut <lysandre@huggingface.co> * Update src/transformers/models/qdqbert/configuration_qdqbert.py Co-authored-by: Lysandre Debut <lysandre@huggingface.co> * Update src/transformers/models/qdqbert/configuration_qdqbert.py Co-authored-by: Lysandre Debut <lysandre@huggingface.co> * change quantization to pytorch_quantization for backend requirement * feed_forward_chunking not supported in QDQBert * make style * update model docstrings and comments in testing scripts * rename example to quantization-qdqbert; rename example scripts from qat to quant * Update src/transformers/models/qdqbert/modeling_qdqbert.py Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com> * rm experimental functions in quant_trainer * qa cleanup * make fix-copies for docs index.rst * fix doctree; use post_init() for qdqbert * fix early device assignment for qdqbert * fix CI:Model templates runner Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> Co-authored-by: Lysandre Debut <lysandre@huggingface.co> Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>	2021-11-19 13:33:39 -05:00
Nicolas Patry	81fe8afaac	Adding support for `hidden_states` and `attentions` in unbatching (#14420 ) support.	2021-11-19 15:37:52 +01:00
Patrick von Platen	f25a9332e8	[Generation] Allow `inputs_embeds` as an input (#14443 ) * up * finalize * finalize * finish * Update src/transformers/generation_utils.py * apply feedback	2021-11-19 15:35:06 +01:00
NielsRogge	0490b98877	[ImageGPT] Small fixes (#14460 ) * Add integration test * Fix typo	2021-11-19 15:15:02 +01:00
Sylvain Gugger	83ef8bcac2	Fix finite IterableDataset test on multiple GPUs (#14445 )	2021-11-18 10:25:06 -05:00
NielsRogge	da36c557f7	Add ImageGPT (#14240 ) * First draft * More improvements * Improve conversion script * Fix init weights for layer norm * Fix correct model for conversion script * Don't tie input and output embeddings * Add print statements for debugging * Add print statements for debugging * Fix vocab size of model * Improve documentation, remove fast tokenizer * Add ImageGPTForImageClassification, improve docs * Fix docs issue * Set verbosity level back to info * Improve tests * Fix tests and add figure * Delete tokenizer file * Remove ImageGPTTokenizer from init files * Remove ImageGPTLayer from init files * Remove ImageGPT tokenizer from docs * First draft of ImageGPTFeatureExtractor * Fix typo * Fix bug * More improvements * Apply suggestions from code review, add tests for feature extractor * Fix layernorm * Update save_pretrained method * Fix issue * Make all tests of ImageGPTFeatureExtractor pass * Update code examples * Rename model inputs to pixel_values * Improve code examples * Update init_weights to post_init * Fix post_init	2021-11-18 16:24:34 +01:00
Sylvain Gugger	d83b0e0c07	Add a post init method to all models (#14431 ) * Add a post init method to all models * Fix tests * Fix last tests * Fix templates * Add comment * Forgot to save	2021-11-18 08:38:09 -05:00
N	1991da07f7	[WIP] Ensure TF model configs can be converted to proper JSON (#14415 ) * test: make sure model configs are jsonifiable * fix: return python dict instead of config object * fix: accept pretrained config and use correct class * Re-enabling slow tests and applying them to core models only * Re-enabling slow tests and applying them to core models only * Add new test file to fetcher * Remove tooslow tests from test_modeling_tf_common.py * make style * Style fixes * Style fixes * Style fixes * Style fixes * Adding core tests to GPT2 and BART * Removing unused imports Co-authored-by: niklas.fruehauf <niklas.fruehauf@sovanta.com> Co-authored-by: matt <rocketknight1@gmail.com>	2021-11-17 20:24:39 +00:00
NielsRogge	a2864a50e7	Improve semantic segmentation models (#14355 ) * Improve tests * Improve documentation * Add ignore_index attribute * Add semantic_ignore_index to BEiT model * Add segmentation maps argument to BEiTFeatureExtractor * Simplify SegformerFeatureExtractor and corresponding tests * Improve tests * Apply suggestions from code review * Minor docs improvements * Streamline segmentation map tests of SegFormer and BEiT * Improve reduce_labels docs and test * Fix code quality * Fix code quality again	2021-11-17 15:29:58 +01:00
Patrick von Platen	700a748fe6	[Wav2Vec2] Add New Wav2Vec2 Translation (#14392 ) * add new wav2vec2 translation * correct * up * add tests * correct end copy * correct more * up * correct unispeech sat * finish * finalize * finish * up	2021-11-17 14:38:56 +01:00
Valentin	a33168aa78	Avoid looping when data exhausted (#14413 ) * stop training when a finite IterableDataset is exhausted when using an iterable dataset num_epochs is set to sys.maxsize to make sure all data is consumed likewise we want to set max_steps high enough but still stop when all data is consumed (cherry picked from commit 6f0e1d6363153da9051e93acffe1cbab3a3f3b12) * fix typo flase -> false * add test for stopping training on exhausted finite iterable dataset * remove redundant gradient_accumulation_steps * run make style reformat training_args docstring	2021-11-16 16:50:04 -05:00
Sylvain Gugger	040fd47162	Fix gradient_checkpointing backward compatibility (#14408 ) * Fix gradient_checkpointing backward compatibility * Remove needless line * make sure mask prob is big enough and length small enough * Fix tests Co-authored-by: patrickvonplaten <patrick.v.platen@gmail.com>	2021-11-16 08:58:42 -05:00
Lysandre Debut	1cc453d33c	Allow per-version configurations (#14344 ) * Allow per-version configurations * Update tests/test_configuration_common.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update tests/test_configuration_common.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>	2021-11-15 16:38:02 -05:00
Yih-Dar	a67d47b40c	Fix weight loading issue (#14016 ) Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2021-11-15 17:48:40 +01:00
NielsRogge	74e6111ba7	Fix test and docs (#14399 )	2021-11-15 17:35:33 +01:00
Patrick von Platen	4ce74edf51	[Speech2Text2] Enable tokenizers (#14390 ) * [Speech2Text2] Enable tokenizers * minor fix * Apply suggestions from code review Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>	2021-11-15 16:34:11 +01:00
Suraj Patil	2e60276b38	[M2M100Tokenizer] fix _build_translation_inputs (#14382 ) * add return_tensors parameter * fix test * Apply suggestions from code review Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * style Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>	2021-11-13 20:57:12 +05:30
Nicolas Patry	ed5d15518b	Adding support for raw python `generator` in addition to `Dataset` for pipelines (#14352 ) * Adding support for raw python `generator` in addition to `Dataset` The main goal is to ease the creation of streaming data for the pipe. `Dataset` is more involved and pytorch specific. This PR provides a way to use a python iterator too. This enabled #14250 but can be proposed as a standalone PR. ```python from transformers import pipeline def read_data(filename): with open(filename, 'r') as f: for line in f: yield line pipe = pipeline("text-classification") for classified in pipe(read_data("large_file.txt")): print("Success ! ", classified) ``` The main caveat is the interaction with `DataLoader` when `num_workers>1`. With multiple workers, each receives a copy of the generator (like `IterableDataset`). That means the naive iterator will fail, since all workers iterate on all items of the generator. There are ways to do clever "skipping", but that could still be costly because every worker still has to pass through all items of the generator (they just ignore the items they don't handle). Using `num_workers=1` is the simplest fix, and if the cost of loading your data is small enough it should be good enough. In the above example, trying to do smart tricks to skip some lines is unlikely to be a net positive, for instance. If there are better ways to do "jumps" on some data, then using `Dataset` is more advised (since different workers can then just jump themselves). * Adding iterator support for `tf` too.	2021-11-12 09:20:40 +01:00
Suraj Patil	3d607df8f4	fix loading flax bf16 weights in pt (#14369 ) * fix loading flax bf16 weights in pt * fix clip test * fix t5 test * add logging statement * Update src/transformers/modeling_flax_pytorch_utils.py Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com> * switch back to native any * fix check for bf16 weights Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>	2021-11-11 21:20:49 +05:30
Matt	4c35c8d89c	Experimenting with adding proper get_config() and from_config() methods (#14361 ) * Experimenting with adding proper get_config() and from_config() methods * Adding a test for get/from config * Fix test for get/from config	2021-11-11 14:21:50 +00:00
Suraj Patil	e92190c0f8	Fix Flax params dtype (#13098 ) * fix inits * fix embed dtype * fix embed dtype * add test to check default dtype * quality * add type conversion methods for flax models * more robust casting * cast sinusoidal positions * update pegasus * update albert * update test * make sure dtype is passed to every module * style * fix electra dense * fix t5 * quality * add more tests * better name * use the dtype for lm head computation * fix albert * style * fix albert embed dtype * more tests * fix vision enc-dec * cleanup * fix embed dtype pegasus * fix default param test * doc * update template * fix final_logits_bias dtype * Apply suggestions from code review Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com> * fix doc * fix doc * add detailed docstring for dtype parameter * remove un-necessary import Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>	2021-11-11 14:45:20 +05:30
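
Note on ed5d15518b: the message describes what happens when several workers each receive their own copy of a generator, and mentions "skipping" as a possible workaround. The sketch below illustrates that idea in plain Python only; it is not the code from the PR. The helper names (`read_items`, `shard_for_worker`, `naive_copies`, `sharded`) are hypothetical, and the workers are simulated with a loop rather than a real `DataLoader`.

```python
from itertools import islice
from typing import Iterable, Iterator, List, TypeVar

T = TypeVar("T")


def read_items(n: int) -> Iterator[str]:
    """Hypothetical stand-in for a generator streaming lines from a large file."""
    for i in range(n):
        yield f"line {i}"


def shard_for_worker(items: Iterable[T], worker_id: int, num_workers: int) -> Iterator[T]:
    """Keep only the items assigned to `worker_id` (hypothetical helper).

    The worker still walks the whole stream; it simply discards the items
    that belong to the other workers.
    """
    if not 0 <= worker_id < num_workers:
        raise ValueError("worker_id must be in [0, num_workers)")
    # islice keeps items worker_id, worker_id + num_workers, ...
    return islice(items, worker_id, None, num_workers)


def naive_copies(n: int, num_workers: int) -> List[str]:
    """Simulate workers that each iterate a full copy: items are duplicated."""
    out: List[str] = []
    for _ in range(num_workers):
        out.extend(read_items(n))
    return out


def sharded(n: int, num_workers: int) -> List[str]:
    """Simulate workers that skip items they do not own: no duplicates."""
    out: List[str] = []
    for worker_id in range(num_workers):
        out.extend(shard_for_worker(read_items(n), worker_id, num_workers))
    return out
```

With two simulated workers, `naive_copies(4, 2)` returns every item twice, while `sharded(4, 2)` returns each item once. Each worker still reads the full stream in the sharded case, which is the cost the commit message warns about.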

1513 Commits