transformers

mirror of https://github.com/huggingface/transformers.git synced 2025-08-03 03:31:05 +06:00

Author	SHA1	Message	Date
Yih-Dar	2199382dfd	Use random_attention_mask for TF tests (#16517 ) * use random_attention_mask for TF tests * Fix for TFCLIP test (for now). Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2022-04-01 16:53:07 +02:00
Jim Rohrer	9de70f213e	Add ONNX export for BeiT (#16498 ) * Add beit onnx conversion support * Updated docs * Added cross reference to ViT ONNX config	2022-04-01 10:52:42 +02:00
Francesco Saverio Zuppichini	c4deb7b3ae	Feature Extractor accepts `segmentation_maps` (#15964 ) * feature extractor accepts * resolved conversations * added examples in test for ADE20K * num_classes -> num_labels * Apply suggestions from code review Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * resolving conversations * resolving conversations * removed ADE * CI * minor changes in conversion script * reduce_labels in feature extractor * minor changes * correct preprocess for instace segmentation maps * minor changes * minor changes * CI * debugging * better padding * going to update labels inside the model * going to update labels inside the model * minor changes * tests * removed changes in feature_extractor_utils * conversation * conversation * example in feature extractor * more docstring in modeling * test * make style * doc Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>	2022-03-30 18:46:51 +02:00
Yih-Dar	2b483230a1	Raise diff tolerance value for TFViTMAEModelTest (#16483 ) * Raise diff tolerance value Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2022-03-29 22:12:27 +02:00
Sander Land	d7c8ce57d4	Avoid accessing .dataset of a DataLoader in Trainer (#16451 ) * Avoid accessing .dataset of a dataloader * style * fix * cleaning up, reverting some misunderstandings * black * add train_dataset argument to get_train_dataloader, and fix other instances of length checks * flake8 * address comments * fix bug * cleanup * add test * Update tests/trainer/test_trainer.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * under torch * merge * stylistic suggestion Co-authored-by: Sander Land <sander@chatdesk.com> Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>	2022-03-29 15:00:18 -04:00
Sayak Paul	5b40a37bc4	Add TF ViT MAE (#16255 ) * ported TFViTMAEIntermediate and TFViTMAEOutput. * added TFViTMAEModel and TFViTMAEDecoder. * feat: added a noise argument in the implementation for reproducibility. * feat: vit mae models with an additional noise argument for reproducibility. Co-authored-by: ariG23498 <aritra.born2fly@gmail.com> Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2022-03-29 18:24:15 +01:00
Joao Gante	7a9ef8181c	TF: properly handle kwargs in encoder_decoder architectures (#16465 ) * properly handle kwargs in encoder_decoder architectures * make fixup	2022-03-29 18:17:47 +01:00
Yih-Dar	86cff21cf6	Fix some TF GPT-J CI testings (#16454 ) * Fix for test_mixed_precision * Fix test_saved_model_creation by using shape_list instead of shape * skit test_model_from_pretrained on GPU for now to avoid GPU OOM * skip test_gptj_sample_max_time for now Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2022-03-29 18:04:20 +02:00
Yih-Dar	aebca696af	Fix missing output_attentions in PT/Flax equivalence test (#16271 ) * fix - set output_attentions to True * Update tests/test_modeling_flax_common.py * update for has_attentions * overwrite check_outputs in FlaxBigBirdModelTest Co-authored-by: ydshieh <ydshieh@users.noreply.github.com> Co-authored-by: Suraj Patil <surajp815@gmail.com>	2022-03-29 17:51:48 +02:00
NielsRogge	979b039c89	Add DPT (#15991 ) * First draft * More improvements * Add fusion blocks * Make conversion script work for dpt_large * Make conversion script work * Improve implementation * Improve conversion script * Add DPTForSemanticSegmentation * Make conversion work for semantic segmentation * Add tests * Remove print statements * First draft * Redesign neck * Improve tests * Improve implementation some more * Make neck output list of tensors * Improve neck and feature extractor * Fix integration tests * Make more tests pass * Make all tests pass * Add missing config archive map * Add in_index attribute to make heads accept list of tensors * Apply suggestions from code review * Apply suggestions from code review Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Apply some more suggestions * Add copied from statements * Remove assert * Apply suggestions from code review * Apply suggestions from code review * Remove DPTInterpolate in favor of nn.Upsample * Add comments * Apply suggestions from code review * Apply suggestions from code review * Add proposed design * Update design * Add DPTReassembleLayer * Add DPTFeatureFusionStage * Apply more suggestions from code review * Apply suggestions from code review * Apply suggestions from code review * Fix rebase * Update in_index and out_indices * Fix conversion script * Fix code quality * Add model to toctree and use DepthEstimatorOutput * Fix rebase * Fix code examples * Improve code * Fix copied from statements * Apply suggestions from code review * Remove compute_loss method * Apply suggestions from code review * Fix documentation tests file * Remove test.py file * Improve doc example Co-authored-by: Niels Rogge <nielsrogge@Nielss-MacBook-Pro.local> Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> Co-authored-by: Niels Rogge <nielsrogge@nielss-mbp.home>	2022-03-28 16:28:10 +02:00
Sanchit Gandhi	7ca4633555	[FlaxSpeechEncoderDecoderModel] Ensure Input and Output Word Embeddings Are Not Tied (#16444 ) * [FlaxSpeechEncoderDecoderModel] Ensure Input and Output Word Embeddings Are Not Tied * rebase	2022-03-28 14:14:10 +02:00
Jaesun Park	e0ac72b7bd	Fix PerceiverMLP and test (#16405 ) Co-authored-by: Jaesun Park <jaesun.park1@navercorp.com>	2022-03-28 14:06:48 +02:00
Sanchit Gandhi	925fc57b70	[Flax] Improve Robustness of Back-Prop Tests (#16418 ) * [Flax] Improve Robustness of Back-Prop Tests * check equality of logits/outputs * make fixup	2022-03-28 11:56:54 +02:00
Daniel Stancl	ed2ee373d0	Add TF implementation of GPT-J (#15623 ) * Initial commit * Add TFGPTJModel * Fix a forward pass * Add TFGPTJCausalLM * Add TFGPTJForSequenceClassification * Add TFGPTJForQuestionAnswering * Fix docs * Deal with TF dynamic shapes * Add Loss parents to models * Adjust split and merge heads to handle 4 and 5-dim tensors * Update outputs for @tooslow tests	2022-03-25 19:27:19 +00:00
Sanchit Gandhi	e231c72906	[FlaxSpeechEncoderDecoder] Fix feature extractor gradient test (#16407 )	2022-03-25 17:46:53 +01:00
lewtun	a97f3150c4	Add ONNX support for Blenderbot and BlenderbotSmall (#15875 ) * Add ONNX support for Blenderbot * Add BlenderbotSmall ONNX configuration * Update serialization table	2022-03-25 17:04:43 +01:00
Sylvain Gugger	b473617d63	Checkpoint sharding (#16343 ) * Sharded checkpoint support * Handle distant sharded checkpoints * Add tests * TODO is done * Apply suggestions from code review Co-authored-by: Stas Bekman <stas00@users.noreply.github.com> * Fix docstring * Add example and format * Address review comments * More review comments * End of merge * Revert unintentional change * VsCode what did you do? * Style * Changes * Address final comments * Quality * Moar tests * Move import beneath is_pt_available Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>	2022-03-25 11:59:25 -04:00
Sylvain Gugger	cae394c8fa	Adapt import to new structure	2022-03-24 14:40:05 -04:00
Yih-Dar	f571dc20ac	Update PT Flax equivalence tests in PT test file (#16280 ) * update PT/Flax equivalence tests on PT side * overwrite check_outputs in BigBirdModelTest Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2022-03-24 14:45:30 +01:00
Yih-Dar	2a27c80063	Fix BigBirdModelTester (#16310 ) * fix * update the expected value in test_fast_integration Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2022-03-24 13:43:52 +01:00
Edward Beeching	aff9bc405a	Decision transformer gym (#15845 ) * Created the Decision Transformer Modle * updating tests, copy to other machine * Added last hidden size to Decision Transformer modelling outputs * Removed copy of original DT file * made a temporary change to gpt2 to have it conform with the Decision Transformer version * Updated tests * Ignoring a file used to test the DT model * added comments to config file * added comments and argument descriptions to decision transformer file * Updated doc * Ran "make style" * Remove old model imports * Removed unused imports, cleaned up init file * Update docs/source/model_doc/decision_transformer.mdx added my username Co-authored-by: Lysandre Debut <lysandre@huggingface.co> * Reverted changes made to gpt2 * Removed datasets submodule * Update the modeling outputs to include gpt2 attentions, hidden states and last hidden states * Added support for return of hidden states, attentions and return dict of gpt2 model. * Updated tests to include many of the ModelTesterMixin tests. The following tests are skipped: test_generate_without_input_ids, test_pruning, test_resize_embeddings, test_head_masking, test_attention_outputs, test_hidden_states_output, test_inputs_embeds, test_model_common_attributes * Added missing line to the end of gpt2 file * Added an integration test for the Decision Transformer Test performs and autoregressive evaluation for two time steps * Set done and info to _ to fix failing test * Updated integration test to be deterministic and check expected outputs * Apply suggestions from code review Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Removed unnecessary config options * Cleaned up commented code and old comments. * Cleaned up commented code. 
* Changed DecisionTransformer to Decision Transformer * Added Decision Transformer to the main README file * Added copy of GTP2 called DecisionTranformerGPT2Model * isorted imports * isorted imports * Added model to non-English README files * Ran make fix-copies and corrected some cases. * Updated index file to include Decision Transformer * Added gpt2 model as copy inside the Decision Transformer model file * Added the unit test file to the list of TEST_FILES_WITH_NO_COMMON_TESTS * Deleted redundant checkpoint files (I don't know how these got committed) * Removed testing files. (These should have never been committed) * Removed accidentally committed files * Moved the Decision Transformer test to its own directory * Add type hints for Pegasus (#16324) * Funnel type hints (#16323) * add pt funnel type hints * add tf funnel type hints * Add type hints for ProphetNet PyTorch (#16272) * [GLPN] Improve docs (#16331) * Add link to notebook * Add link * Fix bug Co-authored-by: Niels Rogge <nielsrogge@Nielss-MacBook-Pro.local> * Added type hints for Pytorch Marian calls (#16200) * Added type hinting for forward functions in pytorch marian * typo correction * Removed type hints on functions from BART per Suraj Patil request * fix import pb * fix typo * corrected tuple call * ran black * after fix-copies Some optional tags on primitives were removed, past_key_values in MarianForCausalLM changed from Tuple of Tuple to List * Fixing copies to roformer and pegasus Co-authored-by: Clementine Fourrier <cfourrie@inria.fr> Co-authored-by: matt <rocketknight1@gmail.com> * Moved DecisionTransformOutput to modeling_decision_transformer * Moved the example usage to research project and cleaned comments * Made tests ignore the copy of gpt2 in Decision Transformer * Added module output to modelling decision transformer * removed copied gpt2 model from list of transformers models * Updated tests and created __init__ file for new test location * Update README.md Co-authored-by: Sylvain 
Gugger <35901082+sgugger@users.noreply.github.com> * Update src/transformers/models/decision_transformer/configuration_decision_transformer.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Removed unneeded summary type from config file * Fixed copies * Updated pretrained config map to refer to hopper-medium checkpoint * done (#16340) * Added Decision transformer to model docs * Update src/transformers/models/decision_transformer/modeling_decision_transformer.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update src/transformers/models/decision_transformer/modeling_decision_transformer.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update src/transformers/models/decision_transformer/configuration_decision_transformer.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Add type annotations for Rembert/Splinter and copies (#16338) * undo black autoformat * minor fix to rembert forward with default * make fix-copies, make quality * Adding types to template model * Removing List from the template types * Remove `Optional` from a couple of types that don't accept `None` Co-authored-by: matt <rocketknight1@gmail.com> * [Bug template] Shift responsibilities for long-range (#16344) * Fix code repetition in serialization guide (#16346) * Adopt framework-specific blocks for content (#16342) * ✨ refactor code samples with framework-specific blocks * ✨ update training.mdx * 🖍 apply feedback * Updates the default branch from master to main (#16326) * Updates the default branch from master to main * Links from `master` to `main` * Typo * Update examples/flax/README.md Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Updated model with custom docstring example * Created the Decision Transformer Modle * updating tests, copy to other machine * Added last 
hidden size to Decision Transformer modelling outputs * Removed copy of original DT file * made a temporary change to gpt2 to have it conform with the Decision Transformer version * Updated tests * Ignoring a file used to test the DT model * added comments to config file * added comments and argument descriptions to decision transformer file * Updated doc * Ran "make style" * Remove old model imports * Removed unused imports, cleaned up init file * Update docs/source/model_doc/decision_transformer.mdx added my username Co-authored-by: Lysandre Debut <lysandre@huggingface.co> * Reverted changes made to gpt2 * Removed datasets submodule * Update the modeling outputs to include gpt2 attentions, hidden states and last hidden states * Added support for return of hidden states, attentions and return dict of gpt2 model. * Updated tests to include many of the ModelTesterMixin tests. The following tests are skipped: test_generate_without_input_ids, test_pruning, test_resize_embeddings, test_head_masking, test_attention_outputs, test_hidden_states_output, test_inputs_embeds, test_model_common_attributes * Added missing line to the end of gpt2 file * Added an integration test for the Decision Transformer Test performs and autoregressive evaluation for two time steps * Set done and info to _ to fix failing test * Updated integration test to be deterministic and check expected outputs * Apply suggestions from code review Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Removed unnecessary config options * Cleaned up commented code and old comments. * Cleaned up commented code. * Changed DecisionTransformer to Decision Transformer * Added Decision Transformer to the main README file * Added copy of GTP2 called DecisionTranformerGPT2Model * isorted imports * isorted imports * Added model to non-English README files * Ran make fix-copies and corrected some cases. 
* Updated index file to include Decision Transformer * Added gpt2 model as copy inside the Decision Transformer model file * Added the unit test file to the list of TEST_FILES_WITH_NO_COMMON_TESTS * Deleted redundant checkpoint files (I don't know how these got committed) * Removed testing files. (These should have never been committed) * Removed accidentally committed files * Moved the Decision Transformer test to its own directory * Moved DecisionTransformOutput to modeling_decision_transformer * Moved the example usage to research project and cleaned comments * Made tests ignore the copy of gpt2 in Decision Transformer * Added module output to modelling decision transformer * removed copied gpt2 model from list of transformers models * Updated tests and created __init__ file for new test location * Update README.md Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update src/transformers/models/decision_transformer/configuration_decision_transformer.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Removed unneeded summary type from config file * Fixed copies * Updated pretrained config map to refer to hopper-medium checkpoint * Added Decision transformer to model docs * Update src/transformers/models/decision_transformer/modeling_decision_transformer.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update src/transformers/models/decision_transformer/modeling_decision_transformer.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update src/transformers/models/decision_transformer/configuration_decision_transformer.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Updated model with custom docstring example * Updated copies, config auto, and readme files. 
Co-authored-by: Lysandre Debut <lysandre@huggingface.co> Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> Co-authored-by: Dan Tegzes <48134725+Tegzes@users.noreply.github.com> Co-authored-by: Adam Montgomerie <adam@avanssion.com> Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> Co-authored-by: Niels Rogge <nielsrogge@Nielss-MacBook-Pro.local> Co-authored-by: Clémentine Fourrier <22726840+clefourrier@users.noreply.github.com> Co-authored-by: Clementine Fourrier <cfourrie@inria.fr> Co-authored-by: matt <rocketknight1@gmail.com> Co-authored-by: Francesco Saverio Zuppichini <francesco.zuppichini@gmail.com> Co-authored-by: Jacob Dineen <54680234+jacobdineen@users.noreply.github.com> Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com> Co-authored-by: Omar Sanseviero <osanseviero@gmail.com> Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> Co-authored-by: Lysandre Debut <lysandre.debut@reseau.eseo.fr>	2022-03-23 16:18:43 -04:00
Sylvain Gugger	c595b6e6a9	Make Transformers use cache files when hf.co is down (#16362 ) * Make Transformers use cache files when hf.co is down * Fix tests * Was there a random circleCI failure? * Isolate patches * Style * Comment out the failure since it doesn't fail anymore * Better comment	2022-03-23 15:56:49 -04:00
Joao Gante	9e8c37dc82	TF - Fix interchangeable past/past_key_values and revert output variable name in GPT2 (#16332 ) * revert tf gpt2 * add test for unpack_inputs and fix test case * add changes to vision encoder decoder	2022-03-23 18:41:18 +00:00
Sylvain Gugger	4975002df5	Reorganize file utils (#16264 ) * Split file_utils in several submodules * Fixes * Add back more objects * More fixes * Who exactly decided to import that from there? * Second suggestion to code with code review * Revert wront move * Fix imports * Adapt all imports * Adapt all imports everywhere * Revert this import, will fix in a separate commit	2022-03-23 10:26:33 -04:00
Lysandre Debut	eca77f4719	Updates the default branch from master to main (#16326 ) * Updates the default branch from master to main * Links from `master` to `main` * Typo * Update examples/flax/README.md Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>	2022-03-23 03:46:59 -04:00
NielsRogge	0c55d47cde	Add GLPN (#16199 ) * First draft * Fix logits calculation * Improve tests * Add copied from statements * Fix base_model_prefix * Improve implementation, upload new models * Update design * Fix integration test * Add model to README and toctree * Add document image * Apply suggestions from code review * Apply suggestions from code review Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Add decoder_hidden_size attribute * Update design of decoder * Add DepthEstimatorOutput class * Rename in_index to head_in_index and add feature extractor tests * Apply suggestions from code review * Apply suggestions from code review * Update pretrained model name and add to doc tests * Remove test.py script * Update copied from statements and clean up Co-authored-by: Niels Rogge <nielsrogge@Nielss-MacBook-Pro.local> Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>	2022-03-22 08:51:13 +01:00
Yih-Dar	f466936476	Add has_attentions to TFModelTesterMixin as done on PyTorch side (#16259 ) Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2022-03-19 11:44:17 +01:00
Yih-Dar	75c666b4a8	Aggressive PT/TF equivalence test on PT side (#16250 ) * Aggressive PT/TF equivalence test on PT side * Ugly fix for `TFTapasForQuestionAnswering` * apply review suggestions Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2022-03-18 18:51:24 +01:00
Yih-Dar	d481b6414d	Make Flax pt-flax equivalence test more aggressive (#15841 ) * Make test_equivalence_pt_to_flax more aggressive * Make test_equivalence_flax_to_pt more aggressive * don't use to_tuple * clean-up * fix missing test cases + testing on GPU * fix conversion * fix `ValueError: assignment destination is read-only` * Add type checking * commit to revert later * Fix * fix * fix device * better naming * clean-up Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2022-03-18 18:15:36 +01:00
Suraj Patil	b25b92ac4f	update jax version and re-enable some tests (#16254 )	2022-03-18 16:45:39 +01:00
Nicolas Patry	ecb4662d17	Attention mask is important in the case of batching... (#16222 ) * Attention mask is important in the case of batching... * Improve the fix. * Making the sentence different enough that they exhibit different predictions.	2022-03-18 10:02:12 +01:00
NielsRogge	ec4e421b7d	Update expected slices for pillow > 9 (#16117 ) * Update expected slices for pillow > 9 * Add expected slices depending on pillow version * Add different slices depending on pillow version for other models Co-authored-by: Niels Rogge <nielsrogge@Nielss-MacBook-Pro.local>	2022-03-18 09:46:45 +01:00
Suraj Patil	632ff3c39e	[FlaxSpeechEncoderDecoderModel] Skip from_encoder_decoder_pretrained (#16236 ) * skip the test * fix * fix skip	2022-03-17 20:05:14 +01:00
罗崚骁(LUO Lingxiao)	81643edda5	Support PEP 563 for HfArgumentParser (#15795 ) * Support PEP 563 for HfArgumentParser * Fix issues for Python 3.6 * Add test for string literal annotation for HfArgumentParser * Remove wrong comment * Fix typo * Improve code readability Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Use `isinstance` to compare types to pass quality check * Fix style Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>	2022-03-17 13:51:37 -04:00
Lysandre Debut	5a6b3ccd28	Skip equivalence test for TransfoXL (#16224 ) * Skip test for TransfoXL * Single list	2022-03-17 09:03:07 -04:00
Francesco Saverio Zuppichini	d9b8d1a9f5	update test (#16219 )	2022-03-17 08:11:55 -04:00
NielsRogge	03c14a515f	[Tests] Fix DiT test (#16218 ) * Fix device * Clean up Co-authored-by: Niels Rogge <nielsrogge@Nielss-MacBook-Pro.local>	2022-03-17 10:53:57 +01:00
Lysandre Debut	73f0a5d1f6	Fixes Loss for TransfoXL when using Trainer API v2 (#16140 ) * fix(transfo_xl): Fixes TransfoXL support when using Trainer. * fix(tests): Uses losses_1 and losses_2 pattern with TransfoXL test. * fix(transfo_xl): Adds requested changes to allow for backward compatibility. fix(transfo_xl): Adds requested changes to allow for backward compatibility. fix(transfo_xl): Fixes code styling. * Backward compatibility * Update src/transformers/models/transfo_xl/modeling_transfo_xl.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> Co-authored-by: Gustavo de Rosa <gth.rosa@uol.com.br> Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>	2022-03-17 05:49:24 -04:00
Patrick von Platen	2410d0f8ed	Fix generation min length (#16206 ) * up * fix min lengths	2022-03-16 18:49:23 +01:00
Francesco Saverio Zuppichini	667b823b89	Swin support for any input size (#15986 ) * padding done * correctly return one attention per layer * almost correct, attentions are not flatten one tuple per stage * tests green * doc * conversations * reshaping hidden_states * view in the test * reshape_hidden_states in Encoder and Model * new outputs with reshaped_hidden_states * conversations * doc * Update docs/source/model_doc/swin.mdx Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Apply suggestions from code review Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * conversations * fix tests * minor changes * resolved conversations * attentions one per stage * typo * typos * typos * function signature * CI * clean up tests Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>	2022-03-16 18:38:25 +01:00
Joao Gante	204c54d411	TF: add beam search tests (#16202 )	2022-03-16 15:44:33 +00:00
Suraj Patil	190994573a	Fix loading CLIPVisionConfig and CLIPTextConfig (#16198 ) * override from_pretrained * add tests * remove docstrings * fix typo * Trigger CI	2022-03-16 16:24:01 +01:00
Sanchit Gandhi	ee27b3d7df	Replace all deprecated `jax.ops` operations with jnp's `at` (#16078 ) * Replace all deprecated `jax.ops` operations with jnp's `at` * np to jnp scores * suggested changes	2022-03-16 09:08:55 +00:00
Matt	cd4c5c9060	TF XLA greedy generation (#15786 ) * First attempt at TF XLA generation * Fix comments * Update XLA greedy generate with direct XLA calls * Support attention mask, prepare_inputs_for_generation no longer hardcoded for greedy * Handle position_ids correctly * make xla generate work for non xla case * force using xla generate * refactor * more fixes * finish cleaning * finish * finish * clean gpt2 tests * add gpt2 tests * correct more cases * up * finish * finish * more fixes * flake 8 stuff * final rag fix * Update src/transformers/models/rag/modeling_tf_rag.py * finish t5 as well * finish * Update src/transformers/generation_utils.py Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>	2022-03-15 14:19:20 +01:00
NielsRogge	a7aca42fc4	Improve Swin for VisionEncoderDecoder (#16070 ) * Add Swin2Bart test * Fix swin tests Co-authored-by: Niels Rogge <nielsrogge@Nielss-MacBook-Pro.local>	2022-03-15 09:59:48 +01:00
Francesco Saverio Zuppichini	0a057201a9	Visual Attention Network (VAN) (#16027 ) * encoder works * addded files * norm in stage * convertion script * tests * fix copies * make fix-copies * fixed __init__ * make fix-copies * fix * shapiro test needed * make fix-copie * minor changes * make style + quality * minor refactor conversion script * rebase + tests * removed unused variables * updated doc * toctree * CI * doc * Apply suggestions from code review Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * resolved conversations * make fixup * config passed to modules * config passed to modules * Apply suggestions from code review Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * conversations * conversations * copyrights * normal test * tests Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>	2022-03-15 08:47:12 +01:00
Francesco Saverio Zuppichini	e3008c679f	[WIP] Resnet (#15770 ) * first commit * ResNet model correctly implemented. basic modeling + weights conversion is done removed unused doc mdx file doc and conversion script added feature_extractor to auto test minor changes + style + quality doc test Delete process.yml A left over from my attempt of running circleci locally * minor changes * Apply suggestions from code review Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * new test format * minor changes from conversations * minor changes from conversations * make style + quality * readded the tests * test + README * minor changes from conversations * error in README * make fix-copies * removed regression for classification head * make quality * fixed loss control flow * fixed loss control flow * resolved conversations * Apply suggestions from code review Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * READMEs * index.mdx * minor changes * updated tests and models * unused import * outputs * Update docs/source/model_doc/resnet.mdx Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * added embeddings_size * Apply suggestions from code review Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * conversation * added push to hub * test * embedding_size * make fix-copies * resolved conversations * CI * changed organization * minor changes * CI * minor changes * conversations * conversation * doc * tests * removed unused docstring * conversation * removed unused outputs * CI Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>	2022-03-14 19:57:55 +01:00
Yih-Dar	923c35b5c5	Make TF pt-tf equivalence test more aggressive (#15839 ) * Make TF pt-tf equivalence test more aggressive * Fix for TFConvNextModelTest and TFTransfoXLModelTest * fix kwargs for outputs * clean-up * Add docstring for check_outputs() * remove: need to rename encoder-decoder * clean-up * send PyTorch things to the correct device * Add back the accidentally removed test case in test_pt_tf_model_equivalence() * Fix: change to tuple before calling check_outputs() * Fix: tfo could be a list * use to_tuple() * allow tfo only to be tuple or tensor * allow tfo to be list or tuple for now + style change * minor fix * remove np.copy and update comments * tfo -> tf_output, same for pt * Add more detailed comment * remove the incorrect comment Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2022-03-14 13:31:32 +01:00
Sanchit Gandhi	2de99e6c43	Fix Loading of Flax(Speech)EncoderDecoderModel kwargs from PreTrained Encoder-Decoder Checkpoints (#16056 ) * Fix Loading of Flax(Speech)EncoderDecoderModel kwargs from PreTrained Encoder-Decoder Checkpoints * change wording	2022-03-14 10:12:29 +01:00
lewtun	6e1e88fd38	Add TFCamembertForCausalLM and ONNX integration test (#16073 ) * Make Camembert great again! * Add Camembert to TensorFlow ONNX tests	2022-03-14 08:40:42 +01:00

1702 Commits