transformers

mirror of https://github.com/huggingface/transformers.git synced 2025-07-04 21:30:07 +06:00

Author	SHA1	Message	Date
Aritra Roy Gosthipaty	0dc7b3a785	[TensorFlow] Adding GroupViT (#18020 ) * chore: initial commit * chore: adding util methods yet to work on the nn.functional.interpolate port with align_corener=True * chore: refactor the utils * used tf.compat.v1.image.resize to align the F.interpolate function * added type hints to the method signatures * added references to the gists where one 2 one alignment of torch and tf has been shown * chore: adding the layers * chore: porting all the layers from torch to tf This is the initial draft, nothing is tested yet. * chore: aligning the layers with reference to tf clip * chore: aligning the modules * added demaraction comments * added copied and adapted from comments * chore: aligning with CLIP * chore: wrangling the layers to keep it tf compatible * chore: aligning the names of the layers for porting * chore: style changes * chore: adding docs and inits * chore: adding tfp dependencis the code is taken from TAPAS * chore: initial commit for testing * chore: aligning the vision embeddings with the vit implementatino * chore: changing model prefix * chore: fixing the name of the model and the layer normalization test case * chore: every test passes but the slow ones * chore: fix style and integration test * chore: moving comments below decorators * chore: make fixup and fix-copies changes * chore: adding the Vision and Text Model to check_repo * chore: modifying the prefix name to align it with the torch implementation * chore: fix typo in configuration * choer: changing the name of the model variable * chore: adding segmentation flag * chore: gante's review * chore: style refactor * chore: amy review * chore: adding shape_list to parts that have been copied from other snippets * chore: init batchnorm with torch defaults * chore: adding shape_list to pass the tests * test fix: adding seed as 0 * set seed * chore: changing the straight through trick to fix -ve dimensinos * chore: adding a dimension to the loss * chore: adding reviewers and contributors names to the docs * chore: added changes after review * chore: code quality fixup * chore: fixing the segmentation snippet * chore: adding to the layer calls * chore: changing int32 to int64 for inputs of serving * chore: review changes * chore: style changes * chore: remove from_pt=True * fix: repo consistency Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2022-09-29 10:48:04 +01:00
DepuMeng	126a739058	Add support for conditional detr (#18948 ) * added conditional_detr files * checked copies * checked copies * fixed style and copies * fixed style and copies * fixed hub * fixed style * Update README.md Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Update docs/source/en/_toctree.yml Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Update docs/source/en/index.mdx Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Update src/transformers/models/conditional_detr/configuration_conditional_detr.py Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Update src/transformers/models/conditional_detr/convert_conditional_detr_original_pytorch_checkpoint_to_pytorch.py Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Update src/transformers/models/conditional_detr/configuration_conditional_detr.py Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Update docs/source/en/model_doc/conditional_detr.mdx Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Update src/transformers/models/conditional_detr/configuration_conditional_detr.py Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Update src/transformers/models/conditional_detr/configuration_conditional_detr.py Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Update src/transformers/models/conditional_detr/configuration_conditional_detr.py Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * fixed some doc issue * changed prefix to ConditionalDetr * fixed docs * Update README_ko.md * added spatial_model_name * fixed fix-copies * Update src/transformers/models/conditional_detr/feature_extraction_conditional_detr.py Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Update src/transformers/models/conditional_detr/feature_extraction_conditional_detr.py Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Update src/transformers/models/conditional_detr/feature_extraction_conditional_detr.py Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Update src/transformers/models/conditional_detr/feature_extraction_conditional_detr.py Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Update src/transformers/models/conditional_detr/modeling_conditional_detr.py Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Update src/transformers/models/conditional_detr/modeling_conditional_detr.py Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Update src/transformers/models/conditional_detr/modeling_conditional_detr.py Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Update src/transformers/models/conditional_detr/modeling_conditional_detr.py Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Update src/transformers/models/conditional_detr/modeling_conditional_detr.py Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Update src/transformers/models/conditional_detr/modeling_conditional_detr.py Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Update src/transformers/models/conditional_detr/modeling_conditional_detr.py Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Update src/transformers/models/conditional_detr/modeling_conditional_detr.py Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Update src/transformers/models/conditional_detr/modeling_conditional_detr.py Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * added some copied from * added some copied from * added some copied from * added some copied from * fixed use_pretrained issue * changed post-process * added conditional_detr files * checked copies * checked copies * fixed style and copies * fixed style and copies * fixed hub * fixed style * Update README.md Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Update docs/source/en/_toctree.yml Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Update docs/source/en/index.mdx Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Update src/transformers/models/conditional_detr/configuration_conditional_detr.py Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Update src/transformers/models/conditional_detr/convert_conditional_detr_original_pytorch_checkpoint_to_pytorch.py Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Update src/transformers/models/conditional_detr/configuration_conditional_detr.py Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * fixed some doc issue * Update docs/source/en/model_doc/conditional_detr.mdx Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Update src/transformers/models/conditional_detr/configuration_conditional_detr.py Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Update src/transformers/models/conditional_detr/configuration_conditional_detr.py Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Update src/transformers/models/conditional_detr/configuration_conditional_detr.py Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * changed prefix to ConditionalDetr * fixed docs * Update README_ko.md * added spatial_model_name * fixed fix-copies * Update src/transformers/models/conditional_detr/feature_extraction_conditional_detr.py Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Update src/transformers/models/conditional_detr/feature_extraction_conditional_detr.py Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Update src/transformers/models/conditional_detr/feature_extraction_conditional_detr.py Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Update src/transformers/models/conditional_detr/feature_extraction_conditional_detr.py Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Update src/transformers/models/conditional_detr/modeling_conditional_detr.py Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Update src/transformers/models/conditional_detr/modeling_conditional_detr.py Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Update src/transformers/models/conditional_detr/modeling_conditional_detr.py Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Update src/transformers/models/conditional_detr/modeling_conditional_detr.py Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Update src/transformers/models/conditional_detr/modeling_conditional_detr.py Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Update src/transformers/models/conditional_detr/modeling_conditional_detr.py Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Update src/transformers/models/conditional_detr/modeling_conditional_detr.py Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Update src/transformers/models/conditional_detr/modeling_conditional_detr.py Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Update src/transformers/models/conditional_detr/modeling_conditional_detr.py Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * added some copied from * added some copied from * added some copied from * added some copied from * fixed use_pretrained issue * changed post-process * fix style quality and copies * fix style quality and copies * fix style quality and copies * fix style quality and copies * add more fix-copies * Update src/transformers/models/conditional_detr/feature_extraction_conditional_detr.py Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * fixed some variable names & added more fix-copies * fixed some variable names & added more fix-copies * Update src/transformers/models/conditional_detr/configuration_conditional_detr.py Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * added more copied from * fixed quality * changed pretrained config * added more copied-from and fixed the issue in feature_extraction_auto * added conditional_detr files * checked copies * checked copies * fixed style and copies * fixed style and copies * fixed hub * fixed style * Update README.md Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Update docs/source/en/_toctree.yml Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Update docs/source/en/index.mdx Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Update src/transformers/models/conditional_detr/configuration_conditional_detr.py Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Update src/transformers/models/conditional_detr/convert_conditional_detr_original_pytorch_checkpoint_to_pytorch.py Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Update src/transformers/models/conditional_detr/configuration_conditional_detr.py Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * fixed some doc issue * Update docs/source/en/model_doc/conditional_detr.mdx Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Update src/transformers/models/conditional_detr/configuration_conditional_detr.py Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Update src/transformers/models/conditional_detr/configuration_conditional_detr.py Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Update src/transformers/models/conditional_detr/configuration_conditional_detr.py Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * changed prefix to ConditionalDetr * fixed docs * Update README_ko.md * added spatial_model_name * fixed fix-copies * Update src/transformers/models/conditional_detr/feature_extraction_conditional_detr.py Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Update src/transformers/models/conditional_detr/feature_extraction_conditional_detr.py Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Update src/transformers/models/conditional_detr/feature_extraction_conditional_detr.py Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Update src/transformers/models/conditional_detr/feature_extraction_conditional_detr.py Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Update src/transformers/models/conditional_detr/modeling_conditional_detr.py Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Update src/transformers/models/conditional_detr/modeling_conditional_detr.py Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Update src/transformers/models/conditional_detr/modeling_conditional_detr.py Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Update src/transformers/models/conditional_detr/modeling_conditional_detr.py Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Update src/transformers/models/conditional_detr/modeling_conditional_detr.py Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Update src/transformers/models/conditional_detr/modeling_conditional_detr.py Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Update src/transformers/models/conditional_detr/modeling_conditional_detr.py Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Update src/transformers/models/conditional_detr/modeling_conditional_detr.py Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Update src/transformers/models/conditional_detr/modeling_conditional_detr.py Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * added some copied from * added some copied from * added some copied from * added some copied from * fixed use_pretrained issue * changed post-process * added conditional_detr files * checked copies * fixed style and copies * fixed some doc issue * changed prefix to ConditionalDetr * fixed docs * added spatial_model_name * fixed fix-copies * Update src/transformers/models/conditional_detr/modeling_conditional_detr.py Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * added some copied from * added some copied from * added some copied from * added some copied from * fix style quality and copies * fix style quality and copies * fix style quality and copies * add more fix-copies * fixed some variable names & added more fix-copies * fixed some variable names & added more fix-copies * Update src/transformers/models/conditional_detr/feature_extraction_conditional_detr.py Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Update src/transformers/models/conditional_detr/configuration_conditional_detr.py Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * added more copied from * fixed quality * changed pretrained config * added more copied-from and fixed the issue in feature_extraction_auto * fixed style * added conditional_detr files * checked copies * checked copies * fixed style and copies * fixed style and copies * fixed hub * fixed style * Update README.md Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Update docs/source/en/_toctree.yml Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Update docs/source/en/index.mdx Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Update src/transformers/models/conditional_detr/configuration_conditional_detr.py Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Update src/transformers/models/conditional_detr/convert_conditional_detr_original_pytorch_checkpoint_to_pytorch.py Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Update src/transformers/models/conditional_detr/configuration_conditional_detr.py Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * fixed some doc issue * Update docs/source/en/model_doc/conditional_detr.mdx Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Update src/transformers/models/conditional_detr/configuration_conditional_detr.py Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Update src/transformers/models/conditional_detr/configuration_conditional_detr.py Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Update src/transformers/models/conditional_detr/configuration_conditional_detr.py Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * changed prefix to ConditionalDetr * fixed docs * Update README_ko.md * added spatial_model_name * fixed fix-copies * Update src/transformers/models/conditional_detr/feature_extraction_conditional_detr.py Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Update src/transformers/models/conditional_detr/feature_extraction_conditional_detr.py Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Update src/transformers/models/conditional_detr/feature_extraction_conditional_detr.py Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Update src/transformers/models/conditional_detr/feature_extraction_conditional_detr.py Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Update src/transformers/models/conditional_detr/modeling_conditional_detr.py Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Update src/transformers/models/conditional_detr/modeling_conditional_detr.py Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Update src/transformers/models/conditional_detr/modeling_conditional_detr.py Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Update src/transformers/models/conditional_detr/modeling_conditional_detr.py Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Update src/transformers/models/conditional_detr/modeling_conditional_detr.py Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Update src/transformers/models/conditional_detr/modeling_conditional_detr.py Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Update src/transformers/models/conditional_detr/modeling_conditional_detr.py Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Update src/transformers/models/conditional_detr/modeling_conditional_detr.py Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Update src/transformers/models/conditional_detr/modeling_conditional_detr.py Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * added some copied from * added some copied from * added some copied from * added some copied from * fixed use_pretrained issue * changed post-process * added conditional_detr files * checked copies * fixed style and copies * fixed some doc issue * changed prefix to ConditionalDetr * fixed docs * added spatial_model_name * fixed fix-copies * Update src/transformers/models/conditional_detr/modeling_conditional_detr.py Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * added some copied from * added some copied from * added some copied from * added some copied from * fix style quality and copies * fix style quality and copies * fix style quality and copies * add more fix-copies * fixed some variable names & added more fix-copies * fixed some variable names & added more fix-copies * Update src/transformers/models/conditional_detr/feature_extraction_conditional_detr.py Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Update src/transformers/models/conditional_detr/configuration_conditional_detr.py Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * added more copied from * fixed quality * changed pretrained config * added more copied-from and fixed the issue in feature_extraction_auto * rebased Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> Co-authored-by: Depu Meng <depumeng@Depus-MacBook-Pro.local>	2022-09-22 09:45:04 +02:00
NielsRogge	59407bbeb3	Add Deformable DETR (#17281 ) * First draft * More improvements * Improve model, add custom CUDA code * Import torch before * Add script that imports custom layer * Add everything in new ops directory * Import custom layer in modeling file * Fix ARCHIVE_MAP typo * Creating the custom kernel on the fly. * Import custom layer in modeling file * More improvements * Fix CUDA loading * More improvements * Improve conversion script * Improve conversion script * Make it work until encoder_outputs * Make forward pass work * More improvements * Make logits match original implementation * Make implementation also support single_scale model * Add support for single_scale and dilation checkpoint * Add support for with_box_refine model * Support also two stage model * Improve tests * Fix more tests * Make more tests pass * Upload all models to the hub * Clean up some code * Improve decoder outputs * Rename intermediate hidden states and reference points * Improve model outputs * Move tests to dedicated folder * Improve model outputs * Fix retain_grad test * Improve docs * Clean up and make test_initialization pass * Improve variable names * Add copied from statements * Improve docs * Fix style * Improve docs * Improve docs, move tests to model folder * Fix rebase * Remove DetrForSegmentation from auto mapping * Apply suggestions from code review * Improve variable names and docstrings * Apply some more suggestions from code review * Apply suggestion from code review * better docs and variables names * hint to num_queries and two_stage confusion * remove asserts and code refactor * add exception if two_stage is True and with_box_refine is False * use f-strings * Improve docs and variable names * Fix code quality * Fix rebase * Add require_torch_gpu decorator * Add pip install ninja to CI jobs * Apply suggestion of @sgugger * Remove DeformableDetrForObjectDetection from auto mapping * Remove DeformableDetrModel from auto mapping * Add model to toctree * Add model back to mappings, skip model in pipeline tests * Apply @sgugger's suggestion * Fix imports in the init * Fix copies * Add CPU implementation * Comment out GPU function * Undo previous change * Apply more suggestions * Remove require_torch_gpu annotator * Fix quality * Add logger.info * Fix logger * Fix variable names * Fix initializaztion * Add missing initialization * Update checkpoint name * Add model to doc tests * Add CPU/GPU equivalence test * Add Deformable DETR to pipeline tests * Skip model for object detection pipeline Co-authored-by: Nicolas Patry <patry.nicolas@protonmail.com> Co-authored-by: Nouamane Tazi <nouamane98@gmail.com> Co-authored-by: Sylvain Gugger <Sylvain.gugger@gmail.com>	2022-09-14 11:45:21 +02:00
NielsRogge	bb6f6d5338	Add X-CLIP (#18852 ) * First draft * Improve conversion script * Make vision encoder work * More improvements * Improve conversion script * Fix quality * Add MultiframeIntegrationTransformer * More improvements * Make MiT output work * Fix quality * Add prompts generator * Add tests * Fix some tests * Fix some more tests * Fix more tests * Improve conversion script * Fix model outputs * Fix more tests * Add XClipProcessor * Use processor in conversion script * Fix integration test * Update README, fix docs * Fix all tests * Add MIT output to XClipOutput * Create better variable names * Rename XClip to XCLIP * Extend conversion script * Add support for large models * Add support for 16 frame models * Add another model' * Fix module issue * Apply suggestions from code review * Add figure to docs * Fix CLIPProcessor issue * Apply suggestions from code review * Delete file * Convert more checkpoints * Convert last checkpoint * Update nielsr to microsoft	2022-09-08 14:50:30 +02:00
Jason Phang	53e33e6f1b	PEGASUS-X (#18551 ) * PegasusX Initial commit * rename * pegasus X implementation * pegx update * pegx fix * pegasus-x fixes * pegx updates * cleanup * cleanup * cleanup * tests * stylefixes * Documentation update * Model hub fix * cleanup * update * update * testfix * Check fix * tweaks for merging * style * style * updates for pr * style * change pegasus-x repo	2022-09-02 19:54:02 +02:00
Ankur Goyal	5c4c869014	Add LayoutLMForQuestionAnswering model (#18407 ) * Add LayoutLMForQuestionAnswering model * Fix output * Remove TF TODOs * Add test cases * Add docs * TF implementation * Fix PT/TF equivalence * Fix loss * make fixup * Fix up documentation code examples * Fix up documentation examples + test them * Remove LayoutLMForQuestionAnswering from the auto mapping * Docstrings * Add better docstrings * Undo whitespace changes * Update tokenizers in comments * Fixup code and remove `from_pt=True` * Fix tests * Revert some unexpected docstring changes * Fix tests by overriding _prepare_for_class Co-authored-by: Ankur Goyal <ankur@impira.com>	2022-08-31 10:05:33 +02:00
NielsRogge	b83796ded7	Remove ViltForQuestionAnswering from check_repo (#18762 ) Co-authored-by: Niels Rogge <nielsrogge@Nielss-MacBook-Pro.local>	2022-08-30 14:15:36 +02:00
NielsRogge	2ab790e82d	Add Donut (#18488 ) * First draft * Improve script * Update script * Make conversion work * Add final_layer_norm attribute to Swin's config * Add DonutProcessor * Convert more models * Improve feature extractor and convert base models * Fix bug * Improve integration tests * Improve integration tests and add model to README * Add doc test * Add feature extractor to docs * Fix integration tests * Remove register_buffer * Fix toctree and add missing attribute * Add DonutSwin * Make conversion script work * Improve conversion script * Address comment * Fix bug * Fix another bug * Remove deprecated method from docs * Make Swin and Swinv2 untouched * Fix code examples * Fix processor * Update model_type to donut-swin * Add feature extractor tests, add token2json method, improve feature extractor * Fix failing tests, remove integration test * Add do_thumbnail for consistency * Improve code examples * Add code example for document parsing * Add DonutSwin to MODEL_NAMES_MAPPING * Add model to appropriate place in toctree * Update namespace to appropriate organization Co-authored-by: Niels Rogge <nielsrogge@Nielss-MacBook-Pro.local>	2022-08-12 16:40:58 +02:00
Sylvain Gugger	377cdded7a	Clean up hub (#18497 ) * Clean up utils.hub * Remove imports * More fixes * Last fix	2022-08-08 08:48:10 -04:00
gilad19	2b09650885	Add ViltForTokenClassification e.g. for Named-Entity-Recognition (NER) (#17924 ) * Add ViltForTokenClassification e.g. for Named-Entity-Recognition (NER) * Add ViltForTokenClassification e.g. for Named-Entity-Recognition (NER) * provide classifier only text hidden states * add test_for_token_classification * Update src/transformers/models/vilt/modeling_vilt.py Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Update src/transformers/models/vilt/modeling_vilt.py Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Update src/transformers/models/vilt/modeling_vilt.py Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Update src/transformers/models/vilt/modeling_vilt.py Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * add test_for_token_classification Co-authored-by: gfuchs <gfuchs@ebay.com> Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>	2022-07-26 10:11:32 +02:00
Alara Dirik	12d66b4701	Add OWL-ViT model for zero-shot object detection (#17938 ) * add owlvit model skeleton * add class and box predictor heads * convert modified flax clip to pytorch * fix box and class predictors * add OwlViTImageTextEmbedder * convert class and box head checkpoints * convert image text embedder checkpoints * add object detection head * fix bugs * update conversion script * update conversion script * fix q,v,k,out weight conversion conversion * add owlvit object detection output * fix bug in image embedder * fix bugs in text embedder * fix positional embeddings * fix bug in inference mode vision pooling * update docs, init tokenizer and processor files * support batch processing * add OwlViTProcessor * remove merge conflicts * readd owlvit imports * fix bug in OwlViTProcessor imports * fix bugs in processor * update docs * fix bugs in processor * update owlvit docs * add OwlViTFeatureExtractor * style changes, add postprocess method to feature extractor * add feature extractor and processor tests * add object detection tests * update conversion script * update config paths * update config paths * fix configuration paths and bugs * fix bugs in OwlViT tests * add import checks to processor * fix docs and minor issues * fix docs and minor issues * fix bugs and issues * fix bugs and issues * fix bugs and issues * fix bugs and issues * update docs and examples * fix bugs and issues * update conversion script, fix positional embeddings * process 2D input ids, update tests * fix style and quality issues * update docs * update docs and imports * update OWL-ViT index.md * fix bug in OwlViT feature ext tests * fix code examples, return_dict by default * return_dict by default * minor fixes, add tests to processor * small fixes * add output_attentions arg to main model * fix bugs * remove output_hidden_states arg from main model * update self.config variables * add option to return last_hidden_states * fix bug in config variables * fix copied from statements * fix small issues and bugs * fix bugs * fix bugs, support greyscale images * run fixup * update repo name * merge OwlViTImageTextEmbedder with obj detection head * fix merge conflict * fix merge conflict * make fixup * fix bugs * fix bugs * add additional processor test	2022-07-22 13:35:32 +03:00
Sayak Paul	561b9a8c00	[SegFormer] TensorFlow port (#17910 ) * add: segformer utils and img. classification. * add: segmentation layer. * feat: working implementation of segformer. * chore: remove unused variable. * add test, remaining modifications. * remove: unnecessary files. * add: rest of the files. Co-authored-by: matt <rocketknight1@gmail.com> * chore: remove ModuleList comment. * chore: apply make style. * chore: apply make fixup-copies. * add to check_repo.py * add decode head to IGNORE_NON_TESTED * chore: run make style. * chore: PR comments. * chore: minor changes to model doc. * tests: reduction across samples. * add a note on the space. * sort importats. * fix: reduction in loss computation. * chore: align loss function with that of NER. * chore: correct utils/documentation_tests.txt Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com> * chore: simplify the interpolation of logits in loss computation. * chore: return transposed logits when return_dict=False. * chore: add link to the tf fine-tuning repo. * address pr comments. * address niels's comments. * remove from_pt=True since tf weights are in. * remove comment from pt model. * address niels's comments. Co-authored-by: matt <rocketknight1@gmail.com> Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>	2022-07-21 18:22:37 +01:00
StevenTang1998	3cff4cc587	Add MVP model (#17787 ) * Add MVP model * Update README * Remove useless module * Update docs * Fix bugs in tokenizer * Remove useless test * Remove useless module * Update vocab * Remove specifying * Remove specifying * Add #Copied ... statement * Update paper link * Remove useless TFMvp * Add #Copied ... statement * Fix style in test mvp model * Fix some typos * Fix properties of unset special tokens in non verbose mode * Update paper link * Update MVP doc * Update MVP doc * Fix README * Fix typos in docs * Update docs	2022-06-29 09:30:55 -04:00
Jerry Jiarui XU	6c8f4c9a93	Adding GroupViT Models (#17313 ) * add group vit and fixed test (except slow) * passing slow test * addressed some comments * fixed test * fixed style * fixed copy * fixed segmentation output * fixed test * fixed relative path * fixed copy * add ignore non auto configured * fixed docstring, add doc * fixed copies * Apply suggestions from code review merge suggestions Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * resolve comment, renaming model * delete unused attr * use fix copies * resolve comments * fixed attn * remove unused vars * refactor tests * resolve final comments * add demo notebook * fixed inconsitent default * Apply suggestions from code review Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Apply suggestions from code review Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * rename stage->stages * Create single GroupViTEncoderLayer class * Update conversion script * Simplify conversion script * Remove cross-attention class in favor of GroupViTAttention * Convert other model as well, add processor to conversion script * addressing final comment * fixed args * Update src/transformers/models/groupvit/modeling_groupvit.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> Co-authored-by: Niels Rogge <nielsrogge@Nielss-MacBook-Pro.local>	2022-06-28 20:51:47 +02:00
Daniel Stancl	a72f1c9f5b	Add `LongT5` model (#16792 ) * Initial commit * Make some fixes * Make PT model full forward pass * Drop TF & Flax implementation, fix copies etc * Add Flax model and update some corresponding stuff * Drop some TF things * Update config and flax local attn * Add encoder_attention_type to config * . * Update docs * Do some cleansing * Fix some issues -> make style; add some docs * Fix position_bias + mask addition + Update tests * Fix repo consistency * Fix model consistency by removing flax operation over attn_mask * [WIP] Add PT TGlobal LongT5 * . * [WIP] Add flax tglobal model * [WIP] Update flax model to use the right attention type in the encoder * Fix flax tglobal model forward pass * Make the use of global_relative_attention_bias * Add test suites for TGlobal model * Fix minor bugs, clean code * Fix pt-flax equivalence though not convinced with correctness * Fix LocalAttn implementation to match the original impl. + update READMEs * Few updates * Update: [Flax] improve large model init and loading #16148 * Add ckpt conversion script accoring to #16853 + handle torch device placement * Minor updates to conversion script. * Typo: AutoModelForSeq2SeqLM -> FlaxAutoModelForSeq2SeqLM * gpu support + dtype fix * Apply some suggestions from code review Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com> * * Remove (de)parallelize stuff * Edit shape comments * Update README.md * make fix-copies * Remove caching logic for local & tglobal attention * Apply another batch of suggestions from code review * Add missing checkpoints * Format converting scripts * Drop (de)parallelize links from longT5 mdx * Fix converting script + revert config file change * Revert "Remove caching logic for local & tglobal attention" This reverts commit 2a619828f6ddc3e65bd9bb1725a12b77fa883a46. * Stash caching logic in Flax model * Make side relative bias used always * Drop caching logic in PT model * Return side bias as it was * Drop all remaining model parallel logic * Remove clamp statements * Move test files to the proper place * Update docs with new version of hf-doc-builder * Fix test imports * Make some minor improvements * Add missing checkpoints to docs * Make TGlobal model compatible with torch.onnx.export * Replace some np.ndarray with jnp.ndarray * Fix TGlobal for ONNX conversion + update docs * fix _make_global_fixed_block_ids and masked neg value * update flax model * style and quality * fix imports * remove load_tf_weights_in_longt5 from init and fix copies * add slow test for TGlobal model * typo fix * Drop obsolete is_parallelizable and one warning * Update __init__ files to fix repo-consistency * fix pipeline test * Fix some device placements * [wip]: Update tests -- need to generate summaries to update expected_summary * Fix quality * Update LongT5 model card * Update (slow) summarization tests * make style * rename checkpoitns * finish * fix flax tests Co-authored-by: phungvanduy <pvduy23@gmail.com> Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com> Co-authored-by: patil-suraj <surajp815@gmail.com>	2022-06-13 22:36:58 +02:00
Bram Vanroy	73083581a4	explicitly set utf8 for Windows (#17664 )	2022-06-13 08:05:45 -04:00
Chan Woo Kim	119e3c0fc8	M-CTC-T Model (#16402 ) * added cbs to notebooks, made copy-paste error fix in generation_utils * initial push for mctc model * mctc feature extractor done * added processor, tokenizer and their tests for MCTC. Have added an MCTC modeling test, adjusting model code accordingly. * added processor, tokenizer and their tests for MCTC. Have added an MCTC modeling test, adjusting model code accordingly. * passing attention, now struggling to figure out how attention masks make sense here * works when excluding attention masks. ask later how one would integrate attention maskshere * bizarre configuration error (model prefix comes first in config dict json and messes up the order) * all passing but bizzarre config dict ordering issue when to_dict * passing all major tests * feature extraction, processor, tokenizer added & tests passing * style & consistency & other logistical fixes * copy paste fix * model after feature extraction working * commiting final feature extraction results; need to fix normalization * feature extraction passing tests; probably should add tests on the specific flashlight-copied functions? * delete print ; format code a bit * fixing tests * passing major tests * fixing styles * completed tokenization test with real example; not sure if these values are entirely correct. * last test fixes from local * reverting accidentally included custom setup configs * remove load tf weights; fix config error * testing couldnt import featureextractor * fix docs * fix docs * resolving comments * style fixes * style fixes * Update to MCTCConv1dSubSampler Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com> * relposemb fixes * conv1d name issue; expecting config fail with paraentheses * fix config issue * fix config issue * fix config issue * change everything to MCTCT * fixing naming change errors * archive list * copyrights and docs * copyrights and docs * copyrights and docs * merge resolution * move tests, fix to changed optionaldependency structure * test directories changed * fixing tests * how to avoid tf tests? * how to avoid tf tests? * tests passing locally * allow mctctprocessor imported any env * allow mctctprocessor imported any env * fixed second round of feedback, need to fix docs * doc changes not being applied * all fixed * style fix * feedback fixes * fix copies and feature extraction style fix * Update tests/models/visual_bert/test_modeling_visual_bert.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * copy paste huggingface:main visual bert * added eof newline to visual bert; all tests are passing otherwise * fix slow tests by adding attention mask * change model id to speechbrain * make fix-copies * fix readme unwanted deletes * fixing readmes, make fix-copies * consistent M-CTC-T naming * Update src/transformers/models/mctct/__init__.py Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com> * all fixed but variable naming * adjust double quotes * fixed variable names * copyright and mr quilter * Apply suggestions from code review Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * correct slow tests * make fix-copies * Update src/transformers/models/mctct/configuration_mctct.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update src/transformers/models/mctct/configuration_mctct.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * m-ctc-t not mctct Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com> Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>	2022-06-08 00:33:07 +02:00
Sylvain Gugger	afe5d42d8d	Black preview (#17217 ) * Black preview * Fixup too! * Fix check copies * Use the same version as the CI * Bump black	2022-05-12 16:25:55 -04:00
Younes Belkada	b971c769e8	Add OPT (#17088 ) * First version - OPT model * Final changes - putting use cache to False * few changes - remove commented block * few changes - remove unecessary files * fix style issues * few changes - remove a test file - added the logits test * Update src/transformers/models/auto/tokenization_auto.py Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com> * add gen tests * few changes - rm mask filling example on docstring * few changes - remove useless args * some changes - more tests should pass now - needs to clean more - documentation still needs to be done * fix code quality * major changes - change attention architecture to BART-like - modify some tests - style fix * rm useless classes - remove opt for: - QA - cond generation - seq classif * Removed autodoc calls to non-existant classes TOkenizers are not implemented * Update src/transformers/__init__.py Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> * Update src/transformers/__init__.py Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> * Update src/transformers/models/auto/modeling_tf_auto.py Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> * Replaced OPTTokeniser with GPT2 tokenizer * added GPT2Tokenizer.from_pretrained("patrickvonplaten/opt_gpt2_tokenizer") * Removed OPTTokenizer * make style * Make style replaces ``` ...).unsqueeze(``` by ``` >>>).unsqueeze(``` * make repo consistency * Removed PretrainedOPTModel * fix opt.mdx removed other heads * fix init, removed 3 heads * removed heads * finished cleaning head * removed seauence classif and question answering * removed unused imports * removed useless dummy object for QA, SC and CG * removed tests for removed useless dummy object for QA, SC and CG * Removed head_mask using encoder layers which don't exist * fixed test * fix line * added OPT to toctree * Updated model path with pushed weigths * fix model path * fixed code quality * fixed embeddings and generation tests * update paths * clean comments * removed OPTClassificationHead for sentence classification * renamed hidden layer * renamed num layers to standard num_hidden_layers * num_attention_heads fix * changes for 125m * add first version for 125m * add first version - flax * add new version * causal LM output * replace output type with BaseModelOutputWithPastAndCrossAttentions * revert working config from 150m to 350m * clean * removed decoder input ids * fixed embed dim * more embed_dim issues * make style + removed enc_dec test * update falx model * removed troublesome copy * added is_encoder_decoder=False to config * added set_input emb fuinction to model class * requires torch on embed test * use head mask instead of decoder head mask input param solves a test * 8 test remaining, update * Updated create_and_check_decoder_model_past_large_inputs * Make style * update op tokenizer with condition * make style * See if I can push * some clean up * remove linear head hack * save intermediate * save correct attention * add copied from from bart * Update src/transformers/models/opt/modeling_opt.py Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com> * fix part of the reviewss Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com> * same changes in naming / conversion * correct mask * more fixes * delete FlaxOPT and TfOPT * clean traces of Flax and Tf * fix mask * fixed positionnal embedding length when past key value is provoded * get 125m, 6.7b to work * Added do_layer_norm * solved mismatch in load dictionnary * clean up preapre opt input dict * fixed past key value as bool * fix previus * fixed return dict False tuple issue * All tests are passing * Make style * Ignore OPTDecoder non tested * make fix-copies * make repo consistency * small fix * removed uselss @torch.no_grad decorator * make styl;e * fix previous opt test * style * make style * added opt documentation * update OPT_PRETRAINED_MODEL_ARCHIVE_LIST * up * more fixes * model & config work * Update src/transformers/models/opt/modeling_opt.py Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com> * Update src/transformers/models/opt/modeling_opt.py Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com> * Update src/transformers/models/opt/modeling_opt.py Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com> * added comment on padding hack (+2) * cleaup * review update * docstring for missing arg * Update docs/source/en/model_doc/opt.mdx Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com> * Update docs/source/en/model_doc/opt.mdx Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com> * Update docs/source/en/model_doc/opt.mdx Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com> * Update src/transformers/models/opt/__init__.py Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com> * update pretrained map * update path and tests * make style * styling * make consistency * add gpt2 tok new * more tok fixes * Update src/transformers/models/auto/tokenization_auto.py * Update docs/source/en/model_doc/opt.mdx Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update docs/source/en/model_doc/opt.mdx Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update docs/source/en/model_doc/opt.mdx Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update src/transformers/models/opt/modeling_opt.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update tests/models/opt/test_modeling_opt.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update src/transformers/models/opt/modeling_opt.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update src/transformers/models/opt/modeling_opt.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update src/transformers/models/opt/modeling_opt.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update src/transformers/models/opt/modeling_opt.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update src/transformers/models/opt/modeling_opt.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update based on reviews * Apply suggestions from code review Co-authored-by: Lysandre Debut <lysandre@huggingface.co> * make style * make tokenizer auto tests pass * apply Lysandre suggestion * finish tests * add some good tokenizer tests * improve docs slighly Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com> Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> Co-authored-by: ArthurZucker <arthur.zucker@gmail.com> Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> Co-authored-by: Lysandre Debut <lysandre@huggingface.co>	2022-05-12 12:24:35 +02:00
Amanpreet Singh	a10f61834d	[feat] Add FLAVA model (#16654 ) * [WIP] Add FLAVA model This PR aims to add [FLAVA](ihttps://arxiv.org/abs/2112.04482) model to the transformers repo. Following checklist delineates the list of things to be done for this PR to be complete: [x] Flava init [x] Flava base models [x] Flava layers [x] Flava Configs [x] Flava encoders [x] Flava pretraining models [ ] Flava classification/retrieval models (To be added in a separate PR) [x] Documentation updates [x] Imports updates [x] Argstring updates [x] Flava pretrained checkpoints [x] Flava tests [x] Flava processors [x] Sanity check [x] Lint	2022-05-11 14:56:48 -07:00
Yih-Dar	19420fd99e	Move test model folders (#17034 ) * move test model folders (TODO: fix imports and others) * fix (potentially partially) imports (in model test modules) * fix (potentially partially) imports (in tokenization test modules) * fix (potentially partially) imports (in feature extraction test modules) * fix import utils.test_modeling_tf_core * fix path ../fixtures/ * fix imports about generation.test_generation_flax_utils * fix more imports * fix fixture path * fix get_test_dir * update module_to_test_file * fix get_tests_dir from wrong transformers.utils * update config.yml (CircleCI) * fix style * remove missing imports * update new model script * update check_repo * update SPECIAL_MODULE_TO_TEST_MAP * fix style * add __init__ * update self-scheduled * fix add_new_model scripts * check one way to get location back * python setup.py build install * fix import in test auto * update self-scheduled.yml * update slack notification script * Add comments about artifact names * fix for yolos Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2022-05-03 14:42:02 +02:00
Sanchit Gandhi	cd9274d010	[FlaxBert] Add ForCausalLM (#16995 ) * [FlaxBert] Add ForCausalLM * make style * fix output attentions * Add RobertaForCausalLM * remove comment * fix fx-to-pt model loading * remove comment * add modeling tests * add enc-dec model tests * add big_bird * add electra * make style * make repo-consitency * add to docs * remove roberta test * quality * amend cookiecutter * fix attention_mask bug in flax bert model tester * tighten pt-fx thresholds to 1e-5 * add 'copied from' statements * amend 'copied from' statements * amend 'copied from' statements * quality	2022-05-03 11:26:19 +02:00
Yih-Dar	1f9e862507	Update check_models_are_tested to deal with Windows path (#16973 ) * fix * Apply suggestions from code review Co-authored-by: ydshieh <ydshieh@users.noreply.github.com> Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>	2022-04-28 15:31:57 +02:00
Yih-Dar	596afb4297	Fix check_all_models_are_tested (#16970 ) Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2022-04-27 21:18:29 +02:00
Patrick von Platen	8d3f952adb	[Data2Vec] Add data2vec vision (#16760 ) * save intermediate * add vision * add vision * save * finish models * finish models * continue * finish * up * up * up * tests all pass * clean up * up * up * fix bugs in beit * correct docs * finish * finish docs * make style * up * more fixes * fix type hint * make style * Apply suggestions from code review Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update tests/data2vec/test_modeling_data2vec_vision.py Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * fix test Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>	2022-04-18 17:52:13 +02:00
Sylvain Gugger	b9a768b3ff	Enable doc in Spanish (#16518 ) * Reorganize doc for multilingual support * Fix style * Style * Toc trees * Adapt templates	2022-04-04 10:25:46 -04:00
NielsRogge	979b039c89	Add DPT (#15991 ) * First draft * More improvements * Add fusion blocks * Make conversion script work for dpt_large * Make conversion script work * Improve implementation * Improve conversion script * Add DPTForSemanticSegmentation * Make conversion work for semantic segmentation * Add tests * Remove print statements * First draft * Redesign neck * Improve tests * Improve implementation some more * Make neck output list of tensors * Improve neck and feature extractor * Fix integration tests * Make more tests pass * Make all tests pass * Add missing config archive map * Add in_index attribute to make heads accept list of tensors * Apply suggestions from code review * Apply suggestions from code review Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Apply some more suggestions * Add copied from statements * Remove assert * Apply suggestions from code review * Apply suggestions from code review * Remove DPTInterpolate in favor of nn.Upsample * Add comments * Apply suggestions from code review * Apply suggestions from code review * Add proposed design * Update design * Add DPTReassembleLayer * Add DPTFeatureFusionStage * Apply more suggestions from code review * Apply suggestions from code review * Apply suggestions from code review * Fix rebase * Update in_index and out_indices * Fix conversion script * Fix code quality * Add model to toctree and use DepthEstimatorOutput * Fix rebase * Fix code examples * Improve code * Fix copied from statements * Apply suggestions from code review * Remove compute_loss method * Apply suggestions from code review * Fix documentation tests file * Remove test.py file * Improve doc example Co-authored-by: Niels Rogge <nielsrogge@Nielss-MacBook-Pro.local> Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> Co-authored-by: Niels Rogge <nielsrogge@nielss-mbp.home>	2022-03-28 16:28:10 +02:00
Edward Beeching	aff9bc405a	Decision transformer gym (#15845 ) * Created the Decision Transformer Modle * updating tests, copy to other machine * Added last hidden size to Decision Transformer modelling outputs * Removed copy of original DT file * made a temporary change to gpt2 to have it conform with the Decision Transformer version * Updated tests * Ignoring a file used to test the DT model * added comments to config file * added comments and argument descriptions to decision transformer file * Updated doc * Ran "make style" * Remove old model imports * Removed unused imports, cleaned up init file * Update docs/source/model_doc/decision_transformer.mdx added my username Co-authored-by: Lysandre Debut <lysandre@huggingface.co> * Reverted changes made to gpt2 * Removed datasets submodule * Update the modeling outputs to include gpt2 attentions, hidden states and last hidden states * Added support for return of hidden states, attentions and return dict of gpt2 model. * Updated tests to include many of the ModelTesterMixin tests. The following tests are skipped: test_generate_without_input_ids, test_pruning, test_resize_embeddings, test_head_masking, test_attention_outputs, test_hidden_states_output, test_inputs_embeds, test_model_common_attributes * Added missing line to the end of gpt2 file * Added an integration test for the Decision Transformer Test performs and autoregressive evaluation for two time steps * Set done and info to _ to fix failing test * Updated integration test to be deterministic and check expected outputs * Apply suggestions from code review Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Removed unnecessary config options * Cleaned up commented code and old comments. * Cleaned up commented code. * Changed DecisionTransformer to Decision Transformer * Added Decision Transformer to the main README file * Added copy of GTP2 called DecisionTranformerGPT2Model * isorted imports * isorted imports * Added model to non-English README files * Ran make fix-copies and corrected some cases. * Updated index file to include Decision Transformer * Added gpt2 model as copy inside the Decision Transformer model file * Added the unit test file to the list of TEST_FILES_WITH_NO_COMMON_TESTS * Deleted redundant checkpoint files (I don't know how these got committed) * Removed testing files. (These should have never been committed) * Removed accidentally committed files * Moved the Decision Transformer test to its own directory * Add type hints for Pegasus (#16324) * Funnel type hints (#16323) * add pt funnel type hints * add tf funnel type hints * Add type hints for ProphetNet PyTorch (#16272) * [GLPN] Improve docs (#16331) * Add link to notebook * Add link * Fix bug Co-authored-by: Niels Rogge <nielsrogge@Nielss-MacBook-Pro.local> * Added type hints for Pytorch Marian calls (#16200) * Added type hinting for forward functions in pytorch marian * typo correction * Removed type hints on functions from BART per Suraj Patil request * fix import pb * fix typo * corrected tuple call * ran black * after fix-copies Some optional tags on primitives were removed, past_key_values in MarianForCausalLM changed from Tuple of Tuple to List * Fixing copies to roformer and pegasus Co-authored-by: Clementine Fourrier <cfourrie@inria.fr> Co-authored-by: matt <rocketknight1@gmail.com> * Moved DecisionTransformOutput to modeling_decision_transformer * Moved the example usage to research project and cleaned comments * Made tests ignore the copy of gpt2 in Decision Transformer * Added module output to modelling decision transformer * removed copied gpt2 model from list of transformers models * Updated tests and created __init__ file for new test location * Update README.md Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update src/transformers/models/decision_transformer/configuration_decision_transformer.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Removed unneeded summary type from config file * Fixed copies * Updated pretrained config map to refer to hopper-medium checkpoint * done (#16340) * Added Decision transformer to model docs * Update src/transformers/models/decision_transformer/modeling_decision_transformer.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update src/transformers/models/decision_transformer/modeling_decision_transformer.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update src/transformers/models/decision_transformer/configuration_decision_transformer.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Add type annotations for Rembert/Splinter and copies (#16338) * undo black autoformat * minor fix to rembert forward with default * make fix-copies, make quality * Adding types to template model * Removing List from the template types * Remove `Optional` from a couple of types that don't accept `None` Co-authored-by: matt <rocketknight1@gmail.com> * [Bug template] Shift responsibilities for long-range (#16344) * Fix code repetition in serialization guide (#16346) * Adopt framework-specific blocks for content (#16342) * ✨ refactor code samples with framework-specific blocks * ✨ update training.mdx * 🖍 apply feedback * Updates the default branch from master to main (#16326) * Updates the default branch from master to main * Links from `master` to `main` * Typo * Update examples/flax/README.md Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Updated model with custom docstring example * Created the Decision Transformer Modle * updating tests, copy to other machine * Added last hidden size to Decision Transformer modelling outputs * Removed copy of original DT file * made a temporary change to gpt2 to have it conform with the Decision Transformer version * Updated tests * Ignoring a file used to test the DT model * added comments to config file * added comments and argument descriptions to decision transformer file * Updated doc * Ran "make style" * Remove old model imports * Removed unused imports, cleaned up init file * Update docs/source/model_doc/decision_transformer.mdx added my username Co-authored-by: Lysandre Debut <lysandre@huggingface.co> * Reverted changes made to gpt2 * Removed datasets submodule * Update the modeling outputs to include gpt2 attentions, hidden states and last hidden states * Added support for return of hidden states, attentions and return dict of gpt2 model. * Updated tests to include many of the ModelTesterMixin tests. The following tests are skipped: test_generate_without_input_ids, test_pruning, test_resize_embeddings, test_head_masking, test_attention_outputs, test_hidden_states_output, test_inputs_embeds, test_model_common_attributes * Added missing line to the end of gpt2 file * Added an integration test for the Decision Transformer Test performs and autoregressive evaluation for two time steps * Set done and info to _ to fix failing test * Updated integration test to be deterministic and check expected outputs * Apply suggestions from code review Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Removed unnecessary config options * Cleaned up commented code and old comments. * Cleaned up commented code. * Changed DecisionTransformer to Decision Transformer * Added Decision Transformer to the main README file * Added copy of GTP2 called DecisionTranformerGPT2Model * isorted imports * isorted imports * Added model to non-English README files * Ran make fix-copies and corrected some cases. * Updated index file to include Decision Transformer * Added gpt2 model as copy inside the Decision Transformer model file * Added the unit test file to the list of TEST_FILES_WITH_NO_COMMON_TESTS * Deleted redundant checkpoint files (I don't know how these got committed) * Removed testing files. (These should have never been committed) * Removed accidentally committed files * Moved the Decision Transformer test to its own directory * Moved DecisionTransformOutput to modeling_decision_transformer * Moved the example usage to research project and cleaned comments * Made tests ignore the copy of gpt2 in Decision Transformer * Added module output to modelling decision transformer * removed copied gpt2 model from list of transformers models * Updated tests and created __init__ file for new test location * Update README.md Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update src/transformers/models/decision_transformer/configuration_decision_transformer.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Removed unneeded summary type from config file * Fixed copies * Updated pretrained config map to refer to hopper-medium checkpoint * Added Decision transformer to model docs * Update src/transformers/models/decision_transformer/modeling_decision_transformer.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update src/transformers/models/decision_transformer/modeling_decision_transformer.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update src/transformers/models/decision_transformer/configuration_decision_transformer.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Updated model with custom docstring example * Updated copies, config auto, and readme files. Co-authored-by: Lysandre Debut <lysandre@huggingface.co> Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> Co-authored-by: Dan Tegzes <48134725+Tegzes@users.noreply.github.com> Co-authored-by: Adam Montgomerie <adam@avanssion.com> Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> Co-authored-by: Niels Rogge <nielsrogge@Nielss-MacBook-Pro.local> Co-authored-by: Clémentine Fourrier <22726840+clefourrier@users.noreply.github.com> Co-authored-by: Clementine Fourrier <cfourrie@inria.fr> Co-authored-by: matt <rocketknight1@gmail.com> Co-authored-by: Francesco Saverio Zuppichini <francesco.zuppichini@gmail.com> Co-authored-by: Jacob Dineen <54680234+jacobdineen@users.noreply.github.com> Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com> Co-authored-by: Omar Sanseviero <osanseviero@gmail.com> Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> Co-authored-by: Lysandre Debut <lysandre.debut@reseau.eseo.fr>	2022-03-23 16:18:43 -04:00
Sylvain Gugger	4975002df5	Reorganize file utils (#16264 ) * Split file_utils in several submodules * Fixes * Add back more objects * More fixes * Who exactly decided to import that from there? * Second suggestion to code with code review * Revert wront move * Fix imports * Adapt all imports * Adapt all imports everywhere * Revert this import, will fix in a separate commit	2022-03-23 10:26:33 -04:00
NielsRogge	0c55d47cde	Add GLPN (#16199 ) * First draft * Fix logits calculation * Improve tests * Add copied from statements * Fix base_model_prefix * Improve implementation, upload new models * Update design * Fix integration test * Add model to README and toctree * Add document image * Apply suggestions from code review * Apply suggestions from code review Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Add decoder_hidden_size attribute * Update design of decoder * Add DepthEstimatorOutput class * Rename in_index to head_in_index and add feature extractor tests * Apply suggestions from code review * Apply suggestions from code review * Update pretrained model name and add to doc tests * Remove test.py script * Update copied from statements and clean up Co-authored-by: Niels Rogge <nielsrogge@Nielss-MacBook-Pro.local> Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>	2022-03-22 08:51:13 +01:00
Sanchit Gandhi	b256f3518d	Add FlaxBartForCausalLM (#15995 ) * add causal lm * add CausalLM tests * Add FlaxBartForCausalLM * Add EncoderDecoder model tests * change docstring * make repo-consistency * suggested changes * remove jax ops * correction * rename pre-trained decoder model	2022-03-09 19:53:01 +01:00
Javier de la Rosa	01485ceec3	Add missing support for Flax XLM-RoBERTa (#15900 ) * Adding Flax XLM-RoBERTa * Add Flax to __init__ * Adding doc and dummy objects * Add tests * Add Flax XLM-R models autodoc * Fix tests * Add Flask XLM-RoBERTa to TEST_FILES_WITH_NO_COMMON_TESTS * Update src/transformers/models/xlm_roberta/modeling_flax_xlm_roberta.py Co-authored-by: Suraj Patil <surajp815@gmail.com> * Update tests/xlm_roberta/test_modeling_flax_xlm_roberta.py Co-authored-by: Suraj Patil <surajp815@gmail.com> * Update tests/xlm_roberta/test_modeling_flax_xlm_roberta.py Co-authored-by: Suraj Patil <surajp815@gmail.com> * Remove test on large Flask XLM-RoBERTa * Add tokenizer to the test Co-authored-by: Suraj Patil <surajp815@gmail.com>	2022-03-04 14:36:28 +01:00
Francesco Saverio Zuppichini	d83d22f578	Maskformer (#15682 ) * maskformer * conflicts * conflicts * minor fixes * feature extractor test fix refactor MaskFormerLoss following conversation MaskFormer related types should not trigger a module time import error missed one removed all the types that are not used update config mapping minor updates in the doc resolved conversation that doesn't need a discussion minor changes resolved conversations fixed DetrDecoder * minor changes minor changes fixed mdx file test feature_extractor return types functional losses -> classes removed the return type test for the feature extractor minor changes + style + quality * conflicts? * rebase master * readme * added missing files * deleded poolformers test that where in the wrong palce * CI * minor changes * Apply suggestions from code review Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * resolved conversations * minor changes * conversations [Unispeech] Fix slow tests (#15818) * remove soundfile old way of loading audio * Adapt slow test [Barthez Tokenizer] Fix saving (#15815) [TFXLNet] Correct tf xlnet generate (#15822) * [TFXLNet] Correct tf xlnet * adapt test comment Fix the push run (#15807) Fix semantic segmentation pipeline test (#15826) Fix dummy_inputs() to dummy_inputs in symbolic_trace doc (#15776) Add model specific output classes to PoolFormer model docs (#15746) * Added model specific output classes to poolformer docs * Fixed Segformer typo in Poolformer docs Adding the option to return_timestamps on pure CTC ASR models. (#15792) * Adding the option to return_timestamps on pure CTC ASR models. * Remove `math.prod` which was introduced in Python 3.8 * int are not floats. * Reworking the PR to support "char" vs "word" output. * Fixup! * Update src/transformers/pipelines/automatic_speech_recognition.py Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com> * Update src/transformers/pipelines/automatic_speech_recognition.py Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com> * Update src/transformers/pipelines/automatic_speech_recognition.py Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com> * Update src/transformers/pipelines/automatic_speech_recognition.py Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com> * Update src/transformers/pipelines/automatic_speech_recognition.py Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com> * Update src/transformers/pipelines/automatic_speech_recognition.py Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com> * Update src/transformers/pipelines/automatic_speech_recognition.py Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com> * Update src/transformers/pipelines/automatic_speech_recognition.py Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com> * Update src/transformers/pipelines/automatic_speech_recognition.py Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com> * Quality. Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com> HFTracer.trace should use/return self.graph to be compatible with torch.fx.Tracer (#15824) Fix tf.concatenate + test past_key_values for TF models (#15774) * fix wrong method name tf.concatenate * add tests related to causal LM / decoder * make style and quality * clean-up * Fix TFBertModel's extended_attention_mask when past_key_values is provided * Fix tests * fix copies * More tf.int8 -> tf.int32 in TF test template * clean-up * Update TF test template * revert the previous commit + update the TF test template * Fix TF template extended_attention_mask when past_key_values is provided * Fix some styles manually * clean-up * Fix ValueError: too many values to unpack in the test * Fix more: too many values to unpack in the test * Add a comment for extended_attention_mask when there is past_key_values * Fix TFElectra extended_attention_mask when past_key_values is provided * Add tests to other TF models * Fix for TF Electra test: add prepare_config_and_inputs_for_decoder * Fix not passing training arg to lm_head in TFRobertaForCausalLM * Fix tests (with past) for TF Roberta * add testing for pask_key_values for TFElectra model Co-authored-by: ydshieh <ydshieh@users.noreply.github.com> [examples/summarization and translation] fix readme (#15833) Add ONNX Runtime quantization for text classification notebook (#15817) Re-enable doctests for the quicktour (#15828) * Re-enable doctests for the quicktour * Re-enable doctests for task_summary (#15830) * Remove & Framework split model report (#15825) Add TFConvNextModel (#15750) * feat: initial implementation of convnext in tensorflow. * fix: sample code for the classification model. * chore: added checked for from the classification model. * chore: set bias initializer in the classification head. * chore: updated license terms. * chore: removed ununsed imports * feat: enabled argument during using drop_path. * chore: replaced tf.identity with layers.Activation(linear). * chore: edited default checkpoint. * fix: minor bugs in the initializations. * partial-fix: tf model errors for loading pretrained pt weights. * partial-fix: call method updated * partial-fix: cross loading of weights (4x3 variables to be matched) * chore: removed unneeded comment. * removed playground.py * rebasing * rebasing and removing playground.py. * fix: renaming TFConvNextStage conv and layer norm layers * chore: added initializers and other minor additions. * chore: added initializers and other minor additions. * add: tests for convnext. * fix: integration tester class. * fix: issues mentioned in pr feedback (round 1). * fix: how output_hidden_states arg is propoagated inside the network. * feat: handling of arg for pure cnn models. * chore: added a note on equal contribution in model docs. * rebasing * rebasing and removing playground.py. * feat: encapsulation for the convnext trunk. * Fix variable naming; Test-related corrections; Run make fixup * chore: added Joao as a contributor to convnext. * rebasing * rebasing and removing playground.py. * rebasing * rebasing and removing playground.py. * chore: corrected copyright year and added comment on NHWC. * chore: fixed the black version and ran formatting. * chore: ran make style. * chore: removed from_pt argument from test, ran make style. * rebasing * rebasing and removing playground.py. * rebasing * rebasing and removing playground.py. * fix: tests in the convnext subclass, ran make style. * rebasing * rebasing and removing playground.py. * rebasing * rebasing and removing playground.py. * chore: moved convnext test to the correct location * fix: locations for the test file of convnext. * fix: convnext tests. * chore: applied sgugger's suggestion for dealing w/ output_attentions. * chore: added comments. * chore: applied updated quality enviornment style. * chore: applied formatting with quality enviornment. * chore: revert to the previous tests/test_modeling_common.py. * chore: revert to the original test_modeling_common.py * chore: revert to previous states for test_modeling_tf_common.py and modeling_tf_utils.py * fix: tests for convnext. * chore: removed output_attentions argument from convnext config. * chore: revert to the earlier tf utils. * fix: output shapes of the hidden states * chore: removed unnecessary comment * chore: reverting to the right test_modeling_tf_common.py. * Styling nits Co-authored-by: ariG23498 <aritra.born2fly@gmail.com> Co-authored-by: Joao Gante <joao@huggingface.co> Co-authored-by: Sylvain Gugger <Sylvain.gugger@gmail.com> * minor changes * doc fix in feature extractor * doc * typose * removed detr logic from config * removed detr logic from config * removed num_labels * small fix in the config * auxilary -> auxiliary * make style * some test is failing * fix a weird char in config prevending doc-builder * retry to fix the doc-builder issue * make style * new try to fix the doc builder * CI * change weights to facebook Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> Co-authored-by: ariG23498 <aritra.born2fly@gmail.com> Co-authored-by: Joao Gante <joao@huggingface.co> Co-authored-by: Sylvain Gugger <Sylvain.gugger@gmail.com>	2022-03-02 15:48:20 +01:00
Eduardo Gonzalez Ponferrada	df5a4094a6	Add Data2Vec (#15507 ) * Add data2vec model cloned from roberta * Add checkpoint conversion script * Fix copies * Update docs * Add checkpoint conversion script * Remove fairseq data2vec_text script and fix format * Add comment on where to get data2vec_text.py * Remove mock implementation cheat.py and fix style * Fix copies * Remove TF and Flax classes from init * Add back copy from fairseq data2vec_text.py and fix style * Update model name in docs/source/index.mdx to be CamelCase * Revert model name in table to lower-case to get check_table test to pass * Update src/transformers/models/data2vec/__init__.py Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com> * Update src/transformers/models/data2vec/convert_data2vec_original_pytorch_checkpoint_to_pytorch.py Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com> * Update src/transformers/models/data2vec/modeling_data2vec.py Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com> * Update src/transformers/models/data2vec/modeling_data2vec.py Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com> * Update src/transformers/models/data2vec/modeling_data2vec.py Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com> * Update src/transformers/models/data2vec/modeling_data2vec.py Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com> * Update docs/source/model_doc/data2vec.mdx Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update docs/source/model_doc/data2vec.mdx Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update src/transformers/models/auto/configuration_auto.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update src/transformers/models/data2vec/configuration_data2vec.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update src/transformers/models/data2vec/modeling_data2vec.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update src/transformers/models/data2vec/modeling_data2vec.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update src/transformers/models/data2vec/modeling_data2vec.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update tests/test_modeling_data2vec.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update src/transformers/models/data2vec/configuration_data2vec.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update src/transformers/models/data2vec/modeling_data2vec.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update documentation * Copy-paste Data2VecConfig from BertConfig * Update config checkpoint to point to edugp/data2vec-nlp-base. Fix style and repo-consistency * Update config special tokens to match RoBERTa * Split multiple assertions and add individual error messages * Rename Data2VecModel to Data2VecForTextModel * Add Data2Vec to _toctree.yml * Rename Data2VecEmbeddings to Data2VecForTextEmbeddings * Add initial Data2VecForAudio model (unfinished). Only matching fairseq's implementation up to the feature encoder (before positional encoding). * finish audio model * finish audio file * Update names and fix style, quality and repo consistency * Remove Data2VecAudioForPretraining. Add tests for Data2VecAudio, mimicking the Wav2Vec2 test suite. Fix bias initilization in positional conv layers. Move back configurations for audio and text to separate files. * add inputs to logits to data2vec' * correct autio models * correct config auto * correct tok auto * Update utils/tests_fetcher.py * delete unnecessary files * delete unnecessary files * further renaming * make all tests pass * finish * remove useless test file * Update tests/test_modeling_common.py * Update utils/check_repo.py Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com> * Update src/transformers/models/data2vec/modeling_data2vec_text.py Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com> * Fix copies * Update docs * Remove fairseq data2vec_text script and fix format * Add comment on where to get data2vec_text.py * Remove mock implementation cheat.py and fix style * Fix copies * Remove TF and Flax classes from init * Add back copy from fairseq data2vec_text.py and fix style * Update model name in docs/source/index.mdx to be CamelCase * Revert model name in table to lower-case to get check_table test to pass * Update documentation * Update src/transformers/models/data2vec/__init__.py Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com> * Update src/transformers/models/data2vec/convert_data2vec_original_pytorch_checkpoint_to_pytorch.py Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com> * Update src/transformers/models/data2vec/modeling_data2vec.py Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com> * Update src/transformers/models/data2vec/modeling_data2vec.py Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com> * Update src/transformers/models/data2vec/modeling_data2vec.py Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com> * Update src/transformers/models/data2vec/modeling_data2vec.py Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com> * Update src/transformers/models/auto/configuration_auto.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update src/transformers/models/data2vec/configuration_data2vec.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update src/transformers/models/data2vec/modeling_data2vec.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update src/transformers/models/data2vec/modeling_data2vec.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update src/transformers/models/data2vec/modeling_data2vec.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update tests/test_modeling_data2vec.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update src/transformers/models/data2vec/configuration_data2vec.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update src/transformers/models/data2vec/modeling_data2vec.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Copy-paste Data2VecConfig from BertConfig * Update config checkpoint to point to edugp/data2vec-nlp-base. Fix style and repo-consistency * Update config special tokens to match RoBERTa * Split multiple assertions and add individual error messages * Rename Data2VecModel to Data2VecForTextModel * Add Data2Vec to _toctree.yml * Rename Data2VecEmbeddings to Data2VecForTextEmbeddings * Add initial Data2VecForAudio model (unfinished). Only matching fairseq's implementation up to the feature encoder (before positional encoding). * finish audio model * finish audio file * add inputs to logits to data2vec' * Update names and fix style, quality and repo consistency * Remove Data2VecAudioForPretraining. Add tests for Data2VecAudio, mimicking the Wav2Vec2 test suite. Fix bias initilization in positional conv layers. Move back configurations for audio and text to separate files. * correct autio models * correct config auto * correct tok auto * delete unnecessary files * delete unnecessary files * Update utils/tests_fetcher.py * further renaming * make all tests pass * finish * remove useless test file * Update tests/test_modeling_common.py * Update utils/check_repo.py Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com> * Update src/transformers/models/data2vec/modeling_data2vec_text.py Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com> * Move data2vec tests to new structure * Fix test imports for text tests * Remove fairseq files * Change paper link to arxiv * Modify Data2Vec documentation to reflect that the encoder is not shared across the audio and text models in the current implementation. * Update text model checkpoint to be facebook/data2vec-text-base * Add 'Copy from' statements and update paper links and docs * fix copy from statements * improve copied from * correct more copied from statements * finish copied from stuff * make style * add model to README * add to master Co-authored-by: Eduardo Gonzalez Ponferrada <eduardo@ferrumhealth.com> Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com> Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>	2022-03-01 11:09:20 +01:00
Sanchit Gandhi	e3342edc4e	Flax Speech-Encoder-Decoder Model (#15613 ) * rebase * Delete shift tokens func * downsample decoder input seq len for init * correct attention mask * add tests * pt flax cross test * make fixup * init file for import * change pt-flax cross test threshold * pt-flax test logits only * move tests * make repo-consistency * consistent indentation Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>	2022-02-28 12:22:36 +01:00
Lysandre Debut	0400b2263d	[Test refactor 2/5] Tests fetcher (#15726 ) * Tests fetcher * Review comments Co-authored-by: sgugger <sylvain.gugger@gmail.com> Review comments	2022-02-23 15:46:37 -05:00
Gunjan Chhablani	ae1f835028	Add PLBart (#13269 ) * Init PLBART * Add missing configuration file * Add conversion script and configurationf ile * Fix style * Update modeling and conversion scripts * Fix scale embedding in config * Add comment * Fix conversion script * Add classification option to conversion script * Fix vocab size in config doc * Add tokenizer files from MBart50 * Allow no lang code in regular tokenizer * Add PLBart Tokenizer Converters * Remove mask from multi tokenizer * Remove mask from multi tokenizer * Change from MBart-50 to MBart tokenizer * Fix names and modify src/tgt behavior * Fix imports for tokenizer * Remove <mask> from multi tokenizer * Fix style * Change tokenizer_class to processor_class * Add attribute map to config class * Update modeling file to modified MBart code * Update configuration file to MBart style configuration * Fix tokenizer * Separate tokenizers * Fix error in tokenization auto * Copy MBart tests * Replace with MBart tokenization tests * Fix style * Fix language code in multi tokenizer * Fix configuration docs * Add entry for plbart_multi in transformers init * Add dummy objects and fix imports * Fix modeling tests * Add TODO in config * Fix copyright year * Fix modeling docs and test * Fix some tokenization tests and style * Add changes from review * Fix copies * Fix docs * Fix docs * Fix style * Fix year * Add changes from review * Remove extra changes * Fix base tokenizer and doc * Fix style * Fix modeling and slow tokenizer tests * Remove Multi-tokenizer Converter and Tests * Delete QA model and Multi Tokenizer dummy objects * Fix repo consistency and code quality issues * Fix example documentation * Fix style * Remove PLBartTokenizer from type checking in init * Fix consistency issue * Add changes from review * Fix style * Remove PLBartTokenizerFast * Remove FastTokenizer converter * Fix AutoTokenzier mapping * Add plbart to toctree and fix consistency issues * Add language codes tokenizer test * Fix styling and doc issues * Add fixes for failing tests * Fix copies * Fix failing modeling test * Change assert to assertTrue in modeling tests	2022-02-18 14:17:09 +01:00
Sylvain Gugger	ac6aa10f23	Standardize semantic segmentation models outputs (#15469 ) * Standardize instance segmentation models outputs * Rename output * Update src/transformers/modeling_outputs.py Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Add legacy argument to the config and model forward * Update src/transformers/models/beit/modeling_beit.py Co-authored-by: Lysandre Debut <lysandre@huggingface.co> * Copy fix in Segformer Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> Co-authored-by: Lysandre Debut <lysandre@huggingface.co>	2022-02-04 14:52:07 -05:00
Suraj Patil	d25e25ee2b	Add XGLM models (#14876 ) * add xglm * update vocab size * fix model name * style and tokenizer * typo * no mask token * fix pos embed compute * fix args * fix tokenizer * fix positions * fix tokenization * style and dic fixes * fix imports * add fast tokenizer * update names * add pt tests * fix tokenizer * fix typo * fix tokenizer import * fix fast tokenizer * fix tokenizer * fix converter * add tokenizer test * update checkpoint names * fix tokenizer tests * fix slow tests * add copied from comments * rst -> mdx * flax model * update flax tests * quality * style * doc * update index and readme * fix copies * fix doc * update toctrr * fix indent * minor fixes * fix config doc * don't save embed_pos weights * Apply suggestions from code review Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com> * address Sylvains commnets, few doc fixes * fix check_repo * align order of arguments * fix copies * fix labels * remove unnecessary mapping * fix saving tokenizer Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>	2022-01-28 18:55:23 +01:00
NielsRogge	ac227093e4	Add ViLT (#14895 ) * First commit * Add conversion script * Make conversion script work for base model * More improvements * Update conversion script, works for vqa * Add indexing argument to meshgrid * Make conversion script work for ViltForPreTraining * Add ViltForPreTraining to docs * Fix device issue * Add processor * Add MinMaxResize to feature extractor * Implement call method of ViltProcessor * Fix tests * Add integration test * Add loss calculation for VQA * Improve tests * Improve some more tests * Debug tests * Small improvements * Add support for attention_mask * Remove mask_it * Add pixel_mask * Add tests for ViltFeatureExtractor * Improve tests * Add ViltForNaturalLanguageVisualReasoning * Add ViltForNaturalLanguageVisualReasoning to conversion script * Minor fixes * Add support for image_embeds, update docstrings to markdown * Update docs to markdown * Improve conversion script * Rename ViltForPreTraining to ViltForMaskedLM * Improve conversion script * Convert docstrings to markdown * Fix code example of retrieval model * Properly convert masked language model * Add integration test for nlvr * Fix code quality * Apply suggestions from code review * Add copied from statements * Fix pretrained_config_archive_map * Fix docs * Add model to README * Apply suggestions from code review Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Apply more suggestions from code review * Make code more readable * Add ViltForNaturalLanguageVisualReasoning to the tests * Rename ViltForVisualQuestionAnswering to ViltForQuestionAnswering * Replace pixel_values_2 by single tensor * Add hidden_states and attentions * Fix one more test * Fix all tests * Update year * Fix rebase issues * Fix another rebase issue * Remove ViltForPreTraining from auto mapping * Rename ViltForImageRetrievalTextRetrieval to ViltForImageAndTextRetrieval * Make it possible to use BertTokenizerFast in the processor * Use BertTokenizerFast by default * Rename ViltForNaturalLanguageVisualReasoning, define custom model output Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>	2022-01-19 19:51:59 +01:00
Li-Huai (Allan) Lin	22454ae492	Add REALM (#13292 ) * REALM initial commit * Retriever OK (Update new_gelu). * Encoder prediction score OK * Encoder pretrained model OK * Update retriever comments * Update docs, tests, and imports * Prune unused models * Make embedder as a module `RealmEmbedder` * Add RealmRetrieverOutput * Update tokenization * Pass all tests in test_modeling_realm.py * Prune RealmModel * Update docs * Add training test. * Remove completed TODO * Style & Quality * Prune `RealmModel` * Fixup * Changes: 1. Remove RealmTokenizerFast 2. Update docstrings 3. Add a method to RealmTokenizer to handle candidates tokenization. * Fix up * Style * Add tokenization tests * Update `from_pretrained` tests * Apply suggestions * Style & Quality * Copy BERT model * Fix comment to avoid docstring copying * Make RealmBertModel private * Fix bug * Style * Basic QA * Save * Complete reader logits * Add searcher * Complete searcher & reader * Move block records init to constructor * Fix training bug * Add some outputs to RealmReader * Add finetuned checkpoint variable names parsing * Fix bug * Update REALM config * Add RealmForOpenQA * Update convert_tfrecord logits * Fix bugs * Complete imports * Update docs * Update naming * Add brute-force searcher * Pass realm model tests * Style * Exclude RealmReader from common tests * Fix * Fix * convert docs * up * up * more make style * up * upload * up * Fix * Update src/transformers/__init__.py * adapt testing * change modeling code * fix test * up * up * up * correct more * make retriever work * update * make style * finish main structure * Resolve merge conflict * Make everything work * Style * Fixup * Fixup * Update training test * fix retriever * remove hardcoded path * Fix * Fix modeling test * Update model links * Initial retrieval test * Fix modeling test * Complete retrieval tests * Fix * style * Fix tests * Fix docstring example * Minor fix of retrieval test * Update license headers and docs * Apply suggestions from code review * Style * Apply suggestions from code review * Add an example to RealmEmbedder * Fix Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com> Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>	2022-01-18 07:24:13 -05:00
Sylvain Gugger	1b730c3d11	Better dummies (#15148 ) * Better dummies * See if this fixes the issue * Fix quality * Style * Add doc for DummyObject	2022-01-14 10:59:41 -05:00
Yih-Dar	b67fd797be	Add TFVisionEncoderDecoderModel (#14148 ) * Start the work on TFVisionEncoderDecoderModel * Expose TFVisionEncoderDecoderModel * fix import * Add modeling_tf_vision_encoder_decoder to _ignore_modules in get_model_modules() * reorder * Apply the fix for checkpoint loading as in #14016 * remove attention_mask + fix VISION_DUMMY_INPUTS * A minimal change to make TF generate() work for vision models as encoder in encoder-decoder setting * fix wrong condition: shape_list(input_ids) == 2 * add tests * use personal TFViTModel checkpoint (for now) * Add equivalence tests + projection layer * style * make sure projection layer can run * Add examples * Apply suggestions from code review Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Clean comments (need to work on TODOs for PyTorch models) * Remove TF -> PT in check_pt_tf_equivalence for TFVisionEncoderDecoderModel * fixes * Revert changes in PT code. * Update tests/test_modeling_tf_vision_encoder_decoder.py Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com> * Add test_inference_coco_en for TF test * fix quality * fix name * build doc * add main_input_name * Fix ckpt name in test * fix diff between master and this PR * fix doc * fix style and quality * fix missing doc * fix labels handling * Delete auto.rst * Add the changes done in #14016 * fix prefix * Apply suggestions from code review Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * make style Co-authored-by: ydshieh <ydshieh@users.noreply.github.com> Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>	2022-01-10 13:30:14 -05:00
Kamal Raj	9fbf7c87c3	Update check_repo.py (#15014 ) added new line	2022-01-10 06:55:43 -05:00
Sylvain Gugger	8f6373c61c	Map model_type and doc pages names (#14944 ) * Map model_type and doc pages names * Add script * Fix typo * Quality * Manual check for Auto Co-authored-by: Lysandre <lysandre.debut@reseau.eseo.fr>	2022-01-03 05:08:55 -05:00
Yih-Dar	8f2cc1c3ab	Add TFCLIPModel (#13967 ) * Start the work for TFCLIPModel * Convert to TF code (TODO: loss + doc) * Clean up * Fix pooled_output for TFCLIPTextTransformer - using tf.gather_nd * assert -> raise error * Expose TFCLIPModel * Deal with dummy_inputs * Add tests * Fix all tests. TODO: manual check weight loading + add more comments * Fix pt tf equivalence test * fixes * update TFCLIPVisionEmbeddings's Conv2D * Fix loss + overwrite test_pt_tf_model_equivalence from common * Add a comment about the change about MainLayer in test_keras_save_load * Set return_loss=True in TFCLIPModelTester + make tests pass * overwrite test_pt_tf_model_equivalence from tf common * fix base_model_prefix * Fix examples * remove unused * Apply suggestions from code review Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * apply review suggestions * change self.pre_layrnorm to self.pre_layernorm * apply more review suggestions * return attention probs before dropout (to align with PT) * fix weight init * fix * build doc * fix missing doc * fix for test Co-authored-by: ydshieh <ydshieh@users.noreply.github.com> Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>	2021-12-23 11:19:44 -05:00
Sylvain Gugger	e37bc579fc	Fix typo in error message	2021-12-22 08:19:36 -05:00
Sylvain Gugger	27b3031de2	Mass conversion of documentation from rst to Markdown (#14866 ) * Convert docstrings of all configurations and tokenizers * Processors and fixes * Last modeling files and fixes to models * Pipeline modules * Utils files * Data submodule * All the other files * Style * Missing examples * Style again * Fix copies * Say bye bye to rst docstrings forever	2021-12-21 15:06:33 -05:00
Sylvain Gugger	7533d30acd	Convert Trainer doc page to MarkDown (#14753 ) * Convert Trainer doc page to MarkDown * Fix repo consistency * Fix the doc build test job	2021-12-13 13:09:50 -05:00
NielsRogge	65b20b739b	Add Perceiver IO (#14487 ) * First draft * Style and remove mlm * Make forward pass work * More improvements * More improvements * Fix bug * More improvements * More improvements * Add PerceiverTokenizer first draft * Improve conversion script * More improvements * Make conversion script work for the encoder * Make conversion script work with local pickle files * Style & quality, fix-copies * Add dummy input to conversion script * Add absolute position embeddings to TextPreProcessor * Make forward pass of encoder work * More improvements * Move text preprocessor to separate script * More improvements * More improvements * Add post processor * Make MLM model work * Style * Add PerceiverForMaskedLM * Add PerceiverImagePreprocessor * Make style * Make PerceiverForImageClassification work * More improvements * More improvements * Use tokenizer in conversion script * Use PerceiverForMaskedLM in conversion script * Define custom PerceiverModelOutput * Improve PerceiverAttention to make it work for both MLM and image classification * More improvements * More improvements * More improvements to the conversion script * Make conversion script work for both MLM and image classification * Add PerceiverFeatureExtractor * More improvements * Style and quality * Add center cropping * Fix bug * Small fix * Add print statement * Fix bug in image preprocessor * Fix bug with conversion script * Make output position embeddings an nn.Parameter layer instead of nn.Embedding * Comment out print statements * Add position encoding classes * More improvements * Use position_encoding_kwargs * Add PerceiverForImageClassificationFourier * Make style & quality * Add PerceiverForImageClassificationConvProcessing * Style & quality * Add flow model * Move processors to modeling file * Make position encodings modular * Make basic decoder use modular position encodings * Add PerceiverForOpticalFlow to conversion script * Add AudioPreprocessor * Make it possible for the basic decoder to use Fourier position embeddings * Add PerceiverForMultimodalAutoencoding * Improve model for optical flow * Improve _build_network_inputs method * Add print statement * Fix device issue * Fix device of Fourier embeddings * Add print statements for debugging * Add another print statement * Add another print statement * Add another print statement * Add another print statement * Improve PerceiverAudioPreprocessor * Improve conversion script for multimodal modal * More improvements * More improvements * Improve multimodal model * Make forward pass multimodal model work * More improvements * Improve tests * Fix some more tests * Add output dataclasses * Make more tests pass * Add print statements for debuggin * Add tests for image classification * Add PerceiverClassifierOutput * More improvements * Make more tests pass for the optical flow model * Make style & quality * Small improvements * Don't support training for optical flow model for now * Fix _prepare_for_class for tests * Make more tests pass, add some docs * Add multimodal model to tests * Minor fixes * Fix tests * Improve conversion script * Make fixup * Remove pos_dim argument * Fix device issue * Potential fix for OOM * Revert previous commit * Fix test_initialization * Add print statements for debugging * Fix print statement * Add print statement * Add print statement * Add print statement * Add print statement * Add print statement * Add print statement * Remove need for output_shape * Comment out output_shape * Remove unnecessary code * Improve docs * Fix make fixup * Remove PerceiverTextProcessor from init * Improve docs * Small improvement * Apply first batch of suggestions from code review * Apply more suggestions from code review * Update docstrings * Define dicts beforehand for readability * Rename task to architecture in conversion script, include PerceiverModel in tests * Add print statements for debugging * Fix tests on GPU * Remove preprocessors, postprocessors and decoders from main init * Add integration test * Fix docs * Replace einops by torch * Update for new docs frontend * Rename PerceiverForImageClassification * Improve docs * Improve docs * Improve docs of PerceiverModel * Fix some more tests * Improve center_crop * Add PerceiverForSequenceClassification * Small improvements * Fix tests * Add integration test for optical flow model * Clean up * Add tests for tokenizer * Fix tokenizer by adding special tokens properly * Fix CI	2021-12-08 14:20:34 +01:00

1 2 3

129 Commits