Joao Gante
e6d27ca5c8
TF: XLA beam search + most generation-compatible models are now also XLA-generate-compatible ( #17857 )
...
* working beam search 🎉
* XLA generation compatible with ALL classes
* add xla generation slow test
2022-06-29 12:41:01 +01:00
Leon Derczynski
b8142753f9
Add missing comment quotes ( #17379 )
2022-06-29 06:16:36 -04:00
NielsRogge
e113c5cb64
Remove render tags ( #17897 )
...
Co-authored-by: Niels Rogge <nielsrogge@Nielss-MacBook-Pro.local>
2022-06-29 06:06:42 -04:00
Santiago Castro
90415475bb
Fix the Conda package build ( #16737 )
...
* Fix the Conda package build
* Update build.sh
* Update release-conda.yml
2022-06-29 06:03:16 -04:00
Michal Szutenberg
babd7b1a92
Remove DT_DOUBLE from the T5 graph ( #17891 )
2022-06-29 10:23:49 +01:00
Yih-Dar
6aae59d0b5
Compute min_resolution in prepare_image_inputs ( #17915 )
...
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2022-06-29 10:30:20 +02:00
Nicolas Patry
776855c752
Fixing a regression with `return_all_scores` introduced in #17606 ( #17906 )
...
Fixing a regression with `return_all_scores` introduced in #17606
- The legacy test actually tested `return_all_scores=False` (the actual
default) instead of `return_all_scores=True` (the actual weird case).
This commit adds the correct legacy test and fixes it.
Tmp legacy tests.
Actually fix the regression (also contains lists)
Less diffed code.
2022-06-28 17:24:45 -04:00
Sylvain Gugger
5f1e67a566
Pin PyTorch in requirements as well
2022-06-28 15:56:10 -04:00
Sylvain Gugger
5a3d0cbdda
Pin PyTorch while we fix compatibility with 1.12
2022-06-28 15:07:26 -04:00
Jerry Jiarui XU
6c8f4c9a93
Adding GroupViT Models ( #17313 )
...
* add group vit and fixed test (except slow)
* passing slow test
* addressed some comments
* fixed test
* fixed style
* fixed copy
* fixed segmentation output
* fixed test
* fixed relative path
* fixed copy
* add ignore non auto configured
* fixed docstring, add doc
* fixed copies
* Apply suggestions from code review
merge suggestions
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* resolve comment, renaming model
* delete unused attr
* use fix copies
* resolve comments
* fixed attn
* remove unused vars
* refactor tests
* resolve final comments
* add demo notebook
* fixed inconsistent default
* Apply suggestions from code review
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>
* Apply suggestions from code review
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>
* rename stage->stages
* Create single GroupViTEncoderLayer class
* Update conversion script
* Simplify conversion script
* Remove cross-attention class in favor of GroupViTAttention
* Convert other model as well, add processor to conversion script
* addressing final comment
* fixed args
* Update src/transformers/models/groupvit/modeling_groupvit.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: Niels Rogge <nielsrogge@Nielss-MacBook-Pro.local>
2022-06-28 20:51:47 +02:00
mrbean
b424f0b4a3
Mrbean/codegen onnx ( #17903 )
2022-06-28 14:57:53 +02:00
regisss
76d13de5ae
Add ONNX support for DETR ( #17904 )
2022-06-28 14:48:43 +02:00
Bill Ray
bfcd5743ee
In group_texts function, drop last block if smaller than block_size ( #17908 )
2022-06-28 08:34:55 -04:00
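The behaviour named in the commit title above can be illustrated with a minimal sketch. This is not the code from #17908: the `block_size` parameter, the dictionary layout, and the sample batch are assumptions made for illustration only.

```python
from itertools import chain


def group_texts(examples, block_size):
    """Concatenate tokenized texts and split them into blocks of block_size tokens."""
    # Flatten every column (e.g. "input_ids") into one long list.
    concatenated = {k: list(chain(*examples[k])) for k in examples}
    total_length = len(concatenated[next(iter(examples))])
    # Round down to a multiple of block_size so the short tail is dropped.
    # If the input is shorter than one block, this yields zero blocks.
    total_length = (total_length // block_size) * block_size
    return {
        k: [t[i : i + block_size] for i in range(0, total_length, block_size)]
        for k, t in concatenated.items()
    }


# Hypothetical batch: 7 tokens in total, so one token is left over with block_size=3.
batch = {"input_ids": [[1, 2, 3], [4, 5, 6, 7]]}
blocks = group_texts(batch, block_size=3)
short = group_texts({"input_ids": [[1, 2]]}, block_size=3)
```

Dropping the remainder keeps every block the same length; padding the last block would be an alternative where the model supports it.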
amyeroberts
f71895a633
Move logic into pixelshuffle layer ( #17899 )
...
* Move all pixelshuffle logic into layer
* Rename layer
* Use correct input to function
2022-06-28 13:04:19 +01:00
Matt
0094565fc5
Fix loss computation in TFBertForPreTraining ( #17898 )
2022-06-28 12:44:56 +01:00
Lysandre Debut
1dfa03f12b
Pin black to 22.3.0 to benefit from a stable --preview flag ( #17918 )
2022-06-28 04:32:18 -04:00
Suraj Patil
9eec4e937e
[M2M100] update conversion script ( #17916 )
2022-06-28 10:15:07 +02:00
Yih-Dar
db2644b9eb
Fix PyTorch/TF Auto tests ( #17895 )
...
* add loading_info
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2022-06-28 08:56:24 +02:00
Yih-Dar
f717d47fe0
Fix test_number_of_steps_in_training_with_ipex ( #17889 )
...
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2022-06-28 08:55:02 +02:00
Yih-Dar
0b0dd97737
Update expected values in constrained beam search tests ( #17887 )
...
* fix
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2022-06-28 08:53:53 +02:00
Andrej
e02037b352
Fix bug in gpt2's (from-scratch) special scaled weight initialization ( #17877 )
...
* only special scale init each gpt2 c_proj weight once, on exact match
* fix double quotes
Co-authored-by: leandro <leandro.vonwerra@spoud.io>
2022-06-27 15:01:49 -04:00
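One plausible reading of "once, on exact match" in the entry above: if an init hook runs on every module in a tree and picks parameters by substring, the same `c_proj` weight is selected again from each ancestor, whereas an exact name match selects it only from its direct parent. The sketch below simulates that with plain name lists. The module layout and the helper names are hypothetical and are not taken from #17877.

```python
def selected_names(param_names, exact):
    """Return the parameter names picked for the special scaled init."""
    if exact:
        # Matches only when the hook runs on the direct parent of c_proj.
        return [n for n in param_names if n == "c_proj.weight"]
    # Substring match also fires for prefixed names seen from ancestors.
    return [n for n in param_names if "c_proj" in n and "weight" in n]


# Parameter names as seen from three nesting levels of one hypothetical block.
views = {
    "block": [
        "attn.c_attn.weight",
        "attn.c_proj.weight",
        "mlp.c_fc.weight",
        "mlp.c_proj.weight",
    ],
    "attn": ["c_attn.weight", "c_proj.weight"],
    "mlp": ["c_fc.weight", "c_proj.weight"],
}


def count_inits(exact):
    """Count how many times a c_proj weight would be (re)initialized."""
    return sum(len(selected_names(names, exact)) for names in views.values())


loose_count = count_inits(exact=False)
exact_count = count_inits(exact=True)
```

With two `c_proj` weights in this layout, the exact match selects each one a single time, while the substring match selects each of them twice.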
JiJi
6dd00f6bd4
Update README_zh-hans.md ( #17861 )
2022-06-27 13:09:20 -04:00
Stefan Schweter
71b2839fd3
bert: add conversion script for BERT Token Dropping TF2 checkpoints ( #17142 )
...
* bert: add conversion script for BERT Token Dropping TF2 checkpoints
* bert: rename conversion script for BERT Token Dropping checkpoints
* bert: fix flake errors in BERT Token Dropping conversion script
* bert: make doc-builder happy!!1!11
* bert: fix pytorch_dump_path of BERT Token Dropping conversion script
2022-06-27 13:08:32 -04:00
Sylvain Gugger
98742829d3
Fix add new model like frameworks ( #17869 )
...
* Add new model like adds only the selected frameworks object in init
* Small fix
2022-06-27 13:07:34 -04:00
Ian Castillo
afb71b6726
Add type annotations for RoFormer models ( #17878 )
2022-06-27 14:50:43 +01:00
Yih-Dar
9a3453846b
fix ( #17890 )
...
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2022-06-27 14:36:11 +02:00
Younes Belkada
3ec7d4cfe4
fix mask ( #17837 )
2022-06-27 14:08:18 +02:00
Matt
ee0d001de7
Add a TF in-graph tokenizer for BERT ( #17701 )
...
* Add a TF in-graph tokenizer for BERT
* Add from_pretrained
* Add proper truncation, option handling to match other tokenizers
* Add proper imports and guards
* Add test, fix all the bugs exposed by said test
* Fix truncation of paired texts in graph mode, more test updates
* Small fixes, add a (very careful) test for savedmodel
* Add tensorflow-text dependency, make fixup
* Update documentation
* Update documentation
* make fixup
* Slight changes to tests
* Add some docstring examples
* Update tests
* Update tests and add proper lowercasing/normalization
* make fixup
* Add docstring for padding!
* Mark slow tests
* make fixup
* Fall back to BertTokenizerFast if BertTokenizer is unavailable
* Fall back to BertTokenizerFast if BertTokenizer is unavailable
* make fixup
* Properly handle tensorflow-text dummies
2022-06-27 12:06:21 +01:00
Yih-Dar
401fcca6c5
Fix TF GPT2 test_onnx_runtime_optimize ( #17874 )
...
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2022-06-27 09:27:30 +02:00
Joao Gante
cc5c061e34
CLI: handle multimodal inputs ( #17839 )
2022-06-25 16:17:11 +01:00
Sylvain Gugger
e8eb699ee8
Properly get tests deps in test_fetcher ( #17870 )
...
* Properly get tests deps in test_fetcher
* Remove print
2022-06-24 16:56:46 -04:00
Yih-Dar
b03be78a4b
Fix test_inference_instance_segmentation_head ( #17872 )
...
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2022-06-24 19:36:45 +02:00
Yih-Dar
494aac65a7
Skip test_multi_gpu_data_parallel_forward for MaskFormer ( #17864 )
...
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2022-06-24 19:35:00 +02:00
Yih-Dar
0e0f1f4692
Use higher value for hidden_size in Flax BigBird test ( #17822 )
...
* Use higher value for hidden_size in Flax BigBird test
* remove 5e-5
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2022-06-24 19:31:30 +02:00
kumapo
2ef94ee039
Fix: torch.utils.checkpoint import error. ( #17849 )
2022-06-24 13:23:29 -04:00
willtai
ef28a402a9
Add type hints for gptneox models ( #17858 )
...
* feat: Add type hints for GPTNeoXForCausalLM and GPTNeoXModel
* fix: removed imported Dict type
* fix: Removed unused List import
2022-06-24 17:12:36 +01:00
Suraj Patil
061a73d16f
[CodeGen] support device_map="auto" for sharded checkpoints ( #17871 )
2022-06-24 18:06:30 +02:00
rooa
d6b6fb9963
Add CodeGen model ( #17443 )
...
* Add CodeGen model
* Add missing key and switch order of super()
* Fix torch.ones init with uint8 instead of bool
* Address comments: copy statements and doc
* update tests
* remove old model parallel
* fix batch gen tests
* fix batch gen test
* update test_gpt2_sample_max_time
* fix codegen test and revert gpt2 test change
* Fix incorrect tie_word_embedding value, typo, URL
* Fix model order in README and styling
* Reorder model list alphabetically
* Set tie_word_embedding to False by default
* Apply suggestions from code review
* Better attn mask name & remove attn masked_bias
* add tokenizer for codegen
* quality
* doc tokenizer
* fix-copies
* add CodeGenTokenizer in converter
* make truncation optional
* add test for truncation
* add copyright
* fix-copies
* fix fast tokenizer decode
* Update src/transformers/models/codegen/tokenization_codegen.py
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
* increase vocab_size in tests
Co-authored-by: patil-suraj <surajp815@gmail.com>
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
2022-06-24 17:10:38 +02:00
Yih-Dar
447490015a
Fix Splinter test ( #17854 )
...
* fix
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2022-06-24 16:26:14 +02:00
Suraj Patil
73a0496c2f
[tests/VisionEncoderDecoder] import to_2tuple from test utils ( #17865 )
2022-06-24 15:23:30 +02:00
NaN
bc7a6fdc02
Fix Constrained beam search duplication and weird output issue ( #17814 )
...
* fix(ConstrainedBeamSearchScorer.step_sentence_constraint): avoid hypothesis duplication between topk and advance
* fix(GenerationMixin.constrained_beam_search): appropriately assign beam scores instead of token scores
2022-06-24 14:56:08 +02:00
Vishwas
c2c0d9db5f
Improve encoder decoder model docs ( #17815 )
...
* Copied all the changes from the last PR
* added in documentation_tests.txt
* Update docs/source/en/model_doc/encoder-decoder.mdx
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>
* Update docs/source/en/model_doc/encoder-decoder.mdx
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>
* Update docs/source/en/model_doc/encoder-decoder.mdx
Co-authored-by: Yih-Dar <2521628+ydshieh@users.noreply.github.com>
* Update docs/source/en/model_doc/encoder-decoder.mdx
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>
* Update docs/source/en/model_doc/encoder-decoder.mdx
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>
* Update docs/source/en/model_doc/encoder-decoder.mdx
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>
* Update docs/source/en/model_doc/encoder-decoder.mdx
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>
Co-authored-by: vishwaspai <vishwas.pai@emplay.net>
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>
Co-authored-by: Yih-Dar <2521628+ydshieh@users.noreply.github.com>
2022-06-24 14:48:19 +02:00
NielsRogge
0917870510
Improve vision models ( #17731 )
...
* Improve vision models
* Add a lot of improvements
* Remove to_2tuple from swin tests
* Fix TF Swin
* Fix more tests
* Fix copies
* Improve more models
* Fix ViTMAE test
* Add channel check for TF models
* Add proper channel check for TF models
* Apply suggestion from code review
* Apply suggestions from code review
* Add channel check for Flax models, apply suggestion
* Fix bug
* Add tests for greyscale images
* Add test for interpolation of pos encodings
Co-authored-by: Niels Rogge <nielsrogge@Nielss-MacBook-Pro.local>
2022-06-24 11:34:51 +02:00
Zachary Mueller
893ab12452
Auto-build Docker images before on-merge if setup.py was changed ( #17573 )
...
* Auto-build on setup modification
* Modify push-caller
* Make adjustments based on code review
2022-06-23 16:51:33 -04:00
Zachary Mueller
75259b44bf
Properly calculate the total train iterations and recalculate num epochs in no_trainer scripts ( #17856 )
2022-06-23 15:46:01 -04:00
Sylvain Gugger
7c1b91281f
Index RNG states by global rank in saves ( #17852 )
2022-06-23 12:53:50 -04:00
Sijun He
7cf52a49de
Nezha Pytorch implementation ( #17776 )
...
* wip
* rebase
* all tests pass
* rebase
* ready for PR
* address comments
* fix styles
* add require_torch to pipeline test
* remove remote image to improve CI consistency
* address comments; fix tf/flax tests
* address comments; fix tf/flax tests
* fix tests; add alias
* repo consistency tests
* Update src/transformers/pipelines/visual_question_answering.py
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>
* address comments
* Update src/transformers/pipelines/visual_question_answering.py
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>
* merge
* wip
* wip
* wip
* most basic tests pass
* all tests pass now
* relative embedding
* wip
* running make fixup
* remove bert changes
* fix doc
* fix doc
* fix issues
* fix doc
* address comments
* fix CI
* remove redundant copied from
* address comments
* fix broken test
Co-authored-by: Sijun He <sijunhe@Sijuns-MacBook-Pro.local>
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>
2022-06-23 12:36:22 -04:00
Zachary Mueller
acb709d551
Change no trainer image_classification test ( #17635 )
...
* Adjust test arguments and use a new example test
2022-06-23 11:11:16 -04:00
Fx039482
e70abdad1b
Update modeling_cvt.py ( #17846 )
...
As shown in the colab notebook, I added the missing type hints for "CvtForImageClassification" and "CvtModel".
2022-06-23 16:08:36 +01:00
Matt
1a7ef3349f
Fix broken test for models with batchnorm ( #17841 )
...
* Fix tests that broke when models used batchnorm
* Initializing the model twice does not actually...
...give you the same weights each time.
I am good at machine learning.
* Fix speed regression
2022-06-23 15:59:53 +01:00