transformers

mirror of https://github.com/huggingface/transformers.git synced 2025-07-31 02:02:21 +06:00

Author	SHA1	Message	Date
Yih-Dar	5cdfff5df3	Fix job links in Slack report (#17892 ) Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2022-06-29 14:53:13 +02:00
Aritra Roy Gosthipaty	a7eba83161	TF implementation of RegNets (#17554 ) * chore: initial commit Copied the torch implementation of regnets and porting the code to tf step by step. Also introduced an output layer which was needed for regnets. * chore: porting the rest of the modules to tensorflow did not change the documentation yet, yet to try the playground on the model * Fix initilizations (#1) * fix: code structure in few cases. * fix: code structure to align tf models. * fix: layer naming, bn layer still remains. * chore: change default epsilon and momentum in bn. * chore: styling nits. * fix: cross-loading bn params. * fix: regnet tf model, integration passing. * add: tests for TF regnet. * fix: code quality related issues. * chore: added rest of the files. * minor additions.. * fix: repo consistency. * fix: regnet tf tests. * chore: reorganize dummy_tf_objects for regnet. * chore: remove checkpoint var. * chore: remov unnecessary files. * chore: run make style. * Update docs/source/en/model_doc/regnet.mdx Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * chore: PR feedback I. * fix: pt test. thanks to @ydshieh. * New adaptive pooler (#3) * feat: new adaptive pooler Co-authored-by: @Rocketknight1 * chore: remove image_size argument. Co-authored-by: matt <rocketknight1@gmail.com> Co-authored-by: matt <rocketknight1@gmail.com> * Empty-Commit * chore: remove image_size comment. * chore: remove playground_tf.py * chore: minor changes related to spacing. * chore: make style. * Update src/transformers/models/regnet/modeling_tf_regnet.py Co-authored-by: amyeroberts <aeroberts4444@gmail.com> * Update src/transformers/models/regnet/modeling_tf_regnet.py Co-authored-by: amyeroberts <aeroberts4444@gmail.com> * chore: refactored __init__. * chore: copied from -> taken from./g * adaptive pool -> global avg pool, channel check. * chore: move channel check to stem. * pr comments - minor refactor and add regnets to doc tests. 
* Update src/transformers/models/regnet/modeling_tf_regnet.py Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * minor fix in the xlayer. * Empty-Commit * chore: removed from_pt=True. Co-authored-by: Sayak Paul <spsayakpaul@gmail.com> Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> Co-authored-by: matt <rocketknight1@gmail.com> Co-authored-by: amyeroberts <aeroberts4444@gmail.com> Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>	2022-06-29 13:45:14 +01:00
Joao Gante	e6d27ca5c8	TF: XLA beam search + most generation-compatible models are now also XLA-generate-compatible (#17857 ) * working beam search 🎉 * XLA generation compatible with ALL classes * add xla generation slow test	2022-06-29 12:41:01 +01:00
Leon Derczynski	b8142753f9	Add missing comment quotes (#17379 )	2022-06-29 06:16:36 -04:00
NielsRogge	e113c5cb64	Remove render tags (#17897 ) Co-authored-by: Niels Rogge <nielsrogge@Nielss-MacBook-Pro.local>	2022-06-29 06:06:42 -04:00
Santiago Castro	90415475bb	Fix the Conda package build (#16737 ) * Fix the Conda package build * Update build.sh * Update release-conda.yml	2022-06-29 06:03:16 -04:00
Michal Szutenberg	babd7b1a92	Remove DT_DOUBLE from the T5 graph (#17891 )	2022-06-29 10:23:49 +01:00
Yih-Dar	6aae59d0b5	Compute min_resolution in prepare_image_inputs (#17915 ) Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2022-06-29 10:30:20 +02:00
Nicolas Patry	776855c752	Fixing a regression with `return_all_scores` introduced in #17606 (#17906 ) Fixing a regression with `return_all_scores` introduced in #17606 - The legacy test actually tested `return_all_scores=False` (the actual default) instead of `return_all_scores=True` (the actual weird case). This commit adds the correct legacy test and fixes it. Tmp legacy tests. Actually fix the regression (also contains lists) Less diffed code.	2022-06-28 17:24:45 -04:00
Sylvain Gugger	5f1e67a566	Pin PyTorch in requirements as well	2022-06-28 15:56:10 -04:00
Sylvain Gugger	5a3d0cbdda	Pin PyTorch while we fix compatibility with 1.12	2022-06-28 15:07:26 -04:00
Jerry Jiarui XU	6c8f4c9a93	Adding GroupViT Models (#17313 ) * add group vit and fixed test (except slow) * passing slow test * addressed some comments * fixed test * fixed style * fixed copy * fixed segmentation output * fixed test * fixed relative path * fixed copy * add ignore non auto configured * fixed docstring, add doc * fixed copies * Apply suggestions from code review merge suggestions Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * resolve comment, renaming model * delete unused attr * use fix copies * resolve comments * fixed attn * remove unused vars * refactor tests * resolve final comments * add demo notebook * fixed inconsitent default * Apply suggestions from code review Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Apply suggestions from code review Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * rename stage->stages * Create single GroupViTEncoderLayer class * Update conversion script * Simplify conversion script * Remove cross-attention class in favor of GroupViTAttention * Convert other model as well, add processor to conversion script * addressing final comment * fixed args * Update src/transformers/models/groupvit/modeling_groupvit.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> Co-authored-by: Niels Rogge <nielsrogge@Nielss-MacBook-Pro.local>	2022-06-28 20:51:47 +02:00
mrbean	b424f0b4a3	Mrbean/codegen onnx (#17903 )	2022-06-28 14:57:53 +02:00
regisss	76d13de5ae	Add ONNX support for DETR (#17904 )	2022-06-28 14:48:43 +02:00
Bill Ray	bfcd5743ee	In `group_texts` function, drop last block if smaller than `block_size` (#17908 )	2022-06-28 08:34:55 -04:00
amyeroberts	f71895a633	Move logic into pixelshuffle layer (#17899 ) * Move all pixelshuffle logic into layer * Rename layer * Use correct input to function	2022-06-28 13:04:19 +01:00
Matt	0094565fc5	Fix loss computation in TFBertForPreTraining (#17898 )	2022-06-28 12:44:56 +01:00
Lysandre Debut	1dfa03f12b	Pin black to 22.3.0 to benefit from a stable --preview flag (#17918 )	2022-06-28 04:32:18 -04:00
Suraj Patil	9eec4e937e	[M2M100] update conversion script (#17916 )	2022-06-28 10:15:07 +02:00
Yih-Dar	db2644b9eb	Fix PyTorch/TF Auto tests (#17895 ) * add loading_info Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2022-06-28 08:56:24 +02:00
Yih-Dar	f717d47fe0	Fix `test_number_of_steps_in_training_with_ipex` (#17889 ) Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2022-06-28 08:55:02 +02:00
Yih-Dar	0b0dd97737	Update expected values in constrained beam search tests (#17887 ) * fix Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2022-06-28 08:53:53 +02:00
Andrej	e02037b352	Fix bug in gpt2's (from-scratch) special scaled weight initialization (#17877 ) * only special scale init each gpt2 c_proj weight once, on exact match * fix double quotes Co-authored-by: leandro <leandro.vonwerra@spoud.io>	2022-06-27 15:01:49 -04:00
JiJi	6dd00f6bd4	Update README_zh-hans.md (#17861 )	2022-06-27 13:09:20 -04:00
Stefan Schweter	71b2839fd3	bert: add conversion script for BERT Token Dropping TF2 checkpoints (#17142 ) * bert: add conversion script for BERT Token Dropping TF2 checkpoints * bert: rename conversion script for BERT Token Dropping checkpoints * bert: fix flake errors in BERT Token Dropping conversion script * bert: make doc-builder happy!!1!11 * bert: fix pytorch_dump_path of BERT Token Dropping conversion script	2022-06-27 13:08:32 -04:00
Sylvain Gugger	98742829d3	Fix add new model like frameworks (#17869 ) * Add new model like adds only the selected frameworks object in init * Small fix	2022-06-27 13:07:34 -04:00
Ian Castillo	afb71b6726	Add type annotations for RoFormer models (#17878 )	2022-06-27 14:50:43 +01:00
Yih-Dar	9a3453846b	fix (#17890 ) Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2022-06-27 14:36:11 +02:00
Younes Belkada	3ec7d4cfe4	fix mask (#17837 )	2022-06-27 14:08:18 +02:00
Matt	ee0d001de7	Add a TF in-graph tokenizer for BERT (#17701 ) * Add a TF in-graph tokenizer for BERT * Add from_pretrained * Add proper truncation, option handling to match other tokenizers * Add proper imports and guards * Add test, fix all the bugs exposed by said test * Fix truncation of paired texts in graph mode, more test updates * Small fixes, add a (very careful) test for savedmodel * Add tensorflow-text dependency, make fixup * Update documentation * Update documentation * make fixup * Slight changes to tests * Add some docstring examples * Update tests * Update tests and add proper lowercasing/normalization * make fixup * Add docstring for padding! * Mark slow tests * make fixup * Fall back to BertTokenizerFast if BertTokenizer is unavailable * Fall back to BertTokenizerFast if BertTokenizer is unavailable * make fixup * Properly handle tensorflow-text dummies	2022-06-27 12:06:21 +01:00
Yih-Dar	401fcca6c5	Fix TF GPT2 test_onnx_runtime_optimize (#17874 ) Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2022-06-27 09:27:30 +02:00
Joao Gante	cc5c061e34	CLI: handle multimodal inputs (#17839 )	2022-06-25 16:17:11 +01:00
Sylvain Gugger	e8eb699ee8	Properly get tests deps in test_fetcher (#17870 ) * Properly get tests deps in test_fetcher * Remove print	2022-06-24 16:56:46 -04:00
Yih-Dar	b03be78a4b	Fix `test_inference_instance_segmentation_head` (#17872 ) Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2022-06-24 19:36:45 +02:00
Yih-Dar	494aac65a7	Skip `test_multi_gpu_data_parallel_forward` for `MaskFormer` (#17864 ) Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2022-06-24 19:35:00 +02:00
Yih-Dar	0e0f1f4692	Use higher value for hidden_size in Flax BigBird test (#17822 ) * Use higher value for hidden_size in Flax BigBird test * remove 5e-5 Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2022-06-24 19:31:30 +02:00
kumapo	2ef94ee039	Fix: torch.utils.checkpoint import error. (#17849 )	2022-06-24 13:23:29 -04:00
willtai	ef28a402a9	Add type hints for gptneox models (#17858 ) * feat: Add type hints for GPTNeoxForCausalLM and GPTNeoXModel * fix: removed imported Dict type * fix: Removed unused List import	2022-06-24 17:12:36 +01:00
Suraj Patil	061a73d16f	[CodeGen] support device_map="auto" for sharded checkpoints (#17871 )	2022-06-24 18:06:30 +02:00
rooa	d6b6fb9963	Add CodeGen model (#17443 ) * Add CodeGen model * Add missing key and switch order of super() * Fix torch.ones init with uint8 instead of bool * Address comments: copy statements and doc * update tests * remove old model parallel * fix batch gen tests * fix batch gen test * update test_gpt2_sample_max_time * fix codgen test and revert gpt2 test change * Fix incorrect tie_word_embedding value, typo, URL * Fix model order in README and styling * Reorder model list alphabetically * Set tie_word_embedding to False by default * Apply suggestions from code review * Better attn mask name & remove attn masked_bias * add tokenizer for codegen * quality * doc tokenizer * fix-copies * add CodeGenTokenizer in converter * make truncation optional * add test for truncation * add copyright * fix-copies * fix fast tokenizer decode * Update src/transformers/models/codegen/tokenization_codegen.py Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com> * increase vocab_size in tests Co-authored-by: patil-suraj <surajp815@gmail.com> Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>	2022-06-24 17:10:38 +02:00
Yih-Dar	447490015a	Fix Splinter test (#17854 ) * fix Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2022-06-24 16:26:14 +02:00
Suraj Patil	73a0496c2f	[tests/VisionEncoderDecoder] import to_2tuple from test utils (#17865 )	2022-06-24 15:23:30 +02:00
NaN	bc7a6fdc02	Fix Constrained beam search duplication and weird output issue (#17814 ) * fix(ConstrainedBeamSearchScorer.step_sentence_constraint): avoid hypothesis duplication between topk and advance * fix(GenerationMixin.constrained_beam_search): appropriately assign beam scores instead of token scores	2022-06-24 14:56:08 +02:00
Vishwas	c2c0d9db5f	Improve encoder decoder model docs (#17815 ) * Copied all the changes from the last PR * added in documentation_tests.txt * Update docs/source/en/model_doc/encoder-decoder.mdx Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Update docs/source/en/model_doc/encoder-decoder.mdx Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Update docs/source/en/model_doc/encoder-decoder.mdx Co-authored-by: Yih-Dar <2521628+ydshieh@users.noreply.github.com> * Update docs/source/en/model_doc/encoder-decoder.mdx Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Update docs/source/en/model_doc/encoder-decoder.mdx Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Update docs/source/en/model_doc/encoder-decoder.mdx Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Update docs/source/en/model_doc/encoder-decoder.mdx Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> Co-authored-by: vishwaspai <vishwas.pai@emplay.net> Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> Co-authored-by: Yih-Dar <2521628+ydshieh@users.noreply.github.com>	2022-06-24 14:48:19 +02:00
NielsRogge	0917870510	Improve vision models (#17731 ) * Improve vision models * Add a lot of improvements * Remove to_2tuple from swin tests * Fix TF Swin * Fix more tests * Fix copies * Improve more models * Fix ViTMAE test * Add channel check for TF models * Add proper channel check for TF models * Apply suggestion from code review * Apply suggestions from code review * Add channel check for Flax models, apply suggestion * Fix bug * Add tests for greyscale images * Add test for interpolation of pos encodigns Co-authored-by: Niels Rogge <nielsrogge@Nielss-MacBook-Pro.local>	2022-06-24 11:34:51 +02:00
Zachary Mueller	893ab12452	Auto-build Docker images before on-merge if setup.py was changed (#17573 ) * Auto-build on setup modification * Modify push-caller * Make adjustments based on code review	2022-06-23 16:51:33 -04:00
Zachary Mueller	75259b44bf	Properly calculate the total train iterations and recalculate num epochs in no_trainer scripts (#17856 )	2022-06-23 15:46:01 -04:00
Sylvain Gugger	7c1b91281f	Index RNG states by global rank in saves (#17852 )	2022-06-23 12:53:50 -04:00
Sijun He	7cf52a49de	Nezha Pytorch implementation (#17776 ) * wip * rebase * all tests pass * rebase * ready for PR * address comments * fix styles * add require_torch to pipeline test * remove remote image to improve CI consistency * address comments; fix tf/flax tests * address comments; fix tf/flax tests * fix tests; add alias * repo consistency tests * Update src/transformers/pipelines/visual_question_answering.py Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * address comments * Update src/transformers/pipelines/visual_question_answering.py Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * merge * wip * wip * wip * most basic tests passes * all tests pass now * relative embedding * wip * running make fixup * remove bert changes * fix doc * fix doc * fix issues * fix doc * address comments * fix CI * remove redundant copied from * address comments * fix broken test Co-authored-by: Sijun He <sijunhe@Sijuns-MacBook-Pro.local> Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>	2022-06-23 12:36:22 -04:00
Zachary Mueller	acb709d551	Change no trainer image_classification test (#17635 ) * Adjust test arguments and use a new example test	2022-06-23 11:11:16 -04:00

10120 Commits