transformers

mirror of https://github.com/huggingface/transformers.git synced 2025-07-31 10:12:23 +06:00

Author	SHA1	Message	Date
Zachary Mueller	75259b44bf	Properly calculate the total train iterations and recalculate num epochs in no_trainer scripts (#17856 )	2022-06-23 15:46:01 -04:00
Sylvain Gugger	7c1b91281f	Index RNG states by global rank in saves (#17852 )	2022-06-23 12:53:50 -04:00
Sijun He	7cf52a49de	Nezha Pytorch implementation (#17776 ) * wip * rebase * all tests pass * rebase * ready for PR * address comments * fix styles * add require_torch to pipeline test * remove remote image to improve CI consistency * address comments; fix tf/flax tests * address comments; fix tf/flax tests * fix tests; add alias * repo consistency tests * Update src/transformers/pipelines/visual_question_answering.py Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * address comments * Update src/transformers/pipelines/visual_question_answering.py Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * merge * wip * wip * wip * most basic tests passes * all tests pass now * relative embedding * wip * running make fixup * remove bert changes * fix doc * fix doc * fix issues * fix doc * address comments * fix CI * remove redundant copied from * address comments * fix broken test Co-authored-by: Sijun He <sijunhe@Sijuns-MacBook-Pro.local> Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>	2022-06-23 12:36:22 -04:00
Zachary Mueller	acb709d551	Change no trainer image_classification test (#17635 ) * Adjust test arguments and use a new example test	2022-06-23 11:11:16 -04:00
Fx039482	e70abdad1b	Update modeling_cvt.py (#17846 ) As shown in the colab notebook I added the missing type hints for " CvtForImageClassification CvtModel "	2022-06-23 16:08:36 +01:00
Matt	1a7ef3349f	Fix broken test for models with batchnorm (#17841 ) * Fix tests that broke when models used batchnorm * Initializing the model twice does not actually... ...give you the same weights each time. I am good at machine learning. * Fix speed regression	2022-06-23 15:59:53 +01:00
Younes Belkada	18c263c4b6	BLOOM minor changes on tokenizer (#17823 ) * few fixes: - hardcode tokenizer padding side - remove unused args * few fixes: - added new attribute on TokenizerTesterMixin - added new slow test - remove unused arg on tokenizer class * make style * Update src/transformers/models/bloom/tokenization_bloom_fast.py Co-authored-by: SaulLu <55560583+SaulLu@users.noreply.github.com> * make quality * apply changes - remove new attribute - redefine test on the class * add comments Co-authored-by: SaulLu <55560583+SaulLu@users.noreply.github.com>	2022-06-23 15:57:12 +02:00
Leandro von Werra	6f29029b05	Improve performance docs (#17750 ) * add skeleton files * fix cpu inference link * add hint to make clear that single gpu section contains general info * add new files to ToC * update toctree to have subsection for performance * add "coming soon" to the still empty sections * fix missing title * fix typo * add reference to empty documents * Apply suggestions from code review Co-authored-by: Stas Bekman <stas00@users.noreply.github.com> * Apply suggestions from code review Co-authored-by: Stas Bekman <stas00@users.noreply.github.com> Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>	2022-06-23 14:51:54 +02:00
Yih-Dar	5bc779ae28	Fix an error message in BigBird (#17840 ) Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2022-06-23 14:43:53 +02:00
Guillaume Klein	3eed5530ec	Fix properties of unset special tokens in non verbose mode (#17797 ) Co-authored-by: SaulLu <55560583+SaulLu@users.noreply.github.com>	2022-06-23 14:40:13 +02:00
SaulLu	b2fdbaccdd	change message (#17836 )	2022-06-23 14:39:48 +02:00
willtai	d37a68e685	Add missing type hints for QDQBertModel (#17783 ) * Feat: add missing type hints for QDQBertModel * fix: ran black and isort * feat: Add missing output type for QDQBertModel * feat: Add type hints for QDQBertLMHeadModel and models starting with QDQBertFor * fix: add missing return type for QDQBertModel * fix: remove wrong return type for QDQBertEmbeddings * fix: readded config argument to load_tf_weights_in_qdqbert * fix: add BertConfig type to BertEmbeddings config due t checko error in ci * fix: removed config type hints to avoid copy checks	2022-06-23 12:58:43 +01:00
Fx039482	4297f44b63	Update type hints modeling_yoso.py (#17827 ) * Update modeling_yoso.py * make fixup * Update modeling_yoso.py That should be it copied from previous PR	2022-06-23 12:37:29 +01:00
Joao Gante	5cce3076c4	TF: generate without `tf.TensorArray` (#17801 )	2022-06-23 12:28:08 +01:00
Quentin	ab223fc148	add doctests for DETR (#17786 ) * add: check labels for detr object detection doctests * add: check shapes * add: add detr to documentation_tests.py * fix: make fixup output * fix: add a comment	2022-06-23 13:26:14 +02:00
Yih-Dar	8d634b70e0	Fix push CI artifact path (#17788 ) Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2022-06-23 12:31:22 +02:00
Sylvain Gugger	df8e6804c0	Offload fixes (#17810 ) * Offload fixes * Add a test	2022-06-22 12:23:07 -04:00
Joao Gante	0d0c392c45	CLI: use hub's `create_commit` (#17755 ) * use create_commit * better commit message and description * touch setup.py to trigger cache update * add hub version gating	2022-06-22 16:50:21 +01:00
dependabot[bot]	c366ce1011	Bump numpy from 1.21.0 to 1.22.0 in /examples/research_projects/lxmert (#17817 ) Bumps [numpy](https://github.com/numpy/numpy) from 1.21.0 to 1.22.0. - [Release notes](https://github.com/numpy/numpy/releases) - [Changelog](https://github.com/numpy/numpy/blob/main/doc/HOWTO_RELEASE.rst) - [Commits](https://github.com/numpy/numpy/compare/v1.21.0...v1.22.0) --- updated-dependencies: - dependency-name: numpy dependency-type: direct:production ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>	2022-06-22 09:29:40 -04:00
dependabot[bot]	af0d21e741	Bump numpy in /examples/research_projects/visual_bert (#17816 ) Bumps [numpy](https://github.com/numpy/numpy) from 1.21.0 to 1.22.0. - [Release notes](https://github.com/numpy/numpy/releases) - [Changelog](https://github.com/numpy/numpy/blob/main/doc/HOWTO_RELEASE.rst) - [Commits](https://github.com/numpy/numpy/compare/v1.21.0...v1.22.0) --- updated-dependencies: - dependency-name: numpy dependency-type: direct:production ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>	2022-06-22 09:29:28 -04:00
Arthur	56b83cf049	initial commit (#17818 )	2022-06-22 14:26:03 +02:00
Eran Hirsch	1357038164	Add logits_processor parameter, used by `generate`, to `Seq2SeqTrainer` methods `evaluate` and `predict` (#17805 ) * Add logits_processor parameter, used by `generate`, to `Seq2SeqTrainer` methods `evaluate` and `predict` * Add all generate parameters to `Seq2SeqTrainer`, and also to `QuestionAnsweringSeq2SeqTrainer` which overrides it * Remove `self._num_beams` from trainer classes * - Run fixup - Fix "Constraint" not exposed - Fix synced_gpus to actually read from param * Use kwargs * Copy kwargs before making changes to it * Fix style issues unused imports	2022-06-22 08:11:39 -04:00
Arthur	16c6eb7ca1	Flax sharded (#17760 )	2022-06-22 07:04:35 +02:00
unifyh	3b00b623b7	Fix `top_k_top_p_filtering` having unexpected behavior (#17744 ) - Fix `top_k_top_p_filtering` not passing `filter_value` to `TopPLogitsWarper` causing any top-p filtered logits to be -inf instead of specified value - Add corresponding test	2022-06-21 21:35:55 +02:00
Kyungmin Lee	3ccff0d400	Remove duplicate code (#17708 )	2022-06-21 21:30:40 +02:00
Bram Vanroy	26a6a42608	Improve error message Union not allowed (#17769 ) * Improve error message Union not allowed * make style * Update src/transformers/hf_argparser.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>	2022-06-21 14:27:01 -04:00
Thomas Wang	abc400b06a	Add final_layer_norm to OPT model (#17785 ) * Add final_layer_norm to OPT model * Add JAX and TF version * Fix Keras name * Woops * Allow for non breaking change * Apply suggestions from code review * add tests Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>	2022-06-21 20:26:36 +02:00
Zachary Mueller	52404cbad4	Properly check for a TPU device (#17802 )	2022-06-21 13:39:55 -04:00
Sylvain Gugger	ef23fae596	Fix test for BF16 detection (#17803 )	2022-06-21 18:31:15 +02:00
Arthur	7cced021fa	TF Sharded (#17713 ) * initial commit * update modeeling tf utils * quality * clean and update args * update * remove potential bug * code quality * update * update max shard * update tests for sharding from pretrained * fix remaining test * make style * h5py if tf available * update and fix test * fix test * style * modified push to hub to support shard for TF * quick fix * update code * merge branch main and style * Apply suggestions from code review Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com> Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com> * update based on reviews * update doc * update and style * Apply suggestions from code review Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update based on reviews * fix typo * style Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com> Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com> Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>	2022-06-21 18:01:08 +02:00
Yih-Dar	f47afefb21	Use 5e-5 For BigBird PT/Flax equivalence tests (#17780 ) * rename to check_pt_flax_outputs * update check_pt_flax_outputs * use 5e-5 for BigBird PT/Flax test Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2022-06-21 17:55:26 +02:00
Lysandre Debut	6a5272b205	Prepare transformers for v0.8.0 huggingface-hub release (#17716 ) * Prepare CI for v0.8.0 * pin hfh (revert before merge) * Revert "pin hfh (revert before merge)" This reverts commit `a0103140e1`. * Test rc3 * Test latest rc * Unpin to the RC Co-authored-by: Sylvain Gugger <Sylvain.gugger@gmail.com>	2022-06-21 11:51:18 -04:00
Sylvain Gugger	7bc88c0511	Fix forward reference imports in DeBERTa configs (#17800 )	2022-06-21 11:21:06 -04:00
Anugunj Naman	27e907386a	Fix Automatic Download of Pretrained Weights in DETR (#17712 ) * added use_backbone_pretrained * style fixes * update * Update detr.mdx * Update detr.mdx * Update detr.mdx * update using doc py * Update detr.mdx * Update src/transformers/models/detr/configuration_detr.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>	2022-06-21 16:45:35 +02:00
NielsRogge	b681e12d59	[ViTMAE] Fix docstrings and variable names (#17710 ) * Fix docstrings and variable names * Rename x to something better * Improve messages * Fix docstrings and add test for greyscale images Co-authored-by: Niels Rogge <nielsrogge@Nielss-MacBook-Pro.local>	2022-06-21 15:56:00 +02:00
NielsRogge	3fab17fce8	Add link to notebook (#17791 ) Co-authored-by: Niels Rogge <nielsrogge@Nielss-MacBook-Pro.local>	2022-06-21 14:53:08 +02:00
Jia LI	da2bd2ae96	[CodeParrot] Near-deduplication with jaccard similarity (#17054 ) * deduplication draft * update style * update style test * dummy test main * rename modules * rename functions * return extremes in deduplicate_clusters * update style * cast str for gzip * update doc string * time processing * use dataset map to compute minhash * fill value for short token * remove da map method * update style * use share object to multiprocess * update style * use f-string and minor fix Co-authored-by: Leandro von Werra <lvwerra@users.noreply.github.com> Co-authored-by: Loubna Ben Allal <44069155+loubnabnl@users.noreply.github.com> * update style * use module parameters * change ds_dedup to ds_filter * save ds_dedup * mv test to script tests * make jaccard threshold a parameter of deduplicate_dataset * update style * add doc strings * update style * add doc string for DuplicationIndex * save files into data dir * update readme * Update examples/research_projects/codeparrot/README.md Co-authored-by: Loubna Ben Allal <44069155+loubnabnl@users.noreply.github.com> * make near deduplication optional * move near deduplication in README * Update examples/research_projects/codeparrot/README.md Co-authored-by: Leandro von Werra <lvwerra@users.noreply.github.com> * use f string Co-authored-by: Leandro von Werra <lvwerra@users.noreply.github.com> Co-authored-by: Loubna Ben Allal <44069155+loubnabnl@users.noreply.github.com>	2022-06-21 14:23:36 +02:00
mrbean	eb16be415a	add onnx support for deberta and debertav2 (#17617 ) * add onnx support for debertav2 * debertav2 -> deberta-v2 in onnx features file * remove causal lm * add deberta-v2-xlarge to onnx tests * use self.type().dtype() in xsoftmax Co-authored-by: Jingya HUANG <44135271+JingyaHuang@users.noreply.github.com> * remove hack for deberta * remove unused imports * Update src/transformers/models/deberta_v2/configuration_deberta_v2.py Co-authored-by: Jingya HUANG <44135271+JingyaHuang@users.noreply.github.com> * use generate dummy inputs * linter * add imports * add support for deberta v1 as well * deberta does not support multiple choice * Update src/transformers/models/deberta/configuration_deberta.py Co-authored-by: Jingya HUANG <44135271+JingyaHuang@users.noreply.github.com> * Update src/transformers/models/deberta_v2/configuration_deberta_v2.py Co-authored-by: Jingya HUANG <44135271+JingyaHuang@users.noreply.github.com> * one line ordered dict * fire build Co-authored-by: Jingya HUANG <44135271+JingyaHuang@users.noreply.github.com>	2022-06-21 11:04:15 +02:00
Patrick von Platen	8fcbe275c3	Add UL2 (just docs) (#17740 ) * Add UL2 Co-authored-by: Daniel Hesslow <Daniel.Hesslow@gmail.com> * Correct naming * sort better * up * apply sylvains suggestion	2022-06-21 10:24:50 +02:00
Brad Jascob	da27c4b398	Update modeling_longt5.py (#17777 ) On line 180, `torch.tensor(-1.0, xxx)` gives the error "TypeError: 'float' object cannot be interpreted as an integer" This is because the dtype here is `int64`. For `dtype=int64`, this needs to simply be `-1`. This impacts the long-t5-tglogbal-x model. It does not impact the long-t5-local-x version which does not appear to call this line.	2022-06-20 18:49:08 +02:00
Yih-Dar	d3cb28886a	Not use -1e4 as attn mask (#17306 ) * Use torch.finfo(self.dtype).min * for GPTNeoX * for Albert * For Splinter * Update src/transformers/models/data2vec/modeling_data2vec_audio.py Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com> * fix -inf used in Bart-like models * Fix a few remaining -inf * more fix * clean up * For CLIP * For FSMT * clean up * fix test * Add dtype argument and use it for LayoutLMv3 * update FlaxLongT5Attention Co-authored-by: ydshieh <ydshieh@users.noreply.github.com> Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>	2022-06-20 16:16:16 +02:00
Sylvain Gugger	fdb120805c	Fix cache for GPT-Neo-X (#17764 ) * Fix cache for GPT-Neo-X * Add more tests	2022-06-20 08:43:36 -04:00
Stas Bekman	a2d34b7c04	deprecate is_torch_bf16_available (#17738 ) * deprecate is_torch_bf16_available * address suggestions	2022-06-20 08:40:11 -04:00
Joao Gante	132402d752	TF: BART compatible with XLA generation (#17479 ) * Also propagate changes to blenderbot, blenderbot_small, marian, mbart, and pegasus	2022-06-20 11:07:46 +01:00
Yih-Dar	6589e510fa	Attempt to change Push CI to workflow_run (#17753 ) * Use workflow_run event for push CI * change to workflow_run * Add comments Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2022-06-18 08:35:03 +02:00
Rafael Zimmer	0d92798b45	Added translation of index.mdx to Portuguese Issue #16824 (#17565 ) * Added translation of installation.mdx to Portuguese, as well as default templates of _toctree.yml and _config.py * [ build_documentation.yml ] - Updated doc_builder to build documentation in Portuguese. [ pipeline_tutorial.mdx ] - Created translation for the pipeline_tutorial.mdx. * [ build_pr_documentation.yml ] - Added pt language to pr_documentation builder. [ pipeline_tutorial.mdx ] - Grammar changes. * [ accelerate.mdx ] - Translated to Portuguese the acceleration tutorial. * [ multilingual.mdx ] - Added portuguese translation for multilingual tutorial. [ training.mdx ] - Added portuguese translation for training tutorial. * [ preprocessing.mdx ] - WIP * Update _toctree.yml * Adding Pré-processamento to _toctree.yml * Update accelerate.mdx * Nits and eliminate preprocessing file while it is ready * [ index.mdx ] - Translated to Portuguese the index apresentation page. * [ docs/source/pt ] - Updated _toctree.yml to match newest translations. * Fix build_pr_documentation.yml * Fix index nits * nits in _toctree Co-authored-by: Omar U. Espejel <espejelomar@gmail.com>	2022-06-17 20:06:05 -04:00
Swetha Mandava	522a9ece4b	Save huggingface checkpoint as artifact in mlflow callback (#17686 ) * Fix eval to compute rouge correctly for rouge_score * styling * moving sentence tokenization to utils from run_eval * saving ckpt in mlflow * use existing format of args * fix documentation Co-authored-by: Swetha Mandava <smandava@nvidia.com>	2022-06-17 14:14:03 -04:00
Sourab Mangrulkar	21a772426d	Migrate HFDeepSpeedConfig from trfrs to accelerate (#17623 ) * Migrate HFDeepSpeedConfig from trfrs to accelerate * add `accelerate` to testing dep * addressing comments * addressing comments Using `_shared_state` and avoiding object creation. This is necessary as `notebook_launcher` in `launcers.py` checks `len(AcceleratorState._shared_state)>0` to throw an error. * resolving comments 1. Use simple API from accelerate to manage the deepspeed config integration 2. Update the related documentation * reverting changes and addressing comments * docstring correction * addressing nits * addressing nits * addressing nits 3 * bumping up the accelerate version to 0.10.0 * resolving import * update setup.py to include deepspeed dependencies * Update dependency_versions_table.py * fixing imports * reverting changes to CI dependencies for "run_tests_pipelines_tf" tests These changes didn't help with resolving the failures and I believe this needs to be addressed in another PR. removing `accelerate` as hard dependency Resolves issues related to CI Tests * adding `accelerate` as dependency for building docs resolves failure in Build PR Documentation test * adding `accelerate` as dependency in "dev" to resolve doc build issue * resolving comments 1. adding `accelerate` to extras["all"] 2. Including check for accelerate too before import HFDeepSpeedConfig from there Co-Authored-By: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * resolving comments Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>	2022-06-17 23:29:35 +05:30
dependabot[bot]	e44a569fef	Bump notebook in /examples/research_projects/lxmert (#17743 ) Bumps [notebook](http://jupyter.org) from 6.4.10 to 6.4.12. --- updated-dependencies: - dependency-name: notebook dependency-type: direct:production ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>	2022-06-17 12:10:33 -04:00
dependabot[bot]	5089a2d412	Bump notebook in /examples/research_projects/visual_bert (#17742 ) Bumps [notebook](http://jupyter.org) from 6.4.10 to 6.4.12. --- updated-dependencies: - dependency-name: notebook dependency-type: direct:production ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>	2022-06-17 12:10:17 -04:00
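
Note on da2bd2ae96 (#17054): the commit message describes near-deduplication using Jaccard similarity with a configurable threshold. As a rough illustration of that measure only, here is a minimal brute-force sketch. It is not the CodeParrot implementation (the commit message mentions MinHash and clustering, which this omits). The function names, whitespace tokenization, and the default threshold are all assumptions made for the example.

```python
from typing import Iterable, List, Tuple


def jaccard_similarity(tokens_a: Iterable[str], tokens_b: Iterable[str]) -> float:
    """Jaccard similarity |A & B| / |A | B| between two token collections."""
    set_a, set_b = set(tokens_a), set(tokens_b)
    if not set_a and not set_b:
        # Convention chosen for this sketch: two empty documents count as identical.
        return 1.0
    return len(set_a & set_b) / len(set_a | set_b)


def near_duplicates(docs: List[str], threshold: float = 0.85) -> List[Tuple[int, int]]:
    """Return index pairs (i, j), i < j, whose Jaccard similarity is >= threshold.

    Hypothetical helper: compares every pair (quadratic cost), which is only
    practical for small inputs. The threshold default is arbitrary.
    """
    token_lists = [doc.split() for doc in docs]  # naive whitespace tokenization
    pairs = []
    for i in range(len(docs)):
        for j in range(i + 1, len(docs)):
            if jaccard_similarity(token_lists[i], token_lists[j]) >= threshold:
                pairs.append((i, j))
    return pairs
```

For example, `near_duplicates(["x = 1 + 2", "x = 1 + 3", "print ( y )"], threshold=0.6)` flags only the first two snippets, which share four of six distinct tokens.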


10074 Commits