transformers

mirror of https://github.com/huggingface/transformers.git synced 2025-07-31 02:02:21 +06:00

Author	SHA1	Message	Date
Sylvain Gugger	3981ee8650	Sort the model doc Toc Alphabetically (#17723 )	2022-06-15 16:11:56 -04:00
Stas Bekman	66f893320c	normalize keys_to_ignore (#17722 )	2022-06-15 11:59:11 -07:00
Joao Gante	c3c62b5d2c	CLI: Add flag to push TF weights directly into main (#17720 ) * Add flag to push weights directly into main	2022-06-15 19:25:50 +01:00
Jeff Rasley	6ebeeeef81	Update requirements.txt (#17719 )	2022-06-15 13:51:41 -04:00
Yih-Dar	50415b84d6	Revert "Change push CI to run on workflow_run event (#17692 )" (#17717 ) This reverts commit `b76290f44c`.	2022-06-15 18:42:43 +02:00
Patrick von Platen	7f14839f55	[Wav2Vec2Conformer] Official release (#17709 ) * [Wav2Vec2Conformer] Official release * remove from not-in-readme	2022-06-15 18:34:15 +02:00
Stefan Schweter	242cc6e265	Documentation: RemBERT fixes (#17641 ) * rembert: fix python codeblock * rembert: use correct google/rembert checkpoint name in documentation * rembert: use correct google/rembert checkpoint name in TF documentation	2022-06-15 18:17:59 +02:00
Yih-Dar	b76290f44c	Change push CI to run on workflow_run event (#17692 ) Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2022-06-15 17:43:31 +02:00
Younes Belkada	d453ea6120	fix tolerance for a bloom slow test (#17634 )	2022-06-14 18:14:12 +02:00
Suraj Patil	120649bf3a	[LongT5] disable model parallel test (#17702 )	2022-06-14 17:27:39 +02:00
Michael Benayoun	7ec9128e5a	FX function refactor (#17625 ) * Function refactor * Update src/transformers/utils/fx.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>	2022-06-14 17:22:21 +02:00
Hailey Schoelkopf	edb672ac5e	Add `BloomForSequenceClassification` and `BloomForTokenClassification` classes (#17639 ) * add new bloom classes * (feat) add bloom classification tests; make style * style: change import in test * add some typehints to bloom classes * merge main into branch * fix: input checking in bloom seq classification * fix tests * change model class tests * fix few tests - more tests should pass - one test left * make token classifier return hidden states * style: make BLOOM typehints consistent Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com> Co-authored-by: younesbelkada <younesbelkada@gmail.com> Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>	2022-06-14 17:10:12 +02:00
amyeroberts	bd43151af4	Swin main layer (#17693 ) * Swin models call TFSwinMainLayer * Tidy up	2022-06-14 14:28:12 +01:00
Sayak Paul	3960ce917f	Include a comment to reflect Amy's contributions (#17689 ) * Add note on amy's contribution. Co-authored-by: Amy Roberts <aeroberts4444@gmail.com> * remove non-tech comment. Co-authored by: Amy Roberts <aeroberts4444@gmail.com> Co-authored-by: Amy Roberts <aeroberts4444@gmail.com>	2022-06-14 09:15:39 -04:00
Shamane Siri	9068fa6c57	Rag end2end new (#17650 ) * check * update the RAG-end2end with new PL and RAY * removed unwanted comments	2022-06-14 14:56:32 +02:00
Patrick von Platen	53496ac510	[LongT5] Rename checkpoitns (#17700 )	2022-06-14 14:10:50 +02:00
jianan-gu	3b29c9fdb7	Extend Transformers Trainer Class to Enable PyTorch Torchscript for Inference (#17153 ) * add jit mode option and model wrap * Update src/transformers/training_args.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update src/transformers/training_args.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * refine code * Update src/transformers/trainer.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update src/transformers/trainer.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * add ut and refine code * code refine * refine code * add inference doc * Update src/transformers/trainer.py Co-authored-by: Stas Bekman <stas00@users.noreply.github.com> * Update src/transformers/trainer.py Co-authored-by: Stas Bekman <stas00@users.noreply.github.com> * add cpu inference performance doc * Update perf_infer_cpu.mdx * Update perf_infer_cpu.mdx * Update performance.mdx * Update _toctree.yml * refine jit func naming * Update _toctree.yml * Delete perf_infer_gpu_one.mdx * Update perf_infer_cpu.mdx * Update docs/source/en/perf_infer_cpu.mdx Co-authored-by: Stas Bekman <stas00@users.noreply.github.com> * add none check before jit * Update docs/source/en/perf_infer_cpu.mdx Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update docs/source/en/perf_infer_cpu.mdx Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> Co-authored-by: Stas Bekman <stas@stason.org> Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>	2022-06-14 07:56:47 -04:00
Yih-Dar	df15703b42	Fix doc builder Dockerfile (#17435 ) * Fix doc builder Dockerfile Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2022-06-14 09:58:48 +02:00
Daniel Stancl	a72f1c9f5b	Add `LongT5` model (#16792 ) * Initial commit * Make some fixes * Make PT model full forward pass * Drop TF & Flax implementation, fix copies etc * Add Flax model and update some corresponding stuff * Drop some TF things * Update config and flax local attn * Add encoder_attention_type to config * . * Update docs * Do some cleansing * Fix some issues -> make style; add some docs * Fix position_bias + mask addition + Update tests * Fix repo consistency * Fix model consistency by removing flax operation over attn_mask * [WIP] Add PT TGlobal LongT5 * . * [WIP] Add flax tglobal model * [WIP] Update flax model to use the right attention type in the encoder * Fix flax tglobal model forward pass * Make the use of global_relative_attention_bias * Add test suites for TGlobal model * Fix minor bugs, clean code * Fix pt-flax equivalence though not convinced with correctness * Fix LocalAttn implementation to match the original impl. + update READMEs * Few updates * Update: [Flax] improve large model init and loading #16148 * Add ckpt conversion script accoring to #16853 + handle torch device placement * Minor updates to conversion script. * Typo: AutoModelForSeq2SeqLM -> FlaxAutoModelForSeq2SeqLM * gpu support + dtype fix * Apply some suggestions from code review Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com> * * Remove (de)parallelize stuff * Edit shape comments * Update README.md * make fix-copies * Remove caching logic for local & tglobal attention * Apply another batch of suggestions from code review * Add missing checkpoints * Format converting scripts * Drop (de)parallelize links from longT5 mdx * Fix converting script + revert config file change * Revert "Remove caching logic for local & tglobal attention" This reverts commit 2a619828f6ddc3e65bd9bb1725a12b77fa883a46. 
* Stash caching logic in Flax model * Make side relative bias used always * Drop caching logic in PT model * Return side bias as it was * Drop all remaining model parallel logic * Remove clamp statements * Move test files to the proper place * Update docs with new version of hf-doc-builder * Fix test imports * Make some minor improvements * Add missing checkpoints to docs * Make TGlobal model compatible with torch.onnx.export * Replace some np.ndarray with jnp.ndarray * Fix TGlobal for ONNX conversion + update docs * fix _make_global_fixed_block_ids and masked neg value * update flax model * style and quality * fix imports * remove load_tf_weights_in_longt5 from init and fix copies * add slow test for TGlobal model * typo fix * Drop obsolete is_parallelizable and one warning * Update __init__ files to fix repo-consistency * fix pipeline test * Fix some device placements * [wip]: Update tests -- need to generate summaries to update expected_summary * Fix quality * Update LongT5 model card * Update (slow) summarization tests * make style * rename checkpoitns * finish * fix flax tests Co-authored-by: phungvanduy <pvduy23@gmail.com> Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com> Co-authored-by: patil-suraj <surajp815@gmail.com>	2022-06-13 22:36:58 +02:00
haohanchen-yagao	1690094bdb	Add FP16 Support for SageMaker Model Parallel (#17386 ) * Add FP16 supporot for sagemaker model parallel * minor fix * fix indentation * handle mix precision exception for smmp * minor fix * remove amp implementation on SMMP * remove redundant stuff * reformat trainer * restyling * reformat	2022-06-13 13:45:25 -04:00
Wang, Yi	4aabf9b52c	enable cpu distribution training using mpirun (#17570 ) * enable cpu distribution training using mpirun; command like `mpirun -n 2 python3 run_qa.py --no_cuda --xpu_backend ccl xxxx`; MASTER_ADDR and MASTER_PORT should be set as env: `export MASTER_ADDR=127.0.0.1`, `export MASTER_PORT=29500` Signed-off-by: Wang, Yi A <yi.a.wang@intel.com> fix according to the review comment Signed-off-by: Wang, Yi A <yi.a.wang@intel.com> * use accelerate logic for cpu distribution training to set "RANK", "LOCAL_RANK", "WORLD_SIZE" environment Signed-off-by: Wang, Yi A <yi.a.wang@intel.com>	2022-06-13 13:34:07 -04:00
Bram Vanroy	457d4a3245	Add Ray's scope to training arguments (#17629 ) * allow scope from trainer arg * add ray_scope to training args * escape double quotes * make style && quality * attempt to solve doc style issues * splitting up URLs for style * make fixup * Update src/transformers/training_args.py Co-authored-by: Antoni Baum <antoni.baum@protonmail.com> * make style Co-authored-by: Antoni Baum <antoni.baum@protonmail.com>	2022-06-13 10:44:06 -04:00
Will Frey	5483388631	Update modeling_gpt_neox.py (#17575 ) I'm guessing that the intention was to have the `_no_split_modules` class attribute for `GPTNeoXPreTrainedModel` to be set to `["GPTNeoXLayer"]`, akin to how its set as `["GPTJBlock"]` for `GPTJPreTrainedModel`. If this is incorrect, please feel free to just close the PR. Thanks!	2022-06-13 09:59:27 -04:00
Sylvain Gugger	a1344dbfb9	Fix dtype getter (#17668 ) * Fix dtype getters * Proper fix for dtype getter * Style and commant * Always use last for consistency * Quality	2022-06-13 09:34:45 -04:00
Bram Vanroy	73083581a4	explicitly set utf8 for Windows (#17664 )	2022-06-13 08:05:45 -04:00
Saint	c1daf724ea	Fixed documentation typo, parameter name is evaluation_strategy, not eval_strategy (#17669 ) Co-authored-by: Saint <saint@st-mini.local>	2022-06-13 08:02:06 -04:00
Sijun He	66336dc183	Add Visual Question Answering (VQA) pipeline (#17286 ) * wip * rebase * all tests pass * rebase * ready for PR * address comments * fix styles * add require_torch to pipeline test * remove remote image to improve CI consistency * address comments; fix tf/flax tests * address comments; fix tf/flax tests * fix tests; add alias * repo consistency tests * Update src/transformers/pipelines/visual_question_answering.py Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * address comments * Update src/transformers/pipelines/visual_question_answering.py Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * merge * Update src/transformers/models/auto/modeling_auto.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * merge Co-authored-by: Sijun He <sijunhe@Sijuns-MacBook-Pro.local> Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com> Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>	2022-06-13 07:49:44 -04:00
Ayush Mangal	a5282ab4bc	Fix typo in adding_a_new_model README (#17679 )	2022-06-13 03:22:07 -04:00
Yih-Dar	224bde91ca	Avoid GPU OOM for a TF Rag test (#17638 ) Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2022-06-10 18:50:29 +02:00
Domenic Rosati	39e146146b	fix typo from emtpy to empty (#17643 )	2022-06-10 18:50:11 +02:00
Patrick von Platen	13e875cc07	[Generation Test] Make fast test actually fast (#17661 )	2022-06-10 18:49:03 +02:00
Patrick von Platen	b4eef63a1d	[Data2Vec] Speed up test (#17660 )	2022-06-10 18:48:58 +02:00
Patrick von Platen	5e428b71b4	[BigBirdFlaxTests] Make tests slow (#17658 ) * [BigBirdFlaxTests] Make tests slow * up * correct black with new version	2022-06-10 16:54:14 +02:00
Loubna Ben Allal	3114df41f4	update README.md (#17657 ) - use CodeParrot scores of v1.1 - change evaluation command to use accelerate	2022-06-10 15:55:24 +02:00
Simon Brandeis	c99ddcc441	🐛 Properly raise `RepoNotFoundError` when not authenticated (#17651 ) * Raise RepoNotFoundError in case of 401 * Include changes from revert-17646-skip_repo_not_found * Add a comment * 💄 Code quality * 💚 Update `get_from_cache` test * 💚 Code quality & skip failing test	2022-06-10 15:41:53 +02:00
Balaji	35b16032cb	Fixes #17128 . (#17356 ) VisibleDeprecationWarning is addressed by specifying dtype=object when creating numpy array. Update code based on review feedback. Undo whitespace changes to tokenization_utils_base.py. Co-authored-by: I like data <ilikedata@nym.hush.com>	2022-06-10 09:36:48 -04:00
Sylvain Gugger	b88090914d	Fix dtype getters (#17656 )	2022-06-10 07:43:13 -04:00
amyeroberts	fd1e67033e	Add skip logic for attentions test - Levit (#17633 )	2022-06-10 12:46:30 +02:00
Lysandre	cdaed367b0	Fix style	2022-06-10 11:53:44 +02:00
Lysandre	2bc305107a	Fix style	2022-06-10 11:20:14 +02:00
dependabot[bot]	1d463303fe	Bump cookiecutter in /examples/research_projects/decision_transformer (#17645 ) Bumps [cookiecutter](https://github.com/cookiecutter/cookiecutter) from 1.7.2 to 2.1.1. - [Release notes](https://github.com/cookiecutter/cookiecutter/releases) - [Changelog](https://github.com/cookiecutter/cookiecutter/blob/master/HISTORY.md) - [Commits](https://github.com/cookiecutter/cookiecutter/compare/1.7.2...2.1.1) --- updated-dependencies: - dependency-name: cookiecutter dependency-type: direct:production ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>	2022-06-10 04:27:51 -04:00
Alara Dirik	49becbaa55	Enable crop_center method to handle (W, H, C) images (#17626 ) * enable crop_center method to handle (W, H, C) images * minor style and comment edits	2022-06-10 09:18:42 +03:00
Alara Dirik	6e93d94792	Move Clip image utils to image_utils.py (#17628 ) * move clip image utils to image_utils.py * dont default to square images * fix typo, revert change to test file * edit convert_rgb comments	2022-06-10 09:12:17 +03:00
Sylvain Gugger	af4a1ecad0	Skip tests until bug is fixed. (#17646 )	2022-06-09 21:32:19 -04:00
Martina Fumanelli	e0b58fb5ba	Translation/autoclass (#17615 ) * Add Italian translation for autoclass_tutorial.mdx * Fix synthesis Co-authored-by: martina.fumanelli <martina.fumanelli@MBP-di-martinafumanelli.local>	2022-06-09 20:56:44 -04:00
Stas Bekman	df1ec6b122	didn't exist in pt-1.9 (#17644 )	2022-06-09 16:01:01 -07:00
mrbean	fba0b6a820	convert assertion to raised exception in debertav2 (#17619 ) * convert assertion to raised exception in debertav2 * change assert to raise exception in deberta * fix messages	2022-06-09 18:18:29 -04:00
Yih-Dar	da0bed5f4a	Pre-build DeepSpeed (#17607 ) * pre-build deepspeed Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2022-06-09 23:02:33 +02:00
Stas Bekman	75343de938	[modeling_utils] torch_dtype/auto floating dtype fixes (#17614 ) * [modeling_utils] torch_dtype/auto fixes * add test * apply suggestions * add missing fallback * Renaming things * Use for else Co-authored-by: Sylvain Gugger <Sylvain.gugger@gmail.com>	2022-06-09 10:18:26 -07:00
Nicolas Patry	c38f4e1f1c	Running a pipeline of `float16`. (#17637 ) When we're preparing the tensors for CPU for postprocessing, we need to upgrade the `float16` to `float32` since CPUs don't have instructions for `[b]float16`.	2022-06-09 19:04:42 +02:00
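
Note on c38f4e1f1c: the commit message says `float16` tensors are upgraded to `float32` when they are prepared for CPU postprocessing. The snippet below is only a rough sketch of that idea, not the code from the commit. It uses NumPy arrays as a stand-in for framework tensors, and the helper name `upcast_half_for_cpu` is hypothetical. NumPy has no native `bfloat16`, so only `float16` is handled here.

```python
import numpy as np


def upcast_half_for_cpu(array: np.ndarray) -> np.ndarray:
    """Return a float32 copy of a float16 array; leave other dtypes untouched.

    Hypothetical helper for illustration only; it is not part of transformers.
    """
    if array.dtype == np.float16:
        # Widen half precision to single precision before CPU-side postprocessing.
        return array.astype(np.float32)
    return array


# Example: model outputs stored in half precision.
logits = np.array([1.5, -0.25, 3.0], dtype=np.float16)
upcast = upcast_half_for_cpu(logits)
print(upcast.dtype)
```

Arrays of any other dtype are returned unchanged, so the helper can be applied to every output without checking its type first.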

10017 Commits