transformers

mirror of https://github.com/huggingface/transformers.git synced 2025-07-31 02:02:21 +06:00

Author	SHA1	Message	Date
Leo Tronchon	a7ff77a573	raise value error if cross_attention_gate is None	2023-11-03 10:13:53 +01:00
Leo Tronchon	9a000da98f	update test based on comments	2023-10-30 18:43:53 +01:00
Leo Tronchon	f5b26b949a	Merge branch 'main' into fix-idefics-image-attention	2023-10-26 14:56:34 +02:00
Leo Tronchon	71d444daca	take off no_images logic	2023-10-26 14:35:31 +02:00
Leo Tronchon	6cc352cf6f	make tests for gate	2023-10-26 14:29:21 +02:00
Patrick von Platen	d7cb5e138e	[Llama FA2] Re-add _expand_attention_mask and clean a couple things (#27074 ) * clean * clean llama * fix more * make style * Apply suggestions from code review * Apply suggestions from code review * Update src/transformers/models/llama/modeling_llama.py * Update src/transformers/models/llama/modeling_llama.py * Apply suggestions from code review * finish * make style	2023-10-26 13:06:21 +02:00
Arthur	4864d08d3e	Add-support for commit description (#26704 ) * fix * update * revert * add dosctring * good to go * update * add a test	2023-10-26 12:37:09 +02:00
Arthur	15cd096288	Create SECURITY.md	2023-10-26 12:26:47 +02:00
Younes Belkada	fe2877ce21	Remove unneeded prints in modeling_gpt_neox.py (#27080 )	2023-10-26 11:55:31 +02:00
Younes Belkada	efba1a1744	Bump`flash_attn` version to `2.1` (#27079 ) * pin FA-2 to `2.1` * fix on modeling	2023-10-26 11:21:04 +02:00
Zach Mueller	90412401e6	Bring back `set_epoch` for Accelerate-based dataloaders (#26850 ) * Working tests! * Fix sampler * Fix * Update src/transformers/trainer.py Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> * Fix check * Clean --------- Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>	2023-10-26 11:20:11 +02:00
dependabot[bot]	3c2692407d	Bump urllib3 from 1.26.17 to 1.26.18 in /examples/research_projects/lxmert (#26888 ) Bump urllib3 in /examples/research_projects/lxmert Bumps [urllib3](https://github.com/urllib3/urllib3) from 1.26.17 to 1.26.18. - [Release notes](https://github.com/urllib3/urllib3/releases) - [Changelog](https://github.com/urllib3/urllib3/blob/main/CHANGES.rst) - [Commits](https://github.com/urllib3/urllib3/compare/1.26.17...1.26.18) --- updated-dependencies: - dependency-name: urllib3 dependency-type: direct:production ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>	2023-10-26 09:10:29 +02:00
dependabot[bot]	9c5240af14	Bump werkzeug from 2.2.3 to 3.0.1 in /examples/research_projects/decision_transformer (#27072 ) Bump werkzeug in /examples/research_projects/decision_transformer Bumps [werkzeug](https://github.com/pallets/werkzeug) from 2.2.3 to 3.0.1. - [Release notes](https://github.com/pallets/werkzeug/releases) - [Changelog](https://github.com/pallets/werkzeug/blob/main/CHANGES.rst) - [Commits](https://github.com/pallets/werkzeug/compare/2.2.3...3.0.1) --- updated-dependencies: - dependency-name: werkzeug dependency-type: direct:production ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>	2023-10-26 08:56:28 +02:00
corey hu	df2eebf1e7	Handle unsharded Llama2 model types in conversion script (#27069 ) Handle all unshared models types	2023-10-26 08:41:07 +02:00
Aarya Balwadkar	a2f55a65cd	Hindi translation of pipeline_tutorial.md (#26837 ) * hindi translation of pipeline_tutorial.md * Update pipeline_tutorial.md * Update build_documentation.yml * Update build_pr_documentation.yml * Updated build_documentation.yml --------- Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>	2023-10-25 11:21:49 -07:00
Yeyang	ba5144f7a9	🌐 [i18n-ZH] Translate custom_models.md into Chinese (#27065 ) * docs(zh): translate custom_models.md * minor fix in customer_models Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> --------- Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>	2023-10-25 11:20:32 -07:00
Younes Belkada	c34c50cdc0	[`docs`] Add `MaskGenerationPipeline` in docs (#27063 ) * add `MaskGenerationPipeline` in docs * Update __init__.py * fix repo consistency and clarify docstring * add on check docstirngs * actually we do have a tf sam * oops	2023-10-25 19:31:36 +02:00
Akash Kundu	ba073ea9e3	[DOCS] minor fixes in README.md (#27048 ) minor fixes	2023-10-25 10:21:13 -07:00
Leo Tronchon	ac49319372	fix no_images placement	2023-10-25 18:15:05 +02:00
Jing Hua	a64f8c1f87	[docstring] fix incorrect llama docstring: encoder -> decoder (#27071 ) fix incorrect docstring: encoder -> decoder	2023-10-25 18:09:04 +02:00
Leo Tronchon	7a45089b34	add information on gate shape	2023-10-25 18:07:53 +02:00
Leo Tronchon	699838cc61	pass cross_attention_gate similarly to no_images gate	2023-10-25 17:08:44 +02:00
Leo Tronchon	598deb124a	bring back no_images	2023-10-25 17:00:59 +02:00
Nick Hill	0baa9246cb	Fix TypicalLogitsWarper tensor OOB indexing edge case (#26579 ) * Fix TypicalLogitsWarper tensor OOB indexing edge case This can be triggerd fairly quickly with low precision e.g. bfloat16 and typical_p = 0.99. * Shift threshold index by one * Use explicit named arg for clamp min	2023-10-25 11:36:43 +01:00
Younes Belkada	06e782da4e	[`core`] Refactor of `gradient_checkpointing` (#27020 ) * v1 * fix * remove `create_custom_forward` * fixup * fixup * add test and fix all failing GC tests * remove all remaining `create_custom_forward` methods * fix idefics bug * fixup * replace with `__call__` * add comment * quality	2023-10-25 12:16:15 +02:00
Arthur	9286f0ac39	Skip-test (#27062 ) * skip plbart test * nits * update	2023-10-25 10:47:33 +02:00
Tom Aarsen	6cbc1369a3	Fix RoPE config validation for FalconConfig + various config typos (#26929 ) * Resolve incorrect ValueError in RoPE config for Falcon * Add broken codeblock tag in Falcon Config * Fix typo: an float -> a float * Implement copy functionality for Fuyu and Persimmon for RoPE scaling validation * Make style	2023-10-24 18:37:09 +01:00
JB (Don)	a0fd34483f	Add a default decoder_attention_mask for EncoderDecoderModel during training (#26752 ) * Add a default decoder_attention_mask for EncoderDecoderModel during training Since we are already creating the default decoder_input_ids from the labels, we should also create a default decoder_attention_mask to go with it. * Fix test constant that relied on manual_seed() The test was changed to use a decoder_attention_mask that ignores padding instead (which is the default one created by BERT when attention_mask is None). * Create the decoder_attention_mask using decoder_input_ids instead of labels * Fix formatting in test	2023-10-24 18:26:16 +01:00
Maria Khalusova	9333bf0769	[docs] Performance docs refactor p.2 (#26791 ) * initial edits * improvements for clarity and flow * improvements for clarity and flow, removed the repetead section * removed two docs that had no content * Revert "removed two docs that had no content" This reverts commit `e98fa2fa0d`. * Apply suggestions from code review Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * feedback addressed * more feedback addressed * feedback addressed --------- Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>	2023-10-24 13:10:06 -04:00
Patrick von Platen	13ef14e18e	Fix config silent copy in from_pretrained (#27043 ) * Fix config modeling utils * fix more * fix attn mask bug * Update src/transformers/modeling_utils.py	2023-10-24 19:05:37 +02:00
Alex McKinney	9da451713d	Device agnostic testing (#25870 ) * adds agnostic decorators and availability fns * renaming decorators and fixing imports * updating some representative example tests bloom, opt, and reformer for now * wip device agnostic functions * lru cache to device checking functions * adds `TRANSFORMERS_TEST_DEVICE_SPEC` if present, imports the target file and updates device to function mappings * comments `TRANSFORMERS_TEST_DEVICE_SPEC` code * extra checks on device name * `make style; make quality` * updates default functions for agnostic calls * applies suggestions from review * adds `is_torch_available` guard * Add spec file to docs, rename function dispatch names to backend_* * add backend import to docs example for spec file * change instances of to * Move register backend to before device check as per @statelesshz changes * make style * make opt test require fp16 to run --------- Co-authored-by: arsalanu <arsalanu@graphcore.ai> Co-authored-by: arsalanu <hzji210@gmail.com>	2023-10-24 16:49:26 +02:00
Marc Sun	41496b95da	Add fuyu device map (#26949 ) * add _no_split_modules * style * fix _no_split_modules * add doc	2023-10-24 09:10:23 -04:00
Leandro von Werra	b18e31407c	add info on TRL docs (#27024 ) * add info on TRL docs * add TRL link * tweak text * tweak text	2023-10-24 14:56:00 +02:00
amyeroberts	cb0c68069d	Safe import of rgb_to_id from FE modules (#27037 ) Safe import from FE modules	2023-10-24 13:40:16 +01:00
Arthur	7bde5d634f	[`TFxxxxForSequenceClassifciation`] Fix the eager mode after #25085 (#25751 ) * TODOS * Switch .shape -> shape_list --------- Co-authored-by: Matt <rocketknight1@gmail.com>	2023-10-24 13:33:05 +01:00
Michal Jamroz	e2d6d5ce57	Normalize only if needed (#26049 ) * Normalize only if needed * Update examples/pytorch/image-classification/run_image_classification.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * if else in one line * within block * one more place, sorry for mess * import order * Update examples/pytorch/image-classification/run_image_classification.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update examples/pytorch/image-classification/run_image_classification_no_trainer.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> --------- Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>	2023-10-24 13:32:03 +01:00
JP	576e2823a3	Add descriptive docstring to WhisperTimeStampLogitsProcessor (#25642 ) * adding in logit examples for Whisper processor * adding in updated logits processor for Whisper * adding in cleaned version of logits processor for Whisper * adding docstrings for whisper processor * making sure the formatting is correct * adding logits after doc builder * Update src/transformers/generation/logits_process.py Adding in suggested fix to the LogitProcessor description. Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com> * Update src/transformers/generation/logits_process.py Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com> * Update src/transformers/generation/logits_process.py Removing tip per suggestion. Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com> * Update src/transformers/generation/logits_process.py Removing redundant code per suggestion. Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com> * adding in revised version * adding in version with timestamp examples * Update src/transformers/generation/logits_process.py Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> * enhanced paragraph on behavior of processor * fixing doc quality issue * removing the word poem from example * adding in updated docstring * adding in new version of file after doc-builder --------- Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com> Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>	2023-10-24 12:02:06 +02:00
Yih-Dar	fc142bd775	Add `default_to_square_for_size` to `CLIPImageProcessor` (#26965 ) * fix * fix * fix * fix * fix --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2023-10-24 11:08:17 +02:00
Xuehai Pan	cc7803c0a6	Register ModelOutput as supported torch pytree nodes (#26618 ) * Register ModelOutput as supported torch pytree nodes * Test ModelOutput as supported torch pytree nodes * Update type hints for pytree unflatten functions	2023-10-24 11:02:40 +02:00
fxmarty	ede051f1b8	Fix key dtype in GPTJ and CodeGen (#26836 ) * fix key dtype in gptj and codegen * delay the key cast to a later point * fix	2023-10-24 16:55:14 +09:00
Yeyang	32f799db0d	🌐 [i18n-ZH] Translate create_a_model.md into Chinese (#27026 ) docs(zh): translate create_a_model.md	2023-10-23 15:44:42 -07:00
Mert Yanık	25c022d7c5	Fix little typo (#27028 )	2023-10-23 15:36:42 -07:00
Pedro Gabriel Gengo Lourenço	f370bebdc3	Bugfix device map detr model (#26849 ) * Fixed replace_batch_norm when on meta device * lint fix * Adding coauthor Co-authored-by: Pi Esposito <piero.skywalker@gmail.com> * Removed tests * Remove unused deps * Try to fix copy issue * try fix copy one more time * Reverted import changes --------- Co-authored-by: Pi Esposito <piero.skywalker@gmail.com>	2023-10-23 14:34:27 -04:00
jiaqiw09	b0d1d7f71a	translate `preprocessing.md` to Chinese (#26955 ) * translate preprocessing.md to Chinese * update files fixing problems mentioned in review * update files fixing problems mentioned in review --------- Co-authored-by: jiaqiw <wangjiaqi50@huawei.com>	2023-10-23 10:36:24 -07:00
Yeyang	19ae0505ae	🌐 [i18n-ZH] Translate multilingual into Chinese (#26935 ) translate multilingual into Chinese Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>	2023-10-23 10:35:17 -07:00
Patrick von Platen	33f98cfded	Remove ambiguous `padding_mask` and instead use a 2D->4D Attn Mask Mapper (#26792 ) * [Attn Mask Converter] refactor attn mask * up * Apply suggestions from code review Co-authored-by: fxmarty <9808326+fxmarty@users.noreply.github.com> * improve * rename * better cache * renaming * improve more * improve * fix bug * finalize * make style & make fix-copies * correct more * start moving attention_mask * fix llama * improve falcon * up * improve more * improve more * Update src/transformers/models/owlv2/modeling_owlv2.py * make style * make style * rename to converter * Apply suggestions from code review --------- Co-authored-by: fxmarty <9808326+fxmarty@users.noreply.github.com>	2023-10-23 18:54:00 +02:00
jiaqiw09	f09a081d27	Translate `pipeline_tutorial.md` to chinese (#26954 ) * update translation of pipeline_tutorial and preprocessing(Version1.0) * update translation of pipeline_tutorial and preprocessing(Version2.0) * update translation docs * update to fix problems mentioned in review --------- Co-authored-by: jiaqiw <wangjiaqi50@huawei.com>	2023-10-23 08:58:00 -07:00
Matt	f7354a3bd6	Remove token_type_ids from default TF GPT-2 signature (#26962 ) Remove token_type_ids from default GPT-2 signature	2023-10-23 16:18:02 +01:00
Rafael Padilla	c0b5ad9473	small typos found (#26988 ) just very small typos found	2023-10-23 11:08:39 -03:00
Arthur	f9f27b0fc2	[`SeamlessM4T`] fix copies with NLLB MoE int8 (#27018 ) fix copies on newly merged model	2023-10-23 15:25:06 +02:00

14322 Commits