transformers

mirror of https://github.com/huggingface/transformers.git synced 2025-07-31 02:02:21 +06:00

Author	SHA1	Message	Date
Sylvain Gugger	11fd2c773b	Skip failing test	2023-04-04 21:26:17 -04:00
Matt	edb704b26e	Fix inverted conditional in TF common test! (#22540 ) * Fix inverted conditional in TF common test! * Make the same change in the PT tests file * Make sure hidden states for GPT2 have the same output shape in PT/TF * Minor fix to PT implementation of token classification loss * Skip loss equivalence test for TFHubert because it keeps overflowing to inf * Compute LM loss for TF the (weird) way it's computed in PT * Skip loss equivalence test for Wav2Vec2 for the same reason as Hubert * Fix - don't try to access the hidden states property when output is a tuple	2023-04-04 21:59:54 +01:00
Sourab Mangrulkar	48fbd8fa2e	fix `_no_split_modules` for Whisper model (#22486 )	2023-04-04 13:01:32 -04:00
Shubhamai	900677487d	Flax Regnet (#21867 ) * initial commit * review changes * post model PR merge * updating doc	2023-04-04 12:41:12 -04:00
Sun Haozhe	fc5b7419d4	corrected the code comment for the output of find_pruneable_heads_and_indices (#22557 ) * corrected/clarified the code comment of find_pruneable_heads_and_indices * have run make style	2023-04-04 11:29:42 -04:00
Matt	5f3ea66bc0	Add TF port of BLIP (#22090 ) * Initial commit * more stash commit * Yet another stash commit * yet more stash commit * Mostly working except for docs / repo consistency * Stop importing model list from torch file * Add TF BLIP models to docs * Add auto classes * Move get_text_features and get_image_features * Update src/transformers/models/blip/modeling_tf_blip.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update src/transformers/models/blip/modeling_tf_blip.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update src/transformers/models/blip/modeling_tf_blip.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update src/transformers/models/blip/modeling_tf_blip_text.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update src/transformers/models/blip/modeling_tf_blip.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update src/transformers/models/blip/modeling_tf_blip.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update src/transformers/models/blip/modeling_tf_blip.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update src/transformers/models/blip/modeling_tf_blip.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update src/transformers/models/blip/modeling_tf_blip.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update tests/models/blip/test_modeling_tf_blip.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update tests/models/blip/test_modeling_tf_blip.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update src/transformers/models/blip/modeling_tf_blip.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update src/transformers/models/blip/modeling_tf_blip.py 
Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com> * Update tests/models/blip/test_modeling_tf_blip_text.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update src/transformers/models/blip/modeling_tf_blip_text.py Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com> * Update src/transformers/models/blip/modeling_tf_blip.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Use channels_last convolutions in TF (better performance + compatibility) * Remove _shape function * Move multi-line statement to one line in PT + TF * Specify tf.keras.layers instead of importing from it * Remove test_gradient_checkpointing and empty test_training methods * move some multi-line statements to one line * Update docstring for generate * Remove pruned heads set * Remove self.seq_len_dim * Fixed issues with loss computation, should resolve some tests. Also ensured that the PT version follows the config for output_attentions and output_hidden_states * ensure original model follows config in more cases * Skip the same cross-attention tests in the PT tests - didn't realize we did it twice! 
* Add training args throughout the models and layers * make fixup * Fix docstring for inputs_embeds * Add docstring for is_decoder * Add docstrings to text models * Remove redundant computation * Add unpack_inputs / keras_serializable * Add modeling_tf_blip to doctests * Add config classes for keras serialization * Changes to allow model porting with pt-to-tf * Quick fix to decoder head and test tweaks * Revert an issue with masking the embeddings outputs * Allow missing keys in some equivalence tests (for unused layers) * Add tf-pt equivalence tests back in * Update src/transformers/models/blip/modeling_tf_blip.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update src/transformers/models/blip/modeling_tf_blip_text.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update src/transformers/models/blip/modeling_tf_blip_text.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * make fixup * Refactor invert_attention_mask out into tf_utils * Re-enable cross-tests on the PT side too --------- Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com> Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>	2023-04-04 16:05:22 +01:00
Nicolas Patry	a515d0a77c	Soft error whisper. (#22475 ) * Soft error whisper. * Fix format. --------- Co-authored-by: Ubuntu <ubuntu@ip-172-31-34-94.taildb5d.ts.net>	2023-04-04 16:21:57 +02:00
Maziyar Panahi	98268b2e76	Add id2label and label2id to model's config in run_xnli (#22558 ) Add id2label and label2id to config in run_xnli	2023-04-04 09:28:57 -04:00
Younes Belkada	fa2bdffc5d	[`bnb`] Fix typo (#22556 ) Update modeling_utils.py	2023-04-04 15:26:45 +02:00
Sylvain Gugger	28fcf00607	Remove hack for dynamic modules and use Python functions instead (#22537 )	2023-04-04 09:20:13 -04:00
Viktor Scherbakov	871598be55	Implemented safetensors checkpoints save/load for Trainer (#22498 ) * implemented safetensors save/load * remove duplicated file * added tests * more tests * style fix * fix tf tests * change to list comprehension Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * review fixes + safe load for sharded checkpoint * style fix * remove rogue import * remove partial to avoid undefined exception * use naming alias instead of safetensors.torch * fix safe sharding in tests * grammar Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * update docs Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * update docs Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * minor corrections * style --------- Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>	2023-04-04 09:05:04 -04:00
Arthur	00b5887b94	🚨🚨🚨 `[NLLB Tokenizer]` Fix the prefix tokens 🚨🚨🚨 (#22313 ) * fix the prefix tokens * update fast and test values * add legacy behaviour Co-authored-by: sgugger <sylvain.gugger@gmail.com> * update disclaimer, link issue PR and behavioral changes * Apply suggestions from code review Co-authored-by: Lysandre Debut <hi@lysand.re> * styling * make a quote * quote this time --------- Co-authored-by: sgugger <sylvain.gugger@gmail.com> Co-authored-by: Lysandre Debut <hi@lysand.re>	2023-04-04 14:53:06 +02:00
TheWall9	ad5e9b6c6a	[Roformer] Fixing a bug in RoFormerEncoder where it was ignoring the length of past_key_values when generating as a decoder (#22416 ) * fix RoFormerEncoder position embedding when generating as decoder * make fixup * add test case to check generation with past key values * remove duplicated code	2023-04-04 12:50:33 +02:00
Joao Gante	1905384fd5	Generate: Add text streamer decoding options (#22544 )	2023-04-04 09:03:13 +01:00
Pavel T	41a2f3529c	Fix OPTForQuestionAnswering doc string (#22481 ) * Fix OPTForQuestionAnswering doc string for more adequate model answer decoding * black style fix * doc-builder style	2023-04-03 21:05:31 -04:00
Younes Belkada	159ff3342c	Update test_image_processing_pix2struct.py (#22543 )	2023-04-03 15:26:35 -04:00
Sylvain Gugger	c14d31294e	Skip failing test	2023-04-03 14:07:40 -04:00
Xuehai Pan	4169dc84bf	[setup] migrate setup script to `pyproject.toml` (#22539 ) * [setup] migrate setup script to `pyproject.toml` * [setup] cleanup configurations * remove unused imports	2023-04-03 14:03:41 -04:00
Vladimir Blagojevic	a17841ac49	Generate: Enable easier TextStreamer customization (#22516 )	2023-04-03 18:49:38 +01:00
Xuehai Pan	80d1319e1b	[setup] drop deprecated `distutils` usage (#22531 ) * [setup] drop deprecated `distutils` usage * drop deprecated `distutils.util.strtobool` usage * fix import order * reformat docstring by `doc-builder`	2023-04-03 12:04:24 -04:00
Ilya	4c33a0c4fc	Fix missing metrics with multiple eval datasets (#22536 )	2023-04-03 12:03:57 -04:00
Younes Belkada	d7a4f5becc	[`T5`] Enable naive Pipeline Parallelism training for T5 (#22535 ) * enable PP for T5 * make fixup * fix failing tests	2023-04-03 17:55:37 +02:00
Younes Belkada	cab048fb35	[`Trainer`] Force `is_model_parallel` when model is loaded in multiple GPUs using `accelerate` (#22532 ) * add `is_model_parallel` arg on Trainer * add warning * adapt from suggestions * revert t5 changes * remove commas * adapt from suggestions	2023-04-03 17:10:50 +02:00
zhbh01	aecbcb3680	[BLIP] fix cross attentions for BlipTextEncoder (#22515 )	2023-04-03 11:00:26 -04:00
Thibault Douzon	4e441e529c	fix LayoutLMv3TokenizerFast subword label after 'Ġ' token (#21695 ) LayoutLMv3TokenizerFast produces empty 'Ġ' token with `offset_mapping = (0, 0)`. Next token is wrongly assumed to also be beginning of word and isn't correctly assigned `pad_token_label`. Modify test with text that produce 'Ġ' token. Remove copy check from LayoutLMv2TokenizerFast for `_batch_encode_plus`. solves issue: #19978	2023-04-03 10:32:36 -04:00
Kirill	a60010566a	llama docs: fix conversion script url (#22514 )	2023-04-03 10:28:40 -04:00
larekrow	9419f144ad	Fix convert_opt_original_pytorch_checkpoint_to_pytorch.py typo (#22526 ) `load_checkpoint()` silently fails because `".qkj_proj." in key` is always `False`, but will eventually cause an error at `model.load_state_dict(state_dict)`.	2023-04-03 10:06:52 -04:00
Joao Gante	a55a822adf	Generate: `TextIteratorStreamer` (streamer for gradio) (#22501 ) * haha text go brrr (but in gradio)	2023-04-03 15:04:37 +01:00
Mohammed Jabir	7d25c9c81e	added biogpt token classifier (#22447 ) * added biogpt token classifier * fix reviews * Updated modeling_biogpt.py Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com> --------- Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>	2023-04-03 09:20:02 -04:00
Jungnerd	1194c3e315	[WIP] docs: ko: sagemaker.mdx (#22509 ) docs: ko: sagemaker.mdx	2023-04-03 09:17:02 -04:00
Arthur	c0f99b4d2e	Fix llama tokenizer (#22402 ) * draft * update tokenization limma and conversion script * more updates * initial commit * style * default pad to None * draft tokenization tests * update test * update tokenization tests * nits * update * versioning test * major fix * fix more tests * finish fixing special masks * last nit * more nits * add encode decode tests * add more * fix token type ids * style	2023-04-03 09:07:32 -04:00
Eli Simhayev	9eae4aa576	[Time-Series] fix past_observed_mask type (#22076 ) added > 0.5 to `past_observed_mask`	2023-04-03 09:07:21 -04:00
amyeroberts	559a45d1dc	Backbone add out indices (#22493 ) * Add out_indices to backbones, deprecate out_features * Update - can specify both out_features and out_indices but not both * Can specify both * Fix copies * Add out_indices to convnextv2 configuration	2023-04-03 11:06:25 +01:00
kevinpro	db803b6919	Update convert_llama_weights_to_hf.py (#22525 )	2023-04-03 10:41:39 +01:00
Sylvain Gugger	c612628045	Test fetch v2 (#22367 ) * Test fetcher v2 * Fix regexes * Remove sanity check * Fake modification to OPT * Fixes some .sep issues * Remove fake OPT change * Fake modif for BERT * Fake modif for init * Exclude SageMaker tests * Fix test and remove fake modif * Fake setup modif * Fake pipeline modif * Remove all fake modifs * Adds options to skip/force tests * [test-all-models] Fake modif for BERT * Try this way * Does the command actually work? * [test-all-models] Try again! * [skip circleci] Remove fake modif * Remove debug statements * Add the list of important models * Quality * Update utils/tests_fetcher.py Co-authored-by: Lysandre Debut <lysandre.debut@reseau.eseo.fr> * Address review comments * Address review comments * Fix and add test * Apply suggestions from code review Co-authored-by: Yih-Dar <2521628+ydshieh@users.noreply.github.com> * Address review comments --------- Co-authored-by: Lysandre Debut <lysandre.debut@reseau.eseo.fr> Co-authored-by: Yih-Dar <2521628+ydshieh@users.noreply.github.com>	2023-03-31 16:18:43 -04:00
Sabine	3a9464bd30	Update Neptune callback docstring (#22497 ) * update NeptuneCallback docstring * formatting * apply make style --------- Co-authored-by: Aleksander Wojnarowicz <alwojnarowicz@gmail.com>	2023-03-31 15:38:34 -04:00
dependabot[bot]	6fc44656b4	Bump redis from 4.5.3 to 4.5.4 in /examples/research_projects/decision_transformer (#22494 ) Bump redis in /examples/research_projects/decision_transformer Bumps [redis](https://github.com/redis/redis-py) from 4.5.3 to 4.5.4. - [Release notes](https://github.com/redis/redis-py/releases) - [Changelog](https://github.com/redis/redis-py/blob/master/CHANGES) - [Commits](https://github.com/redis/redis-py/compare/v4.5.3...v4.5.4) --- updated-dependencies: - dependency-name: redis dependency-type: direct:production ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>	2023-03-31 10:50:33 -04:00
Nicolas Patry	d143087d18	Making sure we can use safetensors to serialize all the time. (#22437 ) * Making sure we can use safetensors to serialize all the time. * Expanding the tests for increased coverage. * Update the test. * Getting current state of affairs. * Tentative fix. * Fixing black version. * Fixing the worst offenders. * Try to modify less files. * Fixing blip_2 (Weird solution right now). * Fixing deta. * Fix blip ? * Missing extra newline. * No deta modification. * Adding some comments. * Apply suggestions from code review Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Addressing comments. * Addressing comments. * creating warn_once. * Warning_once ! --------- Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>	2023-03-31 16:07:35 +02:00
Yih-Dar	516077b3b0	Update `Wav2Vec2ProcessorWithLM` doc example (#22474 ) fix Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2023-03-31 14:17:40 +02:00
lewtun	da68fd691c	Relax `eos_token_id < 0` checks in `generate()` from `ValueError` to warning (#22472 ) * Relax checks from to warning * Fix style * Replace warnings with logger * Use warning vs warn	2023-03-31 09:09:40 +02:00
Yih-Dar	0fe6c6bdca	(Re-)Enable Nightly + Past CI (#22393 ) * Enable Nightly + Past CI * put schedule --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2023-03-30 21:06:35 +02:00
Manuel de Prada	d5de578c22	Docs fix: Multinomial sampling decoding needs "num_beams=1", since by default it is usually not 1. (#22473 ) Fix: Multinomial sampling needs "num_beams=1", since by default it is 5.	2023-03-30 11:04:12 -04:00
Joao Gante	165dd6dc91	Llama: support for `max_position_embeddings` (#22471 ) * Llama now supports max_position_embeddings * Save config; Cosmetic edits	2023-03-30 15:54:01 +01:00
Arthur	349e1242d9	[NLLB-MoE] `model_type` update for auto mapping (#22470 ) edit default model type and testing path set to hf-internal-testing	2023-03-30 15:36:07 +02:00
Roy Hvaara	11426641dc	Guard imports of PreTrainedTokenizerFast on is_tokenizers_available (#22285 ) Guard imports that use the tokenizers library	2023-03-30 09:16:03 -04:00
amyeroberts	4d7a5b5ba3	🚨🚨🚨 Fix ordering of height, width for BLIP image processor (#22466 ) Fix ordering of height,width for BLIP	2023-03-30 14:02:16 +01:00
Joao Gante	228792a9dc	Generate: basic token streaming (#22449 ) * haha tokens go brrrr	2023-03-30 12:00:12 +01:00
amyeroberts	f0aeb1be17	Skip flaky NLLB Moe test for now (#22463 ) Skip flaky test for now	2023-03-30 11:30:19 +01:00
amyeroberts	154c6bb7ac	Rescale image back if it was scaled during PIL conversion (#22458 ) * Rescale image back if it was scaled during PIL conversion * do_rescale is defined if PIL image passed in	2023-03-30 11:29:11 +01:00
amyeroberts	c15f937581	Move common properties to BackboneMixin (#21855 ) * Move common properties to BackboneMixin * Fix failing tests * Update ConvNextV2 backbone	2023-03-30 10:04:11 +01:00

12510 Commits