transformers

mirror of https://github.com/huggingface/transformers.git synced 2025-07-06 06:10:04 +06:00

Author	SHA1	Message	Date
Dani Martí	a84c413773	HfArgumentParser: allow for hyhenated field names in long-options (#33990 ) Allow for hyphenated field names in long-options argparse converts hyphens into underscores before assignment (e.g., an option passed as `--long-option` will be stored under `long_option`), So there is no need to pass options as literal attributes, as in `--long_option` (with an underscore instead of a hyphen). This commit ensures that this behavior is respected by `parse_args_into_dataclasses` as well. Issue: #33933 Co-authored-by: Daniel Marti <mrtidm@amazon.com>	2024-10-10 11:58:26 +02:00
Joao Gante	38f9f10dd9	Cache: revert DynamicCache init for BC (#33861 ) * tmp commit * tmp commit * make fixup * missing removal * fix condition * fix end-to-end compilation * if -> elif * BC * BC * use @deprecate_kwarg("num_hidden_layers", version="4.47.0") * wups the import * 🥴 --------- Co-authored-by: Arthur Zucker <arthur.zucker@gmail.com>	2024-10-04 22:47:08 +02:00
Raushan Turganbay	061c2c4c38	Ignore keys on `validate_rope` (#33753 ) * ignore keys on check rope * add tests * fix tests, so maybe better leave at logger lvl	2024-10-04 12:39:37 +02:00
Joao Gante	b0c5660e88	Config: lower `save_pretrained` exception to warning (#33906 ) * lower to warning * msg * make fixup * rm extra comma	2024-10-03 16:45:14 +01:00
Guang Yang	808997a634	Fix passing str dtype to static cache (#33741 ) Co-authored-by: Guang Yang <guangyang@fb.com>	2024-10-01 09:50:17 +02:00
Joao Gante	3557f9a14a	Generate: `can_generate()` recursive check (#33718 ) * add recursive check and test warnings * missing space * models without can_generate	2024-09-26 18:11:14 +01:00
Arthur	19d58d31f1	Add MLLama (#33703 ) * current changes * nit * Add cross_attenttion_mask to processor * multi-image fixed * Add cross_attenttion_mask to processor * cross attn works in all cases * WIP refactoring function for image processor * WIP refactoring image processor functions * Refactor preprocess to use global loops instead of list nested list comps * Docstrings * Add channels unification * fix dtype issues * Update docsrings and format * Consistent max_image_tiles * current script * updates * Add convert to rgb * Add image processor tests * updates! * update * god damn it I am dumb sometimes * Precompute aspect ratios * now this works, full match * fix 😉 * nits * style * fix model and conversion * nit * nit * kinda works * hack for sdpa non-contiguous bias * nits here and there * latest c hanges * merge? * run forward * Add aspect_ratio_mask * vision attention mask * update script and config variable names * nit * nits * be able to load * style * nits * there * nits * make forward run * small update * enable generation multi-turn * nit * nit * Clean up a bit for errors and typos * A bit more constant fixes * 90B keys and shapes match * Fix for 11B model * Fixup, remove debug part * Docs * Make max_aspect_ratio_id to be minimal * Update image processing code to match new implementation * Adjust conversion for final checkpoint state * Change dim in repeat_interleave (accordig to meta code) * tmp fix for num_tiles * Fix for conversion (gate<->up, q/k_proj rope permute) * nits * codestyle * Vision encoder fixes * pass cross attn mask further * Refactor aspect ratio mask * Disable text-only generation * Fix cross attention layers order, remove q/k norm rotation for cross atention layers * Refactor gated position embeddings * fix bugs but needs test with new weights * rope scaling should be llama3 * Fix rope scaling name * Remove debug for linear layer * fix copies * Make mask prepare private func * Remove linear patch embed * Make precomputed embeddings as nn.Embedding module * MllamaPrecomputedAspectRatioEmbedding with config init * Remove unused self.output_dim * nit, intermediate layers * Rename ln and pos_embed * vision_chunk_size -> image_size * return_intermediate -> intermediate_layers_indices * vision_input_dim -> hidden_size * Fix copied from statements * fix most tests * Fix more copied from * layer_id->layer_idx * Comment * Fix tests for processor * Copied from for _prepare_4d_causal_attention_mask_with_cache_position * Style fix * Add MllamaForCausalLM * WIP fixing tests * Remove duplicated layers * Remove dummy file * Fix style * Fix consistency * Fix some TODOs * fix language_model instantiation, add docstring * Move docstring, remove todos for precomputed embeds (we cannot init them properly) * Add initial docstrings * Fix * fix some tests * lets skip these * nits, remove print, style * Add one more copied from * Improve test message * Make validate func private * Fix dummy objects * Refactor `data_format` a bit + add comment * typos/nits Co-authored-by: Pablo Montalvo <39954772+molbap@users.noreply.github.com> * fix dummy objects and imports * Add chat template config json * remove num_kv_heads from vision attention * fix * move some commits and add more tests * fix test * Remove `update_key_name` from modeling utils * remove num-kv-heads again * some prelimiary docs * Update chat template + tests * nit, conversion script max_num_tiles from params * Fix warning for text-only generation * Update conversion script for instruct models * Update chat template in converstion + test * add tests for CausalLM model * model_max_length, avoid null chat_template * Refactor conversion script * Fix forward * Fix integration tests * Refactor vision config + docs * Fix default * Refactor text config * Doc fixes * Remove unused args, fix docs example * Squashed commit of the following: commit b51ce5a2efffbecdefbf6fc92ee87372ec9d8830 Author: qubvel <qubvel@gmail.com> Date: Wed Sep 18 13:39:15 2024 +0000 Move model + add output hidden states and output attentions * Fix num_channels * Add mllama text and mllama vision models * Fixing repo consistency * Style fix * Fixing repo consistency * Fixing unused config params * Fix failed tests after refactoring * hidden_activation -> hidden_act for text mlp * Remove from_pretrained from sub-configs * Apply suggestions from code review Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> * Update src/transformers/models/mllama/convert_mllama_weights_to_hf.py Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> * Reuse lambda in conversion script * Remove run.py * Update docs/source/en/model_doc/mllama.md Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> * Update src/transformers/models/mllama/processing_mllama.py Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> * Remove unused LlamaTokenizerFast * Fix logging * Refactor gating * Remove cycle for collecting intermediate states * Refactor text-only check, add integration test for text-only * Revert from pretrained to configs * Fix example * Add auto `bos_token` adding in processor * Fix tips * Update src/transformers/models/auto/tokenization_auto.py Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> * Enable supports_gradient_checkpointing model flag * add eager/sdpa options * don't skip attn tests and bring back GC skips (did i really remove those?) * Fix signature, but get error with None gradient * Fix output attention tests * Disable GC back * Change no split modules * Fix dropout * Style * Add Mllama to sdpa list * Add post init for vision model * Refine config for MllamaForCausalLMModelTest and skipped tests for CausalLM model * if skipped, say it, don't pass * Clean vision tester config * Doc for args * Update tests/models/mllama/test_modeling_mllama.py Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> * Add cross_attention_mask to test * typehint * Remove todo * Enable gradient checkpointing * Docstring * Style * Fixing and skipping some tests for new cache * Mark flaky test * Skip `test_sdpa_can_compile_dynamic` test * Fixing some offload tests * Add direct GenerationMixin inheritance * Remove unused code * Add initializer_range to vision config * update the test to make sure we show if split * fix gc? * Fix repo consistency * Undo modeling utils debug changes * Fix link * mllama -> Mllama * [mllama] -> [Mllama] * Enable compile test for CausalLM model (text-only) * Fix TextModel prefix * Update doc * Docs for forward, type hints, and vision model prefix * make sure to reset * fix init * small script refactor and styling * nit * updates! * some nits * Interpolate embeddings for 560 size and update integration tests * nit * does not suppor static cache! * update * fix * nit2 * this? * Fix conversion * Style * 4x memory improvement with image cache AFAIK * Token decorator for tests * Skip failing tests * update processor errors * fix split issues * style * weird * style * fix failing tests * update * nit fixing the whisper tests * fix path * update --------- Co-authored-by: raushan <raushan@huggingface.co> Co-authored-by: pavel <ubuntu@ip-10-90-0-11.ec2.internal> Co-authored-by: qubvel <qubvel@gmail.com> Co-authored-by: Pablo Montalvo <39954772+molbap@users.noreply.github.com> Co-authored-by: ydshieh <ydshieh@users.noreply.github.com> Co-authored-by: Pedro Cuenca <pedro@huggingface.co>	2024-09-25 19:56:25 +02:00
Joao Gante	e15687fffe	Generation: deprecate `PreTrainedModel` inheriting from `GenerationMixin` (#33203 )	2024-09-23 18:28:36 +01:00
Yih-Dar	077b552f07	Fix some missing tests in circleci (#33559 ) * fix * fix * fix * fix * skip * skip more --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2024-09-20 20:58:51 +02:00
Fanli Lin	b87755aa6d	[tests] skip tests for xpu (#33553 ) * enable * fix * add xpu skip * add marker * skip for xpu * add more * add one more	2024-09-19 19:28:04 +01:00
Joao Gante	7542fac2c7	Pipeline: no side-effects on `model.config` and `model.generation_config` 🔫 (#33480 )	2024-09-18 15:43:06 +01:00
Yoni Gozlan	d8500cd229	Uniformize kwargs for Pixtral processor (#33521 ) * add uniformized pixtral and kwargs * update doc * fix _validate_images_text_input_order * nit	2024-09-17 14:44:27 -04:00
Guang Yang	f38590dade	Make StaticCache configurable at model construct time (#32830 ) * Make StaticCache configurable at model construct time * integrations import structure * add new doc file to toc --------- Co-authored-by: Guang Yang <guangyang@fb.com> Co-authored-by: Joao Gante <joao@huggingface.co>	2024-09-10 16:35:57 +01:00
Lysandre Debut	f24f084329	Import structure & first three model refactors (#31329 ) * Import structure & first three model refactors * Register -> Export. Export all in __all__. Sensible defaults according to filename. * Apply most comments from Amy and some comments from Lucain Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> Co-authored-by: Lucain Pouget <lucainp@gmail.com> * Style * Add comment * Clearer .py management * Raise if not in backend mapping * More specific type * More efficient listdir * Misc fixes --------- Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> Co-authored-by: Lucain Pouget <lucainp@gmail.com>	2024-09-10 11:10:53 +02:00
Ita Zaporozhets	363301f221	support loading model without config.json file (#32356 ) * support loading model without config.json file * fix condition * update tests * add test * ruff * ruff * ruff	2024-09-06 13:49:47 +02:00
Yoni Gozlan	9230d78e76	Add validate images and text inputs order util for processors and test_processing_utils (#33285 ) * Add validate images and test processing utils * Remove encoded text from possible inputs in tests * Removed encoded inputs as valid in processing_utils * change text input check to be recursive * change text check to all element of lists and not just the first one in recursive checks	2024-09-04 13:50:31 -04:00
Alex Sherstinsky	122ded0a11	Bugfix/alexsherstinsky/fix none check for attention factor in rope scaling 2024 08 28 0 (#33188 ) * Fixing a bug in the way "attention_factor" is validated in ROPE utilities. * Fixing a bug in the way "attention_factor" is validated in ROPE utilities. * Fixing a bug in the way "attention_factor" is validated in ROPE utilities.	2024-09-04 17:01:12 +02:00
Raushan Turganbay	ebbe8d8014	Cache docs: update (#32929 ) * some changes * more updates * fix cache copy * nits * nits * add tests	2024-09-04 15:05:31 +05:00
Gerben van V	5129671290	Add a static cache that offloads to the CPU or other device (#32161 ) * Add a static cache that offloads to the CPU or other device * Fix PR comments, add unit-tests	2024-08-29 11:51:09 +02:00
rasmi	f9ed05dd03	Fix import paths for test_module (#32888 ) * Fix import path for test_feature_extraction_utils.py See https://github.com/huggingface/transformers/pull/32601 * Fix import path for test_image_processing_utils.py	2024-08-28 12:08:29 +01:00
Joao Gante	ab0ac3b98f	CI: fix `efficientnet` pipeline timeout and prevent future similar issues due to large image size (#33123 ) * fix param not being passed in tested; add exceptions * better source of model name * Update utils/create_dummy_models.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> --------- Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>	2024-08-27 11:58:27 +01:00
Joao Gante	72d4a3f9c1	mps: add `isin_mps_friendly`, a wrapper function for `torch.isin` (#33099 )	2024-08-26 15:34:19 +01:00
Joao Gante	970a16ec7f	Forbid `PretrainedConfig` from saving `generate` parameters; Update deprecations in `generate`-related code 🧹 (#32659 ) Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>	2024-08-23 11:12:53 +01:00
Zach Mueller	8ec028aded	Reduce the error log when using core models that need their weights renamed, and provide a step forward (#32656 ) * Fin * Modify msg * Finish up nits	2024-08-16 13:05:57 -04:00
Joao Gante	cf32ee1753	Cache: use `batch_size` instead of `max_batch_size` (#32657 ) * more precise name * better docstrings * Update src/transformers/cache_utils.py Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> --------- Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>	2024-08-16 11:48:45 +01:00
Sai-Suraj-27	df323476a3	fix: Fixed failing tests in `tests/utils/test_add_new_model_like.py` (#32678 ) * Fixed failing tests in tests/utils/test_add_new_model_like.py * Fixed formatting using ruff. * Small nit.	2024-08-14 12:06:17 +01:00
Sai-Suraj-27	c3cd9d807e	Fix: Fixed directory path for utils folder in `test_tokenization_utils.py` (#32601 ) * Removed un-necessary expressions. * Fixed directory path for utils folder in test_tokenization_utils.py	2024-08-13 16:48:15 +01:00
Pablo Montalvo	a5a8291ad1	Fix tests (#32649 ) * skip failing tests * [no-filter] * [no-filter] * fix wording catch in FA2 test * [no-filter] * trigger normal CI without filtering	2024-08-13 09:46:21 +01:00
Lysandre Debut	29c3a0fa01	Automatically add `transformers` tag to the modelcard (#32623 ) * Automatically add `transformers` tag to the modelcard * Specify library_name and test	2024-08-13 07:59:01 +02:00
Quentin Gallouédec	f1c8542ff7	"to be not" -> "not to be" (#32636 ) * "to be not" -> "not to be" * Update sam.md * Update trainer.py * Update modeling_utils.py * Update test_modeling_utils.py * Update test_modeling_utils.py	2024-08-12 20:20:17 +01:00
Sai-Suraj-27	ce4b28830a	fix: Fixed failing `test_find_base_model_checkpoint` (#32638 ) Fixed failing test_find_base_model_checkpoint.	2024-08-12 19:51:30 +01:00
Guang Yang	0164560353	Fixed test `test_static_cache_exportability` with torch 2.4.0 (#32516 ) Workaround the export issue in torch 2.4 Co-authored-by: Guang Yang <guangyang@fb.com>	2024-08-08 18:13:40 +01:00
amyeroberts	7e5d46ded4	Respect the config's attn_implementation if set (#32383 ) * Respect the config's attn if set * Update test - can override in from_config * Fix	2024-08-05 16:33:19 +01:00
Joao Gante	083e13b7c4	RoPE: Add numerical tests ✨ (#32380 ) tests! :D	2024-08-02 09:39:45 +01:00
OsamaS99	51ab25e293	Fixed Hybrid Cache Shape Initialization. (#32163 ) * fixed hybrid cache init, added test * Fix Test Typo --------- Co-authored-by: Aaron Haag <aaron.haag@siemens.com>	2024-08-01 13:57:42 +01:00
Nikos Karampatziakis	ca59d6f77c	Offloaded KV Cache (#31325 ) * Initial implementation of OffloadedCache * enable usage via cache_implementation * Address feedback, add tests, remove legacy methods. * Remove flash-attn, discover synchronization bugs, fix bugs * Prevent usage in CPU only mode * Add a section about offloaded KV cache to the docs * Fix typos in docs * Clarifications and better explanation of streams	2024-08-01 14:42:07 +02:00
Guang Yang	811a9caa21	Make static cache compatible with torch.export (#32168 )	2024-07-29 18:19:15 +01:00
Joao Gante	7ffe25f2b9	Generate: end-to-end compilation (#30788 ) * mvp * added test (a few models need fixes) * fix a few test cases * test nits * harder test 😈 * revert changes in stablelm * test with improved condition * add todo * tmp commit * merged with main * nits * add todo * final corrections * add docs for generation compilation * docs nits * add tip * PR suggestions * add more details to the compilation docs * fix cache positions * cache is now init in generate; update docs * tag test as flaky * docs * post rebase make fixup and other nits * remove unintended changes * whisper (encoder-decoder) not supported * move token default updates to ; add tests for token defaults * push changes * manual rebase * chameleon doesn't support this * fix test_static_cache_mha_mqa_gqa (broken in another PR) * docs: dynamic is better with end-to-end compilation	2024-07-29 10:52:13 +01:00
Fanli Lin	27c7f971c0	[tests] fix `static` cache implementation is not compatible with `attn_implementation==flash_attention_2` (#32039 ) * add flash attention check * fix * fix	2024-07-26 11:41:27 +02:00
Yih-Dar	df6eee9201	Follow up for #31973 (#32025 ) * fix * [test_all] trigger full CI --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2024-07-25 16:12:23 +02:00
Joao Gante	2e113422b3	Llama: RoPE refactor (#32135 ) Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>	2024-07-23 10:42:55 +01:00
Sanchit Gandhi	f83c6f1d02	Remove `trust_remote_code` when loading Libri Dummy (#31748 ) * [whisper integration] use parquet dataset for testing * propagate to others * more propagation * last one	2024-07-23 14:54:38 +08:00
amyeroberts	817a676bd7	Don't default to other weights file when use_safetensors=True (#31874 ) * Don't default to other weights file when use_safetensors=True * Add tests * Update tests/utils/test_modeling_utils.py * Add clarifying comments to tests * Update tests/utils/test_modeling_utils.py * Update tests/utils/test_modeling_utils.py	2024-07-22 18:29:50 +01:00
Sai-Suraj-27	0fcfc5ccc9	fix: Replaced deprecated `mktemp()` function (#32123 ) Replaced deprecated mktemp function.	2024-07-22 14:13:39 +01:00
Zach Mueller	693cb828ff	Fix bad test about slower init (#32002 ) Bronked main	2024-07-16 10:33:05 -04:00
Zach Mueller	e0dfd7bcaf	Speedup model init on CPU (by 10x+ for llama-3-8B as one example) (#31771 ) * 1,100%! * Clean * Don't touch DS * Experiment with dtype allocation * skip test_load_save_without_tied_weights test * A little faster * Include proper upscaling? * Fixup tests * Potentially skip? * Let's see if this fixes git history * Maintain new dtype * Fin * Rm hook idea for now * New approach, see what breaks * stage * Clean * Stash * Should be fin now, just need to mark failing models * Clean up * Simplify * Deal with weird models * Enc/Dec * Skip w/ reason * Adjust test * Fix test * one more test * Keep experimenting * Fix ref * TO REMOVE: testing feedback CI * Right push * Update tests/utils/test_modeling_utils.py Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> * disable * Add new func * Test nits from Amy * Update src/transformers/modeling_utils.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Adjust comment * Adjust comment on skip * make private * Fin * Should be a not flag * Clarify and rename test --------- Co-authored-by: Marc Sun <marc@huggingface.co> Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>	2024-07-16 09:32:01 -04:00
Yih-Dar	a1a34657d4	Avoid race condition (#31973 ) * [test_all] hub * remove delete * remove delete * remove delete * remove delete * remove delete * remove delete * [test_all] * [test_all] * [test_all] * [test_all] * [test_all] * [test_all] --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2024-07-15 17:56:24 +02:00
Joao Gante	739a63166d	Generate: remove deprecated code due to `Cache` and `cache_position` being default (#31898 ) * tmp commit * shorter * nit * explicit kwargs * propagate changes * mass propagation with a few manual touches (let's see how CI behaves) * fix cacheless case * Update src/transformers/generation/utils.py Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> * make fixup --------- Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>	2024-07-14 15:16:58 +01:00
Omar Salman	1499a55008	Add warning message for beta and gamma parameters (#31654 ) * Add warning message for and parameters * Fix when the warning is raised * Formatting changes * Improve testing and remove duplicated warning from _fix_key	2024-07-11 13:01:47 +01:00
Joao Gante	4c2538b863	Test loading generation config with safetensor weights (#31550 ) fix test	2024-07-09 16:22:43 +02:00
Marc Sun	8c5c180de0	Fix serialization for offloaded model (#31727 ) * Fix serialization * style * add test	2024-07-05 08:07:07 +02:00
Matt	cd0935dd55	Make tool JSON schemas consistent (#31756 ) Make the order of array items consistent using sorted()	2024-07-02 20:00:42 +01:00
Yih-Dar	93cd94b79d	Move some test files (`tets/test_xxx_utils.py`) to `tests/utils` (#31730 ) * move * move * move * move * Update tests/utils/test_image_processing_utils.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com> Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>	2024-07-02 13:46:03 +02:00
amyeroberts	1de7dc7403	Skip tests properly (#31308 ) * Skip tests properly * [test_all] * Add 'reason' as kwarg for skipTest * [test_all] Fix up * [test_all]	2024-06-26 21:59:08 +01:00
Zhiyong Wang	dce253f645	Add implementation of `spectrogram_batch` (#27159 ) * Add initial implementation of `spectrogram_batch` * Format the initial implementation * Add test suite for the `spectrogram_batch` * Update `spectrogram_batch` to ensure compatibility with test suite * Update `spectrogram_batch` to include pre and post-processing * Add `amplitude_to_db_batch` function and associated tests * Add `power_to_db_batch` function and associated tests * Reimplement the test suite for `spectrogram_batch` * Fix errors in `spectrogram_batch` * Add the function annotation for `spectrogram_batch` * Address code quality * Re-add `test_chroma_equivalence` function * Update src/transformers/audio_utils.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update src/transformers/audio_utils.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> --------- Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>	2024-06-24 09:19:12 +02:00
Albert Villanova del Moral	a14b055b65	Pass datasets trust_remote_code (#31406 ) * Pass datasets trust_remote_code * Pass trust_remote_code in more tests * Add trust_remote_dataset_code arg to some tests * Revert "Temporarily pin datasets upper version to fix CI" This reverts commit `b7672826ca`. * Pass trust_remote_code in librispeech_asr_dummy docstrings * Revert "Pin datasets<2.20.0 for examples" This reverts commit `833fc17a3e`. * Pass trust_remote_code to all examples * Revert "Add trust_remote_dataset_code arg to some tests" to research_projects * Pass trust_remote_code to tests * Pass trust_remote_code to docstrings * Fix flax examples tests requirements * Pass trust_remote_dataset_code arg to tests * Replace trust_remote_dataset_code with trust_remote_code in one example * Fix duplicate trust_remote_code * Replace args.trust_remote_dataset_code with args.trust_remote_code * Replace trust_remote_dataset_code with trust_remote_code in parser * Replace trust_remote_dataset_code with trust_remote_code in dataclasses * Replace trust_remote_dataset_code with trust_remote_code arg	2024-06-17 17:29:13 +01:00
Matt	edc1dffd00	Chat Template support for function calling and RAG (#30621 ) * First draft, still missing automatic function conversion * First draft of the automatic schema generator * Lots of small fixes * the walrus has betrayed me * please stop committing your debug breakpoints * Lots of cleanup and edge cases, looking better now * Comments and bugfixes for the type hint parser * More cleanup * Add tests, update schema generator * Update tests, proper handling of return values * Small docstring change * More doc updates * More doc updates * Add json_schema decorator * Clean up the TODOs and finish the docs * self.maxDiff = None to see the whole diff for the nested list test * add import for add_json_schema * Quick test fix * Fix something that was bugging me in the chat template docstring * Less "anyOf" when unnecessary * Support return types for the templates that need them * Proper return type tests * Switch to Google format docstrings * Update chat templating docs to match new format * Stop putting the return type in with the other parameters * Add Tuple support * No more decorator - we just do it implicitly! * Add enum support to get_json_schema * Update docstring * Add copyright header * Update src/transformers/tokenization_utils_base.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update docs/source/en/chat_templating.md Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update src/transformers/utils/chat_template_utils.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update src/transformers/utils/chat_template_utils.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Add copyright header * make fixup * Fix indentation * Reformat chat_template_utils * Correct return value * Make regexes module-level * Support more complex, multi-line arg docstrings * Update error message for ... * Update ruff * Add document type validation * Refactor docs * Refactor docs * Refactor docs * Clean up Tuple error * Add an extra test for very complex defs and docstrings and clean everything up for it * Document enum block * Quick test fixes * Stop supporting type hints in docstring to fix bugs and simplify the regex * Update docs for the regex change * Clean up enum regex * Wrap functions in {"type": "function", "function": ...} * Update src/transformers/utils/chat_template_utils.py Co-authored-by: Pablo Montalvo <39954772+molbap@users.noreply.github.com> * Temporary tool calling commit * Add type hints to chat template utils, partially update docs (incomplete!) * Code cleanup based on @molbap's suggestion * Add comments to explain regexes * Fix up type parsing for unions and lists * Add custom exception types and adjust tests to look for them * Update docs with a demo! * Docs cleanup * Pass content as string * Update tool call formatting * Update docs with new function format * Update docs * Update docs with a second tool to show the model choosing correctly --------- Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> Co-authored-by: Pablo Montalvo <39954772+molbap@users.noreply.github.com>	2024-06-11 15:46:38 +01:00
Pavel Iakubovskii	517df566f5	Decorators for deprecation and named arguments validation (#30799 ) * Fix do_reduce_labels for maskformer image processor * Deprecate reduce_labels in favor to do_reduce_labels * Deprecate reduce_labels in favor to do_reduce_labels (segformer) * Deprecate reduce_labels in favor to do_reduce_labels (oneformer) * Deprecate reduce_labels in favor to do_reduce_labels (maskformer) * Deprecate reduce_labels in favor to do_reduce_labels (mask2former) * Fix typo * Update mask2former test * fixup * Update segmentation examples * Update docs * Fixup * Imports fixup * Add deprecation decorator draft * Add deprecation decorator * Fixup * Add deprecate_kwarg decorator * Validate kwargs decorator * Kwargs validation (beit) * fixup * Kwargs validation (mask2former) * Kwargs validation (maskformer) * Kwargs validation (oneformer) * Kwargs validation (segformer) * Better message * Fix oneformer processor save-load test * Update src/transformers/utils/deprecation.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update src/transformers/utils/deprecation.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update src/transformers/utils/deprecation.py Co-authored-by: Pablo Montalvo <39954772+molbap@users.noreply.github.com> * Update src/transformers/utils/deprecation.py Co-authored-by: Pablo Montalvo <39954772+molbap@users.noreply.github.com> * Better handle classmethod warning * Fix typo, remove warn * Add header * Docs and `additional_message` * Move to filter decorator ot generic * Proper deprecation for semantic segm scripts * Add to __init__ and update import * Basic tests for filter decorator * Fix doc * Override `to_dict()` to pop depracated `_max_size` * Pop unused parameters * Fix trailing whitespace * Add test for deprecation * Add deprecation warning control parameter * Update generic test * Fixup deprecation tests * Introduce init service kwargs * Revert popping unused params * Revert oneformer test * Allow "metadata" to pass * Better docs * Fix test * Add notion in docstring * Fix notification for both names * Add func name to warning message * Fixup --------- Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> Co-authored-by: Pablo Montalvo <39954772+molbap@users.noreply.github.com>	2024-06-10 12:35:10 +01:00
Lucain	c3044ec2f3	Use `HF_HUB_OFFLINE` + fix has_file in offline mode (#31016 ) * Fix has_file in offline mode * harmonize env variable for offline mode * Switch to HF_HUB_OFFLINE * fix test * revert test_offline to test TRANSFORMERS_OFFLINE * Add new offline test * merge conflicts * docs	2024-05-29 11:55:43 +01:00
amyeroberts	dff54ad2d9	🚨 out_indices always a list (#30941 ) * out_indices always a list * Update src/transformers/utils/backbone_utils.py * Update src/transformers/utils/backbone_utils.py * Move type casting * nit	2024-05-22 15:23:04 +01:00
amyeroberts	c793b26f2e	load_image - decode b64encode and encodebytes strings (#30192 ) * Decode b64encode and encodebytes strings * Remove conditional encode -- image is always a string	2024-04-26 18:21:47 +01:00
Albert Villanova del Moral	c86d020ead	Fix test transposing image with EXIF Orientation tag (#30319 ) * Fix test with exif_transpose image * Replace datasets with PIL to load image in tests	2024-04-18 17:41:20 +01:00
Zach Mueller	487505ff45	Allow for str versions of dicts based on typing (#30227 ) * Bookmark, initial impelemtation. Need to test * Clean * Working fully, woop woop * I think working version now, testing * Fin! * rm cast, could keep None * Fix typing issue * rm typehint * Add test * Add tests and make more rigid	2024-04-16 08:15:09 -04:00
Xu Song	8fd2de933c	Add test for parse_json_file and change typing to os.PathLike (#30183 ) * Add test for parse_json_file * Change Path to PathLike * Fix `Import block is un-sorted or un-formatted` * revert parse_json_file * Fix ruff format * Add parse_json_file test	2024-04-15 09:34:36 +01:00
Yih-Dar	b44df05bc0	Update `tests/utils/tiny_model_summary.json` (#29941 ) update Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2024-04-03 09:25:01 +02:00
Lysandre Debut	39114c0383	Remove static pretrained maps from the library's internals (#29112 ) * [test_all] Remove static pretrained maps from the library's internals * Deprecate archive maps instead of removing them * Revert init changes * [test_all] Deprecate instead of removing * [test_all] PVT v2 support * [test_all] Tests should all pass * [test_all] Style * Address review comments * Update src/transformers/models/deprecated/_archive_maps.py Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> * Update src/transformers/models/deprecated/_archive_maps.py Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> * [test_all] trigger tests * [test_all] LLAVA * [test_all] Bad rebase --------- Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>	2024-03-25 10:33:38 +01:00
Yoach Lacombe	c43b380e70	Add MusicGen Melody (#28819 ) * first modeling code * make repository * still WIP * update model * add tests * add latest change * clean docstrings and copied from * update docstrings md and readme * correct chroma function * correct copied from and remove unreleated test * add doc to toctree * correct imports * add convert script to notdoctested * Add suggestion from Sanchit Co-authored-by: Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com> * correct get_uncoditional_inputs docstrings * modify README according to SANCHIT feedback * add chroma to audio utils * clean librosa and torchaudio hard dependencies * fix FE * refactor audio decoder -> audio encoder for consistency with previous musicgen * refactor conditional -> encoder * modify sampling rate logics * modify license at the beginning * refactor all_self_attns->all_attentions * remove ignore copy from causallm generate * add copied from for from_sub_models * fix make copies * add warning if audio is truncated * add copied from where relevant * remove artefact * fix convert script * fix torchaudio and FE * modify chroma method according to feedback-> better naming * refactor input_values->input_features * refactor input_values->input_features and fix import fe * add input_features to docstrigs * correct inputs_embeds logics * remove dtype conversion * refactor _prepare_conditional_hidden_states_kwargs_for_generation ->_prepare_encoder_hidden_states_kwargs_for_generation * change warning for chroma length * Update src/transformers/models/musicgen_melody/convert_musicgen_melody_transformers.py Co-authored-by: Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com> * change way to save wav, using soundfile * correct docs and change to soundfile * fix import * fix init proj layers * remove line breaks from md * fix issue with docstrings * add FE suggestions * improve is in logics and remove useless imports * remove custom from_pretrained * simplify docstring code * add suggestions for modeling tests * make style * update converting script with sanity check * remove encoder attention mask from conditional generation * replace musicgen melody checkpoints with official orga * rename ylacombe->facebook in checkpoints * fix copies * remove unecessary warning * add shape in code docstrings * add files to slow doc tests * fix md bug and add md to not_tested * make fix-copies * fix hidden states test and batching --------- Co-authored-by: Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com>	2024-03-18 13:06:12 +00:00
Lysandre Debut	f497f564bb	Update all references to canonical models (#29001 ) * Script & Manual edition * Update	2024-02-16 08:16:58 +01:00
amyeroberts	0199a484eb	Backbone kwargs in config (#28784 ) * Enable instantiating model with pretrained backbone weights * Clarify pretrained import * Use load_backbone instead * Add backbone_kwargs to config * Pass kwargs to constructors * Fix up * Input verification * Add tests * Tidy up * Update tests/utils/test_backbone_utils.py Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> --------- Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>	2024-02-14 20:46:44 +00:00
Yih-Dar	a7cb92aa03	fix / skip (for now) some tests before switch to torch 2.2 (#28838 ) * fix / skip some tests before we can switch to torch 2.2 * style --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2024-02-02 14:11:50 +01:00
Matt	415e9a0980	Add tf_keras imports to prepare for Keras 3 (#28588 ) * Port core files + ESM (because ESM code is odd) * Search-replace in modelling code * Fix up transfo_xl as well * Fix other core files + tests (still need to add correct import to tests) * Fix cookiecutter * make fixup, fix imports in some more core files * Auto-add imports to tests * Cleanup, add imports to sagemaker tests * Use correct exception for importing tf_keras * Fixes in modeling_tf_utils * make fixup * Correct version parsing code * Ensure the pipeline tests correctly revert to float32 after each test * Ensure the pipeline tests correctly revert to float32 after each test * More tf.keras -> keras * Add dtype cast * Better imports of tf_keras * Add a cast for tf.assign, just in case * Fix callback imports	2024-01-30 17:26:36 +00:00
Angela Yi	243e186efb	Add serialization logic to pytree types (#27871 ) * Add serialized type name to pytrees * Modify context * add serde test	2024-01-29 10:41:20 +01:00
amyeroberts	27c79a0fb4	Enable instantiating model with pretrained backbone weights (#28214 ) * Enable instantiating model with pretrained backbone weights * Update tests so backbone checkpoint isn't passed in * Remove doc updates until changes made in modeling code * Clarify pretrained import * Update configs - docs and validation check * Update src/transformers/utils/backbone_utils.py Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> * Clarify exception message * Update config init in tests * Add test for when use_timm_backbone=True * Small test updates --------- Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>	2024-01-23 11:01:50 +00:00
Timothy Cronin	ff86bc364d	improve dev setup comments and hints (#28495 ) * improve dev setup comments and hints * fix tests for new dev setup hints	2024-01-15 18:36:40 +00:00
Yih-Dar	bb3bd44739	Fix the check of models supporting FA/SDPA not run (#28202 ) * add check_support_list.py * fix * fix --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2023-12-22 12:56:11 +01:00
amyeroberts	ee298a16a2	Align backbone stage selection with out_indices & out_features (#27606 ) * Iteratre over out_features instead of stage_names * Update for all backbones * Add tests * Fix * Align timm backbone behaviour with other backbones * Fix tests * Stricter checks on set out_features and out_indices * Revert back stage selection logic * Remove out-of-order logic * Document restriction in docstrings	2023-12-20 18:33:17 +00:00
amyeroberts	bd7a356135	Update split string in doctest to reflect #28087 (#28135 )	2023-12-19 13:55:09 +00:00
Matt	71d47f0ad4	More TF fixes (#28081 ) * More build_in_name_scope() * Make sure we set the save spec now we don't do it with dummies anymore * make fixup	2023-12-18 15:26:03 +00:00
Quentin Lhoest	26ea725bc0	Update fixtures-image-utils (#28080 ) * fix hf-internal-testing/fixtures_image_utils * fix test * comments	2023-12-15 16:58:36 +00:00
fxmarty	80377eb018	F.scaled_dot_product_attention support (#26572 ) * add sdpa * wip * cleaning * add ref * yet more cleaning * and more :) * wip llama * working llama * add output_attentions=True support * bigcode sdpa support * fixes * gpt-bigcode support, require torch>=2.1.1 * add falcon support * fix conflicts falcon * style * fix attention_mask definition * remove output_attentions from attnmaskconverter * support whisper without removing any Copied from statement * fix mbart default to eager renaming * fix typo in falcon * fix is_causal in SDPA * check is_flash_attn_2_available in the models init as well in case the model is not initialized through from_pretrained * add warnings when falling back on the manual implementation * precise doc * wip replace _flash_attn_enabled by config.attn_implementation * fix typo * add tests * style * add a copy.deepcopy on the config in from_pretrained, as we do not want to modify it inplace * obey to config.attn_implementation if a config is passed in from_pretrained * fix is_torch_sdpa_available when torch is not installed * remove dead code * Update src/transformers/modeling_attn_mask_utils.py Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> * Update src/transformers/modeling_attn_mask_utils.py Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> * Update src/transformers/modeling_attn_mask_utils.py Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> * Update src/transformers/modeling_attn_mask_utils.py Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> * Update src/transformers/modeling_attn_mask_utils.py Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> * Update src/transformers/models/bart/modeling_bart.py Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> * remove duplicate pretraining_tp code * add dropout in llama * precise comment on attn_mask * add fmt: off for _unmask_unattended docstring * precise num_masks comment * nuke pretraining_tp in LlamaSDPAAttention following Arthur's suggestion * cleanup modeling_utils * backward compatibility * fix style as requested * style * improve documentation * test pass * style * add _unmask_unattended tests * skip meaningless tests for idefics * hard_check SDPA requirements when specifically requested * standardize the use if XXX_ATTENTION_CLASSES * fix SDPA bug with mem-efficient backend on CUDA when using fp32 * fix test * rely on SDPA is_causal parameter to handle the causal mask in some cases * fix FALCON_ATTENTION_CLASSES * remove _flash_attn_2_enabled occurences * fix test * add OPT to the list of supported flash models * improve test * properly test on different SDPA backends, on different dtypes & properly handle separately the pad tokens in the test * remove remaining _flash_attn_2_enabled occurence * Update src/transformers/modeling_utils.py Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> * Update src/transformers/modeling_utils.py Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> * Update src/transformers/modeling_utils.py Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> * Update src/transformers/modeling_attn_mask_utils.py Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> * Update docs/source/en/perf_infer_gpu_one.md Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> * remove use_attn_implementation * fix docstring & slight bug * make attn_implementation internal (_attn_implementation) * typos * fix tests * deprecate use_flash_attention_2=True * fix test * add back llama that was removed by mistake * fix tests * remove _flash_attn_2_enabled occurences bis * add check & test that passed attn_implementation is valid * fix falcon torchscript export * fix device of mask in tests * add tip about torch.jit.trace and move bt doc below sdpa * fix parameterized.expand order * move tests from test_modeling_attn_mask_utils to test_modeling_utils as a relevant test class is already there * update sdpaattention class with the new cache * Update src/transformers/configuration_utils.py Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> * Update src/transformers/models/bark/modeling_bark.py * address review comments * WIP torch.jit.trace fix. left: test both eager & sdpa * add test for torch.jit.trace for both eager/sdpa * fix falcon with torch==2.0 that needs to use sdpa * fix doc * hopefully last fix * fix key_value_length that has no default now in mask converter * is it flacky? * fix speculative decoding bug * tests do pass * fix following #27907 --------- Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>	2023-12-09 05:38:14 +09:00
Yih-Dar	b8db265bc6	Update tiny model summary file (#27388 ) * update * fix --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2023-11-23 21:00:39 +01:00
Arthur	b97cab7e6d	Remove-auth-token (#27060 ) * don't use `use_auth_token`internally * let's use token everywhere * fixup	2023-11-13 14:20:54 +01:00
Lysandre Debut	113ebf80ac	Safetensors serialization by default (#27064 ) * Safetensors serialization by default * First pass on the tests * Second pass on the tests * Third pass on the tests * Fix TF weight loading from TF-format safetensors * Specific encoder-decoder fixes for weight crossloading * Add VisionEncoderDecoder fixes for TF too * Change filename test for pt-to-tf * One missing fix for TFVisionEncoderDecoder * Fix the other crossload test * Support for flax + updated tests * Apply suggestions from code review Co-authored-by: Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com> * Sanchit's comments * Sanchit's comments 2 * Nico's comments * Fix tests * cleanup * Apply suggestions from code review Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> --------- Co-authored-by: Matt <rocketknight1@gmail.com> Co-authored-by: Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com> Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>	2023-10-31 19:16:49 +01:00
Xuehai Pan	cc7803c0a6	Register ModelOutput as supported torch pytree nodes (#26618 ) * Register ModelOutput as supported torch pytree nodes * Test ModelOutput as supported torch pytree nodes * Update type hints for pytree unflatten functions	2023-10-24 11:02:40 +02:00
Yih-Dar	e840aa67e8	Fix failing `MusicgenTest .test_pipeline_text_to_audio` (#26586 ) * fix * fix * Fix * Fix --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2023-10-06 15:53:59 +02:00
Yih-Dar	d9e4bc2895	Update tiny model information and pipeline tests (#26285 ) * Update tiny model summary file * add to pipeline tests * revert * fix import * fix import * fix * fix * update * update * update * fix * remove BarkModelTest * fix --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2023-09-25 18:08:12 +02:00
Yoach Lacombe	9a30753485	Porting the torchaudio kaldi fbank implementation to audio_utils (#26182 ) * add kaldi fbank * make style * add herz_to_mel_kaldi tests * add mel to hertz kaldi test * integration tests * correct test and remove comment * make style * Apply suggestions from code review Co-authored-by: Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com> * change parameter name * Apply suggestions from Arthur review Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> * Update remove_dc_offset description * fix bug + make style * fix error in using np.exp instead of np.power * make style --------- Co-authored-by: Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com> Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>	2023-09-21 17:52:47 +02:00
Lucain	04191ea1e6	Fix gated repo tests (#26257 ) * Fix gated repo tests * Apply suggestions from code review	2023-09-19 13:25:12 +02:00
Craig Chan	d7bd325b5a	Add missing Maskformer dataclass decorator, add dataclass check in ModelOutput for subclasses (#25638 ) * Add @dataclass to MaskFormerPixelDecoderOutput * Add dataclass check if subclass of ModelOutout * Use unittest assertRaises rather than pytest per contribution doc * Update src/transformers/utils/generic.py per suggested change Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> --------- Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>	2023-09-14 10:30:49 +01:00
Yih-Dar	fbbe1b8a40	Fix `test_load_img_url_timeout` (#25976 ) * fix * fix * fix --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2023-09-05 11:34:28 +02:00
Haylee Schäfer	dbc16f4404	Support loading base64 images in pipelines (#25633 ) * support loading base64 images * add test * mention in docs * remove the logging * sort imports * update error message * Update tests/utils/test_image_utils.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * restructure to catch base64 exception * doesn't like the newline * download files * format * optimize imports * guess it needs a space? * support loading base64 images * add test * remove the logging * sort imports * restructure to catch base64 exception * doesn't like the newline * download files * optimize imports * guess it needs a space? --------- Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>	2023-08-29 19:24:24 +01:00
Sylvain Gugger	68fa9a5937	Skip broken tests	2023-08-24 01:48:53 -04:00
amyeroberts	5e5fa0d88c	Mark flaky tests (#25463 ) Make CI less brittle	2023-08-11 15:26:45 +01:00
amyeroberts	944ddce8bf	Enable passing number of channels when inferring data format (#25412 )	2023-08-09 17:41:21 +01:00
Matthew Hoffman	d4bd33cc9f	Register ModelOutput subclasses as supported torch.utils._pytree nodes (#25358 ) * Register ModelOutput subclasses as supported torch.utils._pytree nodes Fixes #25357 where DDP with static_graph=True does not sync gradients when calling backward() over tensors contained in ModelOutput subclasses * Add test for torch pytree ModelOutput serialization and deserialization	2023-08-08 08:12:11 +02:00
Roland Szabo	d114a6b71f	Add timeout parameter to load_image function (#25184 ) * Add timeout parameter to load_image function. * Remove line. * Reformat code Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Add parameter to docs. --------- Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>	2023-08-03 15:51:54 +01:00
Yih-Dar	1b4f6199c6	Update tiny model info. and pipeline testing (#25213 ) * update tiny_model_summary.json * update * update * update --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2023-07-31 19:35:33 +02:00
Lucain	c1dba1111b	Add test when downloading from gated repo (#25039 )	2023-07-28 08:14:27 -04:00
Yih-Dar	3441ad7d43	Make (TF) CI faster (test only a subset of model classes) (#24592 ) * fix * fix * fix --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2023-06-30 16:54:54 +02:00
Yih-Dar	e84bf1f734	⚠️ Time to say goodbye to py37 (#24091 ) * fix --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2023-06-28 07:22:39 +02:00
Sylvain Gugger	eb849f6604	Migrate doc files to Markdown. (#24376 ) * Rename index.mdx to index.md * With saved modifs * Address review comment * Treat all files * .mdx -> .md * Remove special char * Update utils/tests_fetcher.py Co-authored-by: Lysandre Debut <lysandre.debut@reseau.eseo.fr> --------- Co-authored-by: Lysandre Debut <lysandre.debut@reseau.eseo.fr>	2023-06-20 18:07:47 -04:00
Yih-Dar	c23d131eab	Update tiny models for pipeline testing. (#24364 ) * fix --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2023-06-20 14:43:10 +02:00
Matt	56efbf4301	TensorFlow CI fixes (#24360 ) * Fix saved_model_creation_extended * Skip the BLIP model creation test for now * Fix TF SAM test * Fix longformer tests * Fix Wav2Vec2 * Add a skip for XLNet * make fixup * make fix-copies * Add comments	2023-06-20 12:59:21 +01:00
Matt	9138995025	Add test for proper TF input signatures (#24320 ) * Add test for proper input signatures * No more signature pruning * Test the dummy inputs are valid too * fine-tine -> fine-tune * Fix indent in test_dataset_conversion	2023-06-16 17:03:13 +01:00
Matt	3bd1fe4315	Stop storing references to bound methods via tf.function (#24146 ) * Stop storing references to bound methods in tf.functions * Remove the gc.collect calls now that we resolved the underlying problem * Remove the default signature from model.serving entirely, big cleanup * Remove _prune_signature as self.input_signature can prune itself * Restore serving docstring * Update int support test to check the input signature * Make sure other tests also use model.input_signature and not serving.input_signature * Restore _prune_signature * Remove the doctest GC now it's no longer needed * Correct core tests to use the pruned sig * order lines correctly in core tests * Add eager_serving back with a deprecation warning	2023-06-13 19:04:22 +01:00
Radamés Ajna	a73883ae9e	add trust_remote_code option to CLI download cmd (#24097 ) * add trust_remote_code option * require_torch	2023-06-08 11:13:57 -04:00
Daniel King	89159651ba	Fix the regex in `get_imports` to support multiline try blocks and excepts with specific exception types (#23725 ) * fix and test get_imports for multiline try blocks, and excepts with specific errors * fixup * add some more tests * add license	2023-05-24 15:40:19 -04:00
Matt	f8b2574416	Better TF docstring types (#23477 ) * Rework TF type hints to use \| None instead of Optional[] for tf.Tensor * Rework TF type hints to use \| None instead of Optional[] for tf.Tensor * Don't forget the imports * Add the imports to tests too * make fixup * Refactor tests that depended on get_type_hints * Better test refactor * Fix an old hidden bug in the test_keras_fit input creation code * Fix for the Deit tests	2023-05-24 13:52:52 +01:00
Yih-Dar	ffad4f1373	Update tiny models and pipeline tests (#23446 ) * fix --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2023-05-18 17:29:04 +02:00
Matthijs Hollemans	7f91950901	audio_utils improvements (#21998 ) * silly change to allow making a PR * clean up doc comments * simplify hertz_to_mel and mel_to_hertz * fixup * clean up power_to_db * also add amplitude_to_db * move functions * clean up mel_filter_bank * fixup * credit librosa & torchaudio authors * add unit tests * tests for power_to_db and amplitude_to_db * add mel_filter_bank tests * rewrite STFT * add convenience spectrogram function * missing transpose * fewer transposes * add integration test to M-CTC-T * frame length can be either window or FFT length * rewrite stft API * add preemphasis coefficient * move argument * add log option to spectrogram * replace M-CTC-T feature extractor * fix api thing * replace whisper STFT * replace whisper mel filters * replace tvlt's stft * allow alternate window names * replace speecht5 stft * fixup * fix integration tests * fix doc comments * remove manual FFT length calculation * fix docs * go away, deprecation warnings * combine everything into spectrogram function * add deprecated functions back * fixup	2023-05-09 09:10:17 -04:00
amyeroberts	90e8263d91	Add methods to update and verify out_features out_indices (#23031 ) * Add methods to update and verify out_features out_indices * Safe update for config attributes * Fix function names * Save config correctly * PR comments - use property setters * PR comment - directly set attributes * Update test * Add updates to recently merged focalnet backbone	2023-05-04 10:15:06 +01:00
Xuehai Pan	ee4bc07474	Support union types `X \| Y` syntax for `HfArgumentParser` for Python 3.10+ (#23126 ) * Support union types `X \| Y` syntax for `HfArgumentParser` for Python 3.10+ * Add tests for PEP 604 for `HfArgumentParser` * Reorganize tests	2023-05-03 10:49:54 -04:00
Yih-Dar	a4908da04e	Fix the expected error in `test_offline_mode_pipeline_exception` (#23022 ) * fix --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2023-04-27 14:22:05 +02:00
Lingepumpe	5427250351	Avoid invalid escape sequences, use raw strings (#22936 ) * Avoid invalid escape sequences, use raw strings * Integrate PR feedback	2023-04-25 09:17:56 -04:00
Lucain	74c55ab9e5	Prepare tests for hfh 0.14 (#22958 ) * Test hf_hub 0.14.0rc1 * fix mocked tests * package version --------- Co-authored-by: Sylvain Gugger <Sylvain.gugger@gmail.com> Co-authored-by: testbot <lucainp@hf.co>	2023-04-24 09:31:50 -04:00
Yih-Dar	975159bb61	Update tiny models and a few fixes (#22928 ) * run_check_tiny_models * update summary * update mixin * update pipeline_model_mapping * update pipeline_model_mapping * Update for gpt_bigcode --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2023-04-24 14:45:22 +02:00
Sylvain Gugger	3876fc6839	Make dynamic code work with offline mode (#22661 ) * Make dynamic code work with offline mode * Clean up * Quality	2023-04-10 08:49:42 -04:00
Yih-Dar	c7ec71baf5	Update tiny model summary file for recent models (#22637 ) * Update tiny model summary file for recent models --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2023-04-06 22:52:59 +02:00
Nicolas Patry	0aa1153ffb	Revert error back into warning for byte fallback conversion. (#22607 )	2023-04-06 14:00:29 +02:00
Nicolas Patry	1670be4bde	Adding Llama FastTokenizer support. (#22264 ) * Adding Llama FastTokenizer support. - Requires https://github.com/huggingface/tokenizers/pull/1183 version - Only support byte_fallback for llama, raise otherwise (safety net). - Lots of questions are special tokens How to test: ```python from transformers.convert_slow_tokenizer import convert_slow_tokenizer from transformers import AutoTokenizer from tokenizers import Tokenizer tokenizer = AutoTokenizer.from_pretrained("huggingface/llama-7b") if False: new_tokenizer = Tokenizer.from_file("tok.json") else: new_tokenizer = convert_slow_tokenizer(tokenizer) new_tokenizer.save("tok.json") strings = [ "This is a test", "生活的真谛是", "生活的真谛是[MASK]。", # XXX: This one is problematic because of special tokens # "<s> Something something", ] for string in strings: encoded = tokenizer(string)["input_ids"] encoded2 = new_tokenizer.encode(string).ids assert encoded == encoded2, f"{encoded} != {encoded2}" decoded = tokenizer.decode(encoded) decoded2 = new_tokenizer.decode(encoded2) assert decoded.strip() == decoded2, f"{repr(decoded)} != {repr(decoded2)}" ``` The converter + some test script. The test script. Tmp save. Adding Fast tokenizer + tests. Adding the tokenization tests. Correct combination. Small fix. Fixing tests. Fixing with latest update. Rebased. fix copies + normalized added tokens + copies. Adding doc. TMP. Doc + split files. Doc. Versions + try import. Fix Camembert + warnings -> Error. Fix by ArthurZucker. Not a decorator. * Fixing comments. * Adding more to docstring. * Doc rewriting.	2023-04-06 09:53:03 +02:00
Yih-Dar	8894b81742	Use real tokenizers if tiny version(s) creation has issue(s) (#22428 ) Fix some tiny model creation issues Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2023-03-29 16:16:23 +02:00
Yih-Dar	e8cc02555e	Automatically create/update tiny models (#22275 ) * Automatically create or update tiny models * Skip failed tests * update workflow file * use revision --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2023-03-23 19:14:17 +01:00
Yih-Dar	e6de918676	Add Blip and Blip2 for pipeline tests (#21904 ) * fix * add to tests * style and quality * add missing --------- Co-authored-by: NielsRogge <NielsRogge@users.noreply.github.com> Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2023-03-02 18:20:34 +01:00
Sylvain Gugger	b29e2dcaff	Fix flaky test for log level (#21776 ) * Fix flaky test for log level * Fix other flaky test	2023-02-28 16:24:14 -05:00
Yih-Dar	871c31a6f1	🔥Rework pipeline testing by removing `PipelineTestCaseMeta` 🚀 (#21516 ) * Add PipelineTesterMixin * remove class PipelineTestCaseMeta * move validate_test_components * Add for ViT * Add to SPECIAL_MODULE_TO_TEST_MAP * style and quality * Add feature-extraction * update * raise instead of skip * add tiny_model_summary.json * more explicit * skip tasks not in mapping * add availability check * Add Copyright * A way to diable irrelevant tests * update with main * remove disable_irrelevant_tests * skip tests * better skip message * better skip message * Add all pipeline task tests * revert * Import PipelineTesterMixin * subclass test classes with PipelineTesterMixin * Add pipieline_model_mapping * Fix import after adding pipieline_model_mapping * Fix style and quality after adding pipieline_model_mapping * Fix one more import after adding pipieline_model_mapping * Fix style and quality after adding pipieline_model_mapping * Fix test issues * Fix import requirements * Fix mapping for MobileViTModelTest * Update * Better skip message * pipieline_model_mapping could not be None * Remove some PipelineTesterMixin * Fix typo * revert tests_fetcher.py * update * rename * revert * Remove PipelineTestCaseMeta from ZeroShotAudioClassificationPipelineTests * style and quality * test fetcher for all pipeline/model tests --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2023-02-28 19:40:57 +01:00
Joao Gante	92dfceb124	Inheritance-based framework detection (#21784 )	2023-02-27 15:31:55 +00:00
Aaron Gokaslan	5e8c8eb5ba	Apply ruff flake8-comprehensions (#21694 )	2023-02-22 09:14:54 +01:00
Sylvain Gugger	6f79d26442	Update quality tooling for formatting (#21480 ) * Result of black 23.1 * Update target to Python 3.7 * Switch flake8 to ruff * Configure isort * Configure isort * Apply isort with line limit * Put the right black version * adapt black in check copies * Fix copies	2023-02-06 18:10:56 -05:00
Joel Lamy-Poirier	e006ab51ac	Add the GeLU activation from pytorch with the tanh approximation (#21345 ) * gelu_python_tanh * rename * Version check, add test * Pr comment	2023-02-02 09:33:04 -05:00
amyeroberts	d18a1cba24	Accept batched tensor of images as input to image processor (#21144 ) * Accept a batched tensor of images as input * Add to all image processors * Update oneformer	2023-01-26 10:15:26 +00:00
Nicolas Patry	25ddd91b24	Fixing offline mode for pipeline (when inferring task). (#21113 ) * Fixing offline mode for pipeline (when inferring task). * Update src/transformers/pipelines/__init__.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Updating test to reflect change in exception. * Fixing offline mode. * Clean. Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>	2023-01-17 15:24:40 +01:00
amyeroberts	0dde58978a	Rename test_feature_extraction files (#21140 ) * Rename files * Update file names in tests	2023-01-17 14:04:07 +00:00
Matt	afad0c18d9	Fix TF nightly tests (#20507 ) * Fixed test_saved_model_extended * Fix TFGPT2 tests * make fixup * Make sure keras-nlp utils are available for type hinting too * Update src/transformers/testing_utils.py Co-authored-by: Yih-Dar <2521628+ydshieh@users.noreply.github.com> * make fixup Co-authored-by: Yih-Dar <2521628+ydshieh@users.noreply.github.com>	2022-11-30 14:47:54 +00:00
amyeroberts	321ef388fe	Include image processor in add-new-model-like (#20439 )	2022-11-28 16:46:02 +00:00
Matt	de4159a318	More TF int dtype fixes (#20384 ) * Add a test to ensure int dummy inputs are int64 * Move the test into the existing int64 test and update a lot of existing dummies * Fix remaining dummies * Fix remaining dummies * Test for int64 serving sigs as well * Update core tests to use tf.int64 * Add better messages to the assertions * Update all serving sigs to int64 * More sneaky hiding tf.int32s * Add an optional int32 signature in save_pretrained * make fixup * Add Amy's suggestions * Switch all serving sigs back to tf.int32 * Switch all dummies to tf.int32 * Adjust tests to check for tf.int32 instead of tf.int64 * Fix base dummy_inputs dtype * Start casting to tf.int32 in input_processing * Change dtype for unpack_inputs test * Add proper tf.int32 test * Make the alternate serving signature int64	2022-11-28 13:24:44 +00:00
Sylvain Gugger	afce73bd9d	Fix ModelOutput instantiation when there is only one tuple (#20416 )	2022-11-23 15:09:21 -05:00
Konstantin Dobler	1e3f17b5ab	Enhance HfArgumentParser functionality and ease of use (#20323 ) * Enhance HfArgumentParser * Fix type hints for older python versions * Fix and add tests (+formatting) * Add changes * doc-builder formatting * Remove unused import "Call"	2022-11-21 12:33:37 -05:00
amyeroberts	a6b7759880	Add Image Processors (#19796 ) * Add CLIP image processor * Crop size as dict too * Update warning * Actually use logger this time * Normalize doesn't change dtype of input * Add perceiver image processor * Tidy up * Add DPT image processor * Add Vilt image processor * Tidy up * Add poolformer image processor * Tidy up * Add LayoutLM v2 and v3 imsge processors * Tidy up * Add Flava image processor * Tidy up * Add deit image processor * Tidy up * Add ConvNext image processor * Tidy up * Add levit image processor * Add segformer image processor * Add in post processing * Fix up * Add ImageGPT image processor * Fixup * Add mobilevit image processor * Tidy up * Add postprocessing * Fixup * Add VideoMAE image processor * Tidy up * Add ImageGPT image processor * Fixup * Add ViT image processor * Tidy up * Add beit image processor * Add mobilevit image processor * Tidy up * Add postprocessing * Fixup * Fix up * Fix flava and remove tree module * Fix image classification pipeline failing tests * Update feature extractor in trainer scripts * Update pad_if_smaller to accept tuple and int size * Update for image segmentation pipeline * Update src/transformers/models/perceiver/image_processing_perceiver.py Co-authored-by: Alara Dirik <8944735+alaradirik@users.noreply.github.com> * Update src/transformers/image_processing_utils.py Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Update src/transformers/models/beit/image_processing_beit.py Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * PR comments - docstrings; remove accidentally added resize; var names * Update docstrings * Add exception if size is not in the right format * Fix exception check * Fix up * Use shortest_edge in tuple in script Co-authored-by: Alara Dirik <8944735+alaradirik@users.noreply.github.com> Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>	2022-11-02 11:57:36 +00:00
Sylvain Gugger	d4eb52d13d	Refactor conversion function (#19799 ) * Refactor conversion function * Remove dupe line * Fixes * Fixes * Use the right variable... * Fix last test	2022-10-24 13:48:40 -04:00
Sylvain Gugger	fb0bd7b7a8	Fix activations being all the same module (#19728 )	2022-10-18 11:56:45 -04:00
amyeroberts	4181320b8c	Add normalize to image transforms module (#19544 ) * Adapt FE methods to transforms library * Mixin for saving the image processor * Base processor skeleton * BatchFeature for packaging image processor outputs * Initial image processor for GLPN * REmove accidental import * Fixup and docs * Mixin for saving the image processor * Fixup and docs * Import BatchFeature from feature_extraction_utils * Fixup and docs * Fixup and docs * Fixup and docs * Fixup and docs * BatchFeature for packaging image processor outputs * Import BatchFeature from feature_extraction_utils * Import BatchFeature from feature_extraction_utils * Fixup and docs * Fixup and docs * BatchFeature for packaging image processor outputs * Import BatchFeature from feature_extraction_utils * Fixup and docs * Mixin for saving the image processor * Fixup and docs * Add rescale back and remove ImageType * fix import mistake * Fix enum var reference * Can transform and specify image data format * Remove redundant function * Update reference * Data format flag for rescale * Fix typo * Fix dimension check * Fixes to make IP and FE outputs match * Add tests for transforms * Add test for utils * Update some docstrings * Make sure in channels last before converting to PIL * Remove default to numpy batching * Fix up * Add docstring and model_input_types * Use feature processor config from hub * Alias GLPN feature extractor to image processor * Alias feature extractor mixin * Add return_numpy=False flag for resize * Fix up * Fix up * Use different frameworks safely * Safely import PIL * Call function checking if PIL available * Only import if vision available * Address Sylvain PR comments Co-authored-by: Sylvain.gugger@gmail.com * Apply suggestions from code review Co-authored-by: Sylvain Gugger <Sylvain.gugger@gmail.com> Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update src/transformers/image_transforms.py Co-authored-by: Alara Dirik <8944735+alaradirik@users.noreply.github.com> * Update src/transformers/models/glpn/feature_extraction_glpn.py Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Add in docstrings * Fix TFSwinSelfAttention to have relative position index as non-trainable weight (#18226) Signed-off-by: Seunghwan Hong <seunghwan@scatterlab.co.kr> * Refactor `TFSwinLayer` to increase serving compatibility (#18352) * Refactor `TFSwinLayer` to increase serving compatibility Signed-off-by: Seunghwan Hong <seunghwan@scatterlab.co.kr> * Fix missed parameters while refactoring Signed-off-by: Seunghwan Hong <seunghwan@scatterlab.co.kr> * Fix window_reverse to calculate batch size Signed-off-by: Seunghwan Hong <harrydrippin@gmail.com> Co-Authored-By: amyeroberts <22614925+amyeroberts@users.noreply.github.com> Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Add TF prefix to TF-Res test class (#18481) Co-authored-by: ydshieh <ydshieh@users.noreply.github.com> * Remove py.typed (#18485) * Fix pipeline tests (#18487) * Fix pipeline tests * Make sure all pipelines tests run with init changes * Use new huggingface_hub tools for download models (#18438) * Draft new cached_file * Initial draft for config and model * Small fixes * Fix first batch of tests * Look in cache when internet is down * Fix last tests * Bad black, not fixing all quality errors * Make diff less * Implement change for TF and Flax models * Add tokenizer and feature extractor * For compatibility with main * Add utils to move the cache and auto-do it at first use. * Quality * Deal with empty commit shas * Deal with empty etag * Address review comments * Fix `test_dbmdz_english` by updating expected values (#18482) Co-authored-by: ydshieh <ydshieh@users.noreply.github.com> * Move cache folder to huggingface/hub for consistency with hf_hub (#18492) * Move cache folder to just huggingface * Thank you VsCode for this needless import * Move to hub * Forgot one * Update some expected values in `quicktour.mdx` for `resampy 0.3.0` (#18484) Co-authored-by: ydshieh <ydshieh@users.noreply.github.com> * Forgot one new_ for cache migration * disable Onnx test for google/long-t5-tglobal-base (#18454) Co-authored-by: ydshieh <ydshieh@users.noreply.github.com> * Typo reported by Joel Grus on TWTR (#18493) * Just re-reading the whole doc every couple of months 😬 (#18489) * Delete valohai.yaml * NLP => ML * typo * website supports https * datasets * 60k + modalities * unrelated link fixing for accelerate * Ok those links were actually broken * Fix link * Make `AutoTokenizer` auto-link * wording tweak * add at least one non-nlp task * `transformers-cli login` => `huggingface-cli login` (#18490) * zero chance anyone's using that constant no? * `transformers-cli login` => `huggingface-cli login` * `transformers-cli repo create` => `huggingface-cli repo create` * `make style` * Add seed setting to image classification example (#18519) * [DX fix] Fixing QA pipeline streaming a dataset. (#18516) * [DX fix] Fixing QA pipeline streaming a dataset. QuestionAnsweringArgumentHandler would iterate over the whole dataset effectively killing all properties of the pipeline. This restores nice properties when using `Dataset` or `Generator` since those are meant to be consumed lazily. * Handling TF better. * Clean up hub (#18497) * Clean up utils.hub * Remove imports * More fixes * Last fix * update fsdp docs (#18521) * updating fsdp documentation * typo fix * Fix compatibility with 1.12 (#17925) * Fix compatibility with 1.12 * Remove pin from examples requirements * Update torch scatter version * Fix compatibility with 1.12 * Remove pin from examples requirements * Update torch scatter version * fix torch.onnx.symbolic_opset12 import * Reject bad version Co-authored-by: ydshieh <ydshieh@users.noreply.github.com> * Remove debug statement * Specify en in doc-builder README example (#18526) Co-authored-by: Ankur Goyal <ankur@impira.com> * New cache fixes: add safeguard before looking in folders (#18522) * unpin resampy (#18527) Co-authored-by: ydshieh <ydshieh@users.noreply.github.com> * ✨ update to use interlibrary links instead of Markdown (#18500) * Add example of multimodal usage to pipeline tutorial (#18498) * 📝 add example of multimodal usage to pipeline tutorial * 🖍 apply feedbacks * 🖍 apply niels feedback * [VideoMAE] Add model to doc tests (#18523) * Add videomae to doc tests * Add pip install decord Co-authored-by: Niels Rogge <nielsrogge@Nielss-MacBook-Pro.local> * Update perf_train_gpu_one.mdx (#18532) * Update no_trainer.py scripts to include accelerate gradient accumulation wrapper (#18473) * Added accelerate gradient accumulation wrapper to run_image_classification_no_trainer.py example script * make fixup changes * PR comments * changed input to Acceletor based on PR comment, ran make fixup * Added comment explaining the sync_gradients statement * Fixed lr scheduler max steps * Changed run_clm_no_trainer.py script to use accelerate gradient accum wrapper * Fixed all scripts except wav2vec2 pretraining to use accelerate gradient accum wrapper * Added accelerate gradient accum wrapper for wav2vec2_pretraining_no_trainer.py script * make fixup and lr_scheduler step inserted back into run_qa_beam_search_no_trainer.py * removed changes to run_wav2vec2_pretraining_no_trainer.py script and fixed using wrong constant in qa_beam_search_no_trainer.py script * Add Spanish translation of converting_tensorflow_models.mdx (#18512) * Add file in spanish docs to be translated * Finish translation to Spanish * Improve Spanish wording * Add suggested changes from review * Spanish translation of summarization.mdx (#15947) (#18477) * Add Spanish translation of summarization.mdx * Apply suggestions from code review Co-authored-by: Omar U. Espejel <espejelomar@gmail.com> Co-authored-by: Omar U. Espejel <espejelomar@gmail.com> * Let's not cast them all (#18471) * add correct dtypes when checking for params dtype * forward contrib credits * Update src/transformers/modeling_utils.py Co-authored-by: Thomas Wang <24695242+thomasw21@users.noreply.github.com> * more comments - added more comments on why we cast only floating point parameters * Update src/transformers/modeling_utils.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> Co-authored-by: sgugger <35901082+sgugger@users.noreply.github.com> Co-authored-by: Thomas Wang <24695242+thomasw21@users.noreply.github.com> * fix: data2vec-vision Onnx ready-made configuration. (#18427) * feat: add the data2vec conf that are missing https://huggingface.co/docs/transformers/serialization * fix: wrong config * Add mt5 onnx config (#18394) * update features * MT5OnnxConfig added with updated with tests and docs * fix imports * fix onnc_config_cls for mt5 Co-authored-by: Thomas Chaigneau <thomas.deeptools.ai> * Minor update of `run_call_with_unpacked_inputs` (#18541) Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> Co-authored-by: ydshieh <ydshieh@users.noreply.github.com> * BART - Fix attention mask device issue on copied models (#18540) * attempt to fix attn mask device * fix bart `_prepare_decoder_attention_mask` - add correct device - run `make fix-copies` to propagate the fix * Adding a new `align_to_words` param to qa pipeline. (#18010) * Adding a new `align_to_words` param to qa pipeline. * Update src/transformers/pipelines/question_answering.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Import protection. Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * 📝 update metric with evaluate (#18535) * Restore _init_weights value in no_init_weights (#18504) * Recover _init_weights value in no_init_weights For potential nested use. In addition, users might modify private no_init_weights as well. * Apply suggestions from code review Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Remove private variable change check Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Clean up comment * 📝 update documentation build section (#18548) * `bitsandbytes` - `Linear8bitLt` integration into `transformers` models (#17901) * first commit * correct replace function * add final changes - works like charm! - cannot implement tests yet - tested * clean up a bit * add bitsandbytes dependencies * working version - added import function - added bitsandbytes utils file * small fix * small fix - fix import issue * fix import issues * Apply suggestions from code review Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * refactor a bit - move bitsandbytes utils to utils - change comments on functions * reformat docstring - reformat docstring on init_empty_weights_8bit * Update src/transformers/__init__.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * revert bad formatting * change to bitsandbytes * refactor a bit - remove init8bit since it is useless * more refactoring - fixed init empty weights issue - added threshold param * small hack to make it work * Update src/transformers/modeling_utils.py * Update src/transformers/modeling_utils.py * revmoe the small hack * modify utils file * make style + refactor a bit * create correctly device map * add correct dtype for device map creation * Apply suggestions from code review Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * apply suggestions - remove with torch.grad - do not rely on Python bool magic! * add docstring - add docstring for new kwargs * add docstring - comment `replace_8bit_linear` function - fix weird formatting * - added more documentation - added new utility function for memory footprint tracking - colab demo to add * few modifs - typo doc - force cast into float16 when load_in_8bit is enabled * added colab link * add test architecture + docstring a bit * refactor a bit testing class * make style + refactor a bit * enhance checks - add more checks - start writing saving test * clean up a bit * male style * add more details on doc * add more tests - still needs to fix 2 tests * replace by "or" - could not fix it from GitHub GUI Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * refactor a bit testing code + add readme * make style * fix import issue * Update src/transformers/modeling_utils.py Co-authored-by: Michael Benayoun <mickbenayoun@gmail.com> * add few comments * add more doctring + make style * more docstring * raise error when loaded in 8bit * make style * add warning if loaded on CPU * add small sanity check * fix small comment * add bitsandbytes on dockerfile * Improve documentation - improve documentation from comments * add few comments * slow tests pass on the VM but not on the CI VM * Fix merge conflict * make style * another test should pass on a multi gpu setup * fix bad import in testing file * Fix slow tests - remove dummy batches - no more CUDA illegal memory errors * odify dockerfile * Update docs/source/en/main_classes/model.mdx * Update Dockerfile * Update model.mdx * Update Dockerfile * Apply suggestions from code review * few modifications - lm head can stay on disk/cpu - change model name so that test pass * change test value - change test value to the correct output - torch bmm changed to baddmm in bloom modeling when merging * modify installation guidelines * Apply suggestions from code review Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Apply suggestions from code review Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Apply suggestions from code review Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * replace `n`by `name` * merge `load_in_8bit` and `low_cpu_mem_usage` * first try - keep the lm head in full precision * better check - check the attribute `base_model_prefix` instead of computing the number of parameters * added more tests * Update src/transformers/utils/bitsandbytes.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Merge branch 'integration-8bit' of https://github.com/younesbelkada/transformers into integration-8bit * improve documentation - fix typos for installation - change title in the documentation Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> Co-authored-by: Michael Benayoun <mickbenayoun@gmail.com> * TF: XLA-trainable DeBERTa v2 (#18546) * fix deberta issues * add different code paths for gpu and tpu * shorter gpu take along axis * Stable Dropout without tf cond * variable must be float * Preserve hub-related kwargs in AutoModel.from_pretrained (#18545) * Preserve hub-related kwargs in AutoModel.from_pretrained * Fix tests * Remove debug statement * TF Examples Rewrite (#18451) * Finished QA example * Dodge a merge conflict * Update text classification and LM examples * Update NER example * New Keras metrics WIP, fix NER example * Update NER example * Update MC, summarization and translation examples * Add XLA warnings when shapes are variable * Make sure batch_size is consistently scaled by num_replicas * Add PushToHubCallback to all models * Add docs links for KerasMetricCallback * Add docs links for prepare_tf_dataset and jit_compile * Correct inferred model names * Don't assume the dataset has 'lang' * Don't assume the dataset has 'lang' * Write metrics in text classification * Add 'framework' to TrainingArguments and TFTrainingArguments * Export metrics in all examples and add tests * Fix training args for Flax * Update command line args for translation test * make fixup * Fix accidentally running other tests in fp16 * Remove do_train/do_eval from run_clm.py * Remove do_train/do_eval from run_mlm.py * Add tensorflow tests to circleci * Fix circleci * Update examples/tensorflow/language-modeling/run_mlm.py Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com> * Update examples/tensorflow/test_tensorflow_examples.py Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com> * Update examples/tensorflow/translation/run_translation.py Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com> * Update examples/tensorflow/token-classification/run_ner.py Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com> * Fix save path for tests * Fix some model card kwargs * Explain the magical -1000 * Actually enable tests this time * Skip text classification PR until we fix shape inference * make fixup Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com> * Use commit hash to look in cache instead of calling head (#18534) * Use commit hash to look in cache instead of calling head * Add tests * Add attr for local configs too * Stupid typos * Fix tests * Update src/transformers/utils/hub.py Co-authored-by: Julien Chaumond <julien@huggingface.co> * Address Julien's comments Co-authored-by: Julien Chaumond <julien@huggingface.co> * `pipeline` support for `device="mps"` (or any other string) (#18494) * `pipeline` support for `device="mps"` (or any other string) * Simplify `if` nesting * Update src/transformers/pipelines/base.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Fix? @sgugger * passing `attr=None` is not the same as not passing `attr` 🤯 Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update philosophy to include other preprocessing classes (#18550) * 📝 update philosophy to include other preprocessing classes * 🖍 apply feedbacks * Properly move cache when it is not in default path (#18563) * Adds CLIP to models exportable with ONNX (#18515) * onnx config for clip * default opset as 14 * changes from the original repo * input values order fix * outputs fix * remove unused import * ran make fix-copies * black format * review comments: forward ref, import fix, model change revert, .to cleanup * make style * formatting fixes * revert groupvit * comment for cast to int32 * comment fix * make .T as .t() for onnx conversion * ran make fix-copies * remove unneeded comment Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * fix copies * remove comment Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * raise atol for MT5OnnxConfig (#18560) Co-authored-by: ydshieh <ydshieh@users.noreply.github.com> * fix string (#18568) * Segformer TF: fix output size in documentation (#18572) * Segformer TF: fix output size in doc * Segformer pytorch: fix output size in doc Co-authored-by: Maxime Gardoni <maxime.gardoni@ecorobotix.com> * Fix resizing bug in OWL-ViT (#18573) * Fixes resizing bug in OWL-ViT * Defaults to square resize if size is set to an int * Sets do_center_crop default value to False * Fix LayoutLMv3 documentation (#17932) * fix typos * fix sequence_length docs of LayoutLMv3Model * delete trailing white spaces * fix layoutlmv3 docs more * apply make fixup & quality * change to two versions of input docstring * apply make fixup & quality * Skip broken tests * Change BartLearnedPositionalEmbedding's forward method signature to support Opacus training (#18486) * changing BartLearnedPositionalEmbedding forward signature and references to it * removing debugging dead code (thanks style checker) * blackened modeling_bart file * removing copy inconsistencies via make fix-copies * changing references to copied signatures in Bart variants * make fix-copies once more * using expand over repeat (thanks @michaelbenayoun) * expand instead of repeat for all model copies Co-authored-by: Daniel Jones <jonesdaniel@microsoft.com> * german docs translation (#18544) * Create _config.py * Create _toctree.yml * Create index.mdx not sure about "du / ihr" oder "sie" * Create quicktour.mdx * Update _toctree.yml * Update build_documentation.yml * Update build_pr_documentation.yml * fix build * Update index.mdx * Update quicktour.mdx * Create installation.mdx * Update _toctree.yml * Deberta V2: Fix critical trace warnings to allow ONNX export (#18272) * Fix critical trace warnings to allow ONNX export * Force input to `sqrt` to be float type * Cleanup code * Remove unused import statement * Update model sew * Small refactor Co-authored-by: Michael Benayoun <mickbenayoun@gmail.com> * Use broadcasting instead of repeat * Implement suggestion Co-authored-by: Michael Benayoun <mickbenayoun@gmail.com> * Match deberta v2 changes in sew_d * Improve code quality * Update code quality * Consistency of small refactor * Match changes in sew_d Co-authored-by: Michael Benayoun <mickbenayoun@gmail.com> * [FX] _generate_dummy_input supports audio-classification models for labels (#18580) * Support audio classification architectures for labels generation, as well as provides a flag to print warnings or not * Use ENV_VARS_TRUE_VALUES * Fix docstrings with last version of hf-doc-builder styler (#18581) * Fix docstrings with last version of hf-doc-builder styler * Remove empty Parameter block * Bump nbconvert from 6.0.1 to 6.3.0 in /examples/research_projects/lxmert (#18565) Bumps [nbconvert](https://github.com/jupyter/nbconvert) from 6.0.1 to 6.3.0. - [Release notes](https://github.com/jupyter/nbconvert/releases) - [Commits](https://github.com/jupyter/nbconvert/compare/6.0.1...6.3.0) --- updated-dependencies: - dependency-name: nbconvert dependency-type: direct:production ... Signed-off-by: dependabot[bot] <support@github.com> Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> * Bump nbconvert in /examples/research_projects/visual_bert (#18566) Bumps [nbconvert](https://github.com/jupyter/nbconvert) from 6.0.1 to 6.3.0. - [Release notes](https://github.com/jupyter/nbconvert/releases) - [Commits](https://github.com/jupyter/nbconvert/compare/6.0.1...6.3.0) --- updated-dependencies: - dependency-name: nbconvert dependency-type: direct:production ... Signed-off-by: dependabot[bot] <support@github.com> Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> * fix owlvit tests, update docstring examples (#18586) * Return the permuted hidden states if return_dict=True (#18578) * Load sharded pt to flax (#18419) * initial commit * add small test * add cross pt tf flag to test * fix quality * style * update test with new repo * fix failing test * update * fix wrong param ordering * style * update based on review * update related to recent new caching mechanism * quality * Update based on review Co-authored-by: sgugger <sylvain.gugger@gmail.com> * quality and style * Update src/transformers/modeling_flax_utils.py Co-authored-by: sgugger <sylvain.gugger@gmail.com> Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Add type hints for ViLT models (#18577) * Add type hints for Vilt models * Add missing return type for TokenClassification class * update doc for perf_train_cpu_many, add intel mpi introduction (#18576) * update doc for perf_train_cpu_many, add mpi introduction Signed-off-by: Wang, Yi A <yi.a.wang@intel.com> * Update docs/source/en/perf_train_cpu_many.mdx Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update docs/source/en/perf_train_cpu_many.mdx Signed-off-by: Wang, Yi A <yi.a.wang@intel.com> Signed-off-by: Wang, Yi A <yi.a.wang@intel.com> Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * typos (#18594) * FSDP bug fix for `load_state_dict` (#18596) * Add `TFAutoModelForSemanticSegmentation` to the main `__init__.py` (#18600) Co-authored-by: ydshieh <ydshieh@users.noreply.github.com> * Generate: validate `model_kwargs` (and catch typos in generate arguments) (#18261) * validate generate model_kwargs * generate tests -- not all models have an attn mask * Supporting seq2seq models for `bitsandbytes` integration (#18579) * Supporting seq2seq models for `bitsandbytes` integration - `bitsandbytes` integration supports now seq2seq models - check if a model has tied weights as an additional check * small modification - tie the weights before looking at tied weights! * Add Donut (#18488) * First draft * Improve script * Update script * Make conversion work * Add final_layer_norm attribute to Swin's config * Add DonutProcessor * Convert more models * Improve feature extractor and convert base models * Fix bug * Improve integration tests * Improve integration tests and add model to README * Add doc test * Add feature extractor to docs * Fix integration tests * Remove register_buffer * Fix toctree and add missing attribute * Add DonutSwin * Make conversion script work * Improve conversion script * Address comment * Fix bug * Fix another bug * Remove deprecated method from docs * Make Swin and Swinv2 untouched * Fix code examples * Fix processor * Update model_type to donut-swin * Add feature extractor tests, add token2json method, improve feature extractor * Fix failing tests, remove integration test * Add do_thumbnail for consistency * Improve code examples * Add code example for document parsing * Add DonutSwin to MODEL_NAMES_MAPPING * Add model to appropriate place in toctree * Update namespace to appropriate organization Co-authored-by: Niels Rogge <nielsrogge@Nielss-MacBook-Pro.local> * Fix URLs (#18604) Co-authored-by: Niels Rogge <nielsrogge@Nielss-MacBook-Pro.local> * Update BLOOM parameter counts (#18531) * Update BLOOM parameter counts * Update BLOOM parameter counts * [doc] fix anchors (#18591) the manual anchors end up being duplicated with automatically added anchors and no longer work. * [fsmt] deal with -100 indices in decoder ids (#18592) * [fsmt] deal with -100 indices in decoder ids Fixes: https://github.com/huggingface/transformers/issues/17945 decoder ids get the default index -100, which breaks the model - like t5 and many other models add a fix to replace -100 with the correct pad index. For some reason this use case hasn't been used with this model until recently - so this issue was there since the beginning it seems. Any suggestions to how to add a simple test here? or perhaps we have something similar already? user's script is quite massive. * style * small change (#18584) * Flax Remat for LongT5 (#17994) * [Flax] Add remat (gradient checkpointing) * fix variable naming in test * flip: checkpoint using a method * fix naming * fix class naming * apply PVP's suggestions from code review * add gradient_checkpointing to examples * Add gradient_checkpointing to run_mlm_flax * Add remat to longt5 * Add gradient checkpointing test longt5 * Fix args errors * Fix remaining tests * Make fixup & quality fixes * replace kwargs * remove unecessary kwargs * Make fixup changes * revert long_t5_flax changes * Remove return_dict and copy to LongT5 * Remove test_gradient_checkpointing Co-authored-by: sanchit-gandhi <sanchit@huggingface.co> * mac m1 `mps` integration (#18598) * mac m1 `mps` integration * Update docs/source/en/main_classes/trainer.mdx Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * addressing comments * Apply suggestions from code review Co-authored-by: Dan Saattrup Nielsen <47701536+saattrupdan@users.noreply.github.com> * resolve comment Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> Co-authored-by: Dan Saattrup Nielsen <47701536+saattrupdan@users.noreply.github.com> * Change scheduled CIs to use torch 1.12.1 (#18644) Co-authored-by: ydshieh <ydshieh@users.noreply.github.com> * Add checks for some workflow jobs (#18583) Co-authored-by: ydshieh <ydshieh@users.noreply.github.com> * TF: Fix generation repetition penalty with XLA (#18648) * Update longt5.mdx (#18634) * Update run_translation_no_trainer.py (#18637) * Update run_translation_no_trainer.py found an error in selecting `no_decay` parameters and some small modifications when the user continues to train from a checkpoint * fixs `no_decay` and `resume_step` issue 1. change `no_decay` list 2. if use continue to train their model from provided checkpoint, the `resume_step` will not be initialized properly if `args.gradient_accumulation_steps != 1` * [bnb] Minor modifications (#18631) * bnb minor modifications - refactor documentation - add troubleshooting README - add PyPi library on DockerFile * Apply suggestions from code review Co-authored-by: Stas Bekman <stas00@users.noreply.github.com> * Apply suggestions from code review * Apply suggestions from code review * Apply suggestions from code review * put in one block - put bash instructions in one block * update readme - refactor a bit hardware requirements * change text a bit * Apply suggestions from code review Co-authored-by: Yih-Dar <2521628+ydshieh@users.noreply.github.com> * apply suggestions Co-authored-by: Yih-Dar <2521628+ydshieh@users.noreply.github.com> * add link to paper * Apply suggestions from code review Co-authored-by: Stas Bekman <stas00@users.noreply.github.com> * Update tests/mixed_int8/README.md * Apply suggestions from code review * refactor a bit * add instructions Turing & Amperer Co-authored-by: Stas Bekman <stas00@users.noreply.github.com> * add A6000 * clarify a bit * remove small part * Update tests/mixed_int8/README.md Co-authored-by: Stas Bekman <stas00@users.noreply.github.com> Co-authored-by: Yih-Dar <2521628+ydshieh@users.noreply.github.com> * Examples: add Bloom support for token classification (#18632) * examples: add Bloom support for token classification (FLAX, PyTorch and TensorFlow) * examples: remove support for Bloom in token classication (FLAX and TensorFlow currently have no support for it) * Fix Yolos ONNX export test (#18606) Co-authored-by: lewtun <lewis.c.tunstall@gmail.com> Co-authored-by: ydshieh <ydshieh@users.noreply.github.com> * Fixup * Fix up * Move PIL default arguments inside function for safe imports * Add image utils to toctree * Update `rescale` method to reflect changes in #18677 * Update docs/source/en/internal/image_processing_utils.mdx Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Address Niels PR comments * Add normalize method to transforms library * Apply suggestions from code review - remove defaults to None Co-authored-by: Sylvain Gugger <Sylvain.gugger@gmail.com> Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Fix docstrings and revert to PIL.Image.XXX resampling Use PIL.Image.XXX resampling values instead of PIL.Image.Resampling.XXX enum as it's only in the recent version >= 9.10 and version is not yet pinned and older version support deprecated * Some more docstrings and PIL.Image tidy up * Reorganise arguments so flags by modifiers * Few last docstring fixes * Add normalize to docs * Accept PIL.Image inputs with deprecation warning * Update src/transformers/image_transforms.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update warning to include version * Trigger CI - hash clash on doc build Signed-off-by: Seunghwan Hong <seunghwan@scatterlab.co.kr> Signed-off-by: dependabot[bot] <support@github.com> Signed-off-by: Wang, Yi A <yi.a.wang@intel.com> Co-authored-by: Amy Roberts <amyeroberts@users.noreply.github.com> Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> Co-authored-by: Alara Dirik <8944735+alaradirik@users.noreply.github.com> Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> Co-authored-by: Seunghwan Hong <harrydrippin@gmail.com> Co-authored-by: Yih-Dar <2521628+ydshieh@users.noreply.github.com> Co-authored-by: ydshieh <ydshieh@users.noreply.github.com> Co-authored-by: Sylvain Gugger <Sylvain.gugger@gmail.com> Co-authored-by: Julien Chaumond <julien@huggingface.co> Co-authored-by: regisss <15324346+regisss@users.noreply.github.com> Co-authored-by: Nicolas Patry <patry.nicolas@protonmail.com> Co-authored-by: Sourab Mangrulkar <13534540+pacman100@users.noreply.github.com> Co-authored-by: Ankur Goyal <ankrgyl@gmail.com> Co-authored-by: Ankur Goyal <ankur@impira.com> Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> Co-authored-by: Niels Rogge <nielsrogge@Nielss-MacBook-Pro.local> Co-authored-by: Mishig Davaadorj <dmishig@gmail.com> Co-authored-by: Rasmus Arpe Fogh Jensen <Rasmus.arpe@gmail.com> Co-authored-by: Ian Castillo <7807897+donelianc@users.noreply.github.com> Co-authored-by: AguilaCudicio <aguila.cudicio@gmail.com> Co-authored-by: Omar U. Espejel <espejelomar@gmail.com> Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com> Co-authored-by: Thomas Wang <24695242+thomasw21@users.noreply.github.com> Co-authored-by: Niklas Hansson <niklas.sven.hansson@gmail.com> Co-authored-by: Thomas Chaigneau <t.chaigneau.tc@gmail.com> Co-authored-by: YouJiacheng <1503679330@qq.com> Co-authored-by: Michael Benayoun <mickbenayoun@gmail.com> Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com> Co-authored-by: Matt <Rocketknight1@users.noreply.github.com> Co-authored-by: Dhruv Karan <k4r4n.dhruv@gmail.com> Co-authored-by: Michael Wyatt <mrwyattii@gmail.com> Co-authored-by: Maxime G <joihn@users.noreply.github.com> Co-authored-by: Maxime Gardoni <maxime.gardoni@ecorobotix.com> Co-authored-by: Wonseok Lee (Jack) <rollerkid02@snu.ac.kr> Co-authored-by: Dan Jones <dan.j.jones2@gmail.com> Co-authored-by: Daniel Jones <jonesdaniel@microsoft.com> Co-authored-by: flozi00 <flozi00.fz@gmail.com> Co-authored-by: iiLaurens <iiLaurens@users.noreply.github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> Co-authored-by: Wang, Yi <yi.a.wang@intel.com> Co-authored-by: Stas Bekman <stas00@users.noreply.github.com> Co-authored-by: Niklas Muennighoff <n.muennighoff@gmail.com> Co-authored-by: Karim Foda <35491698+KMFODA@users.noreply.github.com> Co-authored-by: sanchit-gandhi <sanchit@huggingface.co> Co-authored-by: Dan Saattrup Nielsen <47701536+saattrupdan@users.noreply.github.com> Co-authored-by: zhoutang776 <47708118+zhoutang776@users.noreply.github.com> Co-authored-by: Stefan Schweter <stefan@schweter.it> Co-authored-by: lewtun <lewis.c.tunstall@gmail.com>	2022-10-17 17:02:14 +01:00
amyeroberts	1973b7716b	Image transforms library (#18520 ) * Adapt FE methods to transforms library * Mixin for saving the image processor * Base processor skeleton * BatchFeature for packaging image processor outputs * Initial image processor for GLPN * REmove accidental import * Fixup and docs * Mixin for saving the image processor * Fixup and docs * Import BatchFeature from feature_extraction_utils * Fixup and docs * Fixup and docs * Fixup and docs * Fixup and docs * BatchFeature for packaging image processor outputs * Import BatchFeature from feature_extraction_utils * Import BatchFeature from feature_extraction_utils * Fixup and docs * Fixup and docs * BatchFeature for packaging image processor outputs * Import BatchFeature from feature_extraction_utils * Fixup and docs * Mixin for saving the image processor * Fixup and docs * Add rescale back and remove ImageType * fix import mistake * Fix enum var reference * Can transform and specify image data format * Remove redundant function * Update reference * Data format flag for rescale * Fix typo * Fix dimension check * Fixes to make IP and FE outputs match * Add tests for transforms * Add test for utils * Update some docstrings * Make sure in channels last before converting to PIL * Remove default to numpy batching * Fix up * Add docstring and model_input_types * Use feature processor config from hub * Alias GLPN feature extractor to image processor * Alias feature extractor mixin * Add return_numpy=False flag for resize * Fix up * Fix up * Use different frameworks safely * Safely import PIL * Call function checking if PIL available * Only import if vision available * Address Sylvain PR comments Co-authored-by: Sylvain.gugger@gmail.com * Apply suggestions from code review Co-authored-by: Sylvain Gugger <Sylvain.gugger@gmail.com> Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update src/transformers/image_transforms.py Co-authored-by: Alara Dirik <8944735+alaradirik@users.noreply.github.com> * Update src/transformers/models/glpn/feature_extraction_glpn.py Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Add in docstrings * Fix TFSwinSelfAttention to have relative position index as non-trainable weight (#18226) Signed-off-by: Seunghwan Hong <seunghwan@scatterlab.co.kr> * Refactor `TFSwinLayer` to increase serving compatibility (#18352) * Refactor `TFSwinLayer` to increase serving compatibility Signed-off-by: Seunghwan Hong <seunghwan@scatterlab.co.kr> * Fix missed parameters while refactoring Signed-off-by: Seunghwan Hong <seunghwan@scatterlab.co.kr> * Fix window_reverse to calculate batch size Signed-off-by: Seunghwan Hong <harrydrippin@gmail.com> Co-Authored-By: amyeroberts <22614925+amyeroberts@users.noreply.github.com> Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Add TF prefix to TF-Res test class (#18481) Co-authored-by: ydshieh <ydshieh@users.noreply.github.com> * Remove py.typed (#18485) * Fix pipeline tests (#18487) * Fix pipeline tests * Make sure all pipelines tests run with init changes * Use new huggingface_hub tools for download models (#18438) * Draft new cached_file * Initial draft for config and model * Small fixes * Fix first batch of tests * Look in cache when internet is down * Fix last tests * Bad black, not fixing all quality errors * Make diff less * Implement change for TF and Flax models * Add tokenizer and feature extractor * For compatibility with main * Add utils to move the cache and auto-do it at first use. * Quality * Deal with empty commit shas * Deal with empty etag * Address review comments * Fix `test_dbmdz_english` by updating expected values (#18482) Co-authored-by: ydshieh <ydshieh@users.noreply.github.com> * Move cache folder to huggingface/hub for consistency with hf_hub (#18492) * Move cache folder to just huggingface * Thank you VsCode for this needless import * Move to hub * Forgot one * Update some expected values in `quicktour.mdx` for `resampy 0.3.0` (#18484) Co-authored-by: ydshieh <ydshieh@users.noreply.github.com> * Forgot one new_ for cache migration * disable Onnx test for google/long-t5-tglobal-base (#18454) Co-authored-by: ydshieh <ydshieh@users.noreply.github.com> * Typo reported by Joel Grus on TWTR (#18493) * Just re-reading the whole doc every couple of months 😬 (#18489) * Delete valohai.yaml * NLP => ML * typo * website supports https * datasets * 60k + modalities * unrelated link fixing for accelerate * Ok those links were actually broken * Fix link * Make `AutoTokenizer` auto-link * wording tweak * add at least one non-nlp task * `transformers-cli login` => `huggingface-cli login` (#18490) * zero chance anyone's using that constant no? * `transformers-cli login` => `huggingface-cli login` * `transformers-cli repo create` => `huggingface-cli repo create` * `make style` * Add seed setting to image classification example (#18519) * [DX fix] Fixing QA pipeline streaming a dataset. (#18516) * [DX fix] Fixing QA pipeline streaming a dataset. QuestionAnsweringArgumentHandler would iterate over the whole dataset effectively killing all properties of the pipeline. This restores nice properties when using `Dataset` or `Generator` since those are meant to be consumed lazily. * Handling TF better. * Clean up hub (#18497) * Clean up utils.hub * Remove imports * More fixes * Last fix * update fsdp docs (#18521) * updating fsdp documentation * typo fix * Fix compatibility with 1.12 (#17925) * Fix compatibility with 1.12 * Remove pin from examples requirements * Update torch scatter version * Fix compatibility with 1.12 * Remove pin from examples requirements * Update torch scatter version * fix torch.onnx.symbolic_opset12 import * Reject bad version Co-authored-by: ydshieh <ydshieh@users.noreply.github.com> * Remove debug statement * Specify en in doc-builder README example (#18526) Co-authored-by: Ankur Goyal <ankur@impira.com> * New cache fixes: add safeguard before looking in folders (#18522) * unpin resampy (#18527) Co-authored-by: ydshieh <ydshieh@users.noreply.github.com> * ✨ update to use interlibrary links instead of Markdown (#18500) * Add example of multimodal usage to pipeline tutorial (#18498) * 📝 add example of multimodal usage to pipeline tutorial * 🖍 apply feedbacks * 🖍 apply niels feedback * [VideoMAE] Add model to doc tests (#18523) * Add videomae to doc tests * Add pip install decord Co-authored-by: Niels Rogge <nielsrogge@Nielss-MacBook-Pro.local> * Update perf_train_gpu_one.mdx (#18532) * Update no_trainer.py scripts to include accelerate gradient accumulation wrapper (#18473) * Added accelerate gradient accumulation wrapper to run_image_classification_no_trainer.py example script * make fixup changes * PR comments * changed input to Acceletor based on PR comment, ran make fixup * Added comment explaining the sync_gradients statement * Fixed lr scheduler max steps * Changed run_clm_no_trainer.py script to use accelerate gradient accum wrapper * Fixed all scripts except wav2vec2 pretraining to use accelerate gradient accum wrapper * Added accelerate gradient accum wrapper for wav2vec2_pretraining_no_trainer.py script * make fixup and lr_scheduler step inserted back into run_qa_beam_search_no_trainer.py * removed changes to run_wav2vec2_pretraining_no_trainer.py script and fixed using wrong constant in qa_beam_search_no_trainer.py script * Add Spanish translation of converting_tensorflow_models.mdx (#18512) * Add file in spanish docs to be translated * Finish translation to Spanish * Improve Spanish wording * Add suggested changes from review * Spanish translation of summarization.mdx (#15947) (#18477) * Add Spanish translation of summarization.mdx * Apply suggestions from code review Co-authored-by: Omar U. Espejel <espejelomar@gmail.com> Co-authored-by: Omar U. Espejel <espejelomar@gmail.com> * Let's not cast them all (#18471) * add correct dtypes when checking for params dtype * forward contrib credits * Update src/transformers/modeling_utils.py Co-authored-by: Thomas Wang <24695242+thomasw21@users.noreply.github.com> * more comments - added more comments on why we cast only floating point parameters * Update src/transformers/modeling_utils.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> Co-authored-by: sgugger <35901082+sgugger@users.noreply.github.com> Co-authored-by: Thomas Wang <24695242+thomasw21@users.noreply.github.com> * fix: data2vec-vision Onnx ready-made configuration. (#18427) * feat: add the data2vec conf that are missing https://huggingface.co/docs/transformers/serialization * fix: wrong config * Add mt5 onnx config (#18394) * update features * MT5OnnxConfig added with updated with tests and docs * fix imports * fix onnc_config_cls for mt5 Co-authored-by: Thomas Chaigneau <thomas.deeptools.ai> * Minor update of `run_call_with_unpacked_inputs` (#18541) Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> Co-authored-by: ydshieh <ydshieh@users.noreply.github.com> * BART - Fix attention mask device issue on copied models (#18540) * attempt to fix attn mask device * fix bart `_prepare_decoder_attention_mask` - add correct device - run `make fix-copies` to propagate the fix * Adding a new `align_to_words` param to qa pipeline. (#18010) * Adding a new `align_to_words` param to qa pipeline. * Update src/transformers/pipelines/question_answering.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Import protection. Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * 📝 update metric with evaluate (#18535) * Restore _init_weights value in no_init_weights (#18504) * Recover _init_weights value in no_init_weights For potential nested use. In addition, users might modify private no_init_weights as well. * Apply suggestions from code review Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Remove private variable change check Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Clean up comment * 📝 update documentation build section (#18548) * `bitsandbytes` - `Linear8bitLt` integration into `transformers` models (#17901) * first commit * correct replace function * add final changes - works like charm! - cannot implement tests yet - tested * clean up a bit * add bitsandbytes dependencies * working version - added import function - added bitsandbytes utils file * small fix * small fix - fix import issue * fix import issues * Apply suggestions from code review Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * refactor a bit - move bitsandbytes utils to utils - change comments on functions * reformat docstring - reformat docstring on init_empty_weights_8bit * Update src/transformers/__init__.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * revert bad formatting * change to bitsandbytes * refactor a bit - remove init8bit since it is useless * more refactoring - fixed init empty weights issue - added threshold param * small hack to make it work * Update src/transformers/modeling_utils.py * Update src/transformers/modeling_utils.py * revmoe the small hack * modify utils file * make style + refactor a bit * create correctly device map * add correct dtype for device map creation * Apply suggestions from code review Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * apply suggestions - remove with torch.grad - do not rely on Python bool magic! * add docstring - add docstring for new kwargs * add docstring - comment `replace_8bit_linear` function - fix weird formatting * - added more documentation - added new utility function for memory footprint tracking - colab demo to add * few modifs - typo doc - force cast into float16 when load_in_8bit is enabled * added colab link * add test architecture + docstring a bit * refactor a bit testing class * make style + refactor a bit * enhance checks - add more checks - start writing saving test * clean up a bit * male style * add more details on doc * add more tests - still needs to fix 2 tests * replace by "or" - could not fix it from GitHub GUI Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * refactor a bit testing code + add readme * make style * fix import issue * Update src/transformers/modeling_utils.py Co-authored-by: Michael Benayoun <mickbenayoun@gmail.com> * add few comments * add more doctring + make style * more docstring * raise error when loaded in 8bit * make style * add warning if loaded on CPU * add small sanity check * fix small comment * add bitsandbytes on dockerfile * Improve documentation - improve documentation from comments * add few comments * slow tests pass on the VM but not on the CI VM * Fix merge conflict * make style * another test should pass on a multi gpu setup * fix bad import in testing file * Fix slow tests - remove dummy batches - no more CUDA illegal memory errors * odify dockerfile * Update docs/source/en/main_classes/model.mdx * Update Dockerfile * Update model.mdx * Update Dockerfile * Apply suggestions from code review * few modifications - lm head can stay on disk/cpu - change model name so that test pass * change test value - change test value to the correct output - torch bmm changed to baddmm in bloom modeling when merging * modify installation guidelines * Apply suggestions from code review Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Apply suggestions from code review Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Apply suggestions from code review Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * replace `n`by `name` * merge `load_in_8bit` and `low_cpu_mem_usage` * first try - keep the lm head in full precision * better check - check the attribute `base_model_prefix` instead of computing the number of parameters * added more tests * Update src/transformers/utils/bitsandbytes.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Merge branch 'integration-8bit' of https://github.com/younesbelkada/transformers into integration-8bit * improve documentation - fix typos for installation - change title in the documentation Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> Co-authored-by: Michael Benayoun <mickbenayoun@gmail.com> * TF: XLA-trainable DeBERTa v2 (#18546) * fix deberta issues * add different code paths for gpu and tpu * shorter gpu take along axis * Stable Dropout without tf cond * variable must be float * Preserve hub-related kwargs in AutoModel.from_pretrained (#18545) * Preserve hub-related kwargs in AutoModel.from_pretrained * Fix tests * Remove debug statement * TF Examples Rewrite (#18451) * Finished QA example * Dodge a merge conflict * Update text classification and LM examples * Update NER example * New Keras metrics WIP, fix NER example * Update NER example * Update MC, summarization and translation examples * Add XLA warnings when shapes are variable * Make sure batch_size is consistently scaled by num_replicas * Add PushToHubCallback to all models * Add docs links for KerasMetricCallback * Add docs links for prepare_tf_dataset and jit_compile * Correct inferred model names * Don't assume the dataset has 'lang' * Don't assume the dataset has 'lang' * Write metrics in text classification * Add 'framework' to TrainingArguments and TFTrainingArguments * Export metrics in all examples and add tests * Fix training args for Flax * Update command line args for translation test * make fixup * Fix accidentally running other tests in fp16 * Remove do_train/do_eval from run_clm.py * Remove do_train/do_eval from run_mlm.py * Add tensorflow tests to circleci * Fix circleci * Update examples/tensorflow/language-modeling/run_mlm.py Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com> * Update examples/tensorflow/test_tensorflow_examples.py Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com> * Update examples/tensorflow/translation/run_translation.py Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com> * Update examples/tensorflow/token-classification/run_ner.py Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com> * Fix save path for tests * Fix some model card kwargs * Explain the magical -1000 * Actually enable tests this time * Skip text classification PR until we fix shape inference * make fixup Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com> * Use commit hash to look in cache instead of calling head (#18534) * Use commit hash to look in cache instead of calling head * Add tests * Add attr for local configs too * Stupid typos * Fix tests * Update src/transformers/utils/hub.py Co-authored-by: Julien Chaumond <julien@huggingface.co> * Address Julien's comments Co-authored-by: Julien Chaumond <julien@huggingface.co> * `pipeline` support for `device="mps"` (or any other string) (#18494) * `pipeline` support for `device="mps"` (or any other string) * Simplify `if` nesting * Update src/transformers/pipelines/base.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Fix? @sgugger * passing `attr=None` is not the same as not passing `attr` 🤯 Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update philosophy to include other preprocessing classes (#18550) * 📝 update philosophy to include other preprocessing classes * 🖍 apply feedbacks * Properly move cache when it is not in default path (#18563) * Adds CLIP to models exportable with ONNX (#18515) * onnx config for clip * default opset as 14 * changes from the original repo * input values order fix * outputs fix * remove unused import * ran make fix-copies * black format * review comments: forward ref, import fix, model change revert, .to cleanup * make style * formatting fixes * revert groupvit * comment for cast to int32 * comment fix * make .T as .t() for onnx conversion * ran make fix-copies * remove unneeded comment Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * fix copies * remove comment Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * raise atol for MT5OnnxConfig (#18560) Co-authored-by: ydshieh <ydshieh@users.noreply.github.com> * fix string (#18568) * Segformer TF: fix output size in documentation (#18572) * Segformer TF: fix output size in doc * Segformer pytorch: fix output size in doc Co-authored-by: Maxime Gardoni <maxime.gardoni@ecorobotix.com> * Fix resizing bug in OWL-ViT (#18573) * Fixes resizing bug in OWL-ViT * Defaults to square resize if size is set to an int * Sets do_center_crop default value to False * Fix LayoutLMv3 documentation (#17932) * fix typos * fix sequence_length docs of LayoutLMv3Model * delete trailing white spaces * fix layoutlmv3 docs more * apply make fixup & quality * change to two versions of input docstring * apply make fixup & quality * Skip broken tests * Change BartLearnedPositionalEmbedding's forward method signature to support Opacus training (#18486) * changing BartLearnedPositionalEmbedding forward signature and references to it * removing debugging dead code (thanks style checker) * blackened modeling_bart file * removing copy inconsistencies via make fix-copies * changing references to copied signatures in Bart variants * make fix-copies once more * using expand over repeat (thanks @michaelbenayoun) * expand instead of repeat for all model copies Co-authored-by: Daniel Jones <jonesdaniel@microsoft.com> * german docs translation (#18544) * Create _config.py * Create _toctree.yml * Create index.mdx not sure about "du / ihr" oder "sie" * Create quicktour.mdx * Update _toctree.yml * Update build_documentation.yml * Update build_pr_documentation.yml * fix build * Update index.mdx * Update quicktour.mdx * Create installation.mdx * Update _toctree.yml * Deberta V2: Fix critical trace warnings to allow ONNX export (#18272) * Fix critical trace warnings to allow ONNX export * Force input to `sqrt` to be float type * Cleanup code * Remove unused import statement * Update model sew * Small refactor Co-authored-by: Michael Benayoun <mickbenayoun@gmail.com> * Use broadcasting instead of repeat * Implement suggestion Co-authored-by: Michael Benayoun <mickbenayoun@gmail.com> * Match deberta v2 changes in sew_d * Improve code quality * Update code quality * Consistency of small refactor * Match changes in sew_d Co-authored-by: Michael Benayoun <mickbenayoun@gmail.com> * [FX] _generate_dummy_input supports audio-classification models for labels (#18580) * Support audio classification architectures for labels generation, as well as provides a flag to print warnings or not * Use ENV_VARS_TRUE_VALUES * Fix docstrings with last version of hf-doc-builder styler (#18581) * Fix docstrings with last version of hf-doc-builder styler * Remove empty Parameter block * Bump nbconvert from 6.0.1 to 6.3.0 in /examples/research_projects/lxmert (#18565) Bumps [nbconvert](https://github.com/jupyter/nbconvert) from 6.0.1 to 6.3.0. - [Release notes](https://github.com/jupyter/nbconvert/releases) - [Commits](https://github.com/jupyter/nbconvert/compare/6.0.1...6.3.0) --- updated-dependencies: - dependency-name: nbconvert dependency-type: direct:production ... Signed-off-by: dependabot[bot] <support@github.com> Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> * Bump nbconvert in /examples/research_projects/visual_bert (#18566) Bumps [nbconvert](https://github.com/jupyter/nbconvert) from 6.0.1 to 6.3.0. - [Release notes](https://github.com/jupyter/nbconvert/releases) - [Commits](https://github.com/jupyter/nbconvert/compare/6.0.1...6.3.0) --- updated-dependencies: - dependency-name: nbconvert dependency-type: direct:production ... Signed-off-by: dependabot[bot] <support@github.com> Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> * fix owlvit tests, update docstring examples (#18586) * Return the permuted hidden states if return_dict=True (#18578) * Load sharded pt to flax (#18419) * initial commit * add small test * add cross pt tf flag to test * fix quality * style * update test with new repo * fix failing test * update * fix wrong param ordering * style * update based on review * update related to recent new caching mechanism * quality * Update based on review Co-authored-by: sgugger <sylvain.gugger@gmail.com> * quality and style * Update src/transformers/modeling_flax_utils.py Co-authored-by: sgugger <sylvain.gugger@gmail.com> Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Add type hints for ViLT models (#18577) * Add type hints for Vilt models * Add missing return type for TokenClassification class * update doc for perf_train_cpu_many, add intel mpi introduction (#18576) * update doc for perf_train_cpu_many, add mpi introduction Signed-off-by: Wang, Yi A <yi.a.wang@intel.com> * Update docs/source/en/perf_train_cpu_many.mdx Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update docs/source/en/perf_train_cpu_many.mdx Signed-off-by: Wang, Yi A <yi.a.wang@intel.com> Signed-off-by: Wang, Yi A <yi.a.wang@intel.com> Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * typos (#18594) * FSDP bug fix for `load_state_dict` (#18596) * Add `TFAutoModelForSemanticSegmentation` to the main `__init__.py` (#18600) Co-authored-by: ydshieh <ydshieh@users.noreply.github.com> * Generate: validate `model_kwargs` (and catch typos in generate arguments) (#18261) * validate generate model_kwargs * generate tests -- not all models have an attn mask * Supporting seq2seq models for `bitsandbytes` integration (#18579) * Supporting seq2seq models for `bitsandbytes` integration - `bitsandbytes` integration supports now seq2seq models - check if a model has tied weights as an additional check * small modification - tie the weights before looking at tied weights! * Add Donut (#18488) * First draft * Improve script * Update script * Make conversion work * Add final_layer_norm attribute to Swin's config * Add DonutProcessor * Convert more models * Improve feature extractor and convert base models * Fix bug * Improve integration tests * Improve integration tests and add model to README * Add doc test * Add feature extractor to docs * Fix integration tests * Remove register_buffer * Fix toctree and add missing attribute * Add DonutSwin * Make conversion script work * Improve conversion script * Address comment * Fix bug * Fix another bug * Remove deprecated method from docs * Make Swin and Swinv2 untouched * Fix code examples * Fix processor * Update model_type to donut-swin * Add feature extractor tests, add token2json method, improve feature extractor * Fix failing tests, remove integration test * Add do_thumbnail for consistency * Improve code examples * Add code example for document parsing * Add DonutSwin to MODEL_NAMES_MAPPING * Add model to appropriate place in toctree * Update namespace to appropriate organization Co-authored-by: Niels Rogge <nielsrogge@Nielss-MacBook-Pro.local> * Fix URLs (#18604) Co-authored-by: Niels Rogge <nielsrogge@Nielss-MacBook-Pro.local> * Update BLOOM parameter counts (#18531) * Update BLOOM parameter counts * Update BLOOM parameter counts * [doc] fix anchors (#18591) the manual anchors end up being duplicated with automatically added anchors and no longer work. * [fsmt] deal with -100 indices in decoder ids (#18592) * [fsmt] deal with -100 indices in decoder ids Fixes: https://github.com/huggingface/transformers/issues/17945 decoder ids get the default index -100, which breaks the model - like t5 and many other models add a fix to replace -100 with the correct pad index. For some reason this use case hasn't been used with this model until recently - so this issue was there since the beginning it seems. Any suggestions to how to add a simple test here? or perhaps we have something similar already? user's script is quite massive. * style * small change (#18584) * Flax Remat for LongT5 (#17994) * [Flax] Add remat (gradient checkpointing) * fix variable naming in test * flip: checkpoint using a method * fix naming * fix class naming * apply PVP's suggestions from code review * add gradient_checkpointing to examples * Add gradient_checkpointing to run_mlm_flax * Add remat to longt5 * Add gradient checkpointing test longt5 * Fix args errors * Fix remaining tests * Make fixup & quality fixes * replace kwargs * remove unecessary kwargs * Make fixup changes * revert long_t5_flax changes * Remove return_dict and copy to LongT5 * Remove test_gradient_checkpointing Co-authored-by: sanchit-gandhi <sanchit@huggingface.co> * mac m1 `mps` integration (#18598) * mac m1 `mps` integration * Update docs/source/en/main_classes/trainer.mdx Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * addressing comments * Apply suggestions from code review Co-authored-by: Dan Saattrup Nielsen <47701536+saattrupdan@users.noreply.github.com> * resolve comment Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> Co-authored-by: Dan Saattrup Nielsen <47701536+saattrupdan@users.noreply.github.com> * Change scheduled CIs to use torch 1.12.1 (#18644) Co-authored-by: ydshieh <ydshieh@users.noreply.github.com> * Add checks for some workflow jobs (#18583) Co-authored-by: ydshieh <ydshieh@users.noreply.github.com> * TF: Fix generation repetition penalty with XLA (#18648) * Update longt5.mdx (#18634) * Update run_translation_no_trainer.py (#18637) * Update run_translation_no_trainer.py found an error in selecting `no_decay` parameters and some small modifications when the user continues to train from a checkpoint * fixs `no_decay` and `resume_step` issue 1. change `no_decay` list 2. if use continue to train their model from provided checkpoint, the `resume_step` will not be initialized properly if `args.gradient_accumulation_steps != 1` * [bnb] Minor modifications (#18631) * bnb minor modifications - refactor documentation - add troubleshooting README - add PyPi library on DockerFile * Apply suggestions from code review Co-authored-by: Stas Bekman <stas00@users.noreply.github.com> * Apply suggestions from code review * Apply suggestions from code review * Apply suggestions from code review * put in one block - put bash instructions in one block * update readme - refactor a bit hardware requirements * change text a bit * Apply suggestions from code review Co-authored-by: Yih-Dar <2521628+ydshieh@users.noreply.github.com> * apply suggestions Co-authored-by: Yih-Dar <2521628+ydshieh@users.noreply.github.com> * add link to paper * Apply suggestions from code review Co-authored-by: Stas Bekman <stas00@users.noreply.github.com> * Update tests/mixed_int8/README.md * Apply suggestions from code review * refactor a bit * add instructions Turing & Amperer Co-authored-by: Stas Bekman <stas00@users.noreply.github.com> * add A6000 * clarify a bit * remove small part * Update tests/mixed_int8/README.md Co-authored-by: Stas Bekman <stas00@users.noreply.github.com> Co-authored-by: Yih-Dar <2521628+ydshieh@users.noreply.github.com> * Examples: add Bloom support for token classification (#18632) * examples: add Bloom support for token classification (FLAX, PyTorch and TensorFlow) * examples: remove support for Bloom in token classication (FLAX and TensorFlow currently have no support for it) * Fix Yolos ONNX export test (#18606) Co-authored-by: lewtun <lewis.c.tunstall@gmail.com> Co-authored-by: ydshieh <ydshieh@users.noreply.github.com> * Fixup * Fix up * Move PIL default arguments inside function for safe imports * Add image utils to toctree * Update `rescale` method to reflect changes in #18677 * Update docs/source/en/internal/image_processing_utils.mdx Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Address Niels PR comments * Apply suggestions from code review - remove defaults to None Co-authored-by: Sylvain Gugger <Sylvain.gugger@gmail.com> Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Fix docstrings and revert to PIL.Image.XXX resampling Use PIL.Image.XXX resampling values instead of PIL.Image.Resampling.XXX enum as it's only in the recent version >= 9.10 and version is not yet pinned and older version support deprecated * Some more docstrings and PIL.Image tidy up * Reorganise arguments so flags by modifiers * Few last docstring fixes Signed-off-by: Seunghwan Hong <seunghwan@scatterlab.co.kr> Signed-off-by: dependabot[bot] <support@github.com> Signed-off-by: Wang, Yi A <yi.a.wang@intel.com> Co-authored-by: Amy Roberts <amyeroberts@users.noreply.github.com> Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> Co-authored-by: Alara Dirik <8944735+alaradirik@users.noreply.github.com> Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> Co-authored-by: Seunghwan Hong <harrydrippin@gmail.com> Co-authored-by: Yih-Dar <2521628+ydshieh@users.noreply.github.com> Co-authored-by: ydshieh <ydshieh@users.noreply.github.com> Co-authored-by: Sylvain Gugger <Sylvain.gugger@gmail.com> Co-authored-by: Julien Chaumond <julien@huggingface.co> Co-authored-by: regisss <15324346+regisss@users.noreply.github.com> Co-authored-by: Nicolas Patry <patry.nicolas@protonmail.com> Co-authored-by: Sourab Mangrulkar <13534540+pacman100@users.noreply.github.com> Co-authored-by: Ankur Goyal <ankrgyl@gmail.com> Co-authored-by: Ankur Goyal <ankur@impira.com> Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> Co-authored-by: Niels Rogge <nielsrogge@Nielss-MacBook-Pro.local> Co-authored-by: Mishig Davaadorj <dmishig@gmail.com> Co-authored-by: Rasmus Arpe Fogh Jensen <Rasmus.arpe@gmail.com> Co-authored-by: Ian Castillo <7807897+donelianc@users.noreply.github.com> Co-authored-by: AguilaCudicio <aguila.cudicio@gmail.com> Co-authored-by: Omar U. Espejel <espejelomar@gmail.com> Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com> Co-authored-by: Thomas Wang <24695242+thomasw21@users.noreply.github.com> Co-authored-by: Niklas Hansson <niklas.sven.hansson@gmail.com> Co-authored-by: Thomas Chaigneau <t.chaigneau.tc@gmail.com> Co-authored-by: YouJiacheng <1503679330@qq.com> Co-authored-by: Michael Benayoun <mickbenayoun@gmail.com> Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com> Co-authored-by: Matt <Rocketknight1@users.noreply.github.com> Co-authored-by: Dhruv Karan <k4r4n.dhruv@gmail.com> Co-authored-by: Michael Wyatt <mrwyattii@gmail.com> Co-authored-by: Maxime G <joihn@users.noreply.github.com> Co-authored-by: Maxime Gardoni <maxime.gardoni@ecorobotix.com> Co-authored-by: Wonseok Lee (Jack) <rollerkid02@snu.ac.kr> Co-authored-by: Dan Jones <dan.j.jones2@gmail.com> Co-authored-by: Daniel Jones <jonesdaniel@microsoft.com> Co-authored-by: flozi00 <flozi00.fz@gmail.com> Co-authored-by: iiLaurens <iiLaurens@users.noreply.github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> Co-authored-by: Wang, Yi <yi.a.wang@intel.com> Co-authored-by: Stas Bekman <stas00@users.noreply.github.com> Co-authored-by: Niklas Muennighoff <n.muennighoff@gmail.com> Co-authored-by: Karim Foda <35491698+KMFODA@users.noreply.github.com> Co-authored-by: sanchit-gandhi <sanchit@huggingface.co> Co-authored-by: Dan Saattrup Nielsen <47701536+saattrupdan@users.noreply.github.com> Co-authored-by: zhoutang776 <47708118+zhoutang776@users.noreply.github.com> Co-authored-by: Stefan Schweter <stefan@schweter.it> Co-authored-by: lewtun <lewis.c.tunstall@gmail.com>	2022-10-12 18:32:02 +01:00
IMvision12	2df602870b	Added tests for yaml and json parser (#19219 ) * Added tests for yaml and json * Added tests for yaml and json	2022-09-27 16:25:57 -04:00
Sylvain Gugger	34be08efcd	More tests for regression in cached non existence (#19216 ) * More tests for regression in cached non existence * Style	2022-09-27 09:36:34 -04:00
Sylvain Gugger	a32f97c37d	Fix cached_file in offline mode for cached non-existing files (#19206 ) * Fix cached_file in offline mode for cached non-existing files * Add tests * Test with offline mode	2022-09-26 18:01:00 -04:00
Sylvain Gugger	3c7b965bcd	Add some tests for check_dummies (#19146 )	2022-09-21 14:54:09 -04:00
Sylvain Gugger	38c3cd52fb	Clean up utils.hub using the latest from hf_hub (#18857 ) * Clean up utils.hub using the latest from hf_hub * Adapt test * Address review comment * Fix test	2022-09-02 10:30:06 -04:00
Felix Schneider	86387fe87f	Add an option to `HfArgumentParser.parse_{dict,json_file}` to raise an Exception when there extra keys (#18692 ) * Update parser to track unneeded keys, off by default * Fix formatting * Fix docstrings and defaults in HfArgparser * Fix formatting	2022-08-31 20:26:45 +02:00
amyeroberts	49e44b216b	Update feature extractor methods to enable type cast before normalize (#18499 ) * Update methods to optionally rescale This is necessary to allow for casting our images / videos to numpy arrays within the feature extractors' call. We want to do this to make sure the behaviour is as expected when flags like are False. If some transformations aren't applied, then the output type can't be unexpected e.g. a list of PIL images instead of numpy arrays. * Cast images to numpy arrays in call to enable consistent behaviour with different configs * Remove accidental clip changes * Update tests to reflect the scaling logic We write a generic function to handle rescaling of our arrays. In order for the API to be intuitive, we take some factor c and rescale the image values by that. This means, the rescaling done in normalize and to_numpy_array are now done with array * (1/255) instead of array / 255. This leads to small differences in the resulting image. When testing, this was in the order of 1e-8, and so deemed OK	2022-08-17 19:57:07 +01:00
Sylvain Gugger	377cdded7a	Clean up hub (#18497 ) * Clean up utils.hub * Remove imports * More fixes * Last fix	2022-08-08 08:48:10 -04:00

1 2 3 4 5 ...

271 Commits