transformers

mirror of https://github.com/huggingface/transformers.git synced 2025-07-31 02:02:21 +06:00

Author	SHA1	Message	Date
Arthur	115ac94d06	[`Core generation`] Adds support for static KV cache (#27931 ) Co-authored-by: fxmarty <9808326+fxmarty@users.noreply.github.com> Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com> Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>	2024-02-08 11:50:34 +01:00
Javier	4b236aed76	Fix utf-8 yaml load for marian conversion to pytorch in Windows (#28618 ) Fix utf-8 yaml in marian conversion	2024-02-08 08:23:15 +01:00
Klaus Hipp	33df036917	[Docs] Revert translation of '@slow' decorator (#28912 )	2024-02-08 03:31:47 +01:00
Klaus Hipp	328ade855b	[Docs] Fix placement of tilde character (#28913 ) Fix placement of tilde character	2024-02-07 17:19:39 -08:00
Huazhong Ji	5f96855761	Add npu device for pipeline (#28885 ) add npu device for pipeline Co-authored-by: unit_test <test@unit.com>	2024-02-07 17:27:01 +00:00
Yih-Dar	308d2b9004	Update the cache number (#28905 ) * fix * fix * fix --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2024-02-07 16:37:09 +01:00
Daniel Korat	abf8f54a01	⚠️ Raise `Exception` when trying to generate 0 tokens ⚠️ (#28621 ) * change warning to exception * Update src/transformers/generation/utils.py Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com> * validate `max_new_tokens` > 0 in `GenerationConfig` * fix truncation test parameterization in `TextGenerationPipelineTests` --------- Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>	2024-02-07 13:42:01 +01:00
Matt	349a6e8542	Fix Keras scheduler import so it works for older versions of Keras (#28895 ) Fix our schedule import so it works for older versions of Keras	2024-02-07 12:28:24 +00:00
Sourab Mangrulkar	d9deddb4c1	fix Starcoder FA2 implementation (#28891 )	2024-02-07 14:10:10 +05:30
Sai-Suraj-27	64d1518cbf	fix: Fixed the documentation for `logging_first_step` by removing "evaluate" (#28884 ) Fixed the documentation for logging_first_step by removing evaluate.	2024-02-07 08:46:36 +01:00
Klaus Hipp	1c31b7aa3b	[Docs] Add missing language options and fix broken links (#28852 ) * Add missing entries to the language selector * Add links to the Colab and AWS Studio notebooks for ONNX * Use anchor links in CONTRIBUTING.md * Fix broken hyperlinks due to spaces * Fix links to OpenAI research articles * Remove confusing footnote symbols from author names, as they are also considered invalid markup	2024-02-06 12:01:01 -08:00
Yih-Dar	40658be461	Hotfix - make `torchaudio` get the correct version in `torch_and_flax_job` (#28899 ) * check * check * check --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2024-02-06 21:00:42 +01:00
Klaus Hipp	4830f26965	[Docs] Fix backticks in inline code and documentation links (#28875 ) Fix backticks in code blocks and documentation links	2024-02-06 11:15:44 -08:00
Lucain	a1afec9e17	Explicit server error on gated model (#28894 )	2024-02-06 17:45:20 +00:00
Yih-Dar	89439fea64	unpin torch (#28892 ) * unpin torch * check * check * check --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2024-02-06 17:21:05 +01:00
Yih-Dar	76b4f666f5	Revert "[WIP] Hard error when ignoring tensors." (#28898 ) Revert "[WIP] Hard error when ignoring tensors. (#27484)" This reverts commit `2da28c4b41`.	2024-02-06 17:18:30 +01:00
Yih-Dar	6529a5b5c1	Fix `FastSpeech2ConformerModelTest` and skip it on CPU (#28888 ) * fix * fix --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2024-02-06 11:05:23 +01:00
Sourab Mangrulkar	5346db1684	Raise error when using `save_only_model` with `load_best_model_at_end` for DeepSpeed/FSDP (#28866 ) * Raise error when using `save_only_model` with `load_best_model_at_end` for DeepSpeed/FSDP * Update trainer.py	2024-02-06 11:25:44 +05:30
Eran Hirsch	ee2a3400f2	Fix LongT5ForConditionalGeneration initialization of lm_head (#28873 )	2024-02-06 04:24:20 +01:00
Klaus Hipp	1ea0bbd73c	[Docs] Update project names and links in awesome-transformers (#28878 ) Update project names and repository links in awesome-transformers	2024-02-06 04:06:29 +01:00
dependabot[bot]	e83227d76e	Bump cryptography from 41.0.2 to 42.0.0 in /examples/research_projects/decision_transformer (#28879 ) Bump cryptography in /examples/research_projects/decision_transformer Bumps [cryptography](https://github.com/pyca/cryptography) from 41.0.2 to 42.0.0. - [Changelog](https://github.com/pyca/cryptography/blob/main/CHANGELOG.rst) - [Commits](https://github.com/pyca/cryptography/compare/41.0.2...42.0.0) --- updated-dependencies: - dependency-name: cryptography dependency-type: direct:production ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>	2024-02-06 03:53:08 +01:00
nakranivaibhav	2e7c942c81	Adds LlamaForQuestionAnswering class in modeling_llama.py along with AutoModel Support (#28777 ) * This is a test commit * testing commit * final commit with some changes * Removed copy statement * Fixed formatting issues * Fixed error added past_key_values in the forward method * Fixed a trailing whitespace. Damn the formatting rules are strict * Added the copy statement	2024-02-06 03:41:42 +01:00
xkszltl	ac51e59e47	Do not use mtime for checkpoint rotation. (#28862 ) Resolve https://github.com/huggingface/transformers/issues/26961	2024-02-06 03:21:50 +01:00
eajechiloae	06901162b5	ClearMLCallback enhancements: support multiple runs and handle logging better (#28559 ) * add clearml tracker * support multiple train runs * remove bad code * add UI entries for config/hparams overrides * handle models in different tasks * run ruff format * tidy code based on code review --------- Co-authored-by: Eugen Ajechiloae <eugenajechiloae@gmail.com>	2024-02-05 20:04:17 +00:00
amyeroberts	ba3264b4e8	Image Feature Extraction pipeline (#28216 ) * Draft pipeline * Fixup * Fix docstrings * Update doctest * Update pipeline_model_mapping * Update docstring * Update tests * Update src/transformers/pipelines/image_feature_extraction.py Co-authored-by: Omar Sanseviero <osanseviero@gmail.com> * Fix docstrings - review comments * Remove pipeline mapping for composite vision models * Add to pipeline tests * Remove for flava (multimodal) * safe pil import * Add requirements for pipeline run * Account for super slow efficientnet * Review comments * Fix tests * Swap order of kwargs * Use build_pipeline_init_args * Add back FE pipeline for Vilt * Include image_processor_kwargs in docstring * Mark test as flaky * Update TODO * Update tests/pipelines/test_pipelines_image_feature_extraction.py Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> * Add license header --------- Co-authored-by: Omar Sanseviero <osanseviero@gmail.com> Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>	2024-02-05 14:50:07 +00:00
Yoach Lacombe	7addc9346c	Correct wav2vec2-bert inputs_to_logits_ratio (#28821 ) * Correct wav2vec2-bert inputs_to_logits_ratio * correct ratio * correct ratio, clean asr pipeline * refactor on one line	2024-02-05 13:14:47 +00:00
Arthur	3f9f749325	[`Doc`] update contribution guidelines (#28858 ) update guidelines	2024-02-05 21:19:21 +09:00
Nicolas Patry	2da28c4b41	[WIP] Hard error when ignoring tensors. (#27484 ) * [WIP] Hard error when ignoring tensors. * Better selection/error when saving a checkpoint. - Find all names we should normally drop (those are in the transformers config) - Find all disjoint tensors (for those we can safely trigger a copy to get rid of the sharing before saving) - Clone those disjoint tensors getting rid of the issue - Find all identical names (those should be declared in the config but we try to find them all anyway.) - For all identical names: - If they are in the config, just ignore them everything is fine - If they are not, warn about them. - For all remainder tensors which are shared yet neither identical NOR disjoint. raise a hard error. * Adding a failing test on `main` that passes here. * We don't need to keep the subfolder logic in this test. * Apply suggestions from code review Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> --------- Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>	2024-02-05 09:17:24 +01:00
w4ffl35	0466fd5ca2	Ability to override clean_code_for_run (#28783 ) * Add clean_code_for_run function * Call clean_code_for_run from agent method	2024-02-05 03:48:41 +01:00
Zizhao Chen	c430d6eaee	[Docs] Fix bad doc: replace save with logging (#28855 ) Fix bad doc: replace save with logging	2024-02-05 03:38:08 +01:00
Ziyang	7b702836af	Support custom scheduler in deepspeed training (#26831 ) Reuse trainer.create_scheduler to create scheduler for deepspeed	2024-02-05 03:33:55 +01:00
dependabot[bot]	ca8944c4e3	Bump dash from 2.3.0 to 2.15.0 in /examples/research_projects/decision_transformer (#28845 ) Bump dash in /examples/research_projects/decision_transformer Bumps [dash](https://github.com/plotly/dash) from 2.3.0 to 2.15.0. - [Release notes](https://github.com/plotly/dash/releases) - [Changelog](https://github.com/plotly/dash/blob/dev/CHANGELOG.md) - [Commits](https://github.com/plotly/dash/compare/v2.3.0...v2.15.0) --- updated-dependencies: - dependency-name: dash dependency-type: direct:production ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>	2024-02-05 03:12:30 +01:00
amyeroberts	3d2900e829	Mark `test_encoder_decoder_model_generate` for `vision_encoder_deocder` as flaky (#28842 ) Mark test as flaky	2024-02-02 16:57:08 +00:00
Sourab Mangrulkar	80d50076c8	Reduce GPU memory usage when using FSDP+PEFT (#28830 ) support FSDP+PEFT	2024-02-02 21:18:01 +05:30
Yih-Dar	f497795948	Use `-v` for `pytest` on CircleCI (#28840 ) use -v in pytest Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2024-02-02 16:44:13 +01:00
Yih-Dar	a7cb92aa03	fix / skip (for now) some tests before switch to torch 2.2 (#28838 ) * fix / skip some tests before we can switch to torch 2.2 * style --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2024-02-02 14:11:50 +01:00
Yih-Dar	0e75aeefaf	Fix issues caused by natten (#28834 ) try Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2024-02-02 21:11:48 +09:00
Juri Ganitkevitch	ec29d25d9f	Add missing None check for hf_quantizer (#28804 ) * Add missing None check for hf_quantizer * Add test, fix logic. * make style * Switch test model to Mistral * Comment * Update tests/test_modeling_utils.py --------- Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>	2024-02-02 09:34:12 +01:00
skumar951	1efb21c764	Explicitly check if token ID's are None in TFBertTokenizer constructor (#28824 ) Add an explicit none-check, since token ids can be 0	2024-02-02 09:13:36 +01:00
Klaus Hipp	721ee783ca	[Docs] Fix spelling and grammar mistakes (#28825 ) * Fix typos and grammar mistakes in docs and examples * Fix typos in docstrings and comments * Fix spelling of `tokenizer` in model tests * Remove erroneous spaces in decorators * Remove extra spaces in Markdown link texts	2024-02-02 08:45:00 +01:00
Steven Liu	2418c64a1c	[docs] HfQuantizer (#28820 ) * tidy * fix path	2024-02-02 08:22:18 +01:00
Steven Liu	abbffc4525	[docs] Backbone (#28739 ) * backbones * fix path * fix paths * fix code snippet * fix links	2024-02-01 09:16:16 -08:00
Rockerz	23ea6743f2	Add models from deit (#28302 ) * Add modelss * Add 2 more models * add models to tocrree * Add modles * Update docs/source/ja/model_doc/detr.md Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update docs/source/ja/model_doc/deit.md Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update docs/source/ja/model_doc/deplot.md Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * fix bugs --------- Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>	2024-02-01 09:15:55 -08:00
zspo	d98591a12b	[docs] fix some bugs about parameter description (#28806 ) Co-authored-by: p_spozzhang <p_spozzhang@tencent.com>	2024-02-01 16:59:29 +00:00
Sangbum Daniel Choi	e19c12e094	enable graident checkpointing in DetaObjectDetection and add tests in Swin/Donut_Swin (#28615 ) * enable graident checkpointing in DetaObjectDetection * fix missing part in original DETA * make style * make fix-copies * Revert "make fix-copies" This reverts commit 4041c86c29248f1673e8173b677c20b5a4511358. * remove fix-copies of DetaDecoder * enable swin gradient checkpointing * fix gradient checkpointing in donut_swin * add tests for deta/swin/donut * Revert "fix gradient checkpointing in donut_swin" This reverts commit 1cf345e34d3cc0e09eb800d9895805b1dd9b474d. * change supports_gradient_checkpointing pipeline to PreTrainedModel * Revert "add tests for deta/swin/donut" This reverts commit 6056ffbb1eddc3cb3a99e4ebb231ae3edf295f5b. * Revert "Revert "fix gradient checkpointing in donut_swin"" This reverts commit 24e25d0a14891241de58a0d86f817d0b5d2a341f. * Simple revert * enable deformable detr gradient checkpointing * add gradient in encoder	2024-02-01 15:07:44 +00:00
Matt	7bc6d76396	Add tip on setting tokenizer attributes (#28764 ) * Add tip on setting tokenizer attributes * Grammar * Remove the bit that was causing doc builds to fail	2024-02-01 14:44:58 +00:00
fxmarty	709dc43239	Fix symbolic_trace with kv cache (#28724 ) * fix symbolic_trace with kv cache * comment & better test	2024-02-01 09:45:02 +01:00
Yih-Dar	eb8e7a005f	Make `is_torch_bf16_available_on_device` more strict (#28796 ) fix Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2024-02-01 09:03:53 +01:00
JB (Don)	0d26abdd3a	Adding [T5/MT5/UMT5]ForTokenClassification (#28443 ) * Adding [T5/MT5/UMT5]ForTokenClassification * Add auto mappings for T5ForTokenClassification and variants * Adding ForTokenClassification to the list of models * Adding attention_mask param to the T5ForTokenClassification test * Remove outdated comment in test * Adding EncoderOnly and Token Classification tests for MT5 and UMT5 * Fix typo in umt5 string * Add tests for all the existing MT5 models * Fix wrong comment in dependency_versions_table * Reverting change to common test for _keys_to_ignore_on_load_missing The test is correctly picking up redundant keys in _keys_to_ignore_on_load_missing. * Removing _keys_to_ignore_on_missing from MT5 since the key is not used in the model * Add fix-copies to MT5ModelTest	2024-02-01 03:53:49 +01:00
Shichao Song	7b2bd1fbbd	[docs] Correct the statement in the docstirng of compute_transition_scores in generation/utils.py (#28786 )	2024-01-31 17:07:30 +00:00

15079 Commits