transformers

mirror of https://github.com/huggingface/transformers.git synced 2025-07-31 02:02:21 +06:00

Author	SHA1	Message	Date
Yih-Dar	db5e0c3292	Fix `MistralIntegrationTest` OOM (#26754 ) * fix * fix * fix --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2023-10-12 12:31:11 +02:00
Yih-Dar	72256bc72a	Fix `PersimmonIntegrationTest` OOM (#26750 ) * fix --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2023-10-12 11:24:18 +02:00
Lysandre Debut	ab0ddc99e8	Warnings controlled by logger level (#26527 ) * Logger level Co-authored-by: Sahil Bhosale <sahilbhosale63@live.com> Co-authored-by: Adithya4720 <hegdeadithyak@gmail.com> Co-authored-by: Sachin Singh <sachinishu02@gmail.com> Co-authored-by: Riya Dhanduke <113622644+riiyaa24@users.noreply.github.com> * More comprehensive documentation --------- Co-authored-by: Sahil Bhosale <sahilbhosale63@live.com> Co-authored-by: Adithya4720 <hegdeadithyak@gmail.com> Co-authored-by: Sachin Singh <sachinishu02@gmail.com> Co-authored-by: Riya Dhanduke <113622644+riiyaa24@users.noreply.github.com>	2023-10-12 10:48:38 +02:00
Tom Aarsen	40ea9ab2a1	Add many missing spaces in adjacent strings (#26751 ) Add missing spaces in adjacent strings	2023-10-12 10:28:40 +02:00
Yih-Dar	3bc65505fc	Fix doctest for `Blip2ForConditionalGeneration` (#26737 ) * fix * fix * fix * fix * fix * fix * fix * fix * fix --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2023-10-12 10:01:07 +02:00
TERRY LEE	e1cec43415	Translated the accelerate.md file of the documentation to Chinese (#26161 ) * translate accelerate page * Update docs/source/zh/accelerate.md Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> --------- Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>	2023-10-11 10:54:22 -07:00
Rockerz	9b7668c03a	add japanese documentation (#26138 ) * udpaet * update * Update docs/source/ja/autoclass_tutorial.md Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * add codes workflows/build_pr_documentation.yml * Create preprocessing.md * added traning.md * Create Model_sharing.md * add quicktour.md * new * ll * Create benchmark.md * Create Tensorflow_model * add * add community.md * add create_a_model * create custom_model.md * create_custom_tools.md * create fast_tokenizers.md * create * add * Update docs/source/ja/_toctree.yml Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * md * add * commit * add * h * Update docs/source/ja/peft.md Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com> * Update docs/source/ja/_toctree.yml Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com> * Update docs/source/ja/_toctree.yml Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com> * Suggested Update * add perf_train_gpu_one.md * added perf based MD files * Modify toctree.yml and Add transmartion to md codes * Add `serialization.md` and edit `_toctree.yml` * add task summary and tasks explained * Add and Modify files starting from T * Add testing.md * Create main_classes files * delete main_classes folder * Add toctree.yml * Update llm_tutorail.md * Update docs/source/ja/_toctree.yml Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update misspelled filenames * Update docs/source/ja/_toctree.yml Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update docs/source/ja/_toctree.yml * Update docs/source/ja/_toctree.yml * missplled file names inmrpovements * Update _toctree.yml * close tip block * close another tip block * Update docs/source/ja/quicktour.md Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update docs/source/ja/pipeline_tutorial.md Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update docs/source/ja/pipeline_tutorial.md Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update docs/source/ja/preprocessing.md Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update docs/source/ja/peft.md Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update docs/source/ja/add_new_model.md Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update docs/source/ja/testing.md Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update docs/source/ja/task_summary.md Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update docs/source/ja/tasks_explained.md Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update glossary.md * Update docs/source/ja/transformers_agents.md Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update docs/source/ja/llm_tutorial.md Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update docs/source/ja/create_a_model.md Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update docs/source/ja/torchscript.md Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update docs/source/ja/benchmarks.md Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update docs/source/ja/troubleshooting.md Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update docs/source/ja/troubleshooting.md Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update docs/source/ja/troubleshooting.md Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update docs/source/ja/add_new_model.md Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update perf_torch_compile.md * Update Year to default in en documentation * Final Update --------- Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>	2023-10-11 10:26:37 -07:00
Bojun-Feng	797a1babf2	[docstring] Fix docstring for `CodeLlamaTokenizer` (#26709 ) * update check_docstrings * update docstring	2023-10-11 18:01:22 +02:00
Minho Ryang	aaccf1844e	[docstring] Fix docstring for `LlamaTokenizer` and `LlamaTokenizerFast` (#26669 ) * [docstring] Fix docstring for `LlamaTokenizer` and `LlamaTokenizerFast` * [docstring] Fix docstring typo at `LlamaTokenizer` and `LlamaTokenizerFast`	2023-10-11 17:03:31 +02:00
Yih-Dar	e58cbed51d	Revert #20715 (#26734 ) * fix --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2023-10-11 16:46:41 +02:00
Yih-Dar	b219ae6bd4	Update docker files to use `torch==2.1.0` (#26735 ) Update docker files to use torch 2.1 Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2023-10-11 16:23:36 +02:00
Zach Mueller	1d6a84749b	Fix checkpoint path in `no_trainer` scripts (#26733 ) checkpoint path	2023-10-11 16:16:27 +02:00
Lysandre Debut	6ecb2ab679	Fix stale bot for locked issues (#26711 )	2023-10-11 16:08:55 +02:00
Sourab Mangrulkar	69873d529d	fix the model card issue as `use_cuda_amp` is no more available (#26731 )	2023-10-11 15:58:23 +02:00
Shivanand	cc44ca8017	[docstring] `SwinModel` docstring fix (#26679 ) * remove from utils * updated doc string * only in the model * Update src/transformers/models/swin/modeling_swin.py Co-authored-by: Yih-Dar <2521628+ydshieh@users.noreply.github.com> * Update src/transformers/models/swin/modeling_swin.py Co-authored-by: Yih-Dar <2521628+ydshieh@users.noreply.github.com> --------- Co-authored-by: Yih-Dar <2521628+ydshieh@users.noreply.github.com>	2023-10-11 15:53:32 +02:00
Patrick von Platen	da69de17e8	[Assistant Generation] Improve Encoder Decoder (#26701 ) * [Assistant Generation] Improve enc dec * save more * Fix logit processor checks * Clean * make style * fix deprecation * fix generation test * Apply suggestions from code review * fix biogpt * make style	2023-10-11 15:52:20 +02:00
Yih-Dar	5334796d20	`Copied from` for test files (#26713 ) * copied statement for test files --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2023-10-11 14:12:09 +02:00
Ben Gubler	9f40639292	Update docs to explain disabling callbacks using report_to (#26155 ) * feat: update callback doc to explain disabling callbacks using report_to * docs: update report_to docstring	2023-10-11 07:50:23 -04:00
Billy Bradley	dcc49d8a7e	In assisted decoding, pass model_kwargs to model's forward call (fix prepare_input_for_generation in all models) (#25242 ) * In assisted decoding, pass model_kwargs to model's forward call Previously, assisted decoding would ignore any additional kwargs that it doesn't explicitly handle. This was inconsistent with other generation methods, which pass the model_kwargs through prepare_inputs_for_generation and forward the returned dict to the model's forward call. The prepare_inputs_for_generation method needs to be amended in all models, as previously it only kept the last input ID when a past_key_values was passed. * Improve variable names in _extend_attention_mask * Refactor extending token_type_ids into a function * Replace deepcopy with copy to optimize performance * Update new persimmon model with llama changes for assisted generation * Update new mistral model for assisted generation with prepare_inputs_for_generation * Update position_ids creation in falcon prepare_inputs_for_generation to support assisted generation	2023-10-11 13:18:42 +02:00
Thien Tran	1e3c9ddacc	Make Whisper Encoder's sinusoidal PE non-trainable by default (#26032 ) * set encoder's PE as non-trainable * freeze flax * init sinusoids * add test for non-trainable embed positions * simplify TF encoder embed_pos * revert tf * clean up * add sinusoidal init for jax * make consistent sinusoidal function * fix dtype * add default dtype * use numpy for sinusoids. fix jax * add sinusoid init for TF * fix * use custom embedding * use specialized init for each impl * fix sinusoids init. add test for pytorch * fix TF dtype * simplify sinusoid init for flax and tf * add tests for TF * change default dtype to float32 * add sinusoid test for flax * Update src/transformers/models/whisper/modeling_flax_whisper.py Co-authored-by: Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com> * Update src/transformers/models/whisper/modeling_tf_whisper.py Co-authored-by: Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com> * move sinusoidal init to _init_weights --------- Co-authored-by: sanchit-gandhi <sanchit@huggingface.co> Co-authored-by: Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com>	2023-10-11 09:08:54 +01:00
Roy Hvaara	fc63914399	[JAX] Replace uses of `jnp.array` in types with `jnp.ndarray`. (#26703 ) `jnp.array` is a function, not a type: https://jax.readthedocs.io/en/latest/_autosummary/jax.numpy.array.html so it never makes sense to use `jnp.array` in a type annotation. Presumably the intent was to write `jnp.ndarray` aka `jax.Array`. Co-authored-by: Peter Hawkins <phawkins@google.com>	2023-10-10 21:35:16 +02:00
jheitmann	3eceaa3637	Fix source_prefix default value (#26654 )	2023-10-10 20:49:10 +02:00
théo gigant	975003eacb	fix a typo in flax T5 attention - attention_mask variable is misnamed (#26663 ) * fix a typo in flax t5 attention * fix the typo in flax longt5 attention	2023-10-10 20:36:32 +02:00
Pavarissy	e8fdd7875d	[docstring] Fix docstring for `LlamaConfig` (#26685 ) * Your commit message here * fix LlamaConfig docstring * run make fixup * fix formatting after review reformat of the file to prevent script issues * rerun make fixup after reformat	2023-10-10 17:05:48 +02:00
Tuowei Wang	a9862a0f49	Fix Typo: table in deepspeed.md (#26705 )	2023-10-10 11:50:10 +02:00
jiqing-feng	592f2eabd1	Control first downsample stride in ResNet (#26374 ) * control first downsample stride * reduce first only works for ResNetBottleNeckLayer * fix param name * fix style	2023-10-10 06:45:24 +02:00
Isaac Chung	a5e6df82c0	[docstring] Fix docstrings for `CLIP` (#26691 ) fix docstrings for vanilla clip	2023-10-09 17:39:05 +02:00
Lysandre Debut	87b4ade9e5	Fix stale bot (#26692 ) * Fix stale bot * Comments	2023-10-09 16:39:57 +02:00
Alex Bzdel	3257946fb7	[docstring] Fix docstring for DonutImageProcessor (#26641 ) * removed donutimageprocessor from objects_to_ignore * added docstring for donutimageprocessor * readding donut file * moved docstring to correct location	2023-10-09 16:32:13 +02:00
Isaac Chung	d2f06dfffc	[docstring] Fix docstring for `CLIPImageProcessor` (#26676 ) fix docstring for CLIPImageProcessor	2023-10-09 14:22:44 +02:00
Isaac Chung	3763101f85	[docstring] Fix docstring CLIP configs (#26677 ) * fix docstrings for CLIP configs * black formatted	2023-10-09 12:34:01 +02:00
tom white	c7f01beece	fix typos in idefics.md (#26648 ) * fix typos in idefics.md Two typos found in reviewing this documentation. 1) max_new_tokens=4, is not sufficient to generate "Vegetables" as indicated - you will get only "Veget". (incidentally - some mention of how to select this value might be useful as it seems to change in each example) 2) inputs = processor(prompts, return_tensors="pt").to(device) as inputs need to be on the same device (as they are in all other examples on the page) * Update idefics.md Change device to cuda explicitly to match other examples	2023-10-09 12:18:02 +02:00
Yih-Dar	740fc6a1da	Avoid CI OOM (#26639 ) fix avoid oom Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2023-10-09 11:42:08 +02:00
D. Carpintero	8835bff6a0	fix links in README.md for the GPT, GPT-2, and Llama2 Models (#26640 ) * fix OpenAI GPT, GPT-2 links * fix Llama2 link	2023-10-09 11:34:44 +02:00
Shreyas S	86a4e5a96b	Fixed malapropism error (#26660 ) Update test_integration.py Fixed malapropism clone>copy	2023-10-09 11:04:57 +02:00
NielsRogge	2629c8f36a	[DINOv2] Convert more checkpoints (#26177 ) * Convert checkpoints * Update doc test * Address comment	2023-10-09 09:58:04 +02:00
Jabasukuriputo Wang	897a826d83	docs(zh): review and punctuation & space fix (#26627 )	2023-10-06 09:24:28 -07:00
Yih-Dar	360ea8fc72	[docstring] Fix docstring for `AlbertConfig` (#26636 ) example fix docstring Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2023-10-06 17:36:22 +02:00
Arthur	9ad815e412	[`LlamaTokenizerFast`] Adds edge cases for the template processor (#26606 ) * make sure eos and bos are properly handled for fast tokenizer * fix code llama as well * nits * fix the conversion script as well * fix failing test	2023-10-06 16:40:54 +02:00
statelesshz	27597fea07	remove SharedDDP as it is deprecated (#25702 ) * remove SharedDDP as it was drepracated * apply review suggestion * make style * Oops,forgot to remove the compute_loss context manager in Seq2SeqTrainer. * remove the unnecessary conditional statement * keep the logic of IPEX * clean code * mix precision setup & make fixup --------- Co-authored-by: statelesshz <jihuazhong1@huawei.com>	2023-10-06 16:03:11 +02:00
Yih-Dar	e840aa67e8	Fix failing `MusicgenTest .test_pipeline_text_to_audio` (#26586 ) * fix * fix * Fix * Fix --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2023-10-06 15:53:59 +02:00
rui-ren	87499420bf	fix RoPE t range issue for fp16 (#26602 )	2023-10-06 12:04:54 +01:00
Matt	ea52ed9dc8	Update chat template docs with more tips on writing a template (#26625 )	2023-10-06 12:04:40 +01:00
fxmarty	64845307b3	Remove unnecessary unsqueeze - squeeze in rotary positional embedding (#26162 ) * remove unnecessary unsqueeze-squeeze in llama * correct other models * fix * revert gpt_neox_japanese * fix copie * fix test	2023-10-06 18:25:15 +09:00
Tianqi Liu	65aabafe2f	Update tokenization_code_llama_fast.py (#26576 ) * Update tokenization_code_llama_fast.py * Update test_tokenization_code_llama.py * Update test_tokenization_code_llama.py	2023-10-06 10:49:02 +02:00
Towdo	af38c837ee	Fixed inconsistency in several fast tokenizers (#26561 )	2023-10-06 10:40:47 +02:00
Ramiro Leal-Cavazos	8878eb1bd9	Remove unnecessary `view`s of `position_ids` (#26059 ) * Remove unnecessary `view` of `position_ids` in `modeling_llama` When `position_ids` is `None`, its value is generated using `torch.arange`, which creates a tensor of size `(seq_length + past_key_values_length) - past_key_values_length = seq_length`. The tensor is then unsqueezed, resulting in a tensor of shape `(1, seq_length)`. This means that the last `view` to a tensor of shape `(-1, seq_length)` is a no-op. This commit removes the unnecessary view. * Remove no-op `view` of `position_ids` in rest of transformer models	2023-10-06 10:28:00 +02:00
Yih-Dar	75a33d60f2	Don't install `pytorch-quantization` in Doc Builder docker file (#26622 ) Fix Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2023-10-05 16:57:50 +02:00
Maria Khalusova	18fbeec824	[docs] Update to scripts building index.md (#26546 ) * build the table in index.md with links to the model_doc * removed list generation on index.md * fixed missing models * make style	2023-10-05 10:20:41 -04:00
Yih-Dar	9d20601259	Fix `transformers-pytorch-gpu` docker build (#26615 ) Fix Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2023-10-05 15:33:35 +02:00
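
Commit 1e3c9ddacc in the log above describes initializing the Whisper encoder's positional embeddings from sinusoids and keeping them non-trainable. The following is a minimal sketch of how such a fixed table can be computed, written in plain Python rather than any framework. It assumes the common log-spaced timescale formulation; the function name `sinusoids` and the `max_timescale` default of 10000 are illustrative assumptions and are not taken from the commit itself.

```python
import math


def sinusoids(length, channels, max_timescale=10000.0):
    """Return a length x channels table of fixed sinusoidal position embeddings.

    Hypothetical helper for illustration only; not the library's function.
    The first half of each row holds sines, the second half cosines.
    """
    if channels % 2 != 0 or channels < 4:
        raise ValueError("channels must be even and at least 4")
    half = channels // 2
    # Timescales are spaced geometrically between 1 and 1 / max_timescale.
    log_increment = math.log(max_timescale) / (half - 1)
    inv_timescales = [math.exp(-log_increment * i) for i in range(half)]
    table = []
    for pos in range(length):
        scaled = [pos * t for t in inv_timescales]
        table.append([math.sin(s) for s in scaled] + [math.cos(s) for s in scaled])
    return table


# Example: 4 positions, 8 channels.
table = sinusoids(length=4, channels=8)
```

Because every value is a deterministic function of the position index, a table like this can be written into an embedding's weights once and excluded from gradient updates, which is the behavior the commit message describes as the new default.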

14182 Commits