Yih-Dar
db5e0c3292
Fix `MistralIntegrationTest` OOM ( #26754 )
...
* fix
* fix
* fix
---------
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2023-10-12 12:31:11 +02:00
Yih-Dar
72256bc72a
Fix `PersimmonIntegrationTest` OOM ( #26750 )
...
* fix
---------
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2023-10-12 11:24:18 +02:00
Lysandre Debut
ab0ddc99e8
Warnings controlled by logger level ( #26527 )
...
* Logger level
Co-authored-by: Sahil Bhosale <sahilbhosale63@live.com>
Co-authored-by: Adithya4720 <hegdeadithyak@gmail.com>
Co-authored-by: Sachin Singh <sachinishu02@gmail.com>
Co-authored-by: Riya Dhanduke <113622644+riiyaa24@users.noreply.github.com>
* More comprehensive documentation
---------
Co-authored-by: Sahil Bhosale <sahilbhosale63@live.com>
Co-authored-by: Adithya4720 <hegdeadithyak@gmail.com>
Co-authored-by: Sachin Singh <sachinishu02@gmail.com>
Co-authored-by: Riya Dhanduke <113622644+riiyaa24@users.noreply.github.com>
2023-10-12 10:48:38 +02:00
Tom Aarsen
40ea9ab2a1
Add many missing spaces in adjacent strings ( #26751 )
...
Add missing spaces in adjacent strings
2023-10-12 10:28:40 +02:00
Yih-Dar
3bc65505fc
Fix doctest for `Blip2ForConditionalGeneration` ( #26737 )
...
* fix
* fix
* fix
* fix
* fix
* fix
* fix
* fix
* fix
---------
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2023-10-12 10:01:07 +02:00
TERRY LEE
e1cec43415
Translated the accelerate.md file of the documentation to Chinese ( #26161 )
...
* translate accelerate page
* Update docs/source/zh/accelerate.md
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
---------
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
2023-10-11 10:54:22 -07:00
Rockerz
9b7668c03a
add japanese documentation ( #26138 )
...
* update
* update
* Update docs/source/ja/autoclass_tutorial.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* add codes workflows/build_pr_documentation.yml
* Create preprocessing.md
* added traning.md
* Create Model_sharing.md
* add quicktour.md
* new
* ll
* Create benchmark.md
* Create Tensorflow_model
* add
* add community.md
* add create_a_model
* create custom_model.md
* create_custom_tools.md
* create fast_tokenizers.md
* create
* add
* Update docs/source/ja/_toctree.yml
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* md
* add
* commit
* add
* h
* Update docs/source/ja/peft.md
Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>
* Update docs/source/ja/_toctree.yml
Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>
* Update docs/source/ja/_toctree.yml
Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>
* Suggested Update
* add perf_train_gpu_one.md
* added perf based MD files
* Modify toctree.yml and add translation to md codes
* Add `serialization.md` and edit `_toctree.yml`
* add task summary and tasks explained
* Add and Modify files starting from T
* Add testing.md
* Create main_classes files
* delete main_classes folder
* Add toctree.yml
* Update llm_tutorail.md
* Update docs/source/ja/_toctree.yml
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* Update misspelled filenames
* Update docs/source/ja/_toctree.yml
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* Update docs/source/ja/_toctree.yml
* Update docs/source/ja/_toctree.yml
* misspelled file names improvements
* Update _toctree.yml
* close tip block
* close another tip block
* Update docs/source/ja/quicktour.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* Update docs/source/ja/pipeline_tutorial.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* Update docs/source/ja/pipeline_tutorial.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* Update docs/source/ja/preprocessing.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* Update docs/source/ja/peft.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* Update docs/source/ja/add_new_model.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* Update docs/source/ja/testing.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* Update docs/source/ja/task_summary.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* Update docs/source/ja/tasks_explained.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* Update glossary.md
* Update docs/source/ja/transformers_agents.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* Update docs/source/ja/llm_tutorial.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* Update docs/source/ja/create_a_model.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* Update docs/source/ja/torchscript.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* Update docs/source/ja/benchmarks.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* Update docs/source/ja/troubleshooting.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* Update docs/source/ja/troubleshooting.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* Update docs/source/ja/troubleshooting.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* Update docs/source/ja/add_new_model.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* Update perf_torch_compile.md
* Update Year to default in en documentation
* Final Update
---------
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>
2023-10-11 10:26:37 -07:00
Bojun-Feng
797a1babf2
[docstring] Fix docstring for `CodeLlamaTokenizer` ( #26709 )
...
* update check_docstrings
* update docstring
2023-10-11 18:01:22 +02:00
Minho Ryang
aaccf1844e
[docstring] Fix docstring for `LlamaTokenizer` and `LlamaTokenizerFast` ( #26669 )
...
* [docstring] Fix docstring for `LlamaTokenizer` and `LlamaTokenizerFast`
* [docstring] Fix docstring typo at `LlamaTokenizer` and `LlamaTokenizerFast`
2023-10-11 17:03:31 +02:00
Yih-Dar
e58cbed51d
Revert #20715 ( #26734 )
...
* fix
---------
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2023-10-11 16:46:41 +02:00
Yih-Dar
b219ae6bd4
Update docker files to use `torch==2.1.0` ( #26735 )
...
Update docker files to use torch 2.1
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2023-10-11 16:23:36 +02:00
Zach Mueller
1d6a84749b
Fix checkpoint path in `no_trainer` scripts ( #26733 )
...
checkpoint path
2023-10-11 16:16:27 +02:00
Lysandre Debut
6ecb2ab679
Fix stale bot for locked issues ( #26711 )
2023-10-11 16:08:55 +02:00
Sourab Mangrulkar
69873d529d
fix the model card issue as `use_cuda_amp` is no longer available ( #26731 )
2023-10-11 15:58:23 +02:00
Shivanand
cc44ca8017
[docstring] `SwinModel` docstring fix ( #26679 )
...
* remove from utils
* updated doc string
* only in the model
* Update src/transformers/models/swin/modeling_swin.py
Co-authored-by: Yih-Dar <2521628+ydshieh@users.noreply.github.com>
* Update src/transformers/models/swin/modeling_swin.py
Co-authored-by: Yih-Dar <2521628+ydshieh@users.noreply.github.com>
---------
Co-authored-by: Yih-Dar <2521628+ydshieh@users.noreply.github.com>
2023-10-11 15:53:32 +02:00
Patrick von Platen
da69de17e8
[Assistant Generation] Improve Encoder Decoder ( #26701 )
...
* [Assistant Generation] Improve enc dec
* save more
* Fix logit processor checks
* Clean
* make style
* fix deprecation
* fix generation test
* Apply suggestions from code review
* fix biogpt
* make style
2023-10-11 15:52:20 +02:00
Yih-Dar
5334796d20
`Copied from` for test files ( #26713 )
...
* copied statement for test files
---------
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2023-10-11 14:12:09 +02:00
Ben Gubler
9f40639292
Update docs to explain disabling callbacks using report_to ( #26155 )
...
* feat: update callback doc to explain disabling callbacks using report_to
* docs: update report_to docstring
2023-10-11 07:50:23 -04:00
Billy Bradley
dcc49d8a7e
In assisted decoding, pass model_kwargs to model's forward call (fix prepare_input_for_generation in all models) ( #25242 )
...
* In assisted decoding, pass model_kwargs to model's forward call
Previously, assisted decoding would ignore any additional kwargs
that it doesn't explicitly handle. This was inconsistent with other
generation methods, which pass the model_kwargs through
prepare_inputs_for_generation and forward the returned dict to the
model's forward call.
The prepare_inputs_for_generation method needs to be amended in all
models, as previously it only kept the last input ID when a past_key_values
was passed.
* Improve variable names in _extend_attention_mask
* Refactor extending token_type_ids into a function
* Replace deepcopy with copy to optimize performance
* Update new persimmon model with llama changes for assisted generation
* Update new mistral model for assisted generation with prepare_inputs_for_generation
* Update position_ids creation in falcon prepare_inputs_for_generation to support assisted generation
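The change to `prepare_inputs_for_generation` described in this commit can be sketched on plain Python lists. This is only an illustration, not the library's code: the function names `keep_last_token` and `keep_unprocessed_tokens`, and the integer `past_length` standing in for the length already held in `past_key_values`, are assumptions made for the example.

```python
def keep_last_token(input_ids, past_length):
    """Earlier behaviour as described: with a cache, keep only the last input ID."""
    if past_length > 0:
        return [row[-1:] for row in input_ids]
    return input_ids


def keep_unprocessed_tokens(input_ids, past_length):
    """Sketch of the amended behaviour: keep every token the cache has not seen."""
    if past_length == 0:
        return input_ids
    seq_length = len(input_ids[0])
    # If the cache already covers the whole input, still feed the last token.
    start = past_length if seq_length > past_length else seq_length - 1
    return [row[start:] for row in input_ids]


# One sequence of 6 tokens: the cache covers the first 3, and the remaining
# 3 are candidate tokens that still have to go through the forward call.
batch = [[11, 12, 13, 14, 15, 16]]
old_inputs = keep_last_token(batch, past_length=3)
new_inputs = keep_unprocessed_tokens(batch, past_length=3)
print(old_inputs)
print(new_inputs)
```

With several candidate tokens pending, keeping only the last input ID would drop the tokens in between, which is presumably why the method had to change for assisted decoding.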
2023-10-11 13:18:42 +02:00
Thien Tran
1e3c9ddacc
Make Whisper Encoder's sinusoidal PE non-trainable by default ( #26032 )
...
* set encoder's PE as non-trainable
* freeze flax
* init sinusoids
* add test for non-trainable embed positions
* simplify TF encoder embed_pos
* revert tf
* clean up
* add sinusoidal init for jax
* make consistent sinusoidal function
* fix dtype
* add default dtype
* use numpy for sinusoids. fix jax
* add sinusoid init for TF
* fix
* use custom embedding
* use specialized init for each impl
* fix sinusoids init. add test for pytorch
* fix TF dtype
* simplify sinusoid init for flax and tf
* add tests for TF
* change default dtype to float32
* add sinusoid test for flax
* Update src/transformers/models/whisper/modeling_flax_whisper.py
Co-authored-by: Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com>
* Update src/transformers/models/whisper/modeling_tf_whisper.py
Co-authored-by: Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com>
* move sinusoidal init to _init_weights
---------
Co-authored-by: sanchit-gandhi <sanchit@huggingface.co>
Co-authored-by: Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com>
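A rough, framework-free sketch of what a fixed sinusoidal position table looks like. The function `sinusoids`, its `max_timescale` default, the sine-then-cosine layout, and the `FrozenPositionTable` class are assumptions chosen for illustration; the code in the commit itself is not reproduced here.

```python
import math


def sinusoids(length, channels, max_timescale=10000.0):
    """Return a length x channels table: sines in the first half, cosines in the second."""
    if channels % 2 != 0 or channels < 4:
        raise ValueError("channels must be even and at least 4")
    half = channels // 2
    # Inverse timescales form a geometric progression from 1 down to 1 / max_timescale.
    log_increment = math.log(max_timescale) / (half - 1)
    inv_timescales = [math.exp(-log_increment * i) for i in range(half)]
    table = []
    for pos in range(length):
        scaled = [pos * inv for inv in inv_timescales]
        table.append([math.sin(s) for s in scaled] + [math.cos(s) for s in scaled])
    return table


class FrozenPositionTable:
    """Hypothetical container: holds the table and marks it as non-trainable."""

    def __init__(self, length, channels):
        self.weight = sinusoids(length, channels)
        self.requires_grad = False  # fixed values, excluded from training


embed_positions = FrozenPositionTable(length=4, channels=6)
print(embed_positions.weight[0])
```

Because the table is a deterministic function of position, there is nothing to learn, which is the motivation for treating it as non-trainable by default.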
2023-10-11 09:08:54 +01:00
Roy Hvaara
fc63914399
[JAX] Replace uses of `jnp.array` in types with `jnp.ndarray`. ( #26703 )
...
`jnp.array` is a function, not a type:
https://jax.readthedocs.io/en/latest/_autosummary/jax.numpy.array.html
so it never makes sense to use `jnp.array` in a type annotation. Presumably the intent was to write `jnp.ndarray` aka `jax.Array`.
Co-authored-by: Peter Hawkins <phawkins@google.com>
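The function-versus-type distinction can be checked directly. NumPy is used below as a stand-in, on the assumption that it is easier to have installed than JAX; it has the same split between the `array` constructor function and the `ndarray` type. The helper `scale` is hypothetical.

```python
import numpy as np

# NumPy stands in for jax.numpy here (assumption): `np.array` is a
# constructor function, `np.ndarray` is the actual array type.
x = np.array([1.0, 2.0, 3.0])

array_is_type = isinstance(np.array, type)      # the constructor is a function
ndarray_is_type = isinstance(np.ndarray, type)  # the class is a real type


def scale(values: np.ndarray, factor: float) -> np.ndarray:
    """Annotated with the type, not the constructor."""
    return values * factor


try:
    isinstance(x, np.array)  # a function is not a valid class for isinstance
    constructor_usable_as_type = True
except TypeError:
    constructor_usable_as_type = False

print(array_is_type, ndarray_is_type, constructor_usable_as_type)
```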
2023-10-10 21:35:16 +02:00
jheitmann
3eceaa3637
Fix source_prefix default value ( #26654 )
2023-10-10 20:49:10 +02:00
théo gigant
975003eacb
fix a typo in flax T5 attention - attention_mask variable is misnamed ( #26663 )
...
* fix a typo in flax t5 attention
* fix the typo in flax longt5 attention
2023-10-10 20:36:32 +02:00
Pavarissy
e8fdd7875d
[docstring] Fix docstring for `LlamaConfig` ( #26685 )
...
* Your commit message here
* fix LlamaConfig docstring
* run make fixup
* fix formatting after review
reformat of the file to prevent script issues
* rerun make fixup after reformat
2023-10-10 17:05:48 +02:00
Tuowei Wang
a9862a0f49
Fix Typo: table in deepspeed.md ( #26705 )
2023-10-10 11:50:10 +02:00
jiqing-feng
592f2eabd1
Control first downsample stride in ResNet ( #26374 )
...
* control first downsample stride
* reduce first only works for ResNetBottleNeckLayer
* fix param name
* fix style
2023-10-10 06:45:24 +02:00
Isaac Chung
a5e6df82c0
[docstring] Fix docstrings for `CLIP` ( #26691 )
...
fix docstrings for vanilla clip
2023-10-09 17:39:05 +02:00
Lysandre Debut
87b4ade9e5
Fix stale bot ( #26692 )
...
* Fix stale bot
* Comments
2023-10-09 16:39:57 +02:00
Alex Bzdel
3257946fb7
[docstring] Fix docstring for DonutImageProcessor ( #26641 )
...
* removed donutimageprocessor from objects_to_ignore
* added docstring for donutimageprocessor
* readding donut file
* moved docstring to correct location
2023-10-09 16:32:13 +02:00
Isaac Chung
d2f06dfffc
[docstring] Fix docstring for `CLIPImageProcessor` ( #26676 )
...
fix docstring for CLIPImageProcessor
2023-10-09 14:22:44 +02:00
Isaac Chung
3763101f85
[docstring] Fix docstring CLIP configs ( #26677 )
...
* fix docstrings for CLIP configs
* black formatted
2023-10-09 12:34:01 +02:00
tom white
c7f01beece
fix typos in idefics.md ( #26648 )
...
* fix typos in idefics.md
Two typos found in reviewing this documentation.
1) max_new_tokens=4, is not sufficient to generate "Vegetables" as indicated - you will get only "Veget". (incidentally - some mention of how to select this value might be useful as it seems to change in each example)
2) inputs = processor(prompts, return_tensors="pt").to(device) as inputs need to be on the same device (as they are in all other examples on the page)
* Update idefics.md
Change device to cuda explicitly to match other examples
2023-10-09 12:18:02 +02:00
Yih-Dar
740fc6a1da
Avoid CI OOM ( #26639 )
...
fix avoid oom
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2023-10-09 11:42:08 +02:00
D. Carpintero
8835bff6a0
fix links in README.md for the GPT, GPT-2, and Llama2 Models ( #26640 )
...
* fix OpenAI GPT, GPT-2 links
* fix Llama2 link
2023-10-09 11:34:44 +02:00
Shreyas S
86a4e5a96b
Fixed malapropism error ( #26660 )
...
Update test_integration.py
Fixed malapropism clone>copy
2023-10-09 11:04:57 +02:00
NielsRogge
2629c8f36a
[DINOv2] Convert more checkpoints ( #26177 )
...
* Convert checkpoints
* Update doc test
* Address comment
2023-10-09 09:58:04 +02:00
Jabasukuriputo Wang
897a826d83
docs(zh): review and punctuation & space fix ( #26627 )
2023-10-06 09:24:28 -07:00
Yih-Dar
360ea8fc72
[docstring] Fix docstring for `AlbertConfig` ( #26636 )
...
example fix docstring
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2023-10-06 17:36:22 +02:00
Arthur
9ad815e412
[`LlamaTokenizerFast`] Adds edge cases for the template processor ( #26606 )
...
* make sure eos and bos are properly handled for fast tokenizer
* fix code llama as well
* nits
* fix the conversion script as well
* fix failing test
2023-10-06 16:40:54 +02:00
statelesshz
27597fea07
remove SharedDDP as it is deprecated ( #25702 )
...
* remove SharedDDP as it was deprecated
* apply review suggestion
* make style
* Oops, forgot to remove the compute_loss context manager in Seq2SeqTrainer.
* remove the unnecessary conditional statement
* keep the logic of IPEX
* clean code
* mix precision setup & make fixup
---------
Co-authored-by: statelesshz <jihuazhong1@huawei.com>
2023-10-06 16:03:11 +02:00
Yih-Dar
e840aa67e8
Fix failing `MusicgenTest.test_pipeline_text_to_audio` ( #26586 )
...
* fix
* fix
* Fix
* Fix
---------
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2023-10-06 15:53:59 +02:00
rui-ren
87499420bf
fix RoPE t range issue for fp16 ( #26602 )
2023-10-06 12:04:54 +01:00
Matt
ea52ed9dc8
Update chat template docs with more tips on writing a template ( #26625 )
2023-10-06 12:04:40 +01:00
fxmarty
64845307b3
Remove unnecessary unsqueeze - squeeze in rotary positional embedding ( #26162 )
...
* remove unnecessary unsqueeze-squeeze in llama
* correct other models
* fix
* revert gpt_neox_japanese
* fix copies
* fix test
2023-10-06 18:25:15 +09:00
Tianqi Liu
65aabafe2f
Update tokenization_code_llama_fast.py ( #26576 )
...
* Update tokenization_code_llama_fast.py
* Update test_tokenization_code_llama.py
* Update test_tokenization_code_llama.py
2023-10-06 10:49:02 +02:00
Towdo
af38c837ee
Fixed inconsistency in several fast tokenizers ( #26561 )
2023-10-06 10:40:47 +02:00
Ramiro Leal-Cavazos
8878eb1bd9
Remove unnecessary `view`s of `position_ids` ( #26059 )
...
* Remove unnecessary `view` of `position_ids` in `modeling_llama`
When `position_ids` is `None`, its value is generated using
`torch.arange`, which creates a tensor of size `(seq_length +
past_key_values_length) - past_key_values_length = seq_length`. The
tensor is then unsqueezed, resulting in a tensor of shape `(1,
seq_length)`. This means that the last `view` to a tensor of shape
`(-1, seq_length)` is a no-op.
This commit removes the unnecessary view.
* Remove no-op `view` of `position_ids` in rest of transformer models
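The shape argument in this commit message can be verified with a few lines. NumPy is used as a stand-in for PyTorch (an assumption made to keep the sketch light): `np.arange` plays the role of `torch.arange`, `[None, :]` the role of `unsqueeze(0)`, and `reshape` the role of `view`. The concrete lengths are arbitrary.

```python
import numpy as np

seq_length = 5
past_key_values_length = 3

# arange(past, seq_length + past) always has exactly seq_length elements.
position_ids = np.arange(past_key_values_length, seq_length + past_key_values_length)
position_ids = position_ids[None, :]             # stand-in for unsqueeze(0)
reshaped = position_ids.reshape(-1, seq_length)  # stand-in for view(-1, seq_length)

same = np.array_equal(position_ids, reshaped)
print(position_ids.shape, reshaped.shape, same)
```

The tensor already has shape `(1, seq_length)` before the final reshape, so that step changes nothing.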
2023-10-06 10:28:00 +02:00
Yih-Dar
75a33d60f2
Don't install `pytorch-quantization` in Doc Builder docker file ( #26622 )
...
Fix
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2023-10-05 16:57:50 +02:00
Maria Khalusova
18fbeec824
[docs] Update to scripts building index.md ( #26546 )
...
* build the table in index.md with links to the model_doc
* removed list generation on index.md
* fixed missing models
* make style
2023-10-05 10:20:41 -04:00
Yih-Dar
9d20601259
Fix `transformers-pytorch-gpu` docker build ( #26615 )
...
Fix
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2023-10-05 15:33:35 +02:00