Joao Gante
12febc20db
Generate: Export TF generate with a TF tokenizer ( #22310 )
* Export TF generate with a TF tokenizer
* remove unused lines
2023-03-22 15:00:20 +00:00
Sylvain Gugger
5fd4e3c87c
Enforce max_memory for device_map strategies ( #22311 )
Enforce for device_map strategies
2023-03-22 09:22:07 -04:00
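The entry above is about honoring per-device memory budgets when building a `device_map`. As a framework-free illustration only (a hypothetical `assign_device_map`, not the transformers implementation), the core idea can be sketched as a greedy placement that never exceeds each device's cap:

```python
# Hypothetical sketch of enforcing max_memory in a device_map strategy:
# place each module on the first device whose remaining budget can hold it.
def assign_device_map(module_sizes, max_memory):
    """module_sizes: {name: bytes}; max_memory: {device: byte budget}."""
    used = {device: 0 for device in max_memory}
    device_map = {}
    for name, size in module_sizes.items():
        for device, budget in max_memory.items():
            if used[device] + size <= budget:
                device_map[name] = device
                used[device] += size
                break
        else:
            raise ValueError(f"{name!r} does not fit under the given max_memory")
    return device_map
```

The dicts rely on Python 3.7+ insertion order, so earlier devices fill up first, spilling later modules to the next device (e.g. GPU first, then CPU).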
silentghoul-spec
48bef3a734
Fixed bug to calculate correct xpath_sub_list in MarkupLMTokenizer ( #22302 )
Fixed bug to calculate correct xpath_sub_list in MarkupLMTokenizer. Earlier xpath_sub_list was same as xpath_tags_list
Co-authored-by: dusejat <dusejat@amazon.com>
2023-03-22 12:07:49 +00:00
Nick Hill
4e94c6c008
Fix position embeddings for GPT-J and CodeGen ( #22069 )
* Revert "[GPT-J] add deprecation warning (#21869 )"
This reverts commit fb76994c41.
* Fix position embeddings for GPT-J and CodeGen
* Address review comments from @gante
* Fix "Copied from" comment referencing wrong function
* Fix copy/paste mistake
* Fix training path
* Hopefully make torch.fx happy
* Move position_ids long cast
* Revert "Hopefully make torch.fx happy"
This reverts commit e41a6f4cad3ff441124c7457b19cfb630d4ca025.
* Changes to help with torch.fx tracing
* Linter fix
* Correct position_ids tensor type hint
* Work-around torch.fx tracing issue
* Get the changes to work with torch.fx
* Address review comment from @michaelbenayoun
* Another small adjustment
* Add explanatory comment; small code tidyup
2023-03-22 11:14:54 +00:00
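The GPT-J/CodeGen fix above concerns how rotary position embeddings are indexed. As a minimal pure-Python sketch (a hypothetical `rotary_angles` helper, not the model code), rotary embeddings assign each position `p` the angles `p / base**(2j/dim)`; looking these up by an explicit `position_ids` list, instead of assuming positions `0..seq_len-1`, is the kind of change involved:

```python
# Illustrative only: rotary position angles theta[p][j] = p / base**(2j/dim),
# gathered by explicit position_ids rather than an implicit 0..seq_len-1 range.
def rotary_angles(position_ids, dim, base=10000.0):
    inv_freq = [base ** (-(2 * j) / dim) for j in range(dim // 2)]
    return [[p * f for f in inv_freq] for p in position_ids]
```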
Connor Henderson
8e6c34b390
fix: Allow only test_file in pytorch and flax summarization ( #22293 )
allow only test_file in pytorch and flax summarization
2023-03-22 10:46:56 +00:00
Wang, Yi
4ccaf268fb
add low_cpu_mem_usage option in run_clm.py example which will benefit… ( #22288 )
* add low_cpu_mem_usage option in run_clm.py example which will benefit LLM loading
Signed-off-by: Wang, Yi A <yi.a.wang@intel.com>
* update all the example and README under language-modeling
Signed-off-by: Wang, Yi A <yi.a.wang@intel.com>
---------
Signed-off-by: Wang, Yi A <yi.a.wang@intel.com>
2023-03-22 10:42:39 +00:00
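As a back-of-the-envelope model (illustrative numbers and a hypothetical helper, not a real measurement) of why `low_cpu_mem_usage` benefits large-model loading: the naive path materializes randomly-initialized weights and the checkpoint copy at the same time, so peak CPU use is roughly twice the model size, while the low-memory path fills each tensor in place:

```python
# Rough model of peak CPU memory during checkpoint loading (not measured):
# naive loading holds an initialized copy plus the checkpoint copy at once.
def peak_load_bytes(param_bytes, low_cpu_mem_usage=False):
    total = sum(param_bytes.values())
    return total if low_cpu_mem_usage else 2 * total
```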
jiqing-feng
8472a224fb
Enable traced model for text-generation task ( #22265 )
2023-03-22 10:19:26 +00:00
Alara Dirik
0558914dff
Add MaskedImageModelingOutput ( #22212 )
* Add MaskedImageModelingOutput
2023-03-22 07:35:47 +03:00
Yih-Dar
0dcb46e7a4
Final update of doctest ( #22299 )
* update
* update
---------
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2023-03-22 01:00:33 +01:00
Stas Bekman
89a0a9eace
[deepspeed] offload + non-cpuadam optimizer exception doc ( #22044 )
* [deepspeed] offload + non-cpuadam optimizer exception doc
* deps
2023-03-21 17:00:05 -07:00
Ali Hassani
5990743fdd
Correct NATTEN function signatures and force new version ( #22298 )
2023-03-21 17:21:34 -04:00
Yanming W
d35f729649
Restore fp16 support on xla gpu device ( #22300 )
2023-03-21 16:32:43 -04:00
Yih-Dar
67c2dbdb54
Time to Say Goodbye, torch 1.7 and 1.8 ( #22291 )
* time to say goodbye, torch 1.7 and 1.8
* clean up torch_int_div
* clean up is_torch_less_than_1_8-9
* update
---------
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2023-03-21 19:22:01 +01:00
Davide Gazzè
86c7931a70
Add translation perf_infer_gpu_one for it ( #22296 )
Add translation
2023-03-21 13:07:30 -04:00
Yih-Dar
d0b942d1dc
fix more doctests ( #22292 )
* fix more doctests
* fix style
---------
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2023-03-21 16:16:17 +01:00
Yih-Dar
48327c5718
More doctests ( #22268 )
* all doctests
* Skip failed tests
---------
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2023-03-21 13:27:30 +01:00
Gerald Cuder
5a2b77a6c1
Fix error in mixed precision training of TFCvtModel ( #22267 )
* Make sure CVT can be trained using mixed precision
* Add test for keras-fit with mixed-precision
* Update tests/models/cvt/test_modeling_tf_cvt.py
Co-authored-by: Matt <Rocketknight1@users.noreply.github.com>
---------
Co-authored-by: gcuder <Gerald.Cuder@iacapps.com>
Co-authored-by: Matt <Rocketknight1@users.noreply.github.com>
2023-03-21 12:12:57 +00:00
Andrei Panferov
330d8b991f
replace_8bit_linear modules_to_not_convert default value fix ( #22238 )
* Fixed modules_to_not_convert default value
* Fixed modules_to_not_convert docstring
* Update src/transformers/utils/bitsandbytes.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* Update src/transformers/utils/bitsandbytes.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* ["lm_head"] if modules_to_not_convert is None
---------
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
2023-03-21 10:16:07 +00:00
amyeroberts
c07a02a4b7
Update vision docstring bool masked pos ( #22237 )
* Add bool_masked_pos to forward docstrings
* Add note about mask ratio - videomae
* Fix up
* Fix indenting
2023-03-20 20:06:16 +00:00
Maria Khalusova
7bd8650512
Example of pad_to_multiple_of for padding and truncation guide & docstring update ( #22278 )
* added an example of pad_to_multiple_of
* make style
* addressed feedback
2023-03-20 14:18:55 -04:00
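The arithmetic behind `pad_to_multiple_of` is just rounding the sequence length up to the nearest multiple, e.g. so padded tensors land on hardware-friendly sizes like multiples of 8 for tensor cores. A tiny sketch with a hypothetical `padded_length` helper:

```python
# Round seq_len up to the nearest multiple, as a padding target
# (illustration of the pad_to_multiple_of behavior, not the tokenizer code).
def padded_length(seq_len: int, pad_to_multiple_of: int) -> int:
    return -(-seq_len // pad_to_multiple_of) * pad_to_multiple_of
```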
Antoni Viros
fb0a38b4f2
Move torch.compile() wrapping after DDP/FSDP wrapping to ensure correct graph breaks during training ( #22279 )
2023-03-20 13:54:01 -04:00
amyeroberts
8ac29fe090
Fix doc links ( #22274 )
2023-03-20 17:07:31 +00:00
Sylvain Gugger
da005253b8
Proper map location for optimizer load ( #22273 )
* Proper map location for optimizer load
* What happened to my code?
2023-03-20 11:30:46 -04:00
Sylvain Gugger
786092a35e
Rework a bit the LLaMA conversion script ( #22236 )
* Update LLaMA conversion script
* Doc
* Fix the weight size for the 13B checkpoint
* Update src/transformers/models/llama/convert_llama_weights_to_hf.py
Co-authored-by: Lysandre Debut <lysandre.debut@reseau.eseo.fr>
---------
Co-authored-by: Lysandre Debut <lysandre.debut@reseau.eseo.fr>
2023-03-20 11:30:36 -04:00
Sylvain Gugger
43efd7cb13
Fix balanced and auto device_map ( #22271 )
2023-03-20 11:24:17 -04:00
yqy2001
89f0fda5d3
Fix the gradient checkpointing bug of the llama model ( #22270 )
fix grad ckpt bug of llama
2023-03-20 10:26:50 -04:00
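For context on the entry above, gradient checkpointing trades compute for memory: keep activations only at segment boundaries during the forward pass, then rebuild intermediate activations from the nearest checkpoint when the backward pass needs them. A toy, framework-free sketch with illustrative names (not the llama fix itself):

```python
# Toy recompute scheme behind gradient checkpointing (illustration only).
def forward_with_checkpoints(x, layers, every):
    saved = [x]  # always keep the input as the first checkpoint
    for i, layer in enumerate(layers):
        x = layer(x)
        if (i + 1) % every == 0:
            saved.append(x)  # checkpoint every `every` layers
    return x, saved

def recompute(saved, layers, every, target):
    """Recompute the activation after layers[:target] from the nearest checkpoint."""
    seg = target // every
    x = saved[seg]
    for layer in layers[seg * every:target]:
        x = layer(x)
    return x
```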
heya5
cf0af9a31b
[Trainer] Add optional communication backends for torch.distributed when using GPU ( #22247 )
Update training_args.py
2023-03-20 09:17:34 -04:00
Nicola Procopio
c4bf6f38bd
Italian translation perf_infer_cpu ( #22243 )
* added translated files: perf_train_cpu and perf_train_cpu_many
* updated toctree
* updated toctree
* added file perf_infer_cpu.mdx
* italian translation perf_infer_cpu.mdx
2023-03-20 09:16:07 -04:00
yesinkim
466144d440
[Docs] fix typos in some tokenizer docs ( #22256 )
[Docs] fix typos
Co-authored-by: yesinkim <yesinkim@yesinkimui-MacBookAir.local>
2023-03-20 12:17:31 +00:00
Pasquale Minervini
a48310de47
Update training_args.py -- a nightly install is not required anymore for torch.compile ( #22266 )
Update training_args.py
A nightly install is not required anymore for `torch.compile`.
2023-03-20 12:00:05 +00:00
Stas Bekman
60d51ef512
[trainer] param count for deepspeed zero3 ( #22193 )
[trainer] param count for zero3
2023-03-17 11:02:55 -07:00
Guangyuan Ma
cf601b902f
Fix Unnecessary move of tensors from CPU to GPU in LlamaRotaryEmbedding ( #22234 )
push
2023-03-17 13:56:32 -04:00
Yih-Dar
bec075612a
Revert "Use dash==2.8.1 for now for daily CI" ( #22233 )
Revert "Use `dash==2.8.1` for now for daily CI (#22227 )"
This reverts commit 53218671d9.
2023-03-17 16:54:27 +01:00
Ali Hassani
3028b20a71
Fix natten ( #22229 )
* Add kernel size to NATTEN's QK arguments.
The new NATTEN 0.14.5 supports PyTorch 2.0, but also adds an additional
argument to the QK operation to allow optional RPBs.
This ends up failing NATTEN tests.
This commit adds NATTEN back to circleci and adds the arguments to get
it working again.
* Force NATTEN >= 0.14.5
2023-03-17 11:07:55 -04:00
Seb0
074490b2c2
fix(docs): fix task guide links in model docs ( #22226 )
fix(docs): task guide links in model docs
2023-03-17 14:30:17 +00:00
Maria Khalusova
314cdf7c25
Removed .mdx extension in two links ( #22230 )
removed .mdx extension
2023-03-17 10:27:12 -04:00
lewtun
f251441387
Add LlamaForSequenceClassification ( #22209 )
* Add LlamaForSequenceClassification
* Update src/transformers/models/llama/modeling_llama.py
Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>
* Update src/transformers/models/llama/modeling_llama.py
Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>
* Add docstring
* Add test
* Add input embedding getter and setter
* Remove dead code
---------
Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>
2023-03-17 14:39:26 +01:00
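Sequence-classification heads on causal LMs typically pool from the last non-padding token, since only that position has attended to the whole input. A framework-free sketch of that position lookup (illustrative helper, not the modeling code):

```python
# Find the index of the last non-padding token, the position a causal-LM
# classification head would typically score the sequence from (illustration).
def last_token_index(input_ids, pad_token_id):
    for i in range(len(input_ids) - 1, -1, -1):
        if input_ids[i] != pad_token_id:
            return i
    return 0
```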
Wang, Yi
675d2a5a00
Fix AutoTP in DeepSpeed not working for BLOOM ( #22196 )
* fix AutoTP in deepspeed could not work for bloom
Signed-off-by: Wang, Yi A <yi.a.wang@intel.com>
* add a method in BloomModel to build ailib
Signed-off-by: Wang, Yi A <yi.a.wang@intel.com>
---------
Signed-off-by: Wang, Yi A <yi.a.wang@intel.com>
2023-03-17 09:28:17 -04:00
Sylvain Gugger
00934026a4
LLaMA house-keeping ( #22216 )
* LLaMA house-keeping
* Doc links
2023-03-17 08:55:15 -04:00
Maria Khalusova
42f8f76402
Depth estimation task guide ( #22205 )
* added doc to toc, auto tip with supported models, mention of task guide in model docs
* make style
* removed "see also"
* minor fix
2023-03-17 08:36:23 -04:00
Yih-Dar
53218671d9
Use dash==2.8.1 for now for daily CI ( #22227 )
Use dash 2.8.1 for now
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2023-03-17 13:27:14 +01:00
wangpeng
af1c864cdc
fix code example in mgp-str doc ( #22219 )
Co-authored-by: yue kun <yuekun.wp@alibaba-inc.com>
2023-03-17 09:40:06 +00:00
Kevin Turner
33d033d694
fix typos in llama.mdx ( #22223 )
2023-03-17 08:43:18 +00:00
Yih-Dar
97a3d16a69
Hotfix for natten issue with torch 2.0.0 on CircleCI ( #22218 )
fix
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2023-03-16 23:57:26 +01:00
Yih-Dar
5110e5748e
🔥 py38 + torch 2 🔥 🔥 🔥 🚀 ( #22204 )
* py38 + torch 2
* increment cache versions
---------
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2023-03-16 22:59:23 +01:00
Susnato Dhar
fb366b9a2a
fixes a typo in WhisperFeatureExtractor docs. ( #22208 )
* fixes a typo
* .
2023-03-16 16:08:05 +00:00
Younes Belkada
da3ba3a167
[XGLM] Add accelerate support for XGLM ( #22207 )
* add `accelerate` support for XGLM
* fix order
2023-03-16 16:18:05 +01:00
SatyaJandhyalaAtMS
a88a4dae19
Temporarily fix ONNX model exporting error ( #21830 )
* Temporarily fix https://github.com/microsoft/onnx-converters-private/issues/143
* Reduced column width
* Fix formatting.
* Revert "Temporarily fix https://github.com/microsoft/onnx-converters-private/issues/143 "
This reverts commit 6e95a108042118d204da447729f3834affa354fc.
* Fix export error.
* Revert "Fix formatting."
This reverts commit 8310f60da10358edbdf77a2a2f3c83ee55066cb8.
* Propagated changes made in SwinV2 to Swin2SR
2023-03-16 10:56:26 -04:00
Yih-Dar
4c5c0af7e5
Update tiny model creation script ( #22202 )
* Update UNCONVERTIBLE_MODEL_ARCHITECTURES
* Deal with 2 model tester classes in single test file
* Deal with 2 model tester classes in single test file
* Deal with 2 model tester classes in single test file
* make style and quality
---------
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2023-03-16 14:21:58 +01:00
Jason Phang
464d420775
LLaMA Implementation ( #21955 )
* LLaMA
* sharding and docs
* tweak
* black
* inits
* ruff
* LLAMA_PRETRAINED_CONFIG_ARCHIVE_MAP
* init
* no checkpoint
* docs
* ruff
* type_vocab_size
* tokenizer fixes
* tokenizer fixes
* Update tokenization_llama.py
* Update tokenization_llama.py
* Update configuration_llama.py
* Update modeling_llama.py
* tokenizer add_bos by default
* licenses
* remove decoder
* norms and mlp
* rope overhaul
* tweaks
* black
* mention OPT implementation
* off-by-one naming
* typo
* fix
* tokenization fix and slicing bug
* padding config
* cleanup
* black
* update tests
* undo typo
* fix vocab caching logic
* ruff
* docbuilder
* attn fix from BlackSamorez
* initial feedback
* typo
* docs
* llama case
* llama case
* load checkpoint docs
* comment about tokenizer
* tokenizer defaults
* clear past_key_values if use_cache=False
* last tweaks
* last tweaks
* last tweaks
* last tweaks
---------
Co-authored-by: Stella Biderman <stellabiderman@gmail.com>
2023-03-16 09:01:15 -04:00