transformers

mirror of https://github.com/huggingface/transformers.git synced 2025-07-19 04:28:26 +06:00

Author	SHA1	Message	Date
Younes Belkada	81c8191b46	FIX [`Generation`] Fix some issues when running the MaxLength criteria on CPU (#29317 ) fix the bitwise or issue	2024-03-05 02:29:19 +01:00
njackman-2344	e947683294	[Docs] Spanish Translation -Torchscript md & Trainer md (#29310 ) * torchscript and trainer md es translation * corrected md es files and even corrected spelling in en md * made es corrections to trainer.md * deleted entrenamiento... title on yml * placed entrenamiento in right place	2024-03-04 13:57:51 -08:00
NielsRogge	836921fdeb	Add UDOP (#22940 ) * First draft * More improvements * More improvements * More fixes * Fix copies * More improvements * More fixes * More improvements * Convert checkpoint * More improvements, set up tests * Fix more tests * Add UdopModel * More improvements * Fix equivalence test * More fixes * Redesign model * Extend conversion script * Use real inputs for conversion script * Add image processor * Improve conversion script * Add UdopTokenizer * Add fast tokenizer * Add converter * Update README's * Add processor * Add fully fledged tokenizer * Add fast tokenizer * Use processor in conversion script * Add tokenizer tests * Fix one more test * Fix more tests * Fix tokenizer tests * Enable fast tokenizer tests * Fix more tests * Fix additional_special_tokens of fast tokenizer * Fix tokenizer tests * Fix more tests * Fix equivalence test * Rename image to pixel_values * Rename seg_data to bbox * More renamings * Remove vis_special_token * More improvements * Add docs * Fix copied from * Update slow tokenizer * Update fast tokenizer design * Make text input optional * Add first draft of processor tests * Fix more processor tests * Fix decoder_start_token_id * Fix test_initialization * Add integration test * More improvements * Improve processor, add test * Add more copied from * Add more copied from * Add more copied from * Add more copied from * Remove print statement * Update README and auto mapping * Delete files * Delete another file * Remove code * Fix test * Fix docs * Remove asserts * Add doc tests * Include UDOP in exotic model tests * Add expected tesseract decodings * Add sentencepiece * Use same design as T5 * Add UdopEncoderModel * Add UdopEncoderModel to tests * More fixes * Fix fast tokenizer * Fix one more test * Remove parallelisable attribute * Fix copies * Remove legacy file * Copy from T5Tokenizer * Fix rebase * More fixes, copy from T5 * More fixes * Fix init * Use ArthurZ/udop for tests * Make all model tests pass * Remove UdopForConditionalGeneration from auto mapping * Fix more tests * fixups * more fixups * fix the tokenizers * remove un-necessary changes * nits * nits * replace truncate_sequences_boxes with truncate_sequences for fix-copies * nit current path * add a test for input ids * ids that we should get taken from `c9f7a32f57` * nits converting * nits * apply ruff * nits * nits * style * fix slow order of addition * fix udop fast range as well * fixup * nits * Add docstrings * Fix gradient checkpointing * Update code examples * Skip tests * Update integration test * Address comment * Make fixup * Remove extra ids from tokenizer * Skip test * Apply suggestions from code review Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> * Update year * Address comment * Address more comments * Address comments * Add copied from * Update CI * Rename script * Update model id * Add AddedToken, skip tests * Update CI * Fix doc tests * Do not use Tesseract for the doc tests * Remove kwargs * Add original inputs * Update casting * Fix doc test * Update question * Update question * Use LayoutLMv3ImageProcessor * Update organization * Improve docs * Update forward signature * Make images optional * Remove deprecated device argument * Add comment, add add_prefix_space * More improvements * Remove kwargs --------- Co-authored-by: ArthurZucker <arthur.zucker@gmail.com> Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>	2024-03-04 18:49:02 +01:00
Donggeun Yu	ed74d97871	DeformableDETR support bfloat16 (#29232 ) * Update ms_deform_attn_cuda.cu * Update ms_deform_attn_cuda.cuh * Update modeling_deformable_detr.py * Update src/transformers/models/deformable_detr/modeling_deformable_detr.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update modeling_deformable_detr.py * python utils/check_copies.py --fix_and_overwrite * Fix dtype missmatch error * Update test_modeling_deformable_detr.py * Update test_modeling_deformable_detr.py * Update modeling_deformable_detr.py * Update modeling_deformable_detr.py * Support DeformableDETR with bfloat16 * Add test code * Use AT_DISPATCH_FLOATING_TYPES_AND2 Use AT_DISPATCH_FLOATING_TYPES_AND2 * Update tests/models/deformable_detr/test_modeling_deformable_detr.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update tests/models/deformable_detr/test_modeling_deformable_detr.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Fix not found require_torch_bf16 function --------- Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>	2024-03-04 14:18:09 +00:00
Yoach Lacombe	bcd23a54f1	Avoid edge case in audio utils (#28836 )	2024-03-04 13:24:40 +00:00
Sven Schultze	7941769e55	Fix grad_norm unserializable tensor log failure (#29212 ) * Fix grad_norm unserializable tensor log failure * Fix origin of grad_norm logs to be in deepspeed get_global_grad_norm()	2024-03-04 13:12:35 +00:00
Zach Mueller	1681a6d452	🚨 Fully revert atomic checkpointing 🚨 (#29370 ) Fully revert atomic checkpointing	2024-03-04 06:17:42 -05:00
Nick DeGroot	8ef9862864	Fix OneFormer `post_process_instance_segmentation` for panoptic tasks (#29304 ) * 🐛 Fix oneformer instance post processing when using panoptic task type * ✅ Add unit test for oneformer instance post processing panoptic bug --------- Co-authored-by: Nick DeGroot <1966472+nickthegroot@users.noreply.github.com>	2024-03-04 11:04:49 +00:00
Sean (Seok-Won) Yi	81220cba61	Fix: Fixed the previous tracking URI setting logic to prevent clashes with original MLflow code. (#29096 ) * Changed logic for setting the tracking URI. The previous code was calling the `mlflow.set_tracking_uri` function regardless of whether or not the environment variable `MLFLOW_TRACKING_URI` is even set. This led to clashes with the original MLflow implementation and therefore the logic was changed to only calling the function when the environment variable is explicitly set. * Check if tracking URI has already been set. The previous code did not consider the possibility that the tracking URI may already be set elsewhere and was therefore (erroneously) overriding previously set tracking URIs using the environment variable. * Removed redundant parentheses. Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Fix docstring to reflect library convention properly. Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Fix docstring to reflect library convention properly. "Unset by default" is the correct expression rather than "Default to `None`." Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> --------- Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>	2024-03-04 10:53:58 +00:00
NielsRogge	5e4b69dc12	Convert SlimSAM checkpoints (#28379 ) * First commit * Improve conversion script * Convert more checkpoints * Update src/transformers/models/sam/convert_sam_original_to_hf_format.py Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> * Rename file * More updates * Update docstring * Update script --------- Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>	2024-03-04 11:51:16 +01:00
Traun Leyden	c38a12270a	Workaround for #27758 to avoid ZeroDivisionError (#28756 )	2024-03-04 10:23:40 +01:00
Y4hL	704b3f74f9	Add mlx support to BatchEncoding.convert_to_tensors (#29406 ) * Add mlx support * Fix import order and use def instead of lambda * Another fix for ruff format :) * Add detecting mlx from repr, add is_mlx_array	2024-03-04 10:19:13 +01:00
Siming Dai	39ef3fb248	[Mixtral] Fixes attention masking in the loss (#29363 ) Fix mixtral load balancing loss Co-authored-by: dingkunbo <dingkunbo@baidu.com>	2024-03-04 09:08:56 +01:00
Poedator	38953a75c1	update path to hub files in the error message (#29369 ) update path to hub files need to add `tree/` to path to files at HF hub. see example path: `https://huggingface.co/meta-llama/Llama-2-7b-hf/tree/main`	2024-03-04 08:26:01 +01:00
Fanli Lin	aade711d1e	[tests] enable automatic speech recognition pipeline tests on XPU (#29308 ) * use require_torch_gpu * enable on XPU	2024-03-04 08:24:38 +01:00
David Valente	831bc25d8f	Correct zero division error in inverse sqrt scheduler (#28982 ) * Correct zero division error in inverse sqrt scheduler * default timescale to 10_000	2024-03-01 17:04:40 +00:00
Zach Mueller	1a7c117df9	Fix deprecated arg issue (#29372 ) * Fix deprecated arg issue * Trainer check too * Check for dict or dataclass * Simplify, make config always AcceleratorConfig * Upstream to Trainer	2024-03-01 12:00:29 -05:00
Marc Sun	cec773345a	Fix llama + gemma accelete tests (#29380 )	2024-03-01 10:32:36 -05:00
Jingya HUANG	15f8296a9b	Support subfolder with `AutoProcessor` (#29169 ) enable subfolder	2024-03-01 10:29:21 +00:00
amyeroberts	f1b1379f37	[`YOLOS`] Fix - return padded annotations (#29300 ) * Fix yolos processing * Add back slow marker - protects for pycocotools in slow * Slow decorator goes above copied from header	2024-03-01 09:42:13 +00:00
Sanchit Gandhi	0a0a279e99	🚨🚨[Whisper Tok] Update integration test (#29368 ) * [Whisper Tok] Update integration test * make style	2024-03-01 09:22:31 +00:00
Arthur	e7b9837065	[`Llama + AWQ`] fix `prepare_inputs_for_generation` 🫠 (#29381 ) * use the generation config 🫠 * fixup	2024-03-01 08:59:26 +01:00
Younes Belkada	50db7ca4e8	FIX [`quantization` / `ESM`] Fix ESM 8bit / 4bit with bitsandbytes (#29329 ) * fix ESM 8bit * Apply suggestions from code review Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> * fixup --------- Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>	2024-03-01 03:01:53 +01:00
Leon Engländer	2858d6c634	Fix Base Model Name of LlamaForQuestionAnswering (#29258 ) * LlamaForQuestionAnswering self.transformer->self.model * fix "Copied from" string * Llama QA model: set base_model_prefix = "transformer"	2024-03-01 02:58:19 +01:00
Song Fuchang	5ee0868a4b	Expose `offload_buffers` parameter of `accelerate` to `PreTrainedModel.from_pretrained` method (#28755 ) Expose offload_buffers parameter to from_pretrained method	2024-03-01 02:12:51 +01:00
Lucain	0ad770c373	Fix @require_read_token in tests (#29367 )	2024-02-29 11:25:16 +01:00
NielsRogge	bb4f816ad4	Patch YOLOS and others (#29353 ) Fix issue	2024-02-29 11:09:50 +01:00
Yih-Dar	44fe1a1cc4	Avoid using uncessary `get_values(MODEL_MAPPING)` (#29362 ) * more fixes * more fixes --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2024-02-29 17:19:17 +08:00
Younes Belkada	b647acdb53	FIX [`CI`] `require_read_token` in the llama FA2 test (#29361 ) Update test_modeling_llama.py	2024-02-29 04:49:01 +01:00
Younes Belkada	8d8ac9c2df	FIX [`CI`]: Fix failing tests for peft integration (#29330 ) fix failing tests for peft integration	2024-02-29 03:56:16 +01:00
Younes Belkada	1aee9afd1c	FIX [`CI` / `starcoder2`] Change starcoder2 path to correct one for slow tests (#29359 ) change starcoder2 path to correct one	2024-02-29 03:52:13 +01:00
Michael	2209b7afa0	[i18n-zh] Sync source/zh/index.md (#29331 ) * [i18n-zh] Sync source/zh/index.md * apply review comments	2024-02-28 09:41:18 -08:00
fxmarty	49204c1d37	Better SDPA unmasking implementation (#29318 ) * better unmask imple * comment * typo * bug report pytorch * cleanup * fix import * add back example * retrigger ci * come on	2024-02-28 16:36:47 +01:00
Marc Sun	f54d82cace	[CI] Quantization workflow (#29046 ) * [CI] Quantization workflow * build dockerfile * fix dockerfile * update self-cheduled.yml * test build dockerfile on push * fix torch install * udapte to python 3.10 * update aqlm version * uncomment build dockerfile * tests if the scheduler works * fix docker * do not trigger on psuh again * add additional runs * test again * all good * style * Update .github/workflows/self-scheduled.yml Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com> * test build dockerfile with torch 2.2.0 * fix extra * clean * revert changes * Revert "revert changes" This reverts commit `4cb52b8822`. * revert correct change --------- Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>	2024-02-28 10:09:25 -05:00
jiqing-feng	554e7ada89	check if position_ids exists before using it (#29306 ) Co-authored-by: Joao Gante <joao@huggingface.co>	2024-02-28 14:56:25 +00:00
Daniel Han	d3a4b47544	RoPE loses precision for Llama / Gemma + Gemma logits.float() (#29285 ) * Update modeling_llama.py Llama - Force float32 since bfloat16 loses precision on long contexts * Update modeling_llama.py * Update modeling_gemma.py Fix RoPE and logits.float() * @torch.no_grad() * @torch.no_grad() * Cos, Sin to float32 * cos, sin to float32 * Update src/transformers/models/gemma/modeling_gemma.py Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> * Update src/transformers/models/llama/modeling_llama.py Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> * Resolve PR conflicts * Fix RoPE for llama * Revert "Fix RoPE for llama" This reverts commit `b860a22dab`. * Fix RoPE for llama * RoPE device * Autocast device type * RoPE * RoPE isinstance --------- Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>	2024-02-28 15:16:53 +01:00
Joao Gante	7628b3a0f4	Idefics: generate fix (#29320 )	2024-02-28 11:34:54 +00:00
Leonardo Emili	2ce56d35f6	Disable Mixtral `output_router_logits` during inference (#29249 ) * Set output_router_logits=False in prepare_inputs_for_generation for mixtral * Add output_router_logits=False to prepare_inputs_for_generation for mixtral * Fix style	2024-02-28 11:16:15 +01:00
Arthur	8a8a0a4ae0	[`Llama ROPE`] Fix torch export but also slow downs in forward (#29198 ) * remove control flow * update gptneox * update .... * nits * Actually let's just break. Otherwise we are silently failing which imo is not optimal * version BC * fix tests * fix eager causal * nit * add a test * style * nits * nits * more nits for the test * update and fix * make sure cuda graphs are not skipped * read token is needed for meta llama * update! * fiixup * compile test should be slow * fix thet fix copies * stle 🫠	2024-02-28 10:45:53 +01:00
Arthur	7c87f3577e	[`T5 and Llama Tokenizer`] remove warning (#29346 ) * remove warning * add co-author * update --------- Co-authored-by: hiaoxui <hiaoxui@users.noreply.github.com>	2024-02-28 10:41:58 +01:00
Arthur	a52888524d	[`require_read_token`] fix typo (#29345 ) fix wrapper	2024-02-28 10:13:57 +01:00
fxmarty	e715c78c66	Remove numpy usage from owlvit (#29326 ) * remove numpy usage from owlvit * fix init owlv2 * style	2024-02-28 09:38:44 +01:00
Younes Belkada	ad00c482c7	FIX [`Gemma` / `CI`] Make sure our runners have access to the model (#29242 ) * pu hf token in gemma tests * update suggestion * add to flax * revert * fix * fixup * forward contrib credits from discussion --------- Co-authored-by: ArthurZucker <ArthurZucker@users.noreply.github.com>	2024-02-28 06:25:23 +01:00
Jared Van Bortel	bd5b986306	simplify get_class_in_module and fix for paths containing a dot (#29262 )	2024-02-28 03:10:36 +01:00
RaymondLi0	63caa370e6	Starcoder2 model - bis (#29215 ) * Copy model * changes * misc * fixes * add embed and residual dropout (#30) * misc * remove rms norm and gated MLP * remove copied mentions where its not a copy anymore * remove unused _shape * copied from mistral instead * fix copies * fix copies * add not doctested * fix * fix copyright * Update docs/source/en/model_doc/starcoder2.md Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> * Update src/transformers/models/starcoder2/configuration_starcoder2.py Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> * Update src/transformers/models/starcoder2/configuration_starcoder2.py Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> * fix doc * revert some changes * add fa2 tests * fix styling nit * fix * push dummy docs --------- Co-authored-by: Joel Lamy-Poirier <joel.lamy-poirier@servicenow.com> Co-authored-by: younesbelkada <younesbelkada@gmail.com> Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com> Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>	2024-02-28 01:24:34 +01:00
Michael	83ab0115d1	[i18n-zh] Translate fsdp.md into Chinese (#29305 ) * [i18n-zh] Translate fsdp.md into Chinese Signed-off-by: windsonsea <haifeng.yao@daocloud.io> * apply suggestions from Fan-Lin --------- Signed-off-by: windsonsea <haifeng.yao@daocloud.io>	2024-02-27 11:26:57 -08:00
Sadra Barikbin	227cd54aa5	Fix a few typos in `GenerationMixin`'s docstring (#29277 ) Co-authored-by: Joao Gante <joao@huggingface.co>	2024-02-27 18:15:43 +00:00
Raushan Turganbay	ddf7ac4237	Token level timestamps for long-form generation in Whisper (#29148 )	2024-02-27 18:15:26 +00:00
Marc Sun	8a1faf2803	Add compatibility with skip_memory_metrics for mps device (#29264 ) * Add compatibility with mps device * fix * typo and style	2024-02-27 09:58:43 -05:00
Yih-Dar	5c341d4555	Use torch 2.2 for deepspeed CI (#29246 ) update Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2024-02-27 17:51:37 +08:00

16108 Commits