transformers

mirror of https://github.com/huggingface/transformers.git synced 2025-07-05 22:00:09 +06:00

Author	SHA1	Message	Date
linlin	1c1aec2ef1	Update object_detection.md (#31488 ) Define MAX_SIZE before it is used.	2024-06-19 10:36:44 +01:00
Younes Belkada	7d683f7bae	Docs / AQLM: Clarify `torch.compile` support for AQLM (#31473 ) Update overview.md	2024-06-19 11:26:25 +02:00
Anton Vlasjuk	b275a41005	[`GPT2`] Add SDPA support (#31172 ) * `gpt2` sdpa support * fix (at least) one test, style, repo consistency * fix sdpa mask in forward --> fixes generation * test * test2 * test3 * test4 * simplify shapes for attn mask creation and small comments * hub fail test * benchmarks * flash attn 2 mask should not be inverted on enc-dec setup * fix comment * apply some suggestion from code review - only save _attn_implentation once - remove unnecessary comment * change elif logic * [run-slow] gpt2 * modify `test_gpt2_sample_max_time` to follow previous assertion patterns	2024-06-19 09:40:57 +02:00
Rémy Léone	22b41b3f8a	Update perf_train_gpu_many.md (#31451 ) * Update perf_train_gpu_many.md * Update docs/source/en/perf_train_gpu_many.md Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update docs/source/en/perf_train_gpu_many.md Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> --------- Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>	2024-06-18 11:00:26 -07:00
Matt	6e56b83453	Update chat template docs and bump Jinja version (#31455 ) * Update chat template docs * Minor bug in the version check * Update docs/source/en/chat_templating.md Co-authored-by: Joshua Lochner <admin@xenova.com> * Update docs/source/en/chat_templating.md Co-authored-by: Joshua Lochner <admin@xenova.com> * Update docs/source/en/chat_templating.md Co-authored-by: Joshua Lochner <admin@xenova.com> * Replace backticks with bolding because the doc builder was trying to parse them * Replace backticks with bolding because the doc builder was trying to parse them * Replace backticks with bolding because the doc builder was trying to parse them * More cleanups to avoid upsetting the doc builder * Add one more tip at the end --------- Co-authored-by: Joshua Lochner <admin@xenova.com>	2024-06-18 14:16:30 +01:00
Matt	dabf01973a	Make "tool_use" the default chat template key when tools are passed (#31429 ) * Make "tool_use" the default when tools are passed * Add some opinionated text to the docs * Add some opinionated text to the docs	2024-06-18 13:54:42 +01:00
Jade Choghari	67a4ef89d4	Add missing French translation of tutoriel_pipeline.md (#31396 ) * Update french translation of tutoriel_pipeline.md * Update docs/source/fr/tutoriel_pipeline.md Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com> * Update docs/source/fr/tutoriel_pipeline.md Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com> * Update docs/source/fr/tutoriel_pipeline.md Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com> * Update docs/source/fr/tutoriel_pipeline.md Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com> * Update docs/source/fr/tutoriel_pipeline.md Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com> * Update docs/source/fr/tutoriel_pipeline.md Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com> * Update docs/source/fr/tutoriel_pipeline.md Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com> * Update docs/source/fr/tutoriel_pipeline.md Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com> --------- Co-authored-by: Jade Choghari <chogharijade@icloud.com> Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>	2024-06-13 17:48:54 +02:00
谭九鼎	84351d57eb	docs: fix broken link (#31370 ) * docs: fix broken link * fix link	2024-06-12 11:33:00 +01:00
Jade Choghari	35a6d9d648	Add french translation of AutoBackbone (#31300 )	2024-06-11 18:28:52 +01:00
amyeroberts	f53fe35b29	Fast image processor (#28847 ) * Draft fast image processors * Draft working fast version * py3.8 compatible cache * Enable loading fast image processors through auto * Tidy up; rescale behaviour based on input type * Enable tests for fast image processors * Smarter rescaling * Don't default to Fast * Safer imports * Add necessary Pillow requirement * Woops * Add AutoImageProcessor test * Fix up * Fix test for imagegpt * Fix test * Review comments * Add warning for TF and JAX input types * Rearrange * Return transforms * NumpyToTensor transformation * Rebase - include changes from upstream in ImageProcessingMixin * Safe typing * Fix up * convert mean/std to tesnor to rescale * Don't store transforms in state * Fix up * Update src/transformers/image_processing_utils_fast.py Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> * Update src/transformers/models/auto/image_processing_auto.py Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> * Update src/transformers/models/auto/image_processing_auto.py Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> * Update src/transformers/models/auto/image_processing_auto.py Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> * Warn if fast image processor available * Update src/transformers/models/vit/image_processing_vit_fast.py * Transpose incoming numpy images to be in CHW format * Update mapping names based on packages, auto set fast to None * Fix up * Fix * Add AutoImageProcessor.from_pretrained(checkpoint, use_fast=True) test * Update src/transformers/models/vit/image_processing_vit_fast.py Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com> * Add equivalence and speed tests * Fix up --------- Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>	2024-06-11 15:47:38 +01:00
Matt	edc1dffd00	Chat Template support for function calling and RAG (#30621 ) * First draft, still missing automatic function conversion * First draft of the automatic schema generator * Lots of small fixes * the walrus has betrayed me * please stop committing your debug breakpoints * Lots of cleanup and edge cases, looking better now * Comments and bugfixes for the type hint parser * More cleanup * Add tests, update schema generator * Update tests, proper handling of return values * Small docstring change * More doc updates * More doc updates * Add json_schema decorator * Clean up the TODOs and finish the docs * self.maxDiff = None to see the whole diff for the nested list test * add import for add_json_schema * Quick test fix * Fix something that was bugging me in the chat template docstring * Less "anyOf" when unnecessary * Support return types for the templates that need them * Proper return type tests * Switch to Google format docstrings * Update chat templating docs to match new format * Stop putting the return type in with the other parameters * Add Tuple support * No more decorator - we just do it implicitly! * Add enum support to get_json_schema * Update docstring * Add copyright header * Update src/transformers/tokenization_utils_base.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update docs/source/en/chat_templating.md Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update src/transformers/utils/chat_template_utils.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update src/transformers/utils/chat_template_utils.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Add copyright header * make fixup * Fix indentation * Reformat chat_template_utils * Correct return value * Make regexes module-level * Support more complex, multi-line arg docstrings * Update error message for ... * Update ruff * Add document type validation * Refactor docs * Refactor docs * Refactor docs * Clean up Tuple error * Add an extra test for very complex defs and docstrings and clean everything up for it * Document enum block * Quick test fixes * Stop supporting type hints in docstring to fix bugs and simplify the regex * Update docs for the regex change * Clean up enum regex * Wrap functions in {"type": "function", "function": ...} * Update src/transformers/utils/chat_template_utils.py Co-authored-by: Pablo Montalvo <39954772+molbap@users.noreply.github.com> * Temporary tool calling commit * Add type hints to chat template utils, partially update docs (incomplete!) * Code cleanup based on @molbap's suggestion * Add comments to explain regexes * Fix up type parsing for unions and lists * Add custom exception types and adjust tests to look for them * Update docs with a demo! * Docs cleanup * Pass content as string * Update tool call formatting * Update docs with new function format * Update docs * Update docs with a second tool to show the model choosing correctly --------- Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> Co-authored-by: Pablo Montalvo <39954772+molbap@users.noreply.github.com>	2024-06-11 15:46:38 +01:00
Pavel Iakubovskii	517df566f5	Decorators for deprecation and named arguments validation (#30799 ) * Fix do_reduce_labels for maskformer image processor * Deprecate reduce_labels in favor to do_reduce_labels * Deprecate reduce_labels in favor to do_reduce_labels (segformer) * Deprecate reduce_labels in favor to do_reduce_labels (oneformer) * Deprecate reduce_labels in favor to do_reduce_labels (maskformer) * Deprecate reduce_labels in favor to do_reduce_labels (mask2former) * Fix typo * Update mask2former test * fixup * Update segmentation examples * Update docs * Fixup * Imports fixup * Add deprecation decorator draft * Add deprecation decorator * Fixup * Add deprecate_kwarg decorator * Validate kwargs decorator * Kwargs validation (beit) * fixup * Kwargs validation (mask2former) * Kwargs validation (maskformer) * Kwargs validation (oneformer) * Kwargs validation (segformer) * Better message * Fix oneformer processor save-load test * Update src/transformers/utils/deprecation.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update src/transformers/utils/deprecation.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update src/transformers/utils/deprecation.py Co-authored-by: Pablo Montalvo <39954772+molbap@users.noreply.github.com> * Update src/transformers/utils/deprecation.py Co-authored-by: Pablo Montalvo <39954772+molbap@users.noreply.github.com> * Better handle classmethod warning * Fix typo, remove warn * Add header * Docs and `additional_message` * Move to filter decorator ot generic * Proper deprecation for semantic segm scripts * Add to __init__ and update import * Basic tests for filter decorator * Fix doc * Override `to_dict()` to pop depracated `_max_size` * Pop unused parameters * Fix trailing whitespace * Add test for deprecation * Add deprecation warning control parameter * Update generic test * Fixup deprecation tests * Introduce init service kwargs * Revert popping unused params * Revert oneformer test * Allow "metadata" to pass * Better docs * Fix test * Add notion in docstring * Fix notification for both names * Add func name to warning message * Fixup --------- Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> Co-authored-by: Pablo Montalvo <39954772+molbap@users.noreply.github.com>	2024-06-10 12:35:10 +01:00
谭九鼎	4fa4dcb2be	docs/zh: fix style (#31334 )	2024-06-10 11:40:40 +01:00
谭九鼎	807483edba	docs: fix style (#31340 )	2024-06-10 09:53:25 +01:00
Matt	065729a692	Remove ConversationalPipeline and Conversation object (#31165 ) * Remove ConversationalPipeline and Conversation object, as they have been deprecated for some time and are due for removal * Update not-doctested.txt * Fix JA and ZH docs * Fix JA and ZH docs some more * Fix JA and ZH docs some more	2024-06-07 17:50:18 +01:00
amyeroberts	bdf36dcd48	Enable HF pretrained backbones (#31145 ) * Enable load HF or tim backbone checkpoints * Fix up * Fix test - pass in proper out_indices * Update docs * Fix tvp tests * Fix doc examples * Fix doc examples * Try to resolve DPT backbone param init * Don't conditionally set to None * Add condition based on whether backbone is defined * Address review comments	2024-06-06 22:02:38 +01:00
Jack Yang	a3d351c00f	Update text-to-speech.md (#31269 ) SpeechBrain usage has changed	2024-06-06 21:59:22 +01:00
Lucain	9ef93fccad	Switch from `cached_download` to `hf_hub_download` in remaining occurrences (#31284 ) Switch from hf_hub_url to hf_hub_download in remaining occurences	2024-06-06 12:05:59 +01:00
Vaibhav Srivastav	4a6024921f	doc: add info about wav2vec2 bert in older wav2vec2 models. (#31120 ) * doc: add info about wav2vec2 bert in older wav2vec2 models. * apply suggestions from review. * forward contrib credits from review --------- Co-authored-by: Sanchit Gandhi <sanchit-gandhi@users.noreply.github.com>	2024-06-05 11:56:11 +01:00
Younes Belkada	485d913dfb	Blip: Deprecate `BlipModel` (#31235 ) * deprecate blip * mention deprecation on docs	2024-06-04 18:29:45 +02:00
Aaron Jimenez	c73ee1333d	[docs] Spanish translation of tokenizer_summary.md (#31154 ) * add tokenizer_summary to es/_toctree.yml * add tokenizer_summary to es/ * fix link to Transformes XL in en/ * translate until Subword tokenization section * fix GPT link in en/ * fix other GPT link in en/ * fix typo in en/ * translate the doc * run make fixup * Remove .md in Transformer XL link * fix some link issues in es/ * fix typo	2024-06-03 16:52:23 -07:00
Jade Choghari	98dd842339	Wrong translation FR : Contents = Contenu (#31186 ) Update index.md - Contents = Contenu French typo - Contents = Contenu	2024-06-03 17:40:14 +02:00
Isotr0py	e4628434d8	Add Qwen2 GGUF loading support (#31175 ) * add qwen2 gguf support * Update docs * fix qwen2 tokenizer * add qwen2 gguf test * fix typo in qwen2 gguf test * format code * Remove mistral, clarify the error message * format code * add typing and update docstring	2024-06-03 14:55:10 +01:00
Pavel Iakubovskii	cdc813113a	Instance segmentation examples (#31084 ) * Initial setup * Metrics * Overfit on two batches * Train 40 epochs * Memory leak debugging * Trainer fine-tuning * Draft * Fixup * Trained end-to-end * Add requirements * Rewrite evaluator * nits * Add readme * Add instance-segmentation to the table * Support void masks * Remove sh * Update docs * Add pytorch test * Add accelerate test * Update examples/pytorch/instance-segmentation/README.md * Update examples/pytorch/instance-segmentation/run_instance_segmentation.py * Update examples/pytorch/instance-segmentation/run_instance_segmentation_no_trainer.py * Update examples/pytorch/instance-segmentation/run_instance_segmentation_no_trainer.py * Update examples/pytorch/instance-segmentation/run_instance_segmentation.py * Fix consistency oneformer * Fix imports * Fix imports sort * Apply suggestions from code review Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Update examples/pytorch/instance-segmentation/run_instance_segmentation.py Co-authored-by: Sangbum Daniel Choi <34004152+SangbumChoi@users.noreply.github.com> * Add resources to docs * Update examples/pytorch/instance-segmentation/README.md Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update examples/pytorch/instance-segmentation/README.md Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Remove explicit model_type argument * Fix tests * Update readme * Note about other models --------- Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> Co-authored-by: Sangbum Daniel Choi <34004152+SangbumChoi@users.noreply.github.com> Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>	2024-05-31 16:56:17 +01:00
Aymeric Roucher	9837a25481	Add streaming, various fixes (#30838 ) * Implement streaming run in ReAct agents * Allow additional imports in code agents * Python interpreter: support classes and exceptions, fixes	2024-05-31 14:16:23 +02:00
Asif Ajrof	bd9d1ddf41	Update sam.md (#31130 ) `mask` variable is not defined. probably a writing mistake. it should be `segmentation_map`. `segmentation_map` should be a `1` channel image rather than `RGB`. [on a different note, the `mask_url` is the same as `raw_image`. could provide a better example.	2024-05-31 12:34:29 +02:00
Younes Belkada	f5590deaa8	Docs / Quantization: Replace all occurences of `load_in_8bit` with bnb config (#31136 ) Replace all occurences of `load_in_8bit` with bnb config	2024-05-30 16:47:35 +02:00
Younes Belkada	cb879c5801	FIX / Docs: Fix GPTQ expected number of bits (#31111 ) Update overview.md	2024-05-29 15:56:28 +02:00
Lucain	c3044ec2f3	Use `HF_HUB_OFFLINE` + fix has_file in offline mode (#31016 ) * Fix has_file in offline mode * harmonize env variable for offline mode * Switch to HF_HUB_OFFLINE * fix test * revert test_offline to test TRANSFORMERS_OFFLINE * Add new offline test * merge conflicts * docs	2024-05-29 11:55:43 +01:00
amyeroberts	a564d10afe	Deprecate low use models (#30781 ) * Deprecate models - graphormer - time_series_transformer - xlm_prophetnet - qdqbert - nat - ernie_m - tvlt - nezha - mega - jukebox - vit_hybrid - x_clip - deta - speech_to_text_2 - efficientformer - realm - gptsan_japanese * Fix up * Fix speech2text2 imports * Make sure message isn't indented * Fix docstrings * Correctly map for deprecated models from model_type * Uncomment out * Add back time series transformer and x-clip * Import fix and fix-up * Fix up with updated ruff	2024-05-28 18:07:07 +01:00
Younes Belkada	7f08817be4	Docs / Quantization: Redirect deleted page (#31063 ) Update _redirects.yml	2024-05-28 18:29:22 +02:00
Younes Belkada	4f98b14465	Docs / PEFT: Add PEFT API documentation (#31078 ) * add peft references * add peft references * Update docs/source/en/peft.md * Update docs/source/en/peft.md	2024-05-28 15:04:43 +02:00
NielsRogge	90da0b1c9f	[SuperPoint, PaliGemma] Update docs (#31025 ) * Update docs * Add PaliGemma resources * Address comment * Update docs	2024-05-28 13:22:06 +02:00
AP	dd4654eab7	Update quicktour.md to fix broken link to Glossary (#31072 ) Update quicktour.md to fix broken link Missing '/' in attention mask link in the transformers quicktour	2024-05-28 11:50:45 +02:00
Eitan Turok	0a064dc0fc	Follow up: Fix link in dbrx.md (#30514 ) * Fix link in dbrx.md * remove "though this may not be up to date" --------- Co-authored-by: Lysandre Debut <hi@lysand.re>	2024-05-27 14:57:43 +02:00
Aymeric Roucher	84c4b72ee9	Redirect transformers_agents doc to agents (#31054 )	2024-05-27 10:34:14 +02:00
Aritra Roy Gosthipaty	965e98dc54	[Port] TensorFlow implementation of Mistral (#29708 ) * chore: initial commit * chore: adding imports and inits * chore: adding the causal and classification code * chore: adding names to the layers * chore: using single self attn layer * chore: built the model and layers * chore: start with testing * chore: docstring change, transpose fix * fix: rotary embedding * chore: adding cache implementation * remove unused torch * chore: fixing the indexing issue * make fix-copies * Use modeling_tf_utils.keras * make fixup * chore: fixing tests * chore: adding past key value logic * chore: adding multi label classfication test * fix: switching on the built parameters in the layers * fixing repo consistency * ruff formats * style changes * fix: tf and pt equivalence * removing returns from docstrings * fix docstrings * fix docstrings * removing todos * fix copies * fix docstring * fix docstring * chore: using easier rotate_half * adding integration tests * chore: addressing review related to rotary embedding layer * review changes * [run-slow] mistral * skip: test save load after resize token embedding * style --------- Co-authored-by: Matt <rocketknight1@gmail.com>	2024-05-23 17:48:49 +01:00
Younes Belkada	5a74ae6dbe	FIX / Docs: Minor changes in quantization docs (#30985 ) * Change in quantization docs * Update overview.md * Update docs/source/en/quantization/overview.md Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com> --------- Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>	2024-05-23 16:36:49 +02:00
Younes Belkada	87a351818e	Docs / Quantization: refactor quantization documentation (#30942 ) * refactor quant docs * delete file * rename to overview * fix * fix table * fix * add content * fix library versions * fix table * fix table * fix table * fix table * Apply suggestions from code review Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * replace to quantization_config * fix aqlm snippet * add DLAI courses * fix * fix table * fix bulet points --------- Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>	2024-05-23 14:31:52 +02:00
Raushan Turganbay	d583f1317b	Quantized KV Cache (#30483 ) * clean-up * Update src/transformers/cache_utils.py Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> * Update src/transformers/cache_utils.py Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> * Update src/transformers/cache_utils.py Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> * fixup * Update tests/quantization/quanto_integration/test_quanto.py Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com> * Update src/transformers/generation/configuration_utils.py Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> * more suggestions * mapping if torch available * run tests & add 'support_quantized' flag * fix jamba test * revert, will be fixed by another PR * codestyle * HQQ and versatile cache classes * final update * typo * make tests happy --------- Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>	2024-05-23 17:25:20 +05:00
Pavel Iakubovskii	15585b81a5	Update object detection with latest resize and pad strategies (#30955 ) * Update with new resizing and pad strategy * Return pixel mask param * Update inference in guide * Fix empty compose * Update guide	2024-05-23 00:13:56 +01:00
Vaibhav Srivastav	24d2a5e1a3	[doc] Add references to the fine-tuning blog and distil-whisper to Whisper. (#30938 ) [doc] Add references to the fine-tuning blog and distil-whisper to Whisper doc.	2024-05-22 14:06:09 +01:00
Raushan Turganbay	934e1b84e9	Update video-llava docs (#30935 ) * update video-llava * Update docs/source/en/model_doc/video_llava.md Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> --------- Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>	2024-05-22 16:56:41 +05:00
NielsRogge	60bb571e99	🚨 [Idefics2] Update ignore index (#30898 ) * Update ignore index * Update docs * Update docs	2024-05-21 19:38:02 +02:00
Younes Belkada	8871b26150	FEAT / Trainer: LOMO optimizer support (#30178 ) * add V1 - adalomo not working yet * add todo docs + refactor from comments * adjust LR * add docs * add more elaborated test * Apply suggestions from code review Co-authored-by: Zach Mueller <muellerzr@gmail.com> * fix * push * add accelerate check * fix DDP case * Apply suggestions from code review Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * fix * init kwargs * safely add attribute * revert to enum logic * Update src/transformers/trainer.py --------- Co-authored-by: Zach Mueller <muellerzr@gmail.com> Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>	2024-05-21 10:16:37 +02:00
Aaron Jimenez	0df888ffb7	[docs] Spanish translation of model_memory_anatomy.md (#30885 ) * add model_memory_anatomy to es/_toctree.yml * copy model_memory_anatomy.md to es/ * translate first section * translate doc * chage forward activations * fix sentence and and link to Trainer * fix Trainer link	2024-05-20 16:48:52 -07:00
Longjie Zheng	616bb11d48	Add torch.compile for Mistral (#30642 ) * first version * fix sliding window * fix style * add sliding window cache * fix style * address comments * fix test * fix style * move sliding window check inside cache init * revert changes on irrelevant files & add comment on SlidingWindowCache * address comments & fix style fix style * update causal mask * [run-slow] mistral * [run-slow] mistral * [run-slow] mistral * [run-slow] mistral * [run-slow] mistral * [run-slow] llama * [run-slow] mistral * [run-slow] mistral * [run-slow] mistral * revert CI from a10 to t4 * wrap up	2024-05-20 16:27:24 +02:00
Raushan Turganbay	5d0bf59b4d	LLaVa-Next: Update docs with batched inference (#30857 ) * update docs with batch ex * Update docs/source/en/model_doc/llava_next.md Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * accept nested list of img --------- Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>	2024-05-20 13:45:56 +05:00
Joseph Enguehard	07bf2dff78	Add TokenClassification for Mistral, Mixtral and Qwen2 (#29878 ) * Add MistralForTokenClassification * Add tests and docs * Add token classification for Mixtral and Qwen2 * Save llma for token classification draft * Add token classification support for Llama, Gemma, Persimmon, StableLm and StarCoder2 * Formatting * Add token classification support for Qwen2Moe model * Add dropout layer to each ForTokenClassification model * Add copied from in tests * Update src/transformers/models/llama/modeling_llama.py Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com> * Propagate suggested changes * Style --------- Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>	2024-05-20 10:06:57 +02:00
Jacky Lee	977ce58a78	Fix dependencies for image classification example (#30842 ) * fix: missing dependencies * fix: image classification dependencies	2024-05-17 13:57:47 +01:00

2573 Commits