Commit Graph

16713 Commits

Author SHA1 Message Date
Matt
38d58a4427
Fix local repos with remote code not registering for pipelines (#33100)
* Extremely experimental fix!

* Try removing the clause entirely

* Add test

* make fixup

* stash commit

* Remove breakpoint

* Add anti-regression test

* make fixup

* Move repos to hf-internal-testing!
2024-08-30 16:56:22 +01:00
Matt
fbff27623a
Add warning for stop string edge case (#33169)
* Add warning for edge case

* make fixup
2024-08-30 16:26:26 +01:00
Julien Denize
e259d6d1e0
Add missing quotes in modeling_llava_next_video.py (#33214) 2024-08-30 15:39:23 +02:00
dependabot[bot]
9a6956baab
Bump torch from 1.13.1 to 2.2.0 in /examples/research_projects/decision_transformer (#33215)
Bump torch in /examples/research_projects/decision_transformer

Bumps [torch](https://github.com/pytorch/pytorch) from 1.13.1 to 2.2.0.
- [Release notes](https://github.com/pytorch/pytorch/releases)
- [Changelog](https://github.com/pytorch/pytorch/blob/main/RELEASE.md)
- [Commits](https://github.com/pytorch/pytorch/compare/v1.13.1...v2.2.0)

---
updated-dependencies:
- dependency-name: torch
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-08-30 15:38:53 +02:00
dependabot[bot]
4987463de7
Bump torch from 1.13.1 to 2.2.0 in /examples/research_projects/codeparrot (#33173)
Bump torch in /examples/research_projects/codeparrot

Bumps [torch](https://github.com/pytorch/pytorch) from 1.13.1 to 2.2.0.
- [Release notes](https://github.com/pytorch/pytorch/releases)
- [Changelog](https://github.com/pytorch/pytorch/blob/main/RELEASE.md)
- [Commits](https://github.com/pytorch/pytorch/compare/v1.13.1...v2.2.0)

---
updated-dependencies:
- dependency-name: torch
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-08-30 15:23:35 +02:00
Joao Gante
b127fb8fdc
Pipeline: fix bad generation kwargs docs (#33205)
fix link
2024-08-30 14:14:42 +02:00
Arthur
c409cd8177
use a single for loop (#33148)
* use a single for loop

* oups

* fixup

* fix typo
2024-08-29 15:55:02 +02:00
Gerben van V
5129671290
Add a static cache that offloads to the CPU or other device (#32161)
* Add a static cache that offloads to the CPU or other device

* Fix PR comments, add unit-tests
2024-08-29 11:51:09 +02:00
Anton Vlasjuk
92a75ff6b1
Mamba2 conversion script for original models (#32580)
* first attempt at allowing both conversions from codestral and from the original mamba ssm

* allow fp16, seems default for mamba2

* dtype fix

* simplify codestral check, dont overwrite pad/eos/bos when codestral

* change file -> directory

* use path join to be safe

* style

* apply code review
- add util mamba2 tokenizer (gptneox with left padding)
- add models dict

* fix copies

* add tokenizer to docs

* empty commit to check for weird err

* make conversion user dependent on model type, defaults for original paper models

* small comment nit

* remove norm_before_gate in conversion

* simplify model dict by using shared keys directly + remove unnecessary attributes

* fix tokenization: remove separate mamba2 tokenizer, add padding option as kwarg to gptneox one and reuse it for the conversion script

* simplify even further as we pass padding side via **kwargs already
2024-08-29 11:27:45 +02:00
Wing Lian
39bfb2f514
pass module to Params4bit.from_prequantized to ensure quant_state (#32524)
* pass module to Params4bit.from_prequantized to ensure quant_state

* make sure to check bnb version

* revert min bnb version and use inspect on method instead

* use version instead of inspect to prevent performance hit

* make the property name readable
2024-08-29 11:09:56 +02:00
Duygu Altinok
5c1027bf09
added quick clarification (#33166)
* added quick clarification

* cosmetics
2024-08-28 18:52:17 +02:00
Yih-Dar
3d79dcbda0
update push CI workflow files for security (#33142)
* update for security 1

* update for security 2

* update for security 3

* update for security 4

* update for security 5

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2024-08-28 18:15:58 +02:00
Nanami
74e19e81e2
Fix spell mistakes (#33149) 2024-08-28 15:27:16 +02:00
beep-bebop
5c84682f16
Customise the separator used for splicing in DataCollatorWithFlattening (#33114)
* Customising the separator used for splicing in DataCollatorWithFlattening

* update DataCollatorWithFlattening docs

---------

Co-authored-by: weifangyuan <i.weifangyuan@yuewen.com>
2024-08-28 15:22:07 +02:00
Pedro Cuenca
f4c86d0416
Zero-shot pipelines: minor doc changes (#33127)
Minor zero-shot doc changes for pipelines.
2024-08-28 13:59:16 +02:00
rasmi
f9ed05dd03
Fix import paths for test_module (#32888)
* Fix import path for test_feature_extraction_utils.py

See https://github.com/huggingface/transformers/pull/32601

* Fix import path for test_image_processing_utils.py
2024-08-28 12:08:29 +01:00
JB (Don)
f1a385b1de
[RoBERTa-based] Add support for sdpa (#30510)
* Adding SDPA support for RoBERTa-based models

* add not is_cross_attention

* fix copies

* fix test

* add minimal test for camembert and xlm_roberta as their test class does not inherit from ModelTesterMixin

* address some review comments

* use copied from

* style

* consistency

* fix lists

---------

Co-authored-by: fxmarty <9808326+fxmarty@users.noreply.github.com>
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
2024-08-28 10:26:00 +02:00
benniekiss
e0b87b0f40
[whisper] pass attention_mask to generate_with_fallback() (#33145)
pass attention_mask to generate_with_fallback
2024-08-28 09:53:58 +02:00
Anton Vlasjuk
3bfd3e4803
Fix: Jamba batched generation (#32914)
* init fix

* fix mask during cached forward, move mask related stuff to own function

* adjust tests as left padding does not change logits as much anymore + batch gen (with todo on logits comp)

* revert overwriting new integration tests

* move some comments to docstring
2024-08-28 09:24:06 +02:00
Mayank Mishra
386931d950
fix model name and copyright (#33152) 2024-08-28 08:38:57 +02:00
Mayank Mishra
c35d2ccf5a
Granite language models (#31502)
* first commit

* drop tokenizer

* drop tokenizer

* drop tokenizer

* drop convert

* granite

* drop tokenization test

* mup

* fix

* reformat

* reformat

* reformat

* fix docs

* stop checking for checkpoint

* update support

* attention multiplier

* update model

* tiny drop

* saibo drop

* skip test

* fix test

* fix test

* drop

* drop useless imports

* update docs

* drop flash function

* copied from

* drop pretraining tp

* drop pretraining tp

* drop pretraining tp

* drop unused import

* drop code path

* change name

* softmax scale

* head dim

* drop legacy cache

* rename params

* cleanup

* fix copies

* comments

* add back legacy cache

* multipliers

* multipliers

* multipliers

* text fix

* fix copies

* merge

* multipliers

* attention multiplier

* drop unused imports

* fix

* fix

* fix

* move rope?

* Update src/transformers/models/granite/configuration_granite.py

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* fix

* Update src/transformers/models/granite/modeling_granite.py

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* fix

* fix

* fix

* fix

* fix-copies

* torch rmsnorm

* add authors

* change model path

* fix

* test

* drop static cache test

* uupdate readme

* drop non-causal

* readme

* drop useless imports

* Update docs/source/en/model_doc/granite.md

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* Update docs/source/en/model_doc/granite.md

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* Update docs/source/en/model_doc/granite.md

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

---------

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
2024-08-27 21:27:21 +02:00
Juan Pizarro
7591ca5bc5
🚨 Add Blip2ForImageTextRetrieval (#29261)
* add Blip2ForImageTextRetrieval

* use one line and remove unnecessary space in tests

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* use  value from the config, rather than hardcoded

* change order of params in Blip2QFormerModel.forward

* update docstring

* fix style

* update test_inference_opt

* move embeddings out of Blip2QFormerModel

* remove from_vision_qformer_configs

* remove autocast float16 in Blip2QFormerModel

* rename fiels into vision_projection,text_projection,use_image_text_matching_head

* use CLIPOutput for  Blip2ImageTextMatchingModelOutput

* remove past_key_values_length from Blip2TextEmbeddings

* fix small typo in the CLIPOutput docstring

* add Blip2ForImageTextRetrieval to Zero Shot Image Classification mapping

* update docstring and add require_torch_fp16

* rollback test_inference_opt

* use use_image_text_matching_head=True in convert

* skip test_model_get_set_embeddings

* fix create_rename_keys error on new itm fields

* revert to do  scale after dot product between "query" and "key"

* fix ValueError on convert script for blip2-opt-2.7b

* update org of paths to Salesforce

* add is_pipeline_test_to_skip for VisualQuestionAnsweringPipelineTests

* [run_slow] blip_2

* removed Blip2ForImageTextRetrieval from IGNORE_NON_AUTO_CONFIGURED

* fix docstring of Blip2ImageTextMatchingModelOutput

* [run_slow] blip_2

* fix multi-gpu tests

* [run_slow] blip_2

* [run_slow] blip_2

---------

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
2024-08-27 18:50:27 +01:00
Ali Salamatian
27903de7ec
Very small change to one of the function parameters (#32548)
Very small change to one of the parameters

np.random.randint second parameter is not included in the possible options. Therefore, we want the upper range to be 2, so that we have some 1 labels in our classification as well.
2024-08-27 09:29:05 -07:00
Sae_Chan_Oh
6101d934a1
🌐 [i18n-KO] Translated conversations.md to Korean (#32468)
* docs: ko: conversations.md

* feat: hand-crafted translate docs

* fix: modify typo after Grammar Check

* Update docs/source/ko/conversations.md

감사합니다

Co-authored-by: SeungAhSon <gongsoonyee@gmail.com>

* Update docs/source/ko/conversations.md

Co-authored-by: SeungAhSon <gongsoonyee@gmail.com>

* Update docs/source/ko/conversations.md

Co-authored-by: SeungAhSon <gongsoonyee@gmail.com>

* Update docs/source/ko/conversations.md

Co-authored-by: SeungAhSon <gongsoonyee@gmail.com>

* Update docs/source/ko/conversations.md

Co-authored-by: SeungAhSon <gongsoonyee@gmail.com>

* Update docs/source/ko/conversations.md

Co-authored-by: SeungAhSon <gongsoonyee@gmail.com>

* Update docs/source/ko/conversations.md

Co-authored-by: SeungAhSon <gongsoonyee@gmail.com>

* Update docs/source/ko/conversations.md

Co-authored-by: SeungAhSon <gongsoonyee@gmail.com>

* Update docs/source/ko/conversations.md

Co-authored-by: SeungAhSon <gongsoonyee@gmail.com>

* Update docs/source/ko/conversations.md

Co-authored-by: SeungAhSon <gongsoonyee@gmail.com>

* Update docs/source/ko/conversations.md

Co-authored-by: SeungAhSon <gongsoonyee@gmail.com>

* fix: accept suggestions about anchor and spacing

* Update docs/source/ko/conversations.md

Co-authored-by: Jihun Lim <31366038+heuristicwave@users.noreply.github.com>

* Update docs/source/ko/conversations.md

Co-authored-by: Jihun Lim <31366038+heuristicwave@users.noreply.github.com>

* Update docs/source/ko/conversations.md

Co-authored-by: Jihun Lim <31366038+heuristicwave@users.noreply.github.com>

* Update docs/source/ko/conversations.md

Co-authored-by: Jihun Lim <31366038+heuristicwave@users.noreply.github.com>

* Update docs/source/ko/conversations.md

Co-authored-by: Jihun Lim <31366038+heuristicwave@users.noreply.github.com>

* Update docs/source/ko/conversations.md

Co-authored-by: Jihun Lim <31366038+heuristicwave@users.noreply.github.com>

* Update docs/source/ko/conversations.md

Co-authored-by: Sungmin Oh <fabxoe.kor@gmail.com>

* Update docs/source/ko/conversations.md

Co-authored-by: Sungmin Oh <fabxoe.kor@gmail.com>

* Update docs/source/ko/conversations.md

Co-authored-by: Sungmin Oh <fabxoe.kor@gmail.com>

* fix: anchor 'what happened inside piepeline?' be removed question mark

* fix: translate the comments in the code block

---------

Co-authored-by: SeungAhSon <gongsoonyee@gmail.com>
Co-authored-by: Jihun Lim <31366038+heuristicwave@users.noreply.github.com>
Co-authored-by: Sungmin Oh <fabxoe.kor@gmail.com>
2024-08-27 09:25:41 -07:00
Marc Sun
7ee4363d19
update torch req for 4-bit optimizer (#33144)
update req
2024-08-27 17:07:10 +02:00
Emin Orhan
d47a9e8ce5
fix redundant checkpointing in example training scripts (#33131)
* fix redundant checkpointing in example scripts

* Update examples/pytorch/image-classification/run_image_classification_no_trainer.py

Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>

* Update examples/pytorch/translation/run_translation_no_trainer.py

Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>

* Update examples/pytorch/token-classification/run_ner_no_trainer.py

Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>

* Update examples/pytorch/text-classification/run_glue_no_trainer.py

Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>

* Update examples/pytorch/summarization/run_summarization_no_trainer.py

Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>

* Update examples/pytorch/semantic-segmentation/run_semantic_segmentation_no_trainer.py

Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>

* Update examples/pytorch/language-modeling/run_mlm_no_trainer.py

Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>

* Update examples/pytorch/language-modeling/run_fim_no_trainer.py

Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>

* Update examples/pytorch/language-modeling/run_clm_no_trainer.py

Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>

* Update examples/pytorch/image-pretraining/run_mim_no_trainer.py

Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>

* Update examples/pytorch/instance-segmentation/run_instance_segmentation_no_trainer.py

Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>

* Update examples/pytorch/multiple-choice/run_swag_no_trainer.py

Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>

* Update examples/pytorch/question-answering/run_qa_no_trainer.py

Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>

* Update examples/pytorch/object-detection/run_object_detection_no_trainer.py

Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>

* Update examples/pytorch/question-answering/run_qa_beam_search_no_trainer.py

Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>

---------

Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>
2024-08-27 15:50:00 +02:00
Joao Gante
c6b23fda65
Llama: make slow tests green 🟢 (#33138) 2024-08-27 14:44:42 +01:00
Matt
9956c2bc98
Add a fix for custom code tokenizers in pipelines (#32300)
* Add a fix for the case when tokenizers are passed as a string

* Support image processors and feature extractors as well

* Reverting load_feature_extractor and load_image_processor

* Add test

* Test is torch-only

* Add tests for preprocessors and feature extractors and move test

* Extremely experimental fix

* Revert that change, wrong branch!

* Typo!

* Split tests
2024-08-27 14:39:57 +01:00
Zizhao Chen
834ec7b1cc
fix Idefics2VisionConfig type annotation (#33103)
* fix Idefics2VisionConfig type annotation

* Update modeling_idefics2.py

* Update modeling_idefics2.py

add ignore copy

* Update modeling_idefics2.py

* Update modeling_idefics2.py
2024-08-27 14:43:28 +02:00
pedrobrs
d1f39c484d
Update stateful_callbacks state before saving checkpoint (#32115)
* update ExportableState callbacks state before saving trainer_state on save_checkpoint

* run make fixup and fix format

* manage multiple stateful callbacks of same class
2024-08-27 14:33:35 +02:00
Vaibhav Srivastav
6f0ecf1049
[docs] add quick usage snippet to Whisper. (#31289)
* [docs] add quick usage snippet to Whisper.

* Apply suggestions from review.

* 💉 Fix the device for pipeline.
2024-08-27 14:11:52 +02:00
Boris Feld
892d51caee
Log additional test metrics with the CometCallback (#33124)
* Log additional test metrics with the CometCallback.

Also follow the same metric naming convention as other callbacks

* Merge 2 subsequent if-statements

* Trigger Build

---------

Co-authored-by: Aliaksandr Kuzmik <alexander.kuzmik99@gmail.com>
2024-08-27 13:40:53 +02:00
dependabot[bot]
746e1148cf
Bump torch from 1.13.1 to 2.2.0 in /examples/research_projects/jax-projects/hybrid_clip (#33137)
Bump torch in /examples/research_projects/jax-projects/hybrid_clip

Bumps [torch](https://github.com/pytorch/pytorch) from 1.13.1 to 2.2.0.
- [Release notes](https://github.com/pytorch/pytorch/releases)
- [Changelog](https://github.com/pytorch/pytorch/blob/main/RELEASE.md)
- [Commits](https://github.com/pytorch/pytorch/compare/v1.13.1...v2.2.0)

---
updated-dependencies:
- dependency-name: torch
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-08-27 13:33:37 +02:00
Joao Gante
ab0ac3b98f
CI: fix efficientnet pipeline timeout and prevent future similar issues due to large image size (#33123)
* fix param not being passed in tested; add exceptions

* better source of model name

* Update utils/create_dummy_models.py

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

---------

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
2024-08-27 11:58:27 +01:00
Yih-Dar
3806faa171
disable scheduled daily CI temporarily (#33136)
disable scheduled daily CI temporary

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2024-08-27 11:52:15 +02:00
Aya
7562366d4b
fix: multilingual midel convert to tflite get wrong token (#32079)
* fix: multilingual midel convert to tflite get wrong token

* fix: modify test_force_tokens_logits_processor the checking value as scores.dtype.min

---------

Co-authored-by: kent.sc.hung <kent.sc.hung@benq.com>
Co-authored-by: Aya <[kent831217@gmail.com]>
2024-08-27 11:44:09 +02:00
Sai-Suraj-27
3bf6dd8aa1
fix: Fixed CodeGenTokenizationTest::test_truncation failing test (#32850)
* Fixed failing CodeGenTokenizationTest::test_truncation.

* [run_slow] Codegen

* [run_slow] codegen
2024-08-27 09:20:59 +02:00
Zach Mueller
9578c2597e
Fixup py 38 type hints for mps friendly (#33128)
Fixup py 38
2024-08-26 12:27:39 -04:00
Pablo Montalvo
26f043bd4d
quickfix documentation (#32566)
* fix documentation

* update config
2024-08-26 17:49:44 +02:00
Sai-Suraj-27
3562772969
fix: Fixed pydantic required version in dockerfiles to make it compatible with DeepSpeed (#33105)
Fixed pydantic required version in dockerfiles.
2024-08-26 17:10:36 +02:00
Ritik Nandwal
a378a54a57
Add changes for uroman package to handle non-Roman characters (#32404)
* Add changes for uroman package to handle non-Roman characters

* Update docs for uroman changes

* Modifying error message to warning, for backward compatibility

* Update instruction for user to install uroman

* Update docs for uroman python version dependency and backward compatibility

* Update warning message for python version compatibility with uroman

* Refine docs
2024-08-26 17:07:01 +02:00
Joao Gante
72d4a3f9c1
mps: add isin_mps_friendly, a wrapper function for torch.isin (#33099) 2024-08-26 15:34:19 +01:00
Joao Gante
894d421ee5
Test: add higher atol in test_forward_with_num_logits_to_keep (#33093) 2024-08-26 15:23:30 +01:00
Joao Gante
93e0e1a852
CI: add torchvision to the consistency image (#32941) 2024-08-26 15:17:45 +01:00
Shijie
19e6e80e10
support qwen2-vl (#32318)
* support-qwen2-vl

* tidy

* tidy

* tidy

* tidy

* tidy

* tidy

* tidy

* hyphen->underscore

* make style

* add-flash2-tipd

* delete-tokenize=False

* remove-image_processor-in-init-file

* add-qwen2_vl-in-MODEL_FOR_VISION_2_SEQ_MAPPING_NAMES

* format-doct

* support-Qwen2VLVisionConfig

* remove-standardize_cache_format

* fix-letter-varaibles

* remove-torch-in-image-processor

* remove-useless-docstring

* fix-one-letter-varaible-name

* change-block-name

* default-quick-gelu-in-vision

* remove-useless-doc

* use-preimplemented-flash-forward

* fix-doc

* fix-image-processing-doc

* fix-apply-rotary-embed

* fix-flash-attn-sliding-window

* refactor

* remove-default_template

* remove-reorder_cache

* simple-get-rope_deltas

* update-prepare_inputs_for_generation

* update-attention-mask

* update-rotary_seq_len

* remove-state

* kv_seq_length

* remove-warning

* _supports_static_cache

* remove-legacy-cache

* refactor

* fix-replace

* mrope-section-doc

* code-quality

* code-quality

* polish-doc

* fix-image-processing-test

* update readme

* Update qwen2_vl.md

* fix-test

* Update qwen2_vl.md

* nit

* processor-kwargs

* hard-code-norm_layer

* code-quality

* discard-pixel-values-in-gen

* fix-inconsistent-error-msg

* unify-image-video

* hidden_act

* add-docstring

* vision-encode-as-PreTrainedModel

* pixel-to-target-dtype

* update doc and low memoryvit

* format

* format

* channel-foramt

* fix vit_flashatt

* format

* inherit-Qwen2VLPreTrainedModel

* simplify

* format-test

* remove-one-line-func-in-image-processing

* avoid-one-line-reshape

* simplify-rotary_seq_len

* avoid-single-letter-variable

* no-for-loop-sdpa

* avoid-single-letter-variable

* remove-one-line-reshape

* remove-one-line-reshape

* remove-no-rope-in-vit-logic

* default-mrope

* add-copied-from

* more-docs-for-mrope

* polish-doc

* comment-and-link

* polish-doc

* single-letter-variables

* simplify-image-processing

* video->images

* kv_seq_len-update

* vision-rope-on-the-fly

* vision-eager-attention

* change-processor-order

---------

Co-authored-by: baishuai <baishuai.bs@alibaba-inc.com>
Co-authored-by: ShuaiBai623 <43326198+ShuaiBai623@users.noreply.github.com>
2024-08-26 15:16:44 +02:00
S M Jishanul Islam
8defc95df3
Updated the custom_models.md changed cross_entropy code (#33118) 2024-08-26 13:15:43 +02:00
Matt
0a7af19f4d
Update Jinja docs with new functions and general cleanup (#33097) 2024-08-23 17:40:06 +01:00
Arun Prakash A
e3a5f35cd5
added doctring to SchedulerType class (#32898)
* added doctring to SchedulerType class

* Remove trailing whitespace  src/transformers/trainer_utils.py

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* fixup

---------

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
2024-08-23 09:15:25 -07:00
Donggeun Yu
1dbd9d3693
DeviceGuard added to use Deformable Attention more safely on multi-GPU (#32910)
* Update modeling_deformable_detr.py

* Update src/transformers/models/deformable_detr/modeling_deformable_detr.py

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* Update ms_deform_attn_cuda.cu

* Update modeling_deformable_detr.py

* Update modeling_deformable_detr.py

* [empty] this is a empty commit

---------

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
2024-08-23 17:12:10 +01:00
Matt
371b9c1486
Enable some Jinja extensions and add datetime capabilities (#32684)
* Add new Jinja features:

- Do extension
- Break/continue in loops
- Call strftime to get current datetime in any format

* Add new Jinja features:

- Do extension
- Break/continue in loops
- Call strftime to get current datetime in any format

* Fix strftime template

* Add template strip() just to be safe

* Remove the do extension to make porting easier, and also because it's the least useful

* Rename test

* strftime -> strftime_now

* Split test

* Update test to use strftime_now

* Refactor everything out into chat_template_utils

* Refactor everything out into chat_template_utils

* Refactor everything out into chat_template_utils

* Refactor everything out into chat_template_utils

* Refactor everything out into chat_template_utils
2024-08-23 14:26:12 +01:00