transformers

mirror of https://github.com/huggingface/transformers.git synced 2025-07-31 02:02:21 +06:00

Author	SHA1	Message	Date
Maria Khalusova	9c1d59882b	Removed BLIP mention from the troubleshooting guide (#21872 ) removed BLIP mention from the troubleshooting guide	2023-03-01 08:26:25 -05:00
Younes Belkada	72787c5b68	[`Blip`] Fix blip doctest (#21868 ) fix blip doctest	2023-03-01 14:05:53 +01:00
Lorenzo Balzani	619d831848	Italian translation of community.mdx (#21871 ) Italian translation of community.mdx gh-17459	2023-03-01 07:49:56 -05:00
raghavanone	ebd5258975	Change the way tensor is reshaped in BartAttention (from .view to .reshape) (#21860 ) * Change the .view call to .reshape * Change the .view call to .reshape to all the copies from bart attention * Fix copies and style * Fix copies and style * Fix copies and style * Fix copies and style * Fix copies and style * Revert unneccessary changes * Revert unneccessary changes * Revert unneccessary changes * Revert unneccessary changes	2023-03-01 07:47:17 -05:00
Eugene Zapolsky	f71873c5fc	[deepspeed] check whether model is NLP one instead of counting on input type (#21800 ) * trying to figure out whether model is NLP * drop my changes and apply easier fix * trying to handle all int input types * fix logic --------- Co-authored-by: Stas Bekman <stas@stason.org>	2023-03-01 07:41:35 -05:00
saswatmeher	72e9ca7519	Fix gradient checkpointing bug Bart (#21866 ) Co-authored-by: saswatmeher <saswatmeher@cse.iitb.ac.in>	2023-03-01 11:41:58 +00:00
Andy Ehrenberg	5e6cd51bec	Flax beam search fix (#21857 )	2023-03-01 10:25:33 +00:00
Arthur	b599b19289	[ConvBert] Fix #21523 (#21849 ) * fix reshaping Fixes #21523 * add test * styling * last fixes * Update src/transformers/models/convbert/modeling_convbert.py * code quallity	2023-03-01 11:11:04 +01:00
Arthur	44e3e3fb49	prepare for "__floordiv__ is deprecated and its behavior will change in a future version of pytorch" (#20211 ) * rounding_mode = "floor" instead of // to prevent behavioral change * add other TODO * use `torch_int_div` from pytrch_utils * same for tests * fix copies * style * use relative imports when needed * Co-authored-by: sgugger <sylvain.gugger@gmail.com>	2023-03-01 10:49:21 +01:00
Sylvain Gugger	b29e2dcaff	Fix flaky test for log level (#21776 ) * Fix flaky test for log level * Fix other flaky test	2023-02-28 16:24:14 -05:00
Matt	acfb714bdf	Improve TF weight loading, especially PT crossloading (#21792 ) * First commit for the improved PT-TF weight loading * Remove workarounds from TFEncoderDecoder tests * Allow a custom weight renaming function in from_pretrained and use that to clean up EncoderDecoder * make fixup * First attempt at visionencoderdecoder * Disable tensorfloat32 in tests to get consistent outputs * Quick fix to tf_vision_encoder_decoder tests * make fixup * Update Blenderbot tests * Remove unused arg in modeling_tf_opt * load_tf_sharded_weights had strict=True! This meant transfer learning was impossible, so I'm setting it to False. * Support prefixes when loading sharded TF checkpoints * make fixup * Add test to load sharded models with a weight prefix * Fix sharded weight loading test * Add a test for transfer from a sharded checkpoint * make fixup * Add test to check that crossloading from PT with a prefix works * Refactor from_pretrained in the encoderdecoder classes * Refactor from_pretrained in the encoderdecoder classes * missmatched -> mismatched * Explicitly check for None * No comments showing my very impressive and attractive knowledge of Py3.9+ * Disable TF32 across all TF tests	2023-02-28 18:41:34 +00:00
Yih-Dar	871c31a6f1	🔥Rework pipeline testing by removing `PipelineTestCaseMeta` 🚀 (#21516 ) * Add PipelineTesterMixin * remove class PipelineTestCaseMeta * move validate_test_components * Add for ViT * Add to SPECIAL_MODULE_TO_TEST_MAP * style and quality * Add feature-extraction * update * raise instead of skip * add tiny_model_summary.json * more explicit * skip tasks not in mapping * add availability check * Add Copyright * A way to diable irrelevant tests * update with main * remove disable_irrelevant_tests * skip tests * better skip message * better skip message * Add all pipeline task tests * revert * Import PipelineTesterMixin * subclass test classes with PipelineTesterMixin * Add pipieline_model_mapping * Fix import after adding pipieline_model_mapping * Fix style and quality after adding pipieline_model_mapping * Fix one more import after adding pipieline_model_mapping * Fix style and quality after adding pipieline_model_mapping * Fix test issues * Fix import requirements * Fix mapping for MobileViTModelTest * Update * Better skip message * pipieline_model_mapping could not be None * Remove some PipelineTesterMixin * Fix typo * revert tests_fetcher.py * update * rename * revert * Remove PipelineTestCaseMeta from ZeroShotAudioClassificationPipelineTests * style and quality * test fetcher for all pipeline/model tests --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2023-02-28 19:40:57 +01:00
Anahita Bhiwandiwalla	4cb5ffa93d	Add loss for BridgeTowerForMaskedLM and BridgeTowerForImageAndTextRetrieval (#21684 ) * Add loss for BridgeTowerForMaskedLM and BridgeTowerForImageAndTextRetrieval * minor fix return_dict * implement test for loss computation --------- Co-authored-by: Tiep Le <97980157+tileintel@users.noreply.github.com> Co-authored-by: Tiep Le <tiep.le@intel.com>	2023-02-28 12:21:48 -05:00
Younes Belkada	7f4f8b97d0	[`Blip2`] Fix Blip-2 multi gpu (#21707 ) * fix blip multi gpu * fix * final changes * adapt suggestions * fix failing slow test * forward contrib credits from testing and suggestions * reformat --------- Co-authored-by: akkikiki <akkikiki@users.noreply.github.com>	2023-02-28 17:28:58 +01:00
Yih-Dar	aab895c396	Make Slack CI reporting stronger (#21823 ) * Use token * Avoid failure * better error * Fix * fix style --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2023-02-28 17:12:44 +01:00
Maria Khalusova	6ca844582c	Add: task guide for zero shot object detection (#21829 ) * zero shot object detection part 1 * added batch prediction section * added image guided object detection section * make style * added the task guide to the TOC * minor polishing * Apply suggestions from code review Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> Co-authored-by: Alara Dirik <8944735+alaradirik@users.noreply.github.com> * added embedded owlvit demo * Apply suggestions from code review Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * minor fix * make style --------- Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> Co-authored-by: Alara Dirik <8944735+alaradirik@users.noreply.github.com> Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>	2023-02-28 10:23:08 -05:00
Herumb Shandilya	31fa2b6c68	[GPTJ] Fix gradient checkpointing bug (#21794 ) * If applied, this commit fixes generate bug in gptj * Remove extra same code block * formatting and test fix * Conflict fix and declaration error fix --------- Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>	2023-02-28 10:12:42 -05:00
raghavanone	eec76042f4	Fix the issue of blip model returning loss even when the label is not provided. (#21811 ) * Fix the issue of blip model returning loss even when the label is not provoided * Fix ruff failure * Incorporate PR feedbacks * Incorporate PR feedbacks * Incorporate PR feedbacks * Incorporate PR feedbacks	2023-02-28 09:54:08 -05:00
Younes Belkada	b8de7e448e	[`Blip2`] Add `Blip2Model` (#21817 ) * add v1 * add `Blip2Model` - add relevant functions - add tests - add on automapping * fix docs * fix doctest	2023-02-28 15:42:55 +01:00
Younes Belkada	ae9230af40	[`T5`] Fix torchquant issue (#21843 ) * fix torchquant issue * add tests	2023-02-28 15:09:44 +01:00
anruijian	2d506ea4c4	Fix tf random token masking probability in data collator (#21834 ) * fix tf random mask tokens probability * fix tf random mask tokens probability in collator for langauge modelling	2023-02-28 07:55:47 -05:00
Karim Foda	4fe744f528	Fix gradient checkpointing imagegpt (#21816 ) * Fix gradient checkpointing bug in gptneox * Fix gradient checkpointing bug in modeling_imagegpt.py * Revert gpt neox changes --------- Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>	2023-02-28 07:47:04 -05:00
Karim Foda	e07a3d95f8	Fix gradient checkpointing bug in git (#21818 ) Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>	2023-02-28 07:46:33 -05:00
Andy Ehrenberg	50db741417	check for None forced tokens (#21793 )	2023-02-28 13:24:43 +01:00
saswatmeher	50644cf624	Fix gradient checkpointing bug BioGpt (#21844 ) Co-authored-by: saswatmeher <saswatmeher@cse.iitb.ac.in>	2023-02-28 11:56:25 +00:00
Yih-Dar	a9dd124346	Rename `MobileViTModelTest` to `TFMobileViTModelTest` (#21825 ) Let's give TF a bit more love ❤️ 🙏 Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2023-02-28 08:10:29 +01:00
Stas Bekman	c7f3abc257	introduce `logger.warning_once` and use it for grad checkpointing code (#21804 ) * logger.warning_once * style	2023-02-27 13:25:06 -08:00
Yih-Dar	f95f60c829	Fix quality with `ruff==0.0.253` (#21828 ) fix quality with ruff 0.0.253 Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2023-02-27 19:38:44 +01:00
Joao Gante	92dfceb124	Inheritance-based framework detection (#21784 )	2023-02-27 15:31:55 +00:00
Karim Foda	7811bf7e73	Fix gradient checkpointing bug in gptneox (#21815 ) * Fix gradient checkpointing bug in gptneox * Remove use_cache block	2023-02-27 14:49:32 +00:00
fxmarty	0c7f93f5f1	Fix nn.init.trunc_normal_ call on torch.float16 data (#21789 ) fix nn.init.trunc_normal_ call on half data	2023-02-27 13:31:29 +01:00
fxmarty	ebf84f07ba	Fix PyTorch Perceiver `PerceiverFourierPositionEncoding` with fp16 (#21787 ) * fix perceiver fp16 * hopefully fix tests	2023-02-27 11:43:57 +00:00
Younes Belkada	831f3144a6	[`tests`] add `accelerate` marker (#21743 ) * add `accelerate` marker * add to docs * Update docs/source/en/testing.mdx	2023-02-27 12:33:34 +01:00
Arthur	c51dc4f927	[torch] remove deprecated uint8 in favor of bool (#21384 ) * uint8 -> bool * fix copies * style * update test modeling commen when checking attention buffers * style * use logical not on random mask instead of subtraction with 1 * remove torch uint8 * quality * remove modified modeling utils * Update based on review Co-authored-by: sgugger <sylvain.gugger@gmail.com> --------- Co-authored-by: sgugger <sylvain.gugger@gmail.com>	2023-02-27 11:46:02 +01:00
Arthur	cc44e72d14	[Pipeline] Add zero shot audio classificatoin pipeline (#21600 ) * add pipeline * update init * add zero shot to init * update inits and correct checkpoints * update base to support input features * add tests * Update src/transformers/pipelines/zero_shot_audio_classification.py Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com> * Update src/transformers/pipelines/zero_shot_audio_classification.py Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com> * update pieline code * use tiny checkpoint * nits and expected value with tiny model * style * last nit on tests values * fix styling * fix collate fn that was casting t float * update --------- Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>	2023-02-27 11:43:44 +01:00
Tianqi Zhang (张天启)	2ea1ef9090	[FX tracer] Make `concrete_args` from outside available (#21775 ) make concrete_args from outside available	2023-02-27 08:57:57 +01:00
Thomas Paviot	ba2a5f13f7	Fix en documentation typos (#21799 ) * fix wrong url * typos in english documentation	2023-02-27 08:36:36 +01:00
Julian Weber	a36983653e	Fix type in gpt2 config docstring (#21782 ) Fix docstring gpt2 config	2023-02-27 08:19:19 +01:00
bofeng huang	3c0ce60855	[examples/summarization] deal with `max_length` and `num_beams` (#21740 ) * Override the decoding parameters of Seq2SeqTrainer * Fix quality * Fix max_length parameter * Fix quality * Remove redundant parameter max_length * Separate the preprocess of train and validation to use different max_target_length	2023-02-27 08:18:14 +01:00
Moshe Berchansky	9ddf4f4f03	Fix resume_from_checkpoint for deepspeed (#21735 ) * Fix resume_from_checkpoint for deepspeed Fix resume_from_checkpoint for deepspeed, by ensuring that the deepspeed engine is the one to load the checkpoint. * Empty commit to trigger CI * Removed deepspeed skipping Removed deepspeed skipping inside the _load_from_checkpoint function, as it is obsolete * another adjustment * Trigger CI * trigger circleci * style --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com> Co-authored-by: Stas Bekman <stas@stason.org>	2023-02-25 11:30:54 -08:00
Sanchit Gandhi	3dae0d7b4f	[SpeechT5] Fix HiFiGAN tests (#21788 )	2023-02-24 16:55:38 +01:00
Yi Heng Lim	59c1d5b96b	[GPT2, ProphetNet] Fix gradient checkpointing bug (#21772 ) * fix gradient checkpointing bug * fix gradient checkpointing bug * ran make fix-copies * fixed bug * fixed bug	2023-02-24 15:37:22 +00:00
Kashif Rasul	ba0e370dc1	[time series] updated expected values for integration test. (#21762 ) * updated expected * prediction_length fix * prediction_length default value * default prediction_length 24 * revert back prediction_length default * move prediction_length test	2023-02-24 12:36:54 +01:00
Joao Gante	440f39754b	Generate - update cookie cutters to not initialize cache with training and gradient checkpointing (#21759 )	2023-02-24 11:21:00 +00:00
Arthur	087436c98e	Fix-ci-whisper (#21767 ) * fix history * input_features instead of input ids for TFWhisport doctest * use translate intead of transcribe	2023-02-24 11:39:25 +01:00
bofeng huang	c8545d2a9c	[Whisper] Add SpecAugment (#21298 ) * Return and rescale attention_mask * Add SpecAugment to Whisper modeling * Fix test * Update docstring * Add SpecAug related parameters to model config * Add the _mask_input_features function to doc * Fix quality * Apply suggestions from code review Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> * Remove dev comments * Add test * Resolve conflict * feat: mask {feature, time} prob fast tests * Apply suggestions from code review Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> --------- Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> Co-authored-by: sanchit-gandhi <sanchit@huggingface.co> Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>	2023-02-24 11:07:52 +01:00
Sanchit Gandhi	75bd49ff88	[Flax] Fix erroneous kwargs being passed to generate config (#21765 )	2023-02-24 09:59:18 +01:00
Arthur	14f33205a7	Different behavior in DistilBERT when using "inputs_embeds" (#21752 ) * Different behavior in DistilBERT when using "inputs_embeds" Fixes #21089 * fix failing test	2023-02-24 09:48:07 +01:00
Sanchit Gandhi	13489248fa	[Examples] Generalise run audio classification for log-mel models (#21756 ) * [Examples] Generalise run audio classification for log-mel models * batch feature extractor * make style	2023-02-24 09:19:07 +01:00
Shubhamai	f7ca656f07	[Flax] adding support for batch norm layers (#21581 ) * [flax] adding support for batch norm layers * fixing bugs related to pt+flax integration * cleanup, batchnorm support in sharded pt to flax * support for batchnorm tests in pt+flax integration * simplifying checking batch norm layer	2023-02-24 08:47:33 +01:00

12173 Commits