transformers

mirror of https://github.com/huggingface/transformers.git synced 2025-07-30 01:32:23 +06:00

Author	SHA1	Message	Date
Lucain	923110b74f	Remove set_access_token usage + fail tests if FutureWarning (#22051 ) * Remove set_access_token usage + fail tests if FutureWarning * do not fail on FutureWarning in CI --------- Co-authored-by: testbot <lucainp@hf.co>	2023-03-09 09:23:48 -05:00
Yih-Dar	1cbac6867b	Mark all `BridgeTower` tests slow for now (#22039 ) * slow me --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2023-03-08 21:48:29 +01:00
Anahita Bhiwandiwalla	de81adf978	[WIP] Add BridgeTowerForContrastiveLearning (#21964 ) * Add BridgeTower for ITC * Fix review feedback * Rename BridgeTowerForITC, cleanup * Fix style and quality * implement tests --------- Co-authored-by: Tiep Le <97980157+tileintel@users.noreply.github.com> Co-authored-by: Tiep Le <tiep.le@intel.com>	2023-03-08 09:00:54 -05:00
Yih-Dar	b338414e61	Update tiny model creation script and some others files (#22006 ) * Update 1 * Update 2 * Update 3 * Update 4 * Update 5 * Update 6 * Update 7 * Update 8 * Update 9 * Update 10 --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2023-03-07 22:31:14 +01:00
Eli Simhayev	8abe4930d3	[Time-Series] informer model (#21099 ) * added informer to gitignore * added informer to gitignore * WIP informer2020 * added checking that instantiate works * added config using gluonTS by kashif * WIP config * adding informeConfig. need to remove FeatureEmbedder * done InformerConfig, but need to change the names * Done informer model init. working on enc-dec * added things to address, after reading again enc-dec in the paper * done modeling - checking initialization work * added informer to gitignore * WIP informer2020 * added checking that instantiate works * added config using gluonTS by kashif * WIP config * adding informeConfig. need to remove FeatureEmbedder * done InformerConfig, but need to change the names * Done informer model init. working on enc-dec * added things to address, after reading again enc-dec in the paper * done modeling - checking initialization work * moved enc-dec init to InformerEncoder/Decoder init * added 'init_std' to config, now model init works! * WIP conversion script, and added code sources * WIP conversion script: loading original informer pth works * WIP conversion script: change defaults in the config * WIP conversion script: supporting Informer input embedding * WIP conversion script: added parameters for the informer embed * WIP conversion script: change dim_feedforward=2048 * WIP conversion script: remove unused args for loading checkpoint * just cleaning up * DataEmbedding removed, after thinking with Kashif * working on forward pass * WIP forward pass: trying to establish working batch for forward pass * cleaning and finalizing * adding HF names and docs * init after cleaning works * WIP in tests * added docs for the informer specific args * fix style * undo change * cleaning informer, now need to work only enc-dec * initial enc-dec classes * added encoder and decoder * added todo * add todos for conv_layers * added decoder docs from vanilla * added encoder docs from vanilla * remove encoder decoder from the original informer * removed AttentionLayer from the original paper * removed TriangularCausalMask, same as decoder_attention_mask * initial sparse attention * use conv_layers * fixed test_config test * fix parenthesis when itearting zip(layers, conv_layers) * error found in prob attention, added sizes as comments * fix sizes * added proposal for q_reduce indexing, and remove unused * WIP ProbMask, and changed factor=2 for testing * remove unused libs for this PR for creating the env * fix checking the attn_weights.size() after bmm * Q_reduce: changed from torch.gather to simple slicing * WIP calculate final attn_output * finish adding v_aggregated, attn_output ready * changed tgt_len to u in attention_mask, need to fix the size error * comment attention_mask for encoder, and fix if cond for v_agg * added ProbMask support (wip), removed old original code * finished ProbMask 😃 * Revert "remove unused libs for this PR for creating the env" This reverts commit `11a081e09e`. * fixes * make style * fix initial tests * fix more tests * dry * make style * remove unused files * style * added integration tests * fix num_static_real_features * fix header * remove unused function * fix example * fix docs * Update src/transformers/models/informer/configuration_informer.py Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Update src/transformers/models/informer/modeling_informer.py Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Update src/transformers/models/informer/configuration_informer.py Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Update src/transformers/models/informer/configuration_informer.py Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Update src/transformers/models/informer/configuration_informer.py Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Update src/transformers/models/informer/configuration_informer.py Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * fixes for reviewer * use prediction_length from model * fix style * fixed informer.mdx * added to index * updated readme * undo * make fix-copies * typo * fix copy * added Informer to toctree * in order * fixed comments * remove unneeded new lines in docs * make static real and cat optional * fix use of distil conv layers * fixed integration test * added checkpoint for convlayer * make fix-copies * updated from time series model * make fix-copies * copy decoder * fix unit tests * updated scaling config * fix integration tests * IGNORE_NON_TESTED * IGNORE_NON_AUTO_CONFIGURED * IGNORE_NON_AUTO_CONFIGURED * updated check configs * fix formatting * undo change from time series * prediction_length should not be None * aliign with the blog: prettify ProbSparse and change attention_factor to sampling_factor * make style * make fix-copies * niels CR: update contributed by * niels CR: update configuration_informer.py Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * niels CR: update kashif -> huggingface Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * niels CR: `sampling_factor` only relevant when `attention_type`=prob * make style * fixed U_part: added multiplication by `L_Q` * fixed bug: remove `is not None` from `if config.distil` * fixed test: `decoder_seq_length` to `encoder_seq_length` in cross_attentions check * fix integration tests * updated model hub * do not shift as in training * undo * fix make-copies * make fix-copies * added `if prediction_length is None` * changed `ProbSparseAttention` to `InformerProbSparseAttention` * changed `V_sum` -> `v_mean_dim_time` * changed `ConvLayer` to `InformerConvLayer` and fixed `super()` * TimeSeriesTansformer->Informer in decoder's Copied from * more descriptive in ProbSparse * make style * fix coped from * Revert "added `if prediction_length is None`" This reverts commit `b4cbddfa05`. * fixed indent * use InformerSinusoidalPositionalEmbedding * make fix-style * fix from #21860 * fix name * make fix-copies * use time series utils * fix dec num_heads * docstring * added time series util doc * _import_structure * formatting * changes from review * make style * fix docs * fix doc * removed NegativeLogLikelihood --------- Co-authored-by: Kashif Rasul <kashif.rasul@gmail.com> Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>	2023-03-07 21:36:38 +01:00
NielsRogge	dde718e7a6	[DETR and friends] Remove is_timm_available (#21814 ) * First draft * Fix to_dict * Improve conversion script * Update config * Remove timm dependency * Fix dummies * Fix typo, add integration test * Upload 101 model as well * Remove timm dummies * Fix style --------- Co-authored-by: Niels Rogge <nielsrogge@Nielss-MacBook-Pro.local>	2023-03-07 15:19:39 -05:00
Sanchit Gandhi	7c39318136	[Whisper] Add model for audio classification (#21754 ) * [Whisper] Add model for audio classification * make fix-copies * add to docs * add docstring * empty returns * add code example * switch to fleurs * stick everything on one line	2023-03-07 16:20:21 +01:00
Yih-Dar	9402788b34	Skip `test_multi_gpu_data_parallel_forward` for some model tests (#21991 ) skip test_multi_gpu_data_parallel_forward for some model tests Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2023-03-07 14:23:36 +01:00
NielsRogge	95408e9953	[DETR, YOLOS] Fix device bug (#21974 ) * Fix integration test * Add test * Add test	2023-03-07 07:34:04 -05:00
Yih-Dar	5b28b78332	Update `Jukebox` tests (#21984 ) * update expected values for jukebox * update expected values for jukebox * update expected values for jukebox * update expected values for jukebox * update expected values for jukebox --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2023-03-07 04:20:14 +01:00
Yih-Dar	f2a2616b74	Update expected values for `test_xglm_sample` (#21975 ) update expected values for xglm Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2023-03-06 18:07:31 +01:00
Yih-Dar	fcf813417a	Update expected values in `XLMProphetNetModelIntegrationTest` (#21957 ) update values Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2023-03-06 09:15:44 +01:00
Arthur	718e9d777f	[CLAP] Support batched inputs for CLAP. Fixes pipeline issues (#21931 ) * fix pipeline * fix feature_extraction clap * you can now batch the `is_longer` attribute * add tests * fixup * add expected scores * comment on is_longert	2023-03-03 18:42:18 +01:00
Yih-Dar	d4306daea1	Fix `AlignModelTest` tests (#21923 ) * fix * fix --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2023-03-03 14:47:09 +01:00
Yih-Dar	fa9d2ad7ec	Update `model_split_percents` for `WhisperModelTest` (#21922 ) Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2023-03-03 14:35:08 +01:00
Yih-Dar	9f5bfe1b99	Avoid modeling tests run in pipeline CI jobs (#21911 ) * rework is_pipeline_test * bring back 3 tests --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2023-03-02 21:23:06 +01:00
Kashif Rasul	db979f7588	[time series] Add Time series inputs tests (#21846 ) * intial test of inputs * added test for generation * remove asserts * fixed test * Update tests/models/time_series_transformer/test_modeling_time_series_transformer.py Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> --------- Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>	2023-03-02 20:43:35 +01:00
Yih-Dar	88e5c51a15	Temporarily skip 3 tests in `BridgeTowerModelTest` (#21908 ) skip for now Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2023-03-02 19:16:03 +01:00
Yih-Dar	e6de918676	Add Blip and Blip2 for pipeline tests (#21904 ) * fix * add to tests * style and quality * add missing --------- Co-authored-by: NielsRogge <NielsRogge@users.noreply.github.com> Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2023-03-02 18:20:34 +01:00
Nicolas Patry	1325459105	Refactor whisper asr pipeline to include language too. (#21427 ) * [WIP] whisper refacto to support language output. * Handling merges. * A bit more cleanup and comments. * Many improvements. Lots of details everywhere. * Cleanup old code and tests. * Handle lone timestamp tokens (just recover when something bad happens). * Adding return_language example. * No ffmpeg. * Hmm. * Some corrections. * Both fast and slow. * New black. * Update src/transformers/models/whisper/tokenization_whisper.py Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> * Update src/transformers/models/whisper/tokenization_whisper.py Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> * Remove print. * Undoing tests modifications. * Smaller test modifications. * Rename. * Remove maxDiff. --------- Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>	2023-03-02 18:12:19 +01:00
Kian Sierra McGettigan	6bf885375a	Prophetnet batch dimension inversion fix (#21870 ) * decoder forward pass is working * no model has forward pass returning attentions * decoder ngram changed to not mix batch size * current basic forward pass returns identical result * passed test_model attentions * passed test_encoder_decoder_model_generate * passed test_headmasking * removed old block * removed comments bug/fixme * removed bug comments * applied styling * applied fix-copies * applied ngram forward comments * corrected dimension notation * applied styling and comment fixes * changed asserts for raise ValueError * changed question gen test * updated hidden_states integration test * applied styling	2023-03-02 12:07:45 -05:00
Arthur	c87654dca1	[Whisper] Add rescaling function with `do_normalize` (#21263 ) * add `zero_mean_unit_var_norm` function * normalize before MEL computation * fixup * add simple test * quality * Update tests/models/whisper/test_feature_extraction_whisper.py Co-authored-by: Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com> * fixup * use attention masks if padding was applied * Update based on review Co-authored-by: bofeng huang <bofenghuang7@gmail.com> --------- Co-authored-by: Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com> Co-authored-by: bofeng huang <bofenghuang7@gmail.com>	2023-03-02 14:17:21 +01:00
Yih-Dar	36ee128375	Fix `WhisperModelTest` (#21883 ) * force on the same device * fix tests --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2023-03-01 20:41:27 +01:00
Alara Dirik	269b054939	Add ALIGN to transformers (#21741 ) Adds the ALIGN model to transformers. ALIGN is introduced in "Scaling Up Visual and Vision-Language Representation Learning With Noisy Text Supervision" by Chao Jia, Yinfei Yang, Ye Xia, Yi-Ting Chen, Zarana Parekh, Hieu Pham, Quoc V. Le, Yunhsuan Sung, Zhen Li, Tom Duerig.	2023-03-01 21:23:31 +03:00
Matt	f7c618e3b0	Add TFVisionTextDualEncoder (#21873 ) * Temporary commit to stash everything so far * Temporary commit to stash everything so far * stash commit * Refactor from_pretrained * Fix final test, make fixup * Update dummies * Add model to TEST_FILES_WITH_NO_COMMON_TESTS * Update src/transformers/models/vision_text_dual_encoder/modeling_tf_vision_text_dual_encoder.py Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com> * Update src/transformers/models/vision_text_dual_encoder/modeling_tf_vision_text_dual_encoder.py Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com> * Update src/transformers/models/vision_text_dual_encoder/modeling_tf_vision_text_dual_encoder.py Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com> * Update src/transformers/models/vision_text_dual_encoder/modeling_tf_vision_text_dual_encoder.py Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com> * Add TFVisionTextDualEncoder to utils/documentation_tests.txt * make fixup --------- Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>	2023-03-01 18:00:48 +00:00
Arthur	b599b19289	[ConvBert] Fix #21523 (#21849 ) * fix reshaping Fixes #21523 * add test * styling * last fixes * Update src/transformers/models/convbert/modeling_convbert.py * code quallity	2023-03-01 11:11:04 +01:00
Arthur	44e3e3fb49	prepare for "__floordiv__ is deprecated and its behavior will change in a future version of pytorch" (#20211 ) * rounding_mode = "floor" instead of // to prevent behavioral change * add other TODO * use `torch_int_div` from pytrch_utils * same for tests * fix copies * style * use relative imports when needed * Co-authored-by: sgugger <sylvain.gugger@gmail.com>	2023-03-01 10:49:21 +01:00
Matt	acfb714bdf	Improve TF weight loading, especially PT crossloading (#21792 ) * First commit for the improved PT-TF weight loading * Remove workarounds from TFEncoderDecoder tests * Allow a custom weight renaming function in from_pretrained and use that to clean up EncoderDecoder * make fixup * First attempt at visionencoderdecoder * Disable tensorfloat32 in tests to get consistent outputs * Quick fix to tf_vision_encoder_decoder tests * make fixup * Update Blenderbot tests * Remove unused arg in modeling_tf_opt * load_tf_sharded_weights had strict=True! This meant transfer learning was impossible, so I'm setting it to False. * Support prefixes when loading sharded TF checkpoints * make fixup * Add test to load sharded models with a weight prefix * Fix sharded weight loading test * Add a test for transfer from a sharded checkpoint * make fixup * Add test to check that crossloading from PT with a prefix works * Refactor from_pretrained in the encoderdecoder classes * Refactor from_pretrained in the encoderdecoder classes * missmatched -> mismatched * Explicitly check for None * No comments showing my very impressive and attractive knowledge of Py3.9+ * Disable TF32 across all TF tests	2023-02-28 18:41:34 +00:00
Yih-Dar	871c31a6f1	🔥Rework pipeline testing by removing `PipelineTestCaseMeta` 🚀 (#21516 ) * Add PipelineTesterMixin * remove class PipelineTestCaseMeta * move validate_test_components * Add for ViT * Add to SPECIAL_MODULE_TO_TEST_MAP * style and quality * Add feature-extraction * update * raise instead of skip * add tiny_model_summary.json * more explicit * skip tasks not in mapping * add availability check * Add Copyright * A way to diable irrelevant tests * update with main * remove disable_irrelevant_tests * skip tests * better skip message * better skip message * Add all pipeline task tests * revert * Import PipelineTesterMixin * subclass test classes with PipelineTesterMixin * Add pipieline_model_mapping * Fix import after adding pipieline_model_mapping * Fix style and quality after adding pipieline_model_mapping * Fix one more import after adding pipieline_model_mapping * Fix style and quality after adding pipieline_model_mapping * Fix test issues * Fix import requirements * Fix mapping for MobileViTModelTest * Update * Better skip message * pipieline_model_mapping could not be None * Remove some PipelineTesterMixin * Fix typo * revert tests_fetcher.py * update * rename * revert * Remove PipelineTestCaseMeta from ZeroShotAudioClassificationPipelineTests * style and quality * test fetcher for all pipeline/model tests --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2023-02-28 19:40:57 +01:00
Anahita Bhiwandiwalla	4cb5ffa93d	Add loss for BridgeTowerForMaskedLM and BridgeTowerForImageAndTextRetrieval (#21684 ) * Add loss for BridgeTowerForMaskedLM and BridgeTowerForImageAndTextRetrieval * minor fix return_dict * implement test for loss computation --------- Co-authored-by: Tiep Le <97980157+tileintel@users.noreply.github.com> Co-authored-by: Tiep Le <tiep.le@intel.com>	2023-02-28 12:21:48 -05:00
Younes Belkada	7f4f8b97d0	[`Blip2`] Fix Blip-2 multi gpu (#21707 ) * fix blip multi gpu * fix * final changes * adapt suggestions * fix failing slow test * forward contrib credits from testing and suggestions * reformat --------- Co-authored-by: akkikiki <akkikiki@users.noreply.github.com>	2023-02-28 17:28:58 +01:00
raghavanone	eec76042f4	Fix the issue of blip model returning loss even when the label is not provided. (#21811 ) * Fix the issue of blip model returning loss even when the label is not provoided * Fix ruff failure * Incorporate PR feedbacks * Incorporate PR feedbacks * Incorporate PR feedbacks * Incorporate PR feedbacks	2023-02-28 09:54:08 -05:00
Younes Belkada	b8de7e448e	[`Blip2`] Add `Blip2Model` (#21817 ) * add v1 * add `Blip2Model` - add relevant functions - add tests - add on automapping * fix docs * fix doctest	2023-02-28 15:42:55 +01:00
Younes Belkada	ae9230af40	[`T5`] Fix torchquant issue (#21843 ) * fix torchquant issue * add tests	2023-02-28 15:09:44 +01:00
Yih-Dar	a9dd124346	Rename `MobileViTModelTest` to `TFMobileViTModelTest` (#21825 ) Let's give TF a bit more love ❤️ 🙏 Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2023-02-28 08:10:29 +01:00
Sanchit Gandhi	3dae0d7b4f	[SpeechT5] Fix HiFiGAN tests (#21788 )	2023-02-24 16:55:38 +01:00
Kashif Rasul	ba0e370dc1	[time series] updated expected values for integration test. (#21762 ) * updated expected * prediction_length fix * prediction_length default value * default prediction_length 24 * revert back prediction_length default * move prediction_length test	2023-02-24 12:36:54 +01:00
Arthur	087436c98e	Fix-ci-whisper (#21767 ) * fix history * input_features instead of input ids for TFWhisport doctest * use translate intead of transcribe	2023-02-24 11:39:25 +01:00
bofeng huang	c8545d2a9c	[Whisper] Add SpecAugment (#21298 ) * Return and rescale attention_mask * Add SpecAugment to Whisper modeling * Fix test * Update docstring * Add SpecAug related parameters to model config * Add the _mask_input_features function to doc * Fix quality * Apply suggestions from code review Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> * Remove dev comments * Add test * Resolve conflict * feat: mask {feature, time} prob fast tests * Apply suggestions from code review Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> --------- Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> Co-authored-by: sanchit-gandhi <sanchit@huggingface.co> Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>	2023-02-24 11:07:52 +01:00
Joao Gante	1d4b797852	Generate: Fix GIT batched captioning (#21738 )	2023-02-23 09:50:37 +00:00
Sanchit Gandhi	82e61f3445	[SpeechT5HifiGan] Handle batched inputs (#21702 ) * [SpeechT5HifiGan] Handle batched inputs * fix docstring * rebase and new ruff style	2023-02-22 11:16:56 +01:00
Yih-Dar	09127c5713	Fix `GPTSanJapaneseModel` (#21731 ) * fix * skip test_model_parallelism * skip test_model_parallelism --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2023-02-22 11:09:04 +01:00
Aaron Gokaslan	5e8c8eb5ba	Apply ruff flake8-comprehensions (#21694 )	2023-02-22 09:14:54 +01:00
Kashif Rasul	df06fb1f0b	Time series transformer: input projection and Std scaler (#21020 ) * added loc and scale outputs from scalers * fix typo * fix tests * fixed formatting * initial StdScaler * move scaling to optional str * calculate std feature for scalers * undid change as it does not help * added StdScaler with weights * added input projection layer and d_model hyperparam * use linear proj * add back layernorm_embedding * add sin-cos pos embeddings * updated scalers * formatting * fix type * fixed test * fix repeated_past_values cal. * fix when keepdim=false * fix default_scale * backward compatibility of scaling config * update integration test expected output * fix style * fix docs * use the actual num_static_real_features in feature_dim cal * clarified docs * Update src/transformers/models/time_series_transformer/modeling_time_series_transformer.py Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Update src/transformers/models/time_series_transformer/modeling_time_series_transformer.py Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Update src/transformers/models/time_series_transformer/modeling_time_series_transformer.py Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * prediction_length is not optional * fix for reviewer * Update src/transformers/models/time_series_transformer/configuration_time_series_transformer.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * get rid of un-needed new lines * fix doc * remove unneeded new lines * fix style * static_categorical_features and static_real_features are optional * fix integration test * Update src/transformers/models/time_series_transformer/modeling_time_series_transformer.py Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * fixing docs for multivariate setting * documentation for generate --------- Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>	2023-02-22 07:50:13 +01:00
Yih-Dar	03aaac3502	Fix TVLT (torch device issue) (#21710 ) * fix tvlt ci * fix tvlt ci * fix tvlt ci --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2023-02-21 11:37:49 +01:00
Jonatan Kłosko	deafc24388	Add WhisperTokenizerFast (#21222 ) * Add WhisperTokenizerFast * Fixup * Up * Up * Improve tests * Update src/transformers/models/whisper/tokenization_whisper_fast.py Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> * Keep stride in whisper pipelien test * Remove unknown token special case * Reduce vocabulary size in tests * Fix vocab size assertion * Sync copied changes from WhisperTokenizer * Skip pipeline tests * Update assertion * Remove Whisper tokenizer dependency on sentencepiece * Format --------- Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>	2023-02-21 06:58:54 +01:00
Alara Dirik	49ab16239c	Add EfficientNet (#21563 ) * Add EfficientNet to transformers	2023-02-20 16:37:11 +03:00
tanreinama	f56174ac5b	add GPTSAN model (reopen) (#21291 ) * add GPTSAN-Japanese * add GPTSAN * add GPTSAN * add GPTSAN * add GPTSAN * add GPTSAN * add GPTSAN * add GPTSAN * add GPTSAN * add GPTSAN * add GPTSAN * add GPTSAN * add GPTSAN * add GPTSAN * add GPTSAN * add GPTSAN * add GPTSAN * add GPTSAN * add GPTSAN * add GPTSAN (update for review) * add GPTSAN * add GPTSAN * add GPTSAN * add GPTSAN * add GPTSAN * add GPTSAN * add GPTSAN * add GPTSAN * add GPTSAN * add GPTSAN * add GPTSAN * add GPTSAN * add GPTSAN * add GPTSAN * add GPTSAN * add GPTSAN * add GPTSAN * add GPTSAN * add GPTSAN * add GPTSAN * fix typo in comment text * add GPTSAN * add GPTSAN * add GPTSAN * add GPTSAN * fix document and comments * fix class name GPTSAN->GPTSan * fix import and test for tokenizer	2023-02-20 11:25:27 +01:00
Sylvain Gugger	c87bbe1ff0	Fix quality	2023-02-20 03:27:09 -05:00
Andy Ehrenberg	2840272c5f	add flax whisper implementation (#20479 ) * add flax whisper implementation * rever change to setup * remove unused imports * revert generation changes * flax whisper docs * docs * import order * import sorting * isort * add dummy objects * doc formatting * formatting * remove trailing whitespaces * fix flax whisper docs * add generation logic to unlock flax whisper * remove scans * give credits to Flax Bart implementation * remove unused imports * add license * remove assert * more credits to Bart * fix style * formatting * support left padding * add flax whisper generation test * remove copied from comments whenever not a full copy * fix docstrings for logits processors * revert change to FlaxForceTokensLogitsProcessor * revert doc changes * improve generation docs * reorganize * formatting * cleanup docs * add tests * handle empty list case * fix forced decoder ids in flax tests * add flax whisper to inits * upate dummy objects * docs for FlaxAutoModelForSpeechSeq2Seq * fix decoder_position_ids computation in pretrained model decode/__call__ fns * add Copied from statements as necessary * compute position_ids only in __call__ and decode methods of pretrained model subclasses * improve readabilityof compute positional embeddings * check dimensionality of input_features instead of hidden_states * copied from statement for init_cache * formatting * fix copies * fix copies * pass attention mask to encoder layers * fix decoder module outputs * set dtype Co-authored-by: Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com> * smaller flax model for whisper test * Update src/transformers/generation/flax_utils.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update src/transformers/models/whisper/modeling_flax_whisper.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update tests/models/whisper/test_modeling_flax_whisper.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * cleanup Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update src/transformers/models/whisper/modeling_flax_whisper.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * bias cleanup * doc fix * align style for force tokens processor * readability * fix input shape in tests * revert FlaxGenerationMixin docstring * formatting * fix tests * fix imports * consistent encoder hidden states * consistent hidden states * input shapes * typo * partial class trick * partial class for input shape * base_class with correct input shape * partial base classes * match by name * set main_input_name * compare on names * formatting * remove unused import * safer position ids computation * safer position id computation * Update src/transformers/models/whisper/modeling_flax_whisper.py Co-authored-by: Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com> * Update src/transformers/models/whisper/modeling_flax_whisper.py Co-authored-by: Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com> * remove identical inherited tests * fix prompt ids in tests * use generation config * use jnp array * better var names * more explicit bias use * import transformers * formatting * test formatting * remove unused imports * remove unused imports * formatting * isort * docs * fix ln orders for encoder hidden states * whisper unique generation stuff * flake * use finfo for attention bias * docs * Update src/transformers/generation/flax_utils.py Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> * docs * add timestamp flax test * jit for timestamps * formatting * clean up timestamps processor * formatting * remove if_true * cleanup --------- Co-authored-by: Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com> Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>	2023-02-20 09:17:40 +01:00

1 2 3 4 5 ...

481 Commits