transformers

mirror of https://github.com/huggingface/transformers.git synced 2025-07-31 02:02:21 +06:00

Author	SHA1	Message	Date
Joao Gante	a81fe4e1df	Generate: input expansion for any model input (#21624 )	2023-02-14 14:16:22 +00:00
Joao Gante	13e03e619d	Generate: filter encoder inputs when its signature does not accept wildcards (#21603 )	2023-02-14 10:46:46 +00:00
Joao Gante	56b03c96b8	Fix TF CTC tests (#21606 )	2023-02-13 21:23:00 +00:00
Yih-Dar	cbecf121cd	Fix env. variable type issue in testing (#21609 ) * fix env issue * fix env issue --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2023-02-13 20:53:26 +01:00
Joao Gante	fa4bdb0a40	Generate: correct default model input creation for decoder-only models (#21580 )	2023-02-13 17:04:49 +00:00
Yih-Dar	edc1e734bf	Fix Blip-2 CI (#21595 ) * use fp16 * use fp16 * use fp16 * use fp16 --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2023-02-13 16:44:27 +01:00
Younes Belkada	1666c42f0b	[`bnb`] Let's make the daily CI green 🍏 (#21597 ) * fix bnb slow test * make fixup	2023-02-13 16:18:50 +01:00
Joao Gante	24273268b7	Generate: Fix flaky indexing error in `test_constrained_beam_search_generate_dict_output` (#21561 )	2023-02-13 15:12:07 +00:00
Joao Gante	4be75e9728	CI: skip failing TF hubert test (#21601 ) skip test	2023-02-13 09:34:23 -05:00
Joao Gante	eb6c59bc78	Generate: TF supports multiple eos tokens (#21571 )	2023-02-13 12:24:22 +00:00
amyeroberts	cb56590111	Replace input_values_processing with unpack_inputs (#21502 ) * Replace input_values_processing with unpack_inputs * Skip test failing with OOM * Update tests	2023-02-10 18:19:39 +00:00
Stas Bekman	2f5507580b	[from_pretrained] extend `torch_dtype="auto"` to look up `config.torch_dtype` first, expand docs (#21524 ) * [from_pretrained] expand on torch_dtype entry * fold 4 into 1 * style * support torch_dtype='config' plus tests * style * oops * fold config into auto, fix bug * fix check * better log * better log * clean up	2023-02-10 09:09:21 -08:00
Shubhamai	9e40bba6ba	[Tests] Improve flax test_attention_outputs (#21486 ) improving flax tests	2023-02-10 11:31:49 -05:00
Patrick von Platen	b20147a3c8	[Variant] Make sure variant files are not incorrectly deleted (#21562 ) * [Variant] Make sure variant files are not incorrectly deleted * Apply suggestions from code review * fix	2023-02-10 15:44:51 +01:00
Jannis Vamvas	b0d539ccad	Add X-MOD (#20939 ) * Add X-MOD to Readme * Add documentation for X-MOD * Implement X-MOD * Fix formatting of X-MOD docs * Change signature of X-MOD forward methods to use lang_ids * Minor changes * Rebase with main and run make fix-copies * Make suggested changes to docstrings * Improve code readability Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com> * Fix code style * Conversion script: Remove asserts and type annotations * Remove _TOKENIZER_FOR_DOC * XMOD -> Xmod * Update copyright note * Fix doctests * Fix docstring * Add integration test for FillMaskPipeline * Revert "Add integration test for FillMaskPipeline" This reverts commit 4381eb3b1d0f5d85785f89caba83928e6efa6d1f. * Add end-to-end integration test for mask fill * make style * Rebase with main and make fix-copies --------- Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>	2023-02-10 15:32:06 +01:00
Quentin Meeus	5b72b3412b	Remove CLI spams with Whisper FeatureExtractor (#21267 ) * Remove CLI spams with Whisper FeatureExtractor Whisper feature extractor representation includes the MEL filters, a list of list that is represented as ~16,000 lines. This needlessly spams the command line. I added a `__repr__` method that replaces this list with a string "<array of shape (80, 201)>" * Remove mel_filters from to_dict output Credits to @ArthurZucker * remove unused import * update feature extraction tests for the changes in to_dict	2023-02-10 09:15:16 -05:00
Katie Le	21a2d900ec	Added with torch.no_grad() to Camembert integration test (#21544 ) add with torch.no_grad() to Camembert integration test Co-authored-by: Bibi <Bibi@katies-mac.local>	2023-02-10 10:58:29 +01:00
Younes Belkada	f83942684d	[`pipeline`] A simple fix for half-precision & 8bit models (#21479 ) * v1 fix * adapt from suggestions * make style * fix tests * add gpu tests * update docs * fix other tests * Apply suggestions from code review Co-authored-by: Nicolas Patry <patry.nicolas@protonmail.com> * better fix * make fixup * better example * revert changes * proposal * more elegant solution * Update src/transformers/pipelines/automatic_speech_recognition.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> --------- Co-authored-by: Nicolas Patry <patry.nicolas@protonmail.com> Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>	2023-02-10 10:26:17 +01:00
Sylvain Gugger	97d3390fc8	Skip failing test for now	2023-02-09 20:11:26 -05:00
Katie Le	23c146c38b	Added with torch.no_grad() to XLM-Roberta integration test (#21547 ) * added with torch.no_grad() to the integration tests and applied make style * added with torch.no_grad() to xlm roberta forward pass --------- Co-authored-by: Bibi <Bibi@katies-mac.local>	2023-02-09 21:49:54 +01:00
Sylvain Gugger	04b2f13c37	🚨🚨🚨 Enforce single model initialization (#21431 ) * Enforce single model initialization * Add OneFormer example for problem 3 * Do it the Stas way * Actually rename the uses... * Rewrite test * Try to change the test this way * Fix all init slow/fast tests * Break connection * Fix more tests * Fix test for initialization * Remove custom test * Quality * Fix last failing tests * The end?	2023-02-09 15:46:26 -05:00
Sylvain Gugger	2020ac4bd6	Fix from_pretrained API with config and state_dict (#21542 )	2023-02-09 15:44:02 -05:00
NielsRogge	d7f1e7c009	Add BLIP-2 (#21441 ) * First draft * More improvements * More improvements * Improve conversion script * Convert all weights * Make forward pass work * Make logits match * More improvements * More improvements * More improvements * Use get_input_embeddings * Improve some more * Improve model tests * Improve model tests * More improvements * Fix processor * Update files * Update prepare_inputs_for_generation * More improvements * Fix copies * More fixes * Make fixup * More improvements * Add support for seq2seq language model * More improvements * Fix test * More improvements * Improve conversion script * Remove some todo's * Fix README's * Improve conversion script * Fix generation * Fix style and remove Blip2Model * Fix model outputs * More improvements * Set eos_token_id in config * Fix quality * Small improvements * Add processor tests * More improvements * Apply suggestions * Apply suggestions * Add integration test * Update image URL * Add integration test * Fix model_type * Update style * Improve docs * Add doc tests * Fix copies * Remove tests which are passing * Improve some more * Add tests for seq2seq language models * Minor fix * Convert more checkpoints * finalize CI * Fix blip and blip2 processors * add `accelerate` support for `blip2` * clean up * make style * Update conversion script * Update conversion script some more * Update organization * revert toc file * add blip-2 to toc file * Some more improvements * Fix docstring * Improve docs --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com> Co-authored-by: younesbelkada <younesbelkada@gmail.com>	2023-02-09 16:52:11 +01:00
Joao Gante	0d33381fad	Tag tests as slow ⌛ (#21537 ) begone slow tests	2023-02-09 14:46:15 +00:00
Joao Gante	2edf9a857b	Generate: TF `.generate()` can now be exported with dynamic length (#21474 )	2023-02-09 12:52:30 +00:00
Joao Gante	e69f9715eb	Generate: make TF `.generate()` signature == PT `.generate()` signature (#21525 )	2023-02-09 11:10:13 +00:00
Motoki Wu	9960506cbe	Fix multiple `eos_token_id`s in model.generate(...) (#21461 ) * add tests with multiple eos_token_ids * make math.prod instead of sum * make fixup * fix long and also use np.prod since math.prod does not exist <python 3.8 * make fixup * add prod util * use prod util instead of np.prod * make fixup * previous .long location * use tensor ops * remove prod * remove prod * update device * make fixup * fix none	2023-02-08 13:48:46 -05:00
Stas Bekman	8ea994d3c5	[tests] add missing `report_to none` (#21505 ) [tests] report_to none	2023-02-08 09:32:40 -08:00
Joao Gante	1d9c26a4b8	Generate: TF `compute_transition_scores` (#21341 )	2023-02-08 16:36:43 +00:00
Guillaume Klein	ca905ba28e	Exclude the madeup words from M2M100Tokenizer.vocab_size (#20976 )	2023-02-08 09:19:06 -05:00
Katie Le	cc1d0685b3	Wrap RemBert integration test forward passes with torch.no_grad() (#21503 ) added with torch.no_grad() to the integration tests and applied make style Co-authored-by: Bibi <Bibi@katies-mac.local>	2023-02-08 14:00:52 +01:00
Adrian Sager La Ganga	a3034c7004	Add inverse sqrt learning rate scheduler (#21495 ) * added inverse sqrt lr scheduler * Updated get_scheduler in src/transformers/optimization.py * Updated src/transformers/__init__.py * Added inverse sqrt lr scheduler test * Updated docs/source/en/main_classes/optimizer_schedules.mdx * Ran style and quality scripts * Fix get_inverse_sqrt_schedule docstring * Comment implementation URL	2023-02-07 15:00:50 -05:00
Stas Bekman	b9af152efb	[tokenizer] sanitize saved config (#21483 ) * [tokenizer] sanitize saved config * rm config["name_or_path"] test	2023-02-07 10:51:45 -08:00
Sylvain Gugger	67d074874d	Cleanup quality (#21493 ) * Remove mentions of flake8/isort * Clean up inits * Deal with all other inits * Last special rule for dummy files	2023-02-07 12:27:31 -05:00
Arthur	9e7f84a556	[OPT] Adds `GPT2TokenizerFast` to the list of tokenizer to use for OPT. (#20823 ) * Add ("opt", ("GPT2Tokenizer", "GPT2TokenizerFast" if is_tokenizers_available() else None)), * skip failing test * Add ("opt", ("GPT2Tokenizer", "GPT2TokenizerFast" if is_tokenizers_available() else None)), * skip failing test	2023-02-07 17:35:28 +01:00
Joao Gante	1e4cf8bb44	Generate: TF can now generate from embeddings in encoder-decoder models (#21475 )	2023-02-07 11:18:23 +00:00
Arthur	12eb528b5a	[CI ] Remove `past` in favor of `past_key_values` (#21443 ) * fix past renamed to past_key_value * update more `past` that were skipped * fixup * remove changes made to rag * refactor `_reorder_cache` to use `past_key_values` * fix git `prepare_inputs_for_generation` to pass tests when false is needed in use_cache	2023-02-07 09:51:35 +01:00
Sylvain Gugger	cc8407522a	Fix epoch number when resuming training (#21478 )	2023-02-06 19:34:34 -05:00
Sylvain Gugger	6f79d26442	Update quality tooling for formatting (#21480 ) * Result of black 23.1 * Update target to Python 3.7 * Switch flake8 to ruff * Configure isort * Configure isort * Apply isort with line limit * Put the right black version * adapt black in check copies * Fix copies	2023-02-06 18:10:56 -05:00
Joao Gante	4943331015	Generate: TF can now accept custom logits processors (#21454 )	2023-02-06 15:44:47 +00:00
Yih-Dar	0db5d911fc	Fix `SpeechT5ForSpeechToSpeechIntegrationTests` device issue (#21460 ) * fix --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2023-02-06 10:43:07 +01:00
Yih-Dar	59d5edef34	Avoid flaky generation sampling tests (#21445 ) * fix * fix --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2023-02-03 22:01:25 +01:00
Matthijs Hollemans	e4bacf6614	[WIP] add SpeechT5 model (#18922 ) * make SpeechT5 model by copying Wav2Vec2 * add paper to docs * whoops added docs in wrong file * remove SpeechT5Tokenizer + put CTC back in the name * remove deprecated class * remove unused docstring * delete SpeechT5FeatureExtractor, use Wav2Vec2FeatureExtractor instead * remove classes we don't need right now * initial stab at speech encoder prenet * add more speech encoder prenet stuff * improve SpeechEncoderPrenet * add encoder (not finished yet) * add relative position bias to self-attention * add encoder CTC layers * fix formatting * add decoder from BART, doesn't work yet * make it work with generate loop * wrap the encoder into a speech encoder class * wrap the decoder in a text decoder class * changed my mind * changed my mind again ;-) * load decoder weights, make it work * add weights for text decoder postnet * add SpeechT5ForCTC model that uses only the encoder * clean up EncoderLayer and DecoderLayer * implement _init_weights in SpeechT5PreTrainedModel * cleanup config + Encoder and Decoder * add head + cross attention masks * improve doc comments * fixup * more cleanup * more fixup * TextDecoderPrenet works now, thanks Kendall * add CTC loss * add placeholders for other pre/postnets * add type annotation * fix freeze_feature_encoder * set padding tokens to 0 in decoder attention mask * encoder attention mask downsampling * remove features_pen calculation * disable the padding tokens thing again * fixup * more fixup * code review fixes * rename encoder/decoder wrapper classes * allow checkpoints to be loaded into SpeechT5Model * put encoder into wrapper for CTC model * clean up conversion script * add encoder for TTS model * add speech decoder prenet * add speech decoder post-net * attempt to reconstruct the generation loop * add speech generation loop * clean up generate_speech * small tweaks * fix forward pass * enable always dropout on speech decoder prenet * sort declaration * 
rename models * fixup * fix copies * more fixup * make consistency checker happy * add Seq2SeqSpectrogramOutput class * doc comments * quick note about loss and labels * add HiFi-GAN implementation (from Speech2Speech PR) * rename file * add vocoder to TTS model * improve vocoder * working on tokenizer * more better tokenizer * add CTC tokenizer * fix decode and batch_code in CTC tokenizer * fix processor * two processors and feature extractors * use SpeechT5WaveformFeatureExtractor instead of Wav2Vec2 * cleanup * more cleanup * even more fixup * notebooks * fix log-mel spectrograms * support reduction factor * fixup * shift spectrograms to right to create decoder inputs * return correct labels * add labels for stop token prediction * fix doc comments * fixup * remove SpeechT5ForPreTraining * more fixup * update copyright headers * add usage examples * add SpeechT5ProcessorForCTC * fixup * push unofficial checkpoints to hub * initial version of tokenizer unit tests * add slow test * fix failing tests * tests for CTC tokenizer * finish CTC tokenizer tests * processor tests * initial test for feature extractors * tests for spectrogram feature extractor * fixup * more fixup * add decorators * require speech for tests * modeling tests * more tests for ASR model * fix imports * add fake tests for the other models * fixup * remove jupyter notebooks * add missing SpeechT5Model tests * add missing tests for SpeechT5ForCTC * add missing tests for SpeechT5ForTextToSpeech * sort tests by name * fix Hi-Fi GAN tests * fixup * add speech-to-speech model * refactor duplicate speech generation code * add processor for SpeechToSpeech model * add usage example * add tests for speech-to-speech model * fixup * enable gradient checkpointing for SpeechT5FeatureEncoder * code review * push_to_hub now takes repo_id * improve doc comments for HiFi-GAN config * add missing test * add integration tests * make number of layers in speech decoder prenet configurable * rename variable * rename 
variables * add auto classes for TTS and S2S * REMOVE CTC!!! * S2S processor does not support save/load_pretrained * fixup * these models are now in an auto mapping * fix doc links * rename HiFiGAN to HifiGan, remove separate config file * REMOVE auto classes * there can be only one * fixup * replace assert * reformat * feature extractor can process input and target at same time * update checkpoint names * fix commit hash	2023-02-03 12:43:46 -05:00
Yih-Dar	197e7ce911	Fix device issue in a `ConvBertModelTest` test (#21438 ) fix Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2023-02-03 15:12:28 +01:00
Joao Gante	f21af26279	🚨🚨 Generate: standardize beam search behavior across frameworks (#21368 )	2023-02-03 10:24:02 +00:00
Yih-Dar	a6d8a149a8	Fix some pipeline tests (#21401 ) * fix Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2023-02-02 19:03:31 +01:00
Younes Belkada	8298e4ec02	[`bnb`] Fine-tuning HF 8-bit models (#21290 ) * force `memory_efficient_backward=True` * enhancements - trainer support - add new flag * some changes - internal changes in `Trainer` - small refactor * make quality * Fixes - add new testing util - add new test - change test in Trainer * fix CI test * educate users on how to ft 8bit models * more checks * fix `logger` error * Apply suggestions from code review Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * adapt from review * fix * add comment * use return instead --------- Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>	2023-02-02 16:39:23 +01:00
Clémentine Fourrier	67a3920d85	Fix Graphormer test suite (#21419 ) * [FIX] path for Graphormer checkpoint * [FIX] Test suite for graphormer * [FIX] Update graphormer default num_classes	2023-02-02 16:29:13 +01:00
Joel Lamy-Poirier	e006ab51ac	Add the GeLU activation from pytorch with the tanh approximation (#21345 ) * gelu_python_tanh * rename * Version check, add test * Pr comment	2023-02-02 09:33:04 -05:00
Joao Gante	92ce53aab8	Generate: decoder-only models can generate with `inputs_embeds` (#21405 )	2023-02-01 21:50:38 +00:00

2488 Commits