transformers

mirror of https://github.com/huggingface/transformers.git synced 2025-07-16 02:58:23 +06:00

Author	SHA1	Message	Date
Younes Belkada	4b79697865	🚨🚨🚨 [`Refactor`] Move third-party related utility files into `integrations/` folder 🚨🚨🚨 (#25599 ) * move deepspeed to `lib_integrations.deepspeed` * more refactor * oops * fix slow tests * Fix docs * fix docs * addess feedback * address feedback * final modifs for PEFT * fixup * ok now * trigger CI * trigger CI again * Update docs/source/en/main_classes/deepspeed.md Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * import from `integrations` * address feedback * revert removal of `deepspeed` module * revert removal of `deepspeed` module * fix conflicts * ooops * oops * add deprecation warning * place it on the top * put `FutureWarning` * fix conflicts with not_doctested.txt * add back `bitsandbytes` module with a depr warning * fix * fix * fixup * oops * fix doctests --------- Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>	2023-08-25 17:13:34 +02:00
Pedro Cuenca	cb8e3ee25f	Add FlaxCLIPTextModelWithProjection (#25254 ) * Add FlaxClipTextModelWithProjection This is necessary to support the Flax port of Stable Diffusion XL: `fb6d705fb5/text_encoder_2/config.json (L3)` Co-authored-by: Martin Müller <martin.muller.me@gmail.com> Co-authored-by: Juan Acevedo <juancevedo@gmail.com> * Use FlaxCLIPTextModelOutput * make fix-copies again * Apply suggestions from code review Co-authored-by: Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com> * Use `return_dict` for consistency with other uses. Co-authored-by: Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com> * Fix docstring example. * Add new model to FlaxCLIPTextModelTest * Add to IGNORE_NON_AUTO_CONFIGURED list * Fix naming convention. --------- Co-authored-by: Martin Müller <martin.muller.me@gmail.com> Co-authored-by: Juan Acevedo <juancevedo@gmail.com> Co-authored-by: Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com>	2023-08-25 10:58:14 +02:00
Younes Belkada	ae320fa53f	[`PEFT`] Fix PeftConfig save pretrained when calling `add_adapter` (#25738 ) fix save_pretrained issue + add test	2023-08-25 08:19:11 +02:00
Sanchit Gandhi	0218876822	[ASR Pipe Test] Fix CTC timestamps error message (#25727 )	2023-08-24 17:58:37 +01:00
Stas Bekman	7a6efe1e9f	[idefics] idefics-9b test use 4bit quant (#25734 )	2023-08-24 08:33:14 -07:00
Younes Belkada	584eeb5387	[`AutoGPTQ`] Add correct installation of GPTQ library + fix slow tests (#25713 ) * add correct installation of GPTQ library * update tests values	2023-08-24 14:57:16 +02:00
Yih-Dar	8fff61b9db	Fix failing `test_batch_generation` for bloom (#25718 ) fix Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2023-08-24 11:15:29 +02:00
Sylvain Gugger	68fa9a5937	Skip broken tests	2023-08-24 01:48:53 -04:00
Joao Gante	3c2383b1c6	Generate: general test for decoder-only generation from `inputs_embeds` (#25687 ) Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>	2023-08-23 19:17:01 +01:00
Arthur	51794bf21e	[`SPM`] Patch `spm` Llama and T5 (#25656 ) * hot fix * only encode with string prefix if starts with prefix * styling * add a new test * fixup	2023-08-23 07:16:43 +02:00
Arthur	e20fab0bbe	Fix bloom add prefix space (#25652 ) * properly support Sequence of pretokenizers * actual fix * make sure the fix works. Tests are not working for sure! * hacky way * add TODO * update * add a todo * nits * rename test * nits * rename test	2023-08-22 14:50:12 +02:00
Tanay Mehta	182b83749a	Add Number Normalisation for SpeechT5 (#25447 ) * add: NumberNormalizer works for integers, floats, common currencies, negative numbers and percentages * fix: renamed number normalizer class and added normalization to SpeechT5Processor * fix: restyled with black and ruff, should pass code quality tests * fix: moved normalization to tokenizer and other small changes to normalizer * add: test for normalization and changed the existing full tokenizer test * fix: tokenization tests now pass, made changes to existing tokenization where normalization is covered; added normalize arg to func signature * fix: changed default normalize setting to False, modified the tests a bit * fix: added support for comma separated numbers, tokenization on the fly with kwargs and normalizer getter setter funcs	2023-08-22 08:12:57 +02:00
Susnato Dhar	450a181d8b	Add Pop2Piano (#21785 ) * init commit * config updated also some modeling * Processor and Model config combined * extraction pipeline(upto before spectogram & mel_conditioner) added but not properly tested * model loading successful! * feature extractor done! * FE can now be called from HF * postprocessing added in fe file * same as prev commit * Pop2PianoConfig doc done * cfg docs slightly changed * fe docs done * batched * batched working! * temp * v1 * checking * trying to go with generate * with generate and model tests passed * before rebasing * . * tests done docs done remaining others & nits * nits * LogMelSpectogram shifted to FeatureExtractor * is_tf rmeoved from pop2piano/init * import solved * tokenization tests added * minor fixed regarding modeling_pop2piano * tokenizer changed to only return midi_object and other changes * Updated paper abstract(Camera-ready version) (#2) * more comments and nits * ruff changes * code quality fix * sg comments * t5 change added and rebased * comments except batching * batching done * comments * small doc fix * example removed from modeling * ckpt * forward it compatible with fe and generation done * comments * comments * code-quality fix(maybe) * ckpts changed * doc file changed from mdx to md * test fixes * tokenizer test fix * changes * nits done main changes remaining * code modified * Pop2PianoProcessor added with tests * other comments * added Pop2PianoProcessor to dummy_objects * added require_onnx to modeling file * changes * update .md file * remove extra line in index.md * back to the main index * added pop2piano to index * Added tokenizer.__call__ with valid args and batch_decode and aligned the processor part too * changes * added return types to 2 tokenizer methods * the PR build test might work now * added backends * PR build fix * vocab added * comments * refactored vocab into 1 file * added conversion script * comments * essentia version changed in .md * comments * more tokenizer tests added * minor fix * tests extended for outputs acc check * small fix --------- Co-authored-by: Jongho Choi <sweetcocoa@snu.ac.kr>	2023-08-21 16:35:00 +01:00
Francisco Kurucz	2f8acfea1c	Fix test_modeling_mpt typo in model id (#25606 ) Fix model id in get_large_model_config on file test_modeling_mpt	2023-08-21 11:11:21 +02:00
ydshieh	1982dd3b15	Hotfix	2023-08-19 11:15:38 +02:00
Stas Bekman	6c811a322f	new model: IDEFICS via HuggingFaceM4 (#24796 ) * rename * restore * mappings * unedited tests+docs * docs * fixes * fix auto-sync breakage * cleanup * wip * wip * add fetch_images * remove einops dependency * update * fix * fix * fix * fix * fix * re-add * add batching * rework * fix * improve * add Leo as I am extending his work * cleanup * fix * cleanup * slow-test * fix * fix * fixes * deal with warning * rename modified llama classes * rework fetch_images * alternative implementation * cleanup * strict version * cleanup * [`IDEFICS`] Fix idefics ci (#25056) * Fix IDEFICS CI * fix test file * fixup * some changes to make tests pass * fix * fixup * Update src/transformers/models/idefics/configuration_idefics.py Co-authored-by: Stas Bekman <stas00@users.noreply.github.com> --------- Co-authored-by: Stas Bekman <stas00@users.noreply.github.com> * remove compat checks * style * explain that Idefics is not for training from scratch * require pt>=2.0 * fix idefics vision config (#25092) * fix idefics vision config * fixup * clean * Update src/transformers/models/idefics/configuration_idefics.py --------- Co-authored-by: Stas Bekman <stas00@users.noreply.github.com> * cleanup * style * cleanup * Apply suggestions from code review Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * upcase * sequence of images * handle the case with no images * Update src/transformers/image_processing_utils.py Co-authored-by: Victor SANH <victorsanh@gmail.com> * support pure lm take 2 * support tokenizer options * parameterize num_channels * fix upcase * s\|IdeficsForCausalLM\|IdeficsForVisionText2Text\|g * manual to one line * addressing review * unbreak * remove clip dependency * fix test * consistency * PIL import * Idefics prefix * Idefics prefix * hack to make tests work * style * fix * fix * revert * try/finally * cleanup * clean up * move * [`IDEFICS`] Fix idefics config refactor (#25149) * refactor config * nuke init weights * more refactor * oops * remove visual question answering pipeline support * Update src/transformers/models/idefics/clip.py Co-authored-by: Stas Bekman <stas00@users.noreply.github.com> * Update src/transformers/models/idefics/modeling_idefics.py * cleanup * mv clip.py vision.py * tidyup --------- Co-authored-by: Stas Bekman <stas00@users.noreply.github.com> Co-authored-by: Stas Bekman <stas@stason.org> * fix * license * condition on pt * fix * style * fix * rm torchvision dependency, allow custom transforms * address review * rework device arg * add_eos_token * s/transforms/transform/ * fix top level imports * fix return value * cleanup * cleanup * fix * style * license * license * Update src/transformers/models/idefics/image_processing_idefics.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * add a wrapper to freeze vision layears * tidyup * use the correct std/mean settings * parameterize values from config * add tests/models/idefics/test_image_processing_idefics.py * add test_processor_idefics.py * cleanup * cleanups * fix * fix * move to the right group * style * Apply suggestions from code review Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * add perceiver config * reset * missing arg docs * Apply suggestions from code review Co-authored-by: Leo Tronchon <leo.tronchon@gmail.com> * address review comments * inject automatic end of utterance tokens (#25218) * inject automatic end of utterance tokens * fix * fix * fix * rework to not use the config * not end_of_utterance_token at the end * Update src/transformers/models/idefics/processing_idefics.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * address review * Apply suggestions from code review Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com> * Update src/transformers/image_processing_utils.py Co-authored-by: Nicolas Patry <patry.nicolas@protonmail.com> * [`Idefics`] add image_embeddings option in generate-related methods (#25442) * add image_embeddings option in generate-related methods * style * rename image_embeddings and allow perceiver embeddings precomputation * compute embeddings within generate * make is_encoder_decoder= True the default in config * nested if else fix * better triple check * switch if elif order for pixel values / img embeds * update model_kwargs perceiver only at the end * use _prepare_model_inputs instead of encoder_decoder logic * fix comment typo * fix config default for is_encoder_decoder * style * add typehints * precompute in forward * doc builder * style * pop instead of get image hidden states * Trigger CI * Update src/transformers/models/idefics/modeling_idefics.py Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> * Update src/transformers/models/idefics/modeling_idefics.py Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> * fix * + indentation + style * simplify a bit the use_resampler logic using comments * update diocstrings * Trigger CI --------- Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> * fix rebase changes * unbreak #25237 - to be fixed in follow up PRs * is_composition = False * no longer needed --------- Co-authored-by: leot13 <leo.tronchon@gmail.com> Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com> Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> Co-authored-by: Victor SANH <victorsanh@gmail.com> Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com> Co-authored-by: Nicolas Patry <patry.nicolas@protonmail.com> Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>	2023-08-18 14:12:28 -07:00
Younes Belkada	faed2ca46f	[`PEFT`] Peft integration alternative design (#25077 ) * a draft version * v2 integration * fix * make it more generic and works for IA3 * add set adapter and multiple adapters support * fixup * adapt a bit * oops * oops * oops * adapt more * fix * add more refactor * now works with model class * change it to instance method as it causes issues with `jit`. * add CR * change method name * add `add_adapter` method * clean up * Update src/transformers/adapters/peft_mixin.py Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com> * add moe utils * fixup * Update src/transformers/adapters/peft_mixin.py Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com> * adapt * oops * fixup * add is_peft_available * remove `requires_backend` * trainer compatibility * fixup + docstring * more details * trigger CI * Apply suggestions from code review Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update src/transformers/modeling_utils.py * fixup + is_main_process * added `save_peft_format` in save_pretrained * up * fix nits here and there * nits here and there. * docs * revert `encoding="utf-8"` * comment * added slow tests before the PEFT release. * fixup and nits * let's be on the safe zone * added more comments * v1 docs * add remaining docs * Apply suggestions from code review Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * move to `lib_integrations` * fixup * this time fixup * Apply suggestions from code review Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * address final comments * refactor to use `token` * add PEFT to DockerFile for slow tests. * added pipeline support. --------- Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com> Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>	2023-08-18 19:08:03 +02:00
Arthur	bc3e20dcf0	[`Llama`] remove prompt and fix prefix finetuning (#25565 ) * nit * update * make sure use_default_system_prompt is saved * update checkpointing * consistency * use_default_system_prompt for test	2023-08-18 13:39:23 +02:00
Arthur	30b3c46ff5	[`split_special_tokens`] Add support for `split_special_tokens` argument to encode (#25081 ) * draft changes * update and add tests * styling for no * move test * path to usable model * update test * small update * update bertbased tokenizers * don'tuse kwargs for _tokenize * don'tuse kwargs for _tokenize * fix copies * update * update test for special tokenizers * fixup * skip two tests * remove pdb breakpiont() * wowo * rewrite custom tests * nits * revert chang in target keys * fix markup lm * update documentation of the argument	2023-08-18 13:26:27 +02:00
Alex McKinney	9d7afd2536	Replaces calls to `.cuda` with `.to(torch_device)` in tests (#25571 ) * Replaces calls to `.cuda` with `.to(torch_device)` in tests `torch.Tensor.cuda()` is a pre-0.4 solution to changing a tensor's device. It is recommended to prefer `.to(...)` for greater flexibility and error handling. Furthermore, this makes it more consistent with other tests (that tend to use `.to(torch_device)`) and ensures the correct device backend is used (if `torch_device` is neither `cpu` or `cuda`). * addressing review comments * more formatting changes in Bloom test * `make style` * Update tests/models/bloom/test_modeling_bloom.py Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> * fixes style failures --------- Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>	2023-08-18 12:40:40 +02:00
Yih-Dar	427adc898a	Skip `test_contrastive_generate` for `TFXLNet` (#25574 ) * fix * fix --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2023-08-17 18:56:34 +02:00
Yoach Lacombe	b8f69d0d10	Add Text-To-Speech pipeline (#24952 ) * add AutoModelForTextToSpeech class * add TTS pipeline and tessting * add docstrings to text_to_speech pipeline * fix torch dependency * corrector 'processor is None' case in Pipeline * correct repo id * modify text-to-speech -> text-to-audio * remove processor * rename text_to_speech pipelines files to text_audio * add textToWaveform and textToSpectrogram instead of textToAudio classes * update TTS pipeline to the bare minimum * update tests TTS pipeline * make style and erase useless import torch in TTS pipeline tests * modify how to check if generate or forward in TTS pipeline * remove unnecessary extra new lines * Apply suggestions from code review Co-authored-by: Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com> * refactor input_texts -> text_inputs * correct docstrings of TTS.__call__ * correct the shape of generated waveform * take care of Bark tokenizer special case * correct run_pipeline_test TTS * make style * update TTS docstrings * address Sylvain nit refactors * make style * refactor into one liners * correct squeeze * correct way to test if forward or generate * Update output audio waveform shape * make style * correct import * modify how the TTS pipeline test if a model can generate * align shape output of TTS pipeline with consistent shape --------- Co-authored-by: Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com>	2023-08-17 17:34:47 +01:00
Arthur	181d778f83	[`NllbMoe`] Update code to properly support loss computation (#25429 ) * update nllb_moe * fix * doc nits * nits * add a small test * ficup * remove adapted from	2023-08-17 17:21:56 +02:00
Arthur	b4d5548800	🚨🚨🚨 [`SPM`] Finish fix spm models 🚨🚨🚨 (#25224 ) * fix EVERYTHING * more fixes * ⚗️⚗️ Tokenizer magic ⚗️⚗️ * wrong value but test passes for the TODO * update * updat * safe protobuf import? * style * non gated repo * update * fixup * Update src/transformers/models/llama/tokenization_llama.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update src/transformers/models/llama/tokenization_llama.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update tests/models/t5/test_tokenization_t5.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * nits * fix t5 too * use assert equal * fix llama decoding * nits on t5 * fixup * only remove the prefix space, not other spaces * more deconding tests and more todos * fix CI as well * fixup * skip failing test on CI (its tf its ok) * skip test_subword_regularization_tokenizer that is also crashing on the CI for TF * update llama * revert good fixes * fixup * empty * explain why we need to encode with an additional token * better warning? * nits --------- Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>	2023-08-17 17:08:05 +02:00
Arthur	d6bf08f7f6	[`resize_embedding`] Introduce `pad_to_multiple_of` and guidance (#25088 ) * fix * revert cahnges and update resizing of embedding layer * use wraning * fixup * more styling nits * fix all tests that overload the embedding tests * 👀👀 remove breakpoint * remove useless overload + overload correctly where needed * resize lm head with new vocab size * reverse not necessary changes * style * fix CIs! * fix last CI tests, adapt bark and Marian * fixup	2023-08-17 17:00:32 +02:00
Yih-Dar	d2871b2975	Skip `test_beam_search_xla_generate_simple` for `T5` (#25566 ) * fix * fix --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2023-08-17 15:30:46 +02:00
Younes Belkada	e7e9261a20	[`Docs`] Fix un-rendered images (#25561 ) fix un-rendered images	2023-08-17 12:08:11 +02:00
Yih-Dar	8992589dd6	Skip `test_onnx_runtime_optimize` for now (#25560 ) fix Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2023-08-17 11:23:16 +02:00
Yih-Dar	ec25306b39	Fix MPT CI (#25548 ) fix Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2023-08-17 09:06:26 +02:00
Sanchit Gandhi	36f183ebab	[ASR Pipeline] Fix init with timestamps (#25438 ) * [ASR Pipeline] Fix init * refactor test * change default kwarg setting * only perform checks if we have to * override init * move pre/forward/post checks to sanitize	2023-08-16 18:04:19 +01:00
amyeroberts	6bca43bb90	Input data format (#25464 ) * Add copied from statements for image processors * Move out rescale and normalize to base image processor * Remove rescale and normalize from vit (post rebase) * Update docstrings and tidy up * PR comments * Add input_data_format as preprocess argument * Resolve tests and tidy up * Remove num_channels argument * Update doc strings -> default ints not in code formatting	2023-08-16 17:45:02 +01:00
Yih-Dar	f61f072b61	Fix `MaskFormerModelIntegrationTest` OOM (#25544 ) fix Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2023-08-16 18:11:24 +02:00
Marc Sun	0ed23e4db2	fix vit hybrid test (#25543 ) fix test	2023-08-16 17:02:57 +02:00
Joao Gante	3f9cb33504	Generate: fix default max length warning (#25539 )	2023-08-16 15:30:54 +01:00
Joao Gante	0b568291d7	Marian: post-hack-fix correction (#25459 )	2023-08-16 11:49:29 +01:00
Zach Mueller	ca51499248	Make training args fully immutable (#25435 ) * Make training args fully immutable * Working tests, PyTorch * In test_trainer * during testing * Use proper dataclass way * Fix test * Another one * Fix tf * Lingering slow * Exception * Clean	2023-08-15 11:47:47 -04:00
amyeroberts	c41291965f	🚨🚨🚨 Remove softmax for EfficientNetForImageClassification 🚨🚨🚨 (#25501 ) * Remove softmax for EfficientNet * Update integration test values * Fix up	2023-08-14 17:08:47 +01:00
amyeroberts	5e5fa0d88c	Mark flaky tests (#25463 ) Make CI less brittle	2023-08-11 15:26:45 +01:00
amyeroberts	11757e2bbd	Add input_data_format argument, image transforms (#25462 ) * Enable specifying input data format - overriding inferring * Add tests	2023-08-11 15:09:31 +01:00
Joao Gante	4692d26194	Switch Transformers: remove overwritten beam sample test (#25458 )	2023-08-11 13:16:01 +01:00
amyeroberts	41d56ea6dd	Refactor image processor testers (#25450 ) * Refactor image processor test mixin - Move test_call_numpy, test_call_pytorch, test_call_pil to mixin - Rename mixin to reflect handling of logic more than saving - Add prepare_image_inputs, expected_image_outputs for tests * Fix for oneformer	2023-08-11 11:30:18 +01:00
Marc Sun	55db70c63d	GPTQ integration (#25062 ) * GTPQ integration * Add tests for gptq * support for more quantization model * fix style * typo * fix method * Update src/transformers/modeling_utils.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * add dataclass and fix quantization_method * fix doc * Update tests/quantization/gptq/test_gptq.py Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com> * Apply suggestions from code review Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com> * modify dataclass * add gtpqconfig import * fix typo * fix tests * remove dataset as req arg * remove tokenizer import * add offload cpu quantization test * fix check dataset * modify dockerfile * protect trainer * style * test for config * add more log * overwrite torch_dtype * draft doc * modify quantization_config docstring * fix class name in docstring * Apply suggestions from code review Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com> * more warning * fix 8bit kwargs tests * peft compatibility * remove var * fix is_gptq_quantized * remove is_gptq_quantized * fix wrap * Update src/transformers/modeling_utils.py Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com> * add exllama * skip test * overwrite float16 * style * fix skip test * Apply suggestions from code review Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * fix docsting formatting * add doc * better test --------- Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>	2023-08-10 16:06:29 -04:00
Joao Gante	3e41cf13fc	Generate: Load generation config when `device_map` is passed (#25413 )	2023-08-10 10:54:26 +01:00
Joao Gante	123ad5363f	Generation: strict generation config validation at save time (#25411 ) * strict gen config save; Add tests * add note that the warning will be an exception in v4.34	2023-08-10 10:42:34 +01:00
amyeroberts	944ddce8bf	Enable passing number of channels when inferring data format (#25412 )	2023-08-09 17:41:21 +01:00
hukuda222	cb3c821cb7	aligned sample_beam output selection with beam_search (#25375 ) * aligned sample_beam specs with beam_search * pull origin main * Revert "pull origin main" This reverts commit `06d356f113`. * update test_utils.py * fix format * remove comment --------- Co-authored-by: Shogo Fujita <shogo.fujita@legalontech.jp>	2023-08-09 18:28:57 +02:00
Yoach Lacombe	704bf595eb	Update Bark generation configs and tests (#25409 ) * update bark generation configs for more coherent parameter * make style * update bark hub repo	2023-08-09 18:28:02 +02:00
Yih-Dar	5b517e1764	Use small config for `OneFormerModelTest.test_model_with_labels` (#25383 ) fix Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2023-08-08 17:15:34 +02:00
Sanchit Gandhi	dedd11160d	[ASR Pipeline] Clarify return timestamps (#25344 ) * [ASR Pipeline] Clarify return timestamps * fix indentation * fix ctc check * fix ctc error message! * fix test * fix other test * add new tests * final comment	2023-08-08 10:16:00 +01:00
Yih-Dar	6ea3ee3cd2	Fix `test_model_parallelism` (#25359 ) * fix * fix --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2023-08-08 10:48:45 +02:00
Matthew Hoffman	d4bd33cc9f	Register ModelOutput subclasses as supported torch.utils._pytree nodes (#25358 ) * Register ModelOutput subclasses as supported torch.utils._pytree nodes Fixes #25357 where DDP with static_graph=True does not sync gradients when calling backward() over tensors contained in ModelOutput subclasses * Add test for torch pytree ModelOutput serialization and deserialization	2023-08-08 08:12:11 +02:00
Pedro Lira	080a97119c	Add mask2former fp16 support (#25093 ) * Add mask2former fp16 support * Clear consistency/quality issues * Fix consistency/quality (2) * Add integration test for mask2former (fp16 case) * Fix code quality * Add integration test for maskformer (fp16 case) * Add integration test for oneformer (fp16 case) * Remove slow decorator from fp16 tests * Fix lint * Remove usage of full inference and value checks for fp16 * Temporarily comment slow for {mask, mask2, one}former * Add fp16 support to oneformer * Revert "Temporarily comment slow for {mask, mask2, one}former" This reverts commit `e5371edabd`. * Remove dtype conversion noop	2023-08-07 20:07:29 +01:00
Sylvain Gugger	baf1daa58e	Migrate Trainer from `Repository` to `upload_folder` (#25095 ) * First draft * Deal with progress bars * Update src/transformers/utils/hub.py Co-authored-by: Lucain <lucainp@gmail.com> * Address review comments * Forgot one * Pin hf_hub * Add argument for push all and fix tests * Fix tests * Address review comments --------- Co-authored-by: Lucain <lucainp@gmail.com>	2023-08-07 17:47:22 +02:00
Yih-Dar	c177606fb4	Fix more offload edge cases (#25342 ) * fix * fix * fix --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2023-08-07 17:45:41 +02:00
Guillaume "Vermeille" Sanchez	d533465150	add CFG for .generate() (#24654 )	2023-08-06 20:15:24 +01:00
Yih-Dar	ce6d153a53	Make `bark` could have tiny model (#25290 ) * temp * update * update * update * small dim * small dim * small dim * fix * update * fix * fix * fix * fix * fix * fix * fix --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2023-08-04 15:13:14 +02:00
Sylvain Gugger	f0fd73a2de	Document check copies (#25291 ) * Document check copies better and add tests * Include header in check for copies * Manual fixes * Try autofix * Fixes * Clean tests * Finalize doc * Remove debug print * More fixes	2023-08-04 14:56:29 +02:00
Sylvain Gugger	29f04002e6	Deal with nested configs better in base class (#25237 ) * Deal better with nested configs * Fixes * More fixes * Fix last test * Clean up existing configs * Remove hack in MPT Config * Update src/transformers/configuration_utils.py Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com> * Fix setting a nested config via dict in the kwargs * Adapt common test * Add test for nested config load with dict --------- Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>	2023-08-04 14:56:09 +02:00
Sylvain Gugger	fab1a0aa82	Give more memory in test_disk_offload (#25315 )	2023-08-04 14:10:31 +02:00
Roland Szabo	d114a6b71f	Add timeout parameter to load_image function (#25184 ) * Add timeout parameter to load_image function. * Remove line. * Reformat code Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Add parameter to docs. --------- Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>	2023-08-03 15:51:54 +01:00
Yoach Lacombe	6d3f9c1e2e	add generate method to SpeechT5ForTextToSpeech (#25233 ) * add generate method to SpeechT5ForTextToSpeech * update speecht5forTTS docstrings * Remove defaults to None in generate docstrings Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> --------- Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>	2023-08-03 14:12:07 +01:00
amyeroberts	30409af6e1	Update InstructBLIP & Align values after rescale update (#25209 ) * Update InstructBLIP values Note: the tests are not independent. Running the test independentely produces different logits compared to running all the integration tests * Update test values after rescale update * Remove left over commented out code * Revert to previous rescaling logic * Update rescale tests	2023-08-03 11:01:10 +01:00
Yih-Dar	bd90cda9a6	CI with `num_hidden_layers=2` 🚀🚀🚀 (#25266 ) * CI with layers=2 --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2023-08-02 20:22:36 +02:00
Patrick von Platen	b28ebb2655	[MMS] Fix mms (#25267 ) * [MMS] Fix mms * [MMS] Fix mms * fix mms loading * Apply suggestions from code review * make style * Update tests/models/wav2vec2/test_modeling_wav2vec2.py	2023-08-02 18:11:15 +02:00
Yupeng Jia	8021c684ec	Fix some bugs for two stage training of deformable detr (#25045 ) * Update modeling_deformable_detr.py Fix bugs for two stage training * Update modeling_deformable_detr.py * Add test_two_stage_training to DeformableDetrModelTest --------- Co-authored-by: yupeng.jia <yupeng.jia@momenta.ai>	2023-08-02 11:30:36 +01:00
amyeroberts	1b35409768	Update rescale tests - cast to float after rescaling to reflect #25229 (#25259 ) Rescale tests - cast to float after rescaling to reflect #25229	2023-08-02 11:29:55 +01:00
YQ	2230d149f0	fix get_keys_to_not_convert() to return correct modules for full precision inference (#25105 ) * add test for `get_keys_to_not_convert` * add minimum patch to keep mpt lm_head from 8bit quantization * add reivsion to	2023-08-02 04:21:52 -04:00
Younes Belkada	05ebb0264e	[`MPT`] Add `require_bitsandbytes` on MPT integration tests (#25201 ) * add `require_bitsandbytes` on MPT integration tests * add it on mpt as well	2023-08-01 12:20:34 +02:00
Yih-Dar	1b4f6199c6	Update tiny model info. and pipeline testing (#25213 ) * update tiny_model_summary.json * update * update * update --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2023-07-31 19:35:33 +02:00
Yih-Dar	9ca3aa0156	Fix `all_model_classes` in `FlaxBloomGenerationTest` (#25211 ) fix Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2023-07-31 17:32:05 +02:00
amyeroberts	05cda5df34	🚨🚨🚨 Fix rescale ViVit Efficientnet (#25174 ) * Fix rescaling bug * Add tests * Update integration tests * Fix up * Update src/transformers/image_transforms.py * Update test - new possible order in list	2023-07-28 19:52:51 +01:00
Sanchit Gandhi	03f98f9683	[MusicGen] Fix integration tests (#25169 ) * move to device * update with cuda values * fix fp16 * more rigorous	2023-07-28 18:50:15 +01:00
Younes Belkada	dd9d45b6ec	[`InstructBlip`] Fix instructblip slow test (#25171 ) * fix instruct blip slow test * Update tests/models/instructblip/test_modeling_instructblip.py	2023-07-28 17:00:10 +02:00
Younes Belkada	add0895dd9	[`Mpt`] Fix mpt slow test (#25170 ) fix mpt slow test	2023-07-28 16:45:09 +02:00
Lucain	c1dba1111b	Add test when downloading from gated repo (#25039 )	2023-07-28 08:14:27 -04:00
Sanchit Gandhi	e93103632b	Add bloom flax (#25094 ) * First commit * step 1 working * add alibi * placeholder for `scan` * add matrix mult alibi * beta scaling factor for bmm * working v1 - simple forward pass * move layer_number from attribute to arg in call * partial functioning scan * hacky working scan * add more modifs * add test * update scan for new kwarg order * fix position_ids problem * fix bug in attention layer * small fix - do the alibi broadcasting only once * prelim refactor * finish refactor * alibi shifting * incorporate dropout_add to attention module * make style * make padding work again * update * remove bogus file * up * get generation to work * clean code a bit * added small tests * adding albii test * make CI tests pass: - change init weight - add correct tuple for output attention - add scan test - make CI tests work * fix few nits * fix nit onnx * fix onnx nit * add missing dtype args to nn.Modules * remove debugging statements * fix scan generate * Update modeling_flax_bloom.py * Update test_modeling_flax_bloom.py * Update test_modeling_flax_bloom.py * Update test_modeling_flax_bloom.py * fix small test issue + make style * clean up * Update tests/models/bloom/test_modeling_flax_bloom.py Co-authored-by: Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com> * fix function name * small fix test * forward contrib credits from PR17761 * Fix failing test * fix small typo documentation * fix non passing test - remove device from build alibi * refactor call - refactor `FlaxBloomBlockCollection` module * make style * upcast to fp32 * cleaner way to upcast * remove unused args * remove layer number * fix scan test * make style * fix i4 casting * fix slow test * Update src/transformers/models/bloom/modeling_flax_bloom.py Co-authored-by: Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com> * remove `layer_past` * refactor a bit * fix `scan` slow test * remove useless import * major changes - remove unused code - refactor a 
bit - revert import `torch` * major refactoring - change build alibi * remove scan * fix tests * make style * clean-up alibi * add integration tests * up * fix batch norm conversion * style * style * update pt-fx cross tests * update copyright * Update src/transformers/modeling_flax_pytorch_utils.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * per-weight check * style * line formats --------- Co-authored-by: younesbelkada <younesbelkada@gmail.com> Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com> Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com> Co-authored-by: haileyschoelkopf <haileyschoelkopf@users.noreply.github.com> Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>	2023-07-27 18:24:56 +01:00
Yoach Lacombe	0b92ae3489	Add offload support to Bark (#25037 ) * initial Bark offload proposal * use hooks instead of manually offloading * add test of bark offload to cpu feature * Apply nit suggestions from code review Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update docstrings of offload Co-authored-by: Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com> * remove unnecessary set_seed in Bark tests --------- Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> Co-authored-by: Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com>	2023-07-27 15:35:17 +01:00
Arthur	9cea3e7b80	[`MptConfig`] support from pretrained args (#25116 ) * support from pretrained args * draft addition of tests * update test * use parent assert true * Update src/transformers/models/mpt/configuration_mpt.py Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com> --------- Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>	2023-07-27 16:24:52 +02:00
amyeroberts	659829b6ae	MaskFormer - enable return_dict in order to compile (#25052 ) * Enable return_dict in order to compile * Update tests	2023-07-26 16:23:30 +01:00
Yih-Dar	224da5df69	update `use_auth_token` -> `token` (#25083 ) * update --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2023-07-26 15:09:59 +02:00
Yih-Dar	31acba5697	Fix `PvtModelIntegrationTest::test_inference_fp16` (#25106 ) update Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2023-07-26 14:57:44 +02:00
Sebastian Husch Lee	8f36ab3e22	[`T5`, `MT5`, `UMT5`] Add [T5, MT5, UMT5]ForSequenceClassification (#24726 ) * Initial addition of t5forsequenceclassification * Adding imports and adding tests * Formatting * Running make fix-copies * Adding mt5forseq * Formatting * run make fix-copies * Adding to docs * Add model_parallel * Fix bug * Fix * Remove TODO * Fixing tests for T5ForSequenceClassification * Undo changes to dependency_versions_table.py * Change classification head to work with T5Config directly * Change seq length to let tests pass * PR comments for formatting * Formatting * Initial addition of UMT5ForSequenceClassification * Adding to inits and formatting * run make fix-copies * Add doc for UMT5ForSeqClass * Update UMT5 config * Fix docs * Skip torch fx test for SequenceClassification * Formatting * Add skip to UMT5 tests as well * Fix umt5 tests * Running make fix-copies * PR comments * Fix for change to sentence_representation * Rename seq_len to hidden_size since that's what it is * Use base_model to follow format of the rest of the library * Update docs * Extract the decoder_input_ids changes and make one liner * Make one-liner	2023-07-25 21:02:49 +02:00
Arthur	f9cc333805	[ `PreTrainedTokenizerFast`] Keep properties from fast tokenizer (#25053 ) * draft solution * use `setdefault` * nits * add tests and fix truncation issue * fix test * test passes locally * quality * updates * update tests	2023-07-25 18:45:01 +02:00
Connor Henderson	0779fc8eb8	Edit err message and comment in `test_model_is_small` (#25087 ) * Edit err message and comment in * put back 80M comment	2023-07-25 12:24:36 -04:00
Arthur	dcb183f4bd	[`MPT`] Add MosaicML's `MPT` model to transformers (#24629 ) * draft add new model like * some cleaning of the config * nits * add nested configs * nits * update * update * added layer norms + triton kernels * consider only LPLayerNorm for now. * update * all keys match. * Update * fixing nits here and there * working forward pass. * removed einops dependency * nits * format * add alibi * byebye head mask * refactor attention * nits. * format * fix nits. * nuke ande updates * nuke tokenizer test * don't reshape query with kv heads * added a bit of documentation. * remove unneeded things * nuke more stuff * nit * logits match - same generations * rm unneeded methods * 1 remaining failing CI test * nit * fix nits * fix docs * fix docs * rm tokenizer * fixup * fixup * fixup and fix tests * fixed configuration object. * use correct activation * few minor fixes * clarify docs a bit * logits match à 1e-12 * skip and unskip a test * added some slow tests. * fix readme * add more details * Update docs/source/en/model_doc/mpt.md Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> * Apply suggestions from code review Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> * fix configuration issues * more fixes in config * added more models * Apply suggestions from code review Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> * remove unneeded position ids * fix some comments * Apply suggestions from code review Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> * revert suggestion * mpt alibi + added batched generation * Update src/transformers/models/mpt/__init__.py Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> * remove init config * Update src/transformers/models/mpt/configuration_mpt.py Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> * fix nit * add another slow test * Apply suggestions from code review Co-authored-by: Sylvain 
Gugger <35901082+sgugger@users.noreply.github.com> * fits in one line * some refactor because make fixup doesn't pass * add ft notebook * update md * correct doc path --------- Co-authored-by: younesbelkada <younesbelkada@gmail.com> Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com> Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>	2023-07-25 14:32:40 +02:00
Xuehai Pan	6bc61aa7af	Set `TF32` flag for PyTorch cuDNN backend (#25075 )	2023-07-25 08:04:48 -04:00
Sylvain Gugger	f295fc8a16	Fix last models for common tests that are too big. (#25058 ) * Fix last models for common tests that are too big. * Remove print statement	2023-07-25 07:56:04 -04:00
Rinat	a03d13c83d	Pvt model (#24720 ) * pull and push updates * add docs * fix modeling * Add and run test * make copies * add task * fix tests and fix small issues * Checks on a Pull Request * fix docs * add desc pvt.md	2023-07-24 15:34:19 +01:00
Sylvain Gugger	afe8bfc075	Comment again print statement	2023-07-24 10:12:20 -04:00
Sylvain Gugger	42571f6eb8	Make more test models smaller (#25005 ) * Make more test models tiny * Make more test models tiny * More models * More models	2023-07-24 10:08:47 -04:00
Zach Mueller	3b734f5042	Add dispatch_batches to training arguments (#25038 ) * Dispatch batches * Copy items	2023-07-24 09:27:19 -04:00
Arthur	0511369a8b	[`LlamaConfig`] Nit: pad token should be None by default (#24958 ) * pad token should be None by default * fix tests * nits	2023-07-21 14:32:34 +02:00
Benjamin Badger	caf5e369fc	Contrastive Search peak memory reduction (#24120 ) Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>	2023-07-20 18:46:53 +01:00
Joao Gante	89136ff7f8	Generate: sequence bias can handle same terminations (#24822 )	2023-07-20 12:23:17 +01:00
Tom Aarsen	79444f370f	Deprecate unused OpenLlama architecture (#24922 ) * Resolve typo in check_repo.py * Specify encoding when opening modeling files * Deprecate the OpenLlama architecture * Add disclaimer pointing to Llama I'm open to different wordings here * Match the capitalisation of LLaMA	2023-07-20 07:03:24 -04:00
Arthur	07360b6c9c	[`Llama2`] Add support for Llama 2 (#24891 ) * add llama * add other readmes * update padding id in readme * add link to paper * fix paths and tokenizer * more nits * styling * fit operation in 2 lines when possible * nits * Apply suggestions from code review Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * add form * update reademe * update readme, we don't have a default pad token * update test and tokenization * LLaMA instead of Llama * nits * add expected text * add greeedy output * styling * Update src/transformers/models/llama/modeling_llama.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * sequential device map * skip relevant changes --------- Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>	2023-07-18 15:18:31 -04:00
NielsRogge	3ec10e6c76	Add DINOv2 (#24016 ) * First draft * More improvements * Convert patch embedding layer * Convert all weights * Make conversion work * Improve conversion script * Fix style * Make all tests pass * Add image processor to auto mapping * Add swiglu ffn * Add image processor to conversion script * Fix conversion of giant model * Fix documentation * Fix style * Fix tests * Address comments * Address more comments * Remove unused arguments * Remove more arguments * Rename parameters * Include mask token * Address comments * Add docstring * Transfer checkpoints * Empty commit	2023-07-18 15:34:06 +01:00
Yih-Dar	57da42ad05	Enable `ZeroShotAudioClassificationPipelineTests::test_small_model_pt` (#24882 ) fix Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2023-07-18 15:08:53 +02:00
statelesshz	9c875839c0	add ascend npu accelerator support (#24879 ) * Add Ascend NPU accelerator support * fix style warning	2023-07-18 08:20:32 -04:00
Yih-Dar	2ab75add4b	Remove `tests/onnx` (#24868 ) fix Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2023-07-17 22:37:28 +02:00
Yih-Dar	870dfc15b2	Skip failing `ZeroShotAudioClassificationPipelineTests::test_small_model_pt` for now (#24867 ) fix Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2023-07-17 15:51:50 -04:00
Yoach Lacombe	f42a35e611	Add bark (#24086 ) * first raw version of the bark integration * working code on small models with single run * add converting script from suno weights 2 hf * many changes * correct past_kv output * working implementation for inference * update the converting script according to the architecture changes * add a working end-to-end inference code * remove some comments and make small changes * remove unecessary comment * add docstrings and ensure no unecessary intermediary output during audio generation * remove done TODOs * make style + add config docstrings * modification for batch inference support on the whole model * add details to .generation_audio method * add copyright * convert EncodecModel from original library to transformers implementation * add two class in order to facilitate model and sub-models loading from the hub * add support of loading the whole model * add BarkProcessor * correct modeling according to processor output * Add proper __init__ and auto support * Add up-to-date copyright/license message * add relative import instead of absolute * cleaner head_dim computation * small comment removal or changes * more verbose LayerNorm init method * specify eps for clearer comprehension * more verbose variable naming in the MLP module * remove unecessary BarkBlock parameter * clearer code in the forward pass of the BarkBlock * remove _initialize_modules method for cleaner code * Remove unnecessary methods from sub-models * move code to remove unnecessary function * rename a variable for clarity and change an assert * move code and change variable name for clarity * remove unnecessary asserts * correct small bug * correct a comment * change variable names for clarity * remove asserts * change import from absolute to relative * correct small error due to comma missing + correct import * Add attribute Bark config * add first version of tests * update attention_map * add tie_weights and resize_token_embeddings for fineModel * 
correct getting attention_mask in generate_text_semantic * remove Bark inference trick * leave more choices in barkProcessor * remove _no_split_modules * fixe error in forward of block and introduce clearer notations * correct converting script with last changes * make style + add draft bark.mdx * correct BarkModelTest::test_generate_text_semantic * add Bark in main README * add dummy_pt_objects for Bark * add missing models in the main init * correct test_decoder_model_past_with_large_inputs * disable torchscript test * change docstring of BarkProcessor * Add test_processor_bark * make style * correct copyrights * add bark.mdx + make style, quality and consistency * Apply suggestions from code review Co-authored-by: Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com> * Remove unnecessary test method * simply logic of a test * Only check first ids for slow audio generation * split full end-to-end generation tests * remove unneccessary comment * change submodel names for clearer naming * remove ModuleDict from modeling_bark * combine two if statements * ensure that an edge misued won't happen * modify variable name * move code snippet to the right place (coarse instead of semantic) * change BarkSemanticModule -> BarkSemanticModel * align BarkProcessor with transformers paradigm * correct BarkProcessor tests with last commit changes * change _validate_voice_preset to an instance method instead of a class method * tie_weights already called with post_init * add codec_model config to configuration * update bark modeling tests with recent BarkProcessor changes * remove SubModelPretrainedModel + change speakers embeddings prompt type in BarkModel * change absolute imports to relative * remove TODO * change docstrings * add examples to docs and docstrings * make style * uses BatchFeature in BarkProcessor insteads of dict * continue improving docstrings and docs + make style * correct docstrings examples * more comprehensible speaker_embeddings load/Save * 
rename speaker_embeddings_dict -> speaker_embeddings * correct bark.mdx + add bark to documentation_tests * correct docstrings configuration_bark * integrate last nit suggestions * integrate BarkGeneration configs * make style * remove bark tests from documentation_tests.txt because timeout - tested manually * add proper generation config initialization * small bark.mdx documentation changes * rename bark.mdx -> bark.md * add torch.no_grad behind BarkModel.generate_audio() * replace assert by ValueError in convert_suno_to_hf.py * integrate a series of short comments from reviewer * move SemanticLogitsProcessors and remove .detach() from Bark docs and docstrings * actually remove SemanticLogitsProcessor from modeling_bark.oy * BarkProcessor returns a single output instead of tuple + correct docstrings * make style + correct bug * add initializer_range to BarkConfig + correct slow modeling tests * add .clone() to history_prompt.coarse_prompt to avoid modifying input array * Making sure no extra "`" are present * remove extra characters in modeling_bark.py * Correct output if history_prompt is None * remove TODOs * remove ravel comment * completing generation_configuration_bark.py docstrings * change docstrings - number of audio codebooks instead of Encodec codebooks * change 'bias' docstrings in configuration_bark.py * format code * rename BarkModel.generate_audio -> BarkModel.generate_speech * modify AutoConfig instead of EncodecConfig in BarkConfig * correct AutoConfig wrong init * refactor BarkModel and sub-models generate_coarse, generate_fine, generate_text_semantic * remove SemanticLogitsProcessor and replace it with SuppressTokensLogitsProcessor * move nb_codebook related config arguments to BarkFineConfig * rename bark.mdx -> bark.md * correcting BarkModelConfig from_pretrained + remove keys_to_ignore * correct bark.md with correct hub path * correct code bug in bark.md * correct list tokens_to_suppress * modify Processor to load nested speaker embeddings in 
a safer way * correct batch sampling in BarkFineModel.generate_fine * Apply suggestions from code review Small docstrings correction and code improvements Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * give more details about num_layers in docstrings * correct indentation mistake * correct submodelconfig order of docstring variables * put audio models in alphabetical order in utils/check_repo.my * remove useless line from test_modeling_bark.py * makes BarkCoarseModelTest inherits from (ModelTesterMixin, GenerationTesterMixin, unittest.TestCase) instead of BarkSemanticModelTest * make a Tester class for each sub-model instead of inheriting * add test_resize_embeddings=True for Bark sub-models * add Copied from transformers.models.gpt_neo.modeling_gpt_neo.GPTNeoSelfAttention._split_heads * remove 'Copied fom Bark' comment * remove unneccessary comment * change np.min -> min in modeling_bark.py * refactored all custom layers to have Bark prefix * add attention_mask as an argument of generate_text_semantic * refactor sub-models start docstrings to have more precise config class definition * move _tied_weights_keys overriding * add docstrings to generate_xxx in modeling_bark.py * add loading whole BarkModel to convert_suno_to_hf * refactor attribute and variable names * make style convert_suno * update bark checkpoints * remove never entered if statement * move bark_modeling docstrings after BarkPretrainedModel class definition * refactor modeling_bark.py: kv -> key_values * small nits - code refactoring and removing unecessary lines from _init_weights * nits - replace inplace method by variable assigning * remove optional when necessary * remove some lines in generate_speech * add default value for optional parameter * Refactor preprocess_histories_before_coarse -> preprocess_histories Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * correct usage after refactoring * refactor Bark's generate_xxx -> generate and 
modify docstrings and tests accordingly * update docstrings python in configuration_bark.py * add bark files in utils/documentation_test.txt * correct docstrings python snippet * add the ability to use parameters in the form of e.g coarse_temperature * add semantic_max_new_tokens in python snippet in docstrings for quicker generation * Reformate sub-models kwargs in BakModel.generate Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * correct kwargs in BarkModel.generate * correct attention_mask kwarg in BarkModel.generate * add tests for sub-models args in BarkModel.generate and correct BarkFineModel.test_generate_fp16 * enrich BarkModel.generate docstrings with a description of how to use the kwargs --------- Co-authored-by: Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com> Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>	2023-07-17 17:53:24 +01:00
Sylvain Gugger	1023705440	Check models used for common tests are small (#24824 ) * First models * Conditional DETR * Treat DETR models, skip others * Skip LayoutLMv2 as well * Fix last tests	2023-07-14 14:43:19 -04:00
Sylvain Gugger	f32303d519	Run hub tests (#24807 ) * Run hub tests * [all-test] Run tests please! * [all-test] Add vision dep for hub tests * Fix tests	2023-07-13 15:25:45 -04:00
Joao Gante	34d9409427	Llama/GPTNeoX: add RoPE scaling (#24653 ) * add rope_scaling * tmp commit * add gptneox * add tests * GPTNeoX can now handle long inputs, so the pipeline test was wrong * Update src/transformers/models/open_llama/configuration_open_llama.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * remove ntk * remove redundant validation --------- Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>	2023-07-13 16:47:30 +01:00
Sylvain Gugger	9342c8fb82	Deprecate models (#24787 ) * Deprecate some models * Fix imports * Fix inits too * Remove tests * Add deprecated banner to documentation * Remove from init * Fix auto classes * Style * Remote upgrade strategy 1 * Remove site package cache * Revert this part * Fix typo... * Update utils * Update docs/source/en/model_doc/bort.md Co-authored-by: Lysandre Debut <lysandre.debut@reseau.eseo.fr> * Address review comments * With all files saved --------- Co-authored-by: Lysandre Debut <lysandre.debut@reseau.eseo.fr>	2023-07-13 11:46:54 -04:00
Yih-Dar	717dadc6f3	Skip torchscript tests for `MusicgenForConditionalGeneration` (#24782 ) fix Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2023-07-13 15:54:18 +02:00
Zach Mueller	0284285501	Fix pad across processes dim in trainer and not being able to set the timeout (#24775 ) * dim, and rm copy * Don't rm copy for now * Oops * pad index * Should be a working test * Tickle down ddp timeout * Put fix back in now that testing locally is done * Better comment specifying timeout Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> --------- Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>	2023-07-12 10:01:51 -04:00
NielsRogge	bb13a92859	[InstructBLIP] Fix bos token of LLaMa checkpoints (#24492 ) * Add fix * Fix doctest	2023-07-11 20:43:01 +01:00
Connor Henderson	5739726fcc	fix: Text splitting in the BasicTokenizer (#22280 ) * fix: Apostraphe splitting in the BasicTokenizer for CLIPTokenizer * account for apostrophe at start of new word * remove _run_split_on_punc, use re.findall instead * remove debugging, make style and quality * use pattern and punc splitting, repo-consistency will fail * remove commented out debugging * adds bool args to BasicTokenizer, remove pattern * do_split_on_punc default True * clean stray comments and line breaks * rebase, repo-consistency * update to just do punctuation split * add unicode normalizing back * remove redundant line	2023-07-11 11:07:58 -04:00
Jegor Kitškerkin	8a5e8a9c2a	Add ViViT (#22518 ) * Add model * Add ability to get classification head weights * Add docs * Add imports to __init__.py * Run style * Fix imports and add mdx doc * Run style * Fix copyright * Fix config docstring * Remove imports of ViViTLayer and load_tf_weights_in_vivit * Remove FeatureExtractor and replace with ImageProcessor everywhere * Remove ViViTForPreTraining from vivit.mdx * Change ViViT -> Vivit everywhere * Add model_doc to _toctree.yml * Replace tuples with lists in arguments of VivitConfig * Rename patch_size to tubelet_size in TubeletEmbeddings * Fix checkpoint names * Add tests * Remove unused num_frames * Fix imports for VivitImageProcessor * Minor fixes * Decrease number of frames in VivitModelTester from 32 to 16 * Decrease number of frames in VivitModelTester from 16 to 8 * Add initialization for pos embeddings * Rename Vivit -> ViViT in some places * Fix docstring and formatting * Rename TubeletEmbeddings -> VivitTubeletEmbeddings * Remove load_tf_weights_in_vivit * Change checkpoint name * Remove Vivit _TOKENIZER_FOR_DOC * Fix * Fix VivitTubeletEmbeddings and pass config object as parameter * Use image_size and num_frames instead of video_size * Change conversion script and fix differences with the orig implementation * Fix docstrings * Add attention head pruning * Run style and fixup * Fix tests * Add ViViT to video_classification.mdx * Save processor in conversion script * Fix * Add image processor test * Run fixup and style * Run fix-copies * Update tests/models/vivit/test_modeling_vivit.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update tests/models/vivit/test_modeling_vivit.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update src/transformers/models/vivit/modeling_vivit.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Use PyAV instead of decord * Add unittest.skip * Run style * Remove unneeded test * Update 
docs/source/en/model_doc/vivit.mdx Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update src/transformers/models/vivit/configuration_vivit.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update src/transformers/models/vivit/modeling_vivit.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update src/transformers/models/vivit/image_processing_vivit.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update src/transformers/models/vivit/modeling_vivit.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update src/transformers/models/vivit/modeling_vivit.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update src/transformers/models/vivit/image_processing_vivit.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update src/transformers/models/vivit/modeling_vivit.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Add model * Add docs * Run style * Fix imports and add mdx doc * Remove FeatureExtractor and replace with ImageProcessor everywhere * Change ViViT -> Vivit everywhere * Rename Vivit -> ViViT in some places * Update src/transformers/models/vivit/image_processing_vivit.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Run make style * Remove inputs save * Fix image processor * Fix * Run `make style` * Decrease parameters of VivitModelTester * Decrease tubelet size * Rename vivit.mdx * Update src/transformers/models/vivit/image_processing_vivit.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update src/transformers/models/vivit/image_processing_vivit.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update src/transformers/models/vivit/image_processing_vivit.py Co-authored-by: amyeroberts 
<22614925+amyeroberts@users.noreply.github.com> * Fix default values in image_processing_vivit.py --------- Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>	2023-07-11 14:04:04 +01:00
Arthur	b15343de6f	[Patch-t5-tokenizer] Patches the changes on T5 to make sure previous behaviour is still valide for beginning of words (#24622 ) * patch `_tokenize` function * more tests * properly fix * fixup * Update src/transformers/models/t5/tokenization_t5.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * fix without ifs * update * protect import * add python processing * is first needed * add doc and update with lefacy * updaate * fix T5 SPM converter * styling * fix T5 warning * add is_seqio_available * remove is_first * revert some changes * more tests and update * update llama test batterie * fixup * refactor T5 spm common tests * draft the llama tests * update * uopdate test * nits * refine * name nit * fix t5 tests * fix T5 * update * revert convert slow to fast changes that fail lots of tests * legacy support * fixup * nits is first not defined * don't use legacy behaviour for switch transformers * style * My attempt to check. * nits * fixes * update * fixup * Apply suggestions from code review Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * updates * fixup * add legacy warning * fixup * warning_once nit * update t5 documentation test * update llama tok documentation * add space to warning * nits * nit * Apply suggestions from code review Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * last nits --------- Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> Co-authored-by: Nicolas Patry <patry.nicolas@protonmail.com>	2023-07-11 15:02:18 +02:00
Matt	b3ab3fac1d	Falcon port (#24523 ) * Initial commit * Update src/transformers/models/falcon/configuration_falcon.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update src/transformers/models/falcon/configuration_falcon.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Cleanup config docstring * Update src/transformers/models/falcon/configuration_falcon.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Convert to relative imports * Remove torch < 1.8 warning * Restructure cos_sin header * qkv -> query, key, value * Refactor attention calculation * Add a couple of config variables to account for the different checkpoints * Successful merging of the code paths! * Fix misplaced line in the non-parallel attention path * Update config and tests * Add a pad_token_id when testing * Support output_attentions when alibi is None * make fixup * Skip KV cache shape test * No more _keys_to_ignore_on_load_missing * Simplify self attention a bit * Simplify self attention a bit * make fixup * stash commit * Some more attention mask updates * Should pass all tests except assisted generation! 
* Add big model generation test * make fixup * Add temporary workaround for test * Test overrides for assisted generation * Update src/transformers/models/falcon/modeling_falcon.py Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> * Update src/transformers/models/falcon/modeling_falcon.py Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> * Update src/transformers/models/falcon/modeling_falcon.py Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> * Update tests/models/falcon/test_modeling_falcon.py Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> * Test overrides for assisted generation * Add generation demo * Update copyright * Make the docstring model actually small * Add module-level docstring * Remove all assertions * Add copied from bloom * Reformat the QKV layer * Add copied from bloom * Update src/transformers/models/falcon/modeling_falcon.py Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> * Remove unused line and reformat * No single letter variables * Cleanup return names * Add copied from line * Remove the deprecated arguments blocks * Change the embeddings test to an alibi on/off test * Remove position_ids from FalconForQA * Remove old check for token type IDs * Fix the alibi path when multi_query is False * Update src/transformers/models/falcon/modeling_falcon.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update src/transformers/models/falcon/modeling_falcon.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update tests/models/falcon/test_modeling_falcon.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update config naming * Fix typo for new_decoder_architecture * Add some comments * Fix docstring * Fix docstring * Create range in the right dtype from the start * Review comment cleanup * n_head_kv -> num_kv_heads * self.alibi -> self.use_alibi * 
self.num_kv -> self.num_kv_heads * Reorder config args * Made alibi arguments Optional * Add all model docstrings * Add extra checkpoints * Add author info for Falcon * Stop removing token_type_ids because our checkpoints shouldn't return it anymore * Add one hopeful comment for the future * Fix typo * Update tests, fix cache issue for generation * Use -1e9 instead of -inf to avoid float overflow * Recompute the rotary embeddings much less often * Re-enable disabled tests * One final fix to attention mask calculation, and update tests * Cleanup targeting falcon-40b equivalency * Post-rebase docs update * Update docstrings, especially in the config * More descriptive variable names, and comments where we can't rename them --------- Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>	2023-07-11 13:36:31 +01:00
novice	30ed3adf47	Add Multi Resolution Analysis (MRA) (New PR) (#24513 ) * Add all files * Update masked_language_modeling.md * fix mlm models * fix conflicts * fix conflicts * fix copies * Apply suggestions from code review Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Reduce seq_len and hidden_size in ModelTester * remove output_attentions * fix conflicts * remove copied from statements * Apply suggestions from code review Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> --------- Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>	2023-07-10 10:50:43 +01:00
Yih-Dar	4957294270	Fix flaky `test_for_warning_if_padding_and_no_attention_mask` (#24706 ) fix Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2023-07-07 11:55:21 +02:00
Arthur	fb78769b9c	[`MT5`] Fix CONFIG_MAPPING issue leading it to load umt5 class (#24678 ) * update * add umt5 to auto tokenizer mapping * nits * fixup * fix failing torch test	2023-07-07 11:33:54 +09:00
Yuchao Dai	fb3b22c3b9	LlamaTokenizer should be picklable (#24681 ) * LlamaTokenizer should be picklable * make fixup	2023-07-06 10:21:27 +01:00
Nripesh Niketan	bd9dfc23b9	Add `is_torch_mps_available` function to utils (#24660 ) * Add mps function utils * black formating * format fix * Added MPS functionality to transformers * format fix	2023-07-05 16:02:20 +02:00
Yih-Dar	ee339bad01	Fix `VisionTextDualEncoderIntegrationTest` (#24661 ) fix Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2023-07-05 13:44:30 +02:00
Yih-Dar	d211a84aca	Fix `EncodecModelTest::test_multi_gpu_data_parallel_forward` (#24663 ) fix Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2023-07-05 11:37:46 +02:00
Sanchit Gandhi	4e94566018	Fix audio feature extractor deps (#24636 ) * Fix audio feature extractor deps * use audio utils window over torch window	2023-07-04 16:03:27 +01:00
Arthur	799df10aef	[`Umt5`] Add google's umt5 to `transformers` (#24477 ) * add tokenization template * update conversion script * update modeling code * update * update convert checkpoint * update modeling * revert changes on convert script * new conversion script for new format * correct position bias * cleaning a bit * Credit co authors Co-authored-by: agemagician <ahmed.elnaggar@tum.de> Co-authored-by: stefan-it <> * styling * Add docq * fix copies * add co author * Other Author * Merge branch 'main' of https://github.com/huggingface/transformers into add-umt5 * add testing * nit * Update docs/source/en/model_doc/umt5.mdx Co-authored-by: Stefan Schweter <stefan@schweter.it> * fix t5 * actual fix? * revert wrong changes * remove * update test * more fixes * revert some changes * add SPIECE_UNDERLINE * add a commone xample * upfate * fix copies * revert changes on t5 conversion script * revert bytefallback changes since there was no addition yet * fixup * fixup * ingore umt5 cutom testing folder * fix readmes * revertT5 changes * same outputs * fixup * update example * Apply suggestions from code review * style * draft addition of all new files * current update * fix attention and stuff * finish refactoring * auto config * fixup * more nits * add umt5 to init * use md format * Update README.md Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * revert changes on mt5 * revert mt4 changes * update test * more fixes * add to mapping * fix-copies * fix copies * foix retain grad * fix some tests * nits * done * Update src/transformers/models/umt5/modeling_umt5.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update docs/source/en/model_doc/umt5.md * Update src/transformers/models/umt5/__init__.py * Update docs/source/en/model_doc/umt5.md Co-authored-by: Stefan Schweter <stefan@schweter.it> * Update src/transformers/models/umt5/modeling_umt5.py * update conversion script + use google checkpoints * nits * update 
test and modelling * stash slow convert * update fixupd * don't change slow --------- Co-authored-by: stefan-it <> Co-authored-by: Stefan Schweter <stefan@schweter.it> Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>	2023-07-03 07:38:21 +02:00
Matt	134caef31a	Speed up TF tests by reducing hidden layer counts (#24595 ) * hidden layers, huh, what are they good for (absolutely nothing) * Some tests break with 1 hidden layer, use 2 * Use 1 hidden layer in a few slow models * Use num_hidden_layers=2 everywhere * Slightly higher tol for groupvit * Slightly higher tol for groupvit	2023-06-30 16:30:33 +01:00
Yih-Dar	3441ad7d43	Make (TF) CI faster (test only a subset of model classes) (#24592 ) * fix * fix * fix --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2023-06-30 16:54:54 +02:00
JB (Don)	78a2b19fc8	Show a warning for missing attention masks when pad_token_id is not None (#24510 ) * Adding warning messages to BERT for missing attention masks These warning messages when there are pad tokens within the input ids and no attention masks are given. The warning message should only show up once. * Adding warning messages to BERT for missing attention masks These warning messages are shown when the pad_token_id is not None and no attention masks are given. The warning message should only show up once. * Ran fix copies to copy over the changes to some of the other models * Add logger.warning_once.cache_clear() to the test * Shows warning when there are no attention masks and input_ids start/end with pad tokens * Using warning_once() instead and fix indexing in input_ids check --------- Co-authored-by: JB Lau <hckyn@voyager2.local>	2023-06-30 08:19:39 -04:00
Arthur	b52a03cd3b	⚠️⚠️[`T5Tokenize`] Fix T5 family tokenizers⚠️⚠️ (#24565 ) * don't add space before single letter chars that don't have a merge * fix the fix * fixup * add a test * more testing * fixup * hack to make sure fast is also fixed * update switch transformers test * revert convert slow * Update src/transformers/models/t5/tokenization_t5.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * add typechecking * quality --------- Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>	2023-06-30 07:00:43 +02:00
amyeroberts	b324557aac	Removal of deprecated vision methods and specify deprecation versions (#24570 ) * Removal of deprecated methods and specify versions * Fix tests	2023-06-29 15:09:51 +01:00
Yih-Dar	77db28dc52	Update some torchscript tests after #24505 (#24566 ) * fix * fix --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2023-06-29 16:05:24 +02:00
Sanchit Gandhi	1c1c90756d	Add Musicgen (#24109 ) * Add Audiocraft * add cross attention * style * add for lm * convert and verify * introduce t5 * split configs * load t5 + lm * clean conversion * copy from t5 * style * start pattern provider * make generation work * style * fix pos embs * propagate shape changes * propagate shape changes * style * delay pattern: pad tokens at end * audiocraft -> musicgen * fix inits * add mdx * style * fix pad token in processor * override generate and add todos * add init to test * undo pattern delay mask after gen * remove cfg logits processor * remove cfg logits processor * remove logits processor in favour of mask * clean pos embs * make fix copies * update readmes * clean pos emb * refactor encoder/decoder * make fix copies * update conversion * fix config imports * update config docs * make style * send pattern mask to device * pattern mask with delay * recover prompted audio tokens * fix docstrings * laydown test file * pattern edge case * remove t5 ref * add processing class * config refactor * better pattern comment * check if mask is not present * check if mask is not present * refactor to auto class * remove encoder configs * fix processor * processor import * start updating conversion * start updating tests * make style * convert t5, encodec, lm * convert as composite * also convert processor * run generate * classifier free gen * comments and clean up * make style * docs for logit proc * docstring for uncond gen * start lm tests * work tests * let the lm generate * refactor: reshape inside forward * undo greedy loop changes * from_enc_dec -> from_sub_model * fix input id shapes in docstrings * Apply suggestions from code review Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com> * undo generate changes * from sub model config * Update src/transformers/models/musicgen/modeling_musicgen.py Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com> * make generate work again * generate uncond -> get 
uncond inputs * remove prefix allowed tokens fn * better error message * logit proc checks * Apply suggestions from code review Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com> * make decoder only tests work * composite fast tests * make style * uncond generation * feat extr padding * make audio prompt work * fix inputs docstrings * unconditional inputs: dict -> model output * clean up tests * more clean up tests * make style * t5 encoder -> auto text encoder * remove comments * deal with frames * fix auto text * slow tests * nice mdx * remove can generate * todo - hub id * convert m/l * make fix copies * only import generation with torch * ignore decoder from tests * don't wrap uncond inputs * make style * cleaner uncond inputs * add example to musicgen forward * fix docs * ignore MusicGen Model/ForConditionalGeneration in auto mapping * add doc section to toctree * add to doc tests * add processor tests * fix push to hub in conversion * tips for decoder only loading * Apply suggestions from code review Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * fix conversion for s / m / l checkpoints * import stopping criteria from module * remove from pipeline tests * fix uncond docstring * decode audio method * fix docs * org: sanchit-gandhi -> facebook * fix max pos embeddings * remove auto doc (not compatible with shapes) * bump max pos emb * make style * fix doc * fix config doc * fix config doc * ignore musicgen config from docstring * make style * fix config * fix config for doctest * consistent from_sub_models * don't automap decoder * fix mdx save audio file * fix mdx save audio file * processor batch decode for audio * remove keys to ignore * update doc md * update generation config * allow changes for default generation config * update tests * make style * fix docstring for uncond * fix processor test * fix processor test --------- Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com> Co-authored-by: Joao 
Gante <joaofranciscocardosogante@gmail.com> Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>	2023-06-29 14:48:59 +01:00
amyeroberts	ae454f41d4	Update old existing feature extractor references (#24552 ) * Update old existing feature extractor references * Typo * Apply suggestions from code review * Apply suggestions from code review * Apply suggestions from code review * Address comments from review - update 'feature extractor' Co-authored by: Yih-Dar <2521628+ydshieh@users.noreply.github.com>	2023-06-29 10:17:36 +01:00
Yih-Dar	fd6735102a	Make PT/Flax tests could be run on GPU (#24557 ) fix Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2023-06-28 20:11:01 +02:00
Younes Belkada	33b5ef5cdf	[`InstructBlip`] Add instruct blip int8 test (#24555 ) * add 8bit instructblip test * update tests	2023-06-28 19:06:30 +02:00
Younes Belkada	903b97d8df	[`gpt2-int8`] Add gpt2-xl int8 test (#24543 ) add gpt2-xl test	2023-06-28 18:02:13 +02:00
Yih-Dar	b0651655be	Update `EncodecIntegrationTest` (#24553 ) * fix * fix * fix * fix * fix --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2023-06-28 18:01:41 +02:00
Yih-Dar	e84bf1f734	⚠️ Time to say goodbye to py37 (#24091 ) * fix --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2023-06-28 07:22:39 +02:00
Dario Sučić	12240925cf	Add bitsandbytes support for gpt2 models (#24504 ) * Add bitsandbytes support for gpt2 models * Guard Conv1D import to pass tensorflow test * Appease ruff linter * Fix 4bit test and remove int8 test boilerplate * Update tests/bnb/test_mixed_int8.py Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com> --------- Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>	2023-06-28 05:55:32 +02:00
Sylvain Gugger	89b6ee49fd	Finishing tidying keys to ignore on load (#24535 )	2023-06-27 21:35:15 -04:00
Sylvain Gugger	8e5d1619b3	Clean load keys (#24505 ) * Preliminary work on some models * Fix test load missing and make sure nonpersistent buffers are tested * Always ignore nonpersistent buffers if in state_dict * Treat models * More models * Treat remaining models * Fix quality * Fix tests * Remove draft * This test is not needed anymore * Fix copies * Fix last test * Newly added models * Fix last tests * Address review comments	2023-06-27 14:45:40 -04:00
Sebastian	06910f5a76	[`T5`] Add T5ForQuestionAnswering and MT5ForQuestionAnswering (#24481 ) * Adding T5ForQuestionAnswering * Changed weight initialization that results in better initial loss when fine-tuning * Update to class variables * Running make fixup * Running make fix-copies * Remove model_parallel * Adding MT5ForQuestionAnswering * Adding docs * Fix wrong doc * Update src/transformers/models/mt5/modeling_mt5.py Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com> * Update src/transformers/models/t5/modeling_t5.py Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com> * File formatting * Undoing change --------- Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>	2023-06-27 10:07:06 -04:00
Xiaoli Wang	239ace152b	Fix TypeError: Object of type int64 is not JSON serializable (#24340 ) * Fix TypeError: Object of type int64 is not JSON serializable * Convert numpy.float64 and numpy.int64 to float and int for json serialization * Black reformatted examples/pytorch/token-classification/run_ner_no_trainer.py * make style	2023-06-27 12:15:49 +01:00
Joao Gante	5f3efdf762	Generate: `group_beam_search` requires `diversity_penalty>0.0` (#24456 ) * add exception * update docs	2023-06-27 10:46:39 +01:00
Yih-Dar	850cf4af0c	Compute `dropout_probability` only in training mode (#24486 ) * fix * fix * fix * fix * fix * fix --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2023-06-26 18:36:47 +02:00
Sylvain Gugger	5757923888	Add support for for loops in python interpreter (#24429 ) Add support for for loops	2023-06-26 09:58:14 -04:00
Yih-Dar	3ca022238b	Update `InstructBlipModelIntegrationTest` (#24490 ) * fix * fix --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2023-06-26 14:37:12 +02:00
Younes Belkada	914289ac4b	[`pipeline`] Fix str device issue (#24396 ) * fix str device issue * fixup * adapt from suggestions * forward contrib credits from suggestions * better fix * added backward compatibility for older PT versions * final fixes * oops * Attempting something with less branching. --------- Co-authored-by: amyeroberts <amyeroberts@users.noreply.github.com> Co-authored-by: Nicolas Patry <patry.nicolas@protonmail.com>	2023-06-26 13:58:36 +02:00
Matthijs Hollemans	3b84d86b57	add missing alignment_heads to Whisper integration test (#24487 ) add missing alignment heads	2023-06-26 11:50:10 +02:00
NielsRogge	868363abb9	Add InstructBLIP (#23460 ) * Squash 88 commits * Use markdown * Remove mdx files due to bad rebase * Fix modeling files due to bad rebase * Fix style * Update comment * fix --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2023-06-26 11:23:57 +02:00
Sanchit Gandhi	8767958fc1	Allow dict input for audio classification pipeline (#23445 ) * Allow dict input for audio classification pipeline * make style * Empty commit to trigger CI * Empty commit to trigger CI * check for torchaudio * add pip instructions Co-authored-by: Sylvain <sylvain.gugger@gmail.com> * Update src/transformers/pipelines/audio_classification.py Co-authored-by: Nicolas Patry <patry.nicolas@protonmail.com> * asr -> audio class * asr -> audio class --------- Co-authored-by: Sylvain <sylvain.gugger@gmail.com> Co-authored-by: Nicolas Patry <patry.nicolas@protonmail.com>	2023-06-23 13:50:37 +01:00
Yih-Dar	2898fd3968	Fix some `TFWhisperModelIntegrationTests` (#24428 ) * fix * fix * fix * fix * fix * fix * fix * fix * fix * Update src/transformers/models/whisper/modeling_tf_whisper.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update src/transformers/models/whisper/modeling_tf_whisper.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * fix * fix * fix --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com> Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>	2023-06-23 14:27:49 +02:00
Bowen Bao	a28325e25e	Replace python random with torch.rand to enable dynamo.export (#24434 ) * Replace python random with torch.rand to enable dynamo.export * revert changes to flax model code * Remove unused random import * Fix torch template * Move torch.manual_seed(0) to right location	2023-06-23 08:17:21 -04:00
Alex Hall	b6295b26c5	Refactor hyperparameter search backends (#24384 ) * Refactor hyperparameter search backends * Simpler refactoring without abstract base class * black * review comments: specify name in class use methods instead of callable class attributes name constant better * review comments: safer bool checking, log multiple available backends * test ALL_HYPERPARAMETER_SEARCH_BACKENDS vs HPSearchBackend in unit test, not module. format with black. * copyright	2023-06-22 14:28:25 -04:00
Younes Belkada	3ce3385c47	Revert "Fix gradient checkpointing + fp16 autocast for most models" (#24420 ) Revert "Fix gradient checkpointing + fp16 autocast for most models (#24247)" This reverts commit `285a48011d`.	2023-06-22 16:11:27 +02:00
Yih-Dar	652ece0710	Skip `test_conditional_generation_pt_pix2struct` in Past CI (torch < 1.11) (#24417 ) fix Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2023-06-22 15:34:13 +02:00
Matthijs Hollemans	cd927a4736	add word-level timestamps to Whisper (#23205 ) * let's go! * initial implementation of token-level timestamps * only return a single timestamp per token * remove token probabilities * fix return type * fix doc comment * strip special tokens * rename * revert to not stripping special tokens * only support models that have alignment_heads * add integration test * consistently name it token-level timestamps * small DTW tweak * initial support for ASR pipeline * fix pipeline doc comments * resolve token timestamps in pipeline with chunking * change warning when no final timestamp is found * return word-level timestamps * fixup * fix bug that skipped final word in each chunk * fix failing unit tests * merge punctuations into the words * also return word tokens * also return token indices * add (failing) unit test for combine_tokens_into_words * make combine_tokens_into_words private * restore OpenAI's punctuation rules * add pipeline tests * make requested changes * PR review changes * fix failing pipeline test * small stuff from PR * only return words and their timestamps, not segments * move alignment_heads into generation config * forgot to set alignment_heads in pipeline tests * tiny comment fix * grr	2023-06-21 17:48:21 +02:00
Younes Belkada	285a48011d	Fix gradient checkpointing + fp16 autocast for most models (#24247 ) * fix gc bug * continue PoC on OPT * fixes * 🤯 * fix tests * remove pytest.mark * fixup * forward contrib credits from discussions * forward contrib credits from discussions * reverting changes on untouched files. --------- Co-authored-by: zhaoqf123 <zhaoqf123@users.noreply.github.com> Co-authored-by: 7eu7d7 <7eu7d7@users.noreply.github.com>	2023-06-21 17:04:59 +02:00
Joao Gante	5f0801d174	Generate: add SequenceBiasLogitsProcessor (#24334 )	2023-06-21 11:14:41 +01:00
Sylvain Gugger	eb849f6604	Migrate doc files to Markdown. (#24376 ) * Rename index.mdx to index.md * With saved modifs * Address review comment * Treat all files * .mdx -> .md * Remove special char * Update utils/tests_fetcher.py Co-authored-by: Lysandre Debut <lysandre.debut@reseau.eseo.fr> --------- Co-authored-by: Lysandre Debut <lysandre.debut@reseau.eseo.fr>	2023-06-20 18:07:47 -04:00
Patrick von Platen	b0513b013b	[Wav2Vec2 - MMS] Correct directly loading adapters weights (#24335 ) * Correct direct lang loading * correct more * revert black * Use tie weights instead= * add tests * add tests * make style	2023-06-20 19:39:52 +02:00
Arthur	e5c760d636	[GPTNeoX] Nit in config (#24349 ) * add raise value error for attention size * nits to fix test_config * style	2023-06-20 19:19:19 +02:00
Yih-Dar	83dc5762e7	Skip a tapas (tokenization) test in past CI (#24378 ) fix Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2023-06-20 18:35:45 +02:00
Yih-Dar	297d769d0e	Better test name and enable pipeline test for `pix2struct` (#24377 ) * best test name forever * fix * fix --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2023-06-20 18:29:30 +02:00
Yih-Dar	0527c1c0ea	Add a check in `ImageToTextPipeline._forward` (#24373 ) * fix * fix * fix * Update src/transformers/pipelines/image_to_text.py Co-authored-by: Nicolas Patry <patry.nicolas@protonmail.com> * fix --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com> Co-authored-by: Nicolas Patry <patry.nicolas@protonmail.com>	2023-06-20 18:07:34 +02:00
Sylvain Gugger	dc4449918d	Rename test to be more accurate (#24374 )	2023-06-20 11:54:55 -04:00
Sanchit Gandhi	6c1344444a	[Whisper] Make tests faster (#24105 )	2023-06-20 16:01:56 +01:00
Yih-Dar	c23d131eab	Update tiny models for pipeline testing. (#24364 ) * fix --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2023-06-20 14:43:10 +02:00
Matt	56efbf4301	TensorFlow CI fixes (#24360 ) * Fix saved_model_creation_extended * Skip the BLIP model creation test for now * Fix TF SAM test * Fix longformer tests * Fix Wav2Vec2 * Add a skip for XLNet * make fixup * make fix-copies * Add comments	2023-06-20 12:59:21 +01:00
Matt	9138995025	Add test for proper TF input signatures (#24320 ) * Add test for proper input signatures * No more signature pruning * Test the dummy inputs are valid too * fine-tine -> fine-tune * Fix indent in test_dataset_conversion	2023-06-16 17:03:13 +01:00
Sylvain Gugger	096f2cf126	Tied weights load (#24310 ) * Use tied weight keys * More * Fix tied weight missing warning * Only give info on unexpected keys with different classes * Deal with empty archs * Fix tests * Refine test	2023-06-16 10:55:42 -04:00
Matt	3403712958	Big TF test cleanup (#24282 ) * Fix one BLIP arg not being optional, remove misspelled arg * Remove the lxmert test overrides and just use the base test_saved_model_creation * saved_model_creation fixes and re-enabling tests across the board * Remove unnecessary skip * Stop caching sinusoidal embeddings in speech_to_text * Fix transfo_xl compilation * Fix transfo_xl compilation * Fix the conditionals in xglm * Set the save spec only when building * Clarify comment * Move comment correctly * Correct embeddings generation for speech2text * Mark RAG generation tests as @slow * Remove redundant else: * Add comment to clarify the save_spec line in build() * Fix size tests for XGLM at last! * make fixup * Remove one band_part operation * Mark test_keras_fit as @slow	2023-06-16 15:40:49 +01:00
Yih-Dar	896a58de15	Byebye pytorch 1.9 (#24080 ) byebye --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2023-06-16 16:38:23 +02:00
Matt	62d71f4083	Fix functional TF Whisper and modernize tests (#24301 ) * Revert whisper change and modify the test_compile_tf_model test * make fixup * Tweak test slightly * Add functional model saving to test * Ensure TF can infer shapes for data2vec * Add override for efficientformer * Mark test as slow	2023-06-16 14:43:43 +01:00
Sanchit Gandhi	4124a09f8b	[EnCodec] Changes for 32kHz ckpt (#24296 ) * [EnCodec] Changes for 32kHz ckpt * Update src/transformers/models/encodec/convert_encodec_checkpoint_to_pytorch.py * Update src/transformers/models/encodec/convert_encodec_checkpoint_to_pytorch.py	2023-06-15 14:36:19 +01:00
amyeroberts	e6122c3f40	Fix image segmentation tool bug (#23897 ) * Image segmentation tool bug * Remove resizing in the tests	2023-06-15 08:09:31 -04:00
Sylvain Gugger	372f50030b	Split common test from core tests (#24284 )	2023-06-15 07:30:24 -04:00
Matthijs Hollemans	0c3fdccf2f	[WIP] add EnCodec model (#23655 ) * boilerplate stuff * messing around with the feature extractor * fix feature extractor * unit tests for feature extractor * rename speech to audio * quick-and-dirty import of Meta's code * import weights (sort of) * cleaning up * more cleaning up * move encoder/decoder args into config * cleanup model * rename EnCodec -> Encodec * RVQ parameters in config * add slow test * add lstm init and test_init * Add save & load * finish EncodecModel * remove decoder_input_values as they are ont used anywhere (not removed from doc yet) * fix test feature extraction model name * Add better slow test * Fix tests * some fixup and cleaning * Improve further * cleaning up quantizer * fix up conversion script * test don't pass, _encode_fram does not work * update tests with output per encode and decode * more cleanup * rename _codebook * remove old config cruft * ratios & hop_length * use ModuleList instead of Sequential * clean up resnet block * update types * update tests * fixup * quick cleanup * fix padding * more styl,ing * add patrick feedback * fix copies * fixup * fix lstm * fix shape issues * fixup * rename conv layers * fixup * fix decoding * small conv refactoring * remove norm_params * simplify conv layers * rename conv layers * stuff * Clean up * Add padding logic use padding mask small conv refactoring remove norm_params simplify conv layers rename conv layers stuff add batched test update Clean up merge and update for padding fix padding fixup * clean up more * clean up more * More clean ups * cleanup convolutions * typo * fix typos * fixup * build PR doc? 
* start refactoring docstring * fix don't pad when no strid and chunk * update docstring * update docstring * nits * update going to lunch * update config and model * fix broken testse (becaue of the config changes) * fix scale computation * fixu[ * only return dict if speciefied or if config returns it * remove todos * update defaults in config * update conversion script * fix doctest * more docstring + fixup * nits on batched_tests * more nits * Apply suggestions from code review Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com> * update basxed on review * fix update * updaet tests * Apply suggestions from code review Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * fixup * add overlap and chunl_length_s * cleanup feature extraction * teste edge cases truncation and padding * correct processor values * update config encodec, nits * fix tests * fixup * fix 24Hz test * elle tests are green * fix fixup * Apply suggestions from code review * revert readme changes * fixup * add example * use facebook checkpoints * fix typo * no pipeline tests * use slef.pad everywhere we can * Apply suggestions from code review Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * update based on review * update * update mdx * fix bug and tests * fixup * fix doctest * remove comment * more nits * add more coverage for `test_truncation_and_padding` * fixup * add last test * fix text * nits * Update tests/models/encodec/test_modeling_encodec.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * take care of the last comments * typo * fix test * nits * fixup * Update src/transformers/models/encodec/feature_extraction_encodec.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> --------- Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com> Co-authored-by: arthur.zucker@gmail.com <arthur.zucker@gmail.com> Co-authored-by: Arthur 
<48595927+ArthurZucker@users.noreply.github.com> Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>	2023-06-14 18:57:23 +02:00
Yih-Dar	a04ebc8b33	`Pix2StructImageProcessor` requires `torch>=1.11.0` (#24270 ) * fix * fix * fix --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2023-06-14 17:05:40 +02:00
Joao Gante	4626df5077	TF: CTRL with native embedding layers (#23456 )	2023-06-14 14:39:02 +01:00
Yih-Dar	eac8dede83	Skip some `TQAPipelineTests` tests in past CI (#24267 ) fix Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2023-06-14 14:25:24 +02:00
Yih-Dar	233113149b	Skip `GPT-J` fx tests for torch < 1.12 (#24256 ) * fix * fix --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2023-06-13 20:33:26 +02:00
Matt	3bd1fe4315	Stop storing references to bound methods via tf.function (#24146 ) * Stop storing references to bound methods in tf.functions * Remove the gc.collect calls now that we resolved the underlying problem * Remove the default signature from model.serving entirely, big cleanup * Remove _prune_signature as self.input_signature can prune itself * Restore serving docstring * Update int support test to check the input signature * Make sure other tests also use model.input_signature and not serving.input_signature * Restore _prune_signature * Remove the doctest GC now it's no longer needed * Correct core tests to use the pruned sig * order lines correctly in core tests * Add eager_serving back with a deprecation warning	2023-06-13 19:04:22 +01:00
Yih-Dar	cf561d7cf1	Add `torch >=1.12` requirement for `Tapas` (#24251 ) * fix * fix * fix * Update src/transformers/models/tapas/modeling_tapas.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * fix --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com> Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>	2023-06-13 19:19:40 +02:00
Joao Gante	b1ea6b4bf5	Generate: GenerationConfig can overwrite attributes at from_pretrained time (#24238 ) Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>	2023-06-13 17:59:21 +01:00
Joao Gante	7bb6933b9d	TF: standardize `test_model_common_attributes` for language models (#23457 )	2023-06-13 17:51:37 +01:00
Sylvain Gugger	695928e1e5	Tied params cleanup (#24211 ) * First test * Add info for all models * style * Repo consistency * Fix last model and cleanup prints * Repo consistency * Use consistent function for detecting tied weights	2023-06-13 11:38:39 -04:00
Yih-Dar	74b846cacf	Update `(TF)SamModelIntegrationTest` (#24199 ) fix Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2023-06-13 14:28:14 +02:00
Yih-Dar	4fe9716a79	Skip RWKV test in past CI (#24204 ) * fix * fix --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2023-06-12 18:14:15 +02:00
Zach Mueller	ebd94b0f6f	🚨🚨🚨 Replace DataLoader logic for Accelerate in Trainer, remove unneeded tests 🚨🚨🚨 (#24028 ) * Working integration * Fix failing test * Revert label host logic * Bring it back!	2023-06-12 11:23:37 -04:00
Yih-Dar	dadc9fb427	Update `GPTNeoXLanguageGenerationTest` (#24193 ) * fix * fix * fix --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2023-06-12 15:37:12 +02:00
Yih-Dar	e26c6f03be	Fix `Wav2Vec2` CI OOM (#24190 ) fix Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2023-06-12 11:39:04 +02:00
Stas Bekman	0d217f428f	[tests] fix bitsandbytes import issue (#24151 ) fix bitsandbytes import issue	2023-06-09 21:53:11 -07:00
Lysandre Debut	deff5979fe	Tool types (#24032 ) * Tool types * Tests + fixes * Isolate types * Oops * Review comments + docs * Tests + docs * soundfile -> vision	2023-06-09 13:34:07 -04:00
Yih-Dar	d0d1632958	Fix Pipeline CI OOM issue (#24124 ) * fix * fix * fix --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2023-06-09 16:49:02 +02:00
Younes Belkada	62fe753325	[`SAM`] Fix sam slow test (#24140 ) * fix sam test * update pipeline typehint	2023-06-09 16:22:09 +02:00
Yih-Dar	847b47c0ee	Fix XGLM OOM on CI (#24123 ) fix Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2023-06-09 15:20:59 +02:00
Yih-Dar	b8fe259f16	Fix SAM OOM issue on CI (#24125 ) fix Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2023-06-09 15:07:08 +02:00
Yih-Dar	707023d155	Fix TF Rag OOM issue (#24122 ) fix Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2023-06-09 15:03:11 +02:00
Younes Belkada	a6d05d55f6	[`bnb`] Fix bnb config json serialization (#24137 ) * fix bnb config json serialization * forward contrib credits from discussions --------- Co-authored-by: Andrechang <Andrechang@users.noreply.github.com>	2023-06-09 13:41:14 +02:00
Yih-Dar	2e2088f24b	Avoid `GPT-2` daily CI job OOM (in TF tests) (#24106 ) * fix * fix --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2023-06-08 18:21:09 +02:00
Radamés Ajna	a73883ae9e	add trust_remote_code option to CLI download cmd (#24097 ) * add trust_remote_code option * require_torch	2023-06-08 11:13:57 -04:00
Sylvain Gugger	89b00eef94	Fix expected value in tests of the test fetcher (#24077 ) * Fix expected value in tests of the test fetcher * Fix trigger for repo util tests	2023-06-07 11:38:56 -04:00
Younes Belkada	4795219228	[`bnb`] Fix bnb skip modules (#24043 ) * fix skip modules test * oops * address comments	2023-06-07 15:27:46 +02:00
Patrick von Platen	52972e70c7	[Wav2Vec2] Fix torch script (#24062 ) * [Wav2Vec2] Fix torch script * fix more	2023-06-07 07:27:07 -04:00
Joao Gante	612b2a1a6d	Generate: increase left-padding test atol (#23448 ) increase atol	2023-06-07 11:56:57 +01:00
Sylvain Gugger	f1660d7e23	Remote code improvements (#23959 ) * Fix model load when it has both code on the Hub and locally * Add input check with timeout * Add tests * Apply suggestions from code review Co-authored-by: Lysandre Debut <lysandre.debut@reseau.eseo.fr> * Some non-saved stuff * Add feature extractors * Add image processor * Add model * Add processor and tokenizer * Reduce timeout --------- Co-authored-by: Lysandre Debut <lysandre.debut@reseau.eseo.fr>	2023-06-06 14:31:14 -04:00
Matt	4a55e47877	Move TF building to an actual build() method (#23760 ) * A fun new PR where I break the entire codebase again * A fun new PR where I break the entire codebase again * Handle cross-attention * Move calls to model(model.dummy_inputs) to the new build() method * Seeing what fails with the build context thing * make fix-copies * Let's see what fails with new build methods * Fix the pytorch crossload build calls * Fix the overridden build methods in vision_text_dual_encoder * Make sure all our build methods set self.built or call super().build(), which also sets it * make fix-copies * Remove finished TODO * Tentatively remove unneeded (?) line * Transpose b in deberta correctly and remove unused threading local * Get rid of build_with_dummies and all it stands for * Rollback some changes to TF-PT crossloading * Correctly call super().build()	2023-06-06 18:30:51 +01:00
amyeroberts	a717e0318c	Add TimmBackbone model (#22619 ) * Add test_backbone for convnext * Add TimmBackbone model * Add check for backbone type * Tidying up - config checks * Update convnextv2 * Tidy up * Fix indices & clearer comment * Exceptions for config checks * Correclty update config for tests * Safer imports * Safer safer imports * Fix where decorators go * Update import logic and backbone tests * More import fixes * Fixup * Only import all_models if torch available * Fix kwarg updates in from_pretrained & main rebase * Tidy up * Add tests for AutoBackbone * Tidy up * Fix import error * Fix up * Install nattan in doc_test_job * Revert back to setting self._out_xxx directly * Bug fix - out_indices mapping from out_features * Fix tests * Dont accept output_loading_info for Timm models * Set out_xxx and don't remap * Use smaller checkpoint for test * Don't remap timm indices - check out_indices based on stage names * Skip test as it's n/a * Apply suggestions from code review Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Cleaner imports / spelling is hard --------- Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>	2023-06-06 17:11:30 +01:00
Sylvain Gugger	b8935980a2	Modification of one text example file should trigger said test (#24051 )	2023-06-06 12:02:56 -04:00
Yih-Dar	17846646f2	Fix `MobileViTV2` checkpoint name (#24018 ) * fix * fix * Apply suggestions from code review Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com> Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>	2023-06-05 18:12:45 +02:00
Jungwoo Park	44bd590a29	Pix2Struct: fix wrong broadcast axis of attention mask in visual encoder (#23976 ) * fix wrong broadcast axis of attention mask in visual encoder * fix slow tests --------- Co-authored-by: younesbelkada <younesbelkada@gmail.com>	2023-06-05 11:47:29 -04:00
Yih-Dar	5176dc2310	Skip `test_multi_gpu_data_parallel_forward` for `MobileViTV2ModelTest` (#24017 ) * fix * fix * fix --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2023-06-05 16:29:32 +02:00
Sanchit Gandhi	c9cf337772	[Whisper Tokenizer] Skip special tokens when decoding with timestamps (#23945 )	2023-06-02 16:26:59 +02:00
Shehan Munasinghe	07c54413ac	Add MobileViTv2 (#22820 ) * generated code from add-new-model-like * Add code for modeling, config, and weight conversion * add tests for image-classification, update modeling and config * add code, tests for semantic-segmentation * make style, make quality, make fix-copies * make fix-copies * Update modeling_mobilevitv2.py fix bugs * Update _toctree.yml * update modeling, config fix bugs * Edit docs - fix bug MobileViTv2v2 -> MobileViTv2 * Update mobilevitv2.mdx * update docstrings * Update configuration_mobilevitv2.py make style * Update convert_mlcvnets_to_pytorch.py remove unused options * Update convert_mlcvnets_to_pytorch.py make style * Add suggestions from code review Co-Authored-By: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * make style, make quality * Add suggestions from code review Co-Authored-By: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Add suggestions from code review Remove MobileViTv2ImageProcessor Co-Authored-By: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * make style * Add suggestions from code review Rename MobileViTv2 -> MobileViTV2 Co-Authored-By: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Add suggestions from code review Co-Authored-By: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update modeling_mobilevitv2.py make style * Update serialization.mdx * Update modeling_mobilevitv2.py --------- Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>	2023-06-02 10:37:02 +01:00
Patrick von Platen	5dfd407b37	[MMS] Scaling Speech Technology to 1,000+ Languages \| Add attention adapter to Wav2Vec2 (#23813 ) * add fine-tuned with adapter layer * Add set_target_lang to tokenizer * Implement load adapter * add tests * make style * Apply suggestions from code review * Update src/transformers/models/wav2vec2/tokenization_wav2vec2.py * make fix-copies * Apply suggestions from code review * make fix-copies * make style again * mkae style again * fix doc string * Update tests/models/wav2vec2/test_tokenization_wav2vec2.py * Apply suggestions from code review * fix * Correct wav2vec2 adapter * mkae style * Update src/transformers/models/wav2vec2/modeling_wav2vec2.py Co-authored-by: Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com> * add more nice docs * finish * finish * Apply suggestions from code review Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Apply suggestions from code review Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Apply suggestions from code review * all finish --------- Co-authored-by: Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com> Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>	2023-06-02 10:30:24 +01:00
Marc Sun	e03a9cc0cd	Modify device_map behavior when loading a model using from_pretrained (#23922 ) * Modify device map behavior for 4/8 bits model * Remove device_map arg for training 4/8 bit model * Remove index Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Add Exceptions * Modify comment Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Fix formatting * Get current device with accelerate * Revert "Get current device with accelerate" This reverts commit `46f0079910`. * Fix Exception * Modify quantization doc * Fix error Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> --------- Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>	2023-06-01 13:21:22 -04:00
amyeroberts	c608b8fc93	Bug fix - flip_channel_order for channels first images (#23701 ) Bug fix - flip_channel_order for channels_first	2023-05-31 17:12:27 +01:00
Connor Henderson	7adce8b532	fix: Replace `add_prefix_space` in `get_prompt_ids` with manual space for FastTokenizer compatibility (#23796 ) * add ' ' replacement for add_prefix_space * add fast tokenizer test	2023-05-31 10:52:35 -04:00
Sanchit Gandhi	8f915c450d	Unpin numba (#23162 ) * fix for ragged list * unpin numba * make style * np.object -> object * propagate changes to tokenizer as well * np.long -> "long" * revert tokenization changes * check with tokenization changes * list/tuple logic * catch numpy * catch else case * clean up * up * better check * trigger ci * Empty commit to trigger CI	2023-05-31 14:59:30 +01:00
Sourab Mangrulkar	a73b1d59a3	accelerate deepspeed and gradient accumulation integrate (#23236 ) * mixed precision support via accelerate * fix issues * fix for the sharded ddp case * fix flax and tf failing tests * `refactor the place to create `Accelerator` object * move ddp prep to accelerate * fix 😅 * resolving comments * move fsdp handling to accelerate * fixex * fix saving * shift torch dynamo handling to accelerate * shift deepspeed integration and save & load utils to accelerate * fix accelerate launcher support * oops * fix 🐛 * save ckpt fix * Trigger CI * nasty 🐛 😅 * as deepspeed needs grad_acc fixes, transfer grad_acc to accelerate * make tests happy * quality ✨ * loss tracked needs to account for grad_acc * fixing the deepspeed tests * quality ✨ * 😅😅😅 * tests 😡 * quality ✨ * Trigger CI * resolve comments and fix the issue with the previous merge from branch * Trigger CI * accelerate took over deepspeed integration --------- Co-authored-by: Stas Bekman <stas@stason.org>	2023-05-31 15:16:22 +05:30
Denisa Roberts	88f50a1e89	Add TensorFlow implementation of EfficientFormer (#22620 ) * Add tf code for efficientformer * Fix return dict bug - return last hidden state after last stage * Fix corresponding return dict bug * Override test tol * Change default values of training to False * Set training to default False X3 * Rm axis from ln * Set init in dense projection * Rm debug stuff * Make style; all tests pass. * Modify year to 2023 * Fix attention biases codes * Update the shape list logic * Add a batch norm eps config * Remove extract comments in test files * Add conditional attn and hidden states return for serving output * Change channel dim checking logic * Add exception for withteacher model in training mode * Revert layer count for now * Add layer count for conditional layer naming * Transpose for conv happens only in main layer * Make tests smaller * Make style * Update doc * Rm from_pt * Change to actual expect image class label * Remove stray print in tests * Update image processor test * Remove the old serving output logic * Make style * Make style * Complete test	2023-05-31 10:43:12 +01:00
Arthur	6fc0454b2f	[LlamaTokenizerFast] nit update `post_processor` on the fly (#23855 ) * Update the processor when changing add_eos and add_bos * fixup * update * add a test * fix failing tests * fixup	2023-05-30 16:50:41 +02:00
Matthijs Hollemans	2faa09530b	fix Whisper tests on GPU (#23753 ) * move input features to GPU * skip these tests because undefined behavior * unskip tests	2023-05-30 09:06:58 -04:00
Eli Simhayev	4b6a5a7caa	[Time-Series] Autoformer model (#21891 ) * ran `transformers-cli add-new-model-like` * added `AutoformerLayernorm` and `AutoformerSeriesDecomposition` * added `decomposition_layer` in `init` and `moving_avg` to config * added `AutoformerAutoCorrelation` to encoder & decoder * removed caninical self attention `AutoformerAttention` * added arguments in config and model tester. Init works! 😁 * WIP autoformer attention with autocorrlation * fixed `attn_weights` size * wip time_delay_agg_training * fixing sizes and debug time_delay_agg_training * aggregation in training works! 😁 * `top_k_delays` -> `top_k_delays_index` and added `contiguous()` * wip time_delay_agg_inference * finish time_delay_agg_inference 😎 * added resize to autocorrelation * bug fix: added the length of the output signal to `irfft` * `attention_mask = None` in the decoder * fixed test: changed attention expected size, `test_attention_outputs` works! * removed unnecessary code * apply AutoformerLayernorm in final norm in enc & dec * added series decomposition to the encoder * added series decomp to decoder, with inputs * added trend todos * added autoformer to README * added to index * added autoformer.mdx * remove scaling and init attention_mask in the decoder * make style * fix copies * make fix-copies * inital fix-copies * fix from https://github.com/huggingface/transformers/pull/22076 * make style * fix class names * added trend * added d_model and projection layers * added `trend_projection` source, and decomp layer init * added trend & seasonal init for decoder input * AutoformerModel cannot be copied as it has the decomp layer too * encoder can be copied from time series transformer * fixed generation and made distrb. 
out more robust * use context window to calculate decomposition * use the context_window for decomposition * use output_params helper * clean up AutoformerAttention * subsequences_length off by 1 * make fix copies * fix test * added init for nn.Conv1d * fix IGNORE_NON_TESTED * added model_doc * fix ruff * ignore tests * remove dup * fix SPECIAL_CASES_TO_ALLOW * do not copy due to conv1d weight init * remove unused imports * added short summary * added label_length and made the model non-autoregressive * added params docs * better doc for `factor` * fix tests * renamed `moving_avg` to `moving_average` * renamed `factor` to `autocorrelation_factor` * make style * Update src/transformers/models/autoformer/configuration_autoformer.py Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Update src/transformers/models/autoformer/configuration_autoformer.py Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * fix configurations * fix integration tests * Update src/transformers/models/autoformer/configuration_autoformer.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * fixing `lags_sequence` doc * Revert "fixing `lags_sequence` doc" This reverts commit `21e34911e3`. 
* Update src/transformers/models/autoformer/modeling_autoformer.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update src/transformers/models/autoformer/modeling_autoformer.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update src/transformers/models/autoformer/modeling_autoformer.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Apply suggestions from code review Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update src/transformers/models/autoformer/configuration_autoformer.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * model layers now take the config * added `layer_norm_eps` to the config * Update src/transformers/models/autoformer/modeling_autoformer.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * added `config.layer_norm_eps` to AutoformerLayernorm * added `config.layer_norm_eps` to all layernorm layers * Update src/transformers/models/autoformer/configuration_autoformer.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update src/transformers/models/autoformer/configuration_autoformer.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update src/transformers/models/autoformer/configuration_autoformer.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update src/transformers/models/autoformer/configuration_autoformer.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * fix variable names * added inital pretrained model * added use_cache docstring * doc strings for trend and use_cache * fix order of args * imports on one line * fixed get_lagged_subsequences docs * add docstring for create_network_inputs * get rid of layer_norm_eps config * add back layernorm * update fixture location * fix signature * use AutoformerModelOutput dataclass * fix 
pretrain config * no need as default exists * subclass ModelOutput * remove layer_norm_eps config * fix test_model_outputs_equivalence test * test hidden_states_output * make fix-copies * Update src/transformers/models/autoformer/configuration_autoformer.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * removed unused attr * Update tests/models/autoformer/test_modeling_autoformer.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update src/transformers/models/autoformer/modeling_autoformer.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update src/transformers/models/autoformer/modeling_autoformer.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update src/transformers/models/autoformer/modeling_autoformer.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update src/transformers/models/autoformer/modeling_autoformer.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update src/transformers/models/autoformer/modeling_autoformer.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update src/transformers/models/autoformer/modeling_autoformer.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * use AutoFormerDecoderOutput * fix formatting * fix formatting --------- Co-authored-by: Kashif Rasul <kashif.rasul@gmail.com> Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>	2023-05-30 10:23:32 +02:00
Sylvain Gugger	6e4bc67099	Revamp test selection for the example tests (#23737 ) * Revamp test selection for the example tests * Rename old XLA test and fake modif in run_glue * Fixes * Fake Trainer modif * Remove fake modifs	2023-05-25 09:38:21 -04:00
Daniel King	89159651ba	Fix the regex in `get_imports` to support multiline try blocks and excepts with specific exception types (#23725 ) * fix and test get_imports for multiline try blocks, and excepts with specific errors * fixup * add some more tests * add license	2023-05-24 15:40:19 -04:00
Sanchit Gandhi	d8222be57e	[Whisper] Reduce batch size in tests (#23736 )	2023-05-24 17:31:25 +01:00
Matt	814de8fac7	Overhaul TF serving signatures + dummy inputs (#23234 ) * Let's try autodetecting serving sigs * Don't clobber existing sigs * Change shapes for multiplechoice models * Make default dummy inputs smarter too * Fix missing f-string * Let's YOLO a serving output too * Read __class__.__name__ properly * Don't just pass naked lists in there and expect it to be okay * Code cleanup * Update default serving sig * Clearer error messages * Further updates to the default serving output * make fixup * Update the serving output a bit more * Cleanups and renames, raise errors appropriately when we can't infer inputs * More renames * we're building in a functional context again, yolo * import DUMMY_INPUTS from the right place * import DUMMY_INPUTS from the right place * Support cross-attention in the dummies * Support cross-attention in the dummies * Complete removal of dummy/serving overrides in BERT * Complete removal of dummy/serving overrides in RoBERTa * Obliterate lots and lots of serving sig and dummy overrides * merge type hint changes * Fix for token_type_ids with vocab_size 1 * Add missing property decorator * Fix T5 and hopefully some models that take conv inputs * More signature pruning * Fix T5's signature * Fix Wav2Vec2 signature * Fix LongformerForMultipleChoice input signature * Fix BLIP and LED * Better default serving output error handling * Fix BART dummies * Fix dummies for cross-attention, esp encoder-decoder models * Fix visionencoderdecoder signature * Fix BLIP serving output * Small tweak to BART dummies * Cleanup the ugly parameter inspection line that I used in a few places * committed a breakpoint again * Move the text_dims check * Remove blip_text serving_output * Add decoder_input_ids to the default input sig * Remove all the manual overrides for encoder-decoder model signatures * Tweak longformer/led input sigs * Tweak default serving output * output.keys() -> output * make fixup	2023-05-24 17:03:24 +01:00
Matt	f8b2574416	Better TF docstring types (#23477 ) * Rework TF type hints to use | None instead of Optional[] for tf.Tensor * Rework TF type hints to use | None instead of Optional[] for tf.Tensor * Don't forget the imports * Add the imports to tests too * make fixup * Refactor tests that depended on get_type_hints * Better test refactor * Fix an old hidden bug in the test_keras_fit input creation code * Fix for the Deit tests	2023-05-24 13:52:52 +01:00
Tim Dettmers	796162c512	Paged Optimizer + Lion Optimizer for Trainer (#23217 ) * Added lion and paged optimizers and made original tests pass. * Added tests for paged and lion optimizers. * Added and fixed optimizer tests. * Style and quality checks. --------- Co-authored-by: younesbelkada <younesbelkada@gmail.com>	2023-05-24 12:53:28 +02:00
Tim Dettmers	9d73b92269	4-bit QLoRA via bitsandbytes (4-bit base model + LoRA) (#23479 ) * Added lion and paged optimizers and made original tests pass. * Added tests for paged and lion optimizers. * Added and fixed optimizer tests. * Style and quality checks. * Initial draft. Some tests fail. * Fixed dtype bug. * Fixed bug caused by torch_dtype='auto'. * All test green for 8-bit and 4-bit layers. * Added fix for fp32 layer norms and bf16 compute in LLaMA. * Initial draft. Some tests fail. * Fixed dtype bug. * Fixed bug caused by torch_dtype='auto'. * All test green for 8-bit and 4-bit layers. * Added lion and paged optimizers and made original tests pass. * Added tests for paged and lion optimizers. * Added and fixed optimizer tests. * Style and quality checks. * Fixing issues for PR #23479. * Added fix for fp32 layer norms and bf16 compute in LLaMA. * Reverted variable name change. * Initial draft. Some tests fail. * Fixed dtype bug. * Fixed bug caused by torch_dtype='auto'. * All test green for 8-bit and 4-bit layers. * Added lion and paged optimizers and made original tests pass. * Added tests for paged and lion optimizers. * Added and fixed optimizer tests. * Style and quality checks. * Added missing tests. * Fixup changes. * Added fixup changes. * Missed some variables to rename. 
* revert trainer tests * revert test trainer * another revert * fix tests and safety checkers * protect import * simplify a bit * Update src/transformers/trainer.py * few fixes * add warning * replace with `load_in_kbit = load_in_4bit or load_in_8bit` * fix test * fix tests * this time fix tests * safety checker * add docs * revert torch_dtype * Apply suggestions from code review Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * multiple fixes * update docs * version checks and multiple fixes * replace `is_loaded_in_kbit` * replace `load_in_kbit` * change methods names * better checks * oops * oops * address final comments --------- Co-authored-by: younesbelkada <younesbelkada@gmail.com> Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com> Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>	2023-05-24 12:52:45 +02:00
Yih-Dar	de5f86e59d	Skip `TFCvtModelTest::test_keras_fit_mixed_precision` for now (#23699 ) fix Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2023-05-23 20:47:47 +02:00
LWprogramming	3d57404464	is_batched fix for remaining 2-D numpy arrays (#23309 ) * Fix is_batched code to allow 2-D numpy arrays for audio * Tests * Fix typo * Incorporate comments from PR #23223	2023-05-23 14:37:35 -04:00
Younes Belkada	42baa58f90	[`SAM`] Fixes pipeline and adds a dummy pipeline test (#23684 ) * add a dummy pipeline test * change test name	2023-05-23 17:36:49 +02:00
Yih-Dar	71a5ed3433	Fix a `BridgeTower` test (#23694 ) fix Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2023-05-23 17:32:57 +02:00
Yih-Dar	abf691aac0	Fix PyTorch SAM tests (#23682 ) fix Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2023-05-23 14:48:38 +02:00
NielsRogge	2f424d7979	[image-to-text pipeline] Add conditional text support + GIT (#23362 ) * First draft * Remove print statements * Add conditional generation * Add more tests * Remove scripts * Remove BLIP specific linkes * Add support for pix2struct * Add fast test * Address comment * Fix style	2023-05-22 21:45:50 +02:00
Matt	26a06814a1	Fix SAM tests and use smaller checkpoints (#23656 ) * Fix SAM tests and use smaller checkpoints * Override test_model_from_pretrained to use sam-vit-base as well * make fixup	2023-05-22 19:42:35 +02:00
LWprogramming	5de2a6d5e5	Fix wav2vec2 is_batched check to include 2-D numpy arrays (#23223 ) * Fix wav2vec2 is_batched check to include 2-D numpy arrays * address comment * Add tests * oops * oops * Switch to np array Co-authored-by: Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com> * Switch to np array * condition merge * Specify mono channel only in comment * oops, add other comment too * make style * Switch list check from falsiness to empty --------- Co-authored-by: Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com>	2023-05-22 12:57:45 -04:00
Younes Belkada	7bbdfd7b24	Fix accelerate logger bug (#23650 ) * fix logger bug * Update tests/mixed_int8/test_mixed_int8.py Co-authored-by: Zachary Mueller <muellerzr@gmail.com> * import `PartialState` --------- Co-authored-by: Zachary Mueller <muellerzr@gmail.com>	2023-05-22 15:39:47 +02:00
Yih-Dar	3658488ff7	Fix `tests/repo_utils/test_get_test_info.py` (#23485 ) fix Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2023-05-20 06:53:10 +02:00
Younes Belkada	3cb9309024	[`Blip`] Remove redundant shift right (#23153 ) * remove redundant shift right * fix failing tests * this time fix tests	2023-05-19 19:14:16 +02:00
Matt	1c460a5273	TF port of the Segment Anything Model (SAM) (#22970 ) * First commit * Add auto-translation with GPT-4 * make fixup * Add a functional layernorm for TF * Add all the auxiliary imports etc. * Add the extra processor and tests * rebase to main * Add all the needed fixes to the GPT code * make fixup * Make convolutions channels-last so they run on CPU * make fixup * Fix final issues * Fix other models affected by test change * Clarify comment on the sparse_prompt_embeddings check * Refactor functional_layernorm, use shape_list in place of .shape in some places * Remove deprecated torch-alike code * Update tests/models/sam/test_modeling_tf_sam.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update tests/models/sam/test_modeling_tf_sam.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Refactor processor with common methods and separated private methods * make fixup * Quietly delete the file that didn't do anything (sorry Sylvain) * Refactor the processor tests into one file * make fixup * Clean up some unnecessary indirection * Fix TF mask postprocessing * Add more processor equivalence tests * Refactor generate_crop_boxes to use framework-neutral np code * Make the serving output correctly conditional * Fix error message line length * Use dict keys rather than indices internally in both TF and PT SAM call/forward * Return dicts internally in the call/forward methods * Revert changes to common tests and just override check_pt_tf_outputs * Revert changes to other model tests * Clarify comments for functional layernorm * Add missing transpose from PT code * Removed unused copied from in PT code * Remove overrides for tests that don't exist in TF * Fix transpose and update tests for PT and TF to check pred_masks * Add training flag * Update tests to use TF checkpoints * Update index.mdx * Add missing cross-test decorator * Remove optional extra asterisks * Revert return_dict changes in PT code 
* Update src/transformers/models/sam/modeling_tf_sam.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Remove None return annotations on init methods * Update tests/models/sam/test_processor_sam.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Fix input_boxes shapes * make fixup --------- Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>	2023-05-19 14:14:13 +01:00
Connor Henderson	2acedf4721	feat: Whisper prompting (#22496 ) * initial working additions * clean and rename, add cond stripping initial prompt to decode * cleanup, edit create_initial_prompt_ids, add tests * repo consistency, flip order of conditional * fix error, move the processor fn to the tokenizer * repo consistency, update test ids to corresponding tokenizer * use convert_tokens_to_ids not get_vocab... * use actual conditional in generate * make sytle * initial address comments * initial working add new params to pipeline * first draft of sequential generation for condition_on_previous_text * add/update tests, make compatible with timestamps * make compatible with diff. input kwargs and max length * add None check * add temperature check * flip temp check operand * refocusing to prev pr scope * remove the params too * make style * edits, move max length incorporating prompt to whisper * address comments * remove asr pipeline prompt decoding, fix indexing * address comments (more tests, validate prompt) * un-comment out tests (from debug) * remove old comment * address comments * fix typo * remove timestamp token from test * make style * cleanup * copy method to fast tokenizer, set max_new_tokens for test * prompt_ids type just pt * address Amy's comments * make style	2023-05-19 09:33:11 +01:00
Yih-Dar	ffad4f1373	Update tiny models and pipeline tests (#23446 ) * fix --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2023-05-18 17:29:04 +02:00
Yih-Dar	2406dbdcfa	Less flaky `test_assisted_decoding_matches_greedy_search` (#23451 ) * fix * fix --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2023-05-18 17:28:22 +02:00
Yih-Dar	5777c3cb3f	Fix (skip) a pipeline test for `RwkvModel` (#23444 ) fix Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2023-05-18 14:54:23 +02:00
Joao Gante	aea7b23b57	Generate: skip left-padding tests on old models (#23437 )	2023-05-18 11:04:51 +01:00
Yih-Dar	a8732e09bb	Fix device issue in `SwiftFormerModelIntegrationTest::test_inference_image_classification_head` (#23435 ) fix Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2023-05-17 19:48:18 +02:00
Yih-Dar	939a65aba7	Update Bigbird Pegasus tests (#23431 ) * fix * fix * fix --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2023-05-17 18:14:29 +02:00
IMvision12	ebb649a4e3	Add Missing tokenization test [electra] (#22997 ) * Create test_tokenization_electra.py * Update tests/models/electra/test_tokenization_electra.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> --------- Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>	2023-05-17 10:45:15 -04:00
Younes Belkada	3d3c7d4213	[`SAM`] fix sam slow test (#23376 ) * fix sam slow test * oops * fix error message	2023-05-17 14:27:43 +02:00
Yih-Dar	46d2468695	Update `ConvNextV2ModelIntegrationTest::test_inference_image_classification_head` (#23402 ) fix Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2023-05-16 23:35:11 +02:00
Joao Gante	918a06e25d	Generate: add test to check KV format (#23403 ) Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>	2023-05-16 19:28:19 +01:00
Stas Bekman	bbbc5c15d4	[AutoModel] fix `torch_dtype=auto` in `from_pretrained` (#23379 ) * [automodel] fix torch_dtype=auto in from_pretrained * add test * fix logic * Update src/transformers/models/auto/auto_factory.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> --------- Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>	2023-05-16 10:21:42 -07:00
Yih-Dar	21741e8c7e	Update `test_batched_inference_image_captioning_conditioned` (#23391 ) * fix * fix * fix test + add more docs --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com> Co-authored-by: younesbelkada <younesbelkada@gmail.com>	2023-05-16 14:49:24 +02:00
LWprogramming	ee3be05310	Fix test typos - audio feature extractors (#23310 )	2023-05-15 17:22:10 +01:00
Yih-Dar	8f76dc8e5a	Skip failing `AlignModelTest::test_multi_gpu_data_parallel_forward` (#23374 ) fix Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2023-05-15 16:46:58 +02:00
Yih-Dar	81a73fa638	Fix issue introduced in PR #23163 (#23363 ) * fix * fix --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2023-05-15 11:38:44 +02:00
Shehan Munasinghe	c045249049	Add swiftformer (#22686 ) * Commit the automatically generated code using add-new-model-like * Update description at swiftformer.mdx file * remove autogenerated code for MaskedImageModeling * update weight conversion scripts * Update modeling_swiftformer.py * update configuration_swiftformer.py * Update test_modeling_swiftformer.py * update modeling code - remove einops dependency * Update _toctree.yml * update modeling code - remove copied from comments * update docs * Revert "update docs" This reverts commit `c2e05e2998`. * update docs * remove unused reference SwiftFormerImageProcessor * update dependency_versions_table.py * update swiftformer.mdx * update swiftformer.mdx * change model output type - no attentions * update model org name * Fix typo * fix copies * Update tests/models/swiftformer/test_modeling_swiftformer.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update src/transformers/models/auto/image_processing_auto.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update src/transformers/models/auto/feature_extraction_auto.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update docs/source/en/model_doc/swiftformer.mdx Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update src/transformers/models/swiftformer/configuration_swiftformer.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Apply suggestions from code review Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Apply suggestions from code review Co-Authored-By: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Apply suggestions from code review Co-Authored-By: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Apply suggestions from code review Co-Authored-By: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update modeling_swiftformer.py fix-copies * 
make style, make quality, fix-copies * Apply suggestions from code review Co-Authored-By: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Apply suggestions from code review Co-Authored-By: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * make style Co-Authored-By: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Add suggestions from code review Co-Authored-By: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Add suggestions from code review Co-Authored-By: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * make fix-copies * Update modeling_swiftformer.py * Update modeling_swiftformer.py * Add suggestions from code review Co-Authored-By: amyeroberts <22614925+amyeroberts@users.noreply.github.com> --------- Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>	2023-05-12 11:52:31 +01:00
Sylvain Gugger	4eea25b445	Fix image segmentation tool test (#23306 )	2023-05-11 14:38:11 -04:00
Alessandro Pietro Bardelli	83eda6435e	Better check for packages availability (#23163 ) * Better check for packages availability * amend _optimumneuron_available * amend torch_version * amend PIL detection and lint * lint * amend _faiss_available * remove overloaded signatures of _is_package_available * fix sklearn and decord detection * remove unused checks * revert	2023-05-11 13:52:22 -04:00
amyeroberts	e1eb3efd02	Temporarily increase tol for PT-FLAX whisper tests (#23288 )	2023-05-11 11:43:18 +01:00
amyeroberts	f82ee109e6	Temporary tolerance fix for flaky Whisper PT-TF equiv. test (#23257 ) * Temp tol fix for flaky Whisper test * Add equivalent update to TF tests	2023-05-11 10:04:07 +01:00
José Ángel Rey Liñares	0c65fb7cfa	chore: allow protobuf 3.20.3 requirement (#22759 ) * chore: allow protobuf 3.20.3 Allow latest bugfix release for protobuf (3.20.3) * chore: update auto-generated dependency table update auto-generated dependency table * run in subprocess * Apply suggestions from code review Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Apply suggestions --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com> Co-authored-by: Yih-Dar <2521628+ydshieh@users.noreply.github.com> Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>	2023-05-10 20:22:56 +02:00
Sylvain Gugger	3335724376	Test composition (#23214 ) * Remove nestedness in tool config * Really do it * Use remote tools descriptions * Work * Clean up eval * Changes * Tools * Tools * tool * Fix everything * Use last result/assign for evaluation * Prompt * Remove hardcoded selection * Evaluation for chat agents * correct some spelling * Small fixes * Change summarization model (#23172) * Fix link displayed * Update description of the tool * Fixes in chat prompt * Custom tools, custom prompt * Tool clean up * save_pretrained and push_to_hub for tool * Fix init * Tests * Fix tests * Tool save/from_hub/push_to_hub and tool->load_tool * Clean push_to_hub and add app file * Custom inference API for endpoints too * Clean up * old remote tool and new remote tool * Make a requirements * return_code adds tool creation * Avoid redundancy between global variables * Remote tools can be loaded * Tests * Text summarization tests * Quality * Properly mark tests * Test the python interpreter * And the CI shall be green. * fix loading of additional tools * Work on RemoteTool and fix tests * General clean up * Guard imports * Fix tools * docs: Fix broken link in 'How to add a model...' 
(#23216) fix link * Get default endpoint from the Hub * Add guide * Simplify tool config * Docs * Some fixes * Docs * Docs * Docs * Fix code returned by agent * Try this * Match args with signature in remote tool * Should fix python interpreter for Python 3.8 * Fix push_to_hub for tools * Other fixes to push_to_hub * Add API doc page * Docs * Docs * Custom tools * Pin tensorflow-probability (#23220) * Pin tensorflow-probability * [all-test] * [all-test] Fix syntax for bash * PoC for some chaining API * Text to speech * J'ai pris des libertés * Rename * Basic python interpreter * Add agents * Quality * Add translation tool * temp * GenQA + LID + S2T * Quality + word missing in translation * Add open assistance, support f-strings in evaluate * captioning + s2t fixes * Style * Refactor descriptions and remove chain * Support errors and rename OpenAssistantAgent * Add setup * Deal with typos + example of inference API * Some rename + README * Fixes * Update prompt * Unwanted change * Make sure everyone has a default * One prompt to rule them all. 
* SD * Description * Clean up remote tools * More remote tools * Add option to return code and update doc * Image segmentation * ControlNet * Gradio demo * Diffusers protection * Lib protection * ControlNet description * Cleanup * Style * Remove accelerate and try to be reproducible * No randomness * Male Basic optional in token * Clean description * Better prompts * Fix args eval in interpreter * Add tool wrapper * Tool on the Hub * Style post-rebase * Big refactor of descriptions, batch generation and evaluation for agents * Make problems easier - interface to debug * More problems, add python primitives * Back to one prompt * Remove dict for translation * Be consistent * Add prompts * New version of the agent * Evaluate new agents * New endpoints agents * Make all tools a dict variable * Typo * Add problems * Add to big prompt * Harmonize * Add tools * New evaluation * Add more tools * Build prompt with tools descriptions * Tools on the Hub * Let's chat! * Cleanup * Temporary bs4 safeguard * Cache agents and clean up * Blank init * Fix evaluation for agents * New format for tools on the Hub * Add method to reset state * Remove nestedness in tool config * Really do it * Use remote tools descriptions * Work * Clean up eval * Changes * Tools * Tools * tool * Fix everything * Use last result/assign for evaluation * Prompt * Remove hardcoded selection * Evaluation for chat agents * correct some spelling * Small fixes * Change summarization model (#23172) * Fix link displayed * Update description of the tool * Fixes in chat prompt * Custom tools, custom prompt * Tool clean up * save_pretrained and push_to_hub for tool * Fix init * Tests * Fix tests * Tool save/from_hub/push_to_hub and tool->load_tool * Clean push_to_hub and add app file * Custom inference API for endpoints too * Clean up * old remote tool and new remote tool * Make a requirements * return_code adds tool creation * Avoid redundancy between global variables * Remote tools can be loaded * Tests * Text 
summarization tests * Quality * Properly mark tests * Test the python interpreter * And the CI shall be green. * Work on RemoteTool and fix tests * fix loading of additional tools * General clean up * Guard imports * Fix tools * Get default endpoint from the Hub * Simplify tool config * Add guide * Docs * Some fixes * Docs * Docs * Fix code returned by agent * Try this * Docs * Match args with signature in remote tool * Should fix python interpreter for Python 3.8 * Fix push_to_hub for tools * Other fixes to push_to_hub * Add API doc page * Fixes * Doc fixes * Docs * Fix audio * Custom tools * Audio fix * Improve custom tools docstring * Docstrings * Trigger CI * Mode docstrings * More docstrings * Improve custom tools * Fix for remote tools * Style * Fix repo consistency * Quality * Tip * Cleanup on doc * Cleanup toc * Add disclaimer for starcoder vs openai * Remove disclaimer * Small fixed in the prompts * 4.29 * Update src/transformers/tools/agents.py Co-authored-by: Lysandre Debut <lysandre.debut@reseau.eseo.fr> * Complete documentation * Small fixes * Agent evaluation * Note about gradio-tools & LC * Clean up agents and prompt * Apply suggestions from code review Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com> * Apply suggestions from code review Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com> * Note about gradio-tools & LC * Add copyrights and address review comments * Quality * Add all language codes * Add remote tool tests * Move custom prompts to other docs * Apply suggestions from code review Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * TTS tests * Quality --------- Co-authored-by: Lysandre <hi@lyand.re> Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com> Co-authored-by: Philipp Schmid <32632186+philschmid@users.noreply.github.com> Co-authored-by: Connor Henderson <connor.henderson@talkiatry.com> Co-authored-by: Lysandre <lysandre.debut@reseau.eseo.fr> Co-authored-by: Lysandre 
<lysandre@huggingface.co> Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>	2023-05-09 20:37:57 -04:00
Sylvain Gugger	b4d4d6fe87	Add RWKV-4 (#22797 ) * First draft of RWKV-4 * Add support for generate * Style post-rebase * Properly use state * Write doc * Fix doc * More math * Add model to README, dummies and clean config * Fix init * multiple fixes: - fix common tests - fix configuraion default values - add CI test for checking state computation - fix some CI tests * correct tokenizer * some tweaks - fix config docstring - fix failing tests * fix CI tests - add output_attention / output_hidden_states - override test_initialization - fix failing CIs * fix conversion script - fix sharded case - add new arguments * add slow tests + more fixes on conversion script * add another test * final fixes * change single name variable * add mock attention mask for pipeline to work * correct eos token id * fix nits * add checkpoints * Apply suggestions from code review Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * add `tie_word_embeddings` in docstring * change tensor name * fix final nits * Trigger CI --------- Co-authored-by: younesbelkada <younesbelkada@gmail.com> Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com> Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>	2023-05-09 13:04:10 -04:00
Matthijs Hollemans	7f91950901	audio_utils improvements (#21998 ) * silly change to allow making a PR * clean up doc comments * simplify hertz_to_mel and mel_to_hertz * fixup * clean up power_to_db * also add amplitude_to_db * move functions * clean up mel_filter_bank * fixup * credit librosa & torchaudio authors * add unit tests * tests for power_to_db and amplitude_to_db * add mel_filter_bank tests * rewrite STFT * add convenience spectrogram function * missing transpose * fewer transposes * add integration test to M-CTC-T * frame length can be either window or FFT length * rewrite stft API * add preemphasis coefficient * move argument * add log option to spectrogram * replace M-CTC-T feature extractor * fix api thing * replace whisper STFT * replace whisper mel filters * replace tvlt's stft * allow alternate window names * replace speecht5 stft * fixup * fix integration tests * fix doc comments * remove manual FFT length calculation * fix docs * go away, deprecation warnings * combine everything into spectrogram function * add deprecated functions back * fixup	2023-05-09 09:10:17 -04:00
Joao Gante	bbfb9fc22b	Generate: starcoder 🤜 🤛 assisted generation (#23182 ) * starcoder has joined the chat * indexing that works for all	2023-05-08 10:45:40 +01:00
Bartosz Szmelczynski	6f8a02844a	fix random attention for pytorch's bigbird/pegasus_bigbird (#23056 ) * fix random attention usage for bigbird and pegasus_bigbird * remove staticmethod, update tests target values * revert style changes	2023-05-07 18:55:04 -04:00
raghavanone	312b104ff6	Add FlaxWhisperForAudioClassification model (#23173 ) * Add FlaxWhisperForAudioClassification model * Add models to init * Add models to init * Fix copies * Fix automapping * Fix failing test	2023-05-05 13:23:46 -04:00
Connor Henderson	17083b9b84	fix: Passing language as acronym to Whisper generate (#23141 ) * add fix * address comments * remove error formatting	2023-05-05 11:52:19 -04:00
Sylvain Gugger	01734dba84	Revert "Add FlaxWhisperForAudioClassification model" (#23154 ) Revert "Add FlaxWhisperForAudioClassification model (#22883)" This reverts commit `c8f2c5c56e`.	2023-05-04 13:47:07 -04:00
Joao Gante	b369e507aa	Generate: text generation pipeline no longer emits `max_length` warning when it is not set (#23139 )	2023-05-04 18:36:23 +01:00
raghavanone	c8f2c5c56e	Add FlaxWhisperForAudioClassification model (#22883 ) * Add FlaxWhisperForAudioClassification model * Add models to init * Add models to init * Fix copies * Fix automapping	2023-05-04 13:00:16 -04:00
peter-sk	83b38fbea8	GPTNeoXForQuestionAnswering (#23059 ) * first draft - gives index error in question_answering.py * maturing * no labels * pipeline should know about QA * fixing checks * formatting * fixed docstring * initial commit * formatting * adding the class to many places * towards less unhappy checks * nearly there * and gpt neox for qa * use right model * forgot this one * base_model_prefix is "gpt_neox" for GPTNeoX* models * unnecessary stuff * Update src/transformers/models/gpt_neox/modeling_gpt_neox.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * format * Update src/transformers/models/gpt_neox/modeling_gpt_neox.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * removed gpt2 stuff --------- Co-authored-by: Prof. Peter Schneider-Kamp <jps@ordbogen.com> Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>	2023-05-04 10:15:15 -04:00
amyeroberts	90e8263d91	Add methods to update and verify out_features out_indices (#23031 ) * Add methods to update and verify out_features out_indices * Safe update for config attributes * Fix function names * Save config correctly * PR comments - use property setters * PR comment - directly set attributes * Update test * Add updates to recently merged focalnet backbone	2023-05-04 10:15:06 +01:00
peter-sk	78b7debf56	GPTNeoForQuestionAnswering (#23057 ) * first draft - gives index error in question_answering.py * maturing * no labels * pipeline should know about QA * fixing checks * formatting * fixed docstring * initial commit * formatting * adding the class to many places * towards less unhappy checks * nearly there * Update src/transformers/models/gpt_neo/modeling_gpt_neo.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * avoid error * moving to device of star/end_logits --------- Co-authored-by: Prof. Peter Schneider-Kamp <jps@ordbogen.com> Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>	2023-05-03 15:59:19 -04:00
Alara Dirik	b0a78091a5	Remove redundant print statements (#23133 ) remove redundant print statements	2023-05-03 18:04:48 +01:00
Alara Dirik	441658dd6c	Add focalnet backbone (#23104 ) Adds FocalNet backbone to return features from all stages	2023-05-03 19:32:42 +03:00
Mayank Agarwal	c4e32e206f	Add support for beam search's num_return_sequences flag in flax (#23082 ) * add code for numReturnSeq * add flax support for num return sequences * Make Fix up for changes * add test for num return sequences * lint	2023-05-03 10:50:34 -04:00
Xuehai Pan	ee4bc07474	Support union types `X | Y` syntax for `HfArgumentParser` for Python 3.10+ (#23126 ) * Support union types `X | Y` syntax for `HfArgumentParser` for Python 3.10+ * Add tests for PEP 604 for `HfArgumentParser` * Reorganize tests	2023-05-03 10:49:54 -04:00
Joao Gante	ce31e3c8bf	Generate: slow assisted generation test (#23125 )	2023-05-03 14:24:50 +01:00
peter-sk	2b0c924568	GPT2ForQuestionAnswering (#23030 ) * first draft - gives index error in question_answering.py * maturing * no labels * pipeline should know about QA * fixing checks * formatting * fixed docstring * make sure legacy code executes * comment * like this --------- Co-authored-by: Prof. Peter Schneider-Kamp <jps@ordbogen.com>	2023-05-02 09:25:46 -04:00
Ashwin Mathur	487f132a6f	Add `BioGPTForSequenceClassification` (#22253 ) * added BioGptForSequenceClassification * added source of copied code * typo * Format code with black * Update comments for copied code * Remove code copy comment * Apply suggestions from code review Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> * Fix failing tests * Update code copied from comments * Fix code quality * Update src/transformers/models/biogpt/modeling_biogpt.py Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> * Apply suggestions from code review Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> * Fix lint error * Update src/transformers/models/biogpt/modeling_biogpt.py Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> * Rename model to biogpt for consistency * Add PipelineTesterMixin to test_modeling_biogpt.py * Apply suggestions from code review Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Resolve merge confict --------- Co-authored-by: Guillem García Subies <37592763+GuillemGSubies@users.noreply.github.com> Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>	2023-05-01 09:17:27 -04:00
Joao Gante	849367ccf7	Generate: prepare assisted generation for release (#23052 )	2023-04-29 10:53:30 +01:00
s-JoL	c2c99dc7ef	add open-llama model with ckpt (#22795 ) * update Open-Llama model * update * update format * update doc * update * update stable embedding test * update test case * update format * update readme * fix typo * update name * remove tokenizer and update format * remove convert_open_llama_weights_to_hf * update warning and doc_string --------- Co-authored-by: songliang.bayesian <songliang.bayesian@bytedance.com>	2023-04-28 11:01:32 -04:00
Yih-Dar	0bf34b1c9f	Skip pt/flax equivalence tests in pytorch `bigbird` test file (#23040 ) skip Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2023-04-28 17:00:13 +02:00
Maxime Méloux	9b435204b1	Add Trainer support for ReduceLROnPlateau (#23010 ) * Add Trainer support for ReduceLROnPlateau Fixes #16503 * Remove training argument and add default instance --------- Co-authored-by: mmeloux <maxime.meloux@loria.fr>	2023-04-28 09:17:30 -04:00
Yih-Dar	cf7baf4060	Make `_test_xla_generate` less flaky (#22996 ) * fix --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2023-04-28 13:27:28 +02:00
Bartosz Szmelczynski	88399476c3	Fix bigbird random attention (#21023 ) * switch np.random.permutation to jax.random.permutation * remove comments * remove leftover comment * skip similarity tests * modify indices_prng_key usage, add deterministic behaviour * update style * remove unused import * remove copy statement since classes are not identical * remove numpy import * revert removing copied from statements * make style from copied * remove copied from statement * update copied from statement to include only np.ndarray * add deterministic args, unittestskip equivalence tests	2023-04-27 13:52:28 -04:00
Yih-Dar	27b66bea01	Update `BridgeTowerModelTester` (#23029 ) * update --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2023-04-27 18:26:17 +02:00
peter-sk	d65b14ed67	added GPTNeoForTokenClassification (#22908 ) * added GPTNeoForTokenClassification * add to top-level init * fixup * test * more fixup * add to gpt_neo.mdx * repo consistency * dummy copy * fix copies * optax >= 0.1.5 assumes jax.Array exists - which it doesn't for jax <= 0.3.6 * merge with main made this superfluous * added classifier_dropout * remove legacy code * removed fmt:on/off removed expected_outputs * doc style fix * classifier_dropout is always in config --------- Co-authored-by: Prof. Peter Schneider-Kamp <jps@ordbogen.com>	2023-04-27 12:10:03 -04:00
peter-sk	614e191c4d	added GPTNeoXForTokenClassification (#23002 ) * initial commit * added GPTNeoXForTokenClassification * typo * doc fixed extra comma that turned into a tuple * unifying variable names fixing forward call * classifier_dropout is in config Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> --------- Co-authored-by: Prof. Peter Schneider-Kamp <jps@ordbogen.com> Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>	2023-04-27 11:08:26 -04:00
Yih-Dar	a4908da04e	Fix the expected error in `test_offline_mode_pipeline_exception` (#23022 ) * fix --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2023-04-27 14:22:05 +02:00
fxmarty	3042c63a95	Add methods to PreTrainedModel to use PyTorch's BetterTransformer (#21259 ) * fix mess * better documentation * typo * fix doc * update * add test * fix test * more tests * Update src/transformers/modeling_utils.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * move to utils * Apply suggestions from code review Co-authored-by: Michael Benayoun <mickbenayoun@gmail.com> * nit --------- Co-authored-by: younesbelkada <younesbelkada@gmail.com> Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com> Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> Co-authored-by: Michael Benayoun <mickbenayoun@gmail.com>	2023-04-27 11:03:42 +02:00
Younes Belkada	304aacac90	🚨🚨🚨 [`Pix2Struct`] Attempts to fix training issues 🚨🚨🚨 (#23004 ) * multiple fixes - add `add_special_tokens` to `True` by default - remove label smoothing and labels masking * fix test	2023-04-26 18:29:25 +02:00
Ritik Nandwal	20ac86c6f1	Add TensorFlow Wav2Vec2 for sequence classification (#22073 ) * Add initial changes for TF wav2vec2 for sequence classification * Add suggested changes * Add serving and serving output methods * Add serving_output implementation and fix layer_weights * Add fixes * Fixed test cases * Fixing test and adding suggested changes	2023-04-26 13:35:30 +01:00
Lingepumpe	5427250351	Avoid invalid escape sequences, use raw strings (#22936 ) * Avoid invalid escape sequences, use raw strings * Integrate PR feedback	2023-04-25 09:17:56 -04:00
Joao Gante	e4a97f82bf	Generate: assisted generation with sample (take 2) (#22949 ) * temperature controls speed	2023-04-24 19:54:55 +01:00
Lucain	74c55ab9e5	Prepare tests for hfh 0.14 (#22958 ) * Test hf_hub 0.14.0rc1 * fix mocked tests * package version --------- Co-authored-by: Sylvain Gugger <Sylvain.gugger@gmail.com> Co-authored-by: testbot <lucainp@hf.co>	2023-04-24 09:31:50 -04:00
Yih-Dar	3f6a4b5bd7	Decorate `test_codegen_sample_max_time` as flaky (#22953 ) * fix --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2023-04-24 15:27:31 +02:00
Yih-Dar	975159bb61	Update tiny models and a few fixes (#22928 ) * run_check_tiny_models * update summary * update mixin * update pipeline_model_mapping * update pipeline_model_mapping * Update for gpt_bigcode --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2023-04-24 14:45:22 +02:00
NielsRogge	3d3204c025	Add FocalNet (#21532 ) Adds FocalNet by Microsoft to transformers --------- Co-authored-by: Niels Rogge <nielsrogge@Nielss-MacBook-Pro.local> Co-authored-by: alaradirik <alaradirik@gmail.com>	2023-04-23 20:03:05 +03:00
Arthur	7579a52b55	Small sam patch (#22920 ) * patch * add test * move tests * cover more cases (will fail nw update the code) * style * fix * Update src/transformers/models/sam/image_processing_sam.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update src/transformers/models/sam/image_processing_sam.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * add better check --------- Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com> Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> Co-authored-by: younesbelkada <younesbelkada@gmail.com>	2023-04-21 21:41:18 +02:00
Connor Henderson	b950c38565	tests: Fix flaky test for NLLB-MoE (#22880 ) * add test update and docs edits * docs edit suggestion	2023-04-21 17:09:40 +01:00
Arthur	eddf9eeca0	[CI] clap patch fusion test values (#22922 ) * patch test with values * lower tol	2023-04-21 11:22:07 -04:00
Yih-Dar	1e1cb6f8e5	Fix `FillMaskPipelineTests` (#22894 ) * fix --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2023-04-21 15:16:45 +02:00
Matthijs Hollemans	ec93b895c1	fix CLAP integration tests (#22834 ) * integration tests were not being run * add tests for short input waveform * rewrite test for long input * even more betterer * my bad * oh boy	2023-04-21 11:04:15 +01:00
Yih-Dar	397720fb14	Skip a failing test on main for now (#22911 ) fix Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2023-04-21 10:22:54 +02:00
Arthur	f143037789	Add `automatic-mask-generation` pipeline for Segment Anything Model (SAM) (#22840 ) * cleanup * updates * more refactoring * make style * update inits * support other inputs in base * update based on review Co-authored-by: Nicolas Patry <patry.nicolas@gmail.com> * Update tests/pipelines/test_pipelines_automatic_mask_generation.py Co-authored-by: Nicolas Patry <patry.nicolas@protonmail.com> * update * fixup * TODO x and y to refactor, _h _w refactored here * update docstring * more nits * style on these * more doc fix * rename variables * update * updates * style * update * fix `_mask_to_rle_pytorch` * styling * fix ask to rle, wrong outputs * add device arg * update * more updates, fix tets * udpate * update docstrings * styling * fixup * add notebook on the docs * update orginal sizes * fix docstring * updat condition on point_per-batch * updates tests * fix CI test * extend is required, append does not work! * fixup * fix CI tests * whit pixels left * address doc comments * fix doc * slow pipeline tests * update auto init * add revision * make fixup * update p!ipoeline tag when calling tests * alphabeitcal order in inits * fix copies * last style nits * Apply suggestions from code review Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * reformat docstring * more reformat * address most of the comments * Update src/transformers/pipelines/mask_generation.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * final refactor * Update src/transformers/models/sam/image_processing_sam.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * fixup and fix slow tests * revert --------- Co-authored-by: Nicolas Patry <patry.nicolas@gmail.com> Co-authored-by: Nicolas Patry <patry.nicolas@protonmail.com> Co-authored-by: younesbelkada <younesbelkada@gmail.com> Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com> Co-authored-by: amyeroberts 
<22614925+amyeroberts@users.noreply.github.com>	2023-04-20 19:27:24 +02:00
Matt	6dc0a849b7	Fix weight tying in TF-ESM (#22839 ) Fix weight tying in ESM	2023-04-20 15:50:31 +01:00
Joao Gante	4060d6857e	XGLM: Fix left-padding (PT and TF) (#22828 )	2023-04-20 10:01:56 +01:00
Arthur	474bf508df	Add Segment Anything Model (SAM) (#22654 ) * initial commit * keys match * update, fix conversion * fixes, inference working * fix * more fixes * more fixes * clean up * more clean up * fix copies and add convext copied layer norm * stash * pretty big upfate * cleaning * more cleaning * fixup stuffs * fix copies * fix iinit * update test removing tokenizer * nits * add pretrained * more nits * remove tracking of pipeline * few fixes * update san and conversion script * fix mask decoder and prompt encoder conversion * fixes * small update * fix order * fix * fix image embeddings * nites * few fixes * fix logits * clean up * fixes boxes inference * v1 AMG * clean up * some clean up * multi points support * amg working * fixup * clean up * readme * update toctree * fix type hint * multiple fixes * fixup * fixes * updates * updates * more tests * few fixes * change to `SamForMaskGeneration` * doc * fixup * fix more tests * multiple fixes * fix CI tests * refactor processor * renamings * draft the pipeline * refactor * fix tests * fix test * few cleanings * fix test * edit pipelien support chunking * udate * add slow tests * fix nit * fixup * fix nit * current chunk pipleine * cast boxes in fp32 * nit * current updates * piepleine works * fixup * clean up config * fix slow tests * fix slow tests * clean up * update doc and pipeline * adds more slow tests * fix slow tests * cleaning * tests pass * add docstring * fix copies * clean up * support batch of images * style * dummy is needed, add tests * fix slow tests * fix CI * update * adds more tests * fixes * fixes * fixup * fixes * few fixes * filter * few fixes * some refactor * touches finales * fix * style * remove pipeline files * fixes nits * revert pipeline changes * fix test * fixup * remove automodel for automatic mask generation * fix failing torch tests * update mdx * revert removal of `MODEL_FOR_AUTOMATIC_MASK_GENERATION_MAPPING` * update sam config based on review Co-authored-by: amyeroberts 
<aeroberts4444@gmail.com> Co-authored-by: sgugger <sylvain.gugger@gmail.com> * update low_resolution_masks -> pred_masks inti ln with layer_norm_eps add_decomposed_rel_pos doc forward doc of SamForMaskGeneration * update processor docstring * remove image processor import empty * update for testing * output vision hidden states + clean recomm also test all iou values * fixup * fixup * remove unused * Update src/transformers/models/sam/modeling_sam.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update src/transformers/models/sam/image_processing_sam.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * nits * fix * fix CI tests and slow tests * replace with Amy's processor * clearer docstring * add `SamVisionNeck` * refactor - all CI tests should pass * fix broken import on Gcolab * few fixes here and there * fix another bug * fix more bugs * update and merge * correct ckpt * address comments * add tips * revert * fix docstring * replace with `SamModel` * make fixup * add support for bathed images and batch ed points * make fixup this time, really * make fixup again and again * few fixes here and there, this should be the touche finale * Update docs/source/en/model_doc/sam.mdx * fixup * correct checkpoints * correct name * rm unneeded file * add notebook --------- Co-authored-by: younesbelkada <younesbelkada@gmail.com> Co-authored-by: amyeroberts <aeroberts4444@gmail.com> Co-authored-by: sgugger <sylvain.gugger@gmail.com> Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>	2023-04-19 21:01:49 +02:00
Yih-Dar	06bab00338	Remove some pipeline skip cases (#22865 ) fix Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2023-04-19 20:27:19 +02:00
Sylvain Gugger	5f9b825c89	Use code on the Hub from another repo (#22814 ) * initial work * Add other classes * Refactor code * Move warning and fix dynamic pipeline * Issue warning when necessary * Add test * Do not skip auto tests * Fix failing tests * Refactor and address review comments * Address review comments	2023-04-18 13:46:11 -04:00
Joao Gante	78cda46f17	Generate: Add assisted generation (#22211 ) * working mvp * remove breakpoint * fix commit * standardize outputs * tmp commit * tests almost ready * tmp commit * skip a few models * Add streaming; Docs and examples * document limitations * PR commits * Amy PR comments	2023-04-18 17:36:56 +01:00
Yih-Dar	90247d3e01	Fix `test_eos_token_id_int_and_list_top_k_top_sampling` (#22826 ) * fix --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2023-04-18 16:04:51 +02:00
Matthijs Hollemans	ac2bc50a10	TTS fine-tuning for SpeechT5 (#21824 ) * wrong argument name * append eos_token_id * all tokenizers need mask and ctc_blank tokens * remove reduction factor from feature extractor * add proper TTS loss * did shifting the wrong way around * mask out padded portions * remove logits again (don't really need it) * fix unit tests * fixup * pad also returns the decoder attention mask, since that's useful to have * clean up feature extractor logic * pad can handle TTS task too * remove stop_labels from loss calculation * simplify logic * fixup * do -100 masking properly * small STFT optimization (calculate mel filterbanks only once) * replace torchaudio fbanks with audio_utils * remove torchaudio dependency * simplify & speed up the STFT * don't serialize window and mel filters * output cross attentions when generating speech * add guided attention loss * fix failing test * Update src/transformers/models/speecht5/feature_extraction_speecht5.py Co-authored-by: Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com> * Update src/transformers/models/speecht5/modeling_speecht5.py Co-authored-by: Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com> * change type annotation of attention_mask to LongTensor * extract loss into class * remove unused frame_signal_scale argument * use config object in loss class * fix type annotations in doc comments * change optional to just bool * implement missing tokenizer method * add deprecation warning * Update src/transformers/models/speecht5/feature_extraction_speecht5.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update src/transformers/models/speecht5/feature_extraction_speecht5.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * add deprecation warning for stop_labels --------- Co-authored-by: Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com> Co-authored-by: Sylvain Gugger 
<35901082+sgugger@users.noreply.github.com>	2023-04-18 10:12:30 +01:00
Zachary Mueller	03462875cc	Introduce `PartialState` as the device handler in the `Trainer` (#22752 ) * Use accelerate for device management * Add accelerate to setup Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>	2023-04-17 15:09:45 -04:00
Sylvain Gugger	50caa20628	Revert "Use code on the Hub from another repo" (#22813 ) Revert "Use code on the Hub from another repo (#22698)" This reverts commit `ea7b0a539a`.	2023-04-17 14:22:13 -04:00
Yih-Dar	5269718cb7	Don't use `LayoutLMv2` and `LayoutLMv3` in some pipeline tests (#22774 ) * fix --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2023-04-17 17:45:20 +02:00
Sylvain Gugger	ea7b0a539a	Use code on the Hub from another repo (#22698 ) * initial work * Add other classes * Refactor code * Move warning and fix dynamic pipeline * Issue warning when necessary * Add test	2023-04-17 11:36:29 -04:00
Yih-Dar	76d24f1a83	Fix `test_word_time_stamp_integration` for `Wav2Vec2ProcessorWithLMTest` (#22800 ) * fix --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2023-04-17 12:41:55 +02:00
Joao Gante	9af845afc2	Generate: pin number of beams in BART test (#22763 )	2023-04-14 09:57:25 +01:00
Yih-Dar	410b61ad7e	Revert (for now) the change on `Deta` in #22437 (#22750 ) fix Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2023-04-13 21:32:29 +02:00
Joao Gante	9dfd6a4baa	Generate: handle text conditioning with multimodal encoder-decoder models (#22748 )	2023-04-13 19:51:13 +01:00
Yih-Dar	32b08742a5	`DocumentQuestionAnsweringPipeline` only for fast ⚡ tokenizers (#22745 ) * fix --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2023-04-13 17:22:59 +02:00
NielsRogge	8eb38f638d	[Pix2struct] Simplify generation (#22527 ) * Add model to doc tests * Remove generate and replace by prepare_inputs_for_generation * More fixes * Remove print statements * Update integration tests * Fix generate * Remove model from auto mapping * Use auto processor * Fix integration tests * Fix test * Add inference code snippet * Remove is_encoder_decoder * Update docs * Remove notebook link	2023-04-13 09:01:14 -04:00
Matt	50f82e1282	Fix docstrings for TF BLIP (#22618 ) * Fix docstrings for TFBLIP * Fix missing line in TF port! * Use values from torch tests now other bugs fixed * Use values from torch tests now other bugs fixed * Fix doctest string	2023-04-12 17:46:41 +01:00
Stas Bekman	1306b7d3ae	[tests] switch to torchrun (#22712 )	2023-04-12 08:25:45 -07:00
Younes Belkada	370f0ca18c	[`bnb`] Let's make serialization of int8 models possible (#22177 ) * make serialization of int8 models possible * make fixup * add docs * add ability to push to hub and save pretrained * fixes * more addition * more tests * fix issues * change variable * clearer message * adapt from suggestions * few fixes * remove unused function * Update src/transformers/utils/quantization_config.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * address last comments * last warning * clarify doc * protect import * Update src/transformers/modeling_utils.py * Apply suggestions from code review Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> --------- Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>	2023-04-12 08:01:18 -04:00
pioliverse	523ca4e016	add model resources for CPMAnt (new) (#20906 ) * resolve conflicts * rebase and make style * test * test * test * rebase and make style * rebase and make style * tests * tests * rewrite some functions * rebase and make style * fix load_tf_weights_in_cpmant * reformat some unrelated files * upgrade quality * fix some bugs & docstring * add models and tests * solve conflicts * resolve conflicts * resolve conflicts * resolve conflicts * resolve conflicts * tests * resolve conflicts * resolve conflicts * fix load_tf_weights_in_cpmant * reformat some unrelated files * upgrade quality * fix some bugs & docstring * save resolution * make style * delete redefinition code * reformat function * reformat * resolve conflicts * resolve conflicts * resolve conflicts * resolve conflicts * resolve conflicts * tests * resolve conflicts * resolve conflicts * fix load_tf_weights_in_cpmant * reformat some unrelated files * upgrade quality * resolve conflicts * resolve conflicts * resolve conflicts * resolve conflicts * resolve conflicts * fix load_tf_weights_in_cpmant * reformat some unrelated files * upgrade quality * resolve conflicts * make style * fix bugs and refactor * modify docstrings and make style * unify import format in __init__.py * fix import-altclp bug * fix copies to update index.md * fix unused config parameters * fix unused config parameters * fix unused config parameters * update README_ja.md * dummy commit for unit test * fix attention mask * add CPMAntTokenizer&-Fast to auto-mapping * drop redundant changes in README_ko * fix defaults in docstring * fix use_cache and some docstring * add missing args in tokenizer * modify tester inheritance * add is_jieba_available * fix some bugs * make style and fix-copies * add doctests * skip integration tests * add is_jieba_available * fix bugs in common tests * adjust docstrings and make style * add argument docstring * adjust code to some specifications * make style and fix-copies * add fast 
tokenization test * dummy commit for unit test * dummy commit for unit test * dummy commit for unit test * normalize some comments and names * Bert->CPMAnt * camel names and drop redundant codes * make style and fix-coies * add CpmTokenizerFast _import_structure * drop cpmanttokenizerfast in model_doc * fix some problems * fix CPMAnt tokenization for common test * make style and fixup * fix copies and fixup * fix bugs in tokenization test * dummy commit for connection failure in unittest * fix copies * drop trailing comma * fix decorator in tests * dummy commit for connection failure in unittest --------- Co-authored-by: Gong Baitao <gongbaitao11@gmail.com>	2023-04-12 07:33:20 -04:00
Yih-Dar	fe1f5a639d	Fix decorator order (#22708 ) fix Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2023-04-11 17:59:15 +02:00
Yih-Dar	4c01231e67	Update some `MarkupLM` tests' expected values (#22667 ) fix Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2023-04-11 10:00:34 +02:00
Sugawara	6daa9cb515	add GPTNeoXForSequenceClassification (#22671 ) * add GPTNeoXForSequenceClassification * move the labels to logits.device (ref: #22561) * fix	2023-04-10 11:52:23 -04:00
Sylvain Gugger	3876fc6839	Make dynamic code work with offline mode (#22661 ) * Make dynamic code work with offline mode * Clean up * Quality	2023-04-10 08:49:42 -04:00
Joel Lamy-Poirier	e0921c6b53	Add GPTBigCode model (Optimized GPT2 with MQA from Santacoder & BigCode) (#22575 ) * Add model with cli tool * Remove unwanted stuff * Add new code * Remove inference runner * Style * Fix checks * Test updates * make fixup * fix docs * fix doc * fix test * hopefully fix pipeline tests * refactor * fix CIs * add comment * rename to `GPTBigCodeForCausalLM` * correct readme * make fixup + docs * make fixup * fixes * fixes * Remove pruning * Remove import * Doc updates * More pruning removal * Combine copies * Single MQA implementation, remove kv cache pre-allocation and padding * Update doc * Revert refactor to match gpt2 style * Merge back key and value caches, fix some type hints * Update doc * Fix position ids pith padding (PR 21080) * Add conversion script temporarily * Update conversion script * Remove checkpoint conversion * New model * Fix MQA test * Fix copies * try fix tests * FIX TEST!! * remove `DoubleHeadsModel` * add MQA tests * add slow tests * clean up * add CPU checker * final fixes * fixes - fix GPU issue - fixed slow tests - skip disk offload * fix final issue * Simplify and comment baddbmm fix * Remove unnecessary code * Transpose tweaks * Use beta=1 on cpu, improve tests --------- Co-authored-by: younesbelkada <younesbelkada@gmail.com>	2023-04-10 10:57:21 +02:00
Arthur	f33419261a	[OPT] Fix default attention mask size (#22649 ) * Fix default attention mask size * fixup * add a test to make sure that even if attention mask are not provided, works * style	2023-04-07 20:12:57 +02:00
Yih-Dar	14d5b2b645	Fix `MegaModel` CI (#22652 ) * fix --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2023-04-07 17:13:04 +02:00
Yih-Dar	c7ec71baf5	Update tiny model summary file for recent models (#22637 ) * Update tiny model summary file for recent models --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2023-04-06 22:52:59 +02:00
Younes Belkada	ed67286465	[`Blip`] Fix slow tests and doctests with correct values (#22632 ) fix slow tests and doctests	2023-04-06 19:12:51 +02:00
Yih-Dar	fa01127a67	update_pip_test_mapping (#22606 ) * Add TFBlipForConditionalGeneration * update pipeline_model_mapping * Add import * Revert changes in GPTSanJapaneseTest --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2023-04-06 17:56:06 +02:00
Yih-Dar	2c22bc79c2	Make tiny model creation + pipeline testing more robust (#22500 ) * Final Tiny things --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2023-04-06 17:45:55 +02:00
amyeroberts	12d51db243	Backbone add mixin tests (#22542 ) * Add out_indices to backbones, deprecate out_features * Update - can specify both out_features and out_indices but not both * Add backbone mixin tests * Test tidy up * Add test_backbone for convnext * Remove redefinition of method * Update for Dinat and Nat backbones * Update tests * Smarter indexing * Add checks on config creation for backbone * PR comments	2023-04-06 13:50:15 +01:00
Nicolas Patry	0aa1153ffb	Revert error back into warning for byte fallback conversion. (#22607 )	2023-04-06 14:00:29 +02:00
Nicolas Patry	1670be4bde	Adding Llama FastTokenizer support. (#22264 ) * Adding Llama FastTokenizer support. - Requires https://github.com/huggingface/tokenizers/pull/1183 version - Only support byte_fallback for llama, raise otherwise (safety net). - Lots of questions are special tokens How to test:
```python
from transformers.convert_slow_tokenizer import convert_slow_tokenizer
from transformers import AutoTokenizer
from tokenizers import Tokenizer

tokenizer = AutoTokenizer.from_pretrained("huggingface/llama-7b")

if False:
    new_tokenizer = Tokenizer.from_file("tok.json")
else:
    new_tokenizer = convert_slow_tokenizer(tokenizer)
    new_tokenizer.save("tok.json")

strings = [
    "This is a test",
    "生活的真谛是",
    "生活的真谛是[MASK]。",
    # XXX: This one is problematic because of special tokens
    # "<s> Something something",
]

for string in strings:
    encoded = tokenizer(string)["input_ids"]
    encoded2 = new_tokenizer.encode(string).ids

    assert encoded == encoded2, f"{encoded} != {encoded2}"

    decoded = tokenizer.decode(encoded)
    decoded2 = new_tokenizer.decode(encoded2)

    assert decoded.strip() == decoded2, f"{repr(decoded)} != {repr(decoded2)}"
```
The converter + some test script. The test script. Tmp save. Adding Fast tokenizer + tests. Adding the tokenization tests. Correct combination. Small fix. Fixing tests. Fixing with latest update. Rebased. fix copies + normalized added tokens + copies. Adding doc. TMP. Doc + split files. Doc. Versions + try import. Fix Camembert + warnings -> Error. Fix by ArthurZucker. Not a decorator. * Fixing comments. * Adding more to docstring. * Doc rewriting.	2023-04-06 09:53:03 +02:00
Matt	e577bd0f13	Use native TF checkpoints for the BLIP TF tests (#22593 ) * Use native TF checkpoints for the TF tests * Remove unneeded exceptions	2023-04-05 18:43:14 +01:00
Matt	2a91a9ef66	Fix PT-TF equivalence test for GPT1 (#22586 ) * Re-enable skipped test and fix the hidden state shape issue * Actually fix the bug instead of just doing something wrong	2023-04-05 13:16:00 +01:00
Joao Gante	861ff890d6	Generate: `TextIteratorStreamer` timeout (#22576 )	2023-04-05 09:57:46 +01:00
Sylvain Gugger	11fd2c773b	Skip failing test	2023-04-04 21:26:17 -04:00
Matt	edb704b26e	Fix inverted conditional in TF common test! (#22540 ) * Fix inverted conditional in TF common test! * Make the same change in the PT tests file * Make sure hidden states for GPT2 have the same output shape in PT/TF * Minor fix to PT implementation of token classification loss * Skip loss equivalence test for TFHubert because it keeps overflowing to inf * Compute LM loss for TF the (weird) way it's computed in PT * Skip loss equivalence test for Wav2Vec2 for the same reason as Hubert * Fix - don't try to access the hidden states property when output is a tuple	2023-04-04 21:59:54 +01:00
Shubhamai	900677487d	Flax Regnet (#21867 ) * initial commit * review changes * post model PR merge * updating doc	2023-04-04 12:41:12 -04:00
Matt	5f3ea66bc0	Add TF port of BLIP (#22090 ) * Initial commit * more stash commit * Yet another stash commit * yet more stash commit * Mostly working except for docs / repo consistency * Stop importing model list from torch file * Add TF BLIP models to docs * Add auto classes * Move get_text_features and get_image_features * Update src/transformers/models/blip/modeling_tf_blip.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update src/transformers/models/blip/modeling_tf_blip.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update src/transformers/models/blip/modeling_tf_blip.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update src/transformers/models/blip/modeling_tf_blip_text.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update src/transformers/models/blip/modeling_tf_blip.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update src/transformers/models/blip/modeling_tf_blip.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update src/transformers/models/blip/modeling_tf_blip.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update src/transformers/models/blip/modeling_tf_blip.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update src/transformers/models/blip/modeling_tf_blip.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update tests/models/blip/test_modeling_tf_blip.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update tests/models/blip/test_modeling_tf_blip.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update src/transformers/models/blip/modeling_tf_blip.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update src/transformers/models/blip/modeling_tf_blip.py 
Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com> * Update tests/models/blip/test_modeling_tf_blip_text.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update src/transformers/models/blip/modeling_tf_blip_text.py Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com> * Update src/transformers/models/blip/modeling_tf_blip.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Use channels_last convolutions in TF (better performance + compatibility) * Remove _shape function * Move multi-line statement to one line in PT + TF * Specify tf.keras.layers instead of importing from it * Remove test_gradient_checkpointing and empty test_training methods * move some multi-line statements to one line * Update docstring for generate * Remove pruned heads set * Remove self.seq_len_dim * Fixed issues with loss computation, should resolve some tests. Also ensured that the PT version follows the config for output_attentions and output_hidden_states * ensure original model follows config in more cases * Skip the same cross-attention tests in the PT tests - didn't realize we did it twice! 
* Add training args throughout the models and layers * make fixup * Fix docstring for inputs_embeds * Add docstring for is_decoder * Add docstrings to text models * Remove redundant computation * Add unpack_inputs / keras_serializable * Add modeling_tf_blip to doctests * Add config classes for keras serialization * Changes to allow model porting with pt-to-tf * Quick fix to decoder head and test tweaks * Revert an issue with masking the embeddings outputs * Allow missing keys in some equivalence tests (for unused layers) * Add tf-pt equivalence tests back in * Update src/transformers/models/blip/modeling_tf_blip.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update src/transformers/models/blip/modeling_tf_blip_text.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update src/transformers/models/blip/modeling_tf_blip_text.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * make fixup * Refactor invert_attention_mask out into tf_utils * Re-enable cross-tests on the PT side too --------- Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com> Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>	2023-04-04 16:05:22 +01:00
Nicolas Patry	a515d0a77c	Soft error whisper. (#22475 ) * Soft error whisper. * Fix format. --------- Co-authored-by: Ubuntu <ubuntu@ip-172-31-34-94.taildb5d.ts.net>	2023-04-04 16:21:57 +02:00
Viktor Scherbakov	871598be55	Implemented safetensors checkpoints save/load for Trainer (#22498 ) * implemented safetensors save/load * remove duplicated file * added tests * more tests * style fix * fix tf tests * change to list comprehension Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * review fixes + safe load for sharded checkpoint * style fix * remove rogue import * remove partial to avoid undefined exception * use naming alias instead of safetensors.torch * fix safe sharding in tests * grammar Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * update docs Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * update docs Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * minor corrections * style --------- Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>	2023-04-04 09:05:04 -04:00
Arthur	00b5887b94	🚨🚨🚨 `[NLLB Tokenizer]` Fix the prefix tokens 🚨🚨🚨 (#22313 ) * fix the prefix tokens * update fast and test values * add legacy behaviour Co-authored-by: sgugger <sylvain.gugger@gmail.com> * update disclaimer, linkissue PR and behaviral changes * Apply suggestions from code review Co-authored-by: Lysandre Debut <hi@lysand.re> * styling * make a quote * quote this time --------- Co-authored-by: sgugger <sylvain.gugger@gmail.com> Co-authored-by: Lysandre Debut <hi@lysand.re>	2023-04-04 14:53:06 +02:00
TheWall9	ad5e9b6c6a	[Roformer] Fixing a bug in RoFormerEncoder where it was ignoring the length of past_key_values when generating as a decoder (#22416 ) * fix RoFormerEncoder postion embedding when generate as decoder * make fixup * add test case for check generate with past key values * remove duplicating code	2023-04-04 12:50:33 +02:00
Joao Gante	1905384fd5	Generate: Add text streamer decoding options (#22544 )	2023-04-04 09:03:13 +01:00
Younes Belkada	159ff3342c	Update test_image_processing_pix2struct.py (#22543 )	2023-04-03 15:26:35 -04:00
Sylvain Gugger	c14d31294e	Skip failing test	2023-04-03 14:07:40 -04:00
Thibault Douzon	4e441e529c	fix LayoutLMv3TokenizerFast subword label after 'Ġ' token (#21695 ) LayoutLMv3TokenizerFast produces empty 'Ġ' token with `offset_mapping = (0, 0)`. Next token is wrongly assumed to also be beginning of word and isn't correctly assigned `pad_token_label`. Modify test with text that produce 'Ġ' token. Remove copy check from LayoutLMv2TokenizerFast for `_batch_encode_plus`. solves issue: #19978	2023-04-03 10:32:36 -04:00
Joao Gante	a55a822adf	Generate: `TextIteratorStreamer` (streamer for gradio) (#22501 ) * haha text go brrr (but in gradio)	2023-04-03 15:04:37 +01:00
Mohammed Jabir	7d25c9c81e	added biogpt token classifier (#22447 ) * added biogpt token classifier * fix reviews * Updated modeling_biogpt.py Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com> --------- Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>	2023-04-03 09:20:02 -04:00
Arthur	c0f99b4d2e	Fix llama tokenizer (#22402 ) * draft * update tokenization limma and conversion script * more udpates * initial commit * style * default pad to None * draft tokenization tests * update test * update tokenization tests * nits * update * versioning test * major fix * fix more testst * finish fixing special masks * last nit * more nits * add encode decode tests * add more * fix token type ids * style	2023-04-03 09:07:32 -04:00
Eli Simhayev	9eae4aa576	[Time-Series] fix past_observed_mask type (#22076 ) added > 0.5 to `past_observed_mask`	2023-04-03 09:07:21 -04:00
Sylvain Gugger	c612628045	Test fetch v2 (#22367 ) * Test fetcher v2 * Fix regexes * Remove sanity check * Fake modification to OPT * Fixes some .sep issues * Remove fake OPT change * Fake modif for BERT * Fake modif for init * Exclude SageMaker tests * Fix test and remove fake modif * Fake setup modif * Fake pipeline modif * Remove all fake modifs * Adds options to skip/force tests * [test-all-models] Fake modif for BERT * Try this way * Does the command actually work? * [test-all-models] Try again! * [skip circleci] Remove fake modif * Remove debug statements * Add the list of important models * Quality * Update utils/tests_fetcher.py Co-authored-by: Lysandre Debut <lysandre.debut@reseau.eseo.fr> * Address review comments * Address review comments * Fix and add test * Apply suggestions from code review Co-authored-by: Yih-Dar <2521628+ydshieh@users.noreply.github.com> * Address review comments --------- Co-authored-by: Lysandre Debut <lysandre.debut@reseau.eseo.fr> Co-authored-by: Yih-Dar <2521628+ydshieh@users.noreply.github.com>	2023-03-31 16:18:43 -04:00
Nicolas Patry	d143087d18	Making sure we can use safetensors to serialize all the time. (#22437 ) * Making sure we can use safetensors to serialize all the time. * Expanding the tests for increased coverage. * Update the test. * Getting current state of affairs. * Tentative fix. * Fixing black version. * Fixing the worst offenders. * Try to modify less files. * Fixing blip_2 (Weird solution right now). * Fixing deta. * Fix blip ? * Missing extra newline. * No deta modification. * Adding some comments. * Apply suggestions from code review Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Addressing comments. * Addressing comments. * creating warn_once. * Warning_once ! --------- Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>	2023-03-31 16:07:35 +02:00
Arthur	349e1242d9	[NLLB-MoE] `model_type` update for auto mapping (#22470 ) edit default model type and testing path set to hf-internal-testing	2023-03-30 15:36:07 +02:00
Joao Gante	228792a9dc	Generate: basic token streaming (#22449 ) * haha tokens go brrrr	2023-03-30 12:00:12 +01:00
amyeroberts	f0aeb1be17	Skip flaky NLLB Moe test for now (#22463 ) Skip flaky test for now	2023-03-30 11:30:19 +01:00
amyeroberts	154c6bb7ac	Rescale image back if it was scaled during PIL conversion (#22458 ) * Rescale image back if it was scaled during PIL conversion * do_rescale is defined if PIL image passed in	2023-03-30 11:29:11 +01:00
Younes Belkada	b844f8a9ab	[`Pix2Struct`] Fix slow test (#22448 ) fix slow test	2023-03-29 17:40:45 +02:00
Yih-Dar	8894b81742	Use real tokenizers if tiny version(s) creation has issue(s) (#22428 ) Fix some tiny model creation issues Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2023-03-29 16:16:23 +02:00
Younes Belkada	33f4cb1093	[`bnb`] fix bnb failing test (#22439 ) * fix bnb failing test * fix * fix * fixup	2023-03-29 15:13:00 +02:00
Arthur	8d9c3836be	Add clean_up_tokenization_spaces to config (#22341 ) * add draft changes * fix failing wav2vec * style * make sure that the argument is saved + add tests * style * fixup * update test * default clean_up_tokenization_spaces to False for Bloom and Llama * Update code based on review Co-authored-by: Nicolas Patry <patry.nicolas@gmail.com> * style * quality --------- Co-authored-by: Nicolas Patry <patry.nicolas@gmail.com>	2023-03-29 13:21:07 +02:00
Arthur	19ade2426a	[WIP]`NLLB-MoE` Adds the moe model (#22024 ) * Initial commit * update modeling code * update doc * add functions necessary * fix impotrs * revert changes * fixup * more styling to get going * remove standalone encoder * update code * styling * fix config and model * update code and some refactoring * make more tests pass * Adding NLLB-200 - MoE - 54.5B for no language left behind Fixes #21300 * fix mor common tests * styke * update testing file * update * update * Router2 doc * update check config with sparse layer * add dummy router * update current conversion script * create on the fly conversion script * Fixup * style * style 2 * fix empty return * fix return * Update default config sparse layers * easier to create sparse layers * update * update conversion script * update modeling * add to toctree * styling * make ruff happy * update docstring * update conversion script * update, will break tests but impelemting top2 * update * ❗local groups are supported here * ⚠️ Support for local groups is now removed ⚠️ This is because it has to work with model parallelism that we do not support * finish simplificaiton * Fix forward * style * fixup * Update modelling and test, refactoring * update tests * remove final layer)norm as it is done in the FF * routing works! Logits test added * nit in test * remove top1router * style * make sure sparse are tested. Had to change route_tokens a liottle bit * add support for unslip models when converting * fixup * style * update test s * update test * REFACTOR * encoder outputs match! * style * update testing * 🎉encoder and decoder logits match 🎉 * styleing * update tests * cleanup tests * fix router test and CIs * cleanup * cleanup test styling * fix tests * Finally the generation tests match! 
* cleanup * update test * style testing file * remove script * cleanup * more cleanup * nits * update * NLLB tokenizer is wrong and will be fixed soon * use LongTensors * update tests * revert some small changes * fix second expert sampling and batch prioritized routing * update tests * finish last tests * make ruff happy * update * ruff again * style * Update docs/source/en/model_doc/nllb-moe.mdx Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Updates based on review * style and fix import issue * nit * more nits * cleanup * styling * update test_seconde_expert_policy * fix name * last nit on the markdown examples --------- Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>	2023-03-27 19:42:00 +02:00
NielsRogge	0e708178ed	[Pix2Struct] Add support to resize embeddings (#22394 ) * First draft * Fix integration test * Remove script * Fix test and typos * Fix one more test * Skip tied embeddings test * Remove line * Address comments	2023-03-27 11:38:07 -04:00
Joao Gante	7dcd8703ef	Generate: support for left-padding on GPTNeoX and Llama (#22382 )	2023-03-27 15:48:23 +01:00
Shubhamai	a0cbbba31f	Resnet flax (#21472 ) * [WIP] flax resnet * added pretrained flax models, results reproducible * Added pretrained flax models, results reproducible * working on tests * no real code change, just some comments * [flax] adding support for batch norm layers * fixing bugs related to pt+flax integration * removing loss from modeling flax output class * fixing classifier tests * fixing comments, model output * cleaning comments * review changes * review changes * Apply suggestions from code review Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * renaming Flax to PyTorch --------- Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>	2023-03-24 19:45:57 +00:00
Mitch Naylor	57f25f4b7f	Add Mega: Moving Average Equipped Gated Attention (#21766 ) * add mega file structure and plain pytorch version of mega source code * added config class with old naming conventions * filled in mega documentation * added config class and embeddings with optional token types * updated notes * starting the conversion process, deleted intermediate and added use_cache back to config * renamed config attributes in modeling_mega.py * checkpointing before refactoring incremental decoding functions * removed stateful incremental key/values for EMA and self-attention * refactored MovingAverageGatedAttention to remove stateful k/v history and use unified attention mask * MovingAverageGatedAttention works with incremental decoding + past values, added sequence length enforcement * more comments in MovingAverageGatedAttention + checkpointing before GatedCrossAttention * bug fix in attention mask handling in MovingAverageGatedAttention * removed incremental state from GatedCrossAttention and removed IncrementalState class * finished gated cross attention and got MegaLayer working * fixed causal masking in mega decoder * fixed how padding and causal masks are passed through MegaLayer with and without k/v caching * finished MegaModel; tested with encoder, decoder-only, and cross-attention type inputs; started work on downstream classes; removed mentions of position_ids * added optional dense hidden layer for masked and causal LM classes * docstring updates in MultiHeadEMA and GatedCrossAttention, removed unnecessary inputs in cross-attention * removed before_attn_fn in Mega class and updated docstrings and comments up to there * bug fix in MovingAverageGatedAttention masking * working conversion of MLM checkpoint in scratchpad script -- perfect matches * moved arg for hidden dense layer in LM head to config; discovered issue where from_pretrained is renaming gamma and beta parameters * renamed gamma and beta parameters to avoid HF renaming when loading from 
checkpoint * finished checkpoint conversion script * cleanup old class in mega config script * removed 'copied from' statements and passing integration tests * added num_attention_heads=1 to config for integration compatibility, decoder tests working, generation tests failing * fixed tuple output of megamodel * all common tests passing after fixing issues in decoder, gradient retention, and initialization * added mega-specific tests, ready for more documentation and style checks * updated docstrings; checkpoint before style fixes * style and quality checks, fixed initialization problem in float_tensor, ready for PR * added mega to toctree * removed unnecessary arg in megaconfig * removed unused arg and fixed code samples with leftover roberta models * Apply suggestions from code review Applied all suggestions except the one renaming a class, as I'll need to update that througout Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> * fixed issue where .view breaks batch dimension, conversion script fixed with absolute imports, updated readme with Mega->MEGA * removed asserts in Mega code, renamed sequencenorm, gatedcrossattention, and NFFN, replaced get_activation_fn with ACTFN, and added sequencenorm to layer norms * reformatted .forward() docstrings to match style and removed unused mask input in cross-attention * removed all reset_parameters() methods and rolled into MegaPreTrainedModel._init_weights() * renamed all single-letter variables and improved readability in tensor size comments, Mega->MEGA in 2 documentation files * variable names in NFFN * manual Mega->MEGA changes in docs * Mega->MEGA in config auto * style and quality fixes * Apply suggestions from code review Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> * renamed parameters and variables with confusing names, added copied from statements, moved fft conv to its own method, other cleanup from PR comments * commit before dealing with merge conflicts * made 
new attention activation functions available in ACT2FN and added generation test from OPT * style and quality in activations and tests * documentation fixes, renaming variables in dropout and rotary positions, used built-in causal masking, encoders->layers in MegaModel, moved comments into docstrings * style and quality fixes after latest updates, before rotary position ids * causal mask in MegaBlock docstring + added missing device passing * Apply suggestions from code review Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> * Update README.md Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * added Mega prefixes where missing, reverted MegaSequenceNorm to if-else, other module renaming requested in PR * style and quality fixes + readme updates pointing to main --------- Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>	2023-03-24 08:17:27 -04:00
Joao Gante	0fa46524ac	Generate: Add GPTNeoX integration test (#22346 )	2023-03-24 11:33:16 +00:00
Yih-Dar	e8cc02555e	Automatically create/update tiny models (#22275 ) * Automatically create or update tiny models * Skip failed tests * update workflow file * use revision --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2023-03-23 19:14:17 +01:00
Joao Gante	502fec779b	Generate: add test for left-padding support (#22322 )	2023-03-23 17:00:22 +00:00
Sylvain Gugger	80e3b36361	Really fix quality due to ruff release	2023-03-22 20:56:22 -04:00
Sylvain	ef28df0572	Fix quality due to ruff release	2023-03-22 20:45:08 -04:00
Yih-Dar	8b05ace014	Fix PipelineTests skip conditions (#22320 ) * check what tests fail * Skip failing tests * Skip failing tests * Skip failing tests * Skip failing tests * clean up * clean up --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2023-03-22 20:02:24 +01:00
Luc CAILLIAU	d62e7d8842	Chunkable token classification pipeline (#21771 ) * Chunkable classification pipeline The TokenClassificationPipeline is now able to process sequences longer than 512. No matter the framework, the model, the tokenizer. We just have to pass process_all=True and a stride number (optional). The behavior remains the same if you don't pass these optional parameters. For overlapping parts when using stride above 0, we consider only the max scores for each overlapped token in all chunks where the token is. * Update token_classification.py * Update token_classification.py * Update token_classification.py * Update token_classification.py * Update token_classification.py * Update token_classification.py * Update token_classification.py * Update token_classification.py * Update token_classification.py * Update token_classification.py * Update token_classification.py * Update token_classification.py * update with latest black format * update black format * Update token_classification.py * Update token_classification.py * format correction * Update token_classification.py * Update token_classification.py * Update token_classification.py * Update token_classification.py * Update comments * Update src/transformers/pipelines/token_classification.py Co-authored-by: Nicolas Patry <patry.nicolas@protonmail.com> * Update token_classification.py Correct spaces, remove process_all and keep only stride. If stride is provided, the pipeline is applied to the whole text. * Update token_classification.py * Update token_classification.py * Update token_classification.py * Update token_classification.py * Update token_classification.py * Update token_classification.py * Update token_classification.py * Update token_classification.py * Update chunk aggregation Update the chunk aggregation strategy based on entities aggregation. 
* Update token_classification.py * Update token_classification.py * Update token_classification.py * Update token_classification.py * Update token_classification.py * Update token_classification.py * Update token_classification.py * Update token_classification.py * Update token_classification.py * Update token_classification.py * Update token_classification.py * Update token_classification.py Remove unnecessary pop from outputs dict * Update token_classification.py * Update token_classification.py * Update token_classification.py * Update token_classification.py * Update token_classification.py * Update token_classification.py * Update token_classification.py * Update src/transformers/pipelines/token_classification.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * add chunking tests * correct formating * correct formatting * correct model id for test chunking * update scores with nested simplify * Update test_pipelines_token_classification.py * Update test_pipelines_token_classification.py * update model to a tiny one * Update test_pipelines_token_classification.py * Adding smaller test for chunking. * Fixup * Update token_classification.py * Update src/transformers/pipelines/token_classification.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update src/transformers/pipelines/token_classification.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> --------- Co-authored-by: Nicolas Patry <patry.nicolas@protonmail.com> Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>	2023-03-22 14:13:20 -04:00
Younes Belkada	0f68a7f408	Add Pix2Struct (#21400 ) * v1 all keys match * clean up * forward pass ok * add correct image transform * generate works, logits matching * clean up * more refactor * revert * revert * clean up * clean ups * clean up * refactor * refactor * fix doc * fix tokenizer test * fix toctree * revert toctree * oops * few fixes * replace to `pixel_embeds` * make fixup * test processing & feat extractor * fix some tests * more fixes * make fixup * clean up * more clean up * add a single slow test * fix test * make fixup * fix * fix authors * fix toctree * update docs * add docstring * revert change * Update src/transformers/models/pix2struct/__init__.py Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> * fix tokenizer * fix processor test * fix test * make fixup * refactor * fix config * Update src/transformers/models/pix2struct/image_processing_pix2struct.py Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> * format * fix * Update src/transformers/models/pix2struct/image_processing_pix2struct.py Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> * make fixup * add docstring * fix issues * fix * fix * fix * add slow test * fix * fix * fix batched issue * fix training issues * fix ci test * fix slow test * fix conversion script * remove unneeded classes * fix slow test * fix require backends * fix masked fill * revert * fix softmax * add large models support * fix conditional generation * few fixes * add instructions * rm unneeded file * Update src/transformers/models/pix2struct/convert_pix2struct_original_pytorch_to_hf.py * fix ci test * fix ci test really * Apply suggestions from code review Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * fix nit * fix nits * fix image processors nits * docstring * clean up * fix nit * fix tests * docstring nit * fix reshape * Update src/transformers/models/pix2struct/image_processing_pix2struct.py Co-authored-by: 
NielsRogge <48327001+NielsRogge@users.noreply.github.com> * fix nit * fix repetition * refactor processor * make patch size consistent * refactor forward * fix docstring * fix max_patches issue * update docstirng * update docstring * fix coped from * add skip reasons * few fixes * Update src/transformers/models/pix2struct/image_processing_pix2struct.py Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * format * fix doctests * refactor and fix * fix doc build issue * fix processor test * small fix conversion script * replace correct weights * make fixup * fix some issues * Apply suggestions from code review Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * revert config and fixes * Update src/transformers/models/pix2struct/image_processing_pix2struct.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * more details * fixes * fix processor * fix processor test * fix * Apply suggestions from code review Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * make fixup * fix processor * Update src/transformers/models/pix2struct/modeling_pix2struct.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * add copied * make fixup * fix copies * update docstring * refactor * fix docstring * fix conversion script * fix vqa issue * replace to `flattened_patches` * nit * fix numpy issue * fix image processors * add batched vqa support * fix vqa conversion * make fixup * fix conversion script * Apply suggestions from code review Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * make fixup * add correct docstring * update docstring * fix module level + channel dim * use `make_list_of_images` * refactor * correct docstring * fix authors * remove `data_format` * add header text test * Apply suggestions from code review Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> Co-authored-by: amyeroberts 
<22614925+amyeroberts@users.noreply.github.com> * make fixup * add checkpoints --------- Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>	2023-03-22 16:53:52 +01:00
Joao Gante	fd3eb3e3cd	Beef up Llama tests (#22314 ) * tmp commit * beef up llama tests	2023-03-22 15:20:48 +00:00
Joao Gante	12febc20db	Generate: Export TF generate with a TF tokenizer (#22310 ) * Export TF generate with a TF tokenizer * remove unused lines	2023-03-22 15:00:20 +00:00
silentghoul-spec	48bef3a734	Fixed bug to calculate correct xpath_sub_list in MarkupLMTokenizer (#22302 ) Fixed bug to calculate correct xpath_sub_list in MarkupLMTokenizer. Earlier xpath_sub_list was same as xpath_tags_list Co-authored-by: dusejat <dusejat@amazon.com>	2023-03-22 12:07:49 +00:00
Alara Dirik	0558914dff	Add MaskedImageModelingOutput (#22212 ) * Add MaskedImageModelingOutput	2023-03-22 07:35:47 +03:00
Yih-Dar	67c2dbdb54	Time to Say Goodbye, torch 1.7 and 1.8 (#22291 ) * time to say goodbye, torch 1.7 and 1.8 * clean up torch_int_div * clean up is_torch_less_than_1_8-9 * update --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2023-03-21 19:22:01 +01:00
Gerald Cuder	5a2b77a6c1	Fix error in mixed precision training of `TFCvtModel` (#22267 ) * Make sure CVT can be trained using mixed precision * Add test for keras-fit with mixed-precision * Update tests/models/cvt/test_modeling_tf_cvt.py Co-authored-by: Matt <Rocketknight1@users.noreply.github.com> --------- Co-authored-by: gcuder <Gerald.Cuder@iacapps.com> Co-authored-by: Matt <Rocketknight1@users.noreply.github.com>	2023-03-21 12:12:57 +00:00
lewtun	f251441387	Add LlamaForSequenceClassification (#22209 ) * Add LlamaForSequenceClassification * Update src/transformers/models/llama/modeling_llama.py Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com> * Update src/transformers/models/llama/modeling_llama.py Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com> * Add docstring * Add test * Add input embedding getter and setter * Remove dead code --------- Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>	2023-03-17 14:39:26 +01:00
Yih-Dar	5110e5748e	🔥py38 + torch 2 🔥🔥🔥🚀 (#22204 ) * py38 + torch 2 * increment cache versions --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2023-03-16 22:59:23 +01:00
Jason Phang	0041be5b3d	LLaMA Implementation (#21955 ) * LLaMA * sharding and docs * tweak * black * inits * ruff * LLAMA_PRETRAINED_CONFIG_ARCHIVE_MAP * init * no checkpoint * docs * ruff * type_vocab_size * tokenizer fixes * tokenizer fixes * Update tokenization_llama.py * Update tokenization_llama.py * Update configuration_llama.py * Update modeling_llama.py * tokenizer add_bos by default * licenses * remove decoder * norms and mlp * rope overhaul * tweaks * black * mention OPT implementation * off-by-one naming * typo * fix * tokenization fix and slicing bug * padding config * cleanup * black * update tests * undo typo * fix vocab caching logic * ruff * docbuilder * attn fix from BlackSamorez * initial feedback * typo * docs * llama case * llama case * load checkpoint docs * comment about tokenizer * tokenizer defaults * clear past_key_values if use_cache=False * last tweaks * last tweaks * last tweaks * last tweaks --------- Co-authored-by: Stella Biderman <stellabiderman@gmail.com>	2023-03-16 09:00:53 -04:00
Yih-Dar	52a57f7c7c	Update expected values in `MgpstrModelIntegrationTest` (#22195 ) Update values Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2023-03-16 11:48:52 +00:00
Anahita Bhiwandiwalla	16121bae5c	Update BridgeTowerForContrastiveLearning (#22145 ) * Use return_loss for BridgeTowerForContrastiveLearning, add example * fix tests * Update example in BridgeTowerForContrastiveLearning * Update test_modeling_bridgetower.py * update model output format * minor update * Update src/transformers/models/bridgetower/modeling_bridgetower.py * make style --------- Co-authored-by: Tiep Le <97980157+tileintel@users.noreply.github.com> Co-authored-by: Tiep Le <tiep.le@intel.com> Co-authored-by: Yih-Dar <2521628+ydshieh@users.noreply.github.com> Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2023-03-15 20:54:38 +01:00
Sylvain Gugger	42ad693b7b	Regression pipeline device (#22190 ) * Fix regression in pipeline when device=-1 is passed * Add regression test	2023-03-15 14:13:38 -04:00
amyeroberts	737681477c	Revert 22152 MaskedImageCompletionOutput changes (#22187 ) Revert changes	2023-03-15 18:37:23 +01:00
amyeroberts	c6318c3788	to_pil - don't rescale if int and in range 0-255 (#22158 ) * Don't rescale if int and in range 0-255 * Raise value error if int values too large * Update tests/test_image_transforms.py * Update tests/test_image_transforms.py	2023-03-14 15:43:44 +00:00
Alara Dirik	3b22bfbc6a	Create MaskedImageCompletionOutput and fix ViT docs (#22152 ) * create MaskedImageCompletionOutput * fix bugs * fix bugs	2023-03-14 13:55:18 +00:00
Alara Dirik	cdddfbffa1	Add ConvNeXT V2 (#21679 ) * Add ConvNeXt V2 to transformers * TF model is separated from the PR to fix issues	2023-03-14 12:08:14 +03:00
Yih-Dar	6c2ad00c46	Move `is_pipeline_test_to_skip` to specific model test classes (#21999 ) * Move `is_pipeline_test_to_skip` to specific model test classes --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2023-03-14 10:03:02 +01:00
Patrick von Platen	f780557a34	[Safetensors] Add explicit flag to from pretrained (#22083 ) * [Safetensors] Add explicit flag to from pretrained * add test * remove @ * Apply suggestions from code review Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> --------- Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>	2023-03-13 21:39:06 +01:00
Younes Belkada	d979cf6efd	[`Whisper`] add `get_input_embeddings` to `WhisperForAudioClassification` (#22133 ) * add `get_input_embeddings` to `WhisperForAudioClassification` * add common tests * fix another common test * Update tests/models/whisper/test_modeling_whisper.py Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> * fix style --------- Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>	2023-03-13 19:46:01 +01:00
Younes Belkada	6652e7da0d	[`Blip2`] skip accelerate test (#22124 ) skip accelerate test	2023-03-13 15:03:21 +01:00
wangpeng	102b5ff4a8	add new model of MGP-STR (#21418 ) * add new model of MGP-STR * fix the check failings * remove torch and numpy from mgp_tokenization * remove unused import from modeling_mgp_str * add test_processing_mgp_str * rm test_processing_mgp_str.py * add test_processing_mgp_str * add test_processing_mgp_str * add test_processing_mgp_str * rm test_processing_mgp_str and add softmax outs to model * rm test_processing_mgp_str and add softmax outs to model * rewrite the code of mgp-str according to PR suggestions * rewrite the code of mgp-str according to PR suggestions * add new model of MGP-STR * fix the check failings * remove torch and numpy from mgp_tokenization * remove unused import from modeling_mgp_str * add test_processing_mgp_str * rm test_processing_mgp_str.py * add test_processing_mgp_str * add test_processing_mgp_str * add test_processing_mgp_str * rm test_processing_mgp_str and add softmax outs to model * rewrite the code of mgp-str according to PR suggestions * rewrite the code of mgp-str according to PR suggestions * remove representation_size from MGPSTRConfig * reformat configuration_mgp_str.py * format test_processor_mgp_str.py * add test for tokenizer and complete model/processer test and model file * rm Unnecessary tupple in modeling_mgp_str * reduce hidden_size/layers/label_size in test_model * add integration tests and change MGPSTR to Mgpstr * add test for logit values * reformat test model file --------- Co-authored-by: yue kun <yuekun.wp@alibaba-inc.com>	2023-03-13 10:11:31 +00:00
Yih-Dar	2f320661f3	Revert "[GPT2] Propose fix for #21080 " (#22093 ) Revert "[GPT2] Propose fix for #21080 (#21853)" to avoid CI failure This reverts commit `a3fef89b26`.	2023-03-10 22:08:21 +01:00
Dean Wyatte	2f4cdd97f5	handle numpy inputs in whole word mask data collator (#22032 )	2023-03-10 10:50:29 -05:00
Arthur	a3fef89b26	[GPT2] Propose fix for #21080 (#21853 ) * Make sure position ids are masked * test that padded input produce the same results * fix failing tests * fixup * fix batch test	2023-03-10 07:15:25 -05:00
Yih-Dar	ab81d31d20	Skip 3 tests for `WhisperEncoderModelTest` (#22060 ) * skip 3 tests --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2023-03-09 19:09:23 +01:00
Stas Bekman	ec24132b6c	[deepspeed] offload + non-cpuadam optimizer exception (#22043 ) * [deepspeed] offload + non-cpuadam optimizer exception * flip * revert min version	2023-03-09 08:12:57 -08:00
Lucain	923110b74f	Remove set_access_token usage + fail tests if FutureWarning (#22051 ) * Remove set_access_token usage + fail tests if FutureWarning * do not fail on FutureWarning in CI --------- Co-authored-by: testbot <lucainp@hf.co>	2023-03-09 09:23:48 -05:00
Yih-Dar	1cbac6867b	Mark all `BridgeTower` tests slow for now (#22039 ) * slow me --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2023-03-08 21:48:29 +01:00
Anahita Bhiwandiwalla	de81adf978	[WIP] Add BridgeTowerForContrastiveLearning (#21964 ) * Add BridgeTower for ITC * Fix review feedback * Rename BridgeTowerForITC, cleanup * Fix style and quality * implement tests --------- Co-authored-by: Tiep Le <97980157+tileintel@users.noreply.github.com> Co-authored-by: Tiep Le <tiep.le@intel.com>	2023-03-08 09:00:54 -05:00
Yih-Dar	dfe9a31973	Update `AudioClassificationPipelineTests::test_small_model_pt` for PT 2.0.0 (#22023 ) fix Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2023-03-08 13:56:47 +01:00
Yih-Dar	b338414e61	Update tiny model creation script and some others files (#22006 ) * Update 1 * Update 2 * Update 3 * Update 4 * Update 5 * Update 6 * Update 7 * Update 8 * Update 9 * Update 10 --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2023-03-07 22:31:14 +01:00
Eli Simhayev	8abe4930d3	[Time-Series] informer model (#21099 ) * added informer to gitignore * added informer to gitignore * WIP informer2020 * added checking that instantiate works * added config using gluonTS by kashif * WIP config * adding informeConfig. need to remove FeatureEmbedder * done InformerConfig, but need to change the names * Done informer model init. working on enc-dec * added things to address, after reading again enc-dec in the paper * done modeling - checking initialization work * added informer to gitignore * WIP informer2020 * added checking that instantiate works * added config using gluonTS by kashif * WIP config * adding informeConfig. need to remove FeatureEmbedder * done InformerConfig, but need to change the names * Done informer model init. working on enc-dec * added things to address, after reading again enc-dec in the paper * done modeling - checking initialization work * moved enc-dec init to InformerEncoder/Decoder init * added 'init_std' to config, now model init works! 
* WIP conversion script, and added code sources * WIP conversion script: loading original informer pth works * WIP conversion script: change defaults in the config * WIP conversion script: supporting Informer input embedding * WIP conversion script: added parameters for the informer embed * WIP conversion script: change dim_feedforward=2048 * WIP conversion script: remove unused args for loading checkpoint * just cleaning up * DataEmbedding removed, after thinking with Kashif * working on forward pass * WIP forward pass: trying to establish working batch for forward pass * cleaning and finalizing * adding HF names and docs * init after cleaning works * WIP in tests * added docs for the informer specific args * fix style * undo change * cleaning informer, now need to work only enc-dec * initial enc-dec classes * added encoder and decoder * added todo * add todos for conv_layers * added decoder docs from vanilla * added encoder docs from vanilla * remove encoder decoder from the original informer * removed AttentionLayer from the original paper * removed TriangularCausalMask, same as decoder_attention_mask * initial sparse attention * use conv_layers * fixed test_config test * fix parenthesis when itearting zip(layers, conv_layers) * error found in prob attention, added sizes as comments * fix sizes * added proposal for q_reduce indexing, and remove unused * WIP ProbMask, and changed factor=2 for testing * remove unused libs for this PR for creating the env * fix checking the attn_weights.size() after bmm * Q_reduce: changed from torch.gather to simple slicing * WIP calculate final attn_output * finish adding v_aggregated, attn_output ready * changed tgt_len to u in attention_mask, need to fix the size error * comment attention_mask for encoder, and fix if cond for v_agg * added ProbMask support (wip), removed old original code * finished ProbMask 😃 * Revert "remove unused libs for this PR for creating the env" This reverts commit `11a081e09e`. 
* fixes * make style * fix initial tests * fix more tests * dry * make style * remove unused files * style * added integration tests * fix num_static_real_features * fix header * remove unused function * fix example * fix docs * Update src/transformers/models/informer/configuration_informer.py Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Update src/transformers/models/informer/modeling_informer.py Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Update src/transformers/models/informer/configuration_informer.py Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Update src/transformers/models/informer/configuration_informer.py Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Update src/transformers/models/informer/configuration_informer.py Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Update src/transformers/models/informer/configuration_informer.py Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * fixes for reviewer * use prediction_length from model * fix style * fixed informer.mdx * added to index * updated readme * undo * make fix-copies * typo * fix copy * added Informer to toctree * in order * fixed comments * remove unneeded new lines in docs * make static real and cat optional * fix use of distil conv layers * fixed integration test * added checkpoint for convlayer * make fix-copies * updated from time series model * make fix-copies * copy decoder * fix unit tests * updated scaling config * fix integration tests * IGNORE_NON_TESTED * IGNORE_NON_AUTO_CONFIGURED * IGNORE_NON_AUTO_CONFIGURED * updated check configs * fix formatting * undo change from time series * prediction_length should not be None * aliign with the blog: prettify ProbSparse and change attention_factor to sampling_factor * make style * make fix-copies * niels CR: update contributed by * niels CR: update configuration_informer.py 
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * niels CR: update kashif -> huggingface Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * niels CR: `sampling_factor` only relevant when `attention_type`=prob * make style * fixed U_part: added multiplication by `L_Q` * fixed bug: remove `is not None` from `if config.distil` * fixed test: `decoder_seq_length` to `encoder_seq_length` in cross_attentions check * fix integration tests * updated model hub * do not shift as in training * undo * fix make-copies * make fix-copies * added `if prediction_length is None` * changed `ProbSparseAttention` to `InformerProbSparseAttention` * changed `V_sum` -> `v_mean_dim_time` * changed `ConvLayer` to `InformerConvLayer` and fixed `super()` * TimeSeriesTansformer->Informer in decoder's Copied from * more descriptive in ProbSparse * make style * fix coped from * Revert "added `if prediction_length is None`" This reverts commit `b4cbddfa05`. * fixed indent * use InformerSinusoidalPositionalEmbedding * make fix-style * fix from #21860 * fix name * make fix-copies * use time series utils * fix dec num_heads * docstring * added time series util doc * _import_structure * formatting * changes from review * make style * fix docs * fix doc * removed NegativeLogLikelihood --------- Co-authored-by: Kashif Rasul <kashif.rasul@gmail.com> Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>	2023-03-07 21:36:38 +01:00
NielsRogge	dde718e7a6	[DETR and friends] Remove is_timm_available (#21814 ) * First draft * Fix to_dict * Improve conversion script * Update config * Remove timm dependency * Fix dummies * Fix typo, add integration test * Upload 101 model as well * Remove timm dummies * Fix style --------- Co-authored-by: Niels Rogge <nielsrogge@Nielss-MacBook-Pro.local>	2023-03-07 15:19:39 -05:00
Arthur	2156662dea	[TF] Fix creating a PR while pushing in TF framework (#21968 ) * add create pr arg * style * add test * fixup * update test * last nit fix typo * add `is_pt_tf_cross_test` marker for the tests	2023-03-07 17:32:08 +01:00
Sanchit Gandhi	7c39318136	[Whisper] Add model for audio classification (#21754 ) * [Whisper] Add model for audio classification * make fix-copies * add to docs * add docstring * empty returns * add code example * switch to fleurs * stick everything on one line	2023-03-07 16:20:21 +01:00
Yih-Dar	9402788b34	Skip `test_multi_gpu_data_parallel_forward` for some model tests (#21991 ) skip test_multi_gpu_data_parallel_forward for some model tests Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2023-03-07 14:23:36 +01:00
NielsRogge	95408e9953	[DETR, YOLOS] Fix device bug (#21974 ) * Fix integration test * Add test * Add test	2023-03-07 07:34:04 -05:00
Elad Segal	eec46b4f75	Fix MinNewTokensLengthLogitsProcessor when used with a list of eos tokens (#21959 ) * Fix MinNewTokensLengthLogitsProcessor when used with a list of eos tokens * fix docs * Empty commit * formatting	2023-03-07 11:59:22 +00:00
amyeroberts	4063fd9cba	Add check before int casting for PIL conversion (#21969 ) * Add check before int casting for PIL conversion * Line length * Tidier logic	2023-03-07 11:14:09 +00:00
Yih-Dar	5b28b78332	Update `Jukebox` tests (#21984 ) * update expected values for jukebox * update expected values for jukebox * update expected values for jukebox * update expected values for jukebox * update expected values for jukebox --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2023-03-07 04:20:14 +01:00
Yih-Dar	f2a2616b74	Update expected values for `test_xglm_sample` (#21975 ) update expected values for xglm Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2023-03-06 18:07:31 +01:00
Yih-Dar	9474abdf47	Use larger atol in `torch.allclose` for some tests (#21966 ) Use larger atol Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2023-03-06 17:41:00 +01:00
Yih-Dar	fcf813417a	Update expected values in `XLMProphetNetModelIntegrationTest` (#21957 ) update values Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2023-03-06 09:15:44 +01:00
Arthur	718e9d777f	[CLAP] Support batched inputs for CLAP. Fixes pipeline issues (#21931 ) * fix pipeline * fix feature_extraction clap * you can now batch the `is_longer` attribute * add tests * fixup * add expected scores * comment on is_longert	2023-03-03 18:42:18 +01:00
Yih-Dar	d4306daea1	Fix `AlignModelTest` tests (#21923 ) * fix * fix --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2023-03-03 14:47:09 +01:00
Yih-Dar	fa9d2ad7ec	Update `model_split_percents` for `WhisperModelTest` (#21922 ) Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2023-03-03 14:35:08 +01:00
Yih-Dar	9f5bfe1b99	Avoid modeling tests run in pipeline CI jobs (#21911 ) * rework is_pipeline_test * bring back 3 tests --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2023-03-02 21:23:06 +01:00
Kashif Rasul	db979f7588	[time series] Add Time series inputs tests (#21846 ) * initial test of inputs * added test for generation * remove asserts * fixed test * Update tests/models/time_series_transformer/test_modeling_time_series_transformer.py Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> --------- Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>	2023-03-02 20:43:35 +01:00
Yih-Dar	88e5c51a15	Temporarily skip 3 tests in `BridgeTowerModelTest` (#21908 ) skip for now Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2023-03-02 19:16:03 +01:00
Yih-Dar	e6de918676	Add Blip and Blip2 for pipeline tests (#21904 ) * fix * add to tests * style and quality * add missing --------- Co-authored-by: NielsRogge <NielsRogge@users.noreply.github.com> Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2023-03-02 18:20:34 +01:00
Nicolas Patry	1325459105	Refactor whisper asr pipeline to include language too. (#21427 ) * [WIP] whisper refacto to support language output. * Handling merges. * A bit more cleanup and comments. * Many improvements. Lots of details everywhere. * Cleanup old code and tests. * Handle lone timestamp tokens (just recover when something bad happens). * Adding return_language example. * No ffmpeg. * Hmm. * Some corrections. * Both fast and slow. * New black. * Update src/transformers/models/whisper/tokenization_whisper.py Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> * Update src/transformers/models/whisper/tokenization_whisper.py Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> * Remove print. * Undoing tests modifications. * Smaller test modifications. * Rename. * Remove maxDiff. --------- Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>	2023-03-02 18:12:19 +01:00
Connor Henderson	8e5a1b2abb	Make schedulers picklable by making lr_lambda fns global (#21768 ) * Make schedulers picklable by making lr_lambda fns global * add unused _get_constant_schedule_lr_lambda arg * remove unneeded _get_constant_schedule_lr_lamda * add test * make style * rebase, remove torch dep, put lambda back * repo-consistency and style	2023-03-02 12:08:43 -05:00
Kian Sierra McGettigan	6bf885375a	Prophetnet batch dimension inversion fix (#21870 ) * decoder forward pass is working * no model has forward pass returning attentions * decoder ngram changed to not mix batch size * current basic forward pass returns identical result * passed test_model attentions * passed test_encoder_decoder_model_generate * passed test_headmasking * removed old block * removed comments bug/fixme * removed bug comments * applied styling * applied fix-copies * applied ngram forward comments * corrected dimension notation * applied styling and comment fixes * changed asserts for raise ValueError * changed question gen test * updated hidden_states integration test * applied styling	2023-03-02 12:07:45 -05:00
Sylvain Gugger	50a8ed3ee0	Mark pipeline tests to skip them easily (#21887 ) * Mark pipeline tests to skip them easily * Mark the mixin as pipeline test * Update src/transformers/testing_utils.py Co-authored-by: Yih-Dar <2521628+ydshieh@users.noreply.github.com> --------- Co-authored-by: Yih-Dar <2521628+ydshieh@users.noreply.github.com>	2023-03-02 10:55:36 -05:00
Arthur	c87654dca1	[Whisper] Add rescaling function with `do_normalize` (#21263 ) * add `zero_mean_unit_var_norm` function * normalize before MEL computation * fixup * add simple test * quality * Update tests/models/whisper/test_feature_extraction_whisper.py Co-authored-by: Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com> * fixup * use attention masks if padding was applied * Update based on review Co-authored-by: bofeng huang <bofenghuang7@gmail.com> --------- Co-authored-by: Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com> Co-authored-by: bofeng huang <bofenghuang7@gmail.com>	2023-03-02 14:17:21 +01:00
Yih-Dar	36ee128375	Fix `WhisperModelTest` (#21883 ) * force on the same device * fix tests --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2023-03-01 20:41:27 +01:00
Alara Dirik	269b054939	Add ALIGN to transformers (#21741 ) Adds the ALIGN model to transformers. ALIGN is introduced in "Scaling Up Visual and Vision-Language Representation Learning With Noisy Text Supervision" by Chao Jia, Yinfei Yang, Ye Xia, Yi-Ting Chen, Zarana Parekh, Hieu Pham, Quoc V. Le, Yunhsuan Sung, Zhen Li, Tom Duerig.	2023-03-01 21:23:31 +03:00
Matt	f7c618e3b0	Add TFVisionTextDualEncoder (#21873 ) * Temporary commit to stash everything so far * Temporary commit to stash everything so far * stash commit * Refactor from_pretrained * Fix final test, make fixup * Update dummies * Add model to TEST_FILES_WITH_NO_COMMON_TESTS * Update src/transformers/models/vision_text_dual_encoder/modeling_tf_vision_text_dual_encoder.py Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com> * Update src/transformers/models/vision_text_dual_encoder/modeling_tf_vision_text_dual_encoder.py Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com> * Update src/transformers/models/vision_text_dual_encoder/modeling_tf_vision_text_dual_encoder.py Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com> * Update src/transformers/models/vision_text_dual_encoder/modeling_tf_vision_text_dual_encoder.py Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com> * Add TFVisionTextDualEncoder to utils/documentation_tests.txt * make fixup --------- Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>	2023-03-01 18:00:48 +00:00
Yih-Dar	53735d7c3b	Add an utility file to get information from test files (#21856 ) * Add an utility file to get information from test files --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2023-03-01 17:53:29 +01:00
Arthur	b599b19289	[ConvBert] Fix #21523 (#21849 ) * fix reshaping Fixes #21523 * add test * styling * last fixes * Update src/transformers/models/convbert/modeling_convbert.py * code quality	2023-03-01 11:11:04 +01:00
Arthur	44e3e3fb49	prepare for "__floordiv__ is deprecated and its behavior will change in a future version of pytorch" (#20211 ) * rounding_mode = "floor" instead of // to prevent behavioral change * add other TODO * use `torch_int_div` from pytrch_utils * same for tests * fix copies * style * use relative imports when needed * Co-authored-by: sgugger <sylvain.gugger@gmail.com>	2023-03-01 10:49:21 +01:00
Sylvain Gugger	b29e2dcaff	Fix flaky test for log level (#21776 ) * Fix flaky test for log level * Fix other flaky test	2023-02-28 16:24:14 -05:00
Matt	acfb714bdf	Improve TF weight loading, especially PT crossloading (#21792 ) * First commit for the improved PT-TF weight loading * Remove workarounds from TFEncoderDecoder tests * Allow a custom weight renaming function in from_pretrained and use that to clean up EncoderDecoder * make fixup * First attempt at visionencoderdecoder * Disable tensorfloat32 in tests to get consistent outputs * Quick fix to tf_vision_encoder_decoder tests * make fixup * Update Blenderbot tests * Remove unused arg in modeling_tf_opt * load_tf_sharded_weights had strict=True! This meant transfer learning was impossible, so I'm setting it to False. * Support prefixes when loading sharded TF checkpoints * make fixup * Add test to load sharded models with a weight prefix * Fix sharded weight loading test * Add a test for transfer from a sharded checkpoint * make fixup * Add test to check that crossloading from PT with a prefix works * Refactor from_pretrained in the encoderdecoder classes * Refactor from_pretrained in the encoderdecoder classes * missmatched -> mismatched * Explicitly check for None * No comments showing my very impressive and attractive knowledge of Py3.9+ * Disable TF32 across all TF tests	2023-02-28 18:41:34 +00:00


3386 Commits