mirror of https://github.com/huggingface/transformers.git
synced 2025-07-05 05:40:05 +06:00
remove-script-datasets-in-tests-test-datasets-main
712 Commits
Author | SHA1 | Message | Date | |
---|---|---|---|---|
|
5c67682b16
|
v4.33.0.dev0 | ||
|
6c811a322f
|
new model: IDEFICS via HuggingFaceM4 (#24796)
* rename * restore * mappings * unedited tests+docs * docs * fixes * fix auto-sync breakage * cleanup * wip * wip * add fetch_images * remove einops dependency * update * fix * fix * fix * fix * fix * re-add * add batching * rework * fix * improve * add Leo as I am extending his work * cleanup * fix * cleanup * slow-test * fix * fix * fixes * deal with warning * rename modified llama classes * rework fetch_images * alternative implementation * cleanup * strict version * cleanup * [`IDEFICS`] Fix idefics ci (#25056) * Fix IDEFICS CI * fix test file * fixup * some changes to make tests pass * fix * fixup * Update src/transformers/models/idefics/configuration_idefics.py Co-authored-by: Stas Bekman <stas00@users.noreply.github.com> --------- Co-authored-by: Stas Bekman <stas00@users.noreply.github.com> * remove compat checks * style * explain that Idefics is not for training from scratch * require pt>=2.0 * fix idefics vision config (#25092) * fix idefics vision config * fixup * clean * Update src/transformers/models/idefics/configuration_idefics.py --------- Co-authored-by: Stas Bekman <stas00@users.noreply.github.com> * cleanup * style * cleanup * Apply suggestions from code review Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * upcase * sequence of images * handle the case with no images * Update src/transformers/image_processing_utils.py Co-authored-by: Victor SANH <victorsanh@gmail.com> * support pure lm take 2 * support tokenizer options * parameterize num_channels * fix upcase * s|IdeficsForCausalLM|IdeficsForVisionText2Text|g * manual to one line * addressing review * unbreak * remove clip dependency * fix test * consistency * PIL import * Idefics prefix * Idefics prefix * hack to make tests work * style * fix * fix * revert * try/finally * cleanup * clean up * move * [`IDEFICS`] Fix idefics config refactor (#25149) * refactor config * nuke init weights * more refactor * oops * remove visual question answering pipeline support * 
Update src/transformers/models/idefics/clip.py Co-authored-by: Stas Bekman <stas00@users.noreply.github.com> * Update src/transformers/models/idefics/modeling_idefics.py * cleanup * mv clip.py vision.py * tidyup --------- Co-authored-by: Stas Bekman <stas00@users.noreply.github.com> Co-authored-by: Stas Bekman <stas@stason.org> * fix * license * condition on pt * fix * style * fix * rm torchvision dependency, allow custom transforms * address review * rework device arg * add_eos_token * s/transforms/transform/ * fix top level imports * fix return value * cleanup * cleanup * fix * style * license * license * Update src/transformers/models/idefics/image_processing_idefics.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * add a wrapper to freeze vision layears * tidyup * use the correct std/mean settings * parameterize values from config * add tests/models/idefics/test_image_processing_idefics.py * add test_processor_idefics.py * cleanup * cleanups * fix * fix * move to the right group * style * Apply suggestions from code review Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * add perceiver config * reset * missing arg docs * Apply suggestions from code review Co-authored-by: Leo Tronchon <leo.tronchon@gmail.com> * address review comments * inject automatic end of utterance tokens (#25218) * inject automatic end of utterance tokens * fix * fix * fix * rework to not use the config * not end_of_utterance_token at the end * Update src/transformers/models/idefics/processing_idefics.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * address review * Apply suggestions from code review Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com> * Update src/transformers/image_processing_utils.py Co-authored-by: Nicolas Patry <patry.nicolas@protonmail.com> * [`Idefics`] add image_embeddings option in generate-related methods (#25442) * add image_embeddings option in generate-related 
methods * style * rename image_embeddings and allow perceiver embeddings precomputation * compute embeddings within generate * make is_encoder_decoder= True the default in config * nested if else fix * better triple check * switch if elif order for pixel values / img embeds * update model_kwargs perceiver only at the end * use _prepare_model_inputs instead of encoder_decoder logic * fix comment typo * fix config default for is_encoder_decoder * style * add typehints * precompute in forward * doc builder * style * pop instead of get image hidden states * Trigger CI * Update src/transformers/models/idefics/modeling_idefics.py Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> * Update src/transformers/models/idefics/modeling_idefics.py Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> * fix * + indentation + style * simplify a bit the use_resampler logic using comments * update diocstrings * Trigger CI --------- Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> * fix rebase changes * unbreak #25237 - to be fixed in follow up PRs * is_composition = False * no longer needed --------- Co-authored-by: leot13 <leo.tronchon@gmail.com> Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com> Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> Co-authored-by: Victor SANH <victorsanh@gmail.com> Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com> Co-authored-by: Nicolas Patry <patry.nicolas@protonmail.com> Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> |
||
|
dcb183f4bd
|
[MPT ] Add MosaicML's MPT model to transformers (#24629)
* draft add new model like * some cleaning of the config * nits * add nested configs * nits * update * update * added layer norms + triton kernels * consider only LPLayerNorm for now. * update * all keys match. * Update * fixing nits here and there * working forward pass. * removed einops dependency * nits * format * add alibi * byebye head mask * refactor attention * nits. * format * fix nits. * nuke ande updates * nuke tokenizer test * don't reshape query with kv heads * added a bit of documentation. * remove unneeded things * nuke more stuff * nit * logits match - same generations * rm unneeded methods * 1 remaining failing CI test * nit * fix nits * fix docs * fix docs * rm tokenizer * fixup * fixup * fixup and fix tests * fixed configuration object. * use correct activation * few minor fixes * clarify docs a bit * logits match à 1e-12 * skip and unskip a test * added some slow tests. * fix readme * add more details * Update docs/source/en/model_doc/mpt.md Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> * Apply suggestions from code review Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> * fix configuration issues * more fixes in config * added more models * Apply suggestions from code review Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> * remove unneeded position ids * fix some comments * Apply suggestions from code review Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> * revert suggestion * mpt alibi + added batched generation * Update src/transformers/models/mpt/__init__.py Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> * remove init config * Update src/transformers/models/mpt/configuration_mpt.py Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> * fix nit * add another slow test * Apply suggestions from code review Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * fits in one line * some 
refactor because make fixup doesn't pass * add ft notebook * update md * correct doc path --------- Co-authored-by: younesbelkada <younesbelkada@gmail.com> Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com> Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> |
||
|
a03d13c83d
|
Pvt model (#24720)
* pull and push updates * add docs * fix modeling * Add and run test * make copies * add task * fix tests and fix small issues * Checks on a Pull Request * fix docs * add desc pvt.md |
||
|
c035970212
|
Update tested versions in READMEs (#24895)
* Update supported Python and PyTorch versions in readme * Update Python, etc. versions in non-English readmes These were more out of date than in the English readme. This updates all the versions the readmes claim the repository is tested with to the same versions stated in the English readme. Those versions are current at least in the case of the Python and PyTorch versions (and less out of date for the others). * Propagate trailing whitespace fix to model list This runs "make fix-copies". The only change is the removal of whitespace. No actual information or wording is changed. * Update tested TensorFlow to 2.6 in all readmes Per pinning in setup.py Unlike Python and PyTorch, the minimum supported TensorFlow version has not very recently changed, but old versions were listed in all READMEs. |
||
|
07360b6c9c
|
[Llama2 ] Add support for Llama 2 (#24891)
* add llama * add other readmes * update padding id in readme * add link to paper * fix paths and tokenizer * more nits * styling * fit operation in 2 lines when possible * nits * Apply suggestions from code review Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * add form * update reademe * update readme, we don't have a default pad token * update test and tokenization * LLaMA instead of Llama * nits * add expected text * add greeedy output * styling * Update src/transformers/models/llama/modeling_llama.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * sequential device map * skip relevant changes --------- Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> |
||
|
3ec10e6c76
|
Add DINOv2 (#24016)
* First draft * More improvements * Convert patch embedding layer * Convert all weights * Make conversion work * Improve conversion script * Fix style * Make all tests pass * Add image processor to auto mapping * Add swiglu ffn * Add image processor to conversion script * Fix conversion of giant model * Fix documentation * Fix style * Fix tests * Address comments * Address more comments * Remove unused arguments * Remove more arguments * Rename parameters * Include mask token * Address comments * Add docstring * Transfer checkpoints * Empty commit |
||
|
e9ad51306f
|
4.32.0.dev0 | ||
|
f42a35e611
|
Add bark (#24086)
* first raw version of the bark integration * working code on small models with single run * add converting script from suno weights 2 hf * many changes * correct past_kv output * working implementation for inference * update the converting script according to the architecture changes * add a working end-to-end inference code * remove some comments and make small changes * remove unecessary comment * add docstrings and ensure no unecessary intermediary output during audio generation * remove done TODOs * make style + add config docstrings * modification for batch inference support on the whole model * add details to .generation_audio method * add copyright * convert EncodecModel from original library to transformers implementation * add two class in order to facilitate model and sub-models loading from the hub * add support of loading the whole model * add BarkProcessor * correct modeling according to processor output * Add proper __init__ and auto support * Add up-to-date copyright/license message * add relative import instead of absolute * cleaner head_dim computation * small comment removal or changes * more verbose LayerNorm init method * specify eps for clearer comprehension * more verbose variable naming in the MLP module * remove unecessary BarkBlock parameter * clearer code in the forward pass of the BarkBlock * remove _initialize_modules method for cleaner code * Remove unnecessary methods from sub-models * move code to remove unnecessary function * rename a variable for clarity and change an assert * move code and change variable name for clarity * remove unnecessary asserts * correct small bug * correct a comment * change variable names for clarity * remove asserts * change import from absolute to relative * correct small error due to comma missing + correct import * Add attribute Bark config * add first version of tests * update attention_map * add tie_weights and resize_token_embeddings for fineModel * correct getting attention_mask in 
generate_text_semantic * remove Bark inference trick * leave more choices in barkProcessor * remove _no_split_modules * fixe error in forward of block and introduce clearer notations * correct converting script with last changes * make style + add draft bark.mdx * correct BarkModelTest::test_generate_text_semantic * add Bark in main README * add dummy_pt_objects for Bark * add missing models in the main init * correct test_decoder_model_past_with_large_inputs * disable torchscript test * change docstring of BarkProcessor * Add test_processor_bark * make style * correct copyrights * add bark.mdx + make style, quality and consistency * Apply suggestions from code review Co-authored-by: Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com> * Remove unnecessary test method * simply logic of a test * Only check first ids for slow audio generation * split full end-to-end generation tests * remove unneccessary comment * change submodel names for clearer naming * remove ModuleDict from modeling_bark * combine two if statements * ensure that an edge misued won't happen * modify variable name * move code snippet to the right place (coarse instead of semantic) * change BarkSemanticModule -> BarkSemanticModel * align BarkProcessor with transformers paradigm * correct BarkProcessor tests with last commit changes * change _validate_voice_preset to an instance method instead of a class method * tie_weights already called with post_init * add codec_model config to configuration * update bark modeling tests with recent BarkProcessor changes * remove SubModelPretrainedModel + change speakers embeddings prompt type in BarkModel * change absolute imports to relative * remove TODO * change docstrings * add examples to docs and docstrings * make style * uses BatchFeature in BarkProcessor insteads of dict * continue improving docstrings and docs + make style * correct docstrings examples * more comprehensible speaker_embeddings load/Save * rename speaker_embeddings_dict -> 
speaker_embeddings * correct bark.mdx + add bark to documentation_tests * correct docstrings configuration_bark * integrate last nit suggestions * integrate BarkGeneration configs * make style * remove bark tests from documentation_tests.txt because timeout - tested manually * add proper generation config initialization * small bark.mdx documentation changes * rename bark.mdx -> bark.md * add torch.no_grad behind BarkModel.generate_audio() * replace assert by ValueError in convert_suno_to_hf.py * integrate a series of short comments from reviewer * move SemanticLogitsProcessors and remove .detach() from Bark docs and docstrings * actually remove SemanticLogitsProcessor from modeling_bark.oy * BarkProcessor returns a single output instead of tuple + correct docstrings * make style + correct bug * add initializer_range to BarkConfig + correct slow modeling tests * add .clone() to history_prompt.coarse_prompt to avoid modifying input array * Making sure no extra "`" are present * remove extra characters in modeling_bark.py * Correct output if history_prompt is None * remove TODOs * remove ravel comment * completing generation_configuration_bark.py docstrings * change docstrings - number of audio codebooks instead of Encodec codebooks * change 'bias' docstrings in configuration_bark.py * format code * rename BarkModel.generate_audio -> BarkModel.generate_speech * modify AutoConfig instead of EncodecConfig in BarkConfig * correct AutoConfig wrong init * refactor BarkModel and sub-models generate_coarse, generate_fine, generate_text_semantic * remove SemanticLogitsProcessor and replace it with SuppressTokensLogitsProcessor * move nb_codebook related config arguments to BarkFineConfig * rename bark.mdx -> bark.md * correcting BarkModelConfig from_pretrained + remove keys_to_ignore * correct bark.md with correct hub path * correct code bug in bark.md * correct list tokens_to_suppress * modify Processor to load nested speaker embeddings in a safer way * correct batch 
sampling in BarkFineModel.generate_fine * Apply suggestions from code review Small docstrings correction and code improvements Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * give more details about num_layers in docstrings * correct indentation mistake * correct submodelconfig order of docstring variables * put audio models in alphabetical order in utils/check_repo.my * remove useless line from test_modeling_bark.py * makes BarkCoarseModelTest inherits from (ModelTesterMixin, GenerationTesterMixin, unittest.TestCase) instead of BarkSemanticModelTest * make a Tester class for each sub-model instead of inheriting * add test_resize_embeddings=True for Bark sub-models * add Copied from transformers.models.gpt_neo.modeling_gpt_neo.GPTNeoSelfAttention._split_heads * remove 'Copied fom Bark' comment * remove unneccessary comment * change np.min -> min in modeling_bark.py * refactored all custom layers to have Bark prefix * add attention_mask as an argument of generate_text_semantic * refactor sub-models start docstrings to have more precise config class definition * move _tied_weights_keys overriding * add docstrings to generate_xxx in modeling_bark.py * add loading whole BarkModel to convert_suno_to_hf * refactor attribute and variable names * make style convert_suno * update bark checkpoints * remove never entered if statement * move bark_modeling docstrings after BarkPretrainedModel class definition * refactor modeling_bark.py: kv -> key_values * small nits - code refactoring and removing unecessary lines from _init_weights * nits - replace inplace method by variable assigning * remove *optional* when necessary * remove some lines in generate_speech * add default value for optional parameter * Refactor preprocess_histories_before_coarse -> preprocess_histories Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * correct usage after refactoring * refactor Bark's generate_xxx -> generate and modify docstrings and 
tests accordingly * update docstrings python in configuration_bark.py * add bark files in utils/documentation_test.txt * correct docstrings python snippet * add the ability to use parameters in the form of e.g coarse_temperature * add semantic_max_new_tokens in python snippet in docstrings for quicker generation * Reformate sub-models kwargs in BakModel.generate Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * correct kwargs in BarkModel.generate * correct attention_mask kwarg in BarkModel.generate * add tests for sub-models args in BarkModel.generate and correct BarkFineModel.test_generate_fp16 * enrich BarkModel.generate docstrings with a description of how to use the kwargs --------- Co-authored-by: Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com> Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> |
||
|
5bb4430edc
|
[🔗 Docs] Fixed Incorrect Migration Link (#24793)
* [🔗 Docs] Fixed Incorrect Migration Link
* Update README.md
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
---------
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
|
||
|
cfc8a05305
|
Remove WWT from README (#24672) | ||
|
8a5e8a9c2a
|
Add ViViT (#22518)
* Add model * Add ability to get classification head weights * Add docs * Add imports to __init__.py * Run style * Fix imports and add mdx doc * Run style * Fix copyright * Fix config docstring * Remove imports of ViViTLayer and load_tf_weights_in_vivit * Remove FeatureExtractor and replace with ImageProcessor everywhere * Remove ViViTForPreTraining from vivit.mdx * Change ViViT -> Vivit everywhere * Add model_doc to _toctree.yml * Replace tuples with lists in arguments of VivitConfig * Rename patch_size to tubelet_size in TubeletEmbeddings * Fix checkpoint names * Add tests * Remove unused num_frames * Fix imports for VivitImageProcessor * Minor fixes * Decrease number of frames in VivitModelTester from 32 to 16 * Decrease number of frames in VivitModelTester from 16 to 8 * Add initialization for pos embeddings * Rename Vivit -> ViViT in some places * Fix docstring and formatting * Rename TubeletEmbeddings -> VivitTubeletEmbeddings * Remove load_tf_weights_in_vivit * Change checkpoint name * Remove Vivit _TOKENIZER_FOR_DOC * Fix * Fix VivitTubeletEmbeddings and pass config object as parameter * Use image_size and num_frames instead of video_size * Change conversion script and fix differences with the orig implementation * Fix docstrings * Add attention head pruning * Run style and fixup * Fix tests * Add ViViT to video_classification.mdx * Save processor in conversion script * Fix * Add image processor test * Run fixup and style * Run fix-copies * Update tests/models/vivit/test_modeling_vivit.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update tests/models/vivit/test_modeling_vivit.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update src/transformers/models/vivit/modeling_vivit.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Use PyAV instead of decord * Add unittest.skip * Run style * Remove unneeded test * Update docs/source/en/model_doc/vivit.mdx 
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update src/transformers/models/vivit/configuration_vivit.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update src/transformers/models/vivit/modeling_vivit.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update src/transformers/models/vivit/image_processing_vivit.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update src/transformers/models/vivit/modeling_vivit.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update src/transformers/models/vivit/modeling_vivit.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update src/transformers/models/vivit/image_processing_vivit.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update src/transformers/models/vivit/modeling_vivit.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Add model * Add docs * Run style * Fix imports and add mdx doc * Remove FeatureExtractor and replace with ImageProcessor everywhere * Change ViViT -> Vivit everywhere * Rename Vivit -> ViViT in some places * Update src/transformers/models/vivit/image_processing_vivit.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Run make style * Remove inputs save * Fix image processor * Fix * Run `make style` * Decrease parameters of VivitModelTester * Decrease tubelet size * Rename vivit.mdx * Update src/transformers/models/vivit/image_processing_vivit.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update src/transformers/models/vivit/image_processing_vivit.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update src/transformers/models/vivit/image_processing_vivit.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Fix default values in 
image_processing_vivit.py --------- Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> |
||
|
b3ab3fac1d
|
Falcon port (#24523)
* Initial commit * Update src/transformers/models/falcon/configuration_falcon.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update src/transformers/models/falcon/configuration_falcon.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Cleanup config docstring * Update src/transformers/models/falcon/configuration_falcon.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Convert to relative imports * Remove torch < 1.8 warning * Restructure cos_sin header * qkv -> query, key, value * Refactor attention calculation * Add a couple of config variables to account for the different checkpoints * Successful merging of the code paths! * Fix misplaced line in the non-parallel attention path * Update config and tests * Add a pad_token_id when testing * Support output_attentions when alibi is None * make fixup * Skip KV cache shape test * No more _keys_to_ignore_on_load_missing * Simplify self attention a bit * Simplify self attention a bit * make fixup * stash commit * Some more attention mask updates * Should pass all tests except assisted generation! 
* Add big model generation test * make fixup * Add temporary workaround for test * Test overrides for assisted generation * Update src/transformers/models/falcon/modeling_falcon.py Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> * Update src/transformers/models/falcon/modeling_falcon.py Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> * Update src/transformers/models/falcon/modeling_falcon.py Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> * Update tests/models/falcon/test_modeling_falcon.py Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> * Test overrides for assisted generation * Add generation demo * Update copyright * Make the docstring model actually small * Add module-level docstring * Remove all assertions * Add copied from bloom * Reformat the QKV layer * Add copied from bloom * Update src/transformers/models/falcon/modeling_falcon.py Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> * Remove unused line and reformat * No single letter variables * Cleanup return names * Add copied from line * Remove the deprecated arguments blocks * Change the embeddings test to an alibi on/off test * Remove position_ids from FalconForQA * Remove old check for token type IDs * Fix the alibi path when multi_query is False * Update src/transformers/models/falcon/modeling_falcon.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update src/transformers/models/falcon/modeling_falcon.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update tests/models/falcon/test_modeling_falcon.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update config naming * Fix typo for new_decoder_architecture * Add some comments * Fix docstring * Fix docstring * Create range in the right dtype from the start * Review comment cleanup * n_head_kv -> num_kv_heads * self.alibi -> self.use_alibi * 
self.num_kv -> self.num_kv_heads * Reorder config args * Made alibi arguments Optional * Add all model docstrings * Add extra checkpoints * Add author info for Falcon * Stop removing token_type_ids because our checkpoints shouldn't return it anymore * Add one hopeful comment for the future * Fix typo * Update tests, fix cache issue for generation * Use -1e9 instead of -inf to avoid float overflow * Recompute the rotary embeddings much less often * Re-enable disabled tests * One final fix to attention mask calculation, and update tests * Cleanup targeting falcon-40b equivalency * Post-rebase docs update * Update docstrings, especially in the config * More descriptive variable names, and comments where we can't rename them --------- Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> |
||
|
30ed3adf47
|
Add Multi Resolution Analysis (MRA) (New PR) (#24513)
* Add all files * Update masked_language_modeling.md * fix mlm models * fix conflicts * fix conflicts * fix copies * Apply suggestions from code review Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Reduce seq_len and hidden_size in ModelTester * remove output_attentions * fix conflicts * remove copied from statements * Apply suggestions from code review Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> --------- Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> |
||
|
799df10aef
|
[Umt5 ] Add google's umt5 to transformers (#24477)
* add tokenization template * update conversion script * update modeling code * update * update convert checkpoint * update modeling * revert changes on convert script * new conversion script for new format * correct position bias * cleaning a bit * Credit co authors Co-authored-by: agemagician <ahmed.elnaggar@tum.de> Co-authored-by: stefan-it <> * styling * Add docq * fix copies * add co author * Other Author * Merge branch 'main' of https://github.com/huggingface/transformers into add-umt5 * add testing * nit * Update docs/source/en/model_doc/umt5.mdx Co-authored-by: Stefan Schweter <stefan@schweter.it> * fix t5 * actual fix? * revert wrong changes * remove * update test * more fixes * revert some changes * add SPIECE_UNDERLINE * add a commone xample * upfate * fix copies * revert changes on t5 conversion script * revert bytefallback changes since there was no addition yet * fixup * fixup * ingore umt5 cutom testing folder * fix readmes * revertT5 changes * same outputs * fixup * update example * Apply suggestions from code review * style * draft addition of all new files * current update * fix attention and stuff * finish refactoring * auto config * fixup * more nits * add umt5 to init * use md format * Update README.md Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * revert changes on mt5 * revert mt4 changes * update test * more fixes * add to mapping * fix-copies * fix copies * foix retain grad * fix some tests * nits * done * Update src/transformers/models/umt5/modeling_umt5.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update docs/source/en/model_doc/umt5.md * Update src/transformers/models/umt5/__init__.py * Update docs/source/en/model_doc/umt5.md Co-authored-by: Stefan Schweter <stefan@schweter.it> * Update src/transformers/models/umt5/modeling_umt5.py * update conversion script + use google checkpoints * nits * update test and modelling * stash slow convert * update fixupd * don't change slow 
--------- Co-authored-by: stefan-it <> Co-authored-by: Stefan Schweter <stefan@schweter.it> Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> |
||
|
1c1c90756d
|
Add Musicgen (#24109)
* Add Audiocraft * add cross attention * style * add for lm * convert and verify * introduce t5 * split configs * load t5 + lm * clean conversion * copy from t5 * style * start pattern provider * make generation work * style * fix pos embs * propagate shape changes * propagate shape changes * style * delay pattern: pad tokens at end * audiocraft -> musicgen * fix inits * add mdx * style * fix pad token in processor * override generate and add todos * add init to test * undo pattern delay mask after gen * remove cfg logits processor * remove cfg logits processor * remove logits processor in favour of mask * clean pos embs * make fix copies * update readmes * clean pos emb * refactor encoder/decoder * make fix copies * update conversion * fix config imports * update config docs * make style * send pattern mask to device * pattern mask with delay * recover prompted audio tokens * fix docstrings * laydown test file * pattern edge case * remove t5 ref * add processing class * config refactor * better pattern comment * check if mask is not present * check if mask is not present * refactor to auto class * remove encoder configs * fix processor * processor import * start updating conversion * start updating tests * make style * convert t5, encodec, lm * convert as composite * also convert processor * run generate * classifier free gen * comments and clean up * make style * docs for logit proc * docstring for uncond gen * start lm tests * work tests * let the lm generate * refactor: reshape inside forward * undo greedy loop changes * from_enc_dec -> from_sub_model * fix input id shapes in docstrings * Apply suggestions from code review Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com> * undo generate changes * from sub model config * Update src/transformers/models/musicgen/modeling_musicgen.py Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com> * make generate work again * generate uncond -> get uncond inputs * remove prefix allowed tokens fn * 
better error message * logit proc checks * Apply suggestions from code review Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com> * make decoder only tests work * composite fast tests * make style * uncond generation * feat extr padding * make audio prompt work * fix inputs docstrings * unconditional inputs: dict -> model output * clean up tests * more clean up tests * make style * t5 encoder -> auto text encoder * remove comments * deal with frames * fix auto text * slow tests * nice mdx * remove can generate * todo - hub id * convert m/l * make fix copies * only import generation with torch * ignore decoder from tests * don't wrap uncond inputs * make style * cleaner uncond inputs * add example to musicgen forward * fix docs * ignore MusicGen Model/ForConditionalGeneration in auto mapping * add doc section to toctree * add to doc tests * add processor tests * fix push to hub in conversion * tips for decoder only loading * Apply suggestions from code review Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * fix conversion for s / m / l checkpoints * import stopping criteria from module * remove from pipeline tests * fix uncond docstring * decode audio method * fix docs * org: sanchit-gandhi -> facebook * fix max pos embeddings * remove auto doc (not compatible with shapes) * bump max pos emb * make style * fix doc * fix config doc * fix config doc * ignore musicgen config from docstring * make style * fix config * fix config for doctest * consistent from_sub_models * don't automap decoder * fix mdx save audio file * fix mdx save audio file * processor batch decode for audio * remove keys to ignore * update doc md * update generation config * allow changes for default generation config * update tests * make style * fix docstring for uncond * fix processor test * fix processor test --------- Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com> Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com> 
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> |
||
![]() |
868363abb9
|
Add InstructBLIP (#23460)
* Squash 88 commits * Use markdown * Remove mdx files due to bad rebase * Fix modeling files due to bad rebase * Fix style * Update comment * fix --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com> |
||
![]() |
0b7b4429c7
|
Update test versions on README.md (#24307)
Update README.md Updated the tested versions |
||
![]() |
0c3fdccf2f
|
[WIP] add EnCodec model (#23655)
* boilerplate stuff * messing around with the feature extractor * fix feature extractor * unit tests for feature extractor * rename speech to audio * quick-and-dirty import of Meta's code * import weights (sort of) * cleaning up * more cleaning up * move encoder/decoder args into config * cleanup model * rename EnCodec -> Encodec * RVQ parameters in config * add slow test * add lstm init and test_init * Add save & load * finish EncodecModel * remove decoder_input_values as they are ont used anywhere (not removed from doc yet) * fix test feature extraction model name * Add better slow test * Fix tests * some fixup and cleaning * Improve further * cleaning up quantizer * fix up conversion script * test don't pass, _encode_fram does not work * update tests with output per encode and decode * more cleanup * rename _codebook * remove old config cruft * ratios & hop_length * use ModuleList instead of Sequential * clean up resnet block * update types * update tests * fixup * quick cleanup * fix padding * more styl,ing * add patrick feedback * fix copies * fixup * fix lstm * fix shape issues * fixup * rename conv layers * fixup * fix decoding * small conv refactoring * remove norm_params * simplify conv layers * rename conv layers * stuff * Clean up * Add padding logic use padding mask small conv refactoring remove norm_params simplify conv layers rename conv layers stuff add batched test update Clean up merge and update for padding fix padding fixup * clean up more * clean up more * More clean ups * cleanup convolutions * typo * fix typos * fixup * build PR doc? 
* start refactoring docstring * fix don't pad when no strid and chunk * update docstring * update docstring * nits * update going to lunch * update config and model * fix broken testse (becaue of the config changes) * fix scale computation * fixu[ * only return dict if speciefied or if config returns it * remove todos * update defaults in config * update conversion script * fix doctest * more docstring + fixup * nits on batched_tests * more nits * Apply suggestions from code review Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com> * update basxed on review * fix update * updaet tests * Apply suggestions from code review Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * fixup * add overlap and chunl_length_s * cleanup feature extraction * teste edge cases truncation and padding * correct processor values * update config encodec, nits * fix tests * fixup * fix 24Hz test * elle tests are green * fix fixup * Apply suggestions from code review * revert readme changes * fixup * add example * use facebook checkpoints * fix typo * no pipeline tests * use slef.pad everywhere we can * Apply suggestions from code review Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * update based on review * update * update mdx * fix bug and tests * fixup * fix doctest * remove comment * more nits * add more coverage for `test_truncation_and_padding` * fixup * add last test * fix text * nits * Update tests/models/encodec/test_modeling_encodec.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * take care of the last comments * typo * fix test * nits * fixup * Update src/transformers/models/encodec/feature_extraction_encodec.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> --------- Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com> Co-authored-by: arthur.zucker@gmail.com <arthur.zucker@gmail.com> Co-authored-by: Arthur 
<48595927+ArthurZucker@users.noreply.github.com> Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> |
||
![]() |
ba695c1efd
|
v4.31.0.dev0 | ||
![]() |
07c54413ac
|
Add MobileViTv2 (#22820)
* generated code from add-new-model-like * Add code for modeling, config, and weight conversion * add tests for image-classification, update modeling and config * add code, tests for semantic-segmentation * make style, make quality, make fix-copies * make fix-copies * Update modeling_mobilevitv2.py fix bugs * Update _toctree.yml * update modeling, config fix bugs * Edit docs - fix bug MobileViTv2v2 -> MobileViTv2 * Update mobilevitv2.mdx * update docstrings * Update configuration_mobilevitv2.py make style * Update convert_mlcvnets_to_pytorch.py remove unused options * Update convert_mlcvnets_to_pytorch.py make style * Add suggestions from code review Co-Authored-By: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * make style, make quality * Add suggestions from code review Co-Authored-By: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Add suggestions from code review Remove MobileViTv2ImageProcessor Co-Authored-By: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * make style * Add suggestions from code review Rename MobileViTv2 -> MobileViTV2 Co-Authored-By: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Add suggestions from code review Co-Authored-By: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update modeling_mobilevitv2.py make style * Update serialization.mdx * Update modeling_mobilevitv2.py --------- Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> |
||
![]() |
5dfd407b37
|
[MMS] Scaling Speech Technology to 1,000+ Languages | Add attention adapter to Wav2Vec2 (#23813)
* add fine-tuned with adapter layer * Add set_target_lang to tokenizer * Implement load adapter * add tests * make style * Apply suggestions from code review * Update src/transformers/models/wav2vec2/tokenization_wav2vec2.py * make fix-copies * Apply suggestions from code review * make fix-copies * make style again * mkae style again * fix doc string * Update tests/models/wav2vec2/test_tokenization_wav2vec2.py * Apply suggestions from code review * fix * Correct wav2vec2 adapter * mkae style * Update src/transformers/models/wav2vec2/modeling_wav2vec2.py Co-authored-by: Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com> * add more nice docs * finish * finish * Apply suggestions from code review Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Apply suggestions from code review Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Apply suggestions from code review * all finish --------- Co-authored-by: Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com> Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> |
||
![]() |
4b6a5a7caa
|
[Time-Series] Autoformer model (#21891)
* ran `transformers-cli add-new-model-like`
* added `AutoformerLayernorm` and `AutoformerSeriesDecomposition`
* added `decomposition_layer` in `init` and `moving_avg` to config
* added `AutoformerAutoCorrelation` to encoder & decoder
* removed canonical self attention `AutoformerAttention`
* added arguments in config and model tester. Init works! 😁
* WIP autoformer attention with autocorrelation
* fixed `attn_weights` size
* wip time_delay_agg_training
* fixing sizes and debug time_delay_agg_training
* aggregation in training works! 😁
* `top_k_delays` -> `top_k_delays_index` and added `contiguous()`
* wip time_delay_agg_inference
* finish time_delay_agg_inference 😎
* added resize to autocorrelation
* bug fix: added the length of the output signal to `irfft`
* `attention_mask = None` in the decoder
* fixed test: changed attention expected size, `test_attention_outputs` works!
* removed unnecessary code
* apply AutoformerLayernorm in final norm in enc & dec
* added series decomposition to the encoder
* added series decomp to decoder, with inputs
* added trend todos
* added autoformer to README
* added to index
* added autoformer.mdx
* remove scaling and init attention_mask in the decoder
* make style
* fix copies
* make fix-copies
* initial fix-copies
* fix from https://github.com/huggingface/transformers/pull/22076
* make style
* fix class names
* added trend
* added d_model and projection layers
* added `trend_projection` source, and decomp layer init
* added trend & seasonal init for decoder input
* AutoformerModel cannot be copied as it has the decomp layer too
* encoder can be copied from time series transformer
* fixed generation and made distrb. out more robust
* use context window to calculate decomposition
* use the context_window for decomposition
* use output_params helper
* clean up AutoformerAttention
* subsequences_length off by 1
* make fix copies
* fix test
* added init for nn.Conv1d
* fix IGNORE_NON_TESTED
* added model_doc
* fix ruff
* ignore tests
* remove dup
* fix SPECIAL_CASES_TO_ALLOW
* do not copy due to conv1d weight init
* remove unused imports
* added short summary
* added label_length and made the model non-autoregressive
* added params docs
* better doc for `factor`
* fix tests
* renamed `moving_avg` to `moving_average`
* renamed `factor` to `autocorrelation_factor`
* make style
* Update src/transformers/models/autoformer/configuration_autoformer.py
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>
* Update src/transformers/models/autoformer/configuration_autoformer.py
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>
* fix configurations
* fix integration tests
* Update src/transformers/models/autoformer/configuration_autoformer.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* fixing `lags_sequence` doc
* Revert "fixing `lags_sequence` doc"
This reverts commit
|
||
![]() |
3cf01b2060
|
README: Fix affiliation for MEGA (#23394)
* README: Fix affiliation for MEGA * Fix quality --------- Co-authored-by: Lysandre <lysandre@huggingface.co> |
||
![]() |
ea0eb15649
|
Small fixes and link in the README (#23428)
Fix + link |
||
![]() |
c045249049
|
Add swiftformer (#22686)
* Commit the automatically generated code
using add-new-model-like
* Update description at swiftformer.mdx file
* remove autogenerated code for MaskedImageModeling
* update weight conversion scripts
* Update modeling_swiftformer.py
* update configuration_swiftformer.py
* Update test_modeling_swiftformer.py
* update modeling code - remove einops dependency
* Update _toctree.yml
* update modeling code - remove copied from comments
* update docs
* Revert "update docs"
This reverts commit
|
||
![]() |
a0c0a78233
|
v4.30.0.dev0 | ||
![]() |
b4d4d6fe87
|
Add RWKV-4 (#22797)
* First draft of RWKV-4 * Add support for generate * Style post-rebase * Properly use state * Write doc * Fix doc * More math * Add model to README, dummies and clean config * Fix init * multiple fixes: - fix common tests - fix configuration default values - add CI test for checking state computation - fix some CI tests * correct tokenizer * some tweaks - fix config docstring - fix failing tests * fix CI tests - add output_attention / output_hidden_states - override test_initialization - fix failing CIs * fix conversion script - fix sharded case - add new arguments * add slow tests + more fixes on conversion script * add another test * final fixes * change single name variable * add mock attention mask for pipeline to work * correct eos token id * fix nits * add checkpoints * Apply suggestions from code review Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * add `tie_word_embeddings` in docstring * change tensor name * fix final nits * Trigger CI --------- Co-authored-by: younesbelkada <younesbelkada@gmail.com> Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com> Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> |
||
![]() |
c2c99dc7ef
|
add open-llama model with ckpt (#22795)
* update Open-Llama model * update * update format * update doc * update * update stable embedding test * update test case * update format * update readme * fix typo * update name * remove tokenizer and update format * remove convert_open_llama_weights_to_hf * update warning and doc_string --------- Co-authored-by: songliang.bayesian <songliang.bayesian@bytedance.com> |
||
![]() |
a0e7332839
|
Fix CLAP link across all READMEs (#23032)
* Fix CLAP link across all READMEs * Fix copy only for en |
||
![]() |
3d3204c025
|
Add FocalNet (#21532)
Adds FocalNet by Microsoft to transformers --------- Co-authored-by: Niels Rogge <nielsrogge@Nielss-MacBook-Pro.local> Co-authored-by: alaradirik <alaradirik@gmail.com> |
||
![]() |
2da73f6302
|
[SAM ] Correct arxiv link (#22886)
put correct link |
||
![]() |
474bf508df
|
Add Segment Anything Model (SAM) (#22654)
* initial commit * keys match * update, fix conversion * fixes, inference working * fix * more fixes * more fixes * clean up * more clean up * fix copies and add convext copied layer norm * stash * pretty big upfate * cleaning * more cleaning * fixup stuffs * fix copies * fix iinit * update test removing tokenizer * nits * add pretrained * more nits * remove tracking of pipeline * few fixes * update san and conversion script * fix mask decoder and prompt encoder conversion * fixes * small update * fix order * fix * fix image embeddings * nites * few fixes * fix logits * clean up * fixes boxes inference * v1 AMG * clean up * some clean up * multi points support * amg working * fixup * clean up * readme * update toctree * fix type hint * multiple fixes * fixup * fixes * updates * updates * more tests * few fixes * change to `SamForMaskGeneration` * doc * fixup * fix more tests * multiple fixes * fix CI tests * refactor processor * renamings * draft the pipeline * refactor * fix tests * fix test * few cleanings * fix test * edit pipelien support chunking * udate * add slow tests * fix nit * fixup * fix nit * current chunk pipleine * cast boxes in fp32 * nit * current updates * piepleine works * fixup * clean up config * fix slow tests * fix slow tests * clean up * update doc and pipeline * adds more slow tests * fix slow tests * cleaning * tests pass * add docstring * fix copies * clean up * support batch of images * style * dummy is needed, add tests * fix slow tests * fix CI * update * adds more tests * fixes * fixes * fixup * fixes * few fixes * filter * few fixes * some refactor * touches finales * fix * style * remove pipeline files * fixes nits * revert pipeline changes * fix test * fixup * remove automodel for automatic mask generation * fix failing torch tests * update mdx * revert removal of `MODEL_FOR_AUTOMATIC_MASK_GENERATION_MAPPING` * update sam config based on review Co-authored-by: amyeroberts <aeroberts4444@gmail.com> Co-authored-by: sgugger 
<sylvain.gugger@gmail.com> * update low_resolution_masks -> pred_masks inti ln with layer_norm_eps add_decomposed_rel_pos doc forward doc of SamForMaskGeneration * update processor docstring * remove image processor import empty * update for testing * output vision hidden states + clean recomm also test all iou values * fixup * fixup * remove unused * Update src/transformers/models/sam/modeling_sam.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update src/transformers/models/sam/image_processing_sam.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * nits * fix * fix CI tests and slow tests * replace with Amy's processor * clearer docstring * add `SamVisionNeck` * refactor - all CI tests should pass * fix broken import on Gcolab * few fixes here and there * fix another bug * fix more bugs * update and merge * correct ckpt * address comments * add tips * revert * fix docstring * replace with `SamModel` * make fixup * add support for bathed images and batch ed points * make fixup this time, really * make fixup again and again * few fixes here and there, this should be the touche finale * Update docs/source/en/model_doc/sam.mdx * fixup * correct checkpoints * correct name * rm unneeded file * add notebook --------- Co-authored-by: younesbelkada <younesbelkada@gmail.com> Co-authored-by: amyeroberts <aeroberts4444@gmail.com> Co-authored-by: sgugger <sylvain.gugger@gmail.com> Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com> |
||
![]() |
5f97bbc124
|
Remove 'main' from doc links (#22860) | ||
![]() |
888c4a2ae0
|
v4.29.0.dev0 | ||
![]() |
523ca4e016
|
add model resources for CPMAnt (new) (#20906)
* resolve conflicts * rebase and make style * test * test * test * rebase and make style * rebase and make style * tests * tests * rewrite some functions * rebase and make style * fix load_tf_weights_in_cpmant * reformat some unrelated files * upgrade quality * fix some bugs & docstring * add models and tests * solve conflicts * resolve conflicts * resolve conflicts * resolve conflicts * resolve conflicts * tests * resolve conflicts * resolve conflicts * fix load_tf_weights_in_cpmant * reformat some unrelated files * upgrade quality * fix some bugs & docstring * save resolution * make style * delete redefinition code * reformat function * reformat * resolve conflicts * resolve conflicts * resolve conflicts * resolve conflicts * resolve conflicts * tests * resolve conflicts * resolve conflicts * fix load_tf_weights_in_cpmant * reformat some unrelated files * upgrade quality * resolve conflicts * resolve conflicts * resolve conflicts * resolve conflicts * resolve conflicts * fix load_tf_weights_in_cpmant * reformat some unrelated files * upgrade quality * resolve conflicts * make style * fix bugs and refactor * modify docstrings and make style * unify import format in __init__.py * fix import-altclp bug * fix copies to update index.md * fix unused config parameters * fix unused config parameters * fix unused config parameters * update README_ja.md * dummy commit for unit test * fix attention mask * add CPMAntTokenizer&-Fast to auto-mapping * drop redundant changes in README_ko * fix defaults in docstring * fix use_cache and some docstring * add missing args in tokenizer * modify tester inheritance * add is_jieba_available * fix some bugs * make style and fix-copies * add doctests * skip integration tests * add is_jieba_available * fix bugs in common tests * adjust docstrings and make style * add argument docstring * adjust code to some specifications * make style and fix-copies * add fast tokenization test * dummy commit for unit test * dummy commit for unit test * 
dummy commit for unit test * normalize some comments and names * Bert->CPMAnt * camel names and drop redundant codes * make style and fix-coies * add CpmTokenizerFast _import_structure * drop cpmanttokenizerfast in model_doc * fix some problems * fix CPMAnt tokenization for common test * make style and fixup * fix copies and fixup * fix bugs in tokenization test * dummy commit for connection failure in unittest * fix copies * drop trailing comma * fix decorator in tests * dummy commit for connection failure in unittest --------- Co-authored-by: Gong Baitao <gongbaitao11@gmail.com> |
||
![]() |
e0921c6b53
|
Add GPTBigCode model (Optimized GPT2 with MQA from Santacoder & BigCode) (#22575)
* Add model with cli tool * Remove unwanted stuff * Add new code * Remove inference runner * Style * Fix checks * Test updates * make fixup * fix docs * fix doc * fix test * hopefully fix pipeline tests * refactor * fix CIs * add comment * rename to `GPTBigCodeForCausalLM` * correct readme * make fixup + docs * make fixup * fixes * fixes * Remove pruning * Remove import * Doc updates * More pruning removal * Combine copies * Single MQA implementation, remove kv cache pre-allocation and padding * Update doc * Revert refactor to match gpt2 style * Merge back key and value caches, fix some type hints * Update doc * Fix position ids pith padding (PR 21080) * Add conversion script temporarily * Update conversion script * Remove checkpoint conversion * New model * Fix MQA test * Fix copies * try fix tests * FIX TEST!! * remove `DoubleHeadsModel` * add MQA tests * add slow tests * clean up * add CPU checker * final fixes * fixes - fix GPU issue - fixed slow tests - skip disk offload * fix final issue * Simplify and comment baddbmm fix * Remove unnecessary code * Transpose tweaks * Use beta=1 on cpu, improve tests --------- Co-authored-by: younesbelkada <younesbelkada@gmail.com> |
||
![]() |
176ceff91f
|
Add DePlot + MatCha on transformers (#22528)
* add deplot + matcha on `transformers` * more docs * correct path * Update docs/source/en/model_doc/deplot.mdx Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * fix * use auto processor * Update docs/source/en/model_doc/matcha.mdx Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * make fixup * Update docs/source/en/model_doc/deplot.mdx Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * add correct names --------- Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> |
||
![]() |
19ade2426a
|
[WIP]NLLB-MoE Adds the moe model (#22024)
* Initial commit * update modeling code * update doc * add functions necessary * fix impotrs * revert changes * fixup * more styling to get going * remove standalone encoder * update code * styling * fix config and model * update code and some refactoring * make more tests pass * Adding NLLB-200 - MoE - 54.5B for no language left behind Fixes #21300 * fix mor common tests * styke * update testing file * update * update * Router2 doc * update check config with sparse layer * add dummy router * update current conversion script * create on the fly conversion script * Fixup * style * style 2 * fix empty return * fix return * Update default config sparse layers * easier to create sparse layers * update * update conversion script * update modeling * add to toctree * styling * make ruff happy * update docstring * update conversion script * update, will break tests but impelemting top2 * update * ❗local groups are supported here * ⚠️ Support for local groups is now removed ⚠️ This is because it has to work with model parallelism that we do not support * finish simplificaiton * Fix forward * style * fixup * Update modelling and test, refactoring * update tests * remove final layer)norm as it is done in the FF * routing works! Logits test added * nit in test * remove top1router * style * make sure sparse are tested. Had to change route_tokens a liottle bit * add support for unslip models when converting * fixup * style * update test s * update test * REFACTOR * encoder outputs match! * style * update testing * 🎉encoder and decoder logits match 🎉 * styleing * update tests * cleanup tests * fix router test and CIs * cleanup * cleanup test styling * fix tests * Finally the generation tests match! 
* cleanup * update test * style testing file * remove script * cleanup * more cleanup * nits * update * NLLB tokenizer is wrong and will be fixed soon * use LongTensors * update tests * revert some small changes * fix second expert sampling and batch prioritized routing * update tests * finish last tests * make ruff happy * update * ruff again * style * Update docs/source/en/model_doc/nllb-moe.mdx Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Updates based on review * style and fix import issue * nit * more nits * cleanup * styling * update test_seconde_expert_policy * fix name * last nit on the markdown examples --------- Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> |
||
![]() |
57f25f4b7f
|
Add Mega: Moving Average Equipped Gated Attention (#21766)
* add mega file structure and plain pytorch version of mega source code * added config class with old naming conventions * filled in mega documentation * added config class and embeddings with optional token types * updated notes * starting the conversion process, deleted intermediate and added use_cache back to config * renamed config attributes in modeling_mega.py * checkpointing before refactoring incremental decoding functions * removed stateful incremental key/values for EMA and self-attention * refactored MovingAverageGatedAttention to remove stateful k/v history and use unified attention mask * MovingAverageGatedAttention works with incremental decoding + past values, added sequence length enforcement * more comments in MovingAverageGatedAttention + checkpointing before GatedCrossAttention * bug fix in attention mask handling in MovingAverageGatedAttention * removed incremental state from GatedCrossAttention and removed IncrementalState class * finished gated cross attention and got MegaLayer working * fixed causal masking in mega decoder * fixed how padding and causal masks are passed through MegaLayer with and without k/v caching * finished MegaModel; tested with encoder, decoder-only, and cross-attention type inputs; started work on downstream classes; removed mentions of position_ids * added optional dense hidden layer for masked and causal LM classes * docstring updates in MultiHeadEMA and GatedCrossAttention, removed unnecessary inputs in cross-attention * removed before_attn_fn in Mega class and updated docstrings and comments up to there * bug fix in MovingAverageGatedAttention masking * working conversion of MLM checkpoint in scratchpad script -- perfect matches * moved arg for hidden dense layer in LM head to config; discovered issue where from_pretrained is renaming gamma and beta parameters * renamed gamma and beta parameters to avoid HF renaming when loading from checkpoint * finished checkpoint conversion script * cleanup old class in mega 
config script * removed 'copied from' statements and passing integration tests * added num_attention_heads=1 to config for integration compatibility, decoder tests working, generation tests failing * fixed tuple output of megamodel * all common tests passing after fixing issues in decoder, gradient retention, and initialization * added mega-specific tests, ready for more documentation and style checks * updated docstrings; checkpoint before style fixes * style and quality checks, fixed initialization problem in float_tensor, ready for PR * added mega to toctree * removed unnecessary arg in megaconfig * removed unused arg and fixed code samples with leftover roberta models * Apply suggestions from code review Applied all suggestions except the one renaming a class, as I'll need to update that througout Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> * fixed issue where .view breaks batch dimension, conversion script fixed with absolute imports, updated readme with Mega->MEGA * removed asserts in Mega code, renamed sequencenorm, gatedcrossattention, and NFFN, replaced get_activation_fn with ACTFN, and added sequencenorm to layer norms * reformatted .forward() docstrings to match style and removed unused mask input in cross-attention * removed all reset_parameters() methods and rolled into MegaPreTrainedModel._init_weights() * renamed all single-letter variables and improved readability in tensor size comments, Mega->MEGA in 2 documentation files * variable names in NFFN * manual Mega->MEGA changes in docs * Mega->MEGA in config auto * style and quality fixes * Apply suggestions from code review Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> * renamed parameters and variables with confusing names, added copied from statements, moved fft conv to its own method, other cleanup from PR comments * commit before dealing with merge conflicts * made new attention activation functions available in ACT2FN and added generation 
test from OPT * style and quality in activations and tests * documentation fixes, renaming variables in dropout and rotary positions, used built-in causal masking, encoders->layers in MegaModel, moved comments into docstrings * style and quality fixes after latest updates, before rotary position ids * causal mask in MegaBlock docstring + added missing device passing * Apply suggestions from code review Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> * Update README.md Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * added Mega prefixes where missing, reverted MegaSequenceNorm to if-else, other module renaming requested in PR * style and quality fixes + readme updates pointing to main --------- Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> |
||
![]() |
0f68a7f408
|
Add Pix2Struct (#21400)
* v1 all keys match * clean up * forward pass ok * add correct image transform * generate works, logits matching * clean up * more refactor * revert * revert * clean up * clean ups * clean up * refactor * refactor * fix doc * fix tokenizer test * fix toctree * revert toctree * oops * few fixes * replace to `pixel_embeds` * make fixup * test processing & feat extractor * fix some tests * more fixes * make fixup * clean up * more clean up * add a single slow test * fix test * make fixup * fix * fix authors * fix toctree * update docs * add docstring * revert change * Update src/transformers/models/pix2struct/__init__.py Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> * fix tokenizer * fix processor test * fix test * make fixup * refactor * fix config * Update src/transformers/models/pix2struct/image_processing_pix2struct.py Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> * format * fix * Update src/transformers/models/pix2struct/image_processing_pix2struct.py Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> * make fixup * add docstring * fix issues * fix * fix * fix * add slow test * fix * fix * fix batched issue * fix training issues * fix ci test * fix slow test * fix conversion script * remove unneeded classes * fix slow test * fix require backends * fix masked fill * revert * fix softmax * add large models support * fix conditional generation * few fixes * add instructions * rm unneeded file * Update src/transformers/models/pix2struct/convert_pix2struct_original_pytorch_to_hf.py * fix ci test * fix ci test really * Apply suggestions from code review Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * fix nit * fix nits * fix image processors nits * docstring * clean up * fix nit * fix tests * docstring nit * fix reshape * Update src/transformers/models/pix2struct/image_processing_pix2struct.py Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * 
fix nit * fix repetition * refactor processor * make patch size consistent * refactor forward * fix docstring * fix max_patches issue * update docstirng * update docstring * fix coped from * add skip reasons * few fixes * Update src/transformers/models/pix2struct/image_processing_pix2struct.py Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * format * fix doctests * refactor and fix * fix doc build issue * fix processor test * small fix conversion script * replace correct weights * make fixup * fix some issues * Apply suggestions from code review Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * revert config and fixes * Update src/transformers/models/pix2struct/image_processing_pix2struct.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * more details * fixes * fix processor * fix processor test * fix * Apply suggestions from code review Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * make fixup * fix processor * Update src/transformers/models/pix2struct/modeling_pix2struct.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * add copied * make fixup * fix copies * update docstring * refactor * fix docstring * fix conversion script * fix vqa issue * replace to `flattened_patches` * nit * fix numpy issue * fix image processors * add batched vqa support * fix vqa conversion * make fixup * fix conversion script * Apply suggestions from code review Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * make fixup * add correct docstring * update docstring * fix module level + channel dim * use `make_list_of_images` * refactor * correct docstring * fix authors * remove `data_format` * add header text test * Apply suggestions from code review Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * make fixup * add 
checkpoints --------- Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> |
||
|
0041be5b3d
|
LLaMA Implementation (#21955)
* LLaMA * sharding and docs * tweak * black * inits * ruff * LLAMA_PRETRAINED_CONFIG_ARCHIVE_MAP * init * no checkpoint * docs * ruff * type_vocab_size * tokenizer fixes * tokenizer fixes * Update tokenization_llama.py * Update tokenization_llama.py * Update configuration_llama.py * Update modeling_llama.py * tokenizer add_bos by default * licenses * remove decoder * norms and mlp * rope overhaul * tweaks * black * mention OPT implementation * off-by-one naming * typo * fix * tokenization fix and slicing bug * padding config * cleanup * black * update tests * undo typo * fix vocab caching logic * ruff * docbuilder * attn fix from BlackSamorez * initial feedback * typo * docs * llama case * llama case * load checkpoint docs * comment about tokenizer * tokenizer defaults * clear past_key_values if use_cache=False * last tweaks * last tweaks * last tweaks * last tweaks --------- Co-authored-by: Stella Biderman <stellabiderman@gmail.com> |
||
|
ebdb185bef
|
v4.28.0.dev0 | ||
|
cdddfbffa1
|
Add ConvNeXT V2 (#21679)
* Add ConvNeXt V2 to transformers * TF model is separated from the PR to fix issues |
||
|
6cb5132a7f
|
Fix doc link for MGP-STR (#22138) | ||
|
102b5ff4a8
|
add new model of MGP-STR (#21418)
* add new model of MGP-STR * fix the check failings * remove torch and numpy from mgp_tokenization * remove unused import from modeling_mgp_str * add test_processing_mgp_str * rm test_processing_mgp_str.py * add test_processing_mgp_str * add test_processing_mgp_str * add test_processing_mgp_str * rm test_processing_mgp_str and add softmax outs to model * rm test_processing_mgp_str and add softmax outs to model * rewrite the code of mgp-str according to PR suggestions * rewrite the code of mgp-str according to PR suggestions * add new model of MGP-STR * fix the check failings * remove torch and numpy from mgp_tokenization * remove unused import from modeling_mgp_str * add test_processing_mgp_str * rm test_processing_mgp_str.py * add test_processing_mgp_str * add test_processing_mgp_str * add test_processing_mgp_str * rm test_processing_mgp_str and add softmax outs to model * rewrite the code of mgp-str according to PR suggestions * rewrite the code of mgp-str according to PR suggestions * remove representation_size from MGPSTRConfig * reformat configuration_mgp_str.py * format test_processor_mgp_str.py * add test for tokenizer and complete model/processer test and model file * rm Unnecessary tupple in modeling_mgp_str * reduce hidden_size/layers/label_size in test_model * add integration tests and change MGPSTR to Mgpstr * add test for logit values * reformat test model file --------- Co-authored-by: yue kun <yuekun.wp@alibaba-inc.com> |
||
|
8abe4930d3
|
[Time-Series] informer model (#21099)
* added informer to gitignore * added informer to gitignore * WIP informer2020 * added checking that instantiate works * added config using gluonTS by kashif * WIP config * adding informeConfig. need to remove FeatureEmbedder * done InformerConfig, but need to change the names * Done informer model init. working on enc-dec * added things to address, after reading again enc-dec in the paper * done modeling - checking initialization work * added informer to gitignore * WIP informer2020 * added checking that instantiate works * added config using gluonTS by kashif * WIP config * adding informeConfig. need to remove FeatureEmbedder * done InformerConfig, but need to change the names * Done informer model init. working on enc-dec * added things to address, after reading again enc-dec in the paper * done modeling - checking initialization work * moved enc-dec init to InformerEncoder/Decoder init * added 'init_std' to config, now model init works! * WIP conversion script, and added code sources * WIP conversion script: loading original informer pth works * WIP conversion script: change defaults in the config * WIP conversion script: supporting Informer input embedding * WIP conversion script: added parameters for the informer embed * WIP conversion script: change dim_feedforward=2048 * WIP conversion script: remove unused args for loading checkpoint * just cleaning up * DataEmbedding removed, after thinking with Kashif * working on forward pass * WIP forward pass: trying to establish working batch for forward pass * cleaning and finalizing * adding HF names and docs * init after cleaning works * WIP in tests * added docs for the informer specific args * fix style * undo change * cleaning informer, now need to work only enc-dec * initial enc-dec classes * added encoder and decoder * added todo * add todos for conv_layers * added decoder docs from vanilla * added encoder docs from vanilla * remove encoder decoder from the original informer * removed AttentionLayer from the 
original paper * removed TriangularCausalMask, same as decoder_attention_mask * initial sparse attention * use conv_layers * fixed test_config test * fix parenthesis when itearting zip(layers, conv_layers) * error found in prob attention, added sizes as comments * fix sizes * added proposal for q_reduce indexing, and remove unused * WIP ProbMask, and changed factor=2 for testing * remove unused libs for this PR for creating the env * fix checking the attn_weights.size() after bmm * Q_reduce: changed from torch.gather to simple slicing * WIP calculate final attn_output * finish adding v_aggregated, attn_output ready * changed tgt_len to u in attention_mask, need to fix the size error * comment attention_mask for encoder, and fix if cond for v_agg * added ProbMask support (wip), removed old original code * finished ProbMask 😃 * Revert "remove unused libs for this PR for creating the env" This reverts commit |
||
|
c5fe06c59d
|
Update README logo (#21933) | ||
|
82aac00e0f
|
[Flan-UL2] Add-flan-ul2 (#21929)
* add doc and readme * add model docs * update toctree and fix copies * update * update doc file * fix * add FLAN-UL2 to configuration mapping * fixup * Apply suggestions from code review * more clarification --------- Co-authored-by: younesbelakda <younesbelkada@gmail.com> Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com> |
||
|
269b054939
|
Add ALIGN to transformers (#21741)
Adds the ALIGN model to transformers. ALIGN is introduced in "Scaling Up Visual and Vision-Language Representation Learning With Noisy Text Supervision" by Chao Jia, Yinfei Yang, Ye Xia, Yi-Ting Chen, Zarana Parekh, Hieu Pham, Quoc V. Le, Yunhsuan Sung, Zhen Li, Tom Duerig. |
||
|
49ab16239c
|
Add EfficientNet (#21563)
* Add EfficientNet to transformers |
||
|
f56174ac5b
|
add GPTSAN model (reopen) (#21291)
* add GPTSAN-Japanese * add GPTSAN * add GPTSAN * add GPTSAN * add GPTSAN * add GPTSAN * add GPTSAN * add GPTSAN * add GPTSAN * add GPTSAN * add GPTSAN * add GPTSAN * add GPTSAN * add GPTSAN * add GPTSAN * add GPTSAN * add GPTSAN * add GPTSAN * add GPTSAN * add GPTSAN (update for review) * add GPTSAN * add GPTSAN * add GPTSAN * add GPTSAN * add GPTSAN * add GPTSAN * add GPTSAN * add GPTSAN * add GPTSAN * add GPTSAN * add GPTSAN * add GPTSAN * add GPTSAN * add GPTSAN * add GPTSAN * add GPTSAN * add GPTSAN * add GPTSAN * add GPTSAN * add GPTSAN * fix typo in comment text * add GPTSAN * add GPTSAN * add GPTSAN * add GPTSAN * fix document and comments * fix class name GPTSAN->GPTSan * fix import and test for tokenizer |
||
|
c236a62172
|
[CLAP] Add CLAP to the library (#21370)
* add model like clip * update * text model ok * clap text works * some refactor - `CLAPVision` to `CLAPAudio` - refactor kwargs of audio modules * more refactor * more refactor * more refactor * correct fusion * more refactor * new modules * add basic processor * fixup * remove whisper copioed from * audio logits match * add doc * correct filters mel and add maxlength * style * few fixes * forward passes * fixup * fixup * some clean up * remove mels form the dictionnary * pad after the repeat * update padding when dsmaller * fix padding * style * use swin patch merging * use copied from swin * processor with any tokenizer * more copied from * some clean up * more refactor * fix mel when rand_trunc * style * remove unused imports * update processing * remove image processing tests * add testing fiel * fixmodeling issues * replace with `is_longer` * clap in serialization * more refactor * `make fixup` * make fixup * fix feature extractor * update test feature extractor * `make fixup` * clean up config * more clean up * more cleanup * update tests * refactor tests and inits * removeCLAP vision config * remove CLAP from image procssing auto and dummy vision objects * update inits * style * re order classes in modeling clap * Use roberta tokenizer as the other weights are not open sourced * small cleaup * remove tokenization CLAP * processor tokenizr is roberta * update feature extraction doc * remove vclap from model zero shot * update f_min and f_max to frequency_xx * some changes - fix modeling keys - add `is_longer` in the forward pass - make fixup * make fixup * consistent behavior ebtween rand_crop and fusion * add numpy resize and bilinear and documentation * move resizing to image utils * clean feature extraction * import resize from correct file * resize in image transforms * update * style * style * nit * remove unused arguments form the feature extractor * style * few fixes + make fixup * oops * fix more tests * add zero shot audio classification pipeline * 
update zeroshot classification pipeline * fixup * fix copies * all CI tests pass * make fixup + fix docs * fix docs * fix docs * update tests pip;eline * update zero shot pipeline * update feature extraction clap * update tokenization auto * use nested simplify * update pipeline tests * Apply suggestions from code review Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> * split in two lines * fixes * refactor * clean up * add integration tests * update config docstring * style * update processor * fix processor test * fix feat extractor tests * update docs * Apply suggestions from code review Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> * fix readmes * fix tips * Update src/transformers/models/auto/configuration_auto.py * update doc and remove todo -> properly explained * fix idx and typo * typoe * cleanup config * cleanup tests, styles and doc * ignore docstyle on image transform * add conversion script * remove the `clap` indx in favor of `CLAP` * update __init * nits * Update src/transformers/pipelines/__init__.py * fix bug * clarifiy config * fix copy * fix init * Apply suggestions from code review Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * fix model output * fix comment * make fixup * make fixup * rename to `Clap` * replace to `Clap` * replace to `Clap` * repo consistency * again repo-consistency * make fixup * Apply suggestions from code review Co-authored-by: Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com> * add config * changes * update conversion * Apply suggestions from code review Co-authored-by: Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com> * remove unused function * update based on code reviews * style * more comments * cleanup * clean up * style * apply suggestions * Empty commit * pipeline will be added in a different PR * update calls to audio utils functions * update pipeline init * style * style * styling again * use pad * fix 
repo-consistency * update utils and add doc for audio utils * clean up resize by using torch. update inits accordingly * style * CLap's tokenizer is RobertA * add audio utils to internal toctreee * update totctree * style * update documentation and normalize naming accross audio utils and feature extraction clap * style * clean up * update doc and typos * fix doctest * update modelin code, got rid of a lot of reshaping * style on added doc audio utils * update modeling clap * style * Apply suggestions from code review Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * docstringvariables with CLAP * rename key * update modeling CLAP * update audio utils docstring * update processing clap * fix readmes * fix toctree * udpate configuration clap * fix init * make fixup * fix * fix * update naming * update * update checkpoint path * Apply suggestions from code review * Major refactoring * Update src/transformers/models/clap/configuration_clap.py * merge --------- Co-authored-by: younesbelkada <younesbelkada@gmail.com> Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com> Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> Co-authored-by: Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com> |
||
|
a0e69a9375
|
Add TVLT (#20725)
* Update image_processing_tvlt.py * Update modeling_tvlt.py * Update * Update modeling_tvlt.py * Create tvlt.mdx * Update configuration_tvlt.py * Update modeling_tvlt.py * Update test_modeling_tvlt.py * Update modeling_tvlt.py * Update modeling_tvlt.py * Update image_processing_tvlt.py * Update feature_extraction_tvlt.py * Update tvlt models * Update tests * Update * Update * Update tests * Update README_ko.md * Update README_ja.md * Update README_ko.md * Update README_zh-hans.md * Update docs/source/en/model_doc/tvlt.mdx Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Update docs/source/en/model_doc/tvlt.mdx Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Update src/transformers/models/tvlt/configuration_tvlt.py Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Update src/transformers/models/tvlt/configuration_tvlt.py Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Update src/transformers/models/tvlt/configuration_tvlt.py Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Update src/transformers/models/tvlt/configuration_tvlt.py Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Update tvlt.mdx * Update modeling_tvlt.py * Update configuration_tvlt.py * Update modeling_tvlt.py * Update modeling_tvlt.py * Update modeling_tvlt.py * Update modeling_tvlt.py * Add files via upload * Update model * Update modeling_tvlt.py * Update tvlt models * Update src/transformers/models/tvlt/__init__.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update src/transformers/models/tvlt/__init__.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update src/transformers/models/tvlt/image_processing_tvlt.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update src/transformers/models/tvlt/image_processing_tvlt.py Co-authored-by: 
amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Add files via upload * Add files via upload * Delete modeling_tvlt.py * Delete feature_extraction_tvlt.py * Delete configuration_tvlt.py * Delete image_processing_tvlt.py * Delete processing_tvlt.py * Update tvlt * Update src/transformers/models/tvlt/configuration_tvlt.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update src/transformers/models/tvlt/image_processing_tvlt.py Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> * Update src/transformers/models/tvlt/configuration_tvlt.py Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> * Update tests/models/tvlt/test_modeling_tvlt.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update tests/models/tvlt/test_modeling_tvlt.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update src/transformers/models/tvlt/modeling_tvlt.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update src/transformers/models/tvlt/modeling_tvlt.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update README.md Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update README_es.md * Update README_hd.md * Update README_ja.md * Update README_ko.md * Update README_zh-hans.md * Update README_zh-hant.md * Update index.mdx * Update tvlt.mdx * Update tvlt.mdx * Update configuration_tvlt.py * Update src/transformers/models/tvlt/image_processing_tvlt.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update src/transformers/models/tvlt/image_processing_tvlt.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update src/transformers/models/tvlt/image_processing_tvlt.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update src/transformers/models/tvlt/image_processing_tvlt.py Co-authored-by: 
Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update src/transformers/models/tvlt/modeling_tvlt.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update modeling_tvlt.py * Add files via upload * Update tvlt.mdx * Update modeling_auto.py * Add files via upload * Add files via upload * Update dummy_pt_objects.py * Update __init__.py * Update feature_extraction_tvlt.py * Update feature_extraction_tvlt.py * Update image_processing_tvlt.py * Update modeling_auto.py * Update test_feature_extraction_tvlt.py * Update test_processor_tvlt.py * Update test_feature_extraction_tvlt.py * Add files via upload * Update test_image_processor_tvlt.py * Update tests/models/tvlt/test_processor_tvlt.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update src/transformers/models/tvlt/modeling_tvlt.py Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Update src/transformers/models/tvlt/processing_tvlt.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update src/transformers/models/tvlt/modeling_tvlt.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update src/transformers/models/tvlt/modeling_tvlt.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update tests/models/tvlt/test_image_processor_tvlt.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update src/transformers/models/tvlt/modeling_tvlt.py Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Update tests/models/tvlt/test_image_processor_tvlt.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update src/transformers/models/tvlt/modeling_tvlt.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update tests/models/tvlt/test_image_processor_tvlt.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update 
tests/models/tvlt/test_image_processor_tvlt.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update tests/models/tvlt/test_modeling_tvlt.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update src/transformers/models/tvlt/modeling_tvlt.py Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Update src/transformers/models/tvlt/modeling_tvlt.py Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Update src/transformers/models/tvlt/configuration_tvlt.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update tests/models/tvlt/test_feature_extraction_tvlt.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update src/transformers/models/tvlt/processing_tvlt.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update docs/source/en/model_doc/tvlt.mdx Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update src/transformers/models/tvlt/configuration_tvlt.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update src/transformers/models/tvlt/configuration_tvlt.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update src/transformers/models/tvlt/configuration_tvlt.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update src/transformers/models/tvlt/configuration_tvlt.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update src/transformers/models/tvlt/configuration_tvlt.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update src/transformers/models/tvlt/configuration_tvlt.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update src/transformers/models/tvlt/configuration_tvlt.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update 
src/transformers/models/tvlt/configuration_tvlt.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update src/transformers/models/tvlt/configuration_tvlt.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update src/transformers/models/tvlt/configuration_tvlt.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update src/transformers/models/tvlt/feature_extraction_tvlt.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update src/transformers/models/tvlt/configuration_tvlt.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update src/transformers/models/tvlt/feature_extraction_tvlt.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update src/transformers/models/tvlt/feature_extraction_tvlt.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update src/transformers/models/tvlt/feature_extraction_tvlt.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update feature_extraction_tvlt.py * Update feature_extraction_tvlt.py * Update src/transformers/models/tvlt/image_processing_tvlt.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update src/transformers/models/tvlt/image_processing_tvlt.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update image_processing_tvlt.py * Update src/transformers/models/tvlt/modeling_tvlt.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update modeling_tvlt.py * Update modeling_tvlt.py * Update modeling_tvlt.py * Update test_image_processor_tvlt.py * Update tests/models/tvlt/test_modeling_tvlt.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update tests/models/tvlt/test_modeling_tvlt.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update 
tests/models/tvlt/test_modeling_tvlt.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update tests/models/tvlt/test_modeling_tvlt.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update tests/models/tvlt/test_modeling_tvlt.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update tests/models/tvlt/test_modeling_tvlt.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update tests/models/tvlt/test_modeling_tvlt.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update tests/models/tvlt/test_modeling_tvlt.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update tests/models/tvlt/test_modeling_tvlt.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Add files via upload * Add files via upload * Update modeling_tvlt.py * Update modeling_tvlt.py * Update modeling_tvlt.py * Add files via upload * Update docs/source/en/model_doc/tvlt.mdx Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update image_processing_tvlt.py * Add files via upload * Add files via upload * Update tvlt.mdx * Update docs/source/en/model_doc/tvlt.mdx Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Update docs/source/en/model_doc/tvlt.mdx Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Update src/transformers/models/tvlt/image_processing_tvlt.py Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Update src/transformers/models/tvlt/image_processing_tvlt.py Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Update docs/source/en/model_doc/tvlt.mdx Co-authored-by: Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com> * Update docs/source/en/model_doc/tvlt.mdx Co-authored-by: Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com> * Update 
src/transformers/models/tvlt/configuration_tvlt.py Co-authored-by: Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com> * Add files via upload * Add files via upload * Add files via upload * Add files via upload * Update modeling_auto.py * Update tvlt.mdx * Update dummy_pt_objects.py * Update feature_extraction_tvlt.py * Update modeling_tvlt.py * Update test_feature_extraction_tvlt.py * Update test_image_processor_tvlt.py * Update test_feature_extraction_tvlt.py * Update modeling_tvlt.py * Update dummy_pt_objects.py * Update dummy_speech_objects.py * Add files via upload * Update README_hd.md * Update modeling_tvlt.py * Update modeling_tvlt.py * Update modeling_tvlt.py * Update modeling_tvlt.py * Update modeling_tvlt.py * Update modeling_tvlt.py * Update test_modeling_tvlt.py * Update src/transformers/models/tvlt/configuration_tvlt.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update src/transformers/models/tvlt/feature_extraction_tvlt.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update src/transformers/models/tvlt/image_processing_tvlt.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update src/transformers/models/tvlt/image_processing_tvlt.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update src/transformers/models/tvlt/modeling_tvlt.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update src/transformers/models/tvlt/image_processing_tvlt.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update src/transformers/models/tvlt/modeling_tvlt.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update MAE processing * Update modeling_tvlt.py * Update modeling_tvlt.py * Update modeling * Update style * Update src/transformers/models/tvlt/modeling_tvlt.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * 
Update src/transformers/models/tvlt/modeling_tvlt.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update check_repo.py * Update tvlt.mdx * Update __init__.py * Update tests * Update tvlt models * Update configuration_tvlt.py * Update configuration_tvlt.py * Update image_processing_tvlt.py * Update dummy_pt_objects.py * Add files via upload * Update test_modeling_tvlt.py * Update test_feature_extraction_tvlt.py * Update test_feature_extraction_tvlt.py * Update test_feature_extraction_tvlt.py * Update test_feature_extraction_tvlt.py * Update test_feature_extraction_tvlt.py * Update test_feature_extraction_tvlt.py --------- Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> Co-authored-by: Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com> |
||
|
0c9c8472e6
|
Add Ernie-M Model to huggingface (#21349)
* config and tokenization(fast too) changed and ErnieEncoder added * Slow Tokenization Added * Tokenizer(slow) is now working and Fast Tokenizer removed * Added Config code * Added Base Model and utils * ErnieMModel is now working * All added except tests * All tests passed except ErnieUIEM * All tests passed * all fixes done * all fixes done * fixed MAP * fixed check_code_quality * fixed Build PR Documentation issue * Added changes(comments) and also updated to the latest upstream/main * Added fixup * Added # Copied comments * Added fixup * Added more comments and some nits * Added fixup * Fixed README_hd.md * Added more fixes * ErnieMTokenizer (being sentencepiece) protected and other docs edited * Added code_quality fix * Fixed for * Added more fix * modified AZ * ernie-m tokenization test added! * attention mask part fixed(with 0->self.config.pad_token_id) * applied make fixup |
||
|
b0d539ccad
|
Add X-MOD (#20939)
* Add X-MOD to Readme * Add documentation for X-MOD * Implement X-MOD * Fix formatting of X-MOD docs * Change signature of X-MOD forward methods to use lang_ids * Minor changes * Rebase with main and run make fix-copies * Make suggested changes to docstrings * Improve code readability Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com> * Fix code style * Conversion script: Remove asserts and type annotations * Remove _TOKENIZER_FOR_DOC * XMOD -> Xmod * Update copyright note * Fix doctests * Fix docstring * Add integration test for FillMaskPipeline * Revert "Add integration test for FillMaskPipeline" This reverts commit 4381eb3b1d0f5d85785f89caba83928e6efa6d1f. * Add end-to-end integration test for mask fill * make style * Rebase with main and make fix-copies --------- Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com> |
||
|
d7f1e7c009
|
Add BLIP-2 (#21441)
* First draft * More improvements * More improvements * Improve conversion script * Convert all weights * Make forward pass work * Make logits match * More improvements * More improvements * More improvements * Use get_input_embeddings * Improve some more * Improve model tests * Improve model tests * More improvements * Fix processor * Update files * Update prepare_inputs_for_generation * More improvements * Fix copies * More fixes * Make fixup * More improvements * Add support for seq2seq language model * More improvements * Fix test * More improvements * Improve conversion script * Remove some todo's * Fix README's * Improve conversion script * Fix generation * Fix style and remove Blip2Model * Fix model outputs * More improvements * Set eos_token_id in config * Fix quality * Small improvements * Add processor tests * More improvements * Apply suggestions * Apply suggestions * Add integration test * Update image URL * Add integration test * Fix model_type * Update style * Improve docs * Add doc tests * Fix copies * Remove tests which are passing * Improve some more * Add tests for seq2seq language models * Minor fix * Convert more checkpoints * finalize CI * Fix blip and blip2 processors * add `accelerate` support for `blip2` * clean up * make style * Update conversion script * Update conversion script some more * Update organization * revert toc file * add blip-2 to toc file * Some more improvements * Fix docstring * Improve docs --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com> Co-authored-by: younesbelkada <younesbelkada@gmail.com> |
||
![]() |
7e51a441e4
|
Add XLM-V to Model Doc (#21498)
* doc: introduce new section for XLM-V model * doc: mention more details for XLM-V integration * docs: paper abstract in italics, model identifier for base model added * doc: mention new XLM-V support * auto: add XLM-V mapping * doc: run make fix-copies ;) |
||
![]() |
e4bacf6614
|
[WIP] add SpeechT5 model (#18922)
* make SpeechT5 model by copying Wav2Vec2 * add paper to docs * whoops added docs in wrong file * remove SpeechT5Tokenizer + put CTC back in the name * remove deprecated class * remove unused docstring * delete SpeechT5FeatureExtractor, use Wav2Vec2FeatureExtractor instead * remove classes we don't need right now * initial stab at speech encoder prenet * add more speech encoder prenet stuff * improve SpeechEncoderPrenet * add encoder (not finished yet) * add relative position bias to self-attention * add encoder CTC layers * fix formatting * add decoder from BART, doesn't work yet * make it work with generate loop * wrap the encoder into a speech encoder class * wrap the decoder in a text decoder class * changed my mind * changed my mind again ;-) * load decoder weights, make it work * add weights for text decoder postnet * add SpeechT5ForCTC model that uses only the encoder * clean up EncoderLayer and DecoderLayer * implement _init_weights in SpeechT5PreTrainedModel * cleanup config + Encoder and Decoder * add head + cross attention masks * improve doc comments * fixup * more cleanup * more fixup * TextDecoderPrenet works now, thanks Kendall * add CTC loss * add placeholders for other pre/postnets * add type annotation * fix freeze_feature_encoder * set padding tokens to 0 in decoder attention mask * encoder attention mask downsampling * remove features_pen calculation * disable the padding tokens thing again * fixup * more fixup * code review fixes * rename encoder/decoder wrapper classes * allow checkpoints to be loaded into SpeechT5Model * put encoder into wrapper for CTC model * clean up conversion script * add encoder for TTS model * add speech decoder prenet * add speech decoder post-net * attempt to reconstruct the generation loop * add speech generation loop * clean up generate_speech * small tweaks * fix forward pass * enable always dropout on speech decoder prenet * sort declaration * rename models * fixup * fix copies * more fixup * make consistency 
checker happy * add Seq2SeqSpectrogramOutput class * doc comments * quick note about loss and labels * add HiFi-GAN implementation (from Speech2Speech PR) * rename file * add vocoder to TTS model * improve vocoder * working on tokenizer * more better tokenizer * add CTC tokenizer * fix decode and batch_code in CTC tokenizer * fix processor * two processors and feature extractors * use SpeechT5WaveformFeatureExtractor instead of Wav2Vec2 * cleanup * more cleanup * even more fixup * notebooks * fix log-mel spectrograms * support reduction factor * fixup * shift spectrograms to right to create decoder inputs * return correct labels * add labels for stop token prediction * fix doc comments * fixup * remove SpeechT5ForPreTraining * more fixup * update copyright headers * add usage examples * add SpeechT5ProcessorForCTC * fixup * push unofficial checkpoints to hub * initial version of tokenizer unit tests * add slow test * fix failing tests * tests for CTC tokenizer * finish CTC tokenizer tests * processor tests * initial test for feature extractors * tests for spectrogram feature extractor * fixup * more fixup * add decorators * require speech for tests * modeling tests * more tests for ASR model * fix imports * add fake tests for the other models * fixup * remove jupyter notebooks * add missing SpeechT5Model tests * add missing tests for SpeechT5ForCTC * add missing tests for SpeechT5ForTextToSpeech * sort tests by name * fix Hi-Fi GAN tests * fixup * add speech-to-speech model * refactor duplicate speech generation code * add processor for SpeechToSpeech model * add usage example * add tests for speech-to-speech model * fixup * enable gradient checkpointing for SpeechT5FeatureEncoder * code review * push_to_hub now takes repo_id * improve doc comments for HiFi-GAN config * add missing test * add integration tests * make number of layers in speech decoder prenet configurable * rename variable * rename variables * add auto classes for TTS and S2S * REMOVE CTC!!! 
* S2S processor does not support save/load_pretrained * fixup * these models are now in an auto mapping * fix doc links * rename HiFiGAN to HifiGan, remove separate config file * REMOVE auto classes * there can be only one * fixup * replace assert * reformat * feature extractor can process input and target at same time * update checkpoint names * fix commit hash |
||
![]() |
5451f8896c
|
Add DETA (#20983)
* First draft * Add initial draft of conversion script * Convert all weights * Fix config * Add image processor * Fix DetaImageProcessor * Run make fix copies * Remove timm dependency * Fix dummy objects * Improve loss function * Remove conv_encoder attribute * Update conversion scripts * Improve postprocessing + docs * Fix copied from statements * Add tests * Improve postprocessing * Improve postprocessing * Update READMEs * More improvements * Fix rebase * Add is_torchvision_available * Add torchvision dependency * Fix typo and README * Fix bug * Add copied from * Fix style * Apply suggestions * Fix thanks to @ydshieh * Fix another dependency check * Simplify image processor * Add scipy * Improve code * Add threshold argument * Fix bug * Set default threshold * Improve integration test * Add another integration test * Update setup.py * Address review * Improve deformable attention function * Improve copied from * Use relative imports * Address review * Replace assertions * Address review * Update dummies * Remove dummies * Address comments, update READMEs * Remove custom kernel code * Add image processor tests * Add requires_backends * Add minor comment * Update scripts * Update organization name * Fix defaults, add doc tests * Add id2label for object 365 * Fix tests * Update task guide |
||
![]() |
3a6e4a221c
|
Add BridgeTower model (#20775)
* Commit with BTModel and latest HF code * Placeholder classes for BTForMLM and BTForITR * Importing Bert classes from transformers * Removed objectives.py and dist_utils.py * Removed swin_transformer.py * Add image normalization, BridgeTowerForImageAndTextRetrieval * Add center_crop * Removing bert tokenizer and LCI references * Tested config loading from HF transformers hub * Removed state_dict updates and added path to hub * Enable center crop * Getting image_size from config, renaming num_heads and num_layers * Handling max_length in BridgeTowerProcessor * Add BridgeTowerForMaskedLM * Add doc string for BridgeTowerConfig * Add doc strings for BT config, processor, image processor * Adding docs, removed swin * Removed convert_bridgetower_original_to_pytorch.py * Added doc files for bridgetower, removed is_vision * Add support attention_mask=None and BridgeTowerModelOutput * Fix formatting * Fixes with 'make style', 'make quality', 'make fixup' * Remove downstream tasks from BridgeTowerModel * Formatting fixes, add return_dict to BT models * Clean up after doc_test * Update BTModelOutput return type, fix todo in doc * Remove loss_names from init * implement tests and update tuples returned by models * Add image reference to bridgetower.mdx * after make fix-copies, make fixup, make style, make quality, make repo-consistency * Rename class names with BridgeTower prefix * Fix for image_size in BTImageProcessor * implement feature extraction bridgetower tests * Update image_mean and image_std to be list * remove unused import * Removed old comments * Rework CLIP * update config in tests followed config update * Formatting fixes * Add copied from for BridgeTowerPredictionHeadTransform * Update bridgetower.mdx * Update test_feature_extraction_bridgetower.py * Update bridgetower.mdx * BridgeTowerForMaskedLM is conditioned on image too * Add BridgeTowerForMaskedLM * Fixes * Call post_init to init weights * Move freeze layers into method * Remove BTFeatureExtractor, add 
BT under multimodal models * Remove BTFeatureExtractor, add BT under multimodal models * Code review feedback - cleanup * Rename variables * Formatting and style to PR review feedback * Move center crop after resize * Use named parameters * Style fix for modeling_bridgetower.py * Update docs/source/en/model_doc/bridgetower.mdx Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update docs/source/en/model_doc/bridgetower.mdx Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update docs/source/en/model_doc/bridgetower.mdx Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update src/transformers/models/bridgetower/modeling_bridgetower.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update src/transformers/models/bridgetower/modeling_bridgetower.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update docs/source/en/model_doc/bridgetower.mdx Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com> * Update src/transformers/models/bridgetower/modeling_bridgetower.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Rename config params, copy BERT classes, clean comments * Cleanup irtr * Replace Roberta imports, add BTTextConfig and Model * Update docs, add visionconfig, consistent arg names * make fixup * Comments for forward in BTModel and make fixup * correct tests * Remove inconsistent roberta copied from * Add BridgeTowerTextModel to dummy_pt_objects.py * Add BridgeTowerTextModel to IGNORE_NON_TESTED * Update docs for BT Text and Vision Configs * Treat BridgeTowerTextModel as a private model * BridgeTowerTextModel as private * Run make fix-copies * Adding BTTextModel to PRIVATE_MODELS * Fix for issue with BT Text and Image configs * make style changes * Update README_ja.md Add から to BridgeTower's description * Clean up config, .mdx and arg names * Fix init_weights. 
Remove nn.Sequential * Formatting and style fixes * Re-add tie_word_embeddings in config * update test implementation * update style * remove commented out * fix style * Update README with abs for BridgeTower * fix style * fix mdx file * Update bridgetower.mdx * Update img src in bridgetower.mdx * Update README.md * Update README.md * resolve style failed * Update _toctree.yml * Update README_ja.md * Removed mlp_ratio, rename feats, rename BTCLIPModel * Replace BTCLIP with BTVisionModel,pass in vision_config to BTVisionModel * Add test_initialization support * Add support for output_hidden_states * Update support for output_hidden_states * Add support for output_attentions * Add docstring for output_hidden_states * update tests * add bridgetowervisionmodel as private model * rerun the PR test * Remove model_type, pass configs to classes, renames * Change self.device to use weight device * Remove image_size * Style check fixes * Add hidden_size and num_hidden_layers to BridgeTowerTransformer * Update device setting * cosmetic update * trigger test again * trigger tests again * Update test_modeling_bridgetower.py trigger tests again * Update test_modeling_bridgetower.py * minor update * re-trigger tests * Update docs/source/en/model_doc/bridgetower.mdx Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Remove pad, update max_text_len, doc cleanup, pass eps to LayerNorm * Added copied to, some more review feedback * make fixup * Use BridgeTowerVisionEmbeddings * Code cleanup * Fixes for BridgeTowerVisionEmbeddings * style checks * re-tests * fix embedding * address comment on init file * retrigger tests * update import prepare_image_inputs * update test_image_processing_bridgetower.py to reflect test_image_processing_common.py * retrigger tests Co-authored-by: Shaoyen Tseng <shao-yen.tseng@intel.com> Co-authored-by: Tiep Le <tiep.le@intel.com> Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> Co-authored-by: Younes 
Belkada <49240599+younesbelkada@users.noreply.github.com> Co-authored-by: Tiep Le <97980157+tileintel@users.noreply.github.com> |
||
![]() |
7119bb052a
|
v4.27.0.dev0 | ||
![]() |
1b37fb5e17
|
Efficientformer (#20459)
- Adds EfficientFormer V1 to transformers - PR co-authored by @novice03 and @Bearnardd Co-authored-by: novice <pranavpulijala@gmail.com> Co-authored-by: novice <44259234+novice03@users.noreply.github.com> |
||
![]() |
87208a05af
|
Graphormer model for Graph Classification (#20968)
* [FT] First commit for graphormer architecture. The model has no tokenizer, as it uses a collator and preprocessing function for its input management. Architecture to be tested against original one. The arch might need to be changed to fit the checkpoint, but a revert to the original arch will make the code less nice to read. TODO: doc * [FIX] removed test model * [FIX] import error * [FIX] black and flake * [DOC] added paper refs * [FIX] [DOC] * [FIX] black * [DOC] Updated READMEs * [FIX] Order of imports + rm Tokenizer calls * [FIX] Moved assert in class to prevent doc build failure * [FIX] make fix-copies * [Doc] update from code review Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * [FIX] Removed Graphormer from Sequence classification model list * [DOC] Added HF copyright to Cython file * [DOC] Fixed comments * [FIX] typos in class doc + removed config classes. Todo: update doc from paper definitions * [FIX] Removed dependency on fairseq, and replaced all asserts with Exception management * [FIX] Homogenized initialization of weights to pretrained constructor * [FIX] [CP] Updated multi_hop parameter to get same results as in original implementation * [DOC] Relevant parameter description in the configuration file * [DOC] Updated doc and comments in main graphormer file * [FIX] make style and quality checks * [DOC] Fix doc format * [FIX] [WIP] Updated part of the tests, though still a wip * [FIX] [WIP] * [FIX] repo consistency * [FIX] Changed input names for more understandability * [FIX] [BUG] updated num_classes params for propagation in the model * simplified collator * [FIX] Updated tests to follow new naming pattern * [TESTS] Updated test suite along with model * [FIX] rm tokenizer import * [DOC] add link to graphormerdoc * Changed section in doc from text model to graph model * Apply suggestions from code review Spacing, inits Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * [DOC] Explain
algos_graphormer functions * Cython soft import protection * Rm call to Callable in configuration graphormer * [FIX] replaced asserts with Exceptions * Add org to graphormer checkpoints * Prefixed classes with Graphormer * Management of init functions * format * fixes * fix length file * update indent * relaunching ci * Errors for missing cython imports * fix style * fix style doc Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> |
||
![]() |
5b949623c7
|
Add OneFormer Model (#20577)
* Add OneFormer Model * Add OneFormer Tests * Add UNIVERSAL_SEGMENTATION_MAPPING * Fix config * 🐛 Fix error encountered while writing tests * 🔨 Fix instance segmentation post processing * Format Files and Add Documentation * Add Documentation mdx file * Run make fixup * Run make fix-copies * Remove unnecessary code * Format modeling_oneformer.py * Add OneFormer to ImageSegmentationPipeline * Format files * Add Demo link to Readme * Fix formatting errors * Fix test failures * Update Table in index.mdx * Fix version * Fix style * Remove OneFormer from TF * Fix Imports * Fix dummy objects * Fix tests * Add newline * Remove OneFormerFeatureExtractor * Remove CUDA Kernels * Use AutoBackbone for Swin * Fix description * Use Image Processor * Fix copies * Fix formatting * Fix import order * Fix flake8 errors * Fix doc errors * Add Hindi Readme entry * Update supported backbones * Update supported backbones * Undo Changes * Fix type of config * Fix isort * Fix auto.mdx * Fix swin config * Replace DinatBackbone with AutoBackbone * Use SwinBackbone * Use SwinBackbone * Fix conversion script * Fix arguments * Add argument description * Fix style * Add OneFormerProcessor * Fix OneFormerProcessor Tests * Fix mapping * Fix imports * Fix inits * Fix style * Fix comment * Fix docstring * Move OneFormer to MultiModal * Fix Copies * Remove size divisor * Fix check_repo.py * Fix copies * Add Processor for Testing Pipeline * Fix padding for tokens * Fix variables * Fix formatting with correct black version * Add Image Processor Test * Apply suggestions * Revert common modeling * Add check for task * Fix conversion script * Fix initialization order * Fix tests * Undo Pipeline Changes * Fix layers in MLP * Fix copies * Update image paths * Fix copies * Apply suggestions |
||
![]() |
cf028d0c3d
|
Add batch of resources (#20647)
* Add resources * Add more resources * Add more resources * Add TAPAS * Fix pipeline tag * Fix pipeline tags * Remove pipeline tag * Remove depth-estimation tag * Update docs/source/en/model_doc/segformer.mdx Co-authored-by: Maria Khalusova <kafooster@gmail.com> * Apply suggestion * Fix segformer Co-authored-by: Niels Rogge <nielsrogge@Nielss-MacBook-Pro.local> Co-authored-by: Maria Khalusova <kafooster@gmail.com> |
||
![]() |
2411f0e465
|
Add Mask2Former (#20792)
* Adds Mask2Former to transformers Co-authored-by: Shivalika Singh <shivalikasingh95@gmail.com> Co-authored-by: Shivalika Singh <73357305+shivalikasingh95@users.noreply.github.com> Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> |
||
![]() |
4ed89d48ab
|
Add UperNet (#20648)
* First draft * More improvements * Add convnext backbone * Add conversion script * Add more improvements * Comment out to_dict * Add to_dict method * Add default config * Fix config * Fix backbone * Fix backbone some more * Add docs, auto mapping, tests * Fix some tests * Fix more tests * Fix more tests * Add conversion script * Improve conversion script * Add support for getting reshaped undownsampled hidden states * Fix forward pass * Add print statements * Comment out set_shift_and_window_size * More improvements * Correct downsampling layers conversion * Fix style * First draft * Fix conversion script * Remove config attribute * Fix more tests * Update READMEs * Update ConvNextBackbone * Fix ConvNext tests * Align ConvNext with Swin * Remove files * Fix index * Improve docs * Add output_attentions to model forward * Add backbone mixin, improve tests * More improvements * Update init_weights * Fix interpolation of logits * Add UperNetImageProcessor * Improve image processor * Fix image processor * Remove print statements * Remove script * Update import * Add image processor tests * Remove print statements * Fix test * Add integration test * Add convnext integration test * Update docstring * Fix README * Simplify config * Apply suggestions * Improve docs * Rename class * Fix test_initialization * Fix import * Address review * Fix config * Convert all checkpoints * Fix default backbone * Use same processor as segformer * Apply suggestions * Fix init_weights, update conversion scripts * Improve config * Use Auto API instead of creating a new image processor * Fix docs * Add doctests * Remove ResNetConfig dependency * Add always_partition argument * Fix rebase * Improve docs * Convert checkpoints Co-authored-by: Niels Rogge <nielsrogge@Nielss-MacBook-Pro.local> Co-authored-by: Niels Rogge <nielsrogge@Nielss-MBP.localdomain> |
||
![]() |
926452298d
|
Fix model hub link (#20998) | ||
![]() |
ce85686a1f
|
Add AltCLIP (#20446)
* add altclip * update * fix wrong title * fix the copyright in readme * add altclip model * add altclip * fix test_gradient_checkpointing_enable_disable * code * add return class * add projection_state * "fix pretrained model bug" * delete print and fix 2 test instances. * delete token * rm xlmr * one model one file. * empty commit to trigger CI * Fix modeling_outputs.py * Fix __init__ * Fix quality * Fix modeling file docstring * Fix README.md * Fix test file * add vision model * empty commit to trigger CI * fix * fix * fix * fix * fix * fix * fix * fix * fix * del token in mdx file * fix * fix * fix * remove altrob from test list * add vision test * fix fx * fix * fix * fix * trigger CI * fix copies * fix tests * fix style * fix quality * update * recover import * recover * add , * recover * fix copies * trigger CI * fix * some of review * update * remove import * last 2 * fix * fix style * fix style * fix bug * fix uncomment * fix * update * fix * second review * empty commit to trigger CI * empty commit to trigger CI * fix position * fix * empty commit to trigger CI * empty commit to trigger CI * third comment * Update docs/source/en/model_doc/altclip.mdx Co-authored-by: Yih-Dar <2521628+ydshieh@users.noreply.github.com> * Update docs/source/en/model_doc/altclip.mdx Co-authored-by: Yih-Dar <2521628+ydshieh@users.noreply.github.com> * Update src/transformers/__init__.py Co-authored-by: Yih-Dar <2521628+ydshieh@users.noreply.github.com> * Update src/transformers/models/altclip/configuration_altclip.py Co-authored-by: Yih-Dar <2521628+ydshieh@users.noreply.github.com> * Update src/transformers/models/altclip/modeling_altclip.py Co-authored-by: Yih-Dar <2521628+ydshieh@users.noreply.github.com> * Update src/transformers/models/altclip/processing_altclip.py Co-authored-by: Yih-Dar <2521628+ydshieh@users.noreply.github.com> * Update src/transformers/models/altclip/modeling_altclip.py Co-authored-by: Yih-Dar <2521628+ydshieh@users.noreply.github.com> * fix merge * 
fix copies * update * update * empty commit to trigger CI * fix code example * empty commit to trigger CI * fix * empty commit to trigger CI * empty commit to trigger CI Co-authored-by: shunxing1234 <xw747777271@gmail.com> Co-authored-by: ydshieh <ydshieh@users.noreply.github.com> Co-authored-by: shunxing1234 <33774367+shunxing1234@users.noreply.github.com> Co-authored-by: Yih-Dar <2521628+ydshieh@users.noreply.github.com> |
||
![]() |
9c6f7485a6
|
Add GIT (GenerativeImage2Text) (#20295)
* First draft * Make model instantiation work * Fix copied from statement * More fixes * Add correct output head * Improve configuration * Add conversion script * Improve conversion script * Remove token_type_ids * Fix conversion of projection layers * Convert all weights * Use cats image * Make logits match * Generate caption on cats image * Add GITProcessor * Update conversion script * Add support for more checkpoints * Fix conversion script * Add initial tests * Remove cross-attention * More improvements * Remove is_decoder * Improve model tests * Improve tests * Improve model outputs * Fix model outputs equivalence * Fix more tests * Remove unused code * Use generate to generate text, no use of cache for now * Use generate more appropriately * Fix config tests * Fix style * Add support for use_cache Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com> * Fix style * Fix GIT vision encoder * Update README * Fix integration test * Set bos and eos token ids * Improve docs * Improve code * Add support for provided attention_mask * Add copied from statement * Fix gradient checkpointing test * Set model_input_names * Investigate model_input_names * Remove script * Fix model inputs * Fix docstring * Rename GIT to Git * Support more models * Add support for textvqa model * Add video support * Extend conversion script for video * Add support for large variant * Add support for more models * Fix config archive map * Update integration test * Fix README * Fix CLIP mean and std * Update processor * Fix use_cache for video, thanks @gante * Remove print statements * Remove assertion * Add processor tests * Fix model_input_names * Use Auto API for processor * Fix processor tests * Fix integration test * Fix pipeline test * Make tests faster * Update conversion script * Update conversion script * Convert more checkpoints * Update conversion script * Fix typo * Update docstrings * Improve code snippets * Fix doc tests * Add more code examples * Fix doc tests * Add
integration tests * Fix unused variable * revert * Add GIT to Japanese README Co-authored-by: Niels Rogge <nielsrogge@Nielss-MacBook-Pro.local> Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com> Co-authored-by: ydshieh <ydshieh@users.noreply.github.com> |
||
![]() |
0d284bd574
|
Add BLIP (#20716)
* add new model like * add v1 * v1 * v1 * vision encoder logits match * v2 * fix * add docstring * CI tests pass * fix tests * make fixup * add to `toctree` * fix processors * fix processors * fix doc * fill title * add content doc * remove from tokenization auto * fix config * change order * add `# Copied from` * few fixes - add correct license on modeling text - remove dummy argument * Apply suggestions from code review Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * replace name * refactor a bit * more refactor * remove unused arg * make fixup + remove some `# Adapted from ...` * Apply suggestions from code review Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * more `# Copied from` * Apply suggestions from code review Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * now `generate` supports no prefix * remove `FeatureExtractor` * fix path * correct dependency * fix tests * few fixes * add integration tests * add correct conversion script * Apply suggestions from code review Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * add `blip` to tokenization auto * fix docstrings * fix test + add image * remove processor from incorrect place * Apply suggestions from code review Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * clean up a bit * Apply suggestions from code review Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * clean pixel mask * clean pixel mask * fix `F` * Update src/transformers/models/blip/modeling_blip.py Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Apply suggestions from code review Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * fix output * Apply suggestions from code review Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * fix pad token id * remove `token_type_ids` * make fixup * Apply suggestions from
code review Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * make fixup * Apply suggestions from code review Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * add comments * Update src/transformers/models/blip/modeling_blip.py Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * remove `token_type_ids` * make fixup * better name * replace with `image_attention_mask` * refactor * make fixup * better docstring * replace `answer_xx` * remove unused args * add `labels` * add `labels` * fix processing tests * make fixup * make fixup * put correct repo * remove `pad` * remove `crop` and `center_crop` * Update src/transformers/models/blip/image_processing_blip.py Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * fix * remove `size_divisor` * fix weights `init` * remove unneeded functions * add suggestions * minor changes - change slow test output for PT 1.13 - docstring order * replace `feature_extractor` by `image_processor` * fix doctests * fix weight init order + add fp16 slow test * add `blip` to doctest * add correct repo name and fix test * Update src/transformers/models/blip/processing_blip.py Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * fix tests * use `convert_to_rgb` from `image_transforms` * make fixup * fix large loading issue Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> |
||
![]() |
b4b613b102
|
Implement Roberta PreLayerNorm (#20305)
* Copy RoBERTa * formatting * implement RoBERTa with prelayer normalization * update test expectations * add documentation * add conversion script for DinkyTrain weights * update checkpoint repo Unfortunately the original checkpoints assume a hacked roberta model * add RoBERTa-PreLayerNorm docs to toc * run utils/check_copies.py * lint files * remove unused import * fix check_repo reporting wrongly a test is missing * fix import error, caused by rebase * run make fix-copies * add RobertaPreLayerNormConfig to ROBERTA_EMBEDDING_ADJUSMENT_CONFIGS * Fix documentation <Facebook> -> Facebook Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> * fixup: Fix documentation <Facebook> -> Facebook Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> * Add missing Flax header Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> * expected_slice -> EXPECTED_SLICE Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> * update copies after rebase * add missing copied from statements * make fix-copies * make prelayernorm explicit in code * fix checkpoint path for the original implementation * add flax integration tests * improve docs * update utils/documentation_tests.txt * lint files * Remove Copyright notice Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * make fix-copies * Remove EXPECTED_SLICE calculation comments Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> |
||
![]() |
26dd041c6e
|
Add Swin2SR (#19784)
* First draft * Add more improvements * Improve forward pass * Fix layernorm * Add upscaler * More improvements * More improvements * More improvements * Improve conversion script * Add preprocessing * Make output match original implementation * Add additional attributes * Add support for more models * Support more models * Add support for real world sr * Add initial Swin2SRFeatureExtractor * Add ImageSuperResolutionOutput * Make more tests pass * Use BaseModelOutput * Fix one more test * Fix more tests * Fix another test * Fix all tests * Rename to Swin2SRImageProcessor * Fix toctree * Fix toctree * Fix rebase * Improve Swin2SRImageProcessor * Remove feature extractor file * Improve model * Improve conversion script * Fix integration test * Fix init * Fix conversion script * Address comments * Improve upsampler * Add NearestConvUpsampler * Improve pixel shuffle upsampler * Improve auxiliary upsampler * Improve conversion script * Rename conv_last to final_convolution * Fix rebase * Improve upsample module * Add padding to image processor * Fix bug * Update padding * Remove print statement and fix integration test * Improve docs * Add image processor tests * Convert all checkpoints, fix tests * Remove print statements * Fix import Co-authored-by: Niels Rogge <nielsrogge@Nielss-MacBook-Pro.local> |
||
![]() |
5f94855dc3
|
Add gpt-sw3 model to transformers (#20209)
* Add templates for gpt-sw3 * Add templates for gpt-sw3 * Added sentencepiece tokenizer * intermediate commit with many changes * fixed conflicts * Init commit for tokenization port * Tokenization progress * Remove fast tokenizer * Clean up and rename spm.model -> spiece.model * Remove TF -> PT conversion script template, Clean up Megatron -> PT script * Optimize encode & decode performance * added new attention * added new attention * attention for gpt-sw3 working * attention good * Cache is now working * fixed attention mask so that it works with causal attention * fixed badbmm bug for cpu and caching * updated config with correct parameters * Refactor and leave optimizations as separate functions to avoid breaking expected functionality * Fix special tokens mapping for both tokenizers * cleaning up of code and comments * HF compatible attention outputs * Tokenizer now passing tests, add documentation * Update documentation * reverted back to base implementation after checking that it is identical to pretrained model * updated gpt-sw3 config * updated conversion script * aligned parameters with gpt-sw3 config * changed default scale_attn_by_inverse_layer_idx to true * removed flag from conversion script * added temporary model path * reverted back to functioning convert script * small changes to default config * updated tests for gpt-sw3 * make style, make quality, minor cleanup * Change local paths to testing online repository * Change name: GptSw3 -> GPTSw3 * Remove GPTSw3TokenizerFast references * Use official model repository and add more model sizes * Added reference to 6.7b model * Add GPTSw3DoubleHeadsModel to IGNORE_NON_AUTO_CONFIGURED, like GPT2DoubleHeadsModel * Remove pointers to non-existing TFGPTSw3 * Add GPTSw3 to docs/_toctree.yml * Remove TF artifacts from GPTSw3 in __init__ files * Update README:s with 'make fix-copies' * Add 20b model to archive list * Add documentation for GPT-Sw3 * Fix typo in documentation for GPT-Sw3 * Do 'make fix-copies' 
again after having updated docs * Fix some typos in docs * Update src/transformers/models/gpt_sw3/configuration_gpt_sw3.py Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> * Update src/transformers/models/gpt_sw3/configuration_gpt_sw3.py Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> * Update src/transformers/models/gpt_sw3/__init__.py Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> * Update src/transformers/models/gpt_sw3/__init__.py Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> * Update src/transformers/models/gpt_sw3/convert_megatron_to_pytorch.py Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> * Update src/transformers/models/gpt_sw3/modeling_gpt_sw3.py Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> * Update tests/models/gpt_sw3/test_tokenization_gpt_sw3.py Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> * Update src/transformers/models/gpt_sw3/modeling_gpt_sw3.py Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> * Update src/transformers/models/gpt_sw3/modeling_gpt_sw3.py Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> * Resolve comments from PR feedback * Resolve more comments from PR feedback, also set use_cache=True in convert script * Add '# Copied from' comments for GPTSw3 modeling * Set 'is_parallelizable = False' * Remove '# Copied from' where code was modified and add 'with x->y' when appropriate * Remove parallelize in mdx * make style, make quality * Update GPTSw3Config default values and corresponding documentation * Update src/transformers/models/gpt_sw3/tokenization_gpt_sw3.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update src/transformers/models/gpt_sw3/__init__.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Clean up and protect GPTSw3Tokenizer imports with 
is_sentencepiece_available * Make style, make quality * Add dummy object for GPTSw3Tokenizer via 'make fix-copies' * make fix-copies * Remove GPTSw3 modeling classes * make style, make quality * Add GPTSw3 auto-mappings for other GPT2 heads * Update docs/source/en/model_doc/gpt-sw3.mdx Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> * Update src/transformers/models/gpt_sw3/convert_megatron_to_pytorch.py Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> * Update src/transformers/models/gpt_sw3/tokenization_gpt_sw3.py Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> * Remove old TODO-comment * Add example usage to GPTSw3Tokenizer docstring * make style, make quality * Add implementation details and example usage to gpt-sw3.mdx Co-authored-by: JoeyOhman <joeyoh@kth.se> Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> |
||
![]() |
d151a8c550
|
Add BiT + ViT hybrid (#20550)
* First draft * More improvements * Add backbone, first draft of ViT hybrid * Add AutoBackbone * More improvements * Fix bug * More improvements * More improvements * Convert ViT-hybrid * More improvements * add patch bit * Fix style * Improve code * cleaned v1 * more cleaning * more refactoring * Improve models, add tests * Add docs and tests * Make more tests pass * Improve default backbone config * Update model_type * Fix more tests * Add more copied from statements * More improvements * Add push to hub to conversion scripts * clean * more cleanup * clean * replace to * fix * Update src/transformers/models/bit/configuration_bit.py Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * fix base model prefix * more cleaning * get rid of stem * clean * replace flag * Update src/transformers/models/bit/configuration_bit.py Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Update src/transformers/models/bit/configuration_bit.py Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * add check * another check * fix for hybrid vit * final fix * update config * fix class name * fix `make fix-copies` * remove `use_activation` * Update src/transformers/models/bit/configuration_bit.py * rm unneeded file * Add BiT image processor * rm unneeded file * add doc * Add image processor to conversion script * Add ViTHybrid image processor * Add resources * Move bit to correct position * Fix auto mapping * Rename hybrid to Hybrid * Fix name in toctree * Fix READMEs' * Improve config * Simplify GroupNormActivation layer * fix test + make style * Improve config * Apply suggestions from code review Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com> * remove comment * remove comment * replace * replace * remove all conv_layer * refactor norm_layer * revert x * add copied from * last changes + integration tests * make fixup * Apply suggestions from code review Co-authored-by: Sylvain Gugger 
<35901082+sgugger@users.noreply.github.com> * fix name * fix message * remove assert and refactor * refactor + make fixup * refactor - add + sfety checker * fix docstring + checkpoint names * fix merge issues * fix function name * fix copies * Apply suggestions from code review Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * fix model checkpoint * fix doctest output * vit name on doc * fix name on doc * fix small nits * fixed integration tests * final changes - slow tests pass Co-authored-by: Niels Rogge <nielsrogge@Nielss-MacBook-Pro.local> Co-authored-by: younesbelkada <younesbelkada@gmail.com> Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com> Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com> Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> |
||
![]() |
25e10da427
|
Adding anchor links to Hindi README (#20606) | ||
![]() |
13e736685a
|
Add BioGPT (#20420)
* biogpt initial commit * updated init * fix faster decoding with use_cache * 1. fix input_ids and input_embeds with correct device 2. added _keys_to_ignore_on_load_missing 3. updated prepare_inputs_for_generation * add activation_dropout and scale_embedding * replace fsmt attention with bart attention * added test * run make fix-copies * doc init and fix build * updated README with proper information * 1. added tips to docs 2. updated BioGptTokenizer func * 1. added tokenizer test 2. refactor tokenizer * make fixup * add biogpt fairseq to hf converter * updated layer names more similar to original checkpoints * config update doc string and set defaults * added "#copied" from bart model and updated doc strings * enable model_input_names in tokenizer * 1. positionalembedding depending on attention_mask 2. added attention mask to prepare for generation * added test to verify past and generation * BioGptLMHeadModel -> BioGptForCausalLM * fix typo * tokenization and test Copyright and updated assertion * updated Copyright and one func at time in line * Copyright updates and minor doc fix * replace assertion with ValueError * rm extra space * added code syntax * revert cmnt position change * add tokenizer to auto * updated doc string * tokenizer doc string update * biogpt hub model update to microsoft/biogpt * make fixup * rm cmnt to fix flake8 5.0.4 vs 6 error |
||
![]() |
cc3d0e1b01
|
[New Model] Add TimeSformer model (#18908)
* init timesformer * apply fix-copies * reformat style * revert back some incoorect style updates * init timesformer * apply fix-copies * reformat style * revert back some incoorect style updates * update timseformer doc * add some functions and classes * add new config params * implement multiple classes * update TimeSformerLayer * update TimeSformerModel, TimeSformerPreTrainedModel, TimeSformerEncoder * several fixes * reformat * temporary update * fix some typos * fix weight converter * more fixes * fix a typo * fix typo * remove redundant params * fix for latest hf-hub * merge fix * fix some checks * video classification works with einops * add paper info to docs * merge fix * remove redundant line * remove redundant docstring * update config * fix some typos * fix converter * update some test constants * refactor einops functions * reformat * fix a comment * remove redundat imports * reformat * fix a typo * remove comment * remove unused imports * remove redundant doc line * reformat * add missing line * fix docs * fix timesformer auto feat ext * add unittests * reformat * fix docs * some fixes and updates * fix readme * fix modeling * fix readme * update index * revert _toctree.yml changes * update timseformer.mdx * update drop_path_prob to drop_path_rate * add dosctring for drop_path_rate * update TimeSformerPatchEmbed naming * remove to_2tuple * explicit use of nn.functional * reformat * many updates from review comments * fix a typo * reformat * remove assert, better variable name * make variable names more explicit * add some adapted from * more explicit variable names * remove redundant docstring * fix initilaization * move permute inside embedding * update class names * remove unused imports * add test for video classification * update PretrainedModel with PreTrainedModel * remove double permute * update based on sylvain's review * aply auto fix * update image_processing_auto for timesformer * update hub urls * reformat * remove duplicate import * 
update doc link |
||
![]() |
60d1f31bb0
|
v4.26.0.dev0 | ||
![]() |
721764028e
|
Add Chinese-CLIP implementation (#20368)
* init chinese-clip model from clip * init model tests and docs * implement chinese-clip into hf * implement chinese-clip into hf * implement chinese-clip into hf * implement chinese-clip into hf * implement chinese-clip into hf * update usecase example in model implementation * fix codestyle * fix model_type typo in readme * add placeholder in doc * add placeholder in doc * update the init script * update usecase * fix codestyle * update testcase * update testcase * update testcase * update testcase * update testcase * update testcase * update testcase * update testcase * update testcase * update testcase * update testcase * update testcase * forward the convert_rgb * update testcase * update testcase * update testcase * merge the recent update from clip about model_input_name property * update the doc * update the doc * update the doc * update the doc * remove unused imports * reformat code style * update the doc * fix isort style * bypass a weird failed unit test which is unrelated with my PR * update the doc * implement independent vision config class * implement independent vision model class * fix refactor bug * fix refactor bug * fix refactor bug * make style * fix refactor bug * make style * fix refactor bug * fix refactor bug * make style * fix refactor bug * fix refactor bug * doc-build restyle * implement independent text config class * implement independent text model class * implement independent text model class * make style * make fix-copies * fix refactor bug * fix refactor bug * fix refactor bug * fix refactor bug * fix refactor bug * fix refactor bug * fix refactor bug * fix refactor bug * fix refactor bug * fix refactor bug * make style * update doc * black and isort * update doc * Update src/transformers/models/chinese_clip/configuration_chinese_clip.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update src/transformers/models/auto/tokenization_auto.py Co-authored-by: Sylvain Gugger 
<35901082+sgugger@users.noreply.github.com> * modify the model type from chinese-clip to chinese_clip * format the example comment of ChineseCLIPVisionConfig * correct the copyright comment * fix the tokenizer specification * add copied from for loss function * remove unused class * update CHINESE_CLIP_TEXT_INPUTS_DOCSTRING * update CHINESE_CLIP_INPUTS_DOCSTRING * update doc * update doc * update code comment in config * update copied from statement * make style * rename the doc file * add copied statement * remove unused attention_mask, causal_attention_mask in ChineseCLIPVisionEncoder * remove ChineseCLIPTextPreTrainedModel * fix bug * fix bug * fix bug * update doc * make style * Update src/transformers/models/chinese_clip/configuration_chinese_clip.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update src/transformers/models/chinese_clip/configuration_chinese_clip.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * update ChineseCLIPImageProcessor in image_processing_auto * fix config_class of chinesecliptextmodel * fix the test case * update the docs * remove the copied from comment for ChineseCLIPTextModel, since it has diverged from BertModel with customed config_class * update the testcase * final fix Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> Co-authored-by: ydshieh <ydshieh@users.noreply.github.com> |
||
![]() |
4973d2a04c
|
Add Audio Spectrogram Transformer (#19981)
* First draft * Make conversion script work * Add id2label mapping, run code quality * Fix copies * Add first draft of feature extractor * Update conversion script to use feature extractor * Make more tests pass * Add docs * update input_features to input_values + pad by default to max length * Fix doc tests * Add feature extractor tests * Add proper padding/truncation to feature extractor * Add support for conversion of all audioset checkpoints * Improve docs and extend conversion script * Fix README * Rename spectogram to spectrogram * Fix copies * Add integration test * Remove dummy conv * Update to ast * Update organization * Fix init * Rename model to AST * Add require_torchaudio annotator * Move import of ASTFeatureExtractor under a is_speech_available * Fix rebase * Add pipeline config * Update name of classifier head * Rename time_dimension and frequency_dimension for clarity * Remove print statement * Fix pipeline test * Fix pipeline test * Fix index table * Fix init * Fix conversion script * Rename to ForAudioClassification * Fix index table Co-authored-by: Niels Rogge <nielsrogge@Nielss-MacBook-Pro.local> |
||
![]() |
d21c97cc0f
|
add MobileNetV1 model (#17799)
* add model files etc for MobileNetV2 * rename files for MobileNetV1 * initial implementation of MobileNetV1 * fix conversion script * cleanup * write docs * tweaks * fix conversion script * extract hidden states * fix test cases * make fixup * fixup it all * remove main from doc link * fixes * fix tests * fix up * use google org * fix weird assert * fixup * use google organization for checkpoints |
||
![]() |
22d7161a52
|
fix: "BigSicence" typo in docs (#20331) | ||
![]() |
fc4a993e1b
|
Add Neighborhood Attention Transformer (NAT) and Dilated NAT (DiNAT) models (#20219)
* Add DiNAT * Adds DiNAT + tests * Minor fixes * Added HF model * Add natten to dependencies. * Cleanup * Minor fixup * Reformat * Optional NATTEN import. * Reformat & add doc to _toctree * Reformat (finally) * Dummy objects for DiNAT * Add NAT + minor changes Adds NAT as its own independent model + docs, tests Adds NATTEN to ext deps to ensure ci picks it up. * Remove natten from `all` and `dev-torch` deps, add manual pip install to ci tests * Minor fixes. * Fix READMEs. * Requested changes to docs + minor fixes. * Requested changes. * Add NAT/DiNAT tests to layoutlm_job * Correction to Dinat doc. * Requested changes. |
||
![]() |
163ac3d3ee
|
Add Switch transformers (#19323)
* first commit * add more comments * add router v1 * clean up - remove `tf` modeling files * clean up - remove `tf` modeling files * clean up * v0 routers * added more router - Implemented `ExpertsChooseMaskedRouter` - added tests - 2 more routers to implement * last router * improved docstring - completed the docstring in `router.py` - added more args in the config * v0 sparse mlp * replace wrong naming * forward pass run * update MOE layer * small router update * fixup * consistency * remove scatter router * remove abstract layer * update test and model for integration testing * v1 conversion * update * hardcode hack * all keys match * add gin conversion, without additional libraries * update conversion sctipy * delete router file * update tests wrt router deletion * fix router issues * update expert code * update, logits match, code needsREFACTORING * Refactor code Co-authored-by: Younes Belkada <younesbelkada@users.noreply.github.com> * add generate tests Co-authored-by: younesbelkada <younesbelkada@gmail.com> * add support for router loss Co-authored-by: Younes Belkada <younesbelkada@users.noreply.github.com> * fix forward error * refactor a bit * remove `FlaxSwitchTransformers` modules * more tests pass * Update code Co-authored-by: Younes Belkada <younesbelkada@users.noreply.github.com> * fixup * fix tests * fix doc * fix doc + tokenization * fix tokenizer test * fix test * fix loss output * update code for backward pass * add loss support * update documentation * fix documentation, clean tokenizer * more doc fix, cleanup example_switch * fix failing test * fix test * fix test * fix loss issue * move layer * update doc and fix router capacity usage * fixup * add sparse mlp index for documentation on hub * fixup * test sparse mix architecture * Apply suggestions from code review * Update docs/source/en/model_doc/switch_transformers.mdx * fixup on update * fix tests * fix another test * attempt fix * Update 
src/transformers/models/switch_transformers/configuration_switch_transformers.py Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> * Update src/transformers/models/switch_transformers/convert_switch_transformers_original_flax_checkpoint_to_pytorch.py Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> * try * all tests pass * fix jitter noise * Apply suggestions from code review * doc tests pass * Update src/transformers/models/switch_transformers/modeling_switch_transformers.py Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> * Update src/transformers/models/switch_transformers/modeling_switch_transformers.py Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> * remove assert * change config order * fix readme japanese * Apply suggestions from code review Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * remove parallelizable tests + add one liners * remove ONNX config * fix nits - add `T5Tokenizer` in auto mapping - remove `Switch Transformers` from ONNX supported models * remove `_get_router` * remove asserts * add check in test for `router_dtype` * add `SwitchTransformersConfig` in `run_pipeline_test` * Update tests/pipelines/test_pipelines_summarization.py * add huge model conversion script * fix slow tests - add better casting for `Linear8bitLt` - remove `torchscript` tests * add make dir * style on new script * fix nits - doctest - remove `_keys_to_ignore_on_load_unexpected` * Update src/transformers/models/switch_transformers/configuration_switch_transformers.py * add google as authors * fix year * remove last `assert` statements * standardize vertical spaces * fix failing import * fix another failing test * Remove strange `authorized_keys` * removing todo and padding that is never used Co-authored-by: Arthur Zucker <arthur.zucker@gmail.com> Co-authored-by: ybelkada <younes@huggingface.co> Co-authored-by: Younes Belkada 
<younesbelkada@users.noreply.github.com> Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> Co-authored-by: Arthur Zucker <arthur@huggingface.co> |
||
![]() |
f711d683b5
|
add MobileNetV2 model (#17845)
* add model files etc for MobileNetV2 * rename files for MobileNetV1 * initial implementation of MobileNetV1 * fix conversion script * cleanup * write docs * tweaks * fix conversion script * extract hidden states * fix test cases * make fixup * fixup it all * rename V1 to V2 * fix checkpoints * fixup * implement first block + weight conversion * add remaining layers * add output stride and dilation * fixup * add tests * add deeplabv3+ head * a bit of fixup * finish deeplab conversion * add link to doc * fix issue with JIT trace in_height and in_width would be Tensor objects during JIT trace, which caused Core ML conversion to fail on the remainder op. By making them ints, the result of the padding calculation becomes a constant value. * cleanup * fix order of models * fix rebase error * remove main from doc link * add image processor * remove old feature extractor * fix converter + other issues * fixup * fix unit test * add to onnx tests (but these appear broken now) * add post_process_semantic_segmentation * use google org * remove unused imports * move args * replace weird assert |
||
![]() |
61a51f5f23
|
Add Jukebox model (replaces #16875) (#17826) | ||
![]() |
efa889d2e4
|
Add RocBert (#20013)
* add roc_bert * update roc_bert readme * code style * change name and delete unuse file * udpate model file * delete unuse log file * delete tokenizer fast * reformat code and change model file path * add RocBertForPreTraining * update docs * delete wrong notes * fix copies * fix make repo-consistency error * fix files are not present in the table of contents error * change RocBert -> RoCBert * add doc, add detail test Co-authored-by: weiweishi <weiweishi@tencent.com> |
||
![]() |
258963062b
|
Add CLIPSeg (#20066)
* Add first draft * Update conversion script * Improve conversion script * Improve conversion script some more * Add conditional embeddings * Add initial decoder * Fix activation function of decoder * Make decoder outputs match original implementation * Make decoder outputs match original implementation * Add more copied from statements * Improve model outputs * Fix auto tokenizer file * Fix more tests * Add test * Improve README and docs, improve conditional embeddings * Fix more tests * Remove print statements * Remove initial embeddings * Improve conversion script * Add interpolation of position embeddings * Finish addition of interpolation of position embeddings * Add support for refined checkpoint * Fix refined checkpoint * Remove unused parameter * Improve conversion script * Add support for training * Fix conversion script * Add CLIPSegFeatureExtractor * Fix processor * Fix CLIPSegProcessor * Fix conversion script * Fix most tests * Fix equivalence test * Fix README * Add model to doc tests * Use better variable name * Convert other checkpoint as well * Update config, add link to paper * Add docs * Update organization * Replace base_model_prefix with clip * Fix base_model_prefix * Fix checkpoint of config * Fix config checkpoint * Remove file * Use logits for output * Fix tests Co-authored-by: Niels Rogge <nielsrogge@Nielss-MacBook-Pro.local> |
||
![]() |
6e1c5786dc
|
Update READMEs for ESMFold and add notebooks (#20067)
* Update READMEs for ESMFold and add notebooks * Fix PyCharm formatting * make fix-copies |
||
![]() |
38e5b71abb
|
Add Japanese translated README (#19945)
* Add japanese translated README.md * Add README_ja.md link * Add japanese transkate to check_copies.py * Add guide to Japanese README.md * Update README_ja.md Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com> * Update utils/check_copies.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com> Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> |
||
![]() |
c3a93d8d82
|
v4.25.0.dev0 | ||
![]() |
7a1c68a845
|
Add flan-t5 documentation page (#19892)
* add `flan-t5` documentation page
* Update README.md
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* add more content
* revert `_toctree` modif
* revert `toctree` modif - 2
* Update README.md
* Revert "Update README.md"
This reverts commit
|
||
![]() |
802b98c72b
|
Correct README image text (#19883)
swap "right" and "left" so description is correct. |
||
![]() |
dd523da577
|
Add table transformer [v2] (#19614)
* First draft * Add conversion script * Make conversion work * Upload checkpoints * Add final fixes * Revert changes of conditional and deformable detr * Fix toctree, add and remove copied from * Use model type * Improve docs * Improve code example * Update copies * Add copied formt * Don't update conditional detr * Don't update deformable detr |
||
![]() |
59e29be363
|
object-detection instead of object_detection (#19677) | ||
![]() |
f4e31a9aa1
|
word replacement line #231 (#19662)
install->installation |
||
![]() |
4d367a3c81
|
Add LiLT (#19450)
* First draft * Fix more things * Improve more things * Remove some head models * Fix more things * Add missing layers * Remove tokenizer * Fix more things * Fix copied from statements * Make all tests pass * Remove print statements * Remove files * Fix README and docs * Add integration test and fix organization * Add tips * Apply suggestions from code review Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Make tests faster, improve docs * Fix doc tests * Add model to toctree * Add docs * Add note about creating new checkpoint * Remove is_decoder * Make tests smaller, add docs Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> |
||
![]() |
0b7b4c60c6
|
Adding the README_es.md and reference to it in the others files readme (#19427)
* Adding the README_es.md and reference to it in the others files readme * Updating the check_copies.py * Updating README_es.md * Updating chec_copies |
||
![]() |
10100979ed | Dev version | ||
![]() |
e3f028f3af
|
Add TF whisper (#19378)
* simplify loop * add featur extractor * add model * start conversion * add dropout * initial commit of test files * copnversion for all models * update processor for correct padding * update feature extraction * update integration test logits match * fmnt: off for the logits * on the fly mel bank * small nit * update test * update tokenizer * nit feature extraction * update * update tokenizer test * adds logit processor and update tokenizer to get supress tokens * style * clean convert * revert to original modeling tf utils * Update * update * nit * clean convert file * update tests and nits * quality * slow generation test * ffn_dim to allow customization * update readme * add to toctreee * start fixing integration tests * update tests and code * fix feature extractor * fix config tests common * update code to fix tests * fix feature exctractor * nit feature extraction * update test for new feature extractor * style * add absrtact * large logits wioth custom decoder input ids * wraap around is otrch available * fix feature extractor * correct logits for whisper small.en * nit * fix encoder_attentino_mask * some fixes * remove unnecessary inputs * nits * add normalizer file * update etst tokenization * fix attention mask not defined * fix generate * remove uncoder attention mask useless * update test modeling whisper * update condfig to add second non supress tokens * nits on feature exrtactor * nit for test tokenizers * update etsts * update tests * update tokenization test * fixup * invalidated hf token. 
Clean convert openai to whisper * fix logit tests * fixup * Add model to README * Fix doc tests * clean merge * revert toc_tree changes * remove useless LogitProcessor * Update whisper .mdx * update config file doc * update configuration docstring * update test tokenization * update test tokenization * update tokenization whisper Added copied from where needed * update feature extraction * nit test name * style * quality * remove get suppress tokens and update non_speech tokens global variables * Update src/transformers/models/whisper/feature_extraction_whisper.py Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com> * clean modeling whisper and test Removed the attention mask arguments that are deprecated * fix large test * Add multilingual audio test, and translate test * style * fix larg multilingual test * nits * add copied from for attention layer * remove attention masks in doc * add english normalizer * Update docs/source/en/model_doc/whisper.mdx Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com> * update tokenization test * remove copied from in whisper attention : no bias in k_proj only * wrap around dependencies in english normalizer * style * correct import generation logits * for now, wrap feature extractor with torch * remove torch depencies for feature extraction and style * Update src/transformers/models/whisper/convert_openai_whisper_to_tfms.py Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Update src/transformers/models/whisper/configuration_whisper.py Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Update docs/source/en/model_doc/whisper.mdx Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * fixup * nit * update logitds * style * nit * nits and fix final tests * add `is_more_itertools_available` to utils * quality * add begin supress tokens, supress tokens to generate args and config * clean supressTokensLogitProcessor in generation logits * 
Nit naming * add supressTokensAtBegin * udpate tests, supress tokens to None or correct values * nit and style * update RAG to fit test and generate_logit * add copy pasted statment on english normalizer * add arguments to config_common_kwargs * Update src/transformers/generation_utils.py Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Update src/transformers/generation_logits_process.py Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * revert changes based on reviews * update doc and nits * Update src/transformers/models/whisper/configuration_whisper.py Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Apply suggestions from code review Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com> Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * more nits * last nits * update test configuration common * add BART name in decoder attention mask documentation * Update src/transformers/models/whisper/modeling_whisper.py Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * style * nit * nit * add english.json file to git * nits on documentation * nit * nits * last styling * add main toctree file * remove sentence piece dependency * clean init file * fix tokenizer that has no dependencies on sentencepiece * update whisper init file, nit * remove english.json file * add get decoder prompt id * All weights loading * Remove hanging pdb * Fixup and tidy up * Use same copied from as PT model * Remove whitespace changes * Remove torch references * Tie embeddings * Remove logits processor input to generate * Update logit values * revert changes and add forced logit processor * nit * clean normalizer * remove protected * Add logit processors and update generation code & tests * Some tidy up * Update docstring * update * update based on review * Update 
src/transformers/models/whisper/configuration_whisper.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update src/transformers/models/whisper/configuration_whisper.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update to reflect changes on the PT model branch * Tidy up * Remove extra whitespace * Fix test - make input ids small enough we can append * Include upstream changes on main * PR comments - add batch tests, remove comments & defaults * Fix model output imports * Update src/transformers/models/whisper/modeling_tf_whisper.py Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com> * Update src/transformers/generation_tf_logits_process.py Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com> * Update src/transformers/models/whisper/modeling_tf_whisper.py Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com> * Update src/transformers/models/whisper/modeling_tf_whisper.py Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com> * Update tests/models/whisper/test_modeling_tf_whisper.py Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com> * Update src/transformers/models/whisper/modeling_tf_whisper.py Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com> * Update src/transformers/models/whisper/modeling_tf_whisper.py Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com> * Update docstring example * Update src/transformers/models/whisper/modeling_tf_whisper.py Co-authored-by: Matt <Rocketknight1@users.noreply.github.com> * Remove changes to adjust_logits_during_generation function * Update src/transformers/models/whisper/modeling_tf_whisper.py Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> * Tidy up imports that don't require TF * Update tests - skip and no more skip * Update tests/generation/test_generation_tf_logits_process.py Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com> * Update 
src/transformers/models/whisper/modeling_tf_whisper.py * Update src/transformers/models/whisper/modeling_tf_whisper.py Co-authored-by: Matt <Rocketknight1@users.noreply.github.com> * Add training flags * Add (skipped) XLA generation tests * Add embedding correctness test * Add constant ids for generation tests * Make logits finding a bit tidier * Remove unused args * xla generation enabled * Don't skip XLA tests anymore * Fix tests - add position ids to expected signature and update rag generation * Undo method reorder * Remove added whitespace * Remove copy-paste gradient checkopint ref * Remove * Trigger CI - (issue with refs when pulling) Co-authored-by: Arthur Zucker <arthur.zucker@gmail.com> Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com> Co-authored-by: NielsRogge <niels.rogge1@gmail.com> Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com> Co-authored-by: Matt <Rocketknight1@users.noreply.github.com> Co-authored-by: Joao Gante <joao@huggingface.co> |
||
![]() |
45e14038f2
|
Add WhisperModel to transformers (#19166)
* simplify loop * add featur extractor * add model * start conversion * add dropout * initial commit of test files * copnversion for all models * update processor for correct padding * update feature extraction * update integration test logits match * fmnt: off for the logits * on the fly mel bank * small nit * update test * update tokenizer * nit feature extraction * update * update tokenizer test * adds logit processor and update tokenizer to get supress tokens * style * clean convert * revert to original modeling tf utils * Update * update * nit * clean convert file * update tests and nits * quality * slow generation test * ffn_dim to allow customization * update readme * add to toctreee * start fixing integration tests * update tests and code * fix feature extractor * fix config tests common * update code to fix tests * fix feature exctractor * nit feature extraction * update test for new feature extractor * style * add absrtact * large logits wioth custom decoder input ids * wraap around is otrch available * fix feature extractor * correct logits for whisper small.en * nit * fix encoder_attentino_mask * some fixes * remove unnecessary inputs * nits * add normalizer file * update etst tokenization * fix attention mask not defined * Add model to README * Fix doc tests * fix generate * remove uncoder attention mask useless * update test modeling whisper * update condfig to add second non supress tokens * nits on feature exrtactor * nit for test tokenizers * update etsts * update tests * update tokenization test * fixup * invalidated hf token. 
Clean convert openai to whisper * fix logit tests * fixup * clean merge * revert toc_tree changes * remove useless LogitProcessor * Update whisper .mdx * update config file doc * update configuration docstring * update test tokenization * update test tokenization * update tokenization whisper Added copied from where needed * update feature extraction * nit test name * style * quality * remove get suppress tokens and update non_speech tokens global variables * Update src/transformers/models/whisper/feature_extraction_whisper.py Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com> * clean modeling whisper and test Removed the attention mask arguments that are deprecated * fix large test * Add multilingual audio test, and translate test * style * fix larg multilingual test * nits * Update docs/source/en/model_doc/whisper.mdx Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com> * add copied from for attention layer * remove attention masks in doc * add english normalizer * update tokenization test * remove copied from in whisper attention : no bias in k_proj only * wrap around dependencies in english normalizer * style * correct import generation logits * for now, wrap feature extractor with torch * Update src/transformers/models/whisper/convert_openai_whisper_to_tfms.py Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Update src/transformers/models/whisper/configuration_whisper.py Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Update docs/source/en/model_doc/whisper.mdx Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * remove torch depencies for feature extraction and style * fixup * nit * update logitds * style * nit * nits and fix final tests * add `is_more_itertools_available` to utils * quality * add begin supress tokens, supress tokens to generate args and config * clean supressTokensLogitProcessor in generation logits * Nit naming * add supressTokensAtBegin * 
udpate tests, supress tokens to None or correct values * nit and style * update RAG to fit test and generate_logit * add copy pasted statment on english normalizer * add arguments to config_common_kwargs * Update src/transformers/generation_utils.py Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Update src/transformers/generation_logits_process.py Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Update src/transformers/models/whisper/configuration_whisper.py Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Apply suggestions from code review Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com> Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * revert changes based on reviews * update doc and nits * more nits * last nits * update test configuration common * add BART name in decoder attention mask documentation * Update src/transformers/models/whisper/modeling_whisper.py Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * style * nit * nit * add english.json file to git * nits on documentation * nit * nits * last styling * add main toctree file * remove sentence piece dependency * clean init file * fix tokenizer that has no dependencies on sentencepiece * update whisper init file, nit * remove english.json file * add get decoder prompt id * revert changes and add forced logit processor * nit * clean normalizer * remove protected * update * Update src/transformers/models/whisper/configuration_whisper.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * update based on review * Update src/transformers/models/whisper/configuration_whisper.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * add batched tests Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com> Co-authored-by: NielsRogge 
<niels.rogge1@gmail.com> Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> |
||
![]() |
534cd8ff94
|
Update README.md (#19309) | ||
![]() |
5cd16f01db
|
time series forecasting model (#17965)
* initial files * initial model via cli * typos * make a start on the model config * ready with configuation * remove tokenizer ref. * init the transformer * added initial model forward to return dec_output * require gluonts * update dep. ver table and add as extra * fixed typo * add type for prediction_length * use num_time_features * use config * more config * typos * opps another typo * freq can be none * default via transformation is 1 * initial transformations * fix imports * added transform_start_field * add helper to create pytorch dataloader * added inital val and test data loader * added initial distr head and loss * training working * remove TimeSeriesTransformerTokenizer Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Update src/transformers/__init__.py Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Update src/transformers/models/time_series_transformer/__init__.py Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * fixed copyright * removed docs * remove time series tokenizer * fixed docs * fix text * fix second * fix default * fix order * use config directly * undo change * fix comment * fix year * fix import * add additional arguments for training vs. 
test * initial greedy inference loop * fix inference * comment out token inputs to enc dec * Use HF encoder/decoder * fix inference * Use Seq2SeqTSModelOutput output * return Seq2SeqTSPredictionOutput * added default arguments * fix return_dict true * scale is a tensor * output static_features for inference * clean up some unused bits * fixed typo * set return_dict if none * call model once for both train/predict * use cache if future_target is none * initial generate func * generate arguments * future_time_feat is required * return SampleTSPredictionOutput * removed unneeded classes * fix when params is none * fix return dict * fix num_attention_heads * fix arguments * remove unused shift_tokens_right * add different dropout configs * implement FeatureEmbedder, Scaler and weighted_average * remove gluonts dependency * fix class names * avoid _variable names * remove gluonts dependency * fix imports * remove gluonts from configuration * fix docs * fixed typo * move utils to examples * add example requirements * config has no freq * initial run_ts_no_trainer * remove from ignore * fix output_attentions and removed unsued getters/setters * removed unsed tests * add dec seq len * add test_attention_outputs * set has_text_modality=False * add config attribute_map * make style * make fix-copies * add encoder_outputs to TimeSeriesTransformerForPrediction forward * Improve docs, add model to README * added test_forward_signature * More improvements * Add more copied from * Fix README * Fix remaining quality issues * updated encoder and decoder * fix generate * output_hidden_states and use_cache are optional * past key_values returned too * initialize weights of distribution_output module * fixed more tests * update test_forward_signature * fix return_dict outputs * Update src/transformers/models/time_series_transformer/configuration_time_series_transformer.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update 
src/transformers/models/time_series_transformer/configuration_time_series_transformer.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update src/transformers/models/time_series_transformer/configuration_time_series_transformer.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update src/transformers/models/time_series_transformer/configuration_time_series_transformer.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update src/transformers/models/time_series_transformer/modeling_time_series_transformer.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update src/transformers/models/time_series_transformer/modeling_time_series_transformer.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update src/transformers/models/time_series_transformer/modeling_time_series_transformer.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * removed commented out tests * added neg. 
bin and normal output * Update src/transformers/models/time_series_transformer/configuration_time_series_transformer.py Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * move to one line * Add docstrings * Update src/transformers/models/time_series_transformer/configuration_time_series_transformer.py Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * add try except for assert and raise * try and raise exception * fix the documentation formatting * fix assert call * fix docstring formatting * removed input_ids from DOCSTRING * Update input docstring * Improve variable names * Update order of inputs * Improve configuration * Improve variable names * Improve docs * Remove key_length from tests * Add extra docs * initial unittests * added test_inference_no_head test * added test_inference_head * add test_seq_to_seq_generation * make style * one line * assert mean prediction * removed comments * Update src/transformers/models/time_series_transformer/modeling_time_series_transformer.py Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Update src/transformers/models/time_series_transformer/modeling_time_series_transformer.py Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * fix order of args * make past_observed_mask optional as well * added Amazon license header * updated utils with new fieldnames * make style * cleanup * undo position of past_observed_mask * fix import * typo * more typo * rename example files * remove example for now * Update docs/source/en/_toctree.yml Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update src/transformers/models/time_series_transformer/configuration_time_series_transformer.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update src/transformers/models/time_series_transformer/modeling_time_series_transformer.py Co-authored-by: Sylvain Gugger 
<35901082+sgugger@users.noreply.github.com> * Update src/transformers/models/time_series_transformer/modeling_time_series_transformer.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update modeling_time_series_transformer.py fix style * fixed typo * fix typo and grammer * fix style Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> Co-authored-by: NielsRogge <niels.rogge1@gmail.com> Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> |
||
![]() |
368b649af6
|
Rebase ESM PR and update all file formats (#19055)
* Rebase ESM PR and update all file formats * Fix test relative imports * Add __init__.py to the test dir * Disable gradient checkpointing * Remove references to TFESM... FOR NOW >:| * Remove completed TODOs from tests * Convert docstrings to mdx, fix-copies from BERT * fix-copies for the README and index * Update ESM's __init__.py to the modern format * Add to _toctree.yml * Ensure we correctly copy the pad_token_id from the original ESM model * Ensure we correctly copy the pad_token_id from the original ESM model * Tiny grammar nitpicks * Make the layer norm after embeddings an optional flag * Make the layer norm after embeddings an optional flag * Update the conversion script to handle other model classes * Remove token_type_ids entirely, fix attention_masking and add checks to convert_esm.py * Break the copied from link from BertModel.forward to remove token_type_ids * Remove debug array saves * Begin ESM-2 porting * Add a hacky workaround for the precision issue in original repo * Code cleanup * Remove unused checkpoint conversion code * Remove unused checkpoint conversion code * Fix copyright notices * Get rid of all references to the TF weights conversion * Remove token_type_ids from the tests * Fix test code * Update src/transformers/__init__.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update src/transformers/__init__.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update README.md Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Add credit * Remove _ args and __ kwargs in rotary embedding * Assertively remove asserts * Replace einsum with torch.outer() * Fix docstring formatting * Remove assertions in tokenization * Add paper citation to ESMModel docstring * Move vocab list to single line * Remove ESMLayer from init * Add Facebook copyrights * Clean up RotaryEmbedding docstring * Fix docstring formatting * Fix docstring for config object * Add explanation for 
new config methods * make fix-copies * Rename all the ESM- classes to Esm- * Update conversion script to allow pushing to hub * Update tests to point at my repo for now * Set config properly for tests * Remove the gross hack that forced loss of precision in inv_freq and instead copy the data from the model being converted * make fixup * Update expected values for slow tests * make fixup * Remove EsmForCausalLM for now * Remove EsmForCausalLM for now * Fix padding idx test * Updated README and docs with ESM-1b and ESM-2 separately (#19221) * Updated README and docs with ESM-1b and ESM-2 separately * Update READMEs, longer entry with 3 citations * make fix-copies Co-authored-by: Your Name <you@example.com> Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> Co-authored-by: Tom Sercu <tsercu@fb.com> Co-authored-by: Your Name <you@example.com> |
||
![]() |
f3d2f7a6e0
|
Add MarkupLM (#19198)
* First draft * Make basic test work * Fix most tokenizer tests * More improvements * Make more tests pass * Fix more tests * Fix some code quality * Improve truncation * Implement feature extractor * Improve feature extractor and add tests * Improve feature extractor tests * Fix pair_input test partly * Add fast tokenizer * Improve implementation * Fix rebase * Fix rebase * Fix most of the tokenizer tests. * propose solution for fast * add: integration test for fasttokenizer, warning for decode, fix template in slow tokenizer * add: modify markuplmconverter * add: some modify on converter and tokenizerfast * Fix style, copies * Make fixup * Update tokenization_markuplm.py * Update test_tokenization_markuplm.py * Update markuplm related * Improve processor, add integration test * Add processor test file * Improve processor * Improve processor tests * Fix more processor tests * Fix processor tests * Update docstrings * Add Copied from statements * Add more Copied from statements * Add code examples * Improve code examples * Add model to doc tests * Adding dependency check * Add dummy file * Add requires_backends * Add model to toctree * Fix more things, disable dependency check for now * Apply more suggestions * Add soft dependency * Add annotators to tests * Fix style * Remove from_slow=True * Remove print statements * Add sanity check * Fix processor test * Fix processor tests, add more docs * Add doc tests for mdx file * Add more tips * Apply suggestions Co-authored-by: Niels Rogge <nielsrogge@Nielss-MacBook-Pro.local> Co-authored-by: lockon-n <45759388+lockon-n@users.noreply.github.com> Co-authored-by: SaulLu <lucilesaul.com@gmail.com> Co-authored-by: lockon-n <dd098309@126.com> |
||
![]() |
2d9853b226
|
MSN (Masked Siamese Networks) for ViT (#18815)
* feat: modeling and conversion scripts for msn. * chore: change license year. * chore: remove unneeded modules. * feat: direct loading of state_dict from remote url. * fix: import paths. * add: rest of the files. * add and fix rest of the files. Co-authored-by: Niels <niels.rogge1@gmail.com> * chore: formatting. * code quality fix. * chore: remove pooler. * feat: add classification top. * fix: configuration object. * add: initial test cases (one failing). * fix: basemodeloutput. * add: caution on using the classification head. * add: rest of the model related files. * add: vit msn readme. * fix: copied from statement. * fix: dummy objects. * add: ViTMSNPreTrainedModel to inits. * fix: repo consistency. * minor change in the model doc. * fix: tests. * Empty-Commit * Update src/transformers/models/vit_msn/configuration_vit_msn.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * address PR comments. * Update src/transformers/models/vit_msn/modeling_vit_msn.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * chore: put model in no_grad() and formatting. Co-authored-by: Niels <niels.rogge1@gmail.com> Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> |
||
![]() |
126a739058
|
Add support for conditional detr (#18948)
* added conditional_detr files * checked copies * checked copies * fixed style and copies * fixed style and copies * fixed hub * fixed style * Update README.md Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Update docs/source/en/_toctree.yml Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Update docs/source/en/index.mdx Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Update src/transformers/models/conditional_detr/configuration_conditional_detr.py Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Update src/transformers/models/conditional_detr/convert_conditional_detr_original_pytorch_checkpoint_to_pytorch.py Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Update src/transformers/models/conditional_detr/configuration_conditional_detr.py Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Update docs/source/en/model_doc/conditional_detr.mdx Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Update src/transformers/models/conditional_detr/configuration_conditional_detr.py Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Update src/transformers/models/conditional_detr/configuration_conditional_detr.py Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Update src/transformers/models/conditional_detr/configuration_conditional_detr.py Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * fixed some doc issue * changed prefix to ConditionalDetr * fixed docs * Update README_ko.md * added spatial_model_name * fixed fix-copies * Update src/transformers/models/conditional_detr/feature_extraction_conditional_detr.py Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Update src/transformers/models/conditional_detr/feature_extraction_conditional_detr.py Co-authored-by: NielsRogge 
<48327001+NielsRogge@users.noreply.github.com> * Update src/transformers/models/conditional_detr/feature_extraction_conditional_detr.py Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Update src/transformers/models/conditional_detr/feature_extraction_conditional_detr.py Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Update src/transformers/models/conditional_detr/modeling_conditional_detr.py Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Update src/transformers/models/conditional_detr/modeling_conditional_detr.py Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Update src/transformers/models/conditional_detr/modeling_conditional_detr.py Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Update src/transformers/models/conditional_detr/modeling_conditional_detr.py Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Update src/transformers/models/conditional_detr/modeling_conditional_detr.py Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Update src/transformers/models/conditional_detr/modeling_conditional_detr.py Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Update src/transformers/models/conditional_detr/modeling_conditional_detr.py Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Update src/transformers/models/conditional_detr/modeling_conditional_detr.py Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Update src/transformers/models/conditional_detr/modeling_conditional_detr.py Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * added some copied from * added some copied from * added some copied from * added some copied from * fixed use_pretrained issue * changed post-process * added conditional_detr files * checked copies * checked copies * fixed style and copies * fixed 
style and copies * fixed hub * fixed style * Update README.md Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Update docs/source/en/_toctree.yml Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Update docs/source/en/index.mdx Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Update src/transformers/models/conditional_detr/configuration_conditional_detr.py Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Update src/transformers/models/conditional_detr/convert_conditional_detr_original_pytorch_checkpoint_to_pytorch.py Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Update src/transformers/models/conditional_detr/configuration_conditional_detr.py Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * fixed some doc issue * Update docs/source/en/model_doc/conditional_detr.mdx Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Update src/transformers/models/conditional_detr/configuration_conditional_detr.py Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Update src/transformers/models/conditional_detr/configuration_conditional_detr.py Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Update src/transformers/models/conditional_detr/configuration_conditional_detr.py Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * changed prefix to ConditionalDetr * fixed docs * Update README_ko.md * added spatial_model_name * fixed fix-copies * Update src/transformers/models/conditional_detr/feature_extraction_conditional_detr.py Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Update src/transformers/models/conditional_detr/feature_extraction_conditional_detr.py Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Update 
src/transformers/models/conditional_detr/feature_extraction_conditional_detr.py Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Update src/transformers/models/conditional_detr/feature_extraction_conditional_detr.py Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Update src/transformers/models/conditional_detr/modeling_conditional_detr.py Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Update src/transformers/models/conditional_detr/modeling_conditional_detr.py Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Update src/transformers/models/conditional_detr/modeling_conditional_detr.py Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Update src/transformers/models/conditional_detr/modeling_conditional_detr.py Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Update src/transformers/models/conditional_detr/modeling_conditional_detr.py Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Update src/transformers/models/conditional_detr/modeling_conditional_detr.py Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Update src/transformers/models/conditional_detr/modeling_conditional_detr.py Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Update src/transformers/models/conditional_detr/modeling_conditional_detr.py Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Update src/transformers/models/conditional_detr/modeling_conditional_detr.py Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * added some copied from * added some copied from * added some copied from * added some copied from * fixed use_pretrained issue * changed post-process * fix style quality and copies * fix style quality and copies * fix style quality and copies * fix style quality and copies * add more fix-copies * 
Update src/transformers/models/conditional_detr/feature_extraction_conditional_detr.py Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * fixed some variable names & added more fix-copies * fixed some variable names & added more fix-copies * Update src/transformers/models/conditional_detr/configuration_conditional_detr.py Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * added more copied from * fixed quality * changed pretrained config * added more copied-from and fixed the issue in feature_extraction_auto * added conditional_detr files * checked copies * checked copies * fixed style and copies * fixed style and copies * fixed hub * fixed style * Update README.md Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Update docs/source/en/_toctree.yml Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Update docs/source/en/index.mdx Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Update src/transformers/models/conditional_detr/configuration_conditional_detr.py Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Update src/transformers/models/conditional_detr/convert_conditional_detr_original_pytorch_checkpoint_to_pytorch.py Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Update src/transformers/models/conditional_detr/configuration_conditional_detr.py Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * fixed some doc issue * Update docs/source/en/model_doc/conditional_detr.mdx Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Update src/transformers/models/conditional_detr/configuration_conditional_detr.py Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Update src/transformers/models/conditional_detr/configuration_conditional_detr.py Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * 
Update src/transformers/models/conditional_detr/configuration_conditional_detr.py Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * changed prefix to ConditionalDetr * fixed docs * Update README_ko.md * added spatial_model_name * fixed fix-copies * Update src/transformers/models/conditional_detr/feature_extraction_conditional_detr.py Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Update src/transformers/models/conditional_detr/feature_extraction_conditional_detr.py Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Update src/transformers/models/conditional_detr/feature_extraction_conditional_detr.py Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Update src/transformers/models/conditional_detr/feature_extraction_conditional_detr.py Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Update src/transformers/models/conditional_detr/modeling_conditional_detr.py Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Update src/transformers/models/conditional_detr/modeling_conditional_detr.py Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Update src/transformers/models/conditional_detr/modeling_conditional_detr.py Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Update src/transformers/models/conditional_detr/modeling_conditional_detr.py Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Update src/transformers/models/conditional_detr/modeling_conditional_detr.py Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Update src/transformers/models/conditional_detr/modeling_conditional_detr.py Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Update src/transformers/models/conditional_detr/modeling_conditional_detr.py Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * 
Update src/transformers/models/conditional_detr/modeling_conditional_detr.py Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Update src/transformers/models/conditional_detr/modeling_conditional_detr.py Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * added some copied from * added some copied from * added some copied from * added some copied from * fixed use_pretrained issue * changed post-process * added conditional_detr files * checked copies * fixed style and copies * fixed some doc issue * changed prefix to ConditionalDetr * fixed docs * added spatial_model_name * fixed fix-copies * Update src/transformers/models/conditional_detr/modeling_conditional_detr.py Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * added some copied from * added some copied from * added some copied from * added some copied from * fix style quality and copies * fix style quality and copies * fix style quality and copies * add more fix-copies * fixed some variable names & added more fix-copies * fixed some variable names & added more fix-copies * Update src/transformers/models/conditional_detr/feature_extraction_conditional_detr.py Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Update src/transformers/models/conditional_detr/configuration_conditional_detr.py Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * added more copied from * fixed quality * changed pretrained config * added more copied-from and fixed the issue in feature_extraction_auto * fixed style * added conditional_detr files * checked copies * checked copies * fixed style and copies * fixed style and copies * fixed hub * fixed style * Update README.md Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Update docs/source/en/_toctree.yml Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Update docs/source/en/index.mdx Co-authored-by: 
NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Update src/transformers/models/conditional_detr/configuration_conditional_detr.py Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Update src/transformers/models/conditional_detr/convert_conditional_detr_original_pytorch_checkpoint_to_pytorch.py Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Update src/transformers/models/conditional_detr/configuration_conditional_detr.py Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * fixed some doc issue * Update docs/source/en/model_doc/conditional_detr.mdx Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Update src/transformers/models/conditional_detr/configuration_conditional_detr.py Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Update src/transformers/models/conditional_detr/configuration_conditional_detr.py Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Update src/transformers/models/conditional_detr/configuration_conditional_detr.py Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * changed prefix to ConditionalDetr * fixed docs * Update README_ko.md * added spatial_model_name * fixed fix-copies * Update src/transformers/models/conditional_detr/feature_extraction_conditional_detr.py Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Update src/transformers/models/conditional_detr/feature_extraction_conditional_detr.py Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Update src/transformers/models/conditional_detr/feature_extraction_conditional_detr.py Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Update src/transformers/models/conditional_detr/feature_extraction_conditional_detr.py Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Update 
src/transformers/models/conditional_detr/modeling_conditional_detr.py Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Update src/transformers/models/conditional_detr/modeling_conditional_detr.py Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Update src/transformers/models/conditional_detr/modeling_conditional_detr.py Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Update src/transformers/models/conditional_detr/modeling_conditional_detr.py Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Update src/transformers/models/conditional_detr/modeling_conditional_detr.py Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Update src/transformers/models/conditional_detr/modeling_conditional_detr.py Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Update src/transformers/models/conditional_detr/modeling_conditional_detr.py Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Update src/transformers/models/conditional_detr/modeling_conditional_detr.py Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Update src/transformers/models/conditional_detr/modeling_conditional_detr.py Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * added some copied from * added some copied from * added some copied from * added some copied from * fixed use_pretrained issue * changed post-process * added conditional_detr files * checked copies * fixed style and copies * fixed some doc issue * changed prefix to ConditionalDetr * fixed docs * added spatial_model_name * fixed fix-copies * Update src/transformers/models/conditional_detr/modeling_conditional_detr.py Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * added some copied from * added some copied from * added some copied from * added some copied from * fix style quality and copies * 
fix style quality and copies * fix style quality and copies * add more fix-copies * fixed some variable names & added more fix-copies * fixed some variable names & added more fix-copies * Update src/transformers/models/conditional_detr/feature_extraction_conditional_detr.py Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Update src/transformers/models/conditional_detr/configuration_conditional_detr.py Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * added more copied from * fixed quality * changed pretrained config * added more copied-from and fixed the issue in feature_extraction_auto * rebased Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> Co-authored-by: Depu Meng <depumeng@Depus-MacBook-Pro.local> |
||
![]() |
56c548f17c
|
Note about developer mode (#19075) | ||
![]() |
16913b3c92 | Dev version | ||
![]() |
f5f430e5c8
|
Add support for Japanese GPT-NeoX-based model by ABEJA, Inc. (#18814)
* add gpt-neox-japanese model and tokenizer as new model * Correction to PR's comment for GPT NeoX Japanese - Fix to be able to use gpu - Add comment # Copied... at the top of RotaryEmbedding - Implement nn.Linear instead of original linear class - Add generation test under @slow * fix bias treatment for gpt-neox-japanese * Modify gpt-neox-japanese following PR - add doc for bias_dropout_add - style change following a PR comment * add document for gpt-neox-japanese * remove unused import from gpt-neox-japanese * fix README for gpt-neox-japanese |
||
![]() |
59407bbeb3
|
Add Deformable DETR (#17281)
* First draft * More improvements * Improve model, add custom CUDA code * Import torch before * Add script that imports custom layer * Add everything in new ops directory * Import custom layer in modeling file * Fix ARCHIVE_MAP typo * Creating the custom kernel on the fly. * Import custom layer in modeling file * More improvements * Fix CUDA loading * More improvements * Improve conversion script * Improve conversion script * Make it work until encoder_outputs * Make forward pass work * More improvements * Make logits match original implementation * Make implementation also support single_scale model * Add support for single_scale and dilation checkpoint * Add support for with_box_refine model * Support also two stage model * Improve tests * Fix more tests * Make more tests pass * Upload all models to the hub * Clean up some code * Improve decoder outputs * Rename intermediate hidden states and reference points * Improve model outputs * Move tests to dedicated folder * Improve model outputs * Fix retain_grad test * Improve docs * Clean up and make test_initialization pass * Improve variable names * Add copied from statements * Improve docs * Fix style * Improve docs * Improve docs, move tests to model folder * Fix rebase * Remove DetrForSegmentation from auto mapping * Apply suggestions from code review * Improve variable names and docstrings * Apply some more suggestions from code review * Apply suggestion from code review * better docs and variables names * hint to num_queries and two_stage confusion * remove asserts and code refactor * add exception if two_stage is True and with_box_refine is False * use f-strings * Improve docs and variable names * Fix code quality * Fix rebase * Add require_torch_gpu decorator * Add pip install ninja to CI jobs * Apply suggestion of @sgugger * Remove DeformableDetrForObjectDetection from auto mapping * Remove DeformableDetrModel from auto mapping * Add model to toctree * Add model back to mappings, skip model in pipeline tests 
* Apply @sgugger's suggestion * Fix imports in the init * Fix copies * Add CPU implementation * Comment out GPU function * Undo previous change * Apply more suggestions * Remove require_torch_gpu annotator * Fix quality * Add logger.info * Fix logger * Fix variable names * Fix initialization * Add missing initialization * Update checkpoint name * Add model to doc tests * Add CPU/GPU equivalence test * Add Deformable DETR to pipeline tests * Skip model for object detection pipeline Co-authored-by: Nicolas Patry <patry.nicolas@protonmail.com> Co-authored-by: Nouamane Tazi <nouamane98@gmail.com> Co-authored-by: Sylvain Gugger <Sylvain.gugger@gmail.com> |
||
![]() |
22f7218560
|
add task_type_id to BERT to support ERNIE-2.0 and ERNIE-3.0 models (#18686)
* add_ernie * remove Tokenizer in ernie * polish code * format code style * polish code * fix style * update doc * make fix-copies * change model name * change model name * fix dependency * add more copied from * rename ErnieLMHeadModel to ErnieForCausalLM do not expose ErnieLayer update doc * fix * make style * polish code * polish code * fix * fix * fix * fix * fix * final fix Co-authored-by: ydshieh <ydshieh@users.noreply.github.com> |
||
![]() |
bb6f6d5338
|
Add X-CLIP (#18852)
* First draft * Improve conversion script * Make vision encoder work * More improvements * Improve conversion script * Fix quality * Add MultiframeIntegrationTransformer * More improvements * Make MiT output work * Fix quality * Add prompts generator * Add tests * Fix some tests * Fix some more tests * Fix more tests * Improve conversion script * Fix model outputs * Fix more tests * Add XClipProcessor * Use processor in conversion script * Fix integration test * Update README, fix docs * Fix all tests * Add MIT output to XClipOutput * Create better variable names * Rename XClip to XCLIP * Extend conversion script * Add support for large models * Add support for 16 frame models * Add another model * Fix module issue * Apply suggestions from code review * Add figure to docs * Fix CLIPProcessor issue * Apply suggestions from code review * Delete file * Convert more checkpoints * Convert last checkpoint * Update nielsr to microsoft |
||
![]() |
9832ac7c73
|
Fix LayoutXLM wrong link in README (#18932)
* fix LayoutXLM wrong link in README * fix LayoutXLM wrong link in index.mdx |
||
![]() |
53e33e6f1b
|
PEGASUS-X (#18551)
* PegasusX Initial commit * rename * pegasus X implementation * pegx update * pegx fix * pegasus-x fixes * pegx updates * cleanup * cleanup * cleanup * tests * stylefixes * Documentation update * Model hub fix * cleanup * update * update * testfix * Check fix * tweaks for merging * style * style * updates for pr * style * change pegasus-x repo |
||
![]() |
f1fd460694
|
Add SegFormer and ViLT links (#18808)
Co-authored-by: Niels Rogge <nielsrogge@Nielss-MacBook-Pro.local> |
||
![]() |
2ab790e82d
|
Add Donut (#18488)
* First draft * Improve script * Update script * Make conversion work * Add final_layer_norm attribute to Swin's config * Add DonutProcessor * Convert more models * Improve feature extractor and convert base models * Fix bug * Improve integration tests * Improve integration tests and add model to README * Add doc test * Add feature extractor to docs * Fix integration tests * Remove register_buffer * Fix toctree and add missing attribute * Add DonutSwin * Make conversion script work * Improve conversion script * Address comment * Fix bug * Fix another bug * Remove deprecated method from docs * Make Swin and Swinv2 untouched * Fix code examples * Fix processor * Update model_type to donut-swin * Add feature extractor tests, add token2json method, improve feature extractor * Fix failing tests, remove integration test * Add do_thumbnail for consistency * Improve code examples * Add code example for document parsing * Add DonutSwin to MODEL_NAMES_MAPPING * Add model to appropriate place in toctree * Update namespace to appropriate organization Co-authored-by: Niels Rogge <nielsrogge@Nielss-MacBook-Pro.local> |
||
![]() |
8d1f9039d0
|
Just re-reading the whole doc every couple of months 😬 (#18489)
* Delete valohai.yaml * NLP => ML * typo * website supports https * datasets * 60k + modalities * unrelated link fixing for accelerate * Ok those links were actually broken * Fix link * Make `AutoTokenizer` auto-link * wording tweak * add at least one non-nlp task |
||
![]() |
f9a0008d2d
|
Add VideoMAE (#17821)
* First draft * Add VideoMAEForVideoClassification * Improve conversion script * Add VideoMAEForPreTraining * Add VideoMAEFeatureExtractor * Improve VideoMAEFeatureExtractor * Improve docs * Add first draft of model tests * Improve VideoMAEForPreTraining * Fix base_model_prefix * Make model take pixel_values of shape (B, T, C, H, W) * Add loss computation of VideoMAEForPreTraining * Improve tests * Improve model tests * Make all tests pass * Add VideoMAE to main README * Add tests for VideoMAEFeatureExtractor * Add integration test * Improve conversion script * Rename patch embedding class * Remove VideoMAELayer from init * Update design of patch embeddings * Improve comments * Improve conversion script * Improve conversion script * Add conversion of pretrained model * Add loss verification of pretrained model * Add loss verification of unnormalized targets * Add integration test for pretraining model * Apply suggestions from code review * Fix bug to make feature extractor resize only shorter edge * Address more comments * Improve normalization of videos * Add doc examples * Move constants to dedicated script * Remove scripts * Transfer checkpoints, fix docs * Update script * Update image mean and std * Fix doc tests * Set return_tensors to NumPy by default * Revert the previous change Co-authored-by: Niels Rogge <nielsrogge@Nielss-MacBook-Pro.local> |
||
![]() |
e87ac9d18b
|
Add swin transformer v2 (#17469)
* Add files generated using transformer-cli add-new-model-like command * Add changes for swinv2 attention and forward method * Add fixes * Add modifications for weight conversion and remaining args in swin model * Add changes for patchmerging * Add changes for SwinV2selfattention * Update conversion script * Add final fixes for the swin_v2 model * Add changes for conversion script for pretrained window size case * Add pretrained window size value from config in SwinV2Encoder class * Make fixup * Add swinv2 to models_not_in_readme to utils/check_copies.py * Modify Swinv2v2 to Swin Transformer V2 * Remove copied from, to run make fixup command * Add updates to swinv2tf from main branch * Add pretrained_window_size to config, to make tests pass * Add modified weights from nandwalritik profile for swinv2 * Update model weights from swinv2 from nandwalritik profile * Add fix for build_pr_documentation CI fix * Add fixes for weight conversion * Add change to make input with padding work * Add fixes for test cases * Add few changes from swin to swinv2 to pass test cases * Remove tests for tensorflow as swinv2 for TF is not added yet * Override test_pt_tf_model_equivalence function as TF implementation for swinv2 is not added yet * Add modeling_tf_swinv2 to _ignore_modules as test file is removed for this one right now. 
* Update docs url for swinv2 in README.md Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Undo changes for check_repo * Update url in readme.md * Remove overrided function to test pt_tf_model_equivalence * Remove TF model imports for Swinv2 as its not implemented in this PR * Add changes for index.mdx * Add swinv2 papers link,abstract and contributors details * Rename cpb_mlp to continous_position_bias_mlp * Add tips for swinv2 model * Update src/transformers/models/swinv2/configuration_swinv2.py Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Update src/transformers/models/swinv2/configuration_swinv2.py Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Fix indentation for docstring example in src/transformers/models/swinv2/configuration_swinv2.py Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Update import order in src/transformers/models/swinv2/configuration_swinv2.py Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Add copyright statements in weights conversion script. Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Remove Swinv2 from models_not_in_readme * Reformat code * Remove TF implementation file for swinv2 * Update start docstring. Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Add changes for docstring * Update orgname for weights to microsoft * Remove to_2tuple function * Add copied from statements wherever applicable * Add copied from to Swinv2ForMaskedImageModelling class * Reformat code. Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Add unittest.skip(with reason.) for test_inputs_embeds test case. 
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Add updates for test_modeling_swinv2.py * Add @unittest.skip() annotation for clarity to create_and_test_config_common_properties function * Add continuous_position_bias_mlp parameter to conversion script * Add test for testing masked_image_modelling for swinv2 * Update Swinv2 to Swin Transformer v2 in docs/source/en/model_doc/swinv2.mdx Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Update Swinv2 to Swin Transformer v2 in docs/source/en/model_doc/swinv2.mdx Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Update docs/source/en/model_doc/swinv2.mdx Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Update docs/source/en/model_doc/swinv2.mdx Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Add suggested changes * Add copied from to forward methods of Swinv2Stage and Swinv2Encoder * Add push_to_hub flag to weight conversion script * Change order or Swinv2DropPath class * Add id2label mapping for imagenet 21k * Add updated url for SwinV2 functions and classes used in implementation * Update input_feature dimensions format, mentioned in comments. Co-authored-by: Alara Dirik <8944735+alaradirik@users.noreply.github.com> * Add suggested changes for modeling_swin2.py * Update docs * Remove create_and_test_config_common_properties function, as test_model_common_attributes is sufficient. * Fix indentation. Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Add changes for making Nit objects in code style * Add suggested changes * Add suggested changes for test_modelling_swinv2 * make fix-copies * Update docs/source/en/model_doc/swinv2.mdx Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> Co-authored-by: Alara Dirik <8944735+alaradirik@users.noreply.github.com> Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> |
||
![]() |
c89a592e87 | Dev version | ||
![]() |
12d66b4701
|
Add OWL-ViT model for zero-shot object detection (#17938)
* add owlvit model skeleton * add class and box predictor heads * convert modified flax clip to pytorch * fix box and class predictors * add OwlViTImageTextEmbedder * convert class and box head checkpoints * convert image text embedder checkpoints * add object detection head * fix bugs * update conversion script * update conversion script * fix q,v,k,out weight conversion conversion * add owlvit object detection output * fix bug in image embedder * fix bugs in text embedder * fix positional embeddings * fix bug in inference mode vision pooling * update docs, init tokenizer and processor files * support batch processing * add OwlViTProcessor * remove merge conflicts * readd owlvit imports * fix bug in OwlViTProcessor imports * fix bugs in processor * update docs * fix bugs in processor * update owlvit docs * add OwlViTFeatureExtractor * style changes, add postprocess method to feature extractor * add feature extractor and processor tests * add object detection tests * update conversion script * update config paths * update config paths * fix configuration paths and bugs * fix bugs in OwlViT tests * add import checks to processor * fix docs and minor issues * fix docs and minor issues * fix bugs and issues * fix bugs and issues * fix bugs and issues * fix bugs and issues * update docs and examples * fix bugs and issues * update conversion script, fix positional embeddings * process 2D input ids, update tests * fix style and quality issues * update docs * update docs and imports * update OWL-ViT index.md * fix bug in OwlViT feature ext tests * fix code examples, return_dict by default * return_dict by default * minor fixes, add tests to processor * small fixes * add output_attentions arg to main model * fix bugs * remove output_hidden_states arg from main model * update self.config variables * add option to return last_hidden_states * fix bug in config variables * fix copied from statements * fix small issues and bugs * fix bugs * fix bugs, support greyscale images * 
run fixup * update repo name * merge OwlViTImageTextEmbedder with obj detection head * fix merge conflict * fix merge conflict * make fixup * fix bugs * fix bugs * add additional processor test |
||
![]() |
9f12ec7d87
|
Typo in readme (#18195) | ||
![]() |
e630dad555
|
Add vision example to README (#18194) | ||
![]() |
c1c79b0655
|
NLLB tokenizer (#18126)
* NLLB tokenizer * Apply suggestions from code review - Thanks Stefan! Co-authored-by: Stefan Schweter <stefan@schweter.it> * Final touches * Style :) * Update docs/source/en/model_doc/nllb.mdx Co-authored-by: Stefan Schweter <stefan@schweter.it> * Apply suggestions from code review Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * PR reviews * Auto models Co-authored-by: Stefan Schweter <stefan@schweter.it> Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> |
||
![]() |
fbc7598bab
|
add MobileViT model (#17354)
* add MobileViT * fixup * Update README.md Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * remove empty line Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * use clearer variable names * rename to MobileViTTransformerLayer * no longer inherit from nn.Sequential * fixup * fixup * not sure why this got added twice * rename organization for checkpoints * fix it up * Update src/transformers/models/mobilevit/__init__.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update src/transformers/models/mobilevit/configuration_mobilevit.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update src/transformers/models/mobilevit/configuration_mobilevit.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update src/transformers/models/mobilevit/configuration_mobilevit.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update tests/models/mobilevit/test_modeling_mobilevit.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update src/transformers/models/mobilevit/modeling_mobilevit.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update src/transformers/models/mobilevit/modeling_mobilevit.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update src/transformers/models/mobilevit/modeling_mobilevit.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update src/transformers/models/mobilevit/modeling_mobilevit.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * code style improvements * fixup * Update docs/source/en/model_doc/mobilevit.mdx Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Update docs/source/en/model_doc/mobilevit.mdx Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Update 
src/transformers/models/mobilevit/configuration_mobilevit.py Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Update src/transformers/models/mobilevit/configuration_mobilevit.py Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * download labels from hub * rename layers * rename more layers * don't compute loss in separate function * remove some nn.Sequential * replace nn.Sequential with new MobileViTTransformer class * replace nn.Sequential with MobileViTMobileNetLayer * fix pruning since model structure changed * fixup * fix doc comment * remove custom resize from feature extractor * fix ONNX import * add to doc tests * use center_crop from image_utils * move RGB->BGR flipping into image_utils * fix broken tests * wrong type hint * small tweaks Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> |
||
![]() |
3cff4cc587
|
Add MVP model (#17787)
* Add MVP model * Update README * Remove useless module * Update docs * Fix bugs in tokenizer * Remove useless test * Remove useless module * Update vocab * Remove specifying * Remove specifying * Add #Copied ... statement * Update paper link * Remove useless TFMvp * Add #Copied ... statement * Fix style in test mvp model * Fix some typos * Fix properties of unset special tokens in non verbose mode * Update paper link * Update MVP doc * Update MVP doc * Fix README * Fix typos in docs * Update docs |
||
![]() |
6c8f4c9a93
|
Adding GroupViT Models (#17313)
* add group vit and fixed test (except slow) * passing slow test * addressed some comments * fixed test * fixed style * fixed copy * fixed segmentation output * fixed test * fixed relative path * fixed copy * add ignore non auto configured * fixed docstring, add doc * fixed copies * Apply suggestions from code review merge suggestions Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * resolve comment, renaming model * delete unused attr * use fix copies * resolve comments * fixed attn * remove unused vars * refactor tests * resolve final comments * add demo notebook * fixed inconsistent default * Apply suggestions from code review Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Apply suggestions from code review Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * rename stage->stages * Create single GroupViTEncoderLayer class * Update conversion script * Simplify conversion script * Remove cross-attention class in favor of GroupViTAttention * Convert other model as well, add processor to conversion script * addressing final comment * fixed args * Update src/transformers/models/groupvit/modeling_groupvit.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> Co-authored-by: Niels Rogge <nielsrogge@Nielss-MacBook-Pro.local> |
||
![]() |
d6b6fb9963
|
Add CodeGen model (#17443)
* Add CodeGen model * Add missing key and switch order of super() * Fix torch.ones init with uint8 instead of bool * Address comments: copy statements and doc * update tests * remove old model parallel * fix batch gen tests * fix batch gen test * update test_gpt2_sample_max_time * fix codegen test and revert gpt2 test change * Fix incorrect tie_word_embedding value, typo, URL * Fix model order in README and styling * Reorder model list alphabetically * Set tie_word_embedding to False by default * Apply suggestions from code review * Better attn mask name & remove attn masked_bias * add tokenizer for codegen * quality * doc tokenizer * fix-copies * add CodeGenTokenizer in converter * make truncation optional * add test for truncation * add copyright * fix-copies * fix fast tokenizer decode * Update src/transformers/models/codegen/tokenization_codegen.py Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com> * increase vocab_size in tests Co-authored-by: patil-suraj <surajp815@gmail.com> Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com> |
||
![]() |
7cf52a49de
|
Nezha Pytorch implementation (#17776)
* wip * rebase * all tests pass * rebase * ready for PR * address comments * fix styles * add require_torch to pipeline test * remove remote image to improve CI consistency * address comments; fix tf/flax tests * address comments; fix tf/flax tests * fix tests; add alias * repo consistency tests * Update src/transformers/pipelines/visual_question_answering.py Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * address comments * Update src/transformers/pipelines/visual_question_answering.py Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * merge * wip * wip * wip * most basic tests passes * all tests pass now * relative embedding * wip * running make fixup * remove bert changes * fix doc * fix doc * fix issues * fix doc * address comments * fix CI * remove redundant copied from * address comments * fix broken test Co-authored-by: Sijun He <sijunhe@Sijuns-MacBook-Pro.local> Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> |
||
![]() |
8fcbe275c3
|
Add UL2 (just docs) (#17740)
* Add UL2 Co-authored-by: Daniel Hesslow <Daniel.Hesslow@gmail.com> * Correct naming * sort better * up * apply sylvains suggestion |
||
![]() |
7c6ec195ad | v4.21.0.dev0 | ||
![]() |
7f14839f55
|
[Wav2Vec2Conformer] Official release (#17709)
* [Wav2Vec2Conformer] Official release * remove from not-in-readme |
||
![]() |
a72f1c9f5b
|
Add LongT5 model (#16792)
* Initial commit * Make some fixes * Make PT model full forward pass * Drop TF & Flax implementation, fix copies etc * Add Flax model and update some corresponding stuff * Drop some TF things * Update config and flax local attn * Add encoder_attention_type to config * . * Update docs * Do some cleansing * Fix some issues -> make style; add some docs * Fix position_bias + mask addition + Update tests * Fix repo consistency * Fix model consistency by removing flax operation over attn_mask * [WIP] Add PT TGlobal LongT5 * . * [WIP] Add flax tglobal model * [WIP] Update flax model to use the right attention type in the encoder * Fix flax tglobal model forward pass * Make the use of global_relative_attention_bias * Add test suites for TGlobal model * Fix minor bugs, clean code * Fix pt-flax equivalence though not convinced with correctness * Fix LocalAttn implementation to match the original impl. + update READMEs * Few updates * Update: [Flax] improve large model init and loading #16148 * Add ckpt conversion script accoring to #16853 + handle torch device placement * Minor updates to conversion script. * Typo: AutoModelForSeq2SeqLM -> FlaxAutoModelForSeq2SeqLM * gpu support + dtype fix * Apply some suggestions from code review Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com> * * Remove (de)parallelize stuff * Edit shape comments * Update README.md * make fix-copies * Remove caching logic for local & tglobal attention * Apply another batch of suggestions from code review * Add missing checkpoints * Format converting scripts * Drop (de)parallelize links from longT5 mdx * Fix converting script + revert config file change * Revert "Remove caching logic for local & tglobal attention" This reverts commit 2a619828f6ddc3e65bd9bb1725a12b77fa883a46. 
* Stash caching logic in Flax model * Make side relative bias used always * Drop caching logic in PT model * Return side bias as it was * Drop all remaining model parallel logic * Remove clamp statements * Move test files to the proper place * Update docs with new version of hf-doc-builder * Fix test imports * Make some minor improvements * Add missing checkpoints to docs * Make TGlobal model compatible with torch.onnx.export * Replace some np.ndarray with jnp.ndarray * Fix TGlobal for ONNX conversion + update docs * fix _make_global_fixed_block_ids and masked neg value * update flax model * style and quality * fix imports * remove load_tf_weights_in_longt5 from init and fix copies * add slow test for TGlobal model * typo fix * Drop obsolete is_parallelizable and one warning * Update __init__ files to fix repo-consistency * fix pipeline test * Fix some device placements * [wip]: Update tests -- need to generate summaries to update expected_summary * Fix quality * Update LongT5 model card * Update (slow) summarization tests * make style * rename checkpoitns * finish * fix flax tests Co-authored-by: phungvanduy <pvduy23@gmail.com> Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com> Co-authored-by: patil-suraj <surajp815@gmail.com> |
||
![]() |
ca2a55e9df
|
BLOOM (#17474)
* adding template * update model * model update * update conf for debug model * update conversion * update conversion script * update conversion script * fix missing keys check * add tests to test the tokenizer in the local machine * Change variable name * add tests on xnli dataset * add more description * add descriptions + clearer code * clearer code * adding new tests + skipping few tests because of env problems * change comment * add dtype on the configuration * add test embeddings * add hardcoded test * fix dtype issue * adding torch.float16 to config * adding more metrics (min, max, mean) * add sum * now the test passes with almost equal * add files for conversion - test passes on cpu gpu * add final changes * cleaning code * add new args in the docstring * fix one liner function * remove macros * remove forward attention * clean up init funtion * add comments on the issue * rm scale mask softmax * do make style * fix dtype in init * fixing for loop on att probs * fix style with black * fix style + doc error * fix and debug CI errors (docs + style) * some updates - change new operations - finally add scaled softmax - added new args in the config * make use cache working * add changes - save sharded models - final changes on the modeling script * add changes - comment on alibi - add TODO on seq length * test commit - added a text to test the commit Co-authored-by: thomasw21 <24695242+thomasw21@users.noreply.github.com> * final changes - attention mask change - generation works on BS176b Co-authored-by: thomasw21 <24695242+thomasw21@users.noreply.github.com> * changes - model + conversion * move to correct dir * put , * fex fixes * fix tokenizer autodoc * fix minor CI issues * fix minor CI issues * fix minor CI issues * fix style issue * fix minor import issues * fix few issues * remove def main on the test * add require torch * replace decorator with 'with' * fix style * change to bloom * add quick fix tokenizer * fix tokenizer file * fix tokenizer - merge 
tests - small fixes * fix import issue * add bloom to readme * fix consistency * Update docs/source/en/model_doc/bloom.mdx Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Apply suggestions from code review fix comment issues on file headers Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * fix doc issue * small fix - modeling test * some changes - refactor some code - taking into account reviews - more tests should pass - removed pruning tests * remove useless division * more tests should pass * more tests should pass * more tests should pass * let's try this one -add alibi offset - remove all permutes to make the grad operations work - finger crossed * refactor - refactor code - style changes - add new threshold for test * major changes - change BLOOM to Bloom - add quick doc on bloom.mdx - move embeddings test on modeling test * modify readme * small fixes * small fix - better threshold for a test * remove old test file from fetcher * fix small typo * major change - change BloomLMHead to BloomForCausalLM * remove onnx config * major changes - refactor the code - remove asserts - change tol for test * make style * small change * adding a slow test + commenting old ones for now * make style * Apply suggestions from code review Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * make style * fix duplicates * cleaning comments on config * clean a bit conversion file * refacor a bit modeling file * refactor tokenizer file * fix tokenization test issue * fix tokenization issue #2 * fix tokenization issue second try * fix test issue * make style + add suggestions * change test fetcher * try this one - slow tests should pass - finger crossed * possible final changes * make style * try fix padding side issue * fix side * fix padding issue * fix ko-readme * fix config auto * cleaning modeling file * keep bloom in caps in ko * update config docs * remove pretraining_pp * remove model parallel * 
update config - add correct config files * fix duplicates * fix fetcher * fix refactor issue - remove divide function * try to remove alibi * small fixes - fix alibi - remove seq length - refactor a bit the code * put correct values - fix bos and eos token ids * fix attention mask loop Co-authored-by: thomasw21 <24695242+thomasw21@users.noreply.github.com> * small fixes: - remove skip bias add * small fixes - fix typo in readme - fix typos in config * small changes - remove a test - add reconstruction test - change config * small changes - change Scaled Softmax to BloomScaledSoftmax * small fixes - fix alibi dtype * major changes - removing explicit dtype when loading modules - fixing test args (torch_dtype=auto) - add dosctring * fix readmes * major changes - now bloom supports alibi shifting - refactor a bit the code - better test tolerance now * refactor a bit * refactor a bit * put correct name on test * change docstring * small changes - fix docstring modeling - fix test tolerance * fix small nit - take dtype from tensors in the conversion script * minor fix - fix mdx issue * minor fix - change config docstring * forward contrib credits from PR14084 * Apply suggestions from code review Co-authored-by: Stas Bekman <stas00@users.noreply.github.com> * apply modifications Co-authored-by: Stas Bekman <stas00@users.noreply.github.com> * resolve softmax upcast * Apply suggestions from code review Co-authored-by: Stas Bekman <stas00@users.noreply.github.com> * Update src/transformers/models/bloom/modeling_bloom.py Co-authored-by: Niklas Muennighoff <n.muennighoff@gmail.com> * final changes modeling Co-authored-by: Stas Bekman <stas00@users.noreply.github.com> * Merge commit 'd156898f3b9b2c990e5963f5030a7143d57921a2' * merge commit * Apply suggestions from code review Co-authored-by: Stas Bekman <stas00@users.noreply.github.com> * apply suggestions Apply suggestions from Stas comments Co-authored-by: Stas Bekman <stas00@users.noreply.github.com> * Fix gradient 
checkpointing Co-authored-by: Stas Bekman <stas00@users.noreply.github.com> * add slow but exact * add accelerate compatibility Co-authored-by: Nicolas Patry <Narsil@users.noreply.github.com> * forward contrib credits Co-authored-by: thomasw21 <thomasw21@users.noreply.github.com> Co-authored-by: sgugger <sgugger@users.noreply.github.com> Co-authored-by: patrickvonplaten <patrickvonplaten@users.noreply.github.com> Co-authored-by: Niklas Muennighoff <n.muennighoff@gmail.com> Co-authored-by: LysandreJik <LysandreJik@users.noreply.github.com> * Apply suggestions from code review Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com> * fix torch device on tests * make style * Apply suggestions from code review Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com> * fix nits Co-authored-by: patrickvonplaten<patrickvonplaten@users.noreply.github.com> * remove final nits * fix doc - add more details on the doc - add links to checkpoints * Update src/transformers/__init__.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update src/transformers/models/bloom/modeling_bloom.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * apply suggestions Co-authored-by: sgugger <sgugger@users.noreply.github.com> * put test torchscript to false * Update src/transformers/models/bloom/modeling_bloom.py Co-authored-by: justheuristic <justheuristic@gmail.com> * fix alibi - create alibi only once * add small doc * make quality * replace torch.nn * remove token type emb * fix fused op + output bias * add fused op - now can control fused operation from config * remove fused op * make quality * small changes - remove unsed args on config - removed bias gelu file - make the model torchscriptable - add torchscript slow tests * Update src/transformers/models/bloom/modeling_bloom.py * fix slow * make style * add accelerate support * add bloom to deepspeed tests * minor changes * Apply suggestions from code review 
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com> * minor change * slow tests pass * Apply suggestions from code review Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update docs/source/en/model_doc/bloom.mdx Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * minor changes: - change docstring - add link to paper Co-authored-by: Thomwolf <thomwolf@gmail.com> Co-authored-by: Thomas Wolf <thomas@huggingface.co> Co-authored-by: thomasw21 <24695242+thomasw21@users.noreply.github.com> Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> Co-authored-by: sIncerass <sheng.s@berkeley.edu> Co-authored-by: Stas Bekman <stas00@users.noreply.github.com> Co-authored-by: Niklas Muennighoff <n.muennighoff@gmail.com> Co-authored-by: Nicolas Patry <Narsil@users.noreply.github.com> Co-authored-by: thomasw21 <thomasw21@users.noreply.github.com> Co-authored-by: sgugger <sgugger@users.noreply.github.com> Co-authored-by: patrickvonplaten <patrickvonplaten@users.noreply.github.com> Co-authored-by: LysandreJik <LysandreJik@users.noreply.github.com> Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com> Co-authored-by: justheuristic <justheuristic@gmail.com> Co-authored-by: Stas Bekman <stas@stason.org> |
||
![]() |
119e3c0fc8
|
M-CTC-T Model (#16402)
* added cbs to notebooks, made copy-paste error fix in generation_utils * initial push for mctc model * mctc feature extractor done * added processor, tokenizer and their tests for MCTC. Have added an MCTC modeling test, adjusting model code accordingly. * added processor, tokenizer and their tests for MCTC. Have added an MCTC modeling test, adjusting model code accordingly. * passing attention, now struggling to figure out how attention masks make sense here * works when excluding attention masks. ask later how one would integrate attention masks here * bizarre configuration error (model prefix comes first in config dict json and messes up the order) * all passing but bizarre config dict ordering issue when to_dict * passing all major tests * feature extraction, processor, tokenizer added & tests passing * style & consistency & other logistical fixes * copy paste fix * model after feature extraction working * committing final feature extraction results; need to fix normalization * feature extraction passing tests; probably should add tests on the specific flashlight-copied functions? * delete print; format code a bit * fixing tests * passing major tests * fixing styles * completed tokenization test with real example; not sure if these values are entirely correct. 
* last test fixes from local * reverting accidentally included custom setup configs * remove load tf weights; fix config error * testing couldnt import featureextractor * fix docs * fix docs * resolving comments * style fixes * style fixes * Update to MCTCConv1dSubSampler Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com> * relposemb fixes * conv1d name issue; expecting config fail with paraentheses * fix config issue * fix config issue * fix config issue * change everything to MCTCT * fixing naming change errors * archive list * copyrights and docs * copyrights and docs * copyrights and docs * merge resolution * move tests, fix to changed optionaldependency structure * test directories changed * fixing tests * how to avoid tf tests? * how to avoid tf tests? * tests passing locally * allow mctctprocessor imported any env * allow mctctprocessor imported any env * fixed second round of feedback, need to fix docs * doc changes not being applied * all fixed * style fix * feedback fixes * fix copies and feature extraction style fix * Update tests/models/visual_bert/test_modeling_visual_bert.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * copy paste huggingface:main visual bert * added eof newline to visual bert; all tests are passing otherwise * fix slow tests by adding attention mask * change model id to speechbrain * make fix-copies * fix readme unwanted deletes * fixing readmes, make fix-copies * consistent M-CTC-T naming * Update src/transformers/models/mctct/__init__.py Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com> * all fixed but variable naming * adjust double quotes * fixed variable names * copyright and mr quilter * Apply suggestions from code review Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * correct slow tests * make fix-copies * Update src/transformers/models/mctct/configuration_mctct.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * 
Update src/transformers/models/mctct/configuration_mctct.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * m-ctc-t not mctct Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com> Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> |
||
![]() |
048dd73bba
|
Check list of models in the main README and sort it (#17517)
* Script for README * Fix copies * Complete error message |
||
![]() |
84aaadd8c5
|
Adding LeViT Model by Facebook (#17466)
* levit files * levit tests * weights script * weights script * update * style fixes * few minor corrections * Added teacher model * edit docs * fix-copies * style fixes * pr error resolved * Update README.md Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Update docs/source/en/index.mdx Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Update docs/source/en/model_doc/levit.mdx Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Update docs/source/en/model_doc/levit.mdx Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Update docs/source/en/model_doc/levit.mdx Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Update docs/source/en/model_doc/levit.mdx Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Update src/transformers/__init__.py Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Update src/transformers/models/levit/__init__.py Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Update src/transformers/models/levit/configuration_levit.py Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Update src/transformers/models/levit/configuration_levit.py Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Update src/transformers/models/levit/feature_extraction_levit.py Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * suggested pr changes * style fixes * minor bug * update * minor doc edit * style * Update src/transformers/models/levit/feature_extraction_levit.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update src/transformers/models/levit/feature_extraction_levit.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update tests/models/levit/test_modeling_levit.py Co-authored-by: Sylvain Gugger 
<35901082+sgugger@users.noreply.github.com> * Update src/transformers/models/levit/modeling_levit.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update src/transformers/models/levit/feature_extraction_levit.py Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * residual layer readable * style * Update docs/source/en/model_doc/levit.mdx Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Update src/transformers/models/levit/feature_extraction_levit.py Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Update src/transformers/models/levit/feature_extraction_levit.py Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Update src/transformers/models/levit/feature_extraction_levit.py Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Update src/transformers/models/levit/feature_extraction_levit.py Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Update src/transformers/models/levit/modeling_levit.py Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Update src/transformers/models/levit/modeling_levit.py Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Update src/transformers/models/levit/modeling_levit.py Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Update tests/models/levit/test_feature_extraction_levit.py Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * change checkpoints and style * update * minor changes * Update src/transformers/models/levit/modeling_levit.py Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Update src/transformers/models/levit/modeling_levit.py Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> Co-authored-by: Sylvain Gugger 
<35901082+sgugger@users.noreply.github.com> |
||
![]() |
71e602725b
|
[WIP] Adding GPT-NeoX-20B (#16659)
* initial * first try * working 20B * 20B tokenizers * Docs * Import fixes for missing classes * Update docs, fixup * black formatting * isort * flake * dummy objects * documentation * Documentation yml * more docs * tweaks for tests * tokenization auto * fix neox tests * test * test * einsum * address PR feedback * Documentation * Update README.md Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update src/transformers/models/gpt_neox/__init__.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update src/transformers/models/gpt_neox/configuration_gpt_neox.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Apply suggestions from code review Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Remove undefined LaTeX syntax * Update to full url to avoid confusion about if that's supposed to refer to the Hub * fix auto * move tests * documentation fix * more doc fixes * test refactor * fix import * fix import * fix import * fix import * fix import * style fixes * More modeling fixes Co-authored-by: Jason Phang <zp489@gr057.hpc.nyu.edu> Co-authored-by: Stella Biderman <stellabiderman@gmail.com> Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> |
||
![]() |
31ee80d556
|
Add LayoutLMv3 (#17060)
* Make forward pass work * More improvements * Remove unused imports * Remove timm dependency * Improve loss calculation of token classifier * Fix most tests * Add docs * Add model integration test * Make all tests pass * Add LayoutLMv3FeatureExtractor * Improve integration test + make fixup * Add example script * Fix style * Add LayoutLMv3Processor * Fix style * Add option to add visual labels * Make more tokenizer tests pass * Fix more tests * Make more tests pass * Fix bug and improve docs * Fix import of processors * Improve docstrings * Fix toctree and improve docs * Fix auto tokenizer * Move tests to model folder * Move tests to model folder * change default behavior add_prefix_space * add prefix space for fast * add_prefix_space set to True for Fast * no space before `unique_no_split` token * add test to highlight special treatment of added tokens * fix `test_batch_encode_dynamic_overflowing` by building a long enough example * fix `test_full_tokenizer` with add_prefix_token * Fix tokenizer integration test * Make the code more readable * Add tests for LayoutLMv3Processor * Fix style * Add model to README and update init * Apply suggestions from code review * Replace asserts by value errors * Add suggestion by @ducviet00 * Add model to doc tests * Simplify script * Improve README * a step ahead to fix * Update pair_input_test * Make all tokenizer tests pass - phew * Make style * Add LayoutLMv3 to CI job * Fix auto mapping * Fix CI job name * Make all processor tests pass * Make tests of LayoutLMv2 and LayoutXLM consistent * Add copied from statements to fast tokenizer * Add copied from statements to slow tokenizer * Remove add_visual_labels attribute * Fix tests * Add link to notebooks * Improve docs of LayoutLMv3Processor * Fix reference to section Co-authored-by: SaulLu <lucilesaul.com@gmail.com> Co-authored-by: Niels Rogge <nielsrogge@Nielss-MacBook-Pro.local> |
||
![]() |
c86aad6110
|
Fix cvt docstrings (#17367) | ||
![]() |
adc0ff2502
|
Add CvT (#17299)
* Adding cvt files * Adding cvt files * changes in init file * Adding cvt files * changes in init file * Style fixes * Address comments from code review * Apply suggestions from code review Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Format lists in docstring * Fix copies * Apply suggestion from code review Co-authored-by: AnugunjNaman <anugunjjha@gmail.com> Co-authored-by: Ayushman Singh <singhayushman13@protonmail.com> Co-authored-by: Niels Rogge <nielsrogge@Nielss-MacBook-Pro.local> Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> |
||
![]() |
d6b8e9cec7
|
Add trajectory transformer (#17141)
* Add trajectory transformer * Fix model init * Fix end of lines for .mdx files * Add trajectory transformer model to toctree * Add forward input docs * Fix docs, remove prints, simplify prediction test * Apply suggestions from code review Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Apply suggestions from code review Co-authored-by: Lysandre Debut <lysandre@huggingface.co> Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update docs, more descriptive comments * Apply suggestions from code review Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update readme * Small comment update and add conversion script * Rebase and reformat * Fix copies * Fix rebase, remove duplicates * Fix rebase, remove duplicates * Remove tapex * Remove tapex * Remove tapex |
||
![]() |
b971c769e8
|
Add OPT (#17088)
* First version - OPT model * Final changes - putting use cache to False * few changes - remove commented block * few changes - remove unecessary files * fix style issues * few changes - remove a test file - added the logits test * Update src/transformers/models/auto/tokenization_auto.py Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com> * add gen tests * few changes - rm mask filling example on docstring * few changes - remove useless args * some changes - more tests should pass now - needs to clean more - documentation still needs to be done * fix code quality * major changes - change attention architecture to BART-like - modify some tests - style fix * rm useless classes - remove opt for: - QA - cond generation - seq classif * Removed autodoc calls to non-existant classes TOkenizers are not implemented * Update src/transformers/__init__.py Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> * Update src/transformers/__init__.py Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> * Update src/transformers/models/auto/modeling_tf_auto.py Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> * Replaced OPTTokeniser with GPT2 tokenizer * added GPT2Tokenizer.from_pretrained("patrickvonplaten/opt_gpt2_tokenizer") * Removed OPTTokenizer * make style * Make style replaces ``` ...).unsqueeze(``` by ``` >>>).unsqueeze(``` * make repo consistency * Removed PretrainedOPTModel * fix opt.mdx removed other heads * fix init, removed 3 heads * removed heads * finished cleaning head * removed seauence classif and question answering * removed unused imports * removed useless dummy object for QA, SC and CG * removed tests for removed useless dummy object for QA, SC and CG * Removed head_mask using encoder layers which don't exist * fixed test * fix line * added OPT to toctree * Updated model path with pushed weigths * fix model path * fixed code quality * fixed embeddings and generation tests * update paths * 
clean comments * removed OPTClassificationHead for sentence classification * renamed hidden layer * renamed num layers to standard num_hidden_layers * num_attention_heads fix * changes for 125m * add first version for 125m * add first version - flax * add new version * causal LM output * replace output type with BaseModelOutputWithPastAndCrossAttentions * revert working config from 150m to 350m * clean * removed decoder input ids * fixed embed dim * more embed_dim issues * make style + removed enc_dec test * update falx model * removed troublesome copy * added is_encoder_decoder=False to config * added set_input emb fuinction to model class * requires torch on embed test * use head mask instead of decoder head mask input param solves a test * 8 test remaining, update * Updated create_and_check_decoder_model_past_large_inputs * Make style * update op tokenizer with condition * make style * See if I can push * some clean up * remove linear head hack * save intermediate * save correct attention * add copied from from bart * Update src/transformers/models/opt/modeling_opt.py Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com> * fix part of the reviewss Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com> * same changes in naming / conversion * correct mask * more fixes * delete FlaxOPT and TfOPT * clean traces of Flax and Tf * fix mask * fixed positionnal embedding length when past key value is provoded * get 125m, 6.7b to work * Added do_layer_norm * solved mismatch in load dictionnary * clean up preapre opt input dict * fixed past key value as bool * fix previus * fixed return dict False tuple issue * All tests are passing * Make style * Ignore OPTDecoder non tested * make fix-copies * make repo consistency * small fix * removed uselss @torch.no_grad decorator * make styl;e * fix previous opt test * style * make style * added opt documentation * update OPT_PRETRAINED_MODEL_ARCHIVE_LIST * up * more fixes * model & config work * Update 
src/transformers/models/opt/modeling_opt.py Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com> * Update src/transformers/models/opt/modeling_opt.py Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com> * Update src/transformers/models/opt/modeling_opt.py Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com> * added comment on padding hack (+2) * cleaup * review update * docstring for missing arg * Update docs/source/en/model_doc/opt.mdx Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com> * Update docs/source/en/model_doc/opt.mdx Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com> * Update docs/source/en/model_doc/opt.mdx Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com> * Update src/transformers/models/opt/__init__.py Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com> * update pretrained map * update path and tests * make style * styling * make consistency * add gpt2 tok new * more tok fixes * Update src/transformers/models/auto/tokenization_auto.py * Update docs/source/en/model_doc/opt.mdx Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update docs/source/en/model_doc/opt.mdx Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update docs/source/en/model_doc/opt.mdx Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update src/transformers/models/opt/modeling_opt.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update tests/models/opt/test_modeling_opt.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update src/transformers/models/opt/modeling_opt.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update src/transformers/models/opt/modeling_opt.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update src/transformers/models/opt/modeling_opt.py Co-authored-by: Sylvain Gugger 
<35901082+sgugger@users.noreply.github.com> * Update src/transformers/models/opt/modeling_opt.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update src/transformers/models/opt/modeling_opt.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update based on reviews * Apply suggestions from code review Co-authored-by: Lysandre Debut <lysandre@huggingface.co> * make style * make tokenizer auto tests pass * apply Lysandre suggestion * finish tests * add some good tokenizer tests * improve docs slighly Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com> Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> Co-authored-by: ArthurZucker <arthur.zucker@gmail.com> Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> Co-authored-by: Lysandre Debut <lysandre@huggingface.co> |
||
![]() |
a10f61834d
|
[feat] Add FLAVA model (#16654)
* [WIP] Add FLAVA model This PR aims to add the [FLAVA](https://arxiv.org/abs/2112.04482) model to the transformers repo. The following checklist delineates the things to be done for this PR to be complete: [x] Flava init [x] Flava base models [x] Flava layers [x] Flava Configs [x] Flava encoders [x] Flava pretraining models [ ] Flava classification/retrieval models (To be added in a separate PR) [x] Documentation updates [x] Imports updates [x] Argstring updates [x] Flava pretrained checkpoints [x] Flava tests [x] Flava processors [x] Sanity check [x] Lint |
||
![]() |
1ac698744c
|
Add YOLOS (#16848)
* First draft * Add YolosForObjectDetection * Make forward pass work * Add mid position embeddings * Add interpolation of position encodings * Add expected values * Add YOLOS to tests * Add integration test * Support tiny model as well * Support all models in conversion script * Remove mid_pe_size attribute * Make more tests pass * Add model to README and fix config * Add copied from statements * Rename base_model_prefix to vit * Add missing YOLOS_PRETRAINED_CONFIG_ARCHIVE_MAP * Apply suggestions from code review * Apply more suggestions from code review * Convert remaining checkpoints * Improve docstrings * Add YolosFeatureExtractor * Add feature extractor to docs * Add corresponding tests * Fix style * Fix docs * Apply suggestion from code review * Fix bad rebase * Fix some more bad rebase * Fix missing character * Improve docs and variable names Co-authored-by: Niels Rogge <nielsrogge@Nielss-MacBook-Pro.local> |
||
![]() |
e6f00a11d7
|
Update README to latest release (#16997) | ||
![]() |
4ef0abb738
|
Add TAPEX (#16473)
* Add TapexTokenizer * Improve docstrings and provide option to provide answer * Remove option for pretokenized inputs * Add TAPEX to README * Fix copies * Remove option for pretokenized inputs * Initial commit: add tapex fine-tuning examples on both table-based question answering and table-based fact verification. * - Draft a README file for running the script and introducing some background. - Remove unused code lines in tabfact script. - Disable the default `pad_to_max_length` option which is memory-consuming. * * Support `as_target_tokenizer` function for TapexTokenizer. * Fix the do_lower_case behaviour of TapexTokenizer. * Add unit tests for target scenarios and cased/uncased scenarios for both source and target. * * Replace the label BartTokenizer with TapexTokenizer's as_target_tokenizer function. * Fix typos in tapex example README. * * fix the evaluation script - remove the property `task_name` * * Make the label space more clear for tabfact tasks * * Using a new fine-tuning script for tapex-base on tabfact. * * Remove the lowercase code outside the tokenizer - we use the tokenizer to control whether do_lower_case * Guarantee the hyper-parameter can be run without out-of-memory on 16GB card and report the new reproduced number on wikisql * * Remove the default tokenizer_name option. * Provide evaluation command. * * Support for WikiTableQuestion dataset. * Fix a typo in README. 
* * Fix the datasets's key name in WikiTableQuestions * Run make fixup and move test to folder * Fix quality * Apply suggestions from code review * Apply suggestions from code review Co-authored-by: Suraj Patil <surajp815@gmail.com> * Apply suggestions from code review * Apply suggestions from code review Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Apply some more suggestions from code review * Improve docstrings * Overwrite failing test * Improve comment in example scripts * Fix rebase * Add TAPEX to Auto mapping * Add TAPEX to auto config mappings * Put TAPEX higher than BART in auto mapping * Add TAPEX to doc tests Co-authored-by: Niels Rogge <nielsrogge@Nielss-MBP.localdomain> Co-authored-by: SivilTaram <qianlxc@outlook.com> Co-authored-by: Niels Rogge <nielsrogge@nielss-mbp.home> Co-authored-by: Suraj Patil <surajp815@gmail.com> Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> Co-authored-by: Niels Rogge <nielsrogge@Nielss-MacBook-Pro.local> |