transformers

mirror of https://github.com/huggingface/transformers.git synced 2025-07-08 23:30:08 +06:00

Author	SHA1	Message	Date
Sayak Paul	84eaa6acf5	Add TFConvNextModel (#15750 ) * feat: initial implementation of convnext in tensorflow. * fix: sample code for the classification model. * chore: added checked for from the classification model. * chore: set bias initializer in the classification head. * chore: updated license terms. * chore: removed ununsed imports * feat: enabled argument during using drop_path. * chore: replaced tf.identity with layers.Activation(linear). * chore: edited default checkpoint. * fix: minor bugs in the initializations. * partial-fix: tf model errors for loading pretrained pt weights. * partial-fix: call method updated * partial-fix: cross loading of weights (4x3 variables to be matched) * chore: removed unneeded comment. * removed playground.py * rebasing * rebasing and removing playground.py. * fix: renaming TFConvNextStage conv and layer norm layers * chore: added initializers and other minor additions. * chore: added initializers and other minor additions. * add: tests for convnext. * fix: integration tester class. * fix: issues mentioned in pr feedback (round 1). * fix: how output_hidden_states arg is propoagated inside the network. * feat: handling of arg for pure cnn models. * chore: added a note on equal contribution in model docs. * rebasing * rebasing and removing playground.py. * feat: encapsulation for the convnext trunk. * Fix variable naming; Test-related corrections; Run make fixup * chore: added Joao as a contributor to convnext. * rebasing * rebasing and removing playground.py. * rebasing * rebasing and removing playground.py. * chore: corrected copyright year and added comment on NHWC. * chore: fixed the black version and ran formatting. * chore: ran make style. * chore: removed from_pt argument from test, ran make style. * rebasing * rebasing and removing playground.py. * rebasing * rebasing and removing playground.py. * fix: tests in the convnext subclass, ran make style. * rebasing * rebasing and removing playground.py. 
* rebasing * rebasing and removing playground.py. * chore: moved convnext test to the correct location * fix: locations for the test file of convnext. * fix: convnext tests. * chore: applied sgugger's suggestion for dealing w/ output_attentions. * chore: added comments. * chore: applied updated quality enviornment style. * chore: applied formatting with quality enviornment. * chore: revert to the previous tests/test_modeling_common.py. * chore: revert to the original test_modeling_common.py * chore: revert to previous states for test_modeling_tf_common.py and modeling_tf_utils.py * fix: tests for convnext. * chore: removed output_attentions argument from convnext config. * chore: revert to the earlier tf utils. * fix: output shapes of the hidden states * chore: removed unnecessary comment * chore: reverting to the right test_modeling_tf_common.py. * Styling nits Co-authored-by: ariG23498 <aritra.born2fly@gmail.com> Co-authored-by: Joao Gante <joao@huggingface.co> Co-authored-by: Sylvain Gugger <Sylvain.gugger@gmail.com>	2022-02-25 18:19:16 +01:00
Sylvain Gugger	0118c4f6a8	Re-enable doctests for the quicktour (#15828 ) * Re-enable doctests for the quicktour * Re-enable doctests for task_summary (#15830) * Remove &	2022-02-25 17:46:38 +01:00
Tanay Mehta	7566734d6f	Add model specific output classes to PoolFormer model docs (#15746 ) * Added model specific output classes to poolformer docs * Fixed Segformer typo in Poolformer docs	2022-02-25 13:43:56 +01:00
Steven Liu	fecb08c2b8	🧼 NLP task guides (#15731 ) * clean commit of changes to NLP tasks * 🖍 apply feedback * 📝 move tf data collator in multiple choice Co-authored-by: Steven <stevhliu@gmail.com>	2022-02-23 13:58:33 -06:00
Julien Chaumond	32f5de10a0	[doc] custom_models: mention security features of the Hub (#15768 ) * custom_models: tiny doc addition * mention security feature earlier in the section Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>	2022-02-23 11:40:06 -05:00
Nicolas Patry	f9582c205a	Adding ZeroShotImageClassificationPipeline (#12119 ) * [Proposal] Adding ZeroShotImageClassificationPipeline - Based on CLIP * WIP, Resurection in progress. * Resurrection... achieved. * Reword handling different `padding_value` for `feature_extractor` and `tokenizer`. * Thanks doc-builder ! * Adding docs + global namespace `ZeroShotImageClassificationPipeline`. * Fixing templates. * Make the test pass and be robust to floating error. * Adressing suraj's comments on docs mostly. * Tf support start. * TF support. * Update src/transformers/pipelines/zero_shot_image_classification.py Co-authored-by: Suraj Patil <surajp815@gmail.com> Co-authored-by: Suraj Patil <surajp815@gmail.com>	2022-02-23 09:41:42 +01:00
Patrick von Platen	c44d3675c2	Time stamps for CTC models (#15687 ) * [Wav2Vec2 Time Stamps] * Add first version * add word time stamps * Fix * save intermediate space * improve * [Finish CTC Tokenizer] * remove @ * remove @ * push * continue with phonemes * up * finish PR * up * add example * rename * finish * Apply suggestions from code review Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * correct split * finalize Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>	2022-02-22 19:26:44 +01:00
Francesco Saverio Zuppichini	38bed912e3	added link to our writing-doc document (#15756 )	2022-02-22 09:57:28 +01:00
Joao Gante	3956b133b6	TF text classification examples (#15704 ) * Working example with to_tf_dataset * updated text_classification * more comments	2022-02-21 17:17:59 +00:00
Gunjan Chhablani	2c2a31ffbc	Add missing PLBart entry in README (#15721 ) * Add missing PLBart entry in index * Fix README * Fix README * Fix style * Change to master model doc	2022-02-18 21:11:42 +01:00
Gunjan Chhablani	ae1f835028	Add PLBart (#13269 ) * Init PLBART * Add missing configuration file * Add conversion script and configurationf ile * Fix style * Update modeling and conversion scripts * Fix scale embedding in config * Add comment * Fix conversion script * Add classification option to conversion script * Fix vocab size in config doc * Add tokenizer files from MBart50 * Allow no lang code in regular tokenizer * Add PLBart Tokenizer Converters * Remove mask from multi tokenizer * Remove mask from multi tokenizer * Change from MBart-50 to MBart tokenizer * Fix names and modify src/tgt behavior * Fix imports for tokenizer * Remove <mask> from multi tokenizer * Fix style * Change tokenizer_class to processor_class * Add attribute map to config class * Update modeling file to modified MBart code * Update configuration file to MBart style configuration * Fix tokenizer * Separate tokenizers * Fix error in tokenization auto * Copy MBart tests * Replace with MBart tokenization tests * Fix style * Fix language code in multi tokenizer * Fix configuration docs * Add entry for plbart_multi in transformers init * Add dummy objects and fix imports * Fix modeling tests * Add TODO in config * Fix copyright year * Fix modeling docs and test * Fix some tokenization tests and style * Add changes from review * Fix copies * Fix docs * Fix docs * Fix style * Fix year * Add changes from review * Remove extra changes * Fix base tokenizer and doc * Fix style * Fix modeling and slow tokenizer tests * Remove Multi-tokenizer Converter and Tests * Delete QA model and Multi Tokenizer dummy objects * Fix repo consistency and code quality issues * Fix example documentation * Fix style * Remove PLBartTokenizer from type checking in init * Fix consistency issue * Add changes from review * Fix style * Remove PLBartTokenizerFast * Remove FastTokenizer converter * Fix AutoTokenzier mapping * Add plbart to toctree and fix consistency issues * Add language codes tokenizer test * Fix styling and 
doc issues * Add fixes for failing tests * Fix copies * Fix failing modeling test * Change assert to assertTrue in modeling tests	2022-02-18 14:17:09 +01:00
Francesco Saverio Zuppichini	240cc6cbdc	Adding a model, more doc for pushing to the hub (#15690 ) * doc for adding a model to the hub * run make style * resolved conversation * removed a line * removed ) * Update docs/source/add_new_model.mdx Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Update docs/source/add_new_model.mdx Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * make style Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>	2022-02-18 09:11:18 +01:00
NielsRogge	57882177be	Add SimMIM (#15586 ) * Add first draft * Make model importable * Make SwinForMaskedImageModeling importable * Fix imports * Add missing inits * Add support for Swin * Fix bug * Fix bug * Fix another bug * Fix Swin MIM implementation * Fix default encoder stride * Fix Swin * Add print statements for debugging * Add image_size data argument * Fix Swin * Fix image_size * Add print statements for debugging * Fix print statement * Remove print statements * Improve reshaping of bool_masked_pos * Add support for DeiT, fix tests * Improve docstrings * Apply new black version * Improve script * Fix bug * Improve README * Apply suggestions from code review * Remove DS_Store and add to gitignore * Apply suggestions from code review + fix BEiT Flax * Revert BEiT changes * Improve README * Fix code quality * Improve README Co-authored-by: Niels Rogge <nielsrogge@Nielss-MBP.localdomain> Co-authored-by: Niels Rogge <nielsrogge@Nielss-MacBook-Pro.local>	2022-02-17 19:44:55 +01:00
Yih-Dar	92a537d938	Minor fix on README.md (#15688 ) * fix README * fix more arxiv links * make fix-copies Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2022-02-17 08:38:32 -05:00
Tanay Mehta	f84e0dbd2a	Add PoolFormer (#15531 ) * Added all files, PoolFormerFeatureExtractor still failing tests * Fixed PoolFormerFeatureExtractor not being able to import * Completed Poolformer doc * Applied Suggested fixes * Fixed errors in modeling_auto.py * Fix feature extractor, convert docs to Markdown, styling of code * Remove PoolFormer from check_repo and fix integration test * Remove Poolformer from check_repo * Fixed configuration_poolformer.py docs and removed inference.py from poolformer * Ran with black v22 * Added PoolFormer to _toctree.yml * Updated poolformer doc * Applied suggested fixes and added on README.md * Did make fixup and make fix-copies, tests should pass now * Changed PoolFormer weights conversion script name and fixed README * Applied fixes in test_modeling_poolformer.py and modeling_poolformer.py * Added PoolFormerFeatureExtractor to AutoFeatureExtractor API Co-authored-by: Niels Rogge <nielsrogge@Nielss-MBP.localdomain>	2022-02-17 13:16:37 +01:00
Francesco Saverio Zuppichini	b87c044c79	Usage examples for logger (#15657 ) * logger * Update docs/source/main_classes/logging.mdx Co-authored-by: Stas Bekman <stas00@users.noreply.github.com> * Update docs/source/main_classes/logging.mdx Co-authored-by: Stas Bekman <stas00@users.noreply.github.com> Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>	2022-02-16 10:15:13 +01:00
Stas Bekman	bee361c6f1	[t5/t0/mt5 models] faster/leaner custom layer norm (#14656 ) * [t5] faster/leaner custom layer norm * wip * apex.normalization.FusedRMSNorm * cleanup * cleanup * add doc * add catch all * Trigger CI * expand	2022-02-15 16:49:57 -08:00
Patrick von Platen	2e12b907ae	TF generate refactor - Greedy Search (#15562 ) * TF generate start refactor * Add tf tests for sample generate * re-organize * boom boom * Apply suggestions from code review * re-add * add all code * make random greedy pass * make encoder-decoder random work * further improvements * delete bogus file * make gpt2 and t5 tests work * finish logits tests * correct logits processors * correct past / encoder_outputs drama * refactor some methods * another fix * refactor shape_list * fix more shape list * import shape _list * finish docs * fix imports * make style * correct tf utils * Fix TFRag as well * Apply Lysandre's and Sylvais suggestions * Update tests/test_generation_tf_logits_process.py Co-authored-by: Matt <Rocketknight1@users.noreply.github.com> * Update src/transformers/tf_utils.py Co-authored-by: Matt <Rocketknight1@users.noreply.github.com> * remove cpu according to gante * correct logit processor Co-authored-by: Matt <Rocketknight1@users.noreply.github.com>	2022-02-15 17:54:43 +01:00
Nicolas Patry	cdf19c501d	Re-export `KeyDataset`. (#15645 ) * Re-export `KeyDataset`. * Update the docs locations.	2022-02-15 17:49:38 +01:00
Stas Bekman	28e6155d8a	add a network debug script and document it (#15652 ) * add a network debug script and document it * doc	2022-02-15 08:48:00 -08:00
jonrbates	86a7845c0c	Fix typo in speech2text2 doc (#15617 ) Forward looks for inputs, not input_ids	2022-02-15 13:54:34 +01:00
fra	05a8580964	Revert "logger doc" This reverts commit `41168a49ce`.	2022-02-15 10:46:45 +01:00
fra	41168a49ce	logger doc	2022-02-15 10:03:28 +01:00
NielsRogge	b090b79022	Make Swin work with VisionEncoderDecoderModel (#15527 ) * Add attribute_map * Add mention in docs * Set hidden_size attribute correctly * Add note about Transformer-based models only Co-authored-by: Niels Rogge <nielsrogge@Nielss-MBP.localdomain>	2022-02-14 17:33:35 +01:00
Daniel Erenrich	4f403ea899	Fix grammar in tokenizer_summary (#15614 ) "to make ensure" is redundant.	2022-02-11 16:51:30 -05:00
Stas Bekman	f15c99fabf	[deepspeed docs] misc additions (#15585 ) * [deepspeed docs] round_robin_gradients * training and/or eval/predict loss is * Update docs/source/main_classes/deepspeed.mdx Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>	2022-02-11 10:54:04 -08:00
Steven Liu	85aee09e9a	🖍 remove broken link (#15615 )	2022-02-11 12:33:55 -06:00
Sylvain Gugger	6cf06d198c	Mark "code in the Hub" API as experimental (#15624 )	2022-02-11 09:55:31 -05:00
Ngo Quang Huy	c0864d98ba	Correct JSON format (#15600 )	2022-02-10 09:02:03 -08:00
lewtun	2e8b85f72e	Add local and TensorFlow ONNX export examples to docs (#15604 ) * Add local and TensorFlow ONNX export examples to docs * Use PyTorch - TensorFlow split	2022-02-10 16:31:00 +01:00
Alberto Bégué	cb7ed6e083	Add Tensorflow handling of ONNX conversion (#13831 ) * Add TensorFlow support for ONNX export * Change documentation to mention conversion with Tensorflow * Refactor export into export_pytorch and export_tensorflow * Check model's type instead of framework installation to choose between TF and Pytorch Co-authored-by: Lysandre Debut <lysandre@huggingface.co> Co-authored-by: Alberto Bégué <alberto.begue@della.ai> Co-authored-by: lewtun <lewis.c.tunstall@gmail.com>	2022-02-10 11:18:41 +01:00
Sylvain Gugger	c722753afd	Expand tutorial for custom models (#15587 ) * Expand tutorial for custom models * Style * Apply suggestions from code review Co-authored-by: Lysandre Debut <lysandre.debut@reseau.eseo.fr> Co-authored-by: Lysandre Debut <lysandre.debut@reseau.eseo.fr>	2022-02-09 17:44:28 -05:00
NielsRogge	a86ee2261e	Add link (#15588 ) Co-authored-by: Niels Rogge <nielsrogge@Nielss-MBP.localdomain>	2022-02-09 23:33:39 +01:00
Stas Bekman	dee17d5676	[trainer docs] document how to select specific gpus (#15551 ) * [trainer docs] document how to select specific gpus * expand * add urls * add accelerate launcher	2022-02-09 10:12:29 -08:00
Chan Woo Kim	2b5603f6ac	Constrained Beam Search [without disjunctive decoding] (#15416 ) * added classes to get started with constrained beam search * in progress, think i can directly force tokens now but not yet with the round robin * think now i have total control, now need to code the bank selection * technically works as desired, need to optimize and fix design choices leading to undersirable outputs * complete PR #1 without disjunctive decoding * removed incorrect tests * Delete k.txt * Delete test.py * Delete test.sh * revert changes to test scripts * genutils * full implementation with testing, no disjunctive yet * shifted docs * passing all tests realistically ran locally * removing accidentally included print statements * fixed source of error in initial PR test * fixing the get_device() vs device trap * fixed documentation docstrings about constrained_beam_search * fixed tests having failing for Speech2TextModel's floating point inputs * fix cuda long tensor * added examples and testing for them and founx & fixed a bug in beam_search and constrained_beam_search * deleted accidentally added test halting code with assert False * code reformat * Update tests/test_generation_utils.py Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com> * Update tests/test_generation_utils.py Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com> * Update tests/test_generation_utils.py Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com> * Update tests/test_generation_utils.py Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com> * Update tests/test_generation_utils.py * fixing based on comments on PR * took out the testing code that should but work fails without the beam search moditification ; style changes * fixing comments issues * docstrings for ConstraintListState * typo in PhrsalConstraint docstring * docstrings improvements Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>	2022-02-09 16:59:26 +01:00
Leandro von Werra	d923f76203	add model scaling section (#15119 ) * add model scaling section * Apply suggestions from code review Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * integrate reviewer feedback * initialize GPU properly * add note about BnB optimizer * move doc from `scaling.mdx` to `performance.mdx` * integrate reviewer feedback * revert section levels Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>	2022-02-09 15:27:30 +01:00
Sylvain Gugger	b5c6fdecf0	PoC for a ProcessorMixin class (#15549 ) * PoC for a ProcessorMixin class * Documentation * Apply suggestions from code review Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> Co-authored-by: Suraj Patil <surajp815@gmail.com> Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com> * Roll out to other processors * Add base feature extractor class in init * Use args and kwargs Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> Co-authored-by: Suraj Patil <surajp815@gmail.com> Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>	2022-02-09 09:24:49 -05:00
Nathan Raw	fcb4f11c92	📝 Add codecarbon callback to docs (#15563 )	2022-02-08 14:10:53 -05:00
Joao Gante	8406fa6dd5	Add TFSpeech2Text (#15113 ) * Add wrapper classes * convert inner layers to tf * Add TF Encoder and Decoder layers * TFSpeech2Text models * Loadable model * TF model with same outputs as PT model * test skeleton * correct tests and run the fixup * correct attention expansion * TFSpeech2Text pask_key_values with TF format	2022-02-08 16:27:23 +00:00
aaron	87d08afb16	electra is added to onnx supported model (#15084 ) * electra is added to onnx supported model * add google/electra-base-generator for test onnx module Co-authored-by: Lewis Tunstall <lewis.c.tunstall@gmail.com>	2022-02-08 15:47:49 +01:00
Steven Liu	552f8d3091	Create a custom model guide (#15489 ) * 📝 add config section * 📝 finish first draft * 📝 add feature extractor and processor * 🖍 apply feedback from review * 📝 minor edits * last review	2022-02-07 12:34:56 -06:00
lewtun	6775b211b6	Remove Longformers from ONNX-supported models (#15273 )	2022-02-07 17:32:13 +01:00
NielsRogge	84eec9e6ba	Add ConvNeXT (#15277 ) * First draft * Add conversion script * Improve conversion script * Improve docs and implement tests * Define model output class * Fix tests * Fix more tests * Add model to README * Apply suggestions from code review Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Apply more suggestions from code review * Apply suggestions from code review * Rename dims to hidden_sizes * Fix equivalence test * Rename gamma to gamma_parameter * Clean up conversion script * Add ConvNextFeatureExtractor * Add corresponding tests * Implement feature extractor correctly * Make implementation cleaner * Add ConvNextStem class * Improve design * Update design to also include encoder * Fix gamma parameter * Use sample docstrings * Finish conversion, add center cropping * Replace nielsr by facebook, make feature extractor tests smaller * Fix integration test Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>	2022-02-07 16:11:37 +01:00
Stas Bekman	8ce1330631	[deepspeed docs] DeepSpeed ZeRO Inference (#15486 ) * [deepspeed docs] DeepSpeed ZeRO Inference * Apply suggestions from code review Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * tweak * deal with black * extra cleanup, better comments Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>	2022-02-04 13:51:02 -08:00
Sylvain Gugger	ac6aa10f23	Standardize semantic segmentation models outputs (#15469 ) * Standardize instance segmentation models outputs * Rename output * Update src/transformers/modeling_outputs.py Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Add legacy argument to the config and model forward * Update src/transformers/models/beit/modeling_beit.py Co-authored-by: Lysandre Debut <lysandre@huggingface.co> * Copy fix in Segformer Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> Co-authored-by: Lysandre Debut <lysandre@huggingface.co>	2022-02-04 14:52:07 -05:00
Stas Bekman	31be2f45a9	[deepspeed docs] Megatron-Deepspeed info (#15488 )	2022-02-04 11:15:13 -08:00
Stas Bekman	21dcaec5d5	[deepspeed docs] memory requirements (#15506 )	2022-02-03 10:55:14 -08:00
Sylvain Gugger	44b21f117b	Save code of registered custom models (#15379 ) * Allow dynamic modules to use relative imports * Work for configs * Fix last merge conflict * Save code of registered custom objects * Map strings to strings * Fix test * Add tokenizer * Rework tests * Tests * Ignore fixtures py files for tests * Tokenizer test + fix collection * With full path * Rework integration * Fix typo * Remove changes in conftest * Test for tokenizers * Add documentation * Update docs/source/custom_models.mdx Co-authored-by: Lysandre Debut <lysandre@huggingface.co> * Add file structure and file content * Add more doc * Style * Update docs/source/custom_models.mdx Co-authored-by: Suraj Patil <surajp815@gmail.com> * Address review comments Co-authored-by: Lysandre Debut <lysandre@huggingface.co> Co-authored-by: Suraj Patil <surajp815@gmail.com>	2022-02-02 10:44:37 -05:00
Steven Liu	b9418a1d97	Update tutorial docs (#15165 ) * first draft of pipeline, autoclass, preprocess tutorials * apply review feedback * 🖍 apply feedback from patrick/niels * 📝add output image to preprocessed image * 🖍 apply feedback from patrick	2022-02-01 18:31:35 -06:00
Steven Liu	c157c7e3fd	Update fine-tune docs (#15259 ) * add fine-tune tutorial * make edits, fix style * 📝 make edits * 🖍 fix code format links to external libraries * 🔄revert code formatting * 🖍 use DefaultDataCollator instead of DataCollatorWithPadding	2022-02-01 18:28:12 -06:00

1127 Commits