transformers

mirror of https://github.com/huggingface/transformers.git synced 2025-07-31 02:02:21 +06:00

Author	SHA1	Message	Date
Dhruv Nair	fe06f8dcac	Improvements to Comet Integration (#14680 ) * change args to address overwriting issue * remove project name from args * remove passing args as kwargs to experiment object * remove passing args as kwargs to offline experiment * fix offline directory assignment in experiment kwargs * log checkpoint folder on training end * log entire output_dir as asset folder * log asset folder recursively * end experiment at the end of training * clean up * clean up * Default to always log training assets to Comet when using CometCallback * change logging training assets to be true when running callback setup * fix so that experiment always ends when training ends * styling and quality fixes * update docstring for COMET_LOG_ASSETS environment variable * run styling and quality checks * clean up to docstring * remove merge markers * change asset logging to false to avoid hitting max assets per experiment limit * update training asset description * fix styling	2021-12-08 13:39:10 -05:00
Gaurang Tandon	4ea19de80c	fix: verify jsonlines file in run_translation (#14660 ) (#14661 ) * fix: verify jsonl in run_translation (#14660) * fix(run_translation.py): json/jsonl validation Both json and jsonl are to be accepted as valid jsonlines file extension * fix(run_translation.py): make black happy * Ran make style	2021-12-08 13:25:30 -05:00
Sylvain Gugger	cf36f4d7a8	Convert tutorials (#14665 ) * Convert a few docs * And another * Last tutorials * New syntax for colab links * Convert a few docs * And another * Last tutorials * New syntax for colab links	2021-12-08 13:19:46 -05:00
lewtun	0f4e39c559	Revert "Added support for other features for already supported models (#14358 )" (#14679 ) This reverts commit `0c70f145d1`.	2021-12-08 13:04:40 -05:00
Michael Benayoun	0c70f145d1	Added support for other features for already supported models (#14358 ) * Added support for other features for already supported models * Partial support for causal and seq2seq models * Partial support for causal and seq2seq models * OnnxSeq2SeqConfigWithPast to support seq2seq models * Parameterized the onnx tests * Restored run_mlm.py * Restored run_mlm.py * [WIP] BART update * BART and MBART * Added comments * Another sequence length of the past_key_values	2021-12-08 18:39:56 +01:00
Patrick von Platen	ee4fa2e465	[AutoProcessor] Add Wav2Vec2WithLM & small fix (#14675 ) * [AutoProcessor] Add Wav2Vec2WithLM & small fix * revert line removal * Update src/transformers/__init__.py * add test * up * up * small fix	2021-12-08 15:51:28 +01:00
Lysandre Debut	2294071a0c	Fix doc builder (#14676 )	2021-12-08 09:14:36 -05:00
ZOHETH	fab3b518ef	fix deprecated tf method (#14671 ) tf.matrix_band_part -> tf.linalg.band_part	2021-12-08 13:43:21 +00:00
NielsRogge	65b20b739b	Add Perceiver IO (#14487 ) * First draft * Style and remove mlm * Make forward pass work * More improvements * More improvements * Fix bug * More improvements * More improvements * Add PerceiverTokenizer first draft * Improve conversion script * More improvements * Make conversion script work for the encoder * Make conversion script work with local pickle files * Style & quality, fix-copies * Add dummy input to conversion script * Add absolute position embeddings to TextPreProcessor * Make forward pass of encoder work * More improvements * Move text preprocessor to separate script * More improvements * More improvements * Add post processor * Make MLM model work * Style * Add PerceiverForMaskedLM * Add PerceiverImagePreprocessor * Make style * Make PerceiverForImageClassification work * More improvements * More improvements * Use tokenizer in conversion script * Use PerceiverForMaskedLM in conversion script * Define custom PerceiverModelOutput * Improve PerceiverAttention to make it work for both MLM and image classification * More improvements * More improvements * More improvements to the conversion script * Make conversion script work for both MLM and image classification * Add PerceiverFeatureExtractor * More improvements * Style and quality * Add center cropping * Fix bug * Small fix * Add print statement * Fix bug in image preprocessor * Fix bug with conversion script * Make output position embeddings an nn.Parameter layer instead of nn.Embedding * Comment out print statements * Add position encoding classes * More improvements * Use position_encoding_kwargs * Add PerceiverForImageClassificationFourier * Make style & quality * Add PerceiverForImageClassificationConvProcessing * Style & quality * Add flow model * Move processors to modeling file * Make position encodings modular * Make basic decoder use modular position encodings * Add PerceiverForOpticalFlow to conversion script * Add AudioPreprocessor * Make it possible for the basic decoder to use Fourier position embeddings * Add PerceiverForMultimodalAutoencoding * Improve model for optical flow * Improve _build_network_inputs method * Add print statement * Fix device issue * Fix device of Fourier embeddings * Add print statements for debugging * Add another print statement * Add another print statement * Add another print statement * Add another print statement * Improve PerceiverAudioPreprocessor * Improve conversion script for multimodal modal * More improvements * More improvements * Improve multimodal model * Make forward pass multimodal model work * More improvements * Improve tests * Fix some more tests * Add output dataclasses * Make more tests pass * Add print statements for debuggin * Add tests for image classification * Add PerceiverClassifierOutput * More improvements * Make more tests pass for the optical flow model * Make style & quality * Small improvements * Don't support training for optical flow model for now * Fix _prepare_for_class for tests * Make more tests pass, add some docs * Add multimodal model to tests * Minor fixes * Fix tests * Improve conversion script * Make fixup * Remove pos_dim argument * Fix device issue * Potential fix for OOM * Revert previous commit * Fix test_initialization * Add print statements for debugging * Fix print statement * Add print statement * Add print statement * Add print statement * Add print statement * Add print statement * Add print statement * Remove need for output_shape * Comment out output_shape * Remove unnecessary code * Improve docs * Fix make fixup * Remove PerceiverTextProcessor from init * Improve docs * Small improvement * Apply first batch of suggestions from code review * Apply more suggestions from code review * Update docstrings * Define dicts beforehand for readability * Rename task to architecture in conversion script, include PerceiverModel in tests * Add print statements for debugging * Fix tests on GPU * Remove preprocessors, postprocessors and decoders from main init * Add integration test * Fix docs * Replace einops by torch * Update for new docs frontend * Rename PerceiverForImageClassification * Improve docs * Improve docs * Improve docs of PerceiverModel * Fix some more tests * Improve center_crop * Add PerceiverForSequenceClassification * Small improvements * Fix tests * Add integration test for optical flow model * Clean up * Add tests for tokenizer * Fix tokenizer by adding special tokens properly * Fix CI	2021-12-08 14:20:34 +01:00
Patrick von Platen	961732c276	[Wav2Vec2] PyCTCDecode Integration to support language model boosted decoding (#14339 ) * up * up * up * make it cleaner * correct * make styhahalal * add more tests * finish * small fix * make style * up * tryout to solve cicrle ci * up * fix more tests * fix more tests * apply sylvains suggestions * fix import * correct docs * add pyctcdecode only to speech tests * fix more tests * add tf, flax and pt tests * add pt * fix last tests * fix more tests * Apply suggestions from code review * change lines * Apply suggestions from code review Co-authored-by: Anton Lozhkov <aglozhkov@gmail.com> * correct tests * correct tests * add doc string Co-authored-by: Anton Lozhkov <aglozhkov@gmail.com>	2021-12-08 12:07:54 +01:00
Nicolas Patry	2e12d90b9e	Fixing Dataset for TQA + token-classification. (#14658 ) * Fixing Dataset for TQA + token-classification. * Fixing the tests. * Making sure `offset_mappings` is a valid argument.	2021-12-08 09:54:24 +01:00
Stas Bekman	fae0b9faef	[trainer] conditional ctx managers into one wrapper (#14663 ) * [trainer] conditional ctx managers into one wrapper * workaround for contextlib.nullcontext for py<3.7 * Update src/transformers/trainer.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * one more autocast * style Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>	2021-12-07 13:04:18 -08:00
TranSirius	39f1dff5a0	Fix a Bug, trainer_seq2seq.py, in the else branch at Line 172, generation_inputs should be a dict (#14546 ) * fix bug, trainer_seq2seq.py, Line 172, generation_inputs must be a dict before feeding into self.model.generation() * fix bug, trainer_seq2seq.py, Line 172, generation_inputs must be a dict before feeding into self.model.generation()	2021-12-07 12:09:18 -05:00
Nouamane Tazi	2171695cc2	quick fix SummarizationPipeline error messages (#14618 ) * quick fix SummarizationPipeline error messages Fix error messages to avoid spam errors, and errors of type: `Your max_length is set to 50, but you input_length is only 46. You might consider decreasing max_length manually, e.g. summarizer('...', max_length=50)` * correcto SummarizationPipeline error messages fixes	2021-12-07 16:44:28 +01:00
Stas Bekman	b66c5ab20c	[deepspeed] fix --load_best_model_at_end (#14652 ) * [deepspeed] fix load_best_model_at_end * try with pull_request_target * revert: try with pull_request_target * style * add test * cleanup	2021-12-06 21:57:47 -08:00
Ryokan RI	30646a0a3c	Add mLUKE (#14640 ) * implement MLukeTokenizer and LukeForMaskedLM * update tests * update docs * add LukeForMaskedLM to check_repo.py * update README * fix test and specify the entity pad id in tokenization_(m)luke * fix EntityPredictionHeadTransform	2021-12-07 00:25:28 -05:00
Yih-Dar	4cdb67caba	Use cross_attention_hidden_size in Encoder-Decoder models (#14378 ) * add cross_attention_hidden_size to text-2-text encoder-decoder models (PT/Flax) * for TFEncoderDecoderModel * add equivalence test for TFEncoderDecoderModel * fix * fix failed equivalence tests * remove unused import * add detailed comment * Fix check_equivalence_tf_to_pt by using encoder/decoder * cleaning * Use cross_attention_hidden_size in speech-to-text * clean fast init logging msg in encoder decoder models * increase tol from 1e-5 to 1e-3 for tf test * style * style * make sure projection layer can run * remove type conversion + add check * fix conflict (config.output_hidden_size) * Remove TF -> PT in check_pt_tf_equivalence for TFEncoderDecoderModel Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2021-12-07 00:27:32 +01:00
Sylvain Gugger	381b05a3f5	Remove nonworking workflow for now	2021-12-06 17:25:28 -05:00
Suraj Patil	75ae287aec	fix flax examples tests (#14646 ) * make tensorboard optional * update test_fetcher for flax examples * make the tests slow	2021-12-07 00:34:27 +05:30
Sylvain Gugger	03fda7b743	Add a job to test the documentation build (#14645 ) * Add a job to the documentation build * Add caching * Test cache	2021-12-06 13:55:59 -05:00
Sylvain Gugger	e513c16e82	Fix syntax for class references (#14644 )	2021-12-06 13:31:27 -05:00
Lysandre Debut	e9688875bf	Auto processor fix (#14623 ) * Add AutoProcessor class Init and tests Add doc Fix init Update src/transformers/models/auto/processing_auto.py Co-authored-by: Lysandre Debut <lysandre@huggingface.co> Reverts to tokenizer or feature extractor when available Adapt test * Revert "Adapt test" This reverts commit `bbdde5fab0`. * Revert "Reverts to tokenizer or feature extractor when available" This reverts commit `77659ff5d2`. * Don't revert everything Lysandre! Co-authored-by: Sylvain Gugger <sylvain.gugger@gmail.com>	2021-12-06 12:49:50 -05:00
Suraj Patil	cbe6026536	fix flax example tests (#14643 )	2021-12-06 23:14:37 +05:30
guhur	df085d8ea8	doc: mismatch between pooler/d_output (#14641 ) The model outputs a pooler_output whereas the doctype examples were using a pooled_output.	2021-12-06 11:51:53 -05:00
tucan9389	0f3f045ebd	Add GPTJForQuestionAnswering (#14503 ) * Add GPTJForQuestionAnswering * Reformat for GPTJForQuestionAnswering * Fix isort error * make style for GPTJForQA * Add _keys_to_ignore_on_load_missing * Change the sequence of qa and classification Co-authored-by: Suraj Patil <surajp815@gmail.com>	2021-12-06 11:44:10 -05:00
Jay Zhang	1ccc033c56	Update the example of exporting Bart + BeamSearch to ONNX module to resolve comments. (#14310 ) * Update code to resolve comments left in previous PR. * Add README.md file for this example. * Update examples/onnx/pytorch/translation/README.md Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Update examples/onnx/pytorch/translation/README.md Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Update examples/onnx/pytorch/translation/README.md Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Update README.md file to resolve comments. * Add a section name. * Update examples/onnx/pytorch/translation/README.md Co-authored-by: Gary Miguel <garymm@garymm.org> * Add more comments for _convert_past_list_to_tuple(). * Change the default file name to a consistent one. * Fix a format issue. * Update examples/onnx/pytorch/translation/README.md Co-authored-by: Gary Miguel <garymm@garymm.org> * Update examples/onnx/pytorch/translation/run_onnx_exporter.py Co-authored-by: Gary Miguel <garymm@garymm.org> * Update examples/onnx/pytorch/translation/README.md Co-authored-by: lewtun <lewis.c.tunstall@gmail.com> * Change the folder to summarization and address some other coments. * Update the torch version. Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> Co-authored-by: Gary Miguel <garymm@garymm.org> Co-authored-by: lewtun <lewis.c.tunstall@gmail.com>	2021-12-06 14:01:51 +01:00
Julien Chaumond	6cdc3a7844	[urls to hub] Replace outdated model tags with their now-canonical pipeline types (#14617 ) * Replace outdated model tags with their now-canonical pipeline types * spam the CI till it's green	2021-12-06 04:35:01 -05:00
Suraj Patil	c824d7ed48	add flax example tests in CI workflow (#14637 )	2021-12-06 14:50:43 +05:30
Suraj Patil	bc8a9f415b	fix typo (#14635 )	2021-12-06 10:52:43 +05:30
Suraj Patil	c5bd732ac6	Add Flax example tests (#14599 ) * add test for glue * add tests for clm * fix clm test * add summrization tests * more tests * fix few tests * add test for t5 mlm * fix t5 mlm test * fix tests for multi device * cleanup * ci job * fix metric file name * make t5 more robust	2021-12-06 10:48:58 +05:30
Kamal Raj	803a8cd18f	updated readme with proper arguments (#14624 )	2021-12-05 22:12:51 -05:00
(Bill) Yuchen Lin	3977b58437	fix a typo (#14626 )	2021-12-05 11:31:23 +05:30
Matt	73ec4340ec	Make DefaultDataCollator importable from root (#14588 ) * Make DefaultDataCollator importable from root * Add documentation for DefaultDataCollator and add return_tensors argument to all class docstrings * make style * Add DefaultDataCollator to data_collator.rst * Add DefaultDataCollator to data_collator.rst	2021-12-03 15:15:09 -05:00
Stas Bekman	71b1bf7ea8	[trainer] add tf32-mode control (#14606 ) * [trainer] add --tf32 support * it's pt>=.17 * it's pt>=.17 * flip the default to True * add experimental note * simplify logic * style * switch to 3-state logic * doc * Apply suggestions from code review Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * re-style code Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>	2021-12-03 10:08:58 -08:00
Lysandre Debut	aada989ad5	Fix doc builder (#14616 ) * Fix doc builder * Fix doc builder * Fix doc builder	2021-12-03 12:09:25 -05:00
Lysandre Debut	ec47baeba2	2022 is the year of multi-modality (#14610 ) * 2022 is the year of multi-modality * Small fix * Apply suggestions from code review Co-authored-by: Suraj Patil <surajp815@gmail.com> Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> Co-authored-by: Anton Lozhkov <aglozhkov@gmail.com> * Apply suggestions from code review * Apply to documentation index * Apply suggestions from code review Co-authored-by: lewtun <lewis.c.tunstall@gmail.com> * Update README.md Co-authored-by: lewtun <lewis.c.tunstall@gmail.com> * Apply suggestions from code review * Apply suggestions from code review Co-authored-by: Suraj Patil <surajp815@gmail.com> Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> Co-authored-by: Anton Lozhkov <aglozhkov@gmail.com> Co-authored-by: lewtun <lewis.c.tunstall@gmail.com>	2021-12-03 11:35:44 -05:00
Stas Bekman	e62091d5a7	[CI] move env print to util, add pt, nccl versions (#14607 ) * move env print to util, add pt, nccl versions * style * version * align	2021-12-03 08:18:36 -05:00
Li-Huai (Allan) Lin	66ea739168	Improve tokenizer tests (#13594 ) * Use new method to acquire tokenizers * Resolve TODOs. * Style * Fix * Enable do_lower_case in test_tokenize_special_tokens * Apply suggestion from code review * Fix mask token handling * Revert "Fix mask token handling" This reverts commit `daaa3f5291`. * Fix FNet mask token tokenization * Complete everything * Apply suggestions from code review	2021-12-03 08:39:10 +01:00
Nik	6645eb61fa	fix #14524 (IndexError when mask prob is too low) (#14525 ) * fix #14524 (IndexError when mask prob is too low) * fix formatting * correct documentation, add option for setting min_num_masks * change the semantic meaning of `mask_prob` in _compute_mask_indices With this commit the meaing of `mask_prob` actually adhered to the probability for each vector to be the start of a masked span of length. * fix check_copies test * fix documentation to semantic meaning of `upper bound of overall masking percentage`, revert changes to _compute_mask_indices * fix typo	2021-12-02 17:05:31 +03:00
yis11178	96cc02b51b	change tf.math.divide with int(/) to remove dim_per_head from the TF graph (#14600 ) Co-authored-by: yis <yis@graphcore.ai>	2021-12-02 13:13:42 +00:00
Leandro von Werra	43f953cc2e	Add CodeParrot 🦜 codebase (#14536 ) * add readme skeleton * update readme * add initialization script * add deduplication script * add codeparrot training script * add code generation evaluation * add validation loss script * add requirements * update readme * tweak readme * make style * add highlights to readme * add CLIs to scripts * add tokenizer training script * add docstring to constant length dataset * fix defaults in arguments * update readme with cli * move image to hub * tweaks of readme * fix cli commands * add author * explain env variables * fix formatting * Update examples/research_projects/codeparrot/README.md Co-authored-by: lewtun <lewis.c.tunstall@gmail.com> * Apply suggestions from code review Co-authored-by: lewtun <lewis.c.tunstall@gmail.com> * replace generic with gpt2 tokenizer Co-authored-by: lewtun <lewis.c.tunstall@gmail.com>	2021-12-02 10:41:35 +01:00
Lysandre Debut	e4c67d60ec	Python 3.6 -> Python 3.7 for TF runs (#14598 )	2021-12-02 04:09:17 -05:00
Daniel Stancl	50d909be28	[Flax] Add FlaxBlenderbotSmall (#14576 ) * [WIP] Add FlaxBlenderbotSmall * Revert some unintentionally changed files Revert some unintentionally files changed by improperly filled cookiecutter instructions. * Fix repo consistency * Fix Flax-PT equivalence * Apply suggestions from code review * Update index.mdx * Apply suggestions from code review Co-authored-by: Suraj Patil <surajp815@gmail.com>	2021-12-02 14:21:48 +05:30
Lysandre Debut	77d87e732e	Adds a git pull instruction to the documentation builder (#14597 ) * Adds a git pull instruction * master -> main	2021-12-02 03:32:38 -05:00
Mishig Davaadorj	275402bf2b	Update doc img links (#14593 ) * Update doc img links * Rename toctree.yml -> _toctree.yml (#14594) * Update doc img links * Update performance.md img link	2021-12-02 09:01:35 +01:00
Mishig Davaadorj	4f68de625c	Rename toctree.yml -> _toctree.yml (#14594 )	2021-12-02 08:58:39 +01:00
Stas Bekman	fbe278c76c	[doc] bf16/tf32 guide (#14579 ) * [doc] bf16/tf32 guide * expand * expand * Update docs/source/performance.md Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>	2021-12-01 14:18:58 -08:00
Li-Huai (Allan) Lin	934e2799da	Fix mask token handling (#14364 ) * Fix mask token handling * Revert "Fix mask token handling" This reverts commit `daaa3f5291`. * Fix FNet mask token tokenization	2021-12-01 20:16:52 +01:00
Sylvain Gugger	4df7d05a87	Doc new front (#14590 ) * Convert PretrainedConfig doc to Markdown * Use syntax * Add necessary doc files (#14496) * Doc fixes (#14499) * Fixes for the new front * Convert DETR file for table * Title is needed * Simplify a bit * Even simpler * Remove imports * Fix typo in toctree (#14516) * Fix checkpoints badge * Update versions.yml format (#14517) * Doc new front github actions (#14512) * Doc new front github actions * Fix docstring * Fix feature extraction utils import (#14515) * Address Julien's comments * Push to doc-builder * Ready for merge * Remove old build and deploy * Doc misc fixes (#14583) * Rm versions.yml from doc * Fix converting.rst * Rm pretrained_models from toctree * Fix index links (#14567) * Fix links in README * Localized READMEs * Fix copy script * Fix find doc script * Update README_ko.md Co-authored-by: Julien Chaumond <julien@huggingface.co> Co-authored-by: Julien Chaumond <julien@huggingface.co> * Adapt build command to new CLI tools (#14578) * Fix typo * Fix doc interlinks (#14589) * Convert PretrainedConfig doc to Markdown * Use syntax * Rm pattern <[a-z]+(.html).> Rm huggingface.co/transformers/master * Rm .html * Rm .html from index.mdx * Rm .html from model_summary.rst * Update index.mdx rm html * Update remove .html * Fix inner doc links * Fix interlink in preprocssing.rst * Update pr_checks Co-authored-by: Sylvain Gugger <sylvain.gugger@gmail.com> Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Convert PretrainedConfig doc to Markdown * Use syntax * Add necessary doc files (#14496) * Doc fixes (#14499) * Fixes for the new front * Convert DETR file for table * Title is needed * Simplify a bit * Even simpler * Remove imports * Fix checkpoints badge * Fix typo in toctree (#14516) * Update versions.yml format (#14517) * Doc new front github actions (#14512) * Doc new front github actions * Fix docstring * Fix feature extraction utils import (#14515) * Address Julien's comments * Push to doc-builder * Ready for merge * Remove old build and deploy * Doc misc fixes (#14583) * Rm versions.yml from doc * Fix converting.rst * Rm pretrained_models from toctree * Fix index links (#14567) * Fix links in README * Localized READMEs * Fix copy script * Fix find doc script * Update README_ko.md Co-authored-by: Julien Chaumond <julien@huggingface.co> Co-authored-by: Julien Chaumond <julien@huggingface.co> * Adapt build command to new CLI tools (#14578) * Fix typo * Fix doc interlinks (#14589) * Convert PretrainedConfig doc to Markdown * Use syntax * Rm pattern <[a-z]+(.html).> Rm huggingface.co/transformers/master * Rm .html * Rm .html from index.mdx * Rm .html from model_summary.rst * Update index.mdx rm html * Update remove .html * Fix inner doc links * Fix interlink in preprocssing.rst * Update pr_checks Co-authored-by: Sylvain Gugger <sylvain.gugger@gmail.com> Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Styling Co-authored-by: Mishig Davaadorj <mishig.davaadorj@coloradocollege.edu> Co-authored-by: Lysandre Debut <lysandre@huggingface.co> Co-authored-by: Julien Chaumond <julien@huggingface.co>	2021-12-01 14:13:02 -05:00
Stas Bekman	14cc50d081	fix autocast for older pytorch	2021-12-01 09:32:52 -08:00
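
Commit fae0b9faef in the table above describes folding the trainer's conditional context managers into one wrapper, with a workaround for contextlib.nullcontext on Python < 3.7. The pattern can be sketched roughly as follows. This is not the code from src/transformers/trainer.py: the helper name `maybe_ctx`, the stand-in `tracing` context manager, and the `events` list are all hypothetical and exist only for illustration.

```python
import contextlib

# Fallback for interpreters where contextlib.nullcontext is missing (Python < 3.7).
if hasattr(contextlib, "nullcontext"):
    nullcontext = contextlib.nullcontext
else:
    @contextlib.contextmanager
    def nullcontext(enter_result=None):
        yield enter_result

events = []  # records what happened, so the behaviour can be inspected


@contextlib.contextmanager
def tracing(label):
    # Hypothetical stand-in for a real context manager such as an autocast scope.
    events.append("enter " + label)
    try:
        yield
    finally:
        events.append("exit " + label)


def maybe_ctx(enabled, label="autocast"):
    # Hypothetical wrapper: the real context manager when enabled, otherwise a no-op.
    return tracing(label) if enabled else nullcontext()


with maybe_ctx(True):
    events.append("step")
with maybe_ctx(False):
    events.append("step")
```

With a wrapper of this shape, a call site needs a single `with` statement and no `if`/`else` duplication of the body it guards.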


8431 Commits
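
Commit 4ea19de80c in the table above states that both json and jsonl are to be accepted as valid jsonlines file extensions in run_translation. A minimal sketch of such a check, assuming a hypothetical helper name `has_jsonlines_extension` and a case-sensitive comparison; the actual validation in run_translation.py may be written differently.

```python
import os

# Assumption taken from the commit message: both extensions count as jsonlines.
VALID_JSONLINES_EXTENSIONS = {"json", "jsonl"}


def has_jsonlines_extension(path):
    # Hypothetical helper: compare only the final extension of the file name.
    extension = os.path.splitext(path)[1].lstrip(".")
    return extension in VALID_JSONLINES_EXTENSIONS
```

For example, `has_jsonlines_extension("train.jsonl")` and `has_jsonlines_extension("train.json")` both pass, while a path ending in `.csv` or a path with no extension is rejected.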