transformers

mirror of https://github.com/huggingface/transformers.git synced 2025-07-14 18:18:24 +06:00

Author	SHA1	Message	Date
arfy slowy	01977466f4	fix: typo spelling grammar (#13212 ) * fix: typo spelling grammar * fix: make fixup	2021-08-30 08:09:14 -04:00
Patrick von Platen	fbf468b057	[Flax] Correct flax docs (#12782 ) * fix_torch_device_generate_test * remove @ * fix flax docs * correct more docs in flax * another correction * fix flax docs * Apply suggestions from code review	2021-08-04 16:31:23 +02:00
Stas Bekman	807b6bd160	[Deepspeed] warmup_ratio docs (#12830 ) * [Deepspeed] warmup_ratio docs * Update docs/source/main_classes/deepspeed.rst Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * style * Update docs/source/main_classes/deepspeed.rst Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * style Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>	2021-07-21 10:49:29 -07:00
Sylvain Gugger	da72ac6e26	Fix push_to_hub docstring and make it appear in doc (#12770 )	2021-07-17 15:52:33 +02:00
Stas Bekman	5dd0c956a8	non-native optimizers are mostly ok with zero-offload (#12690 )	2021-07-13 20:18:51 -07:00
Stas Bekman	78f5fe1416	[Deepspeed] adapt multiple models, add zero_to_fp32 tests (#12477 ) * zero_to_fp32 tests * args change * remove unnecessary work * use transformers.trainer_utils.get_last_checkpoint * document the new features * cleanup * wip * fix fsmt * add bert * cleanup * add xlm-roberta * electra works * cleanup * sync * split off the model zoo tests * cleanup * cleanup * cleanup * cleanup * reformat * cleanup * casing * deepspeed>=0.4.3 * adjust distilbert * Update docs/source/main_classes/deepspeed.rst Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * style Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>	2021-07-13 12:07:32 -07:00
Stas Bekman	7682e97702	[models] respect dtype of the model when instantiating it (#12316 ) * [models] respect dtype of the model when instantiating it * cleanup * cleanup * rework to handle non-float dtype * fix * switch to fp32 tiny model * improve * use dtype.is_floating_point * Apply suggestions from code review Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * fix the doc * recode to use explicit torch_dtype_auto_detect, torch_dtype args * docs and tweaks * docs and tweaks * docs and tweaks * merge 2 args, add docs * fix * fix * better doc * better doc Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>	2021-06-28 20:11:21 -07:00
Stas Bekman	4a872caef4	remove extra white space from log format (#12360 )	2021-06-25 13:20:14 -07:00
Stas Bekman	07ae6103c3	[Deepspeed] new docs (#12077 ) * document sub_group_size * style * install + issues reporting * style * style * Update docs/source/main_classes/deepspeed.rst Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * indent 4 * restore * style Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>	2021-06-23 11:07:37 -07:00
Stas Bekman	ebe5413589	[trainer] 2 bug fixes and a rename (#12309 ) * bug fixes and a rename * add extended DDP test	2021-06-22 11:13:23 -07:00
Stas Bekman	dad414d5f9	[trainer + examples] set log level from CLI (#12276 ) * set log level from CLI * add log_level_replica + test + extended docs * cleanup * Apply suggestions from code review Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * rename datasets objects to allow datasets module * improve the doc * style * doc improve Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>	2021-06-21 19:30:50 -07:00
Stas Bekman	040283170c	consistent nn. and nn.functional: part 5 docs (#12161 )	2021-06-14 13:34:32 -07:00
Stas Bekman	0e82f0cbc2	typo	2021-06-08 12:55:17 -07:00
Stas Bekman	32290d87f6	[Deepspeed] various fixes (#12058 ) * replace deprecated config * sub_group_size was too big * complete deprecation removal	2021-06-08 08:36:15 -07:00
Stas Bekman	2c73b93099	[Deepspeed] Assert on mismatches between ds and hf args (#12021 ) * wip * add mismatch validation + test * renames * Update docs/source/main_classes/deepspeed.rst Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * renames Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>	2021-06-04 08:58:23 -07:00
Stas Bekman	640318befa	[deepspeed] Move code and doc into standalone files (#11984 ) * move code and docs * style * moved * restore	2021-06-02 09:56:00 -07:00
Stas Bekman	7ec596ecda	[DeepSpeed] decouple `DeepSpeedConfigHF` from `Trainer` (#11966 ) * decouple DeepSpeedConfigHF from Trainer * add LoggingLevel ctx manager; add new test * cleanup * add docs * Apply suggestions from code review Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * implemented suggested renames * formatter workaround Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>	2021-06-01 13:24:52 -07:00
Stas Bekman	79712e7e7a	[deepspeed] docs (#11940 ) * deepspeed docs * cleanup * cleanup	2021-06-01 09:21:21 -07:00
Patrick von Platen	996a315e76	Flax Generate (#11777 ) * fix_torch_device_generate_test * remove @ * add * indexing * correct a couple of tests * fix tests * add logits processor * finish top_k, top_p, temp * add docs * correct flax prng key default * improve generate * add generation docs * add docs * make style * revert model outputs change * make style * correct typo * fix tests * fix slow test * add raise * finish generation Co-authored-by: Patrick von Platen <patrick@huggingface.co>	2021-05-27 00:18:17 +01:00
Sylvain Gugger	cbbf49f644	Fix doc deployment	2021-05-13 10:34:14 -04:00
Lysandre Debut	39084ca663	Add the ImageClassificationPipeline (#11598 ) * Add the ImageClassificationPipeline * Code review Co-authored-by: patrickvonplaten <patrick.v.platen@gmail.com> * Have `load_image` at the module level Co-authored-by: patrickvonplaten <patrick.v.platen@gmail.com>	2021-05-07 08:08:40 -04:00
Stas Bekman	c065025c47	[trainer] document resume randomness (#11588 ) * document resume randomness * fix link * reword * fix * reword * style	2021-05-04 14:17:11 -07:00
Stas Bekman	4e7bf94e72	[DeepSpeed] fp32 support (#11499 ) * prep for deepspeed==0.3.16 * new version * too soon * support and test fp32 mode * troubleshooting doc start * workaround no longer needed * add fp32 doc * style * cleanup, add tf32 note * clarify * release was made	2021-04-30 12:51:48 -07:00
Nicolas Patry	db9dd09cf9	Adding `AutomaticSpeechRecognitionPipeline`. (#11337 ) * Adding `AutomaticSpeechRecognitionPipeline`. - Because we added everything to enable this pipeline, we probably should add it to `transformers`. - This PR tries to limit the scope and focuses only on the pipeline part (what should go in, and out). - The tests are very specific for S2T and Wav2vec2 to make sure both architectures are supported by the pipeline. We don't use the mixin for tests right now, because that requires more work in the `pipeline` function (will be done in a follow up PR). - Unsure about the "helper" function `ffmpeg_read`. It makes a lot of sense from a user perspective, it does not add any additional dependencies (as in hard dependency, because users can always use their own load mechanism). Meanwhile, it feels slightly clunky to have so much optional preprocessing. - The pipeline is not done to support streaming audio right now. Future work: - Add `automatic-speech-recognition` as a `task`. And add the FeatureExtractor.from_pretrained within `pipeline` function. - Add small models within tests - Add the Mixin to tests. - Make the logic between ForCTC vs ForConditionalGeneration better. * Update tests/test_pipelines_automatic_speech_recognition.py Co-authored-by: Lysandre Debut <lysandre@huggingface.co> * Adding docs + main import + type checking + LICENSE. * Doc style !. * Fixing TYPE_HINT. * Specifying waveform shape in the docs. * Adding asserts + specify in the documentation the shape of the input np.ndarray. * Update src/transformers/pipelines/automatic_speech_recognition.py Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com> * Adding require to tests + move the `feature_extractor` doc. Co-authored-by: Lysandre Debut <lysandre@huggingface.co> Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>	2021-04-30 11:54:08 +02:00
Hamel Husain	88ac60f7b5	update QuickTour docs to reflect model output object (#11462 ) * update docs to reflect model output object * run make style`	2021-04-26 22:18:37 -04:00
Stas Bekman	bc2571e61c	[Deepspeed] ZeRO-Infinity integration plus config revamp (#11418 ) * adding Z-inf * revamp config process * up version requirement * wip * massive rewrite * cleanup * cleanup * Apply suggestions from code review Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * consistent json commas * act on suggestions * leave this feature for 0.3.16 * style Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>	2021-04-26 10:40:32 -07:00
Sylvain Gugger	bf2e0cf70b	Trainer push to hub (#11328 ) * Initial support for upload to hub * push -> upload * Fixes + examples * Fix torchhub test * Torchhub test I hate you * push_model_to_hub -> push_to_hub * Apply mixin to other pretrained models * Remove ABC inheritance * Add tests * Typo * Run tests * Install git-lfs * Change approach * Add push_to_hub to all * Staging test suite * Typo * Maybe like this? * More deps * Cache * Adapt name * Quality * MOAR tests * Put it in testing_utils * Docs + torchhub last hope * Styling * Wrong method * Typos * Update src/transformers/file_utils.py Co-authored-by: Julien Chaumond <julien@huggingface.co> * Address review comments * Apply suggestions from code review Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com> Co-authored-by: Julien Chaumond <julien@huggingface.co> Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>	2021-04-23 09:17:37 -04:00
Sylvain Gugger	dabeb15292	Examples reorg (#11350 ) * Base move * Examples reorganization * Update references * Put back test data * Move conftest * More fixes * Move test data to test fixtures * Update path * Apply suggestions from code review Co-authored-by: Lysandre Debut <lysandre@huggingface.co> * Address review comments and clean Co-authored-by: Lysandre Debut <lysandre@huggingface.co>	2021-04-21 11:11:20 -04:00
Sylvain Gugger	f38cd4373f	Indent code block in the documentation (#11233 ) * Indent code block * Indent code blocks version 2 * Quality	2021-04-13 15:36:36 -04:00
Sylvain Gugger	3312e96bfb	Doc check: a bit of clean up (#11224 )	2021-04-13 12:14:25 -04:00
fghuman	0c6fcd3034	Added documentation for data collator. (#10941 ) * Added documentation for data collator. * Update docs/source/data_collator.rst Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Added documentation for data collator. * Added documentation for the data collator. * Merge branch 'doc_DataCollator' of C:\Users\mahii\PycharmProjects\transformers with conflicts. * Update documentation for the data collator. * Update documentation for the data collator. Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> Co-authored-by: Amna <A.A.Ahmad@student.tudelft.nl>	2021-04-12 11:59:46 -04:00
Stas Bekman	0311ba2153	typo (#11152 ) * typo * style	2021-04-08 19:47:31 -07:00
Stas Bekman	c2e0fd5283	[setup] make fairscale and deepspeed setup extras (#11151 ) * make fairscale and deepspeed setup extras * fix default * Apply suggestions from code review Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * no reason not to ask for the good version * update the CIs Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>	2021-04-08 15:46:54 -07:00
Stas Bekman	66446909b2	[tests] relocate core integration tests (#11146 ) * relocate core integration tests * add sys.path context manager * cleanup * try * try2 * fix path * doc * style * add dep * add 2 more deps	2021-04-08 13:13:17 -07:00
Stas Bekman	c6d664849b	[DeepSpeed] ZeRO Stage 3 (#10753 ) * synced gpus * fix * fix * need to use t5-small for quality tests * notes * complete merge * fix a disappearing std stream problem * start zero3 tests * wip * tune params * sorting out the pre-trained model loading * reworking generate loop wip * wip * style * fix tests * split the tests * refactor tests * wip * parameterized * fix * workout the resume from non-ds checkpoint pass + test * cleanup * remove no longer needed code * split getter/setter functions * complete the docs * suggestions * gpus and their compute capabilities link * Apply suggestions from code review Co-authored-by: Lysandre Debut <lysandre@huggingface.co> * style * remove invalid paramgd * automatically configure zero3 params that rely on hidden size * make _get_resized_embeddings zero3-aware * add test exercising resize_token_embeddings() * add docstring Co-authored-by: Lysandre Debut <lysandre@huggingface.co>	2021-04-08 09:53:01 -07:00
Amala Deshmukh	e1c02e018c	Add example for registering callbacks with trainers (#10928 ) * Add example for callback registry Resolves: #9036 * Update callback registry documentation * Added comments for other ways to register callback	2021-04-05 12:27:23 -04:00
Lysandre Debut	9f4e0c23d6	Documentation about loading a fast tokenizer within Transformers (#11029 ) * Documentation about loading a fast tokenizer within Transformers * Apply suggestions from code review Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * style Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>	2021-04-05 10:51:16 -04:00
Sylvain Gugger	b0595d33c1	Add ImageFeatureExtractionMixin (#10905 ) * Add ImageFeatureExtractionMixin * Add dummy vision objects * Add require_vision * Add tests * Fix test	2021-03-26 11:23:56 -04:00
Cheng Li	c83fbc5f2d	[Deepspeed] Allow HF optimizer and scheduler to be passed to deepspeed (#10464 ) * pass hf optimizer and scheduler to deepspeed if not specified in ds config * pass hf optimizer and scheduler to deepspeed if not specified in ds config * update * make init_deepspeed support config dict * fix docstring formatting * clean up trainer's comments * add new tests * fix type * composit argparse doesn't work * style * add a new test, rename others * document new functionality * complete tests, add docs * style * correct level * Apply suggestions from code review Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * add new methods to the doc * must tell DS we are using a non-native optimizer * add protection against cpu_offload + HF optimizer combo * fix the cli overrides * sync docs + tests * restore AdamW * better docs * need new version * no longer needed * remove outdate information * refactor duplicated code Co-authored-by: Stas Bekman <stas@stason.org> Co-authored-by: Stas Bekman <stas00@users.noreply.github.com> Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>	2021-03-16 15:51:09 -07:00
Théo Matussière	6f840990a7	split seq2seq script into summarization & translation (#10611 ) * split seq2seq script, update docs * needless diff * fix readme * remove test diff * s/summarization/translation Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * cr * fix arguments & better mbart/t5 refs * copyright Co-authored-by: Suraj Patil <surajp815@gmail.com> * reword readme Co-authored-by: Suraj Patil <surajp815@gmail.com> * s/summarization/translation * short script names * fix tests * fix isort, include mbart doc * delete old script, update tests * automate source prefix * automate source prefix for translation * s/translation/trans Co-authored-by: Stas Bekman <stas00@users.noreply.github.com> * fix script name (short version) * typos Co-authored-by: Stas Bekman <stas00@users.noreply.github.com> * exact parameter Co-authored-by: Stas Bekman <stas00@users.noreply.github.com> * remove superfluous source_prefix calls in docs * rename scripts & warn for source prefix * black * flake8 Co-authored-by: theo <theo@matussie.re> Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> Co-authored-by: Suraj Patil <surajp815@gmail.com> Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>	2021-03-15 09:11:42 -04:00
Stas Bekman	4c32f9f26e	AdamW is now supported by default (#9624 )	2021-03-12 13:40:07 -08:00
Sylvain Gugger	e8246f78f9	Add auto_wrap option in fairscale integration (#10673 ) * Add auto_wrap option in fairscale integration * Style	2021-03-12 07:50:20 -05:00
Sylvain Gugger	26a33cfd8c	Document Trainer limitation on custom models (#10635 )	2021-03-10 14:58:22 -05:00
Patrick von Platen	9a06b6b11b	[FeatureExtractorSavingUtils] Refactor PretrainedFeatureExtractor (#10594 ) * save first version * finish refactor * finish refactor * correct naming * correct naming * shorter names * Update src/transformers/feature_extraction_common_utils.py Co-authored-by: Lysandre Debut <lysandre@huggingface.co> * change name * finish Co-authored-by: Lysandre Debut <lysandre@huggingface.co>	2021-03-09 12:16:59 +03:00
lewtun	12b66215cf	Fix example of custom Trainer to reflect signature of compute_loss (#10537 )	2021-03-05 07:44:53 -05:00
Sylvain Gugger	948b730f97	Remove unsupported methods from ModelOutput doc (#10505 )	2021-03-03 14:55:18 -05:00
Sylvain Gugger	9d14be5c20	Add support for ZeRO-2/3 and ZeRO-offload in fairscale (#10354 ) * Ass support for ZeRO-2/3 and ZeRO-offload in fairscale * Quality * Rework from review comments * Add doc * Apply suggestions from code review Co-authored-by: Stas Bekman <stas00@users.noreply.github.com> * Address review comments Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>	2021-02-25 11:07:53 -05:00
Patrick von Platen	cb38ffcc5e	[PretrainedFeatureExtractor] + Wav2Vec2FeatureExtractor, Wav2Vec2Processor, Wav2Vec2Tokenizer (#10324 ) * push to show * small improvement * small improvement * Update src/transformers/feature_extraction_utils.py * Update src/transformers/feature_extraction_utils.py * implement base * add common tests * make all tests pass for wav2vec2 * make padding work & add more tests * finalize feature extractor utils * add call method to feature extraction * finalize feature processor * finish tokenizer * finish general processor design * finish tests * typo * remove bogus file * finish docstring * add docs * finish docs * small fix * correct docs * save intermediate * load changes * apply changes * apply changes to doc * change tests * apply surajs recommend * final changes * Apply suggestions from code review * fix typo * fix import * correct docstring	2021-02-25 17:42:46 +03:00
Stas Bekman	eab0afc19c	[Trainer] implement gradient_accumulation_steps support in DeepSpeed integration (#10310 ) * implement gradient_accumulation_steps support in DeepSpeed integration * typo * cleanup * cleanup	2021-02-22 11:15:59 -08:00
Stas Bekman	5da7c78ed8	update to new script; notebook notes (#10241 )	2021-02-17 15:58:08 -08:00
Stas Bekman	b54cb0bd82	[DeepSpeed in notebooks] Jupyter + Colab (#10130 ) * init devices/setup explicitly * docs + test * simplify * cleanup * cleanup * cleanup * correct the required dist setup * derive local_rank from env LOCAL_RANK	2021-02-11 14:02:05 -08:00
Stas Bekman	7c07a47dfb	[DeepSpeed docs] new information (#9610 ) * how to specify a specific gpu * new paper * expand on buffer sizes * style * where to find config examples * specific example * small updates	2021-02-09 22:16:20 -08:00
Lysandre Debut	78f4a0e7e5	Logging propagation (#10092 ) * Enable propagation by default * Document enable/disable default handler	2021-02-09 10:27:49 -05:00
Sylvain Gugger	45aaf5f7ab	A few fixes in the documentation (#10033 )	2021-02-08 05:02:01 -05:00
Sylvain Gugger	de38a6e4d2	Fix 9918 (#9932 ) * Initial work * Fix doc styler and other models	2021-02-02 05:22:20 -05:00
Stas Bekman	82498cbc37	[deepspeed doc] install issues + 1-gpu deployment (#9582 ) * [doc] install + 1-gpu deployment * Apply suggestions from code review Co-authored-by: Lysandre Debut <lysandre@huggingface.co> Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * improvements Co-authored-by: Lysandre Debut <lysandre@huggingface.co> Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>	2021-01-14 11:05:04 -08:00
Stas Bekman	2df34f4aba	[trainer] deepspeed integration (#9211 ) * deepspeed integration * style * add test * ds wants to do its own backward * fp16 assert * Update src/transformers/training_args.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * style * for clarity extract what args are being passed to deepspeed * introduce the concept of self.wrapped_model * s/self.wrapped_model/self.model_wrapped/ * complete transition to self.wrapped_model / self.model * fix * doc * give ds its own init * add custom overrides, handle bs correctly * fix test * clean up model_init logic, fix small bug * complete fix * collapse --deepspeed_config into --deepspeed * style * start adding doc notes * style * implement hf2ds optimizer and scheduler configuration remapping * oops * call get_num_training_steps absolutely when needed * workaround broken auto-formatter * deepspeed_config arg is no longer needed - fixed in deepspeed master * use hf's fp16 args in config * clean * start on the docs * rebase cleanup * finish up --fp16 * clarify the supported stages * big refactor thanks to discovering deepspeed.init_distributed * cleanup * revert fp16 part * add checkpoint-support * more init ds into integrations * extend docs * cleanup * unfix docs * clean up old code * imports * move docs * fix logic * make it clear which file it's referring to * document nodes/gpus * style * wrong format * style * deepspeed handles gradient clipping * easier to read * major doc rewrite * Apply suggestions from code review Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * docs * switch to AdamW optimizer * style * Apply suggestions from code review Co-authored-by: Lysandre Debut <lysandre@huggingface.co> * clarify doc Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> Co-authored-by: Lysandre Debut <lysandre@huggingface.co>	2021-01-12 19:05:18 -08:00
Sugeeth	314cca2842	Fix documentation links always pointing to master. (#9217 ) * Use extlinks to point hyperlink with the version of code * Point to version on release and master until then * Apply style * Correct links * Add missing backtick * Simple missing backtick after all. Co-authored-by: Raghavendra Sugeeth P S <raghav-5305@raghav-5305.csez.zohocorpin.com> Co-authored-by: Lysandre <lysandre.debut@reseau.eseo.fr>	2021-01-05 06:18:48 -05:00
Suraj Patil	88ef8893cd	Add caching mechanism to BERT, RoBERTa (#9183 ) * add past_key_values * add use_cache option * make mask before cutting ids * adjust position_ids according to past_key_values * flatten past_key_values * fix positional embeds * fix _reorder_cache * set use_cache to false when not decoder, fix attention mask init * add test for caching * add past_key_values for Roberta * fix position embeds * add caching test for roberta * add doc * make style * doc, fix attention mask, test * small fixes * adress patrick's comments * input_ids shouldn't start with pad token * use_cache only when decoder * make consistent with bert * make copies consistent * add use_cache to encoder * add past_key_values to tapas attention * apply suggestions from code review * make coppies consistent * add attn mask in tests * remove copied from longformer * apply suggestions from code review * fix bart test * nit * simplify model outputs * fix doc * fix output ordering	2020-12-23 23:01:32 +05:30
Sylvain Gugger	490b39e614	Seq2seq trainer (#9241 ) * Add label smoothing in Trainer * Add options for scheduler and Adafactor in Trainer * Put Seq2SeqTrainer in the main lib * Apply suggestions from code review Co-authored-by: Stas Bekman <stas00@users.noreply.github.com> Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com> * Address review comments and adapt scripts * Documentation * Move test not using script to tests folder Co-authored-by: Stas Bekman <stas00@users.noreply.github.com> Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>	2020-12-22 11:33:44 -05:00
Lysandre Debut	1c1a2ffbff	TableQuestionAnsweringPipeline (#9145 ) * AutoModelForTableQuestionAnswering * TableQuestionAnsweringPipeline * Apply suggestions from Patrick's code review Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com> * Sylvain and Patrick comments * Better PyTorch/TF error message * Add integration tests * Argument Handler naming Co-authored-by: patrickvonplaten <patrick.v.platen@gmail.com> * Fix docs to appease the documentation gods Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>	2020-12-16 12:31:50 -05:00
Patrick von Platen	640e6fe190	[Flax] Align FlaxBertForMaskedLM with BertForMaskedLM, implement from_pretrained, init (#9054 ) * save intermediate * save intermediate * save intermediate * correct flax bert model file * new module / model naming * make style * almost finish BERT * finish roberta * make fix-copies * delete keys file * last refactor * fixes in run_mlm_flax.py * remove pooled from run_mlm_flax.py` * fix gelu \| gelu_new * remove Module from inits * splits * dirty print * preventing warmup_steps == 0 * smaller splits * make fix-copies * dirty print * dirty print * initial_evaluation argument * declaration order fix * proper model initialization/loading * proper initialization * run_mlm_flax improvements: improper model inputs bugfix + automatic dataset splitting + tokenizers parallelism warning + avoiding warmup_steps=0 bug * removed tokenizers warning hack, fixed model re-initialization * reverted training_args.py changes * fix flax from pretrained * improve test in flax * apply sylvains tips * update init * make 0.3.0 compatible * revert tevens changes * revert tevens changes 2 * finalize revert * fix bug * add docs * add pretrained to init * Update src/transformers/modeling_flax_utils.py * fix copies * final improvements Co-authored-by: TevenLeScao <teven.lescao@gmail.com>	2020-12-16 13:03:32 +01:00
Sylvain Gugger	1310e1a758	Enforce all objects in the main init are documented (#9014 )	2020-12-10 11:57:12 -05:00
Sylvain Gugger	00aa9dbca2	Copyright (#8970 ) * Add copyright everywhere missing * Style	2020-12-07 18:36:34 -05:00
Colin Brochtrup	8ffc01a76a	Add early stopping callback to pytorch trainer (#8581 ) * Add early stopping patience and minimum threshold metric must improve to prevent early stopping to pytorch trainer * Add early stopping test * Set patience counter to 0 if best metric not defined yet * Make early stopping a callback. Add callback event for updating the best metric for early stopping callback to trigger on. * Run make style * make funciton name sensible * Improve new argument docstring wording and hope that flakey CI test passes. * Use on_evaluation callback instead of custom. Remove some debug printing * Move early stopping arguments and state into early stopping callback * Run make style * Remove old code * Fix docs formatting. make style went rogue on me. * Remove copied attributes and fix variable * Add assertions on training arguments instead of mutating them. Move comment out of public docs. * Make separate test for early stopping callback. Add test of invalid arguments. * Run make style... I remembered before CI this time! * appease flake8 * Add EarlyStoppingCallback to callback docs * Make docstring EarlyStoppingCallabck match other callbacks. * Fix typo in docs	2020-11-23 17:25:35 -05:00
Chengxi Guo	d65e0bfea3	Fix doc bug (#8500 ) * fix doc bug Signed-off-by: mymusise <mymusise1@gmail.com> * fix example bug Signed-off-by: mymusise <mymusise1@gmail.com>	2020-11-12 11:47:23 -05:00
Yossi Synett	bc0d26d1de	[All Seq2Seq model + CLM models that can be used with EncoderDecoder] Add cross-attention weights to outputs (#8071 ) * Output cross-attention with decoder attention output * Update src/transformers/modeling_bert.py * add cross-attention for t5 and bart as well * fix tests * correct typo in docs * add sylvains and sams comments * correct typo Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>	2020-11-06 19:34:48 +01:00
Patrick von Platen	a1bbcf3f6c	Refactoring the generate() function (#6949 ) * first draft * show design proposition for new generate method * up * make better readable * make first version * gpt2 tests pass * make beam search for gpt2 work * add first encoder-decoder code * delete typo * make t5 work * save indermediate * make bart work with beam search * finish beam search bart / t5 * add default kwargs * make more tests pass * fix no bad words sampler * some fixes and tests for all distribution processors * fix test * fix rag slow tests * merge to master * add nograd to generate * make all slow tests pass * speed up generate * fix edge case bug * small fix * correct typo * add type hints and docstrings * fix typos in tests * add beam search tests * add tests for beam scorer * fix test rag * finish beam search tests * move generation tests in seperate file * fix generation tests * more tests * add aggressive generation tests * fix tests * add gpt2 sample test * add more docstring * add more docs * finish doc strings * apply some more of sylvains and sams comments * fix some typos * make fix copies * apply lysandres and sylvains comments * final corrections on examples * small fix for reformer	2020-11-03 16:04:22 +01:00
Davide Fiocco	995006eabb	Add AzureML in integrations via dedicated callback (#8062 ) * first attempt to add AzureML callbacks * func arg fix * var name fix, but still won't fix error... * fixing as in https://discuss.huggingface.co/t/how-to-integrate-an-azuremlcallback-for-logging-in-azure/1713/2 * Avoid lint check of azureml import * black compliance * Make isort happy * Fix point typo in docs * Add AzureML to Callbacks docs * Attempt to make sphinx happy * Format callback docs * Make documentation style happy * Make docs compliant to style Co-authored-by: Davide Fiocco <davide.fiocco@frontiersin.net>	2020-10-27 14:21:54 -04:00
Sylvain Gugger	08f534d2da	Doc styling (#8067 ) * Important files * Styling them all * Revert "Styling them all" This reverts commit `7d029395fd`. * Syling them for realsies * Fix syntax error * Fix benchmark_utils * More fixes * Fix modeling auto and script * Remove new line * Fixes * More fixes * Fix more files * Style * Add FSMT * More fixes * More fixes * More fixes * More fixes * Fixes * More fixes * More fixes * Last fixes * Make sphinx happy	2020-10-26 18:26:02 -04:00
Sylvain Gugger	04a17f8550	Doc fixes in preparation for the docstyle PR (#8061 ) * Fixes in preparation for doc styling * More fixes * Better syntax * Fixes * Style * More fixes * More fixes	2020-10-26 15:01:09 -04:00
noise-field	c48b16b8da	Mlflow integration callback (#8016 ) * Add MLflow integration class Add integration code for MLflow in integrations.py along with the code that checks that MLflow is installed. * Add MLflowCallback import Add import of MLflowCallback in trainer.py * Handle model argument Allow the callback to handle model argument and store model config items as hyperparameters. * Log parameters to MLflow in batches MLflow cannot log more than a hundred parameters at once. Code added to split the parameters into batches of 100 items and log the batches one by one. * Fix style * Add docs on MLflow callback * Fix issue with unfinished runs The "fluent" api used in MLflow integration allows only one run to be active at any given moment. If the Trainer is disposed off and a new one is created, but the training is not finished, it will refuse to log the results when the next trainer is created.	2020-10-26 09:41:58 -04:00
Tiger	7e73c12805	fixed lots of typos. (#7758 )	2020-10-13 10:00:20 -04:00
Sylvain Gugger	08ba4b4902	Trainer callbacks (#7596 ) * Initial callback proposal * Finish various callbacks * Post-rebase conflicts * Fix tests * Don't use something that's not set * Documentation * Remove unwanted print. * Document all models can work * Add tests + small fixes * Update docs/source/internal/trainer_utils.rst Co-authored-by: Lysandre Debut <lysandre@huggingface.co> * Address review comments * Fix TF tests * Real fix this time * This one should work * Fix typo * Really fix typo Co-authored-by: Lysandre Debut <lysandre@huggingface.co>	2020-10-07 10:50:21 -04:00
Lysandre Debut	8d3bb781ee	Formatter (#7368 ) * Formatter * Docs	2020-09-24 10:59:21 -04:00
Sylvain Gugger	3323146e90	Models doc (#7345 ) * Clean up model documentation * Formatting * Preparation work * Long lines * Main work on rst files * Cleanup all config files * Syntax fix * Clean all tokenizers * Work on first models * Models beginning * FaluBERT * All PyTorch models * All models * Long lines again * Fixes * More fixes * Update docs/source/model_doc/bert.rst Co-authored-by: Lysandre Debut <lysandre@huggingface.co> * Update docs/source/model_doc/electra.rst Co-authored-by: Lysandre Debut <lysandre@huggingface.co> * Last fixes Co-authored-by: Lysandre Debut <lysandre@huggingface.co>	2020-09-23 13:20:45 -04:00
Sylvain Gugger	4cbd50e611	Compute loss method (#7074 )	2020-09-11 12:06:31 -04:00
Sylvain Gugger	e841b75dec	Automate the lists in auto-xxx docs (#7061 ) * More readable dict * More nlp -> datasets * Revert "More nlp -> datasets" This reverts commit `3cd1883d22`. * Automate the lists in auto-xxx docs * More readable dict * Revert "More nlp -> datasets" This reverts commit `3cd1883d22`. * Automate the lists in auto-xxx docs * nlp -> datasets * Fix new key	2020-09-11 10:42:09 -04:00
Stas Bekman	d0963486c1	adding TRANSFORMERS_VERBOSITY env var (#6961 ) * introduce TRANSFORMERS_VERBOSITY env var + test + test helpers * cleanup * remove helper function	2020-09-09 04:08:01 -04:00
Suraj Patil	4230d30f77	[pipelines] Text2TextGenerationPipeline (#6744 ) * add Text2TextGenerationPipeline * remove max length warning * remove comments * remove input_length * fix typo * add tests * use TFAutoModelForSeq2SeqLM * doc * typo * add the doc below TextGenerationPipeline * doc nit * style * delete comment	2020-09-02 07:34:35 -04:00
Sylvain Gugger	d5f1ffa0d8	Logging doc (#6852 ) * Add logging doc * Foamtting * Update docs/source/main_classes/logging.rst * Update src/transformers/utils/logging.py Co-authored-by: Lysandre Debut <lysandre@huggingface.co>	2020-09-01 03:16:34 -04:00
Lysandre Debut	41aa2b4ef1	Adafactor docs (#6765 )	2020-08-27 05:16:50 -04:00
Sylvain Gugger	895ed8f451	Generation doc (#6470 ) * Generation doc * MBartForConditionalGeneration (#6441) * add MBartForConditionalGeneration * style * rebase and fixes * add mbart test in TEST_FILES_WITH_NO_COMMON_TESTS * fix docs * don't ignore mbart * doc * fix mbart fairseq link * put mbart before bart * apply doc suggestions * Use hash to clean the test dirs (#6475) * Use hash to clean the test dirs * Use hash to clean the test dirs * Use hash to clean the test dirs * fix * [EncoderDecoder] Add Cross Attention for GPT2 (#6415) * add cross attention layers for gpt2 * make gpt2 cross attention work * finish bert2gpt2 * add explicit comments * remove attention mask since not yet supported * revert attn mask in pipeline * Update src/transformers/modeling_gpt2.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update src/transformers/modeling_encoder_decoder.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Sort unique_no_split_tokens to make it deterministic (#6461) * change unique_no_split_tokens's type to set * use sorted list instead of set * style * Import accuracy_score (#6480) * Apply suggestions from code review Co-authored-by: Lysandre Debut <lysandre@huggingface.co> * Address comments * Styling * Generation doc * Apply suggestions from code review Co-authored-by: Lysandre Debut <lysandre@huggingface.co> * Address comments * Styling Co-authored-by: Suraj Patil <surajp815@gmail.com> Co-authored-by: Kevin Canwen Xu <canwenxu@126.com> Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com> Co-authored-by: Quentin Lhoest <42851186+lhoestq@users.noreply.github.com> Co-authored-by: gijswijnholds <gijswijnholds@gmail.com> Co-authored-by: Lysandre Debut <lysandre@huggingface.co>	2020-08-14 09:46:39 -04:00
Joe Davison	972535ea74	fix zero shot pipeline docs (#6245 )	2020-08-04 16:37:49 -04:00
Sylvain Gugger	e4920c92d6	Doc pipelines (#6175 ) * Init work on pipelines doc * Work in progress * Work in progress * Doc pipelines * Rm unwanted default * Apply suggestions from code review Lysandre comments Co-authored-by: Lysandre Debut <lysandre@huggingface.co> Co-authored-by: Lysandre Debut <lysandre@huggingface.co>	2020-08-03 11:44:46 -04:00
Sylvain Gugger	86caab1e0b	Harmonize both Trainers API (#6157 ) * Harmonize both Trainers API * Fix test * main_prcess -> process_zero	2020-07-31 09:43:23 -04:00
Sylvain Gugger	f3065abdb8	Doc tokenizer (#6110 ) * Start doc tokenizers * Tokenizer documentation * Start doc tokenizers * Tokenizer documentation * Formatting after rebase * Formatting after merge * Update docs/source/main_classes/tokenizer.rst Co-authored-by: Lysandre Debut <lysandre@huggingface.co> * Address comment * Update src/transformers/tokenization_utils_base.py Co-authored-by: Thomas Wolf <thomwolf@users.noreply.github.com> * Address Thom's comments Co-authored-by: Lysandre Debut <lysandre@huggingface.co> Co-authored-by: Thomas Wolf <thomwolf@users.noreply.github.com>	2020-07-30 14:51:19 -04:00
guillaume-be	e642c78908	Addition of a DialoguePipeline (#5516 ) * initial commit for pipeline implementation Addition of input processing and history concatenation * Conversation pipeline tested and working for single & multiple conversation inputs * Added docstrings for dialogue pipeline * Addition of dialogue pipeline integration tests * Delete test_t5.py * Fixed max code length * Updated styling * Fixed test broken by formatting tools * Removed unused import * Added unit test for DialoguePipeline * Fixed Tensorflow compatibility * Fixed multi-framework support using framework flag * - Fixed docstring - Added `min_length_for_response` as an initialization parameter - Renamed `args` to `conversations`, `conversations` being a `Conversation` or a `List[Conversation]` - Updated truncation to truncate entire segments of conversations, instead of cutting in the middle of a user/bot input - renamed pipeline name from dialogue to conversational - removed hardcoded default value of 1000 and use config.max_length instead - added `append_response` and `set_history` method to the Conversation class to avoid direct fields mutation - fixed bug in history truncation method * - Updated ConversationalPipeline to accept only active conversations (otherwise a ValueError is raised) * - Simplified input tensor conversion * - Updated attention_mask value for Tensorflow compatibility * - Updated last dialogue reference to conversational & fixed integration tests * Fixed conflict with master * Updates following review comments * Updated formatting * Added Conversation and ConversationalPipeline to the library __init__, addition of docstrings for Conversation, added both to the docs * Update src/transformers/pipelines.py Updated docsting following review Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>	2020-07-30 14:11:39 -04:00
Sylvain Gugger	3b44aa935a	Model utils doc (#6005 ) * Document TF modeling utils * Document all model utils	2020-07-24 09:16:28 -04:00
Sylvain Gugger	33d7506ea1	Update doc of the model page (#5985 )	2020-07-22 18:14:57 -04:00
Sylvain Gugger	7fad617dc1	Document model outputs (#5673 ) * Document model outputs * Update docs/source/main_classes/output.rst Co-authored-by: Lysandre Debut <lysandre@huggingface.co> Co-authored-by: Lysandre Debut <lysandre@huggingface.co>	2020-07-10 17:31:02 -04:00
Sylvain Gugger	b2747af543	Improvements to PretrainedConfig documentation (#5642 ) * Update PretrainedConfig doc * Formatting * Small fixes * Forgotten args and more cleanup	2020-07-10 10:31:47 -04:00
Sylvain Gugger	4ade7491f4	Fix examples titles and optimization doc page (#5408 )	2020-07-01 08:11:25 -04:00
Sylvain Gugger	87716a6d07	Documentation for the Trainer API (#5383 ) * Documentation for the Trainer API * Address review comments * Address comments	2020-06-30 11:43:43 -04:00
Thomas Wolf	601d4d699c	[tokenizers] Updates data processors, docstring, examples and model cards to the new API (#5308 ) * remove references to old API in docstring - update data processors * style * fix tests - better type checking error messages * better type checking * include awesome fix by @LysandreJik for #5310 * updated doc and examples	2020-06-26 19:48:14 +02:00
Sylvain Gugger	417e492f1e	Quick tour (#5145 ) * Quicktour part 1 * Update * All done * Typos Co-authored-by: Thomas Wolf <thomwolf@users.noreply.github.com> * Address comments in quick tour * Update docs/source/quicktour.rst Co-authored-by: Lysandre Debut <lysandre@huggingface.co> * Update from feedback Co-authored-by: Thomas Wolf <thomwolf@users.noreply.github.com> Co-authored-by: Lysandre Debut <lysandre@huggingface.co>	2020-06-22 16:08:09 -04:00
Sylvain Gugger	011cc0be51	Fix all sphynx warnings (#5068 )	2020-06-16 16:50:02 -04:00
Anthony MOI	36434220fc	[HUGE] Refactoring tokenizers backend - padding - truncation - pre-tokenized pipeline - fast tokenizers - tests (#4510 ) * Use tokenizers pre-tokenized pipeline * failing pretrokenized test * Fix is_pretokenized in python * add pretokenized tests * style and quality * better tests for batched pretokenized inputs * tokenizers clean up - new padding_strategy - split the files * [HUGE] refactoring tokenizers - padding - truncation - tests * style and quality * bump up requied tokenizers version to 0.8.0-rc1 * switched padding/truncation API - simpler better backward compat * updating tests for custom tokenizers * style and quality - tests on pad * fix QA pipeline * fix backward compatibility for max_length only * style and quality * Various cleans up - add verbose * fix tests * update docstrings * Fix tests * Docs reformatted * __call__ method documented Co-authored-by: Thomas Wolf <thomwolf@users.noreply.github.com> Co-authored-by: Lysandre <lysandre.debut@reseau.eseo.fr>	2020-06-15 17:12:51 -04:00
Julien Chaumond	99207bd112	Pipelines: miscellanea of QoL improvements and small features... (#4632 ) * [hf_api] Attach all unknown attributes for future-proof compatibility * [Pipeline] NerPipeline is really a TokenClassificationPipeline * modelcard.py: I don't think we need to force the download * Remove config, tokenizer from SUPPORTED_TASKS as we're moving to one model = one weight + one tokenizer * FillMaskPipeline: also output token in string form * TextClassificationPipeline: option to return all scores, not just the argmax * Update docs/source/main_classes/pipelines.rst	2020-06-03 03:51:31 -04:00
Julien Chaumond	c99fe0386b	[doc] Fix broken links + remove crazy big notebook	2020-05-07 18:44:18 -04:00

176 Commits