Commit Graph

6600 Commits

Author | SHA1 | Message | Date
Lysandre Debut
245cdb469d
Fix barthez tokenizer (#9562) 2021-01-13 06:24:10 -05:00
Julien Chaumond
247a7b2029
Doc: Update pretrained_models wording (#9545)
* Update pretrained_models.rst

To clarify things, see for instance this tweet: https://twitter.com/RTomMcCoy/status/1349094111505211395

* format
2021-01-13 05:58:05 -05:00
Suraj Patil
69ed36063a
fix BlenderbotSmallTokenizer (#9538)
* add model_input_names

* fix test
2021-01-13 10:53:43 +05:30
Stas Bekman
2df34f4aba
[trainer] deepspeed integration (#9211)
* deepspeed integration

* style

* add test

* ds wants to do its own backward

* fp16 assert

* Update src/transformers/training_args.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* style

* for clarity extract what args are being passed to deepspeed

* introduce the concept of self.wrapped_model

* s/self.wrapped_model/self.model_wrapped/

* complete transition to self.wrapped_model / self.model

* fix

* doc

* give ds its own init

* add custom overrides, handle bs correctly

* fix test

* clean up model_init logic, fix small bug

* complete fix

* collapse --deepspeed_config into --deepspeed

* style

* start adding doc notes

* style

* implement hf2ds optimizer and scheduler configuration remapping

* oops

* call get_num_training_steps only when absolutely needed

* workaround broken auto-formatter

* deepspeed_config arg is no longer needed - fixed in deepspeed master

* use hf's fp16 args in config

* clean

* start on the docs

* rebase cleanup

* finish up --fp16

* clarify the supported stages

* big refactor thanks to discovering deepspeed.init_distributed

* cleanup

* revert fp16 part

* add checkpoint-support

* move init ds into integrations

* extend docs

* cleanup

* unfix docs

* clean up old code

* imports

* move docs

* fix logic

* make it clear which file it's referring to

* document nodes/gpus

* style

* wrong format

* style

* deepspeed handles gradient clipping

* easier to read

* major doc rewrite

* Apply suggestions from code review

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* docs

* switch to AdamW optimizer

* style

* Apply suggestions from code review

Co-authored-by: Lysandre Debut <lysandre@huggingface.co>

* clarify doc

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
2021-01-12 19:05:18 -08:00
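
The bullets above trace a sizeable API: a single --deepspeed flag (after --deepspeed_config was collapsed into it) pointing at a DeepSpeed JSON config, with HF's own fp16 args mirrored into that config. A minimal sketch of the resulting usage, assuming the `deepspeed` TrainingArguments field added here takes the config path; the config values are illustrative only:

```python
import json

# Illustrative config; per the bullets, HF's --fp16 args are mirrored into the
# DeepSpeed fp16 section, and the PR clarifies which ZeRO stages are supported.
ds_config = {
    "fp16": {"enabled": True},
    "zero_optimization": {"stage": 2},
}
with open("ds_config.json", "w") as f:
    json.dump(ds_config, f, indent=2)

from transformers import TrainingArguments

# --deepspeed_config was collapsed into --deepspeed, so one field carries the path:
args = TrainingArguments(output_dir="out", deepspeed="ds_config.json")
# Trainer(model=model, args=args, ...).train() then defers optimizer, scheduler,
# fp16 and gradient clipping to the DeepSpeed engine.
```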
Sylvain Gugger
5f6721032a
Use the right version of tokenizers (#9550)
* Use the right version of tokenizers

* Try another way

* Try another way

* Deps are installed from there...

* Deps are installed from there...

* Revert last

* remove needless comment
2021-01-12 18:55:45 -05:00
Sylvain Gugger
063d8d27f4
Refactor prepare_seq2seq_batch (#9524)
* Add target contextmanager and rework prepare_seq2seq_batch

* Fix tests, treat BART and Barthez

* Add last tokenizers

* Fix test

* Set src token before calling the superclass

* Remove special behavior for T5

* Remove needless imports

* Remove needless asserts
2021-01-12 18:19:38 -05:00
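
A hedged sketch of the reworked `prepare_seq2seq_batch` entry point (the checkpoint name is illustrative, and keyword support can vary per tokenizer):

```python
from transformers import BartTokenizer

tokenizer = BartTokenizer.from_pretrained("facebook/bart-base")
batch = tokenizer.prepare_seq2seq_batch(
    src_texts=["UN Chief says there is no military solution in Syria"],
    tgt_texts=["Le chef de l'ONU dit qu'il n'y a pas de solution militaire"],
    max_length=64,
    return_tensors="pt",
)
# input_ids / attention_mask encode the source; labels encode the target
print(sorted(batch.keys()))
```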
Sylvain Gugger
e6ecef711e
Revert, it was not the issue. 2021-01-12 18:00:22 -05:00
Sylvain Gugger
250f27f207
Fix tokenizers install for now 2021-01-12 17:50:27 -05:00
Lysandre Debut
dfbf0f5598
topk -> top_k (#9541) 2021-01-12 16:21:29 -05:00
Lysandre Debut
a1100fac67
LayoutLM Config (#9539) 2021-01-12 10:03:50 -05:00
NielsRogge
e45eba3b1c
Improve LayoutLM (#9476)
* Add LayoutLMForSequenceClassification and integration tests

Improve docs

Add LayoutLM notebook to list of community notebooks

* Make style & quality

* Address comments by @sgugger, @patrickvonplaten and @LysandreJik

* Fix rebase with master

* Reformat in one line

* Improve code examples as requested by @patrickvonplaten

Co-authored-by: Lysandre <lysandre.debut@reseau.eseo.fr>
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
2021-01-12 09:26:32 -05:00
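
Since the PR adds LayoutLMForSequenceClassification, a small usage sketch may help; the dummy `bbox` tensor (one 0-1000-normalized box per token) and checkpoint name are illustrative:

```python
import torch
from transformers import LayoutLMForSequenceClassification, LayoutLMTokenizer

tokenizer = LayoutLMTokenizer.from_pretrained("microsoft/layoutlm-base-uncased")
model = LayoutLMForSequenceClassification.from_pretrained(
    "microsoft/layoutlm-base-uncased", num_labels=2
)

encoding = tokenizer("Hello world", return_tensors="pt")
seq_len = encoding["input_ids"].shape[1]
bbox = torch.tensor([[[0, 0, 1000, 1000]] * seq_len])  # one dummy box per token
outputs = model(**encoding, bbox=bbox, labels=torch.tensor([1]))
print(outputs.loss, outputs.logits.shape)
```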
Suraj Patil
ccd1923f46
[T5] enable T5 fp16 (#9487)
* fix t5 fp16
2021-01-12 17:12:33 +05:30
Patrick von Platen
2aa9c2f204
fix blenderbot tok (#9532) 2021-01-12 05:53:32 -05:00
Lysandre Debut
406cbf58b2
Shouldn't stale issues/PRs with feature request label (#9511) 2021-01-12 04:49:15 -05:00
Simon Brandeis
3b67c5abb0
Update 'Develop on Windows' guidelines (#9519) 2021-01-12 04:15:16 -05:00
Patrick von Platen
a051d8928a
[ProphetNet] Fix naming and wrong config (#9514)
* fix naming issues

* better names
2021-01-12 04:10:05 -05:00
Patrick von Platen
7f28613213
[TFBart] Split TF-Bart (#9497)
* make templates ready

* make add_new_model_command_ready

* finish tf bart

* prepare tf mbart

* finish tf bart

* add tf mbart

* add marian

* prep pegasus

* add tf pegasus

* push blenderbot tf

* add blenderbot

* add blenderbot small

* clean-up

* make fix copy

* define blend bot tok

* fix

* up

* make style

* add to docs

* add copy statements

* overwrite changes

* improve

* fix docs

* finish

* fix last slow test

* fix missing git conflict line

* fix blenderbot

* up

* fix blenderbot small

* load changes

* finish copied from

* upload fix
2021-01-12 02:06:32 +01:00
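
Per the PR description, each of these architectures gains its own TF class instead of sharing TFBart; a quick sketch of the new imports (the checkpoint name is illustrative):

```python
from transformers import (
    TFBartForConditionalGeneration,
    TFBlenderbotForConditionalGeneration,
    TFBlenderbotSmallForConditionalGeneration,
    TFMarianMTModel,
    TFMBartForConditionalGeneration,
    TFPegasusForConditionalGeneration,
)

# Each architecture now loads through its own class rather than a shared TFBart:
model = TFMarianMTModel.from_pretrained("Helsinki-NLP/opus-mt-en-de")
```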
Stas Bekman
0ecbb69806
[make docs] parallel build (#9522)
After experimenting with different numbers of workers (https://github.com/huggingface/transformers/issues/9496#issuecomment-758145868), 4-5 workers seems to be optimal - let's go with 4, as surely we wouldn't find a CPU with fewer cores these days.

Fixes part of https://github.com/huggingface/transformers/issues/9496

@sgugger
2021-01-11 13:00:08 -08:00
Stas Bekman
e6f211cade
[trainer] round numbers in trainer state (#9491)
* round numbers

* style

* round only on logging
2021-01-11 10:17:49 -08:00
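
The "round only on logging" idea in miniature; the helper below is hypothetical, not the Trainer's actual code:

```python
def rounded_for_logging(metrics: dict, ndigits: int = 4) -> dict:
    """Round float values in copies destined for logs; state keeps full precision."""
    return {k: round(v, ndigits) if isinstance(v, float) else v for k, v in metrics.items()}

print(rounded_for_logging({"loss": 0.123456789, "epoch": 1.3333333333, "step": 10}))
# {'loss': 0.1235, 'epoch': 1.3333, 'step': 10}
```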
Sylvain Gugger
01a1684078
Make doc styler behave properly on Windows (#9516) 2021-01-11 10:25:24 -05:00
Sylvain Gugger
6009668c63 Add link to forums thread 2021-01-11 10:00:59 -05:00
Julien Plu
ba702966ba
Fix cardinality (#9505) 2021-01-11 09:42:19 -05:00
Stas Bekman
33b7422839
[trainer] remove --model_parallel (#9451)
* fix bad merge - dropped code

* remove --model_parallel

* Deal with TrainingArguments

* Use a private attr and fix batch sizes

* fix _n_gpu

* add is_parallel helper wrapper

* fix attribute

* introduce a new attribute is_model_parallel

* docs

* docs

* Put back init False and rearrange doc

* Ignore non-init args in HFArgumentParser

Co-authored-by: Sylvain Gugger <sylvain.gugger@gmail.com>
2021-01-11 09:39:28 -05:00
Stas Bekman
6f63501383
[doc] How To Request Support document stub (#9288)
* How To Request Support document stub

* integrate suggestions

* Apply suggestions from code review

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* small corrections

* expand on how to search for issues with examples

* address issues

* Update ISSUES.md

Co-authored-by: Lysandre Debut <lysandre@huggingface.co>

* patrick's suggestion

* patrick's suggestion

* small fix

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
2021-01-11 09:23:51 -05:00
Nicolas Patry
d20e9c7299
Enable TruncationStrategy override for pipelines (#9432)
* Enable TruncationStrategy override for pipelines

* Update isort.

* Fixing test

* Fixing text_generation pipeline.

* Using the same DummyTok as the other PR for easier merge later.

* Some more import guards.

* Remove bogus file.

* Do not pass `generate_kwargs` to `_parse_and_tokenize`.
@patrickvonplaten

* Removed DummyTok.

* Doc quality.
2021-01-11 09:23:28 -05:00
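
A hedged sketch of the override this enables, assuming the pipeline `__call__` forwards a `truncation` argument as the PR title suggests (task and strategy choice are illustrative):

```python
from transformers import pipeline
from transformers.tokenization_utils_base import TruncationStrategy

summarizer = pipeline("summarization")
long_text = "The quick brown fox jumps over the lazy dog. " * 500  # exceeds the model's max length
summary = summarizer(long_text, truncation=TruncationStrategy.LONGEST_FIRST)
print(summary[0]["summary_text"])
```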
Sylvain Gugger
8d25df2c7a
Make doc styler detect lists on rst (#9488) 2021-01-11 08:53:41 -05:00
Aakash Tripathi
5a442a8db1
New Updated DistilGPT-2 Finetuning and Generation (#9494)
https://github.com/huggingface/transformers/pull/3177
2021-01-11 14:34:39 +01:00
Patrick von Platen
6c8ec2a931
fix tf led pt test (#9513) 2021-01-11 14:14:48 +01:00
Julien Plu
1e3c362235
Fix template (#9512) 2021-01-11 08:03:28 -05:00
Lysandre Debut
d415882b41
Remove tolerance + drop_rows_to_fit by default (#9507)
* Remove tolerance + drop_rows_to_fit by default

* remove drop_rows_to_fit
2021-01-11 08:02:41 -05:00
Julien Plu
1243ee7d0c
Full rework of the TF input/output embeddings and bias resizing (#9193)
* Start rework resizing

* Rework bias/decoder resizing

* Full resizing rework

* Full resizing rework

* Start to update the models with the new approach

* Finish to update the models

* Update all the tests

* Update the template

* Fix tests

* Fix tests

* Test a new approach

* Refactoring

* Refactoring

* Refactoring

* New rework

* Rework BART

* Rework bert+blenderbot

* Rework CTRL

* Rework Distilbert

* Rework DPR

* Rework Electra

* Rework Flaubert

* Rework Funnel

* Rework GPT2

* Rework Longformer

* Rework Lxmert

* Rework marian+mbart

* Rework mobilebert

* Rework mpnet

* Rework openai

* Rework pegasus

* Rework Roberta

* Rework T5

* Rework xlm+xlnet

* Rework template

* Fix TFT5EncoderOnly + DPRs

* Restore previous methods

* Fix Funnel

* Fix CTRL and TransfoXL

* Apply style

* Apply Sylvain's comments

* Restore a test in DPR

* Address the comments

* Fix bug

* Apply style

* remove unused import

* Fix test

* Forgot a method

* missing test

* Trigger CI

* naming update

* Rebase

* Trigger CI
2021-01-11 06:27:28 -05:00
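
The user-facing entry point for all of this resizing work stays the same; a minimal sketch (checkpoint name illustrative):

```python
from transformers import BertTokenizer, TFBertForMaskedLM

tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
model = TFBertForMaskedLM.from_pretrained("bert-base-uncased")

tokenizer.add_tokens(["<new_tok_1>", "<new_tok_2>"])
# One call resizes the input embeddings and, for LM heads, the output
# embeddings and bias together, which is the surface this PR reworks underneath.
model.resize_token_embeddings(len(tokenizer))
```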
Julien Plu
cf416764f4
Fix template (#9504) 2021-01-11 05:21:25 -05:00
Richard Liaw
09926c8e86
fix-template (#9499)
Signed-off-by: Richard Liaw <rliaw@berkeley.edu>
2021-01-10 20:34:17 -05:00
Julien Plu
4f7022d68d
Reformat (#9482) 2021-01-10 15:10:15 +01:00
Nicolas Patry
96f1f74aaf
Fixing tests. It seems master changed something in the warnings. (#9483)
Trying to keep the warning tests for now. They should be discarded if they become
too hard to maintain.
2021-01-10 15:08:20 +01:00
Boris Dayma
1c19b423bf
fix(wandb): fix config (#9489) 2021-01-08 14:32:02 -05:00
Nicolas Patry
02e05fb0a5
Making Conversation possible to create directly a full conversation (#9434)
* Cleaning up conversation tests.

* Adding tests that don't require downloading models + conversation can be fully created from static state.

* Making tests non flaky (by fixing generation length)

* Bumping isort version.

* Doc cleanup.

* Remove unused test in this PR.

* Torch import guard for TF.

* Missing torch guard.

* Small mistake in doc.

* Actually uses the `_history` and `_index` cache.

+ remove dead enumerate
+ improve warning message.

* Update src/transformers/pipelines/conversational.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update src/transformers/pipelines/conversational.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update src/transformers/pipelines/conversational.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Adding comments and cleaner code to address history copy.

* Improving pipeline name in tests.

* Change tokenizer to a real one (still created at runtime with no external dependency)

* Simplify DummyTok, reverse changes on tokenization.

* Removing DummyTok.

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2021-01-08 14:33:25 +01:00
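
A sketch of creating a Conversation pre-filled from static state, as this PR enables; the keyword names follow the pipeline docs of that era and may differ:

```python
from transformers import Conversation, pipeline

conversation = Conversation(
    "Any other Miyazaki films I should watch?",
    past_user_inputs=["What's your favorite movie?"],
    generated_responses=["My Neighbor Totoro"],
)
chatbot = pipeline("conversational")
result = chatbot(conversation)
print(result.generated_responses[-1])
```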
Julien Plu
4fbcf8ea49
Fix TF input for np.ndarray (#9294)
* Fix input for np.ndarray

* add a test

* add a test

* Add a test

* Apply style

* Fix test
2021-01-08 08:23:29 -05:00
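
What the fix covers, in miniature: feeding a plain np.ndarray to a TF model without converting to tf.Tensor first (checkpoint name illustrative):

```python
import numpy as np
from transformers import TFDistilBertModel

model = TFDistilBertModel.from_pretrained("distilbert-base-uncased")
input_ids = np.array([[101, 7592, 2088, 102]])  # pre-tokenized ids as a NumPy array
outputs = model(input_ids)  # no manual tf.convert_to_tensor needed after the fix
print(outputs.last_hidden_state.shape)
```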
Thomas Tanon
e34e45536f
Makes HfArgumentParser compatible with Python 3.9 (#9479)
Python 3.9 changed the string serialization of `typing.Optional`.
For example, `str(typing.Optional[str])` is
`typing.Union[str, NoneType]` in Python 3.8 and
`typing.Optional[str]` in Python 3.9.
2021-01-08 08:10:44 -05:00
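
The incompatibility in a nutshell, plus a version-stable alternative to matching on `str(...)` (an illustration only, not the parser's actual code):

```python
import typing

# What changed: the string form of Optional differs across versions.
# Python 3.8: str(typing.Optional[str]) == 'typing.Union[str, NoneType]'
# Python 3.9: str(typing.Optional[str]) == 'typing.Optional[str]'
print(str(typing.Optional[str]))

# A version-stable way to detect Optional[X] without string matching:
def is_optional(tp) -> bool:
    return typing.get_origin(tp) is typing.Union and type(None) in typing.get_args(tp)

print(is_optional(typing.Optional[str]))  # True
print(is_optional(str))                   # False
```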
Sylvain Gugger
1bdf42409c
Fast imports part 3 (#9474)
* New intermediate inits

* Update template

* Avoid importing torch/tf/flax in tokenization unless necessary

* Styling

* Shutup flake8

* Better python version check
2021-01-08 07:40:59 -05:00
Patrick von Platen
79bbcc5260
[Generation] Fix bug for manual decoder_input_ids + warning message (#9472)
* up

* improve style
2021-01-08 05:50:39 -05:00
Patrick von Platen
9e1ea846bc
[README] Add new models (#9465)
* add new models

* make fix-copies
2021-01-08 05:49:43 -05:00
Nicolas Patry
bf9056442a
Removing duplicated code for Translation,Summarization and Text2TextGeneration pipelines (#9433)
* Merging all duplicated code for Text2TextPipeline while preserving backward compat.

* Fixing TranslationPipeline Hierarchy + return_name

* torch import guard.

* Update isort version.

* Remove code from other PR disentanglement.

* Removed named example to something more agnostic.
2021-01-07 23:10:16 +01:00
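
Per the PR ("Fixing TranslationPipeline Hierarchy"), the summarization and translation pipelines become thin subclasses of the shared implementation; a quick check, assuming the hierarchy matches that description:

```python
from transformers import (
    SummarizationPipeline,
    Text2TextGenerationPipeline,
    TranslationPipeline,
)

# Both task-specific pipelines should now reuse the shared code path:
assert issubclass(SummarizationPipeline, Text2TextGenerationPipeline)
assert issubclass(TranslationPipeline, Text2TextGenerationPipeline)
```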
Patrick von Platen
f33a6f3446
[TFGPT2] - Fix flaky past_key_values test (#9460)
* fix tf flaky

* remove test files
2021-01-07 16:12:08 +01:00
Sylvain Gugger
758ed3332b
Transformers fast import part 2 (#9446)
* Main init work

* Add version

* Change from absolute to relative imports

* Fix imports

* One more typo

* More typos

* Styling

* Make quality script pass

* Add necessary replace in template

* Fix typos

* Spaces are ignored in replace for some reason

* Forgot one models.

* Fixes for import

Co-authored-by: LysandreJik <lysandre.debut@reseau.eseo.fr>

* Add documentation

* Styling

Co-authored-by: LysandreJik <lysandre.debut@reseau.eseo.fr>
2021-01-07 09:36:14 -05:00
Patrick von Platen
a400fe8931
[LED Test] fix common inputs pt for flaky pt-tf led test (#9459)
* fix common inputs pt flaky led

* fix other tests correspondingly
2021-01-07 12:29:03 +01:00
Patrick von Platen
ae5a32bb0d
up (#9454) 2021-01-07 11:51:02 +01:00
Julien Plu
812045adcc
New serving (#9419)
* Add a serving method

* Add albert

* Add serving for BERT and BART

* Add more models

* Finish the serving addition

* Temp fix

* Restore DPR

* Fix funnel attribute

* Fix attributes GPT2

* Fix OpenAIGPT attribute

* Fix T5 attributes

* Fix Bart attributes

* Fix TransfoXL attributes

* Add versioning

* better test

* Update template

* Fix Flaubert

* Fix T5

* Apply style

* Remove unused imports

* Deactivate extra parameters

* Remove too long test + saved_model default to False

* Ignore the saved model test for some models

* Fix some inputs

* Fix mpnet serving

* Trigger CI

* Address all comments
2021-01-07 11:48:49 +01:00
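
A hedged sketch of the opt-in SavedModel export the bullets describe ("saved_model default to False"); the checkpoint name and output layout are illustrative:

```python
from transformers import TFBertModel

model = TFBertModel.from_pretrained("bert-base-uncased")
# Opt in to the TF SavedModel export; a servable model with the new serving
# signature should land under the output directory.
model.save_pretrained("exported_bert", saved_model=True)
```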
guillaume-be
390cf16bc8
Prophetnet optimization (#9453)
* Vectorized `ngram_attention_bias` calculation

* updated formatting with black

* Further optimization

* one (last) optimization
2021-01-07 11:41:58 +01:00
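
The optimization replaces Python-loop bias construction with tensor ops; a generic, hypothetical before/after of that pattern (not ProphetNet's actual `ngram_attention_bias` code):

```python
import torch

def causal_bias_loop(n: int) -> torch.Tensor:
    bias = torch.zeros(n, n)
    for i in range(n):          # slow, per-element construction
        for j in range(n):
            if j > i:
                bias[i, j] = float("-inf")
    return bias

def causal_bias_vectorized(n: int) -> torch.Tensor:
    mask = torch.triu(torch.ones(n, n, dtype=torch.bool), diagonal=1)
    return torch.zeros(n, n).masked_fill(mask, float("-inf"))

assert torch.equal(causal_bias_loop(8), causal_bias_vectorized(8))
```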
Stas Bekman
28d74872cc
a more reliable version of branching point discovery (#9449) 2021-01-07 04:47:50 -05:00