Commit Graph

6329 Commits

Author SHA1 Message Date
Daniel Stancl
357fb1c5d8
Add head_mask/decoder_head_mask for BART (#9569)
* Add head_mask/decoder_head_mask for BART

This branch implements head_mask and decoder_head_mask
for BART-based models. Full list below:
- BART
- MBart
- Blenderbot
- BlenderbotSmall
- Marian
- Pegasus

Everything is accompanied by updated tests.

* Fix test_headmasking for BART models

* Fix test_headmasking for BART-like models
which have only 2 layers in each module.
The condition
```
self.assertNotEqual(attentions[1][..., 0, :, :].flatten().sum().item(), 0.0)
```
is, therefore, invalid for encoder-decoder models considering
the `head_mask`
```
head_mask = torch.ones(
    self.model_tester.num_hidden_layers,
    self.model_tester.num_attention_heads,
    device=torch_device,
)
head_mask[0, 0] = 0
head_mask[-1, :-1] = 0
```
specified in the `test_headmasking` test function.

* Adjust test_modeling_common.py to reflect T5 input args

* Update tests/test_modeling_common.py

Co-authored-by: Lysandre Debut <lysandre@huggingface.co>

* Apply suggestions from code review

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* make style

* make fix-copies

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2021-01-18 13:35:22 +01:00
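
A minimal usage sketch for the head_mask / decoder_head_mask support added in #9569 above; the checkpoint name and the masked head positions are illustrative assumptions, not part of the commit:

```
# Hedged sketch: passing the new head_mask/decoder_head_mask arguments to BART.
# Checkpoint and masked positions are illustrative assumptions.
import torch
from transformers import BartForConditionalGeneration, BartTokenizer

tokenizer = BartTokenizer.from_pretrained("facebook/bart-base")
model = BartForConditionalGeneration.from_pretrained("facebook/bart-base")

inputs = tokenizer("Head masking silences individual attention heads.", return_tensors="pt")

# Shape (num_layers, num_heads): 1.0 keeps a head active, 0.0 masks it out.
head_mask = torch.ones(model.config.encoder_layers, model.config.encoder_attention_heads)
head_mask[0, 0] = 0.0  # mask the first head of the first encoder layer

decoder_head_mask = torch.ones(model.config.decoder_layers, model.config.decoder_attention_heads)
decoder_head_mask[-1, :-1] = 0.0  # mask all but the last head of the last decoder layer

outputs = model(**inputs, head_mask=head_mask, decoder_head_mask=decoder_head_mask)
```
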
Devrim
65eb5d9ac5
Fix: torch.utils.checkpoint import error. (#9626) 2021-01-18 04:33:39 -05:00
Anthony MOI
72fc9abf17
Remove duplicated extra["retrieval"] (#9621) 2021-01-18 04:24:21 -05:00
Stas Bekman
c60e0e1ee4
deepspeed + grad accum (#9622) 2021-01-15 10:12:26 -08:00
Lysandre Debut
6d3b688b04
Ignore lm_head decoder bias warning (#9615)
* Ignore lm_head decoder bias warning

* Revert "Ignore lm_head decoder bias warning"

This reverts commit f25177a9da.

* predictions -> lm_head
2021-01-15 09:40:21 -05:00
Julien Plu
8eba1f8ca8
Remove unused token_type_ids in MPNet (#9564)
* Add warning

* Remove unused import

* Fix missing call

* Fix missing call

* Completely remove token_type_ids

* Apply style

* Remove unused import

* Update src/transformers/models/mpnet/modeling_tf_mpnet.py

Co-authored-by: Lysandre Debut <lysandre@huggingface.co>

Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
2021-01-15 08:06:29 -05:00
Patrick von Platen
90ca8d36e9
[TF Led] Fix wrong decoder attention mask behavior (#9601)
* fix tf led

* remove loop file
2021-01-15 06:40:27 -05:00
Kiyoung Kim
85788bae5c Revert "Gradient accumulation for TFTrainer (#9585)"
This reverts commit 3f40070c88.
2021-01-15 10:47:01 +01:00
Stas Bekman
82498cbc37
[deepspeed doc] install issues + 1-gpu deployment (#9582)
* [doc] install + 1-gpu deployment

* Apply suggestions from code review

Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* improvements

Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2021-01-14 11:05:04 -08:00
Sylvain Gugger
329fe2746a
Upstream (and rename) sortish sampler (#9574)
* Upstream (and rename) sortish sampler

* Use proper sampler

* Update src/transformers/trainer_pt_utils.py

Co-authored-by: Lysandre Debut <lysandre@huggingface.co>

Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
2021-01-14 10:38:14 -05:00
Kiyoung Kim
3f40070c88
Gradient accumulation for TFTrainer (#9585)
* gradient accumulation for tftrainer

* label naming

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* label naming

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2021-01-14 10:16:39 -05:00
Lysandre
e43f3b6190 v4.2.1 in docs 2021-01-14 14:25:30 +01:00
Lysandre Debut
280db79ac1
BatchEncoding.to with device with tests (#9584) 2021-01-14 07:57:58 -05:00
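
A hedged sketch of the behavior covered by the test in #9584: `BatchEncoding.to(device)` moves every tensor in the encoding to the given device. The checkpoint name is an illustrative assumption:

```
# Hedged sketch: BatchEncoding.to(device) moves all contained tensors.
import torch
from transformers import BertTokenizer

tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
batch = tokenizer("hello world", return_tensors="pt")

device = "cuda" if torch.cuda.is_available() else "cpu"
batch = batch.to(device)  # input_ids, attention_mask, etc. all land on `device`
```
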
Lysandre Debut
8bf27075a2
Fix conda build (#9589)
* conda build -> conda-build

* Syntax error

* conda build -> conda-build + 4.2.0

* Prepare to merge in `master`
2021-01-14 05:51:52 -05:00
Stas Bekman
c99751dd9d
[setup.py] note on how to get to transformers' exact dependencies from shell (#9553)
* note on how to get to deps from shell

* Apply suggestions from code review

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* fix text

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2021-01-14 05:04:08 -05:00
Julien Plu
a26536f0c8
Make logs tf compliant (#9565) 2021-01-14 04:56:53 -05:00
Julien Plu
14d677ca4a
Compliancy with tf-nightly (#9570)
* Compliancy with tf-nightly

* Add more version + restore min version check
2021-01-14 04:35:35 -05:00
Sylvain Gugger
46ed56cfd1
Switch metrics in run_ner to datasets (#9567)
* Switch metrics in run_ner to datasets

* Add flag to return all metrics

* Upstream (and rename) sortish_sampler

* Revert "Upstream (and rename) sortish_sampler"

This reverts commit e07d0dcf65.
2021-01-14 03:37:07 -05:00
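
For context on #9567, a hedged sketch of computing an NER metric through the datasets library, as the updated run_ner script does; the metric name and the toy tag sequences are illustrative assumptions:

```
# Hedged sketch: loading an NER metric from the datasets library.
# "seqeval" and the toy tag sequences are illustrative assumptions.
from datasets import load_metric

metric = load_metric("seqeval")
predictions = [["O", "B-PER", "I-PER"]]
references = [["O", "B-PER", "O"]]
results = metric.compute(predictions=predictions, references=references)
print(results["overall_f1"])
```
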
Sylvain Gugger
5e1bea4f16
Fix Trainer with a parallel model (#9578)
* Fix Trainer with a parallel model

* More clean up
2021-01-14 03:23:41 -05:00
Patrick von Platen
126fd281bc
Update README.md 2021-01-13 16:55:59 +01:00
Lysandre
e63cad7936 v4.3.0.dev0 2021-01-13 16:16:54 +01:00
Lysandre
33a8497db8 v4.2.0 documentation 2021-01-13 16:15:40 +01:00
Lysandre
7d9a9d0c72 Release: v4.2.0 2021-01-13 16:01:51 +01:00
Lysandre Debut
c949516695
Fix slow tests v4.2.0 (#9561)
* Fix conversational pipeline test

* LayoutLM

* ProphetNet

* BART

* Blenderbot & small

* Marian

* mBART

* Pegasus

* Tapas tokenizer

* BERT2BERT test

* Style

* Example requirements

* TF BERT2BERT test
2021-01-13 09:55:48 -05:00
Sylvain Gugger
04dc65e5c6
Fix data parallelism in Trainer (#9566)
* Fix data parallelism in Trainer

* Update src/transformers/training_args.py

Co-authored-by: Lysandre Debut <lysandre@huggingface.co>

Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
2021-01-13 09:54:41 -05:00
Stas Bekman
b2dfcc567b
use correct deps for torchhub (#9552) 2021-01-13 08:02:53 -05:00
Yusuke Mori
eabad8fd9c
Update run_glue for do_predict with local test data (#9442) (#9486)
* Update run_glue for do_predict with local test data (#9442)

* Update run_glue (#9442): fix comments ('files' to 'a file')

* Update run_glue (#9442): reflect the code review

* Update run_glue (#9442): auto format

* Update run_glue (#9442): reflect the code review
2021-01-13 07:48:35 -05:00
LSinev
0c9f01a8e5
Speed up TopKLogitsWarper and TopPLogitsWarper (pytorch) (#9557)
* make TopKLogitsWarper faster

* make TopPLogitsWarper faster
2021-01-13 07:47:47 -05:00
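
For context on #9557, a minimal sketch of how these warpers are applied to next-token logits during sampling; the tensor shapes and values are illustrative:

```
# Hedged sketch: applying the (now faster) top-k / top-p logit warpers.
import torch
from transformers import LogitsProcessorList, TopKLogitsWarper, TopPLogitsWarper

input_ids = torch.tensor([[0, 1, 2]])  # dummy prefix, shape (batch, seq_len)
scores = torch.randn(1, 50265)         # next-token logits, shape (batch, vocab)

warpers = LogitsProcessorList([
    TopKLogitsWarper(top_k=50),   # keep the 50 highest-scoring tokens
    TopPLogitsWarper(top_p=0.9),  # then keep the smallest nucleus with mass >= 0.9
])
filtered = warpers(input_ids, scores)  # everything else is set to -inf
```
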
Pavel Tarashkevich
27d0e01d75
Fix classification script: enable dynamic padding with truncation (#9554)
Co-authored-by: Pavel Tarashkevich <Pavel.Tarashkievich@orange.com>
2021-01-13 07:46:48 -05:00
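
A hedged sketch of the dynamic-padding pattern that #9554 enables: truncate at tokenization time, pad per batch in the collator. The checkpoint name is an illustrative assumption:

```
# Hedged sketch: dynamic padding with truncation. Sequences are truncated
# when tokenized, then padded only to the longest example in each batch.
from transformers import AutoTokenizer, DataCollatorWithPadding

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
features = [
    tokenizer(text, truncation=True, max_length=128)
    for text in ["a short example", "a noticeably longer example sentence"]
]
collator = DataCollatorWithPadding(tokenizer)
batch = collator(features)  # padded to this batch's longest sequence, not to 128
```
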
Lysandre Debut
245cdb469d
Fix barthez tokenizer (#9562) 2021-01-13 06:24:10 -05:00
Julien Chaumond
247a7b2029
Doc: Update pretrained_models wording (#9545)
* Update pretrained_models.rst

To clarify things; cf. for instance this tweet: https://twitter.com/RTomMcCoy/status/1349094111505211395

* format
2021-01-13 05:58:05 -05:00
Suraj Patil
69ed36063a
fix BlenderbotSmallTokenizer (#9538)
* add model_input_names

* fix test
2021-01-13 10:53:43 +05:30
Stas Bekman
2df34f4aba
[trainer] deepspeed integration (#9211)
* deepspeed integration

* style

* add test

* ds wants to do its own backward

* fp16 assert

* Update src/transformers/training_args.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* style

* for clarity extract what args are being passed to deepspeed

* introduce the concept of self.wrapped_model

* s/self.wrapped_model/self.model_wrapped/

* complete transition to self.wrapped_model / self.model

* fix

* doc

* give ds its own init

* add custom overrides, handle bs correctly

* fix test

* clean up model_init logic, fix small bug

* complete fix

* collapse --deepspeed_config into --deepspeed

* style

* start adding doc notes

* style

* implement hf2ds optimizer and scheduler configuration remapping

* oops

* call get_num_training_steps absolutely when needed

* workaround broken auto-formatter

* deepspeed_config arg is no longer needed - fixed in deepspeed master

* use hf's fp16 args in config

* clean

* start on the docs

* rebase cleanup

* finish up --fp16

* clarify the supported stages

* big refactor thanks to discovering deepspeed.init_distributed

* cleanup

* revert fp16 part

* add checkpoint-support

* more init ds into integrations

* extend docs

* cleanup

* unfix docs

* clean up old code

* imports

* move docs

* fix logic

* make it clear which file it's referring to

* document nodes/gpus

* style

* wrong format

* style

* deepspeed handles gradient clipping

* easier to read

* major doc rewrite

* Apply suggestions from code review

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* docs

* switch to AdamW optimizer

* style

* Apply suggestions from code review

Co-authored-by: Lysandre Debut <lysandre@huggingface.co>

* clarify doc

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
2021-01-12 19:05:18 -08:00
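
A minimal sketch of how the DeepSpeed integration above is enabled from Python; the config file name and values are illustrative assumptions, and the script still has to be launched with the `deepspeed` launcher:

```
# Hedged sketch: enabling DeepSpeed via the Trainer after #9211.
# "ds_config.json" is an illustrative name for a DeepSpeed config file;
# --deepspeed_config was collapsed into the single deepspeed argument.
from transformers import TrainingArguments

args = TrainingArguments(
    output_dir="output",
    fp16=True,                   # HF fp16 args are remapped into the DS config
    deepspeed="ds_config.json",  # path to the DeepSpeed configuration
)
```
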
Sylvain Gugger
5f6721032a
Use the right version of tokenizers (#9550)
* Use the right version of tokenizers

* Try another way

* Try another way

* Deps are installed from there...

* Deps are installed from there...

* Revert last

* remove needless comment
2021-01-12 18:55:45 -05:00
Sylvain Gugger
063d8d27f4
Refactor prepare_seq2seq_batch (#9524)
* Add target contextmanager and rework prepare_seq2seq_batch

* Fix tests, treat BART and Barthez

* Add last tokenizers

* Fix test

* Set src token before calling the superclass

* Remove special behavior for T5

* Remove needless imports

* Remove needless asserts
2021-01-12 18:19:38 -05:00
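
A hedged sketch of the target-contextmanager pattern this refactor introduces, assuming the context manager is the `as_target_tokenizer` API; the checkpoint name and sentences are illustrative:

```
# Hedged sketch of the "target contextmanager": tokenize targets inside a
# context that switches the tokenizer into target-language mode.
from transformers import BartTokenizer

tokenizer = BartTokenizer.from_pretrained("facebook/bart-base")

inputs = tokenizer(["A source sentence."], return_tensors="pt", padding=True)
with tokenizer.as_target_tokenizer():  # assumption: the added context manager
    labels = tokenizer(["A target sentence."], return_tensors="pt", padding=True)
inputs["labels"] = labels["input_ids"]
```
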
Sylvain Gugger
e6ecef711e
Revert, it was not the issue. 2021-01-12 18:00:22 -05:00
Sylvain Gugger
250f27f207
Fix tokenizers install for now 2021-01-12 17:50:27 -05:00
Lysandre Debut
dfbf0f5598
topk -> top_k (#9541) 2021-01-12 16:21:29 -05:00
Lysandre Debut
a1100fac67
LayoutLM Config (#9539) 2021-01-12 10:03:50 -05:00
NielsRogge
e45eba3b1c
Improve LayoutLM (#9476)
* Add LayoutLMForSequenceClassification and integration tests

Improve docs

Add LayoutLM notebook to list of community notebooks

* Make style & quality

* Address comments by @sgugger, @patrickvonplaten and @LysandreJik

* Fix rebase with master

* Reformat in one line

* Improve code examples as requested by @patrickvonplaten

Co-authored-by: Lysandre <lysandre.debut@reseau.eseo.fr>
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
2021-01-12 09:26:32 -05:00
Suraj Patil
ccd1923f46
[T5] enable T5 fp16 (#9487)
* fix t5 fp16
2021-01-12 17:12:33 +05:30
Patrick von Platen
2aa9c2f204
fix blenderbot tok (#9532) 2021-01-12 05:53:32 -05:00
Lysandre Debut
406cbf58b2
Shouldn't stale issues/PRs with feature request label (#9511) 2021-01-12 04:49:15 -05:00
Simon Brandeis
3b67c5abb0
Update 'Develop on Windows' guidelines (#9519) 2021-01-12 04:15:16 -05:00
Patrick von Platen
a051d8928a
[ProphetNet] Fix naming and wrong config (#9514)
* fix naming issues

* better names
2021-01-12 04:10:05 -05:00
Patrick von Platen
7f28613213
[TFBart] Split TF-Bart (#9497)
* make templates ready

* make add_new_model_command_ready

* finish tf bart

* prepare tf mbart

* finish tf bart

* add tf mbart

* add marian

* prep pegasus

* add tf pegasus

* push blenderbot tf

* add blenderbot

* add blenderbot small

* clean-up

* make fix copy

* define blend bot tok

* fix

* up

* make style

* add to docs

* add copy statements

* overwrite changes

* improve

* fix docs

* finish

* fix last slow test

* fix missing git conflict line

* fix blenderbot

* up

* fix blenderbot small

* load changes

* finish copied from

* upload fix
2021-01-12 02:06:32 +01:00
Stas Bekman
0ecbb69806
[make docs] parallel build (#9522)
After experimenting with different numbers of workers (https://github.com/huggingface/transformers/issues/9496#issuecomment-758145868), 4-5 workers seems optimal; let's go with 4, since we are unlikely to find a CPU with fewer cores these days.

Fixes part of https://github.com/huggingface/transformers/issues/9496

@sgugger
2021-01-11 13:00:08 -08:00
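
For reference, a hedged sketch of kicking off a parallel Sphinx build with 4 workers from Python; the source and output paths are illustrative assumptions:

```
# Hedged sketch: a parallel Sphinx HTML build with 4 worker processes,
# equivalent to passing "-j 4" to sphinx-build. Paths are assumptions.
from sphinx.cmd.build import build_main

build_main(["-b", "html", "-j", "4", "docs/source", "docs/_build/html"])
```
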
Stas Bekman
e6f211cade
[trainer] round numbers in trainer state (#9491)
* round numbers

* style

* round only on logging
2021-01-11 10:17:49 -08:00
Sylvain Gugger
01a1684078
Make doc styler behave properly on Windows (#9516) 2021-01-11 10:25:24 -05:00
Sylvain Gugger
6009668c63 Add link to forums thread 2021-01-11 10:00:59 -05:00