transformers

mirror of https://github.com/huggingface/transformers.git synced 2025-07-28 16:52:24 +06:00

Author	SHA1	Message	Date
Suraj Patil	47a9768334	[FlaxBart] few small fixes (#12247 ) * boom boom * remove flax clip example * few small fixes	2021-06-18 10:29:42 +01:00
Suraj Patil	f74655cd9b	[Flax] FlaxAutoModelForSeq2SeqLM (#12228 ) * add FlaxAutoModelForSeq2SeqLM	2021-06-18 13:20:09 +05:30
Bhavitvya Malik	e43e11260f	update desc for map in all examples (#12226 ) * update desc for map in all examples * added plm * suggestions	2021-06-17 15:37:31 -04:00
Sylvain Gugger	adb70eda4d	AutoTokenizer: infer the class from the tokenizer config if possible (#12208 ) * AutoTokenizer: infer the class from the tokenizer config if possible * Add tests * Update src/transformers/models/auto/tokenization_auto.py Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com> Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>	2021-06-17 12:39:22 -04:00
Lysandre	0daadc1919	Docs for v4.8.0	2021-06-17 18:17:42 +02:00
Lysandre	7a6c9fab8e	Release: v4.7.0	2021-06-17 17:57:42 +02:00
Stas Bekman	d6ea91c96a	fix pt-1.9.0 `add_` deprecation (#12217 ) * fix pt-1.9.0 add_ deprecation * add () for clarity * Trigger CI * require_version(torch	2021-06-17 08:53:59 -07:00
Lysandre Debut	3a960c4857	Support for torch 1.9.0 (#12224 ) * Support for torch 1.9.0 * Torch scatter for 1.9.0 * Github Actions run on 1.9.0	2021-06-17 11:29:01 -04:00
Sylvain Gugger	afdd9e3663	Add link to the course (#12229 )	2021-06-17 11:14:53 -04:00
NielsRogge	29b0aef871	Improve detr (#12147 ) * Remove unused variables * Improve docs * Fix docs of segmentation masks Co-authored-by: Lysandre Debut <lysandre@huggingface.co>	2021-06-17 10:37:54 -04:00
Lysandre Debut	b56848c8c8	Pipeline update & tests (#12207 )	2021-06-17 09:41:16 +02:00
Bhadresh Savani	700cee3446	[Docs] fixed broken link (#12205 ) * fixed broken link * Update docs/source/benchmarks.rst Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update docs/source/benchmarks.rst Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>	2021-06-16 15:14:53 -04:00
Sylvain Gugger	255a17a089	Use yaml to create metadata (#12185 ) * Use yaml to create metadata * Fix typo * Remove pin	2021-06-16 13:17:45 -04:00
Nicolas Patry	15ef0dc5c6	Enabling AutoTokenizer for HubertConfig. (#12198 )	2021-06-16 15:28:46 +01:00
Philipp Schmid	afa414d060	updated DLC images and sample notebooks (#12191 )	2021-06-16 07:24:00 -04:00
Patrick von Platen	ccca510276	Hubert (#11889 ) * fix_torch_device_generate_test * remove @ * add hubert * add first test file * more docs * fix bugs * fix bug * finish * finish * finish docstring * fix * fix * finalize * add to ignored * finish * Apply suggestions from code review * correct naming * finish * fix auto config * finish * correct convert script * Apply suggestions from code review Co-authored-by: Lysandre Debut <lysandre@huggingface.co> Co-authored-by: Suraj Patil <surajp815@gmail.com> * apply suggestions lysandre & suraj Co-authored-by: Lysandre Debut <lysandre@huggingface.co> Co-authored-by: Suraj Patil <surajp815@gmail.com>	2021-06-16 12:14:12 +01:00
Patrick von Platen	c3c39f7e84	[Flax] Add Beam Search (#12131 ) * fix_torch_device_generate_test * remove @ * push new logit processors * add processors * save first working version * save intermediate * finish * make style * make fix-copies * finish * Update tests/test_modeling_flax_bart.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Apply suggestions from code review Co-authored-by: Suraj Patil <surajp815@gmail.com> Co-authored-by: Patrick von Platen <patrick@huggingface.co> Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> Co-authored-by: Suraj Patil <surajp815@gmail.com>	2021-06-16 09:43:54 +01:00
Sylvain Gugger	802ffaff0d	Temporarily deactivate torchhub test (#12184 )	2021-06-15 16:16:51 -04:00
Lysandre Debut	52c7ca0488	Temporarily deactivate torch-scatter while we wait for new release (#12181 ) * Temporarily deactivate torch-scatter while we wait for new release * torch-1.8.1 binary for scatter * Revert to 1.8.0 * Pin torch dependency * torchaudio and torchvision	2021-06-15 16:03:58 -04:00
Sylvain Gugger	7d7ceca396	Model card defaults (#12122 ) * [WIP] Model card defaults * finetuned_from default value * Add all mappings to the mapping file * Be more defensive on finetuned_from arg * Add default task tag * Separate tags from tasks * Edge case for dataset * Apply suggestions from code review Co-authored-by: Lysandre Debut <lysandre@huggingface.co> Co-authored-by: Lysandre Debut <lysandre@huggingface.co>	2021-06-15 16:01:37 -04:00
Stas Bekman	6e7cc5cc51	[testing] ensure concurrent pytest workers use a unique port for torch.dist (#12166 ) * ensure concurrent pytest workers use a unique port for torch.distributed.launch * reword	2021-06-15 11:12:59 -07:00
Amog Kamsetty	b9d66f4c4b	Ray Tune Integration Updates (#12134 ) * fix * fixes * add back to scheduled tests * formatting * Update integrations.py	2021-06-15 14:11:29 -04:00
Kilian Kluge	a79585bbf9	Update AutoModel classes in summarization example (#12178 ) - Convert use of deprecated AutoModelWithLMHead to AutoModelForSeq2SeqLM - Add newly required `truncation=True` to `tokenizer.encode` with `max_length` This silences all warnings.	2021-06-15 10:36:10 -04:00
Sylvain Gugger	d6c929e200	Merge remote-tracking branch 'origin/master'	2021-06-15 09:37:46 -04:00
Sylvain Gugger	a8694b8850	Adjust banner width	2021-06-15 09:37:15 -04:00
kumapo	955b2b97a6	Enable add_prefix_space if model_type is roberta or gpt2 (#12116 )	2021-06-15 09:33:21 -04:00
Sylvain Gugger	60b1d6b45b	Add course banner (#12157 ) * Add course banner * Update course banner	2021-06-15 09:25:49 -04:00
Lysandre Debut	d07b540a37	Have dummy processors have a `from_pretrained` method (#12145 )	2021-06-15 08:39:05 -04:00
Avital Oliver	9b393240a2	Use a released version of optax rather than installing from Git. (#12173 ) Use a released version of optax rather than installing from Git	2021-06-15 16:42:51 +05:30
Patrick von Platen	9bc9e59869	[Flax generate] Add params to generate (#12171 ) * fix_torch_device_generate_test * remove @ * add params as input * finish	2021-06-15 11:50:12 +01:00
Sylvain Gugger	a55dc157e3	Add video links to the documentation (#12162 )	2021-06-15 06:37:37 -04:00
Stas Bekman	040283170c	consistent nn. and nn.functional: part 5 docs (#12161 )	2021-06-14 13:34:32 -07:00
Stas Bekman	88e84186e5	[style] consistent nn. and nn.functional: part 4 `examples` (#12156 ) * consistent nn. and nn.functional: p4 examples * restore	2021-06-14 12:28:24 -07:00
Stas Bekman	372ab9cd6d	[style] consistent nn. and nn.functional: part 3 `tests` (#12155 ) * consistent nn. and nn.functional: p3 templates * restore	2021-06-14 12:18:22 -07:00
Vasudev Gupta	d9c0d08f9a	Flax Big Bird (#11967 ) * add flax bert * bert -> bigbird * original_full ported * add debugger * init block sparse * fix copies ; gelu_fast -> gelu_new * block sparse port * fix block sparse * block sparse working * all ckpts working * fix-copies * make quality * init tests * temporary fix for FlaxBigBirdForMultipleChoice * skip test_attention_outputs * fix * gelu_fast -> gelu_new ; fix multiple choice model * remove nsp * fix sequence classifier * fix * make quality * make fix-copies * finish * Delete debugger.ipynb * Update src/transformers/models/big_bird/modeling_flax_big_bird.py * make style * finish * bye bye jit flax tests Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>	2021-06-14 20:01:03 +01:00
Stas Bekman	a156da9a23	consistent nn. and nn.functional: p2 templates (#12153 )	2021-06-14 11:41:24 -07:00
Patrick von Platen	007be9e402	[Flax] Fix flax pt equivalence tests (#12154 ) * fix_torch_device_generate_test * remove @ * upload	2021-06-14 19:19:10 +01:00
Will Rice	d438eee030	Adding TFWav2Vec2Model (#11617 ) * [WIP] Add TFWav2Vec2Model Work in progress for adding a tensorflow version of Wav2Vec2 * feedback changes * small fix * Test Feedback Round 1 * Add SpecAugment and CTC Loss * correct spec augment mask creation * docstring and correct copyright * correct bugs * remove bogus file * finish tests correction * del unnecessary layers * Update src/transformers/models/wav2vec2/modeling_tf_wav2vec2.py Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com> * make style * correct final bug * Feedback Changes Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>	2021-06-14 18:58:54 +01:00
Stas Bekman	1ed2ebf60d	[style] consistent nn. and nn.functional (#12124 ) * consistent nn. and nn.functional * fix glitch * fix glitch #2	2021-06-14 09:44:28 -07:00
Stas Bekman	ff7c81687a	[optim] implement AdafactorSchedule (#12123 ) * implement AdafactorSchedule * typo * fix * Update src/transformers/optimization.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>	2021-06-14 09:43:48 -07:00
Suraj Patil	fe3576488a	fix error message (#12148 )	2021-06-14 14:12:18 +01:00
Kumar Abhishek	9de62cfbce	[lm examples] Replicate --config_overrides addition to other LM examples (#12135 ) * [lm examples] Replicate --config_overrides addition to other LM examples * Removing no trainer files changes * Update README Co-authored-by: Kumar Abhishek <kabhishek@expedia.com>	2021-06-14 08:12:22 -04:00
Nicholas Broad	cd7961b632	Use text_column_name variable instead of "text" (#12132 ) * Use text_column_name variable instead of "text" `text_column_name` was already defined above where I made the changes and it was also used below where I made changes. This is a very minor change. If a dataset does not use "text" as the column name, then the `tokenize_function` will now use whatever column is assigned to `text_column_name`. `text_column_name` is just the first column name if "text" is not a column name. It makes the function a little more robust, though I would assume that 90% + of datasets use "text" anyway. * black formatting * make style Co-authored-by: Nicholas Broad <nicholas@nmbroad.com>	2021-06-14 08:11:13 -04:00
Sylvain Gugger	b8ab541340	Don't log anything before logging is setup in examples (#12121 ) * Don't log anything before logging is setup in examples * Last example	2021-06-14 08:03:33 -04:00
Patrick von Platen	7566fefa69	[Flax] Add links to google colabs (#12146 ) * fix_torch_device_generate_test * remove @ * add colab links	2021-06-14 11:00:29 +01:00
SaulLu	476ba679dd	Feature to use the PreTrainedTokenizerFast class as a stand-alone tokenizer (#11810 ) * feature for tokenizer without slow/legacy version * format * modify common test * add tests * add PreTrainedTokenizerFast to AutoTokenizer * format * change tokenizer common test in order to be able to run test without a slow version * update tokenizer fast test in order to use `rust_tokenizer_class` attribute instead of `tokenizer_class` * add autokenizer test * replace `if self.tokenizer_class is not None` with ` if self.tokenizer_class is None` * remove obsolete change in comment * Update src/transformers/tokenization_utils_base.py Co-authored-by: Lysandre Debut <lysandre@huggingface.co> * Update src/transformers/tokenization_utils_fast.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * change `get_main_tokenizer` into `get_tokenizers` * clarify `get_tokenizers` method * homogenize with `test_slow_tokenizer` and `test_rust_tokenizer` * add `test_rust_tokenizer = False` to tokenizer which don't define a fast version * `test_rust_tokenizer = False` for BertJapaneseTokenizer * `test_rust_tokenizer = False` for BertJapaneseCharacterTokenizationTest Co-authored-by: Lysandre Debut <lysandre@huggingface.co> Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>	2021-06-14 11:58:44 +02:00
Daniel Stancl	4a51b1dd9b	FlaxBart (#11537 ) * Start working on FlaxBart * Create modeling_flax_bart.py * Write FlaxBartAttention * Add FlaxBartEncoderLayer * Add FlaxBartDecoderLayer and some typing * Add helepr function for FlaxBart * shift_tokens_right * _make_causal_mask * _expand_mask * Add PositionalEmbedding and fix init_std naming * Add FlaxBartPretrainedModel * Add FlaxBartEncoder * Add FlaxBartEncoder * Add FlaxBartEncoder among modules to be imported * YET WE CANNOT INITIALIZE THAT!! :( * Make BartEncoder working Change BartEncoder to instance of nn.Module so far * Add FlaxBartDecoder * Add FlaxBartModel * TODO to make model run -> Prepapre model inputs * Resolve padding * Add FlaxBartModel * Add FlaxBartModel into importable modules * Remove FlaxBartEncoder and FlaxBartDecoder from importable modules * make style; not properly working * make style; make quality not pass due to some import I left * Remove TODO for padding_idx in nn.Embed so far * Add FlaxBartForConditionalGeneration * Incorporate Flax model output classes, i.e. return_dict * Add another models and incorporate use_cache arg * Add FlaxBartForSequenceClassification and FlaxBartForQuestionAnswering * Incorporate use_cache arg from PyTorch implementation * Add all necessary Flax output utils * Add FlaxBartForCausalLM; not working yet' * Add minor improvements; still lacks some functionality * Update docs, src and tests * Add support of FlaxBart to docs/source * Fix some bugs in FlaxBart souce code * Add some neccessary tests for FlaxBart models - jit_compilation not passing * Fix tests and add test_head_masking * Fix tests for @jax.jit computation * Add test_head_masking * Migrate FlaxBart tests from jax.numpy to numpy * Remove FlaxBartForCausalLM * Clean repo * fix bart model weight structure * Fix FlaxBartForSequenceClassification Slicing is not possible to use below jit, therefore, selecting sentence representation from hidden_states must be changed. 
* Allow FlaxBartForSequenceClassification for testing pt_flax equivalence * Allow testing for FlaxBartForQA for pt_flax equivalence * Add a comment to FlaxBartForSequenceClassification + change noise from 1e-3 to 1e-6 * remove past_key_values * remove inputs_mebeds and make input_ids required * add position ids * re-write attention layer * fix dataclass * fix pos embeds and attention output * fix pos embeds * expose encode method * expose decode method * move docstring to top * add cache for causal attn layer * remove head masking for now * s2s greedy search first pass * boom boom * fix typos * fix greedy generate for bart * use encoder, decoder layers instead of num_hidden_layers * handle encoder_outputs * cleanup * simplify decoding * more clean-up * typos * Change header + add {decoder_,}position_ids into 2 models * add BartConfig * fix existing tests * add encode, decode methods * Fix shift_tokens_right for JIT compilation + clarify one condition * fix decode * encoder => encode * simplify generate * add tests for encode and decode * style * add tests for cache * fix equivalence tests * sample generate now works with seq2seq * generation tests * initialize dense layers * docstring and cleanup * quality * remove get/set input_embeddings * address Patricks suggestions * decode for every model, remove encoder_outputs from call * update tests accordingly * decode returns only decoder outputs and logits * fix arguments * doc encode, decode methods * correct base_model_prefix * fix test for seq classif model * fix docs Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com> Co-authored-by: Suraj Patil <surajp815@gmail.com>	2021-06-14 15:16:08 +05:30
Suraj Patil	d36fce8237	add readme for flax clm (#12111 ) * add readme for flax clm * use section link for tokenizer * Apply suggestions from code review Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com> * update metrics Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>	2021-06-14 15:03:55 +05:30
Patrick von Platen	16c0efca2c	Add mlm pretraining xla torch readme (#12011 ) * fix_torch_device_generate_test * remove @ * upload * Apply suggestions from code review * Apply suggestions from code review * Apply suggestions from code review * Update examples/flax/language-modeling/README.md * add more info * finish * fix Co-authored-by: Patrick von Platen <patrick@huggingface.co>	2021-06-14 10:31:21 +01:00
Guido Novati	ecd6efe7cb	Fix megatron_gpt2 attention block's causal mask (#12007 ) * Fix megatron_gpt2 attention block's causal mask. * compatibility with checkpoints created with recent versions of Megatron-LM * added integration test for the released Megatron-GPT2 model * code style changes * added option to megatron conversion script to read from config file Co-authored-by: Guido Novati <gnovati@nvidia.com>	2021-06-14 04:57:55 -04:00

8821 Commits