transformers

mirror of https://github.com/huggingface/transformers.git synced 2025-07-24 23:08:57 +06:00

Daniel Stancl 4a51b1dd9b FlaxBart (#11537)	2021-06-14 15:16:08 +05:30

* Start working on FlaxBart
* Create modeling_flax_bart.py
* Write FlaxBartAttention
* Add FlaxBartEncoderLayer
* Add FlaxBartDecoderLayer and some typing
* Add helper function for FlaxBart
* shift_tokens_right
* _make_causal_mask
* _expand_mask
* Add PositionalEmbedding and fix init_std naming
* Add FlaxBartPretrainedModel
* Add FlaxBartEncoder
* Add FlaxBartEncoder
* Add FlaxBartEncoder among modules to be imported
* YET WE CANNOT INITIALIZE THAT!! :(
* Make BartEncoder working. Change BartEncoder to instance of nn.Module so far
* Add FlaxBartDecoder
* Add FlaxBartModel
* TODO to make model run -> Prepare model inputs
* Resolve padding
* Add FlaxBartModel
* Add FlaxBartModel into importable modules
* Remove FlaxBartEncoder and FlaxBartDecoder from importable modules
* make style; not properly working
* make style; make quality not pass due to some import I left
* Remove TODO for padding_idx in nn.Embed so far
* Add FlaxBartForConditionalGeneration
* Incorporate Flax model output classes, i.e. return_dict
* Add another models and incorporate use_cache arg
* Add FlaxBartForSequenceClassification and FlaxBartForQuestionAnswering
* Incorporate use_cache arg from PyTorch implementation
* Add all necessary Flax output utils
* Add FlaxBartForCausalLM; not working yet
* Add minor improvements; still lacks some functionality
* Update docs, src and tests
* Add support of FlaxBart to docs/source
* Fix some bugs in FlaxBart source code
* Add some necessary tests for FlaxBart models - jit_compilation not passing
* Fix tests and add test_head_masking
* Fix tests for @jax.jit computation
* Add test_head_masking
* Migrate FlaxBart tests from jax.numpy to numpy
* Remove FlaxBartForCausalLM
* Clean repo
* fix bart model weight structure
* Fix FlaxBartForSequenceClassification. Slicing is not possible to use below jit, therefore, selecting sentence representation from hidden_states must be changed.
* Allow FlaxBartForSequenceClassification for testing pt_flax equivalence
* Allow testing for FlaxBartForQA for pt_flax equivalence
* Add a comment to FlaxBartForSequenceClassification + change noise from 1e-3 to 1e-6
* remove past_key_values
* remove inputs_embeds and make input_ids required
* add position ids
* re-write attention layer
* fix dataclass
* fix pos embeds and attention output
* fix pos embeds
* expose encode method
* expose decode method
* move docstring to top
* add cache for causal attn layer
* remove head masking for now
* s2s greedy search first pass
* boom boom
* fix typos
* fix greedy generate for bart
* use encoder, decoder layers instead of num_hidden_layers
* handle encoder_outputs
* cleanup
* simplify decoding
* more clean-up
* typos
* Change header + add {decoder_,}position_ids into 2 models
* add BartConfig
* fix existing tests
* add encode, decode methods
* Fix shift_tokens_right for JIT compilation + clarify one condition
* fix decode
* encoder => encode
* simplify generate
* add tests for encode and decode
* style
* add tests for cache
* fix equivalence tests
* sample generate now works with seq2seq
* generation tests
* initialize dense layers
* docstring and cleanup
* quality
* remove get/set input_embeddings
* address Patrick's suggestions
* decode for every model, remove encoder_outputs from call
* update tests accordingly
* decode returns only decoder outputs and logits
* fix arguments
* doc encode, decode methods
* correct base_model_prefix
* fix test for seq classif model
* fix docs

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
Co-authored-by: Suraj Patil <surajp815@gmail.com>

_static	Docs for v4.7.0.dev0	2021-05-12 17:08:35 +02:00
imgs	[Templates] Add template "call-for-model" markdown and "call-for-big-bird" markdown (#9921)	2021-02-05 15:47:54 +03:00
internal	[docs] fix xref to `PreTrainedModel.generate` (#11049)	2021-06-02 09:21:05 -07:00
main_classes	typo	2021-06-08 12:55:17 -07:00
model_doc	FlaxBart (#11537)	2021-06-14 15:16:08 +05:30
add_new_model.rst	Indent code block in the documentation (#11233)	2021-04-13 15:36:36 -04:00
benchmarks.rst	Examples reorg (#11350)	2021-04-21 11:11:20 -04:00
bertology.rst	Fix documentation links always pointing to master. (#9217)	2021-01-05 06:18:48 -05:00
community.md	Add DETR (#11653)	2021-06-09 11:51:13 -04:00
conf.py	added social thumbnail for docs (#11083)	2021-04-06 14:56:18 +02:00
contributing.md	Update installation page and add contributing to the doc (#5084)	2020-06-17 14:01:10 -04:00
converting_tensorflow_models.rst	Examples reorg (#11350)	2021-04-21 11:11:20 -04:00
custom_datasets.rst	Rename NLP library to Datasets library (#10920)	2021-03-26 08:07:59 -04:00
debugging.rst	[debug utils] activation/weights underflow/overflow detector (#11274)	2021-04-30 11:15:46 -07:00
examples.md	per_device instead of per_gpu/error thrown when argument unknown (#4618)	2020-05-27 11:36:55 -04:00
fast_tokenizers.rst	Documentation about loading a fast tokenizer within Transformers (#11029)	2021-04-05 10:51:16 -04:00
favicon.ico	Adding usage examples for common tasks (#2850)	2020-02-25 13:48:24 -05:00
glossary.rst	Indent code block in the documentation (#11233)	2021-04-13 15:36:36 -04:00
index.rst	FlaxBart (#11537)	2021-06-14 15:16:08 +05:30
installation.md	Fix two typos in docs (#11852)	2021-05-24 14:26:02 -04:00
migration.md	[docs] fix invalid class name (#11438)	2021-04-26 08:37:32 -07:00
model_sharing.rst	Fix two typos in docs (#11852)	2021-05-24 14:26:02 -04:00
model_summary.rst	Examples reorg (#11350)	2021-04-21 11:11:20 -04:00
multilingual.rst	Examples reorg (#11350)	2021-04-21 11:11:20 -04:00
notebooks.md	Update notebooks (#3620)	2020-04-06 14:32:39 -04:00
perplexity.rst	minor typo fix	2021-04-01 11:58:37 -06:00
philosophy.rst	Minor documentation revisions from copyediting (#9266)	2020-12-23 10:15:49 -05:00
preprocessing.rst	Minor documentation revisions from copyediting (#9266)	2020-12-23 10:15:49 -05:00
pretrained_models.rst	GPT Neo few fixes (#10968)	2021-03-30 11:15:55 -04:00
quicktour.rst	Finish Making Quick Tour respect the model object (#11467)	2021-04-27 10:04:12 -04:00
sagemaker.md	Examples reorg (#11350)	2021-04-21 11:11:20 -04:00
serialization.rst	Copyright (#8970)	2020-12-07 18:36:34 -05:00
task_summary.rst	[docs] fix xref to `PreTrainedModel.generate` (#11049)	2021-06-02 09:21:05 -07:00
testing.rst	[testing doc] bring doc up to date (#11359)	2021-04-21 08:51:00 -07:00
tokenizer_summary.rst	Minor documentation revisions from copyediting (#9266)	2020-12-23 10:15:49 -05:00
training.rst	Update training tutorial (#11533)	2021-05-03 13:18:46 -04:00
troubleshooting.md	[troubleshooting] add 2 points of reference to the offline mode (#11236)	2021-04-14 08:39:23 -07:00