transformers/docs/source
Daniel Stancl 4a51b1dd9b
FlaxBart (#11537)
* Start working on FlaxBart

* Create modeling_flax_bart.py

* Write FlaxBartAttention

* Add FlaxBartEncoderLayer

* Add FlaxBartDecoderLayer and some typing

* Add helper functions for FlaxBart

* shift_tokens_right

* _make_causal_mask

* _expand_mask

* Add PositionalEmbedding and fix init_std naming

* Add FlaxBartPretrainedModel

* Add FlaxBartEncoder

* Add FlaxBartEncoder

* Add FlaxBartEncoder among modules to be imported

* YET WE CANNOT INITIALIZE THAT!! :(

* Make BartEncoder working

Change BartEncoder to an instance of nn.Module for now

* Add FlaxBartDecoder

* Add FlaxBartModel

* TODO to make model run -> Prepare model inputs

* Resolve padding

* Add FlaxBartModel

* Add FlaxBartModel into importable modules

* Remove FlaxBartEncoder and FlaxBartDecoder from importable modules

* make style; not properly working

* make style; make quality does not pass due to an import I left in

* Remove TODO for padding_idx in nn.Embed for now

* Add FlaxBartForConditionalGeneration

* Incorporate Flax model output classes, i.e. return_dict

* Add other models and incorporate use_cache arg

* Add FlaxBartForSequenceClassification and FlaxBartForQuestionAnswering

* Incorporate use_cache arg from PyTorch implementation

* Add all necessary Flax output utils

* Add FlaxBartForCausalLM; not working yet

* Add minor improvements; still lacks some functionality

* Update docs, src and tests

* Add support of FlaxBart to docs/source

* Fix some bugs in FlaxBart source code

* Add some necessary tests for FlaxBart models - jit_compilation not passing

* Fix tests and add test_head_masking

* Fix tests for @jax.jit computation

* Add test_head_masking

* Migrate FlaxBart tests from jax.numpy to numpy

* Remove FlaxBartForCausalLM

* Clean repo

* fix bart model weight structure

* Fix FlaxBartForSequenceClassification

Slicing cannot be used under jit; therefore, the way the sentence
representation is selected from hidden_states must be changed.

* Allow FlaxBartForSequenceClassification for testing pt_flax equivalence

* Allow testing for FlaxBartForQA for pt_flax equivalence

* Add a comment to FlaxBartForSequenceClassification + change noise from 1e-3 to 1e-6

* remove past_key_values

* remove inputs_embeds and make input_ids required

* add position ids

* re-write attention layer

* fix dataclass

* fix pos embeds and attention output

* fix pos embeds

* expose encode method

* expose decode method

* move docstring to top

* add cache for causal attn layer

* remove head masking for now

* s2s greedy search first pass

* boom boom

* fix typos

* fix greedy generate for bart

* use encoder, decoder layers instead of num_hidden_layers

* handle encoder_outputs

* cleanup

* simplify decoding

* more clean-up

* typos

* Change header + add {decoder_,}position_ids into 2 models

* add BartConfig

* fix existing tests

* add encode, decode methods

* Fix shift_tokens_right for JIT compilation + clarify one condition

* fix decode

* encoder => encode

* simplify generate

* add tests for encode and decode

* style

* add tests for cache

* fix equivalence tests

* sample generate now works with seq2seq

* generation tests

* initialize dense layers

* docstring and cleanup

* quality

* remove get/set input_embeddings

* address Patrick's suggestions

* decode for every model, remove encoder_outputs from call

* update tests accordingly

* decode returns only decoder outputs and logits

* fix arguments

* doc encode, decode methods

* correct base_model_prefix

* fix test for seq classif model

* fix docs

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
Co-authored-by: Suraj Patil <surajp815@gmail.com>
2021-06-14 15:16:08 +05:30
..
_static Docs for v4.7.0.dev0 2021-05-12 17:08:35 +02:00
imgs [Templates] Add template "call-for-model" markdown and "call-for-big-bird" markdown (#9921) 2021-02-05 15:47:54 +03:00
internal [docs] fix xref to PreTrainedModel.generate (#11049) 2021-06-02 09:21:05 -07:00
main_classes typo 2021-06-08 12:55:17 -07:00
model_doc FlaxBart (#11537) 2021-06-14 15:16:08 +05:30
add_new_model.rst Indent code block in the documentation (#11233) 2021-04-13 15:36:36 -04:00
benchmarks.rst Examples reorg (#11350) 2021-04-21 11:11:20 -04:00
bertology.rst Fix documentation links always pointing to master. (#9217) 2021-01-05 06:18:48 -05:00
community.md Add DETR (#11653) 2021-06-09 11:51:13 -04:00
conf.py added social thumbnail for docs (#11083) 2021-04-06 14:56:18 +02:00
contributing.md Update installation page and add contributing to the doc (#5084) 2020-06-17 14:01:10 -04:00
converting_tensorflow_models.rst Examples reorg (#11350) 2021-04-21 11:11:20 -04:00
custom_datasets.rst Rename NLP library to Datasets library (#10920) 2021-03-26 08:07:59 -04:00
debugging.rst [debug utils] activation/weights underflow/overflow detector (#11274) 2021-04-30 11:15:46 -07:00
examples.md per_device instead of per_gpu/error thrown when argument unknown (#4618) 2020-05-27 11:36:55 -04:00
fast_tokenizers.rst Documentation about loading a fast tokenizer within Transformers (#11029) 2021-04-05 10:51:16 -04:00
favicon.ico Adding usage examples for common tasks (#2850) 2020-02-25 13:48:24 -05:00
glossary.rst Indent code block in the documentation (#11233) 2021-04-13 15:36:36 -04:00
index.rst FlaxBart (#11537) 2021-06-14 15:16:08 +05:30
installation.md Fix two typos in docs (#11852) 2021-05-24 14:26:02 -04:00
migration.md [docs] fix invalid class name (#11438) 2021-04-26 08:37:32 -07:00
model_sharing.rst Fix two typos in docs (#11852) 2021-05-24 14:26:02 -04:00
model_summary.rst Examples reorg (#11350) 2021-04-21 11:11:20 -04:00
multilingual.rst Examples reorg (#11350) 2021-04-21 11:11:20 -04:00
notebooks.md Update notebooks (#3620) 2020-04-06 14:32:39 -04:00
perplexity.rst minor typo fix 2021-04-01 11:58:37 -06:00
philosophy.rst Minor documentation revisions from copyediting (#9266) 2020-12-23 10:15:49 -05:00
preprocessing.rst Minor documentation revisions from copyediting (#9266) 2020-12-23 10:15:49 -05:00
pretrained_models.rst GPT Neo few fixes (#10968) 2021-03-30 11:15:55 -04:00
quicktour.rst Finish Making Quick Tour respect the model object (#11467) 2021-04-27 10:04:12 -04:00
sagemaker.md Examples reorg (#11350) 2021-04-21 11:11:20 -04:00
serialization.rst Copyright (#8970) 2020-12-07 18:36:34 -05:00
task_summary.rst [docs] fix xref to PreTrainedModel.generate (#11049) 2021-06-02 09:21:05 -07:00
testing.rst [testing doc] bring doc up to date (#11359) 2021-04-21 08:51:00 -07:00
tokenizer_summary.rst Minor documentation revisions from copyediting (#9266) 2020-12-23 10:15:49 -05:00
training.rst Update training tutorial (#11533) 2021-05-03 13:18:46 -04:00
troubleshooting.md [troubleshooting] add 2 points of reference to the offline mode (#11236) 2021-04-14 08:39:23 -07:00