* [LED] fix global_attention_mask not being passed to generation + clarify docs for gradient checkpointing
* [LED] docs clarification
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
* [LED] gradient_checkpointing=True should be passed to TrainingArguments
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
* [LED] docs: remove wrong word
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
* [LED] docs: fix typo
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>