* Add model card for singbert.
Adding a model card for singbert, a BERT model for Singlish and Manglish.
* Update README.md
Add additional tags and model name.
* Update README.md
Fix tag for malay.
* Update model_cards/zanelim/singbert/README.md
Fix language
Co-authored-by: Kevin Canwen Xu <canwenxu@126.com>
* Add examples and custom widget input.
Co-authored-by: Kevin Canwen Xu <canwenxu@126.com>
* Create PULL_REQUEST_TEMPLATE.md
Proposing to copy this neat feature from PyTorch. This is a small template that lets a PR submitter indicate which issue the PR closes.
* Update .github/PULL_REQUEST_TEMPLATE.md
Co-authored-by: Kevin Canwen Xu <canwenxu@126.com>
* Add optuna hyperparameter search to Trainer
* Apply @julien-c's suggestions
Co-authored-by: Julien Chaumond <chaumond@gmail.com>
* Make compute_objective an arg function
* Formatting
* Rework to make it easier to add ray
* Formatting
* Initial support for Ray
* Formatting
* Polish and finalize
* Add trial id to checkpoint with Ray
* Smaller default
* Use GPU in ray if available
* Formatting
* Fix test
* Update install instruction
Co-authored-by: Richard Liaw <rliaw@berkeley.edu>
* Address review comments
* Formatting post-merge
Co-authored-by: Julien Chaumond <chaumond@gmail.com>
Co-authored-by: Richard Liaw <rliaw@berkeley.edu>
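For readers of this log, a minimal sketch of how the resulting Trainer.hyperparameter_search API can be driven with the optuna backend; the model name, toy dataset, and search space below are illustrative placeholders, not taken from the PR:

import torch
from transformers import (
    AutoModelForSequenceClassification,
    AutoTokenizer,
    Trainer,
    TrainingArguments,
)

class ToyDataset(torch.utils.data.Dataset):
    # Tiny stand-in dataset so the sketch is self-contained; use a real tokenized dataset in practice.
    def __init__(self, tokenizer):
        enc = tokenizer(["great movie", "terrible movie"] * 8, padding=True)
        self.items = [
            {"input_ids": enc["input_ids"][i], "attention_mask": enc["attention_mask"][i], "labels": i % 2}
            for i in range(len(enc["input_ids"]))
        ]
    def __len__(self):
        return len(self.items)
    def __getitem__(self, i):
        return self.items[i]

def model_init():
    # hyperparameter_search re-instantiates the model for every trial via model_init.
    return AutoModelForSequenceClassification.from_pretrained("distilbert-base-uncased")

def hp_space(trial):
    # Search space for the optuna backend; the ray backend expects a dict of ray.tune sampling functions instead.
    return {
        "learning_rate": trial.suggest_float("learning_rate", 1e-5, 5e-5, log=True),
        "num_train_epochs": trial.suggest_int("num_train_epochs", 1, 3),
    }

def compute_objective(metrics):
    # Plain function argument, per "Make compute_objective an arg function".
    return metrics["eval_loss"]

tokenizer = AutoTokenizer.from_pretrained("distilbert-base-uncased")
dataset = ToyDataset(tokenizer)
trainer = Trainer(
    model_init=model_init,
    args=TrainingArguments(output_dir="hp_search"),
    train_dataset=dataset,
    eval_dataset=dataset,
)

best_run = trainer.hyperparameter_search(
    hp_space=hp_space,
    compute_objective=compute_objective,
    n_trials=5,
    direction="minimize",
    backend="optuna",  # or "ray" after `pip install "ray[tune]"`
)
print(best_run.hyperparameters)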
Tested in a local build of the docs, e.g. just above https://huggingface.co/transformers/task_summary.html#causal-language-modeling.
With this fix, the copy button copies the full code, e.g.
for token in top_5_tokens:
    print(sequence.replace(tokenizer.mask_token, tokenizer.decode([token])))
Instead of only the first line, as it currently does:
for token in top_5_tokens:
For reference, the snippet as it appears in the docs, followed by its output:
>>> for token in top_5_tokens:
...     print(sequence.replace(tokenizer.mask_token, tokenizer.decode([token])))
Distilled models are smaller than the models they mimic. Using them instead of the large versions would help reduce our carbon footprint.
Distilled models are smaller than the models they mimic. Using them instead of the large versions would help increase our carbon footprint.
Distilled models are smaller than the models they mimic. Using them instead of the large versions would help decrease our carbon footprint.
Distilled models are smaller than the models they mimic. Using them instead of the large versions would help offset our carbon footprint.
Distilled models are smaller than the models they mimic. Using them instead of the large versions would help improve our carbon footprint.
Docs for the option fix:
https://sphinx-copybutton.readthedocs.io/en/latest/
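For context, a minimal sketch of the sphinx-copybutton settings in docs/source/conf.py that such a fix relies on; the exact regex used in the PR may differ:

# docs/source/conf.py (sketch, not the exact diff from the PR)
extensions = ["sphinx_copybutton"]

# Treat both the ">>> " and "... " prefixes as prompts: they are stripped on copy,
# and lines without them (the printed output) are skipped, so the whole
# multi-line snippet is copied instead of just the first line.
copybutton_prompt_text = r">>> |\.\.\. "
copybutton_prompt_is_regexp = True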
* Feed forward chunking for Distilbert & Albert
* Added ff chunking for many other models
* Change model signature
* Added chunking for XLM
* Cleaned up by removing some variables.
* remove test_chunking flag
Co-authored-by: patrickvonplaten <patrick.v.platen@gmail.com>
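For illustration, a minimal PyTorch sketch of the idea behind feed-forward chunking; this is not the library's helper, just the underlying trick: the feed-forward block acts on each position independently, so it can be applied to slices of the sequence dimension and the results concatenated, trading peak activation memory for extra kernel launches.

import torch

def chunked_feed_forward(ff, hidden_states, chunk_size, chunk_dim=1):
    # Split the sequence dimension into chunks, run the feed-forward block on
    # each chunk, and stitch the results back together. Because the block acts
    # on every position independently, the result matches the unchunked call.
    if chunk_size == 0:
        return ff(hidden_states)
    chunks = hidden_states.split(chunk_size, dim=chunk_dim)
    return torch.cat([ff(chunk) for chunk in chunks], dim=chunk_dim)

# Usage sketch with an illustrative BERT-sized MLP.
ff = torch.nn.Sequential(torch.nn.Linear(768, 3072), torch.nn.GELU(), torch.nn.Linear(3072, 768))
hidden_states = torch.randn(2, 128, 768)  # (batch, seq_len, hidden)
out = chunked_feed_forward(ff, hidden_states, chunk_size=32)
assert torch.allclose(out, ff(hidden_states), atol=1e-5)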