token_type_id is converted into the segment embedding. For question answering,
this needs to indicate whether a token belongs to sequence 0 or 1.
encode_plus sets this parameter correctly and automatically.
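
As a rough illustration of the segment ids described above, here is a minimal pure-Python sketch. It assumes a BERT-style pair layout of `[CLS] question [SEP] context [SEP]`; the helper name `build_token_type_ids` and the toy tokens are hypothetical and are not part of the library, which produces these ids itself through encode_plus.

```python
from typing import List


def build_token_type_ids(question_tokens: List[str], context_tokens: List[str]) -> List[int]:
    """Hypothetical helper: segment ids for an assumed BERT-style pair layout.

    Assumed layout: [CLS] question [SEP] context [SEP]
    """
    # Sequence 0 covers [CLS], the question tokens, and the first [SEP].
    segment_0 = [0] * (len(question_tokens) + 2)
    # Sequence 1 covers the context tokens and the final [SEP].
    segment_1 = [1] * (len(context_tokens) + 1)
    return segment_0 + segment_1


# Toy, already-tokenized inputs (illustrative only).
question = ["who", "wrote", "it", "?"]
context = ["alice", "wrote", "it", "."]
token_type_ids = build_token_type_ids(question, context)
print(token_type_ids)
```

In practice the tokenizer returns these ids alongside the input ids, so a helper like this should only be needed when inputs are assembled by hand.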
* Switched from newstest2013 to newstest2014. Fixed a bug where argparse consumed the first command-line argument as model_size instead of using the default model_size; model_size must now be passed explicitly via the --model_size flag
* More pythonic file handling through 'with' context
* COSMETIC - ran Black and isort
* Fixed reference to number of lines in newstest2014
* Fixed failing test. More pythonic file handling
* finish PR from tholiao
* remove commented-out lines
* make style
* make isort happy
Co-authored-by: Thomas Liao <tholiao@gmail.com>
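
The argparse fix noted in the first item of this list can be sketched roughly as follows. This is an assumption-laden illustration, not the script's actual code: the positional name `input_path` and the default value `"t5-base"` are placeholders chosen for the example. Only the `--model_size` flag name comes from the commit message.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(description="Illustrative parser (hypothetical).")
    # Declaring model_size as an explicit flag means it can no longer
    # swallow the first positional command-line argument.
    parser.add_argument(
        "--model_size",
        type=str,
        default="t5-base",  # placeholder default, assumed for this sketch
        help="Model size to use; falls back to the default when omitted.",
    )
    # Hypothetical positional argument standing in for the script's real inputs.
    parser.add_argument("input_path", type=str, help="Path to the input file.")
    return parser


# Without the flag, the first argument stays positional and the default applies.
default_args = build_parser().parse_args(["newstest2014.en"])
# With the flag, the model size is overridden explicitly.
explicit_args = build_parser().parse_args(["--model_size", "t5-large", "newstest2014.en"])
print(default_args.model_size, explicit_args.model_size)
```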
* remove output_past from pt
* make style
* add optional input length for gpt2
* add use cache to prepare input
* save memory in gpt2
* correct gpt2 test inputs
* make past input optional for gpt2
* finish use_cache for all models
* make style
* delete modeling_gpt2 change in test file
* correct docstring
* correct is true statements for gpt2
* added model_cards for polish squad models
* corrected mistake in polish design cards
* updated model_cards for squad2_dutch model
* added links to benchmark models
Co-authored-by: Henryk Borzymowski <henryk.borzymowski@pwc.com>
* Initial commit to get BERT + run_glue.py on TPU
* Add README section for TPU and address comments.
* Cleanup TPU bits from run_glue.py (#3)
TPU runner is currently implemented in:
https://github.com/pytorch-tpu/transformers/blob/tpu/examples/run_glue_tpu.py.
We plan to upstream this directly into `huggingface/transformers`
(either `master` or `tpu`) branch once it's been more thoroughly tested.
* No need to call `xm.mark_step()` explicitly (#4)
For gradient accumulation we accumulate over batches from a
`ParallelLoader` instance, which marks the step itself on next().
* Resolve R/W conflicts from multiprocessing (#5)
* Add XLNet in list of models for `run_glue_tpu.py` (#6)
* Add RoBERTa to list of models in TPU GLUE (#7)
* Add RoBERTa and DistilBert to list of models in TPU GLUE (#8)
* Use barriers to reduce duplicate work/resources (#9)
* Shard eval dataset and aggregate eval metrics (#10)
Also, instead of calling `eval_loss.item()` every time do summation with
tensors on device.
* Change defaultdict to float
* Reduce the pred, label tensors instead of metrics
As brought up during review, some metrics like f1 cannot be aggregated
via averaging. GLUE task metrics depend largely on the dataset, so
instead we sync the prediction and label tensors so that the metrics can
be computed accurately on those instead.
* Only use tb_writer from master (#11)
* Apply huggingface black code formatting
* Style
* Remove `--do_lower_case` as example uses cased
* Add option to specify tensorboard logdir
This is needed for our testing framework which checks regressions
against key metrics written by the summary writer.
* Using configuration for `xla_device`
* Prefix TPU specific comments.
* num_cores clarification and namespace eval metrics
* Cache features file under `args.cache_dir`
Instead of under `args.data_dir`. This is needed as our test infra uses
data_dir with a read-only filesystem.
* Rename `run_glue_tpu` to `run_tpu_glue`
Co-authored-by: LysandreJik <lysandre.debut@reseau.eseo.fr>
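
The reasoning behind "Reduce the pred, label tensors instead of metrics" in the list above can be checked with a small pure-Python sketch. The two shards and the `binary_f1` helper are made up for illustration and use plain lists rather than device tensors. It compares the mean of per-shard F1 scores with F1 computed once over the gathered predictions and labels.

```python
from typing import Sequence


def binary_f1(preds: Sequence[int], labels: Sequence[int]) -> float:
    """F1 for the positive class (label 1); returns 0.0 when undefined."""
    tp = sum(1 for p, l in zip(preds, labels) if p == 1 and l == 1)
    fp = sum(1 for p, l in zip(preds, labels) if p == 1 and l == 0)
    fn = sum(1 for p, l in zip(preds, labels) if p == 0 and l == 1)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


# Hypothetical per-core eval shards (illustrative data only).
shard_preds = [[1, 1, 1, 1], [0, 0, 0, 1]]
shard_labels = [[1, 1, 1, 1], [1, 1, 1, 0]]

# Option A: compute F1 on each shard, then average the metric values.
averaged_f1 = sum(
    binary_f1(p, l) for p, l in zip(shard_preds, shard_labels)
) / len(shard_preds)

# Option B: gather predictions and labels first, then compute F1 once.
all_preds = [p for shard in shard_preds for p in shard]
all_labels = [l for shard in shard_labels for l in shard]
global_f1 = binary_f1(all_preds, all_labels)

print(averaged_f1, global_f1)
```

The two values differ on this data, which is why syncing the raw predictions and labels gives the accurate metric.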
* [examples] Generate argparsers from type hints on dataclasses
* [HfArgumentParser] way simpler API
* Restore run_language_modeling.py for easier diff
* [HfArgumentParser] final tweaks from code review
* Big cleanup of `glue_convert_examples_to_features`
* Use batch_encode_plus
* Cleaner wrapping of glue_convert_examples_to_features for TF
@lysandrejik
* Cleanup syntax, thanks to @mfuntowicz
* Raise explicit error in case of user error
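
The idea behind "Generate argparsers from type hints on dataclasses" in the list above can be sketched with the standard library alone. This is a simplified illustration under stated assumptions, not the HfArgumentParser implementation: `ExampleArguments`, `parser_from_dataclass`, and `parse_into_dataclass` are hypothetical names, and only plain `int`, `float`, and `str` fields are handled.

```python
import argparse
import dataclasses
import typing
from dataclasses import dataclass
from typing import List, Type, TypeVar

T = TypeVar("T")


@dataclass
class ExampleArguments:
    """Hypothetical argument container used only for this sketch."""

    output_dir: str
    learning_rate: float = 5e-5
    num_train_epochs: int = 3


def parser_from_dataclass(cls: type) -> argparse.ArgumentParser:
    """Build one `--flag` per dataclass field, typed from its annotation."""
    parser = argparse.ArgumentParser()
    hints = typing.get_type_hints(cls)  # resolves annotations to real types
    for f in dataclasses.fields(cls):
        if f.default is dataclasses.MISSING:
            # No default on the dataclass: the flag becomes mandatory.
            parser.add_argument(f"--{f.name}", type=hints[f.name], required=True)
        else:
            parser.add_argument(f"--{f.name}", type=hints[f.name], default=f.default)
    return parser


def parse_into_dataclass(cls: Type[T], argv: List[str]) -> T:
    """Parse argv and return a populated dataclass instance."""
    namespace = parser_from_dataclass(cls).parse_args(argv)
    return cls(**vars(namespace))


example_args = parse_into_dataclass(
    ExampleArguments, ["--output_dir", "out", "--num_train_epochs", "5"]
)
print(example_args)
```

Keeping the field definitions as the single source of truth avoids repeating names, types, and defaults in separate `add_argument` calls.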