transformers/tests
Nicolas Patry b1aa4982cd
Cleaning up ConversationalPipeline to support more than DialoGPT. (#10002)
* Cleaning up `ConversationalPipeline` to support more than DialoGPT.

Currently, ConversationalPipeline is heavily biased towards DialoGPT,
which is the default model for this pipeline.

This PR proposes moving the DialoGPT-specific modifications back
into tokenizer-specific behavior wherever possible, by creating a
`_build_conversation_input_ids` function that takes a conversation
as input and returns a list of ints corresponding to the tokens.
It feels natural to put this here because models probably all have
different strategies for building input_ids from the full
conversation, and it's the tokenizer's job to transform strings
into tokens (and vice versa).

If `_build_conversation_input_ids` is missing, the previous behavior is
used, so nothing breaks so far (except for blenderbot, where it's a fix).

This PR also contains a fix for inputs that are too long. There used
to be dead code that tried to limit the size of the incoming input.
The fix introduced here is that we limit the input
within `_build_conversation_input_ids` to `tokenizer.model_max_length`.
This matches the intent of the removed dead code and is actually
better because it uses `model_max_length`, which is different
from `max_length` (which is a default parameter for `generate`).
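
As a rough illustration of the truncation described above, here is a
minimal sketch in the DialoGPT style (each turn followed by EOS, then
keep only the most recent `model_max_length` tokens). `ToyTokenizer`
and the free function `build_conversation_input_ids` are hypothetical
stand-ins for illustration, not the code from this PR.

```python
from typing import Dict, Iterable, List, Tuple


class ToyTokenizer:
    """Hypothetical whitespace tokenizer standing in for a real one."""

    def __init__(self, model_max_length: int) -> None:
        self.model_max_length = model_max_length
        self.eos_token_id = 0
        self._vocab: Dict[str, int] = {}

    def encode(self, text: str) -> List[int]:
        # Ids start at 1 and are assigned in order of first appearance.
        return [self._vocab.setdefault(w, len(self._vocab) + 1) for w in text.split()]


def build_conversation_input_ids(
    tokenizer: ToyTokenizer, texts: Iterable[Tuple[bool, str]]
) -> List[int]:
    """Concatenate every turn followed by EOS, then keep the newest tokens."""
    input_ids: List[int] = []
    for _is_user, text in texts:
        input_ids.extend(tokenizer.encode(text) + [tokenizer.eos_token_id])
    # Truncate from the left so the most recent context is preserved.
    if len(input_ids) > tokenizer.model_max_length:
        input_ids = input_ids[-tokenizer.model_max_length:]
    return input_ids


turns = [(True, "hello there"), (False, "hi"), (True, "how are you")]
print(build_conversation_input_ids(ToyTokenizer(model_max_length=4), turns))
```

Truncating from the left is an assumption of this sketch; a model with
a different input layout may need a different strategy.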

- Removed the `history` logic from `Conversation`, as it is no longer
relevant now that tokenization logic has moved to the tokenizer.
The tokenizer cannot save any cache, and the conversation cannot know
what is relevant or not.
It is also not usable with `blenderbot` because the input_ids are
not append-only (the EOS token is always at the end).

- Added an `iter_texts` method on `Conversation` because all
the code was littered with some form of this iteration over
past/generated_responses.
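
The iteration that `iter_texts` centralizes can be sketched roughly as
follows. `ConversationSketch` and its field names are assumptions made
for this example; the real `Conversation` class may differ.

```python
from dataclasses import dataclass, field
from typing import Iterator, List, Optional, Tuple


@dataclass
class ConversationSketch:
    """Hypothetical, simplified stand-in for `Conversation`."""

    past_user_inputs: List[str] = field(default_factory=list)
    generated_responses: List[str] = field(default_factory=list)
    new_user_input: Optional[str] = None

    def iter_texts(self) -> Iterator[Tuple[bool, str]]:
        """Yield (is_user, text) pairs in chronological order."""
        for user_input, response in zip(self.past_user_inputs, self.generated_responses):
            yield True, user_input
            yield False, response
        # The pending user message, if any, always comes last.
        if self.new_user_input:
            yield True, self.new_user_input


conv = ConversationSketch(
    past_user_inputs=["Hi"],
    generated_responses=["Hello!"],
    new_user_input="How are you?",
)
for is_user, text in conv.iter_texts():
    print("user" if is_user else "bot", text)
```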

* Removing torch mention in types.

* Adding type checking to `_build_conversation_input_ids`.

* Fixing import in strings.
2021-02-08 14:29:07 +03:00
..
fixtures New run_seq2seq script (#9605) 2021-01-19 15:22:17 -05:00
__init__.py GPU text generation: mMoved the encoded_prompt to correct device 2020-01-06 15:11:12 +01:00
conftest.py Copyright (#8970) 2020-12-07 18:36:34 -05:00
test_activations_tf.py Copyright (#8970) 2020-12-07 18:36:34 -05:00
test_activations.py Copyright (#8970) 2020-12-07 18:36:34 -05:00
test_benchmark_tf.py Copyright (#8970) 2020-12-07 18:36:34 -05:00
test_benchmark.py Copyright (#8970) 2020-12-07 18:36:34 -05:00
test_cli.py Copyright (#8970) 2020-12-07 18:36:34 -05:00
test_configuration_auto.py Reorganize repo (#8580) 2020-11-16 21:43:42 -05:00
test_configuration_common.py [PretrainedConfig] Fix save pretrained config for edge case (#7943) 2020-10-22 15:39:01 +02:00
test_data_collator.py Copyright (#8970) 2020-12-07 18:36:34 -05:00
test_doc_samples.py Fix ignore list behavior in doctests (#8213) 2020-11-02 08:47:37 -05:00
test_file_utils.py Copyright (#8970) 2020-12-07 18:36:34 -05:00
test_flax_auto.py Copyright (#8970) 2020-12-07 18:36:34 -05:00
test_generation_beam_search.py Add flags to return scores, hidden states and / or attention weights in GenerationMixin (#9150) 2021-01-06 17:11:42 +01:00
test_generation_logits_process.py Adding new encoder_no_repeat_ngram_size to generate. (#9984) 2021-02-04 15:00:18 +01:00
test_generation_utils.py [Flaky Generation Tests] Make sure that no early stopping is happening for beam search (#9794) 2021-01-26 03:21:44 -05:00
test_hf_api.py transformers-cli: LFS multipart uploads (> 5GB) (#8663) 2020-12-07 16:38:39 -05:00
test_hf_argparser.py [traner] fix --lr_scheduler_type choices (#9800) 2021-01-27 10:12:15 -05:00
test_logging.py Copyright (#8970) 2020-12-07 18:36:34 -05:00
test_model_card.py GPU text generation: mMoved the encoded_prompt to correct device 2020-01-06 15:11:12 +01:00
test_model_output.py Add tests and fix various bugs in ModelOutput (#7073) 2020-09-11 12:01:33 -04:00
test_modeling_albert.py Alber model integration testing added (#9980) 2021-02-03 11:41:10 -05:00
test_modeling_auto.py LayoutLM Config (#9539) 2021-01-12 10:03:50 -05:00
test_modeling_bart.py fix bart tests (#10060) 2021-02-08 13:25:09 +03:00
test_modeling_bert_generation.py Add caching mechanism to BERT, RoBERTa (#9183) 2020-12-23 23:01:32 +05:30
test_modeling_bert.py Add caching mechanism to BERT, RoBERTa (#9183) 2020-12-23 23:01:32 +05:30
test_modeling_blenderbot_small.py BartForCausalLM analogs to ProphetNetForCausalLM (#9128) 2021-02-04 11:56:12 +03:00
test_modeling_blenderbot.py Hotfixing tests (blenderbot decoderonly tests, also need to remove (#10003) 2021-02-04 11:41:34 -05:00
test_modeling_bort.py ADD BORT (#9813) 2021-01-27 21:25:11 +03:00
test_modeling_camembert.py Copyright (#8970) 2020-12-07 18:36:34 -05:00
test_modeling_common.py Add head_mask and decoder_head_mask to PyTorch LED (#9856) 2021-02-02 11:06:52 -08:00
test_modeling_convbert.py ConvBERT Model (#9717) 2021-01-27 03:20:09 -05:00
test_modeling_ctrl.py Ctrl for sequence classification (#8812) 2020-12-01 09:49:27 +01:00
test_modeling_deberta.py Add DeBERTa head models (#9691) 2021-01-20 10:18:50 -05:00
test_modeling_distilbert.py Added Integration testing for DistilBert model from issue #9948' (#9995) 2021-02-04 04:24:59 -05:00
test_modeling_dpr.py Fix slow dpr test (#10059) 2021-02-08 04:43:25 -05:00
test_modeling_electra.py Copyright (#8970) 2020-12-07 18:36:34 -05:00
test_modeling_encoder_decoder.py BartForCausalLM analogs to ProphetNetForCausalLM (#9128) 2021-02-04 11:56:12 +03:00
test_modeling_flaubert.py Integration test for FlauBert (#10022) 2021-02-08 04:36:50 -05:00
test_modeling_flax_bert.py [Flax] Align FlaxBertForMaskedLM with BertForMaskedLM, implement from_pretrained, init (#9054) 2020-12-16 13:03:32 +01:00
test_modeling_flax_common.py [Flax] Align FlaxBertForMaskedLM with BertForMaskedLM, implement from_pretrained, init (#9054) 2020-12-16 13:03:32 +01:00
test_modeling_flax_roberta.py [Flax] Align FlaxBertForMaskedLM with BertForMaskedLM, implement from_pretrained, init (#9054) 2020-12-16 13:03:32 +01:00
test_modeling_fsmt.py Add head_mask and decoder_head_mask to FSMT (#9819) 2021-02-01 09:30:21 +03:00
test_modeling_funnel.py Switch return_dict to True by default. (#8530) 2020-11-16 11:43:00 -05:00
test_modeling_gpt2.py Update past_key_values in GPT-2 (#9596) 2021-01-19 16:00:15 +01:00
test_modeling_layoutlm.py Fix slow tests v4.2.0 (#9561) 2021-01-13 09:55:48 -05:00
test_modeling_led.py Add head_mask and decoder_head_mask to PyTorch LED (#9856) 2021-02-02 11:06:52 -08:00
test_modeling_longformer.py Add head_mask and decoder_head_mask to PyTorch LED (#9856) 2021-02-02 11:06:52 -08:00
test_modeling_lxmert.py Remove redundant test_head_masking = True flags in test files (#9858) 2021-01-28 10:09:13 -05:00
test_modeling_marian.py BartForCausalLM analogs to ProphetNetForCausalLM (#9128) 2021-02-04 11:56:12 +03:00
test_modeling_mbart.py BartForCausalLM analogs to ProphetNetForCausalLM (#9128) 2021-02-04 11:56:12 +03:00
test_modeling_mobilebert.py Removed unused encoder_hidden_states and encoder_attention_mask (#8972) 2020-12-08 12:04:34 -05:00
test_modeling_mpnet.py Add MP Net 2 (#9004) 2020-12-09 10:32:43 -05:00
test_modeling_mt5.py Copyright (#8970) 2020-12-07 18:36:34 -05:00
test_modeling_openai.py Copyright (#8970) 2020-12-07 18:36:34 -05:00
test_modeling_pegasus.py BartForCausalLM analogs to ProphetNetForCausalLM (#9128) 2021-02-04 11:56:12 +03:00
test_modeling_prophetnet.py Allow text generation for ProphetNetForCausalLM (#9707) 2021-01-21 11:13:38 +01:00
test_modeling_rag.py Proposed Fix : [RagSequenceForGeneration] generate "without" input_ids (#9220) 2020-12-24 13:38:00 +01:00
test_modeling_reformer.py Add flags to return scores, hidden states and / or attention weights in GenerationMixin (#9150) 2021-01-06 17:11:42 +01:00
test_modeling_roberta.py Add caching mechanism to BERT, RoBERTa (#9183) 2020-12-23 23:01:32 +05:30
test_modeling_squeezebert.py Switch return_dict to True by default. (#8530) 2020-11-16 11:43:00 -05:00
test_modeling_t5.py [EncoderDecoder] Make tests more aggressive (#9256) 2020-12-22 17:00:04 +01:00
test_modeling_tapas.py Remove tolerance + drop_rows_to_fit by default (#9507) 2021-01-11 08:02:41 -05:00
test_modeling_tf_albert.py Added integration tests for TensorFlow implementation of the ALBERT model (#9976) 2021-02-03 09:49:18 -05:00
test_modeling_tf_auto.py Optional layers (#8961) 2020-12-08 09:14:09 -05:00
test_modeling_tf_bart.py Add XLA test (#9848) 2021-01-29 11:25:03 +01:00
test_modeling_tf_bert.py Add head_mask/decoder_head_mask for TF BART models (#9639) 2021-01-26 03:50:00 -05:00
test_modeling_tf_blenderbot_small.py Add XLA test (#9848) 2021-01-29 11:25:03 +01:00
test_modeling_tf_blenderbot.py Add XLA test (#9848) 2021-01-29 11:25:03 +01:00
test_modeling_tf_bort.py ADD BORT (#9813) 2021-01-27 21:25:11 +03:00
test_modeling_tf_camembert.py Copyright (#8970) 2020-12-07 18:36:34 -05:00
test_modeling_tf_common.py Fix Longformer and LED (#9942) 2021-02-03 12:26:32 +01:00
test_modeling_tf_convbert.py Add XLA test (#9848) 2021-01-29 11:25:03 +01:00
test_modeling_tf_ctrl.py Add XLA test (#9848) 2021-01-29 11:25:03 +01:00
test_modeling_tf_distilbert.py TF DistilBERT integration tests (#9975) 2021-02-03 09:51:00 -05:00
test_modeling_tf_dpr.py New serving (#9419) 2021-01-07 11:48:49 +01:00
test_modeling_tf_electra.py Add head_mask/decoder_head_mask for TF BART models (#9639) 2021-01-26 03:50:00 -05:00
test_modeling_tf_flaubert.py Add XLA test (#9848) 2021-01-29 11:25:03 +01:00
test_modeling_tf_funnel.py Add a test for mixed precision (#9806) 2021-01-27 03:36:49 -05:00
test_modeling_tf_gpt2.py Add XLA test (#9848) 2021-01-29 11:25:03 +01:00
test_modeling_tf_led.py Add XLA test (#9848) 2021-01-29 11:25:03 +01:00
test_modeling_tf_longformer.py Add XLA test (#9848) 2021-01-29 11:25:03 +01:00
test_modeling_tf_lxmert.py Add a test for mixed precision (#9806) 2021-01-27 03:36:49 -05:00
test_modeling_tf_marian.py Add XLA test (#9848) 2021-01-29 11:25:03 +01:00
test_modeling_tf_mbart.py Add XLA test (#9848) 2021-01-29 11:25:03 +01:00
test_modeling_tf_mobilebert.py Integration test for mobilebert (#9978) 2021-02-03 11:36:45 -05:00
test_modeling_tf_mpnet.py Integration test added for TF MPnet (#9979) 2021-02-03 11:39:40 -05:00
test_modeling_tf_mt5.py Copyright (#8970) 2020-12-07 18:36:34 -05:00
test_modeling_tf_openai.py Add XLA test (#9848) 2021-01-29 11:25:03 +01:00
test_modeling_tf_pegasus.py Add XLA test (#9848) 2021-01-29 11:25:03 +01:00
test_modeling_tf_pytorch.py Optional layers (#8961) 2020-12-08 09:14:09 -05:00
test_modeling_tf_roberta.py Add head_mask/decoder_head_mask for TF BART models (#9639) 2021-01-26 03:50:00 -05:00
test_modeling_tf_t5.py Add XLA test (#9848) 2021-01-29 11:25:03 +01:00
test_modeling_tf_transfo_xl.py Add XLA test (#9848) 2021-01-29 11:25:03 +01:00
test_modeling_tf_xlm_roberta.py Copyright (#8970) 2020-12-07 18:36:34 -05:00
test_modeling_tf_xlm.py Add XLA test (#9848) 2021-01-29 11:25:03 +01:00
test_modeling_tf_xlnet.py Add head_mask/decoder_head_mask for TF BART models (#9639) 2021-01-26 03:50:00 -05:00
test_modeling_transfo_xl.py Add flags to return scores, hidden states and / or attention weights in GenerationMixin (#9150) 2021-01-06 17:11:42 +01:00
test_modeling_wav2vec2.py Wav2Vec2 (#9659) 2021-02-02 15:52:10 +03:00
test_modeling_xlm_prophetnet.py Ci test tf super slow (#8007) 2020-10-30 10:25:48 -04:00
test_modeling_xlm_roberta.py Copyright (#8970) 2020-12-07 18:36:34 -05:00
test_modeling_xlm.py Add flags to return scores, hidden states and / or attention weights in GenerationMixin (#9150) 2021-01-06 17:11:42 +01:00
test_modeling_xlnet.py Add flags to return scores, hidden states and / or attention weights in GenerationMixin (#9150) 2021-01-06 17:11:42 +01:00
test_onnx.py Copyright (#8970) 2020-12-07 18:36:34 -05:00
test_optimization_tf.py Use stable functions (#9369) 2021-01-05 03:58:26 -05:00
test_optimization.py Copyright (#8970) 2020-12-07 18:36:34 -05:00
test_pipelines_common.py Copyright (#8970) 2020-12-07 18:36:34 -05:00
test_pipelines_conversational.py Cleaning up ConversationalPipeline to support more than DialoGPT. (#10002) 2021-02-08 14:29:07 +03:00
test_pipelines_feature_extraction.py Copyright (#8970) 2020-12-07 18:36:34 -05:00
test_pipelines_fill_mask.py Adding skip_special_tokens=True to FillMaskPipeline (#9783) 2021-01-26 10:06:28 +01:00
test_pipelines_ner.py Copyright (#8970) 2020-12-07 18:36:34 -05:00
test_pipelines_question_answering.py Copyright (#8970) 2020-12-07 18:36:34 -05:00
test_pipelines_sentiment_analysis.py Copyright (#8970) 2020-12-07 18:36:34 -05:00
test_pipelines_summarization.py Enable TruncationStrategy override for pipelines (#9432) 2021-01-11 09:23:28 -05:00
test_pipelines_table_question_answering.py Adding a test to prevent late failure in the Table question answering (#9808) 2021-01-27 04:10:53 -05:00
test_pipelines_text_generation.py Adding a new return_full_text parameter to TextGenerationPipeline. (#9852) 2021-01-29 10:27:32 +01:00
test_pipelines_text2text_generation.py Copyright (#8970) 2020-12-07 18:36:34 -05:00
test_pipelines_translation.py Copyright (#8970) 2020-12-07 18:36:34 -05:00
test_pipelines_zero_shot.py Copyright (#8970) 2020-12-07 18:36:34 -05:00
test_retrieval_rag.py Copyright (#8970) 2020-12-07 18:36:34 -05:00
test_skip_decorators.py [testing] rename skip targets + docs (#7863) 2020-10-20 04:39:13 -04:00
test_tokenization_albert.py ALBERT Tokenizer integration test (#9943) 2021-02-02 04:39:33 -05:00
test_tokenization_auto.py Copyright (#8970) 2020-12-07 18:36:34 -05:00
test_tokenization_bart.py Copyright (#8970) 2020-12-07 18:36:34 -05:00
test_tokenization_barthez.py Add barthez model (#8393) 2020-11-27 12:31:42 -05:00
test_tokenization_bert_generation.py Copyright (#8970) 2020-12-07 18:36:34 -05:00
test_tokenization_bert_japanese.py Copyright (#8970) 2020-12-07 18:36:34 -05:00
test_tokenization_bert.py Copyright (#8970) 2020-12-07 18:36:34 -05:00
test_tokenization_bertweet.py Reorganize repo (#8580) 2020-11-16 21:43:42 -05:00
test_tokenization_blenderbot.py [PyTorch Bart] Split Bart into different models (#9343) 2021-01-05 22:00:05 +01:00
test_tokenization_camembert.py Fast transformers import part 1 (#9441) 2021-01-06 12:17:24 -05:00
test_tokenization_common.py [Tokenizer Utils Base] Make pad function more flexible (#9928) 2021-02-02 10:35:27 +03:00
test_tokenization_ctrl.py Refactor prepare_seq2seq_batch (#9524) 2021-01-12 18:19:38 -05:00
test_tokenization_deberta.py Copyright (#8970) 2020-12-07 18:36:34 -05:00
test_tokenization_distilbert.py Copyright (#8970) 2020-12-07 18:36:34 -05:00
test_tokenization_dpr.py [Dependencies|tokenizers] Make both SentencePiece and Tokenizers optional dependencies (#7659) 2020-10-18 20:51:24 +02:00
test_tokenization_fsmt.py Copyright (#8970) 2020-12-07 18:36:34 -05:00
test_tokenization_funnel.py Reorganize repo (#8580) 2020-11-16 21:43:42 -05:00
test_tokenization_gpt2.py [Tokenizer Utils Base] Make pad function more flexible (#9928) 2021-02-02 10:35:27 +03:00
test_tokenization_herbert.py Reorganize repo (#8580) 2020-11-16 21:43:42 -05:00
test_tokenization_layoutlm.py Copyright (#8970) 2020-12-07 18:36:34 -05:00
test_tokenization_lxmert.py Copyright (#8970) 2020-12-07 18:36:34 -05:00
test_tokenization_marian.py Fast transformers import part 1 (#9441) 2021-01-06 12:17:24 -05:00
test_tokenization_mbart.py Fast transformers import part 1 (#9441) 2021-01-06 12:17:24 -05:00
test_tokenization_mpnet.py [MPNet] Add slow to fast tokenizer converter (#9233) 2020-12-21 15:41:34 +01:00
test_tokenization_openai.py [Tokenizer Utils Base] Make pad function more flexible (#9928) 2021-02-02 10:35:27 +03:00
test_tokenization_pegasus.py Copyright (#8970) 2020-12-07 18:36:34 -05:00
test_tokenization_phobert.py Reorganize repo (#8580) 2020-11-16 21:43:42 -05:00
test_tokenization_prophetnet.py Reorganize repo (#8580) 2020-11-16 21:43:42 -05:00
test_tokenization_rag.py Copyright (#8970) 2020-12-07 18:36:34 -05:00
test_tokenization_reformer.py [Tokenizer Utils Base] Make pad function more flexible (#9928) 2021-02-02 10:35:27 +03:00
test_tokenization_roberta.py Copyright (#8970) 2020-12-07 18:36:34 -05:00
test_tokenization_small_blenderbot.py [PyTorch Bart] Split Bart into different models (#9343) 2021-01-05 22:00:05 +01:00
test_tokenization_squeezebert.py [Dependencies|tokenizers] Make both SentencePiece and Tokenizers optional dependencies (#7659) 2020-10-18 20:51:24 +02:00
test_tokenization_t5.py Fast transformers import part 1 (#9441) 2021-01-06 12:17:24 -05:00
test_tokenization_tapas.py Fix slow tests v4.2.0 (#9561) 2021-01-13 09:55:48 -05:00
test_tokenization_transfo_xl.py Refactor prepare_seq2seq_batch (#9524) 2021-01-12 18:19:38 -05:00
test_tokenization_utils.py Copyright (#8970) 2020-12-07 18:36:34 -05:00
test_tokenization_wav2vec2.py Wav2Vec2 (#9659) 2021-02-02 15:52:10 +03:00
test_tokenization_xlm_prophetnet.py Reorganize repo (#8580) 2020-11-16 21:43:42 -05:00
test_tokenization_xlm_roberta.py Copyright (#8970) 2020-12-07 18:36:34 -05:00
test_tokenization_xlm.py Copyright (#8970) 2020-12-07 18:36:34 -05:00
test_tokenization_xlnet.py Copyright (#8970) 2020-12-07 18:36:34 -05:00
test_trainer_callback.py Copyright (#8970) 2020-12-07 18:36:34 -05:00
test_trainer_distributed.py Copyright (#8970) 2020-12-07 18:36:34 -05:00
test_trainer_seq2seq.py Fix slow tests v4.2.0 (#9561) 2021-01-13 09:55:48 -05:00
test_trainer_tpu.py Copyright (#8970) 2020-12-07 18:36:34 -05:00
test_trainer_utils.py Fix memory regression in Seq2Seq example (#9713) 2021-01-21 12:05:46 -05:00
test_trainer.py Deprecate model_path in Trainer.train (#9854) 2021-01-28 08:32:46 -05:00
test_utils_check_copies.py Copyright (#8970) 2020-12-07 18:36:34 -05:00
test_versions_utils.py Copyright (#8970) 2020-12-07 18:36:34 -05:00