Arthur
44e3e3fb49
prepare for "__floordiv__ is deprecated and its behavior will change in a future version of pytorch" ( #20211 )
...
* rounding_mode = "floor" instead of // to prevent behavioral change
* add other TODO
* use `torch_int_div` from pytrch_utils
* same for tests
* fix copies
* style
* use relative imports when needed
* Co-authored-by: sgugger <sylvain.gugger@gmail.com>
2023-03-01 10:49:21 +01:00
Aaron Gokaslan
5e8c8eb5ba
Apply ruff flake8-comprehensions ( #21694 )
2023-02-22 09:14:54 +01:00
Joao Gante
13e03e619d
Generate: filter encoder inputs when its signature does not accept wildcards ( #21603 )
2023-02-14 10:46:46 +00:00
Joao Gante
fa4bdb0a40
Generate: correct default model input creation for decoder-only models ( #21580 )
2023-02-13 17:04:49 +00:00
Joao Gante
24273268b7
Generate: Fix flaky indexing error in test_constrained_beam_search_generate_dict_output
( #21561 )
2023-02-13 15:12:07 +00:00
Joao Gante
eb6c59bc78
Generate: TF supports multiple eos tokens ( #21571 )
2023-02-13 12:24:22 +00:00
Joao Gante
2edf9a857b
Generate: TF .generate()
can now be exported with dynamic length ( #21474 )
2023-02-09 12:52:30 +00:00
Joao Gante
e69f9715eb
Generate: make TF .generate()
signature == PT .generate()
signature ( #21525 )
2023-02-09 11:10:13 +00:00
Motoki Wu
9960506cbe
Fix multiple eos_token_id
s in model.generate(...) ( #21461 )
...
* add tests with multiple eos_token_ids
* make math.prod instead of sum
* make fixup
* fix long and also use np.prod since math.prod does not exist <python 3.8
* make fixup
* add prod util
* use prod util instead of np.prod
* make fixup
* previous .long location
* use tensor ops
* remove prod
* remove prod
* update device
* make fixup
* fix none
2023-02-08 13:48:46 -05:00
Joao Gante
1d9c26a4b8
Generate: TF compute_transition_scores
( #21341 )
2023-02-08 16:36:43 +00:00
Joao Gante
1e4cf8bb44
Generate: TF can now generate from embeddings in encoder-decoder models ( #21475 )
2023-02-07 11:18:23 +00:00
Sylvain Gugger
6f79d26442
Update quality tooling for formatting ( #21480 )
...
* Result of black 23.1
* Update target to Python 3.7
* Switch flake8 to ruff
* Configure isort
* Configure isort
* Apply isort with line limit
* Put the right black version
* adapt black in check copies
* Fix copies
2023-02-06 18:10:56 -05:00
Joao Gante
4943331015
Generate: TF can now accept custom logits processors ( #21454 )
2023-02-06 15:44:47 +00:00
Yih-Dar
59d5edef34
Avoid flaky generation sampling tests ( #21445 )
...
* fix
* fix
---------
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2023-02-03 22:01:25 +01:00
Joao Gante
f21af26279
🚨 🚨 Generate: standardize beam search behavior across frameworks ( #21368 )
2023-02-03 10:24:02 +00:00
Joao Gante
92ce53aab8
Generate: decoder-only models can generate with inputs_embeds
( #21405 )
2023-02-01 21:50:38 +00:00
Joao Gante
623346ab18
Template for framework-agnostic tests ( #21348 )
2023-01-31 11:33:18 +00:00
Joao Gante
42b60f8b02
Generate: Relaxed max_length
and max_new_tokens
coexistence ( #21347 )
...
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
2023-01-30 17:53:54 +00:00
Arthur
94a7edd938
[GenerationConfig] add additional kwargs handling ( #21269 )
...
* add additional kwargs handling
* fix issue when serializing
* correct order of kwargs removal for serialization in from dict
* add `dict_torch_dtype_to_str` in case a dtype is needed for generation
* add condition when adding the kwargs : not from config
* Add comment based on review
Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>
* add test function
* default None when poping arg
Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>
2023-01-24 19:04:42 +01:00
Joao Gante
1eda4a4102
Generate: save generation config with the models' .save_pretrained()
( #21264 )
2023-01-23 16:21:44 +00:00
Joao Gante
af37d183b3
Generate: documented function to compute the transition scores ( #21191 )
...
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
2023-01-20 12:50:01 +00:00
Karim Foda
b9403e9516
Add hallucination filter ( #18675 )
...
* Add hallucination penalty
* Make quality changes
* Inverse penalty
* Fix imports & quality
* Fix name spelling issue
* set encoder_repetition_penalty and fix quality
* Fix failing test
* Add to config_common_kwargs
* Fix modelling_rag error
* Update src/transformers/generation_logits_process.py
Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>
* Remove breakpoint
* Make style fixes
* Update encoder_repetition_penalty default value
* Merge latest main changes
* Make fixup changes
* Add EncoderRepetitionPenaltyLogitsProcessor to generation/__init__.py
* Fix repo-inconsistency
* Remove venv
* Remove tensorflow-macos & add tests
* Add documentation
* Fix quality issues
* move encoder_repetition_penalty to config
* Update src/transformers/configuration_utils.py
Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>
* Update src/transformers/generation/configuration_utils.py
Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>
* Remove encoder_repetition_penalty from tests
* Fix type error
* Fix format error
Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>
2023-01-19 11:20:25 -05:00
Sherman Siu
865da84abb
Add Epsilon- and Eta-Sampling ( #21121 )
...
* Add epsilon- and eta-sampling.
Add epsilon- and eta-sampling, following the official code from https://github.com/john-hewitt/truncation-sampling and adapting to be more configurable, as required by Huggingface transformers.
* Add unit tests for epsilon- and eta-sampling.
* Black: fix code formatting.
* Fix docstring spacing.
* Clean up newlines.
* Fix implementation bugs and their associated tests.
* Remove epsilon- and eta-sampling parameters from PretrainedConfig.
* Clarify and clean up the documentation.
* Remove parameters for PretrainedConfig test.
2023-01-17 13:04:32 -05:00
Joao Gante
b91048968b
Generate: Fix CI related to #20727 ( #21003 )
2023-01-04 20:26:56 +00:00
Motoki Wu
45da7cec5a
Add custom stop token ids for generation ( #20727 )
...
* Add StopIdStoppingCriteria
* add a working test for stop id criteria
* add to global scope
* add stop_ids to generate
* add pipeline test
* use tokenizer encode in test
* add test to generation utils
* reformat
* fixup
* make-fix-copies
* rename to stop_token_id
* use stop_tokens instead
* add to text to text generation
* make fixup
* make repo-consistency
* Add support for list of ints for eos_token_id inside generation/utils.py
* Instead of having if elses, cast the eos_token_id into a List[int]
* Add List[int] support for logits_process.py
* add List[int] for beam_search.py
* add List[int] for forced_eos_token_id
* revert stop token id stopping criteria changes
* make fixup
* fix tests
* add eos_token_id to generation/utils.py and added tests test_utils.py
* add eos_token_id type hints and fix for pad tokens
* add comments
* remove some prints and remove forced false test
* fix
* put back test_stop_sequence_stopping_criteria
* remove unused import and make fixup
* add a none check
* update docstring
* add more docstring for list ints
* make fixup
2023-01-03 15:18:24 -05:00
Konstantin Kotik
367fdf3330
MinNewTokensLengthLogitsProcessor
for .generate
method #20814 ( #20892 )
...
* feat: add min new length logit processor
* test: add min new length logit processor
* docs: add MinNewTokensLengthLogitsProcessor
* feat: import MinNewTokensLengthLogitsProcessor
* fix: update pytorch dummy objects
* refactor & fix: rename attributes and var and get rid of dynamic attribute
* tests: align test with new interface
* docs: fix typo
* docs: minor clarification
* Empty-Commit
* empty commit
* run automated quality edits
Co-authored-by: Joao Gante <joao@huggingface.co>
2023-01-03 06:29:02 -05:00
fzyzcjy
ae3cbbcaf6
Fix tiny typo ( #20841 )
...
* Fix typo
* Update README.md
* Update run_mlm_flax_stream.py
* Update README.md
2022-12-20 03:17:59 -05:00
Joao Gante
4bc723f87d
Generate: use GenerationConfig
as the basis for .generate()
parametrization ( #20388 )
...
* generate from config mvp
* fix failing tests
* max_time test
* Load default gen config at model load time; Update docs
* further documentation; add tests
* adapt rag to the new structure
* handle models not instantiated with from_pretained (like in tests)
* better default generation config
* add can_generate fn
* handle legacy use case of ad hoc model config changes
* initialize gen config from config in individual methods, if gen config is none
* fix _get_decoder_start_token_id when called outside GenerationMixin
* correct model config load order (set attr > model config > decoder config)
* update rag to match latest changes
* Apply suggestions from code review
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* load gen config from model config in model.from_pretrained
* fix can_generate fn
* handle generate calls without a previous from_pretrained (e.g. tests)
* add legacy behavior (and a warning)
* lower logger severity
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2022-12-15 18:27:20 +00:00
Joao Gante
4cf38148dc
Generate: model_kwargs
can also be an input to prepare_inputs_for_generation
( #20353 )
2022-11-21 16:20:27 +00:00
Joao Gante
3de07473da
Generate: add generation config class ( #20218 )
...
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2022-11-21 13:30:15 +00:00
Joao Gante
938cb04789
Generate: add Bloom fixes for contrastive search ( #20213 )
2022-11-14 18:34:11 +00:00
Joao Gante
f270b960d6
Generate: move generation_*.py src files into generation/*.py ( #20096 )
...
* move generation_*.py src files into generation/*.py
* populate generation.__init__ with lazy loading
* move imports and references from generation.xxx.object to generation.object
2022-11-09 15:34:08 +00:00
Joao Gante
831590f6a9
Generate: contrastive search with full optional outputs ( #19963 )
...
* Use beam search functionality; Add extra outputs and test
* Add full tests for contrastive search
* Add error message on unconventional cache format
2022-11-01 18:15:36 +00:00
Joao Gante
e0b825a8d0
Generate: contrastive search test updates ( #19787 )
...
* contrastive search test updates
* make fixup
2022-10-21 19:10:08 +01:00
GMFTBY
71786b10c5
Adding the state-of-the-art contrastive search decoding methods for the codebase of generation_utils.py ( #19477 )
...
* add: the contrastive search for generaton_utils
* add: testing scripts for contrastive search under examples/text-generation
* update the quality of codes
* revise the docstring; make the generation_contrastive_search.py scripts;
* revise the examples/pytorch/text-generation/run_generation_contrastive_search.py to the auto-APIs format
* revise the necessary documents
* fix: revise the docstring of generation_contrastive_search.py
* Fix the code indentation
* fix: revise the nits and examples in contrastive_search docstring.
* fix the copyright
* delete generation_contrastive_search.py
* revise the logic in contrastive_search
* update the intergration test and the docstring
* run the tests over
* add the slow decorate to the contrastive_search intergrate test
* add more test
* do the style, quality, consistency checks
2022-10-19 10:17:46 +01:00
amyeroberts
e3f028f3af
Add TF whisper ( #19378 )
...
* simplify loop
* add featur extractor
* add model
* start conversion
* add dropout
* initial commit of test files
* copnversion for all models
* update processor for correct padding
* update feature extraction
* update integration test logits match
* fmnt: off for the logits
* on the fly mel bank
* small nit
* update test
* update tokenizer
* nit feature extraction
* update
* update tokenizer test
* adds logit processor and update tokenizer to get supress tokens
* style
* clean convert
* revert to original modeling tf utils
* Update
* update
* nit
* clean convert file
* update tests and nits
* quality
* slow generation test
* ffn_dim to allow customization
* update readme
* add to toctreee
* start fixing integration tests
* update tests and code
* fix feature extractor
* fix config tests common
* update code to fix tests
* fix feature exctractor
* nit feature extraction
* update test for new feature extractor
* style
* add absrtact
* large logits wioth custom decoder input ids
* wraap around is otrch available
* fix feature extractor
* correct logits for whisper small.en
* nit
* fix encoder_attentino_mask
* some fixes
* remove unnecessary inputs
* nits
* add normalizer file
* update etst tokenization
* fix attention mask not defined
* fix generate
* remove uncoder attention mask useless
* update test modeling whisper
* update condfig to add second non supress tokens
* nits on feature exrtactor
* nit for test tokenizers
* update etsts
* update tests
* update tokenization test
* fixup
* invalidated hf token. Clean convert openai to whisper
* fix logit tests
* fixup
* Add model to README
* Fix doc tests
* clean merge
* revert toc_tree changes
* remove useless LogitProcessor
* Update whisper .mdx
* update config file doc
* update configuration docstring
* update test tokenization
* update test tokenization
* update tokenization whisper
Added copied from where needed
* update feature extraction
* nit test name
* style
* quality
* remove get suppress tokens and update non_speech tokens global variables
* Update src/transformers/models/whisper/feature_extraction_whisper.py
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
* clean modeling whisper and test
Removed the attention mask arguments that are deprecated
* fix large test
* Add multilingual audio test, and translate test
* style
* fix larg multilingual test
* nits
* add copied from for attention layer
* remove attention masks in doc
* add english normalizer
* Update docs/source/en/model_doc/whisper.mdx
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
* update tokenization test
* remove copied from in whisper attention : no bias in k_proj only
* wrap around dependencies in english normalizer
* style
* correct import generation logits
* for now, wrap feature extractor with torch
* remove torch depencies for feature extraction and style
* Update src/transformers/models/whisper/convert_openai_whisper_to_tfms.py
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>
* Update src/transformers/models/whisper/configuration_whisper.py
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>
* Update docs/source/en/model_doc/whisper.mdx
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>
* fixup
* nit
* update logitds
* style
* nit
* nits and fix final tests
* add `is_more_itertools_available` to utils
* quality
* add begin supress tokens, supress tokens to generate args and config
* clean supressTokensLogitProcessor in generation logits
* Nit naming
* add supressTokensAtBegin
* udpate tests, supress tokens to None or correct values
* nit and style
* update RAG to fit test and generate_logit
* add copy pasted statment on english normalizer
* add arguments to config_common_kwargs
* Update src/transformers/generation_utils.py
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>
* Update src/transformers/generation_logits_process.py
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>
* revert changes based on reviews
* update doc and nits
* Update src/transformers/models/whisper/configuration_whisper.py
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>
* Apply suggestions from code review
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>
* more nits
* last nits
* update test configuration common
* add BART name in decoder attention mask documentation
* Update src/transformers/models/whisper/modeling_whisper.py
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>
* style
* nit
* nit
* add english.json file to git
* nits on documentation
* nit
* nits
* last styling
* add main toctree file
* remove sentence piece dependency
* clean init file
* fix tokenizer that has no dependencies on sentencepiece
* update whisper init file, nit
* remove english.json file
* add get decoder prompt id
* All weights loading
* Remove hanging pdb
* Fixup and tidy up
* Use same copied from as PT model
* Remove whitespace changes
* Remove torch references
* Tie embeddings
* Remove logits processor input to generate
* Update logit values
* revert changes and add forced logit processor
* nit
* clean normalizer
* remove protected
* Add logit processors and update generation code & tests
* Some tidy up
* Update docstring
* update
* update based on review
* Update src/transformers/models/whisper/configuration_whisper.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* Update src/transformers/models/whisper/configuration_whisper.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* Update to reflect changes on the PT model branch
* Tidy up
* Remove extra whitespace
* Fix test - make input ids small enough we can append
* Include upstream changes on main
* PR comments - add batch tests, remove comments & defaults
* Fix model output imports
* Update src/transformers/models/whisper/modeling_tf_whisper.py
Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>
* Update src/transformers/generation_tf_logits_process.py
Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>
* Update src/transformers/models/whisper/modeling_tf_whisper.py
Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>
* Update src/transformers/models/whisper/modeling_tf_whisper.py
Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>
* Update tests/models/whisper/test_modeling_tf_whisper.py
Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>
* Update src/transformers/models/whisper/modeling_tf_whisper.py
Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>
* Update src/transformers/models/whisper/modeling_tf_whisper.py
Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>
* Update docstring example
* Update src/transformers/models/whisper/modeling_tf_whisper.py
Co-authored-by: Matt <Rocketknight1@users.noreply.github.com>
* Remove changes to adjust_logits_during_generation function
* Update src/transformers/models/whisper/modeling_tf_whisper.py
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
* Tidy up imports that don't require TF
* Update tests - skip and no more skip
* Update tests/generation/test_generation_tf_logits_process.py
Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>
* Update src/transformers/models/whisper/modeling_tf_whisper.py
* Update src/transformers/models/whisper/modeling_tf_whisper.py
Co-authored-by: Matt <Rocketknight1@users.noreply.github.com>
* Add training flags
* Add (skipped) XLA generation tests
* Add embedding correctness test
* Add constant ids for generation tests
* Make logits finding a bit tidier
* Remove unused args
* xla generation enabled
* Don't skip XLA tests anymore
* Fix tests - add position ids to expected signature and update rag generation
* Undo method reorder
* Remove added whitespace
* Remove copy-paste gradient checkopint ref
* Remove
* Trigger CI - (issue with refs when pulling)
Co-authored-by: Arthur Zucker <arthur.zucker@gmail.com>
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
Co-authored-by: NielsRogge <niels.rogge1@gmail.com>
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>
Co-authored-by: Matt <Rocketknight1@users.noreply.github.com>
Co-authored-by: Joao Gante <joao@huggingface.co>
2022-10-10 14:48:17 +01:00
Karim Foda
e396358104
Add stop sequence to text generation pipeline ( #18444 )
2022-09-30 14:26:51 +01:00
Ekagra Ranjan
578e18e002
🚨 🚨 🚨 Optimize Top P Sampler and fix edge case ( #18984 )
...
* init PR
* optimize top p and add edge case
* styling
* style
* revert tf and flax test
* add edge case test for FLAX and TF
* update doc with smallest set sampling for top p
* make style
2022-09-15 15:50:11 +02:00
Joao Gante
d4dbd7ca59
Generate: get the correct beam index on eos token ( #18851 )
2022-09-05 19:35:47 +01:00
Joao Gante
9196f48b95
Generate: validate model_kwargs
on TF (and catch typos in generate arguments) ( #18651 )
2022-09-02 16:25:26 +01:00
Joao Gante
e95d433d77
Generate: add missing **model_kwargs
in sample tests ( #18696 )
2022-08-19 16:14:27 +01:00
Joao Gante
a541d97477
Generate: validate model_kwargs on FLAX (and catch typos in generate arguments) ( #18653 )
2022-08-18 10:56:21 +01:00
Joao Gante
ed1924e801
Generate: validate model_kwargs
(and catch typos in generate arguments) ( #18261 )
...
* validate generate model_kwargs
* generate tests -- not all models have an attn mask
2022-08-12 14:53:51 +01:00
Joao Gante
7e44226fc7
Generate: deprecate default max_length
( #18018 )
2022-07-23 18:02:03 +01:00
Yih-Dar
0b0dd97737
Update expected values in constrained beam search tests ( #17887 )
...
* fix
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2022-06-28 08:53:53 +02:00
unifyh
3b00b623b7
Fix top_k_top_p_filtering
having unexpected behavior ( #17744 )
...
- Fix `top_k_top_p_filtering` not passing `filter_value` to
`TopPLogitsWarper` causing any top-p filtered logits to be -inf
instead of specified value
- Add corresponding test
2022-06-21 21:35:55 +02:00
Patrick von Platen
13e875cc07
[Generation Test] Make fast test actually fast ( #17661 )
2022-06-10 18:49:03 +02:00
Patrick von Platen
518bd02c9b
[Generation] Fix Transition probs ( #17311 )
...
* [Draft] fix transition probs
* up
* up
* up
* make it work
* fix
* finish
* update
2022-05-19 22:17:02 +02:00
Sylvain Gugger
afe5d42d8d
Black preview ( #17217 )
...
* Black preview
* Fixup too!
* Fix check copies
* Use the same version as the CI
* Bump black
2022-05-12 16:25:55 -04:00
Joao Gante
fb0ae12947
TF: XLA bad words logits processor and list of processors ( #16974 )
2022-04-29 15:54:58 +01:00