* Create CodeAgent and ReactAgent
* Fix formatting errors
* Update documentation for agents
* Add custom errors, improve logging
* Support variable usage in ReactAgent
* add messages
* Add message passing format
* Create React Code Agent
* Update
* Refactoring
* Fix errors
* Improve python interpreter
* Only non-tensor inputs should be sent to device
* Calculator tool slight refactor
* Improve docstrings
* Refactor
* Fix tests
* Fix more tests
* Fix even more tests
* Fix tests by replacing output and input types
* Fix operand type issue
* two small fixes
* EM TTS
* Fix agent running type errors
* Change text to speech tests to allow changed outputs
* Update doc with new agent types
* Improve code interpreter
* If max iterations reached, provide a real answer instead of an error
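A minimal sketch of the fallback described above — the names (`step`, `llm_engine`) are hypothetical stand-ins, not the actual agent API:
```python
def run_agent(step, llm_engine, task: str, max_iterations: int = 5) -> str:
    """Hypothetical agent loop: `step` is a callable returning (done, answer)."""
    for _ in range(max_iterations):
        done, answer = step(task)
        if done:
            return answer
    # On exhaustion, ask the LLM for a best-effort answer instead of raising
    return llm_engine(f"Based on the steps above, give your best final answer to: {task}")
```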
* Add edge case in interpreter
* Add safe imports to the interpreter
* Interpreter tweaks: tuples and listcomp
* Make style
* Make quality
* Add dictcomp to interpreter
* Rename ReactJSONAgent to ReactJsonAgent
* Misc changes
* ToolCollection
* Rename agent's logger to self.logger
* Add while loops to interpreter
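For illustration, a minimal sketch of how a restricted AST interpreter might evaluate `while` loops with an iteration cap. It uses `compile`/`exec` as stand-ins for the interpreter's own node evaluators, which the real `python_interpreter.py` does not do, and the cap constant is hypothetical:
```python
import ast

MAX_WHILE_ITERATIONS = 10_000  # hypothetical cap against infinite loops

def evaluate_while(node: ast.While, state: dict) -> None:
    """Evaluate an ast.While node, keeping variables in `state`."""
    iterations = 0
    while eval(compile(ast.Expression(node.test), "<test>", "eval"), {}, state):
        for stmt in node.body:
            exec(compile(ast.Module([stmt], type_ignores=[]), "<body>", "exec"), {}, state)
        iterations += 1
        if iterations > MAX_WHILE_ITERATIONS:
            raise RuntimeError("Maximum number of while-loop iterations reached")

state = {"x": 0}
evaluate_while(ast.parse("while x < 5:\n    x = x + 1").body[0], state)
assert state["x"] == 5
```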
* Update doc with new tools. Still need to mention collections
* Add collections to the doc
* Small fixes on logs and interpreter
* Fix toolbox return type
* Docs + fixup
* Skip doctests
* Correct prompts with improved examples and formatting
* Update prompt
* Remove outdated docs
* Change agent to accept Toolbox object for tools
* Remove calculator tool
* Propagate removal of calculator in doc
* Fix 2 failing workflows
* Simplify additional argument passing
* AgentType audio
* Minor changes: function name, types
* Remove calculator tests
* Fix test
* Fix torch requirement
* Fix final answer tests
* Style fixes
* Fix tests
* Update docstrings with calculator removal
* Small type hint fixes
* Update tests/agents/test_translation.py
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
* Update tests/agents/test_python_interpreter.py
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
* Update src/transformers/agents/default_tools.py
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
* Update src/transformers/agents/tools.py
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
* Update tests/agents/test_agents.py
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
* Update src/transformers/models/bert/configuration_bert.py
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
* Update src/transformers/agents/tools.py
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
* Update src/transformers/agents/speech_to_text.py
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
* Update tests/agents/test_speech_to_text.py
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
* Update tests/agents/test_tools_common.py
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
* pygments
* Answer comments
* Cleaning up
* Simplifying init for all agents
* Improving prompts and making code nicer
* Style fixes
* Add multiple comparator test in interpreter
* Style fixes
* Improve BERT example in documentation
* Add examples to doc
* Fix python interpreter quality
* Logging improvements
* Change test flag to agents
* Quality fix
* Add example for HfEngine
* Improve conversation example for HfEngine
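A hedged sketch of the kind of example added here — the import path and constructor arguments are assumptions based on the agents module layout at the time:
```python
from transformers.agents import HfEngine, ReactCodeAgent  # assumed import path

# HfEngine wraps a hosted chat model as the agent's LLM backend
llm_engine = HfEngine(model="meta-llama/Meta-Llama-3-70B-Instruct")
agent = ReactCodeAgent(tools=[], llm_engine=llm_engine)
agent.run("What is the result of 2 to the power of 3.7384?")
```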
* typo fix
* Verify doc
* Update docs/source/en/agents.md
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
* Update src/transformers/agents/agents.py
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
* Update src/transformers/agents/prompts.py
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
* Update src/transformers/agents/python_interpreter.py
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
* Update docs/source/en/agents.md
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
* Fix style issues
* local s2t tool
---------
Co-authored-by: Cyril Kondratenko <kkn1993@gmail.com>
Co-authored-by: Lysandre <lysandre@huggingface.co>
Co-authored-by: Lysandre <lysandre.debut@reseau.eseo.fr>
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
* Adding SDPA support for BERT
* Using the proper input name for testing model input in inference()
* Adding documentation for SDPA in BERT model page
* Use the stable link for the documentation
* Adding a gate to only call .contiguous() for torch < 2.2.0
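The gate looks roughly like the following — a sketch assuming the standard `packaging` version check; the exact condition in the model code may differ:
```python
import torch
from packaging import version

# scaled_dot_product_attention was reported to misbehave with non-contiguous
# q/k/v on some torch < 2.2.0 builds, hence the version gate
require_contiguous_qkv = version.parse(torch.__version__) < version.parse("2.2.0")

def maybe_contiguous(query, key, value):
    if require_contiguous_qkv and query.device.type == "cuda":
        query, key, value = query.contiguous(), key.contiguous(), value.contiguous()
    return query, key, value
```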
* Additions and fixes to the documentation
* Minor updates to documentation
* Adding extra requirements needed for the contiguous() bug
* Adding "Adapted from" in plcae of the "Copied from"
* Add benchmark speedup tables to the documentation
* Minor fixes to the documentation
* Use ClapText as a replacement for Bert in the Copied-From
* Some more fixes for the fix-copies references
* Overriding the test_eager_matches_sdpa_generate in bert tests to not load with low_cpu_mem_usage
[test all]
* Undo changes to separate test
* Refactored SDPA self attention code for KV projections
* Change use_sdpa to attn_implementation
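After this rename, SDPA is selected for BERT the same way as for other models, via the `attn_implementation` argument:
```python
from transformers import AutoModel

model = AutoModel.from_pretrained("bert-base-uncased", attn_implementation="sdpa")
```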
* Fix test_sdpa_can_dispatch_on_flash by preparing input (required for MultipleChoice models)
* Draft tutorial for talking to chat models
* Reformat lists and text snippets
* Cleanups and clarifications
* Finish up remaining TODOs
* Correct section link
* Small fix
* Add proper quantization examples
* Add proper quantization examples
* Add proper quantization examples
* Update docs/source/en/conversations.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* Update docs/source/en/conversations.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* Update docs/source/en/conversations.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* Update docs/source/en/conversations.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* Update docs/source/en/conversations.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* Update docs/source/en/conversations.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* Update docs/source/en/conversations.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* Update docs/source/en/conversations.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* Update docs/source/en/conversations.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* Update docs/source/en/conversations.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* Update docs/source/en/_toctree.yml
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* Update docs/source/en/conversations.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* Fix Text Generation Pipeline link and add a ref to the LLM inference guide
* intelligent -> capable
* Small intro cleanup
* Small text cleanup
* Small text cleanup
* Clarification about system message
* Clarification about system message
---------
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* chore(root): Initial commit of Phi-3 files.
* fix(root): Fixes Phi-3 missing on readme.
* fix(root): Ensures files are consistent.
* fix(phi3): Fixes unit tests.
* fix(tests): Fixes style of phi-3 test file.
* chore(tests): Adds integration tests for Phi-3.
* fix(phi3): Removes additional flash-attention usage, e.g., swiglu and rmsnorm.
* fix(phi3): Fixes incorrect docstrings.
* fix(phi3): Fixes docstring typos.
* fix(phi3): Adds support for Su and Yarn embeddings.
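These are selected through `rope_scaling`; a hedged sketch of the config shape (the factor-list length must match half the head dimension, and the exact schema is defined by `Phi3Config`, so treat the values below as placeholders):
```python
from transformers import Phi3Config

head_dim = 3072 // 32  # hidden_size // num_attention_heads in the default config
config = Phi3Config(
    rope_scaling={
        "type": "su",  # or "yarn"
        "short_factor": [1.0] * (head_dim // 2),  # placeholder factors
        "long_factor": [1.0] * (head_dim // 2),
    }
)
```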
* fix(phi3): Improves according to the first batch of reviews.
* fix(phi3): Uses up_states instead of y in Phi3MLP.
* fix(phi3): Uses gemma rotary embedding to support torch.compile.
* fix(phi3): Improves how rotary embedding classes are defined.
* fix(phi3): Fixes inv_freq not being re-computed for extended RoPE.
* fix(phi3): Adds last suggestions to modeling file.
* fix(phi3): Splits inv_freq calculation in two lines.
* [FEAT]: EETQ quantizer support
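Usage ends up looking like the other quantizers — a sketch assuming the `EetqConfig` introduced here (the eetq library must be installed, and the model id is illustrative):
```python
from transformers import AutoModelForCausalLM, EetqConfig

quantization_config = EetqConfig("int8")  # weight-only int8 quantization
model = AutoModelForCausalLM.from_pretrained(
    "facebook/opt-350m",
    device_map="auto",
    quantization_config=quantization_config,
)
```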
* Update quantization.md
* Update docs/source/en/main_classes/quantization.md
Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>
* Update docs/source/en/quantization.md
Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>
* Update docs/source/en/quantization.md
Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>
* Update src/transformers/integrations/__init__.py
Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>
* Update src/transformers/integrations/__init__.py
Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>
* Update src/transformers/integrations/eetq.py
Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>
* Update src/transformers/integrations/eetq.py
Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>
* Update src/transformers/integrations/eetq.py
Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>
* Update tests/quantization/eetq_integration/test_eetq.py
Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>
* Update src/transformers/quantizers/auto.py
Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>
* Update src/transformers/quantizers/auto.py
Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>
* Update src/transformers/quantizers/auto.py
Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>
* Update src/transformers/quantizers/quantizer_eetq.py
Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>
* Update tests/quantization/eetq_integration/test_eetq.py
Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>
* Update src/transformers/quantizers/quantizer_eetq.py
Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>
* Update tests/quantization/eetq_integration/test_eetq.py
Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>
* Update tests/quantization/eetq_integration/test_eetq.py
Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>
* [FEAT]: EETQ quantizer support
* [FEAT]: EETQ quantizer support
* remove whitespaces
* update quantization.md
* style
* Update docs/source/en/quantization.md
Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>
* add copyright
* Update quantization.md
* Update docs/source/en/quantization.md
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Update docs/source/en/quantization.md
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Address the comments by amyeroberts
* style
---------
Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>
Co-authored-by: Marc Sun <marc@huggingface.co>
Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Duplicate swiftformer
* Convert SwiftFormerPatchEmbedding
* Convert SwiftFormerEmbeddings
* Convert TFSwiftFormerMlp
* Convert TFSwiftFormerConvEncoder
* Convert TFSwiftFormerLocalRepresentation
* convert TFSwiftFormerEncoderBlock
* Convert SwiftFormerStage
* Convert SwiftFormerEncoder
* Add TFSwiftFormerPreTrainedModel
* Convert SwiftFormerForImageClassification
* Add kwargs and start drop path
* Fix syntax
* Change Model class name
* Add TFSwiftFormer to __init__
* Duplicate test_modeling_swiftformer
* First test conversions
* Change require_torch to require_tf
* Add exports to swiftformer __init__
* Add TFSwiftFormerModel wrapper
* Fix __init__ and run black
* Remove docstring from MainLayer, fix padding
* Use keras.layers.Activation on keras.Sequential
* Fix swiftformer exports
* Fix activation layer from config
* Remove post_inits
* Use tf.keras.layers.ZeroPadding2D
* Convert torch normalize
* Change tf test input shape
* Fix softmax and reduce_sum
* Convert expand_dims and repeat
* Add missing reshape and transpose
* Simplify TFSwiftFormerEncoderBlock.call
* Fix mismatch in patch embeddings
* Fix expected output shape to match channels last
* Fix swiftformer typo
* Disable test_onnx
* Fix TFSwiftFormerForImageClassification call
* Add unpack inputs
* Convert flatten(2).mean(-1)
* Change vision dummy inputs (to be reviewed)
* Change test_forward_signature to use .call
* Fix @unpack_inputs
* Set return_tensors="tf" and rename class
* Rename wrongly named patch_embeddings layer
* Add serving_output and change dummy_input shape
* Make dimensions BCHW and transpose inside embedding layer
* Change SwiftFormerEncoderBlock
* Fix ruff problems
* Add image size to swiftformer config
* Change transpose to MainLayer and use -1 for reshape
* Remove serving_outputs and dummy_inputs
* Remove test_initialization test from tf model
* Make Sequential component a separate layer
* Fix layers' names
* Transpose encoder outputs
* Fix tests and check if hidden states is not None
* Fix TFSwiftFormerForImageClassification
* Run make fixup
* Run make fix-copies
* Update modeling_tf_auto
* Update docs
* Fix modeling auto mapping
* Update modeling_tf_swiftformer docs
* Fill image_size doc and type
* Add reduction=None to loss computation
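Keras losses reduce to a scalar by default, while HF TF models expect per-sample losses they can reduce themselves; a minimal sketch of the change, assuming a sparse categorical cross-entropy classification loss:
```python
import tensorflow as tf

loss_fn = tf.keras.losses.SparseCategoricalCrossentropy(
    from_logits=True,
    reduction=tf.keras.losses.Reduction.NONE,  # keep one loss value per sample
)
```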
* Update docs
* make style
* Debug: Delete the tip to see if that changes anything
* Re-add tip
* Remove add_code_sample_docstrings
* Remove unused import
* Get the debug to actually tell us the problem it has with the docs
* Try a substitution to match the PyTorch file?
* Add swiftformer to ignore list
* Add build() methods
* Update copyright year
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Remove FIXME comment
* Remove from_pt
* Update copyright year
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Rename one-letter variables
* Remove FIXMEs related to momentum
* Remove old TODO comment
* Remove outstanding FIXME comments
* Get dropout rate from config
* Add specific dropout config for MLP
* Add convencoder dropout to config
* Pass config to SwiftFormerDropPath layer
* Fix drop_path variable name and add Adapted from comment
* Run ruff
* Removed copied from comment
* Run fix copies
* Change drop_path to identity to match pt
* Cleanup build() methods and move to new keras imports
* Update docs/source/en/model_doc/swiftformer.md
Co-authored-by: Matt <Rocketknight1@users.noreply.github.com>
* Raise error if drop_path_rate > 0.0
* Apply suggestions from code review
Replace (self.dim), with self.dim,
Co-authored-by: Matt <Rocketknight1@users.noreply.github.com>
* Remove drop_path function
* Add training to TFSwiftFormerEncoder
* Set self.built = True last
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Should have been added to previous commit
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Apply suggestions from code review
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Change default_feature_extractor to default_image_processor
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Import Keras from modeling_tf_utils
* Remove relative import
* Run ruff --fix
* Move import keras to tf_available
* Add copied from comment to test_forward_signature
* Reduce batch size and num_labels
* Extract loss logic to hf_compute_loss
* Run ruff format
---------
Co-authored-by: Matt <rocketknight1@gmail.com>
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
Co-authored-by: Matt <Rocketknight1@users.noreply.github.com>
* initial commit, remove warnings on default chat templates
* stash commit
* Raise a much sterner warning for default chat templates, and prepare for deprecation
* Update the docs
* wip
* fix __init__.py
* add docs
* Apply suggestions from code review
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
* address comments 1
* work on make fixup
* pass configs down
* add sdpa attention
* remove DbrxBlock
* add to configuration_auto
* docstring now passes formatting test
* fix style
* update READMEs
* add dbrx to modeling_auto
* make fix-copies generated this
* add DBRX_PRETRAINED_CONFIG_ARCHIVE_MAP
* config docstring passes formatting test
* rename moe_loss_weight to router_aux_loss_coef
* add to flash-attn documentation
* fix model-path in tests
* Explicitly make `"silu"` the default `ffn_act_fn`
Co-authored-by: Wing Lian <wing.lian@gmail.com>
* default to using router_aux_loss_coef over ffn_config[moe_loss_weight]
* fix _flash_attn_uses_top_left_mask and is_causal
* fix tests path
* don't use token type IDs
* follow Llama and remove token_type_ids from test
* init ConfigTester differently so tests pass
* remove multiple choice test
* remove question + answer test
* remove sequence classification test
* remove token classification test
* copy Llama tests and remove token_type_ids from test inputs
* do not test pruning or headmasking; style code
* add _tied_weights_keys parameter to pass test
* add type hints
* fix type check
* update config tester
* remove masked_lm test
* remove encoder tests
* initialize DbrxModelTester with correct params
* style
* torch_dtype does not rely on torch
* run make fixup, fix-copies
* use https://huggingface.co/v2ray/dbrx-base-fixed/blob/main/modeling_dbrx.py
* add copyright info
* fix imports and DbrxRotaryEmbedding
* update DbrxModel docstring
* use copies
* change model path in docstring
* use config in DbrxFFN
* fix flashattention2, sdpaattention
* pass config to DbrxAttention, DbrxNormAttentionNorm
* more fixes
* fix
* fix again!
* add informative comment
* fix ruff?
* remove print statement + style
* change doc-test
* fix doc-test
* fix docstring
* delete commented out text
* make defaults match dbrx-instruct
* replace `router_aux_loss_coef` with `moe_loss_weight`
* is_decoder=True
* remove is_decoder from configtester
* implement sdpa properly
* make is_decoder pass tests
* start on the GenerationTesterMixin tests
* add dbrx to sdpa documentation
* skip weight typing test
* style
* initialize smaller model
Co-authored-by: Matt <Rocketknight1@users.noreply.github.com>
* Add DBRX to toctree
* skip test_new_cache_format
* make config defaults smaller again
* add pad_token_id
* remove pad_token_id from config
* Remove all references to DBRX_PRETRAINED_CONFIG_ARCHIVE_MAP
* Update src/transformers/models/dbrx/__init__.py
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
* Update src/transformers/models/dbrx/modeling_dbrx.py
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
* Update docs/source/en/model_doc/dbrx.md
Co-authored-by: Matt <Rocketknight1@users.noreply.github.com>
* Update src/transformers/models/dbrx/configuration_dbrx.py
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
* Update docs/source/en/model_doc/dbrx.md
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
* fix typo
* Apply suggestions from code review
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
* update docs, fix configuration_auto.py
* address pr comments
* remove is_decoder flag
* slice
* fix requires grad
* remove grad
* disconnect differently
* remove grad
* enable grads
* patch
* detach expert
* nissan al ghaib
* Update modeling_dbrx.py
* Update src/transformers/models/dbrx/modeling_dbrx.py
Co-authored-by: Matt <Rocketknight1@users.noreply.github.com>
* replace "Gemma" with "Dbrx"
* remove # type: ignore
* don't hardcode vocab_size
* remove ToDo
* Re-add removed idefics2 line
* Update test to use tiny-random!
* Remove TODO
* Remove one more case of loading the entire dbrx-instruct in the tests
* Update src/transformers/models/dbrx/modeling_dbrx.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* address some comments
* small model
* add dbrx to tokenization_auto
* More docstrings with add_start_docstrings
* Dbrx for now
* add PipelineTesterMixin
* Update src/transformers/models/dbrx/configuration_dbrx.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* remove flash-attn2 import error
* fix docstring
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* add usage example
* put on one line
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* fix ffn_act_fn
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* change "dbrx" to "DBRX" for display purposes.
* fix __init__.py?
* fix __init__.py
* fix README
* return the aux_loss
* remove extra spaces
* fix configuration_auto.py
* fix format in tokenization_auto
* remove new line
* add more usage examples
---------
Co-authored-by: Abhi Venigalla <abhi.venigalla@databricks.com>
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
Co-authored-by: Eitan Turok <eitan.turok@databricks.com>
Co-authored-by: Eitan Turok <150733043+eitanturok@users.noreply.github.com>
Co-authored-by: Wing Lian <wing.lian@gmail.com>
Co-authored-by: Eitan Turok <eitanturok@gmail.com>
Co-authored-by: Matt <Rocketknight1@users.noreply.github.com>
Co-authored-by: Matt <rocketknight1@gmail.com>
Co-authored-by: Your Name <you@example.com>
Co-authored-by: Mihir Patel <mihir.v.patel7@gmail.com>
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Add jamba arch
* apply "make fix-copies" changes
* fix link to model in JambaConfig docstring
* Add n_ctx in modeling file because repo-consistency wants that
* Add jamba to flash attention and sdpa documentation
* mamba dt_proj quant fix now works for LoRA as well
* override test_left_padding_compatibility and use a more permissive tolerance. Left padding numerical differences are accentuated by mamba layers
* add jamba to tokenization auto
* fix shape comments (PR #24 on the model page: https://huggingface.co/ai21labs/Jamba-v0.1/discussions/24)
* simple PR fixes
* remove unnecessary kwargs from JambaAttentionDecoderLayer and JambaMambaDecoderLayer
* remove the LoRA hack for the mamba dt_proj bias. It was solved in huggingface/peft#1530 (https://github.com/huggingface/peft/pull/1530)
* Add copied comment on JambaMLP (it's the same as MixtralMLP)
* remove padding_mask warnings. It's not supported anymore
* fix docstring. Float instead of int
* A few more minor PR fixes
* (1) lowercase names for mamba layernorms (2) remove _apply_inner_layernorms and do it directly in the forward pass
* Return None attention weights from mamba layers. Append to all attentions only if not None.
* remove some leftover jamba archive lists
* Better separation between expert vs non-expert layers. Non-expert layers return None as router_logits, and it is not concatenated to all_router_logits returned from JambaModel
* no need to take router_logits at config.expert_layer_offset anymore. result.router_logits now holds results only for expert layers
* Add Jamba paper on READMEs
* (1) rename n_ctx -> max_position_embeddings (2) don't use it in the modeling file since it's not needed (set it as an exception to check_config_attributes)
* Add copied from comment
* remove the code path for apply_inner_layernorms=False. Jamba always has the inner mamba layernorms
* clearer docstring for _convert_to_standard_cache
* style fixes
* Change calc_logits_for_entire_prompt (bool) to num_logits_to_keep (int). Adapt assisted decoding code to use it. Also small change in low memory beam search decoding path to support this new int value in model_inputs
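A hedged sketch of the new int-valued argument — the checkpoint is the real Jamba repo, but loading it needs substantial memory, so treat this as illustrative:
```python
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("ai21labs/Jamba-v0.1")
model = AutoModelForCausalLM.from_pretrained("ai21labs/Jamba-v0.1")

inputs = tokenizer("Hello", return_tensors="pt")
# num_logits_to_keep=1 computes logits for the last position only,
# instead of materializing them for the entire prompt
outputs = model(**inputs, num_logits_to_keep=1)
```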
* rename test so it still overrides what its meant to override
* draft
* oops
* nit
* remove more complex logic
* fix names used in config
* fix fix fix
* style
* fix some more failing tests
* generate did not init the cache 🙃
* more small nits
* typo
* config.mamba_expand * config.hidden_size for the intermediate size of the mamba shapes
* fix init of pkv with torch.tensor()
* empty tensor
* fix some init issues
* stupid changes required by generate because it does not even support its own DynamicCache class
* more fixes
* fix general assisted gen cache_position bug
* tests passing
* Add offsets and periods as SPECIAL_CASES_TO_ALLOW in check_config_attributes.py
* fix reorder_cache to reorder mamba states and override some more functions in HybridMambaAttentionDynamicCache
* no need to override test_past_key_values_format() and _check_past_key_values_for_generate() in tests anymore
* fix docstrings and typehints for past_key_values
* style fixes
* fix docs
* change typehint due to copy from Mixtral
* forgot import
* import order
* Add configuration_jamba and modeling_jamba to not_doctested because the model is too big to download (in docstring of JambaForCausalLM.forward)
* Add integration test with tiny random Jamba model on hub
* fix flash attention cache shapes
* bring back forgotten hidden states
* rename HybridMambaAttentionDynamicCache.seqlen_offset to has_previous_state (and make it a bool), plus a bugfix: it should be set to True after a finished forward pass of the entire model
* align integration test after modeling fixes
* bugfix - mamba can use precomputed states only if the forward pass is on a single token
* bugfix - mamba can use precomputed states only if they match the batch size
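Taken together, the two bugfixes above amount to a gate like the following sketch (attribute names follow the commit messages; the actual check lives inside the Jamba mamba layer):
```python
def can_use_precomputed_mamba_states(cache, hidden_states) -> bool:
    batch_size, seq_len, _ = hidden_states.shape
    return (
        cache is not None
        and cache.has_previous_state                      # a full forward pass already ran
        and seq_len == 1                                  # decoding a single token
        and cache.conv_states[0].shape[0] == batch_size   # cached batch size matches
    )
```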
* typo
* remove making _prepare_4d_causal_attention_mask a leaf function
* stop using past_seq_len.get_seq_length(). Use cache positions instead. Adjust test (test_decoder_model_past_with_large_inputs) accordingly
---------
Co-authored-by: Arthur Zucker <arthur.zucker@gmail.com>
Co-authored-by: Joao Gante <joao@huggingface.co>
* Add OLMo using add-new-model-like with Llama
* Fix incorrect tokenizer for OLMo
* Copy-paste relevant OLMo methods and their imports
* Add OLMo config
* Modify OLMo config to follow HF conventions
* Remove unneeded Llama code from OLMo model
* Add ability for OLMo model to output attentions
* Add OLMoPreTrainedModel and OLMoModel
* Add OLMoForCausalLM
* Minor fixes to OLMo model for style and missing functions
* Implement OLMo tokenizer
* Implement OLMo to HF conversion script
* Add tests for OLMo model
* Add tests for OLMo fast tokenizer
* Add auto-generated dummy objects
* Remove unimplemented OLMo classes from auto and init classes and re-format
* Add README and associated auto-generated files
* Use OLMo names for common properties
* Run make fixup
* Remove `|` from OLMo typing
* Remove unneeded tokenization_olmo.py
* Revert model, config and converter to add-new-model-like Llama
* Move logic for adding bos/eos token into GPTNeoxTokenizerFast
* Change OLMoConfig defaults to match OLMo-7B
* Use GPTNeoXTokenizerFast in OLMo tokenizer tests
* Modify auto-generated OLMoModelTests to work for OLMo
* Add non-parametric layer norm OLMoLayerNorm
* Update weight conversion script for OLMo
* Fix __init__ and auto structure for OLMo
* Fix errors from make fixup
* Remove OLMoTokenizerFast from documentation
* Add missing 'Copied from' for OLMoModel._update_causal_mask
* Run make fix-copies
* Rearrange string replacements in OLMoForCausalLM Copied from
* Move OLMo and Llama CausalLM.forward example into global constants
* Fix OLMO_GENERATION_EXAMPLE doc string typo
* Add option for qkv clipping to OLMo
* Rearrange OLMoConfig kwargs in convert_olmo_weights_to_hf
* Add clip_qkv to OLMoConfig in convert_olmo_weights_to_hf
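When set, `clip_qkv` bounds the projected query/key/value states; a minimal sketch of the behavior, assuming a simple `torch.clamp` as in the original OLMo implementation:
```python
from typing import Optional

import torch

def clip_states(states: torch.Tensor, clip_qkv: Optional[float]) -> torch.Tensor:
    # When clip_qkv is set, bound activations to [-clip_qkv, clip_qkv]
    if clip_qkv is not None:
        states = states.clamp(min=-clip_qkv, max=clip_qkv)
    return states
```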
* Fix OLMo tokenization bug using conversion script
* Keep model in full precision after conversion
* Do not add eos token automatically
* Update references to OLMo model in HF Hub
* Do not add eos token during encoding by default
* Fix Llama generation example
* Run make fixup
* OLMo 7B integration test fix
* Remove unneeded special case for OLMoConfig
* OLMo 7B Twin 2T integration test fix
* Fix test_model_7b_greedy_generation
* Remove test_compile_static_cache
* Fix OLMo and Llama generation example
* Run make fixup
* Revert "OLMo 7B integration test fix"
This reverts commit 4df56a4b15.
* Revert "OLMo 7B Twin 2T integration test fix"
This reverts commit 9ff65a4a29.
* Ungate 7B integration tests and fix greedy generation test
* Add retries for flaky test_eager_matches_sdpa_generate
* Fix output of doc example for OLMoForCausalLM.forward
* Downsize OLMo doc test for OLMoForCausalLM.forward to 1B model
* Try fix incorrect characters in OLMoForCausalLM.forward doc test
* Try fix incorrect characters in OLMoForCausalLM.forward doc test using end quotes
* Remove pretraining_tp from OLMo config and model
* Add missing 'Copied from' instances
* Remove unneeded causal_mask from OLMoModel
* Revert Llama changes
* Ignore copy for OLMoForCausalLM.forward
* Change 'OLMo' to 'Olmo' in classes
* Move minimal OLMo tokenization tests to model tests
* Add missed 'Copied from' for repeat_kv
* Add create token type ids to CodeGenTokenizer
* Fix inconsistent length of token type ids
* Format source codes
* Fix inconsistent order of methods
* Update docstring
* add test_tokenizer_integration test
* Format source codes
* Add `copied from` comment to CodeGenTokenizerFast
* Add doc of create_token_type_ids_from_sequences
* Make return_token_type_ids False by default
* Make test_tokenizer_integration as slow test
* Add return_token_type_ids to tokenizer init arg
* Add test for tokenizer's init return_token_type_ids
* Format source codes
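The commits above make token type ids opt-in for CodeGen; a hedged usage sketch (the checkpoint is the real one, enabling per call shown here, or pass the flag at tokenizer init):
```python
from transformers import CodeGenTokenizer

tokenizer = CodeGenTokenizer.from_pretrained("Salesforce/codegen-350M-mono")

# return_token_type_ids is now False by default; opt in when pairs need segment ids
encoded = tokenizer("def add(a, b):", "    return a + b", return_token_type_ids=True)
print(encoded["token_type_ids"])
```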
* Configuring Translation Pipelines documents update #27753
* Language Format Addition
* add supported languages list