* Put tokenizer methods in the right alphabetical order in the docs
* Quick tweak to ConversationalPipeline
* Typo fixes in the developer doc
* make fixup
* add Bros boilerplate
* copy and pasted modeling_bros.py from official Bros repo
* update copyright of bros files
* copy tokenization_bros.py from official repo and update import path
* copy tokenization_bros_fast.py from official repo and update import path
* copy configuration_bros.py from official repo and update import path
* remove trailing period in copyright line
* copy and paste bros/__init__.py from official repo
* save formatting
* remove unused unnecessary pe_type argument - using only crel type
* resolve import issue
* remove unused model classes
* remove unnecessary tests
* remove unused classes
* fix original code's bug - layer_module's argument order
* clean up modeling auto
* add bbox to prepare_config_and_inputs
* set temporary value to hidden_size (32 is too low because of the of the
Bros' positional embedding)
* remove decoder test, update create_and_check* input arguemnts
* add missing variable to model tests
* do make fixup
* update bros.mdx
* add boilerate plate for no_head inference test
* update BROS_PRETRAINED_MODEL_ARCHIVE_LIST (add naver-clova-ocr prefix)
* add prepare_bros_batch_inputs function
* update modeling_common to add bbox inputs in Bros Model Test
* remove unnecessary model inference
* add test case
* add model_doc
* add test case for token_classification
* apply fixup
* update modeling code
* update BrosForTokenClassification loss calculation logic
* revert logits preprocessing logic to make sure logits have original shape
* - update class name
* - add BrosSpadeOutput
- update BrosConfig arguments
* add boilerate plate for no_head inference test
* add prepare_bros_batch_inputs function
* add test case
* add test case for token_classification
* update modeling code
* update BrosForTokenClassification loss calculation logic
* revert logits preprocessing logic to make sure logits have original shape
* apply masking on the fly
* add BrosSpadeForTokenLinking
* update class name
put docstring to the beginning of the file
* separate the logits calculation logic and loss calculation logic
* update logic for loss calculation so that logits shape doesn't change
when return
* update typo
* update prepare_config_and_inputs
* update dummy node initialization
* update last_hidden_states getting logic to consider when return_dict is False
* update box first token mask param
* bugfix: remove random attention mask generation
* update keys to ignore on load missing
* run make style and quality
* apply make style and quality of other codes
* update box_first_token_mask to bool type
* update index.md
* apply make style and quality
* apply make fix-copies
* pass check_repo
* update bros model doc
* docstring bugfix fix
* add checkpoint for doc, tokenizer for doc
* Update README.md
* Update docs/source/en/model_doc/bros.md
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Update bros.md
* Update src/transformers/__init__.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Update docs/source/en/model_doc/bros.md
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Apply suggestions from code review
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* apply suggestions from code review
* apply suggestions from code review
* revert test_processor_markuplm.py
* Update test_processor_markuplm.py
* apply suggestions from code review
* apply suggestions from code review
* apply suggestions from code review
* update BrosSpadeELForTokenClassification head name to entity linker
* add doc string for config params
* update class, var names to more explicit and apply suggestions from code review
* remove unnecessary keys to ignore
* update relation extractor to be initialized with config
* add bros processor
* apply make style and quality
* update bros.md
* remove bros tokenizer, add bros processor that wraps bert tokenizer
* revert change
* apply make fix-copies
* update processor code, update itc -> initial token, stc -> subsequent token
* add type hint
* remove unnecessary condition branches in embedding forward
* fix auto tokenizer fail
* update docstring for each classes
* update bbox input dimension as standard 2 points and convert them to 4
points in forward pass
* update bros docs
* apply suggestions from code review : update Bros -> BROS in bros.md
* 1. box prefix var -> bbox
2. update variable names to be more explicit
* replace einsum with torch matmul
* apply style and quality
* remove unused argument
* remove unused arguments
* update docstrings
* apply suggestions from code review: add BrosBboxEmbeddings, replace
einsum with classical matrix operations
* revert einsum update
* update bros processor
* apply suggestions from code review
* add conversion script for bros
* Apply suggestions from code review
* fix readme
* apply fix-copies
---------
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* First commit while I figure this out
* make fixup
* Remove unused method
* Store prompt attrib
* Fix prompt argument for tests
* Make same changes in fast tokenizer
* Remove global prompts from fast tokenizer too
* stash commit
* stash commit
* Migrate PromptConfig to its True Final Location
* Replace Conversation entirely with the new class
* Import/dependency fixes
* Import/dependency fixes
* Change format for lots of default prompts
* More default prompt fixups
* Revert llama old methods so we can compare
* Fix some default configs
* Fix some default configs
* Fix misspelled kwarg
* Fixes for Blenderbot
* make fixup
* little rebase cleanup
* Add basic documentation
* Quick doc fix
* Truncate docstring for now
* Add handling for the case when messages is a single string
* Quick llama merges
* Update conversational pipeline and tests
* Add a couple of legacy properties for backward compatibility
* More legacy handling
* Add docstring for build_conversation_input_ids
* Restructure PromptConfig
* Let's start T E M P L A T I N G
* Refactor all default configs to use templates instead
* Revert changes to the special token properties since we don't need them anymore
* More class templates
* Make the sandbox even sandier
* Everything replaced with pure templating
* Remove docs for PromptConfig
* Add testing and optional requirement boilerplate
* Fix imports and make fixup
* Fix LLaMA tests and add Conversation docstring
* Finally get LLaMA working with the template system
* Finally get LLaMA working with the template system
* make fixup
* make fixup
* fmt-off for the long lists of test tokens
* Rename method to apply_chat_template for now
* Start on documentation
* Make chat_template a property that reads through to the default if it's not set
* Expand docs
* Expand chat templating doc some more
* trim/lstrip blocks by default and update doc
* Few doc tweaks
* rebase cleanup
* Clarify docstring
* rebase cleanup
* rebase cleanup
* make fixup
* Quick doc edit
* Reformat the standard template to match ChatML
* Re-add PEFT check
* Update docs/source/en/chat_templating.md
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
* Add apply_chat_template to the tokenizer doc
* make fixup
* Add doc links
* Fix chat links
* Fix chat links
* Explain system messages in the doc
* Add chat template test
* Proper save-loading for chat template attribute
* Add test skips for layout models
* Remove _build_conversation_input_ids, add default_chat_template to code_llama
* Make sure all LLaMA models are using the latest template
* Remove default_system_prompt block in code_llama because it has no default prompt
* Update ConversationPipeline preprocess
* Add correct #Copied from links to the default_chat_templates
* Remove unneeded type checking line
* Add a dummy mark_processsed method
* Reorganize Conversation to have **deprecated_kwargs
* Update chat_templating.md
* Quick fix to LLAMA tests
* Small doc tweaks
* Add proper docstrings and "copied from" statements to all default chat templates
* Merge use_default_system_prompt support for code_llama too
* Improve clarity around self.chat_template
* Docstring fix
* Fix blenderbot default template
* More doctest fix
* Break out some tokenizer kwargs
* Update doc to explain default templates
* Quick tweaks to tokenizer args
* Cleanups for tokenizer args
* Add note about cacheing
* Quick tweak to the chat-templating doc
* Update the LLaMA template with error checking and correct system message embedding
* make fixup
* make fixup
* add requires_jinja
* Cleanup to expected output formatting
* Add cacheing
* Fix typo in llama default template
* Update LLaMA tests
* Update documentation
* Improved legacy handling in the Conversation class
* Update Jinja template with proper error handling
* Quick bugfix
* Proper exception raising
* Change cacheing behaviour so it doesn't try to pickle an entire Jinja env
* make fixup
* rebase cleanup
---------
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
* enable optuna multi-objectives feature
Signed-off-by: Wang, Yi A <yi.a.wang@intel.com>
* Apply suggestions from code review
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
* update hpo doc
* update docstring
Signed-off-by: Wang, Yi A <yi.a.wang@intel.com>
* extend direction to List[str] type
Signed-off-by: Wang, Yi A <yi.a.wang@intel.com>
* Update src/transformers/integrations/integration_utils.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
---------
Signed-off-by: Wang, Yi A <yi.a.wang@intel.com>
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* intiial commit
* updates
* nits
* update conversion script
* update conversion script
* use path to load
* add tips etc
* some modeling logic
* modeling update
* more nits
* nits
* normal layer norm
* update config and doc
* nits
* update doc remove unused
* update
* fix inits and stuff
* fixup
* revert wrong changes
* updates
* more nits
* add default config values to the configuration file
* fixup happy
* update
* 2 tests left
* update readmes
* more nits
* slow test and more documentation
* update readme
* fix licences
* styling
* use fast if possible when saving tokenizer
* remove todo
* remove tokenization tests
* small last nits
* Apply suggestions from code review
Co-authored-by: Matt <Rocketknight1@users.noreply.github.com>
* nits to skip the timout doctest
* fix integration test
* fix test
* update eos token
* update to allow fast tokenization
* styling
* fix codeLlama as well for the update post processor
* Apply suggestions from code review
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* add more copied from statements
* update
* doc passes doctest
* remove `# final layer norm?`
* change docstring prompot
* update
* Update README.md
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* don't doctest the conversion script as it requires more packages
* don't init a model in the config
* oups
* fix doctest
---------
Co-authored-by: Matt <Rocketknight1@users.noreply.github.com>
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* Added HerBERT to README.md
* Update README.md to contain HerBERT (#26016)
* Resolved#26016: Updated READMEs and index.md to contain Herbert
Updated READMEs and ran make fix-copies
* docs: feat: model resources for llama
* fix: resolve suggestion
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
Co-authored-by: Jungnerd <46880056+jungnerd@users.noreply.github.com>
Co-authored-by: Wonhyeong Seo <wonhseo@kakao.com>
---------
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
Co-authored-by: Jungnerd <46880056+jungnerd@users.noreply.github.com>
Co-authored-by: Wonhyeong Seo <wonhseo@kakao.com>
* Add proper Falcon docs and conversion script
* Autodetect the decoder architecture instead of using an arg
* Update docs now that we can autodetect
* Fix doc error
* Add doc to toctree
* Quick doc update
* Modify single-GPU efficient training doc with now-available adamw_bnb_8bit optimizer
* Apply suggestions from code review
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
---------
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* add a warning=True tip to the Llama2 doc
* code llama needs a tip too
* doc nit
* build PR doc
* doc nits
Co-authored-by: Lysandre <lysandre@huggingface.co>
---------
Co-authored-by: Lysandre <lysandre@huggingface.co>
* add all
* Revert "Delete .github directory"
This reverts commit 9b0ff7b052e2b20b629a26fb13606b78a42944d1.
* make conversion script backward compatible
* fixup
* more styling
* copy to llama changes
* fix repo consistency
* nits
* document correct classes
* updates
* more fixes
* nits
* update auto mappings
* add readmes
* smallupdates
* llama-code replace with llama_code
* make fixup
* updates to the testsing suite
* fix fast nits
* more small fixes
* fix decode
* fix template processing
* properly reset the normalizer
* nits processor
* tokenization tests pass
* styling
* last tests
* additional nits
* one test is left
* nits
Co-authored-by faabian <faabian@users.noreply.github.com>
* update failing test
* fixup
* remove decode infilling users should handle it on their onw after generation, padding can be a problem
* update
* make test slow and more meaningfull
* fixup
* doc update
* fixup
* Apply suggestions from code review
* add kwargs doc
* tokenizer requires `requires_backend`
* type requires_backends
* CodeLlama instead of LlamaCode
* more name cahnges
* nits
* make doctests happy
* small pipeline nits
* last nit
* Apply suggestions from code review
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* update
* add codellama to toctree
---------
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* Add FlaxClipTextModelWithProjection
This is necessary to support the Flax port of Stable Diffusion XL: fb6d705fb5/text_encoder_2/config.json (L3)
Co-authored-by: Martin Müller <martin.muller.me@gmail.com>
Co-authored-by: Juan Acevedo <juancevedo@gmail.com>
* Use FlaxCLIPTextModelOutput
* make fix-copies again
* Apply suggestions from code review
Co-authored-by: Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com>
* Use `return_dict` for consistency with other uses.
Co-authored-by: Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com>
* Fix docstring example.
* Add new model to FlaxCLIPTextModelTest
* Add to IGNORE_NON_AUTO_CONFIGURED list
* Fix naming convention.
---------
Co-authored-by: Martin Müller <martin.muller.me@gmail.com>
Co-authored-by: Juan Acevedo <juancevedo@gmail.com>
Co-authored-by: Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com>
* docs: feat: model resources for llama2
Co-authored-by: Woojun Jung <hello_984@naver.com>
* fix: add description for dpo and rearrange posts
* docs: feat: add llama2 notebook resources
* style: one liners for each resource
Co-Authored-By: Woojun Jung <46880056+jungnerd@users.noreply.github.com>
Co-Authored-By: Kihoon Son <75935546+kihoon71@users.noreply.github.com>
* Apply suggestions from code review
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* Fix typo
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
---------
Co-authored-by: Woojun Jung <hello_984@naver.com>
Co-authored-by: Woojun Jung <46880056+jungnerd@users.noreply.github.com>
Co-authored-by: Kihoon Son <75935546+kihoon71@users.noreply.github.com>
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* Adds `TRANSFORMERS_TEST_BACKEND`
Allows specifying arbitrary additional import following first `import torch`.
This is useful for some custom backends, that will require additional imports to trigger backend registration with upstream torch.
See https://github.com/pytorch/benchmark/pull/1805 for a similar change in `torchbench`.
* Update src/transformers/testing_utils.py
Co-authored-by: Yih-Dar <2521628+ydshieh@users.noreply.github.com>
* Adds real backend example to documentation
---------
Co-authored-by: Yih-Dar <2521628+ydshieh@users.noreply.github.com>
* init commit
* config updated also some modeling
* Processor and Model config combined
* extraction pipeline(upto before spectogram & mel_conditioner) added but not properly tested
* model loading successful!
* feature extractor done!
* FE can now be called from HF
* postprocessing added in fe file
* same as prev commit
* Pop2PianoConfig doc done
* cfg docs slightly changed
* fe docs done
* batched
* batched working!
* temp
* v1
* checking
* trying to go with generate
* with generate and model tests passed
* before rebasing
* .
* tests done docs done remaining others & nits
* nits
* LogMelSpectogram shifted to FeatureExtractor
* is_tf rmeoved from pop2piano/init
* import solved
* tokenization tests added
* minor fixed regarding modeling_pop2piano
* tokenizer changed to only return midi_object and other changes
* Updated paper abstract(Camera-ready version) (#2)
* more comments and nits
* ruff changes
* code quality fix
* sg comments
* t5 change added and rebased
* comments except batching
* batching done
* comments
* small doc fix
* example removed from modeling
* ckpt
* forward it compatible with fe and generation done
* comments
* comments
* code-quality fix(maybe)
* ckpts changed
* doc file changed from mdx to md
* test fixes
* tokenizer test fix
* changes
* nits done main changes remaining
* code modified
* Pop2PianoProcessor added with tests
* other comments
* added Pop2PianoProcessor to dummy_objects
* added require_onnx to modeling file
* changes
* update .md file
* remove extra line in index.md
* back to the main index
* added pop2piano to index
* Added tokenizer.__call__ with valid args and batch_decode and aligned the processor part too
* changes
* added return types to 2 tokenizer methods
* the PR build test might work now
* added backends
* PR build fix
* vocab added
* comments
* refactored vocab into 1 file
* added conversion script
* comments
* essentia version changed in .md
* comments
* more tokenizer tests added
* minor fix
* tests extended for outputs acc check
* small fix
---------
Co-authored-by: Jongho Choi <sweetcocoa@snu.ac.kr>
* a draft version
* v2 integration
* fix
* make it more generic and works for IA3
* add set adapter and multiple adapters support
* fixup
* adapt a bit
* oops
* oops
* oops
* adapt more
* fix
* add more refactor
* now works with model class
* change it to instance method as it causes issues with `jit`.
* add CR
* change method name
* add `add_adapter` method
* clean up
* Update src/transformers/adapters/peft_mixin.py
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
* add moe utils
* fixup
* Update src/transformers/adapters/peft_mixin.py
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
* adapt
* oops
* fixup
* add is_peft_available
* remove `requires_backend`
* trainer compatibility
* fixup + docstring
* more details
* trigger CI
* Apply suggestions from code review
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* Update src/transformers/modeling_utils.py
* fixup + is_main_process
* added `save_peft_format` in save_pretrained
* up
* fix nits here and there
* nits here and there.
* docs
* revert `encoding="utf-8"`
* comment
* added slow tests before the PEFT release.
* fixup and nits
* let's be on the safe zone
* added more comments
* v1 docs
* add remaining docs
* Apply suggestions from code review
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* move to `lib_integrations`
* fixup
* this time fixup
* Apply suggestions from code review
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* address final comments
* refactor to use `token`
* add PEFT to DockerFile for slow tests.
* added pipeline support.
---------
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* added more details about flash attention
* correct and add more details
* Apply suggestions from code review
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* few modifs
* more details
* up
* Apply suggestions from code review
Co-authored-by: fxmarty <9808326+fxmarty@users.noreply.github.com>
* adapt from suggestion
* Apply suggestions from code review
Co-authored-by: fxmarty <9808326+fxmarty@users.noreply.github.com>
* trigger CI
* Apply suggestions from code review
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* fix nits and copies
* add new section
---------
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
Co-authored-by: fxmarty <9808326+fxmarty@users.noreply.github.com>
* Suggestions on Pipeline_webserver
docs: reorder the warning tip for pseudo-code
Co-Authored-By: Wonhyeong Seo <wonhseo@kakao.com>
* Apply suggestions from code review
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* Update docs/source/ko/pipeline_webserver.md
Co-authored-by: Wonhyeong Seo <wonhseo@kakao.com>
---------
Co-authored-by: Wonhyeong Seo <wonhseo@kakao.com>
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* add AutoModelForTextToSpeech class
* add TTS pipeline and tessting
* add docstrings to text_to_speech pipeline
* fix torch dependency
* corrector 'processor is None' case in Pipeline
* correct repo id
* modify text-to-speech -> text-to-audio
* remove processor
* rename text_to_speech pipelines files to text_audio
* add textToWaveform and textToSpectrogram instead of textToAudio classes
* update TTS pipeline to the bare minimum
* update tests TTS pipeline
* make style and erase useless import torch in TTS pipeline tests
* modify how to check if generate or forward in TTS pipeline
* remove unnecessary extra new lines
* Apply suggestions from code review
Co-authored-by: Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com>
* refactor input_texts -> text_inputs
* correct docstrings of TTS.__call__
* correct the shape of generated waveform
* take care of Bark tokenizer special case
* correct run_pipeline_test TTS
* make style
* update TTS docstrings
* address Sylvain nit refactors
* make style
* refactor into one liners
* correct squeeze
* correct way to test if forward or generate
* Update output audio waveform shape
* make style
* correct import
* modify how the TTS pipeline test if a model can generate
* align shape output of TTS pipeline with consistent shape
---------
Co-authored-by: Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com>
* Adds `TRANSFORMERS_TEST_DEVICE`
Mirrors the same API in the diffusers library. Useful in transformers
too.
* replace backend checking with trying `torch.device`
* Adds better error message for unknown test devices
* `make style`
* adds documentation showing `TRANSFORMERS_TEST_DEVICE` usage.
* added benchmarks for compile
* Update docs/source/en/perf_torch_compile.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* Update docs/source/en/perf_torch_compile.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* Update docs/source/en/perf_torch_compile.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* Update docs/source/en/perf_torch_compile.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* Update docs/source/en/perf_torch_compile.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* Update docs/source/en/perf_torch_compile.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* Update docs/source/en/perf_torch_compile.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* Update docs/source/en/perf_torch_compile.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* Update docs/source/en/perf_torch_compile.md
Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
* Update docs/source/en/perf_torch_compile.md
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Update docs/source/en/perf_torch_compile.md
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* added more models
* added more models fr
* added visualizations
* minor fix
* Update docs/source/en/perf_torch_compile.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* Update docs/source/en/perf_torch_compile.md
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Update docs/source/en/perf_torch_compile.md
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Added links to models and put charts side by side
* Added batch comparisons
* Added more comparisons
* Fix table
* Added link to wheel
* Update perf_torch_compile.md
---------
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* clearer explanation on how things works under the hood.
* Update docs/source/en/main_classes/quantization.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* Update docs/source/en/main_classes/quantization.md
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* add `load_in_4bit` in `from_pretrained`
---------
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Initial addition of t5forsequenceclassification
* Adding imports and adding tests
* Formatting
* Running make fix-copies
* Adding mt5forseq
* Formatting
* run make fix-copies
* Adding to docs
* Add model_parallel
* Fix bug
* Fix
* Remove TODO
* Fixing tests for T5ForSequenceClassification
* Undo changes to dependency_versions_table.py
* Change classification head to work with T5Config directly
* Change seq length to let tests pass
* PR comments for formatting
* Formatting
* Initial addition of UMT5ForSequenceClassification
* Adding to inits and formatting
* run make fix-copies
* Add doc for UMT5ForSeqClass
* Update UMT5 config
* Fix docs
* Skip torch fx test for SequenceClassification
* Formatting
* Add skip to UMT5 tests as well
* Fix umt5 tests
* Running make fix-copies
* PR comments
* Fix for change to sentence_representation
* Rename seq_len to hidden_size since that's what it is
* Use base_model to follow format of the rest of the library
* Update docs
* Extract the decoder_input_ids changes and make one liner
* Make one-liner