* llm prompting guide
* updated code examples
* an attempt to fix the code example tests
* set seed in examples
* added a doctest comment
* added einops to the doc_test_job
* string formatting
* string formatting, again
* added the toc to slow_documentation_tests.txt
* minor list fix
* string formatting + pipe renamed
* Apply suggestions from code review
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
* replaced max_length with max_new_tokens and updated the outputs to match
* minor formatting fix
* removed einops from circleci config
* Apply suggestions from code review
Co-authored-by: Lysandre Debut <hi@lysand.re>
* removed einops and trust_remote_code parameter
---------
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
Co-authored-by: Lysandre Debut <hi@lysand.re>
* translate accelerate page
* Update docs/source/zh/accelerate.md
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
---------
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
* fix typos in idefics.md
Two typos found in reviewing this documentation.
1) max_new_tokens=4, is not sufficient to generate "Vegetables" as indicated - you will get only "Veget". (incidentally - some mention of how to select this value might be useful as it seems to change in each example)
2) inputs = processor(prompts, return_tensors="pt").to(device) as inputs need to be on the same device (as they are in all other examples on the page)
* Update idefics.md
Change device to cuda explicitly to match other examples
* add FA-2 support for mistral
* fixup
* add sliding windows
* fixing few nits
* v1 slicing cache - logits do not match
* add comment
* fix bugs
* more mem efficient
* add warning once
* add warning once
* oops
* fixup
* more comments
* copy
* add safety checker
* fixup
* Update src/transformers/models/mistral/modeling_mistral.py
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
* copied from
* up
* raise when padding side is right
* fixup
* add doc + few minor changes
* fixup
---------
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
* start working on next chapter
* finish testing
* Update docs/source/de/testing.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* Update docs/source/de/testing.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* Update docs/source/de/testing.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
---------
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* initial
* toctree
* add tf model
* run scripts
* peft
* llm and agents
* Update docs/source/de/peft.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* Update docs/source/de/peft.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* Update docs/source/de/peft.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* Update docs/source/de/run_scripts.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* Update docs/source/de/run_scripts.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* Update docs/source/de/transformers_agents.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* Update docs/source/de/transformers_agents.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
---------
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* Put tokenizer methods in the right alphabetical order in the docs
* Quick tweak to ConversationalPipeline
* Typo fixes in the developer doc
* make fixup
* add Bros boilerplate
* copy and pasted modeling_bros.py from official Bros repo
* update copyright of bros files
* copy tokenization_bros.py from official repo and update import path
* copy tokenization_bros_fast.py from official repo and update import path
* copy configuration_bros.py from official repo and update import path
* remove trailing period in copyright line
* copy and paste bros/__init__.py from official repo
* save formatting
* remove unused unnecessary pe_type argument - using only crel type
* resolve import issue
* remove unused model classes
* remove unnecessary tests
* remove unused classes
* fix original code's bug - layer_module's argument order
* clean up modeling auto
* add bbox to prepare_config_and_inputs
* set temporary value to hidden_size (32 is too low because of the of the
Bros' positional embedding)
* remove decoder test, update create_and_check* input arguemnts
* add missing variable to model tests
* do make fixup
* update bros.mdx
* add boilerate plate for no_head inference test
* update BROS_PRETRAINED_MODEL_ARCHIVE_LIST (add naver-clova-ocr prefix)
* add prepare_bros_batch_inputs function
* update modeling_common to add bbox inputs in Bros Model Test
* remove unnecessary model inference
* add test case
* add model_doc
* add test case for token_classification
* apply fixup
* update modeling code
* update BrosForTokenClassification loss calculation logic
* revert logits preprocessing logic to make sure logits have original shape
* - update class name
* - add BrosSpadeOutput
- update BrosConfig arguments
* add boilerate plate for no_head inference test
* add prepare_bros_batch_inputs function
* add test case
* add test case for token_classification
* update modeling code
* update BrosForTokenClassification loss calculation logic
* revert logits preprocessing logic to make sure logits have original shape
* apply masking on the fly
* add BrosSpadeForTokenLinking
* update class name
put docstring to the beginning of the file
* separate the logits calculation logic and loss calculation logic
* update logic for loss calculation so that logits shape doesn't change
when return
* update typo
* update prepare_config_and_inputs
* update dummy node initialization
* update last_hidden_states getting logic to consider when return_dict is False
* update box first token mask param
* bugfix: remove random attention mask generation
* update keys to ignore on load missing
* run make style and quality
* apply make style and quality of other codes
* update box_first_token_mask to bool type
* update index.md
* apply make style and quality
* apply make fix-copies
* pass check_repo
* update bros model doc
* docstring bugfix fix
* add checkpoint for doc, tokenizer for doc
* Update README.md
* Update docs/source/en/model_doc/bros.md
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Update bros.md
* Update src/transformers/__init__.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Update docs/source/en/model_doc/bros.md
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Apply suggestions from code review
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* apply suggestions from code review
* apply suggestions from code review
* revert test_processor_markuplm.py
* Update test_processor_markuplm.py
* apply suggestions from code review
* apply suggestions from code review
* apply suggestions from code review
* update BrosSpadeELForTokenClassification head name to entity linker
* add doc string for config params
* update class, var names to more explicit and apply suggestions from code review
* remove unnecessary keys to ignore
* update relation extractor to be initialized with config
* add bros processor
* apply make style and quality
* update bros.md
* remove bros tokenizer, add bros processor that wraps bert tokenizer
* revert change
* apply make fix-copies
* update processor code, update itc -> initial token, stc -> subsequent token
* add type hint
* remove unnecessary condition branches in embedding forward
* fix auto tokenizer fail
* update docstring for each classes
* update bbox input dimension as standard 2 points and convert them to 4
points in forward pass
* update bros docs
* apply suggestions from code review : update Bros -> BROS in bros.md
* 1. box prefix var -> bbox
2. update variable names to be more explicit
* replace einsum with torch matmul
* apply style and quality
* remove unused argument
* remove unused arguments
* update docstrings
* apply suggestions from code review: add BrosBboxEmbeddings, replace
einsum with classical matrix operations
* revert einsum update
* update bros processor
* apply suggestions from code review
* add conversion script for bros
* Apply suggestions from code review
* fix readme
* apply fix-copies
---------
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>