* Proposed fix for TF example now running on safetensors.
* Adding more warnings and returning keys.
* Trigger CI
* Trigger CI
---------
Co-authored-by: Sylvain Gugger <Sylvain.gugger@gmail.com>
* First draft of RWKV-4
* Add support for generate
* Style post-rebase
* Properly use state
* Write doc
* Fix doc
* More math
* Add model to README, dummies and clean config
* Fix init
* multiple fixes:
- fix common tests
- fix configuration default values
- add CI test for checking state computation
- fix some CI tests
* correct tokenizer
* some tweaks
- fix config docstring
- fix failing tests
* fix CI tests
- add output_attention / output_hidden_states
- override test_initialization
- fix failing CIs
* fix conversion script
- fix sharded case
- add new arguments
* add slow tests + more fixes on conversion script
* add another test
* final fixes
* change single name variable
* add mock attention mask for pipeline to work
* correct eos token id
* fix nits
* add checkpoints
* Apply suggestions from code review
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* add `tie_word_embeddings` in docstring
* change tensor name
* fix final nits
* Trigger CI
---------
Co-authored-by: younesbelkada <younesbelkada@gmail.com>
Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Add run_mim_no_trainer.py draft from #20412
Add parse_args method and copy over other dependencies
Add Method call for sending telemetry
Initialize Accelerator
Make one log on every process
Set seed and Handle repository creation
Initialize dataset and Set validation split
Create Config
Adapt Config
Update Config
Create Feature Extractor
Create model
Set column names
Create transforms
Create mask generator
Create method to preprocess images
Shuffle datasets if needed and set transforms
Create Dataloaders
Add optimizer
Add learning rate scheduler
Prepare everything with our accelerator
Tie weights for TPU training
Recalculate training steps and training epochs
Set accelerator checkpointing steps
Initialize trackers and store configuration
Set total batch size
Fix typo: mlm -> mim
Log info at the start of training
Load in the weights and states from previous save
Update the progress_bar if loading from checkpoint
Define train loop
Add evaluation loop to training
Add to parse_args method
Push repo to hub
Save accelerator state
End training and save model and feature extractor
Remove unused imports
Fix trailing whitespace
* Update code based on comments, Rename feature_extractor to image_processor
* Fix linting
* Add argument for learning rate
* Add argument for setting number of training epochs
* Remove incorrect logger argument
* Convert max_train_steps to int for tqdm
---------
Co-authored-by: Saad Mahmud <shuvro.mahmud79@gmail.com>
* first draft - gives index error in question_answering.py
* maturing
* no labels
* pipeline should know about QA
* fixing checks
* formatting
* fixed docstring
* initial commit
* formatting
* adding the class to many places
* towards less unhappy checks
* nearly there
* and gpt neox for qa
* use right model
* forgot this one
* base_model_prefix is "gpt_neox" for GPTNeoX* models
* unnecessary stuff
* Update src/transformers/models/gpt_neox/modeling_gpt_neox.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* format
* Update src/transformers/models/gpt_neox/modeling_gpt_neox.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* removed gpt2 stuff
---------
Co-authored-by: Prof. Peter Schneider-Kamp <jps@ordbogen.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* first draft - gives index error in question_answering.py
* maturing
* no labels
* pipeline should know about QA
* fixing checks
* formatting
* fixed docstring
* initial commit
* formatting
* adding the class to many places
* towards less unhappy checks
* nearly there
* Update src/transformers/models/gpt_neo/modeling_gpt_neo.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* avoid error
* moving to device of start/end_logits
---------
Co-authored-by: Prof. Peter Schneider-Kamp <jps@ordbogen.com>
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Enable using a custom tracer in FX `symbolic_trace`
* Integrate feedback from review
* Formatting
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
---------
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* [doc] Try a few ≠ ways of linking to Papers, users, and org profiles
* Empty commit
* Empty commit now that the backend is fixed
---------
Co-authored-by: Lysandre <lysandre@huggingface.co>