Commit Graph

4 Commits

Author SHA1 Message Date
Joao Gante
34d9409427
Llama/GPTNeoX: add RoPE scaling (#24653)
* add rope_scaling

* tmp commit

* add gptneox

* add tests

* GPTNeoX can now handle long inputs, so the pipeline test was wrong

* Update src/transformers/models/open_llama/configuration_open_llama.py

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* remove ntk

* remove redundant validation

---------

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
2023-07-13 16:47:30 +01:00
Joao Gante
fd3eb3e3cd
Beef up Llama tests (#22314)
* tmp commit

* beef up llama tests
2023-03-22 15:20:48 +00:00
lewtun
f251441387
Add LlamaForSequenceClassification (#22209)
* Add LlamaForSequenceClassification

* Update src/transformers/models/llama/modeling_llama.py

Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>

* Update src/transformers/models/llama/modeling_llama.py

Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>

* Add docstring

* Add test

* Add input embedding getter and setter

* Remove dead code

---------

Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>
2023-03-17 14:39:26 +01:00
Jason Phang
0041be5b3d
LLaMA Implementation (#21955)
* LLaMA

* sharding and docs

* tweak

* black

* inits

* ruff

* LLAMA_PRETRAINED_CONFIG_ARCHIVE_MAP

* init

* no checkpoint

* docs

* ruff

* type_vocab_size

* tokenizer fixes

* tokenizer fixes

* Update tokenization_llama.py

* Update tokenization_llama.py

* Update configuration_llama.py

* Update modeling_llama.py

* tokenizer add_bos by default

* licenses

* remove decoder

* norms and mlp

* rope overhaul

* tweaks

* black

* mention OPT implementation

* off-by-one naming

* typo

* fix

* tokenization fix and slicing bug

* padding config

* cleanup

* black

* update tests

* undo typo

* fix vocab caching logic

* ruff

* docbuilder

* attn fix from BlackSamorez

* initial feedback

* typo

* docs

* llama case

* llama case

* load checkpoint docs

* comment about tokenizer

* tokenizer defaults

* clear past_key_values if use_cache=False

* last tweaks

* last tweaks

* last tweaks

* last tweaks

---------

Co-authored-by: Stella Biderman <stellabiderman@gmail.com>
2023-03-16 09:00:53 -04:00