transformers

mirror of https://github.com/huggingface/transformers.git synced 2025-07-06 22:30:09 +06:00

Author	SHA1	Message	Date
Anthony	19224c3642	fix: "check out" as verb (#38678 ) "check out" as verb	2025-06-09 14:07:31 +00:00
Lysandre Debut	d4e92f1a21	Remove add-new-model in favor of add-new-model-like (#30424 ) * Remove add-new-model in favor of add-new-model-like * nits	2024-04-24 09:38:18 +02:00
Abhi Venigalla	005b957fb8	Add DBRX Model (#29921 ) * wip * fix __init__.py * add docs * Apply suggestions from code review Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> * address comments 1 * work on make fixup * pass configs down * add sdpa attention * remove DbrxBlock * add to configuration_auto * docstring now passes formatting test * fix style * update READMEs * add dbrx to modeling_auto * make fix-copies generated this * add DBRX_PRETRAINED_CONFIG_ARCHIVE_MAP * config docstring passes formatting test * rename moe_loss_weight to router_aux_loss_coef * add to flash-attn documentation * fix model-path in tests * Explicitly make `"suli"` the default `ffn_act_fn` Co-authored-by: Wing Lian <wing.lian@gmail.com> * default to using router_aux_loss_coef over ffn_config[moe_loss_weight] * fix _flash_attn_uses_top_left_mask and is_causal * fix tests path * don't use token type IDs * follow Llama and remove token_type_ids from test * init ConfigTester differently so tests pass * remove multiple choice test * remove question + answer test * remove sequence classification test * remove token classification test * copy Llama tests and remove token_type_ids from test inputs * do not test pruning or headmasking; style code * add _tied_weights_keys parameter to pass test * add type hints * fix type check * update config tester * remove masked_lm test * remove encoder tests * initialize DbrxModelTester with correct params * style * torch_dtype does not rely on torch * run make fixup, fix-copies * use https://huggingface.co/v2ray/dbrx-base-fixed/blob/main/modeling_dbrx.py * add copyright info * fix imports and DbrxRotaryEmbedding * update DbrxModel docstring * use copies * change model path in docstring * use config in DbrxFFN * fix flashattention2, sdpaattention * input config to DbrXAttention, DbrxNormAttentionNorm * more fixes * fix * fix again! * add informative comment * fix ruff? * remove print statement + style * change doc-test * fix doc-test * fix docstring * delete commented out text * make defaults match dbrx-instruct * replace `router_aux_loss_coef` with `moe_loss_weight` * is_decoder=True * remove is_decoder from configtester * implement sdpa properly * make is_decoder pass tests * start on the GenerationTesterMixin tests * add dbrx to sdpa documentation * skip weight typing test * style * initialize smaller model Co-authored-by: Matt <Rocketknight1@users.noreply.github.com> * Add DBRX to toctree * skip test_new_cache_format * make config defaults smaller again * add pad_token_id * remove pad_token_id from config * Remove all references to DBRX_PRETRAINED_CONFIG_ARCHIVE_MAP * Update src/transformers/models/dbrx/__init__.py Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> * Update src/transformers/models/dbrx/modeling_dbrx.py Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> * Update docs/source/en/model_doc/dbrx.md Co-authored-by: Matt <Rocketknight1@users.noreply.github.com> * Update src/transformers/models/dbrx/configuration_dbrx.py Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> * Update docs/source/en/model_doc/dbrx.md Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> * fix typo * Apply suggestions from code review Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> * update docs, fix configuration_auto.py * address pr comments * remove is_decoder flag * slice * fix requires grad * remove grad * disconnect differently * remove grad * enable grads * patch * detach expert * nissan al ghaib * Update modeling_dbrx.py * Update src/transformers/models/dbrx/modeling_dbrx.py Co-authored-by: Matt <Rocketknight1@users.noreply.github.com> * replace "Gemma" with "Dbrx" * remove # type: ignore * don't hardcode vocab_size * remove ToDo * Re-add removed idefics2 line * Update test to use tiny-random! * Remove TODO * Remove one more case of loading the entire dbrx-instruct in the tests * Update src/transformers/models/dbrx/modeling_dbrx.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * address some comments * small model * add dbrx to tokenization_auto * More docstrings with add_start_docstrings * Dbrx for now * add PipelineTesterMixin * Update src/transformers/models/dbrx/configuration_dbrx.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * remove flash-attn2 import error * fix docstring Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * add useage example * put on one line Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * fix ffn_act_fn Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * change "dbrx" to "DBRX" for display purposes. * fix __init__.py? * fix __init__.py * fix README * return the aux_loss * remove extra spaces * fix configuration_auto.py * fix format in tokenization_auto * remove new line * add more useage examples --------- Co-authored-by: Abhi Venigalla <abhi.venigalla@databricks.com> Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> Co-authored-by: Eitan Turok <eitan.turok@databricks.com> Co-authored-by: Eitan Turok <150733043+eitanturok@users.noreply.github.com> Co-authored-by: Wing Lian <wing.lian@gmail.com> Co-authored-by: Eitan Turok <eitanturok@gmail.com> Co-authored-by: Matt <Rocketknight1@users.noreply.github.com> Co-authored-by: Matt <rocketknight1@gmail.com> Co-authored-by: Your Name <you@example.com> Co-authored-by: Mihir Patel <mihir.v.patel7@gmail.com> Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>	2024-04-18 15:18:52 +02:00
Klaus Hipp	fe3df9d5b3	[Docs] Add language identifiers to fenced code blocks (#28955 ) Add language identifiers to code blocks	2024-02-12 10:48:31 -08:00
Klaus Hipp	721ee783ca	[Docs] Fix spelling and grammar mistakes (#28825 ) * Fix typos and grammar mistakes in docs and examples * Fix typos in docstrings and comments * Fix spelling of `tokenizer` in model tests * Remove erroneous spaces in decorators * Remove extra spaces in Markdown link texts	2024-02-02 08:45:00 +01:00
Sylvain Gugger	eb849f6604	Migrate doc files to Markdown. (#24376 ) * Rename index.mdx to index.md * With saved modifs * Address review comment * Treat all files * .mdx -> .md * Remove special char * Update utils/tests_fetcher.py Co-authored-by: Lysandre Debut <lysandre.debut@reseau.eseo.fr> --------- Co-authored-by: Lysandre Debut <lysandre.debut@reseau.eseo.fr>	2023-06-20 18:07:47 -04:00
Sylvain Gugger	28c19ab58d	Make it easier to develop without a dev install (#22697 ) * Make it easier to develop without a dev install * Remove ugly hack that doesn't work anyway	2023-04-11 08:41:53 -04:00
amyeroberts	17a7b49bda	Update doc examples feature extractor -> image processor (#20501 ) * Update doc example feature extractor -> image processor * Apply suggestions from code review	2022-11-30 14:50:55 +00:00
Ayush Mangal	a5282ab4bc	Fix typo in adding_a_new_model README (#17679 )	2022-06-13 03:22:07 -04:00
Sylvain Gugger	7fc6f41d91	Add doc for add-new-model-like command (#15433 )	2022-01-31 11:10:45 -05:00
Patrick von Platen	02039352b2	Update README.md	2021-09-01 09:50:21 +02:00
Sylvain Gugger	00aa9dbca2	Copyright (#8970 ) * Add copyright everywhere missing * Style	2020-12-07 18:36:34 -05:00
Sylvain Gugger	c89bdfbe72	Reorganize repo (#8580 ) * Put models in subfolders * Styling * Fix imports in tests * More fixes in test imports * Sneaky hidden imports * Fix imports in doc files * More sneaky imports * Finish fixing tests * Fix examples * Fix path for copies * More fixes for examples * Fix dummy files * More fixes for example * More model import fixes * Is this why you're unhappy GitHub? * Fix imports in conver command	2020-11-16 21:43:42 -05:00
Lysandre Debut	826f04576f	Model templates encoder only (#8509 ) * Model templates * TensorFlow * Remove pooler * CI * Tokenizer + Refactoring * Encoder-Decoder * Let's go testing * Encoder-Decoder in TF * Let's go testing in TF * Documentation * README * Fixes * Better names * Style * Update docs * Choose to skip either TF or PT * Code quality fixes * Add to testing suite * Update file path * Cookiecutter path * Update `transformers` path * Handle rebasing * Remove seq2seq from model templates * Remove s2s config * Apply Sylvain and Patrick comments * Apply suggestions from code review Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Last fixes from code review Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>	2020-11-13 11:59:30 -05:00
Sylvain Gugger	b2b7fc7814	Check and update model list in index.rst automatically (#7527 ) * Check and update model list in index.rst automatically * Check and update model list in index.rst automatically * Adapt template	2020-10-05 09:40:45 -04:00
Forrest Iandola	02ef825be2	SqueezeBERT architecture (#7083 ) * configuration_squeezebert.py thin wrapper around bert tokenizer fix typos wip sb model code wip modeling_squeezebert.py. Next step is to get the multi-layer-output interface working set up squeezebert to use BertModelOutput when returning results. squeezebert documentation formatting allow head mask that is an array of [None, ..., None] docs docs cont'd path to vocab docs and pointers to cloud files (WIP) line length and indentation squeezebert model cards formatting of model cards untrack modeling_squeezebert_scratchpad.py update aws paths to vocab and config files get rid of stub of NSP code, and advise users to pretrain with mlm only fix rebase issues redo rebase of modeling_auto.py fix issues with code formatting more code format auto-fixes move squeezebert before bert in tokenization_auto.py and modeling_auto.py because squeezebert inherits from bert tests for squeezebert modeling and tokenization fix typo move squeezebert before bert in modeling_auto.py to fix inheritance problem disable test_head_masking, since squeezebert doesn't yet implement head masking fix issues exposed by the test_modeling_squeezebert.py fix an issue exposed by test_tokenization_squeezebert.py fix issue exposed by test_modeling_squeezebert.py auto generated code style improvement issue that we inherited from modeling_xxx.py: SqueezeBertForMaskedLM.forward() calls self.cls(), but there is no self.cls, and I think the goal was actually to call self.lm_head() update copyright resolve failing 'test_hidden_states_output' and remove unused encoder_hidden_states and encoder_attention_mask docs add integration test. rename squeezebert-mnli --> squeezebert/squeezebert-mnli autogenerated formatting tweaks integrate feedback from patrickvonplaten and sgugger to programming style and documentation strings * tiny change to order of imports	2020-10-05 04:25:43 -04:00
Sylvain Gugger	722b5807d8	Template updates (#6914 )	2020-09-03 04:14:58 -04:00
Stas Bekman	7351ef83c1	[doc] typos (#6867 ) * [doc] typos fixed typos * Update README.md	2020-09-02 06:51:51 -04:00
Sylvain Gugger	a884b7fa38	Update the new model template (#6019 )	2020-07-24 14:15:37 -04:00
Sam Shleifer	feeb956a19	[docs] Add integration test example to copy pasta template (#5961 ) Co-authored-by: Julien Chaumond <chaumond@gmail.com>	2020-07-22 12:48:38 -04:00
Julien Chaumond	a594ee9c84	More doc for model cards (#3698 ) see https://github.com/huggingface/transformers/pull/3679#pullrequestreview-389368270	2020-04-08 12:12:52 -04:00
Julien Chaumond	94eb68d742	weigths*weights	2020-04-04 15:03:26 -04:00
alberduris	81d6841b4b	GPU text generation: mMoved the encoded_prompt to correct device	2020-01-06 15:11:12 +01:00
alberduris	dd4df80f0b	Moved the encoded_prompts to correct device	2020-01-06 15:11:12 +01:00
Julien Chaumond	4d6c93e923	Kill __main__	2019-12-27 22:55:22 -05:00
Julien Chaumond	30968d70af	misc doc	2019-11-05 19:06:12 -05:00
thomwolf	7f4226f9e6	adding templates	2019-10-30 11:31:56 +01:00

27 Commits