transformers

mirror of https://github.com/huggingface/transformers.git synced 2025-08-03 03:31:05 +06:00

Author	SHA1	Message	Date
Sylvain Gugger	0c96262f7d	Fast transformers import part 1 (#9441 ) * Don't import libs to check they are available * Don't import integrations at init * Add importlib_metdata to deps * Remove old vars references * Avoid syntax error * Adapt testing utils * Try to appease torchhub * Add dependency * Remove more private variables * Fix typo * Another typo * Refine the tf availability test	2021-01-06 12:17:24 -05:00
Patrick von Platen	cbe63949d7	Model Templates for Seq2Seq (#9251 ) * adapt cookie cutter * fix copy past statement * delete copy statements for now * remove unused import from template * make doc rst * correct config docstring * correct training * correct inputs processing tf enc dec * make style * adapt templates * clean tabs * correct tensor -> Tensor naming * correct indent * correct templates * fix the test * break lines to avoid > 119 * Apply suggestions from code review	2020-12-22 23:41:20 +01:00
Lysandre	dc9f245442	Torch scatter with torch 1.7.0	2020-12-16 13:48:57 -05:00
Lysandre	2f918defa8	hotfix torch scatter version	2020-12-16 10:26:13 -05:00
NielsRogge	1551e2dc6d	[WIP] Tapas v4 (tres) (#9117 ) * First commit: adding all files from tapas_v3 * Fix multiple bugs including soft dependency and new structure of the library * Improve testing by adding torch_device to inputs and adding dependency on scatter * Use Python 3 inheritance rather than Python 2 * First draft model cards of base sized models * Remove model cards as they are already on the hub * Fix multiple bugs with integration tests * All model integration tests pass * Remove print statement * Add test for convert_logits_to_predictions method of TapasTokenizer * Incorporate suggestions by Google authors * Fix remaining tests * Change position embeddings sizes to 512 instead of 1024 * Comment out positional embedding sizes * Update PRETRAINED_VOCAB_FILES_MAP and PRETRAINED_POSITIONAL_EMBEDDINGS_SIZES * Added more model names * Fix truncation when no max length is specified * Disable torchscript test * Make style & make quality * Quality * Address CI needs * Test the Masked LM model * Fix the masked LM model * Truncate when overflowing * More much needed docs improvements * Fix some URLs * Some more docs improvements * Test PyTorch scatter * Set to slow + minify * Calm flake8 down * First commit: adding all files from tapas_v3 * Fix multiple bugs including soft dependency and new structure of the library * Improve testing by adding torch_device to inputs and adding dependency on scatter * Use Python 3 inheritance rather than Python 2 * First draft model cards of base sized models * Remove model cards as they are already on the hub * Fix multiple bugs with integration tests * All model integration tests pass * Remove print statement * Add test for convert_logits_to_predictions method of TapasTokenizer * Incorporate suggestions by Google authors * Fix remaining tests * Change position embeddings sizes to 512 instead of 1024 * Comment out positional embedding sizes * Update PRETRAINED_VOCAB_FILES_MAP and PRETRAINED_POSITIONAL_EMBEDDINGS_SIZES * Added more model names * Fix truncation when no max length is specified * Disable torchscript test * Make style & make quality * Quality * Address CI needs * Test the Masked LM model * Fix the masked LM model * Truncate when overflowing * More much needed docs improvements * Fix some URLs * Some more docs improvements * Add add_pooling_layer argument to TapasModel Fix comments by @sgugger and @patrickvonplaten * Fix issue in docs + fix style and quality * Clean up conversion script and add task parameter to TapasConfig * Revert the task parameter of TapasConfig Some minor fixes * Improve conversion script and add test for absolute position embeddings * Improve conversion script and add test for absolute position embeddings * Fix bug with reset_position_index_per_cell arg of the conversion cli * Add notebooks to the examples directory and fix style and quality * Apply suggestions from code review * Move from `nielsr/` to `google/` namespace * Apply Sylvain's comments Co-authored-by: sgugger <sylvain.gugger@gmail.com> Co-authored-by: Rogge Niels <niels.rogge@howest.be> Co-authored-by: LysandreJik <lysandre.debut@reseau.eseo.fr> Co-authored-by: Lysandre Debut <lysandre@huggingface.co> Co-authored-by: sgugger <sylvain.gugger@gmail.com>	2020-12-15 17:08:49 -05:00
Lysandre Debut	67ff1c314a	Templates overhaul 1 (#8993 )	2020-12-08 18:00:07 -05:00
Lysandre Debut	aa60b230ec	Patch model parallel test (#8920 ) * Patch model parallel test * Remove line * Remove `ci_*` from scheduled branches	2020-12-03 17:15:47 -05:00
Lysandre Debut	0c5615af66	Put Transformers on Conda (#8918 ) * conda * Guide * correct tag * Update README.md Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update docs/source/installation.md Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Sylvain's comments Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>	2020-12-03 14:28:49 -05:00
LysandreJik	edbff1fd00	Temporarily deactivate model generation	2020-11-27 16:15:00 -05:00
Stas Bekman	cb7602b38d	typo (#8810 )	2020-11-26 14:47:36 -08:00
Lysandre Debut	6fdd0bb231	Fix slow tests v2 (#8746 ) * Fix BART test * Fix MBART tests * Remove erroneous line from yaml * Update tests/test_modeling_bart.py * Quality	2020-11-24 09:35:12 -05:00
Lysandre Debut	f2e07e7272	Fix a bunch of slow tests (#8634 ) * CI should install `sentencepiece` * Requiring TF * Fixing some TFDPR bugs * remove return_dict=False/True hack Co-authored-by: patrickvonplaten <patrick.v.platen@gmail.com>	2020-11-19 10:41:41 -05:00
cronoik	b290195ac7	grammar (#8639 )	2020-11-18 18:04:25 -05:00
Lysandre Debut	3095ee9dab	Tokenizers should be framework agnostic (#8599 ) * Tokenizers should be framework agnostic * Run the slow tests * Not testing * Fix documentation * Apply suggestions from code review Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com> Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>	2020-11-17 14:03:03 -05:00
Sylvain Gugger	c89bdfbe72	Reorganize repo (#8580 ) * Put models in subfolders * Styling * Fix imports in tests * More fixes in test imports * Sneaky hidden imports * Fix imports in doc files * More sneaky imports * Finish fixing tests * Fix examples * Fix path for copies * More fixes for examples * Fix dummy files * More fixes for example * More model import fixes * Is this why you're unhappy GitHub? * Fix imports in conver command	2020-11-16 21:43:42 -05:00
LysandreJik	9d519dabb7	Fix paths in github YAML	2020-11-13 12:04:17 -05:00
Lysandre Debut	826f04576f	Model templates encoder only (#8509 ) * Model templates * TensorFlow * Remove pooler * CI * Tokenizer + Refactoring * Encoder-Decoder * Let's go testing * Encoder-Decoder in TF * Let's go testing in TF * Documentation * README * Fixes * Better names * Style * Update docs * Choose to skip either TF or PT * Code quality fixes * Add to testing suite * Update file path * Cookiecutter path * Update `transformers` path * Handle rebasing * Remove seq2seq from model templates * Remove s2s config * Apply Sylvain and Patrick comments * Apply suggestions from code review Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Last fixes from code review Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>	2020-11-13 11:59:30 -05:00
Stas Bekman	02bdfc0251	using multi_gpu consistently (#8446 ) * s\|multiple_gpu\|multi_gpu\|g; s\|multigpu\|multi_gpu\|g' * doc	2020-11-10 13:23:58 -05:00
Sylvain Gugger	3213d3bfae	Question template (#8440 ) * Remove SO from question template * Styling	2020-11-10 10:07:56 -05:00
Stas Bekman	190df58560	[github CI] add a multi-gpu job for all example tests (#8341 ) * add a multi-gpu job for all example tests * run only ported tests * rename * explain why env is re-activated on each step * mark all unported/checked tests with @require_torch_non_multigpu_but_fix_me * style * Apply suggestions from code review Co-authored-by: Sam Shleifer <sshleifer@gmail.com> Co-authored-by: Sam Shleifer <sshleifer@gmail.com>	2020-11-09 15:47:38 -05:00
Sam Shleifer	46509d1c19	[docs] remove sshleifer from issue-template :( (#8418 )	2020-11-09 12:51:38 -05:00
Patrick von Platen	226b9debb7	Update PULL_REQUEST_TEMPLATE.md	2020-11-05 09:40:15 +01:00
Patrick von Platen	6f35c61f93	Update bug-report.md	2020-11-05 09:39:05 +01:00
Stas Bekman	1bb4bba53c	[CIs] Better reports everywhere (#8275 ) * make it possible to invoke testconf.py in both test suites without crashing on having the same option added * perl -pi -e 's\|--make_reports\|--make-reports\|' to be consistent with other opts * add `pytest --make-reports` to all CIs (and artifacts) * fix	2020-11-03 16:57:12 -05:00
Lysandre Debut	3c8d401cf6	Patch reports (#8238 )	2020-11-02 10:26:25 -05:00
Lysandre Debut	10f8c63620	Ci test tf super slow (#8007 ) * Test TF GPU CI * Change cache * Fix missing torch requirement * Fix some model tests Style * LXMERT * MobileBERT * Longformer skip test * XLNet * The rest of the tests * RAG goes OOM in multi gpu setup * YAML test files * Last fixes * Skip doctests * Fill mask tests * Yaml files * Last test fix * Style * Update cache * Change ONNX tests to slow + use tiny model	2020-10-30 10:25:48 -04:00
Stas Bekman	0538820737	[CI] Better reports #2 (#8163 )	2020-10-29 19:30:05 -04:00
Lysandre Debut	1b6c8d4811	Update CI cache (#8126 )	2020-10-28 13:59:43 -04:00
Stas Bekman	8065fea870	[gh actions] run artifacts job always (#8110 )	2020-10-28 01:45:19 -04:00
Stas Bekman	bfd5e370a7	[CI] generate separate report files as artifacts (#7995 ) * better reports * a whole bunch of reports in their own files * clean up * improvements * github artifacts experiment * style * complete the report generator with multiple improvements/fixes * fix * save all reports under one dir to easy upload * can remove temp failing tests * doc fix * some cleanup	2020-10-27 09:25:07 -04:00
Thomas Wolf	3a40cdf58d	[tests\|tokenizers] Refactoring pipelines test backbone - Small tokenizers improvements - General tests speedups (#7970 ) * WIP refactoring pipeline tests - switching to fast tokenizers * fix dialog pipeline and fill-mask * refactoring pipeline tests backbone * make large tests slow * fix tests (tf Bart inactive for now) * fix doc... * clean up for merge * fixing tests - remove bart from summarization until there is TF * fix quality and RAG * Add new translation pipeline tests - fix JAX tests * only slow for dialog * Fixing the missing TF-BART imports in modeling_tf_auto * spin out pipeline tests in separate CI job * adding pipeline test to CI YAML * add slow pipeline tests * speed up tf and pt join test to avoid redoing all the standalone pt and tf tests * Update src/transformers/tokenization_utils_base.py Co-authored-by: Sam Shleifer <sshleifer@gmail.com> * Update src/transformers/pipelines.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update src/transformers/pipelines.py Co-authored-by: Lysandre Debut <lysandre@huggingface.co> * Update src/transformers/testing_utils.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * add require_torch and require_tf in is_pt_tf_cross_test Co-authored-by: Sam Shleifer <sshleifer@gmail.com> Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> Co-authored-by: Lysandre Debut <lysandre@huggingface.co>	2020-10-23 15:58:19 +02:00
Sam Shleifer	5ac07513e0	[gh ci] less output ( --durations=50) (#7989 )	2020-10-22 16:10:15 -04:00
Stas Bekman	805a202e1a	[CIs] report slow tests add --durations=0 to some pytest jobs (#7884 ) * add --durations=50 to some pytest runs * report all tests	2020-10-19 08:23:14 -04:00
Stas Bekman	4eb61f8e88	remove USE_CUDA (#7861 )	2020-10-19 07:08:34 -04:00
Terencio Agozzino	7c44c864a5	style: fix typo (#7883 )	2020-10-19 06:14:53 -04:00
Thomas Wolf	ba8c4d0ac0	[Dependencies\|tokenizers] Make both SentencePiece and Tokenizers optional dependencies (#7659 ) * splitting fast and slow tokenizers [WIP] * [WIP] splitting sentencepiece and tokenizers dependencies * update dummy objects * add name_or_path to models and tokenizers * prefix added to file names * prefix * styling + quality * spliting all the tokenizer files - sorting sentencepiece based ones * update tokenizer version up to 0.9.0 * remove hard dependency on sentencepiece 🎉 * and removed hard dependency on tokenizers 🎉 * update conversion script * update missing models * fixing tests * move test_tokenization_fast to main tokenization tests - fix bugs * bump up tokenizers * fix bert_generation * update ad fix several tokenizers * keep sentencepiece in deps for now * fix funnel and deberta tests * fix fsmt * fix marian tests * fix layoutlm * fix squeezebert and gpt2 * fix T5 tokenization * fix xlnet tests * style * fix mbart * bump up tokenizers to 0.9.2 * fix model tests * fix tf models * fix seq2seq examples * fix tests without sentencepiece * fix slow => fast conversion without sentencepiece * update auto and bert generation tests * fix mbart tests * fix auto and common test without tokenizers * fix tests without tokenizers * clean up tests lighten up when tokenizers + sentencepiece are both off * style quality and tests fixing * add sentencepiece to doc/examples reqs * leave sentencepiece on for now * style quality split hebert and fix pegasus * WIP Herbert fast * add sample_text_no_unicode and fix hebert tokenization * skip FSMT example test for now * fix style * fix fsmt in example tests * update following Lysandre and Sylvain's comments * Update src/transformers/testing_utils.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update src/transformers/testing_utils.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update src/transformers/tokenization_utils_base.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update src/transformers/tokenization_utils_base.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>	2020-10-18 20:51:24 +02:00
Patrick von Platen	2d4e928d97	Update PULL_REQUEST_TEMPLATE.md Putting my name on a couple more issues to directly redirect them to me	2020-10-13 12:18:31 +02:00
Thomas Wolf	55cb2ee62e	Green tests: update torch-hub test dependencies (add protobuf and pin tokenizer 0.9.0-RC2) (#7658 ) * pin torch-hub test * add protobuf dep	2020-10-08 13:21:15 +02:00
Lysandre Debut	44a93c981f	Number of GPUs for multi-gpu (#7472 )	2020-09-30 06:53:20 -04:00
Lysandre	35e94c68df	Number of GPUs	2020-09-30 12:29:26 +02:00
Lysandre Debut	056723ad1d	Multi-GPU setup (#7453 )	2020-09-30 05:53:34 -04:00
Lysandre Debut	7f4115c099	Pull request template (#7392 ) co-authored-by: sgugger <sylvain.gugger@gmail.com> Co-authored-by: sgugger <sylvain.gugger@gmail.com>	2020-09-28 09:51:49 -04:00
Sylvain Gugger	514486739c	Fix CI with change of name of nlp (#7054 ) * nlp -> datasets * More nlp -> datasets * Woopsie * More nlp -> datasets * One last	2020-09-10 14:51:08 -04:00
Sam Shleifer	9d1b4db2aa	add nlp install (#6767 )	2020-08-27 11:08:14 -04:00
Sylvain Gugger	64c7c2bc15	Install nlp for github actions test (#6728 )	2020-08-25 14:58:38 -04:00
Funtowicz Morgan	ac9702c284	Fix ONNX test_quantize unittest (#6716 )	2020-08-25 13:24:40 -04:00
Sam Shleifer	a99d09c6f9	add new line to make examples run (#6706 )	2020-08-25 06:26:29 -04:00
Stas Bekman	a8d6716ecb	Create PULL_REQUEST_TEMPLATE.md (#6660 ) * Create PULL_REQUEST_TEMPLATE.md Proposing to copy this neat feature from pytorch. This is a small template that let's a PR submitter tell which issue that PR closes. * Update .github/PULL_REQUEST_TEMPLATE.md Co-authored-by: Kevin Canwen Xu <canwenxu@126.com> Co-authored-by: Kevin Canwen Xu <canwenxu@126.com>	2020-08-25 00:30:38 +08:00
Stefan Schweter	cfa26d2b41	github: add @stefan-it to bug-report template for all token-classification related bugs (#6489 )	2020-08-18 08:38:54 -04:00
Kevin Canwen Xu	fe61c05b85	Add examples/bert-loses-patience who can help (#6499 )	2020-08-16 16:30:16 +08:00

1 2

100 Commits