transformers

mirror of https://github.com/huggingface/transformers.git synced 2025-07-28 16:52:24 +06:00

Author	SHA1	Message	Date
Tim Suchanek	68e19f1c22	Fix typo in root README (#5073 )	2020-06-20 23:00:04 +08:00
Sylvain Gugger	e4aaa45805	Update pipeline examples to doctest syntax (#5030 )	2020-06-16 18:14:58 -04:00
Lysandre Debut	88762a2f8c	Specify PyTorch versions for examples (#4710 )	2020-06-02 04:29:28 -04:00
Lysandre Debut	6a17688021	per_device instead of per_gpu/error thrown when argument unknown (#4618 ) * per_device instead of per_gpu/error thrown when argument unknown * [docs] Restore examples.md symlink * Correct absolute links so that symlink to the doc works correctly * Update src/transformers/hf_argparser.py Co-authored-by: Julien Chaumond <chaumond@gmail.com> * Warning + reorder * Docs * Style * not for squad Co-authored-by: Julien Chaumond <chaumond@gmail.com>	2020-05-27 11:36:55 -04:00
Iz Beltagy	8f1d047148	Longformer (#4352 ) * first commit * bug fixes * better examples * undo padding * remove wrong VOCAB_FILES_NAMES * License * make style * make isort happy * unit tests * integration test * make `black` happy by undoing `isort` changes!! * lint * no need for the padding value * batch_size not bsz * remove unused type casting * seqlen not seq_len * staticmethod * `bert` selfattention instead of `n2` * uint8 instead of bool + lints * pad inputs_embeds using embeddings not a constant * black * unit test with padding * fix unit tests * remove redundant unit test * upload model weights * resolve todo * simpler _mask_invalid_locations without lru_cache + backward compatible masked_fill_ * increase unittest coverage	2020-05-19 16:04:43 +02:00
Sam Shleifer	3487be75ef	[Marian] documentation and AutoModel support (#4152 ) - MarianSentencepieceTokenizer - > MarianTokenizer - Start using unk token. - add docs page - add better generation params to MarianConfig - more conversion utilities	2020-05-10 13:54:57 -04:00
Julien Chaumond	c99fe0386b	[doc] Fix broken links + remove crazy big notebook	2020-05-07 18:44:18 -04:00
Julien Chaumond	0ae96ff8a7	BIG Reorganize examples (#4213 ) * Created using Colaboratory * [examples] reorganize files * remove run_tpu_glue.py as superseded by TPU support in Trainer * Bugfix: int, not tuple * move files around	2020-05-07 13:48:44 -04:00
Patrick von Platen	dca34695d0	Reformer (#3351 ) * first copy & past commit from Bert and morgans LSH code * add easy way to compare to trax original code * translate most of function * make trax lsh self attention deterministic with numpy seed + copy paste code * add same config * add same config * make layer init work * implemented hash_vectors function for lsh attention * continue reformer translation * hf LSHSelfAttentionLayer gives same output as trax layer * refactor code * refactor code * refactor code * refactor * refactor + add reformer config * delete bogus file * split reformer attention layer into two layers * save intermediate step * save intermediate step * make test work * add complete reformer block layer * finish reformer layer * implement causal and self mask * clean reformer test and refactor code * fix merge conflicts * fix merge conflicts * update init * fix device for GPU * fix chunk length init for tests * include morgans optimization * improve memory a bit * improve comment * factorize num_buckets * better testing parameters * make whole model work * make lm model work * add t5 copy paste tokenizer * add chunking feed forward * clean config * add improved assert statements * make tokenizer work * improve test * correct typo * extend config * add complexer test * add new axial position embeddings * add local block attention layer * clean tests * refactor * better testing * save intermediate progress * clean test file * make shorter input length work for model * allow variable input length * refactor * make forward pass for pretrained model work * add generation possibility * finish dropout and init * make style * refactor * add first version of RevNet Layers * make forward pass work and add convert file * make uploaded model forward pass work * make uploaded model forward pass work * refactor code * add namedtuples and cache buckets * correct head masks * refactor * made reformer more flexible * make style * remove set max length * add attention masks * fix up tests * fix lsh attention mask * make random seed optional for the moment * improve memory in reformer * add tests * make style * make sure masks work correctly * detach gradients * save intermediate * correct backprob through gather * make style * change back num hashes * rename to labels * fix rotation shape * fix detach * update * fix trainer * fix backward dropout * make reformer more flexible * fix conflict * fix * fix * add tests for fixed seed in reformer layer * fix trainer typo * fix typo in activations * add fp16 tests * add fp16 training * support fp16 * correct gradient bug in reformer * add fast gelu * re-add dropout for embedding dropout * better naming * better naming * renaming * finalize test branch * finalize tests * add more tests * finish tests * fix * fix type trainer * fix fp16 tests * fix tests * fix tests * fix tests * fix issue with dropout * fix dropout seeds * correct random seed on gpu * finalize random seed for dropout * finalize random seed for dropout * remove duplicate line * correct half precision bug * make style * refactor * refactor * docstring * remove sinusoidal position encodings for reformer * move chunking to modeling_utils * make style * clean config * make style * fix tests * fix auto tests * pretrained models * fix docstring * update conversion file * Update pretrained_models.rst * fix rst * fix rst * update copyright * fix test path * fix test path * fix small issue in test * include reformer in generation tests * add docs for axial position encoding * finish docs * Update convert_reformer_trax_checkpoint_to_pytorch.py * remove isort * include sams comments * remove wrong comment in utils * correct typos * fix typo * Update reformer.rst * applied morgans optimization * make style * make gpu compatible * remove bogus file * big test refactor * add example for chunking * fix typo * add to README	2020-05-07 10:17:01 +02:00
Clement	877fc56410	change order pytorch/tf in readme (#4167 )	2020-05-06 16:31:07 -04:00
Jared T Nielsen	64070cbb88	Fix TF input docstrings to refer to tf.Tensor rather than torch.FloatTensor. (#4051 )	2020-04-30 14:28:56 +02:00
Clement	6ba254ee54	quick fix wording readme for community models (#3900 )	2020-04-23 14:19:45 -04:00
Julien Chaumond	dd9d483d03	Trainer (#3800 ) * doc * [tests] Add sample files for a regression task * [HUGE] Trainer * Feedback from @sshleifer * Feedback from @thomwolf + logging tweak * [file_utils] when downloading concurrently, get_from_cache will use the cached file for subsequent processes * [glue] Use default max_seq_length of 128 like before * [glue] move DataTrainingArguments around * [ner] Change interface of InputExample, and align run_{tf,pl} * Re-align the pl scripts a little bit * ner * [ner] Add integration test * Fix language_modeling with API tweak * [ci] Tweak loss target * Don't break console output * amp.initialize: model must be on right device before * [multiple-choice] update for Trainer * Re-align to `827d6d6ef0`	2020-04-21 20:11:56 -04:00
Patrick von Platen	a21d4fa410	add "by" to ReadMe	2020-04-18 18:07:17 +02:00
Patrick von Platen	d22894dfd4	[Docs] Add DialoGPT (#3755 ) * add dialoGPT * update README.md * fix conflict * update readme * add code links to docs * Update README.md * Update dialo_gpt2.rst * Update pretrained_models.rst * Update docs/source/model_doc/dialo_gpt2.rst Co-Authored-By: Julien Chaumond <chaumond@gmail.com> * change filename of dialogpt Co-authored-by: Julien Chaumond <chaumond@gmail.com>	2020-04-16 09:04:32 +02:00
Julien Chaumond	cbad305ce6	[docs] The use of `do_lower_case` in scripts is on its way to deprecation (#3738 )	2020-04-10 12:34:04 -04:00
Julien Chaumond	83703cd077	Update doc for {Summarization,Translation}Pipeline and other tweaks	2020-04-08 09:45:00 -04:00
Lysandre Debut	d5d7d88612	ELECTRA (#3257 ) * Electra wip * helpers * Electra wip * Electra v1 * ELECTRA may be saved/loaded * Generator & Discriminator * Embedding size instead of halving the hidden size * ELECTRA Tokenizer * Revert BERT helpers * ELECTRA Conversion script * Archive maps * PyTorch tests * Start fixing tests * Tests pass * Same configuration for both models * Compatible with base + large * Simplification + weight tying * Archives * Auto + Renaming to standard names * ELECTRA is uncased * Tests * Slight API changes * Update tests * wip * ElectraForTokenClassification * temp * Simpler arch + tests Removed ElectraForPreTraining which will be in a script * Conversion script * Auto model * Update links to S3 * Split ElectraForPreTraining and ElectraForTokenClassification * Actually test PreTraining model * Remove num_labels from configuration * wip * wip * From discriminator and generator to electra * Slight API changes * Better naming * TensorFlow ELECTRA tests * Accurate conversion script * Added to conversion script * Fast ELECTRA tokenizer * Style * Add ELECTRA to README * Modeling Pytorch Doc + Real style * TF Docs * Docs * Correct links * Correct model intialized * random fixes * style * Addressing Patrick's and Sam's comments * Correct links in docs	2020-04-03 14:10:54 -04:00
Thomas Wolf	2187c49f5c	CPU/GPU memory benchmarking utilities - Remove support for python 3.5 (now only 3.6+) (#3186 ) * memory benchmark rss * have both forward pass and line-by-line mem tracing * cleaned up tracing * refactored and cleaning up API * no f-strings yet... * add GPU mem logging * fix GPU memory monitoring * style and quality * clean up and doc * update with comments * Switching to python 3.6+ * fix quality	2020-03-17 10:17:11 -04:00
Sam Shleifer	087465b943	add BART to README (#3255 )	2020-03-12 19:38:05 -04:00
Julien Chaumond	d6de6423ba	[doc] --organization tweak Co-Authored-By: Thomas Wolf <thomwolf@users.noreply.github.com>	2020-03-10 16:52:44 -04:00
Julien Chaumond	0e56dc3078	[doc] Document the new --organization flag of CLI	2020-03-10 16:42:01 -04:00
Santiago Castro	976e9afece	Add syntax highlighting to the BibTeX in README	2020-02-20 10:06:15 -05:00
Lysandre	59c23ad9c9	README link + better instructions for release	2020-02-19 11:57:17 -05:00
VictorSanh	ee5a6856ca	distilbert-base-cased weights + Readmes + omissions	2020-02-07 15:28:13 -05:00
Clement	c069932f5d	Add contributors snapshot powered by https://github.com/sourcerer-io/hall-of-fame	2020-02-06 15:25:47 -05:00
Julien Chaumond	eae8ee0389	[doc] model sharing: mention README.md + tweaks cc @lysandrejik @thomwolf	2020-02-05 14:20:03 -05:00
Arnaud	3a21d6da6b	Typo on markdown link in README.md	2020-01-31 10:58:49 -05:00
Lysandre	0aa40e9569	v2.4.0 documentation	2020-01-31 09:55:34 -05:00
Julien Chaumond	9fa836a73f	fill_mask helper (#2576 ) * fill_mask helper * [poc] FillMaskPipeline * Revert "[poc] FillMaskPipeline" This reverts commit `67eeea55b0`. * Revert "fill_mask helper" This reverts commit `cacc17b884`. * README: clarify that Pipelines can also do text-classification cf. question at the AI&ML meetup last week, @mfuntowicz * Fix test: test feature-extraction pipeline * Test tweaks * Slight refactor of existing pipeline (in preparation of new FillMaskPipeline) * Extraneous doc * More robust way of doing this @mfuntowicz as we don't rely on the model name anymore (see AutoConfig) * Also add RobertaConfig as a quickfix for wrong token_type_ids * cs * [BIG] FillMaskPipeline	2020-01-30 18:15:42 -05:00
Hang Le	f0a4fc6cd6	Add Flaubert	2020-01-30 10:04:18 -05:00
Julien Chaumond	119dc50e2a	Doc tweak on model sharing	2020-01-22 22:40:38 -05:00
alberduris	81d6841b4b	GPU text generation: mMoved the encoded_prompt to correct device	2020-01-06 15:11:12 +01:00
alberduris	dd4df80f0b	Moved the encoded_prompts to correct device	2020-01-06 15:11:12 +01:00
Julien Chaumond	78528742f1	Fix syntax + link to community page	2020-01-05 12:43:39 -05:00
Clement	12e0aa4368	Proposition to include community models in readme	2020-01-05 12:37:11 -05:00
Julien Chaumond	9b2badf3c9	[cli] Update doc	2019-12-27 22:54:29 -05:00
Aymeric Augustin	3233b58ad4	Quote square brackets in shell commands. This ensures compatibility with zsh. Fix #2316.	2019-12-27 08:50:25 +01:00
Aymeric Augustin	a8d34e534e	Remove [--editable] in install instructions. Use -e only in docs targeted at contributors. If a user copy-pastes command line with [--editable], they will hit an error. If they don't know the --editable option, we're giving them a choice to make before they can move forwards, but this isn't a choice they need to make right now.	2019-12-24 08:46:08 +01:00
Aymeric Augustin	70373a5f7c	Update contribution instructions. Also provide shortcuts in a Makefile.	2019-12-23 21:05:30 +01:00
Aymeric Augustin	45841eaf7b	Remove references to Python 2 in documentation.	2019-12-22 18:38:56 +01:00
Aymeric Augustin	b6ea0f43ae	Remove duplicate -v flag.	2019-12-22 17:47:27 +01:00
Aymeric Augustin	ced0a94204	Switch test files to the standard test_*.py scheme.	2019-12-22 14:15:13 +01:00
Aymeric Augustin	067395d5c5	Move tests outside of library.	2019-12-22 13:47:17 +01:00
Aymeric Augustin	698f9e3d7a	Remove trailing whitespace in README.	2019-12-22 13:29:58 +01:00
thomwolf	1ab25c49d3	Merge branch 'master' into pr/2115	2019-12-21 14:54:30 +01:00
Thomas Wolf	6e7102cfb3	Merge pull request #2203 from gthb/patch-1 fix: wrong architecture count in README	2019-12-21 14:31:44 +01:00
Lysandre	a436574bfd	Release: v2.3.0	2019-12-20 16:22:20 -05:00
thomwolf	71883b6ddc	update link in readme	2019-12-20 19:40:23 +01:00
Morgan Funtowicz	b98ff88544	Added pipelines quick tour in README	2019-12-20 15:52:50 +01:00
Stefan Schweter	3e89fca543	readme: add XLM-RoBERTa to model architecture list	2019-12-18 19:44:23 +01:00
Gunnlaugur Thor Briem	d303f84e7b	fix: wrong architecture count in README Just say “the following” so that this intro doesn't so easily fall out of date :) )	2019-12-17 16:18:00 +00:00
Julien Chaumond	3f5ccb183e	[doc] Clarify uploads cf `855ff0e91d (commitcomment-36452545)`	2019-12-16 18:20:29 -05:00
Julien Chaumond	855ff0e91d	[doc] Model upload and sharing ping @lysandrejik @thomwolf Is this clear enough? Anything we should add?	2019-12-16 12:42:22 -05:00
Thomas Wolf	e92bcb7eb6	Merge pull request #1739 from huggingface/t5 [WIP] Adding Google T5 model	2019-12-14 09:40:43 +01:00
Lysandre	7bd11dda6f	Release: v2.2.2	2019-12-13 16:45:30 -05:00
thomwolf	0558c9cb9b	Merge branch 'master' into t5	2019-12-10 12:58:48 +01:00
Suvrat Bhooshan	df3961121f	Add MMBT Model to Transformers Repo	2019-12-09 18:36:48 -08:00
Pierric Cistac	5c877fe94a	fix albert links	2019-12-09 18:53:00 -05:00
Aymeric Augustin	35401fe50f	Remove dependency on pytest for running tests (#2055 ) * Switch to plain unittest for skipping slow tests. Add a RUN_SLOW environment variable for running them. * Switch to plain unittest for PyTorch dependency. * Switch to plain unittest for TensorFlow dependency. * Avoid leaking open files in the test suite. This prevents spurious warnings when running tests. * Fix unicode warning on Python 2 when running tests. The warning was: UnicodeWarning: Unicode equal comparison failed to convert both arguments to Unicode - interpreting them as being unequal * Support running PyTorch tests on a GPU. Reverts `27e015bd`. * Tests no longer require pytest. * Make tests pass on cuda	2019-12-06 13:57:38 -05:00
VictorSanh	552c44a9b1	release distilm-bert	2019-12-05 10:14:58 -05:00
LysandreJik	8101924a68	Patch: v2.2.1	2019-12-03 11:20:26 -05:00
Julien Chaumond	b5d884d25c	Uniformize #1952	2019-11-27 11:05:55 -05:00
Lysandre	cf26a0c85e	Fix pretrained models table	2019-11-26 15:40:03 -05:00
Lysandre Debut	b632145273	Update master documentation link in README	2019-11-26 14:27:15 -05:00
Lysandre	ae98d45991	Release: v2.2.0	2019-11-26 14:12:44 -05:00
Julien Chaumond	176cd1ce1b	[doc] homogenize instructions slightly	2019-11-23 11:18:54 -05:00
Rémi Louf	6f70bb8c69	add instructions to run the examples	2019-11-21 14:41:19 -05:00
Julien Chaumond	3916b334a8	[camembert] Acknowledge the full author list	2019-11-18 09:29:11 -05:00
Sebastian Stabinger	44455eb5b6	Adds CamemBERT to Model architectures list	2019-11-18 09:23:14 -05:00
Thomas Wolf	df99f8c5a1	Merge pull request #1832 from huggingface/memory-leak-schedulers replace LambdaLR scheduler wrappers by function	2019-11-14 22:10:31 +01:00
Rémi Louf	2276bf69b7	update the examples, docs and template	2019-11-14 20:38:02 +01:00
thomwolf	8aba81a0b6	fix #1789	2019-11-12 08:52:43 +01:00
thomwolf	f03c0c1423	adding models in readme and auto classes	2019-11-08 11:49:46 +01:00
Lysandre	68f7064a3e	Add `model.train()` line to ReadMe training example Co-Authored-By: Santosh-Gupta <San.Gupta.ML@gmail.com>	2019-11-04 11:52:35 -05:00
Thomas Wolf	7f84fc571a	Merge pull request #1670 from huggingface/templates Templates and explanation for adding a new model and example script	2019-10-30 17:05:58 +01:00
Thomas Wolf	5c6a19a94a	Merge pull request #1604 from huggingface/deploy_doc Versioning in documentation	2019-10-30 17:03:14 +01:00
thomwolf	328a86d2af	adding links to the templates in readme and contributing	2019-10-30 11:37:55 +01:00
Lysandre	b82bfbd0c3	Updated README to show all available documentation	2019-10-24 15:55:31 +00:00
Julien Chaumond	ef1b8b2ae5	[CTRL] warn if generation prompt does not start with a control code see also https://github.com/salesforce/ctrl/pull/50	2019-10-22 21:30:32 +00:00
Julián Peller (dataista)	e16d46843a	Fix architectures count	2019-10-22 15:13:47 -04:00
thomwolf	4d456542e9	Fix citation	2019-10-21 16:34:14 +02:00
Lysandre Debut	c544194611	Remove `special_tokens_mask` from inputs in README Co-authored-by: Thomas Wolf @thomwolf	2019-10-16 11:05:13 -04:00
Emrah Budur	5a8c6e771a	Fixed the sample code in the title 'Quick tour'.	2019-10-12 14:17:17 +03:00
thomwolf	4b8f3e8f32	adding citation	2019-10-11 16:18:16 +02:00
thomwolf	d9e60f4f0d	Merge branch 'master' into pr/1383	2019-10-09 17:25:08 +02:00
Julien Chaumond	d688af19e5	Update link to swift-coreml-transformers cc @lysandrejik	2019-10-08 16:37:52 -04:00
seanBE	6dc6c716c5	fix pytorch-transformers migration description in README	2019-10-07 09:59:54 +01:00
Christopher Goh	904158ac4d	Rephrase forward method to reduce ambiguity	2019-10-06 23:40:52 -04:00
Christopher Goh	0f65d8cbbe	Fix some typos in README	2019-10-06 23:40:52 -04:00
keskarnitish	dbed1c5d94	Adding CTRL (squashed commit) adding conversion script adding first draft of modeling & tokenization adding placeholder for test files bunch of changes registering the tokenizer/model/etc tests change link; something is very VERY wrong here weird end-of-word thingy going on i think the tokenization works now ; wrote the unit tests overall structure works;load w next the monster is alive! works after some cleanup as well adding emacs autosave to gitignore currently only supporting the 48 layer one; seems to infer fine on my macbook cleanup fixing some documentation fixing some documentation tests passing? now works on CUDA also adding greedy? adding greedy sampling works well	2019-10-03 22:29:03 -07:00
VictorSanh	35071007cb	incoming release 🔥 update links to arxiv preprint	2019-10-03 10:27:11 -04:00
DenysNahurnyi	6971556ab8	Fix syntax typo in README.md	2019-10-01 14:59:31 -04:00
Santosh Gupta	5c3b32d44d	Update README.md Lines 183 - 200, fixed indentation. Line 198, replaced `tokenizer_class` with `BertTokenizer`, since `tokenizer_class` is not defined in the loop it belongs to.	2019-09-30 18:48:01 +00:00
wangfei	60f791631b	Fix link in readme	2019-09-28 16:20:17 +08:00
BramVanroy	15749bfc10	Add small note about the output of hidden states	2019-09-27 10:01:36 +02:00
thomwolf	6c3b131516	typo in readme/doc	2019-09-26 16:23:28 +02:00
thomwolf	4e63c90720	update installation instructions in readme	2019-09-26 16:14:21 +02:00
Lysandre Debut	0f92f76ca3	CircleCI reference in README	2019-09-26 08:59:52 -04:00
thomwolf	9676d1a2a8	update readme and setup.py	2019-09-26 13:47:58 +02:00

1 2 3 4 5 ...

372 Commits