mirror of
https://github.com/huggingface/transformers.git
synced 2025-07-12 09:10:05 +06:00

* Add SynthIDTextWatermarkLogitsProcessor * esolving comments. * Resolving comments. * esolving commits, * Improving SynthIDWatermark tests. * switch to PT version * detector as pretrained model + style * update training + style * rebase * Update logits_process.py * Improving SynthIDWatermark tests. * Shift detector training to wikitext negatives and stabilize with lower learning rate. * Clean up. * in for 7B * cleanup * upport python 3.8. * README and final cleanup. * HF Hub upload and initiaze. * Update requirements for synthid_text. * Adding SynthIDTextWatermarkDetector. * Detector testing. * Documentation changes. * Copyrights fix. * Fix detector api. * ironing out errors * ironing out errors * training checks * make fixup and make fix-copies * docstrings and add to docs * copyright * BC * test docstrings * move import * protect type hints * top level imports * watermarking example * direct imports * tpr fpr meaning * process_kwargs * SynthIDTextWatermarkingConfig docstring * assert -> exception * example updates * no immutable dict (cant be serialized) * pack fn * einsum equivalent * import order * fix test on gpu * add detector example --------- Co-authored-by: Sumedh Ghaisas <sumedhg@google.com> Co-authored-by: Marc Sun <marc@huggingface.co> Co-authored-by: sumedhghaisas2 <138781311+sumedhghaisas2@users.noreply.github.com> Co-authored-by: raushan <raushan@huggingface.co>
35 lines
1.2 KiB
Markdown
# SynthID Text

This project shows how to use SynthID Text to watermark text generated by LLMs. The code in this repo also
demonstrates how to train a detector for such watermarked text. The detector can be uploaded to
a private HF Hub repo (private for security reasons) and initialized again through pretrained model loading, which the script also shows.

See our blog post: https://huggingface.co/blog/synthid-text
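
As a rough illustration of the watermarking step, the sketch below builds a `SynthIDTextWatermarkingConfig` and passes it to `generate`. The key values, `NGRAM_LEN`, and the helper name `generate_watermarked` are placeholders chosen for this example, not values taken from this project; treat it as a minimal sketch rather than the project's own code.

```python
# Placeholder watermarking keys (assumption): use your own secret integers and
# reuse the same list for generation, detector training, and detection.
WATERMARK_KEYS = [654, 400, 836, 123, 340, 443, 597, 160, 57, 29]
NGRAM_LEN = 5  # assumed context length for this example


def generate_watermarked(prompt, model_name="google/gemma-7b-it", max_new_tokens=32):
    """Hypothetical helper: sample watermarked text (downloads the model)."""
    # Imported lazily so the constants above can be inspected without
    # transformers being installed.
    from transformers import (
        AutoModelForCausalLM,
        AutoTokenizer,
        SynthIDTextWatermarkingConfig,
    )

    tokenizer = AutoTokenizer.from_pretrained(model_name)
    model = AutoModelForCausalLM.from_pretrained(model_name)
    watermarking_config = SynthIDTextWatermarkingConfig(
        keys=WATERMARK_KEYS, ngram_len=NGRAM_LEN
    )

    inputs = tokenizer(prompt, return_tensors="pt")
    # The watermark is applied while sampling, so do_sample=True is needed.
    output_ids = model.generate(
        **inputs,
        watermarking_config=watermarking_config,
        do_sample=True,
        max_new_tokens=max_new_tokens,
    )
    return tokenizer.batch_decode(output_ids, skip_special_tokens=True)[0]
```

Calling `generate_watermarked("Write a short note about rivers.")` would return sampled text carrying the watermark defined by the keys.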
## Python version

You need Python 3.9 to run this example.

## Installation and running

After installing transformers, install this project's requirements from the `requirements.txt` file provided in this folder.

```
pip install -r requirements.txt
```

## To run the detector training

```
python detector_training.py --model_name=google/gemma-7b-it
```

Check the script for more tunable parameters, and see the paper at
https://www.nature.com/articles/s41586-024-08025-4 for more information on these parameters.

## Caveat

Make sure to run the detector training and the detection on the same hardware
(CPU, GPU, or TPU) to get consistent results. We use deterministic randomness, which is hardware dependent.
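
One way to guard against a hardware mismatch is to store the device type next to the trained detector and compare it before running detection. The sketch below is only an illustration: the record format and the helper names `make_hardware_record` and `hardware_matches` are hypothetical and not part of this project or of transformers.

```python
import json


def make_hardware_record(device: str) -> str:
    """Serialize the device type used for detector training (hypothetical format)."""
    return json.dumps({"device": device.lower()})


def hardware_matches(record: str, device: str) -> bool:
    """Return True if detection runs on the same device type as training."""
    return json.loads(record)["device"] == device.lower()


# Example: the detector was trained on GPU; detection should also use GPU.
record = make_hardware_record("cuda")
if not hardware_matches(record, "cpu"):
    print("Warning: detector was trained on a different device type.")
```

The record could be saved alongside the detector weights so the check travels with the model.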