Rename torch.run to torchrun (#30405)

torch.run does not exist anywhere as far as I can tell.
Steven Basart 2024-04-23 12:04:17 -04:00 committed by GitHub
parent 696ededd2b
commit b8b1e442e3


@@ -659,7 +659,7 @@ You could also use the [`Trainer`]'s `--save_on_each_node` argument to automatic
 For [torchrun](https://pytorch.org/docs/stable/elastic/run.html), you have to ssh to each node and run the following command on both of them. The launcher waits until both nodes are synchronized before launching the training.
 ```bash
-python -m torch.run --nproc_per_node=8 --nnode=2 --node_rank=0 --master_addr=hostname1 \
+torchrun --nproc_per_node=8 --nnode=2 --node_rank=0 --master_addr=hostname1 \
 --master_port=9901 your_program.py <normal cl args> --deepspeed ds_config.json
 ```
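
The command in the diff pins `--node_rank=0`. With a static two-node launch, each node normally needs its own rank, so the second node would use `--node_rank=1`. The sketch below is illustrative only: `build_launch_cmd` is a hypothetical helper that is not part of this commit, it assumes `hostname1` is the rank-0 node, and it omits the `<normal cl args>` placeholder. It prints the command for each node without launching anything.

```bash
# Hypothetical helper (not part of the commit): prints the torchrun command
# for one node so the per-node difference is easy to see.
# It only echoes the command; it does not start any training.
build_launch_cmd() {
  # $1 is the node rank: 0 on the first node, 1 on the second (assumption:
  # hostname1 is the rank-0 node that the other node connects to).
  printf '%s\n' "torchrun --nproc_per_node=8 --nnode=2 --node_rank=$1 --master_addr=hostname1 --master_port=9901 your_program.py --deepspeed ds_config.json"
}

build_launch_cmd 0   # command to run on the first node
build_launch_cmd 1   # command to run on the second node
```

Only the `--node_rank` value differs between the two printed commands. All other flags are copied from the diff unchanged.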