Clear and concise description of the problem
It would be good to provide an option to select the accelerator as TPU instead of GPU.
We can also auto-select the TPU accelerator when the template is opened with Colab, and add torch_xla installation steps.
What to do:
0) Try a template with TPUs. Choose the distributed training option with 8 processes and the spawning option. "Open in Colab" one template, for example the vision classification template, manually install torch_xla (see https://colab.research.google.com/drive/1E9zJrptnLJ_PKhmaP5Vhb6DTVRvyrKHx) and run the code with the xla-tpu backend: `python main.py --nproc_per_node 8 --backend xla-tpu`. If everything is done correctly, training should run (see the sketch after this list).
- Update UI
- Add a drop-down menu for backend selection ("nccl" and "xla-tpu") in "Training Options"
- when the user selects "xla-tpu", training should only be distributed, with 8 processes and "Run the training with torch.multiprocessing.spawn"
- Update content: README.md and other impacted files
- if exported to Colab, we need to make sure that the accelerator is "TPU" (see the sketch under Additional context below)
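
For reference, here is a minimal sketch (not the template's actual code) of what the generated `main.py` is expected to do once the `xla-tpu` backend is wired in, assuming PyTorch-Ignite's `idist.Parallel` launcher; the `training` function body and `config` dict are placeholders:

```python
# Minimal sketch, not the template's actual code: run a training function on
# 8 TPU cores through ignite.distributed with the "xla-tpu" backend.
# Requires torch_xla to be installed (e.g. in the Colab runtime linked above).
import ignite.distributed as idist


def training(local_rank, config):
    # For the "xla-tpu" backend, idist.device() resolves to an XLA (TPU) device.
    device = idist.device()
    print(f"rank {idist.get_rank()}/{idist.get_world_size()} on {device}")
    # ... build dataloaders, model and trainer here ...


config = {}  # placeholder for the template's parsed CLI arguments

# With backend="xla-tpu", idist.Parallel spawns the 8 worker processes itself,
# which matches the "Run the training with torch.multiprocessing.spawn" option.
with idist.Parallel(backend="xla-tpu", nproc_per_node=8) as parallel:
    parallel.run(training, config)
```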
Suggested solution
Alternative
Additional context
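For the last checklist item (exported notebooks opening with a TPU runtime), one possible approach is to set Colab's `accelerator` notebook metadata during export. A hedged sketch, assuming the exporter produces an `.ipynb` file and that Colab honours this metadata field; the file name is only an example:

```python
# Hypothetical post-processing of an exported notebook: request a TPU runtime
# by setting Colab's "accelerator" metadata field. File name is an example.
import nbformat

nb = nbformat.read("vision-classification.ipynb", as_version=4)
nb.metadata["accelerator"] = "TPU"
nbformat.write(nb, "vision-classification.ipynb")
```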