Commit b1a5b23

Merge branch 'site' into Scaling-PyTorch-FSDP-blog-post
2 parents 5ab587c + 162ceb1 commit b1a5b23

2,062 files changed: +4,198 −3,739 lines


_get_started/pytorch.md

Lines changed: 3 additions & 3 deletions
@@ -395,7 +395,7 @@ It’s rare to get both performance and convenience, but this is why the core te
 For GPU (newer generation GPUs will see drastically better performance)
 
 ```
-pip3 install numpy --pre torch[dynamo] --force-reinstall --extra-index-url https://download.pytorch.org/whl/nightly/cu117
+pip3 install numpy --pre torch --force-reinstall --extra-index-url https://download.pytorch.org/whl/nightly/cu117
 ```
 
 For CPU

@@ -452,11 +452,11 @@ PyTorch 2.0 is what 1.14 would have been. We were releasing substantial new feat
 
 CUDA 11.7<br>
 ```
-pip3 install numpy --pre torch[dynamo] torchvision torchaudio --force-reinstall --extra-index-url https://download.pytorch.org/whl/nightly/cu117
+pip3 install numpy --pre torch torchvision torchaudio --force-reinstall --extra-index-url https://download.pytorch.org/whl/nightly/cu117
 ```
 CUDA 11.6
 ```
-pip3 install numpy --pre torch[dynamo] torchvision torchaudio --force-reinstall --extra-index-url https://download.pytorch.org/whl/nightly/cu116
+pip3 install numpy --pre torch torchvision torchaudio --force-reinstall --extra-index-url https://download.pytorch.org/whl/nightly/cu116
 ```
 CPU
 ```
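
Both hunks make the same change: the nightly wheel no longer needs the `[dynamo]` extra, so the install line is just `--pre torch`. A minimal sketch for sanity-checking such an install (the printed version string is illustrative):

```python
import torch

# The compiler stack ships inside the torch nightly wheel itself,
# so torch.compile is available without any [dynamo] extra.
print(torch.__version__)  # e.g. a 2.0 nightly build string

def double(x):
    return 2 * x

compiled = torch.compile(double)  # inductor is the default backend
print(compiled(torch.ones(4)))    # tensor([2., 2., 2., 2.])
```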

_posts/2022-12-02-Accelerating-Hugging-Face-and-TIMM-models.md

Lines changed: 35 additions & 35 deletions
@@ -32,7 +32,7 @@ This tutorial will show you exactly how to replicate those speedups so you can b
 For GPU (newer generation GPUs will see drastically better performance)
 
 ```
-pip3 install numpy --pre torch[dynamo] --force-reinstall --extra-index-url https://download.pytorch.org/whl/nightly/cu117
+pip3 install numpy --pre torch --force-reinstall --extra-index-url https://download.pytorch.org/whl/nightly/cu117
 
 ```

@@ -78,16 +78,16 @@ by step. Please note that you’re likely to see more significant speedups the n
 
 ```python
 import torch
-def fn(x, y):
-    a = torch.sin(x).cuda()
-    b = torch.sin(y).cuda()
-    return a + b
-new_fn = torch.compile(fn, backend="inductor")
-input_tensor = torch.randn(10000).to(device="cuda:0")
-a = new_fn()
+def fn(x, y):
+    a = torch.sin(x).cuda()
+    b = torch.sin(y).cuda()
+    return a + b
+new_fn = torch.compile(fn, backend="inductor")
+input_tensor = torch.randn(10000).to(device="cuda:0")
+a = new_fn()
 ```
 
-This example won’t actually run faster but it’s a good educational.
+This example won’t actually run faster but it’s educational.
 
 example that features `torch.cos()` and `torch.sin()` which are examples of pointwise ops as in they operate element by element on a vector. A more famous pointwise op you might actually want to use would be something like `torch.relu()`.
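
As committed, the snippet still calls `new_fn()` with no arguments even though `fn` takes `x` and `y`, and it applies `sin` twice while the prose below mentions a `cos`/`sin` pair. A runnable sketch of the same pointwise-fusion experiment (assuming a CUDA device is available) might look like:

```python
import torch

def fn(x, y):
    a = torch.cos(x).cuda()
    b = torch.sin(y).cuda()
    return a + b

# Compile with the inductor backend, then call with real inputs
# so graph capture and code generation actually trigger.
new_fn = torch.compile(fn, backend="inductor")
input_tensor = torch.randn(10000).to(device="cuda:0")
a = new_fn(input_tensor, input_tensor)
```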

@@ -110,17 +110,17 @@ TORCHINDUCTOR_TRACE=1 python trig.py
 ```python
 
 @pointwise(size_hints=[16384], filename=__file__, meta={'signature': {0: '*fp32', 1: '*fp32', 2: 'i32'}, 'device': 0, 'constants': {}, 'configs': [instance_descriptor(divisible_by_16=(0, 1, 2), equal_to_1=())]})
-@triton.jit
-def kernel(in_ptr0, out_ptr0, xnumel, XBLOCK : tl.constexpr):
-    xnumel = 10000
-    xoffset = tl.program_id(0) * XBLOCK
-    xindex = xoffset + tl.reshape(tl.arange(0, XBLOCK), [XBLOCK])
-    xmask = xindex < xnumel
-    x0 = xindex
-    tmp0 = tl.load(in_ptr0 + (x0), xmask)
-    tmp1 = tl.sin(tmp0)
-    tmp2 = tl.sin(tmp1)
-    tl.store(out_ptr0 + (x0 + tl.zeros([XBLOCK], tl.int32)), tmp2, xmask)
+@triton.jit
+def kernel(in_ptr0, out_ptr0, xnumel, XBLOCK : tl.constexpr):
+    xnumel = 10000
+    xoffset = tl.program_id(0) * XBLOCK
+    xindex = xoffset + tl.reshape(tl.arange(0, XBLOCK), [XBLOCK])
+    xmask = xindex < xnumel
+    x0 = xindex
+    tmp0 = tl.load(in_ptr0 + (x0), xmask)
+    tmp1 = tl.sin(tmp0)
+    tmp2 = tl.sin(tmp1)
+    tl.store(out_ptr0 + (x0 + tl.zeros([XBLOCK], tl.int32)), tmp2, xmask)
 
 ```
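
The hunk header preserves the invocation that produced this output, `TORCHINDUCTOR_TRACE=1 python trig.py`. A sketch of the same experiment driven from Python, under the assumption that inductor reads the flag when torch is first imported, could be:

```python
import os

# Assumption: the trace flag must be set before torch is imported
# for inductor to pick it up.
os.environ["TORCHINDUCTOR_TRACE"] = "1"

import torch

def trig(x):
    # Two chained pointwise ops, matching the sin(sin(x)) kernel above.
    return torch.sin(torch.sin(x))

compiled = torch.compile(trig, backend="inductor")
compiled(torch.randn(10000, device="cuda"))  # first call compiles and emits the trace
```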

@@ -132,9 +132,9 @@ As a next step let’s try a real model like resnet50 from the PyTorch hub.
 
 ```python
 import torch
-model = torch.hub.load('pytorch/vision:v0.10.0', 'resnet18', pretrained=True)
-opt_model = torch.compile(model, backend="inductor")
-model(torch.randn(1,3,64,64))
+model = torch.hub.load('pytorch/vision:v0.10.0', 'resnet18', pretrained=True)
+opt_model = torch.compile(model, backend="inductor")
+model(torch.randn(1,3,64,64))
 
 ```
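
Note that the hunk's context builds `opt_model` but then invokes the uncompiled `model`. A sketch that exercises the compiled path instead (same resnet18 hub checkpoint; the printed shape comes from resnet18's 1000-class head):

```python
import torch

model = torch.hub.load('pytorch/vision:v0.10.0', 'resnet18', pretrained=True)
opt_model = torch.compile(model, backend="inductor")

# Call the compiled module so inductor actually runs; the first call
# pays the compile cost, later calls reuse the generated kernels.
out = opt_model(torch.randn(1, 3, 64, 64))
print(out.shape)  # torch.Size([1, 1000])
```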

@@ -152,14 +152,14 @@ So we’re going to directly download a pretrained model from the Hugging Face h
 ```python
 
 import torch
-from transformers import BertTokenizer, BertModel
-# Copy pasted from here https://huggingface.co/bert-base-uncased
-tokenizer = BertTokenizer.from_pretrained('bert-base-uncased')
-model = BertModel.from_pretrained("bert-base-uncased").to(device="cuda:0")
-model = torch.compile(model) # This is the only line of code that we changed
-text = "Replace me by any text you'd like."
-encoded_input = tokenizer(text, return_tensors='pt').to(device="cuda:0")
-output = model(**encoded_input)
+from transformers import BertTokenizer, BertModel
+# Copy pasted from here https://huggingface.co/bert-base-uncased
+tokenizer = BertTokenizer.from_pretrained('bert-base-uncased')
+model = BertModel.from_pretrained("bert-base-uncased").to(device="cuda:0")
+model = torch.compile(model) # This is the only line of code that we changed
+text = "Replace me by any text you'd like."
+encoded_input = tokenizer(text, return_tensors='pt').to(device="cuda:0")
+output = model(**encoded_input)
 
 ```
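
Because `torch.compile` pays its cost on the first call, a fair measurement of a snippet like this times warmup separately. A rough sketch (hypothetical `bench` helper, CUDA assumed) for comparing eager and compiled BERT:

```python
import time
import torch

def bench(f, warmup=3, iters=10):
    # Hypothetical helper: warm up first so one-time compilation
    # is not counted in the steady-state measurement.
    for _ in range(warmup):
        f()
    torch.cuda.synchronize()
    start = time.perf_counter()
    for _ in range(iters):
        f()
    torch.cuda.synchronize()  # wait for queued GPU work before stopping the clock
    return (time.perf_counter() - start) / iters
```

Called as `bench(lambda: model(**encoded_input))` before and after the `torch.compile` line, it gives a per-iteration average for each variant.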

@@ -171,10 +171,10 @@ Similarly let’s try out a TIMM example
 
 ```python
 import timm
-import torch
-model = timm.create_model('resnext101_32x8d', pretrained=True, num_classes=2)
-opt_model = torch.compile(model, backend="inductor")
-opt_model(torch.randn(64,3,7,7))
+import torch
+model = timm.create_model('resnext101_32x8d', pretrained=True, num_classes=2)
+opt_model = torch.compile(model, backend="inductor")
+opt_model(torch.randn(64,3,7,7))
 ```
 
 Our goal with PyTorch was to build a breadth-first compiler that would speed up the vast majority of actual models people run in open source. The Hugging Face Hub ended up being an extremely valuable benchmarking tool for us, ensuring that any optimization we work on actually helps accelerate models people want to run.

docs/master/_dynamo.html

Lines changed: 3 additions & 3 deletions
@@ -237,7 +237,7 @@
 <div class="pytorch-left-menu-search">
 
 <div class="version">
-<a href='https://pytorch.org/docs/versions.html'>master (1.14.0a0+git876b702 ) &#x25BC</a>
+<a href='https://pytorch.org/docs/versions.html'>master (2.0.0a0+git9eccfed ) &#x25BC</a>
 </div>
 

@@ -517,7 +517,7 @@
 
 <dl class="py function">
 <dt class="sig sig-object py" id="torch._dynamo.optimize">
-<span class="sig-prename descclassname"><span class="pre">torch._dynamo.</span></span><span class="sig-name descname"><span class="pre">optimize</span></span><span class="sig-paren">(</span><em class="sig-param"><span class="n"><span class="pre">backend</span></span><span class="o"><span class="pre">=</span></span><span class="default_value"><span class="pre">'inductor'</span></span></em>, <em class="sig-param"><span class="o"><span class="pre">*</span></span></em>, <em class="sig-param"><span class="n"><span class="pre">nopython</span></span><span class="o"><span class="pre">=</span></span><span class="default_value"><span class="pre">False</span></span></em>, <em class="sig-param"><span class="n"><span class="pre">guard_export_fn</span></span><span class="o"><span class="pre">=</span></span><span class="default_value"><span class="pre">None</span></span></em>, <em class="sig-param"><span class="n"><span class="pre">disable</span></span><span class="o"><span class="pre">=</span></span><span class="default_value"><span class="pre">False</span></span></em>, <em class="sig-param"><span class="n"><span class="pre">dynamic</span></span><span class="o"><span class="pre">=</span></span><span class="default_value"><span class="pre">False</span></span></em><span class="sig-paren">)</span><a class="reference internal" href="_modules/torch/_dynamo/eval_frame.html#optimize"><span class="viewcode-link"><span class="pre">[source]</span></span></a><a class="headerlink" href="#torch._dynamo.optimize" title="Permalink to this definition"></a></dt>
+<span class="sig-prename descclassname"><span class="pre">torch._dynamo.</span></span><span class="sig-name descname"><span class="pre">optimize</span></span><span class="sig-paren">(</span><em class="sig-param"><span class="n"><span class="pre">backend</span></span><span class="o"><span class="pre">=</span></span><span class="default_value"><span class="pre">'inductor'</span></span></em>, <em class="sig-param"><span class="o"><span class="pre">*</span></span></em>, <em class="sig-param"><span class="n"><span class="pre">nopython</span></span><span class="o"><span class="pre">=</span></span><span class="default_value"><span class="pre">False</span></span></em>, <em class="sig-param"><span class="n"><span class="pre">guard_export_fn</span></span><span class="o"><span class="pre">=</span></span><span class="default_value"><span class="pre">None</span></span></em>, <em class="sig-param"><span class="n"><span class="pre">guard_fail_fn</span></span><span class="o"><span class="pre">=</span></span><span class="default_value"><span class="pre">None</span></span></em>, <em class="sig-param"><span class="n"><span class="pre">disable</span></span><span class="o"><span class="pre">=</span></span><span class="default_value"><span class="pre">False</span></span></em>, <em class="sig-param"><span class="n"><span class="pre">dynamic</span></span><span class="o"><span class="pre">=</span></span><span class="default_value"><span class="pre">False</span></span></em><span class="sig-paren">)</span><a class="reference internal" href="_modules/torch/_dynamo/eval_frame.html#optimize"><span class="viewcode-link"><span class="pre">[source]</span></span></a><a class="headerlink" href="#torch._dynamo.optimize" title="Permalink to this definition"></a></dt>
 <dd><p>The main entrypoint of TorchDynamo. Do graph capture and call
 backend() to optimize extracted graphs.</p>
 <dl class="field-list simple">

@@ -548,7 +548,7 @@
 
 <dl class="py function">
 <dt class="sig sig-object py" id="torch._dynamo.optimize_assert">
-<span class="sig-prename descclassname"><span class="pre">torch._dynamo.</span></span><span class="sig-name descname"><span class="pre">optimize_assert</span></span><span class="sig-paren">(</span><em class="sig-param"><span class="n"><span class="pre">backend</span></span></em>, <em class="sig-param"><span class="o"><span class="pre">*</span></span></em>, <em class="sig-param"><span class="n"><span class="pre">guard_export_fn</span></span><span class="o"><span class="pre">=</span></span><span class="default_value"><span class="pre">None</span></span></em>, <em class="sig-param"><span class="n"><span class="pre">export</span></span><span class="o"><span class="pre">=</span></span><span class="default_value"><span class="pre">False</span></span></em>, <em class="sig-param"><span class="n"><span class="pre">dynamic</span></span><span class="o"><span class="pre">=</span></span><span class="default_value"><span class="pre">False</span></span></em><span class="sig-paren">)</span><a class="reference internal" href="_modules/torch/_dynamo/eval_frame.html#optimize_assert"><span class="viewcode-link"><span class="pre">[source]</span></span></a><a class="headerlink" href="#torch._dynamo.optimize_assert" title="Permalink to this definition"></a></dt>
+<span class="sig-prename descclassname"><span class="pre">torch._dynamo.</span></span><span class="sig-name descname"><span class="pre">optimize_assert</span></span><span class="sig-paren">(</span><em class="sig-param"><span class="n"><span class="pre">backend</span></span></em>, <em class="sig-param"><span class="o"><span class="pre">*</span></span></em>, <em class="sig-param"><span class="n"><span class="pre">hooks</span></span><span class="o"><span class="pre">=</span></span><span class="default_value"><span class="pre">Hooks(guard_export_fn=None,</span> <span class="pre">guard_fail_fn=None)</span></span></em>, <em class="sig-param"><span class="n"><span class="pre">export</span></span><span class="o"><span class="pre">=</span></span><span class="default_value"><span class="pre">False</span></span></em>, <em class="sig-param"><span class="n"><span class="pre">dynamic</span></span><span class="o"><span class="pre">=</span></span><span class="default_value"><span class="pre">False</span></span></em><span class="sig-paren">)</span><a class="reference internal" href="_modules/torch/_dynamo/eval_frame.html#optimize_assert"><span class="viewcode-link"><span class="pre">[source]</span></span></a><a class="headerlink" href="#torch._dynamo.optimize_assert" title="Permalink to this definition"></a></dt>
 <dd><p>The same as <cite>torch._dynamo.optimize(backend, nopython=True)</cite></p>
 </dd></dl>
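
The regenerated signatures track the nightly's private API: `optimize()` gains a `guard_fail_fn` callback and `optimize_assert()` folds its callbacks into a `Hooks` container. A minimal usage sketch of the entrypoint as documented above (private `torch._dynamo` namespace, so subject to change):

```python
import torch
import torch._dynamo as dynamo

def toy(x):
    return torch.sin(x) + torch.cos(x)

# optimize() returns a decorator: apply it to capture toy's graph
# and hand the captured graph to the inductor backend.
opt_toy = dynamo.optimize("inductor", nopython=False)(toy)
print(opt_toy(torch.randn(8)))
```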

docs/master/_images/RReLU.png

124 Bytes

docs/master/_modules/index.html

Lines changed: 1 addition & 1 deletion
@@ -235,7 +235,7 @@
 <div class="pytorch-left-menu-search">
 
 <div class="version">
-<a href='https://pytorch.org/docs/versions.html'>master (1.14.0a0+git876b702 ) &#x25BC</a>
+<a href='https://pytorch.org/docs/versions.html'>master (2.0.0a0+git9eccfed ) &#x25BC</a>
 </div>
 
