PyTorch3 Flashcards

(61 cards)

1
Q

Create a tensor from Python data

A
 import torch
 x = torch.tensor([1, 2, 3])
2
Q

Create common tensors (zeros/ones/rand)

A
 x0 = torch.zeros(3, 4)
 x1 = torch.ones(3, 4)
 xr = torch.rand(3, 4)
3
Q

Create tensor like another tensor (shape/dtype/device)

A
 y = torch.zeros_like(x)
 z = torch.randn_like(x.float())
4
Q

Set dtype explicitly

A
 x = torch.tensor([1,2,3], dtype=torch.float32)
5
Q

Move tensor to device (CPU/GPU)

A
 device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
 x = x.to(device)
6
Q

Change dtype (cast)

A
 x = x.to(torch.float16)
 # or
 x = x.float()
7
Q

Tensor shape/size

A
 x.shape
 x.size()
8
Q

Reshape (view/reshape)

A
 y = x.reshape(2, -1)
 # view requires a contiguous tensor; reshape copies when needed
 y = x.view(2, -1)

(-1) infers a dimension
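A quick sketch of `-1` inference and of where `view` differs from `reshape` (toy sizes, chosen for illustration):

```python
import torch

x = torch.arange(12)      # 12 elements
y = x.reshape(3, -1)      # -1 is inferred as 4
t = y.t()                 # transpose makes the tensor non-contiguous
z = t.reshape(-1)         # reshape copies when needed; t.view(-1) would raise
```

reshape falls back to a copy for non-contiguous inputs, while view never copies.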
9
Q

Add/remove dimensions

A
 y = x.unsqueeze(0)
 z = y.squeeze(0)

- squeeze only removes dimensions of size 1
- to remove all dimensions of size one:
 z = y.squeeze()
10
Q

Flatten tensor

A
 y = torch.flatten(x, start_dim=1)

- torch.flatten(x) with defaults flattens to a 1D tensor
- here dimension 0 is kept because start_dim=1, so the result is 2D
11
Q

Permute dimensions (reorder axes)

A
 y = x.permute(0, 2, 1)
12
Q

Transpose last two dims

A
 y = x.transpose(-1, -2)
13
Q

Concatenate tensors

A
 y = torch.cat([a, b], dim=0)
14
Q

Stack tensors (new dimension)

A
 y = torch.stack([a, b], dim=0)
15
Q

Indexing & slicing

A
 y = x[0, :3]
 z = x[:, -1]

the end index of a slice is exclusive (x[0, :3] takes indices 0, 1, 2)
16
Q

Boolean masking

A
 mask = x > 0
 y = x[mask]
17
Q

Cap values to a range

A
 y = torch.clamp(x, min=0.0, max=1.0)

values above max are replaced by max, values below min by min
18
Q

Where (conditional select)

A
 y = torch.where(x > 0, x, torch.zeros_like(x))
19
Q

Argmax / Top-k

A
 pred = logits.argmax(dim=1)
 vals, idx = logits.topk(k=5, dim=1)
20
Q

Softmax / LogSoftmax

A
 p = torch.softmax(logits, dim=1)
 logp = torch.log_softmax(logits, dim=1)
21
Q

Basic math operations

A
 y = a + b
 y = a * b
 y = torch.matmul(a, b)

a * b is element-wise multiplication; torch.matmul (or a @ b) is matrix multiplication
22
Q

Broadcasting (concept)

A
 # shapes like (B,1,D) + (B,T,D) broadcast on dim=1
 y = a + b
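A minimal sketch of the (B,1,D) + (B,T,D) case; B, T, and D are arbitrary illustrative sizes:

```python
import torch

B, T, D = 2, 5, 8
a = torch.randn(B, 1, D)  # size-1 dim is stretched along dim=1
b = torch.randn(B, T, D)
y = a + b                 # broadcasts to shape (B, T, D)
```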
23
Q

Random seed for reproducibility

A
 import torch
 torch.manual_seed(0)
24
Q

Disable gradient tracking (inference)

A
 with torch.no_grad():
     y = model(x)
25
Q

Set train vs eval mode

A
 model.train()
 # ... training
 model.eval()
 # ... evaluation
26
Q

Autograd: enable gradient on a tensor

A
 x = torch.tensor([1.0, 2.0], requires_grad=True)
27
Q

Backprop through a scalar loss

A
 loss.backward()
28
Q

Gradient accumulation reminder (concept)

A
 optimizer.zero_grad()
 loss.backward()
 optimizer.step()
 # call zero_grad each step unless intentionally accumulating

- backward() computes and stores grads in param.grad
- step() reads param.grad and updates param (plus its optimizer state)
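A runnable sketch of intentional accumulation with a toy linear model; `accum_steps` and all shapes are illustrative assumptions:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

model = nn.Linear(4, 2)
opt = torch.optim.SGD(model.parameters(), lr=0.1)
accum_steps = 4                    # illustrative: 4 micro-batches per update

opt.zero_grad()
for step in range(8):
    x = torch.randn(3, 4)
    y = torch.randint(0, 2, (3,))
    loss = F.cross_entropy(model(x), y) / accum_steps  # scale so grads average
    loss.backward()                # grads add up in param.grad
    if (step + 1) % accum_steps == 0:
        opt.step()                 # one update per accum_steps micro-batches
        opt.zero_grad()
```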
29
Q

Read gradients

A
 for p in model.parameters():
     if p.grad is not None:
         print(p.grad.norm())
30
Q

Stop gradients (detach)

A
 y = x.detach()
 # or for numpy
 arr = x.detach().cpu().numpy()
31
Q

Get Python scalar from 1-element tensor

A
 val = loss.item()
32
Q

Define a model with nn.Module

A
 import torch.nn as nn

 class M(nn.Module):
     def __init__(self):
         super().__init__()
         self.fc = nn.Linear(10, 3)
     def forward(self, x):
         return self.fc(x)

 model = M()
33
Q

Common layers: Linear / Conv2d / Embedding

A
 import torch.nn as nn
 fc = nn.Linear(128, 10)
 conv = nn.Conv2d(3, 16, 3, padding=1)
 emb = nn.Embedding(5000, 128)
34
Q

Activations (ReLU, GELU, etc.)

A
 import torch.nn.functional as F
 y = F.relu(x)
 y = F.gelu(x)
35
Q

Dropout (regularization)

A
 drop = nn.Dropout(p=0.5)
 y = drop(x)  # active only in model.train()
36
Q

BatchNorm (concept + usage)

A
 bn = nn.BatchNorm1d(num_features=128)
 y = bn(x)

- use nn.BatchNorm2d (or nn.BatchNorm3d) for image/volumetric data
37
Q

Loss: CrossEntropy (classification)

A
 import torch.nn.functional as F
 loss = F.cross_entropy(logits, targets)
 # targets: int64 class indices
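Expected shapes for cross-entropy, with batch size and class count chosen arbitrarily:

```python
import torch
import torch.nn.functional as F

logits = torch.randn(8, 10)              # (batch, num_classes), raw scores
targets = torch.randint(0, 10, (8,))     # int64 class indices, shape (batch,)
loss = F.cross_entropy(logits, targets)  # scalar; log-softmax applied internally
```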
38
Q

Loss: MSE (regression)

A
 import torch.nn.functional as F
 loss = F.mse_loss(pred, y)
39
Q

Loss: BCEWithLogits (binary / multilabel)

A
 import torch.nn.functional as F
 loss = F.binary_cross_entropy_with_logits(logits, targets.float())
40
Q

Create an optimizer (Adam/SGD)

A
 import torch.optim as optim
 opt = optim.Adam(model.parameters(), lr=1e-3)
 # opt = optim.SGD(model.parameters(), lr=1e-2, momentum=0.9)
41
Q

Optimizer step cycle (the 3 lines)

A
 opt.zero_grad()
 loss.backward()
 opt.step()
42
Q

Learning-rate scheduler

A
 from torch.optim.lr_scheduler import StepLR
 sched = StepLR(opt, step_size=10, gamma=0.1)
 # after each epoch:
 sched.step()

- step_size is the number of epochs between LR updates
- gamma is the factor the LR is multiplied by at each update
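A sketch of how StepLR decays the LR over epochs (step_size=2 and gamma=0.1 chosen for illustration):

```python
import torch
from torch.optim.lr_scheduler import StepLR

opt = torch.optim.SGD([torch.zeros(1, requires_grad=True)], lr=1.0)
sched = StepLR(opt, step_size=2, gamma=0.1)

lrs = []
for epoch in range(5):
    lrs.append(opt.param_groups[0]['lr'])  # LR in effect this epoch
    sched.step()                           # decay by gamma every step_size epochs
# lrs is approximately [1.0, 1.0, 0.1, 0.1, 0.01]
```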
43
Q

Gradient clipping (stability)

A
 import torch.nn.utils as U
 U.clip_grad_norm_(model.parameters(), max_norm=1.0)
44
Q

Save model weights (state_dict)

A
 import torch
 torch.save(model.state_dict(), 'model.pt')
45
Q

Load model weights (state_dict)

A
 import torch
 model.load_state_dict(torch.load('model.pt', map_location='cpu'))
 model.eval()
46
Q

Save checkpoint (model+optimizer+epoch)

A
 torch.save({'epoch': epoch,
             'model': model.state_dict(),
             'opt': opt.state_dict()}, 'ckpt.pt')
47
Q

Load checkpoint (model+optimizer+epoch)

A
 ckpt = torch.load('ckpt.pt', map_location='cpu')
 model.load_state_dict(ckpt['model'])
 opt.load_state_dict(ckpt['opt'])
 start_epoch = ckpt['epoch'] + 1
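A save/load round trip with a toy model; the temp path and toy shapes are illustrative assumptions:

```python
import os
import tempfile
import torch
import torch.nn as nn

model = nn.Linear(4, 2)
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
path = os.path.join(tempfile.mkdtemp(), 'ckpt.pt')

torch.save({'epoch': 3, 'model': model.state_dict(), 'opt': opt.state_dict()}, path)

ckpt = torch.load(path, map_location='cpu')
model2 = nn.Linear(4, 2)
model2.load_state_dict(ckpt['model'])      # weights restored into a fresh model
start_epoch = ckpt['epoch'] + 1            # resume from the next epoch
```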
48
Q

Custom Dataset skeleton

A
 from torch.utils.data import Dataset

 class MyDS(Dataset):
     def __init__(self, xs, ys):
         self.xs, self.ys = xs, ys
     def __len__(self):
         return len(self.xs)
     def __getitem__(self, i):
         return self.xs[i], self.ys[i]
49
Q

DataLoader basics (batching/shuffling)

A
 from torch.utils.data import DataLoader
 loader = DataLoader(ds, batch_size=32, shuffle=True, num_workers=2)
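Putting Dataset and DataLoader together end to end (toy data; sizes are assumptions):

```python
import torch
from torch.utils.data import Dataset, DataLoader

class ToyDS(Dataset):                      # hypothetical toy dataset
    def __init__(self, n):
        self.xs = torch.randn(n, 4)
        self.ys = torch.randint(0, 2, (n,))
    def __len__(self):
        return len(self.xs)
    def __getitem__(self, i):
        return self.xs[i], self.ys[i]

loader = DataLoader(ToyDS(10), batch_size=4, shuffle=True)
xb, yb = next(iter(loader))                # default collate stacks samples
```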
50
Q

Move a whole batch to device

A
 xb, yb = xb.to(device), yb.to(device)
51
Q

Typical training loop (minimal)

A
 model.train()
 for xb, yb in loader:
     xb, yb = xb.to(device), yb.to(device)
     opt.zero_grad()
     logits = model(xb)
     loss = F.cross_entropy(logits, yb)
     loss.backward()
     opt.step()
52
Q

Evaluation loop (no_grad + eval)

A
 model.eval()
 correct = 0
 with torch.no_grad():
     for xb, yb in loader:
         logits = model(xb.to(device))
         pred = logits.argmax(1).cpu()
         correct += (pred == yb).sum().item()
53
Q

Check CUDA availability & GPU name

A
 import torch
 print(torch.cuda.is_available())
 if torch.cuda.is_available():
     print(torch.cuda.get_device_name(0))
54
Q

Clear CUDA cache (debug/memory)

A
 import torch
 torch.cuda.empty_cache()
55
Q

Freeze parameters (no training)

A
 for p in model.parameters():
     p.requires_grad_(False)
56
Q

Unfreeze parameters

A
 for p in model.parameters():
     p.requires_grad_(True)
57
Q

Get parameter list / count parameters

A
 params = list(model.parameters())
 num = sum(p.numel() for p in model.parameters())
58
Q

Initialize weights (example)

A
 import torch.nn as nn
 for m in model.modules():
     if isinstance(m, nn.Linear):
         nn.init.xavier_uniform_(m.weight)
         nn.init.zeros_(m.bias)
59
Q

Tensor to NumPy safely

A
 arr = x.detach().cpu().numpy()
60
Q

Create one-hot encoding

A
 y = torch.nn.functional.one_hot(labels, num_classes=C).float()
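A concrete example (the labels and class count are illustrative):

```python
import torch
import torch.nn.functional as F

labels = torch.tensor([0, 2, 1])
y = F.one_hot(labels, num_classes=3).float()
# y -> [[1., 0., 0.], [0., 0., 1.], [0., 1., 0.]]
```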
61
Q

Compute accuracy quickly

A
 pred = logits.argmax(dim=1)
 acc = (pred == y).float().mean().item()