Training

How to tune / optimize training

https://github.com/google-research/tuning_playbook

Experiments in accelerated training

https://github.com/tysam-code/hlb-CIFAR10

  • PyTorch; a heavily tuned single-GPU CIFAR-10 training speedrun

Training of large models on multiple GPUs

https://lilianweng.github.io/posts/2021-09-25-train-large/
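The post surveys data, pipeline, and tensor parallelism plus memory-saving techniques. As a minimal, hedged sketch of the simplest case (data parallelism with PyTorch DistributedDataParallel), assuming a torchrun launch and a placeholder model/loop in place of anything real:

```python
# Minimal data-parallel training sketch with PyTorch DDP.
# Assumes launch with: torchrun --nproc_per_node=NUM_GPUS ddp_sketch.py
import os
import torch
import torch.distributed as dist
from torch.nn.parallel import DistributedDataParallel as DDP

def main():
    dist.init_process_group(backend="nccl")              # one process per GPU
    local_rank = int(os.environ["LOCAL_RANK"])           # set by torchrun
    torch.cuda.set_device(local_rank)
    device = f"cuda:{local_rank}"

    model = torch.nn.Linear(128, 10).to(device)          # placeholder model
    model = DDP(model, device_ids=[local_rank])          # wraps for gradient all-reduce
    optimizer = torch.optim.SGD(model.parameters(), lr=1e-2)

    for step in range(100):                               # placeholder training loop
        x = torch.randn(32, 128, device=device)           # stand-in for a real DataLoader
        y = torch.randint(0, 10, (32,), device=device)
        loss = torch.nn.functional.cross_entropy(model(x), y)
        optimizer.zero_grad()
        loss.backward()                                    # gradients averaged across ranks
        optimizer.step()

    dist.destroy_process_group()

if __name__ == "__main__":
    main()
```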

Synthetic augmentation of training sets

Images - Albumentations

https://github.com/albumentations-team/albumentations
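A minimal sketch of the usual Albumentations pattern: compose a pipeline once, then call it per image. The specific transforms and probabilities below are illustrative choices, not recommendations from the repo.

```python
# Minimal Albumentations augmentation pipeline (transform choices are illustrative).
import numpy as np
import albumentations as A

transform = A.Compose([
    A.HorizontalFlip(p=0.5),                      # random left-right flip
    A.RandomBrightnessContrast(p=0.2),            # mild photometric jitter
    A.ShiftScaleRotate(shift_limit=0.05, scale_limit=0.1, rotate_limit=15, p=0.5),
])

image = np.random.randint(0, 256, (224, 224, 3), dtype=np.uint8)  # stand-in for a real image
augmented = transform(image=image)["image"]       # call returns a dict; "image" holds the result
```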

Optimizing an object-detection pipeline written in PyTorch

https://paulbridger.com/posts/video_analytics_pipeline_tuning/
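The post is largely an exercise in profiling, then removing, pipeline bottlenecks. A hedged sketch of the first step using torch.profiler, with a stock torchvision detector as a placeholder for the post's actual pipeline (assumes a recent torchvision):

```python
# Profile one inference step to see where time actually goes
# (placeholder model, not the pipeline from the post).
import torch
import torchvision

model = torchvision.models.detection.fasterrcnn_resnet50_fpn(weights=None).eval()
images = [torch.randn(3, 480, 640)]              # stand-in for a decoded video frame

activities = [torch.profiler.ProfilerActivity.CPU]
if torch.cuda.is_available():
    activities.append(torch.profiler.ProfilerActivity.CUDA)

with torch.profiler.profile(activities=activities, record_shapes=True) as prof:
    with torch.no_grad():
        model(images)

# Optimize whatever dominates this table first.
print(prof.key_averages().table(sort_by="self_cpu_time_total", row_limit=15))
```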

PyTorch Lightning

https://github.com/PyTorchLightning/pytorch-lightning

  • organizes PyTorch code for scalable development
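A minimal sketch of the structure Lightning imposes: the LightningModule holds the model, loss, and optimizer; the Trainer owns the loop. Model, data, and hyperparameters below are placeholders.

```python
# Minimal LightningModule: Lightning runs the loop, the module defines the logic.
import torch
import pytorch_lightning as pl
from torch.utils.data import DataLoader, TensorDataset

class LitClassifier(pl.LightningModule):
    def __init__(self):
        super().__init__()
        self.net = torch.nn.Linear(128, 10)        # placeholder model

    def training_step(self, batch, batch_idx):
        x, y = batch
        loss = torch.nn.functional.cross_entropy(self.net(x), y)
        self.log("train_loss", loss)
        return loss

    def configure_optimizers(self):
        return torch.optim.Adam(self.parameters(), lr=1e-3)

# Random tensors stand in for a real dataset.
dataset = TensorDataset(torch.randn(256, 128), torch.randint(0, 10, (256,)))
trainer = pl.Trainer(max_epochs=1, accelerator="auto", devices=1)
trainer.fit(LitClassifier(), DataLoader(dataset, batch_size=32))
```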

Traps to be aware of

https://tanelp.github.io/posts/a-bug-that-plagues-thousands-of-open-source-ml-projects/

  • NumPy random seed duplicated across DataLoader workers, so every worker applies identical "random" augmentations
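The bug: NumPy's RNG state is inherited by every forked DataLoader worker, so all workers produce the same augmentations. A minimal sketch of the usual fix, re-seeding each worker via worker_init_fn (the dataset here is a placeholder):

```python
# Re-seed NumPy / random in every DataLoader worker so augmentations differ per worker.
import random
import numpy as np
import torch
from torch.utils.data import DataLoader, TensorDataset

def seed_worker(worker_id):
    worker_seed = torch.initial_seed() % 2**32   # per-worker torch seed, unique per worker
    np.random.seed(worker_seed)
    random.seed(worker_seed)

dataset = TensorDataset(torch.randn(256, 3), torch.randint(0, 2, (256,)))  # placeholder
loader = DataLoader(dataset, batch_size=32, num_workers=4,
                    worker_init_fn=seed_worker)
```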

Deep learning on consumer GPUs

ML compilers

https://huyenchip.com/2021/09/07/a-friendly-introduction-to-machine-learning-compilers-and-optimizers.html

  • cuDNN
  • XLA
  • PyTorch Glow
  • TVM
  • (MLIR)
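As a small illustration of the workflow these compilers enable (capture the model graph, then lower and optimize it for a target), a hedged sketch using torch.compile; note this routes through PyTorch's own compiler stack rather than the specific backends listed above.

```python
# Hand a model to a compiler: the graph is captured, optimized, and JIT-compiled
# for the target on first call. (PyTorch's built-in compiler stack is used here
# as a stand-in for the backends listed above.)
import torch

model = torch.nn.Sequential(
    torch.nn.Linear(128, 256),
    torch.nn.ReLU(),
    torch.nn.Linear(256, 10),
)

compiled_model = torch.compile(model)        # compilation is set up lazily
x = torch.randn(32, 128)
y = compiled_model(x)                        # first call triggers capture + codegen
```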

Low-level GPU architecture and optimizations

https://hazyresearch.stanford.edu/blog/2024-05-12-tk

  • H100 GPU specs
  • warp group matrix multiply accumulate (WGMMA)
  • shared memory
  • address generation
  • occupancy
  • ThunderKittens, a low-level library for writing compute kernels
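Several of the quantities the post reasons about (SM count, per-SM thread limits behind occupancy, memory size) can be inspected from Python. A minimal hedged sketch using torch.cuda, assuming a CUDA-capable machine:

```python
# Inspect the hardware numbers that shared-memory and occupancy reasoning starts from.
import torch

if torch.cuda.is_available():
    props = torch.cuda.get_device_properties(0)
    print(props.name)                               # e.g. an H100 (Hopper)
    print(props.major, props.minor)                 # compute capability (9.0 for H100)
    print(props.multi_processor_count)              # number of SMs
    print(props.max_threads_per_multi_processor)    # upper bound used in occupancy math
    print(props.total_memory / 2**30, "GiB of device memory")
```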