pramodith/kernel-engineering

kernel-engineering

A repo for learning kernel-engineering/gpu-programming

Setup

make setup

Notebooks

| Notebook | Description |
| --- | --- |
| Control Divergence | Explores warp divergence in GPU kernels: what happens when threads within a warp take different branches, how the hardware serializes the divergent paths, and how much performance the divergence costs in benchmarks. |
| TF32 Precision & Performance | Demonstrates TensorFloat-32 (TF32) on Ampere and newer GPUs: compares matmul precision across TF32, FP32, FP16, and FP64, shows that TF32 pairs FP16's precision (10-bit mantissa) with FP32's range (8-bit exponent), and benchmarks the throughput speedup. |
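
To give a rough feel for both topics without a GPU, here is a minimal CPU-only sketch in plain Python. It is not taken from the notebooks; the function names `divergent_passes` and `to_tf32` are hypothetical. The first function is a toy model that assumes a warp runs each distinct branch path in its own serialized pass. The second emulates TF32 rounding by keeping FP32's 8 exponent bits and only 10 of its 23 mantissa bits, which is an approximation of what tensor cores do to matmul inputs, not a hardware-exact implementation.

```python
import struct

WARP_SIZE = 32  # threads per warp on NVIDIA GPUs


def divergent_passes(branch_ids):
    """Toy model (assumption): each warp executes every distinct branch
    path taken by its threads in a separate, serialized pass."""
    warps = [branch_ids[i:i + WARP_SIZE]
             for i in range(0, len(branch_ids), WARP_SIZE)]
    return [len(set(warp)) for warp in warps]


def to_tf32(x):
    """Round a finite float to TF32 precision (emulation, finite inputs only).

    TF32 keeps FP32's sign bit and 8 exponent bits but only the top 10
    mantissa bits, so the low 13 mantissa bits are rounded away.
    """
    bits = struct.unpack("<I", struct.pack("<f", x))[0]
    # Round to nearest, ties to even, on the 13 bits being dropped.
    bits += 0x0FFF + ((bits >> 13) & 1)
    bits &= 0xFFFFE000
    return struct.unpack("<f", struct.pack("<I", bits))[0]


if __name__ == "__main__":
    # Every thread takes the same branch: one pass per warp.
    print(divergent_passes([0] * 32))                    # [1]
    # Even and odd threads disagree inside one warp: two passes.
    print(divergent_passes([t % 2 for t in range(32)]))  # [2]
    # Branch boundary aligned to the warp boundary: no divergence.
    print(divergent_passes([0] * 32 + [1] * 32))         # [1, 1]

    # Precision: steps smaller than 2**-10 near 1.0 are rounded away.
    print(to_tf32(1 + 2**-10))  # 1.0009765625
    print(to_tf32(1 + 2**-11))  # 1.0
    # Range: 1e38 stays finite, far beyond FP16's maximum of 65504.
    print(to_tf32(1e38) > 65504.0)  # True
```

The divergence model only counts passes; real kernels also pay for the instructions inside each path, so measured slowdowns depend on how much work each branch does.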
