gpu-experimentations

GPU/CUDA/NVIDIA experiments for cloud GPUs

Experiments:

🧊 Terraform: raw1: Simple Terraform experimentation. Later developed into [https://github.com/sohale/gpu-experimentations/tree/main/provisioning_scripts/terraform]
🧊 LLVM: raw-llvm: Hands-on, hand-written LLVM code
🧊 TVM: 3tvm: TVM (framework/DSL for neural-network inference). (For TVM open-source tickets)
🧊 Lean4: 4leannet1 (moved to 5) Early Lean4 experiments
🧊 Triton: 4triton (For OpenAI Triton open-source tickets)
🧊 Lean4: leannn5 Lean4 experiments
🧊 CUDA: 6_cuda_rggbuff Simple CUDA code for RGBA buffer
🧊 MLIR: 7_mlir MLIR experiment 1: Full MLIR build, build scripts, My own Docker build (Dockerfile) for containerised MLIR development (For MLIR open-source tickets)
🧊 MLIR: 8_mlir_nn: MLIR experiment 2: Neural network (cancelled)
🧊 MLIR: 9_mlir_neo_refactor: MLIR experiment 3: Neural network (with better build and container), as support for a compiler project. See [https://github.com/sohale/gpu-experimentations/tree/main/provisioning_scripts/mlir_env]. Also LLVM debugging using lldb (Clang toolchain).
🧊 PTX: 10_mcmc_ptx: MCMC using PTX (hand-coded directly in NVIDIA's assembly language).
- Low-level “Parallel-Thread Execution ISA Version 8.3” (almost architecture-independent, using "as-if virtual machine")
- See PTX (pdf) and PTX (html)
- PTX is itself on top of SASS (proprietary): SASS' .yacc and SASS' .lex
- ptxas, ``
- Also see cuda_api.h and cuda_runtime_api.cc, and the .lex file ptx.l in GPGPU-Sim
- PTX Op Codes: opcodes.def
- Cool from GPGPUSIM: gpgpu_context.h. They even have OpenCL runtime API: opencl_runtime_api.cc
- CUDA-level: cuda_runtime_api.cc and instructions.cc
  - CUDA Memory model:
    - memory.h
    - dram.h
    - stack.h
  - CUDA device runtime: cuda_device_runtime.cc
- power_stat.h, taking into account POD, DRAM, interconnect.
  - traffic_breakdown.h
🧊 CUDA: 11_matrix_cuda: Advanced CUDA optimisation techniques + profiling: for Matrix Multiplication
🧊 FPGA: 12_fpga_aws: FPGA on cloud using AWS's F2, utilising Xilinx hardware and Amaranth HDL (open-source HDL) (as part of heterogeneous computing)
🧊 CUDA: 13_cuda_sharedmem: Advanced CUDA+PTX optimisation techniques + profiling ( for experimentation with CUDA / CC Architectures )
