gpu

Resources

Tutorials

Higher-Level wrapping of CUDA/OpenCL

  • Thrust: a high-level C++ interface to CUDA (released by nVidia); GitHub link
  • ArrayFire: a commercial library for C/C++/Fortran; supports both CUDA and OpenCL
  • ViennaCL: a C++ interface supporting CUDA/OpenCL/OpenMP
  • cudapp: a library similar to Thrust; appears to be no longer in active development
  • PyCUDA: a Python wrapper for CUDA

MD packages supporting GPU

Purchase nVidia GPUs

FAQs

Texture memory

It is a common misconception, but there is no dedicated “texture memory” on CUDA GPUs. There are only textures: ordinary global memory allocations accessed through dedicated hardware that adds its own cache, filtering, and addressing limitations. Those limitations produce the size limits reported in the documentation and by device query. In practice, the limit is therefore either roughly the amount of free global memory (allowing for padding and alignment in CUDA arrays) or those dimensional limits.
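
The point above can be illustrated with the texture-object API: a plain cudaMalloc allocation is described as a linear texture resource and then fetched through the texture hardware. A minimal sketch (error checking omitted; kernel and size are illustrative):

```cuda
#include <cuda_runtime.h>

__global__ void read_through_texture(cudaTextureObject_t tex, float *out, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n)
        out[i] = tex1Dfetch<float>(tex, i);  // fetch goes through the texture cache
}

int main()
{
    const int n = 1024;
    float *d_in, *d_out;
    cudaMalloc(&d_in, n * sizeof(float));
    cudaMalloc(&d_out, n * sizeof(float));

    // Describe the existing global allocation as a linear texture resource:
    // the "texture" is just a view onto ordinary global memory.
    cudaResourceDesc res = {};
    res.resType = cudaResourceTypeLinear;
    res.res.linear.devPtr = d_in;
    res.res.linear.desc = cudaCreateChannelDesc<float>();
    res.res.linear.sizeInBytes = n * sizeof(float);

    cudaTextureDesc td = {};
    td.readMode = cudaReadModeElementType;

    cudaTextureObject_t tex = 0;
    cudaCreateTextureObject(&tex, &res, &td, NULL);

    read_through_texture<<<(n + 255) / 256, 256>>>(tex, d_out, n);

    cudaDestroyTextureObject(tex);
    cudaFree(d_in);
    cudaFree(d_out);
    return 0;
}
```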

Local memory

  • See this Ref
  • Not really a separate “memory” – the bytes actually reside in global memory
  • Differences from global memory:
    • Addressing is resolved by the compiler
    • Stores are cached in L1
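
A minimal sketch of code that ends up in local memory: a per-thread array indexed with a runtime value cannot be kept in registers, so the compiler places it in local memory (physically global memory, with addressing resolved per-thread by the compiler). Compiling with `nvcc -Xptxas -v` reports the per-thread local bytes.

```cuda
__global__ void spills_to_local(float *out)
{
    // Per-thread array: small and statically indexed, it would live in
    // registers; a dynamic index defeats register allocation, so the
    // compiler backs it with "local" memory instead.
    float buf[32];
    for (int i = 0; i < 32; ++i)
        buf[i] = (float)i;

    // threadIdx.x % 32 is not known at compile time -> local-memory load,
    // cached in L1.
    out[threadIdx.x] = buf[threadIdx.x % 32];
}
```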

How to choose block size and grid size
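
A common rule of thumb (a sketch, not the only valid choice): pick a block size that is a multiple of the warp size (32), then derive the grid size by ceiling division so every element is covered. The names below are illustrative.

```cuda
#include <cuda_runtime.h>

__global__ void kernel(int n, float *data)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n)
        data[i] *= 2.0f;
}

void launch(int n, float *d_data)
{
    int blockSize = 256;  // a multiple of the warp size (32); 128-512 are typical
    int gridSize  = (n + blockSize - 1) / blockSize;  // ceiling division covers all n
    kernel<<<gridSize, blockSize>>>(n, d_data);

    // Alternatively, ask the runtime for an occupancy-friendly block size:
    // int minGridSize;
    // cudaOccupancyMaxPotentialBlockSize(&minGridSize, &blockSize, kernel, 0, 0);
}
```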

Streaming multiprocessors, Blocks and Threads
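
The standard mapping between these levels can be sketched as a kernel: the runtime schedules each block onto one streaming multiprocessor (in any order), and each thread computes a unique global index from its block and thread coordinates.

```cuda
__global__ void axpy(int n, float a, const float *x, float *y)
{
    // blockIdx  = which block this is in the grid
    // blockDim  = threads per block
    // threadIdx = this thread's position within its block
    int i = blockIdx.x * blockDim.x + threadIdx.x;  // unique global index

    if (i < n)           // guard: the last block may run past n
        y[i] = a * x[i] + y[i];
}
```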

Applications

Nonlinear fitting

gpu.txt · Last modified: 2022/04/09 00:49 by 127.0.0.1