- C/C++ compiler
- Visual Profiler
- GPU-accelerated BLAS library
- GPU-accelerated FFT library
- GPU-accelerated Sparse Matrix library
- GPU-accelerated RNG library
- Additional tools and documentation
- Easier Application Porting
- Share GPUs across multiple threads
- Use all GPUs in the system concurrently from a single host thread
- No-copy pinning of system memory, a faster alternative to cudaMallocHost()
- C++ new/delete and support for virtual functions
- Support for inline PTX assembly
- Thrust library of templated performance primitives such as sort, reduce, etc.
- Nvidia Performance Primitives (NPP) library for image/video processing
- Layered Textures for working with same size/format textures at larger sizes and higher performance
- Faster Multi-GPU Programming
- Unified Virtual Addressing
- GPUDirect v2.0 support for Peer-to-Peer Communication
- New & Improved Developer Tools
- Automated Performance Analysis in Visual Profiler
- C++ debugging in CUDA-GDB for Linux and MacOS
- GPU binary disassembler for Fermi architecture (cuobjdump)
- Parallel Nsight 2.0 now available for Windows developers with new debugging and profiling features.
- Nsight Eclipse Edition for Linux and Mac OS is an all-in-one development environment that allows developing, debugging, and optimizing CUDA code in an integrated UI environment.
- A new command-line profiler, nvprof, provides summary information about where applications spend the most time, so that optimization efforts can be properly focused.
Software that can be used as an alternative to Nvidia CUDA Toolkit: