Heterogeneous Parallel Programming
- Indexed on: 2018-03-23 13:32:57
- File size: 910MB
- Downloads: 114
- Last downloaded: 2021-01-05 21:55:41
- Magnet link:
File list
- 4 - 7 - 4.7- Parallel Computation Patterns - More on Parallel Scan.mp4 40MB
- 4 - 5 - 4.5- Parallel Computation Patterns - A Work-Inefficient Scan Kernel.mp4 38MB
- 4 - 6 - 4.6- Parallel Computation Patterns - A Work-Efficient Parallel Scan Kernel.mp4 38MB
- 2 - 6 - 2.6- Tiled Matrix Multiplication Kernel.mp4 36MB
- 3 - 6 - 3.6- Parallel Computation Patterns - Data Reuse in Tiled Convolution.mp4 32MB
- 4 - 1 - 4.1- Parallel Computation Patterns - Reduction.mp4 31MB
- 5 - 3 - 5.3- Parallel Computation Patterns - Atomic Operations in CUDA.mp4 31MB
- 3 - 1 - 3.1- Performance Considerations - DRAM Bandwidth.mp4 30MB
- 2 - 3 - 2.3- Memory Model and Locality -- CUDA Memories.mp4 29MB
- 5 - 4 - 5.4- Parallel Computation Patters - Atomic Operations Performance.mp4 28MB
- 1 - 4 - 1.4- Introduction to CUDA, Data Parallelism and Threads.mp4 27MB
- 4 - 4 - 4.4- Parallel Computation Patterns - Scan (Prefix Sum).mp4 27MB
- 2 - 5 - 2.5- Tiled Matrix Multiplication.mp4 26MB
- 1 - 1 - 1.1- Course Overview.mp4 26MB
- 2 - 1 - 2.1- Kernel-based Parallel Programming - Thread Scheduling.mp4 26MB
- 3 - 5 - 3.5- Parallel Computation Patterns - 2D Tiled Convolution Kernel.mp4 25MB
- 4 - 2 - 4.2- Parallel Computation Patterns - A Basic Reduction Kernel.mp4 25MB
- 2 - 4 - 2.4- Tiled Parallel Algorithms.mp4 25MB
- 3 - 4 - 3.4- Parallel Computation Patterns - Tiled Convolution.mp4 25MB
- 1 - 5 - 1.5- Introduction to CUDA, Memory Allocation and Data Movement API.mp4 24MB
- 1 - 6 - 1.6- Introduction to CUDA, Kernel-Based SPMD Parallel Programming.mp4 24MB
- 5 - 5 - 5.5- Parallel Computation Patterns - A Privatized Histogram Kernel.mp4 23MB
- 3 - 2 - 3.2- Performance Considerations - Memory Coalescing in CUDA.mp4 23MB
- 5 - 1 - 5.1- Parallel Computation Patterns - Histogramming.mp4 22MB
- 5 - 2 - 5.2- Parallel Computation Patterns - Atomic Operations.mp4 22MB
- 1 - 8 - 1.8- Kernel-based Parallel Programming, Basic Matrix-Matrix Multiplication.mp4 22MB
- 1 - 7 - 1.7- Kernel-based Parallel Programming, Multidimensional Kernel Configuration.mp4 22MB
- 2 - 8 - 2.8- A Tiled Kernel for Arbitrary Matrix Dimensions.mp4 21MB
- Рекомендуемая литература David B. Kirk, Wen-mei W. Hwu Programming Massively Parallel Processors, Second Edition.pdf 21MB
- 3 - 3 - 3.3- Parallel Computation Patterns - Convolution.mp4 21MB
- 4 - 3 - 4.3- Parallel Computation Patterns - A Better Reduction Kernel.mp4 20MB
- 2 - 2 - 2.2- Control Divergence.mp4 20MB
- 2 - 7 - 2.7- Handling Boundary Conditions in Tiling.mp4 18MB
- 1 - 2 - 1.2- Introduction to Heterogeneous Parallel Computing.mp4 18MB
- 1 - 3 - 1.3- Portability and Scalability in Heterogeneous Parallel Computing.mp4 10MB
- hetero-lecture_slides_002-Lecture 1-Lecture-1-5-cuda-API.pdf 893KB
- Lecture-5-3-CUDA-atomic.pdf 770KB
- hetero-lecture_slides_002-Lecture 1-Lecture-1-4-cuda-intro.pdf 593KB
- Lecture-4-7-more-on-scan.pdf 580KB
- Lecture-5-5-privatized-histogram.pdf 541KB
- Lecture-4-4-scan.pdf 526KB
- Lecture-3-2-memory-coalescing.pdf 513KB
- Lecture-3-6-convolution-reuse.pdf 506KB
- Lecture-5-1-histogram.pdf 502KB
- Lecture-3-3-convolution.pdf 500KB
- Lecture-3-1-dram-bandwidth.pdf 498KB
- Lecture-4-6-work-efficient-scan-kernel.pdf 492KB
- hetero-lecture_slides_002-Lecture 1-Lecture-1-6-cuda-kernel.pdf 491KB
- Lecture-3-5-2D-convolution-kernel.pdf 477KB
- Lecture-3-4-tiled-convolution.pdf 452KB
- Lecture-5-4-atomic-performance.pdf 444KB
- hetero-lecture_slides_002-Lecture 2-Lecture-2-2-control-divergence.pdf 442KB
- Lecture-5-2-atomic-operations.pdf 437KB
- hetero-lecture_slides_002-Lecture 2-Lecture-2-1-transparent-scaling.pdf 430KB
- Lecture-4-3-better-reduction-kernel.pdf 414KB
- Lecture-4-2-reduction-kernel.pdf 363KB
- hetero-lecture_slides_002-Lecture 1-Lecture-1-7-kernel-multidimension.pdf 343KB
- Lecture-4-5-naive-scan-kernel.pdf 339KB
- Lecture-4-1-reduction.pdf 294KB
- hetero-lecture_slides_002-Lecture 2-Lecture-2-3-cuda-memories.pdf 294KB
- hetero-lecture_slides_002-Lecture 1-Lecture-1-3-software-cost.pdf 280KB
- hetero-lecture_slides_002-Lecture 1-Lecture-1-2-heterogeneous.pdf 272KB
- hetero-lecture_slides_002-Lecture 1-Lecture-1-8-kernel-matrix-multiplication.pdf 270KB
- hetero-lecture_slides_002-Lecture 1-Lecture-1-1-Overview.pdf 243KB
- hetero-lecture_slides_002-Lecture 2-Lecture-2-8-boundary-condition-kernel.pdf 233KB
- hetero-lecture_slides_002-Lecture 2-Lecture-2-6-tiled-kernel.pdf 231KB
- hetero-lecture_slides_002-Lecture 2-Lecture-2-4-tiled-algorithms.pdf 222KB
- hetero-lecture_slides_002-Lecture 2-Lecture-2-7-boundary-condition.pdf 176KB
- hetero-lecture_slides_002-Lecture 2-Lecture-2-5-tiled-matrix-multiplication.pdf 162KB
- 2 - 6 - 2.6- Tiled Matrix Multiplication Kernel.srt 33KB
- 4 - 1 - 4.1- Parallel Computation Patterns - Reduction.srt 28KB
- 3 - 1 - 3.1- Performance Considerations - DRAM Bandwidth.srt 28KB
- Гетерогенное параллельное программирование.docx 28KB
- 3 - 6 - 3.6- Parallel Computation Patterns - Data Reuse in Tiled Convolution.srt 27KB
- 1 - 4 - 1.4- Introduction to CUDA, Data Parallelism and Threads.srt 26KB
- 2 - 3 - 2.3- Memory Model and Locality -- CUDA Memories.srt 26KB
- 4 - 5 - 4.5- Parallel Computation Patterns - A Work-Inefficient Scan Kernel.srt 26KB
- 1 - 1 - 1.1- Course Overview.srt 26KB
- 4 - 7 - 4.7- Parallel Computation Patterns - More on Parallel Scan.srt 25KB
- 2 - 5 - 2.5- Tiled Matrix Multiplication.srt 25KB
- 4 - 6 - 4.6- Parallel Computation Patterns - A Work-Efficient Parallel Scan Kernel.srt 24KB
- 4 - 4 - 4.4- Parallel Computation Patterns - Scan (Prefix Sum).srt 24KB
- 1 - 6 - 1.6- Introduction to CUDA, Kernel-Based SPMD Parallel Programming.srt 24KB
- 2 - 4 - 2.4- Tiled Parallel Algorithms.srt 23KB
- 1 - 5 - 1.5- Introduction to CUDA, Memory Allocation and Data Movement API.srt 23KB
- 3 - 5 - 3.5- Parallel Computation Patterns - 2D Tiled Convolution Kernel.srt 23KB
- 2 - 1 - 2.1- Kernel-based Parallel Programming - Thread Scheduling.srt 23KB
- 4 - 2 - 4.2- Parallel Computation Patterns - A Basic Reduction Kernel.srt 22KB
- 5 - 3 - 5.3- Parallel Computation Patterns - Atomic Operations in CUDA.srt 22KB
- 3 - 4 - 3.4- Parallel Computation Patterns - Tiled Convolution.srt 21KB
- 2 - 8 - 2.8- A Tiled Kernel for Arbitrary Matrix Dimensions.srt 20KB
- 1 - 8 - 1.8- Kernel-based Parallel Programming, Basic Matrix-Matrix Multiplication.srt 20KB
- 2 - 6 - 2.6- Tiled Matrix Multiplication Kernel.txt 20KB
- 5 - 4 - 5.4- Parallel Computation Patters - Atomic Operations Performance.srt 20KB
- 1 - 2 - 1.2- Introduction to Heterogeneous Parallel Computing.srt 19KB
- 1 - 7 - 1.7- Kernel-based Parallel Programming, Multidimensional Kernel Configuration.srt 19KB
- 4 - 3 - 4.3- Parallel Computation Patterns - A Better Reduction Kernel.srt 19KB
- 3 - 3 - 3.3- Parallel Computation Patterns - Convolution.srt 19KB
- 3 - 2 - 3.2- Performance Considerations - Memory Coalescing in CUDA.srt 18KB
- 2 - 2 - 2.2- Control Divergence.srt 18KB
- 4 - 1 - 4.1- Parallel Computation Patterns - Reduction.txt 18KB
- 3 - 1 - 3.1- Performance Considerations - DRAM Bandwidth.txt 17KB
- 5 - 5 - 5.5- Parallel Computation Patterns - A Privatized Histogram Kernel.srt 17KB
- 3 - 6 - 3.6- Parallel Computation Patterns - Data Reuse in Tiled Convolution.txt 17KB
- 2 - 7 - 2.7- Handling Boundary Conditions in Tiling.srt 17KB
- 2 - 3 - 2.3- Memory Model and Locality -- CUDA Memories.txt 16KB
- 4 - 5 - 4.5- Parallel Computation Patterns - A Work-Inefficient Scan Kernel.txt 16KB
- 1 - 4 - 1.4- Introduction to CUDA, Data Parallelism and Threads.txt 16KB
- 1 - 1 - 1.1- Course Overview.txt 16KB
- 2 - 5 - 2.5- Tiled Matrix Multiplication.txt 15KB
- 5 - 1 - 5.1- Parallel Computation Patterns - Histogramming.srt 15KB
- 4 - 7 - 4.7- Parallel Computation Patterns - More on Parallel Scan.txt 15KB
- 5 - 2 - 5.2- Parallel Computation Patterns - Atomic Operations.srt 15KB
- 4 - 6 - 4.6- Parallel Computation Patterns - A Work-Efficient Parallel Scan Kernel.txt 15KB
- 4 - 4 - 4.4- Parallel Computation Patterns - Scan (Prefix Sum).txt 15KB
- 1 - 6 - 1.6- Introduction to CUDA, Kernel-Based SPMD Parallel Programming.txt 14KB
- 2 - 4 - 2.4- Tiled Parallel Algorithms.txt 14KB
- 1 - 5 - 1.5- Introduction to CUDA, Memory Allocation and Data Movement API.txt 14KB
- 3 - 5 - 3.5- Parallel Computation Patterns - 2D Tiled Convolution Kernel.txt 14KB
- 2 - 1 - 2.1- Kernel-based Parallel Programming - Thread Scheduling.txt 14KB
- 4 - 2 - 4.2- Parallel Computation Patterns - A Basic Reduction Kernel.txt 13KB
- 3 - 4 - 3.4- Parallel Computation Patterns - Tiled Convolution.txt 13KB
- 5 - 3 - 5.3- Parallel Computation Patterns - Atomic Operations in CUDA.txt 13KB
- 1 - 8 - 1.8- Kernel-based Parallel Programming, Basic Matrix-Matrix Multiplication.txt 13KB
- 2 - 8 - 2.8- A Tiled Kernel for Arbitrary Matrix Dimensions.txt 13KB
- 1 - 2 - 1.2- Introduction to Heterogeneous Parallel Computing.txt 12KB
- 5 - 4 - 5.4- Parallel Computation Patters - Atomic Operations Performance.txt 12KB
- 1 - 7 - 1.7- Kernel-based Parallel Programming, Multidimensional Kernel Configuration.txt 12KB
- 4 - 3 - 4.3- Parallel Computation Patterns - A Better Reduction Kernel.txt 12KB
- 3 - 3 - 3.3- Parallel Computation Patterns - Convolution.txt 11KB
- 3 - 2 - 3.2- Performance Considerations - Memory Coalescing in CUDA.txt 11KB
- 2 - 2 - 2.2- Control Divergence.txt 11KB
- 1 - 3 - 1.3- Portability and Scalability in Heterogeneous Parallel Computing.srt 11KB
- 5 - 5 - 5.5- Parallel Computation Patterns - A Privatized Histogram Kernel.txt 10KB
- 2 - 7 - 2.7- Handling Boundary Conditions in Tiling.txt 10KB
- 5 - 1 - 5.1- Parallel Computation Patterns - Histogramming.txt 9KB
- 5 - 2 - 5.2- Parallel Computation Patterns - Atomic Operations.txt 9KB
- 1 - 3 - 1.3- Portability and Scalability in Heterogeneous Parallel Computing.txt 7KB