nickspell / udacity-IntroToParallelProgramming

Licence: other
CS344 - Introduction To Parallel Programming course (Udacity) proposed solutions

Programming Languages

Cuda
C++
Makefile
CMake
C

Projects that are alternatives of or similar to udacity-IntroToParallelProgramming

ud859
Udacity course ud859 with Go
Stars: ✭ 18 (-50%)
Mutual labels:  udacity
FlickOff
A lite movie guide app, with MVVM architecture, that lets you discover movies from TMDb.
Stars: ✭ 31 (-13.89%)
Mutual labels:  udacity
Behavior-Cloning
end to end learning for self-driving
Stars: ✭ 25 (-30.56%)
Mutual labels:  udacity
highway-path-planning
My path-planning pipeline to navigate a car safely around a virtual highway with other traffic.
Stars: ✭ 39 (+8.33%)
Mutual labels:  udacity
SilverScreener
A feature-rich movie guide app, that lets you discover movies from TMDb.
Stars: ✭ 24 (-33.33%)
Mutual labels:  udacity
Facial-Keypoint-Detection
Computer vision: Detect facial keypoints using PyTorch and OpenCV
Stars: ✭ 25 (-30.56%)
Mutual labels:  udacity
Udacity-programming-for-Data-Science-With-Python-Nanodegree
This repository contains all the code for the Udacity Programming for Data Science course
Stars: ✭ 22 (-38.89%)
Mutual labels:  udacity
Stock-Hawk
An Android app for monitoring stocks. This will replace Project 3 in the Android Developer Nanodegree.
Stars: ✭ 19 (-47.22%)
Mutual labels:  udacity
java-multithread
Code written for the Multithreading with Java course on the RinaldoDev YouTube channel.
Stars: ✭ 24 (-33.33%)
Mutual labels:  parallel-programming
fusion-ekf
An extended Kalman Filter implementation in C++ for fusing lidar and radar sensor measurements.
Stars: ✭ 113 (+213.89%)
Mutual labels:  udacity
FlyingCarUdacity
🛩️⚙️ 3D Planning, PID Control, Extended Kalman Filter for the Udacity Flying Car Nanodegree // FCND-Term1
Stars: ✭ 16 (-55.56%)
Mutual labels:  udacity
Udacity-Data-Analyst-Nanodegree
Repository for the projects needed to complete the Data Analyst Nanodegree.
Stars: ✭ 31 (-13.89%)
Mutual labels:  udacity
udacity-baking-recipes
Udacity - Baking Android App
Stars: ✭ 14 (-61.11%)
Mutual labels:  udacity
BakingApp
Udacity Android Developer Nanodegree, project 2.
Stars: ✭ 54 (+50%)
Mutual labels:  udacity
UDACITY-Deep-Learning-Nanodegree-PROJECTS
These are the projects I did on my Udacity Deep Learning Nanodegree 🌟 💻 💻. 💥 🌈
Stars: ✭ 18 (-50%)
Mutual labels:  udacity
Scientific-Programming-in-Julia
Repository for B0M36SPJ
Stars: ✭ 32 (-11.11%)
Mutual labels:  parallel-programming
udacity-ml-nanodegree
Udacity ML Nanodegree Projects
Stars: ✭ 22 (-38.89%)
Mutual labels:  udacity
Parallel-Computing-Javascript
Undergraduate Research about Parallel Computing in JavaScript
Stars: ✭ 23 (-36.11%)
Mutual labels:  parallel-programming
Self-Driving-Car-Steering-Simulator
The aim of this project is to allow a self driving car to steer autonomously in a virtual environment.
Stars: ✭ 15 (-58.33%)
Mutual labels:  udacity
point-cloud-clusters
A catkin workspace in ROS which uses DBSCAN to identify which points in a point cloud belong to the same object.
Stars: ✭ 43 (+19.44%)
Mutual labels:  udacity

udacity-IntroToParallelProgramming

CS344 - Introduction To Parallel Programming course (Udacity) proposed solutions

Testing Environment: Visual Studio 2015 x64 + NVIDIA CUDA 8.0 + OpenCV 3.2.0

For each problem set, the core of the algorithm to be implemented is located in the students_func.cu file.

Problem Set 1 - RGB2Gray

Objective

Convert an input RGBA image into a grayscale version (ignoring the A channel).

Topics

Example of a map primitive operation on a data structure.
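
As a rough illustration of the map pattern, here is a minimal host-side C++ sketch (not the repository's CUDA kernel). The Rgba struct, the function names, and the luminance weights 0.299 / 0.587 / 0.114 are assumptions made for this example. On the GPU, each iteration of the double loop would be handled by its own thread.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical pixel layout: four 8-bit channels; alpha is ignored.
struct Rgba { std::uint8_t r, g, b, a; };

// Per-pixel function of the map: one output depends on exactly one input.
// The weights are an assumption (a common luminance formula), applied in
// integer arithmetic with rounding so the result is deterministic.
inline std::uint8_t toGray(const Rgba& p) {
    const unsigned sum = 299u * p.r + 587u * p.g + 114u * p.b;
    return static_cast<std::uint8_t>((sum + 500u) / 1000u);
}

// Sequential stand-in for a 2D kernel launch over a row-major image.
std::vector<std::uint8_t> rgbaToGray(const std::vector<Rgba>& in,
                                     int numRows, int numCols) {
    assert(numRows >= 0 && numCols >= 0);
    assert(in.size() == static_cast<std::size_t>(numRows) * numCols);
    std::vector<std::uint8_t> out(in.size());
    for (int r = 0; r < numRows; ++r) {
        for (int c = 0; c < numCols; ++c) {
            const std::size_t i = static_cast<std::size_t>(r) * numCols + c;
            out[i] = toGray(in[i]);  // no dependency on any other pixel
        }
    }
    return out;
}
```

Because no pixel depends on any other, the loop body can be distributed over threads in any order without synchronization.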

Problem Set 2 - Blur

Objective

Apply a Gaussian blur convolution filter to an input RGBA image (blur each channel independently, ignoring the A channel).

Topics

Example of a stencil primitive operation on a 2D array. Shared memory is used to speed up the algorithm. Both a global-memory kernel and a shared-memory kernel are provided; the latter is approximately 1.6x faster than the former.
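
The stencil pattern can be sketched on the host as follows. This is a plain C++ illustration for a single channel, not the repository's kernel: the function name, the float pixel type, and the clamp-to-edge border handling are assumptions. It corresponds to the global-memory variant; a shared-memory kernel would first stage the tile and its halo before running the same inner loops.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Convolve one channel (row-major, numRows x numCols) with a square filter.
// Reads that fall outside the image are clamped to the nearest edge pixel
// (an assumption for this sketch).
std::vector<float> blurChannel(const std::vector<float>& in,
                               int numRows, int numCols,
                               const std::vector<float>& filter,
                               int filterWidth) {
    assert(filterWidth > 0 && filterWidth % 2 == 1);
    assert(in.size() == static_cast<std::size_t>(numRows) * numCols);
    assert(filter.size() == static_cast<std::size_t>(filterWidth) * filterWidth);

    const int half = filterWidth / 2;
    std::vector<float> out(in.size());
    // On the GPU, each (r, c) pair would be one thread.
    for (int r = 0; r < numRows; ++r) {
        for (int c = 0; c < numCols; ++c) {
            float sum = 0.0f;
            for (int fr = -half; fr <= half; ++fr) {
                for (int fc = -half; fc <= half; ++fc) {
                    const int rr = std::min(std::max(r + fr, 0), numRows - 1);
                    const int cc = std::min(std::max(c + fc, 0), numCols - 1);
                    const float w = filter[(fr + half) * filterWidth + (fc + half)];
                    sum += w * in[rr * numCols + cc];
                }
            }
            out[r * numCols + c] = sum;
        }
    }
    return out;
}
```

Each output pixel reads a filterWidth x filterWidth neighborhood, so adjacent threads reread mostly the same inputs; that overlap is what makes staging the tile in shared memory worthwhile.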

Problem Set 3 - Tone Mapping

Objective

Map a High Dynamic Range image into an image for a device supporting a smaller range of intensity values.

Topics

  • Compute the range of intensity values of the input image: min and max reductions are implemented.
  • Compute the histogram of intensity values (a 1024-element array).
  • Compute the cumulative distribution function of the histogram with the Hillis & Steele scan algorithm (step-efficient, well suited to small arrays such as the histogram).
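
The three steps above can be sketched on the host as follows. This is an illustrative C++ version, not the repository's kernels: the function names, the binning formula, and the choice of an exclusive scan for the CDF are assumptions. Each inner loop over i stands in for one parallel step in which every thread handles one element.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Tree-shaped min/max reduction: each pass folds the upper half onto the lower.
std::pair<float, float> reduceMinMax(const std::vector<float>& v) {
    assert(!v.empty());
    std::vector<float> lo(v), hi(v);
    for (std::size_t n = v.size(); n > 1; n = (n + 1) / 2) {
        const std::size_t half = (n + 1) / 2;
        for (std::size_t i = 0; i + half < n; ++i) {  // one thread per i
            lo[i] = std::min(lo[i], lo[i + half]);
            hi[i] = std::max(hi[i], hi[i + half]);
        }
    }
    return std::make_pair(lo[0], hi[0]);
}

// Histogram of values in [lo, hi]; the increment would be an atomicAdd on the GPU.
std::vector<unsigned> histogram(const std::vector<float>& v, float lo, float hi,
                                std::size_t numBins) {
    assert(hi > lo && numBins > 0);
    std::vector<unsigned> bins(numBins, 0);
    for (float x : v) {
        const std::size_t b = static_cast<std::size_t>((x - lo) / (hi - lo) * numBins);
        ++bins[std::min(numBins - 1, b)];  // the maximum value lands in the last bin
    }
    return bins;
}

// Hillis & Steele scan: log2(n) passes, each element adds the one `stride` behind it.
// The inclusive result is shifted right by one to obtain an exclusive scan.
std::vector<unsigned> hillisSteeleExclusiveScan(const std::vector<unsigned>& in) {
    std::vector<unsigned> cur(in), next(in.size());
    for (std::size_t stride = 1; stride < in.size(); stride *= 2) {
        for (std::size_t i = 0; i < in.size(); ++i)  // one thread per i
            next[i] = (i >= stride) ? cur[i] + cur[i - stride] : cur[i];
        cur.swap(next);  // double buffering avoids read/write hazards
    }
    std::vector<unsigned> out(in.size(), 0);
    for (std::size_t i = 1; i < in.size(); ++i) out[i] = cur[i - 1];
    return out;
}
```

Hillis & Steele performs O(n log n) additions but only log2(n) passes, which is why it suits a short array such as the histogram.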

Problem Set 4 - Red eyes removal

Objective

Remove the red-eye effect from an input RGBA image (it uses Normalized Cross Correlation against a training template).

Topics

Sorting on the GPU: given an input array of NCC scores, sort it in ascending order using radix sort. For each bit:

  • Compute a predicate vector (0:false, 1:true)
  • Perform a Blelloch scan on the predicate vector (for both the false and the true case).
  • From the Blelloch scan, extract a histogram of predicate values [0, numberOfFalses] and an offset vector (the actual result of the scan).
  • A move kernel computes the new index of each element (using the two structures above), and moves it.
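
The per-bit steps above can be sketched on the host as follows. This is an illustrative C++ version rather than the repository's kernels: the function names, the 32-bit unsigned keys, and the vals payload array are assumptions. The scan uses the work-efficient up-sweep / down-sweep scheme, padded to a power of two.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Work-efficient exclusive scan (up-sweep, then down-sweep).
std::vector<unsigned> blellochExclusiveScan(const std::vector<unsigned>& in) {
    std::size_t n = 1;
    while (n < in.size()) n *= 2;  // pad to a power of two
    std::vector<unsigned> a(n, 0);
    std::copy(in.begin(), in.end(), a.begin());
    for (std::size_t d = 1; d < n; d *= 2)            // up-sweep (reduce)
        for (std::size_t i = 2 * d - 1; i < n; i += 2 * d)
            a[i] += a[i - d];
    a[n - 1] = 0;                                     // identity at the root
    for (std::size_t d = n / 2; d >= 1; d /= 2)       // down-sweep
        for (std::size_t i = 2 * d - 1; i < n; i += 2 * d) {
            const unsigned t = a[i - d];
            a[i - d] = a[i];
            a[i] += t;
        }
    a.resize(in.size());
    return a;
}

// LSD radix sort, one bit per pass; keys and their payload move together.
void radixSort(std::vector<unsigned>& keys, std::vector<unsigned>& vals) {
    assert(keys.size() == vals.size());
    const std::size_t n = keys.size();
    std::vector<unsigned> pred(n), outKeys(n), outVals(n);
    for (unsigned bit = 0; bit < 32; ++bit) {
        // Predicate: 1 where the current bit is 0.
        for (std::size_t i = 0; i < n; ++i)
            pred[i] = ((keys[i] >> bit) & 1u) == 0u ? 1u : 0u;
        const std::vector<unsigned> falseOffsets = blellochExclusiveScan(pred);
        const unsigned numFalses = n ? falseOffsets[n - 1] + pred[n - 1] : 0u;
        // Negated predicate: 1 where the current bit is 1.
        for (std::size_t i = 0; i < n; ++i) pred[i] = 1u - pred[i];
        const std::vector<unsigned> trueOffsets = blellochExclusiveScan(pred);
        // Move step: bucket start ([0, numFalses]) plus the offset from the scan.
        for (std::size_t i = 0; i < n; ++i) {
            const unsigned dst = pred[i] ? numFalses + trueOffsets[i] : falseOffsets[i];
            outKeys[dst] = keys[i];
            outVals[dst] = vals[i];
        }
        keys.swap(outKeys);
        vals.swap(outVals);
    }
}
```

Because the scan preserves the relative order of elements within each bucket, every pass is stable, which is what makes sorting from the least significant bit upward correct.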

Problem Set 5 - Optimized histogram computation

Objective

Improve the histogram computation performance on GPU over the simple global atomic solution.

Topics

Per-block histogram computation. Each block computes its own histogram in shared memory, and the per-block histograms are combined at the end in global memory (more than 7x speedup over the global atomic implementation, while remaining relatively simple).
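
The idea can be sketched on the host as follows. This is a C++ illustration rather than the repository's kernel: the function name and the blockSize parameter are assumptions. Each chunk of the input plays the role of one thread block, and the local vector plays the role of its shared-memory histogram.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// vals holds bin indices in [0, numBins). Each chunk of blockSize elements is
// accumulated privately and then merged into the global histogram.
std::vector<unsigned> blockedHistogram(const std::vector<unsigned>& vals,
                                       std::size_t numBins,
                                       std::size_t blockSize) {
    assert(numBins > 0 && blockSize > 0);
    std::vector<unsigned> global(numBins, 0);
    for (std::size_t start = 0; start < vals.size(); start += blockSize) {
        std::vector<unsigned> local(numBins, 0);  // __shared__ array on the GPU
        const std::size_t end = std::min(vals.size(), start + blockSize);
        for (std::size_t i = start; i < end; ++i) {
            assert(vals[i] < numBins);
            ++local[vals[i]];                     // shared-memory atomicAdd on the GPU
        }
        for (std::size_t b = 0; b < numBins; ++b)
            global[b] += local[b];                // one global atomicAdd per bin per block
    }
    return global;
}
```

The gain comes from contention: most increments hit the fast per-block copy, and global memory sees at most numBins atomic updates per block instead of one per input element.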

Problem Set 6 - Seamless Image Cloning

Objective

Given a target image (e.g. a swimming pool), seamlessly blend in a masked source image (e.g. a hippo).

Topics

The algorithm consists of performing Jacobi iterations on the source and target images to blend one into the other.

  • Given the mask, detect the interior points and the boundary points
  • Since the algorithm only has to run on the interior points, compute the bounding box of the mask region to restrict the Jacobi iterations to a subimage.
  • Split the images into the R, G, and B channels.
  • Run 800 Jacobi iterations on each channel. The code uses CUDA streams to run the same kernel concurrently on the 3 channels (3x speedup on my machine, 1.5x on the Udacity machine). The Jacobi kernel makes extensive use of shared memory, so the number of threads per block has been reduced to maximize SM occupancy.
  • Recombine the 3 channels to form the output image.
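
For a single channel, the Jacobi step can be sketched on the host as follows. This is a C++ illustration under stated assumptions, not the repository's kernel: the function names are hypothetical, and the update rule is the usual Poisson-blending form (neighbor values plus source gradients, averaged over 4 neighbors and clamped to [0, 255]). The bounding-box restriction, shared memory, and CUDA streams are left out.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// One Jacobi sweep over a row-major channel. interior[i] != 0 marks pixels whose
// four neighbors all lie inside the mask, so they never touch the image edge.
std::vector<float> jacobiStep(const std::vector<float>& prev,
                              const std::vector<float>& source,
                              const std::vector<float>& target,
                              const std::vector<unsigned char>& interior,
                              int numRows, int numCols) {
    std::vector<float> next(prev);
    for (int r = 1; r + 1 < numRows; ++r) {
        for (int c = 1; c + 1 < numCols; ++c) {  // one thread per pixel on the GPU
            const int p = r * numCols + c;
            if (!interior[p]) continue;
            const int nbr[4] = {p - numCols, p + numCols, p - 1, p + 1};
            float sum = 0.0f;
            for (int n : nbr) {
                sum += interior[n] ? prev[n] : target[n];  // border uses the fixed target
                sum += source[p] - source[n];              // keep the source gradient
            }
            next[p] = std::min(255.0f, std::max(0.0f, sum / 4.0f));
        }
    }
    return next;
}

// Runs the sweeps with double buffering, then pastes interior pixels onto the target.
std::vector<float> blendChannel(const std::vector<float>& source,
                                const std::vector<float>& target,
                                const std::vector<unsigned char>& interior,
                                int numRows, int numCols, int iterations) {
    const std::size_t n = static_cast<std::size_t>(numRows) * numCols;
    assert(source.size() == n && target.size() == n && interior.size() == n);
    std::vector<float> cur(source);  // initial guess: the source image (assumption)
    for (int it = 0; it < iterations; ++it)
        cur = jacobiStep(cur, source, target, interior, numRows, numCols);
    std::vector<float> out(target);
    for (std::size_t i = 0; i < n; ++i)
        if (interior[i]) out[i] = cur[i];
    return out;
}
```

Each sweep reads only the previous buffer, so all pixels of a channel can be updated in parallel, and the three channels are independent of one another, which is what allows them to run in separate streams.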