SCRN-VRC / YOLOv4-Tiny-in-UnityCG-HLSL

License: MIT
A modern object detector inside fragment shaders

Programming Languages

C++, GLSL, HLSL, ShaderLab, C#

Projects that are alternatives of or similar to YOLOv4-Tiny-in-UnityCG-HLSL

Unity Shaders
✨ Shader demo - More than 300 examples
Stars: ✭ 198 (+421.05%)
Mutual labels:  shaders, hlsl
Urp Lwrp Shaders
A Collection of Shader For URP(LWRP) Render Pipeline
Stars: ✭ 252 (+563.16%)
Mutual labels:  shaders, hlsl
Universalshaderexamples
Sand box project containing example shaders and assets compatible with Unity Universal Render Pipeline.
Stars: ✭ 207 (+444.74%)
Mutual labels:  shaders, hlsl
Reshade
A generic post-processing injector for games and video software.
Stars: ✭ 2,285 (+5913.16%)
Mutual labels:  shaders, hlsl
unity-raymarcher
Real-time ray marching shaders in Unity
Stars: ✭ 28 (-26.32%)
Mutual labels:  shaders, hlsl
Hlslpp
Math library using hlsl syntax with SSE/NEON support
Stars: ✭ 153 (+302.63%)
Mutual labels:  shaders, hlsl
Alloy
Alloy physical shader framework for Unity.
Stars: ✭ 244 (+542.11%)
Mutual labels:  shaders, hlsl
Lwks Fx Bundle
Synced user effects pack
Stars: ✭ 21 (-44.74%)
Mutual labels:  shaders, hlsl
VRC-Cancerspace
Cancerous screenspace shader for VRChat. Please use responsibly. :^)
Stars: ✭ 55 (+44.74%)
Mutual labels:  hlsl, vrchat
DrawSpace
Space-game oriented rendering engine
Stars: ✭ 20 (-47.37%)
Mutual labels:  shaders, hlsl
Hlslexplorer
See how hardware understands your HLSL
Stars: ✭ 91 (+139.47%)
Mutual labels:  shaders, hlsl
DLAA
(DLAA) Directionally Localized antiAliasing
Stars: ✭ 18 (-52.63%)
Mutual labels:  shaders, hlsl
3d Game Shaders For Beginners
🎮 A step-by-step guide to implementing SSAO, depth of field, lighting, normal mapping, and more for your 3D game.
Stars: ✭ 11,698 (+30684.21%)
Mutual labels:  shaders, hlsl
Spirv Vm
Virtual machine for executing SPIR-V
Stars: ✭ 173 (+355.26%)
Mutual labels:  shaders, hlsl
Hlsl To Ispc
HLSL-to-ISPC Utility Library
Stars: ✭ 37 (-2.63%)
Mutual labels:  shaders, hlsl
Shadered
Lightweight, cross-platform & full-featured shader IDE
Stars: ✭ 3,247 (+8444.74%)
Mutual labels:  shaders, hlsl
Hlsl2glslfork
HLSL to GLSL language translator based on ATI's HLSL2GLSL. Used in Unity.
Stars: ✭ 488 (+1184.21%)
Mutual labels:  shaders, hlsl
Slang
Making it easier to work with shaders
Stars: ✭ 627 (+1550%)
Mutual labels:  shaders, hlsl
Dxbc2Dxil
DEPRECATED. DXBC to DXIL (HLSL Bytecode to LLVM IR) using internal APIs.
Stars: ✭ 21 (-44.74%)
Mutual labels:  shaders, hlsl
bShaders
Video playback Effects/Filters (DirectX .hlsl pixel shaders, mpv .hook)
Stars: ✭ 29 (-23.68%)
Mutual labels:  shaders, hlsl

YOLOv4 Tiny in UnityCG/HLSL

Video Demo: https://twitter.com/SCRNinVR/status/1380238589238206465?s=20

Overview

YOLOv4-tiny is one of the fastest object detectors currently available. The goal of this project is to recreate it from scratch, without any existing ML libraries such as Darknet, PyTorch, or TensorFlow, in order to port it into the VR game VRChat.

My naive implementation only runs at around 30 FPS so that it doesn't hog resources needed for VR; it is nowhere near as performant as the original.

This implementation is based on the TensorFlow version from https://github.com/hunglc007/tensorflow-yolov4-tflite

NOTE: This was built and tested with Unity 2018.4.20f1; there may be shader compatibility issues with other versions.

Setup

  1. Download the package from the Releases page
  2. Import it into your Unity project
  3. Open the scene in the Scenes folder
  4. Done, there are no dependencies
  5. Enter Play Mode to run the network

Avatars

  1. Look in the Prefabs folder
  2. Drop the prefab onto your avatar

Code

Important Shader Properties

yolov4tiny.shader

  1. Frame Delay - How long each layer waits before updating. The default value is 3; the lower the value, the more GPU intensive the network becomes (see the sketch below).
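
Spreading the network across frames is what keeps the cost VR-friendly: each pass recomputes its layer only during its own time slot and otherwise re-emits last frame's result. Below is a minimal sketch of that idea inside a standard unlit fragment shader; every name in it (_Prev, _FrameDelay, LAYER_ID, NUM_LAYERS) is illustrative, not the repo's actual code.

sampler2D _Prev;        // this layer's output from the previous frame
float _FrameDelay;      // mirrors the "Frame Delay" property, default 3

static const uint LAYER_ID   = 5;   // hypothetical: index of this pass
static const uint NUM_LAYERS = 21;  // hypothetical: total passes in the chain

fixed4 frag (v2f i) : SV_Target
{
    // Approximate frame index: elapsed seconds * (1 / delta time).
    uint frame = (uint)(_Time.y * unity_DeltaTime.y);

    // Outside this layer's slot, carry last frame's value forward.
    [branch]
    if ((frame / max((uint)_FrameDelay, 1u)) % NUM_LAYERS != LAYER_ID)
        return tex2D(_Prev, i.uv);

    // ...otherwise recompute this layer's convolutions here...
    return tex2D(_Prev, i.uv); // placeholder for the real layer math
}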

nms.shader

  1. Confidence Threshold - The cutoff below which bounding boxes are culled. The default value is 0.5; lowering it produces more boxes but also a higher error rate (see the sketch below).
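
Conceptually, this threshold just gates which candidate detections survive into non-maximum suppression. A minimal sketch, where the _ConfThreshold uniform name is assumed, mirroring the property:

float _ConfThreshold; // assumed name for the "Confidence Threshold" property

// YOLO scores a box as objectness * best class probability; anything
// scoring below the threshold is culled before IoU suppression.
bool keepBox(float objectness, float bestClassProb)
{
    return objectness * bestClassProb > _ConfThreshold;
}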

Reading the Output

The basic setup is: yolov4tiny.shader -> nms.shader -> output.shader. To read the bounding box information, we loop through the output of nms.shader.

This is a basic outline of how output.shader works; refer to the file for more information.

  1. Set up the input and feed in nms_buffer.renderTexture:
Properties
{
    _NMSout ("NMS Output", 2D) = "black" {}
}
  2. Import the functions:
#include "nms_include.cginc"
  3. Loop through the texture:
const float2 scale = 0.5.xx;

uint i;
uint j;

// Loop through the 26x26 grid output
for (i = 0; i < 26; i++) {
    for (j = 0; j < 26; j++) {
        // Only draw a box if the confidence is over 50%
        uint4 buff = asuint(_NMSout[txL20nms.xy + uint2(i, j)]);
        float conf = f16tof32(buff.a);
        [branch]
        if (conf > 0.5) {
            // Class, 0 to 79
            float c = f16tof32(buff.b >> 16);
            // x, y is the center position of the bbox relative to 416, the initial image input size that goes into the network
            float x = f16tof32(buff.r >> 16);
            float y = f16tof32(buff.r);
            // w, h are the width and height of the bbox relative to 416, the initial image input size that goes into the network
            float w = f16tof32(buff.g >> 16);
            float h = f16tof32(buff.g);
            // Scale to camera resolution using UVs
            float2 center = float2(x, y) / 416.0;
            center.y = 1.0 - center.y;
            float2 size = float2(w, h) / 416.0 * scale;
        }
    }
}

// YOLOv4 tiny has two outputs, remember to go through the second one too
// Loop through the 13x13 grid output
for (i = 0; i < 13; i++) {
    for (j = 0; j < 13; j++) {
        // Only draw a box if the confidence is over 50%
        uint4 buff = asuint(_NMSout[txL17nms.xy + uint2(i, j)]);
        float conf = f16tof32(buff.a);
        [branch]
        if (conf > 0.5) {
            // Class, 0 to 79
            float c = f16tof32(buff.b >> 16);
            // x, y is the center position of the bbox relative to 416, the initial image input size that goes into the network
            float x = f16tof32(buff.r >> 16);
            float y = f16tof32(buff.r);
            // w, h are the width and height of the bbox relative to 416, the initial image input size that goes into the network
            float w = f16tof32(buff.g >> 16);
            float h = f16tof32(buff.g);
            // Scale to camera resolution using UVs
            float2 center = float2(x, y) / 416.0;
            center.y = 1.0 - center.y;
            float2 size = float2(w, h) / 416.0 * scale;
        }
    }
}
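
What you do inside the confidence branch is up to you. As one example, a hypothetical helper (not part of the repo) that turns the computed center and size into a box outline at the current fragment's UV:

float boxEdge(float2 uv, float2 center, float2 size)
{
    const float thickness = 0.002;   // outline width in UV units
    float2 d = abs(uv - center);     // distance from the box center
    bool2 inside  = d <= size * 0.5;
    bool2 innards = d <= size * 0.5 - thickness;
    // On the outline when inside the box but not inside the inner rectangle.
    return (all(inside) && !all(innards)) ? 1.0 : 0.0;
}

// Usage inside the conf > 0.5 branch, tinting the camera image red:
// col = lerp(col, float4(1, 0, 0, 1), boxEdge(i.uv, center, size));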

nms.shader packs the data as two 16-bit halves per 32-bit channel, with the following layout:

Channel =  High 16 bits  |  Low 16 bits
R       =  X             |  Y
G       =  W             |  H
B       =  Best class    |  Best class probability
A       =  (unused)      |  Bounding box confidence
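
The packing side follows from the same intrinsics: f32tof16 returns a half-precision bit pattern in the low 16 bits of a uint, so two values fit in each 32-bit channel. A sketch of how nms.shader could assemble one texel under that layout (illustrative, not the repo's exact code):

// Pack two half-precision values into one 32-bit channel:
// 'hi' in the top 16 bits, 'lo' in the bottom 16.
uint pack2(float hi, float lo)
{
    return (f32tof16(hi) << 16) | f32tof16(lo);
}

uint4 packBox(float x, float y, float w, float h,
              float bestClass, float bestProb, float conf)
{
    uint4 t;
    t.r = pack2(x, y);                // X | Y
    t.g = pack2(w, h);                // W | H
    t.b = pack2(bestClass, bestProb); // best class | its probability
    t.a = f32tof16(conf);             // confidence in the low 16 bits
    return t;
}

// The fragment then reinterprets the bits for a float4 render target:
// return asfloat(packBox(x, y, w, h, c, p, conf));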

The network detects the 80 classes of the COCO dataset, indexed in the standard COCO order: 0 = person, 1 = bicycle, and so on.

How It Works

Since this is a direct implementation of a known architecture, you can refer to the original papers.

YOLOv4's paper is essentially an ablation study of the parameters and tuning used to maximize speed and accuracy. I suggest reading the papers on the previous versions to get a better understanding of the actual architecture.

Other Resources

If you have questions or comments, you can reach me on Discord: SCRN#8008 or Twitter: https://twitter.com/SCRNinVR
