164 open source projects that are alternatives to or similar to VidSitu

CALVIN - A benchmark for Language-Conditioned Policy Learning for Long-Horizon Robot Manipulation Tasks

Stars: ✭ 105 (+156.1%)

Mutual labels: vision, vision-and-language, grounding

Stars: ✭ 52 (+26.83%)

Mutual labels: captioning, captioning-videos

autonomous-delivery-robot

Repository for Autonomous Delivery Robot project of IvLabs, VNIT

Stars: ✭ 65 (+58.54%)

Mutual labels: vision

pytorch violet

A PyTorch implementation of VIOLET

Stars: ✭ 119 (+190.24%)

Mutual labels: vision-and-language

Arc Robot Vision

MIT-Princeton Vision Toolbox for Robotic Pick-and-Place at the Amazon Robotics Challenge 2017 - Robotic Grasping and One-shot Recognition of Novel Objects with Deep Learning.

Stars: ✭ 224 (+446.34%)

Mutual labels: vision

stereo.vision

Planar fitting computation using stereo vision techniques

Stars: ✭ 19 (-53.66%)

Mutual labels: vision

pybv

A lightweight I/O utility for the BrainVision data format, written in Python.

Stars: ✭ 18 (-56.1%)

Mutual labels: vision

Learnable-Image-Resizing

TF 2 implementation of Learning to Resize Images for Computer Vision Tasks (https://arxiv.org/abs/2103.09950v1).

Stars: ✭ 48 (+17.07%)

Mutual labels: vision

mediapipe plus

The purpose of this project is to apply mediapipe to more AI chips.

Stars: ✭ 38 (-7.32%)

Mutual labels: vision

Donkeycar

Open source hardware and software platform to build a small scale self driving car.

Stars: ✭ 2,192 (+5246.34%)

Mutual labels: vision

res-mlp-pytorch

Implementation of ResMLP, an all-MLP solution to image classification, in PyTorch

Stars: ✭ 178 (+334.15%)

Mutual labels: vision

Openkai

OpenKAI: A modern framework for unmanned vehicle and robot control

Stars: ✭ 150 (+265.85%)

Mutual labels: vision

TokenLabeling

PyTorch implementation of "All Tokens Matter: Token Labeling for Training Better Vision Transformers"

Stars: ✭ 385 (+839.02%)

Mutual labels: vision

SAPC-APCA

APCA (Accessible Perceptual Contrast Algorithm) is a new method for predicting readability contrast, intended for use in emerging web standards (WCAG 3). APCA is derived from SAPC (S-LUV Advanced Predictive Color), an accessibility-oriented color appearance model designed for self-illuminated displays.

Stars: ✭ 266 (+548.78%)

Mutual labels: vision

lang2seg

Referring Expression Object Segmentation with Caption-Aware Consistency, BMVC 2019

Stars: ✭ 30 (-26.83%)

Mutual labels: vision-and-language

TRAR-VQA

[ICCV 2021] TRAR: Routing the Attention Spans in Transformers for Visual Question Answering -- Official Implementation

Stars: ✭ 49 (+19.51%)

Mutual labels: vision-and-language

Grocery-Product-Detection

This repository builds a product detection model to recognize products from grocery shelf images.

Stars: ✭ 73 (+78.05%)

Mutual labels: vision

dd-ml-segmentation-benchmark

DroneDeploy Machine Learning Segmentation Benchmark

Stars: ✭ 179 (+336.59%)

Mutual labels: vision

Amazing Arkit

A collection of ARKit-related resources. Group: 326705018

Stars: ✭ 239 (+482.93%)

Mutual labels: vision

wikiHow paper list

A paper list of research conducted based on wikiHow

Stars: ✭ 25 (-39.02%)

Mutual labels: vision-and-language

React Native Text Detector

Text detector for images in React Native, using Firebase ML Kit on Android and Tesseract on iOS

Stars: ✭ 194 (+373.17%)

Mutual labels: vision

face age gender

Can we predict the age and gender of someone given a picture of their face?

Stars: ✭ 40 (-2.44%)

Mutual labels: vision

Apriltag ros

A ROS wrapper of the AprilTag 3 visual fiducial detector

Stars: ✭ 160 (+290.24%)

Mutual labels: vision

ebu-tt-live-toolkit

Toolkit for supporting the EBU-TT Live specification

Stars: ✭ 23 (-43.9%)

Mutual labels: captioning

Robotcar Dataset Sdk

Software Development Kit for the Oxford Robotcar Dataset

Stars: ✭ 151 (+268.29%)

Mutual labels: vision

CNN-GoogLeNet

👁 Vision : Model 4: GoogLeNet : Image Classification

Stars: ✭ 17 (-58.54%)

Mutual labels: vision

Ravens

Train robotic agents to learn pick and place with deep learning for vision-based manipulation in PyBullet. Transporter Nets, CoRL 2020.

Stars: ✭ 133 (+224.39%)

Mutual labels: vision

monodepth

Python ROS depth estimation from RGB image based on code from the paper "High Quality Monocular Depth Estimation via Transfer Learning"

Stars: ✭ 41 (+0%)

Mutual labels: vision

S2VT-seq2seq-video-captioning-attention

S2VT (seq2seq) video captioning with Bahdanau and Luong attention, implemented in TensorFlow

Stars: ✭ 18 (-56.1%)

Mutual labels: captioning

EfficientMORL

EfficientMORL (ICML'21)

Stars: ✭ 22 (-46.34%)

Mutual labels: vision

DonkeyDrift

Open-source self-driving car based on DonkeyCar and programmable chassis

Stars: ✭ 15 (-63.41%)

Mutual labels: vision

sam-textvqa

Official code for paper "Spatially Aware Multimodal Transformers for TextVQA" published at ECCV, 2020.

Stars: ✭ 51 (+24.39%)

Mutual labels: vision

CarLens-iOS

CarLens - Recognize and Collect Cars

Stars: ✭ 124 (+202.44%)

Mutual labels: vision

CBP

Official TensorFlow implementation of the AAAI-2020 paper "Temporally Grounding Language Queries in Videos by Contextual Boundary-aware Prediction"

Stars: ✭ 52 (+26.83%)

Mutual labels: vision-and-language

frc-score-detection

A program to detect FRC match scores from their livestream.

Stars: ✭ 15 (-63.41%)

Mutual labels: vision

Final-year-project-deep-learning-models

Deep learning for freehand sketch object recognition

Stars: ✭ 22 (-46.34%)

Mutual labels: vision

nested-transformer

Nested Hierarchical Transformer https://arxiv.org/pdf/2105.12723.pdf

Stars: ✭ 174 (+324.39%)

Mutual labels: vision

FaceData

A macOS app to parse face landmarks from a video for GANs training

Stars: ✭ 71 (+73.17%)

Mutual labels: vision

Opencv

📷 Computer-Vision Demos

Stars: ✭ 244 (+495.12%)

Mutual labels: vision

X-VLM

X-VLM: Multi-Grained Vision Language Pre-Training (ICML 2022)

Stars: ✭ 283 (+590.24%)

Mutual labels: vision-and-language

Cs231a Notes

The course notes for Stanford's CS231A course on computer vision

Stars: ✭ 230 (+460.98%)

Mutual labels: vision

flutter-vision

iOS and Android app built with Flutter and Firebase. Includes Firebase ML Vision, Firestore, and Storage

Stars: ✭ 45 (+9.76%)

Mutual labels: vision

Simplecv

Stars: ✭ 2,522 (+6051.22%)

Mutual labels: vision

ReferFormer

[CVPR2022] Official Implementation of ReferFormer

Stars: ✭ 230 (+460.98%)

Mutual labels: video-language

Opticalflow visualization

Python optical flow visualization following Baker et al. (ICCV 2007) as used by the MPI-Sintel challenge

Stars: ✭ 183 (+346.34%)

Mutual labels: vision

mlp-mixer-pytorch

An All-MLP solution for Vision, from Google AI

Stars: ✭ 771 (+1780.49%)

Mutual labels: vision

Attendance Using Face

Face-recognition using Siamese network

Stars: ✭ 174 (+324.39%)

Mutual labels: vision

photonvision

PhotonVision is the free, fast, and easy-to-use computer vision solution for the FIRST Robotics Competition.

Stars: ✭ 115 (+180.49%)

Mutual labels: vision

Arucogen

Online ArUco markers generator

Stars: ✭ 155 (+278.05%)

Mutual labels: vision

non-contact-sleep-apnea-detection

Gihan Jayatilaka, Harshana Weligampola, Suren Sritharan, Pankayaraj Pathmanathan, Roshan Ragel and Isuru Nawinne, "Non-contact Infant Sleep Apnea Detection," 2019 14th Conference on Industrial and Information Systems (ICIIS), Kandy, Sri Lanka, 2019, pp. 260-265, doi: 10.1109/ICIIS47346.2019.9063269.

Stars: ✭ 15 (-63.41%)

Mutual labels: vision

Nextlevel

NextLevel was initially a weekend project that has now grown into an open community of camera platform enthusiasts. The software provides foundational components for managing media recording, camera interface customization, gestural interaction customization, and image streaming on iOS. The same capabilities can also be found in apps such as Snapchat, Instagram, and Vine.

Stars: ✭ 1,940 (+4631.71%)

Mutual labels: vision

CustomVisionMicrosoftToCoreMLDemoApp

This app recognises 3 hand signs (fist, high five, and victory hand; basically rock, paper, scissors) from a live camera feed. It uses a HandSigns.mlmodel that was trained using Custom Vision from Microsoft.

Stars: ✭ 25 (-39.02%)

Mutual labels: vision

Flowiz

Converts Optical Flow files to images and optionally compiles them to a video. Flow viewer GUI is also available. Check out mockup right from Github Pages:

Stars: ✭ 144 (+251.22%)

Mutual labels: vision

edge-computer-vision

Edge Computer Vision Course

Stars: ✭ 41 (+0%)

Mutual labels: vision

Cocoaai

🤖 The Cocoa Artificial Intelligence Lab

Stars: ✭ 134 (+226.83%)

Mutual labels: vision

handbook

We're a small high-trust livelihood pod doing tech consulting within Enspiral.