
164 open source projects that are alternatives to or similar to VidSitu

calvin
CALVIN - A benchmark for Language-Conditioned Policy Learning for Long-Horizon Robot Manipulation Tasks
Stars: ✭ 105 (+156.1%)
Mutual labels:  vision, vision-and-language, grounding
iPerceive
Applying Common-Sense Reasoning to Multi-Modal Dense Video Captioning and Video Question Answering | Python3 | PyTorch | CNNs | Causality | Reasoning | LSTMs | Transformers | Multi-Head Self Attention | Published in IEEE Winter Conference on Applications of Computer Vision (WACV) 2021
Stars: ✭ 52 (+26.83%)
Mutual labels:  captioning, captioning-videos
autonomous-delivery-robot
Repository for Autonomous Delivery Robot project of IvLabs, VNIT
Stars: ✭ 65 (+58.54%)
Mutual labels:  vision
pytorch violet
A PyTorch implementation of VIOLET
Stars: ✭ 119 (+190.24%)
Mutual labels:  vision-and-language
Arc Robot Vision
MIT-Princeton Vision Toolbox for Robotic Pick-and-Place at the Amazon Robotics Challenge 2017 - Robotic Grasping and One-shot Recognition of Novel Objects with Deep Learning.
Stars: ✭ 224 (+446.34%)
Mutual labels:  vision
stereo.vision
planar fitting computation using stereo vision techniques
Stars: ✭ 19 (-53.66%)
Mutual labels:  vision
pybv
A lightweight I/O utility for the BrainVision data format, written in Python.
Stars: ✭ 18 (-56.1%)
Mutual labels:  vision
Learnable-Image-Resizing
TF 2 implementation of Learning to Resize Images for Computer Vision Tasks (https://arxiv.org/abs/2103.09950v1).
Stars: ✭ 48 (+17.07%)
Mutual labels:  vision
mediapipe plus
The purpose of this project is to apply mediapipe to more AI chips.
Stars: ✭ 38 (-7.32%)
Mutual labels:  vision
Donkeycar
Open source hardware and software platform to build a small scale self driving car.
Stars: ✭ 2,192 (+5246.34%)
Mutual labels:  vision
res-mlp-pytorch
Implementation of ResMLP, an all MLP solution to image classification, in Pytorch
Stars: ✭ 178 (+334.15%)
Mutual labels:  vision
Openkai
OpenKAI: A modern framework for unmanned vehicle and robot control
Stars: ✭ 150 (+265.85%)
Mutual labels:  vision
TokenLabeling
Pytorch implementation of "All Tokens Matter: Token Labeling for Training Better Vision Transformers"
Stars: ✭ 385 (+839.02%)
Mutual labels:  vision
SAPC-APCA
APCA (Accessible Perceptual Contrast Algorithm) is a new method for predicting contrast for use in emerging web standards (WCAG 3) for determining readability contrast. APCA is derived from the SAPC (S-LUV Advanced Predictive Color), which is an accessibility-oriented color appearance model designed for self-illuminated displays.
Stars: ✭ 266 (+548.78%)
Mutual labels:  vision
lang2seg
Referring Expression Object Segmentation with Caption-Aware Consistency, BMVC 2019
Stars: ✭ 30 (-26.83%)
Mutual labels:  vision-and-language
TRAR-VQA
[ICCV 2021] TRAR: Routing the Attention Spans in Transformers for Visual Question Answering -- Official Implementation
Stars: ✭ 49 (+19.51%)
Mutual labels:  vision-and-language
Grocery-Product-Detection
This repository builds a product detection model to recognize products from grocery shelf images.
Stars: ✭ 73 (+78.05%)
Mutual labels:  vision
dd-ml-segmentation-benchmark
DroneDeploy Machine Learning Segmentation Benchmark
Stars: ✭ 179 (+336.59%)
Mutual labels:  vision
Amazing Arkit
A collection of ARKit-related resources. Group: 326705018
Stars: ✭ 239 (+482.93%)
Mutual labels:  vision
wikiHow paper list
A paper list of research conducted based on wikiHow
Stars: ✭ 25 (-39.02%)
Mutual labels:  vision-and-language
React Native Text Detector
Text Detector from image for react native using firebase MLKit on android and Tesseract on iOS
Stars: ✭ 194 (+373.17%)
Mutual labels:  vision
face age gender
Can we predict the age and gender of someone given a picture of their face?
Stars: ✭ 40 (-2.44%)
Mutual labels:  vision
Apriltag ros
A ROS wrapper of the AprilTag 3 visual fiducial detector
Stars: ✭ 160 (+290.24%)
Mutual labels:  vision
ebu-tt-live-toolkit
Toolkit for supporting the EBU-TT Live specification
Stars: ✭ 23 (-43.9%)
Mutual labels:  captioning
Robotcar Dataset Sdk
Software Development Kit for the Oxford Robotcar Dataset
Stars: ✭ 151 (+268.29%)
Mutual labels:  vision
CNN-GoogLeNet
👁 Vision : Model 4: GoogLeNet : Image Classification
Stars: ✭ 17 (-58.54%)
Mutual labels:  vision
Ravens
Train robotic agents to learn pick and place with deep learning for vision-based manipulation in PyBullet. Transporter Nets, CoRL 2020.
Stars: ✭ 133 (+224.39%)
Mutual labels:  vision
monodepth
Python ROS depth estimation from RGB image based on code from the paper "High Quality Monocular Depth Estimation via Transfer Learning"
Stars: ✭ 41 (+0%)
Mutual labels:  vision
S2VT-seq2seq-video-captioning-attention
S2VT (seq2seq) video captioning with Bahdanau & Luong attention implementation in Tensorflow
Stars: ✭ 18 (-56.1%)
Mutual labels:  captioning
EfficientMORL
EfficientMORL (ICML'21)
Stars: ✭ 22 (-46.34%)
Mutual labels:  vision
DonkeyDrift
Open-source self-driving car based on DonkeyCar and programmable chassis
Stars: ✭ 15 (-63.41%)
Mutual labels:  vision
sam-textvqa
Official code for paper "Spatially Aware Multimodal Transformers for TextVQA" published at ECCV, 2020.
Stars: ✭ 51 (+24.39%)
Mutual labels:  vision
CarLens-iOS
CarLens - Recognize and Collect Cars
Stars: ✭ 124 (+202.44%)
Mutual labels:  vision
CBP
Official Tensorflow Implementation of the AAAI-2020 paper "Temporally Grounding Language Queries in Videos by Contextual Boundary-aware Prediction"
Stars: ✭ 52 (+26.83%)
Mutual labels:  vision-and-language
frc-score-detection
A program to detect FRC match scores from their livestream.
Stars: ✭ 15 (-63.41%)
Mutual labels:  vision
Final-year-project-deep-learning-models
Deep learning for freehand sketch object recognition
Stars: ✭ 22 (-46.34%)
Mutual labels:  vision
nested-transformer
Nested Hierarchical Transformer https://arxiv.org/pdf/2105.12723.pdf
Stars: ✭ 174 (+324.39%)
Mutual labels:  vision
FaceData
A macOS app to parse face landmarks from a video for GANs training
Stars: ✭ 71 (+73.17%)
Mutual labels:  vision
Opencv
📷 Computer-Vision Demos
Stars: ✭ 244 (+495.12%)
Mutual labels:  vision
X-VLM
X-VLM: Multi-Grained Vision Language Pre-Training (ICML 2022)
Stars: ✭ 283 (+590.24%)
Mutual labels:  vision-and-language
Cs231a Notes
The course notes for Stanford's CS231A course on computer vision
Stars: ✭ 230 (+460.98%)
Mutual labels:  vision
flutter-vision
iOS and Android app built with Flutter and Firebase. Includes Firebase ML Vision, Firestore, and Storage
Stars: ✭ 45 (+9.76%)
Mutual labels:  vision
Simplecv
Stars: ✭ 2,522 (+6051.22%)
Mutual labels:  vision
ReferFormer
[CVPR2022] Official Implementation of ReferFormer
Stars: ✭ 230 (+460.98%)
Mutual labels:  video-language
Opticalflow visualization
Python optical flow visualization following Baker et al. (ICCV 2007) as used by the MPI-Sintel challenge
Stars: ✭ 183 (+346.34%)
Mutual labels:  vision
mlp-mixer-pytorch
An All-MLP solution for Vision, from Google AI
Stars: ✭ 771 (+1780.49%)
Mutual labels:  vision
Attendance Using Face
Face-recognition using Siamese network
Stars: ✭ 174 (+324.39%)
Mutual labels:  vision
photonvision
PhotonVision is the free, fast, and easy-to-use computer vision solution for the FIRST Robotics Competition.
Stars: ✭ 115 (+180.49%)
Mutual labels:  vision
Arucogen
Online ArUco markers generator
Stars: ✭ 155 (+278.05%)
Mutual labels:  vision
non-contact-sleep-apnea-detection
Gihan Jayatilaka, Harshana Weligampola, Suren Sritharan, Pankayaraj Pathmanathan, Roshan Ragel and Isuru Nawinne, "Non-contact Infant Sleep Apnea Detection," 2019 14th Conference on Industrial and Information Systems (ICIIS), Kandy, Sri Lanka, 2019, pp. 260-265, doi: 10.1109/ICIIS47346.2019.9063269.
Stars: ✭ 15 (-63.41%)
Mutual labels:  vision
Nextlevel
NextLevel was initially a weekend project that has now grown into an open community of camera platform enthusiasts. The software provides foundational components for managing media recording, camera interface customization, gestural interaction customization, and image streaming on iOS. The same capabilities can also be found in apps such as Snapchat, Instagram, and Vine.
Stars: ✭ 1,940 (+4631.71%)
Mutual labels:  vision
CustomVisionMicrosoftToCoreMLDemoApp
This app recognises 3 hand signs - fist, high five and victory hand [ rock, paper, scissors basically :) ] with live feed camera. It uses a HandSigns.mlmodel which has been trained using Custom Vision from Microsoft.
Stars: ✭ 25 (-39.02%)
Mutual labels:  vision
Flowiz
Converts Optical Flow files to images and optionally compiles them to a video. Flow viewer GUI is also available. Check out mockup right from Github Pages:
Stars: ✭ 144 (+251.22%)
Mutual labels:  vision
edge-computer-vision
Edge Computer Vision Course
Stars: ✭ 41 (+0%)
Mutual labels:  vision
Cocoaai
🤖 The Cocoa Artificial Intelligence Lab
Stars: ✭ 134 (+226.83%)
Mutual labels:  vision
handbook
We're a small high-trust livelihood pod doing tech consulting within Enspiral.
Stars: ✭ 35 (-14.63%)
Mutual labels:  vision
SemanticSegmentation-Libtorch
Libtorch Examples
Stars: ✭ 38 (-7.32%)
Mutual labels:  vision
Vision
Computer Vision And Neural Network with Xamarin
Stars: ✭ 54 (+31.71%)
Mutual labels:  vision
MTL-AQA
What and How Well You Performed? A Multitask Learning Approach to Action Quality Assessment [CVPR 2019]
Stars: ✭ 38 (-7.32%)
Mutual labels:  captioning
SentimentVisionDemo
🌅 iOS11 demo application for visual sentiment prediction.
Stars: ✭ 34 (-17.07%)
Mutual labels:  vision