
Uehwan / 3-D-Scene-Graph

Licence: other
A 3D scene graph generator implemented in PyTorch.

Programming Languages

python
139335 projects - #7 most used programming language
shell
77523 projects

Projects that are alternatives of or similar to 3-D-Scene-Graph

Openbot
OpenBot leverages smartphones as brains for low-cost robots. We have designed a small electric vehicle that costs about $50 and serves as a robot body. Our software stack for Android smartphones supports advanced robotics workloads such as person following and real-time autonomous navigation.
Stars: ✭ 2,025 (+3794.23%)
Mutual labels:  robot, deeplearning
Plen 3dmodel fusion360
PLEN2's 3D model data implemented by Autodesk Fusion 360.
Stars: ✭ 24 (-53.85%)
Mutual labels:  robot, 3d-models
zhamao-framework
A coroutine-based, high-performance, and flexible chatbot & web development framework (Zhamao Framework)
Stars: ✭ 99 (+90.38%)
Mutual labels:  robot
AI-for-Security-Testing
My AI security testing projects
Stars: ✭ 34 (-34.62%)
Mutual labels:  deeplearning
AutoCar
Repo for DJI RoboMaster 2018, which detects the armor of RoboMaster robots.
Stars: ✭ 24 (-53.85%)
Mutual labels:  robot
slack-robot
Simple robot for your slack integration
Stars: ✭ 29 (-44.23%)
Mutual labels:  robot
Line-us-Programming
Some very simple examples to get you started with the Line-us API
Stars: ✭ 98 (+88.46%)
Mutual labels:  robot
urdf-rs
URDF parser using serde-xml-rs for rust
Stars: ✭ 21 (-59.62%)
Mutual labels:  robot
go-aida
[DEPRECATED] WeChat robot based on wechat-go (WeChat web API)
Stars: ✭ 71 (+36.54%)
Mutual labels:  robot
TD3-BipedalWalkerHardcore-v2
Solve BipedalWalkerHardcore-v2 with TD3
Stars: ✭ 41 (-21.15%)
Mutual labels:  robot
knime-deeplearning
KNIME Deep Learning Integration
Stars: ✭ 19 (-63.46%)
Mutual labels:  deeplearning
air writing
Online handwriting recognition using BLSTM
Stars: ✭ 26 (-50%)
Mutual labels:  deeplearning
DeepPixel
An open-source Python package for making computer vision and image processing simpler
Stars: ✭ 21 (-59.62%)
Mutual labels:  deeplearning
SLAM Qt
My small SLAM simulator to study "SLAM for dummies"
Stars: ✭ 47 (-9.62%)
Mutual labels:  robot
Deep-Reinforcement-Learning-for-Boardgames
Master's thesis project that provides a training framework for two-player games; TicTacToe and Othello are already implemented.
Stars: ✭ 17 (-67.31%)
Mutual labels:  deeplearning
Paddle-SEQ
A low-code framework for sequence data processing; a training task can be completed in as few as two lines of code!
Stars: ✭ 13 (-75%)
Mutual labels:  deeplearning
Groundbreaking-Papers
ML Research paper summaries, annotated papers and implementation walkthroughs
Stars: ✭ 90 (+73.08%)
Mutual labels:  deeplearning
conde simulator
Autonomous Driving Simulator for the Portuguese Robotics Open
Stars: ✭ 31 (-40.38%)
Mutual labels:  robot
machine-learning-notebook-series
Jupyter notebook series for machine learning and deep learning.
Stars: ✭ 14 (-73.08%)
Mutual labels:  deeplearning
dst
yet another custom data science template via cookiecutter
Stars: ✭ 59 (+13.46%)
Mutual labels:  deeplearning

3D-Scene-Graph: A Sparse and Semantic Representation of Physical Environments for Intelligent Agents

This work is based on our paper, accepted to IEEE Transactions on Cybernetics (2019). We propose a new concept, the 3D scene graph, along with a framework for constructing it. Our implementation builds on FactorizableNet and is written in PyTorch.

3D Scene Graph Construction Framework

The proposed 3D scene graph construction framework extracts relevant semantics within environments, such as object categories and relations between objects, as well as physical attributes such as 3D positions and major colors, in the process of generating 3D scene graphs for the given environments. The framework receives a sequence of observations of the environment in the form of RGB-D image frames.

For robust performance, the framework filters out unstable observations (i.e., blurry images) using the proposed adaptive blurry-image detection algorithm. Then, the framework extracts keyframe groups to avoid redundant processing of the same information; keyframe groups contain reasonably overlapping frames. Next, the framework extracts semantics and physical attributes of the environment through recognition modules. During recognition, spurious detections are rejected and missing entities are supplemented. Finally, the gathered information is fused into a 3D scene graph, and the graph is updated upon new observations.
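The pipeline described above can be sketched as a single loop. This is a hypothetical, simplified illustration: the function and field names below are illustrative stand-ins, not the repository's actual API, and the real blur detector and keyframe grouping are far more involved.

```python
# Illustrative sketch of the construction loop: filter blurry frames,
# group overlapping frames, recognize per group, fuse into the graph.
def is_blurry(frame, threshold):
    """Stand-in for the adaptive blurry-image detector."""
    return frame["sharpness"] < threshold

def build_scene_graph(frames, blur_threshold=0.5, overlap_threshold=0.6):
    graph = []            # stand-in for the fused 3D scene graph
    keyframe_group = []
    for frame in frames:
        if is_blurry(frame, blur_threshold):   # 1. drop unstable observations
            continue
        if keyframe_group and frame["overlap"] < overlap_threshold:
            # 2. group no longer overlaps enough: recognize and fuse it
            graph.append({"objects": [f["label"] for f in keyframe_group]})
            keyframe_group = []
        keyframe_group.append(frame)           # 3. extend the keyframe group
    if keyframe_group:                         # 4. fuse the final group
        graph.append({"objects": [f["label"] for f in keyframe_group]})
    return graph
```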

Requirements

Installation

Install PyTorch 0.3.1. The code has been tested only with Python 2.7 and CUDA 9.0 on Ubuntu 16.04. Running in a different environment (Python 3+ or PyTorch 0.4+) would require modifying a significant amount of code.

  1. Download 3D-Scene-Graph repository
git clone --recurse-submodules https://github.com/Uehwan/3D-Scene-Graph.git
  2. Install FactorizableNet
cd 3D-Scene-Graph/FactorizableNet

Please follow the installation instructions in the FactorizableNet repository. Follow steps 1 through 6; you can skip step 7. In step 8, download VG-DR-Net; you do not need to download the other models.

  3. Install 3D-Scene-Graph
cd 3D-Scene-Graph
touch FactorizableNet/__init__.py
ln -s ./FactorizableNet/options/ options
mkdir data
ln -s ./FactorizableNet/data/svg data/svg
ln -s ./FactorizableNet/data/visual_genome data/visual_genome
   
pip install torchtext==0.2.3
pip install setuptools pyyaml graphviz webcolors pandas matplotlib 
pip install git+https://github.com/chickenbestlover/ColorHistogram.git

Alternatively, use the installation script:

   ./build.sh
  4. Download the ScanNet dataset

In order to use the ScanNet dataset, you need to fill out an agreement to the ScanNet Terms of Use and send it to the ScanNet team at [email protected]. Once approved, they will send you a script for downloading the ScanNet dataset.

To download a specific scan (e.g. scene0000_00) using the script (the script only runs on Python 2.7):

download-scannet.py -o [directory in which to download] --id scene0000_00
(then press Enter twice)

After the download finishes, the scan is located in a new folder, scene0000_00. In that folder, the *.sens file contains the RGB-D video along with camera poses. To extract them, we use SensReader, an extraction tool provided in the ScanNet git repository.

git clone https://github.com/ScanNet/ScanNet.git
cd ScanNet/SensReader/python/
python reader.py \
   --filename [your .sens filepath]  \
   --output_path [ROOT of 3D-Scene-Graph]/data/scene0000_00/ \
   --export_depth_images \
   --export_color_images \
   --export_poses \
   --export_intrinsics
    

Example of usage

python scene_graph_tuning.py \
  --scannet_path data/scene0000_00/\
  --obj_thres 0.23\
  --thres_key 0.2\
  --thres_anchor 0.68 \
  --visualize \
  --frame_start 800 \
  --plot_graph \
  --disable_spurious \
  --gain 10 \
  --detect_cnt_thres 2 \
  --triplet_thres 0.065

Core hyper-parameters

Data settings:

  • --dataset : dataset to use, default='scannet'.
  • --scannet_path : path to a ScanNet scan, default='./data/scene0507/'.
  • --frame_start : index of the first frame to process, default=0.
  • --frame_end : index of the last frame to process, default=5000.

FactorizableNet Output Filtering Settings:

  • --obj_thres : object recognition score threshold, default=0.25.
  • --triplet_thres : triplet recognition score threshold, default=0.08.
  • --nms : NMS threshold for post-detection object NMS (a negative value disables NMS), default=0.2.
  • --triplet_nms : NMS threshold for post-detection triplet NMS (a negative value disables NMS), default=0.4.
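As a minimal illustration of the score-ordered non-maximum suppression that these thresholds control (a sketch, not the repository's implementation; it follows the convention above that a negative threshold disables NMS):

```python
def iou(a, b):
    """Intersection over union of two [x1, y1, x2, y2] boxes."""
    x1, y1 = max(a[0], b[0]), max(a[1], b[1])
    x2, y2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, x2 - x1) * max(0, y2 - y1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / float(area_a + area_b - inter)

def nms(boxes, scores, threshold):
    """Keep the highest-scoring box of every overlapping cluster."""
    if threshold < 0:                  # negative threshold: NMS disabled
        return list(range(len(boxes)))
    order = sorted(range(len(boxes)), key=lambda i: -scores[i])
    keep = []
    for i in order:
        # keep box i only if it overlaps no already-kept box too much
        if all(iou(boxes[i], boxes[j]) <= threshold for j in keep):
            keep.append(i)
    return keep
```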

Key-frame Extraction Settings:

  • --thres_key : keyframe score threshold, default=0.1.
  • --thres_anchor : anchor-frame score threshold, default=0.65.
  • --alpha : weight for the exponentially weighted summation, default=0.4.
  • --gain : gain for adaptive thresholding in blurry image detection, default=25.
  • --offset : offset for adaptive thresholding in blurry image detection, default=1.
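A hedged sketch of how adaptive blur detection of this kind can work, using a variance-of-Laplacian sharpness score. The paper defines its own formulation; the particular way the threshold below combines alpha, gain, and offset is an assumption for illustration only.

```python
def laplacian_variance(img):
    """Sharpness score: variance of a 4-neighbour discrete Laplacian
    over a 2D grayscale image given as a list of rows."""
    h, w = len(img), len(img[0])
    vals = [4 * img[y][x] - img[y-1][x] - img[y+1][x] - img[y][x-1] - img[y][x+1]
            for y in range(1, h - 1) for x in range(1, w - 1)]
    mean = sum(vals) / len(vals)
    return sum((v - mean) ** 2 for v in vals) / len(vals)

def is_blurry(img, running_score, alpha=0.4, gain=25, offset=1):
    """Compare the frame's sharpness against an adaptively updated threshold."""
    score = laplacian_variance(img)
    # exponentially weighted summation of past sharpness scores
    running_score = alpha * score + (1 - alpha) * running_score
    # assumed threshold form; the paper's exact formula differs
    threshold = running_score / gain + offset
    return score < threshold, running_score
```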

Visualization Settings:

  • --pause_time : pause interval (in seconds) after every detection, default=1.
  • --plot_graph : plot the 3D scene graph if true.
  • --visualize : enable visualization if true.
  • --format : output image format, pdf or png, default='png'.
  • --draw_color : draw color nodes in the 3D scene graph if true.
  • --save_image : save detection result images if true.
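For reference, flags like the ones listed above could be declared with argparse roughly as follows. This is a hypothetical sketch; scene_graph_tuning.py defines its own parser, and only a subset of the options is shown.

```python
import argparse

def build_parser():
    # Sketch of a parser mirroring a few of the documented flags.
    p = argparse.ArgumentParser(description="3D scene graph tuning (sketch)")
    p.add_argument("--dataset", default="scannet")
    p.add_argument("--scannet_path", default="./data/scene0507/")
    p.add_argument("--obj_thres", type=float, default=0.25)
    p.add_argument("--thres_key", type=float, default=0.1)
    p.add_argument("--thres_anchor", type=float, default=0.65)
    p.add_argument("--visualize", action="store_true")
    p.add_argument("--plot_graph", action="store_true")
    p.add_argument("--format", choices=["pdf", "png"], default="png")
    return p
```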

Result


Demo Video


Citations

Please consider citing this project in your publications if it helps your research. The following is a BibTeX reference.

@article{kim2019graph3d,
  title={3D-Scene-Graph: A Sparse and Semantic Representation of Physical Environments for Intelligent Agents},
  author={Kim, Ue-Hwan and Park, Jin-Man and Song, Taek-Jin and Kim, Jong-Hwan},
  journal={IEEE Transactions on Cybernetics},
  year={2019}
}

Acknowledgement

This work was supported by the ICT R&D program of MSIP/IITP. [2016-0-00563, Research on Adaptive Machine Learning Technology Development for Intelligent Autonomous Digital Companion]
