Alternatives and detailed information of 2019-feature-selection

pat-s / 2019-feature-selection

License: MIT (LICENSE.md); an additional LICENSE file of unknown type was also found

Research project

Projects that are alternatives of or similar to 2019-feature-selection

GPS

code for "A global pathway selection algorithm for the reduction of detailed chemical kinetic mechanisms" (Gao et al., CNF'16)

Stars: ✭ 18 (-30.77%)

Mutual labels: feature-selection

bess

Best Subset Selection algorithm for Regression, Classification, Count, Survival analysis

Stars: ✭ 14 (-46.15%)

Mutual labels: feature-selection

L0Learn

Efficient Algorithms for L0 Regularized Learning

Stars: ✭ 74 (+184.62%)

Mutual labels: feature-selection

feature engine

Feature engineering package with sklearn like functionality

Stars: ✭ 758 (+2815.38%)

Mutual labels: feature-selection

FIFA-2019-Analysis

A project based on the FIFA World Cup 2019 that analyzes the performance and efficiency of teams, players, countries and related aspects using data analysis and data visualization

Stars: ✭ 28 (+7.69%)

Mutual labels: feature-selection

skrobot

skrobot is a Python module for designing, running and tracking Machine Learning experiments / tasks. It is built on top of scikit-learn framework.

Stars: ✭ 22 (-15.38%)

Mutual labels: feature-selection

GeneticAlgorithmForFeatureSelection

Search for the best feature subset for your classification model

Stars: ✭ 82 (+215.38%)

Mutual labels: feature-selection

exemplary-ml-pipeline

Exemplary, annotated machine learning pipeline for any tabular data problem.

Stars: ✭ 23 (-11.54%)

Mutual labels: feature-selection

Reinforcement-Learning-Feature-Selection

Feature selection for maximizing expected cumulative reward

Stars: ✭ 27 (+3.85%)

Mutual labels: feature-selection

Ball

Statistical Inference and Sure Independence Screening via Ball Statistics

Stars: ✭ 22 (-15.38%)

Mutual labels: feature-selection

Mlr

Machine Learning in R

Stars: ✭ 1,542 (+5830.77%)

Mutual labels: feature-selection

rcompendium

📦 Create a package or compendium structure

Stars: ✭ 26 (+0%)

Mutual labels: research-compendium

GraphOfDocs

GraphOfDocs: Representing multiple documents as a single graph

Stars: ✭ 13 (-50%)

Mutual labels: feature-selection

adapt

Awesome Domain Adaptation Python Toolbox

Stars: ✭ 46 (+76.92%)

Mutual labels: feature-selection

msda

Library for multi-dimensional, multi-sensor, uni/multivariate time series data analysis, unsupervised feature selection, unsupervised deep anomaly detection, and prototype of explainable AI for anomaly detector

Stars: ✭ 80 (+207.69%)

Mutual labels: feature-selection

qbso-fs

Python implementation of QBSO-FS : a Reinforcement Learning based Bee Swarm Optimization metaheuristic for Feature Selection problem.

Stars: ✭ 47 (+80.77%)

Mutual labels: feature-selection

thesis

PhD thesis

Stars: ✭ 25 (-3.85%)

Mutual labels: research-compendium

NVTabular

NVTabular is a feature engineering and preprocessing library for tabular data designed to quickly and easily manipulate terabyte scale datasets used to train deep learning based recommender systems.

Stars: ✭ 797 (+2965.38%)

Mutual labels: feature-selection

mrmr

mRMR (minimum-Redundancy-Maximum-Relevance) for automatic feature selection at scale.

Stars: ✭ 170 (+553.85%)

Mutual labels: feature-selection

pomdp-intro

📓 A separate repo for first pomdp paper (AmNat)

Stars: ✭ 17 (-34.62%)

Mutual labels: research-compendium

Monitoring forest health using hyperspectral imagery: Does feature selection improve the performance of machine-learning techniques?

Research study

See https://pat-s.github.io/2019-feature-selection/ for a detailed description including HTML result documents.

Project structure

📔 code/: R scripts

📔 docs/00-manuscripts/ieee: LaTeX manuscripts

📔 R/: R functions

📔 _drake.R: {drake} config file. Specifies execution order of all steps to reproduce this study.

📔 analysis/: Reporting documents (R Markdown)

📔 docs/: HTML docs created via {workflowr} using the .Rmd sources from the analysis/ directory.

The data is hosted at Zenodo and automatically downloaded and processed when invoking the workflow via drake::r_make().

Reproducibility

This study relies heavily on the R packages {drake}, {renv} and {workflowr} to streamline workflow execution, manage R package versions and build the research website that accompanies the study, respectively.

Calling drake::r_make() from the repository root starts the creation of the R objects used in this study (including the data download from Zenodo). Individual or intermediate objects can be computed by naming them explicitly in drake_config(targets = <target name>) in _drake.R.
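As a rough illustration of restricting a build to one target, the sketch below shows what a minimal _drake.R could look like. It is not the repository's actual file: the plan and the target names raw and result are hypothetical stand-ins for the project's real plan.

```r
# _drake.R (sketch only; the real file in this repository is more extensive)
library(drake)

# Hypothetical toy plan standing in for the project's actual drake plan.
plan <- drake_plan(
  raw = c(1, 2, 3),
  result = mean(raw)
)

# r_make() sources _drake.R and expects it to end with a drake_config() call.
# `targets` limits the build to the named target(s) and their dependencies.
drake_config(plan, targets = "result")
```

With such a file in place, drake::r_make() would build only result and the upstream target raw; leaving out the targets argument builds every target in the plan.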

While most targets are cheap to compute, the modeling targets are very expensive. They were run on a High-Performance-Computing (HPC) system, and attempting to create them on a local desktop machine is not recommended.

Known Issues

Parts of this work (and some targets) depend on downloading Sentinel-2 images, for which the R package {getSpatialData} was used. Because the initial 2019 version of {getSpatialData} contained outdated/non-working functionality, the code had to be refactored to the latest package version in November 2020. Since that refactoring, the Sentinel-2 download is broken.

This issue does not affect the recreation of the targets used for the scientific manuscript.

Creating targets with {drake}

(Before creating any target/object, make sure to call renv::restore() to install all required packages.)

Calling r_make() creates the targets listed in drake_config(targets = <target>) in _drake.R, using the additional drake settings specified in that file.
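The two steps above can be sketched as a short R session. This assumes the session is started in the repository root and that the {renv} and {drake} packages (plus {callr}, which r_make() uses internally) are available.

```r
# Run from the repository root.

# 1. Install the package versions recorded in the project's renv lockfile.
renv::restore()

# 2. Build the targets configured in _drake.R in a fresh, clean R session.
drake::r_make()
```

Because r_make() runs in a separate R process, objects in the interactive session do not influence the build.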

Important: If you have access to a Slurm cluster, set options(clustermq.scheduler = "slurm") in _drake.R (around line 73).
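A minimal sketch of the scheduler setting is shown below. Only the clustermq.scheduler option comes from the text above; the template path and the number of jobs are assumptions for illustration and need to match your cluster.

```r
# Fragment of _drake.R (sketch; surrounding code omitted)

# Tell {clustermq} to submit workers through Slurm.
options(clustermq.scheduler = "slurm")

# Assumption: a Slurm template file is often needed as well.
# The file name below is hypothetical.
# options(clustermq.template = "slurm_clustermq.tmpl")

# Assumption: the plan is then distributed via the clustermq backend, e.g.
# drake_config(plan, parallelism = "clustermq", jobs = 4)
```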

Required disk space

The data/ folder will hold about 5.5 GB of data.

Important intermediate targets

Out of the 400+ intermediate targets/objects in this project, the following are considered important, i.e. you might want to create them or inspect them in more detail.

task_reduced_cor: List of all mlr tasks used for benchmarking.
bm_aggregated: Aggregated benchmark results of all models using a 1 meter buffer for hyperspectral data extraction.
eda_wfr: Creates the report which shows Exploratory Data Analysis (EDA) plots and tables.
eval_performance_wfr: Creates the report which evaluates the model performances.
spectral_signatures_wfr: Creates the report which inspects the spectral signatures of the hyperspectral data.
feature_importance_wfr: Creates the report which inspects the feature importance of variables.
filter_correlations_wfr: Creates the report which inspects correlations among filter methods.
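Once a target has been built, it can be pulled from the drake cache for inspection. The sketch below assumes it is run from the repository root, that the default .drake/ cache exists there, and that the named targets have already been created.

```r
library(drake)

# Return the value of one target from the cache.
bm <- readd(bm_aggregated)

# Load a target into the current session under its own name.
loadd(task_reduced_cor)

# Take a first look at the structure of both objects.
str(bm, max.level = 1)
str(task_reduced_cor, max.level = 1)
```

readd() returns the object, whereas loadd() assigns it in the calling environment, which is convenient when several targets are needed at once.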

Note that most reports require some or all of the fitted models. Creating these (e.g. the target benchmark_no_models) is costly: it takes several days on an HPC system and considerably longer on a single machine.
