Nlpython
This repository contains code for Natural Language Processing using the Python scripting language. All the code relates to my book, "Python Natural Language Processing".
Stars: ✭ 265 (+783.33%)
gan tensorflow
Automatic feature engineering using Generative Adversarial Networks using TensorFlow.
Stars: ✭ 48 (+60%)
Amazing Feature Engineering
Feature engineering is the process of using domain knowledge to extract features from raw data via data mining techniques. These features can be used to improve the performance of machine learning algorithms. Feature engineering can be considered as applied machine learning itself.
Stars: ✭ 218 (+626.67%)
feature engine
Feature engineering package with sklearn-like functionality
Stars: ✭ 758 (+2426.67%)
autoencoders tensorflow
Automatic feature engineering using deep learning and Bayesian inference using TensorFlow.
Stars: ✭ 66 (+120%)
Kaggle Competitions
There are plenty of courses and tutorials that can help you learn machine learning from scratch, but here on GitHub I want to solve some Kaggle competitions as a comprehensive workflow with Python packages. After reading, you can use this workflow as a template to solve other real problems.
Stars: ✭ 86 (+186.67%)
Machine Learning Workflow With Python
A comprehensive set of ML techniques with Python: Define the Problem, Specify Inputs & Outputs, Data Collection, Exploratory Data Analysis, Data Preprocessing, Model Design, Training, Evaluation
Stars: ✭ 157 (+423.33%)
50-days-of-Statistics-for-Data-Science
This repository consists of a 50-day program. All the statistics required for a complete understanding of data science will be uploaded to this repository.
Stars: ✭ 19 (-36.67%)
mistql
A miniature lisp-like language for querying JSON-like structures. Tuned for client-side ML feature extraction.
Stars: ✭ 260 (+766.67%)
Deltapy
DeltaPy - Tabular Data Augmentation (by @firmai)
Stars: ✭ 344 (+1046.67%)
Tsfel
An intuitive library to extract features from time series
Stars: ✭ 202 (+573.33%)
Rcpi
Molecular informatics toolkit with a comprehensive integration of bioinformatics and cheminformatics tools for drug discovery.
Stars: ✭ 22 (-26.67%)
tsflex
Flexible time series feature extraction & processing
Stars: ✭ 252 (+740%)
featurewiz
Use advanced feature engineering strategies and select the best features from your data set with a single line of code.
Stars: ✭ 229 (+663.33%)
Nni
An open source AutoML toolkit for automating the machine learning lifecycle, including feature engineering, neural architecture search, model compression and hyper-parameter tuning.
Stars: ✭ 10,698 (+35560%)
Blurr
Data transformations for the ML era
Stars: ✭ 96 (+220%)
fastknn
Fast k-Nearest Neighbors Classifier for Large Datasets
Stars: ✭ 64 (+113.33%)
Awesome Feature Engineering
A curated list of resources dedicated to Feature Engineering Techniques for Machine Learning
Stars: ✭ 433 (+1343.33%)
Feature Selection
Feature selector based on a self-selected algorithm, loss function and validation method
Stars: ✭ 534 (+1680%)
Manorm
A robust model for quantitative comparison of ChIP-Seq data sets.
Stars: ✭ 16 (-46.67%)
Pretzel
JavaScript full-stack framework for Big Data visualisation and analysis
Stars: ✭ 26 (-13.33%)
Tfidf
Simple TF-IDF Library
Stars: ✭ 6 (-80%)
Pybedgraph
A Python package for fast operations on 1-dimensional genomic signal tracks
Stars: ✭ 17 (-43.33%)
Taxadb
🐣 locally query the NCBI taxonomy
Stars: ✭ 26 (-13.33%)
Speechpy
💬 SpeechPy - A Library for Speech Processing and Recognition: http://speechpy.readthedocs.io/en/latest/
Stars: ✭ 833 (+2676.67%)
16gt
Simultaneous detection of SNPs and Indels using a 16-genotype probabilistic model
Stars: ✭ 26 (-13.33%)
Scipipe
Robust, flexible and resource-efficient pipelines using Go and the command line
Stars: ✭ 826 (+2653.33%)
Galaxy
Data intensive science for everyone.
Stars: ✭ 812 (+2606.67%)
Seqtk
Toolkit for processing sequences in FASTA/Q formats
Stars: ✭ 799 (+2563.33%)
Uncurl python
UNCURL is a tool for single cell RNA-seq data analysis.
Stars: ✭ 13 (-56.67%)
Metacache
Memory-efficient, fast & precise taxonomic classification system for metagenomic read mapping
Stars: ✭ 26 (-13.33%)
Meyda
Audio feature extraction for JavaScript.
Stars: ✭ 792 (+2540%)
Pykaldi
A Python wrapper for Kaldi
Stars: ✭ 756 (+2420%)
Tiledb Vcf
Efficient variant-call data storage and retrieval library using the TileDB storage library.
Stars: ✭ 26 (-13.33%)
Multiqc
Aggregate results from bioinformatics analyses across many samples into a single report.
Stars: ✭ 708 (+2260%)
Tuna
🐟 A streaming ETL for fish
Stars: ✭ 11 (-63.33%)
Pyensemblrest
A wrapper for the EnsEMBL REST API
Stars: ✭ 25 (-16.67%)
Hail
Scalable genomic data analysis.
Stars: ✭ 706 (+2253.33%)
Fxt
A large scale feature extraction tool for text-based machine learning
Stars: ✭ 25 (-16.67%)
React Plotly.js
A plotly.js React component from Plotly 📈
Stars: ✭ 701 (+2236.67%)
Featexp
Feature exploration for supervised learning
Stars: ✭ 688 (+2193.33%)
Sv Callers
Snakemake-based workflow for detecting structural variants in WGS data
Stars: ✭ 28 (-6.67%)
Sevenbridges R
Seven Bridges API Client, CWL Schema, Meta Schema, and SDK Helper in R
Stars: ✭ 27 (-10%)
Scanpy
Single-Cell Analysis in Python. Scales to >1M cells.
Stars: ✭ 858 (+2760%)
Ema
Fast & accurate alignment of barcoded short-reads
Stars: ✭ 24 (-20%)
Efficientnet Pytorch
A PyTorch implementation of EfficientNet and EfficientNetV2 (coming soon!)
Stars: ✭ 6,685 (+22183.33%)
Cromwell
Scientific workflow engine designed for simplicity & scalability. Trivially transition from one-off use cases to massive scale production environments.
Stars: ✭ 655 (+2083.33%)
Nucleus
Python and C++ code for reading and writing genomics data.
Stars: ✭ 657 (+2090%)
Hyperparameter hunter
Easy hyperparameter optimization and automatic result saving across machine learning algorithms and libraries
Stars: ✭ 648 (+2060%)
Scispacy
A full spaCy pipeline and models for scientific/biomedical documents.
Stars: ✭ 855 (+2750%)
Fusiondirect.jl
(No maintenance) Detect gene fusion directly from raw fastq files
Stars: ✭ 23 (-23.33%)