1249 open source projects that are alternatives to or similar to taller_SparkR

Spring2017 proffosterprovost
Introduction to Data Science
Stars: ✭ 18 (+50%)
genie
Genie: A Fast and Robust Hierarchical Clustering Algorithm (this R package has now been superseded by genieclust)
Stars: ✭ 21 (+75%)
Model Describer
model-describer : Making machine learning interpretable to humans
Stars: ✭ 22 (+83.33%)
leaflet heatmap
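A simple visualization of Huzhou call data. Assuming the data volume is too large to draw a heatmap directly in the browser, the heatmap-rendering step is moved to offline computation and analysis. The data is first processed in parallel with Apache Spark, the heatmap is then rendered with Apache Spark, and leafletjs loads the OpenStreetMap layer and the heatmap layer for good interactivity. With rendering currently implemented in Apache Spark, the parallel computation is slower than single-machine computation, either because Apache Spark is not well suited to this kind of computation or because I did not design the algorithm well. The Apache Spark heatmap rendering and computation code is at https://github.com/yuanzhaokang/ParallelizeHeatmap.git.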
Stars: ✭ 13 (+8.33%)
Mutual labels:  bigdata, hdfs, data-analysis
genieclust
Genie++ Fast and Robust Hierarchical Clustering with Noise Point Detection - for Python and R
Stars: ✭ 34 (+183.33%)
Tipdm
TipDM modeling platform, an open-source data mining tool.
Stars: ✭ 130 (+983.33%)
Mutual labels:  data-mining, bigdata, data-analysis
Nmflibrary
MATLAB library for non-negative matrix factorization (NMF): Version 1.8.1
Stars: ✭ 153 (+1175%)
Urs
Universal Reddit Scraper - A comprehensive Reddit scraping command-line tool written in Python.
Stars: ✭ 275 (+2191.67%)
Mutual labels:  data-mining, data-analysis
Pyod
A Python Toolbox for Scalable Outlier Detection (Anomaly Detection)
Stars: ✭ 5,083 (+42258.33%)
Mutual labels:  data-mining, data-analysis
Data Science With Ruby
Practical Data Science with Ruby based tools.
Stars: ✭ 549 (+4475%)
Mutual labels:  data-mining, data-analysis
Vectorbt
Ultimate Python library for time series analysis and backtesting at scale
Stars: ✭ 855 (+7025%)
Mutual labels:  data-mining, data-analysis
Spotify-Song-Recommendation-ML
UC Berkeley team's submission for RecSys Challenge 2018
Stars: ✭ 70 (+483.33%)
Mutual labels:  data-mining, data-analysis
Lagoujob
Job data mining repo for lagou.com
Stars: ✭ 256 (+2033.33%)
Mutual labels:  data-mining, data-analysis
Ai Learn
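AI learning roadmap: nearly 200 hands-on case studies and projects, with free accompanying course materials, from a zero-background introduction to job-ready practice. Covers popular areas including Python, mathematics, machine learning, data analysis, deep learning, computer vision, natural language processing, PyTorch tensorflow machine-learning, deep-learning data-analysis data-mining mathematics data-science artificial-intelligence python tensorflow tensorflow2 caffe keras pytorch algorithm numpy pandas matplotlib seaborn nlp cv.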
Stars: ✭ 4,387 (+36458.33%)
Mutual labels:  data-mining, data-analysis
Knowage Server
Knowage is the professional open source suite for modern business analytics over traditional sources and big data systems.
Stars: ✭ 276 (+2200%)
Mutual labels:  data-mining, data-analysis
Cookbook 2nd
IPython Cookbook, Second Edition, by Cyrille Rossant, Packt Publishing 2018
Stars: ✭ 704 (+5766.67%)
Mutual labels:  data-mining, data-analysis
Dataflowjavasdk
Google Cloud Dataflow provides a simple, powerful model for building both batch and streaming parallel data processing pipelines.
Stars: ✭ 854 (+7016.67%)
Mutual labels:  data-mining, data-analysis
Elki
ELKI Data Mining Toolkit
Stars: ✭ 613 (+5008.33%)
Mutual labels:  data-mining, data-analysis
Php Ml
PHP-ML - Machine Learning library for PHP
Stars: ✭ 7,900 (+65733.33%)
Data mining
The Ruby DataMining Gem is a small collection of several data mining algorithms
Stars: ✭ 10 (-16.67%)
Pycm
Multi-class confusion matrix library in Python
Stars: ✭ 1,076 (+8866.67%)
Mutual labels:  data-mining, data-analysis
Dex
Dex : The Data Explorer -- A data visualization tool written in Java/Groovy/JavaFX capable of powerful ETL and publishing web visualizations.
Stars: ✭ 1,238 (+10216.67%)
Mutual labels:  data-mining, data-analysis
Python Machine Learning Book
The "Python Machine Learning (1st edition)" book code repository and info resource
Stars: ✭ 11,428 (+95133.33%)
Sourced Ce
source{d} Community Edition (CE)
Stars: ✭ 153 (+1175%)
Mutual labels:  data-mining, data-analysis
Pipeline
the `pipeline` shell command
Stars: ✭ 168 (+1300%)
Mutual labels:  data-mining, data-analysis
Amazing Feature Engineering
Feature engineering is the process of using domain knowledge to extract features from raw data via data mining techniques. These features can be used to improve the performance of machine learning algorithms. Feature engineering can be considered as applied machine learning itself.
Stars: ✭ 218 (+1716.67%)
Mutual labels:  data-mining, data-analysis
twitter-analytics-wrapper
A simple Python wrapper to download tweets data from the Twitter Analytics platform. Particularly interesting for the impressions metrics that are unavailable on current Twitter API. Also works for the videos data.
Stars: ✭ 44 (+266.67%)
Mutual labels:  data-mining, data-analysis
Data Science Resources
👨🏽‍🏫You can learn about what data science is and why it's important in today's modern world. Are you interested in data science?🔋
Stars: ✭ 171 (+1325%)
Mutual labels:  data-mining, data-analysis
Deepgraph
Analyze Data with Pandas-based Networks. Documentation:
Stars: ✭ 232 (+1833.33%)
Mutual labels:  data-mining, data-analysis
Suod
(MLSys' 21) An Acceleration System for Large-scale Unsupervised Heterogeneous Outlier Detection (Anomaly Detection)
Stars: ✭ 245 (+1941.67%)
Lihang algorithms
Implementations of the algorithms in Li Hang's book 《统计学习方法》 (Statistical Learning Methods), done two ways: in Python and with sklearn
Stars: ✭ 263 (+2091.67%)
popular restaurants from officials
A project to find genuinely good restaurants by analyzing the business expense records of Seoul city officials
Stars: ✭ 22 (+83.33%)
Mutual labels:  data-mining, data-analysis
Pydataroad
open source for wechat-official-account (ID: PyDataLab)
Stars: ✭ 302 (+2416.67%)
Mutual labels:  data-mining, data-analysis
Machine Learning Books
book
Stars: ✭ 290 (+2316.67%)
Cookbook 2nd Code
Code of the IPython Cookbook, Second Edition, by Cyrille Rossant, Packt Publishing 2018 [read-only repository]
Stars: ✭ 541 (+4408.33%)
Mutual labels:  data-mining, data-analysis
Dataproofer
A proofreader for your data
Stars: ✭ 628 (+5133.33%)
Mutual labels:  data-mining, data-analysis
Nfstream
NFStream: a Flexible Network Data Analysis Framework.
Stars: ✭ 622 (+5083.33%)
Mutual labels:  data-mining, data-analysis
xgboost-smote-detect-fraud
Can we predict accurately on skewed data? Which sampling techniques can be used? Which models and techniques can be used in this scenario? Find the answers in this code pattern!
Stars: ✭ 59 (+391.67%)
Ai For Security Learning
A curated collection of learning materials on security scenarios, AI-based security algorithms, and security data analysis
Stars: ✭ 986 (+8116.67%)
Mutual labels:  data-mining, data-analysis
Drugs Recommendation Using Reviews
Analyzing the Drugs Descriptions, conditions, reviews and then recommending it using Deep Learning Models, for each Health Condition of a Patient.
Stars: ✭ 35 (+191.67%)
Mutual labels:  data-mining, data-analysis
Tsrepr
TSrepr: R package for time series representations
Stars: ✭ 75 (+525%)
Mutual labels:  data-mining, data-analysis
Nanny
A tidyverse suite for (pre-) machine-learning: cluster, PCA, permute, impute, rotate, redundancy, triangular, smart-subset, abundant and variable features.
Stars: ✭ 17 (+41.67%)
Mutual labels:  rstudio, data-analysis
Machine learning for good
Machine learning fundamentals lesson in interactive notebooks
Stars: ✭ 142 (+1083.33%)
Mutual labels:  data-mining, data-analysis
Awesome Ts Anomaly Detection
List of tools & datasets for anomaly detection on time-series data.
Stars: ✭ 2,027 (+16791.67%)
Mutual labels:  data-mining, data-analysis
Etl unicorn
Data visualization, data mining, data processing, ETL
Stars: ✭ 156 (+1200%)
Mutual labels:  data-mining, data-analysis
PracticalMachineLearning
A collection of ML-related material including notebooks, code, and a curated list of useful resources such as books and software. Almost everything mentioned here is free (as in speech, not free food) or open source.
Stars: ✭ 60 (+400%)
Mutual labels:  data-mining, data-analysis
python-notebooks
A collection of Jupyter Notebooks used in conferences or just to have some snippets.
Stars: ✭ 14 (+16.67%)
Mutual labels:  data-mining, data-analysis
Pyss3
A Python package implementing a new machine learning model for text classification with visualization tools for Explainable AI
Stars: ✭ 191 (+1491.67%)
Python practice of data analysis and mining
Companion source code and data for the book 《Python数据分析与挖掘实战》 (Python Data Analysis and Mining in Practice)
Stars: ✭ 172 (+1333.33%)
Mutual labels:  data-mining, data-analysis
Datascience
Curated list of Python resources for data science.
Stars: ✭ 3,051 (+25325%)
Mutual labels:  data-mining, data-analysis
Rightmove webscraper.py
Python class to scrape data from rightmove.co.uk and return listings in a pandas DataFrame object
Stars: ✭ 125 (+941.67%)
Mutual labels:  data-mining, data-analysis
greycat
GreyCat - Data Analytics, Temporal data, What-if, Live machine learning
Stars: ✭ 104 (+766.67%)
bigdata-doc
Big data study notes, learning roadmap, and a collection of technical case studies.
Stars: ✭ 37 (+208.33%)
Mutual labels:  bigdata, hdfs
heidi
heidi : tidy data in Haskell
Stars: ✭ 24 (+100%)
Mutual labels:  data-mining, data-analysis
Heart disease prediction
Heart Disease prediction using 5 algorithms
Stars: ✭ 43 (+258.33%)
online-course-recommendation-system
Built on data from Pluralsight's course API fetched results. Works with model trained with K-means unsupervised clustering algorithm.
Stars: ✭ 31 (+158.33%)
Igel
a delightful machine learning tool that allows you to train, test, and use models without writing code
Stars: ✭ 2,956 (+24533.33%)
rworkshops
Materials for R Workshops
Stars: ✭ 43 (+258.33%)
Mutual labels:  rstudio, data-analysis
Papers Literature Ml Dl Rl Ai
Highly cited and useful papers related to machine learning, deep learning, AI, game theory, reinforcement learning
Stars: ✭ 1,341 (+11075%)
PaperWeeklyAI
📚「@MaiweiAI」Studying papers in the fields of computer vision, NLP, and machine learning algorithms every week.
Stars: ✭ 50 (+316.67%)