GZTipDM / Tipdm

Licence: apache-2.0
TipDM modeling platform, an open-source data mining tool.

Programming Languages

javascript

Projects that are alternatives of or similar to Tipdm

taller SparkR
SparkR workshop for the R Users Conference (Jornadas de Usuarios de R)
Stars: ✭ 12 (-90.77%)
Mutual labels:  data-mining, bigdata, data-analysis
Etl unicorn
Data visualization, data mining, data processing (ETL)
Stars: ✭ 156 (+20%)
Mutual labels:  data-analysis, data-mining, workflow
Ai For Security Learning
A curated collection of learning materials on security scenarios, AI-based security algorithms, and security data analysis
Stars: ✭ 986 (+658.46%)
Mutual labels:  data-analysis, data-mining
Optimus
🚚 Agile Data Preparation Workflows made easy with dask, cudf, dask_cudf and pyspark
Stars: ✭ 986 (+658.46%)
Mutual labels:  data-analysis, bigdata
Tsrepr
TSrepr: R package for time series representations
Stars: ✭ 75 (-42.31%)
Mutual labels:  data-analysis, data-mining
Dataflowjavasdk
Google Cloud Dataflow provides a simple, powerful model for building both batch and streaming parallel data processing pipelines.
Stars: ✭ 854 (+556.92%)
Mutual labels:  data-analysis, data-mining
Vectorbt
Ultimate Python library for time series analysis and backtesting at scale
Stars: ✭ 855 (+557.69%)
Mutual labels:  data-analysis, data-mining
Countly Sdk Cordova
Countly Product Analytics SDK for Cordova, Icenium and Phonegap
Stars: ✭ 69 (-46.92%)
Mutual labels:  data-analysis, bigdata
Dataproofer
A proofreader for your data
Stars: ✭ 628 (+383.08%)
Mutual labels:  data-analysis, data-mining
Spark Py Notebooks
Apache Spark & Python (pySpark) tutorials for Big Data Analysis and Machine Learning as IPython / Jupyter notebooks
Stars: ✭ 1,338 (+929.23%)
Mutual labels:  data-analysis, bigdata
Flyte
Accelerate your ML and Data workflows to production. Flyte is a production grade orchestration system for your Data and ML workloads. It has been battle tested at Lyft, Spotify, freenome and others and truly open-source.
Stars: ✭ 1,242 (+855.38%)
Mutual labels:  data-analysis, workflow
Rightmove webscraper.py
Python class to scrape data from rightmove.co.uk and return listings in a pandas DataFrame object
Stars: ✭ 125 (-3.85%)
Mutual labels:  data-analysis, data-mining
Model Describer
model-describer : Making machine learning interpretable to humans
Stars: ✭ 22 (-83.08%)
Mutual labels:  data-analysis, data-mining
Spring2017 proffosterprovost
Introduction to Data Science
Stars: ✭ 18 (-86.15%)
Mutual labels:  data-analysis, data-mining
Drugs Recommendation Using Reviews
Analyzing the Drugs Descriptions, conditions, reviews and then recommending it using Deep Learning Models, for each Health Condition of a Patient.
Stars: ✭ 35 (-73.08%)
Mutual labels:  data-analysis, data-mining
Cookbook 2nd
IPython Cookbook, Second Edition, by Cyrille Rossant, Packt Publishing 2018
Stars: ✭ 704 (+441.54%)
Mutual labels:  data-analysis, data-mining
Pycm
Multi-class confusion matrix library in Python
Stars: ✭ 1,076 (+727.69%)
Mutual labels:  data-analysis, data-mining
Spark R Notebooks
R on Apache Spark (SparkR) tutorials for Big Data analysis and Machine Learning as IPython / Jupyter notebooks
Stars: ✭ 109 (-16.15%)
Mutual labels:  data-analysis, bigdata
Elki
ELKI Data Mining Toolkit
Stars: ✭ 613 (+371.54%)
Mutual labels:  data-analysis, data-mining
Nfstream
NFStream: a Flexible Network Data Analysis Framework.
Stars: ✭ 622 (+378.46%)
Mutual labels:  data-analysis, data-mining

Introduction

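The TipDM modeling platform is an open-source data mining tool developed by Guangdong Teddy Intelligent Technology Co., Ltd. (广东泰迪智能科技股份有限公司). It provides a rich set of data preprocessing, data analysis, and data mining components that help small and medium-sized enterprises set up data mining projects quickly and process data more efficiently. We are also actively building a big data mining community that connects universities and enterprises, recommends qualified data mining talent to companies, and supports university training programs based on industry needs.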

Documentation

User documentation

Communication

Community forum

Features

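  1. Built on Python, for data mining modeling.
  2. Build data mining workflows through an intuitive drag-and-drop graphical interface, with no programming required.
  3. Supports multiple data sources, including CSV files and relational databases.
  4. Supports online preview of the results of every node in a mining workflow.
  5. Provides 40 algorithm components in 5 categories, including data preprocessing, classification, clustering, and other data mining algorithms.
  6. Supports adding and editing algorithm components, making the platform highly customizable.
  7. Provides many publicly available example data mining projects that can be created with one click and run quickly.
  8. Provides a well-developed community with data mining learning resources (data, code, models, and so on).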

Development

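Environment dependencies

Install the Java development environment

Download and install JDK 1.8.x and Apache Maven, then set the JAVA_HOME and PATH environment variables, for example by appending the following to ~/.bashrc (how environment variables are set differs between operating systems, so adapt this to your own setup):

echo 'export JAVA_HOME=~/jdk_1.8.0_171' >> ~/.bashrc
echo 'export PATH=$JAVA_HOME/bin:~/apache-maven-3.3.9/bin:$PATH' >> ~/.bashrc
. ~/.bashrc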

Check that Java and Maven are installed correctly and at the right versions by running the following commands:

tipdm: ~ devp$ javac -version
javac 1.8.0_171
tipdm: ~ devp$ mvn -version
Apache Maven 3.3.9 (bb52d8502b132ec0a5a3f4c09453c07478323dc5; 2015-11-11T00:41:47+08:00)

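If a command returns "-bash: xxx: command not found", or the version is lower than TipDM requires, confirm that the dependency is installed correctly and that the environment variables have taken effect.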
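
The version check can also be scripted. The following is a minimal sketch, not part of TipDM: the function names and the MIN_JAVAC tuple are assumptions chosen for illustration, and the only requirement taken from this document is JDK 1.8.x. It extracts the first dotted version number from a tool's output and compares it with a minimum.

```python
import re
import subprocess

# Assumed minimum, based on the "JDK 1.8.x" requirement stated above.
MIN_JAVAC = (1, 8, 0)


def parse_version(output):
    """Return the first dotted version number in `output` as a tuple of ints."""
    match = re.search(r"(\d+(?:\.\d+)+)", output)
    if match is None:
        raise ValueError("no version number found in: %r" % output)
    return tuple(int(part) for part in match.group(1).split("."))


def meets_minimum(output, minimum):
    """True if the version found in `output` is at least `minimum`."""
    return parse_version(output) >= minimum


def installed_version(command):
    """Run e.g. ["javac", "-version"]; return its version tuple, or None if the tool is missing."""
    try:
        result = subprocess.run(
            command,
            stdout=subprocess.PIPE,
            stderr=subprocess.STDOUT,  # javac 1.8 prints its version to stderr
            universal_newlines=True,
        )
    except FileNotFoundError:
        return None
    return parse_version(result.stdout)
```

Calling installed_version(["javac", "-version"]) and comparing the result with MIN_JAVAC covers both failure cases described above: a missing command (None) and a version that is too low.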

Install Python

Download Python 3.6.x and configure the environment variables.

Required libraries:

arch==4.3.1
docx==0.2.4
gensim==3.6.0
graphviz==0.10.1
jieba==0.38
jieba-fast==0.53
Keras==2.2.4
matplotlib==2.2.0
numpy==1.14.2
pandas==0.23.4
pdfminer3k==1.3.1
pyclust==0.2.0
pydot==1.2.4
python-docx==0.8.10
scikit-learn==0.19.1
scipy==0.19.1
SQLAlchemy==1.2.0
statsmodels==0.9.0
tensorflow==1.12.0
thulac==0.2.0
wordcloud==1.5.0

Install the dependencies in bulk

Paste the list above into requirements.txt (any file name works), open a command line, cd to the directory containing requirements.txt, and run:

pip install -r requirements.txt
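
After installation, it can be useful to confirm that the installed versions match the pinned ones. The sketch below is illustrative only and its function names are assumptions. It relies on the fact that both requirements.txt and the output of `pip freeze` use `name==version` lines, so one parser handles both; names are lower-cased and underscores are treated as hyphens so that, for example, `Keras` and `keras` compare as equal.

```python
def parse_pins(text):
    """Map normalized package name -> version for every 'name==version' line."""
    pins = {}
    for line in text.splitlines():
        line = line.strip()
        # Skip blanks, comments, and lines that are not exact pins.
        if not line or line.startswith("#") or "==" not in line:
            continue
        name, version = line.split("==", 1)
        pins[name.strip().lower().replace("_", "-")] = version.strip()
    return pins


def find_mismatches(required, installed):
    """Return {name: (wanted, found)} for unsatisfied pins; found is None if missing."""
    return {
        name: (wanted, installed.get(name))
        for name, wanted in required.items()
        if installed.get(name) != wanted
    }
```

A typical use would be to pass the contents of requirements.txt as `required` and the captured output of `pip freeze` as `installed`; an empty result means every pin is satisfied.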

Install PostgreSQL

Download PostgreSQL 9.4.x and install it. See also: PostgreSQL Chinese community.

Quick start

Build the project

backend

Download the source code and import it into your IDE (Eclipse or IDEA) as a Maven project.

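Data initialization

First, a PostgreSQL service must be running on the local machine and listening on port 5432 of 127.0.0.1 (a default installation and initialization of PostgreSQL listens there). Then, as the PostgreSQL administrator (usually the Linux account that initialized the database; here, the postgres account), run the .sql scripts in $TipDM_HOME/WEB-INF/classes/sql/ to initialize the metadata: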

psql -h 127.0.0.1 -p 5432 -U postgres -d tipdm_DB -f initData.sql
psql -h 127.0.0.1 -p 5432 -U postgres -d tipdm_DB -f quartz_postgres.sql
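
Before running the scripts, you may want to confirm that something is actually listening on 127.0.0.1:5432. The following sketch uses only the Python standard library; the function name is an assumption for illustration. A successful TCP connection shows only that the port is open, not that the listener is PostgreSQL or that the tipdm_DB database exists.

```python
import socket


def is_listening(host="127.0.0.1", port=5432, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within `timeout` seconds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts, and unreachable hosts.
        return False
```

If is_listening() returns False, check that the PostgreSQL service has been started and that it is configured to listen on the address and port given above.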
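
System configuration

Configuration files:

sysconfig/database.properties    Database configuration
sysconfig/dbSupport.config       Database types the system can support
sysconfig/system.properties      System settings
sysconfig/redis.properties       Redis
PyConnection.xml                 Python service (this file is in the parent directory of sysconfig)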
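
To inspect one of the .properties files from a script, a simplified reader such as the following may help. It is a sketch under stated assumptions: it handles only plain `key=value` or `key: value` lines and `#` or `!` comments, not the line continuations or escape sequences of the full Java properties format, and the keys in the usage example (`db.url`, `db.user`) are hypothetical, not TipDM's real setting names.

```python
def parse_properties(text):
    """Parse simple Java-style properties text into a dict of strings."""
    props = {}
    for line in text.splitlines():
        line = line.strip()
        # Skip blank lines and comment lines.
        if not line or line[0] in "#!":
            continue
        # Split at the first '=' or ':' so values such as JDBC URLs stay intact.
        for index, char in enumerate(line):
            if char in "=:":
                props[line[:index].strip()] = line[index + 1:].strip()
                break
    return props


# Hypothetical content; the real keys are defined in sysconfig/database.properties.
example = """
# database settings
db.url=jdbc:postgresql://127.0.0.1:5432/tipdm_DB
db.user = postgres
"""
settings = parse_properties(example)
```

Splitting at the first separator only is a deliberate choice: everything after it, including further colons in a URL, is kept as the value.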
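
Compile

cd into the source root directory and compile with Maven. The source tree is laid out as follows:

framework-common     Common module
framework-model      Data model
framework-persist    Data persistence
framework-service    Service
tipdm-server         Backend service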

If the output shows

BUILD SUCCESS
Total time: ...

the build succeeded, and the generated binary package is in the $HOME/target/ directory.

Deployment

For deployment details, see IntelliJ IDEA – Run / debug web application on Tomcat.

FAQ

http://python.tipdm.org/bzzx/index.jhtml?n=%E5%B8%B8%E8%A7%81%E9%97%AE%E9%A2%98
