The shortest yet efficient Python implementation of the sequential pattern mining algorithm PrefixSpan, closed sequential pattern mining algorithm BIDE, and generator sequential pattern mining algorithm FEAT.

Stars: ✭ 214 (+568.75%)

Mutual labels: data-mining

scikit-hubness

A Python package for hubness analysis and high-dimensional data mining

Stars: ✭ 41 (+28.13%)

Mutual labels: data-mining

Reaper

Social media scraping / data collection tool for the Facebook, Twitter, Reddit, YouTube, Pinterest, and Tumblr APIs

Stars: ✭ 240 (+650%)

Mutual labels: data-mining

Awesome Datascience

📝 An awesome Data Science repository to learn and apply for real world problems.

Stars: ✭ 17,520 (+54650%)

Mutual labels: data-mining

Deepgraph

Analyze Data with Pandas-based Networks. Documentation:

Stars: ✭ 232 (+625%)

Mutual labels: data-mining

Datascience

Curated list of Python resources for data science.

Stars: ✭ 3,051 (+9434.38%)

Mutual labels: data-mining

Orange3

🍊 📊 💡 Orange: Interactive data analysis

Stars: ✭ 3,152 (+9750%)

Mutual labels: data-mining

Automlpipeline.jl

A package that makes it trivial to create and evaluate machine learning pipeline architectures.

Stars: ✭ 223 (+596.88%)

Mutual labels: data-mining

Rule Extraction from Trees

A toolkit for extracting comprehensible rules from tree-based algorithms

Stars: ✭ 34 (+6.25%)

Mutual labels: data-mining

Amazing Feature Engineering

Feature engineering is the process of using domain knowledge to extract features from raw data via data mining techniques. These features can be used to improve the performance of machine learning algorithms. Feature engineering can be considered as applied machine learning itself.

Stars: ✭ 218 (+581.25%)

Mutual labels: data-mining

Python Projects

some python projects

Stars: ✭ 247 (+671.88%)

Mutual labels: data-mining

Semantic-Bus

object flow treatment, data transformation

Stars: ✭ 49 (+53.13%)

Mutual labels: data-mining

software-analytics

A repository with my data analysis results of software artifacts

Stars: ✭ 37 (+15.63%)

Mutual labels: data-mining

Matminer

Data mining for materials science

Stars: ✭ 251 (+684.38%)

Mutual labels: data-mining


Sugarcube

Synopsis

Sugarcube is a framework to fetch, transform and export data. A data process is described as a sequence of plugins, which are chained together to model complex workflows.
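The idea of chaining plugins in sequence can be illustrated with a minimal sketch. This is a conceptual illustration only, not Sugarcube's actual plugin API: the plugin names (`importUrls`, `addHost`, `exportUnits`), the `runPipeline` helper and the example URLs are all hypothetical.

```javascript
// Conceptual sketch only: the plugin names and pipeline shape below are
// hypothetical and do not reflect Sugarcube's real plugin API.

// A "plugin" here is a function that takes the data collected so far
// and returns the (possibly extended or transformed) data.
const importUrls = (units) => [
  ...units,
  { url: "https://example.org/a" },
  { url: "https://example.org/b" },
];

// Transform step: annotate every unit with the host name of its URL.
const addHost = (units) =>
  units.map((unit) => ({ ...unit, host: new URL(unit.url).hostname }));

// Export step: print every unit as JSON and pass the data through unchanged.
const exportUnits = (units) => {
  units.forEach((unit) => console.log(JSON.stringify(unit)));
  return units;
};

// Chain plugins in sequence: each plugin receives the previous plugin's output.
const runPipeline = (plugins, initial = []) =>
  plugins.reduce((data, plugin) => plugin(data), initial);

runPipeline([importUrls, addHost, exportUnits]);
```

Because every step shares the same input and output shape, steps can be reordered or swapped without touching the others, which is what makes a comma-separated plugin list a convenient way to describe a process.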

It is a tool designed to support journalists, non-profits, academic researchers, human rights organisations and others with investigations using online, publicly available sources (e.g. tweets, videos, public databases, websites, online databases).

Learn how to use Sugarcube on your own project.

This code is licensed under the GPL 3.

Documentation

All documentation can be found on the website.

Examples

There are more examples and explanations on the website. Here is one to get you started.

sugarcube -p http_import,media_warc,media_screenshot,elastic_export \
          -c config.json \
          -Q http_url:'https://mwatana.org/en/airstrike-on-detention-center/'

This example fetches an online article and extracts its contents and metadata, archives the website as a Web ARChive (WARC) file, takes a screenshot of the website, and stores the data in an Elasticsearch database.

Data processes like the one in the example above can be codified so that they can be repeated. Once a data process has been defined, Sugarcube makes it possible to scale and automate its operation.
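One possible way to codify the example so it can be repeated for several URLs is a small Node.js wrapper. This is a sketch under assumptions: the plugin list, the `-p`, `-c` and `-Q` flags and `config.json` are taken from the example above, while the helper names `buildArgs` and `runAll` are made up for illustration, and a `sugarcube` executable is assumed to be available on the `PATH`.

```javascript
// Sketch: repeat the example pipeline for several URLs.
// The plugin list, the -p/-c/-Q flags and config.json come from the example
// above; the helper names are hypothetical.
const PLUGINS = ["http_import", "media_warc", "media_screenshot", "elastic_export"];

// Build the argument list for one sugarcube invocation.
function buildArgs(url, configPath = "config.json") {
  return ["-p", PLUGINS.join(","), "-c", configPath, "-Q", `http_url:${url}`];
}

// Run the pipeline once per URL. Assumes a `sugarcube` binary on the PATH.
async function runAll(urls) {
  const { spawnSync } = await import("node:child_process");
  for (const url of urls) {
    const result = spawnSync("sugarcube", buildArgs(url), { stdio: "inherit" });
    if (result.status !== 0) {
      throw new Error(`sugarcube failed for ${url}`);
    }
  }
}

// Print the arguments for the example URL without running anything.
console.log(
  buildArgs("https://mwatana.org/en/airstrike-on-detention-center/").join(" ")
);
```

Calling `runAll([...])` with a list of URLs would then invoke the same process once per URL. Keeping the argument construction in its own function makes it easy to inspect the command before anything is executed.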

Testimony

Syrian Archive uses Sugarcube to archive video evidence of human rights violations in Syria. Further, Sugarcube is used to monitor human rights documentation that is taken down by social media companies. The systems and workflows developed with Syrian Archive are now being expanded to do similar work in Yemen, Sudan and other areas.

Built using Sugarcube, the Data Scores investigation tool provided evidence and insights for research into how data analytics and data-driven "scoring" were being used in the public sector of the UK to make decisions. This research was conducted by the Data Justice Lab.

License

Sugarcube is licensed under the GPL 3.0.



critocrito / sugarcube
