knime-r: KNIME Interactive R Statistics Integration
Stars: ✭ 18 (-83.78%)
oci-cloudera: Terraform module to deploy Cloudera on Oracle Cloud Infrastructure (OCI)
Stars: ✭ 20 (-81.98%)
PHAT: Pathogen-Host Analysis Tool - A modern Next-Generation Sequencing (NGS) analysis platform
Stars: ✭ 17 (-84.68%)
beam-site: Apache Beam Site
Stars: ✭ 28 (-74.77%)
splink: Implementation of Fellegi-Sunter's canonical model of record linkage in Apache Spark, including an EM algorithm to estimate parameters
Stars: ✭ 181 (+63.06%)
volkscv: A Python toolbox for computer vision research and projects
Stars: ✭ 58 (-47.75%)
go-mnd: Magic number detector for Go.
Stars: ✭ 153 (+37.84%)
hadoopoffice: HadoopOffice - Analyze Office documents using the Hadoop ecosystem (Spark/Flink/Hive)
Stars: ✭ 56 (-49.55%)
ggshakeR: An analysis and visualization R package that works with publicly available soccer data
Stars: ✭ 69 (-37.84%)
shamash: Autoscaling for Google Cloud Dataproc
Stars: ✭ 31 (-72.07%)
skein: A tool and library for easily deploying applications on Apache YARN
Stars: ✭ 128 (+15.32%)
Bitcoin Analysis-Python: Bitcoin is a widely used cryptocurrency for the digital market. It is decentralised, which means it is not owned by a government or any other company. Transactions are simple and easy because it doesn't belong to any country. Records are stored in the blockchain. The Bitcoin price is variable and Bitcoin is widely used, so it is important to predict the price of it f…
Stars: ✭ 42 (-62.16%)
cummings.ee: A collection of the work of Edward Estlin Cummings, as it enters the public domain.
Stars: ✭ 32 (-71.17%)
character-extraction: Extracts character names from a text file and performs analysis of text sentences containing the names.
Stars: ✭ 40 (-63.96%)
Spark-Ar: Resources for Spark AR
Stars: ✭ 43 (-61.26%)
pandapower gui: A graphical user interface for the open source pandapower load flow program. [I was so inexperienced when I started this, but maybe we can try again]
Stars: ✭ 28 (-74.77%)
platys-modern-data-platform: Support for generating modern platforms dynamically with services such as Kafka, Spark, Streamsets, HDFS, ....
Stars: ✭ 35 (-68.47%)
GroupDocs.Classification-for-.NET: GroupDocs.Classification-for-.NET samples and showcase (text and document classification and sentiment analysis)
Stars: ✭ 38 (-65.77%)
ohloh-ui: Web Application for the Ohloh Stack.
Stars: ✭ 72 (-35.14%)
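wiki: An encyclopedia of guides for all kinds of activities, from DIY performance art to DIY Socratic dialogue, from DIY-ing a ceremony to DIY-ing a skipped class. diy💔 is a non-code open source project incubated by 706.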
Stars: ✭ 49 (-55.86%)
decaylanguage: Package to parse decay files, describe and convert particle decays between digital representations.
Stars: ✭ 34 (-69.37%)
hypothetical: Hypothesis and statistical testing in Python
Stars: ✭ 49 (-55.86%)
xxhadoop: Data analysis using Hadoop/Spark/Storm/ElasticSearch/MachineLearning etc. These are my daily notes, code, and demos. Don't fork, just star!
Stars: ✭ 37 (-66.67%)
census: 📜 Automated review of open source software projects
Stars: ✭ 111 (+0%)
TwitterSearch2Gephi: This Windows CLI app lets you collect data from Twitter via the REST API and convert it into a CSV data set that can be used with Gephi. Other social networks (Reddit, YouTube, WWW) are also supported.
Stars: ✭ 21 (-81.08%)
dmarc-viewer: Django-based web app to visually analyze DMARC aggregate reports
Stars: ✭ 51 (-54.05%)
spark-stringmetric: Spark functions to run popular phonetic and string matching algorithms
Stars: ✭ 51 (-54.05%)
jobAnalytics and search: The JobAnalytics system consumes data from multiple sources and provides valuable information to both job hunters and recruiters.
Stars: ✭ 25 (-77.48%)
booknlp: BookNLP, a natural language processing pipeline for books
Stars: ✭ 636 (+472.97%)
ceja: PySpark phonetic and string matching algorithms
Stars: ✭ 24 (-78.38%)
analysis-net: Static analysis framework for .NET programs.
Stars: ✭ 19 (-82.88%)
BigCLAM-ApacheSpark: Overlapping community detection in large-scale networks using the BigCLAM model, built on Apache Spark
Stars: ✭ 40 (-63.96%)
visualize-data-with-python: A Jupyter notebook using some standard techniques for data science and data engineering to analyze data for the 2017 flooding in Houston, TX.
Stars: ✭ 60 (-45.95%)
story-generator: Budget Visualization Tool to explore and analyse major fiscal indicators across various states in India
Stars: ✭ 17 (-84.68%)
scarf: Toolkit for highly memory-efficient analysis of single-cell RNA-Seq, scATAC-Seq and CITE-Seq data. Analyze atlas-scale datasets with millions of cells on a laptop.
Stars: ✭ 54 (-51.35%)
disq: A library for manipulating bioinformatics sequencing formats in Apache Spark
Stars: ✭ 29 (-73.87%)
ros hadoop: Hadoop splittable InputFormat for ROS. Process rosbag files with Hadoop, Spark, and other HDFS-compatible systems.
Stars: ✭ 92 (-17.12%)
appdata-environment-desktop: A selection of scripts and the manual for Privacy International's data interception environment
Stars: ✭ 70 (-36.94%)
corc: An ORC File Scheme for the Cascading data processing platform.
Stars: ✭ 14 (-87.39%)
hotmap: WebGL Heatmap Viewer for Big Data and Bioinformatics
Stars: ✭ 13 (-88.29%)
polars: Fast multi-threaded DataFrame library in Rust | Python | Node.js
Stars: ✭ 6,368 (+5636.94%)
Unitor: Tool for analysing and disassembling any Unity game. Supports both Mono and IL2CPP.
Stars: ✭ 31 (-72.07%)
TraduXio: A participative platform for translators of cultural texts
Stars: ✭ 19 (-82.88%)
pypar: Efficient and scalable parallelism using the message passing interface (MPI) to handle big data and computationally intensive problems.
Stars: ✭ 66 (-40.54%)
pathpy: pathpy is an open source Python package for the modeling and analysis of pathways and temporal networks using higher-order and multi-order graphical models
Stars: ✭ 124 (+11.71%)
RemoteShuffleService: Celeborn provides an elastic and high-performance service for shuffle and spilled data.
Stars: ✭ 262 (+136.04%)
couchdb-mango: Mirror of Apache CouchDB Mango
Stars: ✭ 34 (-69.37%)