pulserl: Apache Pulsar client library for Erlang/Elixir
Stars: ✭ 15 (-54.55%)
Awesome Web Scraping: List of libraries, tools and APIs for web scraping and data processing.
Stars: ✭ 4,510 (+13566.67%)
prosto: Prosto is a data processing toolkit radically changing how data is processed by heavily relying on functions and operations with functions - an alternative to map-reduce and join-groupby
Stars: ✭ 54 (+63.64%)
Data Science On Gcp: Source code accompanying book: Data Science on the Google Cloud Platform, Valliappa Lakshmanan, O'Reilly 2017
Stars: ✭ 864 (+2518.18%)
Texar: Toolkit for Machine Learning, Natural Language Processing, and Text Generation, in TensorFlow. This is part of the CASL project: http://casl-project.ai/
Stars: ✭ 2,236 (+6675.76%)
Dali: A GPU-accelerated library containing highly optimized building blocks and an execution engine for data processing to accelerate deep learning training and inference applications.
Stars: ✭ 3,624 (+10881.82%)
etl: [READ-ONLY] PHP - ETL (Extract Transform Load) data processing library
Stars: ✭ 279 (+745.45%)
Broadway: Concurrent and multi-stage data ingestion and data processing with Elixir
Stars: ✭ 1,310 (+3869.7%)
alfa: ♿ Suite of open and standards-based tools for performing reliable accessibility conformance testing at scale
Stars: ✭ 75 (+127.27%)
bonobo-sqlalchemy: PREVIEW - SQL databases in Bonobo, using sqlalchemy
Stars: ✭ 23 (-30.3%)
Mdsplus: The MDSplus data management system
Stars: ✭ 47 (+42.42%)
Vaspy: Manipulating VASP files with Python.
Stars: ✭ 185 (+460.61%)
Texar Pytorch: Integrating the Best of TF into PyTorch, for Machine Learning, Natural Language Processing, and Text Generation. This is part of the CASL project: http://casl-project.ai/
Stars: ✭ 636 (+1827.27%)
processor: A simple and lightweight JavaScript data processing tool. Live demo:
Stars: ✭ 27 (-18.18%)
Xidel: Command line tool to download and extract data from HTML/XML pages or JSON-APIs, using CSS, XPath 3.0, XQuery 3.0, JSONiq or pattern matching. It can also create new or transformed XML/HTML/JSON documents.
Stars: ✭ 335 (+915.15%)
Pulsar Flink: Elastic data processing with Apache Pulsar and Apache Flink
Stars: ✭ 126 (+281.82%)
Rapidtables: Super fast list of dicts to pre-formatted tables conversion library for Python 2/3
Stars: ✭ 292 (+784.85%)
Processor: Ontology-driven Linked Data processor and server for SPARQL backends. Apache License.
Stars: ✭ 54 (+63.64%)
Bash Oneliner: A collection of handy Bash One-Liners and terminal tricks for data processing and Linux system maintenance.
Stars: ✭ 1,359 (+4018.18%)
pyGAPS: A framework for processing adsorption data and isotherm fitting
Stars: ✭ 36 (+9.09%)
Pxi: 🧚 pxi (pixie) is a small, fast, and magical command-line data processor similar to jq, mlr, and awk.
Stars: ✭ 248 (+651.52%)
Dialogpt: Large-scale pretraining for dialogue
Stars: ✭ 1,177 (+3466.67%)
Speech-Recognition: End-to-end Automatic Speech Recognition for Mandarin and English in TensorFlow
Stars: ✭ 21 (-36.36%)
traceml: Engine for ML/Data tracking, visualization, dashboards, and model UI for Polyaxon.
Stars: ✭ 445 (+1248.48%)
Cbrain: CBRAIN is a flexible Ruby on Rails framework for accessing and processing of large data on high-performance computing infrastructures.
Stars: ✭ 51 (+54.55%)
Tdm: R package for normalizing RNA-seq data to make them comparable to microarray data.
Stars: ✭ 33 (+0%)
rsgislib: Remote Sensing and GIS Software Library; python module tools for processing spatial data.
Stars: ✭ 103 (+212.12%)
Dataflowjavasdk: Google Cloud Dataflow provides a simple, powerful model for building both batch and streaming parallel data processing pipelines.
Stars: ✭ 854 (+2487.88%)
Collapse: Advanced and Fast Data Transformation in R
Stars: ✭ 184 (+457.58%)
Pandera: A light-weight, flexible, and expressive pandas data validation library
Stars: ✭ 506 (+1433.33%)
rec-core: Data pipelining service
Stars: ✭ 19 (-42.42%)
Padasip: Python Adaptive Signal Processing
Stars: ✭ 138 (+318.18%)
Eternal: 👾~ music, eternal ~ 👾
Stars: ✭ 323 (+878.79%)
perke: A keyphrase extractor for Persian
Stars: ✭ 60 (+81.82%)
Nonechucks: Deal with bad samples in your dataset dynamically, use Transforms as Filters, and more!
Stars: ✭ 304 (+821.21%)
Hub: Dataset format for AI. Build, manage, & visualize datasets for deep learning. Stream data real-time to PyTorch/TensorFlow & version-control it. https://activeloop.ai
Stars: ✭ 4,003 (+12030.3%)
ECG analysis: No description or website provided.
Stars: ✭ 32 (-3.03%)
baleen3: Baleen 3 is a data processing tool based on the Annot8 framework
Stars: ✭ 15 (-54.55%)
Bonobo: Extract Transform Load for Python 3.5+
Stars: ✭ 1,475 (+4369.7%)
SharpPulsar: One million topics client for Apache Pulsar - that is the goal!
Stars: ✭ 23 (-30.3%)
Miller: Miller is like awk, sed, cut, join, and sort for name-indexed data such as CSV, TSV, and tabular JSON
Stars: ✭ 4,633 (+13939.39%)
meta-schema: Little DSL to make data processing sane with clojure.spec and spec-tools
Stars: ✭ 25 (-24.24%)
sparklanes: A lightweight data processing framework for Apache Spark
Stars: ✭ 17 (-48.48%)
cq: Clojure Command-line Data Processor for JSON, YAML, EDN, XML and more
Stars: ✭ 111 (+236.36%)
Forte: Forte is a flexible and powerful NLP builder FOR TExt. This is part of the CASL project: http://casl-project.ai/
Stars: ✭ 89 (+169.7%)
mech: 🦾 Main repository for the Mech programming language. Start here!
Stars: ✭ 135 (+309.09%)
Amadeus: Harmonious distributed data analysis in Rust.
Stars: ✭ 240 (+627.27%)
Pulsar Spark: When Apache Pulsar meets Apache Spark
Stars: ✭ 55 (+66.67%)
parallel-corpora-tools: Tools for filtering and cleaning parallel and monolingual corpora for machine translation and other natural language processing tasks.
Stars: ✭ 35 (+6.06%)
machine-learning-data-pipeline: Pipeline module for parallel real-time data processing for machine learning models development and production purposes.
Stars: ✭ 22 (-33.33%)
Pysparkling: A pure Python implementation of Apache Spark's RDD and DStream interfaces.
Stars: ✭ 231 (+600%)