datalake-etl-pipeline: Simplified ETL process in Hadoop using Apache Spark. Includes a complete ETL pipeline for a data lake, with SparkSession extensions, DataFrame validation, Column extensions, SQL functions, and DataFrame transformations
Stars: ✭ 39 (-29.09%)
grasp: Essential NLP & ML, short & fast pure Python code
Stars: ✭ 58 (+5.45%)
isarn-sketches-spark: Routines and data structures for using isarn-sketches idiomatically in Apache Spark
Stars: ✭ 28 (-49.09%)
Awesome Spark: A curated list of awesome Apache Spark packages and resources.
Stars: ✭ 1,061 (+1829.09%)
Sentiment: AFINN-based sentiment analysis for Node.js.
Stars: ✭ 2,469 (+4389.09%)
Spark Nlp: State of the Art Natural Language Processing
Stars: ✭ 2,518 (+4478.18%)
jupyterlab-sparkmonitor: JupyterLab extension that enables monitoring launched Apache Spark jobs from within a notebook
Stars: ✭ 78 (+41.82%)
SparkTwitterAnalysis: An Apache Spark standalone application using the Spark API in Scala. The application uses Simple Build Tool (SBT) for building the project.
Stars: ✭ 29 (-47.27%)
mmtf-workshop-2018: Structural Bioinformatics Training Workshop & Hackathon 2018
Stars: ✭ 50 (-9.09%)
fink-broker: Astronomy Broker based on Apache Spark
Stars: ✭ 18 (-67.27%)
Mmlspark: Simple and Distributed Machine Learning
Stars: ✭ 2,899 (+5170.91%)
spark3D: Spark extension for processing large-scale 3D data sets: Astrophysics, High Energy Physics, Meteorology, …
Stars: ✭ 23 (-58.18%)
big data: A collection of tutorials on Hadoop, MapReduce, Spark, Docker
Stars: ✭ 34 (-38.18%)
hashformers: Hashformers is a framework for hashtag segmentation with transformers.
Stars: ✭ 18 (-67.27%)
Sparkora: Powerful, rapid, automatic EDA and feature engineering library with a very easy-to-use API 🌟
Stars: ✭ 51 (-7.27%)
Oryx: Oryx 2: Lambda architecture on Apache Spark, Apache Kafka for real-time large scale machine learning
Stars: ✭ 1,785 (+3145.45%)
pyspark-cheatsheet: PySpark Cheat Sheet - example code to help you learn PySpark and develop apps faster
Stars: ✭ 115 (+109.09%)
aut: The Archives Unleashed Toolkit is an open-source toolkit for analyzing web archives.
Stars: ✭ 111 (+101.82%)
Wirbelsturm: Wirbelsturm is a Vagrant and Puppet based tool to perform 1-click local and remote deployments, with a focus on big data tech like Kafka.
Stars: ✭ 332 (+503.64%)
Agile data code 2: Code for Agile Data Science 2.0, O'Reilly 2017, Second Edition
Stars: ✭ 413 (+650.91%)
Pyspark Stubs: Apache (Py)Spark type annotations (stub files).
Stars: ✭ 98 (+78.18%)
Spark Gotchas: Spark Gotchas. A subjective compilation of Apache Spark tips and tricks
Stars: ✭ 308 (+460%)
SA-DL: Sentiment Analysis with Deep Learning models. Implemented with TensorFlow and Keras.
Stars: ✭ 35 (-36.36%)
Awesome Pulsar: A curated list of Pulsar tools, integrations and resources.
Stars: ✭ 57 (+3.64%)
Spark: .NET for Apache® Spark™ makes Apache Spark™ easily accessible to .NET developers.
Stars: ✭ 1,721 (+3029.09%)
Spark With Python: Fundamentals of Spark with Python (using PySpark), code examples
Stars: ✭ 150 (+172.73%)
Tia: Your Advanced Twitter stalking tool
Stars: ✭ 98 (+78.18%)
Live log analyzer spark: Spark application that analyzes Apache access logs and detects anomalies. Accompanied by a Medium article.
Stars: ✭ 14 (-74.55%)
Bigdata Playground: A complete example of a big data application using: Kubernetes (kops/aws), Apache Spark SQL/Streaming/MLlib, Apache Flink, Scala, Python, Apache Kafka, Apache HBase, Apache Parquet, Apache Avro, Apache Storm, Twitter API, MongoDB, NodeJS, Angular, GraphQL
Stars: ✭ 177 (+221.82%)
geospark: Bring sf to Spark in production
Stars: ✭ 53 (-3.64%)
hyperdrive: Extensible streaming ingestion pipeline on top of Apache Spark
Stars: ✭ 31 (-43.64%)
SynapseML: Simple and Distributed Machine Learning
Stars: ✭ 3,355 (+6000%)
Kafka Storm Starter: Code examples that show how to integrate Apache Kafka 0.8+ with Apache Storm 0.9+ and Apache Spark Streaming 1.1+, while using Apache Avro as the data serialization format.
Stars: ✭ 728 (+1223.64%)
Quinn: PySpark methods to enhance developer productivity 📣 👯 🎉
Stars: ✭ 217 (+294.55%)
TextMood: A Xamarin + IoT + Azure sample that detects the sentiment of incoming text messages, performs sentiment analysis on the text, and changes the color of a Philips Hue lightbulb
Stars: ✭ 52 (-5.45%)
Python: Python
Stars: ✭ 22 (-60%)
Whatsapp-analytics: Performing sentiment analysis on WhatsApp chats.
Stars: ✭ 20 (-63.64%)
levheimcube: No description or website provided.
Stars: ✭ 11 (-80%)
pyspark-cassandra: pyspark-cassandra is a Python port of the awesome @datastax Spark Cassandra connector. Compatible with Spark 2.0, 2.1, 2.2, 2.3 and 2.4
Stars: ✭ 70 (+27.27%)
kafkaSaur: Apache Kafka client for Deno
Stars: ✭ 42 (-23.64%)
twitter sentiment challenge: The code uses the tweepy library to access the Twitter API and the TextBlob library to perform Sentiment Analysis on each Tweet.
Stars: ✭ 14 (-74.55%)
text2emoji: Predict an emoji that is associated with a text
Stars: ✭ 30 (-45.45%)
HackerNews: A .NET MAUI app for displaying the top posts on Hacker News that demonstrates text sentiment analysis gathered using artificial intelligence
Stars: ✭ 184 (+234.55%)
optimus: 🚚 Agile Data Preparation Workflows made easy with Pandas, Dask, cuDF, Dask-cuDF, Vaex and PySpark
Stars: ✭ 1,351 (+2356.36%)
text-analysis: Weaving analytical stories from text data
Stars: ✭ 12 (-78.18%)
soda-spark: Soda Spark is a PySpark library that helps you test your data in Spark DataFrames
Stars: ✭ 58 (+5.45%)