
Scalable stream processing platform for advanced realtime analytics on top of Kafka and Spark. LogIsland also supports MQTT and Kafka Streams (Flink is on the roadmap). The platform does complex event processing and is suitable for time series analysis. A large set of ready-to-use processors, data sources and sinks is available.

Stars: ✭ 97 (-3.96%)

Mutual labels: spark

Spark States

Custom state store providers for Apache Spark

Stars: ✭ 83 (-17.82%)

Mutual labels: spark

Big Data

🔧 Use dplyr to analyze Big Data 🐘

Stars: ✭ 93 (-7.92%)

Mutual labels: spark

Bigdata Notebook

Stars: ✭ 100 (-0.99%)

Mutual labels: spark

Almond

A Scala kernel for Jupyter

Stars: ✭ 1,354 (+1240.59%)

Mutual labels: spark

Spark Py Notebooks

Apache Spark & Python (pySpark) tutorials for Big Data Analysis and Machine Learning as IPython / Jupyter notebooks

Stars: ✭ 1,338 (+1224.75%)

Mutual labels: spark


Spark-FFM

A Spark-based implementation of Field-aware Factorization Machines (FFM). See http://www.csie.ntu.edu.tw/~cjlin/papers/ffm.pdf

To fit an FFM, the data should be formatted as

label field1:feat1:val1 field2:feat2:val2

that is, the LIBSVM data format extended with field information for each feature.
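As an illustration of the format above, here is a minimal Python sketch of a line parser. It is not part of Spark-FFM; the function name `parse_ffm_line` is hypothetical, and it assumes that field and feature identifiers are integer indices and that tokens are whitespace-separated.

```python
from typing import List, Tuple

# (field index, feature index, value); integer indices are an assumption
Feature = Tuple[int, int, float]


def parse_ffm_line(line: str) -> Tuple[float, List[Feature]]:
    """Parse one 'label field:feat:val ...' line into (label, features)."""
    tokens = line.split()
    if not tokens:
        raise ValueError("empty line")
    label = float(tokens[0])
    features: List[Feature] = []
    for token in tokens[1:]:
        # Each token carries the field, the feature and its value.
        field, feat, val = token.split(":")
        features.append((int(field), int(feat), float(val)))
    return label, features
```

For example, `parse_ffm_line("1 0:3:1.0 1:7:0.5")` yields the label `1.0` and two features, one in field 0 and one in field 1.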

Currently, we support parallel SGD and parallel AdaGrad as optimization methods, as they are more efficient in dealing with large datasets.
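For readers unfamiliar with the two optimizers, the sketch below shows the textbook single-parameter update rules for plain SGD and AdaGrad. It is only an illustration under stated assumptions: the function names, the learning rate `lr` and the smoothing term `eps` are hypothetical, and it does not show how Spark-FFM distributes the updates across partitions.

```python
import math
from typing import Tuple


def sgd_step(w: float, grad: float, lr: float) -> float:
    """Plain SGD: move against the gradient with a fixed learning rate."""
    return w - lr * grad


def adagrad_step(w: float, grad: float, accum: float, lr: float,
                 eps: float = 1e-8) -> Tuple[float, float]:
    """AdaGrad: scale the step by the root of the accumulated squared gradients.

    Returns the updated weight and the updated accumulator.
    """
    accum = accum + grad * grad
    w = w - lr * grad / (math.sqrt(accum) + eps)
    return w, accum
```

AdaGrad keeps one accumulator per parameter, so frequently updated parameters take smaller steps over time, which is often useful for sparse features.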

Users can also choose whether the FFMModel includes a global bias and one-way interactions.
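To make the role of the optional terms concrete, the following Python sketch computes an FFM score following the pairwise formulation in the paper linked above, with the global bias and one-way (linear) terms switched on only when supplied. It is not Spark-FFM's code: `ffm_score`, the dictionary layout of `latent` and the argument names are assumptions made for illustration.

```python
from typing import Dict, List, Optional, Tuple

Feature = Tuple[int, int, float]  # (field, feature, value)


def ffm_score(features: List[Feature],
              latent: Dict[Tuple[int, int], List[float]],
              linear: Optional[Dict[int, float]] = None,
              bias: Optional[float] = None) -> float:
    """Score one example; `latent[(feature, field)]` is a latent vector."""
    score = 0.0
    if bias is not None:
        # Optional global bias term.
        score += bias
    if linear is not None:
        # Optional one-way interactions: one weight per feature.
        score += sum(linear[j] * x for _, j, x in features)
    # Pairwise interactions: each feature uses the latent vector
    # associated with the *other* feature's field.
    for a in range(len(features)):
        f1, j1, x1 = features[a]
        for b in range(a + 1, len(features)):
            f2, j2, x2 = features[b]
            v1 = latent[(j1, f2)]
            v2 = latent[(j2, f1)]
            score += sum(p * q for p, q in zip(v1, v2)) * x1 * x2
    return score
```

Passing `linear=None` and `bias=None` gives the pure pairwise model; supplying either one adds the corresponding term to the score.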

Contact & Feedback

If you encounter bugs, feel free to submit an issue or pull request.


Stars: ✭ 101
