
aksnzhy / Xlearn

Licence: apache-2.0
High-performance, easy-to-use, and scalable machine learning (ML) package that includes linear models (LR), factorization machines (FM), and field-aware factorization machines (FFM), with Python and CLI interfaces.

Programming Languages

C++
36643 projects - #6 most used programming language
shell
77523 projects
python
139335 projects - #7 most used programming language
Makefile
30231 projects
CMake
9771 projects
c
50402 projects - #5 most used programming language

Projects that are alternatives of or similar to Xlearn

Pycm
Multi-class confusion matrix library in Python
Stars: ✭ 1,076 (-63.75%)
Mutual labels:  data-science, statistics, data-analysis
Datascience
Curated list of Python resources for data science.
Stars: ✭ 3,051 (+2.8%)
Mutual labels:  data-science, statistics, data-analysis
Datacamp
🍧 A repository that contains courses I have taken on DataCamp
Stars: ✭ 69 (-97.68%)
Mutual labels:  data-science, statistics, data-analysis
Awesome Python Data Science
Probably the best curated list of data science software in Python.
Stars: ✭ 812 (-72.64%)
Mutual labels:  data-science, statistics, data-analysis
Deeplearning Notes
Notes for Deep Learning Specialization Courses led by Andrew Ng.
Stars: ✭ 126 (-95.75%)
Mutual labels:  data-science, statistics, data-analysis
Socrat
A Dynamic Web Toolbox for Interactive Data Processing, Analysis, and Visualization
Stars: ✭ 26 (-99.12%)
Mutual labels:  data-science, statistics, data-analysis
Bayesian Cognitive Modeling In Pymc3
PyMC3 code for Lee and Wagenmakers' Bayesian Cognitive Modeling - A Practical Course
Stars: ✭ 93 (-96.87%)
Mutual labels:  data-science, statistics, data-analysis
Hyperlearn
50% faster, 50% less RAM Machine Learning. Numba rewritten Sklearn. SVD, NNMF, PCA, LinearReg, RidgeReg, Randomized, Truncated SVD/PCA, CSR Matrices all 50+% faster
Stars: ✭ 1,204 (-59.43%)
Mutual labels:  data-science, statistics, data-analysis
Sweetviz
Visualize and compare datasets, target values and associations, with one line of code.
Stars: ✭ 1,851 (-37.63%)
Mutual labels:  data-science, statistics, data-analysis
Scikit Learn
scikit-learn: machine learning in Python
Stars: ✭ 48,322 (+1528.1%)
Mutual labels:  data-science, statistics, data-analysis
Tablesaw
Java dataframe and visualization library
Stars: ✭ 2,785 (-6.17%)
Mutual labels:  data-science, statistics, data-analysis
Collapse
Advanced and Fast Data Transformation in R
Stars: ✭ 184 (-93.8%)
Mutual labels:  data-science, statistics, data-analysis
Imbalanced Learn
A Python Package to Tackle the Curse of Imbalanced Datasets in Machine Learning
Stars: ✭ 5,617 (+89.25%)
Mutual labels:  data-science, statistics, data-analysis
Pandas Profiling
Create HTML profiling reports from pandas DataFrame objects
Stars: ✭ 8,329 (+180.63%)
Mutual labels:  data-science, statistics, data-analysis
Scikit Mobility
scikit-mobility: mobility analysis in Python
Stars: ✭ 339 (-88.58%)
Mutual labels:  data-science, statistics, data-analysis
Tennis Crystal Ball
Ultimate Tennis Statistics and Tennis Crystal Ball - Tennis Big Data Analysis and Prediction
Stars: ✭ 107 (-96.39%)
Mutual labels:  data-science, statistics, data-analysis
Covid19 Severity Prediction
Extensive and accessible COVID-19 data + forecasting for counties and hospitals. 📈
Stars: ✭ 170 (-94.27%)
Mutual labels:  data-science, statistics, data-analysis
Data Science Live Book
An open source book to learn data science, data analysis and machine learning, suitable for all ages!
Stars: ✭ 193 (-93.5%)
Mutual labels:  data-science, statistics, data-analysis
Streamlit
Streamlit — The fastest way to build data apps in Python
Stars: ✭ 16,906 (+469.61%)
Mutual labels:  data-science, data-analysis
Facet
Human-explainable AI.
Stars: ✭ 269 (-90.94%)
Mutual labels:  data-science, statistics


What is xLearn?

xLearn is a high-performance, easy-to-use, and scalable machine learning package that contains linear models (LR), factorization machines (FM), and field-aware factorization machines (FFM), all of which can be used to solve large-scale machine learning problems. xLearn is especially useful on large-scale sparse data. Many real-world datasets involve high-dimensional sparse feature vectors, for example a recommendation system where the number of categories and users is on the order of millions. If you currently use liblinear, libfm, or libffm, xLearn is another choice and may be a better one.
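To make the FM model family mentioned above concrete, here is a minimal pure-Python sketch of how a second-order factorization machine scores one sparse example. It is an illustration of the standard FM formula, not xLearn's C++ implementation; the function names and the dict-based parameter layout (`w0`, `w`, `V`) are assumptions chosen for readability.

```python
from itertools import combinations


def fm_score(x, w0, w, V):
    """Score a sparse example x = {feature_index: value} with a 2nd-order FM.

    w0 is the bias, w maps feature index -> linear weight, and V maps
    feature index -> latent vector (list of k floats). All names here are
    illustrative, not part of xLearn's API.
    """
    linear = sum(w[i] * v for i, v in x.items())
    k = len(next(iter(V.values())))
    pairwise = 0.0
    # Per-factor identity: sum_{i<j} a_i * a_j = 0.5 * ((sum a)^2 - sum a^2),
    # which needs time linear in the number of non-zero features.
    for f in range(k):
        s = sum(V[i][f] * v for i, v in x.items())
        s_sq = sum((V[i][f] * v) ** 2 for i, v in x.items())
        pairwise += 0.5 * (s * s - s_sq)
    return w0 + linear + pairwise


def fm_score_naive(x, w0, w, V):
    """Same model, written as the explicit sum over all feature pairs."""
    linear = sum(w[i] * v for i, v in x.items())
    pairwise = 0.0
    for i, j in combinations(sorted(x), 2):
        dot = sum(a * b for a, b in zip(V[i], V[j]))
        pairwise += dot * x[i] * x[j]
    return w0 + linear + pairwise
```

Only the non-zero features of an example are touched, which is why this model family suits sparse, high-dimensional data. FFM follows the same idea but keeps a separate latent vector per (feature, field) pair.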

Get Started! (English)

Get Started! (Chinese)

Performance

xLearn is written in high-performance C++ with careful design and optimizations. Our system is designed to maximize CPU and memory utilization, provide cache-aware computation, and support lock-free learning. By combining these techniques, xLearn is 5x-13x faster than similar systems.

Ease-of-use

xLearn does not rely on any third-party library; users can simply clone the code and compile it with cmake. xLearn also provides simple Python and CLI interfaces for data scientists, along with many features that are widely used in machine learning and data mining competitions, such as cross-validation and early stopping.
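As a rough sketch of what the Python interface looks like, the snippet below trains an FFM model, predicts on a test file, and runs cross-validation. It follows the project's documented Python API for the 0.4.x releases as best I understand it (`create_ffm`, `setTrain`, `setValidate`, `fit`, `setTest`, `setSigmoid`, `predict`, `cv`); the helper function names, file paths, and hyperparameter values are placeholders, so check them against the official documentation before relying on them.

```python
def build_params(lr=0.2, reg_lambda=0.002, metric="acc"):
    """Hyperparameter dict for a binary classification task (values are examples)."""
    return {"task": "binary", "lr": lr, "lambda": reg_lambda, "metric": metric}


def train_and_predict(train_path, valid_path, test_path, model_path, output_path):
    """Train an FFM model and write predictions to output_path."""
    import xlearn as xl  # imported lazily so this file loads without xlearn installed

    model = xl.create_ffm()          # xl.create_fm() / xl.create_linear() for FM / LR
    model.setTrain(train_path)
    model.setValidate(valid_path)    # a validation set is what early stopping monitors
    model.fit(build_params(), model_path)

    model.setTest(test_path)
    model.setSigmoid()               # map raw scores into (0, 1)
    model.predict(model_path, output_path)


def cross_validate(train_path, folds=3):
    """Run k-fold cross-validation on the training file."""
    import xlearn as xl

    model = xl.create_ffm()
    model.setTrain(train_path)
    params = build_params()
    params["fold"] = folds
    model.cv(params)
```

A call such as `train_and_predict("./small_train.txt", "./small_test.txt", "./small_test.txt", "./model.out", "./output.txt")` assumes those files exist in the libffm text format.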

Scalability

xLearn can be used to solve large-scale machine learning problems. First, xLearn supports out-of-core training, which can handle very large data (TB scale) using only the disk of a single PC. In addition, xLearn supports distributed training, which scales beyond billions of examples across many machines by using the Parameter Server framework.
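The idea behind out-of-core training can be sketched in a few lines: stream the training file from disk in fixed-size blocks so that only one block is in memory at a time, and update the model after each example. The code below is a simplified pure-Python illustration using logistic regression with SGD on libsvm-style lines (`label index:value ...`); it is not xLearn's implementation, and every function name and default value is hypothetical.

```python
import math


def parse_libsvm_line(line):
    """Parse 'label idx:val idx:val ...' into (label, {idx: val})."""
    label, *pairs = line.split()
    feats = {}
    for pair in pairs:
        idx, val = pair.split(":")
        feats[int(idx)] = float(val)
    return float(label), feats


def iter_blocks(path, block_size):
    """Yield lists of at most block_size parsed examples, read lazily from disk."""
    block = []
    with open(path) as fh:
        for line in fh:
            if not line.strip():
                continue
            block.append(parse_libsvm_line(line))
            if len(block) == block_size:
                yield block
                block = []
    if block:
        yield block


def train_lr_out_of_core(path, epochs=5, lr=0.1, block_size=2):
    """Logistic regression trained with SGD, one disk block at a time."""
    weights = {}
    bias = 0.0
    for _ in range(epochs):
        for block in iter_blocks(path, block_size):
            for y, x in block:
                z = bias + sum(weights.get(i, 0.0) * v for i, v in x.items())
                p = 1.0 / (1.0 + math.exp(-z))
                grad = p - y  # derivative of log loss w.r.t. z, labels in {0, 1}
                bias -= lr * grad
                for i, v in x.items():
                    weights[i] = weights.get(i, 0.0) - lr * grad * v
    return bias, weights
```

Peak memory is bounded by `block_size` examples plus the weight dict, regardless of file size; each epoch re-reads the file from disk.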

How to Contribute

xLearn has been developed and used by many active community members. Your help is very valuable to make it better for everyone.

  • Contribute a fix if you find a bug in xLearn.
  • Contribute new features you want to see in xLearn.
  • Contribute to the tests to make xLearn more reliable.
  • Contribute to the documentation to make it clearer for everyone.
  • Contribute to the examples to share your experience with other users.
  • Open an issue if you run into problems during development.

Please post issues and contributions in English so that everyone can benefit from them.

Contributors (in random order)

For Enterprise Users and Call for Sponsors

If you are an enterprise user and find xLearn useful in your work, please let us know, and we will be glad to add your company logo here. We also welcome sponsors who want to help make this project better.

What's New

  • 2019-10-13 Andrew Kane added Ruby bindings for xLearn!

  • 2019-4-25 xLearn 0.4.4 version release. Main update:

    • Support Python DMatrix
    • Better Windows support
    • Fix bugs in previous version
  • 2019-3-25 xLearn 0.4.3 version release. Main update:

    • Fix bugs in previous version
  • 2019-3-12 xLearn 0.4.2 version release. Main update:

    • Release Windows version of xLearn
  • 2019-1-30 xLearn 0.4.1 version release. Main update:

    • More flexible data reader
  • 2018-11-22 xLearn 0.4.0 version release. Main update:

    • Fix bugs in previous version
    • Add online learning for xLearn
  • 2018-11-10 xLearn 0.3.8 version release. Main update:

    • Fix bugs in previous version.
    • Update early-stop mechanism.
  • 2018-11-08 xLearn reaches 2,000 stars! Congrats!

  • 2018-10-29 xLearn 0.3.7 version release. Main update:

    • Add incremental Reader, which can save 50% memory cost.
  • 2018-10-22 xLearn 0.3.5 version release. Main update:

    • Fix bugs in 0.3.4.
  • 2018-10-21 xLearn 0.3.4 version release. Main update:

    • Fix bugs in on-disk training.
    • Support new file format.
  • 2018-10-14 xLearn 0.3.3 version release. Main update:

    • Fix segmentation fault in prediction task.
    • Update early-stop mechanism.
  • 2018-09-21 xLearn 0.3.2 version release. Main update:

    • Fix bugs in previous version
    • New TXT format for model output
  • 2018-09-08 xLearn adopts a new logo.

  • 2018-09-07 The Chinese document is available now!

  • 2018-03-08 xLearn 0.3.0 version release. Main update:

    • Fix bugs in previous version
    • Solved the memory leak problem for on-disk learning
    • Support TXT model checkpoint
    • Support Scikit-Learn API
  • 2017-12-18 xLearn 0.2.0 version release. Main update:

    • Fix bugs in previous version
    • Support pip installation
    • New Documents
    • Faster FTRL algorithm
  • 2017-11-24 The first version (0.1.0) of xLearn is released!
