SAG-KeLP / kelp-core

Licence: Apache-2.0 license
www.kelp-ml.org

kelp-core

KeLP is the Kernel-based Learning Platform (Filice '15) developed by the Semantic Analytics Group of the University of Roma Tor Vergata.

This is the KeLP core module. It contains the infrastructure of abstract classes and interfaces needed to work with KeLP, together with some implementations of algorithms, kernels and representations that provide a basic working environment. More sophisticated components can be found in extension modules, such as kelp-additional-algorithms and kelp-additional-kernels.

KeLP is released as open-source software under the Apache 2.0 license, and the source code is available on GitHub.

### Core Structures

The core functionality of KeLP comprises the interfaces and abstract classes needed to build and extend the library. The main ones are:

  • Dataset: it models the notion of a dataset as a collection of examples
  • Example: it models a single example as a collection of representations
  • Representation: it is the base type for a generic representation
  • Label: it models a label that can be associated with an example
  • Kernel: it models the notion of kernel
  • LearningAlgorithm: it is the base type for a learning algorithm
  • PredictionFunction: it is the base type for a function that computes a prediction
  • Manipulator: it is a class providing some methods to modify data and perform simple pre-processing steps
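
To make the relationships between these types more concrete, here is a minimal sketch in plain Java. The classes `SketchExample` and `SketchDataset` are hypothetical stand-ins written for this illustration and are not the KeLP classes; they only mirror the idea that a dataset is a collection of examples, and that each example carries named representations and labels.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical, simplified stand-in for an example; not the KeLP Example class.
final class SketchExample {
    // An example carries one or more named representations and a list of labels.
    final Map<String, double[]> representations = new HashMap<>();
    final List<String> labels = new ArrayList<>();

    SketchExample withRepresentation(String name, double[] vector) {
        representations.put(name, vector);
        return this;
    }

    SketchExample withLabel(String label) {
        labels.add(label);
        return this;
    }

    boolean hasLabel(String label) {
        return labels.contains(label);
    }
}

// Hypothetical, simplified stand-in for a dataset; not the KeLP Dataset class.
final class SketchDataset {
    // A dataset is a collection of examples.
    final List<SketchExample> examples = new ArrayList<>();

    void add(SketchExample example) {
        examples.add(example);
    }

    // Counts the examples annotated with the given label.
    int countWithLabel(String label) {
        int count = 0;
        for (SketchExample e : examples) {
            if (e.hasLabel(label)) count++;
        }
        return count;
    }
}
```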

### Representations

kelp-core includes two vectorial representations that can be exploited in both linear and kernel-based learning models.

  • DenseVector: it is a vectorial representation that should be adopted in modeling dense feature vectors in a small feature space, like an embedding. It relies on EJML for an efficient implementation.

  • SparseVector: it is the best option for modeling sparse feature vectors from high-dimensional feature spaces, like a Bag-of-Words feature space. It relies on a hashmap implementation based on TROVE, in order to guarantee an efficient solution in terms of both memory usage and computation.
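
The trade-off between the two representations can be illustrated with a small sketch in plain Java. It does not use the KeLP classes, EJML or TROVE: `VectorSketch` is a hypothetical helper, with `double[]` standing in for a dense vector and `java.util.Map` standing in for a sparse one.

```java
import java.util.Map;

// Hypothetical helper for illustration; not part of KeLP.
final class VectorSketch {
    // Dense dot product: every dimension is stored, so all of them are visited.
    static double denseDot(double[] a, double[] b) {
        if (a.length != b.length) {
            throw new IllegalArgumentException("dimension mismatch");
        }
        double sum = 0.0;
        for (int i = 0; i < a.length; i++) {
            sum += a[i] * b[i];
        }
        return sum;
    }

    // Sparse dot product: only non-zero features are stored, so the cost
    // depends on the number of stored features, not on the size of the space.
    static double sparseDot(Map<String, Double> a, Map<String, Double> b) {
        // Iterate over the smaller map and look features up in the larger one.
        Map<String, Double> small = a.size() <= b.size() ? a : b;
        Map<String, Double> large = (small == a) ? b : a;
        double sum = 0.0;
        for (Map.Entry<String, Double> e : small.entrySet()) {
            Double other = large.get(e.getKey());
            if (other != null) {
                sum += e.getValue() * other;
            }
        }
        return sum;
    }
}
```

With a Bag-of-Words space of hundreds of thousands of words, a document typically activates only a few of them, which is why the map-based form is cheaper in that setting.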

### Learning Algorithms

This package contains different subtypes of the LearningAlgorithm interface. Most of them are not actual implementations; they build the hierarchy needed to instantiate the different kinds of learning algorithms. For example, BinaryLearningAlgorithm models the notion of a learning algorithm that operates with two classes, while KernelMethod models the notion of a learning algorithm based on kernel functions (e.g., Support Vector Machines).

The following actual implementations of Learning Algorithms are included:

CLASSIFICATION ALGORITHMS:

  • BinaryCSvmClassification: it is the KeLP implementation of the C-Support Vector Machine learning algorithm. It is a learning algorithm for binary classification and it relies on kernel functions. It is a port of the LibSVM implementation (Chang '11)
  • BinaryNuSvmClassification: it is the KeLP implementation of the ν-Support Vector Machine learning algorithm. It is a learning algorithm for binary classification and it relies on kernel functions. It is a port of the LibSVM implementation (Chang '11)
  • OneClassSvmClassification: it is the KeLP implementation of the One-Class Support Vector Machine learning algorithm. It is a learning algorithm for estimating the support of a high-dimensional distribution and it relies on kernel functions. The model is acquired by considering positive examples only. It is useful in anomaly detection (a.k.a. novelty detection). It is a port of the LibSVM implementation (Chang '11)
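
All these kernel-based classifiers produce a model that scores a new example through kernel evaluations against the support vectors. As a rough illustration, the following plain-Java sketch computes the textbook SVM decision function f(x) = Σ coeff_i · K(sv_i, x) + b. `SvmDecisionSketch` and its parameter names are assumptions made for this example; it is not the KeLP or LibSVM code.

```java
import java.util.function.ToDoubleBiFunction;

// Hypothetical helper for illustration; not part of KeLP or LibSVM.
final class SvmDecisionSketch {
    // f(x) = sum_i coeffs[i] * K(supportVectors[i], x) + bias,
    // where coeffs[i] plays the role of alpha_i * y_i in a binary SVM.
    static double score(double[][] supportVectors, double[] coeffs, double bias,
                        ToDoubleBiFunction<double[], double[]> kernel, double[] x) {
        double f = bias;
        for (int i = 0; i < supportVectors.length; i++) {
            f += coeffs[i] * kernel.applyAsDouble(supportVectors[i], x);
        }
        return f;
    }

    // The predicted class is the sign of the score.
    static boolean isPositive(double score) {
        return score > 0.0;
    }

    // A simple base kernel: the dot product between dense vectors.
    static double linear(double[] a, double[] b) {
        double sum = 0.0;
        for (int i = 0; i < a.length; i++) {
            sum += a[i] * b[i];
        }
        return sum;
    }
}
```

Because the example only interacts with the data through `kernel`, the same scoring code works unchanged with any kernel function.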

REGRESSION ALGORITHMS:

  • EpsilonSvmRegression: It implements the ε-SVR learning algorithm discussed in (Chang '11)

CLUSTERING ALGORITHMS:

  • KernelBasedKMeansEngine: it is the implementation of the clustering algorithm described in (Kulis '09). It is basically a kernel-based extension of the standard k-means clustering algorithm.
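
The key idea behind a kernel-based k-means is that the distance between an example and a cluster centroid can be computed from kernel values alone, without ever building the centroid explicitly: ‖φ(x) − μ_C‖² = K(x, x) − (2/|C|) Σ_j K(x, x_j) + (1/|C|²) Σ_j Σ_l K(x_j, x_l), with j and l ranging over the members of C. The sketch below shows this standard formula and one assignment step over a precomputed Gram matrix. `KernelKMeansSketch` is a hypothetical class written for this illustration and is not necessarily how KernelBasedKMeansEngine is organized.

```java
// Hypothetical helper for illustration; not the KeLP KernelBasedKMeansEngine.
final class KernelKMeansSketch {
    // Squared distance in feature space between example i and the centroid
    // of the cluster whose members are listed in `members`.
    // gram[a][b] holds the kernel value K(x_a, x_b).
    static double squaredDistanceToCentroid(double[][] gram, int i, int[] members) {
        int n = members.length;
        double cross = 0.0;   // sum of K(x_i, x_j) over members j
        double within = 0.0;  // sum of K(x_j, x_l) over member pairs (j, l)
        for (int j : members) {
            cross += gram[i][j];
            for (int l : members) {
                within += gram[j][l];
            }
        }
        return gram[i][i] - 2.0 * cross / n + within / ((double) n * n);
    }

    // Assignment step: index of the non-empty cluster whose centroid is closest
    // to example i, or -1 if every cluster is empty.
    static int nearestCluster(double[][] gram, int i, int[][] clusters) {
        int best = -1;
        double bestDistance = Double.POSITIVE_INFINITY;
        for (int c = 0; c < clusters.length; c++) {
            if (clusters[c].length == 0) continue;
            double d = squaredDistanceToCentroid(gram, i, clusters[c]);
            if (d < bestDistance) {
                bestDistance = d;
                best = c;
            }
        }
        return best;
    }
}
```

With a linear kernel on the one-dimensional points 0, 1 and 10, the cluster {0, 1} has centroid 0.5, and the formula gives (10 − 0.5)² = 90.25 for the third point, matching the ordinary Euclidean computation.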

META ALGORITHMS:

  • OneVsAllLearningAlgorithm: implementation of the One-Vs-All scheme for extending binary classification algorithms to multi-class classification problems.
  • OneVsOneLearningAlgorithm: implementation of the One-Vs-One scheme for extending binary classification algorithms to multi-class classification problems.
  • MultiLabelClassificationLearning: implementation of a multi-label learning strategy for extending binary classification algorithms to multi-label classification tasks.
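
The three strategies differ only in how the outputs of the binary classifiers are turned into a final decision. The plain-Java sketch below shows the usual decision rules, assuming the binary scores have already been computed. `MultiClassSketch`, its method names and its tie-breaking behaviour (the first class wins) are assumptions made for this illustration, not the KeLP implementation.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for illustration; not part of KeLP.
final class MultiClassSketch {
    // One-Vs-All: one binary classifier per class; the highest score wins.
    static String oneVsAll(String[] labels, double[] scores) {
        int best = 0;
        for (int i = 1; i < scores.length; i++) {
            if (scores[i] > scores[best]) best = i;
        }
        return labels[best];
    }

    // One-Vs-One: one binary classifier per pair of classes (i < j).
    // pairScores[i][j] > 0 is a vote for class i, otherwise for class j.
    static String oneVsOne(String[] labels, double[][] pairScores) {
        int[] votes = new int[labels.length];
        for (int i = 0; i < labels.length; i++) {
            for (int j = i + 1; j < labels.length; j++) {
                if (pairScores[i][j] > 0) votes[i]++; else votes[j]++;
            }
        }
        int best = 0;
        for (int i = 1; i < votes.length; i++) {
            if (votes[i] > votes[best]) best = i;
        }
        return labels[best];
    }

    // Multi-label: every class whose binary classifier fires is returned.
    static List<String> multiLabel(String[] labels, double[] scores) {
        List<String> predicted = new ArrayList<>();
        for (int i = 0; i < scores.length; i++) {
            if (scores[i] > 0) predicted.add(labels[i]);
        }
        return predicted;
    }
}
```

One-Vs-All needs one binary classifier per class, while One-Vs-One needs one per pair of classes, i.e. n(n − 1)/2 classifiers for n classes.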

### Prediction Functions

The PredictionFunction interface models the notion of a function used to make a prediction. Different classes are subtypes of PredictionFunction, depending on the role they have in classification or regression schemes. For example, BinaryClassifier extends Classifier, a prediction function used to derive discrete classifications.

### Kernel functions

Kernel is the base type for modeling a kernel function. Subclasses of Kernel model the different types of kernel functions available.

DirectKernel

It models a kernel that operates directly on a specific representation (e.g., a linear kernel or a tree kernel extends this class)

  • LinearKernel: it performs a dot product between explicit feature vectors, like DenseVector or SparseVector.

KernelComposition

It models a kernel function that operates on the result produced by another kernel function.

  • PolynomialKernel: it applies a polynomial operation to the result of another kernel
  • RbfKernel: it is the implementation of the Radial Basis Function Kernel (a.k.a. Gaussian Kernel)
  • NormalizationKernel: it normalizes the result of another kernel so that it ranges in [-1, 1]
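
Since each composition only needs the values of the underlying kernel, all three can be written as wrappers around a base kernel. The sketch below is a plain-Java illustration under stated assumptions: `SketchKernel` and `CompositionSketch` are hypothetical names, the polynomial form (a · K + b)^d and its parameters are one common parameterization, and the RBF wrapper uses the squared distance induced by the base kernel, K(x, x) + K(y, y) − 2 K(x, y). None of this is taken from the KeLP source.

```java
// Hypothetical functional interface for this sketch; not the KeLP Kernel class.
interface SketchKernel {
    double apply(double[] x, double[] y);
}

// Hypothetical helper for illustration; not part of KeLP.
final class CompositionSketch {
    // Base kernel: plain dot product between dense vectors.
    static SketchKernel linear() {
        return (x, y) -> {
            double sum = 0.0;
            for (int i = 0; i < x.length; i++) {
                sum += x[i] * y[i];
            }
            return sum;
        };
    }

    // Polynomial composition: (a * K(x, y) + b)^degree.
    static SketchKernel polynomial(SketchKernel base, double a, double b, int degree) {
        return (x, y) -> Math.pow(a * base.apply(x, y) + b, degree);
    }

    // RBF composition: exp(-gamma * d^2), where d^2 is the squared distance
    // in the feature space of the base kernel.
    static SketchKernel rbf(SketchKernel base, double gamma) {
        return (x, y) -> {
            double squaredDistance =
                    base.apply(x, x) + base.apply(y, y) - 2.0 * base.apply(x, y);
            return Math.exp(-gamma * squaredDistance);
        };
    }

    // Normalization: K(x, y) / sqrt(K(x, x) * K(y, y)); 0 if either norm is 0.
    static SketchKernel normalized(SketchKernel base) {
        return (x, y) -> {
            double norm = Math.sqrt(base.apply(x, x) * base.apply(y, y));
            return norm == 0.0 ? 0.0 : base.apply(x, y) / norm;
        };
    }
}
```

For a valid (positive semi-definite) base kernel, the Cauchy-Schwarz inequality bounds the normalized value by 1 in absolute value, which is where the [-1, 1] range comes from.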

KernelCombination

It models a kernel function that combines other kernel functions.

  • LinearKernelCombination: it applies a weighted linear combination of kernels. The sum of two kernels corresponds to the concatenation of their respective feature spaces.
  • KernelMultiplication: it multiplies the results of different kernels. The product of two kernels corresponds to the Cartesian product of their feature spaces.
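
Both combinations can be sketched in a few lines of plain Java. `ComboKernel` and `CombinationSketch` are hypothetical names introduced for this illustration; they are not the KeLP classes.

```java
// Hypothetical functional interface for this sketch; not the KeLP Kernel class.
interface ComboKernel {
    double apply(double[] x, double[] y);
}

// Hypothetical helper for illustration; not part of KeLP.
final class CombinationSketch {
    // Weighted linear combination: sum_i weights[i] * K_i(x, y).
    static ComboKernel weightedSum(ComboKernel[] kernels, double[] weights) {
        if (kernels.length != weights.length) {
            throw new IllegalArgumentException("one weight per kernel is required");
        }
        return (x, y) -> {
            double sum = 0.0;
            for (int i = 0; i < kernels.length; i++) {
                sum += weights[i] * kernels[i].apply(x, y);
            }
            return sum;
        };
    }

    // Multiplication: prod_i K_i(x, y).
    static ComboKernel product(ComboKernel[] kernels) {
        return (x, y) -> {
            double result = 1.0;
            for (ComboKernel k : kernels) {
                result *= k.apply(x, y);
            }
            return result;
        };
    }
}
```

In practice the combined kernels often operate on different representations of the same example, for instance one kernel over a Bag-of-Words vector and another over an embedding.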

KernelOnPairs

It is a kernel operating on instances of ExamplePair, i.e., examples naturally modeled as pairs, such as question and answer in Q/A, or text and hypothesis in textual entailment.

  • PreferenceKernel: it is the implementation of the Preference Kernel proposed in (Shen '03) and widely used in learning-to-rank tasks

  • PairwiseSumKernel: it implements the following formula: K(<x1,x2>, <y1,y2>) = BK(x1, y1) + BK(x2, y2) + BK(x1, y2) + BK(x2, y1), where BK is a base kernel (see (Filice '15b))

  • PairwiseProductKernel: it implements the following formula: K(<x1,x2>, <y1,y2>) = BK(x1, y1) * BK(x2, y2) + BK(x1, y2) * BK(x2, y1), where BK is a base kernel (see (Filice '15b))

  • UncrossedPairwiseSumKernel: it implements the following formula: K(<x1,x2>, <y1,y2>) = BK(x1, y1) + BK(x2, y2), where BK is a base kernel (see (Filice '15b))

  • UncrossedPairwiseProductKernel: it implements the following formula: K(<x1,x2>, <y1,y2>) = BK(x1, y1) * BK(x2, y2), where BK is a base kernel (see (Filice '15b))

  • BestPairwiseAlignmentKernel: it implements the following formula: K(<x1,x2>, <y1,y2>) = softmax(BK(x1, y1) * BK(x2, y2), BK(x1, y2) * BK(x2, y1)), where BK is a base kernel (see (Filice '15b))
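
The pairwise formulas above translate directly into code. The following plain-Java sketch evaluates each of them for a given base kernel BK. `PairKernelSketch` is a hypothetical class, not the KeLP implementation. Two points are assumptions not stated in this README: the preference kernel is written in its usual form BK(x1, y1) + BK(x2, y2) − BK(x1, y2) − BK(x2, y1), and softmax is taken to be the smooth approximation of the maximum, softmax(a, b) = (1/c) · log(e^(c·a) + e^(c·b)), with c left as a parameter.

```java
import java.util.function.ToDoubleBiFunction;

// Hypothetical helper for illustration; not part of KeLP.
final class PairKernelSketch {
    private final ToDoubleBiFunction<double[], double[]> bk;  // base kernel BK

    PairKernelSketch(ToDoubleBiFunction<double[], double[]> baseKernel) {
        this.bk = baseKernel;
    }

    private double k(double[] a, double[] b) {
        return bk.applyAsDouble(a, b);
    }

    double pairwiseSum(double[] x1, double[] x2, double[] y1, double[] y2) {
        return k(x1, y1) + k(x2, y2) + k(x1, y2) + k(x2, y1);
    }

    double pairwiseProduct(double[] x1, double[] x2, double[] y1, double[] y2) {
        return k(x1, y1) * k(x2, y2) + k(x1, y2) * k(x2, y1);
    }

    double uncrossedSum(double[] x1, double[] x2, double[] y1, double[] y2) {
        return k(x1, y1) + k(x2, y2);
    }

    double uncrossedProduct(double[] x1, double[] x2, double[] y1, double[] y2) {
        return k(x1, y1) * k(x2, y2);
    }

    // Assumed form of the preference kernel (see the lead-in above).
    double preference(double[] x1, double[] x2, double[] y1, double[] y2) {
        return k(x1, y1) + k(x2, y2) - k(x1, y2) - k(x2, y1);
    }

    double bestAlignment(double[] x1, double[] x2, double[] y1, double[] y2, double c) {
        return softmax(k(x1, y1) * k(x2, y2), k(x1, y2) * k(x2, y1), c);
    }

    // Smooth maximum (1/c) * log(e^(c*a) + e^(c*b)), computed in a numerically
    // stable way by factoring out the larger argument.
    static double softmax(double a, double b, double c) {
        double m = Math.max(a, b);
        return m + Math.log(Math.exp(c * (a - m)) + Math.exp(c * (b - m))) / c;
    }
}
```

The crossed variants are symmetric with respect to the order of the elements inside each pair, which suits tasks such as paraphrase detection; the uncrossed variants keep the roles of the two elements distinct, as with a question and its answer.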

## Including KeLP in your project

If you want to use the core functionality of KeLP, you can include it in your Maven project by adding the following repositories to your pom file:

<repositories>
    <repository>
        <id>kelp_repo_snap</id>
        <name>KeLP Snapshots repository</name>
        <releases>
            <enabled>false</enabled>
            <updatePolicy>always</updatePolicy>
            <checksumPolicy>warn</checksumPolicy>
        </releases>
        <snapshots>
            <enabled>true</enabled>
            <updatePolicy>always</updatePolicy>
            <checksumPolicy>fail</checksumPolicy>
        </snapshots>
        <url>http://sag.art.uniroma2.it:8081/artifactory/kelp-snapshot/</url>
    </repository>
    <repository>
        <id>kelp_repo_release</id>
        <name>KeLP Stable repository</name>
        <releases>
            <enabled>true</enabled>
            <updatePolicy>always</updatePolicy>
            <checksumPolicy>warn</checksumPolicy>
        </releases>
        <snapshots>
            <enabled>false</enabled>
            <updatePolicy>always</updatePolicy>
            <checksumPolicy>fail</checksumPolicy>
        </snapshots>
        <url>http://sag.art.uniroma2.it:8081/artifactory/kelp-release/</url>
    </repository>
</repositories>

Then, the Maven dependency for the kelp-core project is:

<dependency>
    <groupId>it.uniroma2.sag.kelp</groupId>
    <artifactId>kelp-core</artifactId>
    <version>2.1.0</version>
</dependency>

Alternatively, thanks to the modularity of KeLP, you can include one of the following modules, which already contain a dependency on kelp-core:

  • kelp-additional-kernels: it contains additional kernel functions, such as the Tree Kernels or the Graph Kernels;

  • kelp-additional-algorithms: it contains additional learning algorithms, such as the KeLP Java implementation of Liblinear or online learning algorithms like Passive Aggressive;

  • kelp-full: it is a complete package of KeLP that contains the entire set of existing modules, i.e. additional kernel functions and algorithms.

## How to cite KeLP

If you find KeLP useful in your research, please cite the following paper:

@InProceedings{filice-EtAl:2015:ACL-IJCNLP-2015-System-Demonstrations,
	author = {Filice, Simone and Castellucci, Giuseppe and Croce, Danilo and Basili, Roberto},
	title = {KeLP: a Kernel-based Learning Platform for Natural Language Processing},
	booktitle = {Proceedings of ACL-IJCNLP 2015 System Demonstrations},
	month = {July},
	year = {2015},
	address = {Beijing, China},
	publisher = {Association for Computational Linguistics and The Asian Federation of Natural Language Processing},
	pages = {19--24},
	url = {http://www.aclweb.org/anthology/P15-4004}
}

## References

(Chang '11) Chih-Chung Chang and Chih-Jen Lin. LIBSVM: A library for support vector machines. ACM Transactions on Intelligent Systems and Technology, 2:27:1-27:27, 2011. Original code available at LibSVM

(Filice '15) Simone Filice, Giuseppe Castellucci, Danilo Croce, Roberto Basili. KeLP: a Kernel-based Learning Platform for Natural Language Processing. In: Proceedings of ACL: System Demonstrations. Beijing, China (July 2015)

(Filice '15b) Simone Filice, Giovanni Da San Martino and Alessandro Moschitti. Structural Representations for Learning Relations between Pairs of Texts. In Proc. of ACL 2015.

(Kulis '09) Brian Kulis, Sugato Basu, Inderjit Dhillon, and Raymond Mooney. Semi-supervised graph clustering: a kernel approach. Machine Learning, 74(1):1-22, January 2009.

(Shen '03) L. Shen and A. K. Joshi. An SVM based voting algorithm with application to parse reranking. In Proc. of CoNLL. 2003

## Useful Links

KeLP site: http://www.kelp-ml.org

SAG site: http://sag.art.uniroma2.it

Source code hosted at GitHub: https://github.com/SAG-KeLP
