JUrban / deepmath

Licenses: GPL-3.0 (COPYING.GPL), CC-BY-SA (COPYING.CC-BY-SA); see also COPYING.
deep learning for math

Programming languages: Perl, Prolog
This is a dataset of 32524 examples for premise selection, based on the Mizar40 experiments [1]. The proof data (without the negative part added here) also served for the DeepMath experiments described in [2].

For each conjecture the dataset contains a file with roughly equal numbers of necessary and unnecessary facts. This balancing simplifies experiments with learning methods such as neural networks.

In nnhpdata/* the examples are in this format:

C prefixformula
+ prefixformula
...
+ prefixformula
- prefixformula
...
- prefixformula

In nndata/* the examples are in this format:

C tptpformula
+ tptpformula
...
+ tptpformula
- tptpformula
...
- tptpformula

Here tptpformula is the standard first-order TPTP representation, and prefixformula is a curried prefix format using de Bruijn indices.

C is the conjecture, + marks the formulas needed for an ATP proof of C, and - marks unnecessary formulas that were nevertheless ranked highly by a k-nearest-neighbor algorithm trained on the proofs.
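Both formats share the same line-marker structure, so a single parser covers them. The sketch below is illustrative (the function name is ours, not part of the repository); it splits one example file into the conjecture, the positive facts, and the negative facts:

```python
def parse_example(text):
    """Split one nndata/nnhpdata example file into
    (conjecture, positives, negatives), keyed by the C/+/- markers."""
    conjecture, positives, negatives = None, [], []
    for line in text.splitlines():
        if not line.strip():
            continue
        marker, formula = line[0], line[1:].strip()
        if marker == "C":
            conjecture = formula
        elif marker == "+":
            positives.append(formula)
        elif marker == "-":
            negatives.append(formula)
    return conjecture, positives, negatives
```

The formulas are kept as raw strings; feeding them to a learner would additionally require tokenizing the TPTP or prefix syntax.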

To divide the data into training and testing sets, randomly select a portion (e.g. 90%) of the files to train on and test on the remaining files.
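A seeded file-level split as described above might look like this (a minimal sketch; the helper name is ours):

```python
import random

def split_files(filenames, train_frac=0.9, seed=0):
    """Randomly split the per-conjecture example files into
    training and testing lists, reproducibly via a fixed seed."""
    files = sorted(filenames)              # fix the order before shuffling
    random.Random(seed).shuffle(files)
    cut = int(len(files) * train_frac)
    return files[:cut], files[cut:]
```

Splitting at the file level keeps all facts for a given conjecture on the same side of the split, which avoids leaking a conjecture's proof into training.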

The accuracy of a simple count-based predictor evaluated incrementally on the whole dataset is 71.75%.
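The count-based predictor is not specified here; one plausible reading, which is our assumption rather than the authors' exact method, is to walk the example files in order, predict each fact's label from label counts gathered on the files seen so far, and then update the counts:

```python
from collections import Counter

def incremental_count_accuracy(examples):
    """Assumed reconstruction of a count-based baseline, evaluated
    incrementally: for each (positives, negatives) example in turn,
    predict each fact's label from counts over earlier examples,
    then add the current example to the counts."""
    pos_counts, neg_counts = Counter(), Counter()
    correct = total = 0
    for positives, negatives in examples:
        for fact in positives:
            # predict '+' when the fact was needed at least as often as not
            if pos_counts[fact] >= neg_counts[fact]:
                correct += 1
            total += 1
        for fact in negatives:
            if neg_counts[fact] > pos_counts[fact]:
                correct += 1
            total += 1
        pos_counts.update(positives)
        neg_counts.update(negatives)
    return correct / total if total else 0.0
```

Such a baseline only exploits how often a fact recurs across proofs, which is why it is a useful floor for comparing learned premise-selection models.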

The mizar40 and code directories contain the data (proofs, statements, etc.) and the code used to generate the nndata and nnhpdata examples.

[1] http://dx.doi.org/10.1007/s10817-015-9330-8

[2] https://arxiv.org/abs/1606.04442
