
wadhwasahil / Relation_extraction

Relation Extraction using Deep learning(CNN)


Relation_Extraction

Relation Classification via Convolutional Deep Neural Network

The code is an implementation of the paper http://www.aclweb.org/anthology/C14-1220 using tensorflow.

## Algorithm

  • I largely followed the technique from the paper above, tweaking only a few parameters such as the dimensions of the word vectors and position vectors, the optimization function, and so on.
  • The basic architecture is a convolution layer, a max-pool layer, and a final softmax layer. Convolution and max-pool layers can always be added or removed between the input layer and the final softmax layer; I used only one of each.
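The conv → max-pool → softmax pipeline described above can be sketched in plain NumPy. All dimensions and weights below are toy assumptions for illustration, not the repo's actual sizes:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy dimensions (assumptions; the paper and this repo use their own sizes).
seq_len, word_dim, pos_dim = 10, 8, 4          # sentence length, word/position embedding sizes
emb_dim = word_dim + 2 * pos_dim               # word vector + two position vectors per token
filter_width, num_filters, num_classes = 3, 6, 2

# One sentence: each token is its word embedding concatenated with position
# embeddings relative to the two entities (as in Zeng et al., 2014).
x = rng.normal(size=(seq_len, emb_dim))

# Convolution layer: slide a window of filter_width tokens over the sentence.
W = rng.normal(size=(num_filters, filter_width * emb_dim))
conv = np.array([
    np.tanh(W @ x[i:i + filter_width].ravel())
    for i in range(seq_len - filter_width + 1)
])                                             # shape: (seq_len - filter_width + 1, num_filters)

# Max pooling over time: keep one value per filter.
pooled = conv.max(axis=0)                      # shape: (num_filters,)

# Final softmax layer over the relation classes.
W_out = rng.normal(size=(num_classes, num_filters))
logits = W_out @ pooled
probs = np.exp(logits - logits.max())
probs /= probs.sum()
print(probs)                                   # a probability per relation class
```

The max-pool step is what makes the model length-independent: however long the sentence, each filter contributes exactly one feature to the softmax layer.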

## Files

  • text_cnn.py - A class implementing the model architecture. It accepts the input and contains all the layers, such as conv2d (the convolution layer) and max_pool, which process the input vector and produce predictions for each class.
  • data_helpers.py - A generic script with helpers for generating batches, loading the training data, etc.
  • train.py - Builds the input vectors from the training data, trains the model, and saves it to disk.
  • temp.py - PySpark code that fetches rows from an HBase table and predicts the class of each row using the trained model.
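The batch-generation helper in data_helpers.py might look roughly like this; the function name and signature are assumptions for illustration:

```python
import numpy as np

def batch_iter(data, batch_size, num_epochs, shuffle=True, seed=0):
    """Yield mini-batches of `data` for `num_epochs` epochs,
    reshuffling at the start of each epoch when `shuffle` is set."""
    data = np.asarray(data, dtype=object)
    rng = np.random.default_rng(seed)
    n = len(data)
    batches_per_epoch = (n + batch_size - 1) // batch_size  # ceil division
    for _ in range(num_epochs):
        idx = rng.permutation(n) if shuffle else np.arange(n)
        for b in range(batches_per_epoch):
            yield data[idx[b * batch_size:(b + 1) * batch_size]]

# Usage: 10 examples, batch size 4 -> batches of 4, 4, and 2 examples.
batches = list(batch_iter(list(range(10)), batch_size=4, num_epochs=1))
print([len(b) for b in batches])  # → [4, 4, 2]
```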

## Challenges

  • My training data is only around 7K rows, so the accuracy on the test set is around 70.34%. As the training set grows, the model should perform much better.
  • My dataset consists of inter-sentential entities linked by a cause-effect relationship; however, the model can be extended to an n-class problem.

## TODO

  • Use RNNs, perhaps LSTMs, for training.
  • Fine-tune the model.

### PySpark

PySpark can now be used with TensorFlow for online training and testing. In my case I am using PySpark for online testing.
