surajr / Url Classification

Machine learning to classify malicious (spam) and benign URLs

Labels

jupyter-notebook machine-learning security phishing internet classifier

Phishing URL Classification

Malicious Web sites are a cornerstone of Internet criminal activities. These Web sites contain various kinds of unwanted content, such as spam-advertised products, phishing pages, and dangerous "drive-by" exploits that infect a visitor's system with malware. The most influential approaches to the malicious URL problem are manually constructed blacklists, in which all known malicious web pages' URLs are listed, and client-side systems that analyze the content or behavior of a Web site as it is visited.

The disadvantage of the blacklisting approach is that every URL has to be looked up in the list, which can be very large given the number of web sites on the Internet. The list also cannot be kept up to date, because new web links appear every hour.

This system uses machine-learning techniques to classify a URL as either safe or unsafe in real time, without needing to download the web page.

The system uses the following algorithms:

[Random Forest](https://en.wikipedia.org/wiki/Random_forest)
[Logistic Regression](https://en.wikipedia.org/wiki/Logistic_regression)
[Decision Trees](https://en.wikipedia.org/wiki/Decision_tree_learning)
[Gradient Boosting]
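
As a rough illustration of how these four algorithms could be compared, the sketch below trains each one with scikit-learn and reports test accuracy. The choice of scikit-learn is an assumption, and so are the helper names `make_toy_data` and `compare_models`, the four feature columns, and the hyperparameters. The data is synthetic and generated only so the example runs; it is not the project's dataset and the scores say nothing about the project's results.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier, RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier


def make_toy_data(n=400, seed=0):
    """Generate a synthetic feature matrix (hypothetical, for illustration only)."""
    rng = np.random.default_rng(seed)
    y = rng.integers(0, 2, size=n)  # 1 = "malicious", 0 = "benign"
    # Made-up tendency: rows labelled 1 get longer URLs and more keyword hits.
    url_length = rng.normal(40 + 25 * y, 10)
    domain_length = rng.normal(12 + 6 * y, 4)
    has_ip = (rng.random(n) < 0.02 + 0.3 * y).astype(float)
    keyword_hits = rng.poisson(0.2 + 1.0 * y).astype(float)
    X = np.column_stack([url_length, domain_length, has_ip, keyword_hits])
    return X, y


def compare_models(X, y, seed=0):
    """Fit each classifier on a train split and return its test accuracy."""
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.25, random_state=seed, stratify=y
    )
    models = {
        "Random Forest": RandomForestClassifier(n_estimators=100, random_state=seed),
        "Logistic Regression": LogisticRegression(max_iter=1000),
        "Decision Tree": DecisionTreeClassifier(random_state=seed),
        "Gradient Boosting": GradientBoostingClassifier(random_state=seed),
    }
    scores = {}
    for name, model in models.items():
        model.fit(X_train, y_train)
        scores[name] = accuracy_score(y_test, model.predict(X_test))
    return scores


if __name__ == "__main__":
    X, y = make_toy_data()
    for name, score in compare_models(X, y).items():
        print(f"{name}: {score:.3f}")
```

Holding the split and the random seeds fixed keeps the comparison repeatable, so differences in accuracy come from the models rather than from the partition.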

The system currently works only on lexical features (simple text features of a URL), which include:

Length of URL
Domain Length
Presence of IP Address in Host Name
Presence of Security Sensitive Words in URL

and many more (around 22 in total). Host-based features, such as the country code where the site is hosted, the creation date, and the last-updated date, have not yet been added to the system. They should increase the classifier's accuracy, but they will also increase classification latency, because WHOIS servers must be queried to obtain them. The PyWhois module has been used for these queries.
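
The four lexical features named above can be sketched with the Python standard library alone. This is a minimal illustration, not the project's code: the function names, the feature keys, and in particular the `SENSITIVE_WORDS` list are hypothetical, since the project's actual keyword list and its remaining features are not given here.

```python
import ipaddress
from urllib.parse import urlparse

# Hypothetical keyword list; the project's actual list is not given in this text.
SENSITIVE_WORDS = ("login", "signin", "account", "banking", "confirm", "secure")


def host_is_ip(host: str) -> bool:
    """Return True if the host name is a literal IPv4 or IPv6 address."""
    try:
        ipaddress.ip_address(host)
        return True
    except ValueError:
        return False


def lexical_features(url: str) -> dict:
    """Compute a few text-only features of a URL without fetching the page."""
    # urlparse only recognises the host when a scheme is present,
    # so assume http:// for bare inputs such as "example.com/about".
    parsed = urlparse(url if "://" in url else "http://" + url)
    host = parsed.hostname or ""
    lowered = url.lower()
    return {
        "url_length": len(url),
        "domain_length": len(host),
        "host_is_ip": int(host_is_ip(host)),
        "sensitive_word_count": sum(word in lowered for word in SENSITIVE_WORDS),
    }


if __name__ == "__main__":
    print(lexical_features("http://192.168.0.1/login/confirm.php"))
    print(lexical_features("example.com/about"))
```

Because every value is derived from the URL string itself, no network request is needed, which is what keeps this kind of feature cheap compared with WHOIS-based ones.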

About Dataset

This system collects its data from two sources, namely:

Phishtank.com

For the phishing/malicious URLs, we collect data from [Phishtank](https://www.phishtank.com/).
