IBM / nlc-email-phishing

License: Apache-2.0
Detect email phishing with Watson Natural Language Classifier

WARNING: This repository is no longer maintained ⚠️

This repository will not be updated. The repository will be kept available in read-only mode.

Determine email spam with Watson Natural Language Classifier

In this Code Pattern, we will build an app that classifies email, labeling each message as "Phishing" or "Spam", or as "Ham" if it does not appear suspicious. We'll use IBM Watson Natural Language Classifier (NLC) to train a model on example emails from an EDRM Enron email dataset. Please note that this data is free for non-commercial use only; explicit permission must be obtained for any other use. The custom NLC model can be built quickly in the Web UI, deployed into our Node.js app using the Watson Developer Cloud Node.js SDK, and then run from a browser.

When the reader has completed this Code Pattern, they will understand how to:

  • Build a Watson Natural Language Classifier model using the Web UI
  • Create a Node.js app that uses the NLC model to classify emails as Phishing, Spam, or Ham.
  • Use the Watson Developer Cloud SDK for Node.js.

Flow

[Image: architecture diagram]

  1. User interacts with Natural Language Classifier (NLC) GUI to train the model.
  2. EDRM data is loaded to the NLC service to provide sample emails for training.
  3. User sends email text to the application to have it classified.
  4. App uses Watson Natural Language Classifier to determine if text is phishing, spam, or ham.
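
Step 4 of the flow can be sketched as follows. This is a minimal illustration, not the app's actual code: `summarizeClassification` is a hypothetical helper, and the response shape (a `top_class` string plus a `classes` array of `class_name`/`confidence` pairs) is assumed from the NLC v1 API. The sample object is hand-written; only the 0.81 Phishing score echoes the test example given in Step 4 of this README.

```javascript
// Hypothetical helper: reduce an NLC classify response to the label and
// confidence an app could display. The response shape is an assumption
// based on the NLC v1 API, not taken from this repository's code.
function summarizeClassification(response) {
  const classes = Array.isArray(response.classes) ? response.classes : [];
  // Look up the confidence that belongs to the top class.
  const top = classes.find((c) => c.class_name === response.top_class);
  return {
    label: response.top_class,
    confidence: top ? top.confidence : null,
  };
}

// Hand-written sample for illustration (not real service output).
const sample = {
  text: 'Can you please send your password?',
  top_class: 'Phishing',
  classes: [
    { class_name: 'Phishing', confidence: 0.81 },
    { class_name: 'Spam', confidence: 0.12 },
    { class_name: 'Ham', confidence: 0.07 },
  ],
};

console.log(summarizeClassification(sample));
// prints { label: 'Phishing', confidence: 0.81 }
```

Returning `null` for the confidence when the `classes` array is missing keeps the helper from throwing on a partial response.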

Included components

  • Watson Studio: Analyze data using RStudio, Jupyter, and Python in a configured, collaborative environment that includes IBM value-adds, such as managed Spark.
  • Watson Natural Language Classifier: An IBM Cloud service to interpret and classify natural language with confidence.
  • Node.js: An open-source JavaScript run-time environment for executing server-side JavaScript code.

Watch the Video

[Embedded video]

Steps

  1. Clone the repo
  2. Create IBM Cloud services
  3. Create a Watson Studio project
  4. Train the NLC model
  5. Run the application

1. Clone the repo

Clone the nlc-email-phishing repo locally. In a terminal, run:

git clone https://github.com/IBM/nlc-email-phishing.git

2. Create IBM Cloud services

Create the following service and note its credentials, which are needed in Step 5:

  • Watson Natural Language Classifier

3. Create a Watson Studio project

  • Log in to IBM's Watson Studio. Once in, you'll land on the dashboard.

  • Create a new project by clicking + New project and choosing Data Science:

    [Screenshot: studio project]

  • Enter a name for the project and click Create.

  • NOTE: Creating a project in Watson Studio also creates a free-tier Object Storage service and a Watson Machine Learning service in your IBM Cloud account. Select the Free storage type to avoid fees.

    [Screenshot: studio-new-project]

  • Upon successful project creation, you are taken to a dashboard view of your project. Take note of the Assets and Settings tabs; we'll use them to associate the project with external assets (datasets and notebooks) and IBM Cloud services.

    [Screenshot: studio-project-dashboard]

4. Train the NLC model

The data used in this example comes from an EDRM Enron email dataset. A cleaned version is available in the repo at data/Email-trainingdata-20k.csv. We'll now train an NLC model using this data.
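
Before uploading, it can help to check how the training examples are spread across labels. The sketch below is an illustration under a stated assumption: each non-empty line of the CSV ends with `,<label>` (email text first, class last), which is the usual NLC training layout but has not been verified against this file. `countLabels` is a hypothetical helper, and it does not handle quoted fields that span multiple lines.

```javascript
// Count how many training rows carry each label.
// Assumption: every non-empty line ends with ",<label>" (text first, class last).
function countLabels(csvText) {
  const counts = {};
  for (const line of csvText.split(/\r?\n/)) {
    if (line.trim() === '') continue; // skip blank lines
    const cut = line.lastIndexOf(',');
    if (cut === -1) continue; // no label column on this line
    // Take the last field and strip optional surrounding quotes.
    const label = line.slice(cut + 1).trim().replace(/^"|"$/g, '');
    counts[label] = (counts[label] || 0) + 1;
  }
  return counts;
}
```

From the root of the cloned repo, this could be called as `countLabels(require('node:fs').readFileSync('data/Email-trainingdata-20k.csv', 'utf8'))`.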

  • From the new project Overview panel, click + Add to project on the top right and choose the Natural Language Classifier asset type.

    [Screenshot: add-nlc-asset]

  • A new instance of the NLC tool will launch.

    [Screenshot: new-nlc-model]

  • Add the data to your project by clicking the Browse button in the right-hand Upload to project section and browsing to the cloned repo. Choose data/Email-trainingdata-20k.csv.

  • Drag and drop the uploaded Email-trainingdata-20k.csv file onto the Create a Class box:

    [Animation: dragging the file onto the Create a Class box]

  • Click the Train model button to begin training. The model will take around an hour to train.

  • To check the status of the model, and to access it after training, go to the Models section of the Assets tab in your project. The model appears there when it is ready. Double-click it to open the Overview tab.

    [Screenshot: nlc-model-overview]

  • The first line of the Overview tab contains the Model ID. Note this value; we'll need it in the next step.

  • Click the Test tab and enter a phrase from an email to test the classifier. For example, "Can you please send your password?" is classified with 0.81 confidence as Phishing.

  • Click the Implementation tab to see how to use the classifier with cURL, Java, Node, or Python.
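
As a rough companion to the Implementation tab, here is a sketch of calling the classifier over plain HTTP from Node.js. It deliberately uses the REST interface rather than the Watson Developer Cloud SDK that the app uses. The endpoint path (`/v1/classifiers/{id}/classify`) and the basic-auth username `apikey` reflect the NLC v1 REST API as commonly documented; treat them as assumptions and confirm against the snippet shown in your own Implementation tab. `buildClassifyRequest` and `classify` are hypothetical names.

```javascript
// Build the HTTP request for the NLC v1 classify endpoint.
// All four arguments come from your own service instance and trained model.
function buildClassifyRequest(serviceUrl, apiKey, classifierId, text) {
  const base = serviceUrl.replace(/\/+$/, ''); // drop trailing slashes
  const auth = Buffer.from(`apikey:${apiKey}`).toString('base64');
  return {
    endpoint: `${base}/v1/classifiers/${encodeURIComponent(classifierId)}/classify`,
    options: {
      method: 'POST',
      headers: {
        Authorization: `Basic ${auth}`,
        'Content-Type': 'application/json',
      },
      body: JSON.stringify({ text }),
    },
  };
}

// Send the request using the global fetch available in Node.js 18 and later.
async function classify(serviceUrl, apiKey, classifierId, text) {
  const { endpoint, options } = buildClassifyRequest(serviceUrl, apiKey, classifierId, text);
  const res = await fetch(endpoint, options);
  if (!res.ok) throw new Error(`NLC request failed with status ${res.status}`);
  return res.json();
}
```

Separating request construction from the network call makes the request easy to inspect or log before anything is sent.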

5. Run the application

Deploy the application using one of the two options below:

Run on IBM Cloud

  • Press the Deploy to IBM Cloud button below.

[Button: Deploy to IBM Cloud]

  • From the IBM Cloud deployment page click the Deploy button.

  • From the Toolchains menu, click Delivery Pipeline to watch the app deploy. Once deployed, the app can be viewed by clicking View app.

  • The app and service can be viewed in the IBM Cloud dashboard. The app will be named nlc-email-phishing, with a unique suffix.

  • We now need to add a few environment variables to the application's runtime so the right classifier service and model are used. Click on the application from the dashboard to view its settings.

  • In the application view, click the Runtime option on the menu and navigate to the Environment Variables section.

  • Update the CLASSIFIER_ID, NATURAL_LANGUAGE_CLASSIFIER_USERNAME, and NATURAL_LANGUAGE_CLASSIFIER_PASSWORD variables with your Model ID from Step 4 and NLC service credentials from Step 2. Click Save.

    [Screenshot: env vars]

  • After saving the environment variables, the app will restart. After the app restarts you can access it by clicking the Visit App URL button.
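
This README names two sets of credential variables: a username and password for the IBM Cloud deployment, and an API key and URL for the local run. The sketch below shows one way code could accept either set. `resolveCredentials` is a hypothetical helper written for illustration; how the app itself reads these variables is not shown in this README.

```javascript
// Hypothetical helper: choose an auth style from the environment variables
// named in this README. Throws when the required values are absent.
function resolveCredentials(env) {
  if (!env.CLASSIFIER_ID) {
    throw new Error('CLASSIFIER_ID is not set');
  }
  if (env.NATURAL_LANGUAGE_CLASSIFIER_APIKEY) {
    return {
      classifierId: env.CLASSIFIER_ID,
      type: 'apikey',
      apikey: env.NATURAL_LANGUAGE_CLASSIFIER_APIKEY,
      url: env.NATURAL_LANGUAGE_CLASSIFIER_URL,
    };
  }
  if (env.NATURAL_LANGUAGE_CLASSIFIER_USERNAME && env.NATURAL_LANGUAGE_CLASSIFIER_PASSWORD) {
    return {
      classifierId: env.CLASSIFIER_ID,
      type: 'basic',
      username: env.NATURAL_LANGUAGE_CLASSIFIER_USERNAME,
      password: env.NATURAL_LANGUAGE_CLASSIFIER_PASSWORD,
    };
  }
  throw new Error('No NLC credentials found in the environment');
}
```

In a running app this would typically be called as `resolveCredentials(process.env)`.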

Run locally

  • In the root of the project, create a file named .env. A sample is provided, and a snippet is shown below.

    # Replace the credentials here with your own.
    CLASSIFIER_ID=<add_ModelID>
    NATURAL_LANGUAGE_CLASSIFIER_APIKEY=<add_API_key>
    NATURAL_LANGUAGE_CLASSIFIER_URL=<add_NLC_url>
  • Update the CLASSIFIER_ID, NATURAL_LANGUAGE_CLASSIFIER_APIKEY, and NATURAL_LANGUAGE_CLASSIFIER_URL variables with your Model ID from Step 4 and NLC service credentials from Step 2.

  • Ensure Node.js is installed.

  • Install the app dependencies by running:

    npm install
  • Start the app by running:

    npm start
  • Open a browser and go to http://localhost:3000.
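
A common stumbling block in the steps above is starting the app with a .env file that still contains the `<add_...>` placeholders. The following sketch checks for that. It parses the file by hand instead of using a package such as dotenv, purely to stay self-contained; `parseEnv`, `findUnsetVars`, and `REQUIRED` are hypothetical names, and the parser only handles simple `KEY=value` lines.

```javascript
// Parse KEY=value lines, skipping blank lines and # comments.
function parseEnv(text) {
  const env = {};
  for (const raw of text.split(/\r?\n/)) {
    const line = raw.trim();
    if (line === '' || line.startsWith('#')) continue;
    const eq = line.indexOf('=');
    if (eq === -1) continue; // not a KEY=value line
    env[line.slice(0, eq).trim()] = line.slice(eq + 1).trim();
  }
  return env;
}

// Variable names taken from the .env snippet in this README.
const REQUIRED = [
  'CLASSIFIER_ID',
  'NATURAL_LANGUAGE_CLASSIFIER_APIKEY',
  'NATURAL_LANGUAGE_CLASSIFIER_URL',
];

// Return the names that are missing or still hold a <placeholder> value.
function findUnsetVars(env) {
  return REQUIRED.filter((name) => !env[name] || /^<.*>$/.test(env[name]));
}
```

For example, `findUnsetVars(parseEnv(require('node:fs').readFileSync('.env', 'utf8')))` returns an empty array once all three values have been filled in.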

Sample output

[Screenshot: sample output]

Links

Learn more

  • Artificial Intelligence Code Patterns: Enjoyed this Code Pattern? Check out our other AI Code Patterns.
  • Data Analytics Code Patterns: Enjoyed this Code Pattern? Check out our other Data Analytics Code Patterns.
  • AI and Data Code Pattern Playlist: Bookmark our playlist with all of our Code Pattern videos.

License

This code pattern is licensed under the Apache Software License, Version 2. Separate third party code objects invoked within this code pattern are licensed by their respective providers pursuant to their own separate licenses. Contributions are subject to the Developer Certificate of Origin, Version 1.1 (DCO) and the Apache Software License, Version 2.

Apache Software License (ASL) FAQ
