JAW

An open-source prototype implementation of property graphs for JavaScript, based on the esprima parser and the ESTree SpiderMonkey spec. JAW can be used to analyze the client side of web applications and JavaScript-based programs.

This project is licensed under GNU AFFERO GENERAL PUBLIC LICENSE V3.0. See here for more information.

JAW has a GitHub Pages website available at https://soheilkhodayari.github.io/JAW/.

Overview of JAW

The architecture of JAW is shown below.

Inputs and Data Collection

JAW can be used in two distinct ways:

Option 1. Modeling and analyzing an arbitrary JavaScript program using JAW.

In this case, the only input to the tool is the path of the target JavaScript program in the file system.

Option 2. Analyzing a web application given a single seed URL.

JAW has a stand-alone, JavaScript-enabled web crawler (based on Chromium) that can collect the web resources of the application and use them for analysis.

Test Case Scripts. Optionally, a so-called test case script or state script can be provided together with the seed URL of the application, so that the crawler reaches a certain pre-defined state (e.g., a logged-in state) before the crawling session starts. For more information about how to create such a state script, see here.

Crawler Output. The crawler outputs the JavaScript code as well as the State Values for each web page found.

JavaScript Code: for each web page found, JAW creates a single JavaScript file preserving the execution order of the program.
State Values: state values are a collection of concrete values observed during the execution of a web page. These include the snapshot of the initial and rendered HTML page, fired events, HTTP requests and responses, and cookies. Optionally, the crawler can collect any JavaScript property accessible within the web page.

HPG Construction

The collected JavaScript code and state values are next used to build an HPG. The built graph is imported into a Neo4j database.

Alternatively, the graph is built for the (arbitrary) JavaScript program given as input by the tester.

Optional Input: the HPG construction module can optionally be provided with a mapping of semantic types to arbitrary, tester-defined JavaScript language tokens. For example, a semantic type REQ can be assigned to all low-level, JavaScript functions that send an HTTP request (e.g., the Fetch API, or XMLHttpRequest).
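As a rough illustration of the idea only (not JAW's actual configuration format), the mapping can be pictured as a lookup from a semantic type to the tokens it covers. The token spellings and the helper name semantic_types_for below are assumptions made for this sketch.

```python
# Hypothetical sketch: semantic types mapped to tester-defined JavaScript tokens.
# The REQ type comes from the text above; the exact token spellings are assumed.
SEMANTIC_TYPES = {
    "REQ": ["fetch", "XMLHttpRequest"],
}


def semantic_types_for(token):
    """Return every semantic type whose token list contains `token`."""
    return sorted(
        sem_type
        for sem_type, tokens in SEMANTIC_TYPES.items()
        if token in tokens
    )


print(semantic_types_for("fetch"))        # ['REQ']
print(semantic_types_for("console.log"))  # []
```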

Analysis and Outputs

Finally, the constructed Neo4j graph database can be queried for analysis. JAW provides a series of utility traversals for data flow analysis, control flow and reachability analysis, or pattern matching, which can be leveraged for writing custom traversals for security analyses. Also, JAW includes traversals to detect client-side CSRF vulnerabilities.

Setup

Prerequisites

Please install the following dependencies before proceeding to the installation step:

latest version of the npm package manager (Node.js)
any stable version of python 3.x
python pip package manager

Installation

Below is the quick installation guide. Please see here for detailed installation instructions.

Step 1: Installing Python/NodeJS Dependencies

In the project root directory, run:

$ cd installation
$ ./install_dependencies.sh

Step 2: Setup Neo4j

This prototype has been tested with Neo4j 3.5.9, and Neo4j 4.2.3 community edition.

Option 1: Using JAW with Neo4j in Docker.

Please see here for information on how to use JAW with Neo4j running inside Docker.

Option 2: Installing in Host Machine.

You can download Neo4j 3.5.9, or Neo4j 4.2.3 from the neo4j download center. You can also install it, among others, via apt-get or homebrew. For example:

$ cd installation
$ # for linux, neo4j 3.5.x
$ ./linux_neo4j_installation.sh
$ # for macos, neo4j 3.5.x
$ ./macos_neo4j_installation.sh

Then, copy the example.env and rename it to .env.

Set your operating system:
- PLATFORM=linux
- PLATFORM=macos

Note: By default, the Neo4j password for the user neo4j should be set to root. If you set any other password, you also need to change it in constants.py.

$ neo4j-admin set-initial-password root

Quick Start

Data Collection

This module collects the data (i.e., JavaScript code and state values of web pages) needed for testing.

If you want to test a specific JavaScript file that you already have on your file system, you can skip this step.

Inputs for Web Crawling

JAW can crawl and collect the web resources for a given set of web applications. The inputs to the tool are thus a list of sites' seed URLs, and an optional test case or state script for each site under test.

Action 1: Specify the set of sites for testing by adding new entries in the SITES_MAP dictionary in hpg_crawler/sites/sitemap.py. As shown below, you should assign each site an integer id, a name, and a seed URL.

SITES_MAP = {
	'1': ('example-name-1', 'https://example1.com', ),
	'2': ('example-name-2', 'https://example2.com', ),
}

Action 2 (Optional): For each site specified, you can input a custom test case or state script that gives instructions to the crawler to reach a certain state of the application (e.g., logged in) before the crawling session starts.

To create a new state script for a site, copy the sites/template folder and rename it to sites/<site-id>, where <site-id> is the id you set in sitemap.py for that site, e.g., sites/1.
Add your selenium-based state functions for the target site in sites/<site-id>/scripts/Auth.py. See the example Auth.py file for more information.

How to Start the Web Crawler?

In order to start the crawler for a given <site-id>, run:

$ cd hpg_crawler
$ python3 driver.py <site-id>

Alternatively, you can run the crawler for a list of sites, i.e., from an integer <start-site-id> to the integer <end-site-id> (inclusive), according to the integer id values specified in hpg_crawler/sites/sitemap.py. Thus, you can run:

$ cd hpg_crawler
$ python3 driver.py <start-site-id> <end-site-id>

For example, python3 driver.py 1 5 crawls websites with id one to five from hpg_crawler/sites/sitemap.py.
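The inclusive range semantics can be sketched as follows. This only illustrates which ids a start/end pair covers; how driver.py handles the range internally is not shown here, and the helper name site_ids_in_range is hypothetical.

```python
def site_ids_in_range(start_site_id, end_site_id):
    """Return the site ids from start to end, both ends included, as strings."""
    if start_site_id > end_site_id:
        raise ValueError("start id must not exceed end id")
    # SITES_MAP keys are strings, so each integer id is converted.
    return [str(site_id) for site_id in range(start_site_id, end_site_id + 1)]


print(site_ids_in_range(1, 5))  # ['1', '2', '3', '4', '5']
```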

Running With Docker

Specify which website you want to crawl in docker-compose.yml under the command field.
Spawn an instance of the crawler by running ./run.docker.sh.

For more information about the web crawler of JAW, see here.

Graph Construction

In order to create a hybrid property graph for the output of the crawler or a given JavaScript program, you can use the following general command:

$ python3 -m hpg_construction.api <path> --js=<program.js> --import=<bool> --hybrid=<bool> --reqs=<requests.out> --evts=<events.out> --cookies=<cookies.pkl> --html=<html_snapshot.html>

Specification of Parameters:

<path>: absolute path to the folder containing the program files for analysis (must be under the hpg_construction/outputs folder).
--js=<program.js>: name of the JavaScript program for analysis (default: js_program.js).
--import=<bool>: whether the constructed property graph should be imported to an active neo4j database (default: true).
--hybrid=<bool>: whether the hybrid mode is enabled (default: false). This implies that the tester wants to enrich the property graph by inputting files for any of the HTML snapshot, fired events, HTTP requests and cookies, as collected by the JAW crawler.
--reqs=<requests.out>: for hybrid mode only, name of the file containing the sequence of observed network requests; pass the string false to exclude (default: request_logs_short.out).
--evts=<events.out>: for hybrid mode only, name of the file containing the sequence of fired events, pass the string false to exclude (default: events.out).
--cookies=<cookies.pkl>: for hybrid mode only, name of the file containing the cookies, pass the string false to exclude (default: cookies.pkl).
--html=<html_snapshot.html>: for hybrid mode only, name of the file containing the DOM tree snapshot, pass the string false to exclude (default: html_rendered.html).
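When scripting many runs, it can help to assemble the command from the parameters listed above. The sketch below only builds the argument list and does not run anything. The helper name build_hpg_command is hypothetical, and the lowercase true/false spelling for boolean values is assumed from the defaults quoted above.

```python
HYBRID_OPTIONS = ("reqs", "evts", "cookies", "html")


def build_hpg_command(path, js="js_program.js", do_import=True, hybrid=False,
                      **hybrid_files):
    """Return the argument list for one hpg_construction.api invocation."""
    unknown = sorted(set(hybrid_files) - set(HYBRID_OPTIONS))
    if unknown:
        raise ValueError(f"unknown hybrid options: {unknown}")

    command = [
        "python3", "-m", "hpg_construction.api", path,
        f"--js={js}",
        # Assumed spelling: booleans are passed as lowercase true/false.
        f"--import={str(do_import).lower()}",
        f"--hybrid={str(hybrid).lower()}",
    ]
    if hybrid:
        # The file options are documented as hybrid-mode only.
        for name in HYBRID_OPTIONS:
            if name in hybrid_files:
                value = hybrid_files[name]
                # Passing False excludes that input via the string "false".
                command.append(f"--{name}={'false' if value is False else value}")
    return command


print(build_hpg_command("/abs/path/to/page", hybrid=True, cookies=False))
```

The resulting list can be handed to subprocess.run(command, check=True) from the project root.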

For more information, you can use the help CLI provided with the graph construction API:

$ python3 -m hpg_construction.api -h

Note: To add and assign custom semantic types to JavaScript language tokens (suitable for the specific type of analysis you are doing), you can input them in hpg_construction/lib/jaw/semantictypes.js.

Note: Alternatively, one can use the following two steps for graph construction, which is more suited for debugging purposes.

Building the Graph CSV Files

The JavaScript analyzer module creates the nodes.csv and rels.csv files for analysis (the property graph).

In the project root directory, try:

$ node hpg_construction/main.js -js <RELATIVE_PATH_TO_TEST_FILE> -o <OUTPUT_FOLDER_NAME>

For example:

$ node hpg_construction/main.js -js hpg_construction/test-inputs/test.js -o myfolder

Importing into Neo4j

In the project root directory, run:

$ python3 -m hpg_neo4j.hpg_import <path-to-the-folder-of-the-csv-files> --nodes=nodes.csv --edges=rels.csv

Run the help CLI command for more information:

$ python3 -m hpg_neo4j.hpg_import -h
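Before importing, it can be worth confirming that both CSV files are actually present in the folder. The check below is a minimal sketch and not part of JAW; the helper name missing_csv_files is hypothetical, and the demo uses a temporary folder instead of real analyzer output.

```python
import tempfile
from pathlib import Path


def missing_csv_files(folder, names=("nodes.csv", "rels.csv")):
    """Return the expected CSV file names that are absent from `folder`."""
    folder = Path(folder)
    return [name for name in names if not (folder / name).is_file()]


# Demo on a throwaway folder that contains only nodes.csv.
with tempfile.TemporaryDirectory() as tmp:
    (Path(tmp) / "nodes.csv").write_text("")
    print(missing_csv_files(tmp))  # ['rels.csv']
```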

HPG construction for a JavaScript Program

To construct an HPG for an arbitrary JavaScript file, simply run:

$ python3 -m hpg_construction.api <path> --js=<program.js>

In the above command, the parameter <path> is the base path to the folder containing the program files for analysis (must be under the hpg_construction/outputs folder), and the parameter --js=<program.js> specifies the name of the JavaScript program for analysis.

HPG construction for the Crawler Output

To construct an HPG for a given web page output by the crawler, simply run:

$ python3 -m hpg_construction.api <path>

In the above command, the <path> parameter is the location of the folder containing the target web page for analysis (output by the crawler), e.g., <BASE_PATH>/hpg_construction/outputs/example-app/example-page.

Analysis

The constructed HPG can then be queried using Cypher or the NeoModel ORM.

Running Custom Graph traversals

You should place and run your queries in hpg_analysis/<ANALYSIS_NAME>.

Option 1: Using the NeoModel ORM

You can use the NeoModel ORM to query the HPG. To write a query:

(1) Check out the HPG data model and syntax tree.
(2) Check out the ORM model for HPGs
(3) See the example query file provided; example_query_orm.py in the hpg_analysis/example folder.

$ python3 -m hpg_analysis.example.example_query_orm

For more information, please see here.

Option 2: Using Cypher Queries

You can use Cypher to write custom queries. For this:

(1) Check out the HPG data model and syntax tree.
(2) See the example query file provided; example_query_cypher.py in the hpg_analysis/example folder.

$ python3 -m hpg_analysis.example.example_query_cypher

For more information, please see here.
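As a first sanity check that the import worked, a schema-independent Cypher query can count the nodes in the database. The sketch below uses the official neo4j Python driver, which may differ from what the bundled example scripts use. The Bolt address is Neo4j's default and is an assumption about your setup; the neo4j/root credentials are the defaults described in the Setup section. The query deliberately avoids any HPG label or property names.

```python
COUNT_NODES_QUERY = "MATCH (n) RETURN count(n) AS total"


def count_nodes(session):
    """Run the count query on an open session and return the node count."""
    record = session.run(COUNT_NODES_QUERY).single()
    return record["total"]


def main():
    # Imported here so count_nodes can be reused without the driver installed.
    from neo4j import GraphDatabase

    # Assumed connection details: default Bolt port and the neo4j/root
    # credentials from the Setup section.
    driver = GraphDatabase.driver("bolt://localhost:7687", auth=("neo4j", "root"))
    try:
        with driver.session() as session:
            print("nodes in graph:", count_nodes(session))
    finally:
        driver.close()
```

Calling main() requires a running Neo4j instance and the neo4j package to be installed.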

Detecting Client-side CSRF

This section describes how to configure and run the client-side CSRF analyzer, and how to interpret its output.

How to Run the Analysis Script?

In the root directory, run

$ python3 -m hpg_analysis.cs_csrf.main

This will build the property graph, create a Neo4j database, and query the database for client-side CSRF vulnerabilities.

Note: By default, the testing process is done for the unit tests specified under hpg_construction/unit_tests/cs_csrf. However, the analysis script has three different run modes:

(i) Detecting vulnerabilities in a specific web page
(ii) Detecting vulnerabilities in a specific web site, i.e., all web pages of that site
(iii) Detecting vulnerabilities in unit test files

The active run mode can be adjusted by changing the hardcoded value of the ACTIVE_MODE parameter at the top of the script.

ENUM_TEST_WEB_PAGE = 0 	# case (i)
ENUM_TEST_WEB_SITE = 1	# case (ii)
ENUM_UNIT_TEST = 2	# case (iii)
ACTIVE_MODE = ENUM_UNIT_TEST # change this value to change the active run mode of the script

For case (i) and (ii), to specify the exact site or URL to test, modify the hardcoded site_identifer and url variables at the end of the file.

How to Interpret the Output of the Analysis?

The outputs will be stored in a file called template.out in the same folder as the input. For each HTTP request detected, JAW outputs an entry marking the set of semantic types (a.k.a. semantic tags or labels) associated with the elements constructing the request (i.e., the program slices). For example, an HTTP request marked with the semantic type ['WIN.LOC'] is forgeable through the window.location injection point. However, a request marked with ['NON-REACH'] is not forgeable.

An example output entry is shown below:

[*] Tags: ['WIN.LOC']
[*] NodeId: {'TopExpression': '86', 'CallExpression': '87', 'Argument': '94'}
[*] Location: 29
[*] Function: ajax
[*] Template: ajaxloc + "/bearer1234/"
[*] Top Expression: $.ajax({ xhrFields: { withCredentials: "true" }, url: ajaxloc + "/bearer1234/" })

1:['WIN.LOC'] variable=ajaxloc
	0 (loc:6)- var ajaxloc = window.location.href

This entry shows that on line 29, there is a $.ajax call expression, and this call expression triggers an ajax request with the url template value of ajaxloc + "/bearer1234/", where the parameter ajaxloc is a program slice reading its value at line 6 from window.location.href, and is thus forgeable through ['WIN.LOC'].
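For post-processing many entries, the Tags line can be read programmatically. The sketch below assumes the line format shown in the example entry. It also makes a simplifying assumption: an entry counts as forgeable when it has at least one tag and is not marked NON-REACH. The helper names parse_tags and is_forgeable are hypothetical.

```python
import ast

TAGS_PREFIX = "[*] Tags:"


def parse_tags(line):
    """Extract the list of semantic types from a '[*] Tags: [...]' line."""
    line = line.strip()
    if not line.startswith(TAGS_PREFIX):
        raise ValueError("not a Tags line")
    # The remainder is a Python-style list literal such as ['WIN.LOC'].
    return ast.literal_eval(line[len(TAGS_PREFIX):].strip())


def is_forgeable(tags):
    """Simplifying assumption: any tagged request not marked NON-REACH."""
    return bool(tags) and "NON-REACH" not in tags


tags = parse_tags("[*] Tags: ['WIN.LOC']")
print(tags, is_forgeable(tags))  # ['WIN.LOC'] True
```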

Detailed Documentation.

For more information, visit our wiki page here. Below is a table of contents for quick access.

The Web Crawler of JAW

Data Model of Hybrid Property Graphs (HPGs)

Graph Construction

Graph Traversals

Contribution and Code Of Conduct