twang2218 / pmap

Licence: other
Process Map Visualization of event analysis in R

Process Map

The goal of pmap is to generate a process map from an event log, customized to the user's preferences.

Installation

An older version of pmap is available on CRAN. If you prefer that version, install it with:

install.packages("pmap")

However, CRAN policy discourages submitting a package more than once a month. The GitHub repo is therefore the primary release channel, and the package is submitted to CRAN only when possible. As a result, the version on CRAN may be somewhat outdated.

To get the latest version, install pmap directly from GitHub:

devtools::install_github("twang2218/pmap")

Each release is tagged in git, so you can also install a specific version by passing its tag as ref:

devtools::install_github("twang2218/pmap", ref = "v0.6.0")

Usage

This is a demonstration of how to use pmap to create a process map from an event log. The sepsis dataset from the eventdataR package is used throughout.

Data preparation

As in any data analysis task, the first and most important step is to prepare the data.

Before the actual preparation steps, we should agree on the terminology used later. There are four main terms: Case, Activity, Category and Event. The relation between the terms is shown in the following graph.

[Figure: relation between Case, Activity, Category and Event]

An eventlog is a collection of Event objects. Each row in the eventlog represents one Event, and each Event has several attributes, including:

  • when - timestamp;
  • who - case_id;
  • what - activity and category;

Therefore, pmap requires three mandatory fields and one optional field in the given eventlog data frame:

  • timestamp: The time at which each event occurred. The data type should be POSIXct. If the column is a character vector, the package will attempt to convert it to POSIXct, but this is only a convenience that works in some cases; it is safer to make sure the column already has the correct data type.
  • case_id: The case ID in the process paths. It is used to calculate the activity frequency or process performance.
  • activity: The activity name.
  • category (optional since v0.4.0): Used to color groups of activities differently for a clearer visualization. For example, marketing activities with different purposes can each be shown in their own color. If category is missing, the activity name is used as the category for coloring by default.

category was previously called event_type and was required before v0.3.2. It has been optional since v0.4.0.
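As a minimal sketch of the expected shape, the following base R snippet builds a tiny event log with the three mandatory columns and converts the timestamp column to POSIXct explicitly. The data frame name toy_log and all values in it are invented for illustration and are not part of pmap or eventdataR.

```r
# Hypothetical toy event log; the values are invented for illustration only.
toy_log <- data.frame(
  timestamp = c("2014-10-22 11:15:41", "2014-10-22 11:27:00",
                "2014-10-22 11:33:37", "2014-10-23 09:00:00"),
  case_id   = c("A", "A", "A", "B"),
  activity  = c("ER Registration", "CRP", "ER Triage", "ER Registration"),
  stringsAsFactors = FALSE
)

# Convert the character timestamps to POSIXct explicitly instead of
# relying on automatic conversion.
toy_log$timestamp <- as.POSIXct(toy_log$timestamp,
                                format = "%Y-%m-%d %H:%M:%S", tz = "UTC")

# Basic sanity checks before passing the data frame on.
required <- c("timestamp", "case_id", "activity")
stopifnot(all(required %in% names(toy_log)))
stopifnot(inherits(toy_log$timestamp, "POSIXct"))
stopifnot(!any(is.na(toy_log)))
```

Doing the conversion yourself makes the format and time zone explicit, which is usually easier to debug than a failed automatic conversion.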

Now, let's do the data preparation.

library(eventdataR)
library(dplyr)
library(pmap)

# Prepare the event log data frame
eventlog <- eventdataR::sepsis %>%
    select(timestamp, case_id, activity) %>%
    na.omit()

Check the structure of the eventlog data frame.

> head(eventlog)
# A tibble: 6 x 3
  timestamp           case_id activity
  <dttm>              <chr>   <chr>
1 2014-10-22 11:15:41 A       ER Registration
2 2014-10-22 11:27:00 A       Leucocytes
3 2014-10-22 11:27:00 A       CRP
4 2014-10-22 11:27:00 A       LacticAcid
5 2014-10-22 11:33:37 A       ER Triage
6 2014-10-22 11:34:00 A       ER Sepsis Triage
> str(eventlog)
eventlog [15,190 × 3] (S3: eventlog/tbl_df/tbl/data.frame)
 $ timestamp: POSIXct[1:15190], format: "2014-10-22 11:15:41" ...
 $ case_id  : chr [1:15190] "A" "A" "A" "A" ...
 $ activity : Factor w/ 16 levels "Admission IC",..: 4 10 3 9 6 5 8 7 2 3 ...
 - attr(*, "case_id")= chr "case_id"
 - attr(*, "activity_id")= chr "activity"
 - attr(*, "activity_instance_id")= chr "activity_instance_id"
 - attr(*, "lifecycle_id")= chr "lifecycle"
 - attr(*, "resource_id")= chr "resource"
 - attr(*, "timestamp")= chr "timestamp"
 - attr(*, "na.action")= 'omit' Named int [1:24] 442 443 444 445 446 447 448 449 450 451 ...
  ..- attr(*, "names")= chr [1:24] "442" "443" "444" "445" ...

Create a process map

You can create and render a process map directly from the eventlog in two steps:

# Create process map
p <- create_pmap(eventlog)
# Render the process map
render_pmap(p)

The result is shown in the Viewer pane if you're using RStudio, or in a new browser window if you're running the code from a terminal.

[Figure: process map without pruning]
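To give an intuition for what the edges of such a map summarize, here is a rough base R sketch that counts how often one activity directly follows another within the same case. It is an illustration under that assumption only, not pmap's actual implementation; toy_log and its values are invented.

```r
# Hypothetical toy event log (invented values).
toy_log <- data.frame(
  timestamp = as.POSIXct(c("2014-10-22 11:00:00", "2014-10-22 11:10:00",
                           "2014-10-22 11:20:00", "2014-10-23 09:00:00",
                           "2014-10-23 09:30:00"), tz = "UTC"),
  case_id  = c("A", "A", "A", "B", "B"),
  activity = c("ER Registration", "CRP", "Leucocytes",
               "ER Registration", "CRP"),
  stringsAsFactors = FALSE
)

# Order events by case and time so consecutive rows are consecutive steps.
toy_log <- toy_log[order(toy_log$case_id, toy_log$timestamp), ]

# Pair each event with the next one and keep pairs within the same case.
n <- nrow(toy_log)
from <- toy_log[-n, ]
to   <- toy_log[-1, ]
same_case <- from$case_id == to$case_id

transitions <- data.frame(
  from  = from$activity[same_case],
  to    = to$activity[same_case],
  count = 1,
  stringsAsFactors = FALSE
)

# Sum the occurrences of each from -> to transition.
edges <- aggregate(count ~ from + to, data = transitions, FUN = sum)
print(edges)
```

Each row of edges corresponds to one arrow in a process map, and its count to the volume on that arrow.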

Prune the process map

As you can see, the result above is a bit messy. We can simplify the process map by pruning the edges with smaller volume, which makes the common paths in the process easier to find.

p %>% prune_edges(0.5) %>% render_pmap()

[Figure: process map with edges pruned]
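Conceptually, pruning by a percentage can be thought of as ranking the edges by volume and dropping the lowest-ranked share. The sketch below shows that idea in base R; the helper prune_lowest_edges() is hypothetical, the counts are invented, and the exact rule pmap applies in prune_edges() may differ.

```r
# Hypothetical edge list with invented counts.
edges <- data.frame(
  from  = c("ER Registration", "CRP", "Leucocytes", "CRP"),
  to    = c("CRP", "Leucocytes", "CRP", "Release A"),
  count = c(40, 35, 5, 2),
  stringsAsFactors = FALSE
)

# Hypothetical helper: drop the given fraction of edges, lowest counts first.
prune_lowest_edges <- function(edges, percentage) {
  n_drop <- floor(nrow(edges) * percentage)
  n_keep <- nrow(edges) - n_drop
  # Indices of the n_keep edges with the highest counts.
  keep <- order(edges$count, decreasing = TRUE)[seq_len(n_keep)]
  edges[keep, , drop = FALSE]
}

pruned <- prune_lowest_edges(edges, 0.5)
print(pruned)
```

With four edges and a percentage of 0.5, the two lowest-volume edges are removed and the two busiest ones remain.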

That's better, but we can improve it further by also pruning the less important nodes.

p %>% prune_nodes(0.5) %>% prune_edges(0.5) %>% render_pmap()

Or, if you want a more interactive approach, you can start a Shiny app with a slider for pruning the nodes and/or edges by a certain percentage. Be careful: the more edges and nodes there are, the slower the rendering will be. Let's keep 50% of the nodes and 50% of the edges in our example:

render_pmap_shiny(p, nodes_prune_percentage = 0.5, edges_prune_percentage = 0.5)

[Figure: cleaner process map]

Expand the loop

The process map above already reveals something interesting: the loop between CRP and Leucocytes. Is this because a small group of cases went through these two steps many times, or because most cases went through the loop just a few times? To answer the question, we can expand the loop by making repeated activities distinct.

p <- create_pmap(eventlog, distinct_repeated_activities = TRUE)

This way, the occurrence number of each activity within the path is appended to its name, so an activity that occurs multiple times in a path gets a different name each time, and therefore a different node in the final map. The newly generated process map will be much more complex than before, so we need to prune it further.
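The renaming idea can be sketched in base R as follows. This is only an illustration of numbering repeated activities per case, using invented data and the "name (n)" label style; it is not pmap's own code.

```r
# Hypothetical toy event log (invented values), already ordered by time
# within each case.
toy_log <- data.frame(
  case_id  = c("A", "A", "A", "A", "B", "B"),
  activity = c("CRP", "Leucocytes", "CRP", "Leucocytes", "CRP", "CRP"),
  stringsAsFactors = FALSE
)

# Number each occurrence of an activity within its case: 1, 2, 3, ...
occurrence <- ave(seq_along(toy_log$activity),
                  toy_log$case_id, toy_log$activity,
                  FUN = seq_along)

# Append the occurrence number to the activity name.
toy_log$activity_distinct <- paste0(toy_log$activity, " (", occurrence, ")")
print(toy_log)
```

After this step, the second CRP of case A is a separate activity from the first one, which is why the resulting map has many more nodes.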

p %>% prune_nodes(0.5) %>% prune_edges(0.8) %>% render_pmap()

[Figure: process map with distinct repeated activities]

It's interesting to see that there isn't much connection between the first CRP (1) and Leucocytes (1); the back and forth happens after Admission NC.

Time is the key

Is this back-and-forth loop due to some kind of regular check after Admission NC? We can't tell from the previous process map, because we don't know how much time passed between the activities. By default, the edge label is the number of cases that went between the two connected activities. We can change it to the duration between those activities to understand the timing of the process.

Since multiple cases go through each path, we need to decide how to summarize the duration, for example:

  • the maximum duration
  • the minimum duration
  • the mean duration
  • the median duration
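To see why the choice of summary matters, here is a small base R sketch that computes all four summaries per edge. The transitions and their durations in hours are invented, and the column names are assumptions for this example only.

```r
# Hypothetical transitions with invented durations in hours.
transitions <- data.frame(
  from  = c("CRP", "CRP", "CRP", "Leucocytes"),
  to    = c("Leucocytes", "Leucocytes", "Leucocytes", "Release A"),
  hours = c(0, 0, 3, 48),
  stringsAsFactors = FALSE
)

# Group the durations by edge.
edge_key <- paste(transitions$from, "->", transitions$to)
groups <- split(transitions$hours, edge_key)

# One row per edge, one column per summary statistic.
summary_table <- data.frame(
  edge   = names(groups),
  max    = vapply(groups, max, numeric(1)),
  min    = vapply(groups, min, numeric(1)),
  mean   = vapply(groups, mean, numeric(1)),
  median = vapply(groups, median, numeric(1)),
  row.names = NULL
)
print(summary_table)
```

For the CRP -> Leucocytes edge, a single slow case pulls the mean up to 1 hour while the median stays at 0, so the two labels can tell different stories about the same path.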

We can specify the kind of duration by passing an edge_label argument to create_pmap().

p <- create_pmap(eventlog, edge_label = "mean_duration")
p %>% prune_nodes(0.5) %>% prune_edges(0.8) %>% render_pmap()

Alternatively, the label can be changed after the process map has been created, using the adjust_edge_label() function.

p <- adjust_edge_label(p, label = "mean_duration")
render_pmap(p)

[Figure: process map with distinct repeated activities and mean duration]

By adding the duration of each path to the process map, we can resolve the question immediately: it's not a loop. Leucocytes almost always occurred immediately after CRP, but not the other way around. This means CRP and Leucocytes occurred together in the same sequence and might belong to one blood test pack, and the patients are tested regularly after Admission NC.

We can also see that patients would normally be released 2 days after the CRP and Leucocytes tests. This might be because the test results came back OK, so the patients were released after 2 days of observation without any further issue.

We can get that information from the process map clearly.

Persist the result

If you're happy with the result, you can save the process map to a PDF or another file format by replacing render_pmap() with render_pmap_file().

p <- create_pmap(eventlog, edge_label = "mean_duration")
p %>% prune_nodes(0.5) %>% prune_edges(0.8) %>% render_pmap_file("sepsis_process_map.pdf")