footer: © NodeProgram.com, Node.University and Azat Mardan 2018
slidenumbers: true
theme: Simple, 1
build-lists: true
autoscale: true
[.slidenumbers: false] [.hide-footer]
Node Advanced
Overview
Azat Mardan @azat_co
![left](images/azat node interacitev no pipe.jpeg)
Node Advanced
- Videos: http://node.university/p/node-advanced
- Slides: in `*.md` in https://github.com/azat-co/node-advanced
- Code: in `code` in https://github.com/azat-co/node-advanced
Course Overview
Course Overview
- Table of Contents
- What to expect
- What you need
Curriculum
Curriculum
- Node Modules
- Node Event Loop and Async Programming
- Streaming
- Networking
- Debugging
- Scaling
What to Expect
Focus on:
- Pure Node
- Core Node modules
- ES6-8
What not to Expect
Do not expect:
- Not many JavaScript fundamentals and no old ES5
- Not much Linux, Unix, Windows or computer fundamentals
- Not many fancy npm modules or frameworks
Prerequisites
- Node Foundation: https://node.university/p/node-npm-and-mongodb-foundation
- You Don't Know Node: https://node.university/p/you-dont-know-node
- Node Patterns: https://node.university/p/node-patterns
What You Need
- Node version 8+: `node -v`
- npm version 5+: `npm -v`
- Google Chrome
- Slides & code: https://github.com/azat-co/node-advanced
Mindset
- Embrace errors
- Increase curiosity
- Experiment by iteration
- Get comfortable reading source code of Node.js, npm, and npm modules
- Enjoy the process
Reading Source Code
You learn how to use a module and how to be a better developer
Tips for Deeper (Advanced) Understanding
- Learn to think like V8 (a JS+Node engine): when in doubt, use `console.log` or debugger to walk through execution
- Read call stack error messages carefully. Learn and know common errors (address in use, cannot find module, undefined, etc.)
- Upgrade your tools (no Notepad++, seriously)
Tips for Deeper (Advanced) Understanding (Cont)
- Memorize all the array, string and Node core methods - saves tons of time and keeps focus (can work offline too)
- Read good books, take in-person classes from good instructors and watch good video courses
- Build side-projects
- Subscribe to Node Weekly to stay up-to-date
- Teach
Module 1: Modules
Importing Modules with `require()`
- Resolving
- Loading
- Wrapping
- Evaluating
- Caching
Modules Can Have Code
`code/modules/module-1.js`:

```js
console.log(module) // console.log(global.module)
```

```
Module {
  id: '.',
  exports: {},
  parent: null,
  filename: '/Users/azat/Documents/Code/node-advanced/code/module-1.js',
  loaded: false,
  children: [],
  paths:
   [ '/Users/azat/Documents/Code/node-advanced/code/node_modules',
     '/Users/azat/Documents/Code/node-advanced/node_modules',
     '/Users/azat/Documents/Code/node_modules',
     '/Users/azat/Documents/node_modules',
     '/Users/azat/node_modules',
     '/Users/node_modules',
     '/node_modules' ] }
```
`require()`
- Local paths take precedence (0 to N)
- A module can be a file or a folder with `index.js` (or any file specified in `package.json` `main` in that nested folder)
- `loaded` is true when this file is imported/required by another
- `id` is the path when this file is required by another
- `parent` and `children` will be populated accordingly
`require.resolve()`
Checks whether the package exists/is installed, but does not execute the module
How `require()` Checks Files
- Try `name.js`
- Try `name.json`
- Try `name.node` (compiled addon example)
- Try `name` folder, i.e., `name/index.js`
`require.extensions`

```js
{ '.js': [Function], '.json': [Function], '.node': [Function] }
```

```js
function (module, filename) { // require.extensions['.js'].toString()
  var content = fs.readFileSync(filename, 'utf8');
  module._compile(internalModule.stripBOM(content), filename);
}
```

```js
function (module, filename) { // require.extensions['.json'].toString()
  var content = fs.readFileSync(filename, 'utf8');
  try {
    module.exports = JSON.parse(internalModule.stripBOM(content));
  } catch (err) {
    err.message = filename + ': ' + err.message;
    throw err;
  }
}
```

```js
function (module, filename) { // > require.extensions['.node'].toString()
  return process.dlopen(module, path._makeLong(filename));
}
```
Caching
Running `require()` twice will not print twice but just once:

```
cd code/modules && node
> require('./module-1.js')
...
> require('./module-1.js')
{}
```

(Or run `modules/main.js`)
A better way to execute code multiple times is to export it and then invoke
Exporting Module
Exporting Code
```js
module.exports = () => {
}
```
CSV to Node Object Converter Module
`code/modules/module-2.js`

```js
module.exports.parse = (csvString = '') => {
  const lines = csvString.split('\n')
  let result = []
  ...
  return result
}
```
CSV to Node Object Converter Main Program
`code/modules/main-2.js`

```js
const csvConverter = require('./module-2.js').parse
const csvString = `id,first_name,last_name,email,gender,ip_address
...
10,Allin,Bernadot,[email protected],Male,15.162.216.199`
console.log(csvConverter(csvString))
```
Module Patterns
- Export Function
- Export Class
- Export Function Factory
- Export Object
- Export Object with Methods
More on these patterns at Node Patterns
Exporting Tricks and Gotchas

```js
module.exports.parse = () => {} // ok
exports.parse = () => {} // ok
global.module.exports.parse = () => {} // not ok, use local module
```

Exporting Tricks and Gotchas (Cont)

```js
exports.parse = () => {} // ok
module.exports = {parse: () => {} } // ok again
exports = {parse: () => {} } // not ok, creates a new variable
```
Module Wrapper Function
Keeps local vars local: `require('module').wrapper`

```
node
> require('module').wrapper
[ '(function (exports, require, module, __filename, __dirname) { ',
  '\n});' ]
```
Tricky Local Globals
`exports` and `require` are specific to each module, not truly global; same with `__filename` and `__dirname`

```js
console.log(global.module === module) // false
console.log(arguments)
```
What You Export === What You Use

```js
module.exports = {
  parse: (csv) => {
    //...
  }
}
```

Importing an object, so use:

```js
const parse = require('./name.js').parse
const {parse} = require('./name.js') // or
parse(csv)
```
What You Export === What You Use (Cont)

```js
const Parser = {
  parse(csv) {
    // ...
  }
}
module.exports = Parser
```

Again importing an object, so use:

```js
const parse = require('./name.js').parse
const {parse} = require('./name.js') // or
parse(csv)
```
What You Export === What You Use (Cont)

```js
module.exports = () => {
  return {
    parse: (csv) => {}
  }
}
```

Importing a function, not an object, so use:

```js
const {parse} = require('./name.js')()
const parse = require('./name.js')().parse
```

(`modules/main-3.js` and `modules/module-3.js`)
What You Export === What You Use (Cont)

```js
class Parser extends BaseClass {
  parse(csv) {
    // ...
  }
}
module.exports = Parser
```

```js
const Parser = require('./name.js')
const parser = new Parser()
const parse = parser.parse // or const {parse} = parser
```
`import` vs `import()` vs `require()`
- `import` is static and `require` is dynamic
- `*.mjs` is experimental: https://nodejs.org/api/esm.html
- `import()` method (stage 3)
- No `require.extensions` or `require.cache` in `import`
Node experimental ESM support

```js
import fs from 'fs'
import('./button.js')
```

For now, it's better to use Babel or just stick with `require`
Caching
`require.cache` has the cache
Clear Cache
`main-4.js` prints twice (unlike `main-1.js`):

```js
require('./module-4.js')
delete require.cache[require.resolve('./module-4.js')]
require('./module-4.js')
```
Global

```js
var limit = 1000 // local, not available outside
const height = 50 // local
let i = 10 // local
console = () => {} // global, overwrites console outside
global.Parser = {} // global, available in other files
max = 999 // global too
```
npm
- registry
- cli: folders, git, private registries (self-hosted npm, Nexus, Artifactory)
- yarn
- pnpm
npm Git

```
npm i expressjs/express -E
npm i expressjs/express#4.14.0 -E
npm install https://github.com/indexzero/forever/tarball/v0.5.6
npm install git+ssh://[email protected]:npm/npm#semver:^5.0
npm install git+https://[email protected]/npm/npm.git
```

When in doubt: `npm i --dry-run express`
npm ls

```
npm ls express
npm ls -g --depth=0
npm ll -g --depth=0
npm ls -g --depth=0 --json
```

npm installs in `~/node_modules` (if no local)
Creating package.json For Lazy Programmers

```
npm init -y
```
Setting Init Configs
List:

```
npm config ls
```
My npm Configs: cli, user, global

```
; cli configs
scope = ""
user-agent = "npm/4.2.0 node/v7.10.1 darwin x64"
; userconfig /Users/azat/.npmrc
init-author-name = "Azat Mardan"
init-author-url = "http://azat.co/"
init-license = "MIT"
init-version = "1.0.1"
python = "/usr/bin/python"
; node bin location = /Users/azat/.nvm/versions/node/v7.10.1/bin/node
; cwd = /Users/azat/Documents/Code/node-advanced
; HOME = /Users/azat
; "npm config ls -l" to show all defaults.
```
Configs for npm init

```
init-author-name = "Azat Mardan"
init-author-url = "http://azat.co/"
init-license = "MIT"
init-version = "1.0.1"
```
Setting up npm registry Config

```
npm config set registry "http://registry.npmjs.org/"
```

or edit `~/.npmrc`, e.g., `/Users/azat/.npmrc`
Setting up npm proxy

```
npm config set https-proxy http://proxy.company.com:8080
npm config set proxy http://proxy_host:port
```

Note: the https-proxy value uses http, not https, as the protocol.
Dependency Options
- `npm i express -S` (default in npm v5)
- `npm i express -D`
- `npm i express -O`
- `npm i express -E`
`npm update` and `npm outdated`
- `<` and `<=`
- `=`
- `.x`
- `~`
- `^`
- `>` and `>=`
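How those symbols translate into `package.json` version ranges (the versions below are just examples):

```
"express": "4.16.2"          exact version only (what npm i -E saves)
"express": "~4.16.0"         patch updates: >=4.16.0 <4.17.0
"express": "^4.16.0"         minor and patch updates: >=4.16.0 <5.0.0
"express": "4.x"             any release in the 4.* line
"express": ">=4.0.0 <5.0.0"  explicit range with comparison operators
```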
npm Tricks

```
npm home express
npm repo express
npm docs express
```
npm Linking for Developing CLI Tools

```
npm link
npm unlink
```
Module 2: Node Event Loop and Async Programming
Event loop
Two Categories of Tasks
- CPU-bound
- I/O-bound
CPU Bound Tasks
CPU-bound tasks examples:
- Encryption
- Password
- Encoding
- Compression
- Calculations
Input and Output Bound Tasks
Input/Output examples:
- Disk: write, read
- Networking: request, response
- Database: write, read
CPU-bound tasks are not the bottleneck in networking apps. The I/O tasks are the bottleneck because they take up more time typically.
Dealing with Slow I/O
- Synchronous
- Forking (later module)
- Threading (more servers, computers, VMs, containers)
- Event loop (this module)
Call Stack
Uses push and pop operations on a FILO/LIFO/LCFS basis, i.e., functions are removed from the top (the opposite of a queue).
^https://techterms.com/definition/filo
Call Stack Illustration
```js
const f3 = () => {
  console.log('executing f3')
  undefinedVariableError // ERROR!
}
const f2 = () => {
  console.log('executing f2')
  f3()
}
const f1 = () => {
  console.log('executing f1')
  f2()
}
f1()
```
Call Stack as a Bucket
Starts with anonymous, then f1, f2, etc.

```
f3() // last in the bucket but first to go
f2()
f1()
anonymous() // first in the bucket but last to go
```
Call Stack Error

```
> f1()
executing f1
executing f2
executing f3
ReferenceError: undefinedVariableError is not defined
    at f3 (repl:3:1)
    at f2 (repl:3:1)
    at f1 (repl:3:1)
    at repl:1:1
    at ContextifyScript.Script.runInThisContext (vm.js:23:33)
    at REPLServer.defaultEval (repl.js:339:29)
    at bound (domain.js:280:14)
    at REPLServer.runBound [as eval] (domain.js:293:12)
    at REPLServer.onLine (repl.js:536:10)
    at emitOne (events.js:101:20)
```
Event Queue
FIFO to push to call stack
Async Callback Messes Up the Call Stack

```js
const f3 = () => {
  console.log('executing f3')
  setTimeout(() => {
    undefinedVariableError // STILL an ERROR but async in this case
  }, 100)
}
const f2 = () => {
  console.log('executing f2')
  f3()
}
const f1 = () => {
  console.log('executing f1')
  f2()
}
f1()
```
Different Call Stack!
No f1, f2, f3 in the `setTimeout` callback's call stack because the event loop moved on, i.e., the error comes from a different event queue:

```
> f1()
executing f1
executing f2
executing f3
undefined
> ReferenceError: undefinedVariableError is not defined
    at Timeout.setTimeout [as _onTimeout] (repl:4:1)
    at ontimeout (timers.js:386:14)
    at tryOnTimeout (timers.js:250:5)
    at Timer.listOnTimeout (timers.js:214:5)
>
```
Event Loop Order of Operation:
- Timers
- I/O callbacks
- Idle, prepare
- Poll (incoming connections, data)
- Check
- Close callbacks
^https://nodejs.org/en/docs/guides/event-loop-timers-and-nexttick/
[.autoscale: true]
Phases Overview
- Timers: this phase executes callbacks scheduled by setTimeout() and setInterval().
- I/O callbacks: executes almost all callbacks with the exception of close callbacks, the ones scheduled by timers, and setImmediate().
- Idle, prepare: only used internally.
- Poll: retrieve new I/O events; node will block here when appropriate.
- Check: setImmediate() callbacks are invoked here.
- Close callbacks: e.g. socket.on('close', ...).
[.autoscale: true]
https://youtu.be/PNa9OMajw9w?t=5m48s
setTimeout vs. setImmediate vs. process.nextTick
- `setTimeout(fn, 0)` pushes the callback to the next event loop cycle
- `setImmediate()` is similar to `setTimeout()` with 0 but the timing sometimes differs; it is recommended when you need to execute on the next cycle
- `process.nextTick()` is not the next cycle (same cycle!); used to make functions fully async or to postpone code for events
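The three schedulers side by side, as a sketch. Note that the relative order of `setTimeout(fn, 0)` and `setImmediate()` called from the main module is not guaranteed; only `process.nextTick` is deterministic here (same cycle, before either):

```javascript
const order = []

setTimeout(() => order.push('setTimeout 0'), 0)
setImmediate(() => order.push('setImmediate'))
process.nextTick(() => order.push('nextTick'))
order.push('sync code')

// 'sync code' is first, then 'nextTick' (same cycle);
// 'setTimeout 0' vs 'setImmediate' order can vary run to run
setTimeout(() => console.log(order), 20)
```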
nextTick Usage
All callbacks passed to process.nextTick() will be resolved before the event loop continues
- To emit an event after `.on()`
- To make some sync code async
Event Emit nextTick Example in http
In http, to make sure event listeners are attached before emitting error (or anything else) source:

```js
if (err) {
  process.nextTick(() => this.emit('error', err));
  return;
}
```
Async or Sync Error Handling in fs
To postpone the callback if it's set (async) or throw the error right away (sync) source:

```js
function handleError(val, callback) {
  if (val instanceof Error) {
    if (typeof callback === 'function') {
      process.nextTick(callback, val);
      return true;
    } else throw val;
  }
  return false;
}
```
Async Code Syntax
- Just Callbacks: code and data are arguments
- Promises: code is separate from data
- Generators and Async/await: look like sync but actually async
Error-First Callback
Define your async function:

```js
const myFn = (cb) => {
  // Define error and data
  // Do something...
  cb(error, data)
}
```
Error-First Callback Usage
Use your function:

```js
myFn((error, data) => {
})
```
Arguments Naming
Argument names don't matter but the order does: put errors first and callbacks last:

```js
myFn((err, result) => {
})
```
Error-First
Errors first but the callback last
Popular convention but not enforced by Node
Arguments Order and Callback-First?
Some functions don't follow error-first and use callback-first, e.g., `setTimeout(fn, time)`.

Callback-First
With the ES6 rest operator, it might make sense to start using the callback-first style more, because rest can only be the last parameter, e.g.:

```js
const myFn = (cb, ...options) => {
}
```
How to Know a Function's Signature
- You created it, so you should know
- Someone else created it: get to know others' modules by reading the source code, checking the documentation, and reading examples, tests, and tutorials
Callbacks Are Not Always Async
Sync code can also take a function as an argument:

```js
const arr = [1, 2, 3]
arr.map((item, index, list) => {
  return item*index // called arr.length times
})
```
Promises
Externalize the callback code and separate it from the data arguments

Promises for Developers
- Consume a ready promise from a library/module (`axios`, `koa`, etc.) - most likely
- Create your own using ES6 Promise or a library (bluebird or q) - less likely
Usage and Consumption of Ready Promises
Callbacks Syntax
Where to put the callback, and does the error argument go first?

```js
asyncFn1((error1, data1) => {
  asyncFn2(data1, (error2, data2) => {
    asyncFn3(data2, (error3, data3) => {
      asyncFn4(data3, (error4, data4) => {
        // Do something with data4
      })
    })
  })
})
```
Promise Syntax
Clear separation of data and control flow arguments:

```js
promise1(data1)
  .then(promise2)
  .then(promise3)
  .then(promise4)
  .then(data4 => {
    // Do something with data4
  })
  .catch(error => {
    // handle error1, 2, 3 and 4
  })
```
Axios GET Example

```js
const axios = require('axios')
axios.get('http://azat.co')
  .then((response) => response.data)
  .then(html => console.log(html))
```
Axios GET Error Example

```js
const axios = require('axios')
axios.get('https://azat.co') // https will cause an error!
  .then((response) => response.data)
  .then(html => console.log(html))
  .catch(e => console.error(e))
```

```
Error: Hostname/IP doesn't match certificate's altnames: "Host: azat.co. is not in the cert's altnames: DNS:.github.com, DNS:github.com, DNS:.github.io, DNS:github.io"
```
Let's implement our own naive promise.
We can learn how easy promises are, and this is an advanced course after all, so why not?
Naive Promise: Callback Async Function

```js
function myAsyncTimeoutFn(data, callback) {
  setTimeout(() => {
    callback()
  }, 1000)
}

myAsyncTimeoutFn('just a silly string argument', () => {
  console.log('Final callback is here')
})
```
Naive Promise: Implementation

```js
function myAsyncTimeoutFn(data) {
  let _callback = null
  setTimeout(() => {
    if (_callback) _callback()
  }, 1000)
  return {
    then(cb) {
      _callback = cb
    }
  }
}

myAsyncTimeoutFn('just a silly string argument').then(() => {
  console.log('Final callback is here')
})
```
Naive Promise: Implementation with Errors

```js
const fs = require('fs')

function readFilePromise(filename) {
  let _callback = () => {}
  let _errorCallback = () => {}
  fs.readFile(filename, (error, buffer) => {
    if (error) _errorCallback(error)
    else _callback(buffer)
  })
  return {
    then(cb, errCb) {
      _callback = cb
      _errorCallback = errCb
    }
  }
}
```
Naive Promise: Reading File

```js
readFilePromise('package.json').then( buffer => {
  console.log( buffer.toString() )
  process.exit(0)
}, err => {
  console.error( err )
  process.exit(1)
})
```
Naive Promise: Triggering Error

```js
readFilePromise('package.jsan').then( buffer => {
  console.log( buffer.toString() )
  process.exit(0)
}, err => {
  console.error( err )
  process.exit(1)
})
```

```
{ Error: ENOENT: no such file or directory, open 'package.jsan'
  errno: -2,
  code: 'ENOENT',
  syscall: 'open',
  path: 'package.jsan' }
```
Creating Promises Using The Standard ES6/ES2015 Promise

ES6/ES2015 Promise in Node
Node version 8+ (v8 not V8):

```js
Promise === global.Promise
```

ES6 `Promise` takes a callback with `resolve` and `reject`
Simple Promise Implementation with ES6/ES2015

```js
const fs = require('fs')

function readJSON(filename, enc = 'utf8') {
  return new Promise(function (resolve, reject) {
    fs.readFile(filename, enc, function (err, res) {
      if (err) reject(err)
      else {
        try {
          resolve(JSON.parse(res))
        } catch (ex) {
          reject(ex)
        }
      }
    })
  })
}

readJSON('./package.json').then(console.log)
```
Advanced Promise Implementation with ES6/ES2015 for Both Promises and Callbacks

```js
const fs = require('fs')

const readFileIntoArray = function (file, cb = null) {
  return new Promise((resolve, reject) => {
    fs.readFile(file, (error, data) => {
      if (error) {
        if (cb) return cb(error)
        return reject(error)
      }
      const lines = data.toString().trim().split('\n')
      if (cb) return cb(null, lines)
      else return resolve(lines)
    })
  })
}
```
Example Calls with `then` and a Callback

```js
const printLines = (lines) => {
  console.log(`There are ${lines.length} line(s)`)
  console.log(lines)
}
const FILE_NAME = __filename

readFileIntoArray(FILE_NAME)
  .then(printLines)
  .catch(console.error)

readFileIntoArray(FILE_NAME, (error, lines) => {
  if (error) return console.error(error)
  printLines(lines)
})
```
Event Emitters
- Import: `require('events')`
- Extend: `class Name extends ...`
- Instantiate: `new Name()`
- Add listeners: `.on()`
- Emit: `.emit()`
Emitting Outside Event Emitter Class

```js
const events = require('events')

class Encrypt extends events {
  constructor(ops) {
    super(ops)
    this.on('start', () => {
      console.log('beginning A')
    })
    this.on('start', () => {
      console.log('beginning B')
    })
  }
}

const encrypt = new Encrypt()
encrypt.emit('start')
```
Emitting Outside and Inside

```js
const events = require('events')

class Encrypt extends events {
  constructor(ops) {
    super(ops)
    this.on('start', () => {
      console.log('beginning A')
    })
    this.on('start', () => {
      console.log('beginning B')
      setTimeout(() => {
        this.emit('finish', {msg: 'ok'})
      }, 0)
    })
  }
}

const encrypt = new Encrypt()
encrypt.on('finish', (data) => {
  console.log(`Finished with message: ${data.msg}`)
})
encrypt.emit('start')
```
Working with Events
Events are about building extensible functionality and making modular code flexible
- `.emit()` can be in the module and `.on()` in the main program which consumes the module
- `.on()` can be in the module and `.emit()` in the main program, and in the constructor or in an instance
- Pass data with `emit()`
- `error` is a special event (if you listen to it, then no crashes)
- `on()` callbacks execute in the order in which they are defined (see `prependListener` or `removeListener`)
[.autoscale: true]
Default Max Event Listeners
The default maximum number of listeners is 10 (to find memory leaks); use `setMaxListeners` source:

```js
var defaultMaxListeners = 10;
...
EventEmitter.prototype.setMaxListeners = function setMaxListeners(n) {
  if (typeof n !== 'number' || n < 0 || isNaN(n)) {
    const errors = lazyErrors();
    throw new errors.RangeError('ERR_OUT_OF_RANGE', 'n',
                                'a non-negative number', n);
  }
  this._maxListeners = n;
  return this;
};
```
Promises vs Events
- Events are synchronous while Promises are typically asynchronous
- Events can react to the same event from multiple places, a Promise just from one call
- Events can react to the same event multiple times, `then` just once
nextTick in a Class
Again, nextTick helps to emit events later, such as in a class constructor:

```js
class Encrypt extends events {
  constructor() {
    super()
    process.nextTick(() => { // otherwise, emit will happen before .on('ready')
      this.emit('ready', {})
    })
  }
}

const encrypt = new Encrypt()
encrypt.on('ready', (data) => {})
```
Async/await

How Developers Use Async/await
- Consume ready async/await functions from libraries which support it - often
- Create your own from callbacks or promises - not often (Node's `util.promisify`)

You need Node v8+ for both
Consuming Async Fn from axios

```js
const axios = require('axios')
const getAzatsWebsite = async () => {
  const response = await axios.get('http://azat.co')
  return response.data
}
getAzatsWebsite().then(console.log)
```
util.promisify

```js
const fs = require('fs')
const util = require('util')

const f = async function () {
  try {
    const data = await util.promisify(fs.readFile)('os.js', 'utf8') // <- try changing to a non-existent file to trigger an error
    console.log(data)
  } catch (e) {
    console.log('ooops')
    console.error(e)
    process.exit(1)
  }
}

f()
console.log('could be doing something else')
```

(Can be used with plain Promises as well)
Consuming Async Fn from mocha and axios

```js
const axios = require('axios')
const {expect} = require('chai')
const app = require('../server.js')
const port = 3004

before(async function() {
  await app.listen(port, () => {
    console.log('server is running')
  })
  console.log('code after the server is running')
})
```
Consuming Async Fn from mocha and axios (Cont)

```js
describe('express rest api server', async () => {
  let id
  it('posts an object', async () => {
    const {data: body} = await axios
      .post(`http://localhost:${port}/collections/test`,
        { name: 'John', email: '[email protected]'})
    expect(body.length).to.eql(1)
    expect(body[0]._id.length).to.eql(24)
    id = body[0]._id
  })
  it('retrieves an object', async () => {
    const {data: body} = await axios
      .get(`http://localhost:${port}/collections/test/${id}`)
    expect(typeof body).to.eql('object')
    expect(body._id.length).to.eql(24)
    expect(body._id).to.eql(id)
    expect(body.name).to.eql('John')
  })
  // ...
})
```
Project: Avatar Service
Koa Server with Mocha, Async/await Fns and Promise.all

Terminal:

```
cd code
cd koa-rest
npm i
npm start
```

Open in a browser: http://localhost:3000/?email=YOURMAIL, e.g., http://localhost:3000/[email protected] to see your avatar (powered by Gravatar)
Module 3: Streaming
Abstractions for continuous chunking of data, or simply data which is not available all at once and which does NOT require too much memory.
No need to wait for the entire resource to load
Types of Streams
- Readable, e.g., `fs.createReadStream`
- Writable, e.g., `fs.createWriteStream`
- Duplex, e.g., `net.Socket`
- Transform, e.g., `zlib.createGzip`
Streams Inherit from Event Emitter
Streams are Everywhere!
- HTTP requests and responses
- Standard input/output (stdin&stdout)
- File reads and writes
Readable Stream Example
`process.stdin`: standard input streams contain data going into applications.
- Event: `on('data')`
- `read()` method

Input typically comes from the keyboard used to start the process.
To listen in on data from stdin, use the `data` and `end` events:

```js
// stdin.js
process.stdin.resume()
process.stdin.setEncoding('utf8')

process.stdin.on('data', function (chunk) {
  console.log('chunk: ', chunk)
})

process.stdin.on('end', function () {
  console.log('--- END ---')
})
```
Readable stdin Stream Demo

```
$ node stdin.js
```
read() Interface

```js
var readable = getReadableStreamSomehow()
readable.on('readable', () => {
  var chunk
  while (null !== (chunk = readable.read())) { // SYNC!
    console.log('got %d bytes of data', chunk.length)
  }
})
```
^readable.read is sync but the chunks are small
Writable Stream Example
- `process.stdout`: standard output streams contain data going out of the applications
- `response` (server request handler response)
- `request` (client request)

More on networking in the next module
Write to Writable Stream
Use the `write()` method:

```js
process.stdout.write('A simple message\n')
```

Data written to standard output is visible on the command line.
Writable stdout Stream Demo

```
node stdout.js
```
Pipe

```js
source.pipe(destination)
```

source: readable or duplex; destination: writable, transform or duplex
Linux vs Node Piping
Linux shell:

```
operationA | operationB | operationC | operationD
```

Node:

```js
streamA.pipe(streamB).pipe(streamC).pipe(streamD)
```

or

```js
streamA.pipe(streamB)
streamB.pipe(streamC)
streamC.pipe(streamD)
```

How `pipe` really works: the readable source will be paused if the queue for the writable/transform/duplex destination stream is full. Otherwise, the readable will be resumed and read. source
[.hide-footer]

```
                                                     +===================+
                         x-->  Piping functions      +-->   src.pipe(dest)  |
                         x     are set up during     |===================|
                         x     the .pipe method.     |  Event callbacks  |
  +===============+      x                           |-------------------|
  |   Your Data   |      x     They exist outside    | .on('close', cb)  |
  +=======+=======+      x     the data flow, but    | .on('data', cb)   |
          |              x     importantly attach    | .on('drain', cb)  |
          |              x     events, and their     | .on('unpipe', cb) |
+---------v---------+    x     respective callbacks. | .on('error', cb)  |
|  Readable Stream  +----+                           | .on('finish', cb) |
+-^-------^-------^-+    |                           | .on('end', cb)    |
  ^       |       ^      |                           +-------------------+
  |       |       |      |
  |       ^       |      |
  ^       ^       ^      |    +-------------------+         +=================+
  ^       |       ^      +---->  Writable Stream  +--------->  .write(chunk)  |
  |       |       |           +-------------------+         +=======+=========+
  |       |       |                                                 |
  |       ^       |                              +------------------v---------+
  ^       |       +-> if (!chunk)                |    Is this chunk too big?  |
  ^       |       |     emit .end();             |    Is the queue busy?      |
  |       |       +-> else                       +-------+----------------+---+
  |       ^       |     emit .write();                   |                |
  |       ^       ^                                   +--v---+        +---v---+
  |       |       ^-----------------------------------<  No  |        |  Yes  |
  ^       |                                           +------+        +---v---+
  ^       |                                                               |
  |       ^           emit .pause();          +=================+         |
  |       ^---------------^-----------------------+  return false;  <-----+---+
  |                                           +=================+         |
  |                                                                       |
  ^            when queue is empty     +============+                     |
  ^------------^-----------------------<  Buffering |                     |
               |                       |============|                     |
               +> emit .drain();       |  ^Buffer^  |                     |
               +> emit .resume();      +------------+                     |
                                       |  ^Buffer^  |                     |
                                       +------------+   add chunk to queue |
                                       |            <---^---------------------<
                                       +============+
```
Pipe and Transform
Encrypts and zips:

```js
const r = fs.createReadStream('file.txt')
const e = crypto.createCipher('aes256', SECRET)
const z = zlib.createGzip()
const w = fs.createWriteStream('file.txt.gz')
r.pipe(e).pipe(z).pipe(w)
```

^Readable.pipe takes a writable and returns the destination
Readable Streams Events
- data
- end
- error
- close
- readable
Readable Streams Methods
- `pipe()`
- `unpipe()`
- `read()`
- `unshift()`
- `resume()`
- `pause()`
- `isPaused()`
- `setEncoding()`
Writable Streams Events
- drain
- finish
- error
- close
- pipe
- unpipe
Writable Streams Methods
- `write()`
- `end()`
- `cork()`
- `uncork()`
- `setDefaultEncoding()`
With pipe, we can listen to events too!

```js
const r = fs.createReadStream('file.txt')
const e = crypto.createCipher('aes256', SECRET)
const z = zlib.createGzip()
const w = fs.createWriteStream('file.txt.gz')
r.pipe(e)
  .pipe(z).on('data', () => process.stdout.write('.')) // progress dot "."
  .pipe(w).on('finish', () => console.log('all is done!')) // when all is done
```
Readable Stream Modes
- paused: `stream.read()` - safe; switch to flowing with `stream.resume()`
- flowing: EventEmitter - data can be lost if there are no listeners or they are not ready; switch to paused with `stream.pause()`
What about HTTP?
Core http uses Streams!

```js
const http = require('http')
var server = http.createServer( (req, res) => {
  req.setEncoding('utf8')
  req.on('data', (chunk) => { // readable
    processDataChunk(chunk) // This function is defined somewhere else
  })
  req.on('end', () => {
    res.write('ok') // writable
    res.end()
  })
})
server.listen(3000)
```
Streaming for Servers
`streams/large-file-server.js`:

```js
const path = require('path')
const fileName = path.join(
  __dirname, process.argv[2] || 'webapp.log') // 67Mb
const fs = require('fs')
const server = require('http').createServer()

server.on('request', (req, res) => {
  if (req.url === '/callback') {
    fs.readFile(fileName, (err, data) => {
      if (err) return console.error(err)
      res.end(data)
    })
  } else if (req.url === '/stream') {
    const src = fs.createReadStream(fileName)
    src.pipe(res)
  }
})

server.listen(3000)
```
Create a Stream
Before, we were consuming streams; now let's create our own. This is a Sparta advanced course after all!

```js
const stream = require('stream')
const writable = new stream.Writable({...})
const readable = new stream.Readable({...})
const transform = new stream.Transform({...})
const duplex = new stream.Duplex({...})
```

or

```js
const {Writable} = require('stream')
const writable = new Writable({...})
```
Create a Writable Stream

```js
const translateWritableStream = new Writable({
  write(chunk, encoding, callback) {
    translate(chunk.toString(), {to: 'en'}).then(res => {
      console.log(res.text)
      //=> I speak English
      console.log(res.from.language.iso)
      //=> nl
      callback()
    }).catch(err => {
      console.error(err)
      callback()
    })
  }
})
```

(`streams/writable-translate.js`)
Creating Readable

```js
const {Readable} = require('stream')
const Web3 = require('web3')
const web3 = new Web3(new Web3.providers.HttpProvider("https://mainnet.infura.io/jrrVdXuXrVpvzsYUkCYq"))

const latestBlock = new Readable({
  read(size) {
    web3.eth.getBlock('latest')
      .then((x) => {
        // console.log(x.timestamp)
        this.push(`${x.hash}\n`)
        // this.push(null)
      })
  }
})

latestBlock.pipe(process.stdout)
```
Creating Duplex

```js
const {Duplex} = require('stream')
const MyDuplex = new Duplex({
  write(chunk, encoding, callback) {
    callback()
  },
  read(size) {
    this.push(data) // data defined elsewhere
    this.push(null)
  }
})
```
Creating Transform

```js
const {Transform} = require('stream')
const MyTransform = new Transform({
  transform(chunk, encoding, callback) {
    this.push(data)
    callback()
  }
})
```
Transform Real Life Example: Zlib from Node Core
source

```js
Zlib.prototype._transform = function _transform(chunk, encoding, cb) {
  // If it's the last chunk, or a final flush, we use the Z_FINISH flush flag
  // (or whatever flag was provided using opts.finishFlush).
  // If it's explicitly flushing at some other time, then we use
  // Z_FULL_FLUSH. Otherwise, use the original opts.flush flag.
  var flushFlag;
  var ws = this._writableState;
  if ((ws.ending || ws.ended) && ws.length === chunk.byteLength) {
    flushFlag = this._finishFlushFlag;
  } else {
    flushFlag = this._flushFlag;
    // once we've flushed the last of the queue, stop flushing and
    // go back to the normal behavior.
    if (chunk.byteLength >= ws.length)
      this._flushFlag = this._origFlushFlag;
  }
  processChunk(this, chunk, flushFlag, cb);
};
```
Backpressure
- Data clogs
- Reading is typically faster than writing
- Backpressure is bad for memory exhaustion and GC (triggering GC too often is expensive)
- Streams and Node solve backpressure automatically by pausing the source (read) stream if needed
- `highWaterMark` option, defaults to 16kb
Overwrite Streams
Since Node.js v0.10, the Stream class has offered the ability to modify the behavior of `.read()` or `.write()` by implementing the underscore versions of these respective functions (`._read()` and `._write()`).
Module 4: Networking

net
Any server, not just http or https!

```js
const server = require('net').createServer()

server.on('connection', socket => {
  socket.write('Enter your command: ') // Sent to client
  socket.on('data', data => {
    // Incoming data from a client
  })
  socket.on('end', () => {
    console.log('Client disconnected')
  })
})

server.listen(3000, () => console.log('Server bound'))
```
Chat
`chat.js`:

```js
if (!sockets[socket.id]) {
  socket.name = data.toString().trim()
  socket.write(`Welcome ${socket.name}!\n`)
  sockets[socket.id] = socket
  return
}

Object.entries(sockets).forEach(([key, cs]) => {
  if (socket.id === key) return
  cs.write(`${socket.name} ${timestamp()}: `)
  cs.write(data)
})
```
Client?
telnet localhost 3000
or
nc localhost 3000
or write your own TCP/IP client using Node, C++, Python, etc.
Bitcoin Price Ticker
node code/bitcoin-price-ticker.js
Ticker Server
const https = require('https')
const server = require('net').createServer()
let counter = 0
let sockets = {}
server.on('connection', socket => {
socket.id = counter++
console.log('Welcome to Bitcoin Price Ticker (Data by Coindesk)')
console.log(`There are ${counter} clients connected`)
socket.write('Enter currency code (e.g., USD or CNY): ')
socket.on('data', data => {
// process data from the client
})
socket.on('end', () => {
delete sockets[socket.id]
console.log('Client disconnected')
})
})
server.listen(3000, () => console.log('Server bound'))
Processing Data from the Client
let currency = data.toString().trim()
if (!sockets[socket.id]) {
sockets[socket.id] = {
currency: currency
}
console.log(currency)
}
fetchBTCPrice(currency, socket)
clearInterval(sockets[socket.id].interval)
sockets[socket.id].interval = setInterval(()=>{
fetchBTCPrice(currency, socket)
}, 5000)
Coindesk API (HTTPS!)
Making a request to the API: https://api.coindesk.com/v1/bpi/currentprice/.json
https://api.coindesk.com/v1/bpi/currentprice/USD.json
https://api.coindesk.com/v1/bpi/currentprice/JPY.json
https://api.coindesk.com/v1/bpi/currentprice/RUB.json
https://api.coindesk.com/v1/bpi/currentprice/NYC.json
Response
{
"time": {
"updated": "Jan 9, 2018 19:52:00 UTC",
"updatedISO": "2018-01-09T19:52:00+00:00",
"updateduk": "Jan 9, 2018 at 19:52 GMT"
},
"disclaimer": "This data was produced from the CoinDesk
Bitcoin Price Index (USD). Non-USD currency data
converted using hourly conversion rate from openexchangerates.org",
"bpi": {
"USD": {
"code": "USD",
"rate": "14,753.6850",
"description": "United States Dollar",
"rate_float": 14753.685
}
}
}
HTTPS GET
const fetchBTCPrice = (currency, socket) => {
const req = https.request({
port: 443,
hostname: 'api.coindesk.com',
method: 'GET',
path: `/v1/bpi/currentprice/${currency}.json`
}, (res) => {
let data = ''
res.on('data', (chunk) => {
data += chunk
})
res.on('end', () => {
socket.write(`1 BTC is ${JSON.parse(data).bpi[currency].rate} ${currency}\n`)
})
})
req.end()
}
Client
telnet localhost 3000
Trying 127.0.0.1...
Connected to localhost.
Escape character is '^]'.
Enter currency code (e.g., USD or CNY): USD
1 BTC is 14,707.9438 USD
1 BTC is 14,694.5113 USD
1 BTC is 14,694.5113 USD
CNY
1 BTC is 40,202.5000 CNY
RUB
1 BTC is 837,400.5342 RUB
1 BTC is 837,400.5342 RUB
1 BTC is 837,400.5342 RUB
http
Protected SQL archive (file-server/file-server.js):
const url = require('url')
const fs = require('fs')
const SECRET = process.env.SECRET
const server = require('http').createServer((req, res) => {
console.log(`URL is ${req.url} and the method is ${req.method}`)
const course = req.url.match(/courses\/([0-9]*)/) // works for /courses/123 to get 123
const query = url.parse(req.url, true).query // works for /?key=value&key2=value2
if (course && course[1] && query.API_KEY === SECRET) {
fs.readFile('./clients_credit_card_archive.sql', (error, data)=>{
if (error) {
res.writeHead(500)
res.end('Server error')
} else {
res.writeHead(200, {'Content-Type': 'text/plain' })
res.end(data)
}
})
} else {
res.writeHead(404)
res.end('Not found')
}
}).listen(3000, () => {
console.log('server is listening on 3000')
})
HTTP File Server
Command to run the server:
SECRET=NNN nodemon file-server.js
Browser request: http://localhost:3000/courses/123?API_KEY=NNN
HTTP Routing
You can use switch...
const server = require('http').createServer((req, res) => {
switch (req.url) {
case '/api':
res.writeHead(200, { 'Content-Type': 'application/json' })
// fetch data from a database
res.end(JSON.stringify(data))
break
case '/home':
res.writeHead(200, { 'Content-Type': 'text/html' })
// send html from a file
res.end(html)
break
default:
res.writeHead(404)
res.end()
}
}).listen(3000, () => {
console.log('server is listening on 3000')
})
HTTP Routing Puzzle
Find a problem with this server (from Advanced Node by Samer Buna):
const fs = require('fs')
const server = require('http').createServer()
const data = {}
server.on('request', (req, res) => {
switch (req.url) {
case '/api':
res.writeHead(200, { 'Content-Type': 'application/json' })
res.end(JSON.stringify(data))
break
case '/home':
case '/about':
res.writeHead(200, { 'Content-Type': 'text/html' })
res.end(fs.readFileSync(`.${req.url}.html`))
break
case '/':
res.writeHead(301, { 'Location': '/home' })
res.end()
break
default:
res.writeHead(404)
res.end()
}
})
server.listen(3000)
Puzzle Answer
Always reading (no caching) and blocking!
case '/about':
res.writeHead(200, { 'Content-Type': 'text/html' })
res.end(fs.readFileSync(`.${req.url}.html`))
break
Use HTTP Status Codes
http.STATUS_CODES
https Module
Core Server needs the key and certificate files:
openssl req -x509 -newkey rsa:2048 -nodes -sha256 -subj '/C=US/ST=CA/L=SF/O=NO\x08A/OU=NA' \
-keyout server.key -out server.crt
HTTPS Server with Core https Module
const https = require('https')
const fs = require('fs')
const server = https.createServer({
key: fs.readFileSync('server.key'),
cert: fs.readFileSync('server.crt')
}, (req, res) => {
res.writeHead(200)
res.end('hello')
}).listen(443)
https Request with Streaming
const https = require('https')
const req = https.request({
hostname: 'webapplog.com',
port: 443,
path: '/',
method: 'GET'
}, (res) => {
console.log('statusCode:', res.statusCode)
console.log('headers:', res.headers)
res.on('data', (chunk) => {
process.stdout.write(chunk)
})
})
req.on('error', (error) => {
console.error(error)
})
req.end()
HTTP/2 with http2
Generating Self-Signed SSL
openssl req -x509 -newkey rsa:2048 -nodes -sha256 -subj '/C=US/ST=CA/L=SF/O=NO\x08A/OU=NA' \
-keyout server.key -out server.crt
Using Core http2 Module
const http2 = require('http2')
const fs = require('fs')
const server = http2.createSecureServer({
key: fs.readFileSync('server.key'),
cert: fs.readFileSync('server.crt')
}, (req, res) => {
res.writeHead(200, {'Content-Type': 'text/plain' })
res.end('<h1>Hello World</h1>') // JUST LIKE HTTP!
})
server.on('error', (err) => console.error(err))
server.listen(3000)
Running H2 Hello Server
cd code
cd http2
node h2-hello.js
Browser: https://localhost:3000
Terminal:
curl https://localhost:3000/ -vik
curl https://localhost:3000/ -vik
Trying 127.0.0.1...
* Connected to localhost (127.0.0.1) port 3000 (#0)
* ALPN, offering h2
* ALPN, offering http/1.1
* Cipher selection:
...
* SSL connection using TLSv1.2 / ECDHE-RSA-AES128-GCM-SHA256
* ALPN, server accepted to use h2
* Server certificate:
* subject: C=US; ST=CA; L=SF; O=NOx08A; OU=NA
* Using HTTP2, server supports multi-use
* Connection state changed (HTTP/2 confirmed)
Using Core http2 Module with Stream
const http2 = require('http2')
const fs = require('fs')
const server = http2.createSecureServer({
key: fs.readFileSync('server.key'),
cert: fs.readFileSync('server.crt')
})
server.on('error', (err) => console.error(err))
server.on('socketError', (err) => console.error(err))
server.on('stream', (stream, headers) => {
// stream is a Duplex
stream.respond({
'content-type': 'text/html',
':status': 200
})
stream.end('<h1>Hello World</h1>')
})
server.listen(3000)
WTF is http2 Server Push?
Example: index.html refers to four static assets
HTTP/1: server requires five requests from a client:
- index.html
- style.css
- bundle.js
- favicon.ico
- logo.png
Example: index.html refers to four static assets (Cont)
HTTP/2: server with server push requires just one request from a client:
- index.html
- style.css
- bundle.js
- favicon.ico
- logo.png
HTML and assets are pushed by the server but assets are not used unless referred to by HTML.
Let's implement some server push!
Start with a Normal H2 Server
const http2 = require('http2')
const fs = require('fs')
const server = http2.createSecureServer({
key: fs.readFileSync('server.key'),
cert: fs.readFileSync('server.crt')
})
server.on('error', (err) => console.error(err))
server.on('socketError', (err) => console.error(err))
Use Stream and pushStream
server.on('stream', (stream, headers) => {
stream.respond({
'content-type': 'text/html',
':status': 200
})
stream.pushStream({ ':path': '/myfakefile.js' }, (pushStream) => {
pushStream.respond({
'content-type': 'text/javascript',
':status': 200
})
pushStream.end(`alert('you win')`)
})
stream.end('<script src="/myfakefile.js"></script><h1>Hello World</h1>')
})
server.listen(3000)
Additional server push articles
- What’s the benefit of Server Push?
- Announcing Support for HTTP/2 Server Push
- Innovating with HTTP 2.0 Server Push
Advanced Express REST API Routing in HackHall Demo
Conclusion
Just don't use core http directly. Use Express, Hapi or Koa.
Module 5: Debugging
Debugging Strategies
- Don't guess and don't think too much
- Isolate (use binary search)
- Watch/check values
- Trial and error
- Full Stack overflow development (skip question, read answers)
- Read source code, docs can be outdated or subpar
console.log is one of the best debuggers
- Not breaking the execution flow
- Nothing extra needed (unlike Node Inspector/DevTools or VS Code)
- Robust: clearly shows if a line is executed
- Clearly shows data
Console Tricks
Streaming Logs to Files
const fs = require('fs')
const out = fs.createWriteStream('./out.log')
const err = fs.createWriteStream('./err.log')
const console2 = new console.Console(out, err)
setInterval(() => {
console2.log(new Date())
console2.error(new Error('Whoops'))
}, 500)
Console Parameters
console.log('Step', 2) // Step 2
const name = 'Azat'
const city = 'San Francisco'
console.log('Hello %s from %s', name, city)
util.format and util.inspect
const util = require('util')
console.log(util.format('Hello %s from %s', name, city))
// Hello Azat from San Francisco
console.log('Hello %s from %s', 'Azat', {city: 'San Francisco'})
// Hello Azat from [object Object]
console.log({city: 'San Francisco'})
// { city: 'San Francisco' }
console.log(util.inspect({city: 'San Francisco'}))
// { city: 'San Francisco' }
console.dir()
const str = util.inspect(global, {depth: 0})
console.dir(global, {depth: 0})
console.info = console.log
console.warn = console.error
console.trace() // prints call stack
console.assert() // uses require('assert')
Console Timers
console.log('Ethereum transaction started')
console.time('Ethereum transaction')
web3.send(txHash, (error, results)=>{
console.timeEnd('Ethereum transaction')
// Ethereum transaction: 4545.921ms
})
REPL Tricks (which can be used for quick testing and debugging)
- Core modules are there already
- You can load any module with require() (must be installed with proper path)
- You can see all your sessions' histories in ~/.node_repl_history, e.g., cat ~/.node_repl_history or tail ~/.node_repl_history
REPL Commands
.break
: When in the process of inputting a multi-line expression, entering the .break command (or pressing Ctrl+C) will abort further input or processing of that expression.
.clear
: Resets the REPL context to an empty object and clears any multi-line expression currently being input.
.exit
: Close the I/O stream, causing the REPL to exit.
.help
: Show this list of special commands.
.save
: Save the current REPL session to a file: > .save ./file/to/save.js
.load
: Load a file into the current REPL session. > .load ./file/to/load.js
Editing in REPL: .editor
- Enter editor mode (Ctrl+D to finish, Ctrl+C to cancel)
> .editor
// Entering editor mode (^D to finish, ^C to cancel)
function welcome(name) {
return `Hello ${name}!`;
}
welcome('Node.js User');
// ^D
'Hello Node.js User!'
>
Real Debuggers
- CLI
- DevTools
- VS Code
Node CLI Debugger
$ node inspect debug-me.js
< Debugger listening on ws://127.0.0.1:9229/80e7a814-7cd3-49fb-921a-2e02228cd5ba
< For help see https://nodejs.org/en/docs/inspector
< Debugger attached.
Break on start in debug-me.js:1
> 1 (function (exports, require, module, __filename, __dirname) { global.x = 5;
2 setTimeout(() => {
3 console.log('world');
debug>
Node CLI Debugger (Cont)
Stepping:
cont, c - Continue execution
next, n - Step next
step, s - Step in
out, o - Step out
pause - Pause running code (like pause button in Developer Tools)
Node V8 Inspector
$ node --inspect index.js
Debugger listening on 127.0.0.1:9229.
To start debugging, open the following URL in Chrome:
chrome-devtools://devtools/bundled/inspector.html?experiments=true&v8only=true&ws=127.0.0.1:9229/dc9010dd-f8b8-4ac5-a510-c1a114ec7d29
Better to break right away:
node --inspect-brk debug-me.js
Old (deprecated):
node --inspect --debug-brk index.js
Node V8 Inspector Demo
VS Code Demo
CPU profiling
Networking Debugging with DevTools
V8 Memory Scheme
Resident Set:
- Code: Node/JS code
- Stack: Primitives, local variables, pointers to objects in the heap and control flow
- Heap: Referenced types such as Objects, strings, closures
process.memoryUsage()
{ rss: 12476416,
heapTotal: 7708672,
heapUsed: 5327904,
external: 8639 }
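Those values are in bytes; a small helper (`toMB()` is our own, not a Node API) makes them readable:

```javascript
// Convert process.memoryUsage() bytes into megabytes for readability
const toMB = bytes => `${(bytes / 1024 / 1024).toFixed(1)} MB`

for (const [key, bytes] of Object.entries(process.memoryUsage())) {
  console.log(`${key}: ${toMB(bytes)}`) // e.g., rss: 11.9 MB
}
```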
Heap
- New Space (Young Generation): new allocations, size 1-8MB, fast collection (Scavenge), ~20% promoted into Old Space
- Old Space (Old Generation): allocation is fast but collection is expensive (Mark-Sweep)
Garbage Collection
The mechanism that allocates and frees heap memory is called garbage collection.
Garbage Collection (Cont)
- Automatic in Node, thanks to V8
- Stops the world - expensive
- Objects with refs are not collected (memory leaks)
Memory Leak
Leaky Server
const express = require('express')
const app = express()
let cryptoWallet = {}
const generateAddress = () => {
const initialCryptoWallet = cryptoWallet
const tempCryptoWallet = () => {
if (initialCryptoWallet) console.log('We received initial cryptoWallet')
}
cryptoWallet = {
key: new Array(1e7).join('.'),
address: () => {
// ref to tempCryptoWallet ???
console.log('address returned')
}
}
}
app.get('*', (req, res) => {
generateAddress()
console.log( process.memoryUsage())
return res.json({msg: 'ok'})
})
app.listen(3000)
Starting the LEAK
node leaky-server/server.js
loadtest -c 100 --rps 100 http://localhost:3000
{ rss: 1395490816,
heapTotal: 1469087744,
heapUsed: 1448368200,
external: 16416 }
{ rss: 1405501440,
heapTotal: 1479098368,
heapUsed: 1458377224,
external: 16416 }
{ rss: 1335377920,
heapTotal: 1409097728,
heapUsed: 1388386720,
external: 16416 }
GCs
<--- Last few GCs --->
[35417:0x103000c00] 36302 ms: Mark-sweep 1324.1 (1345.3) -> 1324.1 (1345.3) MB, 22.8 / 0.0 ms allocation failure GC in old space requested
[35417:0x103000c00] 36328 ms: Mark-sweep 1324.1 (1345.3) -> 1324.1 (1330.3) MB, 26.4 / 0.0 ms last resort GC in old space requested
[35417:0x103000c00] 36349 ms: Mark-sweep 1324.1 (1330.3) -> 1324.1 (1330.3) MB, 20.9 / 0.0 ms last resort GC in old space requested
Line 12
==== JS stack trace =========================================
Security context: 0x3c69fae25ee1 <JSObject>
2: generateAddress [/Users/azat/Documents/Code/node-advanced/code/leaky-server/server.js:12] [bytecode=0x3c69df959db9 offset=42](this=0x3c69a7f0c0b9 <JSGlobal Object>)
4: /* anonymous */ [/Users/azat/Documents/Code/node-advanced/code/leaky-server/server.js:20] [bytecode=0x3c69df959991 offset=7](this=0x3c69a7f0c0b9 <JSGlobal Object>,req=0x3c69389c07c1 <IncomingMessage map = 0x3c693e7300f1...
FATAL ERROR: CALL_AND_RETRY_LAST Allocation failed - JavaScript heap out of memory
Memory Leak Mitigation
- Reproduce the error/leak
- Check for variables and fn arguments, pure fns are better
- Take heap dumps and compare (with debug and DevTools or heapdump modules)
- Update Node
- Get rid of extra npm modules
- Trial and error: remove code you think is leaky
- Modularize & refactor
Useful Libraries
- https://www.npmjs.com/package/memwatch-next
- https://www.npmjs.com/package/systeminformation
- https://github.com/bnoordhuis/node-heapdump
Heap Dumping
code/leaky-server/server-heapdump.js:
// ...
const heapdump = require('heapdump')
setInterval(function () {
heapdump.writeSnapshot()
}, 2 * 1000)
// ...
Creates files in the current folder:
heapdump-205347232.998971.heapsnapshot
heapdump-205508465.289834.heapsnapshot
heapdump-205513413.472744.heapsnapshot
Module 6: Scaling
Why You Need to Scale
- Performance (e.g., under 100ms response time)
- Availability (e.g., 99.999%)
- Fault tolerance (e.g., zero downtime)
^Zero downtime
^ Offload the workload: when Node server is a single process, it can be easily blocked
^https://blog.interfaceware.com/disaster-recovery-vs-high-availability-vs-fault-tolerance-what-are-the-differences/
Scaling Strategies
- Forking (just buy more EC2s) - what we will do
- Decomposing (e.g., microservices just for bottlenecks) - in another course
- Sharding (e.g., eu.docusign.com and na2.docusign.net) - not recommended
Offload the Workload
- spawn() - events, streams, messages, no size limit, no shell
- fork() - spawns Node processes that exchange messages
- exec() - callback with buffered output (maxBuffer size limit), creates a shell
- execFile() - executes a file directly, no shell
Sync Processes (Dumb)
spawnSync()
execFileSync()
execSync()
(note: there is no forkSync())
Executing bash and Spawn params
const {spawn} = require('child_process')
spawn('cd $HOME/Downloads && find . -type f | wc -l',
{stdio: 'inherit',
shell: true,
cwd: '/',
env: {PASSWORD: 'dolphins'}
})
Good Examples of Offloading the Workload
- Hashing
- Encryption
- Requests
- Encoding
- Archiving/Compression
- Computation
Let's use Node to launch a Python script to securely hash (SHA-512) a long string and get the results back into Node.
Executing Python with exec()
code/exec-hash.js:
const {exec} = require('child_process')
console.time('hashing')
const str = 'React Quickly: Painless web apps with React, JSX, Redux, and GraphQL'.repeat(100)
exec(`STR="${str}" python ${__dirname}/py-hash.py`, (error, stdout, stderr) => {
if (error) return console.error(error)
console.timeEnd('hashing')
console.log(stdout)
})
console.log('could be doing something else')
Python SHA512 Hashing
code/py-hash.py:
import os
str = os.environ['STR']
import hashlib
hash_object = hashlib.sha512(str.encode())
hex_dig = hash_object.hexdigest()
print(hex_dig)
Let's launch Ruby script to encrypt a string from Node with AES into a file and not wait for it.
Node Sends a Long String for Encryption to Ruby
const {spawn} = require('child_process')
const str = 'React Quickly: Painless web apps with React, JSX, Redux, and GraphQL'.repeat(100)
console.time('launch encryption')
const rubyEncrypt = spawn('ruby', ['encrypt.rb'], {
env: {STR: str},
detached: true,
stdio: 'ignore'
})
rubyEncrypt.unref() // Do not wait because the results will be in the file.
console.timeEnd('launch encryption')
Ruby Script is Encrypting with AES 256
require 'openssl'
cipher = OpenSSL::Cipher.new('aes-256-cbc')
cipher.encrypt # We are encrypting
key = cipher.random_key
iv = cipher.random_iv
encrypted_string = cipher.update ENV["STR"]
encrypted_string << cipher.final
File.write('ruby-encrypted.txt', encrypted_string)
Quick Summary About Spawn
- Use params to pass data around
- Offload work to other processes even when they are in other languages
- Compare timing
Scaling by forking will require the core os module.
os Module
Things You Can Do with os
const os = require('os')
console.log(os.freemem())
console.log(os.type())
console.log(os.release())
console.log(os.cpus())
console.log(os.uptime())
console.log(os.networkInterfaces())
Network Interface Results
{ lo0:
[ { address: '127.0.0.1',
netmask: '255.0.0.0',
family: 'IPv4',
mac: '00:00:00:00:00:00',
internal: true },
...
en0:
[ { address: '10.0.1.4',
netmask: '255.255.255.0',
family: 'IPv4',
mac: '78:4f:43:96:c6:f1',
internal: false } ],
...
macOS terminal command to get the same IP:
ifconfig | grep "inet " | grep -v 127.0.0.1
CPU Usage in %
code/os-cpu.js:
const os = require('os')
let cpus = os.cpus()
cpus.forEach((cpu, i) => {
console.log('CPU %s:', i)
let total = 0
for (let type in cpu.times) {
total += cpu.times[type]
}
for (let type in cpu.times) {
console.log(`\t ${type} ${Math.round(100 * cpu.times[type] / total)}%`)
}
})
The Core cluster Module
- Master process
- Worker processes: each has its own PID, event loop and memory space
- Load balancing: round robin by default, or the second approach where the master process creates the listen socket and sends it to interested workers, which then accept incoming connections directly
- Uses the child_process.fork() method and messaging
Load Balancing
cluster uses round robin (shift and push on the free workers queue) - source:
RoundRobinHandle.prototype.distribute = function(err, handle) {
this.handles.push(handle);
const worker = this.free.shift();
if (worker)
this.handoff(worker);
};
Load/Stress Testing Tools
Node loadtest:
npm i loadtest -g
loadtest -c 10 --rps 100 10.0.1.4:3000
or Apache ab
ab -c 200 -t 10 http://localhost:3000
With Clusters
Avoid in-memory caching (each worker has its own memory) and sticky sessions. Use an external state store.
Cluster Messaging
Master:
cluster.workers
worker.send(data)
Worker:
process.on('message', data=>{})
Optimizing a Slow Password Salting+Hashing Server
A sync function which is a very CPU-intensive task
offload/server-v1.js:
// ...
const bcrypt = require('bcrypt')
const hashPassword = (password, cb) => {
const hash = bcrypt.hashSync(password, 16) // bcrypt has async but we are using sync here for the example
cb(hash)
}
// ...
app.post('/signup', (req, res) => {
hashPassword(req.body.password.toString(), (hash) => { // callback but sync
// Store hash in your password DB.
res.send('your credentials are stored securely')
})
})
app.listen(3000)
Benchmarking The Password Salting+Hashing Server
Terminal:
node server-v1.js
Another terminal (not the first terminal):
curl localhost:3000/signup -d '{"password":123}' -H "Content-Type: application/json" -X POST
Third terminal window/tab:
curl localhost:3000
Result: 2nd request (3rd terminal) will wait for the 1st request (2nd terminal)
Optimizing The Password Salting+Hashing Server
Server with forked hashing, code/offload/worker-v2.js:
const bcrypt = require('bcrypt')
process.on('message', (password) => {
const hash = bcrypt.hashSync(password, 16)
process.send(hash)
})
Optimizing Server (Cont)
Optimized server, code/offload/server-v2.js:
const {fork} = require('child_process')
const hashPassword = (password, cb) => {
const hashWorker = fork('worker-v2.js')
hashWorker.send(password)
hashWorker.on('message', hash => {
cb(hash)
})
}
app.use(bodyParser.json())
app.get('/', (req, res) => {
res.send('welcome to strong password site')
})
app.post('/signup', (req, res) => {
hashPassword(req.body.password.toString(), (hash) => { // async: hashing runs in a forked worker
// Store hash in your password DB.
res.send('your credentials are stored securely')
})
})
Testing Server v2 (Forked Process)
Terminal:
node server-v2.js
Another terminal (not the first terminal):
curl localhost:3000/signup -d '{"password":123}' -H "Content-Type: application/json" -X POST
Third terminal window/tab:
curl localhost:3000
Result: 2nd request (3rd terminal) will NOT wait for the 1st request (2nd terminal)
We can fork the v1 server without splitting the hashing+salting function into a worker
Server With a Forked Cluster
code/offload/server-v3.js:
const express = require('express')
const app = express()
const path = require('path')
const bodyParser = require('body-parser')
const bcrypt = require('bcrypt')
const cluster = require('cluster')
if (cluster.isMaster) {
const os = require('os')
os.cpus().forEach(() => {
const worker = cluster.fork()
console.log(`Started worker ${worker.process.pid}`)
})
return true
}
Server With a Forked Cluster (Cont)
code/offload/server-v3.js:
// cluster.isWorker === true
const hashPassword = (password, cb) => {
const hash = bcrypt.hashSync(password, 16) // bcrypt has async but we are using sync here for the example
cb(hash)
}
app.use(bodyParser.json())
app.get('/', (req, res) => {
res.send('welcome to strong password site')
})
app.post('/signup', (req, res) => {
hashPassword(req.body.password.toString(), (hash) => { // callback but sync
// Store hash in your password DB.
res.send('your credentials are stored securely')
})
})
app.listen(3000)
Testing Server v3 (Forked Server)
Terminal:
node server-v3.js
Another terminal (not the first terminal):
curl localhost:3000/signup -d '{"password":123}' -H "Content-Type: application/json" -X POST
Third terminal window/tab:
curl localhost:3000
Result: 2nd request (3rd terminal) will NOT wait for the 1st request (2nd terminal)
Node.js does not automatically manage the number of workers, however. It is the application's responsibility to manage the worker pool based on its own needs.
No Fault Tolerance in Server v3
node server-v3.js
ps aux | grep 'node'
kill 12668
Implementing Fault Tolerance in Server v4
In isMaster in code/offload/server-v4.js:
cluster.on('exit', (worker, code, signal) => {
if (signal) {
console.log(`worker was killed by signal: ${signal}`);
} else if (code !== 0) { // && !worker.exitedAfterDisconnect
console.log(`worker exited with error code: ${code}`);
} else {
console.log('worker success!');
}
const newWorker = cluster.fork()
console.log(`Worker ${worker.process.pid} exited. Thus, starting a new worker ${newWorker.process.pid}`)
})
Fault Tolerance in Server v4
node server-v4.js
ps aux | grep 'node'
kill 12668
cluster is good but pm2 is better
pm2 Basics
npm i -g pm2
pm2 start app.js # Start, daemonize and auto-restart application (Node)
pm2 start app.js --watch
pm2 start app.js --name="bitcoin-exchange-api"
pm2 reset bitcoin-exchange-api
pm2 stop all
pm2 stop bitcoin-exchange-api
pm2 Advanced
pm2 startup
pm2 save
pm2 unstartup
pm2 start app.js -i 4 # Start 4 instances of application in cluster mode
# it will load balance network queries to each app
pm2 start app.js -i 0 # Start auto-detected number of instances in cluster mode
pm2 reload all # Zero Second Downtime Reload
pm2 scale [app-name] 10 # Scale cluster app to 10 processes
pm2 More
pm2-dev
pm2-docker
Outro
Summary
- Debugging
- Console, Node REPL and npm tricks
- Forking and spawning
- Creating streams, async/await and native promises
- How globals, modules and require() really work
The End!