All Projects → danbev → Learning Nodejs

danbev / Learning Nodejs

Project for learning Node.js internals

Labels

Projects that are alternatives of or similar to Learning Nodejs

Paint Timing
Paint Timing
Stars: ✭ 226 (-7.38%)
Mutual labels:  makefile
Rhplaceholder
Show pleasant loading view for your users 😍
Stars: ✭ 238 (-2.46%)
Mutual labels:  makefile
Personal Server
Personal server configuration with k3s
Stars: ✭ 2,784 (+1040.98%)
Mutual labels:  makefile
Smallest Secured Golang Docker Image
Create the smallest and secured golang docker image based on scratch
Stars: ✭ 229 (-6.15%)
Mutual labels:  makefile
Build Harness
🤖Collection of Makefiles to facilitate building Golang projects, Dockerfiles, Helm charts, and more
Stars: ✭ 236 (-3.28%)
Mutual labels:  makefile
Tesla Menu
The Nintendo Switch overlay menu
Stars: ✭ 236 (-3.28%)
Mutual labels:  makefile
Mkdkr
Make + Docker + Shell = CI Pipeline
Stars: ✭ 225 (-7.79%)
Mutual labels:  makefile
Data Making Guidelines
📘 Making Data, the DataMade Way
Stars: ✭ 248 (+1.64%)
Mutual labels:  makefile
Memory Hack
打造超人大脑
Stars: ✭ 237 (-2.87%)
Mutual labels:  makefile
Crazeeriderbbc
Crazee Rider - BBC Micro
Stars: ✭ 243 (-0.41%)
Mutual labels:  makefile
Openwrt Trojan
trojan and its dependencies for OpenWrt
Stars: ✭ 236 (-3.28%)
Mutual labels:  makefile
Mav voxblox planning
MAV planning tools using voxblox as the map representation.
Stars: ✭ 234 (-4.1%)
Mutual labels:  makefile
Dircolors Solarized
This is a repository of themes for GNU ls (configured via GNU dircolors) that support Ethan Schoonover’s Solarized color scheme.
Stars: ✭ 2,671 (+994.67%)
Mutual labels:  makefile
Esp Idf Template
Template application for https://github.com/espressif/esp-idf
Stars: ✭ 227 (-6.97%)
Mutual labels:  makefile
Verified Smart Contracts
Smart contracts which are formally verified
Stars: ✭ 243 (-0.41%)
Mutual labels:  makefile
Rancid Git
DEPRECATED -- Strongly consider using the upstream, the version here is very out of date and a poor place to start from!
Stars: ✭ 225 (-7.79%)
Mutual labels:  makefile
Source Code Examples
Examples of code for the ESP8266
Stars: ✭ 237 (-2.87%)
Mutual labels:  makefile
Aosp build
AOSP Build system compatible version of Open GApps
Stars: ✭ 250 (+2.46%)
Mutual labels:  makefile
Python Ios Support
A meta-package for building a version of Python that can be embedded into an iOS project.
Stars: ✭ 246 (+0.82%)
Mutual labels:  makefile
Mach
A remake of make (in ClojureScript)
Stars: ✭ 240 (-1.64%)
Mutual labels:  makefile

Learning Node.js

The sole purpose of this project is to aid in learning Node.js internals. Please note that viewing this README.md as the main page of the project might truncate it as it has become quite large. By opening the file (clicking on it) you can view the whole content.

Prerequisites

You'll need to have checked out the node.js source.

Compiling Node.js with debug enbled:

$ ./configure --debug
$ make -C out BUILDTYPE=Debug

After compiling (with debugging enabled) start node using lldb:

$ cd node/out/Debug
$ lldb ./node

Node uses Generate Your Projects (gyp) for which I was not familiar with so there is a example project in gyp to look into it.

Running the Node.js tests

$ make -j8 test

Notes

The rest of this page contains notes gathred while setting through the code base: (These are more sections than listed here but they might be hard to follow)

  1. Background
  2. Start up
  3. Loading of builtins
  4. Environment
  5. TCPWrap
  6. Running a script
  7. Event loop
  8. setTimeout
  9. setImmediate
  10. nextTick
  11. AsyncWrap
  12. lldb
  13. Promises
  14. Libuv Thread pool
  15. bootstrap_node.js compilation and execution walkthrough

Background

Node.js is roughly Google V8, libuv and Node.js core which glues everything together.

V8 bascially consists of the memory management of the heap and the execution stack (very simplified but helps make my point). If you are used to web client side development you'll know about the WebAPIs that are also available like DOM, AJAX, setTimeout etc. This functionality is not provided by V8 but in instead by chrome. There is also nothing about a event loop in V8, this is also something that is provided by chrome.

+------------------------------------------------------------------------------------------+
| Google Chrome                                                                            |
|                                                                                          |
| +----------------------------------------+          +------------------------------+     |
| | Google V8                              |          |            WebAPIs           |     |
| | +-------------+ +---------------+      |          |                              |     |
| | |    Heap     | |     Stack     |      |          |                              |     |
| | |             | |               |      |          |                              |     |
| | |             | |               |      |          |                              |     |
| | |             | |               |      |          |                              |     |
| | |             | |               |      |          |                              |     |
| | |             | |               |      |          |                              |     |
| | +-------------+ +---------------+      |          |                              |     |
| |                                 |      |          |                              |     |
| +----------------------------------------+          +------------------------------+     |
|                                                                                          |
|                                                                                          |
| +---------------------+     +---------------------------------------+                    |
| |     Event loop      |     |          Render task queue            |                    |
| |                     |     |                                       |                    |
| +---------------------+     +---------------------------------------+                    |
|                             +---------------------------------------+                    |
|                             |          Callback/task queue          |                    |
|                             |                                       |                    |
|                             +---------------------------------------+                    |
|                                                                                          |
|                                                                                          |
+------------------------------------------------------------------------------------------+

The execution stack is a stack of frame pointers. For each function called, that function will be pushed onto the stack. When a function returns it will be removed. If that function calls other functions they will be pushed onto the stack. When all functions have returned execution can proceed from the returned to point. If one of the functions performs an operation that takes time, progress will not be made until it completes as the only way to complete is that the function returns and is popped off the stack. This is what happens when you have a single threaded programming language.

Aychnronous work can be done by calling into the WebAPIs, for example calling setTimeout which will call out to the WebAPI and then return. The functionality for setTimeout is provided by the WebAPI and when the timer is due the WebAPI will push the callback onto the callback queue. Items from the callback queue will be picked up by the event loop and pushed onto the stack for execution.

TODO: Is the microtask queue considered part of V8 or part of chrome?

Now lets compare this with Node.js:

+------------------------------------------------------------------------------------------+
| Node.js                                                                                  |
|                                                                                          |
| +----------------------------------------+          +------------------------------+     |
| | Google V8                              |          |        Node Core APIs        |     |
| | +-------------+ +---------------+      |          |                              |     |
| | |    Heap     | |     Stack     |      |          |                              |     |
| | |             | |               |      |          |                              |     |
| | |             | |               |      |          |                              |     |
| | |             | |               |      |          |                              |     |
| | |             | |               |      |          |                              |     |
| | |             | |               |      |          |                              |     |
| | +-------------+ +---------------+      |          |                              |     |
| |                                        |          |                              |     |
| +----------------------------------------+          +------------------------------+     |
|                                                                                          |
|                                                                                          |
| +---------------------+     +---------------------------------------+                    |
| | libuv               |     |          Microtask queue              |                    |
| |     Event Loop      |     |                                       |                    |
| +---------------------+     +---------------------------------------+                    |
|                             +---------------------------------------+                    |
|                             |          Callback queue               |                    |
|                             |                                       |                    |
|                             +---------------------------------------+                    |
|                                                                                          |
|                                                                                          |
+------------------------------------------------------------------------------------------+

Taking the same example from above, setTimeout, this would be a call to Node Core API and then the function will return. When the timer expires Node Core API will push the callback onto the callback queue. The event loop in Node is provided by libuv, whereas in chrome this is provided by the browser (chromium I believe) TODO: Is the microtask queue considered part of V8 or part of node?

Starting Node

To start and stop at first line in a js program use:

$ lldb -- node --debug-brk test.js

Set a break point in node_main.cc:

(lldb) breakpoint set --file node_main.cc --line 52
(lldb) run

Walkthrough

node_main.cc will bascially call node::Start which we can find in src/node.cc.

Start(int argc, char** argv)

Starts by calling PlatformInit

default_platform = v8::platform::CreateDefaultPlatform(v8_thread_pool_size);
V8::InitializePlatform(default_platform);
V8::Initialize();
...
Init(&argc, const_cast<const char**>(argv), &exec_argc, &exec_argv);

PlatformInit()

From what I understand this mainly sets up things like signals and file descriptor limits.

Init

Init has some libuv code that looks familiar to what I played around with in learning-libuv.

uv_async_init(uv_default_loop(),
             &dispatch_debug_messages_async,
             DispatchDebugMessagesAsyncCallback);

Now I've not used uv_async_init but looking at the docs this is done to allow a different thread to wake up the event loop and have the callback invoked. uv_async_init looks like this:

int uv_async_init(uv_loop_t* loop, uv_async_t* async, uv_async_cb async_cb)

To understand this better this standalone example helped my clarify things a bit.

uv_unref(reinterpret_cast<uv_handle_t*>(&dispatch_debug_messages_async));

I believe this is done so that the ref count of the dispatch_debug_message_async handle is decremented. If this handle is the only thing referened that would cause the event loop to be considered alive and it will continue to iterate.

So a different thread can use uv_async_sent(&dispatch_debug_messages_async) to to wake up the eventloop and have the DispatchDebugMessagesAsyncCallback function called.

EnableDebugSignalHandler

This is where we signal the semaphore which will increment the counter, and any threads in the wait queue will now run. So our thread that is blocked waiting for this debug_semaphore will be able to proceed and TryStartDebugger will be called.

uv_sem_post(&debug_semaphore);

But what will actually send the signal for all this to happen? I think this is done DebugProcess(const FunctionCallbackInfo& args). Setting a break point confirmed this and the back trace:

(lldb) bt
* thread #1: tid = 0x11f57b1, 0x0000000100cafddc node`node::DebugProcess(args=0x00007fff5fbf4700) + 12 at node.cc:3754, queue = 'com.apple.main-thread', stop reason = breakpoint 1.1
  * frame #0: 0x0000000100cafddc node`node::DebugProcess(args=0x00007fff5fbf4700) + 12 at node.cc:3754
    frame #1: 0x000000010028618b node`v8::internal::FunctionCallbackArguments::Call(this=0x00007fff5fbf4878, f=(node`node::DebugProcess(v8::FunctionCallbackInfo<v8::Value> const&) at node.cc:3753))(v8::FunctionCallbackInfo<v8::Value> const&)) + 139 at arguments.cc:33
    frame #2: 0x00000001002f58f3 node`v8::internal::MaybeHandle<v8::internal::Object> v8::internal::(anonymous namespace)::HandleApiCallHelper<false>(isolate=0x0000000104004000, args=BuiltinArguments<v8::internal::BuiltinExtraArguments::kTarget> @ 0x00007fff5fbf49d0)::BuiltinArguments<(v8::internal::BuiltinExtraArguments)1>) + 1619 at builtins.cc:3915
    frame #3: 0x000000010031ce36 node`v8::internal::Builtin_Impl_HandleApiCall(args=v8::internal::(anonymous namespace)::HandleApiCallArgumentsType @ 0x00007fff5fbf4a38, isolate=0x0000000104004000)::BuiltinArguments<(v8::internal::BuiltinExtraArguments)1>, v8::internal::Isolate*) + 86 at builtins.cc:3939
    frame #4: 0x00000001002f9c8f node`v8::internal::Builtin_HandleApiCall(args_length=3, args_object=0x00007fff5fbf4b28, isolate=0x0000000104004000) + 143 at builtins.cc:3936
    frame #5: 0x00003ca66bf0961b
    frame #6: 0x00003ca66c081e0d
    frame #7: 0x00003ca66bf0d17a
    frame #8: 0x00003ca66c01edb3
    frame #9: 0x00003ca66c01e802
    frame #10: 0x00003ca66bf0d17a
    frame #11: 0x00003ca66c081b25
    frame #12: 0x00003ca66c074fd2
    frame #13: 0x00003ca66c074c6b
    frame #14: 0x00003ca66bf38024
    frame #15: 0x00003ca66bf22962
    frame #16: 0x00000001006f11df node`v8::internal::(anonymous namespace)::Invoke(isolate=0x0000000104004000, is_construct=false, target=Handle<v8::internal::Object> @ 0x00007fff5fbf4f50, receiver=Handle<v8::internal::Object> @ 0x00007fff5fbf4f48, argc=0, args=0x0000000000000000, new_target=Handle<v8::internal::Object> @ 0x00007fff5fbf4f40) + 607 at execution.cc:97
    frame #17: 0x00000001006f0f61 node`v8::internal::Execution::Call(isolate=0x0000000104004000, callable=Handle<v8::internal::Object> @ 0x00007fff5fbf50a8, receiver=Handle<v8::internal::Object> @ 0x00007fff5fbf50a0, argc=0, argv=0x0000000000000000) + 1313 at execution.cc:163
    frame #18: 0x000000010023f4af node`v8::Function::Call(this=0x0000000104062c20, context=(val_ = 0x00000001040404c8), recv=(val_ = 0x00000001040628a0), argc=0, argv=0x0000000000000000) + 671 at api.cc:4404
    frame #19: 0x000000010023f611 node`v8::Function::Call(this=0x0000000104062c20, recv=(val_ = 0x00000001040628a0), argc=0, argv=0x0000000000000000) + 113 at api.cc:4413
    frame #20: 0x0000000100c8f3b8 node`node::AsyncWrap::MakeCallback(this=0x0000000104800d50, cb=(val_ = 0x00000001040404a0), argc=3, argv=0x00007fff5fbf5690) + 2600 at async-wrap.cc:284
    frame #21: 0x0000000100c937e6 node`node::AsyncWrap::MakeCallback(this=0x0000000104800d50, symbol=(val_ = 0x000000010403e5b0), argc=3, argv=0x00007fff5fbf5690) + 198 at async-wrap-inl.h:110
    frame #22: 0x0000000100d06c67 node`node::StreamBase::EmitData(this=0x0000000104800d50, nread=43, buf=(val_ = 0x0000000104040488), handle=(val_ = 0x0000000000000000)) + 551 at stream_base.cc:427
    frame #23: 0x0000000100d0adc3 node`node::StreamWrap::OnReadImpl(nread=43, buf=0x00007fff5fbf58f8, pending=UV_UNKNOWN_HANDLE, ctx=0x0000000104800d50) + 675 at stream_wrap.cc:222
    frame #24: 0x0000000100ca25a7 node`node::StreamResource::OnRead(this=0x0000000104800d50, nread=43, buf=0x00007fff5fbf58f8, pending=UV_UNKNOWN_HANDLE) + 119 at stream_base.h:171
    frame #25: 0x0000000100d0b93f node`node::StreamWrap::OnReadCommon(handle=0x0000000104800df0, nread=43, buf=0x00007fff5fbf58f8, pending=UV_UNKNOWN_HANDLE) + 351 at stream_wrap.cc:246
    frame #26: 0x0000000100d0b3d4 node`node::StreamWrap::OnRead(handle=0x0000000104800df0, nread=43, buf=0x00007fff5fbf58f8) + 116 at stream_wrap.cc:261
    frame #27: 0x0000000100f70e93 node`uv__read(stream=0x0000000104800df0) + 1555 at stream.c:1192
    frame #28: 0x0000000100f6cb8c node`uv__stream_io(loop=0x00000001019ee200, w=0x0000000104800e78, events=1) + 348 at stream.c:1259
    frame #29: 0x0000000100f7b784 node`uv__io_poll(loop=0x00000001019ee200, timeout=7073) + 3492 at kqueue.c:276
    frame #30: 0x0000000100f5e62f node`uv_run(loop=0x00000001019ee200, mode=UV_RUN_ONCE) + 207 at core.c:354
    frame #31: 0x0000000100cb33a0 node`node::StartNodeInstance(arg=0x00007fff5fbfea60) + 912 at node.cc:4303
    frame #32: 0x0000000100cb2f8d node`node::Start(argc=2, argv=0x0000000103404a60) + 253 at node.cc:4380
    frame #33: 0x0000000100cede9b node`main(argc=2, argv=0x00007fff5fbfeb18) + 75 at node_main.cc:54
    frame #34: 0x0000000100001634 node`start + 52

So to recap, SetupProcessObject sets up the process object for node and one of the methods it sets is '_debugProcess':

 env->SetMethod(process, "_debugProcess", DebugProcess);

SetupProcessObject is called from Environment::Start (src/env.cc):

* thread #1: tid = 0x1207377, 0x0000000100cad2fc node`node::SetupProcessObject(env=0x00007fff5fbfe108, argc=2, argv=0x0000000103604a20, exec_argc=0, exec_argv=0x0000000103604410) + 11020 at node.cc:3205, queue = 'com.apple.main-thread', stop reason = breakpoint 1.1
 * frame #0: 0x0000000100cad2fc node`node::SetupProcessObject(env=0x00007fff5fbfe108, argc=2, argv=0x0000000103604a20, exec_argc=0, exec_argv=0x0000000103604410) + 11020 at node.cc:3205
   frame #1: 0x0000000100c91bc7 node`node::Environment::Start(this=0x00007fff5fbfe108, argc=2, argv=0x0000000103604a20, exec_argc=0, exec_argv=0x0000000103604410, start_profiler_idle_notifier=false) + 919 at env.cc:91
   frame #2: 0x0000000100cb32b1 node`node::StartNodeInstance(arg=0x00007fff5fbfeaa0) + 673 at node.cc:4274
   frame #3: 0x0000000100cb2f8d node`node::Start(argc=2, argv=0x0000000103604a20) + 253 at node.cc:4380
   frame #4: 0x0000000100cede9b node`main(argc=2, argv=0x00007fff5fbfeb58) + 75 at node_main.cc:54
   frame #5: 0x0000000100001634 node`start + 52

After that detour we are back in the Start method, and the next line is:

default_platform = v8::platform::CreateDefaultPlatform(v8_thread_pool_size);
(lldb) p v8_thread_pool_size
(int) $30 = 4

We can find the implementation of this in deps/v8/src/libplatform/default-platform.cc. The call is the same as was used in the hello_world example except here the size of the thread pool is being passed in and in the hello_world the no arguments method was called. I only skimmed this part when I was working through that example so it might be good to figure out what is going on here.

An instance of DefaultPlatform is created and then its SetThreadPoolSize method is called with v8_thread_pool_size. When the size is not given it will default to p SysInfo::NumberOfProcessors(). Next, EnsureInitialized is called which does a check to see if the instance has already been initilized and if not:

 for (int i = 0; i < thread_pool_size_; ++i)
   thread_pool_.push_back(new WorkerThread(&queue_));

This will create new workers and threads for them. This call finds its way down into deps/v8/src/base/platform/platform-posix.c and its Thread::Start method:

LockGuard<Mutex> lock_guard(&data_->thread_creation_mutex_);
result = pthread_create(&data_->thread_, &attr, ThreadEntry, this);    

We can see this is where the creation and starting of a new thread is done. The first argument is the pthread_t to associate with the function ThreadEntry which is the top level entry point for the new thread. The second argument are additional attributes. The third argument is the function as already mentioned and the fourth parameter is the argument to the function. So we can see that ThreadEntry takes the current instance as an argument (well it takes a void pointer):

static void* ThreadEntry(void* arg) {
 Thread* thread = reinterpret_cast<Thread*>(arg);
 // We take the lock here to make sure that pthread_create finished first since
 // we don't know which thread will run first (the original thread or the new
 // one).
 { LockGuard<Mutex> lock_guard(&thread->data()->thread_creation_mutex_); }
 SetThreadName(thread->name());
 DCHECK(thread->data()->thread_ != kNoThread);
 thread->NotifyStartedAndRun();
 return NULL;

}

ThreadEntry is using a LockGuard and creates a scope to use the Resource Acquisition Is Initialization (RAII) idiom for a mutex. The scope is very limited but like the comment says is really just trying to aquire the lock, which was the same that was used when creating the thread above. So, lets take a look at thread->NotifyStartedAndRun()

NotifyStartedAndRun

void NotifyStartedAndRun() {
  if (start_semaphore_) start_semaphore_->Signal();
  Run();

}

LockGuard

So a lock guard is an implementation of Resource Acquisition Is Initialization (RAII) and takes a mutex in its constructor which it then calls lock on. When this instance goes out of scope its descructor will be called and it will call unlock guarenteeing that the mutex will be unlocked even if an exception is thrown. The Mutex class can be found in deps/v8/src/base/platform/mutex.h. On a Unix system the mutex will be of type pthread_mutex_t

We can verify this by inspecting the threads before and after the calls. Before:

(lldb) thread list
Process 4614 stopped
* thread #1: tid = 0xe0d19a, 0x0000000100f80cc1 node`v8::base::Thread::Start(this=0x0000000103206110) + 321 at platform-posix.cc:618, queue = 'com.apple.main-thread', stop reason = step over
  thread #2: tid = 0xe0d2f2, 0x00007fff858affae libsystem_kernel.dylib`semaphore_wait_trap + 10

After:

(lldb) thread list
Process 4669 stopped
* thread #1: tid = 0xe0e3a7, 0x0000000100f80cdb node`v8::base::Thread::Start(this=0x0000000103206530) + 347 at platform-posix.cc:619, queue = 'com.apple.main-thread', stop reason = step over
  thread #2: tid = 0xe0e46d, 0x00007fff858affae libsystem_kernel.dylib`semaphore_wait_trap + 10
  thread #3: tid = 0xe0f230, 0x0000000100f81570 node`v8::base::Thread::data(this=0x0000000103206530) at platform.h:463

What does one of these thread do? Lets set a breakpoint in the ThreadEntry method:

(lldb) breakpoint set --file platform-posix.cc --line 582

NodeInstanceData

In the Start method we can see a block with the creation of a new NodeInstanceData instance:

int exit_code = 1;
{
  NodeInstanceData instance_data(NodeInstanceType::MAIN,
                                 uv_default_loop(),
                                 argc,
                                 const_cast<const char**>(argv),
                                 exec_argc,
                                 exec_argv,
                                 use_debug_agent);
  StartNodeInstance(&instance_data);
  exit_code = instance_data.exit_code();
}	

There are two NodeInstanceTypes, MAIN and WORKER. The second argument is the libuv event loop to be used.

StartNodeInstance

We are passing the NodeInstanceData instance we created above. The code in this method is very similar to the code that we used in the hello-world.cc. A new Isolate is created. Remember that an Isolate is an independant copy of the V8 runtime, with its own heap.

Environment* env = CreateEnvironment(isolate, context, instance_data);

CreateEnvironment(isolate, context, instance_data)

Local<FunctionTemplate> process_template = FunctionTemplate::New(isolate);
process_template->SetClassName(FIXED_ONE_BYTE_STRING(isolate, "process"));
...
SetupProcessObject(env, argc, argv, exec_argc, exec_argv);

This looks like the node process object is being created here. All the JavaScript built-in objects are provided by the V8 runtime but the process object is not one of them. So here we are doing the same as in the hello-world example above but naming the object 'process'

auto maybe = process->SetAccessor(env->context(),
                             env->title_string(),
                             ProcessTitleGetter,
                             ProcessTitleSetter,
                             env->as_external());
CHECK(maybe.FromJust());

Notice that SetAccessor returns an "optional" MayBe type.

READONLY_PROPERTY(process,
   "version",
   FIXED_ONE_BYTE_STRING(env->isolate(), NODE_VERSION));

The above is adding properties to the 'process' object. The first being version and then:

process.moduleLoadList
process.versions[
  http_parser,
  node,
  v8,
  vu,
  zlib,
  ares,
  icu,
  modules
]
process.icu_data_dir
process.arch
process.platform
process.release
process.release.name
process.release.lts
process.release.sourceUrl
process.release.headersUrl

process.env
process.pid
process.features



READONLY_PROPERTY(process,
                 "moduleLoadList",
                 env->module_load_list_array());

I was not aware of this one but process.moduleLoadList will return an array of modules loaded.

 READONLY_PROPERTY(process, "versions", versions);

Next up is process.versions which on my local machine returns:

> process.versions
{ http_parser: '2.5.2',
  node: '4.4.3',
  v8: '4.5.103.35',
  uv: '1.8.0',
  zlib: '1.2.8',
  ares: '1.10.1-DEV',
  icu: '56.1',
  modules: '46',
  openssl: '1.0.2g' }

After setting up all the object (SetupProcessObject) this methods returns. There is still no sign of the loading of the 'bootstrap_node.js' script. This is done in LoadEnvironment.

LoadEnvironment

  Local<String> loaders_name = FIXED_ONE_BYTE_STRING(env->isolate(), "internal/bootstrap/loaders.js");
  Local<Function> loaders_bootstrapper = GetBootstrapper(env, LoadersBootstrapperSource(env), loaders_name);
  Local<String> node_name = FIXED_ONE_BYTE_STRING(env->isolate(), "internal/bootstrap/node.js");
  Local<Function> node_bootstrapper = GetBootstrapper(env, NodeBootstrapperSource(env), node_name);

  ...
  // Create binding loaders
  v8::Local<v8::Function> get_binding_fn = env->NewFunctionTemplate(GetBinding)->GetFunction(env->context()).ToLocalChecked();
  v8::Local<v8::Function> get_linked_binding_fn = env->NewFunctionTemplate(GetLinkedBinding)->GetFunction(env->context()).ToLocalChecked();
  v8::Local<v8::Function> get_internal_binding_fn = env->NewFunctionTemplate(GetInternalBinding)->GetFunction(env->context()).ToLocalChecked();

  Local<Value> loaders_bootstrapper_args[] = {
    env->process_object(),
    get_binding_fn,
    get_linked_binding_fn,
    get_internal_binding_fn
  }; 

  Local<Value> bootstrapped_loaders;
  if (!ExecuteBootstrapper(env, loaders_bootstrapper,
                           arraysize(loaders_bootstrapper_args),
                           loaders_bootstrapper_args,
                           &bootstrapped_loaders)) {
    return;
  } 

So ExecuteBootstrapper will call the function in internal/bootstrap/loaders.js:

(lldb) jlh bootstrapper

I'm not showing the output as it is a little long but you can see the contents of loaders.js. We can see the arguments that the function takes:

- source code: (process, getBinding, getLinkedBinding, getInternalBinding) {

These match the arguments from loaders_bootstrapper_args above. Next we have:

  // Bootstrap Node.js
  Local<Value> bootstrapped_node;
  Local<Value> node_bootstrapper_args[] = {
    env->process_object(),
    bootstrapped_loaders
  };
  if (!ExecuteBootstrapper(env, node_bootstrapper,
                           arraysize(node_bootstrapper_args),
                           node_bootstrapper_args,
                           &bootstrapped_node)) {
    return;

Notice that bootstrapped_loaders is passed as an argument:

(lldb) jlh bootstrapped_loaders
0x9b2977bc179: [JS_OBJECT_TYPE]
 - map: 0x9b269e9e361 <Map(HOLEY_ELEMENTS)> [FastProperties]
 - prototype: 0x9b27a204479 <Object map = 0x9b269e822a1>
 - elements: 0x9b2e0782251 <FixedArray[0]> [HOLEY_ELEMENTS]
 - properties: 0x9b2e0782251 <FixedArray[0]> {
    #internalBinding: 0x9b2977b3e31 <JSFunction internalBinding (sfi = 0x9b27a2794d9)> (data field 0)
    #NativeModule: 0x9b2977b3369 <JSFunction NativeModule (sfi = 0x9b27a2792f9)> (data field 1)
 }

internalBinding and NativeModule properties are destructed in the function:

- source code: (process, { internalBinding, NativeModule }) {

So we have GetBinding, GetLinkedBinding, and GetInternalBinding which are all passed to the bootstrap function. A native module can be created using using one of the following types ('src/node_internals.h'):

enum {
  NM_F_BUILTIN  = 1 << 0,
  NM_F_LINKED   = 1 << 1,
  NM_F_INTERNAL = 1 << 2,
};

For example, the crypto module (src/node_crypto.cc) is registered using the macro:

NODE_BUILTIN_MODULE_CONTEXT_AWARE(crypto, node::crypto::Initialize)

#define NODE_BUILTIN_MODULE_CONTEXT_AWARE(modname, regfunc)                   \
  NODE_MODULE_CONTEXT_AWARE_CPP(modname, regfunc, nullptr, NM_F_BUILTIN)

Modules that use NF_F_BUILTIN:

NODE_BUILTIN_MODULE_CONTEXT_AWARE(inspector, node::inspector::Initialize);
NODE_BUILTIN_MODULE_CONTEXT_AWARE(util, node::util::Initialize)
NODE_BUILTIN_MODULE_CONTEXT_AWARE(tcp_wrap, node::TCPWrap::Initialize)
NODE_BUILTIN_MODULE_CONTEXT_AWARE(url, node::url::Initialize)
NODE_BUILTIN_MODULE_CONTEXT_AWARE(udp_wrap, node::UDPWrap::Initialize)
NODE_BUILTIN_MODULE_CONTEXT_AWARE(inspector, Initialize)
NODE_BUILTIN_MODULE_CONTEXT_AWARE(process_wrap, node::ProcessWrap::Initialize)
NODE_BUILTIN_MODULE_CONTEXT_AWARE(buffer, node::Buffer::Initialize)
NODE_BUILTIN_MODULE_CONTEXT_AWARE(contextify, node::contextify::Initialize)
NODE_BUILTIN_MODULE_CONTEXT_AWARE(os, node::os::Initialize)
NODE_BUILTIN_MODULE_CONTEXT_AWARE(async_wrap, node::AsyncWrap::Initialize)
NODE_BUILTIN_MODULE_CONTEXT_AWARE(fs_event_wrap, node::FSEventWrap::Initialize)
NODE_BUILTIN_MODULE_CONTEXT_AWARE(spawn_sync, node::SyncProcessRunner::Initialize)
NODE_BUILTIN_MODULE_CONTEXT_AWARE(js_stream, node::JSStream::Initialize)
NODE_BUILTIN_MODULE_CONTEXT_AWARE(pipe_wrap, node::PipeWrap::Initialize)
NODE_BUILTIN_MODULE_CONTEXT_AWARE(tty_wrap, node::TTYWrap::Initialize)
NODE_BUILTIN_MODULE_CONTEXT_AWARE(crypto, node::crypto::Initialize)
NODE_BUILTIN_MODULE_CONTEXT_AWARE(tls_wrap, node::TLSWrap::Initialize)
NODE_BUILTIN_MODULE_CONTEXT_AWARE(config, node::Initialize)
NODE_BUILTIN_MODULE_CONTEXT_AWARE(zlib, node::Initialize)
NODE_BUILTIN_MODULE_CONTEXT_AWARE(fs, node::fs::Initialize)
NODE_BUILTIN_MODULE_CONTEXT_AWARE(icu, node::i18n::Initialize)
NODE_BUILTIN_MODULE_CONTEXT_AWARE(cares_wrap, node::cares_wrap::Initialize)

There is work in progress to make the above internal.

If a module uses NODE_MODULE_CONTEXT_AWARE_INTERNAL it will use `NM_F_INTERNAL which is used by:

NODE_MODULE_CONTEXT_AWARE_INTERNAL(heap_utils, node::heap::Initialize)
NODE_MODULE_CONTEXT_AWARE_INTERNAL(types, node::InitializeTypes)
NODE_MODULE_CONTEXT_AWARE_INTERNAL(timers, node::Initialize)
NODE_MODULE_CONTEXT_AWARE_INTERNAL(http2, node::http2::Initialize)
NODE_MODULE_CONTEXT_AWARE_INTERNAL(string_decoder, node::InitializeStringDecoder)
NODE_MODULE_CONTEXT_AWARE_INTERNAL(http_parser, node::Initialize)
NODE_MODULE_CONTEXT_AWARE_INTERNAL(performance, node::performance::Initialize)
NODE_MODULE_CONTEXT_AWARE_INTERNAL(uv, node::Initialize)
NODE_MODULE_CONTEXT_AWARE_INTERNAL(messaging, node::worker::InitMessaging)
NODE_MODULE_CONTEXT_AWARE_INTERNAL(trace_events, node::Initialize)
NODE_MODULE_CONTEXT_AWARE_INTERNAL(serdes, node::Initialize)
NODE_MODULE_CONTEXT_AWARE_INTERNAL(v8, node::Initialize)
NODE_MODULE_CONTEXT_AWARE_INTERNAL(stream_pipe, node::InitializeStreamPipe)
NODE_MODULE_CONTEXT_AWARE_INTERNAL(domain, node::domain::Initialize)
NODE_MODULE_CONTEXT_AWARE_INTERNAL(module_wrap, node::loader::ModuleWrap::Initialize)
NODE_MODULE_CONTEXT_AWARE_INTERNAL(worker, node::worker::InitWorker)
NODE_MODULE_CONTEXT_AWARE_INTERNAL(symbols, node::symbols::Initialize)
NODE_MODULE_CONTEXT_AWARE_INTERNAL(signal_wrap, node::SignalWrap::Initialize)
NODE_MODULE_CONTEXT_AWARE_INTERNAL(stream_wrap, node::LibuvStreamWrap::Initialize)

So how about NM_F_LINKED?

// - process._linkedBinding(): intended to be used by embedders to add
//   additional C++ bindings in their applications. These C++ bindings
//   can be created using NODE_MODULE_CONTEXT_AWARE_CPP() with the flag

GetBinding

Will be bound to process.binding. This function will try to find a builtin module with the passed in module name.

GetLinkedBinding

Will be bound to process._linkedBinding.

GetInternalBinding

Will be bound to process.internalBinding.

lib/internal/bootstrap_node.js

This is the file that is loaded by LoadEnvironment as "bootstrap_node.js". I read that this file is actually precompiled, where/how?

This file is referenced in node.gyp and is used with the target node_js2c. This target calls tools/js2c.py which is a tool for converting JavaScript source code into C-Style char arrays. This target will process all the library_files specified in the variables section which lib/internal/bootstrap_node.js is one of. The output of this out/Debug/obj/gen/node_javascript.h, depending on the type of build being performed. So lib/internal/bootstrap_node.js will become internal_bootstrap_node_value in node_javascript.h. This is then later included in src/node_javascript.cc.

We can see the contents of this in lldb using:

(lldb) p internal_bootstrap_node_value

Loading of Node Native JavaScript

The JavaScript source files that are located in the lib directory are not loaded in the normal way the JavaScript sources you provide yourself. Instead these are converted first into c arrays for faster execution. This is done by a build and more specifically a Python tool called js2c.py (JavaScript to C). If we take a look at node_javascript.h we find that it declares two functions:

void DefineJavaScript(Environment* env, v8::Local<v8::Object> target);
v8::Local<v8::String> MainSource(Environment* env);

Main source is what is used to load bootstrap_node.js and DefineJavaScript is used in the Binding function in src/node.js for loading the natives modules using the binding call. For example, in lib/internal/bootstrap_node.js:

NativeModule._source = process.binding('natives');

Lets set a breakpoint in DefineJavaScript:

(lldb) br s -n node::DefineJavaScript
CHECK(target->Set(env->context(),
                  internal_bootstrap_node_key.ToStringChecked(env->isolate()),
                  internal_bootstrap_node_value.ToStringChecked(env->isolate())).FromJust());
(lldb) jlh internal_bootstrap_node_key.ToStringChecked(env->isolate())
"internal/bootstrap_node"

And the value of this will the contents of boostrap_node.js. So this will be set as the property on the export object (the object named target above).

Loading of builtins

I wanted to know how builtins, like tcp_wrap and others are loaded. The loading is done explicitely upon initialization by this call in node.cc:

void Init(int* argc,
          const char** argv,
          int* exec_argc,
          const char*** exec_argv) {
  // Initialize prog_start_time to get relative uptime.
  prog_start_time = static_cast<double>(uv_now(uv_default_loop()));

  // Register built-in modules
  RegisterBuiltinModules();

RegisterBuiltinModules

void RegisterBuiltinModules() {
#define V(modname) _register_##modname();
  NODE_BUILTIN_MODULES(V)
#undef V
}

The macro NODE_BUILTIN_MODULES can be found in node_internals.h and list all the modules:

#define NODE_BUILTIN_OPENSSL_MODULES(V) V(crypto) V(tls_wrap)

#define NODE_BUILTIN_ICU_MODULES(V) V(icu)

#define NODE_BUILTIN_STANDARD_MODULES(V)                                      \
    V(async_wrap)                                                             \
    V(buffer)                                                                 \
    V(cares_wrap)                                                             \
    V(config)                                                                 \
    V(contextify)                                                             \
    V(domain)                                                                 \
    V(fs)                                                                     \
    V(fs_event_wrap)                                                          \
    V(http2)                                                                  \
    V(http_parser)                                                            \
    V(inspector)                                                              \
    V(js_stream)                                                              \
    V(module_wrap)                                                            \
    V(os)                                                                     \
    V(performance)                                                            \
    V(pipe_wrap)                                                              \
    V(process_wrap)                                                           \
    V(serdes)                                                                 \
    V(signal_wrap)                                                            \
    V(spawn_sync)                                                             \
    V(stream_wrap)                                                            \
    V(string_decoder)                                                         \
    V(tcp_wrap)                                                               \
    V(timer_wrap)                                                             \
    V(trace_events)                                                           \
    V(tty_wrap)                                                               \
    V(udp_wrap)                                                               \
    V(url)                                                                    \
    V(util)                                                                   \
    V(uv)                                                                     \
    V(v8)                                                                     \
    V(zlib)

#define NODE_BUILTIN_MODULES(V)                                               \
  NODE_BUILTIN_STANDARD_MODULES(V)                                            \
  NODE_BUILTIN_OPENSSL_MODULES(V)                                             \
  NODE_BUILTIN_ICU_MODULES(V)

So for example, tcp_wrap would have the following function call generated by the preprocesor:

void RegisterBuiltinModules() {
  _register_tcp_wrap();

This will call the _register_tcp_wrap() function that is generated by the NODE_BUILTIN_MODULE_CONTEXT_AWARE in tcp_wrap.cc. Lets take a look at the following line from src/tcp_wrap.cc:

NODE_BUILTIN_MODULE_CONTEXT_AWARE(tcp_wrap, node::TCPWrap::Initialize)

Now, setting a breakpoint on this and printing the thread backtrace gives:

-> 436 	NODE_BUILTIN_MODULE_CONTEXT_AWARE(tcp_wrap, node::TCPWrap::Initialize)
(lldb) bt
* thread #1: tid = 0x18d8053, 0x0000000100d1056b node`_register_tcp_wrap() + 11 at tcp_wrap.cc:436, queue = 'com.apple.main-thread', stop reason = breakpoint 5.1
  * frame #0: 0x0000000100d1056b node`_register_tcp_wrap() + 11 at tcp_wrap.cc:436
    frame #1: 0x00007fff5fc1310b dyld`ImageLoaderMachO::doModInitFunctions(ImageLoader::LinkContext const&) + 265
    frame #2: 0x00007fff5fc13284 dyld`ImageLoaderMachO::doInitialization(ImageLoader::LinkContext const&) + 40
    frame #3: 0x00007fff5fc0f8bd dyld`ImageLoader::recursiveInitialization(ImageLoader::LinkContext const&, unsigned int, ImageLoader::InitializerTimingList&, ImageLoader::UninitedUpwards&) + 305
    frame #4: 0x00007fff5fc0f743 dyld`ImageLoader::processInitializers(ImageLoader::LinkContext const&, unsigned int, ImageLoader::InitializerTimingList&, ImageLoader::UninitedUpwards&) + 127
    frame #5: 0x00007fff5fc0f9b3 dyld`ImageLoader::runInitializers(ImageLoader::LinkContext const&, ImageLoader::InitializerTimingList&) + 75
    frame #6: 0x00007fff5fc020f1 dyld`dyld::initializeMainExecutable() + 208
    frame #7: 0x00007fff5fc05d98 dyld`dyld::_main(macho_header const*, unsigned long, int, char const**, char const**, char const**, unsigned long*) + 3596
    frame #8: 0x00007fff5fc01276 dyld`dyldbootstrap::start(macho_header const*, int, char const**, long, macho_header const*, unsigned long*) + 512
    frame #9: 0x00007fff5fc01036 dyld`_dyld_start + 54

First things to note is that NODE_BUILTIN_MODULE_CONTEXT_AWARE is a macro defined in node.h which takes a modname and regfunc argument. This in turn calls another macro function named NODE_MODULE_CONTEXT_AWARE_X. This macro is invoked with the following arguments:

NODE_MODULE_CONTEXT_AWARE_X(modname, regfunc, NULL, NM_F_BUILTIN)

We already know that in our case modname is tcp_wrap and that regfunc is node::TCPWrap::Initialize.

NODE_MODULE_CONTEXT_AWARE_X

#define NODE_MODULE_CONTEXT_AWARE_CPP(modname, regfunc, priv, flags)  \
  extern "C" {                                                        \
    static node::node_module _module =                                \
    {                                                                 \
      NODE_MODULE_VERSION,                                            \
      flags,                                                          \
      nullptr,                                                        \
      __FILE__,                                                       \
      nullptr,                                                        \
      (node::addon_context_register_func) (regfunc),                  \
      NODE_STRINGIFY(modname),                                        \
      priv,                                                           \
      nullptr                                                         \
    };                                                                \
    void _register_ ## modname() {                                    \
      node_module_register(&_module)                                  \
    }
}

First, extern "C" means that C linkage should be used and no C++ name mangling should occur. This is saying that everything in the block should have this kind of linkage. With that out of the way we can focus on the contents of the block.

    static node::node_module _module =                                \
    {                                                                 \
      NODE_MODULE_VERSION,                                            \
      flags,                                                          \
      NULL,                                                           \
      __FILE__,                                                       \
      NULL,                                                           \
      (node::addon_context_register_func) (regfunc),                  \
      NODE_STRINGIFY(modname),                                        \
      priv,                                                           \
      NULL                                                            \
    };                                                                \

We are creating a static variable (it exists for the lifetime of the program, but the name is not visible outside of the block. Remember that we are in tcp_wrap.cc in this walk through so the preprocessor will add a definition of the _module to that tcp_wrap. node_module is a struct in node.h and looks like this:

struct node_module {
  int nm_version;
  unsigned int nm_flags;
  void* nm_dso_handle;
  const char* nm_filename;
  node::addon_register_func nm_register_func;
  node::addon_context_register_func nm_context_register_func;
  const char* nm_modname;
  void* nm_priv;
  struct node_module* nm_link;
};

Environment

To create an Environment we need to have an v8::Isolate instance and also an IsolateData instance:

 inline Environment(IsolateData* isolate_data, v8::Local<v8::Context> context);

Such a call can be found during startup:

(lldb) bt
  * thread #1: tid = 0x946796, 0x00000001008fa39f node`node::Start(int, char**) + 307 at node.cc:4390, queue = 'com.apple.main-thread', stop reason = step over
    * frame #0: 0x00000001008fa39f node`node::Start(int, char**) + 307 at node.cc:4390
      frame #1: 0x00000001008fa26c node`node::Start(argc=<unavailable>, argv=0x0000000102a00000) + 205 at node.cc:4503
      frame #2: 0x0000000100000b34 node`start + 52

See IsolateData for details about that class and the members that are proxied through via an Environment instance.

An Environment has a number of nested classes:

AsyncHooks
AsyncHooksCallbackScope
DomainFlag
TickInfo

The above nested classes call the DISALLOW_COPY_AND_ASSIGN macro, for example:

DISALLOW_COPY_AND_ASSIGN(TickInfo);

This macro uses = delete for the copy and assignment operator functions:

#define DISALLOW_COPY_AND_ASSIGN(TypeName) \
TypeName(const TypeName&) = delete;      \
void operator=(const TypeName&) = delete

The last nested class is:

HandleCleanup

Environment also has a number of static methods:

static inline Environment* GetCurrent(v8::Isolate* isolate);

This got me wondering, how can we get an Environment from an Isolate, an Isolate is a V8 thing and an Environment a Node thing?

inline Environment* Environment::GetCurrent(v8::Isolate* isolate) {
  return GetCurrent(isolate->GetCurrentContext());
}

So we are going to use the current context to get the Environment pointer, but the context is also a V8 concept, not a node.js concept.

inline Environment* Environment::GetCurrent(v8::Local<v8::Context> context) {
 return static_cast<Environment*>(context->GetAlignedPointerFromEmbedderData(kContextEmbedderDataIndex));
}

Alright, now we are getting somewhere. Lets take a closer look at context->GetAlignedPointerFromEmbedderData(kContextEmbedderDataIndex). We have to look at the Environment constructor to see where this is set (env-inl.h):

inline Environment::Environment(IsolateData* isolate_data, v8::Local<v8::Context> context) 
...
AssignToContext(context);

So, we can see that AssignToContext is setting the environment on the passed-in context:

static const int kContextEmbedderDataIndex = 5;

inline void Environment::AssignToContext(v8::Local<v8::Context> context) {
  context->SetAlignedPointerInEmbedderData(kContextEmbedderDataIndex, this);
}

So this how the Environment is associated with the context, and this enables us to get the environment for a context above. The argument to SetAlignedPointerInEmbedderData is a void pointer so it can be anything you want. The data is stored in a V8 FixedArray, the kContextEmbedderDataIndex is the index into this array (I think, still learning here). TODO: read up on how this FixedArray and alignment works.

There are also static methods to get the Environment using a context.

So an Isolate is like single instance of V8 runtime. A Context is a separate execution context that does not know about other context.

An Environment is a Node.js concept and multiple environments can exist within a single isolate. What I'm trying to figure out is how a AtExit callback can be registered with an environment, and also how to force that callback to be called when that particular environment is about to exit. Currently, this is done with a thread-local, but if there are multiple environments per thread these will overwrite each other.

ENVIRONMENT_STRONG_PERSISTENT_PROPERTIES

These are declared in env.h:

#define ENVIRONMENT_STRONG_PERSISTENT_PROPERTIES(V)                           \
  V(as_external, v8::External)                                                \
  V(async_hooks_destroy_function, v8::Function)                               \
  V(async_hooks_init_function, v8::Function)                                  \
  V(async_hooks_post_function, v8::Function)                                  \
  V(async_hooks_pre_function, v8::Function)                                   \
  V(binding_cache_object, v8::Object)                                         \
  V(buffer_constructor_function, v8::Function)                                \
  V(buffer_prototype_object, v8::Object)                                      \
  V(context, v8::Context)                                                     \
  V(domain_array, v8::Array)                                                  \
  V(domains_stack_array, v8::Array)                                           \
  V(fs_stats_constructor_function, v8::Function)                              \
  V(generic_internal_field_template, v8::ObjectTemplate)                      \
  V(jsstream_constructor_template, v8::FunctionTemplate)                      \
  V(module_load_list_array, v8::Array)                                        \
  V(pipe_constructor_template, v8::FunctionTemplate)                          \
  V(process_object, v8::Object)                                               \
  V(promise_reject_function, v8::Function)                                    \
  V(push_values_to_array_function, v8::Function)                              \
  V(script_context_constructor_template, v8::FunctionTemplate)                \
  V(script_data_constructor_function, v8::Function)                           \
  V(secure_context_constructor_template, v8::FunctionTemplate)                \
  V(tcp_constructor_template, v8::FunctionTemplate)                           \
  V(tick_callback_function, v8::Function)                                     \
  V(tls_wrap_constructor_function, v8::Function)                              \
  V(tls_wrap_constructor_template, v8::FunctionTemplate)                      \
  V(tty_constructor_template, v8::FunctionTemplate)                           \
  V(udp_constructor_function, v8::Function)                                   \
  V(write_wrap_constructor_function, v8::Function)                            \

Notice that V is passed in enabling different macros to be passed in. This is used to create setters/getters like this:

#define V(PropertyName, TypeName)                                             \
  inline v8::Local<TypeName> PropertyName() const;                            \
  inline void set_ ## PropertyName(v8::Local<TypeName> value);
  ENVIRONMENT_STRONG_PERSISTENT_PROPERTIES(V)
#undef V

The field itself is private and defined in env.h:

#define V(PropertyName, TypeName)                                             \
  v8::Persistent<TypeName> PropertyName ## _;
  ENVIRONMENT_STRONG_PERSISTENT_PROPERTIES(V)
#undef V

The above is defining getters and setter for all the properties in ENVIRONMENT_STRONG_PERSISTENT_PROPERTIES. Notice the usage of V and that is is passed into the macro. Lets take a look at one:

V(tcp_constructor_template, v8::FunctionTemplate)

Like before these are only the declarations, the definitions can be found in src/env-inl.h:

#define V(PropertyName, TypeName)                                             \
  inline v8::Local<TypeName> Environment::PropertyName() const {              \
    return StrongPersistentToLocal(PropertyName ## _);                        \
  }                                                                           \
  inline void Environment::set_ ## PropertyName(v8::Local<TypeName> value) {  \
    PropertyName ## _.Reset(isolate(), value);                                \
  }
  ENVIRONMENT_STRONG_PERSISTENT_PROPERTIES(V)
#undef V 

So, in the case of tcp_constructor_template this would expand to:

inline v8::Local<v8::FunctionTemplate> Environment::tcp_constructor_template() const {              
  return StrongPersistentToLocal(tcp_constructor_template_);                        
}                                                                           
inline void Environment::set_tcp_constructor_template(v8::Local<v8::FunctionTempalate> value) {  
  tcp_constructor_template_.Reset(isolate(), value);                                
}

So where is this setter called?
It is called from TCPWrap::Initialize:

env->set_tcp_constructor_template(t); 

And when is TCPWrap::Initialize called?
From Binding in node.cc:

...
mod->nm_context_register_func(exports, unused, env->context(), mod->nm_priv);

Recall (from Loading of builtins) how a module is registred:

NODE_MODULE_CONTEXT_AWARE_BUILTIN(tcp_wrap, node::TCPWrap::Initialize)

The nm_context_register_func is node::TCPWrap::Initialize, which is a static method declared in src/tcp_wrap.h:

static void Initialize(v8::Local<v8::Object> target,
                       v8::Local<v8::Value> unused,
                       v8::Local<v8::Context> context);


wrap_data->MakeCallback(env->onconnection_string(), arraysize(argv), argv);

env->onconnection_string() is a simple getter generated by the preprocessor by a macro in env-inl.h.

TCPWrap::Initialize

First thing that happens is that the Environment is retreived using the current context.

Next, a function template is created:

Local<FunctionTemplate> t = env->NewFunctionTemplate(New);

Just to be clear New is the address of the function and we are just passing that to the NewFunctionTemplate method. It will use that address when creating a NewFunctionTemplate.

TcpWrap::New

This class is called TcpWrap because is wraps a libuv uv_tcp_t handle.

static void SetNoDelay(const v8::FunctionCallbackInfo<v8::Value>& args);
static void SetKeepAlive(const v8::FunctionCallbackInfo<v8::Value>& args);
static void Bind(const v8::FunctionCallbackInfo<v8::Value>& args);
static void Bind6(const v8::FunctionCallbackInfo<v8::Value>& args);
static void Listen(const v8::FunctionCallbackInfo<v8::Value>& args);
static void Connect(const v8::FunctionCallbackInfo<v8::Value>& args);
static void Connect6(const v8::FunctionCallbackInfo<v8::Value>& args);
static void Open(const v8::FunctionCallbackInfo<v8::Value>& args);

Each of the functions above, for example SetNoDelay will all wrap a function call in libuv:

void TCPWrap::SetNoDelay(const FunctionCallbackInfo<Value>& args) {
  TCPWrap* wrap;
  ASSIGN_OR_RETURN_UNWRAP(&wrap,
                          args.Holder(),
                          args.GetReturnValue().Set(UV_EBADF));
  int enable = static_cast<int>(args[0]->BooleanValue());
  int err = uv_tcp_nodelay(&wrap->handle_, enable);
  args.GetReturnValue().Set(err);
}

When a new instance of this class is created it will initialize the handle which must be of type uv_tcp_t:

int r = uv_tcp_init(env->event_loop(), &handle_);

Now, a uv_tcp_t could be used for accepting connection but also for connecting to sockets.

When this is used in JavaScript is would look like this:

var TCP = process.binding('tcp_wrap').TCP;
var handle = new TCP();

When the second line is executed the callback New will be invoked. This is set up by this line later in TCPWrap::Initialize:

target->Set(FIXED_ONE_BYTE_STRING(env->isolate(), "TCP"), t->GetFunction());

New takes a single argument of type v8::FunctionCallbackInfo which holds information about the function call make. These are things like the number of arguments used, the arguments can be retreived using with the operator[]. New looks like this:

void TCPWrap::New(const FunctionCallbackInfo<Value>& a) {
  CHECK(a.IsConstructCall());
  Environment* env = Environment::GetCurrent(a);
  TCPWrap* wrap;
  if (a.Length() == 0) {
    wrap = new TCPWrap(env, a.This(), nullptr);
  } else if (a[0]->IsExternal()) {
    void* ptr = a[0].As<External>()->Value();
    wrap = new TCPWrap(env, a.This(), static_cast<AsyncWrap*>(ptr));
  } else {
    UNREACHABLE();
  }
  CHECK(wrap);
}

Like mentioned above when the constructor of TCPWrap is called it will initialize the uv_tcp_t handle. Using the example above we can see that Length should be 0 as we did not pass any arguments to the TCP function. Just wondering, what could be passed as a parameter?
What ever it might look like it should be a pointer to an AsyncWrap.

So this is where the instance of TCPWrap is created. Notice a.This() which is passed all the way up to BaseObject's constructor and made into a persistent handle.

const req = new TCPConnectWrap();
const err = client.connect(req, '127.0.0.1', this.address().port);

Now, new TcpConnectWrap() is setup in TCPWrap::Initalize and the only thing that happens here is that it configured with a constructor that checks that this function is called with the new keyword. So there is really nothing else happening at this stage. But, when we call client.connect something interesting does happen: TCPWrap::Connect

if (err == 0) {
ConnectWrap* req_wrap = new ConnectWrap(env, req_wrap_obj, AsyncWrap::PROVIDER_TCPCONNECTWRAP);
err = uv_tcp_connect(req_wrap->req(),
                     &wrap->handle_,
                     reinterpret_cast<const sockaddr*>(&addr),
                     AfterConnect);
req_wrap->Dispatched();
if (err)
  delete req_wrap;

}

So we can see that we are creating a new ConnectWrap instance which extends AsyncWrap and also ReqWrap. Thinking about this makes sense I think. If we recall that the classes with Wrap in them wrap libuv concepts, and in this case we are going to make a tcp connection. If we look at our client example we can see that we are using uv_connect_t make the connection (named connection_req):

r = uv_tcp_connect(&connect_req,
                   &tcp_client,
                   (const struct sockaddr*) &addr,
                   connect_cb);

tcp_client in the above example is of type uv_tcp_t. But ConnectWrap also extend AsyncWrap. See the AsyncWrap section for details. What might be of interest and something to look into a little deeper is that ReqWrap will add the request wrap (wrapping a uv_req_t remember) to the current env req_wrap_queue. Keep in mind that a reqest is shortlived. The last thing that the ConnectWrap constructor does is call Wrap:

Wrap(req_wrap_obj, this);

Now, you might not remember what this req_wrap_obj is, but it was the first argument to client.connect and was the new TCPConnectWrap instance. But this was nothing more than a constructor and nothing else:

(lldb) p req_wrap_obj
(v8::Local<v8::Object>) $34 = (val_ = 0x00007fff5fbfd018)
(lldb) p *(*req_wrap_obj)
(v8::Object) $35 = {}

We can see that this is a v8::Localv8::Object and we are going to store the ConnectWrap instance in this object:

req_wrap_obj->SetAlignedPointerInInternalField(0, this);

So why is this being done?
Well if you take a look in AfterConnect you can see that this will be accessed and passed as a parameter to the oncomplete function:

ConnectWrap* req_wrap = static_cast<ConnectWrap*>(req->data);
...
Local<Value> argv[5] = {
  Integer::New(env->isolate(), status),
  wrap->object(),
  req_wrap->object(),
  Boolean::New(env->isolate(), readable),
  Boolean::New(env->isolate(), writable)
};
req_wrap->MakeCallback(env->oncomplete_string(), arraysize(argv), argv);

This will then invoke the oncomplete callback set up on the req object:

  req.oncomplete = function(status, client_, req_, readable, writable) {
  }

NewFunctionTemplate

NewFunctionTemplate in env.h specifies a default value for the second parameter `v8::Localv8::Signature() so it does not have to be specified.

 v8::Local<v8::External> external = as_external();
 return v8::FunctionTemplate::New(isolate(), callback, external, signature);

(lldb) p callback
(v8::FunctionCallback) $0 = 0x0000000100db8540 (node`node::TCPWrap::New(v8::FunctionCallbackInfo<v8::Value> const&) at tcp_wrap.cc:107)

So t is a function template, a blueprint for a single function. You create an instance of the template by calling GetFunction. Recall that in JavaScript to create a new type of object you use a function. When this function is used as a constructor, using new, the returned object will be an instance of the InstanceTemplate (ObjectTemplate) that will be discussed shortly.

t->SetClassName(FIXED_ONE_BYTE_STRING(env->isolate(), "TCP"));

The class name is is used for printing objects created with the function created from the FunctionTemplate as its constructor.

 t->InstanceTemplate()->SetInternalFieldCount(1);

InstanceTemplate returns the ObjectTemplate associated with the FunctionTemplate. Every FunctionTemplate has one. Like mentioned before this is the object that is returned after having used the FunctionTemplate as a constructor. SetInternalFieldCount(1) instructs V8 to allocate internal storage for every instance created using this template. Anything can be stored that space allocated, and for node this is often done using SetAlignedPointerInInternalField. This could then be retrieved using GetAlignedPointerFromInternalField.

Next, the ObjectTemplate is set up. First a number of properties are configured:

t->InstanceTemplate()->Set(String::NewFromUtf8(env->isolate(), "reading"),
                           Boolean::New(env->isolate(), false));

Then, a number of prototype methods are set:

env->SetProtoMethod(t, "close", HandleWrap::Close);

Alright, lets take a look at this SetProtoMethod method in Environment:

inline void Environment::SetProtoMethod(v8::Local<v8::FunctionTemplate> that,
                                     const char* name,
                                     v8::FunctionCallback callback) {
v8::Local<v8::Signature> signature = v8::Signature::New(isolate(), that);
v8::Local<v8::FunctionTemplate> t = NewFunctionTemplate(callback, signature);
// kInternalized strings are created in the old space.
const v8::NewStringType type = v8::NewStringType::kInternalized;
v8::Local<v8::String> name_string =
   v8::String::NewFromUtf8(isolate(), name, type).ToLocalChecked();
that->PrototypeTemplate()->Set(name_string, t);
t->SetClassName(name_string);  // NODE_SET_PROTOTYPE_METHOD() compatibility.

}

A Signature has the following class documentation: "A Signature specifies which receiver is valid for a function.". So the receiver is set to be that which is t, our newly created FunctionTemplate.

Next, we are creating a FunctionTemplate for the call back HandleWrap::Close with the signature just created. Then, we will set the function template as a PrototypeTemplate. Again we see t->SetClassName which I believe is for when this is printed. There are few more prototype methods that use HandleWrap callbacks:

env->SetProtoMethod(t, "ref", HandleWrap::Ref);
env->SetProtoMethod(t, "unref", HandleWrap::Unref);
env->SetProtoMethod(t, "hasRef", HandleWrap::HasRef);

So have have a class called HandleWrap, which I think requires a section of its own.

After this we find the following line:

StreamWrap::AddMethods(env, t, StreamBase::kFlagHasWritev);

This method is defined in stream_wrap.cc:

env->SetProtoMethod(target, "setBlocking", SetBlocking);
StreamBase::AddMethods<StreamWrap>(env, target, flags);

I've been wondering about the class names that end with Wrap and what they are wrapping. My thinking now is that they are wrapping libuv things. For instance, take StreamWrap, in libuv src/unix/stream.c which is what SetBlocking calls:

 void StreamWrap::SetBlocking(const FunctionCallbackInfo<Value>& args) {
   StreamWrap* wrap;
   ASSIGN_OR_RETURN_UNWRAP(&wrap, args.Holder());

   CHECK_GT(args.Length(), 0);
   if (!wrap->IsAlive())
     return args.GetReturnValue().Set(UV_EINVAL);

   bool enable = args[0]->IsTrue();
   args.GetReturnValue().Set(uv_stream_set_blocking(wrap->stream(), enable));
}

Lets take a look at ASSIGN_OR_RETURN_UNWRAP:

#define ASSIGN_OR_RETURN_UNWRAP(ptr, obj, ...)                                \
  do {                                                                        \
    *ptr =                                                                    \
        Unwrap<typename node::remove_reference<decltype(**ptr)>::type>(obj);  \
    if (*ptr == nullptr)                                                      \
      return __VA_ARGS__;                                                     \
} while (0)

So what would this look like after the preprocessor has processed it (need to double check this):

do {
  *wrap = Unwrap<uv_stream_t>(obj);
  if (*wrap == nullptr)
     return;
} while (0);

What does __VA_ARGS__ do?
I've seen this before with variadic methods in c, but not sure what it means to return it. Turns out that if you don't pass anything apart from the required arguments then the return __VA_ARGS_ statement will just bereturn;`. There are other places when the usage of this macro does pass additional arguments, for example:

ASSIGN_OR_RETURN_UNWRAP(&wrap,
                       args.Holder(),
                       args.GetReturnValue().Set(UV_EBADF));

do {
  *wrap = Unwrap<uv_stream_t>(obj);
  if (*wrap == nullptr)
     return args.GetReturnValue.Set(UV_EBADF);
} while (0);

So we will be returning early with a BADF (bad file descriptor) error.

BaseObject

inline BaseObject(Environment* env, v8::Local<v8::Object> handle);

I'm thinking that the handle is the Node representation of a libuv handle.

inline v8::Persistent<v8::Object>& persistent();

A persistent handle lives on the heap just like a local handle but it does not correspond to C++ scopes. You have to explicitly call Persistent::Reset.

ReqWrap

class ReqWrap : public AsyncWrap

cares_wrap.cc has a subclass named GetAddrInfoReqWrap node_file.cc has a subclass named FSReqWrap stream_base.cc has a subclass name ShutDownWrap stream_base.cc has a subclass name WriteWrap udp_wrap.cc has a subclass named SendWrap connect_wrap.cc has a subclass named 'ConnectWrap` which is subclassed by PipeWrap and TCPWrap.

AsyncWrap

Some background about AsyncWrap can be found here So using AsyncWrap we can have callbacks invoked during the life of handle objects. A handle object would for example be a TCPWrap which extends ConnectionWrap -> StreamWrap -> HandleWrap.

Being a builtin module it follows the same initialization as others. So lets take a look at the initialization function and see what kind of functions are made available from JavaScript:

env->SetMethod(target, "setupHooks", SetupHooks); env->SetMethod(target, "pushAsyncIds", PushAsyncIds); env->SetMethod(target, "popAsyncIds", PopAsyncIds); env->SetMethod(target, "queueDestroyAsyncId", QueueDestroyAsyncId); env->SetMethod(target, "enablePromiseHook", EnablePromiseHook); env->SetMethod(target, "disablePromiseHook", DisablePromiseHook); env->SetMethod(target, "registerDestroyHook", RegisterDestroyHook);

You can confirm this by using:

$ ./node --expose-internals  -p "require('internal/test/binding').internalBinding('async_wrap')"


env->set_async_hooks_init_function(init_v.As<Function>());

So, if you are like me you might have gone searching for this set_async_hooks_init_function and not finding it. You might recall this coming up before. So every environment will have such setters and getters for

 V(async_hooks_destroy_function, v8::Function)                               
 V(async_hooks_init_function, v8::Function)                                  
 V(async_hooks_post_function, v8::Function)                                  
 V(async_hooks_pre_function, v8::Function)

So, we are setting a field named async_hooks_init_function_ in the current env. An example of this usage might be:

const asyncWrap = process.binding('async_wrap');
let asyncObject = {
  init: function(uid, provider, parentUid, parentHandle) {
    process._rawDebug('init uid:', uid, ', provider:', provider);
  },
  pre: function(uid) {
    process._rawDebug('pre uid:', uid);
  },
  post: function(uid, didThrow) {
    process._rawDebug('post. uid:', uid, 'didThrow:', didThrow);
  },
  destroy: function(uid) {
    process._rawDebug('destroy: uid:', uid);
  }
};

There is also a module named async_hoooks in lib/async_hook.js that can be used:

var ah = require('async_hooks');
let asyncObject = {
  init: function(uid, provider, parentUid, parentHandle) {
    process._rawDebug('init uid:', uid, ', provider:', provider);
  },
  before: function(uid) {
    process._rawDebug('pre uid:', uid);
  },
  after: function(uid, didThrow) {
    process._rawDebug('post. uid:', uid, 'didThrow:', didThrow);
  },
  destroy: function(uid) {
    process._rawDebug('destroy: uid:', uid);
  }
};
let asynchook = ah.createHook(asyncObject);

Now, we can create a break point and see when SetupHooks is called.

(lldb) br s -n node::SetupHooks

When the startup function in bootstrap_node.js is executed it will call the function setupProcessFatal.

function setupProcessFatal() {
  const {
    executionAsyncId,
    clearDefaultTriggerAsyncId,
    clearAsyncIdStack,
    hasAsyncIdStack,
    afterHooksExist,
    emitAfter
  } = NativeModule.require('internal/async_hooks');
  ...

This will load internal/async_hooks module which will call:

const async_wrap = process.binding('async_wrap');
...
async_wrap.setupHooks({ init: emitInitNative,
                        before: emitBeforeNative,
                        after: emitAfterNative,
                        destroy: emitDestroyNative,
                        promise_resolve: emitPromiseResolveNative });

So this is the first time that setupHooks is called. In SetupHooks we have the following macro:

#define SET_HOOK_FN(name)                                                     \
  Local<Value> name##_v = fn_obj->Get(                                        \
      env->context(),                                                         \
      FIXED_ONE_BYTE_STRING(env->isolate(), #name)).ToLocalChecked();         \
  CHECK(name##_v->IsFunction());                                              \
  env->set_async_hooks_##name##_function(name##_v.As<Function>());

  SET_HOOK_FN(init);
  SET_HOOK_FN(before);
  SET_HOOK_FN(after);
  SET_HOOK_FN(destroy);
  SET_HOOK_FN(promise_resolve);
#undef SET_HOOK_FN

Lets expand it for the init function:

  Local<Value> init_v = fn_obj->Get(env->context(), FIXED_ONE_BYTE_STRING(env->isolate(), "init")).ToLocalChecked();
  CHECK(init_v->IsFunction());
  env->set_async_hooks_init_function(init_v.As<Function>());

After these function have been set we also have the following code in SetupHooks:

  Local<FunctionTemplate> ctor = FunctionTemplate::New(env->isolate()); 
  ctor->SetClassName(FIXED_ONE_BYTE_STRING(env->isolate(), "PromiseWrap"));
  Local<ObjectTemplate> promise_wrap_template = ctor->InstanceTemplate();
  promise_wrap_template->SetInternalFieldCount(PromiseWrap::kInternalFieldCount);  // kInternalFieldCount = 3
  promise_wrap_template->SetAccessor(FIXED_ONE_BYTE_STRING(env->isolate(), "promise"), PromiseWrap::GetPromise);
  promise_wrap_template->SetAccessor(FIXED_ONE_BYTE_STRING(env->isolate(), "isChainedPromise"), PromiseWrap::getIsChainedPromise);
  env->set_promise_wrap_template(promise_wrap_template);

PromiseWrap is a class defined in async_wrap.cc. So we are setting up a constructor template for a PromiseWrap on the environment.

(lldb) expr env->async_hooks_init_function()
(v8::Local<v8::Function>) $66 = (val_ = 0x00000001060013c0)
(lldb) jlh env->async_hooks_init_function()
0x329732782401: [Function]
 - map = 0x329757882521 [FastProperties]
 - prototype = 0x3297917043d1
 - elements = 0x3297cd382251 <FixedArray[0]> [HOLEY_ELEMENTS]
 - function prototype =
 - initial_map =
 - shared_info = 0x32979b184ce1 <SharedFunctionInfo emitInitNative>
 - name = 0x32979b1840f9 <String[14]: emitInitNative>
 - formal_parameter_count = 4
 - kind = [ NormalFunction ]
 - context = 0x329732782241 <FixedArray[38]>
 - code = 0x19146b71b241 <Code BUILTIN>
 - source code = (asyncId, type, triggerAsyncId, resource) {
  active_hooks.call_depth += 1;
  // Use a single try/catch for all hook to avoid setting up one per iteration.
  try {
    for (var i = 0; i < active_hooks.array.length; i++) {
      if (typeof active_hooks.array[i][init_symbol] === 'function') {
        active_hooks.array[i][init_symbol](
          asyncId, type, triggerAsyncId,
          resource
        );
      }
    }
  } catch (e) {
    fatalError(e);
  } finally {
    active_hooks.call_depth -= 1;
  }

  // Hooks can only be restored if there have been no recursive hook calls.
  // Also the active hooks do not need to be restored if enable()/disable()
  // weren't called during hook execution, in which case active_hooks.tmp_array
  // will be null.
  if (active_hooks.call_depth === 0 && active_hooks.tmp_array !== null) {
    restoreActiveHooks();
  }
}
 - properties = 0x3297cd382251 <FixedArray[0]> {
    #length: 0x3297cd3b8c81 <AccessorInfo> (const accessor descriptor)
    #name: 0x3297cd3b8c11 <AccessorInfo> (const accessor descriptor)
    #prototype: 0x3297cd3b8cf1 <AccessorInfo> (const accessor descriptor)
 }

 - feedback vector: not available

So lets take a closer look at emitInitNative in 'lib/internal/async_hooks.js':

active_hooks.call_depth += 1;
try {
  for (var i = 0; i < active_hooks.array.length; i++) {
    if (typeof active_hooks.array[i][init_symbol] === 'function') {
      active_hooks.array[i][init_symbol]( asyncId, type, triggerAsyncId, resource);
    }
  }
} catch (e) {
  fatalError(e);
} finally {
  active_hooks.call_depth -= 1;
}

When will env->async_hooks_init_function() be called?

Each Environment has an AsyncHook as a member (src/env.h):

AsyncHooks async_hooks_;

AsyncHook is a nested class of Environment.

Lets take a look at tcp_wrap.h. TCPWrap extends ConnectionWrap, which extends LibuvStreamWrap, which extends HandleWrap, which extends AsyncWrap.

listen will eventually call:

handle = new TCP(TCPConstants.SERVER);

This will call void TCPWrap::New:

  ...
  ProviderType provider;
  switch (type) {
    case SOCKET:
      provider = PROVIDER_TCPWRAP;
      break;
    case SERVER:
      provider = PROVIDER_TCPSERVERWRAP;
      break;
    default:
      UNREACHABLE();
  }

  new TCPWrap(env, args.This(), provider);

This constructor call will delegate up to the BaseObject class's constructor and then continue with AsyncWrap's constructor:

#define NODE_ASYNC_ID_OFFSET 0xA1C
  ...
  // Shift provider value over to prevent id collision.
  persistent().SetWrapperClassId(NODE_ASYNC_ID_OFFSET + provider);

Lets take a look SetWrapperClassId more closely. e following will use NODE_ASYNC_ID_OFFSET is 2588 in decimal.

(lldb) expr provider_type_
(const node::AsyncWrap::ProviderType) $3 = PROVIDER_PROMISE
(lldb) expr 2588 + provider_type_
(int) $4 = 2607

persistent() is a member function of BaseObject which returns the handle for this AsyncWrap instance. SetWrapperClassId is a member function in PersistentBase<T>:

internal::Object** obj = reinterpret_cast<internal::Object**>(this->val_);
uint8_t* addr = reinterpret_cast<uint8_t*>(obj) + I::kNodeClassIdOffset

obj is a pointer to a pointer which I find hard to visualize but here is an attempt:

(lldb) expr this->val_
(v8::Object *) $76 = 0x000000010604b680
(lldb) expr obj
(v8::internal::Object **) $75 = 0x000000010604b680

lldb) expr &this->val_
(v8::Object **) $84 = 0x0000000104507a28

(lldb) expr &obj
(v8::internal::Object ***) $85 = 0x00007fff5fbfc898

So this->val_ is a pointer to v8::Object and we are using reinterpret_cast to instruct the compiler to treat it like it was of type v8::internal::Object**.

&this->val
0x0000000104507a28 -----------------------> 0x000000010604b680 -------------> v8::Object
                                                  | 
                                                  | 
&obj                                              |
0x00007fff5fbfc898 -------------------------------+

Next, we are going to interpret the obj as an uint8_t* type so that we can perform the addition:

uint8_t* addr = reinterpret_cast<uint8_t*>(obj) + I::kNodeClassIdOffset
*reinterpret_cast<uint16_t*>(addr) = class_id;

And then we set the value that addr is pointing to, to the passed in class it. But that does not really explain why this is being done. Lets stick a break point in the PromiseWrap constructor:

(lldb) br s -f async_wrap.cc -l 611

Now, lets try casting obj to

(lldb) expr *reinterpret_cast<v8::internal::GlobalHandles::Node*>(obj)
(v8::internal::GlobalHandles::Node) $96 = {
  object_ = 0x000033d8d5325b49
  class_id_ = 0
  index_ = 'D'
  flags_ = 'a'
  weak_callback_ = 0x0000000000000000
  parameter_or_next_free_ = {
    parameter = 0x0000000000000000
    next_free = 0x0000000000000000
  }
}

v8::internal::GlobalHandles::Node can be found in deps/v8/src/global-handles.cc and we can see that it has the following private members:

  Object* object_;
  // Wrapper class ID.
  uint16_t class_id_;
  // Index in the containing handle block.
  uint8_t index_;

Notice the order. This is what addr is pointing to in :

uint8_t* addr = reinterpret_cast<uint8_t*>(obj) + I::kNodeClassIdOffset
*reinterpret_cast<uint16_t*>(addr) = class_id;

After having set the value we can again inspect the value:

(lldb) expr *reinterpret_cast<v8::internal::GlobalHandles::Node*>(obj)
(v8::internal::GlobalHandles::Node) $107 = {
  object_ = 0x000033d8d5325b49
  class_id_ = 2607
  index_ = 'D'
  flags_ = 'a'
  weak_callback_ = 0x0000000000000000
  parameter_or_next_free_ = {
    parameter = 0x0000000000000000
    next_free = 0x0000000000000000
  }
}

TODO: Figure out how the connection between the Node and the stored object actually works.

So lets take a step back and see how things all fits together.

(lldb) br s -f async_wrap.cc -l 264
(lldb) br s -f v8.h -l 9082

When we create a new PromiseWrap it will delegate calling the above constructors, the last on is the BaseObject constructor it will invoke .

template <class T>
T* PersistentBase<T>::New(Isolate* isolate, T* that) {
  if (that == NULL) return NULL;
  internal::Object** p = reinterpret_cast<internal::Object**>(that);
  return reinterpret_cast<T*>(V8::GlobalizeReference(reinterpret_cast<internal::Isolate*>(isolate), p));
}

GlobalizeReference will then do the following:

i::Object** V8::GlobalizeReference(i::Isolate* isolate, i::Object** obj) {
  LOG_API(isolate, Persistent, New);
  i::Handle<i::Object> result = isolate->global_handles()->Create(*obj);
  return result.location();
}

GlobalHandles can be found in deps/v8/src/global-handles.h. A global handle is alive until it's Destroy function is called (so it is not cleared when it does out of scope like a local handle with a HandleScope).

(lldb) expr *this
(v8::internal::GlobalHandles) $126 = {
  isolate_ = 0x0000000105807800
  number_of_global_handles_ = 1
  first_block_ = 0x0000000104801800
  first_used_block_ = 0x0000000104801800
  first_free_ = 0x0000000104801820
  new_space_nodes_ = size=1 {
    [0] = 0x0000000104801800
  }
  post_gc_processing_count_ = 0
  number_of_phantom_handle_resets_ = 0
  pending_phantom_callbacks_ = size=0 {}
}

GlobalHandles class has a number of private members as we can see above:

Isolate* isolate_;
int number_of_global_handles_;
NodeBlock* first_block_;
NodeBlock* first_used_block_;
Node* first_free_;
std::vector<Node*> new_space_nodes_;
int post_gc_processing_count_;
size_t number_of_phantom_handle_resets_;
std::vector<PendingPhantomCallback> pending_phantom_callbacks_;

So lets take a closer look at Create and see what it does.

Node* result = first_free_;
result->Acquire(value);

Acquire performs the following on the passed on Object:

  DCHECK(state() == FREE);
  object_ = object;
  class_id_ = v8::HeapProfiler::kPersistentHandleNoClassId;
  set_active(false);
  set_state(NORMAL);
  parameter_or_next_free_.parameter = nullptr;
  weak_callback_ = nullptr;
  IncreaseBlockUses();

Now, Aquire is a member function on GlobalHandles::Node and we can see that the object pointer is set, and the class_id_ (and others but I'm focusing on these two).

Create then returns:

return result->handle();

Which does:

Handle<Object> handle() { return Handle<Object>(location()); }

v8::Object does not have any members and an Object can be either a Smi or a HeapObject.

AsyncReset() is called from AsyncWrap's constructor:

  // Use AsyncReset() call to execute the init() callbacks.
  AsyncReset(execution_async_id);

Note that AsyncWrap has a default value for execution_async_id in async_wrap.h:

double execution_async_id = -1

Which is the value of execution_async_id in this call:

(lldb) expr execution_async_id
(double) $25 = -1

Also not the AsyncReset has default values defined for its parameters:

void AsyncReset(double execution_async_id = -1, bool silent = false);

So lets take a look at AsyncReset:

async_id_ = execution_async_id == -1 ? env()->new_async_id() : execution_async_id;

In our case we will get a new async_id (double) from the environment instance:

inline double Environment::new_async_id() {
  async_hooks()->async_id_fields()[AsyncHooks::kAsyncIdCounter] += 1;
  return async_hooks()->async_id_fields()[AsyncHooks::kAsyncIdCounter];
}
lldb) expr *async_hooks()
(node::Environment::AsyncHooks) $28 = {
...
async_id_fields_ = {
  isolate_ = 0x0000000105007400
  count_ = 4
  byte_offset_ = 0
  buffer_ = 0x0000000104a1a2e0
  js_array_ = {
    v8::PersistentBase<v8::Float64Array> = (val_ = 0x000000010505c840)
  }
  free_buffer_ = true
}

async_id_fields_ is defined in src/env.h:

AliasedBuffer<double, v8::Float64Array> async_id_fields_;

The types of fields are:

enum UidFields {
  kExecutionAsyncId,
  kTriggerAsyncId,
  kAsyncIdCounter,
  kDefaultTriggerAsyncId,
  kUidFieldsCount,
};

So if we look at the above call again:

  async_hooks()->async_id_fields()[AsyncHooks::kAsyncIdCounter] += 1;

So we obtaining an id for this AsyncWrap resource which remember is of type TCPWrap. Next we have:

trigger_async_id_ = env()->get_default_trigger_async_id();

The trigger id is an id for what triggered. Which we can find in src/env-inl.h:

inline double Environment::get_default_trigger_async_id() {
  double default_trigger_async_id = async_hooks()->async_id_fields()[AsyncHooks::kDefaultTriggerAsyncId];
  // If defaultTriggerAsyncId isn't set, use the executionAsyncId
  if (default_trigger_async_id < 0)
    default_trigger_async_id = execution_async_id();
  return default_trigger_async_id;
}

This time we are using kDefaultTriggerAsyncId as the index. In our case:

(lldb) expr default_trigger_async_id
(double) $36 = -1

So execution_async_id() will be called which does:

return async_hooks()->async_id_fields()[AsyncHooks::kExecutionAsyncId];

Next in AsyncReset we have:

 switch (provider_type()) {
#define V(PROVIDER)                                                           \
    case PROVIDER_ ## PROVIDER:                                               \
      TRACE_EVENT_NESTABLE_ASYNC_BEGIN2("node.async_hooks",                   \
        #PROVIDER, static_cast<int64_t>(get_async_id()),                      \
        "executionAsyncId",                                                   \
        static_cast<int64_t>(env()->execution_async_id()),                    \
        "triggerAsyncId",                                                     \
        static_cast<int64_t>(get_trigger_async_id()));                        \
      break;
    NODE_ASYNC_PROVIDER_TYPES(V)
#undef V
    default:
      UNREACHABLE();
  }
(lldb) expr provider_type()
(node::AsyncWrap::ProviderType) $39 = PROVIDER_TCPSERVERWRAP

Lets expand the macro for the provider:

  case PROVIDER_TCPSERVERWRAP:
    TRACE_EVENT_NESTABLE_ASYNC_BEGIN2("node.async_hooks",                   
      TCPSERVERWRAP, static_cast<int64_t>(get_async_id()),                 
      "executionAsyncId",                                                   
      static_cast<int64_t>(env()->execution_async_id()),                    
      "triggerAsyncId",                                                     
      static_cast<int64_t>(get_trigger_async_id()));                        
    break;

This add a v8 tracing event. TODO: take a closer look at this.

Next, we have the following:

  if (silent) return;

  EmitAsyncInit(env(), object(),
                env()->async_hooks()->provider_string(provider_type()),
                async_id_, trigger_async_id_);

EmitAsyncInit has the following code:

Local<Function> init_fn = env->async_hooks_init_function();

This is the function that we looked at earlier. We can see the arguments being created:

Local<Value> argv[] = {
    Number::New(env->isolate(), async_id),
    type,
    Number::New(env->isolate(), trigger_async_id),
    object,
  };

And these match the parameters that init takes:

init uid: 6 , provider: TCPSERVERWRAP , parentUid: 1 , parentHandle: TCP { reading: false, owner: null, onread: null, onconnection: null }

And the function will execute the code generated from emitInitNative which will call all the init functions that have been registered. So where/when are the init functions added to the active_hooks.array ? Well, if we take a look at lib/internal/async_hooks.js we find:

const active_hooks = {
  // Array of all AsyncHooks that will be iterated whenever an async event fires.
  array: [],
  call_depth: 0,
  tmp_array: null,
  tmp_fields: null
};

active_hooks is accessed by calling setHookArrays() in lib/async_hooks.js, called in enable and disable. Let's take a look at enable first:

  const [hooks_array, hook_fields] = getHookArrays();

  // Each hook is only allowed to be added once.
  if (hooks_array.includes(this))
    return this;

  const prev_kTotals = hook_fields[kTotals];
  hook_fields[kTotals] = 0;

  // createHook() has already enforced that the callbacks are all functions,
  // so here simply increment the count of whether each callbacks exists or
  // not.
  hook_fields[kTotals] += hook_fields[kInit] += +!!this[init_symbol];
  hook_fields[kTotals] += hook_fields[kBefore] += +!!this[before_symbol];
  hook_fields[kTotals] += hook_fields[kAfter] += +!!this[after_symbol];
  hook_fields[kTotals] += hook_fields[kDestroy] += +!!this[destroy_symbol];
  hook_fields[kTotals] += hook_fields[kPromiseResolve] += +!!this[promise_resolve_symbol];
  hooks_array.push(this);
 
  if (prev_kTotals === 0 && hook_fields[kTotals] > 0) {
      enableHooks();
  }

  return this;

hook_fields[kTotals] is first set to 0. Not sure why the first line is doing += as it would only be adding zero. Could that not just be:

  hook_fields[kTotals] = hook_fields[kInit] += +!!this[init_symbol];

But lets take a look of the rest of this expression.

  hook_fields[kInit] += +!!this[init_symbol];

hook_fields[kInit] is a counter of all the init hooks. This is going to add either 0 or 1 depending on if there is a function for the init_symbol. Notice the + unary operator sign before double ! operators. It will try to convert that boolean to a number. So if an init function was defined this will add 1 to hooks_fields[kInit] otherwise 0. Next, we see that this AsyncHook instance is added to the hooks_array. This answers the above question regarding here the init fuctions are added to the hooks array.

Following this, lets take a look at enableHooks() which can be found in lib/internal/async_wrap.js:

function enableHooks() {
  enablePromiseHook();
  async_hook_fields[kCheck] += 1;
}

enablePromiseHook can be found in src/async_wrap.cc:

static void EnablePromiseHook(const FunctionCallbackInfo<Value>& args) {
  Environment* env = Environment::GetCurrent(args);
  env->AddPromiseHook(PromiseHook, static_cast<void*>(env));
}

This will land us in env.cc AddPromiseHook:

void Environment::AddPromiseHook(promise_hook_func fn, void* arg) {
  auto it = std::find_if(
      promise_hooks_.begin(), promise_hooks_.end(),
      [&](const PromiseHookCallback& hook) {
        return hook.cb_ == fn && hook.arg_ == arg;
      });
  if (it != promise_hooks_.end()) {
    it->enable_count_++;
    return;
  }
  promise_hooks_.push_back(PromiseHookCallback{fn, arg, 1});

  if (promise_hooks_.size() == 1) {
    isolate_->SetPromiseHook(EnvPromiseHook);
  }
}

Every environment has a vector of PromiseHookCallbacks (env.h):

struct PromiseHookCallback {
    promise_hook_func cb_;
    void* arg_;
    size_t enable_count_;
  };
std::vector<PromiseHookCallback> promise_hooks_;

Next, remember that std::find_if will return an iterator to the first element for which the predicate returns true. If nothing is found the function returns last/end. So if the passed in promise_hook_func which is a typedef for a function pointer:

typedef void (*promise_hook_func) (v8::PromiseHookType type,
                                   v8::Local<v8::Promise> promise,
                                   v8::Local<v8::Value> parent,
                                   void* arg);

if the fn and the arg already match an existing PromiseHookCallback, the iterator will not be equal to end() and in that case that PromiseHookCallback will have it's enable_count incremented and return. If the passed in promise_hook_func and args have not been added then a new PromiseHookCallback will be created and added to promise_hooks_.

Last thing that happens in AddPromiseHook is SetPromiseHook is called on the isolate. The function EnvPromiseHook is a promise hook that will run all the promise_hook_'s added.

So when will EnvPromiseHook be called? Lets take a closer look at the V8 side of this. If we look in deps/v8/include/v8.h we can find:

enum class PromiseHookType { kInit, kResolve, kBefore, kAfter };

typedef void (*PromiseHook)(PromiseHookType type, Local<Promise> promise,
                            Local<Value> parent);

So we can see that PromiseHook is a function pointer to a function that takes a type, promise, and a parent value. This callback will be invoked by Isolate::RunPromiseHook:

void Isolate::RunPromiseHook(PromiseHookType type, Handle<JSPromise> promise,
                             Handle<Object> parent) {
  if (debug()->is_active()) debug()->RunPromiseHook(type, promise, parent);
  if (promise_hook_ == nullptr) return;
  promise_hook_(type, v8::Utils::PromiseToLocal(promise), v8::Utils::ToLocal(parent));
}

The first time this is called is from deps/v8/src/runtime/runtime-promise.cc and:

RUNTIME_FUNCTION(Runtime_PromiseHookInit) {
  HandleScope scope(isolate);
  DCHECK_EQ(2, args.length());
  CONVERT_ARG_HANDLE_CHECKED(JSPromise, promise, 0);
  CONVERT_ARG_HANDLE_CHECKED(Object, parent, 1);
  isolate->RunPromiseHook(PromiseHookType::kInit, promise, parent);
  return isolate->heap()->undefined_value();
}

V8 provides four hooks, init, resolve, before, and after. Init is run when a new Promise is created in V8. resolve when a promise is resolved. before is run before a PromiseReactionJob, and after after that job.

Lets say we create the following JavaScript:

const p = new Promise((resolve, reject) => {
  resolve('ok');
});

If I'm reading this correctly, this would invoke code generated by v8/src/builtins/builtins-promise-gen.cc:

TF_BUILTIN(PromiseConstructor, PromiseBuiltinsAssembler) {
  ...
  GotoIfNot(IsPromiseHookEnabledOrDebugIsActive(), &debug_push);
  CallRuntime(Runtime::kPromiseHookInit, context, instance, UndefinedConstant());
}

So, Runtime_PromiseHookInit would only be called if a promise hook was enabled. Lets see if we can force this:

(lldb) br s -n ExecuteScript
(lldb) r
(lldb) expr ((v8::internal::Isolate*)env->isolate())->promise_hook_or_debug_is_active_
(bool) $8 = false
(lldb) expr ((v8::internal::Isolate*)env->isolate())->promise_hook_or_debug_is_active_ = true
(bool) $9 = true
(lldb) br s -f runtime-promise.cc -l 113
(lldb) continue

Indeed, we can now see that Runtime_PromiseHookInit is getting called and it will in turn call Isolate::RunPromiseHook, but in this case promise_hook_ will be a nullptr as it was not set it will simply return.

When a PromiseHook as been set on the isolate it will call EnvPromiseHook EnvPromiseHook will iterate of all the added promise_hook_s in the environment and call there:

Environment* env = Environment::GetCurrent(promise->CreationContext());
for (const PromiseHookCallback& hook : env->promise_hooks_) {
  hook.cb_(type, promise, parent, hook.arg_);
}

Recall, that the cb_ in this case it the callback added from async_wrap.cc by EnablePromiseHook which was called from enableHooks() in lib/internal/async_wrap.js:

env->AddPromiseHook(PromiseHook, static_cast<void*>(env));

So, lets take a look a PromiseHook. For this we need to update our example to use both async_hooks and a promise:

const http = require('http')
const ah = require('async_hooks');
let asyncObject = {
  init: function(uid, provider, parentUid, parentHandle) {
    process._rawDebug('init uid:', uid, ', provider:', provider, ', parentUid:', parentUid);
  },
  before: function(uid) {
    process._rawDebug('before uid:', uid);
  },
  after: function(uid, didThrow) {
    process._rawDebug('after. uid:', uid, 'didThrow:', didThrow);
  },
  destroy: function(uid) {
    process._rawDebug('destroy: uid:', uid);
  },
  promiseResolve: function(uid) {
    process._rawDebug('promiseResolve: uid:', uid);
  }
};
let asynchook = ah.createHook(asyncObject);
asynchook.enable();

const p = new Promise((resolve, reject) => {
  resolve('ok');
});

p.then(msg => {
  console.log(msg);
});

We should now be able to set a breakpoint in PromiseHook:

(lldb) br s -f async_wrap.cc -l 288
(lldb) r

Just to recap, Runtime_PromiseHookInit in deps/v8/src/runtime/runtime-promise.cc will call our PromiseHook:

isolate->RunPromiseHook(PromiseHookType::kInit, promise, parent);

And Isolate::RunPromiseHook will call the registered hook, which is EnvPromiseHook, which iterates over the registered PromiseHookCallback (which is a struct containing the callback, the arg and a counter) calling the struct's callback (which is PromiseHook):

static void PromiseHook(PromiseHookType type, Local<Promise> promise,
                        Local<Value> parent, void* arg) {
  Environment* env = static_cast<Environment*>(arg);
  Local<Value> resource_object_value = promise->GetInternalField(0);
(lldb) expr promise
(v8::Local<v8::Promise>) $67 = (val_ = 0x00007fff5fbfce10)
(lldb) expr promise->State()
(v8::Promise::PromiseState) $12 = kPending
(lldb) expr promise->HasHandler()
(bool) $11 = false

v8::Promise can be found in deps/v8/include/v8.h. A v8::Promise also have a Result() function which can be called if the state is not pending. And a Catch and Then function to register function handlers. So this is the promise that was passed up from V8. GetInternalField is a function of v8::Object. For kInit there will not be any thing in the internal field (at least not during this debugging session) but later when PromiseWrap::New is called:

wrap = PromiseWrap::New(env, promise, nullptr, silent);

So, lets take a closer look at PromiseWrap::New:

  Local<Object> object = env->promise_wrap_template()->NewInstance(env->context()).ToLocalChecked();
  object->SetInternalField(PromiseWrap::kPromiseField, promise);
  object->SetInternalField(PromiseWrap::kIsChainedPromiseField,
                           parent_wrap != nullptr ?
                              v8::True(env->isolate()) :
                              v8::False(env->isolate()));
  CHECK_EQ(promise->GetAlignedPointerFromInternalField(0), nullptr);
  promise->SetInternalField(0, object);
  return new PromiseWrap(env, object, silent);

As the name of this class indicates this will wrap a v8::Promise. So we first create a new instance of the template that was created previously in SetupHooks. The wrapping is done by setting the promise on this object as an internal field (kPromiseField). Also note that the object is set as an internal field on the promise, as index 0. This is later accessed in PromiseHook

Local<Value> resource_object_value = promise->GetInternalField(0);
PromiseWrap* wrap = nullptr;
  if (resource_object_value->IsObject()) {
    Local<Object> resource_object = resource_object_value.As<Object>();
    wrap = Unwrap<PromiseWrap>(resource_object);
  }

So resource_object_value can contain PromiseWrap and if so it is unwrapped.

When a parent promise exists a DefaultTriggerAsyncIdScope is used before calling PromiseWrap::New:

  AsyncHooks::DefaultTriggerAsyncIdScope trigger_scope(parent_wrap);

The constructor that takes a AsyncWrap can be found in src/async_wrap-inl.h:

inline Environment::AsyncHooks::DefaultTriggerAsyncIdScope ::DefaultTriggerAsyncIdScope(AsyncWrap* async_wrap)
    : DefaultTriggerAsyncIdScope(async_wrap->env(),
                                 async_wrap->get_async_id()) {}

So this constructor just delegates to the one that takes an Environment pointer and a double. This constructor can be found in src/env-inl.h.

Recall that this following SetAccessor was set on the template:

  promise_wrap_template->SetAccessor(FIXED_ONE_BYTE_STRING(env->isolate(), "promise"), PromiseWrap::GetPromise);

So accessing promise on object above will invoke PromiseWrap::GetPromise:

  info.GetReturnValue().Set(info.Holder()->GetInternalField(kPromiseField));

Next, a new PromiseWrap is created using this object. And since PromiseWrap extends AsyncWrap it's constructor will call AsyncReset:

  AsyncReset(-1, silent);

which will call AsyncWrap::EmitAsyncInit:

  Local<Value> argv[] = {
    Number::New(env->isolate(), async_id),
    type,
    Number::New(env->isolate(), trigger_async_id),
    object,
  };
USE(init_fn->Call(env->context(), object, arraysize(argv), argv));

These are the arguments that will be passed to init:

init uid: 6 , provider: PROMISE , parentUid: 1 , parentHandle: PromiseWrap { isChainedPromise: false, promise: Promise { <pending> } }

In lib/internal/async_hooks.js we have these two fields:

const async_wrap = process.binding('async_wrap');
const { async_hook_fields, async_id_fields } = async_wrap;

When process.binding('async_wrap') is called this will invoke AsyncWrap::Initialize:

#define FORCE_SET_TARGET_FIELD(obj, str, field)                               \
  (obj)->DefineOwnProperty(context,                                           \
                           FIXED_ONE_BYTE_STRING(isolate, str),               \
                           field,                                             \
                           ReadOnlyDontDelete).FromJust()

  // Attach the uint32_t[] where each slot contains the count of the number of
  // callbacks waiting to be called on a particular event. It can then be
  // incremented/decremented from JS quickly to communicate to C++ if there are
  // any callbacks waiting to be called.
  FORCE_SET_TARGET_FIELD(target, "async_hook_fields", env->async_hooks()->fields().GetJSArray());

This will expand to:

  target->DefineOwnProperty(context,
                           FIXED_ONE_BYTE_STRING(isolate, "async_hook_fields"),
                           env->async_hooks()->fields().GetJSArray(),
                           ReadOnlyDontDelete).FromJust()

Notice that the field here is env->async_hooks()->fields().GetJSArray(). If we look at the fields function of AsyncHooks we see:

inline AliasedBuffer<uint32_t, v8::Uint32Array>& fields();

So the native type And inspecting it we can see:

(lldb) expr *this
(node::AliasedBuffer<unsigned int, v8::Uint32Array>) $6 = {
  isolate_ = 0x0000000105007400
  count_ = 8
  byte_offset_ = 0
  buffer_ = 0x0000000104a1c590
  js_array_ = {
    v8::PersistentBase<v8::Uint32Array> = (val_ = 0x000000010505ba20)
  }
  free_buffer_ = true
}

When an Environment is created the AsyncHooks constructor will be called as it is a member of Environment (src/env.h):

AsyncHooks async_hooks_;

AsyncHooks constructor (src/env-inl.h) will set construct a AliasedBuffer for the the async_id_fields_ member:

fields_(env()->isolate(), kFieldsCount),

In the constructor for AliasedBuffer we can see that a buffer is created:

buffer_ = Calloc<NativeT>(count)

So we are going to allocate zeroed out memory for an unsigned int (buffer_ pointing to it)

(lldb) expr count
(size_t) $11 = 8
(lldb) expr buffer_
(unsigned int *) $13 = 0x0000000106000000

Next, we are going to create a V8 ArrayBuffer using the newly allocated block of memory:

  v8::Local<v8::ArrayBuffer> ab = v8::ArrayBuffer::New(isolate_, buffer_, sizeInBytes);
(lldb) jlh ab
0x24f52a506c11: [JSArrayBuffer]
 - map = 0x24f5fbb02c01 [FastProperties]
 - prototype = 0x24f522e89819
 - elements = 0x24f5eb102251 <FixedArray[0]> [HOLEY_ELEMENTS]
 - embedder fields: 2
 - backing_store = 0x106000000
 - byte_length = 32
 - external
 - neuterable
 - properties = 0x24f5eb102251 <FixedArray[0]> {}
 - embedder fields = {
    0x0
    0x0
 }

We can see that the backing_store is pointing to the memory that was allocated. An ArrayBuffer is an object that represents a block of data. A view is used to access the data, which are called TypedArray views. This is what we create next:

v8::Local<V8T> js_array = V8T::New(ab, byte_offset_, count);
(lldb) expr byte_offset_
(size_t) $31 = 0
(lldb) expr count
(size_t) $32 = 8
(lldb) jlh js_array
0x24f52a506c61: [JSTypedArray]
 - map = 0x24f5fbb03011 [FastProperties]
 - prototype = 0x24f522e8ac51
 - elements = 0x24f52a506ca9 <FixedUint32Array[8]> [UINT32_ELEMENTS]
 - embedder fields: 2
 - buffer = 0x24f52a506c11 <ArrayBuffer map = 0x24f5fbb02c01>
 - byte_offset = 0
 - byte_length = 32
 - length = 8
 - properties = 0x24f5eb102251 <FixedArray[0]> {}
 - elements = 0x24f52a506ca9 <FixedUint32Array[8]> {
         0-7: 0
 }
 - embedder fields = {
    0x0
    0x0
 }

Next in AsyncHooks contructor we have the following line:

fields_[kCheck] = 1;

AliasedBuffer has overloaded the [] operator so this call will invoke AliasedBuffer::operator[]:

Reference operator[](size_t index) {
  return Reference(this, index);
}

And Reference in turn overloads the = operator so it will be invoked:

template <typename T>
inline Reference& operator=(const T& val) {
  aliased_buffer_->SetValue(index_, val);
  return *this;
}

Reference is used to store the index for the element being used (plus the AliasedBuffer pointer).

So we can see that we are setting index_ which is kCheck, to value which is 1:

(lldb) expr index_
(size_t) $57 = 6
(lldb) expr ::Fields::kCheck
(int) $60 = 6
(lldb) expr val
(const int) $61 = 1

So we can get and set values using:

(lldb) expr fields_.SetValue(::Fields::kCheck, 2)
(lldb) expr fields_.GetValue(::Fields::kCheck)
(unsigned int) $69 = 2
(lldb) expr fields_
(node::AliasedBuffer<unsigned int, v8::Uint32Array>) $72 = {
  isolate_ = 0x0000000105007400
  count_ = 8
  byte_offset_ = 0
  buffer_ = 0x0000000106000000
  js_array_ = {
    v8::PersistentBase<v8::Uint32Array> = (val_ = 0x0000000105057a20)
  }
  free_buffer_ = true
}
(lldb) memory read -f x -s 4 -c 7 0x0000000106000000
0x106000000: 0x00000000 0x00000000 0x00000000 0x00000000
0x106000010: 0x00000000 0x00000000 0x00000002
(lldb) expr fields_.SetValue(::Fields::kCheck, 1)
(lldb) memory read -f x -s 4 -c 7 0x0000000106000000
0x106000000: 0x00000000 0x00000000 0x00000000 0x00000000
0x106000010: 0x00000000 0x00000000 0x00000001

So we bacially have an 32 bit array which have 8 values in indexed using AsyncHooks::Fields enum. These are basically counters as far as I can tell. The entries are used to keep track of the number of init functions that should be called. Looking at the code we can see that

Getting back on track we were in AsyncWrap::Initialize :

  target->DefineOwnProperty(context,
                           FIXED_ONE_BYTE_STRING(isolate, "async_hook_fields"),
                           env->async_hooks()->fields().GetJSArray(),
                           ReadOnlyDontDelete).FromJust()

So looking again at this like we can see that we are retreiving the JSTypedArray and setting that on the target as async_hook_fields

(lldb) expr env->async_hooks()->fields().GetJSArray()->Buffer()->GetContents()
(v8::ArrayBuffer::Contents) $302 = {
  data_ = 0x0000000106000000
  byte_length_ = 32
  allocation_base_ = 0x0000000106000000
  allocation_length_ = 32
  allocation_mode_ = kNormal
}

Notice that the data_ field points to the same memory location as buffer_ above. Next constants property is populated.

AliasedBuffer

AliasedBuffer has overloaded the [] operator so this call will invoke AliasedBuffer::operator[]:

Reference operator[](size_t index) {
  return Reference(this, index);
}

And Reference in turn overloads the = operator so it will be invoked:

template <typename T>
inline Reference& operator=(const T& val) {
  aliased_buffer_->SetValue(index_, val);
  return *this;
}

Reference is used to store the index for the element being used (plus the AliasedBuffer pointer).

v8::Isolate* isolate_;
  size_t count_;
  size_t byte_offset_;
  NativeT* buffer_;
  v8::Global<V8T> js_array_;
  bool free_buffer_;

HandleWrap

HandleWrap represents a libuv handle which represents . Take the following functions:

static void Close(const v8::FunctionCallbackInfo<v8::Value>& args);
static void Ref(const v8::FunctionCallbackInfo<v8::Value>& args);
static void Unref(const v8::FunctionCallbackInfo<v8::Value>& args);
static void HasRef(const v8::FunctionCallbackInfo<v8::Value>& args);

There are libuv counter parts for these in uv_handle_t:

void uv_close(uv_handle_t* handle, uv_close_cb close_cb)
void uv_ref(uv_handle_t* handle)
void uv_unref(uv_handle_t* handle)
int uv_has_ref(const uv_handle_t* handle)

Just like in libuv where uv_handle_t is a base type for all libuv handles, HandleWrap is a base class for all Node.js Wrap classes.

Every uv_handle_t can have a data member, and this is being set in the constructor to this instance of HandleWrap.

handle__->data = this;
HandleScope scope(env->isolate());
Wrap(object, this);

In HandleWrap's constructor the HandleWrap is added to the queue of HandleWraps in the Environment:

env->handle_wrap_queue()->PushBack(this);

libuv has the following types of handle types:

#define UV_HANDLE_TYPE_MAP(XX)                                               \
 XX(ASYNC, async)                                                            \
 XX(CHECK, check)                                                            \
 XX(FS_EVENT, fs_event)                                                      \
 XX(FS_POLL, fs_poll)                                                        \
 XX(HANDLE, handle)                                                          \
 XX(IDLE, idle)                                                              \
 XX(NAMED_PIPE, pipe)                                                        \
 XX(POLL, poll)                                                              \
 XX(PREPARE, prepare)                                                        \
 XX(PROCESS, process)                                                        \
 XX(STREAM, stream)                                                          \
 XX(TCP, tcp)                                                                \
 XX(TIMER, timer)                                                            \
 XX(TTY, tty)                                                                \
 XX(UDP, udp)                                                                \
 XX(SIGNAL, signal)                                                          \ 


struct uv_tcp_s {
  UV_HANDLE_FIELDS
  UV_STREAM_FIELDS
  UV_TCP_PRIVATE_FIELDS
};

We know that TCPWrap is a built-in module and that it's Initialize method is called, which sets up all the prototype functions available, among them listen:

env->SetProtoMethod(t, "listen", Listen);

And in Listen we find:

int backlog = args[0]->Int32Value();
int err = uv_listen(reinterpret_cast<uv_stream_t*>(&wrap->handle_),
                    backlog,
                    OnConnection);
args.GetReturnValue().Set(err);

We can find a similarity in Node where TCPWrap indirectly also extends StreamWrap (which extends HandleWrap).

Wrap

template <typename TypeName>
void Wrap(v8::Local<v8::Object> object, TypeName* pointer) {
 CHECK_EQ(false, object.IsEmpty());
 CHECK_GT(object->InternalFieldCount(), 0);
 object->SetAlignedPointerInInternalField(0, pointer);
}

Here we can see that we are setting a pointer in field 0. The object in question, and pointer the pointer to this HandleWrap.

persistent().Reset will destroy the underlying storage cell if it is non-empty, and create a new one the handle.

MakeWeak:

 inline void MakeWeak(void) {
   persistent().SetWeak(this, WeakCallback, v8::WeakCallbackType::kParameter);
   persistent().MarkIndependent();
}

The above is installing a finalization callback on the persistent object. Marking the persistent object as independant means that the GC is free to ignore object groups containing this persistent object. Why is this done? I don't know enough about the V8 GC yet to answer this.

The callback may be called (best effort) and it looks like this:

static void WeakCallback(const v8::WeakCallbackInfo<ObjectWrap>& data) {
  ObjectWrap* wrap = data.GetParameter();
  assert(wrap->refs_ == 0);
  wrap->handle_.Reset();
  delete wrap;
}

ContextifyScript will call MakeWeak in it's constructor:

ContextifyScript(Environment* env, Local<Object> object) : BaseObject(env, object) {
  MakeWeak();
}

So we are calling MakeWeak so that a callback (a finalizer) will be called when the GC has determined that there are not more refs to the object.

If we take a look at MakeWeak:

void BaseObject::MakeWeak() {
  persistent_handle_.SetWeak(
      this,
      [](const v8::WeakCallbackInfo<BaseObject>& data) {
        std::unique_ptr<BaseObject> obj(data.GetParameter());
        // Clear the persistent handle so that ~BaseObject() doesn't attempt
        // to mess with internal fields, since the JS object may have
        // transitioned into an invalid state.
        // Refs: https://github.com/nodejs/node/issues/18897
        obj->persistent_handle_.Reset();
      }, v8::WeakCallbackType::kParameter);
}

From looking at the code the callback is called after the

And the destructor looks like this now:

BaseObject::~BaseObject() {
  env_->RemoveCleanupHook(DeleteMe, static_cast<void*>(this));

  if (persistent_handle_.IsEmpty()) {
    // This most likely happened because the weak callback below cleared it.
    return;
  }

  {
    v8::HandleScope handle_scope(env_->isolate());
    object()->SetAlignedPointerInInternalField(0, nullptr);
  }
}

Notice the call to persitent_handle_.IsEmpty(), so if it is emtpy we will not do anything. So could this callback could be avoided completely? Making something a weak pointer will allow it to be GC'd but you might not require any callback to be invoked. In that case you can just call:

  persistent_handle_.SetWeak();

Remember, obj is of type BaseObject and is not managed by V8, but the persistent_handle_ is managed by V8's GC. When we get the callback that the persistent_handle_ is about to be freed. If we don't call Reset, then ~BaseObject will try to set the internal field on the now freed persistent_handle_. object() calls and SetAlignedPointerInInternalField will segfault as the handle has already been freed.

v8::Local<v8::Object> BaseObject::object() const {
  return PersistentToLocal(env_->isolate(), persistent_handle_);
}

My understanding of this is that when the callback/finalizer lambda is called the underlying Persistent object will have been freed, but the BaseObject instance still has a reference to it. It uses this reference in ~ObjectBaset() create a new Local, and then calls SetAlignedPointerInInternalField which will segfault when trying to OpenHandle (which has been feed).

So, what does Reset do?

V8::DisposeGlobal(reinterpret_castinternal::Object**(this->val_)); i::GlobalHandles::Destroy(location); global-handles.cc:94

(lldb) expr persistent_handle_ (node::Persistentv8::Object) $2 = { v8::PersistentBasev8::Object = (val_ = 0x0000000108005fa0) }

TcpWrap

TcpWrap extends ConnectionWrap Lets take a look at the creation of a TcpWrap:

wrap = new TCPWrap(env, args.This(), nullptr);

What is args.This(). That will be the (v8::Localv8::Object) object that will be wrapped.

This be passed to ConnectionWrap's constructor, which in turn will pass it to StreamWrap's constructor, which will pass it to HandleWrap's constructor, which will pass it to AsyncWrap's constructor, which will pass it to BaseObject's constructor which will set this/create a persistent object to store the handle:

: persistent_handle_(env->isolate(), handle)

I've not seen this before, initializing a member with two parameters, and I cannot find a function that matches this signature. What is going on there?
Well, the type of persistent_handle_ is :

v8::Persistent<v8::Object> persistent_handle_;

And the constructor for Persistent looks like this:

 template <class S>
 V8_INLINE Persistent(Isolate* isolate, Local<S> that)
    : PersistentBase<T>(PersistentBase<T>::New(isolate, *that)) {
  TYPE_CHECK(T, S);
}

IsolateData

Has a public constructor that takes a pointer to Isolate, a pointer to uv_loop_t, and a pointer to uint32 zero_fill_field. An IsolateData instance also has a number of public methods:

#define VP(PropertyName, StringValue) V(v8::Private, PropertyName, StringValue)
#define VS(PropertyName, StringValue) V(v8::String, PropertyName, StringValue)
#define V(TypeName, PropertyName, StringValue)                                \
  inline v8::Local<TypeName> PropertyName(v8::Isolate* isolate) const;
  PER_ISOLATE_PRIVATE_SYMBOL_PROPERTIES(VP)
  PER_ISOLATE_STRING_PROPERTIES(VS)
#undef V
#undef VS
#undef VP

What is happening here is that we are declaring methods for each for the PER_ISOLATE_PRIVATE_SYMBOL_PROPERTIES. Since VP is being passed and the type for those methods is v8::Private there will be the following methods:

v8::Local<Private> alpn_buffer_private_symbol(v8::Isolate* isolate) const;
v8::Local<Private> arrow_message_private_symbol(v8::Isolate* isolate) const;
...

But what is the StringValue used for?
The StringValue is actually not used here, see #7905 for details.

The StringValue is used in the definition though which can be found in src/env-inl.h:

inline IsolateData::IsolateData(v8::Isolate* isolate, uv_loop_t* event_loop,
                              uint32_t* zero_fill_field)
  :
#define V(PropertyName, StringValue)                                          \
  PropertyName ## _(                                                        \
      isolate,                                                              \
      v8::Private::New(                                                     \
          isolate,                                                          \
          v8::String::NewFromOneByte(                                       \
              isolate,                                                      \
              reinterpret_cast<const uint8_t*>(StringValue),                \
              v8::NewStringType::kInternalized,                             \
              sizeof(StringValue) - 1).ToLocalChecked())),
PER_ISOLATE_PRIVATE_SYMBOL_PROPERTIES(V)
#undef V
#define V(PropertyName, StringValue)                                          \
  PropertyName ## _(                                                        \
      isolate,                                                              \
      v8::String::NewFromOneByte(                                           \
          isolate,                                                          \
          reinterpret_cast<const uint8_t*>(StringValue),                    \
          v8::NewStringType::kInternalized,                                 \
          sizeof(StringValue) - 1).ToLocalChecked()),
  PER_ISOLATE_STRING_PROPERTIES(V)
#undef V

This is the definition of the IsolateData constructor, and it is setting each of the private member fields to the StringValue. I created an example to try this out. While it might not be easy on the eyes this does have a major advantage of not having to maintain all of these accessor methods. Adding a new one is simply a matter of adding an entry to the macro.

So now that we understand the macro, lets take a look at the actual information that this class stores/provides.
All the property accessors defined above are available using he IsolateData instance but also they can be called using an Environment instance which just passes the calls through to the IsolateData instance. The per isolate private members are the following:

V(alpn_buffer_private_symbol, "node:alpnBuffer")
V(npn_buffer_private_symbol, "node:npnBuffer")
V(selected_npn_buffer_private_symbol, "node:selectedNpnBuffer")

The above are used by node_crypto.cc which makes sense as Application Level Protocol Negotiation (ALPN) is an TLS protocol, as it Next Prototol Negotiation (NPN).

V(arrow_message_private_symbol, "node:arrowMessage")

Not sure exactly what this does but from a quick search it looks like it has to do with exception handling and printing of error messages. TODO: revisit this later.

An IsolateData (and also an Environement as it proxies these members) actually has a lot of members, too many to list here it is easy to do a search for them.

Running lint

$ make lint
$ make jslint

Run lint os one file:

$ ./tools/node_modules/eslint/bin/eslint.js --rulesdir=tools/eslint-rules --ext=.js,.mjs,.md test/sequential/test-benchmark-tls.js

Running tests

To run the test use the following command:

$ make -j4 test

The -j is the number of processes to use.

Mac firewall exceptions

On mac you might find it popping up dialogs about the firwall blocking access to the node and cctest applications when running the tests. There is a script in node/tools that can run to add rules to the firewall:

$ sudo tools/macosx-firewall.sh

Running a script

This section attempts to explain the process of running a javascript file. We will create a break point in the javascript source and see how it is executed.

$ ./node -inspect-brk

Next, start lldb and

$ lldb -- out/Debug/node --inspect-brk test/parallel/test-tcp-wrap-connect.js

Now, when a script is executed it will be read and loaded. Where is this done? To recap the loading is done by LoadEnvironment which loads and executes lib/internal/bootstrap/node.js. This is a function which is then executed:

    Local<Value> arg = env->process_object();
    f->Call(Null(env->isolate()), 1, &arg);

As we can see the process_object which was configured earlier is passed into the function:

    (function(process) {
      function startup() {
        ...
      }
      //other functions
      
      startup();
    });

We can see that the startup function will be called when the the f is called. Since we are specifying a script to run we will be looking at setting up the various object in the environment, mosty using the passed in process object (TODO: need to write out the details for this later) and eventually running:

     preloadModules();
     run(Module.runMain);

Module.runMain is a function in lib/module.js:

    // bootstrap main module.
    Module.runMain = function() {
      // Load the main module--the command line argument.
      Module._load(process.argv[1], null, true);
      // Handle any nextTicks added in the first tick of the program
      process._tickCallback();
    };

_load

Will check the module cache for the filename and if it already exists just returns the exports object for this module. But otherwise the filename will be loaded using the file extension. Possible extensions are .js, .json, and .node (defaulting to .js if no extension is given).

    Module._extensions[extension](this, filename);

We know our extension is .js so lets look closer at it:

     // Native extension for .js
     Module._extensions['.js'] = function(module, filename) {
       var content = fs.readFileSync(filename, 'utf8');
       module._compile(internalModule.stripBOM(content), filename);
     };

So lets take a look at _compile_

module._compile

After removing the shebang from the content which is passed in as the first parameter the content is wrapped:

    var wrapper = Module.wrap(content);

    var compiledWrapper = vm.runInThisContext(wrapper, {
      filename: filename,
      lineOffset: 0,
      displayErrors: true
    });

vm.runInThisContext :

    var dirname = path.dirname(filename);
    var require = internalModule.makeRequireFunction.call(this);
    var args = [this.exports, require, this, filename, dirname];
    var depth = internalModule.requireDepth;
    if (depth === 0) stat.cache = new Map();
    var result = compiledWrapper.apply(this.exports, args);

Module.wrap

This is declared as:

    const NativeModule = require('native_module');
    ....
    Module.wrap = NativeModule.wrap;

NativeModule can be found lib/internal/bootstrap/node.js:

     NativeModule.wrap = function(script) {
       return NativeModule.wrapper[0] + script + NativeModule.wrapper[1];
     };

     NativeModule.wrapper = [
       '(function (exports, require, module, __filename, __dirname) { ',
       '\n});'
     ];

We can see here that the content of our JavaScript file will be included/wrapped in

    (function (exports, require, module, __filename, __dirname) { 
	// script content
    });'

So this is also how exports, require, module, __filename, and __dirname are made available to all scripts.

So, to recap we have a wrapper instance that is a function. The next thing that happens in lib/modules.js is:

    var compiledWrapper = vm.runInThisContext(wrapper, {
      filename: filename,
      lineOffset: 0,
      displayErrors: true
    });

So what does vm.runInThisContext do?
This is defined in lib/vm.js:

    exports.runInThisContext = function(code, options) {
      var script = new Script(code, options);
      return script.runInThisContext(options);
    };

As described in the vm the vm module provides APIs for compiling and running code within V8 Virtual Machine contexts. Creating a new Script will compile but not run the code. It can later be run multiple times.

So what is a Script? It is declared as:

    const binding = process.binding('contextify');
    const Script = binding.ContextifyScript;

What is Contextify about?
This is related to V8 contexts and all JavaScript code is run in a context.

src/node_contextify.cc is a builtin module and contains an Init function that does the following (among other things):

    env->SetProtoMethod(script_tmpl, "runInContext", RunInContext);
    env->SetProtoMethod(script_tmpl, "runInThisContext", RunInThisContext);

script.runInThisContext in vm.js overrides runInThisContext and then delegates to src/node_contextify.cc RunInThisContext.

     // Do the eval within this context
     Environment* env = Environment::GetCurrent(args);
     EvalMachine(env, timeout, display_errors, break_on_sigint, args, &try_catch);

After all this processing is done we will be back in node.cc and continue processing there. As everything is event driven the event loop start running and trigger callbacks for anything that has been set up by the script. Just think about a V8 example you create yourself, you set up the c++ code that is to be called from JavaScript and then V8 takes care of the rest. In node, the script is first wrapped in node specific JavaScript and then executed. Node code uses libuv there are callbacks setup that are called by libuv and more actions taken, like invoking a JavaScript callback function.

Remove need to specify a no-operation immediate_idle_handle

When calling setImmediate, this will schedule the callback passed in to be scheduled for execution after I/O events:

setImmediate(function() {
console.log("In immediate...");
});

Currently this is done by using a libuv uv_check_handle. Since checks are performed after polling for I/O, if there are no idle handle or prepare handle (need to check this) then the I/O polling would block as there would be nothing for the event loop to process until there is an I/O event. But if we have an idle handler there is something for the event loop to do which will cause the poll timeout to be zero and the event loop will not block on I/O.

In src/node.cc there is currently an empty uv_idle_handle callback (IdleImmediateDummy) for this which could be removed if it was possible to pass in a NULL callback. Currently there is a check in libuv checcing if the callback is null and this might not be able to change.

My first idea was to overload the function but C does not suppport overloading, so perhaps having a new function named something like:

 uv_idle_start_nop(&handle)

Another option might be to make uv_idle_start an varargs function and if the only one argument is passed (not null but actually missing) then assume that a nop-callback. But looking into a variadic function there is no way to know when there if an argument was provided or not (of the optional arguments that is). Currently I'm only adding a function for uv_idle_start_nop to uv-common.c to try this out and see if I can get some feedback on a better place for this.

This task did not come to anything yet. Perhaps with libuv 2.0 libuv might accepts a null callback.

tcp_wrap and pipe_wrap

Lets take a look at the following statement:

    var TCPConnectWrap = process.binding('tcp_wrap').TCPConnectWrap;
    var req = new TCPConnectWrap();

We know from before that binding is set as function on the process object. This was done in SetupProcessObject in node.cc:

    env->SetMethod(process, "binding", Binding);

So we are invoking the Binding function in node.cc with the argument 'tcp_wrap':

    static void Binding(const FunctionCallbackInfo<Value>& args) {

Binding will extract the first (and only) argument which is the name of the module. Every environment seems to have a cache, and if the module is in this cache it is returned:

    Local<Object> cache = env->binding_cache_object();

It will also create a instance of Local exports which is the object that will be returned.

    Local<Object> exports;

So, when the tcp_wrap.cc was Initialized (see section about Builtins):

    // Create FunctionTemplate for TCPConnectWrap.
    auto constructor = [](const FunctionCallbackInfo<Value>& args) {
      CHECK(args.IsConstructCall());
    };
    auto cwt = FunctionTemplate::New(env->isolate(), constructor);
    cwt->InstanceTemplate()->SetInternalFieldCount(1);
    cwt->SetClassName(FIXED_ONE_BYTE_STRING(env->isolate(), "TCPConnectWrap"));
    target->Set(FIXED_ONE_BYTE_STRING(env->isolate(), "TCPConnectWrap"), cwt->GetFunction());

What is going on here. We create a new FunctionTemplate using the constructor lamba, this is then added to the target (the object that we are initializing). The constructor is only checking that the passed in args can be used as a constructor (using new in JavaScript)
The object returned from the constructor call does not have any methods as far as I can tell. The constructor would late be used like this:

    var client = new TCP();
    var req = new TCPConnectWrap();
    var err = client.connect(req, '127.0.0.1', this.address().port);

Now, we saw that TCP has a bunch of methods set up in Initialize, one of the being connect:

    void TCPWrap::Connect(const FunctionCallbackInfo<Value>& args) {
      ...
      Local<Object> req_wrap_obj = args[0].As<Object>();

This is the instance of TCPConnectWrap req created above and we can see that it is of type v8::Local<v8::Local>.

    ConnectWrap* req_wrap = new ConnectWrap(env, req_wrap_obj, AsyncWrap::PROVIDER_TCPCONNECTWRAP);

Remember that ConnectWrap extends ReqWrap which extends AsyncWrap.

We know that ConnectWrap takes Local<Object> as the req_wrap_obj

    err = uv_tcp_connect(req_wrap->req(), &wrap->handle_, reinterpret_cast<const sockaddr*>(&addr), AfterConnect);

uv_tcp_connect takes a pointer uv_connect_t and a pointer to uv_tcp_t handle. This will connect to the specified sockaddr_in and the callback will be called when the connection has been established or if an error occurs. So it makes sense that ConnectWrap extends ReqWrap as uv_connect_t is a request type in libuv:

    /* Request types. */
    typedef struct uv_req_s uv_req_t;
    typedef struct uv_getaddrinfo_s uv_getaddrinfo_t;
    typedef struct uv_getnameinfo_s uv_getnameinfo_t;
    typedef struct uv_shutdown_s uv_shutdown_t;
    typedef struct uv_write_s uv_write_t;
    typedef struct uv_connect_s uv_connect_t;         <--------------------
    typedef struct uv_udp_send_s uv_udp_send_t;
    typedef struct uv_fs_s uv_fs_t;
    typedef struct uv_work_s uv_work_t;

AsyncWrap extends BaseObject which

    (lldb) p *this
    (node::BaseObject) $28 = {
      persistent_handle_ = {
        v8::PersistentBase<v8::Object> = (val_ = 0x0000000105010c60)
      }
      env_ = 0x00007fff5fbfe108
    }

So each BaseObject instance has a v8::Persistentv8:Object. This is a persistent object as it needs to be preserved accross C++ function boundries. Also, we can see that each BaseObject instance also has a node::Environment associated with it. The only thing that BaseObject's constructor does (baseobject-inl-h) is :

    // The zero field holds a pointer to the handle. Immediately set it to
    // nullptr in case it's accessed by the user before construction is complete.
    if (handle->InternalFieldCount() > 0)
      handle->SetAlignedPointerInInternalField(0, nullptr);

So after we have returned to AsyncWraps constructor, and then ReqWrap's we are back in ConnectWrap's constructor:

    Wrap(req_wrap_obj, this);

Wrap in util-inl.h:

    template <typename TypeName>
    void Wrap(v8::Local<v8::Object> object, TypeName* pointer) {
      CHECK_EQ(false, object.IsEmpty());
      CHECK_GT(object->InternalFieldCount(), 0);
      object->SetAlignedPointerInInternalField(0, pointer);
    }

We are now setting index 0 to the pointer which is the current Object is the v8::Local<v8::Object>, the one we created in our JavaScript file and passed to the connect method named req:

    var req = new TCPConnectWrap();
    var err = client.connect(req, '127.0.0.1', this.address().port);

So we are setting/storing a pointer to the ConnectWrap instance at index 0 of the req_wrap_obj.

After all that we are ready to make the uv_tcp_connect call:

    err = uv_tcp_connect(req_wrap->req(), &wrap->handle_, reinterpret_cast<const sockaddr*>(&addr), AfterConnect);

We can see the callback is node::ConnectionWrap<node::TCPWrap, uv_tcp_s>::AfterConnect(uv_connect_s*, int)

Notice that target is of type Local.

    Local<Object> exports;
    ....
    exports = Object::New(env->isolate());
    ...
    mod->nm_context_register_func(exports, unused, env->context(), mod->nm_priv);

exports is what is returned to the caller.

    args.GetReturnValue().Set(exports);

And we access the TCPConnectWrap member, which is a function which can be used as a constructor by using new. Lets start with where is ConnectWrap called? It is called from tcp_wrap.cc and its Connect method. ConnectWrap extends ReqWrap which extends AsyncWrap which extens BaseObject

    req.oncomplete = function(status, client_, req_) {

So, we know from earlier that our req object is basically empty. Here we are setting a property name oncomplete to be a function. This will be called in connection_wrap.cc 111:

    req_wrap->MakeCallback(env->oncomplete_string(), arraysize(argv), argv);

oncomplete_string() is a generated method from a macro in env.h

    v8::Local<v8::Value> cb_v = object()->Get(symbol);
    CHECK(cb_v->IsFunction());
    return MakeCallback(cb_v.As<v8::Function>(), argc, argv);

object() will return the persistent object to out handle (from base-object-inl.h) :

    return PersistentToLocal(env_->isolate(), persistent_handle_);

We can see that the persistent_handle_ is the handle that was created using which makes sense as this is the object that oncomplete was created for:

    var req = new TCPConnectWrap();

We are then calling Get(symbol) which will be a Symbol representing 'oncomplete'. And the calling it with number of arguments, and the arguments themselves.

tcp_wrap.cc

In OnConnect I found the following:

    TCPWrap* tcp_wrap = static_cast<TCPWrap*>(handle->data);
    ....
    Local<Object> client_obj = Instantiate(env, static_cast<AsyncWrap*>(tcp_wrap));
class TCPWrap : public StreamWrap
class StreamWrap : public HandleWrap, public StreamBase
class HandleWrap : public AsyncWrap {

As far as I can tell TCPWrap is of type AsyncWrap. Looking at src/pipe_wrap.cc which has a very similar OnConnect method (which I'm going to take a stab at refactoring) but does not have this cast.

Refactoring tcpwrap and pipewrap

This comment exist on pipewrap OnConnect:

// TODO(bnoordhuis) maybe share with TCPWrap?
    void PipeWrap::OnConnection(uv_stream_t* handle, int status) {
    PipeWrap* pipe_wrap = static_cast<PipeWrap*>(handle->data);
    CHECK_EQ(&pipe_wrap->handle_, reinterpret_cast<uv_pipe_t*>(handle));

The reinterpret_cast operator changes one data type into another. Recall how the types of libuv have a type of c inheritance allowing casting.

    /*
     * uv_pipe_t is a subclass of uv_stream_t.
     *
     * Representing a pipe stream or pipe server. On Windows this is a Named
     * Pipe. On Unix this is a Unix domain socket.
     */
    struct uv_pipe_s {
      UV_HANDLE_FIELDS
      UV_STREAM_FIELDS
      int ipc; /* non-zero if this pipe is used for passing handles */
      UV_PIPE_PRIVATE_FIELDS
   };

The main difference that I've been able to find is in pipewrap status is checked:

    if (status != 0) {
      pipe_wrap->MakeCallback(env->onconnection_string(), arraysize(argv), argv);
      return;
   } 

src/stream_wrap.cc

Looking into a task where the public member field req_ in src/req_wrap.cc is to be made private, I came accross the following method:

    286 void StreamWrap::AfterShutdown(uv_shutdown_t* req, int status) {
    287   ShutdownWrap* req_wrap = ContainerOf(&ShutdownWrap::req_, req);
    288   HandleScope scope(req_wrap->env()->isolate());
    289   Context::Scope context_scope(req_wrap->env()->context());
    290   req_wrap->Done(status);
    291 }

What I did for the public req_ member is made it private and then added a public accessor method for it. This was easy to update in most places but in src/stream_wrap.cc we have the following line:

   ShutdownWrap* req_wrap = ContainerOf(&ShutdownWrap::req_, req);

Extracting AfterConnect into connection_wrap.cc

Just like OnConnect was extracted into connection_wrap and shared by both tcp_wrap and pipe_wrap the same should be done for AfterConnect.

The main difference I found was in PipeWrap::AfterConnect:

    bool readable, writable;

    if (status) {
      readable = writable = 0;
    } else {
      readable = uv_is_readable(req->handle) != 0;
      writable = uv_is_writable(req->handle) != 0;
    } 
    Local<Object> req_wrap_obj = req_wrap->object();
    Local<Value> argv[5] = {
      Integer::New(env->isolate(), status),
      wrap->object(),
      req_wrap_obj,
      Boolean::New(env->isolate(), readable),
      Boolean::New(env->isolate(), writable)
    };

AfterConnect is a callback that is passed to uv_pipe_connect. The status will be 0 if uv_connect() was successful and < 0 otherwise.

The thing to notice is the difference compared to tcp_wrap:

    Local<Object> req_wrap_obj = req_wrap->object();
    Local<Value> argv[5] = {
      Integer::New(env->isolate(), status),
      wrap->object(),
      req_wrap_obj,
      v8::True(env->isolate()),
      v8::True(env->isolate())
    };

TCPWrap always sets the readable and writable values to true where as PipeWrap checks if the handle is readble/writeble. Seems like a the TCPWrap will always be both readable and writable.

Making ReqWrap req_ member private

Currently the member req_ is public in src/req

One issue when doing this was that after renaming req_ to req() I had to rename a macro in src/node_file.cc to avoid a collision with the macro parameter with the same name.

The second issue I ran into was with src/stream_wrap.cc:

    void StreamWrap::AfterShutdown(uv_shutdown_t* req, int status) {
      ShutdownWrap* req_wrap = ContainerOf(&ShutdownWrap::req_, req);

We can find ContainerOf in src/util-inl.h :

    template <typename Inner, typename Outer>
    inline ContainerOfHelper<Inner, Outer> ContainerOf(Inner Outer::*field, Inner* pointer) {
      return ContainerOfHelper<Inner, Outer>(field, pointer);
    }

The call in question is auto-deducing the paremeter types from the arguments, it could also have been explicit:

    ShutdownWrap* req_wrap = ContainerOf<uv_shutdown_t*, ShutdownWrap>(&ShutdownWrap::req_, req);

ContainerOfHelper

src/util.h declares a class named ContainerOfHelper:

    // The helper is for doing safe downcasts from base types to derived types.
    template <typename Inner, typename Outer>
    class ContainerOfHelper {
     public:
       inline ContainerOfHelper(Inner Outer::*field, Inner* pointer);
       template <typename TypeName>
       inline operator TypeName*() const;
     private:
       Outer* const pointer_;
 };

So back to our call using ContainerOf which will invoke:

    template <typename Inner, typename Outer>
    ContainerOfHelper<Inner, Outer>::ContainerOfHelper(Inner Outer::*field, Inner* pointer)
        : pointer_(reinterpret_cast<Outer*>(reinterpret_cast<uintptr_t>(pointer) - reinterpret_cast<uintptr_t>(&(static_cast<Outer*>(0)->*field)))) {
    }

First, note that the parameter field is a pointer-to-member, which gives the offset of the member within the class object as opposed to using the address-of operator on a data member bound to an actual class object which yields the member's actual address in memory. uintptr_t is an unsigned int that is capable of storing a pointer. Such a type can be used when you need to perform integer operations on a pointer. reinterpret_cast is a compiler directive which instructs the compiler to treat the sequence of bits as if it had the new type:

    reinterpret_cast<uintptr_t>(pointer) 

reinterpret_cast is used to convert any pointer type to any other pointer type and the result is a binary copy of the value.

        reinterpret_cast<uintptr_t>(&(static_cast<ShutdownWrap*>(0)->*field))

I've not seen this usage before using 0 as the argument to static_cast:

    static_cast<ShutdownWrap*>(0)->*field)

The static_cast part of this expression will give a nullptr, but we are not accessing a member, but a pointer-to-member which remember is the offset. A pointer is only a memory address but the type of the object determines how a pointer can be used, like using a member it needs to know the offsets of those members. So we creating a pointer to Outer which by using the offset of the field and substracting that from pointer. So when using a pointer and dereferencing field this will point to same value of pointer

Why does the protected field req_ have to be last:

    Command: out/Release/node /Users/danielbevenius/work/nodejs/node/test/parallel/test-child-process-stdio-big-write-end.js
    --- CRASHED (Signal: 10) ---
    === release test-cluster-disconnect ===
    Path: parallel/test-cluster-disconnect
    /Users/danielbevenius/work/nodejs/node/out/Release/node[84341]: ../src/connection_wrap.cc:83:static void node::ConnectionWrap<node::TCPWrap, uv_tcp_s>::AfterConnect(uv_connect_t *, int) [WrapType = node::TCPWrap, UVType = uv_tcp_s]: Assertion `(req_wrap->env()) == (wrap->env())' failed.
     1: node::Abort() [/Users/danielbevenius/work/nodejs/node/out/Release/node]
     2: node::RunMicrotasks(v8::FunctionCallbackInfo<v8::Value> const&) [/Users/danielbevenius/work/nodejs/node/out/Release/node]
     3: node::ConnectionWrap<node::TCPWrap, uv_tcp_s>::AfterConnect(uv_connect_s*, int) [/Users/danielbevenius/work/nodejs/node/out/Release/node]
     4: uv__stream_io [/Users/danielbevenius/work/nodejs/node/out/Release/node]
     5: uv__io_poll [/Users/danielbevenius/work/nodejs/node/out/Release/node]
     6: uv_run [/Users/danielbevenius/work/nodejs/node/out/Release/node]
     7: node::Start(int, char**) [/Users/danielbevenius/work/nodejs/node/out/Release/node]
     8: start [/Users/danielbevenius/work/nodejs/node/out/Release/node]
     9: 0x2

"req_wrap_queue_ needs to be at a fixed offset from the start of the struct because it is used by ContainerOf to calculate the address of the embedding ReqWrap. ContainerOf compiles down to simple, fixed pointer arithmetic. sizeof(req_) depends on the type of T, so req_wrap_queue_ would no longer be at a fixed offset if it came after req_."

This is what ReqWrap currently looks like:

     private:
      friend class Environment;
      ListNode<ReqWrap> req_wrap_queue_;

Notice that this is not a pointer and when a ReqWrap instance is created the ListNode::ListNode() constructor will be called:

    template <typename T>
    ListNode<T>::ListNode() : prev_(this), next_(this) {}

So every instance will have it's own doubly link linked list and each entry contains a ReqWrap instance which has a type T member. Depending on the type of T the size of the ReqWrap object in memory will be different. So it would not be possible to have req_wrap_queue after req_, or req_ before req_wrap_queue as this would make the offset different during runtime (compile time would still work fine).

Every Environment instance has the following queues:

    HandleWrapQueue handle_wrap_queue_;
    ReqWrapQueue req_wrap_queue_; 

And a typedef for this is created using a pointer-to-member:

    typedef ListHead<ReqWrap<uv_req_t>, &ReqWrap<uv_req_t>::req_wrap_queue_> ReqWrapQueue;

Each time a instance of ReqWrap is created that instance will be added to the queue:

    env->req_wrap_queue()->PushBack(reinterpret_cast<ReqWrap<uv_req_t>*>(this));

Share AfterWrite with with udp_wrap and stream_wrap

So, the task is basically to follow this comment in udb_wrap.cc:

    // TODO(bnoordhuis) share with StreamWrap::AfterWrite() in stream_wrap.cc
    void UDPWrap::OnSend(uv_udp_send_t* req, int status) {

At first glance this don't look that similar that they could be shared:

     void UDPWrap::OnSend(uv_udp_send_t* req, int status) {
       SendWrap* req_wrap = static_cast<SendWrap*>(req->data);
       if (req_wrap->have_callback()) {
         Environment* env = req_wrap->env();
         HandleScope handle_scope(env->isolate());
         Context::Scope context_scope(env->context());
         Local<Value> arg[] = {
           Integer::New(env->isolate(), status),
           Integer::New(env->isolate(), req_wrap->msg_size),
         };
         req_wrap->MakeCallback(env->oncomplete_string(), 2, arg);
      }
      delete req_wrap;
    }

have_callback() is a method on the SendWrap class and does not exist for WriteWrap.

   void StreamWrap::AfterWrite(uv_write_t* req, int status) {
    WriteWrap* req_wrap = WriteWrap::from_req(req);
    CHECK_NE(req_wrap, nullptr);
    HandleScope scope(req_wrap->env()->isolate());
    Context::Scope context_scope(req_wrap->env()->context());
    req_wrap->Done(status);
  }

First thing to notice is the checking for a callback, StreamWrap::AfterWrite seems to assume that there will always be a callback by looking at req_wrap->Done:

      inline void Done(int status, const char* error_str = nullptr) {
         Req* req = static_cast<Req*>(this);
         Environment* env = req->env();
         if (error_str != nullptr) {
           req->object()->Set(env->error_string(), OneByteString(env->isolate(), error_str));
         }
        cb_(req, status);
      }

When DoShutdown is called the last thing that is done is:

    req_wrap->Dispatched();

which will set req_.data = this; this being the Shutdown wrap instance. Later when the AfterShutdown method is called that instance will be available by using the req->data.

Stream class hierarchy

    class TTYWrap : public StreamWrap

    class PipeWrap : public ConnectionWrap<PipeWrap, uv_pipe_t>
    class TCPWrap : public ConnectionWrap<TCPWrap, uv_tcp_t>
    
    class ConnectionWrap : public StreamWrap
    class StreamWrap : public HandleWrap, public StreamBase
    class HandleWrap : public AsyncWrap
    class AsyncWrap : public BaseObject
    class BaseObject

    class StreamBase : public StreamResource
    class StreamResource

Wrapped

    var TCP = process.binding('tcp_wrap').TCP;
    var TCPConnectWrap = process.binding('tcp_wrap').TCPConnectWrap;
    var ShutdownWrap = process.binding('stream_wrap').ShutdownWrap;

    var client = new TCP();
    var shutdownReq = new ShutdownWrap();

This above will invoke the constructor set up by TCPWrap::Initialize:

    auto constructor = [](const FunctionCallbackInfo<Value>& args) {
      CHECK(args.IsConstructCall());
    };
    auto cwt = FunctionTemplate::New(env->isolate(), constructor);
    cwt->InstanceTemplate()->SetInternalFieldCount(1);
    SetClassName(FIXED_ONE_BYTE_STRING(env->isolate(), "TCPConnectWrap"));
    Set(FIXED_ONE_BYTE_STRING(env->isolate(), "TCPConnectWrap"), GetFunction());

The only thing the constructor does is check that new is used with the function (as in new ShutdownWrap).

    var err = client.shutdown(shutdownReq);

The methods available to a TCP instance are also configured in TCPWrap::Initialize. The shutdown method is set up using the following call:

    StreamWrap::AddMethods(env, t, StreamBase::kFlagHasWritev);

src/stream_base-inl.h contains the shutdown method:

    env->SetProtoMethod(t, "shutdown", JSMethod<Base, &StreamBase::Shutdown>); 

So we are using a referece to StreamBase::Shutdown which can be found in src/stream_base.cc:

   int StreamBase::Shutdown(const FunctionCallbackInfo<Value>& args) {
     Environment* env = Environment::GetCurrent(args);
 
     CHECK(args[0]->IsObject());
     Local<Object> req_wrap_obj = args[0].As<Object>();

     ShutdownWrap* req_wrap = new ShutdownWrap(env,
                                               req_wrap_obj,
                                               this,
                                               AfterShutdown);

The Shutdown constructor delegates to ReqWrap:

    ReqWrap(env, req_wrap_obj, AsyncWrap::PROVIDER_SHUTDOWNWRAP),

Which delegates to AsyncWrap:

    AsyncWrap(env, req_wrap_obj, AsyncWrap::PROVIDER_SHUTDOWNWRAP),

Which delegates to BaseObject:

    BaseObject(env, req_wrap_obj)

req_wrap_obj is refered to handle in BaseObject and is made into a persistent V8 handle

AfterShutdown is of type typedef void (DoneCb)(Req req, int status). This callback is passed to the constructor of StreamReq:

    StreamReq<ShutdownWrap>(cb)

This will simply store the callback in a private field.

The StreamBase instance (this in the call above) will be set as a private member of Shutdown wrap. There is a single function call in the constructor which is:

Wrap(req_wrap_obj, this);

void Wrap(v8::Local<v8::Object> object, TypeName* pointer) {
  CHECK_EQ(false, object.IsEmpty());
  CHECK_GT(object->InternalFieldCount(), 0);
  object->SetAlignedPointerInInternalField(0, pointer);
}

So we are setting the ShutdownWrap instance pointer on the V8 local object. So wrap means that we are wrapping the ShutdownWrap instance in the req_warp_obj.

Compiling the test in this project

First step is that Google Test needs to be added. Follow the steps in "Adding Google test to the project" before proceeding.

Building and running the tests

make check 

Clean

make clean

Adding Google test to the project

Build the gtest lib:

$ mkdir lib
$ mkdir deps ; cd deps
$ git clone [email protected]:google/googletest.git
$ cd googletest/googletest
$ mkdir build ; cd build
$ c++ -std=gnu++0x -stdlib=libstdc++ -I`pwd`/../include -I`pwd`/../ -pthread -c `pwd`/../src/gtest-all.cc
$ ar -rv libgtest.a gtest-all.o
$ cp libgtest.a ../../../../lib

We will be linking against Node.js which is build (on mac) using c++ and using the GNU Standard library. Before OS X 10.9.x the default was libstdc++, but after OS X 10.9.x the default is libc++. I'm ususing 10.11.5 so the default would be libc++ in my case. I ran into an issue when compiling and not explicitely specifying -stdlib=libstdc++ as this would mix two different standard library implementations. Instead of our program crashing at runtime we get a link time error. libc++ uses a C++11 language feature called inline namespace to change the ABI of std::string without impacting the API of std::string. That is, to you std::string looks the same. But to the linker, std::string is being mangled as if it is in namespace std::__1. Thus the linker knows that std::basic_string and std::__1::basic_string are two different data structures (the former coming from gcc's libstdc++ and the latter coming from libc++).

Writing a test file

$ mkdir test
$ vi main.cc
#include "gtest/gtest.h"
#include "base-object_test.cc"

int main(int argc, char* argv[]) {
  ::testing::InitGoogleTest(&argc, argv);
  return RUN_ALL_TESTS();
}

$ vi base-object_test.cc
#include "gtest/gtest.h"

TEST(BaseObject, base) {
}

then compile using:

$ clang++ -I`pwd`/../deps/googletest/googletest/include -pthread main.cc ../lib/libgtest.a -o base-object_test

Run the test:

./base-object_test

use of undeclared identifier 'node'

After making sure that I can include the 'base-object.h' header I get the following error when compiling:

In file included from test/main.cc:2:
test/base-object_test.cc:9:3: error: use of undeclared identifier 'node'
node::BaseObject bo;
 ^
1 error generated.
make: *** [test/base-object_test] Error 1

After taking a closer look at src/base-object.j I noticed this line:

#if defined(NODE_WANT_INTERNALS) && NODE_WANT_INTERNALS

I've not set this in the test, so there is not much being included by the preprocessor. Adding #define NODE_WANT_INTERNALS 1 should fix this.

Default implicit destructor

When working on a task involving extracting commmon code to a superclass I caused an issue with the CI builds.

What I had originally done was added an empty destructor:

~ConnectionWrap() {
}

I later changed this to be an explicitly defaulted destructor generated by the compiler

~ConnectionWrap() = default;

While I did not see any failures on my local machine during development the CI server did. I currently don't have more information than this but will try to gather some.

My understanding/assumption was that these two would be equivalent. So what is doing on? Let's start by taking a look a the inheritance tree and the various destructors:

class ConnectionWrap : public StreamWrap
  protected:
    ~ConnectionWrap() {}

class StreamWrap : public HandleWrap, public StreamBase
  protected:
    ~StreamWrap() { }

class HandleWrap : public AsyncWrap
  protected:
    ~HandleWrap() override;

class AsyncWrap : public BaseObject
  public:
    inline virtual ~AsyncWrap();

class BaseObject
  public:
    inline virtual ~BaseObject();

class StreamBase : public StreamResource
  public:
    virtual ~StreamBase() = default;

class StreamResource
  public:
    virtual ~StreamResource() = default;

From looking at the error:

In file included from ../src/pipe_wrap.h:7:0,
             from ../src/pipe_wrap.cc:1:
../src/connection_wrap.h:26:3: internal compiler error: in use_thunk, at cp/method.c:338
~ConnectionWrap() = default;
^
Please submit a full bug report,

with preprocessed source if appropriate. See file:///usr/share/doc/gcc-4.8/README.Bugs for instructions. Preprocessed source stored into /tmp/ccvbqQz3.out file, please attach this to your bugreport. ERROR: Cannot create report: [Errno 17] File exists: '/var/crash/_usr_lib_gcc_x86_64-linux-gnu_4.8_cc1plus.1000.crash' make[2]: *** [/home/iojs/build/workspace/node-test-commit-linux/nodes/ubuntu1204-64/out/Release/obj.target/node/src/pipe_wrap.o] Error 1

it looks like GCC (G++) 4.8 is being used. The reason for asking is I did a search and found a few indications that this might be a bug in the compiler. This is reported as sovled in 4.8.3 which is also why I'm curious about the compiler version. The centos machines use devtoolset-2, which comes with g++ 4.8.2.

If a class has no user-declared destructor, one is declared implicitly by the compiler and is called an implicitly-declared destructor. An implicitly-declared destructor is inline. Another aspect about destructors that is important to understand is that even if the body of a destructor is empty, it doesn’t mean that this destructor won’t execute any code. The C++ compiler augments the destructor with calls to destructors for bases and non-static data members

Chrome debugger

Open developer tools from Chrome CMD+OPT+I

Debugging

CMD+; step into
CMD+' step over
CMD+SHIFT+; step out
CMD+\ continue
CTRL+. next call frame
CTRL+, previous call frame
CMD+B toggle breakpoint
CTRL+SHIFT+E run highlighted snipped and show output in console.

Searching

CMD+F search current file
CMD+ALT+F search all sources
CMD+P go to source file. Opens a dialog where you can type in a file name
CTRL+G go to line

Editor

SHIFT+CMD+P go to member

ESC toggle drawer
CTRL+~ jump to console
CMD+[ next panel
CMD+] previous panel
CMD+ALT+[ next panel in history
CMD+ALT+] previous panel history
CMD+SHIFT+D toggle location of panels (separate screen/docked)
? show settings dialog
You can see all the shortcuts from here
ESC close settings/dialog

Node Package Manager (NPM)

I was curious about what type of program it is. Looking at the shell script on my machine it is a simple wrapper that calls node with the javascript file being the shell script itself. Kinda like doing:

#!/bin/sh
// 2>/dev/null; exec "`dirname "~/.nvm/versions/node/v4.4.3/bin/npm"`/node" "$0" "[email protected]"

console.log("bajja");

Make

GNU make has two phases. During the first phase it reads all the makefiles, and internalizes all variables. Make will expand any variables or functions in that section as the makefile is parsed. This is called immediate expansion since this happens during the first phase. The expansion is called deferred if it is not performed immediately.

Take a look at this rule:

config.gypi: configure
    if [ -f [email protected] ]; then
            $(error Stale [email protected], please re-run ./configure)
    else
            $(error No [email protected], please run ./configure first)
    fi

The recipe in this case is a shell if statement, which is a deferred construct. But the control function $(error) is an immediate construct which will cause the makefile processing to stop processing. If I understand this correctly the only possible outcome of this rule is the Stale config.gypi message which will be done in the first phase and then exit. The shell condition will not be considered.

For example, if we delete config.gypi we would expect the result to be an error saying that No config.gypi, please run ./configure first. But the result is:

Makefile:81: *** Stale config.gypi, please re-run ./configure.  Stop.

Keep in mind that config.gypi is not a .PHONY target, so it is a file on the file system and if it is missing the recipe will be run. So we could use a simple echo statement and and exit to work around this:

config.gypi: configure
    @if [ -f [email protected] ]; then \
      echo Stale [email protected], please re-run ./$<; \
    else \
      echo No [email protected], please run ./$< first; \
    fi
    @exit 1;

But that will produce the kind of ugly result:

$ make config.gypi
Stale config.gypi, please re-run ./configure
make: *** [config.gypi] Error 1

AtExit

// TODO(bnoordhuis) Turn into per-context event.
4278 void RunAtExit(Environment* env) {

What exactly is a AtExit function. An "AtExit" hook is a function that is invoked after the Node.js event loop has ended but before the JavaScript VM is terminated and Node.js shuts down.

So in node.cc you can find:

void AtExit(void (*cb)(void* arg), void* arg) {

This would be called like this:

static void callback(void* arg) {
}

AtExit(callback);

static AtExitCallback* at_exit_functions_;

I notices that AtExist is declared in node.h:

NODE_EXTERN void RunAtExit(Environment* env);

NODE_EXTERN is declared as:

So the idea is that at_exit_functions_ should be a per-environment property rather than a global. Like bnoordhuis pointed out, AtExit does not take a pointer to an Environment but we have to add the callbacks to the Environment associated with the addon. Is the environemnt available when the addons init function is called?

To answer that question, what is the type contained in the init function of an addon?

void init(Local<Object> target) {
  AtExit(at_exit_cb1, target->CreationContext()->GetIsolate());
}

NODE_MODULE(binding, init);

So a user will still have to call AtExit but instead of node.cc holding a static linked list of callbacks to call these should be added to the current environment.

void AtExit(void (*cb)(void* arg), void* arg) {

So AtExit takes a function pointer as its first argument, and a void pointer as its second. The function pointer is to a function that returns void and takes a void pointer as an argument.

mp->nm_register_func(exports, module, mp->nm_priv);

The above call can be found in DLOpen in src/node.cc`. The first thing that happens in DLOpen is:

    Environment* env = Environment::GetCurrent(args);

I've covered the setting of the Environment in AssignToContext previously. This is done by the Environment contructor and by node_contextify.cc.

The only Start function exposed in node.h is the one that takes argc and argv. Calling node::Start multiple times does not work and result in the following error:

# Fatal error in ../deps/v8/src/isolate.cc, line 2021
# Check failed: thread_data_table_.
#
==== C stack trace ===============================

    0   cctest                              0x0000000100324fce v8::base::debug::StackTrace::StackTrace() + 30
    1   cctest                              0x0000000100325005 v8::base::debug::StackTrace::StackTrace() + 21
    2   cctest                              0x000000010031dd94 V8_Fatal + 452
    3   cctest                              0x0000000100cc053c v8::internal::Isolate::Isolate(bool) + 2092
    4   cctest                              0x0000000100cc0ad5 v8::internal::Isolate::Isolate(bool) + 37
    5   cctest                              0x0000000100370a59 v8::Isolate::New(v8::Isolate::CreateParams const&) + 41
    6   cctest                              0x000000010003323f node::Start(uv_loop_s*, int, char const* const*, int, char const* const*) + 79
    7   cctest                              0x0000000100032e38 node::Start(int, char**) + 200
    8   cctest                              0x00000001000cdb33 EnvironmentTest_StartMultipleTimes_Test::TestBody() + 51
    9   cctest                              0x000000010014089a void testing::internal::HandleSehExceptionsInMethodIfSupported<testing::Test, void>(testing::Test*, void (testing::Test::*)(), char const*) + 122
    10  cctest                              0x00000001001190be void testing::internal::HandleExceptionsInMethodIfSupported<testing::Test, void>(testing::Test*, void (testing::Test::*)(), char const*) + 110
    11  cctest                              0x0000000100118fa5 testing::Test::Run() + 197
    12  cctest                              0x0000000100119f98 testing::TestInfo::Run() + 216
    13  cctest                              0x000000010011b227 testing::TestCase::Run() + 231
    14  cctest                              0x0000000100129ccc testing::internal::UnitTestImpl::RunAllTests() + 908
    15  cctest                              0x00000001001444aa bool testing::internal::HandleSehExceptionsInMethodIfSupported<testing::internal::UnitTestImpl, bool>(testing::internal::UnitTestImpl*, bool (testing::internal::UnitTestImpl::*)(), char const*) + 122
    16  cctest                              0x00000001001298be bool testing::internal::HandleExceptionsInMethodIfSupported<testing::internal::UnitTestImpl, bool>(testing::internal::UnitTestImpl*, bool (testing::internal::UnitTestImpl::*)(), char const*) + 110
    17  cctest                              0x00000001001297b5 testing::UnitTest::Run() + 373
    18  cctest                              0x0000000100147a81 RUN_ALL_TESTS() + 17
    19  cctest                              0x0000000100147a5b main + 43
    20  cctest                              0x00000001000010f4 start + 52
make: *** [cctest] Illegal instruction: 4

The Environment created when using the above start function is done in

    inline int Start(Isolate* isolate, IsolateData* isolate_data,
                     int argc, const char* const* argv,
                     int exec_argc, const char* const* exec_argv) {

Would it be safe to use GetCurrent using the isolate in

Thread-local

    static thread_local Environment* thread_local_env;

The object is allocated when the thread begins and deallocated when the thread ends. Each thread has its own instance of the object. Only objects declared thread_local have this storage duration. thread_local can appear together with static or extern to adjust linkage.

So we are specifying static only to specify that it should only have internal linkage, meaning that it can be referred to from all scopes in the current translation unit. It does not mean that it is static as in "static storage" meaning that it would be allocated when the program begins and deallocated when the program ends. But without the static linkage it would be external by default which is not what we want.

When used in a declaration of an object, it specifies static storage duration (except if accompanied by thread_local). When used in a declaration at namespace scope, it specifies internal linkage.

Using a `while(more == true)' :
    0x1011d81a9 <+1177>: jmp    0x1011d81ae               ; <+1182> at node.cc:4453
    0x1011d81ae <+1182>: movb   -0xd31(%rbp), %al         ; move byte value of -0xd31(%rpb) (move variable) into al register
    0x1011d81b4 <+1188>: andb   $0x1, %al                 ; AND 1 and the content of move variable
    0x1011d81b6 <+1190>: movzbl %al, %ecx                 ; conditional move into eax if zero
    0x1011d81b9 <+1193>: cmpl   $0x1, %ecx                ; compare 1 and the contents of eax
    0x1011d81bc <+1196>: je     0x1011d80e9               ; <+985> at node.cc:4437

    0x1011d81c2 <+1202>: leaq   -0xd30(%rbp), %rdi
    0x1011d81c9 <+1209>: callq  0x1002214e0               ; v8::SealHandleScope::~SealHandleScope at api.cc:926


Compared to using `while(more)`:

    0x1011d81a9 <+1177>: jmp    0x1011d81ae               ; <+1182> at node.cc:4453
    0x1011d81ae <+1182>: testb  $0x1, -0xd31(%rbp)        ; AND 1 and more
    0x1011d81b5 <+1189>: jne    0x1011d80e9               ; <+985> at node.cc:4437

    0x1011d81bb <+1195>: leaq   -0xd30(%rbp), %rdi
    0x1011d81c2 <+1202>: callq  0x1002214e0               ; v8::SealHandleScope::~SealHandleScope at api.cc:926

Calling conventions

Are the rules when making functions calls regarding how parameters are passed, who is responsible for cleaning up the stack, how the return value is to be retrieved, and also how the function calls are decorated.

cdecl

A calling convention that is used for standard C where the the stack must be cleaned up by the callee as there is support for varargs and there is now way for the called function to know the actual number of values pushed onto the stack before the function was called. Function name is decorated by prefixing it with an underscore character '_' .

stdcall

Here arguments are fixed and the called function can to the stack clean up. The advantage here is that the stack clean up code is only done once in one place. Function name is decorated by prepending an underscore character and appending a '@' character and the number of bytes of stack space required.

Issue

When running the Node.js build on windows (trying to get cctest to work for a test I added), I got the following link error:

    env.obj : error LNK2001: unresolved external symbol 
    "public: __cdecl node::Utf8Value::Utf8Value(class v8::Isolate *,class v8::Local<class v8::Value>)" ([email protected]@@[email protected]@[email protected]@V?$Local@[email protected]@@@[email protected]@Z) [c:\workspace\node-compile-windows\label\win-vs2015\cctest.vcxproj]

Now, we can see that the calling convention used is __cdecl but the name mangling does not look correct as it is using @@

    process_title.len = argv[argc - 1] + strlen(argv[argc - 1]) - argv[0];

This would be the same as :

    (lldb) p (size_t) argv[argc-1] + (size_t) strlen(argv[argc-1]) - (size_t)argv[0]
    (unsigned long) $9 = 56

When in my unit test the same gives me:

    (lldb) p (size_t) argv[argc-1] + (size_t) strlen(argv[argc-1]) - (size_t)argv[0]
    (unsigned long) $10 = 34693

What is happening is that we are taking the memory address of argv[argc-1] +

Debugging a Node addon

The task a hand was to debug Realm's addon to see why test were just hanging even though I made sure to call Tape test's end function. So, realm is a normal dependency and exist in node_modules.

Setup:

    $ npm install --save realm
    $ cd node_modules/realm
    $ env REALMJS_USE_DEBUG_CORE=true node-pre-gyp install --build-from-source --debug
    $ lldb -- node test/datastores/realm-store-test.js
    (lldb) breakpoint set --file node_init.cpp --line 26

It turns out that when breaking in the debugger (CTRL+C) and then stepping through it was in a kevent and this migth be some kind of listerner for events, and there is a realm.removeAllListeners() that can be called and this solved my issue.

Generate Your Project

For node the various targets in node.gyp will generate make files in the out directory. For example the target named cctest will generate out/cctest.target.mk file.

Profiling

You can use Google V8's built in profiler using the --prof command line option:

    $ out/Debug/node --prof test.js

This will generate a file in the current directory named something like isolate-0x104005e00-v8.log. Now we can process this file:

    $ export D8_PATH=~/work/google/javascript/v8/out/x64.debug
    $ deps/v8/tools/mac-tick-processor isolate-0x104005e00-v8.log

or you can use node's ---prof-process option:

    $ ./out/Debug/node --prof-process isolate-0x104005e00-v8.log
Statistical profiling result from isolate-0x104005e00-v8.log, (332 ticks, 86 unaccounted, 0 excluded).

The profiler is sample based so with wakes up and takes a sample. The intervals that is wakes up is called a tick. It will look at where the instruction pointer is RIP and reports the function if that function can be resolved. If cannot resolve the function this will be reported as an unaccounted tick.


    [Summary]:
     ticks  total  nonlib   name
        0    0.0%    0.0%  JavaScript
      236   71.1%   73.3%  C++
        4    1.2%    1.2%  GC
       10    3.0%          Shared libraries
       86   25.9%          Unaccounted

We can see that 71.1% of the time was spent in C++ code. Inspecting the C++ section you should be able to see were the most time is being spent and the sources.

    [C++]:
     ticks  total  nonlib   name
       66   19.9%   20.5%  node::ContextifyScript::New(v8::FunctionCallbackInfo<v8::Value> const&)
       20    6.0%    6.2%  node::Binding(v8::FunctionCallbackInfo<v8::Value> const&)
        5    1.5%    1.6%  v8::internal::HandleScope::ZapRange(v8::internal::Object**, v8::internal::Object**)

The [Bottom up] section shows us which the primary callers of the above are:


    [Bottom up (heavy) profile]:
    Note: percentage shows a share of a particular caller in the total amount of its parent calls.
    Callers occupying less than 2.0% are not shown.

     ticks parent  name
       86   25.9%  UNKNOWN

       66   19.9%  node::ContextifyScript::New(v8::FunctionCallbackInfo<v8::Value> const&)
       66  100.0%    v8::internal::Builtin_HandleApiCall(int, v8::internal::Object**, v8::internal::Isolate*)
       66  100.0%      LazyCompile: ~runInThisContext bootstrap_node.js:427:28
       66  100.0%        LazyCompile: ~NativeModule.compile bootstrap_node.js:509:44
       66  100.0%          LazyCompile: ~NativeModule.require bootstrap_node.js:443:34
       15   22.7%            LazyCompile: ~startup bootstrap_node.js:12:19
       11   16.7%            Function: ~<anonymous> module.js:1:11
        8   12.1%            Function: ~<anonymous> stream.js:1:11
        7   10.6%            LazyCompile: ~setupGlobalVariables bootstrap_node.js:192:32
        6    9.1%            Function: ~<anonymous> util.js:1:11
        6    9.1%            Function: ~<anonymous> tty.js:1:11
        3    4.5%            LazyCompile: ~setupGlobalTimeouts bootstrap_node.js:226:31
        2    3.0%            LazyCompile: ~createWritableStdioStream internal/process/stdio.js:134:35
        2    3.0%            Function: ~<anonymous> fs.js:1:11
        2    3.0%            Function: ~<anonymous> buffer.js:1:11

LazyCompile: Simply means that the function was complied lazily and not that this was the time spent compiling.

  • before function name means that time is being spent in optimized function. ~ before a function means that is was not optimized.

The % in the parent column shows the percentage of samples for which the function in the row above was called by the function in the current row. So,


       66   19.9%  node::ContextifyScript::New(v8::FunctionCallbackInfo<v8::Value> const&)
       66  100.0%    v8::internal::Builtin_HandleApiCall(int, v8::internal::Object**, v8::internal::Isolate*)

would be read as when v8::internal::Builting_HandleApiCall was sampled it called node:ContextifyScript every time. And


       66  100.0%          LazyCompile: ~NativeModule.require bootstrap_node.js:443:34
       15   22.7%            LazyCompile: ~startup bootstrap_node.js:12:19

that when startup in bootstrap_node.js was called, in 22% of the samples it called NativeModule.require.


    [Shared libraries]:
     ticks  total  nonlib   name
        6    1.8%          /usr/lib/system/libsystem_kernel.dylib
        2    0.6%          /usr/lib/system/libsystem_platform.dylib
        1    0.3%          /usr/lib/system/libsystem_malloc.dylib
        1    0.3%          /usr/lib/system/libsystem_c.dylib

setTimeout

Let's take the following example:

    setTimeout(function () {
      console.log('bajja');
    }, 5000);
$ ./out/Debug/node --inspect --inspect-brk settimeout.js

In Node you can call setTimeout with out having a require. This is done by lib/boostrap/node.js:

    function setupGlobalTimeouts() {
      const timers = NativeModule.require('timers');
      global.clearImmediate = timers.clearImmediate;
      global.clearInterval = timers.clearInterval;
      global.clearTimeout = timers.clearTimeout;
      global.setImmediate = timers.setImmediate;
      global.setInterval = timers.setInterval;
      global.setTimeout = timers.setTimeout;
   }

So we can see that we are able to call setTimout without having to require any module and that it is part of a native modules named timers. This is located in lib/timers.js.

The first thing that will happen is a new Timeout will be created in createSingleTimeout. A timeout looks like:

    function Timeout(after, callback, args) {
      this._called = false;
      this._idleTimeout = after;  // this will be 5000 in our use-case
      this._idlePrev = this;
      this._idleNext = this;
      this._idleStart = null;
      this._onTimeout = callback; // this is our callback that just logs to the console
      this._timerArgs = args;
      this._repeat = null;
    }

This timer instance is then passed to active(timer) which will insert the timer by calling insert:

     insert(item, false);

item is the timer, and false is the value of the unrefed argument)

    item._idleStart = TimerWrap.now();

So we can see that we are using timer_wrap which is located in src/timer_wrap.cc and the now function which is 
initialized to:
```c++
    env->SetTemplateMethod(constructor, "now", Now);

Back in the insert function we then have the following:

    const lists = unrefed === true ? unrefedLists : refedLists;

We know that unrefed is false so lists will be the refedLists which is an object keyed with the millisecond that a timeout is due to expire. The value of each key is a linkedlist of timers that expire at the same time.

    var list = lists[msecs];

If there are other timers that also expire after 5000ms then there might already be a list for them. But in this case there is not and a new list will be created:

    lists[msecs] = list = createTimersList(msecs, unrefed);

    const list = new TimersList(msecs, unrefed); // 5000 and false

    function TimersList(msecs, unrefed) {
      this._idleNext = null; // Create the list with the linkedlist properties to
      this._idlePrev = null; // prevent any unnecessary hidden class changes.
      this._timer = new TimerWrap();
      this._unrefed = unrefed; // will be false in our case
      this.msecs = msecs; // will be 5000 in our case
   }

The new TimerWrap call will invoke New in timer_wrap.cc as setup in the initialize function:

    Local<FunctionTemplate> constructor = env->NewFunctionTemplate(New);

New will invoke TimerWrap's constructor which does:

    int r = uv_timer_init(env->event_loop(), &handle_);

So we can see that it is setting up a libuv timer. Shortly after we have the following code (back in JavaScript land and lib/timers.js):

Next the list (TimerList) is initialized setting _idleNext and _idlePrev to list. After this we are adding a field to the list:

    list._timer._list = list;

    list._timer.start(msecs);

Start is initialized using :

    env->SetProtoMethod(constructor, "start", Start);

    static void Start(const FunctionCallbackInfo<Value>& args) {
      TimerWrap* wrap = Unwrap<TimerWrap>(args.Holder());

      CHECK(HandleWrap::IsAlive(wrap));

      int64_t timeout = args[0]->IntegerValue();
      int err = uv_timer_start(&wrap->handle_, OnTimeout, timeout, 0);
      args.GetReturnValue().Set(err);
   }

Compare this with timer.c. and you can see that these is not that much of a difference. Let's look at the callback OnTimeout:

    static void OnTimeout(uv_timer_t* handle) {
      TimerWrap* wrap = static_cast<TimerWrap*>(handle->data);
      Environment* env = wrap->env();
      HandleScope handle_scope(env->isolate());
      Context::Scope context_scope(env->context());
      wrap->MakeCallback(kOnTimeout, 0, nullptr);
    }

The callback in question looks like:

    0xa90abea6961: [Function]
     - map = 0x2fecee786da1 [FastProperties]
     - prototype = 0x1b3829484539
     - elements = 0x2203fd302241 <FixedArray[0]> [FAST_HOLEY_ELEMENTS]
     - initial_map =
     - shared_info = 0x28b2bad27aa1 <SharedFunctionInfo listOnTimeout>
     - name = 0x28b2bad26b31 <String[13]: listOnTimeout>
     - formal_parameter_count = 0
     - context = 0x19bfe9663951 <FixedArray[48]>
     - feedback vector cell = 0x28b2bad2a549 <Cell value= 0x2203fd302311 <undefined>>
     - code = 0x26d0e2004941 <Code BUILTIN>
     - properties = 0x2203fd302241 <FixedArray[0]> {
        #length: 0x35bd747eed51 <AccessorInfo> (const accessor descriptor)
        #name: 0x35bd747eedc1 <AccessorInfo> (const accessor descriptor)
        #prototype: 0x35bd747eee31 <AccessorInfo> (const accessor descriptor)
     }

Notice that the callback is listOnTimeout and this can be found in lib/timer.js.

setImmediate

The very simple JavaScript looks like this:

    setImmediate(function () {
      console.log('bajja');
    });

Like setTimeout the implementation is found in lib/timers.js. A new Immediate will be created in createImmediate which looks like this:

    function Immediate() {
      // assigning the callback here can cause optimize/deoptimize thrashing
      // so have caller annotate the object (node v6.0.0, v8 5.0.71.35)
      this._idleNext = null;
      this._idlePrev = null;
      this._callback = null;
      this._argv = null;
      this._onImmediate = null;
      this.domain = process.domain;
    }

The following check will then be done:

    if (!process._needImmediateCallback) {
      process._needImmediateCallback = true;
      process._immediateCallback = processImmediate;
    }

In this case process._needImmediateCallback is false so we'll enter the above block and set process._needImmediateCallback to true.

Also, notice that we are setting the processImmediate instance as a member of the process object. processImmediate is a function defined in timer.js. There is a V8 accessor for the field _immediateCallback on the process object which is set up in node.cc (SetupProcessObject function):

    auto need_immediate_callback_string =
        FIXED_ONE_BYTE_STRING(env->isolate(), "_needImmediateCallback");
    CHECK(process->SetAccessor(env->context(), need_immediate_callback_string,
                               NeedImmediateCallbackGetter,
                               NeedImmediateCallbackSetter,
                               env->as_external()).FromJust());

So when we do process_.immediateCallback NeedImmediateCallbackSetter will be invoked. Looking closer at this function and comparing it with a libuv check example we should see some similarties.

    uv_check_t* immediate_check_handle = env->immediate_check_handle();

    uv_idle_t* immediate_idle_handle = env->immediate_idle_handle();

    uv_check_start(immediate_check_handle, CheckImmediate);
    // Idle handle is needed only to stop the event loop from blocking in poll.
    uv_idle_start(immediate_idle_handle, IdleImmediateDummy);

So we can see that when this setter is called it will set up check handle (if the value was true as in process._needImmediateCallback = true). When the check phase is reached the CheckImmediate callback will be invoked. Lets set a breakpoint in that function and verify this:

    (lldb) breakpoint set --file node.cc --line 286 

    static void CheckImmediate(uv_check_t* handle) {
      Environment* env = Environment::from_immediate_check_handle(handle);
      HandleScope scope(env->isolate());
      Context::Scope context_scope(env->context());
      MakeCallback(env, env->process_object(), env->immediate_callback_string());
    }

Following MakeCallback will will find ourselves in timers.js and its processImmediate function which you might recall that we set:

     process._immediateCallback = processImmediate;

     immediate._callback = immediate._onImmediate;

immediate._onImmediate will be our callback function (anonymous in setimmediate.js)

    tryOnImmediate(immediate, tail);

will call:

    runCallback(immediate);

will call:

    return timer._callback();

And the callback is:

    function () {
       console.log('bajja');
    }

And there we have how setImmediate works in Node.js.

process._nextTick

The very simple JavaScript looks like this:

    process.nextTick(function () {
      console.log('bajja');
    });

nextTick is defined in lib/internal/process/next_tick.js. After a few checks what happens is that the callback is added to the nextTickQueue:

    nextTickQueue.push({
      callback,
      domain: process.domain || null,
      args
    });

nextTickQueue is an array:

    var nextTickQueue = [];

And we are pushing an object with the callback as a function named callback, domain and args. So for every nextTick called an entry will be added to the queue.

    tickInfo[kLength]++;

Recall that TickInfo is an inner class of Environment. Lets back up a little. bootstrap/node.js will call next_tick's setup() function from its start function:

    NativeModule.require('internal/process/next_tick').setup();

    exports.setup = setupNextTick;

    var microtasksScheduled = false;

    // Used to run V8's micro task queue.
    var _runMicrotasks = {};

    // *Must* match Environment::TickInfo::Fields in src/env.h.
    var kIndex = 0;
    var kLength = 1;

    process.nextTick = nextTick;
    // Needs to be accessible from beyond this scope.
    process._tickCallback = _tickCallback;
    process._tickDomainCallback = _tickDomainCallback;

    // This tickInfo thing is used so that the C++ code in src/node.cc
    // can have easy access to our nextTick state, and avoid unnecessary
    // calls into JS land.
    const tickInfo = process._setupNextTick(_tickCallback, _runMicrotasks);

process._setupNextTick is initialized in SetupProcessObject in src/node.cc:

    env->SetMethod(process, "_setupNextTick", SetupNextTick);

Lets take a look at what SetupNextTick does...

    env->set_tick_callback_function(args[0].As<Function>());

    env->SetMethod(args[1].As<Object>(), "runMicrotasks", RunMicrotasks);

So, here we are setting a method named runMicrotasks on the _runMicrotasks object passed to _setupNextTick.

    // Do a little housekeeping.
    env->process_object()->Delete(
        env->context(),
        FIXED_ONE_BYTE_STRING(args.GetIsolate(), "_setupNextTick")).FromJust();

Looks like this removes the _setupNextTick function from the process object afterwards.

    uint32_t* const fields = env->tick_info()->fields();
    uint32_t const fields_count = env->tick_info()->fields_count();

What are 'fields'? What are 'fields_count'?

    (lldb) p fields_count
    (uint32_t) $23 = 2

    Local<ArrayBuffer> array_buffer =
        ArrayBuffer::New(env->isolate(), fields, sizeof(*fields) * fields_count);

    args.GetReturnValue().Set(Uint32Array::New(array_buffer, 0, fields_count));

So tickInfo returned will be an ArrayBuffer:

    const tickInfo = process._setupNextTick(_tickCallback, _runMicrotasks);

Next we assign the RunMicroTasks callback to the _runMicrotasks variable:

    _runMicrotasks = _runMicrotasks.runMicrotasks;

After this we are done in bootstrap/node.js and the setup of next_tick. So, lets continue and break in our script and follow process.setNextTick.

    nextTickQueue.push({
      callback,
      domain: process.domain || null,
      args
    });

So we are again showing that we add callback info to the nextTickQueue (after a few checks) Then we do the following:

    tickInfo[kLength]++;

For each object added to the nextTickQueue we will increment the second element of the tickInfo array.

And that is it, the stack frames will start returning and be poped off the call stack. What we are interested in is in module.js and Module.runMain:

    process._tickCallback();


    do {
      while (tickInfo[kIndex] < tickInfo[kLength]) {
        tock = nextTickQueue[tickInfo[kIndex]++];
        ...
        _combinedTickCallback(args, callback);
        if (kMaxCallbacksUntilQueueIsShortened < tickInfo[kIndex])
           tickDone();
      }
    } while (tickInfo[kLength] !== 0);

The check is to see if tickInfo[kIndex] (is this the index of being processed?) is less than the number of tick callbacks in the nextTickQueue. Next tickInfo[kIndex] is retrieved from the nextTickQueue and then tickInfo[kIndex] is incremented.

tickDone():

    function tickDone() {
      if (tickInfo[kLength] !== 0) {
        if (tickInfo[kLength] <= tickInfo[kIndex]) {
          nextTickQueue = [];
          tickInfo[kLength] = 0;
        } else {
          nextTickQueue.splice(0, tickInfo[kIndex]);
          tickInfo[kLength] = nextTickQueue.length;
        }
      }
      tickInfo[kIndex] = 0;
     }

Lets take a look at:

    if (tickInfo[kLength] <= tickInfo[kIndex]) {

If the number of callbacks added is less than or equal to the just processed callbacks index this would mean that all of the callbacks in the queue have been processed and the following clause will make nextTickQueue point to an empty array and reset tickInfo[kLength] to zero. But if there are more callback in the queue than the just processed callbacks index the else clause will be taken:

    nextTickQueue.splice(0, tickInfo[kIndex]);
    tickInfo[kLength] = nextTickQueue.length;

splice will remove all elements from 0 to tickInfo[kIndex], which is removing all the processed callbacks. The new length is set as tickInfo[kLength]. This is done so that the nextTickQueue array does not become too large and run the process out of memory. By shortning the array this reduces the likelyhood of this happening.

Compiling with a different version of libuv

What I'd like to do is use my local fork of libuv instead of the one in the deps directory. I think the way to do this is to make install and then run configure with the following options:

    $ ./configure --debug --shared-libuv --shared-libuv-includes=/usr/local/include

The location of the library is /usr/local/lib, and /usr/local/include for the headers on my machine.

Updating addons test

Some of the addons tests are not version controlled but instead generate using:

   $ ./node tools/doc/addon-verify.js doc/api/addons.md

The source for these tests can be found in doc/api/addons.md and these might need to be updated if a change to all tests is required, for a concrete example we wanted to update the build/Release/addon directory to be different depending on the build type (Debug/Release) and I forgot to update these tests.

Using nvm with Node.js source

Install to the nvm versions directory:

    $ make install DESTDIR=~/.nvm/versions/node/ PREFIX=v8.0.0

You can then use nvm to list that version and versions:

   $ nvm ls 
         v6.5.0
         v7.0.0
         v7.4.0
         v8.0.0

   $ nvm use 8

lldb

There is a .lldbinit which contains a number of useful alias to print out various V8 objects. This are most of the aliases defined in gdbinit.

For example, you can print a v8::Localv8::Function using the builtin print command:


    (lldb) p init_fn
    (v8::Local<v8::Function>) $3 = (val_ = 0x000000010484f900)

This does not give much, but if we instead use jlh:

    (lldb) jlh init_fn
    0x19417e265ba9: [Function]
     - map = 0x382d7ba86ea9 [FastProperties]
     - prototype = 0xd21f3203f39
     - elements = 0x18ede4802241 <FixedArray[0]> [FAST_HOLEY_ELEMENTS]
     - initial_map =
     - shared_info = 0x23c6dbac1ce1 <SharedFunctionInfo init>
     - name = 0x21a813bbd419 <String[4]: init>
     - formal_parameter_count = 4
     - context = 0x19417e203b41 <FixedArray[8]>
     - literals = 0x18ede4804a49 <FixedArray[1]>
     - code = 0x1aa594184481 <Code: BUILTIN>
     - properties = {
       #length: 0x18ede4850bd9 <AccessorInfo> (accessor constant)
       #name: 0x18ede4850c49 <AccessorInfo> (accessor constant)
       #prototype: 0x18ede4850cb9 <AccessorInfo> (accessor constant)
     }

So that gives us more information, but lets say you'd like to see the name of the function:

    (lldb) jlh init_fn->GetName()
    #init

Promise builtin

For debugging the builtin promise we are going to disable V8 snapshots:

$ ./configure --without-snapshot --debug
$ lldb -- out/Debug/node ../scripts/promise.js
(lldb) br s -f bootstrapper.cc -l 2321

It takes a while before the breakpoint is hit but it will be. And we are going to look at the following js:

const p = new Promise((resolve, reject) => {
  resolve('ok');
});
Handle<JSFunction> promise_fun = InstallFunction(global,
    "Promise", JS_PROMISE_TYPE, JSPromise::kSizeWithEmbedderFields,
    0, factory->the_hole_value(), Builtins::kPromiseConstructor);
(lldb) expr isolate()->builtins()->builtin_handle(Builtins::Name::kPromiseConstructor)->Print()
0x3c5be1744d61: [Code]
 - map: 0x3657d7704051 <Map(HOLEY_ELEMENTS)>
kind = BUILTIN
name = PromiseConstructor
compiler = turbofan
address = 0x3c5be1744d61
Body (size = 3644)
Instructions (size = 3320)
0x3c5be1744dc0     0  55             push rbp
0x3c5be1744dc1     1  4889e5         REX.W movq rbp,rsp
0x3c5be1744dc4     4  56             push rsi
0x3c5be1744dc5     5  57             push rdi
0x3c5be1744dc6     6  50             push rax
0x3c5be1744dc7     7  4883ec40       REX.W subq rsp,0x40
0x3c5be1744dcb     b  4989e2         REX.W movq r10,rsp
0x3c5be1744dce     e  4883ec08       REX.W subq rsp,0x8
0x3c5be1744dd2    12  4883e4f0       REX.W andq rsp,0xf0
0x3c5be1744dd6    16  4c891424       REX.W movq [rsp],r10
0x3c5be1744dda    1a  488bc2         REX.W movq rax,rdx
0x3c5be1744ddd    1d  488955e0       REX.W movq [rbp-0x20],rdx
0x3c5be1744de1    21  488bde         REX.W movq rbx,rsi
0x3c5be1744de4    24  488bfe         REX.W movq rdi,rsi
0x3c5be1744de7    27  48be000000001c000000 REX.W movq rsi,0x1c00000000
0x3c5be1744df1    31  48ba69bce15e57360000 REX.W movq rdx,0x36575ee1bc69    ;; object: 0x36575ee1bc69 <String[157]: CAST(Parameter(Linkage::GetJSCallContextParamIndex( static_cast<int>(call_descriptor->JSParameterCount())))) at ../deps/v8/src/compiler/code-assembler.cc:385>
0x3c5be1744dfb    3b  498d85185620fa REX.W leaq rax,[r13-0x5dfa9e8]
....
  Handle<SharedFunctionInfo> shared(promise_fun->shared(), isolate);
  shared->SetConstructStub(*BUILTIN_CODE(isolate, JSBuiltinsConstructStub));
  shared->set_internal_formal_parameter_count(1);
  shared->set_length(1);

  InstallSpeciesGetter(promise_fun);
  SimpleInstallFunction(promise_fun, "all", Builtins::kPromiseAll, 1, true);
  SimpleInstallFunction(promise_fun, "race", Builtins::kPromiseRace, 1, true);
  SimpleInstallFunction(promise_fun, "resolve", Builtins::kPromiseResolveTrampoline, 1, true);
  SimpleInstallFunction(promise_fun, "reject", Builtins::kPromiseReject, 1, true);

So we can see that the Promise function is set up as a global. The then function is later setup using:

  Handle<JSFunction> promise_then = SimpleInstallFunction(prototype, isolate->factory()->then_string(), Builtins::kPromisePrototypeThen, 2, true);
  native_context()->set_promise_then(*promise_then);

deps/v8/src/builtins/builtins-definitions.h:

TFJ(PromiseConstructor, 1, kExecutor)

TFJ means TurboFan JavaScript linkage and means it is callable as a JavaScript function.

Debug JavaScript tests

Just example commands that I use in different projects to run the debugger with different test suites.

Mocha

    $ mocha --inspect --debug-brk  -u exports --recursive -t 10000 ./test/setup.js  test/sync/test_index.js

crypto

The current version of openssl is 1.0.2k (run process.versions.openssl). So this is major version 1, minor 0 and patch 2k I guess. To investigate lets take a look what happens when one requires crypto:

    const crypto = require('crypto');
    $ lldb -- ./out/Debug/node crypto.js

src/node_crypto.cc is a builtin:

    NODE_MODULE_CONTEXT_AWARE_BUILTIN(crypto, node::crypto::InitCrypto)

So, lets set a breakpoint in InitCrypto:

    (lldb) breakpoint set -f node_crypto.cc -l 6007
    (lldb) breakpoint set -f node_crypto.cc -l 5880

First things that happens is that libcrypto must be initializes. My understanding is that OpenSSL has two libraries which are libssl which used libcrypto. Node is using libcrypto in this case.

From InitCryptOnce:

    SSL_load_error_strings();
    OPENSSL_no_config();

OPENSSL_no_config() marks OpenSSL as configured. It seems that if this is not done to avoid some of OpenSSLs standard init functions that automatically call the configuration to just return hence do nothing.

   if (!openssl_config.empty())

This path would be taken if a openssl config had been passed using --openssl-config.

Next, we have:

    SSL_library_init();

This function can be found in deps/openssl/openssl/ssl/ssl_algs.c

    EVP_add_cipher(EVP_des_cbc());

EVP I think stands for envelope and has a number of high level cryptographic functions.

CNNIC

China Internet Network Information Center (CNNIC) is referenced in some code. It is a Certificate Authority (may be other things as well).

Signed Public Key and Challenge (SPKAC)

Also known as Netscape SPKI (spooky). There was originally an element named keygen in the html5 spec which was later removed. The intention was to create client side certificates through a web service for protocols like WebID.

Building with shared openssl

Building an locally built version of OpenSSL.

$ ./configure --debug --shared-openssl --shared-openssl-libpath=/Users/danielbevenius/work/security/build_1_1_0g/lib --shared-openssl-includes=/Users/danielbevenius/work/security/build_1_1_0g/include
$ make -j8

Building OpenSSL without elliptic curve support

$ ./Configure no-ec --debug --prefix=/Users/danielbevenius/work/security/openssl/build  --libdir="openssl" darwin64-x86_64-cc

Then building Node against that version:

$ ./configure --shared-openssl --shared-openssl-libpath=/Users/danielbevenius/work/security/openssl/build/openssl --shared-openssl-includes=/Users/danielbevenius/work/security/openssl/build/include

This will not compile are the headers for ec will not exist in the --shared-openssl-includes directory. You'll have to use the source include directory instead so that all the headers can be found.

    $ ./configure --debug --shared-openssl --shared-openssl-libpath=/Users/danielbevenius/work/security/openssl/build/openssl --prefix=/Users/danielbevenius/work/nodejs/build --shared-openssl-includes=/Users/danielbevenius/work/security/openssl/include

There will instead be a runtime error if you try to call functions that require EC but you'll be able to build.

Notice here that we have configured OpenSSL without Elliptic curve support

I was wondering what the values will if you just specify --shared-openssl which I've seen. In this case pkg-config will be called to retrieve information about install libraries in the system. For my system this will be:

'libraries': ['-lcrypto', '-lssl']},

RHEL8

OpenSSL on RHEL8 will be OpenSSL 1.1.1 and TLS 1.3. For node this will have implications if node is built and dynamically linking OpenSSL.

Is the OpenSSL version that RHEL ships the same as the upstream one?
If there is no difference than perhaps using different certificates or something, I think we should be using the statically linked version of OpenSSL. I know that previously dynamically linking was argued as better as that would mean that only the OpenSSL dynamic library would have to be updated in the case of a CVE. But in node's case it is very tightly coupled with OpenSSL and changes are that there are code changes in node core required. So just updating the dynamic OpenSSL library might break a node application. So in reality if a CVE coming out for OpenSSL the node package/executable will also have to be updated. Also, at least for us with in rhoar our target is a container we would have to update the base image in addition to an update to node..

Building on Solaris

I used VirtualBox to build and run the test suite on Solaris

    $ uname -a
    SunOS solaris 5.11 11.3 i86pc i386 i86pc

Setup

    $ sudo pkgadd -d http://get.opencsw.org/now

Install git:

    $ sudo /opt/csw/bin/pkgutil -y -i git
    $ export PATH=/opt/csw/bin:$PATH      // added to ~/.bashrc

Install binutils:

    $ sudo pkgutil -y -i binutils
    $ export PATH=/opt/csw/bin:/opt/csw/gnu:$PATH

Set GNU Make as the default:

    $ sudo ln -s /usr/bin/gmake /usr/bin/make

Clone node:

    $ git clone https://github.com/nodejs/node.git

Install gcc 49:

    $ sudo pkg install --accept --license gcc-49
    $ gmake CXXFLAGS+="--function-sections -fdata-sections"

Install stdc++6:

    $ sudo pkgchk -L CSWlibstdc++6
    $ export LD_LIBRARY_PATH=/opt/csw/lib/:$LD_LIBRARY_PATH

Patches:

[email protected]:~/work/node$ git show 8ff7afd
commit 8ff7afd2aba1cc13348e5d639f292b2fbb3b86d0
Author: Daniel Bevenius <[email protected]>
Date:   Thu Mar 16 06:59:05 2017 +0100

    add include ldflag for solaris
    
    It looks like when cctest is to be compiled the includes are
    missing. Trying to add the SHARED_INTERMEDIATE_DIR and see if
    that fixes one of the includes. If that works I'll add the rest
    if required.

diff --git a/node.gyp b/node.gyp
index 2407844..4bd3769 100644
--- a/node.gyp
+++ b/node.gyp
@@ -667,6 +667,9 @@
             ]},
           ],
         }],
+        ['OS=="solaris"', {
+          'ldflags': [ '-I<(SHARED_INTERMEDIATE_DIR)' ]
+        }],
       ]
     }
   ], # end targets
[email protected]:~/work/node$ git show 68c396b
commit 68c396be00896801bb92b5705da240d9aef30890
Author: Daniel Bevenius <[email protected]>
Date:   Thu Mar 16 12:21:24 2017 +0100

    fix for building on solaris

diff --git a/common.gypi b/common.gypi
index 3aad8e7..e001a1d 100644
--- a/common.gypi
+++ b/common.gypi
@@ -282,7 +282,8 @@
       }],
       [ 'OS in "linux freebsd openbsd solaris android aix"', {
         'cflags': [ '-Wall', '-Wextra', '-Wno-unused-parameter', ],
-        'cflags_cc': [ '-fno-rtti', '-fno-exceptions', '-std=gnu++0x' ],
+        #'cflags_cc': [ '-fno-rtti', '-fno-exceptions', '-std=gnu++0x' ],
+        'cflags_cc': [ '-fno-rtti', '-fno-exceptions' ],
         'ldflags': [ '-rdynamic' ],
         'target_conditions': [
           # The 1990s toolchain on SmartOS can't handle thin archives.
@@ -323,6 +324,8 @@
             'ldflags': [ '-m64', '-march=z196' ],
           }],
           [ 'OS=="solaris"', {
+            'defines': ['_GLIBCXX_USE_C99_MATH'],
+            'cflags_cc': [ '-std=c++11' ],
             'cflags': [ '-pthreads' ],
             'ldflags': [ '-pthreads' ],
             'cflags!': [ '-pthread' ],

Build and run node tests:

    $ gmake cctest

Building on Windows

I used VirtualBox to build and run the test suite on Windows

    $ .\vcbuild test

Debugging

    $ lldb -- out/Debug/cctest
    (lldb) r
    [ RUN      ] EnvironmentTest.MultipleEnvironmentsPerIsolate
    cctest(86419,0x7fff799ca000) malloc: *** error for object 0x104401058: incorrect checksum for freed object - object was probably modified after being freed.
*** set a breakpoint in malloc_error_break to debug
Process 86419 stopped
* thread #1: tid = 0x2bd84d0, 0x00007fff97229f06 libsystem_kernel.dylib`__pthread_kill + 10, queue = 'com.apple.main-thread', stop reason = signal SIGABRT
    frame #0: 0x00007fff97229f06 libsystem_kernel.dylib`__pthread_kill + 10
libsystem_kernel.dylib`__pthread_kill:
->  0x7fff97229f06 <+10>: jae    0x7fff97229f10            ; <+20>
    0x7fff97229f08 <+12>: movq   %rax, %rdi
    0x7fff97229f0b <+15>: jmp    0x7fff972247cd            ; cerror_nocancel
    0x7fff97229f10 <+20>: retq

    (lldb) memory read 0x104401058


    (lldb) jlh handle
    0x1872340551b1: [Symbol]
     - hash: 60730982
     - name: 0x187234028391 <String[14]: node:npnBuffer>
     - private: 1


(lldb) jlh result
0x1e9a8f728239: [Symbol]
 - hash: 171677908
 - name: 0x1e9a8f728211 <String[15]: node:alpnBuffer>
 - private: 1

Switching between clang and gcc

    CXX=g++ CXX.host=g++ && ./configure -- -Dclang=0.

Ninja

Macosx:

 
    $ ./configure --ninja
    $ ninja -C out/Release

    $ ./configure && tools/gyp_node.py -f ninja && ninja -C out/Release && ln -fs out/Release/node node

This will generate object files in out/Release/src/ but the names will be obj/src/node.node.o instead of obj/src/node/node.o. This matters as we generete object files for different operating systems.

If you take a look in out/Release/ninja.build you'll find a bunch of subninja commands which are used to include other ninja build files:

....
subninja obj/node.ninja

Building with Ninja linux

    $ dnf install ninja-build
    $ ./configure --debug --ninja
    $ ninja-build -C out/Release
    $ out/Release/cctest

Building with Ninja on windows

You'll need to install Visual Studion 2015 and make sure you select Common Tools for C++.

Open cmd with administrator priveliges (Start -> Search for cmd ->CTRL+SHIFT+ENTER):

    > python configure --debug --ninja --dest-cpu=x86 --without-intl
    > tools\gyp_node.py -f ninja
    > ninja -C out\Release
    > out\Release\cctest.exe

Linux getauxval

A good description of this can be found here: https://lwn.net/Articles/519085/

The getauxval is a function that was added in glibc 2.16

    #if defined(__linux__) && defined(__GLIBC__) && defined(__GLIBC_PREREQ)
    # if __GLIBC_PREREQ(2, 16)
    #   define HAS_GETAUXVAL 1
    #   include <sys/auxv.h>
    # endif //  HAS_GETAUXVAL
    #endif

    # LD_SHOW_AUXV=1 ./node
    AT_SYSINFO_EHDR: 0x7fff503ee000
    AT_HWCAP:        9f8bfbff
    AT_PAGESZ:       4096
    AT_CLKTCK:       100
    AT_PHDR:         0x400040
    AT_PHENT:        56
    AT_PHNUM:        9
    AT_BASE:         0x7f223f1d6000
    AT_FLAGS:        0x0
    AT_ENTRY:        0x859c90
    AT_UID:          0
    AT_EUID:         0
    AT_GID:          0
    AT_EGID:         0
    AT_SECURE:       0
    AT_RANDOM:       0x7fff503dddc9
    AT_EXECFN:       ./node
    AT_PLATFORM:     x86_64

To force the setting of AT_SECURE:


    $ setcap cap_net_raw+ep out/Release/node
    $ getcap out/Release/node
    $ useradd beve
    $ su beve
    $ ./out/Release/node

Show all v8 exceptions

    $ out/Release/node --print_all_exceptions /work/node/test/parallel/test-process-setuid-setgid.js

Creating patches

I've found that I need to create patches from old patches that worked with a previous version. At work we apply patches to node sources and when new versions are released these patches might not apply cleanly making it a manual task to make updates to that tag and then generating the patches to be applied.


   $ patch -p1 < 000x-something.patch
   $ git diff --patch-with-stat > patch.out

Then I just copy this and replace the original patch section, but keeping the rest of the original patch. Then try to apply the patch again to make sure it applied cleanly

Cherry pick a V8 commit

    $ git format-patch -1 --stdout f5fad6d > out.patch
    $ git am --directory deps/v8  ~/work/google/javascript/v8/verbose.patch

Don't forget to bump the patch version in deps/v8/include/v8-version.h Also common common.gypi needs to be updated as well:

'v8_embedder_string': '-node.5',

HTTP/2

    const server = h2.createServer();

createServer is defined in lib/internal/http2/core.js and it returns a new Http2Server. Http2Server extends Server in net.js. After calling the super classes constructor the following call is performed:

    this.on('newListener', setupCompat);

setupCompat is a event listener callback which only handles 'request' events, in which case it

Messages/Exceptions with V8

Each V8 Isolate allows listeners to be added for various error messages/exceptions.

    isolate->AddMessageListener(OnMessage);
    isolate->SetFatalErrorHandler(OnFatalError);
    isolate->SetAbortOnUncaughtExceptionCallback(ShouldAbortOnUncaughtException);

AddMessageListener is a callback that will be called when an error occurs. How does this work then?

When a script is run, for example:

    const char *js = "age = ajj40";  // intentionally trigger an exception
    Local<String> source = String::NewFromUtf8(isolate, js, NewStringType::kNormal).ToLocalChecked();
    Local<Script> script = Script::Compile(context, source).ToLocalChecked();
    Local<Value> result = script->Run(context).ToLocalChecked();

Script::Run can be found in api.cc

    (lldb) br s -f api.cc -l 2017
    has_pending_exception = !ToLocal<Value>(i::Execution::Call(isolate, fun, receiver, 0, nullptr), &result);

See the returned value is a bool indicating if there was an error. i::Execution::Call will call:

    return CallInternal(isolate, callable, receiver, argc, argv, MessageHandling::kReport)

Notice that in the MessageHandling::kReport being passed in. CallInternal looks like this (in our case):

    return Invoke(isolate, false, callable, receiver, argc, argv, isolate->factory()->undefined_value(), message_handling);

    MUST_USE_RESULT MaybeHandle<Object> Invoke(
        Isolate* isolate, bool is_construct, Handle<Object> target,
        Handle<Object> receiver, int argc, Handle<Object> args[],
        Handle<Object> new_target, Execution::MessageHandling message_handling)

In our case the target is a JavaScript Function which can be checked by:

    (lldb) p target->IsJSFunction()
    (bool) $58 = true

This will cause us to enter the following block in the Invoke function:

    if (target->IsJSFunction()) {
      Handle<JSFunction> function = Handle<JSFunction>::cast(target);

But there is a check:

   if ((!is_construct || function->IsConstructor()) &&
        function->shared()->IsApiFunction()) {

which causes us to exit the block.

    // Placeholder for return value.
    Object* value = NULL;

    typedef Object* (*JSEntryFunction)(Object* new_target, Object* target,
                                      Object* receiver, int argc,
                                      Object*** args);

So we are declaring a JSEntryFunction which is of type pointer to function that returns a pointer to Object and takes the arguments listed.

    Handle<Code> code = is_construct
      ? isolate->factory()->js_construct_entry_code()
      : isolate->factory()->js_entry_code();

How are js_entry_code() set? See Heap objects for details.

    JSEntryFunction stub_entry = FUNCTION_CAST<JSEntryFunction>(code->entry());

    if (FLAG_clear_exceptions_on_js_entry) isolate->clear_pending_exception();

    // Call the function through the right JS entry stub.
    Object* orig_func = *new_target;
    Object* func = *target;
    Object* recv = *receiver;
    Object*** argv = reinterpret_cast<Object***>(args);
    if (FLAG_profile_deserialization && target->IsJSFunction()) {
      PrintDeserializedCodeInfo(Handle<JSFunction>::cast(target));
    }
    RuntimeCallTimerScope timer(isolate, &RuntimeCallStats::JS_Execution);
    value = CALL_GENERATED_CODE(isolate, stub_entry, orig_func, func, recv, argc, argv);
    (lldb) br s -f execution.cc -l 145

CALL_GENERETED_CODE is a macro which is used to call code generated by one of the compilers. I'm on a x64 some guessing that src/x64/simulator-x64.h is getting called which does:

    // TODO(X64): Don't pass p0, since it isn't used?
   #define CALL_GENERATED_CODE(isolate, entry, p0, p1, p2, p3, p4) \
   (entry(p0, p1, p2, p3, p4))

Setting a breakpoint after this call we can inspect the value returned:

    (lldb) job value
    #exception
    // Update the pending exception flag and return the value.
    bool has_exception = value->IsException(isolate);
    DCHECK(has_exception == isolate->has_pending_exception());
    if (has_exception) {
      if (message_handling == Execution::MessageHandling::kReport) {
        isolate->ReportPendingMessages();
      }
      return MaybeHandle<Object>();
    } else {
      isolate->clear_pending_message();
    }

Remember that we passed Execution::MessageHandling::kReport earlier so we will call isolate-ReportPendingMessages()

    void Isolate::ReportPendingMessages() {
      DCHECK(AllowExceptions::IsAllowed(this));

      Object* exception = pending_exception();

pending_exception() performs some checks and then returns:

    return thread_local_top_.pending_exception_;
    (lldb) job exception
    0x94c7c108af1: [JS_ERROR_TYPE]
    - map = 0xcc576b8e179 [FastProperties]
    - prototype = 0x1993f4b15221
    - elements = 0x37d706602241 <FixedArray[0]> [FAST_HOLEY_SMI_ELEMENTS]
    - properties = 0x94c7c108ec9 <FixedArray[3]> {
     #stack: 0x3aeec9e69bf1 <AccessorInfo> (const accessor descriptor)
     #message: 0x94c7c108ac9 <String[20]: ajj40 is not defined> (data field 0)
     0x37d706604d71 <Symbol: stack_trace_symbol>: 0x94c7c109099 <JSArray[6]> (data field 1)
    // Try to propagate the exception to an external v8::TryCatch handler. If
    // propagation was unsuccessful, then we will get another chance at reporting
    // the pending message if the exception is re-thrown.
    bool has_been_propagated = PropagatePendingExceptionToExternalTryCatch();
    if (!has_been_propagated) return;

Looking into PropagatePendingExceptionToExternalTryCatch:

    bool Isolate::PropagatePendingExceptionToExternalTryCatch() {
      Object* exception = pending_exception();

The first thing it does is again call pending_exceptions() invoking the same checks are before which should really be the same. I wonder if there might be use for an overloaded function taking a pointer to exception?

    HandleScope scope(this);
    Handle<JSMessageObject> message(JSMessageObject::cast(message_obj), this);
    Handle<JSValue> script_wrapper(JSValue::cast(message->script()), this);
    Handle<Script> script(Script::cast(script_wrapper->value()), this);
    int start_pos = message->start_position();
    int end_pos = message->end_position();
    MessageLocation location(script, start_pos, end_pos);
    MessageHandler::ReportMessage(this, &location, message);

MessageHandler::ReportMessage prepares the error message (very simplified) and ends up calling:

    ReportMessageNoExceptions(isolate, loc, message, api_exception_obj);


    v8::MessageCallback callback =FUNCTION_CAST<v8::MessageCallback>(callback_obj->foreign_address());
    Handle<Object> callback_data(listener->get(1), isolate);
    {
      // Do not allow exceptions to propagate.
      v8::TryCatch try_catch(reinterpret_cast<v8::Isolate*>(isolate));
      callback(api_message_obj, callback_data->IsUndefined(isolate) ? api_exception_obj : v8::Utils::ToLocal(callback_data));
    }

This is the call that invokes are message callback. After this we will back out of the frame stacks. On thing I notice that there was an ExceptionScope which the deconstructor sets the exception object. Need to look into why this is done. So we will eventually end up back where in v8::Script::Run:

    has_pending_exception = !ToLocal<Value>(i::Execution::Call(isolate, fun, receiver, 0, nullptr), &result);

    (lldb) p has_pending_exception
    (bool) $208 = true

     RETURN_ON_FAILED_EXECUTION(Value);
     RETURN_ESCAPED(result);

Lets take a closer look at RETURN_ON_FAILED_EXECUTION(Value):

#define RETURN_ON_FAILED_EXECUTION(T) \
  EXCEPTION_BAILOUT_CHECK_SCOPED(isolate, MaybeLocal<T>())

So that will be EXCEPTION_BAILOUT_CHECK_SCOPED(isolate, MaybeLocal<Value>()) which becomes:

    #define EXCEPTION_BAILOUT_CHECK_SCOPED(isolate, value) \
      do {                                                 \
        if (has_pending_exception) {                       \
          call_depth_scope.Escape();                       \
          return value;                                    \
        }                                                  \
     } while (false)

So in our case the value returned will be a MaybeLocal object. This is what gets returned here:

    Local<Value> result = script->Run(context).ToLocalChecked();

Now, notice that we are calling ToLocalChecked(), which

    if (V8_UNLIKELY(val_ == nullptr)) V8::ToLocalEmpty();

    Utils::ApiCheck(false, "v8::ToLocalChecked", "Empty MaybeLocal.");

    if (!condition) Utils::ReportApiFailure(location, message);


    FatalErrorCallback callback = isolate->exception_behavior();

    callback(location, message);

And this is where our OnFatalError function is called. But without the ToLocalChecked our OnFatalError would not be called.

Well, during invokation of a Script, recall from above it was mentioned that we'll end up in CALL_GENERATED_CODE. Now, if we set a breakpoint in file = 'isolate.cc', line = 1062 which is the Throw function we can backtrace and see how we ended up there. ic.cc:2395:

    v8::internal::Runtime_LoadGlobalIC_Miss
    Handle<Object> result;
    ASSIGN_RETURN_FAILURE_ON_EXCEPTION(isolate, result, ic.Load(name));
    return *result;

    MaybeHandle<Object> LoadIC::Load(Handle<Object> object, Handle<Name> name)

This will throw an ReferenceError in our case:

    return ReferenceError(name);

    THROW_NEW_ERROR(isolate(), NewReferenceError(MessageTemplate::kNotDefined, name), Object);

That call will bring us back to isolate.cc (1064) which is the Throw function we breaked in. This time we have a try_catch_handler:

    (lldb) p *try_catch_handler()
    (v8::TryCatch) $2 = {
      isolate_ = 0x0000000105000000
      next_ = 0x0000000000000000
      exception_ = 0x00003107fe782351
      message_obj_ = 0x00003107fe782351
      js_stack_comparable_address_ = 0x00007fff5fbfea50
      is_verbose_ = true
      can_continue_ = true
      capture_message_ = true
      rethrow_ = false
      has_terminated_ = false
    }

ReportPendingMessages:

    // Determine whether the message needs to be reported to all message handlers
    // depending on whether and external v8::TryCatch or an internal JavaScript
    // handler is on top.
    bool should_report_exception;
    if (IsExternalHandlerOnTop(exception)) {
      // Only report the exception if the external handler is verbose.
      should_report_exception = try_catch_handler()->is_verbose_;
    } else {
      // Report the exception if it isn't caught by JavaScript code.
      should_report_exception = !IsJavaScriptHandlerOnTop(exception);
    }

Is this case we have a TryCatch RAII and have set verbose to true. There is no javascript try/catch so the MessageHandler we set by calling AddMessageHandler on the isolate should be the target handler.

SetFatalErrorHandler If this callback handler is not set then the default behaviour in V8 will be to
print a stactrace and exit, for example:

    $ ./exceptions
    OnMessage...

    #
    # Fatal error in v8::ToLocalChecked
    # Empty MaybeLocal.
    #

    Received signal 4 <unknown> 000109100351

    ==== C stack trace ===============================

     [0x00010910322e]
     [0x000109103265]
     [0x000109103186]
     [0x7fff8a2c252a]
     [0x000107ac8423]
     [0x00010624ffbc]
     [0x000106254bef]
     [0x000106254c1d]
     [0x00010621d4dc]
     [0x7fff86cc45ad]
     [0x000000000001]
    [end of stack trace]
    Illegal instruction: 4

You can set a handler for this that does something before exiting of choose to not exit.

When an exceptions is raised in V8 it will cause the function Throw in e

    Object* Isolate::Throw(Object* exception, MessageLocation* location) {

TryCatch

Is used to create a try/catch block in V8 which used RAII:

    {
      TryCatch try_catch(isolate);
      try_catch.SetVerbose(true);

      ... perform operations

      if (try_catch.HasCaught()) {
        printf("Caught: %s\n", *String::Utf8Value(try_catch.Exception()));
      }

How does setting verbose affect things? In Isolate::Throw: Which can be called if there is a JavaScript error and if an external TryCatch exist and verbose is true. If that is the case the Message will be created:

    Handle<Object> message_obj = CreateMessage(exception_handle, location);
    thread_local_top()->pending_message_obj_ = *message_obj;
    ...
    set_pending_exception(*exception_handle);
    return heap()->exception();

So that is what throw does, sets the message_obj and the exception_handle and then returns.

What if we set verbose to false:

    (lldb) expr try_catch.SetVerbose(false);

In this case the message will still be created and the exception set on the thread_local, but when we get to ReportPendingMessages is_verbose_ will be false and the callback will be skipped.

In src/api.cc (line 2710) we can find the TryCatch constructor:

v8::TryCatch::TryCatch(v8::Isolate* isolate)
    : isolate_(reinterpret_cast<i::Isolate*>(isolate)),
      next_(isolate_->try_catch_handler()),
      is_verbose_(false),
      can_continue_(true),
      capture_message_(true),
      rethrow_(false),
      has_terminated_(false) {
  ResetInternal();
  ...
  isolate_->RegisterTryCatchHandler(this);
}
`ResetInternal` will do the following:
```c++
  i::Object* the_hole = isolate_->heap()->the_hole_value();
  exception_ = the_hole;
  message_obj_ = the_hole;
(lldb) p the_hole
(v8::internal::Object *) $4 = 0x0000278095682321
`RegisterTryCatchHandler` will do:
```c++
thread_local_top()->set_try_catch_handler(that);

After this call we can inspect thread_local_top_:

(lldb) p *thread_local_top()
(v8::internal::ThreadLocalTop) $22 = {
  isolate_ = 0x0000000104806e00
  context_ = 0x000025b6fbf03af1
  thread_id_ = (id_ = 1)
  pending_exception_ = 0x000025b693f02321
  wasm_caught_exception_ = 0x0000000000000000
  pending_handler_context_ = 0x0000000000000000
  pending_handler_code_ = 0x0000000000000000
  pending_handler_offset_ = 0
  pending_handler_fp_ = 0x0000000000000000 <no value available>
  pending_handler_sp_ = 0x0000000000000000 <no value available>
  rethrowing_message_ = false
  pending_message_obj_ = 0x000025b693f02321
  scheduled_exception_ = 0x000025b693f02321
  external_caught_exception_ = false
  save_context_ = 0x0000000000000000
  c_entry_fp_ = 0x0000000000000000 <no value available>
  handler_ = 0x0000000000000000 <no value available>
  c_function_ = 0x0000000000000000 <no value available>
  promise_on_stack_ = 0x0000000000000000
  js_entry_sp_ = 0x0000000000000000 <no value available>
  external_callback_scope_ = 0x0000000000000000
  current_vm_state_ = EXTERNAL
  failed_access_check_callback_ = 0x0000000000000000
  try_catch_handler_ = 0x00007fff5fbfde60
}

Notice that pending_exception, pending_message_obj_, and scheduled_exception_ are all set to the_hole. The try_catch_handler_ is also set. So these are memory locations, and from what I understand so far is that these are made available to generated code, enabling them to set values, like pending_message_obj_, but also have access to the try_catch_handler_. This means that a generated piece of code would directly call/jump to try_catch_handler_ and have it execute. But what sets this up?
When CALL_GENERATED_CODE(isolate, stub_entry, orig_func, func, recv, argc, argv); is called the stub_entry will point to the code generated for JSEntryStub. This will place the correct values in the expected registers.

For example, take the following from the generated assemble (created using (lldb) job *code):

movq r10,0x104808790    ;; external reference (Isolate::context_address)
movq r10,0x104808790    ;; external reference (Isolate::context_address)

This instruction is generated by src/x64/code-stubs-x64.cc JSEntryStub::Generate:

ExternalReference context_address(IsolateAddressId::kContextAddress, isolate());
__ Load(kScratchRegister, context_address);o
__ Push(kScratchRegister);  // context

The above will set ExternalReference address_:

ExternalReference::ExternalReference(IsolateAddressId id, Isolate* isolate)
    : address_(isolate->get_address_from_id(id)) {}

And get_address_from_id look like this:

Address Isolate::get_address_from_id(IsolateAddressId id) {
  return isolate_addresses_[id];
}
Move(kScratchRegister, source);
(lldb) expr kScratchRegister
(const v8::internal::Register) $7 = {
  v8::internal::RegisterBase<v8::internal::Register, 16> = (reg_code_ = 10)
}

We can find the following function in src/x64/macro-assembler-x64.h:

void Move(Register dst, ExternalReference ext) {
  movp(dst, reinterpret_cast<void*>(ext.address()), RelocInfo::EXTERNAL_REFERENCE);
}

Notice the RelocInfo::EXTERNAL_REFERENCE which indicates that this is the address of an external C++ function

(lldb) p ext.address()
(v8::internal::Address) $8 = 0x0000000105813b50 <no value available>

To understand how this works we can set a break point and run mkshnapshot:

$ $ lldb out.gn/learning/mksnapshot
(lldb) br s -f code-stubs-x64.cc -n JSEntryStub::Generate
(lldb) expr StackFrame::TypeToMarker(type())
(int32_t) $1 = 2                                            // pushq  $0x2
0x3db66db84066  REX.W movq r10,0x104808790    ;; external reference (Isolate::context_address)
0x3db66db84070: movq   (%r10), %r10

Notice that 0x104808790 is a pointer. This is then dereferenced and that value placed into r10:

(lldb) register read r10
     r10 = 0x0000000104808790
(lldb) memory read -f x -c 1 -s 8 `($r10)`
0x104808790: 0x000025b6fbf03af1

So it contains the value 0x000025b6fbf03af1 and if we look at the context_ entry from above we can see these match:

context_ = 0x000025b6fbf03af1

One thing I noticed in x64/code-stubs-x64.cc JSEntry function was:

__ bind(&handler_entry);
handler_offset_ = handler_entry.pos();

Now, handler_offset_ is a private member of JSEntryStub in src/code-stubs.h. This field is set by FinishCode:

Handle<Code> new_object = GenerateCode();
new_object->set_stub_key(GetKey());
FinishCode(new_object);

Now, handler_entry is a label and this gets the position of the label. I'm guessing (at the moment) that the handler_offet_ allows other generated functions to addess the exception handler.

deps/v8/src/x64/macro-assembler-x64.cc EnterFrame. Just note that when Builtins::Generate_JSEntryTrampoline generates code it calls EnterFrame which does more than store/set rbp. Remember this or the code will be hard to follow.

Compiling on windows

I looking into an issue on windows: https://github.com/nodejs/node/issues/12952

The issue is a link error with the cctest target. In this particular case on windows it was built using:


    .\vcbuild.bat dll debug x64 vc2015

This will result in the following options:
```console
    configure --debug --shared --dest-cpu=x64 --tag=

So this is saying that node should be created as a shared library but not any of its dependencies. In this case the focus is on OpenSSL which will be a static library. The link error is:

    openssl.lib(err.obj) : error LINK2005: ERR_put_error already defined in node.lib(node.dll

We can find the exported symbols of Debug/node.dll using:

    dumpbin /EXPORTS Debug/node.dll > node-exports

We can find the ERR_put_error is indeed exported from node.dll This is done because in node.gyp we have:


      [ 'OS=="win" and '
        'node_use_openssl=="true" and '
        'node_shared_openssl=="false" and node_shared=="false"', {
        'use_openssl_def': 1,
      }, {
        'use_openssl_def': 0,
      }],

In our case use_openssl_def will be 1 which is the used in node.gypi:


      # openssl.def is based on zlib.def, zlib symbols
      # are always exported.
      ['use_openssl_def==1', {
        'sources': ['<(SHARED_INTERMEDIATE_DIR)/openssl.def'],
      }],
      ['OS=="win" and use_openssl_def==0', {
        'sources': ['deps/zlib/win32/zlib.def'],

A .def file is a module definition file which describes attributes of a DLL. In our case openssl.def can be found in Debug\obj\global_intermediate:

    EXPORTS
    ...
    ERR_put_error

So we can see this function is indeed exported, so if we link against openssl.lib this symbol (and the others) will be duplicates.

V8 Platform

This is used by an embedder and is an interface that must be implemented. This interface is defined in deps/v8/include/v8-platform.h. This interface has a number of functions related to threading and running task on threads:

  • CallOnBackgroundThread
  • CallOnForegroundThread
  • CallDelayedOnForegroundThread ...
  • AddTraceEvent

Node can be configured --without-v8-platform which will set environment variables that will be picked up by the pre-processor. For example, if you take a look in src/node.cc you will find:

    #if NODE_USE_V8_PLATFORM
    #include "libplatform/libplatform.h"
    #endif  // NODE_USE_V8_PLATFORM

libplatform/libplatform.h is the interface for a default v8::Platform implementation in V8. Next, we have a v8_platform struct that will use the above default v8::Platform implementation is NODE_USE_V8_PLATFORM is set. But what does it mean to not use the default v8::Platform implementation? Basically this will just create noop functions or functions that throw an error:

    #else  // !NODE_USE_V8_PLATFORM
    void Initialize(int thread_pool_size) {}
    void PumpMessageLoop(Isolate* isolate) {}
    void Dispose() {}
    bool StartInspector(Environment *env, const char* script_path,
                      const node::DebugOptions& options) {
      env->ThrowError("Node compiled with NODE_USE_V8_PLATFORM=0");
      return true;
    }

    void StartTracingAgent() {
      fprintf(stderr, "Node compiled with NODE_USE_V8_PLATFORM=0, "
                      "so event tracing is not available.\n");
    }
    void StopTracingAgent() {}

When trying to run a test --without-v8-platform I run into the following error:


    #
    # Fatal error in ../deps/v8/src/v8.cc, line 110
    # Check failed: platform_.
    #

    ==== C stack trace ===============================

      0   node                                0x00000001019d2dae v8::base::debug::StackTrace::StackTrace() + 30
      1   node                                0x00000001019d2de5 v8::base::debug::StackTrace::StackTrace() + 21
      2   node                                0x00000001019ccf84 V8_Fatal + 452
      3   node                                0x00000001012bcea8 v8::internal::V8::GetCurrentPlatform() + 72
      4   node                                0x0000000100ed0c7c v8::internal::Isolate::Init(v8::internal::Deserializer*) + 1692
      5   node                                0x00000001012965ba v8::internal::Snapshot::Initialize(v8::internal::Isolate*) + 170
      6   node                                0x0000000100260265 v8::Isolate::New(v8::Isolate::CreateParams const&) + 485
      7   node                                0x000000010170901f node::Start(uv_loop_s*, int, char const* const*, int, char const* const*) + 79
      8   node                                0x0000000101708bce node::Start(int, char**) + 478
      9   node                                0x000000010175f21e main + 94
      10  node                                0x0000000100000a34 start + 52
      11  ???                                 0x0000000000000002 0x0 + 2
    Process 9882 stopped


    2644   compiler_dispatcher_ =
    2645       new CompilerDispatcher(this, V8::GetCurrentPlatform(), FLAG_stack_size);

v8::GetCurrentPlatform will perform the following:

    v8::Platform* V8::GetCurrentPlatform() {
      DCHECK(platform_);
      return platform_;
   }

But since we have not set a platform, remember that would have been done if NODE_USE_V8_PLATFORM:

    #if NODE_USE_V8_PLATFORM
    void Initialize(int thread_pool_size) {
      platform_ = v8::platform::CreateDefaultPlatform(thread_pool_size);
      V8::InitializePlatform(platform_);
      tracing::TraceEventHelper::SetCurrentPlatform(platform_);
    }

My understanding is that the in this code path the snapshot blob is being deserialized and when this is done.

Upgrading OpenSSL

Download and verify the download:


    $ gpg openssl-1.0.2l.tar.gz.asc
    gpg: Signature made Thu May 25 14:55:41 2017 CEST using RSA key ID 0E604491
    gpg: Can't check signature: public key not found
    $ gpg --keyserver pgpkeys.mit.edu --recv-key 0E604491
    gpg: requesting key 0E604491 from hkp server pgpkeys.mit.edu
    gpg: key 0E604491: public key "Matt Caswell <[email protected]>" imported
    gpg: 3 marginal(s) needed, 1 complete(s) needed, PGP trust model
    gpg: depth: 0  valid:   1  signed:   0  trust: 0-, 0q, 0n, 0m, 0f, 1u
    gpg: Total number processed: 1
    gpg:               imported: 1  (RSA: 1)
    $ gpg openssl-1.0.2l.tar.gz.asc
    gpg: Signature made Thu May 25 14:55:41 2017 CEST using RSA key ID 0E604491
    gpg: Good signature from "Matt Caswell <[email protected]>"
    gpg:                 aka "Matt Caswell <[email protected]>"
    gpg: WARNING: This key is not certified with a trusted signature!
    gpg:          There is no indication that the signature belongs to the owner.
    Primary key fingerprint: 8657 ABB2 60F0 56B1 E519  0839 D9C4 D26D 0E60 4491

Compare the changes to make see if you have to make changes to the

OpenSSL with FIPS

When building the OpenSSL and specifying --openssl-fips, which is described as "Build OpenSSL using FIPS canister .o file in supplied folder" Fipsld.

The canister file is just the collection of the all the source code into an single monolihic object module to guarentee the relative order of the original code components:

$ ld -r ­-o fipscanister.o fips_start.o ... fips_end.o

Doing this will preserve the relative order of the original code components.

The follwing can be seen on that page: "When performing a static link against the OpenSSL library, you have to embed the expected FIPS signature in your executable after final linking. Embedding the FIPS signature in your executable is most often accomplished with fisld." "fisld will take the place of the linker (or compiler if invoking via a compiler driver). If you use fisld to compile a source file, fisld will do nothing and simply invoke the compiler you specify through FIPSLD_CC. When it comes time to link, fisld will compile fips_premain.c, add fipscanister.o, and then perform the final link of your program. Once your program is linked, fisld will then invoke incore to embed the FIPS signature in your program."

But how about when we dynamically link to an OpenSSL version that has FIPS support enabled. Since we would not be statically linking we should not have to specify the fipscanister.o. But without specifying the --openssl-fips flag the options to enable/disable FIPS are not avaiable in node (nore are functions enabled/disabled that would otherwise be when when FIPS mode is enabled).

Perhaps we can set the macro NODE_FIPS_MODE if OPENSSL_FIPS is set which should be done when linking against an OpenSSL version that support FIPS?

For instructions building an OpenSSL version with FIPS support see: https://github.com/danbev/learning-libcrypto#fips

Detecting FIPS support: Currently, FIPS support is not available in the version that is shipped with Node.js. But we still have the option to dynamically link to a FIPS compatible OpenSSL library. This is for example what we do at Red Hat. There is a problem here though...

If we run the following command on default build (statically linking with OpenSSL in deps, so no FIPS support):

$ ./node -p "require('crypto').getFips()"
0

This expected.

Node.js can also be dynamically linked with OpenSSL, for example:

$ ./configure --shared-openssl --openssl-is-fips

The --openssl-is-fips option specifies that the OpenSSL library is FIPS compatible and this enables FIPS related functionality in Node to be available. Building and running the same command on this version gives:

$./node -p "require('crypto').getFips()"
0

Now this is not expected.

None-FIPS configuration:

$ node -p 'process.config.variables' | grep openssl
  node_shared_openssl: false,
  node_use_openssl: true,
  openssl_fips: '',
  openssl_is_fips: false,

FIPS Compatible configuration:

./node -p 'process.config.variables' | grep openssl
  node_shared_openssl: true,
  node_use_openssl: true,
  openssl_fips: '',
  openssl_is_fips: true,
  openssl_system_ca_path: '/etc/pki/tls/certs/ca-bundle.crt',
 ./node --expose-internals -p "require('internal/test/binding').internalBinding('config')"
{
  isDebugBuild: false,
  hasOpenSSL: true,
  fipsMode: true,
  hasIntl: true,
  hasSmallICU: true,
  hasTracing: true,
  hasNodeOptions: true,
  hasInspector: true,
  noBrowserGlobals: false,
  bits: 64,
  hasDtrace: true,
  hasCachedBuiltins: true
}
We can see that fipsMode has be set which is also correct. But why does getFips()
return 0? Because you have to also enable fips:
```console
$ ./node --enable-fips -p "require('crypto').getFips()"
1

Enable fips in a container (UBI8/RHEL8):

$ update-crypto-policies --set FIPS
$ fips-mode-setup --enable --no-bootcfg
$ export OPENSSL_FORCE_FIPS_MODE=true

OpenSSL FIPS Object Module

OpenSSL itself is not validated, and never will be. Instead a carefully defined software component called the OpenSSL FIPS Object Module has been created. The Module was designed for compatibility with the OpenSSL library so products using the OpenSSL library and API can be converted to use FIPS 140-2 validated cryptography with minimal effort.

FIPS Mode in which the FIPS approved algorithms are implemented by the FIPS Object Module and non-FIPS approved algorithms are disabled by default. These non-validated algorithms include, but are not limited to, Blowfish, CAST, IDEA, RC-family, and non-SHA message digest and other algorithms.

The v1.2.x Module is only compatible with OpenSSL 0.9.8 releases, while the v2.0 Module is compatible with OpenSSL 1.0.1 and 1.0.2 releases. The v2.0 Module is the best choice for all new software and product development.

After following the instructions to configure OpenSSL with FIPS support, building Node can be done using the following commands:

    $ ./configure --debug --shared-openssl --shared-openssl-libpath=/Users/danielbevenius/work/security/build_1_0_2k/lib --shared-openssl-includes=/Users/danielbevenius/work/security/build_1_0_2k/include --openssl-fips=/Users/danielbevenius/work/security/build_1_0_2k/
    $ make -j8 test

Crypto init

    void InitCrypto(Local<Object> target,
                Local<Value> unused,
                Local<Context> context,
                void* priv) {
    static uv_once_t init_once = UV_ONCE_INIT;
    uv_once(&init_once, InitCryptoOnce);

    Environment* env = Environment::GetCurrent(context);
    SecureContext::Initialize(env, target);

SecureContext::Initialize:

    void CipherBase::Initialize(Environment* env, Local<Object> target) {
        Local<FunctionTemplate> t = env->NewFunctionTemplate(New);

Lets take a look what New does:

    void SecureContext::New(const FunctionCallbackInfo<Value>& args) {
      Environment* env = Environment::GetCurrent(args);
      new SecureContext(env, args.This());
    }

Usage of tls would look like a normal require (though since this is an native module the loading will be done by NativeModule.require(filename):

    const tls = require('tls');

This module exports (among other functions):

    exports.createSecureContext = require('_tls_common').createSecureContext;
    exports.SecureContext = require('_tls_common').SecureContext;
    exports.TLSSocket = require('_tls_wrap').TLSSocket;
    exports.Server = require('_tls_wrap').Server;
    exports.createServer = require('_tls_wrap').createServer;
    exports.connect = require('_tls_wrap').connect;

So calling tls.createSecureContext will end up in `lib/_tls_common.js:

    exports.createSecureContext = function createSecureContext(options, context) {
    ...
      var c = new SecureContext(options.secureProtocol, secureOptions, context);

And here we find the call to SecureContext using new. At this point this function will have been bound by SecureContext::Initialize which is called by node::crypto::Initialize in src/node_crypto.cc. When the tls module is required it will load the internal crypto modulewhich is what will causenode::crypto::Initialize` to be invoked.

SecureContext::Initialize(env, target) has the following functions that is adds:

    Local<FunctionTemplate> t = env->NewFunctionTemplate(SecureContext::New);
    t->InstanceTemplate()->SetInternalFieldCount(1);
    t->SetClassName(FIXED_ONE_BYTE_STRING(env->isolate(), "SecureContext"));

    env->SetProtoMethod(t, "init", SecureContext::Init);
    env->SetProtoMethod(t, "setKey", SecureContext::SetKey);
    env->SetProtoMethod(t, "setCert", SecureContext::SetCert);
    env->SetProtoMethod(t, "addCACert", SecureContext::AddCACert);
    ...
    target->Set(FIXED_ONE_BYTE_STRING(env->isolate(), "SecureContext"), t->GetFunction());

So, when tls.createSecureContext returns c will have been bound to a new function that was set up above.

In most/all of the functions in the SecureContext class you'll find the following:

  SecureContext* sc;
  ASSIGN_OR_RETURN_UNWRAP(&sc, args.Holder());

If call to such a function could look like this:

   c.context.addCACert('something');

The JavaScript layer will add wrap the native secure context in a object with a context member for the native secure context, which is the SecureContext on the C++ side. So when we call addCACert what is unwrapped by ASSIGN_OR_RETURN_UNWRAP is c which is the native context.

Remember that these are prototype functions that are being setup on instance created when using new SecureContext which will be an instance of CipherBase since this is the type that the New function returned.

When is the underlying OpenSSL Context created? This is one in SecureContext::Init which is called in _tls_common.js, in the SecureContext constructor:

  this.context = new NativeSecureContext();
  this.context.init(secureProtocol);

SecureContext::Init will setup/initialize the OpenSSL context. This includes parsing the options passed, secureProtocol above and calling the correct TLS_xxx_method(), and then using that SSL_METHOD pointer to create a new SSL_CTX.

    const binding = internalBinding('crypto');

Recall that internalBinding is a function that is set on the process object when it is set up on node.cc. This will be a call to Binding:

    (lldb) br s -f node.cc -l 2723
    static void Binding(const FunctionCallbackInfo<Value>& args) {
      Local<String> module = args[0]->ToString(env->isolate());
      node::Utf8Value module_v(env->isolate(), module);
      ...
    }
    (lldb) jlh module
    #crypto
    Local<Array> modules = env->module_load_list_array();
    (lldb) jlh modules
    0x16da0049c641: [JSArray]
     - map = 0x34afb7a04099 [FastProperties]
     - prototype = 0xe7de5189b01
     - elements = 0x39e4730d5221 <FixedArray[82]> [FAST_HOLEY_ELEMENTS]
     - length = 69
     - properties = 0x25494902241 <FixedArray[0]> {
        #length: 0x4ddc16a6369 <AccessorInfo> (const accessor descriptor)
     }
     - elements = 0x39e4730d5221 <FixedArray[82]> {
           0: 0x246944578d59 <String[18]: Binding contextify>
           1: 0x246944578d89 <String[15]: Binding natives>
           2: 0x246944578db1 <String[14]: Binding config>
           3: 0x246944578dd9 <String[19]: NativeModule events>
           4: 0x246944578e01 <String[18]: Binding async_wrap>
           5: 0x246944578e31 <String[11]: Binding icu>
           6: 0x246944578e59 <String[17]: NativeModule util>
           7: 0x246944578e81 <String[10]: Binding uv>
           8: 0x246944578ea9 <String[19]: NativeModule buffer>
           9: 0x246944578ed1 <String[14]: Binding buffer>
          10: 0x246944578ef9 <String[12]: Binding util>
          11: 0x246944578f21 <String[26]: NativeModule internal/util>
          12: 0x246944578f49 <String[28]: NativeModule internal/errors>
          13: 0x246944578f71 <String[17]: Binding constants>
          14: 0x246944578fa1 <String[28]: NativeModule internal/buffer>
          15: 0x246944578fc9 <String[19]: NativeModule timers>
          16: 0x246944578ff1 <String[18]: Binding timer_wrap>
          17: 0x246944579021 <String[32]: NativeModule internal/linkedlist>
          18: 0x246944579049 <String[24]: NativeModule async_hooks>
          19: 0x246944579071 <String[19]: NativeModule assert>
          20: 0x246944579099 <String[29]: NativeModule internal/process>
          21: 0x2469445790c1 <String[37]: NativeModule internal/process/warning>
          22: 0x2469445790e9 <String[39]: NativeModule internal/process/next_tick>
          23: 0x246944579111 <String[38]: NativeModule internal/process/promises>
          24: 0x246944579139 <String[35]: NativeModule internal/process/stdio>
          25: 0x246944579161 <String[25]: NativeModule internal/url>
          26: 0x246944579189 <String[33]: NativeModule internal/querystring>
          27: 0x2469445791b1 <String[24]: NativeModule querystring>
          28: 0x2469445791d9 <String[11]: Binding url>
          29: 0x246944579201 <String[17]: NativeModule path>
          30: 0x246944579229 <String[19]: NativeModule module>
          31: 0x246944579251 <String[28]: NativeModule internal/module>
          32: 0x246944579279 <String[15]: NativeModule vm>
          33: 0x2469445792a1 <String[15]: NativeModule fs>
          34: 0x39e4730e0651 <String[10]: Binding fs>
          35: 0x39e4730e0679 <String[19]: NativeModule stream>
          36: 0x39e4730e06a1 <String[36]: NativeModule internal/streams/legacy>
          37: 0x39e4730e06c9 <String[29]: NativeModule _stream_readable>
          38: 0x39e4730e06f1 <String[40]: NativeModule internal/streams/BufferList>
          39: 0x39e4730e0719 <String[37]: NativeModule internal/streams/destroy>
          40: 0x39e4730e0741 <String[29]: NativeModule _stream_writable>
          41: 0x39e4730e0769 <String[27]: NativeModule _stream_duplex>
          42: 0x39e4730e0791 <String[30]: NativeModule _stream_transform>
          43: 0x39e4730e07b9 <String[32]: NativeModule _stream_passthrough>
          44: 0x39e4730e07e1 <String[21]: Binding fs_event_wrap>
          45: 0x39e4730e0811 <String[24]: NativeModule internal/fs>
          46: 0x39e4730e0839 <String[17]: Binding inspector>
          47: 0x21515bc747c9 <String[15]: NativeModule os>
          48: 0x21515bc747f1 <String[10]: Binding os>
          49: 0x21515bc74819 <String[26]: NativeModule child_process>
          50: 0x21515bc74841 <String[18]: Binding spawn_sync>
          51: 0x21515bc74871 <String[17]: Binding pipe_wrap>
          52: 0x21515bc748a1 <String[35]: NativeModule internal/child_process>
          53: 0x21515bc748c9 <String[27]: NativeModule string_decoder>
          54: 0x21515bc748f1 <String[16]: NativeModule net>
          55: 0x21515bc74919 <String[25]: NativeModule internal/net>
          56: 0x21515bc74941 <String[18]: Binding cares_wrap>
          57: 0x21515bc74971 <String[16]: Binding tty_wrap>
          58: 0x21515bc74999 <String[16]: Binding tcp_wrap>
          59: 0x21515bc749c1 <String[19]: Binding stream_wrap>
          60: 0x21515bc749f1 <String[18]: NativeModule dgram>
          61: 0x21515bc74a19 <String[16]: Binding udp_wrap>
          62: 0x21515bc74a41 <String[20]: Binding process_wrap>
          63: 0x21515bc74a71 <String[33]: NativeModule internal/socket_list>
          64: 0x21515bc74a99 <String[20]: NativeModule console>
          65: 0x21515bc74ac1 <String[16]: NativeModule tty>
          66: 0x21515bc74ae9 <String[19]: Binding signal_wrap>
          67: 0x35756b2729e1 <String[16]: NativeModule tls>
          68: 0xd6549a0f479 <String[16]: NativeModule url>
       69-81: 0x25494902351 <the hole>
    }

A NativeModule is a module which has access to the module property and implemented in JavaScript. A Binding is something that does not have a module and only set exports.

    modules->Set(l, OneByteString(env->isolate(), buf));
    (lldb) p buf
    (char [1024]) $16 = "Binding crypto"
    node_module* mod = get_builtin_module(*module_v);
    (lldb) p *mod
    (node::node_module) $24 = {
      nm_version = 55
      nm_flags = 1
      nm_dso_handle = 0x0000000000000000
      nm_filename = 0x0000000101bc4053 "../src/crypto_impl/node_crypto.cc"
      nm_register_func = 0x0000000000000000
      nm_context_register_func = 0x000000010180f630 (node`node::crypto::InitCrypto(v8::Local<v8::Object>, v8::Local<v8::Value>, v8::Local<v8::Context>, void*) at node_crypto.cc:6230)
      nm_modname = 0x0000000101baffde "crypto"
      nm_priv = 0x0000000000000000
      nm_link = 0x00000001027c54c0
   }

In this case I'm actually documenting and troubleshooting which is the reason for the strange path names to the source files.

     if (mod != nullptr) {
       exports = Object::New(env->isolate());
       // Internal bindings don't have a "module" object, only exports.
       CHECK_EQ(mod->nm_register_func, nullptr);
       CHECK_NE(mod->nm_context_register_func, nullptr);
       Local<Value> unused = Undefined(env->isolate());
       mod->nm_context_register_func(exports, unused, env->context(), mod->nm_priv);
       cache->Set(module, exports);

So we check that the nm_register_func is not set but we should have a context aware register node module function but set the exports instance to Undefined. Next mod->nm_context_register_func is called which was configured in node_crypto.cc:

    NODE_MODULE_CONTEXT_AWARE_BUILTIN(crypto, node::crypto::InitCrypto)

So InitCrypto will be the function we end up in which has the call to SecureContext::Initialize(env, target); which I wanted to know how it ended up there.

InitCrypto

This will intialize the following:

    SecureContext::Initialize(env, target);
    Connection::Initialize(env, target); // SSLConnection
    CipherBase::Initialize(env, target);
    DiffieHellman::Initialize(env, target);
    ECDH::Initialize(env, target);
    Hmac::Initialize(env, target);
    Hash::Initialize(env, target);
    Sign::Initialize(env, target);
    Verify::Initialize(env, target); 

Should the constructors for these be checking that they are called with new? Even if these are internal it might be possible to call them using:

    const binding = process.binding('crypto');
    //const h = new binding.Hmac();
    const h = binding.Hmac();

Builtins/Constants/Natives

Using process.binding you can load builtin modules, the constant module, or native modules. The builtin modules are those that are initialized by the loader, for example:

    NODE_MODULE_CONTEXT_AWARE_BUILTIN(config, node::InitConfig)

The above module would then be loaded using process.binding('config'). If you look at the Binding function is src/node.cc you can see that there are two special cases if the module is not found as a builtin. One if the name passed in is constants and one if the name passed in is natives.

constants

Just a note here about how src/node_constant.cc is loaded as there is no NODE_MOUDULE_CONTEXT_AWARE_BUILTIN macro or anything like that. Instead this will be loaded when:

    process.binding('constants').

Which like mentioned earlier in this document this will invoke Binding (in node.cc) and there is a special clause for constants:

    } else if (!strcmp(*module_v, "constants")) {
      exports = Object::New(env->isolate());
      CHECK(exports->SetPrototype(env->context(), Null(env->isolate())).FromJust());
      DefineConstants(env->isolate(), exports);
      cache->Set(module, exports);

DefineConstants:

    DefineErrnoConstants(err_constants);
    DefineWindowsErrorConstants(err_constants);
    DefineSignalConstants(sig_constants);
    DefineUVConstants(os_constants);
    DefineSystemConstants(fs_constants);
    DefineOpenSSLConstants(crypto_constants);
    DefineCryptoConstants(crypto_constants);
    DefineZlibConstants(zlib_constants);

    os_constants->Set(OneByteString(isolate, "errno"), err_constants);
    os_constants->Set(OneByteString(isolate, "signals"), sig_constants);
    target->Set(OneByteString(isolate, "os"), os_constants);
    target->Set(OneByteString(isolate, "fs"), fs_constants);
    target->Set(OneByteString(isolate, "crypto"), crypto_constants);
    target->Set(OneByteString(isolate, "zlib"), zlib_constants);

Natives

When you see:

    NativeModule._source = process.binding('natives');

Similar to when process.binding('constants') is used there is a special clause in Binding for this:

    } else if (!strcmp(*module_v, "natives")) {
      exports = Object::New(env->isolate());
      DefineJavaScript(env, exports);
      cache->Set(module, exports);

DefineJavaScript is declared in src/node_javascript.h but as you might recall there is no implementation in the source code tree for this header the source file is generated using the JavaScript source files and config.gypi that is generated by configure:

    'action_name': 'node_js2c',
    'process_outputs_as_sources': 1,
    'inputs': [
      '<@(library_files)',
      './config.gypi',
    ],
    'outputs': [
      '<(SHARED_INTERMEDIATE_DIR)/node_javascript.cc',
    ...

So we can see that the JavaScript library files are included and config.gypi.

If you call process.binding('config') what will be returned will be a builtin module as there is one registered:

    NODE_MODULE_CONTEXT_AWARE_BUILTIN(config, node::InitConfig)

But this does not populate the process objects config variables which you might think. Instead that is done in node_bootstrap.js:

    const _process = NativeModule.require('internal/process');
    _process.setupConfig(NativeModule._source);

When I was working on decoupling OpenSSL from node.cc I missed out the macros that are used to conditionally set various settings. For example in GetFeatures we have:

    #ifdef SSL_CTRL_SET_TLSEXT_SERVERNAME_CB
      Local<Boolean> tls_sni = True(env->isolate());
    #else
      Local<Boolean> tls_sni = False(env->isolate());
    #endif
      obj->Set(FIXED_ONE_BYTE_STRING(env->isolate(), "tls_sni"), tls_sni);

Fix formatting in src/node_crypto.cc X509ToObject

TLS OpenSSL 1.1.0 Issue

I'm tryin to make test/parallel/test-tls-ecdh-disable.js to be used with both OpenSSL 1.0.x and 1.1.x. The issue is that the test successfully listens and the mustNotCall function is called which is the callback passed to the listener event handler registration.

The cipher in use is ECDHE-RSA-AES128-GCM-SHA256. We can check what ciphers are supported by using the following command:

    $ ~/work/security/build_1_1_0f/bin/openssl ciphers -v 'ECDH+AESGCM:DH+AESGCM:ECDH+AES256:DH+AES256:ECDH+AES128:DH+AES:RSA+AESGCM:RSA+AES:!aNULL:!MD5:!DSS'
    ECDHE-ECDSA-AES256-GCM-SHA384 TLSv1.2 Kx=ECDH     Au=ECDSA Enc=AESGCM(256) Mac=AEAD
    ECDHE-RSA-AES256-GCM-SHA384 TLSv1.2 Kx=ECDH     Au=RSA  Enc=AESGCM(256) Mac=AEAD
    ECDHE-ECDSA-AES128-GCM-SHA256 TLSv1.2 Kx=ECDH     Au=ECDSA Enc=AESGCM(128) Mac=AEAD

    ECDHE-RSA-AES128-GCM-SHA256 TLSv1.2 Kx=ECDH     Au=RSA  Enc=AESGCM(128) Mac=AEAD <----

    DHE-RSA-AES256-GCM-SHA384 TLSv1.2 Kx=DH       Au=RSA  Enc=AESGCM(256) Mac=AEAD
    DHE-RSA-AES128-GCM-SHA256 TLSv1.2 Kx=DH       Au=RSA  Enc=AESGCM(128) Mac=AEAD
    ECDHE-ECDSA-AES256-CCM8 TLSv1.2 Kx=ECDH     Au=ECDSA Enc=AESCCM8(256) Mac=AEAD
    ECDHE-ECDSA-AES256-CCM  TLSv1.2 Kx=ECDH     Au=ECDSA Enc=AESCCM(256) Mac=AEAD
    ECDHE-ECDSA-AES256-SHA384 TLSv1.2 Kx=ECDH     Au=ECDSA Enc=AES(256)  Mac=SHA384
    ECDHE-RSA-AES256-SHA384 TLSv1.2 Kx=ECDH     Au=RSA  Enc=AES(256)  Mac=SHA384
    ECDHE-ECDSA-AES256-SHA  TLSv1 Kx=ECDH     Au=ECDSA Enc=AES(256)  Mac=SHA1
    ECDHE-RSA-AES256-SHA    TLSv1 Kx=ECDH     Au=RSA  Enc=AES(256)  Mac=SHA1
    DHE-RSA-AES256-CCM8     TLSv1.2 Kx=DH       Au=RSA  Enc=AESCCM8(256) Mac=AEAD
    DHE-RSA-AES256-CCM      TLSv1.2 Kx=DH       Au=RSA  Enc=AESCCM(256) Mac=AEAD
    DHE-RSA-AES256-SHA256   TLSv1.2 Kx=DH       Au=RSA  Enc=AES(256)  Mac=SHA256
    DHE-RSA-AES256-SHA      SSLv3 Kx=DH       Au=RSA  Enc=AES(256)  Mac=SHA1
    ECDHE-ECDSA-AES128-CCM8 TLSv1.2 Kx=ECDH     Au=ECDSA Enc=AESCCM8(128) Mac=AEAD
    ECDHE-ECDSA-AES128-CCM  TLSv1.2 Kx=ECDH     Au=ECDSA Enc=AESCCM(128) Mac=AEAD
    ECDHE-ECDSA-AES128-SHA256 TLSv1.2 Kx=ECDH     Au=ECDSA Enc=AES(128)  Mac=SHA256
    ECDHE-RSA-AES128-SHA256 TLSv1.2 Kx=ECDH     Au=RSA  Enc=AES(128)  Mac=SHA256
    ECDHE-ECDSA-AES128-SHA  TLSv1 Kx=ECDH     Au=ECDSA Enc=AES(128)  Mac=SHA1
    ECDHE-RSA-AES128-SHA    TLSv1 Kx=ECDH     Au=RSA  Enc=AES(128)  Mac=SHA1
    DHE-RSA-AES128-CCM8     TLSv1.2 Kx=DH       Au=RSA  Enc=AESCCM8(128) Mac=AEAD
    DHE-RSA-AES128-CCM      TLSv1.2 Kx=DH       Au=RSA  Enc=AESCCM(128) Mac=AEAD
    DHE-RSA-AES128-SHA256   TLSv1.2 Kx=DH       Au=RSA  Enc=AES(128)  Mac=SHA256
    DHE-RSA-AES128-SHA      SSLv3 Kx=DH       Au=RSA  Enc=AES(128)  Mac=SHA1
    AES256-GCM-SHA384       TLSv1.2 Kx=RSA      Au=RSA  Enc=AESGCM(256) Mac=AEAD
    AES128-GCM-SHA256       TLSv1.2 Kx=RSA      Au=RSA  Enc=AESGCM(128) Mac=AEAD
    AES256-CCM8             TLSv1.2 Kx=RSA      Au=RSA  Enc=AESCCM8(256) Mac=AEAD
    AES256-CCM              TLSv1.2 Kx=RSA      Au=RSA  Enc=AESCCM(256) Mac=AEAD
    AES128-CCM8             TLSv1.2 Kx=RSA      Au=RSA  Enc=AESCCM8(128) Mac=AEAD
    AES128-CCM              TLSv1.2 Kx=RSA      Au=RSA  Enc=AESCCM(128) Mac=AEAD
    AES256-SHA256           TLSv1.2 Kx=RSA      Au=RSA  Enc=AES(256)  Mac=SHA256
    AES128-SHA256           TLSv1.2 Kx=RSA      Au=RSA  Enc=AES(128)  Mac=SHA256
    AES256-SHA              SSLv3 Kx=RSA      Au=RSA  Enc=AES(256)  Mac=SHA1
    AES128-SHA              SSLv3 Kx=RSA      Au=RSA  Enc=AES(128)  Mac=SHA1

Zlib

The version we are using on Fedora is:

     zlib: '1.2.8'

The version building locally is:

     zlib: '1.2.11',

The issue I'm seeing is that test/parallel/test-zlib-failed-init.js passes as expected when using version 1.2.11 but fails when using 1.2.8. The change log for zlib can be found here: http://zlib.net/ChangeLog.txt

    Changes in 1.2.9 (31 Dec 2016)
    ...
    - Reject a window size of 256 bytes if not using the zlib wrapper
    ...

ICU

We are currently building using --with-intl=system-icu which gives the following icu version:

    icu: '57.1'

When I build locally and without any icu flags the version is:

The issue I'm seeing is that test/parallel/test-icu-data-dir.js is failing:

    assert.js:60
    throw new errors.AssertionError({
    ^

    AssertionError [ERR_ASSERTION]: false == true
      at Object.<anonymous> (/root/rpmbuild/BUILD/node-v8.1.0/test/parallel/test-icu-data-dir.js:13:3)


    const child = spawnSync(process.execPath, ['--icu-data-dir=/', '-e', '0']);
    assert(child.stderr.toString().includes(expected));

Fips

When using the bundled/deps version of OpenSSL and the just building without any configuration options, OpenSSL FIPS support is not enabled. The test test/parallel/test-crypto-fips.js performs a number of checks and assumes the above default configuration. I'm running into an issue when configuring Node using --shared-openssl and the system version of OpenSSL supports FIPS (but I'm not setting anything FIPS related when configuring only the shared includes and shared lib directory. When I do this the mentioned test fails when trying to toggle fips_mode using a OpenSSL configuration file.

    [[email protected] node-v8.1.0]# openssl version
    OpenSSL 1.0.2k-fips  26 Jan 2017    
out/Release/node test/parallel/test-icu-data-dir.js:
    Spawned child [pid:9757] with cmd 'require("crypto").fips' expect 0 with args '--openssl-config=/root/rpmbuild/BUILD/node-v8.1.0/test/fixtures/openssl_fips_enabled.cnf' OPENSSL_CONF=undefined
    assert.js:60
      throw new errors.AssertionError({
      ^

    AssertionError [ERR_ASSERTION]: 0 === 1
        at responseHandler (/root/rpmbuild/BUILD/node-v8.1.0/test/parallel/test-crypto-fips.js:56:14)

You might have to manually remove config_fips.gypi when you want to reconfigure fips.

Building the bundled openssl with fips

    $ ./configure --openssl-fips=/Users/danielbevenius/work/security/build_1_0_2k/
    $ out/Release/node
    > process.versions.openssl
    '1.0.2l-fips'

Node N-API

Lets start from the beginning and look at the initialization of a napi addons:

NAPI_MODULE(NODE_GYP_MODULE_NAME, Init)

This macro is defined in src/node_api.h:

#define NAPI_MODULE_X(modname, regfunc, priv, flags)                  \
  EXTERN_C_START                                                      \
    static napi_module _module =                                      \
    {                                                                 \
      NAPI_MODULE_VERSION,                                            \
      flags,                                                          \
      __FILE__,                                                       \
      regfunc,                                                        \
      #modname,                                                       \
      priv,                                                           \
      {0},                                                            \
    };                                                                \
    NAPI_C_CTOR(_register_ ## modname) {                              \
      napi_module_register(&_module);                                 \
    }                                                                 \
  EXTERN_C_END

#define NAPI_MODULE(modname, regfunc)                                 \
  NAPI_MODULE_X(modname, regfunc, NULL, 0)  // NOLINT (readability/null_usage)

This is basically the same as what we have seen for a normal addons and would expend to:

extern "C" {
  static napi_module _module = {
      NAPI_MODULE_VERSION,
      flags,
      __FILE__,
      Init,
      NODE_GYP_MODULE_NAME,
      priv,
      {0},
  };
  static void _register_NODE_GYP_MODULE_NAME(void) __attribute__((constructor));
  static void _register_NODE_GYP_MODULE_NAME(void) {
      napi_module_register(&_module);
  }
}

It's run when a shared library is loaded.

void napi_module_register(napi_module* mod) {
  node::node_module* nm = new node::node_module {
    -1,
    mod->nm_flags,
    nullptr,
    mod->nm_filename,
    nullptr,
    napi_module_register_cb,
    mod->nm_modname,
    mod,  // priv
    nullptr,
  };
  node::node_module_register(nm);
}

Let's set a break point on NAPI_MODULE_INIT and take a closer look at it:

$ ./configure --debug && make -j8
$ make build-addons-napi
$ lldb -- ./node test/addons-napi/1_hello_world/test.js
(lldb) br s -f binding.c -l 13
Breakpoint 1: no locations (pending).
WARNING:  Unable to resolve breakpoint to any actual locations.
(lldb) r

Doing a back trace we will see

(lldb) bt 10
* thread #1, queue = 'com.apple.main-thread', stop reason = breakpoint 1.1
  * frame #0: 0x00000001027adc5b binding.node`_register_binding at binding.c:13
    frame #1: 0x00000001026eeac6 dyld`ImageLoaderMachO::doModInitFunctions(ImageLoader::LinkContext const&) + 420
    frame #2: 0x00000001026eecf6 dyld`ImageLoaderMachO::doInitialization(ImageLoader::LinkContext const&) + 40
    frame #3: 0x00000001026ea218 dyld`ImageLoader::recursiveInitialization(ImageLoader::LinkContext const&, unsigned int, char const*, ImageLoader::InitializerTimingList&, ImageLoader::UninitedUpwards&) + 330
    frame #4: 0x00000001026e934e dyld`ImageLoader::processInitializers(ImageLoader::LinkContext const&, unsigned int, ImageLoader::InitializerTimingList&, ImageLoader::UninitedUpwards&) + 134
    frame #5: 0x00000001026e93e2 dyld`ImageLoader::runInitializers(ImageLoader::LinkContext const&, ImageLoader::InitializerTimingList&) + 74
    frame #6: 0x00000001026dd3e5 dyld`dyld::runInitializers(ImageLoader*) + 82
    frame #7: 0x00000001026e60a8 dyld`dlopen + 527
    frame #8: 0x00007fff6704fd86 libdyld.dylib`dlopen + 86
    frame #9: 0x000000010003b238 node`node::DLOpen(v8::FunctionCallbackInfo<v8::Value> const&) [inlined] node::DLib::Open() at node.cc:1158 [opt]
    frame #10: 0x000000010003b224 node`node::DLOpen(args=<unavailable>) at node.cc:1253 [opt]

The example we are looking at is test/addons-napi/1_hello_world and

const binding = require(`./build/${common.buildType}/binding`);

Notice that this does not have a file extension, and one of the file extensions that will be tried is .node. which will be test/addons-napi/1_hello_world/build/Debug/binding.node.

From lib/internal/modules/cjs/loader.js:

// Native extension for .node
Module._extensions['.node'] = function(module, filename) {
  return process.dlopen(module, path.toNamespacedPath(filename));
};

Each addon is a dynamical library which is loaded by LDOpen in src/node.cc

NAPI_MODULE_INIT() {
  napi_property_descriptor desc = DECLARE_NAPI_PROPERTY("hello", Method);
  NAPI_CALL(env, napi_define_properties(env, exports, 1, &desc));
  return exports;

Note that DECLARE_NAPI_PROPERTY is a macro defined in test/common.h and just initializes the struct with:

  napi_property_descriptor desc = { "hello", 0, Method, 0, 0, 0, napi_default, 0);

The napi_property_descriptor struct look like this:

typedef struct {
  // One of utf8name or name should be NULL.
  const char* utf8name;
  napi_value name;

  napi_callback method;
  napi_callback getter;
  napi_callback setter;
  napi_value value;

  napi_property_attributes attributes;
  void* data;
} napi_property_descriptor;

napi_callback is a function pointer, which is getting set to 0. Why is this not set to NULL?

typedef napi_value (*napi_callback)(napi_env env, napi_callback_info info);

napi_default is an enum in src/node_api_types.h:

typedef enum {
  napi_default = 0,
  napi_writable = 1 << 0,
  napi_enumerable = 1 << 1,
  napi_configurable = 1 << 2,

  // Used with napi_define_class to distinguish static properties
  // from instance properties. Ignored by napi_define_properties.
  napi_static = 1 << 10,
} napi_property_attributes;

All JavaScript values are abstracted behind an opaque type named napi_value. This is a way hide details from the using code and they don't have access to the actual implementation of the struct. In napi_value's case it is just a pointer and no additional information exist, but for other types like napi_env:

typedef struct napi_env__* napi_env;

napi_env__ contains information like the Isolate, the node::Environment for example and is defined in src/node_api.cc which is the implementation. So the while you can see the contents of the struct in the debugger trying to access any member information will get an error like the following:

../binding.c:7:6: error: incomplete definition of type 'struct napi_env__'
  env->isolate;
  ~~~^
/Users/danielbevenius/work/nodejs/node-poc/src/node_api_types.h:13:16: note: forward declaration of 'struct napi_env__'
typedef struct napi_env__* napi_env;
               ^
1 error generated.

The way these types are used is by passing them to functions that operate on them, and those functions have access to the internal type definitions. This is also what OpenSSL does for it's type system.

FIXED_ONE_BYTE_STRING

This is a macro which can be found in src/util.h:

  // Used to be a macro, hence the uppercase name.
  template <int N>
  inline v8::Local<v8::String> FIXED_ONE_BYTE_STRING(
    v8::Isolate* isolate,
    const char(&data)[N]) {
    return OneByteString(isolate, data, N - 1);
  }

const char(&data) is the same as const char&[] const char (&)[] So we are passing in the string which recieved as a const char* in the OneByteString function. This is then reinterpreted to const uint8_t*.

OneByteString: the string's bytes are not UTF-8 encoded, can only contain characters in the first 256 unicode code points TwoByteString: Same, but uses two bytes for each character (using surrogate pairs to represent unicode characters that can't be represented in two bytes).

Binary Compatability

Is when a program is linked dynmically to a former version and does not have to be recompiled. If a program needs to be recompiled to run with a new version of library but doesn't require any further modifications, the library is source compatible.

Application Binary Interface (ABI)

An ABI defines the structures and methods used to access external, already compiled libraries/code at the level of machine code. The headers decsribing classes, functions etc are compiled to a set of addresses and expected parameters, memory structure sizes and layout. The application using the ABI must be compiled so that these addresses, the expected paramters, memory layout etc match those that that the ABI provider provided. Changes to the header files can be made but must be done carfully to ensure that the compability is not changed, so that the addresses produced when compiling are not changed which would mean that exising users of the compiled code would become incompatible.

Keeping an ABI stable means not changing function interfaces (return type and number, types, and order of arguments), definitions of data types or data structures, defined constants, etc. New functions and data types can be added, but existing ones must stay the same.

Making a namespace change in C++ will generate a different mangled name so the symbol in the object file will be different.

In Node there is NODE_MODULE_VERSION which is defined in node_version.h

C++ Lint task

Verify that functions that return pointers have the pointer operator in the correct place (to the left).

The rules can be found in 'tools/cpplint.py' and can be run using:

    tools/cpplint.py files

The script will run through the files and for each one call ProcessFile which does some checking and then calls ProcessFileData and later ProcessLine. Not that you can pass in extra_check_functions which might become handy.

NDEBUG

I've seen this in couple of places in the node source code and did not know about it. NDBUG is used to toggle/control whether the assert macro will expand into something that will perform a check or not.

From <assert.h>:

    #ifdef NDEBUG
    #define assert(condition) ((void)0)
    #else
    #define assert(condition) 
    #endif

assert

This macro is disabled if, at the moment of including <assert.h>, a macro with the name NDEBUG has already been defined. This allows for a coder to include as many assert calls as needed in a source code while debugging the program and then disable all of them for the production version by simply including a line like:

    #define NDEBUG 

If the NDEBUG macro is defined when <assert.h> is included the assets are disabled. While looking into adding a addons test I noticed that at-exit undefined NDEBUG but not the other tests that use assert. As far as I can tell there is no need to undefine this and non of the other tests do. This commit removes the undef for consistency.

Building a addons

Change to the directory of the addons and then you can rebuild using:

    $ out/Debug/node deps/npm/node_modules/node-gyp/bin/node-gyp.js rebuild --directory=test/addons/at-exit --nodedir=../../../

On windows you can just copy the command from the output from, for example:

    "C:\\Users\\danbev\\working\\node\\Release\\node.exe" "C:\\Users\\danbev\\working\\node\\deps\\npm\\node_modules\\node-gyp\\bin\\node-gyp" "rebuild" "--directory=test\\addons\\openssl-client-cert-engine" "--nodedir=C:\\Users\\danbev\\working\\node"

Run an a set of JavaScript tests

You can specify the directory to run test, for example this would only run the async-hooks tests:

    $ python tools/test.py --mode=release -J async-hooks

Event Loop

It all starts with the javascript file to be executed, which is you main program.


    ------------> javascript.js ------+-----------------------------+ 
                                     /|\                            | setTimeout/SetInterval
                                      |                             | JavaScript callbacks --------------------------------------+
                                      |                             |                                                            |
                                      |                             | network/disk/child_processes                               |
                                      |                             | JavaScript callbacks --------------------------------------+
                                      |                             |                                                            |----> callback ------> nextTick callback ------------------------------+
                                      |                             | setImmedate                                                |                             /|\                                       |
                                      +                             | JavaScript callbacks --------------------------------------|                              |                                        |
                                       \                            |                                                            |                              +---- process resolved promises ---------+ 
                                        \                           | close events                                               |                    
                                         \                          | JavaScript callbacks --------------------------------------|
                                          \                         |
                                           \                        |
    <----------- process.exit (event) <-----------------------------+

Where is the first interaction with libuv in node? There very first call is (in node.cc):

    argv = uv_setup_args(argc, argv);

But this does not do anything with the event loop. For that we have to look at Init:

    prog_start_time = static_cast<double>(uv_now(uv_default_loop()));

This will land us in uv_common.c:

    uv_loop_t* uv_default_loop(void) {
     if (default_loop_ptr != NULL)
       return default_loop_ptr;
   
     if (uv_loop_init(&default_loop_struct))
    (lldb) p default_loop_ptr
    (uv_loop_t *) $4 = 0x0000000000000000

So we can see that this is the first call to uv_default_loop so lets take a closer look at uv_loop_init in src/unix/loop.c (in libuv that is):

    uv__signal_global_once_init();

Which will call:

    uv_once(&uv__signal_global_init_guard, uv__signal_global_init);

I just skimmed the code but this looks like setting up fork handlers to reset signals. uv_once uses pthread_once. After this we will be back in loop.c:

    heap_init((struct heap*) &loop->timer_heap);

Where are initializing a min heap data structur for timers.

    QUEUE_INIT(&loop->wq)

wq is a work queue (include/uv-threadpool.h):

    void* wq[2];

What does the macro QUEUE_INIT do? It is defined in src/queue.h as:

    QUEUE_NEXT(q) = (q);                                                      \
    QUEUE_PREV(q) = (q);

    typedef void *QUEUE[2];
    ...
    #define QUEUE_NEXT(q)       (*(QUEUE **) &((*(q))[0]))

This will set &loop->wp[0]

And the same goes for handle_queue and active_reqs which will be initialized using QUEUE_INIT as well:

    void* handle_queue[2];
    void* active_reqs[2];

    QUEUE_INIT(&loop->active_reqs);
    QUEUE_INIT(&loop->idle_handles);

    ...
    err = uv__platform_loop_init(loop);

uv__platform_loop_init can be found in src/unix/darwin.c:

    if (uv__kqueue_init(loop)) 

uv__kqueue_init can be found in src/unix/kqueue.c:

    loop->backend_fd = kqueue()

Resolve promises

After calling the callback provided by the user, a smaller loop will check if there are any resolved promises.

Where is this done? If we take a look at lib/internal/bootstrap.js and the start function we find the following line:

    NativeModule.require('internal/process/next_tick').setup();

And lib/internal/process/next_tick.js:

    exports.setup = setupNextTick;

setupNextTick then does:

    const promises = require('internal/process/promises');
    ...
    const emitPendingUnhandledRejections = promises.setup(scheduleMicrotasks);

promises.setup will call process._setupPromises:

      process._setupPromises(function(event, promise, reason) {

Notice that this call takes an anonymous function as its callback. _setupPromises is configured in node.cc in the SetupProcessObject function:

    env->SetMethod(process, "_setupPromises", SetupPromises);

Lets set a break point in SetupPromis and see what is going on:

    (lldb) br s -f node.cc -l 1278
    (lldb) r
    isolate->SetPromiseRejectCallback(PromiseRejectCallback);

So an isolate has a promise_reject_callback_ which is being set here to PromiseRejectCallback. This is what V8 will call if a promise is rejected.

    env->set_promise_reject_function(args[0].As<Function>());

Remember _setupPromises was passed an anonymous function as its callback argument which is what is being set as the set_promise_reject_function on the evnvironment instance. This will then be used by PromiseRejectCallback:

    Local<Function> callback = env->promise_reject_function();
    ...
    callback->Call(process, arraysize(args), args);

Next things that happens in SetupPromises is that _setupPromises is deleted from the process object:

    env->process_object()->Delete(
      env->context(),
      FIXED_ONE_BYTE_STRING(args.GetIsolate(), "_setupPromises")).FromJust();

This way it cannot be called again.

Back in JavaScript (setupNextTick) now we have:

    var _runMicrotasks = {};
    ...
    const tickInfo = process._setupNextTick(_tickCallback, _runMicrotasks);

process._setupNextTick is also configured in node.cc:

    env->SetMethod(process, "_setupNextTick", SetupNextTick);

The arguments to SetupNextTick will be:

    CHECK(args[0]->IsFunction());  // _tickCallback
    CHECK(args[1]->IsObject());    // _runMicrotasks which is just an empty object at this point.

    env->set_tick_callback_function(args[0].As<Function>());
    env->SetMethod(args[1].As<Object>(), "runMicrotasks", RunMicrotasks);

RunMicroTasks does:

    void RunMicrotasks(const FunctionCallbackInfo<Value>& args) {
      args.GetIsolate()->RunMicrotasks();
    }

So we are setting a method on the _runMicrotasks object named runMicrotasks. Next _setupNextTick is removed from the process object. Then we have:

    uint32_t* const fields = env->tick_info()->fields();
    uint32_t const fields_count = env->tick_info()->fields_count();
    Local<ArrayBuffer> array_buffer = ArrayBuffer::New(env->isolate(), fields, sizeof(*fields) * fields_count);
    args.GetReturnValue().Set(Uint32Array::New(array_buffer, 0, fields_count));
    const p = new Promise((resolve, reject) => {
      resolve('ok');
    });

So how are promises created in V8?

    (lldb) br s -f isolate.cc -l 1862

This will break in isolate.cc PushPromise:

    void Isolate::PushPromise(Handle<JSObject> promise) {
      ThreadLocalTop* tltop = thread_local_top();
      PromiseOnStack* prev = tltop->promise_on_stack_;
      Handle<JSObject> global_promise = global_handles()->Create(*promise);
      tltop->promise_on_stack_ = new PromiseOnStack(global_promise, prev);
    }    

Lets take a closer look at new PromiseOnStack:

    class PromiseOnStack {
     public:
      PromiseOnStack(Handle<JSObject> promise, PromiseOnStack* prev)
        : promise_(promise), prev_(prev) {}
      Handle<JSObject> promise() { return promise_; }
      PromiseOnStack* prev() { return prev_; }

     private:
      Handle<JSObject> promise_;
      PromiseOnStack* prev_;
    }; 

So that looks simple enough, it has a promise and a pointer to the previous promise. The callback passed to Promise (which takes a resolve, reject), when is it called? This is part of a constructor call so it would be executed straight way I think, like any constructor.

PromiseRejectCallback
    void PromiseRejectCallback(PromiseRejectMessage message) {
      Local<Promise> promise = message.GetPromise();
      Isolate* isolate = promise->GetIsolate();
      Local<Value> value = message.GetValue();
      Local<Integer> event = Integer::New(isolate, message.GetEvent());

      Environment* env = Environment::GetCurrent(isolate);
      Local<Function> callback = env->promise_reject_function();

      if (value.IsEmpty())
        value = Undefined(isolate);

      Local<Value> args[] = { event, promise, value };
      Local<Object> process = env->process_object();

      callback->Call(process, arraysize(args), args);
  }

After a module has been loaded in Module.runMain, the process._tickCallback function will first process all the callbacks in the nextTickQueue and then call _runMicrotasks().

nextTick

After resolving all the promises and callbacks added using nextTick will be called.

    // bootstrap main module.
    Module.runMain = function() {
      // Load the main module--the command line argument.
      Module._load(process.argv[1], null, true);
      // Handle any nextTicks added in the first tick of the program
      process._tickCallback();
    };

Libuv Thread pool

The following modules use the thread pool:

  • fs.*
  • dns.lookup This is because when using a hostname to lookup there are operating system specific tasks that have to be performed, like on unix /etc/hosts should be used etc.
  • pipes
  • crypto (crypto.randomBytes(), and crypto.pbkdf2())
  • http.get/request() is called with a name in which case dns.loopup() is used
  • any c++ addons that uses it

The follwing modules use the Kernel:

  • tcp/udp sockets, servers
  • unix domain sockets, servers
  • pipes
  • tty input
  • dns.resolveXXX

The default size of the thread pool is 4 (uv_threadpool_size). Just to be clear, the libuv Event Loop is single threaded. The thread pool is used for file I/O operations.

SEGV_MAPERR

Is a segmentation fault, which is an invalid memory access. This can happen when trying to access a page that is not mapped into the address space of the running process. This can happen when dereferencing a null pointer or a pointer that was corrupted with a small integer value.

core dump

Make sure that you specified enough room for a core dump:

    $ ulimit -c unlimited

If possible compile the executable with debugging options. The example I'm using for this is a case where
cctest on arm7 produces a core dump:
```console
    
    [----------] 2 tests from EnvironmentTest
    [ RUN      ] EnvironmentTest.AtExitWithEnvironment
    [       OK ] EnvironmentTest.AtExitWithEnvironment (114 ms)
    [ RUN      ] EnvironmentTest.AtExitWithArgument
    Received signal 11 SEGV_MAPERR 000007a0547c

    ==== C stack trace ===============================

    [end of stack trace]
    Segmentation fault (core dumped)
    $ gdb out/Debug/cctest qemu_cctest_20170809-174708_18638.core

    Reading symbols from out/Release/cctest...done.
    [New LWP 18638]
    [New LWP 18644]
    [New LWP 18645]
    [New LWP 18646]
    [Thread debugging using libthread_db enabled]
    Using host libthread_db library "/lib/arm-linux-gnueabihf/libthread_db.so.1".
    Core was generated by `out/Release/cctest'.
    #0  0x005e4f8c in v8::Context::Exit() ()

    (gdb) bt
    #0  0x005e4f8c in v8::Context::Exit() ()
    #1  0x01033d94 in node::Environment::Start(int, char const* const*, int, char const* const*, bool) ()
    #2  0x010395f4 in node::CreateEnvironment(node::IsolateData*, v8::Local<v8::Context>, int, char const* const*, int, char const* const*) ()
    #3  0x00e6087c in EnvironmentTest_AtExitWithArgument_Test::TestBody() ()
    #4  0x00eafb12 in testing::Test::Run() ()
    #5  0x00eafcfc in testing::TestInfo::Run() [clone .part.402] ()
    #6  0x00eafde0 in testing::TestCase::Run() [clone .part.403] ()
    #7  0x00eb15b0 in testing::internal::UnitTestImpl::RunAllTests() [clone .part.407] ()
    #8  0x00eb17e6 in testing::UnitTest::Run() ()
    #9  0x004302dc in main ()

I was not able to reproduce this issue when compiling with debugging symbols (./configure --debug). Possible reasons for this? The debugger puts more on the stack and if you overwrite an arrays capacity it migth cause a segment fault in release mode but not in debug mode.

Trying to find out where v8::Context::Exit() is call when in node::Environment::Start. Upon entry there is the following:

    Context::Scope context_scope(context());

This is the context scope used and this is using RAII, and the descructor that will be called when this instance goes out of scope looks like this (in v8.h):

    V8_INLINE ~Scope() { context_->Exit(); }

Workaround issue with test/async-hooks/init-hooks.js

    $ launchctl unload -w /System/Library/LaunchAgents/com.apple.ReportCrash.plist
    $ sudo launchctl unload -w /System/Library/LaunchDaemons/com.apple.ReportCrash.Root.plist

LIKELY/UNLIKELY

If you take a look at the CHECK macro you will se that it is defined like this:

   
    #define CHECK(expr)                                                         \
    do {                                                                        \
      if (UNLIKELY(!(expr))) {                                                  \
        static const char* const args[] = { __FILE__, STRINGIFY(__LINE__),      \
                                            #expr, PRETTY_FUNCTION_NAME };      \
        node::Assert(&args);                                                    \
      }                                                                         \
    } while (0)

And UNLIKELY is defined as:

    #define UNLIKELY(expr) __builtin_expect(!!(expr), 0)

This is done to give the compiler a hint that we expect the value to be true so it can optimise for that case and not the opposite case.

Context

Allows separate unrelated JavaScript to run in an isolate without interfering with each other. What does that mean for a C++ function that is called by javascript. The context should be the one associated with the call.

Tracing

Tracing refers to tracing events that can be emitted by V8. V8 takes a --enable-tracing flag which enables tracing and will create a v8_trace.json file. This file can be opened in chrome and inspected using chrome://tracing

This can also be enabled in Node using --trace-events-enabled and you can specify categories to be traced using --trace-event-categories.

Microsoft Build Engine (MSBuild)

Is a build platform that allows for an xml configuration file to configure and run complex build systems. This is what is produced by gyp for windows (the project file I mean). So you can take a look at the project file for the specific gyp target and see what shows up where in it to troubleshoot configuration issues.

A target is like a procedure. A target can contain properties in a PropertiesGroup and tasks. A task is like a builtin function that performs an action, compiles, links, prints a message etc. A project is what hooks tasks together. A task can have a DefaultTargets attribute which are the targets that will be run if you specify the project file on the command line for msbuild.


    msbuild sample.vcxproj 

The target can also be specified on the command line:
```console
    msbuild sample.vcxproj /t:something
    msbuild sample.vcxproj /t:Clean;Build

In the xml you might come across something like: %(AdditionalOptions). This an iterate directive.

You can increase the level of information provided by msbuild by specifying the /verbosity setting: quit, minimal, normal, detailed, and diagnostic.

Options: /MP Compile multiple sources by using multiple processes /link passes the specified option to LINK

vcbuild.bat

This is the Windows batch script that is used on Windows systems.

    vcbuild.bat /help
%var%   replaced with the environment variable named 'var'.
%n      where n is a number from 0-9 and gives access to the n argument passed on the command line.
%*      all the arguments specified on the command line

You can place quotation marks around a string to avoid escaping. You can place caret (^), an escape character, immediately before the special characters The special characters that need quoting or escaping are usually <, >, |, &, and ^

echo "You & me"
echo You ^& me
cd %~dp0        this will to %~dp (the directory of) on %0 (the first command line argument). 
                So if this is run from a diferent directory this will switch to the directory where
                this script is.

if /i           case-insensitive, for example:

     if /i "%1"=="help" goto help

Configuring on windows:

    python configure --openssl-system-ca-path=PATH
    vcbuild noprojgen

Compiler

cl.exe is the the compiler.

    cl -help 

Scripting

`%~1`          removes quotes from the first command line argument

Use NUL to discard out put, as in comman > NUL

    IF "%var%"=="" (SET var=something)
    IF NOT DEFINED var (SET var=something)
    IF /I "%ERRORLEVEL%" NEQ "0" (
      echo "failed to execute something"
    )

Performace counters

Are used used to provide information as to how well the operating system or an application is performing.

DTrace

is supported on linux, solaris, mac, and bsd and is a tracing framework originally developed for Solaris by Sun. This is enabled by configuring with the --with-dtrace flag. There are three object files for dtrace in Node. These are: node_dtrace.o which is used by all systems that support dtrace
node_dtrace_ustack.o is not supported on linux or mac
node_dtrace_provider.o is supported on all but macosx
Note that the directory where node_dtrace_provider.o is generated differs so if you are working on a task that effects linking take that into account.

Event Tracing for Windows (ETW)

Is an efficient kernel level tracing mechanism allowing logging of kernel or application defined events to a log file. It also allows dynamic enable/disable so this can be performed on a running application.

An event provider writes events to an ETW session. Additional data is added by ETW like the time the event happened, the process that wrote it and the thread id, the processor number, the CPU usage data. This info is then available to event consumers which are applications that read the log file or the consumer can listen to realtime events and process them.

There is a condition in node.gypi that depends on node_etw which looks like this:


    [ 'node_use_etw=="true"', {
      'defines': [ 'HAVE_ETW=1' ],
      'dependencies': [ 'node_etw' ],
      'sources': [
        'src/node_win32_etw_provider.h',
        'src/node_win32_etw_provider-inl.h',
        'src/node_win32_etw_provider.cc',
        'src/node_dtrace.cc',
        'tools/msvs/genfiles/node_etw_provider.h',
        'tools/msvs/genfiles/node_etw_provider.rc',
      ]
    } ],

So remember that node_etw will be run first so lets take a look at it first.

    # generate ETW header and resource files
    {
      'target_name': 'node_etw',
      'type': 'none',
      'conditions': [
        [ 'node_use_etw=="true"', {
          'actions': [
            {
              'action_name': 'node_etw',
              'inputs': [ 'src/res/node_etw_provider.man' ],
              'outputs': [
                'tools/msvs/genfiles/node_etw_provider.rc',
                'tools/msvs/genfiles/node_etw_provider.h',
                'tools/msvs/genfiles/node_etw_providerTEMP.BIN',
              ],
              'action': [ 'mc <@(_inputs) -h tools/msvs/genfiles -r tools/msvs/genfiles' ]
            }
          ]
        } ]
      ]
    },

mc is the Message Compiler.
-h is where you want the generated header files to be placed.
-r is where you want the generated resource compiler script (.rc file) and the generated binary resource file (.bin)
The node_etw_providerTEMP.bin is a binary resource file that contains the provider and event metadata. This is the template resource, which is signified by the TEMP suffix of the base name of the file.

And the input is the manifest file (.man). The manifest registers an event producer named NodeJS-ETW-provider:

    <provider name="NodeJS-ETW-provider"
        guid="{77754E9B-264B-4D8D-B981-E4135C1ECB0C}"
        symbol="NODE_ETW_PROVIDER"
        message="$(string.NodeJS-ETW-provider.name)"
        resourceFileName="node.exe"
        messageFileName="node.exe">

Notice the message attribute which is used for localization which will be matched with :

    <localization>
        <resources culture="en-US">
            <stringTable>
                <string id="NodeJS-ETW-provider.name" value="Node.js ETW Provider"/>

Tasks are typically used to identify major components of the provider, some form of grouping. In node there is one task:

    <task name="MethodRuntime" value="1" symbol="JSCRIPT_METHOD_RUNTIME_TASK">
        <opcodes>
            <opcode name="MethodLoad" value="10" symbol="JSCRIPT_METHOD_METHODLOAD_OPCODE"/>
        </opcodes>
    </task> 

This grouping enables consumers to query only for specific tasks and opcode combinations. The opcodes are specific to the task MethodRuntime. But there are also global opcodes:

    <opcodes>
        <opcode name="NODE_HTTP_SERVER_REQUEST" value="10"/>
        <opcode name="NODE_HTTP_SERVER_RESPONSE" value="11"/>
        <opcode name="NODE_HTTP_CLIENT_REQUEST" value="12"/>
        <opcode name="NODE_HTTP_CLIENT_RESPONSE" value="13"/>
        <opcode name="NODE_NET_SERVER_CONNECTION" value="14"/>
        <opcode name="NODE_NET_STREAM_END" value="15"/>
        <opcode name="NODE_GC_START" value="16"/>
        <opcode name="NODE_GC_DONE" value="17"/>
        <opcode name="NODE_V8SYMBOL_REMOVE" value="21"/>
        <opcode name="NODE_V8SYMBOL_MOVE" value="22"/>
        <opcode name="NODE_V8SYMBOL_RESET" value="23"/>
    </opcodes>

Templates are used to define event specific data that the provider includes with an event. For example in node one template looks like this:

    <template tid="node_connection">
      <data name="fd" inType="win:UInt32" />
      <data name="port" inType="win:UInt32" />
      <data name="remote" inType="win:AnsiString" />
      <data name="buffered" inType="win:UInt32" />
   </template>

Events have to be defined and can refer to template like the following:

    <event value="2" 
      opcode="NODE_HTTP_SERVER_RESPONSE"
      template="node_connection"
      symbol="NODE_HTTP_SERVER_RESPONSE_EVENT"
      message="$(string.NodeJS-ETW-provider.event.2.message)"
      level="win:Informational"/>

Ok, so we can see how the provider and events are configured. This information is used by the mc tool to generate headers and a binary file.

The provider headers is in src/node_win32_etw_provider.h. For each of the opcodes there are function declarations and init_etw() and shutdown_etw(). There is an internal header src/node_win32_etw_provider-inl.h which includes the generated node_etw_provider.h which is located in 'tools/msvs/genfiles/node_etw_provider.h'.

Performace counters

Are used to provide information how node is performing. There is condition in node.gypi that depends on node_perfctr which looks like this:

    [ 'node_use_perfctr=="true"', {
      'defines': [ 'HAVE_PERFCTR=1' ],
      'dependencies': [ 'node_perfctr' ],
      'sources': [
        'src/node_win32_perfctr_provider.h',
        'src/node_win32_perfctr_provider.cc',
        'src/node_counters.cc',
        'src/node_counters.h',
        'tools/msvs/genfiles/node_perfctr_provider.rc',
      ]
    } ],

This file is used in the node_perfctr (node performance counter) target and it is an action target that invokes ctrpp The input to ctrpp is src/res/node_perfctr_provider.man

    {
      'action_name': 'node_perfctr_man',
      'inputs': [ 'src/res/node_perfctr_provider.man' ],
      'outputs': [
         'tools/msvs/genfiles/node_perfctr_provider.h',
         'tools/msvs/genfiles/node_perfctr_provider.rc',
         'tools/msvs/genfiles/MSG00001.BIN',
      ],
      'action': [ 'ctrpp <@(_inputs) '
        '-o tools/msvs/genfiles/node_perfctr_provider.h '
        '-rc tools/msvs/genfiles/node_perfctr_provider.rc'
      ]
    },

-o specifies the header that will be generated. -rc specifies the file that will be generated.

'src/node_win32_perfctr_provider.cc' includes node_perfctr_provider.h Notice this this is much like the ETW and manifest based. Lets look at the manifest. Instead of containing events this file contains counters. The MSG00001.BIN file is a binary resource file for each language you specify (only one in our case).

But when is the actual .res file created?
I can see these are part of the statically linked library:

    dumpbin /archivemembers Release/lib/node.lib

I can see that path is Release\obj\node\node_etw_provider.res and Release\obj\node\node_perfctr_provider.res. Do these files somehow refer to node.lib causing /WHOLEARCHIVE to fail? Is there some way to exclude object from the lib specified by /WHOLEARCHIVE? A .res file is a compiled resource file generated by rc.exe. By default this is created in the same directory as the .rc file.

CTRPP

The CTRPP tool is a pre-processor that parses and validates your counters manifest. The tool also generates code that you use to provide your counter data.

Tracker.exe

Scans .tlog files to figure out what a projects output files are. This is used to know if something should be updated/recompiled I guess. Adding this as you might come across an issue on windows where the last things that is logged is output from the tracker but infact if there is a link error this would not be something that the tracker would report, so you'll need to look further back in the console output.

Inspecting a .lib on windows

    DUMPBIN /ARCHIVEMEMBERS Release\lib\node.lib

Linux Trace Toolkit: next generation (lttng)

LTTng is an open source tracing framework for Linux.

Does it make sense to have the node.res file when node is linked as a static library? Would'nt the case be that the icon and other resource info be that of the embedders.

But out about the ETW and performance counters resources? Resources (as far as I can tell) are intended to be linked to the exe or a dynamic library. It does not seem like it is possible to add

Node executable linking

By default, just building node using ./configure && make -j8 the node executable will be statically linked:

    $ otool -L out/Debug/node
    out/Debug/node:
      /System/Library/Frameworks/CoreFoundation.framework/Versions/A/CoreFoundation (compatibility version 150.0.0, current version 1259.11.0)
      /usr/lib/libSystem.B.dylib (compatibility version 1.0.0, current version 1226.10.1)
      /usr/lib/libc++.1.dylib (compatibility version 1.0.0, current version 120.1.0)

GYP

For addon tests in node there can be the need to link against a library in the root directory. The problem is that you cannot use <(PRODUCT_DIR) as that is the product dir for the addon itself. There is also the issue that the output directory is different on windows, it is not out but instead the Release and Debug directories are in the root of the project. But there are variables available by the respective node-gyp evironments, for example on Window you can use:


    ['OS=="win"', {
      'libraries': ['../../../../$(Configuration)/lib/zlib.lib'],
    }, {
      'libraries': ['../../../../out/$(BUILDTYPE)/libzlib.a'],
    }],

serdes

This is a built in module for serialization/deserialization which is currently marked as experimental. It allows for serializing JavaScript values in a way compatiable with https://developer.mozilla.org/en-US/docs/Web/API/Web_Workers_API/Structured_clone_algorithm, which an algorithm for copying complex JavaScript objects and used internally for transferring data to and from Web Workers view postMessage().

Failing test without network connection

make[1]: Leaving directory `/root/rpmbuild/BUILD/node-v8.7.0'
/usr/bin/python2.7 tools/test.py --mode=release -J \
async-hooks \
abort doctool es-module inspector known_issues message parallel pseudo-tty sequential \
addons addons-napi
=== release test-dgram-membership ===
Path: parallel/test-dgram-membership
assert.js:41
  throw new errors.AssertionError({
  ^

AssertionError [ERR_ASSERTION]: Got unwanted exception.
    at innerThrows (assert.js:653:7)
    at Function.doesNotThrow (assert.js:665:3)
    at Object.<anonymous> (/root/rpmbuild/BUILD/node-v8.7.0/test/parallel/test-dgram-membership.js:83:10)
    at Module._compile (module.js:624:30)
    at Object.Module._extensions..js (module.js:635:10)
    at Module.load (module.js:545:32)
    at tryModuleLoad (module.js:508:12)
    at Function.Module._load (module.js:500:3)
    at Function.Module.runMain (module.js:665:10)
    at startup (bootstrap_node.js:187:16)
Command: out/Release/node /root/rpmbuild/BUILD/node-v8.7.0/test/parallel/test-dgram-membership.js
=== release test-dgram-multicast-set-interface-lo ===
Path: parallel/test-dgram-multicast-set-interface-lo
assert.js:41
  throw new errors.AssertionError({
  ^

AssertionError [ERR_ASSERTION]: undefined == true
    at Object.<anonymous> (/root/rpmbuild/BUILD/node-v8.7.0/test/parallel/test-dgram-multicast-set-interface-lo.js:64:8)
    at Module._compile (module.js:624:30)
    at Object.Module._extensions..js (module.js:635:10)
    at Module.load (module.js:545:32)
    at tryModuleLoad (module.js:508:12)
    at Function.Module._load (module.js:500:3)
    at Function.Module.runMain (module.js:665:10)
    at startup (bootstrap_node.js:187:16)
    at bootstrap_node.js:608:3
Command: out/Release/node /root/rpmbuild/BUILD/node-v8.7.0/test/parallel/test-dgram-multicast-set-interface-lo.js
[03:09|% 100|+ 1893|-   2]: Done

Trouble shooting DNS issue on RHEL

BIND (the NameServer) consists of a set of dns related programs. The nameserver itself is called named. There is an admin tool named rdnc and dig or debugging. named reads it's configuration from /etc/named.conf.

/etc/nsswitch.conf

Before this configuration file existed name service lookups were hardcoded into the c library, as well as the search order. Name Service Switch was introduced by Sun Microsystems in there C library implementation in Solaris 2. This uses modules which allowes for adding new services without adding them to the GNU C library.

‘unavail’ The service is permanently unavailable. This can either mean the needed file is not available, or, for DNS, the server is not available or does not allow queries. The default action is continue.

For the hosts and networks databases the default value is dns [!UNAVAIL=return] files. I.e., the system is prepared for the DNS service not to be available but if it is available the answer it returns is definitive.

Regarding test/parallel/test-net-connect-immediate-finish.js and test/parallel/test-net-better-error-messages-port-hostname.js the and the error:

Path: parallel/test-net-connect-immediate-finish
assert.js:41
  throw new errors.AssertionError({
  ^
AssertionError [ERR_ASSERTION]: 'EAI_AGAIN' === 'ENOTFOUND'
    at Socket.client.once.common.mustCall (/builddir/build/BUILD/node-v8.6.0/test/parallel/test-net-connect-immediate-finish.js:35:10)
    at Socket.<anonymous> (/builddir/build/BUILD/node-v8.6.0/test/common/index.js:514:15)
    at Object.onceWrapper (events.js:316:30)
    at emitOne (events.js:115:13)
    at Socket.emit (events.js:210:7)
    at emitErrorNT (internal/streams/destroy.js:64:8)
    at _combinedTickCallback (internal/process/next_tick.js:138:11)
    at process._tickCallback (internal/process/next_tick.js:180:9)
Command: out/Release/node /builddir/build/BUILD/node-v8.6.0/test/parallel/test-net-connect-immediate-finish.js

This only occurs when there is no available dns (you might need to comment it out in /etc/resolve.conf to reproduce this), I manually update /etc/nsswitch.conf and the update the hosts entry:

hosts:      files dns

Notice that the last entry is dns which so it will return the result which will be Using the default (at least this is the default for the RHEL version that we are using in our image: registry.access.redhat.com/rhel7) this error does not occur as the default is:

hosts: files dns myhostname

Since this is an external configuration that could be different for different installations I think the best option is to update the test to be able to handle this situation. I'll create a pull request and see what people think.

Inspector console

In the start function of lib/internal/bootstrap_node.js we have a function that sets up the inspector console. After the global console object is set up, it is passed to setupInspector(originalConsole, wrappedConsole, Module); It does so with the help of a builtin module 'inspector' which can be found in src/inspector_js_api.cc.

const { addCommandLineAPI, consoleCall } = process.binding('inspector');


wrappedConsole[key] = consoleCall.bind(wrappedConsole,
                                       originalConsole[key],
                                       wrappedConsole[key],
                                       config);

The above will create a new function which has its this set to wrappedConsole

    $ out/Debug/node
    [Function: log] [Function: bound log] {}
    [Function: info] [Function: bound log] {}
    [Function: warn] [Function: bound warn] {}
    [Function: error] [Function: bound warn] {}
    [Function: dir] [Function: bound dir] {}
    [Function: time] [Function: bound time] {}
    [Function: timeEnd] [Function: bound timeEnd] {}
    [Function: trace] [Function: bound trace] {}
    [Function: assert] [Function: bound assert] {}
    [Function: clear] [Function: bound clear] {}
    [Function: count] [Function: bound count] {}
    [Function: group] [Function: bound group] {}
    [Function: groupCollapsed] [Function: bound group] {}
    [Function: groupEnd] [Function: bound groupEnd] {}

So for all of the above functions they will now got through the ConsoleCall function in inspector_agent.cc which will check if the inspector is enabled, which is done by passing --inspect, for this process:


    if (env->inspector_agent()->enabled()) {
      ...
    }

In this case the writing to stdout will be performed twice, once to the inspector using:

    inspector_method.As<Function>()->Call(context,
                                          info.Holder(),
                                          call_args.size(),
                                          call_args.data()).IsEmpty());

For example you'll seen the output in both stdout and in chrome's console.

ccache

    $ git clone https://github.com/ccache/ccache.git
    $ cd ccache
    $ ./autogen.sh
    $ ./configure && make 

Add ccache to your PATH.

    export CC="ccache clang -Qunused-arguments"
    export CXX="ccache clang++ -Qunused-arguments"

Now you can configure and build as normal.

To see what is in the cache:

   $ ccache -s

To clear the cache:

   $ ccache -C 

GetPeerCertificate

    STACK_OF(X509)* ssl_certs = SSL_get_peer_cert_chain(w->ssl_);

STACK_OF is a macro defined in deps/openssl/openssl/crypto/stack/safestack.h and will expand to:

    struct stack_st_X509* ssl_certs = SSL_get_peer_cert_chain(w->ssl_);

A Local instance holds a pointer to an Object. If you pass this instance to a function a copy of the object will be created. But both instances will point to the same object (the pointer is copied).

For example:

    static void AddIssuer(X509** cert,
                          const STACK_OF(X509)* const peer_certs,
                          Local<Object> info,
                          Environment* const env) {

    ...
    Local<Object> ca_info = X509ToObject(env, ca);
    // Now we want to update what info points to so that is points to the value of ca_info instead.
    (lldb) expr info
    (v8::Local<v8::Object>) $4 = (val_ = 0x0000000106014b08)
    (lldb) expr *info
    (v8::Object *) $6 = 0x0000000106014b08

The copy constructor for Local<> will copy the value, which includes the pointer so these are separate objects:

    (lldb) p &result
    (v8::Local<v8::Object> *) $86 = 0x00007fff5fbf04b0
    (lldb) p &info
    (v8::Local<v8::Object> *) $87 = 0x00007fff5fbf04a8

But what is info supposed to represent? Well they both initially point to the same thing as we can see but as mentioned they are separate objects. So the following will update what they both (result and info) point to:

    Local<Object> ca_info = X509ToObject(env, ca);
    info->Set(env->issuercert_string(), ca_info);

But the following will cause info to copy constructed (I think) to ca_info so the connection with result is lost here. That is incorrect! What is happening is that the first time info == result so the issuer is set on it, and it contains the ca_info instance. So result can get to it. Next when info is set to ca_info:

    info = ca_info;

This is for the next iteration and if there are more they will be chained to ca_info using the info->Set(env->issuercert_string(), ca_info), which remember can be accessed from result via its issuercert_string property.

result -> #issuercertificate -> ca_info -> #issuercerficate -> ca_info

If this is not done result will not be linked to them all and the chain broken.

Crypto SetKey

    void DiffieHellman::SetPublicKey(const FunctionCallbackInfo<Value>& args) {
      SetKey(args, [](DH* dh, BIGNUM* num) { DH_set0_key(dh, num, nullptr); },
         "Public key");
    }

    void DiffieHellman::SetPrivateKey(const FunctionCallbackInfo<Value>& args) {
    #if OPENSSL_VERSION_NUMBER >= 0x10100000L && OPENSSL_VERSION_NUMBER < 0x10100070L
    // Older versions of OpenSSL 1.1.0 have a DH_set0_key which does not work for
    // Node. See https://github.com/openssl/openssl/pull/4384.
    #error "OpenSSL 1.1.0 revisions before 1.1.0g are not supported"
    #endif
    SetKey(args, [](DH* dh, BIGNUM* num) { DH_set0_key(dh, nullptr, num); },
       "Private key");
    }

    void DiffieHellman::SetKey(const v8::FunctionCallbackInfo<v8::Value>& args,
                           void (*set_field)(DH*, BIGNUM*), const char* what) {
      ...

Notice that the second paramenter to SetKey is a function pointer:

     void function_name(DH* dh, BIGHUM* n);

I noticed that DH_set0_key returns one and wondering why as the function pointer is declared as returning void:

    static int DH_set0_key(DH* dh, BIGNUM* pub_key, BIGNUM* priv_key) {
      if (pub_key != nullptr) {
        BN_free(dh->pub_key);
        dh->pub_key = pub_key;
      }
      if (priv_key != nullptr) {
        BN_free(dh->priv_key);
        dh->priv_key = priv_key;
      }
      return 1;
    }

And the call looks like this in SetKey:

    set_field(dh->dh, num);

Object Set

The Set functions in include/v8.h for Object are deprecated. The ones that should be used are the ones that return a MayBe result instead. This means that you also have to call either FromJust() or ToChecked() before using/retrieving the value.

     3109 class V8_EXPORT Object : public Value {
     3110  public:
     3111   V8_DEPRECATE_SOON("Use maybe version",
     3112                     bool Set(Local<Value> key, Local<Value> value));
     3113   V8_WARN_UNUSED_RESULT Maybe<bool> Set(Local<Context> context,
     3114                                         Local<Value> key, Local<Value> value);
     3115
     3116   V8_DEPRECATE_SOON("Use maybe version",
     3117                     bool Set(uint32_t index, Local<Value> value));
     3118   V8_WARN_UNUSED_RESULT Maybe<bool> Set(Local<Context> context, uint32_t index,
     3119                                         Local<Value> value);

Unqualified name lookup

    void SecureContext::Initialize(Environment* env, Local<Object> target) {
      ...
      env->SetProtoMethod(t, "init", SecureContext::Init);

Even without the SecureContext namespace Init will be resolved correctly as resolution will look in class of the member funtion which is SecureContext.

    const binding = process.binding('crypto');

This will invoke SecureContext::Initialize.

const server = tls.Server(options, function(socket) {

exports.Server = require('_tls_wrap').Server;

_tls_common.js:

const binding = process.binding('crypto');
const NativeSecureContext = binding.SecureContext;

function SecureContext(secureProtocol, secureOptions, context) {
  ... 
  this.context = new NativeSecureContext();
}

So we can see that SecureContext::New is bound to NativeSecureContext.

node_crypto_bio

    static const BIO_METHOD method = {
      BIO_TYPE_MEM,
      "node.js SSL buffer",
      Write,
      Read,
      Puts,
      Gets,
      Ctrl,
      New,
      Free,
      nullptr
    };


BIO* NodeBIO::NewFixed(const char* data, size_t len) {
  BIO* bio = New();

  if (bio == nullptr ||
      len > INT_MAX ||
      BIO_write(bio, data, len) != static_cast<int>(len) ||
      BIO_set_mem_eof_return(bio, 0) != 1) {
    BIO_free(bio);
    return nullptr;
  }

  return bio;
}

BIO_set_mem_eof_return is a macro defined as:

    # define BIO_set_mem_eof_return(b,v) BIO_ctrl(b,BIO_C_SET_BUF_MEM_EOF_RETURN,v,NULL)

which in our case would give:

    BIO_strl(bio, BIO_C_SET_BUF_MEM_EOF_RETURN, 0, NULL)

This call will end up in bio_lib.c and its BIO_ctrl function:

    ret = b->method->ctrl(b, cmd, larg, parg);

Now, b is the BIO we passed in, cmd is BIO_C_SET_BUF_MEM_EOF_RETURN (130), larg is 0, and parg is NULL. The interesting thing here is that b->method will be the method defined in node_crypto_bio.cc:

    (lldb) p *b->method
    (BIO_METHOD) $12 = {
      type = 1025
      name = 0x0000000101d44053 "node.js SSL buffer"
      bwrite = 0x000000010192ed10 (node`node::crypto::NodeBIO::Write(bio_st*, char const*, int) at node_crypto_bio.cc:142)
      bread = 0x000000010192e940 (node`node::crypto::NodeBIO::Read(bio_st*, char*, int) at node_crypto_bio.cc:92)
      bputs = 0x000000010192ef90 (node`node::crypto::NodeBIO::Puts(bio_st*, char const*) at node_crypto_bio.cc:151)
      bgets = 0x000000010192efe0 (node`node::crypto::NodeBIO::Gets(bio_st*, char*, int) at node_crypto_bio.cc:156)
      ctrl = 0x000000010192f2f0 (node`node::crypto::NodeBIO::Ctrl(bio_st*, int, long, void*) at node_crypto_bio.cc:182)
      create = 0x000000010192e7c0 (node`node::crypto::NodeBIO::New(bio_st*) at node_crypto_bio.cc:68)
      destroy = 0x000000010192e830 (node`node::crypto::NodeBIO::Free(bio_st*) at node_crypto_bio.cc:77)
      callback_ctrl = 0x0000000000000000
    }

So b->method->ctrl will call NodeBIO::Ctrl: The first things that is done is that the NodeBIO instance is retrieved from the bio:

    NodeBIO* nbio = FromBIO(bio);

'FromBIO':

    CHECK_NE(BIO_get_data(bio), nullptr);
    return static_cast<NodeBIO*>(BIO_get_data(bio));

BIO_get_data is a macro for OPENSSL versions greater than 1.0.1:

    #define BIO_get_data(bio) bio->ptr

So we can see that the BIO's ponter is a pointer to the NodeBIO instance.

    case BIO_C_SET_BUF_MEM_EOF_RETURN
      nbio->set_eof_return(num);
      break;

So we are calling 'set_eof_return' with

    (lldb) p num
    (long) $14 = 0

Node Crypto BIO Buffer

node_crypto_bio.h has a private Buffer class.

To understand what this buffer does and how it works lets take a look at a usage of it... SecureContext::AddCACert:

    ...
    BIO* bio = LoadBIO(env, args[0]);

In this case args[0] is a certificate (a string) so the following path in LoadBIO will be taken:

    return NodeBIO::NewFixed(*s, s.length());

NodeBIO::NewFixed will call:

    if (bio == nullptr ||
      len > INT_MAX ||
      BIO_write(bio, data, len) != static_cast<int>(len) ||
      BIO_set_mem_eof_return(bio, 0) != 1) {
      BIO_free(bio);
      return nullptr;
    }

We are interested in the BIO_write call which will call Write on the bio's method which is NodeBIO::Write:

    void NodeBIO::Write(const char* data, size_t size) {
    ...
    // Allocate initial buffer if the ring is empty
    TryAllocateForWrite(left);

    void NodeBIO::TryAllocateForWrite(size_t hint) {
      Buffer* w = write_head_;
      Buffer* r = read_head_;
      // If write head is full, next buffer is either read head or not empty.
      if (w == nullptr ||
        (w->write_pos_ == w->len_ &&
         (w->next_ == r || w->next_->write_pos_ != 0))) {
        size_t len = w == nullptr ? initial_ : kThroughputBufferLength;
        if (len < hint)
          len = hint;
        Buffer* next = new Buffer(env_, len);

        if (w == nullptr) {
          next->next_ = next;
          write_head_ = next;
          read_head_ = next;
        } else {
          next->next_ = w->next_;
          w->next_ = next;
        }
      }
    }

First time entering this function w and r will be null as write_head_ and read_head_ will be null:

    (lldb) expr w
    (node::crypto::NodeBIO::Buffer *) $24 = 0x0000000000000000
    (lldb) expr r
    (node::crypto::NodeBIO::Buffer *) $25 = 0x0000000000000000

    size_t len = initial_; // since w == nullptr;
    (lldb) p len
    (size_t) $28 = 1024
    (lldb) p hint
    (size_t) $29 = 1224
    if (len < hint)  // 1024 < 1224
      len = 1224
    
    Buffer* next = new Buffer(env_, 1224);


    Buffer constructor: 
    data_ = new char[len];

    if (w == nullptr) {
      next->next_ = next;
      write_head_ = next;
      read_head_ = next;
    }

So initially there is only a single entry all pointing to this first one. Next, we are back in NodeBIO::Write:

    while (left > 0) { // first time left is == 1224
    ....
    memcpy(write_head_->data_ + write_head_->write_pos_, data + offset, to_write);

void* memcpy( void* dest, const void* src, std::size_t count ) so the destination in our case is write_head_data_ + write_head_write_pos_, the data to be copied is data + 0, and to_write is 1224.

ClientHelloParser

node_crypto.h has a class member that is declared like:

  ClientHelloParser hello_parser_;

So when a new SSLWrap is created constructor of ClientHelloParser will be called. Looking at the constructor for ClientHelloParser it first initilizes its fields and then calls Reset():

      ClientHelloParser() : state_(kEnded),
                        onhello_cb_(nullptr),
                        onend_cb_(nullptr),
                        cb_arg_(nullptr),
                        session_size_(0),
                        session_id_(nullptr),
                        servername_size_(0),
                        servername_(nullptr),
                        ocsp_request_(0),
                        tls_ticket_size_(0),
                        tls_ticket_(nullptr) {
      Reset();
     }

    inline void ClientHelloParser::Reset() {
      frame_len_ = 0;
      body_offset_ = 0;
      extension_offset_ = 0;
      session_size_ = 0;
      session_id_ = nullptr;
      tls_ticket_size_ = -1;
      tls_ticket_ = nullptr;
      servername_size_ = 0;
      servername_ = nullptr;
      ocsp_request_ = 0;  // added by me
    }

I can't find a reason for not resetting ocsp_request_ here. I'll create a PR to get some feedback.

v8::Object::Set and exceptions

This section goes through a call to Set and the possible exceptions that might be throws and how they can be handled.

    $ lldb -- out/Debug/node --inspect-brk test/parallel/test-tls-legacy-onselect.js
    env->SetProtoMethod(t, "setSNICallback",  Connection::SetSNICallback);
    (lldb) br s -f node_crypto.cc -l 3594
    (lldb) r

Open chrome://inspect and set a break point on this line:

    const pair = tls.createSecurePair(null, true, false, false);

A SecurePair is a pair of streams to do encrypted communication with.

    Local<Object> obj = Object::New(env->isolate());
    obj->Set(env->context(), env->onselect_string(), args[0]).FromJust();

We are creating a new Localv8::Object and setting a property on that object. The property name is taken from the Environment function onselect_string(). This function is generated by a macro and by:

    V(onselect_string, "onselect")

Now, obj->Set is only setting a property to be a function and here is a check in this specific code path that arg[0] exists and is a function.

    (lldb) jlh obj
    0x10583e1675b9: [JS_OBJECT_TYPE]
    - map = 0x105859b5fe79 [FastProperties]
    - prototype = 0x105819c04679
    - elements = 0x1058b5982241 <FixedArray[0]> [HOLEY_ELEMENTS]
    - properties = 0x1058b5982241 <FixedArray[0]> {
      #onselect: 0x10583e167571 <JSFunction (sfi = 0x1058fc5f3969)> (const data descriptor)
    }

But, there is a chance where an exception might be thrown and that is if there is a setter for `onselect'. This migth look like:

    Object.defineProperty(Object.prototype, 'onselect', {
      set: function(f) {
        console.log('throw error from setter...');
        throw Error('dummy setter error');
      }
    });

Calling obj->Set(env->context(), env->onselect_string(), args[0]).FromJust(); would cause an exception to be thrown:

$ out/Debug/node  test/parallel/test-tls-legacy-onselect.js
throw error from setter...
FATAL ERROR: v8::FromJust Maybe value is Nothing.
 1: node::Abort() [/Users/danielbevenius/work/nodejs/node/out/Debug/node]
 2: node::OnFatalError(char const*, char const*) [/Users/danielbevenius/work/nodejs/node/out/Debug/node]
 3: v8::Utils::ReportApiFailure(char const*, char const*) [/Users/danielbevenius/work/nodejs/node/out/Debug/node]
 4: v8::Utils::ApiCheck(bool, char const*, char const*) [/Users/danielbevenius/work/nodejs/node/out/Debug/node]
 5: v8::V8::FromJustIsNothing() [/Users/danielbevenius/work/nodejs/node/out/Debug/node]
 6: v8::Maybe<bool>::FromJust() const [/Users/danielbevenius/work/nodejs/node/out/Debug/node]
 7: node::crypto::Connection::SetSNICallback(v8::FunctionCallbackInfo<v8::Value> const&) [/Users/danielbevenius/work/nodejs/node/out/Debug/node]
 8: v8::internal::FunctionCallbackArguments::Call(void (*)(v8::FunctionCallbackInfo<v8::Value> const&)) [/Users/danielbevenius/work/nodejs/node/out/Debug/node]
 9: v8::internal::MaybeHandle<v8::internal::Object> v8::internal::(anonymous namespace)::HandleApiCallHelper<false>(v8::internal::Isolate*, v8::internal::Handle<v8::internal::HeapObject>, v8::internal::Handle<v8::internal::HeapObject>, v8::internal::Handle<v8::internal::FunctionTemplateInfo>, v8::internal::Handle<v8::internal::Object>, v8::internal::BuiltinArguments) [/Users/danielbevenius/work/nodejs/node/out/Debug/node]
10: v8::internal::Builtin_Impl_HandleApiCall(v8::internal::BuiltinArguments, v8::internal::Isolate*) [/Users/danielbevenius/work/nodejs/node/out/Debug/node]
11: v8::internal::Builtin_HandleApiCall(int, v8::internal::Object**, v8::internal::Isolate*) [/Users/danielbevenius/work/nodejs/node/out/Debug/node]
12: 0x10e0737043c4
13: 0x10e0737f1600
Abort trap: 6

Notice the FromJust() which will crash if there is an error in the JavaScript setter.

But instead what should happen is (in SetSNICallback):

if (obj->Set(env->context(), env->onselect_string(), args[0]).IsNothing()) {
    return;
}

return where will return to api-arguments.cc line 26:

return GetReturnValue<Object>(isolate);

GetReturnValue can be found in api-arguments.h'. It looks like this will see if there is a return value and return a Handle to it or an empty Handle. After returning fromv8::internal::FunctionCallbackArguments::Call` we are the builtins-api.cc HandleApiCallHelper:

    Handle<Object> result = custom.Call(callback); // this was our call to SetSNICallback
 
    RETURN_EXCEPTION_IF_SCHEDULED_EXCEPTION(isolate, Object);

RETURN_EXCEPTION_IF_SCHEDULED_EXCEPTION is a macro:

#define RETURN_EXCEPTION_IF_SCHEDULED_EXCEPTION(isolate, T) \
  RETURN_VALUE_IF_SCHEDULED_EXCEPTION(isolate, MaybeHandle<T>())

Which will expend to

      RETURN_VALUE_IF_SCHEDULED_EXCEPTION(isolate, MaybeHandle<Object>())


    #define RETURN_VALUE_IF_SCHEDULED_EXCEPTION(isolate, value) \
      do {                                                      \
        Isolate* __isolate__ = (isolate);                       \
        DCHECK(!__isolate__->has_pending_exception());          \
        if (__isolate__->has_scheduled_exception()) {           \
          __isolate__->PromoteScheduledException();             \
          return value;                                         \
        }                                                       \
      } while (false)

So this will expand to (which will be places in HandleApiCallHelper replacing RETURN_EXCEPTION_IF_SCHEDULED_EXCEPTION) as:

        Isolate* __isolate__ = (isolate);            
        DCHECK(!__isolate__->has_pending_exception());
        if (__isolate__->has_scheduled_exception()) {
          __isolate__->PromoteScheduledException(); 
          return MaybeHandle<Object(); 
        }                             

In this case there will be a pending exception so isolate->PromoteScheduledException will be called.

PromoteScheduledException:

Object* thrown = scheduled_exception();
clear_scheduled_exception();
return ReThrow(thrown);
(lldb) job thrown
0x11728e9a5401: [JS_ERROR_TYPE]
 - map = 0x1172bc86cf79 [FastProperties]
 - prototype = 0x117204d0d679
 - elements = 0x1172afa82241 <FixedArray[0]> [HOLEY_SMI_ELEMENTS]
 - properties = 0x11728e9a5469 <PropertyArray[3]> {
    #stack: 0x1172b59ca9b9 <AccessorInfo> (const accessor descriptor)
    #message: 0x1172c10b6711 <String[18]: dummy setter error> (data field 0) properties[0]
    0x1172afa86899 <Symbol: detailed_stack_trace_symbol>: 0x11728e9a5491 <FixedArray[5]> (data field 1) properties[1]
    0x1172afa87069 <Symbol: stack_trace_symbol>: 0x11728e9a5cf1 <JSArray[26]> (data field 2) properties[2]
 }

ReThrow:

set_pending_exception(exception);
return heap()->exception();
void Isolate::set_pending_exception(Object* exception_obj) {
  DCHECK(!exception_obj->IsException(this));
  thread_local_top_.pending_exception_ = exception_obj;
}

Now when returning we will land in bootstrap_node.js and process_fatalException callback:

process._fatalException = function(er) {
  ...
     if (!caught)
        caught = process.emit('uncaughtException', er);

      // If someone handled it, then great.  otherwise, die in C++ land
      // since that means that we'll exit the process, emit the 'exit' event
      if (!caught) {
        try {
          if (!process._exiting) {
            process._exiting = true;
            process.emit('exit', 1);
          }
        } catch (er) {
          // nothing to be done about it at this point.
        }

      } else {
        ...
      }

      return caught;
}

Where er will be:

er = Error: dummy setter error at Object.set

The error would be:

throw error from setter...
/Users/danielbevenius/work/nodejs/node/test/parallel/test-tls-legacy-onselect.js:13
    throw Error('dummy setter error');
    ^

Error: dummy setter error
    at Object.set (/Users/danielbevenius/work/nodejs/node/test/parallel/test-tls-legacy-onselect.js:13:11)
    at Server.<anonymous> (/Users/danielbevenius/work/nodejs/node/test/parallel/test-tls-legacy-onselect.js:20:12)
    at Server.<anonymous> (/Users/danielbevenius/work/nodejs/node/test/common/index.js:520:15)
    at Server.emit (events.js:126:13)
    at TCP.onconnection (net.js:1595:8)

This is more informative about the error and easier to understand where it happend.

v8::Object::Set walkthrough

This function can be found in api.cc. In the walkthrough what we are setting is a callback on an object.

Maybe<bool> v8::Object::Set(v8::Local<v8::Context> context, v8::Local<Value> key, v8::Local<Value> value) {
  auto isolate = reinterpret_cast<i::Isolate*>(context->GetIsolate());
  ENTER_V8(isolate, context, Object, Set, Nothing<bool>(), i::HandleScope);
  ...
}

In this walk through the arguments are:

(lldb) jlh key
#onselect

(lldb) jlh value
0x2cfe064e7d11: [Function]
 - map = 0x2cfea4f02411 [FastProperties]
 - prototype = 0x2cfec3204631
 - elements = 0x2cfea2482241 <FixedArray[0]> [HOLEY_ELEMENTS]
 - initial_map =
 - shared_info = 0x2cfec4a73ae1 <SharedFunctionInfo>
 - name = 0x2cfea2482431 <String[0]: >
 - formal_parameter_count = 0
 - kind = [ NormalFunction ]
 - context = 0x2cfe064e1be1 <FixedArray[5]>
 - code = 0x6b17cf07d01 <Code BUILTIN>
 - source code = () {
    context.actual++;
    return fn.apply(this, arguments);
  }
 - properties = 0x2cfea2482241 <FixedArray[0]> {
    #length: 0x2cfed24ca4f1 <AccessorInfo> (const accessor descriptor)
    #name: 0x2cfed24ca561 <AccessorInfo> (const accessor descriptor)
    #prototype: 0x2cfed24ca5d1 <AccessorInfo> (const accessor descriptor)
 }

 - feedback vector: 0x2cfe7b857fa9: [FeedbackVector] in OldSpace
 - length: 9
 SharedFunctionInfo: 0x2cfec4a73ae1 <SharedFunctionInfo>
 Optimized Code: 0
 Invocation Count: 1
 Profiler Ticks: 0
 Slot #0 LoadProperty PREMONOMORPHIC
  [0]: 0x2cfea2486d59 <Symbol: premonomorphic_symbol>
  [1]: 0x2cfea24871c1 <Symbol: uninitialized_symbol>
 Slot #2 StoreNamedStrict PREMONOMORPHIC
  [2]: 0x2cfea2486d59 <Symbol: premonomorphic_symbol>
  [3]: 0x2cfea24871c1 <Symbol: uninitialized_symbol>
 Slot #4 BinaryOp MONOMORPHIC
  [4]: 1
 Slot #5 Call MONOMORPHIC
  [5]: 0x2cfe7b858019 <WeakCell value= 0x2cfec32088c1 <JSFunction apply (sfi = 0x2cfea24b2181)>>
  [6]: 1
 Slot #7 LoadProperty PREMONOMORPHIC
  [7]: 0x2cfea2486d59 <Symbol: premonomorphic_symbol>
  [8]: 0x2cfea24871c1 <Symbol: uninitialized_symbol>

This is the callback passed in :

pair.ssl.setSNICallback(common.mustCall(function() {
  raw.destroy();
  server.close();
}));

ENTER_V8 is a macro which is defined as:

    #define ENTER_V8(isolate, context, class_name, function_name, bailout_value, \
                     HandleScopeClass)                                           \
      ENTER_V8_HELPER_DO_NOT_USE(isolate, context, class_name, function_name,    \
                             bailout_value, HandleScopeClass, true)

So ENTER_V8(isolate, context, Object, Set, Nothing<bool>(), i::HandleScope); would expend to:

    ENTER_V8_HELPER_DO_NOT_USE(isolate, context, Object, Set, Nothing<bool>(), i::HandleScope, true)

    #define ENTER_V8_HELPER_DO_NOT_USE(isolate, context, Object,      \
                                       function_name, bailout_value,  \
                                       HandleScopeClass, do_callback) \
      if (IsExecutionTerminatingCheck(isolate)) {                     \
        return bailout_value;                                         \
      }                                                               \
      HandleScopeClass handle_scope(isolate);                         \
      CallDepthScope<do_callback> call_depth_scope(isolate, context); \
      LOG_API(isolate, class_name, function_name);                    \
      i::VMState<v8::OTHER> __state__((isolate));                     \
      bool has_pending_exception = false

So back in our v8::Object::Set function these macros would expend to:

    Maybe<bool> v8::Object::Set(v8::Local<v8::Context> context, v8::Local<Value> key, v8::Local<Value> value) {
      auto isolate = reinterpret_cast<i::Isolate*>(context->GetIsolate());
      if (IsExecutionTerminatingCheck(isolate)) {                     
        return Nothing<bool>();                                         
      }                                                               
      HandleScopeClass handle_scope(isolate);                         
      CallDepthScope<do_callback> call_depth_scope(isolate, context); 
      LOG_API(isolate, Object, Set);                    
      i::VMState<v8::OTHER> __state__((isolate));                     
      bool has_pending_exception = false;

      auto self = Utils::OpenHandle(this);
      auto key_obj = Utils::OpenHandle(reinterpret_cast<Name*>(*key));
      auto value_obj = Utils::OpenHandle(*value);
      has_pending_exception = i::Runtime::SetObjectProperty(isolate, 
                                                            self,
                                                            key_obj,
                                                            value_obj,
                                                            i::LanguageMode::kSloppy).is_null();
     RETURN_ON_FAILED_EXECUTION_PRIMITIVE(bool);
     return Just(true);
    }

So, lets take a closer look at i::Runtime::SetObjectProperty (i is V8's internal namespace) which we can find in src/runtime/runtime-object.cc.

    // Check if the given key is an array index.
    bool success = false;
    LookupIterator it = LookupIterator::PropertyOrElement(isolate, object, key, &success);
    if (!success) return MaybeHandle<Object>();

    MAYBE_RETURN_NULL(Object::SetProperty(&it, value, language_mode,
                                          Object::MAY_BE_STORE_FROM_KEYED));
    return value;

MAYBE_RETURN_NULL macro :

    #define MAYBE_RETURN_NULL(call) MAYBE_RETURN(call, MaybeHandle<Object>())

    #define MAYBE_RETURN(call, value)         \
      do {                                    \
        if ((call).IsNothing()) return value; \
      } while (false)

So, in this case our call will expend to:

    if (Object::SetProperty(&it, value, language_mode, Object::MAY_BE_STORE_FROM_KEYED).IsNothing())
       return MaybeHandle<Object>();

Lets follow SetProperty (in src/objects.cc line 4753) and see what it does:

    Maybe<bool> Object::SetProperty(LookupIterator* it, Handle<Object> value,
                                LanguageMode language_mode,
                                StoreFromKeyed store_mode) {
    if (it->IsFound()) {
      bool found = true;
      Maybe<bool> result =
        SetPropertyInternal(it, value, language_mode, store_mode, &found);
    if (found) return result;
  }

  // If the receiver is the JSGlobalObject, the store was contextual. In case
  // the property did not exist yet on the global object itself, we have to
  // throw a reference error in strict mode.  In sloppy mode, we continue.
  if (is_strict(language_mode) && it->GetReceiver()->IsJSGlobalObject()) {
    it->isolate()->Throw(*it->isolate()->factory()->NewReferenceError(
        MessageTemplate::kNotDefined, it->name()));
    return Nothing<bool>();
  }

  ShouldThrow should_throw =
      is_sloppy(language_mode) ? kDontThrow : kThrowOnError;
  return AddDataProperty(it, value, NONE, should_throw, store_mode);
}

SetPropertyInternal (in src/objects.cc):

    ShouldThrow should_throw = is_sloppy(language_mode) ? DONT_THROW : THROW_ON_ERROR;

    switch (it->state()) {
      ...
    }
 
    lldb) p it->state()
    (v8::internal::LookupIterator::State) $22 = ACCESSOR
    (lldb) expr should_throw
    (ShouldThrow) $28 = DONT_THROW

    ....
    return SetPropertyWithAccessor(it, value, should_throw);

SetPropertyWithAccessors:

    Handle<Object> structure = it->GetAccessors();
    Handle<Object> receiver = it->GetReceiver();
    ...
    Handle<Object> setter(AccessorPair::cast(*structure)->setter(), isolate);
(lldb) job *setter
0x2cfe064bd1b1: [Function]
 - map = 0x2cfea4f02411 [FastProperties]
 - prototype = 0x2cfec3204631
 - elements = 0x2cfea2482241 <FixedArray[0]> [HOLEY_ELEMENTS]
 - initial_map =
 - shared_info = 0x2cfe232eed51 <SharedFunctionInfo set>
 - name = 0x2cfea2485f41 <String[3]: set>
 - formal_parameter_count = 1
 - kind = [ NormalFunction ]
 - context = 0x2cfedc582a29 <FixedArray[8]>
 - code = 0x6b17cf07d01 <Code BUILTIN>
 - source code = (f) {
    console.log('throw error from setter...');
    throw Error('dummy setter error');
  }
 - properties = 0x2cfea2482241 <FixedArray[0]> {
    #length: 0x2cfed24ca4f1 <AccessorInfo> (const accessor descriptor)
    #name: 0x2cfed24ca561 <AccessorInfo> (const accessor descriptor)
    #prototype: 0x2cfed24ca5d1 <AccessorInfo> (const accessor descriptor)
 }

 - feedback vector: not available

We can see that this is our setter.

So the next line of interest is:

  return SetPropertyWithDefinedSetter(receiver, Handle<JSReceiver>::cast(setter), value, should_throw);

SetPropertyWithDefinedSetter

RETURN_ON_EXCEPTION_VALUE(isolate, Execution::Call(isolate, setter, receiver,
                                                   arraysize(argv), argv),
                          Nothing<bool>());
return Just(true);

RETURN_ON_EXCEPTION_VALUE checks if Call returned null and if so returns Nothing<bool>().

So lets follow Call (src/execution.cc line 191):

return CallInternal(isolate, callable, receiver, argc, argv, MessageHandling::kReport);

MessageHandling is an enum define as:

enum class MessageHandling { kReport, kKeepPending };

CallInternal:

return Invoke(isolate, false, callable, receiver, argc, argv,
              isolate->factory()->undefined_value(), message_handling);

The second argument name is is_construct and undefined is passed for new_target. Invoke (src/execution.cc line 58):

MUST_USE_RESULT MaybeHandle<Object> Invoke(
    Isolate* isolate, bool is_construct, Handle<Object> target,
    Handle<Object> receiver, int argc, Handle<Object> args[],
    Handle<Object> new_target, Execution::MessageHandling message_handling) {
    ...
    typedef Object* (*JSEntryFunction)(Object* new_target, Object* target,
                                       Object* receiver, int argc,
                                       Object*** args);
    ...
    Object* value = NULL;

    JSEntryFunction stub_entry = FUNCTION_CAST<JSEntryFunction>(code->entry());

    Handle<Code> code = is_construct
       ? isolate->factory()->js_construct_entry_code()
       : isolate->factory()->js_entry_code();
    (lldb) p is_construct
    (bool) $74 = false

So isolate->factory()->js_entry_code() will be called. Lets take a look at this code. To do this we have to take the address from code->entry()

(lldb) job *code
0x6b17cf04001: [Code]
kind = STUB
major_key = JSEntryStub
compiler = unknown
Instructions (size = 234)
0x6b17cf04060     0  55             push rbp
0x6b17cf04061     1  4889e5         REX.W movq rbp,rsp
0x6b17cf04064     4  6a02           push 0x2
0x6b17cf04066     6  49ba7819000601000000 REX.W movq r10,0x106001978    ;; external reference (Isolate::context_address)
0x6b17cf04070    10  4d8b12         REX.W movq r10,[r10]
0x6b17cf04073    13  4152           push r10
0x6b17cf04075    15  4154           push r12
0x6b17cf04077    17  4155           push r13
0x6b17cf04079    19  4156           push r14
0x6b17cf0407b    1b  4157           push r15
0x6b17cf0407d    1d  53             push rbx
0x6b17cf0407e    1e  49bd4800000601000000 REX.W movq r13,0x106000048    ;; external reference (Heap::roots_array_start())
0x6b17cf04088    28  4981c580000000 REX.W addq r13,0x80
0x6b17cf0408f    2f  49bae819000601000000 REX.W movq r10,0x1060019e8    ;; external reference (Isolate::c_entry_fp_address)
0x6b17cf04099    39  41ff32         push [r10]
0x6b17cf0409c    3c  48a1081a000601000000 REX.W movq rax,(0x106001a08)    ;; external reference (Isolate::js_entry_sp_address)
0x6b17cf040a6    46  4885c0         REX.W testq rax,rax
0x6b17cf040a9    49  0f8514000000   jnz 0x6b17cf040c3  <+0x63>
0x6b17cf040af    4f  6a02           push 0x2
0x6b17cf040b1    51  488bc5         REX.W movq rax,rbp
0x6b17cf040b4    54  48a3081a000601000000 REX.W movq (0x106001a08),rax    ;; external reference (Isolate::js_entry_sp_address)
0x6b17cf040be    5e  e902000000     jmp 0x6b17cf040c5  <+0x65>
0x6b17cf040c3    63  6a00           push 0x0
0x6b17cf040c5    65  e916000000     jmp 0x6b17cf040e0  <+0x80>
0x6b17cf040ca    6a  48a38819000601000000 REX.W movq (0x106001988),rax    ;; external reference (Isolate::pending_exception_address)
0x6b17cf040d4    74  498b8588000000 REX.W movq rax,[r13+0x88]
0x6b17cf040db    7b  e932000000     jmp 0x6b17cf04112  <+0xb2>
0x6b17cf040e0    80  49baf019000601000000 REX.W movq r10,0x1060019f0    ;; external reference (Isolate::handler_address)
0x6b17cf040ea    8a  41ff32         push [r10]
0x6b17cf040ed    8d  49baf019000601000000 REX.W movq r10,0x1060019f0    ;; external reference (Isolate::handler_address)
0x6b17cf040f7    97  498922         REX.W movq [r10],rsp
0x6b17cf040fa    9a  6a00           push 0x0
0x6b17cf040fc    9c  e8bf000000     call 0x6b17cf041c0  (JSEntryTrampoline)    ;; code: BUILTIN
0x6b17cf04101    a1  49baf019000601000000 REX.W movq r10,0x1060019f0    ;; external reference (Isolate::handler_address)
0x6b17cf0410b    ab  418f02         pop [r10]
0x6b17cf0410e    ae  4883c400       REX.W addq rsp,0x0
0x6b17cf04112    b2  5b             pop rbx
0x6b17cf04113    b3  4883fb02       REX.W cmpq rbx,0x2
0x6b17cf04117    b7  0f8511000000   jnz 0x6b17cf0412e  <+0xce>
0x6b17cf0411d    bd  49ba081a000601000000 REX.W movq r10,0x106001a08    ;; external reference (Isolate::js_entry_sp_address)
0x6b17cf04127    c7  49c70200000000 REX.W movq [r10],0x0
0x6b17cf0412e    ce  49bae819000601000000 REX.W movq r10,0x1060019e8    ;; external reference (Isolate::c_entry_fp_address)
0x6b17cf04138    d8  418f02         pop [r10]
0x6b17cf0413b    db  5b             pop rbx
0x6b17cf0413c    dc  415f           pop r15
0x6b17cf0413e    de  415e           pop r14
0x6b17cf04140    e0  415d           pop r13
0x6b17cf04142    e2  415c           pop r12
0x6b17cf04144    e4  4883c410       REX.W addq rsp,0x10
0x6b17cf04148    e8  5d             pop rbp
0x6b17cf04149    e9  c3             retl


Handler Table (size = 24)

RelocInfo (size = 23)
0x6b17cf04068  external reference (Isolate::context_address)  (0x106001978)
0x6b17cf04080  external reference (Heap::roots_array_start())  (0x106000048)
0x6b17cf04091  external reference (Isolate::c_entry_fp_address)  (0x1060019e8)
0x6b17cf0409e  external reference (Isolate::js_entry_sp_address)  (0x106001a08)
0x6b17cf040b6  external reference (Isolate::js_entry_sp_address)  (0x106001a08)
0x6b17cf040cc  external reference (Isolate::pending_exception_address)  (0x106001988)
0x6b17cf040e2  external reference (Isolate::handler_address)  (0x1060019f0)
0x6b17cf040ef  external reference (Isolate::handler_address)  (0x1060019f0)
0x6b17cf040fd  code target (BUILTIN)  (0x6b17cf041c0)
0x6b17cf04103  external reference (Isolate::handler_address)  (0x1060019f0)
0x6b17cf0411f  external reference (Isolate::js_entry_sp_address)  (0x106001a08)
0x6b17cf04130  external reference (Isolate::c_entry_fp_address)  (0x1060019e8)
    Object* orig_func = *new_target;
    Object* func = *target;
    Object* recv = *receiver;
    Object*** argv = reinterpret_cast<Object***>(args);
    if (FLAG_profile_deserialization && target->IsJSFunction()) {
      PrintDeserializedCodeInfo(Handle<JSFunction>::cast(target));
    }
    RuntimeCallTimerScope timer(isolate, &RuntimeCallStats::JS_Execution);
    value = CALL_GENERATED_CODE(isolate, stub_entry, orig_func, func, recv, argc, argv);

CALL_GENEREATED_CODE is defined in src/x64/simulator-x64.h:

#define CALL_GENERATED_CODE(isolate, entry, p0, p1, p2, p3, p4) \
  (entry(p0, p1, p2, p3, p4))

Now, stub_entry is of typeJSEntryFunction` which is a typedef

    typedef Object* (*JSEntryFunction)(Object* new_target, Object* target,
                                      Object* receiver, int argc,
                                      Object*** args);
    stub_entry(orig_func, func, recv, argc, argv)

    JSEntryFunction stub_entry = FUNCTION_CAST<JSEntryFunction>(code->entry());

From the comments about FUNCTION_CAST it is used to invoke generated code from within C:

// FUNCTION_CAST<F>(addr) casts an address into a function
// of type F. Used to invoke generated code from within C.
template <typename F>
F FUNCTION_CAST(Address addr) {
  return reinterpret_cast<F>(reinterpret_cast<intptr_t>(addr));
}

Lets take a look at the arguments to entry which are:

    stub_entry(orig_func, func, recv, argc, argv)

To follow the code in stub_entry we need to step-inst (instruction level step) but there is not debug information as this is generated code. What we can do is after every step call dissasemble --pc:

(lldb) target stop-hook add
Enter your stop hook command(s).  Type 'DONE' to end.
> disassemble --pc
> DONE
Stop hook #1 added.
(lldb) undisplay

Now we should be able to step into using:

(lldb) si
Process 72301 stopped
* thread #1: tid = 0xc2e41e, 0x000006b17cf04060, queue = 'com.apple.main-thread', stop reason = instruction step into
    frame #0: 0x000006b17cf04060
->  0x6b17cf04060: pushq  %rbp
    0x6b17cf04061: movq   %rsp, %rbp
    0x6b17cf04064: pushq  $0x2
    0x6b17cf04066: movabsq $0x106001978, %r10        ; imm = 0x106001978

Notice that the address matches that of the output from (lldb) job *code above:

0x6b17cf04060     0  55             push rbp
0x6b17cf04061     1  4889e5         REX.W movq rbp,rsp
0x6b17cf04064     4  6a02           push 0x2
0x6b17cf04066     6  49ba7819000601000000 REX.W movq r10,0x106001978    ;; external reference (Isolate::context_address)

V8 uses Intel's assembly syntax and lldb uses AT&T syntax so the src/dest arguments are switched around, and registers prefixes are not used in Intel syntax.

Start of (lldb) job *code:

0x6b17cf04060     0  55             push rbp
0x6b17cf04061     1  4889e5         REX.W movq rbp,rsp
0x6b17cf04064     4  6a02           push 0x2
0x6b17cf04066     6  49ba7819000601000000 REX.W movq r10,0x106001978    ;; external reference (Isolate::context_address)
0x6b17cf04070    10  4d8b12         REX.W movq r10,[r10]

I'm trying to line up the assembly code output with src/x64/code-stubs-x64.cc and JSEntryStub::Generate and see if I'm looking at the correct code:

push rbp __ pushq(rbp);
movq rpb, rsp __ movp(rbp, rsp);
push 0x2 -- Push(Immediate(StackFrame::TypeToMarker(type()))) This pushed the type of the stack frame which is an emmediate value 2. I think this value is taken from src/frames.h and StackFrame class which has an enum:

enum Type {
  NONE = 0,
  STACK_FRAME_TYPE_LIST(DECLARE_TYPE)
  NUMBER_OF_TYPES,
  // Used by FrameScope to indicate that the stack frame is constructed
  // manually and the FrameScope does not need to emit code.
  MANUAL
};

And the first in the STACK_FRAME_LIST is:

#define STACK_FRAME_TYPE_LIST(V)                                          \
  V(ENTRY, EntryFrame)                                                    \
  ...

movq r10,0x106001978 matches:

ExternalReference context_address(IsolateAddressId::kContextAddress, isolate());
__ Load(kScratchRegister, context_address);

kScratchRegister is r10 and we are moving 0x106001978 (the context_address) into register r10.

`movq r10,[r10]`   __ Load(kScratchRegister, context_address);  // get the pointer
`push r10`         __ Push(kScratchRegister);  // push the pointer onto the stack
`push r12`         __ pushq(r12);  
`push r13`         __ pushq(r13);  
`push r14`         __ pushq(r14);  
`push r15`         __ pushq(r15); 
`push rbx`         __ pushq(rbx);  

`movq r13,0x106000048`   __ InitializeRootRegister();  //  external reference (Heap::roots_array_start())
`addq r13,0x80`          __ Push(c_entry_fp_operand); 

`movq r10,0x1060019e8`   __ Load(rax, js_entry_sp);   ;; external reference (Isolate::c_entry_fp_address)
0x6b17cf04099    39  41ff32         push [r10]
0x6b17cf0409c    3c  48a1081a000601000000 REX.W movq rax,(0x106001a08)    ;; external reference (Isolate::js_entry_sp_address)
0x6b17cf040a6    46  4885c0         REX.W testq rax,rax
0x6b17cf040a9    49  0f8514000000   jnz 0x6b17cf040c3  <+0x63>

You might be wondering what that __ is, well it is a macro:

#define __ ACCESS_MASM(masm)

#define ACCESS_MASM(masm) masm->

So __ will get expanded to masm->pushq(rbp) for example. I think this is done so that it look more like assembly and not so much as C++/C.

0x6b17cf04060     0  55             push rbp                                 // prologue
0x6b17cf04061     1  4889e5         REX.W movq rbp,rsp                       // prologue
0x6b17cf04064     4  6a02           push 0x2                                 // Stack frame type = Context Address type
0x6b17cf04066     6  49ba7819000601000000 REX.W movq r10,0x106001978         // external reference (Isolate::context_address)
0x6b17cf04070    10  4d8b12         REX.W movq r10,[r10]                     // dereference the context_address pointer
0x6b17cf04073    13  4152           push r10                                 // push contents of r10 (ScratchRegister)
0x6b17cf04075    15  4154           push r12                                 // push contents of r12 argument passed
0x6b17cf04077    17  4155           push r13             
0x6b17cf04079    19  4156           push r14
0x6b17cf0407b    1b  4157           push r15
0x6b17cf0407d    1d  53             push rbx                                 // push
0x6b17cf0407e    1e  49bd4800000601000000 REX.W movq r13,0x106000048         // external reference (Heap::roots_array_start())
0x6b17cf04088    28  4981c580000000 REX.W addq r13,0x80                      // push e_entry_fp_operand
0x6b17cf0408f    2f  49bae819000601000000 REX.W movq r10,0x1060019e8         // external reference (Isolate::c_entry_fp_address)
0x6b17cf04099    39  41ff32         push [r10]                               // push the scratch targed used above
0x6b17cf0409c    3c  48a1081a000601000000 REX.W movq rax,(0x106001a08)       // external reference (Isolate::js_entry_sp_address)
0x6b17cf040a6    46  4885c0         REX.W testq rax,rax                      // check if rax is zero
0x6b17cf040a9    49  0f8514000000   jnz 0x6b17cf040c3  <+0x63>               // jump if not zero (ZF = 0)
0x6b17cf040af    4f  6a02           push 0x2                                 //StackFrame::OUTERMOST_JSENTRY_FRAME
0x6b17cf040b1    51  488bc5         REX.W movq rax,rbp                       // __ movp(rax, rbp)
0x6b17cf040b4    54  48a3081a000601000000 REX.W movq (0x106001a08),rax       // __ Store(js_entry_sp, rax)
0x6b17cf040be    5e  e902000000     jmp 0x6b17cf040c5  <+0x65> ---------+
0x6b17cf040c3    63  6a00           push 0x0                            |    // __ Push(Immediate(StackFrame::INNER_JSENTRY_FRAME));
    +-------------------------------------------------------------------+
    |
0x6b17cf040c5    65  e916000000     jmp 0x6b17cf040e0  <+0x80> ---------+    // __ jmp(&invoke);
0x6b17cf040ca    6a  48a38819000601000000 REX.W movq (0x106001988),rax  |    // __ Store(pending_exception, rax)
0x6b17cf040d4    74  498b8588000000 REX.W movq rax,[r13+0x88]           |    // __ LoadRoot(rax, Heap::kExceptionRootIndex);
0x6b17cf040db    7b  e932000000     jmp 0x6b17cf04112  <+0xb2>          |    // __ jump(&exit)
    +-------------------------------------------------------------------+
    |
0x6b17cf040e0    80  49baf019000601000000 REX.W movq r10,0x1060019f0         // external reference (Isolate::handler_address)
0x6b17cf040ea    8a  41ff32         push [r10]                               // push value of the pointer onto the stack
0x6b17cf040ed    8d  49baf019000601000000 REX.W movq r10,0x1060019f0         // PushStackHandler (Isolate::handler_address)
0x6b17cf040f7    97  498922         REX.W movq [r10],rsp                     // 
0x6b17cf040fa    9a  6a00           push 0x0                                 // RelocInfo::CODE_TARGET  ?
0x6b17cf040fc    9c  e8bf000000     call 0x6b17cf041c0  (JSEntryTrampoline)  //  code: BUILTIN
0x6b17cf04101    a1  49baf019000601000000 REX.W movq r10,0x1060019f0         // PopStackHandnler (Isolate::handler_address)
0x6b17cf0410b    ab  418f02         pop [r10]
0x6b17cf0410e    ae  4883c400       REX.W addq rsp,0x0
0x6b17cf04112    b2  5b             pop rbx
0x6b17cf04113    b3  4883fb02       REX.W cmpq rbx,0x2
0x6b17cf04117    b7  0f8511000000   jnz 0x6b17cf0412e  <+0xce>
0x6b17cf0411d    bd  49ba081a000601000000 REX.W movq r10,0x106001a08    ;; external reference (Isolate::js_entry_sp_address)
0x6b17cf04127    c7  49c70200000000 REX.W movq [r10],0x0
0x6b17cf0412e    ce  49bae819000601000000 REX.W movq r10,0x1060019e8    ;; external reference (Isolate::c_entry_fp_address)
0x6b17cf04138    d8  418f02         pop [r10]
0x6b17cf0413b    db  5b             pop rbx
0x6b17cf0413c    dc  415f           pop r15
0x6b17cf0413e    de  415e           pop r14
0x6b17cf04140    e0  415d           pop r13
0x6b17cf04142    e2  415c           pop r12
0x6b17cf04144    e4  4883c410       REX.W addq rsp,0x10
0x6b17cf04148    e8  5d             pop rbp
0x6b17cf04149    e9  c3             retl

0x6b17cf040fd  code target (BUILTIN)  (0x6b17cf041c0)

If we set a break point after CALL_GENERATED_CODE we will see that this code does return and a value is provided:

bool has_exception = value->IsException(isolate);
In this case there was no exceptions so:
isolate->clear_pending_message();

return Handle<Object>(value, isolate);
Handle<Object> result = custom.Call(callback);

RETURN_EXCEPTION_IF_SCHEDULED_EXCEPTION(isolate, Object);

which will expend to :

Isolate* __isolate__ = (isolate);                       
DCHECK(!__isolate__->has_pending_exception());        
if (__isolate__->has_scheduled_exception()) {        
  __isolate__->PromoteScheduledException();         
  return MaybeHandle<Object>();                                    
}

In this case there will be a scheduled_exception.

Object* Isolate::PromoteScheduledException() {
  Object* thrown = scheduled_exception();
  clear_scheduled_exception();
  // Re-throw the exception to avoid getting repeated error reporting.
  return ReThrow(thrown);
}
(lldb) job thrown
0x136bbb969d61: [JS_ERROR_TYPE]
 - map = 0x136bd865de81 [FastProperties]
 - prototype = 0x136bddd8d679
 - elements = 0x136ba4c82241 <FixedArray[0]> [HOLEY_SMI_ELEMENTS]
 - properties = 0x136bbb969d79 <PropertyArray[3]> {
    #stack: 0x136baa1ca9b9 <AccessorInfo> (const accessor descriptor)
    #message: 0x136b27c6e981 <String[18]: dummy setter error> (data field 0) properties[0]
    0x136ba4c87069 <Symbol: stack_trace_symbol>: 0x136bbb969f49 <JSArray[26]> (data field 1) properties[1]
 }

ReThrow will:

set_pending_exception(exception);
return heap()->exception();
``

Notes:
```c++
  RETURN_ON_FAILED_EXECUTION_PRIMITIVE(bool);

This macro is defined as:

#define RETURN_ON_FAILED_EXECUTION_PRIMITIVE(T) \
  EXCEPTION_BAILOUT_CHECK_SCOPED_DO_NOT_USE(isolate, Nothing<T>())

Which will expend to

  EXCEPTION_BAILOUT_CHECK_SCOPED_DO_NOT_USE(isolate, Nothing<bool>())

  #define EXCEPTION_BAILOUT_CHECK_SCOPED_DO_NOT_USE(isolate, Nothing<bool>) \
  do {                                                            \
    if (has_pending_exception) {                                  \
      call_depth_scope.Escape();                                  \
      return Nothing<bool>;                                       \
    }                                                             \
  } while (false)

So the last lines in v8::Object::Set function will be:

    if (has_pending_exception) {
      call_depth_scope.Escape();
      return Nothing<bool>;
    }                                                    
    return Just(true);

Setters

I'm guessing getters/setters are added using V8 Accessor (TODO: look at the v8 examples I have)

factory.h

When debugging you might come across a call that invokes a function in V8's factory, for example:

isolate->factory()->undefined_value();

Now, in the debugger you see something like:

ROOT_LIST(ROOT_ACCESSOR)

Lets take a closer look at the ROOT_LIST macro. It is defined in factory.h:

#define ROOT_ACCESSOR(type, name, camel_name)                         \
  inline Handle<type> name() {                                        \
    return Handle<type>(bit_cast<type**>(                             \
        &isolate()->heap()->roots_[Heap::k##camel_name##RootIndex])); \
  }
  ROOT_LIST(ROOT_ACCESSOR)
#undef ROOT_ACCESSOR

src/heap/heap.h:

#define STRONG_ROOT_LIST(V)
...
V(Oddball, undefined_value, UndefinedValue)

Which would expand to:

  inline Handle<Oddball> undefined_value() {  
    return Handle<Oddball>(bit_cast<Oddball**>(
        &isolate()->heap()->roots_[Heap::kUndefinedValueRootIndex]));

Recall that Oddball describes objects null, undefined, true, and false.

bit_cast can be found in src/base/macros.h.

bootstrap_node.js compilation and execution walkthrough

The goal of this section is to understand what happens when bootstrap_node.js is compiled and run mainly focusing on the V8 side of things.

    $ lldb -- out/Debug/node --print-ast
    (lldb) br s -f node::LoadEnvironment
    (lldb) r

Lets step through to the following line in node::ExecuteString:

Local<Value> f_value = ExecuteString(env, MainSource(env), script_name);

MainSource is a function in node_javascript.h which is used by node_js2c. This was documented earlier so I won't go into details about it now. You can see the content using:

(lldb) jlh MainSource(env)
Which is the same thing as calling:
(lldb) expr (*(v8::internal::Object**)*MainSource(env))->Print()

jlh can be found in the V8 source tree in tools/lldbinit.

ExecuteString is a function in node.cc which calls Compile:

MaybeLocal<v8::Script> script = v8::Script::Compile(env->context(), source, &origin);

Will delegate to ScriptCompiler::Compile, and then to ScriptCompiler::CompileUnboundInternal which can be found in deps/v8/src/api.cc:

MaybeLocal<UnboundScript> ScriptCompiler::CompileUnboundInternal(
   Isolate* v8_isolate, Source* source, CompileOptions options) {
     ...
     i::MaybeHandle<i::SharedFunctionInfo> maybe_function_info = i::Compiler::GetSharedFunctionInfoForScript(
        str, name_obj, line_offset, column_offset, source->resource_options,
        source_map_url, isolate->native_context(), NULL, &script_data,
        options, i::NOT_NATIVES_CODE, host_defined_options);
     

Compiler::GetSharedFunctionForScript in deps/v8/src/compiler.cc:

  ParseInfo parse_info(script);
  Zone compile_zone(isolate->allocator(), ZONE_NAME);
  ...
  maybe_result = CompileToplevel(&parse_info, isolate);

Lets take a look at parse_info:

(lldb) job *parse_info.script()
0x169ddd9abfb1: [Script] in OldSpace
 - source: 0x169df8f0d3d1 <Very long string[21923]>
 - name: 0x169df8f0d3a1 <String[17]: bootstrap_node.js>
 - line_offset: 0
 - column_offset: 0
 - type: 2
 - id: 14
 - context data: 0x169dd88022e1 <undefined>
 - wrapper: 0x169dd88022e1 <undefined>
 - compilation type: 0
 - line ends: 0x169dd88022e1 <undefined>
 - eval from shared: 0x169dd88022e1 <undefined>
 - eval from position: 0
 - shared function infos: 0x169dd8802251 <FixedArray[0]>

This will be passed to CompileToplevel:

  if (parse_info->literal() == nullptr && 
      !parsing::ParseProgram(parse_info, isolate)) {
  ...
  std::forward_list<std::unique_ptr<CompilationJob>> inner_function_jobs;
  std::unique_ptr<CompilationJob> outer_function_job(
      GenerateUnoptimizedCode(parse_info, isolate, &inner_function_jobs));
  ...

In this case

(lldb) expr parse_info->literal() == nullptr
(bool) $80 = true

First, parsing::ParseProgram will parse the JavaScript and produce the abstract syntax tree. ParseProgram:

result = parser.ParseProgram(isolate, info);
info->set_literal(result);

We can inspect parse_info after this function returns:

(lldb) expr parse_info->literal()->Print()
FUNC LITERAL at 0
. NAME <nil>
. INFERRED NAME 1

Now, this does not look like much but this is only the function literal:

(function(process) {
});

We can print the AST generated using:

(lldb) expr AstPrinter(isolate()).PrintProgram(parse_info()->literal())
(const char *) $56 = 0x0000000105016e20 "FUNC at 0\n. KIND 0\n. SUSPEND COUNT 0\n. NAME ""\n. INFERRED NAME ""\n. EXPRESSION STATEMENT at 284\n. . LITERAL "use strict"\n. EXPRESSION STATEMENT at 299\n. . ASSIGN at -1\n. . . VAR PROXY local[0] (0x10682f138) (mode = TEMPORARY) ".result"\n. . . FUNC LITERAL at 300\n. . . . NAME \n. . . . INFERRED NAME \n. . . . PARAMS\n. . . . . VAR (0x10681cd48) (mode = VAR) "process"\n. RETURN at -1\n. . VAR PROXY local[0] (0x10682f138) (mode = TEMPORARY) ".result"\n"

I've not figured out a good way to make this print nicely in lldb so using this is the more readable output:

[generating bytecode for function: ]
--- AST ---
FUNC at 0
. KIND 0
. SUSPEND COUNT 0
. NAME ""
. INFERRED NAME ""
. EXPRESSION STATEMENT at 284
. . LITERAL "use strict"
. EXPRESSION STATEMENT at 299
. . ASSIGN at -1
. . . VAR PROXY local[0] (0x10683f938) (mode = TEMPORARY) ".result"
. . . FUNC LITERAL at 300
. . . . NAME
. . . . INFERRED NAME
. . . . PARAMS
. . . . . VAR (0x10682d548) (mode = VAR) "process"
. RETURN at -1
. . VAR PROXY local[0] (0x10683f938) (mode = TEMPORARY) ".result"

Notice that we have use strict starting a character 284. This due to the comment that preceeds it. We can see that the function literal starts at 300 and that if we would have given it a name it would have shown up as NAME somename. We can also see that it takes a parameter named process

Next GenerateUnoptimizedCode will be called (deps/v8/src/compiler.cc):

  Compiler::EagerInnerFunctionLiterals inner_literals;
  if (!Compiler::Analyze(parse_info, &inner_literals)) {
    return std::unique_ptr<CompilationJob>();
  }
  std::unique_ptr<CompilationJob> outer_function_job(
      PrepareAndExecuteUnoptimizedCompileJob(parse_info, parse_info->literal(), isolate));

PrepareAndExecuteUnoptimizedCompileJob:

  if (job->PrepareJob() == CompilationJob::SUCCEEDED &&
      job->ExecuteJob() == CompilationJob::SUCCEEDED) {

PrepareJob() will print the ast as shown above. Lets take a closer look at ExecuteJob:

return UpdateState(ExecuteJobImpl(), State::kReadyToFinalize);

InterpreterCompilationJob::ExecuteJobImpl:

generator()->GenerateBytecode(stack_limit());

This will land in BytecodeGenerator::GenerateBytecode bytecode-generator.cc:906

GenerateBytecodeBody();

This function will visit all of the nodes in the AST and generate the bytecodes for them.

Later in FinalizeUnoptimizedCode:

outer_function_job->compilation_info()->set_shared_info(shared_info);
(lldb) expr shared_info->code()->Print()
0x1a00905c50e1: [Code]
kind = BUILTIN
name = CompileLazy
compiler = unknown
Instructions (size = 983)
0x1a00905c5140     0  488b5f2f       REX.W movq rbx,[rdi+0x2f]
0x1a00905c5144     4  488b5b07       REX.W movq rbx,[rbx+0x7]
0x1a00905c5148     8  493b5da0       REX.W cmpq rbx,[r13-0x60]
0x1a00905c514c     c  0f844c030000   jz 0x1a00905c549e  (CompileLazy)
...

Later in a call to InterpreterCompilationJob::FinalizeJobImpl will delegate to BytecodeGenerator::FinalizeBytecode where the the BytecodeArray is generated:

Handle<BytecodeArray> bytecode_array = builder()->ToBytecodeArray(isolate);

The bytecodes can be inspected using:

```console
(lldb) expr bytecodes->Print()
0x39d9bd7ade49: [BytecodeArray] in OldSpaceParameter count 1
Frame size 8
    0 E> 0x39d9bd7ade82 @    0 : 93                StackCheck
   15 S> 0x39d9bd7ade83 @    1 : 6f 00 00 00       CreateClosure [0], [0], #0
         0x39d9bd7ade87 @    5 : 1e fb             Star r0
21644 S> 0x39d9bd7ade89 @    7 : 97                Return
Constant pool (size = 1)
0x39d9bd7ade31: [FixedArray] in OldSpace
 - map = 0x39d9d04022f1 <Map(HOLEY_ELEMENTS)>
 - length: 1
           0: 0x39d9bd7add81 <SharedFunctionInfo bajja>
Handler Table (size = 16)

This bytecode array will be set on the compilation_info instance:

compilation_info()->SetBytecodeArray(bytecodes);

And the code will be set to InterpreterEntryTrampoline:

compilation_info()->SetCode(
    BUILTIN_CODE(compilation_info()->isolate(), InterpreterEntryTrampoline));
return SUCCEEDED;

This will return us into FinalizeUnoptimizedCompilationJob:

if (status == CompilationJob::SUCCEEDED) {
  InstallUnoptimizedCode(compilation_info);
  CodeEventListener::LogEventsAndTags log_tag;

InstallUnoptimizedCode will set up the FeedbackMetadata:

   Handle<FeedbackMetadata> feedback_metadata = FeedbackMetadata::New(
        compilation_info->isolate(),
        compilation_info->literal()->feedback_vector_spec());
    compilation_info->shared_info()->set_feedback_metadata(*feedback_metadata);

We can inspect feedback_metadata using:

(lldb) expr feedback_metadata->Print()
0x39d9bd7adea9: [FeedbackMetadata] in OldSpace
 - length: 2
 - slot_count: 1
 Slot #0 kCreateClosure

Back in InstallUnoptimizedCode:

shared->set_code(*compilation_info->code())
shared->set_bytecode_array(*compilation_info->bytecode_array());

This is setting the SharedFunctionInfo instance code to InterpreterEntryTrampoline and the bytecode array is also set that we saw before. After this control will be returned to FinalizeUnoptimizedCompilationJob and we go through all the inner_function_jobs and set the SharedFunctionInfo for them.

for (auto&& inner_job : *inner_function_jobs) {
    Handle<SharedFunctionInfo> inner_shared_info =
        Compiler::GetSharedFunctionInfo(
            inner_job->compilation_info()->literal(), parse_info->script(),
            isolate);
    if (inner_shared_info->is_compiled()) continue;
    inner_job->compilation_info()->set_shared_info(inner_shared_info);
    if (FinalizeUnoptimizedCompilationJob(inner_job.get()) !=
        CompilationJob::SUCCEEDED) {
      return false;
    }
  }

We can inspect inner_shared_info using:

(lldb) expr inner_shared_info->Print()
...
(lldb) expr inner_shared_info->code()->Print()
0x1a00905c50e1: [Code]
kind = BUILTIN
name = CompileLazy
...

After this the shared_info will be returned from CompileToplevel. Which will the return to Compiler::GetSharedFunctionInfoForScript:

maybe_result = CompileToplevel(&parse_info, isolate);
...
Handle<FeedbackVector> feedback_vector = FeedbackVector::New(isolate, result);
vector = isolate->factory()->NewCell(feedback_vector);
compilation_cache->PutScript(source, context, language_mode, result, vector);
(lldb) expr feedback_vector->Print()
0x265e1d0b03b1: [FeedbackVector] in OldSpace
 - length: 1
 SharedFunctionInfo: 0x265e1d0adae9 <SharedFunctionInfo>
 Optimized Code: 0
 Invocation Count: 0
 Profiler Ticks: 0
 Slot #0 kCreateClosure
  [0]: 0x265e1d0b03e1 <Cell value= 0x265eb5c022e1 <undefined>>

So we are backing out of the calls now and the next return will land us in CompileUnboundInternal:

has_pending_exception = !maybe_function_info.ToHandle(&result);

This will later return to ScriptCompiler::Compile:

auto maybe = CompileUnboundInternal(isolate, source, options);
...
v8::Context::Scope scope(context);
return result->BindToCurrentContext();
(lldb) expr maybe
(lldb) expr maybe.ToLocalChecked()->GetId()
(int) $415 = 14
(lldb) jlh maybe.ToLocalChecked()->GetScriptName()
"bootstrap_node.js"

An UnboundScript is a compiled JavaScript but it is not yet tied to a Context.

After this the compilation is finished and control will be returned to node.cc which will now run the script:

Local<Value> result = script.ToLocalChecked()->Run();

So we have compiled the script and now are are going to run it. Just to clarify something here, the script contains an expression (the surrounding ()) that defines a function. So we are not executing the startup function here but instead only only the expression that contains it.

Run will call Execution::Call, which will call CallInternal, which will call Invoke:

    typedef Object* (*JSEntryFunction)(Object* new_target, Object* target,
                                     Object* receiver, int argc,
                                     Object*** args);
    Handle<Code> code = is_construct
       ? isolate->factory()->js_construct_entry_code()
       : isolate->factory()->js_entry_code();
    ...
  
    // start of the function identified by code->entry() address.
    JSEntryFunction stub_entry = FUNCTION_CAST<JSEntryFunction>(code->entry());

    // Call the function through the right JS entry stub.
    Object* orig_func = *new_target;
    Object* func = *target;
    Object* recv = *receiver;
    Object*** argv = reinterpret_cast<Object***>(args);
    if (FLAG_profile_deserialization && target->IsJSFunction()) {
      PrintDeserializedCodeInfo(Handle<JSFunction>::cast(target));
    }
    RuntimeCallTimerScope timer(isolate, &RuntimeCallStats::JS_Execution);
    value = CALL_GENERATED_CODE(isolate, stub_entry, orig_func, func, recv, argc, argv);
  }

Notice that code will be what isolate->factory()->js_entry_code() returns:

(lldb) expr isolate->factory()->js_entry_code()->Print()
0xe6d84b84001: [Code]
kind = STUB
major_key = JSEntryStub
compiler = unknown
Instructions (size = 232)
0xe6d84b84060     0  55             push rbp
0xe6d84b84061     1  4889e5         REX.W movq rbp,rsp
0xe6d84b84064     4  6a02           push 0x2
...

I'm not showing the complete output above as this will be shown later in detail.

Next, we have CALL_GENERATED_CODE which can be found in src/x64/simulator-x64.h:

// Since there is no simulator for the x64 architecture the only thing we can
// do is to call the entry directly.
// TODO(X64): Don't pass p0, since it isn't used?
#efine CALL_GENERATED_CODE(isolate, entry, p0, p1, p2, p3, p4) \
  (entry(p0, p1, p2, p3, p4))

So this will be an call that looks like this:

entry(orig_func, func, recv, argc, argv);

If we use step instruction (si) we can follow the setup and calling of this function:

    0x100e381b4 <+1348>: movq   -0x118(%rbp), %rdx                  // move the value of address of local variable code->entry() into rdx
    0x100e381bb <+1355>: movq   -0x120(%rbp), %rdi                  // first argument which is local variable orig_func
->  0x100e381c2 <+1362>: movq   -0x128(%rbp), %rsi                  // second argument which is local variable func
    0x100e381c9 <+1369>: movq   -0x130(%rbp), %rcx                  // third argument which is local variable recv
    0x100e381d0 <+1376>: movl   -0x30(%rbp), %eax                   // fourth arg which is local variable argc 
    0x100e381d3 <+1379>: movq   -0x138(%rbp), %r8                   // fifth argument which is local variable argv
    0x100e381da <+1386>: movq   %rdx, -0x1c0(%rbp)                  // move code->entry from rdx into local variable 
    0x100e381e1 <+1393>: movq   %rcx, %rdx                          // move recv into rdx
    0x100e381e4 <+1396>: movl   %eax, %ecx                          // move argc into ecx
    0x100e381e6 <+1398>: movq   -0x1c0(%rbp), %r9                   // move code-entry into register r9
    0x100e381ed <+1405>: callq  *%r9                                // call address in register r9

Below we verify the contents of the above instructions:

(lldb) memory read -f x -c 1 -s 8 `$rbp - 0x118`
0x7fff5fbfd828: 0x0000302f34084060
(lldb) expr code->entry()
(byte *) $186 = 0x0000302f34084060

(lldb) memory read -f x -c 1 -s 8 `$rbp - 0x120`
0x7fff5fbfd820: 0x00001d64e5d822e1
(lldb) expr orig_func
(v8::internal::Object *) $209 = 0x00001d64e5d822e1

(lldb) memory read -f x -c 1 -s 8 `$rbp - 0x128`
0x7fff5fbfd818: 0x00001d6417d30669
(lldb) expr func
(v8::internal::Object *) $211 = 0x00001d6417d30669

(lldb) memory read -f x -c 1 -s 8 `$rbp - 0x130`
0x7fff5fbfd810: 0x00001d64f0a82239
(lldb) expr recv
(v8::internal::Object *) $213 = 0x00001d64f0a82239

(lldb) memory read -f x -c 1 -s 4 `$rbp - 0x30`
0x7fff5fbfd910: 0x00000000
(lldb) expr argc
(int) $216 = 0

(lldb) memory read -f x -c 1 -s 8 `$rbp - 0x138`
0x7fff5fbfd808: 0x0000000000000000
(lldb) expr argv
(v8::internal::Object ***) $219 = 0x0000000000000000

(lldb) memory read -f x -c 1 -s 8 `$rbp - 0x1c0`
0x7fff5fbfd780: 0x0000302f34084060

The last instruction callq *%r9 will call into JSEntryStub:

->  0x302f34084060: pushq  %rbp                                  // push callers base frame pointer saving it so we can restore it
(lldb) memory read -f x -c 1 -s 8 `$rsp`                         // inspect the stack
    0x302f34084061: movq   %rsp, %rbp                            // mov the current value of rsp to into rbp which will be the frame pointer for this function
    0x302f34084064: pushq  $0x2                                  // this is pushing an immediate value 2 onto the stack. Where does this come from, the following:
(lldb) expr v8::internal::StackFrame::MarkerToType(2)
(v8::internal::StackFrame::Type) $494 = ENTRY
v8::internal::StackFrame::Type::ENTRY))
(int32_t) $262 = 2
(lldb) memory read -f x -c 2 -s 8 `$rsp`                         // inspect the stack 
    0x302f34084066: movabsq $0x106001990, %r10                   // move the context address into r10
(lldb) up 2
(lldb) expr isolate->isolate_addresses_[IsolateAddressId::kContextAddress]
(v8::internal::Address) $230 = 0x0000000106001990 
    0x302f34084070: movq   (%r10), %r10                          // the context address is a pointer, this will dereference it
(lldb) memory read -f x -c 1 -s 8 `isolate->isolate_addresses_[IsolateAddressId::kContextAddress]`
0x106001990: 0x00001d6417d03a59
(lldb) register read r10
     r10 = 0x00001d6417d03a59
     0x302f34084073: pushq  %r10                                // push the dereferences context onto the stack
     0x302f34084075: pushq  %r12                                // r12 must be preserved accross function calls so save it and pop it later before returning
     0x302f34084077: pushq  %r13                                // r13 must be preserved accross function calls so save it and pop it later before returning
     0x302f34084079: pushq  %r14                                // r14 must be preserved accross function calls so save it and pop it later before returning
     0x302f3408407b: pushq  %r15                                // r14 must be preserved accross function calls so save it and pop it later before returning
     0x302f3408407d: pushq  %rbx                                // rbx must be preserved accross function calls so save it and pop it later before returning
     0x302f3408407e: movabsq $0x106000048, %r13                 // move the value of the roots_array_start into r13
(lldb) expr isolate->heap()->roots_array_start()
(v8::internal::Object **) $233 = 0x0000000106000048
     0x302f34084088: addq   $0x80, %r13                         // addp(kRootRegister, Immediate(kRootRegisterBias)); kRootRegisterBias is 128
     0x302f3408408f: movabsq $0x106001a00, %r10                 // move he CEntryFPAddress into r10
(lldb) memory read -f x -c 1 -s 8 isolate->isolate_addresses_[IsolateAddressId::kCEntryFPAddress]
0x106001a00: 0x0000000000000000
    0x302f34084099: pushq  (%r10)                               // dereference CEntryAddress and push onto the stack
(lldb) memory read -f x -c 1 -s 8 0x0000000106001a00
0x106001a00: 0x0000000000000000
    0x302f3408409c: movabsq 0x106001a20, %rax                   // move JSEntrySPAddress into rax
(lldb) memory read -f x -c 1 -s 8 isolate->isolate_addresses_[IsolateAddressId::kJSEntrySPAddress]
0x106001a20: 0x0000000000000000 
    0x302f340840a6: testq  %rax, %rax                           // is rax zero? if so there is an outer js call
(lldb) register read rax
     rax = 0x0000000000000000
    0x302f340840af: pushq  $0x2                                 // v8::internal::StackFrame::OUTERMOST_JSENTRY_FRAME))
    0x302f340840b1: movq   %rbp, %rax                           // move this functions base pointer into rax 
    0x302f340840b4: movabsq %rax, 0x106001a20                   // store the base pointer in isolate_addresses_[IsolateAddressId::kJSEntrySPAddress]
(lldb) memory read -f x -c 1 -s 8 `isolate->isolate_addresses_[IsolateAddressId::kJSEntrySPAddress]`
0x106001a20: 0x0000000000000000
after
(lldb) memory read -f x -c 1 -s 8 `isolate->isolate_addresses_[IsolateAddressId::kJSEntrySPAddress]`
0x106001a20: 0x00007fff5fbfd760
(lldb) register read rbp
     rbp = 0x00007fff5fbfd940
    0x302f340840be: jmp    0x302f340840c5
    0x302f340840c5: jmp    0x302f340840e0
    0x302f340840e0: movabsq $0x106001a08, %r10                  // move isolate_addresses_[IsolateAddressId::kHandlerAddress]` into r10
(lldb) memory read -f x -c 1 -s 8 `isolate->isolate_addresses_[IsolateAddressId::kHandlerAddress]`
0x106001a08: 0x0000000000000000
    0x302f340840ea: pushq  (%r10)                               // push the dereferenced handler address
    0x302f340840ed: movabsq $0x106001a08, %r10                  // move isolate_addresses_[IsolateAddressId::kHandlerAddress]` into r10
    0x302f340840f7: movq   %rsp, (%r10)                         // move the value of the current stack pointer into the object pointed to be r10
(lldb) memory read -f x -c 1 -s 8 0x106001a08
0x106001a08: 0x00007fff5fbfd710
(lldb) register read rsp
     rsp = 0x00007fff5fbfd710
    0x302f340840fa: callq  0x302f341418c0                       // call JSEntryTrampoline builtin
(lldb) expr isolate->builtins()->builtin_handle(Builtins::Name::kJSEntryTrampoline)->entry()
(byte *) $255 = 0x0000302f341418c0 
      

In deps/v8/src/builtins/x64/builtins-x64.cc we can find Generate_JSEntryTrampolineHelper which is what generates the builtin. As this is done a compile time we can put a break point in it (you can debug mksnapshot though which is done elsewhere in this document).

    0x302f341418c0: movq   %rdi, %r11                           // move the orig_fun into r11
    0x302f341418c3: movq   %rsi, %rdi                           // move func into rdi
    0x302f341418c6: xorl   %esi, %esi                           // zero out esi
    0x302f341418c8: pushq  %rbp                                 // EnterFrame (deps/v8/src/x64/macro-assembler-x64.cc)
    0x302f341418c9: movq   %rsp, %rbp                           // 
    0x302f341418cc: pushq  $0x1c                                // push 0x1c (decimal 28)
(lldb) p v8::internal::StackFrame::TypeToMarker(static_cast<v8::internal::StackFrame::Type>(v8::internal::StackFrame::Type::INTERNAL))
(int32_t) $256 = 28
    0x302f341418ce: movabsq $0x302f34141861, %r10               // CodeObject() what is this, seems like it is Handle<HeapObject> which will be patched later
                                                                // I think this might be the register file?
(lldb) memory read -f x -c 1 -s 8 0x302f34141861
0x302f34141861: 0x6900001d64ceb827
    0x302f341418d8: pushq  %r10                                 // push the HandleHeapObject into the stack
    0x302f341418da: movabsq $0x1d64e5d822e1, %r10               // move undefined value into r10 
(lldb) expr *isolate->factory()->undefined_value()
(v8::internal::Oddball *) $264 = 0x00001d64e5d822e1
    0x302f341418e4: cmpq   %r10, (%rsp)
    0x302f341418e8: jne    0x302f341418fa                       // last of EnterFrame if we don't abort that is
    0x302f341418fa: movabsq $0x106001990, %r10                  // move the ContextAdress into r10 (the scratch register for x86)
(lldb) memory read -f x -c 1 -s 8 `isolate->isolate_addresses_[IsolateAddressId::kContextAddress]`
0x106001990: 0x00001d6417d03a59
    0x302f34141904: movq   (%r10), %rsi                         // deref and move into context into rsi
    0x302f34141907: pushq  %rdi                                 // push function onto the stack
    0x302f34141908: pushq  %rdx                                 // push recv onto the stack (this was moved above with 0x302f34141908: pushq  %rdx)
    0x302f34141909: movq   %rcx, %rax                           // move argc into rax (this was moved above with 0x100e381e4 <+1396>: movl   %eax, %ecx)
    0x302f3414190c: movq   %r8, %rbx                            // move argv into rbx
    0x302f3414190f: movq   %r11, %rdx                           // move orig_func into rdx
// Generate_CheckStackOverflow  TODO: got through this as I struggled to understand/map the generated instructions to the source code :( 
    0x302f34141912: movq   0xd08(%r13), %r10                    // 
    0x302f34141919: movq   %rsp, %rcx
    0x302f3414191c: subq   %r10, %rcx
    0x302f3414191f: movq   %rax, %r11
    0x302f34141922: shlq   $0x3, %r11
    0x302f34141926: cmpq   %r11, %rcx
    0x302f34141929: jg     0x302f34141940
// Generate_JSEntryTrampolineHelper
    0x302f34141940: xorl   %ecx, %ecx                           // zero out ecx, for the following loop
    0x302f34141942: jmp    0x302f3414194f                       // jump to entry label
    0x302f3414194f: cmpq   %rax, %rcx                           // compare argc with rcx (this was moved above). 
    0x302f34141952: jne    0x302f34141944                       // entry the loop if. This is pushing the argv values onto the stack in a loop, but we don't have any so we fall through
    0x302f34141954: callq  0x302f3413c400                       //
(lldb) expr isolate->builtins()->Call(static_cast<ConvertReceiverMode>(ConvertReceiverMode::kAny))->entry()
(byte *) $279 = 0x0000302f3413c400 "@\xfffffff6\xffffffc7\x01\x0f\xffffff84F"
// for the full object with assembly language instructions:
(lldb) job *isolate->builtins()->Call(static_cast<ConvertReceiverMode>(ConvertReceiverMode::kAny))
// Builtins::Generate_Call_ReceiverIsAny 'deps/v8/src/builtins/builtins-call-gen.cc':
// void Builtins::Generate_Call_ReceiverIsAny(MacroAssembler* masm) {
//   Generate_Call(masm, ConvertReceiverMode::kAny);
// }
// Builtins::Generate_Call `deps/v8/src/builtins/x64/builtins-x64.cc`
    0x302f3413c400: testb  $0x1, %dil                           // move the immediate 1 into the low 8 bits or rdi  // __ JumpIfSmi(rdi, &non_callable); which consists of CheckSmi
    0x302f3413c404: je     0x302f3413c450                       // if ZF = 1 then jump. This would happen if rdi was a SMI// __ JumpIfSmi(rdi, &non_callable: which after CheckSmi will jump. rdi is the target and not a smi in our case
// recall that rdi is the function:
(lldb) register read rdi
     rdi = 0x00001d6417d30669
(lldb) expr func
(v8::internal::Object *) $280 = 0x00001d6417d30669

Now I think that func is/was of type JSFunction (deps/v8/src/objects.h) as it was cast to Object* by:

auto fun = i::Handle<i::JSFunction>::cast(Utils::OpenHandle(this));
class JSFunction: public JSObject {
 public:
  // [prototype_or_initial_map]:
  DECL_ACCESSORS(prototype_or_initial_map, Object)
    0x302f3413c40a: movq   -0x1(%rdi), %rcx                      // move the map into rcx. CmpObjectType(rdi, JS_FUNCTION_TYPE, rcx). HeapObject::kMapOffset which is the first field in JSFunction:
(lldb) memory read -f x -c 1 -s 8 `$rdi - 1`
0x1d6417d30668: 0x00001d6433c82521
(lldb) expr JSFunction::cast(func)->prototype_or_initial_map()
(v8::internal::Object *) $286 = 0x00001d64e5d82321
    0x302f3413c40e: cmpb   $-0x1, 0xb(%rcx)                      // CmpObjectType(rdi, JS_FUNCTION_TYPE, rcx) calls CmpInstanceType. Check if the HeapObject is a JSFunction
    0x302f3413c412: je     0x302f3413be20                        // __ j(equal, masm->isolate()->builtins()->CallFunction(mode), RelocInfo::CODE_TARGET);
                                                                 // I was not sure where to find this `j` function but it is in src/x64/assembler-x64.cc (Assembler::j but note
                                                                 // that there are multiple overloaded j functions so make sure  you are looking at the correct one. There is 
                                                                 // as section that discusses Assembler::j in detail later in this document.
                                                                 // so what are we jumping to? We can back up in the debugger and find out:
                                                                 // (lldb) up 2
                                                                 // (lldb job *isolate->builtins()->CallFunction(static_cast<ConvertReceiverMode>(ConvertReceiverMode::kAny))
                                                                 // This will be a builtin named CallFunction_ReceiverIsAny, and being a builtin you'll find it in 
                                                                 // `src/builtins/builtins-definitions.h`. The implementation will be in `src/builtins/builtins-call.cc` and
                                                                 // will result in `return builtin_handle(kCallFunction_ReceiverIsAny)`. This code repsonsible for generating
                                                                 // is Builtins::Generate_CallFunction and in our case that means `src/builtins/x64/builtins-x64.cc`
(lldb) expr isolate->builtins()->CallFunction(static_cast<ConvertReceiverMode>(ConvertReceiverMode::kAny))->entry()
(byte *) $317 = 0x0000302f3413be20
// Builtins::Generate_CallFunction (deps/v8/src/builtins/x64/builtins-x64.cc)
    0x302f3413be20: testb  $0x1, %dil                            // AssertFunction (deps/v8/src/x64/macro-assembler-x64.cc) check if rdi is of type smi (testb(object, Immediate(kSmiTagMask));)
                                                                 // rdi is the function to call:
(lldb) register read rdi
     rdi = 0x00001d6417d30669
(lldb) expr JSFunction::cast(func)
(v8::internal::JSFunction *) $320 = 0x00001d6417d30669
    0x302f3413be24: jne    0x302f3413be36                        // this is also generated by AssertFunciton and the call to Assembler::j If not equal this code will fall through
    0x302f3413be36: pushq  %rdi                                  // AssertFunction still, push rdi (the function) onto the stack
    0x302f3413be37: movq   -0x1(%rdi), %rdi                      // move the map into rdi. CmpObjectType(object, JS_FUNCTION_TYPE, object). HeapObject::kMapOffset which is the first field in JSFunction
    0x302f3413be3b: cmpb   $-0x1, 0xb(%rdi)                      // CmpObjectType(rdi, JS_FUNCTION_TYPE, rcx) calls CmpInstanceType. Check if the HeapObject is a JSFunction
    0x302f3413be3f: popq   %rdi                                  // pop function from stack into rdi again
    0x302f3413be40: je     0x302f3413be52                        // jump if equal will jump to a L label and return from the Check call and then return from AssertFunction
// Builtins::Generate_CallFunction (deps/v8/src/builtins/x64/builtins-x64.cc) 
    0x302f3413be52: movq   0x1f(%rdi), %rdx                      // move the SharedFunctionInfo into rdx:
(lldb) memory read -f x -c 1 -s 8 `$rdi + 0x1f`
0x1d6417d30688: 0x00001d6417d2dc21
(lldb) expr JSFunction::cast(func)->shared()
(v8::internal::SharedFunctionInfo *) $325 = 0x00001d6417d2dc21
    0x302f3413be56: testb  $-0x20, 0x87(%rdx)                    //  testl(FieldOperand(rdx, SharedFunctionInfo::kCompilerHintsOffset)
(lldb) expr JSFunction::cast(func)->shared()->compiler_hints()
(int) $332 = 1056770
(lldb) memory read -f dec -c 1 -s 4 `($rdx + 0x87)`
0x1d6417d2dca8: 1056770
    0x302f3413be5d: jne    0x302f3413bfbc                        // __ j(not_zero, &class_constructor);
    0x302f3413be63: movq   0x27(%rdi), %rsi                      // move the functions context info rsi:
(lldb) memory read -f x -c 1 -s 8 `$rdi + 0x27`
0x1d6417d30690: 0x00001d6417d03a59
(lldb) expr JSFunction::cast(func)->context()
(v8::internal::Context *) $345 = 0x00001d6417d03a59
    0x302f3413be67: testb  $0x3, 0x87(%rdx)                      // check SharedFunctionInfo::IsNativeBit::kMask | SharedFunctionInfo::IsStrictBit::kMask 
    0x302f3413be6e: jne    0x302f3413bf14                        // 
    0x302f3413bf14: movslq 0x73(%rdx), %rbx                      // __ movsxlq(rbx, FieldOperand(rdx, SharedFunctionInfo::kFormalParameterCountOffset))
    0x302f3413bf18: movabsq $0x105300b92, %r10                   // move hook_on_function_call_address into scratch register; part of CheckDebugHook
(lldb) expr isolate->debug()->hook_on_function_call_address()
(v8::internal::Address) $353 = 0x0000000105300b92
    0x302f3413bf22: cmpb   $0x0, (%r10)                          // how is this generate? In CheckDebugHook I can only find cmpb(debug_hook_active_operand, Immediate(0)); but not he previous moveabsq
    0x302f3413bf26: je     0x302f3413bfa4                        // will jump to the label at the end of CheckDebugHook
// MacroAssembler::InvokeFunction
    0x302f3413bfa4: movq   -0x60(%r13), %rdx                     // move the UndefinedValueRootIndex into rdx generated by LoadRoot(rdx, Heap::kUndefinedValueRootIndex);
(lldb) memory read -f x -c 1 -s 8 `$r13 - 0x60`
0x106000068: 0x00001d64e5d822e1
lldb) expr isolate->heap()->roots_[Heap::RootListIndex::kUndefinedValueRootIndex]
(v8::internal::Object *) $354 = 0x00001d64e5d822e1
// InvokePrologue(expected, actual, &done, &definitely_mismatches, flag, Label::kNear)
    0x302f3413bfa8: cmpq   %rax, %rbx                            // Set(rax, actual.immediate()); 
    0x302f3413bfab: je     0x302f3413bfb2                        // will return from InvokePrologue
    0x302f3413bfb2: movq   0x37(%rdi), %rcx                      // move the function code into rcx (movp(rcx, FieldOperand(function, JSFunction::kCodeOffset)))
(lldb) memory read -f x -c 1 -s 8 `$rdi + 0x37`
0x1d6417d306a0: 0x0000302f34144281
(lldb) expr JSFunction::cast(func)->code()
(v8::internal::Code *) $358 = 0x0000302f34144281
    0x302f3413bfb6: addq   $0x5f, %rcx                           // addp(rcx, Immediate(Code::kHeaderSize - kHeapObjectTag)); the instruction start follows the Code object header
(lldb) job JSFunction::cast(func)->code()
0x302f34144281: [Code]
yind = BUILTIN
name = InterpreterEntryTrampoline
compiler = unknown
Instructions (size = 1004)
0x302f341442e0     0  488b5f2f       REX.W movq rbx,[rdi+0x2f]
(lldb) memory read -f x -c 1 -s8 `$rcx + 0x5f`
0x302f341442e0: 0x075b8b482f5f8b48
// notice that rcx now point to the first instruction.
    0x302f3413bfba: jmpq   *%rcx                                // now jump to the first instruction of function :) 

    0x302f341442e0: movq   0x2f(%rdi), %rbx                     // move the feedback vector into rbx
(lldb) memory read -f x -c 1 -s 8 `$rdi + 0x2f`
0x1d6417d30698: 0x00001d6417d306e9
(lldb) expr JSFunction::cast(func)->feedback_vector()
(v8::internal::FeedbackVector *) $363 = 0x00001d6417d306a9
(lldb) job JSFunction::cast(func)->feedback_vector()
0x1d6417d306a9: [FeedbackVector] in OldSpace
 - length: 1
 SharedFunctionInfo: 0x1d6417d2dc21 <SharedFunctionInfo>
 Optimized Code: 0
 Invocation Count: 0
 Profiler Ticks: 0
 Slot #0 kCreateClosure
  [0]: 0x1d6417d306d9 <Cell value= 0x1d64e5d822e1 <undefined>>

    302f341442e4: movq   0x7(%rbx), %rbx                     // move Slot[0] from the feedback vector into rbx
(lldb) memory read -f x -c 1 -s 8 `$rbx + 0x7`
0x1d6417d306f0: 0x00001d6417d306a9
// MaybeTailCallOptimizedCodeSlot(masm, feedback_vector, rcx, r14, r15);
// MaybeTailCallOptimizedCodeSlot(MacroAssembler* masm, Register feedback_vector, Register scratch1, Register scratch2, Register scratch3)
// Register closure = rdi;
// Register optimized_code_entry = rcx;
    0x302f341442e8: movq   0xf(%rbx), %rcx                     // __ movp(optimized_code_entry, FieldOperand(feedback_vector, FeedbackVector::kOptimizedCodeOffset));
(lldb) memory read -f x -c 1 -s 8 `$rbx + 0xf`
0x1d6417d306b8: 0x0000000000000000
(lldb) expr JSFunction::cast(func)->feedback_vector()->optimized_code()
(v8::internal::Code *) $367 = 0x0000000000000000
    0x302f341442ec: testb  $0x1, %cl                           // smi test against low 8 bit of rcx called from JumpIfNotSmi -> CheckSmi
    0x302f341442ef: jne    0x302f34144486                      // JumpIfNotSmi -> Assembler::j 
    0x302f341442f5: testb  $0x1, %cl                           // test again but this time from MacroAssembler::SmiCompare and its call to AssertSmi which calls CheckSmi
    0x302f341442f8: je     0x302f3414430a                      // SmiCompare -> AssertSmi -> Check

Look into static inline Code* GetCodeFromTargetAddress(Address address); which could possible be used to get the code of an address which could be useful when debugging and you only have the address that is being jumped to.

(lldb) expr JSFunction::cast(func)->code()
(v8::internal::Code *) $477 = 0x00000e6d84c44281

(lldb) expr JSFunction::cast(func)->shared()->abstract_code()->Print()
0x265e1d0ade49: [BytecodeArray] in OldSpaceParameter count 1
Frame size 8
    0 E> 0x265e1d0ade82 @    0 : 93                StackCheck
   15 S> 0x265e1d0ade83 @    1 : 6f 00 00 00       CreateClosure [0], [0], #0
         0x265e1d0ade87 @    5 : 1e fb             Star r0
21644 S> 0x265e1d0ade89 @    7 : 97                Return
Constant pool (size = 1)
0x265e1d0ade31: [FixedArray] in OldSpace
 - map = 0x265ed93822f1 <Map(HOLEY_ELEMENTS)>
 - length: 1
           0: 0x265e1d0add81 <SharedFunctionInfo bajja>
Handler Table (size = 16)

Side note on the interpreters dispatch_table_ ...
Every isolate has a interpreter as a member. An interpreter has a dispatch_table_ array of Address's (byte*):
```console
(lldb) expr isolate->interpreter()->dispatch_table_
(Address [768]) $292 = {

Notice that these are addresses which are index by Bytecode's. For example lets take a look the address for CreateClosure:

(lldb) expr isolate->interpreter()->GetBytecodeHandler(static_cast<Bytecode>(Bytecode::kCreateClosure),static_cast<OperandScale>(1))->Print()

How is `dispatch_table_` populated?  
This is done in the `Heap::IterateStrongRoots` function:
```c++
isolate_->interpreter()->IterateDispatchTable(v);

Which is called by StartupDeserializer::DeserializeInto:

isolate->heap()->IterateStrongRoots(this, VISIT_ONLY_STRONG_ROOT_LIST);
isolate->heap()->IterateSmiRoots(this);
isolate->heap()->IterateStrongRoots(this, VISIT_ONLY_STRONG);

which is called by Isolate::Init:

if (!create_heap_objects) des->DeserializeInto(this);

PrepareAndExecuteUnoptimizedCompileJob deps/v8/src/compiler.cc

0x169cbdb41e04  external reference (Runtime::StackGuard)  (0x101440100)
0x169cbdb41e0d  code target (STUB)  (0x169cbda84740)

If you look closely the first instruction is just setting rax to zero using xor. Next, we are pushing the pointer to the function, in this case Runtime::StackGuard into rbx. We then jump to CEntryStub.

What is Runtime::StackGuard?
Well, it is defined in a macro in src/runtime/runtime.h:

...
F(StackGuard, 0, 1)
...
#define FOR_EACH_INTRINSIC(F)         \
  FOR_EACH_INTRINSIC_RETURN_PAIR(F)   \
  FOR_EACH_INTRINSIC_RETURN_OBJECT(F)


#define F(name, nargs, ressize)                                 \
  Object* Runtime_##name(int args_length, Object** args_object, \
                         Isolate* isolate);
FOR_EACH_INTRINSIC_RETURN_OBJECT(F)
#undef F

StackGuard is included in FOR_EACH_INTRINSIC_INTERNAL which is included by FOR_EACH_INTRINSIC_RETURN_OBJECT. So that should expand to:

Object* Runtime_StackGuard(int args_lentgh, Object** args_object, Isolate* isolate);

If we take a look at src/runtime/runtime.h we can find:

static const Runtime::Function kIntrinsicFunctions[] = {
  FOR_EACH_INTRINSIC(F)
  FOR_EACH_INTRINSIC(I)
};

And the indexes into this array are in the enum Runtime::FunctionId:

(lldb) expr kIntrinsicFunctions[v8::internal::Runtime::FunctionId::kStackGuard]
(const Function) $371 = (function_id = kStackGuard, intrinsic_type = RUNTIME, name = "StackGuard", entry = "UH\xffffff89\xffffffe5H\xffffff83\xffffffecP\xffffff89}\xfffffff4H\xffffff89u\xffffffe8H\xffffff89U\xffffffe0H\xffffff8b}\xffffffe0\xffffffe8\xffffffb4d\x04\xffffffff\xffffffb1\x01H\xffffff83\xfffffff8", nargs = '\0', result_size = '\x01')

There is also a function named Runtime::FunctionForId which can be used:

(lldb) expr v8::internal::Runtime::FunctionForId(static_cast<v8::internal::Runtime::FunctionId>(v8::internal::Runtime::FunctionId::kStackGuard))
(const Function *) $1058 = 0x00000001029f0340
(lldb) expr v8::internal::Runtime::FunctionForId(static_cast<v8::internal::Runtime::FunctionId>(v8::internal::Runtime::FunctionId::kStackGuard))->name
(const char *const) $1059 = 0x0000000101c8be27 "StackGuard"
(lldb) expr v8::internal::Runtime::FunctionForId(static_cast<v8::internal::Runtime::FunctionId>(v8::internal::Runtime::FunctionId::kStackGuard))->nargs
(int8_t) $1060 = '\0'
(lldb) expr v8::internal::Runtime::FunctionForId(static_cast<v8::internal::Runtime::FunctionId>(v8::internal::Runtime::FunctionId::kStackGuard))->intrinsic_type
(const IntrinsicType) $1061 = RUNTIME
(lldb) expr v8::internal::Runtime::FunctionForId(static_cast<v8::internal::Runtime::FunctionId>(v8::internal::Runtime::FunctionId::kStackGuard))->entry
(v8::internal::Address) $1063 = 0x000000010142de70 "UH\x89�H��P�}�H�u�H�U�H�}���\x14\x04��\x01H\x83�

If nargs is -1 then the funnction takes a variable number of arguments. We can also disassemble the entry address using:

(lldb) dis -s 0x000000010142de70
node`v8::internal::Runtime_StackGuard:
    0x10142de70 <+0>:  pushq  %rbp
    0x10142de71 <+1>:  movq   %rsp, %rbp
    0x10142de74 <+4>:  subq   $0x50, %rsp
    0x10142de78 <+8>:  movl   %edi, -0xc(%rbp)
    0x10142de7b <+11>: movq   %rsi, -0x18(%rbp)
    0x10142de7f <+15>: movq   %rdx, -0x20(%rbp)
    0x10142de83 <+19>: movq   -0x20(%rbp), %rdi
    0x10142de87 <+23>: callq  0x10046f360               ; v8::internal::Isolate::context at isolate.h:596
    0x10142de8c <+28>: movb   $0x1, %cl

Now that we know this we can disassemble the complete function using:

(lldb) dis -n v8::internal::Runtime_StackGuard

As well as any other runtime function we might be interested in later. How are these stored?
They are stored in a global named kIntrinsicFunctions:

(lldb) target variable kIntrinsicFunctions
(lldb) expr kIntrinsicFunctions[v8::internal::Runtime::FunctionId::kStackGuard]
(const v8::internal::Runtime::Function) $227 = (function_id = kStackGuard, intrinsic_type = RUNTIME, name = "StackGuard", entry = "UH\xffffff89\xffffffe5H\xffffff83\xffffffecP\xffffff89}\xfffffff4H\xffffff89u\xffffffe8H\xffffff89U\xffffffe0H\xffffff8b}\xffffffe0\xffffffe8td\x04\xffffffff\xffffffb1\x01H\xffffff83\xfffffff8", nargs = '\0', result_size = '\x01')
(lldb) expr kIntrinsicFunctions[Builtins::Name::kStackCheck].entry
(v8::internal::Address) $229 = 0x0000000101440100 "UH\x89�H��P�}�H�u�H�U�H�}��td\x04��\x01H\x83�

When is this array populated?
Being a global it is part of the data section in the object file and will be loaded into memory upon start up.

Just to recap, first the address of Runtime_StackGuard will be moved into register rbx, and then we will jump CEntryStub.

(lldb) job *isolate->builtins()->builtin_handle(Builtins::Name::kStackCheck)
0x302f34141da1: [Code]
kind = BUILTIN
name = StackCheck
compiler = unknown
Instructions (size = 17)
0x302f34141e00     0  33c0           xorl rax,rax
0x302f34141e02     2  48bbc000440101000000 REX.W movq rbx,0x1014400c0    ;; external reference (Runtime::StackGuard)
0x302f34141e0c     c  e92f29f4ff     jmp 0x302f34084740      ;; code: STUB, CEntryStub, minor: 8


RelocInfo (size = 3)
0x302f34141e04  external reference (Runtime::StackGuard)  (0x1014400c0)
0x302f34141e0d  code target (STUB)  (0x302f34084740)

(lldb) expr kIntrinsicFunctions[v8::internal::Runtime::FunctionId::kStackGuard].entry
(v8::internal::Address) $380 = 0x00000001014400c0 "UH\x89�H��P�}�H�u�H�U�H�}��d\x04��\x01H\x83�

We can see that the 0x1014400c0 match. And we can disassemble this address using:

(lldb) dis -s 0x1014400c0
node`v8::internal::Runtime_StackGuard:
    0x1014400c0 <+0>:  pushq  %rbp
    0x1014400c1 <+1>:  movq   %rsp, %rbp
    0x1014400c4 <+4>:  subq   $0x50, %rsp
    0x1014400c8 <+8>:  movl   %edi, -0xc(%rbp)
    0x1014400cb <+11>: movq   %rsi, -0x18(%rbp)
    0x1014400cf <+15>: movq   %rdx, -0x20(%rbp)
    0x1014400d3 <+19>: movq   -0x20(%rbp), %rdi
    0x1014400d7 <+23>: callq  0x100486590               ; v8::internal::Isolate::context at isolate.h:596
    0x1014400dc <+28>: movb   $0x1, %cl

What about CEntryStub that is being jumped to?

0x302f34141e0c     c  e92f29f4ff     jmp 0x302f34084740      ;; code: STUB, CEntryStub, minor: 8

CEntryStub is declared in deps/v8/src/code-stubs.h:

class CodeStub : public ZoneObject {
  ...
};
class PlatformCodeStub : public CodeStub {
  Handle<Code> GenerateCode() override;
  ...
};
class CEntryStub : public PlatformCodeStub {
  ...
};

Where is the CEntryStub stored?

jmp 0x302f34084740      ;; code: STUB, CEntryStub, minor: 8

(lldb) expr CodeStub::Major::CEntry
(int) $404 = 4
(lldb) expr CodeStub::MajorKeyFromKey(4)
(v8::internal::CodeStub::Major) $405 = CEntry
(lldb) expr isolate->heap()->code_stubs()->FindEntry(isolate, 4)
(int) $422 = 451
(lldb) job Code::cast(isolate->heap()->code_stubs()->ValueAt(451))
0x302f34284001: [Code]
kind = STUB
major_key = CEntryStub
compiler = unknown
Instructions (size = 327)
0x302f34284060     0  55             push rbp
0x302f34284061     1  4889e5         REX.W movq rbp,rsp
0x302f34284064     4  6a06           push 0x6
0x302f34284066     6  6a00           push 0x0
0x302f34284068     8  49ba014028342f300000 REX.W movq r10,0x302f34284001
0x302f34284072    12  4152           push r10
0x302f34284074    14  4c8bf0         REX.W movq r14,rax
0x302f34284077    17  49ba001a000601000000 REX.W movq r10,0x106001a00
0x302f34284081    21  49892a         REX.W movq [r10],rbp
0x302f34284084    24  49ba9019000601000000 REX.W movq r10,0x106001990
0x302f3428408e    2e  498932         REX.W movq [r10],rsi
0x302f34284091    31  49ba101a000601000000 REX.W movq r10,0x106001a10
0x302f3428409b    3b  49891a         REX.W movq [r10],rbx
0x302f3428409e    3e  4e8d7cf508     REX.W leaq r15,[rbp+r14*8+0x8]
0x302f342840a3    43  4883e4f0       REX.W andq rsp,0xf0
0x302f342840a7    47  488965f0       REX.W movq [rbp-0x10],rsp
0x302f342840ab    4b  40f6c40f       testb rsp,0xf
0x302f342840af    4f  7401           jz 0x302f342840b2  <+0x52>
0x302f342840b1    51  cc             int3l
0x302f342840b2    52  498bfe         REX.W movq rdi,r14
0x302f342840b5    55  498bf7         REX.W movq rsi,r15
0x302f342840b8    58  48ba0000000601000000 REX.W movq rdx,0x106000000
0x302f342840c2    62  ffd3           call rbx
0x302f342840c4    64  493b8588000000 REX.W cmpq rax,[r13+0x88]
0x302f342840cb    6b  0f8447000000   jz 0x302f34284118  <+0xb8>
0x302f342840d1    71  4d8b75a8       REX.W movq r14,[r13-0x58]
0x302f342840d5    75  49baa019000601000000 REX.W movq r10,0x1060019a0
0x302f342840df    7f  4d3b32         REX.W cmpq r14,[r10]
0x302f342840e2    82  7401           jz 0x302f342840e5  <+0x85>
0x302f342840e4    84  cc             int3l
0x302f342840e5    85  488b4d08       REX.W movq rcx,[rbp+0x8]
0x302f342840e9    89  488b6d00       REX.W movq rbp,[rbp+0x0]
0x302f342840ed    8d  498d6708       REX.W leaq rsp,[r15+0x8]
0x302f342840f1    91  51             push rcx
0x302f342840f2    92  49ba9019000601000000 REX.W movq r10,0x106001990
0x302f342840fc    9c  498b32         REX.W movq rsi,[r10]
0x302f342840ff    9f  49c70200000000 REX.W movq [r10],0x0
0x302f34284106    a6  49ba001a000601000000 REX.W movq r10,0x106001a00
0x302f34284110    b0  49c70200000000 REX.W movq [r10],0x0
0x302f34284117    b7  c3             retl
0x302f34284118    b8  48c7c700000000 REX.W movq rdi,0x0
0x302f3428411f    bf  48c7c600000000 REX.W movq rsi,0x0
0x302f34284126    c6  48ba0000000601000000 REX.W movq rdx,0x106000000
0x302f34284130    d0  4989e2         REX.W movq r10,rsp
0x302f34284133    d3  4883ec08       REX.W subq rsp,0x8
0x302f34284137    d7  4883e4f0       REX.W andq rsp,0xf0
0x302f3428413b    db  4c891424       REX.W movq [rsp],r10
0x302f3428413f    df  48b8b0aa430101000000 REX.W movq rax,0x10143aab0
0x302f34284149    e9  40f6c40f       testb rsp,0xf
0x302f3428414d    ed  7401           jz 0x302f34284150  <+0xf0>
0x302f3428414f    ef  cc             int3l
0x302f34284150    f0  ffd0           call rax
0x302f34284152    f2  488b2424       REX.W movq rsp,[rsp]
0x302f34284156    f6  49bab019000601000000 REX.W movq r10,0x1060019b0
0x302f34284160   100  498b32         REX.W movq rsi,[r10]
0x302f34284163   103  49bad019000601000000 REX.W movq r10,0x1060019d0
0x302f3428416d   10d  498b22         REX.W movq rsp,[r10]
0x302f34284170   110  49bac819000601000000 REX.W movq r10,0x1060019c8
0x302f3428417a   11a  498b2a         REX.W movq rbp,[r10]
0x302f3428417d   11d  4885f6         REX.W testq rsi,rsi
0x302f34284180   120  7404           jz 0x302f34284186  <+0x126>
0x302f34284182   122  488975f8       REX.W movq [rbp-0x8],rsi
0x302f34284186   126  49bab819000601000000 REX.W movq r10,0x1060019b8
0x302f34284190   130  498b3a         REX.W movq rdi,[r10]
0x302f34284193   133  49bac019000601000000 REX.W movq r10,0x1060019c0
0x302f3428419d   13d  498b12         REX.W movq rdx,[r10]
0x302f342841a0   140  488d7c175f     REX.W leaq rdi,[rdi+rdx*1+0x5f]
0x302f342841a5   145  ffe7           jmp rdi


RelocInfo (size = -1377915637)

(lldb) expr Code::cast(isolate->heap()->code_stubs()->ValueAt(451))->entry()
(byte *) $427 = 0x0000302f34284060

These are not the same stubs, the opcodes match but not the entry addresses

IGNITION_HANDLER(StackCheck, InterpreterAssembler) {
  Label ok(this), stack_check_interrupt(this, Label::kDeferred);

  Node* interrupt = StackCheckTriggeredInterrupt();
  Branch(interrupt, &stack_check_interrupt, &ok);

  BIND(&ok);
  Dispatch();

  BIND(&stack_check_interrupt);
  {
    Node* context = GetContext();
    CallRuntime(Runtime::kStackGuard, context);
    Dispatch();
  }
}
void MacroAssembler::CallRuntime(const Runtime::Function* f,
                                 int num_arguments,
                                 SaveFPRegsMode save_doubles) {
  // If the expected number of arguments of the runtime function is
  // constant, we check that the actual number of arguments match the
  // expectation.
  CHECK(f->nargs < 0 || f->nargs == num_arguments);

  // TODO(1236192): Most runtime routines don't need the number of
  // arguments passed in because it is constant. At some point we
  // should remove this need and make the runtime routine entry code
  // smarter.
  Set(rax, num_arguments);
  LoadAddress(rbx, ExternalReference(f, isolate()));
  CEntryStub ces(isolate(), f->result_size, save_doubles);
  CallStub(&ces);
}

But, normally these will be called at compile time. My thinking is that these are all callstubs (builtins/runtime), that are generated at compile time and then made availalbe in memory somewhere allowing V8 to jump to the address of the first instruction and start executing it. But how are these loaded into memory?
This is done as Isolate initialization. Lets set the following break point we can follow this process:

(lldb) br s -n Isolate::Init
(lldb) br s -f isolate.cc -l 2703
bool Isolate::Init(StartupDeserializer* des) {
  ...
  isolate_addresses_[IsolateAddressId::kHandlerAddress] = reinterpret_cast<Address>(handler_address());
  isolate_addresses_[IsolateAddressId::kCEntryFPrAddress] = reinterpret_cast<Address>(centry_fp_address());
  isolate_addresses_[IsolateAddressId::kFunctionAddress] = reinterpret_cast<Address>(function_address());
  isolate_addresses_[IsolateAddressId::kContextAddress] = reinterpret_cast<Address>(context_address());
  isolate_addresses_[IsolateAddressId::kPendingExceptionAddress] = reinterpret_cast<Address>(pending_exception_address());
  isolate_addresses_[IsolateAddressId::kPendingHandlerContextAddress] = reinterpret_cast<Address>(pending_handler_context_address());
  isolate_addresses_[IsolateAddressId::kPendingHandlerCodeAddress] = reinterpret_cast<Address>(pending_handler_code_address());
  isolate_addresses_[IsolateAddressId::kPendingHandlerOffsetAddress] = reinterpret_cast<Address>(pending_handler_offset_address());
  isolate_addresses_[IsolateAddressId::kPendingHandlerFPAddress] = reinterpret_cast<Address>(pending_handler_fp_address());
  isolate_addresses_[IsolateAddressId::kPendingHandlerSPAddress] = reinterpret_cast<Address>(pending_handler_sp_address());
  isolate_addresses_[IsolateAddressId::kExternalCaughtExceptionAddress] = reinterpret_cast<Address>(external_caught_exception_address());
  isolate_addresses_[IsolateAddressId::kJSEntrySPAddress] = reinterpret_cast<Address>(js_entry_sp_address());
  ...
  InitializeThreadLocal();
  ...
  if (!create_heap_objects) des->DeserializeInto(this);

The functions, like handler_address(), can be found in v8/src/isolate.h:

inline Address* handler_address() { return &thread_local_top_.handler_; }

Notice that when the isolate_addresses_ array is populated and when InitializeThreadLocal is call the heap is still empty. What is happening is that pointers are being setup. When DeserializeInto(this) is called is when the heap is populated:

void StartupDeserializer::DeserializeInto(Isolate* isolate) {
  Initialize(isolate);
  BuiltinDeserializer builtin_deserializer(isolate, builtin_data_);
  ...
  builtin_deserializer.DeserializeEagerBuiltins();
  ...
  CodeStub::GenerateFPStubs(this);
  StoreBufferOverflowStub::GenerateFixedRegStubsAheadOfTime(this); 
}

DeserializeEagerBuiltins will populate the builtins_ array which is part of Builtins which is a member of Isolate. There seems to be codestubs that have to be generated, they cannot be serialized into the snapshot.

CodeStub::GenerateFPStubs(this), it is here that CEntryStub is generated:

void CodeStub::GenerateFPStubs(Isolate* isolate) {
  // Generate if not already in cache.
  SaveFPRegsMode mode = kSaveFPRegs;
  CEntryStub(isolate, 1, mode).GetCode();
  StoreBufferOverflowStub(isolate, mode).GetCode();
}

We can check that check the various data structures before and after using:

(lldb) expr builtins_
(lldb) expr heap->roots_
(lldb) expr isolate_addresses_
(lldb) expr thread_local_top_

For isolate_addresses_ which are pointers we can inspect them like this:

(lldb) expr isolate_addresses_[11]
(v8::internal::Address) $194 = 0x0000000104808820 ""
(lldb) memory read -f x -c 1 -s 8 0x0000000104808820
0x104808820: 0x0000000000000000
(lldb) expr CodeStub::Major::CEntry
(int) $120 = 4
Local<Value> result = script.ToLocalChecked()->Run();

If we back down through the call frame again, we will be in execution.cc and see this now familar code:

    typedef Object* (*JSEntryFunction)(Object* new_target, Object* target,
                                     Object* receiver, int argc,
                                     Object*** args);
    Handle<Code> code = is_construct
       ? isolate->factory()->js_construct_entry_code()
       : isolate->factory()->js_entry_code();
    ...
  
    // start of the function identified by code->entry() address.
    JSEntryFunction stub_entry = FUNCTION_CAST<JSEntryFunction>(code->entry());

    // Call the function through the right JS entry stub.
    Object* orig_func = *new_target;
    Object* func = *target;
    Object* recv = *receiver;
    Object*** argv = reinterpret_cast<Object***>(args);
    if (FLAG_profile_deserialization && target->IsJSFunction()) {
      PrintDeserializedCodeInfo(Handle<JSFunction>::cast(target));
    }
    RuntimeCallTimerScope timer(isolate, &RuntimeCallStats::JS_Execution);
    value = CALL_GENERATED_CODE(isolate, stub_entry, orig_func, func, recv, argc, argv);
  }

Notice that code will be what isolate->factory()->js_entry_code() returns:

(lldb) p isolate->factory()->js_entry_code()

You can verify that it is in fact bootstrap.js that func represents by issueing:

(lldb) job func

Handle<Code> code represents generated machine code, and we can see that this instance is on KindSTUB`:

(lldb) p code->kind()
(v8::internal::Code::Kind) $32 = STUB

Next, we have CALL_GENERATED_CODE which can be found in src/x64/simulator-x64.h:

// Since there is no simulator for the x64 architecture the only thing we can
// do is to call the entry directly.
// TODO(X64): Don't pass p0, since it isn't used?
#efine CALL_GENERATED_CODE(isolate, entry, p0, p1, p2, p3, p4) \
  (entry(p0, p1, p2, p3, p4))

So this will be an call that looks like this:

entry(orig_func, func, recv, argc, argv);
    0x100e381b4 <+1348>: movq   -0x118(%rbp), %rdx                  // move the value of address of code->entry() into rdx
    0x100e381bb <+1355>: movq   -0x120(%rbp), %rdi                  // first argument which is orig_func
->  0x100e381c2 <+1362>: movq   -0x128(%rbp), %rsi                  // second argument which is func
    0x100e381c9 <+1369>: movq   -0x130(%rbp), %rcx                  // third argument which is recv
    0x100e381d0 <+1376>: movl   -0x30(%rbp), %eax                   // fourth arg which is argc 
    0x100e381d3 <+1379>: movq   -0x138(%rbp), %r8                   // fifth argument which is argv
    0x100e381da <+1386>: movq   %rdx, -0x1c0(%rbp)                  // move code->entry from rdx into local varialbe 
    0x100e381e1 <+1393>: movq   %rcx, %rdx                          // move recv into rdx
    0x100e381e4 <+1396>: movl   %eax, %ecx                          // move argc into ecx
    0x100e381e6 <+1398>: movq   -0x1c0(%rbp), %r9                   // move code-entry into register r9
    0x100e381ed <+1405>: callq  *%r9                                // call address in register 9

(lldb) memory read -f x -c 1 -s 8 `$rbp - 0x118`
0x7fff5fbfd828: 0x0000302f34084060
(lldb) expr code->entry()
(byte *) $186 = 0x0000302f34084060

(lldb) memory read -f x -c 1 -s 8 `$rbp - 0x120`
0x7fff5fbfd820: 0x00001d64e5d822e1
(lldb) expr orig_func
(v8::internal::Object *) $209 = 0x00001d64e5d822e1

(lldb) memory read -f x -c 1 -s 8 `$rbp - 0x128`
0x7fff5fbfd818: 0x00001d6417d30669
(lldb) expr func
(v8::internal::Object *) $211 = 0x00001d6417d30669

(lldb) memory read -f x -c 1 -s 8 `$rbp - 0x130`
0x7fff5fbfd810: 0x00001d64f0a82239
(lldb) expr recv
(v8::internal::Object *) $213 = 0x00001d64f0a82239

(lldb) memory read -f x -c 1 -s 4 `$rbp - 0x30`
0x7fff5fbfd910: 0x00000000
(lldb) expr argc
(int) $216 = 0

(lldb) memory read -f x -c 1 -s 8 `$rbp - 0x138`
0x7fff5fbfd808: 0x0000000000000000
(lldb) expr argv
(v8::internal::Object ***) $219 = 0x0000000000000000

(lldb) memory read -f x -c 1 -s 8 `$rbp - 0x1c0`
0x7fff5fbfd780: 0x0000302f34084060

The last instruction callq *%r9 will call into JSEntryStub:

->  0x302f34084060: pushq  %rbp                                  // push callers base frame pointer saving it so we can restore it
    0x302f34084061: movq   %rsp, %rbp                            // mov the current value of rsp to into rbp which will be the frame pointer for this function
    0x302f34084064: pushq  $0x2                                  // this is pushing an immediate value 2 onto the stack. Where does this come from, the following:
(lldb) p v8::internal::StackFrame::TypeToMarker(static_cast<v8::internal::StackFrame::Type>(v8::internal::StackFrame::Type::ENTRY))
(int32_t) $262 = 2
    0x302f34084066: movabsq $0x106001990, %r10                   // move the context address into r10
(lldb) expr isolate->isolate_addresses_[IsolateAddressId::kContextAddress]
(v8::internal::Address) $230 = 0x0000000106001990 
    0x302f34084070: movq   (%r10), %r10                          // the context address is a pointer, this will dereference it
(lldb) memory read -f x -c 1 -s 8 `isolate->isolate_addresses_[IsolateAddressId::kContextAddress]`
0x106001990: 0x00001d6417d03a59
(lldb) register read r10
     r10 = 0x00001d6417d03a59
     0x302f34084073: pushq  %r10                                // push the dereferences context onto the stack
     0x302f34084075: pushq  %r12                                // r12 must be preserved accross function calls so save it and pop it later before returning
     0x302f34084077: pushq  %r13                                // r13 must be preserved accross function calls so save it and pop it later before returning
     0x302f34084079: pushq  %r14                                // r14 must be preserved accross function calls so save it and pop it later before returning
     0x302f3408407b: pushq  %r15                                // r14 must be preserved accross function calls so save it and pop it later before returning
     0x302f3408407d: pushq  %rbx                                // rbx must be preserved accross function calls so save it and pop it later before returning
     0x302f3408407e: movabsq $0x106000048, %r13                 // move the value of the roots_array_start into r13
(lldb) expr isolate->heap()->roots_array_start()
(v8::internal::Object **) $233 = 0x0000000106000048
     0x302f34084088: addq   $0x80, %r13                         // addp(kRootRegister, Immediate(kRootRegisterBias)); kRootRegisterBias is 128
     0x302f3408408f: movabsq $0x106001a00, %r10                 // move he CEntryFPAddress into r10
(lldb) memory read -f x -c 1 -s 8 isolate->isolate_addresses_[IsolateAddressId::kCEntryFPAddress]
0x106001a00: 0x0000000000000000
    0x302f34084099: pushq  (%r10)                               // dereference CEntryAddress and push onto the stack
(lldb) memory read -f x -c 1 -s 8 0x0000000106001a00
0x106001a00: 0x0000000000000000
    0x302f3408409c: movabsq 0x106001a20, %rax                   // move JSEntrySPAddress into rax
(lldb) memory read -f x -c 1 -s 8 isolate->isolate_addresses_[IsolateAddressId::kJSEntrySPAddress]
0x106001a20: 0x0000000000000000 
    0x302f340840a6: testq  %rax, %rax                           // is rax zero? if so there is an outer js call
(lldb) register read rax
     rax = 0x0000000000000000
    0x302f340840af: pushq  $0x2                                 // v8::internal::StackFrame::OUTERMOST_JSENTRY_FRAME))
    0x302f340840b1: movq   %rbp, %rax                           // move this functions base pointer into rax 
    0x302f340840b4: movabsq %rax, 0x106001a20                   // store the base pointer in isolate_addresses_[IsolateAddressId::kJSEntrySPAddress]
(lldb) memory read -f x -c 1 -s 8 `isolate->isolate_addresses_[IsolateAddressId::kJSEntrySPAddress]`
0x106001a20: 0x0000000000000000
after
(lldb) memory read -f x -c 1 -s 8 `isolate->isolate_addresses_[IsolateAddressId::kJSEntrySPAddress]`
0x106001a20: 0x00007fff5fbfd760
(lldb) register read rbp
     rbp = 0x00007fff5fbfd940
    0x302f340840be: jmp    0x302f340840c5
    0x302f340840c5: jmp    0x302f340840e0
    0x302f340840e0: movabsq $0x106001a08, %r10                  // move isolate_addresses_[IsolateAddressId::kHandlerAddress]` into r10
(lldb) memory read -f x -c 1 -s 8 `isolate->isolate_addresses_[IsolateAddressId::kHandlerAddress]`
0x106001a08: 0x0000000000000000
    0x302f340840ea: pushq  (%r10)                               // push the dereferenced handler address
    0x302f340840ed: movabsq $0x106001a08, %r10                  // move isolate_addresses_[IsolateAddressId::kHandlerAddress]` into r10
    0x302f340840f7: movq   %rsp, (%r10)                         // move the value of the current stack pointer into the object pointed to be r10
(lldb) memory read -f x -c 1 -s 8 0x106001a08
0x106001a08: 0x00007fff5fbfd710
(lldb) register read rsp
     rsp = 0x00007fff5fbfd710
    0x302f340840fa: callq  0x302f341418c0                       // call JSEntryTrampoline builtin
(lldb) expr isolate->builtins()->builtin_handle(Builtins::Name::kJSEntryTrampoline)->entry()
(byte *) $255 = 0x0000302f341418c0 
      

In deps/v8/src/builtins/x64/builtins-x64.cc we can find Generate_JSEntryTrampolineHelper which is what generates the builtin. As this is done a compile time we can put a break point in it (you can debug mksnapshot though which is done elsewhere in this document).

    0x302f341418c0: movq   %rdi, %r11                           // move the orig_fun into r11
    0x302f341418c3: movq   %rsi, %rdi                           // move func into rdi
    0x302f341418c6: xorl   %esi, %esi                           // zero out esi
    0x302f341418c8: pushq  %rbp                                 // EnterFrame (deps/v8/src/x64/macro-assembler-x64.cc)
    0x302f341418c9: movq   %rsp, %rbp                           // 
    0x302f341418cc: pushq  $0x1c                                // push 0x1c (decimal 28)
(lldb) p v8::internal::StackFrame::TypeToMarker(static_cast<v8::internal::StackFrame::Type>(v8::internal::StackFrame::Type::INTERNAL))
(int32_t) $256 = 28
    0x302f341418ce: movabsq $0x302f34141861, %r10               // CodeObject() what is this, seems like it is Handle<HeapObject> which will be patched later
                                                                // I think this might be the register file?
(lldb) memory read -f x -c 1 -s 8 0x302f34141861
0x302f34141861: 0x6900001d64ceb827
    0x302f341418d8: pushq  %r10                                 // push the HandleHeapObject into the stack
    0x302f341418da: movabsq $0x1d64e5d822e1, %r10               // move undefined value into r10 
(lldb) expr *isolate->factory()->undefined_value()
(v8::internal::Oddball *) $264 = 0x00001d64e5d822e1
    0x302f341418e4: cmpq   %r10, (%rsp)
    0x302f341418e8: jne    0x302f341418fa                       // last of EnterFrame if we don't abort that is
    0x302f341418fa: movabsq $0x106001990, %r10                  // move the ContextAdress into r10 (the scratch register for x86)
(lldb) memory read -f x -c 1 -s 8 `isolate->isolate_addresses_[IsolateAddressId::kContextAddress]`
0x106001990: 0x00001d6417d03a59
    0x302f34141904: movq   (%r10), %rsi                         // deref and move into context into rsi
    0x302f34141907: pushq  %rdi                                 // push function onto the stack
    0x302f34141908: pushq  %rdx                                 // push recv onto the stack (this was moved above with 0x302f34141908: pushq  %rdx)
    0x302f34141909: movq   %rcx, %rax                           // move argc into rax (this was moved above with 0x100e381e4 <+1396>: movl   %eax, %ecx)
    0x302f3414190c: movq   %r8, %rbx                            // move argv into rbx
    0x302f3414190f: movq   %r11, %rdx                           // move orig_func into rdx
// Generate_CheckStackOverflow  TODO: got through this as I struggled to understand/map the generated instructions to the source code :( 
    0x302f34141912: movq   0xd08(%r13), %r10                    // 
    0x302f34141919: movq   %rsp, %rcx
    0x302f3414191c: subq   %r10, %rcx
    0x302f3414191f: movq   %rax, %r11
    0x302f34141922: shlq   $0x3, %r11
    0x302f34141926: cmpq   %r11, %rcx
    0x302f34141929: jg     0x302f34141940
// Generate_JSEntryTrampolineHelper
    0x302f34141940: xorl   %ecx, %ecx                           // zero out ecx, for the following loop
    0x302f34141942: jmp    0x302f3414194f                       // jump to entry label
    0x302f3414194f: cmpq   %rax, %rcx                           // compare argc with rcx (this was moved above). 
    0x302f34141952: jne    0x302f34141944                       // entry the loop if. This is pushing the argv values onto the stack in a loop, but we don't have any so we fall through
    0x302f34141954: callq  0x302f3413c400                       //
(lldb) expr isolate->builtins()->Call(static_cast<ConvertReceiverMode>(ConvertReceiverMode::kAny))->entry()
(byte *) $279 = 0x0000302f3413c400 "@\xfffffff6\xffffffc7\x01\x0f\xffffff84F"
// for the full object with assembly language instructions:
(lldb) job *isolate->builtins()->Call(static_cast<ConvertReceiverMode>(ConvertReceiverMode::kAny))
// Builtins::Generate_Call_ReceiverIsAny 'deps/v8/src/builtins/builtins-call-gen.cc':
// void Builtins::Generate_Call_ReceiverIsAny(MacroAssembler* masm) {
//   Generate_Call(masm, ConvertReceiverMode::kAny);
// }
// Builtins::Generate_Call `deps/v8/src/builtins/x64/builtins-x64.cc`
    0x302f3413c400: testb  $0x1, %dil                           // move the immediate 1 into the low 8 bits or rdi  // __ JumpIfSmi(rdi, &non_callable); which consists of CheckSmi
    0x302f3413c404: je     0x302f3413c450                       // if ZF = 1 then jump. This would happen if rdi was a SMI// __ JumpIfSmi(rdi, &non_callable: which after CheckSmi will jump. rdi is the target and not a smi in our case
// recall that rdi is the function:
(lldb) register read rdi
     rdi = 0x00001d6417d30669
(lldb) expr func
(v8::internal::Object *) $280 = 0x00001d6417d30669
// Now I think that func is/was of type JSFunction (deps/v8/src/objects.h) as it was cast to Object* by:
// auto fun = i::Handle<i::JSFunction>::cast(Utils::OpenHandle(this));
// class JSFunction: public JSObject {
 public:
  // [prototype_or_initial_map]:
  DECL_ACCESSORS(prototype_or_initial_map, Object)
    0x302f3413c40a: movq   -0x1(%rdi), %rcx                      // move the map into rcx. CmpObjectType(rdi, JS_FUNCTION_TYPE, rcx). HeapObject::kMapOffset which is the first field in JSFunction:
(lldb) memory read -f x -c 1 -s 8 `$rdi - 1`
0x1d6417d30668: 0x00001d6433c82521
(lldb) expr JSFunction::cast(func)->prototype_or_initial_map()
(v8::internal::Object *) $286 = 0x00001d64e5d82321
    0x302f3413c40e: cmpb   $-0x1, 0xb(%rcx)                      // CmpObjectType(rdi, JS_FUNCTION_TYPE, rcx) calls CmpInstanceType. Check if the HeapObject is a JSFunction
    0x302f3413c412: je     0x302f3413be20                        // __ j(equal, masm->isolate()->builtins()->CallFunction(mode), RelocInfo::CODE_TARGET);
                                                                 // I was not sure where to find this `j` function but it is in src/x64/assembler-x64.cc (Assembler::j but note
                                                                 // that there are multiple overloaded j functions so make sure  you are looking at the correct one. There is 
                                                                 // as section that discusses Assembler::j in detail later in this document.
                                                                 // so what are we jumping to? We can back up in the debugger and find out:
                                                                 // (lldb) up 2
                                                                 // (lldb job *isolate->builtins()->CallFunction(static_cast<ConvertReceiverMode>(ConvertReceiverMode::kAny))
                                                                 // This will be a builtin named CallFunction_ReceiverIsAny, and being a builtin you'll find it in 
                                                                 // `src/builtins/builtins-definitions.h`. The implementation will be in `src/builtins/builtins-call.cc` and
                                                                 // will result in `return builtin_handle(kCallFunction_ReceiverIsAny)`. This code repsonsible for generating
                                                                 // is Builtins::Generate_CallFunction and in our case that means `src/builtins/x64/builtins-x64.cc`
(lldb) expr isolate->builtins()->CallFunction(static_cast<ConvertReceiverMode>(ConvertReceiverMode::kAny))->entry()
(byte *) $317 = 0x0000302f3413be20
// Builtins::Generate_CallFunction (deps/v8/src/builtins/x64/builtins-x64.cc)
    0x302f3413be20: testb  $0x1, %dil                            // AssertFunction (deps/v8/src/x64/macro-assembler-x64.cc) check if rdi is of type smi (testb(object, Immediate(kSmiTagMask));)
                                                                 // rdi is the function to call:
(lldb) register read rdi
     rdi = 0x00001d6417d30669
(lldb) expr JSFunction::cast(func)
(v8::internal::JSFunction *) $320 = 0x00001d6417d30669
    0x302f3413be24: jne    0x302f3413be36                        // this is also generated by AssertFunciton and the call to Assembler::j If not equal this code will fall through
    0x302f3413be36: pushq  %rdi                                  // AssertFunction still, push rdi (the function) onto the stack
    0x302f3413be37: movq   -0x1(%rdi), %rdi                      // move the map into rdi. CmpObjectType(object, JS_FUNCTION_TYPE, object). HeapObject::kMapOffset which is the first field in JSFunction
    0x302f3413be3b: cmpb   $-0x1, 0xb(%rdi)                      // CmpObjectType(rdi, JS_FUNCTION_TYPE, rcx) calls CmpInstanceType. Check if the HeapObject is a JSFunction
    0x302f3413be3f: popq   %rdi                                  // pop function from stack into rdi again
    0x302f3413be40: je     0x302f3413be52                        // jump if equal will jump to a L label and return from the Check call and then return from AssertFunction
// Builtins::Generate_CallFunction (deps/v8/src/builtins/x64/builtins-x64.cc) 
    0x302f3413be52: movq   0x1f(%rdi), %rdx                      // move the SharedFunctionInfo into rdx:
(lldb) memory read -f x -c 1 -s 8 `$rdi + 0x1f`
0x1d6417d30688: 0x00001d6417d2dc21
(lldb) expr JSFunction::cast(func)->shared()
(v8::internal::SharedFunctionInfo *) $325 = 0x00001d6417d2dc21
    0x302f3413be56: testb  $-0x20, 0x87(%rdx)                    //  testl(FieldOperand(rdx, SharedFunctionInfo::kCompilerHintsOffset)
(lldb) expr JSFunction::cast(func)->shared()->compiler_hints()
(int) $332 = 1056770
(lldb) memory read -f dec -c 1 -s 4 `($rdx + 0x87)`
0x1d6417d2dca8: 1056770
    0x302f3413be5d: jne    0x302f3413bfbc                        // __ j(not_zero, &class_constructor);
    0x302f3413be63: movq   0x27(%rdi), %rsi                      // move the functions context info rsi:
(lldb) memory read -f x -c 1 -s 8 `$rdi + 0x27`
0x1d6417d30690: 0x00001d6417d03a59
(lldb) expr JSFunction::cast(func)->context()
(v8::internal::Context *) $345 = 0x00001d6417d03a59
    0x302f3413be67: testb  $0x3, 0x87(%rdx)                      // check SharedFunctionInfo::IsNativeBit::kMask | SharedFunctionInfo::IsStrictBit::kMask 
    0x302f3413be6e: jne    0x302f3413bf14                        // 
    0x302f3413bf14: movslq 0x73(%rdx), %rbx                      // __ movsxlq(rbx, FieldOperand(rdx, SharedFunctionInfo::kFormalParameterCountOffset))
    0x302f3413bf18: movabsq $0x105300b92, %r10                   // move hook_on_function_call_address into scratch register; part of CheckDebugHook
(lldb) expr isolate->debug()->hook_on_function_call_address()
(v8::internal::Address) $353 = 0x0000000105300b92
    0x302f3413bf22: cmpb   $0x0, (%r10)                          // how is this generate? In CheckDebugHook I can only find cmpb(debug_hook_active_operand, Immediate(0)); but not he previous moveabsq
    0x302f3413bf26: je     0x302f3413bfa4                        // will jump to the label at the end of CheckDebugHook
// MacroAssembler::InvokeFunction
    0x302f3413bfa4: movq   -0x60(%r13), %rdx                     // move the UndefinedValueRootIndex into rdx generated by LoadRoot(rdx, Heap::kUndefinedValueRootIndex);
(lldb) memory read -f x -c 1 -s 8 `$r13 - 0x60`
0x106000068: 0x00001d64e5d822e1
lldb) expr isolate->heap()->roots_[Heap::RootListIndex::kUndefinedValueRootIndex]
(v8::internal::Object *) $354 = 0x00001d64e5d822e1
// InvokePrologue(expected, actual, &done, &definitely_mismatches, flag, Label::kNear)
    0x302f3413bfa8: cmpq   %rax, %rbx                            // Set(rax, actual.immediate()); 
    0x302f3413bfab: je     0x302f3413bfb2                        // will return from InvokePrologue
    0x302f3413bfb2: movq   0x37(%rdi), %rcx                      // move the function code into rcx (movp(rcx, FieldOperand(function, JSFunction::kCodeOffset)))
(lldb) memory read -f x -c 1 -s 8 `$rdi + 0x37`
0x1d6417d306a0: 0x0000302f34144281
(lldb) expr JSFunction::cast(func)->code()
(v8::internal::Code *) $358 = 0x0000302f34144281
    0x302f3413bfb6: addq   $0x5f, %rcx                           // addp(rcx, Immediate(Code::kHeaderSize - kHeapObjectTag)); the instruction start follows the Code object header
(lldb) job JSFunction::cast(func)->code()
0x302f34144281: [Code]
yind = BUILTIN
name = InterpreterEntryTrampoline
compiler = unknown
Instructions (size = 1004)
0x302f341442e0     0  488b5f2f       REX.W movq rbx,[rdi+0x2f]
(lldb) memory read -f x -c 1 -s8 `$rcx + 0x5f`
0x302f341442e0: 0x075b8b482f5f8b48
// notice that rcx now point to the first instruction.
    0x302f3413bfba: jmpq   *%rcx                                // now jump to the first instruction of function :) 

    0x302f341442e0: movq   0x2f(%rdi), %rbx                     // move the feedback vector into rbx
(lldb) memory read -f x -c 1 -s 8 `$rdi + 0x2f`
0x1d6417d30698: 0x00001d6417d306e9
(lldb) expr JSFunction::cast(func)->feedback_vector()
(v8::internal::FeedbackVector *) $363 = 0x00001d6417d306a9
(lldb) job JSFunction::cast(func)->feedback_vector()
0x1d6417d306a9: [FeedbackVector] in OldSpace
 - length: 1
 SharedFunctionInfo: 0x1d6417d2dc21 <SharedFunctionInfo>
 Optimized Code: 0
 Invocation Count: 0
 Profiler Ticks: 0
 Slot #0 kCreateClosure
  [0]: 0x1d6417d306d9 <Cell value= 0x1d64e5d822e1 <undefined>>

    302f341442e4: movq   0x7(%rbx), %rbx                     // move Slot[0] from the feedback vector into rbx
(lldb) memory read -f x -c 1 -s 8 `$rbx + 0x7`
0x1d6417d306f0: 0x00001d6417d306a9
// MaybeTailCallOptimizedCodeSlot(masm, feedback_vector, rcx, r14, r15);
// MaybeTailCallOptimizedCodeSlot(MacroAssembler* masm, Register feedback_vector, Register scratch1, Register scratch2, Register scratch3)
// Register closure = rdi;
// Register optimized_code_entry = rcx;
    0x302f341442e8: movq   0xf(%rbx), %rcx                     // __ movp(optimized_code_entry, FieldOperand(feedback_vector, FeedbackVector::kOptimizedCodeOffset));
(lldb) memory read -f x -c 1 -s 8 `$rbx + 0xf`
0x1d6417d306b8: 0x0000000000000000
(lldb) expr JSFunction::cast(func)->feedback_vector()->optimized_code()
(v8::internal::Code *) $367 = 0x0000000000000000
    0x302f341442ec: testb  $0x1, %cl                           // smi test against low 8 bit of rcx called from JumpIfNotSmi -> CheckSmi
    0x302f341442ef: jne    0x302f34144486                      // JumpIfNotSmi -> Assembler::j 
    0x302f341442f5: testb  $0x1, %cl                           // test again but this time from MacroAssembler::SmiCompare and its call to AssertSmi which calls CheckSmi
    0x302f341442f8: je     0x302f3414430a                      // SmiCompare -> AssertSmi -> Check
    

   



RelocInf  (size = 10) 0x302f341418d0  embedded object  (0x302f34141861 <Code BUILTIN>)
0x302f341418dc  embedded object  (0x1d64e5d822e1 <undefined>)
0x302f341418f5  code target (BUILTIN)  (0x302f341659c0)
0x302f341418fc  external reference (Isolate::context_address)  (0x106001990)
0x302f34141933  external reference (Runtime::ThrowStackOverflow)  (0x101438e80)
0x302f3414193c  code target (STUB)  (0x302f34084740)
0x302f34141955  code target (BUILTIN)  (0x302f3413c400)
0x302f3414196b  code target (BUILTIN)  (0x302f341659c0)

SharedFunctionInfo::kCompilerHintsOffset

                            SHARED_FUNCTION_INFO_FIELDS)

JSEntryStub

Process 568 stopped
* thread #1: tid = 0xf3b301, 0x000020819e104060, queue = 'com.apple.main-thread', stop reason = instruction step into
    frame #0: 0x000020819e104060
->  0x20819e104060: pushq  %rbp
    0x20819e104061: movq   %rsp, %rbp
    0x20819e104064: pushq  $0x2
    0x20819e104066: movabsq $0x104803190, %r10        ; imm = 0x104803190

I've previously mapped the assembly with the void JSEntryStub::Generate function but I did not cover:

0x20819e1040fa    9a  e8c1d70b00     call 0x20819e1c18c0  (JSEntryTrampoline)    ;; code: BUILTIN

This matches the assembly generated in Generate_JSEntryTrampolineHelper(MacroAssembler* masm, bool is_construct) in src/builtins/x64/builtins-x64.cc:

    frame #0: 0x000020819e1c18c0
->  0x20819e1c18c0: movq   %rdi, %r11               // __ movp(r11, rdi);
    0x20819e1c18c3: movq   %rsi, %rdi               // __ movp(rdi, rsi);
    0x20819e1c18c6: xorl   %esi, %esi               // __ Set(rsi, 0);
    0x20819e1c18c8: pushq  %rbp                     // FrameScope scope(masm, StackFrame::INTERNAL);
    0x20819e1c18c9: movq   %rsp, %rbp               // FrameScope scope(masm, StackFrame::INTERNAL);
    0x20819e1c18cc: pushq  $0x1c                    // Is this also generated by the above scope?
    0x20819e1c18ce: movabsq $0x20819e1c1861, %r10   // Is this also generated by the above scope?
    ...
->  0x20819e1c1904: movq   (%r10), %rsi             // 
    0x20819e1c1907: pushq  %rdi                     // __ Push(rdi); func onto the stack
    0x20819e1c1908: pushq  %rdx                     // __ Push(rdx); recv onto the stack
    0x20819e1c1909: movq   %rcx, %rax               // __ movp(rax, rcx); argc
    0x20819e1c190c: movq   %r8, %rbx                // __ movp(rbx, r8); pointer to args
    0x20819e1c190f: movq   %r11, %rdx               // __ movp(rdx, r11); new target into rdx
    0x20819e1c1912: movq   0xd08(%r13), %r10        // Generate_CheckStackOverflow(masm, kRaxIsUntaggedInt);
    0x20819e1c1919: movq   %rsp, %rcx               // Generate_CheckStackOverflow(masm, kRaxIsUntaggedInt);
    0x20819e1c191c: subq   %r10, %rcx               // Generate_CheckStackOverflow(masm, kRaxIsUntaggedInt);
    0x20819e1c191f: movq   %rax, %r11               // Generate_CheckStackOverflow(masm, kRaxIsUntaggedInt);
    0x20819e1c1922: shlq   $0x3, %r11               // Generate_CheckStackOverflow(masm, kRaxIsUntaggedInt);
    0x20819e1c1926: cmpq   %r11, %rcx               // Generate_CheckStackOverflow(masm, kRaxIsUntaggedInt);
    0x20819e1c1929: jg     0x20819e1c1940           // jump if alright
    0x20819e1c192f: xorl   %eax, %eax               // handle stack overflow
    0x20819e1c1931: movabsq $0x1014259b0, %rbx      // handle stack overflow
    0x20819e1c193b: callq  0x20819e104740           // handle stack overflow
->  0x20819e1c1940: xorl   %ecx, %ecx               // __ Set(rcx, 0);  // Set loop variable to 0.
    0x20819e1c1942: jmp    0x20819e1c194f           // __ jmp(&entry, Label::kNear);
->  0x20819e1c194f: cmpq   %rax, %rcx               // both are zero at this stage
->  0x20819e1c1952: jne    0x20819e1c1944
                                                    // Handle<Code> builtin = is_construct
                                                    //     ? BUILTIN_CODE(masm->isolate(), Construct)
                                                    //     : masm->isolate()->builtins()->Call();
    0x20819e1c1954: callq  0x20819e1bc400           //__ Call(builtin, RelocInfo::CODE_TARGET);

Lets take a look Call in src/x64/macro-assembler-x64.h:

void Call(Handle<Code> code_object, RelocInfo::Mode rmode);

And the implementation look like this:

void TurboAssembler::Call(Handle<Code> code_object, RelocInfo::Mode rmode) {
#ifdef DEBUG
  int end_position = pc_offset() + CallSize(code_object);
#endif
  DCHECK(RelocInfo::IsCodeTarget(rmode));
  call(code_object, rmode);
#ifdef DEBUG
  DCHECK_EQ(end_position, pc_offset());
#endif
}

I think that call will end up in src/x64/assembler-x64.cc:

void Assembler::call(Handle<Code> target, RelocInfo::Mode rmode) {
  EnsureSpace ensure_space(this);
  // 1110 1000 #32-bit disp.
  emit(0xE8);
  emit_code_target(target, rmode);
}

src/x64/assembler-x64-inl.h emit(0xE8)

RUNTIME_FUNCTION

RUNTIME_FUNCTION(Runtime_InterpreterNewClosure) {
  HandleScope scope(isolate);
  DCHECK_EQ(4, args.length());
  CONVERT_ARG_HANDLE_CHECKED(SharedFunctionInfo, shared, 0);
  CONVERT_ARG_HANDLE_CHECKED(FeedbackVector, vector, 1);
  CONVERT_SMI_ARG_CHECKED(index, 2);
  CONVERT_SMI_ARG_CHECKED(pretenured_flag, 3);
  Handle<Context> context(isolate->context(), isolate);
  FeedbackSlot slot = FeedbackVector::ToSlot(index);
  Handle<Cell> vector_cell(Cell::cast(vector->Get(slot)), isolate);
  return *isolate->factory()->NewFunctionFromSharedFunctionInfo(
      shared, context, vector_cell,
      static_cast<PretenureFlag>(pretenured_flag));
}

deps/v8/src/arguments.h we find the RUNTIME_FUNCTION macro:

#define RUNTIME_FUNCTION_RETURNS_TYPE(Type, Name)                             \
  static INLINE(Type __RT_impl_##Name(Arguments args, Isolate* isolate));     \
                                                                              \
  V8_NOINLINE static Type Stats_##Name(int args_length, Object** args_object, \
                                       Isolate* isolate) {                    \
    RuntimeCallTimerScope timer(isolate, &RuntimeCallStats::Name);            \
    TRACE_EVENT0(TRACE_DISABLED_BY_DEFAULT("v8.runtime"),                     \
                 "V8.Runtime_" #Name);                                        \
    Arguments args(args_length, args_object);                                 \
    return __RT_impl_##Name(args, isolate);                                   \
  }                                                                           \
                                                                              \
  Type Name(int args_length, Object** args_object, Isolate* isolate) {        \
    DCHECK(isolate->context() == nullptr || isolate->context()->IsContext()); \
    CLOBBER_DOUBLE_REGISTERS();                                               \
    if (V8_UNLIKELY(FLAG_runtime_stats)) {                                    \
      return Stats_##Name(args_length, args_object, isolate);                 \
    }                                                                         \
    Arguments args(args_length, args_object);                                 \
    return __RT_impl_##Name(args, isolate);                                   \
  }                                                                           \
                                                                              \
  static Type __RT_impl_##Name(Arguments args, Isolate* isolate)

#define RUNTIME_FUNCTION(Name) RUNTIME_FUNCTION_RETURNS_TYPE(Object*, Name)

So lets see what that expands to:

  static INLINE(Type __RT_impl_Runtime_InterpreterNewClosure(Arguments args, Isolate* isolate));

  V8_NOINLINE static Object* Stats_InterpreterNewClosure(int args_length, Object** args_object, Isolate* isolate) {                    
    RuntimeCallTimerScope timer(isolate, &RuntimeCallStats::Name);            
    TRACE_EVENT0(TRACE_DISABLED_BY_DEFAULT("v8.runtime"), "V8.Runtime_" InterpreterNewClosure);                                        
    Arguments args(args_length, args_object);                                 
    return __RT_impl_InterpreterNewClosure(args, isolate);                                   
  }                                                                           

  Object* Runtime_InterpreterNewClosure(int args_length, Object** args_object, Isolate* isolate) {        
    DCHECK(isolate->context() == nullptr || isolate->context()->IsContext()); 
    CLOBBER_DOUBLE_REGISTERS();                                               
    if (V8_UNLIKELY(FLAG_runtime_stats)) {                                    
      return Stats_Runtime_InterpreterNewClosure(args_length, args_object, isolate);                 
    }                                                                         
    Arguments args(args_length, args_object);                                 
    return __RT_impl_Runtime_InterpreterNewClosure(args, isolate);                                   
  }                                                                           

  static Object* __RT_impl_Runtime_InterpreterNewClosure(Arguments args, Isolate* isolate)
    HandleScope scope(isolate);
    DCHECK_EQ(4, args.length());
    CONVERT_ARG_HANDLE_CHECKED(SharedFunctionInfo, shared, 0);
    CONVERT_ARG_HANDLE_CHECKED(FeedbackVector, vector, 1);
    CONVERT_SMI_ARG_CHECKED(index, 2);
    CONVERT_SMI_ARG_CHECKED(pretenured_flag, 3);
    Handle<Context> context(isolate->context(), isolate);
    FeedbackSlot slot = FeedbackVector::ToSlot(index);
    Handle<Cell> vector_cell(Cell::cast(vector->Get(slot)), isolate);
    return *isolate->factory()->NewFunctionFromSharedFunctionInfo(
      shared, context, vector_cell,
      static_cast<PretenureFlag>(pretenured_flag));
  }

Notice that Runtime_InterpreterNewClosure is called.

NewFunctionFromSharedFunctionInfo will call NewFunction which will create a new function:

  function->initialize_properties();
  function->initialize_elements();
  function->set_shared(*info);
  function->set_code(info->code());
  function->set_context(*context_or_undefined);
  function->set_prototype_or_initial_map(*the_hole_value());
  function->set_feedback_vector_cell(*undefined_cell());
  isolate()->heap()->InitializeJSObjectBody(*function, *map, JSFunction::kSize);
  return function;

Notice the call function->set_code(info->code()). This is the assembly code for InterpreterEntryTrampoline.

value = CALL_GENERATED_CODE(isolate, stub_entry, orig_func, func, recv, 147 argc, argv); So we are back here. Value is of type Function and is the function for node_bootstrap.js. I'm confused as I thought that the above call to CALL_GENEREATED_CODE would run the script. The actuall calling of the functions is done in :

auto ret = f->Call(env->context(), Null(env->isolate()), 1, &arg);

(lldb) dis -f node`v8::internal::Runtime_InterpreterNewClosure:

Handle<JSFunction> Factory::NewFunctionFromSharedFunctionInfo(
    Handle<Map> initial_map, Handle<SharedFunctionInfo> info,
    Handle<Object> context_or_undefined, Handle<Cell> vector,
    PretenureFlag pretenure) {
    
    Handle<JSFunction> result =
      NewFunction(initial_map, info, context_or_undefined, pretenure);
(lldb) job *info
0x2e6ef602e159: [SharedFunctionInfo] in OldSpace
 - name = 0x2e6ea0e02441 <String[0]: >
 - kind = [ NormalFunction ]
 - function_map_index = 129
 - formal_parameter_count = 1
 - expected_nof_properties = 10
 - language_mode = strict
 - instance class name = #Object
 - code = 0x20819e1c4281 <Code BUILTIN>
 - bytecode_array = 0x2e6ef6030501
 - source code = (process) {
  let internalBinding;
  const exceptionHandlerState = { captureFn: null };

  function startup() {
    const EventEmitter = NativeModule.require('events');

    const origProcProto = Object.getPrototypeOf(process);
    Object.setPrototypeOf(origProcProto, EventEmitter.prototype);

    EventEmitter.call(process);

    setupProcessObject();
    ...

  startup();
}
 - anonymous expression
 - function token position = 300
 - start position = 308
 - end position = 21943
 - no debug info
 - length = 1
 - feedback_metadata = 0x2e6ef6030759: [FeedbackMetadata] in OldSpace
 - length: 15
 - slot_count: 83

So we can see this is indeed bootstrap.js

Handle<JSFunction> Factory::NewFunction(Handle<Map> map,
                                        Handle<SharedFunctionInfo> info,
                                        Handle<Object> context_or_undefined,
                                        PretenureFlag pretenure) {
  AllocationSpace space = pretenure == TENURED ? OLD_SPACE : NEW_SPACE;
  Handle<JSFunction> function = New<JSFunction>(map, space);
  DCHECK(context_or_undefined->IsContext() ||
         context_or_undefined->IsUndefined(isolate()));

  function->initialize_properties();
  function->initialize_elements();
  function->set_shared(*info);
  function->set_code(info->code());
  function->set_context(*context_or_undefined);
  function->set_prototype_or_initial_map(*the_hole_value());
  function->set_feedback_vector_cell(*undefined_cell());
  isolate()->heap()->InitializeJSObjectBody(*function, *map, JSFunction::kSize);
  return function;
}

Lets now take a look at info->code()

(lldb) job info->code()
0x20819e1c4281: [Code]
kind = BUILTIN
name = InterpreterEntryTrampoline
compiler = unknown
Instructions (size = 1133)
0x20819e1c42e0     0  488b5f2f       REX.W movq rbx,[rdi+0x2f]
0x20819e1c42e4     4  488b5b07       REX.W movq rbx,[rbx+0x7]
0x20819e1c42e8     8  488b4b0f       REX.W movq rcx,[rbx+0xf]
0x20819e1c42ec     c  f6c101         testb rcx,0x1
0x20819e1c42ef     f  0f8512020000   jnz 0x20819e1c4507  (InterpreterEntryTrampoline)
0x20819e1c42f5    15  f6c101         testb rcx,0x1
0x20819e1c42f8    18  7410           jz 0x20819e1c430a  (InterpreterEntryTrampoline)
...

Setting through again and we will be back in:

value = CALL_GENERATED_CODE(isolate, stub_entry, orig_func, func, recv,
                           argc, argv);
(lldb) job value
0x2e6e9e80ea49: [Function]
 - map = 0x2e6e8f082521 [FastProperties]
 - prototype = 0x2e6ef60043d1
 - elements = 0x2e6ea0e02251 <FixedArray[0]> [HOLEY_ELEMENTS]
 - initial_map =
 - shared_info = 0x2e6ef602e159 <SharedFunctionInfo>
 - name = 0x2e6ea0e02441 <String[0]: >
 - formal_parameter_count = 1
 - kind = [ NormalFunction ]
 - context = 0x2e6ef6003af9 <FixedArray[281]>
 - code = 0x20819e1c4281 <Code BUILTIN>
 - interpreted
 - bytecode = 0x2e6ef6030501
 - source code = (process) {
  let internalBinding;
  ...
}
 - properties = 0x2e6ea0e02251 <FixedArray[0]> {
    #length: 0x2e6ea9c034f9 <AccessorInfo> (const accessor descriptor)
    #name: 0x2e6ea9c03569 <AccessorInfo> (const accessor descriptor)
    #prototype: 0x2e6ea9c035d9 <AccessorInfo> (const accessor descriptor)
 }

 - feedback vector: 0x2e6ef60308f1: [FeedbackVector] in OldSpace
 - length: 83
 SharedFunctionInfo: 0x2e6ef602e159 <SharedFunctionInfo>
 Optimized Code: 0
 Invocation Count: 0
 Profiler Ticks: 0
...

This will then return to node.cc:

Local<Value> result = script.ToLocalChecked()->Run();
if (result.IsEmpty()) {
  ReportException(env, try_catch);
  exit(4);
}

So the generated code will be entered. When it gets around to processing:

const EventEmitter = NativeModule.require('events');

This will result in another call to Invoke and the contents of func will be lib/event.js.

Code

Is a class in src/objects.h which represents generated machine code.

DECL_PRINTER(Code)

This macro looks like this:

#ifdef OBJECT_PRINT
#define DECL_PRINTER(Name) void Name##Print(std::ostream& os);  // NOLINT
#else
#define DECL_PRINTER(Name)
#endif

So if OBJECT_PRINT is defined there will be a function named:

void CodePrint(std::ostream& os);

Anything that extends v8::internal::Object can also have a Print function:

(lldb) expr recv->Print()
0x343ceb998159: [JS_API_OBJECT_TYPE]
 - map = 0x343c184c7d01 [FastProperties]
 - prototype = 0x343cbcce6ee9
 - elements = 0x343cad402251 <FixedArray[0]> [HOLEY_ELEMENTS]
 - embedder fields: 1
 - properties = 0x343ceb99a599 <PropertyArray[3]> {
    #close: 0x343c78ce8bd1 <JSFunction JSStreamWrap.handle.close (sfi = 0x343c78ce83f1)> (const data descriptor)
    #isClosing: 0x343cbccf8271 <JSFunction isClosing (sfi = 0x343c78ca3a49)> (const data descriptor)
    #onreadstart: 0x343cbccf82b1 <JSFunction onreadstart (sfi = 0x343c78ca3af9)> (const data descriptor)
    #onreadstop: 0x343cbccf82f1 <JSFunction onreadstop (sfi = 0x343c78ca3ba9)> (const data descriptor)
    #onshutdown: 0x343cbccf8331 <JSFunction onshutdown (sfi = 0x343c78ca3c59)> (const data descriptor)
    #onwrite: 0x343cbccf8371 <JSFunction onwrite (sfi = 0x343c78ca3d09)> (const data descriptor)
    #owner: 0x343ceb998719 <Socket map = 0x343c184c7cb1> (data field 0) properties[0]
    #onread: 0x343cbccb7619 <JSFunction onread (sfi = 0x343cb163e541)> (const data descriptor)
    #reading: 0x343cad402381 <true> (data field 1) properties[1]
 }
 - embedder fields = {
    0x105d053f0
 }

CHECK

These macros Can be found in src/util.h.

#define LIKELY(expr) __builtin_expect(!!(expr), 1)
#define UNLIKELY(expr) __builtin_expect(!!(expr), 0)

#define CHECK(expr)                                                           \
  do {                                                                        \
    if (UNLIKELY(!(expr))) {                                                  \
      static const char* const args[] = { __FILE__, STRINGIFY(__LINE__),      \
                                          #expr, PRETTY_FUNCTION_NAME };      \
      node::Assert(&args);                                                    \
    }                                                                         \
  } while (0)

#define CHECK_EQ(a, b) CHECK((a) == (b))

So take the following expression:

CHECK_EQ(false, try_catch.IsVerbose());

it would expand to:

if (__builtin_expect(!!(false == try_catch.IsVerbose()), 0)) {
  static const char* const args[] = { __FILE__, STRINGIFY(__LINE__), #expr, __PRETTY_FUNCTION_NAME__ };
  node::Assert(args);
}

__builtin_expect expects its parameters to be of type long and not bool, so there is a need to cast.

(lldb) expr !!(false == try_catch.IsVerbose())
(bool) $122 = true 

This is already bool but the macro can be used with other types.

OutputStackCheck(); OutputStackCheck is generated using a macro:

#define DEFINE_BYTECODE_OUTPUT(name, ...)                             \
  template <typename... Operands>                                     \
  BytecodeNode BytecodeArrayBuilder::Create##name##Node(              \
      Operands... operands) {                                         \
    return BytecodeNodeBuilder<Bytecode::k##name, __VA_ARGS__>::Make( \
        this, operands...);                                           \
  }                                                                   \
                                                                      \
  template <typename... Operands>                                     \
  void BytecodeArrayBuilder::Output##name(Operands... operands) {     \
    BytecodeNode node(Create##name##Node(operands...));               \
    Write(&node);                                                     \
  }                                                                   \
                                                                      \
  template <typename... Operands>                                     \
  void BytecodeArrayBuilder::Output##name(BytecodeLabel* label,       \
                                          Operands... operands) {     \
    DCHECK(Bytecodes::IsJump(Bytecode::k##name));                     \
    BytecodeNode node(Create##name##Node(operands...));               \
    WriteJump(&node, label);                                          \
    LeaveBasicBlock();                                                \
  }
BYTECODE_LIST(DEFINE_BYTECODE_OUTPUT)
#undef DEFINE_BYTECODE_OUTPUT

BYTECODE_LIST is a macro defined in src/interpreter/bytecodes.h:

// The list of bytecodes which are interpreted by the interpreter.
// Format is V(<bytecode>, <accumulator_use>, <operands>).
#define BYTECODE_LIST(V)
 ...
 V(StackCheck, AccumulatorUse::kNone)                                         \

  template <typename... Operands>                                     
  BytecodeNode BytecodeArrayBuilder::CreateStackCheckNode(Operands... operands) {
    return BytecodeNodeBuilder<Bytecode::kStackCheck, __VA_ARGS__>::Make(this, operands...);
  }                                                                   
                                                                      
  template <typename... Operands>                                     
  void BytecodeArrayBuilder::OutputStackCheck(Operands... operands) {     
    BytecodeNode node(CreateStackCheckNode(operands...));               
    Write(&node);                                                     
  }                                                                   
                                                                      
  template <typename... Operands>                                     
  void BytecodeArrayBuilder::OutputStackCheck(BytecodeLabel* label,       
                                          Operands... operands) {     
    DCHECK(Bytecodes::IsJump(Bytecode::kStackCheck));                     
    BytecodeNode node(CreateStackCheckNode(operands...));               
    WriteJump(&node, label);                                          
    LeaveBasicBlock();                                                
  }

Our call does not have any parameters so the first OutputStackCheck will be called in this case which will create the BytecodeNode and then call Write(&node)

(lldb) p *node
(v8::internal::interpreter::BytecodeNode) $218 = {
  bytecode_ = kStackCheck
  operands_ = ([0] = 0, [1] = 0, [2] = 0, [3] = 0, [4] = 0)
  operand_count_ = 0
  operand_scale_ = kSingle
  source_info_ = (position_type_ = kExpression, source_position_ = 0)
}

compiler.cc:803

 Handle<SharedFunctionInfo> shared_info =
   801       isolate->factory()->NewSharedFunctionInfoForLiteral(parse_info->literal(),
   802                                                           parse_info->script());
(lldb) job *shared_info

I'm not showing the whole output but this is the contents of node_bootstrap.js

interpreter.cc:211

 Handle<BytecodeArray> bytecodes =
211       generator()->FinalizeBytecode(isolate(), parse_info()->script());


-> 225   compilation_info()->SetBytecodeArray(bytecodes);
   226   compilation_info()->SetCode(
   227       BUILTIN_CODE(compilation_info()->isolate(), InterpreterEntryTrampoline));
   228   return SUCCEEDED;

So BUILTIN_CODE will expand to:

isolate->builtins()->builtin_handle(Builtins::kInterpreterEntryTrampoline);

So that line will become:

compilation_info()->SetCode(isolate->builtins()->builtin_handle(Builtins::kInterpreterEntryTrampoline));

To recap, builtins/builtins.h' has a macro that creates an enum entry for all of the builtins listed insrc/builtins/builtins-definitions.h`. And we are then calling builtin_handle with that enum value:

V8_EXPORT_PRIVATE Handle<Code> builtin_handle(int index);

So what does this function do?
src/builtins/builtins.cc

Handle<Code> Builtins::builtin_handle(int index) {
  DCHECK(IsBuiltinId(index));
  return Handle<Code>(reinterpret_cast<Code**>(builtin_address(index)));
}

How builtins get initialized

But how does the builtins_ array get?

src/builtins/builtins.h:

BUILTIN_LIST(IGNORE_BUILTIN, IGNORE_BUILTIN, DECLARE_TF, DECLARE_TF,
               DECLARE_TF, DECLARE_TF, DECLARE_ASM)

Where DECLARE_ASM is defined as:

#define DECLARE_ASM(Name, ...) \
  static void Generate_##Name(MacroAssembler* masm);

So there will be a function named Generate_InterpreterEntryTrampoline.

But we want to know where the builtins_ array is populated. Well, it

Object* builtins_[builtin_count];

And builtins_count is a member of the Name enum:

  enum Name : int32_t {
#define DEF_ENUM(Name, ...) k##Name,
    BUILTIN_LIST_ALL(DEF_ENUM)
#undef DEF_ENUM
        builtin_count
  };

It seems like these entries will get populated when the snapshot is deserialized: (snapshot/builtin-deserializer.cc)

builtins->set_builtin(i, DeserializeBuiltin(i));

Lets take a look at the stack when running generated code from V8. In src/globals.h we find:

typedef byte* Address;
...
#define FOR_EACH_ISOLATE_ADDRESS_NAME(C)                \
  C(Handler, handler)                                   \
  C(CEntryFP, c_entry_fp)                               \
  C(CFunction, c_function)                              \
  C(Context, context)                                   \
  C(PendingException, pending_exception)                \
  C(PendingHandlerContext, pending_handler_context)     \
  C(PendingHandlerCode, pending_handler_code)           \
  C(PendingHandlerOffset, pending_handler_offset)       \
  C(PendingHandlerFP, pending_handler_fp)               \
  C(PendingHandlerSP, pending_handler_sp)               \
  C(ExternalCaughtException, external_caught_exception) \
  C(JSEntrySP, js_entry_sp)

enum IsolateAddressId {
#define DECLARE_ENUM(CamelName, hacker_name) k##CamelName##Address,
  FOR_EACH_ISOLATE_ADDRESS_NAME(DECLARE_ENUM)
#undef DECLARE_ENUM
      kIsolateAddressCount
};

Will expand to:

enum IsolateAddressId {
  kHandlerAddress,
  ...
};

Notice that hacker_name parameter is not used in this call. In isolate.cc when an Isolate is initialized by bool Isolate::Init(StartupDeserializer* des):

#define ASSIGN_ELEMENT(CamelName, hacker_name)                  \
  isolate_addresses_[IsolateAddressId::k##CamelName##Address] = \
      reinterpret_cast<Address>(hacker_name##_address());
  FOR_EACH_ISOLATE_ADDRESS_NAME(ASSIGN_ELEMENT)
#undef ASSIGN_ELEMENT
isolate_addresses_[IsolateAddressId::HandlerAddress] = reinterpret_cast<Address>(handler_address());

isolate_addresses_ can be found in isolate.h:

Address isolate_addresses_[kIsolateAddressCount + 1];

So where does handler_address() come from?
This is defined in isolate.h:

inline Address* c_entry_fp_address() { return &thread_local_top_.c_entry_fp_; }
inline Address* handler_address() { return &thread_local_top_.handler_; }
inline Address* js_entry_sp_address() { return &thread_local_top_.js_entry_sp_; }
inline Address* c_function_address() { return &thread_local_top_.c_function_; }
Context** context_address() { return &thread_local_top_.context_; }
Address pending_message_obj_address() { return reinterpret_cast<Address>(&thread_local_top_.pending_message_obj_); }

THREAD_LOCAL_TOP_ADDRESS(Context*, pending_handler_context)
THREAD_LOCAL_TOP_ADDRESS(Code*, pending_handler_code)
THREAD_LOCAL_TOP_ADDRESS(intptr_t, pending_handler_offset)
THREAD_LOCAL_TOP_ADDRESS(Address, pending_handler_fp)
THREAD_LOCAL_TOP_ADDRESS(Address, pending_handler_sp)


#define THREAD_LOCAL_TOP_ADDRESS(type, name) \
  type* name##_address() { return &thread_local_top_.name##_; }

Context* pending_handler_context_address() { return &thread_local_top_.pending_handler_context_;

When is Isolate::Init called?
It is called from node::Start

Isolate* const isolate = Isolate::New(params);

This will invoke return IsolateNewImpl(isolate, params);

(lldb) expr target->Print()
(lldb) expr v8::internal::JSFunction::cast(*target)
(v8::internal::JSFunction *) $1115 = 0x000017771f4b05f1
(lldb) expr v8::internal::JSFunction::cast(*target)->code()
(v8::internal::Code *) $1116 = 0x0000262dec344281

(lldb) job v8::internal::JSFunction::cast(*target)->code()

So that target functions byte code is of type InterpreterEntryTrampoline

JSEntryStub

So stub_entry is from x64/code-stubs-x64.ccbyvoid JSEntryStub::Generate`: (lldb) p stub_entry (JSEntryFunction) $1126 = 0x0000262dec284060

(lldb) job *code 0x262dec284001: [Code] kind = STUB major_key = JSEntryStub compiler = unknown Instructions (size = 232) 0x262dec284060 0 55 push rbp 0x262dec284061 1 4889e5 REX.W movq rbp,rsp 0x262dec284064 4 6a02 push 0x2

Notice that 0x0000262dec284060 from stub_entry matches the first instruction in code.

0x262dec2840ed 8d 49ba088c800501000000 REX.W movq r10,0x105808c08 ;; external reference (Isolate::handler_address) 0x262dec2840f7 97 498922 REX.W movq [r10],rsp 0x262dec2840fa 9a e8c1d70b00 call 0x262dec3418c0 (JSEntryTrampoline) ;; code: BUILTIN

(lldb) dis -s 0x262dec3418c0 0x262dec3418c0: movq %rdi, %r11 0x262dec3418c3: movq %rsi, %rdi 0x262dec3418c6: xorl %esi, %esi 0x262dec3418c8: pushq %rbp 0x262dec3418c9: movq %rsp, %rbp 0x262dec3418cc: pushq $0x1c 0x262dec3418ce: movabsq $0x262dec341861, %r10 ; imm = 0x262DEC341861 0x262dec3418d8: pushq %r10

Runtime_CompileLazy

There is a V8 builtin named CompileLazy which when called can compile the function being called (runtime/runtime-compiler.cc): (See RUNTIME_FUNCTION for details regarding the RUNTIME_FUNCTION macro)

RUNTIME_FUNCTION(Runtime_CompileLazy) {
  ...
  if (!Compiler::Compile(function, Compiler::KEEP_EXCEPTION)) {
    return isolate->heap()->exception();
  }
  DCHECK(function->is_compiled());
  return function->code();

First time, what is compiled is the entire node_bootstrap.js: InterpreterCompilationJob::Status InterpreterCompilationJob::FinalizeJobImpl() (deps/v8/src/interpreter/interpreter.cc)

(lldb) br s -f interpreter.cc -l 201
  Handle<BytecodeArray> bytecodes = generator()->FinalizeBytecode(isolate(), parse_info()->script());
(lldb) job *bytecodes
0x18d63b2dfe1: [BytecodeArray] in OldSpaceParameter count 1
Frame size 8
    0 E> 0x18d63b2e01a @    0 : 93                StackCheck
  299 S> 0x18d63b2e01b @    1 : 6f 00 00 00       CreateClosure [0], [0], #0
         0x18d63b2e01f @    5 : 1e fb             Star r0
21946 S> 0x18d63b2e021 @    7 : 97                Return
Constant pool (size = 1)
0x18d63b2dfc9: [FixedArray] in OldSpace
 - map = 0x18ddb3022f1 <Map(HOLEY_ELEMENTS)>
 - length: 1
           0: 0x18d63b2df19 <SharedFunctionInfo>
Handler Table (size = 16)

Notice that the byte code for this is pretty short. If you look at node_bootstrap.js you see that it is a function. I think this is why the CreateClosure operator is for.

(lldb) job  *shared_info
0x1606df4adc91: [SharedFunctionInfo] in OldSpace
 - name = 0x160691002441 <String[0]: >
 - kind = [ NormalFunction ]
 - function_map_index = 129
 - formal_parameter_count = 0
 - expected_nof_properties = 10
 - language_mode = strict
 - instance class name = #Object
 - code = 0x2066fa144281 <Code BUILTIN>
 - bytecode_array = 0x1606df4adfe1
 - source code = // Hello, and welcome to hacking node.js!
//
// This file is invoked by node::LoadEnvironment in src/node.cc, and is
// responsible for bootstrapping the node.js core. As special caution is given
// to the performance of the startup process, many dependencies are invoked
// lazily.

Notice that source code contains the entire content of bootstrap_node.js.

If you later look at 0x18d63b2df19 (from bytecodes above) we can see that this is the inner function:

(lldb) job  *inner_shared_info
0x1606df4adf19: [SharedFunctionInfo] in OldSpace
 - name = 0x160691002441 <String[0]: >
 - kind = [ NormalFunction ]
 - function_map_index = 129
 - formal_parameter_count = 1
 - expected_nof_properties = 10
 - language_mode = strict
 - instance class name = #Object
 - code = 0x2066fa1450e1 <Code BUILTIN>
 - source code = (process) {
  let internalBinding;
  const exceptionHandlerState = { captureFn: null };

  function startup() {

And notice that this in now the anonymous function that takes the process object. This is what will be called later.

Next, we have:

compilation_info()->SetBytecodeArray(bytecodes);
compilation_info()->SetCode(BUILTIN_CODE(compilation_info()->isolate(), InterpreterEntryTrampoline));

InstallUnoptimizedCode will later use this Code and set that as on the SharedFunctionInfo:

shared->set_code(*compilation_info->code());

The inner_shared_info will also be compiled by FinalizeUnoptimizedCode and the bytecode for it will much longer as expected:

(lldb) job *bytecodes

Next this is compiled by FinalizeJobImpl:

compilation_info()->SetBytecodeArray(bytecodes);
compilation_info()->SetCode(BUILTIN_CODE(compilation_info()->isolate(), InterpreterEntryTrampoline));

Again notice that the code is set to InterpreterEntryTrampoline.

After having done the function will return success.

This will now return and we will end up back in node.cc:

Local<Value> result = script.ToLocalChecked()->Run();

This will land us in v8::Script::Run and the following line:

auto fun = i::Handle<i::JSFunction>::cast(Utils::OpenHandle(this));

Now if we print this JSFunction using job *fun we can see the complete object including the source. We can also take a look at the code and the bytecode:

(lldb) job  fun->abstract_code()
0x1606df4adfe1: [BytecodeArray] in OldSpaceParameter count 1
Frame size 8
    0 E> 0x1606df4ae01a @    0 : 93                StackCheck
  299 S> 0x1606df4ae01b @    1 : 6f 00 00 00       CreateClosure [0], [0], #0
         0x1606df4ae01f @    5 : 1e fb             Star r0
21946 S> 0x1606df4ae021 @    7 : 97                Return
Constant pool (size = 1)
0x1606df4adfc9: [FixedArray] in OldSpace
 - map = 0x1606023022f1 <Map(HOLEY_ELEMENTS)>
 - length: 1
           0: 0x1606df4adf19 <SharedFunctionInfo>
Handler Table (size = 16)

And a snipped of the code:

(lldb) job  fun->code()
0x2066fa144281: [Code]
kind = BUILTIN
name = InterpreterEntryTrampoline
compiler = unknown
Instructions (size = 1004)
0x2066fa1442e0     0  488b5f2f       REX.W movq rbx,[rdi+0x2f]
0x2066fa1442e4     4  488b5b07       REX.W movq rbx,[rbx+0x7]
0x2066fa1442e8     8  488b4b0f       REX.W movq rcx,[rbx+0xf]
0x2066fa1442ec     c  f6c101         testb rcx,0x1
0x2066fa1442ef     f  0f8591010000   jnz 0x2066fa144486  (InterpreterEntryTrampoline)
0x2066fa1442f5    15  f6c101         testb rcx,0x1

Once again we will be back in Invoke and this following line:

Handle<Code> code = is_construct ? isolate->factory()->js_construct_entry_code() : isolate->factory()->js_entry_code();

In this case is_construct is false so Handle<Code> will be what ever isolate->factory()->js_entry_code() returns. And it returns:

(lldb) job *code
0x2066fa084001: [Code]
kind = STUB
major_key = JSEntryStub

Lets take a closer look at the call isolate->factory()->js_entry_code(). In src/factory.h we can find:

#define ROOT_ACCESSOR(type, name, camel_name) inline Handle<type> name();
  ROOT_LIST(ROOT_ACCESSOR)
#undef ROOT_ACCESSOR

And we can find ROOT_LIST in src/heap/heap.h:

#define ROOT_LIST(V)  \
  STRONG_ROOT_LIST(V) \
  SMI_ROOT_LIST(V)    \
  V(StringTable, string_table, StringTable)

STRONG_ROOT_LIST(V) contains (among others):

  /* JS Entries */                                                             \
  V(Code, js_entry_code, JsEntryCode)                                          \
  V(Code, js_construct_entry_code, JsConstructEntryCode)

And in factory-inl.h we have:

#define ROOT_ACCESSOR(type, name, camel_name)                         \
  Handle<type> Factory::name() {                                      \
    return Handle<type>(bit_cast<type**>(                             \
        &isolate()->heap()->roots_[Heap::k##camel_name##RootIndex])); \
  }
ROOT_LIST(ROOT_ACCESSOR)
#undef ROOT_ACCESSOR

So, for js_entry_code the following would be expanded by the preprocessor:

  Handle<JsEntryCode> Factory::js_entry_code() {  
    return Handle<JsEntryCode>(bit_cast<JsEntryCode**>(&isolate()->heap()->roots_[Heap::kJsEntryCodeRootIndex]));
  }

Ok so back to the Invoke function:

JSEntryFunction stub_entry = FUNCTION_CAST<JSEntryFunction>(code->entry());
...
value = CALL_GENERATED_CODE(isolate, stub_entry, orig_func, func, recv, argc, argv);

Now, we know that stub_entry is the JSEntryStub (src/x64/code-stubs-x64.cc) and func is the InterpreterEntryTrampoline (src/builtins/x64/builtins-x64.cc).

->  0xd099dd84060: pushq  %rbp                      // function prologue
    0xd099dd84061: movq   %rsp, %rbp                // function prologue                                                                  Stack
    0xd099dd84064: pushq  $0x2                      // the stack frame marker type                                                        2 (Marker Type ConstructEntryFrame)
    0xd099dd84066: movabsq $0x105808b90, %r10       // mov the value of the current context address
    0xd099dd84070: movq   (%r10), %r10              // dereference so the address is in r10
    0xd099dd84073: pushq  %r10                      // push the context address onto the stack                                            address to current context
    0xd099dd84075: pushq  %r12                      // store callee saved registers which need to be preserved                            whatever was in r12
    0xd099dd84077: pushq  %r13                      // as above                                                                           whatever was in r13
    0xd099dd84079: pushq  %r14                      // as above                                                                           whatever was in r14
    0xd099dd8407b: pushq  %r15                      // as above                                                                           whatever was in r15
    0xd099dd8407d: pushq  %rbx                      // as above                                                                           whatever was in rbx
    0xd099dd8407e: movabsq $0x105807248, %r13       // InitializeRootRegister (x64/macro-assembler-x64.h)
    0xd099dd84088: addq   $0x80, %r13               // also part of InitializeRootRegister
    0x16e271f840f: movabsq $0x105808c00, %r10       // mov the frame pointer address into r10
    0x16e271f8409: pushq  (%r10)                    // dereferense so that the address on the stack                                       address to frame pointer descriptor 
    0x16e271f840C: movabsq 0x105808c20, %rax        // mov the js_entry_sp (stack pointer) into rax
    0x16e271f840a6: testq  %rax, %rax               // test 
    0x16e271f840a9: jne    0x16e271f840c3           // and jump if zero flag is 0 (rax is not all zeros) which is not the case
    0xd099dd840af: pushq  $0x2                      // StackFrame::OUTERMOST_JSENTRY_FRAME))
    0xd099dd840b1: movq   %rbp, %rax                // move the current frame base pointer to rax
    0xd099dd840b4: movabsq %rax, 0x105808c20        // move the base frame pointer to memory location 
    0xd099dd840be: jmp    0xd099dd840c5             // unconditional jump (jmp &cont)
    0xd099dd840c5: jmp    0xd099dd840e0             // unconditional jump (jmp &invok)
    0xd099dd840e0: movabsq $0x105808c08, %r10       // move handler_address to r10 (PushStackHandler src/x64/macro-assembler-x64.cc)
    0xd099dd840ea: pushq  (%r10)                    // push the dereferenced address of handler_address onto the stack                    address to handler
    0xd099dd840ed: movabsq $0x105808c08, %r10       // move the address again. not sure why this is done again through
    0xd099dd840f7: movq   %rsp, (%r10)              // set the stack pointer to be that of the handler. This will be the I
    0xd099dd840fa: callq  0xd099de418c0             // __ Call(BUILTIN_CODE(isolate(), JSEntryTrampoline), RelocInfo::CODE_TARGET);

    `Generate_JSEntryTrampolineHelper(MacroAssembler* masm, bool is_construct)` in `src/builtins/x64/builtins-x64.cc`:
    0x16e2720418c0: movq   %rdi, %r11               // mov new_target into r11 
    0x16e2720418c3: movq   %rsi, %rdi               // move func into rdi
    0x16e2720418c6: xorl   %esi, %esi               // __ Set(rsi, 0);
    0x16e2720418c8: pushq  %rbp                     // function prologue                                                                  previous base frame pointer
    0x16e2720418c9: movq   %rsp, %rbp               // function prologue
    0x16e2720418cc: pushq  $0x1c                    // the stack frame marker type                                                        1c (28)
    0x16e2720418ce: movabsq $0x16e272041861, %r10   // move the CodeObject address into r10??
    0x16e2720418d8: pushq  %r10                     // push the CodeObject onto the stack                                                 code object
    0x16e2720418da: movabsq $0x37b3a2c022e1, %r10   // emit_debug_code
    0x16e2720418e4: cmpq   %r10, (%rsp)             // emit_debug_code
    0x16e2720418e8: jne    0x16e2720418fa           // emit_debug_code
    0x16e2720418fa: movabsq $0x105808b90, %r10
    0x16e272041904: movq   (%r10), %rsi
    0x16e272041907: pushq  %rdi                                                                                                           function
    0x16e272041908: pushq  %rdx                                                                                                           receiver
    0x16e272041909: movq   %rcx, %rax
    0x16e27204190c: movq   %r8, %rbx
    0x16e27204190f: movq   %r11, %rdx
    Generate_CheckStackOverflow(masm, kRaxIsUntaggedInt):
    0x16e272041912: movq   0xd08(%r13), %r10         // LoadRoot (x64/macro-assembler-x64.cc) (index << kPointerSizeLog2) - kRootRegisterBias))
    0x16e272041919: movq   %rsp, %rcx                // 
    0x16e27204191c: subq   %r10, %rcx
    0x16e27204191f: movq   %rax, %r11
    0x16e272041922: shlq   $0x3, %r11
    0x16e272041926: cmpq   %r11, %rcx
    0x16e272041929: jg     0x16e272041940
    0x16e272041940: xorl   %ecx, %ecx                // __ Set(rcx, 0);  Start loop to copy args onto the stack
    0x16e272041942: jmp    0x16e27204194f
    0x16e27204194f: cmpq   %rax, %rcx                // __ cmpp(rcx, rax);
    0x16e272041954: callq  0x16e27203c400            // __ Call(masm->isolate()->builtins()->Call(), RelocInfo::CODE_TARGET)
    
    
The last `Call` I think will land in (x64/macro-assembler-x64.cc line 2003):
```c++
void TurboAssembler::Call(Handle<Code> code_object, RelocInfo::Mode rmode) {
  call(code_object, rmode);
#endif
}

call is declared in src/x64/assembler-x64.h:

void call(Handle<Code> target, RelocInfo::Mode rmode = RelocInfo::CODE_TARGET);

The definition can be found in src/x64/assembler-x64.cc:

void Assembler::call(Address entry, RelocInfo::Mode rmode) {
  DCHECK(RelocInfo::IsRuntimeEntry(rmode));
  EnsureSpace ensure_space(this);
  // 1110 1000 #32-bit disp.
  emit(0xE8);
  emit_runtime_entry(entry, rmode);
}

If we look again at the callq:

    0x16e272041954: callq  0x16e27203c400

And display the opcode for callq:

(lldb) dis -f -b
->  0x16e272041954: e8 a7 aa ff ff                 callq  0x16e27203c400

We can see match the opcode emitted by emit(0xE8) to e8. But what is returned by the call tomasm->isolate()-builtins()->call()`?
In src/isolate.h we have:

Builtins* builtins() { return &builtins_; }

If I back up 2 frames and try to execute isolate()->builtins()->Call() I get:

(lldb) job *isolate->builtins()->Call(static_cast<ConvertReceiverMode>(ConvertReceiverMode::kAny))

In src/builtins/builtins.h we can find the Call function with has a default value for the mode parameter:

Handle<Code> Call(ConvertReceiverMode = ConvertReceiverMode::kAny);

If we take a look in src/builtins/builtins-call.cc we find the definition of Builtins::Call:

Handle<Code> Builtins::Call(ConvertReceiverMode mode) {
  switch (mode) {
    case ConvertReceiverMode::kNullOrUndefined:
      return builtin_handle(kCall_ReceiverIsNullOrUndefined);
    case ConvertReceiverMode::kNotNullOrUndefined:
      return builtin_handle(kCall_ReceiverIsNotNullOrUndefined);
    case ConvertReceiverMode::kAny:
      return builtin_handle(kCall_ReceiverIsAny);
  }
  UNREACHABLE();
}

And if we look in src/builtins/builtins-definitions.h we can find Call_ReceiverIsAny`:

ASM(Call_ReceiverIsAny)

And we should be able to use ConvertReceiverMode::kAny with the Call function to find out what is going to be called (the code_object in call(code_object, rmode);:

(lldb) job *isolate->builtins()->Call(static_cast<ConvertReceiverMode>(ConvertReceiverMode::kAny))
0x16e27203c3a1: [Code]
kind = BUILTIN
name = Call_ReceiverIsAny
compiler = unknown
Instructions (size = 178)
0x16e27203c400     0  40f6c701       testb rdi,0x1
0x16e27203c404     4  0f8446000000   jz 0x16e27203c450  (Call_ReceiverIsAny)
0x16e27203c40a     a  488b4fff       REX.W movq rcx,[rdi-0x1]
0x16e27203c40e     e  80790bff       cmpb [rcx+0xb],0xff
0x16e27203c412    12  0f8408faffff   jz 0x16e27203be20  (CallFunction_ReceiverIsAny)    ;; code: BUILTIN
0x16e27203c418    18  80790bfe       cmpb [rcx+0xb],0xfe
0x16e27203c41c    1c  0f845efcffff   jz 0x16e27203c080  (CallBoundFunction)    ;; code: BUILTIN
0x16e27203c422    22  f6410c02       testb [rcx+0xc],0x2
0x16e27203c426    26  0f8424000000   jz 0x16e27203c450  (Call_ReceiverIsAny)
0x16e27203c42c    2c  80790bb7       cmpb [rcx+0xb],0xb7
0x16e27203c430    30  0f8505000000   jnz 0x16e27203c43b  (Call_ReceiverIsAny)
0x16e27203c436    36  e9e5000000     jmp 0x16e27203c520  (CallProxy)    ;; code: BUILTIN
0x16e27203c43b    3b  48897cc408     REX.W movq [rsp+rax*8+0x8],rdi
0x16e27203c440    40  488b7e27       REX.W movq rdi,[rsi+0x27]
0x16e27203c444    44  488bbfff000000 REX.W movq rdi,[rdi+0xff]
0x16e27203c44b    4b  e970f7ffff     jmp 0x16e27203bbc0  (CallFunction_ReceiverIsNotNullOrUndefined)    ;; code: BUILTIN
0x16e27203c450    50  55             push rbp
0x16e27203c451    51  4889e5         REX.W movq rbp,rsp
0x16e27203c454    54  6a1c           push 0x1c
0x16e27203c456    56  49baa1c30372e2160000 REX.W movq r10,0x16e27203c3a1  (Call_ReceiverIsAny)    ;; object: 0x16e27203c3a1 <Code BUILTIN>
0x16e27203c460    60  4152           push r10
0x16e27203c462    62  49bae122c0a2b3370000 REX.W movq r10,0x37b3a2c022e1    ;; object: 0x37b3a2c022e1 <undefined>
0x16e27203c46c    6c  4c391424       REX.W cmpq [rsp],r10
0x16e27203c470    70  7510           jnz 0x16e27203c482  (Call_ReceiverIsAny)
0x16e27203c472    72  48ba0000000009000000 REX.W movq rdx,0x900000000
0x16e27203c47c    7c  e83f950200     call 0x16e2720659c0  (Abort)    ;; code: BUILTIN
0x16e27203c481    81  cc             int3l
0x16e27203c482    82  57             push rdi
0x16e27203c483    83  b801000000     movl rax,0x1
0x16e27203c488    88  48bb9002430101000000 REX.W movq rbx,0x101430290    ;; external reference (Runtime::ThrowCalledNonCallable)
0x16e27203c492    92  e8a982f4ff     call 0x16e271f84740     ;; code: STUB, CEntryStub, minor: 8
0x16e27203c497    97  48837df81c     REX.W cmpq [rbp-0x8],0x1c
0x16e27203c49c    9c  7410           jz 0x16e27203c4ae  (Call_ReceiverIsAny)
0x16e27203c49e    9e  48ba000000004f000000 REX.W movq rdx,0x4f00000000
0x16e27203c4a8    a8  e813950200     call 0x16e2720659c0  (Abort)    ;; code: BUILTIN
0x16e27203c4ad    ad  cc             int3l
0x16e27203c4ae    ae  488be5         REX.W movq rsp,rbp
0x16e27203c4b1    b1  5d             pop rbp


RelocInfo (size = 11)
0x16e27203c414  code target (BUILTIN)  (0x16e27203be20)
0x16e27203c41e  code target (BUILTIN)  (0x16e27203c080)
0x16e27203c437  code target (BUILTIN)  (0x16e27203c520)
0x16e27203c44c  code target (BUILTIN)  (0x16e27203bbc0)
0x16e27203c458  embedded object  (0x16e27203c3a1 <Code BUILTIN>)
0x16e27203c464  embedded object  (0x37b3a2c022e1 <undefined>)
0x16e27203c47d  code target (BUILTIN)  (0x16e2720659c0)
0x16e27203c48a  external reference (Runtime::ThrowCalledNonCallable)  (0x101430290)
0x16e27203c493  code target (STUB)  (0x16e271f84740)
0x16e27203c4a9  code target (BUILTIN)  (0x16e2720659c0)

To find out what generated this code we can look in src/builtins/builtins-call-gen.cc:

void Builtins::Generate_Call_ReceiverIsAny(MacroAssembler* masm) {
  Generate_Call(masm, ConvertReceiverMode::kAny);
}

And an implementation can be found in src/builtins/x64/builtins-x64.cc:

void Builtins::Generate_Call(MacroAssembler* masm, ConvertReceiverMode mode) {
}

Now, the actual produced assembly code looks like this: (lldb) register read rax rdi rax = 0x0000000000000000 // number of args rdi = 0x000037b34d8b06d9 // the target to call

0x16e27203c400: testb  $0x1, %dil            // __ JumpIfSmi(rdi, &non_callable); which consists of CheckSmi. dil is the low 8 bits of rdi
0x16e27203c404: je     0x16e27203c450        // __ JumpIfSmi(rdi, &non_callable: which after CheckSmi will jump. rdi is the target and not a smi in our case
0x16e27203c40a: movq   -0x1(%rdi), %rcx      // __ CmpObjectType(rdi, JS_FUNCTION_TYPE, rcx); which does a movp
0x16e27203c40e: cmpb   $-0x1, 0xb(%rcx)      // __ CmpObjectType(rdi, JS_FUNCTION_TYPE, rcx); which does a cmpb
0x16e27203c412: je     0x16e27203be20        // __ j(equal, masm->isolate()->builtins()->CallFunction(mode), RelocInfo::CODE_TARGET);
                                             // I was not sure where to find this `j` function but it is in src/x64/assembler-x64.cc
                                             // so what are we jumping to? We can back up in the debugger and find out:
                                             // (lldb) up 2
                                             // (lldb job *isolate->builtins()->CallFunction(static_cast<ConvertReceiverMode>(ConvertReceiverMode::kAny))
                                             // This will be a builtin named CallFunction_ReceiverIsAny, and being a builtin you'll find it in 
                                             // `src/builtins/builtins-definitions.h`. The implementation will be in `src/builtins/builtins-call.cc` and
                                             // will result in `return builtin_handle(kCallFunction_ReceiverIsAny)`. This code repsonsible for generating
                                             // is Builtins::Generate_CallFunction and in our case that means `src/builtins/x64/builtins-x64.cc`

Builtins::Generate_Call(MacroAssembler* masm, ConvertReceiverMode mode) (src/builtins/x64/builtins-x64.cc)
0x16e27203be20: testb  $0x1, %dil            // __ JumpIfSmi(rdi, &non_callable);
0x16e27203be24: jne    0x16e27203be36        // __ JumpIfSmi(rdi, &non_callable);

0x16e27203be36: pushq  %rdi                  // __ Push(rdi);
__ CallRuntime(Runtime::kThrowCalledNonCallable); in `src/x64/macro-assembler-x64.h` and 


...
0x16e27203be63: movq   0x27(%rdi), %rsi      //__ movp(rsi, FieldOperand(rdi, JSFunction::kContextOffset));
0x16e27203be67: testb  $0x3, 0x87(%rdx)      // __ testl(FieldOperand(rdx, SharedFunctionInfo::kCompilerHintsOffset), Immediate(SharedFunctionInfo::IsNativeBit::kMask | SharedFunctionInfo::IsStrictBit::kMask));
0x16e27203be6e: jne    0x16e27203bf14        // __ j(not_zero, &done_convert);
                                             // rax = nr args, rdx = SharedFunctionInfo, rdi = target function, rsi = the function context
0x16e27203bf14: movslq 0x73(%rdx), %rbx      // __ movsxlq(rbx, FieldOperand(rdx, SharedFunctionInfo::kFormalParameterCountOffset));
0x16e27203bf18: movabsq $0x1050057f2, %r10   
0x16e27203bf18: movabsq $0x1050057f2, %r10   // __ InvokeFunctionCode(rdi, no_reg, expected, actual, JUMP_FUNCTION);
0x16e27203bfa4: movq   -0x60(%r13), %rdx     // LoadRoot(rdx, Heap::kUndefinedValueRootIndex);
0x16e27203bfa8: cmpq   %rax, %rbx            // cmpp(expected.reg(), actual.reg());
0x16e27203bfb2: movq   0x37(%rdi), %rcx      // movp(rcx, FieldOperand(function, JSFunction::kCodeOffset));
0x16e27203bfb6: addq   $0x5f, %rcx           // addp(rcx, Immediate(Code::kHeaderSize - kHeapObjectTag));
0x16e27203bfba: jmpq   *%rcx                 // jmp(rcx); 

Once again I got lost :(

I do know that if I set through I'll end up in RUNTIME_FUNCTION(Runtime_InterpreterNewClosure in runtime/runtime-interpreter.cc which has this line:

 return *isolate->factory()->NewFunctionFromSharedFunctionInfo(shared, context, vector_cell,static_cast<PretenureFlag>(pretenured_flag));

This call will delegate to another NewFunctionFromSharedFunctionInfo and:

Handle<JSFunction> result = NewFunction(initial_map, info, context_or_undefined, pretenure);

NewFunction will

Handle<JSFunction> function = New<JSFunction>(map, space);
function->initialize_properties();
function->initialize_elements();
function->set_shared(*info);
function->set_code(info->code());
function->set_context(*context_or_undefined);
function->set_prototype_or_initial_map(*the_hole_value());
function->set_feedback_vector_cell(*undefined_cell());

code will be the InterpreterEntryTrampoline and it looks like it was already been compiled:

(lldb) expr info->is_compiled()
(bool) $483 = true

This Handle is then returned RUNTIME_FUNCTION(Runtime_InterpreterNewClosure:

    0x101437967 <+263>: movq   %rax, -0x8(%rbp)                               // store the value return value Handle<JSFunction> on the stack
    0x10143796b <+267>: movq   -0x8(%rbp), %rax                               // move it into rax which is the register for return value
    0x10143796f <+271>: addq   $0x50, %rsp                                    // clean up local stack variables
    0x101437973 <+275>: popq   %rbp                                           // pop the previous functions base frame pointer
    0x101437974 <+276>: retq                                                  // transfer control to the return address on the stack (placed there by call)
    0x101437975 <+277>: nopw   %cs:(%rax,%rax)

    (after retq):
    0x16e271f847a4: cmpq   0x88(%r13), %rax
    0x16e271f847ab: je     0x16e271f847f8

    0x16e271f847b1: movq   -0x58(%r13), %r14
    0x16e271f847b5: movabsq $0x105808ba0, %r10
    0x16e271f847bf: cmpq   (%r10), %r14
    0x16e271f847c2: je     0x16e271f847c5

    0x16e271f847c5: movq   0x8(%rbp), %rcx
    0x16e271f847c9: movq   (%rbp), %rbp
    0x16e271f847cd: leaq   0x8(%r15), %rsp
    0x16e271f847d1: pushq  %rcx
    0x16e271f847d2: movabsq $0x105808b90, %r10
    0x16e271f847dc: movq   (%r10), %rsi
    0x16e271f847df: movq   $0x0, (%r10)
    0x16e271f847e6: movabsq $0x105808c00, %r10        ; imm = 0x105808C00
    0x16e271f847f0: movq   $0x0, (%r10)
    0x16e271f847f7: retq

    0x16e271fd6cfd: movq   %rax, %rdx
    0x16e271fd6d00: movq   %rax, -0x28(%rbp)
    0x16e271fd6d04: movq   %rsp, %rax
    0x16e271fd6d07: cmpq   -0x38(%rbp), %rax
    0x16e271fd6d0b: je     0x16e271fd6d42

    0x16e271fd6d42: movq   -0x20(%rbp), %r12
    0x16e271fd6d46: addq   $0x4, %r12
    0x16e271fd6d4a: movq   -0x18(%rbp), %rdx
    0x16e271fd6d4e: movq   -0x18(%rdx), %r14
    0x16e271fd6d52: movzbl (%r12,%r14), %eax
    0x16e271fd6d57: movabsq $0x100000000, %r10        ; imm = 0x100000000
    0x16e271fd6d61: cmpq   %rax, %r10
    0x16e271fd6d64: jae    0x16e271fd6d76

    0x16e271fd6d76: movq   -0x10(%rbp), %r15
    0x16e271fd6d7a: movq   (%r15,%rax,8), %rbx
    0x16e271fd6d7e: movq   (%rbp), %rbp
    0x16e271fd6d82: movq   0x38(%rsp), %rax

    0x16e271fd6d87: addq   $0x68, %rsp
    0x16e271fd6d8b: jmpq   *%rbx

    0x16e271fbdde0: movsbq 0x1(%r14,%r12), %rbx
    0x16e271fbdde6: movq   %rbp, %rdx
    0x16e271fbdde9: movq   %rax, (%rdx,%rbx,8)
    0x16e271fbdded: addq   $0x2, %r12
    0x16e271fbddf1: movzbl (%r12,%r14), %ebx
    0x16e271fbddf6: movabsq $0x100000000, %r10        ; imm = 0x100000000
    0x16e271fbde00: cmpq   %rbx, %r10
    0x16e271fbde03: jae    0x16e271fbde15

    0x16e271fbde15: movq   (%r15,%rbx,8), %rbx
    0x16e271fbde19: jmpq   *%rbx
 
    0x16e271fdca60: movq   %rbp, %rbx
    0x16e271fdca63: movl   $0x0, -0x20(%rbx)
    0x16e271fdca6a: movl   %r12d, -0x1c(%rbx)
    0x16e271fdca6e: movl   %r12d, %edx
    0x16e271fdca71: movabsq $0x100000000, %r10        ; imm = 0x100000000
    0x16e271fdca7b: cmpq   %rdx, %r10
    0x16e271fdca7e: jae    0x16e271fdca90

    0x16e271fdca90: subl   $0x39, %edx
    0x16e271fdca93: movabsq $0x100000000, %r10        ; imm = 0x100000000
    0x16e271fdca9d: cmpq   %rdx, %r10
    0x16e271fdcaa0: jae    0x16e271fdcab2
    
    0x16e271fdcab2: cmpl   $0x0, %edx
    0x16e271fdcab5: jl     0x16e271fdcb0b
    0x16e271fdcab7: movl   0x33(%r14), %ecx
    0x16e271fdcabb: movabsq $0x100000000, %r10        ; imm = 0x100000000
    0x16e271fdcac5: cmpq   %rcx, %r10
    0x16e271fdcac8: jae    0x16e271fdcada

    0x16e271fdcada: subl   $0x1, %ecx
    0x16e271fdcadd: movabsq $0x100000000, %r10        ; imm = 0x100000000
    0x16e271fdcae7: cmpq   %rcx, %r10
    0x16e271fdcaea: jae    0x16e271fdcafc

    0x16e271fdcafc: subl   %edx, %ecx
    0x16e271fdcafe: cmpl   $0x0, %ecx
    0x16e271fdcb01: jl     0x16e271fdcb1c

    0x16e271fdcb03: movq   -0x18(%rbx), %rbx
    0x16e271fdcb07: movl   %ecx, 0x33(%rbx)
    0x16e271fdcb0a: retq

    0x16e272044654: movq   -0x18(%rbp), %r14
    0x16e272044658: movq   -0x20(%rbp), %r12
    0x16e27204465c: shrq   $0x20, %r12
    0x16e272044660: movzbl (%r14,%r12), %ebx
    0x16e272044665: cmpb   $-0x69, %bl
    0x16e272044668: je     0x16e2720446a3

    0x16e2720446a3: movq   -0x18(%rbp), %rbx
    0x16e2720446a7: movl   0x2b(%rbx), %ebx
    0x16e2720446aa: leave                              // will leave the function 

    0x16e2720446ab: popq   %rcx
    0x16e2720446ac: addq   %rbx, %rsp
    0x16e2720446af: pushq  %rcx
    0x16e2720446b0: retq

    0x16e272041959: cmpq   $0x1c, -0x8(%rbp)
    0x16e27204195e: je     0x16e272041970

    0x16e272041970: movq   %rbp, %rsp
    0x16e272041973: popq   %rbp
    0x16e272041974: retq

    0x16e271f840ff: movabsq $0x105808c08, %r10        ; imm = 0x105808C08
    0x16e271f84109: popq   (%r10)
    0x16e271f8410c: addq   $0x0, %rsp
    0x16e271f84110: popq   %rbx
    0x16e271f84111: cmpq   $0x2, %rbx
    0x16e271f84115: jne    0x16e271f8412c
    0x16e271f8411b: movabsq $0x105808c20, %r10        ; imm = 0x105808C20
    0x16e271f84125: movq   $0x0, (%r10)
    0x16e271f8412c: movabsq $0x105808c00, %r10        ; imm = 0x105808C00
    0x16e271f84136: popq   (%r10)
    0x16e271f84139: popq   %rbx                       // pop callee saved registers
    0x16e271f8413a: popq   %r15
    0x16e271f8413c: popq   %r14
    0x16e271f8413c: popq   %r14
    0x16e271f8413e: popq   %r13
    0x16e271f84140: popq   %r12
    0x16e271f84142: addq   $0x10, %rsp
    0x16e271f84146: popq   %rbp
    0x16e271f84147: retq

    value = CALL_GENERATED_CODE(isolate, stub_entry, orig_func, func, recv, argc, argv);

value is the function of type JSFunction:

(lldb) job JSFunction::cast(value)->code()
(lldb) job JSFunction::cast(value)->abstract_code()
(lldb) job JSFunction::cast(value)->feedback_vector()

Just to recall, we called this function:

Local<Value> f_value = ExecuteString(env, MainSource(env), script_name);

To compile boostrap_node.js and the returned value if a Function that has been compiled. This is later called:

auto ret = f->Call(env->context(), Null(env->isolate()), 1, &arg);

This will once again land us in Invoke and:

value = CALL_GENERATED_CODE(isolate, stub_entry, orig_func, func, recv, argc, argv);

This is a function call so we will first entry the JSEntryStub, then the InterpreterTrampoline which will delegate to Generate_Call

    0x3db5b98c1954: callq  0x3db5b98bc400

    0x3db5b98bc400: testb  $0x1, %dil         // __ JumpIfSmi(rdi, &non_callable);
    0x3db5b98bc404: je     0x3db5b98bc450     // __ JumpIfSmi(rdi, &non_callable);
    0x3db5b98bc40a: movq   -0x1(%rdi), %rcx   // __ CmpObjectType(rdi, JS_FUNCTION_TYPE, rcx);
    0x3db5b98bc40e: cmpb   $-0x1, 0xb(%rcx)   // CmpInstanceType(map, type) called by CmpObjectType
    0x3db5b98bc412: je     0x3db5b98bbe20     // __ j(equal, masm->isolate()->builtins()->CallFunction(mode), RelocInfo::CODE_TARGET);
                                              // So the next instruction should match that from CallFunction.

    0x3db5b98bbe20: testb  $0x1, %dil         // 

Assembler:j(Condition cc, Handle<Code> target, RelocInfo::Mode rmode)

Can be found in src/x64/assembler-x64.cc (line 1367):

void Assembler::j(Condition cc, Handle<Code> target, RelocInfo::Mode rmode) {
  EnsureSpace ensure_space(this);
  DCHECK(is_uint4(cc));
  // 0000 1111 1000 tttn #32-bit disp.
  emit(0x0F);
  emit(0x80 | cc);
  emit_code_target(target, rmode);
}
``
Take this expression as an example:
```c++
__ j(equal, masm->isolate()->builtins()->CallFunction(mode), RelocInfo::CODE_TARGET);

Condition is an enum found in src/x64/assembler-x64.h:

enum Condition {
  // any value < 0 is considered no_condition
  no_condition  = -1,

  overflow      =  0,
  no_overflow   =  1,
  below         =  2,
  above_equal   =  3,
  equal         =  4,
  not_equal     =  5,
  below_equal   =  6,
  above         =  7,
  negative      =  8,
  positive      =  9,
  parity_even   = 10,
  parity_odd    = 11,
  less          = 12,
  greater_equal = 13,
  less_equal    = 14,
  greater       = 15,

  // Fake conditions that are handled by the
  // opcodes using them.
  always        = 16,
  never         = 17,
  // aliases
  carry         = below,
  not_carry     = above_equal,
  zero          = equal,
  not_zero      = not_equal,
  sign          = negative,
  not_sign      = positive,
  last_condition = greater
};

The jump instruction generated for this would look like:

(lldb) dis -f -b
->  0x16e27203c412: 0f 84 08 fa ff ff  je     0x16e27203be20

  emit(0x0F);
  emit(0x80 | cc);
  emit_code_target(target, rmode);

So we can see that 0f matches the emit(0x0F) call. And 84 matches emit(0x80 | 4) which is 84 in hex. (4 is Condition::equal) I was wondering about (0x80 | cc) and what that does. Well this will determin the type of jmp opcode to emit. A near jmp opcode consists of two bytes, the first being with 0F and the second varies depending on the type of jmp. So lets take Condition::not_equal which is 5:

(lldb) expr -f hex -- `0x80 | 5`
(int) $304 = 0x00000085

The last line, emit_code_target(target, rmode) which can be found in src/x64/assembler-x64-inl.h.

int current = static_cast<int>(code_targets_.size());

expr isolate->builtins()->Call(static_cast<ConvertReceiverMode>(ConvertReceiverMode::kAny))->address()
(v8::internal::Address) $306 = 0x0000302f3413c3a0

x64 jump instructions

Instruction Description Flags short jump opcode near jump opcodes

  • JO Jump if overflow ZF = 1 70 0F 80
  • JE/JZ Jump if equal/Jump if zero ZF = 1 74 0F 84
  • JNE/JNZ Jump if not equal/Jump if no zero ZF = 1 75 0F 85

Types of Jumps

A short jmp is encoded as two bytes 74 and the number of bytes +/- relative to the instruction pointer. The operand can only be a 8-bit operand. So this can jump -126 to +129 bytes.

A near jmp allows for jumps in the current segment and uses 0F 84 and the operand is a 16-bit operand

So a function is translated into bytecode by the ByteCodeGenerator which will visit the AST and emit bytecodes for each AST node. The bytecodes are set on the SharedFunctionInfo object and the code entry address is set to the InterpreterEntryTrampoline builtin stub. InterpreterEntryTrampoline will set up the stack frame and then dispatch to the interpreter's bytecode handler for the functions first bytecode. I think this is what the last call is doing above. So what is the first bytecode in our case?

(lldb) up 2
(lldb) job v8::internal::JSFunction::cast(func)->abstract_code()
0x37b34d8adfe1: [BytecodeArray] in OldSpaceParameter count 1
Frame size 8
    0 E> 0x37b34d8ae01a @    0 : 93                StackCheck
  299 S> 0x37b34d8ae01b @    1 : 6f 00 00 00       CreateClosure [0], [0], #0
         0x37b34d8ae01f @    5 : 1e fb             Star r0
21946 S> 0x37b34d8ae021 @    7 : 97                Return
Constant pool (size = 1)
0x37b34d8adfc9: [FixedArray] in OldSpace
 - map = 0x37b31cc022f1 <Map(HOLEY_ELEMENTS)>
 - length: 1
           0: 0x37b34d8adf19 <SharedFunctionInfo>
Handler Table (size = 16)

So at this point we have the JSEntryStub which will take information from the isolate->isolate->thread_local_top_ and make it available to the rest of the code that will be called. This is something that has to be done each time a function is entered. Now, the first InterpreterEntryTrampoline is called and hopefully we'll be able to verify that this goes through the func->abstract_code() and compiles it.

(lldb) job *isolate->builtins()->CallFunction(static_cast<ConvertReceiverMode>(ConvertReceiverMode::kAny))        0x3db5b98bbdc1: [Code]
kind = BUILTIN
name = CallFunction_ReceiverIsAny
compiler = unknown
Instructions (size = 510)
0x3db5b98bbe20     0  40f6c701       testb rdi,0x1                                      // __ AssertFunction
0x3db5b98bbe24     4  7510           jnz 0x3db5b98bbe36  (CallFunction_ReceiverIsAny)
0x3db5b98bbe26     6  48ba0000000036000000 REX.W movq rdx,0x3600000000
0x3db5b98bbe30    10  e88b9b0200     call 0x3db5b98e59c0  (Abort)    ;; code: BUILTIN
0x3db5b98bbe35    15  cc             int3l
0x3db5b98bbe36    16  57             push rdi
0x3db5b98bbe37    17  488b7fff       REX.W movq rdi,[rdi-0x1]
0x3db5b98bbe3b    1b  807f0bff       cmpb [rdi+0xb],0xff
0x3db5b98bbe3f    1f  5f             pop rdi
0x3db5b98bbe40    20  7410           jz 0x3db5b98bbe52  (CallFunction_ReceiverIsAny)
0x3db5b98bbe42    22  48ba000000003b000000 REX.W movq rdx,0x3b00000000
0x3db5b98bbe4c    2c  e86f9b0200     call 0x3db5b98e59c0  (Abort)    ;; code: BUILTIN
0x3db5b98bbe51    31  cc             int3l                                             // end __ AssertFunction

0x3db5b98bbe52    32  488b571f       REX.W movq rdx,[rdi+0x1f]

Codestubs

Are declared in src/code-stubs.h. Lets take a look at two CEntry and JSEntry:

#define CODE_STUB_LIST_ALL_PLATFORMS(V)       \
  /* --- PlatformCodeStubs --- */             \
  ...
  V(CEntry)                                   \
  ...
  V(JSEntry)                                  \

CodeStub extends ZoneObject. It has an enum named major with all the code stubs:

enum Major {
  NoCache = 0,
  ...
  CEntry,
  ...,
  JSEntry,
};

Handle<Code> GetCode() will return the Code for this code stub and generate it if needed. If the code is not in the cache then the following will be called to generate the code:

Handle<Code> new_object = GenerateCode();

Zone

Is very well documented:

// The Zone supports very fast allocation of small chunks of
// memory. The chunks cannot be deallocated individually, but instead
// the Zone supports deallocating all chunks in one fast
// operation. The Zone is used to hold temporary data structures like
// the abstract syntax tree, which is deallocated after compilation.

ZoneObject

Is an object that exist in a zone and is indented to be extended (like CodeStub does).

Script::Compile

Will delegate to ScriptCompiler::Compile, and then to ScriptCompiler::CompileUnboundInternal, and then to ScriptCompiler::CompileUnboundInternal.

i::MaybeHandle<i::SharedFunctionInfo> maybe_function_info = i::Compiler::GetSharedFunctionInfoForScript(
-> 2332            str, name_obj, line_offset, column_offset, source->resource_options,
   2333            source_map_url, isolate->native_context(), NULL, &script_data,
   2334            options, i::NOT_NATIVES_CODE, host_defined_options);
  ParseInfo parse_info(script);
  Zone compile_zone(isolate->allocator(), ZONE_NAME);
  ...
  maybe_result = CompileToplevel(&parse_info, isolate);

CompileToplevel

  if (parse_info->literal() == nullptr && !parsing::ParseProgram(parse_info, isolate)) {
  ...
  std::unique_ptr<CompilationJob> outer_function_job(GenerateUnoptimizedCode(parse_info, isolate, &inner_function_jobs));
...

GenerateUnoptimizedCode

src/compiler.cc

  Compiler::EagerInnerFunctionLiterals inner_literals;
  if (!Compiler::Analyze(parse_info, &inner_literals)) {
    return std::unique_ptr<CompilationJob>();
  }
  std::unique_ptr<CompilationJob> outer_function_job(
      PrepareAndExecuteUnoptimizedCompileJob(parse_info, parse_info->literal(), isolate));

PrepareAndExecuteUnoptimizedCompileJob

src/compiler.cc

(lldb) br s -f compiler.cc -l 385
std::unique_ptr<CompilationJob> job(interpreter::Interpreter::NewCompilationJob(parse_info, literal, isolate));
  if (job->PrepareJob() == CompilationJob::SUCCEEDED && job->ExecuteJob() == CompilationJob::SUCCEEDED) {
    return job;
  }

Lets take a look at parse_info:

(lldb) job parse_info->script_->source()
"'use strict';\x0a\x0a(function frogger(process) {\x0a  process._rawDebug('entry function');\x0a\x0a  function startup() {\x0a  process._rawDebug('startup function');\x0a  return true;\x0a  }\x0a\x0a  startup();\x0a});\x0a"

Notice that this is the complete contents of bootstrap_node.js:

(lldb) job parse_info->script_->name()
"bootstrap_node.js"

PrepareJob will print the AST if configured with --print-ast:

[generating bytecode for function: ]
--- AST ---
FUNC at 0
. KIND 0
. SUSPEND COUNT 0
. NAME ""
. INFERRED NAME ""
. EXPRESSION STATEMENT at 284
. . LITERAL "use strict"
. EXPRESSION STATEMENT at 299
. . ASSIGN at -1
. . . VAR PROXY local[0] (0x10702f338) (mode = TEMPORARY) ".result"
. . . FUNC LITERAL at 300
. . . . NAME
. . . . INFERRED NAME
. . . . PARAMS
. . . . . VAR (0x10701cd48) (mode = VAR) "process"
. RETURN at -1
. . VAR PROXY local[0] (0x10702f338) (mode = TEMPORARY) ".result"

VAR PROXY indicates that this scope resolution will connect these nodes declaring VAR nodes. This is all PrepareJob does.

`ExecuteJob(

Note that the project description data, including the texts, logos, images, and/or trademarks, for each open source project belongs to its rightful owner. If you wish to add or remove any projects, please contact us at [email protected].