Dynamically loading N-API functions #584

goto-bus-stop · 2020-08-08T11:49:48Z

This issue explores the N-API linking story on Windows and cross-platform.
I tried to make it understandable for people who haven't done any Windows
programming but please ask if anything's unclear 🙇‍♀️

Summary

We could manually load pointers to N-API functions using GetModuleHandle and
GetProcAddress on Windows, and dlopen() and dlsym() elsewhere. On Windows, this
lets us avoid a compile-time dependency on node.lib and makes the
win_delay_load_hook unnecessary.

Background

When your Neon addon is loaded, it looks up the address of all the napi_*
functions that it needs. Otherwise, it could not call those functions. These
napi_* functions are provided by the Node.js executable itself.

On Linux, my understanding is that the process has one big namespace for all the
functions. So when the addon is loaded, the OS's runtime linker can get to work
and find all the napi_* addresses for you, no matter where in the current
process they are defined.

On Windows, things are a bit stricter. You not only need to tell it about the
name of the function, but also which module (exe or dll file) contains it. This
is done at compile time using a .lib file (node.lib for N-API). This .lib
file contains entries like "find napi_create_object in node.exe". When the addon
is loaded, the OS's runtime linker looks up all the napi_* addresses inside
node.exe.

The Problems

This setup causes two complications for Neon.

1. We need to have this node.lib file available at build time, because we need
   to link to it. This file does not ship with Node.js installations, at least
   not on Windows.
2. Not all Node.js executables are called "node.exe". Node.js can be embedded
   inside other applications. The major one is "electron.exe". If we build a
   Neon addon with Node.js's node.lib, Windows will look for N-API functions
   in "node.exe". But, if we are running the node addon inside Electron, there
   is no "node.exe": this causes Windows to look for a node.exe at some
   predefined paths on the system and load it if it exists. Whether "node.exe"
   exists or not, the end result is not good.

So, what we really want to tell Windows is to look up napi_* functions in
the host process, whether it is named "node.exe" or "electron.exe" or
something else.

The obvious way to address problem 1 is by downloading node.lib in a build.rs
script. The more interesting one is problem 2.

A Solution

node-gyp solves this using a delayed loading hook. MSVC and Windows have a neat
feature where you can specify that a particular module that you link to should
not be loaded at startup, but only once you use its functions. For Neon, that
would mean that Windows won't look for all the napi_* functions as soon as the
addon is loaded, but only look them up once they are called. This is called
delayed loading.

Delayed loading also allows you to declare hooks.
These hooks let you intercept the loading of a module or a function. The
interesting bit for us: we can intercept loads of the module "node.exe", and
return the correct value ourselves. GetModuleHandle(NULL) returns a handle to
the executable of the calling process, whatever its file name is.

This is what it looks like:
https://github.com/nodejs/node-gyp/blob/aaf33c30296ddb71c12e2b587a5ec5add3f8ace0/src/win_delay_load_hook.cc#L23-L35
(The final line, __pfnDliNotifyHook2 = load_exe_hook, declares a symbol that
the delayed loading code will look for when it loads something. That's how the
hooking is done.)

It is also possible to do this in Rust. The function looks quite similar:
https://github.com/goto-bus-stop/neon/blob/00d60dd0f6b70b70e32bededeba31ba729d44d6a/src/win_delay_load_hook.rs#L51-L72

This lets us load Neon addons on Windows, even if the "node.exe" file has been
renamed 🎉

But… it's not perfect

We can now build addons for Windows, but there are still some pain points:

- The build script downloads a file from nodejs.org, outside of Cargo. What if
  nodejs.org is down, or blocked?
- We still need to handle Electron specially, because its node.lib file
  tells the runtime linker to look for napi_* functions in the module
  "electron.exe". This can be as simple as just checking for both "node.exe"
  and "electron.exe" in the code above. But what if there are other
  Node.js-based applications in the future, or a new Node.js fork?

Alternative Solution

Whenever we call an N-API function with delayed loading, Windows uses the
information from the node.lib file to essentially do this for us:

HMODULE node = GetModuleHandle("node.exe");
FARPROC fn = GetProcAddress(node, "napi_create_object");

So, what if we do this ourselves instead? Then, the node.lib file is
unnecessary. We then also control all the details. In particular, it allows us
to make tweaks like this:

HMODULE node = GetModuleHandle(NULL);
FARPROC fn = GetProcAddress(node, "napi_create_object");

Now, we immediately tell the Windows API that we're looking for the host
process module, and we don't need to use the load hook to redirect "node.exe"
or "electron.exe" at all.

Bill Ticehurst wrote about this approach in a blog post. That was
for a different use case, but it seems like people are already doing things like
this.

In Rust, we can do this in a cross-platform way using libloading. We can
then store pointers to the napi_* functions that we need in a static location
in memory, so that it's very fast: one pointer dereference and one call
instruction. This is basically the same as what the runtime linker would do.
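
A minimal sketch of that lookup, for illustration only. To stay free of dependencies it calls the OS functions directly instead of going through libloading, so it is not what a libloading-based implementation would look like line for line. The module name host and the function host_symbol are hypothetical names made up for this example. On Windows the lookup only succeeds for symbols that the host executable exports, which node.exe does for the napi_* functions.

```rust
use std::ffi::{c_void, CStr};

#[cfg(unix)]
mod host {
    use std::ffi::c_void;
    use std::os::raw::{c_char, c_int};

    // `unsafe extern` needs Rust 1.82 or newer; older toolchains write a
    // plain `extern "C"` block instead.
    #[link(name = "dl")]
    unsafe extern "C" {
        fn dlopen(filename: *const c_char, flags: c_int) -> *mut c_void;
        fn dlsym(handle: *mut c_void, symbol: *const c_char) -> *mut c_void;
    }

    const RTLD_LAZY: c_int = 1; // value on Linux and macOS

    pub unsafe fn symbol(name: *const c_char) -> *mut c_void {
        unsafe {
            // A null filename asks for a handle to the main program, which
            // also searches the libraries loaded with it.
            let process = dlopen(std::ptr::null(), RTLD_LAZY);
            if process.is_null() {
                return std::ptr::null_mut();
            }
            dlsym(process, name)
        }
    }
}

#[cfg(windows)]
mod host {
    use std::ffi::c_void;
    use std::os::raw::c_char;

    #[link(name = "kernel32")]
    unsafe extern "system" {
        fn GetModuleHandleA(module_name: *const c_char) -> *mut c_void;
        fn GetProcAddress(module: *mut c_void, name: *const c_char) -> *mut c_void;
    }

    pub unsafe fn symbol(name: *const c_char) -> *mut c_void {
        unsafe {
            // A null module name asks for the executable of this process,
            // whether it is node.exe, electron.exe or something else.
            let process = GetModuleHandleA(std::ptr::null());
            if process.is_null() {
                return std::ptr::null_mut();
            }
            GetProcAddress(process, name)
        }
    }
}

/// Looks up `name` in the host process. Returns `None` if it is not found.
pub fn host_symbol(name: &CStr) -> Option<*mut c_void> {
    let address = unsafe { host::symbol(name.as_ptr()) };
    if address.is_null() {
        None
    } else {
        Some(address)
    }
}
```

Inside a running addon, host_symbol would be called with names such as napi_create_object, and the returned address would be cast to the matching function pointer type before it is stored.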

One possible implementation for this may be the recently proposed dynamic
linking PR for bindgen: #1846. It could generate a NodeApi struct with
all the N-API function pointers in it. When a Neon addon loads, we can create
an instance of this struct. Then we update all our napi_* callsites to
something like:

(napi().napi_create_object)(.. args ..)

Where napi() is a hypothetical function that returns the NodeApi struct. The
extra set of parens is necessary because .napi_create_object is a field
containing a pointer; it's not a method. This napi() function could likely
just return a static address, so it should always be inlined by the compiler.

The initialization can happen in the register_module! macro so that end users
don't have to worry about it.
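
Roughly, the NodeApi struct and the napi() accessor could be sketched as below. This is hand-written for illustration and is not the output of the bindgen PR. The type aliases are simplified stand-ins for the real N-API types, and load, init and the resolve callback are hypothetical names. The table lives in a OnceLock, which is an assumption of this sketch: it adds an initialization check on each access that a plain static would not have.

```rust
use std::ffi::c_void;
use std::sync::OnceLock;

// Simplified stand-ins for the real N-API types (assumptions of this sketch).
pub type NapiEnv = *mut c_void;
pub type NapiValue = *mut c_void;
pub type NapiStatus = i32;

/// One field per N-API function. Only two are shown here.
pub struct NodeApi {
    pub napi_create_object: unsafe extern "C" fn(NapiEnv, *mut NapiValue) -> NapiStatus,
    pub napi_get_undefined: unsafe extern "C" fn(NapiEnv, *mut NapiValue) -> NapiStatus,
}

impl NodeApi {
    /// Fills the table by asking `resolve` for the address of each function.
    ///
    /// Safety: every non-null address returned by `resolve` must point to a
    /// function with the signature of the field it is stored in.
    pub unsafe fn load(resolve: impl Fn(&str) -> *mut c_void) -> Result<NodeApi, String> {
        let find = |name: &str| {
            let address = resolve(name);
            if address.is_null() {
                Err(format!("missing N-API symbol: {}", name))
            } else {
                Ok(address)
            }
        };
        unsafe {
            Ok(NodeApi {
                napi_create_object: std::mem::transmute(find("napi_create_object")?),
                napi_get_undefined: std::mem::transmute(find("napi_get_undefined")?),
            })
        }
    }
}

static NAPI: OnceLock<NodeApi> = OnceLock::new();

/// Stores the table. A second call keeps the table from the first call.
pub fn init(api: NodeApi) {
    let _ = NAPI.set(api);
}

/// Returns the table. Panics if `init` has not been called yet.
#[inline]
pub fn napi() -> &'static NodeApi {
    NAPI.get().expect("NodeApi used before init()")
}
```

A call site then reads (napi().napi_create_object)(env, &mut result), with init called once from the module registration code before any N-API call is made.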

Advantages

With the dynamic loading approach, we don't need a node.lib file, and we
don't need to hook the delayed loading mechanism. This removes the hairiest
platform-specific part of the build system. The win_delay_load_hook can
also be hard to understand for outsiders, because it relies on an obscure
feature.

Drawbacks

The main ones I can see:

- Doing the dynamic loading ourselves is unnecessary on Linux. But AFAICT there
  is no simple way to switch between the "standard" approach and this dynamic
  approach at compile time, because the source code has to change at each N-API
  call site.
- Our code becomes a bit clunky with the parens and napi() call.


kjvalencik · 2020-08-08T15:03:50Z

I really like the alternative approach. I think it's acceptable to introduce the dynamic loading for Linux/macOS to reduce the cross-platform differences.

However, if we want to avoid it, we could introduce a napi!(napi_method_name)(args) macro with platform-specific implementations. On Windows it would expand to (napi().napi_method_name)(args) and on other OSes to nodejs_sys::napi_method_name(args).

Or something similar. I'm not sure if rust-analyzer would handle it better with the arguments inside the macro or outside.
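
A rough sketch of such a macro, with the arguments outside the macro. Everything besides the macro itself is a stand-in so that the example compiles without Node.js: the nodejs_sys module here is a fake local module rather than the real crate, the dynamic module and its NodeApi table are hypothetical, and napi_method_name is a placeholder with a made-up signature and body.

```rust
// Stand-in for the statically linked bindings. The body is fake.
#[cfg(not(windows))]
mod nodejs_sys {
    pub unsafe extern "C" fn napi_method_name(input: i32) -> i32 {
        input + 1
    }
}

// Stand-in for the dynamically loaded function table used on Windows.
#[cfg(windows)]
mod dynamic {
    pub struct NodeApi {
        pub napi_method_name: unsafe extern "C" fn(i32) -> i32,
    }

    unsafe extern "C" fn fake_method(input: i32) -> i32 {
        input + 1
    }

    static API: NodeApi = NodeApi {
        napi_method_name: fake_method,
    };

    pub fn napi() -> &'static NodeApi {
        &API
    }
}

// On Windows the macro expands to a field access on the function table.
#[cfg(windows)]
macro_rules! napi {
    ($name:ident) => {
        (dynamic::napi().$name)
    };
}

// Elsewhere it expands to a path to the statically linked function.
#[cfg(not(windows))]
macro_rules! napi {
    ($name:ident) => {
        nodejs_sys::$name
    };
}

/// The call site is written the same way on every platform.
pub fn call_through_macro(input: i32) -> i32 {
    unsafe { napi!(napi_method_name)(input) }
}
```

Because the macro only produces the callee, the argument list stays ordinary Rust code at the call site. Taking the arguments inside the macro instead would leave room for wrapping each call with status checking.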

goto-bus-stop · 2020-08-09T09:16:30Z

Yes! A macro is a good idea, and would be great for other reasons too. We could then more easily introduce something similar to the NAPI_CALL() business shown here (scroll down a bit) to provide better error information instead of the assert_eq!(status, napi_ok) that we do right now, if we want to, or swap out the linking details again later down the line.

tjallingt · 2020-08-09T11:19:52Z

I'm not sure how much of N-API is used by the average Neon app. Would it be possible/difficult to lazily populate the NodeApi struct with the procedure addresses?

goto-bus-stop · 2020-08-09T11:30:55Z

With the current design of the bindgen PR, that is not an option. If we write our own Rust signatures for the N-API functions it would be possible. I don't think that's likely to be a performance bottleneck though.

That said, one more drawback of the dynamic loading approach with the current bindgen design is that we would be looking up every N-API function, instead of only the ones we use. If that turns out to be a problem we can definitely try lazy loading the functions or keeping our own list of them.

goto-bus-stop · 2020-08-11T14:49:00Z

To address that last drawback, we could use .whitelist_function() to only generate bindings for the functions that we use.

goto-bus-stop · 2021-01-10T10:00:10Z

I'd say #646 has done this :)

goto-bus-stop mentioned this issue Aug 13, 2020

Port win_delay_load_hook #588

Merged

1 task

goto-bus-stop mentioned this issue Nov 26, 2020

Dynamic loading sample #645

Closed

kjvalencik mentioned this issue Nov 30, 2020

Dynamic Loading #646

Merged

goto-bus-stop closed this as completed Jan 10, 2021

This was referenced Feb 29, 2024

npm:better-sqlite3 npm:node-libcurl not working denoland/deno#18444

Closed

Deno support - Uncaught Error: Could not locate the bindings file. WiseLibs/better-sqlite3#1034

Open

ducktype mentioned this issue Mar 23, 2024

Windows Support oven-sh/bun#43

Closed
