Breaking Change in .NET 9 Crashes Ignite Node

Old JDK code meets new Intel security feature, JVM + CLR in one process, and a mysterious crash.

Investigation

Recently a user reported that Ignite crashes after upgrading to .NET 9, with a mysterious error:

process 7692 exited with code -1073740791 (0xc0000409)

  • It works on Linux and crashes on Windows
  • The only change is the upgrade from .NET 8 to .NET 9
  • The process exits abruptly, with no stack trace or any other information

By stepping through the code in a debugger, we can see that the crash occurs at a call to an unmanaged function:

jint JNI_CreateJavaVM(JavaVM **p_vm, JNIEnv **p_env, void *vm_args)

Looking up the error code yields STATUS_STACK_BUFFER_OVERRUN. Ignite.NET allocates two pointers on the CLR stack and passes them to the JVM, which then writes data to these pointers. The error suggests that the JVM writes more data than expected, causing the stack overrun, but that is not the case.
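The error code lookup is easy to reproduce: the negative exit code is the same 32 bits as the NTSTATUS value, only printed as a signed integer. Below is a minimal sketch in C; the helper names exit_code_to_ntstatus and print_ntstatus are made up for illustration and are not part of Ignite or the Windows SDK.

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical helper: reinterpret a signed 32-bit process exit code
 * as the unsigned NTSTATUS value it encodes. Conversion to uint32_t
 * is well-defined in C (reduction modulo 2^32). */
static uint32_t exit_code_to_ntstatus(int32_t exit_code)
{
    return (uint32_t)exit_code;
}

/* Hypothetical helper: print the status in the hex form used by
 * Windows headers and documentation. */
static void print_ntstatus(int32_t exit_code)
{
    printf("0x%08" PRIX32 "\n", exit_code_to_ntstatus(exit_code));
}
```

Calling print_ntstatus(-1073740791) prints 0xC0000409, the value that Windows documents as STATUS_STACK_BUFFER_OVERRUN.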

It was time to go deeper: download JDK symbols and sources and step through the JVM code. That brings us to this call:

  get_cpu_info_stub(&_cpuid_info);

The stub points to a function that is generated from assembly code at runtime. The assembly code performs a lot of tricks to detect CPU capabilities, so it is not easy to follow or debug. It even deals with 386/486 CPUs, which is quite fascinating:

    //
    // if we are unable to change the AC flag, we have a 386
    //
    __ xorl(rax, HS_EFL_AC);
    __ push(rax);
    __ popf();
    __ pushf();
    __ pop(rax);
    __ cmpptr(rax, rcx);
    __ jccb(Assembler::notEqual, detect_486);

Java is old. And we are at a dead end.

When Everything Else Fails, Read the Manual

The list of breaking changes in .NET 9 is long, but one entry in the “Interop” section catches our eye: CET (Intel Control-flow Enforcement Technology) supported by default.

If libraries try to change a thread context to any other location, the process is terminated.

In fact, the Visual Studio debugger gave us a clue earlier with the following message: Unknown __fastfail() status code: 0x0000000000000030, which corresponds to the FAST_FAIL_SET_CONTEXT_DENIED error.

Solution

Even the latest JDK uses the same tricky CPU detection code, which trips the CET check, so we have to turn CET compatibility off by adding <CETCompat>false</CETCompat> to the csproj file of the target project (the one that starts the process).
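As a rough sketch, the project file of the executable could look like the fragment below. Only the CETCompat line comes from the fix itself; the surrounding SDK-style layout and the OutputType and TargetFramework values are assumptions for illustration and should match whatever the real project already contains.

```xml
<Project Sdk="Microsoft.NET.Sdk">

  <PropertyGroup>
    <!-- Assumed values; keep the ones your project already uses -->
    <OutputType>Exe</OutputType>
    <TargetFramework>net9.0</TargetFramework>

    <!-- Opt this executable out of CET compatibility -->
    <CETCompat>false</CETCompat>
  </PropertyGroup>

</Project>
```

The property belongs in the project that produces the executable, not in a class library it references, because the setting applies to the process that gets started.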

This can’t be fixed on the Ignite.NET side and has to be done by the user.

Written on November 21, 2024