Patch skipping crashing in functions marked with [UnmanagedCallersOnly] #102767
Comments
Can you try changing |
This may hide the issue, similar to how changing the GC mode hides it, but it won't fix the underlying problem.
Thread.Begin/EndThreadAffinity do nothing in .NET Core. You can delete these calls. (These calls were relevant in .NET Framework for low-level hosting, e.g. under MS SQL Server.)
Could you please share more details about the location of the crash? Where is the origin exception thrown exactly? |
Yes, this call is an unmanaged->managed transition. It would be useful to know where the crash happens inside this call. I do not think that Visual Studio will show it to you. You would need to run under a low-level debugger like windbg or collect a crash dump (https://learn.microsoft.com/en-us/windows/win32/wer/collecting-user-mode-dumps). |
Now this is where it gets interesting: while running it inside windbg, the issue doesn't occur at all, even after running it many times. I'll try to get you a crash dump. Are there any special minidump flags you would like turned on? |
Default settings should work fine, to start with at least. |
Here you go |
Thank you for sharing the minidump. Here are the details of the crash:
The crash is in the "Just My Code" check. It is extra code that gets inserted by the JIT at the start of every method when running under a managed debugger like Visual Studio. It allows the managed debugger to provide a better stepping experience. It explains why you are not able to reproduce the crash outside Visual Studio. |
The exception details:
Notice that the ExceptionAddress is different from the actual address of the code (see my previous comment). It suggests that this is a problem with patch skipping. Patch skipping is a managed debugger feature that copies a few instructions out to a different spot to execute them in isolation. Patch skipping has to update the relative addresses in the code when it copies the code out, but it was not done here for @dotnet/dotnet-diag-contrib PTAL |
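To make the "update the relative addresses" step concrete, here is a minimal sketch of the arithmetic involved. It assumes an x64 RIP-relative operand with a 32-bit displacement; the helper name `RelocateRipRelativeDisp` and its parameters are hypothetical and are not taken from the runtime's debugger code.

```cpp
#include <cstdint>
#include <optional>

// Hypothetical helper (not the runtime's actual code): recompute the rel32
// displacement of a RIP-relative instruction after it has been copied from
// origAddr to copyAddr. instrLen is the full instruction length, because a
// RIP-relative operand is measured from the end of the instruction.
std::optional<int32_t> RelocateRipRelativeDisp(uint64_t origAddr,
                                               uint64_t copyAddr,
                                               uint32_t instrLen,
                                               int32_t origDisp)
{
    // Absolute address the original instruction refers to.
    const uint64_t target =
        origAddr + instrLen + static_cast<uint64_t>(static_cast<int64_t>(origDisp));

    // Displacement that reaches the same target from the copy's location.
    const int64_t newDisp = static_cast<int64_t>(target - (copyAddr + instrLen));

    // If the copy is too far away, the target cannot be encoded as rel32.
    if (newDisp < INT32_MIN || newDisp > INT32_MAX)
        return std::nullopt;

    return static_cast<int32_t>(newDisp);
}
```

If the copied instruction keeps its original displacement, it reads or writes memory relative to the copy's address instead of the original one, which would be consistent with an exception address that does not match the real code address.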
Tagging subscribers to this area: @tommcdon |
Awesome you've been able to pinpoint it so fast, that makes sense. Do you need anything else from me? |
I can't see the bypass buffers in the dump sadly, but I think there's a few issues:
|
No, even in 6.0 the decoder would classify the instruction as non-write. This is actually doing silly copy to and from the module data structure. But this is actually an unlikely root cause. Rather, the window of time between where we overwrite the displacement |
@markoostveen, does the native portion run on many threads and is this the first time they execute managed code? |
@markoostveen would you mind trying to disable Just My Code in the debugger to determine if issue reproduces without it enabled? Also would it be possible to share a VS solution containing a small C# console app and C++ DLL that we can use to reproduce and debug the issue? |
Yes, the native portion is running on multiple threads, some started from C# but others started in native code. Those threads are executing jobs from a job system (written in my spare time), which in the end calls the managed callback as part of a job.
I've run the program with Just My Code both enabled and disabled, and it behaves the same in both cases, resulting in the same crash. If you want, I can provide dumps with it enabled and with it disabled.
I'm not sure if I'll be able to provide this as I don't have an easy way to compress it into a smaller sharable console application. Sharing the entire app isn't possible due to company policy but I'll try to get clearance if need be. |
Well, I got clearance to share it much faster than anticipated, although I can only share release binaries so you'll be able to run it. It is a standalone executable running a small simulation (no GUI yet in C#). While running you will see the following
The implementation of the method that is causing the issue is under namespace Ers.EventScheduler if you look at the Ers.dll in ILSpy or another IL inspection tool.
To cause the issue I was encountering, open Visual Studio using the following command in cmd:
devenv /debugexe AWealthOfRows.exe
Then right-click on the awealthofrows project, click Properties, and change the debugger type to "Mixed (.Net Core, .Net 5+)" |
Hey, is there any progress on the issue? I just want to check in. If there isn't any progress, I'll try to get clearance to extend the period cutoff date/killswitch. |
It's still under investigation. In the meantime, we have created a standalone repro, so the original app shouldn't be necessary any longer.

On the C# side:

    using System.Runtime.CompilerServices;
    using System.Runtime.InteropServices;

    internal class Program
    {
        // Number of callbacks that have completed so far.
        static volatile int count = 0;

        // Managed callback that the native threads call into.
        [UnmanagedCallersOnly(CallConvs = new[] { typeof(CallConvCdecl) }, EntryPoint = "MyFoo")]
        static void MyFoo(nint obj)
        {
            Interlocked.Increment(ref count);
        }

        [DllImport("Dll1.dll", CallingConvention = CallingConvention.Cdecl)]
        static unsafe extern void TestFunction([MarshalAs(UnmanagedType.FunctionPtr)] delegate* unmanaged[Cdecl]<nint, void> callback);

        static unsafe void Main(string[] args)
        {
            Console.WriteLine("Hello, World!");
            const int numThreads = 1000;
            // Each call starts one native thread that invokes MyFoo once.
            for (int i = 0; i < numThreads; i++)
            {
                TestFunction(&MyFoo);
            }
            // Wait until every native thread has called back.
            while (count < numThreads)
            {
                Thread.Sleep(1);
            }
            Console.WriteLine("Hit enter to quit");
            Console.In.ReadLine();
        }
    }

On the C++ side (Dll1.dll):

    #include <windows.h>

    #define API extern "C" __declspec(dllexport)

    volatile LONG i = 0;
    typedef void(__cdecl* MyCallback)(void* obj);

    // Call into managed code once from the current native thread.
    void DoStuff(MyCallback callback)
    {
        callback((void*)i);
        InterlockedIncrement(&i);
    }

    DWORD WINAPI ThreadProc(
        _In_ LPVOID lpParameter
    )
    {
        DoStuff((MyCallback)lpParameter);
        return 0;
    }

    // Start a new native thread that invokes the callback.
    API void __cdecl TestFunction(void(__cdecl* callback)(void* obj))
    {
        CreateThread(NULL, 0, ThreadProc, callback, 0, NULL);
    }

|
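The native half of the repro boils down to "start many fresh threads and have each one invoke the callback once". A portable sketch of that pattern using `std::thread` instead of `CreateThread` might look like the following. The names `InvokeOnFreshThreads`, `CountingCallback`, and `g_calls` are hypothetical, and the counting callback is only a plain C++ stand-in for the managed `MyFoo`; unlike the repro, this version joins the threads itself rather than letting the caller poll a counter.

```cpp
#include <atomic>
#include <cstdint>
#include <thread>
#include <vector>

// Same shape as MyCallback in the repro; the default calling convention is
// assumed here because __cdecl is compiler-specific.
using Callback = void (*)(void* obj);

// Hypothetical stand-in for the managed callback: it only counts invocations.
static std::atomic<int> g_calls{0};
static void CountingCallback(void* /*obj*/)
{
    g_calls.fetch_add(1, std::memory_order_relaxed);
}

// Start numThreads new threads, have each call the callback exactly once,
// and wait for all of them to finish.
void InvokeOnFreshThreads(Callback callback, int numThreads)
{
    std::vector<std::thread> threads;
    threads.reserve(static_cast<std::size_t>(numThreads));
    for (int i = 0; i < numThreads; ++i)
    {
        threads.emplace_back([callback, i] {
            // Pass the thread index as the opaque argument, as the repro does.
            callback(reinterpret_cast<void*>(static_cast<std::intptr_t>(i)));
        });
    }
    for (auto& t : threads)
        t.join();
}
```

Calling `InvokeOnFreshThreads(&CountingCallback, 64)` leaves `g_calls` at 64. In the real repro, each of these calls would be the first native-to-managed transition on its thread.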
I've also disabled the feature FEATURE_EMULATE_SINGLESTEP to check whether it was related to the issue. When I do, native-only debugging works as expected, but the breakpoint being hit is different: in that case, the 'address' member of struct DebuggerControllerPatch is NULL, causing IsBound to return false in another assertion |
This is an unrelated problem that was fixed by #109603 |
Description
This is a repost of my stackoverflow post, that I have found a temporary solution for here
C# function pointers causes access violation upon calling from C++
I have tried debugging when the issue occurs. In this case C++ calls the callback, but the exception is thrown before execution enters the function's scope, and the call immediately returns with an error code.
For the sake of debugging, I've tried switching the calling convention from cdecl to stdcall to see if it has to do with stack corruption.
My assumptions are that:
Good to know:
Reproduction Steps
In the following bit of code, I'm using C# function pointers instead of a delegate type, for faster performance. The code works fine when I use a delegate type marshalled as a function pointer: no exceptions, no issues. However, when changing it from a delegate to a function pointer, it sometimes throws an access violation.
Code is modified to remove namespaces and prefixes not relevant to the question
On the C++ side
Expected behavior
The thread should enter and execute the static function passed using a function pointer
Actual behavior
The thread doesn't enter the user-defined portion of the function. An access violation is thrown in a generated call visible in the IL during the native-to-managed transition.
Regression?
No response
Known Workarounds
Change the garbage collection mode by adding
```xml
true
true
```