Need help understanding an error thrown from safe_app.dll


#1

Hey all,

I have been stuck on this for a while. I have tried to solve it on my own but cannot locate the actual cause of the error, so I hope the details below give someone a clue about what might be going on, and give me a new angle from which to approach the problem again.

I like to consider myself a super problem solver guy in managed land. The problems can always be found and fixed eventually. But when it comes to unmanaged code, well that’s simply not something I’ve been doing, and it’s completely veiled to me for this reason. So, it’s hard for me to tell what this is about.

Now, since this is not an error I’ve seen anyone mention, I assume it is some implementation error on my side. However, I can’t find anything substantial to support it.
I am running out of ideas and the single substantial clue I have is pointing towards safe_app.dll.
The problem occurs both on alpha-2 and on local network, no difference discerned (other than latency ofc).

- What application or program are you running?
SAFEExamples.Notebook, referencing dlls from SAFE.EventStore and SAFE.DotNet (which is a C# wrapper around safe_app.dll based on SafeMessages example app).
I compiled safe_app and system_uri from alpha_2 branch.
SAFE.DotNet.Auth also based on SafeMessages example app, is included in the Notebook example app.

- What’s the version number?
N/A

- What are the version numbers of the supporting dependencies that you are using locally? For example, Node.js, NPM, and Rust.
VS2017 Community, dot net core 2.0

- Which operating system are you using and which version number?

Windows 10 Education in a Hyper-V VM (Microsoft Windows NT 6.2.9200.0)

- What steps did you follow when the error occurred and how can we reproduce it?

So, when debugging and performing a repeatable set of operations, I get a crash. Sometimes the debugger indicates that the error comes from unmanaged code.
Since I have turned on breaking on all possible exceptions and still cannot catch it, that seems likely (I almost never see this when not dealing with unmanaged code; exceptions are almost always caught).

In EventViewer I have information like this:

Fault bucket 1292424819593853224, type 4
Event Name: APPCRASH
Response: Not available
Cab Id: 0

Problem signature:
P1: dotnet.exe
P2: 2.0.25816.2
P3: 59e535ea
P4: StackHash_afc1
P5: 0.0.0.0
P6: 00000000
P7: c0000005
P8: PCH_D3_FROM_**safe_app**+0x000000000008C363
P9: 
P10: 

The crash is very predictable, but more with the writes than the reads. When I run around 8 (x2) write operations in a row it crashes. It takes more read operations.

The write operations look like this:
First creating an MD with a couple of entries, and an IMD, then inserting to the MD + 1 new IMD per iteration.
The reads just fetch this data.

All of this, just to clarify, is same on local network as well as alpha-2 network.

I did a series of tests to try to find a pattern:
(For the time being you can ignore the InvalidNameHash error, since it has something to do with the local network, and is not relevant for this problem. It is also a problem I would like solved, but I’m trying to dilute my addition to support-requests here as much as possible, so another time :slight_smile: )

Results of iterations of GETs
  • 34 gets then dotnet crash
  • InvalidNameHash (5 s later)
  • 34 gets then UnhandledException (NullReferenceException) with no stack trace (Your app has entered a break state, but no code is currently executing that is supported by the selected debug engine (e.g. only native runtime code is executing).)
    Thread being at [Managed to Native Transition] (5 s later)
  • InvalidNameHash (~ 2 min later)
  • InvalidNameHash (5 s later)
  • InvalidNameHash (5 s later)
  • InvalidNameHash (5 s later)
  • InvalidNameHash (5 s later)
  • 145 gets then dotnet crash (5 s later)
  • 32 gets then dotnet crash (45 s later)
  • InvalidNameHash (10 s later)
  • InvalidNameHash (5 s later)
  • InvalidNameHash (5 s later)
  • 102 gets then UnhandledException (NullReferenceException) with no stack trace (5 s later)
  • 65 gets then dotnet crash (45 s later)
  • InvalidNameHash (5 s later)
  • InvalidNameHash (5 s later)
  • 30 gets then dotnet crash (10 s later)
  • InvalidNameHash (5 s later)
  • InvalidNameHash (5 s later)
  • InvalidNameHash (5 s later)
  • 29 gets then UnhandledException (NullReferenceException) with no stack trace (5 s later)
  • InvalidNameHash (15 s later)
  • InvalidNameHash (5 s later)
  • InvalidNameHash (5 s later)
  • InvalidNameHash (5 s later)
  • InvalidNameHash (5 s later)
  • InvalidNameHash (5 s later)
  • InvalidNameHash (5 s later)
  • InvalidNameHash (5 s later)
  • InvalidNameHash (5 s later)
  • InvalidNameHash (5 s later)
  • InvalidNameHash (5 s later)
  • InvalidNameHash (5 s later)
  • InvalidNameHash (5 s later)
  • InvalidNameHash (45 s later)
  • 60 gets then (Your app has entered a break state, but no code is currently executing that is supported by the selected debug engine.) One thread at IDataSelfEncryptorReaderFreeNative (10 s later)
  • 59 gets then dotnet crash (45 s later)
  • InvalidNameHash (15 s later)
  • InvalidNameHash (5 s later)
  • 25 gets then dotnet crash (5 min later)
  • InvalidNameHash (5 s later)
  • 27 gets then dotnet crash (5 s later)
  • InvalidNameHash (15 s later)
  • InvalidNameHash (5 s later)
  • InvalidNameHash (5 s later)
  • InvalidNameHash (5 s later)
  • InvalidNameHash (5 s later)
  • InvalidNameHash (5 s later)
  • InvalidNameHash (5 s later)
  • InvalidNameHash (5 s later)
  • InvalidNameHash (5 s later)
  • InvalidNameHash (5 s later)
  • InvalidNameHash (5 s later)
  • InvalidNameHash (5 s later)
  • InvalidNameHash (5 s later)
  • InvalidNameHash (5 s later)
  • InvalidNameHash (5 s later)
  • InvalidNameHash (5 s later)
  • InvalidNameHash (5 s later)
  • 26 gets then dotnet crash (5 s later)
  • InvalidNameHash (15 s later)
  • InvalidNameHash (5 s later)
  • InvalidNameHash (5 s later)
  • InvalidNameHash (5 s later)
  • InvalidNameHash (5 s later)
  • InvalidNameHash (5 s later)
  • InvalidNameHash (5 s later)
  • InvalidNameHash (5 s later)
  • InvalidNameHash (5 s later)
  • 55 gets then dotnet crash (15 s later)

The info from the debugger, and the fact that I cannot actually catch the exceptions, all suggest that it stems from the unmanaged code, and this is all supported by the error event in EventViewer.

It doesn’t matter if I iterate with no delay or put in a ~1 minute delay between iterations.
However, doing a few, then waiting 5-10 minutes or so, will allow a bit more than expected, but then it inevitably crashes anyway.

Ok so, just throwing this out there, hoping that maybe someone has a Eureka moment. :slight_smile:


#2

@bochaco I was thinking that he should be able to put a log.toml in the same directory as safe_app.dll to log the native libs. Is that correct?

@oetyng You can obtain a log.toml from the releases here. I suppose then that it should be placed in the same directory as whatever is calling your library which depends on safe_app.dll.

Just my initial thought while I continue to understand and think about the problem.

cc: @nbaksalyar


#3

You actually need the log.toml file to be at the same location as where the app’s executable is, regardless of where the safe_app.dll is located.
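
Since the crash report above lists dotnet.exe as the faulting process, "where the app’s executable is" can be ambiguous for a .NET Core app. A minimal sketch, assuming nothing about how the native library resolves the file, that just lists where a log.toml is actually present. The class and method names are hypothetical; only standard .NET APIs are used:

```csharp
using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.IO;

// Hypothetical helper: reports which candidate directories contain log.toml.
// It does not call safe_app.dll and does not know which directory the
// native logger actually searches.
static class LogConfigLocator
{
    public static IEnumerable<string> CandidateDirectories()
    {
        // Directory of the managed app's assemblies (e.g. bin\Debug\...).
        yield return AppContext.BaseDirectory;
        // Directory of the host process executable (dotnet.exe when run via "dotnet app.dll").
        yield return Path.GetDirectoryName(Process.GetCurrentProcess().MainModule?.FileName);
        // Current working directory.
        yield return Directory.GetCurrentDirectory();
    }

    public static List<string> FindConfig(IEnumerable<string> directories, string fileName = "log.toml")
    {
        var found = new List<string>();
        foreach (var dir in directories)
        {
            if (string.IsNullOrEmpty(dir)) continue;
            var path = Path.Combine(dir, fileName);
            if (File.Exists(path)) found.Add(path);
        }
        return found;
    }

    public static void Main()
    {
        foreach (var dir in CandidateDirectories())
            Console.WriteLine("candidate: " + dir);
        foreach (var path in FindConfig(CandidateDirectories()))
            Console.WriteLine("found: " + path);
    }
}
```

Running this at startup shows at a glance whether the file sits next to the host executable, next to the app’s own dlls, or in neither place.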


#4

Just a couple of questions to try to understand it better:

By a local network you mean safe_vault instances that you run on your local machine, right?
Have you tried it with mock-routing build of safe_app, does the problem still persist?

And the issue can be reproduced by running the example repo that you’ve linked (SAFEExamples.Notebook), correct?


#5

Thanks a lot @hunterlester, @bochaco and @nbaksalyar for picking it up so fast!

Yes, that’s right, by running safe_vault instances on my local machine. I followed the instructions here: How to run a local test network

I did not try it with mock-routing build of safe_app yet though. I guess that would be a good way to exclude some possible causes.

That’s correct.

EDIT: in SAFEExamples.Notebook, you need to reference SAFE.EventStore which in turn uses SAFE.DotNet.
So these two repos should be cloned, and then you could either include the referenced projects into the Notebook solution, or build them and reference the output dlls.


#6

I have been away for a week.
The log.toml only says this:

[appenders.async_console]
kind = "async_console"
pattern = "{({l}):1.1} {d(%y-%m-%d %H:%M:%S%.6f)} [{M} #FS#{f}#FE#:{L}] {m}\n"

[[appenders.async_console.filters]]
kind = "threshold"
level = "trace"

[appenders.async_file]
kind = "async_file"
output_file_name = "Client.log"
pattern = "{({l}):1.1} {d(%y-%m-%d %H:%M:%S%.6f)} [{M} #FS#{f}#FE#:{L}] {m}\n"
append = false
file_timestamp = true

[root]
level = "error"
appenders = ["async_console", "async_file"]

[loggers."crust"]
level = "debug"

[loggers."routing"]
level = "trace"

[loggers."routing_stats"]
level = "trace"

[loggers."safe_core"]
level = "trace"

[loggers."safe_app"]
level = "trace"

[loggers."safe_authenticator"]
level = "trace"

[loggers."ffi_utils"]
level = "trace"

#7

Place log.toml in the directory containing the program that indirectly executes the native libraries.

For example, in SAFE Browser, there’s a log.toml in the same directory as the application executable.
See (screenshot: browser_files).

Log output file being there as well: C:\Users\guilf\Desktop\safe-browser-mock-v0.8.1-win-x64\safe-browser-mock-v0.8.1-win-x64\SAFE_App_Browser_Plugin.MaidSafe.net_Ltd.log.


#8

Something is not right, because I don’t get anything other than what I pasted above.
I placed it in the debug folder (where all files are copied before being executed), and it already existed in two other places: the main folder of the solution, and Documents, where FileOps.cs (a class copied from SafeMessages) places it.

I can see though that I have this:

NativeBindings.AppInitLogging(null, cb2);

and first parameter name is actually fileName, so I wonder if it really should be null…

On the other hand, this is also called:

NativeBindings.AppSetAdditionalSearchPath(fileOps.ConfigFilesPath, cb1);
… and this is the path to which the log.toml is copied in FileOps.cs.

Not sure what can be wrong here.


#9

oh wait… I had commented out InitLogging… however, logs were written to the console but not to the file.
Amazingly though, I was able to go way past the 8x2 writes that have been an almost 100% sure block (the crash in OP occurred here)! I actually got to 17x2 writes!

And additionally I was able to catch an InvalidCastException!
(screenshot: invalidcastexception)

This was an interesting turn of events… OK, I will be trying some more here.

EDIT:

OK, so… I’ll rewind a bit.

I actually had a memory leak when implementing the reading from Immutable data, based on the documentation here: http://docs.maidsafe.net/beaker-plugin-safe-app/#windowsafeimmutabledataread

So I added

await IData.SelfEncryptorReaderFreeAsync(seReaderHandle);

which probably goes without saying that it should be there…

Anyway, this is where the exception above happened.
And the entire method looks like this:

async Task<EventData> GetEventDataFromAddress(StoredEvent stored)
{
    var seReaderHandle = await IData.FetchSelfEncryptorAsync(stored.DataMapAddress);
    var len = await IData.SizeAsync(seReaderHandle);
    var readData = await IData.ReadFromSelfEncryptorAsync(seReaderHandle, 0, len);

    var eventData = new EventData(readData.ToArray(),
        stored.MetaData.CorrelationId,
        stored.MetaData.CausationId,
        stored.MetaData.EventClrType,
        stored.MetaData.Id,
        stored.MetaData.Name,
        stored.MetaData.SequenceNumber,
        stored.MetaData.TimeStamp);

    await IData.SelfEncryptorReaderFreeAsync(seReaderHandle);

    return eventData;
}

I just want to point out though, that I suspect (without being sure of course) that this is not the same problem as the OP is describing. But for the sake of the full picture I’m adding it here anyway.
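
As a side note on the leak itself: in the method above the reader is only freed if every preceding await succeeds. A minimal sketch of freeing it in a finally block, so it is released even when the read throws. This does not explain the InvalidCastException; ReaderScope, UseAsync and the ulong handle type are hypothetical stand-ins, and the delegates take the place of the IData calls:

```csharp
using System;
using System.Threading.Tasks;

// Hypothetical helper: acquires a handle, uses it, and always frees it.
static class ReaderScope
{
    public static async Task<T> UseAsync<T>(
        Func<Task<ulong>> fetch,      // stands in for IData.FetchSelfEncryptorAsync
        Func<ulong, Task<T>> use,     // stands in for the size + read calls
        Func<ulong, Task> free)       // stands in for IData.SelfEncryptorReaderFreeAsync
    {
        var handle = await fetch();
        try
        {
            return await use(handle);
        }
        finally
        {
            // Runs on success and on failure (await in finally needs C# 6 or later).
            await free(handle);
        }
    }

    public static void Main()
    {
        var freed = 0;
        try
        {
            UseAsync<int>(
                () => Task.FromResult(42UL),
                h => Task.FromException<int>(new InvalidOperationException("read failed")),
                h => { freed++; return Task.CompletedTask; }).GetAwaiter().GetResult();
        }
        catch (InvalidOperationException)
        {
            Console.WriteLine("read failed, freed = " + freed); // prints "read failed, freed = 1"
        }
    }
}
```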

EDIT2:
So, I got to 17x2 writes again, with exactly the same exception. Now just battling the connection to the local network so that I can try it out some more times… brb.

EDIT3:

Ok, so… Having 2 libs (SAFE.DotNet and SAFE.DotNet.Auth) initiating logging and copying to same location via FileOps was causing problems with concurrent access to the file, that’s why I had it commented out in SAFE.DotNet. However, I could not connect to local network when not initiating logging in the Auth project… I don’t understand the correlation here and it seems strange to me that the logging would impact connectivity.
So I just let these two projects copy their log.toml to separate directories.

And now I could connect, and confirm that it is exactly the same amount of writes that gives this InvalidCastException every time (17x2).
The good news is that the hardly traceable dotnet crash OP is about has not occurred after turning on the logging (again, strange correlation…?).

I have the logs from the console now, about 950 000 loc more than allowed to post here though. How shall I send it over?

EDIT4:
There it came again though… the crash described in OP. After 8x2 writes. So, it seems to still be a problem.

With regards to IDataSelfEncryptorReaderFreeNative error:
I noticed that after 16x2 writes, this part of the code:

IDataSelfEncryptorReaderFreeCb callback = (_, result) => {
    if (result.ErrorCode != 0)
    {
        tcs.SetException(result.ToException());
        return;
    }

    tcs.SetResult(null);
};

has errorcode -1010, Invalid Self Encryptor handle

But when trying to get there again, I now always get the error after 8x2 writes. So I seem to be back at square one again.

EDIT5:

Ok, I have found a pattern…
When I create multiple streams in same db, the OP error seems to show.

When creating new db, I seem to be able to get past that error, and instead I get
Invalid Self Encryptor handle after about 9x2 writes, and never beyond 17x2 writes.

Huh… OK. picture is getting clearer. I am beginning to suspect some MD size problem maybe. It shouldn’t be though…

EDIT6:
My theory didn’t hold. I thought that within same db, I could only create maybe 2-3 streams with 17x2 writes, and after that all new streams would make the app crash after 8x2 writes.
In this trial I got the expected crash after 4th stream, but then I could do 17x2 writes again on the 5th stream, and 20x2 on the 10th stream.

Results: in one and the same db, before getting the InvalidCastException
Unable to cast object of type 'SAFE.DotNET.Native.IDataFetchSelfEncryptorCb' to type 'SAFE.DotNET.Native.IDataSelfEncryptorReaderFreeCb'.

1st stream 17x2 writes, InvalidCastException
2nd stream 17x2 writes, InvalidCastException
3rd stream 22x2 writes, InvalidCastException

then on 4th stream, I can only do 9x2 writes, before getting the crash without any exception caught.

5th, 21x2 writes InvalidCastException
6th, 9x2 writes crash without any exception caught.
7th, 9x2 writes crash without any exception caught.
8th, 8x2 writes, InvalidCastException
9th, 8x2 writes crash without any exception caught.
10th, after 15x2 writes: a whole lot of D 18-01-27 19:18:39.319942 [safe_app::ffi::immutable_data immutable_data.rs:365] **ERRNO: -1010** InvalidSelfEncryptorHandle
then after total 20x2 writes, the InvalidCastException

This all seems semi-nondeterministic, making it hard to find a pattern which would indicate what is going wrong.


#10

I created a new topic for the self encryptor callback cast issue, to try to keep efforts and attention on solving the problem of OP. (I am btw still not closer to knowing what is going on there.)


#11

Okay, this whole thing needs a new approach, this is getting nowhere.

I started with the first write operation that is made: create a db.

Iterating with a 100 ms delay: after 51 iterations (always the same count) the AppSession destructor is called (I do not know why yet), whereby Session.AppPtr is set to IntPtr.Zero, which is set up to cause an exception to be thrown on subsequent access to this property. Meanwhile, in a parallel thread, a NativeHandle somewhere is being disposed, and when it calls native code to free itself, there is an access to Session.AppPtr. Boom…

To be continued. I suspect I will find quite a few things with this approach, before getting to the real problem of OP.

(This is more or less the example code - copy paste from SafeMessages app - only being tested now, or equivalent flow)

Questions so far:

  1. What is it that happens after 51 iterations of creating 2 MDs with write permissions inserted?
  2. Is there a general design flaw that we can reset Session.AppPtr, before all current usage of it has ended?

Answers so far:

  1. I have no idea.
  2. I suspect so, and will try to redesign it.

Here follows some results, one section per operation type, with some input variations for each session.

Create db iteration results

Delay: 100 ms
Iterations: 51
Event: AppSession destructor is called, thereby resetting Session.AppPtr before the last use of it, causing an ArgumentNullException to be thrown when a subsequent NativeHandle destructor is called.

Delay: 200 ms, 1 000 ms
Iterations: 51
Event: Crash, with msg The program ‘[4404] dotnet.exe’ has exited with code -1073741819 (0xc0000005) ‘Access violation’.

Delay: 5 000 ms
Iterations: 51
Event: An unhandled exception of type ‘System.NullReferenceException’ occurred in Unknown Module. Object reference not set to an instance of an object.
(No stacktrace, i.e. suspected origin: native code. Same as observed when running SAFEExamples.Notebook after 17x2 or so writes!)

Delay: 10 000 ms
Iterations: 51
Event: AppSession destructor is called, thereby resetting Session.AppPtr before the last use of it, causing an ArgumentNullException to be thrown when a subsequent NativeHandle destructor is called.

Thoughts:
Progress! Already at this early (relatively, from my high abstraction point of view) stage of code depth (a lot more isolated than running all layers of abstraction) we are seeing similar problems!
Increased delay between iterations does not ameliorate the problem of errors showing up, but it does affect which kind of error we see… Which is not making sense to me currently. But I guess my initial theory of some corrupt memory somewhere is still a good candidate considering these seemingly uncorrelated and dispersed error types.

EDIT:
Regarding question #2
This is just a very small detail of higher level implementation. Temporary solution:
I made a static reference to the AppPtr that is not reset when the AppSession destructor is called.
Probably not especially important, but if we want to be able to free all currently used NativeHandles, even after an AppSession has been disposed, then it needs some other approach. So, that would be something to think about for people using the code (or more probable: other examples not yet produced) for reference later.
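
One possible "other approach", sketched here under assumptions: .NET’s SafeHandle keeps an internal reference count, so a child handle can hold a reference on the app handle and the app’s release is deferred until the last child is gone. AppHandle, ChildHandle and the log list are hypothetical names for illustration; no safe_app.dll functions are called, and the comments mark where a real wrapper would call its native free functions:

```csharp
using System;
using System.Collections.Generic;
using System.Runtime.InteropServices;

// Hypothetical wrapper for the app pointer.
sealed class AppHandle : SafeHandle
{
    public static readonly List<string> Log = new List<string>();

    public AppHandle(IntPtr app) : base(IntPtr.Zero, true)
    {
        SetHandle(app);
    }

    public override bool IsInvalid => handle == IntPtr.Zero;

    protected override bool ReleaseHandle()
    {
        // A real wrapper would free the native app object here.
        Log.Add("app freed");
        return true;
    }
}

// Hypothetical wrapper for any native object that needs the app pointer to be freed.
sealed class ChildHandle : SafeHandle
{
    private readonly AppHandle _app;

    public ChildHandle(AppHandle app, IntPtr child) : base(IntPtr.Zero, true)
    {
        var addedRef = false;
        app.DangerousAddRef(ref addedRef); // throws if the app handle is already closed
        _app = app;
        SetHandle(child);
    }

    public override bool IsInvalid => handle == IntPtr.Zero;

    protected override bool ReleaseHandle()
    {
        // The app pointer is still unreleased here; a real wrapper would
        // pass it to the native free function for this child.
        AppHandle.Log.Add("child freed");
        _app.DangerousRelease(); // may trigger the deferred app release
        return true;
    }
}

static class Demo
{
    public static List<string> Run()
    {
        AppHandle.Log.Clear();
        var app = new AppHandle(new IntPtr(1));
        var child = new ChildHandle(app, new IntPtr(2));
        app.Dispose();   // release is deferred: the child still holds a reference
        child.Dispose(); // frees the child, then the app
        return new List<string>(AppHandle.Log);
    }

    public static void Main()
    {
        Console.WriteLine(string.Join(", ", Run())); // prints "child freed, app freed"
    }
}
```

The point of the design is ordering: disposing the session first no longer invalidates the pointer that outstanding handles need in order to free themselves.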

So, when removing that noise, I’m back at the inexplicable app crash. The debugging session just ends, and in EventViewer all we can see is that dotnet.exe crashed.

Back to the OP question:
Now, I’m getting closer to a question that is maybe possible to answer for someone out there.

I have boiled it down to a very limited set of operations
    // Creates db with address to category MD
    public async Task CreateDbAsync(string databaseId)
    {
        databaseId = DbIdForProtocol(databaseId);

        if (databaseId.Contains(".") || databaseId.Contains("@"))
            throw new NotSupportedException("Unsupported characters '.' and '@'.");

        // Check if account exists first and return error
        var dstPubIdDigest = await GetMdXorName(databaseId);
        using (var dstPubIdMDataInfoH = await MDataInfo.NewPublicAsync(dstPubIdDigest, 15001))
        {
            var accountExists = false;
            try
            {
                var keysH = await MData.ListKeysAsync(dstPubIdMDataInfoH);
                keysH.Dispose();
                accountExists = true;
            }
            catch (Exception)
            {
                // ignored - acct not found
            }
            if (accountExists)
            {
                throw new Exception("Id already exists.");
            }
        }

        // Create Self Permissions
        using (var categorySelfPermSetH = await MDataPermissionSet.NewAsync())
        {
            await Task.WhenAll(
                MDataPermissionSet.AllowAsync(categorySelfPermSetH, MDataAction.kInsert),
                MDataPermissionSet.AllowAsync(categorySelfPermSetH, MDataAction.kUpdate),
                MDataPermissionSet.AllowAsync(categorySelfPermSetH, MDataAction.kDelete),
                MDataPermissionSet.AllowAsync(categorySelfPermSetH, MDataAction.kManagePermissions));

            using (var streamTypesPermH = await MDataPermissions.NewAsync())
            {
                using (var appSignPkH = await Crypto.AppPubSignKeyAsync())
                {
                    await MDataPermissions.InsertAsync(streamTypesPermH, appSignPkH, categorySelfPermSetH);
                }

                // Create Md for holding categories
                var categoriesMDataInfoH = await MDataInfo.RandomPrivateAsync(15001);
                await MData.PutAsync(categoriesMDataInfoH, streamTypesPermH, NativeHandle.Zero);

                var serializedCategoriesMdInfo = await MDataInfo.SerialiseAsync(categoriesMDataInfoH);

                // Finally update App Container (store db info to it)
                var database = new Database
                {
                    DbId = databaseId,
                    Categories = new DataArray { Type = "Buffer", Data = serializedCategoriesMdInfo }, // Points to Md holding stream types                                                                                    
                };

                var serializedDb = JsonConvert.SerializeObject(database);
                using (var appContH = await AccessContainer.GetMDataInfoAsync(AppContainerPath)) // appContainerHandle
                {
                    var dbIdCipherBytes = await MDataInfo.EncryptEntryKeyAsync(appContH, database.DbId.ToUtfBytes());
                    var dbCipherBytes = await MDataInfo.EncryptEntryValueAsync(appContH, serializedDb.ToUtfBytes());
                    using (var appContEntryActionsH = await MDataEntryActions.NewAsync())
                    {
                        await MDataEntryActions.InsertAsync(appContEntryActionsH, dbIdCipherBytes, dbCipherBytes);
                        await MData.MutateEntriesAsync(appContH, appContEntryActionsH);
                    }
                }
            }
        }
    }

This is very similar to the SafeMessages example MaidSafe have here: https://github.com/maidsafe/safe_mobile

Iterating 51 times over the above code will make the app crash with no signs of why (that I could find).

Below, I will start with the first interactions with safe_app.dll, and see how many iterations I can go before breakdown. Then I will add another interaction, and see how many iterations we can go before errors, until we have added all interactions with safe_app.dll that we see in the code block above.
By doing this I would like to find the place where something goes wrong, and if not that: get more clues.
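
The counting itself can be sketched as a small harness. This is an illustration under assumptions, not the code used for the numbers below; IterationProbe and CountUntilFailureAsync are hypothetical names, and the operation passed in would be whichever subset of the interactions is being tested:

```csharp
using System;
using System.Threading.Tasks;

// Hypothetical harness: runs an operation repeatedly and reports how many
// iterations completed before the first managed exception.
static class IterationProbe
{
    public static async Task<int> CountUntilFailureAsync(
        Func<int, Task> operation, int maxIterations, TimeSpan delay)
    {
        for (var i = 0; i < maxIterations; i++)
        {
            // Written before the call, so the last line printed still shows
            // the iteration number if the process dies in native code.
            Console.Error.WriteLine("iteration " + (i + 1));
            try
            {
                await operation(i);
            }
            catch (Exception ex)
            {
                Console.Error.WriteLine("failed: " + ex.GetType().Name);
                return i; // number of iterations that completed
            }
            await Task.Delay(delay);
        }
        return maxIterations;
    }

    public static void Main()
    {
        // Stand-in operation that fails on its 4th call (index 3).
        var completed = CountUntilFailureAsync(
            i => i == 3
                ? Task.FromException(new InvalidOperationException("boom"))
                : Task.CompletedTask,
            10, TimeSpan.FromMilliseconds(1)).GetAwaiter().GetResult();
        Console.WriteLine(completed); // prints 3
    }
}
```

A hard crash in native code cannot be caught this way, which is why the iteration number is written out before each call rather than after.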

Each set of iterations ends with one of the following errors (it is not deterministic which error shows up for which operations):

  • AppSession destructor is called for some unknown reason. No logs in EventViewer.
  • NullReferenceException occurred in Unknown Module, without stacktrace. No logs in EventViewer.
  • ExecutionEngineException occurred in Unknown Module, without stacktrace. No logs in EventViewer.
  • Crashes in the same inexplicable way as described in OP. Error message The program '[16784] dotnet.exe' has exited with code -1073741819 (0xc0000005) 'Access violation'. and with the following logs in EventViewer:
Fault bucket 2253332025786264427, type 5
Event Name: BEX64
Response: Not available
Cab Id: 0

Problem signature:
P1: dotnet.exe
P2: 2.0.26021.1
P3: 5a3b026e
P4: StackHash_03ab
P5: 0.0.0.0
P6: 00000000
P7: PCH_B2_FROM_safe_app+0x0000000000584D45
P8: c0000005
P9: 0000000000000008
P10:

(this sounds to me like some memory access violation in safe_app.dll)
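
For what it’s worth, the signed exit code and the hex value in the report are the same 32-bit number: 0xC0000005 is the Windows NTSTATUS code STATUS_ACCESS_VIOLATION. A small sketch of the conversion (ExitCodes and ToNtStatusHex are hypothetical names for illustration):

```csharp
using System;

static class ExitCodes
{
    // Reinterprets the signed 32-bit exit code as unsigned and formats it as hex.
    public static string ToNtStatusHex(int exitCode)
    {
        return "0x" + unchecked((uint)exitCode).ToString("X8");
    }

    public static void Main()
    {
        Console.WriteLine(ToNtStatusHex(-1073741819)); // prints 0xC0000005
    }
}
```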


Starting with the very first interaction:

Iterating over Sha3HashAsync
    async Task<List<byte>> GetMdXorName(string plainTextId)
    {
        return await NativeUtils.Sha3HashAsync(plainTextId.ToUtfBytes());
    }

Is unproblematic for ~1285 iterations, with 1 ms delay.

Next, add: MDataInfo.NewPublicAsync
        databaseId = DbIdForProtocol(databaseId);

        var dstPubIdDigest = await GetMdXorName(databaseId);
        using (var dstPubIdMDataInfoH = await MDataInfo.NewPublicAsync(dstPubIdDigest, 15001))
        {
            // no action here
        }

Is unproblematic for ~ 730 iterations, with 1 ms and 100 ms delay.

Next, add: MData.ListKeysAsync
        databaseId = DbIdForProtocol(databaseId);

        var dstPubIdDigest = await GetMdXorName(databaseId);
        using (var dstPubIdMDataInfoH = await MDataInfo.NewPublicAsync(dstPubIdDigest, 15001))
        {
            var accountExists = false;
            try
            {
                var keysH = await MData.ListKeysAsync(dstPubIdMDataInfoH);
                keysH.Dispose();
                accountExists = true;
            }
            catch (Exception)
            {
                // ignored - acct not found
            }
            if (accountExists)
            {
                throw new Exception("Id already exists.");
            }
        }

Is unproblematic for ~540 iterations, with 1 ms and 100 ms delay.

Next, add: MDataPermissionSet.NewAsync
        using (var categorySelfPermSetH = await MDataPermissionSet.NewAsync())
        {
           
        }

~440 iterations

Next, add: 4 x MDataPermissionSet.AllowAsync
        using (var categorySelfPermSetH = await MDataPermissionSet.NewAsync())
        {
            await Task.WhenAll(
                MDataPermissionSet.AllowAsync(categorySelfPermSetH, MDataAction.kInsert),
                MDataPermissionSet.AllowAsync(categorySelfPermSetH, MDataAction.kUpdate),
                MDataPermissionSet.AllowAsync(categorySelfPermSetH, MDataAction.kDelete),
                MDataPermissionSet.AllowAsync(categorySelfPermSetH, MDataAction.kManagePermissions));
        }

~410 iterations

(not yet committed state to the network.)

Next, add: MDataPermissions.NewAsync
        using (var streamTypesPermH = await MDataPermissions.NewAsync())
        {
            // no action
        }

~375 iterations

Next: Crypto.AppPubSignKeyAsync
        using (var streamTypesPermH = await MDataPermissions.NewAsync())
        {
            using (var appSignPkH = await Crypto.AppPubSignKeyAsync())
            {
                // no action
            }
        }

~350 iterations

Next: MDataPermissions.InsertAsync
        using (var streamTypesPermH = await MDataPermissions.NewAsync())
        {
            using (var appSignPkH = await Crypto.AppPubSignKeyAsync())
            {
                await MDataPermissions.InsertAsync(streamTypesPermH, appSignPkH, categorySelfPermSetH);
            }
        }

~335 iterations

and so on, for each additional interaction, ie:

#1: 1285 x Interaction_1
#2: 721 x #1 + Interaction_2
#3: 537 x #2 + Interaction_3
#4: 438 x #3 + Interaction_4
#5: 377 x #4 + Interaction_5
#6: 347 x #5 + Interaction_6
#7: 323 x #6 + Interaction_7
#8: 310 x #7 + Interaction_8
#9: 296 x #8 + Interaction_9
#10: 285 x #9 + Interaction_10
#11: 224 x #10 + Interaction_11
#12: 148 x #11 + Interaction_12
#13: 133 x #12 + Interaction_13
#14: 66 x #13 + Interaction_14
#15: 66 x #14 + Interaction_15
#16: 51 x #15 + Interaction_16
#17: 50 x #16 + Interaction_17

Memory footprint is low: ~70 MB.

The pattern is getting clear: every interaction we have with safe_app.dll contributes to what eventually leads to the crash. The more interactions per iteration, the sooner we crash.

And this code is almost 100% copy paste from the MaidSafe examples.
I’m getting more confident this is not some error on my part (ofc not convinced yet though).

Is anyone reading this btw?

I will try running this in .Net Framework instead of .Net Core, to rule out that part.

Currently, a few possible sources for errors I can see on my end are:

  1. Some overlooked implementation in setup of reproduction.
  2. Some subtle mistake in the copy paste of code from MaidSafe example.
  3. Some .Net Core specific error.
  4. Some problem with debugger. (does not seem to be, experience the same when running without it)
  5. Conflicts with other applications (very clean machine though, installed and running almost nothing else)
  6. Some problem with my OS/VM.

Other possible sources:

  1. The vaults running locally
  2. Order of execution of calls to native code.
  3. The native code itself.

@nbaksalyar I have updated the Notebook repo with a unit test for reproducing this. I hope it can be useful if you have time to look at it. :pray: