Segmentation fault for large meshes?

Hello everyone,
I have been using MMG3D to accurately remesh 3D geometries generated with NetGen.

The tool appeared to work just fine on my MacBook for small test meshes, until I encountered a segmentation fault for larger meshes. I guessed that this came from memory management and a lack of memory on my laptop, since moving to a larger computer at my lab solved the issue. That way I could generate meshes of typically up to 60M cells. Now the problem is that I need to refine even more (a factor of 8 is to be expected, and another 8 later because this is not even the whole geometry…) and the infamous segmentation fault occurred again. I tried to launch MMG on a supercomputer, on a preprocessing partition which has a very large memory capacity, typically up to 3 TB per node. This is consistent with what MMG tells me (1523001 MB “detected”):

-- INPUT DATA
%% /gpfsscratch/rech/fls/uxp62jq/PAM/MESH_S30_1_cropped/3d_fluid.msh OPENED
MAXIMUM MEMORY AUTHORIZED (MB) 1523001
MMG3D_NPMAX 9032172
MMG3D_NTMAX 2000000
MMG3D_NEMAX 60269955

 $$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$
     BAD ORIENTATION : vol < 0 -- 40179970 element(s) reoriented
 $$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$

 NUMBER OF VERTICES        6021448
 NUMBER OF TETRAHEDRA     40179970
 NUMBER OF TRIANGLES        925928

-- DATA READING COMPLETED. 45.264s

[note] : The bad orientation message does not seem to be problematic and just comes from NetGen’s inability to write GMSH2 files in the format MMG expects. I do not believe this is the origin of the problem, but who knows…

The interesting thing I have noticed is that on my lab’s computer, MMG3D_NEMAX is actually larger even though the detected memory is much lower (typically 96000 MB). So I do not see any correlation between the machine I use, and thus its memory capacity, and my ability to process larger meshes. The -m option does not really help. From what I have tried, all it does is increase or decrease MMG3D_NEMAX a bit, but beyond some point the number passed to -m does not really matter and MMG3D_NEMAX converges towards a maximum value, which differs between my lab’s computer and the supercomputer! This is so freaking weird!

Can anyone help me with this issue? Is MMG intrinsically limited in terms of mesh size? If so, what are my alternatives?

Let me thank you in advance for your kind help,
PA M

Hello,
by default, Mmg is initialized so that it won’t try to use more than 50% of the available memory, and this limit cannot be exceeded, so the -m option is only effective for reducing the allocation size.
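The capping rule described above can be sketched in a few lines of Python. This is only an illustration of the behaviour as stated in this post, not Mmg's actual code: the function name `effective_memory_limit_mb` and its parameters are hypothetical, and the 96000 MB machine in the usage lines is just an example value.

```python
def effective_memory_limit_mb(detected_mb, requested_mb=None, cap_fraction=0.5):
    """Memory budget (in MB) under the rule described in the post.

    detected_mb  : memory reported by the system (hypothetical input)
    requested_mb : value the user would pass with -m, or None if unset
    cap_fraction : fraction of detected memory that may be used (50% per the post)
    """
    # The ceiling is a fixed share of what the system reports.
    ceiling = int(detected_mb * cap_fraction)
    if requested_mb is None or requested_mb <= 0:
        return ceiling
    # A user request can lower the budget but never raise it above the ceiling.
    return min(requested_mb, ceiling)


# Example with a hypothetical 96000 MB machine.
print(effective_memory_limit_mb(96000))                       # no -m given
print(effective_memory_limit_mb(96000, requested_mb=200000))  # request above the cap
print(effective_memory_limit_mb(96000, requested_mb=10000))   # request below the cap
```

Under this rule, any request above the ceiling gives the same budget, which would match the observation that large -m values stop having an effect.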

In HPC applications this 50% limit can be too restrictive, so for the moment you may want to directly modify this line of code and recompile:

Since the information about available memory depends on the system and is used to preallocate arrays, it makes sense that the maximum array sizes differ on different systems.

Anyway, I cannot see the reason for the behaviour you report here (I would expect the opposite). Could you please share the memory size and the array sizes you get on the two machines, so I can get a better idea of what is going on?

The segmentation fault could be due to Mmg not being able to enforce the memory limit, so that an allocation fails when there is not enough memory left, or to the number of entities exceeding the maximum integer representable by an int32 datatype. Could you please run your case with -v 6 and share the log, so that we can see where it happens?
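To get a feeling for the int32 hypothesis, here is a small Python sketch that checks whether an element count still fits in a signed 32-bit integer after repeated refinement. It uses the tetrahedra count from the log in the first message and the factor of 8 per refinement estimated there. The helper names `fits_in_int32` and `refined_count` are made up for this illustration, and the real growth of the mesh will depend on the prescribed size.

```python
INT32_MAX = 2**31 - 1  # largest value a signed 32-bit integer can hold


def fits_in_int32(count):
    """True if `count` can be stored in a signed 32-bit integer."""
    return 0 <= count <= INT32_MAX


def refined_count(n_elements, factor=8, passes=1):
    """Estimated element count after `passes` refinements.

    The factor of 8 per pass is the rough estimate from the first
    message, not a value computed by Mmg.
    """
    return n_elements * factor**passes


n_tets = 40_179_970  # NUMBER OF TETRAHEDRA from the log above

for passes in (0, 1, 2):
    n = refined_count(n_tets, passes=passes)
    print(f"{passes} pass(es): {n} tetrahedra, fits in int32: {fits_in_int32(n)}")
```

With these numbers, one refinement gives 321,439,760 tetrahedra, which still fits, while two give 2,571,518,080, which is above the int32 maximum of 2,147,483,647. Arrays indexed with several entries per element would reach the limit earlier.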

The orientation warning, on the other hand, is harmless: it is very verbose, but it just tells you that some elements have been reoriented.

Hope this helps,
Luca

Hello,
Thanks for your answer.
I am aware that the system tries to use 50% of the available memory. However, on the supercomputer I am using there is 1523001 MB available, as shown in my first message. It is 5 times more than on my lab computer, so I do not expect that changing the percentage would solve the issue. It fails at the same moment: during the optimization process. This typically occurs for the same prescribed -hsiz on any machine. Both systems are 64-bit, so theoretically there is no reason for it to be a system-related issue.
I have talked to people in a CFD-specialized laboratory who use MMG frequently. They say they had similar issues with meshes above 200M cells. So they provided me with a homemade tool that solved their problem and is better suited to my application, I believe.

Still, I think it would be interesting to investigate why some users have encountered similar issues. I will post the -v 6 output soon.

Hello,
I am also trying to investigate the problem myself on the latest version. Which version of Mmg are you using?

If you are able to run your case with the code compiled in debug mode and obtain a backtrace for the segmentation fault (for example by compiling with AddressSanitizer through the compilation and linking options -fsanitize=address -fno-omit-frame-pointer -O0, or with icc through the -traceback option), it would be helpful.

I will also get back to you as soon as I have news from my tests.

Yours,
Luca