Large models slow on M1 Mac #5412

Closed
iamaydo opened this issue Dec 9, 2020 · 17 comments

@iamaydo

iamaydo commented Dec 9, 2020

Version

PrusaSlicer
Version: 2.3.0-beta2+
Build: PrusaSlicer-2.3.0-beta2+-202012051103

Operating System: Macintosh
System Architecture: 64 bit
System Version: macOS Version 10.16 (Build 20B29)
Total RAM size [MB]: 17,180MB
OpenGL installation
GL version: 2.1 Metal - 70.12.7
Vendor: Apple
Renderer: Apple M1
GLSL version: 1.20
Installed extensions:

GL_APPLE_aux_depth_stencil
GL_APPLE_client_storage
GL_APPLE_element_array
GL_APPLE_fence
GL_APPLE_float_pixels
GL_APPLE_flush_buffer_range
GL_APPLE_flush_render
GL_APPLE_packed_pixels
GL_APPLE_pixel_buffer
GL_APPLE_rgb_422
GL_APPLE_row_bytes
GL_APPLE_specular_vector
GL_APPLE_texture_range
GL_APPLE_transform_hint
GL_APPLE_vertex_array_object
GL_APPLE_vertex_point_size
GL_APPLE_vertex_program_evaluators
GL_APPLE_ycbcr_422
GL_ARB_color_buffer_float
GL_ARB_depth_buffer_float
GL_ARB_depth_clamp
GL_ARB_depth_texture
GL_ARB_draw_buffers
GL_ARB_draw_elements_base_vertex
GL_ARB_draw_instanced
GL_ARB_fragment_program
GL_ARB_fragment_program_shadow
GL_ARB_fragment_shader
GL_ARB_framebuffer_object
GL_ARB_framebuffer_sRGB
GL_ARB_half_float_pixel
GL_ARB_half_float_vertex
GL_ARB_imaging
GL_ARB_instanced_arrays
GL_ARB_multisample
GL_ARB_multitexture
GL_ARB_occlusion_query
GL_ARB_pixel_buffer_object
GL_ARB_point_parameters
GL_ARB_point_sprite
GL_ARB_provoking_vertex
GL_ARB_seamless_cube_map
GL_ARB_shader_objects
GL_ARB_shader_texture_lod
GL_ARB_shading_language_100
GL_ARB_shadow
GL_ARB_shadow_ambient
GL_ARB_sync
GL_ARB_texture_border_clamp
GL_ARB_texture_compression
GL_ARB_texture_compression_rgtc
GL_ARB_texture_cube_map
GL_ARB_texture_env_add
GL_ARB_texture_env_combine
GL_ARB_texture_env_crossbar
GL_ARB_texture_env_dot3
GL_ARB_texture_float
GL_ARB_texture_mirrored_repeat
GL_ARB_texture_non_power_of_two
GL_ARB_texture_rectangle
GL_ARB_texture_rg
GL_ARB_transpose_matrix
GL_ARB_vertex_array_bgra
GL_ARB_vertex_blend
GL_ARB_vertex_buffer_object
GL_ARB_vertex_program
GL_ARB_vertex_shader
GL_ARB_window_pos
GL_ATI_separate_stencil
GL_ATI_texture_env_combine3
GL_ATI_texture_float
GL_EXT_abgr
GL_EXT_bgra
GL_EXT_bindable_uniform
GL_EXT_blend_color
GL_EXT_blend_equation_separate
GL_EXT_blend_func_separate
GL_EXT_blend_minmax
GL_EXT_blend_subtract
GL_EXT_clip_volume_hint
GL_EXT_debug_label
GL_EXT_debug_marker
GL_EXT_draw_buffers2
GL_EXT_draw_range_elements
GL_EXT_fog_coord
GL_EXT_framebuffer_blit
GL_EXT_framebuffer_multisample
GL_EXT_framebuffer_multisample_blit_scaled
GL_EXT_framebuffer_object
GL_EXT_framebuffer_sRGB
GL_EXT_geometry_shader4
GL_EXT_gpu_program_parameters
GL_EXT_gpu_shader4
GL_EXT_multi_draw_arrays
GL_EXT_packed_depth_stencil
GL_EXT_packed_float
GL_EXT_provoking_vertex
GL_EXT_rescale_normal
GL_EXT_secondary_color
GL_EXT_separate_specular_color
GL_EXT_shadow_funcs
GL_EXT_stencil_two_side
GL_EXT_stencil_wrap
GL_EXT_texture_array
GL_EXT_texture_compression_dxt1
GL_EXT_texture_compression_s3tc
GL_EXT_texture_env_add
GL_EXT_texture_filter_anisotropic
GL_EXT_texture_integer
GL_EXT_texture_lod_bias
GL_EXT_texture_rectangle
GL_EXT_texture_sRGB
GL_EXT_texture_sRGB_decode
GL_EXT_texture_shared_exponent
GL_EXT_timer_query
GL_EXT_transform_feedback
GL_EXT_vertex_array_bgra
GL_IBM_rasterpos_clip
GL_NV_blend_square
GL_NV_conditional_render
GL_NV_depth_clamp
GL_NV_fog_distance
GL_NV_fragment_program2
GL_NV_fragment_program_option
GL_NV_light_max_exponent
GL_NV_texgen_reflection
GL_NV_texture_barrier
GL_NV_vertex_program2_option
GL_NV_vertex_program3
GL_SGIS_generate_mipmap
GL_SGIS_texture_edge_clamp
GL_SGIS_texture_lod
GL_SGI_color_matrix

Operating system type + version

MacBook Pro 13-inch M1 - macOS 11.0.1

3D printer brand / version + firmware version (if known)

Prusa MK3S - FW 3.9.2

Behavior

  • Describe the problem
    Trying to preview large prints in Preview spikes GPU load to ~99% and makes the program either unresponsive or very slow (a few seconds) to respond when rotating around the model or selecting any menu item. Models are responsive when previewing lower layers or in the 3D editor view.

Issue is not present in PrusaSlicer 2.2.0+-202003211132 on the same laptop.

Is this a new feature request?
No

Project File (.3MF) where problem occurs

Print 15 - Interior 1 - 0.15mm_PLA_MK3.3mf.zip

@bgiot
Contributor

bgiot commented Dec 9, 2020

Did you try with macOS 11.1 beta 2? There are a lot of improvements and fixes in Rosetta 2...

@bubnikv
Collaborator

bubnikv commented Dec 9, 2020

@FidelCapo narrowed it more closely to
5ff6f30 from 202008210958 - ok
f6acd49 from 202008261352 - not ok

@bubnikv
Collaborator

bubnikv commented Dec 9, 2020

It seems there is a sharp threshold in model size. Below the threshold everything renders smoothly. Once the threshold is tripped over, the rendering is 1 frame per 3 seconds.

My personal guess is that either some of the vertex buffer is too large (memory fragmentation?) or maybe the limit of the graphics card GPU dedicated memory is reached and the GPU is swapping?

Apple may know why they are obsoleting OpenGL. Maybe their support of large vertex buffers is subpar?
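
To make the "buffer too large" guess concrete, here is a rough back-of-the-envelope sketch of how the size of a preview vertex buffer scales with the number of extrusion segments. The function name, the vertex layout and the numbers in the usage note are assumptions for illustration only; none of them are measured from PrusaSlicer.

```python
def vertex_buffer_bytes(segment_count: int,
                        vertices_per_segment: int,
                        floats_per_vertex: int,
                        bytes_per_float: int = 4) -> int:
    """Estimate the size in bytes of one vertex buffer holding all segments.

    All parameters are hypothetical: the real layout depends on how the
    renderer tessellates each extrusion segment.
    """
    return (segment_count * vertices_per_segment
            * floats_per_vertex * bytes_per_float)
```

For example, one million segments with 8 vertices each and 6 floats per vertex (position plus normal) would come to 192,000,000 bytes in a single buffer, which shows how quickly a large, detailed print could cross whatever threshold the driver has.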

@bubnikv bubnikv closed this as completed Dec 9, 2020
@bubnikv bubnikv reopened this Dec 9, 2020
@xarbit
Contributor

xarbit commented Dec 9, 2020

@bubnikv the limit of the GPU on the M1 is technically 16GB (or 8GB) as it uses the unified memory architecture.
So I don't think the GPU is the limit.

@bubnikv
Collaborator

bubnikv commented Dec 9, 2020

I did some more tests. The initial G-code preview at the end of slicing is fine. It is the new G-code preview after the full G-code is generated that seems to be the issue. PrusaSlicer 2.2.0 generated the final G-code preview differently, it did not support simulation in time, thus the vertex buffers were simpler.

@xarbit
Contributor

xarbit commented Dec 12, 2020

The performance with PrusaSlicer 2.3 beta 3 and macOS 11.1 RC1 is definitely a lot better, but it is still sluggish at 0.15 mm layer height and with large models.

@bubnikv
Collaborator

bubnikv commented Dec 12, 2020

We will try to split the vertex buffers, but we are not sure whether we will manage before Christmas. In worst case, we will release PrusaSlicer 2.3.1 in January with that particular fix. Keep your fingers crossed that our assumptions are right and that splitting the vertex buffers will help.

@kenthinson

It seems there is a sharp threshold in model size. Below the threshold everything renders smoothly. Once the threshold is tripped over, the rendering is 1 frame per 3 seconds.

My personal guess is that either some of the vertex buffer is too large (memory fragmentation?) or maybe the limit of the graphics card GPU dedicated memory is reached and the GPU is swapping?

Apple may know why they are obsoleting OpenGL. Maybe their support of large vertex buffers is subpar?

I am curious, then, why not switch to Vulkan/MoltenVK for PrusaSlicer? Sorry for my ignorance. Thanks.

@bubnikv
Collaborator

bubnikv commented Dec 13, 2020

I am curious, then, why not switch to Vulkan/MoltenVK for PrusaSlicer? Sorry for my ignorance. Thanks.

We likely will one day, but there are still a lot of old systems that don't support Vulkan, and the transition to Vulkan will be very labor-intensive and thus painful.

@treyd3

treyd3 commented Dec 17, 2020

FYI, I see the same performance issues on my old Intel Core i7 iMac (macOS 10.13, 20 GB RAM, 1 GB VRAM) as on my M1 MacBook Air. The 2.3 beta chugs when rendering the G-code, to the point where I have to force-kill the app. The same model and slicer settings in 2.2 render the G-code fine: on the iMac it's not completely smooth but certainly usable, and on the M1 MacBook Air it's nice and smooth. I know the profiles are exactly the same on the two systems because they were copied directly from the App Support folder on the iMac to the new M1. The model is https://www.thingiverse.com/download:7768268

So whatever is causing the slowdown, it doesn't appear to be CPU arch or OS version dependent.

Let me know if you want logs, etc to analyze and I'll be happy to supply.

edit:
Is it possible that adding the ability to progress through the line segments on each layer has affected the performance?
Probably not a big deal on small, simple models but potentially crippling on larger, very detailed ones or when using high-percentage gyroid infill and the like.

edit2:
Same behavior (unusable) using beta3 as with other 2.3 builds when slicing the model referenced above. At ~250 MB that is a huge model, but again the final 2.2 is able to slice, render, then pan, zoom, traverse layers, etc. on both Intel and Apple Silicon (AS) platforms. AS is even more responsive than my i7. I don't think the G-code rendering performance issues with huge models are CPU-platform related.
How much work would it be to build a load with rendering segments / layer turned off?

@xarbit
Contributor

xarbit commented Dec 18, 2020

@treyd3 yes, I experienced the same performance regressions on Intel and M1 Macs. I can confirm that the issue is not arch dependent and pointed that out as well.

It happens not only with detailed and large objects but also when slicing multiple small objects at once.

@bjeurissen

bjeurissen commented Feb 3, 2021

Are you using geometry shaders by any chance? In our application, this is what is causing slow performance and crashes on Apple M1 (MRtrix3/mrtrix3#2247)

@bubnikv
Collaborator

bubnikv commented Feb 3, 2021 via email

@xarbit
Contributor

xarbit commented Feb 3, 2021

@bubnikv very cool.. you mind pointing to the commit where this was solved?

@bubnikv
Collaborator

bubnikv commented Feb 4, 2021

It was a lengthy refactoring to split the vertex / index buffers into multiple smaller pieces.
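
For readers unfamiliar with the idea, the following is a minimal sketch of what splitting one large indexed buffer into smaller pieces can look like. It is not PrusaSlicer's actual implementation: the function name `split_mesh`, the `max_vertices` parameter and the tuple-based data layout are all hypothetical, and a real renderer would do this on GPU buffers rather than Python lists.

```python
from typing import Dict, List, Tuple

Vertex = Tuple[float, float, float]
Triangle = Tuple[int, int, int]
Chunk = Tuple[List[Vertex], List[Triangle]]


def split_mesh(vertices: List[Vertex],
               triangles: List[Triangle],
               max_vertices: int) -> List[Chunk]:
    """Split an indexed triangle list into chunks of at most max_vertices.

    Each chunk carries its own vertex list, and its triangles are
    re-indexed into that list. Vertices used by triangles in different
    chunks are duplicated.
    """
    if max_vertices < 3:
        raise ValueError("max_vertices must be at least 3")

    chunks: List[Chunk] = []
    chunk_vertices: List[Vertex] = []
    chunk_triangles: List[Triangle] = []
    remap: Dict[int, int] = {}  # global vertex index -> index in current chunk

    for tri in triangles:
        # Vertices of this triangle that the current chunk does not hold yet.
        new = [i for i in dict.fromkeys(tri) if i not in remap]
        if len(chunk_vertices) + len(new) > max_vertices:
            # The current chunk would overflow: close it and start a new one.
            chunks.append((chunk_vertices, chunk_triangles))
            chunk_vertices, chunk_triangles, remap = [], [], {}
            new = list(dict.fromkeys(tri))
        for i in new:
            remap[i] = len(chunk_vertices)
            chunk_vertices.append(vertices[i])
        chunk_triangles.append((remap[tri[0]], remap[tri[1]], remap[tri[2]]))

    if chunk_triangles:
        chunks.append((chunk_vertices, chunk_triangles))
    return chunks
```

Each returned chunk can then be uploaded and drawn as its own vertex / index buffer pair. The trade-off is more draw calls and some duplicated vertices in exchange for keeping every individual buffer below a chosen size.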

@bubnikv
Collaborator

bubnikv commented Feb 11, 2021

It will be fixed in the next release. Closing.

@bubnikv bubnikv closed this as completed Feb 11, 2021
@bubnikv
Collaborator

bubnikv commented Apr 14, 2021

Fixed in PrusaSlicer 2.3.1-rc
