Segfault when creating sysimage with --check-bounds=no under Julia-1.11 #1021

Open
johnomotani opened this issue Dec 13, 2024 · 4 comments

@johnomotani

I'm developing a scientific HPC code (https://github.com/mabarnes/moment_kinetics) that I run using --check-bounds=no for performance, and often want to build a system image. This worked fine on Julia-1.10.x and earlier versions.

With Julia-1.11.2, if I start Julia with --check-bounds=no and/or have --check-bounds=no in the sysimage_build_args, I get a segfault. I've tested the exact same thing on Julia-1.10.7 (and updated to the latest versions of all dependencies on both Julia versions) and building the system image succeeds there.

My compilation script is

using Pkg

Pkg.activate(".")

using PackageCompiler

create_sysimage(; sysimage_path="moment_kinetics.so",
                precompile_execution_file="util/precompile_run.jl",
                # Needed to make MPI work, see https://github.com/JuliaParallel/MPI.jl/issues/518
                include_transitive_dependencies=false,
                sysimage_build_args=`-O3 --check-bounds=no`,
               )

The precompile_run.jl script on its own runs fine with either --check-bounds=no or --check-bounds=yes, but this create_sysimage() call fails with the output pasted at the bottom.
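
When comparing runs like these, it can help to confirm which bounds-checking mode a given Julia process actually picked up. The following is only a minimal sketch: `check_bounds_mode` is a hypothetical helper name, and it reads `Base.JLOptions()`, which is internal rather than public API. The mapping 0 = default, 1 = yes, 2 = no is an assumption about that internal field and may change between Julia versions.

```julia
# Hypothetical helper: report the --check-bounds mode of the running process.
# Assumption: the internal field uses 0 = default, 1 = yes, 2 = no.
function check_bounds_mode()
    flag = Base.JLOptions().check_bounds
    return flag == 1 ? "yes" : flag == 2 ? "no" : "default"
end

println("check-bounds mode: ", check_bounds_mode())
```

Placing a call like this at the top of a script such as util/precompile_run.jl would show whether the flag reached that process.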

As suggested in this Discourse thread, I tried forcing the build to use only one thread by running either

JULIA_IMAGE_THREADS=1 bin/julia --project -O3 --check-bounds=no precompile.jl

or

JULIA_IMAGE_THREADS=1 JULIA_CPU_THREADS=1 bin/julia -t 1 -p 1 --project -O3 --check-bounds=no precompile.jl

I also checked that memory usage does not seem to be the issue: I did not see a large fraction of my RAM in use at any point.

Segfault output:

$ bin/julia --project -O3 --check-bounds=no precompile.jl
  Activating project at `~/moment_kinetics-master-clean`
Precompiling project...
  1 dependency successfully precompiled in 15 seconds. 409 already precompiled.
[ Info: PackageCompiler: Executing /***/moment_kinetics-master-clean/util/precompile_run.jl => /tmp/jl_packagecompiler_2m4voE/jl_UmTb8L
  Activating project at `~/moment_kinetics`
[ Info: PackageCompiler: Done
⠙ [00m:39s] PackageCompiler: compiling incremental system image
[87212] signal 11 (128): Segmentation fault
in expression starting at none:1
process_node! at ./compiler/ssair/ir.jl:1540
iterate_compact at ./compiler/ssair/ir.jl:1809
iterate at ./compiler/ssair/ir.jl:1731 [inlined]
compact! at ./compiler/ssair/ir.jl:2003
compact! at ./compiler/ssair/ir.jl:2001 [inlined]
run_passes_ipo_safe at ./compiler/optimize.jl:994
run_passes_ipo_safe at ./compiler/optimize.jl:1009 [inlined]
optimize at ./compiler/optimize.jl:983
unknown function (ip: 0x72310ad755ad)
finish_nocycle at ./compiler/typeinfer.jl:265
_typeinf at ./compiler/typeinfer.jl:249
typeinf at ./compiler/typeinfer.jl:215
typeinf_edge at ./compiler/typeinfer.jl:923
abstract_call_method at ./compiler/abstractinterpretation.jl:660
abstract_call_gf_by_type at ./compiler/abstractinterpretation.jl:102
abstract_call_known at ./compiler/abstractinterpretation.jl:2200
unknown function (ip: 0x72310ade7046)
unknown function (ip: 0x723110130f7c)
unknown function (ip: 0x723110130ec9)
abstract_call at ./compiler/abstractinterpretation.jl:2282
abstract_call at ./compiler/abstractinterpretation.jl:2275
abstract_call at ./compiler/abstractinterpretation.jl:2420
abstract_eval_call at ./compiler/abstractinterpretation.jl:2435
abstract_eval_statement_expr at ./compiler/abstractinterpretation.jl:2451
abstract_eval_statement at ./compiler/abstractinterpretation.jl:2749
abstract_eval_basic_statement at ./compiler/abstractinterpretation.jl:3065
typeinf_local at ./compiler/abstractinterpretation.jl:3319
typeinf_nocycle at ./compiler/abstractinterpretation.jl:3401
_typeinf at ./compiler/typeinfer.jl:244
typeinf at ./compiler/typeinfer.jl:215
typeinf_edge at ./compiler/typeinfer.jl:923
abstract_call_method at ./compiler/abstractinterpretation.jl:660
abstract_call_gf_by_type at ./compiler/abstractinterpretation.jl:102
abstract_call_known at ./compiler/abstractinterpretation.jl:2200
unknown function (ip: 0x72310ade7046)
unknown function (ip: 0x723110130f7c)
unknown function (ip: 0x723110130ec9)
abstract_call at ./compiler/abstractinterpretation.jl:2282
abstract_call at ./compiler/abstractinterpretation.jl:2275
abstract_call at ./compiler/abstractinterpretation.jl:2420
abstract_eval_call at ./compiler/abstractinterpretation.jl:2435
abstract_eval_statement_expr at ./compiler/abstractinterpretation.jl:2451
abstract_eval_statement at ./compiler/abstractinterpretation.jl:2749
abstract_eval_basic_statement at ./compiler/abstractinterpretation.jl:3065
typeinf_local at ./compiler/abstractinterpretation.jl:3319
typeinf_nocycle at ./compiler/abstractinterpretation.jl:3401
_typeinf at ./compiler/typeinfer.jl:244
typeinf at ./compiler/typeinfer.jl:215
typeinf_edge at ./compiler/typeinfer.jl:923
abstract_call_method at ./compiler/abstractinterpretation.jl:660
abstract_call_gf_by_type at ./compiler/abstractinterpretation.jl:102
abstract_call_known at ./compiler/abstractinterpretation.jl:2200
unknown function (ip: 0x72310ade7046)
unknown function (ip: 0x723110130f7c)
unknown function (ip: 0x723110130ec9)
abstract_call at ./compiler/abstractinterpretation.jl:2282
abstract_call at ./compiler/abstractinterpretation.jl:2275
abstract_call at ./compiler/abstractinterpretation.jl:2420
abstract_eval_call at ./compiler/abstractinterpretation.jl:2435
abstract_eval_statement_expr at ./compiler/abstractinterpretation.jl:2451
abstract_eval_statement at ./compiler/abstractinterpretation.jl:2749
abstract_eval_basic_statement at ./compiler/abstractinterpretation.jl:3065
typeinf_local at ./compiler/abstractinterpretation.jl:3319
typeinf_nocycle at ./compiler/abstractinterpretation.jl:3401
_typeinf at ./compiler/typeinfer.jl:244
typeinf at ./compiler/typeinfer.jl:215
typeinf_edge at ./compiler/typeinfer.jl:923
abstract_call_method at ./compiler/abstractinterpretation.jl:660
abstract_call_gf_by_type at ./compiler/abstractinterpretation.jl:102
abstract_call_known at ./compiler/abstractinterpretation.jl:2200
unknown function (ip: 0x72310ade7046)
unknown function (ip: 0x723110130f7c)
unknown function (ip: 0x723110130ec9)
abstract_call at ./compiler/abstractinterpretation.jl:2282
abstract_call at ./compiler/abstractinterpretation.jl:2275
abstract_call at ./compiler/abstractinterpretation.jl:2420
abstract_eval_call at ./compiler/abstractinterpretation.jl:2435
abstract_eval_statement_expr at ./compiler/abstractinterpretation.jl:2451
abstract_eval_statement at ./compiler/abstractinterpretation.jl:2749
abstract_eval_basic_statement at ./compiler/abstractinterpretation.jl:3065
typeinf_local at ./compiler/abstractinterpretation.jl:3319
typeinf_nocycle at ./compiler/abstractinterpretation.jl:3401
_typeinf at ./compiler/typeinfer.jl:244
typeinf at ./compiler/typeinfer.jl:215
typeinf_edge at ./compiler/typeinfer.jl:923
abstract_call_method at ./compiler/abstractinterpretation.jl:660
abstract_call_gf_by_type at ./compiler/abstractinterpretation.jl:102
abstract_call_known at ./compiler/abstractinterpretation.jl:2200
unknown function (ip: 0x72310ade7046)
unknown function (ip: 0x723110130f7c)
unknown function (ip: 0x723110130ec9)
abstract_call at ./compiler/abstractinterpretation.jl:2282
abstract_call at ./compiler/abstractinterpretation.jl:2275
abstract_call at ./compiler/abstractinterpretation.jl:2420
abstract_eval_call at ./compiler/abstractinterpretation.jl:2435
abstract_eval_statement_expr at ./compiler/abstractinterpretation.jl:2451
abstract_eval_statement at ./compiler/abstractinterpretation.jl:2749
abstract_eval_basic_statement at ./compiler/abstractinterpretation.jl:3065
typeinf_local at ./compiler/abstractinterpretation.jl:3319
typeinf_nocycle at ./compiler/abstractinterpretation.jl:3401
_typeinf at ./compiler/typeinfer.jl:244
typeinf at ./compiler/typeinfer.jl:215
typeinf_edge at ./compiler/typeinfer.jl:923
abstract_call_method at ./compiler/abstractinterpretation.jl:660
abstract_call_gf_by_type at ./compiler/abstractinterpretation.jl:102
abstract_call_known at ./compiler/abstractinterpretation.jl:2200
unknown function (ip: 0x72310ade7046)
unknown function (ip: 0x723110130f7c)
unknown function (ip: 0x723110130ec9)
abstract_call at ./compiler/abstractinterpretation.jl:2282
abstract_call at ./compiler/abstractinterpretation.jl:2275
abstract_call at ./compiler/abstractinterpretation.jl:2420
abstract_eval_call at ./compiler/abstractinterpretation.jl:2435
abstract_eval_statement_expr at ./compiler/abstractinterpretation.jl:2451
abstract_eval_statement at ./compiler/abstractinterpretation.jl:2749
abstract_eval_basic_statement at ./compiler/abstractinterpretation.jl:3065
typeinf_local at ./compiler/abstractinterpretation.jl:3319
typeinf_nocycle at ./compiler/abstractinterpretation.jl:3401
_typeinf at ./compiler/typeinfer.jl:244
typeinf at ./compiler/typeinfer.jl:215
typeinf_edge at ./compiler/typeinfer.jl:923
abstract_call_method at ./compiler/abstractinterpretation.jl:660
abstract_call_gf_by_type at ./compiler/abstractinterpretation.jl:102
abstract_call_known at ./compiler/abstractinterpretation.jl:2200
unknown function (ip: 0x72310ade7046)
unknown function (ip: 0x723110130f7c)
unknown function (ip: 0x723110130ec9)
abstract_call at ./compiler/abstractinterpretation.jl:2282
abstract_call at ./compiler/abstractinterpretation.jl:2275
abstract_call at ./compiler/abstractinterpretation.jl:2420
abstract_eval_call at ./compiler/abstractinterpretation.jl:2435
abstract_eval_statement_expr at ./compiler/abstractinterpretation.jl:2451
abstract_eval_statement at ./compiler/abstractinterpretation.jl:2749
abstract_eval_basic_statement at ./compiler/abstractinterpretation.jl:3065
typeinf_local at ./compiler/abstractinterpretation.jl:3319
typeinf_nocycle at ./compiler/abstractinterpretation.jl:3401
_typeinf at ./compiler/typeinfer.jl:244
typeinf at ./compiler/typeinfer.jl:215
typeinf_edge at ./compiler/typeinfer.jl:923
abstract_call_method at ./compiler/abstractinterpretation.jl:660
abstract_call_gf_by_type at ./compiler/abstractinterpretation.jl:102
abstract_call_known at ./compiler/abstractinterpretation.jl:2200
unknown function (ip: 0x72310ade7046)
unknown function (ip: 0x723110130f7c)
unknown function (ip: 0x723110130ec9)
abstract_call at ./compiler/abstractinterpretation.jl:2282
abstract_call at ./compiler/abstractinterpretation.jl:2275
abstract_call at ./compiler/abstractinterpretation.jl:2420
abstract_eval_call at ./compiler/abstractinterpretation.jl:2435
abstract_eval_statement_expr at ./compiler/abstractinterpretation.jl:2451
abstract_eval_statement at ./compiler/abstractinterpretation.jl:2749
abstract_eval_basic_statement at ./compiler/abstractinterpretation.jl:3065
typeinf_local at ./compiler/abstractinterpretation.jl:3319
typeinf_nocycle at ./compiler/abstractinterpretation.jl:3401
_typeinf at ./compiler/typeinfer.jl:244
typeinf at ./compiler/typeinfer.jl:215
typeinf_edge at ./compiler/typeinfer.jl:923
abstract_call_method at ./compiler/abstractinterpretation.jl:660
abstract_call_gf_by_type at ./compiler/abstractinterpretation.jl:102
abstract_call_known at ./compiler/abstractinterpretation.jl:2200
unknown function (ip: 0x72310ade7046)
unknown function (ip: 0x723110130f7c)
unknown function (ip: 0x723110130ec9)
abstract_call at ./compiler/abstractinterpretation.jl:2282
abstract_call at ./compiler/abstractinterpretation.jl:2275
abstract_call at ./compiler/abstractinterpretation.jl:2420
abstract_eval_call at ./compiler/abstractinterpretation.jl:2435
abstract_eval_statement_expr at ./compiler/abstractinterpretation.jl:2451
abstract_eval_statement at ./compiler/abstractinterpretation.jl:2749
abstract_eval_basic_statement at ./compiler/abstractinterpretation.jl:3065
typeinf_local at ./compiler/abstractinterpretation.jl:3319
typeinf_nocycle at ./compiler/abstractinterpretation.jl:3401
_typeinf at ./compiler/typeinfer.jl:244
typeinf at ./compiler/typeinfer.jl:215
typeinf_edge at ./compiler/typeinfer.jl:923
abstract_call_method at ./compiler/abstractinterpretation.jl:660
abstract_call_gf_by_type at ./compiler/abstractinterpretation.jl:102
abstract_call_known at ./compiler/abstractinterpretation.jl:2200
unknown function (ip: 0x72310ade7046)
unknown function (ip: 0x723110130f7c)
unknown function (ip: 0x723110130ec9)
abstract_call at ./compiler/abstractinterpretation.jl:2282
abstract_call at ./compiler/abstractinterpretation.jl:2275
abstract_call at ./compiler/abstractinterpretation.jl:2420
abstract_eval_call at ./compiler/abstractinterpretation.jl:2435
abstract_eval_statement_expr at ./compiler/abstractinterpretation.jl:2451
abstract_eval_statement at ./compiler/abstractinterpretation.jl:2749
abstract_eval_basic_statement at ./compiler/abstractinterpretation.jl:3065
typeinf_local at ./compiler/abstractinterpretation.jl:3319
typeinf_nocycle at ./compiler/abstractinterpretation.jl:3401
_typeinf at ./compiler/typeinfer.jl:244
typeinf at ./compiler/typeinfer.jl:215
typeinf_ext at ./compiler/typeinfer.jl:1101
typeinf_ext_toplevel at ./compiler/typeinfer.jl:1139
typeinf_ext_toplevel at ./compiler/typeinfer.jl:1135
unknown function (ip: 0x723110168d69)
jl_apply at /cache/build/tester-amdci5-12/julialang/julia-release-1-dot-11/src/julia.h:2157 [inlined]
jl_type_infer at /cache/build/tester-amdci5-12/julialang/julia-release-1-dot-11/src/gf.c:390
_generate_from_hint at /cache/build/tester-amdci5-12/julialang/julia-release-1-dot-11/src/gf.c:2809 [inlined]
jl_compile_now at /cache/build/tester-amdci5-12/julialang/julia-release-1-dot-11/src/gf.c:2823 [inlined]
ijl_compile_method_instance at /cache/build/tester-amdci5-12/julialang/julia-release-1-dot-11/src/gf.c:2835
ijl_compile_hint at /cache/build/tester-amdci5-12/julialang/julia-release-1-dot-11/src/gf.c:2873
precompile at ./loading.jl:4021
unknown function (ip: 0x7231170b4d82)
jl_apply at /cache/build/tester-amdci5-12/julialang/julia-release-1-dot-11/src/julia.h:2157 [inlined]
do_apply at /cache/build/tester-amdci5-12/julialang/julia-release-1-dot-11/src/builtins.c:831
jl_apply at /cache/build/tester-amdci5-12/julialang/julia-release-1-dot-11/src/julia.h:2157 [inlined]
do_call at /cache/build/tester-amdci5-12/julialang/julia-release-1-dot-11/src/interpreter.c:126
eval_value at /cache/build/tester-amdci5-12/julialang/julia-release-1-dot-11/src/interpreter.c:223
eval_stmt_value at /cache/build/tester-amdci5-12/julialang/julia-release-1-dot-11/src/interpreter.c:174 [inlined]
eval_body at /cache/build/tester-amdci5-12/julialang/julia-release-1-dot-11/src/interpreter.c:663
jl_interpret_toplevel_thunk at /cache/build/tester-amdci5-12/julialang/julia-release-1-dot-11/src/interpreter.c:821
jl_toplevel_eval_flex at /cache/build/tester-amdci5-12/julialang/julia-release-1-dot-11/src/toplevel.c:943
ijl_toplevel_eval_in at /cache/build/tester-amdci5-12/julialang/julia-release-1-dot-11/src/toplevel.c:994
eval at ./boot.jl:430
unknown function (ip: 0x72310ade9d56)
jl_apply at /cache/build/tester-amdci5-12/julialang/julia-release-1-dot-11/src/julia.h:2157 [inlined]
do_call at /cache/build/tester-amdci5-12/julialang/julia-release-1-dot-11/src/interpreter.c:126
eval_value at /cache/build/tester-amdci5-12/julialang/julia-release-1-dot-11/src/interpreter.c:223
eval_stmt_value at /cache/build/tester-amdci5-12/julialang/julia-release-1-dot-11/src/interpreter.c:174 [inlined]
eval_body at /cache/build/tester-amdci5-12/julialang/julia-release-1-dot-11/src/interpreter.c:663
jl_interpret_toplevel_thunk at /cache/build/tester-amdci5-12/julialang/julia-release-1-dot-11/src/interpreter.c:821
jl_toplevel_eval_flex at /cache/build/tester-amdci5-12/julialang/julia-release-1-dot-11/src/toplevel.c:943
jl_toplevel_eval_flex at /cache/build/tester-amdci5-12/julialang/julia-release-1-dot-11/src/toplevel.c:886
ijl_toplevel_eval_in at /cache/build/tester-amdci5-12/julialang/julia-release-1-dot-11/src/toplevel.c:994
eval at ./boot.jl:430 [inlined]
include_string at ./loading.jl:2734
_include at ./loading.jl:2794
include at ./Base.jl:557
unknown function (ip: 0x72310ade7506)
exec_options at ./client.jl:323
_start at ./client.jl:531
unknown function (ip: 0x7231170c0e3f)
jl_apply at /cache/build/tester-amdci5-12/julialang/julia-release-1-dot-11/src/julia.h:2157 [inlined]
true_main at /cache/build/tester-amdci5-12/julialang/julia-release-1-dot-11/src/jlapi.c:900
jl_repl_entrypoint at /cache/build/tester-amdci5-12/julialang/julia-release-1-dot-11/src/jlapi.c:1059
main at /cache/build/tester-amdci5-12/julialang/julia-release-1-dot-11/cli/loader_exe.c:58
unknown function (ip: 0x723118a29d8f)
__libc_start_main at /lib/x86_64-linux-gnu/libc.so.6 (unknown line)
unknown function (ip: 0x4010b8)
Allocations: 27805698 (Pool: 27805071; Big: 627); GC: 23
✖ [00m:47s] PackageCompiler: compiling incremental system image
ERROR: LoadError: failed process: Process(`/home/john/.julia/juliaup/julia-1.11.2+0.x64.linux.gnu/bin/julia --color=yes --startup-file=no --pkgimages=no --cpu-target=native -O3 --check-bounds=no --sysimage=/home/john/.julia/juliaup/julia-1.11.2+0.x64.linux.gnu/lib/julia/sys.so --project=/home/john/physics/moment_kinetics-master-clean --output-o=/tmp/jl_IjCNCNKrkn-o.a /tmp/jl_Ut8Fo9qnCN`, ProcessSignaled(11)) [0]

Stacktrace:
  [1] pipeline_error
    @ ./process.jl:598 [inlined]
  [2] run(::Cmd; wait::Bool)
    @ Base ./process.jl:513
  [3] run
    @ ./process.jl:510 [inlined]
  [4] #20
    @ ~/.julia/packages/PackageCompiler/UbaS4/ext/TerminalSpinners.jl:157 [inlined]
  [5] spin(f::PackageCompiler.var"#20#22"{Cmd}, s::PackageCompiler.TerminalSpinners.Spinner{Base.TTY})
    @ PackageCompiler.TerminalSpinners ~/.julia/packages/PackageCompiler/UbaS4/ext/TerminalSpinners.jl:164
  [6] macro expansion
    @ ~/.julia/packages/PackageCompiler/UbaS4/ext/TerminalSpinners.jl:157 [inlined]
  [7] create_sysimg_object_file(object_file::String, packages::Vector{String}, packages_sysimg::Set{Base.PkgId}; project::String, base_sysimage::String, precompile_execution_file::Vector{String}, precompile_statements_file::Vector{String}, cpu_target::String, script::Nothing, sysimage_build_args::Cmd, extra_precompiles::String, incremental::Bool, import_into_main::Bool)
    @ PackageCompiler ~/.julia/packages/PackageCompiler/UbaS4/src/PackageCompiler.jl:130
  [8] create_sysimg_object_file
    @ ~/.julia/packages/PackageCompiler/UbaS4/src/PackageCompiler.jl:319 [inlined]
  [9] create_sysimage(packages::Nothing; sysimage_path::String, project::String, precompile_execution_file::String, precompile_statements_file::Vector{String}, incremental::Bool, filter_stdlibs::Bool, cpu_target::String, script::Nothing, sysimage_build_args::Cmd, include_transitive_dependencies::Bool, base_sysimage::Nothing, julia_init_c_file::Nothing, julia_init_h_file::Nothing, version::Nothing, soname::Nothing, compat_level::String, extra_precompiles::String, import_into_main::Bool)
    @ PackageCompiler ~/.julia/packages/PackageCompiler/UbaS4/src/PackageCompiler.jl:639
 [10] top-level scope
    @ ~/moment_kinetics/precompile.jl:12
in expression starting at /***/moment_kinetics/precompile.jl:12
@sloede
Collaborator

sloede commented Dec 13, 2024

Note that, while I do not have an explanation for this behavior with PC.jl, --check-bounds=no was effectively broken for many typical workflows with Julia v1.10 (no support for StaticArrays.jl anymore), and I assume the situation has not improved for Julia v1.11. I therefore recommend not relying on this flag anymore.

@johnomotani
Author

@sloede is there any advice on what to use instead? I'd rather keep using Julia-1.9 than have a factor of two or so slow-down in my code!

@sloede
Collaborator

sloede commented Dec 13, 2024

Use @inbounds wherever necessary (and appropriate). See also the discussion here: JuliaLang/julia#48245
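
As a rough illustration of that approach (a generic sketch, not code from moment_kinetics; `sum_inbounds` is a made-up name), bounds checks can be elided for a single loop whose indices are known to be valid, instead of disabling them globally:

```julia
# Sum a vector with bounds checks elided inside the loop only.
function sum_inbounds(x::AbstractVector{Float64})
    s = 0.0
    # eachindex(x) yields only valid indices of x, so @inbounds is safe here.
    @inbounds for i in eachindex(x)
        s += x[i]
    end
    return s
end

println(sum_inbounds([1.0, 2.0, 3.0]))  # prints 6.0
```

The annotation is only safe when every index in the loop is guaranteed valid, which is why the loop iterates over eachindex(x) rather than a hand-written range.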

@johnomotani
Author

Thanks @sloede. I really hope the Julia developers come up with a better option, and soon!
