Consider removing --check-bounds=no? #48245
Comments
Here is a case where I regularly use |
Just to be clear, the reason we want to remove it is it requires us to compile the code more conservatively, leading to significant losses in inference accuracy, leading to significant losses of performance when running with |
@matthias314 Out of curiosity, what kind of speedup do you get? |
Triage discussed this at length and the conclusion was that issues that are present in |
That sounds easy to prove: just remove |
@JeffBezanson The speedup varies; often it is indeed not impressive. My point was not so much about the effectiveness of the current implementation of |
Could we get a specific example of |
So a nice example where |
Thanks! I figured out the reason I couldn't reproduce this is that I wasn't using the latest master. |
Apparently there are GPU users who care about this, because GPUs are very sensitive to the branch-heavy code introduced by bounds checks (typically because it increases register pressure, which hurts occupancy), and they are using unoptimized kernels that don't have the necessary

@Keno Can you elaborate on the Cassette-like solution? What's a reasonable way to implement this for GPUCompiler's abstract interpreter without the const-prop regressions?
The feature |
The biggest security flaw in C comes from the fact that it does not have bounds checking enabled in releases.
Bounds checking can naturally have a drastic impact on performance, as we can see by running the following example. The performance achieved without bounds checking is here over three times higher than with bounds checking activated (executed on a P100 GPU):
I think the argument is to use |
This example shows that bounds checking drastically impacts performance for high performance (GPU) applications. As a result, when running scientific high performance code, we do not want a single bounds check to happen. At the same time, while developing scientific high performance code, we would like bounds checks everywhere.
The point is that your entire session shouldn't be running under |
At the moment we run the code without bounds checks (meaning for a production run), there will be no errors almost 100% of the time.
That could be a good solution indeed, if bounds checking did not impact the performance of high performance CPU code. However, running the same code on the CPU we can already see that this is not true (problem size: 1024^2):
Any indiscriminate use of Any use of Personally, I'd prefer enforcing the use of In regards to GPU code (or any code, really) - it seems much more safe to me to use In case it's not clear - I'm in favour of removing |
Being able to do development with memory safety and then easily switch to faster performance using a simple command line option is literally one of the best and most used features for me as an HPC user. When talking to colleagues from the scientific computing community, maybe the single strongest argument in favor of Julia is that it takes away nothing of your performance while making it much easier to work and prototype with. Another major point I bring up when advocating Julia is that it makes great strides in solving the two-language problem. By default, everything is memory safe and still reasonably fast. But when necessary, it's as simple as flipping a single switch and you get performance on par with Fortran/C++, and that's for real production code, not just academic setups. In addition, you now have to annotate (or better yet: first test, then annotate, since no premature optimization) each loop that might be remotely performance critical. I did a quick check; in our code base for Trixi.jl, we have around 2000 Finally, here are some numbers. For a production run with Trixi.jl, I see a strong influence of
Thus while the effect of disabling bounds checking varies, there is a significant impact on many core algorithms. TL;DR At the current state, I think removing |
Clearly not, since an |
I am sympathetic to the performance argument, but the actual issue here is that Local annotations have the benefit that they only require local knowledge to reason about them, whereas global flags require global knowledge (in my opinion impossible to obtain). Now out-of-bounds exceptions are hopefully not used like that, but In Julia we have exposed local options to control unsafe behaviours (like fast-math or opting out of bounds-checking) and as others have pointed out |
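To make the "local knowledge" point concrete, here is a minimal sketch of a locally justified `@inbounds`. The function name `scaled_add!` is hypothetical and not taken from this thread; the only assumption is the documented behavior of `eachindex` with several arrays, which throws a `DimensionMismatch` unless all arguments share the same indices.

```julia
# Hypothetical helper: computes y[i] += a * x[i] for every index.
function scaled_add!(y::Vector{Float64}, a::Float64, x::Vector{Float64})
    # eachindex(x, y) throws a DimensionMismatch if the index ranges differ,
    # so every index it yields is valid for both arrays. That is the local
    # evidence that makes the @inbounds below safe.
    for i in eachindex(x, y)
        @inbounds y[i] += a * x[i]
    end
    return y
end

y = [1.0, 2.0, 3.0]
scaled_add!(y, 2.0, [1.0, 1.0, 1.0])
println(y)
```

The safety argument lives entirely inside the function body, so a reader does not need to audit callers or dependencies, which is the property a global flag cannot offer.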
@vchuravy I see your point about fine-grained control and its benefits. On the other hand, you lose a big part of what makes Julia currently very flexible. Now the same code can be memory safe or fast, without the user having to make a conscious decision in each (potentially) performance critical section. Why would you give up this awesome feature, one of the strongest selling points of Julia, in a world where you compete with established, fast languages like C++/Fortran or slow, rapid-prototyping languages like Python? I feel like this would instead lead Julia more towards Python, where you can be just as fast as Julia if you restrict yourself to loops that are amenable to And again, adding If the main argument in favor is that users abuse the |
For me The issue is that I could see a use for development where I would like to answer the question "would |
My two cents is to document the recommended way for developers to refactor existing code and to write code onwards no matter what the outcome is for the recommended change. Mostly to be sure I'm not giving advice on deprecated functionality as we provide tutorials for the HPC folks new to Julia. Thanks! |
A single erroneous
In my experience, well-written, idiomatic Julia code is as fast as (or sometimes faster than) equivalent C/C++/Fortran code while retaining the safety features a high-level language provides. Turning those safety features off and saying "look, without those safety features we are just as fast!" is, in my opinion, misrepresenting the strength of Julia, which is being strictly better than the "old guard", and it further perpetuates the myth that the only way to go fast is to not have safe programs. This is not an either/or thing - we can, should (and ultimately MUST) have both at the same time.
The correct way to deal with a tool that you can do nothing but cut yourself with if not held in exactly the right way (which doesn't exist here, as this is a global flag and I'm pretty sure you're not auditing all your dependencies for correct behavior) is not to tell people to only hold it in the exactly right way, it is to fix the tool so you can't cut yourself in the first place. |
Could a compromise be made that you could disable |
Any larger-than-local-array scope would certainly be appreciated. Hope that's what comes out of this discussion. There are scientific schemas with thousands of array variables (any atmospheric or mildly large experimental dataset, or even a multiphysics particle or mesh based simulation) adding
Tests at small scales actually save a tremendous amount of compute power before launching at scale; CI is good preventive/predictive maintenance not available at that time. But I'd rather not get out of scope.
To conclude the input from my side: people that currently rely on the feature ...and at that point I would like to say: thanks to all the people working on the julia compiler and the core language for their awesome work! |
No - this means that both are bad and if you have to use it, choose
As pointed out above, this is not trivial, at all, due to bounds checking removal requiring inlining in the first place, which is very likely to cross module boundaries. This also comes with the problem of just removing bounds checking not necessarily meaning that the compiler can vectorize effectively, which is what's often needed/what people want to achieve with that in the first place (see e.g. eschnett/SIMD.jl#102).
That is not what people in this thread are asking and would indeed be too cumbersome to use - instead, use |
Updating the documentation here with this info (and deprecation) would be appreciated, in particular for those new to Julia coming from Fortran, Matlab, Numpy, etc. and helps us in trying to convince people to adopt Julia. Thanks everyone for the great work and discussion. Edit: my motivation is that we are preparing tutorials for HPC crowds so we want to point them in the right direction moving forward @vchuravy |
@williamfgc not sure if you're talking about the documentation for existing Julia versions or you're just asking about making sure the documentation gets updated if there are changes, but assuming the former: it's easy to improve the docs yourself (for simple changes you can just edit in your browser), and often it's better if the person who sees the gap is the one who tries to close it. (To me the docs look just fine, I wouldn't know what to change, but since I already know how it works I'm sure I'm not seeing them from the right perspective.) |
@timholy more like bringing awareness that the docs would soon become outdated if this flag is removed (it's already broken) and block-style |
What about the opposite: develop using Discussion above mostly focused on this being less pretty. |
That is recommended |
From version 1.9 onwards, when `--check-bounds=no` is used, concrete-eval is completely disabled. However, `--check-bounds=no` still appears to be used within the community, causing issues like the one reported in JuliaArrays/StaticArrays.jl#1155. Although we should move toward eliminating the flag in the future (#48245), for the time being there are many requests to carry out a certain level of compiler optimization even when this flag is enabled.

This commit aims to allow concrete-eval "safely" even under `--check-bounds=no`. Specifically, when the method call being analyzed is `:nothrow`, it should be predominantly safe to concrete-eval it under this flag.

Technically, however, even `:nothrow` methods could trigger undefined behavior, since `:nothrow` isn't a strict constraint and it's possible for users to annotate potentially risky methods with `Base.@assume_effects :nothrow`. Nonetheless, since this possibility is acknowledged in the `Base.@assume_effects` documentation, I feel it's fair to relegate it to user responsibility. (#50107)
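For readers unfamiliar with the annotation mentioned above, the following is a small sketch of a user-asserted `:nothrow` effect. The function `third_of_three` is a hypothetical example, not code from the commit. `Base.infer_effects` is an internal reflection helper whose availability and output format vary between Julia versions, so the call is guarded and no particular output is claimed.

```julia
# Hypothetical example: the user promises the compiler that this method never throws.
# The promise is plausible here because a 3-tuple always has a third element.
Base.@assume_effects :nothrow function third_of_three(t::NTuple{3,Int})
    return t[3]
end

println(third_of_three((10, 20, 30)))

# Internal API; guarded because it may not exist in every Julia version.
if isdefined(Base, :infer_effects)
    println(Base.infer_effects(third_of_three, (NTuple{3,Int},)))
end
```

If the same annotation were placed on a method that indexes with an arbitrary `i`, the promise could be false, which is exactly the user-responsibility caveat the commit message describes.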
In the current state of the Julia compiler, bounds checking and its related optimization code, such as `@boundscheck` and `@inbounds`, pose a significant handicap for effect analysis. As a result, we're encountering an ironic situation where `@inbounds` annotations, which are intended to optimize performance, instead obstruct the program's optimization and prevent us from achieving optimal performance.

This PR is designed to resolve this situation. It aims to improve the relationship between bounds checking and effect analysis, thereby correctly improving the performance of programs that have `@inbounds` annotations.

In the following, I'll first explain the reasons that have led to this situation, and then I'll present potential improvements to address these issues. This commit is a collection of various improvement proposals. We need to incorporate all of them simultaneously to improve the situation without introducing any regressions.

## Core of the Problem

There are fundamentally two reasons why effect analysis of code containing bounds checking is difficult:

1. The evaluation value of `Expr(:boundscheck)` is influenced by the `@inbounds` macro and the `--check-bounds` flag. Hence, when performing a concrete evaluation of a method containing `Expr(:boundscheck)`, it's crucial to respect the `@inbounds` macro context and the `--check-bounds` settings, ensuring the method's behavior is consistent across the compile time concrete evaluation and the runtime execution.
2. If the code, from which bounds checking has been removed due to `@inbounds` or `--check-bounds=no`, is unsafe, it may lead to undefined behavior due to uncertain memory access.
## Current State

The current Julia compiler handles these two problems as follows:

### Current State 1

Regarding the first problem, if code or a method call containing `Expr(:boundscheck)` is within an `@inbounds` context, concrete evaluation is immediately prohibited. For instance, in the following case, when analyzing `bar()`, simply performing concrete evaluation of `foo()` wouldn't properly respect the `@inbounds` context present in `bar()`. However, since the concrete evaluation of `foo()` is prohibited, it doesn't pose an issue:

```julia
foo() = (r = 0; @boundscheck r += 1; return r)
bar() = @inbounds foo()
```

Conversely, in the following case, there is _no need_ to prohibit the concrete evaluation of `A1_inbounds` due to the presence of `@inbounds`. This is because ~~the execution of the `@boundscheck` block is determined by the presence of local `@inbounds`~~ `Expr(:boundscheck)` within a local `@inbounds` context does not need to block concrete evaluation:

```julia
function A1_inbounds()
    r = 0
    @inbounds begin
        @boundscheck r += 1
    end
    return r
end
```

However, currently, we prohibit the concrete evaluation of such code as well. ~~Moreover, we are not handling such local `@inbounds` contexts effectively, which results in incorrect execution of `A1_inbounds()` (even our test is incorrect for this example: <https://github.com/JuliaLang/julia/blob/834aad4ab409f4ba65cbed2963b9ab6fa2770354/test/boundscheck_exec.jl#L34>)~~ EDIT: This was expected behavior, as pointed out by Jameson.

Furthermore, there is room for improvement when the `--check-bounds` flag is specified. Specifically, when the `--check-bounds` flag is set, the evaluation value of `Expr(:boundscheck)` is determined irrespective of the `@inbounds` context. Hence, there is no need to prohibit concrete evaluations due to inconsistency in the evaluation value of `Expr(:boundscheck)`.
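The dependence of a `@boundscheck` block on the caller's context can be observed with a small sketch. All function names below are hypothetical. The sketch assumes the documented contract that a `@boundscheck` block may be elided only when the callee is inlined into an `@inbounds` caller, and it reads `Base.JLOptions().check_bounds`, an internal field that reports the flag as 0 (default), 1 (`yes`) or 2 (`no`). What gets printed depends on the flag and on inlining, so no output is claimed.

```julia
# Hypothetical callee: @inline lets a caller's @inbounds reach the @boundscheck block.
@inline function describe_access(A, i)
    @boundscheck checkbounds(A, i)   # may be skipped inside an @inbounds caller
    return "accessing ($A)[$i]"
end

checked_caller() = describe_access(1:2, -1)
unchecked_caller() = @inbounds describe_access(1:2, -1)

# Internal API: 0 = default, 1 = --check-bounds=yes, 2 = --check-bounds=no.
println("check_bounds = ", Base.JLOptions().check_bounds)

# Report what each caller does under the current settings.
for f in (checked_caller, unchecked_caller)
    try
        println(f, " returned: ", f())
    catch err
        println(f, " threw: ", typeof(err))
    end
end
```

No memory is actually read in `describe_access`, so skipping the check is harmless in this toy case; the point is only that the same callee can behave differently depending on where it is called from, which is what concrete evaluation has to respect.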
### Current State 2

Next, we've ensured that concrete evaluation isn't performed when there's potentially unsafe code that may have bounds checking removed, or when the `--check-bounds=no` flag is set, which could lead to bounds checking always being removed. For instance, if you perform concrete evaluation for the function call `baz((1,2,3), 4)` in the following example, it may return a value accessed from illicit memory and introduce undefined behavior into the program:

```julia
baz(t::Tuple, i::Int) = @inbounds t[i]
baz((1,2,3), 4)
```

However, it's evident that the above code is an incorrect and unsafe program, and I believe undefined behavior in such programs is deemed the user's responsibility, as explicitly stated in the `@inbounds` documentation:

> Warning
>
> Using @inbounds may return incorrect results/crashes/corruption for out-of-bounds indices. The user is responsible for checking it manually. Only use @inbounds when it is certain from the information locally available that all accesses are in bounds.

Actually, the `@inbounds` macro is primarily an annotation to "improve performance by removing bounds checks from safe programs". Therefore, I think it would be more reasonable to use it to alleviate potential errors due to bounds checking within `@inbounds` contexts.

To bring up another associated concern, in the current compiler implementation, the `:nothrow` modeling for `getfield`/`arrayref`/`arrayset` is a bit risky, and `:nothrow`-ness is assumed when their bounds checking is turned off by call argument. If our intended direction aligns with the removal of bounds checking based on `@inbounds` as proposed in issue #48245, then assuming `:nothrow`-ness due to `@inbounds` seems reasonable. However, presuming `:nothrow`-ness due to the bounds checking argument or the `--check-bounds` flag appears to be risky, especially considering it's not documented.
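One way to keep an accessor like `baz` within the documented `@inbounds` contract is to validate the index explicitly before the unchecked access. The sketch below uses a hypothetical name, `baz_checked`, and is only an illustration of that pattern, not part of this PR.

```julia
# Hypothetical safe variant of baz: the index is validated locally, so the
# @inbounds access that follows is justified by information in this function.
function baz_checked(t::Tuple, i::Int)
    1 <= i <= length(t) || throw(BoundsError(t, i))
    return @inbounds t[i]
end

println(baz_checked((1, 2, 3), 2))

try
    baz_checked((1, 2, 3), 4)
catch err
    println("rejected out-of-bounds index: ", typeof(err))
end
```

Because the explicit check is ordinary code rather than a `@boundscheck` block, it is not removed by `@inbounds` in a caller, so the out-of-bounds call is rejected instead of reading invalid memory.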
## This Commit

This commit implements all proposed improvements against the current issues as mentioned above. In summary, the enhancements include:

- making `Expr(:boundscheck)` within a local `@inbounds` context not block concrete evaluation
- folding out `Expr(:boundscheck)` when the `--check-bounds` flag is set (and allowing concrete evaluation)
- changing the `:nothrow` effect bit to `UInt8` type, and refining `:nothrow` information when in an `@inbounds` context
- removing dangerous assumptions of `:nothrow`-ness for built-in functions when bounds checking is turned off
- replacing the `@_safeindex` hack with `@inbounds`
In the current state of the Julia compiler, bounds checking and its related optimization code, such as `@boundscheck` and `@inbounds`, pose a significant handicap for effect analysis. As a result, we're encountering an ironic situation where the application of `@inbounds` annotations, which are intended to optimize performance, instead obstruct the program's optimization, thereby preventing us from achieving optimal performance. This PR is designed to resolve this situation. It aims to enhance the relationship between bounds checking and effect analysis, thereby correctly improving the performance of programs that have `@inbounds` annotations. In the following, I'll first explain the reasons that have led to this situation for better understanding, and then I'll present potential improvements to address these issues. This commit is a collection of various improvement proposals. It's necessary that we incorporate all of them simultaneously to enhance the situation without introducing any regressions. \## Core of the Problem There are fundamentally two reasons why effect analysis of code containing bounds checking is difficult: 1. The evaluation value of `Expr(:boundscheck)` is influenced by the `@inbounds` macro and the `--check-bounds` flag. Hence, when performing a concrete evaluation of a method containing `Expr(:boundscheck)`, it's crucial to respect the `@inbounds` macro context and the `--check-bounds` settings, ensuring the method's behavior is consistent across the compile time concrete evaluation and the runtime execution. 1. If the code, from which bounds checking has been removed due to `@inbounds` or `--check-bounds=no`, is unsafe, it may lead to undefined behavior due to uncertain memory access. 
\## Current State The current Julia compiler handles these two problems as follows: \### Current State 1 Regarding the first problem, if a code or method call containing `Expr(:boundscheck)` is within an `@inbounds` context, a concrete evaluation is immediately prohibited. For instance, in the following case, when analyzing `bar()`, if you simply perform concrete evaluation of `foo()`, it wouldn't properly respect the `@inbounds` context present in `bar()`. However, since the concrete evaluation of `foo()` is prohibited, it doesn't pose an issue: ```julia foo() = (r = 0; @BoundsCheck r += 1; return r) bar() = @inbounds foo() ``` Conversely, in the following case, there is _no need_ to prohibit the concrete evaluation of `A1_inbounds` due to the presence of `@inbounds`. This is because ~~the execution of the `@boundscheck` block is determined by the presence of local `@inbounds`~~ `Expr(:boundscheck)` within a local `@inbounds` context does not need to block concrete evaluation: ```julia function A1_inbounds() r = 0 @inbounds begin @BoundsCheck r += 1 end return r end ``` However, currently, we prohibit the concrete evaluation of such code as well. ~~Moreover, we are not handling such local `@inbounds` contexts effectively, which results in incorrect execution of `A1_inbounds()` (even our test is incorrect for this example: `https://github.com/JuliaLang/julia/blob/834aad4ab409f4ba65cbed2963b9ab6fa2770354/test/boundscheck_exec.jl#L34`)~~ EDIT: It is an expected behavior as pointed out by Jameson. Furthermore, there is room for improvement when the `--check-bounds` flag is specified. Specifically, when the `--check-bounds` flag is set, the evaluation value of `Expr(:boundscheck)` is determined irrespective of the `@inbounds` context. Hence, there is no need to prohibit concrete evaluations due to inconsistency in the evaluation value of `Expr(:boundscheck)`. 
\### Current State 2 Next, we've ensured that concrete evaluation isn't performed when there's potentially unsafe code that may have bounds checking removed, or when the `--check-bounds=no` flag is set, which could lead to bounds checking being removed always. For instance, if you perform concrete evaluation for the function call `baz((1,2,3), 4)` in the following example, it may return a value accessed from illicit memory and introduce undefined behaviors into the program: ```julia baz(t::Tuple, i::Int) = @inbounds t[i] baz((1,2,3), 4) ``` However, it's evident that the above code is incorrect and unsafe program and I believe undefined behavior in such programs is deemed, as explicitly stated in the `@inbounds` documentation: > │ Warning > │ > │ Using @inbounds may return incorrect results/crashes/corruption for > │ out-of-bounds indices. The user is responsible for checking it > │ manually. Only use @inbounds when it is certain from the information > │ locally available that all accesses are in bounds. Actually, the `@inbounds` macro is primarily an annotation to "improve performance by removing bounds checks from safe programs". Therefore, I opine that it would be more reasonable to utilize it to alleviate potential errors due to bounds checking within `@inbounds` contexts. To bring up another associated concern, in the current compiler implementation, the `:nothrow` modelings for `getfield`/`arrayref`/`arrayset` is a bit risky, and `:nothrow`-ness is assumed when their bounds checking is turned off by call argument. If our intended direction aligns with the removal of bounds checking based on `@inbounds` as proposed in issue #48245, then assuming `:nothrow`-ness due to `@inbounds` seems reasonable. However, presuming `:nothrow`-ness due to bounds checking argument or the `--check-bounds` flag appears to be risky, especially considering it's not documented. 
\## This Commit This commit implements all proposed improvements against the current issues as mentioned above. In summary, the enhancements include: - making `Expr(:boundscheck)` within a local `@inbounds` context not block concrete evaluation - folding out `Expr(:boundscheck)` when the `--check-bounds` flag is set (and allow concrete evaluation) - changing the `:nothrow` effect bit to `UInt8` type, and refining `:nothrow` information when in an `@inbounds` context - removing dangerous assumptions of `:nothrow`-ness for built-in functions when bounds checking is turned off - replacing the `@_safeindex` hack with `@inbounds`
In the current state of the Julia compiler, bounds checking and its related optimization code, such as `@boundscheck` and `@inbounds`, pose a significant handicap for effect analysis. As a result, we're encountering an ironic situation where the application of `@inbounds` annotations, which are intended to optimize performance, instead obstruct the program's optimization, thereby preventing us from achieving optimal performance. This PR is designed to resolve this situation. It aims to enhance the relationship between bounds checking and effect analysis, thereby correctly improving the performance of programs that have `@inbounds` annotations. In the following, I'll first explain the reasons that have led to this situation for better understanding, and then I'll present potential improvements to address these issues. This commit is a collection of various improvement proposals. It's necessary that we incorporate all of them simultaneously to enhance the situation without introducing any regressions. \## Core of the Problem There are fundamentally two reasons why effect analysis of code containing bounds checking is difficult: 1. The evaluation value of `Expr(:boundscheck)` is influenced by the `@inbounds` macro and the `--check-bounds` flag. Hence, when performing a concrete evaluation of a method containing `Expr(:boundscheck)`, it's crucial to respect the `@inbounds` macro context and the `--check-bounds` settings, ensuring the method's behavior is consistent across the compile time concrete evaluation and the runtime execution. 1. If the code, from which bounds checking has been removed due to `@inbounds` or `--check-bounds=no`, is unsafe, it may lead to undefined behavior due to uncertain memory access. 
\## Current State The current Julia compiler handles these two problems as follows: \### Current State 1 Regarding the first problem, if a code or method call containing `Expr(:boundscheck)` is within an `@inbounds` context, a concrete evaluation is immediately prohibited. For instance, in the following case, when analyzing `bar()`, if you simply perform concrete evaluation of `foo()`, it wouldn't properly respect the `@inbounds` context present in `bar()`. However, since the concrete evaluation of `foo()` is prohibited, it doesn't pose an issue: ```julia foo() = (r = 0; @BoundsCheck r += 1; return r) bar() = @inbounds foo() ``` Conversely, in the following case, there is _no need_ to prohibit the concrete evaluation of `A1_inbounds` due to the presence of `@inbounds`. This is because ~~the execution of the `@boundscheck` block is determined by the presence of local `@inbounds`~~ `Expr(:boundscheck)` within a local `@inbounds` context does not need to block concrete evaluation: ```julia function A1_inbounds() r = 0 @inbounds begin @BoundsCheck r += 1 end return r end ``` However, currently, we prohibit the concrete evaluation of such code as well. ~~Moreover, we are not handling such local `@inbounds` contexts effectively, which results in incorrect execution of `A1_inbounds()` (even our test is incorrect for this example: `https://github.com/JuliaLang/julia/blob/834aad4ab409f4ba65cbed2963b9ab6fa2770354/test/boundscheck_exec.jl#L34`)~~ EDIT: It is an expected behavior as pointed out by Jameson. Furthermore, there is room for improvement when the `--check-bounds` flag is specified. Specifically, when the `--check-bounds` flag is set, the evaluation value of `Expr(:boundscheck)` is determined irrespective of the `@inbounds` context. Hence, there is no need to prohibit concrete evaluations due to inconsistency in the evaluation value of `Expr(:boundscheck)`. 
### Current State 2

Next, we've ensured that concrete evaluation isn't performed when there's potentially unsafe code that may have had bounds checking removed, or when the `--check-bounds=no` flag is set, which always removes bounds checking. For instance, if you perform concrete evaluation of the function call `baz((1,2,3), 4)` in the following example, it may return a value read from invalid memory and introduce undefined behavior into the program:

```julia
baz(t::Tuple, i::Int) = @inbounds t[i]
baz((1,2,3), 4)
```

However, the above code is evidently an incorrect and unsafe program, and I believe undefined behavior in such programs is deemed acceptable, as explicitly stated in the `@inbounds` documentation:

> │ Warning
> │
> │ Using @inbounds may return incorrect results/crashes/corruption for
> │ out-of-bounds indices. The user is responsible for checking it
> │ manually. Only use @inbounds when it is certain from the information
> │ locally available that all accesses are in bounds.

In fact, the `@inbounds` macro is primarily an annotation to "improve performance by removing bounds checks from safe programs". Therefore, I think it would be more reasonable to use it to rule out potential errors due to bounds checking within `@inbounds` contexts.

To bring up another related concern: in the current compiler implementation, the `:nothrow` modeling for `getfield`/`arrayref`/`arrayset` is a bit risky, because `:nothrow`-ness is assumed when their bounds checking is turned off by a call argument. If our intended direction is to remove bounds checking based on `@inbounds`, as proposed in issue #48245, then assuming `:nothrow`-ness due to `@inbounds` seems reasonable. However, presuming `:nothrow`-ness due to the bounds checking argument or the `--check-bounds` flag appears risky, especially considering it's not documented.
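As an illustration of the documented contract, here is a minimal sketch of a usage pattern where `@inbounds` is justified by locally available information. The function name `sum_range` is hypothetical. The explicit `checkbounds` call validates the whole range once, so every access inside the loop is known to be in bounds.

```julia
# Hypothetical helper: sums v[lo:hi] with a single up-front bounds check.
function sum_range(v::AbstractVector{<:Real}, lo::Int, hi::Int)
    # Throws a BoundsError unless every index in lo:hi is valid for v.
    checkbounds(v, lo:hi)
    s = zero(eltype(v))
    # Safe: the check above proves that each v[i] below is in bounds.
    @inbounds for i in lo:hi
        s += v[i]
    end
    return s
end

println(sum_range([1, 2, 3, 4], 2, 3))
```

Unlike `baz` above, an out-of-range request such as `sum_range([1, 2, 3], 1, 4)` fails with a `BoundsError` before any unchecked access happens.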
## This Commit

This commit implements all of the proposed improvements for the issues described above. In summary, the enhancements include:

- making `Expr(:boundscheck)` within a local `@inbounds` context not block concrete evaluation
- folding out `Expr(:boundscheck)` when the `--check-bounds` flag is set (and allowing concrete evaluation)
- changing the `:nothrow` effect bit to `UInt8` type, and refining `:nothrow` information when in an `@inbounds` context
- removing dangerous assumptions of `:nothrow`-ness for built-in functions when bounds checking is turned off
- replacing the `@_safeindex` hack with `@inbounds`
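One way to observe the `:nothrow` bit discussed above is to query inferred effects directly. This sketch assumes `Base.infer_effects` and `Core.Compiler.is_nothrow`, which are internal, unexported, and may change or move between Julia versions. The function names `checked` and `unchecked` are hypothetical. Results differ across versions and `--check-bounds` settings, so none are shown.

```julia
# Hypothetical accessors: one with default bounds checking, one with `@inbounds`.
checked(t::Tuple{Int,Int,Int}, i::Int) = t[i]
unchecked(t::Tuple{Int,Int,Int}, i::Int) = @inbounds t[i]

# `Base.infer_effects` and `Core.Compiler.is_nothrow` are internal APIs
# (assumed available here; they are not part of the stable public interface).
for f in (checked, unchecked)
    effects = Base.infer_effects(f, (Tuple{Int,Int,Int}, Int))
    println(nameof(f), ": nothrow = ", Core.Compiler.is_nothrow(effects))
end
```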
In 1.9, `--check-bounds=no` has started causing significant performance regressions (e.g. #50110). This is because we switched a number of functions that used to be `@pure` to the new effects-based infrastructure, which very closely tracks the legality conditions for concrete evaluation. Unfortunately, disabling bounds checking completely invalidates all prior legality analysis, so the only realistic path we have is to completely disable concrete evaluation.

In general, we are learning that these kinds of global make-things-faster-but-unsafe flags are highly problematic for a language, for several reasons:

- Code is written with the assumption of a particular mode being chosen, so it is in general impossible or unsafe to compose libraries (which in a language like Julia is a huge problem).
- Unsafe semantics are often harder for the compiler to reason about, causing unexpected performance issues (although the 1.9 `--check-bounds=no` issues are worse than just disabling concrete eval for things that use bounds checking).

In general, I'd like to remove the `--check-bounds=` option entirely (#48245), but that proposal has encountered major opposition.

This PR implements an alternative proposal: we introduce a new function `Core.should_check_bounds(boundscheck::Bool) = boundscheck`. This function is passed the result of `Expr(:boundscheck)` (which is now purely determined by the inliner based on `@inbounds`, without regard for the command line flag). In this proposal, the command line flag simply redefines this function to return either `true` or `false` (unconditionally), depending on the value of the flag.

Of course, this causes massive amounts of recompilation, but I think this can be addressed by adding logic to loading that loads a pkgimage with appropriate definitions to cure the invalidations.
The special logic we have now to take account of the `--check-bounds` flag in .ji selection would be replaced by automatically injecting the special pkgimage as a dependency of every loaded image. This part isn't implemented in this PR, but I think it's reasonable to do.

I think with that, the `--check-bounds` flag remains functional, while having much better defined behavior, as it relies on the standard world age mechanisms.

A major benefit of this approach is that it can be scoped appropriately using overlay tables. For example:

```
julia> using CassetteOverlay

julia> @MethodTable AssumeInboundsTable;

julia> @overlay AssumeInboundsTable Core.should_check_bounds(b::Bool) = false;

julia> assume_inbounds = @overlaypass AssumeInboundsTable

julia> assume_inbounds(f, args...) # f(args...) with bounds checking disabled dynamically
```

Similar logic applies to GPUCompiler, which already supports overlay tables.
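The redefinition mechanism can be modeled in plain Julia without any compiler changes. In this sketch, `BoundsToggle.should_check_bounds` is a hypothetical stand-in for the proposed `Core.should_check_bounds` (which is not part of released Julia), and `toy_get` is a hypothetical accessor that stays memory-safe whether or not the check runs. It only illustrates how redefining one small function changes the behavior of its callers through the usual world age rules.

```julia
# Toy model of the proposal; all names in this module are hypothetical.
module BoundsToggle

# Default: pass through whatever the `@inbounds` context decided.
should_check_bounds(boundscheck::Bool) = boundscheck

# Toy accessor that consults the hook. It stays memory-safe either way,
# because `get` returns the default 0 for an out-of-range index.
function toy_get(v::Vector{Int}, i::Int)
    if should_check_bounds(true)
        checkbounds(v, i)
    end
    return get(v, i, 0)
end

end # module

v = [10, 20, 30]

# With the default definition, the out-of-range access is rejected.
threw = try
    BoundsToggle.toy_get(v, 4)
    false
catch err
    err isa BoundsError
end
println("default hook throws: ", threw)

# Emulate `--check-bounds=no`: redefine the hook unconditionally. The new
# method invalidates compiled callers, so later calls see the new behavior.
@eval BoundsToggle should_check_bounds(::Bool) = false
println("after redefinition: ", BoundsToggle.toy_get(v, 4))
```

An overlay method table scopes the same kind of redefinition to a dynamic extent instead of applying it globally, which is what the `assume_inbounds` pass above does.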
See RFC for one possible approach in #50239
I think we should consider removing the `--check-bounds=no` option. I can't really think of any situation in which it would be safe or sensible to turn it on. If you really must have something like it, a Cassette pass that disables bounds checking in a particular region of code could achieve any residual benefits without being as massive a footgun.

Worse, turning on `--check-bounds=no` on current master can actually result in significantly worse performance, because it removes inference's ability to do concrete evaluation (constant folding).

Note that we can make the option a no-op in a minor release, since throwing a `BoundsError` is allowable undefined behavior.

EDIT: note for future readers that the recommended replacement is to add `@inbounds` to code that `@profile` analysis shows to benefit from this flag.