Decide on (Cleartext) Integer Semantics for HEIR's Python Frontend #1252

Open
AlexanderViand-Intel opened this issue Jan 9, 2025 · 0 comments
Labels
python Pull requests that update Python code

Comments

@AlexanderViand-Intel
Collaborator

Python famously has arbitrary-precision integers, though "interesting" programs often use np.int64 or something similar anyway. This raises the question of how a basic a + b statement between two integers should be typed.

This of course assumes we know the types of a and b, either from an annotation (ahead-of-time compilation) or because we have their actual runtime values (JIT compilation).

There are, afaik, basically three different approaches we can take:

  1. "Pythonic Ideal": Track the bitwidth required to represent the result perfectly: 16-bit + 16-bit is 17-bit (or 32-bit, if rounded up to next power of two), 16-bit * 16-bit is 32-bit, etc.
  2. "Numba NBEP1": Numba at some point made the decision to basically upcast smaller types to the machine type, so int16 + int16 is actually int64 (at least on a 64-bit machine). See https://numba.readthedocs.io/en/stable/proposals/integer-typing.html. Note that Numba does not do this for arrays, so array(int32) + array(int32) is still array(int32).
  3. "MLIR / overflow=none": This is how we currently use the arith dialect. Adding/Multiplying two values of type i32 still results in an i32. Overflow is essentially considered "Undefined Behavior" and the compiler simply says "not my problem".
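To make the three options concrete, here is a minimal sketch of the result bitwidth each one would assign to a signed integer binary operation. The function and policy names (`result_width`, `"exact"`, `"machine"`, `"same"`) are made up for illustration and are not part of HEIR, Numba, or MLIR; the `"machine"` rule is a simplification of NBEP 1 that ignores signedness mixing.

```python
def next_power_of_two(n: int) -> int:
    """Smallest power of two that is >= n (for n >= 1)."""
    return 1 << (n - 1).bit_length()


def result_width(op: str, lhs_bits: int, rhs_bits: int,
                 policy: str, machine_bits: int = 64) -> int:
    """Bitwidth of `lhs op rhs` for signed integers under a typing policy.

    Hypothetical helper for illustration only.
    """
    widest = max(lhs_bits, rhs_bits)
    if policy == "exact":  # Option 1: track the width needed for the exact result
        if op == "+":
            return widest + 1
        if op == "*":
            return lhs_bits + rhs_bits
        raise ValueError(f"unsupported op: {op}")
    if policy == "machine":  # Option 2: small scalars snap to the machine word
        return max(widest, machine_bits)
    if policy == "same":  # Option 3: result type equals the operand type
        if lhs_bits != rhs_bits:
            raise TypeError("operands must have the same type")
        return lhs_bits
    raise ValueError(f"unknown policy: {policy}")


for policy in ("exact", "machine", "same"):
    print(policy, result_width("+", 16, 16, policy), result_width("*", 16, 16, policy))
```

Under the `"exact"` policy, rounding up with `next_power_of_two(17)` gives the 32-bit variant mentioned in Option 1.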

In discussions about this so far, we've always gone with Option 3, as it's by far the easiest to deal with for arithmetic FHE, where we need to impose a fixed plaintext modulus. While Option 1 has some appeal, I think the costs of trying to handle this far outweigh the benefits. Finally, while I understand why Numba chose to "snap to pointer size", I don't think adopting this makes sense for us.
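As a rough illustration of why a fixed plaintext modulus fits a fixed-width result type, the sketch below emulates cleartext arithmetic modulo a plaintext modulus `t`, with results mapped to the centered range around zero. The value `t = 65537` and the helper names are assumptions chosen for the example, not something HEIR prescribes.

```python
T = 65537  # illustrative plaintext modulus (an assumption for this sketch)


def centered_mod(value: int, modulus: int) -> int:
    """Reduce value modulo `modulus` into the centered range around zero."""
    r = value % modulus
    return r - modulus if r > modulus // 2 else r


def add_mod(a: int, b: int, t: int = T) -> int:
    """Addition as it would behave on plaintexts living in Z_t."""
    return centered_mod(a + b, t)


def mul_mod(a: int, b: int, t: int = T) -> int:
    """Multiplication as it would behave on plaintexts living in Z_t."""
    return centered_mod(a * b, t)


print(add_mod(100, 200))      # small values behave like ordinary integers
print(add_mod(30000, 30000))  # the exact sum exceeds t // 2 and wraps around
```

Whatever width the frontend assigns to the result, the value that comes back is determined by `t`, so widening the result type as in Option 1 or 2 does not by itself prevent wraparound.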

If there's consensus on this (or at least no active outcries against), I propose we go for option 3. However, that poses a bit of an issue with just using Numba's Type Inference out of the box:

Python code to see Numba Type Inference in action
from numba.core.registry import cpu_target
from numba.core import compiler, sigutils
from numba.core.typed_passes import type_inference_stage

# Define a test function
def example_function(x, y):
    z = x + y
    return z

sig_string = "int16(int16, int16)"

test_ir = compiler.run_frontend(example_function)
typingctx = cpu_target.typing_context
targetctx = cpu_target.target_context
typingctx.refresh()
targetctx.refresh()

fn_args, fn_retty = sigutils.normalize_signature(sig_string)
typing_res = type_inference_stage(typingctx, targetctx, test_ir, fn_args,
                                    None)

# Get inferred types
typemap = typing_res.typemap
for var, typ in typemap.items():
    print(f"Variable: {var}, Type: {typ}")

# Variable: arg.x, Type: int16
# Variable: arg.y, Type: int16
# Variable: x, Type: int16
# Variable: y, Type: int16
# Variable: z, Type: int64
# Variable: $16return_value.4, Type: int64

We can get around this by either (a) forking Numba and patching its typing rules, which is hacky but is what I did for testing (the culprit is integer_binop_cases in numba.core.typing.builtins), (b) not relying on Numba type inference at all, or (c) adding our own custom integer types via Numba extensions. We probably need to do some of this anyway, because Numba treats all array types as dynamically sized and we probably want statically known shapes.
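For a sense of what (b) could look like, here is a toy type inference pass built only on Python's standard `ast` module that applies the Option 3 rule (the result type equals the operand type). It is a sketch under strong assumptions: the function name `infer_types` is hypothetical, types are read from parameter annotations as plain names, and only simple assignments, returns, and `+`, `-`, `*` are handled.

```python
import ast


def infer_types(source: str) -> dict[str, str]:
    """Infer a type name for every variable of one annotated function."""
    func = ast.parse(source).body[0]
    if not isinstance(func, ast.FunctionDef):
        raise TypeError("expected a function definition")

    # Seed the environment from the parameter annotations (e.g. `x: int16`).
    env: dict[str, str] = {}
    for arg in func.args.args:
        if not isinstance(arg.annotation, ast.Name):
            raise TypeError(f"parameter {arg.arg!r} needs a simple annotation")
        env[arg.arg] = arg.annotation.id

    def type_of(node: ast.expr) -> str:
        if isinstance(node, ast.Name):
            return env[node.id]
        if isinstance(node, ast.BinOp) and isinstance(
            node.op, (ast.Add, ast.Sub, ast.Mult)
        ):
            lhs, rhs = type_of(node.left), type_of(node.right)
            if lhs != rhs:
                raise TypeError(f"mismatched operand types: {lhs} vs {rhs}")
            return lhs  # Option 3: the result keeps the operand type
        raise NotImplementedError(ast.dump(node))

    for stmt in func.body:
        if (isinstance(stmt, ast.Assign) and len(stmt.targets) == 1
                and isinstance(stmt.targets[0], ast.Name)):
            env[stmt.targets[0].id] = type_of(stmt.value)
        elif isinstance(stmt, ast.Return) and stmt.value is not None:
            env["return"] = type_of(stmt.value)
        else:
            raise NotImplementedError(ast.dump(stmt))
    return env


SOURCE = """
def example_function(x: int16, y: int16):
    z = x + y
    return z
"""

for var, typ in infer_types(SOURCE).items():
    print(f"Variable: {var}, Type: {typ}")
```

Unlike the Numba run above, `z` and the return value stay `int16` here. A real frontend would still need control flow, constants, calls, and statically shaped arrays, which this sketch leaves out.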

@AlexanderViand-Intel AlexanderViand-Intel added the python Pull requests that update Python code label Jan 9, 2025