[Refactor] Remove scope attribute from Buffer class #8463
Thanks @masahi. Sorry for being late on this. I think it is worthwhile to think a bit more. In general, there are two kinds of information that we normally carry throughout the program:
The Buffer object was intended for the declaration site information. There are in general two kinds of design choices here:
Right now this PR follows C0 partially. I wonder if we should also consider the choice between C0 and C1. In the particular case of Buffer, I feel that we should give C1 a serious consideration. We can also use the text format to illustrate the potential differences between the two.
# Choice C0: the information was already duplicated on the lhs, so do not present on rhs
ptr : Pointer[float32, "gpu"] = allocate_buffer(32)
# Choice C1:
ptr : Pointer[float32, "gpu"] = allocate_buffer(32, "float32", "gpu")
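The two choices can be sketched with a toy Python model. This is only an illustration: `Pointer`, `Allocation`, `allocate_buffer`, and `check_consistent` are hypothetical names invented here and are not TVM APIs. Under C0 the allocation carries only the extent, so there is nothing to cross-check. Under C1 the allocation repeats the dtype and scope, so a consistency check against the lhs type becomes necessary.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Pointer:
    """Toy stand-in for the pointer type annotation on the lhs (hypothetical)."""
    dtype: str
    scope: str


@dataclass(frozen=True)
class Allocation:
    """Toy stand-in for the rhs allocation; dtype and scope stay None under C0."""
    extent: int
    dtype: Optional[str] = None
    scope: Optional[str] = None


def allocate_buffer(extent, dtype=None, scope=None):
    """Hypothetical helper mirroring the text-format examples above."""
    return Allocation(extent, dtype, scope)


def check_consistent(lhs: Pointer, rhs: Allocation) -> bool:
    """Return True when every field duplicated on the rhs agrees with the lhs.

    Under C0 the rhs carries no dtype or scope, so the check passes trivially.
    """
    dtype_ok = rhs.dtype is None or rhs.dtype == lhs.dtype
    scope_ok = rhs.scope is None or rhs.scope == lhs.scope
    return dtype_ok and scope_ok


ptr_type = Pointer("float32", "gpu")
c0 = allocate_buffer(32)                           # Choice C0
c1 = allocate_buffer(32, "float32", "gpu")         # Choice C1, consistent
c1_bad = allocate_buffer(32, "float32", "shared")  # Choice C1, inconsistent
```

In this sketch the cost of C1 is visible as the extra check: the duplicated fields are only useful if something verifies that they match the type on the lhs.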
cc @jroesch @junrushao1994 to also share some thoughts here.
@tqchen is there more context on what the design/end goal is?
related context PR #8366
@jroesch I've updated the PR description with more details.
@tqchen wrt C0 vs. C1, is there any world in which these would differ? Is the argument that you might not always have the type/lhs information in hand when you need to analyze the call site?
I think if duplicated pieces of information are supposed to be consistent, then they are already not independent. So I don't see an advantage in keeping track of two copies of essentially the same information. To @jroesch's question, right now our code base uses two ways to create
One possible middle ground is to keep |
To clarify, we are moving toward a world where the type annotation in the lhs is always available. The main thing we want to decide is whether to remove the additional info from the rhs. Note that if it were the other way around (keep the info in the rhs and remove the lhs info) it would be less controversial. But in this case we want the info in the lhs so it is available at future reference points. This asymmetry arises because we normally assume that information flows from rhs to lhs in the TIR. It can be a bit weird to infer the allocation type from the type of the pointer that holds the allocation; of course they should be made consistent. I just want us to think carefully and make such choices consistently.
Another consideration is if the Type of a Buffer's Var is not a PointerType, e.g. PrimType, having the scope on the rhs could be necessary. Are there examples of this occurring / do we envision the need?
We should not duplicate information in the IR. The flow of information from rhs to lhs is not necessarily the right way to frame it. In an assignment "a = b", there is information present both in a and in b, and the assignment has its own meaning as a whole. Depending on what kind of analysis we want to do, the inference of information can flow either way.
@csullivan As of #8366, all buffer vars should be of pointer type. If that doesn't hold, I'd consider it a bug. This PR has a check (lines 316 to 317 in 2128bd4).
I'd say if we agree that having storage scope information in the type is a good idea, then we should exploit this information, even if that ends up going from left to right. If we want to strictly make the rhs (Buffer declaration) the source of truth, I think we might need to revisit the decision of putting storage information in the pointer type.
Thanks everyone for sharing your thoughts. First of all, I think we all agree that we should put the storage scope in the type, so that the information can flow clearly from the declaration site to the use site. On the other hand, there can be cases where duplicated information appears. Say, in the following two assignments, b's type annotation is duplicated because it can be inferred from a, but it can nevertheless appear in the IR as long as we have clear consistency checks.
a : int = some_value()
b : int = a
That is why I brought up the distinction between C0 and C1. As @kparzysz-quic said, in this particular case the argument can also go the other way if we view the Buffer as the assignment (declaration) as a whole. So if folks feel strongly that the scope information can be removed, I am not too attached to it. I think we should consider more seriously though if it is about the
Trying to move this convo forward and conclude:
If we all agree, we can go ahead and merge this PR.
Thanks @masahi @jroesch @kparzysz-quic
Co-authored-by: masa <[email protected]>
A follow up to #8366. Right now, storage scope information is spread across three components:
- AttrStmt with attr::storage_scope key
- PointerType
- Buffer class (tvm/include/tvm/tir/buffer.h, lines 70 to 71 in 2cca934)
After #8366, the storage scopes associated with AttrStmt and PointerType are identical. To consolidate storage scope information into one place, I'm proposing to remove the storage scope in AttrStmt and in the Buffer class.
This PR is for the latter refactoring. I removed the scope data member from the Buffer class and added an alternative way to access the storage scope through its associated buffer variable.
@tqchen @vinx13 @kparzysz-quic @csullivan
also cc @Hzfengsy since the removed field is only used by TensorIR related code
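As a rough illustration of the refactoring described above, the following Python sketch models a buffer whose scope is derived from the pointer type of its data variable instead of being stored as a separate field. The class and method names are hypothetical stand-ins chosen for this example, not the actual TVM definitions, and treating an empty scope as "global" is an assumption of the sketch.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PrimType:
    """Toy primitive type, e.g. float32 (hypothetical stand-in)."""
    dtype: str


@dataclass(frozen=True)
class PointerType:
    """Toy pointer type that carries the storage scope."""
    element_type: PrimType
    storage_scope: str = ""


@dataclass(frozen=True)
class Var:
    """Toy variable with a type annotation."""
    name: str
    type_annotation: object


@dataclass(frozen=True)
class Buffer:
    """Toy buffer with no scope field of its own."""
    data: Var
    shape: tuple

    def scope(self) -> str:
        """Look up the storage scope through the buffer variable's type."""
        ty = self.data.type_annotation
        if not isinstance(ty, PointerType):
            # Mirrors the expectation that all buffer vars are of pointer type.
            raise TypeError(f"buffer var {self.data.name} is not of pointer type")
        # Assumption of this sketch: an empty scope means "global".
        return ty.storage_scope or "global"


shared_buf = Buffer(Var("A", PointerType(PrimType("float32"), "shared")), (32,))
default_buf = Buffer(Var("B", PointerType(PrimType("float32"))), (32,))
```

With this shape there is a single source of truth for the scope: changing the pointer type of the buffer variable changes what `scope()` reports, and no second field can drift out of sync.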