Broadcasting dimension seems counter-intuitive #927

CatchemAL · 2017-09-30T08:18:54Z

Is my intuition failing me here?

[Test]
public void BroadcastingTest()
{
    double[,] matrix = Matrix.Zeros(rows: 100, columns: 3);
    double[] vec = { 1, 2, 3 };

    double[,] broadcasted1 = matrix.Add(vec, dimension: 0); // works!     :O
    double[,] broadcasted2 = matrix.Add(vec, dimension: 1); // fails!     :\
}

Certainly took me by surprise. It's not a massive issue if you get an IndexOutOfRangeException but it's a lot harder to spot on square matrices.

Thanks,
A

ps - Perhaps my intuition agrees with yours (#902)?

The text was updated successfully, but these errors were encountered:

cesarsouza · 2017-09-30T09:39:02Z

Hi @AlexJCross,

Thanks for raising this issue! This is actually something that I always wanted to double-check with another first time user of the elementwise operations. I've been exposed for those methods for so long, that I kind of lost my intuition of what a first time user would expect them to mean.

I understand that the "dimension" parameter in the elementwise operations is not intuitive, but I am not sure it is counter-intuitive either. If I didn't know how it worked already, I think it also would not be immediate to me why it would have worked with dimension: 1 and failed with dimension: 0.

The current behavior of those methods is based on the following behavior in Python:

In [3]: matrix = np.zeros((100, 3))

In [4]: vec = np.array([1,2,3])

In [5]: result = np.add(matrix, np.expand_dims(vec, axis=0))

In [6]: result = np.add(matrix, np.expand_dims(vec, axis=1))
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-6-e3923ebe2c5e> in <module>()
----> 1 result = np.add(matrix, np.expand_dims(vec, axis=1))

ValueError: operands could not be broadcast together with shapes (100,3) (3,1)

However, I see that the documentation is certainly not helping, since the documentation for the "dimension" parameter is plain wrong in some cases. The real documentation should have been something along the lines of (let me know if you have ideas to improve it):

The dimension along which the operation will be carried. If set to 0, then the vector b must have the same dimensions as rows of A, and the operation will be carried at each column vector. If 1, then it must have the same dimensions as columns in A, and the operation will be carried at each row vector.

Please let me know how did you interpret the dimension parameter at a first look, and if you have any ideas to improve this.

Thanks a lot!
Cesar

CatchemAL · 2017-10-01T10:31:37Z

Hey @cesarsouza,

Thanks very much for the response. I agree that the issue is not clear cut!

Please let me know how did you interpret the dimension parameter at a first look, and if you have any ideas to improve this.

My (initial) interpretation was that dimension refers to the axis along which the vector should lie (as opposed to the expand dimension). I asked my gf this morning - she programs python - and she independently went for dimension: 1. I must admit, I never even considered the possibility of an ambiguity when you first told me about the method (although you did also tell me dimension: 1 in #902 😜 which is why I thought I'd check it with you).

However, thinking about it this morning, I've decided my thinking was a bit woolly and doesn't readily generalise to higher dimensions. If I were to sum() a tensor of order 5 along axes x0, x1, x3, x4 I would expect to collapse the tensor to a vector whose axis lies along x2. By that logic, I should probably expect to specify the four expansion axes when broadcasting a vector from x2 (and not the axis the vector simply lies on). Equally, if I were adding a matrix to a 3D tensor, it feels natural to specify the remaining (expansion) axis. So yes, I think what you have gone for is the most sensible after all.

Agree the documentation not being wrong would help (although I only went to read it once my code bombed out). To my mind, the clearest thing to do is to be more explicit that dimension refers to the axis that should be expanded:

`double[,] broadcasted1 = matrix.Add(vec, expandDimension: 0); // preferred`

Numpy does exactly that and it makes it a lot clearer I think: np.expand_dims(...). Alternatively, I suppose you could do something like this (which is very clear) but doesn't generalise so well.

`double[,] broadcasted1 = matrix.Add(vec, isRowVector: true);`

Regarding the proposed documentation...

The dimension along which the operation will be carried. If set to 0, then the vector b must have the same dimensions as rows of A, and the operation will be carried at each column vector. If 1, then it must have the same dimensions as columns in A, and the operation will be carried at each row vector.

I'm sorry but I'm afraid I don't understand this. Are you proposing changing the current behaviour? If currently set to 0, the vector b must have the same dimensions as columns of A right? Apologies if I have completely misunderstood!

Best,
Alex

cesarsouza · 2017-10-01T12:28:14Z

Hi @AlexJCross,

Regarding the proposed documentation...

The dimension along which the operation will be carried. If set to 0, then the vector b must have the same dimensions as rows of A, and the operation will be carried at each column vector. If 1, then it must have the same dimensions as columns in A, and the operation will be carried at each row vector.
I'm sorry but I'm afraid I don't understand this. Are you proposing changing the current behaviour? If currently set to 0, the vector b must have the same dimensions as columns of A right? Apologies if I have completely misunderstood!

I am sorry. As you see, this was confusing even for me and I switched the axes in that proposal! I guess that's a sign that my brain keeps telling me that the inverted approach is better (#902), and with two independent sources also reporting that dimension: 1 is more intuitive, maybe it would be more sensible to invert the axes after all. The problem is doing so without breaking everything...

I guess your suggestion for the boolean parameter is indeed the clearest (thanks!), and the fact that it uses a different method signature is a big plus because it would allow us to deprecate the previous methods instead of changing their behavior abruptly.

Then maybe what we could consider is to add a new VectorType enumeration that can be either Row or Column, add new Add(double[,] m, double[] vec, VectorType dimension) overloads to the elementwise methods, and mark Add(double[,] m, double[] vec, int dimension) as deprecated. This would enable us to keep both versions working for a while. Then, when its time (let's say, the day the framework finally adds support for general tensors), we can finally write the correct Add(double[,] m, double[] vec, int dimension) with inverted axes w.r.t what we have today.

What do you think?

Thanks a lot for the feedback on this!

Regards,
Cesar

- Updates GH-927: Broadcasting dimension seems counter-intuitive

- Adding new overloads for element-wise operations that accept a VectorType enumeration instead of a raw integer; - Marking the previous methods as obsolete.

cesarsouza · 2017-10-21T10:42:01Z

Added in 3.8.0.

CatchemAL added the question label Sep 30, 2017

cesarsouza self-assigned this Oct 2, 2017

cesarsouza added a commit that referenced this issue Oct 3, 2017

Correcting documentation a little bit (still open to suggestions)

78ffeb3

- Updates GH-927: Broadcasting dimension seems counter-intuitive

cesarsouza added the pending-release label Oct 4, 2017

cesarsouza added a commit that referenced this issue Oct 4, 2017

GH-927: Broadcasting dimension seems counter-intuitive

a38b463

- Adding new overloads for element-wise operations that accept a VectorType enumeration instead of a raw integer; - Marking the previous methods as obsolete.

cesarsouza closed this as completed Oct 21, 2017

cesarsouza removed the pending-release label Oct 21, 2017

cesarsouza added this to the 3.8 milestone Oct 21, 2017

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Broadcasting dimension seems counter-intuitive #927

Broadcasting dimension seems counter-intuitive #927

CatchemAL commented Sep 30, 2017

cesarsouza commented Sep 30, 2017 •

edited

Loading

CatchemAL commented Oct 1, 2017

cesarsouza commented Oct 1, 2017

cesarsouza commented Oct 21, 2017

Broadcasting dimension seems counter-intuitive #927

Broadcasting dimension seems counter-intuitive #927

Comments

CatchemAL commented Sep 30, 2017

cesarsouza commented Sep 30, 2017 • edited Loading

CatchemAL commented Oct 1, 2017

cesarsouza commented Oct 1, 2017

cesarsouza commented Oct 21, 2017

cesarsouza commented Sep 30, 2017 •

edited

Loading