Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Remove element wise operations of exp and gamma to prevent confusion #2440

Open
gwhitney opened this issue Feb 22, 2022 · 66 comments · Fixed by #2465
Open

Remove element wise operations of exp and gamma to prevent confusion #2440

gwhitney opened this issue Feb 22, 2022 · 66 comments · Fixed by #2465

Comments

@gwhitney
Copy link
Collaborator

There is a standard meaning for the Gamma function applied to a square matrix (see https://arxiv.org/abs/1806.10554), which is not the Gamma function applied componentwise, like the current behavior of gamma(M) in mathjs. Also, the componentwise application already has another very short expression, namely map(M, gamma).

To avoid potential confusion, should the current matrix interpretation of gamma(M) as componentwise application be eliminated, requiring the use of map(M, gamma)? That way, no one would accidentally call gamma on a square matrix thinking they were getting the usual matrix gamma function. (And if anyone ever chose to implement the usual matrix gamma, it could either be overloaded onto the gamma symbol, or called gammam by analogy with expm.)

In fact, the analogy with exp is very strong. The usual mathematical meaning of "exp" of a (square) matrix is not componentwise exponentiation, but rather the matrix exponential. Clearly a specific decision was made to call that operation expm in mathjs, and I am not suggesting that decision be revisited. But I would suggest that the componentwise meaning of exp(M) be disabled and the documentation of exp be revised to point to expm for the matrix exponential and map(M, exp) for componentwise exponentiation. Again, the motivation here is to avoid the potential confusion of someone calling exp(M) thinking they were getting expm(M).

Failing that, I'd suggest amplifying the documentation of exp to make it more prominent that exp(M) is not the matrix exponential.

In any case, perhaps it would be good to make a decision at least on whether gamma should continue to apply componentwise to matrices before acting on PR #2416 (which is the proximate cause of my looking into the semantics of gamma and raising this issue).

@josdejong
Copy link
Owner

Thanks for bringing this up. I wasn't aware of that, but it makes sense to remove the matrix support from both gamma and exp to prevent confusion.

This would be a breaking change, so we will need to publish it as a new major version, which is fine (that is what the version numbering is for).

@josdejong josdejong changed the title Matrix functions? Remove element wise operations of exp and gamma to prevent confusion Mar 1, 2022
@gwhitney
Copy link
Collaborator Author

gwhitney commented Mar 2, 2022

OK. Is there any particular protocol then with respect to a PR that makes this change, and/or any particular timing? Other interface-breaking items that are queued up that need to get in around the same time? Just let me know and I can make a PR for this; and also the PR in progress for lgamma (#2417) should be changed to not apply lgamma to every element of an array, so that we don't introduce another instance of this same sort of thing.

@josdejong
Copy link
Owner

josdejong commented Mar 4, 2022

There is not a specific protocol for breaking changes. If there are multiple breaking changes it's of course nice to bundle them in one breaking release. So in this case we could bundle it with the breaking change in #2370 for example.

@josdejong
Copy link
Owner

also the PR in progress for lgamma (#2417) should be changed to not apply lgamma to every element of an array, so that we don't introduce another instance of this same sort of thing

good point!

@gwhitney
Copy link
Collaborator Author

gwhitney commented Mar 5, 2022

OK, I will put in a PR right away to remove the elementwise operation from gamma and exp so that you have flexibility for when to merge and generate a new major version. (If it sits for a bit because you want to wait on a major version bump, I totally understand.)

@gwhitney
Copy link
Collaborator Author

gwhitney commented Mar 5, 2022

When I looked up the matrix exponential, it referred to the matrix logarithm and hyperbolic sine, which in turn referred to the matrix sine, etc. Basically any function defined by a simple power series has an analogue for square matrices that provides the usual definition of that function on square matrices. So the PR will eliminate all of them. If you want to draw the line somewhere else, just let me know and I will revise.

Incidentally, the current typing of say asin for TypeScript is that given a number it may return a Complex, unless config.predictable is true. Since predictable defaults to false, should I change the TypeScript typing of asin(x: number) to be number | Complex ?

gwhitney added a commit to gwhitney/mathjs that referenced this issue Mar 5, 2022
  Prior to this commit, many functions operated elementwise on matrices
  even though in standard mathematical usage they have a different
  meaning on square matrices. Since the elementwise operation is easily
  recoverable using `math.map`, this commit removes the elementwise
  operation on arrays and matrices from these functions.
  Affected functions include all trigonometric functions, exp, log, gamma,
  square, sqrt, cube, and cbrt.
  Resolves josdejong#2440.
@josdejong
Copy link
Owner

josdejong commented Mar 8, 2022

Incidentally, the current typing of say asin for TypeScript is that given a number it may return a Complex, unless config.predictable is true. Since predictable defaults to false, should I change the TypeScript typing of asin(x: number) to be number | Complex ?

Yes the correct TypeScript definition is number | Complex there (for example sqrt has something similar in that regard).

@gwhitney
Copy link
Collaborator Author

gwhitney commented Mar 9, 2022

Yes, I guessed that's how you would respond, so in fact this change in typing is already in #2465.

@josdejong
Copy link
Owner

👍

gwhitney added a commit to gwhitney/mathjs that referenced this issue Mar 15, 2022
  Prior to this commit, many functions operated elementwise on matrices
  even though in standard mathematical usage they have a different
  meaning on square matrices. Since the elementwise operation is easily
  recoverable using `math.map`, this commit removes the elementwise
  operation on arrays and matrices from these functions.
  Affected functions include all trigonometric functions, exp, log, gamma,
  square, sqrt, cube, and cbrt.
  Resolves josdejong#2440.
josdejong pushed a commit that referenced this issue Mar 17, 2022
* chore: Prevent consfusion with standard matrix functions.

  Prior to this commit, many functions operated elementwise on matrices
  even though in standard mathematical usage they have a different
  meaning on square matrices. Since the elementwise operation is easily
  recoverable using `math.map`, this commit removes the elementwise
  operation on arrays and matrices from these functions.
  Affected functions include all trigonometric functions, exp, log, gamma,
  square, sqrt, cube, and cbrt.
  Resolves #2440.

* chore(typescript): Revise usages in light of changes

  sqrt() is now correctly typed as `number | Complex` and so must
  be explicitly cast to number when called on a positive and used
  where a Complex is disallowed; sqrt() no longer applies to matrices
  at all.

* feat: Provide better error messages for v10 -> v11 transition

  Uses new `typed.onMismatch` handler so that matrix calls that used to
  work will suggest a replacement.
gwhitney added a commit that referenced this issue Apr 13, 2022
* chore: Prevent consfusion with standard matrix functions.

  Prior to this commit, many functions operated elementwise on matrices
  even though in standard mathematical usage they have a different
  meaning on square matrices. Since the elementwise operation is easily
  recoverable using `math.map`, this commit removes the elementwise
  operation on arrays and matrices from these functions.
  Affected functions include all trigonometric functions, exp, log, gamma,
  square, sqrt, cube, and cbrt.
  Resolves #2440.

* chore(typescript): Revise usages in light of changes

  sqrt() is now correctly typed as `number | Complex` and so must
  be explicitly cast to number when called on a positive and used
  where a Complex is disallowed; sqrt() no longer applies to matrices
  at all.

* feat: Provide better error messages for v10 -> v11 transition

  Uses new `typed.onMismatch` handler so that matrix calls that used to
  work will suggest a replacement.
gwhitney added a commit that referenced this issue Apr 30, 2022
* chore: Prevent consfusion with standard matrix functions.

  Prior to this commit, many functions operated elementwise on matrices
  even though in standard mathematical usage they have a different
  meaning on square matrices. Since the elementwise operation is easily
  recoverable using `math.map`, this commit removes the elementwise
  operation on arrays and matrices from these functions.
  Affected functions include all trigonometric functions, exp, log, gamma,
  square, sqrt, cube, and cbrt.
  Resolves #2440.

* chore(typescript): Revise usages in light of changes

  sqrt() is now correctly typed as `number | Complex` and so must
  be explicitly cast to number when called on a positive and used
  where a Complex is disallowed; sqrt() no longer applies to matrices
  at all.

* feat: Provide better error messages for v10 -> v11 transition

  Uses new `typed.onMismatch` handler so that matrix calls that used to
  work will suggest a replacement.
gwhitney added a commit that referenced this issue May 12, 2022
* chore: Prevent consfusion with standard matrix functions.

  Prior to this commit, many functions operated elementwise on matrices
  even though in standard mathematical usage they have a different
  meaning on square matrices. Since the elementwise operation is easily
  recoverable using `math.map`, this commit removes the elementwise
  operation on arrays and matrices from these functions.
  Affected functions include all trigonometric functions, exp, log, gamma,
  square, sqrt, cube, and cbrt.
  Resolves #2440.

* chore(typescript): Revise usages in light of changes

  sqrt() is now correctly typed as `number | Complex` and so must
  be explicitly cast to number when called on a positive and used
  where a Complex is disallowed; sqrt() no longer applies to matrices
  at all.

* feat: Provide better error messages for v10 -> v11 transition

  Uses new `typed.onMismatch` handler so that matrix calls that used to
  work will suggest a replacement.
gwhitney added a commit that referenced this issue May 31, 2022
* chore: Prevent consfusion with standard matrix functions.

  Prior to this commit, many functions operated elementwise on matrices
  even though in standard mathematical usage they have a different
  meaning on square matrices. Since the elementwise operation is easily
  recoverable using `math.map`, this commit removes the elementwise
  operation on arrays and matrices from these functions.
  Affected functions include all trigonometric functions, exp, log, gamma,
  square, sqrt, cube, and cbrt.
  Resolves #2440.

* chore(typescript): Revise usages in light of changes

  sqrt() is now correctly typed as `number | Complex` and so must
  be explicitly cast to number when called on a positive and used
  where a Complex is disallowed; sqrt() no longer applies to matrices
  at all.

* feat: Provide better error messages for v10 -> v11 transition

  Uses new `typed.onMismatch` handler so that matrix calls that used to
  work will suggest a replacement.
gwhitney added a commit that referenced this issue May 31, 2022
* chore: Prevent consfusion with standard matrix functions.

  Prior to this commit, many functions operated elementwise on matrices
  even though in standard mathematical usage they have a different
  meaning on square matrices. Since the elementwise operation is easily
  recoverable using `math.map`, this commit removes the elementwise
  operation on arrays and matrices from these functions.
  Affected functions include all trigonometric functions, exp, log, gamma,
  square, sqrt, cube, and cbrt.
  Resolves #2440.

* chore(typescript): Revise usages in light of changes

  sqrt() is now correctly typed as `number | Complex` and so must
  be explicitly cast to number when called on a positive and used
  where a Complex is disallowed; sqrt() no longer applies to matrices
  at all.

* feat: Provide better error messages for v10 -> v11 transition

  Uses new `typed.onMismatch` handler so that matrix calls that used to
  work will suggest a replacement.
gwhitney added a commit that referenced this issue Jul 13, 2022
* chore: Prevent consfusion with standard matrix functions.

  Prior to this commit, many functions operated elementwise on matrices
  even though in standard mathematical usage they have a different
  meaning on square matrices. Since the elementwise operation is easily
  recoverable using `math.map`, this commit removes the elementwise
  operation on arrays and matrices from these functions.
  Affected functions include all trigonometric functions, exp, log, gamma,
  square, sqrt, cube, and cbrt.
  Resolves #2440.

* chore(typescript): Revise usages in light of changes

  sqrt() is now correctly typed as `number | Complex` and so must
  be explicitly cast to number when called on a positive and used
  where a Complex is disallowed; sqrt() no longer applies to matrices
  at all.

* feat: Provide better error messages for v10 -> v11 transition

  Uses new `typed.onMismatch` handler so that matrix calls that used to
  work will suggest a replacement.
gwhitney added a commit that referenced this issue Jul 15, 2022
* chore: Prevent consfusion with standard matrix functions.

  Prior to this commit, many functions operated elementwise on matrices
  even though in standard mathematical usage they have a different
  meaning on square matrices. Since the elementwise operation is easily
  recoverable using `math.map`, this commit removes the elementwise
  operation on arrays and matrices from these functions.
  Affected functions include all trigonometric functions, exp, log, gamma,
  square, sqrt, cube, and cbrt.
  Resolves #2440.

* chore(typescript): Revise usages in light of changes

  sqrt() is now correctly typed as `number | Complex` and so must
  be explicitly cast to number when called on a positive and used
  where a Complex is disallowed; sqrt() no longer applies to matrices
  at all.

* feat: Provide better error messages for v10 -> v11 transition

  Uses new `typed.onMismatch` handler so that matrix calls that used to
  work will suggest a replacement.
gwhitney added a commit that referenced this issue Jul 16, 2022
* chore: Prevent consfusion with standard matrix functions.

  Prior to this commit, many functions operated elementwise on matrices
  even though in standard mathematical usage they have a different
  meaning on square matrices. Since the elementwise operation is easily
  recoverable using `math.map`, this commit removes the elementwise
  operation on arrays and matrices from these functions.
  Affected functions include all trigonometric functions, exp, log, gamma,
  square, sqrt, cube, and cbrt.
  Resolves #2440.

* chore(typescript): Revise usages in light of changes

  sqrt() is now correctly typed as `number | Complex` and so must
  be explicitly cast to number when called on a positive and used
  where a Complex is disallowed; sqrt() no longer applies to matrices
  at all.

* feat: Provide better error messages for v10 -> v11 transition

  Uses new `typed.onMismatch` handler so that matrix calls that used to
  work will suggest a replacement.
josdejong pushed a commit that referenced this issue Jul 18, 2022
…ter (#2559)

* chore: Prevent confusion with standard matrix functions. (#2465)

* chore: Prevent consfusion with standard matrix functions.

  Prior to this commit, many functions operated elementwise on matrices
  even though in standard mathematical usage they have a different
  meaning on square matrices. Since the elementwise operation is easily
  recoverable using `math.map`, this commit removes the elementwise
  operation on arrays and matrices from these functions.
  Affected functions include all trigonometric functions, exp, log, gamma,
  square, sqrt, cube, and cbrt.
  Resolves #2440.

* chore(typescript): Revise usages in light of changes

  sqrt() is now correctly typed as `number | Complex` and so must
  be explicitly cast to number when called on a positive and used
  where a Complex is disallowed; sqrt() no longer applies to matrices
  at all.

* feat: Provide better error messages for v10 -> v11 transition

  Uses new `typed.onMismatch` handler so that matrix calls that used to
  work will suggest a replacement.

* fix: prevent chain from matching rest parameter with stored value

  Since the revised code needs the isTypedFunction predicate, switch to using
  the typed-function implementation for that throughout mathjs, rather than
  rolling our own here.

  Also adds a test that chain() no longer allows this kind of usage.

  Removes the two type declarations in types/index.d.ts that were allowing
  this sort of "split rest" call and added tests that such calls are
  forbidden.

  Adds to the chaining documentation page that such "split" calls are not
  allowed.

* chore: Refresh this PR to reflect underlying changes

  Also addresses the review request with a detailed comment on the
  correctness of a new code section.

  Note that it reverts some changes to the TypeScript signatures of the
  matrix functions ones() and zeros() -- they do not actually have a
  typed-function signature of two numbers and an optional format
  specifically for two dimensions. What they have is a single rest
  parameter, from which the format is extracted if present.

  Hence, due to the ban on breaking rest parameters, it is not
  valid to call math.chain(3).zeros(2) to make a 3-by-2 matrix of zeros,
  which seems like a perfectly valid ban as the division of the dimensions
  is very confusing; this should be written as math.chain([3,2]).zeros().
  The TypeScript signatures are fixed accordingly, along with the edge
  case of no arguments to ones() and zeros() at all, which does work to
  produce the "empty matrix".
@dvd101x
Copy link
Collaborator

dvd101x commented Jan 7, 2023

Hi, first of all I’m constantly impressed with the capabilities of this library and fast pace of development.

I would like to discuss the topic of an alternative to this decision.

All relevant functions of math.js support matrices and arrays.

Source matrices

The benefit is that it allows for simpler evaluations. There are some examples of the workings of these functions in matlab and numpy using both scalars and N dimensional arrays.

As shown in the previous examples matlab and numpy can use exp and gamma in both ways.

I have not used the function gamma, but the array like capabilities of exp seems as important as the array capabilities of sin or cos

@gwhitney
Copy link
Collaborator Author

gwhitney commented Jan 8, 2023

Thanks for the references and the feedback. I don't see the matrix exponential in numpy, but I do see that matlab has exp() apply elementwise to matrices, and expm() for the matrix exponential. mathjs just makes a different choice, map(M, exp) to apply it elementwise, and expm() for the matrix exponential (if I am recalling correctly), so there is really no chance of being confused which one exp(M) does, since it doesn't do anything. And map(M, exp) is brief and very clear, so I am not seeing the problem you allude to? You can also do map(M, gamma) or map(M, sin) so it very easy to get array-like capabilities for any unary function, even one you write yourself.

@dvd101x
Copy link
Collaborator

dvd101x commented Jan 8, 2023

Thanks for reviewing this and for the research on expm( ) on other projects.

Even though numpy doesn't have expm( ), there is one in scipy.linalg.expm. The same happens for gamma( ) in scipy.special.gamma.

I recently encountered one case where exp( ) was expected to work for both scalars and arrays element wise. Thus a new function had to be done to work either way

# If the input is a scalar use exp(input), if it's an array use map(input, exp)
Exp(input) = equalText(typeOf(input), 'number') ? exp(input) : map(input, exp)

As you stated on other projects exp( ) works for scalars and element wise to matrices and expm( ) for the matrix exponential. I understand that mathjs doesn't have to be like other projects and can make different choices.

I see that the same change has been happening for these functions

  • exp( )
  • expm1( )
  • log( )
  • gamma( )
  • sqrt( )
  • cbrt( )
  • trigonometric

I did some quick tests on these functions with numpy/scipy and Matlab and it's the same difference for all. It's expected to work for scalars and element wise for matrices and arrays.

I understand this is to avoid some confusion with matrix operations, nonetheless I prefer the element wise option available on numpy/scipy and Matlab as it's very convenient to avoid some loops or maps.

I think I was writing my opinion too categorically, please note that I respect your decision and I will work around these functions that do not support matrices and arrays anymore.

Maybe a rephrasing of this statement is in order for the newer version

All relevant functions of math.js support matrices and arrays.

@josdejong
Copy link
Owner

Sorry for the confusion on square in my examples. I've renamed it to mySquare so it's clear we're not talking about the built-in square but just "some" unary function or custom function.

I think there is no need for introducing a new tuple kind of data structure to hold function arguments. I would like to avoid that because, like you explain, it brings a lot of complication.

We can handle arguments on the right hand side of the : operator directly inside the parser as function arguments, as part of the : operator itself. Similar to regular function calls, which have a pattern of a symbol followed by arguments. There is no need for tuple data structure there either. Basically, we should extend the parseRange code in the parser such that it handles more cases than the ranges we have now.

Implementation could roughly look something like this:

  • We implement a new function named elementwise(fn, arg1, arg2, arg3, ...), which evaluates as
    map(arg1, callback(x) => fn(x, arg2, arg3, ...)). So, elementwise(log, [1,2,3,4], 2) is equivalent to map([1,2,3,4], f(x) = log(x, 2)).

  • We implement a new node ElementWiseFunctionNode, which can hold a left hand side being a symbol or Node, and a right hand side being a list with function arguments. It is the same as FunctionNode, except that it invokes the new elementwise function on the function and arguments instead of invoking the function itself. Maybe we can extend/reuse FunctionNode for this, we will see.

  • In parseRange, after parsing the first :, we add a new condition which checks for ( being the next token. If so, we parse the next part as function arguments (maybe we can use/reuse parseAccessors there), and then we output an ElementWiseFunctionNode. This handles cases like:

    mySquare:(A)
    mySquare:(1:4)
    log:(1:4, 2)
    
  • In parseRange we now have a new cases when creating a new RangeNode(...). So far it supports start:end and start:step:end. But now, we can also have fn:start:end (like mySquare:1:4), fn:start:step:end (like mySquare:0:5:100) and fn:arg1 (like mySquare:A). We have to extend RangeNode and function range such that they can handle element wise function calls, so it can call elementwise(fn, range(...)). The function you're referring to in your point 3 I think is the existing range function. This will handles cases like:

    mySquare:A
    mySquare:1:4
    range(mySquare, 1, 4)
    

@gwhitney
Copy link
Collaborator Author

gwhitney commented Feb 3, 2023

OK, wow, I was not thinking about this clearly, sorry. If a proposal along these lines is adopted, then we want the meaning of foo:bar to depend on whether foo is a function or a number. But the parse of this expression can't depend on that, and it currently basically parses to range(foo, bar), so we would have to extend the semantics of the range function to cover the case of range(abs, [-3,7,-2]) doing elementwise application to produce [3,7,2]. That's a pretty huge expansion of the behavior of that function. Are you sure you are OK with that?

If so, then the fact that currently ranges accept optionally a third element as in start:step:end introduces wrinkles. One possibility is that we could use that notation for the binary, ternary, etc. function application cases. In other words, to take the log base 2 of each element of the vector V you would write log:V:2 rather than log:(V,2). So then range(func, arg1, arg2, ..., argk) would apply the k-ary function "func" to the arguments in turn generated by broadcasting all of arg1, arg2, etc, up to argk to the same shape. And range(number, number) and range(number, number, number) would behave as they currently do, and 3:4:5:6 would parse to range(3,4,5,6) and would be a runtime error. Note that in this scheme, if you wanted to use a range in one of the positions of an elementwise application, you'd have to parenthesize it; in other words for the log of 1 through 4 base 2, you'd have to write log:(1:4):2, and you couldn't write abs:-3:3 because that would end up trying to call abs(-3,3) which is not valid, you'd have to write abs:(-3:3).

If you don't like that for binary vectorization, then yes, log:(1:4,2) could be a special syntax, but as far as I can see it might as well just parse directly to range(log, 1:4, 2) and still have the semantics of that be as above -- I don't see any reason to create a new node type, etc.

In any case, since all of these : expressions have to parse into range (and so whether we want to use : for elementwise operations depends on whether you're comfortable extending the range function to these cases), I don't see that there's any need for a new elementwise function -- just new typed-function cases for the existing range function. (If we use a different operator symbol, then likely we'll need a new function for its semantics.)

So in other words, the elementwise function you write about should just be the new cases for range. But I don't think the semantics as described are quite right -- you might be vectorizing along some other argument or more than one simultaneously. In other words, you should be able to elementwise-apply log to 4 and [2,3,4] to get the log of 4 to each base 2, 3, and 4, or elementwise-apply log to [4,9,16] and [2,3,4] to get [2,2,2]. The expression you wrote doesn't cover these latter cases.

@gwhitney
Copy link
Collaborator Author

gwhitney commented Feb 3, 2023

Oh and the questions in my point (4) above are all still valid; I truly was asking about the existing square function, and also about any other existing vectorizing functions.

@josdejong
Copy link
Owner

I think you're right and we do not need an elementwise function at all, we can do with extending range. And we do not need a new ElementWiseFunctionNode but can just use RangeNode. Much simpler, Nice!

Syntax

We could indeed consider interpreting log:V:2 as element-wise log with arguments V and 2, instead of log:(V, 2). That would be technically simpler and more straightforward in a way. But I do like the similarity of the regular log(V, 2) and it's element-wise version log:(V, 2).

Mixing the original numeric range with the new element-wise function operator can introduce quite some confusion and weird cases. Maybe we should simply not allow fn:start:end and fn:start:step:end, and do not mix these two different things. So, we could do:

# numeric range
start:end
start:step:end

# element-wise function operator
fn:(arg1, arg2, arg3, ...)
fn:arg1             # shorthand for fn:(arg1) 

# do NOT allow mix of these two:
fn:start:end        # NOT supported
fn:start:step:end   # NOT supported

# function range
range(start, end)
range(start, step, end)
range(fn, arg1, arg2, arg3, ...)

It feels a bit weird indeed to extend range with an element-wise function call. But well, all these concepts of range, map, repeat are close together, and it makes sense to use the very same : operator for them. I have to say it looks good to me, but I'm still not totally convinced, have to let it sink in a bit. The alternative would be to use an other character, like log@(V,2), or log..(V,2), so these two keep cleanly separated (with a new elementwise function under the hood, separate from range). Or we could disallow the shortcut fn:arg1 and require fn:(...), so the parser can already recognize whether it's a numeric range or element-wise function.

Philosophy

About your point (4) on square: it makes sense to me to implement square as x*x for matrices, and the same with cube.
It still bothers me that there is an inconsistency in some functions like exp, log, sin, sqrt not supporting matrices out of the box, and other functions like abs, add, eqaul max sign that do support matrices. That can be confusing. Let me try to write out the philosophy we try to implement right now for functions operating on matrices:

  • in principal, functions evaluate the mathematical way, like multiply and divide.
  • if the mathematical way is element-wise evaluation, we do suppport that, like add and subtract.
  • if the mathematical way is not element-wise evaluation but we miss the mathematical implementation, we throw a clear error when invoking the function with a matrix, like log, sin, square.
  • If you want element-wise evaluation on any function, you can use the new explicit element-wise syntax like log:(V, 2).

I'm not sure if there are still functions in a "gray" area in this regard, but I think abs is one of those. If so, we have to have a look at those. What do you think @gwhitney ?

@josdejong josdejong reopened this Feb 4, 2023
@gwhitney
Copy link
Collaborator Author

gwhitney commented Feb 4, 2023

Yes, I think we are homing in on the details here. But I think there are still some parsing issues:

  1. foo:bar:baz:quux

This is currently a syntax error, and if we are not going to use this for applying a ternary function foo, it can remain a syntax error.

  1. foo:bar:baz and foo:(bar, baz)

The first of these has to parse to a RangeNode with three children, in case foo, bar, and baz all mean numbers. So if the second form also parses to a RangeNode with three children, and the semantics of such a node is given by the range(foo, bar, baz) function, and if that means element-wise application of a binary function foo to bar and baz, then we would inevitably end up with log:V:2 being allowed and working to take the base-2 log of all of the elements of V. And it also seems to mean that 0:(2,6) would be legal and produce [0,2,4,6], obviously really weird. There are three responses I can think of to this situation: (A) Don't sweat it: encourage use of the log:(V,2) form, don't really mention that log:V:2 would work as well, but don't do anything specifically to suppress that form, and similarly, just tolerate 0:(2,6) without promoting it; (B) Parse foo:bar:baz and foo:(bar,baz) into different trees, which use different functions for their semantics, range for the first one and say an extended map (or a new elementwise) for the second. In this scenario, we would not extend the semantics of range, so that log:V:2 would give a runtime error because log is a function not a number (which is exactly what happens right now). Or (C): foo:bar and foo:(bar) and foo:(bar,baz) all parse to the same basic structure whose semantics are given by a function that covers the binary start:end range case and all elementwise application cases, whereas foo:bar:baz parses to a special TernaryRange structure that uses a function for its semantics that only supports the start:step:end case. I think (C) would work perfectly well, but it has the conceptual drawback that it is counterintuitive to coalesce the start:end range and elementwise application semantics while separating the start:step:range, which seems more similar to start:end.

  1. foo:(bar)

In option B above, this will become very problematic, particularly in the case foo:(bar+baz). Is this a range where the right endpoint is the sum of two numbers, or the elementwise application of a function to the sum of two vectors? This cannot be determined at parse time when the values of the symbols foo, bar, and baz are not known, which seems to imply that foo:bar and foo:(bar) must both parse to the same sort of node, which in turn seems to rule out option (B) in point (2).

So to sum up all of these points, I see four ways to go at the moment, in no particular order.

(i) Leave things be: mix of non-conflicting operators auto-vectorizing, and needing explicit map() for others, possibly combined with improving lambda-expression notation.

(ii) Extend : for elementwise operation, tolerate weird cases like log:V:2 and/or 0:(2,6)

(iii) Extend : for elementwise operation, and avoid those weird cases by having 0:6 and abs:V and log:(V,2) all parse to similar Node structures while counterintuitively 0:2:6 parses to a different structure. (Note this counter-intuitiveness would only show up to someone who doesn't deal with mathjs syntax trees in one way: there would be one function I'll call here rangeOrElementwise so that rangeOrElementwise(1:4) is [1,2,3,4] while rangeOrElementwise(log, V, 2) applies log to all of the pairs (v,2) for v in V, and a separate function ternaryRange that handles ternaryRange(0,2,6) to produce [0, 2, 4, 6].)

(iv) Use a different operator symbol than :(...) for elementwise function application. The discussion above has suggested func..(V,W) or func@(V,W), and there are surely other options; just to brainstorm, doubling the parens exp((V)), dollar sign func$(V,W), etc. I think of these I personally like .. best.

@dvd101x
Copy link
Collaborator

dvd101x commented Feb 4, 2023

Edge case, ranges as arguments of a function

Please consider the case for ranges as inputs in element wise functions with multiple arguments.

add(1:1:3, 3:-1:1)           # [4, 4, 4]
subtract(1:1:3, 3:-1:1)      # [-2, 0, 2]
dotMultiply(1:1:3, 3:-1:1)   # [3, 4, 3]
dotDivide(1:1:3, 3:-1:1)     # [0.33333333333333, 1, 3]
dotPow(1:1:3, 3:-1:1)        # [1, 4, 3]

Reference from Wolfram Language

I fount in the Wolfram Language docs something that might be useful regarding the discussion for mappings and element wise functions. I found it interesting as it's Mathematical rather than numerical (like Matlab)

Map[f,expr] or f/@expr applies f to each element on the first level in expr.
Evaluate f on each element of a list: Map[f, {a, b, c, d, e}]
Use the short input form: f /@ {a, b, c, d, e}

Sin can be used with Interval and CenteredInterval objects. »
Sin automatically threads over lists.
Exp[z] gives the exponential of z.
Exp automatically threads over lists.
Exp can be used with Interval and CenteredInterval objects. »
MatrixExp[m] gives the matrix exponential of m.

@gwhitney
Copy link
Collaborator Author

gwhitney commented Feb 4, 2023

Right, I think all of these proposals will be compatible with a numeric range as an input, e.g. log:(1:4, 2) (but not I think abs:-3:3 -- we seem to be leaning toward if you want to do that, you have to write abs:(-3:3), because of the existence of the start:step:end notation.

And thanks for the Wolfram reference; it means that if we choose to go with an alternate operator symbol, we could choose log/@(V,2) to be similar to Wolfram.

@josdejong
Copy link
Owner

Good analysis Glen! I hadn't realized these side effects of weird cases like log:V:2 and/or 0:(2,6) that we get when we use : for multiple purposes.

I'm starting to be convinced that we should not reuse : for the new map operator: it is technically weird to "overload" the clear range function to also be a map function, it gives a syntax that allows doing weird things like 0:(2,6), and I notice by myself that it is hard to understand, name, and explain this concept when mixing these two concepts together. Then it will most likely be confusing for users too.

Let's see if we can settle on an alternative symbol for the map operator. Thanks David for looking up the syntax that Wolfram uses!

Syntax Pros Cons
log:(V,2) exp:V Uses the existing operator : for new map like operation Mixes two concepts together, which is confusing and gives serious complications
log..(V,2) exp..V Similar to the element-wise operators .*, ./ and .^ It may look odd when you see it for the first time?
log/@(V,2) exp/@V Same syntax as Mathematica, people may be familiar with it The / feels redundant and doesn't come back in other places of the syntax of mathjs
log@(V,2) exp@V Since @ is used in other languages too related to lambda's and mapping, people may have a natural feel for what it means

So far I like both log..(V,2) and log@(V,2) the best. More thoughts on this?

@gwhitney
Copy link
Collaborator Author

gwhitney commented Feb 6, 2023

Purely in the spirit of brainstorming: the sort of implementation that's being proposed here makes this not really a new operator (infix notation for a function applied to its left and right operands) but a variant function application notation, that vectorizes rather than applying to entire collections as single entities.

So conceptually this suggests to me that the difference should be some change to the ordinary function-application-notation f(V,W). Since the parentheses are really the only symbols in this notation (no comma in unary case), maybe the notation should be a modification of the parentheses. So the possibilities that come to mind based on this:

  1. since mathjs currently doesn't actually allow juxtaposition with a matrix like 3[2,7] to mean multiplication by scalar (i.e., producing [6,21]), the notation f[V,W] actually appears to be available for vectorized application. And the brackets naturally suggest something vector-y going on. The only possible downside I see is if the association with vector-forming is so strong that readers will presume [V,W] must mean an array with two entries, each an array itself, and then wonder what it could mean to have a function symbol before it. Other than that potential worry - not sure how real a worry it is - I like this notation; it's certainly briefest and makes it clear this is not an operator but a different sort of function application.

  2. Double the parentheses: f((V,W)). The doubling sort of suggests vectorization...

  3. Put the modifier character after the open paren: Any of f(:V,W), f(@V,W), f(*V,W), f(..V,W). Can't be f(.V,W) because f(.2,W) would then be ambiguous between the number .2 and vectorized application to 2 and W. But none of these other four have any prefix meaning, so the notation would work.

  4. Like (3) but repeat the decoration symbol at the close paren for symmetry, e.g., f(:V,W:).

  5. Use one of these symbols to mark each argument to be vectorized over. This would allow e.g. log(*V, 2) for the base 2 log of V and log(*[4,9,16],*[2,3,4]) to produce [2,2,2] but also to disambiguate pow(M, *[2,3,4]) to mean the array whose entries are the square, cube, and fourth power of (square) matrix M, which would be hard to specify with notations that say only "this is a vectorized function application" without indicating exactly which arguments to vectorize over.

Upon initial reflection, I think I like option (5) best, and I think any of the three symbols *, @, or : would be fine - and there might be other options. I don't think it should be .. because that's too confusing with splats, whether or not mathjs actually has them.

@gwhitney
Copy link
Collaborator Author

gwhitney commented Feb 6, 2023

And in fact I will say I like (5) with * best -- it's like an asterisk indicating "Hey these arguments are being treated specially" and it echoes * in regular expressions indicating sequencing, and * in languages like C that comes up in array processing.

@josdejong
Copy link
Owner

That is an interesting idea! So instead of putting a mark on the function, we could put it on the actual argument that we want to iterate over. Conceptually it makes sense. And it gives some interesting new flexibility dealing with element-wise operations. I think though that this concept is a bit more abstract to understand than the element-wise function call like log@(V, 2). And it would not allow this compact exp@V notation.

I agree with you that * makes sense to use in this case, like log( *V, 2) where the * is part of the argument and not the function call. In JavaScript the * is used to mark a function as a generator that you can iterate over, so that is a similar concept. In C and C++ I know it mostly from pointers and de-referencing. These are personal associations, but I get this "watch out complex stuff ahead" warning sign blinking at the inside of my eyes when seeing this * in code. Ideally, our element-wise function call does not have associations with being complex, there is no reason for that.

I'll give it a bit more thought 🤔

@gwhitney
Copy link
Collaborator Author

gwhitney commented Feb 8, 2023

And it would not allow this compact exp@V notation.

Weellll, since there is currently no definition of multiplication of a function and a vector, mathjs conceptually could define square*V to be the vector of the squares of elements of V, etc. Then you would have this ultracompact notation alongside log(*V, 2) and pow(M,*2:4). In this scenario you could write either sin(*angles) or sin*angles -- they would parse differently but produce the same result. I am not necessarily advocating this choice, just exploring the logical possibilities. Nor do I think defining function times collection in this way is a terrible/ridiculous idea, either. Worth contemplating at least.

@josdejong
Copy link
Owner

Ha, that is true. It feels quite confusing to me though to extend multiply to do element-wise evaluation when passing it a function as LHS, like it is to extend the function range for that purpose. But it is a possibility.

@gwhitney
Copy link
Collaborator Author

gwhitney commented Feb 8, 2023

Well, to my eye, "classic" range and elementwise application were just two distinct operations we were considering gluing together for notational convenience. Here at least, if someone said "Hmmm, we have no definition for the product of a function and a vector, what should that do?" there's not much else that comes to mind other than applying the function to each element of the vector; and we already have some precedent for "special" meanings of * for vectors because of the decision to use V*W for the dot product of two vectors. So it feels less of a stretch to me to extend multiply(func, V) in this way; it does actually feel legitimately "multiplication-y" to me.

@gwhitney
Copy link
Collaborator Author

gwhitney commented Feb 8, 2023

In any case, sin(*angles) is only two characters longer and emphasizes that function application is going on, so it may on the other hand not be necessary to support an ultrashort notation like sin@angles. I think you can tell that I like the aspect of the *-argument notation that it makes explicit which argument(s) is/are being vectorized over, and allows you to explicitly not vectorize over a collection argument if you want to simply use that argument repeatedly as a whole, instead of diving in to its elements, but you still want to vectorize over some other argument.

@josdejong
Copy link
Owner

So it feels less of a stretch to me to extend multiply(func, V) in this way; it does actually feel legitimately "multiplication-y" to me.

yeah it's not conflicting in any way. I prefer to not go that direction though.

Latest thoughts from my side: I think I prefer @ over .. when choosing between those two. Having the ultra compact exp@V notation is not that important I think, exp@(V) is fine too.

I do like your proposal of * at the argument too, I would like to think that though a bit more to see if there are complications with it or not. Thoughts:

  1. What are concrete use cases of these advanced cases where we do want some but not all matrix arguments to be evaluated element-wise, like pow(M,*2:4)? I guess it would output an array with matrices. I'm not sure if this is something you would want in practice. If we can only come up with cases where we want all arguments element wise, like log(*[4,9,16],*[2,3,4]), then it feels verbose to have to specify the * with every argument.
  2. Can/should we support the notation for operators too? Like M ^ *2:4? I think that is technically possible. So then A .* B could be written as *A * *B. That looks a bit confusing though, I'm not sure if we should support this.
  3. How to implement this? I think we will implement a new function elementwise, which as input has a function, a list with arguments, and a list which of the arguments needs to be executed element-wise. The expression parser will create a FunctionNode with this elementwise function when it encounters the * notation.

@dvd101x
Copy link
Collaborator

dvd101x commented Apr 7, 2023

I was doing some tests related to numerical integration of ordinary differential equations that could be used in problems like Rocket Trajectory Optimization and it seems promising. I think it has the potential to be more versatile than what can be found in numpy or Matlab as here we can also include units and that would make representing physical systems clearer.

The reason I bring this to your attention is that even though I respect the decision to remove the element wise operations to avoid confusion and be more mathematically sound, I think it might affect numerical computing a bit.

These new ways of mapping functions in the parser are very promising and I was wondering if it's possible to cover both cases? I mean for most functions to accept array likes and also have this versatile ways to map in the parser.

Since the element wise operations were removed I've encountered a few minor issues:

  • Rocket Trajectory Optimization was expected to work in both ways using exp with scalars and also with vectors element wise afterwards
  • cbrt can't be mapped with the current map as the error suggest and it needs a special treatment
  • atan2 is the only trigonometric function that still accepts array likes

@josdejong
Copy link
Owner

josdejong commented May 15, 2023

Do you mean to keep support for matrices in say function add(matrix) alongside supporting map(matrix, add) and add@(matrix)?

Do you mean to keep support for matrices in say function ceil(matrix) alongside supporting map(matrix, ceil) and ceil@(matrix)?

@gwhitney

This comment was marked as outdated.

@josdejong

This comment was marked as outdated.

@gwhitney
Copy link
Collaborator Author

On the amended question, my opinion is that since there is no matrix-analytic meaning of the "ceiling of a matrix" other than the element-wise ceiling, I don't see any harm in continuing to allow ceil(matrix). Of course map(matrix, ceil) would also work, as map(matrix, F) for any unary function F. And I guess the ceil@(matrix) notation is the final new (accepted?) proposal for trying to make elementwise operations easier in light of some having been removed for the sake of avoiding ambiguity between say the square of a matrix and its elementwise square. But I don't think that's been implemented yet? Presuming it is implemented as a general parsing/operator facility (rather than by simply adding a bunch of new operators named exp@, square@, etx.) then I don't see any reason why ceil@(matrix) should not be allowed to work just like any other F@(matrix) would.

@dvd101x
Copy link
Collaborator

dvd101x commented May 16, 2023

In particular I mean for functions like sin, exp, etc. that I will call singleton functions.

I encountered a case while implementing a numerical ODE integrator and the function to integrate includes one of those singleton functions. It's common for these methods to use multiple initial values.

y_prime(t, y) = sin(y) + cos(y+1)
initial_state = [1, 1.1]
h = 1
t = 0
next_state = initial_state + h * y_prime(t, initial_state)

This throws an error due to the use of a vector inside sin and cos, but if the function doesn't use the singleton functions it works fine y_prime(t, y) = t + 2*y + 1

A workaround is that the user of the numerical ODE integrator defines y_prime function in a map if the function includes any singleton function.

y_prime(t, y) = map(y, sin) + map(y+1, cos)

Or maybe it's more legible for it to be

y_prime(t, y) = map(y, _(y1)=sin(y1)+cos(y1+1))

Or manually mapping

y_prime(t, y) = [sin(y[1])+cos(y[1]+1), sin(y[2])+cos(y[2]+1)]

With the new map nomenclature it might be more legible, but I don't know outside the parser how it would work.

y_prime(t, y) = sin@(y) + cos@(y+1)

This is an example of a possible inconvenience that I encountered during testing in comparison to other numerical ODE integrators.

@gwhitney
Copy link
Collaborator Author

Right, there's clearly a need for a function that will be say exp of a number and elementwise exp of a vector. The point of this thread is to settle on what the notation for that should be. I think Jos has settled on exp@(X) where X can be either a scalar or a vector/matrix. Is there any reason to think that using this notation in the definitions of your y_prime examples wouldn't work (once someone implements it)?

@josdejong
Copy link
Owner

I think Jos has settled on exp@(X) where X can be either a scalar or a vector/matrix.

So far I like exp@(X) quite well and know that is technically feasible, but I would like to explore Glen's idea of log( *V, 2) further (see #2440 (comment), #2440 (comment)), see what would be involved to implement that, and think through the use cases.

@dvd101x
Copy link
Collaborator

dvd101x commented Aug 3, 2023

I was reading a similar discussion in the Julia community.

After reading that I’m detracting from my request to allow both methods. It seems to work well for them and cover more cases.

If anything I would only question if in the meantime of this element wise operator is implemented you might consider returning back the old matlab style behavior. Now that solveODE is implemented I could show you some specific cases for y_prime that might seem inconvenient if that’s ok.

@gwhitney
Copy link
Collaborator Author

gwhitney commented Aug 3, 2023

Thanks for these references. I think they do reinforce where this discussion seemed to end up, which is for mathjs not to be concerned with making raw functions work on both scalars and matrices, but instead to provide convenient notation for vectorizing any function. I just am not sure if that notation ever quite settled...

@dvd101x
Copy link
Collaborator

dvd101x commented Aug 4, 2023

Thanks for the review Glen,

Today I was reading in detail the blogpost in which they announced the change they made which is very inline with this discussion.

https://julialang.org/blog/2017/01/moredots/
https://docs.julialang.org/en/v1/manual/mathematical-operations/#man-dot-operators

In particular I found interesting they use some of the characters mentioned here.

X .= f.(2 .* X.^2 .+ 6 .* X.^3 .- sqrt.(X))

@. X = f(2X^2 + 6X^3 - sqrt(X))

And also their description of broadcast and how it would apply for functions with multiple arguments.

Something interesting is that for Julia the . is the broadcast operator and is mandatory for something like this.

f(a,b) = a+b
f.([1,2],[3,4]')

Which in Matlab, Numpy and mathjs is not mandatory.

@josdejong
Copy link
Owner

Really interesting to see this very same discussion for Julia.

We indeed still have to make a decision here, my personal preference so far is a notation like exp@(X) but we can still discuss the notation if there are new thoughts in this regard. And I would still love to think through Glen's idea of log( *V, 2) in more detail.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants