This repository has been archived by the owner on Apr 18, 2023. It is now read-only.

Battle testing round 1 #78

Merged
merged 27 commits into master from battle-testing-round-1 on Mar 3, 2018

Conversation

willtebbutt (Member) commented:

Making a load of updates as I use it to implement a separate package. This has so far proved to be a useful exercise, as it has already highlighted a number of issues and missing features, which I am implementing as I proceed.

There will likely be a few new linear algebra optimisations as well, which is really the name of the game at the moment.

…ing. Added some new scalar sensitivities in scalar.jl and changed the BenchmarkTools.jl test requirement to 0.0.8 and upwards. This dramatically accelerates testing.
…ual number, and hence the sensitivity w.r.t. an input is in fact zero, an error occurs. Instead, we now assume that a non-Dual output implies a sensitivity of zero.
…sensitivities now return a Diagonal matrix in all scenarios, which implies linear scaling in the length N of the Diagonal, rather than a dense matrix, which implies O(N^2) asymptotic memory and time complexity (see the sketch after this list of commits). Furthermore, the in-place optimisations have been improved slightly to handle Diagonal sensitivity caches better.
…es are returned from the optimisations. Also reformatted some code and changed the test requirements to place fewer restrictions on the version of BenchmarkTools that can be used when testing.
…tion at which stuff fails, such that the error can be easily reproduced.
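
The Diagonal point in the commits above is about asymptotic memory: a rough plain-Julia illustration (this is not Nabla's internal representation; the names are just for the example):

using LinearAlgebra

N = 1_000
d = Diagonal(randn(N))   # a Diagonal sensitivity stores only the N diagonal entries
D = Matrix(d)            # the dense equivalent materialises all N^2 entries

length(d.diag)  # 1_000 numbers stored
length(D)       # 1_000_000 numbers stored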
codecov bot commented Jan 14, 2018

Codecov Report

Merging #78 into master will decrease coverage by 1.77%.
The diff coverage is 100%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master      #78      +/-   ##
==========================================
- Coverage   99.81%   98.03%   -1.78%     
==========================================
  Files          19       18       -1     
  Lines         535      509      -26     
==========================================
- Hits          534      499      -35     
- Misses          1       10       +9
| Impacted Files | Coverage Δ |
|---|---|
| src/sensitivities/functional/functional.jl | 100% <ø> (ø) ⬆️ |
| src/finite_differencing.jl | 100% <100%> (ø) ⬆️ |
| src/core.jl | 100% <100%> (ø) ⬆️ |
| src/sensitivities/scalar.jl | 100% <100%> (+5%) ⬆️ |
| src/code_transformation/differentiable.jl | 100% <100%> (ø) ⬆️ |
| src/sensitivity.jl | 70.58% <0%> (-29.42%) ⬇️ |
| src/sensitivities/indexing.jl | 100% <0%> (ø) ⬆️ |
| src/sensitivities/linalg/generic.jl | 100% <0%> (ø) ⬆️ |
| src/sensitivities/functional/reduce.jl | 100% <0%> (ø) ⬆️ |
| ... and 3 more | |

Continue to review full report at Codecov.

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 6a9ddcd...14403ad.

@willtebbutt willtebbutt requested a review from wesselb January 14, 2018 20:07
@willtebbutt willtebbutt removed the WIP label Jan 14, 2018
@willtebbutt willtebbutt changed the title WIP: Battle testing round 1 Battle testing round 1 Jan 14, 2018
@willtebbutt willtebbutt mentioned this pull request Jan 15, 2018
"""
function unionise_struct(code::Expr)
tmp = code.args[2]
name = tmp isa Expr && tmp.head == Symbol("<:") ? code.args[2].args[1] : code.args[2]
Contributor commented:

Perhaps store `tmp isa Expr && tmp.head == Symbol("<:")` in a variable, since you're reusing it two lines below.
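
A minimal sketch of that suggestion (the function name and the trimming down to just the name-extraction step are hypothetical; the rest of unionise_struct is not shown in this hunk):

function struct_name(code::Expr)
    tmp = code.args[2]
    has_supertype = tmp isa Expr && tmp.head == Symbol("<:")
    return has_supertype ? tmp.args[1] : tmp
end

struct_name(:(struct Foo{T} <: Real end))  # :(Foo{T})
struct_name(:(struct Bar end))             # :Bar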

`code` should be an `Expr` containing the definition of a type. The type will only be
changed if it is parametric. That is, if you wish to be able to differentiate through a
user-defined type, it must contain only `Any`s and parametric types.
"""
Contributor commented:

Should unionise_struct also Nablafy the fields of the struct?

Member (author) commented:

I don't think so. In general that would mean having to store pointers to the field values in (immutable) structs; it wouldn't make any difference for mutable structs though. i.e.

struct Foo
    x::Union{Float64, Node{<:Float64}}
end

stores a pointer to x on the heap, whereas

struct Foo
    x::Float64
end

just stores x.
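
A hypothetical way to see that layout difference in plain Julia (FakeNode is just a stand-in for a tracked value, not Nabla's actual Node type):

mutable struct FakeNode{T}
    val::T
end

struct Inline
    x::Float64                            # field stored inline
end

struct Boxed
    x::Union{Float64, FakeNode{Float64}}  # field must hold either variant, so it is stored by reference
end

isbitstype(Inline)  # true
isbitstype(Boxed)   # false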

Expr(Symbol("::"), arg.args[1:end-1]..., unionise_type(arg.args[end])) :
arg.head == Symbol("...") ?
Expr(Symbol("..."), unionise_arg(arg.args[1])) :
throw(error("Unrecognised argument in arg ($arg)."))
Contributor commented:

"Unrecognised symbol" is perhaps more descriptive.


# Both of these functions are technically defined on the entire real line, but the negative
# half is troublesome due to the large number of points at which it isn't defined. As such
# we restrict unit testing to the left-half line.
Contributor commented:

right-half

# TODO: Make a PR for DiffRules.
@define_diffrule Base.:\(x, y) = :(-$y / $x^2), :(1 / $x)
@define_diffrule Base.:transpose(x) = :(1)
@define_diffrule Base.:abs(x) = :($x > 0 ? one($x) : -one($x))
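
A quick numerical sanity check of the `\` rule above, using plain central differences rather than Nabla's finite-differencing machinery:

f(x, y) = x \ y            # for scalars, x \ y == y / x
x, y, h = 3.0, 2.0, 1e-6

dfdx = (f(x + h, y) - f(x - h, y)) / (2h)
dfdy = (f(x, y + h) - f(x, y - h)) / (2h)

dfdx ≈ -y / x^2   # true, matches the first rule
dfdy ≈ 1 / x      # true, matches the second rule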
Contributor commented:

Would it make sense to let abs'(0) = 0?

Member (author) commented:

So my (somewhat silly) argument is to pretend that abs is actually defined as

abs(x) = x > -eps ? x : -x

where eps is some constant smaller than whatever value your floating point numbers are able to represent (not machine epsilon), such that we never actually hit the discontinuity when we evaluate abs, and consequently never have to handle it when we are evaluating the gradient.

In short, I really don't think that defining the gradient of abs to be zero at zero makes sense.

Contributor commented:

That makes sense, though 0 can be represented exactly, which would hit the discontinuity.

wesselb (Contributor) commented on Jan 16, 2018:

Perhaps it is most sensible to let abs'(0) = NaN.
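
For reference, the rule as written in the hunk above evaluates to -1 at zero; a quick plain-Julia check of that expression (dabs is a hypothetical name used only for this illustration):

dabs(x) = x > 0 ? one(x) : -one(x)

dabs(3.0)   # 1.0
dabs(-2.0)  # -1.0
dabs(0.0)   # -1.0, since 0.0 > 0 is false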

@@ -5,7 +5,11 @@ for i = 1:7
@eval @explicit_intercepts getindex $T [[true]; fill(false, $i - 1)]
end

∇(Ā, ::typeof(getindex), ::Type{Arg{1}}, p, y, ȳ, A, inds...) = setindex!(Ā, ȳ, inds...)
# ∇(Ā, ::typeof(getindex), ::Type{Arg{1}}, p, y, ȳ, A, inds...) = setindex!(Ā, ȳ, inds...)
Contributor commented:

Perhaps just remove the line as opposed to commenting it out.
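
The idea behind the ∇ rule above, sketched in plain Julia (getindex_pullback is a hypothetical helper, not Nabla's API): the adjoint of getindex scatters the incoming sensitivity ȳ into an array shaped like A.

function getindex_pullback(ȳ, A, inds...)
    Ā = zero(A)      # sensitivity accumulator with the same shape and eltype as A
    Ā[inds...] = ȳ   # place ȳ at the indexed location(s)
    return Ā
end

A = [1.0 2.0; 3.0 4.0]
getindex_pullback(5.0, A, 1, 2)  # all zeros except 5.0 at position (1, 2)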

@willtebbutt willtebbutt merged commit bd9460a into master Mar 3, 2018
@willtebbutt willtebbutt mentioned this pull request Mar 3, 2018
@willtebbutt willtebbutt deleted the battle-testing-round-1 branch March 25, 2018 21:39