Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Do the autodiff options have any meaning when the Jacobi matrix is provided? #773

Open
gstein3m opened this issue Jul 16, 2021 · 11 comments · May be fixed by SciML/OrdinaryDiffEq.jl#1611
Open

Comments

@gstein3m
Copy link

When I provide the Jacobian

f = ODEFunction(fcn,mass_matrix=M,jac=f_jac)

why do I get different results (number of time steps,...) when I use different options for autodiff:

sol = solve(prob_ode,Rodas4P2(),

or

   sol = solve(prob_ode,Rodas4P2(autodiff=false,diff_type=Val{:central}),

or

   sol = solve(prob_ode,Rodas4P2(autodiff=false,diff_type=Val{:forward}),

In my opinion, these options should not play a role then.

Moreover, usage of a simple numerical f_jac

function f_jac(J,y,P,t)
#--
del = sqrt(eps(1.0)); n = length(y);
f0 = similar(y); f1 = similar(y); fcn(f0,y,P,t);
for i=1:n
del_1 = max(abs(y[i])*del,del);
if y[i]<0.0 del_1 = -del_1; end
y1 = copy(y); y1[i] = y1[i] + del_1;
fcn(f1,y1,P,t)
J[:,i] = (f1-f0)/del_1;
end
end

leads to 864 accepted steps compared to 4813 steps without providing the Jacobian and autodiff=false,diff_type=Val{:forward} option. The diff_type=Val{:central} option is similar to 860 steps.

@ChrisRackauckas
Copy link
Member

Rosenbrock methods don't just need the Jacobian, they also need a gradient w.r.t. time.

@gstein3m
Copy link
Author

Yes, thanks a lot. I did not notice, that these options do also influence the time derivative.
Then there is still the second question: Even for the choice del_1 = del*max(abs(y[i]),del); the simple numerical Jac performs much better than autodiff=false,diff_type=Val{:forward} . Where is the difference?

@ChrisRackauckas
Copy link
Member

What do you mean? Do you have an example?

@gstein3m
Copy link
Author

My application is rather complex, but I will try to construct an example.

@gstein3m
Copy link
Author

gstein3m commented Jul 18, 2021

This is my example: 0 = y^2 -1/t^4, y(1)=-1 with solution y(t) = -1/t^2.

For f(y)=y^2 and h>0 the FD Approximation f'(y) = (f(y+h)-f(y))/h leads to a singular Jacobian latest at time t = sqrt(2/h).

  1. I do not understand the results for diff_type=Val{:forward}. When I look at the implementation of "finite_difference_derivative" and "compute_epsilon" the choice absstep=relstep should lead to the same results as for my Jacobian with y_thres = 1.0.

  2. When increasing the time interval y_thres should be lowerd. In many practical computations y_thres = 1.0e-5 has been a good choice.

  3. Maybe the users should be allowed to alter y_thres in a range [sqrt(eps), 1.0] ?

using DifferentialEquations

function fcn(du,u,p,t)
    du[1] = u[1]^2 - 1/t^4;
end

function f_jac(J,y,P,t)
#-- numerical Jac by FD
 y_thres = P;
 del = sqrt(eps(1.0)); n = length(y);
 f0 = similar(y); f1 = similar(y); fcn(f0,y,P,t);
 for i=1:n
	 del_1 = del*max(abs(y[i]),y_thres);
	 y1 = copy(y); y1[i] = y1[i] + del_1;
	 fcn(f1,y1,P,t)
	 J[:,i] = (f1-f0)/del_1;
 end
end

tspan = (1.0, 1.0e3); M = zeros(1,1);

println("--- AD ---")

f = ODEFunction(fcn,mass_matrix=M)
problem = ODEProblem(f,[-1.0], tspan);

sol = solve(problem,Rodas4P2(),maxiters=Int(1e7),reltol=1.0e-12,abstol=1.0e-12);
println(sol.destats)

println("--- FD central ---")

sol = solve(problem,Rodas4P2(autodiff=false,diff_type=Val{:central}),maxiters=Int(1e7),reltol=1.0e-12,abstol=1.0e-12);
println(sol.destats)

println("--- FD forward ---")

sol = solve(problem,Rodas4P2(autodiff=false,diff_type=Val{:forward}),maxiters=Int(1e7),reltol=1.0e-12,abstol=1.0e-12);
println(sol.destats)

println("--- FD forward, y_thres = 1 ---")

y_thres = 1.0;
f = ODEFunction(fcn,mass_matrix=M,jac=f_jac)
problem = ODEProblem(f,[-1.0], tspan,y_thres);
sol = solve(problem,Rodas4P2(autodiff=false,diff_type=Val{:forward}),maxiters=Int(1e7),reltol=1.0e-12,abstol=1.0e-12);
println(sol.destats)

println("--- FD forward, y_thres = 1.0e-5 ---")

y_thres = 1.0e-5;
f = ODEFunction(fcn,mass_matrix=M,jac=f_jac)
problem = ODEProblem(f,[-1.0], tspan,y_thres);
sol = solve(problem,Rodas4P2(autodiff=false,diff_type=Val{:forward}),maxiters=Int(1e7),reltol=1.0e-12,abstol=1.0e-12);
println(sol.destats)

println("--- FD forward, y_thres = sqrt(eps) ---")

y_thres = sqrt(eps(1.0));
f = ODEFunction(fcn,mass_matrix=M,jac=f_jac)
problem = ODEProblem(f,[-1.0], tspan,y_thres);
sol = solve(problem,Rodas4P2(autodiff=false,diff_type=Val{:forward}),maxiters=Int(1e7),reltol=1.0e-12,abstol=1.0e-12);
println(sol.destats)

Output

--- AD ---
Number of accepted steps: 5275

--- FD central ---
Number of accepted steps: 5276

--- FD forward ---
Number of accepted steps: 1393517

--- FD forward, y_thres = 1 ---
Number of accepted steps: 6022

--- FD forward, y_thres = 1.0e-5 --
Number of accepted steps: 5231

--- FD forward, y_thres = sqrt(eps) ---
Number of accepted steps: 5231

@ChrisRackauckas
Copy link
Member

Yeah it looks like a more optimal eps could be chosen here.

@gstein3m
Copy link
Author

gstein3m commented Feb 25, 2022

The problem for the failure of (autodiff=false,diff_type=Val{:forward}) is within function calc_rosenbrock_differentiation! (see derivative_utils.jl). Here linsolve_tmp changes it's value, but should not. This is a workaround, but sure not the final solution.:

function calc_rosenbrock_differentiation!(integrator, cache, dtd1, dtgamma, repeat_step, W_transform)
calc_tderivative!(integrator, cache, dtd1, repeat_step)
nlsolver = nothing
#- we need to skip calculating W when a step is repeated
new_W = false
if !repeat_step
#- println("a) Cache:",cache.linsolve_tmp)
help = cache.linsolve_tmp[:]
new_W = calc_W!(cache.W, integrator, nlsolver, cache, dtgamma, repeat_step, W_transform)
cache.linsolve_tmp[:] = help
#- println("b) Cache:",cache.linsolve_tmp)
end
return new_W
end

Best regards Gerd

@ChrisRackauckas
Copy link
Member

Isn't it only used for the temporary inside that calculation? Is there a reason that buffer needs to remain unchanged later? If so, we probably need to add another cache vector, or use a different one somewhere.

@gstein3m
Copy link
Author

cache.linsolve_tmp holds the result from calc_tederivative! and is required in rosenbrock_perform_step.

@ChrisRackauckas
Copy link
Member

So the tderivative! just needs to be moved to after the W?

@gstein3m
Copy link
Author

Yes, thats the most simple possibility.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants