Describe the bug
PANOCEngine.linesearch() does the wrong update if MAX_LINESEARCH_ITERATIONS is reached.
PANOCEngine.line_search_condition() can fail for the wrong reason.
In more detail.
The current code is
if num_ls_iters == MAX_LINESEARCH_ITERATIONS {
self.cache.tau = T::zero();
u_current.copy_from_slice(&self.cache.u_half_step);
}
// Sets `u_current` to `u_plus` (u_current ← u_plus)
u_current.copy_from_slice(&self.cache.u_plus);
If I understand the algorithm correctly, if we reach MAX_LINESEARCH_ITERATIONS, the update should use tau=0, but that isn't what is happening here. The following should fix it
if num_ls_iters == MAX_LINESEARCH_ITERATIONS {
self.cache.tau = T::zero();
// updating with tau = 0
self.line_search_condition(u_current)?;
// No need to update u_current here
}
The comment above the method says
/// Computes the left hand side of the line search condition and compares it with the RHS;
/// returns `true` if and only if lhs > rhs (when the line search should continue)
In other words, the line search will stop if false is returned. In #406 the code was changed so that if NaNs appeared, false wouldn't be returned incorrectly. I think that that check covered all possible NaNs appearing (didn't fully check though), however I bumped into a situation where both the cost and norms were huge, but finite, for example e20 and e40 giving a lhs of around -e40. In this case, the lineseach should continue (and if I forced it to, it ended up returning an improved u_plus with a much smaller value for tau.
The solution I used was to change the return to
Ok((cast::<T>(0) < self.cache.lhs_ls) && (self.cache.lhs_ls <= self.cache.rhs_ls)) and therefore change the logic of the method, it instead returns true if and only if the line search has finished. (This would also mean you can drop the NaN checks I believe.) I'm not sure if the 0 < lhs check makes sense in general, some sort of check to make sure the cost isn't (substantially) worse seems sensible though. The general point is that the first step of the linesearch (tau=1) can give terrible values (NaNs or 10's of orders of magnitudes larger values), but the algorithm still works as long as you "correctly" detect these and let tau get small enough. It might well be the case that this issue is too hard to get around in general, the change I made above works in the specific case I bumped into.
Describe the bug
PANOCEngine.linesearch()does the wrong update ifMAX_LINESEARCH_ITERATIONSis reached.PANOCEngine.line_search_condition()can fail for the wrong reason.In more detail.
The current code is
If I understand the algorithm correctly, if we reach
MAX_LINESEARCH_ITERATIONS, the update should usetau=0, but that isn't what is happening here. The following should fix itThe comment above the method says
In other words, the line search will stop if false is returned. In #406 the code was changed so that if NaNs appeared, false wouldn't be returned incorrectly. I think that that check covered all possible NaNs appearing (didn't fully check though), however I bumped into a situation where both the cost and norms were huge, but finite, for example e20 and e40 giving a lhs of around -e40. In this case, the lineseach should continue (and if I forced it to, it ended up returning an improved
u_pluswith a much smaller value fortau.The solution I used was to change the return to
Ok((cast::<T>(0) < self.cache.lhs_ls) && (self.cache.lhs_ls <= self.cache.rhs_ls))and therefore change the logic of the method, it instead returns true if and only if the line search has finished. (This would also mean you can drop the NaN checks I believe.) I'm not sure if the 0 < lhs check makes sense in general, some sort of check to make sure the cost isn't (substantially) worse seems sensible though. The general point is that the first step of the linesearch (tau=1) can give terrible values (NaNs or 10's of orders of magnitudes larger values), but the algorithm still works as long as you "correctly" detect these and let tau get small enough. It might well be the case that this issue is too hard to get around in general, the change I made above works in the specific case I bumped into.