Description
Describe the issue:
I think that the tolerance used to decide whether a constant is one (or minus one) is too large.
We can argue about whether this is a bug or not, but I think it may lead to issues; see the code below.
The issue is the use of np.isclose and np.allclose within functions such as `_is_1`, `is_1pexp`, and `is_neg` in pytensor/tensor/rewriting/math.py. The default tolerances used are rtol=1e-05 and atol=1e-08, which are unnecessarily large (and independent of the data type of the constant being compared).
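For instance, with those defaults a constant that is off by a full 1e-5 already passes the check; a minimal NumPy illustration:

import numpy as np

# With the default tolerances, 0.99999 already counts as "one":
# |0.99999 - 1| = 1e-5 <= atol + rtol * |1| = 1e-8 + 1e-5
print(np.isclose(0.99999, 1.0))  # True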
I think it's not unreasonable to have a constant that is deliberately close to, but distinct from, one (like the 0.99999 in the example below), and with these tolerances it gets silently rewritten as if it were exactly one.
I propose:
- using a much smaller tolerance by default;
- making this tolerance dependent on the data type;
- implementing an in-house version of isclose/allclose to be used where needed (so it can be reimplemented by the user and its effects apply to all simplifications) — see the sketch after this list.
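A minimal sketch of what such an in-house helper could look like; the name, signature, and the 10 * eps factor are my assumptions, not existing PyTensor API:

import numpy as np

def isclose_const(x, ref, eps_factor=10):
    # Hypothetical dtype-aware check: the tolerance scales with the
    # machine epsilon of x's dtype instead of fixed rtol/atol.
    # ref is expected to be +/- 1 here, so scaling by abs(ref) is safe.
    x = np.asarray(x)
    dtype = x.dtype if np.issubdtype(x.dtype, np.floating) else np.dtype("float64")
    tol = eps_factor * np.finfo(dtype).eps
    return np.all(np.abs(x - ref) <= tol * abs(ref))

print(isclose_const(np.float64(0.99999), 1.0))  # False
print(isclose_const(np.float64(1.0), 1.0))      # True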
Let's take, for example, the following expression:
I did some quick rounding-error analysis of this expression, and the error bound comes out to approximately 30 deltas (individual rounding errors, each on the order of eps). Assuming those errors are roughly independent, they accumulate like the square root of their number, i.e. sqrt(30) ≈ 6 eps. In practice I tried it and got an error of 5 eps in float64 and 3 eps in float32. So I think it's safe to take something like 10 eps as the tolerance when assessing whether something is one (or minus one), which corresponds to about 1e-6 for float32 and 2e-15 for float64.
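For reference, a quick check that 10 * eps matches those figures, using NumPy's machine epsilon:

import numpy as np

print(10 * np.finfo(np.float32).eps)  # ~1.19e-06
print(10 * np.finfo(np.float64).eps)  # ~2.22e-15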
Reproducible code example:
import pytensor
import pytensor.tensor as pt
from pytensor.graph import rewrite_graph

x = pt.scalar("x")
out = 0.99999 - pt.sigmoid(x)

# Print the graph before rewriting
pytensor.dprint(out, print_type=True)
print("=" * 20)

with pytensor.config.change_flags(optimizer_verbose=True):
    fn = pytensor.function([x], out, mode="FAST_RUN")
    # Alternatively, rewrite the graph directly:
    # fn = rewrite_graph(out, include=("canonicalize", "stabilize", "specialize"))

print("=" * 20)

# Print the graph after rewriting
pytensor.dprint(fn, print_type=True)
Error message:
Sub [id A] <Scalar(float64, shape=())>
├─ 0.99999 [id B] <Scalar(float64, shape=())>
└─ Sigmoid [id C] <Scalar(float64, shape=())>
└─ x [id D] <Scalar(float64, shape=())>
====================
rewriting: rewrite local_1msigmoid replaces Sub.0 of Sub(0.99999, Sigmoid.0) with Sigmoid.0 of Sigmoid(Neg.0)
rewriting: rewrite FusionOptimizer replaces Sigmoid.0 of Sigmoid(Neg.0) with Composite{sigmoid((-i0))}.0 of Composite{sigmoid((-i0))}(x)
====================
Composite{sigmoid((-i0))} [id A] <Scalar(float64, shape=())> 0
└─ x [id B] <Scalar(float64, shape=())>
Inner graphs:
Composite{sigmoid((-i0))} [id A]
← sigmoid [id C] <float64> 'o0'
└─ neg [id D] <float64>
└─ i0 [id E] <float64>
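For concreteness, evaluating the compiled function shows the effect: the constant 0.99999 has been treated as exactly 1, so the result is off by the full 1e-5 (continuing from the reproducible example above):

print(fn(0.0))         # 0.5, i.e. sigmoid(-0) from the rewritten graph
print(0.99999 - 0.5)   # 0.49999, the value the original graph defines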
PyTensor version information:
2.31.3
Context for the issue:
No response