🐛 Describe the bug
The CoreML backend seems to occasionally generate out-of-range values (one above the exclusive upper bound) for torch.randint. It's non-deterministic, but the repro script below triggers it regularly when generating a large enough set of values (one million, for example). Maybe some floating-point rounding issue?
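For what it's worth, here is a minimal sketch of how this class of off-by-one can arise when an integer `randint` is lowered onto a float uniform sampler. This is purely hypothetical (I have not inspected CoreML's actual lowering): if the sampler can return the closed upper endpoint 1.0, flooring leaks the exclusive bound back into the output.

```python
import torch

# Hypothetical illustration, NOT CoreML's actual implementation: lowering
# randint(low, high) to floor(u * (high - low)) + low leaks `high` itself
# whenever the underlying sampler can emit u == 1.0 (i.e. [0, 1] instead
# of the half-open [0, 1)).
low, high = 0, 100
u = torch.tensor([0.0, 0.5, 1.0])  # a sampler on the closed interval [0, 1]
vals = (torch.floor(u * (high - low)) + low).to(torch.int64)
print(vals)  # tensor([  0,  50, 100]) -> 100 is out of range for randint(0, 100)
```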
Repro:
```python
import torch

from executorch.backends.apple.coreml.partition import CoreMLPartitioner
from executorch.exir import to_edge_transform_and_lower, EdgeCompileConfig
from executorch.extension.pybindings.portable_lib import _load_for_executorch_from_buffer


class Model(torch.nn.Module):
    def forward(self, x):
        # randint's upper bound is exclusive: every value must lie in [0, 100).
        return torch.randint(0, 100, (1000, 874), dtype=torch.int64) + x


model = Model()
inputs = (torch.zeros(1000, 874, dtype=torch.int64),)
eager_outputs = model(*inputs)

ep = torch.export.export(model.eval(), inputs)
print(ep)

# Lower to ExecuTorch with the CoreML partitioner.
lowered = to_edge_transform_and_lower(
    ep,
    partitioner=[CoreMLPartitioner()],
    compile_config=EdgeCompileConfig(_check_ir_validity=False),
).to_executorch()
print(lowered.exported_program())

et_model = _load_for_executorch_from_buffer(lowered.buffer)
et_outputs = et_model([*inputs])[0]

flat = et_outputs.flatten()
argmax = torch.argmax(flat)
print(f"Argmax: {argmax}")
print(f"Val: {flat[argmax]}")  # 100 is out of range for randint(0, 100)
print(f"{et_outputs[1]}")  # In my specific case, the bad value was here. See outputs below.
```
Output:
```
Argmax: 874
Val: 100
tensor([100, 39, 6, 20, 63, 69, 10, 22, 55, 39, 86, 26, 29, 40,
        40, 63, 24, 17, 31, 26, 33, 22, 36, 16, 47, 30, 86, 59,
        19, 0, 52, 64, 23, 59, 29, 49, 32, 35, 29, 89, 3, 64,
        91, 93, 86, 95, 69, 19, 7, 85, 60, 13, 57, 25, 9, 14,
        94, 86, 77, 57, 8, 52, 80, 80, 20, 42, 68, 94, 1, 87,
        ...
```
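A stricter check than the argmax above is to scan the entire output for out-of-range values. This sketch continues from the repro script's `et_outputs` and assumes the same [0, 100) range:

```python
# Count every value that falls outside randint's half-open range [0, 100).
oob_mask = (et_outputs < 0) | (et_outputs >= 100)
print(f"Out-of-range values: {int(oob_mask.sum())} of {et_outputs.numel()}")
print(et_outputs[oob_mask])  # should be empty; any 100s here demonstrate the bug

# Possible interim workaround (an untested assumption, not a real fix): clamp
# the backend output back into the intended range.
safe_outputs = torch.clamp(et_outputs, 0, 99)
```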
Versions
- coremltools 8.3
- executorch commit 67b6009 (Jun 14)