Optional `subfolder` if model repository contains one ONNX model behind a subfolder #2008
Labels
onnx
Related to the ONNX export
Hello!
The Quirk
I've noticed some interesting behaviour, and I think there's a chance that it's unintended. Let's start with this snippet:
Perhaps surprisingly, perhaps not, this fails:
This file indeed does not exist, there is only a `model.onnx` in an `onnx` subfolder: https://huggingface.co/BAAI/bge-small-en-v1.5/tree/main
When the `file_name` is not specified, such as in the above snippet, then the `from_pretrained` call will try and infer it:
optimum/optimum/onnxruntime/modeling_ort.py
Lines 509 to 529 in 8cb6832
In our case, we take the else branch (as the model is remote):
optimum/optimum/onnxruntime/modeling_ort.py
Lines 513 to 519 in 8cb6832
Here, `repo_files` is:
which leads to an `onnx_files` of:
This bypasses the `if len(...) == 0` and `if len(...) > 1` errors, and sets `file_name` as `onnx_files[0].name`, i.e. `"model.onnx"`.
This then fails when actually loading the model, because there is no `"model.onnx"` in the root of the repository, whereas we can be quite sure that the user intended to load this ONNX model. Instead, we currently require that the user specifies either `subfolder="onnx"` or `file_name="onnx/model.onnx"`.
Potential Fixes
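For context on both fixes, the quirk appears to come from how `Path.match` treats a relative pattern: it matches from the right, so `"*.onnx"` also matches files inside subfolders. Here is a minimal sketch that reproduces this in isolation. The `repo_files` list is a hypothetical stand-in, not the real repository listing, and `PurePosixPath` is used only to keep the example platform-independent.

```python
from pathlib import PurePosixPath

# Hypothetical repository listing (an assumption, not the actual file list).
repo_files = ["README.md", "config.json", "onnx/model.onnx"]

# Same filtering idea as the inference code: a relative pattern is matched
# from the right, so a file in a subfolder still matches "*.onnx".
paths = [PurePosixPath(f) for f in repo_files]
onnx_files = [p for p in paths if p.match("*.onnx")]

# Taking only .name discards the "onnx/" prefix, which is where the
# subsequent lookup in the repository root goes wrong.
file_name = onnx_files[0].name
print(onnx_files, file_name)
```

Exactly one file matches, so neither error branch triggers, but the inferred `file_name` no longer says where the file lives.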
Fix A
This would work in the normal cases as well as when the only ONNX file is in a subfolder. The `relative_to` means that it'll also work if a `subfolder` was provided. There might still be some missed edge cases.
The downside is that this results in the following warning:
Fix B
  if file_name is None:
      if model_path.is_dir():
          onnx_files = list(model_path.glob("*.onnx"))
      else:
          repo_files, _ = TasksManager.get_model_files(
              model_id, revision=revision, cache_dir=cache_dir, token=token
          )
          repo_files = map(Path, repo_files)

          pattern = "*.onnx" if subfolder == "" else f"{subfolder}/*.onnx"
          onnx_files = [p for p in repo_files if p.match(pattern)]

      if len(onnx_files) == 0:
          raise FileNotFoundError(f"Could not find any ONNX model file in {model_path}")
      elif len(onnx_files) > 1:
          raise RuntimeError(
              f"Too many ONNX model files were found in {model_path}, specify which one to load by using the "
              "file_name argument."
          )
      else:
          file_name = onnx_files[0].name
+         subfolder = onnx_files[0].parent.as_posix()
This overrides/sets the subfolder so that we load e.g. `model.onnx` from whatever subfolder it exists in. There might still be some missed edge cases.
Will you consider a fix for this quirk?
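For reference, here is a rough standalone sketch of the Fix B inference for the remote case. The helper name `infer_onnx_location` and the example file lists are hypothetical and not part of optimum. It only illustrates the idea of returning the subfolder together with the file name.

```python
from pathlib import PurePosixPath


def infer_onnx_location(repo_files, subfolder=""):
    """Hypothetical helper mirroring the Fix B logic; not optimum's API."""
    pattern = "*.onnx" if subfolder == "" else f"{subfolder}/*.onnx"
    onnx_files = [p for p in map(PurePosixPath, repo_files) if p.match(pattern)]

    if len(onnx_files) == 0:
        raise FileNotFoundError("Could not find any ONNX model file")
    if len(onnx_files) > 1:
        raise RuntimeError("Too many ONNX model files were found")

    # Return the parent folder alongside the name so the path is not lost.
    return onnx_files[0].name, onnx_files[0].parent.as_posix()


# Hypothetical listings: one model in a subfolder, one in the root.
nested = infer_onnx_location(["config.json", "onnx/model.onnx"])
root = infer_onnx_location(["config.json", "model.onnx"])
print(nested, root)
```

One possible edge case: for a file in the repository root, `parent.as_posix()` gives `"."` rather than `""`, so the result may need normalising before it is used as `subfolder`.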