Model
How to improve a model's compatibility?
A model should always run, no matter how slowly.
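In practice, compatibility comes from graceful fallback: if a preferred device is unavailable, the model should still run on a slower one. A minimal sketch, assuming the `deviceType` member of `MLContextOptions` from the WebNN spec draft (exact option names may differ across browsers):

```ts
// Try faster devices first, but fall back to CPU so the model
// always runs, just more slowly.
async function createContextWithFallback(): Promise<any> {
  const ml = (navigator as any).ml; // WebNN types are not yet in lib.dom.d.ts
  for (const deviceType of ["npu", "gpu", "cpu"]) {
    try {
      return await ml.createContext({ deviceType });
    } catch {
      // This device type is unavailable here; try the next one.
    }
  }
  throw new Error("WebNN is not available in this browser");
}
```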
What is the recommended model size for WebNN applications?
The optimal model size for a WebNN application depends on your users’ network conditions and device capabilities. WebNN imposes no strict size limit, but larger models increase download times and slow initial load. Consider these Web platform features to optimize performance:
- Use service workers to enable offline functionality and cache models
- Leverage the Cache API or the Origin Private File System (OPFS) to store models locally, reducing subsequent load times (see the sketch after this list)
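For example, a loader can check the Cache API before hitting the network. This is a minimal sketch; the model URL and cache name are placeholders, and OPFS via `navigator.storage.getDirectory()` is an equivalent alternative:

```ts
const MODEL_URL = "/models/model.onnx"; // hypothetical path
const CACHE_NAME = "model-cache-v1";

// Return the model bytes, downloading only on the first visit.
async function loadModelBytes(url: string = MODEL_URL): Promise<ArrayBuffer> {
  const cache = await caches.open(CACHE_NAME);
  let response = await cache.match(url);
  if (!response) {
    response = await fetch(url);
    // Store a copy so subsequent loads skip the network entirely.
    await cache.put(url, response.clone());
  }
  return response.arrayBuffer();
}
```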
What are the model size limitations of ONNX Runtime Web?
ONNX Runtime Web models range from a few kilobytes to several gigabytes. The ONNX model itself, serialized in protobuf format, has a maximum size of 2 GB, and ONNX Runtime Web can run models up to 4 GB in size.
See Also: Working with Large Models 
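To run a model whose weights push past the 2 GB protobuf limit, the weights can be stored as external data alongside the graph. A hedged sketch, assuming the `externalData` session option of onnxruntime-web; file names are placeholders:

```ts
import * as ort from "onnxruntime-web";

// The .onnx file holds the graph (under the 2 GB protobuf limit);
// the bulk of the weights live in a separate external data file.
async function createSession(): Promise<ort.InferenceSession> {
  return ort.InferenceSession.create("model.onnx", {
    externalData: [
      { data: "model.onnx.data", path: "model.onnx.data" }, // hypothetical names
    ],
    executionProviders: ["webnn"],
  });
}
```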
Why not define a model format that covers both topology and weights?
The WebNN API intentionally does not define or load model formats; format handling is delegated to frameworks and applications. This keeps the API flexible and lets frameworks implement their own model loading approaches.
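Concretely, an application fetches weight bytes however it likes and rebuilds the topology through `MLGraphBuilder`; WebNN itself never parses a file. A hedged sketch, with placeholder shapes, a hypothetical weight URL, and descriptor field names from the spec draft:

```ts
// The application owns the "format": here the weights are raw
// float32 bytes fetched from a hypothetical URL.
const ml = (navigator as any).ml; // WebNN types not yet in lib.dom.d.ts
const context = await ml.createContext();
const builder = new (window as any).MLGraphBuilder(context);

const weightBytes = new Float32Array(
  await (await fetch("/weights/fc.bin")).arrayBuffer() // hypothetical file
);

// Topology is rebuilt call by call rather than parsed from a file.
const input = builder.input("x", { dataType: "float32", shape: [1, 4] });
const weights = builder.constant({ dataType: "float32", shape: [4, 2] }, weightBytes);
const output = builder.relu(builder.matmul(input, weights));
const graph = await builder.build({ y: output });
```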
Are some models restricted to a specific backend or hardware?
See the operation and hardware support details for the LiteRT, DirectML, and Core ML backends.
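Support can also be probed at runtime. A minimal sketch, assuming the `MLContext.opSupportLimits()` method from the WebNN spec draft; the exact shape of the returned dictionary may evolve:

```ts
const context = await (navigator as any).ml.createContext();
// Per-op limits describe which data types each operation accepts
// on the backend behind this context.
const limits = context.opSupportLimits();
console.log(limits.conv2d?.input?.dataTypes); // e.g. ["float32", "float16"]
```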