Image Classification

This tutorial demonstrates how to implement an image classification system using WebNN and ONNX Runtime Web, leveraging on-device GPU and NPU acceleration. You’ll use the open-source MobileNetV2 model from Hugging Face to classify images.

Step 0: WebNN API Setup Requirements

System Requirements

Ensure your system meets the following prerequisites for WebNN API development:

Operating System: Windows (latest version recommended)
Browser: Microsoft Edge Dev
Hardware: GPU and NPU with up-to-date drivers supporting WebNN
Software: Install VS Code and the Live Server extension

Enable WebNN API in Edge Dev

Download and install Microsoft Edge Dev
Launch Edge Dev, and navigate to about:flags in the address bar
Search for WebNN API, click the dropdown, and set it to Enabled
Restart the browser when prompted
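
After restarting, you can confirm from script that the browser exposes WebNN. The sketch below assumes the navigator.ml.createContext entry point defined by the WebNN specification. isWebNNAvailable is a hypothetical helper name, and it only checks that the API is present, not that a particular GPU or NPU can be used.

```javascript
// Hypothetical helper: returns true when the given navigator-like object
// exposes the WebNN entry point (navigator.ml.createContext).
function isWebNNAvailable(nav = globalThis.navigator) {
  return Boolean(nav && nav.ml && typeof nav.ml.createContext === "function");
}

console.log(
  isWebNNAvailable()
    ? "WebNN API detected."
    : "WebNN API not found; check the WebNN API flag in Edge Dev."
);
```

Run it from the DevTools console or from a script on your page. Passing the navigator object as a parameter keeps the check easy to exercise outside a browser.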

Step 1: Initialize the Web App

Create a new index.html file and add the standard HTML boilerplate code to your page.


<!DOCTYPE html>
<html lang="en">
  <head>
    <meta charset="UTF-8">
    <meta name="viewport" content="width=device-width, initial-scale=1.0">
    <title>My Website</title>
  </head>
  <body>
    <main>
        <h1>Welcome to My Website</h1>
    </main>
  </body>
</html>

Include the Jimp and Lodash library source files in index.html’s <head> tag.


<script src="https://cdnjs.cloudflare.com/ajax/libs/jimp/0.22.12/jimp.min.js" integrity="sha512-8xrUum7qKj8xbiUrOzDEJL5uLjpSIMxVevAM5pvBroaxJnxJGFsKaohQPmlzQP8rEoAxrAujWttTnx3AMgGIww==" crossorigin="anonymous" referrerpolicy="no-referrer"></script>
<script src="https://cdn.jsdelivr.net/npm/lodash@4.17.21/lodash.min.js"></script>

Verify your setup by clicking the Go Live button in the bottom-right corner of VS Code, which launches a local server in Edge Dev.
Create a new main.js file to contain your application’s JavaScript code.
Create an images subfolder in your project directory and add an image file (for this demo, name it image.jpg).
Download the mobilenetv2-10_fp16.onnx model and save it in your project’s root folder. Users in the PRC can download the model here.
Download the imagenetClasses.js file, which provides 1000 standard image classifications for your model.

Step 2: Add UI Elements and Parent Function

Within the <main> HTML tags, replace the existing content with elements to create a classification button and display a default image.


<h1>Image Classification Demo!</h1> 
<div><img src="./images/image.jpg"></div> 
<button onclick="classifyImage('./images/image.jpg')"  type="button">Click Me to Classify Image!</button> 
<h1 id="outputText"> This image displayed is ... </h1>

Add your scripts and ONNX Runtime Web to the page by inserting the following script tags in the <head> section.


<script src="./main.js"></script> 
<script src="./imagenetClasses.js"></script>
<script src="https://cdn.jsdelivr.net/npm/onnxruntime-web@1.21.0-dev.20250306-e0b66cad28/dist/ort.all.min.js"></script>

Open main.js and add the initial code snippet.


async function classifyImage(pathToImage){ 
  const imageTensor = await getImageTensorFromPath(pathToImage); // Convert image to a tensor
  const predictions = await runModel(imageTensor); // Run inference on the tensor
  console.log(predictions); // Print predictions to console
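  // Display the top prediction. Assigning (rather than appending) keeps repeated clicks from stacking results.
  document.getElementById("outputText").textContent = "This image displayed is ... " + predictions[0].name;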
}
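
The predictions array holds objects with name and probability fields (built by imagenetClassesTopK in Step 5), so you may want to show more than the top name. The following is a minimal sketch of one way to format them. formatPrediction is a hypothetical helper, and the sample values are made up, not real model output.

```javascript
// Hypothetical helper: turn one prediction entry into display text.
// Assumes each entry has `name` (string) and `probability` (number in 0..1).
function formatPrediction(prediction) {
  const percent = (prediction.probability * 100).toFixed(1);
  return `${prediction.name} (${percent}%)`;
}

// Made-up sample entries shaped like the objects runModel returns.
const samplePredictions = [
  { name: "tabby cat", probability: 0.5 },
  { name: "tiger cat", probability: 0.25 },
];
console.log(samplePredictions.map(formatPrediction).join(", "));
```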

Step 3: Pre-process Data

Implement the getImageTensorFromPath function, along with an async helper, loadImagefromPath, that loads and resizes the image.


async function getImageTensorFromPath(path, width = 224, height = 224) {
  const image = await loadImagefromPath(path, width, height); // 1. Load and resize the image
  const imageTensor = imageDataToTensor(image); // 2. Convert to a tensor
  return imageTensor; // 3. Return the tensor
}

async function loadImagefromPath(path, resizedWidth, resizedHeight) {
  // Use Jimp to load the image, then resize it to the model's input size.
  const image = await Jimp.read(path);
  image.resize(resizedWidth, resizedHeight);
  return image.bitmap; // Bitmap with width, height, and RGBA pixel data
}

Create the imageDataToTensor function to convert the loaded image into a tensor compatible with the ONNX model.


function imageDataToTensor(image) {
  const imageBufferData = image.data;
  const pixelCount = image.width * image.height;
  // Float16Array is a recent JavaScript addition; this tutorial assumes the browser provides it.
  const float16Data = new Float16Array(
      3 * pixelCount);  // Allocate enough space for red/green/blue channels.
 
  const mean =  [0.485, 0.456, 0.406];
  const std = [0.229, 0.224, 0.225];
 
  // Loop through the image buffer, extracting the (R, G, B) channels,
  // rearranging from packed channels to planar channels, and converting to
  // floating point.
  for (let i = 0; i < pixelCount; i++) {
    float16Data[pixelCount * 0 + i] =
        (imageBufferData[i * 4 + 0] / 255.0 - mean[0]) / std[0];  // Red
    float16Data[pixelCount * 1 + i] =
        (imageBufferData[i * 4 + 1] / 255.0 - mean[1]) / std[1];  // Green
    float16Data[pixelCount * 2 + i] =
        (imageBufferData[i * 4 + 2] / 255.0 - mean[2]) / std[2];  // Blue
    // Skip the unused alpha channel: imageBufferData[i * 4 + 3].
  }
  let dimensions = [1, 3, image.height, image.width];
  const inputTensor = new ort.Tensor('float16', float16Data, dimensions);
  return inputTensor;
}
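
To make the packed-to-planar rearrangement concrete, here is a small worked sketch on a two-pixel image. It uses the same mean and std values as imageDataToTensor, but stores the result in a Float32Array instead of a Float16Array so it can run in environments without float16 typed arrays. rgbaToPlanar is a hypothetical helper written for illustration only.

```javascript
// Same ImageNet mean/std values as in imageDataToTensor.
const MEAN = [0.485, 0.456, 0.406];
const STD = [0.229, 0.224, 0.225];

// Hypothetical helper: packed RGBA bytes -> planar, normalized RGB floats.
function rgbaToPlanar(rgba, pixelCount) {
  const planar = new Float32Array(3 * pixelCount);
  for (let i = 0; i < pixelCount; i++) {
    for (let c = 0; c < 3; c++) {
      // Channel c of pixel i moves from the packed slot i * 4 + c
      // to plane c, position i. The alpha byte (i * 4 + 3) is skipped.
      planar[c * pixelCount + i] = (rgba[i * 4 + c] / 255 - MEAN[c]) / STD[c];
    }
  }
  return planar;
}

// Two pixels: pure red, then pure blue.
const rgbaPixels = Uint8Array.from([255, 0, 0, 255, 0, 0, 255, 255]);
const planarPixels = rgbaToPlanar(rgbaPixels, 2);
// Resulting layout is [R0, R1, G0, G1, B0, B1].
console.log(Array.from(planarPixels, v => v.toFixed(3)));
```

All red values come first, then all green, then all blue, which matches the [1, 3, height, width] dimensions passed to ort.Tensor.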

Step 4: Call ONNX Runtime Web

With image retrieval and tensor conversion complete, use the ONNX Runtime Web library to run your model. Enabling WebNN is straightforward: list webnn in the executionProviders session option.


let modelSession;
 
async function runModel(preprocessedData) {
  // Configure WebNN.
  const modelPath = "./mobilenetv2-10_fp16.onnx";
  const devicePreference = "gpu"; // Other options include "npu" and "cpu".
  const options = {
    executionProviders: [{ name: "webnn", deviceType: devicePreference, powerPreference: "default" }],
    // The key names in freeDimensionOverrides should match the real input dimension names in the model.
    // For example, if the model's only free dimension is batch_size, you only need to set that key.
    freeDimensionOverrides: { "batch_size": 1 }
  };
  // Create the session on the first call and reuse it on later calls.
  modelSession ??= await ort.InferenceSession.create(modelPath, options);
 
  // Create feeds with the input name from model export and the preprocessed data. 
  const feeds = {}; 
  feeds[modelSession.inputNames[0]] = preprocessedData; 
  // Run the session inference.
  const outputData = await modelSession.run(feeds); 
  // Get output results with the output name from the model export. 
  const output = outputData[modelSession.outputNames[0]]; 
  // Get the softmax of the output data. The softmax transforms values to be between 0 and 1.
  const outputSoftmax = softmax(Array.prototype.slice.call(output.data)); 
  // Get the top 5 results.
  const results = imagenetClassesTopK(outputSoftmax, 5);
 
  return results; 
}
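
Because deviceType can be "gpu", "npu", or "cpu", one option is to try several device types in order. The sketch below is an assumption-laden illustration, not part of ONNX Runtime Web: createSessionWithFallback is a hypothetical helper, and it assumes that session creation rejects when the requested device is unavailable. The session factory is passed in as a parameter so the logic can be exercised without a browser.

```javascript
// Hypothetical helper: try each WebNN device type in order and return the
// first session that is created successfully, along with the device used.
// `createSession` is injected; in the page you would pass
// (path, opts) => ort.InferenceSession.create(path, opts).
async function createSessionWithFallback(
  createSession,
  modelPath,
  deviceTypes = ["npu", "gpu", "cpu"]
) {
  let lastError;
  for (const deviceType of deviceTypes) {
    try {
      const session = await createSession(modelPath, {
        executionProviders: [{ name: "webnn", deviceType, powerPreference: "default" }],
        freeDimensionOverrides: { batch_size: 1 },
      });
      return { session, deviceType };
    } catch (error) {
      lastError = error; // Remember the failure and try the next device type.
    }
  }
  throw new Error("No WebNN device type could create a session", { cause: lastError });
}
```

In runModel, this could stand in for the direct ort.InferenceSession.create call, for example: const { session } = await createSessionWithFallback((p, o) => ort.InferenceSession.create(p, o), modelPath);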

Step 5: Post-process Data

Add a softmax function to transform model outputs into probabilities between 0 and 1 that sum to 1, and an imagenetClassesTopK function to pick the most likely image classifications. Add both functions to main.js.


// The `softmax` transforms values to be between 0 and 1.
function softmax(resultArray) {
  // Get the largest value in the array.
  const largestNumber = Math.max(...resultArray);
  // Subtract the largest value from each item before exponentiating (this avoids overflow),
  // then sum all of the exponentials.
  const sumOfExp = resultArray 
    .map(resultItem => Math.exp(resultItem - largestNumber)) 
    .reduce((prevNumber, currentNumber) => prevNumber + currentNumber);
 
  // Normalize the resultArray by dividing by the sum of all exponentials.
  // This normalization ensures that the sum of the components of the output vector is 1.
  return resultArray.map(resultValue => Math.exp(resultValue - largestNumber) / sumOfExp);
}
 
function imagenetClassesTopK(classProbabilities, k = 5) { 
  const probs = _.isTypedArray(classProbabilities)
    ? Array.prototype.slice.call(classProbabilities)
    : classProbabilities;
 
  const sorted = _.reverse(
    _.sortBy(
      probs.map((prob, index) => [prob, index]),
      probIndex => probIndex[0]
    )
  );
 
  const topK = _.take(sorted, k).map(probIndex => {
    const iClass = imagenetClasses[probIndex[1]];
    return {
      id: iClass[0],
      index: probIndex[1],
      name: iClass[1].replace(/_/g, " "),
      probability: probIndex[0]
    };
  });
  return topK;
}
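
The reason softmax subtracts the largest value first can be seen with a small experiment. The sketch below compares a naive version with a max-subtracting version that follows the same approach as the softmax function in this step. Both function names and the input values are made up for illustration.

```javascript
// Naive softmax: exponentiates raw values, so large inputs overflow to Infinity.
function naiveSoftmax(values) {
  const exps = values.map(v => Math.exp(v));
  const sum = exps.reduce((a, b) => a + b, 0);
  return exps.map(e => e / sum);
}

// Stable softmax: subtract the maximum before exponentiating.
function stableSoftmax(values) {
  const max = Math.max(...values);
  const exps = values.map(v => Math.exp(v - max));
  const sum = exps.reduce((a, b) => a + b, 0);
  return exps.map(e => e / sum);
}

const largeLogits = [1000, 1001, 1002]; // Made-up values chosen to trigger overflow.
console.log(naiveSoftmax(largeLogits));  // every entry is NaN (Infinity / Infinity)
console.log(stableSoftmax(largeLogits)); // approximately [0.090, 0.245, 0.665]
```

Subtracting the maximum does not change the result mathematically, because the common factor cancels between numerator and denominator. It only keeps the intermediate exponentials within floating-point range.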

Playground

You can try Image Classification in the Playground.

You’ve now implemented the complete script for WebNN-powered image classification. Use VS Code’s Live Server extension to launch and test your web application.