
[WIP] Add MobileNet Feature Extractor #301

Draft
JunhaoZhu0220 wants to merge 15 commits into main from feature-extractor

Conversation

@JunhaoZhu0220 JunhaoZhu0220 commented Mar 3, 2026

This PR implements the FeatureExtractor module, adapted from the ml5.js library (v0.12.2). It follows the original API design and behavior as defined in that release, enabling transfer learning on top of a pre-trained MobileNet model for both classification and regression tasks.

Changes

  • Refactored model loading logic: Instead of loading the full MobileNet and truncating inference at a specific layer (which varies between MobileNet v1 and v2), the new approach uses the FeatureVector variant of MobileNet directly. Its output is then fed into an MLP for downstream training, making the pipeline cleaner and more version-agnostic.

  • Fixed MLP input shape mismatch: The previous implementation hardcoded the MLP input shape based on MobileNet v1 with alpha=1, without accounting for the alpha hyperparameter that scales filter sizes across the network. This fix dynamically resolves the correct input shape based on the chosen alpha value.
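The alpha-dependent shape resolution described in the second bullet can be sketched as follows. This is a minimal illustration, assuming MobileNet v1's final feature length scales as 1024 × alpha while v2 keeps a fixed 1280-wide vector for alpha ≤ 1; `resolveFeatureLength` is a hypothetical helper, not code from this PR:

```javascript
// Hypothetical helper: resolve the MLP input shape from the MobileNet
// version and alpha, instead of hardcoding it for v1 with alpha = 1.
function resolveFeatureLength(version, alpha) {
  if (version === 1) {
    // v1's last layer has 1024 filters, scaled linearly by alpha
    return Math.round(1024 * alpha);
  }
  // v2 keeps a fixed 1280-wide feature vector for alpha <= 1
  return 1280;
}

console.log(resolveFeatureLength(1, 0.25)); // 256
console.log(resolveFeatureLength(1, 1.0));  // 1024
```

The point is simply that the MLP's first layer must be sized at load time from the chosen variant, not baked in.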

TODO

  • Webcam Support — Currently, the feature extractor only supports static image uploads. Add real-time webcam input so users can perform inference directly from their computer camera.

  • Usage Examples — Design example demos to demonstrate the feature extractor in practice. I will draw inspiration from @shiffman's classification tutorial and regression tutorial. Open to suggestions on example design!

  • Model Save & Load — Add the ability to save a trained model to disk and reload it in a later session, avoiding the need to retrain from scratch each time.

  • Cosine Similarity Utility & Example — Implement a cosine similarity utility and a related demo.

  • Documentation

@JunhaoZhu0220 JunhaoZhu0220 marked this pull request as draft March 3, 2026 10:09
@shiffman
Member

shiffman commented Mar 3, 2026

So excited about this! In addition to the examples I previously made regarding training your own classifier, some examples using cosine similarity could be good! (and perhaps we want to build into ml5 a cosine similarity utility function). Last year I made a demo that matches your webcam image to the most similar image. It has some unfriendly to beginners code, but could be a model for an ml5 version.

https://editor.p5js.org/ml_4_cc/sketches/CWE6Ox_jd

I am hoping to make some videos about this in the future.

@JunhaoZhu0220
Author

Two newly implemented demos have been pushed to the feature-extractor branch:

  1. featureExtractor-webcam-classifier — Classify between two categories (pens and phones) using the webcam.

  2. featureExtractor-webcam-regressor — Control the size of a circle by moving your face closer to or farther from the camera.

Feel free to check them out!

trainButton.mousePressed(function () {
  classifier.train().then(function () {
    console.log("Starting classification...");
    classifyVideo();
  });
});
Member

Take a look at the new ml5.js startClassify() pattern in other examples. This eliminates the need for the "recursive" call to classifyVideo()!
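For illustration, the difference can be mocked without ml5. `MockClassifier` below is purely illustrative (the real pattern lives in ml5's `classifyStart()`-style methods): the library keeps the loop going internally, so the user's callback no longer re-invokes itself.

```javascript
// A minimal mock showing why a classifyStart()-style API removes the
// need for the recursive classifyVideo() call: the loop lives inside
// the library. MockClassifier is purely illustrative, not ml5 code.
class MockClassifier {
  constructor() {
    this.running = false;
  }
  classifyStart(input, callback, frames = 3) {
    this.running = true;
    let count = 0;
    const step = () => {
      if (!this.running || count >= frames) return;
      count += 1;
      callback([{ label: "pen", confidence: 0.9 }]);
      step(); // scheduled internally; user code never re-calls classify
    };
    step();
  }
  classifyStop() {
    this.running = false;
  }
}

const labels = [];
new MockClassifier().classifyStart("video", (r) => labels.push(r[0].label));
console.log(labels.length); // 3
```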

Author

New classifyStart() and predictStart() methods have been implemented in 17afa84.

// Initialize the feature extractor
featureExtractor = ml5.featureExtractor({ epochs: 100 }, modelReady);
// Create a new classifier using those features and with a video element
classifier = featureExtractor.classification(video, videoReady);
Member

This matches exactly how we implemented the API in ml5.js prior to 1.0. @gohai I think this could be up for discussion. Do we like this methodology of first loading the extractor and then "turning it into" a classifier? Another option would be to use ml5.neuralNetwork() in conjunction with the feature extractor. This might involve more code, but could be better from a pedagogical standpoint. Would love your thoughts!

Author

For the loading methodology, I have an idea of introducing a task property that accepts two options: classification and regression. With this approach, loading a classifier or regressor would look like:

feClassifier = ml5.featureExtractor({ task: 'classification' }, modelReady);
feRegressor = ml5.featureExtractor({ task: 'regression' }, modelReady);

Member

The two-step process of creating an extractor and then using it to create a classifier does indeed seem quite confusing to me. I would prefer a single-line solution, similar to what @JunhaoZhu0220 is proposing.

Author

I have implemented my design in e119244, so now

feClassifier = ml5.featureExtractor({ task: 'classification' }, modelReady);
feRegressor = ml5.featureExtractor({ task: 'regression' }, modelReady);

is the new way to initialize the model as shown in the two demos.

Member

I think that's great. Users now get the feature extractor right away, and the task property mirrors the NeuralNetwork class.

(Leaving for others to chime in)

@JunhaoZhu0220
Author

> So excited about this! In addition to the examples I previously made regarding training your own classifier, some examples using cosine similarity could be good! (and perhaps we want to build into ml5 a cosine similarity utility function). Last year I made a demo that matches your webcam image to the most similar image. It has some unfriendly to beginners code, but could be a model for an ml5 version.
>
> https://editor.p5js.org/ml_4_cc/sketches/CWE6Ox_jd
>
> I am hoping to make some videos about this in the future.

@shiffman @gohai For the cosine similarity utility, one demo that came to mind is handwritten digit recognition. Inspired by Yann LeCun et al.'s MNIST database, the feature extractor would first obtain feature embeddings for reference images of digits 0–9, then compute the embedding for a webcam frame in which the user holds up a handwritten number on paper. The digit is identified by finding the reference embedding with the highest cosine similarity.

Please give me some feedback on how you feel about this idea!
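A cosine-similarity utility of the kind discussed here could be as small as the sketch below. Both function names are suggestions, not an existing ml5 API:

```javascript
// A small cosine-similarity utility: 1 means identical direction,
// 0 means orthogonal (i.e. maximally dissimilar embeddings).
function cosineSimilarity(a, b) {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Nearest-reference lookup, as in the digit-recognition idea: return
// the index of the reference embedding most similar to the query.
function mostSimilarIndex(query, references) {
  let best = -1;
  let bestScore = -Infinity;
  references.forEach((ref, i) => {
    const score = cosineSimilarity(query, ref);
    if (score > bestScore) {
      bestScore = score;
      best = i;
    }
  });
  return best;
}
```

In the demo, `references` would hold the embeddings of the digit images 0–9 and `query` the embedding of the current webcam frame.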

@@ -0,0 +1,69 @@
let featureExtractor;
Member

Thank you for this nice example inspired by @shiffman's video, @JunhaoZhu0220! 🤗

Since "pen" and "phone" are still fairly generic classes, and one might need to know that they aren't part of MobileNet to fully understand and appreciate this functionality: wondering if making both buttons contenteditable might be an idea worth trying out, in your opinion? (the user would be prompted to define their own classes, hit "done", and then continue in the way the example works currently)

Author

@gohai I do think this is a great idea to give users the flexibility of editing the classes on their own! The interface in aa53f36 looks like
[image]
in which the two classes default to Class 1 and Class 2. If the user inputs customized class names in the text box, the labels will change accordingly.

Member

This is already much more friendly, @JunhaoZhu0220!

What I meant with contenteditable, and what you might still want to try to see if you like it, is the HTML attribute of the same name, which allows users to type in the button directly. (It might make the p5 code shorter, but I haven't tested it myself.)

e.g.
Screenshot 2026-03-21 at 3 45 14 PM
Screenshot 2026-03-21 at 3 45 49 PM
Screenshot 2026-03-21 at 3 47 45 PM

(and if the user clicks "Done", then it might remain "Class #1" and "Class #2")
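The fallback behavior described here (keeping the default label when the user leaves the button unedited) can be sketched as plain logic, independent of p5. `resolveLabel` is a hypothetical helper, and the `contentEditable` line is the untested idea from the comment above:

```javascript
// Hypothetical helper: use the text the user typed into the
// contenteditable button, falling back to the default class name.
function resolveLabel(buttonText, fallback) {
  const trimmed = buttonText.trim();
  return trimmed.length > 0 ? trimmed : fallback;
}

// In p5, the button could be made editable via the plain HTML
// attribute (untested, as noted above): button.elt.contentEditable = true;
console.log(resolveLabel("happy", "Class #1")); // "happy"
console.log(resolveLabel("   ", "Class #1"));   // "Class #1"
```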

let sampleCount = 0;
let predictedValue = 0;

function modelReady() {
Member

Small nitpick: defining those functions in the order we expect them to be called (e.g. preload, then setup, etc.) might make it slightly easier to read the sketch top-to-bottom.


function preload() {
  // Initialize the feature extractor for regression
  feRegressor = ml5.featureExtractor({ task: 'regression', version: 2 }, modelReady);
}
Member

Is the version: 2 here necessary for it to function? (If yes, is there a drawback to changing our default from 1 to 2?)

Author

@JunhaoZhu0220 JunhaoZhu0220 Mar 21, 2026

The reason I passed version: 2 is that the regression task differs from what MobileNet was originally designed for, so we might need a stronger (newer) model that generalizes the feature extraction better 🤗
If this introduces confusion for users, I can remove the option and use the default version: 1, while noting in the documentation that version: 2 brings better performance.

Member

Curious if changing the default to version 2 could be an option? Does this have downsides for the classification task perhaps?

Author

I believe version 2 also works perfectly for classification tasks. (Will change the default version to 2 in the next commit.)
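Internally, the proposed default change could look something like this (an assumed option shape, not the actual commit):

```javascript
// Hypothetical option resolution: version 2 becomes the default,
// while version: 1 remains available as an explicit opt-in.
function resolveVersion(options = {}) {
  return options.version === 1 ? 1 : 2;
}

console.log(resolveVersion({}));             // 2
console.log(resolveVersion({ version: 1 })); // 1
```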

video.hide();
background(0);
// Set the video as the input for the Classifier
feClassifier.video = video;
Member

It is probably nicer to have a dedicated method for setting the input here (which might be a video, but possibly also an image, canvas...) 🤔 Something to think about together with @shiffman at some convenient time.

For context: previously, the video input got passed as an argument to the constructor, but since the feature extractor now gets created in preload(), we typically don't have the video element yet at this point.
Our other models, such as bodyPose, work around this by taking video as an argument to e.g. detectStart(). We could do the same, and require video as an argument to both addImage() and classifyStart(). Or, pass the source to the feature extractor at one point in time only - similarly to how @JunhaoZhu0220 is doing here.
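A dedicated setter, one of the options outlined above, might look like the sketch below. The class and method names are hypothetical, not part of the PR:

```javascript
// Hypothetical sketch of a dedicated input setter that accepts any
// drawable source (video, image, or canvas) at a single point in time,
// instead of assigning feClassifier.video directly.
class FeatureExtractorStub {
  setInput(source) {
    this.input = source;
    return this; // allow chaining, e.g. fe.setInput(video).classifyStart(cb)
  }
}

const fe = new FeatureExtractorStub();
fe.setInput("videoElement");
console.log(fe.input); // "videoElement"
```

Returning `this` keeps the one-line initialization style the task-based API already established.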
