Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

BodySegmentation: Add raw values of the detected body parts to result #139

Merged
merged 6 commits into from
Jun 20, 2024

Conversation

gohai
Copy link
Member

@gohai gohai commented Jun 18, 2024

@MOQN brought this up during our meeting today: He suggested requested to have some way of accessing the raw values (detected class ids per pixel) - so that so that the users of the library don't have to derive those from the color values of the (multi-colored) mask.

I sketched out two options here - but I'd think we should only pick one (or none) of them:

  • .values: an array of numbers, one per pixel of the input image
  • .valuesImageData: the image data as the the tf model is returning it (this has the value stored in the red channel, so one would want to look at only every fourth entry in the array)

Open questions:

  • Do we need to handle the case where segmentation.length is zero?

@shiffman @ziyuan-linn @MOQN 🙏

@gohai gohai requested review from MOQN, shiffman and ziyuan-linn June 18, 2024 15:20
for (let x=0; x < video.width; x += gridSize) {
for (let y=0; y < video.height; y += gridSize) {
// torsoFront is e.g. 12
if (segmentation.values[y * video.width + x] == 12) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will do a more detailed review later, but a quick idea! What about we add some time of enum constants as part of BodySegmentation? so the code could read:

if (segmentation.values[y * video.width + x] == BodySegmentation.RIGHT_FACE) {

or would this make more sense?

if (segmentation.values[y * video.width + x] == ml5.RIGHT_FACE) {

@ziyuan-linn is there anything in other models like this as precedent?

The numbers and names from BodyPix can be in the tfjs-models repo.

(This is important info for the documentation page! cc @MOQN @QuinnHe @alanvww @myrahsa, sorry for so many tags, not sure who is working on this particular model docs!)

Copy link
Member Author

@gohai gohai Jun 18, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thinking that makes a lot of sense for {read,use}ability!

One caveat would that we'd be baking the internals of a particular model in the API, where even tf's BodySegmentation is trying (albeit not very successfully) to keep the interface agnostic... so maybe rather BodySegmentation.RIGHT_FACE, and perhaps those properties only get added with the BodyPix model (and not the SelfieSegmentation one)?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah yes, that makes sense! Would something like BodyPix.RIGHT_FACE help make it more clear? We could also put strings into the array or something instead of integers but that might get tricky to maintain and be prone to error with mispellings, etc.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would this be BodySegmentation.BodyPix.RIGHT_FACE? Or ml5.BodyPix.RIGHT_FACE? (both seem a bit like a mouthful imho 🤔)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, haha, no I think we should only do one of these:

BodySegmentation.RIGHT_FACE
ml5.RIGHT_FACE
BodyPix.RIGHT_FACE

While there could be other models that have different integer values for similar or the same parts later, I think it's ok to leave that for a problem another day?

I might opt for ml5.RIGHT_FACE it's both the fewest characters and also implies that this is something we are doing above and beyond the model's native behavior itself?

I'm not sure! Open to any and all ideas!

Copy link
Member Author

@gohai gohai Jun 19, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I implemented the BodySegmentation.torsoFront avenue for testing below, if anyone wants to give it a spin.

The properties currently get added to the instance whenever the BodyPix model is loaded. In practice this means that one would access it through the user-defined name of the instance variable - which could be short, e.g. just body. No strong opinions either way (also noticing that my JavaScript is quite rusty... please feel welcome to solve this more elegantly).

As for naming: I've been reusing the data from the existing BODYPIX_PALETTE.js file, which has it camel-case. The file isn't used for anything else (anymore), so we could easily change it e.g. to TORSO_FRONT.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Using enum constants is a great idea!! I would vote for BodyPix.RIGHT_FACE. I will elaborate on it in the following comment.

gohai added a commit that referenced this pull request Jun 18, 2024
…o BodySegmentation object

Prototype, as suggested by @shiffman in #139.
@gohai gohai force-pushed the bodysegmentation-values branch from 2cc07f6 to dea5395 Compare June 18, 2024 17:00
@MOQN
Copy link
Member

MOQN commented Jun 19, 2024

@gohai thank you so much for implementing this and sorry for my delayed response!

It looks great and I have one minor suggestion. How about we call them this way?

  • .values to .data
  • .valuesImageData to .imageData

gohai added a commit that referenced this pull request Jun 19, 2024
…o BodySegmentation object

Prototype, as suggested by @shiffman in #139.
@gohai gohai force-pushed the bodysegmentation-values branch from dea5395 to e3253f2 Compare June 19, 2024 14:40
@gohai
Copy link
Member Author

gohai commented Jun 19, 2024

Thanks @MOQN for this great suggestion. I pushed the change. Curious: would you think there is a reason to expose .imageData?

  • .values to .data
  • .valuesImageData to .imageData

@MOQN
Copy link
Member

MOQN commented Jun 19, 2024

I believe we can also apply enum constants for other models. However, using ml5.WRIST might cause confusion between BodyPose and HandPose. BodyPose has left_wrist and right_wrist, while HandPose has wrist.

Also BodyPose.LEFT_WRIST is not ideal because it will lead to conflicts between BodyPose: BlazePose and BodyPose: MoveNet.

Thus, I vote for SpecificModel.PART_NAME.

BodyPix.RIGHT_FACE
BodySegmentation.BODY
BodySegmentation.BACKGROUND
MoveNet.LEFT_WRIST
BlazePose.LEFT_WRIST
HandPose.WRIST

@MOQN
Copy link
Member

MOQN commented Jun 19, 2024

Curious: would you think there is a reason to expose .imageData?

I think it's good to have! It can be used for a super quick demo. I remember that TensorFlow's BodyPix's imageData can be converted to a binary image using toBinaryMask. If we implement this as well, it can be used for masking a certain body part. (I don't think this is necessary at this moment!)

@shiffman
Copy link
Member

Thus, I vote for SpecificModel.PART_NAME.

I like this proposal from @MOQN!

@gohai
Copy link
Member Author

gohai commented Jun 19, 2024

I am not sure what the proper way is to expose additional top-level objects. (SpecificModel would be on the same level as ml5...) @ziyuan-linn Do you know?
(I'm assuming that adding it to the window object is not the way to go.)

@shiffman
Copy link
Member

Ah, right, the proper way to do this is in fact ml5.BodyPix.RIGHT_WRIST unlike p5 we are namespacing everything rather than using the global window object. It is very long and wordy, but maybe we should in fact just do:

ml5.BodyPix.RIGHT_FACE
ml5.MoveNet.LEFT_WRIST
// etc.

Otherwise, yes, we can consider adding more keywords to the global namespace?

@gohai
Copy link
Member Author

gohai commented Jun 19, 2024

An alternative could be adding it to the instance object after loading a specific model. It's slightly less explicit than your and Moon's suggested top-level class for each model:

let bodySegmentation;
bodySegmentation = ml5.bodySegmentation("BodyPix", options);
// ...
if (result.data[...] == bodySegmentation.TORSO_FRONT) {

or, if we were to rename the (example) instance variables after the actual model used

let bodyPix;
bodyPix = ml5.bodySegmentation("BodyPix", options);
// ...
if (result.data[...] == bodyPix.TORSO_FRONT) {

@ziyuan-linn
Copy link
Member

Sorry for the late response! The only current model with a similar behavior is BodyPose. The skeleton info is returned by the getSkeleton function on the object instance which returns the skeleton info based on the underlying model used.

let bodyPose;
function preload() {
  bodyPose = ml5.bodyPose();
}

function setup() {
  connections = bodyPose.getSkeleton();
}

If we want to keep it consistent, we could have a bodySegmentation.getPartsId (tentative name) function. Accessing an id would be something like:

let bodySegmentation;
let rightWristId;
function preload() {
  bodySegmentation = ml5.bodySegmentation();
}

function setup() {
  rightWristId = bodySegmentation.getPartsId().RIHGT_WRIST;
}

The getPartsId function would return different values based on whether bodyPix or selfieSegmentation is loaded.

I don't think adding any global variable is a good idea since it might clash with user-defined globals, especially if they are model names.

I'm pretty sure setting ml5.BodyPix.RIGHT_WRIST = 5 directly would cause an undefined property type error. We have to explicitly declare the BodyPix object.

ml5.BodyPix = {};
ml5.BodyPix.RIGHT_WRIST = 5

I don't think this is a good idea either since having the object on ml5 might confuse people about where to access the actual bodyPix model.

I would vote for adding the enums somewhere on the instance object.

@gohai gohai force-pushed the bodysegmentation-values branch from 6687ce5 to 75a5bb4 Compare June 20, 2024 05:44
@gohai
Copy link
Member Author

gohai commented Jun 20, 2024

I cleaned up and tested this series. The only open questions imho are whether the constants should be behind some getPartsId() function, and whether we want to rename the variable names in the examples to the underlying models being used. Other than that I think this would be ready to be included iff people feel like making such an invasive change ahead of the release still 😀


// add constants to the instance variable
this.BACKGROUND = 0;
this.PERSON = 255;
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@MOQN Prefer BODY?

@shiffman
Copy link
Member

Thank you @ziyuan-linn for adding your thoughts, it all makes sense to me! I like the way the examples currently look and though I can see the value of a getPartsId() method, I like the simpler syntax of bodySegmentation.RIGHT_FACE rather than bodySegmentation.getPartsId().RIGHT_FACE. So I feel good about merging as is!

If including an explicit call to a method is important for maintenance and consistency across models, I might suggest writing the examples something along the lines of (seeing how just partsIDs() feels as a method name though maybe that's less clear).

let bodyParts = bodySegmentation.partsIDs();

function draw() {
  background(255);
  image(video, 0, 0);
  if (segmentation) {
    let gridSize = 10;
    for (let x=0; x < video.width; x += gridSize) {
      for (let y=0; y < video.height; y += gridSize) {
        if (segmentation.data[y * video.width + x] == bodyParts.TORSO_FRONT) {
          fill(255, 0, 0);
          noStroke();
          circle(x, y, gridSize);
        }
      }
    }
  }
}

Thank you so much @gohai I think we should include this if we can for this week's release!

gohai added 6 commits June 20, 2024 18:20
This information is static, and we'll be offering a more convenient way of accessing it.
This is done when the specific model is being loaded.
This renames bodyPix to bodySegmentation, as spotted by @MOQN. It also assigns the entire result to a global variable, which makes it more obvious which member we're accessing during drawing.
@gohai gohai force-pushed the bodysegmentation-values branch from 75a5bb4 to 2698c3b Compare June 20, 2024 16:23
@gohai
Copy link
Member Author

gohai commented Jun 20, 2024

@ziyuan-linn Since there is so much (frantic) development activity at the moment, let me go ahead and merge this one now. We can still explore a getPartsId() etc here as comments, or as a separate PR though!

@gohai gohai merged commit ed925fc into main Jun 20, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants