We consider the following problem: can a machine learn from a few labeled pixels to predict every pixel in a new image?
This task is extremely challenging (see Fig. 1) because a single body part can contain visually distinct areas
(e.g., a head consists of eyes, a nose, and a mouth), while different body parts can look similar and be hard to tell apart
(e.g., upper arms vs. lower arms). It becomes even more difficult if we provide no precise locations at all,
only the occurrence of body parts in the image. This problem is dubbed weakly supervised segmentation, where
the goal is to classify every pixel into semantic categories using only partial or weak supervision. There are many
forms of weak annotation that are cheap but imperfect, e.g., image-level tags, bounding boxes, points, and scribbles.
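
To make "partial supervision" concrete, here is a minimal sketch of how sparse annotations might be stored. Everything in it is an illustrative assumption rather than the method from the post: the `IGNORE` marker, the helper names, and the tiny 4x5 image are all hypothetical. The point is only that most pixels carry no label, and that image-level tags carry no location at all.

```python
# Hypothetical marker meaning "no label was given for this pixel".
IGNORE = -1


def empty_mask(height, width):
    """Return a height x width label mask with every pixel unlabeled."""
    return [[IGNORE] * width for _ in range(height)]


def mask_from_pixels(height, width, labeled_pixels):
    """Build a sparse label mask from (row, col, class_id) triples.

    Point annotations and scribbles both fit this shape: a point is one
    triple, a scribble is a short run of triples along a stroke.
    """
    mask = empty_mask(height, width)
    for row, col, class_id in labeled_pixels:
        mask[row][col] = class_id
    return mask


def labeled_fraction(mask):
    """Fraction of pixels that carry a label."""
    total = sum(len(row) for row in mask)
    labeled = sum(1 for row in mask for value in row if value != IGNORE)
    return labeled / total


# Image-level tags: only which classes occur, with no location (assumed ids).
image_tags = {1, 2}

# Two point annotations on a 4x5 image.
point_mask = mask_from_pixels(4, 5, [(0, 0, 1), (3, 4, 2)])

# A horizontal scribble of class 1 across row 2.
scribble_mask = mask_from_pixels(4, 5, [(2, col, 1) for col in range(5)])

print(labeled_fraction(point_mask))
print(labeled_fraction(scribble_mask))
```

In this toy setup the point annotation labels 2 of 20 pixels and the scribble labels 5 of 20, while the segmentation model is still expected to predict a class for all 20.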

Continue reading "Universal Weakly Supervised Segmentation by Pixel-to-Segment Contrastive Learning – The Berkeley Artificial Intelligence Research Blog"
