How to calculate ground-truth mask? #32

aidevmin · 2023-11-03T09:09:47Z

Thanks a lot for the amazing repo.

I read your source code, but it is not clear to me how the ground-truth mask is calculated.

DeFeat.pytorch/mmdet/models/dense_heads/anchor_head.py

Line 168 in c46a793

    def _map_roi_levels(self, rois, num_levels):
        scale = torch.sqrt(
            (rois[:, 2] - rois[:, 0] + 1) * (rois[:, 3] - rois[:, 1] + 1))
        target_lvls = torch.floor(torch.log2(scale / 56 + 1e-6))
        target_lvls = target_lvls.clamp(min=0, max=num_levels - 1).long()
        return target_lvls

Here, rois are the ground-truth bboxes. As I understand it, the Faster R-CNN neck has 5 levels, and you want to map each bounding box to exactly one level. What is the magic number 56 here?
And this one:
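To make the effect of the constant concrete, here is a minimal pure-Python sketch of the same formula applied to a single box. The function name map_box_to_level and the parameter name finest_scale are assumptions made for this illustration; they do not appear in the quoted code. The same expression, with 56 as the default value of a parameter called finest_scale, appears in MMDetection's SingleRoIExtractor.map_roi_levels, so 56 appears to act as a reference box size: boxes with scale below 112 land on level 0, 112 to 223 on level 1, 224 to 447 on level 2, and so on, with everything larger clamped to the last level.

```python
import math

def map_box_to_level(box, num_levels, finest_scale=56):
    """Restates the quoted formula for one box given as (x1, y1, x2, y2).

    `finest_scale` is a name chosen for this sketch; the quoted code
    hard-codes the value 56.
    """
    x1, y1, x2, y2 = box
    # Geometric mean of box width and height (the "+ 1" follows the quoted code).
    scale = math.sqrt((x2 - x1 + 1) * (y2 - y1 + 1))
    # Each doubling of the scale relative to finest_scale moves up one level.
    level = math.floor(math.log2(scale / finest_scale + 1e-6))
    # Clamp into the valid range of pyramid levels.
    return min(max(level, 0), num_levels - 1)

# Square boxes of increasing side length, mapped onto 5 levels.
for side in (32, 111, 112, 224, 448, 2000):
    box = (0, 0, side - 1, side - 1)
    print(side, map_box_to_level(box, num_levels=5))
```

Under this reading, a box goes to level k when its scale lies in [56 * 2^k, 56 * 2^(k+1)), except that level 0 also takes all smaller boxes and the last level takes all larger ones.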

DeFeat.pytorch/mmdet/models/dense_heads/anchor_head.py

Line 262 in c46a793

    def get_gt_mask(self, cls_scores, img_metas, gt_bboxes):
        featmap_sizes = [featmap.size()[-2:] for featmap in cls_scores]
        featmap_strides = self.anchor_generator.strides
        imit_range = [0, 0, 0, 0, 0]
        with torch.no_grad():
            mask_batch = []

            for batch in range(len(gt_bboxes)):
                mask_level = []
                target_lvls = self._map_roi_levels(gt_bboxes[batch], len(featmap_sizes))
                for level in range(len(featmap_sizes)):
                    gt_level = gt_bboxes[batch][target_lvls==level]  # gt_bboxes: BatchsizexNpointx4coordinate
                    h, w = featmap_sizes[level][0], featmap_sizes[level][1]
                    mask_per_img = torch.zeros([h, w], dtype=torch.double).cuda()
                    for ins in range(gt_level.shape[0]):
                        gt_level_map = gt_level[ins] / featmap_strides[level]
                        lx = max(int(gt_level_map[0]) - imit_range[level], 0)
                        rx = min(int(gt_level_map[2]) + imit_range[level], w)
                        ly = max(int(gt_level_map[1]) - imit_range[level], 0)
                        ry = min(int(gt_level_map[3]) + imit_range[level], h)
                        if (lx == rx) or (ly == ry):
                            mask_per_img[ly, lx] += 1
                        else:
                            mask_per_img[ly:ry, lx:rx] += 1
                    mask_per_img = (mask_per_img > 0).double()
                    mask_level.append(mask_per_img)
                mask_batch.append(mask_level)
            
            mask_batch_level = []
            for level in range(len(mask_batch[0])):
                tmp = []
                for batch in range(len(mask_batch)):
                    tmp.append(mask_batch[batch][level])
                mask_batch_level.append(torch.stack(tmp, dim=0))
                
        return mask_batch_level

As I understand it, after each ground-truth box is assigned to one level, its location in the image is mapped to the corresponding location on that level's feature map (a simple rescaling of the coordinates in width and height). Is that right?

Please correct me if I have misunderstood. Thanks.
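Reading the quoted get_gt_mask, that seems to be what happens: the box coordinates are divided by the level's stride, truncated to integers, and the resulting rectangle of feature-map cells is set to 1. The sketch below reproduces that step for one box in pure Python. The function name box_to_mask, the margin parameter (standing in for imit_range, which is all zeros in the quoted code), and the stride of 16 are assumptions chosen for illustration only.

```python
def box_to_mask(box, stride, feat_h, feat_w, margin=0):
    """Project one image-space box (x1, y1, x2, y2) onto a feat_h x feat_w grid."""
    # Rescale image coordinates to feature-map coordinates.
    x1, y1, x2, y2 = (coord / stride for coord in box)
    # Truncate to cell indices and clip to the feature map, as in the quoted code.
    lx = max(int(x1) - margin, 0)
    rx = min(int(x2) + margin, feat_w)
    ly = max(int(y1) - margin, 0)
    ry = min(int(y2) + margin, feat_h)

    mask = [[0] * feat_w for _ in range(feat_h)]
    if lx == rx or ly == ry:
        # Box collapses to zero width or height on this grid: mark a single cell.
        # The index clamp is an addition of this sketch to stay in bounds.
        mask[min(ly, feat_h - 1)][min(lx, feat_w - 1)] = 1
    else:
        # Right and bottom edges are exclusive, matching mask[ly:ry, lx:rx].
        for y in range(ly, ry):
            for x in range(lx, rx):
                mask[y][x] = 1
    return mask

# A 64x64-pixel box on a stride-16 level with an 8x8 feature map.
mask = box_to_mask((32, 48, 95, 111), stride=16, feat_h=8, feat_w=8)
for row in mask:
    print(row)
```

Because the slice ends are exclusive, the cell containing the box's right or bottom edge is left out unless the margin is greater than zero, and a box smaller than one cell still marks exactly one cell.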
