Region Rectification and Cropping

Overview

Function 1 of the Hand Landmark Detection model rectifies and crops hand regions. It takes the rotated hand regions detected by Model 1 and warps each one into an upright 224x224 image that the landmark model can consume.

Purpose and Functionality

This function performs essential preprocessing tasks:

Image Warping: Applies an affine transformation to straighten hand regions
Region Cropping: Extracts hand regions from the original full-resolution frame
Standardization: Creates consistent 224x224 input format for the landmark model
Orientation Correction: Uses rotation information to properly align hands

Rotated Rectangle to Points Conversion

Point Calculation Implementation

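from math import cos, sin

def rotated_rect_to_points(cx, cy, w, h, rotation, wi, hi):
    # cx, cy: rectangle center in pixels; w, h: rectangle size in pixels.
    # rotation: angle in radians. wi and hi are accepted but not used here.
    b = cos(rotation) * 0.5
    a = sin(rotation) * 0.5
    # First two corners; for rotation = 0 these are bottom-left and top-left.
    p0x = cx - a*h - b*w
    p0y = cy + b*h - a*w
    p1x = cx + a*h - b*w
    p1y = cy - b*h - a*w
    # The remaining two corners are p0 and p1 reflected through the center.
    p2x = int(2*cx - p0x)
    p2y = int(2*cy - p0y)
    p3x = int(2*cx - p1x)
    p3y = int(2*cy - p1y)
    p0x, p0y, p1x, p1y = int(p0x), int(p0y), int(p1x), int(p1y)
    return [(p0x,p0y), (p1x,p1y), (p2x,p2y), (p3x,p3y)]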
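
To make the corner order concrete, the sketch below restates the same corner formulas without integer truncation and evaluates them for an unrotated rectangle. The helper name `corners` and the sample values are illustrative assumptions, not part of the pipeline.

```python
from math import cos, sin

def corners(cx, cy, w, h, rotation):
    """Float version of the corner formulas above (hypothetical helper)."""
    b = cos(rotation) * 0.5
    a = sin(rotation) * 0.5
    p0 = (cx - a * h - b * w, cy + b * h - a * w)
    p1 = (cx + a * h - b * w, cy - b * h - a * w)
    # p2 and p3 are p0 and p1 reflected through the center (cx, cy).
    p2 = (2 * cx - p0[0], 2 * cy - p0[1])
    p3 = (2 * cx - p1[0], 2 * cy - p1[1])
    return [p0, p1, p2, p3]

# Sample values: a 40x60 rectangle centered at (100, 100), no rotation.
pts = corners(100, 100, 40, 60, 0.0)
labels = ["bottom-left", "top-left", "top-right", "bottom-right"]
for label, (x, y) in zip(labels, pts):
    print(f"{label}: ({x:.0f}, {y:.0f})")
```

With the image y-axis pointing down, the points come out as bottom-left (80, 130), top-left (80, 70), top-right (120, 70) and bottom-right (120, 130). This order is why warp_rect_img can drop the first point and use the remaining three.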

Image Warping Process

Affine Transformation Implementation

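import cv2
import numpy as np

def warp_rect_img(rect_points, img, w, h):
    # Skip rect_points[0]; use top-left, top-right and bottom-right as sources.
    src = np.array(rect_points[1:], dtype=np.float32)
    # Matching corners of the output image, which is w pixels wide and h high.
    dst = np.array([(0, 0), (w, 0), (w, h)], dtype=np.float32)
    mat = cv2.getAffineTransform(src, dst)
    return cv2.warpAffine(img, mat, (w, h))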
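
cv2.getAffineTransform returns the 2x3 matrix that maps three source points onto three destination points. The sketch below solves for the same kind of matrix in plain Python, assuming sample corner points and a 224x224 output, to show where each part of the source rectangle lands. The function names `affine_from_points` and `apply_affine` are hypothetical.

```python
def affine_from_points(src, dst):
    """Solve for the 2x3 matrix M with M @ [x, y, 1] = [u, v] at 3 point pairs."""
    (x0, y0), (x1, y1), (x2, y2) = src
    det = x0 * (y1 - y2) - y0 * (x1 - x2) + (x1 * y2 - x2 * y1)
    if det == 0:
        raise ValueError("source points are collinear")
    rows = []
    for k in range(2):  # k = 0 solves the u row, k = 1 the v row
        t0, t1, t2 = dst[0][k], dst[1][k], dst[2][k]
        # Cramer's rule on [[x0, y0, 1], [x1, y1, 1], [x2, y2, 1]] @ [a, b, c] = t
        a = (t0 * (y1 - y2) - y0 * (t1 - t2) + (t1 * y2 - t2 * y1)) / det
        b = (x0 * (t1 - t2) - t0 * (x1 - x2) + (x1 * t2 - x2 * t1)) / det
        c = (x0 * (y1 * t2 - y2 * t1) - y0 * (x1 * t2 - x2 * t1)
             + t0 * (x1 * y2 - x2 * y1)) / det
        rows.append((a, b, c))
    return rows

def apply_affine(mat, point):
    """Apply a 2x3 affine matrix to a single (x, y) point."""
    x, y = point
    return tuple(a * x + b * y + c for a, b, c in mat)

# Sample source corners: top-left, top-right, bottom-right of a 40x60 rectangle.
src = [(80, 70), (120, 70), (120, 130)]
dst = [(0, 0), (224, 0), (224, 224)]
mat = affine_from_points(src, dst)

# The unused bottom-left corner should land close to (0, 224),
# and the rectangle center close to (112, 112).
bottom_left = apply_affine(mat, (80, 130))
center = apply_affine(mat, (100, 100))
print(bottom_left, center)
```

Three point pairs fully determine an affine map, so the fourth corner does not need to be passed in: it ends up at the remaining corner of the output automatically.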

Rectangle Transformation System

Transformation Algorithm

The rectangle transformation shifts and enlarges each detected hand region so that the crop covers the whole hand before landmark extraction:

def rect_transformation(regions, w, h):
    # w, h: frame size in pixels. Region fields are scaled by w and h below.
    scale_x = 1.4  # horizontal scale, reduced from 2.0
    scale_y = 2.4  # vertical scale
    shift_x = 0
    shift_y = -0.4
    for region in regions:
        width = region.rect_w
        height = region.rect_h

        # region.rotation is always 0.0, so the shift is applied along the image axes.
        region.rect_x_center_a = (region.rect_x_center + width * shift_x) * w
        region.rect_y_center_a = (region.rect_y_center + height * shift_y) * h

        # General form for rotation != 0, kept for reference only:
        # x_shift = (w * width * shift_x * cos(rotation) - h * height * shift_y * sin(rotation))
        # y_shift = (w * width * shift_x * sin(rotation) + h * height * shift_y * cos(rotation))
        # region.rect_x_center_a = region.rect_x_center*w + x_shift
        # region.rect_y_center_a = region.rect_y_center*h + y_shift

        # Both output dimensions are derived from the longer side in pixels.
        long_side = max(width * w, height * h)
        region.rect_w_a = long_side * scale_x
        region.rect_h_a = long_side * scale_y
        region.rect_points = rotated_rect_to_points(region.rect_x_center_a, region.rect_y_center_a, region.rect_w_a, region.rect_h_a, region.rotation, w, h)
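
As a worked example of the arithmetic above, the following sketch computes the adjusted center and crop size for a single region. The region values, the 1280x720 frame size, and the helper name `expand_region` are assumptions chosen for illustration; only the scale and shift constants come from the function above.

```python
from types import SimpleNamespace

SCALE_X, SCALE_Y = 1.4, 2.4
SHIFT_X, SHIFT_Y = 0.0, -0.4

def expand_region(region, frame_w, frame_h):
    """Return (center_x, center_y, crop_w, crop_h) in pixels for rotation = 0."""
    center_x = (region.rect_x_center + region.rect_w * SHIFT_X) * frame_w
    center_y = (region.rect_y_center + region.rect_h * SHIFT_Y) * frame_h
    long_side = max(region.rect_w * frame_w, region.rect_h * frame_h)
    return center_x, center_y, long_side * SCALE_X, long_side * SCALE_Y

# Hypothetical detection in the middle of a 1280x720 frame.
region = SimpleNamespace(rect_x_center=0.5, rect_y_center=0.5,
                         rect_w=0.2, rect_h=0.3)
cx, cy, crop_w, crop_h = expand_region(region, 1280, 720)
print(f"center=({cx:.1f}, {cy:.1f}) size=({crop_w:.1f} x {crop_h:.1f})")
# center=(640.0, 273.6) size=(358.4 x 614.4)
```

Note that the resulting crop is taller than it is wide, so warping it to a square 224x224 image changes its aspect ratio.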

Key Parameters

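Scale Factors: scale_x = 1.4 and scale_y = 2.4, both applied to the longer side of the detected region so the crop covers the whole hand
Shift Parameters: shift_x = 0 and shift_y = -0.4, which moves the region center toward the top of the image by 40% of the detected region height
Output Size: Always 224x224 pixels for consistent model input
Rotation Handling: rotation is currently always 0.0, so only the unrotated code path is active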
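
The commented-out rotated-case formulas in rect_transformation reduce to the active unrotated path when rotation is 0. The sketch below checks this numerically; the function name `shifted_center`, the sample region, and the frame size are assumptions for illustration.

```python
from math import cos, sin, pi

def shifted_center(x_center, y_center, width, height, rotation,
                   frame_w, frame_h, shift_x=0.0, shift_y=-0.4):
    """Rotation-aware center shift, following the commented-out formulas."""
    x_shift = (frame_w * width * shift_x * cos(rotation)
               - frame_h * height * shift_y * sin(rotation))
    y_shift = (frame_w * width * shift_x * sin(rotation)
               + frame_h * height * shift_y * cos(rotation))
    return x_center * frame_w + x_shift, y_center * frame_h + y_shift

# Same hypothetical region, once upright and once rotated by a quarter turn.
upright = shifted_center(0.5, 0.5, 0.2, 0.3, 0.0, 1280, 720)
quarter_turn = shifted_center(0.5, 0.5, 0.2, 0.3, pi / 2, 1280, 720)
print(upright)
print(quarter_turn)
```

For the upright case the center moves only along y, to roughly (640, 273.6), which matches the simplified formula. For the quarter turn the same shift is applied along x instead, giving roughly (726.4, 360).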

After region rectification and cropping, the pipeline proceeds to:

Function 2: Landmark detection and smoothing on warped images
Quality Validation: Ensure adequate input quality for landmark detection
Coordinate Mapping: Prepare for landmark coordinate transformation back to original image space
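
The coordinate mapping step is not shown in this section. As a rough sketch of one way it could work, the code below maps a landmark given in normalized crop coordinates back to frame pixels using the corner order produced by rotated_rect_to_points. The function name `crop_to_frame` and the assumption that landmarks are normalized to the range 0 to 1 are hypothetical.

```python
def crop_to_frame(u, v, rect_points):
    """Map normalized crop coordinates (u, v in [0, 1]) to frame pixels."""
    # rect_points order: bottom-left, top-left, top-right, bottom-right.
    _, top_left, top_right, bottom_right = rect_points
    # Move u along the top edge, then v along the right edge.
    x = (top_left[0] + u * (top_right[0] - top_left[0])
         + v * (bottom_right[0] - top_right[0]))
    y = (top_left[1] + u * (top_right[1] - top_left[1])
         + v * (bottom_right[1] - top_right[1]))
    return x, y

# Sample corners of an unrotated 40x60 rectangle centered at (100, 100).
rect_points = [(80, 130), (80, 70), (120, 70), (120, 130)]
print(crop_to_frame(0.5, 0.5, rect_points))  # (100.0, 100.0)
```

Because the mapping only uses the three corners that define the warp, it is the inverse of the affine transform applied during cropping, up to the 224-pixel scaling of the crop.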
