
ShadidYousuf/PerspectiveFix

PerspectiveFix

A robust Python-based document alignment tool that automatically detects document boundaries, applies perspective correction, and enhances the output for optimal readability.

✨ Features

  • Automatic Document Detection - Uses multiple edge detection methods to find document boundaries
  • Perspective Correction - Transforms skewed/tilted documents into a flat, rectangular view
  • Shadow Removal - Clean enhancement for readable output
  • EXIF Rotation Handling - Automatically corrects image orientation from camera metadata
  • Auto-Crop - Removes uniform borders from the processed image
  • Deskewing - Straightens slightly rotated documents

🚀 Installation

pip install -r requirements.txt

📖 Usage

python perspective_fix.py <image_path>

Example:

python perspective_fix.py Sample_Images\sample1.jpeg

🔬 How It Works: Step-by-Step Pipeline

The tool processes images through a multi-stage pipeline. Debug images are saved to debug_output/ for inspection.

Step 1: Load & EXIF Rotation

The image is loaded and automatically rotated based on EXIF orientation metadata. This handles photos taken in portrait/landscape mode on phones.

sample_img = Image.open(image_path)
sample_img = fix_exif_rotation(sample_img)

Figure: Original Input
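
The fix_exif_rotation helper is called above but its body is not shown in this README. A minimal sketch of what such a helper could look like, assuming it simply wraps Pillow's ImageOps.exif_transpose (the body and the _demo_image helper are illustrative assumptions, not the project's actual code):

```python
import io

from PIL import Image, ImageOps


def fix_exif_rotation(img):
    """Return a copy of img transposed according to its EXIF Orientation tag.

    Hypothetical stand-in for the project's helper of the same name.
    """
    # exif_transpose applies the rotation/flip that the Orientation tag asks
    # for; an image without that tag comes back as an unchanged copy.
    return ImageOps.exif_transpose(img)


def _demo_image():
    """Build an in-memory 40x20 JPEG tagged with Orientation = 6."""
    img = Image.new("RGB", (40, 20), "white")
    exif = Image.Exif()
    exif[0x0112] = 6  # 0x0112 is the EXIF Orientation tag; 6 = quarter turn needed
    buf = io.BytesIO()
    img.save(buf, format="JPEG", exif=exif.tobytes())
    buf.seek(0)
    return Image.open(buf)


fixed = fix_exif_rotation(_demo_image())
print(fixed.size)  # width and height are swapped relative to the stored 40x20
```

Doing this before any detection step means every later coordinate refers to the image as the user sees it, not as the sensor stored it.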

Step 2: Document Boundary Detection

The scanner tries three different edge detection methods to find a quadrilateral document boundary:

Method 1: Canny Edge Detection

blurred = cv2.GaussianBlur(gray, (5, 5), 0)
edged = cv2.Canny(blurred, 50, 150)
edged = cv2.dilate(edged, np.ones((3,3), np.uint8), iterations=2)

Figure: Canny Edges

Method 2: Adaptive Threshold

Uses local thresholding to handle varying lighting conditions.

Method 3: Morphological Gradient

Highlights edges using morphological operations—often works best for documents with subtle boundaries.

The scanner tries each method in order and uses the first one that finds a valid 4-point contour covering at least 20% of the image area.
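
The selection rule above can be sketched in plain Python. The names quad_area and first_valid_quad, and the idea of passing detectors as (name, function) pairs, are hypothetical; only the "first valid quadrilateral covering at least 20% of the image" rule comes from this README:

```python
def quad_area(quad):
    """Area of a polygon given as [(x, y), ...], using the shoelace formula."""
    area = 0.0
    n = len(quad)
    for i in range(n):
        x1, y1 = quad[i]
        x2, y2 = quad[(i + 1) % n]
        area += x1 * y2 - x2 * y1
    return abs(area) / 2.0


def first_valid_quad(detectors, image, image_area, min_fraction=0.2):
    """Run detectors in order; return (name, quad) for the first acceptable one."""
    for name, detect in detectors:
        quad = detect(image)  # each detector returns 4 points or None
        if quad is None or len(quad) != 4:
            continue
        if quad_area(quad) >= min_fraction * image_area:
            return name, quad
    return None, None


# Stand-in detectors for a 100x100 image (assumed results, for illustration).
detectors = [
    ("canny", lambda img: [(0, 0), (10, 0), (10, 10), (0, 10)]),      # 1% of the image
    ("adaptive", lambda img: None),                                    # found nothing
    ("morph", lambda img: [(10, 10), (90, 10), (90, 90), (10, 90)]),  # 64% of the image
]
method, quad = first_valid_quad(detectors, image=None, image_area=100 * 100)
print(method)  # morph
```

The area check is what keeps a small rectangular shape inside the page (a stamp, a table cell) from being mistaken for the page itself.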


Step 3: Contour Detection & Point Ordering

Once edges are detected, contours are extracted and approximated to find a quadrilateral:

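contours, _ = cv2.findContours(edges, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
doc_contour = None  # stays None when no quadrilateral is found
for c in contours:
    peri = cv2.arcLength(c, True)
    # Simplify the contour, allowing deviations of up to 2% of its perimeter
    approx = cv2.approxPolyDP(c, 0.02 * peri, True)
    if len(approx) == 4:
        # Found document boundary!
        doc_contour = approx
        break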

The 4 corner points are then ordered consistently: top-left → top-right → bottom-right → bottom-left

Figure: Detected contour (green) with corner points (red)

Step 4: Perspective Transform

Using the ordered corner points, a perspective transformation matrix is calculated to "unwarp" the document into a flat rectangle:

# Calculate output dimensions
maxWidth = max(width_bottom, width_top)
maxHeight = max(height_right, height_left)

# Define destination rectangle
dst = np.array([[0, 0], [maxWidth-1, 0], 
                [maxWidth-1, maxHeight-1], [0, maxHeight-1]], dtype="float32")

# Apply perspective transform
M = cv2.getPerspectiveTransform(pts, dst)
warped = cv2.warpPerspective(orig, M, (maxWidth, maxHeight))

Figure: After Perspective Correction
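
The snippet above uses width_top, width_bottom, height_left and height_right without showing where they come from. A minimal sketch, assuming they are the Euclidean lengths of the four sides of the ordered quadrilateral (the output_size helper is hypothetical):

```python
import numpy as np


def output_size(pts):
    """Given corners ordered TL, TR, BR, BL, return (maxWidth, maxHeight) as ints."""
    tl, tr, br, bl = np.asarray(pts, dtype="float32")
    width_top = np.linalg.norm(tr - tl)
    width_bottom = np.linalg.norm(br - bl)
    height_left = np.linalg.norm(bl - tl)
    height_right = np.linalg.norm(br - tr)
    # warpPerspective needs an integer output size, so round the longest sides.
    maxWidth = int(round(float(max(width_bottom, width_top))))
    maxHeight = int(round(float(max(height_right, height_left))))
    return maxWidth, maxHeight


# A trapezoid: the top edge is shorter than the bottom edge.
pts = [(60, 0), (240, 0), (300, 80), (0, 80)]
print(output_size(pts))  # (300, 100)
```

Taking the longer of each pair of opposite sides avoids shrinking the document along whichever edge was farther from the camera.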

Step 5: Auto-Crop (Optional)

If the warped image has significant uniform borders (e.g., white margins), they are automatically trimmed:

_, binary = cv2.threshold(gray, 250, 255, cv2.THRESH_BINARY_INV)
coords = cv2.findNonZero(binary)
x, y, w, h = cv2.boundingRect(coords)
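
For readers who want to see what the bounding-box step computes, here is a NumPy-only equivalent of the same idea. It is a swapped-in illustration and not the project's code; the auto_crop name and the guard for an all-white image are assumptions (with the OpenCV version, cv2.findNonZero returns None when the mask is empty, so some guard is needed there too):

```python
import numpy as np


def auto_crop(gray, white_threshold=250):
    """Trim near-white borders from a grayscale image (2-D uint8 array)."""
    # Same mask as THRESH_BINARY_INV at 250: True where the pixel is not near-white.
    mask = gray <= white_threshold
    if not mask.any():
        return gray  # nothing but border; leave the image untouched
    rows = np.flatnonzero(mask.any(axis=1))
    cols = np.flatnonzero(mask.any(axis=0))
    y0, y1 = rows[0], rows[-1] + 1
    x0, x1 = cols[0], cols[-1] + 1
    return gray[y0:y1, x0:x1]


# A white 100x80 page with a 50x50 gray block of "content".
page = np.full((100, 80), 255, dtype=np.uint8)
page[20:70, 10:60] = 120
cropped = auto_crop(page)
print(cropped.shape)  # (50, 50)
```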

Step 6: Shadow Removal & Enhancement

The final step applies CamScanner-style shadow removal using the LAB color space:

# Convert to LAB (separates luminance from color)
lab = cv2.cvtColor(img, cv2.COLOR_BGR2LAB)
l, a, b = cv2.split(lab)

# Apply CLAHE for contrast enhancement
clahe = cv2.createCLAHE(clipLimit=3.0, tileGridSize=(8, 8))
l_clahe = clahe.apply(l)

# Divide by blurred version to normalize illumination
blur = cv2.GaussianBlur(l_clahe, (51, 51), 0)
divided = cv2.divide(l_clahe, blur, scale=255)

# Sharpen the result
kernel = np.array([[-1, -1, -1], [-1, 9, -1], [-1, -1, -1]])
sharpened = cv2.filter2D(divided, -1, kernel)
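
Why the division step removes shadows: a photographed page is roughly reflectance times illumination, and a heavy blur keeps the slowly varying illumination while smearing out the text. Dividing the two cancels the illumination. The one-dimensional NumPy demo below illustrates this on synthetic data; the scanline values and the moving-average stand-in for the Gaussian blur are assumptions made for the example:

```python
import numpy as np

# One scanline of a page: paper reflectance 1.0 with a dark ink stroke at 0.2.
reflectance = np.ones(200)
reflectance[95:105] = 0.2

# A shadow: illumination falls from 100% on the left to 40% on the right.
illumination = np.linspace(1.0, 0.4, 200)
observed = 255.0 * reflectance * illumination

# Stand-in for the large Gaussian blur: a 51-sample moving average
# over an edge-padded copy, so the output has the same length as the input.
padded = np.pad(observed, 25, mode="edge")
background = np.convolve(padded, np.ones(51) / 51, mode="valid")

# Same arithmetic as cv2.divide(src, blur, scale=255), clipped to the 8-bit range.
normalized = np.clip(255.0 * observed / background, 0, 255)

print(round(observed[30]), round(observed[169]))      # paper brightness differs widely
print(round(normalized[30]), round(normalized[169]))  # paper is uniformly bright
print(round(normalized[100]))                         # the ink stroke stays dark
```

Away from the stroke the paper comes out at full brightness on both the lit and the shadowed side, while the stroke remains clearly darker than its surroundings.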

Available Enhancement Modes:

| Mode | Description |
|------|-------------|
| enhance | Light CLAHE enhancement only |
| clean | Background subtraction (B&W output) |
| magic | CLAHE + division + sharpen (default) |
| ultra | Most aggressive shadow removal for OCR |

Figure: Final Output

📁 Output Files

| File | Description |
|------|-------------|
| scanned_output.jpg | The final processed document |
| debug_output/1_original.jpg | Original input image |
| debug_output/2_contours.jpg | Detected document boundary visualization |
| debug_output/3_warped.jpg | After perspective correction |
| debug_output/4_final.jpg | After shadow removal & enhancement |
| debug_output/edges_*.jpg | Edge detection results for each method |

🔧 Algorithm Details

Point Ordering Algorithm

The order_points() function ensures corners are always in the same order regardless of document orientation:

  1. Top-left: Point with smallest sum (x + y)
  2. Bottom-right: Point with largest sum (x + y)
  3. Top-right: Point with smallest difference (y - x)
  4. Bottom-left: Point with largest difference (y - x)
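
The four rules above translate almost directly into NumPy. This is a sketch of the described rules, not a copy of the project's order_points(); the reshape is an assumption that lets it accept the (4, 1, 2) array that cv2.approxPolyDP returns:

```python
import numpy as np


def order_points(pts):
    """Return corners ordered top-left, top-right, bottom-right, bottom-left."""
    pts = np.asarray(pts, dtype="float32").reshape(4, 2)
    s = pts.sum(axis=1)               # x + y for each point
    d = np.diff(pts, axis=1).ravel()  # y - x for each point
    ordered = np.zeros((4, 2), dtype="float32")
    ordered[0] = pts[np.argmin(s)]  # top-left: smallest x + y
    ordered[2] = pts[np.argmax(s)]  # bottom-right: largest x + y
    ordered[1] = pts[np.argmin(d)]  # top-right: smallest y - x
    ordered[3] = pts[np.argmax(d)]  # bottom-left: largest y - x
    return ordered


corners = [(200, 300), (10, 20), (210, 30), (15, 290)]  # arbitrary input order
print(order_points(corners).tolist())
# [[10.0, 20.0], [210.0, 30.0], [200.0, 300.0], [15.0, 290.0]]
```

One caveat worth knowing: for a quadrilateral rotated close to 45°, two corners can have nearly equal sums or differences, so the sum/difference rule may pick the same point twice.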

Orientation Preservation

The algorithm also checks if the detected document orientation (landscape/portrait) matches the original image. If not, it rotates the point order by 90° to maintain consistency.
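
One way such a check could work is sketched below. The preserve_orientation name, the side-length comparison and the direction of the shift are all assumptions; the README only states that the point order is rotated when the orientations disagree:

```python
import numpy as np


def preserve_orientation(pts, image_width, image_height):
    """pts: corners ordered TL, TR, BR, BL. Returns a possibly re-cycled copy."""
    pts = np.asarray(pts, dtype="float32")
    tl, tr, br, bl = pts
    doc_w = max(np.linalg.norm(tr - tl), np.linalg.norm(br - bl))
    doc_h = max(np.linalg.norm(bl - tl), np.linalg.norm(br - tr))
    doc_landscape = doc_w > doc_h
    image_landscape = image_width > image_height
    if doc_landscape != image_landscape:
        # Shift every corner one slot: the old bottom-left is now treated as
        # top-left, which turns the warped output by a quarter turn.
        pts = np.roll(pts, 1, axis=0)
    return pts


portrait_quad = [(0, 0), (100, 0), (100, 200), (0, 200)]
print(preserve_orientation(portrait_quad, 400, 300).tolist())
# [[0.0, 200.0], [0.0, 0.0], [100.0, 0.0], [100.0, 200.0]]
```

Because the width and height of the output are computed from the re-cycled corners, the warped result comes out in the same orientation class as the input photo.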

Fallback: Deskewing

If no document boundary is found, the scanner falls back to Hough Line Transform to detect dominant line angles and correct skew:

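lines = cv2.HoughLinesP(edges, 1, np.pi/180, 100, minLineLength=100, maxLineGap=10)
# Angle of each detected segment in degrees (lines is None when nothing is found)
angles = [np.degrees(np.arctan2(y2 - y1, x2 - x1)) for x1, y1, x2, y2 in lines[:, 0]]
median_angle = np.median(angles)
# Rotate image to correct skew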

📋 Requirements

  • Python 3.7+
  • OpenCV (opencv-python)
  • NumPy
  • Pillow (PIL)

📜 License

MIT License - Feel free to use and modify!
