A robust Python-based document alignment tool that automatically detects document boundaries, applies perspective correction, and enhances the output for optimal readability.
- Automatic Document Detection - Uses multiple edge detection methods to find document boundaries
- Perspective Correction - Transforms skewed/tilted documents into a flat, rectangular view
- Shadow Removal - Clean enhancement for readable output
- EXIF Rotation Handling - Automatically corrects image orientation from camera metadata
- Auto-Crop - Removes uniform borders from the processed image
- Deskewing - Straightens slightly rotated documents

Install the dependencies:

    pip install -r requirements.txt

Run the script on an image:

    python perspective_fix.py <image_path>

Example:

    python perspective_fix.py Sample_Images\sample1.jpeg

The tool processes images through a multi-stage pipeline. Debug images are saved to debug_output/ for inspection.

The image is loaded and automatically rotated based on its EXIF orientation metadata. This handles phone photos taken in portrait or landscape orientation.

    sample_img = Image.open(image_path)
    sample_img = fix_exif_rotation(sample_img)

| Original Input |
|---|
| ![]() |

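The body of `fix_exif_rotation` is not shown here. As a minimal sketch of what such a helper could look like, assuming it relies on Pillow's `ImageOps.exif_transpose` (the name `fix_exif_rotation_sketch` is hypothetical and the tool's actual implementation may differ):

```python
from PIL import Image, ImageOps


def fix_exif_rotation_sketch(img: Image.Image) -> Image.Image:
    """Return a copy of img rotated/flipped per its EXIF Orientation tag.

    Hypothetical stand-in for the tool's fix_exif_rotation().
    """
    # exif_transpose reads the Orientation tag (0x0112), applies the matching
    # rotation or flip, and returns a new image. Images that carry no
    # orientation information come back with unchanged pixel data.
    return ImageOps.exif_transpose(img)
```

With an orientation value of 6 (camera held in portrait), a 40x20 stored image would come back as 20x40.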
The scanner tries three different edge detection methods to find a quadrilateral document boundary. The first is Canny edge detection on a blurred grayscale image, followed by dilation to close small gaps in the edges:

    blurred = cv2.GaussianBlur(gray, (5, 5), 0)
    edged = cv2.Canny(blurred, 50, 150)
    edged = cv2.dilate(edged, np.ones((3, 3), np.uint8), iterations=2)

| Canny Edges |
|---|
| ![]() |

The second method uses local (adaptive) thresholding to handle varying lighting conditions.

The third method highlights edges using morphological operations. It often works best for documents with subtle boundaries.

The scanner tries each method in order and uses the first one that finds a valid 4-point contour covering at least 20% of the image area.
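The selection rule above can be sketched in plain Python. This is an illustration only: the detector interface (a callable returning four corner points or `None`) and the names `quad_area` and `first_valid_quad` are hypothetical, and only the 20% threshold comes from the description above.

```python
from typing import Callable, Iterable, Optional, Sequence, Tuple

Point = Tuple[float, float]
Quad = Sequence[Point]


def quad_area(quad: Quad) -> float:
    """Area of a simple polygon via the shoelace formula."""
    total = 0.0
    for i, (x1, y1) in enumerate(quad):
        x2, y2 = quad[(i + 1) % len(quad)]
        total += x1 * y2 - x2 * y1
    return abs(total) / 2.0


def first_valid_quad(
    detectors: Iterable[Callable[[], Optional[Quad]]],
    image_size: Tuple[int, int],
    min_fraction: float = 0.20,
) -> Optional[Quad]:
    """Return the first 4-point result covering at least min_fraction of the image."""
    width, height = image_size
    min_area = min_fraction * width * height
    for detect in detectors:
        quad = detect()
        if quad is not None and len(quad) == 4 and quad_area(quad) >= min_area:
            return quad
    return None  # caller falls back to another strategy


# Usage: the first detector finds nothing, the second finds a quad that is
# too small, and the third finds one covering 64% of a 100x100 image.
small = [(0, 0), (10, 0), (10, 10), (0, 10)]
big = [(10, 10), (90, 10), (90, 90), (10, 90)]
chosen = first_valid_quad([lambda: None, lambda: small, lambda: big], (100, 100))
```

Trying the methods in a fixed order keeps the cheapest detector first and only pays for the others when it fails.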
Once edges are detected, contours are extracted and approximated to find a quadrilateral:

    contours, _ = cv2.findContours(edges, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
    for c in contours:
        peri = cv2.arcLength(c, True)
        approx = cv2.approxPolyDP(c, 0.02 * peri, True)
        if len(approx) == 4:
            # Found document boundary!
            break

The four corner points are then ordered consistently: top-left → top-right → bottom-right → bottom-left.

| Detected Contour (green) with Corner Points (red) |
|---|
| ![]() |

Using the ordered corner points, a perspective transformation matrix is calculated to "unwarp" the document into a flat rectangle:

    # Calculate output dimensions
    maxWidth = max(width_bottom, width_top)
    maxHeight = max(height_right, height_left)

    # Define destination rectangle
    dst = np.array([[0, 0], [maxWidth - 1, 0],
                    [maxWidth - 1, maxHeight - 1], [0, maxHeight - 1]], dtype="float32")

    # Apply perspective transform
    M = cv2.getPerspectiveTransform(pts, dst)
    warped = cv2.warpPerspective(orig, M, (maxWidth, maxHeight))

| After Perspective Correction |
|---|
| ![]() |

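The snippet above uses `width_top`, `width_bottom`, `height_left`, and `height_right` without showing how they are obtained. A plausible sketch, assuming they are the Euclidean lengths of the four sides of the ordered quadrilateral (the helper name `output_size` is hypothetical):

```python
from typing import Tuple

import numpy as np


def output_size(pts: np.ndarray) -> Tuple[int, int]:
    """Width and height of the output rectangle for ordered corners.

    pts must be ordered top-left, top-right, bottom-right, bottom-left.
    """
    tl, tr, br, bl = pts
    width_top = np.linalg.norm(tr - tl)
    width_bottom = np.linalg.norm(br - bl)
    height_left = np.linalg.norm(bl - tl)
    height_right = np.linalg.norm(br - tr)
    # Take the longer of each opposite pair so no content is squeezed.
    max_width = int(max(width_bottom, width_top))
    max_height = int(max(height_right, height_left))
    return max_width, max_height


# Usage: a trapezoid whose bottom edge is longer than its top edge.
pts = np.array([[0, 0], [300, 0], [320, 400], [0, 400]], dtype="float32")
max_width, max_height = output_size(pts)
```

Using the longer side of each pair means the shorter side is stretched during warping rather than the longer one being compressed.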

If the warped image has significant uniform borders (e.g., white margins), they are automatically trimmed:

    _, binary = cv2.threshold(gray, 250, 255, cv2.THRESH_BINARY_INV)
    coords = cv2.findNonZero(binary)
    x, y, w, h = cv2.boundingRect(coords)

The final step applies CamScanner-style shadow removal using the LAB color space:

    # Convert to LAB (separates luminance from color)
    lab = cv2.cvtColor(img, cv2.COLOR_BGR2LAB)
    l, a, b = cv2.split(lab)

    # Apply CLAHE for contrast enhancement
    clahe = cv2.createCLAHE(clipLimit=3.0, tileGridSize=(8, 8))
    l_clahe = clahe.apply(l)

    # Divide by blurred version to normalize illumination
    blur = cv2.GaussianBlur(l_clahe, (51, 51), 0)
    divided = cv2.divide(l_clahe, blur, scale=255)

    # Sharpen the result
    kernel = np.array([[-1, -1, -1], [-1, 9, -1], [-1, -1, -1]])
    sharpened = cv2.filter2D(divided, -1, kernel)

Available Enhancement Modes:
| Mode | Description |
|---|---|
| enhance | Light CLAHE enhancement only |
| clean | Background subtraction (B&W output) |
| magic | CLAHE + division + sharpen (default) |
| ultra | Most aggressive shadow removal for OCR |

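To see why dividing the luminance channel by a blurred copy of itself removes shadows, the small NumPy sketch below reproduces the arithmetic on hand-picked values. It is not the tool's code: the background estimate is supplied directly rather than computed with a Gaussian blur, and `normalize_illumination` is a hypothetical name.

```python
import numpy as np


def normalize_illumination(luma: np.ndarray, background: np.ndarray) -> np.ndarray:
    """Scale each pixel by 255 / local background brightness."""
    bg = np.maximum(background.astype(np.float32), 1.0)  # avoid division by zero
    ratio = luma.astype(np.float32) / bg
    return np.clip(ratio * 255.0, 0, 255).round().astype(np.uint8)


# Row 0 is well lit (paper = 200), row 1 is in shadow (paper = 100).
# Column 0 is bare paper, column 1 is ink at 25% of the paper brightness.
luma = np.array([[200, 50], [100, 25]], dtype=np.uint8)
background = np.array([[200, 200], [100, 100]], dtype=np.uint8)
result = normalize_illumination(luma, background)
# Paper becomes 255 and ink becomes 64 in both rows.
```

Because the output depends only on the ratio between a pixel and its surroundings, the lit and shadowed rows end up identical.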

| Final Output |
|---|
| ![]() |

Output files:

| File | Description |
|---|---|
| scanned_output.jpg | The final processed document |
| debug_output/1_original.jpg | Original input image |
| debug_output/2_contours.jpg | Detected document boundary visualization |
| debug_output/3_warped.jpg | After perspective correction |
| debug_output/4_final.jpg | After shadow removal & enhancement |
| debug_output/edges_*.jpg | Edge detection results for each method |

The order_points() function ensures corners are always in the same order regardless of document orientation:
- Top-left: Point with smallest sum (x + y)
- Bottom-right: Point with largest sum (x + y)
- Top-right: Point with smallest difference (y - x)
- Bottom-left: Point with largest difference (y - x)
The algorithm also checks if the detected document orientation (landscape/portrait) matches the original image. If not, it rotates the point order by 90° to maintain consistency.
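The sum and difference rules listed above can be sketched with NumPy as follows. This is an illustrative version only: the name `order_points_sketch` is hypothetical, and the landscape/portrait consistency check described above is left out.

```python
import numpy as np


def order_points_sketch(pts) -> np.ndarray:
    """Order four (x, y) points as top-left, top-right, bottom-right, bottom-left."""
    pts = np.asarray(pts, dtype="float32")
    sums = pts.sum(axis=1)                 # x + y
    diffs = np.diff(pts, axis=1).ravel()   # y - x
    return np.array(
        [
            pts[np.argmin(sums)],   # top-left: smallest x + y
            pts[np.argmin(diffs)],  # top-right: smallest y - x
            pts[np.argmax(sums)],   # bottom-right: largest x + y
            pts[np.argmax(diffs)],  # bottom-left: largest y - x
        ],
        dtype="float32",
    )


# Usage: corners supplied in arbitrary order.
ordered = order_points_sketch([[90, 80], [10, 12], [95, 15], [5, 85]])
# ordered is [[10, 12], [95, 15], [90, 80], [5, 85]]
```

The rule relies on image coordinates, where y grows downward, so the top-left corner has the smallest x + y and the top-right corner has a large x with a small y.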

If no document boundary is found, the scanner falls back to the Hough Line Transform to detect dominant line angles and correct skew:

    lines = cv2.HoughLinesP(edges, 1, np.pi/180, 100, minLineLength=100, maxLineGap=10)
    # Each entry is [[x1, y1, x2, y2]]; lines is None when nothing is detected
    angles = [np.degrees(np.arctan2(y2 - y1, x2 - x1)) for x1, y1, x2, y2 in lines[:, 0]]
    median_angle = np.median(angles)
    # Rotate image to correct skew

Requirements:

- Python 3.7+
- OpenCV (opencv-python)
- NumPy
- Pillow (PIL)

MIT License - Feel free to use and modify!




