How can a picture of a page be straightened out so that it looks as if it was scanned?

I have seen apps that do this and wondered how I can do it programmatically: take a picture of a page, then define how it needs to be transformed so that it looks parallel to the camera rather than skewed by perspective.

Then combine multiple photos to create a PDF file. For example, this app does it:



I do not use books for such trivial things, so sorry, I cannot recommend any (especially in English). What you need to do is this:

control points

  1. input image
  2. find main contours

    Ideally the whole grid, but even the outer contour will suffice (in case no grid is present). You need to divide the contour into horizontal (red) and vertical (green) curves (or sets of points).

  3. sample contour curves by 4 "equidistant" points

    As the image is distorted (not just rotated), we need to use at least bi-cubic interpolation. For that we need 16 points (aqua) per patch.

  4. add mirror points to cover whole grid

    The image shows mirrored (yellow) points only for the horizontal contours. You should do the same for the vertical contours (they did not fit in the image and I did not want to enlarge the resolution just for that) and also for the corner points, so you end up with 6x6 control points. The mirroring can be done linearly (as I did).
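Steps 3 and 4 could be sketched in Python roughly like this. It is only an illustration under assumptions that are not part of the answer: each contour is given as a polyline of (x, y) tuples, "equidistant" is taken to mean equal arc length, and the linear mirror is taken to be a reflection of the neighbouring point through the edge point (2*edge - inner). The names resample, mirror, extend_row and extend_grid are made up for the example.

```python
import math


def resample(polyline, n=4):
    """Return n points spaced equally by arc length along a polyline."""
    seg = [math.dist(a, b) for a, b in zip(polyline, polyline[1:])]
    total = sum(seg)
    out = []
    for k in range(n):
        target = total * k / (n - 1)          # distance from the start
        i = 0
        while i < len(seg) - 1 and target > seg[i]:
            target -= seg[i]                  # walk to the segment holding it
            i += 1
        f = target / seg[i] if seg[i] > 0 else 0.0
        a, b = polyline[i], polyline[i + 1]
        out.append((a[0] + f * (b[0] - a[0]), a[1] + f * (b[1] - a[1])))
    return out


def mirror(edge, inner):
    """Assumed linear mirror: reflect `inner` through `edge`."""
    return (2.0 * edge[0] - inner[0], 2.0 * edge[1] - inner[1])


def extend_row(row):
    """Add one mirrored point before and one after a list of points."""
    return [mirror(row[0], row[1])] + list(row) + [mirror(row[-1], row[-2])]


def extend_grid(grid):
    """Turn an N x N grid of control points into (N+2) x (N+2)."""
    rows = [extend_row(r) for r in grid]                     # left/right
    top = [mirror(a, b) for a, b in zip(rows[0], rows[1])]
    bottom = [mirror(a, b) for a, b in zip(rows[-1], rows[-2])]
    return [top] + rows + [bottom]                           # corners included


# Hypothetical 4x4 samples of a slightly sheared page.
grid4 = [[(10.0 * i + 2.0 * j, 10.0 * j) for i in range(4)] for j in range(4)]
grid6 = extend_grid(grid4)
print(len(grid6), len(grid6[0]))
```

Mirroring the rows first and then the top and bottom of the already extended rows produces the corner points as a side effect, so no separate corner handling is needed in this sketch.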

Now the transformation is done like this:

  1. Process all pixels dst(x0,y0) of target image
  2. Treat x0,y0 as parameters for cubic interpolation

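    If xs,ys is the target image resolution, then scale the pixel position so that the whole image maps onto the parameter range <0.0,3.0>:

        u = 3.0*x0/(xs-1)
        v = 3.0*y0/(ys-1)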

    Now cubic interpolation is usually done on parameter t=<0.0,1.0>, so:
    if u=<0.0,1.0) use t=u and control points 0,1,2,3
    if u=<1.0,2.0) use t=u-1.0 and control points 1,2,3,4
    if u=<2.0,3.0> use t=u-2.0 and control points 2,3,4,5
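The three cases can be folded into one small helper. This is a minimal sketch, assuming the pixel coordinate is scaled linearly so that the image width maps onto u=<0.0,3.0>; pixel_to_u and segment_and_t are hypothetical names, not something from the answer.

```python
def pixel_to_u(x0, xs, n_ctrl=6):
    """Scale pixel column x0 of an xs-wide image to u in <0, n_ctrl-3>."""
    return (n_ctrl - 3) * x0 / (xs - 1)


def segment_and_t(u, n_ctrl=6):
    """Return (index of the first of 4 control points, local parameter t)."""
    last = n_ctrl - 4            # first control point of the last segment
    i = min(int(u), last)        # clamp so u == 3.0 stays in the last segment
    return i, u - i


for x0 in (0, 25, 50, 100):
    u = pixel_to_u(x0, 101)
    print(x0, u, segment_and_t(u))
```

The clamp is what makes the last interval closed on both ends: u=3.0 gives control points 2,3,4,5 with t=1.0 instead of indexing past the end.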

    The same goes for the vertical contours and v. Compute xi,yi as the bi-cubic interpolation of (u,v), and copy the pixel:

        dst(x0,y0) = src(xi,yi)

    This is just nearest neighbor, but you can also use bilinear filtering for this. As the cubic curve I would use this polynomial.

    The idea behind bi-cubic interpolation is simple: compute the point corresponding to parameter u on each of the 4 horizontal contours. That gives you 4 control points for the final cubic interpolation in the vertical direction, with v as the parameter. The resulting coordinate is your source pixel position.
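Put together, the whole remapping might look like the sketch below. Several things here are assumptions rather than the answer's own method: the cubic is a Catmull-Rom style interpolating polynomial (the polynomial actually referred to is not shown here), images are plain lists of rows, the control grid is stored as grid[row][col] = (x, y), and the functions cubic, curve, bicubic and warp are named just for this example.

```python
def cubic(p0, p1, p2, p3, t):
    """Catmull-Rom style cubic: equals p1 at t=0 and p2 at t=1."""
    d1 = 0.5 * (p2 - p0)
    d2 = 0.5 * (p3 - p1)
    a2 = 3.0 * (p2 - p1) - 2.0 * d1 - d2
    a3 = d1 + d2 + 2.0 * (p1 - p2)
    return p1 + t * (d1 + t * (a2 + t * a3))


def curve(points, u):
    """Evaluate a piecewise cubic through 2D control points at u."""
    i = min(int(u), len(points) - 4)     # pick the 4 control points
    q, t = points[i:i + 4], u - i
    return tuple(cubic(q[0][k], q[1][k], q[2][k], q[3][k], t) for k in (0, 1))


def bicubic(grid, u, v):
    """Interpolate along 4 horizontal contours, then vertically."""
    j = min(int(v), len(grid) - 4)
    column = [curve(row, u) for row in grid[j:j + 4]]
    return curve(column, v - j)


def warp(src, grid, xs, ys):
    """Nearest-neighbour copy: dst(x0,y0) = src(xi,yi)."""
    h, w = len(src), len(src[0])
    nu, nv = len(grid[0]) - 3, len(grid) - 3    # number of cubic segments
    dst = [[0] * xs for _ in range(ys)]
    for y0 in range(ys):
        v = nv * y0 / (ys - 1)
        for x0 in range(xs):
            u = nu * x0 / (xs - 1)
            xi, yi = bicubic(grid, u, v)
            xi, yi = int(round(xi)), int(round(yi))
            if 0 <= xi < w and 0 <= yi < h:     # skip points outside src
                dst[y0][x0] = src[yi][xi]
    return dst


# Hypothetical undistorted 6x6 grid over a 10x10 image: warp is the identity.
grid = [[(3.0 * (c - 1), 3.0 * (r - 1)) for c in range(6)] for r in range(6)]
src = [[10 * y + x for x in range(10)] for y in range(10)]
dst = warp(src, grid, 10, 10)
print(dst == src)
```

The undistorted grid is a convenient sanity check, since the inner 4x4 control points sit exactly on the image corners and edges. With a real photo the grid would come from the sampled and mirrored contours, and rounding could be replaced by bilinear filtering of the four neighbouring source pixels.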

For more info see:

In case you do not have a grid, use any information that can serve as one. For example, lines of text can be treated as contours.

