Sriram's Blog: December 2015

Wednesday, December 9, 2015

Sudoku Solver - Real time Image

My last post was about solving a sudoku which was a binary image. This time I tried to solve a real time sudoku image. A real time image requires some kind of Image processing to be done before its sent to solve sudoku. Here also I am going to use the KNearest Machine learning algorithm that is available in Opencv.

Int main()

The Image shown below is a real time Image of a sudoku. I will use this image to show the steps that are needed to be done before you use it for recognizing the digits. When I searched in google I found out a good post which speaks about extracting the sudoku from the paper and it helped me achieve this.

Original Image

First the image is filtered with 3X3 gaussian kernel to remove some amount of noise present in the image. Then a adaptive thresholding technique is applied on the image. A closing operation is done on the thresholded image which does a dilation followed by erosion on the image. The closed and gray images are divided to get a narrow histogram which when normalized increases the brightness and the contrast of the image.

Code:

import sys
from matplotlib import pyplot as plt
import numpy as np
import cv2
deflt="D:\Sriram\pyprograms\sudoku\\"
img=cv2.imread("D:\Sriram\pyprograms\sudoku\sudoku.jpg",1)
img=cv2.GaussianBlur(img,(3,3),0)
frame=cv2.cvtColor(img,cv2.COLOR_BGR2GRAY)
kernel=cv2.getStructuringElement(cv2.MORPH_ELLIPSE,(11,11))
close=cv2.morphologyEx(frame,cv2.MORPH_CLOSE,kernel)
div=np.float32(frame)/(close)
res=np.uint8(cv2.normalize(div,div,0,255,cv2.NORM_MINMAX))
res2=cv2.cvtColor(res,cv2.COLOR_GRAY2BGR)
#plt.imshow(res2),plt.title("res2")
#plt.xticks([]),plt.yticks([])
#plt.show()

Let us now mask out the sudoku part in the image by finding the contours. This done with assumption that sudoku occupies highest area inside the Image. A 5 % approximation is done on the contour to get a perfect shape. Finally the contour with highest area which has 4 vertices is taken out. After finding the biggest contour the sudoku is masked out by creating a mask image which has same dimensions as that of our input image. The biggest contour is filled with white while all other pixels are given zero. The mask and the res are bitwise anded to get the masked image.

thresh=cv2.adaptiveThreshold(res,255,cv2.ADAPTIVE_THRESH_GAUSSIAN_C,cv2.THRESH_BINARY_INV,11,2)
#tempthresh=cv2.resize(thresh,(450,450))
#cv2.imshow("thresh",tempthresh)
_,cnt,hr = cv2.findContours(thresh,cv2.RETR_TREE,cv2.CHAIN_APPROX_SIMPLE)
keys = [i for i in range(48,58)]
area=0
temparea=0
p=0
bestapprox=None
for i in range(len(cnt)):
 temparea=cv2.contourArea(cnt[i])
 if temparea>1000:
  peri=cv2.arcLength(cnt[i],True)
  epsilon=0.05*peri
  approx=cv2.approxPolyDP(cnt[i],epsilon,True)
  if len(approx)==4 and temparea>area:
   area=temparea
   bestapprox=approx
   p=i
img=cv2.polylines(img,[bestapprox],True,(0,255,0),3)
cv2.drawContours(img,cnt,p,(255,0,0),2)
(x,y)=thresh.shape
mask=np.zeros((x,y),np.uint8)
mask=cv2.drawContours(mask,cnt,p,255,-1)
mask=cv2.drawContours(mask,cnt,p,0,2)
masked=cv2.bitwise_and(mask,res)
cv2.imshow("masked",masked)
cv2.waitKey(0)

Now comes the part of extracting the each and every individual box in the image. Let us first find all the box points in the sudoku. For that we take Sobel in both x and y directions which gives us the edges in the respective directions, when we take absolute value of the Sobel. Absolute is taken to make negative peaks also to act as positive peaks in the image. When Sobel taken in the both the directions are anded we get box points of the image (not exactly). The results looks as shown below.

kernelx = cv2.getStructuringElement(cv2.MORPH_RECT,(2,10))
dx = cv2.Sobel(masked,cv2.CV_16S,1,0)
dx = cv2.convertScaleAbs(dx)
cv2.normalize(dx,dx,0,255,cv2.NORM_MINMAX)
ret,close = cv2.threshold(dx,0,255,cv2.THRESH_BINARY+cv2.THRESH_OTSU)
#cv2.imshow("dx",close)
close = cv2.morphologyEx(close,cv2.MORPH_DILATE,kernelx,iterations = 1)
_,contour, hr = cv2.findContours(close,cv2.RETR_EXTERNAL,cv2.CHAIN_APPROX_SIMPLE)
for cnt in contour:
    x,y,w,h = cv2.boundingRect(cnt)
    if h/w > 5:
        cv2.drawContours(close,[cnt],0,255,-1)
    else:
        cv2.drawContours(close,[cnt],0,0,-1)
close = cv2.morphologyEx(close,cv2.MORPH_CLOSE,None,iterations = 2)
closex = close.copy()
#cv2.imshow("closex",closex)

kernely=cv2.getStructuringElement(cv2.MORPH_RECT,(10,2))
dy=cv2.Sobel(masked,cv2.CV_16S,0,2)
dy=cv2.convertScaleAbs(dy)
cv2.normalize(dy,dy,0,255,cv2.NORM_MINMAX)
ret,close=cv2.threshold(dy,0,255,cv2.THRESH_BINARY+cv2.THRESH_OTSU)
close=cv2.morphologyEx(close,cv2.MORPH_DILATE,kernely,iterations=1)
_,cnt, hier = cv2.findContours(close,cv2.RETR_EXTERNAL,cv2.CHAIN_APPROX_SIMPLE)
for i in range(len(cnt)):
    x,y,w,h = cv2.boundingRect(cnt[i])
    if w/h > 5:
        cv2.drawContours(close,cnt,i,255,-1)
    else:
        cv2.drawContours(close,cnt,i,0,-1)
close = cv2.morphologyEx(close,cv2.MORPH_CLOSE,None,iterations = 2)
closey=close
#cv2.imshow("closey",closey)
grid=cv2.bitwise_and(closex,closey)
#cv2.imshow("grid",grid)

The grid image as shown above does not give us exact box points, so we find contours in the grid, and we take centroid of all those contours using moments function that is available in Opencv.

Now there are two ways in which you get the exact output image,

Exact method:

In this method we take care of convexity defects and correct them by making some perspective transform for the whole image. But in my code I go to this method only when the length of the centroid array is 100. I hope when the paper is stiff and when it is resting on a table the probability that it gets a convexity defect is less hence with this assumption when we don’t have the above condition satisfied we go for the approximate method.

Approximate method:

This is just a normal method of taking a perspective transform and of the input image. This could have done at first without taking all the sobel stuff and finding the grid points but this method does not give us the exact sudoku structure, hence I prefer to check for the getting all the grid points, we don’t get all the grid points sometimes (Many times actually) hence if the numbers of grid points is higher than certain value we can have an assumption that we have got the exact structure of sudoku and make approximate image of sudoku.

Exact method code:

_,contour, hier = cv2.findContours(grid,cv2.RETR_LIST,cv2.CHAIN_APPROX_SIMPLE)
cent=[]
for cnt in contour:
 mom=cv2.moments(cnt)
 (x,y)=int(mom['m10']/mom['m00']),int(mom['m01']/mom['m00'])
 cent.append((x,y))
 cv2.circle(img,(x,y),4,(0,255,0),-1)
ce=np.array(cent,np.uint32)
ce=ce.reshape((100,2))
ce2=ce[np.argsort(ce[:,1])]
#print ce2 
b = np.vstack([ce2[i*10:(i+1)*10][np.argsort(ce2[i*10:(i+1)*10,0])] for i in xrange(10)])
#print b
points = b.reshape((10,10,2))
print points
output=np.zeros((450,450,3),np.uint8)
for i in xrange(3):
 for j in range(3):
  partimg=np.array([points[i*3,j*3,:],points[(i)*3,(j+1)*3,:],points[(i+1)*3,(j+1)*3,:],points[(i+1)*3,j*3,:]],np.float32)
  print partimg
  dest=np.array([[j*150,i*150],[(j+1)*150,(i)*150],[(j+1)*150,(i+1)*150],[(j)*150,(i+1)*150]],np.float32)
  gpres=cv2.getPerspectiveTransform(partimg,dest)
  warp=cv2.warpPerspective(res2,gpres,(450,450))
  output[i*150:(i+1)*150,j*150:(j+1)*150]=warp[i*150:(i+1)*150,j*150:(j+1)*150]

cv2.rectangle(output,(0,0),(450,450),0,1)
cv2.imshow("output",output)

Output Image:

Exact method analysis:

First of all reshape the array into a 100x2 form. Now sort the array in the order of increasing y axis values. Now sort the array in the order of increasing x axis values by taking 10 values at a time. This process is done because the points we have found will not be in the order we expected and hence we do so to get them in the required form. Now we have all the points in the perfect order. Let us take the outer square points which consists of 9 boxes inside it and apply perspective transform to them to get our desired output image.

Approx method code:

dest=np.array(bestapprox,np.float32)
nw=np.array([[0,0],[0,450],[450,450],[450,0]],np.float32)
M=cv2.getPerspectiveTransform(dest,nw)
output=cv2.warpPerspective(res2,M,(450,450))

Approx method analysis :

Just take the best approx points and apply perspective transform to the image. Here I take my output image to be 450X450.

And Now we have successfully extracted the sudoku from the paper. Now it's time for training the OCR and giving the values of the matrix to sudoku solver. I have written some code for training my sudoku which can be used to for generating training data. Literally I did not succeed much, on training my OCR. But sometimes it worked good.

Follow the same steps for generating a training data.

Now coming back to the testing. Take each and every 50X50 box as in my case and give it to the findNearest function. And now solve the sudoku in the same way as I mentioned in the previous post.

The solved sudoku image is shown below.

The image I used was an example image which was used in opencv documentation. I have tried the same using a real time image. The image I used is shown below.

input

The blue line shows the original contour found. The green line shows the approximated contour.
The processed output image is shown below.

Output

The above image shows the output of the above input image. The sudoku is solved the same way as mentioned in the previous post.

My github link : Github Link for source code

Thanks to
Referring sites : Removing convexity defects
Digit Recognition
http://opencvpython.blogspot.in/

Thank you
Happy Coding :)
Cheers!!
#IIE

Sunday, December 6, 2015

Sudoku Solver

Hello Everyone!. Ever got bored of solving sudoku's. Then for you the solution is here, to create something that could solve and show us the solved sudoku . I have done it using python and an opencv python library. The functions used in this can be easily found out in the Opencv documentation refer them or comment below for any further clarification.

In this I have used KNearest(), a machine learning algorithm that is available in Opencv, which can be used to recognize stuffs but here I have used it to recognize the digits in the sudoku. Well Before that you have create your training data, which will be used for recognizing the digits.

Input Image

Int MAIN():

This is the basic sudoku solver which does not require any complex image processing techniques because the image taken is a binary image without any noise. Hence just finding the contours will satisfy the needs.

Let's first look at training the OCR for recognizing the numbers in the given input image

Basically, training the OCR requires hundreds of images for it to be used in recognizing images of different types. Since the images we use are only computer typed digits and the given input image is assumed to be taken in the binary form we are going to use only 1 image for each digit. I hope this enough for recognizing images which are in the binary form without any noise. Let's look forward towards training the OCR.

The first Part of the code reads the input image and converts the given image into gray.

The we find contours in the given image with the arguments saying the function to find contours in the tree form and to make simple approximation on the contours found, hence only minimum number of points required to draw the contour are saved.

from matplotlib import pyplot as plt
import numpy as np
import cv2
img = cv2.imread('sudoku.png',1)
frame=cv2.cvtColor(img,cv2.COLOR_BGR2GRAY)
samples=np.empty((0,100))
responses=[]
keys = [i for i in range(48,58)]
_,cnt,hr = cv2.findContours(frame,cv2.RETR_TREE,cv2.CHAIN_APPROX_SIMPLE)
#cv2.drawContours(img,cnt,-1,(255,0,0),2)

Image with contours

The below code snippet extracts each and every individual box of the sudoku and checks out whether the box considered has any child contour. If it does then the box having the child is extracted out and the user is asked to enter the response of the extracted image and the entered response is recorded in a variable. The sample used is resized into 10X10 image which is again reshaped into row array and its type converted and saved into a file. It's done so because the file size is reduced when the size of the image is small.

The sample image and its corresponding response is saved into a npz file.

Thus training the OCR can be accomplished.

count=0
numb=[]
for i in range(len(cnt)):
 area=cv2.contourArea(cnt[i])
 if area>=40000:
  x,y,w,z=hr[0][i]
  if w!=-1:
   a,b,c,d=cv2.boundingRect(cnt[w])
   print c,d
   errx=100-c
   erry=180-d
   b=b-(erry/2)
   a=a-(errx/2)
   c=100
   d=180
   roi=frame[b:b+d,a:a+c]
   cv2.imshow("1",roi)
   cv2.rectangle(img,(a,b),(a+c,b+d),(0,255,0),4)
   q=cv2.resize(img,(400,400))
   cv2.imshow("frame",q)
   count=count+1
   key=cv2.waitKey(0)
   if key in keys:
    responses.append(int(chr(key)))
    roismall=cv2.resize(roi,(10,10))
    roismall=roismall.reshape(1,100)
    samples=np.append(samples,roismall,0)
    if key not in numb:
     numb.append(key)
     picname= chr(key)+".png"
     cv2.imwrite(picname,img[b:b+d,a:a+c])
plt.imshow(img),plt.title("frame")
plt.xticks(),plt.yticks()
plt.show()
cv2.waitKey(0)
print count
samples=np.array(samples,np.float32)
responses = np.array(responses,np.float32)
responses = responses.reshape((responses.size,1))
np.savez('knn.npz',train=samples,traindata=responses)
print "training complete"

SudokuTrain.py

Part-2:

The next part is loading the training data and extracting the individual boxes and processing them by converting the given images to the form it has been saved and then giving it to the findNearest function which gives the nearest out an array containing the calculated result. The values recognized are stored in a matrix. The matrix is then given to the sudoku solving algorithm. The algorithm first checks out for an empty cell and assigns it a value from 1 to 9 which is passed to the a function which checks the validity. The function checks whether the assigned value is valid or not, if it is valid the recursively calls itself which again does the same job. If suppose the is not compatible with any of the values assigned for the function returns to its parent function which was recursively called and then the previous function assigns another value to and the process repeats itself.

Output Image

from matplotlib import pyplot as plt
import numpy as np
import cv2
img = cv2.imread('sudoku\sudoku.png',1)
deflt="D:\sudoku\\"
frame=cv2.cvtColor(img,cv2.COLOR_BGR2GRAY)
print frame.shape
samples=np.empty((0,100))
responses=[]
keys = [i for i in range(48,58)]
_,cnt,hr = cv2.findContours(frame,cv2.RETR_TREE,cv2.CHAIN_APPROX_SIMPLE)
cv2.drawContours(img,cnt,-1,(255,0,0),2)
count=0
model= cv2.ml.KNearest_create()
with np.load('D:\Sriram\pyprograms\sudoku\knn.npz') as data:
 print data.files
 train1=data['train']
 train_labels= data['traindata']
 print train1
 print train_labels
model.train(train1,cv2.ml.ROW_SAMPLE,train_labels)
matrix=np.zeros((9,9),np.uint8)
print matrix
for i in range(len(cnt)):
 area=cv2.contourArea(cnt[i])
 if area>=40000:
  x,y,w,z=hr[0][i]
  if w!=-1:
   a,b,c,d=cv2.boundingRect(cnt[w])
   #print c,d
   errx=100-c
   erry=180-d
   b=b-int(erry/2)
   a=a-int(errx/2)
   c=100
   d=180
   roi=frame[b:b+d,a:a+c]
   cv2.rectangle(img,(a,b),(a+c,b+d),(255,0,0),4)
   q=cv2.resize(img,(400,400))
   cv2.imshow("frame",q)
   roismall=cv2.resize(roi,(10,10))
   roismall=roismall.reshape(1,100)
   roismall1=np.float32(roismall)
   r=model.findNearest(roismall1,1)
   print int(r[1][0])
   count=count+1
   count1=81-count
   row=int(count1/9)
   col=int(count1%9)
   matrix[row][col]=int(r[1][0])
  else:
   count=count+1
   count1=81-count
   row=int(count1/9)
   col=int(count1%9)
   matrix[row][col]=0
print count
#print matrix
def nextcell(matrix,i,j):
 for x in range(i,9):
  for y in range(j,9):
   if matrix[x][y]==0:
    return x,y
 for x in range(0,9):
  for y in range(0,9):
   if matrix[x][y]==0:
    return x,y
 return -1,-1
def checkcell(matrix,i,j,val):
 rowcheck=colcheck=1
 for x in range(9):
  if val==matrix[i][x]:
   rowcheck=0
 if rowcheck==1:
  for x in range(9):
   if val==matrix[x][j]:
    colcheck=0
 if colcheck==1 and rowcheck==1:
  topx=3*(i/3)
  topy=3*(j/3)
  for x in range(topx,topx+3):
   for y in range(topy,topy+3):
    if matrix[x][y]==val:
     return False
  return True
 return False
def sudokusolve(matrix,i=0,j=0):
 i,j=nextcell(matrix,i,j)
 if i==-1:
  return True
 for val in range(1,10):
  if checkcell(matrix,i,j,val):
   matrix[i][j]=val
   if sudokusolve(matrix,i,j):
    return True
   matrix[i][j]=0
 return False

if sudokusolve(matrix,0,0):
 print matrix
 print "Eureka"
else:
 print "Could not solve"
count=0
val=0
for i in range(len(cnt)):
 area=cv2.contourArea(cnt[i])
 if area>=40000:
  x,y,w,z=hr[0][i]
  if w!=-1:
   count=count+1
   continue
  else:
   count=count+1
   count1=81-count
   row=int(count1/9)
   col=int(count1%9)
   val=matrix[row][col]
   s=str(val)
   a,b,c,d=cv2.boundingRect(cnt[i])
   im=cv2.imread(deflt+s+".png",1)
   img[b+10:b+190,a+50:a+150]=im
plt.imshow(img),plt.title("frame")
plt.xticks([]),plt.yticks([])
plt.show()
cv2.destroyAllWindows()

Test.py

Thank you
Happy Coding.
Cheers!!!
#IIE

Thursday, December 3, 2015

Object Detection using opencv

def Story():

Well it has been few months since I updated the blog. Well now I have been given an opportunity to work on FPGAs from the start of my 3rd semester. I started learning image processing in my first year and I am still growing on...

int main()

This is my first tutorial towards image processing. This tutorial will help you identify basic objects which can be further tracked or processed. First of you will require opencv python installed. Before that you have to install numpy depending on your processor.Lets get started !!!.

The opencv python windows installation procedure can be found in this link;Opencv Windows installation

The below image shows a red object being detected. This is a basic program to which helps us to get started with opencv.

I hope the code is self explanatory with its comments.
Happy coding!!..

import cv2
import numpy as np
cap = cv2.VideoCapture(0) #Start Video Capture
while True:
 ret,frame=cap.read() #Capture a frame
 img=cv2.cvtColor(frame,cv2.COLOR_BGR2HSV)  #Convert BGR to HSV for easy processing
 lo=np.array([160,100,100])     # lower and upper Threshold values for inRange. The object of
 up=np.array([179,255,255])     # my intrest is red color.
 mask=cv2.inRange(img,lo,up,None)     # It only shows pixels in the given color range
 kernel=np.ones((5,5),np.uint8)  # A small matrix used for processing by morph function
 mask=cv2.morphologyEx(mask,cv2.MORPH_OPEN,kernel) # Morph is used to remove the noise
 _,cnt,hie=cv2.findContours(mask,cv2.RETR_EXTERNAL,cv2.CHAIN_APPROX_SIMPLE) # It identifies the objects in our image
 for i in range(len(cnt)):
  moments = cv2.moments(cnt[i]) # It gives a dictionary of all moment values. Here we use it for calculating centre
  x,y,w,h=cv2.boundingRect(cnt[i]) # Find a rectangle bounding our contour
  if w*h>5000:     # If area is greater than this then it is considered as a object.(optional)
   cv2.rectangle(frame,(x,y),(x+w,y+h),(255,0,0),2) # Draw the rectangle with the specified points
   cv2.circle(frame,(int(moments['m10']/moments['m00']), int(moments['m01']/moments['m00'])),2,(0,255,0),3)
   # Draw the centre of the rectangle
 cv2.imshow("frame",frame) # Display the frames
 cv2.imshow("mask",mask)
 k=cv2.waitKey(60) & 0xff # The program exits if ESC key is pressed
 if k==27:
  break

cap.release()
cv2.destroyAllWindows()

#IIE