Let’s start our machine learning journey! This week I started the real work on the project. For those who don’t know, my project will be a kind of OCR project, so we will be dealing with images, computer vision, and all sorts of interesting problems. In this episode we will start getting ready for the machine learning part.

As someone who is used to writing a semicolon after each line, I find the Python syntax strange. On the other hand, Python is much cleaner and less affected by programmer style (except for those who try to write everything in one line). Right now I’m taking a course to extend my knowledge of Python, but that can’t stop me from working on the project.

OpenCV

Even though I thought the environment was already fully set up, last week I added one more library: OpenCV (Open Source Computer Vision). OpenCV implements a really wide range of functions for image and video processing. It also has good documentation with tutorials, and you can use it from other languages too.

Because I want to use it with Python 3, I have to install OpenCV 3.1. Installation isn’t that hard, but you have to build it on your own. Try following one of these guides: Installation in Linux (OpenCV Docs) or Installing OpenCV in Ubuntu for Python 3. Don’t forget that after installing OpenCV 3 you still import it as import cv2.

Image Processing

Even though this is a machine learning project, first we have to get our data from somewhere. And since every dataset that could easily be fed into machine learning algorithms has already been used, we will have to do some preprocessing and data extraction.

Since I am focusing on OCR and scanners are too slow for modern life, we will be using photos taken with a phone. This means that I have to get rid of the background and find only the important parts of the image. Let’s start with the problem of finding a page in a photo. There are already solutions for that, but I want to truly understand the concepts behind them, so I can optimize them for my problem.

But first we have to gain some knowledge of the techniques used in computer vision. Let’s start with filters, because they can be used for more than making your Instagram photos look good. Filters give us a good start for learning about problems like edge detection, which is useful for object detection. I don’t feel qualified to explain these problems, but you can use these videos as your starting point.
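To make the idea of a filter a bit more concrete, here is a minimal sketch in plain NumPy. It is not code from the project and not how OpenCV implements filtering. It only shows the basic principle: slide a small kernel over a grayscale image and sum the weighted pixels. The function names apply_filter and edge_strength, the tiny synthetic image, and the choice of a box blur and the Sobel kernels are all my assumptions for illustration.

```python
import numpy as np


def apply_filter(image, kernel):
    """Slide a small kernel over a 2-D grayscale image.

    Illustrative and slow on purpose; the kernel is applied without
    flipping (cross-correlation).
    """
    kh, kw = kernel.shape
    pad_h, pad_w = kh // 2, kw // 2
    # Repeat the border pixels so the output keeps the input's shape.
    padded = np.pad(image.astype(float),
                    ((pad_h, pad_h), (pad_w, pad_w)), mode="edge")
    out = np.zeros(image.shape, dtype=float)
    for y in range(image.shape[0]):
        for x in range(image.shape[1]):
            window = padded[y:y + kh, x:x + kw]
            out[y, x] = np.sum(window * kernel)
    return out


# A box blur averages each pixel with its eight neighbours.
BOX_BLUR = np.ones((3, 3)) / 9.0

# Sobel kernels respond to horizontal and vertical brightness changes.
SOBEL_X = np.array([[-1, 0, 1],
                    [-2, 0, 2],
                    [-1, 0, 1]], dtype=float)
SOBEL_Y = SOBEL_X.T


def edge_strength(image):
    """Combine both Sobel responses into one gradient magnitude."""
    gx = apply_filter(image, SOBEL_X)
    gy = apply_filter(image, SOBEL_Y)
    return np.hypot(gx, gy)


# Synthetic image: dark left half, bright right half.
image = np.zeros((6, 6))
image[:, 3:] = 255.0

blurred = apply_filter(image, BOX_BLUR)
edges = edge_strength(image)
print(edges)
```

On this toy image the edge response is non-zero only in the two columns next to the dark/bright boundary, which is the kind of signal a page detector could build on. A real photo would first need to be loaded and converted to grayscale.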

Conclusion

I find computer vision problems really interesting, maybe because the results of your code can be easily visualized, or because they are so easy for humans but really complicated for computers. I haven’t finished any production-ready code for this problem yet, but you can look forward to some code in the next blog post.