Beyond applying for graduate schools, these past few months I have been knee-deep in machine learning with TensorFlow at my place of employment.
My project is fairly straightforward. When presented with a mechanical rotating digit, the computer must determine the digit’s value at any point during the rotation. Of course there are complicating factors: the digit may be partially occluded by another object, and it is prone to left-right wobble.
My first attempt used TensorFlow v1.14. I took about 2,000 images of the digit in various stages of rotation. Using Pillow, I shifted each image horizontally so that every captured image had 18 representations with the numeral in different left-to-right positions, bringing my training image count to about 38K. I loaded the images into a NumPy array for training and prepared another 1,000 images for testing. For the time being I have tabled the issue of partial occlusion.
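For the curious, here is a minimal sketch of that horizontal-shift augmentation. The crop size, shift range, and black fill behavior are my assumptions for illustration, not the exact values I used in the project.

```python
import numpy as np
from PIL import Image

def shifted_copies(path, n_shifts=18, max_offset=8):
    """Load one digit crop and return n_shifts horizontally shifted copies."""
    img = Image.open(path).convert("L")  # grayscale
    offsets = np.linspace(-max_offset, max_offset, n_shifts).astype(int)
    copies = []
    for dx in offsets:
        # Affine translation along x; vacated pixels are filled with black.
        shifted = img.transform(img.size, Image.AFFINE,
                                (1, 0, int(dx), 0, 1, 0), fillcolor=0)
        copies.append(np.asarray(shifted, dtype=np.float32) / 255.0)
    return copies

# Every captured image contributes 18 shifted training samples:
# paths = [...]  # the ~2,000 captured digit crops
# X_train = np.stack([c for p in paths for c in shifted_copies(p)])
```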
The model I used was very simplistic, perhaps 10 layers, and was based on the tutorials the TensorFlow website provides for new users. I trained the model and, being a novice who did not yet understand over-fitting versus under-fitting, stopped training once it reached 100% accuracy, very likely over-fitting in the process. After training was complete, I evaluated the test images and was satisfied with the 96% accuracy that came back.
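For reference, this is roughly the kind of small model those beginner tutorials build. The layer sizes, input shape, and the commented-out training call are illustrative assumptions, not the exact network I trained.

```python
import tensorflow as tf

# Assumed 28x28 grayscale crops and ten digit classes (0-9).
model = tf.keras.Sequential([
    tf.keras.layers.Flatten(input_shape=(28, 28)),
    tf.keras.layers.Dense(128, activation="relu"),
    tf.keras.layers.Dropout(0.2),
    tf.keras.layers.Dense(10, activation="softmax"),
])

model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])

# Stopping only when training accuracy hits 100% is the over-fitting trap described
# above; a validation split plus EarlyStopping is the usual remedy.
# early_stop = tf.keras.callbacks.EarlyStopping(monitor="val_loss", patience=3,
#                                               restore_best_weights=True)
# model.fit(X_train, y_train, epochs=30, validation_split=0.2, callbacks=[early_stop])
# model.evaluate(X_test, y_test)
```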
Through the summer this over-fitted model, with its shallow 8-10 layers, was deployed on a Raspberry Pi Zero prototype and used to make predictions on real-world data. The accuracy came nowhere near the 96% returned on the test data; it was perhaps 70%, and I was very unsatisfied.
So with these results in hand I embarked on an effort to improve my TensorFlow approach. The first website I stumbled upon was Digit recognition using Tensorflow by teavanist, which uses the very well-known MNIST hand-written digits database and presents a problem similar to my own. I went through the tutorial step by step and obtained 92.2% accuracy on the test data.
The painstaking part came next. I realized that I was short on training images, so I set about collecting another set and eventually amassed 145k images, then set aside 25% of them for validation. This took a few weeks. I upgraded to TensorFlow v2.0 and went through teavanist’s code line by line, looking up the APIs to understand the parameters used. I also did some reading on machine-learning theory and on how the Inception model compares with other architectures.
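As an aside, carving out that 25% validation split is a one-liner affair with NumPy. The file paths and random seed below are placeholders, not the project’s actual layout.

```python
import numpy as np

images = np.load("digit_images.npy")   # hypothetical path: the stacked 145k crops
labels = np.load("digit_labels.npy")   # hypothetical path: the matching digit labels

rng = np.random.default_rng(seed=42)
indices = rng.permutation(len(images))   # shuffle before splitting
split = int(0.75 * len(images))          # 75% for training, 25% held out for validation

X_train, y_train = images[indices[:split]], labels[indices[:split]]
X_val, y_val = images[indices[split:]], labels[indices[split:]]
```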
Ultimately I chose Inception v4 after looking closely at Inception-ResNet v1 and ResNet v2. I found a code template for the model and adapted it for use in my project. After hurling Humpty Dumpty off the wall, I have come very close to putting him back together again, save for some quirky behavior I am still sorting out. For those familiar with old television series, this will hopefully prove to be the $6 million Humpty Dumpty.
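Inception v4 is not bundled with tf.keras.applications, hence the code template. For readers unfamiliar with the architecture, here is a toy sketch of the core idea behind an Inception-style block: parallel convolution branches with different receptive fields, concatenated along the channel axis. The filter counts, input shape, and classification head are illustrative guesses, nowhere near the real Inception v4 configuration.

```python
import tensorflow as tf
from tensorflow.keras import layers

def inception_block(x, filters=32):
    """Toy Inception-style module: several receptive fields processed in parallel."""
    b1 = layers.Conv2D(filters, 1, padding="same", activation="relu")(x)

    b2 = layers.Conv2D(filters, 1, padding="same", activation="relu")(x)
    b2 = layers.Conv2D(filters, 3, padding="same", activation="relu")(b2)

    b3 = layers.Conv2D(filters, 1, padding="same", activation="relu")(x)
    b3 = layers.Conv2D(filters, 5, padding="same", activation="relu")(b3)

    b4 = layers.MaxPooling2D(3, strides=1, padding="same")(x)
    b4 = layers.Conv2D(filters, 1, padding="same", activation="relu")(b4)

    # Concatenate the branch outputs along the channel axis.
    return layers.Concatenate()([b1, b2, b3, b4])

inputs = tf.keras.Input(shape=(96, 96, 1))            # assumed crop size
x = layers.Conv2D(32, 3, strides=2, activation="relu")(inputs)
x = inception_block(x)
x = layers.GlobalAveragePooling2D()(x)
outputs = layers.Dense(10, activation="softmax")(x)   # ten digit classes
model = tf.keras.Model(inputs, outputs)
```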
Along the way I ran into a rather annoying TensorFlow memory leak, and on top of that a bug that prevents saving a fully trained model. Fortunately I am able to save the weights. My impression of TensorFlow v2.0 is that even though it is no longer in beta or RC, it is still a bit buggy. Perhaps rushed out the door too soon? At any rate, I will fall back to v1.15 only if I hit a snag that cannot be solved in v2.0. For the moment I have forgone posting the code, since this is a company project. Hopefully in my next post I will have some code snippets to share.
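In the meantime, here is a generic sketch of the weights-only workaround mentioned above; none of this is the company code, and the build_model() helper and checkpoint path are placeholders.

```python
import tensorflow as tf

def build_model():
    # Placeholder architecture; the real project rebuilds the Inception v4 graph here.
    return tf.keras.Sequential([
        tf.keras.layers.Flatten(input_shape=(28, 28)),
        tf.keras.layers.Dense(10, activation="softmax"),
    ])

model = build_model()
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy")
# ... model.fit(...) would run here ...
model.save_weights("digit_model.ckpt")   # hypothetical checkpoint path

# Later, or on the Raspberry Pi, recreate the same architecture and restore the weights.
restored = build_model()
restored.load_weights("digit_model.ckpt")
```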