A project using CNNs to remove ruled lines from sketches
- Overview
- The Data
- The Strategy
- The Pipeline
- Data Classification
- CNN Architecture
- CNN Parameters
- CNN Results
- Picture Scrubbing
- Results
- Comparison
- App
- Further Work
- Sources
Thank you to Land Belenky for the project idea. Land's Uncle Peter accumulated 1124 sailboat pencil sketches over the years. Unfortunately, 513 of these images were done on ruled paper. Can we salvage the ruled pictures?
Hypothesis: It is possible to train a CNN to remove ruled lines from an image without apparent degradation of the image
Goals
Ruled | Unruled |
---|---|
![]() | ![]() |
1124 total sailboat drawings, 513 on ruled paper. Some files are corrupt. Taking a closer look at just the ruled drawings ...
Dimensions |
---|
![]() |
Aspect Ratio |
---|
![]() |
It is clear that the drawings will have to be standardized.
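For illustration only, a standardization pass could look something like the sketch below; the target size, grayscale conversion, and file pattern are assumptions rather than the project's actual settings.

```python
from pathlib import Path

from PIL import Image

TARGET_SIZE = (1024, 1024)  # assumed common size; the real value may differ


def standardize_drawings(src_dir: str, dst_dir: str) -> None:
    """Convert every drawing to grayscale and resize it to one common shape.

    Note: a fixed square size ignores the varying aspect ratios shown above;
    this is a simplification for the sketch.
    """
    out = Path(dst_dir)
    out.mkdir(parents=True, exist_ok=True)
    for path in sorted(Path(src_dir).glob("*.jpg")):
        try:
            img = Image.open(path).convert("L")  # grayscale
        except OSError:
            continue  # skip the corrupt files mentioned above
        img = img.resize(TARGET_SIZE, Image.LANCZOS)
        img.save(out / path.name)
```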
At first, I considered augmenting lines onto unruled images. I would then train an autoencoder to remove lines from these images. Finally, I would apply the model to ruled images.
Due to the difference in quality across the images, I could not find a process for augmenting lines that would closely mimic real ruled paper. I would also have had to resize these very large images, degrading their quality.
I then found a new strategy in a paper that showed how to remove staff lines from music scores. Instead of looking at the entire picture, the paper looked at 28 x 28 windows and classified whether each window's central pixel came from a staff or a symbol. A CNN was trained on these staff and symbol classifications, and pixels classified as staff were then removed. The results were impressive.
From a directory of standardized ruled sailboat drawings, we classify thousands of 30 x 30 frames and split them into train and test directories. The CNN is trained and tested using these directories. Finally, we take a standardized sailboat drawing to be scrubbed: the drawing is divided into 30 x 30 frames, we predict whether each frame is a line frame or a drawing frame, and the line frames are removed. The result is a scrubbed sailboat drawing.
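As a sketch of the training half of that pipeline, the snippet below loads labeled frames from assumed `frames/train` and `frames/test` directories (each with `line/` and `drawing/` subfolders) and fits the model; the directory layout is an illustrative assumption, and `build_cnn()` refers to the sketch under "CNN Parameters" below.

```python
import tensorflow as tf

FRAME_SIZE = (30, 30)

# Assumed directory layout (not from the project):
#   frames/train/line, frames/train/drawing
#   frames/test/line,  frames/test/drawing
# Class indices are assigned alphabetically, so drawing = 0 and line = 1.
train_ds = tf.keras.utils.image_dataset_from_directory(
    "frames/train", label_mode="binary", color_mode="grayscale",
    image_size=FRAME_SIZE, batch_size=10)
test_ds = tf.keras.utils.image_dataset_from_directory(
    "frames/test", label_mode="binary", color_mode="grayscale",
    image_size=FRAME_SIZE, batch_size=10)

# build_cnn() is sketched under "CNN Parameters" below.
model = build_cnn(input_shape=(30, 30, 1))
model.fit(train_ds, validation_data=test_ds, epochs=10)
model.evaluate(test_ds)
```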
Line Class | Drawing Class |
---|---|
![]() | ![]() |
Data is needed to train, test, and predict with a Convolutional Neural Network. I took a sample of 27 drawings and iterated through sections of each drawing, labeling each frame as line or drawing depending on whether its central pixel fell on a ruled line or on the drawing itself. After initially collecting 7,000 total frames (roughly half line, half drawing), I looked at the mean pixel intensities of each class in each drawing, to make sure we were representing all the different kinds of lines and drawings that appear across the sailboat sketches.
Line Class Mean Intensities | Drawing Class Mean Intensities |
---|---|
![]() | ![]() |
As you can see, lines tend to look very similar no matter which drawing they are sampled from. Drawing content, on the other hand, varies widely from one sketch to the next.
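To make the frame-labeling step concrete, here is a minimal sketch. It assumes the known line and drawing pixel positions are supplied as coordinate lists and that the label encoding is 0 = drawing, 1 = line (matching the alphabetical class ordering above); none of this is the project's exact code.

```python
import numpy as np

FRAME = 30
HALF = FRAME // 2  # 15


def extract_frame(image: np.ndarray, row: int, col: int) -> np.ndarray:
    """Return the 30 x 30 window centered (as nearly as possible) on (row, col)."""
    return image[row - HALF: row + HALF, col - HALF: col + HALF]


def label_frames(image: np.ndarray, line_pixels, drawing_pixels):
    """Build (frames, labels) arrays, where 0 = drawing and 1 = line.

    line_pixels / drawing_pixels are lists of (row, col) positions whose class
    is already known -- in this project that knowledge came from hand labeling.
    """
    frames, labels = [], []
    for cls, pixels in ((1, line_pixels), (0, drawing_pixels)):
        for r, c in pixels:
            frames.append(extract_frame(image, r, c))
            labels.append(cls)
    return np.stack(frames), np.array(labels)
```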
The CNN architecture was determined through trial and error: I would create a model, scrub an image, and inspect the results, aiming to maximize line removal and drawing preservation.
Observations
Parameter | Value |
---|---|
Epochs | 10 |
Batch Size | 10 |
Image Size | 30 x 30 |
Filters | 64 |
Neurons | 64 |
Layers | 6 |
Kernel Size | 4 x 4 |
Pool Size | 2 x 2 |
Activation Function | ReLU |
Optimizer | Adam |
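For concreteness, here is one minimal Keras model consistent with the table; since the table only lists aggregate values, the exact way the layers are stacked (and the added rescaling and output layers) is my assumption rather than the project's verified architecture.

```python
from tensorflow.keras import layers, metrics, models


def build_cnn(input_shape=(30, 30, 1)):
    """A sketch consistent with the parameter table, not the verified architecture."""
    model = models.Sequential([
        layers.Rescaling(1.0 / 255, input_shape=input_shape),
        layers.Conv2D(64, (4, 4), activation="relu"),
        layers.MaxPooling2D((2, 2)),
        layers.Conv2D(64, (4, 4), activation="relu"),
        layers.MaxPooling2D((2, 2)),
        layers.Flatten(),
        layers.Dense(64, activation="relu"),
        layers.Dense(1, activation="sigmoid"),  # probability the central pixel is a line
    ])
    model.compile(optimizer="adam",
                  loss="binary_crossentropy",
                  metrics=["accuracy", metrics.Precision(), metrics.Recall()])
    return model
```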
Adjusting Epochs |
---|
![]() |
Adjusting Filters |
---|
![]() |
Adjusting Layers |
---|
![]() |
Accuracy |
---|
![]() |
Precision/Recall |
---|
![]() |
Loss |
---|
![]() |
On the training data, the model is very accurate, with strong precision and recall. You can also clearly see in these graphs that after 10 epochs, test/train accuracy, precision, recall, and loss do not improve further. It is also apparent, especially in the loss curve, that the model starts overfitting to the training data. These observations are confirmed in the scrubbed results: drawings are just as well preserved at 10 epochs as they are at 100 epochs.
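Curves like these can be produced from the `History` object that Keras returns from `fit`; the helper below is just an illustrative sketch, not the plotting code used for the figures above.

```python
import matplotlib.pyplot as plt


def plot_history(history) -> None:
    """Plot train vs. test curves for each logged metric to spot where
    improvement flattens out (here, around 10 epochs)."""
    train_keys = [k for k in history.history if not k.startswith("val_")]
    for key in train_keys:
        plt.figure()
        plt.plot(history.history[key], label=f"train {key}")
        if f"val_{key}" in history.history:
            plt.plot(history.history[f"val_{key}"], label=f"test {key}")
        plt.xlabel("epoch")
        plt.ylabel(key)
        plt.legend()
    plt.show()
```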
For the final step, I used my model to predict, frame by frame, whether the central pixel of each frame comes from a line or a drawing. If the pixel comes from a line, it is removed.
This step is extremely computationally expensive. To cut down on cost, I adopted a couple of strategies:
For one, I ran the job on a memory-optimized EC2 instance. Overall, scrubbing time dropped from 2 hours to 1 minute and 10 seconds.
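One natural way to cut the cost is to batch the frame predictions instead of calling the model once per pixel; the sketch below illustrates that idea with assumed intensity and probability thresholds, and is not necessarily the exact approach used here.

```python
import numpy as np

FRAME, HALF = 30, 15


def scrub(image: np.ndarray, model, batch_size: int = 4096) -> np.ndarray:
    """Whiten every dark pixel whose 30 x 30 frame is classified as 'line'.

    `image` is assumed to be a standardized grayscale array (0 = black,
    255 = white). The intensity threshold, batch size, and 0.5 cutoff are
    illustrative guesses.
    """
    out = image.copy()
    rows, cols = np.where(image < 200)  # only classify non-white pixels
    coords = [(r, c) for r, c in zip(rows, cols)
              if HALF <= r < image.shape[0] - HALF and HALF <= c < image.shape[1] - HALF]

    # Predicting in large batches (instead of one frame at a time) is the
    # main thing that keeps this step from taking hours.
    for start in range(0, len(coords), batch_size):
        chunk = coords[start:start + batch_size]
        frames = np.stack([image[r - HALF:r + HALF, c - HALF:c + HALF]
                           for r, c in chunk]).astype("float32")
        preds = model.predict(frames[..., np.newaxis], verbose=0).ravel()
        for (r, c), p in zip(chunk, preds):
            if p >= 0.5:          # 1 = line under the labeling assumed earlier
                out[r, c] = 255   # erase the central pixel
    return out
```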
Before | After |
---|---|
![]() | ![]() |
![]() | ![]() |
![]() | ![]() |
![]() | ![]() |
The CNN appears to work better than other line-removal strategies.
Further Work:
- More Data
- Multiprocessing
- Colorizing