Sunday, April 14, 2019

Learning to Match Fresco --- A Machine Learning Approach (Hao Huang)

When I visited the Metropolitan Museum in New York during winter break, I came across a beautiful ancient Egyptian fresco in the Ancient Egyptian Works of Art Exhibition. The fresco was an intricate depiction of a papyrus marsh from 1427 – 1400 BC, with a distant fishing colony in view. Although the fresco took more than two years to reconstruct, it provided many key insights into the daily lives of ancient Egyptians and helped to verify many modern-day archeological hypotheses.

On my way back to Gainesville, I couldn’t stop thinking that it is already 2019 and archeologists still have to piece together each broken fresco by hand. At a time when Google Duplex can mimic the sound of a natural conversation and Siri has become everyone’s personal assistant, I couldn’t help wondering whether machines could also help us match these fresco pieces.

The more I thought about it, the more it looked like a machine learning optimization problem. So on the plane ride back to Florida, I did some digging and fortunately found that my own research professor, Dr. Corey Toler-Franklin, had already published a paper on this idea. Basically, it describes a machine learning method based on cross training with novel image features.

So I know what you are asking: what does machine learning actually mean? It is a process in which a computer gradually builds a model that resembles how the real world works, similar to how we humans develop intuitions about the real world as we grow and learn.

This is cool and all, but how does a computer actually learn? Let’s start simple. I remember that when I was a kid I was told not to touch poison ivy. I did it anyway because I was curious. Then, as expected, I got a rash that itched so badly that I swore never to touch poison ivy again. In this situation, I have a utility function for happiness. The factors that influence happiness are the fulfillment of curiosity and the itch and pain from touching the ivy. Clearly, after my encounter, my utility became negative, as the pain from the ivy outweighed the joy I got from touching it.

The same goes for computers. For any situation, a computer has a utility function with factors that determine how its utility changes. Through trial and error, it seeks to maximize the utility function by changing those factors.
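To make the trial-and-error idea concrete, here is a minimal sketch in Python. The utility function and its peak at 0.3 are made up purely for illustration, and the simple hill-climbing loop is just one of many ways to search for a maximum. It is not the method from the paper.

```python
import random


def utility(factor: float) -> float:
    """Hypothetical utility: highest when the factor equals 0.3."""
    return -(factor - 0.3) ** 2


def hill_climb(start: float, steps: int = 2000,
               step_size: float = 0.05, seed: int = 0) -> float:
    """Nudge the factor at random and keep only the nudges that help."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    best = start
    best_utility = utility(best)
    for _ in range(steps):
        candidate = best + rng.uniform(-step_size, step_size)
        candidate_utility = utility(candidate)
        if candidate_utility > best_utility:  # trial succeeded, keep it
            best, best_utility = candidate, candidate_utility
    return best


best_factor = hill_climb(start=1.0)
print(f"best factor found: {best_factor:.3f}")
```

Every failed trial is simply thrown away, much like my one and only encounter with poison ivy.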

With this in mind, let’s look at the fresco. The first step in building the fresco matching algorithm is to formulate a utility function. In this case, how well the fresco pieces fit together determines the utility. Dr. Toler-Franklin defined a well-fitting match as two pieces of fresco that minimize the distance between their adjacent edges while maximizing the contact region between those edges. As shown in Figures 1 and 2, the contact regions are illustrated as two ribbons (red and purple), and the goal is to find the two ribbons that match each other most closely.

Figure 1. The depiction of the ribbon region on two fresco pieces

Figure 2. The contact region of the two ribbons
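As a rough sketch of what such a utility could look like in code, the snippet below scores a candidate pairing of two edges. Everything here is an assumption made for illustration: the edges are given as equally long lists of 2D points paired index by index, and `contact_tol` and `gap_weight` are hypothetical parameters. The paper's actual formulation is more involved.

```python
import math
from typing import Sequence, Tuple

Point = Tuple[float, float]


def edge_match_score(edge_a: Sequence[Point], edge_b: Sequence[Point],
                     contact_tol: float = 0.5,
                     gap_weight: float = 1.0) -> float:
    """Higher is better: reward contact, penalize the average gap."""
    if len(edge_a) != len(edge_b) or not edge_a:
        raise ValueError("edges must be non-empty and equally sampled")
    # Distance between each pair of corresponding points on the two edges.
    gaps = [math.dist(p, q) for p, q in zip(edge_a, edge_b)]
    mean_gap = sum(gaps) / len(gaps)
    # Fraction of the edge where the two pieces are (nearly) touching.
    contact_fraction = sum(g <= contact_tol for g in gaps) / len(gaps)
    return contact_fraction - gap_weight * mean_gap


edge = [(0.0, 0.0), (1.0, 0.2), (2.0, 0.0)]
snug = [(0.0, 0.1), (1.0, 0.3), (2.0, 0.1)]   # hugs the edge closely
loose = [(0.0, 1.0), (1.0, 2.0), (2.0, 1.0)]  # far away from the edge

snug_score = edge_match_score(edge, snug)
loose_score = edge_match_score(edge, loose)
print(snug_score > loose_score)
```

The snug pairing scores higher because its ribbon stays in contact along the whole edge and the gap stays small.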



Working with this framework, the next goal is to find image attributes that could inform the machine learning algorithm. Just like how we identify objects, we need to tell the computer what features to look out for. For example, a car has wheels, has about four windows, and is usually roughly rectangular in shape. Thus, if we were to build an algorithm to find cars, we would program it to look out for rectangular objects that have wheels and windows.

Similarly, to identify fresco pieces that might match, we look at multiple image features that characterize each piece, such as color, hue, and saturation. But the most important feature is the contour: the algorithm focuses on the contours along the sides of the fresco pieces and compares the two for similarity. A three-dimensional infrared scan was performed to capture the contours, which are then modeled by a vector map so that the “hills and valleys” on the edges can be interpreted by the computer.

After defining what is important for the computer to focus on, it is time to train the computer to find matches. The algorithm used is called M5P regression trees. Similar to how we construct a knowledge map in preparation for an exam, it is basically a piece of software that stores the information we present to the computer along with how important each piece of information is. In preparation for training, Dr. Toler-Franklin’s team scanned three sets of fresco pieces; each set consists of 5000 different fresco pieces and forms about 300 matches. To train it, we let the algorithm read in scanned images of one set of fresco pieces. By computing the similarity between two fresco pieces with the framework discussed above, it predicts whether the two pieces match. The result is then compared to the answer; if it is wrong, the algorithm adjusts the weights on the different information it considers, in this case the fresco attributes discussed above. The computer continues this cycle of trial and error until we get a reasonably accurate prediction rate.
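To give a feel for what a regression tree does, here is a toy version in plain Python. It is not M5P: M5P fits small linear models at the leaves and prunes the tree, while this sketch only stores the average target value in each leaf. The feature names (mean gap, contact fraction) and the six training rows are invented for illustration.

```python
from typing import Dict, List, Sequence


def _sse(values: Sequence[float]) -> float:
    """Sum of squared errors around the mean (0 for an empty list)."""
    if not values:
        return 0.0
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values)


def build_tree(rows: List[List[float]], targets: List[float],
               depth: int = 0, max_depth: int = 3, min_leaf: int = 2) -> Dict:
    """Recursively pick the split that best separates the target values."""
    mean = sum(targets) / len(targets)
    if depth >= max_depth or len(rows) < 2 * min_leaf or _sse(targets) == 0.0:
        return {"leaf": mean}
    best = None  # (error, feature index, threshold)
    for feature in range(len(rows[0])):
        for threshold in sorted({row[feature] for row in rows}):
            left = [t for r, t in zip(rows, targets) if r[feature] <= threshold]
            right = [t for r, t in zip(rows, targets) if r[feature] > threshold]
            if len(left) < min_leaf or len(right) < min_leaf:
                continue
            error = _sse(left) + _sse(right)
            if best is None or error < best[0]:
                best = (error, feature, threshold)
    if best is None:
        return {"leaf": mean}
    _, feature, threshold = best
    left_part = [(r, t) for r, t in zip(rows, targets) if r[feature] <= threshold]
    right_part = [(r, t) for r, t in zip(rows, targets) if r[feature] > threshold]
    return {
        "feature": feature,
        "threshold": threshold,
        "left": build_tree([r for r, _ in left_part], [t for _, t in left_part],
                           depth + 1, max_depth, min_leaf),
        "right": build_tree([r for r, _ in right_part], [t for _, t in right_part],
                            depth + 1, max_depth, min_leaf),
    }


def predict(tree: Dict, row: Sequence[float]) -> float:
    """Walk down the tree until a leaf is reached."""
    while "leaf" not in tree:
        branch = "left" if row[tree["feature"]] <= tree["threshold"] else "right"
        tree = tree[branch]
    return tree["leaf"]


# Made-up training pairs: [mean gap, contact fraction] -> 1.0 if they match.
rows = [[0.1, 0.9], [0.2, 0.8], [0.15, 0.95],
        [1.0, 0.2], [1.2, 0.1], [0.9, 0.3]]
targets = [1.0, 1.0, 1.0, 0.0, 0.0, 0.0]

tree = build_tree(rows, targets)
likely_match = predict(tree, [0.12, 0.9])
unlikely_match = predict(tree, [1.1, 0.15])
print(likely_match, unlikely_match)
```

Note that a tree like this learns by choosing split points rather than by nudging weights one example at a time, so the "adjusting the weights" picture above is best read as an analogy.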

So after about 100 thousand iterations, the algorithm achieved 96.5% accuracy on average across the three sets of fresco pieces. This comes close to the accuracy archeologists achieve in fresco preservation work, not to mention the immense reduction in time. The paper also discussed how training the algorithm on one set of frescos can yield good prediction results on a totally different set. This suggests that, once trained with enough information, the algorithm could be used for other puzzle matching problems, not just frescos.
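The cross-training idea (learn on one set, test on another) can be sketched as follows. The learner here is a deliberately tiny stand-in, a single threshold on one hypothetical "mean gap" feature, and both data sets are invented. None of the numbers below come from the paper.

```python
from typing import List


def accuracy(threshold: float, gaps: List[float], labels: List[bool]) -> float:
    """Share of pairs where 'gap <= threshold' agrees with the true label."""
    correct = sum((gap <= threshold) == label for gap, label in zip(gaps, labels))
    return correct / len(gaps)


def fit_threshold(gaps: List[float], labels: List[bool]) -> float:
    """Pick the observed gap value that classifies the training set best."""
    candidates = sorted(set(gaps))
    return max(candidates, key=lambda t: accuracy(t, gaps, labels))


# Hypothetical fresco set A (used for training).
set_a_gaps = [0.1, 0.2, 0.3, 0.9, 1.1, 1.4]
set_a_labels = [True, True, True, False, False, False]

# Hypothetical fresco set B (never seen during training).
set_b_gaps = [0.15, 0.25, 0.35, 0.8, 1.0]
set_b_labels = [True, True, True, False, False]

threshold = fit_threshold(set_a_gaps, set_a_labels)
accuracy_on_b = accuracy(threshold, set_b_gaps, set_b_labels)
print(threshold, accuracy_on_b)
```

The accuracy on the unseen set is the number that tells you whether the learner picked up something general rather than memorizing the first set.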

However, as magnificent as the algorithm might seem, the downsides are also apparent. Of the erroneous matches the algorithm made, the majority stemmed from misjudging historical context. This is expected, as the algorithm was only presented with objective features of the fresco; information relating to the artistic style of the time was not included in the model. Only the archeologist can provide this insight and make judgments based on it.

It seems that machines can only help humans complete grunt work, while humans still have to step in to solve the critical piece of the puzzle. Like most machines these days, they are seemingly sophisticated yet fundamentally empty. In the coming years, as machines progress, they will not completely take over from us humans but will work with us to make us more productive.

Thinking back to that Egyptian fresco at the Met: just as people more than 3,000 years ago were using rods to fish in the Nile, we will be using machine learning to advance our productivity.

Thomas Funkhouser, Hijung Shin, Corey Toler-Franklin, Antonio García Castañeda, Benedict Brown, David Dobkin, Szymon Rusinkiewicz, and Tim Weyrich. 2011. Learning how to match fresco fragments. J. Comput. Cult. Herit. 4, 2, Article 7 (November 2011), 13 pages. DOI: https://doi.org/10.1145/2037820.2037824
