As both a mathematician and a Data Scientist, I can appreciate the power of Deep Learning, and Machine Learning more broadly, and the breadth and scale of what it affords us now that we have access to cheap computational power and huge amounts of data. However, I can’t help but feel a sense of unease at how little we really understand about how the most advanced algorithms do what they do!

A problem being solved by a huge artificial neural network is essentially just a ‘black box’, a complex system giving little insight as to how the ultimate decisions are made.

When I was a boy (a young applied mathematician at the turn of the century working as a Quant), some of the more advanced computational algorithms I used included Monte Carlo simulation, a relatively brute-force approach that estimates expected future values by repeated random sampling. I at least had some intuitive grasp of what it was doing ‘under the hood’. Add to that the fact that we implemented most of the algorithms we used from scratch, drawing on such classic sources as Numerical Recipes in C, so we could control various aspects of our models, such as speed and accuracy.
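
To make the idea concrete, here is a minimal Python sketch of Monte Carlo estimation. It is an illustration only, not the code described above: the choice of model (geometric Brownian motion), the function name `mc_expected_value`, and every parameter value are assumptions made for the example. The model is chosen because its expected value has a simple closed form, so the sampled estimate can be checked against an exact answer.

```python
import math
import random


def mc_expected_value(s0, mu, sigma, t, n_paths, seed=0):
    """Estimate E[S_t] under geometric Brownian motion by random sampling.

    All names and parameters here are hypothetical, chosen for illustration.
    """
    rng = random.Random(seed)  # fixed seed so the run is reproducible
    total = 0.0
    for _ in range(n_paths):
        z = rng.gauss(0.0, 1.0)  # one standard normal draw per sample
        # Terminal value of the process for this random draw
        drift = (mu - 0.5 * sigma ** 2) * t
        shock = sigma * math.sqrt(t) * z
        total += s0 * math.exp(drift + shock)
    # The Monte Carlo estimate is simply the average over all samples
    return total / n_paths


# Assumed example inputs: start value 100, 5% drift, 20% volatility, 1 year
estimate = mc_expected_value(s0=100.0, mu=0.05, sigma=0.2, t=1.0, n_paths=100_000)

# Closed-form expectation for this model: E[S_t] = s0 * exp(mu * t)
exact = 100.0 * math.exp(0.05 * 1.0)

print(f"Monte Carlo estimate: {estimate:.3f}")
print(f"Analytical value:     {exact:.3f}")
```

Comparing `estimate` with `exact` shows why the method is easy to reason about: the sampling error shrinks roughly in proportion to 1/sqrt(`n_paths`), so accuracy can be traded directly against run time.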

I was very lucky (as a mathematician) that some of the other models I developed allowed for analytical or semi-analytical solutions, which gave me a deeper and clearer understanding of how the model worked, how it performed at the boundaries, and how sensitive it was to certain conditions.

Obviously, only a limited set of problems can be solved in such ‘pure’ mathematical ways, but it still raises the question: how much do we really understand about how today’s most advanced algorithms actually work?

There is research being undertaken to try to answer such questions (two examples are Google’s Deep Dream and the idea of the ‘information bottleneck’ proposed by Naftali Tishby), which can enhance our understanding not only of how the models work, learn and make decisions, but of their limitations too.

This raises a broader point: Machine Learning is a fundamentally different way to model the real world. It is quite unlike what we as mathematicians, statisticians, physicists, etc. are traditionally used to, and have done for so long. What impact will this have on society when decision making is automated? How comfortable will we be, even as practitioners in this field, trusting a ‘black box’ to decide medical, military, air traffic control and investment actions?

 
