TECH | Jun 14, 2018

AI: the battle between data and algorithms

Is there a battle going on between data and algorithms? Who is winning?

There is no doubt that Artificial Intelligence is one of the most dynamic technology sectors. According to Gartner’s study “The Business Value of Artificial Intelligence, Worldwide, 2017-2025”, the business value generated worldwide by solutions built on Artificial Intelligence platforms is growing rapidly, with an increase of 70% in 2018 over 2017. This figure is expected to more than triple by 2022, when it should reach 3.9 trillion dollars. The explosion of Artificial Intelligence is profoundly transforming modern society, propelled by ever-increasing computational capacity and the continuously growing amounts of data and information available today. It will affect all aspects of our lives and will be one of the most disruptive technologies of the years to come.

As director of a company that helps its clients through their Digital Transformation, I see every day that adopting Artificial Intelligence solutions is both part of our daily work and a regular topic of debate with colleagues and customers. Today, reading the specialised press and Internet discussion forums, one gets the impression of a battle between data and algorithms over which contributes more to the best AI-based applications. Some articles read almost like statements from opposing factions, arguing for the primacy of data over algorithms or vice versa.

From a technological point of view, the characteristic feature of Artificial Intelligence is the learning method or model through which the system becomes skilled at a task or action (hence the distinction between Machine Learning, Deep Learning, etc.). Both data and algorithms are therefore necessary to develop an AI-based application.

Is there really a battle between data and algorithms?

It has been a long time since I left the Faculty of Statistical Sciences in Bologna, but given our company’s investments in Machine Learning and Artificial Intelligence in general, I am often involved in intriguing project discussions with the colleagues who head our Digital department and who are, fortunately, much more experienced than I am.

In my opinion, if it is true – as claimed by Geraldo Salandra – that “Artificial Intelligence is the rocket, but the data is the fuel”, it is also true and irrefutable that AI is a combination of data and algorithms.

There is no doubt that without fuel (i.e. data) you cannot go anywhere. Keep in mind, however, that choosing the right algorithm can partly compensate for poor data quality, and that choosing the wrong algorithm can waste the potential of excellent data.
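The first half of that claim can be illustrated with a toy example. The sketch below is only an illustration under stated assumptions: the sensor readings are invented, and the two "algorithms" are simply the mean and the median from Python's standard library. It shows how a more robust estimator can tolerate a few corrupted records that badly distort a naive one.

```python
import statistics

# Hypothetical sensor readings (invented for illustration):
# the true value is about 20.0.
clean_readings = [19.8, 20.1, 20.0, 19.9, 20.2, 20.0, 19.9]

# Poor data quality: two corrupted records slip into the data set.
noisy_readings = clean_readings + [250.0, 310.0]

# "Algorithm" A: the arithmetic mean, which is sensitive to outliers.
mean_estimate = statistics.mean(noisy_readings)

# "Algorithm" B: the median, which is robust to a few outliers.
median_estimate = statistics.median(noisy_readings)

print(f"mean estimate on noisy data:   {mean_estimate:.2f}")
print(f"median estimate on noisy data: {median_estimate:.2f}")
```

On this small data set the median stays close to the true value while the mean is pulled far away from it. Real projects are rarely this simple, but the principle is the same: the choice of method determines how much damage poor data can do.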

Must we assume that data is more important than algorithms?

I don’t think that this is always the case. I understand the fundamental value of the data and analytics infrastructure in feeding Artificial Intelligence algorithms.

In our daily experience, “data collection and preparation” are in fact the activities that take the most time when developing Artificial Intelligence-based applications, more than the selection and development of a model. This is why we have invested so much to be able to provide our customers with the best data infrastructure for feeding and training algorithms.

Algorithms, however, also require a great deal of work: nobody can say with certainty which algorithm will produce the best results without first having tried several alternatives. Developing and comparing algorithms and models to choose the most suitable ones is crucial to the success of an AI solution, and it raises questions such as:

  • Which algorithm should I use?
  • How many hours of algorithm training do I have at my disposal?
  • What is the type, quality and size of the data available to me?
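To make the idea of "trying different versions" concrete, here is a minimal sketch of comparing candidate algorithms on held-out data. Everything in it is an assumption made for illustration: the synthetic data set, the three candidate models (a mean baseline, a least-squares line and a nearest-neighbour lookup) and the function names are hypothetical, and a real project would normally rely on an established Machine Learning library rather than hand-written models.

```python
import math

def make_data(n=40):
    """Synthetic data (assumed): y = 2x + 1 plus a small deterministic wobble."""
    xs = [0.25 * i for i in range(n)]
    ys = [2.0 * x + 1.0 + 0.1 * math.sin(7 * i) for i, x in enumerate(xs)]
    return xs, ys

def fit_mean(xs, ys):
    """Baseline: always predict the average of the training targets."""
    average = sum(ys) / len(ys)
    return lambda x: average

def fit_linear(xs, ys):
    """Simple least-squares straight line."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
             / sum((x - mean_x) ** 2 for x in xs))
    intercept = mean_y - slope * mean_x
    return lambda x: slope * x + intercept

def fit_nearest(xs, ys):
    """1-nearest-neighbour: return the target of the closest training point."""
    pairs = list(zip(xs, ys))
    return lambda x: min(pairs, key=lambda p: abs(p[0] - x))[1]

def mse(model, xs, ys):
    """Mean squared error of a fitted model on the given points."""
    return sum((model(x) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)

xs, ys = make_data()

# Hold out every fourth point for validation; train on the rest.
train = [(x, y) for i, (x, y) in enumerate(zip(xs, ys)) if i % 4 != 0]
valid = [(x, y) for i, (x, y) in enumerate(zip(xs, ys)) if i % 4 == 0]
train_x, train_y = zip(*train)
valid_x, valid_y = zip(*valid)

candidates = {
    "mean baseline": fit_mean,
    "linear regression": fit_linear,
    "nearest neighbour": fit_nearest,
}

# Fit each candidate on the training data and score it on the held-out data.
scores = {name: mse(fit(train_x, train_y), valid_x, valid_y)
          for name, fit in candidates.items()}
best = min(scores, key=scores.get)

for name, score in sorted(scores.items(), key=lambda item: item[1]):
    print(f"{name:18s} validation MSE = {score:.4f}")
print("best candidate:", best)
```

The ranking only holds for this particular data set. With a different type, quality or size of data, or a different training budget, another candidate could come out on top, which is exactly why the comparison has to be run rather than assumed.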

The quality of the data set directly influences the success of the predictive model. By working on the data, it is possible to transform a poor data set into one that is worth using in an Artificial Intelligence application, but it is also essential to choose an algorithm and model that fit the available data and are consistent with the business goals.

So here we are: business. The word that is often missing from the articles I have read debating the primacy of data over algorithms, or vice versa, is precisely “business”. The availability of a large amount of good-quality data and relevant algorithms allows for better information and applications, but obtaining this kind of data and these algorithms is not just a technical issue: in-depth business skills are needed to build AI applications that generate meaningful value for companies.

Data and algorithms are not adversaries, but allies in a business-oriented strategy.

Filippo Di Cesare