“Data is a precious thing and will last longer than the systems themselves.”
Tim Berners-Lee
It is well established that fake data on the internet and in the information media can manipulate our opinions and affect our decisions. We also know that the heuristic shortcuts practiced by the media are the same ones we all apply in our daily lives. The question we should start asking ourselves is whether similar cognitive distortions can strike the entities that, in the near future, will increasingly be making decisions in our place: AI algorithms.
Data mistreatment is a tool for manipulating any form of intelligence, biological or artificial. Just as it is possible to steer the decision of a human being, supplying him with incorrect, biased or falsified information and data, it is possible to make an AI or machine-learning algorithm fail by feeding it an erroneous dataset, or one purposely modified in order to make the algorithm produce the output desired.
This is what happens in so-called adversarial attacks, which consist of altering the input by adding fake data to it – technically referred to as “adversarial noise” – small perturbations that are invisible to the human eye but can steer an algorithm completely off course.
How to cause an algorithm to make incorrect decisions
The majority of AI and machine-learning algorithms operate more or less like sophisticated classification systems. This concept is also the basis of the Pattern Recognition Theory of Mind (PRTM) proposed by Ray Kurzweil in How to Create a Mind, according to which the functional mechanism of the human cerebral neocortex is itself based on the recognition and classification of patterns at various hierarchical levels of abstraction.
An OCR software program decides whether a graphical element is the letter “a” rather than an “e” by classifying it against a specific set of criteria. The same thing happens at different levels of abstraction: metal detectors report the presence of a weapon inside luggage, facial recognition systems unlock smartphones, and self-driving automobiles decide how to maneuver by interpreting street signs.
Naturally, simply causing an algorithm to fail in a random way (non-targeted attack) is easier than tampering with its behavior in order to force the classifier to produce a specific result (targeted attack).
The classic field of application is image recognition. To give some idea of how an adversarial targeted attack works, consider the demonstration given in the paper Explaining and harnessing adversarial examples, where an image of a panda, recognized as such by an image classifier with 57.7% confidence, is overlaid with an “adversarial noise” image – which only appears to be random but is in reality constructed through a precise mathematical procedure – yielding an image that still looks like a panda to our eyes but which the system classifies as a “gibbon” with 99.3% confidence!
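The construction behind that demonstration is the fast gradient sign method (FGSM) from the same paper: each input feature is nudged by a tiny fixed amount in the direction that most increases the classifier's loss. A minimal numpy sketch of the idea, using a toy logistic classifier instead of a deep image model – all sizes and numbers here are purely illustrative:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def fgsm_attack(x, w, b, y_true, eps):
    """One-step fast gradient sign attack: nudge every feature of x
    by +/- eps in the direction that increases the classifier's loss."""
    p = sigmoid(w @ x + b)            # current predicted probability
    grad_x = (p - y_true) * w         # gradient of cross-entropy loss wrt x
    return x + eps * np.sign(grad_x)  # small per-feature perturbation

rng = np.random.default_rng(0)
w = rng.normal(size=400)              # weights of a toy logistic classifier
b = 0.0
x = rng.normal(size=400)              # a "clean" input vector

clean_score = sigmoid(w @ x + b)
label = 1.0 if clean_score > 0.5 else 0.0
x_adv = fgsm_attack(x, w, b, label, eps=0.25)
adv_score = sigmoid(w @ x_adv + b)    # confidence in the original label collapses
```

No single feature moves by more than eps, so the adversarial copy stays numerically close to the original – the analogue of an image that still looks identical to us – yet the classifier's confidence in its original verdict collapses.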
Some recent studies show how images altered with adversarial noise can be printed on standard paper, photographed with a medium-resolution smartphone camera, and still fool image-based recognition systems.
For example, it is theoretically possible to print a check with a written amount of $100 in such a way that an ATM disburses $10,000 in cash. Or to replace a traffic sign with one altered by adversarial noise, imperceptible to the human eye, which the camera of a self-driving system interprets as a 300 km/h speed limit.
Preventing this type of attack is complex and requires a significant amount of research, which we must nevertheless undertake if our goal is to have a distributed AI that is safe from fake data.
Among the defense strategies currently adopted are training a second classifier to detect and reject input containing adversarial noise, or implementing an adversarial training routine in the primary classification system. In either case this weighs down the system, with a resulting negative impact on performance – something that is not always acceptable, for example in a facial recognition system that must examine a continuous flow of people passing through a gate (passport control, banks, etc.).
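A minimal sketch of the adversarial-training idea – here a toy numpy logistic regression in which every training batch is augmented with its own FGSM-perturbed copy; the data, dimensions, and parameters are illustrative, not taken from any production system:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def fgsm_batch(X, w, b, y, eps):
    """Perturb each row of X in the direction that increases its loss."""
    p = sigmoid(X @ w + b)
    return X + eps * np.sign((p - y)[:, None] * w)

rng = np.random.default_rng(1)
n, d = 200, 20
X = rng.normal(size=(n, d))
true_w = rng.normal(size=d)
y = (X @ true_w > 0).astype(float)          # linearly separable toy labels

w, b, lr, eps = np.zeros(d), 0.0, 0.1, 0.1
for _ in range(300):
    X_adv = fgsm_batch(X, w, b, y, eps)     # adversarial copy of the batch
    X_all = np.vstack([X, X_adv])           # train on clean + adversarial data
    y_all = np.concatenate([y, y])
    p = sigmoid(X_all @ w + b)
    w -= lr * (p - y_all) @ X_all / len(y_all)  # gradient step on the mix
    b -= lr * np.mean(p - y_all)

clean_acc = np.mean((sigmoid(X @ w + b) > 0.5) == (y > 0.5))
```

Note the cost mentioned above: the training set is doubled, so every step does roughly twice the work – a small price here, but a real burden for a system that must classify a continuous stream of inputs in real time.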
How can we protect ourselves?
Data mistreatment is an issue we must pay close attention to, now and in the near future. Fake data of a numerical nature is a powerful tool for manipulating the decision-making abilities of an intelligent entity, no matter whether it is biological or synthetic. In particular, any classification system based on AI or machine-learning algorithms can be fooled, or led to produce a specific result, by preparing fake input with the addition of adversarial noise.
The behavior of advanced algorithms is extremely non-linear. In practical terms, they perform a series of transformations on the data, starting from a numerical input vector; more complex behavior corresponds to a larger number of transformations, which multiplies the effect of any noise introduced at the input and thus increases the system’s sensitivity to small variations.
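This amplification can be seen even in a toy stack of random linear + ReLU transformations – an illustrative numpy sketch, not a trained model: the gap between the outputs for an input and a minutely perturbed copy grows as more transformations are applied.

```python
import numpy as np

rng = np.random.default_rng(42)
d = 50
# Random ReLU layers with a gain slightly above the norm-preserving
# value, so each transformation amplifies incoming differences a little.
layers = [rng.normal(scale=np.sqrt(3.4 / d), size=(d, d)) for _ in range(20)]

def forward(x, depth):
    """Run x through the first `depth` linear + ReLU transformations."""
    for W in layers[:depth]:
        x = np.maximum(W @ x, 0.0)
    return x

x = rng.normal(size=d)
delta = 1e-3 * rng.normal(size=d)   # a tiny perturbation of the input

shallow_gap = np.linalg.norm(forward(x + delta, 2) - forward(x, 2))
deep_gap = np.linalg.norm(forward(x + delta, 20) - forward(x, 20))
ratio = deep_gap / shallow_gap      # grows with the number of layers
```

The same perturbation that barely registers after two transformations is many times larger after twenty – exactly the sensitivity an adversarial attacker exploits.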
Preventing this sensitivity from being exploited for nefarious purposes, and channeled to modify an AI algorithm’s behavior, will be one of the key security problems of the near future.