Why machines will never be as smart as a four-year-old (and why you shouldn’t worry that machines are taking over the world)

If you believe the news headlines, Artificial Intelligence (AI) will take your job (particularly if you work in financial services), lead to World War Three and finally destroy humanity altogether. AI, defined here as machines performing human-like tasks, has made remarkable progress in the last few years, leading to self-driving cars and to a machine beating the best human players at the US game show Jeopardy.

But despite these successes, AI is no match for your average four-year-old. Four-year-olds cannot play Jeopardy or drive cars. Yet no AI system can achieve what pre-schoolers achieve, in parallel and largely unsupervised: learning a language, recognising faces, reasoning about causes and morals, and understanding and predicting how and why someone acts in a certain way (what psychologists call Theory of Mind).

Alan Turing famously pointed out that the real secret to our intelligence is our ability to learn, and that the key to achieving artificial intelligence is to design a machine like a child, not an adult. Machines excel at, and often exceed human performance in, single tasks involving routine work or finding patterns in data, for example making sales predictions, assessing risk or electronic trading. This is what experts call ‘narrow’ intelligence.

While machines’ narrow intelligence can outstrip human performance in equivalent tasks and present a real danger – like drones used in warfare – we are nowhere near building a machine that thinks and learns like a child, who has ‘general’ intelligence. Although it’s predicted that narrow AI will replace (or more likely, displace) some parts of some human jobs, we are a long way off mass unemployment and machines taking over the world, if that ever happens at all.

Less topical – but perhaps more worrying – is the challenge of teaching machines not to amplify human biases. Imagine an algorithm which includes people’s shopping habits as part of its decision model.  It may ‘learn’ that shopping at a particular supermarket chain is correlated with a higher risk of default and subsequently reject people who shop at this chain. If this particular chain is typically situated in more deprived areas, the algorithm could be harmful to more vulnerable applicants who are otherwise creditworthy.  Another big concern for policy makers is whether you can apply the same rules and regulations designed to mitigate the harm caused by human biases to machines.
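The supermarket scenario above can be sketched in a few lines of Python. Everything here is made up for illustration: the two chains ("A" and "B"), the counts of applicants and the 10% cut-off are assumptions, not real lending data or a real scoring model. The point is only to show how a rule learned from a correlation ends up rejecting people who would have repaid.

```python
from collections import defaultdict

# Hypothetical lending history: (supermarket chain, did the borrower default?).
# Chain "A" is assumed to sit mostly in more deprived areas.
history = (
    [("A", True)] * 20 + [("A", False)] * 80 +
    [("B", True)] * 5 + [("B", False)] * 95
)

def default_rate_by_chain(records):
    """Return the observed default rate for each chain."""
    totals = defaultdict(int)
    defaults = defaultdict(int)
    for chain, defaulted in records:
        totals[chain] += 1
        defaults[chain] += int(defaulted)
    return {chain: defaults[chain] / totals[chain] for chain in totals}

# The 'learned' rule: reject anyone who shops at a chain whose historical
# default rate is above an assumed 10% cut-off.
rates = default_rate_by_chain(history)
rejected_chains = {chain for chain, rate in rates.items() if rate > 0.10}

# Count the applicants who would have repaid but are rejected anyway.
wrongly_rejected = sum(
    1 for chain, defaulted in history
    if chain in rejected_chains and not defaulted
)

print("Default rate by chain:", rates)
print("Chains rejected outright:", sorted(rejected_chains))
print("Creditworthy applicants rejected:", wrongly_rejected)
```

In this toy data the rule turns away every chain A shopper, although four in five of them never defaulted. The algorithm has not been told anything about where people live, yet the shopping habit acts as a stand-in for it.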

It’s the simple stuff that artificial intelligence really struggles with

Making sense of what we see, rather than just seeing pixels and patterns, is a seemingly simple task for humans, yet it is surprisingly complex for computers. Imagine the following scenario: a toddler sees an adult walking over to a cupboard, holding a pile of books in both hands and touching the door with his foot. Within seconds, unprompted, and without encouragement or praise, the child toddles over and opens the cupboard door for the adult.

Understanding this scene requires building a causal model: distinguishing between the objects (animate or inanimate); comprehending the physical forces at work between objects and people (the books will drop if he uses his hands); inferring the adult’s beliefs (he thinks there’s space in the cupboard) and desires (he wants to open the cupboard); understanding social and moral norms (we help members of a group we identify with); and anticipating the possibilities for intervening in the scene.

In contrast, consider an artificial neural network – a method modelled loosely on the human brain that allows machines to learn from vast amounts of data – that was trained to produce natural language descriptions of real world scenes. The network generated the caption “a group of people standing on top of a beach” for a scene in which a family is seen running away in terror as a flood wreaks havoc on their home. Although the network correctly detects and classifies the objects, it fails to understand the ‘glue’ between the objects – the physics, the mental states or the relationships between the objects.

Children learn a lot more from a lot less than machines

So how do machines learn? Much of AI’s recent success is due to deep learning, a machine learning approach characterised by large neural networks with multiple layers of representation. A basic neural network is a network of processing units that collectively perform complex computations, typically arranged in multiple layers.

An input layer provides information from the external world to the network – like an image of a dog. Hidden layers develop representations from the data. An output layer then creates an output – for example, a label saying ‘dog’. Learning involves a gradual adaptation of network parameters – the weights on links between processing units – allowing the network to match its outputs to the thousands of training examples, typically labelled by humans. The machine learning model is then ‘tested’ by giving it a new sample of dogs, without the correct answer. The result is usually reported as an error rate.
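To make the recipe above concrete, here is a minimal sketch of a neural network in plain Python. It is a toy under stated assumptions, not how a production image classifier is built: instead of dog photos it uses made-up points on a square, labelled 1 if they lie above the line x + y = 1, and the names (`TinyNetwork`, `make_examples`), the layer sizes and the learning rate are all illustrative choices. What it does show is the three steps described: an input layer feeding a hidden layer feeding an output, weights adjusted gradually to match labelled examples, and a test on unseen examples that yields an error rate.

```python
import math
import random

def sigmoid(z):
    # Clamp the input so math.exp cannot overflow.
    z = max(-60.0, min(60.0, z))
    return 1.0 / (1.0 + math.exp(-z))

def make_examples(n, rng):
    """Hypothetical labelled data: label 1.0 if the point lies above x + y = 1."""
    examples = []
    for _ in range(n):
        x, y = rng.random(), rng.random()
        examples.append(((x, y), 1.0 if x + y > 1.0 else 0.0))
    return examples

class TinyNetwork:
    """Input layer -> one hidden layer -> a single output unit."""

    def __init__(self, n_inputs, n_hidden, rng):
        self.w_hidden = [[rng.uniform(-1, 1) for _ in range(n_inputs)]
                         for _ in range(n_hidden)]
        self.b_hidden = [0.0] * n_hidden
        self.w_out = [rng.uniform(-1, 1) for _ in range(n_hidden)]
        self.b_out = 0.0

    def forward(self, inputs):
        hidden = [sigmoid(sum(w * x for w, x in zip(ws, inputs)) + b)
                  for ws, b in zip(self.w_hidden, self.b_hidden)]
        output = sigmoid(sum(w * h for w, h in zip(self.w_out, hidden)) + self.b_out)
        return hidden, output

    def train_step(self, inputs, label, lr):
        """Nudge every weight slightly to reduce the error on one example."""
        hidden, output = self.forward(inputs)
        d_out = output - label  # gradient of cross-entropy loss at the output
        for j, h in enumerate(hidden):
            # Uses the output weight from before this step's update.
            d_hidden = d_out * self.w_out[j] * h * (1.0 - h)
            self.w_out[j] -= lr * d_out * h
            for i, x in enumerate(inputs):
                self.w_hidden[j][i] -= lr * d_hidden * x
            self.b_hidden[j] -= lr * d_hidden
        self.b_out -= lr * d_out

def error_rate(net, examples):
    wrong = sum(1 for inputs, label in examples
                if (net.forward(inputs)[1] > 0.5) != (label == 1.0))
    return wrong / len(examples)

rng = random.Random(0)
train = make_examples(200, rng)   # labelled training examples
test = make_examples(100, rng)    # new examples the network has never seen

net = TinyNetwork(n_inputs=2, n_hidden=3, rng=rng)
for epoch in range(100):
    rng.shuffle(train)
    for inputs, label in train:
        net.train_step(inputs, label, lr=0.5)

test_error = error_rate(net, test)
print(f"Error rate on unseen examples: {test_error:.2%}")
```

Note how much repetition even this tiny task takes: the network sees each of its 200 labelled examples 100 times, and at no point does it hold anything like the idea of a line.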


Although a neural network can accurately recognise different images, it does not algorithmically represent what is in the image, e.g. a ‘poodle’, as a concept. Instead, it uses the photograph’s pixels to recognise patterns and predict ‘poodle’, without needing to understand the underlying concept. Without this conceptual understanding, machines struggle to extrapolate and generalise to tasks beyond what they have been trained for. Put another way, they are very good one-trick ponies.

In stark contrast to the neural network, children excel at ‘one-shot’ learning and can infer beyond the immediate data they see to create a far richer representation. Brenden Lake, Professor of Psychology and Data Science at NYU, and his colleagues write: “children may only need to see a few examples of the concepts hairbrush, pineapple or light saber before they largely ‘get it’, grasping the boundary of the infinite set that defines each concept from the infinite set of all possible objects.  Children are far more practiced than adults at learning new concepts – learning roughly nine or ten new words each day after beginning to speak through the end of high school”. After hearing about a novel creature (the boojum) that is hungry, children can infer other features the boojum is likely to have (e.g. a mouth) and which categories it belongs to (an animal).

Why can’t machines learn like children? Because of the way children learn so much from so little

The influential ‘Theory Theory’ argues children are just like scientists, developing intuitive theories of core concepts (e.g. physics, biology, psychology) and the abstract causal laws that govern how these concepts interrelate. For example, by 6 months infants have an intuitive theory of physics: they understand that objects still exist when they can’t see them, and they grasp continuity and cohesion (that objects follow smooth paths).

How do they develop these intuitive theories? Like grown-up scientists: developing hypotheses, testing them through observation, intervention and experimentation and revising them in the face of evidence.

The Theory Theory proposes children learn causal knowledge through various mechanisms, including deliberately experimenting and intervening in the world – yep, basically playing – and observing what other people do.

Active learning (aka play) – where children decide where and when to intervene – is very different from machine learning, which typically relies on being fed pre-existing training data sets. Different and better evidence can be acquired through ‘playing’ in the environment to establish causation, not just correlation.

This is reassuring for anyone fretting about the prospect of machines taking every job in finance. Real world learning is not just about acquiring new facts or memories, but about understanding the causal relations between events in the world. For example, my daughter sees that her brother’s behaviour correlates with my mood. But she can establish which behaviours cause me to get angry by behaving in different ways in different situations and seeing what happens to my mood. The active way children learn means they avoid redundant data and focus on the parts of the environment they do not yet understand well. This confers a significant advantage, because each individual experience is more instructive. Put simply, children learn more from less training.
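A toy comparison shows why choosing your own experiments pays off. The sketch below is an illustration under simple assumptions, not a model of how children actually learn: there is a hidden number between 0 and 1 (the `threshold`), and each learner gets ten yes/no answers to the question "is this value at or above the threshold?". The passive learner is handed ten random values, much like a pre-existing training set. The active learner picks each value itself, always probing the middle of the range it is still unsure about. The function names and the number of trials are made up for the example.

```python
import random

def label(x, threshold):
    """The 'world' answers one question: is x at or above the hidden threshold?"""
    return x >= threshold

def active_learner(threshold, n_queries):
    """Chooses each query itself: always the midpoint of the uncertain range."""
    lo, hi = 0.0, 1.0
    for _ in range(n_queries):
        mid = (lo + hi) / 2
        if label(mid, threshold):
            hi = mid
        else:
            lo = mid
    return hi - lo  # width of the range that still might contain the threshold

def passive_learner(threshold, n_examples, rng):
    """Is fed randomly chosen examples, many of which tell it nothing new."""
    lo, hi = 0.0, 1.0
    for _ in range(n_examples):
        x = rng.random()
        if label(x, threshold):
            hi = min(hi, x)
        else:
            lo = max(lo, x)
    return hi - lo

rng = random.Random(42)
n_trials, n_examples = 1000, 10
active_total = passive_total = 0.0
for _ in range(n_trials):
    threshold = rng.random()
    active_total += active_learner(threshold, n_examples)
    passive_total += passive_learner(threshold, n_examples, rng)

avg_active = active_total / n_trials
avg_passive = passive_total / n_trials
print(f"Average remaining uncertainty, active learner:  {avg_active:.5f}")
print(f"Average remaining uncertainty, passive learner: {avg_passive:.5f}")
```

Each question the active learner asks halves what it does not know, so ten questions leave it with a range of exactly 1/1024. The passive learner wastes most of its examples on regions it has already figured out, and is left far less certain after the same amount of data.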

Don’t worry about machines taking over finance – worry about their mistakes

Why does all this matter? Essentially, human learning is currently unrivalled by any machine. Or, put another way, we won’t be seeing a Terminator ‘Rise of the Machines’ scenario any time soon. In the words of Facebook’s Head of Research, Yann LeCun, we are very far from having machines that can learn the most basic things about the world in the way children do.

While machines are highly successful in their ability to solve specific tasks in narrow areas – for example financial trading – there is currently no AI system that can flexibly combine multiple, discrete intelligence capabilities in a way that rivals a human’s grasp of “general behavioural laws”.

Even with more data and deeper models, machines are unlikely to match human learning, which requires the conceptual understanding essential for many jobs in finance: thinking abstractly and planning, problem solving, relationship building, innovating and coming up with ideas that have no precedent. Reassuring, right?

The concept of algorithms having ‘baked in’ bias is particularly worrying where algorithms are making decisions that have the potential to harm consumers or competition, for example whether or not to approve someone for a loan or insurance product. Imagine an algorithm is used to set prices in an insurance market, and the factors predicting cost are correlated with some ‘protected’ characteristics, e.g. ethnicity and gender. It would be illegal and unfair if prices were set on the basis of such characteristics.

Or imagine price-bots (algorithms that set prices) colluding against consumers. Lawyers paint a hypothetical picture where companies (unknowingly) develop similar algorithms to promote an optimal pricing strategy which predicts another company’s reaction to changing prices. Imagine the algorithms decide the best way to maximise profit for their firm is to set the same optimum price for a particular product. Would this amount to tacit collusion to fix prices?

In the eyes of the law, one has to show discrimination or harm was intentional. Do algorithms have intent?

What does individual or organisational accountability look like when proprietary algorithms are often black boxes or are trained on biased data sets? My goal as a parent is to teach my children to be moral and law abiding. Can we do the same for algorithms? This is a make or break question for all of us – firms, policy-makers and customers.

