Is AI trained with computers and data? Yes — but first, you need a human.
Look closely at the image on the screen. Scrutinise each pixel. Trace out the houses, the buildings, the lamp-posts. Select and mark the edge of the road. Don’t forget to note the moving traffic.
Do it carefully, do it well. Every pixel has its place.
Sounds like the mind of a self-driving car? Actually, it’s the job description of those who help train one.
When you think of Artificial Intelligence and Machine Learning, you think of computer programmers pounding out code at their terminals. You’d imagine downloading vast libraries of pictures, clicking “Analyse”, and letting the computer figure out the rest. And you’d probably picture it happening in a hi-tech office, in a place like Silicon Valley, USA.
Actually, the first and most crucial steps happen in places more like Nairobi, Kenya.
How do machines learn? It’s all very simple and at the same time rather complicated — but the idea, in essence, is as follows.
Let’s say you want to detect faces in a photograph. That’s a very common feature these days, and there used to be a hard way of doing it. People would painstakingly program exactly how a face should look, telling their algorithms to search for certain lines and curves organised in a certain pattern. It was hard enough to do for letters of the alphabet, let alone faces.
But now, machine learning lets the computer figure it out for itself. You start with a basic “neural network”, and train it to do what you want. So you’d feed it a million face-photos, saying “this has a face”. And you’d feed it two million non-face photos, saying “this doesn’t have a face”.
The computer will automatically analyse all the photos, and find common patterns to work with. “Aha — these pixels lined up like this increase the chance of it being a face”, it says, “while this one down here means it probably isn’t.” Except it does it all hidden away, so you have no way of knowing exactly what pattern your machine has found. Does it search for eyes and a mouth? Or a circle with a nose? Or something unimaginably different?
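To make that concrete, here’s a minimal sketch of the idea in Python, using scikit-learn and random pixel arrays as stand-ins for real photos. The image sizes, folder contents and network shape here are invented for illustration; real systems use deep networks and millions of labelled images.

```python
# A toy sketch: label some "images" as face / no-face, and let a small
# neural network find its own patterns. Random pixels stand in for photos.
import numpy as np
from sklearn.neural_network import MLPClassifier

rng = np.random.default_rng(0)

# Pretend these are tiny 8x8 greyscale images, flattened to 64 pixel values.
face_photos = rng.random((1000, 64))    # stand-in for the "face" folder
other_photos = rng.random((2000, 64))   # stand-in for the "no face" folder

X = np.vstack([face_photos, other_photos])
y = np.array([1] * len(face_photos) + [0] * len(other_photos))  # 1 = face, 0 = no face

# The network works out its own formula for telling the two apart.
model = MLPClassifier(hidden_layer_sizes=(32,), max_iter=300)
model.fit(X, y)

# Ask it about a new picture: it gives an answer, but never an explanation.
new_photo = rng.random((1, 64))
print("Chance of a face:", model.predict_proba(new_photo)[0][1])
```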
Before you can train a machine, you need data to train it on. You need a folder with a million face-photos, and another that’s guaranteed not to have any.
That means, you need somebody to do the sorting.
Sometimes, you can take the easy way. Facebook users may have noticed how that evolved. First, Facebook let you click on a person’s face to “tag” that person: there’s the “face” and “no face” dataset being created for you. Using that, they learnt to automatically detect faces, and prompt you to tag them with names. And now the algorithm’s so good, it can tag people itself without any prompting.
Of course, it’s not limited to faces. Any pictures out there, appropriately captioned, can potentially help machines to recognise objects. Ever wondered why Google lets you upload unlimited photos? Now you know.
There are times when mass community tagging is not enough. You need specialised workers to do it. And that’s where Nairobi, Kenya, comes in.
Nairobi hosts one of the offices of Samasource, a company that creates datasets for machines to train on. Among its customers are Google, Facebook, and Microsoft. They hire people from nearby, including from Kibera, the largest slum on the continent. People are paid $9 a day to do the one task machines cannot: create training data.
It’s mostly about annotation. “This is a lamp-post”, you have to tell the computer. “This is a pothole. This is a shadow. This is a crow placing a nut on the road so that vehicles crack it open when they run over it.” Well, maybe not the last one — but you’d be tempted to add it because the work is just so dull, repetitive, and boring.
Incidentally, if you want to try such work yourself, you can. Just sign up for Amazon Mechanical Turk, which offers boring, repetitive tasks that humans still do better than machines. You’ll be paid $1 an hour. Have fun.
Working at Samasource may be repetitive, but it does pay well — relatively speaking. The company targets people currently earning $2 a day, more than quadrupling their income.
They also offer lactation rooms and ninety-day maternity leave, and employ more women than men in a country where becoming a mother usually rules out even taking up a career. Many people have started work at Samasource, and then moved on to bigger things.
Then, of course, there’s the sense of helping something larger. By creating this data, painstakingly working out maps, you’re enabling an AI revolution — albeit one that won’t be felt in your locality for a long while — that wouldn’t otherwise be possible.
Training data is important. In fact, it’s much more important than you may think. The way you pick your data shapes the way your algorithms end up — and in some cases, you have to be very, very careful.
In 2015, Google’s algorithm tagged two African American people as “gorillas”. Turns out, it was trained mainly on photos of “white” people (or “pink” people, as Native Americans would call them). So when shown any other kind of person, it had more trouble recognising them as human.
Some cases are even more serious. Palantir, the secretive startup, uses artificial intelligence to help police catch criminals. But in the U.S., police have a tendency to arrest “black” people more than “white” ones. If the algorithm uses police records as training data, there’s a pretty good chance it’ll learn police biases as well.
The problem with machine learning is, it’s a black box. We don’t give machines specific instructions; we just say “This is what you need to do, now figure out your own formula”. And it works.
But we don’t know how it works. We can’t see the formula, and, even if we could, there’s no guarantee we’d understand it. The machine basically takes a whole bunch of factors and weighs them against one another, and nobody knows which, when, or why.
Once in the 1980s, so the story goes, the U.S. military was trying to train a computer. They wanted it to distinguish between Russian and American tanks. And it worked: they got very good results with their test data. But all the Russian-tank pictures were blurry, and all the American-tank pictures were high-quality.
As the army later realised, the machine hadn’t learnt to recognise tanks at all: it had just learnt to recognise blurry and hi-res pictures.
To be able to trust machine-learning algorithms, we’d need some explanation of how they work. And that’s what Been Kim, a researcher at Google Brain, is working on.
Her system, Testing with Concept Activation Vectors, or TCAV, figures out how much impact a certain “input feature” has on an algorithm’s decision. This helps people check that the algorithm is working as expected.
Take medical analysis, for instance. Now, algorithms can diagnose stuff like cancer, often with greater accuracy than professional doctors. But they can’t say why. And you wouldn’t be happy if the doctor came up and said “you’d better take this treatment because the algorithm said so”.
With TCAV, you can see what the algorithm’s basing its decisions on. You can take input features like “fused glands”, “patient age” and “past chemotherapy”, and see how much impact each of them has on decision-making. TCAV does this by sending in two sets of data, one with the input feature and one without, and seeing how the algorithm responds.
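Here’s a toy sketch of that comparison in Python. It is not the actual TCAV implementation (which probes a neural network’s internal activations); it’s just the general idea, using a made-up logistic-regression model, invented patient data, and “zeroing out” a column as a stand-in for removing an input feature.

```python
# A toy illustration of the comparison: feed the model two versions of the
# same records -- one with a feature present, one with it blanked out --
# and measure how much its answers shift. All data here is invented.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(1)

# Invented patient records: columns are [fused glands, patient age, past chemotherapy].
X = rng.random((500, 3))
y = (0.8 * X[:, 0] + 0.1 * X[:, 1] + rng.normal(0, 0.1, 500) > 0.5).astype(int)

# A stand-in "diagnosis" model; the real thing would be a deep network.
model = LogisticRegression().fit(X, y)

def feature_impact(model, X, feature_index):
    """Average change in predicted risk when one input feature is blanked out."""
    X_without = X.copy()
    X_without[:, feature_index] = 0.0
    with_feature = model.predict_proba(X)[:, 1]
    without_feature = model.predict_proba(X_without)[:, 1]
    return float(np.mean(with_feature - without_feature))

for name, index in [("fused glands", 0), ("patient age", 1), ("past chemotherapy", 2)]:
    print(f"{name}: {feature_impact(model, X, index):+.3f}")
```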
You can’t learn everything about an algorithm with TCAV. But you can at least get an idea. You can test a hunch. You can ask, “Is this algorithm relying more on the muzzle of the tank, or the graininess of the photo?”
Earlier, you’d just plug in the data and hope it worked. Now you can sneak a peek inside, and have some sense of how things are going.
If they’re going wrong, though, perhaps it’s time to head back to Nairobi.