Oral History: Geoff Hinton On How AI Came To Be And What We're Supposed To Do With It
Or When An Atheist Carpenter Finds Neural Nets
A couple of years ago, I had the pleasure of interviewing Geoff Hinton, who is considered one of the main figures that brought modern AI to fruition. (A very lightly edited version of our conversation follows.)
Hinton and a small group of like-minded researchers spent decades in the backwoods of academia trying to make things called neural nets work when everyone else was sure they never would. As it turned out, Hinton and his stubborn peers were right, and the conventional thinkers were wrong. Neural nets have transformed the technology industry, and many of our lives, by ushering in the modern form of Artificial Intelligence.
In this interview, Hinton talks about his academic journey and why he believed in neural nets when so few others did. He’s insightful, controversial and funny.
(If you enjoy this, there’s quite a bit more about Hinton in a new book by Cade Metz. It’s a wonderful read. There’s also an oral history I put together on Canada’s surprising role in the AI revolution. And, if you’re reading that, you might as well be watching this, which is our Hello World episode on AI.)
Hinton: I'm Geoffrey Hinton. I'm an emeritus professor at the University of Toronto. I'm the Chief Scientific Advisor to the new Vector Institute in Toronto and I'm an engineering fellow at Google. I spend most of my time running a small basic research group at Google in Toronto.
Vance: That's a lot of titles.
You grew up in the UK, and you have this very prestigious family full of famous mathematicians and economists and writers. I was curious what it was like for you growing up. Was there an expectation that you would end up in academia somewhere?
Hinton: Yeah, there was a lot of pressure. I think by the time I was about seven, I realized I was going to have to get a PhD. My mother sort of made it clear that I didn't have to be an academic. I had a choice: I could be an academic or I could be a failure. It was mainly my dad, actually.
Vance: Did you rebel against that?
Hinton: I dropped out every so often. I became a carpenter for a while.
Vance: I read about this moment early on in your life when one of your mathematician friends had this huge influence on your thinking about the brain.
Hinton: Yes. He was brilliant. I can't exactly remember what year it was. But pretty early on, sometime in the 60's, in the mid 60's, he sort of introduced me to the idea that memory might be spread out over a large portion of the brain, like a hologram.
Holograms were brand new then, and the point about a hologram is that if you take away a piece of it, you still have an image of the whole scene. It's just a bit blurrier. And he had the idea ... other psychologists had this idea too ... that maybe memories in the brain were like holograms. That's what got me going on neural networks.
Vance: I've seen you describe this a couple times, but I couldn't get to sort of the essence of why that sparked this interest for you. It was just that somebody was describing the brain in a new way?
Hinton: No, it's just a very interesting idea of how you would encode things in the brain. That a particular fact, or a particular image, wouldn't be encoded in a few neurons. It would be spread over all neurons. That's kind of interesting.
Vance: This sets off your interest in the brain, and, of course, there could be a number of ways to pursue that at university. What did you study?
Hinton: I did Physiology and Physics. Well, that's not quite true. I went to university and did Physics, and Chemistry, and Math. After a month, I dropped out. I went and worked in London doing various things for a year, and then I reapplied to do Architecture.
After a day of that, I dropped out again and switched to doing Physics and Physiology. I was the only student at Cambridge doing both Physics and Physiology.
I didn't have any Biology background, so Physiology was quite tough. It was really exciting because in the third term, they were going to tell us how the brain worked. That's why I was doing Physiology. We got to the third term, and it was taught by very distinguished people like Huxley.
They taught us about how the brain worked, but their idea of how the brain worked was that there are neurons, and there are action potentials that travel along the axons between neurons. And that's it. They didn't actually say how it worked.
It's like saying, "I'm telling you how a computer works. There's these electrical potentials."
It took me a while to realize that they weren't telling us how it worked because they didn't know. Then, I switched to philosophy.
Vance: This whole time, you were kind of consumed by this idea. You wanted to know how the brain works?
Hinton: Yeah. So then I switched to philosophy, and they didn't know how the brain works. Then I switched to psychology, and they really didn't know.
Vance: Then you ditched Cambridge?
Hinton: I ditched it and became a carpenter for about a year.
Vance: Doing what? Building houses?
Hinton: No, no, no. I wasn't that good. I was building shelves and things and cupboards. Hanging doors.
Vance: And that was, what? Just to bide your time until you'd figured out what ...
Hinton: Yeah, just to make a living. Doing something I liked.
Vance: Given the pressure from your parents, they must have reacted poorly to that.
Hinton: Yeah, they weren't too pleased. No.
Vance: Okay, but eventually you return to university, and, this time, you pursue Artificial Intelligence and your PhD?
Hinton: Yeah, I went to Edinburgh, and I studied with somebody called Christopher Longuet-Higgins, who was a very clever guy, and had worked on neural networks.
But, just before I got there to work with him, he decided neural networks were nonsense. So I spent my five years arguing with him. He kept trying to persuade me to stop doing neural networks and do more conventional symbolic AI.
Vance: This was the late 60's? 70's? Or?
Hinton: This was the 70's. I started my PhD in 1972.
Vance: Even at the very outset of you pursuing this, it's already being dismissed ...
Hinton: 1972 was the lowest point neural networks ever got. In 1972, everybody who did AI was convinced they were complete nonsense.
Vance: Was part of this a reaction to the Perceptron?
Hinton: Yeah, the Perceptron book came out in '69 and that pretty much destroyed the field.
Vance: Because it was seen as over-hyped?
Hinton: Yeah. What was interesting about the book ... It was a very good book. There was a lot of interesting technical material in it. It was very clever. But then combined with that, there was this strong underlying claim that when you make multiple layers of this stuff, nobody knows how to train them in any way. It's all hopeless.
People sort of accepted that. People accepted it until quite recently: the idea of having multiple layers in a neural net and training it all from scratch, where you didn't put in any knowledge by hand and all the knowledge came just from the data. People thought that was just crazy wishful thinking.
Vance: What do you think it was inside of you that kept you wanting to pursue this when everyone else was giving up? Just that you thought it was the right direction to go?
Hinton: No, that everyone else was wrong. It was just obvious to me that it was the right way to go. The brain has to work somehow, right?
And the brain sure as hell doesn't work by being programmed. I mean, you can try programming little kids, you can tell them what to do. But it doesn't really work. So obviously almost all the knowledge we've got, we've learned, including how language works.
So there was this big sort of conspiracy that language can't be learned, which is complete junk. It was based on some very bad mathematics, but everybody believed it. Everybody believed that language is innate. It just struck me as ridiculous.
Vance: It's incredible to have that sort of confidence though as a young researcher.
Hinton: I think it was partly that my parents were atheists, and I went to a religious school. So I was very used to everybody around me believing in God and me knowing that it was just complete nonsense.
That's very good training for a scientist. To be in an environment where there's some obvious nonsense that everybody else believes because most of science is like that.
Vance: I grew up in West Texas. My parents were staunch atheists, and everybody went to church. When I was a little kid, I couldn't understand what everybody was doing.
Hinton: Right. Exactly. It gives you confidence in your own opinion. It gives you the sense that it's at least a reasonable theory that everybody might be wrong.
Vance: Okay, the UK is already rejecting neural nets, then you head to the United States, to California.
Hinton: It's not that I couldn't get a job in the UK. I couldn't even get a job interview.
And so then I saw this wonderful opportunity in California, and that was a huge liberation. Because in California they thought neural nets might not be all nonsense.
Vance: This is the 70's and they're still just betting that this might be okay?
Hinton: In the late 70's in California, there was this group in San Diego: David Rumelhart, Don Norman, and Jay McClelland. But particularly David Rumelhart and Jay McClelland, who thought neural networks were really interesting.
That was a huge liberation.
Vance: It seems that even though people were interested in neural networks, there were still these limitations and it just wasn't going as far as you guys wanted? Or is that the wrong impression?
Hinton: Retrospectively, we know what was going on but we didn't know at the time. So we know now that to really get the power out of these systems that learn everything from data, you need a lot of data and you need a lot of compute power.
Once you have millions of training cases and many gigaflops of compute power, then these things really work. Back then, we were trying to train things on a thousand training examples.
Actually, the paper in Nature on backpropagation, the main example that we used, had like 100 training examples, and it was trained on a machine that took 12 microseconds to do a floating point multiply. So, it was a twelfth of a megaflop.
So now a machine is a million times faster than that.
It worked moderately well. But if, at the time, we'd said, "You know, if you gave us like 10,000 times as much data and 1,000,000 times the compute power, this stuff would really work," people would say, "Oh yeah, right."
But actually, that's how it was. And you couldn't actually know that until you had all that data and all that compute power.
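(A quick aside: Hinton's arithmetic here is easy to check. Below is a minimal back-of-the-envelope sketch in Python; the "million times faster" multiplier is his own estimate, not a measured number.)

# Back-of-the-envelope check of the compute numbers Hinton cites.
# The "million times faster" figure is Hinton's estimate, assumed here.
seconds_per_multiply = 12e-6            # 12 microseconds per floating point multiply
flops_then = 1 / seconds_per_multiply   # ~83,333 multiplies per second
megaflops_then = flops_then / 1e6       # ~0.083, i.e. about a twelfth of a megaflop
flops_now = flops_then * 1e6            # "a million times faster": ~83 gigaflops

print(f"Then: {flops_then:,.0f} flops/s ({megaflops_then:.3f} megaflops)")
print(f"Now, per Hinton's estimate: {flops_now / 1e9:,.0f} gigaflops")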
Vance: You were in California, then you went to Carnegie Mellon, and then I kind of wanted to get you to Canada and why you chose to come here. I mean, part of the reason was because the Defense Department was funding so much research in the US?
Hinton: Yes.
Vance: You didn't really agree with the policies?
Hinton: I was very lucky that just before I went to Carnegie Mellon I was in Cambridge for a couple of years at a different job. I was in Cambridge and at 2 o'clock in the morning, the phone went.
I assumed it was my buddy, Terry, who had some new idea. He used to get very excited and call up, you know how it is. But it wasn't him. It was someone I had never heard of called Charlie Smith. He said, "You don't know me but I know you, and we'd like to fund your research."
I said, "What do you mean?" And he said, "I work for a foundation. We'd like to fund your research." I said, "Well, how much are we talking about?" He said, "Well, how much do you need?"
So I thought for a bit and said, "Well, I can't really answer that unless I know how much you have." He said, "No, how much do you need?" So, I started thinking big numbers and doubling them. I said, "I need $100,000."
I said, "Why do you want to fund my research?" He said, "Well, I'm working for this foundation called the System Development Foundation and we like to fund really out there ideas that possibly will never work. I've been reading some of your research and we'd like to fund you."
Vance: What a compliment.
Hinton: What a compliment.
So I said, "So, okay, so what do I do?" He said, "Well, you have to write us a two page grant proposal."
I wrote a two page grant proposal. I sent it to them. Then I had to go to California and give a talk to the Advisory Board. The median age of the Advisory Board was 80.
So, if you put the lights down, about half of them went to sleep. They were very distinguished people like Beckman and people like that. I gave this talk. By that point, I'd managed to ask for $350,000. Then they said, "Okay, we're going to fund you. We're going to give you $400,000 because we don't think you've asked for enough."
It's not like your normal granting agency. I thought, "That's amazing!" I told the people at UCSD about that, where I'd been working.
The same guy then approached them and said he wanted to fund them. They knew I got $400,000 so they asked him for $2,000,000. He gave them $2,250,000. So then he approached people at MIT who knew about San Diego. They asked him for $5,000,000 and he gave them $5,000,000.
Then he approached people at Stanford, and a friend of mine called Brian Smith wrote the proposal. He asked for $32,000,000 because he realized, why not? Go for $32,000,000.
Vance: You were really at the wrong end of this process.
Hinton: I was. Unfortunately, by that time the foundation didn't have $32,000,000 left, and Stanford got, I think, about $25,000,000. But they spent all their time moaning about how they'd been cheated out of their $32,000,000.
Vance: Was the System Development Foundation tied to the Defense Department? Or it's totally ... ?
Hinton: Yes and no. The way it worked was this: you know what the RAND Corporation was?
Vance: Yeah.
Hinton: They did sort of government consulting. They set up a not-for-profit organization called the System Development Corporation. This not-for-profit did consulting on government software. I think they were responsible for the NORAD software.
So they were writing defense software for the government. And because it was defense software for the government, they accidentally made a $100,000,000 profit. Then the IRS found out about this and said, "You're a not-for-profit; you're not allowed to make a $100,000,000 profit. But what we'll let you do is give all the money to a foundation, and the foundation has to give the money away quickly."
So they set up the System Development Foundation, and found Charlie Smith and said "Your job is to get rid of this money." So that's what was going on.
Vance: Philosophically you did not want to take money from the government?
Hinton: I didn't want to take Defense Department money. So when that money ran out, I sat by the phone, but it didn't ring. Then I had to start applying for defense grants. I got a National Science Foundation grant, but that wasn't enough money to keep running the group at CMU. So then I had this nasty situation: I had these graduate students that needed supporting, and I didn't want to take defense money. I applied for a few defense grants, and I got one of them, but I really didn't like it.
I sort of didn't like the idea that this stuff was going to be used for purposes that I didn't think were good. So then I learned about this Canadian Institute for Advanced Research, which allowed you to get a job in Canada but still spend most of your time doing research. Because before that, if you got a professor job in Canada, you'd do a lot more teaching than you did in the States. And that was very attractive: that I could go off to this civilized town and do basic research.
Vance: Once you get to Canada, you must have needed to convince another group of people to believe in neural nets?
Hinton: Well at that point they had a conventional AI program, and I was the weird guy in the conventional AI program. And then later on they disbanded the conventional AI program, and I got to start a new program in neural nets.
The AI program started in 1985. And I joined in 1987. Then that program finished in the mid 90's. I went off to England for a few years. And then when I came back I persuaded them to set up a program in neural nets.
Vance: Was it seen as radical and crazy that CIFAR was willing to fund neural nets?
Hinton: No, it wasn't that radical and crazy. It was seen as quite daring to fund this area, because most people in computer science thought this stuff was dead.
Vance: What were some of the moments or breakthroughs that kept people believing in neural nets?
Hinton: So the big problem with neural nets had been that, if you made them have lots of layers, they trained very slowly. In 2005 and 2006 in Toronto, we developed a method of training networks with lots of layers that was more efficient. And we had a very influential paper in Science, which came out in 2006 and showed we could train very deep neural networks efficiently. That got a lot of people interested.
So that started interest again in these deep nets with many layers of feature detectors. Then, in 2009, two of the students in my lab developed a way of doing speech recognition using these deep nets. They worked better than the existing technology on a standard benchmark. It was only slightly better, but the fact that the existing technology had 30 years of development in it, and these deep nets could do slightly better after a few months, made it obvious that within a few years they were going to do much better.
One of the students went to work with Microsoft and developed it for a bigger vocabulary. Another student went to Google as an intern. When he went there, he said he wanted to replace part of their speech recognizer with this neural net. And they said, "Oh, that's much too ambitious for an intern. Do something much more modest." He insisted, and luckily he had a very insightful manager who thought, "Well, there's a small chance it'll work. And if it works, it'll be important, so why not let him?" So the manager let him do it, and he did it, and it worked. And at Google they realized very quickly that this was the future of speech recognition. They got it into a product in 2012. They were the first people to have a commercial speech recognizer based on these deep neural nets.
And that came out in the Android. At the time it came out, it suddenly made the Android better at speech recognition than Siri.
Vance: Which was a problem for Apple.
Hinton: They pretty soon switched to using neural nets. The first big commercial success was the work done in 2009, which went into production in the Android in 2012. And the second big success came when someone called Fei-Fei Li developed a big database of millions of images and held a public competition, called the ImageNet Competition.
In 2012, two more of my students developed a system that dramatically won the competition. It got almost half the error rate of the existing computer vision systems. They wiped out computer vision. That made a really big impact.
And because they'd already done speech recognition, I think the big companies realized this wasn't just a one-trick pony. It did speech recognition, and now it did object recognition. So it was going to do everything. It was that that led to the sudden change in the interest from companies. They weren't very interested before then, but suddenly they all decided, "Oh, okay, this is it. This is the future."
Vance: How surprising was that to you? Or did it finally feel like, "Aha, the world has come around to my vision, and what I was expecting is happening"?
Hinton: It was sort of a relief that people finally came to their senses.
There's another piece of history that's actually quite similar, which is continental drift. The theory of continental drift was developed a long, long time ago, and all the geologists thought, "This is a completely crazy theory. This is just wishful thinking." The idea that Africa sort of fits into South America and they could sort of be joined up? I mean, that's just loopy, because how could these things move thousands of miles? It was regarded as really silly, wishful thinking.
And it turned out the guy was right. Continents really have drifted apart, and they really did fit together. And that's how neural nets were regarded by most of the people in AI. Just silly wishful thinking.
Vance: Is there a bit of sadness for you that this is happening at this stage in your career?
Hinton: Yeah, there is a bit of sadness. I'm quite jealous I'm not sort of the same age as some of my peers because they're at an age where they can really have a big future in this stuff.
Vance: What is your take on where this is heading? Are you in the optimistic or the pessimistic camp?
Hinton: I mean, my main take is, it's really hard to predict the future. You can predict the future quite well for a few years, and you might be able to predict what's gonna happen in five years' time, but as soon as you start making predictions about what's going to happen in 20 years' time, you almost always end up hopelessly wrong. So I think there's gonna be all sorts of things that happen that we didn't expect. And a lot of these doomsday scenarios are basically based on science fiction movies. My main feeling is we can't predict what's going to happen.
There's some things we can predict. That this technology is going to change everything. I think that's fairly obvious.
So let me give you just a few examples. If you take a patient in a hospital, they have a medical record, and that medical record covers their whole life. They've seen doctors, they've sent things to doctors, they've had tests, they've been diagnosed with things. If you ask how much of the information in that medical record is being used to decide how to treat them, to predict what's going to happen to them, and to figure out how to prevent it, the answer is: not much. What we should have available is hundreds of millions of records like this. And in hundreds of millions of records like this, there's a huge amount of information. That information can make medicine much, much better.
Much more proactive, and much more effective. That's true for medical records, and it's true for things like CT scans. You take a CT scan, a doctor looks at it and says, "Your tumor is this size." But there's all sorts of other information in the scan, like how the tumor is developing and whether it's metastasized. We believe there's lots of information that isn't being used now and will be used in the future.
Vance: And you've described radiologists as Wile E. Coyote, sitting up there, about to look down and realize that they're going to crash into the ground.
Hinton: I think that was a mistake. I think what's going to happen is radiologists will spend less of their time looking at CT scans and trying to interpret them, and more of their time interacting with patients. Explaining to them what's going on.
Vance: Why did you roll that one back a little bit? Did you get push back from the radiologists?
Hinton: I thought a bit about it afterwards because a few radiologists were upset, and pointed out that they didn't just read these things, they also interacted with patients, and that was going to be harder to automate. And I think what's going to happen is bits of what doctors do is going to be automated, and work much better than they did when doctors did them. There's a lot of work that doctors do that isn't just routine reading of images. That stuff is going to take much longer for machines to replace.
Vance: What about on the jobs front? You talked about the example with the doctor. I mean, that's great, because the radiologist can spend their time maybe on these finer details, but it's less good maybe if you're in manufacturing, where an AI can train a robot to perform a lot of the tasks that humans do.
Hinton: I just don't know what's going to happen. Obviously it's going to make new jobs, new kinds of jobs, but I think what you're going to see is that the things that make us most human, things like empathy, things like being able to understand other people, and caring about other people, those are going to be the things that don't get automated. Maybe things like writing poetry.
It's going to tend to remove the routine things. Now, there are some routine things, like reading a CT scan, that are highly paid but are nevertheless routine, that aren't really exercising what makes us human. It's just our perceptual system that's doing it. So I think the human things will be the last to be automated, and we'll be able to spend more of our time doing things like that, and less of our time doing boring routine things.
So take automatic teller machines, for example. You give them a check; they give you $20. That's all automated now. It didn't actually lead to a massive loss of jobs. It actually led to more branches of banks that were smaller. I don't think anybody would go back and say we shouldn't have had automatic teller machines. That was intrinsically a boring job, and it's tedious to wait for the teller. It's all just much more efficient now, and it's just increased the general good. I think there'll be an awful lot of that going on. But I think the social impact of all this stuff is very much up to the political system we're in.
So intrinsically making things more efficient, making producing goods more efficient, ought to increase the general good. The only way in which that's going to be bad is if you have a society that takes all the benefit of that increase in productivity, and gives it to the top 1%. That will be bad. If you take that increase in productivity and spread it equally, that's going to be good.
This is a political problem right? This isn't a problem with the technology.
Vance: Well except in this particular case there's a handful of companies that have invested dramatically more into this field than all the rest. And they have already gotten very wealthy and their founders are very wealthy. It’s exactly the scenario you’re playing out.
Hinton: So one of the reasons I live in Canada is Canada has high taxation, and if you make a lot of money, it taxes you a lot. I think that's a great system. I actually believe in taxes.
Vance: Yeah. That's crazy.
Hinton: Taxes ... okay, reading the media you wouldn't actually get this idea, but taxes are good, right? Taxes aren't bad. Taxes are good.
If somebody's benefiting a lot, tax them a lot, and that's a great system.
Vance: And just philosophically, I mean all those years ago you made this choice to leave the U.S. because the funding was coming from the Defense Department. Obviously today the Defense Department is using this type of technology.
Hinton: I'm not helping anymore.
Vance: I'm not laying this all on you, but there must be... I mean do you have any regrets on that end?
Hinton: Occasionally I think about it. The things that worry me most are ... two things, really. Autonomous weapons: they're worrying because they're here already. And interference in elections, sort of using data about people to interfere with elections. That's clearly not good, and it's clearly happened already. But on the whole, this stuff is just going to make things more efficient. And it should increase the general good, if you had a decent political system.
Vance: But, in the meantime, all I see is companies and the government funding AI to the hilt now and just accelerating this as quickly as they can. I hear a lot of philosophical talk about the safeguards and things like that. But not many practical solutions.
Hinton: So I think the safeguards that will stop robots from taking over ... the reason you don't see a huge effort going into that now is because all the people who know about this stuff know that it's way, way off in the future.
Do you remember when people were worried that this Large Hadron Collider might make a black hole that would gobble up the earth?
Vance: Yes.
Hinton: Right. I was sort of a bit worried about that, because it wouldn't be good to be gobbled up by a black hole. But then I talked to some physicists, and they said, "You don't have to worry about that." If the physicists would talk to us, we'd tell them, "You don't have to worry about the robots taking over." At least not in the foreseeable future.
Vance: If you could go back in time to your younger self is there something you would tell your younger self to do differently? I know you've had this singular conviction that's worked out well for you.
Hinton: Yeah. My advice would be: learn to program, write lots of programs, and test your ideas with programs. And also learn as much math as you can stomach. I could never stomach much, but the little I learned was very helpful, and the more math you learn, the more helpful it'll be. It's that combination: learning as much math as you can cope with, and programming to test your ideas. The most important thing is, if you have an idea, write a program that will test whether it works.