The Wrong Classroom

This is my response to Rich Sutton’s The Bitter Lesson, Rodney Brooks’ A Better Lesson, and Andy Kitchen’s A Meta Lesson. Sutton’s bitter lesson was that we should stop trying to incorporate human knowledge and human ways of thinking into our AI systems, because these approaches always lose out to those using massive-scale search and learning; the examples he gives include chess, Go, and computer vision. Brooks’ better lesson was that the human ingenuity required to get search- and learning-based approaches to succeed at all gives the lie to the idea that we are ever talking about pure search and learning in these success stories: it was human design and architecture choices that made them work. Kitchen’s meta lesson is that human ingenuity combined with massive compute is required to have systems that learn better ways of learning.

But learning what exactly? To accomplish a specific, narrowly defined task? To behave “intelligently” in some environment? Or to accomplish a task in virtue of having behaved intelligently? This, to my mind, is at the heart of the disagreements between the search-and-learn camp and the human-knowledge camp. With computer chess, was the goal just to win at chess, or was chess chosen as a metric because it was believed that mastering it required doing it the way humans do? My understanding of the history of AI is that it was the latter. Having a chess-playing machine was not the point of the exercise. Chess was simply seen as the answer to the question, “What are intelligent humans good at?” Intelligence is something that is useful for us to track in other agents, especially other humans, and when we reflect on the most impressive instances of it we’ve come across, we come up with examples like “people who are good at math,” “people who read lots of books,” and “people who are good at chess.” The chess challenge was well-defined, not vague like other possible challenges (Does reading books mean understanding them? How do you prove understanding?), and so it made sense as a way to measure progress towards machine intelligence. Mistaking the metric for the end goal was, and remains, the crucial but largely unacknowledged point of contention between the two camps as defined above.

As I reflect on this I’m reminded of the now infamous story of the Facebook bots that were widely reported to have invented their own language, much to the alarm of the researchers who designed them. The reporting is now often used as an example of the worst of the AI hype machine: there was no alarm on the part of the researchers, and the bots had simply learned to optimize a metric while bypassing what was in fact the true end goal of the research, namely having bots learn to communicate using coherent English sentences. The assumption had been that the task (in this case negotiating with another agent over a split between items of differing value) could only be achieved via the intermediate goal, from the agents’ perspective, of learning the English language. This is analogous to the chess-as-metric idea: with chess, the assumption was that you could only win by playing the way humans do, bringing all of our human intelligence to bear on the task.

Metrics for AI systems have to be well-defined, and my suspicion is that this makes them almost by definition solvable without anything we humans would ever track as “intelligence.” But does this matter? Sometimes the metric and the end goal are aligned, as in the case of computer vision and speech recognition: we want systems that can detect objects or recognize words in speech, and we directly measure their ability to do those things. But when they are not aligned, such as when the true end goal is something vague like “solving intelligence,” there may be many lessons learned, but at least some AI researchers will simply be in the wrong classroom.
