13.2. What Made Lisp Different
When it was first developed, Lisp embodied nine new ideas. Some of these we now take for granted, others are only seen in more advanced languages, and two are still unique to Lisp. The nine ideas are, in order of their adoption by the mainstream:
1. Conditionals. A conditional is an if-then-else construct. We take these for granted now, but Fortran I didn't have them. It had only a conditional goto closely based on the underlying machine instruction.
2. A function type. In Lisp, functions are a data type just like integers or strings. They have a literal representation, can be stored in variables, can be passed as arguments, and so on.
3. Recursion. Lisp was the first high-level language to support recursive functions.
4. Dynamic typing. In Lisp, all variables are effectively pointers. Values are what have types, not variables, and assigning values to variables means copying pointers, not what they point to.
5. Garbage-collection.
6. Programs composed of expressions. Lisp programs are trees of expressions, each of which returns a value. This is in contrast to Fortran and most succeeding languages, which distinguish between expressions and statements.
This distinction was natural in Fortran I because you could not nest statements. So while you needed expressions for math to work, there was no point in making anything else return a value, because there could not be anything waiting for it.
This limitation went away with the arrival of block-structured languages, but by then it was too late. The distinction between expressions and statements was entrenched. It spread from Fortran into Algol and then to both their descendants.
7. A symbol type. Symbols are effectively pointers to strings stored in a hash table. So you can test equality by comparing a pointer, instead of comparing each character.
8. A notation for code using trees of symbols and constants.
9. The whole language there all the time. There is no real distinction between read-time, compile-time, and runtime. You can compile or run code while reading, read or run code while compiling, and read or compile code at runtime.
Running code at read-time lets users reprogram Lisp's syntax; running code at compile-time is the basis of macros; compiling at runtime is the basis of Lisp's use as an extension language in programs like Emacs; and reading at runtime enables programs to communicate using s-expressions, an idea recently reinvented as XML.
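The last of these points, reading at runtime, can be sketched outside Lisp as well. The following Python sketch is my own illustration, not anything from the essay's sources: it reads an s-expression arriving as text and evaluates it as a nested list, which is the sense in which code and data share one notation.

```python
# A toy s-expression reader and evaluator, sketched in Python rather
# than Lisp. It shows what "reading code at runtime" buys: programs
# can exchange nested lists that are simultaneously data and runnable code.
def parse(text):
    return read(text.replace("(", " ( ").replace(")", " ) ").split())

def read(tokens):
    tok = tokens.pop(0)
    if tok == "(":
        lst = []
        while tokens[0] != ")":
            lst.append(read(tokens))
        tokens.pop(0)  # discard the closing ")"
        return lst
    try:
        return int(tok)
    except ValueError:
        return tok  # anything non-numeric is a symbol

def evaluate(x):
    if not isinstance(x, list):
        return x
    op, *args = x
    vals = [evaluate(a) for a in args]
    if op == "+":
        return sum(vals)
    if op == "*":
        result = 1
        for v in vals:
            result *= v
        return result
    raise ValueError("unknown operator: %r" % op)

print(evaluate(parse("(+ 1 (* 2 3))")))  # -> 7
```

Only two operators are handled; the point is the shape of the mechanism, not completeness.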
When Lisp first appeared, these ideas were far removed from ordinary programming practice, which was dictated largely by the hardware available in the late 1950s. Over time, the default language, embodied in a succession of popular languages, has gradually evolved toward Lisp. Ideas 1-5 are now widespread. Number 6 is starting to appear in the mainstream. Python has a form of 7, though there doesn't seem to be any syntax for it.
As for number 8, this may be the most interesting of the lot. Ideas 8 and 9 only became part of Lisp by accident, because Steve Russell implemented something McCarthy had never intended to be implemented. And yet these ideas turn out to be responsible for both Lisp's strange appearance and its most distinctive features. Lisp looks strange not so much because it has a strange syntax as because it has no syntax; you express programs directly in the parse trees that get built behind the scenes when other languages are parsed, and these trees are made of lists, which are Lisp data structures.
Expressing the language in its own data structures turns out to be a very powerful feature. Ideas 8 and 9 together mean that you can write programs that write programs. That may sound like a bizarre idea, but it's an everyday thing in Lisp. The most common way to do it is with something called a macro.
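To give non-Lisp readers something concrete, here is a crude Python analogue of a program writing a program, with strings standing in for the parse trees a real Lisp macro manipulates. The function and field names are invented for this sketch.

```python
# A program writing a program: generate the source text of a new
# function, compile it with exec, then call it. Lisp macros do this on
# parse trees instead of strings, which is far less fragile, but the
# principle is the same.
def make_getter_source(field):
    # Emit the text of a function that reads one dictionary key.
    return "def get_%s(obj):\n    return obj[%r]\n" % (field, field)

namespace = {}
exec(make_getter_source("name"), namespace)  # compile the generated code
get_name = namespace["get_name"]
print(get_name({"name": "ITA"}))  # -> ITA
```

The generated function behaves exactly as if it had been typed by hand; the program that produced it is itself ordinary code.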
The term "macro" does not mean in Lisp what it means in other languages. A Lisp macro can be anything from an abbreviation to a compiler for a new language. If you really want to understand Lisp, or just expand your programming horizons, I would learn more about macros.
Macros (in the Lisp sense) are still, as far as I know, unique to Lisp. This is partly because in order to have macros you probably have to make your language look as strange as Lisp. It may also be because if you do add that final increment of power, you can no longer claim to have invented a new language, but only a new dialect of Lisp.
I mention this mostly as a joke, but it is quite true. If you define a language that has car, cdr, cons, quote, cond, atom, eq, and a notation for functions expressed as lists, then you can build all the rest of Lisp out of it. That is in fact the defining quality of Lisp: it was in order to make this so that McCarthy gave Lisp the shape it has.
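As a rough sketch of how small that set of primitives is, here is the list machinery in Python, with tuples standing in for cons cells. The representation is my own choice for the illustration, not McCarthy's.

```python
# The core Lisp list primitives, sketched with Python tuples as cons cells.
def cons(a, d): return (a, d)                  # build a pair
def car(p): return p[0]                        # first element of a pair
def cdr(p): return p[1]                        # rest of the pair
def atom(x): return not isinstance(x, tuple)   # True for non-pairs

# The list (1 2 3) is nested pairs ending in None (standing in for nil):
lst = cons(1, cons(2, cons(3, None)))
print(car(cdr(lst)))  # -> 2
```

Everything else in the list world, length, append, map, and so on, can be defined in terms of these few operations.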
13.3. Where Languages Matter
Even if Lisp does represent a kind of limit that mainstream languages are approaching asymptotically, does that mean you should actually use it to write software? How much do you lose by using a less powerful language? Isn't it wiser, sometimes, not to be at the very edge of innovation? And isn't popularity to some extent its own justification? Isn't the pointy-haired boss right, for example, to want to use a language for which he can easily hire programmers?
There are, of course, projects where the choice of programming language doesn't matter much. As a rule, the more demanding the application, the more leverage you get from using a powerful language. But plenty of projects are not demanding at all. Most programming probably consists of writing little glue programs, and for little glue programs you can use any language that you're already familiar with and that has good libraries for whatever you need to do. If you just need to feed data from one Windows app to another, sure, use Visual Basic.
You can write little glue programs in Lisp too (I use it as a desktop calculator), but the biggest win for languages like Lisp is at the other end of the spectrum, where you need to write sophisticated programs to solve hard problems in the face of fierce competition. A good example is the airline fare search program that ITA Software licenses to Orbitz. These guys entered a market already dominated by two big, entrenched competitors, Travelocity and Expedia, and seem to have just humiliated them technologically.
The core of ITA's application is a 200,000-line Common Lisp program that searches many orders of magnitude more possibilities than their competitors, who apparently are still using mainframe-era programming techniques. I have never seen any of ITA's code, but according to one of their top hackers they use a lot of macros, and I am not surprised to hear it.
13.4. Centripetal Forces
I"m not saying there is no cost to using uncommon technologies. The pointyhaired boss is not completely mistaken to worry about this. But because he doesn"t understand the risks, he tends to magnify them.
I can think of three problems that could arise from using less common languages. Your programs might not work well with programs written in other languages. You might have fewer libraries at your disposal. And you might have trouble hiring programmers.
How big a problem is each of these? The importance of the first varies depending on whether you have control over the whole system. If you're writing software that has to run on a remote user's machine on top of a buggy, proprietary operating system (I mention no names), there may be advantages to writing your application in the same language as the OS. But if you control the whole system and have the source code of all the parts, as ITA presumably does, you can use whatever languages you want. If any incompatibility arises, you can fix it yourself.
In server-based applications you can get away with using the most advanced technologies, and I think this is the main cause of what Jonathan Erickson calls the "programming language renaissance." This is why we even hear about new languages like Perl and Python. We're not hearing about these languages because people are using them to write Windows apps, but because people are using them on servers. And as software shifts off the desktop and onto servers (a future even Microsoft seems resigned to), there will be less and less pressure to use middle-of-the-road technologies.
As for libraries, their importance also depends on the application. For less demanding problems, the availability of libraries can outweigh the intrinsic power of the language. Where is the breakeven point? Hard to say exactly, but wherever it is, it is short of anything you'd be likely to call an application. If a company considers itself to be in the software business, and they're writing an application that will be one of their products, then it will probably involve several hackers and take at least six months to write. In a project of that size, powerful languages probably start to outweigh the convenience of pre-existing libraries.
The third worry of the pointy-haired boss, the difficulty of hiring programmers, I think is a red herring. How many hackers do you need to hire, after all? Surely by now we all know that software is best developed by teams of less than ten people.
13.5. The Cost of Being Average
How much do you lose by using a less powerful language? There is actually some data out there about that.
The most convenient measure of power is probably code size. The point of high-level languages is to give you bigger abstractions-bigger bricks, as it were, so you don"t need as many to build a wall of a given size. So the more powerful the language, the shorter the program (not simply in characters, of course, but in distinct elements).
How does a more powerful language enable you to write shorter programs? One technique you can use, if the language will let you, is something called bottom-up programming. Instead of simply writing your application in the base language, you build on top of the base language a language for writing programs like yours, then write your program in it. The combined code can be much shorter than if you had written your whole program in the base language-indeed, this is how most compression algorithms work. A bottom-up program should be easier to modify as well, because in many cases the language layer won't have to change at all.
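A toy illustration of that bottom-up shape, with an invented domain and invented helper names: a few lines of "language layer" first, then the program written in that vocabulary.

```python
# Language layer: a tiny vocabulary for rule-based text filtering
# (hypothetical domain, chosen only to show the layering).
def matching(pred, items): return [x for x in items if pred(x)]
def contains(word): return lambda line: word in line
def either(p, q): return lambda line: p(line) or q(line)

# The "application", now one readable expression instead of nested loops:
log = ["GET /", "ERROR disk full", "WARN low memory", "GET /home"]
problems = matching(either(contains("ERROR"), contains("WARN")), log)
print(problems)  # -> ['ERROR disk full', 'WARN low memory']
```

If the filtering rules change, only the one-line "application" changes; the layer underneath stays put, which is the modifiability claim in miniature.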
Code size is important, because the time it takes to write a program depends mostly on its length. If your program would be three times as long in another language, it will take three times as long to write-and you can't get around this by hiring more people, because beyond a certain size new hires are actually a net lose. Fred Brooks described this phenomenon in his famous book The Mythical Man-Month, and everything I've seen has tended to confirm what he said.
So how much shorter are your programs if you write them in Lisp? Most of the numbers I've heard for Lisp versus C, for example, have been around 7-10x. But a recent article about ITA in New Architect magazine said that "one line of Lisp can replace 20 lines of C," and since this article was full of quotes from ITA's president, I assume they got this number from ITA. If so then we can put some faith in it; ITA's software includes a lot of C and C++ as well as Lisp, so they are speaking from experience.
My guess is that these multiples aren"t even constant. I think they increase when you face harder problems and also when you have smarter programmers. A really good hacker can squeeze more out of better tools.
As one data point on the curve, at any rate, if you were to compete with ITA and chose to write your software in C, they would be able to develop software twenty times faster than you. If you spent a year on a new feature, they'd be able to duplicate it in less than three weeks. Whereas if they spent just three months developing something new, it would be five years before you had it too.
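The arithmetic here, taking the article's 20x figure at face value (a back-of-the-envelope model, not measured data):

```python
# If one line of Lisp replaces 20 lines of C, development time scales
# the same way, under the rough assumption that time tracks code size.
ratio = 20
weeks_to_copy_your_year = 52 / ratio
print(weeks_to_copy_your_year)      # -> 2.6 (weeks, i.e. under three)
years_to_copy_their_quarter = 0.25 * ratio
print(years_to_copy_their_quarter)  # -> 5.0 (years)
```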
And you know what? That's the best-case scenario. When you talk about code-size ratios, you're implicitly assuming that you can actually write the program in the weaker language. But in fact there are limits on what programmers can do. If you're trying to solve a hard problem with a language that's too low-level, you reach a point where there is just too much to keep in your head at once.
So when I say it would take ITA's imaginary competitor five years to duplicate something ITA could write in Lisp in three months, I mean five years if nothing goes wrong. In fact, the way things work in most companies, any development project that would take five years is likely never to get finished at all.
I admit this is an extreme case. ITA's hackers seem to be unusually smart, and C is a pretty low-level language. But in a competitive market, even a differential of two or three to one would be enough to guarantee that you'd always be behind.
13.6. A Recipe
This is the kind of possibility that the pointy-haired boss doesn't even want to think about. And so most of them don't. Because, you know, when it comes down to it, the pointy-haired boss doesn't mind if his company gets their ass kicked, so long as no one can prove it's his fault. The safest plan for him personally is to stick close to the center of the herd.
Within large organizations, the phrase used to describe this approach is "industry best practice." Its purpose is to shield the pointy-haired boss from responsibility: if he chooses something that is "industry best practice," and the company loses, he can't be blamed. He didn't choose, the industry did.
I believe this term was originally used to describe accounting methods and so on. What it means, roughly, is don't do anything weird. And in accounting that's probably a good idea. The terms "cutting-edge" and "accounting" do not sound good together. But when you import this criterion into decisions about technology, you start to get the wrong answers.
Technology often should be cutting-edge. In programming languages, as Erann Gat has pointed out, what "industry best practice" actually gets you is not the best, but merely the average. When a decision causes you to develop software at a fraction of the rate of more aggressive compet.i.tors, "best practice" does not really seem the right name for it.
So here we have two pieces of information that I think are very valuable. In fact, I know it from my own experience. Number 1, languages vary in power. Number 2, most managers deliberately ignore this. Between them, these two facts are literally a recipe for making money. ITA is an example of this recipe in action. If you want to win in a software business, just take on the hardest problem you can find, use the most powerful language you can get, and wait for your competitors' pointy-haired bosses to revert to the mean.
13.7. Appendix: Power
As an illustration of what I mean about the relative power of programming languages, consider the following problem. We want to write a function that generates accumulators-a function that takes a number n, and returns a function that takes another number i and returns n incremented by i. (That's incremented by, not plus. An accumulator has to accumulate.)
In Common Lisp this would be:

  (defun foo (n)
    (lambda (i) (incf n i)))

In Ruby it's almost identical:

  def foo (n)
    lambda {|i| n += i }
  end

Whereas in Perl 5 it's:

  sub foo {
    my ($n) = @_;
    sub {$n += shift}
  }
which has more elements than the Lisp/Ruby version because you have to extract parameters manually in Perl.
In Smalltalk the code is also slightly longer than in Lisp and Ruby:

  foo: n
    |s|
    s := n.
    ^[:i| s := s + i. ]

because although in general lexical variables work, you can't do an assignment to a parameter, so you have to create a new variable s to hold the accumulated value.
In Javascript the example is, again, slightly longer, because Javascript retains the distinction between statements and expressions, so you need explicit return statements to return values:

  function foo (n) {
    return function (i) {
      return n += i } }

(To be fair, Perl also retains this distinction, but deals with it in typical Perl fashion by letting you omit returns.)
If you try to translate the Lisp/Ruby/Perl/Smalltalk/Javascript code into Python you run into some limitations. Because Python doesn't fully support lexical variables, you have to create a data structure to hold the value of n. And although Python does have a function data type, there is no literal representation for one (unless the body is only a single expression), so you need to create a named function to return. This is what you end up with:

  def foo (n):
    s = [n]
    def bar (i):
      s[0] += i
      return s[0]
    return bar

Python users might legitimately ask why they can't just write

  def foo (n):
    return lambda i: return n += i

or even

  def foo (n):
    lambda i: n += i

and my guess is that they probably will, one day. (But if they don't want to wait for Python to evolve the rest of the way into Lisp, they could always just...)
In OO languages, you can, to a limited extent, simulate a closure (a function that refers to variables defined in surrounding code) by defining a class with one method and a field to replace each variable from an enclosing scope. This makes the programmer do the kind of code analysis that would be done by the compiler in a language with full support for lexical scope, and it won't work if more than one function refers to the same variable, but it is enough in simple cases like this.
Python experts seem to agree that this is the preferred way to solve the problem in Python, writing either

  def foo (n):
    class acc:
      def __init__ (self, s):
        self.s = s
      def inc (self, i):
        self.s += i
        return self.s
    return acc (n).inc

or

  class foo:
    def __init__ (self, n):
      self.n = n
    def __call__ (self, i):
      self.n += i
      return self.n

I include these because I wouldn't want Python advocates to say I was misrepresenting the language, but both seem to me more complex than the first version. You're doing the same thing, setting up a separate place to hold the accumulator; it's just a field in an object instead of the head of a list. And the use of these special, reserved field names, especially __call__, seems a bit of a hack.
In the rivalry between Perl and Python, the claim of the Python hackers seems to be that Python is a more elegant alternative to Perl, but what this case shows is that power is the ultimate elegance: the Perl program is simpler (has fewer elements), even if the syntax is a bit uglier.
How about other languages? In the other languages mentioned here-Fortran, C, C++, Java, and Visual Basic-it does not appear that you can solve this problem at all. Ken Anderson says this is about as close as you can get in Java:

  public interface Inttoint {
    public int call (int i);
  }

  public static Inttoint foo (final int n) {
    return new Inttoint () {
      int s = n;
      public int call (int i) {
        s = s + i;
        return s; }};
  }

which falls short of the spec because it only works for integers.
It"s not literally true that you can"t solve this problem in other languages, of course. The fact that all these languages are Turing-equivalent means that, strictly speaking, you can write any program in any of them. So how would you do it? In the limit case, by writinga Lisp interpreter in the less powerful language.
That sounds like a joke, but it happens so often to varying degrees in large programming projects that there is a name for the phenomenon, Greenspun's Tenth Rule: Any sufficiently complicated C or Fortran program contains an ad hoc, informally-specified, bug-ridden, slow implementation of half of Common Lisp.
If you try to solve a hard problem, the question is not whether you will use a powerful enough language, but whether you will (a) use a powerful language, (b) write a de facto interpreter for one, or (c) yourself become a human compiler for one. We see this already beginning to happen in the Python example, where we are in effect simulating the code that a compiler would generate to implement a lexical variable.
This practice is not only common, but institutionalized. For example, in the OO world you hear a good deal about "patterns." I wonder if these patterns are not sometimes evidence of case (c), the human compiler, at work. When I see patterns in my programs, I consider it a sign of trouble. The shape of a program should reflect only the problem it needs to solve. Any other regularity in the code is a sign, to me at least, that I'm using abstractions that aren't powerful enough-often that I'm generating by hand the expansions of some macro that I need to write.
Chapter 14. The Dream Language.
Of all tyrannies, a tyranny exercised for the good of its victims may be the most oppressive.
C. S. LEWIS.
A friend of mine once told an eminent operating systems expert that he wanted to design a really good programming language. The expert said that it would be a waste of time, that programming languages don't become popular or unpopular based on their merits, and so no matter how good his language was, no one would use it. At least, that was what had happened to the language he had designed.
What does make a language popular? Do popular languages deserve their popularity? Is it worth trying to define a good programming language? How would you do it?
I think the answers to these questions can be found by looking at hackers, and learning what they want. Programming languages are for hackers, and a programming language is good as a programming language (rather than, say, an exercise in denotational semantics or compiler design) if and only if hackers like it.
14.1. The Mechanics of Popularity