Pin Dancing

Friday, February 4, 2011

The repertoire method in "Concrete Mathematics"

Concrete Mathematics by Knuth et al is a great book but there are a couple of places where people learning by themselves can stumble, fall and get lost.

When I originally worked through the book I couldn't make head or tail of the "repertoire method" of solving recurrences. Eventually I did figure it out and sometime later I wrote up what I understood and posted it on my (then) blog. It seems that a lot of people search for "Concrete Mathematics Repertoire method" on Google and it is my second most popular blog post (the most popular post is this one, fwiw).

So I re-read the repertoire method post today and while it is correct, I can explain it better today so here goes.

(What follows may not make sense to you if you are not working through CM (be warned!). Also be warned that I have no formal training in mathematics, computer science or programming and am entirely self taught. So people with such training can probably explain things better. The following reflects only *my* understanding. That said, onwards!)

By the time you hit the repertoire method section in Chapter 1 of Concrete Mathematics, you have been taught a simple method to find closed forms of recurrences. The essential algorithm is

(a) Make a table of the recurrence values R(n) for small values of n.
(b) Eyeball the table to see if you can spot a pattern.
(c) Write down the pattern.
(d) Prove (or disprove) the candidate closed form's correctness by Mathematical Induction over (a subset of) the natural numbers.

You used this method, for example, to solve the Josephus problem, so you know where to stand (in the circle of your idiot friends trying to commit mass suicide) so that you end up being the survivor, surrender to the Romans and become a historian for the ages.
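Steps (a) to (c) can be sketched in a few lines of Python for the Josephus recurrence. This is only a minimal sketch: the function names `josephus` and `split_power_of_two` are mine, not the book's, and agreeing on a finite range of n is evidence, not the induction proof that step (d) asks for.

```python
def josephus(n):
    """J(1) = 1, J(2n) = 2*J(n) - 1, J(2n + 1) = 2*J(n) + 1."""
    if n == 1:
        return 1
    half, odd = divmod(n, 2)
    return 2 * josephus(half) + (1 if odd else -1)


def split_power_of_two(n):
    """Write n as 2^m + k with 0 <= k < 2^m and return (m, k)."""
    m = n.bit_length() - 1
    return m, n - (1 << m)


# Step (a): tabulate small values. Steps (b) and (c): compare with the guess 2k + 1.
for n in range(1, 17):
    m, k = split_power_of_two(n)
    print(n, josephus(n), 2 * k + 1)
```

The last two columns of the printed table agree for every row, which is the pattern you then prove by induction.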

The repertoire method makes its (first) appearance in the generalization of the Josephus recurrence. The constants 1, -1 and 1 are replaced by alpha, beta and gamma to give the more general recurrence

f(1) = alpha
f(2n) = 2*f(n) + beta
f(2n + 1) = 2*f(n) + gamma~~~~~~~~~~~~~~~~~~~~~~~[1]

Your job is to find f(n) such that this recurrence is true for any values of alpha, beta, and gamma.

So how do you do this? You use the only technique you know (at this point in the book) of making a table for small values of n and eyeballing it to spot a pattern (I am too lazy to reproduce the table here, go buy the book!). You don't spot a pattern for the f(n) but you do notice that all values of f(n) follow a pattern of

f(n) = A(n)*alpha + B(n)*beta + C(n)*gamma~~~~~~~~~~~~~~[2]

This is a kind of template of what the final closed form will look like, depending of course on the values of A(n), B(n) etc

In other words if we can find functions A(n), B(n) and C(n) such that when given n they generate the right coefficients as in Table 1.12, then we have a closed form for f(n).
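If you don't have the book's table handy, the coefficients can be regenerated mechanically. The sketch below (the name `coefficients` is my own) runs recurrence [1] on triples (A, B, C) instead of on numbers, so that each f(n) is carried around as A*alpha + B*beta + C*gamma.

```python
def coefficients(n):
    """Return (A(n), B(n), C(n)) such that f(n) = A*alpha + B*beta + C*gamma."""
    if n == 1:
        return (1, 0, 0)                      # f(1) = alpha
    a, b, c = coefficients(n // 2)
    if n % 2 == 0:
        return (2 * a, 2 * b + 1, 2 * c)      # f(2n) = 2*f(n) + beta
    return (2 * a, 2 * b, 2 * c + 1)          # f(2n + 1) = 2*f(n) + gamma


for n in range(1, 10):
    print(n, coefficients(n))
```

For example, the row for n = 5 comes out as (4, 2, 1), i.e. f(5) = 4*alpha + 2*beta + gamma.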

Something very important happened here. You broke down a problem into smaller subproblems.

You still don't know the value of f(n) but now you know that if you can find the values of the functions A(n), B(n), and C(n) you have solved f(n).

Now you can find these values with your trusted "spot a pattern and verify with induction" (the only tool you have at this point) OR you can use the (unexplained!) repertoire method.

The first bit of confusion arises because the authors do both. They guess values of all three functions of n, A(n), B(n) and C(n), from the initial table, say (correctly) that proving these values by induction is long and tedious, and then they go ahead and prove that the guessed value of A(n) is correct by induction! (To be fair, they are solving for only A(n) and not all the unknowns simultaneously, but though smaller, this induction is still tedious. Try it. I did.)

Or, in more detail,

a value is guessed, (that A(n) = 2^m , m coming from rewriting n as 2^m + k as in the original Josephus problem - the book uses the letter l instead of k, I use k to distinguish easily from the number 1), beta and gamma are set to zero (remember the recurrence has to hold for *all* values of alpha, beta and gamma, including the selected 1,0,0) and then the recurrence [1] is rewritten (using [2]) as a recurrence in terms of A(n).

I'll repeat that so it is clear what is happening

Step 1: Guess a value for A(n), by eyeballing. We guess A(n) = 2^m.

Step 2: Select alpha, beta and gamma so that the other two functions of n, B(n) and C(n), get eliminated from [2]. You can do this by selecting alpha = 1 and beta = gamma = zero.

Step 3: Rewrite the equation [2] (i.e the observed generalized form) in terms of the selected values of alpha, beta and gamma. You get f(n) = A(n).

Step 4: Use this equation (ie f(n) = A(n)) and your chosen values of alpha, beta and gamma to rewrite the original recurrence (ie equation [1]) to get

A(1) = 1 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~[3]
A(2n) = 2*A(n)~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~[4]
A(2n + 1) = 2*A(N)~~~~~~~~~~~~~~~~~~~~~~~~~~~~~[5]

Now the problem becomes to prove that your guess (i.e, A(n) = A(2^m + k) = 2^m) satisfies this new recurrence as expressed by [3] and [4] and [5].

The authors state "Sure enough it is true (by induction on m) that A(2^m + k) = 2^m "

The authors don't show the induction but you can work out this (tedious but not difficult) induction. Only basic algebraic manipulation is required.

Be aware that (a) the induction is on m, not n and (b) the predicate to be proven has the form P_m: A(2^m + k) = 2^m for every k with 0 <= k < 2^m, given [3], [4] and [5].

First, prove P_0 (m starts from zero, though n starts from 1 - we need an induction on m, not n!). Then prove that P_m => P_(m+1). (Weak Induction is sufficient.)

So we prove (yay!!) that A(n) does = 2^m.
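Before grinding through the induction, it can be reassuring to check the guess numerically. The sketch below is not a proof (it only covers m up to 5) and the function name `A` simply mirrors the notation above; it evaluates recurrence [3], [4], [5] directly and compares against 2^m.

```python
def A(n):
    """Recurrence [3]-[5]: A(1) = 1, A(2n) = 2*A(n), A(2n + 1) = 2*A(n)."""
    return 1 if n == 1 else 2 * A(n // 2)


for m in range(0, 6):
    for k in range(0, 2 ** m):
        n = 2 ** m + k
        # The guess says A(2^m + k) = 2^m for every 0 <= k < 2^m.
        print(n, A(n), 2 ** m)
```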

You could do the same for B(n) and C(n). Select values for alpha, beta and gamma to create recurrences in terms of B(n) only and then C(n) only, just as we did above for A(n), then use mathematical induction over m to prove your guesses correct.

In the book though, the authors switch to the repertoire method to find B(n) and C(n). This switch is the first confusing bit - A(n) is found using a guess + induction (the old method - eyeball, guess, use induction). But then they switch - and the repertoire method is used to find B(n) and C(n) and the initial guesses as to their values are unused!

Worsening the confusion is the fact that the repertoire method is not identified or explained explicitly at this point. As a student states in a margin note (great idea btw) "Beware: the authors are expecting you to figure out the idea of the repertoire method from seats of pants examples instead of giving a top down presentation"

The seat-of-pants example is actually enough if A(n),B(n) and C(n) are all worked out with the repertoire method.

So let us solve the whole thing with the repertoire method (no induction) and see how it works. Let us throw away the guesses about the values of A(n), B(n), and C(n). We'll assume we couldn't make any guesses for A(n), B(n) and C(n).

All we have is the original recurrence

f(1) = alpha
f(2n) = 2*f(n) + beta
f(2n + 1) = 2*f(n) + gamma~~~~~~~~~~~~~~~~~~~~~[1]

and our observation that f(n) always has the form

f(n) = A(n)*alpha + B(n)*beta + C(n)*gamma~~~~~~~~[2]

ok so we don't know (and we need to find out) the values of A(n), B(n) and C(n) .

The repertoire method (for this recurrence) works like this.
(1) Guess a value for f(n). (ie the guess is for the whole of [2] NOT a component A(n) as we did above!)
(2) See if you can find values for alpha, beta and gamma to validate this guess. Rewrite [1], the original recurrence, in terms of your guess for f(n). See if you can find values for alpha, beta and gamma.
(3) Substitute these values (of alpha, beta and gamma) back into [2]. You'll get an equation in terms of the three unknowns A(n), B(n) and C(n).
(4) Repeat steps (1) - (3) till you have three independent equations.
(5) Solve the three linear equations in three unknowns.

Done.

Important: If you make a "wrong" guess you will end up with a useless equation like 0 = 0 or an equation that is not independent of the already derived equations and so on. If this happens, don't worry about it, try another guess till you do get three independent equations.
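Steps (1) and (2) can be sketched in Python. The helper name `parameters_for_guess` and its `limit` argument are my own inventions for illustration. It reads alpha, beta and gamma off the first few values of a guess and then checks that the same constants keep working for larger n. Since only finitely many n are checked, a passing guess is evidence rather than proof.

```python
def parameters_for_guess(g, limit=64):
    """Return (alpha, beta, gamma) making the guess g fit recurrence [1], or None."""
    alpha = g(1)                 # f(1) = alpha
    beta = g(2) - 2 * g(1)       # f(2n) = 2*f(n) + beta with n = 1
    gamma = g(3) - 2 * g(1)      # f(2n + 1) = 2*f(n) + gamma with n = 1
    for n in range(1, limit):
        if g(2 * n) != 2 * g(n) + beta or g(2 * n + 1) != 2 * g(n) + gamma:
            return None          # no constants make this guess fit
    return alpha, beta, gamma


print(parameters_for_guess(lambda n: 1))      # a usable guess
print(parameters_for_guess(lambda n: 0))      # all zeros: only gives 0 = 0
print(parameters_for_guess(lambda n: n * n))  # None: this guess does not fit
```

A guess that returns None, or one whose parameters are all zero, is one of the "wrong" guesses described above. Just try another.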

In detail.

I guess (the authors do too) that f(n) = 1.

Rationale for the guess: f(n) = constant is the simplest possible formulation of f(n) (just for fun you might want to try f(n) = 0)

Let us try substituting this in [1]

f(1) = alpha becomes 1 = alpha (since f(n) is 1 for any n).

similarly

f(2n) = 2*f(n) + beta becomes 1 = 2*1 + beta so beta = -1
f(2n + 1) = 2*f(n) + gamma becomes 1 = 2*1 + gamma so gamma = -1

so we have values for alpha,beta, gamma and f(n) and when we substitute back into [2] we get

A(n) - B(n) - C(n) = 1~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~[6]

ok!!! we have the first equation.

Now if we could get just two more independent equations like this we would have three equations in three unknowns (and so solve by Linear Algebra).

The authors use f(n) = n as their second guess and get A(n) + C(n) = n as their second equation.

(This is a good guess too. After f(n) = constant, f(n) = n is the next rung up the complexity ladder. But in this specific example we can do better - see below)

and since they already proved that A(n) = 2^m by the "guess and use induction" method they don't need a third equation. They have two equations in two unknowns and they solve to get the solution.

But we don't have any guesses as to the values of A(n), so we can't plug that in.

We guessed at f(n) = 1 and got the equation A(n) - B(n) - C(n) = 1 and we need two more independent equations. We could reuse the authors' f(n) = n guess and also cheat a bit and reuse the (non generalized) Josephus recurrence solution that we already proved, to "guess" that f(n) = 2k + 1. Then alpha = 1, beta = -1 and gamma = 1, giving us the third equation A(n) - B(n) + C(n) = 2k + 1. Three equations, three unknowns. Solve. This gives the right answer too.
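Here is a sketch of that "three equations, solve" step for one fixed n at a time, using the guesses f(n) = 1, f(n) = n and f(n) = 2k + 1. The helpers `solve3` and `abc_at` are hypothetical names of mine; `solve3` is a plain Gauss-Jordan elimination over exact fractions, and it assumes the system is not singular (true for these three equations).

```python
from fractions import Fraction


def solve3(rows, rhs):
    """Solve a 3x3 linear system by Gauss-Jordan elimination with exact fractions."""
    m = [[Fraction(x) for x in row] + [Fraction(b)] for row, b in zip(rows, rhs)]
    for col in range(3):
        # Find a row with a non-zero entry in this column and move it into place.
        pivot = next(r for r in range(col, 3) if m[r][col] != 0)
        m[col], m[pivot] = m[pivot], m[col]
        m[col] = [x / m[col][col] for x in m[col]]
        for r in range(3):
            if r != col:
                factor = m[r][col]
                m[r] = [x - factor * y for x, y in zip(m[r], m[col])]
    return [m[r][3] for r in range(3)]


def abc_at(n):
    """Solve for [A(n), B(n), C(n)] from the three guessed special cases."""
    m = n.bit_length() - 1
    k = n - 2 ** m
    rows = [(1, -1, -1),   # f(n) = 1:      A - B - C = 1
            (1, 0, 1),     # f(n) = n:      A + C = n
            (1, -1, 1)]    # f(n) = 2k + 1: A - B + C = 2k + 1
    return solve3(rows, [1, n, 2 * k + 1])


print(abc_at(5))
```

For n = 5 (so m = 2, k = 1) this gives A = 4, B = 2, C = 1.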

But that feels like a cheat. What if we hadn't solved the Josephus recurrence before? How would we guess f(n) = 2k + 1? We could go with f(n) = n but we can do better.

Since n = 2^m + k, we guess (our second guess)

f(n) = 2^m

and f(n) = k. (our third guess)

(Rationale for these guesses. Since n is dependent on 2^m and k why not guess with the simpler variables rather than f(n) = n, like the example in the book does? In this case this decision pays off spectacularly, giving the solutions for A(n) and C(n) directly and B(n) trivially )

These give us the equations (just like we worked out [6] above, try it!)

A(n) = 2^m~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~[7]
C(n) = k~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~[8]

and these two along with [6] give us B(n) = 2^m - k - 1.
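For anyone who wants to see the "try it!" worked out, here is a sketch of the substitution for both guesses, written in LaTeX. It uses only recurrence [1], the template [2] and n = 2^m + k. The one extra observation (spelled out by me, though it follows directly from the definitions) is what happens to m and k when n is doubled.

```latex
% Doubling n = 2^m + k (with 0 <= k < 2^m):
%   2n     = 2^{m+1} + 2k        so (m, k) becomes (m+1, 2k)
%   2n + 1 = 2^{m+1} + (2k + 1)  so (m, k) becomes (m+1, 2k+1)
\begin{align*}
\intertext{Guess $f(n) = 2^m$:}
f(1) = 2^0 = 1 &\;\Rightarrow\; \alpha = 1 \\
f(2n) = 2^{m+1} = 2 \cdot 2^m + \beta &\;\Rightarrow\; \beta = 0 \\
f(2n+1) = 2^{m+1} = 2 \cdot 2^m + \gamma &\;\Rightarrow\; \gamma = 0
\intertext{so [2] reads $A(n) = 2^m$. Guess $f(n) = k$:}
f(1) = 0 &\;\Rightarrow\; \alpha = 0 \\
f(2n) = 2k = 2 \cdot k + \beta &\;\Rightarrow\; \beta = 0 \\
f(2n+1) = 2k + 1 = 2 \cdot k + \gamma &\;\Rightarrow\; \gamma = 1
\intertext{so [2] reads $C(n) = k$.}
\end{align*}
```

Plugging A(n) = 2^m and C(n) = k into [6] gives 2^m - B(n) - k = 1, which rearranges to B(n) = 2^m - k - 1.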

Look ma! no induction !

So now we have the values for A(n), B(n) and C(n) and now we can say that

the solution to the recurrence

f(1) = alpha
f(2n) = 2*f(n) + beta
f(2n + 1) = 2*f(n) + gamma is (given n = 2^m + k as explained earlier in the book)

f(n) = (2^m)*alpha + (2^m - k - 1)*beta + k*gamma.
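A quick way to gain confidence in this closed form is to compare it with the recurrence itself for a few parameter choices. This is a sketch only: `f_recursive` and `f_closed` are names I made up, and the parameter triples are arbitrary test values.

```python
def f_recursive(n, alpha, beta, gamma):
    """Evaluate recurrence [1] directly."""
    if n == 1:
        return alpha
    half, odd = divmod(n, 2)
    return 2 * f_recursive(half, alpha, beta, gamma) + (gamma if odd else beta)


def f_closed(n, alpha, beta, gamma):
    """Closed form: (2^m)*alpha + (2^m - k - 1)*beta + k*gamma, with n = 2^m + k."""
    m = n.bit_length() - 1
    k = n - 2 ** m
    return 2 ** m * alpha + (2 ** m - k - 1) * beta + k * gamma


for params in [(1, -1, 1), (1, 0, 0), (3, 5, -7)]:
    agree = all(f_recursive(n, *params) == f_closed(n, *params) for n in range(1, 200))
    print(params, agree)
```

The first triple, (1, -1, 1), is the original Josephus recurrence.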

Double checking: when alpha = 1, beta = -1 and gamma = 1 we get

the solution to

f(1) = 1
f(2n) = 2*f(n) - 1
f(2n + 1) = 2*f(n) + 1 (note: this is the original Josephus Recurrence)

is (2^m)*1 + (2^m - k - 1)*(-1) + k*(1), which resolves to

2k+1 which agrees with [1.9] (in the book).

Hopefully what the authors say about the repertoire method now makes more sense, though the sentence structure at this point in the book is confusing. Let us take it apart - my comments are in parentheses.

"First we find settings for parameters for which we know the solution" (the parameters here are alpha beta and gamma, the "solution"s are the various guessed values of f(n) NOT A(n),B(n), C(n). We make guesses for A(n),B(n) etc when we are using the prove by Induction method. When we use Repertoire method, we guess for f(n) and *find* A(n), B(n) etc);

"this gives us a repertoire of special cases that we can solve" (the special cases are the independent equations in the unknowns A(n),B(n),C(n) and solving them gives us the values of A(n) etc).

"Then we obtain the general case" (the solution of recurrence [1], the general value of f(n) )

"by combining the special cases" (In this case we combine the solutions of the equations which are the "special cases").

Hopefully that helped a bit.

The repertoire method pops up all over CM in various contexts, and once you grasp it, it is easy to identify and use. Enjoy the rest of Concrete Mathematics (which imho is a great, great book every programmer should have on his bookshelf).

Saturday, December 11, 2010

The answer to "Will you mentor me?" is

No.

Thanks for understanding.

Ok that was the nutshell version. If that answers your question, that's great.

The more detailed answer is "No, I won't mentor you, but in this blog entry I will tell you what to do instead, to get where you want to go". And I can reply with the url to this post the next time someone requests mentoring.

I once wrote a comment on Hacker News about what *I* learned about ending up with awesome mentors. Here it is, slightly edited so it reads a little better.

(The OP asked) Recently I have tried approaching a few good developers through their blogs about various matters including advice on how to go about some projects I'm undertaking but I am surprised at the unfriendly responses I have received. Maybe I have been going about it the wrong way but it got me thinking; Shouldn't the guys whose work we look up to be keen on what some of us young aspiring developers have to contribute to the community? I mean sure, we don't have the experience or skills some of these guys have(yet) but we still have some ideas that are viable with the right technical skills to back them. If any of them want to reach out and help nurture some potential talent, it may very well benefit all them in the end, whether financially or in terms of new ideas and experiences.

I commented thus

I have some experience in this, so let me try to explain a couple of things that I learned in the "school of hard knocks".

Once upon a time I was in a situation where I thought I could contribute to something one of the best programmers in the world was working on so I sent an email (I got the address from his webpage) and said something to the effect of "you say on this webpage you need this code and I have been working on something similar in my spare time and I could write the rest for you over the next few months because I am interested in what you are doing" and I got a 2 line reply which said (paraphrased) "A lot of people write to me saying they'll do this, but I've never seen any code yet so I am a little skeptical. Don't take it personally. Thanks. bye.".

So in the next email (sent a minute after I received his reply) I sent him a zipped file of code with an explanation that "this is what I've done so far which is about 70% of what you want" and he immediately replied saying "Whoa you are serious. That is refreshing .." and opened up completely, giving me a lot of useful feedback and very specific advice. He is a (very valued) mentor to this day.

Another time, I was reading a paper from a (very famous) professor at Stanford, and I thought I could fill in some gaps in that paper so I wrote a "You know your paper on X could be expanded to give results Y and Z. I could use the resulting code in my present project. Would you be interested in seeing the expanded results or code" email and I got a very dismissive one line email along the lines of "That is an old paper and incomplete in certain respects, Thanks".

So a few days later, I sent along a detailed algorithm that expanded his idea, with a formal proof of correctness and a code implementation and he suddenly switched to a more expansive mode, sending friendly emails with long and detailed corrections and ideas for me to explore.

Now I am not in the league of the above two gentlemen, but perhaps because I work in AI and Robotics in India, which isn't too common, I receive frequent emails to the effect of "please mentor me", often from students. I receive too many of these emails to answer any in any detail, but if I ever get an email with "I am interested in AI/ Robotics. This is what I've done so far. Here is the code. I am stuck at point X. I tried A, B, C nothing worked. What you wrote at [url] suggests you may be the right person to ask. can you help?" I would pay much more attention than to a "please mentor me" email.

In other words, when you ask for a busy person's time for "mentorship" or "advice" or whatever, show (a) you are serious and have gone as far as you can by yourself, (b) you have taken concrete steps to address whatever your needs are and (optionally, but especially with code related efforts) (c) how helping you could benefit them/their project.

Good developers are very busy and have so much stuff happening in their lives and more work than they could ever hope to complete that they really don't have any time to answer vague emails from some one they've never heard of before.

As an (exaggerated) analogy, think of writing an email to a famous director or movie star or rock star, saying "I have these cool ideas about directing/acting/ music. Can you mentor me/give me advice?"

I am replacing the words "app" and "technical" in your sentence below with "film" and "film making".

"if I have an idea for a film that I want to develop, but my film making skills limit me, it would be nice to have people to bounce the idea off and have it implemented. "(so .. please mentor me/give me advice/make this film for me).

Do you think a top grade director (say Spielberg) would respond to this?

The fact that you at least got a 2 line response shows that the developers you wrote to are much nicer than you may think. They care enough not to completely dismiss your email, though they receive dozens of similar emails a week.

As someone else advised you on this thread, just roll up your sleeves and get to work. If your work is good enough, you'll get all the "mentoring" you'll need. "Mentoring" from the best people in your field is a very rare and precious resource and like anything else in life that is precious, should be earned.

My 2 cents. Fwiw. YMMV.

That says most of what I want to say.

Some minor points now, addressing some points raised in the latest emails.

If you claim to be "very passionate about X" but have never done anything concrete in X I find it difficult to take you seriously. People who are really passionate about anything don't wait for "leaders" or "mentors" before doing *concrete* work in the area of their passion, however limited. Specifically wrt programming/machine learning etc, in the days of the internet and with sites like Amazon or the MIT OCW you have no limits except those you impose on yourself.

I hate to sound all zen master-ey but in my experience, it is doing the work that teaches you what you need to do next. Walking the path reveals more of the map. All the mentoring a truly devoted student needs is an occasional nudge here or an occasional brief warning there. Working with uncertainty is part of the learning. Waiting for mentorship/leadership/"community"[1]/ whatever to start working is a flaw that guarantees you will never achieve anything worthwhile.

Ok pseudo-zen-master-mode off. More prosaic version - "shut up and code". Or make a movie on your webcam, Or write that novel. Whatever. Your *work* will, in time, bring you all the mentoring and community or whatever else you need.

As always My 2 cents. Fwiw. YMMV. Have a nice day.

[1] For some reason Bangalore is crawling with people who first want to form a community and then start learning/working/whatever. These efforts almost invariably peter out uselessly. First do the work. Then if you feel like "communing" talk to others who are also working hard. Please read this, sent to me by my friend Prakash Swaminathan.

Thursday, September 9, 2010

My Schedule for the rest of the year

- Starting tomorrow hack 12 hours a day as part of my current project. ( C & Haskell, Machine Learning, if anyone is interested). Will be traveling to places without Internet connectivity. So expect to be mostly offline.

- Oct end / Beginning of Nov: Back in Bangalore . Back online. Yay!

- Nov end: complete paperwork/documentation/training blah blah, Project handover.

- Nov end. This (phase of this) project done. Whew.

- December - somewhat free. I hope to release some Open Source code before EOY. Fairly old Scala code (so needs to be updated to Scala 2.8, add some comments and so on) but should be useful to others. Paperwork for Open Source release should come through before then.

- Jan 1, 2011. New Year. No definite plans but lots of nice opportunities. Problems of plenty. Touch Wood. (Update: "No definite plans" is no longer true. A couple of VERY interesting opportunities in the air. Life is good.)

Wednesday, September 1, 2010

The Secret of Professional Happiness

I was talking to Prakash Swaminathan the other day and he said something that I thought encapsulated the essence of having a great professional life.

(a) Work with people you admire, (b) on interesting projects and (c) work from home as much as possible.

I could imagine dropping (c) if the other two criteria were met (though it does make a lot of sense in today's networked world) but whenever I've compromised on (a) or (b) life has sucked, without exception.

So children, learn from your elders. Always work with great people on great projects and avoid the corporate politics bullshit and you'll be happy professionally.

Of course this assumes you are skilled enough (or are willing to work to get there) that awesome people want you on awesome projects but that is a different post altogether.

Friday, August 20, 2010

Who (and what) I would like to see at DevCamp

Comments, requests and suggestions on my last post are pouring in. Thanks everyone. One of the folks who sent me email asked "Who would *you* like to see speaking at DevCamp, assuming they are in India and willing to deliver a talk, and on what?"

Hmm. Interesting question. I haven't really thought about this very deeply but here is a quick response (very busy day, no time to edit, link to home pages etc, sorry).

In no particular order,

(1) Debashish Ghosh on deep Scala programming. This guy is really good.

(2) Baishampayan Ghose on the technical aspects of paisa.com

(3) Bhasker Kode on Erlang at Hover.in

(4) Peter Thomas - guru on things Wicket-ey, speaking of things wicket-ey. (Due disclosure , old friend of mine)

(5) Narayan Raman on *the evolution* of Sahi (and on running a company based around an open source tool he wrote. How cool is that?)

(6) Anyone from c42 or ActiveSphere on the challenges of setting an n-man (n < 7) consultancy and competing with the big boys (due disclosure both companies built by ex tw -ers. I know a few of them)

(7) Anyone (technical) from FlipKart. They seem to be doing good things (I am a satisfied customer) and I am interested in how they tackle the huge challenges in building (for e.g.) recommendation systems.

(8) Anyone at all in India doing serious work in Haskell (Scala would do).

(9) Anyone building/working in a *technically* challenging startup (Notion Ink, say) on their *technical* challenges.

(10) ThoughtWorkers hacking on stuff, on what they are hacking on. TW ers in general have all kinds of side projects going. The two Viveks (Prahalad and Singh) would be a good start.

Tuesday, August 17, 2010

Speaking at DevCamp 2010

I'll be speaking at DevCamp 2010. As in 2008, I have a "menu" of topics that people can vote on and will select the topic at the very last minute. Since I don't use slides (in general) this isn't very hard to pull off. More on this below.

Dev Camp is interesting because in India, there aren't any "developer to developer" conferences. Most are either company sponsored events (e.g Sun/Oracle/Adobe Tech days) or are overrun by "evangelists" hired by MegaCorps to sell their crapware to developers. DevCamp attempts to head these people off by stating "DevCamp is an annual BarCamp style unconference for hackers, by hackers and of hackers that began in Bangalore in 2008 with code and hacking as its core themes" Some of these "evangelists" are shameless enough to crash the conf anyway, but the Law of Two Feet often takes care of that.

So what do you talk about at DevCamp? (everything that follows is *my* opinion. I have nothing to do with the organizing of DevCamp)

If you are any kind of hacker, you have a pet project running on the side. You are learning or doing something that might be of interest to other developers. So in the last devcamp I attended (in 2008) someone was trying to replace JMeter with an equivalent Erlang tool and he gave a very interesting talk on the advantages and challenges of this approach.

Bring your laptop and show us what you are working on. *Don't* make one of those slide heavy "Introduction to Blah" type talks that are prevalent at most Indian conferences (last year's PyConf India was a good example of this iirc. Hopefully this year is better). Your audience consists of professional developers who are quite comfortable with looking up stuff on the Internet.

As the Dev Camp page puts it, "Assume a high level of exposure and knowledge on the part of your audience and tailor your sessions to suit. Avoid 'Hello World' and how-to sessions which can be trivially found on the net. First hand war stories, in-depth analysis of topics and live demos are best." Again some folks do try to sneak in "Introduction to Blah" where Blah is the latest "hot" topic (Clojure or Android would fit the bill these days, for e.g.), but again "The Law of Two Feet" (mostly) takes care of them.

If you want to talk about Clojure don't do "An Introduction to Clojure". In the days of YouTube, Rich Hickey can do that much better than you could. Talk about "How I built a Text Processing/WebCrawler library in Clojure" or "My startup runs on Clojure" (and show us the code). Tell us what *you* know that few others do ("in-depth analysis") and/or show us interesting code you wrote ("live demos"). If someone were to do a talk on (for e.g.) how the Clojure *compiler* works and the tradeoffs in its design, that would be interesting to me. If you are recycling "Clojure has macros, woot!" I don't care.

The other interesting aspect of DevCamp is how lightweight it is. There is none of the stuffiness associated with the usual company conferences. It is an *un* conference, like Barcamp, but without the legion of SEO marketing people, "bloggers", non-tech "founders" trawling for naive developers who'll work for free on their latest "killer idea"s etc who swarm Barcamp. BarCamp (imo) attracts fringe lunatics. DevCamp attracts (or should attract when it works well) competent developers.

So, these are the things I could talk about at DevCamp. Since I work on Machine Learning and Compilers, the topics reflect that experience. I could talk about how to build a Leasing System in Java but I doubt I'd have anything interesting to say ;-).

Send me email if you have a preference (or leave a comment here). I'll talk about whatever has the highest number of votes on Sep 4. "Customer Development" for sessions woot? Email > comments here > twitter but any and all forms of media are acceptable.

The topics from highest to lowest number of votes registered at the time of writing are

(1) An In Depth Look at the Category Theory Bits in Haskell (expanded version of the old Monad tutorial)

At DevCamp 2008, I presented a talk on "Understanding Monads" where the idea was that someone who knew nothing about Monads should come to the talk and walk out knowing how they work and when to use them. Instead of giving vague analogies ("monads are space stations/containers/elephants.."), you build monads from the ground up using first class functions. The talk included, in its first iteration, the List, Maybe and State monads. Later versions (over the years I have given the talk a few times) broke down the Category Theory behind monads and how it helps in structuring programs.

The latest version encompasses all the hairy Category Theory related bits and pieces (Applicatives, Monoids, Functors, Monad Transformers...) which impede programmers trying to learn Haskell/Scala/ML etc. I don't assume any theory/math background from the audience and introduce required formalisms. The good news is that this is a very polished and popular topic (and is trending highest in the number of "votes"). The bad news is that I am bored of this talk (but will still use it if it scores the highest number of votes).

(2) Building a Type Inferencer in 45 minutes

Static Type Systems, especially those more powerful than the Java/C# variety, are a mystery to most programmers. This can be seen for example in how developers with a Java background write "Java in Scala" rather than idiomatic Scala. The best way (and the Hacker's way) to understand how a Type Inferencer works is to build one. This session builds a Hindley Milner type checker with a couple of extensions.

(3) WarStory: How I escaped Enterprise SW and became a Machine Learning Dev

Self explanatory ;-)

(4) Proof Technique for Programmers - A Developer's gateway to Mathematics (and Machine Learning)

This comes out of something I observed in the Bangalore Dev community. A lot of people read "Programming Collective Intelligence" (a terrible book - read my HN "review" here - I am "plinkplonk". See also comments by brent) and fancy themselves "Machine Learning" people ("we aren't experts but we know the basics". Ummm . No, you don't :-P. )

The sad truth is, you can't do any serious machine learning (or Computer Vision, or Robotics, or NLP or Algorithm heavy) development without high levels of mathematics. "Pop" AI books like PCI are terrible at teaching you anything useful.

To quote Peter Norvig from his review of Chris Bishop's Neural network book (emphasis mine)

"To the reviewer who said "I was looking forward to a detailed insight into neural networks in this book. Instead, almost every page is plastered up with sigma notation", that's like saying about a book on music theory "Instead, almost every page is plastered with black-and-white ovals (some with sticks on the edge)." Or to the reviewer who complains this book is limited to the mathematical side of neural nets, that's like complaining about a cookbook on beef being limited to the carnivore side. If you want a non-technical overview, you can get that elsewhere (e.g. Michael Arbib's Handbook of Brain Theory and Neural Networks or Andy Clark's Connectionism in Context or Fausett's Fundamentals of Neural Networks), but if you want understanding of the techniques, you have to understand the math. Otherwise, there's no beef. "

The "if you want understanding of the techniques, you have to understand the math" bit is true for all areas of ML, not just Neural networks. The biggest stumbling block (there are many ;-)) for most developers attempting to grok the underlying mathematics is the proof based learning method most higher level Math/Machine Learning books assume.

E.g here is the *first* exercise of the *second* chapter of "Elements of Statistical Learning", which (unlike PCI) is a book you *should* read if you plan to do Machine Learning-ey things

"Suppose each of K-classes has an associated target t_k, which is a vector of all zeros, except a one in the kth position. Show that classifying to the largest element of y amounts to choosing the closest target, min_k ||t_k − y||, if the elements of y sum to one."

This "Given X, Prove Y" structure is how almost all books in the field teach things. Sure you should code up the algorithms, but doing such problems is how you get *insight* into the field. And algorithms have their own problems (pun intended). Open Cormen et al's "Introduction to Algorithms" and you'll find questions like (randomly opening the third edition)

Problem 20.1 (e) Prove that, under the assumption of simple uniform hashing, your RS-vEB-TREE-INSERT (Note vEB tree == van Emde Boas tree) and RS-vEB-TREE-SUCCESSOR run in O(lg lg u) expected time.

Thus it turns out that for getting into many areas of interest, a knowledge of how to prove things is critical. You will make very slow or zero progress without that understanding. That is the bad news. The good news is, proofs are (relatively) easy for programmers to understand when presented the right way (acquiring skill takes a while). I wasted many years learning this stuff in inefficient ways. Don't make the same mistake.

Zero math background required. Just bring some paper to write on.

(5) Trika - A Hierarchical Reinforcement Learning framework in Scala

A demo and discussion on an RL framework I built. I haven't yet cleared the paperwork to Open Source this (the process is like pulling teeth, long story), but I can still show it off.

(6) Neuro genetic Algorithms - Theory and Applications

An interesting branch of AI/ML with some elegant applications. Again live demo of a couple of interesting algorithms and talk about design/performance trade-offs.

(7) Denotational, Operational and Axiomatic Semantics - Designing programming languages with mathematics

This is of interest to people building their own languages. Most language implementations are ad hoc "hacks". They don't have to be.

If you plan to attend, let me know which of these topics strike your fancy. And if you are a reader of this blog, find me and say Hello.

See you at DevCamp!

Saturday, July 17, 2010

The New American Militarism

Excerpt from the preface of Andrew Bacevich's "The New American Militarism: How Americans are Seduced by War"

The final point concerns my understanding of history. Before moving into a career focused on teaching and writing about contemporary U.S. foreign policy, I was trained as a diplomatic historian. My graduate school mentors were scholars of great stature and enormous gifts, admirable in every way. They were also splendid teachers, and I left graduate school very much under their influence. My own abbreviated foray into serious historical scholarship bears the earmarks of their approach, ascribing to Great Men—generals, presidents, and cabinet secretaries—the status of historical prime movers.

I have now come to see that view as mistaken. What seemed plausible enough when studying presidents named Wilson or Roosevelt breaks down completely when a Bush or Clinton occupies the Oval Office. Not only do present-day tendencies to elevate the president to the status of a demigod whose every move is recorded, every word parsed, and every decision scrutinized for hidden meaning fly in the face of republican precepts. They also betray a fundamental misunderstanding of how the world works.

What is most striking about the most powerful man in the world is not the power that he wields. It is how constrained he and his lieutenants are by forces that lie beyond their grasp and perhaps their understanding. Rather than bending history to their will, presidents and those around them are much more likely to dance to history’s tune. Only the illusions churned out by public relations apparatchiks and perpetuated by celebrity-worshipping journalists prevent us from seeing that those inhabiting the inner sanctum of the West Wing are agents more than independent actors. Although as human beings they may be interesting, very few can claim more than marginal historical significance. So while the account that follows discusses various personalities—not only politicians but also soldiers, intellectuals, and religious leaders—it uses them as vehicles to highlight the larger processes that are afoot.

Appreciating the limits of human agency becomes particularly relevant when considering remedial action. If a problem is bigger than a particular president or single administration—as I believe the problem of American militarism to be—then simply getting rid of that president will not make that problem go away. To pretend otherwise serves no purpose.

..................

The bellicose character of U.S. policy after 9/11, culminating with the American-led invasion of Iraq in March 2003, has, in fact, evoked charges of militarism from across the political spectrum. Prominent among the accounts advancing that charge are books such as The Sorrows of Empire: Militarism, Secrecy, and the End of the Republic, by Chalmers Johnson; Hegemony or Survival: America’s Quest for Global Dominance, by Noam Chomsky; Masters of War: Militarism and Blowback in the Era of American Empire, edited by Carl Boggs; Rogue Nation: American Unilateralism and the Failure of Good Intentions, by Clyde Prestowitz; and Incoherent Empire, by Michael Mann, with its concluding chapter called “The New Militarism.”

Each of these books appeared in 2003 or 2004. Each was not only written in the aftermath of 9/11 but responded specifically to the policies of the Bush administration, above all to its determined efforts to promote and justify a war to overthrow Saddam Hussein.

As the titles alone suggest and the contents amply demonstrate, they are for the most part angry books. They indict more than explain, and whatever explanations they offer tend to be ad hominem. The authors of these books unite in heaping abuse on the head of George W. Bush, said to combine in a single individual intractable provincialism, religious zealotry, and the reckless temperament of a gunslinger. Or if not Bush himself, they finger his lieutenants, the cabal of warmongers, led by Vice President Dick Cheney and senior Defense Department officials, who whispered persuasively in the president’s ear and used him to do their bidding. Thus, according to Chalmers Johnson, ever since the Persian Gulf War of 1990–1991, Cheney and other key figures from that war had “wanted to go back and finish what they started.” Having lobbied unsuccessfully throughout the Clinton era “for aggression against Iraq and the remaking of the Middle East,” they had returned to power on Bush’s coattails. After they had “bided their time for nine months,” they had seized upon the crisis of 9/11 “to put their theories and plans into action,” pressing Bush to make Saddam Hussein number one on his hit list. By implication, militarism becomes something of a conspiracy foisted on a malleable president and an unsuspecting people by a handful of wild-eyed ideologues.

By further implication, the remedy for American militarism is self-evident: “Throw the new militarists out of office,” as Michael Mann urges, and a more balanced attitude toward military power will presumably reassert itself.

As a contribution to the ongoing debate about U.S. policy, The New American Militarism rejects such notions as simplistic. It refuses to lay the responsibility for American militarism at the feet of a particular president or a particular set of advisers and argues that no particular presidential election holds the promise of radically changing it. Charging George W. Bush with responsibility for the militaristic tendencies of present-day U.S. foreign policy makes as much sense as holding Herbert Hoover culpable for the Great Depression: whatever its psychic satisfactions, it is an exercise in scapegoating that lets too many others off the hook and allows society at large to abdicate responsibility for what has come to pass.

The point is not to deprive George W. Bush or his advisers of whatever credit or blame they may deserve for conjuring up the several large-scale campaigns and myriad lesser military actions comprising their war on terror. They have certainly taken up the mantle of this militarism with a verve not seen in years. Rather it is to suggest that well before September 11, 2001, and before the younger Bush’s ascent to the presidency a militaristic predisposition was already in place both in official circles and among Americans more generally. In this regard, 9/11 deserves to be seen as an event that gave added impetus to already existing tendencies rather than as a turning point. For his part, President Bush himself ought to be seen as a player reciting his lines rather than as a playwright drafting an entirely new script.

In short, the argument offered here asserts that present-day American militarism has deep roots in the American past. It represents a bipartisan project. As a result, it is unlikely to disappear anytime soon, a point obscured by the myopia and personal animus tainting most accounts of how we have arrived at this point.

Great Book. Worth Reading.