Friday, January 15, 2010

Learning about Machine Learning

Bradford Cross has posted an awesome blog post (edit: removed link, since Bradford took down the post) titled "Learning about Statistical Learning". If you plan to work in ML, read the post, buy some of the books and work through them.

It could save you years of work if you are systematic from the beginning (I wasn't), especially if you are self-taught (I am).

I work in different domains (Robotics/Computer Vision/Simulation) than Bradford and so have a different list of books. Please read Bradford's lists first. This is a supplement to his awesome post rather than a replacement.

I assume you are a good developer and you have a solid grip on algorithm analysis etc. (though, that said, see the recommendations for Discrete Math books below).

The first step..

Learn proof techniques *first*. You'll make no serious progress till you do. The best book is

Velleman's "How to Prove It" - recommended by Bradford, but I am repeating it here because this is critical.


In my experience you need to be somewhat comfortable with 6 branches of Mathematics before you can tackle ML. Imo, it is best to take a year and get these right before venturing into ML proper. (I know, it sounds awfully boring. I wasted a lot of time trying to shorten this step. In this case, the long way is the real shortcut.)

(1) Calculus - best "lite" book - Calculus by Strang (free download),

best "heavy" books - Calculus by Spivak, Principles of Mathematical Analysis a.k.a. "Baby Rudin"

(2) Discrete Math (I don't know what to recommend here - I don't like Rosen's book) + a good book on algorithms; say, Introduction to Algorithms by Cormen et al. will do [*]

(3) Linear Algebra (First work through Strang's book, then Axler's)

(4) Probability (Bertsekas's book is a good one for those with no prior experience) and

(5) Statistics (I would recommend Devore and Peck for the total beginner, but it is a damn expensive book. So hit a library or get a bootlegged copy to see if it suits you before buying a copy. See Brad's list for advanced stuff.)

(6) Information Theory (MacKay's book is freely available online)

Basic AI.

Brad suggests Mitchell's book.

I think AIMA (3rd Edition) is much better, and I think it covers a lot more than Mitchell does. (I am biased. I wrote and maintained the Java code for a long while. Children, don't do this. Java is a terrible language to develop AI algorithms in. If you need the JVM, use Scala or Clojure.) Take a look at both. Pick one.

Machine Learning.

NB: you need all the linear algebra, calculus etc. worked through before you hit this point.

In order,

"Pattern Recognition and Machine Learning" by Christopher Bishop,

*then* "Elements of Statistical Learning" (free download).

Neural Networks:

In order,

Neural Network Design by Hagan, Demuth and Beale,

Neural Networks, A Comprehensive Foundation (2nd edition) by Haykin (there is a newer edition out but I don't know anything about that, this is the one I used)

and Neural Networks for Pattern Recognition (Bishop).

At this point you are in good shape to read any papers on neural networks. My recommendations - anything by Yann LeCun and Geoffrey Hinton. Both do amazing research.

Reinforcement Learning (again this is just stuff *I* happened to specialize in for various projects, so feel free to ignore)

Reinforcement Learning - An Introduction by Sutton and Barto (follow up with "Recent Advances In Reinforcement Learning" (PDF), which is an old paper but a GREAT introduction to *Hierarchical* Reinforcement Learning)

Neuro Dynamic Programming by Bertsekas

Computer Vision

Introductory Techniques for 3-D Computer Vision, by Emanuele Trucco and Alessandro Verri.

An Invitation to 3-D Vision by Y. Ma, S. Soatto, J. Kosecka, S.S. Sastry. (warning TOUGH!!)


I know only about the software/algorithms side of Robotics, and even there only Probabilistic Robotics. I don't know anything about hardware, electronics or Physics.

Probabilistic Graphical Models: Principles and Techniques (Adaptive Computation and Machine Learning) (strictly speaking not a robotics book, but a lot of the theory in this book is behind the algorithms in the next book)

Probabilistic Robotics (Intelligent Robotics and Autonomous Agents) by Thrun, Burgard and Fox (trivia: Thrun also wrote the Robotics chapter in AIMA - did I tell you AIMA rocks as a first introduction to AI?)

And that's all folks. Happy hacking!

[*] Working through Cormen et al. is a humungous task and can easily consume a year or more of work. Something like Sally Goldman's new Algorithm book may be more suited to programmers.

PS: I have been getting a lot of email asking *how* one should learn X or Y. I have no idea really. The above is a list of books that worked for me and is provided only in the spirit of "these are good books that worked for me; I don't know if they'll work for you."

As to how I learned: I just read books and papers, tried to understand them (a lot of banging my head against the wall at this point), tried to solve problems and coded stuff. Beyond that I have no advice on how to learn effectively etc. I am entirely self-taught and have no idea how to teach this stuff. You probably need to talk to a good prof.


Anonymous said...

Hi Ravi
Not related to this post and sorry for cluttering.

I am planning to read Sedgewick's DS & Algo book. How do you recommend approaching the topic of DS & algo?
Should I remember every algo and how to implement it in practical situations, i.e. approach each DS & algo from how it is used in the real world?

Please advise. Thanks

Ravi said...

No harm working through Sedgewick.

Don't bother memorizing tonnes of algorithms. Actually I don't know how to tell you how to learn algorithms. Working through a good book and the exercises should be sufficient I think.

Oh yeah did you finish SICP yet? ;-)

AKHTAR said...

Nice blog, Ravi. I recently joined a course in Machine Learning with an institution called GuruPrevails Bangalore. We are going through Linear Algebra lectures as a start and following Strang's book as our standard text. I was just curious about you. I work for a data profiling team in a software company and wanted to move into core areas of algorithms. Does a formal degree is really important in getting into serious machine learning jobs or it's more abt individual skills?

Ravi said...

"Does a formal degree is really important in getting into serious machine learning jobs or it's more abt individual skills?"

I have no idea. I don't have a degree in CS, let alone ML or algorithms.

My policy is, if you are good enough, people seek you out. If they aren't, focus on getting better.

Unknown said...

Wow - those are a lot of books. Assuming you are only studying (not working), how much time would it take for someone to go through all of these books? I understand each individual's capacity is different, but still could you make an informed guess for an average individual?

If this question doesn't make sense then please try answering this - how long have you been reading all these?

I am curious because I would like to start on this path (of self education), and am thinking - can I quit my job long enough?

Ravi said...

"how long have you been reading all these?"

Eight years and counting. :-)

Matthew said...

Glad I came across your post. I'd hit the progress barrier for getting into ML without more serious math fundamentals.

Picked up Velleman and it's a damn good book. You might have saved me a fair bit of time. Thanks.

nikhil said...

Thanks for posting this book chain. Started working through it. Can you please recommend a book on probability? (For someone who needs to start from scratch, something like Gilbert Strang's Introduction to Linear Algebra.)

Bradford Cross's blog has been taken down, so the post you linked to is no longer available.

I have been preparing for the online Stanford classes. This post has been very helpful, and it's inspiring to see someone move from enterprise s/w dev to attaining such expertise in a chosen field. Thank you.


Ravi said...

Bertsekas's book is decent for probability.

Anonymous said...

A really nice book on Discrete Maths is - Tremblay and Manohar.

Rohit Arondekar said...

Although this is an old blog post, it is still relevant. AFAIK, besides me at least one more person has looked at this list[1].

For those coming here late, the blog post by Bradford that Ravi wanted to link to was taken down. However, it's still available via the Wayback Machine. Bradford wrote two articles:

Learning About Statistical Learning

Learning About Machine Learning 2nd Ed

The second blog post is a revision of the first after taking feedback from Hacker News, this very blog post and other sources. I hope this helps anybody coming here late! :)