<?xml version='1.0' encoding='UTF-8'?><?xml-stylesheet href="http://www.blogger.com/styles/atom.css" type="text/css"?><feed xmlns='http://www.w3.org/2005/Atom' xmlns:openSearch='http://a9.com/-/spec/opensearchrss/1.0/' xmlns:georss='http://www.georss.org/georss' xmlns:gd='http://schemas.google.com/g/2005' xmlns:thr='http://purl.org/syndication/thread/1.0'><id>tag:blogger.com,1999:blog-6475416749449483556</id><updated>2011-12-03T18:12:13.060-08:00</updated><title type='text'>Mechanistician</title><subtitle type='html'></subtitle><link rel='http://schemas.google.com/g/2005#feed' type='application/atom+xml' href='http://mechanistician.blogspot.com/feeds/posts/default'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/6475416749449483556/posts/default?max-results=100'/><link rel='alternate' type='text/html' href='http://mechanistician.blogspot.com/'/><link rel='hub' href='http://pubsubhubbub.appspot.com/'/><author><name>Mechanistician</name><uri>http://www.blogger.com/profile/04405962471175755507</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><generator version='7.00' uri='http://www.blogger.com'>Blogger</generator><openSearch:totalResults>82</openSearch:totalResults><openSearch:startIndex>1</openSearch:startIndex><openSearch:itemsPerPage>100</openSearch:itemsPerPage><entry><id>tag:blogger.com,1999:blog-6475416749449483556.post-8153996193150768842</id><published>2011-09-29T15:00:00.000-07:00</published><updated>2011-09-29T15:03:53.865-07:00</updated><title type='text'>Methods for comparing rankings of search engine results</title><content type='html'>Interesting visualization from a &lt;a href="http://arxiv.org/pdf/cs/0505039v1"&gt;paper on comparing search engine results&lt;/a&gt;:&lt;br /&gt;&lt;br /&gt;&lt;a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" 
href="http://2.bp.blogspot.com/-v22K4fYLTeU/ToTq6Pkb4EI/AAAAAAAAAMA/ntNJnN2wIsY/s1600/dna_query.png"&gt;&lt;img style="cursor: pointer; width: 400px; height: 242px;" src="http://2.bp.blogspot.com/-v22K4fYLTeU/ToTq6Pkb4EI/AAAAAAAAAMA/ntNJnN2wIsY/s400/dna_query.png" alt="" id="BLOGGER_PHOTO_ID_5657905318216851522" border="0" /&gt;&lt;/a&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/6475416749449483556-8153996193150768842?l=mechanistician.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://mechanistician.blogspot.com/feeds/8153996193150768842/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://mechanistician.blogspot.com/2011/09/methods-for-comparing-rankings-of.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/6475416749449483556/posts/default/8153996193150768842'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/6475416749449483556/posts/default/8153996193150768842'/><link rel='alternate' type='text/html' href='http://mechanistician.blogspot.com/2011/09/methods-for-comparing-rankings-of.html' title='Methods for comparing rankings of search engine results'/><author><name>Mechanistician</name><uri>http://www.blogger.com/profile/04405962471175755507</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' url='http://2.bp.blogspot.com/-v22K4fYLTeU/ToTq6Pkb4EI/AAAAAAAAAMA/ntNJnN2wIsY/s72-c/dna_query.png' height='72' 
width='72'/><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-6475416749449483556.post-1388801797085207904</id><published>2011-08-05T20:31:00.000-07:00</published><updated>2011-08-05T20:38:55.855-07:00</updated><title type='text'>InMaps - visualizing your LinkedIn network</title><content type='html'>LinkedIn's &lt;a href="http://inmaps.linkedinlabs.com/"&gt;InMaps&lt;/a&gt; does a pretty good job at detecting communities within your connections.  It requires that you have at least 50 connections to use it (otherwise there is not enough info to make it interesting), but that threshold might still be too small, depending on your career trajectory.  For me, it got really good once I had 75+ connections:&lt;br /&gt;&lt;br /&gt;&lt;a href="http://2.bp.blogspot.com/-0o8fQAj7gJc/Tjy2v5l4IfI/AAAAAAAAAL4/LrUvaNjk8L8/s1600/li_conns.png"&gt;&lt;img style="cursor:pointer; cursor:hand;width: 400px; height: 383px;" src="http://2.bp.blogspot.com/-0o8fQAj7gJc/Tjy2v5l4IfI/AAAAAAAAAL4/LrUvaNjk8L8/s400/li_conns.png" alt="" id="BLOGGER_PHOTO_ID_5637581767590814194" border="0" /&gt;&lt;/a&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/6475416749449483556-1388801797085207904?l=mechanistician.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://mechanistician.blogspot.com/feeds/1388801797085207904/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://mechanistician.blogspot.com/2011/08/inmaps-visualizing-your-linkedin.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/6475416749449483556/posts/default/1388801797085207904'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/6475416749449483556/posts/default/1388801797085207904'/><link rel='alternate' type='text/html' 
href='http://mechanistician.blogspot.com/2011/08/inmaps-visualizing-your-linkedin.html' title='InMaps - visualizing your LinkedIn network'/><author><name>Mechanistician</name><uri>http://www.blogger.com/profile/04405962471175755507</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' url='http://2.bp.blogspot.com/-0o8fQAj7gJc/Tjy2v5l4IfI/AAAAAAAAAL4/LrUvaNjk8L8/s72-c/li_conns.png' height='72' width='72'/><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-6475416749449483556.post-677907357941766577</id><published>2011-07-02T15:37:00.000-07:00</published><updated>2011-07-09T08:52:27.957-07:00</updated><title type='text'>Search, Network, Analytics @ LinkedIn</title><content type='html'>Tons of interesting projects from LinkedIn: &lt;a href="http://sna-projects.com/sna/"&gt;SNA&lt;/a&gt;.&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;For those (still?) not familiar with the whole NoSQL movement, here is a nice intro, from a data modeling perspective: &lt;a href="http://highscalability.com/blog/2009/10/29/paper-no-relation-the-mixed-blessings-of-non-relational-data.html"&gt;No Relation - The Mixed Blessings of Non-Relational Databases&lt;/a&gt;.   
&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;I'm particularly fond of the approach taken in &lt;a href="http://ofps.oreilly.com/titles/9781449396107/"&gt;HBase&lt;/a&gt;, based on the &lt;a href="http://labs.google.com/papers/bigtable.html"&gt;BigTable&lt;/a&gt; paper from Google.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;As if that wasn't enough, here's something else to sink another couple of hours into: &lt;a href="http://videolectures.net/wsdm09_dean_cblirs/"&gt;Challenges in Building Large-Scale Information Retrieval Systems&lt;/a&gt;.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt; &lt;/div&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/6475416749449483556-677907357941766577?l=mechanistician.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://mechanistician.blogspot.com/feeds/677907357941766577/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://mechanistician.blogspot.com/2011/07/search-network-analytics-linkedin.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/6475416749449483556/posts/default/677907357941766577'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/6475416749449483556/posts/default/677907357941766577'/><link rel='alternate' type='text/html' href='http://mechanistician.blogspot.com/2011/07/search-network-analytics-linkedin.html' title='Search, Network, Analytics @ LinkedIn'/><author><name>Mechanistician</name><uri>http://www.blogger.com/profile/04405962471175755507</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' 
src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-6475416749449483556.post-8255788790935936706</id><published>2011-06-17T15:52:00.000-07:00</published><updated>2011-07-09T08:55:13.177-07:00</updated><title type='text'>Mining of massive datasets</title><content type='html'>Good resource for mining large datasets: &lt;a href="http://infolab.stanford.edu/~ullman/mmds.html"&gt;book&lt;/a&gt; and &lt;a href="http://www.stanford.edu/class/cs246/cs246-11-mmds/"&gt;class&lt;/a&gt;.  &lt;a href="http://en.wikipedia.org/wiki/MinHash"&gt;MinHash&lt;/a&gt;, a technique used in the &lt;a href="http://www2007.org/papers/paper570.pdf"&gt;Google News personalization system&lt;/a&gt;, is described in detail in the book.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/6475416749449483556-8255788790935936706?l=mechanistician.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://mechanistician.blogspot.com/feeds/8255788790935936706/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://mechanistician.blogspot.com/2011/06/mining-of-massive-datasets.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/6475416749449483556/posts/default/8255788790935936706'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/6475416749449483556/posts/default/8255788790935936706'/><link rel='alternate' type='text/html' href='http://mechanistician.blogspot.com/2011/06/mining-of-massive-datasets.html' title='Mining of massive datasets'/><author><name>Mechanistician</name><uri>http://www.blogger.com/profile/04405962471175755507</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' 
src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-6475416749449483556.post-474424383287905476</id><published>2011-05-15T11:36:00.000-07:00</published><updated>2011-05-25T15:12:48.526-07:00</updated><title type='text'>TED talk on issues with online personalization</title><content type='html'>&lt;iframe width="560" height="349" src="http://www.youtube.com/embed/B8ofWFx525s" frameborder="0" allowfullscreen&gt;&lt;/iframe&gt;&lt;br /&gt;&lt;br /&gt;Related post: &lt;a href="http://www.bigspaceship.com/blog/think/the-filter-bubble-algorithms-as-gatekeepers/"&gt;http://www.bigspaceship.com/blog/think/the-filter-bubble-algorithms-as-gatekeepers&lt;/a&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/6475416749449483556-474424383287905476?l=mechanistician.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://mechanistician.blogspot.com/feeds/474424383287905476/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://mechanistician.blogspot.com/2011/05/ted-talk-on-issues-with-online.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/6475416749449483556/posts/default/474424383287905476'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/6475416749449483556/posts/default/474424383287905476'/><link rel='alternate' type='text/html' href='http://mechanistician.blogspot.com/2011/05/ted-talk-on-issues-with-online.html' title='TED talk on issues with online personalization'/><author><name>Mechanistician</name><uri>http://www.blogger.com/profile/04405962471175755507</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' 
src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' url='http://img.youtube.com/vi/B8ofWFx525s/default.jpg' height='72' width='72'/><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-6475416749449483556.post-2350468960707153871</id><published>2011-03-11T11:45:00.001-08:00</published><updated>2011-03-20T20:56:21.967-07:00</updated><title type='text'>IBM's Watson</title><content type='html'>&lt;iframe title="YouTube video player" width="640" height="390" src="http://www.youtube.com/embed/3G2H3DZ8rNc" frameborder="0" allowfullscreen&gt;&lt;/iframe&gt;&lt;br /&gt;&lt;br /&gt;To build your own Watson, see this &lt;a href="https://www.ibm.com/developerworks/mydeveloperworks/blogs/InsideSystemStorage/entry/ibm_watson_how_to_build_your_own_watson_jr_in_your_basement7?lang=en"&gt;article&lt;/a&gt;.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/6475416749449483556-2350468960707153871?l=mechanistician.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://mechanistician.blogspot.com/feeds/2350468960707153871/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://mechanistician.blogspot.com/2011/03/ibms-watson.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/6475416749449483556/posts/default/2350468960707153871'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/6475416749449483556/posts/default/2350468960707153871'/><link rel='alternate' type='text/html' href='http://mechanistician.blogspot.com/2011/03/ibms-watson.html' title='IBM&apos;s Watson'/><author><name>Mechanistician</name><uri>http://www.blogger.com/profile/04405962471175755507</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' 
width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' url='http://img.youtube.com/vi/3G2H3DZ8rNc/default.jpg' height='72' width='72'/><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-6475416749449483556.post-6512301226017920059</id><published>2011-01-25T20:43:00.000-08:00</published><updated>2011-01-25T21:00:02.366-08:00</updated><title type='text'>Scala</title><content type='html'>I've been playing with &lt;a href="http://www.scala-lang.org/"&gt;Scala&lt;/a&gt; lately, and I have to admit it is growing on me.  Below is a simple recursive interpreter written in it for a small &lt;a href="http://en.wikipedia.org/wiki/Turing_completeness"&gt;Turing-complete&lt;/a&gt; language.  Check out how it keeps state as it executes (the 'state0' and 'update' methods, with 'update' returning the head function of a linked list of functions as new variables are assigned to, and 'state0' being the tail function, which returns 0 as a default for all unassigned variables):&lt;br /&gt;&lt;br /&gt;&lt;pre name="code" class="java"&gt;&lt;br /&gt;abstract class Aexp&lt;br /&gt;case class IntAexp(i: Int) extends Aexp&lt;br /&gt;case class VarAexp(v: String) extends Aexp&lt;br /&gt;case class SumAexp(a1: Aexp, a2: Aexp) extends Aexp&lt;br /&gt;case class MulAexp(a1: Aexp, a2: Aexp) extends Aexp&lt;br /&gt;&lt;br /&gt;abstract class Bexp&lt;br /&gt;case class BoolBexp(bool: Boolean) extends Bexp&lt;br /&gt;case class LessBexp(a1: Aexp, a2: Aexp) extends Bexp&lt;br /&gt;case class AndBexp(b1: Bexp, b2: Bexp) extends Bexp&lt;br /&gt;&lt;br /&gt;abstract class Cmd&lt;br /&gt;case class SkipCmd() extends Cmd&lt;br /&gt;case class AssignCmd(v: String, a: Aexp) extends Cmd&lt;br /&gt;case class SeqCmd(c1: Cmd, c2: Cmd) extends Cmd&lt;br /&gt;case class IfCmd(b: Bexp, c1: Cmd, c2: Cmd) extends Cmd&lt;br /&gt;case class WhileCmd(b: Bexp, c: Cmd) extends Cmd&lt;br /&gt;&lt;br /&gt;object 
eval {&lt;br /&gt;  type State = (String) =&gt; Int&lt;br /&gt;&lt;br /&gt;  def state0(v: String): Int = 0&lt;br /&gt;&lt;br /&gt;  def update(s: State, v: String, n: Int): State = { &lt;br /&gt;   v2 =&gt; { if (v.equals(v2)) n else s(v2) }&lt;br /&gt;  }&lt;br /&gt;&lt;br /&gt;  def A(a: Aexp, s: State): Int = {&lt;br /&gt;    a match {&lt;br /&gt;      case IntAexp(i) =&gt; i&lt;br /&gt;      case VarAexp(v) =&gt; s(v)&lt;br /&gt;      case SumAexp(a1, a2) =&gt; A(a1, s) + A(a2, s)&lt;br /&gt;      case MulAexp(a1, a2) =&gt; A(a1, s) * A(a2, s)&lt;br /&gt;    }&lt;br /&gt;  }&lt;br /&gt;&lt;br /&gt;  def B(b: Bexp, s: State): Boolean = {&lt;br /&gt;    b match {&lt;br /&gt;      case BoolBexp(bool) =&gt; bool&lt;br /&gt;      case LessBexp(a1, a2) =&gt; A(a1, s) &lt; A(a2, s)&lt;br /&gt;      case AndBexp(b1, b2) =&gt; B(b1, s) &amp;&amp; B(b2, s)&lt;br /&gt;    }&lt;br /&gt;  }&lt;br /&gt;&lt;br /&gt;  def bigstep(c: Cmd, s: State): State = {&lt;br /&gt;    c match {&lt;br /&gt;      case AssignCmd(v, a) =&gt; update(s, v, A(a, s))&lt;br /&gt;      case SkipCmd() =&gt; s&lt;br /&gt;      case SeqCmd(c1, c2) =&gt; bigstep(c2, bigstep(c1, s))&lt;br /&gt;      case IfCmd(b, c1, c2) =&gt; if (B(b, s)) bigstep(c1, s) else bigstep(c2, s)&lt;br /&gt;      case WhileCmd(b, c) =&gt; if (B(b, s)) bigstep(WhileCmd(b, c), bigstep(c, s)) else s&lt;br /&gt;    }&lt;br /&gt;  }&lt;br /&gt;&lt;br /&gt;  def main(args: Array[String]): Unit = {&lt;br /&gt;    var c: Cmd = WhileCmd(LessBexp(VarAexp("x"), IntAexp(10)),&lt;br /&gt;      AssignCmd("x", SumAexp(VarAexp("x"), IntAexp(1))))&lt;br /&gt;&lt;br /&gt;    System.out.println("Big-step evaluating: " + c)&lt;br /&gt;    val res: State = bigstep(c, state0)&lt;br /&gt;    System.out.println("Result: " + res("x") + "\n\n")&lt;br /&gt;  }&lt;br /&gt;}&lt;br /&gt;&lt;/pre&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' 
src='https://blogger.googleusercontent.com/tracker/6475416749449483556-6512301226017920059?l=mechanistician.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://mechanistician.blogspot.com/feeds/6512301226017920059/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://mechanistician.blogspot.com/2011/01/scala.html#comment-form' title='2 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/6475416749449483556/posts/default/6512301226017920059'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/6475416749449483556/posts/default/6512301226017920059'/><link rel='alternate' type='text/html' href='http://mechanistician.blogspot.com/2011/01/scala.html' title='Scala'/><author><name>Mechanistician</name><uri>http://www.blogger.com/profile/04405962471175755507</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>2</thr:total></entry><entry><id>tag:blogger.com,1999:blog-6475416749449483556.post-7389223890655180426</id><published>2011-01-10T14:57:00.000-08:00</published><updated>2011-01-25T21:00:50.142-08:00</updated><title type='text'>Talk on Ensemble methods</title><content type='html'>Upcoming interesting talk at the local Bay Area Artificial Intelligence Meetup Group: &lt;a href="http://meetu.ps/53w3"&gt;Diversity, Complexity, and Regularization in Ensemble Models&lt;/a&gt;.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/6475416749449483556-7389223890655180426?l=mechanistician.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://mechanistician.blogspot.com/feeds/7389223890655180426/comments/default' title='Post Comments'/><link rel='replies' 
type='text/html' href='http://mechanistician.blogspot.com/2011/01/diversity-complexity-and-regularization.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/6475416749449483556/posts/default/7389223890655180426'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/6475416749449483556/posts/default/7389223890655180426'/><link rel='alternate' type='text/html' href='http://mechanistician.blogspot.com/2011/01/diversity-complexity-and-regularization.html' title='Talk on Ensemble methods'/><author><name>Mechanistician</name><uri>http://www.blogger.com/profile/04405962471175755507</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-6475416749449483556.post-2116875073859274413</id><published>2010-12-13T13:46:00.001-08:00</published><updated>2010-12-13T14:16:32.699-08:00</updated><title type='text'>Gamma database machine</title><content type='html'>While the &lt;a href="http://mechanistician.blogspot.com/2010/12/blast-from-past-titanium-dbms.html"&gt;Titanium DBMS&lt;/a&gt; may not have gone anywhere, here's an example of an older system that was certainly influential in today's hot technologies: the &lt;a href="http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.113.6798&amp;amp;rep=rep1&amp;amp;type=pdf"&gt;Gamma database machine&lt;/a&gt;.  Contrast Gamma with a combination of &lt;a href="http://hadoop.apache.org/mapreduce/"&gt;Hadoop map-reduce&lt;/a&gt;, &lt;a href="http://pig.apache.org/"&gt;Pig&lt;/a&gt;, and &lt;a href="http://hbase.apache.org/"&gt;HBase&lt;/a&gt;.  
Another interesting system along those lines is &lt;a href="http://db.cs.yale.edu/hadoopdb/hadoopdb.html"&gt;HadoopDB&lt;/a&gt;.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/6475416749449483556-2116875073859274413?l=mechanistician.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://mechanistician.blogspot.com/feeds/2116875073859274413/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://mechanistician.blogspot.com/2010/12/gamma-database-machine.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/6475416749449483556/posts/default/2116875073859274413'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/6475416749449483556/posts/default/2116875073859274413'/><link rel='alternate' type='text/html' href='http://mechanistician.blogspot.com/2010/12/gamma-database-machine.html' title='Gamma database machine'/><author><name>Mechanistician</name><uri>http://www.blogger.com/profile/04405962471175755507</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-6475416749449483556.post-6252975586557894163</id><published>2010-12-13T13:22:00.001-08:00</published><updated>2010-12-13T14:15:54.837-08:00</updated><title type='text'>Blast from the past: Titanium DBMS</title><content type='html'>Came across an &lt;a href="http://www.drdobbs.com/184410397"&gt;old article&lt;/a&gt; from Dr. Dobb's while looking for something completely unrelated.  
It talks about some of the challenges faced by the developers of the system back then known as the NTC Ship Manager, then SafeNet, and now known as &lt;a href="http://www.abs-ns.com/Home-3.html"&gt;NS5&lt;/a&gt;, an Enterprise Resource Planning (ERP) system for the maritime shipping industry which I worked on some years ago (NS5, today, uses Java and MySQL).  Here is an article about Titanium from 1993, claiming it outperforms "traditional relational databases" by 5 to 1:&lt;br /&gt;&lt;br /&gt;&lt;iframe style="border: 0px none;" src="http://books.google.com/books?id=IDsEAAAAMBAJ&amp;amp;lpg=PA17&amp;amp;ots=etHskosaAh&amp;amp;dq=mdbs%20titanium&amp;amp;pg=PA17&amp;amp;output=embed" frameborder="0" height="500" scrolling="no" width="700"&gt;&lt;/iframe&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/6475416749449483556-6252975586557894163?l=mechanistician.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://mechanistician.blogspot.com/feeds/6252975586557894163/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://mechanistician.blogspot.com/2010/12/blast-from-past-titanium-dbms.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/6475416749449483556/posts/default/6252975586557894163'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/6475416749449483556/posts/default/6252975586557894163'/><link rel='alternate' type='text/html' href='http://mechanistician.blogspot.com/2010/12/blast-from-past-titanium-dbms.html' title='Blast from the past: Titanium DBMS'/><author><name>Mechanistician</name><uri>http://www.blogger.com/profile/04405962471175755507</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' 
src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-6475416749449483556.post-4210966288920473633</id><published>2010-10-25T18:24:00.001-07:00</published><updated>2010-10-25T18:24:48.915-07:00</updated><title type='text'>CrowdConf 2010</title><content type='html'>&lt;a href="http://crowdconf.com/vids.html"&gt;Videos&lt;/a&gt; of the conference are up.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/6475416749449483556-4210966288920473633?l=mechanistician.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://mechanistician.blogspot.com/feeds/4210966288920473633/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://mechanistician.blogspot.com/2010/10/crowdconf-2010.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/6475416749449483556/posts/default/4210966288920473633'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/6475416749449483556/posts/default/4210966288920473633'/><link rel='alternate' type='text/html' href='http://mechanistician.blogspot.com/2010/10/crowdconf-2010.html' title='CrowdConf 2010'/><author><name>Mechanistician</name><uri>http://www.blogger.com/profile/04405962471175755507</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-6475416749449483556.post-4121867711052024772</id><published>2010-10-24T08:53:00.000-07:00</published><updated>2010-10-25T11:47:49.649-07:00</updated><title type='text'>Adding a new operator to PostgreSQL</title><content type='html'>So, I'm adding a new operator to &lt;a 
href="http://en.wikipedia.org/wiki/PostgreSQL"&gt;PostgreSQL&lt;/a&gt; as part of a project for my &lt;a href="https://docs.google.com/Doc?docid=0Af0eNWmCGibiZGh3N3g3OV8yMGdoOHcyMmY1&amp;amp;hl=en&amp;amp;authkey=CJCh7Bk"&gt;database systems implementation class&lt;/a&gt; in order to enhance pg with the functionality described in the paper titled &lt;a href="http://users.soe.ucsc.edu/%7Ealkis/papers/553.pdf"&gt;A Scalable, Predictable Join Operator for Highly Concurrent Data Warehouses&lt;/a&gt;.&lt;span style="text-decoration: underline;"&gt;&lt;br /&gt;&lt;br /&gt;&lt;/span&gt;Fortunately pg uses &lt;a href="http://en.wikipedia.org/wiki/Flex_lexical_analyser"&gt;Flex&lt;/a&gt;/&lt;a href="http://en.wikipedia.org/wiki/GNU_Bison"&gt;Bison&lt;/a&gt; to manage parsing SQL, greatly simplifying the task of making modifications to the grammar.&lt;br /&gt;&lt;br /&gt;One of the steps for doing so includes adding the new keyword to the file &lt;a href="http://anoncvs.postgresql.org/cvsweb.cgi/pgsql/src/include/parser/kwlist.h?rev=1.13;content-type=text%2Fplain"&gt;pgsql/src/include/parser/kwlist.h&lt;/a&gt;, an excerpt of which is shown here:&lt;br /&gt;&lt;br /&gt;&lt;pre name="code" class="c"&gt;&lt;br /&gt;PG_KEYWORD("isnull", ISNULL, TYPE_FUNC_NAME_KEYWORD)&lt;br /&gt;PG_KEYWORD("isolation", ISOLATION, UNRESERVED_KEYWORD)&lt;br /&gt;PG_KEYWORD("join", JOIN, TYPE_FUNC_NAME_KEYWORD)&lt;br /&gt;PG_KEYWORD("key", KEY, UNRESERVED_KEYWORD)&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;The new operator described in the paper is aptly named CJOIN, so naturally, I defined my keyword right below the existing JOIN keyword:&lt;br /&gt;&lt;br /&gt;&lt;pre name="code" class="c"&gt;&lt;br /&gt;PG_KEYWORD("isnull", ISNULL, TYPE_FUNC_NAME_KEYWORD)&lt;br /&gt;PG_KEYWORD("isolation", ISOLATION, UNRESERVED_KEYWORD)&lt;br /&gt;PG_KEYWORD("join", JOIN, TYPE_FUNC_NAME_KEYWORD)&lt;br /&gt;PG_KEYWORD("cjoin", CJOIN, TYPE_FUNC_NAME_KEYWORD)&lt;br /&gt;PG_KEYWORD("key", KEY, 
UNRESERVED_KEYWORD)&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;Only to find out my changes weren't being recognized (i.e. I would issue a cjoin query from the client and the server would still complain of a syntax error).  Why?  Where in the &lt;a href="http://www.informationweek.com/news/security/showArticle.jhtml?articleID=205600229"&gt;nearly 1 million lines of code&lt;/a&gt; had I missed something?&lt;br /&gt;&lt;br /&gt;Well, it turns out that the alphabetical order of the list in kwlist.h was neither a coincidence nor a matter of neatness; it actually affects execution, as explained in the comment at the beginning of that file:&lt;br /&gt;&lt;br /&gt;&lt;pre name="code" class="c"&gt;&lt;br /&gt;/*&lt;br /&gt;* List of keyword (name, token-value, category) entries.&lt;br /&gt;*&lt;br /&gt;* !!WARNING!!: This list must be sorted by ASCII name, because binary&lt;br /&gt;*   search is used to locate entries.&lt;br /&gt;*/&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;D'oh!  Alphabetically, cjoin belongs well before join, so my entry broke the ordering that the binary search relies on.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/6475416749449483556-4121867711052024772?l=mechanistician.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://mechanistician.blogspot.com/feeds/4121867711052024772/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://mechanistician.blogspot.com/2010/10/adding-new-keyword-to-postgresql.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/6475416749449483556/posts/default/4121867711052024772'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/6475416749449483556/posts/default/4121867711052024772'/><link rel='alternate' type='text/html' href='http://mechanistician.blogspot.com/2010/10/adding-new-keyword-to-postgresql.html' title='Adding a new operator to 
PostgreSQL'/><author><name>Mechanistician</name><uri>http://www.blogger.com/profile/04405962471175755507</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-6475416749449483556.post-6813916076888576840</id><published>2010-10-15T17:47:00.000-07:00</published><updated>2010-10-15T17:48:16.099-07:00</updated><title type='text'>How to start a social movement</title><content type='html'>Interesting video on the dynamics of starting a social movement:&lt;br /&gt;&lt;br /&gt;&lt;object width="640" height="390"&gt;&lt;param name="movie" value="http://www.youtube.com/v/fW8amMCVAJQ&amp;hl=en_US&amp;feature=player_embedded&amp;version=3"&gt;&lt;/param&gt;&lt;param name="allowFullScreen" value="true"&gt;&lt;/param&gt;&lt;param name="allowScriptAccess" value="always"&gt;&lt;/param&gt;&lt;embed src="http://www.youtube.com/v/fW8amMCVAJQ&amp;hl=en_US&amp;feature=player_embedded&amp;version=3" type="application/x-shockwave-flash" allowfullscreen="true" allowScriptAccess="always" width="640" height="390"&gt;&lt;/embed&gt;&lt;/object&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/6475416749449483556-6813916076888576840?l=mechanistician.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://mechanistician.blogspot.com/feeds/6813916076888576840/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://mechanistician.blogspot.com/2010/10/how-to-start-social-movement.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/6475416749449483556/posts/default/6813916076888576840'/><link rel='self' type='application/atom+xml' 
href='http://www.blogger.com/feeds/6475416749449483556/posts/default/6813916076888576840'/><link rel='alternate' type='text/html' href='http://mechanistician.blogspot.com/2010/10/how-to-start-social-movement.html' title='How to start a social movement'/><author><name>Mechanistician</name><uri>http://www.blogger.com/profile/04405962471175755507</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-6475416749449483556.post-8483212378738627320</id><published>2010-09-30T21:26:00.000-07:00</published><updated>2010-10-01T18:51:48.191-07:00</updated><title type='text'>Crowdsourcing</title><content type='html'>This Monday (10/4) is &lt;a href="http://crowdconf.com/"&gt;crowdconf&lt;/a&gt;, the "world's first conference on the future of distributed work."&lt;br /&gt;&lt;br /&gt;Here are some links for papers/research on &lt;a href="http://en.wikipedia.org/wiki/Crowdsourcing"&gt;crowdsourcing&lt;/a&gt;:&lt;br /&gt;&lt;ul&gt;&lt;li&gt;A &lt;a href="http://hcomp2009.wikispaces.com/"&gt;bunch of papers&lt;/a&gt; from the &lt;a href="http://www.hcomp2009.org/Home.html"&gt;Human Computation Workshop in 2009&lt;/a&gt;&lt;/li&gt;&lt;li&gt;&lt;a href="http://www.cs.cmu.edu/%7Ebiglou/research.html"&gt;Luis von Ahn's research&lt;/a&gt; (creator of the &lt;a href="http://video.google.com/videoplay?docid=-8246463980976635143#"&gt;ESP game&lt;/a&gt;)&lt;/li&gt;&lt;li&gt;&lt;a href="http://glittle.org/Papers/TurKit.pdf"&gt;TurKit: tools for iterative tasks on Mechanical Turk&lt;/a&gt;&lt;/li&gt;&lt;li&gt;&lt;a href="http://vark.com/aardvarkFinalWWW2010.pdf"&gt;The anatomy of a large scale social search engine&lt;/a&gt;&lt;/li&gt;&lt;li&gt;&lt;a href="http://wiki.freebase.com/images/e/e0/Hcomp10-anatomy.pdf"&gt;The anatomy of a large-scale human-computation 
engine&lt;/a&gt;&lt;/li&gt;&lt;li&gt;&lt;a href="http://users.soe.ucsc.edu/%7Eorazio/papers/DavisACVHL_CVPR10.pdf"&gt;The HPU (Human Processing Unit)&lt;/a&gt;&lt;/li&gt;&lt;li&gt;&lt;a href="http://www.iis.sinica.edu.tw/%7Ecclljj/publication/2009/09_SCA-HCOMP.pdf"&gt;A survey of human computation systems&lt;/a&gt;&lt;br /&gt;&lt;/li&gt;&lt;/ul&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/6475416749449483556-8483212378738627320?l=mechanistician.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://mechanistician.blogspot.com/feeds/8483212378738627320/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://mechanistician.blogspot.com/2010/09/crowdsourcing.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/6475416749449483556/posts/default/8483212378738627320'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/6475416749449483556/posts/default/8483212378738627320'/><link rel='alternate' type='text/html' href='http://mechanistician.blogspot.com/2010/09/crowdsourcing.html' title='Crowdsourcing'/><author><name>Mechanistician</name><uri>http://www.blogger.com/profile/04405962471175755507</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-6475416749449483556.post-1160532246739972984</id><published>2010-08-05T22:10:00.000-07:00</published><updated>2010-08-05T22:18:00.595-07:00</updated><title type='text'>MapReduce algorithm design book</title><content type='html'>&lt;a href="http://www.umiacs.umd.edu/%7Ejimmylin/book.html"&gt;Jimmy Lin's book&lt;/a&gt; is a great resource for people writing map-reduce  algorithms 
(the final pre-production manuscript is available as a free download).&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/6475416749449483556-1160532246739972984?l=mechanistician.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://mechanistician.blogspot.com/feeds/1160532246739972984/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://mechanistician.blogspot.com/2010/08/mapreduce-algorithm-design-book.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/6475416749449483556/posts/default/1160532246739972984'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/6475416749449483556/posts/default/1160532246739972984'/><link rel='alternate' type='text/html' href='http://mechanistician.blogspot.com/2010/08/mapreduce-algorithm-design-book.html' title='MapReduce algorithm design book'/><author><name>Mechanistician</name><uri>http://www.blogger.com/profile/04405962471175755507</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-6475416749449483556.post-7026974262307487492</id><published>2010-07-26T02:45:00.000-07:00</published><updated>2010-07-26T04:17:13.663-07:00</updated><title type='text'>Architecture for Highly Concurrent Server Apps</title><content type='html'>Came across this paper today, &lt;a href="http://www.eecs.harvard.edu/%7Emdw/papers/seda-sosp01.pdf"&gt;SEDA: An Architecture for Well-Conditioned, Scalable Internet Services&lt;/a&gt;.  
From the &lt;a href="http://www.eecs.harvard.edu/%7Emdw/proj/seda/"&gt;author's website&lt;/a&gt;:&lt;br /&gt;&lt;br /&gt;"SEDA is an acronym for staged event-driven architecture, and decomposes a complex, event-driven application into a set of stages connected by queues. This design avoids the high overhead associated with thread-based concurrency models, and decouples event and thread scheduling from application logic. By performing admission control on each event queue, the service can be well-conditioned to load, preventing resources from being overcommitted when demand exceeds service capacity. SEDA employs dynamic control to automatically tune runtime parameters (such as the scheduling parameters of each stage), as well as to manage load, for example, by performing adaptive load shedding. Decomposing services into a set of stages also enables modularity and code reuse, as well as the development of debugging tools for complex event-driven applications."&lt;br /&gt;&lt;br /&gt;Even though SEDA hasn't been worked on since 2002 (you can download the code from the website), it presents some very interesting ideas.  For completeness, here's one paper (&lt;a href="http://capriccio.cs.berkeley.edu/pubs/threads-hotos-2003.pdf"&gt;Why Events Are A Bad Idea (for high-concurrency servers)&lt;/a&gt;) that critiques SEDA's performance, while this other one (&lt;a href="http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.141.5192&amp;amp;rep=rep1&amp;amp;type=pdf"&gt;Comparing the Performance of Web Server Architectures&lt;/a&gt;) presents findings in favor of a SEDA-like architecture.&lt;br /&gt;&lt;br /&gt;In the Java world, the event-based model for building highly scalable server applications is preferred over the simpler thread-per-connection model, and this article (&lt;a href="http://today.java.net/article/2007/02/08/architecture-highly-scalable-nio-based-server"&gt;Architecture of a highly scalable NIO-based server&lt;/a&gt;) provides a good intro.  
&lt;a href="http://jboss.org/netty"&gt;Netty&lt;/a&gt;, an open-source project built around the concepts presented in the article and in SEDA, &lt;span class="highlight"&gt;provides "an asynchronous event-driven network application framework and tools&lt;/span&gt; for rapid development of maintainable high performance &amp;amp; high scalability protocol servers &amp;amp; clients."&lt;br /&gt;&lt;br /&gt;Update 1: This &lt;a href="http://www.kegel.com/c10k.html"&gt;page&lt;/a&gt; also has a lot of good info on the topic.&lt;br /&gt;Update 2: This &lt;a href="http://paultyma.blogspot.com/2008/03/writing-java-multithreaded-servers.html"&gt;post&lt;/a&gt; also presents some interesting findings.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/6475416749449483556-7026974262307487492?l=mechanistician.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://mechanistician.blogspot.com/feeds/7026974262307487492/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://mechanistician.blogspot.com/2010/07/architecture-for-highly-concurrent.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/6475416749449483556/posts/default/7026974262307487492'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/6475416749449483556/posts/default/7026974262307487492'/><link rel='alternate' type='text/html' href='http://mechanistician.blogspot.com/2010/07/architecture-for-highly-concurrent.html' title='Architecture for Highly Concurrent Server Apps'/><author><name>Mechanistician</name><uri>http://www.blogger.com/profile/04405962471175755507</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' 
src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-6475416749449483556.post-606011733220118085</id><published>2010-07-07T21:50:00.001-07:00</published><updated>2010-07-12T23:14:35.322-07:00</updated><title type='text'>Large-Scale Machine Learning with Mahout - Part 3</title><content type='html'>RecommenderJob essentially coordinates a series of map-reduce tasks to produce recommendations for the specified users from the specified training data.  It performs the computation in a batch fashion and has no support for incremental processing.  For example, considering the GroupLens dataset mentioned in the previous post, if you suddenly received another 100k ratings, RecommenderJob would have to process the accumulated 200k ratings in one shot to compute new recommendations based on the latest training data.  Despite this and other limitations, it illustrates Hadoop's map-reduce paradigm well.&lt;br /&gt;&lt;br /&gt;RecommenderJob contains 4 map-reduce pairs, in the following order:&lt;br /&gt;&lt;ol&gt;&lt;li&gt;ItemIDIndexMapper and ItemIDIndexReducer&lt;/li&gt;&lt;li&gt;ToItemPrefsMapper and ToUserVectorReducer&lt;/li&gt;&lt;li&gt;UserVectorToCooccurrenceMapper and UserVectorToCooccurrenceReducer&lt;/li&gt;&lt;li&gt;RecommenderMapper and IdentityReducer&lt;/li&gt;&lt;/ol&gt;You can think of the data flowing from mapper to reducer, and from map-reduce pair to map-reduce pair.&lt;br /&gt;&lt;br /&gt;The first map-reduce pair is configured as follows:&lt;br /&gt;&lt;br /&gt;&lt;pre name="code" class="java"&gt;&lt;br /&gt;    JobConf itemIDIndexConf = AbstractJob.prepareJobConf(inputPath, itemIDIndexPath, jarFile,&lt;br /&gt;      TextInputFormat.class, ItemIDIndexMapper.class, IntWritable.class, LongWritable.class,&lt;br /&gt;      ItemIDIndexReducer.class, IntWritable.class, LongWritable.class, MapFileOutputFormat.class);&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;&lt;br 
/&gt;Where inputPath is a directory containing one or more files, each line in a file being a record of interest; it is the input to the mapper.  itemIDIndexPath is the path to the directory where the output of the reducer will be saved.  TextInputFormat is a class that handles a specific file format and will be used to process the files in inputPath.  ItemIDIndexMapper is the mapper class.  IntWritable and LongWritable are the output key and value types for the mapper.  ItemIDIndexReducer is the reducer class.  The second IntWritable and LongWritable are the output key and value types of the reducer.  MapFileOutputFormat is the class that handles the format of the reducer's output.&lt;br /&gt;&lt;br /&gt;Once this first job is configured, it is run by the JobClient.  The goal of the first map-reduce pair in this job is to handle the possibility that the range of itemIDs falls outside the range of Java's int.  The reason we would like to map itemIDs to ints is that data structures in the remaining tasks contained in the job (arrays) use ints as the indexing method.  So, the job will hash all (long) itemIDs to ints (collisions can occur, and in this case they are unhandled) and save that map in the output file, which not coincidentally is of type MapFileOutputFormat.  The possibility that the range of int would not be sufficient to uniquely identify all itemIDs is something that would be handled in a more robust recommender.&lt;br /&gt;&lt;br /&gt;Note that the hash function, idToIndex, has been corrected in Mahout's trunk so as to not return negative values:&lt;br /&gt;&lt;br /&gt;&lt;pre name="code" class="java"&gt;&lt;br /&gt;static int idToIndex(long itemID) {&lt;br /&gt;  return 0x7FFFFFFF &amp;amp; ((int) itemID ^ (int) (itemID &gt;&gt;&gt; 32));&lt;br /&gt;}&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;ItemIDIndexMapper's map method is called once per line in the input file.  
Each time, the key will be the byte offset of the line within the file and the value will be the line itself.  This is so because that is what TextInputFormat, which we specified, does.  In turn, the output will be another set of key-value pairs, the key being the hash value of the itemID and the value being the itemID.&lt;br /&gt;&lt;br /&gt;What happens next, the magic that Hadoop performs between the mapper and the reducer and which is commonly known as the "shuffle," is complicated.  Tom White's &lt;a href="http://oreilly.com/catalog/9780596521981"&gt;book&lt;/a&gt; has a great explanation.  Another good and concise explanation can be found on page 3 of this &lt;a href="http://neilconway.org/docs/nsdi2010_hop.pdf"&gt;paper&lt;/a&gt;, which discusses modifications to Hadoop so as to enable data to be pipelined between tasks and between jobs.&lt;br /&gt;&lt;br /&gt;Once it comes time to run ItemIDIndexReducer's reduce method, Hadoop has gathered all values with the same key into a list.  The reduce method will then choose the smallest of those values and output it to be the corresponding value (itemID) for the key (hashed itemID).  This gets saved into a file-based map-like data structure, &lt;a href="http://hadoop.apache.org/common/docs/current/api/org/apache/hadoop/io/MapFile.html"&gt;MapFile&lt;/a&gt;, in the directory specified by itemIDIndexPath.&lt;br /&gt;&lt;br /&gt;The second map-reduce task is configured as follows:&lt;br /&gt;&lt;br /&gt;&lt;pre name="code" class="java"&gt;&lt;br /&gt;    JobConf toUserVectorConf = AbstractJob.prepareJobConf(inputPath, userVectorPath, jarFile,&lt;br /&gt;      TextInputFormat.class, ToItemPrefsMapper.class, LongWritable.class, ItemPrefWritable.class,&lt;br /&gt;      ToUserVectorReducer.class, LongWritable.class, VectorWritable.class, SequenceFileOutputFormat.class);&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;Its purpose is to create the user preference vector mentioned in the previous post.  
It uses the same input as the previous map-reduce pair and the reducer outputs a file composed of a sequence of key-vector instances.  This SequenceFile differs only slightly from the MapFile used in the previous task.  The MapFile can be thought of as a SequenceFile that has an index that allows random access lookups by key.  The SequenceFile, on the other hand, as its name suggests, is meant to be accessed sequentially.&lt;br /&gt;&lt;br /&gt;Notice that ToUserVectorReducer hashes the itemID using the same method as ItemIDIndexMapper.  This is because the user preference vector will be stored in a RandomAccessSparseVector, which is a HashMap-like structure, where keys are ints and values are doubles.  I have not come across a RandomAccessSparseVector in Mahout that allows for long keys, but one sure would be handy.&lt;br /&gt;&lt;br /&gt;As we follow along in ToUserVectorReducer's reduce method, we see that the user preference vector is limited to a configurable number of preferences, and by default that limit is set to 20.  This surfaces another area that can be improved in the current implementation: even if it allowed for itemIDs to be longs, it does not account for a user preference vector that does not fit in RAM.  I do not mean these observations as criticism of Mahout, which already in its early stages has a tremendous amount of useful functionality; they are simply meant to get your mind thinking about some of the issues you could solve should you venture into building your own recommendation system or into contributing to Mahout itself.  This size limit on the user preference vector applies only to non-zero values, since the RandomAccessSparseVector is initialized to a cardinality of Integer.MAX_VALUE.  The cardinality is the 'effective size' of the RandomAccessSparseVector, meaning that it represents the number of allowable indices into the vector.  
RandomAccessSparseVector is useful for storing sparse vectors, as it only keeps track of the non-default (that is, non-zero) values stored in it.&lt;br /&gt;&lt;br /&gt;The method that it uses to limit the size of the user preference vector won't exactly cut its length at the limit, but it does a decent job at picking a user's top preferred items.  On the other hand, this kind of scheme would not necessarily work well if your problem domain denotes preferences as binary values, such as the one mentioned in the Google News personalization system described in this &lt;a href="http://www2007.org/papers/paper570.pdf"&gt;paper&lt;/a&gt; (and you can also watch a Google talk video about it &lt;a href="http://videolectures.net/google_datar_gnp/"&gt;here&lt;/a&gt;).&lt;br /&gt;&lt;br /&gt;The third map-reduce task is configured as follows:&lt;br /&gt;&lt;br /&gt;&lt;pre name="code" class="java"&gt;&lt;br /&gt;    JobConf toCooccurrenceConf = AbstractJob.prepareJobConf(userVectorPath, cooccurrencePath, jarFile,&lt;br /&gt;      SequenceFileInputFormat.class, UserVectorToCooccurrenceMapper.class, IntWritable.class,&lt;br /&gt;      IntWritable.class, UserVectorToCooccurrenceReducer.class, IntWritable.class, VectorWritable.class,&lt;br /&gt;      MapFileOutputFormat.class);&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;Its purpose is to build the other major data structure RecommenderJob computes: the cooccurrence matrix.  This job will read the user preference vector file computed in the previous job and save the computed cooccurrence matrix to the cooccurrencePath directory in the form of a MapFile.&lt;br /&gt;&lt;br /&gt;UserVectorToCooccurrenceMapper's map method is called once for each user preference vector, and it will output every permutation of 2 itemIDs for that user (excluding pairs of the same itemID), in other words, the items that co-occur in a user's preference list.  
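&lt;br /&gt;&lt;br /&gt;As a rough illustration of that pair emission, here is a minimal sketch.  It is not Mahout's actual mapper: the class name CooccurrencePairSketch, the method orderedPairs, and the plain int arrays are stand-ins made up for this example, and the item indices are assumed to be distinct.&lt;br /&gt;&lt;br /&gt;

```java
import java.util.Arrays;

public class CooccurrencePairSketch {

    /** Returns every ordered pair (key, value) of distinct entries in itemIndices. */
    static int[][] orderedPairs(int[] itemIndices) {
        int n = itemIndices.length;
        // n items give n * (n - 1) ordered pairs once same-item pairs are skipped.
        int[][] pairs = new int[n * (n - 1)][];
        int next = 0;
        for (int key : itemIndices) {
            for (int value : itemIndices) {
                if (key != value) {
                    pairs[next++] = new int[] {key, value};
                }
            }
        }
        return pairs;
    }

    public static void main(String[] args) {
        // Hypothetical hashed item indices taken from one user's preference vector.
        for (int[] pair : orderedPairs(new int[] {123, 456, 789})) {
            System.out.println(Arrays.toString(pair));
        }
    }
}
```

&lt;br /&gt;&lt;br /&gt;Under these assumptions, a user with n items yields n * (n - 1) ordered pairs, so both directions of every pair show up in the output.&lt;br /&gt;&lt;br /&gt;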
Note that the mapper outputs seemingly repeated pairs; for example, it will output [Key:itemID=123, value:itemID=456] and [Key:itemID=456, value:itemID=123].  This is needed because we are gonna build the rows (or columns, since the matrix is symmetrical) of the cooccurrence matrix and we need to access each row (or column) for every itemID.  Hadoop will then collect all values for a given key, and feed the key and list of values into the reducer.&lt;br /&gt;&lt;br /&gt;UserVectorToCooccurrenceReducer's reduce method will build one RandomAccessSparseVector for every itemID (hashed itemID, that is); these are the rows (or columns) of the matrix.  It simply iterates over the items in cooccurrenceRow and keeps track of how many times it sees a given item.  It also has a small optimization where it zeros out any cooccurrence count of 1.  Finally each cooccurrenceRow will be saved into a MapFile, keyed by itemID, so it can be quickly accessed in random order.&lt;br /&gt;&lt;br /&gt;The fourth, and last, map-reduce task is configured as follows:&lt;br /&gt;&lt;br /&gt;&lt;pre name="code" class="java"&gt;&lt;br /&gt;    JobConf recommenderConf = AbstractJob.prepareJobConf(userVectorPath, outputPath, jarFile,&lt;br /&gt;      SequenceFileInputFormat.class, RecommenderMapper.class, LongWritable.class,&lt;br /&gt;      RecommendedItemsWritable.class, IdentityReducer.class, LongWritable.class,&lt;br /&gt;      RecommendedItemsWritable.class, TextOutputFormat.class);&lt;br /&gt;    recommenderConf.set(RecommenderMapper.COOCCURRENCE_PATH, cooccurrencePath);&lt;br /&gt;    recommenderConf.set(RecommenderMapper.ITEMID_INDEX_PATH, itemIDIndexPath);&lt;br /&gt;    recommenderConf.setInt(RecommenderMapper.RECOMMENDATIONS_PER_USER, recommendationsPerUser);&lt;br /&gt;    recommenderConf.set(RecommenderMapper.USERS_FILE, usersFile);&lt;br /&gt;    recommenderConf.setClass("mapred.output.compression.codec", GzipCodec.class, CompressionCodec.class);&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;The 
additional job configuration settings simply make available information such as paths of previously computed output files that this job will need and other settings such as how many recommendations are to be made to each user.  It is also specified that the output of this job is to be compressed.&lt;br /&gt;&lt;br /&gt;The reducer specified in this job is the IdentityReducer, which simply outputs its input (notice it is deprecated, as there is a newer API that simplifies things a bit - the concepts, however, remain the same in the newer API):&lt;br /&gt;&lt;br /&gt;&lt;pre name="code" class="java"&gt;&lt;br /&gt;@Deprecated&lt;br /&gt;public class IdentityReducer&amp;lt;K, V&amp;gt;&lt;br /&gt;    extends MapReduceBase implements Reducer&amp;lt;K, V, K, V&amp;gt; {&lt;br /&gt;&lt;br /&gt;  /** Writes all keys and values directly to output. */&lt;br /&gt;  public void reduce(K key, Iterator&amp;lt;V&amp;gt; values,&lt;br /&gt;                     OutputCollector&amp;lt;K, V&amp;gt; output, Reporter reporter)&lt;br /&gt;    throws IOException {&lt;br /&gt;    while (values.hasNext()) {&lt;br /&gt;      output.collect(key, values.next());&lt;br /&gt;    }&lt;br /&gt;  }&lt;br /&gt;}&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;RecommenderMapper is a bit more involved than the tasks we've seen so far.  In addition to overriding the map method, it also overrides configure and close.  The calling of these methods is orchestrated by org.apache.hadoop.mapred.MapRunner, which will call configure upon instantiating RecommenderMapper, and will call close once it has called its map method for every key-value pair assigned to it.&lt;br /&gt;&lt;br /&gt;RecommenderMapper will perform the matrix-vector multiplication of the cooccurrence matrix with a user's preference vector to compute the recommendation vector mentioned in the previous post.  
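&lt;br /&gt;&lt;br /&gt;One way to organize such a multiplication is column by column, adding up each matrix column scaled by the matching entry of the preference vector.  The following is a minimal sketch under simplifying assumptions: it uses dense arrays rather than Mahout's sparse vectors, the class name ColumnwiseMultiplySketch and the method multiply are made up for this example, and the numbers are invented.&lt;br /&gt;&lt;br /&gt;

```java
import java.util.Arrays;

public class ColumnwiseMultiplySketch {

    /**
     * Multiplies a square matrix, given as an array of its columns, by a vector.
     * The result is accumulated one scaled column at a time.
     */
    static double[] multiply(double[][] columns, double[] prefs) {
        double[] result = new double[prefs.length];
        int item = 0;
        for (double[] column : columns) {
            double pref = prefs[item++];
            if (pref == 0.0) {
                continue; // items the user has not rated contribute nothing
            }
            int row = 0;
            for (double count : column) {
                result[row++] += pref * count;
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // Invented 3-item cooccurrence matrix (symmetric) and one user's ratings.
        double[][] columns = {{0, 2, 1}, {2, 0, 3}, {1, 3, 0}};
        double[] prefs = {5, 0, 4};
        System.out.println(Arrays.toString(multiply(columns, prefs)));
    }
}
```

&lt;br /&gt;&lt;br /&gt;With these invented numbers the recommendation vector comes out as [4.0, 22.0, 5.0], so the one item this user has not rated (index 1) gets the highest score.  Skipping zero preferences is what makes the column-wise order attractive: only the columns for items the user actually rated need to be touched.&lt;br /&gt;&lt;br /&gt;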
RecommenderMapper will then select the highest-valued indices (as many as were specified in RECOMMENDATIONS_PER_USER, which defaults to 10) of the recommendation vector (filtering them so as to not select something the user has already seen) and then map those indices back to the original itemIDs.  The recommendations are encapsulated in a RecommendedItemsWritable object, which, together with the LongWritable that encapsulates the userID, is accumulated in the output file.&lt;br /&gt;&lt;br /&gt;Notable observations include how the matrix-vector multiplication is done (keep a running sum of the result of multiplying the ith column of the matrix by the ith value of the vector) and the mechanics of how the columns of the cooccurrence matrix are loaded and cached (which is complex enough to deserve its own post).&lt;br /&gt;&lt;br /&gt;Another thing that should jump out at you is the file that is specified as input into the last task, which goes to the RecommenderMapper.  It is the file containing the user preference vectors, and thus if you were specifying multiple mappers, the input splits would be based on that file.  Ideally, if you had many users for whom to make recommendations, you would want to distribute the work based on those users instead.  Similarly, when you have a small number of users, you would just want to perform one matrix-vector multiplication for each such user and save yourself the trouble of calling the RecommenderMapper's map method for users to whom you are not interested in making recommendations.  
This is handled in the current implementation by returning from the map method if the user preference vector does not belong to a user we are interested in:&lt;br /&gt;&lt;br /&gt;&lt;pre name="code" class="java"&gt;&lt;br /&gt;  @Override&lt;br /&gt;  public void map(LongWritable userID,&lt;br /&gt;                  VectorWritable vectorWritable,&lt;br /&gt;                  OutputCollector&amp;lt;LongWritable,RecommendedItemsWritable&amp;gt; output,&lt;br /&gt;                  Reporter reporter) throws IOException {&lt;br /&gt;    &lt;br /&gt;    if ((usersToRecommendFor != null) &amp;&amp; !usersToRecommendFor.contains(userID.get())) {&lt;br /&gt;      return;&lt;br /&gt;    }&lt;br /&gt;    ...&lt;br /&gt;  }&lt;br /&gt;&lt;/pre&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/6475416749449483556-606011733220118085?l=mechanistician.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://mechanistician.blogspot.com/feeds/606011733220118085/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://mechanistician.blogspot.com/2010/07/large-scale-machine-learning-with_07.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/6475416749449483556/posts/default/606011733220118085'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/6475416749449483556/posts/default/606011733220118085'/><link rel='alternate' type='text/html' href='http://mechanistician.blogspot.com/2010/07/large-scale-machine-learning-with_07.html' title='Large-Scale Machine Learning with Mahout - Part 3'/><author><name>Mechanistician</name><uri>http://www.blogger.com/profile/04405962471175755507</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' 
src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-6475416749449483556.post-8052112300121720411</id><published>2010-07-06T21:29:00.001-07:00</published><updated>2010-07-08T00:08:00.199-07:00</updated><title type='text'>Large-Scale Machine Learning with Mahout - Part 2</title><content type='html'>Following up on the previous post, today we examine a map-reduce-based job that does a bit more than count how many times each word occurs in a document.  You'll wanna already be familiar with Hadoop and have it installed and running on your computer, and for that you can use the references described previously.&lt;br /&gt;&lt;br /&gt;&lt;a href="https://svn.apache.org/repos/asf/mahout/tags/mahout-0.3/core/src/main/java/org/apache/mahout/cf/taste/hadoop/item/RecommenderJob.java"&gt;RecommenderJob&lt;/a&gt; runs a collaborative filtering algorithm that is fairly easy to distribute compared with others (such as the predominantly online SVD algorithms popularized in the recent &lt;a href="http://www.netflixprize.com/"&gt;Netflix challenge&lt;/a&gt;, see &lt;a href="http://sifter.org/%7Esimon/journal/20061211.html"&gt;here&lt;/a&gt; and &lt;a href="http://jmlr.csail.mit.edu/papers/volume10/takacs09a/takacs09a.pdf"&gt;here&lt;/a&gt;).  You can check &lt;a href="http://www.manning.com/owen/"&gt;Mahout In Action&lt;/a&gt; for a more detailed description, but briefly, 2 main data structures are computed:&lt;br /&gt;&lt;ol&gt;&lt;li&gt;The cooccurrence matrix&lt;/li&gt;&lt;li&gt;The user preference vector&lt;/li&gt;&lt;/ol&gt;The cooccurrence matrix is a symmetric matrix of items by items, where the intersection of an item row and an item column holds the number of users whose rated items include both items of that pair.  The user preference vector is simply a vector of all the items, each user having its own preference vector.  
The only non-zero items in a user's preference vector are those which the user has rated, and the rating is the value for that item in the vector.&lt;br /&gt;&lt;br /&gt;To compute recommendations for a given user, the cooccurrence matrix is multiplied on the right by the user preference vector.  The result is another vector with the same dimensions as the user preference vector; let's call it the "recommendation vector."  What you'll wanna do then is recommend to the user the highest-valued items in their recommendation vector (presumably excluding those items the user has already rated).&lt;br /&gt;&lt;br /&gt;One simple dataset to use is the 100k GroupLens dataset, which can be downloaded from here, and which has the following tab-separated format: user id | item id | rating | timestamp.  RecommenderJob expects a comma-separated file and you can either write a script to convert it to the format expected by the RecommenderJob or edit the RecommenderJob to process that format (you could also create a map-reduce task with an IdentityReducer to do the preprocessing).&lt;br /&gt;&lt;br /&gt;For starters, the job can be run from the IDE as a JUnit unit test, similar to the one shown below:&lt;br /&gt;&lt;br /&gt;&lt;pre name="code" class="java"&gt;&lt;br /&gt;package dr.item_rec;&lt;br /&gt;&lt;br /&gt;import org.apache.hadoop.fs.FileSystem;&lt;br /&gt;import org.apache.hadoop.fs.Path;&lt;br /&gt;import org.apache.hadoop.mapred.JobConf;&lt;br /&gt;import org.junit.Test;&lt;br /&gt;import org.junit.Assert;&lt;br /&gt;&lt;br /&gt;public class RecommenderJob_Tester {&lt;br /&gt;    @Test&lt;br /&gt;    public void MainTest() throws Exception {&lt;br /&gt;        JobConf conf = new JobConf();&lt;br /&gt;        conf.set("fs.default.name", "file:///");&lt;br /&gt;        conf.set("mapred.job.tracker", "local");&lt;br /&gt;&lt;br /&gt;        Path input = new Path("/Users/mario/Desktop/fastcf/data/rec_job");&lt;br /&gt;        Path output = new 
Path("/Users/mario/Desktop/fastcf/data/rec_job_out");&lt;br /&gt;        Path usersFile = new Path("/Users/mario/Desktop/fastcf/data/recs/users");&lt;br /&gt;&lt;br /&gt;        FileSystem fs = FileSystem.getLocal(conf);&lt;br /&gt;        // delete output dirs, including those defined in AbstractJob,&lt;br /&gt;        // RecommenderJob's parent class&lt;br /&gt;        fs.delete(output, true);&lt;br /&gt;        fs.delete(new Path("/Users/mario/Desktop/fastcf/distributed_recommender/temp/userVectors"), true);&lt;br /&gt;        fs.delete(new Path("/Users/mario/Desktop/fastcf/distributed_recommender/temp/itemIDIndex"), true);&lt;br /&gt;        fs.delete(new Path("/Users/mario/Desktop/fastcf/distributed_recommender/temp/cooccurrence"), true);&lt;br /&gt;&lt;br /&gt;        RecommenderJob driver = new RecommenderJob();&lt;br /&gt;        driver.setConf(conf);&lt;br /&gt;&lt;br /&gt;        int exitCode = driver.run(new String[]{&lt;br /&gt;                "--input",&lt;br /&gt;                input.toString(),&lt;br /&gt;                "--output",&lt;br /&gt;                output.toString(),&lt;br /&gt;                "--usersFile",&lt;br /&gt;                usersFile.toString()&lt;br /&gt;        });&lt;br /&gt;        &lt;br /&gt;        Assert.assertEquals(0, exitCode);&lt;br /&gt;    }&lt;br /&gt;}&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;The source for the project can be downloaded from &lt;a href="http://dl.dropbox.com/u/4681937/distributed_recommender.zip"&gt;here&lt;/a&gt;.  A couple of modifications to the project since the last post include (these details would only be relevant to those not terribly familiar with the Java tools used here):&lt;br /&gt;&lt;ol&gt;&lt;li&gt;Adding JUnit as a dependency to the pom.xml file.&lt;/li&gt;&lt;li&gt;Under module settings - libraries, look for the mahout-core:0.3 library and attach its source by pointing to the directory where you downloaded Mahout (the same can be done with the Hadoop jar).  
This makes it possible to step into Mahout and Hadoop code from within your project while debugging in the IDE.&lt;/li&gt;&lt;/ol&gt;Other details include the input and output files, which can be deduced by reading the source for RecommenderJob (/Users/mario/Desktop/fastcf/data/recs/users can be as simple as a file containing a single userID).  We'll continue this next time.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/6475416749449483556-8052112300121720411?l=mechanistician.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://mechanistician.blogspot.com/feeds/8052112300121720411/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://mechanistician.blogspot.com/2010/07/large-scale-machine-learning-with_06.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/6475416749449483556/posts/default/8052112300121720411'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/6475416749449483556/posts/default/8052112300121720411'/><link rel='alternate' type='text/html' href='http://mechanistician.blogspot.com/2010/07/large-scale-machine-learning-with_06.html' title='Large-Scale Machine Learning with Mahout - Part 2'/><author><name>Mechanistician</name><uri>http://www.blogger.com/profile/04405962471175755507</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-6475416749449483556.post-5833062697192417586</id><published>2010-07-04T18:36:00.000-07:00</published><updated>2010-07-07T21:38:54.030-07:00</updated><title type='text'>Large-Scale Machine Learning with Mahout - Part 1</title><content type='html'>Mining very large 
data sets is within everyone's reach nowadays, with open source libraries like &lt;a href="http://hadoop.apache.org/"&gt;Hadoop&lt;/a&gt; and &lt;a href="http://mahout.apache.org/"&gt;Mahout&lt;/a&gt;.&lt;br /&gt;&lt;br /&gt;Hadoop needs no introduction since most anyone interested in cloud computing has an idea of what it is.  Even at such an early release number (0.2x) it is already heavily used in big name places like Facebook and Yahoo, just to mention a couple.  For completeness' sake, some of the seminal papers on which the Hadoop ecosystem is based include:&lt;br /&gt;&lt;ul&gt;&lt;li&gt;&lt;a href="http://labs.google.com/papers/mapreduce.html"&gt;Google's Map-Reduce paper&lt;/a&gt;&lt;/li&gt;&lt;li&gt;&lt;a href="http://labs.google.com/papers/gfs.html"&gt;Google's Distributed File System paper&lt;/a&gt;&lt;/li&gt;&lt;/ul&gt;What is not as well-known is the Mahout project, which aims to build scalable machine learning libraries (and which leverages Hadoop but does not restrict itself to it).&lt;br /&gt;&lt;br /&gt;Mahout was also inspired by a research paper:&lt;br /&gt;&lt;ul&gt;&lt;li&gt;&lt;a href="http://www.cs.stanford.edu/people/ang//papers/nips06-mapreducemulticore.pdf"&gt;Map-Reduce for Machine Learning on Multicore&lt;/a&gt;&lt;/li&gt;&lt;/ul&gt;That paper makes the observation that machine learning algorithms that fit the Statistical Query Model (those which compute sufficient statistics over the data) fit well into the map-reduce paradigm.  For example, batch gradient descent, which computes the gradient contribution from each training instance and sums them all together to get the gradient, can be parallelized by splitting up the training data into several chunks, computing the local chunk contributions to the gradient in parallel, and then adding the chunk contributions for the overall gradient.  
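&lt;br /&gt;&lt;br /&gt;As a rough sketch of that idea (my own toy illustration, not code from the paper or from Mahout), take least-squares linear regression, where the gradient of half the summed squared error is a sum of per-row terms: each chunk of rows produces a partial gradient, and the partial gradients are added up.  The class and method names below are made up for the example, and the chunks are processed in a plain loop; since the calls don't depend on one another, each could just as well run as a separate map task:&lt;br /&gt;&lt;br /&gt;&lt;pre name="code" class="java"&gt;&lt;br /&gt;import java.util.Arrays;&lt;br /&gt;&lt;br /&gt;// Hypothetical toy class, for illustration only&lt;br /&gt;public class ChunkedGradient {&lt;br /&gt;&lt;br /&gt;    // "Map" step: gradient contribution of rows [from, to)&lt;br /&gt;    static double[] chunkGradient(double[][] x, double[] y, double[] w, int from, int to) {&lt;br /&gt;        double[] g = new double[w.length];&lt;br /&gt;        for (int i = from; i != to; i++) {&lt;br /&gt;            // after the first inner loop, err is (w . x[i]) - y[i]&lt;br /&gt;            double err = -y[i];&lt;br /&gt;            for (int j = 0; j != w.length; j++) err += w[j] * x[i][j];&lt;br /&gt;            for (int j = 0; j != w.length; j++) g[j] += err * x[i][j];&lt;br /&gt;        }&lt;br /&gt;        return g;&lt;br /&gt;    }&lt;br /&gt;&lt;br /&gt;    // "Reduce" step: add up the per-chunk contributions&lt;br /&gt;    static double[] gradient(double[][] x, double[] y, double[] w, int chunks) {&lt;br /&gt;        double[] total = new double[w.length];&lt;br /&gt;        int n = x.length;&lt;br /&gt;        for (int c = 0; c != chunks; c++) {&lt;br /&gt;            double[] part = chunkGradient(x, y, w, c * n / chunks, (c + 1) * n / chunks);&lt;br /&gt;            for (int j = 0; j != w.length; j++) total[j] += part[j];&lt;br /&gt;        }&lt;br /&gt;        return total;&lt;br /&gt;    }&lt;br /&gt;&lt;br /&gt;    public static void main(String[] args) {&lt;br /&gt;        double[][] x = {{1, 2}, {3, 4}, {5, 6}};&lt;br /&gt;        double[] y = {1, 2, 3};&lt;br /&gt;        double[] w = {0, 0};&lt;br /&gt;        // one chunk versus three chunks: both lines print [-22.0, -28.0]&lt;br /&gt;        System.out.println(Arrays.toString(gradient(x, y, w, 1)));&lt;br /&gt;        System.out.println(Arrays.toString(gradient(x, y, w, 3)));&lt;br /&gt;    }&lt;br /&gt;}&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;The answer is the same no matter how the rows are split, which is what makes this kind of computation a natural fit for map-reduce.&lt;br /&gt;&lt;br /&gt;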
An interesting result in that paper is that they achieved linear speedup with an increasing number of cores.&lt;br /&gt;&lt;br /&gt;Even if machine learning is not particularly your cup of tea, examining the Mahout code can be a good exercise in learning how to leverage the power of Hadoop, and that is the point of this post and the next.&lt;br /&gt;&lt;br /&gt;For starters, you'll wanna have access to the best guide to Hadoop currently out there, Tom White's book:&lt;br /&gt;&lt;ul&gt;&lt;li&gt;&lt;a href="http://oreilly.com/catalog/9780596521981"&gt;Hadoop: The Definitive Guide&lt;/a&gt;&lt;/li&gt;&lt;/ul&gt;In addition, there is a book about Mahout in the making:&lt;br /&gt;&lt;ul&gt;&lt;li&gt;&lt;a href="http://www.manning.com/owen/"&gt;Mahout in Action&lt;/a&gt;&lt;/li&gt;&lt;/ul&gt;And having that one within reach is also a good idea.  This post and the next are somewhat gonna mirror part of chapter 6 from that book, "Distributing Recommendation Computations," but I'm aiming at going into a little more detail to get things set up, particularly for those out there who don't have a lot of experience with the Java ecosystem.&lt;br /&gt;&lt;br /&gt;We'll be working with tagged releases from each project, rather than the trunk.  This is just so this discussion does not become obsolete in a couple of days given the rapidly evolving nature of these projects.&lt;br /&gt;&lt;br /&gt;I'm currently working on my Mac laptop, where I use RapidSVN to manage my subversion repositories.  RapidSVN can be downloaded &lt;a href="http://www.rapidsvn.org/index.php/Main_Page"&gt;here&lt;/a&gt;.  Once you have it, add bookmarks for the URLs of the existing Hadoop and Mahout repositories as shown in the screenshot below.  
Next, you will want to check out a working copy of the release tagged "Mahout-0.3" to a local directory of your choice.&lt;br /&gt;&lt;br /&gt;&lt;a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://3.bp.blogspot.com/_JBvsBkmE5OU/TDPJO2FyJZI/AAAAAAAAAKo/r6mHjZD5u-Y/s1600/rapidsvn.png"&gt;&lt;img style="cursor: pointer; width: 400px; height: 205px;" src="http://3.bp.blogspot.com/_JBvsBkmE5OU/TDPJO2FyJZI/AAAAAAAAAKo/r6mHjZD5u-Y/s400/rapidsvn.png" alt="" id="BLOGGER_PHOTO_ID_5490953627569890706" border="0" /&gt;&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;After that, you will wanna do the same with Hadoop's release tagged "release-0.20.2" to its own directory.&lt;br /&gt;&lt;br /&gt;Now that you have the sources locally, we are gonna open them up with an IDE.  I'm currently using IntelliJ IDEA, which has a community edition version that can be downloaded for free &lt;a href="http://www.jetbrains.com/idea/"&gt;here&lt;/a&gt;.&lt;br /&gt;&lt;br /&gt;To open Mahout, create a new IDEA project and select "Import Project from external model" and the model will be a "Maven" model.  Select the directory you downloaded the Mahout sources to as the project's root directory and with the remaining default settings untouched, click "Next" until the end to finish IDEA's project creation wizard.&lt;br /&gt;&lt;br /&gt;At this point you'll wanna build the project.  Mahout uses Maven, which is a "software project management and comprehension tool."  In simple terms, Maven does in Mahout what Ant and Ivy do in Hadoop.  &lt;a href="http://ant.apache.org/"&gt;Ant&lt;/a&gt; is the well-known tool for automating software build processes in the Java world, whereas &lt;a href="http://ant.apache.org/ivy/"&gt;Ivy&lt;/a&gt; is a dependency management tool.  &lt;a href="http://maven.apache.org/"&gt;Maven&lt;/a&gt;, in turn, merges those capabilities (and more) into one tool.  
You'll wanna download Maven, for which you can follow &lt;a href="http://day-to-day-stuff.blogspot.com/2008/03/installing-maven-under-mac-osx.html"&gt;these instructions&lt;/a&gt;, and then configure IDEA to use Maven.  To configure IDEA, go to File - Settings - Maven and enter in the "Maven Home Directory" field the path to where you just installed Maven.&lt;br /&gt;&lt;br /&gt;You'll wanna read up on the Maven &lt;a href="http://maven.apache.org/guides/introduction/introduction-to-the-lifecycle.html"&gt;lifecycle&lt;/a&gt;, but for now, just to make sure everything is alright, open the Maven Projects window in IDEA, and run the Clean task followed by the Install task under the Apache Lucene Mahout node, as shown in the screenshot below.&lt;br /&gt;&lt;br /&gt;&lt;a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://2.bp.blogspot.com/_JBvsBkmE5OU/TDPYCSdLOzI/AAAAAAAAAKw/L4_2h0bkTXg/s1600/maven.png"&gt;&lt;img style="cursor: pointer; width: 400px; height: 204px;" src="http://2.bp.blogspot.com/_JBvsBkmE5OU/TDPYCSdLOzI/AAAAAAAAAKw/L4_2h0bkTXg/s400/maven.png" alt="" id="BLOGGER_PHOTO_ID_5490969904520313650" border="0" /&gt;&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;This will download a bunch of dependencies, compile the code, and install it into your local Maven repository.  My local repository happens to be under "/Users/mario/.m2" and you can find all the libraries needed by your Maven projects under there.&lt;br /&gt;&lt;br /&gt;The whole thing may take a few minutes, but once it's done, you're all set.  Building using IDEA's Build - Make Project menu command would be a pain (if you try it, you'll likely get several errors), since you would have to add a lot of dependencies, directories for code that gets generated during the build, etc.  It's best to just let the Maven build handle it.  
Later on, we will be creating a project that references the Mahout library (and thereby all of Mahout's dependencies, including Hadoop), and so there you will be able to use the IDE.  For now, let's open up Hadoop with IDEA.&lt;br /&gt;&lt;br /&gt;Again, create a new IDEA project, but this time select "create Java project from existing sources," and enter the directory where you downloaded the Hadoop sources to.  IDEA will present you with all the source directories it has found; leave them all selected and click next.  Next, merge all the libraries IDEA has found into one and call it "lib" and click next.  Then, merge all the modules IDEA has suggested into one and call it "Hadoop." Finally, open up the Ant Build tab in IDEA and run the "compile" target as shown below (other interesting targets include "compile-core-test" and "generate-test-records").  Again, building this project using the IDE can be a pain, so let Ant handle it.  The Ant task includes Ivy tasks, which download dependencies just as Maven did.  In Hadoop's case though, everything gets downloaded to within the Hadoop directory.&lt;br /&gt;&lt;br /&gt;&lt;a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://1.bp.blogspot.com/_JBvsBkmE5OU/TDPvgtGVLxI/AAAAAAAAAK4/Gt-ITG1NR-I/s1600/ant.png"&gt;&lt;img style="cursor: pointer; width: 400px; height: 219px;" src="http://1.bp.blogspot.com/_JBvsBkmE5OU/TDPvgtGVLxI/AAAAAAAAAK4/Gt-ITG1NR-I/s400/ant.png" alt="" id="BLOGGER_PHOTO_ID_5490995715835768594" border="0" /&gt;&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;After the builds (in both Hadoop and Mahout), additional code is generated.  If you wanna pick up the additional directories as modules, you can just recreate the project following the same steps.  
Alternatively, prior to opening the project in IDEA, you can run the Maven and Ant tasks from the command line.&lt;br /&gt;&lt;br /&gt;Once you have Hadoop built, you'll wanna install it locally and make sure you can run jobs in stand-alone and pseudo-distributed modes.  Tom White's book is an excellent resource for setting this up, and you can just copy the contents of your Hadoop folder to another directory, say "hadoop-installation" and follow the instructions in Appendix A of the book to install Hadoop from the hadoop-installation directory.&lt;br /&gt;&lt;br /&gt;Once you are confident in your local Hadoop installation, create a new IDEA project, where you will copy some files from Mahout and examine one job in detail, the &lt;a href="https://svn.apache.org/repos/asf/mahout/tags/mahout-0.3/core/src/main/java/org/apache/mahout/cf/taste/hadoop/item/RecommenderJob.java"&gt;RecommenderJob&lt;/a&gt;, found in the org.apache.mahout.cf.taste.hadoop.item package.&lt;br /&gt;&lt;br /&gt;The easiest way to do so is to create a new IDEA project from scratch and select Maven Module as the project type.  Leave all the defaults alone and click next to finish the wizard.  Next, you'll need to do 4 things:&lt;br /&gt;&lt;ol&gt;&lt;li&gt;Edit the pom.xml file to include a dependency on Mahout-0.3.&lt;/li&gt;&lt;li&gt;Add slf4j-jcl-1.5.8.jar as a library to your project (this library can be found in your local Maven repository).  Add a new Library under Module Settings and attach the folder you copied the Jar to as a Jar Directory.&lt;br /&gt;&lt;/li&gt;&lt;li&gt;Copy all the files relevant to the RecommenderJob from Mahout to your project (the ones shown in the screenshot).&lt;/li&gt;&lt;li&gt;Under Project settings, set the project language level to "6 - @Override in interfaces"&lt;br /&gt;&lt;/li&gt;&lt;/ol&gt;Your project should look somewhat like the screenshot below.  
You should now be able to build using IDEA and using the Maven lifecycle tasks.&lt;br /&gt;&lt;br /&gt;&lt;a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://3.bp.blogspot.com/_JBvsBkmE5OU/TDQCPWtPKaI/AAAAAAAAALA/akllEcu1XT8/s1600/recommenderjob.png"&gt;&lt;img style="cursor: pointer; width: 400px; height: 206px;" src="http://3.bp.blogspot.com/_JBvsBkmE5OU/TDQCPWtPKaI/AAAAAAAAALA/akllEcu1XT8/s400/recommenderjob.png" alt="" id="BLOGGER_PHOTO_ID_5491016308488087970" border="0" /&gt;&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;Next time we'll discuss RecommenderJob in detail.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/6475416749449483556-5833062697192417586?l=mechanistician.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://mechanistician.blogspot.com/feeds/5833062697192417586/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://mechanistician.blogspot.com/2010/07/large-scale-machine-learning-with.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/6475416749449483556/posts/default/5833062697192417586'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/6475416749449483556/posts/default/5833062697192417586'/><link rel='alternate' type='text/html' href='http://mechanistician.blogspot.com/2010/07/large-scale-machine-learning-with.html' title='Large-Scale Machine Learning with Mahout - Part 1'/><author><name>Mechanistician</name><uri>http://www.blogger.com/profile/04405962471175755507</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' 
url='http://3.bp.blogspot.com/_JBvsBkmE5OU/TDPJO2FyJZI/AAAAAAAAAKo/r6mHjZD5u-Y/s72-c/rapidsvn.png' height='72' width='72'/><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-6475416749449483556.post-3697603822515151006</id><published>2010-06-09T18:24:00.000-07:00</published><updated>2010-06-10T06:59:17.155-07:00</updated><title type='text'>Bayesian statistics</title><content type='html'>I just finished a 2-course sequence on Bayesian statistics (&lt;a href="http://www.soe.ucsc.edu/courses/course?ams206"&gt;ams206&lt;/a&gt; and &lt;a href="http://www.soe.ucsc.edu/courses/course?ams207"&gt;ams207&lt;/a&gt;) and doing so has certainly clarified many things I hadn't picked up on while trying to learn the topic by myself.&lt;br /&gt;&lt;br /&gt;The Bayesian paradigm provides a very simple, elegant, and coherent foundation for building statistical models.  I think of these models as being analogous to architectural blueprints for a building or UML diagrams for a software system.&lt;br /&gt;&lt;br /&gt;&lt;a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://1.bp.blogspot.com/_JBvsBkmE5OU/TBBEmhAgdjI/AAAAAAAAAKQ/DwjFQjiYrU8/s1600/UML_Diagrams.jpg"&gt;&lt;img style="cursor: pointer; width: 400px; height: 300px;" src="http://1.bp.blogspot.com/_JBvsBkmE5OU/TBBEmhAgdjI/AAAAAAAAAKQ/DwjFQjiYrU8/s400/UML_Diagrams.jpg" alt="" id="BLOGGER_PHOTO_ID_5480956174996108850" border="0" /&gt;&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;In the same way that a software system may be implemented in Java or C++, using this or that framework, we may also implement our statistical model in a variety of ways.  
From the full Bayesian model, we may concede in "correctness" for the sake of computation complexity (simulation vs optimization), we may try this or that prior, or make this or that independence assumption; however, the full model remains explicit and our compromises are made clear.&lt;br /&gt;&lt;br /&gt;For anybody looking for introductory resources on Bayesian stats as I once was, I would recommend Andrew Gelman's book, &lt;a href="http://www.stat.columbia.edu/%7Egelman/book/"&gt;Bayesian Data Analysis&lt;/a&gt;, which includes answers to selected exercises (useful for self-study).  Jim Albert's book, &lt;a href="http://bayes.bgsu.edu/bcwr/"&gt;Bayesian Computation in R&lt;/a&gt;, will also come in handy from an application perspective, although I think the use of the &lt;a href="http://cran.r-project.org/web/packages/LearnBayes/LearnBayes.pdf"&gt;LearnBayes&lt;/a&gt; package obscures things a bit on a first encounter (it would have been preferable to have seen the MCMC coded from scratch with lots of comments explaining the 'why' of things).  
Finally, once you're done with Albert's book, I would recommend &lt;a href="http://stat-athens.aueb.gr/%7Ejbn/winbugs_book/"&gt;Bayesian Modeling Using WinBUGS&lt;/a&gt;, and a good strategy would be to try to model and solve the examples in that book using R and then compare your answers with their WinBUGS implementation.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/6475416749449483556-3697603822515151006?l=mechanistician.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://mechanistician.blogspot.com/feeds/3697603822515151006/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://mechanistician.blogspot.com/2010/06/bayesian-statistics.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/6475416749449483556/posts/default/3697603822515151006'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/6475416749449483556/posts/default/3697603822515151006'/><link rel='alternate' type='text/html' href='http://mechanistician.blogspot.com/2010/06/bayesian-statistics.html' title='Bayesian statistics'/><author><name>Mechanistician</name><uri>http://www.blogger.com/profile/04405962471175755507</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' url='http://1.bp.blogspot.com/_JBvsBkmE5OU/TBBEmhAgdjI/AAAAAAAAAKQ/DwjFQjiYrU8/s72-c/UML_Diagrams.jpg' height='72' 
width='72'/><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-6475416749449483556.post-6224499050734607087</id><published>2010-05-12T18:43:00.000-07:00</published><updated>2010-08-12T23:28:49.822-07:00</updated><title type='text'>Bayesian Methods for Large Datasets</title><content type='html'>Interesting &lt;a href="http://videolectures.net/icml08_salakhutdinov_bpm/"&gt;talk&lt;/a&gt; on applying full bayesian models to large datasets (with the &lt;a href="http://www.netflixprize.com/"&gt;netflix challenge&lt;/a&gt; as the example application).&lt;br /&gt;&lt;br /&gt;&lt;a href="http://www.cs.toronto.edu/%7Eamnih/papers/bpmf.pdf"&gt;Here is the paper&lt;/a&gt; that goes with the talk.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/6475416749449483556-6224499050734607087?l=mechanistician.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://mechanistician.blogspot.com/feeds/6224499050734607087/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://mechanistician.blogspot.com/2010/05/bayesian-methods-for-large-datasets.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/6475416749449483556/posts/default/6224499050734607087'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/6475416749449483556/posts/default/6224499050734607087'/><link rel='alternate' type='text/html' href='http://mechanistician.blogspot.com/2010/05/bayesian-methods-for-large-datasets.html' title='Bayesian Methods for Large Datasets'/><author><name>Mechanistician</name><uri>http://www.blogger.com/profile/04405962471175755507</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' 
src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-6475416749449483556.post-7764519080064359186</id><published>2010-05-09T07:36:00.000-07:00</published><updated>2010-05-09T08:26:43.660-07:00</updated><title type='text'>A sign of the times</title><content type='html'>Stanford has a new course offering in their statistics department this summer: &lt;a href="http://scpd.stanford.edu/search/publicCourseSearchDetails.do;jsessionid=0C9A895BC40AD9BE458C79A90A8AD2CD?method=load&amp;amp;courseId=6973878"&gt;Algorithmic Trading&lt;/a&gt;.&lt;br /&gt;&lt;br /&gt;To my knowledge, this is one of the first major universities to have such a focused offering (are there any others?).&lt;br /&gt;&lt;br /&gt;To put things in perspective, here are some numbers from &lt;a href="http://en.wikipedia.org/wiki/Algorithmic_trading"&gt;Wikipedia&lt;/a&gt;:&lt;br /&gt;&lt;br /&gt;"In 2006 at the London Stock Exchange over 40% of all orders were  entered by algorithmic traders, with 60% predicted for 2007. American markets  and equity markets generally have a higher proportion of algorithmic trades  than other markets, and estimates for 2008 range as high as an 80%  proportion in some markets. Foreign exchange markets also have  active algorithmic trading (about 25% of orders in 2006). Futures and options markets are considered to be fairly  easily integrated into algorithmic trading, with about 20% of options volume expected to be computer generated by  2010. Bond markets are moving toward more access to algorithmic  traders."&lt;br /&gt;&lt;br /&gt;Keep in mind that any insight into what goes on inside Wall Street is probably just the tip of the iceberg.  
For example, &lt;a href="http://en.wikipedia.org/wiki/Pairs_trade"&gt;pairs trading&lt;/a&gt;, a technique covered in the course, is an old one:&lt;br /&gt;&lt;br /&gt;"The pairs trade or pair trading, also known as market neutral, was developed in the late 1980s by quantitative analysts and pioneered by Gerald Bamberger while at Morgan Stanley. With the help of others at Morgan Stanley  at the time, including Nunzio Tartaglia, Bamberger found that certain  securities, often competitors in the same sector, were correlated in  their day-to-day price movements. When the correlation broke down, i.e.  one stock traded up while the other traded down, they would sell the  outperforming stock and buy the underperforming one, betting that the  "spread" between the two would eventually converge. A notable pair  trader was hedge fund &lt;a href="http://en.wikipedia.org/wiki/Long-Term_Capital_Management" title="Long-Term Capital Management"&gt;Long-Term Capital Management&lt;/a&gt;."&lt;br /&gt;&lt;br /&gt;So what is this a sign of?  
As the Wikipedia article suggests, algorithmic trading is becoming more and more commonplace, and so employers are expecting future employees to know about it straight out of university, just as they need to know other essentials such as finance or economics.&lt;br /&gt;&lt;br /&gt;I wonder if this trend will contribute to making "glitches" like the &lt;a href="http://online.wsj.com/article/SB10001424052748704370704575227754131412596.html"&gt;Dow's recent drop and subsequent rise of nearly 1,000 points in the space of an hour&lt;/a&gt; more common or less common.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/6475416749449483556-7764519080064359186?l=mechanistician.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://mechanistician.blogspot.com/feeds/7764519080064359186/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://mechanistician.blogspot.com/2010/05/sign-of-times.html#comment-form' title='3 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/6475416749449483556/posts/default/7764519080064359186'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/6475416749449483556/posts/default/7764519080064359186'/><link rel='alternate' type='text/html' href='http://mechanistician.blogspot.com/2010/05/sign-of-times.html' title='A sign of the times'/><author><name>Mechanistician</name><uri>http://www.blogger.com/profile/04405962471175755507</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>3</thr:total></entry><entry><id>tag:blogger.com,1999:blog-6475416749449483556.post-3250507639244808388</id><published>2010-05-04T16:19:00.000-07:00</published><updated>2010-05-04T17:20:50.066-07:00</updated><title 
type='text'>Java vs C++</title><content type='html'>Here is a completely unscientific yet entertaining piece of anecdotal evidence regarding java vs c++ performance:&lt;br /&gt;&lt;br /&gt;So, I'm thinking about ways to scale recommender systems, and as a baseline, I'm using &lt;a href="http://www.timelydevelopment.com/demos/NetflixPrize.aspx"&gt;Timely Development&lt;/a&gt;'s (TD) implementation of &lt;a href="http://sifter.org/%7Esimon/journal/20061211.html"&gt;Simon Funk&lt;/a&gt;'s well described approach to the &lt;a href="http://www.netflixprize.com/"&gt;Netflix Challenge&lt;/a&gt;.&lt;br /&gt;&lt;br /&gt;One of the things I'm playing with is &lt;a href="http://lucene.apache.org/mahout/"&gt;Mahout&lt;/a&gt;, which aims to be a scalable machine learning library, and being built on top of &lt;a href="http://hadoop.apache.org/"&gt;Hadoop&lt;/a&gt;, it is written in java.  I rewrote TD's implementation in java, with the only change worth mentioning being that I changed the way the data is structured from something like this:&lt;br /&gt;&lt;br /&gt;&lt;pre name="code" class="cpp"&gt;&lt;br /&gt;struct Data&lt;br /&gt;{&lt;br /&gt;int         CustId;&lt;br /&gt;short       MovieId;&lt;br /&gt;BYTE        Rating;&lt;br /&gt;float       Cache;&lt;br /&gt;};&lt;br /&gt;Data            m_aRatings[MAX_RATINGS];&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;To something like this:&lt;br /&gt;&lt;br /&gt;&lt;pre name="code" class="java"&gt;&lt;br /&gt;int[] m_aRatings_CustId;&lt;br /&gt;short[] m_aRatings_MovieId;&lt;br /&gt;byte[] m_aRatings_Rating;&lt;br /&gt;float[] m_aRatings_Cache;&lt;br /&gt;&lt;br /&gt;m_aRatings_CustId = new int[Constants.MAX_RATINGS];&lt;br /&gt;m_aRatings_MovieId = new short[Constants.MAX_RATINGS];&lt;br /&gt;m_aRatings_Rating = new byte[Constants.MAX_RATINGS];&lt;br /&gt;m_aRatings_Cache = new float[Constants.MAX_RATINGS];&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;The reason for doing so being that java does not have structs and using a 
class instead of a struct would use more memory than I have on my laptop.  Even with this change, the java implementation still uses more memory than the c++ implementation, but as long as I could run it on my laptop, I didn't care.  Oh yeah, I also changed some Windows specific things in TD's code so I could run it on Linux, but that has a negligible effect on what the overall code is doing.&lt;br /&gt;&lt;br /&gt;Running TD's code using 10 features and reading only 99 of the 17,770 movie files finishes fairly quickly:&lt;br /&gt;&lt;br /&gt;&lt;pre name="code" class="java"&gt;&lt;br /&gt;real    0m49.097s&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;I was now curious to see how much slower the java code would run when given the same parameters:&lt;br /&gt;&lt;br /&gt;&lt;pre name="code" class="java"&gt;&lt;br /&gt;real    0m26.520s&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;That brought a smile to my face :)&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/6475416749449483556-3250507639244808388?l=mechanistician.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://mechanistician.blogspot.com/feeds/3250507639244808388/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://mechanistician.blogspot.com/2010/05/java-vs-c.html#comment-form' title='2 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/6475416749449483556/posts/default/3250507639244808388'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/6475416749449483556/posts/default/3250507639244808388'/><link rel='alternate' type='text/html' href='http://mechanistician.blogspot.com/2010/05/java-vs-c.html' title='Java vs C++'/><author><name>Mechanistician</name><uri>http://www.blogger.com/profile/04405962471175755507</uri><email>noreply@blogger.com</email><gd:image 
rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>2</thr:total></entry><entry><id>tag:blogger.com,1999:blog-6475416749449483556.post-7286121538701236328</id><published>2010-04-09T16:14:00.000-07:00</published><updated>2010-04-10T06:27:45.275-07:00</updated><title type='text'>Handling uncertainty and complexity in AI</title><content type='html'>We've been seeing several approaches to handling uncertainty (using probabilistic models) and complexity (using structured/relational representations) in tandem in artificial intelligence systems, including &lt;a href="http://projects.csail.mit.edu/church/wiki/Probabilistic_Models_of_Cognition_Tutorial"&gt;Church&lt;/a&gt;, the topic of an &lt;a href="http://web.mit.edu/newsoffice/2010/ai-unification"&gt;MIT news article&lt;/a&gt; released a couple of weeks ago.  Other systems include:&lt;br /&gt;&lt;ul&gt;&lt;li&gt;&lt;a href="http://alchemy.cs.washington.edu/"&gt;Alchemy&lt;/a&gt;&lt;/li&gt;&lt;li&gt;&lt;a href="http://www.eecs.harvard.edu/%7Eavi/IBAL/"&gt;IBAL&lt;/a&gt;&lt;/li&gt;&lt;li&gt;&lt;a href="http://people.csail.mit.edu/milch/blog/index.html"&gt;BLOG&lt;/a&gt;&lt;/li&gt;&lt;/ul&gt;&lt;a href="http://www.cs.berkeley.edu/%7Ejordan/papers/pearl-festschrift.pdf"&gt;This paper&lt;/a&gt;, from &lt;a href="http://www.eecs.berkeley.edu/%7Ejordan/"&gt;Michael Jordan&lt;/a&gt;, provides another approach towards an expressive probabilistic representation, one that draws from Bayesian nonparametrics.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/6475416749449483556-7286121538701236328?l=mechanistician.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://mechanistician.blogspot.com/feeds/7286121538701236328/comments/default' title='Post Comments'/><link rel='replies' type='text/html' 
href='http://mechanistician.blogspot.com/2010/04/handling-uncertainty-and-complexity-in.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/6475416749449483556/posts/default/7286121538701236328'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/6475416749449483556/posts/default/7286121538701236328'/><link rel='alternate' type='text/html' href='http://mechanistician.blogspot.com/2010/04/handling-uncertainty-and-complexity-in.html' title='Handling uncertainty and complexity in AI'/><author><name>Mechanistician</name><uri>http://www.blogger.com/profile/04405962471175755507</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-6475416749449483556.post-3274104255752724646</id><published>2010-03-19T16:11:00.000-07:00</published><updated>2010-06-23T08:12:51.264-07:00</updated><title type='text'>Enhancing Video Resolution With Stills</title><content type='html'>Enhancing videos or images is nothing new to Hollywood:&lt;br /&gt;&lt;br /&gt;&lt;object width="640" height="385"&gt;&lt;param name="movie" value="http://www.youtube.com/v/Vxq9yj2pVWk&amp;amp;hl=en_US&amp;amp;fs=1&amp;amp;"&gt;&lt;param name="allowFullScreen" value="true"&gt;&lt;param name="allowscriptaccess" value="always"&gt;&lt;embed src="http://www.youtube.com/v/Vxq9yj2pVWk&amp;amp;hl=en_US&amp;amp;fs=1&amp;amp;" type="application/x-shockwave-flash" allowscriptaccess="always" allowfullscreen="true" width="640" height="385"&gt;&lt;/embed&gt;&lt;/object&gt;&lt;br /&gt;&lt;br /&gt;But some of the stuff that can be done back here in the real world can be quite impressive.  
I just finished an &lt;a href="http://www.soe.ucsc.edu/classes/ee264/Winter10/"&gt;image processing class&lt;/a&gt; and got a taste of some of what's possible.&lt;br /&gt;&lt;br /&gt;In one of the projects in the class, my friend and I tried to reproduce some of the results from a recent paper, &lt;a href="http://grail.cs.washington.edu/projects/enhancing-spacetime/"&gt;Enhancing and Experiencing Spacetime Resolution with Videos and Stills&lt;/a&gt;, and though we did not implement all the components described in the paper (such as the spacetime fusion in the diagram below), we were still able to obtain some decent results.&lt;br /&gt;&lt;br /&gt;&lt;a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://4.bp.blogspot.com/_JBvsBkmE5OU/TCIjuLKP2eI/AAAAAAAAAKY/fMOQLPYWL3U/s1600/syscomps.png"&gt;&lt;img style="cursor:pointer; cursor:hand;width: 400px; height: 81px;" src="http://4.bp.blogspot.com/_JBvsBkmE5OU/TCIjuLKP2eI/AAAAAAAAAKY/fMOQLPYWL3U/s400/syscomps.png" border="0" alt=""id="BLOGGER_PHOTO_ID_5485986572267411938" /&gt;&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;Here's a video, the well-known &lt;a href="http://nsl.cs.sfu.ca/wiki/index.php/Foreman"&gt;foreman sequence&lt;/a&gt;, downloaded from &lt;a href="http://trace.eas.asu.edu/yuv/index.html"&gt;here&lt;/a&gt;:&lt;br /&gt;&lt;br /&gt;&lt;object width="480" height="385"&gt;&lt;param name="movie" value="http://www.youtube.com/v/9n1fz72e9Sg&amp;amp;hl=en_US&amp;amp;fs=1&amp;amp;"&gt;&lt;param name="allowFullScreen" value="true"&gt;&lt;param name="allowscriptaccess" value="always"&gt;&lt;embed src="http://www.youtube.com/v/9n1fz72e9Sg&amp;amp;hl=en_US&amp;amp;fs=1&amp;amp;" type="application/x-shockwave-flash" allowscriptaccess="always" allowfullscreen="true" width="320" height="265"&gt;&lt;/embed&gt;&lt;/object&gt;&lt;br /&gt;&lt;br /&gt;We processed the original sequence to simulate the output of a hybrid camera (as described in the original paper), which gave us the video below:&lt;br 
/&gt;&lt;br /&gt;&lt;object width="480" height="385"&gt;&lt;param name="movie" value="http://www.youtube.com/v/AwH9FLpilRY&amp;amp;hl=en_US&amp;amp;fs=1&amp;amp;"&gt;&lt;param name="allowFullScreen" value="true"&gt;&lt;param name="allowscriptaccess" value="always"&gt;&lt;embed src="http://www.youtube.com/v/AwH9FLpilRY&amp;amp;hl=en_US&amp;amp;fs=1&amp;amp;" type="application/x-shockwave-flash" allowscriptaccess="always" allowfullscreen="true" width="320" height="265"&gt;&lt;/embed&gt;&lt;/object&gt;&lt;br /&gt;&lt;br /&gt;The flickering is caused by the intermediate high-resolution frames; the entire sequence looks something like the image below (red represents high-res frames, while gray represents low-res frames).  In the hybrid foreman sequence, the gray frames are simply the original frames down-sampled by a factor of 4.&lt;br /&gt;&lt;br /&gt;&lt;a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://1.bp.blogspot.com/_JBvsBkmE5OU/S6QNPaAxY4I/AAAAAAAAAKI/tWgo4TJjVpI/s1600-h/input_seq.png"&gt;&lt;img style="cursor: pointer; width: 194px; height: 99px;" src="http://1.bp.blogspot.com/_JBvsBkmE5OU/S6QNPaAxY4I/AAAAAAAAAKI/tWgo4TJjVpI/s400/input_seq.png" alt="" id="BLOGGER_PHOTO_ID_5450496007356900226" border="0" /&gt;&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;Finally, we processed the simulated hybrid input with our system, and we got this (if you watch closely, you'll observe several leftover artifacts, areas where the video appears to warp):&lt;br /&gt;&lt;br /&gt;&lt;object width="320" height="265"&gt;&lt;param name="movie" value="http://www.youtube.com/v/SLeUg9vjBV8&amp;amp;hl=en_US&amp;amp;fs=1&amp;amp;"&gt;&lt;param name="allowFullScreen" value="true"&gt;&lt;param name="allowscriptaccess" value="always"&gt;&lt;embed src="http://www.youtube.com/v/SLeUg9vjBV8&amp;amp;hl=en_US&amp;amp;fs=1&amp;amp;" type="application/x-shockwave-flash" allowscriptaccess="always" allowfullscreen="true" width="320" 
height="265"&gt;&lt;/embed&gt;&lt;/object&gt;&lt;br /&gt;&lt;br /&gt;Even with the leftover artifacts, I'd say we were able to propagate a decent amount of info from the high-res frames to the low-res frames, wouldn't you?&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/6475416749449483556-3274104255752724646?l=mechanistician.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://mechanistician.blogspot.com/feeds/3274104255752724646/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://mechanistician.blogspot.com/2010/03/enhancing-video-resolution-with-stills.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/6475416749449483556/posts/default/3274104255752724646'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/6475416749449483556/posts/default/3274104255752724646'/><link rel='alternate' type='text/html' href='http://mechanistician.blogspot.com/2010/03/enhancing-video-resolution-with-stills.html' title='Enhancing Video Resolution With Stills'/><author><name>Mechanistician</name><uri>http://www.blogger.com/profile/04405962471175755507</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' url='http://4.bp.blogspot.com/_JBvsBkmE5OU/TCIjuLKP2eI/AAAAAAAAAKY/fMOQLPYWL3U/s72-c/syscomps.png' height='72' width='72'/><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-6475416749449483556.post-4348347089240612084</id><published>2010-03-16T12:24:00.000-07:00</published><updated>2010-03-16T12:40:01.882-07:00</updated><title type='text'>Data Modeling vs Algorithmic Modeling</title><content type='html'>An &lt;a 
href="http://www.stat.osu.edu/%7Ebli/dmsl/papers/Breiman.pdf"&gt;interesting paper (plus comments)&lt;/a&gt; contrasting the 2 differing approaches used in statistics and machine learning.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/6475416749449483556-4348347089240612084?l=mechanistician.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://mechanistician.blogspot.com/feeds/4348347089240612084/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://mechanistician.blogspot.com/2010/03/data-modeling-vs-algorithmic-modeling.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/6475416749449483556/posts/default/4348347089240612084'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/6475416749449483556/posts/default/4348347089240612084'/><link rel='alternate' type='text/html' href='http://mechanistician.blogspot.com/2010/03/data-modeling-vs-algorithmic-modeling.html' title='Data Modeling vs Algorithmic Modeling'/><author><name>Mechanistician</name><uri>http://www.blogger.com/profile/04405962471175755507</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-6475416749449483556.post-9124209187665865980</id><published>2010-02-20T09:54:00.000-08:00</published><updated>2010-02-21T12:27:58.405-08:00</updated><title type='text'>Least-Squares For Image Restoration</title><content type='html'>Here's an interesting use of &lt;a href="http://mechanistician.blogspot.com/2009/02/lecture-2-linear-regression-part-1.html"&gt;least-squares&lt;/a&gt;: image restoration.  
You can find a good mathematical description of the MATLAB code in the paper titled &lt;a href="http://books.google.com/books?id=eZQ9PO7rL4gC&amp;amp;pg=PA177&amp;amp;lpg=PA177&amp;amp;dq=image+restoration+by+an+iterative+damped+least-squares+method+with+two+tyoes+of+adaptation&amp;amp;source=bl&amp;amp;ots=v1Mgike5VE&amp;amp;sig=XsQ8QMDb_Wco5L1zbrUMyKZuSlg&amp;amp;hl=en&amp;amp;ei=20t-S-qFH46osgPQ5cj8Cw&amp;amp;sa=X&amp;amp;oi=book_result&amp;amp;ct=result&amp;amp;resnum=2&amp;amp;ved=0CAkQ6AEwAQ#v=onepage&amp;amp;q=&amp;amp;f=false"&gt;Image Restoration by an Iterative Damped Least-Squares Method with Two Types of Adaptation&lt;/a&gt; ("damped" means regularized, and "adaptive" is implemented in the code by using edge detection for spatially dependent regularization: less regularization around edges than on smooth sections of the image).&lt;br /&gt;&lt;br /&gt;&lt;pre name="code" class="python"&gt;&lt;br /&gt;clc;clear all;close all;&lt;br /&gt;&lt;br /&gt;filename='some_image.png';&lt;br /&gt;&lt;br /&gt;%read image from disk&lt;br /&gt;f=imread(filename);&lt;br /&gt;&lt;br /&gt;%convert image to grayscale&lt;br /&gt;f=rgb2gray(f);&lt;br /&gt;&lt;br /&gt;%convert grayscale values to double&lt;br /&gt;f=im2double(f);&lt;br /&gt;&lt;br /&gt;%display original image&lt;br /&gt;figure; imagesc(f); axis image; colormap(gray);&lt;br /&gt;&lt;br /&gt;%compute the motion blur filter&lt;br /&gt;h=fspecial('motion',72,315);&lt;br /&gt;&lt;br /&gt;%blur the image&lt;br /&gt;g=imfilter(f,h,'conv','circular');&lt;br /&gt;&lt;br /&gt;%display the blurred image&lt;br /&gt;figure; imagesc(g); axis image; colormap(gray);&lt;br /&gt;&lt;br /&gt;%add gaussian noise to the image&lt;br /&gt;mean=0;&lt;br /&gt;sd=10/255;&lt;br /&gt;noise=mean+sd*randn(size(f,1),size(f,2));&lt;br /&gt;n=g+noise;&lt;br /&gt;&lt;br /&gt;%display the noise&lt;br /&gt;figure; imagesc(noise); axis image; colormap(gray);&lt;br /&gt;&lt;br /&gt;%display the blurred and noisy image&lt;br 
/&gt;figure; imagesc(n); axis image; colormap(gray);&lt;br /&gt;&lt;br /&gt;%perform gradient descent to recover f from n&lt;br /&gt;lambda=.01;&lt;br /&gt;step=1;&lt;br /&gt;error_norm=1;&lt;br /&gt;laplacian=[0  1  0;&lt;br /&gt;           1 -4  1;&lt;br /&gt;           0  1  0];&lt;br /&gt;&lt;br /&gt;%the noisy image is our first estimate&lt;br /&gt;f_hat_0=n;&lt;br /&gt;relative_error=intmax;&lt;br /&gt;iter=0;&lt;br /&gt;restf=figure;&lt;br /&gt;wf=figure;&lt;br /&gt;&lt;br /&gt;%repeat until little improvement is made&lt;br /&gt;while relative_error&gt;.1&lt;br /&gt;    &lt;br /&gt;    %compute left side of gradient equation&lt;br /&gt;    f_inter_left=imfilter(f_hat_0,h,'conv','circular')-n;&lt;br /&gt;    f_inter_left=imfilter(f_inter_left,h','conv','circular');&lt;br /&gt;    &lt;br /&gt;    %perform edge detection&lt;br /&gt;    [w,t]=edge(f_hat_0,'canny',[.2 .3],2);&lt;br /&gt;    %rescale edge intensities&lt;br /&gt;    w=1./(1+.01.*w);&lt;br /&gt;    &lt;br /&gt;    %display the edges image&lt;br /&gt;    figure(wf);&lt;br /&gt;    imagesc(w); axis image; colormap(gray);&lt;br /&gt;    drawnow;&lt;br /&gt;    &lt;br /&gt;    %compute right side of gradient equation&lt;br /&gt;    f_inter_right=imfilter(f_hat_0,laplacian,'conv','symmetric');&lt;br /&gt;    f_inter_right=f_inter_right.*w;&lt;br /&gt;    f_inter_right=imfilter(f_inter_right,laplacian','conv','symmetric');&lt;br /&gt;    f_inter_right=lambda*f_inter_right;&lt;br /&gt;       &lt;br /&gt;    %compute the gradient&lt;br /&gt;    gradient=f_inter_left+f_inter_right;&lt;br /&gt;    &lt;br /&gt;    %update current image estimate&lt;br /&gt;    f_hat_1=f_hat_0-step*(gradient);&lt;br /&gt;    &lt;br /&gt;    %compute relative error&lt;br /&gt;    relative_error=(norm(f_hat_1-f_hat_0,error_norm)/...&lt;br /&gt;        norm(f_hat_0,error_norm))*100;&lt;br /&gt;    &lt;br /&gt;    %reset our next estimate&lt;br /&gt;    f_hat_0=f_hat_1;&lt;br /&gt;    iter=iter+1;&lt;br /&gt;    disp(['iteration: 
',num2str(iter),...&lt;br /&gt;        ', error: ',num2str(relative_error),...&lt;br /&gt;        ', gradient: ',num2str(norm(gradient,2))]);&lt;br /&gt;    &lt;br /&gt;    %display our current image estimate&lt;br /&gt;    figure(restf);&lt;br /&gt;    imagesc(f_hat_0); axis image; colormap(gray);&lt;br /&gt;    drawnow;&lt;br /&gt;end&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;We start with the original image (found &lt;a href="http://blogs.mathworks.com/steve/2008/07/14/opening-by-reconstruction/"&gt;here&lt;/a&gt;):&lt;br /&gt;&lt;br /&gt;&lt;a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://3.bp.blogspot.com/_JBvsBkmE5OU/S4Ah69N1cTI/AAAAAAAAAJQ/h6SKuxzWFYY/s1600-h/some_image.png"&gt;&lt;img style="cursor:pointer; cursor:hand;width: 400px; height: 400px;" src="http://3.bp.blogspot.com/_JBvsBkmE5OU/S4Ah69N1cTI/AAAAAAAAAJQ/h6SKuxzWFYY/s400/some_image.png" border="0" alt=""id="BLOGGER_PHOTO_ID_5440385646612279602" /&gt;&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;And add some noise to it (in addition to blurring it), ending up with:&lt;br /&gt;&lt;br /&gt;&lt;a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://3.bp.blogspot.com/_JBvsBkmE5OU/S4Aibr1GKrI/AAAAAAAAAJY/PTnQzU21Tqw/s1600-h/blur_noise.png"&gt;&lt;img style="cursor:pointer; cursor:hand;width: 400px; height: 300px;" src="http://3.bp.blogspot.com/_JBvsBkmE5OU/S4Aibr1GKrI/AAAAAAAAAJY/PTnQzU21Tqw/s400/blur_noise.png" border="0" alt=""id="BLOGGER_PHOTO_ID_5440386208880798386" /&gt;&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;After one iteration of steepest descent, here is what edge detection looks like on the image:&lt;br /&gt;&lt;br /&gt;&lt;a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://2.bp.blogspot.com/_JBvsBkmE5OU/S4AjvTjuE3I/AAAAAAAAAJw/2ZbExMEG_Zo/s1600-h/edges1.png"&gt;&lt;img style="cursor:pointer; cursor:hand;width: 400px; height: 300px;" 
src="http://2.bp.blogspot.com/_JBvsBkmE5OU/S4AjvTjuE3I/AAAAAAAAAJw/2ZbExMEG_Zo/s400/edges1.png" border="0" alt=""id="BLOGGER_PHOTO_ID_5440387645474476914" /&gt;&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;And the image has not changed much:&lt;br /&gt;&lt;br /&gt;&lt;a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://3.bp.blogspot.com/_JBvsBkmE5OU/S4Aj_Xsnt-I/AAAAAAAAAJ4/VQRTyEqW7w4/s1600-h/restored1.png"&gt;&lt;img style="cursor:pointer; cursor:hand;width: 400px; height: 300px;" src="http://3.bp.blogspot.com/_JBvsBkmE5OU/S4Aj_Xsnt-I/AAAAAAAAAJ4/VQRTyEqW7w4/s400/restored1.png" border="0" alt=""id="BLOGGER_PHOTO_ID_5440387921463457762" /&gt;&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;After a couple hundred iterations of steepest descent, here's what the edges look like:&lt;br /&gt;&lt;br /&gt;&lt;a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://1.bp.blogspot.com/_JBvsBkmE5OU/S4AjCEIC7XI/AAAAAAAAAJg/047Ta0Y5iq4/s1600-h/edges.png"&gt;&lt;img style="cursor:pointer; cursor:hand;width: 400px; height: 300px;" src="http://1.bp.blogspot.com/_JBvsBkmE5OU/S4AjCEIC7XI/AAAAAAAAAJg/047Ta0Y5iq4/s400/edges.png" border="0" alt=""id="BLOGGER_PHOTO_ID_5440386868237757810" /&gt;&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;And the image is much clearer:&lt;br /&gt;&lt;br /&gt;&lt;a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://3.bp.blogspot.com/_JBvsBkmE5OU/S4AjSe4HufI/AAAAAAAAAJo/X77cQYaIZCc/s1600-h/restored.png"&gt;&lt;img style="cursor:pointer; cursor:hand;width: 400px; height: 300px;" src="http://3.bp.blogspot.com/_JBvsBkmE5OU/S4AjSe4HufI/AAAAAAAAAJo/X77cQYaIZCc/s400/restored.png" border="0" alt=""id="BLOGGER_PHOTO_ID_5440387150296627698" /&gt;&lt;/a&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/6475416749449483556-9124209187665865980?l=mechanistician.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' 
type='application/atom+xml' href='http://mechanistician.blogspot.com/feeds/9124209187665865980/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://mechanistician.blogspot.com/2010/02/regularized-least-squares-for-image.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/6475416749449483556/posts/default/9124209187665865980'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/6475416749449483556/posts/default/9124209187665865980'/><link rel='alternate' type='text/html' href='http://mechanistician.blogspot.com/2010/02/regularized-least-squares-for-image.html' title='Least-Squares For Image Restoration'/><author><name>Mechanistician</name><uri>http://www.blogger.com/profile/04405962471175755507</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' url='http://3.bp.blogspot.com/_JBvsBkmE5OU/S4Ah69N1cTI/AAAAAAAAAJQ/h6SKuxzWFYY/s72-c/some_image.png' height='72' width='72'/><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-6475416749449483556.post-5960346623614770071</id><published>2010-02-07T12:26:00.000-08:00</published><updated>2010-02-07T12:35:38.252-08:00</updated><title type='text'>Markov Logic</title><content type='html'>This is probably one of the most exciting things I've come across in terms of re-unifying the several clusters AI has been divided into, with the aim of building a general-purpose, consistent, and realistic system: &lt;a href="http://en.wikipedia.org/wiki/Markov_logic_network"&gt;Markov Logic&lt;/a&gt;.&lt;br /&gt;&lt;br /&gt;You can get a high-level overview from this &lt;a href="http://videolectures.net/cikm08_domingos_mlmaul/"&gt;video&lt;/a&gt;, and find additional details &lt;a 
href="http://alchemy.cs.washington.edu/"&gt;here&lt;/a&gt;.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/6475416749449483556-5960346623614770071?l=mechanistician.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://mechanistician.blogspot.com/feeds/5960346623614770071/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://mechanistician.blogspot.com/2010/02/markov-logic.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/6475416749449483556/posts/default/5960346623614770071'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/6475416749449483556/posts/default/5960346623614770071'/><link rel='alternate' type='text/html' href='http://mechanistician.blogspot.com/2010/02/markov-logic.html' title='Markov Logic'/><author><name>Mechanistician</name><uri>http://www.blogger.com/profile/04405962471175755507</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-6475416749449483556.post-4405632023147647435</id><published>2010-01-30T10:37:00.000-08:00</published><updated>2010-01-30T17:21:23.032-08:00</updated><title type='text'>Boosting Bayesian Networks</title><content type='html'>I strongly favor the idea of using a joint distribution as a knowledge base (memory?) for reasoning agents, but I don't necessarily think Bayesian networks in their current incarnation are the "end all, be all" representation. 
Specifically, I think the acyclicity requirement of the DAG may be too constraining (lately I've been particularly intrigued by &lt;a href="http://research.microsoft.com/apps/pubs/default.aspx?id=64334"&gt;dependency networks&lt;/a&gt; and their properties).  &lt;br /&gt;&lt;br /&gt;Nevertheless, the idea of boosting Bayesian networks for better density estimation described in this paper, &lt;a href="http://genie.weizmann.ac.il/pubs/conference/nips02.pdf"&gt;Boosting Density Estimation&lt;/a&gt;, is very interesting.&lt;br /&gt;&lt;br /&gt;Now, having an ensemble of Bayesian networks might be harder to interpret from a data mining perspective, but from an artificial intelligence perspective, I imagine a "language center" that knows how to query that knowledge base and transform the complex joint distribution representation into some sort of natural language form (and vice-versa).&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/6475416749449483556-4405632023147647435?l=mechanistician.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://mechanistician.blogspot.com/feeds/4405632023147647435/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://mechanistician.blogspot.com/2010/01/boosting-bayesian-networks.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/6475416749449483556/posts/default/4405632023147647435'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/6475416749449483556/posts/default/4405632023147647435'/><link rel='alternate' type='text/html' href='http://mechanistician.blogspot.com/2010/01/boosting-bayesian-networks.html' title='Boosting Bayesian Networks'/><author><name>Mechanistician</name><uri>http://www.blogger.com/profile/04405962471175755507</uri><email>noreply@blogger.com</email><gd:image 
rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-6475416749449483556.post-1451711118469337938</id><published>2010-01-23T09:24:00.000-08:00</published><updated>2010-01-26T12:52:11.783-08:00</updated><title type='text'>Probabilistic Reasoning</title><content type='html'>Assuming you're familiar with Bayes' rule (if not, check &lt;a href="http://people.cs.ubc.ca/%7Emurphyk/Bayes/bayesrule.html"&gt;this&lt;/a&gt; out), let's review a classical problem, the Monty Hall problem:&lt;br /&gt;&lt;br /&gt;"Suppose you're on a game show, and you're given the choice of three doors: Behind one door is a car; behind the others, goats. You pick a door, say No. 1, and the host, who knows what's behind the doors, opens another door, say No. 2, which has a goat. He then says to you, "Do you want to pick door No. 3?" Is it to your advantage to switch your choice?"&lt;br /&gt;&lt;br /&gt;The first time I saw that problem was in &lt;a href="http://www.amazon.com/Text-Mining-Application-Programming/dp/1584504609"&gt;Text Mining Application Programming&lt;/a&gt;, and the solution went something like this:&lt;br /&gt;&lt;ol&gt;&lt;li&gt;Initially, the probability of the car being behind any of the doors is equal:&lt;br /&gt;&lt;br /&gt;p(p_1)=p(p_2)=p(p_3)=1/3&lt;br /&gt;&lt;br /&gt;&lt;/li&gt;&lt;li&gt;Once the contestant picks door number 1 and the host opens door number 2, the contestant should use Bayes' rule to update his/her beliefs about where the prize is. 
The probability of the host opening door number 2 is 0.5: not yet knowing where the prize is, we consider him equally likely to open either of the two doors the contestant has not selected (the conditionals below give the same value by total probability, 1/2*1/3+0*1/3+1*1/3=1/2):&lt;br /&gt;&lt;br /&gt;p(o_2)=1/2&lt;br /&gt;&lt;br /&gt;Furthermore, we have the following conditionals:&lt;br /&gt;&lt;br /&gt;&lt;ol&gt;&lt;li&gt;The probability of the host opening door 2, given that the prize is behind door 1, is 0.5, since it makes no difference whether the host chooses door 2 or door 3:&lt;br /&gt;&lt;br /&gt;p(o_2|p_1)=1/2&lt;br /&gt;&lt;br /&gt;&lt;/li&gt;&lt;li&gt;The probability of the host picking door 2, given that the prize is behind door 2, is 0:&lt;br /&gt;&lt;br /&gt;p(o_2|p_2)=0&lt;br /&gt;&lt;br /&gt;&lt;/li&gt;&lt;li&gt;The probability of the host picking door 2, given that the prize is behind door 3, is 1:&lt;br /&gt;&lt;br /&gt;p(o_2|p_3)=1&lt;/li&gt;&lt;/ol&gt;&lt;br /&gt;&lt;/li&gt;&lt;li&gt;So, applying Bayes' rule with the numbers we have, we get the following:&lt;br /&gt;&lt;br /&gt;p(p_1|o_2)=p(o_2|p_1)*p(p_1)/p(o_2)=1/3&lt;br /&gt;p(p_2|o_2)=p(o_2|p_2)*p(p_2)/p(o_2)=0&lt;br /&gt;p(p_3|o_2)=p(o_2|p_3)*p(p_3)/p(o_2)=2/3&lt;br /&gt;&lt;br /&gt;&lt;/li&gt;&lt;/ol&gt;According to this, the contestant can double his/her chances of getting the car by switching from door 1 to door 3.  Not an intuitive result, as far as I'm concerned, but nevertheless, the numbers seem to make sense.&lt;br /&gt;&lt;br /&gt;Perhaps you're not convinced you should switch.  Perhaps you think the problem is not defined well enough and leaves too much open to interpretation.  In that case, here is a more detailed statement of the problem:&lt;br /&gt;&lt;br /&gt;"Suppose you’re on a game show and you’re given the choice of three doors. Behind one door is a car; behind the others, goats. The car and the goats were placed randomly behind the doors before the show. The rules of the game show are as follows: After you have chosen a door, the door remains closed for the time being. 
The game show host, Monty Hall, who knows what is behind the doors, now has to open one of the two remaining doors, and the door he opens must have a goat behind it. If both remaining doors have goats behind them, he chooses one randomly. After Monty Hall opens a door with a goat, he will ask you to decide whether you want to stay with your first choice or to switch to the last remaining door. Imagine that you chose Door 1 and the host opens Door 2, which has a goat. He then asks you “Do you want to switch to Door Number 3?” Is it to your advantage to change your choice?"&lt;br /&gt;&lt;br /&gt;Surprisingly, the analysis above still holds, and so does the advantage in switching :)&lt;br /&gt;&lt;br /&gt;Still confused?  Well, let's briefly look at another classical problem, the Three Prisoners problem:&lt;br /&gt;&lt;br /&gt;"Three prisoners, 1, 2, and 3, have been tried for murder.  Exactly one will be hanged tomorrow morning, but only the guard knows who.  1 asks the guard to give a letter to another prisoner — one who will be released.  Later 1 asks the guard to whom he gave the letter. The guard answers '2'.  1 thinks, '2 will be released. Only 3 and I remain. My chances of dying have risen from 1/3 to 1/2.' "&lt;br /&gt;&lt;br /&gt;Is prisoner 1 correct in his thinking?&lt;br /&gt;&lt;br /&gt;I first came across this problem in &lt;a href="http://www.amazon.com/Probabilistic-Reasoning-Intelligent-Systems-Plausible/dp/1558604790"&gt;Probabilistic Reasoning in Intelligent Systems&lt;/a&gt;, where the author explains that no, prisoner 1 is &lt;span style="font-style: italic;"&gt;not&lt;/span&gt; correct in his thinking: his probability of dying remains 1/3.&lt;br /&gt;&lt;br /&gt;Let's apply Bayes' rule to this problem:&lt;br /&gt;&lt;br /&gt;p(g_1|i_2)=p(i_2|g_1)*p(g_1)/p(i_2)=?&lt;br /&gt;&lt;br /&gt;The initial probability that 1 is guilty, p(g_1), is 1/3.  The probability that 2 is innocent, p(i_2), is 2/3.  
Finally, the probability that 2 is innocent given that 1 is guilty, p(i_2|g_1), is 1. So, p(g_1|i_2)=1/2.&lt;br /&gt;&lt;br /&gt;Wait... huh?  I thought prisoner 1's probability of dying was to remain 1/3... what happened?  Also, the Prisoners problem sounds so similar to the Monty Hall problem... assuming p(g_1|i_2)=1/3 is the correct answer, why is it that our beliefs should be updated given the new information in the Monty Hall problem, but not in the Prisoners problem?&lt;br /&gt;&lt;br /&gt;It is at this point that I am reminded of the mathematical proof that 1=2:&lt;br /&gt;&lt;ol&gt;&lt;li&gt;Let a=b&lt;/li&gt;&lt;br /&gt;&lt;li&gt;Then a^2=ab&lt;/li&gt;&lt;br /&gt;&lt;li&gt;a^2+a^2=a^2+ab&lt;/li&gt;&lt;br /&gt;&lt;li&gt;2a^2=a^2+ab&lt;/li&gt;&lt;br /&gt;&lt;li&gt;2a^2-2ab=a^2+ab-2ab&lt;/li&gt;&lt;br /&gt;&lt;li&gt;2a^2-2ab=a^2-ab&lt;/li&gt;&lt;br /&gt;&lt;li&gt;2(a^2-ab)=1(a^2-ab)&lt;/li&gt;&lt;br /&gt;&lt;li&gt;Canceling the (a^2-ab) from both sides gives 1=2&lt;/li&gt;&lt;/ol&gt;&lt;br /&gt;The fallacy is in the last step, where we cancel (a^2-ab) from both sides.  We cannot do that because (a^2-ab)=0 and we cannot divide by zero (see &lt;a href="http://www.math.toronto.edu/mathnet/falseProofs/guess8.html"&gt;here&lt;/a&gt;).  The moral of the story is that we can't blindly apply a method without thinking about what we're doing, a point that is sometimes obvious (as in 1=2) and sometimes much more subtle (as in the Prisoners problem).&lt;br /&gt;&lt;br /&gt;The fallacy in our analysis of the Prisoners problem consists in not conditioning on the right piece of information.  The probability that 2 is innocent, 2/3, is not what we should be conditioning on since that is not exactly the new information that we acquired.  The information we acquired was that the guard said that 2 would be declared innocent.  
The guard could only provide one of two answers to prisoner 1:&lt;br /&gt;&lt;ol&gt;&lt;li&gt;2 will be released, or&lt;/li&gt;&lt;li&gt;3 will be released&lt;/li&gt;&lt;/ol&gt;By symmetry between prisoners 2 and 3, the probability that the guard says 2 will be released, p(r_2), is 1/2, and that is the event we should be conditioning on:&lt;br /&gt;&lt;br /&gt;p(g_1|r_2)=p(r_2|g_1)*p(g_1)/p(r_2)=?&lt;br /&gt;&lt;br /&gt;In this case, the probability that the guard would have said 2 would be released given that 1 is guilty, p(r_2|g_1), is also 1/2, since he could just as well have chosen 3 as the recipient of the letter, so p(g_1|r_2)=(1/2)*(1/3)/(1/2)=1/3.  (As a check on the denominator: p(r_2)=p(r_2|g_1)*p(g_1)+p(r_2|g_2)*p(g_2)+p(r_2|g_3)*p(g_3)=(1/2)*(1/3)+0*(1/3)+1*(1/3)=1/2.)&lt;br /&gt;&lt;br /&gt;As we can see, our beliefs are updated by the Bayesian mechanism in both problems, but whereas in the Monty Hall problem the new information is relevant to our decision, in the Prisoners problem it isn't relevant to prisoner 1's own chances (it does raise prisoner 3's probability of dying to 2/3).&lt;br /&gt;&lt;br /&gt;If you're still not convinced about the Monty Hall problem, check the page &lt;a href="http://en.wikipedia.org/wiki/Monty_Hall_problem"&gt;Wikipedia has for it&lt;/a&gt;, a surprisingly long page.  There you will find a suggestion to consider a &lt;a href="http://en.wikipedia.org/wiki/Monty_Hall_problem#Increasing_the_number_of_doors"&gt;variation of the problem&lt;/a&gt; where there are 1,000,000 doors rather than 3:&lt;br /&gt;&lt;br /&gt;"In this case there are 999,999 doors with goats behind them and one door with a prize. The player picks a door. The game host then opens 999,998 of the other doors revealing 999,998 goats—imagine the host starting with the first door and going down a line of 1,000,000 doors, opening each one, skipping over only the player's door and one other door. The host then offers the player the chance to switch to the only other unopened door. On average, in 999,999 out of 1,000,000 times the other door will contain the prize, as 999,999 out of 1,000,000 times the player first picked a door with a goat. A rational player should switch. 
Intuitively speaking, the player should ask how likely is it, that given a million doors, he or she managed to pick the right one."&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/6475416749449483556-1451711118469337938?l=mechanistician.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://mechanistician.blogspot.com/feeds/1451711118469337938/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://mechanistician.blogspot.com/2010/01/bayes-rule.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/6475416749449483556/posts/default/1451711118469337938'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/6475416749449483556/posts/default/1451711118469337938'/><link rel='alternate' type='text/html' href='http://mechanistician.blogspot.com/2010/01/bayes-rule.html' title='Probabilistic Reasoning'/><author><name>Mechanistician</name><uri>http://www.blogger.com/profile/04405962471175755507</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-6475416749449483556.post-4732130691439688417</id><published>2009-12-11T13:20:00.000-08:00</published><updated>2009-12-31T16:55:04.802-08:00</updated><title type='text'>Boosting</title><content type='html'>A short &lt;a href="http://amachinelearningtutorial.googlecode.com/files/boosting_paper.pdf"&gt;paper&lt;/a&gt; I wrote for a class, exploring boosting for classification, part of a bigger project I'm working on.  
&lt;br /&gt;&lt;br /&gt;Also, here is some code for &lt;a href="http://en.wikipedia.org/wiki/Boosting"&gt;boosting&lt;/a&gt; MATLAB's &lt;a href="http://www.mathworks.com/access/helpdesk/help/toolbox/stats/classregtree.html"&gt;classregtree&lt;/a&gt;:&lt;br /&gt;&lt;br /&gt;&lt;pre name="code" class="python"&gt;&lt;br /&gt;function [ensemble,iter,w]=SAMME_trees(X,y,w,T,ensemble,t)&lt;br /&gt;%m is the number of training instances.&lt;br /&gt;m=size(X,1);&lt;br /&gt;%number of classes.&lt;br /&gt;k=10;&lt;br /&gt;&lt;br /&gt;for iter=t:T-1 &lt;br /&gt;    %train classifier on weighted dataset and compute its error.&lt;br /&gt;    %sample according to w.&lt;br /&gt;    randnum=rand(1,m);&lt;br /&gt;    cW=cumsum(w);&lt;br /&gt;    indices=zeros(1,m);&lt;br /&gt;    for i=1:m,&lt;br /&gt;        %find which bin the random number falls into&lt;br /&gt;        idx=find(randnum(i)&gt;cW, 1,'last')+1;&lt;br /&gt;        if isempty(idx)&lt;br /&gt;            indices(i)=1;&lt;br /&gt;        else&lt;br /&gt;            indices(i)=idx;&lt;br /&gt;        end&lt;br /&gt;    end&lt;br /&gt;    &lt;br /&gt;    [tree,error,diff_vec]=train(X(indices,:),y(indices,1),X,y);&lt;br /&gt;    if error&lt;=0 || error&gt;=(1-(1/k))&lt;br /&gt;        iter=iter-1;&lt;br /&gt;        return&lt;br /&gt;    end&lt;br /&gt;    &lt;br /&gt;    %calculate the weight of the classifier based on its error.&lt;br /&gt;    alpha=log((1-error)/error)+log(k-1);&lt;br /&gt;    &lt;br /&gt;    %add the classifier and its weight to our ensemble.&lt;br /&gt;    ensemble{iter,1}=alpha;&lt;br /&gt;    ensemble{iter,2}=tree;&lt;br /&gt;    &lt;br /&gt;    %update the datapoint weights.&lt;br /&gt;    w=w.*exp(alpha.*(diff_vec.*2-1));&lt;br /&gt;    w=w./sum(w);&lt;br /&gt;    &lt;br /&gt;    disp(['boosting iteration: ',num2str(iter)]);     &lt;br /&gt;end&lt;br /&gt;%nothing here!&lt;br /&gt;end&lt;br /&gt;&lt;br /&gt;function [tree,error,diff_vec]=train(X,y,X_to_pred,y_to_pred)&lt;br /&gt;%train 
classifier.&lt;br /&gt;tree=classregtree(X,y,'method','classification','prune','off');&lt;br /&gt;%predict everything.&lt;br /&gt;y_hat_cell=tree(X_to_pred);&lt;br /&gt;%convert everything to int vector.&lt;br /&gt;y_hat=int32(str2num(char(y_hat_cell)));&lt;br /&gt;%see what was missed (1 marks a misclassified instance, 0 a correct one).&lt;br /&gt;diff_vec=ones(size(y_to_pred));&lt;br /&gt;diff_vec(y_to_pred==y_hat)=0;&lt;br /&gt;error=sum(diff_vec)/size(diff_vec,1);&lt;br /&gt;&lt;br /&gt;disp(['misclassification ratio: ',num2str(error)]);&lt;br /&gt;end&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;Driven with:&lt;br /&gt;&lt;br /&gt;&lt;pre name="code" class="python"&gt;&lt;br /&gt;clc;clear;&lt;br /&gt;stream0=RandStream('mt19937ar','Seed',0);&lt;br /&gt;RandStream.setDefaultStream(stream0);&lt;br /&gt;&lt;br /&gt;out_iters=1;&lt;br /&gt;in_iters=10;&lt;br /&gt;accuracies=zeros(out_iters,in_iters);&lt;br /&gt;accuracies_tolerance_one=zeros(out_iters,in_iters);&lt;br /&gt;    &lt;br /&gt;%read data.&lt;br /&gt;source_data='../data/winequality-white.csv';&lt;br /&gt;covariate_num=12;&lt;br /&gt;data=textread(source_data,'','delimiter',';','headerlines',1);&lt;br /&gt;header=textread(source_data,'%s',covariate_num,'delimiter',';');&lt;br /&gt;k=10;&lt;br /&gt;%m is the full number of data instances (rows).&lt;br /&gt;m=size(data,1);&lt;br /&gt;%n is the number of attributes (minus one, because the last column is the target).&lt;br /&gt;n=size(data,2)-1;&lt;br /&gt;%separate instances into attributes and targets.&lt;br /&gt;X=data(:,1:end-1);&lt;br /&gt;y=int32(data(:,end));&lt;br /&gt;&lt;br /&gt;for out_iter=1:out_iters&lt;br /&gt;        &lt;br /&gt;    %split data into train and test sets.&lt;br /&gt;    probs=rand(1,m);&lt;br /&gt;    prob_thresh=2/3;&lt;br /&gt;    X_train=X(probs&lt;=prob_thresh,:);&lt;br /&gt;    y_train=y(probs&lt;=prob_thresh,:);&lt;br /&gt;    X_test=X(probs&gt;prob_thresh,:);&lt;br /&gt;    y_test=y(probs&gt;prob_thresh,:);&lt;br /&gt;    &lt;br /&gt;    %w is the weight assigned to each data point.&lt;br /&gt; 
   m_train=size(X_train,1);&lt;br /&gt;    w=ones(m_train,1)./m_train;&lt;br /&gt;    %will be storing rows of [alpha,tree].&lt;br /&gt;    boost_iters=2;&lt;br /&gt;    ensemble=cell(in_iters*boost_iters,2);&lt;br /&gt;    t=0;&lt;br /&gt;    &lt;br /&gt;    for in_iter=1:in_iters&lt;br /&gt;        %fit model.&lt;br /&gt;        t=t+1;&lt;br /&gt;        [ensemble,t,w]=SAMME_trees(X_train,y_train,w,boost_iters+t,...&lt;br /&gt;            ensemble,t);&lt;br /&gt;        &lt;br /&gt;        %predict targets on test set.&lt;br /&gt;        m_y_test=size(y_test,1);&lt;br /&gt;        candidates=zeros(m_y_test,k);&lt;br /&gt;        for i=1:t&lt;br /&gt;            alpha=ensemble{i,1};&lt;br /&gt;            tree=ensemble{i,2};&lt;br /&gt;            y_pred_i=int8(str2num(char(tree(X_test))));&lt;br /&gt;            for j=1:m_y_test&lt;br /&gt;                candidates(j,y_pred_i(j,1))=...&lt;br /&gt;                    candidates(j,y_pred_i(j,1))+alpha;&lt;br /&gt;            end&lt;br /&gt;        end&lt;br /&gt;        [val,y_pred]=max(candidates,[],2);&lt;br /&gt;        y_pred=int32(y_pred);&lt;br /&gt;        &lt;br /&gt;        %calculate confusion matrix and accuracies.&lt;br /&gt;        [confus,legend]=confusionmat(y_test,y_pred)&lt;br /&gt;        total=sum(sum(confus))&lt;br /&gt;        accuracy=sum(diag(confus))*100.0/total&lt;br /&gt;        accuracy_tolerance_one=sum(sum(spdiags(confus,[-1 0 1])))*100.0/total;&lt;br /&gt;        &lt;br /&gt;        accuracies(out_iter,in_iter)=accuracy;&lt;br /&gt;        accuracies_tolerance_one(out_iter,in_iter)=accuracy_tolerance_one;&lt;br /&gt;    end&lt;br /&gt;end&lt;br /&gt;accuracies=accuracies';&lt;br /&gt;accuracies_tolerance_one=accuracies_tolerance_one';&lt;br /&gt;&lt;/pre&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/6475416749449483556-4732130691439688417?l=mechanistician.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link 
rel='replies' type='application/atom+xml' href='http://mechanistician.blogspot.com/feeds/4732130691439688417/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://mechanistician.blogspot.com/2009/12/boosting.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/6475416749449483556/posts/default/4732130691439688417'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/6475416749449483556/posts/default/4732130691439688417'/><link rel='alternate' type='text/html' href='http://mechanistician.blogspot.com/2009/12/boosting.html' title='Boosting'/><author><name>Mechanistician</name><uri>http://www.blogger.com/profile/04405962471175755507</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-6475416749449483556.post-7392525061849103517</id><published>2009-12-04T07:54:00.000-08:00</published><updated>2009-12-04T12:31:12.068-08:00</updated><title type='text'>Best Data Mining Algorithm</title><content type='html'>Which data mining algorithm is the best one obviously depends on whatever you may take 'best' to mean, and picking a single one may be a futile exercise in the current state of affairs, but this &lt;a href="http://www.cs.umd.edu/%7Esamir/498/10Algorithms-08.pdf"&gt;paper&lt;/a&gt; (published in 2007) does a good job discussing the &lt;a href="http://www.cs.uvm.edu/%7Eicdm/algorithms/index.shtml"&gt;top 10 data mining algorithms identified by the IEEE International Conference on Data Mining (ICDM)&lt;/a&gt; in December 2006:&lt;br 
/&gt;&lt;ol&gt;&lt;li&gt;C4.5&lt;/li&gt;&lt;li&gt;k-Means&lt;/li&gt;&lt;li&gt;SVM&lt;/li&gt;&lt;li&gt;Apriori&lt;/li&gt;&lt;li&gt;EM&lt;/li&gt;&lt;li&gt;PageRank&lt;/li&gt;&lt;li&gt;AdaBoost&lt;/li&gt;&lt;li&gt;kNN&lt;/li&gt;&lt;li&gt;Naive Bayes&lt;/li&gt;&lt;li&gt;CART&lt;br /&gt;&lt;/li&gt;&lt;/ol&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/6475416749449483556-7392525061849103517?l=mechanistician.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://mechanistician.blogspot.com/feeds/7392525061849103517/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://mechanistician.blogspot.com/2009/12/best-data-mining-algorithm.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/6475416749449483556/posts/default/7392525061849103517'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/6475416749449483556/posts/default/7392525061849103517'/><link rel='alternate' type='text/html' href='http://mechanistician.blogspot.com/2009/12/best-data-mining-algorithm.html' title='Best Data Mining Algorithm'/><author><name>Mechanistician</name><uri>http://www.blogger.com/profile/04405962471175755507</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-6475416749449483556.post-6575419923898161048</id><published>2009-11-25T08:49:00.000-08:00</published><updated>2009-12-15T21:35:51.211-08:00</updated><title type='text'>Easy Learning of Bayesian Networks</title><content type='html'>The other day I came across a very interesting idea, described in this &lt;a 
href="http://www.blogger.com/research.microsoft.com/en-us/um/people/dmax/publications/dn2bn.pdf"&gt;paper&lt;/a&gt;, where the authors discuss learning a Bayesian network by first learning a &lt;a href="http://jmlr.csail.mit.edu/papers/volume1/heckerman00a/html/node5.html"&gt;dependency network&lt;/a&gt;.&lt;br /&gt;&lt;br /&gt;Why would you want to do that?  Well, a dependency network can be thought of as a Bayesian network where cycles are allowed in the graph, and so if you lift the acyclicity restriction, it turns out to be really easy (almost trivial) to learn a probabilistic graphical model: all you have to do is learn a conditional probability distribution (CPD) for each variable given the other variables.  This is a supervised learning problem and we know many ways of solving it (any probabilistic discriminative method, particularly if coupled with a nice feature selection method, will do just fine).  Furthermore, you can learn each node's CPD in parallel.&lt;br /&gt;&lt;br /&gt;Then, to obtain a Bayesian network from the dependency network, all you have to do is remove the least significant edges participating in cycles in the dependency network.  The paper describes a very easy way of doing so.  However, I'm starting to wonder if I'd even want to do that.  Why not just stick with the dependency network?  If causality matters to you, you may argue that causality is not cyclical, or at best simply hard to represent cyclically, but I think that ceases to be an issue when you consider the time dimension (more about this in a future post).&lt;br /&gt;&lt;br /&gt;Causality aside, as you may have expected, a dependency network is not a panacea.  A joint distribution doesn't factorize as nicely over a dependency network as it does over a Bayesian network.  For example, if we had 2 variables, X and Y, with a mutual cyclic dependency (i.e. 
a bidirectional edge between the 2 nodes), we'd hope to be able to write something like: P(X,Y)=P(X|Y)P(Y|X), but if you compare that with the chain rule, P(X,Y)=P(X|Y)P(Y), it forces P(Y|X)=P(Y), so you'd end up with P(X,Y)=P(X)P(Y), which says the 2 are independent.  Also, it is possible, and it does happen often (particularly if learning a single tree for each CPD, something that could be alleviated with surrogate splits), that you may end up with X being a significant variable in Y's conditional distribution, but Y not being significant in X's conditional distribution.  No big deal, though: you can still recover the joint distribution (at least approximately) and answer any probabilistic query using &lt;a href="http://en.wikipedia.org/wiki/Gibbs_sampling"&gt;Gibbs Sampling&lt;/a&gt;, and that might just be good enough.&lt;br /&gt;&lt;br /&gt;For illustration purposes, I used &lt;a href="http://www.r-project.org/"&gt;R&lt;/a&gt;'s &lt;a href="http://cran.r-project.org/web/packages/rpart/index.html"&gt;rpart&lt;/a&gt; package to learn a dependency network structure for this &lt;a href="http://archive.ics.uci.edu/ml/datasets/Adult"&gt;dataset&lt;/a&gt; (using only 10 of the 14 attributes), and this is what it came up with (rendered with &lt;a href="http://www.graphviz.org/"&gt;Graphviz&lt;/a&gt;):&lt;br /&gt;&lt;br /&gt;&lt;a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://3.bp.blogspot.com/_JBvsBkmE5OU/Sw1yDV52_VI/AAAAAAAAAJE/XtjCVDYqo0A/s1600/adult.png"&gt;&lt;img style="cursor: pointer; width: 400px; height: 361px;" src="http://3.bp.blogspot.com/_JBvsBkmE5OU/Sw1yDV52_VI/AAAAAAAAAJE/XtjCVDYqo0A/s400/adult.png" alt="" id="BLOGGER_PHOTO_ID_5408104129286831442" border="0" /&gt;&lt;/a&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/6475416749449483556-6575419923898161048?l=mechanistician.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' 
href='http://mechanistician.blogspot.com/feeds/6575419923898161048/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://mechanistician.blogspot.com/2009/11/easy-learning-of-bayesian-networks.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/6475416749449483556/posts/default/6575419923898161048'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/6475416749449483556/posts/default/6575419923898161048'/><link rel='alternate' type='text/html' href='http://mechanistician.blogspot.com/2009/11/easy-learning-of-bayesian-networks.html' title='Easy Learning of Bayesian Networks'/><author><name>Mechanistician</name><uri>http://www.blogger.com/profile/04405962471175755507</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' url='http://3.bp.blogspot.com/_JBvsBkmE5OU/Sw1yDV52_VI/AAAAAAAAAJE/XtjCVDYqo0A/s72-c/adult.png' height='72' width='72'/><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-6475416749449483556.post-4863413674413981708</id><published>2009-10-03T19:01:00.000-07:00</published><updated>2009-10-04T09:23:59.672-07:00</updated><title type='text'>Reactive vs Deliberative Agents</title><content type='html'>So, what exactly are agents?  
What does it mean for an agent to be reactive or deliberative and what are the [dis]advantages of one over the other?&lt;br /&gt;&lt;br /&gt;From the &lt;a href="http://aima.cs.berkeley.edu/"&gt;Intelligent Agent Book&lt;/a&gt;, an &lt;span style="font-weight: bold;"&gt;agent&lt;/span&gt; is anything that can be viewed as perceiving its &lt;span style="font-weight: bold;"&gt;environment&lt;/span&gt; through &lt;span style="font-weight: bold;"&gt;sensors&lt;/span&gt; and acting upon that environment through &lt;span style="font-weight: bold;"&gt;actuators&lt;/span&gt;.  I would add (as do the authors) that a most important component of the agent is its &lt;span style="font-weight: bold;"&gt;performance measure&lt;/span&gt;, the details of which are not under the control of the agent, but instead are provided to it by its designer/creator (not unlike the reward function in &lt;a href="http://en.wikipedia.org/wiki/Reinforcement_learning"&gt;reinforcement learning&lt;/a&gt;).  It is the purpose of the agent, then, to act in its environment so as to maximize its performance measure.&lt;br /&gt;&lt;br /&gt;It might be interesting to contrast this "designer/creator" approach of reinforcement learning to an approach based purely on evolutionary techniques such as &lt;a href="http://en.wikipedia.org/wiki/Genetic_algorithm"&gt;genetic algorithms&lt;/a&gt;.  Both approaches are in fact characterized by a performance function which must be maximized, but a differentiating characteristic is that reinforcement learning agents learn from [trial-and-error] interaction with the environment whereas evolutionary techniques traditionally don't.  Are these approaches mutually exclusive or are they instead complementary?  Could reproduction be the link which establishes the connection between reinforcement learning and evolutionary techniques?&lt;br /&gt;&lt;br /&gt;There are other properties of agents which may help in organizing them into some sort of taxonomy.  
One such property is the degree of &lt;span style="font-weight: bold;"&gt;autonomy&lt;/span&gt; a given agent possesses. Autonomy can be a measure of how much an agent relies on the prior knowledge of its designer rather than on its own percepts to act in its environment.  The more an agent relies on its own percepts, the more autonomous it is.  Therefore, simplistically, a reinforcement learning agent will tend to be more autonomous than a candidate solution (an agent, if you will) in a genetic algorithm.&lt;br /&gt;&lt;br /&gt;Some examples?  OK, well, the most obvious example of an agent is a human being, who has ears, eyes, etc, for sensors, and hands, legs, etc, for actuators.  Another example would be a robot, which could have cameras and radars for sensors and wheels and robotic arms for actuators:&lt;br /&gt;&lt;br /&gt;&lt;object width="425" height="344"&gt;&lt;param name="movie" value="http://www.youtube.com/v/qbQDJ1c_Nxk&amp;amp;hl=en&amp;amp;fs=1&amp;amp;"&gt;&lt;param name="allowFullScreen" value="true"&gt;&lt;param name="allowscriptaccess" value="always"&gt;&lt;embed src="http://www.youtube.com/v/qbQDJ1c_Nxk&amp;amp;hl=en&amp;amp;fs=1&amp;amp;" type="application/x-shockwave-flash" allowscriptaccess="always" allowfullscreen="true" width="425" height="344"&gt;&lt;/embed&gt;&lt;/object&gt;&lt;br /&gt;&lt;br /&gt;A more abstract example of an agent would be a stock trading software program, designed to "sense" a stream of stock price quotes and provided with the ability to buy or sell those stocks (its "actuators"), a simple instance of which can be found &lt;a href="http://mechanistician.blogspot.com/2009/08/trading-strategy-implementation.html"&gt;here&lt;/a&gt;.&lt;br /&gt;&lt;br /&gt;All of these agents exhibit a certain degree of autonomy (the stock trading agent being the least autonomous; go look at the source code), and can be thought of as entities which sense and act in their environments, and as they do so, they build a model of what happens 
when they take a certain action while the environment is in a given state.  This model, or abstract representation, of its environment could help an agent "reason" about what action to take next so as to maximize its performance measure not only today, but throughout the lifetime of the agent.  The degree to which a model evolves with the agent's experience is an indicator of the agent's level of autonomy (more &lt;a href="http://en.wikipedia.org/wiki/Neuroplasticity"&gt;plasticity&lt;/a&gt; indicates more autonomy).&lt;br /&gt;&lt;br /&gt;A related and very interesting perspective is that described in &lt;a href="http://openlibrary.org/b/OL22498578M/hedonistic_neuron"&gt;The Hedonistic Neuron: A Theory of Memory, Learning, and Intelligence&lt;/a&gt;, by &lt;a href="http://www.stsc.hill.af.mil/crosstalk/1996/02/reinforc.asp"&gt;A. Harry Klopf&lt;/a&gt;.  That name may sound familiar to you, especially if you've read Sutton and Barto's &lt;a href="http://www.cs.ualberta.ca/%7Esutton/book/ebook/the-book.html"&gt;introduction to reinforcement learning&lt;/a&gt;.   Harry Klopf is the person to whom the book is dedicated, and in the last section of chapter 1, &lt;a href="http://www.cs.ualberta.ca/%7Esutton/book/ebook/node12.html"&gt;History of Reinforcement Learning&lt;/a&gt;, there is a mention of how influential Harry was in shaping the authors' ideas on reinforcement learning.  To give you a sense of what "The Hedonistic Neuron" is about, here is how the preface begins:&lt;br /&gt;&lt;br /&gt;"At the level of whole organisms, it is commonly observed that animals pursue pleasure and avoid pain.  At the level of the single neuron, it is well known that the individual brain cell receives two classes of synaptic inputs: excitation and inhibition.  What is the relationship between these two descriptions, one mentalistic and at an organismic level, the other physicalistic and at a neuronal level?  
Is there no simple relationship, as most neuroscientists would be likely to assume today?  Have we described phenomena at two such vastly different levels and in such vastly different terms that the complexity and subtlety of their relationship will defy elucidation for some time to come?  Or, to go to the other extreme, might their relationship be the simplest one conceivable?  Namely, might excitation and inhibition represent, in elementary physical terms, one and the same thing that pleasure and pain represent in complex mental terms?  But, at this point, can we even say precisely what we would mean by such questions?  And even if we can, are not such questions simplistic to the point of being unproductive?  This book seeks to demonstrate that we can say what we mean and that we will benefit from asking such questions."&lt;br /&gt;&lt;br /&gt;Now, the book was published in 1982, and much has changed in neuroscience since then; nevertheless, much remains unclear about &lt;span style="font-style: italic;"&gt;what&lt;/span&gt; it is that neurons do in cognitive terms (even if we do know a little more about &lt;span style="font-style: italic;"&gt;how&lt;/span&gt; it is that they do it in chemical terms).   So, don't expect to read about the latest and greatest explanations regarding the physiological and cognitive properties of the brain if you decide to read Harry's book, but do expect to come out with some very thought-provoking ideas.  
In fact, evidence of the book's transcendence is that today we have a hedonistic synapse hypothesis: "the assumptions are that operant conditioning in living organisms has an underlying neuro-chemical mechanism and that this mechanism could be implemented by means of each synapse optimizing its behavior to harvest the most of globally broadcasted reward" (check out this &lt;a href="http://hebb.mit.edu/people/seung/papers/Neuron18Dec03.pdf"&gt;paper&lt;/a&gt; from &lt;a href="http://hebb.mit.edu/people/seung/index.html"&gt;Sebastian Seung&lt;/a&gt;).    So, it may be synapses and not neurons that exhibit a goal-seeking behavior, but we may be splitting hairs here.  The idea is that we could have a goal-seeking system composed of goal-seeking elements:&lt;br /&gt;&lt;br /&gt;"The theory to be developed explains goal-seeking brain function in terms of goal-seeking neurons.  It has been generally (and implicitly) assumed in the past that advanced (intelligent) goal-seeking brain function &lt;span style="font-style: italic;"&gt;emerges&lt;/span&gt; from the interaction of non-goal-seeking neurons.  Assuming such a passive, non-goal-seeking role for the neuron may not be valid.  The single neuron is a remarkably complex and sophisticated cell and it may well play a more active role.  Perhaps an analogy will help make the point clear.  Consider that goal-seeking social systems (such as the United States) would probably remain mysterious if we assumed that the people making up the social system were non-goal-seeking in nature.  There would probably be no way of explaining complex goal-seeking social system behavior (such as putting a person on the moon) in terms of the interactions of non-goal-seeking people.  
Are we any more likely to be successful in understanding goal-seeking nervous systems by assuming non-goal-seeking neurons?"&lt;br /&gt;&lt;br /&gt;Perhaps embedded in the mechanism by which a neuron makes a binary decision (to fire or not to fire) is the essence of &lt;a href="http://en.wikipedia.org/wiki/Free_will"&gt;free will&lt;/a&gt;?  Could a neuron be described as an agent that senses a stream of inhibitory and excitatory synaptic inputs (possibly encoding the temporal and spatial patterns in the stream into some sort of model) and then decides whether or not to fire (whether or not to act)?  Could clusters of neurons conspire so as to better satisfy their individual hedonistic needs?  What role would knowledge play in a world in which these ideas weren't so far-fetched?&lt;br /&gt;&lt;br /&gt;An answer to that last question could be that knowledge is that which enables an agent to predict the evolution of a given process through time, in scenarios where the agent is simply observing the process or where the agent is attempting to influence the process; think Markov chain vs Markov decision process, respectively (although the Markov assumption, that knowledge of the present renders the past and the future independent, though useful, may be overly simplistic and I do not imply here that it is crucial for more general modeling purposes):&lt;br /&gt;&lt;br /&gt;&lt;table align="center" border="1" cellpadding="0" cellspacing="0"&gt;&lt;colgroup&gt;&lt;col width="192"&gt;&lt;col width="192"&gt;&lt;col width="192"&gt;&lt;col width="192"&gt;&lt;/colgroup&gt;&lt;tbody&gt;&lt;tr&gt;&lt;td colspan="2" rowspan="2" style="text-align: center; width: 1.7313in;"&gt;&lt;p&gt;Markov Models &lt;/p&gt;&lt;/td&gt;&lt;td colspan="2" style="text-align: center; width: 1.7313in;"&gt;&lt;p&gt;Do we have control over the state transitions? 
&lt;/p&gt;&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td style="text-align: center; width: 1.7313in;"&gt;&lt;p class="P1"&gt;No &lt;/p&gt;&lt;/td&gt;&lt;td style="text-align: center; width: 1.7313in;"&gt;&lt;p&gt;Yes &lt;/p&gt;&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td rowspan="2" style="text-align: center; width: 1.7313in;"&gt;&lt;p&gt;Are the states completely observable? &lt;/p&gt;&lt;/td&gt;&lt;td style="text-align: center; width: 1.7313in;"&gt;&lt;p&gt;Yes &lt;/p&gt;&lt;/td&gt;&lt;td style="text-align: center; width: 1.7313in;"&gt;&lt;p&gt;Markov Chain &lt;/p&gt;&lt;/td&gt;&lt;td style="text-align: center; width: 1.7313in;"&gt;&lt;p&gt;Markov Decision Process &lt;/p&gt;&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td style="text-align: center; width: 1.7313in;"&gt;&lt;p&gt;No &lt;/p&gt;&lt;/td&gt;&lt;td style="text-align: center; width: 1.7313in;"&gt;&lt;p&gt;Hidden Markov Model &lt;/p&gt;&lt;/td&gt;&lt;td style="text-align: center; width: 1.7313in;"&gt;&lt;p&gt;Partially Observable Markov Decision Process &lt;/p&gt;&lt;/td&gt;&lt;/tr&gt;&lt;/tbody&gt;&lt;/table&gt;&lt;br /&gt;&lt;br /&gt;So, as an agent perceives and acts in its environment, it builds a model of how it sees the environment evolving, and uses that model to make decisions about what actions to take next so as to maximize its performance measure.  The model, then, represents the agent's knowledge, and it is how the agent uses its model (how the agent "reasons") that differentiates reactive from deliberative agents.&lt;br /&gt;&lt;br /&gt;The simplest kind of a reactive agent is one which possesses a direct mapping between states and actions, so that at each point in time, no matter what state the environment is in, there would be a function which given the current state, would output the best possible action to be taken by the agent, presumably one that would maximize the agent's expected reward from the environment.  
For a concrete example of this kind of agent, take a look at &lt;a href="http://www.youtube.com/watch?v=yCqPMD6coO8&amp;amp;feature=PlayList&amp;amp;p=A89DCFA6ADACE599&amp;amp;index=19"&gt;lecture 20&lt;/a&gt; of CS229, where &lt;span style="font-weight: bold;"&gt;policy search&lt;/span&gt; algorithms are discussed (the kind of reinforcement learning algorithms that have been successfully applied to flying autonomous helicopters).  Briefly, a &lt;span style="font-weight: bold;"&gt;policy&lt;/span&gt; is a function that maps states to actions, and the algorithm &lt;span style="font-weight: bold;"&gt;searches&lt;/span&gt; for a policy that maximizes the agent's expected return (very much like how gradient descent algorithms can be used to search for a set of parameters which maximize a function's fit to certain data).&lt;br /&gt;&lt;br /&gt;So, an agent whose controller/brain was trained using a policy search algorithm would know exactly what to do on environment states it was specifically trained/simulated on, but it would be severely limited on anything beyond that.  For complex tasks, training a controller sure beats hand-coding it, but for even more complex tasks, policy search may require extensive, possibly impractical, training.  This training usually happens in a simulator, and the line between reactive and deliberative agents can be blurry (especially considering the generalization power of learning continuous functions), but for illustration purposes, imagine a simple reactive agent that is faced with an environment state it has never been trained on and therefore has no state-action mapping for this particular state: the agent would be at a loss as to what to do.  Granted, we could have some sort of default action for unfamiliar states, but I hope you get a feel for how restrictive this approach can be, or conversely, how much work you have to do in a simulator to train a good controller of this kind.  
On the other hand, given a state, a policy found by policy search requires minimal computation to select an action, so such agents can have very fast "reaction times."&lt;br /&gt;&lt;br /&gt;Now what if we equip our agent with some more tools so it isn't as stuck if it finds itself in an unfamiliar situation?  We could design our agent so that it has a model of the environment that allows it to "predict" how the environment will transition from state to state given a certain action by the agent.  In addition, this agent could learn a state-value function based on rewards it receives from the environment when in a given state (that way it knows what states it prefers and hopefully will seek to be in those states).  Voila!  Now the agent can ponder/deliberate on the results of its actions and presumably select a plan (a sequence of actions), not just a single action.  Algorithms that do this fall under the umbrella of value-function approximation (see &lt;a href="http://www.youtube.com/watch?v=LKdFTsM3hl4&amp;amp;feature=PlayList&amp;amp;p=A89DCFA6ADACE599&amp;amp;index=16"&gt;lecture 17&lt;/a&gt; of CS229), and evidently, agents driven by this kind of algorithm appear to be much more flexible than agents based on policy search.  This deliberation, however, requires computational resources; in other words, it takes time to simulate several levels of state transitions resulting from action sequences.  We have several options in terms of how to model the state-transition function and the value function, giving rise to a broad spectrum of agents with probabilistic planning capabilities (see &lt;a href="http://robots.stanford.edu/"&gt;Sebastian Thrun&lt;/a&gt;'s &lt;a href="http://mitpress.mit.edu/catalog/item/default.asp?ttype=2&amp;amp;tid=10668"&gt;Probabilistic Robotics&lt;/a&gt; for more details).&lt;br /&gt;&lt;br /&gt;Can we do more to enhance our agents' deliberative capabilities?  
Actually, there are some very neat ideas in &lt;a href="http://bayes.cs.ucla.edu/BOOK-2K/"&gt;Causality&lt;/a&gt; that suggest we can.  The reading can be somewhat dense, but the punchline is that the kinds of models and algorithms presented therein may allow an agent to infer the consequences of actions that it did not know a priori it was capable of executing.  So, rather than asking itself "how would things turn out if I took this action?", an agent could ask itself "how would things turn out if I could take this action?", and if the outcome were favorable, the agent could work towards being able to take the hypothesized action.&lt;br /&gt;&lt;br /&gt;The kind of knowledge that would allow an agent to ask and answer that hypothetical question is much more complex than the knowledge with which we have traditionally endowed classical reinforcement learning agents (state transition probabilities and possibly a richly structured graphical representation of a given time slice).  It is the kind of knowledge that would require causal modeling and causal reasoning, the kind of knowledge which would allow an agent to answer interventional queries from non-experimental data and to answer &lt;a href="http://en.wikipedia.org/wiki/Counterfactual_conditional"&gt;counterfactual&lt;/a&gt; queries; basically, the kind of knowledge that would allow an agent to run complex &lt;a href="http://en.wikipedia.org/wiki/Thought_experiment"&gt;thought experiments&lt;/a&gt;.  
That's some pretty powerful stuff!&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/6475416749449483556-4863413674413981708?l=mechanistician.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://mechanistician.blogspot.com/feeds/4863413674413981708/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://mechanistician.blogspot.com/2009/10/reactive-vs-deliberative-agents.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/6475416749449483556/posts/default/4863413674413981708'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/6475416749449483556/posts/default/4863413674413981708'/><link rel='alternate' type='text/html' href='http://mechanistician.blogspot.com/2009/10/reactive-vs-deliberative-agents.html' title='Reactive vs Deliberative Agents'/><author><name>Mechanistician</name><uri>http://www.blogger.com/profile/04405962471175755507</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-6475416749449483556.post-5865097474557375879</id><published>2009-09-23T17:10:00.000-07:00</published><updated>2009-09-23T17:11:06.357-07:00</updated><title type='text'>Singularity University</title><content type='html'>Cool video from the &lt;a href="http://singularityu.org/"&gt;Singularity University&lt;/a&gt;:&lt;br /&gt;&lt;br /&gt;&lt;object width="560" height="340"&gt;&lt;param name="movie" value="http://www.youtube.com/v/63VRMvyIWT4&amp;hl=en&amp;fs=1&amp;"&gt;&lt;/param&gt;&lt;param name="allowFullScreen" value="true"&gt;&lt;/param&gt;&lt;param name="allowscriptaccess" value="always"&gt;&lt;/param&gt;&lt;embed 
src="http://www.youtube.com/v/63VRMvyIWT4&amp;hl=en&amp;fs=1&amp;" type="application/x-shockwave-flash" allowscriptaccess="always" allowfullscreen="true" width="560" height="340"&gt;&lt;/embed&gt;&lt;/object&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/6475416749449483556-5865097474557375879?l=mechanistician.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://mechanistician.blogspot.com/feeds/5865097474557375879/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://mechanistician.blogspot.com/2009/09/singularity-university.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/6475416749449483556/posts/default/5865097474557375879'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/6475416749449483556/posts/default/5865097474557375879'/><link rel='alternate' type='text/html' href='http://mechanistician.blogspot.com/2009/09/singularity-university.html' title='Singularity University'/><author><name>Mechanistician</name><uri>http://www.blogger.com/profile/04405962471175755507</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-6475416749449483556.post-5914061244116706499</id><published>2009-09-12T13:18:00.000-07:00</published><updated>2009-09-14T07:44:45.008-07:00</updated><title type='text'>Bayesian Network Approximate Inference</title><content type='html'>Sampling methods for bayesian network inference, also known as particle-based methods (each sample being a particle), are just like any other Monte Carlo method: the more samples, the better the approximation (approximate inference, as exact inference, is &lt;a 
href="http://en.wikipedia.org/wiki/NP-hard"&gt;NP-hard&lt;/a&gt;, that is, if you want arbitrarily accurate answers).  A sample consists of a particular instantiation of the bayesian network (an assignment of values to each of the nodes in the network) and the approximation sought is that which answers a probabilistic query, which in its most basic form would look something like p(x|e)=?, where 'x' are unobserved nodes, 'e' are evidence/observed nodes, and '?' would be the posterior distribution of 'x' given 'e'.&lt;br /&gt;&lt;br /&gt;Though there are many other kinds of algorithms for performing inference in probabilistic graphical models (PGM), I like sampling methods because they are very intuitive, they highlight the &lt;a href="http://en.wikipedia.org/wiki/Generative_model"&gt;generative&lt;/a&gt; nature of PGM's, and finally, because they are more generally applicable than the other methods (even if often a bit challenging to get to work properly).&lt;br /&gt;&lt;br /&gt;For illustration purposes, we will use Likelihood Weighting (LW), which is a special case of a general sampling framework called Importance Sampling, which you can learn more about in this &lt;a href="http://videolectures.net/mlss08au_freitas_asm/"&gt;video lecture&lt;/a&gt;.&lt;br /&gt;&lt;br /&gt;In LW, to generate a particle, we sample the nodes in the network in topological order (so that parent nodes are sampled before their children), using the nodes' conditional probability distributions (CPD's), and setting evidence nodes to their observed values.  Finally, we assign to the resulting particle a weight equal to the likelihood of the evidence: the product, over the evidence nodes, of the probability of each observed value given the values of that node's parents in the particle.  With a set of weighted particles in hand, we can then estimate a conditional distribution for the query variables.  
More concretely (using &lt;a href="http://people.cs.ubc.ca/~murphyk/Software/BNT/bnt.html"&gt;BNT&lt;/a&gt; and the same network as &lt;a href="http://mechanistician.blogspot.com/2009/09/probabilistic-graphical-models.html"&gt;last time&lt;/a&gt;):&lt;br /&gt;&lt;br /&gt;&lt;pre name="code" class="python"&gt;&lt;br /&gt;clear;&lt;br /&gt;clc;&lt;br /&gt;&lt;br /&gt;%number of nodes&lt;br /&gt;N=4; &lt;br /&gt;%graph adjacency matrix&lt;br /&gt;dag=zeros(N,N);&lt;br /&gt;%nodes ordered topologically&lt;br /&gt;C=1; &lt;br /&gt;R=2;&lt;br /&gt;S=3;&lt;br /&gt;W=4;&lt;br /&gt;%set graph edges&lt;br /&gt;dag(C,[R S])=1;&lt;br /&gt;dag(R,W)=1;&lt;br /&gt;dag(S,W)=1;&lt;br /&gt;&lt;br /&gt;false=1;&lt;br /&gt;true=2;&lt;br /&gt;&lt;br /&gt;%node sizes vector (all binary)&lt;br /&gt;ns=2*ones(1,N); &lt;br /&gt;%create the bayesian network&lt;br /&gt;bnet = mk_bnet(dag,ns);&lt;br /&gt;%create the nodes' CPD's&lt;br /&gt;bnet.CPD{C}=tabular_CPD(bnet,C,[0.5 0.5]);&lt;br /&gt;bnet.CPD{R}=tabular_CPD(bnet,R,[0.8 0.2 0.2 0.8]);&lt;br /&gt;bnet.CPD{S}=tabular_CPD(bnet,S,[0.5 0.9 0.5 0.1]);&lt;br /&gt;bnet.CPD{W}=tabular_CPD(bnet,W,[1 0.1 0.1 0.01 0 0.9 0.9 0.99]);&lt;br /&gt;&lt;br /&gt;%create the LW inference engine&lt;br /&gt;engine=likelihood_weighting_inf_engine(bnet);&lt;br /&gt;&lt;br /&gt;%assume no observed nodes&lt;br /&gt;evidence=cell(1,N);&lt;br /&gt;samples=100;&lt;br /&gt;[engine,loglik]=enter_evidence(engine,evidence,samples);&lt;br /&gt;%what's the prior probability for 'wet grass'?&lt;br /&gt;distr=marginal_nodes(engine,W);&lt;br /&gt;distr.T&lt;br /&gt;&lt;br /&gt;%now assume sprinkler is off, what's the&lt;br /&gt;%conditional probability for 'wet grass'?&lt;br /&gt;evidence{S}=false;&lt;br /&gt;[engine,loglik]=enter_evidence(engine,evidence,samples);&lt;br /&gt;distr=marginal_nodes(engine,W);&lt;br /&gt;distr.T&lt;br /&gt;&lt;br /&gt;%now assume it is raining, what's the&lt;br /&gt;%conditional probability for 'wet grass' now?&lt;br /&gt;evidence{R}=true;&lt;br 
/&gt;[engine,loglik]=enter_evidence(engine,evidence,samples);&lt;br /&gt;distr=marginal_nodes(engine,W);&lt;br /&gt;distr.T&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;Which yields:&lt;br /&gt;&lt;br /&gt;&lt;pre name="code" class="python"&gt;&lt;br /&gt;%prior for W implies grass is slightly more &lt;br /&gt;%likely to be wet than not&lt;br /&gt;ans =&lt;br /&gt;&lt;br /&gt;    0.3700&lt;br /&gt;    0.6300&lt;br /&gt;&lt;br /&gt;%once we learn the sprinkler is off,&lt;br /&gt;%the grass is almost as likely to be wet&lt;br /&gt;%as it is to be dry&lt;br /&gt;ans =&lt;br /&gt;&lt;br /&gt;    0.4778&lt;br /&gt;    0.5222&lt;br /&gt;&lt;br /&gt;%once we learn it is raining though, the&lt;br /&gt;%grass becomes much more likely to be wet&lt;br /&gt;ans =&lt;br /&gt;&lt;br /&gt;    0.1433&lt;br /&gt;    0.8567&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;The magic happens when we make the call to 'enter_evidence', which looks like this:&lt;br /&gt;&lt;br /&gt;&lt;pre name="code" class="python"&gt;&lt;br /&gt;function [engine, ll] = enter_evidence(engine, evidence, nsamples)&lt;br /&gt;% ENTER_EVIDENCE Add the specified evidence to the network (likelihood_weighting)&lt;br /&gt;% [engine, ll] = enter_evidence(engine, evidence, nsamples)&lt;br /&gt;% evidence{i} = [] if X(i) is hidden, and otherwise contains its observed value &lt;br /&gt;%(scalar or column vector)&lt;br /&gt;%&lt;br /&gt;% If nsamples is not specified, the value specified when the engine was created will be used.&lt;br /&gt;% ll (log-likelihood) is set to [].&lt;br /&gt;&lt;br /&gt;ll = [];&lt;br /&gt;if nargin &lt; 3, nsamples = engine.nsamples; end&lt;br /&gt;&lt;br /&gt;bnet = bnet_from_engine(engine);&lt;br /&gt;N = length(bnet.dag);&lt;br /&gt;samples = cell(nsamples, N);&lt;br /&gt;weights = zeros(1, nsamples);&lt;br /&gt;&lt;br /&gt;ns = bnet.node_sizes;&lt;br /&gt;original_evidence = evidence;&lt;br /&gt;observed = ~isemptycell(original_evidence);&lt;br /&gt;for s=1:nsamples&lt;br /&gt;  
evidence = original_evidence(:); % must be a column vector&lt;br /&gt;  w = 1;&lt;br /&gt;  for i=1:N&lt;br /&gt;    ps = parents(bnet.dag, i);&lt;br /&gt;    e = bnet.equiv_class(i);&lt;br /&gt;    if observed(i)&lt;br /&gt;      p = exp(log_prob_node(bnet.CPD{e}, evidence(i), evidence(ps)));&lt;br /&gt;      w = w * p;&lt;br /&gt;    else&lt;br /&gt;      x = sample_node(bnet.CPD{e}, evidence(ps));&lt;br /&gt;      evidence{i} = x;&lt;br /&gt;    end&lt;br /&gt;  end&lt;br /&gt;  samples(s,:) = evidence;&lt;br /&gt;  weights(s) = w;&lt;br /&gt;end                 &lt;br /&gt;&lt;br /&gt;engine.samples = samples;&lt;br /&gt;engine.weights = weights;&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;log_prob_node uses a node's CPD to compute the [log] probability of an observed node's value conditioned on the values of its parents; the product of these probabilities over all the observed nodes is the weight of the particle.  sample_node draws a value for an unobserved node given the values of its parents.  The weighted particles are then aggregated and normalized in the call to marginal_nodes.  
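&lt;br /&gt;&lt;br /&gt;To make the weighting and normalization steps concrete, here is a minimal sketch of the same procedure in plain Matlab, without BNT, for the last query above (sprinkler off, raining).  It is a simplified illustration under stated assumptions, not BNT's implementation: the variable names (pC, pR, pS, pW, wsum) are made up for this sketch, the CPD tables are copied from the script above, and states are coded as 1=false, 2=true.  Since both of W's parents are observed in this query, the exact answer can be read off W's CPD, P(W=true|R=true,S=false)=0.9, so the estimate should land near [0.1 0.9]:&lt;br /&gt;&lt;br /&gt;&lt;pre name="code" class="python"&gt;&lt;br /&gt;%likelihood weighting by hand (illustrative sketch, not BNT code)&lt;br /&gt;%states: 1=false, 2=true; evidence: S=false, R=true; query: W&lt;br /&gt;pC=[0.5 0.5];          %P(C)&lt;br /&gt;pR=[0.8 0.2; 0.2 0.8]; %pR(c,r)=P(R=r|C=c)&lt;br /&gt;pS=[0.5 0.5; 0.9 0.1]; %pS(c,s)=P(S=s|C=c)&lt;br /&gt;%pW(r,s,w)=P(W=w|R=r,S=s)&lt;br /&gt;pW=cat(3,[1 0.1; 0.1 0.01],[0 0.9; 0.9 0.99]);&lt;br /&gt;&lt;br /&gt;r=2; s=1;              %observed values of R and S&lt;br /&gt;nsamples=1000;&lt;br /&gt;wsum=zeros(1,2);       %total weight of particles with W=false, W=true&lt;br /&gt;for k=1:nsamples&lt;br /&gt;  %sample the unobserved root C from its prior&lt;br /&gt;  c=1+(rand &gt; pC(1));&lt;br /&gt;  %R and S are observed: multiply their likelihoods into the weight&lt;br /&gt;  w=pR(c,r)*pS(c,s);&lt;br /&gt;  %sample the unobserved node W given its (observed) parents&lt;br /&gt;  x=1+(rand &gt; pW(r,s,1));&lt;br /&gt;  wsum(x)=wsum(x)+w;&lt;br /&gt;end&lt;br /&gt;%normalize the accumulated weights to estimate P(W|S=false,R=true)&lt;br /&gt;wsum/sum(wsum)&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;Note that every particle in this sketch gets one of only two weights, 0.2*0.5=0.1 when C is sampled as false and 0.8*0.9=0.72 when C is sampled as true, which are the same two weights that show up in the BNT run.&lt;br /&gt;&lt;br /&gt;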
Below is a fraction of the particles and their weights generated in this example:&lt;br /&gt;&lt;br /&gt;&lt;pre name="code" class="python"&gt;&lt;br /&gt;%sprinkler is off and it is raining&lt;br /&gt;%nodes are in the specified topological order&lt;br /&gt;[1]    [2]    [1]    [2]&lt;br /&gt;[2]    [2]    [1]    [2]&lt;br /&gt;[2]    [2]    [1]    [2]&lt;br /&gt;&lt;br /&gt;%weights associated with each of those particles&lt;br /&gt;0.1000    &lt;br /&gt;0.7200    &lt;br /&gt;0.7200    &lt;br /&gt;&lt;/pre&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/6475416749449483556-5914061244116706499?l=mechanistician.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://mechanistician.blogspot.com/feeds/5914061244116706499/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://mechanistician.blogspot.com/2009/09/bayesian-network-approximate-inference.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/6475416749449483556/posts/default/5914061244116706499'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/6475416749449483556/posts/default/5914061244116706499'/><link rel='alternate' type='text/html' href='http://mechanistician.blogspot.com/2009/09/bayesian-network-approximate-inference.html' title='Bayesian Network Approximate Inference'/><author><name>Mechanistician</name><uri>http://www.blogger.com/profile/04405962471175755507</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' 
src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-6475416749449483556.post-2540580939858404607</id><published>2009-09-09T10:18:00.001-07:00</published><updated>2009-09-15T13:48:22.838-07:00</updated><title type='text'>Probabilistic Graphical Models</title><content type='html'>So, I recently got a copy of &lt;a href="http://robotics.stanford.edu/%7Ekoller/"&gt;Daphne Koller&lt;/a&gt;'s &lt;a href="http://mitpress.mit.edu/catalog/item/default.asp?ttype=2&amp;amp;tid=11886"&gt;Probabilistic Graphical Models (PGM)&lt;/a&gt;, and it is massive: 1200 pages of pretty much everything you've ever wanted to know about bayesian [and markov] networks.&lt;br /&gt;&lt;br /&gt;To be honest, I would have liked to have seen some stuff on modeling Markov Decision Processes with Dynamic Bayesian Networks, but the book already covers so much material that I'm willing to overlook that detail (she does mention some good pointers on DBN's for MDP's at the end of chapter 23).&lt;br /&gt;&lt;br /&gt;There may be gentler introductions into the field out there (&lt;a href="http://www.amazon.com/Learning-Bayesian-Networks-Richard-Neapolitan/dp/0130125342"&gt;Learning Bayesian Networks&lt;/a&gt;, for example), but if you want to go beyond the introduction and build a good, solid foundation, PGM is the way to go.&lt;br /&gt;&lt;br /&gt;The content is structured within 4 main sections:&lt;br /&gt;&lt;ol&gt;&lt;li&gt;Representation&lt;/li&gt;&lt;li&gt;Inference&lt;/li&gt;&lt;li&gt;Learning&lt;/li&gt;&lt;li&gt;Actions and Decisions&lt;/li&gt;&lt;/ol&gt;And you can take a look at the rest in the &lt;a href="http://mitpress.mit.edu/books/chapters/0262013193refs1.pdf"&gt;table of contents&lt;/a&gt;, but those 4 sections are a very descriptive way of summarizing what PGM's are all about.  Briefly, we assume that a knowledge base can be represented as a joint probability distribution.  
This is by no means a far-fetched assumption: even if you've never taken a probability class, you use the concepts in your daily discourse.  Consider the following examples, from &lt;a href="http://bayes.cs.ucla.edu/jp_home.html"&gt;Pearl&lt;/a&gt;'s &lt;a href="http://www.amazon.com/Probabilistic-Reasoning-Intelligent-Systems-Plausible/dp/1558604790"&gt;Probabilistic Reasoning in Intelligent Systems (PRIS)&lt;/a&gt;:&lt;br /&gt;&lt;ol&gt;&lt;li&gt;Likelihood: "Tim is more &lt;span style="font-style: italic;"&gt;likely&lt;/span&gt; to fly than to walk"&lt;/li&gt;&lt;li&gt;Conditioning: "&lt;span style="font-style: italic;"&gt;If&lt;/span&gt; Tim is sick, he can't fly"&lt;br /&gt;&lt;/li&gt;&lt;li&gt;Relevance: "Whether Tim flies &lt;span style="font-style: italic;"&gt;depends&lt;/span&gt; on whether he is sick"&lt;/li&gt;&lt;li&gt;Causation: "Being sick &lt;span style="font-style: italic;"&gt;caused&lt;/span&gt; Tim's inability to fly"&lt;/li&gt;&lt;/ol&gt;Believe it or not, when you use language like that, you are in fact making probabilistic assertions :)&lt;br /&gt;&lt;br /&gt;Pearl wrote PRIS at a time when most people were trying to build knowledge systems based on &lt;a href="http://en.wikipedia.org/wiki/Logic"&gt;logic&lt;/a&gt;, but that approach turned out to be riddled with problems, a big one being the logic systems' inability to handle uncertainty.  So, building on the work of those who came before him, he successfully made the case for and launched us into this new era of structured probabilistic reasoning systems.&lt;br /&gt;&lt;br /&gt;The reason for adding the qualifier "structured" and not simply "probabilistic reasoning systems," is that naively manipulating a joint probability distribution can be a very computationally expensive proposition.  So, we exploit the structure inherent in the problem domain.  
More specifically, we exploit conditional independences in our joint probability distribution to drastically reduce the number of computations we need to perform (basically, as you head out the door for a short evening walk, you don't need to know what the traffic is like in Japan in order to decide whether or not to take an umbrella with you in New York).  It turns out that &lt;a href="http://en.wikipedia.org/wiki/Graph_theory"&gt;graphs&lt;/a&gt; (networks) are a very natural data structure for exploiting these conditional independences and for writing computer algorithms that do so.&lt;br /&gt;&lt;br /&gt;PRIS is a very interesting read, even though it may be slightly dated by now.  Highlights include several discussions of the merits of probabilistic systems over logic systems, among them a discussion of &lt;a href="http://en.wikipedia.org/wiki/George_P%C3%B3lya"&gt;Polya&lt;/a&gt;, who "argued that the process of discovery, even in as formal a field as mathematics, is guided by nondeductive inference mechanisms, entailing a lot of guesswork" (as opposed to a system that would have all the basic facts and would reach all other conclusions through deductive inference).  The discussion is about Polya's so-called "&lt;a href="http://www.amazon.com/Patterns-Plausible-Inference-G-Polya/dp/B000GS0CBI"&gt;patterns of plausible inference&lt;/a&gt;," which was his term for the nondeductive inference rules he identified as potential reasoning mechanisms for a reasoning system.  
Interestingly, after he had identified a bunch of these patterns and attempted to formalize them into a coherent set of reasoning rules,  he [wisely] basically said "screw it," and settled upon the calculus of probabilities as a meta-pattern from which all his patterns could be derived.&lt;br /&gt;&lt;br /&gt;So, we have PGM's as an efficient way to represent joint probability distributions, and for a mental image, you can just think of a graph whose nodes are strategically linked to each other so as to maximally exploit conditional independences; the nodes representing the random variables of interest in our domain.  Each node has attached to it a conditional probability distribution (CPD), as is the case in a bayesian network, or more generally, a factor (as in the case of a markov network), and we say that the joint probability distribution factors into the CPD's and over the graph.  Pictorially, this:&lt;br /&gt;&lt;br /&gt;&lt;pre name="code" class="python"&gt;&lt;br /&gt;joint(:,:,1,1) =&lt;br /&gt;&lt;br /&gt;    0.2000    0.0200&lt;br /&gt;    0.0900    0.0010&lt;br /&gt;&lt;br /&gt;joint(:,:,2,1) =&lt;br /&gt;&lt;br /&gt;    0.0050    0.0005&lt;br /&gt;    0.0360    0.0004&lt;br /&gt;&lt;br /&gt;joint(:,:,1,2) =&lt;br /&gt;&lt;br /&gt;         0    0.1800&lt;br /&gt;         0    0.0090&lt;br /&gt;&lt;br /&gt;joint(:,:,2,2) =&lt;br /&gt;&lt;br /&gt;    0.0450    0.0495&lt;br /&gt;    0.3240    0.0396&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;Splits into each of the CPD's (depicted below as tables next to each node), and each CPD goes to its node as specified by the structure of the graph:&lt;br /&gt;&lt;br /&gt;&lt;a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://1.bp.blogspot.com/_JBvsBkmE5OU/SqggOjSM5UI/AAAAAAAAAI8/8SS427FoRp8/s1600-h/bayesianGraph.png"&gt;&lt;img style="cursor: pointer; width: 400px; height: 333px;" 
src="http://1.bp.blogspot.com/_JBvsBkmE5OU/SqggOjSM5UI/AAAAAAAAAI8/8SS427FoRp8/s400/bayesianGraph.png" alt="" id="BLOGGER_PHOTO_ID_5379585189255898434" border="0" /&gt;&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;The joint distribution can be easily generated with this script, using &lt;a href="http://people.cs.ubc.ca/%7Emurphyk/Software/BNT/bnt.html"&gt;BNT&lt;/a&gt;:&lt;br /&gt;&lt;br /&gt;&lt;pre name="code" class="python"&gt;&lt;br /&gt;N = 4; &lt;br /&gt;dag = zeros(N,N);&lt;br /&gt;C = 1; S = 2; R = 3; W = 4;&lt;br /&gt;dag(C,[R S]) = 1;&lt;br /&gt;dag(R,W) = 1;&lt;br /&gt;dag(S,W)=1;&lt;br /&gt;&lt;br /&gt;false = 1; true = 2;&lt;br /&gt;ns = 2*ones(1,N); % binary nodes&lt;br /&gt;&lt;br /&gt;%bnet = mk_bnet(dag, ns);&lt;br /&gt;bnet = mk_bnet(dag, ns, 'names', {'cloudy','S','R','W'}, 'discrete', 1:4);&lt;br /&gt;names = bnet.names;&lt;br /&gt;%C = names{'cloudy'};&lt;br /&gt;bnet.CPD{C} = tabular_CPD(bnet, C, [0.5 0.5]);&lt;br /&gt;bnet.CPD{R} = tabular_CPD(bnet, R, [0.8 0.2 0.2 0.8]);&lt;br /&gt;bnet.CPD{S} = tabular_CPD(bnet, S, [0.5 0.9 0.5 0.1]);&lt;br /&gt;bnet.CPD{W} = tabular_CPD(bnet, W, [1 0.1 0.1 0.01 0 0.9 0.9 0.99]);&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;CPD{C} = reshape([0.5 0.5], 2, 1);&lt;br /&gt;CPD{R} = reshape([0.8 0.2 0.2 0.8], 2, 2);&lt;br /&gt;CPD{S} = reshape([0.5 0.9 0.5 0.1], 2, 2);&lt;br /&gt;CPD{W} = reshape([1 0.1 0.1 0.01 0 0.9 0.9 0.99], 2, 2, 2);&lt;br /&gt;joint = zeros(2,2,2,2);&lt;br /&gt;for c=1:2&lt;br /&gt;  for r=1:2&lt;br /&gt;    for s=1:2&lt;br /&gt;      for w=1:2&lt;br /&gt; joint(c,s,r,w) = CPD{C}(c) * CPD{S}(c,s) * CPD{R}(c,r) * ...&lt;br /&gt;     CPD{W}(s,r,w);&lt;br /&gt;      end&lt;br /&gt;    end&lt;br /&gt;  end&lt;br /&gt;end&lt;br /&gt;&lt;br /&gt;joint&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;The 'representation' section of the book goes into all the gory details, not just on bayesian networks, but markov networks as well, and deeper into how to represent the CPD's, template-based representations, etc, 
etc.&lt;br /&gt;&lt;br /&gt;OK, so we've got this great way to represent knowledge, now how do we use it?  Well, we use it to answer probabilistic queries, and the algorithms for doing so are the topic of the 'inference' section of the book, where we infer/reason with our knowledge base.&lt;br /&gt;&lt;br /&gt;Now, it can be a pain to create these models by hand, which would entail:&lt;br /&gt;&lt;ol&gt;&lt;li&gt;Creating the graph structure, being careful to maximize our ability to exploit conditional independences&lt;/li&gt;&lt;li&gt;Parameterizing the graph with CPD's&lt;/li&gt;&lt;/ol&gt;Fortunately, there exist algorithms to ease that burden, and those are discussed in the 'learning' section, where the goal is to automatically learn/infer the structure and parameterization of the network from raw data.&lt;br /&gt;&lt;br /&gt;Finally, the last section, 'actions and decisions', brings it all together and provides insights into issues one might have to deal with in order to build an intelligent agent based on these principles.  
Topics discussed include causality, utilities and decisions, and structured decision problems.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/6475416749449483556-2540580939858404607?l=mechanistician.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://mechanistician.blogspot.com/feeds/2540580939858404607/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://mechanistician.blogspot.com/2009/09/probabilistic-graphical-models.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/6475416749449483556/posts/default/2540580939858404607'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/6475416749449483556/posts/default/2540580939858404607'/><link rel='alternate' type='text/html' href='http://mechanistician.blogspot.com/2009/09/probabilistic-graphical-models.html' title='Probabilistic Graphical Models'/><author><name>Mechanistician</name><uri>http://www.blogger.com/profile/04405962471175755507</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' url='http://1.bp.blogspot.com/_JBvsBkmE5OU/SqggOjSM5UI/AAAAAAAAAI8/8SS427FoRp8/s72-c/bayesianGraph.png' height='72' width='72'/><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-6475416749449483556.post-9186635698847892323</id><published>2009-08-24T15:04:00.000-07:00</published><updated>2009-08-24T15:54:14.396-07:00</updated><title type='text'>Machine Learning and Computational Finance</title><content type='html'>Videos for the recent &lt;a href="http://web.mac.com/davidrh/AMLCF09/Workshop.html"&gt;Advances in Machine Learning for Computational Finance&lt;/a&gt; 
conference are posted on the &lt;a href="http://videolectures.net/"&gt;videolectures.net&lt;/a&gt; site:&lt;br /&gt;&lt;br /&gt;&lt;a href='http://videolectures.net/amlcf09_london/'&gt;&lt;br /&gt;  &lt;img src='http://videolectures.net/amlcf09_london/thumb.jpg' border=0/&gt;&lt;br /&gt;  &lt;br/&gt;Advances in Machine Learning for Computational Finance&lt;/a&gt;&lt;br/&gt;&lt;br /&gt;&lt;br /&gt;Good stuff!&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/6475416749449483556-9186635698847892323?l=mechanistician.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://mechanistician.blogspot.com/feeds/9186635698847892323/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://mechanistician.blogspot.com/2009/08/machine-learning-and-computational.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/6475416749449483556/posts/default/9186635698847892323'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/6475416749449483556/posts/default/9186635698847892323'/><link rel='alternate' type='text/html' href='http://mechanistician.blogspot.com/2009/08/machine-learning-and-computational.html' title='Machine Learning and Computational Finance'/><author><name>Mechanistician</name><uri>http://www.blogger.com/profile/04405962471175755507</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-6475416749449483556.post-2228379127184070114</id><published>2009-08-20T18:12:00.000-07:00</published><updated>2009-09-10T11:03:36.996-07:00</updated><title type='text'>Trading Strategy Implementation</title><content type='html'>&lt;span 
style="font-weight: bold;"&gt;Disclaimer&lt;/span&gt;: run this strategy at your own peril; you will most likely lose all your money.  This is purely for illustrative purposes.&lt;br /&gt;&lt;br /&gt;OK, now that I've got that disclaimer out of the way, we can continue.  Seriously though, it's so easy to get connected and start trading that I am sure many people, caught up in the excitement and armed with whatever little knowledge they may have acquired from some infomercial-like seminar or book, have gone in head-first and ended up with a story to tell but no profits to show.  Make sure you understand what you're getting into very well if you decide to experiment with trading (let alone automated trading), although you are probably better off just reading &lt;a href="http://www.amazon.com/Random-Walk-Guide-Investing/dp/039332639X"&gt;Malkiel's Random Walk Guide To Investing&lt;/a&gt; and following the 10 rules proposed there.  Most of that is nicely summarized in &lt;a href="http://www.amazon.com/Dilbert-Way-Weasel-Scott-Adams/dp/0060518057"&gt;Dilbert's investment advice&lt;/a&gt;:&lt;br /&gt;&lt;ul&gt;&lt;li&gt;Make a will.&lt;/li&gt;&lt;li&gt;Pay off your credit cards.&lt;/li&gt;&lt;li&gt;Get term life insurance if you have a family to support.&lt;/li&gt;&lt;li&gt;Fund your 401(k) to the maximum.&lt;/li&gt;&lt;li&gt;Fund your IRA to the maximum.&lt;/li&gt;&lt;li&gt;Buy a house if you want to live in a house and you can afford it.&lt;/li&gt;&lt;li&gt;Put six months’ expenses in a money market fund.&lt;/li&gt;&lt;li&gt;Take whatever money is left over and invest 70% in a stock index fund and 30% in a bond fund through any discount broker and never touch it until retirement.&lt;/li&gt;&lt;li&gt;If any of this confuses you, or you have something special going on (retirement, college planning, tax issues) hire a fee-based financial planner, not one who charges a percentage of your portfolio.&lt;br /&gt;&lt;/li&gt;&lt;/ul&gt; That's it!  
&lt;br /&gt;&lt;br /&gt;Now, if you &lt;span style="font-style: italic;"&gt;really&lt;/span&gt; wanna try your hand at automated trading systems, whatever your motivation may be, use a demo or paper trading account as much as possible and educate yourself on the markets you will be trading (an &lt;a href="http://en.wikipedia.org/wiki/Master_of_Business_Administration"&gt;MBA&lt;/a&gt; should be a good baseline).  If you're a student and have an interested professor, you could use &lt;a href="http://www.interactivebrokers.com/en/general/education/IBStudentTradingLab.php"&gt;IB's student trading lab&lt;/a&gt;.  The strategy below can actually be run against Interactive Brokers' own demo software (which does not require opening an account, just download the software), and to do that, you can follow &lt;a href="http://www.maxdama.com/2008/12/interactive-brokers-via-matlab.html"&gt;Max's&lt;/a&gt; and &lt;a href="http://www.tradingwithmatlab.com/video-tutorials"&gt;Domenic's&lt;/a&gt; tutorials.&lt;br /&gt;&lt;br /&gt;The strategy below is a mean-reverting strategy, based on one from &lt;a href="http://epchan.blogspot.com/2009/05/matlab-as-automated-execution-system.html"&gt;Ernest Chan&lt;/a&gt;, where we imagine the prices to "bounce" between upper and lower price bands.  It connects to IB's &lt;a href="http://www.interactivebrokers.com/en/p.php?f=tws&amp;amp;ib_entity=llc"&gt;Trader Workstation application&lt;/a&gt; through their &lt;a href="http://www.interactivebrokers.com/en/p.php?f=programInterface&amp;amp;ib_entity=llc"&gt;ActiveX API&lt;/a&gt;.  Max's tutorial should guide you through all that.  Now, unlike Ernest's code, I don't use the matlab2IB library, so as long as you have Matlab on a Windows box you should be good to go.  
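&lt;br /&gt;&lt;br /&gt;To make the band idea concrete before the full listing, here is a minimal standalone sketch of how the middle, upper and lower bands can be computed from a vector of prices.  It doesn't talk to TWS at all: the prices are made up and the file name 'bandsSketch.m' is hypothetical, while N, KA and KB are only meant to play the same role as params.N, params.KA and params.KB in the code further down:&lt;br /&gt;&lt;br /&gt;&lt;pre name="code" class="python"&gt;&lt;br /&gt;
%bandsSketch.m (hypothetical file, not one of the two files of this post)&lt;br /&gt;
&lt;br /&gt;
%made-up prices, purely for illustration&lt;br /&gt;
prices=[100 101 102 101 100 99 98 99 100 101]';&lt;br /&gt;
&lt;br /&gt;
%window length and band widths (assumed values)&lt;br /&gt;
N=5;&lt;br /&gt;
KA=2;&lt;br /&gt;
KB=2;&lt;br /&gt;
&lt;br /&gt;
%only the last N prices are used&lt;br /&gt;
window=prices(end-N+1:end);&lt;br /&gt;
&lt;br /&gt;
%middle band: N-period simple moving average&lt;br /&gt;
ma=mean(window);&lt;br /&gt;
&lt;br /&gt;
%N-period sample standard deviation&lt;br /&gt;
sigma=std(window);&lt;br /&gt;
&lt;br /&gt;
%upper band at MA+KA*sigma, lower band at MA-KB*sigma&lt;br /&gt;
upper=ma+KA*sigma;&lt;br /&gt;
lower=ma-KB*sigma;&lt;br /&gt;
&lt;br /&gt;
disp(['ma: ',num2str(ma),', upper: ',num2str(upper),...&lt;br /&gt;
    ', lower: ',num2str(lower)]);&lt;br /&gt;
&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;The lower you set KA and KB, the closer the bands sit to the moving average and the more often the strategy trades.  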
This is a &lt;a href="http://en.wikipedia.org/wiki/Technical_analysis"&gt;technical analysis&lt;/a&gt; (TA) trading strategy which uses &lt;a href="http://en.wikipedia.org/wiki/Bollinger_bands"&gt;Bollinger bands&lt;/a&gt;, and although TA sounds a little too much like &lt;a href="http://en.wikipedia.org/wiki/Astrology"&gt;astrology&lt;/a&gt; for my taste, it illustrates the mechanics of an automated trading system well.&lt;br /&gt;&lt;br /&gt;The code lives in two files, 'trader.m' and 'driver.m'; 'driver.m' is the one you execute (sorry about the crappy syntax highlighting... Matlab is not a choice in &lt;a href="http://code.google.com/p/syntaxhighlighter/wiki/Languages"&gt;syntaxhighlighter&lt;/a&gt;):&lt;br /&gt;&lt;br /&gt;&lt;pre name="code" class="python"&gt;&lt;br /&gt;%driver.m&lt;br /&gt;&lt;br /&gt;%clear workspace variables and command window&lt;br /&gt;clear;&lt;br /&gt;clc;&lt;br /&gt;&lt;br /&gt;%state holds info that will be persisted throughout the execution,&lt;br /&gt;%and that needs to be accessed from elsewhere (trader)&lt;br /&gt;global state; &lt;br /&gt;state=containers.Map();&lt;br /&gt;&lt;br /&gt;%exSeq denotes more or less the execution sequence&lt;br /&gt;exSeq={'preInit','init','accountUpdates','accountUpdatesEnd',...&lt;br /&gt;    'marketData','histData','updateHistData','placeOrders','exit'};&lt;br /&gt;currStep='preInit';&lt;br /&gt;&lt;br /&gt;%leave ip blank for localhost&lt;br /&gt;%socket port as defined in tws app&lt;br /&gt;%ib api webinar says up to 8 concurrent clients are supported,&lt;br /&gt;%which can be achieved by increasing clientId&lt;br /&gt;conInfo=struct('ipAddress','','port',7496,'clientId',0);&lt;br /&gt;&lt;br /&gt;%pacific standard time&lt;br /&gt;timeZone='PST';&lt;br /&gt;&lt;br /&gt;%create instance of tws activex control and pass event handler 'trader'&lt;br /&gt;tws=actxcontrol('TWS.TwsCtrl.1',[0 0 0 0],figure('Visible','off'),...&lt;br /&gt;    'trader');&lt;br /&gt;&lt;br /&gt;%all output should 
include as prefix this anonymous function&lt;br /&gt;dispPrefix=@() [datestr(now,13),' driver: '];&lt;br /&gt;&lt;br /&gt;%connect to tws app&lt;br /&gt;disp([dispPrefix(),'connecting...']);&lt;br /&gt;tws.connect(conInfo.ipAddress,conInfo.port,conInfo.clientId);&lt;br /&gt;&lt;br /&gt;while true&lt;br /&gt;    switch currStep&lt;br /&gt;        &lt;br /&gt;        case 'preInit'&lt;br /&gt;            %here we just get confirmation of the connection and &lt;br /&gt;            %some useful data such as the nextValidId&lt;br /&gt;            disp([dispPrefix(),'in preInit...']);&lt;br /&gt;            &lt;br /&gt;            if isKey(state,'nextValidId')&lt;br /&gt;                %this is the condition that tells us we're good&lt;br /&gt;                %to move to the next step (see nextValidId event &lt;br /&gt;                %handled by trader)&lt;br /&gt;                currStep='init';&lt;br /&gt;            end&lt;br /&gt;            &lt;br /&gt;        case 'init'         &lt;br /&gt;            %here we add any kind of initialization or configuration&lt;br /&gt;            %parameters for this strategy&lt;br /&gt;            disp([dispPrefix(),'in init...']);   &lt;br /&gt;            &lt;br /&gt;            params=struct();&lt;br /&gt;            &lt;br /&gt;            %symbol for the front contract&lt;br /&gt;            params.localSymbol='ESU9';&lt;br /&gt;            state('localSymbol')=params.localSymbol;&lt;br /&gt;            &lt;br /&gt;            %IB symbol for this future&lt;br /&gt;            params.symbol='ES';&lt;br /&gt;            state('symbol')=params.symbol;&lt;br /&gt;            &lt;br /&gt;            params.securityType='FUT';&lt;br /&gt;            params.exchange='GLOBEX';&lt;br /&gt;            params.currency='USD';&lt;br /&gt;            &lt;br /&gt;            %desired order size&lt;br /&gt;            params.numberOfContracts=1;&lt;br /&gt;            &lt;br /&gt;            %number of ticks used to compute the middle 
band, &lt;br /&gt;            %an N-period simple moving average (MA)&lt;br /&gt;            params.N=20;&lt;br /&gt;            &lt;br /&gt;            %the upper band at KA times an N-period standard deviation &lt;br /&gt;            %above the middle band (MA+KA*sigma)&lt;br /&gt;            params.KA=.5;&lt;br /&gt;            &lt;br /&gt;            %the lower band at KB times an N-period standard deviation &lt;br /&gt;            %below the middle band (MA-KB*sigma)&lt;br /&gt;            params.KB=.5;    &lt;br /&gt;            &lt;br /&gt;            %KA and KB are generally 2 according to wikipedia, but that all&lt;br /&gt;            %depends on the strategy... this is one value you can play with&lt;br /&gt;            %to see more action happen...&lt;br /&gt;            &lt;br /&gt;            currStep='accountUpdates';&lt;br /&gt;            &lt;br /&gt;        case 'accountUpdates'&lt;br /&gt;            %request account updates&lt;br /&gt;            disp([dispPrefix(),'in accountUpdates...']); &lt;br /&gt;            &lt;br /&gt;            %the particular event that this will kick off is the &lt;br /&gt;            %'updatePortfolioEx' event, which will tell us if we&lt;br /&gt;            %currently have any positions in the security we'll be trading&lt;br /&gt;            tws.reqAccountUpdates(1,'');  &lt;br /&gt;            &lt;br /&gt;            currStep='accountUpdatesEnd';&lt;br /&gt;            &lt;br /&gt;        case 'accountUpdatesEnd'&lt;br /&gt;            %wait until we get confirmation that those were all&lt;br /&gt;            %available account updates&lt;br /&gt;            disp([dispPrefix(),'in accountUpdatesEnd...']);  &lt;br /&gt;            &lt;br /&gt;            if isKey(state,'doneAccountUpdates')&lt;br /&gt;                %this batch of updates has completed, so let's ask to&lt;br /&gt;                %stop receiving account position updates;&lt;br /&gt;                %even though tws' activex api doc says to use '2' to 
stop,&lt;br /&gt;                %'0' is what seems to work (1=subscribe, 0=unsub?)&lt;br /&gt;                tws.reqAccountUpdates(0,'');&lt;br /&gt;                &lt;br /&gt;                %set next step to be executed&lt;br /&gt;                currStep='marketData';&lt;br /&gt;                &lt;br /&gt;                %if there was nothing for tws to notify us of, &lt;br /&gt;                %then we need to initialize the value in state here&lt;br /&gt;                if ~isKey(state,'pos')&lt;br /&gt;                    state('pos')=0;&lt;br /&gt;                end&lt;br /&gt;                &lt;br /&gt;                disp([dispPrefix(),'starting position in security: ',...&lt;br /&gt;                    num2str(state('pos'))]);&lt;br /&gt;            end&lt;br /&gt;            &lt;br /&gt;        case 'marketData'&lt;br /&gt;            %here we are gonna ask tws to start sending us market data&lt;br /&gt;            disp([dispPrefix(),'in marketData...']);   &lt;br /&gt;            &lt;br /&gt;            %tws has these factory (create*) methods for all the&lt;br /&gt;            %COM objects we need to pass as arguments&lt;br /&gt;            contract=tws.createContract();&lt;br /&gt;            &lt;br /&gt;            %string: this is the symbol of the underlying asset.&lt;br /&gt;            contract.symbol=params.symbol;&lt;br /&gt;            &lt;br /&gt;            %string: this is the security type. Valid values are:&lt;br /&gt;            %STK, OPT, FUT, IND, FOP, CASH, BAG&lt;br /&gt;            contract.secType=params.securityType;  &lt;br /&gt;            &lt;br /&gt;            %string: the expiration date. Use the format YYYYMM if&lt;br /&gt;            %applicable&lt;br /&gt;            contract.expiry='';  &lt;br /&gt;            &lt;br /&gt;            %double: the strike price&lt;br /&gt;            contract.strike=0; &lt;br /&gt;            &lt;br /&gt;            %string: specifies a Put or Call. 
&lt;br /&gt;            %Valid values are: P, PUT, C, CALL.&lt;br /&gt;            contract.right='';   &lt;br /&gt;            &lt;br /&gt;            %string: allows you to specify a future or option contract &lt;br /&gt;            %multiplier. This is only necessary when multiple &lt;br /&gt;            %possibilities exist.&lt;br /&gt;            contract.multiplier='';    &lt;br /&gt;            &lt;br /&gt;            %string: the order destination, such as Smart.&lt;br /&gt;            contract.exchange=params.exchange; &lt;br /&gt;            &lt;br /&gt;            %string: specifies the currency. Ambiguities may require that&lt;br /&gt;            %this field be specified, for example, when SMART is the&lt;br /&gt;            %exchange and IBM is being requested (IBM can trade in&lt;br /&gt;            %GBP or USD). Given the existence of this kind of&lt;br /&gt;            %ambiguity, it is a good idea to always specify the currency.&lt;br /&gt;            contract.currency=params.currency;       &lt;br /&gt;            &lt;br /&gt;            %string: this is the local exchange symbol of the &lt;br /&gt;            %underlying asset.&lt;br /&gt;            contract.localSymbol=params.localSymbol;&lt;br /&gt;            &lt;br /&gt;            %string: identifies the listing exchange for the &lt;br /&gt;            %contract (do not list SMART).&lt;br /&gt;            contract.primaryExch=params.exchange;  &lt;br /&gt;            &lt;br /&gt;            %integer: if set to true, contract details requests and &lt;br /&gt;            %historical data queries can be performed pertaining to &lt;br /&gt;            %expired contracts.&lt;br /&gt;            contract.includeExpired=0;  &lt;br /&gt;            &lt;br /&gt;            %object: dynamic memory structure used to store the leg &lt;br /&gt;            %definitions for this contract.&lt;br /&gt;            contract.comboLegs='';  &lt;br /&gt;            &lt;br /&gt;            %integer: the unique 
contract identifier.&lt;br /&gt;            contract.conId=0;&lt;br /&gt;&lt;br /&gt;            %request market data ('0' indicates we do not want a snapshot,&lt;br /&gt;            %but wanna continue to receive a stream of data)&lt;br /&gt;            tws.reqMktDataEx(contract.conId,contract,'',0);&lt;br /&gt;            &lt;br /&gt;            currStep='histData';&lt;br /&gt;            &lt;br /&gt;        case 'histData'&lt;br /&gt;            %request historical data&lt;br /&gt;             disp([dispPrefix(),'in histData...']);&lt;br /&gt;             &lt;br /&gt;             if ~isKey(state,'primedHistData')&lt;br /&gt;                 %we only need to request historical data once, after that,&lt;br /&gt;                 %we'll just keep updating it with market data&lt;br /&gt;                 &lt;br /&gt;                 %request historical data&lt;br /&gt;                 tws.reqHistoricalDataEx(contract.conId,contract,...&lt;br /&gt;                     [datestr(now,'yyyymmdd HH:MM:SS'),' ',timeZone],...&lt;br /&gt;                     '1 D','1 min','TRADES',1,2);&lt;br /&gt;                 &lt;br /&gt;                 %start tic from the time we request all initial histData&lt;br /&gt;                 tic;&lt;br /&gt;                 &lt;br /&gt;                 state('primedHistData')=true;&lt;br /&gt;             else&lt;br /&gt;                 if isKey(state,'doneHistData')&lt;br /&gt;                     %we've already requested historical data, &lt;br /&gt;                     %so no need to do it again, we're now ready to &lt;br /&gt;                     %process this data&lt;br /&gt;                     currStep='updateHistData';&lt;br /&gt;                 else&lt;br /&gt;                     %do nothing, still waiting on all the data,&lt;br /&gt;                     %which is getting handled by the 'historicalData'&lt;br /&gt;                     %event&lt;br /&gt;                 end&lt;br /&gt;             end&lt;br /&gt;             &lt;br 
/&gt;        case 'updateHistData'&lt;br /&gt;            %update our historical data based on the market &lt;br /&gt;            %data that we subscribed to earlier&lt;br /&gt;            disp([dispPrefix(),'in updateHistData...']);&lt;br /&gt;            &lt;br /&gt;            elapsed=toc;&lt;br /&gt;            %every X elapsed seconds, get last tick data and update&lt;br /&gt;            %our price history vector, keeping the vector's&lt;br /&gt;            %length constant, once there is an update,&lt;br /&gt;            %then we move forward to placeOrders&lt;br /&gt;            &lt;br /&gt;            %the elapsed time threshold is also another value you can&lt;br /&gt;            %play with to see more action happen...&lt;br /&gt;            if (elapsed &gt;= 5)&lt;br /&gt;                tic;&lt;br /&gt;                histDataVec=state('histDataVec');&lt;br /&gt;                histDataVec(1:end-1)=histDataVec(2:end);&lt;br /&gt;                &lt;br /&gt;                %if by chance we haven't received any market data yet,&lt;br /&gt;                %the call below will fail (no 'last' key in state)&lt;br /&gt;                histDataVec(end)=state('last');&lt;br /&gt;                &lt;br /&gt;                state('histDataVec')=histDataVec;&lt;br /&gt;                &lt;br /&gt;                %calculate moving average&lt;br /&gt;                ma=mean(histDataVec(end-params.N+1:end));&lt;br /&gt;                &lt;br /&gt;                %calculate moving standard deviation&lt;br /&gt;                mstd=std(histDataVec(end-params.N+1:end));&lt;br /&gt;                &lt;br /&gt;                %calculate deviation of ask and bid prices from the&lt;br /&gt;                %moving average middle band.  
All we're doing here is&lt;br /&gt;                %solving for K on the 'ask' and 'bid' prices:&lt;br /&gt;                %ask=(MA+K*sigma)&lt;br /&gt;                askK=(state('ask')-ma)/mstd;&lt;br /&gt;                %bid=(MA+K*sigma)&lt;br /&gt;                bidK=(state('bid')-ma)/mstd;&lt;br /&gt;                &lt;br /&gt;                %move onto next step&lt;br /&gt;                currStep='placeOrders';&lt;br /&gt;            end&lt;br /&gt;            &lt;br /&gt;        case 'placeOrders'&lt;br /&gt;            %place orders if the signals we're watching tell us to do so&lt;br /&gt;            disp([dispPrefix(),'in placeOrders...']);&lt;br /&gt;            &lt;br /&gt;            %here we decide whether to exit or go back to updateHistData,&lt;br /&gt;            %which could be based on time (trading during trading hours,&lt;br /&gt;            %or a predefined number of trades); we illustrate with a &lt;br /&gt;            %predefined number of trades&lt;br /&gt;            if ~isKey(state,'placeOrdersCount')&lt;br /&gt;                state('placeOrdersCount')=1;&lt;br /&gt;                currStep='updateHistData';&lt;br /&gt;            else&lt;br /&gt;                if state('placeOrdersCount') &gt; 50&lt;br /&gt;                    currStep='exit';&lt;br /&gt;                else&lt;br /&gt;                    state('placeOrdersCount')=state('placeOrdersCount')+1;&lt;br /&gt;                    currStep='updateHistData';&lt;br /&gt;                end&lt;br /&gt;            end&lt;br /&gt;            &lt;br /&gt;            pos=state('pos');&lt;br /&gt;            orderId=state('nextValidId');&lt;br /&gt;            order=tws.createOrder();&lt;br /&gt;            order.orderId=orderId;&lt;br /&gt;            order.clientId=conInfo.clientId;&lt;br /&gt;            order.action=''; %need below!&lt;br /&gt;            order.totalQuantity=0; %need below!&lt;br /&gt;            order.orderType='MKT';&lt;br /&gt;            
order.transmit=1;&lt;br /&gt;            order.ocaGroup='';&lt;br /&gt;            order.lmtPrice=0;&lt;br /&gt;            order.auxPrice=0;&lt;br /&gt;            order.timeInForce='DAY';&lt;br /&gt;            &lt;br /&gt;            disp([dispPrefix(),...&lt;br /&gt;                'pos: ',num2str(pos),...&lt;br /&gt;                ', askK: ',num2str(askK),...&lt;br /&gt;                ', bidK: ',num2str(bidK),...&lt;br /&gt;                ', ma: ',num2str(ma),...&lt;br /&gt;                ', bid: ',num2str(state('bid')),...&lt;br /&gt;                ', ask: ',num2str(state('ask'))]);&lt;br /&gt;            &lt;br /&gt;            %note: the lower band is defined with KB and the upper band&lt;br /&gt;            %with KA (see 'init'), so those are the values compared below&lt;br /&gt;            if (pos == 0 &amp;&amp; askK &lt; -params.KB) &lt;br /&gt;                &lt;br /&gt;                %if we are below the lower band and have nothing, go long&lt;br /&gt;                &lt;br /&gt;                order.action='BUY';&lt;br /&gt;                order.totalQuantity=params.numberOfContracts;&lt;br /&gt;                tws.placeOrderEx(orderId,contract,order);&lt;br /&gt;                disp([dispPrefix(),'placed entry only...']);&lt;br /&gt;                pos=params.numberOfContracts;&lt;br /&gt;                &lt;br /&gt;            elseif (pos &lt; 0 &amp;&amp; askK &lt; -params.KB) &lt;br /&gt;                &lt;br /&gt;                %if we are below the lower band and are short, buy 2,&lt;br /&gt;                %one for establishing the long position and &lt;br /&gt;                %another one to cover the short&lt;br /&gt;                &lt;br /&gt;                order.action='BUY';&lt;br /&gt;                order.totalQuantity=2*params.numberOfContracts;&lt;br /&gt;                tws.placeOrderEx(orderId,contract,order);&lt;br /&gt;                disp([dispPrefix(),'placed exit and entry...']);&lt;br /&gt;                pos=params.numberOfContracts;&lt;br /&gt;                &lt;br /&gt;            elseif (pos == 0 &amp;&amp; bidK &gt; params.KA) &lt;br /&gt;                &lt;br /&gt;          
      %if we are above the upper band and have nothing, go short&lt;br /&gt;                &lt;br /&gt;                order.action='SELL';&lt;br /&gt;                order.totalQuantity=params.numberOfContracts;&lt;br /&gt;                tws.placeOrderEx(orderId,contract,order);&lt;br /&gt;                disp([dispPrefix(),'placed entry only...']);&lt;br /&gt;                pos=-params.numberOfContracts;&lt;br /&gt;                &lt;br /&gt;            elseif (pos &gt; 0 &amp;&amp; bidK &gt; params.KA) &lt;br /&gt;                &lt;br /&gt;                %if we're above the upper band and are currently long,&lt;br /&gt;                %sell 2, one for establishing the short position and &lt;br /&gt;                %another one to take profits on the previous long&lt;br /&gt;                &lt;br /&gt;                order.action='SELL';&lt;br /&gt;                order.totalQuantity=2*params.numberOfContracts;&lt;br /&gt;                tws.placeOrderEx(orderId,contract,order);&lt;br /&gt;                disp([dispPrefix(),'placed exit and entry...']);&lt;br /&gt;                pos=-params.numberOfContracts;&lt;br /&gt;                &lt;br /&gt;            end&lt;br /&gt;            &lt;br /&gt;            state('pos')=pos;&lt;br /&gt;            state('nextValidId')=orderId+1;&lt;br /&gt;            &lt;br /&gt;        case 'exit'&lt;br /&gt;            %get ready to leave&lt;br /&gt;            disp([dispPrefix(),'in exit...']);&lt;br /&gt;            &lt;br /&gt;            %maybe should cancel marketdata request, but we'll let&lt;br /&gt;            %disconnecting take care of that for now&lt;br /&gt;            pos=state('pos');&lt;br /&gt;            orderId=state('nextValidId');&lt;br /&gt;            order=tws.createOrder();&lt;br /&gt;            order.orderId=orderId;&lt;br /&gt;            order.clientId=conInfo.clientId;&lt;br /&gt;            order.action=''; %need below!&lt;br /&gt;            order.totalQuantity=0; %need below!&lt;br 
/&gt;            order.orderType='MKT';&lt;br /&gt;            order.transmit=1;&lt;br /&gt;            order.ocaGroup='';&lt;br /&gt;            order.lmtPrice=0;&lt;br /&gt;            order.auxPrice=0;&lt;br /&gt;            order.timeInForce='DAY';&lt;br /&gt;            &lt;br /&gt;            %should really be looking and waiting for order confirmations,&lt;br /&gt;            %but the pauses will do the job for now...&lt;br /&gt;            if (pos &gt; 0)&lt;br /&gt;                %close the longs&lt;br /&gt;                order.action='SELL';&lt;br /&gt;                order.totalQuantity=pos;&lt;br /&gt;                tws.placeOrderEx(orderId,contract,order);&lt;br /&gt;                disp([dispPrefix(),'placed final order...']);&lt;br /&gt;                pause(5);&lt;br /&gt;            elseif (pos &lt; 0)&lt;br /&gt;                %cover the shorts&lt;br /&gt;                order.action='BUY';&lt;br /&gt;                order.totalQuantity=abs(pos);&lt;br /&gt;                tws.placeOrderEx(orderId,contract,order);&lt;br /&gt;                disp([dispPrefix(),'placed final order...']);&lt;br /&gt;                pause(5);&lt;br /&gt;            end&lt;br /&gt;            &lt;br /&gt;            pos=0;&lt;br /&gt;            state('pos')=pos;&lt;br /&gt;            state('nextValidId')=orderId+1;&lt;br /&gt;            &lt;br /&gt;            break;&lt;br /&gt;            &lt;br /&gt;        otherwise&lt;br /&gt;            %currStep is a local variable, it is not stored in state&lt;br /&gt;            disp([dispPrefix(),'invalid currStep: ',currStep]);&lt;br /&gt;            &lt;br /&gt;    end&lt;br /&gt;    &lt;br /&gt;    %the pause gives the activex component events a chance to be handled&lt;br /&gt;    pause(2);&lt;br /&gt;end&lt;br /&gt;&lt;br /&gt;%disconnect from the tws app &lt;br /&gt;tws.disconnect();&lt;br /&gt;disp([dispPrefix(),'disconnecting...']);&lt;br /&gt;pause(2);&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;This is 'trader.m':&lt;br /&gt;&lt;br /&gt;&lt;pre name="code" 
class="python"&gt;&lt;br /&gt;%trader.m&lt;br /&gt;&lt;br /&gt;function trader(varargin)&lt;br /&gt;&lt;br /&gt;%as per matlab's documentation on arguments passed to event handlers,&lt;br /&gt;%'end-1' is the position the event payload will be at&lt;br /&gt;arg=varargin{end-1};&lt;br /&gt;&lt;br /&gt;%very little code should go in the switch statement,&lt;br /&gt;%all logic should be dispatched to subfunctions on this file&lt;br /&gt;switch arg.Type     &lt;br /&gt;    case 'errMsg'&lt;br /&gt;        %EVERY message should be well understood&lt;br /&gt;        custom_disp(['in errMsg: ',arg.errorMsg]);   &lt;br /&gt;    case 'updateAccountValue'&lt;br /&gt;       %no need to do anything here&lt;br /&gt;    case 'updateAccountTime'&lt;br /&gt;        %no need to do anything here&lt;br /&gt;    case 'updatePortfolioEx'&lt;br /&gt;        updatePortfolioEx(arg);&lt;br /&gt;    case 'accountDownloadEnd'&lt;br /&gt;        accountDownloadEnd();&lt;br /&gt;    case 'nextValidId'&lt;br /&gt;        nextValidId(arg);&lt;br /&gt;    case 'openOrderEnd'&lt;br /&gt;        custom_disp('in openOrderEnd');&lt;br /&gt;    case 'tickPrice'&lt;br /&gt;        tickPrice(arg);&lt;br /&gt;    case 'tickSize'&lt;br /&gt;        %no need to do anything here&lt;br /&gt;    case 'tickString'&lt;br /&gt;        %no need to do anything here&lt;br /&gt;    case 'tickGeneric'&lt;br /&gt;        %no need to do anything here&lt;br /&gt;    case 'historicalData'&lt;br /&gt;        historicalData(arg);&lt;br /&gt;    case 'updatePortfolio'&lt;br /&gt;        %this guy has been replaced by updatePortfolioEx, so&lt;br /&gt;        %no need to do anything here&lt;br /&gt;    otherwise&lt;br /&gt;        custom_disp(['unknown event: ',arg.Type]);    &lt;br /&gt;end&lt;br /&gt;end&lt;br /&gt;&lt;br /&gt;function custom_disp(textMessage)&lt;br /&gt;disp([datestr(now,13),' trader: ',textMessage]);&lt;br /&gt;end&lt;br /&gt;&lt;br /&gt;function nextValidId(arg)&lt;br /&gt;%we get this message soon after 
having established a connection&lt;br /&gt;custom_disp('in nextValidId');&lt;br /&gt;global state;&lt;br /&gt;state('nextValidId')=arg.id;&lt;br /&gt;end&lt;br /&gt;&lt;br /&gt;function updatePortfolioEx(arg)&lt;br /&gt;custom_disp('in updatePortfolioEx');&lt;br /&gt;global state;&lt;br /&gt;if strcmp(arg.contract.symbol,state('symbol')) &amp;&amp;...&lt;br /&gt;        strcmp(arg.contract.localSymbol,state('localSymbol'))&lt;br /&gt;    state('pos')=arg.position;&lt;br /&gt;end&lt;br /&gt;end&lt;br /&gt;&lt;br /&gt;function accountDownloadEnd()&lt;br /&gt;custom_disp('in accountDownloadEnd');&lt;br /&gt;global state;&lt;br /&gt;state('doneAccountUpdates')=true;&lt;br /&gt;end&lt;br /&gt;&lt;br /&gt;function tickPrice(arg)&lt;br /&gt;%as we get updates for this guy, shove the info into our state&lt;br /&gt;custom_disp('in tickPrice');&lt;br /&gt;global state;&lt;br /&gt;switch arg.tickType&lt;br /&gt;    case 1 %bid&lt;br /&gt;        state('bid')=arg.price;&lt;br /&gt;    case 2 %ask&lt;br /&gt;        state('ask')=arg.price;&lt;br /&gt;    case 4 %last&lt;br /&gt;        state('last')=arg.price;&lt;br /&gt;end&lt;br /&gt;end&lt;br /&gt;&lt;br /&gt;function historicalData(arg)&lt;br /&gt;%all we're doing here is gathering the historical data messages&lt;br /&gt;%into a vector&lt;br /&gt;%custom_disp('in historicalData');&lt;br /&gt;global state;&lt;br /&gt;&lt;br /&gt;if ~isKey(state,'histDataVec')&lt;br /&gt;    %create the vector and position indicator &lt;br /&gt;    &lt;br /&gt;    %we shouldn't get more than a day's worth of data&lt;br /&gt;    state('histDataVec')=zeros(24*60,1,'double');&lt;br /&gt;    state('histDataVecN')=0;&lt;br /&gt;end&lt;br /&gt;&lt;br /&gt;%-1 is shown on the last message in the histData request&lt;br /&gt;if (arg.close == -1)&lt;br /&gt;    %resize the data vector to the number of messages we actually received&lt;br /&gt;    state('doneHistData')=true;&lt;br /&gt;    histDataVec=state('histDataVec');&lt;br /&gt;    
N=state('histDataVecN');&lt;br /&gt;    state('histDataVec')=histDataVec(1:N,1);&lt;br /&gt;else&lt;br /&gt;    %Matlab arrays have pass-by-value semantics, so this thing&lt;br /&gt;    %that we're doing here is very inefficient (the array is getting&lt;br /&gt;    %cloned every time).  Just a heads up...&lt;br /&gt;    &lt;br /&gt;    %keep track of the pieces we're interested in&lt;br /&gt;    histDataVec=state('histDataVec');&lt;br /&gt;    N=state('histDataVecN')+1;&lt;br /&gt;    histDataVec(N,1)=arg.close;&lt;br /&gt;    state('histDataVec')=histDataVec;&lt;br /&gt;    state('histDataVecN')=N;&lt;br /&gt;end&lt;br /&gt;&lt;br /&gt;end&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;And the output window with all the tracing would look something like this:&lt;br /&gt;&lt;br /&gt;&lt;pre name="code" class="python"&gt;&lt;br /&gt;14:14:14 driver: connecting...&lt;br /&gt;14:14:15 driver: in preInit...&lt;br /&gt;14:14:15 trader: in nextValidId&lt;br /&gt;14:14:15 trader: in openOrderEnd&lt;br /&gt;14:14:15 trader: in errMsg: Market data farm connection is OK:ibdemo&lt;br /&gt;14:14:15 trader: in errMsg: HMDS data farm connection is OK:demohmds&lt;br /&gt;14:14:17 driver: in preInit...&lt;br /&gt;14:14:19 driver: in init...&lt;br /&gt;14:14:21 driver: in accountUpdates...&lt;br /&gt;14:14:22 trader: in accountDownloadEnd&lt;br /&gt;14:14:23 driver: in accountUpdatesEnd...&lt;br /&gt;14:14:23 driver: starting position in security: 0&lt;br /&gt;14:14:25 driver: in marketData...&lt;br /&gt;14:14:25 trader: in tickPrice&lt;br /&gt;14:14:25 trader: in tickPrice&lt;br /&gt;14:14:25 trader: in tickPrice&lt;br /&gt;14:14:25 trader: in tickPrice&lt;br /&gt;14:14:25 trader: in tickPrice&lt;br /&gt;14:14:25 trader: in tickPrice&lt;br /&gt;14:14:25 trader: in tickPrice&lt;br /&gt;14:14:26 trader: in tickPrice&lt;br /&gt;14:14:26 trader: in tickPrice&lt;br /&gt;14:14:27 driver: in histData...&lt;br /&gt;14:14:29 trader: in tickPrice&lt;br /&gt;14:14:29 driver: in 
histData...&lt;br /&gt;14:14:31 driver: in updateHistData...&lt;br /&gt;14:14:32 trader: in tickPrice&lt;br /&gt;14:14:33 driver: in updateHistData...&lt;br /&gt;14:14:35 driver: in placeOrders...&lt;br /&gt;14:14:35 driver: pos: 0, askK: -0.62025, bidK: -1.5063, ma: 966.1, bid: 965.25, ask: 965.75&lt;br /&gt;14:14:35 driver: placed entry only...&lt;br /&gt;14:14:36 trader: in tickPrice&lt;br /&gt;14:14:37 trader: unknown event: openOrder1&lt;br /&gt;14:14:37 trader: unknown event: openOrder2&lt;br /&gt;14:14:37 trader: unknown event: openOrder3&lt;br /&gt;14:14:37 trader: unknown event: openOrder4&lt;br /&gt;14:14:37 trader: unknown event: openOrderEx&lt;br /&gt;14:14:37 trader: unknown event: orderStatus&lt;br /&gt;14:14:37 trader: in tickPrice&lt;br /&gt;14:14:37 trader: in tickPrice&lt;br /&gt;14:14:37 trader: unknown event: execDetails&lt;br /&gt;14:14:37 trader: unknown event: execDetailsEx&lt;br /&gt;14:14:37 trader: unknown event: openOrder1&lt;br /&gt;14:14:37 trader: unknown event: openOrder2&lt;br /&gt;14:14:37 trader: unknown event: openOrder3&lt;br /&gt;14:14:37 trader: unknown event: openOrder4&lt;br /&gt;14:14:37 trader: unknown event: openOrderEx&lt;br /&gt;14:14:37 trader: unknown event: orderStatus&lt;br /&gt;14:14:37 trader: unknown event: openOrder1&lt;br /&gt;14:14:37 trader: unknown event: openOrder2&lt;br /&gt;14:14:37 trader: unknown event: openOrder3&lt;br /&gt;14:14:37 trader: unknown event: openOrder4&lt;br /&gt;14:14:37 trader: unknown event: openOrderEx&lt;br /&gt;14:14:37 trader: unknown event: orderStatus&lt;br /&gt;14:14:37 driver: in updateHistData...&lt;br /&gt;...&lt;br /&gt;14:19:45 trader: in tickPrice&lt;br /&gt;14:19:45 trader: in tickPrice&lt;br /&gt;14:19:45 trader: unknown event: execDetails&lt;br /&gt;14:19:45 trader: unknown event: execDetailsEx&lt;br /&gt;14:19:45 trader: unknown event: openOrder1&lt;br /&gt;14:19:45 trader: unknown event: openOrder2&lt;br /&gt;14:19:45 trader: unknown event: openOrder3&lt;br 
/&gt;14:19:45 trader: unknown event: openOrder4&lt;br /&gt;14:19:45 trader: unknown event: openOrderEx&lt;br /&gt;14:19:45 trader: unknown event: orderStatus&lt;br /&gt;14:19:45 trader: unknown event: openOrder1&lt;br /&gt;14:19:45 trader: unknown event: openOrder2&lt;br /&gt;14:19:45 trader: unknown event: openOrder3&lt;br /&gt;14:19:45 trader: unknown event: openOrder4&lt;br /&gt;14:19:45 trader: unknown event: openOrderEx&lt;br /&gt;14:19:45 trader: unknown event: orderStatus&lt;br /&gt;14:19:45 trader: in tickPrice&lt;br /&gt;14:19:46 trader: in tickPrice&lt;br /&gt;14:19:47 trader: in tickPrice&lt;br /&gt;14:19:48 trader: in tickPrice&lt;br /&gt;14:19:49 trader: in tickPrice&lt;br /&gt;14:19:49 driver: disconnecting...&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;The code is pretty self-explanatory, and is a nice baseline to get started on some simple strategies.  Anything more complicated than that and I would recommend some sort of &lt;a href="http://en.wikipedia.org/wiki/State_pattern"&gt;state pattern&lt;/a&gt; implementation.&lt;br /&gt;&lt;br /&gt;I ran it against the demo account and despite the fact that my parameters were optimized to minimize my waiting time (very low params.KA and params.KB and an arbitrary number of cycles before I liquidated any current positions, not  to mention the fact that only market orders were used), it only lost about $100 (you start with about $500,000 in the demo account and each transaction cost about $1,000). 
This shouldn't by any means sound encouraging, since those prices aren't even real, they're simulated.&lt;br /&gt;&lt;br /&gt;&lt;a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://1.bp.blogspot.com/_JBvsBkmE5OU/So4DTpFyLkI/AAAAAAAAAIs/BTwpa80KGdE/s1600-h/ibaccount.JPG"&gt;&lt;img style="cursor:pointer; cursor:hand;width: 400px; height: 390px;" src="http://1.bp.blogspot.com/_JBvsBkmE5OU/So4DTpFyLkI/AAAAAAAAAIs/BTwpa80KGdE/s400/ibaccount.JPG" border="0" alt=""id="BLOGGER_PHOTO_ID_5372235041482092098" /&gt;&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;&lt;a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://2.bp.blogspot.com/_JBvsBkmE5OU/So4DakxHqPI/AAAAAAAAAI0/lYOxQlheYHg/s1600-h/iborder.JPG"&gt;&lt;img style="cursor:pointer; cursor:hand;width: 400px; height: 160px;" src="http://2.bp.blogspot.com/_JBvsBkmE5OU/So4DakxHqPI/AAAAAAAAAI0/lYOxQlheYHg/s400/iborder.JPG" border="0" alt=""id="BLOGGER_PHOTO_ID_5372235160580761842" /&gt;&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;Anyways, this should give you an idea of the whole thing.  
You may wanna look into what &lt;a href="http://en.wikipedia.org/wiki/Futures_contract"&gt;futures&lt;/a&gt; and &lt;a href="http://en.wikipedia.org/wiki/E-mini_S&amp;P"&gt;e-minis&lt;/a&gt; are all about or change the strategy to trade some stock that you may be familiar with.&lt;br /&gt;&lt;br /&gt;Another good set of resources is IB's &lt;a href="http://www.interactivebrokers.com/en/general/education/priorWebinars.php?ib_entity=llc"&gt;webinars&lt;/a&gt; on their APIs, along with the sample applications they provide that hook up to those APIs.&lt;br /&gt;&lt;br /&gt;Enjoy!&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/6475416749449483556-2228379127184070114?l=mechanistician.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://mechanistician.blogspot.com/feeds/2228379127184070114/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://mechanistician.blogspot.com/2009/08/trading-strategy-implementation.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/6475416749449483556/posts/default/2228379127184070114'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/6475416749449483556/posts/default/2228379127184070114'/><link rel='alternate' type='text/html' href='http://mechanistician.blogspot.com/2009/08/trading-strategy-implementation.html' title='Trading Strategy Implementation'/><author><name>Mechanistician</name><uri>http://www.blogger.com/profile/04405962471175755507</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' url='http://1.bp.blogspot.com/_JBvsBkmE5OU/So4DTpFyLkI/AAAAAAAAAIs/BTwpa80KGdE/s72-c/ibaccount.JPG' height='72' 
width='72'/><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-6475416749449483556.post-5408951053427385651</id><published>2009-08-17T11:53:00.000-07:00</published><updated>2009-08-17T18:08:19.609-07:00</updated><title type='text'>Interactive Brokers, ActiveX &amp; Matlab</title><content type='html'>So, last time I mentioned a few useful sites with info about algorithmic trading in general and about using matlab as a trading platform.  Ernest Chan, in particular, offers a wealth of information on his &lt;a href="http://epchan.blogspot.com/"&gt;blog&lt;/a&gt; and on his &lt;a href="http://www.epchan.com/"&gt;website&lt;/a&gt;, which includes a &lt;a href="http://www.epchan.com/Research.html"&gt;premium content section&lt;/a&gt; that you can access if you have a copy of his &lt;a href="http://www.amazon.com/dp/0470284889?tag=quantitativet-20&amp;amp;camp=14573&amp;amp;creative=327641&amp;amp;linkCode=as1&amp;amp;creativeASIN=0470284889&amp;amp;adid=13902YA02NJTS0PKS84K&amp;amp;"&gt;quantitative trading book&lt;/a&gt;.&lt;br /&gt;&lt;br /&gt;There's some good stuff in that premium content section, including an article mentioned on a &lt;a href="http://epchan.blogspot.com/2009/05/matlab-as-automated-execution-system.html"&gt;post&lt;/a&gt; of his that discusses a matlab implementation of a &lt;a href="http://en.wikipedia.org/wiki/Technical_analysis"&gt;technical analysis&lt;/a&gt; trading strategy (using &lt;a href="http://en.wikipedia.org/wiki/Bollinger_bands"&gt;Bollinger bands&lt;/a&gt;).  The strategy is simple and illustrative enough, but the problem is that if you want to execute it and play around with it, you need the &lt;a href="http://www.exchangeapi.com/ProductOverview.htm"&gt;matlab2ib&lt;/a&gt; library to hook up to interactive brokers.  
Fortunately, the creators of the matlab2ib library (which appears to be a thin shell around &lt;a href="http://www.interactivebrokers.com/php/apiUsersGuide/apiguide.htm"&gt;IB's own java API&lt;/a&gt;) offer a free 30-day trial version of their product; unfortunately however, this limited-time trial version is also limited in functionality and happens to lack certain functions used in the Bollinger band example.&lt;br /&gt;&lt;br /&gt;What I wanted was a quick way to get a feel for the workflow in the Trader Workstation application and its API so that I could then nicely structure a lightweight and reusable template to try out a few of my own strategies.  As I mentioned in my previous post, I have an interest in becoming more than simply familiar with matlab, and so my platform was decided.  On the other hand, &lt;a href="http://www.r-project.org/"&gt;R&lt;/a&gt; is particularly attractive for this since it offers for free much of the functionality that you would otherwise have to pay for in matlab.  
For example, matlab offers the following relevant toolboxes:&lt;br /&gt;&lt;ul&gt;&lt;li&gt;&lt;a href="http://www.mathworks.com/access/helpdesk/help/toolbox/datafeed/index.html?/access/helpdesk/help/toolbox/datafeed/&amp;amp;http://www.mathworks.com/access/helpdesk/help/helpdesk.html"&gt;datafeed toolbox&lt;/a&gt;&lt;/li&gt;&lt;li&gt;&lt;a href="http://www.mathworks.com/access/helpdesk/help/toolbox/finance/index.html?/access/helpdesk/help/toolbox/finance/&amp;amp;http://www.mathworks.com/access/helpdesk/help/helpdesk.html"&gt;financial toolbox&lt;/a&gt;&lt;/li&gt;&lt;li&gt;&lt;a href="http://www.mathworks.com/access/helpdesk/help/toolbox/econ/index.html?/access/helpdesk/help/toolbox/econ/&amp;amp;http://www.mathworks.com/access/helpdesk/help/helpdesk.html"&gt;econometrics toolbox&lt;/a&gt;&lt;br /&gt;&lt;/li&gt;&lt;li&gt;&lt;a href="http://www.mathworks.com/access/helpdesk/help/toolbox/finderiv/index.html?/access/helpdesk/help/toolbox/finderiv/&amp;amp;http://www.mathworks.com/access/helpdesk/help/helpdesk.html"&gt;financial derivatives toolbox&lt;/a&gt;&lt;/li&gt;&lt;li&gt;&lt;a href="http://www.mathworks.com/access/helpdesk/help/toolbox/finfixed/index.html?/access/helpdesk/help/toolbox/finfixed/&amp;amp;http://www.mathworks.com/access/helpdesk/help/helpdesk.html"&gt;fixed income toolbox&lt;/a&gt;&lt;/li&gt;&lt;/ul&gt;Which is all really impressive stuff (I mean, they've got everything!), but if you're operating on a budget and have the know-how, a little elbow grease goes a long way in R:&lt;br /&gt;&lt;ul&gt;&lt;li&gt;&lt;a href="http://www.rmetrics.org/"&gt;Rmetrics&lt;/a&gt;&lt;/li&gt;&lt;li&gt;&lt;a href="http://cran.r-project.org/web/views/Finance.html"&gt;CRAN Task View: Empirical Finance&lt;/a&gt;&lt;/li&gt;&lt;li&gt;&lt;a href="http://cran.r-project.org/web/views/Econometrics.html"&gt;CRAN Task View: Computational Econometrics&lt;/a&gt;&lt;br /&gt;&lt;/li&gt;&lt;/ul&gt;Furthermore, there is even a package for connecting R to IB, &lt;a 
href="http://cran.r-project.org/web/packages/IBrokers/index.html"&gt;IBrokers&lt;/a&gt;, which looks very much alive, although the last time I looked at it there was no order handling functionality yet.&lt;br /&gt;&lt;br /&gt;Another possible avenue could have been python, which also has IB connectivity through &lt;a href="http://code.google.com/p/ibpy/"&gt;ibpy&lt;/a&gt;, a library interesting on its own merits due to the approach used by its creator, &lt;a href="http://blog.melhase.net/"&gt;Troy Melhase&lt;/a&gt;, who wrote a &lt;a href="http://code.google.com/p/java2python/"&gt;java2python&lt;/a&gt; utility for converting IB's java api to python source using &lt;a href="http://www.antlr.org/"&gt;ANTLR&lt;/a&gt;.&lt;br /&gt;&lt;br /&gt;Personally, though, as much as I'm tempted otherwise, at this point I wanna mess with infrastructure stuff as little as possible and get to the meat of the system, the trading strategies, as quickly as possible.  The approach that seemed to satisfy that requirement best was to use IB's &lt;a href="http://en.wikipedia.org/wiki/ActiveX"&gt;ActiveX&lt;/a&gt; API from matlab, so I resurrected my dusty old windows box and wrote this little post in the time that it took for it to boot up :)&lt;br /&gt;&lt;br /&gt;Next time we'll actually look at some source and explore &lt;a href="http://www.interactivebrokers.com/en/p.php?f=tws&amp;amp;p=g"&gt;Trader Workstation&lt;/a&gt;'s workflow.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/6475416749449483556-5408951053427385651?l=mechanistician.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://mechanistician.blogspot.com/feeds/5408951053427385651/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://mechanistician.blogspot.com/2009/08/interactive-brokers-activex-matlab.html#comment-form' title='0 Comments'/><link rel='edit' 
type='application/atom+xml' href='http://www.blogger.com/feeds/6475416749449483556/posts/default/5408951053427385651'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/6475416749449483556/posts/default/5408951053427385651'/><link rel='alternate' type='text/html' href='http://mechanistician.blogspot.com/2009/08/interactive-brokers-activex-matlab.html' title='Interactive Brokers, ActiveX &amp; Matlab'/><author><name>Mechanistician</name><uri>http://www.blogger.com/profile/04405962471175755507</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-6475416749449483556.post-1414446807844176711</id><published>2009-08-09T14:53:00.001-07:00</published><updated>2009-09-16T09:06:31.963-07:00</updated><title type='text'>Automated Trading Systems</title><content type='html'>An &lt;a href="http://en.wikipedia.org/wiki/Algorithmic_trading"&gt;automated trading system&lt;/a&gt; seems like a good project in which to bring together all the ideas we've been discussing in this blog so far, perhaps with an aim to participate in the trading competition &lt;a href="http://www.interactivebrokers.com/en/general/education/IBTradingOlympiad.php"&gt;Interactive Brokers&lt;/a&gt; (IB) has been holding and will hopefully hold again this year.&lt;br /&gt;&lt;br /&gt;&lt;a href="http://www.mathworks.com/"&gt;Matlab&lt;/a&gt; appears to be a widely popular choice for these kinds of things, and plus it won't hurt to get some experience on that platform since it is also pretty popular in the engineering classes at &lt;a href="http://www.soe.ucsc.edu/courses"&gt;UC Santa Cruz&lt;/a&gt;, where I'll be transferring later this year. 
Classes I'm particularly looking forward to this fall include:&lt;br /&gt;&lt;ul&gt;&lt;li&gt;CMPS242: &lt;a href="http://www.soe.ucsc.edu/classes/cmps242/Winter08/"&gt;Machine Learning&lt;/a&gt;&lt;/li&gt;&lt;li&gt;CMPE241: &lt;a href="http://bionics.soe.ucsc.edu/education/classes_index.html"&gt;Feedback Control Systems&lt;/a&gt;&lt;/li&gt;&lt;li&gt;CMPE250: &lt;a href="http://www.soe.ucsc.edu/classes/cmpe250/Fall08/"&gt;Multimedia Systems&lt;br /&gt;&lt;/a&gt;&lt;/li&gt;&lt;/ul&gt;&lt;br /&gt;&lt;a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://4.bp.blogspot.com/_JBvsBkmE5OU/Sn-4l8rv6QI/AAAAAAAAAIc/FVLimsUoQWI/s1600-h/context_00004_pulp_fiction.jpg"&gt;&lt;img style="cursor: pointer; width: 380px; height: 220px;" src="http://4.bp.blogspot.com/_JBvsBkmE5OU/Sn-4l8rv6QI/AAAAAAAAAIc/FVLimsUoQWI/s400/context_00004_pulp_fiction.jpg" alt="" id="BLOGGER_PHOTO_ID_5368212242933213442" border="0" /&gt;&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;And I'm sure they'll also be the subject of a few blog posts.  Back on the topic of automated trading systems, here are a few links that can help jumpstart the project (video tutorials, code, etc) if you're also interested in building such a system:&lt;br /&gt;&lt;ul&gt;&lt;li&gt;&lt;a href="http://epchan.blogspot.com/"&gt;http://epchan.blogspot.com/&lt;/a&gt;&lt;/li&gt;&lt;li&gt;&lt;a href="http://www.tradingwithmatlab.com/"&gt;http://www.tradingwithmatlab.com/&lt;/a&gt;&lt;/li&gt;&lt;li&gt;&lt;a href="http://www.maxdama.com/"&gt;http://www.maxdama.com/&lt;/a&gt;&lt;/li&gt;&lt;/ul&gt;&lt;a href="http://en.wikipedia.org/wiki/Paper_trading"&gt;Paper trading&lt;/a&gt; is a nice, safe way to get started, but one of the really interesting things has to be the ability to move the market through one's own trading activity (the difference between observing a markov chain and influencing it, as in a markov decision problem).  
IB offers a &lt;a href="http://www.interactivebrokers.com/en/p.php?f=tws&amp;amp;ib_entity=uk"&gt;paper trading account&lt;/a&gt; to their customers.&lt;br /&gt;&lt;br /&gt;The following quote from Assistant U.S. Attorney Joseph Facciponti, acting on behalf of Goldman Sachs in the recent snafu where an ex-Goldman employee, Sergey Aleynikov, allegedly uploaded proprietary trading software from Goldman to a server in Germany prior to moving to a rival company, sheds some light on the otherwise extremely secretive industry:&lt;br /&gt;&lt;br /&gt;&lt;a href="http://www.bloomberg.com/apps/news?pid=20601087&amp;amp;sid=axYw_ykTBokE"&gt;"The bank has raised the possibility that there is a danger that somebody who knew how to use this program could use it to manipulate markets in unfair ways..."&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;My irony meter exploded when I read that...&lt;br /&gt;&lt;br /&gt;Anyways, there are a lot of other avenues through which to exert control over markets, including &lt;a href="http://en.wikipedia.org/wiki/Market_maker"&gt;market making&lt;/a&gt; (see &lt;a href="http://blog.oddhead.com/2006/10/30/implementing-hansons-market-maker/"&gt;here&lt;/a&gt;) and the Fed's attempts to influence the greater economy by manipulating very short term interest rates (here's a &lt;a href="http://www.learningmarkets.com/index.php/20081017521/Stocks/Investing-Basics/the-federal-reserves-open-market-operations.html"&gt;video resource&lt;/a&gt; on the &lt;a href="http://www.newyorkfed.org/aboutthefed/fedpoint/fed32.html"&gt;Fed's open market operations&lt;/a&gt;).  
I wonder the extent to which such control mechanisms are automated.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/6475416749449483556-1414446807844176711?l=mechanistician.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://mechanistician.blogspot.com/feeds/1414446807844176711/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://mechanistician.blogspot.com/2009/08/automated-trading-systems.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/6475416749449483556/posts/default/1414446807844176711'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/6475416749449483556/posts/default/1414446807844176711'/><link rel='alternate' type='text/html' href='http://mechanistician.blogspot.com/2009/08/automated-trading-systems.html' title='Automated Trading Systems'/><author><name>Mechanistician</name><uri>http://www.blogger.com/profile/04405962471175755507</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' url='http://4.bp.blogspot.com/_JBvsBkmE5OU/Sn-4l8rv6QI/AAAAAAAAAIc/FVLimsUoQWI/s72-c/context_00004_pulp_fiction.jpg' height='72' width='72'/><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-6475416749449483556.post-7741840967701923122</id><published>2009-08-07T11:11:00.001-07:00</published><updated>2009-08-11T01:08:28.271-07:00</updated><title type='text'>Cognitive Reflection Test</title><content type='html'>Speaking of control and optimal strategy, more often than not, the correct [or optimal] answer isn't the most obvious one. 
&lt;a href="http://mechanistician.blogspot.com/2009/06/strategy-optimization.html"&gt;Nathan Rothschild&lt;/a&gt; gave us an example of that, and I am also reminded of a study that makes an interesting analysis based on an individual's response to the following questions:&lt;br /&gt;&lt;ol&gt;&lt;li&gt;A bat and a ball cost $1.10 in total. The bat costs a dollar more than the ball. How much does the ball cost?&lt;/li&gt;&lt;li&gt;If it takes 5 machines 5 minutes to make 5 widgets, how long would it take 100 machines to make 100 widgets?&lt;/li&gt;&lt;li&gt;In a lake, there is a patch of lily pads. Every day, the patch doubles in size. If it takes 48 days for the patch to cover the entire lake, how long would it take for the patch to cover half of the lake?&lt;/li&gt;&lt;/ol&gt;Try to answer them before reading further...&lt;br /&gt;&lt;br /&gt;In the &lt;a href="http://www.mit.edu/people/shanefre/On%20the%20Ball.pdf"&gt;study&lt;/a&gt;, &lt;a href="http://www.mit.edu/people/shanefre/"&gt;Shane Frederick&lt;/a&gt; calls the 3-question test the "Cognitive Reflection Test"  (CRT), and claims that it measures the "ability or disposition to resist reporting the response that first comes to mind."  You will have to read the paper for the correct answers and for more details, but briefly, the nature of the questions is such that if you resist the temptation to blurt out the obvious and ponder the question for a little bit, you will almost inevitably arrive at the correct answer.  
To a lot of people, however, investing the time in thinking about the problem can be a painful experience, and the alternative, a quick and apparently good enough answer, can provide relief from this pain.&lt;br /&gt;&lt;br /&gt;The study goes further and shows that performance on the CRT is correlated with an individual's ability or tendency to forgo an immediate reward in favor of a higher future reward, a quality that can be essential in optimal decision making.&lt;br /&gt;&lt;br /&gt;Of course it helps if you have seen similar problems before and are able to reason by analogy; however, there is the implication that you have [willingly] endured through those similar problems before and learned the pattern.  Clearly, there is a powerful feedback mechanism at work here, where your good performance increases the likelihood you will practice your skill, which will in turn improve your performance, and so on.  Some believe that this mechanism (and a few other important details such as drive, a favorable environment, good socio-economic circumstances, etc), if exploited well, is one of the main factors in great performance (genius-like work) vs average performance (see &lt;a href="http://www.geoffcolvin.com/"&gt;Geoff Colvin's Talent is Overrated&lt;/a&gt;), and I tend to agree with that (most people have the same foundational ability but there is a time element, similar to how continuous compounding works, that needs to work with this feedback mechanism).&lt;br /&gt;&lt;br /&gt;Anyways, the study also mentions an interesting question:&lt;br /&gt;&lt;ul&gt;&lt;li&gt;What would you prefer, a sure $500 or a 15% chance of $1,000,000?&lt;/li&gt;&lt;/ul&gt;People's performance on the CRT was also correlated with their answer to this question: more people who scored well on the CRT chose the 15% gamble, whereas most people who did not score so well on the CRT chose the $500.&lt;br /&gt;&lt;br /&gt;&lt;a 
href="http://en.wikipedia.org/wiki/Decision_theory"&gt;Decision theory&lt;/a&gt; tells us that the correct way to answer this question is to calculate the expected value of each choice:&lt;br /&gt;&lt;ul&gt;&lt;li&gt;$500 * 1.0 = $500&lt;/li&gt;&lt;li&gt;$1,000,000 * 0.15 = $150,000&lt;/li&gt;&lt;/ul&gt;And the thing is that after playing this particular game enough times, having always chosen the second option as a policy will have yielded a higher profit.  What if you only play the game once?  Well, then there can be some controversy, and that's why some people see this particular question (one-shot game) as a case where personal preference is what matters (what's better: blue or green?) and where there are not necessarily any wrong answers; however, if one looks at this situation as indicative of an individual's general tendency towards choice under uncertainty throughout the course of their life, then even if they never face this exact problem again, having this general policy and decision-making strategy will statistically yield higher profits/rewards.&lt;br /&gt;&lt;br /&gt;This kind of thinking is one of the strengths of reinforcement learning methods and bayesian influence networks, which, when automated (given a proper model), can work much faster and more objectively than any human today possibly could.  Such simulation systems may also be more likely to identify &lt;a href="http://en.wikipedia.org/wiki/Unintended_consequence"&gt;unintended consequences&lt;/a&gt;.&lt;br /&gt;&lt;br /&gt;Now, what happens when the decisions we make influence other people's decisions, which in turn may influence our decisions?  How do we find optimal strategies then?  
Well, &lt;a href="http://en.wikipedia.org/wiki/Game_theory"&gt;game theory&lt;/a&gt; has a lot to say about that and it may very well be the subject of a future post.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/6475416749449483556-7741840967701923122?l=mechanistician.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://mechanistician.blogspot.com/feeds/7741840967701923122/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://mechanistician.blogspot.com/2009/08/cognitive-reflection-test.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/6475416749449483556/posts/default/7741840967701923122'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/6475416749449483556/posts/default/7741840967701923122'/><link rel='alternate' type='text/html' href='http://mechanistician.blogspot.com/2009/08/cognitive-reflection-test.html' title='Cognitive Reflection Test'/><author><name>Mechanistician</name><uri>http://www.blogger.com/profile/04405962471175755507</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-6475416749449483556.post-6192148334796640702</id><published>2009-08-04T16:27:00.001-07:00</published><updated>2009-09-09T10:33:06.403-07:00</updated><title type='text'>Time Series Modeling, Forecasting &amp; Control II</title><content type='html'>So, continuing with our discussion on time series analysis, I would like to focus on what I consider to be the end-goal of such enterprise: control.  
Modeling and forecasting (I think 'estimating' would be a better, more encompassing term than 'forecasting') are just things we need to do in order to be able to control.&lt;br /&gt;&lt;br /&gt;By control, I mean influencing a system (entirely or partially) so as to maximize some notion of reward received from that system.  There is nothing really new here; these ideas have been at the core of several related fields:&lt;br /&gt;&lt;ul&gt;&lt;li&gt;&lt;a href="http://en.wikipedia.org/wiki/Reinforcement_learning"&gt;reinforcement learning&lt;/a&gt;: "reinforcement learning is a sub-area of machine learning concerned with how an agent ought to take actions in an environment so as to maximize some notion of long-term reward."&lt;/li&gt;&lt;li&gt;&lt;a href="http://en.wikipedia.org/wiki/Optimal_control"&gt;optimal control&lt;/a&gt;: "optimal control deals with the problem of finding a control law for a given system such that a certain optimality criterion is achieved."&lt;/li&gt;&lt;li&gt;&lt;a href="http://en.wikipedia.org/wiki/Operations_research"&gt;operations research&lt;/a&gt;: "concerned with determining the maxima (of profit, assembly line performance, crop yield, bandwidth, etc) or minima (of loss, risk, etc.) of some objective function."&lt;br /&gt;&lt;/li&gt;&lt;li&gt;&lt;a href="http://en.wikipedia.org/wiki/Artificial_intelligence"&gt;artificial intelligence&lt;/a&gt;: "the study and design of systems that perceive their environment and take actions which maximize their chances of success."&lt;/li&gt;&lt;/ul&gt;Now, given all these fields and the countless minds which have been dedicated to solving what is essentially the same problem, have we found a solution yet?  Or have we at least found some sort of general framework or unifying theme that may help guide us in our search?  
Well, unfortunately (or fortunately?), a perfect solution still eludes us, but I think we have a pretty good general framework for developing good enough (and sometimes optimal) solutions: &lt;a href="http://en.wikipedia.org/wiki/Bayesian_network"&gt;bayesian networks&lt;/a&gt;.&lt;br /&gt;&lt;br /&gt;The term 'bayesian networks' may not be the best (perhaps 'probabilistic graphical models' is better, more general), but then again, &lt;a href="http://www.famousquotes.me.uk/shakespeare_william/40.htm"&gt;what's in a name&lt;/a&gt;?  In any event, I think bayesian networks are the best general framework we have for dealing with these problems (if you know of a better one, do let me know), and this post will illustrate that a bit.  This is not a tutorial or anything like that; rather, my goal is to pique your curiosity enough so that you may decide to look further into bayesian networks.&lt;br /&gt;&lt;br /&gt;Let's start with some motivating problems:&lt;br /&gt;&lt;ol&gt;&lt;li&gt;&lt;p&gt;There are eight types of blood (A, B, AB and O, each coming in + or -), which may be stored until it is six weeks old, at which point it has to be discarded. Some blood types may be substituted for others (as shown in the diagram). For example, O- blood may be used for anyone, while B+ blood may be used only with patients with B+ blood or AB+ blood.&lt;/p&gt;&lt;a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://2.bp.blogspot.com/_JBvsBkmE5OU/SnjfZVlm3WI/AAAAAAAAAIM/S1NxSyEKNqg/s1600-h/bloodmanagement.jpg"&gt;&lt;img style="margin: 0px auto 10px; display: block; text-align: center; cursor: pointer; width: 320px; height: 240px;" src="http://2.bp.blogspot.com/_JBvsBkmE5OU/SnjfZVlm3WI/AAAAAAAAAIM/S1NxSyEKNqg/s320/bloodmanagement.jpg" alt="" id="BLOGGER_PHOTO_ID_5366284582396550498" border="0" /&gt;&lt;/a&gt;&lt;p&gt;Each week, new donations are made, and new demands for blood arise. 
The problem is to determine which blood type to use for each demand type in a given week. Do you use the O- blood now, or hold it for later when it might be used for a patient with A- blood, which can only use A- blood or O- blood?&lt;/p&gt;&lt;/li&gt;&lt;li&gt;A pig breeder is growing pigs for a period of four months and subsequently selling them. During this period the pig may or may not develop a certain disease. If the pig has the disease at the time it must be sold, the pig must be sold for slaughtering, and its expected market price is then 300 DKK (Danish kroner). If it is disease free, its expected market price as a breeding animal is 1000 DKK.&lt;br /&gt;&lt;br /&gt;Once a month, a veterinary doctor sees the pig and makes a test for presence of the disease. If the pig is ill, the test will indicate this with probability 0.80, and if the pig is healthy, the test will indicate this with probability 0.90. At each monthly visit, the doctor may or may not treat the pig for the disease by injecting a certain drug. The cost of an injection is 100 DKK.&lt;br /&gt;&lt;br /&gt;A pig has the disease in the first month with probability 0.10. A healthy pig develops the disease in the subsequent month with probability 0.20 without injection, whereas a healthy and treated pig develops the disease with probability 0.10, so the injection has some preventive effect. An untreated pig that is unhealthy will remain so in the subsequent month with probability 0.90, whereas the similar probability is 0.50 for an unhealthy pig that is treated. Thus spontaneous cure is possible, but treatment is beneficial on average.  The problem here is to come up with an optimal course of action, a strategy, which maximizes the expected utility for the pig breeder.&lt;br /&gt;&lt;/li&gt;&lt;/ol&gt;How would you solve these 2 problems?  Is there a single method that can be applied to both problems and find a satisfactory answer?  
I think you know the answer already :)&lt;br /&gt;&lt;br /&gt;The blood management problem is from &lt;a href="http://www.castlelab.princeton.edu/"&gt;Warren Powell&lt;/a&gt;'s book "&lt;a href="http://www.castlelab.princeton.edu/adp.htm"&gt;Approximate dynamic programming: solving the curses of dimensionality&lt;/a&gt;" and there is some &lt;a href="http://www.castlelab.princeton.edu/adp/ADP%20datasets/bloodmanagement.htm"&gt;material online&lt;/a&gt; about it, including a &lt;a href="http://www.castlelab.princeton.edu/adp/ADP%20datasets/ADPBloodManagement_VincentYu2007.zip"&gt;matlab implementation of an approximate dynamic programming solution&lt;/a&gt;.  The second problem is from a &lt;a href="http://www.stats.ox.ac.uk/%7Esteffen/papers/limids.pdf"&gt;paper&lt;/a&gt; by &lt;a href="http://www.stats.ox.ac.uk/%7Esteffen/"&gt;Steffen Lauritzen&lt;/a&gt;, and there is an implementation of the algorithm proposed by Steffen in the &lt;a href="http://www.cs.ubc.ca/%7Emurphyk/Software/BNT/bnt.html"&gt;matlab toolbox for bayesian networks&lt;/a&gt; developed by &lt;a href="http://www.cs.ubc.ca/%7Emurphyk/"&gt;Kevin Murphy&lt;/a&gt;, which includes, &lt;a href="http://www.cs.ubc.ca/%7Emurphyk/Software/BNT/usage.html#influence"&gt;as an example, a solution to the pig problem&lt;/a&gt;.&lt;br /&gt;&lt;br /&gt;Granted, methods such as those under the umbrella of reinforcement learning and approximate dynamic programming are extremely powerful, maybe as powerful as bayesian networks; however, there appears to be a general consensus in the research community that regards having a more explicit model of the environment as a generally superior approach.  
To fuel the controversy (or confusion, really, since we really just don't quite know which is in fact better: model-free or model-based), there &lt;span style="font-style: italic;"&gt;is&lt;/span&gt; a place for models in certain reinforcement learning architectures; in fact, there is a &lt;a href="http://www.cs.ualberta.ca/%7Esutton/book/ebook/node94.html"&gt;varying degree of model complexity&lt;/a&gt; in the wide spectrum of reinforcement learning techniques out there.&lt;br /&gt;&lt;br /&gt;Personally, and this is purely qualitative speculation on my part, I think that we could get more mileage out of using bayesian networks as &lt;span style="font-style: italic;"&gt;the&lt;/span&gt; framework onto which we integrate reinforcement learning ideas rather than the other way around (Lauritzen's Single Policy Updating can be thought of as a sort of Policy Iteration used for solving Markov Decision Problems).  Why do I think so?  Well, for one, I think just about every machine learning technique we know of can be framed in a bayes net perspective, not just those that require sequential decision making and trading-off rewards in the present vs rewards in the future.  
Furthermore, there is a variety of reasoning patterns that are naturally facilitated by the bayes net formulation:&lt;br /&gt;&lt;ul&gt;&lt;li&gt;diagnostic reasoning: reasoning from symptoms to causes&lt;br /&gt;&lt;/li&gt;&lt;li&gt;predictive reasoning: reasoning from new information about causes to new beliefs about effects&lt;br /&gt;&lt;/li&gt;&lt;li&gt;intercausal reasoning: reasoning about mutual causes of a common effect&lt;br /&gt;&lt;/li&gt;&lt;/ul&gt;Yes, this 'reasoning patterns' taxonomy can be a little sketchy if looked at too closely, especially if the reader is in any way 'allergic' to notions of causality such as those discussed by &lt;a href="http://bayes.cs.ucla.edu/BOOK-2K/"&gt;Pearl&lt;/a&gt;, but when we get down to it, they are simply computations we do on a joint probability distribution; and tell me, do we have a more efficient way to represent and compute on high-dimensional joint probability distributions than bayesian networks?&lt;br /&gt;&lt;br /&gt;So, what's the best way to jump into bayesian nets?  Well, assuming you've gone through &lt;a href="http://aima.cs.berkeley.edu/"&gt;AIMA&lt;/a&gt;, I would recommend starting with Neapolitan's &lt;a href="http://www.amazon.com/Learning-Bayesian-Networks-Artificial-Intelligence/dp/0130125342"&gt;Learning Bayesian Networks&lt;/a&gt;.  
As you go through it, download and play with the &lt;a href="http://www.cs.ubc.ca/%7Emurphyk/Software/BNT/bnt.html"&gt;bayes net toolbox for matlab&lt;/a&gt;, a preview of which is included below (solution for the pig breeder problem):&lt;br /&gt;&lt;br /&gt;&lt;pre name="code" class="python"&gt;&lt;br /&gt;% pigs model from Lauritzen and Nilsson, 2001&lt;br /&gt;% from BNT/examples/limids&lt;br /&gt;&lt;br /&gt;seed = 0;&lt;br /&gt;rand('state', seed);&lt;br /&gt;randn('state', seed);&lt;br /&gt;&lt;br /&gt;% we number nodes down and to the right&lt;br /&gt;h = [1 5 9 13];&lt;br /&gt;t = [2 6 10];&lt;br /&gt;d = [3 7 11];&lt;br /&gt;u = [4 8 12 14];&lt;br /&gt;&lt;br /&gt;N = 14;&lt;br /&gt;dag = zeros(N);&lt;br /&gt;&lt;br /&gt;% causal arcs&lt;br /&gt;for i=1:3&lt;br /&gt; dag(h(i), [t(i) h(i+1)]) = 1;&lt;br /&gt; dag(d(i), [u(i) h(i+1)]) = 1;&lt;br /&gt;end&lt;br /&gt;dag(h(4), u(4)) = 1;&lt;br /&gt;&lt;br /&gt;% information arcs&lt;br /&gt;fig = 2;&lt;br /&gt;switch fig&lt;br /&gt;case 0,&lt;br /&gt; % no info arcs&lt;br /&gt;case 1,&lt;br /&gt;  % no-forgetting policy (figure 1)&lt;br /&gt;  for i=1:3&lt;br /&gt;    dag(t(i), d(i:3)) = 1;&lt;br /&gt;  end&lt;br /&gt;case 2,&lt;br /&gt; % reactive policy (figure 2)&lt;br /&gt; for i=1:3&lt;br /&gt;   dag(t(i), d(i)) = 1;&lt;br /&gt; end&lt;br /&gt;case 7,&lt;br /&gt; % omniscient policy (figure 7: di has access to hidden state h(i-1))&lt;br /&gt; dag(t(1), d(1)) = 1;&lt;br /&gt; for i=2:3&lt;br /&gt;   %dag([h(i-1) t(i-1) d(i-1)], d(i)) = 1;&lt;br /&gt;   dag([h(i-1) d(i-1)], d(i)) = 1; % t(i-1) is redundant given h(i-1)&lt;br /&gt; end&lt;br /&gt;end&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;ns = 2*ones(1,N);&lt;br /&gt;ns(u) = 1;&lt;br /&gt;&lt;br /&gt;% parameter tying&lt;br /&gt;params = ones(1,N);&lt;br /&gt;uparam = 1;&lt;br /&gt;final_uparam = 2;&lt;br /&gt;tparam = 3;&lt;br /&gt;h1_param = 4;&lt;br /&gt;hparam = 5;&lt;br /&gt;dparams = 6:8;&lt;br /&gt;&lt;br /&gt;params(u(1:3)) = uparam;&lt;br 
/&gt;params(u(4)) = final_uparam;&lt;br /&gt;params(t) = tparam;&lt;br /&gt;params(h(1)) = h1_param;&lt;br /&gt;params(h(2:end)) = hparam;&lt;br /&gt;params(d) = dparams;&lt;br /&gt;&lt;br /&gt;limid = mk_limid(dag, ns, 'chance', [h t], 'decision', d, 'utility', u, ...&lt;br /&gt;               'equiv_class', params);&lt;br /&gt;&lt;br /&gt;% h = 1 means healthy, h = 2 means diseased&lt;br /&gt;% d = 1 means don't treat, d = 2 means treat&lt;br /&gt;% t = 1 means test shows healthy, t = 2 means test shows diseased&lt;br /&gt;&lt;br /&gt;if 0&lt;br /&gt; % use random params&lt;br /&gt; limid.CPD{final_uparam} = tabular_utility_node(limid, u(4));&lt;br /&gt; limid.CPD{uparam} = tabular_utility_node(limid, u(1));&lt;br /&gt; limid.CPD{tparam} = tabular_CPD(limid, t(1));&lt;br /&gt; limid.CPD{h1_param} = tabular_CPD(limid, h(1));&lt;br /&gt; limid.CPD{hparam} = tabular_CPD(limid, h(2));&lt;br /&gt;else&lt;br /&gt; limid.CPD{final_uparam} = tabular_utility_node(limid, u(4), [1000 300]);&lt;br /&gt; % costs have negative utility!&lt;br /&gt; limid.CPD{uparam} = tabular_utility_node(limid, u(1), [0 -100]);&lt;br /&gt;&lt;br /&gt; % h  P(t=1) P(t=2)&lt;br /&gt; % 1  0.9   0.1&lt;br /&gt; % 2  0.2   0.8&lt;br /&gt; limid.CPD{tparam} = tabular_CPD(limid, t(1), [0.9 0.2 0.1 0.8]);&lt;br /&gt;&lt;br /&gt; % P(h1)&lt;br /&gt; limid.CPD{h1_param} = tabular_CPD(limid, h(1), [0.9 0.1]);&lt;br /&gt;&lt;br /&gt; % hi di P(hj=1) P(hj=2),  j = i+1, i=1:3&lt;br /&gt; % 1  1  0.8     0.2&lt;br /&gt; % 2  1  0.1     0.9&lt;br /&gt; % 1  2  0.9     0.1&lt;br /&gt; % 2  2  0.5     0.5&lt;br /&gt; limid.CPD{hparam} = tabular_CPD(limid, h(2), [0.8 0.1 0.9 0.5 0.2 0.9 0.1 0.5]);&lt;br /&gt;end&lt;br /&gt;&lt;br /&gt;% Decision nodes get assigned uniform policies by default&lt;br /&gt;for i=1:3&lt;br /&gt; limid.CPD{dparams(i)} = tabular_decision_node(limid, d(i));&lt;br /&gt;end&lt;br /&gt;&lt;br /&gt;engines = {};&lt;br /&gt;engines{end+1} = global_joint_inf_engine(limid);&lt;br 
/&gt;engines{end+1} = jtree_limid_inf_engine(limid);&lt;br /&gt;%engines{end+1} = belprop_inf_engine(limid, 'max_iter', 1*N, 'filename', ...&lt;br /&gt;                                   %fname, 'tol', 1e-3);&lt;br /&gt;&lt;br /&gt;exact = [1 2];&lt;br /&gt;%approx = 3;&lt;br /&gt;approx = [];&lt;br /&gt;&lt;br /&gt;max_iter = 1;&lt;br /&gt;order = d(end:-1:1);&lt;br /&gt;%order = d(1:end);&lt;br /&gt;&lt;br /&gt;NE = length(engines);&lt;br /&gt;MEU = zeros(1, NE);&lt;br /&gt;niter = zeros(1, NE);&lt;br /&gt;strategy = cell(1, NE);&lt;br /&gt;for e=1:NE&lt;br /&gt; [strategy{e}, MEU(e), niter(e)] = solve_limid(engines{e}, 'max_iter', ...&lt;br /&gt;     max_iter, 'order',  order);&lt;br /&gt;end&lt;br /&gt;MEU&lt;br /&gt;&lt;br /&gt;% check results match those in the paper (p. 22)&lt;br /&gt;direct_policy = eye(2); % treat iff test is positive&lt;br /&gt;never_policy = [1 0; 1 0]; % never treat&lt;br /&gt;tol = 1e-0; % results in paper are reported to 0dp&lt;br /&gt;for e=exact(:)'&lt;br /&gt; switch fig&lt;br /&gt;  case 2, % reactive policy&lt;br /&gt;   assert(approxeq(MEU(e), 727, tol));&lt;br /&gt;   assert(approxeq(strategy{e}{d(1)}(:), never_policy(:)))&lt;br /&gt;   assert(approxeq(strategy{e}{d(2)}(:), direct_policy(:)))&lt;br /&gt;   assert(approxeq(strategy{e}{d(3)}(:), direct_policy(:)))&lt;br /&gt;  case 1, assert(approxeq(MEU(e), 729, tol));&lt;br /&gt;  case 7, assert(approxeq(MEU(e), 732, tol));&lt;br /&gt; end&lt;br /&gt;end&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;for e=approx(:)'&lt;br /&gt; for i=1:3&lt;br /&gt;   approxeq(strategy{exact(1)}{d(i)}, strategy{e}{d(i)})&lt;br /&gt;   dispcpt(strategy{e}{d(i)})&lt;br /&gt; end&lt;br /&gt;end&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;Which yields a strategy for a maximum expected utility of 727 DKK, a pretty good approximation to the optimal maximum expected utility of 729 DKK.  
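&lt;br /&gt;&lt;br /&gt;As a rough sanity check on that 727 figure, here is a minimal sketch in R (it is not part of BNT) that evaluates the reactive strategy by pushing the probability of the pig being healthy forward month by month.  The probabilities and utilities are copied from the listing above; the variable names are made up for illustration, and the shortcut of tracking only the marginal health distribution is an assumption that holds only because a reactive policy looks at nothing but the current test:&lt;br /&gt;&lt;br /&gt;&lt;pre name="code" class="python"&gt;&lt;br /&gt;# sketch: expected utility of the reactive pig policy (names are illustrative)&lt;br /&gt;p_h &lt;- c(healthy=0.9, diseased=0.1)     # P(h1)&lt;br /&gt;p_pos &lt;- c(healthy=0.1, diseased=0.8)   # P(test shows diseased | h)&lt;br /&gt;# P(healthy next month | h, decision); row 1 is healthy, row 2 is diseased&lt;br /&gt;p_heal &lt;- cbind(no_treat=c(0.8,0.1), treat=c(0.9,0.5))&lt;br /&gt;# P(treat | test result) for months 1 to 3: never, then treat iff the test is positive&lt;br /&gt;policy &lt;- list(c(neg=0,pos=0), c(neg=0,pos=1), c(neg=0,pos=1))&lt;br /&gt;&lt;br /&gt;eu &lt;- 0&lt;br /&gt;for (i in 1:3) {&lt;br /&gt;  # P(treat | h) under this month's policy&lt;br /&gt;  p_treat &lt;- (1-p_pos)*policy[[i]][["neg"]] + p_pos*policy[[i]][["pos"]]&lt;br /&gt;  eu &lt;- eu - 100*sum(p_h*p_treat)   # expected treatment cost&lt;br /&gt;  healthy &lt;- sum(p_h*((1-p_treat)*p_heal[,"no_treat"] + p_treat*p_heal[,"treat"]))&lt;br /&gt;  p_h &lt;- c(healthy=healthy, diseased=1-healthy)&lt;br /&gt;}&lt;br /&gt;eu &lt;- eu + sum(p_h*c(1000,300))     # payoff when the pig is sold&lt;br /&gt;eu&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;The result comes out at roughly 727, in line with the value asserted in the BNT example.&lt;br /&gt;&lt;br /&gt;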
You should really go read Lauritzen's paper, but just in case you're curious, here is what an optimal strategy would look like:&lt;br /&gt;&lt;ol&gt;&lt;li&gt;Never treat in the first month.&lt;/li&gt;&lt;li&gt;Treat in the second month if and only if test results are positive in the first month and in the second month.&lt;/li&gt;&lt;li&gt;Treat in the last month if and only if the test results are positive in the third month or in both of the first and second months.&lt;br /&gt;&lt;/li&gt;&lt;/ol&gt;An interesting exercise would be to now solve the blood management problem under this framework.  We shall leave that for a future post.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/6475416749449483556-6192148334796640702?l=mechanistician.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://mechanistician.blogspot.com/feeds/6192148334796640702/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://mechanistician.blogspot.com/2009/08/time-series-modelling-forecasting.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/6475416749449483556/posts/default/6192148334796640702'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/6475416749449483556/posts/default/6192148334796640702'/><link rel='alternate' type='text/html' href='http://mechanistician.blogspot.com/2009/08/time-series-modelling-forecasting.html' title='Time Series Modeling, Forecasting &amp; Control II'/><author><name>Mechanistician</name><uri>http://www.blogger.com/profile/04405962471175755507</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' 
url='http://2.bp.blogspot.com/_JBvsBkmE5OU/SnjfZVlm3WI/AAAAAAAAAIM/S1NxSyEKNqg/s72-c/bloodmanagement.jpg' height='72' width='72'/><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-6475416749449483556.post-7963726954963453129</id><published>2009-07-29T09:43:00.000-07:00</published><updated>2009-07-29T13:42:54.083-07:00</updated><title type='text'>Time Series Modeling, Forecasting &amp; Control</title><content type='html'>Most of the algorithms we've discussed so far are for data which is considered to be independent and identically distributed, or IID (we did look briefly at Markov models &lt;a href="http://mechanistician.blogspot.com/2009/05/markov-chain-monte-carlo-ii.html"&gt;here&lt;/a&gt; and &lt;a href="http://mechanistician.blogspot.com/2009/05/hidden-markov-model-iii.html"&gt;here&lt;/a&gt;); in other words, each data point is sampled from the same invariant distribution, and a given sample is independent of any samples that came before it.  This is easily visualized by thinking of a data generating machine/process whose core engine is a Gaussian distribution.  Once we turn the machine on and it begins to generate data, we may see something like this:&lt;br /&gt;&lt;br /&gt;&lt;pre name="code" class="python"&gt;&lt;br /&gt;&gt; rnorm(10,mean=10,sd=1)&lt;br /&gt;[1] 10.685401  9.262564  9.965892  9.637854 11.069165  8.789038 10.151829&lt;br /&gt;[8] 10.851841  9.615401 11.004101&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;What if we have a different kind of generating machine?  
Say we have a &lt;a href="http://en.wikipedia.org/wiki/Finite-state_machine"&gt;state machine&lt;/a&gt;, where each data point generated depends stochastically on the current state of the machine:&lt;br /&gt;&lt;br /&gt;&lt;pre name="code" class="python"&gt;&lt;br /&gt;seed&lt;-0&lt;br /&gt;for (val in rnorm(10,mean=10,sd=1)) &lt;br /&gt;{&lt;br /&gt; seed&lt;-seed+val&lt;br /&gt; cat(seed,"\n")&lt;br /&gt;}&lt;br /&gt;8.715963 &lt;br /&gt;17.36178 &lt;br /&gt;25.38322 &lt;br /&gt;35.13913 &lt;br /&gt;45.95719 &lt;br /&gt;56.685 &lt;br /&gt;66.37728 &lt;br /&gt;75.36139 &lt;br /&gt;85.55207 &lt;br /&gt;95.67641 &lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;Or how about a machine that is evolving through time according to some deterministic (or nondeterministic) function, which really is just another kind of state machine where the state is a time counter:&lt;br /&gt;&lt;br /&gt;&lt;pre name="code" class="python"&gt;&lt;br /&gt;samples&lt;-rnorm(10,mean=10,sd=1)&lt;br /&gt;for (i in 1:10) &lt;br /&gt;{&lt;br /&gt; sample&lt;-samples[i]*i&lt;br /&gt; cat(sample,"\n")&lt;br /&gt;}&lt;br /&gt;11.94815 &lt;br /&gt;21.08698 &lt;br /&gt;24.32918 &lt;br /&gt;40.33526 &lt;br /&gt;48.37734 &lt;br /&gt;73.4921 &lt;br /&gt;62.65447 &lt;br /&gt;70.49196 &lt;br /&gt;84.39426 &lt;br /&gt;83.01182&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;In all these cases, we would not be well served by the IID assumption when trying to infer the generating process from data.  What we do instead is to incorporate the sequential data dependency into our models.  
There are several standard ways of doing so, the most common, in order of increasing generality, being:&lt;br /&gt;&lt;ul&gt;&lt;li&gt;&lt;a href="http://en.wikipedia.org/wiki/White_noise"&gt;White Noise Processes&lt;/a&gt;&lt;p&gt;y_t = e_t; where e_t has a certain invariant distribution such as a Normal distribution with mean 'mu' and standard deviation 'sd'.&lt;/p&gt;&lt;/li&gt;&lt;/ul&gt;&lt;ul&gt;&lt;li&gt;&lt;a href="http://en.wikipedia.org/wiki/Moving_average"&gt;Moving Average (MA) Processes&lt;/a&gt;&lt;p&gt;y_t = e_t + w_1 * e_t-1; which is simply a white noise process plus feedback from the noise at time t-1.  This could be extended to also contain feedback from even more white noise signals from the past (t-2, t-3, etc).&lt;/p&gt;&lt;/li&gt;&lt;/ul&gt;&lt;ul&gt;&lt;li&gt;&lt;a href="http://en.wikipedia.org/wiki/AR_process"&gt;Auto Regressive (AR) Processes&lt;/a&gt;&lt;p&gt;y_t = e_t + w_1 * y_t-1; where now we feed back not the previous noise, but the previous value of the series.  Unrolling the recursion shows that, when w_1 is smaller than 1 in absolute value, this is equivalent to an MA process of infinite order in which every previous white noise term e_t-k is fed back with weight w_1^k.  This model can also be generalized by feeding back more past values of the time series (t-2, t-3, etc).&lt;/p&gt;&lt;/li&gt;&lt;/ul&gt;&lt;ul&gt;&lt;li&gt;&lt;a href="http://en.wikipedia.org/wiki/Autoregressive_moving_average_model"&gt;Auto Regressive Moving Average (ARMA) Processes&lt;/a&gt;&lt;p&gt;y_t = e_t + w_1 * y_t-1 + w_2 * e_t-1; where we combine AR and MA processes, in order to seek a better fit with fewer parameters than a pure AR or MA model would need.  
We could also extend this model by adding further feedback components from time series values and/or noise.&lt;/p&gt;&lt;/li&gt;&lt;/ul&gt;&lt;ul&gt;&lt;li&gt;&lt;a href="http://en.wikipedia.org/wiki/Autoregressive_integrated_moving_average"&gt;Auto Regressive Integrated Moving Average (ARIMA) Processes&lt;/a&gt;&lt;p&gt;The previous models were for &lt;a href="http://en.wikipedia.org/wiki/Stationary_process"&gt;stationary processes&lt;/a&gt;, whereas ARIMA is a model for nonstationary processes.  Take stock prices, for example, which follow a nonstationary process.  If we subtract price at t-1 from price at t, we end up with another time series which may be stationary.  If so, we could then apply some sort of ARMA model, say ARMA(p,q), which would be equivalent to saying we used an ARIMA(p,1,q) model (the 1 is for first difference).  There are other variations, including SARIMA (seasonal) and VARIMA (vector), and they're all nicely structured under the &lt;a href="http://en.wikipedia.org/wiki/Box-Jenkins"&gt;Box-Jenkins framework&lt;/a&gt;.&lt;/p&gt;&lt;/li&gt;&lt;/ul&gt;&lt;br /&gt;All these models, once parameterized at their meta level (choosing 'p' and/or 'q'), can then be fit using maximum likelihood methods.  Furthermore, rather than having a simple point-estimate prediction of the expectation, we can get a probabilistic distribution for the predicted value, in other words E(y|x) vs P(y|x).  Speaking of which, I wonder what would be the result of introducing probabilities earlier in children's math curriculum, say as early as when they learn their multiplication tables.  Then, they could be taught not that 3*5=15, but that E(3*5)=15.  Perhaps I'm exaggerating a little, but just because we don't see &lt;span style="font-style: italic;"&gt;it&lt;/span&gt;, it does not mean &lt;span style="font-style: italic;"&gt;it&lt;/span&gt; is not there.  
These things pop up all the time in math (when dealing with 2-dimensional problems, the 3rd dimension is always lurking; or when we convert to the domain of complex numbers to solve a differential equation and then drop the imaginary part at the end); it is always helpful to learn to frame problems under new frameworks or imagine what's "behind" the paper, so to speak (projecting data points to a higher-dimensional space in order to linearly separate nonlinear data is just another example).&lt;br /&gt;&lt;br /&gt;Anyways, all of the above can be done in &lt;a href="http://www.r-project.org/"&gt;R&lt;/a&gt; using the &lt;a href="http://stat.ethz.ch/R-manual/R-patched/library/stats/html/arima.html"&gt;arima&lt;/a&gt; command and a variety of ancillary commands (&lt;a href="http://stat.ethz.ch/R-manual/R-patched/library/stats/html/acf.html"&gt;acf&lt;/a&gt;, etc), but what I would like to explore further is a different kind of model for sequential data: linear dynamical systems, specifically, &lt;a href="http://en.wikipedia.org/wiki/Kalman_filter"&gt;Kalman filters&lt;/a&gt;.&lt;br /&gt;&lt;br /&gt;One obvious way to seek improvement in the temporal models described above is to include additional (exogenous) variables we think may have an effect on our variable of interest.  For example, if we are trying to model the evolution of a stock price through time, we may add to the model, in addition to feedback lags from the stock price, feedback lags from macroeconomic variables such as inflation or gross national product, or maybe even feedback lags from the market price, resulting in a spatio-temporal model (an example of this is the so called &lt;a href="http://en.wikipedia.org/wiki/Autoregressive_moving_average_model#Autoregressive_moving_average_model_with_exogenous_inputs_model_.28ARMAX_model.29"&gt;ARMAX&lt;/a&gt; model). 
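&lt;br /&gt;&lt;br /&gt;To tie the pieces above together, here is a minimal sketch in R on simulated data.  The coefficients 0.6, 0.3 and 2, the series length and the variable names are all made up for illustration; arima.sim, arima and predict come from the stats package.  One caveat: when an exogenous input is passed through xreg, arima fits a regression on that input with ARMA errors, which is close to, but not quite the same thing as, the ARMAX formulation:&lt;br /&gt;&lt;br /&gt;&lt;pre name="code" class="python"&gt;&lt;br /&gt;set.seed(1)&lt;br /&gt;# simulate an ARMA(1,1) process: y_t = e_t + 0.6*y_t-1 + 0.3*e_t-1&lt;br /&gt;y &lt;- arima.sim(model=list(ar=0.6, ma=0.3), n=500)&lt;br /&gt;&lt;br /&gt;# fit an ARMA(1,1), i.e. an ARIMA(1,0,1), by maximum likelihood&lt;br /&gt;fit &lt;- arima(y, order=c(1,0,1))&lt;br /&gt;coef(fit)&lt;br /&gt;&lt;br /&gt;# predictive distribution 5 steps ahead: mean plus an approximate 95% band&lt;br /&gt;fc &lt;- predict(fit, n.ahead=5)&lt;br /&gt;cbind(mean=fc$pred, lower=fc$pred-1.96*fc$se, upper=fc$pred+1.96*fc$se)&lt;br /&gt;&lt;br /&gt;# a hypothetical exogenous input x, passed in through xreg&lt;br /&gt;x &lt;- rnorm(500)&lt;br /&gt;fit_x &lt;- arima(y + 2*x, order=c(1,0,1), xreg=x)&lt;br /&gt;coef(fit_x)&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;The fitted coefficients should land near the values used in the simulation, and the band around the forecast is the P(y|x) view of the prediction rather than just E(y|x).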
&lt;br /&gt;&lt;br /&gt;We can think of the methods we've looked at so far as modeling time series having a certain (possibly zero) momentum component and a random component.  Another way to visualize a time series is to think of it as a sequence of observed measurements from a process that transitions from state to state at each time interval, including the possibility that the state may not be completely observable.  A Markov chain is an example of a dynamic, stationary process that evolves this way.  A hidden Markov model is one in which the state is not observable (all we see are observable measurements) and is considered to be discrete-valued.  The analogue of a hidden Markov model with continuous latent variables is a &lt;a href="http://en.wikipedia.org/wiki/Linear_dynamical_system"&gt;linear dynamical system&lt;/a&gt;, which can be tracked/monitored/filtered and predicted with a Kalman filter, also known as a linear quadratic estimator (LQE). &lt;br /&gt;&lt;br /&gt;That last name, LQE, is interesting because we can pair it with a linear quadratic regulator (LQR) and end up with an optimal solution to the &lt;a href="http://en.wikipedia.org/wiki/Linear-quadratic-Gaussian_control"&gt;linear quadratic gaussian control problem&lt;/a&gt; (LQG), which is what lectures &lt;a href="http://www.youtube.com/watch?v=-ff6l5D8-j8&amp;amp;feature=PlayList&amp;amp;p=A89DCFA6ADACE599&amp;amp;index=17"&gt;18&lt;/a&gt; and &lt;a href="http://www.youtube.com/watch?v=UFH5ibWnA7g&amp;amp;feature=PlayList&amp;amp;p=A89DCFA6ADACE599&amp;amp;index=18"&gt;19&lt;/a&gt; are mostly about.  However, the really interesting part is that as useful and widely applicable as the Kalman filter is, it can be thought of as a special case of a dynamic bayesian network and so any improvements one may want to make to a Kalman filter can be accomplished by looking at it from the dynamic bayesian network perspective.  
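&lt;br /&gt;&lt;br /&gt;To make the filtering idea a bit more concrete, here is a minimal sketch of a Kalman filter in R for about the simplest linear dynamical system there is: a hidden random walk observed through Gaussian noise.  The noise variances q and r, the prior on the state and all the variable names are assumptions made for illustration:&lt;br /&gt;&lt;br /&gt;&lt;pre name="code" class="python"&gt;&lt;br /&gt;set.seed(2)&lt;br /&gt;n &lt;- 100&lt;br /&gt;q &lt;- 0.1   # variance of the state noise (assumed)&lt;br /&gt;r &lt;- 1     # variance of the measurement noise (assumed)&lt;br /&gt;x &lt;- cumsum(rnorm(n, sd=sqrt(q)))   # hidden state: a random walk&lt;br /&gt;y &lt;- x + rnorm(n, sd=sqrt(r))       # what we actually get to see&lt;br /&gt;&lt;br /&gt;m &lt;- 0     # prior mean of the state&lt;br /&gt;P &lt;- 10    # prior variance of the state&lt;br /&gt;filtered &lt;- numeric(n)&lt;br /&gt;for (k in 1:n) {&lt;br /&gt;  P &lt;- P + q                # predict: the state drifts, so uncertainty grows&lt;br /&gt;  K &lt;- P/(P + r)            # Kalman gain: how much to trust the new measurement&lt;br /&gt;  m &lt;- m + K*(y[k] - m)     # update the mean with the prediction error&lt;br /&gt;  P &lt;- (1 - K)*P            # update the variance&lt;br /&gt;  filtered[k] &lt;- m&lt;br /&gt;}&lt;br /&gt;&lt;br /&gt;# mean squared error of the filtered estimate vs the raw measurements&lt;br /&gt;c(filtered=mean((filtered-x)^2), raw=mean((y-x)^2))&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;The filtered estimate should track the hidden state more closely than the raw measurements do, since each update blends the new measurement with everything seen before it.&lt;br /&gt;&lt;br /&gt;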
We will look at these things further in a soon to follow part 2 of this post.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/6475416749449483556-7963726954963453129?l=mechanistician.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://mechanistician.blogspot.com/feeds/7963726954963453129/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://mechanistician.blogspot.com/2009/07/time-series-modeling-forecasting.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/6475416749449483556/posts/default/7963726954963453129'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/6475416749449483556/posts/default/7963726954963453129'/><link rel='alternate' type='text/html' href='http://mechanistician.blogspot.com/2009/07/time-series-modeling-forecasting.html' title='Time Series Modeling, Forecasting &amp; Control'/><author><name>Mechanistician</name><uri>http://www.blogger.com/profile/04405962471175755507</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-6475416749449483556.post-9050939314022351947</id><published>2009-07-26T13:51:00.000-07:00</published><updated>2009-07-27T14:03:16.491-07:00</updated><title type='text'>Capital Asset Pricing Model</title><content type='html'>In our last post we looked at how Markowitz suggested we could optimize portfolios to maximize return and minimize risk.  The optimal strategy turned out to be holding some combination of the tangency portfolio and the risk free asset (depending on one's specific risk profile).  
Naturally, we overlooked many details; however, the interested reader can refer to the following books for more information:&lt;br /&gt;&lt;ul&gt;&lt;li&gt;&lt;a href="http://www.mhhe.com/business/finance/bkm/index.html"&gt;Investments, by Bodie, Kane, and Marcus&lt;/a&gt;&lt;/li&gt;&lt;li&gt;&lt;a href="http://legacy.orie.cornell.edu/%7Edavidr/StatFinance/index.html"&gt;Statistics and Finance, by Ruppert&lt;/a&gt;&lt;/li&gt;&lt;/ul&gt;Ruppert's book includes implementations in SAS and MATLAB, and is generally more accessible than &lt;a href="http://www.stanford.edu/%7Exing/statfinbook/index.html"&gt;Lai and Xing's Statistical Models and Methods for Financial Markets&lt;/a&gt;.  In addition, &lt;a href="http://www.stat.tamu.edu/%7Eljin/Finance/stat689-R.htm"&gt;here&lt;/a&gt; is a page that includes R code for most of the examples and illustrations in Ruppert's book.&lt;br /&gt;&lt;br /&gt;Under certain idealized but plausible assumptions, elucidated as premises for the &lt;a href="http://en.wikipedia.org/wiki/Capital_asset_pricing_model"&gt;Capital Asset Pricing Model&lt;/a&gt; (CAPM), the market portfolio is the tangency portfolio.  Since the S&amp;amp;P 500 is a commonly used proxy for the market portfolio, a reasonable strategy under this framework is to invest in some combination of a market index fund, such as an ETF for the S&amp;amp;P 500 (&lt;a href="http://www.google.com/finance?q=spy"&gt;SPY&lt;/a&gt;), and the risk free asset (&lt;a href="http://www.treasurydirect.gov/indiv/products/products.htm"&gt;Treasury bonds&lt;/a&gt;).  
Other indices include much more than just 500 companies, such as the &lt;a href="http://en.wikipedia.org/wiki/Wilshire_5000"&gt;Wilshire 5000&lt;/a&gt;.&lt;br /&gt;&lt;br /&gt;There are 2 books from &lt;a href="http://www.princeton.edu/%7Ebmalkiel/"&gt;Burton Malkiel&lt;/a&gt; that further discuss these passive investment strategies and many other interesting bits of information, including:&lt;br /&gt;&lt;ul&gt;&lt;li&gt;An illustration of the difficulty of timing the market, first described by Charles Ellis (author of &lt;a href="http://www.amazon.com/Winning-Losers-Game-Strategies-Successful/dp/0071387676"&gt;Winning the Loser's Game&lt;/a&gt;), where an investor who stayed invested in the S&amp;amp;P 500 index from 1982-2000 made an 18% return, whereas if that investor had missed the best 30 days in the market, the investor would have realized an 11% return.  &lt;/li&gt;&lt;li&gt;Discussions on several market booms and busts, which should be enlightening for those who believe this current economic recession/depression is the first or the last one we will ever see.  
This should also be a good starting point for a reader interested in researching how previous crises were handled, internationally and in the US, particularly before the &lt;a href="http://en.wikipedia.org/wiki/Federal_Reserve_System"&gt;Federal Reserve System&lt;/a&gt; existed.&lt;/li&gt;&lt;/ul&gt;The two books are:&lt;br /&gt;&lt;ul&gt;&lt;li&gt;&lt;a href="http://en.wikipedia.org/wiki/A_Random_Walk_Down_Wall_Street"&gt;A Random Walk Down Wall Street&lt;/a&gt;&lt;/li&gt;&lt;li&gt;&lt;a href="http://www.amazon.com/gp/product/039332639X/ref=pd_lpo_k2_dp_sr_1?pf_rd_p=304485901&amp;amp;pf_rd_s=lpo-top-stripe-1&amp;amp;pf_rd_t=201&amp;amp;pf_rd_i=0393058549&amp;amp;pf_rd_m=ATVPDKIKX0DER&amp;amp;pf_rd_r=0804DJ4BGEBGKMEYJ269"&gt;The Random Walk Guide to Investing&lt;/a&gt;&lt;/li&gt;&lt;/ul&gt;The second one is a more recent, summarized version of the first one, but the first one includes many interesting discussions on &lt;a href="http://en.wikipedia.org/wiki/Technical_analysis"&gt;technical analysis&lt;/a&gt;, &lt;a href="http://en.wikipedia.org/wiki/Fundamental_analysis"&gt;fundamental analysis&lt;/a&gt;, &lt;a href="http://en.wikipedia.org/wiki/Modern_portfolio_theory"&gt;modern portfolio theory&lt;/a&gt;, &lt;a href="http://en.wikipedia.org/wiki/Behavioral_economics"&gt;behavioral finance&lt;/a&gt;, &lt;a href="http://en.wikipedia.org/wiki/Efficient-market_hypothesis"&gt;efficient market theory&lt;/a&gt;, etc.&lt;br /&gt;&lt;br /&gt;Anyway, what I wanted to look at today was CAPM, and the perspective we will take is that of someone who is familiar with linear regression and wonders what insights can be gleaned from running regression analysis to try and explain the return of a particular risky asset as a function of the market return (with some index as proxy for the market portfolio).&lt;br /&gt;&lt;br /&gt;The code below shows that we will be using SPDRs for both the risky assets and the market index.  
Our risk free asset will be the 3-month treasury bill rate:&lt;br /&gt;&lt;br /&gt;&lt;pre name="code" class="python"&gt;&lt;br /&gt;library(quantmod)&lt;br /&gt;symbols&lt;-c("XLY","XLP","XLE","XLF","XLV","XLI","XLB","XLK","XLU","SPY")&lt;br /&gt;fromDate&lt;-'1999-01-01'&lt;br /&gt;&lt;br /&gt;#download all data available for the given symbols from yahoo finance&lt;br /&gt;setDefaults(getSymbols.yahoo,from=fromDate) &lt;br /&gt;getSymbols(symbols,return.class="timeSeries")&lt;br /&gt;&lt;br /&gt;#get 3-month t-bill rate&lt;br /&gt;getSymbols("DTB3",return.class="timeSeries",src="FRED")&lt;br /&gt;&lt;br /&gt;#gather all data into one multivariate time series, using only the&lt;br /&gt;#close price, which is adjusted for splits and dividends&lt;br /&gt;allAdj&lt;-get(symbols[1])[,6]&lt;br /&gt;for(symbol in symbols[-1]) allAdj&lt;-cbind(allAdj,get(symbol)[,6])&lt;br /&gt;allAdj&lt;-cbind(allAdj,DTB3)&lt;br /&gt;&lt;br /&gt;#adjust the column names&lt;br /&gt;for(i in seq(along=symbols)) colnames(allAdj)[i]&lt;-strsplit(colnames(allAdj)[i],".Adjusted")&lt;br /&gt;&lt;br /&gt;#clean and save to disk&lt;br /&gt;allAdj&lt;-na.omit(allAdj)&lt;br /&gt;write.csv(allAdj,file="data_spy.csv") &lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;Once we have our data, we can run regression of a given risky asset's excess return against the market's excess return (excess from the risk free asset):&lt;br /&gt;&lt;br /&gt;&lt;pre name="code" class="python"&gt;&lt;br /&gt;library(fPortfolio)&lt;br /&gt;snpData&lt;-readSeries(file="data_spy.csv",header=TRUE,sep=",")&lt;br /&gt;&lt;br /&gt;rfree&lt;-as.numeric(snpData$DTB3)&lt;br /&gt;market&lt;-as.numeric(snpData$SPY)&lt;br /&gt;risky&lt;-as.numeric(snpData$XLF)&lt;br /&gt;&lt;br /&gt;market&lt;-diff(log(market))-(rfree/(100*252))[2:length(rfree)]&lt;br /&gt;risky&lt;-diff(log(risky))-(rfree/(100*252))[2:length(rfree)]&lt;br /&gt;&lt;br /&gt;fit&lt;-lm(risky~market)&lt;br /&gt;summary(fit)&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;&lt;br 
/&gt;Which yields:&lt;br /&gt;&lt;br /&gt;&lt;pre name="code" class="python"&gt;&lt;br /&gt;Call:&lt;br /&gt;lm(formula = risky ~ market)&lt;br /&gt;&lt;br /&gt;Residuals:&lt;br /&gt;       Min         1Q     Median         3Q        Max &lt;br /&gt;-0.2019700 -0.0051045 -0.0002983  0.0046492  0.2107481 &lt;br /&gt;&lt;br /&gt;Coefficients:&lt;br /&gt;              Estimate Std. Error t value Pr(&gt;|t|)    &lt;br /&gt;(Intercept) -8.269e-05  2.774e-04  -0.298    0.766    &lt;br /&gt;market       1.354e+00  1.972e-02  68.664   &lt;2e-16 ***&lt;br /&gt;---&lt;br /&gt;Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1 &lt;br /&gt;&lt;br /&gt;Residual standard error: 0.01424 on 2634 degrees of freedom&lt;br /&gt;Multiple R-squared: 0.6416, Adjusted R-squared: 0.6414 &lt;br /&gt;F-statistic:  4715 on 1 and 2634 DF,  p-value: &lt; 2.2e-16 &lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;We end up with an intercept of -8.269e-05 and a slope of 1.354.  Another metric of interest is R squared, for which we have a value of 0.6416.  R squared can be thought of as a measure of how well the model explains the predicted data, and in this case, 1 - 0.64 = 0.36 means that 36% of the behavior of the excess returns of the risky asset are not explained by the market.  So 36% is a measure of the amount of &lt;a href="http://financial-dictionary.thefreedictionary.com/Nonsystematic+risk"&gt;nonsystematic risk&lt;/a&gt; (sigma) in this particular risky asset (this can and should be diversified away in a good portfolio, which is another use of CAPM since it allows for a computationally efficient way to estimate the values needed in Markowitz Portfolio Optimization theory).  
The intercept and slope are also commonly known as alpha and beta under the CAPM framework, and together with the sigma, &lt;a href="http://www.martialcapital.com/alpha-beta-r2.php"&gt;they are considered significant characteristics of a particular asset&lt;/a&gt;.&lt;br /&gt;&lt;br /&gt;Under CAPM, a non-zero alpha would indicate the degree of mispricing in a given asset (underpriced if alpha is greater than zero, overpriced if it is less than zero), which according to the &lt;a href="http://en.wikipedia.org/wiki/Efficient-market_hypothesis"&gt;efficient market hypothesis&lt;/a&gt; (an assumption of CAPM) should not happen (and indeed the intercept in our little exercise is not statistically different from zero).  Finally, beta measures the asset's sensitivity to market movements (if the return on the market goes up by 1%, then the return on the asset would go up by beta%), so in CAPM, it is beta and not the variance in the returns that is the significant measure of [systematic] risk in the asset.&lt;br /&gt;&lt;br /&gt;Although useful, CAPM has received a lot of heat, especially since it is so easy to set up and test (you can read Malkiel's or Ruppert's book for details).  In addition, there are extensions to the model, where, following the regression mindset, we simply add more regressors, giving rise to the multifactor pricing model under &lt;a href="http://en.wikipedia.org/wiki/Arbitrage_pricing_theory"&gt;Arbitrage Pricing Theory&lt;/a&gt;, an example of which is the &lt;a href="http://en.wikipedia.org/wiki/Fama-French_three-factor_model"&gt;Fama-French three factor model&lt;/a&gt;.&lt;br /&gt;&lt;br /&gt;For now, we will stick with the plain-vanilla CAPM and look at what happens when we recognize that beta is a time-varying quantity and how we may be able to track it.  
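&lt;br /&gt;&lt;br /&gt;Before moving on, the link between beta, R squared and the split into systematic and nonsystematic risk discussed above can be checked with a minimal sketch on simulated returns.  Everything here is made up for illustration (the true alpha of 0, the true beta of 1.35, the volatilities and the variable names); it is not the SPDR data used earlier:&lt;br /&gt;&lt;br /&gt;&lt;pre name="code" class="python"&gt;&lt;br /&gt;set.seed(3)&lt;br /&gt;n &lt;- 2500&lt;br /&gt;# made-up daily excess returns: true alpha is 0 and true beta is 1.35&lt;br /&gt;market &lt;- rnorm(n, mean=0.0003, sd=0.012)&lt;br /&gt;risky &lt;- 1.35*market + rnorm(n, sd=0.014)&lt;br /&gt;&lt;br /&gt;fit &lt;- lm(risky~market)&lt;br /&gt;alpha &lt;- coef(fit)[["(Intercept)"]]&lt;br /&gt;beta &lt;- coef(fit)[["market"]]&lt;br /&gt;r2 &lt;- summary(fit)$r.squared&lt;br /&gt;&lt;br /&gt;# split the variance of the asset into a market part and a residual part&lt;br /&gt;systematic &lt;- beta^2*var(market)&lt;br /&gt;nonsystematic &lt;- var(residuals(fit))&lt;br /&gt;c(alpha=alpha, beta=beta, r2=r2,&lt;br /&gt;  systematic_share=systematic/var(risky),&lt;br /&gt;  nonsystematic_share=nonsystematic/var(risky))&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;The systematic share is the same number as R squared, and the two shares add up to one, which is the decomposition behind reading 1 minus R squared as the nonsystematic part.&lt;br /&gt;&lt;br /&gt;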
Until then, &lt;a href="http://www.stanford.edu/~wfsharpe/art/djam/djam.htm"&gt;here&lt;/a&gt; is an interesting interview with &lt;a href="http://en.wikipedia.org/wiki/William_Forsyth_Sharpe"&gt;William Sharpe&lt;/a&gt;, one of the creators of CAPM.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/6475416749449483556-9050939314022351947?l=mechanistician.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://mechanistician.blogspot.com/feeds/9050939314022351947/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://mechanistician.blogspot.com/2009/07/capital-asset-pricing-model.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/6475416749449483556/posts/default/9050939314022351947'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/6475416749449483556/posts/default/9050939314022351947'/><link rel='alternate' type='text/html' href='http://mechanistician.blogspot.com/2009/07/capital-asset-pricing-model.html' title='Capital Asset Pricing Model'/><author><name>Mechanistician</name><uri>http://www.blogger.com/profile/04405962471175755507</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-6475416749449483556.post-1868829725933280255</id><published>2009-07-15T13:06:00.000-07:00</published><updated>2009-07-17T16:14:43.648-07:00</updated><title type='text'>Markowitz Portfolio Optimization</title><content type='html'>Having looked at a method for modeling stock returns, we can now look at how those ideas form the basis for a very significant contribution from &lt;a href="http://en.wikipedia.org/wiki/Harry_Markowitz"&gt;Harry 
Markowitz&lt;/a&gt; to &lt;a href="http://en.wikipedia.org/wiki/Modern_portfolio_theory"&gt;modern portfolio theory&lt;/a&gt;.  In essence, investors want to:&lt;br /&gt;&lt;ul&gt;&lt;li&gt;Maximize returns, specifically, the mean of the return&lt;br /&gt;&lt;/li&gt;&lt;li&gt;Minimize risk, which we define to be the standard deviation of the return&lt;/li&gt;&lt;/ul&gt;And although these two goals are generally at odds with each other, Markowitz came up with a framework for structuring the problem so as to facilitate making the trade-off decision fairly objectively.  Specifically, the framework gives us a way to mathematically handle something that we are all intuitively familiar with: the need for &lt;a href="http://en.wikipedia.org/wiki/Diversification_%28finance%29"&gt;diversification&lt;/a&gt;.&lt;br /&gt;&lt;br /&gt;We illustrate the ideas with an example, where an investor with a time horizon of 1 month is trying to find a portfolio composed of some combination of the 9 &lt;a href="http://www.sectorspdr.com/"&gt;select sector SPDR&lt;/a&gt; &lt;a href="http://en.wikipedia.org/wiki/Exchange-traded_fund"&gt;ETF&lt;/a&gt;'s that the &lt;a href="http://en.wikipedia.org/wiki/S&amp;amp;P_500"&gt;S&amp;amp;P 500&lt;/a&gt; has been divided into.&lt;br /&gt;&lt;br /&gt;The data we need for our example can be found online for free (&lt;a href="http://finance.yahoo.com/"&gt;Yahoo! 
Finance&lt;/a&gt;), and downloading it directly from within &lt;a href="http://www.r-project.org/"&gt;R&lt;/a&gt; is a breeze using the &lt;a href="http://www.quantmod.com/"&gt;quantmod&lt;/a&gt; library:&lt;br /&gt;&lt;br /&gt;&lt;pre name="code" class="python"&gt;&lt;br /&gt;library(quantmod)&lt;br /&gt;symbols&lt;-c("XLY","XLP","XLE","XLF","XLV","XLI","XLB","XLK","XLU")&lt;br /&gt;&lt;br /&gt;#download all data available for the given symbols from yahoo finance&lt;br /&gt;setDefaults(getSymbols.yahoo,from='1900-01-01') &lt;br /&gt;getSymbols(symbols,return.class="timeSeries")&lt;br /&gt;&lt;br /&gt;#gather all data into one multivariate time series, using only the&lt;br /&gt;#close price which is adjusted for splits and dividends&lt;br /&gt;allAdj&lt;-get(symbols[1])[,6]&lt;br /&gt;for(symbol in symbols[-1]) allAdj&lt;-cbind(allAdj,get(symbol)[,6])&lt;br /&gt;&lt;br /&gt;#strip the ".Adjusted" suffix from the column names&lt;br /&gt;#(fixed=TRUE so that the dot is matched literally)&lt;br /&gt;colnames(allAdj)&lt;-sub(".Adjusted","",colnames(allAdj),fixed=TRUE)&lt;br /&gt;&lt;br /&gt;#save to disk&lt;br /&gt;write.csv(allAdj,file="spdrAdj.csv") &lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;Now, the key property that we want to exploit is that by mixing assets whose returns are not perfectly correlated, we can reduce the standard deviation of the return of the mix itself.  In addition, we can do so without necessarily sacrificing our demand for a high expected return.  The expectation (mean) is a linear operator and so expectations of linear combinations of random variables &lt;a href="http://en.wikipedia.org/wiki/Expected_value#Linearity"&gt;are easily computed&lt;/a&gt;.  The variance, however, &lt;a href="http://en.wikipedia.org/wiki/Variance#Weighted_sum_of_variables"&gt;is a little different&lt;/a&gt;.&lt;br /&gt;&lt;br /&gt;Having downloaded the data into our R session, we need to estimate the expectations and variances/covariances of the returns of the ETFs in our portfolio.  
Having that information, we then need to decide what percentage of our funds we will dedicate to each candidate ETF.  This vector of weights assigned to each ETF is what defines the portfolio.  In one scenario, the one we will consider in the code example, in addition to the weights needing to sum to one, the weights must also be non-negative, a so-called &lt;a href="http://en.wikipedia.org/wiki/Long_%28finance%29"&gt;long-only&lt;/a&gt; portfolio (an alternative scenario would consider portfolios where &lt;a href="http://en.wikipedia.org/wiki/Short_%28finance%29"&gt;shorting&lt;/a&gt; is permitted).&lt;br /&gt;&lt;br /&gt;The calculations for estimating the returns' expectations and their variances/covariances are easy.  The hard part is making reliable assumptions about how stationary the time-series processes are, how much data to use, how robust our estimates are, etc.  In this example, we make the very basic assumption that the processes are stationary, but in future posts we will consider more realistic (dynamic) models for drift and volatility.  Optimizing the portfolio mix under a long-only constraint can be done numerically with &lt;a href="http://en.wikipedia.org/wiki/Quadratic_programming"&gt;quadratic programming&lt;/a&gt;, which R has a nice library for.  When we do not have the long-only constraint, the optimization can be done analytically with the method of &lt;a href="http://en.wikipedia.org/wiki/Lagrange_multipliers"&gt;Lagrange multipliers&lt;/a&gt;.  More details can be found in chapter 3 of &lt;a href="http://www.stanford.edu/%7Exing/statfinbook/data.html"&gt;Statistical Models and Methods for Financial Markets&lt;/a&gt;.  
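&lt;br /&gt;&lt;br /&gt;To make the case without the long-only constraint a little more concrete, here is a minimal sketch in base R of the Lagrange multiplier solution for the global minimum variance portfolio, assuming shorting is allowed.  The values of 'mu' and 'Sigma' are made up for illustration (they are not estimates from the SPDR data), and the variable names are only placeholders:&lt;br /&gt;&lt;br /&gt;&lt;pre name="code" class="python"&gt;&lt;br /&gt;#illustrative sketch only: mu and Sigma are hypothetical monthly figures (in %),&lt;br /&gt;#not estimates from the SPDR data&lt;br /&gt;mu&lt;-c(0.5,0.8,0.3)&lt;br /&gt;Sigma&lt;-matrix(c(16,4,2, 4,25,3, 2,3,9),nrow=3,byrow=TRUE)&lt;br /&gt;ones&lt;-rep(1,length(mu))&lt;br /&gt;&lt;br /&gt;#minimizing w' Sigma w subject to sum(w)=1 with a Lagrange multiplier gives&lt;br /&gt;#w = solve(Sigma) 1 / (1' solve(Sigma) 1); weights may be negative (short positions)&lt;br /&gt;x&lt;-solve(Sigma,ones)&lt;br /&gt;w&lt;-x/sum(x)&lt;br /&gt;&lt;br /&gt;#the portfolio mean is linear in the weights,&lt;br /&gt;#the portfolio variance is the quadratic form w' Sigma w&lt;br /&gt;portMean&lt;-sum(w*mu)&lt;br /&gt;portSd&lt;-sqrt(drop(t(w)%*%Sigma%*%w))&lt;br /&gt;&lt;br /&gt;cat("weights:",round(w,4),"\n")&lt;br /&gt;cat("mean:",portMean," sd:",portSd,"\n")&lt;br /&gt;cat("smallest single-asset sd:",min(sqrt(diag(Sigma))),"\n")&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;The standard deviation of this mix can never exceed that of the least volatile single asset, since holding that asset alone is itself one of the allowed portfolios; this is the diversification effect at work.  With the long-only constraint there is no such closed form, which is why a quadratic programming solver is needed.&lt;br /&gt;&lt;br /&gt;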
A very useful R library that provides all this functionality out-of-the-box is &lt;a href="http://www.rmetrics.org/"&gt;RMetrics&lt;/a&gt;, which is what we use below:&lt;br /&gt;&lt;br /&gt;&lt;pre name="code" class="python"&gt;&lt;br /&gt;library(fPortfolio)&lt;br /&gt;library(fEcofin)&lt;br /&gt;library(corpcor)&lt;br /&gt;&lt;br /&gt;#the two functions simply fix a small problem in how RMetrics&lt;br /&gt;#was plotting the tangency line in 'tangencyLines' (neither &lt;br /&gt;#the intercept nor the slope considered the risk free rate &lt;br /&gt;#if one was given)&lt;br /&gt;&lt;br /&gt;myTangencyLines&lt;-function(object, &lt;br /&gt; return = c("mean", "mu"), &lt;br /&gt; risk = c("Cov", "Sigma", "CVaR", "VaR"), &lt;br /&gt; auto = TRUE, ...) &lt;br /&gt;{&lt;br /&gt;    return = match.arg(return)&lt;br /&gt;    risk = match.arg(risk)&lt;br /&gt;    data = getSeries(object)&lt;br /&gt;    spec = getSpec(object)&lt;br /&gt;    constraints = getConstraints(object)&lt;br /&gt;    tgPortfolio = tangencyPortfolio(data, spec, constraints)&lt;br /&gt;    assets = frontierPoints(tgPortfolio, return = return, risk = risk, &lt;br /&gt;        auto = auto)&lt;br /&gt;    slope = (assets[2]-getRiskFreeRate(spec))/assets[1]&lt;br /&gt;    abline(getRiskFreeRate(spec), slope, ...)&lt;br /&gt;    invisible(list(slope = slope, assets = assets))&lt;br /&gt;}&lt;br /&gt;&lt;br /&gt;myTailoredFrontierPlot&lt;-function(object, &lt;br /&gt; risk = c("Cov", "Sigma", "CVaR", "VaR"), &lt;br /&gt; mText = NULL, &lt;br /&gt; col = NULL, xlim = NULL, ylim = NULL, &lt;br /&gt; twoAssets = FALSE) &lt;br /&gt;{&lt;br /&gt;    offset = 0.1&lt;br /&gt;    risk &lt;- match.arg(risk)&lt;br /&gt;    if (is.null(xlim)) {&lt;br /&gt;        if (risk == "Cov") {&lt;br /&gt;            xmax = max(sqrt(diag(getCov(object))))&lt;br /&gt;        }&lt;br /&gt;        if (risk == "Sigma") {&lt;br /&gt;            xmax = max(sqrt(diag(getSigma(object))))&lt;br /&gt;        }&lt;br /&gt;        if (risk 
== "CVaR") {&lt;br /&gt;            alpha = getAlpha(object)&lt;br /&gt;            quantiles = colQuantiles(getSeries(object), prob = alpha)&lt;br /&gt;            n.max = which.max(-quantiles)&lt;br /&gt;            r = getSeries(object)[, n.max]&lt;br /&gt;            r = r[r &lt; quantiles[n.max]]&lt;br /&gt;            xmax = -mean(r)&lt;br /&gt;        }&lt;br /&gt;        if (risk == "VaR") {&lt;br /&gt;            xmax = max(-colQuantiles(getSeries(object), prob = alpha))&lt;br /&gt;        }&lt;br /&gt;        xlim = c(0, xmax)&lt;br /&gt;        Xlim = c(xlim[1] - diff(xlim) * offset, xlim[2] + diff(xlim) * &lt;br /&gt;            offset)&lt;br /&gt;    }&lt;br /&gt;    if (is.null(ylim)) {&lt;br /&gt;        ylim = range(getMean(object))&lt;br /&gt;        Ylim = c(ylim[1] - diff(ylim) * offset, ylim[2] + diff(ylim) * &lt;br /&gt;            offset)&lt;br /&gt;    }&lt;br /&gt;    frontierPlot(object, pch = 19, risk = risk, xlim = Xlim, &lt;br /&gt;        ylim = Ylim)&lt;br /&gt;    if (is.null(mText)) &lt;br /&gt;        mText = getTitle(object)&lt;br /&gt;    mtext(mText, side = 3, line = 0.5, font = 2)&lt;br /&gt;    grid()&lt;br /&gt;    abline(h = 0, col = "grey")&lt;br /&gt;    abline(v = 0, col = "grey")&lt;br /&gt;    data = getData(object)&lt;br /&gt;    spec = getSpec(object)&lt;br /&gt;    constraints = getConstraints(object)&lt;br /&gt;    mvPortfolio = minvariancePortfolio(data, spec, constraints)&lt;br /&gt;    minvariancePoints(object, risk = risk, auto = FALSE, pch = 19, &lt;br /&gt;        col = "red")&lt;br /&gt;    tangencyPoints(object, risk = risk, pch = 19, col = "blue")&lt;br /&gt;    myTangencyLines(object, risk = risk, col = "blue")&lt;br /&gt;    xy = equalWeightsPoints(object, risk = risk, pch = 15, col = "grey")&lt;br /&gt;    text(xy[, 1] + diff(xlim)/20, xy[, 2] + diff(ylim)/20, "EWP", &lt;br /&gt;        font = 2, cex = 0.7)&lt;br /&gt;    if (is.null(col)) &lt;br /&gt;        col = rainbow(6)&lt;br /&gt;    xy = 
singleAssetPoints(object, risk = risk, cex = 1.5, col = col, &lt;br /&gt;        lwd = 2)&lt;br /&gt;    text(xy[, 1] + diff(xlim)/20, xy[, 2] + diff(ylim)/20, rownames(xy), &lt;br /&gt;        font = 2, cex = 0.7)&lt;br /&gt;    if (twoAssets) {&lt;br /&gt;        twoAssetsLines(object, risk = risk, lty = 3, col = "grey")&lt;br /&gt;    }&lt;br /&gt;    invisible(object)&lt;br /&gt;}&lt;br /&gt;&lt;br /&gt;#this is where the 'action' begins&lt;br /&gt;&lt;br /&gt;djiData&lt;-readSeries(file="spdrAdj.csv",header=TRUE,sep=",")&lt;br /&gt;&lt;br /&gt;by&lt;-unique(timeLastNdayInMonth(time(djiData), 5))&lt;br /&gt;djiData&lt;-aggregate(djiData,by,mean)&lt;br /&gt;&lt;br /&gt;djiData.ret&lt;-100*returns(djiData,method="discrete")&lt;br /&gt;&lt;br /&gt;bmp("returns.bmp")&lt;br /&gt;plot(djiData.ret)&lt;br /&gt;dev.off()&lt;br /&gt;&lt;br /&gt;bmp("prices.bmp")&lt;br /&gt;plot(djiData)&lt;br /&gt;dev.off()&lt;br /&gt;&lt;br /&gt;bmp("corrMatrix.bmp")&lt;br /&gt;assetsCorImagePlot(djiData.ret)&lt;br /&gt;dev.off()&lt;br /&gt;&lt;br /&gt;globminSpec&lt;-portfolioSpec()&lt;br /&gt;globminPortfolio&lt;-minvariancePortfolio(&lt;br /&gt; data=djiData.ret,&lt;br /&gt; spec=globminSpec,&lt;br /&gt; constraints="LongOnly")&lt;br /&gt;print(globminPortfolio)&lt;br /&gt;&lt;br /&gt;tgSpec&lt;-portfolioSpec()&lt;br /&gt;setRiskFreeRate(tgSpec)&lt;-0.15&lt;br /&gt;tgPortfolio&lt;-tangencyPortfolio(&lt;br /&gt; data=djiData.ret,&lt;br /&gt; spec=tgSpec,&lt;br /&gt; constraints="LongOnly")&lt;br /&gt;print(tgPortfolio)&lt;br /&gt;&lt;br /&gt;pfront&lt;-portfolioFrontier(&lt;br /&gt; data=djiData.ret,&lt;br /&gt; spec=tgSpec,&lt;br /&gt; constraints="LongOnly")&lt;br /&gt;&lt;br /&gt;bmp("frontier.bmp")&lt;br /&gt;myTailoredFrontierPlot(pfront)&lt;br /&gt;dev.off()&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;The output includes the minimum variance portfolio:&lt;br /&gt;&lt;br /&gt;&lt;pre name="code" class="python"&gt;&lt;br /&gt;Title:&lt;br /&gt; MV Minimum Variance Portfolio 
&lt;br /&gt; Estimator:         covEstimator &lt;br /&gt; Solver:            solveRquadprog &lt;br /&gt; Optimize:          minRisk &lt;br /&gt; Constraints:       LongOnly &lt;br /&gt;&lt;br /&gt;Portfolio Weights:&lt;br /&gt;   XLY    XLP    XLE    XLF    XLV    XLI    XLB    XLK    XLU &lt;br /&gt;0.0000 0.6629 0.0683 0.0000 0.2367 0.0000 0.0000 0.0000 0.0321 &lt;br /&gt;&lt;br /&gt;Covariance Risk Budgets:&lt;br /&gt;   XLY    XLP    XLE    XLF    XLV    XLI    XLB    XLK    XLU &lt;br /&gt;0.0000 0.6629 0.0683 0.0000 0.2367 0.0000 0.0000 0.0000 0.0321 &lt;br /&gt;&lt;br /&gt;Target Return and Risks:&lt;br /&gt;  mean     mu    Cov  Sigma   CVaR    VaR &lt;br /&gt;0.1472 0.1472 3.0633 3.0633 7.9035 5.2951 &lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;The tangency portfolio:&lt;br /&gt;&lt;br /&gt;&lt;pre name="code" class="python"&gt;&lt;br /&gt;Title:&lt;br /&gt; MV Tangency Portfolio &lt;br /&gt; Estimator:         covEstimator &lt;br /&gt; Solver:            solveRquadprog &lt;br /&gt; Optimize:          minRisk &lt;br /&gt; Constraints:       LongOnly &lt;br /&gt;&lt;br /&gt;Portfolio Weights:&lt;br /&gt;   XLY    XLP    XLE    XLF    XLV    XLI    XLB    XLK    XLU &lt;br /&gt;0.0000 0.0000 0.9999 0.0000 0.0001 0.0000 0.0000 0.0000 0.0000 &lt;br /&gt;&lt;br /&gt;Covariance Risk Budgets:&lt;br /&gt;XLY XLP XLE XLF XLV XLI XLB XLK XLU &lt;br /&gt;  0   0   1   0   0   0   0   0   0 &lt;br /&gt;&lt;br /&gt;Target Return and Risks:&lt;br /&gt;   mean      mu     Cov   Sigma    CVaR     VaR &lt;br /&gt; 0.7914  0.7914  5.3748  5.3748 11.4222  7.4833 &lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;And some nice plots; monthly returns:&lt;br /&gt;&lt;br /&gt;&lt;a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://4.bp.blogspot.com/_JBvsBkmE5OU/Sl5r4-nEGNI/AAAAAAAAAHs/jMYOrzTz_HQ/s1600-h/returns.bmp"&gt;&lt;img style="cursor:pointer; cursor:hand;width: 320px; height: 320px;" 
src="http://4.bp.blogspot.com/_JBvsBkmE5OU/Sl5r4-nEGNI/AAAAAAAAAHs/jMYOrzTz_HQ/s320/returns.bmp" border="0" alt=""id="BLOGGER_PHOTO_ID_5358839233241159890" /&gt;&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;Monthly price series:&lt;br /&gt;&lt;br /&gt;&lt;a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://1.bp.blogspot.com/_JBvsBkmE5OU/Sl5sC-zBU7I/AAAAAAAAAH0/WaQwOQDtp5Y/s1600-h/prices.bmp"&gt;&lt;img style="cursor:pointer; cursor:hand;width: 320px; height: 320px;" src="http://1.bp.blogspot.com/_JBvsBkmE5OU/Sl5sC-zBU7I/AAAAAAAAAH0/WaQwOQDtp5Y/s320/prices.bmp" border="0" alt=""id="BLOGGER_PHOTO_ID_5358839405090001842" /&gt;&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;Correlation matrix of ETF monthly returns:&lt;br /&gt;&lt;br /&gt;&lt;a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://2.bp.blogspot.com/_JBvsBkmE5OU/Sl5sz3yCPfI/AAAAAAAAAH8/icvYnkQBav8/s1600-h/corrMatrix.bmp"&gt;&lt;img style="cursor:pointer; cursor:hand;width: 320px; height: 320px;" src="http://2.bp.blogspot.com/_JBvsBkmE5OU/Sl5sz3yCPfI/AAAAAAAAAH8/icvYnkQBav8/s320/corrMatrix.bmp" border="0" alt=""id="BLOGGER_PHOTO_ID_5358840245020409330" /&gt;&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;Efficient frontier with minimum variance and tangency portfolios:&lt;br /&gt;&lt;br /&gt;&lt;a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://3.bp.blogspot.com/_JBvsBkmE5OU/Sl5s-mHYsbI/AAAAAAAAAIE/1JpRqkxkf9Y/s1600-h/frontier.bmp"&gt;&lt;img style="cursor:pointer; cursor:hand;width: 320px; height: 320px;" src="http://3.bp.blogspot.com/_JBvsBkmE5OU/Sl5s-mHYsbI/AAAAAAAAAIE/1JpRqkxkf9Y/s320/frontier.bmp" border="0" alt=""id="BLOGGER_PHOTO_ID_5358840429256683954" /&gt;&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;The slope of the line that connects the risk-free asset (return of 0.15% on today's &lt;a href="http://www.ustreas.gov/offices/domestic-finance/debt-management/interest-rate/daily_treas_bill_rates.shtml"&gt;4-week Treasury Bill&lt;/a&gt;) on the vertical axis 
to the tangency portfolio is called the &lt;a href="http://en.wikipedia.org/wiki/Sharpe_ratio"&gt;Sharpe ratio&lt;/a&gt;, and can be thought of as a reward-to-risk ratio (we will revisit this number in future posts).&lt;br /&gt;&lt;br /&gt;A remarkable result of all this is that an optimal portfolio (in the sense that the Sharpe ratio is maximized) can be achieved by some mix of the risk-free asset with the tangency portfolio of risky assets.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/6475416749449483556-1868829725933280255?l=mechanistician.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://mechanistician.blogspot.com/feeds/1868829725933280255/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://mechanistician.blogspot.com/2009/07/markowitz-portfolio-optimization.html#comment-form' title='2 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/6475416749449483556/posts/default/1868829725933280255'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/6475416749449483556/posts/default/1868829725933280255'/><link rel='alternate' type='text/html' href='http://mechanistician.blogspot.com/2009/07/markowitz-portfolio-optimization.html' title='Markowitz Portfolio Optimization'/><author><name>Mechanistician</name><uri>http://www.blogger.com/profile/04405962471175755507</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' url='http://4.bp.blogspot.com/_JBvsBkmE5OU/Sl5r4-nEGNI/AAAAAAAAAHs/jMYOrzTz_HQ/s72-c/returns.bmp' height='72' 
width='72'/><thr:total>2</thr:total></entry><entry><id>tag:blogger.com,1999:blog-6475416749449483556.post-535182511710498093</id><published>2009-07-06T08:10:00.000-07:00</published><updated>2009-07-08T06:54:38.279-07:00</updated><title type='text'>Stock Returns</title><content type='html'>A good (concise) reference for statistical models and methods for financial markets is this &lt;a href="http://www.stanford.edu/%7Exing/statfinbook/"&gt;book&lt;/a&gt;, which goes by just that name.  One of the authors teaches at Stanford, and the other one at Columbia University, which incidentally is where the winner from the last &lt;a href="http://interactivebrokers.com/en/general/about/mediaRelations/03-26-09.php?ib_entity=llc"&gt;Interactive Brokers Collegiate Trading Olympiad&lt;/a&gt; is from.  As I've mentioned before, I find the financial markets a particularly appealing environment for experimenting with autonomous agents, and so for the next few posts we will be reviewing how some of the machine learning methods described in &lt;a href="http://see.stanford.edu/see/lecturelist.aspx?coll=348ca38a-3a6d-4052-937d-cb017338d7b1"&gt;CS229&lt;/a&gt; can be used to analyze financial data.&lt;br /&gt;&lt;br /&gt;The first example illustrates the use of linear regression to model the daily log returns of the stock of a given company as a function of the log returns of several other companies.  The &lt;a href="http://www.stanford.edu/%7Exing/statfinbook/data.html"&gt;data&lt;/a&gt; and &lt;a href="http://www.stanford.edu/%7Exing/statfinbook/functions.html"&gt;code&lt;/a&gt; can be downloaded from the book's website.&lt;br /&gt;&lt;br /&gt;Why returns?  Well, returns (given by the change in price of assets) are what dictate the profit (or loss) resulting from investing in stocks (neglecting transaction fees, taxes, etc), and so it makes sense to study their behavior.  The big question is whether or not stock returns are predictable.  
The short answer is that the statistical evidence suggests they are not.  The long answer is, of course, a little more complicated than that and is the subject of much debate and research (for a good, non-technical overview check &lt;a href="http://www.princeton.edu/%7Ebmalkiel/"&gt;Malkiel&lt;/a&gt;'s "&lt;a href="http://www.amazon.com/Random-Walk-Down-Wall-Street/dp/0393325350"&gt;A Random Walk Down Wall Street&lt;/a&gt;").&lt;br /&gt;&lt;br /&gt;Why log returns?  Well, log returns (nearly equivalent to returns on a small scale) exhibit the additive property, which allows us to add single-period log returns to get the total multi-period log return (as opposed to multiplying the single-period gross returns).  This additive property allows us to make use of the central limit theorem: if the single-period log returns are independent and identically distributed, then multi-period log returns, being sums of many of them, are approximately normally distributed.  If we further assume that the single-period log returns are themselves i.i.d. normal, then the price movement can readily be described as a &lt;a href="http://en.wikipedia.org/wiki/Random_walk"&gt;geometric random walk&lt;/a&gt;, which is the discrete-time equivalent of the &lt;a href="http://en.wikipedia.org/wiki/Wiener_process"&gt;geometric Brownian motion&lt;/a&gt;, a widely used &lt;a href="http://en.wikipedia.org/wiki/Gaussian_process"&gt;Gaussian process&lt;/a&gt; model of stock prices in &lt;a href="http://en.wikipedia.org/wiki/Black%E2%80%93Scholes"&gt;option pricing&lt;/a&gt;.&lt;br /&gt;&lt;br /&gt;At this point I would like to remind the reader of a quote attributed to &lt;a href="http://en.wikiquote.org/wiki/George_E._P._Box"&gt;George Box&lt;/a&gt;: "All models are wrong, but some are useful."  Essentially, always keep in mind that models are (or try to be) useful representations of an observed process; they are not the process itself, and so there is pretty much always room for improvement.  
For example, later on we will look at &lt;a href="http://en.wikipedia.org/wiki/Heteroskedasticity"&gt;time series models of stock returns that do not make the i.i.d. assumption&lt;/a&gt;.&lt;br /&gt;&lt;br /&gt;To illustrate these ideas, we can simulate geometric random walks in R.  We imagine the process to be described by a &lt;span style="font-style: italic;"&gt;drift&lt;/span&gt; term and a &lt;span style="font-style: italic;"&gt;volatility&lt;/span&gt; term. Drift accounts for the upward historical trend observed in stock prices (the only term needed for modeling returns in risk-free assets such as savings accounts or treasury bonds), whereas volatility accounts for how much the price fluctuates from the path described by just the drift.  Thus, a generative process would go something like this:&lt;br /&gt;&lt;ol&gt;&lt;li&gt;Start at some price Pinitial&lt;/li&gt;&lt;li&gt;Set Pnow=Pinitial&lt;br /&gt;&lt;/li&gt;&lt;li&gt;Draw a log return 'r' from a normal distribution with mean (drift) 'mu' and standard deviation (volatility) 'sd'&lt;/li&gt;&lt;li&gt;Set Pnow = Pnow * exp(r)&lt;/li&gt;&lt;li&gt;Repeat from step 3 until done&lt;br /&gt;&lt;/li&gt;&lt;/ol&gt;In the example below, we examine an asset with annual mean (drift) 10% and sd (volatility) 20%, not unreasonable for large stocks in the U.S. 
for the last 80 years or so:&lt;br /&gt;&lt;br /&gt;&lt;pre name="code" class="python"&gt;&lt;br /&gt;#grw_impl.r&lt;br /&gt;&lt;br /&gt;#for repeatability&lt;br /&gt;set.seed(123)&lt;br /&gt;&lt;br /&gt;#simulation parameters&lt;br /&gt;mu&lt;-0.1&lt;br /&gt;sd&lt;-0.2&lt;br /&gt;Pinitial&lt;-100&lt;br /&gt;N&lt;-80&lt;br /&gt;t&lt;-1:N&lt;br /&gt;&lt;br /&gt;#create placeholders for time series&lt;br /&gt;price_series1&lt;-1:N&lt;br /&gt;price_series2&lt;-1:N&lt;br /&gt;&lt;br /&gt;#sample vector from normal distribution&lt;br /&gt;sample_returns1&lt;-rnorm(t,mu,sd)&lt;br /&gt;sample_returns2&lt;-rnorm(t,mu,sd)&lt;br /&gt;&lt;br /&gt;#seed both series with same initial price&lt;br /&gt;Pnow1&lt;-Pinitial&lt;br /&gt;Pnow2&lt;-Pinitial&lt;br /&gt;&lt;br /&gt;#run both series forward&lt;br /&gt;for(i in 1:N)&lt;br /&gt;{&lt;br /&gt;  price_series1[i]&lt;-Pnow1*exp(sample_returns1[i])&lt;br /&gt;  Pnow1&lt;-price_series1[i]&lt;br /&gt;&lt;br /&gt;  price_series2[i]&lt;-Pnow2*exp(sample_returns2[i])&lt;br /&gt;  Pnow2&lt;-price_series2[i]&lt;br /&gt;}&lt;br /&gt;&lt;br /&gt;#plot&lt;br /&gt;bmp("price_realization_samples.bmp")&lt;br /&gt;&lt;br /&gt;ymin&lt;-min(Pinitial,price_series1,price_series2)&lt;br /&gt;ymax&lt;-max(Pinitial,price_series1,price_series2)&lt;br /&gt;plot(c(0,t),c(Pinitial,price_series1),type='l',xlab="year",ylab="price",xlim=c(-10,N),ylim=c(ymin,ymax))&lt;br /&gt;lines(c(0,t),c(Pinitial,price_series2),lty=2)&lt;br /&gt;legend(-9.5,ymax,c("price_series1","price_series2"),lty=c(1,2),merge=TRUE,bg='gray90')&lt;br /&gt;&lt;br /&gt;dev.off()&lt;br /&gt;&lt;br /&gt;#pretend this is unknown time series and calculate drift and volatility&lt;br /&gt;returns1&lt;-rep(NA,N)&lt;br /&gt;returns2&lt;-rep(NA,N)&lt;br /&gt;&lt;br /&gt;for(i in 2:N)&lt;br /&gt;{&lt;br /&gt;  returns1[i]&lt;-log(1+(price_series1[i]-price_series1[i-1])/price_series1[i-1])&lt;br /&gt;  returns2[i]&lt;-log(1+(price_series2[i]-price_series2[i-1])/price_series2[i-1])&lt;br /&gt;}&lt;br 
/&gt;&lt;br /&gt;cat("\nsingle period drift 1:\t",mean(returns1,na.rm=TRUE),"\n")&lt;br /&gt;cat("\nsingle period drift 2:\t",mean(returns2,na.rm=TRUE),"\n")&lt;br /&gt;&lt;br /&gt;cat("\nsingle period volatility 1:\t",sd(returns1,na.rm=TRUE),"\n")&lt;br /&gt;cat("\nsingle period volatility 2:\t",sd(returns2,na.rm=TRUE),"\n")&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;Driven with:&lt;br /&gt;&lt;br /&gt;&lt;pre name="code" class="python"&gt;&lt;br /&gt;source("grw_impl.r")&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;Yielding:&lt;br /&gt;&lt;br /&gt;&lt;a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://3.bp.blogspot.com/_JBvsBkmE5OU/SlJ79UoV8yI/AAAAAAAAAHU/B-e-qPmUXIQ/s1600-h/price_realization_samples.bmp"&gt;&lt;img style="cursor:pointer; cursor:hand;width: 320px; height: 320px;" src="http://3.bp.blogspot.com/_JBvsBkmE5OU/SlJ79UoV8yI/AAAAAAAAAHU/B-e-qPmUXIQ/s320/price_realization_samples.bmp" border="0" alt=""id="BLOGGER_PHOTO_ID_5355479200337097506" /&gt;&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;&lt;pre name="code" class="python"&gt;&lt;br /&gt;single period drift 1:  0.1053144 &lt;br /&gt;&lt;br /&gt;single period drift 2:  0.09006281 &lt;br /&gt;&lt;br /&gt;single period volatility 1:  0.1854532 &lt;br /&gt;&lt;br /&gt;single period volatility 2:  0.1906028 &lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;Notice that despite the 2 very different realization paths in the figure, the 2 series came from the same generating process and both display similar drift and volatility.&lt;br /&gt;&lt;br /&gt;Now, could we ascertain from these 2 sample paths that the log returns are normally distributed?  
A qualitative way to determine this is to plot a histogram of the log returns and see what it looks like:&lt;br /&gt;&lt;br /&gt;&lt;pre name="code" class="python"&gt;&lt;br /&gt;bmp("returnHist.bmp")&lt;br /&gt;par(mfrow=c(2,1)) &lt;br /&gt;hist(returns1)  &lt;br /&gt;hist(returns2)&lt;br /&gt;dev.off()&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;&lt;a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://1.bp.blogspot.com/_JBvsBkmE5OU/SlN0nm1tbcI/AAAAAAAAAHc/OxpPdY2ae4w/s1600-h/returnHist.bmp"&gt;&lt;img style="cursor:pointer; cursor:hand;width: 320px; height: 320px;" src="http://1.bp.blogspot.com/_JBvsBkmE5OU/SlN0nm1tbcI/AAAAAAAAAHc/OxpPdY2ae4w/s320/returnHist.bmp" border="0" alt=""id="BLOGGER_PHOTO_ID_5355752605663194562" /&gt;&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;Another qualitative way would be a &lt;a href="http://en.wikipedia.org/wiki/Normal_probability_plot"&gt;normal probability plot&lt;/a&gt;:&lt;br /&gt;&lt;br /&gt;&lt;pre name="code" class="python"&gt;&lt;br /&gt;#open the output device first so that dev.off() has something to close&lt;br /&gt;bmp("returnQQPlot.bmp")&lt;br /&gt;par(mfrow=c(2,1)) &lt;br /&gt;qqnorm(returns1);qqline(returns1)&lt;br /&gt;qqnorm(returns2);qqline(returns2)&lt;br /&gt;dev.off()&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;&lt;a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://3.bp.blogspot.com/_JBvsBkmE5OU/SlN4hcnozII/AAAAAAAAAHk/TTK3KvGGXj0/s1600-h/returnQQPlot.bmp"&gt;&lt;img style="cursor:pointer; cursor:hand;width: 320px; height: 320px;" src="http://3.bp.blogspot.com/_JBvsBkmE5OU/SlN4hcnozII/AAAAAAAAAHk/TTK3KvGGXj0/s320/returnQQPlot.bmp" border="0" alt=""id="BLOGGER_PHOTO_ID_5355756897887112322" /&gt;&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;A quantitative way of determining normality is to look at the sample &lt;a href="http://en.wikipedia.org/wiki/Skewness"&gt;skewness&lt;/a&gt; and &lt;a href="http://en.wikipedia.org/wiki/Kurtosis"&gt;kurtosis&lt;/a&gt; to see if those values are near those of a normal distribution (zero for both the skewness and the excess kurtosis):&lt;br /&gt;&lt;br /&gt;&lt;pre name="code" 
class="python"&gt;&lt;br /&gt;library(moments)&lt;br /&gt;cat("\nsample skewness 1:\t",skewness(returns1,na.rm=TRUE),"\n")&lt;br /&gt;cat("\nsample skewness 2:\t",skewness(returns2,na.rm=TRUE),"\n")&lt;br /&gt;cat("\nsample kurtosis 1:\t",kurtosis(returns1,na.rm=TRUE)-3,"\n")&lt;br /&gt;cat("\nsample kurtosis 2:\t",kurtosis(returns2,na.rm=TRUE)-3,"\n")&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;From Wikipedia: "The "minus 3" at the end of this formula is often explained as a correction to make the kurtosis of the normal distribution equal to zero."  The skewness and excess kurtosis (in either direction) give us a measure of how much the sample data deviates from a normal distribution:&lt;br /&gt;&lt;br /&gt;&lt;pre name="code" class="python"&gt;&lt;br /&gt;sample skewness 1:  0.03947952 &lt;br /&gt;&lt;br /&gt;sample skewness 2:  0.2006779 &lt;br /&gt;&lt;br /&gt;sample kurtosis 1:  -0.1816579 &lt;br /&gt;&lt;br /&gt;sample kurtosis 2:  -0.383789 &lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;Of course everything fits almost perfectly since we carefully generated the data (even if just a little of it); real stock returns will most likely not look as pretty, but fairly close. &lt;br /&gt;&lt;br /&gt;So, now that we've talked a little about log returns, we come back to the original example, which attempted to model the log returns of a specific company, MSFT, as a linear function of the log returns of several other similar companies (AAPL, DELL, YHOO, etc).  The code is self-explanatory, and the exercise is divided into 3 parts: regression for full model, variable selection, and regression diagnostics.&lt;br /&gt;&lt;br /&gt;&lt;u&gt;Regression for full model&lt;/u&gt;&lt;br /&gt;&lt;br /&gt;R's 'lm' function is used to fit a linear model using all candidate companies in the data frame.  
Notable functions include 'plot', which, with the data frame as argument, plots the entire scatter plot matrix - also 'cor' - which outputs a matrix of correlations that is handy to have while looking at the scatter plot matrix.  &lt;br /&gt;&lt;br /&gt;&lt;u&gt;Variable Selection&lt;/u&gt;&lt;br /&gt;&lt;br /&gt;Up until now, we have been looking simply at the MSE for comparing regression methods; however statisticians have developed a whole slew of metrics that can be used to determine the quality of a model and guide the feature selection process.  Calling 'summary' on the object returned by 'lm' provides the following:&lt;br /&gt;&lt;br /&gt;&lt;pre name="code" class="python"&gt;&lt;br /&gt;Call:&lt;br /&gt;lm(formula = msft ~ aapl + adbe + adp + amd + dell + gtw + hp + &lt;br /&gt;    ibm + orcl + sunw + yhoo)&lt;br /&gt;&lt;br /&gt;Residuals:&lt;br /&gt;       Min         1Q     Median         3Q        Max &lt;br /&gt;-0.0380497 -0.0028902 -0.0002195  0.0027663  0.0286453 &lt;br /&gt;&lt;br /&gt;Coefficients:&lt;br /&gt;              Estimate Std. Error t value Pr(&gt;|t|)    &lt;br /&gt;(Intercept)  2.782e-05  1.634e-04   0.170   0.8649    &lt;br /&gt;aapl         4.180e-02  1.624e-02   2.573   0.0102 *  &lt;br /&gt;adbe         7.821e-02  1.389e-02   5.630 2.22e-08 ***&lt;br /&gt;adp          4.891e-02  2.266e-02   2.158   0.0311 *  &lt;br /&gt;amd          1.920e-02  1.069e-02   1.796   0.0727 .  &lt;br /&gt;dell         1.771e-01  2.127e-02   8.325  &lt; 2e-16 ***&lt;br /&gt;gtw          3.959e-02  9.919e-03   3.991 6.97e-05 ***&lt;br /&gt;hp           2.703e-02  1.772e-02   1.526   0.1273    &lt;br /&gt;ibm          2.302e-01  2.753e-02   8.360  &lt; 2e-16 ***&lt;br /&gt;orcl         1.264e-01  1.541e-02   8.204 5.73e-16 ***&lt;br /&gt;sunw        -2.057e-02  1.213e-02  -1.696   0.0902 .  &lt;br /&gt;yhoo         2.583e-02  1.238e-02   2.085   0.0372 *  &lt;br /&gt;---&lt;br /&gt;Signif. 
codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1 &lt;br /&gt;&lt;br /&gt;Residual standard error: 0.005753 on 1243 degrees of freedom&lt;br /&gt;Multiple R-squared: 0.5726, Adjusted R-squared: 0.5688 &lt;br /&gt;F-statistic: 151.4 on 11 and 1243 DF,  p-value: &lt; 2.2e-16 &lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;'Std. Error', a measure of precision, is the standard deviation of the estimate.  't-value' is the &lt;a href="http://en.wikipedia.org/wiki/Student%27s_t-statistic"&gt;t-statistic&lt;/a&gt; for testing that the specified coefficient is zero; it is obtained by dividing the estimate by its standard error and it's used in the calculation of the p-value.  'Pr(&gt;|t|)' is the &lt;a href="http://en.wikipedia.org/wiki/P-value"&gt;p-value&lt;/a&gt;.  If the calculated p-value is below the threshold chosen for statistical significance (usually the 0.10, the 0.05, or 0.01 level), then the null hypothesis (that the coefficient is zero) is rejected in favor of the alternative hypothesis.  R gives you a nifty little reminder by adding a graphical representation of the p-value for each estimate: the more stars, the better.  One could use this information for tuning the model, for example, by removing statistically insignificant regressors (i.e. 
HP, AMD, SUNW) in a &lt;a href="http://www.statistics.com/resources/glossary/b/bwdelimin.php"&gt;backward elimination&lt;/a&gt; procedure (alternatively, the MASS package for R provides the '&lt;a href="http://stat.ethz.ch/R-manual/R-patched/library/MASS/html/stepAIC.html"&gt;stepAIC&lt;/a&gt;' function, which performs stepwise model selection by &lt;a href="http://en.wikipedia.org/wiki/Akaike_information_criterion"&gt;Akaike Information Criterion&lt;/a&gt;).&lt;br /&gt;&lt;br /&gt;&lt;u&gt;Regression Diagnostics&lt;/u&gt;&lt;br /&gt;&lt;br /&gt;The techniques mentioned above to test for normality (normal probability plot, skewness and kurtosis, etc.) are also useful here, since under the probabilistic view of least squares regression, we assume the errors (estimated by the residuals) to be normally distributed.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/6475416749449483556-535182511710498093?l=mechanistician.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://mechanistician.blogspot.com/feeds/535182511710498093/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://mechanistician.blogspot.com/2009/07/stock-returns.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/6475416749449483556/posts/default/535182511710498093'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/6475416749449483556/posts/default/535182511710498093'/><link rel='alternate' type='text/html' href='http://mechanistician.blogspot.com/2009/07/stock-returns.html' title='Stock Returns'/><author><name>Mechanistician</name><uri>http://www.blogger.com/profile/04405962471175755507</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><media:thumbnail 
xmlns:media='http://search.yahoo.com/mrss/' url='http://3.bp.blogspot.com/_JBvsBkmE5OU/SlJ79UoV8yI/AAAAAAAAAHU/B-e-qPmUXIQ/s72-c/price_realization_samples.bmp' height='72' width='72'/><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-6475416749449483556.post-727397406271727072</id><published>2009-07-01T17:34:00.000-07:00</published><updated>2009-07-01T18:33:48.182-07:00</updated><title type='text'>Supervised Learning Method Comparison</title><content type='html'>"Which supervised learning method works best for what?" - an empirical survey by &lt;a href="http://www.cs.cornell.edu/%7Ecaruana/"&gt;Rich Caruana&lt;/a&gt; (&lt;a href="http://videolectures.net/solomon_caruana_wslmw/"&gt;Video&lt;/a&gt; &amp;amp; &lt;a href="http://www.cs.cornell.edu/courses/cs678/2007sp/empirical.caruana.678.07.pdf"&gt;Slides&lt;/a&gt;).&lt;br /&gt;&lt;br /&gt;Abstract:&lt;br /&gt;&lt;span id="lec_desc_edit" style="line-height: 140%;"&gt;&lt;span class="wiki"&gt;"Decision trees are intelligible, but do they perform well enough that you should use them? Have SVMs replaced neural nets, or are neural nets still best for regression, and SVMs best for classification? Boosting maximizes margins similar to SVMs, but can boosting compete with SVMs? And if it does compete, is it better to boost weak models, as theory might suggest, or to boost stronger models? Bagging is simpler than boosting -- how well does bagging stack up against boosting? Breiman said Random Forests are better than bagging and as good as boosting. Was he right? And what about old friends like logistic regression, KNN, and naive bayes? Should they be relegated to the history books, or do they still fill important niches?&lt;br /&gt;In this talk we compare the performance of ten supervised learning methods on nine criteria: Accuracy, F-score, Lift, Precision/Recall Break-Even Point, Area under the ROC, Average Precision, Squared Error, Cross-Entropy, and Probability Calibration. 
The results show that no one learning method does it all, but some methods can be "repaired" so that they do very well across all performance metrics. In particular, we show how to obtain the best probabilities from max margin methods such as SVMs and boosting via Platt's Method and isotonic regression. We then describe a new ensemble method that combines select models from these ten learning methods to yield much better performance. Although these ensembles perform extremely well, they are too complex for many applications. We'll describe what we're doing to try to fix that. Finally, if time permits, we'll discuss how the nine performance metrics relate to each other, and which of them you probably should (or shouldn't) use.&lt;/span&gt;&lt;/span&gt;"&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/6475416749449483556-727397406271727072?l=mechanistician.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://mechanistician.blogspot.com/feeds/727397406271727072/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://mechanistician.blogspot.com/2009/07/supervised-learning-method-comparison.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/6475416749449483556/posts/default/727397406271727072'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/6475416749449483556/posts/default/727397406271727072'/><link rel='alternate' type='text/html' href='http://mechanistician.blogspot.com/2009/07/supervised-learning-method-comparison.html' title='Supervised Learning Method Comparison'/><author><name>Mechanistician</name><uri>http://www.blogger.com/profile/04405962471175755507</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' 
src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-6475416749449483556.post-1052078378895418322</id><published>2009-06-29T04:14:00.000-07:00</published><updated>2010-07-30T22:26:57.395-07:00</updated><title type='text'>Strategy Optimization</title><content type='html'>A particularly thought-provoking chapter in &lt;a href="http://aima.cs.berkeley.edu/"&gt;AIMA&lt;/a&gt; is chapter 17, "Making Complex Decisions," which elegantly introduces many powerful concepts in a way that highlights their interrelations.  The list below is an excerpt from the &lt;a href="http://aima.cs.berkeley.edu/contents.html"&gt;TOC&lt;/a&gt;:&lt;br /&gt;&lt;ul&gt;&lt;li&gt;Sequential Decision Problems&lt;/li&gt;&lt;li&gt;Partially observable MDPs&lt;/li&gt;&lt;li&gt;Decision-Theoretic Agents&lt;/li&gt;&lt;li&gt;Decisions with Multiple Agents: Game Theory&lt;/li&gt;&lt;li&gt;Mechanism Design&lt;br /&gt;&lt;/li&gt;&lt;/ul&gt;Of particular interest to me is how these topics relate to strategy discovery and optimization.&lt;br /&gt;&lt;br /&gt;&lt;a href="http://en.wikipedia.org/wiki/Strategy"&gt;A strategy is a plan of action designed to achieve a particular goal&lt;/a&gt;, and strategy discovery can be thought of as a subset of strategy optimization since, in theory, we could start from a trivial, arbitrary, possibly bad strategy and somehow optimize it until we have "discovered" a satisfactory strategy.  Perhaps "strategy design" would more generally describe what I'm talking about, but I prefer the term "optimization" for all the sentiments it evokes.&lt;br /&gt;&lt;br /&gt;We have already seen how we can use reinforcement learning methods to come up with satisfactory strategies for controllers in settings such as the inverted pendulum problem.  
In that context, we called the strategy a policy and defined the optimal policy to be a mapping from states to actions that would maximize the controller's expected return from the environment.  In what other kinds of scenarios might these concepts and methods be useful?  Consider the following 2 examples:&lt;br /&gt;&lt;ol&gt;&lt;li&gt;&lt;a href="http://en.wikipedia.org/wiki/Nathan_Mayer_Rothschild#Legend"&gt;Legend has it&lt;/a&gt; that Nathan Rothschild, a London banker in the 19th century, acted cleverly on early news of the outcome of the battle of Waterloo and made a fortune.  &lt;a href="http://en.wikipedia.org/wiki/Consols"&gt;Consols&lt;/a&gt; would rise as a result of Napoleon's defeat, but rather than buying Consols in anticipation of the increase, Nathan sold his existing holdings.  Other investors, unaware of the outcome at Waterloo, got wind of this and assumed Nathan was acting on information that the battle was lost.  Nathan was known at the time to have many connections and was expected to have access to early information, so these other investors also sold their holdings, driving prices further down.  At the proverbial last minute, Nathan bought at bargain prices.  When news of the battle came out, prices soared, and Nathan realized a nice profit.&lt;/li&gt;&lt;li&gt;A more modern story is that of how Porsche cornered the market in VW stock and made a killing in 2008.  
You can read about it &lt;a href="http://radian.org/notebook/porsche"&gt;here&lt;/a&gt;, &lt;a href="http://dealbook.blogs.nytimes.com/2008/10/31/porsches-vw-move-too-clever-by-half/"&gt;here&lt;/a&gt;, and &lt;a href="http://www.telegraph.co.uk/finance/globalbusiness/3362913/Porsche-crashes-into-controversy-in-the-ultimate-short-squeeze.html"&gt;here&lt;/a&gt;.&lt;/li&gt;&lt;/ol&gt;These and many other accounts, in finance and in other fields, are examples of successful strategies that, while amazing for their ingenuity, appear evident in hindsight, which suggests that they succeeded by exploiting recognizable causal mechanisms rather than by good fortune.  Furthermore, after reading several such accounts, one notices common traits among such successful strategies (even across disparate fields), which suggests that there may exist a template strategy (or a reusable process for arriving at such a strategy) that may be customized to specific scenarios, leading to the question: "could such strategy optimization processes be automated?"&lt;br /&gt;&lt;br /&gt;The strategy optimization process can, in principle, be automated, and a framework for doing so draws from fields such as &lt;a href="http://en.wikipedia.org/wiki/Decision_theory"&gt;decision theory&lt;/a&gt;, &lt;a href="http://levine.sscnet.ucla.edu/general/whatis.htm"&gt;game theory&lt;/a&gt;, &lt;a href="http://en.wikipedia.org/wiki/System_dynamics"&gt;system dynamics&lt;/a&gt;, and &lt;a href="http://en.wikipedia.org/wiki/Control_theory"&gt;control theory&lt;/a&gt;, among others, making heavy use of modeling and simulation, not unlike our inverted pendulum example.  Take the London banker, for instance, who, if unsure whether to sell or buy initially, could run simulations of each strategy to determine the most profitable one.  
Of course, the success of this exercise would primarily depend on the quality of the models for each system component and their interrelations; these may include a model of information flow, a model of Nathan and of the other traders, and a model of the "physics" of the environment (i.e. what actions may or may not be taken by market participants at any given time).&lt;br /&gt;&lt;br /&gt;And therein lies the rub: it is very hard to model complex systems, especially when doing so involves modeling human behavior.  Nevertheless, there exist methods to ease the burden, and that is something I expect to dive deeper into in future posts.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/6475416749449483556-1052078378895418322?l=mechanistician.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://mechanistician.blogspot.com/feeds/1052078378895418322/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://mechanistician.blogspot.com/2009/06/strategy-optimization.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/6475416749449483556/posts/default/1052078378895418322'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/6475416749449483556/posts/default/1052078378895418322'/><link rel='alternate' type='text/html' href='http://mechanistician.blogspot.com/2009/06/strategy-optimization.html' title='Strategy Optimization'/><author><name>Mechanistician</name><uri>http://www.blogger.com/profile/04405962471175755507</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' 
src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-6475416749449483556.post-942474703282875439</id><published>2009-05-31T10:50:00.000-07:00</published><updated>2009-06-29T04:18:15.585-07:00</updated><title type='text'>Lec17 - Fitted Value Iteration</title><content type='html'>&lt;a href="http://www.youtube.com/watch?v=LKdFTsM3hl4&amp;feature=PlayList&amp;p=A89DCFA6ADACE599&amp;index=16"&gt;Video Lecture - 17&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;Below is a Python implementation of a controller for an inverted pendulum on a cart (described &lt;a href="http://en.wikipedia.org/wiki/Inverted_pendulum"&gt;here&lt;/a&gt; and animated &lt;a href="http://brain.cc.kogakuin.ac.jp/~kanamaru/NN/CPRL/"&gt;here&lt;/a&gt;) that uses the so-called &lt;a href="http://en.wikipedia.org/wiki/Bang-bang_control"&gt;bang-bang control&lt;/a&gt; regime (alternatively, we could have used some measure of &lt;a href="http://www.fourmilab.ch/hackdiet/www/subsection1_2_3_0_5.html"&gt;proportional control&lt;/a&gt;) and a discretized state space.  
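&lt;br /&gt;&lt;br /&gt;As a rough illustration of what discretizing the state space means here, the sketch below maps a continuous state to one of 163 box indices. The bin edges mirror those in the get_state function of the listing in this post (up to rounding of the degree constants); the names discretize, ALL_EDGES, and FAILURE_STATE are hypothetical and only serve this example.

```python
import bisect
import math

# Bin edges per state variable, taken from get_state in the listing
# (degree constants are recomputed with math.radians, so they differ
# from the hard-coded values in the last digits only).
X_EDGES = [-1.5, 1.5]
X_DOT_EDGES = [-0.5, 0.5]
THETA_EDGES = [math.radians(d) for d in (-6, -1, 0, 1, 6)]
THETA_DOT_EDGES = [math.radians(-50), math.radians(50)]
ALL_EDGES = [X_EDGES, X_DOT_EDGES, THETA_EDGES, THETA_DOT_EDGES]

# 3 * 3 * 6 * 3 = 162 regular boxes plus one shared failure state.
NUM_STATES = 163
FAILURE_STATE = NUM_STATES - 1


def discretize(x, x_dot, theta, theta_dot):
    """Return the box index of a continuous state (hypothetical helper)."""
    # Anything past the track limits or the 12 degree cone is a failure.
    if abs(x) > 2.4 or abs(theta) > math.radians(12):
        return FAILURE_STATE
    index, stride = 0, 1
    for value, edges in zip((x, x_dot, theta, theta_dot), ALL_EDGES):
        # bisect_right puts a value equal to an edge into the upper bin.
        index += stride * bisect.bisect_right(edges, value)
        # Mixed-radix encoding: strides come out as 1, 3, 9, 54.
        stride *= len(edges) + 1
    return index


print(discretize(0.0, 0.0, 0.0, 0.0))
```

Treating the index as a mixed-radix number keeps the four variables independent, so refining one of them only means adding an edge to its list and updating NUM_STATES.&lt;br /&gt;&lt;br /&gt;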
The task is further defined in &lt;a href="http://www.stanford.edu/class/cs229/materials.html"&gt;problem set #4&lt;/a&gt; (whose solution includes a matlab implementation) and uses system dynamics defined &lt;a href="http://www-anw.cs.umass.edu/rlr/domains.html"&gt;here&lt;/a&gt;:&lt;br /&gt;&lt;br /&gt;&lt;pre name="code" class="python"&gt;&lt;br /&gt;#polecart_disc.py&lt;br /&gt;from pylab import *&lt;br /&gt;from numpy import *&lt;br /&gt;&lt;br /&gt;#Displays the pole-cart graph.&lt;br /&gt;class show_cart_params: pass&lt;br /&gt;show_cart_args=show_cart_params()&lt;br /&gt;show_cart_args.first=True&lt;br /&gt;show_cart_args.trial=0&lt;br /&gt;def show_cart(state_vector):&lt;br /&gt;    figure(1)&lt;br /&gt;    length=3&lt;br /&gt;    x=state_vector[0,0];x_dot=state_vector[1,0]&lt;br /&gt;    theta=state_vector[2,0];theta_dot=state_vector[3,0]&lt;br /&gt;    x1=x&lt;br /&gt;    y1=0&lt;br /&gt;    x2=x+length*sin(theta)&lt;br /&gt;    y2=length*cos(theta)&lt;br /&gt;    cart_linex1,cart_liney1=[x-.4,x+.4],[-.25,-.25]&lt;br /&gt;    cart_linex2,cart_liney2=[x+.4,x+.4],[-.25,0]&lt;br /&gt;    cart_linex3,cart_liney3=[x+.4,x-.4],[0,0]&lt;br /&gt;    cart_linex4,cart_liney4=[x-.4,x-.4],[0,-.25]&lt;br /&gt;    base_linex,base_liney=[x,x],[-.5,-.25]&lt;br /&gt;    axis([-3,3,-0.5,3.5])&lt;br /&gt;    global show_cart_args&lt;br /&gt;    &lt;br /&gt;    if show_cart_args.first:&lt;br /&gt;        show_cart_args.first=False &lt;br /&gt;        show_cart_args.pend_line,=plot([x1,x2],[y1,y2],c='black')&lt;br /&gt;        show_cart_args.cart_line1,=plot(cart_linex1,cart_liney1,c='cyan')&lt;br /&gt;        show_cart_args.cart_line2,=plot(cart_linex2,cart_liney2,c='cyan')&lt;br /&gt;        show_cart_args.cart_line3,=plot(cart_linex3,cart_liney3,c='cyan')&lt;br /&gt;        show_cart_args.cart_line4,=plot(cart_linex4,cart_liney4,c='cyan')&lt;br /&gt;        show_cart_args.base_line,=plot(base_linex,base_liney,c='cyan')&lt;br /&gt;&lt;br /&gt;    
show_cart_args.pend_line.set_data([x1,x2],[y1,y2])&lt;br /&gt;    show_cart_args.cart_line1.set_data(cart_linex1,cart_liney1)&lt;br /&gt;    show_cart_args.cart_line2.set_data(cart_linex2,cart_liney2)&lt;br /&gt;    show_cart_args.cart_line3.set_data(cart_linex3,cart_liney3)&lt;br /&gt;    show_cart_args.cart_line4.set_data(cart_linex4,cart_liney4)&lt;br /&gt;    show_cart_args.base_line.set_data(base_linex,base_liney)&lt;br /&gt;    if show_cart_args.trial!=num_failures:&lt;br /&gt;        title('trial #%d'%num_failures)&lt;br /&gt;        show_cart_args.trial=num_failures&lt;br /&gt;    draw()&lt;br /&gt;&lt;br /&gt;#Returns a discretized value (an index) for a continuous&lt;br /&gt;#state vector: x is divided into 3 "boxes", x_dot into 3,&lt;br /&gt;#theta into 6, and theta_dot into 3.&lt;br /&gt;def get_state(state_vector):&lt;br /&gt;    one_degree=0.0174532&lt;br /&gt;    six_degrees=0.1047192&lt;br /&gt;    twelve_degrees=0.2094384&lt;br /&gt;    fifty_degrees=0.87266&lt;br /&gt;    total_states=163&lt;br /&gt;    state=0&lt;br /&gt;    x=state_vector[0,0];x_dot=state_vector[1,0]&lt;br /&gt;    theta=state_vector[2,0];theta_dot=state_vector[3,0]&lt;br /&gt;&lt;br /&gt;    if -2.4&gt;x or x&gt;2.4 or -twelve_degrees&gt;theta or theta&gt;twelve_degrees:&lt;br /&gt;        #went past boundaries&lt;br /&gt;        state=total_states-1&lt;br /&gt;    else:&lt;br /&gt;        if -1.5&gt;x:&lt;br /&gt;            state=0&lt;br /&gt;        elif x&lt;1.5:&lt;br /&gt;            state=1&lt;br /&gt;        else:&lt;br /&gt;            state=2&lt;br /&gt;&lt;br /&gt;        if -.5&gt;x_dot:&lt;br /&gt;            pass&lt;br /&gt;        elif .5&gt;x_dot:&lt;br /&gt;            state+=3&lt;br /&gt;        else:&lt;br /&gt;            state+=6&lt;br /&gt;&lt;br /&gt;        if -six_degrees&gt;theta:&lt;br /&gt;            pass&lt;br /&gt;        elif -one_degree&gt;theta:&lt;br /&gt;            state+=9&lt;br /&gt;        elif 0&gt;theta:&lt;br /&gt;            
state+=18&lt;br /&gt;        elif one_degree&gt;theta:&lt;br /&gt;            state+=27&lt;br /&gt;        elif six_degrees&gt;theta:&lt;br /&gt;            state+=36&lt;br /&gt;        else:&lt;br /&gt;            state+=45&lt;br /&gt;&lt;br /&gt;        if -fifty_degrees&gt;theta_dot:&lt;br /&gt;            pass&lt;br /&gt;        elif fifty_degrees&gt;theta_dot:&lt;br /&gt;            state+=54&lt;br /&gt;        else:&lt;br /&gt;            state+=108&lt;br /&gt;&lt;br /&gt;    return state&lt;br /&gt;&lt;br /&gt;#Simulates pole-cart dynamics.&lt;br /&gt;def cart_pole(action,state_vector):&lt;br /&gt;    x=state_vector[0,0];x_dot=state_vector[1,0]&lt;br /&gt;    theta=state_vector[2,0];theta_dot=state_vector[3,0]&lt;br /&gt;    &lt;br /&gt;    gravity=9.8&lt;br /&gt;    masscart=1.0&lt;br /&gt;    masspole=.3&lt;br /&gt;    total_mass=masspole+masscart&lt;br /&gt;    length=.7&lt;br /&gt;    polemass_length=masspole*length&lt;br /&gt;    force_mag=10.0&lt;br /&gt;    tau=.02&lt;br /&gt;    fourthirds=1.3333333333333&lt;br /&gt;&lt;br /&gt;    action_flip_prob=0.0&lt;br /&gt;    force_noise_factor=0.0&lt;br /&gt;    no_control_prob=0.0&lt;br /&gt;&lt;br /&gt;    if action_flip_prob&gt;random.random():&lt;br /&gt;        action=1-action&lt;br /&gt;&lt;br /&gt;    if action&gt;0:&lt;br /&gt;        force=force_mag&lt;br /&gt;    else:&lt;br /&gt;        force=-force_mag&lt;br /&gt;&lt;br /&gt;    force=force* \&lt;br /&gt;           (1-force_noise_factor+random.random()*2*force_noise_factor)&lt;br /&gt;&lt;br /&gt;    if no_control_prob&gt;random.random():&lt;br /&gt;        force=0&lt;br /&gt;&lt;br /&gt;    costheta=cos(theta)&lt;br /&gt;    sintheta=sin(theta)&lt;br /&gt;&lt;br /&gt;    temp=(force+polemass_length*theta_dot*theta_dot*sintheta)/total_mass&lt;br /&gt;    thetaacc=(gravity*sintheta-costheta*temp)/(length*(fourthirds \&lt;br /&gt;            -masspole*costheta*costheta/total_mass))&lt;br /&gt;    
xacc=temp-polemass_length*thetaacc*costheta/total_mass&lt;br /&gt;&lt;br /&gt;    new_x=x+tau*x_dot&lt;br /&gt;    new_x_dot=x_dot+tau*xacc&lt;br /&gt;    new_theta=theta+tau*theta_dot&lt;br /&gt;    new_theta_dot=theta_dot+tau*thetaacc&lt;br /&gt;&lt;br /&gt;    return array([new_x,new_x_dot,new_theta,new_theta_dot]).reshape(4,1)&lt;br /&gt;&lt;br /&gt;ion()&lt;br /&gt;display_started=False&lt;br /&gt;num_states=163&lt;br /&gt;gamma=.995&lt;br /&gt;tolerance=.01&lt;br /&gt;no_learning_threshold=20&lt;br /&gt;time=0&lt;br /&gt;time_steps_to_failure={}&lt;br /&gt;num_failures=0&lt;br /&gt;time_at_start_of_current_trial=0&lt;br /&gt;transition_counts=zeros((2,num_states,num_states))&lt;br /&gt;transition_probs=ones((2,num_states,num_states))/num_states&lt;br /&gt;reward_counts=zeros((num_states,2))&lt;br /&gt;reward=zeros((num_states,1))&lt;br /&gt;value=random.ranf((num_states,1))*0.1&lt;br /&gt;consecutive_no_learning_trials=0&lt;br /&gt;x=0.0;x_dot=0.0;theta=0.0;theta_dot=0.0&lt;br /&gt;state_vector=array([x,x_dot,theta,theta_dot]).reshape(4,1)&lt;br /&gt;state_index=get_state(state_vector)&lt;br /&gt;&lt;br /&gt;#The main loop in this animation, will continue to run&lt;br /&gt;#until the number of consecutive trials in which value&lt;br /&gt;#iteration converged within one iteration is whatever&lt;br /&gt;#no_learning_threshold is defined to be.&lt;br /&gt;while no_learning_threshold&gt;consecutive_no_learning_trials:&lt;br /&gt;    &lt;br /&gt;    #Calculate the expected value of taking each of the&lt;br /&gt;    #2 actions we have at our disposal. 
If one action is&lt;br /&gt;    #not clearly better than the other, choose randomly.&lt;br /&gt;    score1=dot(transition_probs[0,state_index,:],value)&lt;br /&gt;    score2=dot(transition_probs[1,state_index,:],value)&lt;br /&gt;    if score1&gt;score2:&lt;br /&gt;        action=0&lt;br /&gt;    elif score2&gt;score1:&lt;br /&gt;        action=1&lt;br /&gt;    else:&lt;br /&gt;        if random.random()&gt;.5:&lt;br /&gt;            action=1&lt;br /&gt;        else:&lt;br /&gt;            action=0&lt;br /&gt;&lt;br /&gt;    #Take the action and see what state we end up in next.&lt;br /&gt;    state_vector=cart_pole(action,state_vector)&lt;br /&gt;    time+=1&lt;br /&gt;    new_state_index=get_state(state_vector)&lt;br /&gt;&lt;br /&gt;    if display_started:&lt;br /&gt;        show_cart(state_vector)&lt;br /&gt;&lt;br /&gt;    #This here defines the reward function for this task: -1&lt;br /&gt;    #if the pole falls or the cart has gone beyond the specified&lt;br /&gt;    #boundary, and 0 otherwise.  The last state is considered&lt;br /&gt;    #to be the one that satisfies these conditions.&lt;br /&gt;    if new_state_index==num_states-1:&lt;br /&gt;        R=-1.0&lt;br /&gt;    else:&lt;br /&gt;        R=0.0&lt;br /&gt;&lt;br /&gt;    #Update transition counts and reward counts.  
We keep track of&lt;br /&gt;    #2 things in reward_counts: the number of times we received a&lt;br /&gt;    #reward in that state and the total reward we received in that state&lt;br /&gt;    transition_counts[action,state_index,new_state_index]= \&lt;br /&gt;        transition_counts[action,state_index,new_state_index]+1.0&lt;br /&gt;    reward_counts[new_state_index,1]=reward_counts[new_state_index,1]+1.0&lt;br /&gt;    reward_counts[new_state_index,0]=reward_counts[new_state_index,0]+R&lt;br /&gt;&lt;br /&gt;    if new_state_index==num_states-1:&lt;br /&gt;&lt;br /&gt;        #If we ended up in the terminal state, this trial is over,&lt;br /&gt;        #so let's now update our statistics.&lt;br /&gt;        #First, turn the transition counts into probabilities,&lt;br /&gt;        for a in range(2):&lt;br /&gt;            for s in range(num_states):&lt;br /&gt;                den=sum(transition_counts[a,s,:])&lt;br /&gt;                if den&gt;0:&lt;br /&gt;                    transition_probs[a,s,:]=transition_counts[a,s,:]/den&lt;br /&gt;&lt;br /&gt;        #then calculate the expected reward in each state using&lt;br /&gt;        #the total reward for that state and how many times&lt;br /&gt;        #we received a reward in that state.&lt;br /&gt;        for s in range(num_states):&lt;br /&gt;            if reward_counts[s,1]&gt;0:&lt;br /&gt;                reward[s,0]=reward_counts[s,0]/reward_counts[s,1]&lt;br /&gt;&lt;br /&gt;        iterations=0&lt;br /&gt;        new_value=zeros((num_states,1))&lt;br /&gt;        while True:&lt;br /&gt;            #Now update value estimates by performing&lt;br /&gt;            #value iteration until convergence, which is defined&lt;br /&gt;            #as our value estimates not changing below the&lt;br /&gt;            #specified tolerance.&lt;br /&gt;            iterations=iterations+1&lt;br /&gt;            for s in range(num_states):&lt;br /&gt;                value1=dot(transition_probs[0,s,:],value)&lt;br /&gt; 
               value2=dot(transition_probs[1,s,:],value)&lt;br /&gt;                new_value[s,0]=max(value1,value2)&lt;br /&gt;            new_value=reward+gamma*new_value&lt;br /&gt;            diff=max(abs(value-new_value))&lt;br /&gt;            value=new_value+0#copy&lt;br /&gt;            if tolerance&gt;diff:&lt;br /&gt;                break&lt;br /&gt;&lt;br /&gt;        #if we converged in 1 iteration, keep track of that because&lt;br /&gt;        #once this happens no_learning_threshold times in a row,&lt;br /&gt;        #our experiment is over.&lt;br /&gt;        if iterations==1:&lt;br /&gt;            consecutive_no_learning_trials+= 1&lt;br /&gt;            #watch the last trial&lt;br /&gt;            if consecutive_no_learning_trials&gt;no_learning_threshold-2:&lt;br /&gt;                display_started=True&lt;br /&gt;        else:&lt;br /&gt;            consecutive_no_learning_trials=0&lt;br /&gt;            display_started=False&lt;br /&gt;&lt;br /&gt;        #num_failures can be thought of as the trial number&lt;br /&gt;        num_failures+=1&lt;br /&gt;        #update how many time steps we lasted in this trial&lt;br /&gt;        time_steps_to_failure[num_failures]= \&lt;br /&gt;            time-time_at_start_of_current_trial&lt;br /&gt;        time_at_start_of_current_trial=time&lt;br /&gt;        #reset our state vector to some random location&lt;br /&gt;        x=random.uniform(-1.1,1.1);x_dot=0.0;theta=0.0;theta_dot=0.0&lt;br /&gt;        state_vector=array([x,x_dot,theta,theta_dot]).reshape(4,1)&lt;br /&gt;        state_index=get_state(state_vector)&lt;br /&gt;    else:&lt;br /&gt;        state_index=new_state_index&lt;br /&gt;&lt;br /&gt;#Plot learning curve:&lt;br /&gt;figure(2)&lt;br /&gt;plot(time_steps_to_failure.values(),'k')&lt;br /&gt;title('Time steps per trial')&lt;br /&gt;ioff()&lt;br /&gt;show()&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;The output includes an animated plot of the cart on its last trial and a plot showing 
how many time-steps the system was balanced within constraints on each trial:&lt;br /&gt;&lt;br /&gt;&lt;a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://1.bp.blogspot.com/_JBvsBkmE5OU/Siko6ZAsBQI/AAAAAAAAAGs/z08fbZftkuI/s1600-h/polecart.png"&gt;&lt;img style="cursor:pointer; cursor:hand;width: 320px; height: 240px;" src="http://1.bp.blogspot.com/_JBvsBkmE5OU/Siko6ZAsBQI/AAAAAAAAAGs/z08fbZftkuI/s320/polecart.png" border="0" alt=""id="BLOGGER_PHOTO_ID_5343847416463099138" /&gt;&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;&lt;a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://1.bp.blogspot.com/_JBvsBkmE5OU/SikpC3XmTwI/AAAAAAAAAG0/5EA0Orlj-HQ/s1600-h/polecart_disc.png"&gt;&lt;img style="cursor:pointer; cursor:hand;width: 320px; height: 240px;" src="http://1.bp.blogspot.com/_JBvsBkmE5OU/SikpC3XmTwI/AAAAAAAAAG0/5EA0Orlj-HQ/s320/polecart_disc.png" border="0" alt=""id="BLOGGER_PHOTO_ID_5343847562051211010" /&gt;&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;Rather than discretizing the state space, we could directly estimate the value as a function of a continuous feature vector which is derived from the "perceived" continuous state vector.  
This approach, fitted value iteration, is described &lt;a href="http://www.stanford.edu/class/cs229/notes/cs229-notes12.pdf"&gt;here&lt;/a&gt;, and is just one of the &lt;a href="http://www.ict.swin.edu.au/personal/jbrownlee/2005/TR07-2005.pdf"&gt;many methods applied to the inverted pendulum problem&lt;/a&gt;:&lt;br /&gt;&lt;br /&gt;&lt;pre name="code" class="python"&gt;&lt;br /&gt;#polecart_cont.py&lt;br /&gt;import sys&lt;br /&gt;from pylab import *&lt;br /&gt;from numpy import *&lt;br /&gt;&lt;br /&gt;#To add more artificial features just append them to feats.&lt;br /&gt;def get_feat_vec(x,x_dot,theta,theta_dot):&lt;br /&gt;    feats=[1.0,&lt;br /&gt;           x,&lt;br /&gt;           x_dot,&lt;br /&gt;           theta,&lt;br /&gt;           theta_dot,&lt;br /&gt;           x_dot**2,&lt;br /&gt;           theta_dot**2,&lt;br /&gt;           theta*x]&lt;br /&gt;    return array(feats).reshape(1,len(feats))&lt;br /&gt;&lt;br /&gt;#Displays the cart pole graph.&lt;br /&gt;class show_cart_params: pass&lt;br /&gt;show_cart_args=show_cart_params()&lt;br /&gt;show_cart_args.first=True&lt;br /&gt;show_cart_args.trial=0&lt;br /&gt;def show_cart(x,x_dot,theta,theta_dot):&lt;br /&gt;    figure(1)&lt;br /&gt;    length=3&lt;br /&gt;    x1=x&lt;br /&gt;    y1=0&lt;br /&gt;    x2=x+length*sin(theta)&lt;br /&gt;    y2=length*cos(theta)&lt;br /&gt;    cart_linex1,cart_liney1=[x-.4,x+.4],[-.25,-.25]&lt;br /&gt;    cart_linex2,cart_liney2=[x+.4,x+.4],[-.25,0]&lt;br /&gt;    cart_linex3,cart_liney3=[x+.4,x-.4],[0,0]&lt;br /&gt;    cart_linex4,cart_liney4=[x-.4,x-.4],[0,-.25]&lt;br /&gt;    base_linex,base_liney=[x,x],[-.5,-.25]&lt;br /&gt;    axis([-3,3,-0.5,3.5])&lt;br /&gt;    global show_cart_args&lt;br /&gt;    &lt;br /&gt;    if show_cart_args.first:&lt;br /&gt;        show_cart_args.first=False &lt;br /&gt;        show_cart_args.pend_line,=plot([x1,x2],[y1,y2],c='black')&lt;br /&gt;        show_cart_args.cart_line1,=plot(cart_linex1,cart_liney1,c='cyan')&lt;br /&gt;        
show_cart_args.cart_line2,=plot(cart_linex2,cart_liney2,c='cyan')&lt;br /&gt;        show_cart_args.cart_line3,=plot(cart_linex3,cart_liney3,c='cyan')&lt;br /&gt;        show_cart_args.cart_line4,=plot(cart_linex4,cart_liney4,c='cyan')&lt;br /&gt;        show_cart_args.base_line,=plot(base_linex,base_liney,c='cyan')&lt;br /&gt;&lt;br /&gt;    show_cart_args.pend_line.set_data([x1,x2],[y1,y2])&lt;br /&gt;    show_cart_args.cart_line1.set_data(cart_linex1,cart_liney1)&lt;br /&gt;    show_cart_args.cart_line2.set_data(cart_linex2,cart_liney2)&lt;br /&gt;    show_cart_args.cart_line3.set_data(cart_linex3,cart_liney3)&lt;br /&gt;    show_cart_args.cart_line4.set_data(cart_linex4,cart_liney4)&lt;br /&gt;    show_cart_args.base_line.set_data(base_linex,base_liney)&lt;br /&gt;    if show_cart_args.trial!=num_failures:&lt;br /&gt;        title('trial #%d'%num_failures)&lt;br /&gt;        show_cart_args.trial=num_failures&lt;br /&gt;    draw()&lt;br /&gt;&lt;br /&gt;#Returns whether or not the system is out of bounds.&lt;br /&gt;def is_terminal(x,theta):&lt;br /&gt;    return -2.4&gt;x or x&gt;2.4 or -twelve_degrees&gt;theta or theta&gt;twelve_degrees&lt;br /&gt;&lt;br /&gt;#Simulates cart pole dynamics.&lt;br /&gt;def cart_pole(action,x,x_dot,theta,theta_dot):   &lt;br /&gt;    gravity=9.8&lt;br /&gt;    masscart=1.0&lt;br /&gt;    masspole=.3&lt;br /&gt;    total_mass=masspole+masscart&lt;br /&gt;    length=.7&lt;br /&gt;    polemass_length=masspole*length&lt;br /&gt;    force_mag=10.0&lt;br /&gt;    tau=.02&lt;br /&gt;    fourthirds=1.3333333333333&lt;br /&gt;&lt;br /&gt;    action_flip_prob=0.0&lt;br /&gt;    force_noise_factor=0.0&lt;br /&gt;    no_control_prob=0.0&lt;br /&gt;&lt;br /&gt;    if action_flip_prob&gt;random.random():&lt;br /&gt;        action=1-action&lt;br /&gt;&lt;br /&gt;    if action&gt;0:&lt;br /&gt;        force=force_mag&lt;br /&gt;    else:&lt;br /&gt;        force=-force_mag&lt;br /&gt;&lt;br /&gt;    force=force* \&lt;br /&gt;           
(1-force_noise_factor+random.random()*2*force_noise_factor)&lt;br /&gt;&lt;br /&gt;    if no_control_prob&gt;random.random():&lt;br /&gt;        force=0&lt;br /&gt;&lt;br /&gt;    costheta=cos(theta)&lt;br /&gt;    sintheta=sin(theta)&lt;br /&gt;&lt;br /&gt;    temp=(force+polemass_length*theta_dot*theta_dot*sintheta)/total_mass&lt;br /&gt;    thetaacc=(gravity*sintheta-costheta*temp)/(length*(fourthirds \&lt;br /&gt;            -masspole*costheta*costheta/total_mass))&lt;br /&gt;    xacc=temp-polemass_length*thetaacc*costheta/total_mass&lt;br /&gt;&lt;br /&gt;    new_x=x+tau*x_dot&lt;br /&gt;    new_x_dot=x_dot+tau*xacc&lt;br /&gt;    new_theta=theta+tau*theta_dot&lt;br /&gt;    new_theta_dot=theta_dot+tau*thetaacc&lt;br /&gt;&lt;br /&gt;    return get_feat_vec(new_x,new_x_dot,new_theta,new_theta_dot)&lt;br /&gt;&lt;br /&gt;gamma=0.995&lt;br /&gt;feat_vec=get_feat_vec(0,0,0,0)&lt;br /&gt;n=shape(feat_vec)[1]&lt;br /&gt;twelve_degrees=0.2094384&lt;br /&gt;m=100000&lt;br /&gt;w_vec=zeros((n,1))&lt;br /&gt;feat_mat=ones((m,n))&lt;br /&gt;diff=sys.maxint&lt;br /&gt;new_diff=diff-1&lt;br /&gt;epochs=0&lt;br /&gt;&lt;br /&gt;#generate a bunch of state samples to fit the value function&lt;br /&gt;x=random.uniform(-1.1,1.1);x_dot=0.0;theta=0.0;theta_dot=0.0&lt;br /&gt;for i in range(m):&lt;br /&gt;    if random.random()&gt;.5:&lt;br /&gt;        action=1&lt;br /&gt;    else:&lt;br /&gt;        action=0    &lt;br /&gt;        &lt;br /&gt;    feat_vec=cart_pole(action,x,x_dot,theta,theta_dot)&lt;br /&gt;    feat_mat[i,:]=feat_vec&lt;br /&gt;&lt;br /&gt;    if is_terminal(feat_vec[0,1],feat_vec[0,3]):&lt;br /&gt;        x=random.uniform(-1.1,1.1);x_dot=0.0&lt;br /&gt;        theta=random.uniform(-twelve_degrees,twelve_degrees);theta_dot=0.0&lt;br /&gt;    else:&lt;br /&gt;        x=feat_vec[0,1];x_dot=feat_vec[0,2]&lt;br /&gt;        theta=feat_vec[0,3];theta_dot=feat_vec[0,4]&lt;br /&gt;&lt;br /&gt;#Iteratively score each of the generated states and use the&lt;br 
/&gt;#resulting feature,target tuple to estimate the parameters&lt;br /&gt;#of a linear function.&lt;br /&gt;while 10&gt;epochs and diff&gt;new_diff:&lt;br /&gt;    diff=new_diff&lt;br /&gt;    epochs+=1&lt;br /&gt;&lt;br /&gt;    y_vec=zeros((m,1))   &lt;br /&gt;    for i in range(m):&lt;br /&gt;        feat_vec=feat_mat[i,:].reshape(1,n)&lt;br /&gt;        x=feat_vec[0,1];x_dot=feat_vec[0,2]&lt;br /&gt;        theta=feat_vec[0,3];theta_dot=feat_vec[0,4]&lt;br /&gt;&lt;br /&gt;        if is_terminal(x,theta):&lt;br /&gt;            R=-1.0&lt;br /&gt;        else:&lt;br /&gt;            R=0.0         &lt;br /&gt;        &lt;br /&gt;        feat_vec0=cart_pole(0,x,x_dot,theta,theta_dot)&lt;br /&gt;        score1=R+gamma*dot(feat_vec0,w_vec)&lt;br /&gt;        &lt;br /&gt;        feat_vec1=cart_pole(1,x,x_dot,theta,theta_dot)&lt;br /&gt;        score2=R+gamma*dot(feat_vec1,w_vec)&lt;br /&gt;        &lt;br /&gt;        if score1&gt;score2:&lt;br /&gt;            action=0&lt;br /&gt;            score=score1&lt;br /&gt;        elif score2&gt;score1:&lt;br /&gt;            action=1&lt;br /&gt;            score=score2&lt;br /&gt;        else:&lt;br /&gt;            if random.random()&gt;.5:&lt;br /&gt;                action=1&lt;br /&gt;                score=score2&lt;br /&gt;            else:&lt;br /&gt;                action=0&lt;br /&gt;                score=score1&lt;br /&gt;&lt;br /&gt;        y_vec[i,0]=score&lt;br /&gt;&lt;br /&gt;    result=linalg.lstsq(feat_mat,y_vec)&lt;br /&gt;    w_new=result[0]&lt;br /&gt;    new_diff=result[1]&lt;br /&gt;    w_vec=w_new+0#copy&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;ion()&lt;br /&gt;time=0&lt;br /&gt;time_steps_to_failure={}&lt;br /&gt;num_failures=0&lt;br /&gt;time_at_start_of_current_trial=0&lt;br /&gt;x=random.uniform(-1.1,1.1);x_dot=0.0;theta=0.0;theta_dot=0.0&lt;br /&gt;display_started=False&lt;br /&gt;max_num_failures_to_run=250&lt;br /&gt;&lt;br /&gt;#Run the learned controller a few times to plot and compare&lt;br 
/&gt;#against the discretized version.&lt;br /&gt;while max_num_failures_to_run&gt;num_failures:&lt;br /&gt;    &lt;br /&gt;    feat_vec0=cart_pole(0,x,x_dot,theta,theta_dot)&lt;br /&gt;    score1=dot(feat_vec0,w_vec)&lt;br /&gt;    &lt;br /&gt;    feat_vec1=cart_pole(1,x,x_dot,theta,theta_dot)&lt;br /&gt;    score2=dot(feat_vec1,w_vec)&lt;br /&gt;    &lt;br /&gt;    if score1&gt;score2:&lt;br /&gt;        action=0&lt;br /&gt;    elif score2&gt;score1:&lt;br /&gt;        action=1&lt;br /&gt;    else:&lt;br /&gt;        if random.random()&gt;.5:&lt;br /&gt;            action=1&lt;br /&gt;        else:&lt;br /&gt;            action=0&lt;br /&gt;&lt;br /&gt;    feat_vec=cart_pole(action,x,x_dot,theta,theta_dot)&lt;br /&gt;    x=feat_vec[0,1];x_dot=feat_vec[0,2]&lt;br /&gt;    theta=feat_vec[0,3];theta_dot=feat_vec[0,4]&lt;br /&gt;    time+=1&lt;br /&gt;&lt;br /&gt;    if display_started:&lt;br /&gt;        show_cart(x,x_dot,theta,theta_dot)&lt;br /&gt;&lt;br /&gt;    if is_terminal(x,theta):&lt;br /&gt;        num_failures+=1&lt;br /&gt;        if num_failures&gt;max_num_failures_to_run-2:&lt;br /&gt;            display_started=True&lt;br /&gt;&lt;br /&gt;        time_steps_to_failure[num_failures]= \&lt;br /&gt;            time-time_at_start_of_current_trial&lt;br /&gt;        time_at_start_of_current_trial=time&lt;br /&gt;&lt;br /&gt;        x=random.uniform(-1.1,1.1);x_dot=0.0;theta=0.0;theta_dot=0.0&lt;br /&gt;&lt;br /&gt;#Plot learning curve:&lt;br /&gt;figure(2)&lt;br /&gt;plot(time_steps_to_failure.values(),'k')&lt;br /&gt;title('Time steps per trial')&lt;br /&gt;ioff()&lt;br /&gt;show()&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;Notice that this controller lasts much longer (although obviously there are nonlinearities not captured by our value function).  
It also performs the same on all trials, which has to do with how we initialize the system: the start state is not completely random, since only 'x' is drawn at random (from a range the cart can stay within) while the pole always starts upright, even though the controller still allows the pole to fall past the specified angle constraint:&lt;br /&gt;&lt;br /&gt;&lt;a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://3.bp.blogspot.com/_JBvsBkmE5OU/SildzC7DJrI/AAAAAAAAAHE/jW4JUkLarpA/s1600-h/polecart_cont.png"&gt;&lt;img style="cursor:pointer; cursor:hand;width: 320px; height: 240px;" src="http://3.bp.blogspot.com/_JBvsBkmE5OU/SildzC7DJrI/AAAAAAAAAHE/jW4JUkLarpA/s320/polecart_cont.png" border="0" alt="" id="BLOGGER_PHOTO_ID_5343905564391057074" /&gt;&lt;/a&gt;</content><link rel='replies' type='application/atom+xml' href='http://mechanistician.blogspot.com/feeds/942474703282875439/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://mechanistician.blogspot.com/2009/05/lec17-fitted-value-iteration_31.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/6475416749449483556/posts/default/942474703282875439'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/6475416749449483556/posts/default/942474703282875439'/><link rel='alternate' type='text/html' href='http://mechanistician.blogspot.com/2009/05/lec17-fitted-value-iteration_31.html' title='Lec17 - Fitted Value Iteration'/><author><name>Mechanistician</name><uri>http://www.blogger.com/profile/04405962471175755507</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><media:thumbnail 
xmlns:media='http://search.yahoo.com/mrss/' url='http://1.bp.blogspot.com/_JBvsBkmE5OU/Siko6ZAsBQI/AAAAAAAAAGs/z08fbZftkuI/s72-c/polecart.png' height='72' width='72'/><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-6475416749449483556.post-5301707897625904540</id><published>2009-05-27T13:05:00.000-07:00</published><updated>2009-05-31T15:59:44.654-07:00</updated><title type='text'>Pacman and Reinforcement Learning</title><content type='html'>UC Berkeley's approach to teaching their &lt;a href="http://inst.eecs.berkeley.edu/%7Ecs188/sp09/information.html"&gt;undergraduate Artificial Intelligence class&lt;/a&gt; should be an example to all institutions out there offering a similar course.  Not only have they not moved away from &lt;a href="http://aima.cs.berkeley.edu/"&gt;AIMA&lt;/a&gt;, they've designed the course with a strong focus on programming projects, giving students a chance to truly internalize the concepts learned.  This year's class used the game &lt;a href="http://inst.eecs.berkeley.edu/%7Ecs188/sp09/pacman.html"&gt;Pacman&lt;/a&gt; as a running example to build intelligent agents driven by strategies of complexity ranging from search algorithms to probabilistic models, and it's so well done I would not be surprised if this turned out to be one of the most popular compsci undergraduate classes at Berkeley.  Now, putting something together like this obviously requires a great deal of effort, but &lt;a href="http://www.eecs.berkeley.edu/%7Edenero/"&gt;John DeNero&lt;/a&gt;, the course &lt;a href="http://gsi.berkeley.edu/awards/denero_2008_tea.html"&gt;designer&lt;/a&gt; and instructor, actually encourages instructors at other universities to get in touch with him and use the materials for their own classes.  
So, the road is already paved; there should be no reason for other institutions not to go down it (University of Utah is &lt;a href="http://www.cs.utah.edu/%7Ehal/courses/2009S_AI/"&gt;already doing it&lt;/a&gt;).&lt;br /&gt;&lt;br /&gt;Some of the code for the Pacman game is based on an implementation from &lt;a href="http://www.livewires.org.uk/python/home"&gt;LiveWires&lt;/a&gt;, and you can download their stuff for further &lt;a href="http://www.livewires.org.uk/python/worksheets"&gt;documentation&lt;/a&gt; on the graphics, layout, and general spirit of the animation; however, this is not necessary for completing the programming projects.&lt;br /&gt;&lt;br /&gt;&lt;a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://3.bp.blogspot.com/_JBvsBkmE5OU/Sh2v8P1DyuI/AAAAAAAAAF8/EvDMJmRPnQk/s1600-h/pacman_game.gif"&gt;&lt;img style="cursor: pointer; width: 320px; height: 142px;" src="http://3.bp.blogspot.com/_JBvsBkmE5OU/Sh2v8P1DyuI/AAAAAAAAAF8/EvDMJmRPnQk/s320/pacman_game.gif" alt="" id="BLOGGER_PHOTO_ID_5340618182707366626" border="0" /&gt;&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;Since we're talking about reinforcement learning, I thought it might be nice to fill in the stubs for their &lt;a href="http://inst.eecs.berkeley.edu/%7Ecs188/sp09/projects/reinforcement/reinforcement.html"&gt;RL project&lt;/a&gt; and experiment a little.&lt;br /&gt;&lt;br /&gt;The first thing to do is to implement an agent that uses value iteration to compute the optimal policy on the grid world, and it does not take much to adapt the implementation in our last post to work in this scenario.  
Here is what the __init__ method in ValueIterationAgent would look like:&lt;br /&gt;&lt;br /&gt;&lt;pre name="code" class="python"&gt;&lt;br /&gt;  def __init__(self, mdp, discount = 0.9, iterations = 100):&lt;br /&gt;    """&lt;br /&gt;      Your value iteration agent should take an mdp on&lt;br /&gt;      construction, run the indicated number of iterations&lt;br /&gt;      and then act according to the resulting policy.&lt;br /&gt;    &lt;br /&gt;      Some useful mdp methods you will use:&lt;br /&gt;          mdp.getStates()&lt;br /&gt;          mdp.getPossibleActions(state)&lt;br /&gt;          mdp.getTransitionStatesAndProbs(state, action)&lt;br /&gt;          mdp.getReward(state, action, nextState)&lt;br /&gt;    """&lt;br /&gt;    self.mdp = mdp&lt;br /&gt;    self.discount = discount&lt;br /&gt;    self.iterations = iterations&lt;br /&gt;    self.values = util.Counter() # A Counter is a dict with default 0&lt;br /&gt;     &lt;br /&gt;    "*** YOUR CODE HERE ***"   &lt;br /&gt;    #iteratively estimate opt value function&lt;br /&gt;    values=dict([(s,0.0) for s in mdp.getStates()])&lt;br /&gt;    for i in range(iterations):&lt;br /&gt;      V=values.copy()&lt;br /&gt;      for s in mdp.getStates():&lt;br /&gt;        if not mdp.isTerminal(s):  &lt;br /&gt;          now=mdp.getReward(s,None,None)&lt;br /&gt;          actions=mdp.getPossibleActions(s)&lt;br /&gt;          #expected future value of each action (start empty so no&lt;br /&gt;          #spurious 0.0 entries take part in the max)&lt;br /&gt;          futures=[]&lt;br /&gt;          for a in actions:&lt;br /&gt;            futures.append(sum([p*V[next] for (next,p) in&lt;br /&gt;                                mdp.getTransitionStatesAndProbs(s,a)]))&lt;br /&gt;          maxfuture=max(futures)&lt;br /&gt;          values[s]=now+discount*maxfuture&lt;br /&gt;    for k,v in values.items():&lt;br /&gt;      self.values[k]=v&lt;br /&gt;        &lt;br /&gt;    #opt policy is greedy wrt opt value function&lt;br /&gt;    pi={}&lt;br /&gt;    for s in mdp.getStates():&lt;br /&gt;      if not mdp.isTerminal(s):&lt;br /&gt;        
maxexpect=-sys.maxint&lt;br /&gt;        for a in mdp.getPossibleActions(s):&lt;br /&gt;          expect=sum([p*self.values[next] for (next,p) in&lt;br /&gt;                      mdp.getTransitionStatesAndProbs(s,a)])&lt;br /&gt;          if expect&gt;maxexpect:&lt;br /&gt;            maxexpect=expect&lt;br /&gt;            pi[s]=a&lt;br /&gt;      else:&lt;br /&gt;        pi[s]=()&lt;br /&gt;    self.pi=pi &lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;For which the command below yields:&lt;br /&gt;&lt;br /&gt;&lt;pre name="code" class="python"&gt;&lt;br /&gt;python gridworld.py -a value -i 100 -k 10&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;&lt;a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://2.bp.blogspot.com/_JBvsBkmE5OU/SiBfn2YYO3I/AAAAAAAAAGc/VBAKQq-vIm4/s1600-h/gridPolicy.png"&gt;&lt;img style="cursor:pointer; cursor:hand;width: 400px; height: 367px;" src="http://2.bp.blogspot.com/_JBvsBkmE5OU/SiBfn2YYO3I/AAAAAAAAAGc/VBAKQq-vIm4/s400/gridPolicy.png" border="0" alt=""id="BLOGGER_PHOTO_ID_5341374296278711154" /&gt;&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;But the real fun begins when we move on to designing agents that learn from interaction with the environment rather than from a model that is just handed to them.  &lt;a href="http://en.wikipedia.org/wiki/Q-learning"&gt;Q-learning&lt;/a&gt; is the particular approach used for the rest of the exercise, and the 2 most salient methods in the QLearningAgent are update and getAction (which should really just call getPolicy, but no biggie):&lt;br /&gt;&lt;br /&gt;&lt;pre name="code" class="python"&gt;&lt;br /&gt;  def getAction(self, s):&lt;br /&gt;    """&lt;br /&gt;      What action to take in the current state. 
With&lt;br /&gt;      probability self.epsilon, we should take a random&lt;br /&gt;      action and take the best policy action otherwise.&lt;br /&gt;    &lt;br /&gt;      After you choose an action make sure to&lt;br /&gt;      inform your parent self.doAction(state,action) &lt;br /&gt;      This is done for you, just don't clobber it&lt;br /&gt;       &lt;br /&gt;      HINT: You might want to use util.flipCoin(prob)&lt;br /&gt;      HINT: To pick randomly from a list, use random.choice(list)&lt;br /&gt;    """  &lt;br /&gt;    # Pick Action&lt;br /&gt;    legalActions = self.getLegalActions(s)&lt;br /&gt;    action = None&lt;br /&gt;    "*** YOUR CODE HERE ***"&lt;br /&gt;    if not util.flipCoin(self.epsilon):    &lt;br /&gt;      maxactions=[]&lt;br /&gt;      maxQ=max([self.getQValue(s,a) for a in self.getLegalActions(s)])&lt;br /&gt;      for a in legalActions:&lt;br /&gt;        if self.getQValue(s,a)&gt;=maxQ:&lt;br /&gt;          maxactions.append(a)&lt;br /&gt;          maxQ=self.getQValue(s,a)&lt;br /&gt;    else:&lt;br /&gt;      maxactions=legalActions&lt;br /&gt;        &lt;br /&gt;    action=random.choice(maxactions)&lt;br /&gt;    &lt;br /&gt;    # Need to inform parent of action for Pacman (do not delete this line)&lt;br /&gt;    self.doAction(s,action)    &lt;br /&gt;    &lt;br /&gt;    return action&lt;br /&gt;  &lt;br /&gt;  def update(self, s, a, nexts, r):&lt;br /&gt;    """&lt;br /&gt;      The parent class calls this to observe a &lt;br /&gt;      state = action =&gt; nextState and reward transition.&lt;br /&gt;      You should do your Q-Value update here&lt;br /&gt;      &lt;br /&gt;      NOTE: You should never call this function,&lt;br /&gt;      it will be called on your behalf&lt;br /&gt;    """&lt;br /&gt;    "*** YOUR CODE HERE ***"&lt;br /&gt;    if len(self.getLegalActions(nexts))==0:&lt;br /&gt;      self.Q[s,a]=r&lt;br /&gt;    else:&lt;br /&gt;      self.Q[s,a]=self.getQValue(s,a)+ \&lt;br /&gt;                   self.alpha* 
\&lt;br /&gt;                   (r+ \&lt;br /&gt;                    self.gamma* \&lt;br /&gt;                    max([self.getQValue(nexts,nexta) for nexta in self.getLegalActions(nexts)])- \&lt;br /&gt;                    self.getQValue(s,a))&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;Watching this code run on the grid world is really amazing for 2 reasons: first because it works :) but more importantly because of how well it works given that there is no explicit model of the environment.  This is a little bit like neural nets, whose "knowledge" is implicit in the weights assigned to connections between nodes.  Similarly, a Q-learning agent's "knowledge" is implicit in its Q function, and the dynamics of the environment (provided it is well explored) are reflected in the values assigned to each state-action pair.  &lt;a href="http://www.cs.ualberta.ca/~sutton/book/the-book.html"&gt;Sutton and Barto's book&lt;/a&gt; has a good discussion on this towards the end of part II and most of part III.&lt;br /&gt;&lt;br /&gt;After the grid world, you get a chance to play with the crawler, an animation of a one-legged robot that drags itself around your screen.  It is a good way to build intuition for several of the algorithm's parameters, especially epsilon, which balances the agent's exploration against exploitation:&lt;br /&gt;&lt;br /&gt;&lt;a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://2.bp.blogspot.com/_JBvsBkmE5OU/SiBHfj8ZpXI/AAAAAAAAAGM/qMJMrV0r0Go/s1600-h/crawler.png"&gt;&lt;img style="cursor:pointer; cursor:hand;width: 400px; height: 113px;" src="http://2.bp.blogspot.com/_JBvsBkmE5OU/SiBHfj8ZpXI/AAAAAAAAAGM/qMJMrV0r0Go/s400/crawler.png" border="0" alt="" id="BLOGGER_PHOTO_ID_5341347765611505010" /&gt;&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;It's easy to start thinking about all sorts of things as you control the knobs and watch this little thing crawling on your screen: can we design an ultimate reward signal (or even 
a few) such as survival or continuation of the species? The robot could then be looking for food or a mate, and if it did mate, could its offspring "perform" better by moving itself faster, perhaps combining the learned strategies of its ancestors (think genetic algorithms)? Or the knobs that we control (epsilon, learning rate, discount), do they have equivalents in the brain, some neurotransmitter or other chemicals perhaps?  How could a model of the environment speed learning up? What kind of model? I'm sure most people playing around with this game would ask themselves these questions (and many more) and that is one of the objectives of a class like this: to stimulate the intellect to all that is possible (and as time goes by, seemingly more and more within grasp) and give students the tools to answer those questions; an objective that is more than fulfilled in this case.  Plus, we aren't even done yet, there is still the matter of equipping Pacman with q-learning abilities.&lt;br /&gt;&lt;br /&gt;Once you get to the ApproximateQAgent, the most important methods would be getQValue and update:&lt;br /&gt;&lt;br /&gt;&lt;pre name="code" class="python"&gt;&lt;br /&gt;  def getQValue(self, s, a):&lt;br /&gt;    """&lt;br /&gt;      Should return Q(state,action) = w * featureVector&lt;br /&gt;      where * is the dotProduct operator&lt;br /&gt;    """&lt;br /&gt;    "*** YOUR CODE HERE ***"&lt;br /&gt;    #return PacmanQAgent.getQValue(self, s, a)&lt;br /&gt;    feats=self.featExtractor.getFeatures(s,a)&lt;br /&gt;    return sum([feats[k]*self.W[k] for k,v in feats.items()])&lt;br /&gt;    &lt;br /&gt;  def update(self, s, a, nexts, r):&lt;br /&gt;    """&lt;br /&gt;       Should update your weights based on transition  &lt;br /&gt;    """&lt;br /&gt;    "*** YOUR CODE HERE ***"&lt;br /&gt;    #PacmanQAgent.update(self, s, a, nexts, r)&lt;br /&gt;    feats=self.featExtractor.getFeatures(s,a)&lt;br /&gt;    if len(self.getLegalActions(nexts))==0:&lt;br /&gt;      for 
k,v in feats.items():&lt;br /&gt;        self.W[k]=r*feats[k]&lt;br /&gt;    else:&lt;br /&gt;      #compute the correction once, before any weight changes,&lt;br /&gt;      #so every weight is updated against the same Q estimate&lt;br /&gt;      corr=r+self.gamma*self.getValue(nexts)- \&lt;br /&gt;            self.getQValue(s,a)&lt;br /&gt;      for k,v in feats.items():&lt;br /&gt;        self.W[k]=self.W[k]+ \&lt;br /&gt;                   self.alpha*corr*feats[k]&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;And that should be enough to run on some of the bigger grids with a decent chance of winning:&lt;br /&gt;&lt;br /&gt;&lt;pre name="code" class="python"&gt;&lt;br /&gt;python pacman.py -p ApproximateQAgent -a extractor=SimpleExtractor -x 50 -n 60 -l mediumClassic &lt;br /&gt;Beginning 50 episodes of Training&lt;br /&gt;Training Done (turning off epsilon and alpha)&lt;br /&gt;---------------------------------------------&lt;br /&gt;Pacman died! Score: -100&lt;br /&gt;Pacman emerges victorious! Score: 1327&lt;br /&gt;Pacman died! Score: -70&lt;br /&gt;Pacman emerges victorious! Score: 1346&lt;br /&gt;Pacman died! Score: -274&lt;br /&gt;Pacman emerges victorious! Score: 591&lt;br /&gt;Pacman died! Score: -180&lt;br /&gt;Pacman emerges victorious! Score: 701&lt;br /&gt;Pacman died! Score: 87&lt;br /&gt;Pacman emerges victorious! 
Score: 1335&lt;br /&gt;Average Score: 476.3&lt;br /&gt;Scores:        -100, 1327, -70, 1346, -274, 591, -180, 701, 87, 1335&lt;br /&gt;Win Rate:      5/10 (0.50)&lt;br /&gt;Record:        Loss, Win, Loss, Win, Loss, Win, Loss, Win, Loss, Win&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;&lt;a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://4.bp.blogspot.com/_JBvsBkmE5OU/SiBJ3K5pf5I/AAAAAAAAAGU/18qPa4c7iMM/s1600-h/pacmanWin.png"&gt;&lt;img style="cursor:pointer; cursor:hand;width: 400px; height: 263px;" src="http://4.bp.blogspot.com/_JBvsBkmE5OU/SiBJ3K5pf5I/AAAAAAAAAGU/18qPa4c7iMM/s400/pacmanWin.png" border="0" alt=""id="BLOGGER_PHOTO_ID_5341350370229190546" /&gt;&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;The last part of the project encourages you to build better feature extractors, which if done  well should considerably improve Pacman's performance.  This issue of generalizing in the state space is a recurring one not only in reinforcement learning, but throughout AI, and we've seen a few techniques as we learned about supervised and unsupervised learning.  
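&lt;br /&gt;&lt;br /&gt;To make the idea of generalizing across states concrete, here is a minimal sketch (written for illustration, not taken from the Berkeley project) of a linear Q-function over hand-made features. The state layout, the feature names, and the extract_features, q_value and update helpers are all hypothetical stand-ins for the project's extractor and agent classes:

```python
# Hypothetical feature extractor: a state is (distance_to_food, ghost_adjacent)
# and the action either moves one square toward the food or one square away.
def extract_features(state, action):
    dist_to_food, ghost_adjacent = state
    step = -1 if action == "toward_food" else 1
    new_dist = dist_to_food + step
    return {
        "bias": 1.0,
        "food_closeness": 1.0 / (1.0 + new_dist),
        "ghost_adjacent": 1.0 if ghost_adjacent else 0.0,
    }


def q_value(weights, state, action):
    # Q(s,a) is the dot product of the weight vector and the feature vector.
    feats = extract_features(state, action)
    return sum(weights.get(k, 0.0) * v for k, v in feats.items())


def update(weights, state, action, reward, next_value, alpha=0.1, gamma=0.9):
    # Compute the correction once, then move every active weight along it.
    feats = extract_features(state, action)
    corr = reward + gamma * next_value - q_value(weights, state, action)
    for k, v in feats.items():
        weights[k] = weights.get(k, 0.0) + alpha * corr * v
    return weights


weights = {}
# One rewarding transition observed three squares away from the food...
update(weights, (3, False), "toward_food", reward=10.0, next_value=0.0)
# ...already changes the estimate for a state that was never visited.
unseen = q_value(weights, (7, False), "toward_food")
print(weights)
print(unseen)
```

Because both states share the "bias" and "food_closeness" weights, a single rewarding step taken three squares from the food already raises the estimate for a state seven squares away that was never visited, and states closer to the food score higher still.&lt;br /&gt;&lt;br /&gt;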
Next time we'll go back to cs229 and look at strategies for generalizing in continuous state-spaces.</content><link rel='replies' type='application/atom+xml' href='http://mechanistician.blogspot.com/feeds/5301707897625904540/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://mechanistician.blogspot.com/2009/05/pacman-and-reinforcement-learning.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/6475416749449483556/posts/default/5301707897625904540'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/6475416749449483556/posts/default/5301707897625904540'/><link rel='alternate' type='text/html' href='http://mechanistician.blogspot.com/2009/05/pacman-and-reinforcement-learning.html' title='Pacman and Reinforcement Learning'/><author><name>Mechanistician</name><uri>http://www.blogger.com/profile/04405962471175755507</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' url='http://3.bp.blogspot.com/_JBvsBkmE5OU/Sh2v8P1DyuI/AAAAAAAAAF8/EvDMJmRPnQk/s72-c/pacman_game.gif' height='72' width='72'/><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-6475416749449483556.post-3268016826252873385</id><published>2009-05-25T08:04:00.000-07:00</published><updated>2009-05-26T17:29:42.818-07:00</updated><title type='text'>Lec16 - Value and Policy Iteration</title><content type='html'>Both value and policy iteration are algorithms for solving an MDP formulation of a reinforcement learning problem where the environment dynamics are 
completely known.  Resources for this kind of scenario include:&lt;br /&gt;&lt;ul&gt;&lt;li&gt;chapter 17 of &lt;a href="http://aima.cs.berkeley.edu/"&gt;AIMA&lt;/a&gt;&lt;/li&gt;&lt;li&gt;the &lt;a href="http://www.cs.ualberta.ca/%7Esutton/book/ebook/node40.html"&gt;dynamic programming section&lt;/a&gt; of the &lt;a href="http://www.cs.ualberta.ca/%7Esutton/book/the-book.html"&gt;RL book&lt;/a&gt;&lt;/li&gt;&lt;li&gt;the &lt;a href="http://see.stanford.edu/materials/aimlcs229/cs229-notes12.pdf"&gt;RL notes&lt;/a&gt; from cs229&lt;/li&gt;&lt;/ul&gt;Below we have some simple implementations of such algorithms running against a simple grid world described in AIMA:&lt;br /&gt;&lt;br /&gt;&lt;pre name="code" class="python"&gt;&lt;br /&gt;#mdp_imp.py&lt;br /&gt;import sys,operator,random&lt;br /&gt;&lt;br /&gt;class Env:&lt;br /&gt;    def __init__(self,states,rewards,terminals,dynamics,actions):&lt;br /&gt;        self.states=states&lt;br /&gt;        self.rewards=rewards&lt;br /&gt;        self.terminals=terminals&lt;br /&gt;        self.actions=actions&lt;br /&gt;        self.dynamics=dynamics&lt;br /&gt;        self.R=dict(zip(states,rewards))&lt;br /&gt;        self.allowablestates=[states[i]&lt;br /&gt;                              for i in range(len(states))&lt;br /&gt;                              if rewards[i] is not None]&lt;br /&gt;&lt;br /&gt;    def T(self,state,action):&lt;br /&gt;        trans=[]&lt;br /&gt;        for prob,move in self.dynamics:&lt;br /&gt;            trans.append((prob,self.go(state,move(self,action))))&lt;br /&gt;        return trans&lt;br /&gt;&lt;br /&gt;    def go(self,state,action):&lt;br /&gt;        next=tuple(map(operator.add,state,action))&lt;br /&gt;        if next in self.allowablestates:&lt;br /&gt;            return next&lt;br /&gt;        else:&lt;br /&gt;            return state&lt;br /&gt;&lt;br /&gt;    def desired(self,action):&lt;br /&gt;        return action&lt;br /&gt;&lt;br /&gt;    def left(self,action):&lt;br 
/&gt;        idx=self.actions.index(action)-1&lt;br /&gt;        return self.actions[idx]&lt;br /&gt;&lt;br /&gt;    def right(self,action):&lt;br /&gt;        idx=(self.actions.index(action)+1)%len(self.actions)&lt;br /&gt;        return self.actions[idx]&lt;br /&gt;        &lt;br /&gt;class Agent:&lt;br /&gt;    def __init__(self,actions,gamma):&lt;br /&gt;        self.actions=actions&lt;br /&gt;        self.gamma=gamma&lt;br /&gt;&lt;br /&gt;def display(agt,env,pi,desc):&lt;br /&gt;    dirs={(1,0):'^',(0,1):'&gt;',(-1,0):'v',(0,-1):'&lt;',(0,0):'.'}&lt;br /&gt;    print '\n'+desc&lt;br /&gt;    for state in env.allowablestates:&lt;br /&gt;        print 'state: %s, move: %s, action: %s'% \&lt;br /&gt;              (state,dirs[pi[state]],pi[state])&lt;br /&gt;&lt;br /&gt;def policyIter(agt,env):&lt;br /&gt;    #create some random policy&lt;br /&gt;    pi={}&lt;br /&gt;    for s in env.terminals: pi[s]=(0,0)&lt;br /&gt;    for s in set(env.allowablestates)-set(env.terminals):&lt;br /&gt;        pi[s]=agt.actions[random.randint(0,len(agt.actions)-1)]&lt;br /&gt;&lt;br /&gt;    #iteratively improve policy&lt;br /&gt;    done=False&lt;br /&gt;    while not done:&lt;br /&gt;        done=True&lt;br /&gt;        &lt;br /&gt;        #first compute value function for current policy using modified&lt;br /&gt;        #version of value iteration (policy and iters are constant)&lt;br /&gt;        V=dict([(s,0.0) for s in env.states])&lt;br /&gt;        for s in env.terminals: V[s]=env.R[s]&lt;br /&gt;        iters=20&lt;br /&gt;        for i in range(iters):&lt;br /&gt;            for s in set(env.allowablestates)-set(env.terminals):&lt;br /&gt;                now=env.R[s]&lt;br /&gt;                future=sum([p*V[next] for (p,next) in env.T(s,pi[s])])                &lt;br /&gt;                V[s]=now+agt.gamma*future&lt;br /&gt;        &lt;br /&gt;        #then make policy greedy towards computed value function&lt;br /&gt;        for s in 
set(env.allowablestates)-set(env.terminals):&lt;br /&gt;            maxexpect=sum([p*V[next] for (p,next) in env.T(s,pi[s])])&lt;br /&gt;            for a in agt.actions:&lt;br /&gt;                expect=sum([p*V[next] for (p,next) in env.T(s,a)])&lt;br /&gt;                if expect&gt;maxexpect and a!=pi[s]:&lt;br /&gt;                    done=False&lt;br /&gt;                    maxexpect=expect&lt;br /&gt;                    pi[s]=a&lt;br /&gt;    return pi&lt;br /&gt;&lt;br /&gt;def valueIter(agt,env):&lt;br /&gt;    #initialize value function for each state&lt;br /&gt;    V=dict([(s,0.0) for s in env.states])&lt;br /&gt;    for s in env.terminals: V[s]=env.R[s]&lt;br /&gt;&lt;br /&gt;    #iteratively estimate opt value function&lt;br /&gt;    delta=sys.maxint&lt;br /&gt;    while (delta&gt;1e-5):&lt;br /&gt;        delta=-sys.maxint&lt;br /&gt;        for s in set(env.allowablestates)-set(env.terminals):&lt;br /&gt;            now=env.R[s]&lt;br /&gt;            maxfuture=max([sum([p*V[next] for (p,next) in env.T(s,a)])&lt;br /&gt;                           for a in agt.actions])&lt;br /&gt;            value=now+agt.gamma*maxfuture&lt;br /&gt;            delta=max(delta,abs(value-V[s]))&lt;br /&gt;            V[s]=value&lt;br /&gt;            &lt;br /&gt;    #opt policy is greedy wrt opt value function&lt;br /&gt;    pi={}&lt;br /&gt;    for s in env.terminals: pi[s]=(0,0)&lt;br /&gt;    for s in set(env.allowablestates)-set(env.terminals):&lt;br /&gt;        maxexpect=-sys.maxint&lt;br /&gt;        for a in agt.actions:&lt;br /&gt;            expect=sum([p*V[next] for (p,next) in env.T(s,a)])&lt;br /&gt;            if expect&gt;maxexpect:&lt;br /&gt;                maxexpect=expect&lt;br /&gt;                pi[s]=a&lt;br /&gt;    return pi&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;Driven with:&lt;br /&gt;&lt;br /&gt;&lt;pre name="code" class="python"&gt;&lt;br /&gt;#mdp_inter.py&lt;br /&gt;from mdp_imp import *&lt;br /&gt;&lt;br /&gt;#the states in 
the environment and&lt;br /&gt;#their (y,x) coordinates&lt;br /&gt;states=[(2,0),(2,1),(2,2),(2,3),&lt;br /&gt;        (1,0),(1,1),(1,2),(1,3),&lt;br /&gt;        (0,0),(0,1),(0,2),(0,3)]&lt;br /&gt;&lt;br /&gt;#rewards obtained for simply entering&lt;br /&gt;#each state. 'None' denotes an unreachable state&lt;br /&gt;rewards=[-0.04,-0.04,-0.04,+1.00,&lt;br /&gt;         -0.04, None,-0.04,-1.00,&lt;br /&gt;         -0.04,-0.04,-0.04,-0.04]&lt;br /&gt;&lt;br /&gt;#states considered terminal (absorbing)&lt;br /&gt;terminals=[(2,3),(1,3)]&lt;br /&gt;&lt;br /&gt;#given an action, the environment dynamics specify that&lt;br /&gt;#with .8 probability it will have the intended effect (move&lt;br /&gt;#the agent in the desired direction), with .1 probability&lt;br /&gt;#the agent will go left, and with .1 probability the&lt;br /&gt;#agent will go right. If the action takes the agent&lt;br /&gt;#outside the boundaries of the environment or into a 'None'&lt;br /&gt;#state, then the agent will remain where it is&lt;br /&gt;dynamics=[(0.8,Env.desired),(0.1,Env.left),(0.1,Env.right)]&lt;br /&gt;&lt;br /&gt;#the only actions the agent can take are to move&lt;br /&gt;#north,east,south,west&lt;br /&gt;#denoted below with unit vectors (y,x)&lt;br /&gt;actions=[(1,0),(0,1),(-1,0),(0,-1)]&lt;br /&gt;&lt;br /&gt;#gamma denotes the discount factor, indicative of an&lt;br /&gt;#agent's preference of present rewards over future rewards&lt;br /&gt;gamma=.9&lt;br /&gt;&lt;br /&gt;agt=Agent(actions,gamma)&lt;br /&gt;env=Env(states,rewards,terminals,dynamics,actions)&lt;br /&gt;display(agt,env,valueIter(agt,env),'Value Iteration')&lt;br /&gt;display(agt,env,policyIter(agt,env),'Policy Iteration')&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;With output:&lt;br /&gt;&lt;br /&gt;&lt;pre name="code" class="python"&gt;&lt;br /&gt;Value Iteration&lt;br /&gt;state: (2, 0), move: &gt;, action: (0, 1)&lt;br /&gt;state: (2, 1), move: &gt;, action: (0, 1)&lt;br /&gt;state: (2, 2), move: &gt;, action: 
(0, 1)&lt;br /&gt;state: (2, 3), move: ., action: (0, 0)&lt;br /&gt;state: (1, 0), move: ^, action: (1, 0)&lt;br /&gt;state: (1, 2), move: ^, action: (1, 0)&lt;br /&gt;state: (1, 3), move: ., action: (0, 0)&lt;br /&gt;state: (0, 0), move: ^, action: (1, 0)&lt;br /&gt;state: (0, 1), move: &gt;, action: (0, 1)&lt;br /&gt;state: (0, 2), move: ^, action: (1, 0)&lt;br /&gt;state: (0, 3), move: &lt;, action: (0, -1)&lt;br /&gt;&lt;br /&gt;Policy Iteration&lt;br /&gt;state: (2, 0), move: &gt;, action: (0, 1)&lt;br /&gt;state: (2, 1), move: &gt;, action: (0, 1)&lt;br /&gt;state: (2, 2), move: &gt;, action: (0, 1)&lt;br /&gt;state: (2, 3), move: ., action: (0, 0)&lt;br /&gt;state: (1, 0), move: ^, action: (1, 0)&lt;br /&gt;state: (1, 2), move: ^, action: (1, 0)&lt;br /&gt;state: (1, 3), move: ., action: (0, 0)&lt;br /&gt;state: (0, 0), move: ^, action: (1, 0)&lt;br /&gt;state: (0, 1), move: &gt;, action: (0, 1)&lt;br /&gt;state: (0, 2), move: ^, action: (1, 0)&lt;br /&gt;state: (0, 3), move: &lt;, action: (0, -1)&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;Note that both algorithms find the same optimal policy for this problem.</content><link rel='replies' type='application/atom+xml' href='http://mechanistician.blogspot.com/feeds/3268016826252873385/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://mechanistician.blogspot.com/2009/05/lec16-value-and-policy-iteration.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/6475416749449483556/posts/default/3268016826252873385'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/6475416749449483556/posts/default/3268016826252873385'/><link rel='alternate' type='text/html' 
href='http://mechanistician.blogspot.com/2009/05/lec16-value-and-policy-iteration.html' title='Lec16 - Value and Policy Iteration'/><author><name>Mechanistician</name><uri>http://www.blogger.com/profile/04405962471175755507</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-6475416749449483556.post-8905668786851728168</id><published>2009-05-21T17:50:00.000-07:00</published><updated>2009-05-24T12:21:20.591-07:00</updated><title type='text'>Lec16 - Reinforcement Learning</title><content type='html'>&lt;a href="http://www.youtube.com/watch?v=RtxI449ZjSc&amp;amp;feature=PlayList&amp;amp;p=A89DCFA6ADACE599&amp;amp;index=15"&gt;Video Lecture - 16&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;Reinforcement learning (RL) is a particularly interesting field, starting with how it is &lt;a href="http://en.wikipedia.org/wiki/Reinforcement_learning"&gt;defined&lt;/a&gt;: RL is concerned with how an agent ought to take actions in an environment so as to maximize some notion of long-term reward.  Artificial intelligence (AI) itself is similarly &lt;a href="http://en.wikipedia.org/wiki/Artificial_intelligence"&gt;defined&lt;/a&gt;: AI is concerned with the study and design of intelligent agents, where an intelligent agent is a system that perceives its environment and takes actions that maximize its chance of success.  So, the problem of RL is as general as the problem of AI; however, because AI has always been a general umbrella for any approach that tries to synthetically reproduce some aspect of human intelligence, it has diverged and fragmented in many more directions than RL has (one can easily get a feel for this by reading the Wikipedia entries for RL and AI).  
Although its own &lt;a href="http://www.cs.ualberta.ca/%7Esutton/book/ebook/node12.html"&gt;past&lt;/a&gt; has also been tumultuous, the narrower focus of RL has kept it close to its core formulation: learning by trial and error through interaction with an environment that provides feedback.  That has made for a fairly stable framework into which ideas from other AI threads can be incorporated (supervised and unsupervised learning systems can be thought of as components of a reinforcement learning system).&lt;br /&gt;&lt;br /&gt;Some good resources (in addition to cs229) include &lt;a href="http://aima.cs.berkeley.edu/"&gt;AIMA&lt;/a&gt;, &lt;a href="http://www.cs.ualberta.ca/%7Esutton/book/the-book.html"&gt;Sutton and Barto's book&lt;/a&gt;, and a few others that discuss the same principles of RL as they appear in other fields: &lt;a href="http://www.castlelab.princeton.edu/adp.htm"&gt;approximate dynamic programming&lt;/a&gt;, &lt;a href="http://www.athenasc.com/ndpbook.html"&gt;neuro-dynamic programming&lt;/a&gt;, and &lt;a href="http://www.athenasc.com/dpbook.html"&gt;dynamic programming and optimal control&lt;/a&gt;.&lt;br /&gt;&lt;br /&gt;What are some of the things RL has been used for?  
Well, some pretty amazing things:&lt;br /&gt;&lt;ul&gt;&lt;li&gt;&lt;a href="http://www.sciencedaily.com/releases/2008/09/080902171117.htm"&gt;Autonomous helicopter flight&lt;/a&gt;&lt;/li&gt;&lt;/ul&gt;&lt;br /&gt;&lt;object width="425" height="344"&gt;&lt;param name="movie" value="http://www.youtube.com/v/VCdxqn0fcnE&amp;amp;hl=en&amp;amp;fs=1"&gt;&lt;param name="allowFullScreen" value="true"&gt;&lt;param name="allowscriptaccess" value="always"&gt;&lt;embed src="http://www.youtube.com/v/VCdxqn0fcnE&amp;amp;hl=en&amp;amp;fs=1" type="application/x-shockwave-flash" allowscriptaccess="always" allowfullscreen="true" width="425" height="344"&gt;&lt;/embed&gt;&lt;/object&gt;&lt;br /&gt;&lt;br /&gt;&lt;ul&gt;&lt;li&gt;&lt;a href="http://en.wikipedia.org/wiki/Legged_robot"&gt;Legged robots&lt;/a&gt;&lt;/li&gt;&lt;/ul&gt;&lt;br /&gt;&lt;object width="425" height="344"&gt;&lt;param name="movie" value="http://www.youtube.com/v/CEQlZtCi7IQ&amp;amp;hl=en&amp;amp;fs=1"&gt;&lt;param name="allowFullScreen" value="true"&gt;&lt;param name="allowscriptaccess" value="always"&gt;&lt;embed src="http://www.youtube.com/v/CEQlZtCi7IQ&amp;amp;hl=en&amp;amp;fs=1" type="application/x-shockwave-flash" allowscriptaccess="always" allowfullscreen="true" width="425" height="344"&gt;&lt;/embed&gt;&lt;/object&gt;&lt;br /&gt;&lt;br /&gt;&lt;object width="425" height="344"&gt;&lt;param name="movie" value="http://www.youtube.com/v/b2bExqhhWRI&amp;amp;hl=en&amp;amp;fs=1"&gt;&lt;param name="allowFullScreen" value="true"&gt;&lt;param name="allowscriptaccess" value="always"&gt;&lt;embed src="http://www.youtube.com/v/b2bExqhhWRI&amp;amp;hl=en&amp;amp;fs=1" type="application/x-shockwave-flash" allowscriptaccess="always" allowfullscreen="true" width="425" height="344"&gt;&lt;/embed&gt;&lt;/object&gt;&lt;br /&gt;&lt;br /&gt;&lt;ul&gt;&lt;li&gt;&lt;a href="http://en.wikipedia.org/wiki/Inverted_pendulum"&gt;Inverted pendulum&lt;/a&gt; (a toy problem, but illustrative nevertheless)&lt;/li&gt;&lt;/ul&gt;&lt;br 
/&gt;&lt;object width="425" height="344"&gt;&lt;param name="movie" value="http://www.youtube.com/v/MWJHcI7UcuE&amp;amp;hl=en&amp;amp;fs=1"&gt;&lt;param name="allowFullScreen" value="true"&gt;&lt;param name="allowscriptaccess" value="always"&gt;&lt;embed src="http://www.youtube.com/v/MWJHcI7UcuE&amp;amp;hl=en&amp;amp;fs=1" type="application/x-shockwave-flash" allowscriptaccess="always" allowfullscreen="true" width="425" height="344"&gt;&lt;/embed&gt;&lt;/object&gt;&lt;br /&gt;&lt;br /&gt;&lt;object width="425" height="344"&gt;&lt;param name="movie" value="http://www.youtube.com/v/H7QxuSu5UIA&amp;hl=en&amp;fs=1"&gt;&lt;/param&gt;&lt;param name="allowFullScreen" value="true"&gt;&lt;/param&gt;&lt;param name="allowscriptaccess" value="always"&gt;&lt;/param&gt;&lt;embed src="http://www.youtube.com/v/H7QxuSu5UIA&amp;hl=en&amp;fs=1" type="application/x-shockwave-flash" allowscriptaccess="always" allowfullscreen="true" width="425" height="344"&gt;&lt;/embed&gt;&lt;/object&gt;&lt;br /&gt;&lt;br /&gt;&lt;ul&gt;&lt;li&gt;Another interesting video, where &lt;a href="http://en.wikipedia.org/wiki/AIBO"&gt;AIBO&lt;/a&gt; learns using RL&lt;/li&gt;&lt;/ul&gt;&lt;br /&gt;&lt;object width="425" height="344"&gt;&lt;param name="movie" value="http://www.youtube.com/v/y06TMZB-Qvk&amp;hl=en&amp;fs=1"&gt;&lt;/param&gt;&lt;param name="allowFullScreen" value="true"&gt;&lt;/param&gt;&lt;param name="allowscriptaccess" value="always"&gt;&lt;/param&gt;&lt;embed src="http://www.youtube.com/v/y06TMZB-Qvk&amp;hl=en&amp;fs=1" type="application/x-shockwave-flash" allowscriptaccess="always" allowfullscreen="true" width="425" height="344"&gt;&lt;/embed&gt;&lt;/object&gt;&lt;br /&gt;&lt;br /&gt;&lt;ul&gt;&lt;li&gt;&lt;a href="http://www.research.ibm.com/massive/tdl.html"&gt;Game playing&lt;/a&gt;&lt;/li&gt;&lt;/ul&gt;&lt;br /&gt;&lt;ul&gt;&lt;li&gt;&lt;a href="http://www.icsi.berkeley.edu/~moody/JForecastMoodyWu.pdf"&gt;Automated trading systems&lt;/a&gt;&lt;/li&gt;&lt;/ul&gt;&lt;br 
/&gt;&lt;ul&gt;&lt;li&gt;&lt;a href="http://www.cns.nyu.edu/~daw/thesis.pdf"&gt;Models of biological neural systems&lt;/a&gt;&lt;/li&gt;&lt;/ul&gt;&lt;br /&gt;&lt;br /&gt;That last one is interesting because not only do our machine learning methods benefit from an understanding of how the brain works, but those very same methods may help guide our study of the brain.  Here's another example of this from some of the work that &lt;a href="http://www.cs.cmu.edu/~tom/"&gt;Tom Mitchell&lt;/a&gt; is doing:&lt;br /&gt;&lt;br /&gt;&lt;object width="560" height="340"&gt;&lt;param name="movie" value="http://www.youtube.com/v/JVLu5_hvr8s&amp;hl=en&amp;fs=1"&gt;&lt;/param&gt;&lt;param name="allowFullScreen" value="true"&gt;&lt;/param&gt;&lt;param name="allowscriptaccess" value="always"&gt;&lt;/param&gt;&lt;embed src="http://www.youtube.com/v/JVLu5_hvr8s&amp;hl=en&amp;fs=1" type="application/x-shockwave-flash" allowscriptaccess="always" allowfullscreen="true" width="560" height="340"&gt;&lt;/embed&gt;&lt;/object&gt;&lt;br /&gt;&lt;br /&gt;Next time, we'll look at some code and illustrate how some of these systems might be implemented.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/6475416749449483556-8905668786851728168?l=mechanistician.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://mechanistician.blogspot.com/feeds/8905668786851728168/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://mechanistician.blogspot.com/2009/05/lec16-reinforcement-learning.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/6475416749449483556/posts/default/8905668786851728168'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/6475416749449483556/posts/default/8905668786851728168'/><link rel='alternate' type='text/html' 
href='http://mechanistician.blogspot.com/2009/05/lec16-reinforcement-learning.html' title='Lec16 - Reinforcement Learning'/><author><name>Mechanistician</name><uri>http://www.blogger.com/profile/04405962471175755507</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-6475416749449483556.post-5707398500751239568</id><published>2009-05-18T10:53:00.000-07:00</published><updated>2009-05-22T09:39:39.458-07:00</updated><title type='text'>Hidden Markov Model III</title><content type='html'>Last time we used an existing HMM to showcase the inference scenarios in which HMMs might be useful.  Today we explore how we might infer the HMM itself from data.&lt;br /&gt;&lt;br /&gt;The parameters to be estimated are the transition probabilities and the evidence (emission) probabilities, along with the prior state distribution.  The inputs we have to work with are the observation sequence and the number of latent states, much as the number of Gaussians was an input to the &lt;a href="http://mechanistician.blogspot.com/2009/05/lec12-expectation-maximization.html"&gt;mixture of Gaussians algorithm&lt;/a&gt;.  
In fact, the standard algorithm for learning a hidden markov model is an instance of the Expectation-Maximization algorithm known as the &lt;a href="http://en.wikipedia.org/wiki/Baum-Welch_algorithm"&gt;Baum-Welch algorithm&lt;/a&gt;, which uses a slightly modified version of the &lt;a href="http://en.wikipedia.org/wiki/Forward-backward_algorithm"&gt;Forward-Backward algorithm&lt;/a&gt;:&lt;br /&gt;&lt;br /&gt;&lt;pre name="code" class="python"&gt;&lt;br /&gt;#hmm_imp.py&lt;br /&gt;from numpy import *&lt;br /&gt;&lt;br /&gt;############################################################&lt;br /&gt;#hmmAlgorithms&lt;br /&gt;############################################################&lt;br /&gt;&lt;br /&gt;def fwd(tProb,eProb,prior,ev):&lt;br /&gt;    nstates=shape(prior)[0]&lt;br /&gt;    T=shape(ev)[1]-1&lt;br /&gt;    fMatrix=zeros((nstates,T))&lt;br /&gt;&lt;br /&gt;    #our initial forward message equals the prior&lt;br /&gt;    #state distribution (time=0)&lt;br /&gt;    fMatrix[:,0]=prior.reshape(nstates,)&lt;br /&gt;&lt;br /&gt;    for t in range(1,T):&lt;br /&gt;        fTemp=zeros((nstates,nstates))&lt;br /&gt;        &lt;br /&gt;        for i in range(nstates):&lt;br /&gt;            for j in range(nstates):&lt;br /&gt;                #for every state i at time t,&lt;br /&gt;                #calculate the probability that each&lt;br /&gt;                #state j at t-1 would have lead to i at t,&lt;br /&gt;                #weighted by the belief that we were in&lt;br /&gt;                #state j at t-1, and weighted by the&lt;br /&gt;                #probability that state i would have&lt;br /&gt;                #produced the evidence we observed at t&lt;br /&gt;                fTemp[i,j]= \&lt;br /&gt;                    tProb[j,i]*fMatrix[j,t-1]*eProb[i,ev[0,t]]&lt;br /&gt;&lt;br /&gt;        #collapse fTemp from right to left, summing contributions&lt;br /&gt;        #of each state j at time t-1 to a given state i at t&lt;br /&gt;        
fMatrix[:,t]=normEachRow(fTemp.sum(1))&lt;br /&gt;&lt;br /&gt;    return fMatrix&lt;br /&gt;&lt;br /&gt;#same as fwd, w/o norm&lt;br /&gt;def likeli(tProb,eProb,prior,ev):&lt;br /&gt;    nstates=shape(prior)[0]&lt;br /&gt;    T=shape(ev)[1]-1&lt;br /&gt;    fMatrix=zeros((nstates,T))&lt;br /&gt;&lt;br /&gt;    #our initial forward message equals the prior&lt;br /&gt;    #state distribution (time=0)&lt;br /&gt;    fMatrix[:,0]=prior.reshape(nstates,)&lt;br /&gt;&lt;br /&gt;    for t in range(1,T):&lt;br /&gt;        fTemp=zeros((nstates,nstates))&lt;br /&gt;        &lt;br /&gt;        for i in range(nstates):&lt;br /&gt;            for j in range(nstates):&lt;br /&gt;                #for every state i at time t,&lt;br /&gt;                #calculate the probability that each&lt;br /&gt;                #state j at t-1 would have lead to i at t,&lt;br /&gt;                #weighted by the belief that we were in&lt;br /&gt;                #state j at t-1, and weighted by the&lt;br /&gt;                #probability that state i would have&lt;br /&gt;                #produced the evidence we observed at t&lt;br /&gt;                fTemp[i,j]= \&lt;br /&gt;                    tProb[j,i]*fMatrix[j,t-1]*eProb[i,ev[0,t]]&lt;br /&gt;&lt;br /&gt;        #collapse fTemp from right to left, summing contributions&lt;br /&gt;        #of each state j at time t-1 to a given state i at t&lt;br /&gt;        fMatrix[:,t]=fTemp.sum(1)&lt;br /&gt;&lt;br /&gt;    return fMatrix&lt;br /&gt;&lt;br /&gt;#same as fwd, but backwards &lt;br /&gt;def bwd(tProb,eProb,prior,ev):&lt;br /&gt;    nstates=shape(prior)[0]&lt;br /&gt;    T=shape(ev)[1]-1&lt;br /&gt;&lt;br /&gt;    #by initializing the entire matrix to ones,&lt;br /&gt;    #the last backward message will be set to all ones,&lt;br /&gt;    #which indicates our diffuse prior belief on our&lt;br /&gt;    #final state.  
If we knew what state we ended up on,&lt;br /&gt;    #we could use that information here.&lt;br /&gt;    bMatrix=ones((nstates,T))&lt;br /&gt;&lt;br /&gt;    for t in range(T-1,0,-1):&lt;br /&gt;        fTemp=zeros((nstates,nstates))&lt;br /&gt;&lt;br /&gt;        #for every state i at time t-1,&lt;br /&gt;        #calculate the probability that it would&lt;br /&gt;        #have lead to state j at time t, weighted&lt;br /&gt;        #by the probabilistic belief that we are&lt;br /&gt;        #in state j at time t, also weighted by&lt;br /&gt;        #the likelihood that j would emit the evidence&lt;br /&gt;        #it emitted at time t&lt;br /&gt;        for i in range(nstates):&lt;br /&gt;            for j in range(nstates):&lt;br /&gt;                fTemp[i,j]=tProb[i,j]*bMatrix[j,t]*eProb[j,ev[0,t]]&lt;br /&gt;&lt;br /&gt;        #collapse fTemp from right to left, summing contributions&lt;br /&gt;        #of each state j at time t to a given state i at t-1&lt;br /&gt;        bMatrix[:,t-1]=normEachRow(fTemp.sum(1))&lt;br /&gt;&lt;br /&gt;    #allow the backward message to be neutral&lt;br /&gt;    #for the prior state distribution (time=0)&lt;br /&gt;    bMatrix[:,0]=ones((1,nstates))&lt;br /&gt;    return bMatrix&lt;br /&gt;&lt;br /&gt;#uses only filtered estimates&lt;br /&gt;def viterbi(tProb,eProb,prior,ev):&lt;br /&gt;    nstates=shape(prior)[0]&lt;br /&gt;    T=shape(ev)[1]-1&lt;br /&gt;    fMatrix=zeros((nstates,T))&lt;br /&gt;    pMatrix=zeros((nstates,T),int)&lt;br /&gt;&lt;br /&gt;    #our initial forward message equals the prior&lt;br /&gt;    #state distribution (time=0)&lt;br /&gt;    fMatrix[:,0]=prior.reshape(nstates,)&lt;br /&gt;&lt;br /&gt;    for t in range(1,T):&lt;br /&gt;        fTemp=zeros((nstates,nstates))&lt;br /&gt;        &lt;br /&gt;        for i in range(nstates):&lt;br /&gt;            for j in range(nstates):&lt;br /&gt;                #for every state i at time t,&lt;br /&gt;                #calculate the probability that 
each&lt;br /&gt;                #state j at t-1 would have lead to i at t,&lt;br /&gt;                #weighted by the belief that we were in&lt;br /&gt;                #state j at t-1, and weighted by the&lt;br /&gt;                #probability that state i would have&lt;br /&gt;                #produced the evidence we observed at t&lt;br /&gt;                fTemp[i,j]= \&lt;br /&gt;                    tProb[j,i]*fMatrix[j,t-1]*eProb[i,ev[0,t]]&lt;br /&gt;&lt;br /&gt;        #sweep fTemp from right to left, choosing the most likely&lt;br /&gt;        #of all states j at time t-1 to reach state i at t&lt;br /&gt;        fMatrix[:,t]=fTemp.max(1)&lt;br /&gt;        pMatrix[:,t]=fTemp.argmax(1) &lt;br /&gt;        fMatrix[:,t]=normEachRow(fMatrix[:,t])&lt;br /&gt;&lt;br /&gt;    return fMatrix ,pMatrix&lt;br /&gt;&lt;br /&gt;def baumwelch(ev,nstates):&lt;br /&gt;    counter=1&lt;br /&gt;    T=shape(ev)[1]-1&lt;br /&gt;    evclass=set(filter(lambda i: i is not None,ev.tolist()[0]))&lt;br /&gt;    nclass=len(evclass)&lt;br /&gt;    tProb=normEachRow(random.rand(nstates,nstates))&lt;br /&gt;    eProb=normEachRow(random.rand(nstates,nclass))&lt;br /&gt;    prior=normEachRow(ones((nstates,)))&lt;br /&gt;    gamma=zeros((nstates,nstates,T))&lt;br /&gt;    eps=zeros((nstates,T))&lt;br /&gt;    &lt;br /&gt;    tProb_old=tProb*0&lt;br /&gt;    eProb_old=eProb*0  &lt;br /&gt;    while (abs(tProb_old-tProb)).max()&gt;1e-3 and \&lt;br /&gt;          (abs(eProb_old-eProb)).max()&gt;1e-3:      &lt;br /&gt;        tProb_old=tProb+0&lt;br /&gt;        eProb_old=eProb+0&lt;br /&gt;&lt;br /&gt;        ###########&lt;br /&gt;        #E-step:&lt;br /&gt;        ###########&lt;br /&gt;        &lt;br /&gt;        f=fwd(tProb,eProb,prior,ev)&lt;br /&gt;        b=bwd(tProb,eProb,prior,ev)&lt;br /&gt;&lt;br /&gt;        #update gamma&lt;br /&gt;        #for all time t,&lt;br /&gt;        #compute the probability that we transition&lt;br /&gt;        #from state i at time t to state j at 
time t+1,&lt;br /&gt;        #weighted by the beliefs about where we are at&lt;br /&gt;        #each time and the probability that the evidence&lt;br /&gt;        #seen would have been emitted at time t+1, divided&lt;br /&gt;        #by a normalizer formed from the forward and&lt;br /&gt;        #backward messages at time t&lt;br /&gt;        for t in range(T-1):&lt;br /&gt;            for i in range(nstates):&lt;br /&gt;                for j in range(nstates):&lt;br /&gt;                    gamma[i,j,t]= \&lt;br /&gt;                        f[i,t]*tProb[i,j]* \&lt;br /&gt;                        b[j,t+1]*eProb[j,ev[0,t+1]]/ \&lt;br /&gt;                        sum(f[i,t]*b[j,t])&lt;br /&gt;&lt;br /&gt;        #update eps&lt;br /&gt;        #which is our smoothed state distribution&lt;br /&gt;        #for each time step&lt;br /&gt;        eps=f*b&lt;br /&gt;        eps=transpose(normEachRow(transpose(eps)))&lt;br /&gt;&lt;br /&gt;        ###########&lt;br /&gt;        #M-step:&lt;br /&gt;        ###########&lt;br /&gt;                    &lt;br /&gt;        #update tProb&lt;br /&gt;        for i in range(nstates):&lt;br /&gt;            for j in range(nstates):&lt;br /&gt;                #num is the weighted number of times that&lt;br /&gt;                #i has transitioned to j in our evidence sequence&lt;br /&gt;                num=sum([gamma[i,j,t] for t in range(T)])&lt;br /&gt;&lt;br /&gt;                #den is the weighted number of times that&lt;br /&gt;                #i has transitioned to any state (including&lt;br /&gt;                #itself) over all times t of our observation sequence&lt;br /&gt;                den=0&lt;br /&gt;                for k in range(nstates):&lt;br /&gt;                    den+=sum([gamma[i,k,t] for t in range(T)])&lt;br /&gt;                tProb[i,j]=num/den&lt;br /&gt;                                   &lt;br /&gt;        #update eProb&lt;br /&gt;        #weighted 
probabilistic belief about each state&lt;br /&gt;        #combined with the probability that each state&lt;br /&gt;        #will emit the given evidence&lt;br /&gt;        for j in range(nstates):&lt;br /&gt;            for k in range(nclass):&lt;br /&gt;                num=sum([eps[j,t]&lt;br /&gt;                         for t in range(T)&lt;br /&gt;                         if ev[0,t]==k])&lt;br /&gt;                den=sum([eps[j,t]&lt;br /&gt;                         for t in range(T)])&lt;br /&gt;                eProb[j,k]=num/den&lt;br /&gt;                    &lt;br /&gt;        #update prior&lt;br /&gt;        #sum each state here, which we will normalize later.&lt;br /&gt;        #how about running Viterbi here and then&lt;br /&gt;        #looking at the most likely path and getting the&lt;br /&gt;        #prior distribution from there?&lt;br /&gt;        for j in range(nstates):&lt;br /&gt;            prior[j,]=sum([eps[j,t] for t in range(T)])&lt;br /&gt;&lt;br /&gt;        tProb=normEachRow(tProb)&lt;br /&gt;        eProb=normEachRow(eProb)&lt;br /&gt;        prior=normEachRow(prior)&lt;br /&gt;        &lt;br /&gt;        print 'iter: %d'%counter&lt;br /&gt;        counter+=1&lt;br /&gt;        &lt;br /&gt;    return prior,eProb,tProb&lt;br /&gt;&lt;br /&gt;############################################################&lt;br /&gt;#helperFunctions&lt;br /&gt;############################################################&lt;br /&gt;&lt;br /&gt;def normEachRow(pDist):&lt;br /&gt;    dim=shape(pDist)&lt;br /&gt;    rows=dim[0]&lt;br /&gt;    if len(dim)&gt;1 and rows&gt;1:&lt;br /&gt;        for row in range(rows):&lt;br /&gt;            pDist[row,:]=pDist[row,:]/sum(pDist[row,:])&lt;br /&gt;    else:&lt;br /&gt;        pDist=pDist/sum(pDist)&lt;br /&gt;    return pDist&lt;br /&gt;&lt;br /&gt;def printMatrix(matrix,msg,start=0,dec=True):&lt;br /&gt;    if not dec:&lt;br /&gt;        format='%d\t'&lt;br /&gt;    else:&lt;br /&gt;        format='%1.3f\t'&lt;br 
/&gt;    print '\n'+msg&lt;br /&gt;    rows,cols=shape(matrix)&lt;br /&gt;    if cols&gt;10: cols=11&lt;br /&gt;    for col in range(start,cols): print 't=%d\t'%col,&lt;br /&gt;    print '\n',&lt;br /&gt;    for row in range(rows):&lt;br /&gt;        for col in range(start,cols):&lt;br /&gt;            print format%matrix[row,col],&lt;br /&gt;        print '\n',&lt;br /&gt;&lt;br /&gt;def accu(a,b):&lt;br /&gt;    assert shape(a)==shape(b)&lt;br /&gt;    total=float(shape(a)[1])&lt;br /&gt;    c=0&lt;br /&gt;    for i in range(shape(a)[1]):&lt;br /&gt;        if a[0,i]==b[0,i]:&lt;br /&gt;            c=c+1&lt;br /&gt;    return c/total&lt;br /&gt;&lt;br /&gt;def load(filename):&lt;br /&gt;    inf=open(filename,'r')&lt;br /&gt;    inf.readline()&lt;br /&gt;    data=[]&lt;br /&gt;    for i in inf.readlines():&lt;br /&gt;        x=i.split()&lt;br /&gt;        y=x[0].split(",")&lt;br /&gt;        data.append(y)&lt;br /&gt;    return data&lt;br /&gt;&lt;br /&gt;def backtrack(vMatrix,pMatrix):&lt;br /&gt;    rows,cols=shape(vMatrix)&lt;br /&gt;    path=zeros((1,cols),int)&lt;br /&gt;    &lt;br /&gt;    #the most likely state at the last time step&lt;br /&gt;    #is the one with the highest probability in&lt;br /&gt;    #the vMatrix at that time step:&lt;br /&gt;    lastPathCol=shape(path)[1]-1&lt;br /&gt;    lastvMatrixCol=shape(vMatrix)[1]-1&lt;br /&gt;    lastpMatrixCol=shape(pMatrix)[1]-1&lt;br /&gt;    path[0,lastPathCol]=vMatrix[:,lastvMatrixCol].argmax(0)&lt;br /&gt;&lt;br /&gt;    #once we have the most likely last state's index, we &lt;br /&gt;    #use it to look up in the backpointer matrix to&lt;br /&gt;    #trace our way back the most likely path:&lt;br /&gt;    while lastPathCol&gt;0:&lt;br /&gt;        path[0,lastPathCol-1]= \&lt;br /&gt;                pMatrix[path[0,lastPathCol],lastpMatrixCol]&lt;br /&gt;        lastPathCol-=1&lt;br /&gt;        lastpMatrixCol-=1&lt;br /&gt;                                        &lt;br /&gt;    return path&lt;br 
/&gt;&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;Driven with:&lt;br /&gt;&lt;br /&gt;&lt;pre name="code" class="python"&gt;&lt;br /&gt;#hmm_inter.py&lt;br /&gt;from hmm_imp import *&lt;br /&gt;&lt;br /&gt;############################################################&lt;br /&gt;#weatherModel&lt;br /&gt;############################################################&lt;br /&gt;&lt;br /&gt;#state map&lt;br /&gt;stateMap={'sunny':0,'rainy':1,'foggy':2}&lt;br /&gt;stateIndex={0:'sunny',1:'rainy',2:'foggy'}&lt;br /&gt;nStates=len(stateMap)&lt;br /&gt;&lt;br /&gt;#observation map&lt;br /&gt;obsMap={'no':0,'yes':1}&lt;br /&gt;obsIndex={0:'no',1:'yes'}&lt;br /&gt;&lt;br /&gt;#prior probability on weather states&lt;br /&gt;#P(sunny)=0.50&lt;br /&gt;#P(rainy)=0.25&lt;br /&gt;#P(foggy)=0.25&lt;br /&gt;pi=array([0.50,&lt;br /&gt;          0.25,&lt;br /&gt;          0.25]).reshape(nStates,1)&lt;br /&gt;&lt;br /&gt;#transition probabilities of weather states&lt;br /&gt;#                    tomorrow&lt;br /&gt;#    today     sunny  rainy  foggy&lt;br /&gt;#    sunny      0.8    0.05   0.15&lt;br /&gt;#    rainy      0.2    0.60   0.20 &lt;br /&gt;#    foggy      0.2    0.30   0.50&lt;br /&gt;A=array([[0.8, 0.05, 0.15],&lt;br /&gt;         [0.2, 0.60, 0.20],&lt;br /&gt;         [0.2, 0.30, 0.50]])&lt;br /&gt;&lt;br /&gt;#conditional probabilities of evidence given weather&lt;br /&gt;#            P(umbrella=no|weather)  P(umbrella=yes|weather) &lt;br /&gt;#    sunny          0.9                     0.1 &lt;br /&gt;#    rainy          0.2                     0.8 &lt;br /&gt;#    foggy          0.7                     0.3 &lt;br /&gt;B=array([[0.9, 0.1],&lt;br /&gt;         [0.2, 0.8],&lt;br /&gt;         [0.7, 0.3]])&lt;br /&gt;&lt;br /&gt;############################################################&lt;br /&gt;#session&lt;br /&gt;############################################################&lt;br /&gt;&lt;br /&gt;#data (evidence has the dummy 'None' entry for time=0)&lt;br 
/&gt;evidence=array([None,0,0,0,1,0,0,1,1,0,1,0])&lt;br /&gt;classPath=array([2,2,2,1,0,2,1,1,2,1,1])&lt;br /&gt;classPath=classPath.reshape(1,len(classPath))&lt;br /&gt;evidence=evidence.reshape(1,len(evidence))&lt;br /&gt;&lt;br /&gt;#compute forward messages&lt;br /&gt;fMatrix=fwd(A,B,pi,evidence)&lt;br /&gt;&lt;br /&gt;#normalize probabilities&lt;br /&gt;fMatrix=transpose(normEachRow(transpose(fMatrix)))&lt;br /&gt;&lt;br /&gt;#show state distribution as a result of filtering&lt;br /&gt;printMatrix(fMatrix,'fwd matrix')&lt;br /&gt;&lt;br /&gt;#compute backward messages&lt;br /&gt;bMatrix=bwd(A,B,pi,evidence)&lt;br /&gt;&lt;br /&gt;#show backward messages&lt;br /&gt;printMatrix(bMatrix,'bwd matrix')&lt;br /&gt;&lt;br /&gt;#compute smoothed estimate (hindsight)&lt;br /&gt;sMatrix=fMatrix*bMatrix&lt;br /&gt;sMatrix=transpose(normEachRow(transpose(sMatrix)))&lt;br /&gt;&lt;br /&gt;#show smooth matrix&lt;br /&gt;printMatrix(sMatrix,'smooth matrix',start=1)&lt;br /&gt;&lt;br /&gt;#by summing over the un-normalized state distribution&lt;br /&gt;#resulting from filtering, we obtain the likelihood&lt;br /&gt;#of the hmm having generated the given observation sequence&lt;br /&gt;fMatrix=likeli(A,B,pi,evidence)&lt;br /&gt;lkh=sum(fMatrix[:,shape(fMatrix)[1]-1])&lt;br /&gt;print '\nLikelihood of obs. seq. 
given original hmm: %f '%lkh&lt;br /&gt;&lt;br /&gt;def doViterbi(tProb,eProb,prior, \&lt;br /&gt;              evidence_seq,state_seq,printPath=False):&lt;br /&gt;    print '\nViterbi...'&lt;br /&gt;    &lt;br /&gt;    #compute viterbi matrix and backpointer matrix&lt;br /&gt;    vMatrix,pMatrix=viterbi(tProb,eProb,prior,evidence_seq)&lt;br /&gt;&lt;br /&gt;    #show viterbi matrix and backpointer matrix&lt;br /&gt;    printMatrix(vMatrix,'viterbi matrix',start=1)&lt;br /&gt;    printMatrix(pMatrix,'backpointer matrix',start=1,dec=False)&lt;br /&gt;&lt;br /&gt;    #backtrack backpointer matrix to find most likely&lt;br /&gt;    #sequence of states&lt;br /&gt;    estPath=backtrack(vMatrix,pMatrix)&lt;br /&gt;    assert shape(state_seq)==shape(estPath)&lt;br /&gt;&lt;br /&gt;    if printPath:&lt;br /&gt;        print '\nEstimated class path:\n',estPath&lt;br /&gt;        print '\nActual class path:\n',state_seq&lt;br /&gt;    &lt;br /&gt;    print '\nViterbi decoder accuracy: %1.2f'% \&lt;br /&gt;          accu(state_seq,estPath)&lt;br /&gt;&lt;br /&gt;doViterbi(A,B,pi,evidence,classPath,printPath=True)&lt;br /&gt;&lt;br /&gt;#load more data from disk&lt;br /&gt;data=load('weather-test1-1000.txt')&lt;br /&gt;&lt;br /&gt;#adding 'None' as per note at beginning of session&lt;br /&gt;observations=[None]&lt;br /&gt;states=[]&lt;br /&gt;for c,o in data:&lt;br /&gt;  observations.append(obsMap[o])&lt;br /&gt;  states.append(stateMap[c])&lt;br /&gt;&lt;br /&gt;observations=array(observations)&lt;br /&gt;observations=observations.reshape(1,len(observations))&lt;br /&gt;states=array(states)&lt;br /&gt;states=states.reshape(1,len(states))&lt;br /&gt;&lt;br /&gt;doViterbi(A,B,pi,observations,states)&lt;br /&gt;&lt;br /&gt;print '\nBaum-Welch...'&lt;br /&gt;&lt;br /&gt;random.seed(135)&lt;br /&gt;prior,eProb,tProb=baumwelch(observations,3)&lt;br /&gt;&lt;br /&gt;print '\nprior:\n',prior&lt;br /&gt;print '\ntProb:\n',tProb&lt;br /&gt;print '\neProb:\n',eProb&lt;br /&gt;&lt;br 
/&gt;doViterbi(tProb,eProb,prior,observations,states)&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;With output:&lt;br /&gt;&lt;br /&gt;&lt;pre name="code" class="python"&gt;&lt;br /&gt;fwd matrix&lt;br /&gt;t=0 t=1 t=2 t=3 t=4 t=5 t=6 t=7 t=8 t=9 t=10 &lt;br /&gt;0.500 0.667 0.728 0.752 0.282 0.558 0.679 0.241 0.082 0.447 0.142 &lt;br /&gt;0.250 0.074 0.042 0.034 0.424 0.120 0.055 0.466 0.722 0.198 0.600 &lt;br /&gt;0.250 0.259 0.231 0.214 0.294 0.322 0.265 0.293 0.197 0.356 0.258 &lt;br /&gt;&lt;br /&gt;bwd matrix&lt;br /&gt;t=0 t=1 t=2 t=3 t=4 t=5 t=6 t=7 t=8 t=9 t=10 &lt;br /&gt;1.000 0.448 0.346 0.180 0.404 0.272 0.101 0.152 0.317 0.145 1.000 &lt;br /&gt;1.000 0.228 0.276 0.450 0.241 0.312 0.533 0.473 0.292 0.493 1.000 &lt;br /&gt;1.000 0.324 0.378 0.370 0.355 0.416 0.365 0.376 0.391 0.361 1.000 &lt;br /&gt;&lt;br /&gt;smooth matrix&lt;br /&gt;t=1 t=2 t=3 t=4 t=5 t=6 t=7 t=8 t=9 t=10 &lt;br /&gt;0.747 0.718 0.589 0.356 0.470 0.352 0.100 0.083 0.223 0.142 &lt;br /&gt;0.042 0.033 0.067 0.318 0.115 0.151 0.601 0.672 0.335 0.600 &lt;br /&gt;0.210 0.249 0.344 0.326 0.415 0.496 0.300 0.245 0.441 0.258 &lt;br /&gt;&lt;br /&gt;Likelihood of obs. seq. 
given original hmm: 0.000654 &lt;br /&gt;&lt;br /&gt;Viterbi...&lt;br /&gt;&lt;br /&gt;viterbi matrix&lt;br /&gt;t=1 t=2 t=3 t=4 t=5 t=6 t=7 t=8 t=9 t=10 &lt;br /&gt;0.754 0.858 0.862 0.485 0.737 0.856 0.485 0.198 0.480 0.196 &lt;br /&gt;0.063 0.017 0.012 0.242 0.061 0.019 0.242 0.594 0.240 0.589 &lt;br /&gt;0.183 0.125 0.126 0.273 0.202 0.125 0.273 0.209 0.280 0.215 &lt;br /&gt;&lt;br /&gt;backpointer matrix&lt;br /&gt;t=1 t=2 t=3 t=4 t=5 t=6 t=7 t=8 t=9 t=10 &lt;br /&gt;0 0 0 0 0 0 0 0 0 0 &lt;br /&gt;1 2 0 0 1 2 0 1 1 1 &lt;br /&gt;2 0 0 0 2 0 0 2 1 2 &lt;br /&gt;&lt;br /&gt;Estimated class path:&lt;br /&gt;[[0 0 0 0 0 0 0 1 1 1 1]]&lt;br /&gt;&lt;br /&gt;Actual class path:&lt;br /&gt;[[2 2 2 1 0 2 1 1 2 1 1]]&lt;br /&gt;&lt;br /&gt;Viterbi decoder accuracy: 0.36&lt;br /&gt;&lt;br /&gt;Viterbi...&lt;br /&gt;&lt;br /&gt;viterbi matrix&lt;br /&gt;t=1 t=2 t=3 t=4 t=5 t=6 t=7 t=8 t=9 t=10 &lt;br /&gt;0.754 0.858 0.862 0.485 0.737 0.856 0.485 0.198 0.480 0.196 &lt;br /&gt;0.063 0.017 0.012 0.242 0.061 0.019 0.242 0.594 0.240 0.589 &lt;br /&gt;0.183 0.125 0.126 0.273 0.202 0.125 0.273 0.209 0.280 0.215 &lt;br /&gt;&lt;br /&gt;backpointer matrix&lt;br /&gt;t=1 t=2 t=3 t=4 t=5 t=6 t=7 t=8 t=9 t=10 &lt;br /&gt;0 0 0 0 0 0 0 0 0 0 &lt;br /&gt;1 2 0 0 1 2 0 1 1 1 &lt;br /&gt;2 0 0 0 2 0 0 2 1 2 &lt;br /&gt;&lt;br /&gt;Viterbi decoder accuracy: 0.61&lt;br /&gt;&lt;br /&gt;Baum-Welch...&lt;br /&gt;iter: 1&lt;br /&gt;iter: 2&lt;br /&gt;[...]&lt;br /&gt;iter: 26&lt;br /&gt;iter: 27&lt;br /&gt;&lt;br /&gt;prior:&lt;br /&gt;[  7.33520128e-01   2.75132722e-05   2.66452359e-01]&lt;br /&gt;&lt;br /&gt;tProb:&lt;br /&gt;[[  6.88340698e-01   9.00867784e-06   3.11650293e-01]&lt;br /&gt; [  7.81393712e-01   1.87048374e-05   2.18587583e-01]&lt;br /&gt; [  8.65993461e-01   4.23514682e-05   1.33964187e-01]]&lt;br /&gt;&lt;br /&gt;eProb:&lt;br /&gt;[[ 0.68360127  0.31639873]&lt;br /&gt; [ 0.13036119  0.86963881]&lt;br /&gt; [ 0.66894045  0.33105955]]&lt;br /&gt;&lt;br /&gt;Viterbi...&lt;br 
/&gt;&lt;br /&gt;viterbi matrix&lt;br /&gt;t=1 t=2 t=3 t=4 t=5 t=6 t=7 t=8 t=9 t=10 &lt;br /&gt;0.693 0.693 0.693 0.679 0.693 0.693 0.679 0.679 0.693 0.679 &lt;br /&gt;0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 &lt;br /&gt;0.307 0.307 0.307 0.321 0.307 0.307 0.321 0.321 0.307 0.321 &lt;br /&gt;&lt;br /&gt;backpointer matrix&lt;br /&gt;t=1 t=2 t=3 t=4 t=5 t=6 t=7 t=8 t=9 t=10 &lt;br /&gt;0 0 0 0 0 0 0 0 0 0 &lt;br /&gt;2 2 2 2 2 2 2 2 2 2 &lt;br /&gt;0 0 0 0 0 0 0 0 0 0 &lt;br /&gt;&lt;br /&gt;Viterbi decoder accuracy: 0.49&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;It helps to look at a good diagram of the trellis that results as we unroll the HMM through time; an example of such a diagram can be found in &lt;a href="http://rii.ricoh.com/~stork/DHS.html"&gt;DHS&lt;/a&gt;, towards the end of chapter 3.  Another good resource for getting a grasp of the HMM algorithms is an interactive spreadsheet that can be found &lt;a href="http://www.cs.jhu.edu/~jason/papers/#tnlp02"&gt;here&lt;/a&gt;.  &lt;br /&gt;&lt;br /&gt;You can inspect the estimated transition and emission probability matrices and try to work out how the entries correspond to our original state and evidence vocabulary (note that the second estimated state ends up with almost no probability mass).  In practice, systems that do things like recognize speech rely on a lot of human guidance when estimating hidden markov models.  Running the viterbi algorithm on the observation sequence with the estimated model returns an accuracy of 0.49, which sounds better than it is: one of the states (the sunny state) is much more prevalent in our dataset (a prior of 0.50), so always guessing that state would, given enough data, yield a score close to that state's prior.&lt;br /&gt;&lt;br /&gt;Completely unsupervised learning is hard; even humans will arrive at different answers in a manual clustering exercise.  
Reinforcement learning (what we'll be focusing on in the next few posts) allows us to use at least some sort of feedback, although not as much as supervised learning, and so should yield better results than learning without any kind of supervision (provided, of course, that a proper reward system is engineered).&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/6475416749449483556-5707398500751239568?l=mechanistician.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://mechanistician.blogspot.com/feeds/5707398500751239568/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://mechanistician.blogspot.com/2009/05/hidden-markov-model-iii.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/6475416749449483556/posts/default/5707398500751239568'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/6475416749449483556/posts/default/5707398500751239568'/><link rel='alternate' type='text/html' href='http://mechanistician.blogspot.com/2009/05/hidden-markov-model-iii.html' title='Hidden Markov Model III'/><author><name>Mechanistician</name><uri>http://www.blogger.com/profile/04405962471175755507</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-6475416749449483556.post-1029472587790976111</id><published>2009-05-15T08:51:00.000-07:00</published><updated>2009-05-18T12:31:08.803-07:00</updated><title type='text'>Hidden Markov Model II</title><content type='html'>Today we look at some of the things we can use an HMM for:&lt;br 
/&gt;&lt;ul&gt;&lt;li&gt;filtering/monitoring&lt;/li&gt;&lt;li&gt;predicting&lt;/li&gt;&lt;li&gt;smoothing&lt;/li&gt;&lt;li&gt;computing likelihood of observation sequence&lt;/li&gt;&lt;li&gt;decoding observation sequence into most likely state sequence&lt;/li&gt;&lt;/ul&gt;Some good resources for HMM inference algorithms include:&lt;br /&gt;&lt;ul&gt;&lt;li&gt;this &lt;a href="http://see.stanford.edu/materials/aimlcs229/cs229-hmm.pdf"&gt;paper on hidden markov models&lt;/a&gt; from the &lt;a href="http://see.stanford.edu/see/materials/aimlcs229/handouts.aspx"&gt;handouts section in cs229&lt;/a&gt;.&lt;/li&gt;&lt;li&gt;chapter 15 from &lt;a href="http://aima.cs.berkeley.edu/"&gt;AIMA&lt;/a&gt;.  This is the best introductory resource I know of that discusses the inference scenarios listed above, both from a general perspective and from the perspective of specific implementations (hidden markov models, kalman filters, and dynamic bayesian networks).&lt;/li&gt;&lt;li&gt;&lt;a href="http://inst.eecs.berkeley.edu/%7Ecs188/sp08/announcements.html"&gt;Berkeley's undergraduate AI class website&lt;/a&gt;, which includes an HMM project whose domain model was derived from these &lt;a href="http://inst.eecs.berkeley.edu/%7Ecs188/sp08/slides/tr-98-041-1.pdf"&gt;notes&lt;/a&gt;.  This is the domain model (a very simple weather model) that was used in our implementations below.&lt;/li&gt;&lt;/ul&gt;The Berkeley AI class (cs188) website also includes implementations of these inference algorithms, but in my implementations I sought a simpler presentation that makes it easier to follow chapter 15 of AIMA (too many loops in the solutions posted for cs188 obscure what's going on).  
I did reuse their helper functions and data, which can be downloaded from the &lt;a href="http://inst.eecs.berkeley.edu/~cs188/sp08/projects.html"&gt;projects section of the website&lt;/a&gt;.&lt;br /&gt;&lt;br /&gt;&lt;pre name="code" class="python"&gt;&lt;br /&gt;#hmm.py&lt;br /&gt;&lt;br /&gt;######################################################################&lt;br /&gt;# Weather Model&lt;br /&gt;######################################################################&lt;br /&gt;&lt;br /&gt;# state map&lt;br /&gt;weatherStateMap   = {'sunny' : 0, 'rainy' : 1, 'foggy' : 2}&lt;br /&gt;weatherStateIndex = {0 : 'sunny', 1 : 'rainy', 2 : 'foggy'}&lt;br /&gt;&lt;br /&gt;# observation map&lt;br /&gt;weatherObsMap   = {'no' : 0, 'yes' : 1}&lt;br /&gt;weatherObsIndex = {0 : 'no', 1 : 'yes'}&lt;br /&gt;&lt;br /&gt;# prior probability on weather states&lt;br /&gt;# P(sunny) = 0.5  P(rainy) = 0.25  P(foggy) = 0.25&lt;br /&gt;weatherProb = [0.5, 0.25, 0.25]&lt;br /&gt;&lt;br /&gt;# transition probabilities of weather states&lt;br /&gt;#                    tomorrow&lt;br /&gt;#    today     sunny  rainy  foggy&lt;br /&gt;#    sunny      0.8    0.05   0.15&lt;br /&gt;#    rainy      0.2    0.6    0.2 &lt;br /&gt;#    foggy      0.2    0.3    0.5&lt;br /&gt;weatherTProb = [ [0.8, 0.05, 0.15], [0.2, 0.6, 0.2], [0.2, 0.3, 0.5] ]&lt;br /&gt;&lt;br /&gt;# conditional probabilities of evidence (observations) given weather&lt;br /&gt;#                          sunny  rainy  foggy &lt;br /&gt;# P(umbrella=no|weather)    0.9    0.2    0.7&lt;br /&gt;# P(umbrella=yes|weather)   0.1    0.8    0.3&lt;br /&gt;weatherEProb = [ [0.9, 0.2, 0.7], [0.1, 0.8, 0.3] ]&lt;br /&gt;&lt;br /&gt;######################################################################&lt;br /&gt;# Helper Functions&lt;br /&gt;######################################################################&lt;br /&gt;&lt;br /&gt;# Using the prior probabilities and state map, return:&lt;br /&gt;#     P(state)&lt;br /&gt;def 
getStatePriorProb(prob, stateMap, state):&lt;br /&gt;   return prob[stateMap[state]]&lt;br /&gt;&lt;br /&gt;# Using the transition probabilities and state map, return:&lt;br /&gt;#     P(next state | current state)&lt;br /&gt;def getNextStateProb(tprob, stateMap, current, next):&lt;br /&gt;   return tprob[stateMap[current]][stateMap[next]]&lt;br /&gt;&lt;br /&gt;# Using the observation probabilities, state map, and observation map,&lt;br /&gt;# return:&lt;br /&gt;#     P(observation | state)&lt;br /&gt;def getObservationProb(eprob, stateMap, obsMap, state, obs):&lt;br /&gt;   return eprob[obsMap[obs]][stateMap[state]]&lt;br /&gt;&lt;br /&gt;# Normalize a probability distribution&lt;br /&gt;def normalize(pdist):&lt;br /&gt;   s = sum(pdist)&lt;br /&gt;   for i in range(0,len(pdist)):&lt;br /&gt;      pdist[i] = pdist[i] / s&lt;br /&gt;   return pdist&lt;br /&gt;&lt;br /&gt;def loadData(filename):&lt;br /&gt;   input = open(filename, 'r')&lt;br /&gt;   input.readline()&lt;br /&gt;   data = []&lt;br /&gt;   for i in input.readlines():&lt;br /&gt;      x = i.split()&lt;br /&gt;      y = x[0].split(",")&lt;br /&gt;      data.append(y)&lt;br /&gt;   return data&lt;br /&gt;&lt;br /&gt;def accuracy(a,b):&lt;br /&gt;   total = float(max(len(a),len(b)))&lt;br /&gt;   c = 0&lt;br /&gt;   for i in range(min(len(a),len(b))):&lt;br /&gt;      if a[i] == b[i]:&lt;br /&gt;         c = c + 1          &lt;br /&gt;   return c/total&lt;br /&gt;&lt;br /&gt;######################################################################&lt;br /&gt;# HMM inference algorithms &lt;br /&gt;######################################################################&lt;br /&gt;&lt;br /&gt;# Filtering/monitoring.&lt;br /&gt;def filterfwd(stateMap,stateIndex,obsMap,obsIndex,pdist,tprob,eprob,obs):&lt;br /&gt;   pdist_new=[0.0]*len(stateMap)&lt;br /&gt;   for i in range(len(pdist_new)):&lt;br /&gt;&lt;br /&gt;      #predict state at 't' based on state belief at 't-1'&lt;br /&gt;      #p(x_t|e_1:t-1) = 
sum(p(x_t|x_t-1) * p(x_t-1|e_1:t-1)) &lt;br /&gt;      pdist_new[i]=sum([tprob[j][i]*pdist[j]&lt;br /&gt;                        for j in range(len(stateMap))])&lt;br /&gt;      &lt;br /&gt;      #update prediction with current observation/evidence&lt;br /&gt;      #p(x_t|e_1:t) ~ p(e_t|x_t) * p(x_t|e_1:t-1) &lt;br /&gt;      pdist_new[i]=eprob[obsMap[obs]][i]*pdist_new[i]&lt;br /&gt;      &lt;br /&gt;   return normalize(pdist_new)&lt;br /&gt;&lt;br /&gt;# Prediction.&lt;br /&gt;def predictfwd(stateMap,stateIndex,obsMap,obsIndex,pdist,tprob,eprob):&lt;br /&gt;   pdist_new=[0.0]*len(stateMap)&lt;br /&gt;   for i in range(len(pdist_new)):&lt;br /&gt;&lt;br /&gt;      #predict state at 't+1' based on state belief at 't'&lt;br /&gt;      #p(x_t+1|e_1:t) ~ sum(p(x_t+1|x_t) * p(x_t|e_1:t)) &lt;br /&gt;      pdist_new[i]=sum([tprob[j][i]*pdist[j]&lt;br /&gt;                        for j in range(len(stateMap))])&lt;br /&gt;&lt;br /&gt;   return normalize(pdist_new)&lt;br /&gt;&lt;br /&gt;# Likelihood.&lt;br /&gt;def likelifwd(stateMap,stateIndex,obsMap,obsIndex,pdist,tprob,eprob,obs):&lt;br /&gt;   pdist_new=[0.0]*len(stateMap)&lt;br /&gt;   for i in range(len(pdist_new)):&lt;br /&gt;&lt;br /&gt;      #predict state at 't' based on state belief at 't-1'&lt;br /&gt;      #p(x_t|e_1:t-1) = sum(p(x_t|x_t-1) * p(x_t-1|e_1:t-1)) &lt;br /&gt;      pdist_new[i]=sum([tprob[j][i]*pdist[j]&lt;br /&gt;                        for j in range(len(stateMap))])&lt;br /&gt;      &lt;br /&gt;      #update prediction with current observation/evidence&lt;br /&gt;      #p(x_t|e_1:t) ~ p(e_t|x_t) * p(x_t|e_1:t-1) &lt;br /&gt;      pdist_new[i]=eprob[obsMap[obs]][i]*pdist_new[i]&lt;br /&gt;      &lt;br /&gt;   return pdist_new&lt;br /&gt;&lt;br /&gt;# Smoothing/hindsight.&lt;br /&gt;def smoothback(stateMap,stateIndex,obsMap,obsIndex,pdist,tprob,eprob,f,obs):&lt;br /&gt;   pdist_new=[0.0]*len(stateMap)&lt;br /&gt;   b=[0.0]*len(stateMap)&lt;br /&gt;   for i in range(len(pdist_new)):&lt;br 
/&gt;&lt;br /&gt;      #smooth state at time 'k' (1&lt;=k&amp;lt;t) based on state belief at 't'    &lt;br /&gt;      #p(e_k+1:t|x_k) = sum(p(e_k+1|x_k+1)*p(e_k+2:t|x_k+1)*p(x_k+1|x_k))&lt;br /&gt;      b[i]=sum([eprob[obsMap[obs]][j] * pdist[j] * tprob[i][j]&lt;br /&gt;                        for j in range(len(stateMap))])&lt;br /&gt;      &lt;br /&gt;      #p(x_k|e_1:t) ~ p(x_k|e_1:k) * p(e_k+1:t|x_k)  &lt;br /&gt;      pdist_new[i]=f[i]*b[i]&lt;br /&gt;&lt;br /&gt;   return normalize(pdist_new),b&lt;br /&gt;&lt;br /&gt;# Decoding/Viterbi algorithm.&lt;br /&gt;def viterbi(stateMap,stateIndex,obsMap,obsIndex,pdist,tprob,eprob,obs):&lt;br /&gt;   pdist_new=[0.0]*len(stateMap)&lt;br /&gt;   state_idx=[0]*len(stateMap)&lt;br /&gt;   for i in range(len(pdist_new)):&lt;br /&gt;&lt;br /&gt;      # p(x_t|x_t-1) * p(x_1,...,x_t-1|e_1:t-1)&lt;br /&gt;      states=[tprob[j][i]*pdist[j] for j in range(len(stateMap))]&lt;br /&gt;&lt;br /&gt;      # max(p(x_t|x_t-1) * p(x_1,...,x_t-1|e_1:t-1)) &lt;br /&gt;      pdist_new[i]=max(states)&lt;br /&gt;&lt;br /&gt;      # here, we keep track of:&lt;br /&gt;      # for each possible state in the present (i),&lt;br /&gt;      # what is the most likely state in the past (j)&lt;br /&gt;      # to have led to this state (i) in the present&lt;br /&gt;      state_idx[i]=states.index(pdist_new[i])&lt;br /&gt;      &lt;br /&gt;      # p(e_t|x_t) * max_val(p(x_t|x_t-1) * p(x_1,...,x_t-1|e_1:t-1))&lt;br /&gt;      pdist_new[i]=eprob[obsMap[obs]][i]*pdist_new[i]&lt;br /&gt;      &lt;br /&gt;   return normalize(pdist_new),state_idx&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;Driven with:&lt;br /&gt;&lt;br /&gt;&lt;pre name="code" class="python"&gt;&lt;br /&gt;#hmm_infer.py&lt;br /&gt;from hmm import *&lt;br /&gt;&lt;br /&gt;#load sequence of observations from disk&lt;br /&gt;data=loadData('weather-test1-1000.txt')&lt;br /&gt;observations = []&lt;br /&gt;states=[]&lt;br /&gt;for c,o in data:&lt;br /&gt;  observations.append(o)&lt;br /&gt;
  states.append(c)&lt;br /&gt;&lt;br /&gt;#initialize state distribution to our &lt;br /&gt;#prior state distribution&lt;br /&gt;stateDist=weatherProb&lt;br /&gt;&lt;br /&gt;print '\nPrior distribution over states (t=0):'&lt;br /&gt;for i in range(len(stateDist)):&lt;br /&gt;    print '   ',&lt;br /&gt;    print weatherStateIndex[i],&lt;br /&gt;    print '%1.3f' % stateDist[i],&lt;br /&gt;&lt;br /&gt;obsn=10&lt;br /&gt;obs_seq=observations[:obsn]&lt;br /&gt;state_seq=states[:obsn]&lt;br /&gt;&lt;br /&gt;print '\n\nUmbrella observation sequence:'&lt;br /&gt;print '   ', obs_seq&lt;br /&gt;&lt;br /&gt;####################################################&lt;br /&gt;# Filtering/Monitoring:&lt;br /&gt;#Compute the updated belief state distribution&lt;br /&gt;#according to the model and the current observation&lt;br /&gt;####################################################&lt;br /&gt;print '\nFiltering/Monitoring...'&lt;br /&gt;for i in range(obsn):&lt;br /&gt;    stateDist=filterfwd(weatherStateMap, \&lt;br /&gt;                        weatherStateIndex, \&lt;br /&gt;                        weatherObsMap, \&lt;br /&gt;                        weatherObsIndex, \&lt;br /&gt;                        stateDist, \&lt;br /&gt;                        weatherTProb, \&lt;br /&gt;                        weatherEProb, \&lt;br /&gt;                        obs_seq[i])&lt;br /&gt;    &lt;br /&gt;    print '\nDistribution over state at t=%d:'%(i+1)&lt;br /&gt;    #use 'j' here so the outer loop variable 'i' is not shadowed&lt;br /&gt;    for j in range(len(stateDist)):&lt;br /&gt;        print '   ',&lt;br /&gt;        print weatherStateIndex[j],&lt;br /&gt;        print '%1.3f' % stateDist[j],&lt;br /&gt;&lt;br /&gt;####################################################&lt;br /&gt;# Prediction:&lt;br /&gt;#Forecast state distribution 'n' time steps into&lt;br /&gt;#the future&lt;br /&gt;####################################################&lt;br /&gt;print '\n\nPrediction...'&lt;br /&gt;for i in range(15):&lt;br /&gt;    stateDist=predictfwd(weatherStateMap, 
\&lt;br /&gt;                         weatherStateIndex, \&lt;br /&gt;                         weatherObsMap, \&lt;br /&gt;                         weatherObsIndex, \&lt;br /&gt;                         stateDist, \&lt;br /&gt;                         weatherTProb, \&lt;br /&gt;                         weatherEProb)&lt;br /&gt;        &lt;br /&gt;    print '\nDistribution over state at t+%d:'%(i+1)&lt;br /&gt;    for j in range(len(stateDist)):&lt;br /&gt;        print '   ',&lt;br /&gt;        print weatherStateIndex[j],&lt;br /&gt;        print '%1.3f' % stateDist[j],&lt;br /&gt;&lt;br /&gt;####################################################&lt;br /&gt;# Likelihood:&lt;br /&gt;#Compute the likelihood of our given HMM having&lt;br /&gt;#generated the observed sequence&lt;br /&gt;####################################################&lt;br /&gt;print '\n\nLikelihood...'&lt;br /&gt;&lt;br /&gt;#reset our state distribution to our &lt;br /&gt;#prior state distribution&lt;br /&gt;stateDist=weatherProb&lt;br /&gt;&lt;br /&gt;for obs in obs_seq:&lt;br /&gt;    stateDist=likelifwd(weatherStateMap, \&lt;br /&gt;                        weatherStateIndex, \&lt;br /&gt;                        weatherObsMap, \&lt;br /&gt;                        weatherObsIndex, \&lt;br /&gt;                        stateDist, \&lt;br /&gt;                        weatherTProb, \&lt;br /&gt;                        weatherEProb, \&lt;br /&gt;                        obs)&lt;br /&gt;    &lt;br /&gt;print '\nLikelihood of observations given our HMM: ',&lt;br /&gt;print '%1.6f' % sum(stateDist)&lt;br /&gt;&lt;br /&gt;####################################################&lt;br /&gt;# Smoothing/Hindsight:&lt;br /&gt;#Compute distribution over past states&lt;br /&gt;#given observations up to present&lt;br /&gt;####################################################&lt;br /&gt;print '\nSmoothing/hindsight...'&lt;br /&gt;#reset our state distribution to our &lt;br /&gt;#prior state distribution&lt;br 
/&gt;stateDist=weatherProb&lt;br /&gt;&lt;br /&gt;fs=[[]]*obsn&lt;br /&gt;for i in range(obsn):&lt;br /&gt;    stateDist=filterfwd(weatherStateMap, \&lt;br /&gt;                        weatherStateIndex, \&lt;br /&gt;                        weatherObsMap, \&lt;br /&gt;                        weatherObsIndex, \&lt;br /&gt;                        stateDist, \&lt;br /&gt;                        weatherTProb, \&lt;br /&gt;                        weatherEProb, \&lt;br /&gt;                        obs_seq[i])&lt;br /&gt;    &lt;br /&gt;    #record filtered state distribution beliefs&lt;br /&gt;    fs[i]=stateDist&lt;br /&gt;    &lt;br /&gt;    print '\nDistribution over state at t=%d:'%(i+1)&lt;br /&gt;    for j in range(len(stateDist)):&lt;br /&gt;        print '   ',&lt;br /&gt;        print weatherStateIndex[j],&lt;br /&gt;        print '%1.3f' % stateDist[j],&lt;br /&gt;&lt;br /&gt;print '\n\nt=%d\n'%obsn&lt;br /&gt;print 'Umbrella observation sequence:'&lt;br /&gt;print '   ', obs_seq&lt;br /&gt;&lt;br /&gt;b=[1,1,1]&lt;br /&gt;for i in range(1,obsn):&lt;br /&gt;    stateDist,b=smoothback(weatherStateMap, \&lt;br /&gt;                           weatherStateIndex, \&lt;br /&gt;                           weatherObsMap, \&lt;br /&gt;                           weatherObsIndex, \&lt;br /&gt;                           b, \&lt;br /&gt;                           weatherTProb, \&lt;br /&gt;                           weatherEProb, \&lt;br /&gt;                           fs[-(i+1)], \&lt;br /&gt;                           obs_seq[-i])&lt;br /&gt;        &lt;br /&gt;    print '\nDistribution over state at t-%d:'%(i)&lt;br /&gt;    for j in range(len(stateDist)):&lt;br /&gt;        print '   ',&lt;br /&gt;        print weatherStateIndex[j],&lt;br /&gt;        print '%1.3f' % stateDist[j],&lt;br /&gt;&lt;br /&gt;####################################################&lt;br /&gt;# Decoding:&lt;br /&gt;#Compute the most likely sequence of states given&lt;br /&gt;#the observed sequence 
of observations&lt;br /&gt;####################################################&lt;br /&gt;print '\n\nDecoding...'&lt;br /&gt;#reset our state distribution to our &lt;br /&gt;#prior state distribution&lt;br /&gt;stateDist=weatherProb&lt;br /&gt;&lt;br /&gt;decoded_seq=[0]*obsn&lt;br /&gt;state_index=[[]]*obsn&lt;br /&gt;for i in range(obsn):&lt;br /&gt;    stateDist,state=viterbi(weatherStateMap, \&lt;br /&gt;                            weatherStateIndex, \&lt;br /&gt;                            weatherObsMap, \&lt;br /&gt;                            weatherObsIndex, \&lt;br /&gt;                            stateDist, \&lt;br /&gt;                            weatherTProb, \&lt;br /&gt;                            weatherEProb, \&lt;br /&gt;                            obs_seq[i])&lt;br /&gt;    &lt;br /&gt;    state_index[i]=state&lt;br /&gt;&lt;br /&gt;#backtrack to find the most likely state sequence&lt;br /&gt;decoded_seq[obsn-1]=stateDist.index(max(stateDist))&lt;br /&gt;for i in range(obsn-1,0,-1):&lt;br /&gt;  decoded_seq[i-1]=state_index[i][decoded_seq[i]]&lt;br /&gt;&lt;br /&gt;#turn state indices into state names&lt;br /&gt;decoded_seq=[weatherStateIndex[idx]&lt;br /&gt;             for idx in decoded_seq]&lt;br /&gt;  &lt;br /&gt;print '\nViterbi - predicted state sequence:\n   ',&lt;br /&gt;print decoded_seq&lt;br /&gt;print 'Viterbi - actual state sequence:\n   ',&lt;br /&gt;print state_seq &lt;br /&gt;print '\nThe accuracy of the viterbi decoder is', &lt;br /&gt;print accuracy(state_seq, decoded_seq)&lt;br /&gt;&lt;br /&gt;print '\nDecoding full sequence of observations...'&lt;br /&gt;#reset our state distribution to our &lt;br /&gt;#prior state distribution&lt;br /&gt;stateDist=weatherProb&lt;br /&gt;obsn=len(observations)&lt;br /&gt;decoded_seq=[0]*obsn&lt;br /&gt;state_index=[[]]*obsn&lt;br /&gt;&lt;br /&gt;for i in range(obsn):&lt;br /&gt;    stateDist,state=viterbi(weatherStateMap, \&lt;br /&gt;                            weatherStateIndex, \&lt;br 
/&gt;                            weatherObsMap, \&lt;br /&gt;                            weatherObsIndex, \&lt;br /&gt;                            stateDist, \&lt;br /&gt;                            weatherTProb, \&lt;br /&gt;                            weatherEProb, \&lt;br /&gt;                            observations[i])&lt;br /&gt;    &lt;br /&gt;    state_index[i]=state&lt;br /&gt;&lt;br /&gt;#backtrack to find the most likely state sequence&lt;br /&gt;decoded_seq[obsn-1]=stateDist.index(max(stateDist))&lt;br /&gt;for i in range(obsn-1,0,-1):&lt;br /&gt;  decoded_seq[i-1]=state_index[i][decoded_seq[i]]&lt;br /&gt;&lt;br /&gt;#turn state indices into state names&lt;br /&gt;decoded_seq=[weatherStateIndex[idx]&lt;br /&gt;             for idx in decoded_seq]&lt;br /&gt;   &lt;br /&gt;print '\nThe accuracy of the viterbi decoder is', &lt;br /&gt;print accuracy(states, decoded_seq)&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;With output:&lt;br /&gt;&lt;br /&gt;&lt;pre name="code" class="python"&gt;&lt;br /&gt;Prior distribution over states (t=0):&lt;br /&gt;    sunny 0.500     rainy 0.250     foggy 0.250 &lt;br /&gt;&lt;br /&gt;Umbrella observation sequence:&lt;br /&gt;    ['no', 'no', 'no', 'yes', 'no', 'no', 'yes', 'yes', 'no', 'yes']&lt;br /&gt;&lt;br /&gt;Filtering/Monitoring...&lt;br /&gt;&lt;br /&gt;Distribution over state at t=1:&lt;br /&gt;    sunny 0.667     rainy 0.074     foggy 0.259 &lt;br /&gt;Distribution over state at t=2:&lt;br /&gt;    sunny 0.728     rainy 0.042     foggy 0.231 &lt;br /&gt;Distribution over state at t=3:&lt;br /&gt;    sunny 0.752     rainy 0.034     foggy 0.214 &lt;br /&gt;Distribution over state at t=4:&lt;br /&gt;    sunny 0.282     rainy 0.424     foggy 0.294 &lt;br /&gt;Distribution over state at t=5:&lt;br /&gt;    sunny 0.558     rainy 0.120     foggy 0.322 &lt;br /&gt;Distribution over state at t=6:&lt;br /&gt;    sunny 0.679     rainy 0.055     foggy 0.265 &lt;br /&gt;Distribution over state at t=7:&lt;br /&gt;    sunny 
0.241     rainy 0.466     foggy 0.293 &lt;br /&gt;Distribution over state at t=8:&lt;br /&gt;    sunny 0.082     rainy 0.722     foggy 0.197 &lt;br /&gt;Distribution over state at t=9:&lt;br /&gt;    sunny 0.447     rainy 0.198     foggy 0.356 &lt;br /&gt;Distribution over state at t=10:&lt;br /&gt;    sunny 0.142     rainy 0.600     foggy 0.258 &lt;br /&gt;&lt;br /&gt;Prediction...&lt;br /&gt;&lt;br /&gt;Distribution over state at t+1:&lt;br /&gt;    sunny 0.285     rainy 0.445     foggy 0.270 &lt;br /&gt;Distribution over state at t+2:&lt;br /&gt;    sunny 0.371     rainy 0.362     foggy 0.267 &lt;br /&gt;Distribution over state at t+3:&lt;br /&gt;    sunny 0.423     rainy 0.316     foggy 0.262 &lt;br /&gt;Distribution over state at t+4:&lt;br /&gt;    sunny 0.454     rainy 0.289     foggy 0.257 &lt;br /&gt;Distribution over state at t+5:&lt;br /&gt;    sunny 0.472     rainy 0.273     foggy 0.255 &lt;br /&gt;Distribution over state at t+6:&lt;br /&gt;    sunny 0.483     rainy 0.264     foggy 0.253 &lt;br /&gt;Distribution over state at t+7:&lt;br /&gt;    sunny 0.490     rainy 0.258     foggy 0.252 &lt;br /&gt;Distribution over state at t+8:&lt;br /&gt;    sunny 0.494     rainy 0.255     foggy 0.251 &lt;br /&gt;Distribution over state at t+9:&lt;br /&gt;    sunny 0.496     rainy 0.253     foggy 0.251 &lt;br /&gt;Distribution over state at t+10:&lt;br /&gt;    sunny 0.498     rainy 0.252     foggy 0.250 &lt;br /&gt;Distribution over state at t+11:&lt;br /&gt;    sunny 0.499     rainy 0.251     foggy 0.250 &lt;br /&gt;Distribution over state at t+12:&lt;br /&gt;    sunny 0.499     rainy 0.251     foggy 0.250 &lt;br /&gt;Distribution over state at t+13:&lt;br /&gt;    sunny 0.500     rainy 0.250     foggy 0.250 &lt;br /&gt;Distribution over state at t+14:&lt;br /&gt;    sunny 0.500     rainy 0.250     foggy 0.250 &lt;br /&gt;Distribution over state at t+15:&lt;br /&gt;    sunny 0.500     rainy 0.250     foggy 0.250 &lt;br /&gt;&lt;br /&gt;Likelihood...&lt;br 
/&gt;&lt;br /&gt;Likelihood of observations given our HMM:  0.000654&lt;br /&gt;&lt;br /&gt;Smoothing/hindsight...&lt;br /&gt;&lt;br /&gt;Distribution over state at t=1:&lt;br /&gt;    sunny 0.667     rainy 0.074     foggy 0.259 &lt;br /&gt;Distribution over state at t=2:&lt;br /&gt;    sunny 0.728     rainy 0.042     foggy 0.231 &lt;br /&gt;Distribution over state at t=3:&lt;br /&gt;    sunny 0.752     rainy 0.034     foggy 0.214 &lt;br /&gt;Distribution over state at t=4:&lt;br /&gt;    sunny 0.282     rainy 0.424     foggy 0.294 &lt;br /&gt;Distribution over state at t=5:&lt;br /&gt;    sunny 0.558     rainy 0.120     foggy 0.322 &lt;br /&gt;Distribution over state at t=6:&lt;br /&gt;    sunny 0.679     rainy 0.055     foggy 0.265 &lt;br /&gt;Distribution over state at t=7:&lt;br /&gt;    sunny 0.241     rainy 0.466     foggy 0.293 &lt;br /&gt;Distribution over state at t=8:&lt;br /&gt;    sunny 0.082     rainy 0.722     foggy 0.197 &lt;br /&gt;Distribution over state at t=9:&lt;br /&gt;    sunny 0.447     rainy 0.198     foggy 0.356 &lt;br /&gt;Distribution over state at t=10:&lt;br /&gt;    sunny 0.142     rainy 0.600     foggy 0.258 &lt;br /&gt;&lt;br /&gt;t=10&lt;br /&gt;&lt;br /&gt;Umbrella observation sequence:&lt;br /&gt;    ['no', 'no', 'no', 'yes', 'no', 'no', 'yes', 'yes', 'no', 'yes']&lt;br /&gt;&lt;br /&gt;Distribution over state at t-1:&lt;br /&gt;    sunny 0.223     rainy 0.335     foggy 0.441 &lt;br /&gt;Distribution over state at t-2:&lt;br /&gt;    sunny 0.083     rainy 0.672     foggy 0.245 &lt;br /&gt;Distribution over state at t-3:&lt;br /&gt;    sunny 0.100     rainy 0.601     foggy 0.300 &lt;br /&gt;Distribution over state at t-4:&lt;br /&gt;    sunny 0.352     rainy 0.151     foggy 0.496 &lt;br /&gt;Distribution over state at t-5:&lt;br /&gt;    sunny 0.470     rainy 0.115     foggy 0.415 &lt;br /&gt;Distribution over state at t-6:&lt;br /&gt;    sunny 0.356     rainy 0.318     foggy 0.326 &lt;br /&gt;Distribution over state at t-7:&lt;br 
/&gt;    sunny 0.589     rainy 0.067     foggy 0.344 &lt;br /&gt;Distribution over state at t-8:&lt;br /&gt;    sunny 0.718     rainy 0.033     foggy 0.249 &lt;br /&gt;Distribution over state at t-9:&lt;br /&gt;    sunny 0.747     rainy 0.042     foggy 0.210 &lt;br /&gt;&lt;br /&gt;Decoding...&lt;br /&gt;&lt;br /&gt;Viterbi - predicted state sequence:&lt;br /&gt;    ['sunny', 'sunny', 'sunny', 'sunny', 'sunny', 'sunny', 'rainy', 'rainy', 'rainy', 'rainy']&lt;br /&gt;Viterbi - actual state sequence:&lt;br /&gt;    ['foggy', 'foggy', 'foggy', 'rainy', 'sunny', 'foggy', 'rainy', 'rainy', 'foggy', 'rainy']&lt;br /&gt;&lt;br /&gt;The accuracy of the viterbi decoder is 0.4&lt;br /&gt;&lt;br /&gt;Decoding full sequence of observations...&lt;br /&gt;&lt;br /&gt;The accuracy of the viterbi decoder is 0.636&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;An interesting thing to note is how predicting into the future will eventually lead back to the stationary distribution of the markov chain.  Next time, we will talk about how to actually infer the parameters of our HMM from data.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/6475416749449483556-1029472587790976111?l=mechanistician.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://mechanistician.blogspot.com/feeds/1029472587790976111/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://mechanistician.blogspot.com/2009/05/hidden-markov-model-ii.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/6475416749449483556/posts/default/1029472587790976111'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/6475416749449483556/posts/default/1029472587790976111'/><link rel='alternate' type='text/html' 
href='http://mechanistician.blogspot.com/2009/05/hidden-markov-model-ii.html' title='Hidden Markov Model II'/><author><name>Mechanistician</name><uri>http://www.blogger.com/profile/04405962471175755507</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-6475416749449483556.post-4702716323747823578</id><published>2009-05-13T12:02:00.000-07:00</published><updated>2009-05-19T11:50:22.486-07:00</updated><title type='text'>Hidden Markov Model</title><content type='html'>In our last post we talked about one particular application of markov chains: sampling from probability distributions.  Another example was that of a &lt;a href="http://www.jwz.org/dadadodo/"&gt;random text generator&lt;/a&gt; (RTG), and as trivial a pursuit as it may be, an RTG is a good, simple application with which to explore these concepts.&lt;br /&gt;&lt;br /&gt;Imagine that we have a different kind of RTG, one that, instead of learning word transition probabilities, learns part-of-speech (POS) transition probabilities.  One categorization of POS may include: {'noun', 'verb', 'pronoun', 'preposition', 'adverb', 'conjunction', 'participle', 'article'}, but there are &lt;a href="http://en.wikipedia.org/wiki/Brown_Corpus#Part-of-speech_tags_used"&gt;other variations&lt;/a&gt;.  Imagine also that we have a POS distribution ('noun'=.3, 'verb'=.3, ... all adding up to one, of course), and a model that approximates the distribution of words given a certain POS, so that, given POS='verb', we would have: ('eat'=.1, 'sleep'=.3, 'work'=.3, 'play'=.3), which are the verbs in our vocabulary.  
If we had that info, we could generate random text as follows:&lt;br /&gt;&lt;ol&gt;&lt;li&gt;Start with a random POS drawn from the POS distribution&lt;/li&gt;&lt;li&gt;Draw a random word from the given POS' word distribution&lt;/li&gt;&lt;li&gt;Transition to the next POS given the POS transition probability matrix&lt;/li&gt;&lt;li&gt;Draw a new word from the new POS' word distribution&lt;/li&gt;&lt;li&gt;Repeat steps 3 and 4&lt;/li&gt;&lt;/ol&gt;If we generate some random text using this model and show it to someone, what they would observe (the sequence of words) would be called an observed markov chain.  Unless we show them the sequence of POS tags, they would not know about it, and so that would be an unobserved (hidden) markov chain to them.  These kinds of models turn out to be useful in all kinds of applications, &lt;a href="http://en.wikipedia.org/wiki/Speech_recognition"&gt;speech recognition&lt;/a&gt; for example, where a computer sees an observed markov chain of sound recordings and tries to infer what the unobserved markov chain of words that "generated those sounds" actually was.  As a side note, &lt;a href="http://see.stanford.edu/see/courses.aspx"&gt;Stanford Engineering Everywhere&lt;/a&gt; also includes a course on &lt;a href="http://see.stanford.edu/see/courseinfo.aspx?coll=63480b48-8819-4efd-8412-263f1a472f5a"&gt;Natural Language Processing&lt;/a&gt;.  
Here's a &lt;a href="http://www.cs.berkeley.edu/%7Eklein/cs288/sp09/"&gt;link&lt;/a&gt; to Berkeley's version of that course.&lt;br /&gt;&lt;br /&gt;A &lt;a href="http://en.wikipedia.org/wiki/Hidden_Markov_Model"&gt;hidden markov model&lt;/a&gt; (HMM) is useful for a lot of things, but in essence, we learn the parameters of an HMM based on the available data (any priors and the sequence of observations) so that we can use it to calculate the most probable sequence of [latent] states given the sequence of observations (chapter 15 of &lt;a href="http://aima.cs.berkeley.edu/"&gt;AIMA&lt;/a&gt; has an interesting discussion on things we can do with temporal models such as HMMs).&lt;br /&gt;&lt;br /&gt;Are probabilities of sequences really useful?  It turns out they are.  Think about a cell phone conversation you've recently had where the reception wasn't great, and the speaker on the other end said something you couldn't really understand.  Imagine, for example, that you heard '?all', but you weren't sure what letter should've gone in the place of the question mark; did they say 'ball', 'call', or 'wall'?  A silly way to decide what the word was would be to just use the probability of ever hearing each of those words and pick the most probable one, but then you're not taking advantage of the context of the conversation, which includes all the other words that you heard before '?all'.  Most likely, what you would do is "calculate" the probability of each word given that you were talking about a soccer game, and choose 'ball' as the most likely; in other words, you would calculate probabilities of sequences.  
We will look into estimating the parameters of the HMM and then into calculating probabilities of sequences, but first, we examine the simpler case of doing so in observed markov chains.&lt;br /&gt;&lt;br /&gt;One field very concerned with sequences is &lt;a href="http://en.wikipedia.org/wiki/Bioinformatics"&gt;bioinformatics&lt;/a&gt;,  which uses computational methods to solve problems in biology, and where common activities include mapping and analyzing DNA and protein sequences and aligning different DNA and protein sequences to compare them.  For many applications, a &lt;a href="http://en.wikipedia.org/wiki/DNA_sequence"&gt;DNA sequence&lt;/a&gt; can be treated as a string composed of the characters in the set {A,G,C,T}, and one simple example of something that can be done is to try and come up with a model of how DNA sequences evolve through time. For that, several &lt;a href="http://en.wikipedia.org/wiki/Substitution_model"&gt;substitution models&lt;/a&gt; have been proposed, including a number of different &lt;a href="http://en.wikipedia.org/wiki/Models_of_DNA_evolution"&gt;markov models&lt;/a&gt;.  
In order to infer the parameters of a markov chain using data, we would use a technique we have seen before, maximum likelihood, which is well described &lt;a href="http://www.stat.cmu.edu/%7Ecshalizi/462/lectures/06/markov-mle.pdf"&gt;here&lt;/a&gt;, and which unsurprisingly yields that the estimates are the ratios present in the dataset:&lt;br /&gt;&lt;br /&gt;&lt;pre name="code" class="python"&gt;&lt;br /&gt;#mc.py&lt;br /&gt;from numpy import *&lt;br /&gt;&lt;br /&gt;############################################################&lt;br /&gt;#compute probabilities of sequences given a markov chain&lt;br /&gt;############################################################&lt;br /&gt;def printchain(chain):&lt;br /&gt;    for node in chain:&lt;br /&gt;        print node+':',&lt;br /&gt;        print chain[node]&lt;br /&gt;    &lt;br /&gt;def nextprob(chain,node,nextnode):&lt;br /&gt;    if node in chain and nextnode in chain[node].keys():&lt;br /&gt;        return chain[node][nextnode]&lt;br /&gt;    else:&lt;br /&gt;        return 0.0&lt;br /&gt;&lt;br /&gt;def seqprob(chain,seq):&lt;br /&gt;    probseq=nextprob(chain,'b',seq[0])&lt;br /&gt;    for index in range(len(seq)-1):&lt;br /&gt;        probseq*=nextprob(chain,seq[index],seq[index+1])&lt;br /&gt;    return probseq&lt;br /&gt;&lt;br /&gt;chain={}&lt;br /&gt;chain['b']={'A':1.0}&lt;br /&gt;chain['A']={'C':0.3,'G':0.7}&lt;br /&gt;chain['C']={'T':1.0}&lt;br /&gt;chain['G']={'T':1.0}&lt;br /&gt;chain['T']={'e':1.0}&lt;br /&gt;chain['e']=None&lt;br /&gt;&lt;br /&gt;print '\nprob from A to C:\t%f'%nextprob(chain,'A','C')&lt;br /&gt;print 'prob of seq AGT:\t%f'%seqprob(chain,'AGT')&lt;br /&gt;printchain(chain)&lt;br /&gt;&lt;br /&gt;############################################################&lt;br /&gt;#infer markov chain from given sequences &lt;br /&gt;############################################################&lt;br /&gt;&lt;br /&gt;def buildchain(seqs):&lt;br /&gt;    #initialize chain&lt;br /&gt;    
chain={'b':{},'e':None}&lt;br /&gt;&lt;br /&gt;    #add sequence items as nodes&lt;br /&gt;    for seq in seqs:&lt;br /&gt;        for item in seq:&lt;br /&gt;            chain.setdefault(item,{})&lt;br /&gt;    for i in range(len(seqs[0])-1):&lt;br /&gt;        for seq in seqs:&lt;br /&gt;            chain[seq[i]].setdefault(seq[i+1],0.0)&lt;br /&gt;            chain[seq[i]][seq[i+1]]+=1&lt;br /&gt;&lt;br /&gt;    #add limit nodes&lt;br /&gt;    for seq in seqs:&lt;br /&gt;        chain['b'].setdefault(seq[0],0.0)&lt;br /&gt;        chain['b'][seq[0]]+=1&lt;br /&gt;        chain[seq[len(seq)-1]].setdefault('e',1.0)&lt;br /&gt;        &lt;br /&gt;    #normalize each node's transition probabilities&lt;br /&gt;    for node in chain:&lt;br /&gt;        if node&lt;&gt;'e':&lt;br /&gt;            norm=1/sum(chain[node].values())&lt;br /&gt;            for nextnode in chain[node]:&lt;br /&gt;                chain[node][nextnode]*=norm&lt;br /&gt;        &lt;br /&gt;    return chain&lt;br /&gt;            &lt;br /&gt;seqs=['ACT','AGT']&lt;br /&gt;chain=buildchain(seqs)&lt;br /&gt;&lt;br /&gt;print '\nprob of seq ACT:\t%f'%seqprob(chain,seqs[0])&lt;br /&gt;print 'prob of seq AGT:\t%f'%seqprob(chain,seqs[1])&lt;br /&gt;printchain(chain)&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;For which we get:&lt;br /&gt;&lt;br /&gt;&lt;pre name="code" class="python"&gt;&lt;br /&gt;prob from A to C: 0.300000&lt;br /&gt;prob of seq AGT: 0.700000&lt;br /&gt;A: {'C': 0.29999999999999999, 'G': 0.69999999999999996}&lt;br /&gt;C: {'T': 1.0}&lt;br /&gt;b: {'A': 1.0}&lt;br /&gt;e: None&lt;br /&gt;G: {'T': 1.0}&lt;br /&gt;T: {'e': 1.0}&lt;br /&gt;&lt;br /&gt;prob of seq ACT: 0.500000&lt;br /&gt;prob of seq AGT: 0.500000&lt;br /&gt;A: {'C': 0.5, 'G': 0.5}&lt;br /&gt;C: {'T': 1.0}&lt;br /&gt;b: {'A': 1.0}&lt;br /&gt;e: None&lt;br /&gt;G: {'T': 1.0}&lt;br /&gt;T: {'e': 1.0}&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;Notice anything wrong with this picture?  
Watch what happens when we use the dataset below to infer a new chain and calculate probabilities of sequences:&lt;br /&gt;&lt;br /&gt;&lt;pre name="code" class="python"&gt;&lt;br /&gt;seqs=['GTATC','GTTTC','GT-AC','GT-AC']&lt;br /&gt;chain=buildchain(seqs)&lt;br /&gt;&lt;br /&gt;print '\nprob of seq GTATC:\t%f'%seqprob(chain,seqs[0])&lt;br /&gt;print 'prob of seq GTTTC:\t%f'%seqprob(chain,seqs[1])&lt;br /&gt;print 'prob of seq GT-AC:\t%f'%seqprob(chain,seqs[2])&lt;br /&gt;printchain(chain)&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;We get:&lt;br /&gt;&lt;br /&gt;&lt;pre name="code" class="python"&gt;&lt;br /&gt;prob of seq GTATC: 0.013605&lt;br /&gt;prob of seq GTTTC: 0.023324&lt;br /&gt;prob of seq GT-AC: 0.190476&lt;br /&gt;A: {'C': 0.666666666667, 'T': 0.333333333333}&lt;br /&gt;C: {'e': 1.0}&lt;br /&gt;b: {'G': 1.0}&lt;br /&gt;e: None&lt;br /&gt;G: {'T': 1.0}&lt;br /&gt;-: {'A': 1.0}&lt;br /&gt;T: {'A': 0.142857142857, 'C': 0.285714285714, '-': 0.285714285714, 'T': 0.285714285714}&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;Going by the dataset, however, we should have gotten 25% probability each for sequences GTATC and GTTTC, and 50% probability for sequence GT-AC.&lt;br /&gt;&lt;br /&gt;What our markov chain model failed to capture is that, given the dataset ['GTATC','GTTTC','GT-AC','GT-AC'], and assuming fixed-length sequences of size 5, the probability of 'T' being followed by a given item in our alphabet ['A','G','C','T','-'] changes depending on whether the 'T' we are looking at falls into the 2nd, 3rd, or 4th position in our sequence.  
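&lt;br /&gt;&lt;br /&gt;This position dependence is easy to check by counting.  The sketch below is a hypothetical helper written for this illustration (it is separate from 'mc.py', and the names 'positional_transitions' and 'positional_seqprob' are made up); it assumes all sequences have the same length and keeps one table of transition frequencies per position:&lt;br /&gt;&lt;br /&gt;
&lt;pre name="code" class="python"&gt;
```python
from collections import defaultdict


def positional_transitions(seqs):
    """Estimate P(next symbol | position, current symbol) by counting.

    Assumes all sequences have the same length (hypothetical helper,
    not part of mc.py). Positions are 1-based, as in the text.
    """
    counts = defaultdict(lambda: defaultdict(float))
    for seq in seqs:
        for i in range(len(seq) - 1):
            counts[(i + 1, seq[i])][seq[i + 1]] += 1
    probs = {}
    for key, nxt in counts.items():
        total = sum(nxt.values())
        probs[key] = dict((sym, n / total) for sym, n in nxt.items())
    return probs


def positional_seqprob(probs, seq):
    """Probability of seq under the position-dependent chain.

    The first symbol is treated as given, which is harmless here
    because every sequence in the dataset starts with 'G'.
    """
    prob = 1.0
    for i in range(len(seq) - 1):
        prob *= probs.get((i + 1, seq[i]), {}).get(seq[i + 1], 0.0)
    return prob


seqs = ['GTATC', 'GTTTC', 'GT-AC', 'GT-AC']
probs = positional_transitions(seqs)
print(probs[(2, 'T')])
for s in ('GTATC', 'GTTTC', 'GT-AC'):
    print(s, positional_seqprob(probs, s))
```
&lt;/pre&gt;
&lt;br /&gt;&lt;br /&gt;On this dataset it gives 0.25 for GTATC and GTTTC and 0.5 for GT-AC.&lt;br /&gt;&lt;br /&gt;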
Doing some manual calculations, we can see that we have the following transition probabilities:&lt;br /&gt;&lt;ul&gt;&lt;li&gt;when 'T' is in position 2: {'A':.25,'T':.25,'-':.5}&lt;/li&gt;&lt;li&gt;when 'T' is in position 3: {'T':1.0}&lt;/li&gt;&lt;li&gt;when 'T' is in position 4: {'C':1.0}&lt;/li&gt;&lt;/ul&gt;A hidden markov model would actually model these transitions effectively, where the "hidden" chain would be the sequence of positions.  Actually, this is a very simple hidden markov model, because of the transition probabilities for the hidden chain (i.e. position 1 transitions with 100% probability into position 2, and so on), and so a few modifications to our 'buildchain' and 'seqprob' methods up there would fix our problem.  In our next post, however, we will look at a more general algorithm to learn and compute with HMM's.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/6475416749449483556-4702716323747823578?l=mechanistician.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://mechanistician.blogspot.com/feeds/4702716323747823578/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://mechanistician.blogspot.com/2009/05/hidden-markov-model_13.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/6475416749449483556/posts/default/4702716323747823578'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/6475416749449483556/posts/default/4702716323747823578'/><link rel='alternate' type='text/html' href='http://mechanistician.blogspot.com/2009/05/hidden-markov-model_13.html' title='Hidden Markov Model'/><author><name>Mechanistician</name><uri>http://www.blogger.com/profile/04405962471175755507</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' 
height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-6475416749449483556.post-5681149278458310316</id><published>2009-05-12T15:12:00.000-07:00</published><updated>2009-05-14T07:08:26.629-07:00</updated><title type='text'>Markov Chain Monte Carlo II</title><content type='html'>Continuing our discussion on simulation, today we discuss a  class of algorithms for sampling from probability distributions based on constructing a markov chain that has the desired distribution as its equilibrium distribution: &lt;a href="http://en.wikipedia.org/wiki/Markov_chain_Monte_Carlo"&gt;markov chain monte carlo&lt;/a&gt; (MCMC).&lt;br /&gt;&lt;br /&gt;What is a markov chain's equilibrium distribution?  For illustration purposes, take the following example, from &lt;a href="http://www.amazon.com/Applied-Algebra-Analysis-Undergraduate-Mathematics/dp/0387331956"&gt;&lt;span style="text-decoration: underline;"&gt;Applied Linear Algebra and Matrix Analysis&lt;/span&gt;&lt;/a&gt;:&lt;br /&gt;&lt;br /&gt;"Suppose two toothpaste companies compete for customers in a fixed market in which each customer uses either brand A or brand B.  Suppose also that a market analysis shows that the buying habits of the customers fit the following pattern: each quarter, 30% of A users will switch to B, while the rest stay with A.  During the same time, 40% of B users will switch to A, while the rest will stay with B."&lt;br /&gt;&lt;br /&gt;As the python session below shows, it does not matter what the initial state distribution is (i.e. 
what percentage of customers uses brand A and B), after enough quarters, the state distribution will converge to an equilibrium:&lt;br /&gt;&lt;br /&gt;&lt;pre name="code" class="python"&gt;&lt;br /&gt;#tpaste_inter.py&lt;br /&gt;from numpy import *&lt;br /&gt;&lt;br /&gt;def evolve(x,T,n):&lt;br /&gt;    for i in range(n):&lt;br /&gt;        x=dot(T,x)&lt;br /&gt;    print str(x)+'\n' &lt;br /&gt;&lt;br /&gt;#transition matrix&lt;br /&gt;T=array([[.7,.4],&lt;br /&gt;         [.3,.6]])&lt;br /&gt;&lt;br /&gt;#one initial state distribution&lt;br /&gt;x0=array([.5,&lt;br /&gt;          .5]).reshape(2,1)&lt;br /&gt;&lt;br /&gt;evolve(x0,T,20)&lt;br /&gt;&lt;br /&gt;#another initial state distribution&lt;br /&gt;x0=array([1,&lt;br /&gt;          0]).reshape(2,1)&lt;br /&gt;&lt;br /&gt;evolve(x0,T,20)&lt;br /&gt;&lt;br /&gt;#yet another initial state distribution&lt;br /&gt;x0=array([0,&lt;br /&gt;          1]).reshape(2,1)&lt;br /&gt;&lt;br /&gt;evolve(x0,T,20)&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;Output:&lt;br /&gt;&lt;br /&gt;&lt;pre name="code" class="python"&gt;&lt;br /&gt;[[ 0.57142857]&lt;br /&gt; [ 0.42857143]]&lt;br /&gt;&lt;br /&gt;[[ 0.57142857]&lt;br /&gt; [ 0.42857143]]&lt;br /&gt;&lt;br /&gt;[[ 0.57142857]&lt;br /&gt; [ 0.42857143]]&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;As shown, all the initial distributions converged to A retaining 57% of customers, with B retaining the remaining 43%.  This convergence to an equilibrium distribution does not always happen in markov chains (check &lt;a href="http://www.dis.uniroma1.it/%7Eleon/didattica/webir/pagerank.pdf"&gt;this&lt;/a&gt; out for necessary conditions for convergence), but we do need it in MCMC.  
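&lt;br /&gt;&lt;br /&gt;As a cross-check on those numbers, the equilibrium can also be found without iterating, by solving x = Tx directly.  The sketch below is only an illustration and not part of the original session: it uses plain Python lists instead of numpy, and the helper names 'stationary_two_state' and 'step' are made up for this example.  For a two-state chain in which a fraction p of A users switch to B and a fraction q of B users switch to A, balancing the two flows (p*xA = q*xB, with xA + xB = 1) gives xA = q/(p+q):&lt;br /&gt;&lt;br /&gt;
&lt;pre name="code" class="python"&gt;
```python
def stationary_two_state(p, q):
    """Equilibrium of a 2-state chain.

    p: probability of moving from A to B in one step
    q: probability of moving from B to A in one step
    Assumes p + q > 0 (hypothetical helper, not from the original post).
    """
    x_a = q / float(p + q)
    return [x_a, 1.0 - x_a]


def step(T, x):
    """One application of a column-stochastic matrix T to distribution x."""
    return [sum(T[i][j] * x[j] for j in range(len(x))) for i in range(len(T))]


# same transition matrix as in tpaste_inter.py
T = [[0.7, 0.4],
     [0.3, 0.6]]
x_eq = stationary_two_state(0.3, 0.4)
print(x_eq)           # close to [4/7, 3/7]
print(step(T, x_eq))  # applying T again leaves it (numerically) unchanged
```
&lt;/pre&gt;
&lt;br /&gt;&lt;br /&gt;With p=.3 and q=.4 this gives 4/7 and 3/7, in line with the 57% and 43% above.&lt;br /&gt;&lt;br /&gt;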
&lt;a href="http://en.wikipedia.org/wiki/PageRank"&gt;PageRank&lt;/a&gt; is also an example of this, where the transition matrix is derived from the link graph and the equilibrium state distribution denotes the rank of each page.&lt;br /&gt;&lt;br /&gt;The idea in MCMC is that we need to construct a transition matrix whose equilibrium distribution is the target distribution we want to sample from, so that regardless of how we initialize the state distribution, by applying the transition matrix enough times we will converge to our desired distribution.  One obvious benefit of this method over, say, rejection sampling is that in rejection sampling we are sampling all over the distribution space (the proposal distribution's space, that is), which may generate a lot of rejections (wasted samples).  The same is true of importance sampling (sampling importance resampling, that is).  With MCMC, on the other hand, we are smarter about navigating the state space of the desired distribution by navigating the links on its chain.  For example, imagine we are sampling from some bivariate distribution.  One approach is to use something like rejection sampling to generate a bunch of i.i.d. (x0,x1) tuples, while another approach would be to start with a random sample for 'x0', then generate 'x1' from the conditional distribution p(x1|x0), which gives us our first tuple (x0,x1); then we generate a new sample for 'x0' based on our last 'x1' by drawing this time from the conditional p(x0|x1), which gives us our next tuple (not i.i.d.), and then we generate a new 'x1' based on this last 'x0' for the next tuple, and so on.  
This last approach is known as &lt;a href="http://en.wikipedia.org/wiki/Gibbs_sampling"&gt;Gibbs Sampling&lt;/a&gt;, and is one variant of MCMC.&lt;br /&gt;&lt;br /&gt;A more general form of MCMC is the so-called &lt;a href="http://en.wikipedia.org/wiki/Metropolis-Hastings_algorithm"&gt;Metropolis-Hastings algorithm&lt;/a&gt;, where rather than always taking the next link in the chain as a valid sample, we accept or reject it by comparing it with the last sample (according to the target density, which plays the role of a fitness function).  With a symmetric proposal, if the new sample is at least as probable as the last one, we always take it.  If it is less probable, we flip a biased coin: we take it with probability equal to the ratio of the two densities, and otherwise we repeat the last sample.  Good introductory resources include Nando's &lt;a href="http://www.cs.ubc.ca/%7Enando/papers/mlintro.pdf"&gt;paper on MCMC&lt;/a&gt; and chapter 11 of Bishop's &lt;a href="http://research.microsoft.com/en-us/um/people/cmbishop/prml/"&gt;book&lt;/a&gt;.  In there you will find more details about how we go about constructing a transition matrix (the proposal conditional distribution) with the desired properties, which can be considered somewhat of an art (different choices for the proposal distribution lead to different variants of the Metropolis-Hastings algorithm, one of which is Gibbs Sampling).&lt;br /&gt;&lt;br /&gt;Below is an example from Marsland's &lt;a href="http://www-ist.massey.ac.nz/smarsland/MLBook.html"&gt;book&lt;/a&gt;, where he uses Metropolis-Hastings to sample from a mixture of two gaussians based on a proposal distribution that is a gaussian of the form N(x,b^2) for some b&gt;0.  This means that the proposal is drawn from a Normal centered at the current value. 
This variant of Metropolis-Hastings is known as the "Random-walk-Metropolis-Hastings" algorithm, the reason for the name being that if we skipped the accept-reject step (comparing the target density at the new sample and at the current one) we would simply be simulating a &lt;a href="http://en.wikipedia.org/wiki/Random_walk"&gt;random walk&lt;/a&gt;: &lt;br /&gt;&lt;br /&gt;&lt;pre name="code" class="python"&gt;&lt;br /&gt;#mcmc.py&lt;br /&gt;&lt;br /&gt;# Code from Chapter 14 of Machine Learning: An Algorithmic Perspective&lt;br /&gt;# by Stephen Marsland (http://seat.massey.ac.nz/personal/s.r.marsland/MLBook.html)&lt;br /&gt;&lt;br /&gt;# You are free to use, change, or redistribute the code in any way you wish for&lt;br /&gt;# non-commercial purposes, but please maintain the name of the original author.&lt;br /&gt;# This code comes with no warranty of any kind.&lt;br /&gt;&lt;br /&gt;# Stephen Marsland, 2008&lt;br /&gt;&lt;br /&gt;# The Metropolis-Hastings algorithm&lt;br /&gt;from pylab import *&lt;br /&gt;from numpy import *&lt;br /&gt;&lt;br /&gt;def p(x):&lt;br /&gt;    mu1 = 3&lt;br /&gt;    mu2 = 10&lt;br /&gt;    v1 = 10&lt;br /&gt;    v2 = 3&lt;br /&gt;    return 0.3*exp(-(x-mu1)**2/v1) + 0.7* exp(-(x-mu2)**2/v2)&lt;br /&gt;&lt;br /&gt;stepsize = 0.5&lt;br /&gt;x = arange(-10,20,stepsize)&lt;br /&gt;px = zeros(shape(x))&lt;br /&gt;for i in range(len(x)):&lt;br /&gt;    px[i] = p(x[i])&lt;br /&gt;N = 5000&lt;br /&gt;&lt;br /&gt;# random walk chain&lt;br /&gt;u2 = random.rand(N)&lt;br /&gt;sigma = 10&lt;br /&gt;y2 = zeros(N)&lt;br /&gt;#start with a random sample&lt;br /&gt;y2[0] = random.normal(0,sigma)&lt;br /&gt;for i in range(N-1):&lt;br /&gt;    #propose the next sample: a random-walk step from the current one&lt;br /&gt;    y2new = y2[i] + random.normal(0,sigma)&lt;br /&gt;    #acceptance probability: ratio of target densities (symmetric proposal)&lt;br /&gt;    alpha = min(1,p(y2new)/p(y2[i]))&lt;br /&gt;    #accept with probability alpha, otherwise repeat the last sample&lt;br /&gt;    if u2[i] &lt; alpha:&lt;br /&gt;        y2[i+1] = y2new&lt;br /&gt;    else:&lt;br /&gt; 
       y2[i+1] = y2[i]&lt;br /&gt;&lt;br /&gt;figure(1)&lt;br /&gt;nbins = 30&lt;br /&gt;hist(y2, bins = x)&lt;br /&gt;plot(x, px*N/sum(px), color='r', linewidth=2)&lt;br /&gt;&lt;br /&gt;show()&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;And for which we get:&lt;br /&gt;&lt;br /&gt;&lt;a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://3.bp.blogspot.com/_JBvsBkmE5OU/SgwllTE7uyI/AAAAAAAAAFc/sQxR4WoTksY/s1600-h/mcmc.png"&gt;&lt;img style="cursor:pointer; cursor:hand;width: 320px; height: 240px;" src="http://3.bp.blogspot.com/_JBvsBkmE5OU/SgwllTE7uyI/AAAAAAAAAFc/sQxR4WoTksY/s320/mcmc.png" border="0" alt=""id="BLOGGER_PHOTO_ID_5335680981233548066" /&gt;&lt;/a&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/6475416749449483556-5681149278458310316?l=mechanistician.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://mechanistician.blogspot.com/feeds/5681149278458310316/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://mechanistician.blogspot.com/2009/05/markov-chain-monte-carlo-ii.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/6475416749449483556/posts/default/5681149278458310316'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/6475416749449483556/posts/default/5681149278458310316'/><link rel='alternate' type='text/html' href='http://mechanistician.blogspot.com/2009/05/markov-chain-monte-carlo-ii.html' title='Markov Chain Monte Carlo II'/><author><name>Mechanistician</name><uri>http://www.blogger.com/profile/04405962471175755507</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><media:thumbnail 
xmlns:media='http://search.yahoo.com/mrss/' url='http://3.bp.blogspot.com/_JBvsBkmE5OU/SgwllTE7uyI/AAAAAAAAAFc/sQxR4WoTksY/s72-c/mcmc.png' height='72' width='72'/><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-6475416749449483556.post-2342757171904655377</id><published>2009-05-07T18:31:00.000-07:00</published><updated>2009-05-12T17:28:37.564-07:00</updated><title type='text'>Markov Chain Monte Carlo</title><content type='html'>We move quicker now towards the reinforcement learning part of the lectures, where the concept of a markov decision process (MDP) is introduced and further discussed (starting in &lt;a href="http://www.youtube.com/watch?v=RtxI449ZjSc&amp;amp;feature=PlayList&amp;amp;p=A89DCFA6ADACE599&amp;amp;index=15"&gt;lecture 16&lt;/a&gt;).  Before we get there though, we should look into a few related concepts that will allow us to build up to MDP's, particularly, simpler markov models: markov chains and hidden markov models.&lt;br /&gt;&lt;br /&gt;To see how all these different flavors of discrete time, discrete state markov models are related, the following chart can be useful (found &lt;a href="http://www.pomdp.org/pomdp/pomdp-faq.shtml"&gt;here&lt;/a&gt;):&lt;br /&gt;&lt;br /&gt;&lt;table align="center" border="1" cellpadding="0" cellspacing="0"&gt;&lt;colgroup&gt;&lt;col width="192"&gt;&lt;col width="192"&gt;&lt;col width="192"&gt;&lt;col width="192"&gt;&lt;/colgroup&gt;&lt;tbody&gt;&lt;tr&gt;&lt;td colspan="2" rowspan="2" style="text-align: center; width: 1.7313in;"&gt;&lt;p&gt;Markov Models &lt;/p&gt;&lt;/td&gt;&lt;td colspan="2" style="text-align: center; width: 1.7313in;"&gt;&lt;p&gt;Do we have control over the state transitions? 
&lt;/p&gt;&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td style="text-align: center; width: 1.7313in;"&gt;&lt;p class="P1"&gt;No &lt;/p&gt;&lt;/td&gt;&lt;td style="text-align: center; width: 1.7313in;"&gt;&lt;p&gt;Yes &lt;/p&gt;&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td rowspan="2" style="text-align: center; width: 1.7313in;"&gt;&lt;p&gt;Are the states completely observable? &lt;/p&gt;&lt;/td&gt;&lt;td style="text-align: center; width: 1.7313in;"&gt;&lt;p&gt;Yes &lt;/p&gt;&lt;/td&gt;&lt;td style="text-align: center; width: 1.7313in;"&gt;&lt;p&gt;Markov Chain &lt;/p&gt;&lt;/td&gt;&lt;td style="text-align: center; width: 1.7313in;"&gt;&lt;p&gt;Markov Decision Process &lt;/p&gt;&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td style="text-align: center; width: 1.7313in;"&gt;&lt;p&gt;No &lt;/p&gt;&lt;/td&gt;&lt;td style="text-align: center; width: 1.7313in;"&gt;&lt;p&gt;Hidden Markov Model &lt;/p&gt;&lt;/td&gt;&lt;td style="text-align: center; width: 1.7313in;"&gt;&lt;p&gt;Partially Observable Markov Decision Process &lt;/p&gt;&lt;/td&gt;&lt;/tr&gt;&lt;/tbody&gt;&lt;/table&gt;&lt;br /&gt;&lt;br /&gt;So, what is a markov chain?  If you think about our &lt;a href="http://mechanistician.blogspot.com/2009/02/lec3-locally-weighted-linear-regression.html"&gt;diagram relating probability vs. inferential statistics&lt;/a&gt;, there are 2 ways from which we can look at markov chains.  The first one is to think about the data generating process, and if we did so, we would say that a markov chain is a system whose initial state (x0) is a probability distribution vector, and whose transition matrix (A) is composed of columns which are themselves probability distribution vectors (a stochastic matrix).  This system evolves through time, which we consider to be discrete, by taking the initial state vector  and "passing it through" the transition matrix, generating the next state vector (x1), and so on, for state vectors (x2), (x3), ..., (xn).  
After 'k' iterations, we would have seen 'k' state vectors, and we can think of this sequence of state vectors as the &lt;span style="font-weight: bold;"&gt;chain&lt;/span&gt;, with the state vectors being the links.  This leads us to the second way of looking at markov chains, which answers the question: what can we say about the chain (the sequence of observations) that is generated?  Well, the links in the chain have the so-called &lt;span style="font-weight: bold;"&gt;markov&lt;/span&gt; property, which says that a future state vector (xk+1) is independent of all past state vectors (x0 through xk-1) given the present state vector (xk).&lt;br /&gt;&lt;br /&gt;The markov property we mentioned characterizes a so-called 1st-order markov model. You could have a 2nd-order model, where a state vector depends only on the 2 previous states, and so on, until you have a model where your state vector depends on all previous states, an nth-order model.  There are many sequences of events where successive observations are correlated, and so the markov property is simply a formal simplification of that fact.  Obviously, the lower the order of the markov model, the more tractable it is, and fortunately, just as the naive bayes assumption works surprisingly well in a variety of applications, so do 1st-order models.  Check the wikipedia entry on &lt;a href="http://en.wikipedia.org/wiki/Markov_chain"&gt;markov chains&lt;/a&gt; for a list of applications and for further references, including this &lt;a href="http://decision.csl.uiuc.edu/%7Emeyn/pages/book.html"&gt;book&lt;/a&gt;, whose earlier edition is available for download.  
Gentler introductions can be found in &lt;a href="http://www.amazon.com/Applied-Algebra-Analysis-Undergraduate-Mathematics/dp/0387331956"&gt;linear algebra books&lt;/a&gt; or in &lt;a href="http://www.amazon.com/Probability-Statistics-Processes-Electrical-Engineering/dp/0131471228"&gt;probability books&lt;/a&gt; that have a chapter or two on stochastic processes.&lt;br /&gt;&lt;br /&gt;There are many simple &lt;a href="http://en.wikipedia.org/wiki/Examples_of_Markov_chains"&gt;examples&lt;/a&gt; of markov chains, and the most prevalent one throughout the web appears to be that of a random text generator; check this &lt;a href="http://code.activestate.com/recipes/194364/"&gt;python script&lt;/a&gt; out for an example of an algorithm described in chapter 3 of &lt;a href="http://cm.bell-labs.com/cm/cs/tpop/"&gt;The Practice of Programming&lt;/a&gt;.  What we will do instead is explore an application of markov chains to simulation, the &lt;a href="http://en.wikipedia.org/wiki/Markov_chain_Monte_Carlo"&gt;Markov chain Monte Carlo&lt;/a&gt; algorithm.&lt;br /&gt;&lt;br /&gt;We've very briefly talked about markov chains, but what about &lt;a href="http://en.wikipedia.org/wiki/Monte_Carlo_method"&gt;monte carlo methods&lt;/a&gt;?  As wikipedia mentions, monte carlo method is a general term for basically any algorithm that uses random sampling to calculate its results, and one desired property of these algorithms is that as the number of samples increases, so does the accuracy of the calculated results.  There is another kind of randomized algorithm, the &lt;a href="http://en.wikipedia.org/wiki/Las_Vegas_algorithm"&gt;las vegas algorithms&lt;/a&gt;, which despite their use of random numbers always give the correct result (think randomized &lt;a href="http://en.wikipedia.org/wiki/Quicksort"&gt;quicksort&lt;/a&gt;).  One illustrative example of a monte carlo method is to calculate &lt;a href="http://en.wikipedia.org/wiki/Pi"&gt;pi&lt;/a&gt;.  
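&lt;br /&gt;&lt;br /&gt;To make that concrete, here is a minimal sketch.  It is not taken from the references linked in this post; the function name 'estimate_pi' and the use of Python's built-in 'random' module are choices made for this illustration.  It throws points uniformly at the unit square and counts the fraction that land inside the quarter circle of radius 1, which should approach pi/4 as the number of points grows:&lt;br /&gt;&lt;br /&gt;
&lt;pre name="code" class="python"&gt;
```python
import random


def estimate_pi(nsamples, seed=None):
    """Monte Carlo estimate of pi from nsamples uniform points in the unit square."""
    rnd = random.Random(seed)  # private generator so runs can be reproduced
    inside = 0
    for _ in range(nsamples):
        x, y = rnd.random(), rnd.random()
        if x * x + y * y > 1.0:
            continue  # point fell outside the quarter circle
        inside += 1
    # (area of quarter circle) / (area of square) = pi/4
    return 4.0 * inside / nsamples


for n in (100, 10000, 100000):
    print(n, estimate_pi(n, seed=123))
```
&lt;/pre&gt;
&lt;br /&gt;&lt;br /&gt;The estimate is noisy for small sample counts and tightens only slowly, since the error shrinks roughly with the square root of the number of samples.&lt;br /&gt;&lt;br /&gt;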
There are some good explanations &lt;a href="http://www.chem.unl.edu/zeng/joy/mclab/mcintro.html"&gt;here&lt;/a&gt; and &lt;a href="http://www.eveandersson.com/pi/monte-carlo-circle"&gt;here&lt;/a&gt;; the latter also has a nice little applet of the simulation at work.&lt;br /&gt;&lt;br /&gt;Simulation from probability distributions relies on the ability to generate random numbers from a (0,1) &lt;a href="http://en.wikipedia.org/wiki/Uniform_distribution_%28continuous%29"&gt;uniform distribution&lt;/a&gt;, which can be accomplished with the &lt;a href="http://en.wikipedia.org/wiki/Linear_congruential_generator"&gt;linear congruential method&lt;/a&gt;:&lt;br /&gt;&lt;br /&gt;&lt;pre name="code" class="python"&gt;&lt;br /&gt;#lcm_imp.py&lt;br /&gt;import time&lt;br /&gt;&lt;br /&gt;m=2**32&lt;br /&gt;a=1664525&lt;br /&gt;c=1013904223&lt;br /&gt;#default seed: current time, truncated so the state stays an integer&lt;br /&gt;xi=int(time.time())&lt;br /&gt;&lt;br /&gt;def seed(x):&lt;br /&gt;    global xi&lt;br /&gt;    xi=x&lt;br /&gt;&lt;br /&gt;def rng():&lt;br /&gt;    global xi&lt;br /&gt;    xi=(a*xi+c)%m&lt;br /&gt;    return xi/float(m)&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;Driven with:&lt;br /&gt;&lt;br /&gt;&lt;pre name="code" class="python"&gt;&lt;br /&gt;#lcm_inter.py&lt;br /&gt;from lcm_imp import *&lt;br /&gt;&lt;br /&gt;seed(123)&lt;br /&gt;for i in range(10): print rng()&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;Yielding:&lt;br /&gt;&lt;br /&gt;&lt;pre name="code" class="python"&gt;&lt;br /&gt;0.283736921381&lt;br /&gt;0.43513002363&lt;br /&gt;0.0386512577534&lt;br /&gt;0.220879904693&lt;br /&gt;0.359427076299&lt;br /&gt;0.590244138846&lt;br /&gt;0.361280900426&lt;br /&gt;0.326849908335&lt;br /&gt;0.07973951241&lt;br /&gt;0.647962252842&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;Having that, we can simulate other distributions by transforming the samples from the (0,1) uniform distribution using the inverse of the distribution function of the density we want to sample from (&lt;a 
href="http://en.wikipedia.org/wiki/Inverse_transform_sampling"&gt;inverse transform method&lt;/a&gt;).  For example, say we want to sample from an (a,b) uniform distribution where a=-1 and b=1.  We know that the density of a continuous uniform distribution is f(x)=1/(b-a) in the interval a&lt;=x&lt;=b, and its distribution function is F(x)=(x-a)/(b-a) for a&lt;=x&lt;=b.  The inverse of this distribution function is x=(b-a)F(x)+a, and so if we plug the samples from U(0,1) in place of F(x) in the inverse distribution function, we will obtain results which are distributed U(a,b) for whatever 'a' and 'b' we choose:&lt;br /&gt;&lt;br /&gt;&lt;pre name="code" class="python"&gt;&lt;br /&gt;#invtrans_inter.py&lt;br /&gt;from lcm_imp import *&lt;br /&gt;&lt;br /&gt;seed(123)&lt;br /&gt;&lt;br /&gt;base_U_samples=[rng() for i in range(10)]&lt;br /&gt;&lt;br /&gt;a,b=-1,1&lt;br /&gt;&lt;br /&gt;desired_U_samples=[(b-a)*x+a for x in base_U_samples]&lt;br /&gt;&lt;br /&gt;for item in desired_U_samples: print item&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;For which we get:&lt;br /&gt;&lt;br /&gt;&lt;pre name="code" class="python"&gt;&lt;br /&gt;-0.432526157238&lt;br /&gt;-0.129739952739&lt;br /&gt;-0.922697484493&lt;br /&gt;-0.558240190614&lt;br /&gt;-0.281145847403&lt;br /&gt;0.180488277692&lt;br /&gt;-0.277438199148&lt;br /&gt;-0.34630018333&lt;br /&gt;-0.84052097518&lt;br /&gt;0.295924505685&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;The efficacy of the inverse transform method relies on our ability to calculate the inverse of the desired distribution function. For distributions where we can't do so, we could turn to another method which, although not very efficient in high dimensions, is easy to understand: &lt;a href="http://en.wikipedia.org/wiki/Rejection_sampling"&gt;rejection sampling&lt;/a&gt;.  This technique is nicely explained in part (2/6) of &lt;a href="http://videolectures.net/mlss08au_freitas_asm/"&gt;Nando's lectures on Monte Carlo simulation&lt;/a&gt; (with demo and all).  
For illustrative purposes, here is an example from Stephen Marsland's recent &lt;a href="http://www-ist.massey.ac.nz/smarsland/MLBook.html"&gt;Machine Learning: An Algorithmic Perspective&lt;/a&gt; book:&lt;br /&gt;&lt;br /&gt;&lt;pre name="code" class="python"&gt;&lt;br /&gt;#rejectsamp_imp.py&lt;br /&gt;&lt;br /&gt;# Code from Chapter 14 of Machine Learning: An Algorithmic Perspective&lt;br /&gt;# by Stephen Marsland (http://seat.massey.ac.nz/personal/s.r.marsland/MLBook.html)&lt;br /&gt;&lt;br /&gt;# You are free to use, change, or redistribute the code in any way you wish for&lt;br /&gt;# non-commercial purposes, but please maintain the name of the original author.&lt;br /&gt;# This code comes with no warranty of any kind.&lt;br /&gt;&lt;br /&gt;# Stephen Marsland, 2008&lt;br /&gt;&lt;br /&gt;# The basic rejection sampling algorithm&lt;br /&gt;&lt;br /&gt;from pylab import *&lt;br /&gt;from numpy import *&lt;br /&gt;&lt;br /&gt;def qsample():&lt;br /&gt;    return random.rand()*4.&lt;br /&gt;&lt;br /&gt;def p(x):&lt;br /&gt;    return 0.3*exp(-(x-0.3)**2) + 0.7* exp(-(x-2.)**2/0.3) &lt;br /&gt;&lt;br /&gt;def rejection(nsamples):&lt;br /&gt;    &lt;br /&gt;    M = 0.72#0.8&lt;br /&gt;    samples = zeros(nsamples,dtype=float)&lt;br /&gt;    count = 0&lt;br /&gt;    for i in range(nsamples):&lt;br /&gt;        accept = False&lt;br /&gt;        while not accept:&lt;br /&gt;            x = qsample()&lt;br /&gt;            u = random.rand()*M&lt;br /&gt;            if p(x)&gt;u:&lt;br /&gt;                accept = True&lt;br /&gt;                samples[i] = x&lt;br /&gt;            else: &lt;br /&gt;                count += 1&lt;br /&gt;    print count   &lt;br /&gt;    return samples&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;Driven with:&lt;br /&gt;&lt;br /&gt;&lt;pre name="code" class="python"&gt;&lt;br /&gt;#rejectsamp_inter.py&lt;br /&gt;from rejectsamp_imp import *&lt;br /&gt;&lt;br /&gt;x = arange(0,4,0.01)&lt;br /&gt;x2 = arange(-0.5,4.5,0.1)&lt;br 
/&gt;realdata = 0.3*exp(-(x-0.3)**2) + 0.7* exp(-(x-2.)**2/0.3)&lt;br /&gt;box = ones(len(x2))*0.75#0.8&lt;br /&gt;box[:5] = 0&lt;br /&gt;box[-5:] = 0&lt;br /&gt;plot(x,realdata,'k',lw=6)&lt;br /&gt;plot(x2,box,'k--',lw=6)&lt;br /&gt;&lt;br /&gt;import time&lt;br /&gt;t0=time.time()&lt;br /&gt;samples = rejection(10000)&lt;br /&gt;t1=time.time()&lt;br /&gt;print "Time ",t1-t0&lt;br /&gt;&lt;br /&gt;hist(samples,15,normed=1,fc='k')&lt;br /&gt;xlabel('x',fontsize=24)&lt;br /&gt;ylabel('p(x)',fontsize=24)&lt;br /&gt;axis([-0.5,4.5,0,1])&lt;br /&gt;show()&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;Which yields:&lt;br /&gt;&lt;br /&gt;&lt;pre name="code" class="python"&gt;&lt;br /&gt;17857&lt;br /&gt;Time  0.555863857269&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;&lt;a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://2.bp.blogspot.com/_JBvsBkmE5OU/SgXsdtWgKvI/AAAAAAAAAFM/FvONXLRNGOw/s1600-h/rejectsamp.png"&gt;&lt;img style="cursor: pointer; width: 320px; height: 240px;" src="http://2.bp.blogspot.com/_JBvsBkmE5OU/SgXsdtWgKvI/AAAAAAAAAFM/FvONXLRNGOw/s320/rejectsamp.png" alt="" id="BLOGGER_PHOTO_ID_5333929328824560370" border="0" /&gt;&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;Notice that the sampler rejected 17857 candidates (the count printed above) while collecting 10000 samples from our desired distribution, i.e. 27857 tries in total.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/6475416749449483556-2342757171904655377?l=mechanistician.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://mechanistician.blogspot.com/feeds/2342757171904655377/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://mechanistician.blogspot.com/2009/05/markov-chain.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/6475416749449483556/posts/default/2342757171904655377'/><link rel='self' 
type='application/atom+xml' href='http://www.blogger.com/feeds/6475416749449483556/posts/default/2342757171904655377'/><link rel='alternate' type='text/html' href='http://mechanistician.blogspot.com/2009/05/markov-chain.html' title='Markov Chain Monte Carlo'/><author><name>Mechanistician</name><uri>http://www.blogger.com/profile/04405962471175755507</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' url='http://2.bp.blogspot.com/_JBvsBkmE5OU/SgXsdtWgKvI/AAAAAAAAAFM/FvONXLRNGOw/s72-c/rejectsamp.png' height='72' width='72'/><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-6475416749449483556.post-5092092724269592196</id><published>2009-05-07T09:39:00.000-07:00</published><updated>2009-05-12T18:21:55.958-07:00</updated><title type='text'>Numpy gotcha!</title><content type='html'>So, in my haste to push &lt;a href="http://mechanistician.blogspot.com/2009/05/lec12-expectation-maximization.html"&gt;my last post&lt;/a&gt; on unsupervised learning and move on to sequential data, I did not check the results as thoroughly as I should have.  When my brother looked at them, he wisely pointed out that the boundaries between the clusters found by applying expectation maximization to a mixture of gaussians (MoG) looked linear, which is what you'd expect from k-means but not from MoG.&lt;br /&gt;&lt;br /&gt;I looked at it and, sure enough, there were a few minor issues which, once fixed, uncovered another issue that took me a little longer to figure out: all the points were now being labeled as belonging to a single cluster/gaussian by the MoG algorithm.  
I eventually narrowed it down to the cause, which was that my implementation was estimating covariance matrices with very large values, and so when it came time to find the most probable cluster for all points, the one with the largest covariance matrix was overpowering the remaining gaussians.&lt;br /&gt;&lt;br /&gt;There was nothing wrong with the implementation per se, instead, what slipped by me was this particular instance of how numpy arrays are handled:&lt;br /&gt;&lt;br /&gt;&lt;pre name="code" class="python"&gt;&lt;br /&gt;&gt;&gt;&gt; from numpy import *&lt;br /&gt;&gt;&gt;&gt; a=random.ranf((5,))&lt;br /&gt;&gt;&gt;&gt; b=random.ranf((5,1))&lt;br /&gt;&gt;&gt;&gt; a*b&lt;br /&gt;array([[ 0.04429077,  0.03425297,  0.63137759,  0.00791116,  0.47450932],&lt;br /&gt;       [ 0.04375046,  0.03383511,  0.62367522,  0.00781465,  0.46872064],&lt;br /&gt;       [ 0.02370387,  0.01833176,  0.3379054 ,  0.00423395,  0.25395146],&lt;br /&gt;       [ 0.03411096,  0.02638025,  0.48626142,  0.00609285,  0.36544784],&lt;br /&gt;       [ 0.02545382,  0.01968512,  0.36285147,  0.00454653,  0.27269958]])&lt;br /&gt;&gt;&gt;&gt; c=random.ranf((5,1))&lt;br /&gt;&gt;&gt;&gt; d=random.ranf((5,1))&lt;br /&gt;&gt;&gt;&gt; c*d&lt;br /&gt;array([[ 0.18859163],&lt;br /&gt;       [ 0.33695256],&lt;br /&gt;       [ 0.34911791],&lt;br /&gt;       [ 0.09585323],&lt;br /&gt;       [ 0.20524801]])&lt;br /&gt;&gt;&gt;&gt; e=a*b&lt;br /&gt;&gt;&gt;&gt; shape(e)&lt;br /&gt;(5, 5)&lt;br /&gt;&gt;&gt;&gt; shape(e[:,2])&lt;br /&gt;(5,)&lt;br /&gt;&gt;&gt;&gt; e&lt;br /&gt;array([[ 0.04429077,  0.03425297,  0.63137759,  0.00791116,  0.47450932],&lt;br /&gt;       [ 0.04375046,  0.03383511,  0.62367522,  0.00781465,  0.46872064],&lt;br /&gt;       [ 0.02370387,  0.01833176,  0.3379054 ,  0.00423395,  0.25395146],&lt;br /&gt;       [ 0.03411096,  0.02638025,  0.48626142,  0.00609285,  0.36544784],&lt;br /&gt;       [ 0.02545382,  0.01968512,  0.36285147,  0.00454653,  0.27269958]])&lt;br 
/&gt;&gt;&gt;&gt; e[:,2]&lt;br /&gt;array([ 0.63137759,  0.62367522,  0.3379054 ,  0.48626142,  0.36285147])&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;A few things to note there:&lt;br /&gt;&lt;ol&gt;&lt;li&gt;The product of 2 arrays with dimensions (5,1) results in an element-wise multiplication.  OK, nothing wrong here.&lt;/li&gt;&lt;li&gt;An array of dimensions (5,) multiplied with another one with dimensions (5,1) results in a (5,5) array.  I expected a (5,) to default to (5,1), but numpy's broadcasting rules align shapes starting from the trailing dimension, so a (5,) is treated as (1,5), and (1,5) times (5,1) broadcasts to (5,5).&lt;/li&gt;&lt;li&gt;Slicing the 3rd column of the array 'e' up there results in a (5,) array as opposed to a (5,1), which is what I would have expected.  Indexing with a plain integer drops that axis; slicing with e[:,2:3] keeps it.&lt;/li&gt;&lt;/ol&gt;Anyways, a simple 'reshape' can bring the array back to what you want, so you just gotta be aware of these things.&lt;br /&gt;&lt;br /&gt;Also see: &lt;a href="http://www.scipy.org/NumPy_for_Matlab_Users#head-e9a492daa18afcd86e84e07cd2824a9b1b651935"&gt;array or matrix, which should I use?&lt;/a&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/6475416749449483556-5092092724269592196?l=mechanistician.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://mechanistician.blogspot.com/feeds/5092092724269592196/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://mechanistician.blogspot.com/2009/05/numpy-gotcha.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/6475416749449483556/posts/default/5092092724269592196'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/6475416749449483556/posts/default/5092092724269592196'/><link rel='alternate' type='text/html' href='http://mechanistician.blogspot.com/2009/05/numpy-gotcha.html' title='Numpy 
gotcha!'/><author><name>Mechanistician</name><uri>http://www.blogger.com/profile/04405962471175755507</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-6475416749449483556.post-1207042341624405400</id><published>2009-05-05T06:06:00.000-07:00</published><updated>2009-05-07T09:31:12.729-07:00</updated><title type='text'>Lec12 - Expectation Maximization</title><content type='html'>The Expectation Maximization (EM) algorithm is covered in lectures &lt;a href="http://www.youtube.com/watch?v=ZZGTuAkF-Hw&amp;amp;feature=PlayList&amp;amp;p=A89DCFA6ADACE599&amp;amp;index=11"&gt;12&lt;/a&gt;, &lt;a href="http://www.youtube.com/watch?v=LBtuYU-HfUg&amp;amp;feature=PlayList&amp;amp;p=A89DCFA6ADACE599&amp;amp;index=12"&gt;13&lt;/a&gt;, and &lt;a href="http://www.youtube.com/watch?v=ey2PE5xi9-A&amp;amp;feature=PlayList&amp;amp;p=A89DCFA6ADACE599&amp;amp;index=13"&gt;14&lt;/a&gt;, from its general form to a few derived instances (mixture of gaussians, mixture of naive bayes, and factor analysis).&lt;br /&gt;&lt;br /&gt;We will illustrate &lt;a href="http://en.wikipedia.org/wiki/Unsupervised_learning"&gt;unsupervised learning&lt;/a&gt; with implementations of mixture of gaussians (an instance of EM) and k-means. 
K-means approaches the problem from a geometric perspective (at each iteration, a data point is assigned to one and only one cluster, the closest one), while mixture of gaussians does so from a probabilistic perspective (a data point belongs to all clusters, with a given probability):&lt;br /&gt;&lt;br /&gt;&lt;pre name="code" class="python"&gt;&lt;br /&gt;#em_imp.py&lt;br /&gt;from numpy import *&lt;br /&gt;import sys&lt;br /&gt;&lt;br /&gt;def gen_data(sources,k,m):&lt;br /&gt;    save3Dplot('sources',sources)&lt;br /&gt;    kdata=[zeros((m/k,2)) for i in range(k)]&lt;br /&gt;    for i in range(k):&lt;br /&gt;        for j in range(2):&lt;br /&gt;            mu=sources[i][j]['mu']&lt;br /&gt;            sd=sources[i][j]['sd']&lt;br /&gt;            kdata[i][:,j]=random.normal(mu,sd,m/k)&lt;br /&gt;        saveplot('X'+str(i),kdata[i],' ')&lt;br /&gt;&lt;br /&gt;    data=kdata[0]&lt;br /&gt;    for i in range(1,k):&lt;br /&gt;        data=concatenate((data,kdata[i]))&lt;br /&gt;&lt;br /&gt;    random.shuffle(data)&lt;br /&gt;    return data&lt;br /&gt;&lt;br /&gt;def saveplot(fname,data,sep):&lt;br /&gt;    m,n=data.shape&lt;br /&gt;    fp=file(fname,'w')&lt;br /&gt;    for i in range(m):&lt;br /&gt;        for j in range(n):&lt;br /&gt;            fp.write(str(data[i,j])+sep)&lt;br /&gt;        fp.write('\n')&lt;br /&gt;    fp.close()&lt;br /&gt;&lt;br /&gt;def plot(name,clusters,centroids,X):&lt;br /&gt;    saveplot(name+'centroids',centroids,' ')&lt;br /&gt;    k=shape(centroids)[0]&lt;br /&gt;    for i in range(k):&lt;br /&gt;        data=X.take(where(clusters==i)[0],0)&lt;br /&gt;        saveplot(name+'data'+str(i),data,' ')&lt;br /&gt;&lt;br /&gt;def dnorm(X,muvec,covmat):&lt;br /&gt;    m=shape(X)[0]&lt;br /&gt;    probs=zeros((m,1))&lt;br /&gt;    dt=linalg.det(covmat)&lt;br /&gt;    A=1./((2*pi)*sqrt(dt))&lt;br /&gt;    covinv=linalg.inv(covmat)&lt;br /&gt;    for i in range(m):&lt;br /&gt;        t=X[i]-muvec&lt;br /&gt;        
temp=-.5*dot(dot(t,covinv),transpose(t))&lt;br /&gt;        probs[i,0]=A*exp(temp)&lt;br /&gt;    return probs&lt;br /&gt;&lt;br /&gt;def save3Dplot(fname,sources):&lt;br /&gt;    for src in range(len(sources)):&lt;br /&gt;        mu1=sources[src][0]['mu']&lt;br /&gt;        sd1=sources[src][0]['sd']&lt;br /&gt;        mu2=sources[src][1]['mu']&lt;br /&gt;        sd2=sources[src][1]['sd']&lt;br /&gt;        covmat=zeros((2,2))&lt;br /&gt;        covmat[0,0]=sd1**2&lt;br /&gt;        covmat[1,1]=sd2**2&lt;br /&gt;        avg=array([mu1,mu2])     &lt;br /&gt;        data=zeros((100,100))&lt;br /&gt;        &lt;br /&gt;        for i in range(100):&lt;br /&gt;            for j in range(100):&lt;br /&gt;                xvec=array([i,j])/10.&lt;br /&gt;                data[i,j]=dnorm(xvec.reshape(1,2),avg,covmat)&lt;br /&gt;        &lt;br /&gt;        saveplot(fname+str(src),data,'\n')&lt;br /&gt;&lt;br /&gt;def kmeans(X,k):&lt;br /&gt;    m=shape(X)[0]&lt;br /&gt;    n=shape(X)[1]&lt;br /&gt;    oldcentroids=zeros((k,n))&lt;br /&gt;    e=1&lt;br /&gt;&lt;br /&gt;    #pick actual points at random to be initial centroids&lt;br /&gt;    centroids=X[(random.ranf((k,1))*m).astype('int')]&lt;br /&gt;    centroids=centroids.reshape(k,n)&lt;br /&gt;    clusters=zeros((m,1),'int')&lt;br /&gt;&lt;br /&gt;    #repeat until convergence&lt;br /&gt;    while linalg.norm(oldcentroids-centroids)&gt;1e-15:&lt;br /&gt;        oldcentroids=centroids+0 #copy&lt;br /&gt;&lt;br /&gt;        #compute cluster memberships&lt;br /&gt;        for i in range(m):&lt;br /&gt;            #repeat X[i,:] k times&lt;br /&gt;            dpoints=kron(ones((k,1)),X[i,:])&lt;br /&gt;            dists=sum((dpoints-centroids)**2,1).reshape(k,1)&lt;br /&gt;            clusters[i,0]=argmin(dists)&lt;br /&gt;&lt;br /&gt;        #compute new cluster centroids&lt;br /&gt;        for i in range(k):&lt;br /&gt;            centroids[i,:]=mean(X.take(where(clusters==i)[0],0),0)&lt;br /&gt;&lt;br /&gt;        print 
'iter: %d, norm: %s'%(e,linalg.norm(oldcentroids-centroids))&lt;br /&gt;        e+=1&lt;br /&gt;&lt;br /&gt;    #compute final cluster memberships&lt;br /&gt;    for i in range(m):&lt;br /&gt;        dpoints=kron(ones((k,1)),X[i,:])&lt;br /&gt;        dists=sum((dpoints-centroids)**2,1).reshape(k,1)&lt;br /&gt;        clusters[i,0]=argmin(dists)&lt;br /&gt;    &lt;br /&gt;    return clusters,centroids&lt;br /&gt;&lt;br /&gt;def MoG(X,k,kclusters,kcentroids):&lt;br /&gt;    m=shape(X)[0]&lt;br /&gt;    n=shape(X)[1]&lt;br /&gt;    W=zeros((m,k))&lt;br /&gt;    lastll=sys.maxint&lt;br /&gt;    ll=0  &lt;br /&gt;    e=1&lt;br /&gt;    &lt;br /&gt;    #first guess parameters (using results from kmeans)&lt;br /&gt;    priors=zeros((k,1))&lt;br /&gt;    muvecs=zeros((k,n))&lt;br /&gt;    covmats=[zeros((n,n)) for i in range(k)]&lt;br /&gt;    for i in range(k):&lt;br /&gt;        priors[i,0]=len(X.take(where(kclusters==i)[0],0))/float(m)&lt;br /&gt;        covmats[i]=cov(transpose(X.take(where(kclusters==i)[0],0)))&lt;br /&gt;        muvecs[i]=kcentroids[i]&lt;br /&gt;        &lt;br /&gt;    #repeat until convergence&lt;br /&gt;    while abs(lastll-ll)&gt;1e-5 and lastll&gt;ll:&lt;br /&gt;        &lt;br /&gt;        #e-step: compute w's (probability of membership in each k)&lt;br /&gt;        for i in range(k):&lt;br /&gt;            #posterior~likelihood*prior&lt;br /&gt;            W[:,i:i+1]=dnorm(X,muvecs[i,:],covmats[i])*priors[i,0]&lt;br /&gt;        #normalize&lt;br /&gt;        W=W/W.sum(1).reshape(m,1)&lt;br /&gt;&lt;br /&gt;        #m-step: update parameters based on last membership&lt;br /&gt;        for i in range(k):&lt;br /&gt;            priors[i,0]=sum(W[:,i])/float(m)&lt;br /&gt;            muvecs[i,:]=sum(W[:,i:i+1]*X,0).reshape(1,n)/sum(W[:,i])&lt;br /&gt;            for j in range(n):&lt;br /&gt;                for l in range(n):&lt;br /&gt;                    weight=W[:,i:i+1]&lt;br /&gt;                    
diff1=(X[:,j]-muvecs[i,j]).reshape(m,1)&lt;br /&gt;                    diff2=(X[:,l]-muvecs[i,l]).reshape(m,1)&lt;br /&gt;                    covmats[i][j,l]=sum(weight*diff1*diff2)/sum(W[:,i])&lt;br /&gt;            &lt;br /&gt;        #compute log-likelihood&lt;br /&gt;        ll1=zeros((m,1))&lt;br /&gt;        ll2=W.argmax(1).reshape(m,1)&lt;br /&gt;        for i in range(m):&lt;br /&gt;            ll1[i,:]=dnorm(X[i].reshape(1,n),muvecs[ll2[i],:],covmats[ll2[i]])&lt;br /&gt;            ll2[i,0]=priors[ll2[i]]&lt;br /&gt;        ll1=(1e-15&gt;ll1).choose(ll1,1e-15)&lt;br /&gt;        ll2=(1e-15&gt;ll2).choose(ll2,1e-15)&lt;br /&gt;        lastll=ll&lt;br /&gt;        ll=sum(log(ll1)+log(ll2))   &lt;br /&gt;        print 'iter: %d, log-likelihood: %s'%(e,ll)&lt;br /&gt;        e+=1&lt;br /&gt;&lt;br /&gt;    #compute final membership&lt;br /&gt;    clusters=W.argmax(1).reshape(m,1)&lt;br /&gt;    &lt;br /&gt;    return clusters,muvecs&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;Driven with:&lt;br /&gt;&lt;br /&gt;&lt;pre name="code" class="python"&gt;&lt;br /&gt;#em_inter.py&lt;br /&gt;from em_imp import *&lt;br /&gt;import os&lt;br /&gt;&lt;br /&gt;#build your [bivariate] gaussians&lt;br /&gt;sources=[[{'mu': 7.0, 'sd': 0.7}, {'mu': 7.0, 'sd': 0.7}],&lt;br /&gt;         [{'mu': 3.0, 'sd': 0.7}, {'mu': 7.0, 'sd': 0.7}],&lt;br /&gt;         [{'mu': 5.0, 'sd': 1.0}, {'mu': 4.0, 'sd': 1.0}]]&lt;br /&gt;&lt;br /&gt;k=len(sources) #number of gaussians&lt;br /&gt;m=1500*k #number of data points&lt;br /&gt;random.seed(123456) #for repeatability&lt;br /&gt;&lt;br /&gt;#sample data from sources&lt;br /&gt;X=gen_data(sources,k,m)&lt;br /&gt;&lt;br /&gt;#save files for data plot&lt;br /&gt;saveplot('X',X,' ')&lt;br /&gt;&lt;br /&gt;print '\nkmeans:'&lt;br /&gt;&lt;br /&gt;#cluster data using k-means&lt;br /&gt;kclusters,kcentroids=kmeans(X,3)&lt;br /&gt;&lt;br /&gt;#save files for kmeans plot&lt;br /&gt;plot('k',kclusters,kcentroids,X)&lt;br /&gt;&lt;br /&gt;print 
'\nem:'&lt;br /&gt;&lt;br /&gt;#cluster data using MoG&lt;br /&gt;MoGclusters,MoGcentroids=MoG(X,3,kclusters,kcentroids)&lt;br /&gt;&lt;br /&gt;#save files for MoG plot&lt;br /&gt;plot('MoG',MoGclusters,MoGcentroids,X)&lt;br /&gt;&lt;br /&gt;print '\nkcentroids:'&lt;br /&gt;print kcentroids&lt;br /&gt;&lt;br /&gt;print '\nMoGCentroids:'&lt;br /&gt;print MoGcentroids&lt;br /&gt;&lt;br /&gt;#run gnuplot script&lt;br /&gt;os.system('gnuplot plotinst')&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;Using the following gnuplot script:&lt;br /&gt;&lt;br /&gt;&lt;pre name="code" class="python"&gt;&lt;br /&gt;#plotinst&lt;br /&gt;set term gif&lt;br /&gt;set output 'source_contours.gif' &lt;br /&gt;unset surface&lt;br /&gt;set contour&lt;br /&gt;set view 0,0,1,1                                  &lt;br /&gt;splot 'sources0','sources1','sources2'            &lt;br /&gt;set output 'data_color.gif'           &lt;br /&gt;plot 'X0' using 2:1, 'X1' using 2:1, 'X2' using 2:1&lt;br /&gt;set output 'data_nocolor.gif'                     &lt;br /&gt;plot 'X' using 2:1                                            &lt;br /&gt;set output 'sources3d.gif'&lt;br /&gt;set hidden3d&lt;br /&gt;set surface                           &lt;br /&gt;set view 60,25,1,1                     &lt;br /&gt;splot 'sources0','sources1','sources2'&lt;br /&gt;set output 'kdata_color.gif'           &lt;br /&gt;plot 'kdata0' using 2:1, 'kdata1' using 2:1, 'kdata2' using 2:1, 'kcentroids' using 2:1 pointtype 7 pointsize 2&lt;br /&gt;set output 'MoGdata_color.gif'           &lt;br /&gt;plot 'MoGdata0' using 2:1, 'MoGdata1' using 2:1, 'MoGdata2' using 2:1, 'MoGcentroids' using 2:1 pointtype 7 pointsize 2&lt;br /&gt;set output&lt;br /&gt;set term x11&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;We get the following output:&lt;br /&gt;&lt;br /&gt;&lt;pre name="code" class="python"&gt;&lt;br /&gt;kmeans:&lt;br /&gt;iter: 1, norm: 1.49829818106&lt;br /&gt;iter: 2, norm: 0.143890700032&lt;br /&gt;iter: 3, norm: 
0.0288887879026&lt;br /&gt;iter: 4, norm: 0.00466759639044&lt;br /&gt;iter: 5, norm: 0.00179109984505&lt;br /&gt;iter: 6, norm: 0.0&lt;br /&gt;&lt;br /&gt;em:&lt;br /&gt;iter: 1, log-likelihood: -165991.331942&lt;br /&gt;iter: 2, log-likelihood: -165986.870227&lt;br /&gt;&lt;br /&gt;kcentroids:&lt;br /&gt;[[ 4.96865803  3.83410557]&lt;br /&gt; [ 3.03335108  6.95460482]&lt;br /&gt; [ 6.9650538   6.94975746]]&lt;br /&gt;&lt;br /&gt;MoGCentroids:&lt;br /&gt;[[ 4.97013245  3.91434377]&lt;br /&gt; [ 3.02119885  6.98255096]&lt;br /&gt; [ 6.96649683  6.97264637]]&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;And also the following plots:&lt;br /&gt;&lt;br /&gt;&lt;ul&gt;&lt;li&gt;Contours of the bivariate gaussians sources of data to be clustered:&lt;/li&gt;&lt;/ul&gt;&lt;a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://2.bp.blogspot.com/_JBvsBkmE5OU/SgMK_qIi23I/AAAAAAAAAEc/0V_nsHA1zqs/s1600-h/source_contours.gif"&gt;&lt;img style="cursor: pointer; width: 320px; height: 240px;" src="http://2.bp.blogspot.com/_JBvsBkmE5OU/SgMK_qIi23I/AAAAAAAAAEc/0V_nsHA1zqs/s320/source_contours.gif" alt="" id="BLOGGER_PHOTO_ID_5333118472494963570" border="0" /&gt;&lt;/a&gt;&lt;ul&gt;&lt;li&gt;Data generated from the gaussians:&lt;/li&gt;&lt;/ul&gt;&lt;a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://1.bp.blogspot.com/_JBvsBkmE5OU/SgMLZCH7B3I/AAAAAAAAAEk/uJc5tM0u_iU/s1600-h/data_color.gif"&gt;&lt;img style="cursor: pointer; width: 320px; height: 240px;" src="http://1.bp.blogspot.com/_JBvsBkmE5OU/SgMLZCH7B3I/AAAAAAAAAEk/uJc5tM0u_iU/s320/data_color.gif" alt="" id="BLOGGER_PHOTO_ID_5333118908431533938" border="0" /&gt;&lt;/a&gt;&lt;ul&gt;&lt;li&gt;Data generated from the gaussians (no color):&lt;/li&gt;&lt;/ul&gt;&lt;a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://3.bp.blogspot.com/_JBvsBkmE5OU/SgMLgKbdCtI/AAAAAAAAAEs/hXyXDcvgqOM/s1600-h/data_nocolor.gif"&gt;&lt;img style="cursor: pointer; 
width: 320px; height: 240px;" src="http://3.bp.blogspot.com/_JBvsBkmE5OU/SgMLgKbdCtI/AAAAAAAAAEs/hXyXDcvgqOM/s320/data_nocolor.gif" alt="" id="BLOGGER_PHOTO_ID_5333119030920022738" border="0" /&gt;&lt;/a&gt;&lt;ul&gt;&lt;li&gt;3-D view of the gaussians:&lt;/li&gt;&lt;/ul&gt;&lt;a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://3.bp.blogspot.com/_JBvsBkmE5OU/SgMLrQEYH-I/AAAAAAAAAE0/vcbOvvetZHw/s1600-h/sources3d.gif"&gt;&lt;img style="cursor: pointer; width: 320px; height: 240px;" src="http://3.bp.blogspot.com/_JBvsBkmE5OU/SgMLrQEYH-I/AAAAAAAAAE0/vcbOvvetZHw/s320/sources3d.gif" alt="" id="BLOGGER_PHOTO_ID_5333119221412405218" border="0" /&gt;&lt;/a&gt;&lt;ul&gt;&lt;li&gt;Data as clustered by the k-means algorithm:&lt;/li&gt;&lt;/ul&gt;&lt;a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://2.bp.blogspot.com/_JBvsBkmE5OU/SgMLzPL4P2I/AAAAAAAAAE8/OxwjO7Hcmnk/s1600-h/kdata_color.gif"&gt;&lt;img style="cursor: pointer; width: 320px; height: 240px;" src="http://2.bp.blogspot.com/_JBvsBkmE5OU/SgMLzPL4P2I/AAAAAAAAAE8/OxwjO7Hcmnk/s320/kdata_color.gif" alt="" id="BLOGGER_PHOTO_ID_5333119358614388578" border="0" /&gt;&lt;/a&gt;&lt;ul&gt;&lt;li&gt;Data as clustered by the EM-mixture of gaussians algorithm (initialized with the results of k-means):&lt;/li&gt;&lt;/ul&gt;&lt;a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://2.bp.blogspot.com/_JBvsBkmE5OU/SgML6HQElyI/AAAAAAAAAFE/y0HZQaxeUcA/s1600-h/MoGdata_color.gif"&gt;&lt;img style="cursor: pointer; width: 320px; height: 240px;" src="http://2.bp.blogspot.com/_JBvsBkmE5OU/SgML6HQElyI/AAAAAAAAAFE/y0HZQaxeUcA/s320/MoGdata_color.gif" alt="" id="BLOGGER_PHOTO_ID_5333119476743575330" border="0" /&gt;&lt;/a&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/6475416749449483556-1207042341624405400?l=mechanistician.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link 
rel='replies' type='application/atom+xml' href='http://mechanistician.blogspot.com/feeds/1207042341624405400/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://mechanistician.blogspot.com/2009/05/lec12-expectation-maximization.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/6475416749449483556/posts/default/1207042341624405400'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/6475416749449483556/posts/default/1207042341624405400'/><link rel='alternate' type='text/html' href='http://mechanistician.blogspot.com/2009/05/lec12-expectation-maximization.html' title='Lec12 - Expectation Maximization'/><author><name>Mechanistician</name><uri>http://www.blogger.com/profile/04405962471175755507</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' url='http://2.bp.blogspot.com/_JBvsBkmE5OU/SgMK_qIi23I/AAAAAAAAAEc/0V_nsHA1zqs/s72-c/source_contours.gif' height='72' width='72'/><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-6475416749449483556.post-2445343866075424088</id><published>2009-04-30T16:22:00.000-07:00</published><updated>2009-05-03T15:07:33.898-07:00</updated><title type='text'>Bayesian Linear Regression</title><content type='html'>Let's illustrate bayesian linear regression with the dataset we have been using so far to test our &lt;a href="http://mechanistician.blogspot.com/2009/03/lec3-locally-weighted-regression.html"&gt;previous implementations&lt;/a&gt;, described &lt;a href="http://mechanistician.blogspot.com/2009/02/lecture-2-linear-regression-part-1.html"&gt;here&lt;/a&gt;.&lt;br /&gt;&lt;br /&gt;A straightforward linear model fit using R would look like this:&lt;br /&gt;&lt;br /&gt;&lt;pre name="code" 
class="python"&gt;&lt;br /&gt;&gt; data&lt;-read.table("housing_data")&lt;br /&gt;&gt; dim(data)&lt;br /&gt;[1] 505  14&lt;br /&gt;&gt; set.seed(123)&lt;br /&gt;&gt; names(data)&lt;br /&gt; [1] "V1"  "V2"  "V3"  "V4"  "V5"  "V6"  "V7"  "V8"  "V9"  "V10" "V11" "V12"&lt;br /&gt;[13] "V13" "V14"&lt;br /&gt;&gt; train&lt;-sample(1:505,2/3*505)&lt;br /&gt;&gt; fit&lt;-lm(V14~.,data,subset=train,x=TRUE,y=TRUE)&lt;br /&gt;&gt; testpred&lt;-predict(fit,data[-train,])&lt;br /&gt;&gt; mse&lt;-sum((testpred-data[-train,]$V14)^2)/length(testpred)&lt;br /&gt;&gt; mse&lt;br /&gt;[1] 21.54311&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;A quick inspection of the 'lm' command reminds us that R fits the model using the &lt;a href="http://www.alkires.com/teaching/ee103/Rec8_LLSAndQRFactorization.htm"&gt;QR factorization method&lt;/a&gt; (the reasons for which are mentioned in our previous discussion on the &lt;a href="http://mechanistician.blogspot.com/2009/02/lec2-linear-regression-normal-equations.html"&gt;normal equations&lt;/a&gt;).  In addition to these geometric and algebraic perspectives, we have seen that we can find the best parameter vector by framing the problem as a convex optimization, where we minimize the mean squared error function.  We have also seen how this mean squared error function is justified under a maximum likelihood perspective if we make certain plausible probabilistic assumptions about the distribution of our data.  Furthermore, we have seen that placing a prior distribution on our parameter vector within the maximum likelihood perspective adds a regularization term to the score function, which makes our algorithm more resilient to overfitting (particularly if fitting polynomials, rather than just a straight line) and brings us closer to a full bayesian approach.  
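&lt;br /&gt;&lt;br /&gt;To make the regularization point concrete, here is a minimal sketch for a single feature with no intercept.  It assumes a zero-mean gaussian prior on the weight w, in which case the MAP estimate minimizes sum((y-w*x)^2)+lam*w^2 and has the closed form w=sum(x*y)/(sum(x*x)+lam).  The toy data, the names fit_ml and fit_map, and the values of lam are all made up for illustration (Python 3, standard library only):&lt;br /&gt;&lt;br /&gt;

```python
def fit_map(xs, ys, lam):
    # MAP slope for y = w*x + noise under a zero-mean gaussian prior on w;
    # lam plays the role of (noise variance / prior variance)
    sxy = sum(x * y for x, y in zip(xs, ys))
    sxx = sum(x * x for x in xs)
    return sxy / (sxx + lam)

def fit_ml(xs, ys):
    # maximum likelihood (plain least squares) is the lam = 0 special case
    return fit_map(xs, ys, lam=0.0)

# hypothetical toy data, roughly y = 2x
xs = [0.5, 1.0, 1.5, 2.0]
ys = [1.1, 1.9, 3.2, 3.9]

w_ml = fit_ml(xs, ys)
for lam in (0.0, 1.0, 10.0):
    # a larger lam means a tighter prior and a slope pulled closer to zero
    print(lam, fit_map(xs, ys, lam))
```

With lam=0 the prior is flat and we recover the maximum likelihood slope; as lam grows, the estimate shrinks toward zero.  This is still a point estimate, though: it keeps only the mode of the posterior over w and discards the rest of the distribution.&lt;br /&gt;&lt;br /&gt;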
Today we examine what a full bayesian approach would look like.&lt;br /&gt;&lt;br /&gt;Let's start with the code, picking up the same session where we left off above:&lt;br /&gt;&lt;br /&gt;&lt;pre name="code" class="python"&gt;&lt;br /&gt;&gt; library(LearnBayes)&lt;br /&gt;&gt; theta.sample&lt;-blinreg(fit$y,fit$x,5000)&lt;br /&gt;&gt; predmat&lt;-data[-train,1:13]&lt;br /&gt;&gt; dim(predmat)&lt;br /&gt;[1] 169  13&lt;br /&gt;&gt; predmat&lt;-cbind(matrix(1,169,1),predmat)&lt;br /&gt;&gt; dim(predmat)&lt;br /&gt;[1] 169  14&lt;br /&gt;&gt; predmat&lt;-as.matrix(predmat)&lt;br /&gt;&gt; mean.draws&lt;-blinregpred(predmat,theta.sample)&lt;br /&gt;&gt; dim(mean.draws)&lt;br /&gt;[1] 5000  169&lt;br /&gt;&gt; bpredtest&lt;-apply(mean.draws,MARGIN=2,FUN=mean)&lt;br /&gt;&gt; length(bpredtest)&lt;br /&gt;[1] 169&lt;br /&gt;&gt; bmse&lt;-sum((bpredtest-data[-train,]$V14)^2)/length(bpredtest)&lt;br /&gt;&gt; bmse&lt;br /&gt;[1] 21.53147&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;The package 'LearnBayes' is a companion to Jim Albert's &lt;a href="http://bayes.bgsu.edu/bcwr/"&gt;Bayesian Computation with R&lt;/a&gt; (check it out for more details), and can be downloaded directly from R using the 'install.packages("LearnBayes")' command.&lt;br /&gt;&lt;br /&gt;The results obtained aren't that much different from what we got by using 'lm', but that's not really the point.  What we are interested in here is the computational model enabled by the bayesian method which, although not a clear winner in the task of performing linear regression, can scale in ways that allow a much richer modeling of the interaction among several variables (think bayesian networks).  Linear regression is just a familiar way to explore this computational model.  Let's distill some of the magic going on behind the calls to 'blinreg' and 'blinregpred'.  
Before we do so, you might wanna check out Jim Albert's own blog entries on the advantages of using bayesian regression over frequentist regression (&lt;a href="http://learnbayes.blogspot.com/2007/11/bayesian-regression.html"&gt;here&lt;/a&gt;, then &lt;a href="http://learnbayes.blogspot.com/2007/11/bayesian-model-selection.html"&gt;here&lt;/a&gt;).&lt;br /&gt;&lt;br /&gt;From the help docs (in R, use '?blinreg' at the prompt), blinreg "gives a simulated sample from the joint posterior distribution of the regression vector and the error standard deviation for a linear regression model with a noninformative prior."  At this point, we are working with the parameters of our model, which under the probabilistic interpretation of linear regression are the coefficients of the weight vector and the standard deviation.  Remember from &lt;a href="http://www.youtube.com/watch?v=HZ4cvaztQEs&amp;amp;feature=PlayList&amp;amp;p=A89DCFA6ADACE599&amp;amp;index=2"&gt;lecture 3&lt;/a&gt; that we model the distribution of our dependent variable as:&lt;br /&gt;&lt;ul&gt;&lt;li&gt;p(y|&lt;u&gt;x&lt;/u&gt;,&lt;u&gt;w&lt;/u&gt;,sd) ~ N(mu=&lt;u&gt;w&lt;/u&gt;&lt;sup&gt;T&lt;/sup&gt;&lt;u&gt;x&lt;/u&gt;,var=sd^2)&lt;/li&gt;&lt;/ul&gt;Where:&lt;br /&gt;&lt;ul&gt;&lt;li&gt;'y' is the dependent variable&lt;/li&gt;&lt;li&gt;'&lt;u&gt;x&lt;/u&gt;' is the feature vector&lt;/li&gt;&lt;li&gt;&lt;u&gt;w&lt;/u&gt; is the weight vector&lt;/li&gt;&lt;li&gt;N(mu=,var=) represents a gaussian parameterized by its mean (mu) and its variance (var)&lt;br /&gt;&lt;/li&gt;&lt;/ul&gt;Also remember that in a bayesian approach, we define a prior distribution over our parameters, which we then condition under the likelihood function to obtain an updated distribution of the parameters, the poster
