<?xml version='1.0' encoding='UTF-8'?><?xml-stylesheet href="http://www.blogger.com/styles/atom.css" type="text/css"?><feed xmlns='http://www.w3.org/2005/Atom' xmlns:openSearch='http://a9.com/-/spec/opensearchrss/1.0/' xmlns:georss='http://www.georss.org/georss' xmlns:gd='http://schemas.google.com/g/2005' xmlns:thr='http://purl.org/syndication/thread/1.0'><id>tag:blogger.com,1999:blog-2293611840857369720</id><updated>2012-05-21T10:19:23.397+04:00</updated><category term='Python'/><category term='graphical models'/><category term='CVPR'/><category term='Visual Studio'/><category term='GML'/><category term='education'/><category term='computer graphics'/><category term='Microsoft'/><category term='NetCracker'/><category term='web-site'/><category term='CBIR'/><category term='PASCAL VOC'/><category term='ICCV'/><category term='comics'/><category term='Vision 101'/><category term='OpenCV'/><category term='low-level'/><category term='recognition'/><category term='art'/><category term='CMU'/><category term='parsing'/><category term='conference'/><category term='MCMC'/><category term='Cambridge'/><category term='C++'/><category term='MRF'/><category term='ICVSS'/><category term='job'/><category term='augmented reality'/><category term='quantum mechanics'/><category term='AI'/><category term='GraphiCon'/><category term='CERN'/><category term='structured learning'/><category term='video'/><category term='labelling'/><category term='Android'/><category term='science'/><category term='paper'/><category term='computational complexity'/><category term='IEEE'/><category term='Internet'/><category term='optical flow'/><category term='robotics'/><category term='tutorial'/><category term='FOSS'/><category term='humour'/><category term='ECCV'/><category term='music'/><category term='F#'/><category term='philosophy'/><category term='Java'/><category term='blog'/><category term='Google'/><category term='industry'/><category term='life'/><category term='C#'/><category term='movie'/><category term='lecture'/><category term='Stanford'/><category term='computer vision'/><category term='MATLAB'/><category term='software'/><category term='summer school'/><category term='laserscanning'/><category term='coding'/><category term='history'/><category term='Russia'/><category term='testing'/><category term='machine learning'/><category term='reblog'/><category term='academic'/><category term='particle physics'/><category term='LaTeX'/><category term='MSU'/><title type='text'>Computer Blindness</title><subtitle type='html'>and Computer Blondness</subtitle><link rel='http://schemas.google.com/g/2005#feed' type='application/atom+xml' href='http://computerblindness.blogspot.com/feeds/posts/default'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/2293611840857369720/posts/default'/><link rel='alternate' type='text/html' href='http://computerblindness.blogspot.com/'/><link rel='hub' href='http://pubsubhubbub.appspot.com/'/><link rel='next' type='application/atom+xml' href='http://www.blogger.com/feeds/2293611840857369720/posts/default?start-index=26&amp;max-results=25'/><author><name>Roman Shapovalov</name><uri>https://profiles.google.com/103110049533984878325</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' src='//lh6.googleusercontent.com/-8d0JtBNGM_o/AAAAAAAAAAI/AAAAAAAAAAA/Vc7QubaXpjc/s512-c/photo.jpg'/></author><generator version='7.00' uri='http://www.blogger.com'>Blogger</generator><openSearch:totalResults>41</openSearch:totalResults><openSearch:startIndex>1</openSearch:startIndex><openSearch:itemsPerPage>25</openSearch:itemsPerPage><entry><id>tag:blogger.com,1999:blog-2293611840857369720.post-3741524789219382099</id><published>2012-04-16T02:42:00.002+04:00</published><updated>2012-04-16T02:42:20.888+04:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='Stanford'/><category scheme='http://www.blogger.com/atom/ns#' term='graphical models'/><category scheme='http://www.blogger.com/atom/ns#' term='education'/><category scheme='http://www.blogger.com/atom/ns#' term='lecture'/><category scheme='http://www.blogger.com/atom/ns#' term='coding'/><title type='text'>PGM-class and MRF parameter learning</title><content type='html'>&lt;div dir="ltr" style="text-align: left;" trbidi="on"&gt;&lt;div style="text-align: justify;"&gt;I’m taking Stanford CS 228 (a.k.a. &lt;a href="https://class.coursera.org/pgm/class/index"&gt;pgm-class&lt;/a&gt;) on Coursera. The class is great, I guess it provides close to the maximum one can do under the constraints of remoteness and bulkness. The thing I miss is theoretical problems, which were taken aside from the on-line version because they&amp;nbsp;could not be graded&amp;nbsp;automatically.&lt;/div&gt;&lt;div style="text-align: justify;"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div style="text-align: justify;"&gt;There is an important thing about graphical models I fully realized only recently (partly due to the class). This thing should be articulated clearly in every introductory course, but is often just mentioned, probably because lecturers consider it obvious. The thing is &lt;b&gt;there is no probabilistic meaning of MRF potentials whatsoever&lt;/b&gt;. The partition function is there not only for amenity: in contrast to Bayesian networks, there is no general way to assign potentials of an &lt;b&gt;undirected &lt;/b&gt;graphical model to avoid normalization. The loops make it impossible. The implication is one should not assign potentials by estimating frequencies of assignments to factors (possibly conditioned on features) like &lt;a href="http://shapovalov.ro/papers/Shapovalov-et-al-PCV2010.pdf"&gt;I did earlier&lt;/a&gt;. This is quite a bad heuristic because it is&amp;nbsp;susceptible to overcounting. Let me give an example.&lt;/div&gt;&lt;div style="text-align: justify;"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div style="text-align: justify;"&gt;For the third week programming assignment we needed to implement a Markov network for handwriting OCR. The unary and pairwise potentials are somewhat obvious, but there was also a task to add ternary factors. The accuracy of the pairwise model is 26%. &lt;a href="http://www.cs.elte.hu/~klao/"&gt;Mihaly Barasz&lt;/a&gt; tried to add ternary factors with values proportional to &lt;a href="http://en.wikipedia.org/wiki/Trigram"&gt;trigram&lt;/a&gt; frequencies in English, which decreased performance to 21% (&lt;a href="https://class.coursera.org/pgm/forum/thread?thread_id=876"&gt;link for those who have access&lt;/a&gt;). After removing pairwise factors, the performance rose to 38%. Why has the joint model failed? The reason is overcounting evidence: different factor types enforce the same co-occurrences,&amp;nbsp;thus creating bias towards more frequent assignments, and this shows it can be significant. Therefore, we should train models with cycles discriminatively.&amp;nbsp;&lt;/div&gt;&lt;div style="text-align: justify;"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div style="text-align: justify;"&gt;One more thought I’d like to share: graphical model design is similar to software engineering in the way that the crucial thing for the both is eliminating insignificant dependencies on the architecture design stage.&amp;nbsp;&lt;/div&gt;&lt;div style="text-align: justify;"&gt;&lt;br /&gt;&lt;/div&gt;&lt;/div&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/2293611840857369720-3741524789219382099?l=computerblindness.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://computerblindness.blogspot.com/feeds/3741524789219382099/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://computerblindness.blogspot.com/2012/04/pgm-class-and-mrf-parameter-learning.html#comment-form' title='1 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/2293611840857369720/posts/default/3741524789219382099'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/2293611840857369720/posts/default/3741524789219382099'/><link rel='alternate' type='text/html' href='http://computerblindness.blogspot.com/2012/04/pgm-class-and-mrf-parameter-learning.html' title='PGM-class and MRF parameter learning'/><author><name>Roman Shapovalov</name><uri>https://profiles.google.com/103110049533984878325</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' src='//lh6.googleusercontent.com/-8d0JtBNGM_o/AAAAAAAAAAI/AAAAAAAAAAA/Vc7QubaXpjc/s512-c/photo.jpg'/></author><thr:total>1</thr:total></entry><entry><id>tag:blogger.com,1999:blog-2293611840857369720.post-8102804300226452792</id><published>2012-01-05T20:05:00.000+04:00</published><updated>2012-01-05T20:05:11.239+04:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='IEEE'/><category scheme='http://www.blogger.com/atom/ns#' term='low-level'/><category scheme='http://www.blogger.com/atom/ns#' term='C++'/><category scheme='http://www.blogger.com/atom/ns#' term='coding'/><title type='text'>When and why it is safe to cast floor/ceil result to integer</title><content type='html'>&lt;div dir="ltr" style="text-align: left;" trbidi="on"&gt;&lt;div style="text-align: justify;"&gt;Happy new year, ladies and gents! I am back, and the following couple of posts are going to be programming-related.&lt;/div&gt;&lt;div style="text-align: justify;"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div style="text-align: justify;"&gt;The C/C++ standard library declares &lt;a href="http://cplusplus.com/reference/clibrary/cmath/floor/"&gt;floor function&lt;/a&gt; such that it returns a floating-point number:&lt;/div&gt;&lt;pre style="color: green; font-size: 12px; padding-bottom: 0px; padding-left: 0px; padding-right: 0px; padding-top: 0px; text-align: -webkit-auto;"&gt;     double floor (      double x );&lt;br /&gt;      float floor (       float x );  // C++ only&lt;/pre&gt;&lt;div style="text-align: justify;"&gt;I used to be tortured by two questions:&lt;/div&gt;&lt;div style="text-align: justify;"&gt;&lt;br /&gt;&lt;/div&gt;&lt;ol style="text-align: left;"&gt;&lt;li style="text-align: justify;"&gt;Why would the floor function return anything other than integer?&lt;/li&gt;&lt;li style="text-align: justify;"&gt;What is the most correct way to cast the output to integer?&lt;/li&gt;&lt;/ol&gt;&lt;div style="text-align: justify;"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div style="text-align: justify;"&gt;At last I figured out the answer to both questions. Note that the rest of the post applies to the&amp;nbsp;&lt;a href="http://cplusplus.com/reference/clibrary/cmath/ceil/"&gt;ceil&lt;/a&gt;&amp;nbsp;function as well.&lt;/div&gt;&lt;div style="text-align: justify;"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div style="text-align: justify;"&gt;For the second point, I knew the common idiom was just type-casting the output:&lt;/div&gt;&lt;pre style="color: green; font-size: 12px; padding-bottom: 0px; padding-left: 0px; padding-right: 0px; padding-top: 0px; text-align: -webkit-auto;"&gt;&lt;/pre&gt;&lt;pre style="color: green; font-size: 12px; padding-bottom: 0px; padding-left: 0px; padding-right: 0px; padding-top: 0px; text-align: -webkit-auto;"&gt;double a;&lt;/pre&gt;&lt;pre style="color: green; font-size: 12px; padding-bottom: 0px; padding-left: 0px; padding-right: 0px; padding-top: 0px; text-align: -webkit-auto;"&gt;int intRes = int(floor(a));  // use static_cast if you don't like brevity&lt;/pre&gt;&lt;div style="text-align: justify;"&gt;If you’ve heard &lt;a href="http://www.exploringbinary.com/the-four-stages-of-floating-point-competence/"&gt;anything&lt;/a&gt; about peculiarities of floating-point arithmetic, you might start worrying that floor might return not exact integer value but &amp;nbsp;$\lfloor a \rfloor - \epsilon$, so that type-casting is incorrect (assume $a$&amp;nbsp;is positive). However, this is not the case. Consider the following statement.&lt;/div&gt;&lt;div style="text-align: justify;"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div style="text-align: justify;"&gt;If $a$ is not integer and is representable as &lt;a href="http://en.wikipedia.org/wiki/IEEE_754-2008"&gt;IEEE-754&lt;/a&gt; floating-point number, than both $\lfloor a \rfloor$ and&amp;nbsp;$\lceil a \rceil$ are representable within that domain. This means that for any float/double value floor can return a number that is integer and fits in float/double format.&lt;/div&gt;&lt;div style="text-align: justify;"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div style="text-align: justify;"&gt;&lt;span style="font-size: x-small;"&gt;The proof is easy but requires understanding of &lt;a href="http://en.wikipedia.org/wiki/Floating_point"&gt;representation of floating-point numbers&lt;/a&gt;. Suppose $a = c \times b^q, c &amp;nbsp;\in [0.1, 1)$ has $k$ significant digits, i.e. $k$-th digit after the point in $c$ is not null, but all the further ones are. Since it is representable, $k$ is less then the maximum width of&amp;nbsp;significand. Since $a$ is not integer, both&amp;nbsp;$\lfloor a \rfloor$ and&amp;nbsp;$\lceil a \rceil$ have significands with the number of significant digits less then $k$. Rounding however might increase the order of magnitude but only by one. So, the rounded number’s significand fits in&amp;nbsp;$k$ digits, Q.E.D.&lt;/span&gt;&lt;/div&gt;&lt;div style="text-align: justify;"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div style="text-align: justify;"&gt;However, this does not mean one can always type-cast the output safely, since, for example, not every integer stored in double can be represented as 4-byte int (and this is the answer to the question #1). Double precision numbers have 52 digit significands, so any integer number up to $2^{52}$ can be stored exactly. For the int type, it is only $2^{31}$. So if there is possibility of overflow, check before the cast or use the int64 type.&lt;/div&gt;&lt;div style="text-align: justify;"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div style="text-align: justify;"&gt;On the other hand, a lot of int’s cannot be represented as floats (also true for int64 and double). Consider the following code:&lt;/div&gt;&lt;pre style="color: green; font-size: 12px; padding-bottom: 0px; padding-left: 0px; padding-right: 0px; padding-top: 0px; text-align: -webkit-auto;"&gt;std::cout &amp;lt;&amp;lt; std::showbase &amp;lt;&amp;lt; std::hex&amp;nbsp;&lt;/pre&gt;&lt;pre style="color: green; font-size: 12px; padding-bottom: 0px; padding-left: 0px; padding-right: 0px; padding-top: 0px; text-align: -webkit-auto;"&gt;      &amp;lt;&amp;lt; int(float(1 &amp;lt;&amp;lt; 23)) &amp;lt;&amp;lt; " " &amp;lt;&amp;lt; int(float(1 &amp;lt;&amp;lt; 24)) &amp;lt;&amp;lt; std::endl&lt;/pre&gt;&lt;pre style="color: green; font-size: 12px; padding-bottom: 0px; padding-left: 0px; padding-right: 0px; padding-top: 0px; text-align: -webkit-auto;"&gt;      &amp;lt;&amp;lt; int(float(1 &amp;lt;&amp;lt; 23) + 1) &amp;lt;&amp;lt; " " &amp;lt;&amp;lt; int(float(1 &amp;lt;&amp;lt; 24) + 1) &amp;lt;&amp;lt; std::endl;&lt;br /&gt;&lt;/pre&gt;The output is:&lt;br /&gt;&lt;pre style="color: green; font-size: 12px; padding-bottom: 0px; padding-left: 0px; padding-right: 0px; padding-top: 0px; text-align: -webkit-auto;"&gt;&lt;/pre&gt;&lt;pre style="color: green; font-size: 12px; padding-bottom: 0px; padding-left: 0px; padding-right: 0px; padding-top: 0px; text-align: -webkit-auto;"&gt;0x800000 0x1000000&lt;br /&gt;0x800001 0x1000000&lt;br /&gt;&lt;/pre&gt;&lt;pre style="color: green; font-size: 12px; padding-bottom: 0px; padding-left: 0px; padding-right: 0px; padding-top: 0px; text-align: -webkit-auto;"&gt;&lt;/pre&gt;&lt;div style="text-align: justify;"&gt;$2^{24}+1$ cannot be represented as float exactly (it is represented as&amp;nbsp;$2^{24}$), so one should not use the float version of floor for numbers greater than few millions.&lt;/div&gt;&lt;div style="text-align: justify;"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div style="text-align: justify;"&gt;There are &lt;a href="http://www.exploringbinary.com/why-powers-of-ten-up-to-10-to-the-22-are-exact-as-doubles/"&gt;classes of numbers&lt;/a&gt; that are guaranteed to be represented exactly in floating point format. See also &lt;a href="http://docs.oracle.com/cd/E19957-01/806-3568/ncg_goldberg.html"&gt;the comprehensive tutorial&lt;/a&gt; on floating-point arithmetic by David Goldberg.&lt;/div&gt;&lt;/div&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/2293611840857369720-8102804300226452792?l=computerblindness.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://computerblindness.blogspot.com/feeds/8102804300226452792/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://computerblindness.blogspot.com/2012/01/when-and-why-it-is-safe-to-cast.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/2293611840857369720/posts/default/8102804300226452792'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/2293611840857369720/posts/default/8102804300226452792'/><link rel='alternate' type='text/html' href='http://computerblindness.blogspot.com/2012/01/when-and-why-it-is-safe-to-cast.html' title='When and why it is safe to cast floor/ceil result to integer'/><author><name>Roman Shapovalov</name><uri>https://profiles.google.com/103110049533984878325</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' src='//lh6.googleusercontent.com/-8d0JtBNGM_o/AAAAAAAAAAI/AAAAAAAAAAA/Vc7QubaXpjc/s512-c/photo.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-2293611840857369720.post-7158874029262721435</id><published>2011-11-25T20:54:00.001+04:00</published><updated>2011-11-25T21:51:32.699+04:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='computer vision'/><category scheme='http://www.blogger.com/atom/ns#' term='GML'/><category scheme='http://www.blogger.com/atom/ns#' term='summer school'/><category scheme='http://www.blogger.com/atom/ns#' term='Microsoft'/><category scheme='http://www.blogger.com/atom/ns#' term='Russia'/><category scheme='http://www.blogger.com/atom/ns#' term='Cambridge'/><title type='text'>Kinect Apps Challenge</title><content type='html'>&lt;div dir="ltr" style="text-align: left;" trbidi="on"&gt;&lt;div style="text-align: justify;"&gt;&lt;a href="http://graphics.cs.msu.ru/en"&gt;Graphics &amp;amp; Media Lab&lt;/a&gt; and &lt;a href="http://research.microsoft.com/en-us/labs/cambridge/default.aspx"&gt;Microsoft Research Cambridge&lt;/a&gt; have announced a contest for applications that use Kinect sensor. The authors of the five brightest apps will be funded to attend the 2012 Microsoft Research PhD summer school. I blogged about the last year's event in &lt;a href="http://computerblindness.blogspot.com/2011/08/all-summer-schools.html"&gt;my previous post&lt;/a&gt;. Details of the contest are &lt;a href="http://summerschool2011.graphicon.ru/en/contest"&gt;here&lt;/a&gt;.&lt;/div&gt;&lt;div style="text-align: justify;"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div style="text-align: justify;"&gt;I guess there's no need to explain what &lt;a href="http://en.wikipedia.org/wiki/Kinect"&gt;Kinect&lt;/a&gt;&amp;nbsp;is. It is extremely successful combined colour and depth sensor by Microsoft. It is being distributed for killer price (≈$150) as an add-on for Xbox, although it is hard to imagine the range&amp;nbsp;of possible applications. Controlling a computer only by gestures is considered as a primer of &lt;a href="http://en.wikipedia.org/wiki/Natural_user_interface"&gt;natural user interface&lt;/a&gt;.&amp;nbsp;To name some applications beyond gaming, this kind of NUI is useful to help surgeons to keep their hands clean:&lt;/div&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;object class="BLOGGER-youtube-video" classid="clsid:D27CDB6E-AE6D-11cf-96B8-444553540000" codebase="http://download.macromedia.com/pub/shockwave/cabs/flash/swflash.cab#version=6,0,40,0" data-thumbnail-src="http://1.gvt0.com/vi/f5Ep3oqicVU/0.jpg" height="266" width="320"&gt;&lt;param name="movie" value="http://www.youtube.com/v/f5Ep3oqicVU&amp;fs=1&amp;source=uds" /&gt; &lt;param name="bgcolor" value="#FFFFFF" /&gt; &lt;embed width="320" height="266"  src="http://www.youtube.com/v/f5Ep3oqicVU&amp;fs=1&amp;source=uds" type="application/x-shockwave-flash"&gt;&lt;/embed&gt;&lt;/object&gt;&lt;/div&gt;&lt;div style="text-align: justify;"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div style="text-align: justify;"&gt;Kinect helps blind people to navigate through buildings:&lt;/div&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;object class="BLOGGER-youtube-video" classid="clsid:D27CDB6E-AE6D-11cf-96B8-444553540000" codebase="http://download.macromedia.com/pub/shockwave/cabs/flash/swflash.cab#version=6,0,40,0" data-thumbnail-src="http://0.gvt0.com/vi/l6QY-eb6NoQ/0.jpg" height="266" width="320"&gt;&lt;param name="movie" value="http://www.youtube.com/v/l6QY-eb6NoQ&amp;fs=1&amp;source=uds" /&gt; &lt;param name="bgcolor" value="#FFFFFF" /&gt; &lt;embed width="320" height="266"  src="http://www.youtube.com/v/l6QY-eb6NoQ&amp;fs=1&amp;source=uds" type="application/x-shockwave-flash"&gt;&lt;/embed&gt;&lt;/object&gt;&lt;/div&gt;&lt;div style="text-align: justify;"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div style="text-align: justify;"&gt;For more ideas look at the winners of &lt;a href="http://kinecthacks.net/winners-of-openni-2011-developer-challenge/"&gt;OpenNI challenge&lt;/a&gt;.&lt;/div&gt;&lt;div style="text-align: justify;"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div style="text-align: justify;"&gt;OpenNI is an open-source alternative to Kinect SDK. Its strong point is it is integrated with&amp;nbsp;&lt;a href="http://pointclouds.org/"&gt;PCL&lt;/a&gt;, but beware that you cannot use it for the contest. Only &lt;a href="http://kinectforwindows.org/"&gt;Kinect for Windows&lt;/a&gt; SDK is allowed.&lt;/div&gt;&lt;div style="text-align: justify;"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div style="text-align: justify;"&gt;Kinect is probably the most successful &lt;i&gt;commercial &lt;/i&gt;outcome from the Microsoft Research lab. MSR is unique because they do a lot of theoretical research there, and it is unclear if it is profitable for Microsoft to fund it. But the projects like Kinect reveal the doubts. I will post about MSR organization in comparison to other industrial labs in one of the later posts.&lt;/div&gt;&lt;/div&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/2293611840857369720-7158874029262721435?l=computerblindness.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://computerblindness.blogspot.com/feeds/7158874029262721435/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://computerblindness.blogspot.com/2011/11/kinect-apps-challenge.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/2293611840857369720/posts/default/7158874029262721435'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/2293611840857369720/posts/default/7158874029262721435'/><link rel='alternate' type='text/html' href='http://computerblindness.blogspot.com/2011/11/kinect-apps-challenge.html' title='Kinect Apps Challenge'/><author><name>Roman Shapovalov</name><uri>https://profiles.google.com/103110049533984878325</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' src='//lh6.googleusercontent.com/-8d0JtBNGM_o/AAAAAAAAAAI/AAAAAAAAAAA/Vc7QubaXpjc/s512-c/photo.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-2293611840857369720.post-3630196890901289843</id><published>2011-08-16T03:35:00.005+04:00</published><updated>2011-08-16T03:40:04.876+04:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='blog'/><category scheme='http://www.blogger.com/atom/ns#' term='MATLAB'/><category scheme='http://www.blogger.com/atom/ns#' term='PASCAL VOC'/><category scheme='http://www.blogger.com/atom/ns#' term='ICVSS'/><category scheme='http://www.blogger.com/atom/ns#' term='computer vision'/><category scheme='http://www.blogger.com/atom/ns#' term='GML'/><category scheme='http://www.blogger.com/atom/ns#' term='F#'/><category scheme='http://www.blogger.com/atom/ns#' term='life'/><category scheme='http://www.blogger.com/atom/ns#' term='summer school'/><category scheme='http://www.blogger.com/atom/ns#' term='machine learning'/><category scheme='http://www.blogger.com/atom/ns#' term='Microsoft'/><category scheme='http://www.blogger.com/atom/ns#' term='MSU'/><category scheme='http://www.blogger.com/atom/ns#' term='movie'/><category scheme='http://www.blogger.com/atom/ns#' term='Cambridge'/><category scheme='http://www.blogger.com/atom/ns#' term='ICCV'/><category scheme='http://www.blogger.com/atom/ns#' term='structured learning'/><category scheme='http://www.blogger.com/atom/ns#' term='OpenCV'/><title type='text'>All the summer schools</title><content type='html'>&lt;div style="text-align: justify;"&gt;This summer I attended two conferences (&lt;a href="http://www.3dimpvt.org/"&gt;3DIMPVT&lt;/a&gt;, &lt;a href="http://emmcvpr11.csd.uwo.ca/"&gt;EMMCVPR&lt;/a&gt;) and two summer schools. I know my latency is somewhat annoying, but it's better to review them now then never. :) This post is about the summer schools, and the following is going to be about the conferences.&lt;/div&gt;&lt;div style="text-align: justify;"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div style="text-align: justify;"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div style="text-align: justify;"&gt;&lt;b&gt;&lt;span class="Apple-style-span"&gt;PhD Summer School in Cambridge&lt;/span&gt;&lt;/b&gt;&lt;/div&gt;&lt;div style="text-align: justify;"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div style="text-align: justify;"&gt;Both schools were organized by &lt;a href="http://research.microsoft.com/en-us/"&gt;Microsoft Research&lt;/a&gt;. The first one, &lt;a href="http://research.microsoft.com/en-us/events/2011summerschool/"&gt;PhD Summer School&lt;/a&gt; was in Cambridge, UK. The lectures covered some general issues for computer science PhD students (like using cloud computing for research and career perspectives) as well as some recent technical results by Microsoft Research. From the computer vision side, there were several talks:&lt;/div&gt;&lt;div style="text-align: justify;"&gt;&lt;ul&gt;&lt;li&gt;&lt;a href="http://research.microsoft.com/en-us/people/antcrim/"&gt;Antonio Criminisi&lt;/a&gt; described their &lt;a href="http://research.microsoft.com/en-us/projects/MedicalImageAnalysis/"&gt;InnerEye&lt;/a&gt; system for retrieval of similar body part scans, which is useful for diagnosis based on similar cases' medical history. He also featured the basics of Random Forests as an advertisement to his &lt;a href="http://research.microsoft.com/en-us/people/antcrim/decisionforeststutorial.aspx"&gt;ICCV 2011 tutorial&lt;/a&gt;. The new thing was using peculiar weak classifiers (like 2nd order separation surfaces). Antonio argued they perform much better then trees in some cases.&lt;/li&gt;&lt;li&gt;&lt;a href="http://research.microsoft.com/en-us/um/people/awf/"&gt;Andrew Fitzgibbon&lt;/a&gt; gave a brilliant lecture about pose estimation for &lt;a href="http://en.wikipedia.org/wiki/Kinect"&gt;Kinect&lt;/a&gt; (MSR Cambridge is really proud of that algorithm [&lt;a href="http://research.microsoft.com/pubs/145347/BodyPartRecognition.pdf"&gt;Shotton, 2011&lt;/a&gt;], this is the topic for another post).&lt;/li&gt;&lt;li&gt;&lt;a href="http://graphics.cs.msu.ru/people/staff/obarinova"&gt;Olga Barinova&lt;/a&gt; talked about the modern methods of image analysis and her work for the past 2 years (graphical models for &lt;a href="http://graphics.cs.msu.ru/en/science/research/machinelearning/hough"&gt;non-maxima suppression for object detection&lt;/a&gt; and &lt;a href="http://graphics.cs.msu.ru/en/science/research/3dreconstruction/geometricparsing"&gt;urban scene parsing&lt;/a&gt;). &lt;/li&gt;&lt;/ul&gt;The other great talks were about &lt;a href="http://www.netmf.com/gadgeteer/"&gt;.NET Gadgeteer&lt;/a&gt;, the system for modelling and even deployment of electronic gadgets (yes, hardware!), and &lt;a href="http://research.microsoft.com/en-us/um/cambridge/projects/fsharp/"&gt;F#&lt;/a&gt;, Microsoft's alternative to &lt;a href="http://en.wikipedia.org/wiki/Scala_(programming_language)"&gt;Scala&lt;/a&gt;, the language that combines object-oriented paradigm with functional. &lt;a href="http://en.wikipedia.org/wiki/Tony_Hoare"&gt;Sir Tony Hoare&lt;/a&gt; also gave a lecture, so I had a chance to ask him how he ended up in Moscow State University in the 60s. It turns out he studied statistics, and &lt;a href="http://en.wikipedia.org/wiki/Andrey_Nikolayevich_Kolmogorov"&gt;Andrey Kolmogorov&lt;/a&gt; was one of the leaders of the field that time, so that internship was a great opportunity for him. He said he had liked the time in Moscow. :) There were also magnificent lectures by &lt;a href="http://research.microsoft.com/en-us/people/simonpj/"&gt;Simon Peyton-Jones&lt;/a&gt; about giving talks and writing papers. Those advices are &lt;i&gt;the must&lt;/i&gt; for everyone who does research, you can find the slides &lt;a href="http://research.microsoft.com/en-us/um/people/simonpj/papers/giving-a-talk/giving-a-talk.htm"&gt;here&lt;/a&gt;. Slides for some of the lectures are available from &lt;a href="http://research.microsoft.com/en-us/events/2011summerschool/#Abstracts"&gt;the school page&lt;/a&gt;.&lt;/div&gt;&lt;div style="text-align: justify;"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div style="text-align: justify;"&gt;The school talks did not take all the time. Every night was occupied by some social event (go-karting, &lt;a href="http://en.wikipedia.org/wiki/Punt_(boat)"&gt;punting&lt;/a&gt; etc.) as well as unofficial after-parties in Cambridge pubs. Definitely it is the most fun school/conference I've attended so far. Karting was especially great, with the quality track, pit-stops, stats and prizes, so special thanks to Microsoft for including it to the program! &lt;/div&gt;&lt;div style="text-align: justify;"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div style="text-align: justify;"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div style="text-align: justify;"&gt;&lt;b&gt;&lt;span class="Apple-style-span"&gt;Microsoft Computer Vision School in Moscow&lt;/span&gt;&lt;/b&gt;&lt;/div&gt;&lt;div style="text-align: justify;"&gt;&lt;b&gt;&lt;span class="Apple-style-span"&gt;&lt;br /&gt;&lt;/span&gt;&lt;/b&gt;&lt;/div&gt;&lt;div style="text-align: justify;"&gt;&lt;span class="Apple-style-span"&gt;This year, &lt;a href="http://summerschool2011.graphicon.ru/en"&gt;Microsoft Research summer school&lt;/a&gt; in Russia was devoted to computer vision and organized in cooperation with our lab. The school started before its official opening with a &lt;a href="http://courses.graphicon.ru/school/mscvs2011/assign"&gt;&lt;b&gt;homework assignment&lt;/b&gt;&lt;/a&gt; we authored (I was one of four student volunteers). The task was to develop an image classification method capable to distinguish two indoor and two outdoor classes. The results were rated according to the performance on the hidden test set. Artem Konev won the challenge with 95.5% accuracy and was awarded a prize consisted of an xBox and Kinect. Two years ago we used those data for the projects on &lt;i&gt;Introduction to Computer Vision&lt;/i&gt; course, where nobody reached even 90%. It reflects not just the lore of participants, but also the progress of computer vision: all the top methods used PHOW descriptors and linear SVM with approximate decomposed &lt;i&gt;χ&lt;sup&gt;2&lt;/sup&gt;&lt;/i&gt; kernel [&lt;a href="http://www.robots.ox.ac.uk/~vgg/publications/papers/vedaldi10.pdf"&gt;Vedaldi and Zisserman, 2010&lt;/a&gt;], which were unavailable that time!&lt;/span&gt;&lt;/div&gt;&lt;div style="text-align: justify;"&gt;&lt;span class="Apple-style-span"&gt;&lt;br /&gt;&lt;/span&gt;&lt;/div&gt;&lt;div style="text-align: justify;"&gt;In fact, &lt;b&gt;&lt;a href="http://www.robots.ox.ac.uk/~az/"&gt;Andrew Zisserman&lt;/a&gt; &lt;/b&gt;was one of the speakers. Andrew is the most cited computer vision researcher and the only person whose &lt;a href="http://computerblindness.blogspot.com/2010/10/papers-citations-co-authorship-and.html"&gt;Zisserman number&lt;/a&gt; is zero. :) His course was on &lt;a href="http://summerschool2011.graphicon.ru/en/courses/az"&gt;Visual Search and Recognition&lt;/a&gt;, including instance-level and category-level recognition. The ideas that were relatively new:&lt;/div&gt;&lt;div style="text-align: justify;"&gt;&lt;div&gt;&lt;ul&gt;&lt;li&gt;when computing visual words, sometimes it is fruitful to use soft assignments to clusters, or more advanced methods like &lt;i&gt;Locality-constrained linear coding&lt;/i&gt; [&lt;a href="http://www.robots.ox.ac.uk/~vgg/rg/papers/wang_etal_CVPR10.pdf"&gt;Wang et al., 2010&lt;/a&gt;];&lt;/li&gt;&lt;li&gt;for instance-level recognition it is possible to use &lt;i&gt;query expansion&lt;/i&gt; to overcome occlusions [&lt;a href="http://marcade.robots.ox.ac.uk:8080/~vgg/publications/2007/Chum07b/chum07b.pdf"&gt;Chum et al., 2007&lt;/a&gt;]: the idea is to use the best matched images from the base as new queries;&lt;/li&gt;&lt;li&gt;object detection is traditionally done with sliding window, the problems here are: various aspect ratio, partial occlusions, multiple responses and background clutter for substantially non-convex objects;&lt;/li&gt;&lt;li&gt;for object detection use bootstrapped sequential classification: on the next stage take the false negative detections from the previous stage as negative examples and retrain the classifier;&lt;/li&gt;&lt;li&gt;&lt;i&gt;multiple kernel learning&lt;/i&gt; [&lt;a href="http://www.vision.ee.ethz.ch/publications/papers/proceedings/eth_biwi_00649.pdf"&gt;Gehler and Nowozin, 2009&lt;/a&gt;] is a hot tool that is used to find the ideal linear combination of SVM kernels: combining different features is fruitful, but learning the combination is not much better than just averaging (Lampert: “Never use MKL without comparison to simple baselines!”); &lt;/li&gt;&lt;li&gt;movies are common datasets, since there are a lot of repeated objects/people/environments, and the privacy issues are easy to overcome. The movies like &lt;i&gt;&lt;a href="http://www.imdb.com/title/tt0107048/"&gt;Groundhog Day&lt;/a&gt;&lt;/i&gt; and &lt;i&gt;&lt;a href="http://www.imdb.com/title/tt0130827/"&gt;Run Lola Run&lt;/a&gt;&lt;/i&gt; are especially good since they contain repeated episodes. You can try to find &lt;a href="http://arthur.robots.ox.ac.uk:8081/cgi-bin/shot_viewer/vg_single_image.py?movie+name=lola&amp;amp;frame+nb=25901"&gt;the clocks on Video Google Demo&lt;/a&gt;.&lt;/li&gt;&lt;/ul&gt;&lt;div&gt;Zisserman talked about &lt;a href="http://pascallin.ecs.soton.ac.uk/challenges/VOC/"&gt;PASCAL challenge&lt;/a&gt; a lot. During a break he mentioned that he annotated some images himself since “it is fun”. One problem with the challenge is we don't know if the progress over years really reflects the increased quality of methods, or is just because of growth of the training set (though, it is easy to check).&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;b&gt;&lt;a href="http://research.microsoft.com/en-us/um/people/awf/"&gt;Andrew Fitzgibbon&lt;/a&gt;&lt;/b&gt; gave two more great lectures, one about Kinect (with slightly different motivation than in Cambridge) and another about continuous optimization. He talked a lot about reconciling theory and practice:&lt;/div&gt;&lt;div&gt;&lt;ul&gt;&lt;li&gt;the life-cycle of a research project is: 1) chase the high-hanging fruit (theoretically-sound model), 2) try to make stuff really work, 3) look for the things that confuse/annoy you and fix them;&lt;/li&gt;&lt;li&gt;for Kinect pose estimation, the &lt;i&gt;good&lt;/i&gt; top-down method based on tracking did not work, so they ended up classifying body parts discriminatively, temporal smoothing is used on the late stage;&lt;/li&gt;&lt;li&gt;“don't be obsessed with theoretical guarantees: they are either weak or trivial”;&lt;/li&gt;&lt;li&gt;on the simplest optimization method: “How many people have invented [coordinate] alternation at some point of their life?”. Indeed, the method is guaranteed to converge, but the problems arise when the valleys are not axis-aligned;&lt;/li&gt;&lt;li&gt;gradient descent is not a panacea: in some cases it does small steps too, &lt;span class="Apple-style-span" style="line-height: 19px; "&gt;&lt;span class="Apple-style-span"&gt;&lt;a href="http://en.wikipedia.org/wiki/Conjugate_gradient_method"&gt;conjugate gradient method&lt;/a&gt; is better (it uses 1st order derivatives only);&lt;/span&gt;&lt;/span&gt;&lt;/li&gt;&lt;li&gt;&lt;span class="Apple-style-span" style="line-height: 19px; "&gt;&lt;span class="Apple-style-span"&gt;when possible, use second derivatives to determine step size, but estimating them is hard in general;&lt;/span&gt;&lt;/span&gt;&lt;/li&gt;&lt;li&gt;&lt;span class="Apple-style-span"&gt;&lt;span class="Apple-style-span" style="line-height: 19px;"&gt;one almost never needs to take the matrix inverse; in MATLAB, to solve the system &lt;i&gt;Hd = −g,&lt;/i&gt; use backslash: &lt;i&gt;d = −H\g&lt;/i&gt;;&lt;/span&gt;&lt;/span&gt;&lt;/li&gt;&lt;li&gt;&lt;span class="Apple-style-span"&gt;&lt;span class="Apple-style-span" style="line-height: 19px; "&gt;&lt;span class="Apple-style-span"&gt;the Friday evening method is to try MATLAB &lt;/span&gt;&lt;span class="Apple-style-span"&gt;&lt;a href="http://www.mathworks.com/help/techdoc/ref/fminsearch.html"&gt;&lt;span class="Apple-style-span"&gt;fminsearch&lt;/span&gt;&lt;/a&gt; &lt;/span&gt;&lt;span class="Apple-style-span"&gt;(implementing the derivative-free &lt;/span&gt;&lt;span class="Apple-style-span" style="line-height: normal; "&gt;&lt;span class="Apple-style-span"&gt;&lt;a href="http://en.wikipedia.org/wiki/Nelder%E2%80%93Mead_method"&gt;Nelder-Mead method&lt;/a&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="Apple-style-span"&gt;).&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/li&gt;&lt;/ul&gt;&lt;div&gt;&lt;span class="Apple-style-span"&gt;&lt;span class="Apple-style-span" style="line-height: 19px;"&gt;Dr. Fitzgibbon asked the audience what the first rule of machine learning is. I hardly helped over replying “Never talk about machine learning”, but he expected the different answer: “Always try the nearest neighbour first!” &lt;/span&gt;&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;span class="Apple-style-span"&gt;&lt;span class="Apple-style-span" style="line-height: 19px;"&gt;&lt;br /&gt;&lt;/span&gt;&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span class="Apple-style-span"&gt;&lt;span class="Apple-style-span" style="line-height: 19px;"&gt;&lt;b&gt;&lt;a href="http://pub.ist.ac.at/~chl/"&gt;Christoph Lampert&lt;/a&gt;&lt;/b&gt; gave lectures on kernel methods, and structured learning, and kernel methods for structured learning. Some notes on the kernel methods talk:&lt;/span&gt;&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;ul&gt;&lt;li&gt;&lt;span class="Apple-style-span"&gt;&lt;span class="Apple-style-span" style="line-height: 19px;"&gt;(obvious) don't rely on the error on a train set, and (less obvious) don't even report about it in your papers;&lt;/span&gt;&lt;/span&gt;&lt;/li&gt;&lt;li&gt;&lt;span class="Apple-style-span"&gt;&lt;span class="Apple-style-span" style="line-height: 19px;"&gt;for SVM kernels, in order to be legitimate, a kernel should be an inner product; it is often hard to prove it directly, but there are workarounds: a kernel can be drawn from a conditionally positive-definite matrix; sum, product and exponent of a kernel(s) is a kernel too etc. (thus, important for multiple-kernel learning, linear combination of kernels is a kernel);&lt;/span&gt;&lt;/span&gt;&lt;/li&gt;&lt;li&gt;&lt;span class="Apple-style-span"&gt;&lt;span class="Apple-style-span" style="line-height: 19px;"&gt;since training (and running) non-linear SVMs is computationally hard, explicit feature maps are popular now: try to decompose the kernel back to conventional dot product of modified features; typically the features should be transformed to infinite sums, so take first few terms &lt;span class="Apple-style-span" style="font-family: Georgia, serif; font-size: 16px; line-height: normal; "&gt;[&lt;a href="http://www.robots.ox.ac.uk/~vgg/publications/papers/vedaldi10.pdf"&gt;Vedaldi and Zisserman, 2010&lt;/a&gt;]&lt;/span&gt;;&lt;/span&gt;&lt;/span&gt;&lt;/li&gt;&lt;li&gt;&lt;span class="Apple-style-span"&gt;&lt;span class="Apple-style-span" style="line-height: 19px;"&gt;if the kernel can be expressed as a sum over vector components (e.g. &lt;span class="Apple-style-span" style="font-family: Georgia, serif; font-size: 16px; line-height: normal; "&gt;&lt;i&gt;χ&lt;sup&gt;2&lt;/sup&gt;&lt;/i&gt;&lt;/span&gt; kernel $\sum_d x_d x'_d / (x_d + x'_d)$), it is easy to decompose; &lt;i&gt;radial basis function &lt;/i&gt;(RBF) kernel ($\exp (\|x-x'\|^2 / 2\sigma^2)$) is the exponent of a sum, so it is hardly decomposable (more strict conditions are in the paper);&lt;/span&gt;&lt;/span&gt;&lt;/li&gt;&lt;li&gt;&lt;span class="Apple-style-span"&gt;&lt;span class="Apple-style-span" style="line-height: 19px;"&gt;when using RBF kernel, you have another parameter &lt;i&gt;σ&lt;/i&gt; to tune; the rule of thumb is to take &lt;i&gt;σ&lt;/i&gt;² equal to the median distance between training vectors (thus, cross-validation becomes one-dimensional).&lt;/span&gt;&lt;/span&gt;&lt;/li&gt;&lt;/ul&gt;&lt;div&gt;&lt;span class="Apple-style-span" style="line-height: 19px;"&gt;Christoph also told a motivating story why one should always use cross-validation (so just forget the previous point :). Sebastian Nowozin was working on his [&lt;a href="http://www.nowozin.net/sebastian/papers/nowozin2007actionclassification.pdf"&gt;ICCV 2007&lt;/a&gt;] paper on action classification. He used the method by &lt;a href="http://zimmer.csufresno.edu/~yucao/csci226/Projects/Project_6/Behavior%20Recognition%20via%20Sparse%20Spatio-Temporal%20Features.pdf"&gt;Dollár et al. [2005]&lt;/a&gt; as a baseline. The paper reported 80.6% accuracy on the KTH dataset. He outperformed the method by a couple of per cents and then decided to reproduce Dollár's results. Imagine his wonder when simple cross-validation (with same features and kernels) yielded 85.2%! So, Sebastian had to improve his method to beat the baseline. &lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span class="Apple-style-span" style="line-height: 19px;"&gt;&lt;br /&gt;&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span class="Apple-style-span" style="line-height: 19px;"&gt;I feel I should stop writing about the talks now since the post grows enormously long. Another Lampert's lecture and &lt;a href="http://research.microsoft.com/en-us/um/people/carrot/"&gt;Carsten Rother&lt;/a&gt;'s course on CRFs were close to my topic, so they deserve separate posts (I already reviewed basics of &lt;a href="http://computerblindness.blogspot.com/2011/04/on-structured-learning.html"&gt;structured learning&lt;/a&gt; and &lt;a href="http://computerblindness.blogspot.com/2010/05/max-product-optimization-methods-and.html"&gt;max-product optimization&lt;/a&gt; in this blog). &lt;a href="http://www.ais.uni-bonn.de/~amueller/"&gt;Andreas Müller&lt;/a&gt; &lt;a href="http://peekaboo-vision.blogspot.com/2011/08/cvml-ivan-laptev.html"&gt;blogged about&lt;/a&gt; the recent &lt;a href="http://www.irisa.fr/vista/Equipe/People/Ivan.Laptev.html"&gt;Ivan Laptev&lt;/a&gt;'s action recognition talk on CVML, which was pretty similar to ours. &lt;a href="http://summerschool2011.graphicon.ru/en/schedule"&gt;The slides are available for all MSCVS talks&lt;/a&gt;, and &lt;b&gt;videos will be shared in September&lt;/b&gt;.&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;span class="Apple-style-span" style="line-height: 19px;"&gt;&lt;br /&gt;&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span class="Apple-style-span" style="line-height: 19px;"&gt;There were also several &lt;b&gt;practical sessions&lt;/b&gt;, but I personally consider them not that useful, because one hardly ever can &lt;i&gt;feel&lt;/i&gt; the essence of a method in 1.5 hours changing the code according to some verbose instruction. It is more of an art to design such tutorials, and no one can really master it. :) Even if the task is well-designed, one may not succeed performing it due to technical reasons: during Carsten Rother's tutorial, &lt;a href="https://plus.google.com/115888484810854823852/about?hl=en"&gt;Tanya&lt;/a&gt; and me spent half an hour to spot the bug caused by confusing &lt;span class="Apple-style-span" &gt;input&lt;/span&gt; and &lt;span class="Apple-style-span"&gt;index&lt;/span&gt; variable names (MATLAB is still dynamically typed). &lt;a href="http://cmp.felk.cvut.cz/~chum/"&gt;Ondrej Chum&lt;/a&gt; once mentioned how his tutorial was doomed since half of the students did not know how to work with sparse matrices.&lt;span class="Apple-style-span"&gt; So, practical sessions are hard.&lt;/span&gt;&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span class="Apple-style-span" style="line-height: 19px;"&gt;&lt;span class="Apple-style-span"&gt;&lt;br /&gt;&lt;/span&gt;&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span class="Apple-style-span" style="line-height: 19px;"&gt;&lt;span class="Apple-style-span"&gt;There was also a &lt;b&gt;poster session&lt;/b&gt;, but I cannot remember a lot of bright works, unfortunately. &lt;a href="http://www.cvc.uab.es/personal2.asp?idioma=en&amp;amp;id=720&amp;amp;shapovalovanataliya"&gt;Nataliya Shapovalova&lt;/a&gt; who won the best poster award, presented quite interesting work on action recognition, which I liked as well (and it is not the last name bias! :) My congratulations to Natasha!&lt;/span&gt;&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span class="Apple-style-span" style="line-height: 19px;"&gt;&lt;span class="Apple-style-span"&gt;&lt;br /&gt;&lt;/span&gt;&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span class="Apple-style-span" style="line-height: 19px;"&gt;&lt;span class="Apple-style-span"&gt;The planned &lt;b&gt;social events&lt;/b&gt; were not so exhaustive as in Cambridge, but self-organization worked out. The most prominent example was our overnight walk around Moscow, in which a substantial part of school participants took part. It included catching last subway train, drinking whiskey and gin, a game of &lt;s&gt;guessing&lt;/s&gt; hallucinating names of each other, and moving a car from the tram rail to let the tram go in the morning. :) I also met some of &lt;a href="http://opencv.willowgarage.com/"&gt;OpenCV&lt;/a&gt; developers from Nizhny Novgorod there.&lt;/span&gt;&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;/div&gt;&lt;div style="text-align: justify;"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div style="text-align: justify;"&gt;MSCVS is a one-time event, unfortunately. There are at least three &lt;b&gt;annual computer vision summer schools&lt;/b&gt; in Europe: &lt;a href="http://www.dmi.unict.it/icvss/"&gt;ICVSS&lt;/a&gt; (the most mature one, I attended it &lt;a href="http://computerblindness.blogspot.com/search/label/ICVSS"&gt;last year&lt;/a&gt;), &lt;a href="http://www.di.ens.fr/willow/events/cvml2011/"&gt;CVML&lt;/a&gt; (held in France by INRIA) and &lt;a href="http://www.vision.ee.ethz.ch/summerschool2011/"&gt;VSSS&lt;/a&gt; (includes sport sessions besides the lectures, held in Zürich). If you are a PhD student in vision (especially in the beginning of your program), it is worth attending one of them each year to keep up with current trends in the vision community, especially if you don't go to the major conferences. The sets of topics (and even speakers!) have usually large intersection, so pick one of them. ICVSS has arguably the most competitive participant selection, but the application deadline and acceptance notification are in March, so one can apply to the other schools if rejected.&lt;/div&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/2293611840857369720-3630196890901289843?l=computerblindness.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://computerblindness.blogspot.com/feeds/3630196890901289843/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://computerblindness.blogspot.com/2011/08/all-summer-schools.html#comment-form' title='2 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/2293611840857369720/posts/default/3630196890901289843'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/2293611840857369720/posts/default/3630196890901289843'/><link rel='alternate' type='text/html' href='http://computerblindness.blogspot.com/2011/08/all-summer-schools.html' title='All the summer schools'/><author><name>Roman Shapovalov</name><uri>https://profiles.google.com/103110049533984878325</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' src='//lh6.googleusercontent.com/-8d0JtBNGM_o/AAAAAAAAAAI/AAAAAAAAAAA/Vc7QubaXpjc/s512-c/photo.jpg'/></author><thr:total>2</thr:total></entry><entry><id>tag:blogger.com,1999:blog-2293611840857369720.post-8894328619983165868</id><published>2011-06-25T11:27:00.004+04:00</published><updated>2011-06-25T12:32:02.758+04:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='CBIR'/><category scheme='http://www.blogger.com/atom/ns#' term='computer vision'/><category scheme='http://www.blogger.com/atom/ns#' term='music'/><category scheme='http://www.blogger.com/atom/ns#' term='reblog'/><category scheme='http://www.blogger.com/atom/ns#' term='Google'/><category scheme='http://www.blogger.com/atom/ns#' term='Internet'/><category scheme='http://www.blogger.com/atom/ns#' term='CVPR'/><category scheme='http://www.blogger.com/atom/ns#' term='web-site'/><title type='text'>Google search by image</title><content type='html'>&lt;div style="text-align: justify;"&gt;Last week Google introduced &lt;i&gt;&lt;a href="http://www.youtube.com/watch?&amp;amp;v=t99BfDnBZcI"&gt;Search by Image&lt;/a&gt;&lt;/i&gt; feature. There were a handful of web-sites that suggested content-based image retrieval in the Internet, but the quality was low, &lt;a href="http://computerblindness.blogspot.com/2010/03/web-scale-content-based-image-retrieval.html"&gt;as I blogged earlier&lt;/a&gt;. I repeated the queries &lt;a href="http://tineye.com"&gt;TinEye&lt;/a&gt; failed at, and Google image search found them both! It found &lt;a href="http://images.google.com/search?tbs=sbi:AMhZZisjgaicgygO1CzML9zQMv0h6l5RMPu8mv-Ko-pBo67FdV1HyUQGg-Csuw_1oVKZrjvHMsVU3ZEncoye9kXrjIHC9-SQGmeBnrREih40b4MTShpwIjWFz6mduatSA2MFBnBdP7j8gkfC8foLcHl8v41s3STsCywI3rQJ9ZXlV_1uxr6H1j6Cnf3N-jHsIuvb6EyZgCAZpTAGvuRAYdDwe9noW4OLZpiTSaXoOeY-ikVaTVY9XV2LBYdn8karWEm2zhNcY5axPvdBmYoujTf6o0VugqkC6A3kG-BxJ3MEuVhL_1-nIzc66_1u3RLQlIYMLWeE06jtY8mlO4g03FXYhqQAFbvP0jv7BhGfsY-nSqNsFP9kYlC4Ea0iEgGOeOID5ofUtCy8TTu7SRkxOdY8KIjV4ZQuCSb799kNrwIiuhatpdT0j11FlvYZd0dS076OoMHvd3w2fAguLk3JvTCYfJvKOulwPso2t52mrqNepTOSJdVKF2jlpkevwPq0RSK4g29BOxfJSbvB5RFmWNVBgRX1ulRiuqepseIU-i6EXCbF3E4H-zdRSh8bGnjiaIy2KQQYuoNofz16L4sFqAlRjCaaf17lSMBngxR-nuZY4Bp9NKcS5jL3ORO00Y9YG8gbxfa9QKtRcTUOJY4IDuAuJ7qOZu8VpDOkLw7Bd2te1DNMpO_1HKJ3gkbmS1oA0AugmgRYOOAOBJ7jTrC1GEiOu0D7WpoX4KcpVAeRF-XLDklY0RTPRJoI-Zi6Hzp_1zvq_168ZLo_1wWBQ1otN3mdsxLOam8N5_1rZurENOia6ckito5rETreWGrpjjdPG-Z3orRbaYIUCYY5TO7BeJpoQ1PtCnY698uqrMkWrQfCCB2QCQB4u8-5IGn9QOpHjJ_1-Vn88ER1Mzt7Sm_1pM0kvsquCrSJPvY3gsGZSyz5dxdhvUG1XYiV3Tgz0AwcJ3MNs71DC0IVB4_1H0UFhK4KDqT77LONhL_1E0ItjZP13a7DsnSbDLpyD4wHC1NTi8Lg7aza728LpKxtB_1IKGcV3deKKWot1PgJvCjbGJ6JK0x8Oc6A11XSnB_1znUnYZtGKzH2YsdGLx9wINFVcbxrx7JMO3MkYTsd6gYxs0Yh78YSgOpiQ4fwo-ENjwkHzK4yw7X6TyddBenX-W5hNpokhX1BItO7xQ0FcJDjpZwDo1uWN8Gd4MEWk9FoDGTyZZ13oDiYGrImkhIra6hgvtVpQFyQQ3JaU8syEhKECkhaIIzmT97nHlHMVYtwn0FAUoGXu-gh6BeW6yPgH3Z9wUCsHu_1CcQa7rIVtqQG47-cpJfAjMZ6eE-UHx8Fn3slx0X45AY&amp;amp;hl=en&amp;amp;biw=1166&amp;amp;bih=878&amp;amp;gbv=2"&gt;few instances of Marion Lenbach &lt;/a&gt;(one of them from this blog, which means the coverage is large!), and I finally remembered &lt;a href="http://images.google.com/search?tbs=sbi:AMhZZitnFJPZ0Ir9xs08V-1aMnmiMLJFDC9WK1sZ2cI2OSlipQ1dhdWY1L4QfV9_1k3J4wvpLzDDJfDx5INwhM_1C34yw16744c9CRaWwsy4tUTjPCSKdGDK9Xb_1Q7zXcvSOG14Ljvc8KbPoNW8IumbhTDCrBOaaRi0tC5a-7OkgHuEXpijtTCgGGk2XJu_1e4QwLvxgGw31uQIjuDRc0btU_1zZDJPRDfNxYh--MstB4xcAm1NVY6H6MH8r-9n_1qpI3O3oOFt-e6ePDoFz3bvhUZz-1fQxueAj_1MlxA0EuWHUEPcIo4AWUW8XD_12bZZP1i-PgvJ3AwCQYqT6qqgvpDAMFUHsQ3qSqAvgQba67zl7Dh1fuL_1d1BOplyvHtvtIuHb_1ZKeXmKt5KBGnlcRYWLUdYbGGSkQiG9j07x2NFiJyt92dKKT8uzydiQh4L84LW9j0Cto6EvH5lfVcldTn4ZpYNYVHcSqeRY_1VpGY9qkSBRWp3FA7qxj67XyHL6kBY7Zd3V0875BbfWUFMjBo3kQyuZ3goZbW7NYdjk_1miVzGE7qDOqHHVBQN3Pxzso_1gkfdQDzfLtewh-KtKpS7AhQJplUnzEch_19Wb28yhkd68nC9YdySr-8irvsfnhbmK96vdmZ4vwdC9Hn3bW4JnV5Z9GNWLdnCfYfPB7GwdC7pl-MESjA_1e1C_1aJ2lYe3-HSg_1PKbe1bxEmvIoc54YtWjveNcuvYbzHujTndSvjEvrLpm5RPcIigvdbK7Ra7VuSx1b_14sZ8-5KsKxLISyc-GxWXm5DWKV1wDnCPhamAuUhk2c4x6_1Imcrk-s4nNejNCW8cF9D3-yJP2XFWV4d5ybNDPH4pJDQCBAKKZPAhgnpL2dLF2g3FT3YGNUgaAz8rE_15QG06wLY2mxkBZj51ujh7EM-ITR4au_13GcgjRq4D6LEO6KV8dSFvF4MXrlPUbDPHCdIHtGDRkX1BCTuzDq4VxBhZdMhubNet5J6c6JhtL56zNAFdL1qzo-stTqxBgGkgTU8mikeoEzky_19N-Y6bWH_1eb00-p3M5LXN7eJxBz4exLQ53B9RlzqnYcM3vXDnLvoxOR_1vQkBBZTuR2fZ1oS3Xx_1n-DfxO2EuMPGbuzNwPjq5H2u6_1w0SKVHjvdifvQjaFq2IaG3HYM7wYYbD8Iim9KzlBQMOmhpuSeYl9jr03fnD44OpHurTsN5zg_1BVppd4gEdvs2G8hYUJEWkpyeiMA6Jjyez_16_1WEbkWw9Dr_10s3xxAOQTds_1rn30tlH4iYNhv8w771sv6J_1qAOuxRrCfMGgIU9C9Y4ao0_1LZ7DzpLWBaKyHkyT4mfK9lu0&amp;amp;hl=en&amp;amp;biw=1166&amp;amp;bih=878&amp;amp;gbv=2"&gt;the movie&lt;/a&gt; from the HOG paper: &lt;a href="http://www.imdb.com/title/tt0134119/"&gt;The Talanted Mr. Ripley&lt;/a&gt;.&lt;/div&gt;&lt;div style="text-align: justify;"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div style="text-align: justify;"&gt;So, from the quick glance Google finally accomplished what the others could not do for ages. Why did they succeed? There are two possible reasons: large facilities that allow building and storing large index efficiently, and a unique technology. The former is surely the case: it seems the engine indexed a large portion of the photos in the web. I cannot say anything about the technology: there is nothing about that among the &lt;a href="http://googleresearch.blogspot.com/2011/06/google-at-cvpr-2011.html"&gt;Google's CVPR papers&lt;/a&gt;, so one need to do black-box testing to see which transformations and modifications are allowed.&lt;/div&gt;&lt;div style="text-align: justify;"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div style="text-align: justify;"&gt;Google seems to expand to the areas of multimedia web, even where the niche is already occupied. Recently &lt;a href="http://googleresearch.blogspot.com/2011/06/instant-mix-for-music-beta-by-google.html"&gt;they announced&lt;/a&gt; their alternative to &lt;a href="http://grooveshark.com/"&gt;grooveshark&lt;/a&gt;, the recommendation system for &lt;a href="http://music.google.com"&gt;Music beta&lt;/a&gt; (the service is unfortunately available on invitation basis in the US only). The system is not based on &lt;a href="http://en.wikipedia.org/wiki/Collaborative_filtering"&gt;collaborative filtering&lt;/a&gt; (only), they (also) analyse the content. I planned to investigate this area too, but given that it is becoming mature and thus not so alluring. I am eager to see if the service will succeed. After all, &lt;a href="http://www.google.com/buzz"&gt;Buzz&lt;/a&gt; did not replace &lt;a href="http://twitter.com/"&gt;Twitter&lt;/a&gt;, and &lt;a href="http://www.orkut.com/"&gt;Orkut&lt;/a&gt; did not replace &lt;a href="http://www.facebook.com/"&gt;Facebook&lt;/a&gt;.&lt;/div&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/2293611840857369720-8894328619983165868?l=computerblindness.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://computerblindness.blogspot.com/feeds/8894328619983165868/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://computerblindness.blogspot.com/2011/06/google-search-by-image.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/2293611840857369720/posts/default/8894328619983165868'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/2293611840857369720/posts/default/8894328619983165868'/><link rel='alternate' type='text/html' href='http://computerblindness.blogspot.com/2011/06/google-search-by-image.html' title='Google search by image'/><author><name>Roman Shapovalov</name><uri>https://profiles.google.com/103110049533984878325</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' src='//lh6.googleusercontent.com/-8d0JtBNGM_o/AAAAAAAAAAI/AAAAAAAAAAA/Vc7QubaXpjc/s512-c/photo.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-2293611840857369720.post-5311077598598230668</id><published>2011-04-26T02:20:00.003+04:00</published><updated>2011-04-26T02:29:15.516+04:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='blog'/><category scheme='http://www.blogger.com/atom/ns#' term='graphical models'/><category scheme='http://www.blogger.com/atom/ns#' term='laserscanning'/><category scheme='http://www.blogger.com/atom/ns#' term='structured learning'/><category scheme='http://www.blogger.com/atom/ns#' term='parsing'/><category scheme='http://www.blogger.com/atom/ns#' term='lecture'/><category scheme='http://www.blogger.com/atom/ns#' term='tutorial'/><category scheme='http://www.blogger.com/atom/ns#' term='paper'/><category scheme='http://www.blogger.com/atom/ns#' term='machine learning'/><title type='text'>On structured learning</title><content type='html'>&lt;div style="text-align: justify;"&gt;This post is devoted to &lt;i&gt;structured learning&lt;/i&gt;, a novel machine learning paradigm which may be used for solving different computer vision problems. For example, finding optimal parameters of graphical models is essentially a structured learning problem. I am going to give an introductory talk on structured learning for Dmitry Vetrov's &lt;a href="http://machinelearning.ru/wiki/index.php?title=Smais"&gt;graphical models class&lt;/a&gt; this Friday at 4:20 pm, so please feel free to drop by if you are in Moscow (the talk is to be in Russian).&lt;/div&gt;&lt;div style="text-align: justify;"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div style="text-align: justify;"&gt;Structured learning is basically a very general &lt;a href="http://en.wikipedia.org/wiki/Supervised_learning"&gt;supervised learning&lt;/a&gt; setting. Consider the &lt;a href="http://en.wikipedia.org/wiki/Statistical_classification"&gt;classification&lt;/a&gt; setting as a basic problem. One needs to find parameters of a function that maps a feature vector to one of the pre-defined class labels $\mathbb{R}^m \to \{c_1, c_2, \dots, c_K\}$. The fundamental property is the classes are unordered and &lt;i&gt;orthogonal&lt;/i&gt;. The latest means the probabilities of objects classified as different labels are uncorrelated (some negative correlation is normal since strong classifying of an object as $c_i$ decreases the probability for rest of the labels). &lt;/div&gt;&lt;div style="text-align: justify;"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div style="text-align: justify;"&gt;Now consider &lt;a href="http://en.wikipedia.org/wiki/Regression_analysis"&gt;regression&lt;/a&gt;, which is another supervised learning problem where feature vectors map to real values: $\mathbb{R}^m \to \mathbb{R}$. It might be considered a classification problem with infinite number of classes. There are two obvious flaws in this reduction. First, a training set is unlikely to have at least one example for each class. To overcome this one can quantize the codomain and train a classifier over a finite set of class labels. However, it leaves us with the second flaw: the method does not take into account correlation between the possible outcomes. The bins are ordered: the training features that correspond to neighbouring bins should be handled differently from those that correspond to distant bins. That's why some global methods (like &lt;a href="http://en.wikipedia.org/wiki/Linear_regression"&gt;linear regression&lt;/a&gt;) are used.&lt;/div&gt;&lt;div style="text-align: justify;"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div style="text-align: justify;"&gt;The similar situation is in the case of a &lt;i&gt;structured &lt;/i&gt;outcome. In this case, one usually has a plenty of possible outcomes, but they are not independent. The techniques of structured learning are applicable when the outcomes have some underlying structure, and the elements of the structure have similar sets of features. Also, it should be possible to estimate how bad the prediction is (often in terms of incorrectly predicted elements), which is called &lt;i&gt;structured loss&lt;/i&gt;. The methods allow small deviations from the ideal training outcome, which are possible e.g. because of noise, but penalize the substantial ones (just like regression!). The prediction function (parameters of which are tuned) can thus be formalized as a mapping $\mathbb{R}^{m \times l} \to \{c_1, c_2, \dots, c_K\}^l$.&lt;/div&gt;&lt;div style="text-align: justify;"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div style="text-align: justify;"&gt;The possible example is &lt;a href="http://en.wikipedia.org/wiki/Hidden_Markov_model"&gt;hidden Markov model&lt;/a&gt; learning, where the elements are emission potentials and transition potentials, and the loss can be defined as the number of wrong HMM outputs. According to the upper formalization, there are $l$ elements, each represented by $m$ features (in practice, $m$ can vary for different types of elements). Since the labels of transition elements are strictly defined by emission outputs, not every outcome is possible.&lt;/div&gt;&lt;div style="text-align: justify;"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div style="text-align: justify;"&gt;Another example of a structured prediction problem is &lt;a href="http://en.wikipedia.org/wiki/Statistical_parsing"&gt;natural language parsing&lt;/a&gt;. Given a sentence of a natural language, the corresponding parsing tree is to be constructed. Surprisingly, parsing trees could also be represented as high-dimensional vectors, with constraints applied. To summarize, structure prediction has outcome that is &lt;b&gt;multivariate&lt;/b&gt;, &lt;b&gt;correlated &lt;/b&gt;and &lt;b&gt;constrained&lt;/b&gt;.&lt;/div&gt;&lt;div style="text-align: justify;"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div style="text-align: justify;"&gt;Ideally, the parameters of structured prediction should be tuned via likelihood maximization, but this turns out to be intractable due to the need of computing the partition function on each gradient optimization step. That's why the L&lt;sup&gt;2&lt;/sup&gt;-regularized hinge loss is usually minimized. The prediction algorithm is represented as a scoring function over possible outcomes. The task of the learning algorithm is to find the parameters of the scoring function to make it return maximum values for the true outcomes on the training set, and low ones for the instances that are far from optimal. Given that, the margin between good and bad possible outcomes should be maximized (in terms of scoring function value).&lt;/div&gt;&lt;div style="text-align: justify;"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div style="text-align: justify;"&gt;You can learn more about structure learning from &lt;a href="http://www.seas.upenn.edu/~taskar/"&gt;Ben Taskar&lt;/a&gt;'s &lt;a href="http://nips.cc/Conferences/2007/Program/event.php?ID=573"&gt;NIPS 2007 tutorial&lt;/a&gt;. See also our recent &lt;a href="http://shapovalov.ro/papers/Shapovalov-Velizhev-3dimpvt2011.pdf"&gt;3DIMPVT paper&lt;/a&gt; for an up-to-date survey of MRF/CRF training methods and their applications.&lt;/div&gt;&lt;div style="text-align: justify;"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div style="text-align: justify;"&gt;PS. I've installed &lt;a href="http://www.mathjax.org/"&gt;MathJax &lt;/a&gt;equation engine into the blog. Seems to work, huh?&lt;/div&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/2293611840857369720-5311077598598230668?l=computerblindness.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://computerblindness.blogspot.com/feeds/5311077598598230668/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://computerblindness.blogspot.com/2011/04/on-structured-learning.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/2293611840857369720/posts/default/5311077598598230668'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/2293611840857369720/posts/default/5311077598598230668'/><link rel='alternate' type='text/html' href='http://computerblindness.blogspot.com/2011/04/on-structured-learning.html' title='On structured learning'/><author><name>Roman Shapovalov</name><uri>https://profiles.google.com/103110049533984878325</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' src='//lh6.googleusercontent.com/-8d0JtBNGM_o/AAAAAAAAAAI/AAAAAAAAAAA/Vc7QubaXpjc/s512-c/photo.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-2293611840857369720.post-8780512094443928019</id><published>2011-03-31T00:36:00.005+04:00</published><updated>2011-03-31T01:27:18.539+04:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='computer graphics'/><category scheme='http://www.blogger.com/atom/ns#' term='conference'/><category scheme='http://www.blogger.com/atom/ns#' term='education'/><category scheme='http://www.blogger.com/atom/ns#' term='computer vision'/><category scheme='http://www.blogger.com/atom/ns#' term='GML'/><category scheme='http://www.blogger.com/atom/ns#' term='GraphiCon'/><category scheme='http://www.blogger.com/atom/ns#' term='lecture'/><category scheme='http://www.blogger.com/atom/ns#' term='MSU'/><category scheme='http://www.blogger.com/atom/ns#' term='Microsoft'/><category scheme='http://www.blogger.com/atom/ns#' term='Russia'/><title type='text'>[CFP] GraphiCon-2011 and MSCVSS</title><content type='html'>&lt;div style="text-align: justify;"&gt;&lt;a href="http://graphics.cs.msu.ru/en"&gt;Our lab&lt;/a&gt; traditionally organizes &lt;a href="http://gc2011.graphicon.ru/en"&gt;GraphiCon&lt;/a&gt;, the major (and may be the only) conference in Russia specialized in computer graphics and computer vision. The conference is not very selective, still it has a decent community of professionals behind it. So, if you want to get a quality feedback on your preliminary work, or always wanted to visit Russia, consider submitting a paper by May, 16.&lt;/div&gt;&lt;div style="text-align: justify;"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div style="text-align: justify;"&gt;&lt;b&gt;Special offer&lt;/b&gt; for young readers of my blog:  If it is the place where you learned about the conference and decided to submit a paper, I'll buy you a beer during the conference. :) Consider it another social micro-event.&lt;/div&gt;&lt;div style="text-align: justify;"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div style="text-align: justify;"&gt;Another piece of news is primarily for undergrads and PhD students from Russia. Our lab and Microsoft Research organize another event this summer: &lt;a href="http://summerschool2011.graphicon.ru/en"&gt;Computer vision summer school&lt;/a&gt;. There are such lecturers as &lt;a href="http://pub.ist.ac.at/~chl/"&gt;Cristoph Lampert&lt;/a&gt; and &lt;a href="http://www.robots.ox.ac.uk/~az/"&gt;Andrew Zisserman&lt;/a&gt;, among others. Participation is free of charge, and accommodation is provided. Deadline is on April, 30. You should not miss it!&lt;/div&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/2293611840857369720-8780512094443928019?l=computerblindness.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://computerblindness.blogspot.com/feeds/8780512094443928019/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://computerblindness.blogspot.com/2011/03/cfp-graphicon-2011-and-mscvss.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/2293611840857369720/posts/default/8780512094443928019'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/2293611840857369720/posts/default/8780512094443928019'/><link rel='alternate' type='text/html' href='http://computerblindness.blogspot.com/2011/03/cfp-graphicon-2011-and-mscvss.html' title='[CFP] GraphiCon-2011 and MSCVSS'/><author><name>Roman Shapovalov</name><uri>https://profiles.google.com/103110049533984878325</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' src='//lh6.googleusercontent.com/-8d0JtBNGM_o/AAAAAAAAAAI/AAAAAAAAAAA/Vc7QubaXpjc/s512-c/photo.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-2293611840857369720.post-6762118073256830626</id><published>2011-03-07T23:01:00.007+03:00</published><updated>2011-04-25T23:26:48.384+04:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='ICCV'/><category scheme='http://www.blogger.com/atom/ns#' term='conference'/><category scheme='http://www.blogger.com/atom/ns#' term='blog'/><category scheme='http://www.blogger.com/atom/ns#' term='IEEE'/><category scheme='http://www.blogger.com/atom/ns#' term='academic'/><category scheme='http://www.blogger.com/atom/ns#' term='CVPR'/><category scheme='http://www.blogger.com/atom/ns#' term='MSU'/><category scheme='http://www.blogger.com/atom/ns#' term='machine learning'/><title type='text'>IEEE goes evil?</title><content type='html'>&lt;div style="text-align: justify;"&gt;In November, the IEEE released &lt;a href="http://www.ieee.org/publications_standards/publications/rights/rights_policies.html"&gt;a new copyright agreement&lt;/a&gt;, that is to be signed by the authors of all papers published by the organization since January 2011. The main novelty is that the authors are not allowed to publish final versions of their papers on-line. Fortunately, we still may post the versions accepted for print (&lt;i&gt;with &lt;/i&gt;post-review corrections), which is still great, indeed. You can find the FAQ concerning the new policy &lt;a href="http://www.ieee.org/documents/authorversionfaq.pdf"&gt;here&lt;/a&gt;.&lt;/div&gt;&lt;div style="text-align: justify;"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div style="text-align: justify;"&gt;It seems the IEEE realizes that nowadays there is almost no need in their printed materials and the digital library as search engines manage to find papers posted by their authors. No doubt that they do &lt;i&gt;not &lt;/i&gt;want to restrict dissemination of scientific results, but that is a question of their survival as a major publishing organization. The worst is they use the drug-dealer strategy: first they organize great conferences and journals (the majority of top venues in my field are run by IEEE), and allow authors to re-publish anything on-line ("the first hit for free"), then cut it off. Yes, one still may post an accepted version, but who knows what is their next step?&lt;/div&gt;&lt;div style="text-align: justify;"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div style="text-align: justify;"&gt;The IEEE motivate their decision by preserving the value of their database: "&lt;i&gt;the IEEE is better able to track usage of articles for the benefit of authors and journals&lt;/i&gt;". One can object that there are big and growing open-access databases like &lt;a href="http://mendeley.com/"&gt;Mendeley&lt;/a&gt;, where all the statistics is open and refined (the personal info of users is known). Also, not all institutions have an access to the library. For example, my university (the largest one in Russia) does not. And I cannot afford to pay $25 per paper. &lt;/div&gt;&lt;div style="text-align: justify;"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div style="text-align: justify;"&gt;What can we do about that? Sure, it is impossible to stop publish in IEEE venues, although the community can find the way around, as was demonstrated by the founders of &lt;a href="http://jmlr.csail.mit.edu/"&gt;JMLR&lt;/a&gt;, now the major journal in machine learning, which was &lt;a href="http://yaroslavvb.blogspot.com/2010/11/springer-temporarily-opens-machine.html"&gt;created to cancel the overhead of the commercial publishing model&lt;/a&gt;. Matt Blaze encourages the researchers &lt;a href="http://www.crypto.com/blog/copywrongs/"&gt;not to serve as reviewers for IEEE&lt;/a&gt;. As for me, I'm too young (as a researcher) to be a reviewer. &lt;s&gt;But it also has effect on me: I am now preparing a final version for the proceedings published by the IEEE, and I doubt if there is any use in improving my paper (even given a quality review), assuming that majority of researchers will use the accepted version and will never see the revised one.&lt;/s&gt;&lt;/div&gt;&lt;div style="text-align: justify;"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div style="text-align: justify;"&gt;UPD (Mart 14, 2011). I should have misunderstood the new policy at first, Blaze's post and IEEE terminology somehow misled me. The &lt;i&gt;accepted&lt;/i&gt; paper means &lt;b&gt;accepted for print&lt;/b&gt; by the IEEE, not the submitted for review. So, you still &lt;b&gt;can use review results&lt;/b&gt; to correct the paper and post it on-line, then submit for publishing, where the formatting will probably be adjusted. So, &lt;b&gt;any meaningful part could be reflected in the on-line version&lt;/b&gt;. I have corrected the post now, so it may seem weird.&lt;/div&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/2293611840857369720-6762118073256830626?l=computerblindness.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://computerblindness.blogspot.com/feeds/6762118073256830626/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://computerblindness.blogspot.com/2011/03/ieee-goes-evil.html#comment-form' title='2 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/2293611840857369720/posts/default/6762118073256830626'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/2293611840857369720/posts/default/6762118073256830626'/><link rel='alternate' type='text/html' href='http://computerblindness.blogspot.com/2011/03/ieee-goes-evil.html' title='IEEE goes evil?'/><author><name>Roman Shapovalov</name><uri>https://profiles.google.com/103110049533984878325</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' src='//lh6.googleusercontent.com/-8d0JtBNGM_o/AAAAAAAAAAI/AAAAAAAAAAA/Vc7QubaXpjc/s512-c/photo.jpg'/></author><thr:total>2</thr:total></entry><entry><id>tag:blogger.com,1999:blog-2293611840857369720.post-1601782706424380688</id><published>2011-02-28T23:02:00.000+03:00</published><updated>2011-03-01T02:46:59.878+03:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='CERN'/><category scheme='http://www.blogger.com/atom/ns#' term='particle physics'/><category scheme='http://www.blogger.com/atom/ns#' term='history'/><category scheme='http://www.blogger.com/atom/ns#' term='science'/><category scheme='http://www.blogger.com/atom/ns#' term='Russia'/><title type='text'>The one about CERN</title><content type='html'>&lt;div style="text-align: justify;"&gt;About a month ago I visited &lt;a href="http://en.wikipedia.org/wiki/CERN"&gt;CERN&lt;/a&gt;, arguably the most famous research organization in Europe. It is the place where &lt;a href="http://en.wikipedia.org/wiki/World_Wide_Web"&gt;the World Wide Web&lt;/a&gt; has been invented, and the home of &lt;a href="http://en.wikipedia.org/wiki/Large_Hadron_Collider"&gt;the Large Hadron Collider&lt;/a&gt;. I was very excited about the trip, &lt;a href="http://www.youtube.com/watch?v=510oEykrQiE"&gt;much like Sheldon Cooper&lt;/a&gt;. Here are some facts which were new to me:&lt;/div&gt;&lt;div style="text-align: justify;"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div style="text-align: justify;"&gt;... about CERN:&lt;/div&gt;&lt;div&gt;&lt;ul&gt;&lt;li style="text-align: justify;"&gt;CERN was founded after the WWII by a bunch of European countries. They were exhausted by the war, and the only way to catch up with the USA and USSR in fundamental science was to join forces.&lt;/li&gt;&lt;li style="text-align: justify;"&gt;The original name was &lt;i&gt;Conseil Européen pour la Recherche Nucléaire&lt;/i&gt; (European Council for Nuclear Research), which abbreviated to CERN. Later the name has been officially changed to &lt;i&gt;European laboratory for particle physics&lt;/i&gt;, which is both more relevant and less fearful for the locals, however the brand CERN is used now even in official documents.&lt;/li&gt;&lt;li style="text-align: justify;"&gt;There are almost 3,000 full-time employees, but most of them are engineers and not scientists. There are a lot of visiting researchers though.&lt;/li&gt;&lt;li style="text-align: justify;"&gt;There are 20 member states now (primarily EU states), and 6 observer states (such as Russia and the USA).&lt;/li&gt;&lt;li style="text-align: justify;"&gt;CERN's annual budget is about € 1 billion, it is funded by the member states in proportion to their economical power, e.g. Germany gives 20% of the money.&lt;/li&gt;&lt;li style="text-align: justify;"&gt;The budget money are spent to infrastructure and support, all the individual experiments are funded by research groups and their universities. &lt;/li&gt;&lt;li style="text-align: justify;"&gt;In spite of the USA is not a member state, it leads on the number of researchers who work on CERN projects (more than thousand), second is Germany, third is Russia (yes, we still have a good shape in particle physics). It turned out that everybody at CERN spoke Russian, even the janitor. :)&lt;/li&gt;&lt;/ul&gt;&lt;/div&gt;&lt;div style="text-align: justify;"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div style="text-align: justify;"&gt;... about the LHC:&lt;/div&gt;&lt;div style="text-align: justify;"&gt;&lt;ul&gt;&lt;li&gt;It is in fact a circular tunnel of 27 km in circumference lying 175 m beneath the ground.&lt;/li&gt;&lt;li&gt;The tunnel was used before the LHC, it was build in 1983 for &lt;a href="http://en.wikipedia.org/wiki/Large_Electron%E2%80%93Positron_Collider"&gt;the Large Electron-Positron Collider&lt;/a&gt;. In 2008 it was upgraded to be able to accelerate heavy particles like protons to become the Large &lt;i&gt;Hadron &lt;/i&gt;Collider (remind that proton and neutron are thousand times heavier than electron).&lt;/li&gt;&lt;li&gt;The tunnel is about 4 meters in diameter, one can walk there or ride a bike. &lt;/li&gt;&lt;li&gt;The tunnel encapsulates two small pipes for the particles that intersect at four points (to make the tracks' lengths equal, like in speed scating arenas). There are more then a thousand of electric magnet dipoles along the pipe. They are not that big as I imagined before.&lt;/li&gt;&lt;li&gt;It is nearly vacuum and zero temperature inside the tubes.&lt;/li&gt;&lt;li&gt;Proton beams are not generated inside the LHC. First, they are accelerated in the linear accelerator and almost reach the speed of light &lt;i&gt;c&lt;/i&gt;. While the speed is rising, it becomes harder and harder to increase it since it cannot overcome the speed of light. Then they are accelerated in the small circular accelerator, and only after that they are injected into the LHC where during 40 minutes the beams are accelerated to speed as much close to &lt;i&gt;c&lt;/i&gt; as possible.&lt;/li&gt;&lt;li&gt;When accelerated, the beams are being observed during 10 hours. They suffer about 10,000 collisions per second, about 20 pairs of protons collide each time. Since the speed is large, the energy of that collisions is enormous.&lt;/li&gt;&lt;li&gt;The collisions take place within special locations called &lt;i&gt;detectors&lt;/i&gt;. We visited a control centre of one of them, &lt;a href="http://en.wikipedia.org/wiki/ATLAS_experiment"&gt;ATLAS&lt;/a&gt;. A detector has multiple layers, each able to register certain kind of particles, like photons.&lt;/li&gt;&lt;li&gt;Ten thousand collision per second would yield really big amount of data, so only few of them are selected to be logged. I don't know how they select those collisions, machine learning might be used. :)&lt;/li&gt;&lt;li&gt;All the collected data are spread into servers all over the world. An authorized researcher may log in to the grid network and execute her script to analyse the data.&lt;/li&gt;&lt;/ul&gt;&lt;/div&gt;&lt;div style="text-align: justify;"&gt;... about Higgs boson:&lt;/div&gt;&lt;div style="text-align: justify;"&gt;&lt;ul&gt;&lt;li&gt;&lt;a href="http://en.wikipedia.org/wiki/Higgs_boson"&gt;Higgs boson&lt;/a&gt; is a hypothetical particle, existence of which would prove the standard model of particle physics. &lt;/li&gt;&lt;li&gt;Higgs boson appears as a result of collision of two protons and large amount of energy. The protons in the LHC are accelerated enough to produce theoretically sufficient energy. &lt;/li&gt;&lt;li&gt;Higgs boson is very heavy and thus unstable. In theory, it decays into either four muons or two photons. So, if there will be two counter-directed light beams registered by the detector, this will be an evidence of the boson. See the picture below for example of likely detector output in case of the boson shows up.&lt;/li&gt;&lt;li&gt;Scientist say that if Higgs boson would not be detected, all the modern knowledge on particle physics will crush. They will be  obliged to develop a new theory from scratch.&lt;/li&gt;&lt;li&gt;There are no published results that report on Higgs boson detection so far...&lt;/li&gt;&lt;/ul&gt;&lt;/div&gt;&lt;br /&gt;&lt;a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://upload.wikimedia.org/wikipedia/commons/thumb/1/1c/CMS_Higgs-event.jpg/650px-CMS_Higgs-event.jpg"&gt;&lt;img style="display:block; margin:0px auto 10px; text-align:center;cursor:pointer; cursor:hand;width: 450px; height: 417px;" src="http://upload.wikimedia.org/wikipedia/commons/thumb/1/1c/CMS_Higgs-event.jpg/650px-CMS_Higgs-event.jpg" border="0" alt="" /&gt;&lt;/a&gt;&lt;br /&gt;&lt;div style="text-align: justify;"&gt;Don't you want to be a theoretical physicist now? =)&lt;/div&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/2293611840857369720-1601782706424380688?l=computerblindness.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://computerblindness.blogspot.com/feeds/1601782706424380688/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://computerblindness.blogspot.com/2011/02/one-about-cern.html#comment-form' title='3 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/2293611840857369720/posts/default/1601782706424380688'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/2293611840857369720/posts/default/1601782706424380688'/><link rel='alternate' type='text/html' href='http://computerblindness.blogspot.com/2011/02/one-about-cern.html' title='The one about CERN'/><author><name>Roman Shapovalov</name><uri>https://profiles.google.com/103110049533984878325</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' src='//lh6.googleusercontent.com/-8d0JtBNGM_o/AAAAAAAAAAI/AAAAAAAAAAA/Vc7QubaXpjc/s512-c/photo.jpg'/></author><thr:total>3</thr:total></entry><entry><id>tag:blogger.com,1999:blog-2293611840857369720.post-9168604994353859029</id><published>2011-01-25T09:04:00.000+03:00</published><updated>2011-01-25T09:04:28.728+03:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='video'/><category scheme='http://www.blogger.com/atom/ns#' term='AI'/><category scheme='http://www.blogger.com/atom/ns#' term='art'/><category scheme='http://www.blogger.com/atom/ns#' term='computer vision'/><category scheme='http://www.blogger.com/atom/ns#' term='CMU'/><category scheme='http://www.blogger.com/atom/ns#' term='robotics'/><category scheme='http://www.blogger.com/atom/ns#' term='life'/><category scheme='http://www.blogger.com/atom/ns#' term='lecture'/><category scheme='http://www.blogger.com/atom/ns#' term='MSU'/><category scheme='http://www.blogger.com/atom/ns#' term='movie'/><title type='text'>Art + Multimedia</title><content type='html'>&lt;div style="text-align: justify;"&gt;In December we visited a lecture about interaction between arts and sciences, namely between (primarily) visual arts and (again, primarily) multimedia studies (&lt;a href="http://theoryandpractice.ru/seminars/11878-multimedia-mezhdistsiplinarnoe-vzaimodeystvie-sovremennogo-iskusstva-i-nauki-19-12"&gt;Russian page&lt;/a&gt;). The lecture was given by &lt;a href="http://www.ablogina.com/scroll.php?lg=en"&gt;Asya Ablogina&lt;/a&gt;, who turned out to be a nice girl. Although she lacked any technical background, she was doing well. Surprisingly, there is a lot of artwork that exploits technical support in a witty manner, which I was unaware of, so the lousy stuff exhibited in &lt;a href="http://www.mmoma.ru/en/"&gt;the Moscow Museum of Modern Art&lt;/a&gt; is not everything one can do in this field!&lt;/div&gt;&lt;div style="text-align: justify;"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div style="text-align: justify;"&gt;The most interesting part was Asya's exposition of some masterpieces. Most of them were presented during 2009&lt;span class="Apple-style-span"&gt; &lt;/span&gt;&lt;span class="Apple-style-span" style="line-height: 18px; "&gt;&lt;span class="Apple-style-span" style="font-size: medium; "&gt;&lt;span class="Apple-style-span"&gt;&lt;i&gt;Science as Suspense&lt;/i&gt; exhibit&lt;/span&gt;&lt;/span&gt;&lt;/span&gt; in Moscow (&lt;a href="http://www.winzavod.ru/scienceartfest/index.html"&gt;Russian&lt;/a&gt;). &lt;a href="http://www.fondation-langlois.org/html/e/page.php?NumPage=101"&gt;Nicolas Reeves&lt;/a&gt; is a famous Canadian architect, also known for his work on modelling biological systems. He used the output of biologically-inspired computer algorithms (such as &lt;a href="http://en.wikipedia.org/wiki/Genetic_algorithms"&gt;genetic algorithms&lt;/a&gt;) to draw pictures. In Moscow he presented a project called &lt;strike&gt;&lt;i&gt;Marching&lt;/i&gt;&lt;/strike&gt;&lt;i&gt; Floating Cubes&lt;/i&gt;: massive but light cubes float in the air. Their movements are controlled by tiny fans, although any little gust of wind can affect the movement. Each cube is equipped with an on-board computer, which helps to avoid collisions. The implemented algorithms are simple but stochastic, hence the behaviour is unpredictable. They are said to move like animal creatures. Here is the video:&lt;/div&gt;&lt;div style="text-align: justify;"&gt;&lt;br /&gt;&lt;iframe title="YouTube video player" class="youtube-player" type="text/html" width="480" height="390" src="http://www.youtube.com/embed/qR7dPW73nCk" frameborder="0" allowfullscreen=""&gt;&lt;/iframe&gt;&lt;br /&gt;&lt;/div&gt;&lt;div style="text-align: justify;"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div style="text-align: justify;"&gt;A similar project was developed by &lt;a href="http://www.zprod.org/PG/home.htm"&gt;Paul Granjon&lt;/a&gt;. He also tries to gift robots with animal behaviour. In the video below, robots are sexed, i.e. they are able to locate robots of the opposite sex and eventually end up in coitus. Another Paul's robot (the &lt;i&gt;Smartbot, &lt;/i&gt;&lt;a href="http://www.winzavod.ru/lj/sienceart/DSCF9887.JPG"&gt;photo&lt;/a&gt;) creeps over the restricted space and always grumbles (just like &lt;a href="http://en.wikipedia.org/wiki/Marvin_the_Paranoid_Android"&gt;Marvin&lt;/a&gt;!). I also enjoy the way Paul speaks:&lt;/div&gt;&lt;br /&gt;&lt;iframe title="YouTube video player" class="youtube-player" type="text/html" width="480" height="390" src="http://www.youtube.com/embed/a-DTcw8EQBU" frameborder="0" allowfullscreen=""&gt;&lt;/iframe&gt;&lt;div style="text-align: justify;"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div style="text-align: justify;"&gt;The real idol in the sci-art community is &lt;a href="http://en.wikipedia.org/wiki/Stelarc"&gt;Stelarc&lt;/a&gt; (see also &lt;a href="http://web.stelarc.org/"&gt;his homepage&lt;/a&gt;, although it takes balls to get through the welcome page :). His talent is recognized by both scientists and artists (it is enough to mention he's an Honorary Professor of Arts and Robotics at &lt;a href="http://cmu.edu/"&gt;Carnegie Mellon&lt;/a&gt;).  He's best known for his experiments on his own body. For example, during a performance he allowed to control his body remotely over the Internet by muscle stimulation. Probably his favourite project is &lt;a href="http://web.stelarc.org/projects/prosthetichead/index.html"&gt;&lt;i&gt;Prosthetic Head&lt;/i&gt;&lt;/a&gt;, which is in fact a 3D model of his own head. The head is learned from Stelarc's behaviour and speech, so it is able to communicate with people using colloquial information as well as non-verbal cues. It is interesting to try it in practice, but here is just a non-interactive demo video:&lt;/div&gt;&lt;div style="text-align: justify;"&gt;&lt;br /&gt;&lt;iframe title="YouTube video player" class="youtube-player" type="text/html" width="480" height="390" src="http://www.youtube.com/embed/Nym8hfNI9Gg" frameborder="0" allowfullscreen=""&gt;&lt;/iframe&gt;&lt;br /&gt;&lt;/div&gt;&lt;div style="text-align: justify;"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div style="text-align: justify;"&gt;Stelarc is probably the only person on Earth who has three ears. In 2007 surgeons &lt;a href="http://web.stelarc.org/projects/earonarm/index.html"&gt;implanted an ear into his arm&lt;/a&gt;! It is not functioning now (just a piece of skin), but he plans to install an audio receiver into the ear and broadcast everything he hears over the Internet. I definitely recommend to look through &lt;a href="http://web.stelarc.org/projects.html"&gt;his projects&lt;/a&gt;, there are decent ones.&lt;/div&gt;&lt;div style="text-align: justify;"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div style="text-align: justify;"&gt;Asya also presented &lt;a href="http://www.ablogina.com/scroll.php?lg=en"&gt;her own works&lt;/a&gt;. She focus on different photographic techniques such as overexposure. Here are photos from her project &lt;span style="font-style:italic;"&gt;Canvas of the Road&lt;/span&gt; compiled in a video. There is a play of words in Russian: the single word for canvas and road surface. On these photos car traces look like painter's strokes:&lt;/div&gt;&lt;br /&gt;&lt;iframe title="YouTube video player" class="youtube-player" type="text/html" width="480" height="390" src="http://www.youtube.com/embed/-eTVsP_R-w4" frameborder="0" allowfullscreen=""&gt;&lt;/iframe&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div style="text-align: justify;"&gt;Another Asya's project is a short film series called &lt;i&gt;Habitat&lt;/i&gt;. She showed only one episode with the title &lt;i&gt;Habitat: MSU Main Building&lt;/i&gt;. The film was about a girl who is a PhD student in &lt;a href="http://msu.ru/"&gt;Moscow State University&lt;/a&gt; and described her place: a standard 8 sq.m. dorm room. The narrative was full of expressive epithets and metaphors about that awful room: the girl felt like in a cage, she also didn't like shared bathroom, etc. It's funny, because I live in a similar PhD student single room, and I'm quite happy with it, since I've been living 5 years in a shared room before. :)&lt;/div&gt;&lt;div style="text-align: justify;"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div style="text-align: justify;"&gt;The second lecturer, Vladimir Vishnyakov, presented his project called &lt;i&gt;The Museum of Revived Photography&lt;/i&gt;. He used the photos of the XIX century to create image-based animation. The idea is technically simple: first segment people from the background, then use &lt;a href="http://en.wikipedia.org/wiki/Inpainting"&gt;inpainting&lt;/a&gt; techniques to restore the background behind them, and animate the movements, but Vladimir referred to a programmer he works with as a real genius. In fact, all the stuff I post here varies from easy to moderately complex (except of Stelarc's projects), so you can do something similar too. The main problem is to come up with the concept. Come on then! ;)&lt;/div&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/2293611840857369720-9168604994353859029?l=computerblindness.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://computerblindness.blogspot.com/feeds/9168604994353859029/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://computerblindness.blogspot.com/2011/01/art-multimedia.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/2293611840857369720/posts/default/9168604994353859029'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/2293611840857369720/posts/default/9168604994353859029'/><link rel='alternate' type='text/html' href='http://computerblindness.blogspot.com/2011/01/art-multimedia.html' title='Art + Multimedia'/><author><name>Roman Shapovalov</name><uri>https://profiles.google.com/103110049533984878325</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' src='//lh6.googleusercontent.com/-8d0JtBNGM_o/AAAAAAAAAAI/AAAAAAAAAAA/Vc7QubaXpjc/s512-c/photo.jpg'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' url='http://img.youtube.com/vi/qR7dPW73nCk/default.jpg' height='72' width='72'/><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-2293611840857369720.post-2636248024275255603</id><published>2010-12-22T22:21:00.003+03:00</published><updated>2010-12-22T23:00:29.049+03:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='comics'/><category scheme='http://www.blogger.com/atom/ns#' term='humour'/><category scheme='http://www.blogger.com/atom/ns#' term='software'/><category scheme='http://www.blogger.com/atom/ns#' term='life'/><category scheme='http://www.blogger.com/atom/ns#' term='paper'/><category scheme='http://www.blogger.com/atom/ns#' term='LaTeX'/><title type='text'>Comic strip: Nerd's confession</title><content type='html'>I continue sharing my &lt;a href="http://computerblindness.blogspot.com/2010/08/comic-strip-10.html"&gt;Prague comic strips&lt;/a&gt;, since I have not been posting for a while. In fact, I was busy writing a paper, so this one may seem somehow connected. &lt;b&gt;Disclaimer:&lt;/b&gt; There's nothing personal in the strip. Kids under 18 are not allowed. :)&lt;br /&gt;&lt;br /&gt;&lt;a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://shapovalov.ro/images/comics/LaTeX.png"&gt;&lt;img style="display:block; margin:0px auto 10px; text-align:center;cursor:pointer; cursor:hand;width: 500px; " src="http://shapovalov.ro/images/comics/LaTeX_preview.png" border="0" title="In comparison to condoms, LaTeX is free, reusable, more than 98% effective, and not condemned by the church. Are you still in doubt?" /&gt;&lt;/a&gt;&lt;div&gt;The next post is going to be about multimedia technologies for art. Stay tuned!&lt;/div&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/2293611840857369720-2636248024275255603?l=computerblindness.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://computerblindness.blogspot.com/feeds/2636248024275255603/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://computerblindness.blogspot.com/2010/12/comic-strip-nerds-confession.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/2293611840857369720/posts/default/2636248024275255603'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/2293611840857369720/posts/default/2636248024275255603'/><link rel='alternate' type='text/html' href='http://computerblindness.blogspot.com/2010/12/comic-strip-nerds-confession.html' title='Comic strip: Nerd&apos;s confession'/><author><name>Roman Shapovalov</name><uri>https://profiles.google.com/103110049533984878325</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' src='//lh6.googleusercontent.com/-8d0JtBNGM_o/AAAAAAAAAAI/AAAAAAAAAAA/Vc7QubaXpjc/s512-c/photo.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-2293611840857369720.post-6074838624729624352</id><published>2010-11-26T10:30:00.004+03:00</published><updated>2010-11-26T10:53:38.712+03:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='comics'/><category scheme='http://www.blogger.com/atom/ns#' term='humour'/><category scheme='http://www.blogger.com/atom/ns#' term='philosophy'/><category scheme='http://www.blogger.com/atom/ns#' term='machine learning'/><title type='text'>Happy Thaknsgiving!</title><content type='html'>&lt;p&gt;&lt;a href="http://chaospet.com/2008/11/27/115-russells-turkey/"&gt;Here&lt;/a&gt; is a comic strip by Ryan Lake that illustrates the concept of Russell's turkey. The idea is simply the induction does not always work the way you expect. There could be some factors, which did not have influence on the training data, yet changing the learned concept dramatically. We should thus keep in mind that machine learning (which is based on induction) is not the panacea.&lt;/p&gt;&lt;p&gt;&lt;br /&gt;&lt;a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://4.bp.blogspot.com/_IDEWI0P9RbA/TO9kJdjpq3I/AAAAAAAAAEQ/mVShp8p45CQ/s1600/russels-turkey.png"&gt;&lt;img style="display:block; margin:0px auto 10px; text-align:center;cursor:pointer; cursor:hand;width: 343px; height: 400px;" src="http://4.bp.blogspot.com/_IDEWI0P9RbA/TO9kJdjpq3I/AAAAAAAAAEQ/mVShp8p45CQ/s400/russels-turkey.png" border="0" alt="" id="BLOGGER_PHOTO_ID_5543759780032129906" /&gt;&lt;/a&gt;&lt;br /&gt;&lt;/p&gt;&lt;p&gt;&lt;a href="http://en.wikipedia.org/wiki/Bertrand_Russell"&gt;Bertrand Russell&lt;/a&gt; is well-known for his metaphors. Probably the most famous one is &lt;a href="http://en.wikipedia.org/wiki/Russell's_teapot"&gt;Russell's teapot&lt;/a&gt; that aims to show how pointless is to believe in god.&lt;/p&gt;&lt;p&gt;Happy Thanksgiving!&lt;/p&gt;&lt;p&gt;PS. I am reading &lt;a href="http://en.wikipedia.org/wiki/Animal_Farm"&gt;&lt;em&gt;Animal Farm&lt;/em&gt;&lt;/a&gt; by George Orwell now, it is a great animal metaphor too.&lt;/p&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/2293611840857369720-6074838624729624352?l=computerblindness.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://computerblindness.blogspot.com/feeds/6074838624729624352/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://computerblindness.blogspot.com/2010/11/happy-thaknsgiving.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/2293611840857369720/posts/default/6074838624729624352'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/2293611840857369720/posts/default/6074838624729624352'/><link rel='alternate' type='text/html' href='http://computerblindness.blogspot.com/2010/11/happy-thaknsgiving.html' title='Happy Thaknsgiving!'/><author><name>Roman Shapovalov</name><uri>https://profiles.google.com/103110049533984878325</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' src='//lh6.googleusercontent.com/-8d0JtBNGM_o/AAAAAAAAAAI/AAAAAAAAAAA/Vc7QubaXpjc/s512-c/photo.jpg'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' url='http://4.bp.blogspot.com/_IDEWI0P9RbA/TO9kJdjpq3I/AAAAAAAAAEQ/mVShp8p45CQ/s72-c/russels-turkey.png' height='72' width='72'/><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-2293611840857369720.post-1767138312464865721</id><published>2010-11-10T21:13:00.002+03:00</published><updated>2010-11-10T21:31:41.876+03:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='humour'/><category scheme='http://www.blogger.com/atom/ns#' term='conference'/><category scheme='http://www.blogger.com/atom/ns#' term='paper'/><category scheme='http://www.blogger.com/atom/ns#' term='CVPR'/><category scheme='http://www.blogger.com/atom/ns#' term='MRF'/><title type='text'>Nice paper acknowledgement</title><content type='html'>&lt;p&gt;It turns out that not only in Russia PhD students could be underpaid so that they cannot afford food:&lt;/p&gt;&lt;p&gt;&lt;a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://1.bp.blogspot.com/_IDEWI0P9RbA/TNrhL-39bzI/AAAAAAAAAEI/8s6wiQe5oaw/s1600/delong.PNG"&gt;&lt;img style="display:block; margin:0px auto 10px; text-align:center;cursor:pointer; cursor:hand;width: 400px; height: 213px;" src="http://1.bp.blogspot.com/_IDEWI0P9RbA/TNrhL-39bzI/AAAAAAAAAEI/8s6wiQe5oaw/s400/delong.PNG" border="0" alt="" id="BLOGGER_PHOTO_ID_5537986287777967922" /&gt;&lt;/a&gt;&lt;/p&gt;&lt;p&gt;This is from &lt;a href="http://www.csd.uwo.ca/~olga/"&gt;Olga Veksler&lt;/a&gt;'s paper "&lt;a href="http://www.csd.uwo.ca/faculty/olga/Papers/cvprFinal07.pdf"&gt;Graph cut based optimization for MRFs with truncated convex priors&lt;/a&gt;" from CVPR 2007. &lt;/p&gt;&lt;p&gt;&lt;a href="http://www.csd.uwo.ca/~adelong3/"&gt;Andrew&lt;/a&gt; seems to be a nice person. I should definitely ask Anton Osokin to introduce us on ocasion. =)&lt;/p&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/2293611840857369720-1767138312464865721?l=computerblindness.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://computerblindness.blogspot.com/feeds/1767138312464865721/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://computerblindness.blogspot.com/2010/11/nice-paper-acknowledgement.html#comment-form' title='1 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/2293611840857369720/posts/default/1767138312464865721'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/2293611840857369720/posts/default/1767138312464865721'/><link rel='alternate' type='text/html' href='http://computerblindness.blogspot.com/2010/11/nice-paper-acknowledgement.html' title='Nice paper acknowledgement'/><author><name>Roman Shapovalov</name><uri>https://profiles.google.com/103110049533984878325</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' src='//lh6.googleusercontent.com/-8d0JtBNGM_o/AAAAAAAAAAI/AAAAAAAAAAA/Vc7QubaXpjc/s512-c/photo.jpg'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' url='http://1.bp.blogspot.com/_IDEWI0P9RbA/TNrhL-39bzI/AAAAAAAAAEI/8s6wiQe5oaw/s72-c/delong.PNG' height='72' width='72'/><thr:total>1</thr:total></entry><entry><id>tag:blogger.com,1999:blog-2293611840857369720.post-3216776224435266994</id><published>2010-10-10T10:10:00.009+04:00</published><updated>2011-03-14T01:01:35.327+03:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='computer vision'/><category scheme='http://www.blogger.com/atom/ns#' term='paper'/><category scheme='http://www.blogger.com/atom/ns#' term='academic'/><category scheme='http://www.blogger.com/atom/ns#' term='history'/><category scheme='http://www.blogger.com/atom/ns#' term='OpenCV'/><category scheme='http://www.blogger.com/atom/ns#' term='web-site'/><category scheme='http://www.blogger.com/atom/ns#' term='movie'/><title type='text'>Papers, citations, co-authorship, and genealogy</title><content type='html'>&lt;div style="text-align: justify;"&gt;This summer I accidentally found out that there are two papers citing &lt;a href="http://graphics.cs.msu.ru/sites/default/files/download/CMRT09_Barinova_et_al.pdf"&gt;our CMRT 2009 paper&lt;/a&gt;. I was excited a bit about that since they were the first actual citations of me, so I even read those papers.&lt;/div&gt;&lt;div style="text-align: justify;"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div style="text-align: justify;"&gt;The first of them [&lt;a href="http://www.itlab.unn.ru/archive/MSConf10/msconf-2010_book.pdf#page=60"&gt;Димашова, 2010&lt;/a&gt;] was published in Russian by &lt;a href="http://opencv.willowgarage.com/wiki/"&gt;OpenCV&lt;/a&gt; developers from Nizhny Novgorod. They implemented cascade-based face detection algorithm that used either Local Binary Patterns (LBP) or Haar features. The algorithm was released within OpenCV 2.0. They cite our paper as an example of using Random Forest on the stages of the cascade. However, they implemented the classical variation with the cascade over AdaBoost.&lt;/div&gt;&lt;div style="text-align: justify;"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div style="text-align: justify;"&gt;Another citation [&lt;a href="http://www.zemris.fer.hr/~ssegvic/pubs/sikiric10mipro.pdf"&gt;Sikiric et al., 2010&lt;/a&gt;] is more relevant since it came from the road mapping community. They address the problem of recovering a road appearance mosaic from the orthogonal views to the surface. They contrast their approach with ours in the way that theirs do not employ human interaction. In fact, we need human input for recognizing road defects and lane marking, rectification is done in the previous stage, which is fully automatic.&lt;/div&gt;&lt;div style="text-align: justify;"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div style="text-align: justify;"&gt;The rest of the post is devoted to some interesting metrics and structures concerning citations, co-authorship and supervising students.&lt;/div&gt;&lt;div style="text-align: justify;"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div style="text-align: justify;"&gt;&lt;b&gt;&lt;span class="Apple-style-span" style="font-size: large;"&gt;Citations&lt;/span&gt;&lt;/b&gt;&lt;/div&gt;&lt;div style="text-align: justify;"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div style="text-align: justify;"&gt;The number of citations is a weak measure of paper quality. We can also go further and estimate the impact of a journal or a particular researcher based on the number of citations. A generally accepted measure is the journal &lt;a href="http://en.wikipedia.org/wiki/Impact_factor"&gt;impact factor&lt;/a&gt;, which is simply the mean number of citations by paper published in the journal for some period in time. Individual researchers could be evaluated by the impact factor of the journals they published in, though it is considered as a bad practice. Another not-so-bad practise is &lt;a href="http://en.wikipedia.org/wiki/H-index"&gt;h-index&lt;/a&gt;. By definition, one's h-index equals &lt;i&gt;H&lt;/i&gt; if there are at least &lt;i&gt;H&lt;/i&gt; citations of his top &lt;i&gt;H&lt;/i&gt; papers&lt;sup&gt;&lt;a href="http://www.blogger.com/post-edit.g?blogID=2293611840857369720&amp;amp;postID=3216776224435266994#supremum"&gt;1&lt;/a&gt;&lt;/sup&gt;. It has also been &lt;a href="http://en.wikipedia.org/wiki/H-index#Criticism"&gt;criticised&lt;/a&gt; widely.&lt;/div&gt;&lt;div style="text-align: justify;"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div style="text-align: justify;"&gt;So, &lt;a href="http://blogs.nature.com/kausikdatta/2010/09/20/impact-of-impact-factors"&gt;citation-based scoring has a lot of flaws&lt;/a&gt;. But what can we use instead? Another interesting approach is introduced by &lt;a href="http://readermeter.org/"&gt;ReaderMeter&lt;/a&gt;. They collect the information about the number of people who have added some paper to their Mendeley collections and compute something like h-index. Unfortunately, they recently excluded some papers from their database, so the statistics became less representative but more accurate. &lt;/div&gt;&lt;div style="text-align: justify;"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div style="text-align: justify;"&gt;&lt;span class="Apple-style-span" style="font-size: large;"&gt;&lt;b&gt;Co-authorship and Erdős number&lt;/b&gt;&lt;/span&gt;&lt;/div&gt;&lt;div style="text-align: justify;"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div style="text-align: justify;"&gt;&lt;a href="http://en.wikipedia.org/wiki/Paul_Erd%C5%91s"&gt;Paul Erdős&lt;/a&gt; was a Hungarian mathematician who published about 14 hundred research papers with 511 different co-authors. That's why he has a special role in bibliometrics. &lt;a href="http://en.wikipedia.org/wiki/Erd%C5%91s_number"&gt;Erdős number&lt;/a&gt; is defined as a collaborative distance from a researcher to Erdős. More strictly, Wikipedia defines:&lt;/div&gt;&lt;div style="text-align: justify;"&gt;&lt;blockquote&gt;Paul Erdős is the one person having an Erdős number of zero. For any author other than Erdős, if the lowest Erdős number of all of his coauthors is &lt;i&gt;k&lt;/i&gt;, then the author's Erdős number is &lt;i&gt;k&lt;/i&gt; + 1.&lt;/blockquote&gt;&lt;/div&gt;&lt;div style="text-align: justify;"&gt;My Erdős number is at most 7 (via Olga Barinova, Pushmeet Kohli, Philip H.S. Torr, Bernhard Schölkopf, John Shawe Taylor, David Godsil). To be honest, Erdős number system is primarily used for math papers. Even if we use the wider definition, it is not that beautiful, because there are not too many gates, e.g. all the vision community will probably connect to Erdős via the machine learning gate. Thus, the researchers who work on the edge will be closer. To illustrate this fact, David Marr probably had not any finite Erdős number during his lifetime (now he definitely has). So, we can introduce, say, Zisserman number for computer vision.&lt;sup&gt;&lt;a href="http://www.blogger.com/post-edit.g?blogID=2293611840857369720&amp;amp;postID=3216776224435266994#Zisserman"&gt;2&lt;/a&gt;&lt;/sup&gt; &lt;a href="http://www.informatik.uni-trier.de/~ley/db/indices/a-tree/z/Zisserman:Andrew.html"&gt;According to DBLP&lt;/a&gt;, Andrew Zisserman has 165 direct co-authors so far. Now, my Zisserman number is 3, David Marr's is 4 (via Tomaso Poggio, Lior Wolf, Yonatan Wexler).&lt;/div&gt;&lt;div style="text-align: justify;"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div style="text-align: justify;"&gt;The movie industry have their own metrics, which is the &lt;a href="http://en.wikipedia.org/wiki/Bacon_number#Bacon_numbers"&gt;Bacon number&lt;/a&gt; (after &lt;a href="http://en.wikipedia.org/wiki/Kevin_Bacon"&gt;Kevin Bacon&lt;/a&gt;). The distance is established 1 if two actors have appeared in a single movie. Someone tried to combine those two to the &lt;a href="http://en.wikipedia.org/wiki/Erd%C5%91s-Bacon_number"&gt;Erdős-Bacon number&lt;/a&gt;. For some person, it is just a sum of her Erdős and Bacon numbers. Of course, very few people have a finite Erdős-Bacon number, since one should both appear in a movie and publish a research paper (and it is still not sufficient). Often they are possessed by researchers who consulted the filming crew and accidentally were filmed. =) Erdős himself has this number 3 or 4 (depending on the details of definition), since he starred in &lt;a href="http://en.wikipedia.org/wiki/N_is_a_Number"&gt;N is a Number (1993)&lt;/a&gt; and his Bacon number is thus 3 0r 4. &lt;/div&gt;&lt;div style="text-align: justify;"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div style="text-align: justify;"&gt;The person with the probably lowest Erdős-Bacon number is &lt;a href="http://en.wikipedia.org/wiki/Daniel_Kleitman"&gt;Daniel Kleitman&lt;/a&gt;, an MIT mathematician, who has appeared in one of my favourite movies &lt;a href="http://en.wikipedia.org/wiki/Good_Will_Hunting"&gt;Good Will Hunting (1997)&lt;/a&gt; along with Minnie Driver, who collaborated with Bacon in &lt;a href="http://en.wikipedia.org/wiki/Sleepers_(film)"&gt;Sleepers (1996)&lt;/a&gt;. Since Kleitman has 6 joint papers with Erdős, his Erdős-Bacon number is 3.&lt;/div&gt;&lt;div style="text-align: justify;"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div style="text-align: justify;"&gt;Surprisingly, Marvin Minsky has his Erdős number (4) greater than his Bacon number (2), which he obtained via &lt;a href="http://www.imdb.com/title/tt0190708/"&gt;The Revenge of the Dead Indians (1993)&lt;/a&gt; and &lt;a href="http://en.wikipedia.org/wiki/Yoko_Ono"&gt;Yoko Ono&lt;/a&gt;. Another strange example is &lt;strike&gt;the paedophile's dream&lt;/strike&gt; &lt;a href="http://en.wikipedia.org/wiki/Natalie_Portman"&gt;Natalie Portman&lt;/a&gt;. She has graduated from Harvard, saying she would rather "be smart than a movie star". That's my kind of a girl! A neuroscience paper [&lt;a href="http://www.nmr.mgh.harvard.edu/DOT/PDF/Baird_NeuroImage_16_1120_2002.pdf"&gt;Baird et al., 2002&lt;/a&gt;] brought Natalie Hershlag (her real name) Erdős number of 5, and then she appeared in &lt;a href="http://www.imdb.com/title/tt0808399/"&gt;New York, I Love You (2009)&lt;/a&gt; along with Kevin Bacon, so she reached the same Erdős-Bacon number as Minsky, i.e. 6.&lt;/div&gt;&lt;div style="text-align: justify;"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div style="text-align: justify;"&gt;UPD (Mart 13, 2011). There is a totally relevant &lt;a href="http://xkcd.com/599/"&gt;xkcd strip&lt;/a&gt;.&lt;/div&gt;&lt;div style="text-align: justify;"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div style="text-align: justify;"&gt;&lt;span class="Apple-style-span" style="font-size: large;"&gt;&lt;b&gt;Scientific genealogy&lt;/b&gt;&lt;/span&gt;&lt;/div&gt;&lt;div style="text-align: justify;"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div style="text-align: justify;"&gt;Finally, I tell about what is known as scientific genealogy. Every grown-up researcher has a PhD advisor, usually one, while an advisor can have a lot of students. Let's just use the analogy with parents and children. We get a tree (or a forest) representing the historical structure of science. &lt;a href="http://www.genealogy.ams.org/"&gt;The mathematics genealogy project&lt;/a&gt; aims to recover this structure.&lt;/div&gt;&lt;div style="text-align: justify;"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div style="text-align: justify;"&gt;I tried to track my genealogy back. Unfortunately, I failed to find who was &lt;a href="http://www.keldysh.ru/pages/cgraph/people/bayakovski.htm"&gt;Yury M. Bayakovsky&lt;/a&gt;'s PhD advisor. But if consider &lt;a href="http://graphics.cs.msu.ru/en/people/staff/obarinova"&gt;Olga Barinova&lt;/a&gt; my advisor, I am the 11th generation descendant of &lt;a href="http://en.wikipedia.org/wiki/Gauss"&gt;Carl Friedrich Gauß&lt;/a&gt;. Nice ancestry, huh?&lt;/div&gt;&lt;div style="text-align: justify;"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div style="text-align: justify;"&gt;&lt;a href="http://www.visiongenealogy.summerhost.info/"&gt;The similar project exists for computer vision&lt;/a&gt;. There are 290 people in the base, though there are duplicates (I've found five &lt;a href="http://www.vision.ee.ethz.ch/~vferrari/"&gt;Vitorio Ferrari&lt;/a&gt;'s :). It seems strange that some trees have depth as big as 5 (e.g. &lt;a href="http://userweb.cs.utexas.edu/~grauman/"&gt;Kristen Graumann&lt;/a&gt; and &lt;a href="http://www.lsi.upc.edu/~aquattoni/"&gt;Adriana Quattoni&lt;/a&gt; are the 5th generation after David Marr), though vision is a relatively young field.&lt;/div&gt;&lt;br /&gt;&lt;sup&gt;&lt;a name="supremum"&gt;1&lt;/a&gt;&lt;/sup&gt; Okay, take the supremum of the set if you want to stay formal&lt;br /&gt;&lt;sup&gt;&lt;a name="Zisserman"&gt;2&lt;/a&gt;&lt;/sup&gt; It seems that Philip Torr &lt;a href="http://cms.brookes.ac.uk/staff/PhilipTorr/"&gt;has already used&lt;/a&gt; that number.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/2293611840857369720-3216776224435266994?l=computerblindness.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://computerblindness.blogspot.com/feeds/3216776224435266994/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://computerblindness.blogspot.com/2010/10/papers-citations-co-authorship-and.html#comment-form' title='1 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/2293611840857369720/posts/default/3216776224435266994'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/2293611840857369720/posts/default/3216776224435266994'/><link rel='alternate' type='text/html' href='http://computerblindness.blogspot.com/2010/10/papers-citations-co-authorship-and.html' title='Papers, citations, co-authorship, and genealogy'/><author><name>Roman Shapovalov</name><uri>https://profiles.google.com/103110049533984878325</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' src='//lh6.googleusercontent.com/-8d0JtBNGM_o/AAAAAAAAAAI/AAAAAAAAAAA/Vc7QubaXpjc/s512-c/photo.jpg'/></author><thr:total>1</thr:total></entry><entry><id>tag:blogger.com,1999:blog-2293611840857369720.post-2372372326266381700</id><published>2010-09-16T16:39:00.002+04:00</published><updated>2010-09-16T17:01:21.550+04:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='MATLAB'/><category scheme='http://www.blogger.com/atom/ns#' term='laserscanning'/><category scheme='http://www.blogger.com/atom/ns#' term='GML'/><category scheme='http://www.blogger.com/atom/ns#' term='software'/><category scheme='http://www.blogger.com/atom/ns#' term='FOSS'/><category scheme='http://www.blogger.com/atom/ns#' term='C++'/><category scheme='http://www.blogger.com/atom/ns#' term='coding'/><title type='text'>LidarK 2.0 released</title><content type='html'>&lt;p align="justify"&gt;The second major release of GML LidarK is now available. It reflects our 3-year experience on 3D data processing. The description from &lt;a href="http://graphics.cs.msu.ru/en/science/research/3dpoint/lidark"&gt;the project page&lt;/a&gt;:&lt;/p&gt;&lt;p align="justify"&gt;&lt;cite&gt;The LidarK library provides an open-source framework for processing multidimensional point data such as 3D LIDAR scans. It allows building a spatial index for performing fast search queries of different kinds. Although it is intended to be used for LIDAR scans, it can be helpful for a wide range of problems that require spatial data processing.&lt;/cite&gt;&lt;/p&gt;&lt;p align="justify"&gt;The API has been enriched with various features in this release. Indeed, it became more consistent and logical. New ways to access data (i.e. various iterators) are implemented. One can now find k nearest neighbours for any point, not just for one that belongs to the index. Since the data structure is a container, we've decided to parametrize it with template parameter. This decision is controversive: one does not need to cast tuples any more, but the code became clumsier and less portable, unfortunately.&lt;/p&gt;&lt;p align="justify"&gt;The C++/MATLAB code is licensed free of charge for academic use. You can download it &lt;a href="http://graphics.cs.msu.ru/en/science/research/3dpoint/lidark"&gt;here&lt;/a&gt;.&lt;/p&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/2293611840857369720-2372372326266381700?l=computerblindness.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://computerblindness.blogspot.com/feeds/2372372326266381700/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://computerblindness.blogspot.com/2010/09/lidark-20-released.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/2293611840857369720/posts/default/2372372326266381700'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/2293611840857369720/posts/default/2372372326266381700'/><link rel='alternate' type='text/html' href='http://computerblindness.blogspot.com/2010/09/lidark-20-released.html' title='LidarK 2.0 released'/><author><name>Roman Shapovalov</name><uri>https://profiles.google.com/103110049533984878325</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' src='//lh6.googleusercontent.com/-8d0JtBNGM_o/AAAAAAAAAAI/AAAAAAAAAAA/Vc7QubaXpjc/s512-c/photo.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-2293611840857369720.post-3980465889131585286</id><published>2010-09-11T22:51:00.002+04:00</published><updated>2010-10-10T22:06:43.078+04:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='video'/><category scheme='http://www.blogger.com/atom/ns#' term='conference'/><category scheme='http://www.blogger.com/atom/ns#' term='ECCV'/><category scheme='http://www.blogger.com/atom/ns#' term='recognition'/><category scheme='http://www.blogger.com/atom/ns#' term='CMU'/><category scheme='http://www.blogger.com/atom/ns#' term='computer vision'/><category scheme='http://www.blogger.com/atom/ns#' term='GML'/><category scheme='http://www.blogger.com/atom/ns#' term='paper'/><category scheme='http://www.blogger.com/atom/ns#' term='CVPR'/><category scheme='http://www.blogger.com/atom/ns#' term='Microsoft'/><category scheme='http://www.blogger.com/atom/ns#' term='movie'/><category scheme='http://www.blogger.com/atom/ns#' term='humour'/><category scheme='http://www.blogger.com/atom/ns#' term='graphical models'/><category scheme='http://www.blogger.com/atom/ns#' term='labelling'/><category scheme='http://www.blogger.com/atom/ns#' term='parsing'/><category scheme='http://www.blogger.com/atom/ns#' term='history'/><category scheme='http://www.blogger.com/atom/ns#' term='MRF'/><title type='text'>ECCV 2010 highlights</title><content type='html'>&lt;div style="text-align: justify;"&gt;Today is the last day of &lt;a href="http://www.ics.forth.gr/eccv2010/intro.php"&gt;ECCV 2010&lt;/a&gt;, so the best papers are already announced. Although I was not there, a lot of papers are available via &lt;a href="http://cvpapers.com/eccv2010.html"&gt;cvpapers&lt;/a&gt;, so I eventually run into some of them.&lt;/div&gt;&lt;div style="text-align: justify;"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div style="text-align: justify;"&gt;&lt;b&gt;&lt;span class="Apple-style-span" style="font-size:large;"&gt;Best papers&lt;/span&gt;&lt;/b&gt;&lt;/div&gt;&lt;div style="text-align: justify;"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div style="text-align: justify;"&gt;The full list of the conference awards is available &lt;a href="http://www.ics.forth.gr/eccv2010/awards.php"&gt;here&lt;/a&gt;. The best paper is therefore "&lt;a href="http://research.microsoft.com/en-us/um/people/pkohli/papers/lrkt_eccv2010.pdf"&gt;Graph Cut based Inference with Co-occurrence Statistics&lt;/a&gt;" by &lt;a href="http://sots.brookes.ac.uk/lubor/"&gt;Lubor Ladicky&lt;/a&gt;, Chris Russell, &lt;a href="http://research.microsoft.com/en-us/um/people/pkohli/"&gt;Pushmeet Kohli&lt;/a&gt;, and &lt;a href="http://cms.brookes.ac.uk/staff/PhilipTorr/"&gt;Philip Torr&lt;/a&gt;. The second best is "&lt;a href="http://www.cs.cmu.edu/~abhinavg/blocksworld/blocksworld.pdf"&gt;Blocks World Revisited: Image Understanding Using Qualitative Geometry and Mechanics&lt;/a&gt;" by &lt;a href="http://www.cs.cmu.edu/~abhinavg/Home.html"&gt;Abhinav Gupta&lt;/a&gt;, &lt;a href="http://www.cs.cmu.edu/~efros/"&gt;Alyosha Efros&lt;/a&gt;, and &lt;a href="http://www.cs.cmu.edu/~hebert/"&gt;Martial Hebert&lt;/a&gt;. Two thoughts before starting reviewing them. First, the papers come from the two institutions which are (arguably) considered now the leading ones in the vision community: &lt;a href="http://research.microsoft.com/en-us/"&gt;Microsoft Research&lt;/a&gt; and &lt;a href="http://www.ri.cmu.edu/"&gt;Carnegie-Mellon Robotics Institute&lt;/a&gt;. Second, both papers are about semantic segmentation (although the latter couples it with implicit geometry reconstruction); Vidit Jain already &lt;a href="http://vimsu99.blogspot.com/2010/06/acceptance-bias-at-computer-vision.html"&gt;noted the acceptance bias&lt;/a&gt; in favour of the recognition papers.&lt;/div&gt;&lt;div style="text-align: justify;"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div style="text-align: justify;"&gt;Okay, the papers now. Ladicky et al. addressed the problem of global terms in the energy minimization for semantic segmentation. Specifically, their global term deals only with &lt;i&gt;occurrences&lt;/i&gt; of object classes and invariant to how many connected components (i.e. objects) or individual pixels represent the class. Therefore, one cow on an image gives the same contribution to the global term as two cows, as well as one accidental pixel of a cow. The global term penalizes big quantity of different categories in the single image (&lt;a href="http://en.wikipedia.org/wiki/Minimum_description_length"&gt;the MDL prior&lt;/a&gt;), which is helpful when we are given a large set of possible class labels, and also penalizes the co-occurrence of the classes that are unlikely to come along with each other, like cows and sheep. Statistics is collected from the train set and defines if the co-occurrence of the certain pair of classes should be encouraged or penalized. Although the idea of  incorporating co-occurrence to the energy function is not new [&lt;a href="http://www.csail.mit.edu/~murphyk/Papers/AImemo03.pdf"&gt;Torralba et al, 2003&lt;/a&gt;; &lt;a href="http://vision.cs.ucla.edu/papers/rabinovichVGWB07.pdf"&gt;Rabinovich et al., 2007&lt;/a&gt;], the authors claim that their method is the first one which simultaneously satisfy the four conditions: &lt;i&gt;global energy minimization&lt;/i&gt; (implicit global term rather than a multi-stage heuristic process), &lt;i&gt;invariance to the structure of classes&lt;/i&gt; (see above), &lt;i&gt;efficiency &lt;/i&gt;(not to make the model order of magnitude larger) and &lt;i&gt;parsimony &lt;/i&gt;(MDL prior, see above).&lt;/div&gt;&lt;div style="text-align: justify;"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div style="text-align: justify;"&gt;How do the authors minimize the energy? They restrict the global term to the function of the set of classes represented in the image, which is monotonic w.r.t. argument set enclosing (more classes, more penalty). Then the authors introduce some fake nodes for αβ-swap or α-expansion procedure, so that the optimized energy remains submodular. It is really similar to how they applied graph-cut based techniques for minimizing energy with higher-order cliques [&lt;a href="http://www.springerlink.com/index/U44182U53Q41078Q.pdf"&gt;Kohli, Ladicky and Torr, 2009&lt;/a&gt;]. So when you face some non-local terms in the energy, you can try something similar. &lt;/div&gt;&lt;div style="text-align: justify;"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div style="text-align: justify;"&gt;What are the shortcomings of the method? It would be great to penalize &lt;i&gt;objects &lt;/i&gt;in addition to classes. First, local interactions are taken into account, as well as global ones. But what about the medium level? Shape of an object, size, colour consistency etc. are the great cues. Second, on the global level only inter-class co-occurrences play a role, but what about the intra-class ones? It is impossible to have two suns in a photo, but it is likely to meet several pedestrians walking along the street. It is actually done by &lt;a href="http://www.cse.wustl.edu/~mgeorg/readPapers/byVenue/iccv2009/desai2009_iccv_discriminativeModelsMulticlassObjectLayout.pdf"&gt;Desai et al. [2009]&lt;/a&gt; for object detection.&lt;/div&gt;&lt;div style="text-align: justify;"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div style="text-align: justify;"&gt;The second paper is by Gupta et al., who have remembered the romantic period of computer vision, when the scenes composed of perfect geometrical shapes were reconstructed successfully. They address the problem of 3D reconstruction by a single image, like in &lt;a href="http://www.cs.uiuc.edu/homes/dhoiem/projects/popup/index.html"&gt;auto pop-up&lt;/a&gt; [&lt;a href="http://www.cs.uiuc.edu/homes/dhoiem/publications/popup.pdf"&gt;Hoiem, Efros and Hebert, 2005&lt;/a&gt;]. They compare the result of auto pop-up with &lt;a href="http://en.wikipedia.org/wiki/Potemkin_village"&gt;Potemkin villages&lt;/a&gt;: "there is nothing behind the pretty façade." (I believe this comparison is the contribution of &lt;a href="http://www.cs.cmu.edu/~efros/"&gt;the second author&lt;/a&gt;). Instead of surfaces, they fit boxes into the image, which allows them to put a wider range of constraints to the 3D structure, including: &lt;/div&gt;&lt;div style="text-align: justify;"&gt;&lt;ul&gt;&lt;li&gt;&lt;i&gt;static equilibrium&lt;/i&gt;: it seems that the only property they check here is that centroid is projected into the figure bearing;&lt;/li&gt;&lt;li&gt;&lt;i&gt;enough support force&lt;/i&gt;: they estimate density (light -- vegetation, medium -- human, heavy -- buildings) and say that it is unlikely that building is build on the tree;&lt;/li&gt;&lt;li&gt;&lt;i&gt;volume constraint&lt;/i&gt;: boxes cannot intersect;&lt;/li&gt;&lt;li&gt;&lt;i&gt;depth ordering&lt;/i&gt;: backprojecting the result to the image plane should correspond to what we see on the image.&lt;/li&gt;&lt;/ul&gt;&lt;/div&gt;&lt;div style="text-align: justify;"&gt;This is a great paper that exploits Newtonian mechanics as well as human intuition, however, there are still some heuristics (like the density of a human) which could probably be generalized out. It seems that this approach has a big potential, so it might became the seminal paper for the new direction. Composing recognition with geometry reconstruction is quite trendy now, and this method is ideologically simple but effective. There are a lot of examples how the algorithm works on &lt;a href="http://www.cs.cmu.edu/~abhinavg/blocksworld/"&gt;the project page&lt;/a&gt;.&lt;/div&gt;&lt;div style="text-align: justify;"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div style="text-align: justify;"&gt;&lt;span class="Apple-style-span" style="font-size:large;"&gt;&lt;b&gt;Funny papers&lt;/b&gt;&lt;/span&gt;&lt;/div&gt;&lt;div style="text-align: justify;"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div style="text-align: justify;"&gt;There are a couple of ECCV papers which have fancy titles. The first one is "&lt;a href="http://grail.cs.washington.edu/malkovich/eccv10paper.pdf"&gt;Being John Malkovich&lt;/a&gt;" by &lt;a href="http://www.cs.washington.edu/homes/kemelmi/"&gt;Ira Kemelmacher-Shlizerman&lt;/a&gt;, &lt;a href="http://www.cs.washington.edu/homes/aditya/"&gt;Aditya Sankar&lt;/a&gt;, &lt;a href="http://www.adobe.com/technology/people/seattle/shechtman.html"&gt;Eli Shechtman&lt;/a&gt;, and &lt;a href="http://www.cs.washington.edu/homes/seitz/"&gt;Steve Seitz&lt;/a&gt; from &lt;a href="http://grail.cs.washington.edu/"&gt;the University of Washington GRAIL&lt;/a&gt;. If you've seen &lt;a href="http://www.imdb.com/title/tt0120601/"&gt;the movie&lt;/a&gt;, you can guess what is the article about. Given the video of someone pulling faces, the algorithm transforms it to the video of &lt;a href="http://en.wikipedia.org/wiki/John_Malkovich"&gt;John Malkovich&lt;/a&gt; making similar faces. "Ever wanted to be someone else? Now you can." In contrast to the movie, in the paper not necessarily John Malkovich plays himself: it could be George Bush, Cameron Diaz, John Clooney and even any person for whom you can find a sufficient video or photo database! You can see the video of the real-time puppetry on &lt;a href="http://grail.cs.washington.edu/malkovich/"&gt;the project page&lt;/a&gt;, although obvious lags take place and the result is still far from being perfect.&lt;/div&gt;&lt;div style="text-align: justify;"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div style="text-align: justify;"&gt;Another fancy title is "&lt;a href="http://cs.unc.edu/~jmf/publications/Frahm_et_al_ReconstructionFromPhotoCollection.pdf"&gt;Building Rome on a Cloudless Day&lt;/a&gt;". There are 11 (eleven) authors contributing to the paper, including &lt;a href="http://www.cs.unc.edu/~marc/"&gt;Marc Pollefeys&lt;/a&gt;. This summer I spent one cloudless day in Rome, and, to be honest, it was not that pleasant. So, why is the paper called this way then?  The paper refers to another one: "&lt;a href="http://grail.cs.washington.edu/rome/rome_paper.pdf"&gt;Building Rome in a Day&lt;/a&gt;" from ICCV 2009 by the guys from Washington again, which itself refers to the proverb "&lt;a href="http://en.wiktionary.org/wiki/Rome_wasn't_built_in_a_day"&gt;Rome was not built in a day&lt;/a&gt;." In this paper authors build a dense 3D model of some Rome sights using a set of Flickr photos tagged "Rome" or "Roma". Returning back to the monument of collective intelligence from ECCV2010, they did the same, but without cloud computing, that's why the day is cloudless now. S.P.Q.R.&lt;/div&gt;&lt;div style="text-align: justify;"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div style="text-align: justify;"&gt;I cannot avoid to mention here the following papers, although they are not from ECCV. Probably the most popular CVPR 2010 paper is "&lt;a href="http://www.cs.cmu.edu/~rahuls/pub/cvpr2010-food-rahuls.pdf"&gt;Food Recognition Using Statistics of Pairwise Local Features&lt;/a&gt;" by &lt;a href="http://www.cs.washington.edu/homes/yang/"&gt;Shulin Yang&lt;/a&gt;, &lt;a href="http://www.cs.cmu.edu/~meichen/"&gt;Mei Chen&lt;/a&gt;, &lt;a href="http://www.ri.cmu.edu/person.html?person_id=241"&gt;Dean Pomerleau&lt;/a&gt;, &lt;a href="http://www.cs.cmu.edu/~rahuls/"&gt;Rahul Sukthankar&lt;/a&gt;. The first page of the paper contains the motivation picture with a hamburger, and it looks pretty funny. They insist that the stuff from McDonald's is very different from that from Burger King, and it is really important to recognize them to keep track of the calories. Well, the authors don't look overweight, so the method should work.&lt;/div&gt;&lt;div style="text-align: justify;"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div style="text-align: justify;"&gt;The last paper in this section is "&lt;a href="http://vision.ucsd.edu/sites/default/files/gestalt.pdf"&gt;Paper Gestalt&lt;/a&gt;" by the imaginary Carven von Bearnensquash, published in Secret Proceedings of Computer Vision and Pattern Recognition (CVPR), 2010. The authors (presumably from UCSD) make fun of the way we usually write computer vision papers assuming that some features might convince a reviewer to accept or reject the paper, like mathematical formulas that create an illusion of author qualification (although if they are irrelevant), ROC curves etc. It also derides the attempts to apply black-box machine-learning techniques without the appropriate analysis of the possible features. Now I am trying to subscribe to the Journal of Machine Learning Gossip.&lt;/div&gt;&lt;div style="text-align: justify;"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div style="text-align: justify;"&gt;&lt;span class="Apple-style-span" style="font-size: large;"&gt;&lt;b&gt;Colleagues&lt;/b&gt;&lt;/span&gt;&lt;/div&gt;&lt;div style="text-align: justify;"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div style="text-align: justify;"&gt;There was only one paper from our lab at the conference: "&lt;a href="http://research.microsoft.com/en-us/um/people/pkohli/papers/bltk_eccv2010.pdf"&gt;Geometric Image Parsing in Man-Made Environments&lt;/a&gt;" by &lt;a href="http://graphics.cs.msu.ru/en/people/staff/obarinova"&gt;Olga Barinova&lt;/a&gt; and Elena Tretiak (in co-authorship with &lt;a href="http://www.robots.ox.ac.uk/~vilem/"&gt;Victor Lempitsky&lt;/a&gt; and Pushmeet Kohli). The scheme similar to the image parsing framework [&lt;a href="http://www.loni.ucla.edu/~ztu/publication/IJCV_parsing.pdf"&gt;Tu et al., 2005&lt;/a&gt;] is utilized, i.e. top-down analysis is performed. They detect parallel lines (like edges of buildings and windows), their vanishing points and the zenith jointly, using a witty graphical model. The approach is claimed to be robust to the clutter in the edge map.&lt;/div&gt;&lt;div style="text-align: justify;"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div style="text-align: justify;"&gt;Indeed, this paper could not have been possible without me. =) It was me who convinced Lena to join the lab two years ago (actually, it was more like convincing her not to apply for &lt;a href="http://lvk.cs.msu.su/"&gt;the other lab&lt;/a&gt;). So, the lab will remember me at least as a decent selectioner/scout...&lt;/div&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/2293611840857369720-3980465889131585286?l=computerblindness.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://computerblindness.blogspot.com/feeds/3980465889131585286/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://computerblindness.blogspot.com/2010/08/eccv-2010-highlights.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/2293611840857369720/posts/default/3980465889131585286'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/2293611840857369720/posts/default/3980465889131585286'/><link rel='alternate' type='text/html' href='http://computerblindness.blogspot.com/2010/08/eccv-2010-highlights.html' title='ECCV 2010 highlights'/><author><name>Roman Shapovalov</name><uri>https://profiles.google.com/103110049533984878325</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' src='//lh6.googleusercontent.com/-8d0JtBNGM_o/AAAAAAAAAAI/AAAAAAAAAAA/Vc7QubaXpjc/s512-c/photo.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-2293611840857369720.post-8540749275212097267</id><published>2010-08-27T22:49:00.006+04:00</published><updated>2010-08-28T00:14:43.853+04:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='comics'/><category scheme='http://www.blogger.com/atom/ns#' term='humour'/><category scheme='http://www.blogger.com/atom/ns#' term='software'/><category scheme='http://www.blogger.com/atom/ns#' term='job'/><category scheme='http://www.blogger.com/atom/ns#' term='history'/><category scheme='http://www.blogger.com/atom/ns#' term='Google'/><category scheme='http://www.blogger.com/atom/ns#' term='web-site'/><title type='text'>Comic strip: 10%</title><content type='html'>&lt;div&gt;Last summer, during my internship at &lt;a href="http://cmp.felk.cvut.cz/"&gt;CMP&lt;/a&gt; in Prague I drew some &lt;a href="http://xkcd.com/"&gt;xkcd&lt;/a&gt;-style comic strips, and I want to share some of them here. I considered starting a different blog for that, but there are too few of them, and they do suck (at least in comparison to Randall's work). However, most of them are programming/research related, so they are appropriate here.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;I planned to use them when I would have nothing to post about. This is not the case now (I am planning a couple more posts in a week or so), but this one is topical. It was originally drawn in September 1, 2009 when &lt;a href="http://gmailblog.blogspot.com/2009/09/todays-gmail-problems.html"&gt;GMail was not operating&lt;/a&gt; for a few hours. From the recent news: &lt;a href="http://googleblog.blogspot.com/2010/08/update-on-google-wave.html"&gt;Google shuts Google Wave down&lt;/a&gt;, admitting their failure predicted by some sceptics (see the image title text).&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://shapovalov.ro/images/comics/GMail.png"&gt;&lt;img style="display:block; margin:0px auto 10px; text-align:center;cursor:pointer; cursor:hand;width: 500px; " title="Those employees who were working on Google Wave as their main project used to waste up to 80% of their working time" src="http://shapovalov.ro/images/comics/GMail_preview.png" border="0" /&gt;&lt;/a&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/2293611840857369720-8540749275212097267?l=computerblindness.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://computerblindness.blogspot.com/feeds/8540749275212097267/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://computerblindness.blogspot.com/2010/08/comic-strip-10.html#comment-form' title='2 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/2293611840857369720/posts/default/8540749275212097267'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/2293611840857369720/posts/default/8540749275212097267'/><link rel='alternate' type='text/html' href='http://computerblindness.blogspot.com/2010/08/comic-strip-10.html' title='Comic strip: 10%'/><author><name>Roman Shapovalov</name><uri>https://profiles.google.com/103110049533984878325</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' src='//lh6.googleusercontent.com/-8d0JtBNGM_o/AAAAAAAAAAI/AAAAAAAAAAA/Vc7QubaXpjc/s512-c/photo.jpg'/></author><thr:total>2</thr:total></entry><entry><id>tag:blogger.com,1999:blog-2293611840857369720.post-7079292272814369704</id><published>2010-08-19T01:49:00.008+04:00</published><updated>2010-08-21T15:18:14.674+04:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='graphical models'/><category scheme='http://www.blogger.com/atom/ns#' term='computational complexity'/><category scheme='http://www.blogger.com/atom/ns#' term='paper'/><category scheme='http://www.blogger.com/atom/ns#' term='reblog'/><category scheme='http://www.blogger.com/atom/ns#' term='science'/><title type='text'>Theorem 8.10. P ≠ NP.</title><content type='html'>&lt;div style="text-align: justify;"&gt;Nearly two weeks ago Indian-American mathematician &lt;a href="http://www.hpl.hp.com/personal/Vinay_Deolalikar/"&gt;Vinay Deolalikar&lt;/a&gt;, the employee of HP Labs, sent a letter to a group of mathematicians asking them to proof-read his attempt to prove &lt;a href="http://en.wikipedia.org/wiki/P_versus_NP_problem"&gt;P ≠ NP&lt;/a&gt;. The pre-print was eventually distributed over the Internet, so a lot of people became aware. Later two major flaws were found in the proof, so it is now considered wrong. Nevertheless, Vinay has removed the paper from his page and commented that he fixed all the issues and is going to submit a paper to the journal review. You can download the original paper from my web-site [&lt;a href="http://shapovalov.ro/botva/pnp12pt.pdf"&gt;Deolalikar, 2010&lt;/a&gt;]. &lt;/div&gt;&lt;div style="text-align: justify;"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div style="text-align: justify;"&gt;We'll see if he is bluffing or not. In any case, it is very pleasant to me that the main role in the proof is played by... the graphical probabilistic model framework. Yeah, those graphical models we use in computer vision! It was surprising indeed, since the problems in computational complexity are usually discrete and well-posed. So, this post is devoted to understanding why a graphical model emerges in the proof.&lt;/div&gt;&lt;div style="text-align: justify;"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div style="text-align: justify;"&gt;First, I am going to review some key points of the proof. It is far from being exhaustive, just some rough sketches that are critical for understanding. Then I report on the flaws found in the proof, and then I end up with a methodological note on the whole P vs. NP problem. &lt;/div&gt;&lt;div style="text-align: justify;"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div style="text-align: justify;"&gt;&lt;span class="Apple-style-span"  style="font-size:large;"&gt;&lt;b&gt;k-SAT and d1RSB Phase&lt;/b&gt;&lt;/span&gt;&lt;/div&gt;&lt;div style="text-align: justify;"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div style="text-align: justify;"&gt;In order to prove P ≠ NP, it is enough to show the absence of a polynomial algorithm for some problem from &lt;a href="http://en.wikipedia.org/wiki/NP_(complexity)"&gt;the class NP&lt;/a&gt;, e.g. for some &lt;a href="http://en.wikipedia.org/wiki/NP-complete"&gt;NP-complete&lt;/a&gt; problem (which are the "hardest" problems in NP). &lt;a href="http://en.wikipedia.org/wiki/Boolean_satisfiability_problem"&gt;The boolean satisfiability problem&lt;/a&gt; (SAT) addresses the issue of checking if there is at least one assignment of the boolean variables in the formula, represented in conjunctive normal form (CNF), which makes it TRUE. The k-satisfiability problem (k-SAT) is a particular case of SAT, where all the clauses in the CNF have order &lt;i&gt;k&lt;/i&gt;. For example, (&lt;i&gt;x&lt;/i&gt;˅&lt;i&gt;y&lt;/i&gt;)&amp;amp;(¬&lt;i&gt;x&lt;/i&gt;˅&lt;i&gt;y&lt;/i&gt;)&amp;amp;(&lt;i&gt;x&lt;/i&gt;˅¬&lt;i&gt;y&lt;/i&gt;)&amp;amp;(¬&lt;i&gt;x&lt;/i&gt;˅¬&lt;i&gt;y&lt;/i&gt;) is an instance of 2-SAT which has the answer 'NO', since any boolean values of (&lt;i&gt;x, y&lt;/i&gt;) makes the formula false. (&lt;i&gt;x&lt;/i&gt;˅&lt;i&gt;y&lt;/i&gt;˅&lt;i&gt;z&lt;/i&gt;)&amp;amp;(¬&lt;i&gt;x&lt;/i&gt;˅&lt;i&gt;u&lt;/i&gt;˅&lt;i&gt;v&lt;/i&gt;)&amp;amp;(¬&lt;i&gt;x&lt;/i&gt;˅&lt;i&gt;u&lt;/i&gt;˅¬&lt;i&gt;z&lt;/i&gt;) is a 3-SAT over 5 variables (x, y, z, u, v), which is satisfiable, e.g. on the input (1,0,0,1,0). 2-SAT is in P, but k-SAT is NP-complete whenever &lt;i&gt;k&lt;/i&gt; &gt; 2. The Deolalikar's idea is to show that k-SAT (with &lt;i&gt;k&lt;/i&gt; &gt; 8) is outside of P, which (if true) proves that there is a separation between P and NP.&lt;/div&gt;&lt;div style="text-align: justify;"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div style="text-align: justify;"&gt;The space of possible k-SAT solutions is analysed. Let &lt;i&gt;m&lt;/i&gt; be the number of clauses in the CNF, &lt;i&gt;n&lt;/i&gt; be the number of variables. Consider the ratio &lt;i&gt;α&lt;/i&gt; = &lt;i&gt;m&lt;/i&gt;/&lt;i&gt;n&lt;/i&gt;. In the border case &lt;i&gt;m&lt;/i&gt; = 1 (&lt;i&gt;α &lt;/i&gt;is small), k-SAT is always true, there are usually a lot of inputs that satisfy the CNF. Consider an ensemble of random k-SATs (the following is statistical reasoning). With the growth of &lt;i&gt;α&lt;/i&gt;, the probability of the CNF to be satisfiable decreases. When a certain threshold &lt;i&gt;α&lt;sub&gt;d&lt;/sub&gt; &lt;/i&gt;is reached, the "true" solutions set breaks into clusters. This stage is known in statistical mechanics as &lt;i&gt;dynamical one step replica symmetry breaking&lt;/i&gt; (d1RSB) phase. Moving further, we make the problem completely unsatisfiable. It turns out that d1RSB stage subproblems are the most challenging of the all k-SAT problems. The effect could be observed only if &lt;i&gt;k&lt;/i&gt; &gt; 8, that's why such problems are used in the proof. [&lt;a href="http://shapovalov.ro/botva/pnp12pt.pdf"&gt;Deolalikar, 2010&lt;/a&gt;, Chapter 6]&lt;/div&gt;&lt;div style="text-align: justify;"&gt;&lt;br /&gt;&lt;a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://4.bp.blogspot.com/_IDEWI0P9RbA/TGxu-GTlYVI/AAAAAAAAADY/ZEBvw8N25Z0/s1600/d1RSB.png"&gt;&lt;img style="display:block; margin:0px auto 10px; text-align:center;cursor:pointer; cursor:hand;width: 400px; height: 144px;" src="http://4.bp.blogspot.com/_IDEWI0P9RbA/TGxu-GTlYVI/AAAAAAAAADY/ZEBvw8N25Z0/s400/d1RSB.png" border="0" alt="" id="BLOGGER_PHOTO_ID_5506898457490973010" /&gt;&lt;/a&gt;&lt;br /&gt;&lt;/div&gt;&lt;div style="text-align: justify;"&gt;&lt;b&gt;&lt;span class="Apple-style-span"  style="font-size:large;"&gt;FO(PFP) and FO(LFP)&lt;/span&gt;&lt;/b&gt;&lt;/div&gt;&lt;div style="text-align: justify;"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div style="text-align: justify;"&gt;In the &lt;a href="http://en.wikipedia.org/wiki/Finite_model_theory"&gt;finite model theory&lt;/a&gt; there are recurrent extensions of the first-order logic. The predicate P&lt;sub&gt;&lt;i&gt;i&lt;/i&gt;&lt;/sub&gt;(&lt;i&gt;x&lt;/i&gt;) is evaluated as some second-order function &lt;i&gt;φ&lt;/i&gt;(P&lt;sub&gt;&lt;i&gt;i-1&lt;/i&gt;&lt;/sub&gt;&lt;i&gt;, x&lt;/i&gt;), where x is a &lt;i&gt;k&lt;/i&gt;-element vector, P&lt;sub&gt;&lt;i&gt;0&lt;/i&gt;&lt;/sub&gt;(&lt;i&gt;x&lt;/i&gt;) ≡ false. In the general case, either there is a fixed point, or is P&lt;sub&gt;&lt;i&gt;i &lt;/i&gt;&lt;/sub&gt;looping. For example, if &lt;i&gt;φ&lt;/i&gt;(P&lt;i&gt;, x&lt;/i&gt;) ≡ ¬P(&lt;i&gt;x&lt;/i&gt;), then P&lt;sub&gt;&lt;i&gt;0&lt;/i&gt;&lt;/sub&gt;(&lt;i&gt;x&lt;/i&gt;) ≡ false, P&lt;sub&gt;&lt;i&gt;1&lt;/i&gt;&lt;/sub&gt;(&lt;i&gt;x&lt;/i&gt;) ≡ true, P&lt;sub&gt;&lt;i&gt;2&lt;/i&gt;&lt;/sub&gt;(&lt;i&gt;x&lt;/i&gt;) ≡ false, etc. Here &lt;i&gt;x&lt;/i&gt; is meaningless, but it is not always the case. Consider the following definition: &lt;i&gt;φ&lt;/i&gt;(P&lt;i&gt;, x&lt;/i&gt;) ≡ max(&lt;i&gt;x&lt;/i&gt;) ˅ P(&lt;i&gt;x&lt;/i&gt;) // recall &lt;i&gt;x&lt;/i&gt; is a vector, in the boolean case 'max' is the same as 'or'.  If &lt;i&gt;x&lt;/i&gt; contains at least one non-zero element, P&lt;sub&gt;&lt;i&gt;0&lt;/i&gt;&lt;/sub&gt;(&lt;i&gt;x&lt;/i&gt;) = false, P&lt;sub&gt;&lt;i&gt;1&lt;/i&gt;&lt;/sub&gt;(&lt;i&gt;x&lt;/i&gt;) = true, P&lt;sub&gt;&lt;i&gt;2&lt;/i&gt;&lt;/sub&gt;(&lt;i&gt;x&lt;/i&gt;) = true, etc. Otherwise, P&lt;sub&gt;&lt;i&gt;i&lt;/i&gt;&lt;/sub&gt;(&lt;b&gt;0&lt;/b&gt;) = false for all &lt;i&gt;i&lt;/i&gt;. In the case of looping, let's redefine the fixed point to be constantly false. &lt;a href="http://en.wikipedia.org/wiki/FO_(complexity)#Partial_fixed_point_is_PSPACE"&gt;&lt;i&gt;FO(PFP)&lt;/i&gt;&lt;/a&gt; is a class of problems of checking if there will be a loop for some &lt;i&gt;x&lt;/i&gt;, or a fixed point. They are actually very difficult problems. FO(PFP) is equal to the whole &lt;a href="http://en.wikipedia.org/wiki/PSPACE"&gt;PSPACE&lt;/a&gt;. Suppose now that &lt;i&gt;φ&lt;/i&gt; is monotonic on P. It means that P(&lt;i&gt;x&lt;/i&gt;) only appears in the formula with even number of negations before it (or zero, as in the second example). This means that once P&lt;sub&gt;&lt;i&gt;i&lt;/i&gt;&lt;/sub&gt;(&lt;i&gt;x&lt;/i&gt;) is true, P&lt;sub&gt;&lt;i&gt;j&lt;/i&gt;&lt;/sub&gt;(&lt;i&gt;x&lt;/i&gt;) will be true for any &lt;i&gt;j&lt;/i&gt; &gt; &lt;i&gt;i&lt;/i&gt;. This reduces the class to &lt;a href="http://en.wikipedia.org/wiki/FO_(complexity)#Least_Fixed_Point_is_PTIME"&gt;&lt;i&gt;FO(LFP)&lt;/i&gt;&lt;/a&gt;, which is proven to be equal to the class &lt;a href="http://en.wikipedia.org/wiki/P_(complexity)"&gt;P&lt;/a&gt;. So, the global problem now is to show that k-SAT is outside of the class FO(LFP).&lt;/div&gt;&lt;div style="text-align: justify;"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div style="text-align: justify;"&gt;&lt;b&gt;&lt;span class="Apple-style-span"  style="font-size:large;"&gt;ENSP: the graphical model for proving P ≠ NP&lt;/span&gt;&lt;/b&gt;&lt;/div&gt;&lt;div style="text-align: justify;"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div style="text-align: justify;"&gt;So how a graphical model emerges here? Graphical model is a way to describe a multivariate probability distribution, i.e. dependencies of covariates in the distribution. Recall now &lt;a href="http://en.wikipedia.org/wiki/NP_(complexity)#Verifier-based_definition"&gt;the definition of NP&lt;/a&gt;. If we somehow guess the certificate, we are done (i.e. have a polynomial algorithm). If the space of certificates (possible solutions, in terms of Deolalikar) is simple, we can probably get a polynomial algorithm. What is simple? This means that the distribution is parametrizable with a small amount of parameters (c.f. &lt;a href="http://en.wikipedia.org/wiki/Intrinsic_dimension"&gt;intrinsic dimensionality&lt;/a&gt;), which allows us to traverse the solution space efficiently. This is twofold. First, the distribution is simple if it has a limited support, i.e. all the probability measure is distributed among a limited number of points of the probabilistic space. Second, it is simple if the covariates are as much conditionally independent as possible. In the ideal case, if all the groups of covariates are independent (recall that &lt;a href="http://www.morris.umn.edu/~sungurea/introstat/stat2611/independence.html"&gt;pairwise and mutual independence do not subsume each other!&lt;/a&gt;), we need only &lt;i&gt;n&lt;/i&gt; parameters to describe the distribution (&lt;i&gt;n&lt;/i&gt; is the number of covariates), while in general case this number is exponential. See [&lt;a href="http://shapovalov.ro/botva/pnp12pt.pdf"&gt;Deolalikar, 2010&lt;/a&gt;, p. 6] for examples.&lt;/div&gt;&lt;div style="text-align: justify;"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div style="text-align: justify;"&gt;How to measure the degree of conditional independence? Yes, to build a graphical model. It is factorized to cliques according to &lt;a href="http://computerblindness.blogspot.com/2010/01/on-mrf-factorization.html"&gt;Hammersley-Clifford theorem&lt;/a&gt;. Larger cliques you get, stronger dependency is. When the largest cliques are k-interactions, the distribution can be parametrized with &lt;i&gt;n&lt;/i&gt;2&lt;sup&gt;&lt;i&gt;k&lt;/i&gt;&lt;/sup&gt; independent parameters. Finally, Vinay shows that FO(LFP) can deal only with the distributions parametrizable with 2&lt;sup&gt;poly(log &lt;i&gt;n&lt;/i&gt;)&lt;/sup&gt; parameters, which is not the case for d1RSB k-SAT (its solution space is too complicated). &lt;/div&gt;&lt;div style="text-align: justify;"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div style="text-align: justify;"&gt;In order to show it strictly, Deolalikar introduces a tailored graphical model describing LO(LPF) iterative process, the E&lt;i&gt;lement-Neighborhood-Stage Product&lt;/i&gt; model (ENSP):&lt;/div&gt;&lt;div style="text-align: justify;"&gt;&lt;br /&gt;&lt;a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://4.bp.blogspot.com/_IDEWI0P9RbA/TGzwEAdhhiI/AAAAAAAAADg/Clp6Qz0wNRw/s1600/ENSP.png"&gt;&lt;img style="display:block; margin:0px auto 10px; text-align:center;cursor:pointer; cursor:hand;width: 300px; height: 400px;" src="http://4.bp.blogspot.com/_IDEWI0P9RbA/TGzwEAdhhiI/AAAAAAAAADg/Clp6Qz0wNRw/s400/ENSP.png" border="0" alt="" id="BLOGGER_PHOTO_ID_5507040396001248802" /&gt;&lt;/a&gt;There are two types of sites: corresponding to the variables on the stages of LFP (small circles) and corresponding to the elements of witty neighbourhood system (some closure over &lt;a href="http://en.wikipedia.org/wiki/Gaifman_graph"&gt;Gaifman graph&lt;/a&gt;; big blue circles). When a variable is assigned 0 or 1, it is painted with red or green respectively. The last column corresponds to the fixed point, all the variables are assigned. Thus, this model is easily factorizable to the small cliques, and it cannot describe the imaginable LFP process for some k-SAT. See [&lt;a href="http://shapovalov.ro/botva/pnp12pt.pdf"&gt;Deolalikar, 2010&lt;/a&gt;, Chapter 8] for the details.&lt;/div&gt;&lt;div style="text-align: justify;"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div style="text-align: justify;"&gt;&lt;span class="Apple-style-span"  style="font-size:large;"&gt;&lt;b&gt;So what's the problem?&lt;/b&gt;&lt;/span&gt;&lt;/div&gt;&lt;div style="text-align: justify;"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div style="text-align: justify;"&gt;&lt;a href="http://www.cs.umass.edu/~immerman/"&gt;Neil Immerman&lt;/a&gt;, an expert in the finite model theory, noticed two major flaws. The first one is that Vinay actually model only monadic LFP, which is not equal to P, as assumed. He thus proved that there is a gap between NP and some subclass of P, which is not obligatory equals P. The second major flow deals with modelling k-SAT as a logical structure. I do not fully understand this step, but some order-invariance assumptions are wrong. Here is a discussion in &lt;a href="http://rjlipton.wordpress.com/2010/08/15/the-p%E2%89%A0np-proof-is-one-week-old/"&gt;Lipton's blog&lt;/a&gt;. According to it, the flaws are really fatal.&lt;/div&gt;&lt;div style="text-align: justify;"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div style="text-align: justify;"&gt;&lt;span class="Apple-style-span"  style="font-size:large;"&gt;&lt;b&gt;The third way&lt;/b&gt;&lt;/span&gt;&lt;/div&gt;&lt;div style="text-align: justify;"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div style="text-align: justify;"&gt;It seems that it is really difficult to prove P ≠ NP. The most of scientists think that P = NP is improbable, and that's why the hypothesis P ≠ NP is generally adopted now. But there is one more option: neither P = NP nor P ≠ NP is provable. As every mathematical theory, computational complexity is a set of axioms and statements, which are usually could be proved to be correct or not. But there are sometimes some statements formulated in terms of the theory, which could not. Moreover, according to &lt;a href="http://en.wikipedia.org/wiki/Goedel's_theorem#First_incompleteness_theorem"&gt;the first Gödel's incompleteness theorem&lt;/a&gt;, in any mathematical theory based on our natural understanding of natural numbers, there is either a statement that could be proved both true or false, or an unprovable one. I know no actual reasons why this could not be the case for P ≠ NP (although I have feelings there are some since it is not usually taken into account). &lt;/div&gt;&lt;div style="text-align: justify;"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div style="text-align: justify;"&gt;Suppose it is really so. This would mean that the whole theory should be reconsidered. Some axioms or definitions will change to make this fact provable. But may be anything else will become unprovable. Who knows...&lt;/div&gt;&lt;div style="text-align: justify;"&gt;&lt;i&gt;&lt;/i&gt;&lt;/div&gt;&lt;div style="text-align: justify;"&gt;&lt;i&gt;&lt;/i&gt;&lt;/div&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/2293611840857369720-7079292272814369704?l=computerblindness.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://computerblindness.blogspot.com/feeds/7079292272814369704/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://computerblindness.blogspot.com/2010/08/theorem-810-p-np.html#comment-form' title='3 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/2293611840857369720/posts/default/7079292272814369704'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/2293611840857369720/posts/default/7079292272814369704'/><link rel='alternate' type='text/html' href='http://computerblindness.blogspot.com/2010/08/theorem-810-p-np.html' title='Theorem 8.10. P ≠ NP.'/><author><name>Roman Shapovalov</name><uri>https://profiles.google.com/103110049533984878325</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' src='//lh6.googleusercontent.com/-8d0JtBNGM_o/AAAAAAAAAAI/AAAAAAAAAAA/Vc7QubaXpjc/s512-c/photo.jpg'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' url='http://4.bp.blogspot.com/_IDEWI0P9RbA/TGxu-GTlYVI/AAAAAAAAADY/ZEBvw8N25Z0/s72-c/d1RSB.png' height='72' width='72'/><thr:total>3</thr:total></entry><entry><id>tag:blogger.com,1999:blog-2293611840857369720.post-8906501423918336164</id><published>2010-08-15T22:38:00.002+04:00</published><updated>2010-08-16T09:35:32.513+04:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='graphical models'/><category scheme='http://www.blogger.com/atom/ns#' term='recognition'/><category scheme='http://www.blogger.com/atom/ns#' term='computer vision'/><category scheme='http://www.blogger.com/atom/ns#' term='labelling'/><category scheme='http://www.blogger.com/atom/ns#' term='parsing'/><category scheme='http://www.blogger.com/atom/ns#' term='paper'/><category scheme='http://www.blogger.com/atom/ns#' term='history'/><category scheme='http://www.blogger.com/atom/ns#' term='MCMC'/><category scheme='http://www.blogger.com/atom/ns#' term='MRF'/><title type='text'>Image Parsing: Unifying Segmentation, Detection, and Recognition</title><content type='html'>&lt;div style="text-align: justify;"&gt;&lt;a href="http://computerblindness.blogspot.com/2010/06/object-detection-vs-semantic.html"&gt;Recently I have posted&lt;/a&gt; about the connection between object category detection and semantic image segmentation. While delving into this problem more deeply, I have found the paper denoted in the title of this post.&lt;/div&gt;&lt;div style="text-align: justify;"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div style="text-align: justify;"&gt;The ICCV 2003 paper by Tu et al. was among the three &lt;a href="http://en.wikipedia.org/wiki/Marr_Prize#9th_ICCV.2C_2003.2C_Nice.2C_France"&gt;Marr prize&lt;/a&gt; winners. And it is really a prominent piece of work! In &lt;a href="http://resources.metapress.com/pdf-preview.axd?code=t5r052375jr10776&amp;amp;size=largest"&gt;the introduction&lt;/a&gt; to &lt;a href="http://www.springerlink.com/content/0920-5691/63/2/"&gt;the special Marr prize IJCV issue&lt;/a&gt;, &lt;a href="http://ljk.imag.fr/membres/Bill.Triggs/////////"&gt;Bill Triggs&lt;/a&gt; featured the paper as one that "would have been particularly pleasing to &lt;a href="http://en.wikipedia.org/wiki/David_Marr_(neuroscientist)"&gt;David Marr&lt;/a&gt;", because of "its bold attack on the central problem of perceptual organization and whole scene understanding". &lt;a href="http://www.loni.ucla.edu/~ztu/publication/IJCV_parsing.pdf"&gt;Here&lt;/a&gt; is the journal version of the paper.&lt;/div&gt;&lt;div style="text-align: justify;"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;div style="text-align: justify;"&gt;The authors combined discriminative and generative models, which resulted to the unified framework for image parsing. For each image the corresponding parsing tree could be found. The root of the tree represents the whole scene. On the next level, nodes represent semantic fragments, such as human faces, text, or textured regions. The leaves of the tree correspond to the pixels of the image. &lt;/div&gt;&lt;div style="text-align: justify;"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div style="text-align: justify;"&gt;The approach differs drastically from the vanilla CRF framework in the way that the structure of the parsing tree is dynamic while the CRF structure remains constant and just interaction models between the pairs of sites may change. The goal is to obtain the tree that maximizes the posterior probability given the image. The directed search in the space of valid trees is performed by means of &lt;a href="http://en.wikipedia.org/wiki/Markov_chain_Monte_Carlo"&gt;Markov chain Monte-Carlo&lt;/a&gt; (MCMC). The possible tree changes like split and merge of regions, varying the border between regions, are defined. Such changes are described in terms of learnable Markov chain transition kernels.&lt;/div&gt;&lt;/div&gt;&lt;div style="text-align: justify;"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div style="text-align: justify;"&gt;What they actually did is they effectively combined top-down and bottom-up approaches. In a nutshell, the bottom-up (discriminative) method generates the hypotheses of pixel labels and object positions using the local neighbourhood, and the top-down generative model is then built making use of those hypotheses. The latest guarantees the consistency of the output and the optimum of its posterior probability.&lt;/div&gt;&lt;div style="text-align: justify;"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div style="text-align: justify;"&gt;Okay, what does it mean for the issue of detection and segmentation convergence? Because the shapes of objects are determined implicitly during the recognition, and stuff regions are detected by their colour and texture, the problem of semantic image segmentation is actually solved. Moreover, &lt;i&gt;multi-scale semantic segmentation&lt;/i&gt; is performed during the parsing. Therefore, image parsing seems to be the most general formulation of image recognition problem.&lt;/div&gt;&lt;div style="text-align: justify;"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div style="text-align: justify;"&gt;The nodes of a parsing tree, which belong to the same depth level, are not connected (right, because it is a tree). This means that spatial interactions could not be modelled directly. A totally different approach was introduced in another Marr prize winning paper by &lt;a href="http://www.ics.uci.edu/~fowlkes/papers/drf-iccv09.pdf"&gt;Desai et al [2009]&lt;/a&gt;. They also use bottom-up tests to generate hypotheses on the object locations and then use CRF over those location to take spatial context into account. They model different kinds of inter-class and intra-class interactions. Thus, they ensure that the objects in the image (described by their bounding boxes) are arranged correctly, e.g. a cup is likely to be on a table (spatial arrangement), while a train could not be close to a ship (mutual exclusion).&lt;/div&gt;&lt;div style="text-align: justify;"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div style="text-align: justify;"&gt;It seems that the two papers exploit the two different forms of information (aggregation vs. spatial interactions), which are mutually redundant to a certain extent, but do not completely exhaust each other. It seems that the group of researchers who will successfully combine those approaches will receive some 201x Marr prize.&lt;/div&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/2293611840857369720-8906501423918336164?l=computerblindness.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://computerblindness.blogspot.com/feeds/8906501423918336164/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://computerblindness.blogspot.com/2010/08/image-parsing-unifying-segmentation.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/2293611840857369720/posts/default/8906501423918336164'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/2293611840857369720/posts/default/8906501423918336164'/><link rel='alternate' type='text/html' href='http://computerblindness.blogspot.com/2010/08/image-parsing-unifying-segmentation.html' title='Image Parsing: Unifying Segmentation, Detection, and Recognition'/><author><name>Roman Shapovalov</name><uri>https://profiles.google.com/103110049533984878325</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' src='//lh6.googleusercontent.com/-8d0JtBNGM_o/AAAAAAAAAAI/AAAAAAAAAAA/Vc7QubaXpjc/s512-c/photo.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-2293611840857369720.post-6754230677991532876</id><published>2010-07-28T00:41:00.010+04:00</published><updated>2010-07-29T01:51:43.193+04:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='ICVSS'/><category scheme='http://www.blogger.com/atom/ns#' term='education'/><category scheme='http://www.blogger.com/atom/ns#' term='life'/><title type='text'>ICVSS 2010: some trivia</title><content type='html'>&lt;div style="text-align: justify;"&gt;If you think that summer schools are only about lectures and posters, you are certainly wrong. Well, I also did not expect that I would get drunk every night (it was not in Russia after all!). On the first night I convinced people to go to the beach for night swimming, but on the next day I realized that it was not a really good idea, because lectures started at 9. Eventually, we did not go to bed early for the rest of the week.&lt;/div&gt;&lt;div style="text-align: justify;"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div style="text-align: justify;"&gt;I would like to thank "Marsa Sicla" people (Fig. 1), you were the great company!&lt;/div&gt;&lt;div style="text-align: center;"&gt;&lt;br /&gt;&lt;/div&gt;&lt;img src="http://2.bp.blogspot.com/_IDEWI0P9RbA/TFCOBatCnzI/AAAAAAAAAC0/ff-MjKNGfI0/s400/MSpeople.jpg" style="display:block; margin:0px auto 10px; text-align:center;cursor:pointer; cursor:hand;width: 400px; height: 235px;" border="0" alt="" id="BLOGGER_PHOTO_ID_5499051300018626354" /&gt;&lt;div style="text-align: center;"&gt;&lt;span class="Apple-style-span"  style="font-size:small;"&gt;Fig. 1. The Marsa Sicla company. Left to right: &lt;/span&gt;&lt;a href="http://flavors.me/robotblackdebbie"&gt;&lt;span class="Apple-style-span"  style="font-size:small;"&gt;Richard&lt;/span&gt;&lt;/a&gt;&lt;span class="Apple-style-span"  style="font-size:small;"&gt;, &lt;/span&gt;&lt;a href="http://www.cs.ucf.edu/~ramin/"&gt;&lt;span class="Apple-style-span"  style="font-size:small;"&gt;Ramin&lt;/span&gt;&lt;/a&gt;&lt;span class="Apple-style-span"  style="font-size:small;"&gt;,&lt;br /&gt;&lt;/span&gt;&lt;a href="http://www.lmt.ei.tum.de/team/alt/"&gt;&lt;span class="Apple-style-span"  style="font-size:small;"&gt;Nicolas&lt;/span&gt;&lt;/a&gt;&lt;span class="Apple-style-span"  style="font-size:small;"&gt;, &lt;/span&gt;&lt;a href="http://www.facebook.com/selinachenxi?ref=ts#!/selinachenxi"&gt;&lt;span class="Apple-style-span"  style="font-size:small;"&gt;Xi&lt;/span&gt;&lt;/a&gt;&lt;span class="Apple-style-span"  style="font-size:small;"&gt;, &lt;/span&gt;&lt;a href="http://www.stanford.edu/~vijayc/"&gt;&lt;span class="Apple-style-span"  style="font-size:small;"&gt;Vijay&lt;/span&gt;&lt;/a&gt;&lt;span class="Apple-style-span"  style="font-size:small;"&gt;, &lt;/span&gt;&lt;a href="http://www.limsi.fr/Annuaire/infos?id=cchastag"&gt;&lt;span class="Apple-style-span"  style="font-size:small;"&gt;Clément&lt;/span&gt;&lt;/a&gt;&lt;span class="Apple-style-span"  style="font-size:small;"&gt;, &lt;/span&gt;&lt;a href="http://www.linkedin.com/pub/aishwarya-natesh/13/845/103"&gt;&lt;span class="Apple-style-span"  style="font-size:small;"&gt;Aish&lt;/span&gt;&lt;/a&gt;&lt;span class="Apple-style-span"  style="font-size:small;"&gt;, &lt;/span&gt;&lt;a href="http://shapovalov.ro/"&gt;&lt;span class="Apple-style-span"  style="font-size:small;"&gt;me&lt;/span&gt;&lt;/a&gt;&lt;span class="Apple-style-span"  style="font-size:small;"&gt;. &lt;/span&gt;&lt;a href="http://www.ktl.elf.stuba.sk/portal/?a=17&amp;amp;s=222"&gt;&lt;span class="Apple-style-span"  style="font-size:small;"&gt;Sandra&lt;/span&gt;&lt;/a&gt;&lt;span class="Apple-style-span"  style="font-size:small;"&gt; is behind the camera.&lt;/span&gt;&lt;/div&gt;&lt;div style="text-align: center;"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div style="text-align: justify;"&gt;Actually, there was a reason why we had such a close-knit company. On the Fig. 2 you can see the map of the region where the school took place. The school was hosted by Baia Samuele hotel village (inside the blue frame on the map). The cheapest accommodation at Baia Samuele costed €100 per night, so some pragmatic people scared a different accommodation up. It was in a neighbouring hotel village, Marsa Sicla. &lt;/div&gt;&lt;div style="text-align: justify;"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div style="text-align: center;"&gt;&lt;iframe width="550" height="700" frameborder="0" scrolling="no" marginheight="0" marginwidth="0" src="http://maps.google.com/maps/ms?ie=UTF8&amp;amp;hl=en&amp;amp;t=k&amp;amp;msa=0&amp;amp;msid=114318388347498250868.00048c655613735765d52&amp;amp;ll=36.725023,14.753973&amp;amp;spn=0.012039,0.011802&amp;amp;z=16&amp;amp;output=embed"&gt;&lt;/iframe&gt;&lt;br /&gt;&lt;small&gt;Fig. 2.  ICVSS location map. View &lt;a href="http://maps.google.com/maps/ms?ie=UTF8&amp;amp;hl=en&amp;amp;t=k&amp;amp;msa=0&amp;amp;msid=114318388347498250868.00048c655613735765d52&amp;amp;ll=36.725023,14.753973&amp;amp;spn=0.012039,0.011802&amp;amp;z=16&amp;amp;source=embed" style="color:#0000FF;text-align:left"&gt;ICVSS_2010_MSvsBS&lt;/a&gt; in a larger map&lt;/small&gt;&lt;/div&gt;&lt;div style="text-align: justify;"&gt;&lt;/div&gt;&lt;div style="text-align: justify;"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div style="text-align: justify;"&gt;In a map it looked pretty close, but in practice it turned out that there were a lot of fences (blue line)! Officially, we had to walk along the red route (~2.5 km, partly along a highway) through the official entrance. Moreover, we had to leave the village for lunch (since there were buffet), otherwise we would have been charged €30 per meal. To control it they wanted us to leave our documents at the control post (bottom blue tick on Fig. 2). Thus, we were supposed to walk 2.5 km four times a day. But we found a better solution.&lt;/div&gt;&lt;div style="text-align: justify;"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div style="text-align: justify;"&gt;&lt;/div&gt;&lt;div style="text-align: justify;"&gt;Although the fence was equipped with a barbwire, we managed to find two shortcuts (green ticks). One of them led through ever-closed gate (Fig. 3), which were suitable for hopping over. So, we used the green path usually. It made the way two times shorter. Moreover, we did not have to leave Baia Samuele for lunch, since we weren't officially there. Sometimes we even had free lunch, that finally proved that &lt;a href="http://en.wikipedia.org/wiki/No_free_lunch_theorem"&gt;the no free lunch theorem&lt;/a&gt; is wrong (joke by Ramin).&lt;/div&gt;&lt;div style="text-align: justify;"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div style="text-align: justify;"&gt;&lt;a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://4.bp.blogspot.com/_IDEWI0P9RbA/TFCZgjifHcI/AAAAAAAAAC8/2AHMtWfUhck/s1600/IMG_0271.jpg"&gt;&lt;img style="text-align: justify;display: block; margin-top: 0px; margin-right: auto; margin-bottom: 10px; margin-left: auto; cursor: pointer; width: 400px; height: 300px; " src="http://4.bp.blogspot.com/_IDEWI0P9RbA/TFCZgjifHcI/AAAAAAAAAC8/2AHMtWfUhck/s400/IMG_0271.jpg" border="0" alt="" id="BLOGGER_PHOTO_ID_5499063929594125762" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;div style="text-align: center;"&gt;&lt;span class="Apple-style-span"  style="font-size:small;"&gt;Fig. 3. Hopping over the fence. &lt;/span&gt;&lt;/div&gt;&lt;div style="text-align: justify;"&gt;&lt;span class="Apple-style-span"  style="font-size:medium;"&gt;&lt;br /&gt;&lt;/span&gt;&lt;/div&gt;&lt;div style="text-align: justify;"&gt;Well, the lunch was really great. But this was not really a question of food, the main reason not to leave Baia Samuele was of course socialization, which used to be active during lunch (and also Internet access at the conference centre :). If we knew about such situation before booking the apartments, we would probably followed the official accommodation recommendations. However, we had a great company at Marsa Sicla too. But because of "unofficial night programme" I had to (almost) sleep during lectures. Most of guys was on their last year of PhD, so they considered the school as a vacation. But I really wanted to learn a lot of stuff. It was really difficult to perceive any information  when you had not slept enough time. Surprisingly, I somehow managed to pass the final exam, but I feel I need to look through the lecture slides again, reading up some papers they refer to, otherwise the scientific part of the school will be useless for me. &lt;/div&gt;&lt;div style="text-align: justify;"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div style="text-align: justify;"&gt;&lt;/div&gt;&lt;div style="text-align: justify;"&gt;Thus, I finish posting about ICVSS. To conclude, I want to recommend everyone to visit such summer schools, because they are the best way to enter into the community. &lt;/div&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/2293611840857369720-6754230677991532876?l=computerblindness.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://computerblindness.blogspot.com/feeds/6754230677991532876/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://computerblindness.blogspot.com/2010/07/icvss-2010-some-trivia.html#comment-form' title='2 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/2293611840857369720/posts/default/6754230677991532876'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/2293611840857369720/posts/default/6754230677991532876'/><link rel='alternate' type='text/html' href='http://computerblindness.blogspot.com/2010/07/icvss-2010-some-trivia.html' title='ICVSS 2010: some trivia'/><author><name>Roman Shapovalov</name><uri>https://profiles.google.com/103110049533984878325</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' src='//lh6.googleusercontent.com/-8d0JtBNGM_o/AAAAAAAAAAI/AAAAAAAAAAA/Vc7QubaXpjc/s512-c/photo.jpg'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' url='http://2.bp.blogspot.com/_IDEWI0P9RbA/TFCOBatCnzI/AAAAAAAAAC0/ff-MjKNGfI0/s72-c/MSpeople.jpg' height='72' width='72'/><thr:total>2</thr:total></entry><entry><id>tag:blogger.com,1999:blog-2293611840857369720.post-8760605451835242490</id><published>2010-07-26T20:33:00.006+04:00</published><updated>2010-07-29T01:52:48.305+04:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='Stanford'/><category scheme='http://www.blogger.com/atom/ns#' term='PASCAL VOC'/><category scheme='http://www.blogger.com/atom/ns#' term='ICVSS'/><category scheme='http://www.blogger.com/atom/ns#' term='computer vision'/><category scheme='http://www.blogger.com/atom/ns#' term='lecture'/><category scheme='http://www.blogger.com/atom/ns#' term='paper'/><category scheme='http://www.blogger.com/atom/ns#' term='optical flow'/><category scheme='http://www.blogger.com/atom/ns#' term='history'/><category scheme='http://www.blogger.com/atom/ns#' term='Russia'/><category scheme='http://www.blogger.com/atom/ns#' term='Cambridge'/><title type='text'>ICVSS 2010: Have you cited Lagrange?</title><content type='html'>&lt;div style="text-align: justify;"&gt;Yeah, I have. :) Before the beginning of the summer school we received &lt;a href="http://svg.dmi.unict.it/icvss2010/Abstracts/SoattoReadingGroup.pdf"&gt;a homework assignment&lt;/a&gt; from &lt;a href="http://www.cs.ucla.edu/~soatto/"&gt;Prof. Stefano Soatto&lt;/a&gt;. We had to read the three papers and discover the roots of ideas exploited by the authors. Such connections are not obligatory expressed by references. As a result, one had to get a tree (or, may be, lattice) with a root in the given paper. Since the task was not accurately formulated, most of us chose just one paper and made a tree for it. Later it turned out that all three problems (optical flow estimation, multi-view stereo and volume registration) have the same roots because they lead to the similar optimization problems and we were supposed to establish that connection.&lt;/div&gt;&lt;div style="text-align: justify;"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div style="text-align: justify;"&gt;I (like most of the students) had a limited time to prepare the work. All the modern research in optical flow is based on two seminal papers by MIT's &lt;a href="http://www.umiacs.umd.edu/~pturaga/ENEE731/papers/OpticalFlow/Optical_Flow_Horn.pdf"&gt;Horn &amp;amp; Shunk&lt;/a&gt; and CMU's &lt;a href="http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.49.2019&amp;amp;rep=rep1&amp;amp;type=pdf"&gt;Lukas &amp;amp; Kanade&lt;/a&gt;, both from 1981. During the last 30 years a lot of work has been done. It was summarized by &lt;a href="http://www.gris.tu-darmstadt.de/~sroth/pubs/cvpr10sun.pdf"&gt;Sun et al. (2010)&lt;/a&gt;, who discovered that only small fraction of the ideas give significant improvement and combined them into quite simple method that found its place on the top of &lt;a href="http://vision.middlebury.edu/flow/eval/results/results-e1.php"&gt;Middlebury table&lt;/a&gt;. Since I had no time to read a lot of optical flow papers, I discovered a lot of math stuff (like PDEs and variation calculus) used in Horn-Shunk paper. Just as a joke I added references to the works of Newton, Lagrange and d'Alambert to &lt;a href="http://shapovalov.ro/papers/report-icvss2010-Shapovalov.pdf"&gt;my report&lt;/a&gt;. Surprisingly, joke was not really understood.&lt;/div&gt;&lt;div style="text-align: justify;"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div style="text-align: justify;"&gt;There were only 21 submissions, one of them was 120 pages long (the author did not show up at the seminar =), the tree depth varied from 1 to 20. I was not the only one who cited "pre-historic" papers, someone traced ideas back to Aristotle through Occam. The questions about the horizon arose: it is really ridiculous to find the roots of computer vision at Aristotle works. Since the prizes were promised for rare references (the prize policy was left unclear too), some argument took place. Soatto did not give any additional explanations on the formulation of the task but let the audience decide which references are legitimate in controversial cases. Eventually, I was among the 5 people whose references were considered controversial, so I needed to defend them. Well, I suppose I looked pretty pathetic talking about Newton's finite differences which were used for approximation of derivatives in the Horn-Shunk's paper. Surprisingly, almost a half of the audience voted for me. =) Also, my reference to &lt;a href="http://www.earlymoderntexts.com/reip.html"&gt;the Thomas Reid's essay&lt;/a&gt; was tentatively approved. &lt;/div&gt;&lt;div style="text-align: justify;"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div style="text-align: justify;"&gt;Finally, there were no bonuses for the papers that could not be found with Google, and the whole $1000 prize went to the Cambridge group. To conclude, Soatto (among other) said that nobody had read any German or Russian papers. After the seminar I told him how I tried to dig into the library (it is described &lt;a href="http://semifinalist.livejournal.com/113425.html"&gt;here&lt;/a&gt; in Russian), he answered something like "funny, it is hard to find them even for you!"&lt;/div&gt;&lt;div style="text-align: justify;"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div style="text-align: justify;"&gt;One evening during the dinner we talked about Russia, and &lt;a href="http://www.stanford.edu/~vijayc/"&gt;Vijay&lt;/a&gt; told a lot of interesting stuff (like the guy who developed &lt;a href="http://chatroulette.com/"&gt;ChatRoulette &lt;/a&gt;is now working in Silicon Valley). He remembered &lt;a href="http://www.stanford.edu/~boyd/"&gt;Stephen Boyd&lt;/a&gt; who is known for his jokes about Russia. I said that I watched &lt;a href="http://stanford.edu/class/ee364a/videos.html"&gt;the records&lt;/a&gt; of his amazing &lt;a href="http://stanford.edu/class/ee364a/"&gt;course on Convex Optimization&lt;/a&gt;. It turned out that Vijay (who is now a PhD student in Stanford) took that course, and he promised to tell Boyd that he has a fan in Russia. (Or, maybe in Soviet Russia fans have Boyd =).&lt;/div&gt;&lt;div style="text-align: justify;"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div style="text-align: justify;"&gt;&lt;i&gt;continued &lt;a href="http://computerblindness.blogspot.com/2010/07/icvss-2010-some-trivia.html"&gt;here&lt;/a&gt;&lt;/i&gt;&lt;/div&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/2293611840857369720-8760605451835242490?l=computerblindness.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://computerblindness.blogspot.com/feeds/8760605451835242490/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://computerblindness.blogspot.com/2010/07/icvss-2010-have-you-cited-lagrange.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/2293611840857369720/posts/default/8760605451835242490'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/2293611840857369720/posts/default/8760605451835242490'/><link rel='alternate' type='text/html' href='http://computerblindness.blogspot.com/2010/07/icvss-2010-have-you-cited-lagrange.html' title='ICVSS 2010: Have you cited Lagrange?'/><author><name>Roman Shapovalov</name><uri>https://profiles.google.com/103110049533984878325</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' src='//lh6.googleusercontent.com/-8d0JtBNGM_o/AAAAAAAAAAI/AAAAAAAAAAA/Vc7QubaXpjc/s512-c/photo.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-2293611840857369720.post-7611540104303652090</id><published>2010-07-25T03:17:00.008+04:00</published><updated>2010-10-18T00:57:10.238+04:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='CBIR'/><category scheme='http://www.blogger.com/atom/ns#' term='graphical models'/><category scheme='http://www.blogger.com/atom/ns#' term='recognition'/><category scheme='http://www.blogger.com/atom/ns#' term='laserscanning'/><category scheme='http://www.blogger.com/atom/ns#' term='ICVSS'/><category scheme='http://www.blogger.com/atom/ns#' term='computer vision'/><category scheme='http://www.blogger.com/atom/ns#' term='lecture'/><category scheme='http://www.blogger.com/atom/ns#' term='Microsoft'/><category scheme='http://www.blogger.com/atom/ns#' term='Russia'/><category scheme='http://www.blogger.com/atom/ns#' term='MRF'/><category scheme='http://www.blogger.com/atom/ns#' term='Cambridge'/><title type='text'>ICVSS 2010: lectures and posters</title><content type='html'>&lt;div style="text-align: justify;"&gt;&lt;div style="text-align: justify; "&gt;I returned from my eurotrip yesterday and now I am ready to start a series of posts about &lt;a href="http://svg.dmi.unict.it/icvss2010/"&gt;International Computer Vision Summer School&lt;/a&gt; (ICVSS 2010). Generally, I enjoyed the school. That week gave me (I hope) a lot of new knowledge and new friends.&lt;/div&gt;&lt;div style="text-align: justify; "&gt;&lt;br /&gt;&lt;/div&gt;&lt;div style="text-align: justify; "&gt;The scientific programme of the school included lectures, tutorials, a student poster session and a reading group. Lectures occupied most of official programme time and were given by a great team of professors including &lt;a href="http://research.microsoft.com/en-us/um/people/szeliski/"&gt;Richard Szeliski&lt;/a&gt;, renowned &lt;a href="http://cbcl.mit.edu/people/poggio/poggio-new.htm"&gt;Tomasso Poggio&lt;/a&gt; and enchanting &lt;a href="http://userweb.cs.utexas.edu/~grauman/"&gt;Kristen Grauman&lt;/a&gt;. I am not going to describe all the talks, but feature the ones that are close to my interests. You can find the complete programme &lt;a href="http://svg.dmi.unict.it/icvss2010/speakers&amp;amp;Syllabus.htm"&gt;here&lt;/a&gt;. Unfortunately, no video was recorded, but I have an access to all the &lt;a href="http://svg.dmi.unict.it/icvss2010/programme.htm"&gt;slides &lt;/a&gt;and &lt;a href="http://svg.dmi.unict.it/icvss2010/posters.html"&gt;posters&lt;/a&gt;,  so if you are interested in anything, I can send it to you (I believe it does not violate any copyrights).&lt;/div&gt;&lt;div style="text-align: justify; "&gt;&lt;br /&gt;&lt;/div&gt;&lt;div style="text-align: justify; "&gt;Wednesday was the day of Recognition. Kristen Grauman gave a talk about visual search. She covered the topics concerning specific object search using local descriptors and bags of words, object category search with pyramid matching, and also discussed state-of-the-art in the challenging problem of web-scale image retrieval. &lt;a href="http://www.comp.leeds.ac.uk/me/"&gt;Mark Everingham&lt;/a&gt; continued with a talk about category recognition and localization using machine learning techniques. Localization is reached using bounding boxes, segments, or object parts search (like finding eyes and a mouth to find a face). Sure he could not avoid to mention importance of context. He also explained the &lt;a href="http://pascallin.ecs.soton.ac.uk/challenges/VOC/voc2010/index.html"&gt;PASCAL VOC&lt;/a&gt; evaluation protocol.&lt;/div&gt;&lt;div style="text-align: justify; "&gt;&lt;br /&gt;&lt;/div&gt;&lt;div style="text-align: justify; "&gt;The tutorials covered some applications and did not really impress me. Poster session was a great opportunity to meet people. Some posters were really decent, for example &lt;a href="http://www.ims.tuwien.ac.at/staff_detail.php?ims_id=bleyer"&gt;Michael Bleyer&lt;/a&gt;'s poster on dense stereo estimation using soft segmentation, which won a half of the school's best presentation prize. The work was done with &lt;a href="http://research.microsoft.com/en-us/labs/cambridge/"&gt;Microsoft Research Cambridge&lt;/a&gt;, they formulated a really complicated energy function based on surface plane estimations and minimized it with Lempitsky's graph-cut based &lt;a href="http://www.robots.ox.ac.uk/~vilem/fusion.pdf"&gt;fusion-move algorithm (2009)&lt;/a&gt;. More details could be found in &lt;a href="http://www.ims.tuwien.ac.at/publication_detail.php?ims_id=285"&gt;their recent CVPR paper&lt;/a&gt;.&lt;/div&gt;&lt;div style="text-align: justify; "&gt;&lt;br /&gt;&lt;/div&gt;&lt;img src="http://3.bp.blogspot.com/_IDEWI0P9RbA/TE4TNMXqiGI/AAAAAAAAACs/8CwaabS8Ueo/s320/mePoster.jpg" title="Me presenting the poster. Photo by Sandra." style="float:right; margin:0 0 10px 10px;cursor:pointer; cursor:hand;width: 202px; height: 320px;" border="0" alt="" id="BLOGGER_PHOTO_ID_5498353312445663330" /&gt;&lt;div style="text-align: justify; "&gt;The problem with the poster session was that lecturers did not attend it, although they could really give a great feedback. &lt;a href="http://shapovalov.ro/papers/poster-Shapovalov-icvss2010.pdf"&gt;My presentation&lt;/a&gt; was in the last day of the session after the not-really-popular reading group and in the room upstairs (its existence was not a well-known fact :), so many people preferred to spend that evening on the beach. The school audience was quite heterogeneous, there were a lot of people from medical imaging, video compression etc., so I had to explain some basics (like MRFs) to some students interested in my poster. There were also really smart guys (like those from Cambridge group). There was a bit of useful feedback: someone recommended me to use QPBO for inference, and I should probably consider it. &lt;/div&gt;&lt;div style="text-align: justify; "&gt;&lt;br /&gt;&lt;/div&gt;&lt;div style="text-align: justify; "&gt;Since the speakers did not attend the poster session, students had to communicate with them informally. A roommate of mine, &lt;a href="http://www.cs.ucf.edu/~ramin/"&gt;Ramin&lt;/a&gt;, who does a crowd analysis, caught Richard Szeliski and asked him about local descriptors in video that operate in 3D image-time space. Szeliski told that it is a promising field and even remembered &lt;a href="http://www.irisa.fr/vista/Equipe/People/Ivan.Laptev.html"&gt;Ivan Laptev&lt;/a&gt;'s name. During our tour to Ragusa Ibla I asked Mark Everingham (who is probably the closest of all speakers to my topic) about 3D point cloud classification. He said it was not really his field, but it should be fruitful to analyse clouds not only in local levels, and stuff like multi-layer CRFs could be useful. During the last 2 years there appeared a few papers that exploit that simple intuitive idea and incorporate shape detectors with CRFs, but it usually looks awkward. Well, may be smartly designed multi-layer CRFs might be really useful. It is funny, when I told Everingham that I'm from Moscow he replied that it was great, Moscow has a great math school and remembered &lt;a href="http://www.cs.ucl.ac.uk/staff/V.Kolmogorov/"&gt;Vladimir Kolmogorov&lt;/a&gt;. So, our education seems to be not so terrible. :D&lt;/div&gt;&lt;/div&gt;&lt;div style="text-align: justify;"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div style="text-align: justify;"&gt;&lt;i&gt;continued &lt;a href="http://computerblindness.blogspot.com/2010/07/icvss-2010-have-you-cited-lagrange.html"&gt;here&lt;/a&gt;&lt;/i&gt;&lt;/div&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/2293611840857369720-7611540104303652090?l=computerblindness.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://computerblindness.blogspot.com/feeds/7611540104303652090/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://computerblindness.blogspot.com/2010/07/icvss-2010-lectures-and-posters.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/2293611840857369720/posts/default/7611540104303652090'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/2293611840857369720/posts/default/7611540104303652090'/><link rel='alternate' type='text/html' href='http://computerblindness.blogspot.com/2010/07/icvss-2010-lectures-and-posters.html' title='ICVSS 2010: lectures and posters'/><author><name>Roman Shapovalov</name><uri>https://profiles.google.com/103110049533984878325</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' src='//lh6.googleusercontent.com/-8d0JtBNGM_o/AAAAAAAAAAI/AAAAAAAAAAA/Vc7QubaXpjc/s512-c/photo.jpg'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' url='http://3.bp.blogspot.com/_IDEWI0P9RbA/TE4TNMXqiGI/AAAAAAAAACs/8CwaabS8Ueo/s72-c/mePoster.jpg' height='72' width='72'/><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-2293611840857369720.post-6011266024014418651</id><published>2010-06-27T11:53:00.003+04:00</published><updated>2011-03-08T12:35:44.707+03:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='IEEE'/><category scheme='http://www.blogger.com/atom/ns#' term='software'/><category scheme='http://www.blogger.com/atom/ns#' term='paper'/><category scheme='http://www.blogger.com/atom/ns#' term='academic'/><category scheme='http://www.blogger.com/atom/ns#' term='Internet'/><category scheme='http://www.blogger.com/atom/ns#' term='web-site'/><title type='text'>Mendeley news</title><content type='html'>&lt;div style="text-align: justify;"&gt;Few months ago I &lt;a href="http://computerblindness.blogspot.com/2009/12/mendeley-reference-management-software.html"&gt;posted&lt;/a&gt; about &lt;a href="http://www.mendeley.com/"&gt;Mendeley&lt;/a&gt;, the reference management system. I've been using it all the time and I am really satisfied. It is much more convenient to have your papers on a server than on a flash drive. Also keeping meta-information about the papers makes dealing with them (exploring, reading, citing etc.) more pleasant.&lt;/div&gt;&lt;div style="text-align: justify;"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div style="text-align: justify;"&gt;In this month a major update was released and a monetizing scheme was introduced. There are now 3 tariff plans in addition to the free one that allows one to use up to 1Gb of server disk space. There are also restrictions on the max number of shared collections and the number of users per collection. It does not look critical after all: there are no features you cannot use in the free version, so the company seems not evil. However, on the sister project &lt;a href="http://last.fm/"&gt;last.fm&lt;/a&gt; the radio feature became paid after years of free service, which caused many users drift to &lt;a href="http://listen.grooveshark.com/"&gt;grooveshark&lt;/a&gt;.&lt;/div&gt;&lt;div style="text-align: justify;"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div style="text-align: justify;"&gt;The update has affected the design of the paper info right-hand bar, and the references bar was eliminated. They promise to rebirth the bar in the future releases. It was really unusable before. When you read a paper, it is handy to have the list of indexed references on the bar, especially if the reference-style citing is used (it is the major problem with reading papers from the screen: where the hell [42] is referencing to?) I hope the people in Mendeley realize it.&lt;/div&gt;&lt;div style="text-align: justify;"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div style="text-align: justify;"&gt;It would be great to have a possibility to find papers without leaving the application. In theory, you could drag a citation from the references bar to a collection, find its (more) precise details via Internet search, and you are likely to have the link, and if it is direct, you can add the pdf with the Add File dialog using the found URL. In practice, the found URLs usually are not direct, they refer to IEEE/ACM/Springer pages, where the paper could be downloaded (usually not freely). In the same time, the paper is likely to be available for free through the direct link on the author's homepage. Moreover, Google Scholar often finds them too, but Mendeley chooses indirect links.&lt;/div&gt;&lt;div style="text-align: justify;"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div style="text-align: justify;"&gt;Mendeley needs to be more "semantic" while working with authors and conferences. It stores the author name as it appears in the paper. When you want to filter your papers by author, you can see "Shapovalov, R", "Shapovalov, R.", "Shapovalov, Roman", "Shapovalov, Roman V." etc. in the list. There are bases like &lt;a href="http://www.informatik.uni-trier.de/~ley/db/"&gt;DBLP&lt;/a&gt;, Mendeley can build their own index, so the author should be identified. If we know the author (but not only her name appearance in the paper), we are able to filter Google Scholar results and retrieve the direct link to the paper hosted on the author/university site. Finally, it is feasible to create the system, where human don't need to find papers. If I see the reference, I can automatically retrieve the pdf in a few seconds. I want Mendeley to develop in this direction.&lt;/div&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/2293611840857369720-6011266024014418651?l=computerblindness.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://computerblindness.blogspot.com/feeds/6011266024014418651/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://computerblindness.blogspot.com/2010/06/mendeley-news.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/2293611840857369720/posts/default/6011266024014418651'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/2293611840857369720/posts/default/6011266024014418651'/><link rel='alternate' type='text/html' href='http://computerblindness.blogspot.com/2010/06/mendeley-news.html' title='Mendeley news'/><author><name>Roman Shapovalov</name><uri>https://profiles.google.com/103110049533984878325</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' src='//lh6.googleusercontent.com/-8d0JtBNGM_o/AAAAAAAAAAI/AAAAAAAAAAA/Vc7QubaXpjc/s512-c/photo.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-2293611840857369720.post-9120770347554436408</id><published>2010-06-21T02:21:00.008+04:00</published><updated>2010-07-27T02:50:05.853+04:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='Vision 101'/><category scheme='http://www.blogger.com/atom/ns#' term='recognition'/><category scheme='http://www.blogger.com/atom/ns#' term='PASCAL VOC'/><category scheme='http://www.blogger.com/atom/ns#' term='computer vision'/><category scheme='http://www.blogger.com/atom/ns#' term='labelling'/><category scheme='http://www.blogger.com/atom/ns#' term='robotics'/><category scheme='http://www.blogger.com/atom/ns#' term='MRF'/><title type='text'>Object detection vs. Semantic segmentation</title><content type='html'>&lt;div style="text-align: justify;"&gt;Recently I realized that object class detection and semantic segmentation are the two &lt;i&gt;different&lt;/i&gt; ways to solve the recognition task. Although the approaches look very similar, methods vary significantly on the higher level (and sometimes on the lower level too). Let me first state the problem formulations.&lt;/div&gt;&lt;div style="text-align: justify;"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div style="text-align: justify;"&gt;&lt;b&gt;Semantic segmentation&lt;/b&gt; (or pixel classification) associates one of the pre-defined class labels to each pixel. The input image is divided into the regions, which correspond to the objects of the scene or "stuff" (in terms of &lt;a href="http://robotics.stanford.edu/~koller/Papers/Heitz%2BKoller:ECCV08.pdf"&gt;Heitz and Koller (2008)&lt;/a&gt;). In the simplest case pixels are classified w.r.t. their local features, such as colour and/or texture features &lt;a href="http://johnwinn.org/Publications/papers/TextonBoost_ECCV2006.pdf"&gt;(Shotton et al., 2006)&lt;/a&gt;. Markov Random Fields could be used to incorporate inter-pixel relations.&lt;/div&gt;&lt;a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://3.bp.blogspot.com/_IDEWI0P9RbA/TB6VYXUN_CI/AAAAAAAAACA/DllZ1rwWktA/s1600/semsegm.png"&gt;&lt;img style="display:block; margin:0px auto 10px; text-align:center;cursor:pointer; cursor:hand;width: 400px; height: 247px;" src="http://3.bp.blogspot.com/_IDEWI0P9RbA/TB6VYXUN_CI/AAAAAAAAACA/DllZ1rwWktA/s400/semsegm.png" border="0" alt="" id="BLOGGER_PHOTO_ID_5484985641992059938" /&gt;&lt;/a&gt;&lt;b&gt;&lt;br /&gt;&lt;/b&gt;&lt;div&gt;&lt;div style="text-align: justify;"&gt;&lt;b&gt;Object detection&lt;/b&gt; addresses the problem of localization of objects of the certain classes. Minimum bounding rectangles (MBRs) of the objects are the ideal output. The simplest approach here is to use a sliding window of varying size and classify sub-images defined by the window. Usually, neighbouring windows have similar features, so each object is likely to be alarmed by several windows. Since multiple/wrong detections are not desirable, non-maximum suppression (NMS) is used. In PASCAL VOC contest an object is considered detected, if the true and found rectangles are intersected on at least half of their union area. In the &lt;a href="http://en.wikipedia.org/wiki/Marr_Prize"&gt;Marr prize&lt;/a&gt; winning paper by &lt;a href="http://www.cse.wustl.edu/~mgeorg/readPapers/byVenue/iccv2009/desai2009_iccv_discriminativeModelsMulticlassObjectLayout.pdf"&gt;Desai et al. (2009)&lt;/a&gt; more intelligent scheme for NMS and incorporation of context is proposed. In the &lt;a href="http://thomas.deselaers.de/publications/files/alexe-cvpr10.pdf"&gt;recent paper by Alexe&lt;/a&gt; the objectness measure for a sliding window is presented.&lt;/div&gt;&lt;a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://2.bp.blogspot.com/_IDEWI0P9RbA/TB6VMSU8JxI/AAAAAAAAAB4/jlGsKvWF3Ds/s1600/detection.png"&gt;&lt;img style="display:block; margin:0px auto 10px; text-align:center;cursor:pointer; cursor:hand;width: 400px; height: 247px;" src="http://2.bp.blogspot.com/_IDEWI0P9RbA/TB6VMSU8JxI/AAAAAAAAAB4/jlGsKvWF3Ds/s400/detection.png" border="0" alt="" id="BLOGGER_PHOTO_ID_5484985434494478098" /&gt;&lt;/a&gt;&lt;br /&gt;&lt;/div&gt;&lt;div style="text-align: justify;"&gt;In theory, the two problems are almost equivalent. Object detection reduces easily to semantic segmentation. If we have a segmentation output, we just need to retain object classes (or discard the "stuff" classes) and take MBRs of regions. The contrary is more difficult. Actually, all the stuff turns into the &lt;i&gt;background&lt;/i&gt; class. All the found objects within the rectangles should be segmented, but it is a solvable issue since foreground extraction techniques like &lt;a href="http://research.microsoft.com/en-us/um/people/ablake/papers/ablake/siggraph04.pdf"&gt;GrabCut&lt;/a&gt; could be applied. So, there are technical difficulties which could be overcome and the two problems could be considered equivalent, however, in practice the approaches are different.&lt;/div&gt;&lt;div&gt;&lt;div style="text-align: justify;"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div style="text-align: justify;"&gt;There arise two questions:&lt;/div&gt;&lt;div style="text-align: justify;"&gt;1. Which task has more applications? I think we do not generally need to classify background into e.g. ground and sky (unless we are programming an autonomous robot), we are interested in finding objects more. Do we often need to obtain the exact object boundary?&lt;/div&gt;&lt;div style="text-align: justify;"&gt;2. Which task is sufficient for the "retrieval" stage of the intelligent vision system in the philosophical sense? I.e. which task is more suitable for solving the global problem of exhaustive scene analysis? &lt;/div&gt;&lt;div style="text-align: justify;"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div style="text-align: justify;"&gt;Thoughts?&lt;/div&gt;&lt;/div&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/2293611840857369720-9120770347554436408?l=computerblindness.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://computerblindness.blogspot.com/feeds/9120770347554436408/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://computerblindness.blogspot.com/2010/06/object-detection-vs-semantic.html#comment-form' title='4 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/2293611840857369720/posts/default/9120770347554436408'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/2293611840857369720/posts/default/9120770347554436408'/><link rel='alternate' type='text/html' href='http://computerblindness.blogspot.com/2010/06/object-detection-vs-semantic.html' title='Object detection vs. Semantic segmentation'/><author><name>Roman Shapovalov</name><uri>https://profiles.google.com/103110049533984878325</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' src='//lh6.googleusercontent.com/-8d0JtBNGM_o/AAAAAAAAAAI/AAAAAAAAAAA/Vc7QubaXpjc/s512-c/photo.jpg'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' url='http://3.bp.blogspot.com/_IDEWI0P9RbA/TB6VYXUN_CI/AAAAAAAAACA/DllZ1rwWktA/s72-c/semsegm.png' height='72' width='72'/><thr:total>4</thr:total></entry><entry><id>tag:blogger.com,1999:blog-2293611840857369720.post-231856070345338416</id><published>2010-05-30T22:38:00.009+04:00</published><updated>2010-06-02T02:29:16.778+04:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='Vision 101'/><category scheme='http://www.blogger.com/atom/ns#' term='graphical models'/><category scheme='http://www.blogger.com/atom/ns#' term='laserscanning'/><category scheme='http://www.blogger.com/atom/ns#' term='computer vision'/><category scheme='http://www.blogger.com/atom/ns#' term='labelling'/><category scheme='http://www.blogger.com/atom/ns#' term='software'/><category scheme='http://www.blogger.com/atom/ns#' term='tutorial'/><category scheme='http://www.blogger.com/atom/ns#' term='machine learning'/><category scheme='http://www.blogger.com/atom/ns#' term='web-site'/><category scheme='http://www.blogger.com/atom/ns#' term='MRF'/><title type='text'>Max-product optimization methods and software</title><content type='html'>&lt;div style="text-align: justify;"&gt;I hoped I would never post a sorry-for-not-posting message, but I do (and now I feel like I owe you a long post =). In fact, I have been being really busy,  particularly with putting my masters thesis down to paper. However, it has a positive outcome: I systematized my knowledge about inference techniques in MRFs, and I'd like to share them here.&lt;/div&gt;&lt;div style="text-align: justify;"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div style="text-align: justify;"&gt;Please don't consider it a survey. While it claims to be quite comprehensive, it is not enough deep and verbose. You are not gonna see a lot of formulas, but the links to papers and implementations along with the recommendations on applicability of the methods and the software. It could be useful for anybody who wants to quickly try energy minimization in her task, but don't wanna spend a lot of time digging into the literature and implementing algorithms. So let's call this form of posting &lt;i&gt;unsurvey&lt;/i&gt;. You can always find more details on the topic in &lt;a href="http://www.cbsr.ia.ac.cn/users/szli/MRF_Book/book.html"&gt;Stan Li's book&lt;/a&gt; (not to be confused with &lt;a href="http://en.wikipedia.org/wiki/Stan_Lee"&gt;Stan Lee&lt;/a&gt; recently appeared in The Big Bang Theory).&lt;/div&gt;&lt;div style="text-align: justify;"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div style="text-align: justify;"&gt;&lt;i&gt;Max-product&lt;/i&gt; energy minimization problem arises in a number of fields such as statistical physics, social network analysis, and, of course, computer vision, where the problems like image denoising, dense stereo reconstruction, semantic segmentation lead to the single discrete optimization problem: maximize the product of some functions defined over nodes and edges of the graph &lt;i&gt;G=(N,E), &lt;/i&gt;typically a grid over image pixels in low-level vision&lt;i&gt;.&lt;/i&gt; Due to using logarithms, it is equivalent to maximizing the sum&lt;/div&gt;&lt;div&gt;&lt;div style="text-align: justify;"&gt;&lt;br /&gt;&lt;/div&gt;&lt;a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://3.bp.blogspot.com/_IDEWI0P9RbA/TAQqzCgcwLI/AAAAAAAAABw/gNITgNUVcEE/s1600/energy.png"&gt;&lt;img style="display:block; margin:0px auto 10px; text-align:center;cursor:pointer; cursor:hand;width: 318px; height: 47px;" src="http://3.bp.blogspot.com/_IDEWI0P9RbA/TAQqzCgcwLI/AAAAAAAAABw/gNITgNUVcEE/s400/energy.png" border="0" alt="" id="BLOGGER_PHOTO_ID_5477550103124033714" /&gt;&lt;/a&gt;&lt;br /&gt;&lt;div style="text-align: justify;"&gt;&lt;div style="text-align: justify;"&gt;where each &lt;span class="Apple-style-span"  style="font-size:medium;"&gt;&lt;i&gt;Y&lt;/i&gt;&lt;sub&gt;&lt;i&gt;i&lt;/i&gt;&lt;/sub&gt;&lt;/span&gt; corresponds to a node of the graph and takes on a value from a fixed discrete range. It could be a class label in semantic segmentation, or a disparity value in the pixel in dense stereo. So-called &lt;i&gt;potential functions&lt;/i&gt; &lt;i&gt;φ&lt;/i&gt; (&lt;i&gt;unary&lt;/i&gt; in the first term and &lt;i&gt;pairwise &lt;/i&gt;in the second one) define possibility of the assignment given class labels to the nodes and edges correspondingly. They could be conditioned by the data. For example, unary potentials could reflect colour information in the pixel, and pairwise potentials could enforce assignment of the similar labels to the ends of an edge to provide smoothness in the resulting labelling. If the graph contains cliques of the order 3 and greater, one should consider more terms (&lt;a href="http://computerblindness.blogspot.com/2010/01/on-mrf-factorization.html"&gt;as I explained earlier&lt;/a&gt;), but in practice higher order cliques are often ignored. Also, an important particular case, 4-connected grid, contains only second order cliques, that's why the research community focused on this "pairwise" problem.&lt;/div&gt;&lt;/div&gt;&lt;div style="text-align: justify;"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div style="text-align: justify;"&gt;In general, this maximization problem is NP-hard. That's why in the 1980s they solved it using genetic algorithms or simulated annealing. Fortunately, specific methods have been developed.&lt;/div&gt;&lt;div style="text-align: justify;"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div style="text-align: justify;"&gt;First of all, there are two cases when the exact global optimum could be found in polynomial time. The first is when the graph has a tree structure. Actually, I don't know application where this case is useful. Probably, it is good for analysis of parsing trees in natural language processing. In this case the maximum is found using the dynamic programming based &lt;a href="http://en.wikipedia.org/wiki/Belief_propagation"&gt;belief propagation&lt;/a&gt; algorithm. One node is selected as the root, and messages are passed from the root and then back to the root. The optimal assignment is thus found.&lt;/div&gt;&lt;div style="text-align: justify;"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div style="text-align: justify;"&gt;Another polynomial case allows an arbitrary graph topology, but restricts the number of labels and the form of pairwise potentials. Only one of the two labels could be assigned to each node. Pairwise potentials should be &lt;i&gt;submodular&lt;/i&gt;, i.e. &lt;i&gt;φ(0,0) + &lt;span class="Apple-style-span" style="font-style: normal; "&gt;&lt;i&gt;φ(1,1) &gt;= &lt;span class="Apple-style-span" style="font-style: normal; "&gt;&lt;i&gt;φ(0,1) + &lt;span class="Apple-style-span" style="font-style: normal; "&gt;&lt;i&gt;φ(1,0). &lt;/i&gt;Thus, the binary assignment implies that all the nodes should be divided into the two groups according to their labels. This division is found using the algorithm for finding minimum cut on the supplementary graph. Two fake nodes are added to the graph and declared as the source and the sink. Both fake nodes are adjacent to all the nodes of the "old" graph. Using your favourite min-cut/max-flow algorithm you can separate the nodes into two sets, which gives you the resulting labelling.&lt;/span&gt;&lt;/i&gt;&lt;/span&gt;&lt;/i&gt;&lt;/span&gt;&lt;/i&gt;&lt;/div&gt;&lt;div style="text-align: justify;"&gt;&lt;i&gt;&lt;span class="Apple-style-span" style="font-style: normal; "&gt;&lt;i&gt;&lt;span class="Apple-style-span" style="font-style: normal; "&gt;&lt;i&gt;&lt;span class="Apple-style-span" style="font-style: normal; "&gt;&lt;br /&gt;&lt;/span&gt;&lt;/i&gt;&lt;/span&gt;&lt;/i&gt;&lt;/span&gt;&lt;/i&gt;&lt;/div&gt;&lt;div style="text-align: justify;"&gt;&lt;i&gt;&lt;span class="Apple-style-span" style="font-style: normal; "&gt;&lt;i&gt;&lt;span class="Apple-style-span" style="font-style: normal; "&gt;&lt;i&gt;&lt;span class="Apple-style-span" style="font-style: normal; "&gt;Unfortunately, the binary labelling problem is not so common too. In practice people want to choose one of the multiple labels for the nodes. This problem is NP-hard, but the iterative procedures of &lt;/span&gt;&lt;span class="Apple-style-span"&gt;α-expansion&lt;/span&gt;&lt;span class="Apple-style-span" style="font-style: normal; "&gt; and &lt;/span&gt;&lt;span class="Apple-style-span"&gt;αβ-swap&lt;/span&gt;&lt;span class="Apple-style-span" style="font-style: normal; "&gt; give good approximations with certain theoretical guarantees. [&lt;a href="http://www.csd.uwo.ca/~yuri/Papers/pami01.pdf"&gt;Boykov et al., 2001&lt;/a&gt;] During each iteration the min-cut algorithm is run, so there is an important constraint. Each projection to any two variables of pairwise potentials should be submodular. If it is hold for your energy, I recommend you to use this method since it is really fast and accurate. &lt;a href="http://www.csd.uwo.ca/~olga/OldCode.html"&gt;The implementation by Olga Veksler&lt;/a&gt; is available.  &lt;/span&gt;&lt;/i&gt;&lt;/span&gt;&lt;/i&gt;&lt;/span&gt;&lt;/i&gt;&lt;/div&gt;&lt;div style="text-align: justify;"&gt;&lt;i&gt;&lt;span class="Apple-style-span" style="font-style: normal; "&gt;&lt;i&gt;&lt;span class="Apple-style-span" style="font-style: normal; "&gt;&lt;i&gt;&lt;span class="Apple-style-span" style="font-style: normal; "&gt;&lt;br /&gt;&lt;/span&gt;&lt;/i&gt;&lt;/span&gt;&lt;/i&gt;&lt;/span&gt;&lt;/i&gt;&lt;/div&gt;&lt;div style="text-align: justify;"&gt;&lt;i&gt;&lt;span class="Apple-style-span" style="font-style: normal; "&gt;&lt;i&gt;&lt;span class="Apple-style-span" style="font-style: normal; "&gt;&lt;i&gt;&lt;span class="Apple-style-span" style="font-style: normal; "&gt;What should we do if the energy is not submodular and the graph contains cycles? There are some methods too, but they are not so fast and typically not so effective. First of all, the stated integer programming problem could be relaxed to linear programming. You can use any LP solver, but it will eventually take days to find the optimum. Moreover, it won't be exact, because the relaxed problem solution could be rounded back in multiple ways. No surprise here, because the possibility of getting the global optimum would prove that P = NP.&lt;/span&gt;&lt;/i&gt;&lt;/span&gt;&lt;/i&gt;&lt;/span&gt;&lt;/i&gt;&lt;/div&gt;&lt;div style="text-align: justify;"&gt;&lt;i&gt;&lt;span class="Apple-style-span" style="font-style: normal; "&gt;&lt;i&gt;&lt;span class="Apple-style-span" style="font-style: normal; "&gt;&lt;i&gt;&lt;span class="Apple-style-span" style="font-style: normal; "&gt;&lt;br /&gt;&lt;/span&gt;&lt;/i&gt;&lt;/span&gt;&lt;/i&gt;&lt;/span&gt;&lt;/i&gt;&lt;/div&gt;&lt;div style="text-align: justify;"&gt;&lt;i&gt;&lt;span class="Apple-style-span" style="font-style: normal; "&gt;&lt;i&gt;&lt;span class="Apple-style-span" style="font-style: normal; "&gt;&lt;i&gt;&lt;span class="Apple-style-span" style="font-style: normal; "&gt;One of the methods for approaching the general problem is &lt;/span&gt;&lt;span class="Apple-style-span"&gt;Loopy Belief Propagation&lt;/span&gt;&lt;span class="Apple-style-span" style="font-style: normal; "&gt; [&lt;a href="http://www.yaroslavvb.com/papers/kschischang-factor.pdf"&gt;Kschischang, 2001&lt;/a&gt;]. No doubt, it is really loopy. Dynamic programming is used on the graph with loops, and it can eventually fall into the endless loop passing some messages along the loop. It is now considered outdated, but you can still try it, it is implemented in &lt;a href="http://people.kyb.tuebingen.mpg.de/jorism/libDAI/"&gt;libDAI&lt;/a&gt;. The input is a factor graph, which is inconvenient in most cases, but really flexible (you could create cliques of arbitrary order).&lt;/span&gt;&lt;/i&gt;&lt;/span&gt;&lt;/i&gt;&lt;/span&gt;&lt;/i&gt;&lt;/div&gt;&lt;div style="text-align: justify;"&gt;&lt;i&gt;&lt;span class="Apple-style-span" style="font-style: normal; "&gt;&lt;i&gt;&lt;span class="Apple-style-span" style="font-style: normal; "&gt;&lt;i&gt;&lt;span class="Apple-style-span" style="font-style: normal; "&gt;&lt;br /&gt;&lt;/span&gt;&lt;/i&gt;&lt;/span&gt;&lt;/i&gt;&lt;/span&gt;&lt;/i&gt;&lt;/div&gt;&lt;div style="text-align: justify;"&gt;&lt;i&gt;&lt;span class="Apple-style-span" style="font-style: normal; "&gt;&lt;i&gt;&lt;span class="Apple-style-span" style="font-style: normal; "&gt;&lt;i&gt;&lt;span class="Apple-style-span" style="font-style: normal; "&gt;State-of-the-art approach in general case, &lt;/span&gt;&lt;span class="Apple-style-span"&gt;tree-reweighted message passing (TRW)&lt;/span&gt;&lt;span class="Apple-style-span" style="font-style: normal; "&gt;, has been developed by Wainwright and Kolmogorov. [&lt;a href="http://www.cs.ucl.ac.uk/staff/V.Kolmogorov/papers/TRW-S-PAMI.pdf"&gt;Kolmogorov, 2006&lt;/a&gt;] It is based on the decomposition of the original graph to some graphs where exact message-passing inference is possible, i.e. trees. Message passing is done iteratively on the trees, and trees are enforced to assign the same values in particular nodes. The procedure is guaranteed to converge, but unfortunately not always to the global maximum. However, it is better than Loopy BP. &lt;a href="http://www.cs.ucl.ac.uk/staff/V.Kolmogorov/papers/TRW-S.html"&gt;Kolmogorov's implementation&lt;/a&gt; contains LBP and TRW-S under a common interface, so you can compare their performance. To understand the decomposition technique better, you could read &lt;a href="http://www.cs.ualberta.ca/~jag/papersVis2/07ICCV/data/papers/ICCV/053.pdf"&gt;Komodakis's paper [2007]&lt;/a&gt;. It describes a general decomposition framework.&lt;/span&gt;&lt;/i&gt;&lt;/span&gt;&lt;/i&gt;&lt;/span&gt;&lt;/i&gt;&lt;/div&gt;&lt;div style="text-align: justify;"&gt;&lt;i&gt;&lt;span class="Apple-style-span" style="font-style: normal; "&gt;&lt;i&gt;&lt;span class="Apple-style-span" style="font-style: normal; "&gt;&lt;i&gt;&lt;span class="Apple-style-span" style="font-style: normal; "&gt;&lt;br /&gt;&lt;/span&gt;&lt;/i&gt;&lt;/span&gt;&lt;/i&gt;&lt;/span&gt;&lt;/i&gt;&lt;/div&gt;&lt;div style="text-align: justify;"&gt;&lt;i&gt;&lt;span class="Apple-style-span" style="font-style: normal; "&gt;&lt;i&gt;&lt;span class="Apple-style-span" style="font-style: normal; "&gt;&lt;i&gt;&lt;span class="Apple-style-span" style="font-style: normal; "&gt;Once a company of the world-leading MRF researchers gathered and implemented LBP, graph-cuts and TRW-S under common interface, &lt;a href="http://www.springerlink.com/index/d173073726316r32.pdf"&gt;investigated performance&lt;/a&gt; and even &lt;a href="http://vision.middlebury.edu/MRF/code/"&gt;shared their code &lt;/a&gt;on a Middlebury college page. It would be great to use those implementations interchangeably,  but unfortunately the interface is tailored for the stereo problem, i.e. it is impossible to use the general form of potentials. My latest research is about usage MRFs for laser scan classification, where the topology of the graph is far from a rectangular grid, and the potentials are more complex than the ones in the Potts model. So I ended up in writing my own wrappers.&lt;/span&gt;&lt;/i&gt;&lt;/span&gt;&lt;/i&gt;&lt;/span&gt;&lt;/i&gt;&lt;/div&gt;&lt;div style="text-align: justify;"&gt;&lt;i&gt;&lt;span class="Apple-style-span" style="font-style: normal; "&gt;&lt;i&gt;&lt;span class="Apple-style-span" style="font-style: normal; "&gt;&lt;i&gt;&lt;span class="Apple-style-span" style="font-style: normal; "&gt;&lt;br /&gt;&lt;/span&gt;&lt;/i&gt;&lt;/span&gt;&lt;/i&gt;&lt;/span&gt;&lt;/i&gt;&lt;/div&gt;&lt;div style="text-align: justify;"&gt;&lt;i&gt;&lt;span class="Apple-style-span" style="font-style: normal; "&gt;&lt;i&gt;&lt;span class="Apple-style-span" style="font-style: normal; "&gt;&lt;i&gt;&lt;span class="Apple-style-span" style="font-style: normal; "&gt;Another interesting method was re-discovered by &lt;a href="http://research.microsoft.com/pubs/80483/PAMI07-QPBO.pdf"&gt;Kolmogorov and Rother [2007]&lt;/a&gt;. It is known as &lt;/span&gt;&lt;span class="Apple-style-span"&gt;quadratic pseudo-boolean optimization (QPBO)&lt;/span&gt;&lt;span class="Apple-style-span" style="font-style: normal; "&gt;. This graph-cut based method can either assign a label to a node, or reject to classify it. The consistency property is guaranteed: if the label has been assigned, this exact label is assigned in the global optimum. The equivalent solution could be obtained using a specific LP relaxation with 1/2 probabilities for class labels allowed. I have not used the technique, so I cannot recommend an implementation.&lt;/span&gt;&lt;/i&gt;&lt;/span&gt;&lt;/i&gt;&lt;/span&gt;&lt;/i&gt;&lt;/div&gt;&lt;div style="text-align: justify;"&gt;&lt;i&gt;&lt;span class="Apple-style-span" style="font-style: normal; "&gt;&lt;i&gt;&lt;span class="Apple-style-span" style="font-style: normal; "&gt;&lt;i&gt;&lt;span class="Apple-style-span" style="font-style: normal; "&gt;&lt;br /&gt;&lt;/span&gt;&lt;/i&gt;&lt;/span&gt;&lt;/i&gt;&lt;/span&gt;&lt;/i&gt;&lt;/div&gt;&lt;div style="text-align: justify;"&gt;&lt;i&gt;&lt;span class="Apple-style-span" style="font-style: normal; "&gt;&lt;i&gt;&lt;span class="Apple-style-span" style="font-style: normal; "&gt;&lt;i&gt;&lt;span class="Apple-style-span" style="font-style: normal; "&gt;To summarize, the algorithm for choosing an optimization method is the following. If your graph is actually a tree, use belief propagation. Otherwise, if your energy is submodular, use graph-cut based inference; it is both effective and efficient (even exact in the binary case). If not, use TRW-S, it often yields a good approximation.&lt;/span&gt;&lt;/i&gt;&lt;/span&gt;&lt;/i&gt;&lt;/span&gt;&lt;/i&gt;&lt;/div&gt;&lt;/div&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/2293611840857369720-231856070345338416?l=computerblindness.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://computerblindness.blogspot.com/feeds/231856070345338416/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://computerblindness.blogspot.com/2010/05/max-product-optimization-methods-and.html#comment-form' title='5 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/2293611840857369720/posts/default/231856070345338416'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/2293611840857369720/posts/default/231856070345338416'/><link rel='alternate' type='text/html' href='http://computerblindness.blogspot.com/2010/05/max-product-optimization-methods-and.html' title='Max-product optimization methods and software'/><author><name>Roman Shapovalov</name><uri>https://profiles.google.com/103110049533984878325</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' src='//lh6.googleusercontent.com/-8d0JtBNGM_o/AAAAAAAAAAI/AAAAAAAAAAA/Vc7QubaXpjc/s512-c/photo.jpg'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' url='http://3.bp.blogspot.com/_IDEWI0P9RbA/TAQqzCgcwLI/AAAAAAAAABw/gNITgNUVcEE/s72-c/energy.png' height='72' width='72'/><thr:total>5</thr:total></entry></feed>
