ICVSS 2010: some trivia

If you think that summer schools are only about lectures and posters, you are certainly wrong. Well, I also did not expect that I would get drunk every night (it was not in Russia after all!). On the first night I convinced people to go to the beach for night swimming, but the next day I realized that it was not such a good idea, because lectures started at 9. Even so, we did not go to bed early for the rest of the week.

I would like to thank the "Marsa Sicla" people (Fig. 1): you were great company!

Fig. 1. The Marsa Sicla company. Left to right: Richard, Ramin,
Nicolas, Xi, Vijay, Clément, Aish, me. Sandra is behind the camera.

Actually, there was a reason why we had such a close-knit company. In Fig. 2 you can see the map of the region where the school took place. The school was hosted by the Baia Samuele hotel village (inside the blue frame on the map). The cheapest accommodation at Baia Samuele cost €100 per night, so some pragmatic people scared up different accommodation. It was in a neighbouring hotel village, Marsa Sicla.

Fig. 2. ICVSS location map.

On the map it looked pretty close, but in practice it turned out that there were a lot of fences (blue line)! Officially, we had to walk along the red route (~2.5 km, partly along a highway) through the official entrance. Moreover, we had to leave the village for lunch (since there was a buffet), otherwise we would have been charged €30 per meal. To enforce this, they wanted us to leave our documents at the control post (bottom blue tick in Fig. 2). Thus, we were supposed to walk 2.5 km four times a day. But we found a better solution.

Although the fence was topped with barbed wire, we managed to find two shortcuts (green ticks). One of them led through an ever-closed gate (Fig. 3), which was suitable for hopping over. So we usually used the green path. It made the way half as long. Moreover, we did not have to leave Baia Samuele for lunch, since we weren't officially there. Sometimes we even had a free lunch, which finally proved that the no free lunch theorem is wrong (joke by Ramin).

Fig. 3. Hopping over the fence.

Well, the lunch was really great. But this was not really a question of food: the main reason not to leave Baia Samuele was of course socializing, which was most active during lunch (and also Internet access at the conference centre :). If we had known about this situation before booking the apartments, we would probably have followed the official accommodation recommendations. However, we had great company at Marsa Sicla too. But because of the "unofficial night programme" I had to (almost) sleep during lectures. Most of the guys were in their last year of PhD, so they treated the school as a vacation. But I really wanted to learn a lot of stuff. It was really difficult to take in any information without enough sleep. Surprisingly, I somehow managed to pass the final exam, but I feel I need to look through the lecture slides again and read up on some of the papers they refer to, otherwise the scientific part of the school will be useless for me.

With this, I finish posting about ICVSS. To conclude, I recommend that everyone attend such summer schools, because they are the best way to enter the community.

ICVSS 2010: Have you cited Lagrange?

Yeah, I have. :) Before the beginning of the summer school we received a homework assignment from Prof. Stefano Soatto. We had to read three papers and discover the roots of the ideas exploited by the authors. Such connections are not necessarily expressed by references. As a result, one had to get a tree (or, maybe, a lattice) with its root in the given paper. Since the task was not precisely formulated, most of us chose just one paper and made a tree for it. Later it turned out that all three problems (optical flow estimation, multi-view stereo and volume registration) have the same roots because they lead to similar optimization problems, and we were supposed to establish that connection.

I (like most of the students) had limited time to prepare the work. All the modern research in optical flow is based on two seminal papers, by MIT's Horn & Schunck and CMU's Lucas & Kanade, both from 1981. During the last 30 years a lot of work has been done. It was summarized by Sun et al. (2010), who discovered that only a small fraction of the ideas give significant improvement and combined them into a quite simple method that found its place at the top of the Middlebury table. Since I had no time to read a lot of optical flow papers, I dug into the math (like PDEs and the calculus of variations) used in the Horn-Schunck paper. Just as a joke I added references to the works of Newton, Lagrange and d'Alembert to my report. Surprisingly, the joke was not really understood.

There were only 21 submissions; one of them was 120 pages long (the author did not show up at the seminar =), and the tree depth varied from 1 to 20. I was not the only one who cited "pre-historic" papers: someone traced ideas back to Aristotle through Occam. The question of the horizon arose: it is really ridiculous to look for the roots of computer vision in Aristotle's works. Since prizes were promised for rare references (the prize policy was left unclear too), some argument took place. Soatto did not give any additional explanation of the task but let the audience decide which references were legitimate in controversial cases. Eventually, I was among the 5 people whose references were considered controversial, so I needed to defend them. Well, I suppose I looked pretty pathetic talking about Newton's finite differences, which were used to approximate derivatives in the Horn-Schunck paper. Surprisingly, almost half of the audience voted for me. =) Also, my reference to Thomas Reid's essay was tentatively approved.

In the end, there were no bonuses for papers that could not be found with Google, and the whole $1000 prize went to the Cambridge group. To conclude, Soatto said (among other things) that nobody had read any German or Russian papers. After the seminar I told him how I had tried to dig into the library (it is described here in Russian), and he answered something like "funny, it is hard to find them even for you!"

One evening during dinner we talked about Russia, and Vijay told us a lot of interesting stuff (like that the guy who developed ChatRoulette is now working in Silicon Valley). He mentioned Stephen Boyd, who is known for his jokes about Russia. I said that I had watched the recordings of his amazing course on Convex Optimization. It turned out that Vijay (who is now a PhD student at Stanford) had taken that course, and he promised to tell Boyd that he has a fan in Russia. (Or maybe in Soviet Russia fans have Boyd =).

ICVSS 2010: lectures and posters

I returned from my eurotrip yesterday and now I am ready to start a series of posts about the International Computer Vision Summer School (ICVSS 2010). Generally, I enjoyed the school. That week gave me (I hope) a lot of new knowledge and new friends.

The scientific programme of the school included lectures, tutorials, a student poster session and a reading group. Lectures occupied most of the official programme time and were given by a great team of professors including Richard Szeliski, the renowned Tomaso Poggio and the enchanting Kristen Grauman. I am not going to describe all the talks, but will feature the ones that are close to my interests. You can find the complete programme here. Unfortunately, no video was recorded, but I have access to all the slides and posters, so if you are interested in anything, I can send it to you (I believe it does not violate any copyright).

Wednesday was the day of Recognition. Kristen Grauman gave a talk about visual search. She covered specific object search using local descriptors and bags of words, object category search with pyramid matching, and also discussed the state of the art in the challenging problem of web-scale image retrieval. Mark Everingham continued with a talk about category recognition and localization using machine learning techniques. Localization is achieved using bounding boxes, segments, or a search for object parts (like finding eyes and a mouth to find a face). Of course, he could not avoid mentioning the importance of context. He also explained the PASCAL VOC evaluation protocol.

The tutorials covered some applications and did not really impress me. The poster session was a great opportunity to meet people. Some posters were really good, for example Michael Bleyer's poster on dense stereo estimation using soft segmentation, which won half of the school's best presentation prize. The work was done with Microsoft Research Cambridge; they formulated a really complicated energy function based on surface plane estimates and minimized it with Lempitsky's graph-cut-based fusion-move algorithm (2009). More details can be found in their recent CVPR paper.

The problem with the poster session was that the lecturers did not attend it, although they could have given really great feedback. My presentation was on the last day of the session, after the not-really-popular reading group and in the room upstairs (its existence was not a well-known fact :), so many people preferred to spend that evening on the beach. The school audience was quite heterogeneous: there were a lot of people from medical imaging, video compression etc., so I had to explain some basics (like MRFs) to some of the students interested in my poster. There were also really smart guys (like those from the Cambridge group). There was a bit of useful feedback: someone recommended that I use QPBO for inference, and I should probably consider it.

Since the speakers did not attend the poster session, students had to communicate with them informally. A roommate of mine, Ramin, who does crowd analysis, caught Richard Szeliski and asked him about local descriptors in video that operate in the 3D image-time space. Szeliski said that it is a promising field and even recalled Ivan Laptev's name. During our tour to Ragusa Ibla I asked Mark Everingham (who is probably the closest of all the speakers to my topic) about 3D point cloud classification. He said it was not really his field, but that it should be fruitful to analyse clouds not only at the local level, and that stuff like multi-layer CRFs could be useful. During the last 2 years a few papers have appeared that exploit that simple intuitive idea and combine shape detectors with CRFs, but it usually looks awkward. Well, maybe smartly designed multi-layer CRFs could be really useful. Funnily enough, when I told Everingham that I'm from Moscow he replied that it was great, that Moscow has a great math school, and mentioned Vladimir Kolmogorov. So, our education seems to be not so terrible. :D

