Mendeley news

A few months ago I posted about Mendeley, the reference management system. I have been using it all the time since then and I am really satisfied. It is much more convenient to keep your papers on a server than on a flash drive. Keeping meta-information about the papers also makes dealing with them (exploring, reading, citing, etc.) more pleasant.

This month a major update was released and a monetization scheme was introduced. There are now 3 paid plans in addition to the free one, which allows one to use up to 1 GB of server disk space. There are also restrictions on the maximum number of shared collections and the number of users per collection. It does not look critical after all: there are no features you cannot use in the free version, so the company does not seem evil. However, on the sister project Last.fm the radio feature became paid after years of free service, which caused many users to drift to Grooveshark.

The update has changed the design of the paper info bar on the right-hand side, and the references bar was removed. They promise to bring the bar back in future releases. It was really unusable before. When you read a paper, it is handy to have the list of indexed references on the bar, especially if numbered reference-style citations are used (this is the major problem with reading papers on screen: what the hell does [42] refer to?). I hope the people at Mendeley realize it.

It would be great to be able to find papers without leaving the application. In theory, you could drag a citation from the references bar to a collection, look up its more precise details via Internet search, and probably get a link; if the link is direct, you can add the PDF through the Add File dialog using the found URL. In practice, the found URLs are usually not direct: they point to IEEE/ACM/Springer pages where the paper can be downloaded (usually not for free). At the same time, the paper is likely to be available for free via a direct link on the author's homepage. Moreover, Google Scholar often finds such links too, but Mendeley chooses the indirect ones.

Mendeley needs to be more "semantic" when working with authors and conferences. It stores the author name as it appears in the paper. When you want to filter your papers by author, you see "Shapovalov, R", "Shapovalov, R.", "Shapovalov, Roman", "Shapovalov, Roman V." etc. in the list. There are databases like DBLP, and Mendeley could build its own index, so the author can be identified. If we know the author (and not only how her name appears in the paper), we can filter Google Scholar results and retrieve the direct link to the paper hosted on the author's or university's site. Ultimately, it is feasible to create a system where humans don't need to search for papers: if I see a reference, the PDF is retrieved automatically in a few seconds. I want Mendeley to develop in this direction.
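
To illustrate the author problem, here is a minimal sketch of grouping name variants under one key. The key (surname plus first initial) and the function names are my own assumptions for illustration; this is not how Mendeley or DBLP identify authors.

```python
from collections import defaultdict


def author_key(name):
    """Reduce 'Surname, Given M.' to a coarse key: (surname, first initial).

    Hypothetical heuristic for illustration only.
    """
    surname, _, given = name.partition(",")
    given = given.strip()
    initial = given[0].lower() if given else ""
    return (surname.strip().lower(), initial)


def group_authors(names):
    """Group name variants that share the same coarse key."""
    groups = defaultdict(list)
    for name in names:
        groups[author_key(name)].append(name)
    return dict(groups)


variants = [
    "Shapovalov, R",
    "Shapovalov, R.",
    "Shapovalov, Roman",
    "Shapovalov, Roman V.",
]
groups = group_authors(variants)
print(groups)
```

All four variants end up in one group. The obvious weakness is that two different people with the same surname and initial are merged too, which is exactly why a real author index is needed rather than string matching.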


Object detection vs. Semantic segmentation

Recently I realized that object class detection and semantic segmentation are two different ways to solve the recognition task. Although the tasks look very similar, the methods differ significantly at the higher level (and sometimes at the lower level too). Let me first state the problem formulations.

Semantic segmentation (or pixel classification) assigns one of the pre-defined class labels to each pixel. The input image is divided into regions, which correspond to the objects of the scene or to "stuff" (in terms of Heitz and Koller (2008)). In the simplest case, pixels are classified w.r.t. their local features, such as colour and/or texture features (Shotton et al., 2006). Markov Random Fields can be used to incorporate inter-pixel relations.
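
As a toy illustration of classifying pixels by a local feature, the sketch below labels each pixel with the class whose reference colour is nearest in RGB. The class names, the reference colours and the nearest-colour rule are assumptions made up for this example; it is not the method of Shotton et al., and it ignores inter-pixel relations entirely.

```python
def classify_pixels(image, class_colours):
    """Label every pixel with the class whose reference colour is nearest.

    image: 2D list of (r, g, b) tuples.
    class_colours: dict mapping class name -> reference (r, g, b).
    """
    def dist2(a, b):
        # Squared Euclidean distance in RGB space.
        return sum((x - y) ** 2 for x, y in zip(a, b))

    return [
        [min(class_colours, key=lambda c: dist2(pixel, class_colours[c]))
         for pixel in row]
        for row in image
    ]


# Hypothetical reference colours for two "stuff" classes.
CLASS_COLOURS = {"sky": (120, 170, 255), "grass": (60, 160, 60)}

image = [
    [(118, 165, 250), (125, 172, 255)],
    [(70, 150, 65), (55, 158, 50)],
]
labels = classify_pixels(image, CLASS_COLOURS)
print(labels)
```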

Object detection addresses the problem of localizing objects of certain classes. Minimum bounding rectangles (MBRs) of the objects are the ideal output. The simplest approach here is to use a sliding window of varying size and classify the sub-images defined by the window. Neighbouring windows usually have similar features, so each object is likely to trigger several windows. Since multiple or wrong detections are not desirable, non-maximum suppression (NMS) is used. In the PASCAL VOC contest an object is considered detected if the intersection of the true and found rectangles covers at least half of their union area. In the Marr Prize winning paper by Desai et al. (2009) a more intelligent scheme for NMS and incorporation of context is proposed. In a recent paper by Alexe et al. an objectness measure for a sliding window is presented.
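
The overlap criterion and the suppression step can be sketched in a few lines. This is plain greedy NMS, a common baseline and not the scheme of Desai et al.; the box format (x1, y1, x2, y2), the function names and the 0.5 suppression threshold are assumptions for illustration.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def is_detected(truth, found):
    """Overlap criterion as stated above: intersection is at least half of the union."""
    return iou(truth, found) >= 0.5


def nms(detections, threshold=0.5):
    """Greedy NMS over (box, score) pairs.

    Visit boxes from highest to lowest score and keep a box only if it
    does not overlap an already kept box by `threshold` or more.
    """
    kept = []
    for box, score in sorted(detections, key=lambda d: d[1], reverse=True):
        if all(iou(box, kept_box) < threshold for kept_box, _ in kept):
            kept.append((box, score))
    return kept


detections = [
    ((10, 10, 50, 50), 0.9),      # strongest window on the first object
    ((12, 12, 52, 52), 0.8),      # neighbouring window on the same object
    ((100, 100, 140, 140), 0.7),  # a different object
]
kept = nms(detections)
print(kept)
```

Here the second window overlaps the first one heavily and is suppressed, while the third survives because it does not overlap anything that was kept.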

In theory, the two problems are almost equivalent. Object detection reduces easily to semantic segmentation: given a segmentation output, we just need to keep the object classes (i.e. discard the "stuff" classes) and take the MBRs of the regions. The converse is more difficult. All the stuff turns into a single background class, and the objects found within the rectangles still have to be segmented, but this is solvable since foreground extraction techniques like GrabCut can be applied. So the technical difficulties can be overcome and the two problems can be considered equivalent; in practice, however, the approaches are different.
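
The easy direction (segmentation to detection) can be sketched as follows: drop the "stuff" labels and take the bounding rectangle of every connected region that remains. The label map, the class names and the choice of 4-connectivity are assumptions for illustration.

```python
from collections import deque


def object_boxes(labels, stuff=("sky", "grass")):
    """Return (class, (x1, y1, x2, y2)) for each 4-connected object region.

    labels: 2D list of class names, one per pixel.
    stuff: classes to discard (an assumed list for this example).
    Coordinates are inclusive pixel indices.
    """
    h, w = len(labels), len(labels[0])
    seen = [[False] * w for _ in range(h)]
    boxes = []
    for y in range(h):
        for x in range(w):
            cls = labels[y][x]
            if seen[y][x] or cls in stuff:
                continue
            # Flood-fill the region while tracking its extent.
            x1 = x2 = x
            y1 = y2 = y
            queue = deque([(x, y)])
            seen[y][x] = True
            while queue:
                cx, cy = queue.popleft()
                x1, x2 = min(x1, cx), max(x2, cx)
                y1, y2 = min(y1, cy), max(y2, cy)
                for nx, ny in ((cx + 1, cy), (cx - 1, cy), (cx, cy + 1), (cx, cy - 1)):
                    if (0 <= nx < w and 0 <= ny < h
                            and not seen[ny][nx] and labels[ny][nx] == cls):
                        seen[ny][nx] = True
                        queue.append((nx, ny))
            boxes.append((cls, (x1, y1, x2, y2)))
    return boxes


label_map = [
    ["sky",   "sky", "sky",   "sky"],
    ["sky",   "cow", "cow",   "sky"],
    ["grass", "cow", "grass", "car"],
]
boxes = object_boxes(label_map)
print(boxes)
```

One caveat is that two touching objects of the same class come out as a single region and therefore a single rectangle, so even this direction is not entirely free of trouble.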

Two questions arise:
1. Which task has more applications? I think we do not generally need to classify the background into e.g. ground and sky (unless we are programming an autonomous robot); we are more interested in finding objects. But do we often need to obtain the exact object boundary?
2. Which task is sufficient for the "retrieval" stage of an intelligent vision system in the philosophical sense? I.e., which task is more suitable for solving the global problem of exhaustive scene analysis?

Thoughts?
