On developing benchmark evaluations

The Multimedia COMMONS 2016 workshop (16 October 2016), held as part of the ACM Multimedia conference in Amsterdam, will provide a forum for the community of current and potential users of the Multimedia Commons, a multi-institution collaborative initiative launched last year to compute features, generate annotations, and develop analysis tools, focusing principally on the Yahoo Flickr Creative Commons 100 Million dataset (YFCC100M), which contains around 99.2 million images and nearly 800,000 videos from Flickr. The workshop aims to share novel research using the YFCC100M dataset, emphasizing approaches that were not possible with smaller or more restricted multimedia collections; ask new questions about the scalability, generalizability, and reproducibility of algorithms and methods; re-examine how we use data challenges and benchmarking tasks to catalyze research advances; and discuss priorities, methods, and plans for continuously expanding annotation efforts.

At the MMCommons workshop I will discuss the development of benchmark evaluations in the context of a series of tasks on audiovisual search that emphasize its 'multimodal' aspects, starting in 2006 with the workshop on 'Searching Spontaneous Conversational Speech', which led to tasks in CLEF and MediaEval ('Search and Hyperlinking'), and recently also in TRECVid ('Video Hyperlinking'). The value and importance of benchmark evaluations are widely acknowledged, and benchmarks play a key role in many research projects. Establishing a sound evaluation framework takes time, a well-balanced team of domain specialists (preferably with links to the user community and industry), and strong involvement of the research community itself. Such a framework includes (annotated) data sets, well-defined tasks that reflect 'real-world' needs, a proper evaluation methodology with ground truth, a strategy for repeated assessments, and, last but not least, funding. Although the benefits of an evaluation framework are typically reviewed from the perspective of 'research output' (e.g., a scientific publication demonstrating an advance in a certain methodology), it is important to be aware of the value of the process of creating a benchmark itself: it significantly increases our understanding of the problem we want to address and, as a consequence, also the impact of the evaluation outcomes.
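As a minimal illustration of the evaluation side (this is not the actual MediaEval or TRECVid protocol, and all segment ids and judgments are hypothetical), a benchmark's ground truth lets us score a system's ranked list of proposed links with a measure such as precision at k:

```python
# Toy sketch: precision at k for a ranked list of proposed links,
# judged against a ground-truth set of relevant targets.
# Segment ids and relevance judgments below are hypothetical.

def precision_at_k(ranked, relevant, k):
    """Fraction of the top-k proposed links that appear in the ground truth."""
    top = ranked[:k]
    return sum(1 for t in top if t in relevant) / k

ranked = ["seg-3", "seg-1", "seg-7", "seg-2"]   # system output, best first
relevant = {"seg-1", "seg-3", "seg-9"}          # assessor judgments
print(precision_at_k(ranked, relevant, 2))  # 1.0
print(precision_at_k(ranked, relevant, 4))  # 0.5
```

In a real campaign such judgments come from pooled assessments over all participants' submissions, which is one reason a repeatable assessment strategy matters.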

My talk will focus on the process rather than on the results of these evaluations themselves, and will address cross-benchmark connections and new benchmarking paradigms, specifically the integration of benchmarking in industrial 'living labs' and the Evaluation-as-a-Service (EaaS) initiatives that are becoming popular in some domains.

Audiovisual Linking: Scenarios, Approaches and Evaluation

The concept of (hyper)linking, well known in the text domain, has inspired researchers and practitioners in the audiovisual domain for many years. On 30 August 2016, I will talk about audiovisual linking at the 1st International Conference on Multimodal Media Data Analytics (MMDA2016) in The Hague, The Netherlands.

Various application scenarios can benefit from audiovisual linking. In recent years, we have been looking at recommendation and storytelling scenarios for video-to-video linking in the context of the Video Hyperlinking task in the MediaEval and TRECVid benchmark evaluation series, at text-to-video linking to support fast access to broadcast archives in a news production context, and at audio-to-image linking in the context of visual radio. The latter relates to a long history of research projects on 'linking the spoken word' to related information sources, in scenarios that aim, for example, to assist participants in meetings or to suggest slides during presentations.

I will present some of the approaches we have experimented with over the past years and zoom in on the process of setting up the video hyperlinking benchmark evaluations we have been running in MediaEval and TRECVid.

Call for Task Proposals MediaEval 2016

MediaEval is a benchmarking initiative dedicated to developing and evaluating new algorithms and technologies for multimedia retrieval, access and exploration. It offers tasks to the research community that are related to human and social aspects of multimedia. MediaEval emphasizes the ‘multi’ in multimedia and seeks tasks involving multiple modalities, e.g., audio, visual, textual, and/or contextual.

MediaEval is now calling for proposals for tasks to run in the 2016 benchmarking season. A proposal consists of a description of the motivation for the task and the challenges that task participants must address. It provides information on the data and evaluation methodology to be used. The proposal must also include a statement of how the task is related to MediaEval (i.e., its human or social component), and how it extends the state of the art in an area related to multimedia indexing, search or other technologies that support users in accessing information in multimedia collections.

For more detailed information about the content of the task proposal, see the MediaEval website (http://multimediaeval.org).

Tasks are selected for inclusion in MediaEval on the basis of their feasibility, their match with the topical focus of MediaEval, and the outcome of a survey circulated to the wider multimedia research community.

Proposal Deadline: 8 January 2016

The MediaEval 2016 Workshop is scheduled for 20-21 October 2016 in the Netherlands (just after ACM Multimedia 2016).

For more information about MediaEval see http://multimediaeval.org 

Topic models and diversity in video hyperlinking


The use of hierarchical topic models to find anchor-target pairs could potentially improve diversity in video hyperlinking, and the evaluation of video hyperlinking should focus more on assessing serendipity in the links. These are two important findings of the work of Anca-Roxanna Simon, who successfully defended her PhD thesis, "Semantic Structuring of Video Collections from Speech: Segmentation and Hyperlinking", on Wednesday 2 December at the University of Rennes, France.
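To give a flavour of the diversity idea (a toy sketch, not the method from the thesis), suppose each candidate target segment already has a relevance score and a topic distribution inferred upstream, e.g. by an LDA model; one can then greedily keep the best-scoring targets while allowing at most one link per dominant topic. All segment ids and numbers below are hypothetical.

```python
# Toy sketch: diversify hyperlink targets by dominant topic.
# Assumes each candidate comes with a relevance score and a topic
# distribution (e.g., inferred by a topic model upstream).

def diversify(candidates, max_links=3):
    """Greedily pick top-scoring targets, at most one per dominant topic."""
    seen_topics = set()
    picked = []
    for name, score, topics in sorted(candidates, key=lambda c: -c[1]):
        dominant = max(range(len(topics)), key=topics.__getitem__)
        if dominant in seen_topics:
            continue  # a higher-scoring target already covers this topic
        seen_topics.add(dominant)
        picked.append(name)
        if len(picked) == max_links:
            break
    return picked

# Hypothetical candidates: (segment id, relevance score, topic distribution)
candidates = [
    ("seg-a", 0.9, [0.7, 0.2, 0.1]),   # dominant topic 0
    ("seg-b", 0.8, [0.6, 0.3, 0.1]),   # topic 0 again -> filtered out
    ("seg-c", 0.7, [0.1, 0.8, 0.1]),   # dominant topic 1
    ("seg-d", 0.6, [0.1, 0.1, 0.8]),   # dominant topic 2
]
print(diversify(candidates))  # ['seg-a', 'seg-c', 'seg-d']
```

The point of such a filter is that a plain similarity ranking tends to return near-duplicates of the anchor, whereas topic-aware selection leaves room for the more serendipitous links the evaluation should reward.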


Video Hyperlinking @ TRECVid-2015


After running a video hyperlinking benchmark evaluation for a number of years at MediaEval, we are excited to now have an evaluation on video hyperlinking running at TRECVid as well. On 17 November 2015, we discussed the results of the evaluation and the plans for next year at the TRECVid workshop in Gaithersburg, US.

Benchmarking the concept of video hyperlinking started as early as 2009 with the Linking Task in VideoCLEF, which involved linking video to Wikipedia material on the same subject in a different language. In 2012, we started a 'brave new task' in MediaEval, where we explored approaches to benchmark the concept of linking videos to other videos using internet video from blip.tv. In 2013-2014, 'Search and Hyperlinking' ran as a regular MediaEval task, this time with a collection of about 2,500 hours of BBC broadcast video instead of internet video.

Thanks to MediaEval we could improve our understanding of the concept of…


Video Hyperlinking @ ACM Multimedia 2015


The SLAM Workshop on Speech, Language and Audio in Multimedia, connected to the ACM Multimedia conference and held in Brisbane, Australia this year, had a special session on Video Hyperlinking on Friday 30 October 2015. It was a good opportunity to discuss in more detail the results of the MediaEval benchmark evaluation on 'Searching and Anchoring in Video Archives (SAVA)' and to look forward to the upcoming 2015 TRECVid benchmark evaluation workshop. Video Hyperlinking became one of the TRECVid tasks this year. More on Video Hyperlinking and TRECVid after the workshop on 16-18 November.

At SLAM, there was a session with four presentations on Video Hyperlinking. Benoît Huet (Eurecom) introduced the session with an overview of the topic. The rationale behind video hyperlinking is that it can help improve access to archived video, a topic that was central to the recently finished EU projects AXES and LinkedTV. Benoît provided…


Pop-up archive video

Making large video archives more efficiently available and searchable has been the subject of intensive research for years in various national and international projects. The AXES project, which concluded at the end of March 2015, made an important contribution to this research by advancing the technology for analyzing and searching video a substantial step forward, while at the same time looking closely at the needs of the various user groups interested in video archives. Think of producers and journalists who want to reuse video material for new productions, researchers who want to gain knowledge by analyzing (large amounts of) video, or 'home users' who are after 'infotainment' and want to pick up information in an accessible and playful way, or simply watch an entertaining clip. Especially for this last group, who are no longer always behind a PC but sit on the couch, in the garden, or on the train with a tablet on their lap, AXES has developed new concepts for surfacing interesting videos from the archive and bringing them to the user. Below is a short report on these new concepts.


Usability of speech recognition in the humanities (2)

The occasion for the second part of this series on the usability of speech recognition in the humanities is the kick-off of a number of CLARIAH demonstration projects on Monday 11 March at the Meertens Institute. One of those projects is Oral History Today. In this project, an application is being built specifically for researchers who want to work with oral history. The emphasis here is strongly on the usage side, and because the spoken word is naturally central to oral history and automatic speech recognition can play a useful role in making interviews accessible, this kick-off is a good occasion to discuss usability from a different perspective.
