Can robots be trusted with children? Research starts at UT

Ella Velner demonstrates the Furhat Robot at the Robotics Symposium at the University of Twente
The research team for the project ‘Kinderen in gesprek met media‘ (‘Children in conversation with media’), a collaboration between the University of Twente (Human Media Interaction) and the Netherlands Institute for Sound and Vision, funded by the SIDN fonds and CLICKNL, is complete. In addition to Ella Velner, who had already been at work for a few months, Thomas Beelen started on 1 January. The first exploratory experiments are already being set up, with two Furhat Robots in the leading roles. This new generation of robots is capable of lifelike communication and will serve as the basis for conversations with children about media. The first experiments will mainly explore the possibilities of the robots, while also laying the groundwork for research questions around privacy and trust in information: questions whose answers should tell us how to make responsible use of artificial intelligence. On 22 January we were at the Robotics Symposium of the Digital Society Institute at the University of Twente to present the research. In the photo: Ella Velner shows how well the Furhat Robot (on the table, wearing a hat) can make eye contact.

Preparing for ICT with Industry 2020 at Beeld en Geluid

ICT with Industry brings together scientists and professionals from industry and government to work collaboratively on case studies, which are subject to an intense week of analysing, discussing, and modeling solutions. ICT with Industry 2020 will be held from January 20 to 24 at the Lorentz Center in Leiden. On Thursday 9 January, we had the opportunity to kick off and prepare the use case from the Netherlands Institute for Sound and Vision (NISV) at the premises of the institute in Hilversum: meeting the local R&D team, learning about the institute and its rich archive, and having a first discussion about the use case and research questions.

The use case NISV brought in this year relates to data research and data journalism. For a number of years now, together with partners in the CLARIAH project, NISV has been working on the development of a research data infrastructure, including online environments (such as the Media Suite) that scholars can use for research: searching and exploring multimedia collections (text, audio, video and images from Sound and Vision, Eye, the National Library and KNAW institutes), making comparisons between collections, performing data analyses, collaborating with other researchers, and creating and annotating virtual personal collections. Currently, the infrastructure supports collections that have been selected, prepared and shaped, more or less manually, to fit its technical requirements. Now that more public information sources are becoming available in various formats, facilitating multimedia data analysis and comparison for data researchers and data journalists requires that such collections can be made accessible far more dynamically in the data infrastructure, so that researchers and journalists can easily dive into the sources most relevant to their topic of interest, using the rich set of tools and facilities the infrastructure offers.

After a first brainstorm on the topic at the kick-off and some time to reflect on it the upcoming week, a team of researchers from the universities of Amsterdam, Leiden, Delft and Twente, together with specialists from NISV, will be looking into the matter from many different perspectives: information retrieval, computer vision, crowdsourcing, (hidden) data analysis, interfacing and fact-checking. We are looking forward to a very interesting week at ICT with Industry and will definitely report on its outcomes here. To be continued!

Unlocking Archives for Scholarly Research

The CLARIAH Media Suite was at the CLARIN Annual Conference in Pisa, Italy, from 8 to 10 October. The message was first of all that we have made good progress in establishing a scholarly research infrastructure for mixed-media analysis of the multimedia data that are abundantly available at national archives, libraries and knowledge institutions: collections that for a long time have been ‘locked’ due to the lack of a proper interface, both from a technical and a legal (IPR/privacy) perspective.

CLARIN Conference: proceedings with the paper “Media Suite: Unlocking Archives for Mixed Media Scholarly Research”, slides.

Our approach is based on (i) scholarly requirements with respect to access and analysis of data, (ii) requirements with respect to the sustainability of the technical infrastructure we are developing, and (iii) the principle that everything we build should be usable by scholars immediately (or at least as soon as possible), not in X years when the infrastructure is “ready”. This way we can build the infrastructure in co-development, keep track of its benefits (research output) and shortcomings, and continuously update its roadmap.

In this context, we defined some architecture principles for the infrastructure. The institutions that own or hold the data, such as archives, are responsible for the quality of the data (and metadata) and for facilitating access to it. To authorize access to the data, we use a federated authentication mechanism. Data from the various institutions, and tools for search, analysis, annotation and visualisation, are available through a “workspace” or Virtual Research Environment (VRE). Data created by scholars in the workspace can, if IPR permits, be exported in various formats for analysis in existing external tools. And finally, an application such as the Media Suite provides an interface to the underlying infrastructure, geared towards the specific requirements of a scholarly user group, in our case media scholars.

Speech Recognition

An example of the data analysis tools available in the CLARIAH workspace is automatic speech recognition (ASR). At the CLARIN conference, we presented an overview paper on ASR for scholarly research (see proceedings). It explains that ASR is helpful for (i) supporting the transcription of the spoken word (e.g., in interview collections), turning a fully manual and time-consuming process into a (semi-)automatic one, and (ii) increasing the efficiency of discovery in large audiovisual collections. The question is how to make ASR available to scholars for these purposes.


Activating ASR

First of all, we have to distinguish between the processing of individual files or small personal collections on the one hand, and large institutional collections on the other. For the first scenario, the CLARIAH infrastructure incorporates a speech recognition service that can easily be deployed by individual scholars. Within the closed environment of the workspace, scholars can upload files and select enrichment services such as speech recognition from a drop-down menu. Speech recognition jobs are scheduled in the background, and after they finish, the transcripts become available in the (personal) workspace for viewing and searching as part of a personal collection index. At the request of scholars, the processing of large collections is scheduled manually by technology specialists. For example, “news and actualities” programs are a rich source for scholarly research. To process all available content of this type of programming in an archive, all program identifiers for this type need to be collected first, using a combination of metadata fields (e.g., genre, title). With the identifiers, the digital source files can be extracted from the archive and sent to a computer cluster for recognition. This computer cluster can be a local cluster or a high-performance computing cluster, depending on the quantity.
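The batch workflow described above, collecting identifiers via metadata fields and pairing them with source files for the recognition cluster, can be sketched roughly as follows. This is a minimal illustration under assumptions, not the actual CLARIAH tooling: the record fields, toy catalogue and file-path convention are hypothetical.

```python
# Sketch of the batch ASR workflow: (1) collect program identifiers by
# combining metadata fields, (2) resolve each identifier to its digital
# source file, ready for submission to a recognition cluster.

def select_program_ids(records, genre=None, title_contains=None):
    """Filter archive metadata records on a combination of fields."""
    ids = []
    for rec in records:
        if genre and rec.get("genre") != genre:
            continue
        if title_contains and title_contains.lower() not in rec.get("title", "").lower():
            continue
        ids.append(rec["id"])
    return ids

def build_asr_jobs(program_ids, source_path_for):
    """Pair each identifier with its digital source file for the job queue."""
    return [{"id": pid, "source": source_path_for(pid)} for pid in program_ids]

# Toy metadata standing in for an archive catalogue.
records = [
    {"id": "P1", "genre": "news and actualities", "title": "Journaal"},
    {"id": "P2", "genre": "drama", "title": "Serie"},
    {"id": "P3", "genre": "news and actualities", "title": "Actualiteiten"},
]

ids = select_program_ids(records, genre="news and actualities")
jobs = build_asr_jobs(ids, source_path_for=lambda pid: f"/archive/av/{pid}.wav")
```

In practice the selection would run against the archive's catalogue database and the jobs would be handed to a scheduler on the (local or HPC) cluster rather than built as an in-memory list.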


Browsing transcripts

Challenges in Enabling Mixed Media Scholarly Research with Multi Media Data in a Sustainable Infrastructure

See below the presentation I gave on 29-06-2018 at the Digital Humanities 2018 Conference, Mexico City, on the development of the Media Suite, an online research environment that facilitates scholarly research using large multimedia collections maintained at archives, libraries and knowledge institutions. The Media Suite unlocks the data on the collection level, item level, and segment level, provides tools that are aligned with the scholarly primitives (discovery, annotation, comparison, linking), and has a ‘workspace’ for storing personal mixed media collections and annotations, and to do advanced analysis using Jupyter Notebooks and NLP tools.

See the notes for the narrative that goes with the slides. The screencasts that were originally in the slides are not included. I will post these later.

The Media Suite is developed in the Dutch CLARIAH Research Infrastructure project by an interdisciplinary, international team of developers, scholars, and information technology specialists, and is maintained at one of the CLARIAH Centers, The Netherlands Institute for Sound and Vision.

AV in the spotlight at DH2018

Next year we will have an easy ride to the Digital Humanities Conference, as it will be organised in the Netherlands. But this year we’re off to Mexico City for a “Humanidades Digitales” experience. My first DH conference was in 2011 at Stanford, where I presented a poster on “Distributed Access to Oral History Collections: Fitting Access Technology to the Needs of Collection Owners and Researchers” (pdf), based on our experiences with the Verteld Verleden project. I remember being a bit disappointed that at that time there was not much interest at the conference in Oral History (or in my poster), nor in audiovisual content as a significant source for scholarly research. Thanks also to the workshops organised at DH in recent years by the AV in Digital Humanities Special Interest Group, the topic ‘audiovisual’ in Digital Humanities is emerging. I am very pleased to be able to present our work on the CLARIAH Media Suite at this year’s conference (on Thursday) and show the huge progress the CLARIAH project has made in unlocking multimedia content –radio, television, film, oral history, newspapers, contracts, posters and photos– from Dutch archival institutions for scholarly research.



Using open content for a music video

To create some nice music videos for the songs I created for the new Grafton Music album (preview), I fiddled around with the Open Images repository to create a remix that could work as a music video. My first try was for a song called ‘Lighthouse’. And yes, after querying for ‘lighthouse’ I indeed stumbled upon some lighthouse-related material. However, creating something exciting out of the beautiful but not exactly overwhelming amount of lighthouse footage was quite a challenge. So let’s say it’s about the music and the video is ‘just for entertainment’. On the other hand, the video somehow has a bit of the ‘round-and-round-thingie’ you expect with a lighthouse, especially in the last part of the song.

My second try was for ‘Belle Rebelle’, a song by the famous French composer Gounod, with sixteenth-century lyrics by Jean-Antoine de Baïf, put in a modern arrangement. As I wanted to follow the storyline of the song a bit, I tried several keywords such as ‘love’ and ‘beautiful’, but that didn’t quite do it. The keyword ‘fashion’ was the lucky shot that brought me some really cool retro footage, which I thankfully ‘remixed’ for the music video. See below some outstanding example stills, or watch the video (and share if you like, thank you). Or try out openbeelden.nl yourself and dive into some great fashion content.

Special session on video hyperlinking: what to link and how to do that?

Video Hyperlinking

Video hyperlinking is attracting growing interest in the multimedia retrieval community. The goal of video hyperlinking is to apply the concept of linking that we are used to in the text domain to videos: enabling the user to browse from one video to another. The assumption is that video hyperlinking can help users explore large video repositories more effectively. Links are created based on an automatically derived, topical relationship between video segments. The question, however, is how we identify which video segments in these repositories are good candidates for linking. And if we have such candidates, how do we make sure that the links to video targets are really interesting for a user? Five research groups presented their views on this today, at a special session at the International Conference on Multimedia Retrieval (ICMR2017) in Bucharest.
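As a toy illustration of what an automatically derived, topical relationship between segments could look like (this is a simplistic sketch, not the method of any of the presenting groups), the snippet below links an anchor segment to candidate target segments by cosine similarity over term-frequency vectors, which might for instance be derived from subtitles or ASR transcripts. All segment names and vectors are made up.

```python
import math

def cosine(a, b):
    """Cosine similarity between two sparse term-frequency vectors (dicts)."""
    dot = sum(a[t] * b.get(t, 0) for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def rank_link_targets(anchor, candidates, threshold=0.2):
    """Rank candidate segments against an anchor by topical similarity,
    keeping only those above a minimum-similarity threshold."""
    scored = [(seg_id, cosine(anchor, vec)) for seg_id, vec in candidates.items()]
    return sorted((s for s in scored if s[1] >= threshold), key=lambda s: -s[1])

# Toy term-frequency vectors for an anchor segment and three candidates.
anchor = {"lighthouse": 3, "sea": 2, "storm": 1}
candidates = {
    "segA": {"lighthouse": 2, "coast": 1},
    "segB": {"fashion": 4, "paris": 2},
    "segC": {"sea": 3, "storm": 2},
}
links = rank_link_targets(anchor, candidates)  # segB falls below the threshold
```

Real systems combine multiple modalities (visual concepts, audio, text) and learned representations instead of raw term counts, but the core idea of ranking and thresholding candidate targets is the same.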

Hubs and false links

Chong-Wah Ngo from City University of Hong Kong…


CLARIN/CLARIAH Collaboration on Automatic Transcription Chain for Digital Humanities

In the CLARIAH project, we are developing the Media Suite, an application that supports scholarly research using audiovisual media collections. In 2017 we will also be integrating tools that support Oral History research in the Media Suite. From 10 to 12 May 2017, scholars and technology experts discussed the development of an automatic transcription chain for spoken word collections in the context of CLARIN, the European counterpart of CLARIAH, at a CLARIN-PLUS workshop in Arezzo. We observed that CLARIAH and CLARIN take different but complementary approaches to the development of such a transcription chain, which encourages further collaboration.


Second CfP “Identifying and Linking Interesting Content in Large Audiovisual Repositories”


Identifying and Linking Interesting Content in Large Audiovisual Repositories

As technologies for component feature identification and standard ad hoc search mature, an emerging key challenge for multimedia information retrieval is to develop mechanisms for richer content analysis and representation, and novel modes of exploration; for example, enabling users to create their own personal narratives by seamlessly exploring (multiple) large audiovisual repositories at the segment level, either by following established trails or by creating new ones on the fly. A key research question in developing these new technologies and systems is how we can automatically identify video content that viewers perceive to be interesting, taking multiple modalities (visual, audio, text) into account.

The ICMR2017 Special Session “Identifying and Linking Interesting Content in Large Audiovisual Repositories” is calling for papers (6 pages) presenting significant and innovative research on mechanisms that help identify significant elements within AV (or, more generally, multimedia) repositories and create links between interesting video segments and other video segments (or multimedia content).

Papers should extend the state of the art by addressing new problems or proposing insightful solutions. We encourage submissions covering relevant perspectives in this area including:

  • Multi/mixed-media hyperlinking (audio-to-image, text-to-video)
  • Linking across audiovisual repositories (e.g., from professional to public)
  • Alignment of social media posts to video (segments)
  • Video-to-video search
  • Retrieval models that incorporate multimodal, segment-based retrieval and linking
  • Segment-level recommendation in videos
  • Video segmentation and summarization
  • Multimodal search (explicit combination of multimodal features)
  • Query generation from video
  • Video-to-text description
  • Content-driven, social-driven interestingness prediction
  • Object interestingness modeling and prediction
  • (User) evaluation of interestingness, hyperlinking or archive exploration systems
  • Use cases related to video hyperlinking or interestingness prediction in video
  • Interfaces for linked-video based storytelling.

For submission details see: http://icmr2017.ro/call-for-special-sessions-s2.php