The research team for the project ‘Kinderen in gesprek met media’ (Children in conversation with media), a collaboration between Universiteit Twente (Human Media Interaction) and the Nederlands Instituut voor Beeld en Geluid (Netherlands Institute for Sound and Vision), funded by SIDN fonds and CLICKNL, is complete. Besides Ella Velner, who had already been at work for a few months, Thomas Beelen also started on 1 January. The first exploratory experiments are already being set up, with two Furhat Robots in the leading roles. This new generation of robots is capable of lifelike communication and will serve as the basis for talking with children about media. The first experiments will mainly explore what the robots can do, while also laying the groundwork for research questions on privacy and trust in information. The answers to these questions should show how we can make responsible use of artificial intelligence. On 22 January we were already at the Robotics Symposium of the Digital Society Institute at the University of Twente to talk about the research. In the photo: Ella Velner shows how well the Furhat Robot (on the table, wearing a beanie) can make eye contact.
ICT with Industry brings together scientists and professionals from industry and governments to work collaboratively on case studies, which are subject to an intense week of analysing, discussing, and modeling solutions. ICT with Industry 2020 will be held from January 20 to 24 in the Lorentz Center in Leiden. On Thursday 9 January, we had the opportunity to kick off and prepare the use case from the Netherlands Institute for Sound and Vision (NISV) at the premises of the institute in Hilversum. The aim was to meet the local R&D team, learn about the institute and its rich archive, and have a first discussion about the use case and research questions.
The use case NISV brought in this year is related to data research and data journalism. For a number of years now, together with partners in the CLARIAH project, NISV has been working on the development of a research data infrastructure including online environments (such as the Media Suite) that scholars can use for research: search and explore multimedia collections (text, audio, video, images from Sound and Vision, Eye, National Library, KNAW institutes), make comparisons between collections, perform data analyses, collaborate with other researchers, and create virtual personal collections and annotate them. Currently, the infrastructure allows working with collections that have been selected, prepared and shaped, more or less manually, to fit the technical requirements of the data infrastructure. Now that more public information sources are being made available in various formats, facilitating multimedia data analysis and comparison for data researchers and data journalists requires that we can make such collections accessible far more dynamically in the data infrastructure. Data researchers and data journalists can then easily dive into the sources that are most relevant to their topic of interest, using the rich set of tools and facilities available in the infrastructure.
After a first brainstorm on the topic at the kick-off and some time to reflect on it the upcoming week, a team of researchers from the universities of Amsterdam, Leiden, Delft and Twente, together with specialists from NISV, will be looking into the matter from many different perspectives: information retrieval, computer vision, crowdsourcing, (hidden) data analysis, interfacing and fact-checking. We are looking forward to a very interesting week at ICT with Industry and will definitely report on its outcomes here. To be continued!
The CLARIAH Media Suite was at the CLARIN Annual Conference in Pisa, Italy, from 8-10 October. The message was first of all that we have made good progress in establishing a scholarly research infrastructure for doing mixed-media analysis with multimedia data that are abundantly available at national archives, libraries and knowledge institutions. These multimedia collections have long been ‘locked’ due to the lack of a proper interface to the data, from both a technical and a legal (IPR/privacy) perspective.
Our approach is based on (i) scholarly requirements with respect to access and analysis of data, (ii) requirements with respect to the sustainability of the technical infrastructure we are developing, and (iii) the principle that everything that we build for (i) should be usable for scholars in (ii): not in X years when the infrastructure is “ready”, but immediately (or at least as soon as possible). That way, we can build the infrastructure in co-development, keep track of its benefits (research output) and shortcomings, and continuously update its roadmap.
In this context, we defined some architecture principles for the infrastructure. The institutions that own or hold the data such as archives are responsible for the quality of the data (metadata) and for facilitating access to the data. To authorize access to the data, we use a federated authentication mechanism. Data from the various institutions, and tools for searching, analysis, annotation and visualisation, are available through a “workspace” or Virtual Research Environment (VRE). Data created by scholars in the workspace can –if IPR permits– be exported in various formats for analysis in external tools that are already available. And finally, an application such as the Media Suite provides an interface to the underlying infrastructure, geared towards the specific requirements of a scholarly user group, in our case media scholars.
An example of the data analysis tools available in the CLARIAH workspace is automatic speech recognition (ASR). At the CLARIN conference, we presented an overview paper (see proceedings) on ASR for scholarly research (see below). It explains that ASR is helpful for: (i) supporting the transcription of the spoken word (e.g., in interview collections), turning it from a fully manual and time-consuming process into a (semi) automatic one, and (ii) increasing the efficiency of discovery in large audiovisual collections. The question is how we make ASR available for scholars given these intentions.
First of all, we have to make a distinction between the processing of individual files or small personal collections on the one hand, and large institutional collections on the other. For the first scenario, the CLARIAH infrastructure incorporates a speech recognition service that can be easily deployed by individual scholars. Within the closed environment of the workspace, scholars can upload files and select enrichment services such as speech recognition from a drop-down menu. Speech recognition jobs are scheduled in the background and, after finishing, the transcripts become available in the (personal) workspace for viewing and searching as part of a personal collection index. For the second scenario, the processing of large collections is scheduled manually by technology specialists at the request of scholars. For example, “news and actualities” programs are a rich source for scholarly research. To process all available content of this type of programming in an archive, all program identifiers for this type need to be collected first, using a combination of metadata fields (e.g., genre, title). With the identifiers, the digital source files can be extracted from the archive and sent to a computer cluster for recognition. This can be a local cluster or a high-performance computing cluster, depending on the quantity.
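To make the identifier-collection step a bit more concrete, here is a minimal Python sketch. It is not the actual CLARIAH code: the record layout, the field names (`id`, `genre`, `title`), the helper names and the batch size are all assumptions made for illustration. It only shows the idea of combining metadata fields to find the programs of one type and grouping their identifiers into jobs for a recognition cluster.

```python
# Hypothetical catalogue records; the field names are illustrative only
# and do not reflect the metadata model of any real archive.
RECORDS = [
    {"id": "prog-001", "genre": "news", "title": "Evening News"},
    {"id": "prog-002", "genre": "drama", "title": "Harbour Street"},
    {"id": "prog-003", "genre": "actualities", "title": "Weekly Review"},
    {"id": "prog-004", "genre": "", "title": "Morning News Bulletin"},
]


def select_program_ids(records, genres, title_keywords):
    """Collect ids of records whose genre matches or whose title contains a keyword."""
    wanted_genres = {g.lower() for g in genres}
    keywords = [k.lower() for k in title_keywords]
    selected = []
    for rec in records:
        genre = (rec.get("genre") or "").lower()
        title = (rec.get("title") or "").lower()
        # Genre fields are not always filled in, so the title is a fallback.
        if genre in wanted_genres or any(k in title for k in keywords):
            selected.append(rec["id"])
    return selected


def batches(ids, size):
    """Split the identifier list into fixed-size batches, one per cluster job."""
    return [ids[i:i + size] for i in range(0, len(ids), size)]


ids = select_program_ids(RECORDS, genres=["news", "actualities"], title_keywords=["news"])
jobs = batches(ids, size=2)
print(ids)
print(jobs)
```

Falling back on the title when the genre field is empty is just one possible choice; in practice the right combination of fields depends on how a specific collection was catalogued.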
See below the presentation I gave on 29-06-2018 at the Digital Humanities 2018 Conference, Mexico City, on the development of the Media Suite, an online research environment that facilitates scholarly research using large multimedia collections maintained at archives, libraries and knowledge institutions. The Media Suite unlocks the data on the collection level, item level, and segment level, provides tools that are aligned with the scholarly primitives (discovery, annotation, comparison, linking), and has a ‘workspace’ for storing personal mixed-media collections and annotations, and for doing advanced analysis using Jupyter Notebooks and NLP tools.
See the notes for the narrative that goes with the slides. The screencasts that were originally in the slides are not included. I will post these later.
The Media Suite is developed in the Dutch CLARIAH Research Infrastructure project by an interdisciplinary, international team of developers, scholars, and information technology specialists, and is maintained at one of the CLARIAH Centers, The Netherlands Institute for Sound and Vision.
Next year we will have an easy ride to the Digital Humanities Conference as it will be organised in The Netherlands. But this year we’re off to Mexico City to have a “Humanidades Digitales” experience. My first DH conference was in 2011 at Stanford, where I presented a poster on “Distributed Access to Oral History collections: Fitting Access Technology to the Needs of Collection Owners and Researchers” (pdf) based on our experiences with the Verteld Verleden project. I remember being a bit disappointed that at that time there was not much interest at the conference in Oral History (or in my poster), nor in audiovisual content as a significant source for scholarly research. Thanks in part to the workshops organized at DH in the past years by the AV in Digital Humanities Special Interest Group, the topic ‘audiovisual’ in Digital Humanities is emerging. I am very pleased to be able to present our work on the CLARIAH Media Suite at this year’s conference (on Thursday) and show the huge progress the CLARIAH project has made in unlocking multimedia content (radio, television, film, oral history, newspapers, contracts, posters and photos) from Dutch archival institutions for scholarly research.
To create some nice music videos for the songs I wrote for the new Grafton Music album (preview), I fiddled around with the Open Images repository to create a remix that could work as a music video. My first try was for a song called ‘Lighthouse’. Well yes, after querying for ‘lighthouse’ I indeed stumbled upon some lighthouse-related stuff. However, creating something exciting out of the beautiful but not really overwhelming amount of lighthouse footage was quite a challenge. So I could say that it’s about the music and the video is ‘just for entertainment’. On the other hand, the video somehow has a bit of the ‘round-and-round thingie’ you expect with a lighthouse, especially in the last part of the song.
My second try was for Belle Rebelle, a song by the famous French composer Gounod with lyrics (Jean-Antoine de Baïf) from the late Middle Ages, put in a modern arrangement. As I wanted to follow the storyline of the song a bit, I tried several keywords such as ‘love’ and ‘beautiful’ but that didn’t quite do it. The keyword ‘fashion’ was the lucky shot that brought me some really cool retro footage that I gratefully ‘remixed’ for the music video. See below some outstanding example stills, or watch the video (and share-if-you-like, thank you). Or try out openbeelden.nl yourself and dive into some great fashion content.
To foster scholarly research using large data collections in the arts and humanities, the CLARIAH project is developing a research infrastructure that aims to streamline access to large audiovisual collections and related context collections, available at different locations in The Netherlands. It also provides scholars with robust and sustainable tools to work with these collections. The gateway to the data and tools in the infrastructure is the Media Suite, a portal that helps scholars to explore, select, analyze and annotate data collections. Many practical issues arise in the process of making data collections from various institutions available within the infrastructure in a way that effectively supports scholarly use. Identifying such issues and developing strategies to address them is pivotal to the success of a research infrastructure.
To test the emerging infrastructure, ‘Research Pilots’ were awarded by CLARIAH, six of them focussing on the audiovisual domain. Scholars defined a research question and suggested the data collections and tools that they need to address the research question in the Media Suite. Recently, we organized a workshop with scholars, content owners, and CLARIAH developers to discuss the details of the data requirements of scholars and to investigate how well these align with the status of the CLARIAH infrastructure. The workshop improved our mutual understanding of large, institutional data collections in a research infrastructure but also made clear that there are barriers to overcome to serve the needs of scholars with respect to collection access. We identified three caveats with respect to effectively using these collections in practice.
The first one is that scholars make assumptions about the data collections that may not always be valid. As explained by NISV’s expert in media history Bas Agterberg, the process of audiovisual archiving through the years has been influenced by many practical issues, ranging from the take-up of collections assembled for other purposes than archiving, mergers with other institutes, to institutional data selection policies that changed over time for various reasons. So, when a scholar is interested in a specific type of programming in a specific time period, it is important to understand that there may be gaps in the archive that could, for instance, influence the representativeness of the data for research. From a research infrastructure perspective, the lesson learned is that we should put effort into documenting data collections, for example by providing pointers to the existing documentation available with collection owners.
The second issue with collections is that it is often far from obvious how to trace specific programs or genres in the metadata. For scholars, a question like “give me all autobiographical documentaries between 1965 and 1975” makes perfect sense. However, it may require some ‘metadata archaeology’ to discover which metadata fields to query and how to query them, to be able to select the desired items from a collection. As is the case with the collections themselves, the metadata also have a history with respect to their origin, metadata models and protocols for filling the fields. The Media Suite provides a “Collection Inspector” that can help by providing statistics on the completion of individual metadata fields in a collection and their distribution over the years. However, the ‘raw’ field names may not always make sense to scholars without background knowledge of the metadata model of a specific collection. To improve its usefulness for scholars, the metadata fields in the Collection Inspector may need to be mapped to a comprehensible format. A minimum requirement is that for each of the collections in the infrastructure we can provide documentation on its metadata model so that the rationale behind the naming of fields can be tracked down.
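The kind of statistics meant here can be illustrated with a small Python sketch. It is not the Collection Inspector’s implementation: the records, the field names (`year`, `genre`, `summary`) and the two helper functions are hypothetical, and a field is simply assumed to count as ‘completed’ when it is present and neither `None` nor an empty string.

```python
from collections import defaultdict

# Hypothetical metadata records; the field names are illustrative only.
RECORDS = [
    {"year": 1965, "genre": "documentary", "summary": ""},
    {"year": 1965, "genre": None, "summary": "Portrait of a painter"},
    {"year": 1970, "genre": "documentary", "summary": "Family history"},
    {"year": 1970, "genre": "news"},
]


def is_filled(record, field):
    """A field counts as completed when it is present and not None or empty."""
    return record.get(field) not in (None, "")


def field_completeness(records, fields):
    """Fraction of records in which each field is completed."""
    total = len(records)
    return {
        field: (sum(is_filled(r, field) for r in records) / total if total else 0.0)
        for field in fields
    }


def filled_per_year(records, field):
    """Per year: (records with the field completed, all records in that year)."""
    counts = defaultdict(lambda: [0, 0])
    for r in records:
        counts[r["year"]][1] += 1
        if is_filled(r, field):
            counts[r["year"]][0] += 1
    return {year: tuple(c) for year, c in sorted(counts.items())}


completeness = field_completeness(RECORDS, ["genre", "summary"])
per_year = filled_per_year(RECORDS, "summary")
print(completeness)
print(per_year)
```

Numbers like these say nothing about what a field means, which is why documentation of the metadata model remains necessary alongside them.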
The third issue with respect to the usability of data collections in the infrastructure is the availability of transcripts such as subtitles or manually or automatically generated speech transcripts, that can be used for searching relevant clips in large amounts of data. However, such transcripts are typically sparse. For instance, for the broadcast data in the NISV collections, synchronized subtitles are only available from 2006 onwards. To improve search granularity for collections without subtitles, CLARIAH is setting up an automatic speech recognition service that is embedded in the infrastructure, capable of processing very large data collections. One of the models for use is that when scholars require speech transcripts for specific collections or date ranges, this service can be called upon on request.
The Media Suite development team is working on (strategies for) the integration of multimedia data collections from DANS (oral history), EYE (film), KB (newspapers for comparative search) and Beeld en Geluid (program guides), in close collaboration with the content owners. The goal is to enable scholars to analyze these data collections in the Media Suite, access the source data (e.g., view content) via available platforms from content owners (e.g., Delpher), and when necessary, address issues on data archaeology and granularity as discussed above.
Video hyperlinking is attracting growing interest in the multimedia retrieval community. In video hyperlinking the goal is to apply the concept of linking that we are used to in the text domain to videos: enable the user to browse from one video to another. The assumption is that video hyperlinking can help to explore large video repositories more adequately. Links are created based on an automatically derived, topical relationship between video segments. The question, however, is how we identify which video segments in these repositories are good candidates for linking. And, if we have such candidates, how do we make sure that the links to video targets are really interesting for a user? Five research groups presented their view on this today, at a special session at the International Conference on Multimedia Retrieval (ICMR2017) in Bucharest.
Hubs and false links
Chong-Wah Ngo from City University of Hong Kong…
In the CLARIAH project, we are developing the Media Suite, an application that supports scholarly research using audiovisual media collections. In 2017 we will also be integrating tools that support Oral History research in the Media Suite. From 10 to 12 May 2017, scholars and technology experts discussed the development of an automatic transcription chain for spoken word collections in the context of CLARIN, the European counterpart of CLARIAH, at a CLARIN-PLUS workshop in Arezzo. We observed that CLARIAH and CLARIN take different but interestingly complementary approaches to the development of such a transcription chain, which encourages further collaboration.
2nd CALL FOR ICMR2017 SPECIAL SESSION PAPERS
Identifying and Linking Interesting Content in Large Audiovisual Repositories
As technologies for component feature identification and standard ad hoc search mature, an emerging key challenge for multimedia information retrieval is to develop mechanisms for richer content analysis and representations, and novel modes of exploration. One example is enabling users to create their own personal narratives by seamlessly exploring (multiple) large audiovisual repositories at the segment level, either by following established trails or by creating new ones on the fly. A key research question in developing these new technologies and systems is how we can automatically identify video content that viewers perceive to be interesting, taking multiple modalities into account (visual, audio, text).
The ICMR2017 Special Session “Identifying and Linking Interesting Content in Large Audiovisual Repositories” is calling for papers (6 pages) presenting significant and innovative research focusing on mechanisms that help identify significant elements within AV (or multimedia, in general) repositories and create links between interesting video segments and other video segments (or multimedia content).
Papers should extend the state of the art by addressing new problems or proposing insightful solutions. We encourage submissions covering relevant perspectives in this area including:
- Multi/mixed-media hyperlinking (audio-to-image, text-to-video)
- Linking across audiovisual repositories (e.g., from professional to public)
- Alignment of social media posts to video (segments)
- Video-to-video search
- Retrieval models that incorporate multimodal, segment-based retrieval and linking
- Segment-level recommendation in videos
- Video segmentation and summarization
- Multimodal search (explicit combination of multimodal features)
- Query generation from video
- Video-to-text description
- Content-driven, social-driven interestingness prediction
- Object interestingness modeling and prediction
- (User) evaluation of interestingness, hyperlinking or archive exploration systems
- Use cases related to video hyperlinking or interestingness prediction in video
- Interfaces for linked-video based storytelling.
For submission details see: http://icmr2017.ro/call-for-special-sessions-s2.php