Several people have suggested organizing a joint meeting of the projects that have recently started or will start in the near future and that do "something with semantics and cultural heritage". For this meeting we have coined the term "Amsterdam Cultural Heritage Exchange", where "Amsterdam" should be interpreted loosely (if at all). The objective of the day is to exchange information, in particular on the software that the different projects have developed, are developing, or are using as a baseline.

Guus Schreiber

Place and Time

Thursday 30 March
CWI, Room Z.009
Directions to CWI
Enter through the CWI main entrance. Once inside, you can ask the receptionist for directions to the room. Signs with directions to the room will also be posted on the walls.

Project Participation

The coordinators of the following projects have expressed their interest in participating:


9.30 - 9.40
Brief round of introductions

9.40 - 12.30
Project presentations
- STITCH (Isaac, 20 min)
- CHOICE (Veenstra, 20 min)
- RHCe (Houben, 20 min)
- MuSeUM (Kamps, 20 min)

Coffee break 11.00 - 11.30

- CHIP (Aroyo, 20 min)
- MultimediaN E-Culture (Schreiber, 20 min)
- MunCH (Smulders, 10 min)
- RNA (Nederbracht, 10 min)

Lunch 12.30 - 13.30

13.30 - 15.00
Demo session (parallel). Note: this is an informal meeting, so feel free to bring demos. Internet access is available.
- CHIP (Rutledge, Stash)
- CHOICE (Brugman)
- MultimediaN E-Culture (van Ossenbruggen)
- RHCe (van Strien)
- MuSeUM (2 demos)

Coffee break 15.00 - 15.15

15.15 - 16.30
Plenary discussion on links, overlap, cooperation, alignment, complementarity, exchange, ....

Project Descriptions


A Thesaurus Browser at Sound and Vision
Over the last few years, people at Sound and Vision (Beeld en Geluid) have spent much effort designing a state-of-the-art thesaurus to support their documentation work, and populating this thesaurus on the basis of several previously existing thesauri and term lists. This thesaurus (the GTAA - Gemeenschappelijke Thesaurus Audiovisuele Archieven) has approximately 160,000 terms in six facets: Subjects, Genres, Person Names, Names, Makers, and Locations. Some facets are hierarchically structured; others are flat lists of terms. Some facets are updated regularly; others are more stable over time. A relational database is used for storage.
The GTAA plays a central role in research and software development within the CHOICE@CATCH project. As a pilot project for the CHOICE research program, a web application for browsing and searching for relevant terms in the GTAA was built. The intended users are cataloguers at Sound and Vision and at public broadcasting corporations. The browser’s user interface was designed to exploit, in a useful and user-friendly manner, all structure and information that is present in the thesaurus or that is added by the CHOICE project. Examples of such structures are broader term - narrower term hierarchies, thematic classification of terms, associative relations between terms, and cross-facet links. When searching for terms, the use of synonyms and spelling variations provides the cataloguers with additional entry points for their searches.
The browser’s primary data source is currently a static SKOS representation of the GTAA thesaurus. Since the browser is supposed to provide access to an up-to-date version of the thesaurus at all times, in the future we will generate SKOS dynamically from the Sound and Vision thesaurus database.
Besides the research on the representation of the GTAA, the CHOICE project has also focused on using GTAA terms in NLP techniques that derive indexing keywords from context documents, in order to help Sound and Vision’s cataloguers in their indexing task. We will also demonstrate some results of this domain-specific information extraction process.
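
As a rough illustration of how synonyms provide extra entry points and how broader term - narrower term links support browsing, here is a minimal sketch. It is not the CHOICE browser's code: the Concept class, the function names and the three toy concepts are all hypothetical, and the data is invented rather than taken from the GTAA.

```python
from dataclasses import dataclass, field

@dataclass
class Concept:
    pref_label: str                                   # preferred term
    alt_labels: list = field(default_factory=list)    # synonyms, spelling variants
    broader: list = field(default_factory=list)       # ids of broader concepts

# Toy data invented for illustration; not real GTAA content.
THESAURUS = {
    "c1": Concept("vehicles"),
    "c2": Concept("bicycles", alt_labels=["bikes", "cycles"], broader=["c1"]),
    "c3": Concept("racing bicycles", alt_labels=["road bikes"], broader=["c2"]),
}

def find_concepts(query):
    """Return ids of concepts whose preferred or alternative label matches."""
    q = query.strip().lower()
    return sorted(
        cid for cid, c in THESAURUS.items()
        if q == c.pref_label.lower() or q in (a.lower() for a in c.alt_labels)
    )

def narrower(cid):
    """Return ids of concepts that list cid as a broader concept."""
    return sorted(k for k, c in THESAURUS.items() if cid in c.broader)

def broader_chain(cid):
    """Follow the first broader link upwards and collect preferred labels."""
    chain = []
    current = cid
    while THESAURUS[current].broader:
        current = THESAURUS[current].broader[0]
        chain.append(THESAURUS[current].pref_label)
    return chain
```

In this sketch a search for the synonym "bikes" lands on the concept with the preferred term "bicycles", from which a cataloguer could move up or down the hierarchy.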


Multiple-collection Searching Using Metadata
The MuSeUM project addresses the prototypical problem of a cultural heritage institution with the ambition to disclose all of its content in a single, unified system. The institution has various legacy systems, each dealing with a small part of the collection, each constructed for different purposes, in different times, by different people, working in different traditions, based on different design principles, with different access methods, etcetera. In short, the cultural heritage institution is confronted with its _own_ history. MuSeUM investigates theoretically transparent ways of combining modern information retrieval methods based on statistical language modeling with varying amounts of metadata and non-content features. Our approach to metadata is, in essence, the famous dumb-down principle: although metadata is based on a specific thesaurus or ontology, we can always fall back on the description of the terms in ordinary language. In this way, we can directly employ the powerful methods of textual information retrieval.

Focused Access to Semi-Structured Documents
Standard search engines do _document_ retrieval: in response to a query a whole document or web page is returned, and it is left to the user to locate the desired information within the document. Current data on the web is semi-structured, consisting of documents marked up in XML, XHTML or other formats. Here the document's structure provides additional handles for retrieval: if we find a document with a relevant section, why not return this section directly to the user? In a long full-text article spanning many pages, all the relevant information may be contained in a single paragraph in the conclusions. The resulting focused search engines will guide a user directly to the relevant information. We will show two demos of focused search engines:
- A collection of full-text scientific articles from the IEEE Computer Society.
- A collection of wiki pages from Wikipedia.
These search engines consist of an XML element retrieval system and an interface that presents the retrieved document components within their natural context.
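
The idea of returning a document component instead of a whole document can be sketched as follows. This is only a toy illustration, not the MuSeUM system: the sample document, the function names and the length-normalised term-overlap score are assumptions, and the score is a simple stand-in for the statistical language models mentioned above.

```python
import xml.etree.ElementTree as ET

# Toy document invented for illustration.
DOC = """<article>
  <title>Archive search</title>
  <sec><title>Introduction</title><p>Museums hold many legacy systems.</p></sec>
  <sec><title>Conclusions</title><p>Element retrieval returns the relevant paragraph directly.</p></sec>
</article>"""

def element_text(elem):
    """All text inside an element, with whitespace normalised."""
    return " ".join(" ".join(elem.itertext()).split())

def score(text, query_terms):
    """Fraction of words in the text that are query terms (assumed toy score)."""
    words = text.lower().split()
    if not words:
        return 0.0
    hits = sum(1 for w in words if w.strip(".,;:") in query_terms)
    return hits / len(words)

def best_element(xml_source, query):
    """Score every element, at any depth, and return the best (tag, text)."""
    terms = set(query.lower().split())
    root = ET.fromstring(xml_source)
    best = max(root.iter(), key=lambda e: score(element_text(e), terms))
    return best.tag, element_text(best)
```

Because the score is normalised by length, a short paragraph that matches the query outranks the section and the article that contain it. A real system would have to balance such focus against context more carefully.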


The goal of the Eindhoven Regional Historic Centre (RHCe) is to improve the method for disclosure of its archive image collection. In collaboration with the Eindhoven University of Technology, the possibilities of using Semantic Web techniques have been explored, and a model to support the user in browsing and searching the RHCe collection has been defined and implemented in a CHI (Cultuur Historische Informatie) demonstrator.


STITCH is a CATCH project that aims to use Semantic Web techniques to create semantic interoperability between collections in the Cultural Heritage (CH) domain. For our pilot project we selected two CH collections and tried to integrate them into a single browsing interface. Our talk and demo will thus include details about the formalisation of the collections' vocabularies into SW formats, running mapping tools against those vocabularies, and the design of several views to visualise the results of the mappings.


The current demo for the CHIP project functions as a recommender system that helps the user find what interests him or her in the Rijksmuseum collection. It starts by presenting the user with images of artifacts to rate. As these artifact ratings accumulate in the user profile, the demo recommends topics of potential interest based on the profile. The user can in turn rate these recommended topics. These topic ratings also accumulate in the profile, resulting in recommended artifacts which the user can rate.
The user can click on any topic or artifact for a full display about it, enabling the user to make a more informed rating for it. These displays also let the user rate artifacts and concepts that are closely related to the current one. In this manner, the user progressively rates and browses not just through the information space, but the evolving space representing his or her interests and taste.
The demo provides transparency with a "Why?" link for each recommendation, which shows the user which previous ratings resulted in that recommendation. We also provide transparency by designing the display to be dynamic in a way that lets the user see immediately how each rating impacts the recommendations and, thus, the system's current understanding of his or her taste.
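
The loop of rating artifacts, receiving topic recommendations and asking "Why?" could look roughly like the sketch below. It is not the CHIP implementation: the Profile class, the additive scoring and the small artifact-to-topic table are assumptions made purely for illustration.

```python
from collections import defaultdict

# Hypothetical artifact -> topics table; not actual CHIP data.
ARTIFACT_TOPICS = {
    "night_watch": ["Rembrandt", "group portrait"],
    "jewish_bride": ["Rembrandt", "portrait"],
    "milkmaid": ["Vermeer", "genre scene"],
}

class Profile:
    def __init__(self):
        self.artifact_ratings = {}  # artifact -> rating (+1 like, -1 dislike)

    def rate_artifact(self, artifact, rating):
        self.artifact_ratings[artifact] = rating

    def topic_scores(self):
        """Sum each artifact's rating over its topics (assumed scoring rule)."""
        scores = defaultdict(int)
        for artifact, rating in self.artifact_ratings.items():
            for topic in ARTIFACT_TOPICS[artifact]:
                scores[topic] += rating
        return dict(scores)

    def recommend_topics(self, n=3):
        """Top-n topics with a positive score; ties broken alphabetically."""
        scores = self.topic_scores()
        ranked = sorted(scores, key=lambda t: (-scores[t], t))
        return [t for t in ranked if scores[t] > 0][:n]

    def why(self, topic):
        """The earlier ratings that contributed to this topic's score."""
        return sorted(
            (a, r) for a, r in self.artifact_ratings.items()
            if topic in ARTIFACT_TOPICS[a]
        )
```

Keeping the raw ratings in the profile, rather than only the aggregated topic scores, is what makes the "Why?" explanation cheap to produce in this sketch.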



