SHAMAN: D14.3 - Report on demonstration and evaluation activity in the domain of industrial design and engineering
Information Systems, Industrial Information Systems, SHAMAN EU Project, Digital Preservation, and Demonstration and Evaluation of Research Outputs
Large Scale Integrating Project
Grant Agreement no.: 216736

D14.3 - REPORT ON DEMONSTRATION AND EVALUATION ACTIVITY IN THE DOMAIN OF INDUSTRIAL DESIGN AND ENGINEERING
SHAMAN-WP14-D14.3

Project Number: FP7-ICT-216736
Due Date: 1 May 2011
Actual Date: 30 June 2011
Document Author/s: SSLIS, UStrath, Philips
Version: SHAMAN-WP14.3 Version 0.5
Dissemination level: Public
Status: Final
Contributing Sub-project and Work Package: SHAMAN WP14, ISP2
Document approved by:
Co-funded by the European Union
Document Version Control
Version | Date | Change made (and, if appropriate, reason for change) | Initials of commentator(s) or author(s)
0.1 | 2011-03-25 | First suggestion of the outline | TDW, EM
0.2 | 2011-05-24 | Annotated outline, tasks for partners | TDW, EM, EF, KM, KT, VC
0.3 | 2011-06-03 | Annotated outline | TDW, EM, KM, KT, DH, HUH
0.4 | 2011-06-20 | First version (incomplete) | EM, TDW
0.5 | 2011-07-01 | Final version | EM, TDW, KM, DH
Document Change Commentator or Author
Author initials | Name of author | Institution
EM | Elena Maceviciute | SSLIS
TDW | Tom Wilson | SSLIS
EF | Emma Forsgren | Philips
KM | Kathleen Menzies | UStrath
DH | Dominic Heutelbeck | FUH
HUH | Hans-Ulrich Heidbrink | InConTec
VC | Vittore Casarosa | HATII
Document Quality Control
Version QA | Date | Comments (and, if appropriate, reason for change) | Initials of QA person
V0.1 | 2011.06.20 | | EM, TW
V0.2 | 2011.07.01 | | EM, TW, KM, DH
SHAMAN WP14-D14.2
1 / 56
Catalogue Entry
Title: Report on demonstration and evaluation activity in the domain of industrial design and engineering
Creators: Elena Maceviciute, Tom Wilson, Emma Forsgren, Kathleen Menzies
Subject: ISP2 demonstration scenarios, demonstration and evaluation by R&D community, customers, end users
Description: Report on the activities carried out to present the ISP2 to the evaluators and the results of these activities.
Publisher: SHAMAN
Contributor:
Date: 2011
ISBN:
Type: Public project deliverable
Format: Text
Language: English
Rights:
Citation Guidelines SHAMAN Project. (2011). D14.3 - Report on demonstration and evaluation activity in the domain of industrial design and engineering. SHAMAN –WP14-D14.3.
Table of Contents
1. SUMMARY
2. INTRODUCTION
2.1. THE OBJECTIVES OF THE DEMONSTRATION AND EVALUATION IN THE DOMAIN OF INDUSTRIAL DESIGN AND ENGINEERING
2.2. THE IMPLEMENTATION OF THE DEMONSTRATION AND EVALUATION ACTIVITIES FOR INDUSTRIAL DESIGN AND ENGINEERING
2.3. THE STRUCTURE OF THE REPORT
3. PREPARATION FOR DEMONSTRATION AND EVALUATION IN INDUSTRIAL DESIGN AND ENGINEERING
3.1. PRESERVATION REQUIREMENTS FOR THE INDUSTRIAL DESIGN AND ENGINEERING DOMAIN
3.2. ADAPTING THE SHAMAN ASSESSMENT FRAMEWORK FOR ISP2 EVALUATION
3.3. DEVELOPING THE DEMONSTRATION APPROACH
3.4. SELECTING AUDIENCES
3.4.1. SELECTION OF CUSTOMER ORGANIZATION AND EVENT PARTICIPANTS
3.5. DEMONSTRATION PROCESS
3.6. PREPARING EVALUATION METHODS AND INSTRUMENTS FOR END-USER EVALUATION
3.7. METHODS OF EVALUATION WITH R&D COMMUNITY
4. EVALUATION BY THE CUSTOMER ORGANIZATION
4.1. DEMONSTRATION AND EVALUATION EVENT – IMPLEMENTATION
4.2. RESULTS OF EVALUATION BY CUSTOMERS AND END USERS
4.2.1. RESPONSE TO THE PRESENTATIONS
4.2.2. THE OUTCOMES OF THE FOCUS GROUP
5. EVALUATION BY R&D COMMUNITIES
5.1. PUBLISHING STATISTICS AND CITATIONS
5.2. WEBOMETRIC ANALYSIS
5.3. EXPERT OPINION
5.4. CONCLUSION
6. CONCLUSIONS AND RECOMMENDATIONS
7. BIBLIOGRAPHY AND REFERENCES
LIST OF ABBREVIATIONS
8. ANNEXES
8.1. ANNEX 1: ISP2 DEMONSTRATION AND EVALUATION EVENT TIMETABLE
8.2. ANNEX 2: DEMONSTRATION SCENARIO AND FOCUS GROUP GUIDE
8.3. ANNEX 3: QUESTIONNAIRES TO THE FOCUS GROUP PARTICIPANTS
8.4. ANNEX 4: TRANSCRIPTION OF QUESTIONS DURING THE DEMONSTRATION SESSION AND FOCUS GROUP INTERVIEW
8.5. ANNEX 5: CONFERENCES ATTENDED AND PAPERS SUBMITTED
1. SUMMARY

1. This deliverable reports on the demonstration and evaluation of ISP2, which was designed to demonstrate the potential of the SHAMAN framework for digital preservation in the context of industrial design and engineering companies and for the research and development community.

2. The demonstration was carried out by means of a presentation based on screen-casts, given to members of the Audio, Video, Multimedia and Accessories business area of the Philips Consumer Lifestyle division. They were typical members of a customer organization and were involved in engineering or middle management work in the company; one was an intern.

3. The evaluation is based on the report of a focus group held in Philips's Eindhoven headquarters. It is supplemented with structured data from self-completed questionnaires, administered on the same occasion. Participation in the focus group was lower than planned, as three out of nine intended participants withdrew because of more pressing duties on the morning of the event. Therefore, the data show only individual opinions. Nevertheless, the evaluation methods were successful in delivering useful information for the preparation of this deliverable.

4. Participants pointed out that, although their needs for digital preservation were limited, other business areas in the division were more likely to have greater needs, for example, the Health and Wellness area, which had legal obligations to retain data. The theoretical justification for the SHAMAN framework appeared to be of little interest to them. The primary requirement they had of software was that it should ‘save the time of the engineer’. The elements resembling the optimum system, i.e., those that accepted a document or design and then automatically extracted all the necessary metadata to enable it to be found and used in the future, were favoured most. The participants found some features of the demonstrator, such as the Open Conjurer module and the concept of ‘terminology evolution’, particularly appealing.

5. It is quite evident that the SHAMAN framework is applicable to the domain, and it is also evident that the key factors that will make digital preservation in general a requirement will depend very much on the nature of the business. The area of audio and video appliances does not need very long-term preservation because of the rather short lifetime of an individual product. However, even within the same division, a different business area, relating to health and well-being, faces legal requirements that necessitate a much longer period of preservation for documents, and its need for digital preservation will be much greater.

6. Evaluation has also been performed to determine the project's impact on the R&D community by means of submission and rejection rates of papers to journals and conferences, and bibliometric and Webometric analyses. The results demonstrate that the research outputs from the project are of interest to the R&D community and that the impact of the project as a whole compares favourably with other European projects in the digital preservation area.

7. The evaluation has revealed strengths and shortcomings in the demonstration process, which will influence the further development of demonstrators for all three SHAMAN domains of interest. The SHAMAN framework for digital preservation is seen as offering new possibilities and interesting approaches to information capturing and re-use by the practitioners in industrial design and engineering companies.
2. INTRODUCTION

The work undertaken in WP14 addresses three different communities with a stake in the project's outputs; that is, members of the three previously identified SHAMAN Domains of Focus (DoF): memory institutions (DoF1); industrial design and engineering (DoF2); and e-science (DoF3). The memory institutions were the focus of demonstration and evaluation work during the previous stage and the results were reported in D14.2. This deliverable, D14.3, reports the results of the demonstration and evaluation of the further work on the development of digital preservation applications for industrial design and engineering.

The following tasks are outlined for WP14:

Task 14.1 Demonstration to, and evaluation by, researcher and developer communities (to show the integration of digital library and persistent archive technologies in a grid environment, the Multivalent technology and the use of context representation and information extraction within advanced digital preservation applications).

Task 14.2 Demonstration to, and evaluation by, customer communities (to involve, in demonstration and evaluation activities, relevant communities that aim to set up their own digital libraries and persistent archives with the benefits of a Grid environment, Multivalent technologies, context support and information extraction functions).

Task 14.3 Demonstration to end-user communities (to enlist end-users from the R&D and customer communities who may participate in demonstrations and early evaluations of the use of the demonstrators).

Task 14.4 Application of the SHAMAN assessment framework (to summarize the results of the evaluation activities, and to evaluate progress and impact on the representative target domains of the SHAMAN outputs on the basis of the criteria specified in Task 1.4).
Task 14.5 User evaluation plan (to set up the principles behind the entire user evaluation process and link it to the SHAMAN assessment framework, define the selection of users and user groups, the evaluation objects, methods and metrics, etc.).

This document presents the results of evaluation of the SHAMAN outcomes as demonstrated to the representatives of the customer organization within the industrial design and engineering domain. In addition, it assesses the indicators derived from the analysis of research results publicly presented to the communities of researchers and developers.

Demonstration and evaluation is based on the work done previously in SHAMAN and reported in earlier deliverables. It builds on identified user requirements and their analysis as well as the SHAMAN Assessment Framework developed in WP1 (SHAMAN 2008; SHAMAN 2009b). The first round of evaluation and demonstration focused on assessment of the context capturing mechanisms and distributed ingestion capabilities of the demonstrators developed in WP11 and defined in D11.1 (SHAMAN 2009a; SHAMAN 2010). The D14.1 Demonstration and evaluation plan outlined the structure of the demonstration and evaluation activities and defined detailed relations with other work packages (SHAMAN 2009); each of them has contributed to the evaluation process to some extent.

2.1 The objectives of the demonstration and evaluation in the domain of industrial design and engineering

1) WP14 has a shared responsibility with WP15, WP16 and WP17 to disseminate the results of SHAMAN development among the researchers and developers of digital preservation and to the potential customers and users. At this stage, WP14 achieved this objective by organizing a demonstration event in the partner organization division dealing with industrial design and engineering. This complemented the activities of WP17 and helped to prepare the activities of WP15.

2) WP14 has a shared responsibility with WP12 and other research WPs as well as WP18 to evaluate the outcomes of the developed SHAMAN demonstrator for industrial design and engineering domain customer organizations as well as the implementation of the project. This objective was achieved through a number of activities planned in WP14.1 and reported in D18.2. Therefore, this deliverable concentrates on the assessment of the integrated subproject ISP2, which combines the elements developed in other research WPs into one demonstrator.

3) At this stage, evaluation is formative in nature; it is intended to aid the design and implementation of the demonstrators as their development advances within ISP1, ISP2 and ISP3. In general terms, WP14 identifies how SHAMAN's demonstrators can be better aligned with the current and future expectations of customer organizations and whether the ideas underlying SHAMAN are approved by the R&D community as a valid research direction. Taken as a whole, the data gathered will also supplement the internal assessment activities taking place within Research & Technical Development (RTD).

4) In addition to informing RTD efforts, evaluation of the demonstrators and presentations (as opposed to the prototype) offers a unique insight into how the SHAMAN project is perceived by, and how it can be explained to, those within its three targeted Domains of Focus. WP15 and WP16 (Training; Scientific- and academic-oriented dissemination) can also use the findings and results of WP14 as they devise strategies for raising interest in SHAMAN.
5) In the case of evaluation by the R&D community, the gathered data are used to assess the technical merits and successes of project outputs, measuring them against the relevant specifications, Key Performance Indicators (KPIs) and system criteria devised within and across PCAs and WPs. This activity was performed by the R&D WPs before and during the process of integrating technology elements into the ISP2 demonstrator. The outcomes of this process are implicit in D12.3.

2.2. The implementation of the demonstration and evaluation activities for industrial design and engineering

The nature of the ISP2 demonstrator has dictated the demonstration mode and the evaluation methods that could be applied in industrial design and engineering institutions. The method of focus group discussions related to the presentation of the demonstrator was tested in preparing D14.2. It was dictated by the fact that SHAMAN is not a fully fledged digital preservation system and no test-bed could be presented to the end-users for hands-on testing. The technological ideas presented with the help of a demonstrator could be best assessed by a discussion among participants of the presentation considering the relevance of the demonstrated functions and features to their domain. Consequently, the same discussion group method was used for the ISP2 evaluation. In addition, some data on the general assessment of information systems in industrial design and engineering companies were collected through interviewing nine users of the Connect Us tool.

The demonstration and evaluation event was organized in Eindhoven for the engineers working in the Philips Consumer Lifestyle division. The evaluation event consisted of the presentation of the ISP2 demonstrator and a subsequent focus group discussion with the audience. The event was prepared and conducted together with the R&D team that developed ISP2 and the partner that supplied the test collections (Philips). The WP14 team helped to design the presentations, selected the participants, developed the evaluation instruments, analysed the evaluation data and produced recommendations for further work on the demonstrators.

In addition to the demonstration and evaluation events for end-users from the customer institutions, the activities carried out within WP16 and WP17 were assessed using bibliometric, Webometric and content analysis methodologies. By these means, data were obtained on the evaluation of the research outcomes by the R&D community. The visibility of the SHAMAN project among research and customer institutions could be assessed partially through these methods.

2.3 The structure of the report

The report presents the preparation of the demonstration and evaluation activities (Chapter 3) with an emphasis on the demonstration approaches that influence the audience's perception of the SHAMAN development outcomes. The methods used for eliciting feedback from the end-user community representatives are presented. A special approach for measuring the impact on the R&D community and its feedback was developed and a set of methods is also presented in Chapter 3. Chapter 4 includes the results of the customer organization evaluation from the demonstration and evaluation event in Philips's Consumer Lifestyle division. Chapter 5 presents the data and analysis of the impact made on the R&D community. The final chapter concentrates on the conclusions and recommendations to the developers, especially with regard to the improvement of the ISP3 to meet the expectations of the end-user and customer organizations' communities.
3. PREPARATION FOR DEMONSTRATION AND EVALUATION IN INDUSTRIAL DESIGN AND ENGINEERING

Demonstration and evaluation of the SHAMAN ideas implemented in the ISP2 for industrial design and engineering were carried out within the constraints set by the SHAMAN Assessment Framework and in relation to the digital preservation requirements collected during the previous stages of the project.

3.1 Preservation requirements for the industrial design and engineering domain

Preservation requirements for the industrial design and engineering domain (DoF2) were derived from investigations undertaken in WP1. The demonstration and evaluation activities for DoF2 are based on those findings and aim to further our understanding of the current state of digital preservation within the field, further refining use cases, additional requirements and scenarios specific to ISP2. The goal is to get real-life feedback from the engineers about our understanding of the digital objects they produce, their representation information and the preservation policies for managing their use in organizations focused on electromechanical products.

When discussing preservation requirements in this particular domain, it is important to highlight its complexity and diversity. In the engineering domain, products progress through a sequence of stages in what is termed the “product lifecycle”, starting as a concept, being marketed as a finished product and ending with the product's disposal. The aim of an engineering company is to enter the market with innovative products based on high-speed development at minimal cost. To reduce the time to market of their products, organizations employ complex processes and tools. Electronic products are complex in composition: they contain components from the mechanical, electrical, electronic, software and electro-mechanical fields and are developed using a wide range of tools.
To handle the enormous amount of data created throughout the product lifecycle, there is a huge diversity of software, for example: Enterprise Resource Planning systems, Product Lifecycle Management (PLM) systems and Document Management Systems. To help in their daily work, engineers also use Electronic Design Automation (EDA) applications, which support design processes and flows, ranging from an initial concept to a logical representation and finally a physical design (the data needed to produce, e.g., a chip, a circuit board or an entire system). The SHAMAN framework needs to take into consideration the fact that organizations have many different systems and tools from different vendors for handling product data. Also, the variety of digital objects is huge and the EDA applications are usually based on proprietary file formats.

Based on interview material, we find that requirements for digital preservation within industrial design and engineering can be split into two groups: legal requirements and re-use of product data. In the first group, sectors such as aerospace, medicine or defence are obliged to archive data and information according to strict legislation, which means the data have to remain accessible in the future; for example, when specific proofs are required. In the second group, we find that a product needs to be managed even once it is on the market. Product certification processes are one reason for preserving product data. Another is that the re-use of product data is relevant for products with long lifetimes which need modification because of improvements or new functions. Such changes make it necessary to go back to the original data for the consideration of parts or whole sets. Further, when products have passed quality controls, elements of them may be re-used for other products.
In the engineering industry, digital preservation and its advantages are not yet fully understood. Archiving is something of a sub-activity; an indirect result of using different tools and systems within the company. Experience with digital preservation is not extensive; nevertheless there is a growing need for the archiving and preservation of design data within certain sectors of DoF2.

3.2 Adapting the SHAMAN assessment framework for ISP2 evaluation

The SHAMAN assessment framework (fully presented in SHAMAN 2009b) was developed to serve three fundamental purposes: evaluate and validate that the project outputs conform with and fully cover the identified user requirements; support the implementation of the SHAMAN prototypes and demonstrators; and contribute towards measuring the overall success of the project. To fulfil these purposes, the Framework is built upon the goals and objectives of the SHAMAN project, which effectively represent the aspects to be evaluated. These can be summarized in three areas: (1) digital preservation theory for the development, adoption and maintenance of DP systems and their respective functions; (2) utilization of grid-based technologies to support shared collections that are distributed across multiple institutions and locations; and (3) creation of a dissemination network to promote best practice, sharing of expertise and support for preserving and re-using digital objects.

Therefore, the SHAMAN Assessment Framework incorporated evaluation criteria from a number of sources, such as the criteria for information systems success (DeLone & McLean 1992, 2003), software requirements specifications (IEEE 1998), criteria and mechanisms for benchmarking and risk mitigation as expressed in TRAC (CRL & OCLC 2007) and DRAMBORA (DCC & DPE 2008), and benchmarks for evaluation of software artefacts and conceptual schemes (iRODS 2008).
Three sub-groups were defined within each DoF, as stipulated in the SHAMAN Description of Work (DoW) (SHAMAN, 2008). These were: potential customer organizations; potential end users of SHAMAN technologies (including users of preserved materials and objects); and members of each domain's R&D community.

As the ISP2 emerged in the form of a demonstrator that implemented certain features of the SHAMAN principles, the assessment framework was adapted for that particular instance of the general theoretical framework. It was also necessary to take into account the nature of the audience. One set of evaluation criteria was directed to the R&D community: the acceptance of the theoretical principles, approval of the innovation level, benchmarking against other DP projects, the nature of the response (constructive criticism vs. negative denial), etc. The criteria to be tested by the customer organization were much more difficult to define. The transformation of the criteria into answerable questions is evident from the initial definition of the criteria in Annex 2 and the final question formulation in the end-user questionnaire (Annex 3). The process incorporated the translation of the assessment framework criteria to the requirements for the demonstrator (D12.3) and construction of meaningful questions for the focus group discussion and the questionnaire.

In the case of evaluation by the customer organization representatives, the primary evaluation aims were to:
1. find out to what extent the demonstrators are understood by the members of the customer organization (in the industrial design and engineering company);
2. determine what improvements would make the demonstrators easier to understand;
3. assess if the demonstrated digital preservation principles meet the expectations of the end users; and
4. determine what gaps exist between these expectations and the SHAMAN framework.

3.3 Developing the demonstration approach

The demonstration process at this stage is closely associated with the evaluation, as the end users can assess only what has been demonstrated. Therefore, the evaluation possibilities are constrained by the demonstration process. The demonstration process can also be perceived as a separate activity that can be carried out without any subsequent evaluation. This is envisaged in other WPs working on marketing and outreach. This section is devoted to the ways that the demonstration events were designed and implemented for the industrial design and engineering company.

The basic analysis of the engineering domain and the development of the scenarios for presentation to the customer organization are presented in D12.3 (SHAMAN 2010b). They are all immersed in the life-cycle of digital preservation, as was the case for ISP1. Here we briefly repeat the most important elements of the demonstration scenarios.

“Analysis of the engineering domain led to two key insights that guide the integration of long-term preservation into the Engineering domain (Fig. 1). As it is the central data management system in design and engineering, the PLM system is the one and only software system the archive interfaces with. For integration with design and engineering processes, Release for Production (RFP) is the one and only event right for triggering archival of product data, otherwise inconsistent data states are likely.” (SHAMAN 2011: 16).
Figure 1: Phases in the engineering process and involvement of relevant data management systems along the phases (ISP2 Presentation)
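The two integration rules quoted above (the archive interfaces only with the PLM system, and only Release for Production triggers archival) can be illustrated with a minimal Python sketch. It is not taken from the ISP2 demonstrator: the event names other than Release for Production, the `ArchiveGateway` class and the product identifier are all hypothetical and serve only to show the gating logic.

```python
from dataclasses import dataclass, field
from enum import Enum, auto


class PlmEvent(Enum):
    # Hypothetical PLM lifecycle events; only Release for Production
    # (RFP) is named in the report.
    DESIGN_SAVED = auto()
    REVIEW_COMPLETED = auto()
    RELEASE_FOR_PRODUCTION = auto()


@dataclass
class ArchiveGateway:
    """Hypothetical single interface between the PLM system and the archive."""
    archived: list = field(default_factory=list)

    def on_plm_event(self, event: PlmEvent, product_id: str) -> bool:
        # Archive only at RFP; earlier events may describe product data
        # that is still in an inconsistent state.
        if event is not PlmEvent.RELEASE_FOR_PRODUCTION:
            return False
        self.archived.append(product_id)
        return True


gateway = ArchiveGateway()
gateway.on_plm_event(PlmEvent.DESIGN_SAVED, "tv-set-001")            # ignored
gateway.on_plm_event(PlmEvent.RELEASE_FOR_PRODUCTION, "tv-set-001")  # archived
```

Keeping the check in one gateway object mirrors the idea that the archive has a single point of contact with the engineering systems.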
Accordingly, the presentation started with the archival system interfacing with the PLM and capturing the design data, closely followed by the normalization of the captured data and access to it through the iRODS metadata repository. The central position in the demonstration event was allocated to capturing the collaborative context within the design processes and storing the data in the digital preservation system together with other product design data. Open Conjurer, which fuses the ideas of social semantic networks and collaborative knowledge creation, exploited a scenario of collaborative decision making and capturing the data on these decisions. It was demonstrated through the concrete example of working on a TV set:
Figure 2: Collaboration example (ISP2 Presentation)
The possibilities of evolving ontologies for knowledge discovery in digital preservation systems were demonstrated in the final part of the presentation, which was based on the Philips ideation process and included comparison and merging of the actual ontologies used in Philips.
Figure 3: The Philips ideation process (ISP2 Presentation)
The demonstration and evaluation in the case of the ISP2 involved close collaboration with and participation of the R&D team, as this was considered to be the most experimental part of the project, with regard both to the particularities of the digital preservation process for industrial design and engineering and to the technological solutions offered to the audience. The members of the R&D team took part throughout the demonstration and evaluation process and its preparation, as well as in the focus group discussion.
3.4 Selecting audiences

The demonstration and evaluation in the domain of industrial design and engineering was planned on a smaller scale than that for the memory institutions. This is visible in Table 1, reproduced from D14.1:

Domain of focus Memory institutions Industrial design and engineering E-science R and D Customers End-users X X X X X X
Table 1: Audiences for the SHAMAN evaluation process (SHAMAN 2009: 14)

Therefore, the selected participants in the demonstration and evaluation event represented the customer organization, which at the same time was a Project partner, also providing test material for the ISP2.

3.4.1. Selection of customer organization and event participants

The Philips Consumer Lifestyle division at Eindhoven is the Project partner in the industrial design and engineering domain. Given the confidential nature of the design and engineering processes in any industrial organization, this was the only such organization within which the demonstration and evaluation activities could be undertaken. Apart from the issue of commercial confidentiality, this was the organization where the requirements study had been undertaken and which supplied data for the purpose of aiding systems development. Undertaking any such work in a different organization would have involved the time-consuming process of gaining entry and then the development of demonstration systems appropriate to that organization.

The Lifestyle division has five ‘key business areas’: Health and Wellness; Domestic Appliances; Coffee; Personal Care; and Audio, Video, Multimedia and Accessories. Our participants were all from the last of these business areas. Apart from the student intern, the participants were all senior staff of the division: Business development manager; Senior IT specialist; Advanced development and benchmark manager; and Advanced development programme manager. All had been with the company for twenty-five years or more. They represented various functions within the customer organization, as potential end-users of the DP technology and/or as users of preserved documents.
3.5 Demonstration process

As the demonstration did not involve use of a prototype system or some kind of functional demonstrator allowing users hands-on experience, the main mode of demonstration was an extended presentation with screen-casts presenting the functionalities and features of the prototype demonstrators and some live demonstration on screen.

The first full-scale demonstration activity was carried out for the SAB members meeting in Amsterdam. The experience of this evaluation was used to improve the presentation and demonstration material as well as the presentation procedure, which had to be shortened substantially. Therefore, two of the original demonstration parts, on Preservation-aware co-design and on Multivalent technology in design, were removed.

The process included the preparation of the presentation material, creation of the screen-casts, development of the evaluation instruments (focus group schedule and questionnaires), training of the presenters, rehearsing the events and actual conduct of the demonstration activities (see more in Chapter 4). The demonstration took place on May 24, 2011 in Eindhoven at the Philips Science and Technology park. The analysis of the collected data was performed in June 2011.

The demonstration and evaluation process also included internal monitoring and assessment of the activity on the basis of the KPIs developed within the SHAMAN Assessment Framework. The assessment results of the demonstration and evaluation activity of the ISP2 for WP14 are presented below in Table 2.
Title of KPI: Demonstration activities
Defined: The demonstration and evaluation exercise carried out in time to give possibility of addressing problems
Measured: Timeliness
Target: All (100%) of evaluation exercises to be conducted within the appropriate timeframe
Result: Planned demonstration/evaluation event carried out within the set time-frame and the feedback provided to the R&D team

Title of KPI: Demonstration facilities
Defined: Adequate demonstration facilities have been organized
Measured: Ability to demonstrate each element for the SHAMAN framework
Target: 100% of elements successfully demonstrated
Result: Adequate demonstration facilities organized in the presentation site. Four of six (67%) developed components demonstrated. Reason: one of the undemonstrated components related to one demonstrated, another evaluated during the first stage (with ISP1).

Title of KPI: Adoption encouragement
Defined: Steps taken to encourage the adoption of the SHAMAN framework
Measured: Number of demonstration activities for different audiences
Target: At least two activities within each domain
Result: The components most relevant to the company identified, some of the work transferred to other projects involving Philips. Limited achievement as the demonstration planned only in one partner institution.

Title of KPI: Business reach
Defined: Attracting business participants who could build upon SHAMAN products and services
Measured: Number of business participants in demonstration activities
Target: At least two per domain
Result: The improvements for coming training and dissemination activities identified and passed on to other WP. Limited achievement as the demonstration was planned only in one partner institution.

Title of KPI: Application of the SHAMAN Assessment Framework
Defined: The extent to which the Assessment Framework has been applied in the demonstration and evaluation activities
Measured: The percentage of evaluation activities in which the Assessment Framework is applied
Target: 100%
Result: 100%
Table 2: Achievements of key performance for ISP2 demonstration and evaluation
3.6 Preparing evaluation methods and instruments for end-user evaluation
The focus group discussions and the questionnaires as methods for evaluation of the SHAMAN R&D outcomes were already tested during the first stage and reported in D14.2. The WP14 team decided to use the same evaluation methods for the ISP2, though the differences in the evaluation process were quite significant. From the beginning of the Project only one demonstration facility was planned for the ISP2, as seen in Table 3 (see DoW and D14.1).
WP12 – evaluation plan. Meeting customer and project requirements
Product: Prototype for electronic and mechanical design and engineering domain
Time: M44
Partners: ULiv, InConTec, SSLIS, FUH, Philips, IM, Globit
Training: M3-self assessment
Demo mode: Use of prototype
Demo site: Demonstration facility at Philips Consumer Lifestyle
Criteria:
  User evaluation: Ease of learning (measured in time required to be able to use the prototype)
  KPIs: Diversity of the supported contents; Volume of the integration of contents; Integration of analysis tools; Supported preservation services; Content providers satisfaction
  System criteria: Essential properties; Maintainability; Effectiveness
Table 3: WP12 evaluation plan (SHAMAN 2009: 31-32)
Given the nature of the ISP2 demonstrator, some of the evaluation criteria that were not applicable in the circumstances lost their relevance, e.g., ease of learning. However, most of the rest were taken into account and included in the focus group and questionnaire. Some of the KPIs were assessed before the evaluation with customer organizations. Thus, the diversity of supported content was increased in comparison to the ISP1 with video and 3D objects, supported preservation services were diversified in accordance with DoF2 needs, etc. The R&D group collaborated with the WP14 team to solicit relevant data from the participants in the areas of the greatest interest to developers. Most of the questions in the questionnaire were focused on the following key elements of the ISP2:
- Integration of long-term preservation processes into the PLM system.
- Transformation of native 3D formats into normalized format.
- Being able to search for archived product data.
- Capturing and archiving of collaboration metadata (e.g. design decisions) for re-use.
- Being able to find product metadata during knowledge evolution.
The focus group was designed to clarify the relevance of the ISP2 solutions to the company and to the individual workers in it, to compare the reactions to the established requirements and, to some extent, to assess the presentation quality and clarity. Some of the discussion also involved the developers’ team, who asked questions or answered questions from the participants.
Some data from the interviews with the Philips workers on the use of the Connect Us tool (conducted by Emma Forsgren in January 2011) were used to understand the approaches of the company workers to the evaluation of the ICT tools in general. This knowledge also helped to structure the questionnaire and guide the focus group.
3.7 Methods of evaluation with R&D community
The evaluation of the SHAMAN outputs with research communities differed slightly from the evaluation with the customer organizations and end-users among memory institutions. Not all of it was related to the presentation of demonstrators. Other types of material (articles, conference presentations, etc.) were used for soliciting the required feedback. The presentation of the demonstrators was used in the discussion session with the Scientific Advisory Board (SAB). To some extent the members of the SHAMAN Consortium also served as evaluators of the project R&D outputs.
The expected feedback from the R&D community consisted of the reactions of the members of the community towards the presented results. As some years have already passed, we could investigate the citations of the SHAMAN publications to some extent, and this is the main indicator of the reaction of the R&D community in this document. The citations were checked in the ISI Web of Knowledge and in Google Scholar. Thus, the main results related to the assessment of the achievements of the SHAMAN Consortium by the R&D community were collected using bibliometrics and content analysis. In addition, the penetration and the influence of the SHAMAN project were assessed using Webometrics: statistical measurements of the SHAMAN presence on the World Wide Web.
Straightforward descriptive bibliometrics were used, followed by an analysis of the meaning of the results. The number of publications in this case provides a picture of the present research areas. The number of accepted papers is an indirect measure of the quality of the work or the interest of the R&D community in the subjects of work.
The Webometric exercise was also limited to some descriptive measures and was compared to the baseline data presented in D14.2. This section addresses not only the Web impact on the R&D community but also the visibility of SHAMAN among memory institutions on the Web.
A session with the SAB members was organized and feedback from peers was collected through informed discussion for further development of the R&D output and for the assessment of the SHAMAN framework in the market of scientific ideas.
4. EVALUATION BY THE CUSTOMER ORGANIZATION
4.1 Demonstration and evaluation event – implementation
The demonstration and evaluation event was carried out in Philips, Eindhoven on 24th May, 2011. The event was organized through the Philips liaison person with the Project, Kees Tuinenbreijer. The division chosen at the outset of the partnership with Philips was the ‘Consumer lifestyle’ division and, as noted earlier, the participants at the event were from the Audio, video, multimedia and accessories business area. Initially, nine participants had been expected but three dropped out on the morning of the event, leaving five Philips staff and one student intern. This cannot be described as a wholly satisfactory level of participation, but, in the environment of a major business, where sudden demands are made of people, it had to be accepted as inevitable.
4.2 Results of evaluation by customers and end users
4.2.1. Response to the presentations
A self-completed questionnaire (see Annex 6) was used to collect responses from the participants. With only five actual Philips staff members the results can hardly be called representative of all opinion within the organization, particularly in a division with such different business areas as ‘Health and wellness’ and ‘Coffee’. Consequently, the results must be taken simply for what they are: the views of five individuals.
First, the presentation overall was regarded by four of the five as ‘Not very useful’ and this view was reflected in the responses to the more specific questions, most of which reported aspects of the presentation as being of only ‘Moderate relevance’. With regard to the problem areas presented, only ‘Collection of collaborative work data’ showed a clear division of opinion, with two participants judging this to be of ‘High relevance’ and two judging it of ‘Low relevance’. This is probably the consequence of only two of the participants being involved in computer-supported collaborative work.
In the third part of the questionnaire, participants were asked for their evaluation of the event overall using a five-point scale, where ‘5’ indicated high satisfaction. The results were:
1. How would you rate the overall organisation of the event? Average score: 4.4
2. How would you rate the structure of the event? Average score: 3.6
3. How would you rate the presentation methods used? Average score: 3.8
4. How effective were the speakers overall? Average score: 3.0
5. How useful was the event with respect to your job? Average score: 2.4
6. How much did the event meet your expectations? Average score: 3.2
The results suggest only a moderate level of satisfaction with the event as a whole and a low evaluation of it in relation to their work.
The perceived importance of the functions demonstrated in the presentation is shown in Table 4 below. The division of opinion is evident from the table: the typical response was that the function was of ‘Moderate importance’ for the company. Only five functions were rated ‘Very important’ by two or more of the group: Normalization of PLM data, Indexing PLM metadata, Capturing metadata, Terminology evolution, and Semantic search. The fact that four of the participants rated Capturing metadata as ‘Very important’ is, perhaps, the only
clear message from this analysis. This was also one of the main messages from practitioners in the memory institutions who participated in the evaluation of ISP1.
Function
Social modelling of organization culture
Integration of social network and identity management
Middleware for reasoning over corporate social graph and collaboration metadata for application customisation
Aggregation of heterogeneous data repositories
Normalization of PLM data into open standards
Indexing of PLM metadata in the archive
Relevance of automatic and manual capturing of metadata annotations for archiving in design processes
Indexing of the social and collaboration context in the archive
Document-centred collaboration scoping
Usefulness of Multivalent technologies for accessing archived standard for the currently available file formats (pdf, doc, ppt, xml, txt, jt)
Practical relevance of terminology evolution
Applicability of metadata schema mappings for representing terminology evolution
Benefits of accessing the archive via semantic search using different independent vocabularies

Very important: 1 1 1 1 2 2 4 1 1 1 2 0 2
Moderate importance: 2 2 2 2 3 2 1 3 4 2 3 4 2 1 2
Not important: 2 2 2 2 1 1 1
Unknown: 1 1 1 1 0 0 1 1 2 1 1 0 1
Table 4: Perceived importance of demonstrated functions
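The tally behind the statement that five functions were rated ‘Very important’ by two or more participants can be reproduced with a short script. The following Python sketch is illustrative only: it uses the ‘Very important’ counts of Table 4 in table order, and the shortened function labels and the helper name `rated_by_at_least` are assumptions introduced here, not part of the questionnaire.

```python
# "Very important" counts per demonstrated function, in the order of Table 4.
# The short labels are abbreviations introduced for this sketch only.
very_important = {
    "Social modelling": 1,
    "Social network and identity management": 1,
    "Middleware for reasoning": 1,
    "Aggregation of data repositories": 1,
    "Normalization of PLM data": 2,
    "Indexing PLM metadata": 2,
    "Capturing metadata": 4,
    "Indexing social and collaboration context": 1,
    "Document-centred collaboration scoping": 1,
    "Multivalent access to archived formats": 1,
    "Terminology evolution": 2,
    "Metadata schema mappings": 0,
    "Semantic search": 2,
}


def rated_by_at_least(counts, threshold):
    """Return the labels whose count is at least `threshold`, keeping table order."""
    return [label for label, n in counts.items() if n >= threshold]


# Functions rated "Very important" by two or more participants.
highlighted = rated_by_at_least(very_important, 2)

# The single function with the highest "Very important" count.
top_function = max(very_important, key=very_important.get)

print(highlighted)
print(top_function)
```

With so few respondents, a tally of this kind only summarises the answers given; it says nothing about opinion in the wider organization.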
When asked ‘What improvements would you like to see implemented in the demonstrated SHAMAN framework?’, the participants responded:
“More emphasis on removing ‘archiving’ burden from engineer.”
“Machine to machine interface to replace the GUI front end.”
“More simplicity in using the tools.”
“More practical approaches to day-to-day problems in (our) practice.”
“Example of metadata extraction.”
Again, the concern for automating as much as possible of the information management process is evident in these comments.
4.2.2. The outcomes of the focus group
The outcome of the evaluation session as a whole (including the question and answer sessions during the presentations) can be summarised as follows:
a) The participants from this particular business area of the ‘Lifestyle’ division perceived no need, either personally or from the company’s point of view, for long-term preservation of design and engineering documentation.
b) Estimates of the length of time for which documents were archived at present ranged from “more than three years” through “seven years” to “fourteen years”.
c) Improved search capabilities were seen as desirable, as add-ons to the existing archiving systems which were based on Product Lifecycle Management systems and Document Management Systems.
d) Automatic metadata generation was seen as a desirable time-saving for engineers in their daily work.
e) The Open Conjurer sub-system attracted interest, though participants pointed out that they tended to work in face-to-face meetings rather than over networks and that if the system could be developed to capture information from such meetings, it could be of value to the company. They also stated that the usage of the social graph metaphor would be useful in order to initiate collaborative sessions and projects, making it easier to put the groups together, as the information is already present in the social graph.
The influence of practical concerns
As in the case of participants in the memory institutions domain, the comments of the engineers participating in the event were strongly influenced by the nature of their work, the types of products upon which they worked and the prevailing systems used in the company. There were frequent comments regarding ‘saving the time of the engineer’ and a vision was presented of a desirable system in which all an engineer had to do was to deposit a document (of any kind) in a system, which would then carry out all the necessary operations to identify, index, code with metadata, and archive in the appropriate form without further intervention by the engineer.
5. EVALUATION BY R&D COMMUNITIES
5.1. Publishing statistics and citations
The data collected from the members of Work Packages revealed that papers were presented at thirty-one conferences (in some cases, more than one paper was presented at the same conference). The geographical distribution shows a wider spread of activity than in 2009, with papers being given to conferences in the following countries: UK – 5; Germany – 3; Australia, Belgium, Lithuania, Netherlands, Spain, Sweden, Switzerland, India, USA – 2 each; Austria, Canada, Israel, Portugal, Turkey – 1 each.
The data submitted indicate that a total of forty-eight conference submissions were prepared and that only nine were not accepted. This is an acceptance rate of 81%, which, given the high rejection rates of many conferences, suggests that the ideas generated within SHAMAN are perceived to be of relevance to the various research communities.
An interesting difference, compared with 2009, is that only two conferences were attended in the USA (seven in 2009). This may be a consequence of the financial crisis affecting higher education in Europe, which is limiting travel budgets. On the other hand, conferences were attended in sixteen different countries, compared with twelve in 2009, and eleven of the sixteen were in ‘new’ countries, suggesting that an excellent job is being done in communicating the work of SHAMAN internationally. The list of conferences attended and papers prepared is given as Annex 13.
We can already start collecting the first measures showing the impact of the SHAMAN research by looking at the citation numbers for some older articles, though they cannot be very high at the moment and some instances are citations within the project community.
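The conference figures reported in this section (thirty-one conferences, sixteen countries, an acceptance rate of 81%) can be cross-checked from the numbers given in the text. The Python sketch below is a minimal illustration; the variable names are assumptions introduced here, and the counts are copied from the paragraph on geographical distribution and from the submission totals.

```python
# Conferences per country, as listed in the text for the reporting period.
conferences_by_country = {
    "UK": 5, "Germany": 3,
    "Australia": 2, "Belgium": 2, "Lithuania": 2, "Netherlands": 2, "Spain": 2,
    "Sweden": 2, "Switzerland": 2, "India": 2, "USA": 2,
    "Austria": 1, "Canada": 1, "Israel": 1, "Portugal": 1, "Turkey": 1,
}

# Submission totals reported by the Work Packages.
submitted = 48
not_accepted = 9

total_conferences = sum(conferences_by_country.values())
countries = len(conferences_by_country)
accepted = submitted - not_accepted
acceptance_rate = accepted / submitted  # fraction between 0 and 1

print(f"{total_conferences} conferences in {countries} countries")
print(f"{accepted} of {submitted} submissions accepted ({acceptance_rate:.0%})")
```

The check confirms that the per-country counts add up to the stated number of conferences and that the acceptance rate follows from the submission totals.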
For this first modest account of cited materials we have selected the ISI Web of Science database, which registers the citations for particular works, and Google Scholar, whose value as a source of citation numbers is contested (Kousha et al. 2009). Despite this controversy, Google Scholar was used because the project publications are recent and some of them (like most of the deliverables) are available only online and would not have found their way into the ISI journals by now.
Article
Watry, P. (2007). Digital Preservation Theory and Application: Transcontinental Persistent Archives Testbed Activity. The International Journal of Digital Curation, Issue 2, Volume 2, p. 41-68.
Barateiro, J., Antunes, G., Cabral, M., et al. (2008). Using a Grid for Digital Preservation. In: Digital libraries: universal and ubiquitous access to information, proceedings. 5362, 225-235.
Antunes, Barateiro, Cabral et al. (2009). Preserving digital data in heterogeneous environments. In: JCDL '09 Proceedings of the 9th ACM/IEEE-CS joint conference on Digital libraries. New York, NY, USA: ACM.
Barateiro, J., Borbinha, J., Antunes, G., et al. (2009). Challenges on Preserving Scientific Data with Data Grids. In: DAGRES09: data grid for e-science workshop. P. 17-22.
Hedges, M., Blanke, T., Hasan, A. (2009). Rule-based curation and preservation of data: A data grid approach using iRODS. Future generation computer systems-the international journal of grid computing-theory methods and applications. 25(4), 446-452.
Innocenti, P., Ross, S., Maceviciute, E., Wilson, T., Ludwig, J., Pempe, W. (2009). Assessing digital preservation frameworks: the approach of the SHAMAN project. In: Proceedings of MEDES '09, the International Conference on Management of Emergent Digital EcoSystems. New York, NY, USA: ACM.
Mois, M., Klas, C.P., Hemmje, M.L. (2009). Digital preservation as communication with the future. In: 16th International Conference on Digital Signal Processing, 5-7 July 2009, Santorini, Greece. Vol. 1, p. 112-119.
Nicholson, D., Dobreva, M. (2009). Beyond OAIS: towards a reliable and consistent digital preservation implementation. 16th International Conference on Digital Signal Processing. Vol. 1, p. 104-111.
Maceviciute, E. and Wilson, T. (2010). Evaluating the SHAMAN Digital Preservation Framework for memory institutions. In: Proceedings of International Conference on Digital Libraries 2010: Shaping the information paradigm (ICDL 2010), 23-26 February 2010, New Delhi. Vol. 1, p. 155-165. Delhi: Teri.
ISI Web of Science / Google Scholar: 0 23 1 0 0 0 0 0 0 0 6 3 1 1 3 1 1 1
Hedges, M., Blanke, T., Hasan, A. (2009). Rule-based curation and preservation of data: A data grid approach using iRODS. Future generation computer systems-the international journal of grid computing-theory methods and applications. 25(4), 446-452.
SHAMAN (2009). SHAMAN Requirements Analysis Report (public version) and Specification of the SHAMAN Assessment Framework and Protocol. SHAMAN Project.
SHAMAN. Implementation of Templates to Manage the Ingest Workflow (D6.3).
Total
0
0 0
1
3 6
1 40
Table 5: Citations of SHAMAN publications in ISI Web of Science and Google Scholar
As was expected from the very beginning, Google Scholar registers most of the citations. So far only one citation, for one item, is registered in the ISI Web of Science. Only two of the public deliverables are cited, mainly in publications by the SHAMAN consortium members, but eleven other publications have already been explicitly used in the works of other researchers. The most cited article presents the general principles for the development of digital preservation systems. The team in Portugal working on the DP reference architecture and implementation for e-science is receiving quite significant attention overall (10 citations).
5.2. Webometric analysis
To be able to compare the results of the webometric analysis with the baseline established in D14.2 and with the PLANETS and CASPAR projects, the same procedures were performed using Google queries. The following search strings were used with Google search to get hit count estimates:
"Planets, Preservation and Long-term Access through Networked Services"
"Shaman, Sustaining Heritage Access through Multivalent Archiving"
"Caspar, Cultural, Artistic and Scientific knowledge for Preservation, Access and Retrieval"
Double quotes are required by the Google search engine to combine whitespace-separated words into an "exact phrase search" or single search term. Initially Google's results page reports "22200 results" for the phrase "Planets, Preservation and Long-term Access through Networked Services". However, this must be refined to get a meaningful figure. This is simply a case of clicking through to the last results page, where the "true figure" is shown. Doing this, we find that there are 345 results. With "Caspar, Cultural, Artistic and Scientific knowledge for Preservation, Access and Retrieval" we get 19600 results; clicking through reduces this to 247. With "Shaman, Sustaining Heritage Access through Multivalent Archiving" the initial 3340 results become 146.
Table 6 shows the hit count estimates and the URL and website comparison for the three digital preservation projects as performed in 2010 and in June 2011. Though the numbers are slightly increased for the final website count in 2011, the nature of the Google citations does not allow any conclusions to be drawn about increased impact or influence of the projects.
Project    Hit count estimate    URLs            Websites
           2010      2011        2010    2011    2010    2011
PLANETS    3830      22200       299     345     155     103
SHAMAN     2480      3340        209     146     65      84
CASPAR     485       19600       73      247     15      105

Google queries:
PLANETS: "Planets, Preservation and long-term access through Networked services"
SHAMAN: "Shaman, Sustaining heritage access through Multivalent archiving"
CASPAR: "Caspar, Cultural artistic and scientific knowledge preservation for access and retrieval"
Table 6: Web impact: comparison of Web citations for EU digital preservation projects.
The rest of the data were gathered using the LexiURL software designed by Mike Thelwall (Statistical Cybermetrics Research Group, University of Wolverhampton, 2010). The software searched using Microsoft’s Bing search engine because of limitations of the other services or APIs. Using Lexi's text file input method, we ran the same terms as those used above for the Google searches. To quote the LexiURL user documentation:
…the main table of a LexiURL Searcher impact report lists the number of URLs, domains, web sites, Second or Top-Level-Domains (STLDs) and TLDs matching each search, as calculated from the long result files. This page can be reached from the main index.html page by clicking on the Overview of results link. The most reliable impact indicator is normally the number of domains rather than the number of URLs due to the possibility that text or links are copied across multiple pages within a web site… The results also include full lists of matching URLs, domains, sites, STLDs and TLDs for each query: clicking on the appropriate links on the home page reveals these lists. (http://lexiurl.wlv.ac.uk/searcher/usingLexiURLSearcherClassic.html)
To count in-links to the SHAMAN website we used Yahoo's "Site Explorer" service, entering the URL "shaman-ip.eu" into the search box. This returned 129 in-links. Next we had to pick through Yahoo's results and remove entries that were essentially duplicates (e.g. blog postings, advertisements or discussion group threads indexed twice due to new postings above them on different dates/URLs). This reduces the 129 to 82. Alternatively, using the LexiURL software gives the figure of 123 external URLs from 100 domains. The search string Lexi uses on Yahoo is: "linkdomain:shaman-ip.eu -site:shaman-ip.eu".
Entering this same query manually into the Yahoo search box yields 179 results, but as the LexiURL report explains: "The number of URLs returned by the search engine [is more reliable and is] (NOT the estimated number of URLs it reports)". The figure 179 can therefore be safely ignored. The discrepancy between the 100 domains reported by the software and the 82 entries found through Site Explorer remains unexplained, although it is well known that search engine results are only an approximation and can fluctuate very quickly.
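The distinction the LexiURL documentation draws between counting URLs and counting domains can be illustrated with a few lines of Python. This is a minimal sketch and not the LexiURL implementation: the function name `impact_counts` is hypothetical, the host name of each URL is used as a simple stand-in for its domain, and the sample URLs are placeholders rather than real search results.

```python
from urllib.parse import urlparse


def impact_counts(urls):
    """Return (number of distinct URLs, number of distinct host names)."""
    unique_urls = set(urls)  # drops exact duplicates
    hosts = {urlparse(u).netloc.lower() for u in unique_urls}
    return len(unique_urls), len(hosts)


# Hypothetical in-links; example.org and example.net are placeholder hosts.
sample = [
    "http://blog.example.org/2011/05/shaman.html",
    "http://blog.example.org/2011/06/shaman-again.html",
    "http://blog.example.org/2011/05/shaman.html",  # exact duplicate
    "http://www.example.net/projects/preservation",
]

n_urls, n_hosts = impact_counts(sample)
print(n_urls, n_hosts)
```

Several pages on the same site collapse to one host, which is why a domain-level count is less sensitive to links copied across pages. Near-duplicates that appear under different URLs, such as those removed by hand from the Site Explorer results, would still need manual inspection.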
May 2010: No. of connections: 256; External URLs: 72
June 2011: No. of URL connections: 129; External websites: 82

Categories of external Web sites
Category    May 2010    June 2011
1*          17          17
2*          14          16
3*          9           9
4*          7           12
5*          7           15
6*          -           10
7*          -           3
Total       54          82

1*: partners & related services; 2*: academic and research sites; 3*: European sites (.eu); 4*: blogs, social networking & sharesites; 5*: memory institutions; 6*: publishers, journals, conference sites; 7*: companies
Table 7: SHAMAN Web impact: in-links to SHAMAN Web site (June 2011).
Comparing with the previous results found in 2010, we can see that the number of memory institutions linking to the SHAMAN project site has doubled, and an additional and rather extensive category of publishing and conference sites has emerged. There are also three company sites (non-partners in the project) linking to the SHAMAN project website. Though the increase of in-links is not very high (ten since last year), the categories of the in-linking sites are more diverse than in 2010.
5.3. Expert opinion
The feedback of high-level researchers was also received from the meeting with the Scientific Advisory Board. The meeting was held at the Amsterdam headquarters of Philips on February 25. All SHAMAN partners and WPs presented their results to the SAB, represented by these members:
Neil Beagrie – Charles Beagrie Ltd.
Bruno Bachimont – Univ. Compiegne/Institut National de l’Audiovisuel
Olivier Rouchon – CNRS/CINES (could not attend, as he was blocked by a snowstorm on his way from Montpellier to Frankfurt)
Jean-Yves Delaunay – Airbus (participated off-site)
In fact, the meeting of the project team with the members of the SAB was the first ISP2 demonstration event, but work on ISP1 and ISP3 was also introduced. The evaluation took the form of a lively discussion. The SAB members focused on the positive achievements of the project and agreed that overall the presented results were impressive and very detailed. However, the presentations were too long; for any other audience they should be presented differently and be significantly shorter. This experience was taken into account in preparing the presentation for Philips.
The recommendations of the SAB not only addressed the weak points in the presentations but also suggested highlighting the value added over the course of the project. The SAB members explained the necessity of emphasizing the actual capabilities of SHAMAN solutions for efficient preservation through living examples that would address not only the day-to-day running of the system, but also its usefulness for the future. Another point worth strengthening was the developed Reference Architecture, which is different from and more valuable than the ones developed in earlier EU projects. The directions and plans for future work are also worth addressing in future presentations.
5.4. Conclusion
All in all, the results of the examination of SHAMAN’s impact and the existing feedback from the R&D community are satisfactory for the examined project period. The published output is quite high, the web impact is on a par with comparable EU projects and more diversified than during the previous period, and the feedback from the peer review process is balanced and constructive. The R&D community feedback concentrated on the functionality of the demonstrators, the novelty of the approaches and the added value. The constructive and critical comments of the peer review are taken into account for further R&D work in the project. This applies especially to the input from the SAB meeting, which influenced further development of the demonstrators and the methods of presentation and evaluation during the demonstration event in Eindhoven.
6. CONCLUSIONS AND RECOMMENDATIONS
What have we learned about our target organizations and potential users?
In the case of ISP2 our target customer organizations (industrial design and engineering companies) were represented by Philips Consumer Lifestyle Division and, more specifically, the Audio, Video, Multimedia and Accessories business area. It was pointed out that in this area the products have relatively short lifetimes and that, consequently, the need for long-term digital preservation was limited. The maximum estimate of the time for which documents would need to be preserved was fourteen years and the potential users of digital preservation software believed that existing systems were perfectly capable of coping with preservation over that period.
The nature of the technology and the pace of development within the industry militate against re-use of earlier technologies, although earlier ideas that were originally not capable of being realised in a product could be re-used. Such re-use appeared to depend more upon ensuring that the language in which earlier ideas were expressed was ‘understandable’ to modern search capabilities in systems. The example was given of ‘3D television’, which was earlier known as a ‘stereoscopic display’.
Participants pointed out that, although their needs for digital preservation were limited, other business areas in the division were more likely to have greater needs, for example, the Health and Wellness area, which had legal obligations to retain data.
The theoretical justification for the SHAMAN framework appeared to be of little interest to them. Their concerns centred on the ‘business case’: would the costs of implementing a SHAMAN-based set of processes be justified by the savings? In the light of their present archiving practice, it would be difficult to make any such business case.
As engineers, the primary requirement they had of software was that it should ‘save the time of the engineer’. They felt that any systems that required more input from the user than they gave already would not be acceptable and that the optimum system would be one that accepted a document or design and then automatically extracted all the necessary metadata to enable it to be found and used in the future.
What have we learned about the evaluation approach for digital preservation?
The evaluation approach adopted for ISP2 was essentially the same as that for ISP1, albeit with a more limited audience in one organization. Time and work pressures in business organizations limit the extent of participation and, in this case, three of the nine intended participants had to drop out on the morning of the event. In spite of the limitations, however, the evaluation methods were successful in delivering useful information for the preparation of this deliverable. If there are to be any future evaluation events involving industry, however, it will be necessary to ensure much greater and better informed participation, not only from the point of view of uncovering more diverse views, but also from the point of view of increasing the potential impact of SHAMAN.
The approach of selecting a “typical” member of the organization showed that digital preservation is of little concern to engineering staff and that, therefore, the main points of interest were the individual, potentially useful elements of the system rather than its core principles. In future evaluations it would be important to ensure that the representatives of a customer organization are exposed to at least some of the issues of preservation before taking part in the evaluation.
What have we learned about the SHAMAN ISP2 demonstrator? The ISP2 demonstrator was developed specifically for the industrial design and engineering domain and, even more specifically, on the basis of the requirements study within Philips Consumer Lifestyle division. It employed, where appropriate, data from Philips on the design and production of television sets. In spite of this, however, it was evident that the mode of presentation failed to get across some significant aspects of the SHAMAN framework. Future demonstrations need to focus much more closely on the environment and present the framework in a more directly intelligible fashion. There was no clear demonstration of what kind of user interface could be designed and how the archival functions could be managed through such an interface; rather, participants were left to infer these things from the rather academic presentation offered. The demonstration needs a better story, based on the day-to-day work of the design engineer to make the ideas intelligible and, potentially, acceptable. The issue of product differentiation should also be addressed: participants pointed out that commercial software vendors, including Microsoft, were developing the kinds of capabilities offered by SHAMAN. If this is the case, a serious attempt must be made to demonstrate the differences between what is on offer as proprietary software and what can be accomplished through the open source software employed by SHAMAN. The demonstrator had different elements from those demonstrated in ISP1 and, although the participants had a limited need for preservation, per se, they nevertheless found some features of the demonstrator, such as the Open Conjurer module and the concept of „terminology evolution‟ particularly appealing. Open Conjurer, in particular, was thought to have potential value, if it could be shown to work with meetings, and not only with computer supported collaborative work. 
The collection of collaborative context metadata was considered to be important and the usefulness of its availability in the future was emphasized.
What have we learned about the applicability of the SHAMAN framework to industrial design and engineering?
It is quite evident that the SHAMAN framework is applicable to the domain, and it is also evident that the key factors that will make digital preservation in general a requirement will depend very much on the nature of the business. As we have noted, in an area such as that dealing with audio and video appliances, very long-term preservation is not needed because of the rather short lifetime of an individual product. However, even here, in a different business area, relating to health and well-being, legal requirements may necessitate a much longer period of preservation for documents. We can imagine, also, that in fields such as the aerospace industry, where, for example, an aircraft may have a lifetime of several decades, during which time spare parts are needed and, in the event of serious damage, more extensive repair and refurbishment, the need for digital preservation will be much greater.
What have we learned that has implications for the training and dissemination phase of the Project?
Given the diversity of business areas in the field of industrial design and engineering, it is clear that any training and dissemination must be closely tailored to the needs of the area. With ISP2, SHAMAN has presentation materials that were developed for the exercise in Philips, but they can be presented in general terms to other kinds of companies, potential commercial developers of software or conferences and exhibitions. While not every company designs television sets, it is probably true to say that most would understand the design process and its implications and have some perception of how the SHAMAN preservation
framework could be of help. Clearly, however, as noted above, there is a need to develop a presentation of the ideas that focuses on the work of the engineer, presents a business case for the adoption of the ideas, and presents a comparison with existing proprietary software.
7. Bibliography and References
CRL & OCLC (2007). Trustworthy Repositories Audit & Certification (TRAC): Criteria and Checklist. Chicago: Center for Research Libraries; Dublin, Ohio: OCLC Online Computer Library Center, Inc. [Access through the internet 2011-06-12: http://www.crl.edu/sites/default/files/attachments/pages/trac_0.pdf].
DCC & DPE (2008). DRAMBORA interactive. [Access through the internet 2011-06-12: http://www.repositoryaudit.eu/]
DeLone, W.H. & McLean, E.R. (1992). Information systems success: the quest for the dependent variable. Information Systems Research, 3(1): 60–95.
DeLone, W.H. & McLean, E.R. (2003). The DeLone and McLean model of information systems success: a ten-year update. Journal of Management Information Systems, 19(4): 9-30.
IEEE (1998). IEEE recommended practice for software requirements specifications. Institute of Electrical and Electronics Engineers. IEEE Std. 830-1998, New York.
Kousha, K., Thelwall, M. & Rezaie, S. (2010). Using the web for research evaluation: The Integrated Online Impact indicator. Journal of Informetrics, 4(1), 124-135.
SHAMAN (2008). D1.1 survey of users and providers in Europe and SHAMAN usage scenarios and DP support.
SHAMAN (2009). D14.1. Demonstration and Evaluation Plan.
SHAMAN (2009a). D11.1. Implementation of context capturing mechanisms within production environments and federated server architecture for temporary storage before ingests.
SHAMAN (2009b). D1.2. SHAMAN Requirements Analysis Report (public version) and Specification of the SHAMAN Assessment Framework and Protocol. [Access through the internet 2010-09-02: http://shaman-ip.eu/shaman/document]
SHAMAN (2009c). Project periodic report – Y1. Project objectives, progress and achievements. SHAMAN – WP18-D18.1 (Y1).
SHAMAN (2010). D11.1A. Implementation of context capturing mechanisms within production environments and federated server architecture for temporary storage before ingests. Addendum.
SHAMAN (2010a). Project periodic report – Y2. Project objectives, progress and achievements. SHAMAN – WP18-D18.1 (Y2).
SHAMAN (2011). Project periodic report – Y3. Project objectives, progress and achievements. SHAMAN – WP18-D18.2.
SHAMAN (2010b). D14.2 - Report on demonstration and evaluation activity in the domain of "memory institutions". SHAMAN – WP14-D14.2.
SHAMAN (2010c). D12.3 - Prototype for Electronic and Mechanical Design and Engineering Domain – Demonstration Scenario Planning. SHAMAN – WP12-D12.3.
Thelwall, M. (2009). Introduction to webometrics: quantitative web research for the social sciences. San Rafael, CA: Morgan & Claypool.
LIST OF ABBREVIATIONS
3D Three dimensional
CAD Computer aided design
CASPAR Cultural, Artistic and Scientific knowledge for Preservation, Access and Retrieval
CNRS/CINES Centre national de la recherche scientifique/Centre Informatique National de l'Enseignement Supérieur
CSCW Computer supported collaborative work
D14.2 Deliverable 14.2
DoF SHAMAN Domain of Focus
DoF1 SHAMAN Memory Institutions Domain of Focus
DoF2 SHAMAN Industrial Design & Engineering segments Domain of Focus
DoF3 SHAMAN e-Science Domain of Focus
DoW Description of work
DP Digital Preservation
DRAMBORA Digital Repository Audit Method Based on Risk Assessment
EDA Electronic Design Automation
IEEE Institute of Electrical and Electronics Engineers
iRODS Integrated Rule Oriented Data Systems
ISP Integration and demonstration Sub-Project
IT Information Technology
KPI Key Performance Indicator
PCA Project Competence Area
PLANETS Preservation and Long-term Access through Networked Services
PLM Product Life-cycle management
R&D Research and Development
RFP Release for Production
RTD Research and Technical Development
SAB Scientific Advisory Board
STLD Second or Top-Level-Domain
TRAC Trusted Repositories Audit & Certification
TV Television
UK United Kingdom
URI Uniform Resource Identifier
URL Uniform Resource Locator
Y1 Year one
WP Work package
8. ANNEXES
ANNEX 1: ISP2 demonstration and evaluation event timetable
13:00-13:15 Kees opens the event. At the same time the first part of the questionnaire is distributed. Kees asks participants to fill it in. Round of introductions.
13:15-13:25 Elena explains the aims and the structure of the event (short slide presentation)
13:25-14:45 Presentation of the ISP2 domain
    13:25-13:55 Hans Ulrich Heidbrink, Incontec
        Introduction to ISP2
        Design Data Acquisition
        Normalization and access
        Video demonstration of PLM integration
    13:55-14:00 Break for questions
    14:00-14:25 Dominic Heutelbeck, FTK
        Collaboration and Decision Making
        Demonstration of the Decision Room
    14:25-14:30 Break for questions
    14:30-14:40 Jörg Brunsmann, FUH
        Long-term access to Evolving Metadata in the Engineering Industry
    14:40-14:45 Break for questions
14:45-15:00 Coffee break (during it the second and third parts of the questionnaire are distributed)
15:00-16:00 Focus group discussion (Elena, Tom, Kathleen, Vittore)
16:00-16:15 Filling in of questionnaire parts 2 and 3 by the participants.
ANNEX 2: Demonstration scenario and focus group guide
The focus group leader reminds participants of the purpose of the meeting and sets the rules of discussion (e.g., asking questions, hearing each participant's opinion round the table, free discussion, and moving to another issue). The focus group leader has to ensure that every participant has the opportunity, and is encouraged, to express a personal opinion, raise doubts, and comment on other participants' input.
The opening question:
I. What is your overall impression of the demonstration?
Subsidiaries:
Did you find it useful/not useful?
What was lacking?
What impressed you most in the demonstration?
What struck you as innovative in the demonstration?
II. Having seen the presentation do you think that the ideas presented are of relevance to Philips?
Subsidiaries:
How relevant can it be for your own work?
Do you feel that there is a need to re-use your earlier designs or other data?
Which demonstrated functions do you see as most useful in relation to your work?
III. Does the company already support the development of digital preservation tools or use them? If YES, which tools?
Subsidiaries:
Do you think the demonstrated features could help Philips to develop a digital preservation policy?
Which demonstrated functions do you see as most useful for this purpose?
Would these innovative features be an improvement on existing technologies of which you are aware?
IV. If systems were developed based on SHAMAN do you think the company should be interested in using them?
Subsidiaries:
Would the framework extend your existing systems in any useful way? How?
V. These are the key features of the SHAMAN framework that you have seen:
Integration of long-term preservation processes into PLM
Transformation of native 3D formats into normalized format
Being able to search for archived product data
Capturing and archiving of collaboration metadata (e.g. design decisions) for re-use
Being able to find product metadata during knowledge evolution
Which would be the most important for you from the perspective of long-term digital preservation for the company's purposes?
VI. What incentives would help your company to adopt systems based on the SHAMAN framework?
Subsidiary:
Considering the demonstrated components of the SHAMAN framework, do you see benefits of the framework beyond the perspective of long-term preservation, and if yes, where?
5. Ask the participants to fill in the evaluation Questionnaire (impressions from demonstration, parts 2 and 3).
ANNEX 3: Questionnaires to the focus group participants
Dear participants,
We would be grateful if, in addition to the discussion, you could answer this questionnaire. It is divided in two parts. Please fill in the first one at the beginning of the meeting. The second part can be answered only after you have seen the presentation and participated in the discussion. Please put it aside; your moderator will remind you when to fill it in.
Part I
1. Which organization do you work in? (Please tick the appropriate box.)
[ ] Philips
[ ] Other. Name of organization: ____________________________
2. What is your role or job in your organization? __________________________________
3. How long have you worked in the organization? _________________ years
4. If a project is finished and has reached release for mass production, where are your design and engineering documents archived?
[ ] On my local disk
[ ] In the team data environment (CAD or simulation or data server)
[ ] In a Product Lifecycle Management system
[ ] I deposit it in an archiving system
[ ] It is done by my IT department as backup service
[ ] I don't archive myself, and don't know if it is done in the company
5. If files are archived, for how long are they archived or stored? _______________
5.1 Are there purging procedures in place for archived files? [ ] No [ ] Yes [ ] Don't know
5.2 Are there time retention policies for archived files? [ ] No [ ] Yes [ ] Don't know
6. Are any external services used to archive documentation and other engineering documents?
[ ] Yes, for service documents
[ ] Yes, for engineering artefacts
[ ] No, we do it only with internal resources
[ ] I don't know
7. Do you ever need to access archived design information or documents? [ ] No [ ] Yes
7.1 To do so, do you need to take specific action?
[ ] No, it is on my local disk
[ ] No, I have access to the team data environment
[ ] Yes, I need to check out from a PLM system
[ ] Yes, I have to get it from an archiving system
[ ] Yes, I ask the IT department for a backup copy
7.2 Do you have search capabilities to access older file sets? [ ] No [ ] Yes
What search criteria can you use?
[ ] I need to know file name and version
[ ] I need to know the time range when it was stored
[ ] I can search using metadata to identify the files
[ ] I need the product identifier or number to start searching
[ ] I know it by experience and scan my own repository
7.3 Have you ever missed design information from an older design which would have been helpful for your actual work? [ ] Never [ ] Occasionally [ ] Often
8. Does your organization have a documented digital preservation policy? [ ] No [ ] Yes [ ] Don't know
9. Are you personally involved in digital preservation work in the organization? [ ] No [ ] Yes
9.1 How long have you worked on digital preservation matters? _________ years
9.2 What is your role in relation to digital preservation? Please tick appropriate functions.
[ ] Formulating policies and/or procedures
[ ] Selecting materials for digitization
[ ] Managing the digitization process
[ ] Defining the organizational need for technology
[ ] Data curation
[ ] Selecting and adopting digital preservation systems
[ ] Managing or providing IT services
[ ] Software development
[ ] Other (please specify) _______________________________________
10. Are you familiar with the Product Lifecycle Management (PLM) systems used in your company? [ ] No [ ] Yes
11.1 Do you use the PLM system in your work? [ ] No [ ] Yes
Do you see the PLM system as having an archival function? [ ] No [ ] Yes
Do you see differences between PLM check in and check out and the ingest and access process in digital preservation systems? [ ] No [ ] Yes
What is the main difference you see? _________________
11.2 Do you need more information on the differences between digital archiving and digital preservation? [ ] No [ ] Yes
12. Does your organization have a documented collaboration policy? [ ] No [ ] Yes [ ] Don't know
13. Are you personally involved in computer supported collaborative work (CSCW) in the organization? [ ] No [ ] Yes
How long have you worked on CSCW matters? __________ years
What is your role in relation to CSCW? Please tick appropriate functions.
[ ] Formulating policies and/or procedures
[ ] Identity management
[ ] Managing the collaboration processes
[ ] Defining the organizational need for technology
[ ] Selecting and adopting collaboration tools
[ ] Managing or providing IT services
[ ] Software development
[ ] Other (please specify) ______________________________
Thank you for answering this part of the questionnaire. You will be asked to complete Parts 2 and 3 at the end of the focus group interview.
Part II
1. Having viewed the demonstration, how useful do you find the features of the demonstrator for your company? Please tick one box.
[ ] Very useful
[ ] Useful
[ ] Not very useful
[ ] Not at all useful
[ ] Not known
2. Having viewed the demonstration, how would you rate the relevance of the following problems in general from the company's perspective?
(Each question is rated: High / Moderate / Low / Unknown)
Integration of long-term preservation into design process
Collection and enrichment of design and product data for archiving
Collection of collaborative work data
Search and access capabilities for preserved materials
Re-use of archived information
Management of vocabulary evolution
3. Please rate the importance of demonstrated functions for your organization.
(Each function is rated: Very important / Moderate importance / Not important / Unknown)
Social modelling of organization culture
Integration of social network and identity management
Middleware for reasoning over corporate social graph and collaboration metadata for application customisation
Aggregation of heterogeneous data repositories
Normalization of PLM data into open standards
Indexing of PLM metadata in the archive
Relevance of automatic and manual capturing of metadata annotations for archiving in design processes
Indexing of the social and collaboration context in the archive
Document-centred collaboration scoping
Usefulness of multivalent technologies for accessing archived standard for the currently available file formats (pdf, doc, ppt, xml, txt, jt)
Practical relevance of terminology evolution
Applicability of metadata schema mappings for representing terminology evolution
Benefits of accessing the archive via semantic search using different independent vocabularies
4. Which other functions would you expect to find in a digital preservation or collaboration system in your environment?
___________________________________________________________________________
5. From the company's perspective, what standards need to be integrated within the SHAMAN framework?
___________________________________________________________________________
6. What improvements would you like to see implemented in the demonstrated SHAMAN framework?
___________________________________________________________________________
ANY OTHER COMMENTS
___________________________________________________________________________
Thank you for answering our questionnaire and participating in the focus group. The final section is a short set of questions about the presentations themselves.
Part III
Please rate some of the aspects of this event by circling one of the numbers, where 1 = Poor and 5 = Excellent.
1. How would you rate the overall organisation of the event? 1 2 3 4 5
2. How would you rate the structure of the event? 1 2 3 4 5
3. How would you rate the presentation methods used? 1 2 3 4 5
4. How effective were the speakers overall? 1 2 3 4 5
5. How useful was the event with respect to your job? 1 2 3 4 5
6. How much did the event meet your expectations? 1 2 3 4 5
7. What else would you like to have seen covered at this event?
_________________________________________________________________________
8. What did you like best about this event?
_________________________________________________________________________
9. What did you like least?
_________________________________________________________________________
10. Please feel free to make any additional comments.
_________________________________________________________________________
Thank you for your responses and for your participation in the whole event. Your participation has been essential for the evaluation of the SHAMAN Project and will be reflected in future work.
ANNEX 4: Transcription of questions during the demonstration session and focus group interview
Date: 2011-05-24
Place: the Netherlands, Eindhoven, Philips High Tech Campus, Strip, room Bohr
Organizer: Kees Tuinenbreijer
Participants: Jo Vandale, Cor Luijks, Hein Groeneveld, Edwin Montie, Addi de Vaan, Lun Chi
SHAMAN team: Hans-Ulrich Heidbrink, Dominic Heutelbeck, Jörg Brunsmann, Elena Maceviciute, Kathleen Menzies, Vittore Casarosa, Tom Wilson
The meeting started at 12:00 with a lunch and a round-the-table introduction. The presentations started with a short welcome by Kees, and Elena introduced the structure of the meeting and the role of the participants. She also asked for the group's consent to record the meeting and informed the participants about the confidentiality obligation of the SHAMAN team.
Hans-Ulrich presented the overall issues of digital preservation in the area of industrial engineering and design. He also explained design data acquisition and normalization and access issues. The screen shot of the Scenario 1 Project Life Management Integration (ADI layer normalization, storage and access) was shown in this part. No questions followed this part.
Dominic Heutelbeck presented the Open Conjurer work on collaborative decision making and metadata capturing in the Decision Room.
Question session
[Note: in the question sessions and in the focus group transcript, names in bold italic are those of the Philips participants, names in ordinary italic are SHAMAN participants.]
Question: How do you know what features you can look for?
Dominic: You would start with something that you want to research. You are not browsing the repository for fun. You are goal oriented. You will be guided in this search by your goal, by what you are looking for. You can combine search criteria from the PLM normalization. You know that there is this SRAM or CPU we used. We need all team members who were involved in the project who used this CPU, because you need an expert in that.
This is the general problem you have when you are collecting the data. We have not tackled it explicitly. In the enterprise scenario, where the whole work is governed by strict policies regarding intellectual property and who may do what, this would be part of the company policies: when you are working on this project you are, of course, checking what you are doing on this.
Question: A lot of things are decided at the desks and in the meeting rooms. That passes the systems that you are talking about.
Dominic: This also would be a part of a potential application in SmartVortex that we are thinking about. We are looking at massive data streams and analysing them in quasi real time. One of the things in these data streams are events that are occurring. If you look into real life meetings you would have video and audio streams that can be analysed. We have a partner in Sweden who owns technology to annotate dialogue and videos to identify key phrases for annotating decision points within a meeting. This is something you will be able to do. Of course, you are right that one can only capture what goes through the system. If we meet outside watching the ducklings and making a decision there is no way to capture that. There may be some mobile version but not now. In practice you only can lock what goes through collaborative applications. Compared to the current state of the art, where these systems have no homogeneous way of expressing themselves in terms of who is doing what, this is a very big step. Now we are able to get this at all. There are, of course, practical limitations.
Hans-Ulrich extends the answer with respect to the typical organizational and business processes of the organization. Searching at the moment is not a big problem even if there is a lot of data produced. The more important thing is to link different data into a meaningful whole and make connections among them.
A coffee break of 15 minutes is taken at this point.
Jörg Brunsmann presents the ISP2 scenario on domain ontologies.
Question session
Question: Such information is never and nowhere stored. Does this help Philips to store such information? There is no way of finding it if people are writing everything on paper and not keeping information somewhere. Some of the information, like decisions during the development of design, is only in the heads of people. And you do not have any means of finding it.
Jörg: This problem is actually outside the scope of the project. The demonstrated tools help to automatically extract data as long as it was stored and recorded. If you talk about the manual process of entering the data, it is out of the scope of this project.
Hans-Ulrich: You are asking how to get brain content and turn it into computer content. What we have presented is the situation that applies if the content was put down and then ontologies used in this content reflect that this content is changing. Then we can track these changes and make links between these changes. We have the methodology that tracks or assigns new track in language changes. (Introduces the conceptual dynamics work in SHAMAN.) Your question is one of the most important problems of human management. But tools will not help you there. We have to talk about the business model underneath, how to attract the people to fill it in. If it is only a burden and there is no incentive, no one will do it. There are only two ways to solve the problem: you must do it and it is a part of the annual personal review, so if you do not do it, you are not ranked top. Or you can do it by offering something attractive, credits for education and training or something that will interest the people. But this is a different discussion.
Hein: The question was not related only to your presentation. It was more on all the presentations so far. When I was talking to the students we were talking about what was technical product documentation and additional product documentation. And the additional product documentation is exactly what we are not registering somewhere. In these interviews I thought that was part of the SHAMAN project: getting that information somewhere registered and retrievable so that it could be used. So far in this presentation, we are looking at the system which has been created as long as it had data; data that is missing, that is not put in the system, is not covered by this. That is my concern.
Hans-Ulrich explains the issues of the organization and the main goals of the SHAMAN project. The sensitivity of the ideation process does not allow us to use much of this data for the public demonstration.
Addi: That is a real problem concerning real work. I can make a comment with an example that really happened. At a certain moment a group of people in the lab was working on 3D research, spending many hours and days. After some time we showed them, by searching the internet, that the solution they were working on was discovered already in the 1970s. It was not new at all. The only difference is that in the 1970s it was not called 3D but stereoscopic display. If the people do not know that the stereoscopic display of that time is 3D, they do not find the documents. These kinds of tools can then help you by finding the correct wording that was used at that time.
Hans-Ulrich: This will still have to be recorded somewhere.
Dominic: These kinds of documents do not magically appear. There should be the investment and strategic will to make it happen. SHAMAN can help you in any case as we have looked into the strategic points when the decisions happen. First, we have shown how to treat the existing data. Step 2 is how to capture the data that is around the documentation and is important but still not recorded. So, we show how to capture this soft information. In step 3 we have shown tools that help to trace back the documented events through data. But without the culture of writing everything down in the company this will never work well.
Hans-Ulrich provides examples of data mining, search, etc. and the importance of metadata in these processes.
Addi: If there are so many great technologies available then you can solve our problem. As an engineer I have a document and I want to put it into the computer. I do not care where it is stored and what metadata is with it, but I want to find it when I look for it. If you could provide us with that instead of plenty of menus where I have to fill in plenty of metadata fields. If you can tell us: you have finished the document, click on “save” and that's it. Simplicity is the key.
Hans-Ulrich: This basically depends on the requirements that are preset before the start of the project. It would be necessary to set those by sitting together with Philips people.
Kathleen: You do not need the preservation, just searches.
Addi: It would be good if the successful searches are also preserved. Then it saves time for others performing the same search.
Cor: Open Conjurer can also be useful for meetings with chat and an open official minutes area. People chat, and key master meetings can be preserved without anyone recording the whole session. Then you can play the whole meeting, but you need the interface for that.
Dominic: Collaborative software can be customized from the building blocks. You do not start from scratch.
Focus group discussion Elena: I have a list of questions to you. Now you should answer our questions. First, we can talk round the table. What are your overall impressions of the presentation? How you see this in the context of your work, in the first place, and then in the context of your company. Hein: I am not working anymore as I used to work in a project. Now I am working in an IT environment and I own a component information database, managing that. I still am involved in technical support standards and procedures. For me it is not so much searching and using a system as SHAMAN or using old files. For me these questions are out of scope. Tom: But, for the company. Hein: For the company? Maybe when you are in predevelopment or research it can be useful. Not so much in product development anymore, I think, that searching for this kind of things should be done in predevelopment and research environment. That is my feeling of what I have seen. Jo: I have two environments. First of all benchmarking. There, currently, we archive or publish or whatever you want to call it, collect all the information on the central web page which is manually organized. There good archiving and searching will be useful definitely as a back end. on what is already existing. That‟s one. There I try to organize as structured as possible to make it as easy as possible for people to find things later on. On the other hand, going further than two or three years back almost never happens, because then it is already, at least in our business, old-fashioned. Chance that you have to search on things older than three years is very low. Would be different for Health because that where they have patents and that stuff lasts longer but pure observation on what competition is doing… would be irrelevant in three years. If I looked at the second part, the management of the development documentation and stuff. There we have a front end which is working quite well. 
We used several tools in the past, going from central Excel and a local drive (I mean a local server drive, not housed on the local PC) to APD. I mean APD was in between; that is a typical example of a system in which you could archive the documents and search. But it was way too complex and too slow. So, we had a perfect basket but nobody was using it. So, we trashed it after two years and we started with a much simpler tool. There we have only a front end, and the front end is well used; there it stops at the archiving. The documents are stored without further archiving, without a further search engine, so you have to go back to the whole structure to find out what there is. These are two cases where, if you have a very simple back end that can provide this functionality at a reasonable price, then that would be a good option.
Elena: Do you see this as a potential solution, because this is not a market system yet?
Jo: Yes, like I said, with the functionality that we discussed there.
Cor: The demonstrations went rather quickly and they were pre-programmed. I was not able to play with them myself. My first impression is that I have not seen many things that I have not seen before in other archiving systems. Of course, the names are a little bit different, but the number of clicks you have to do before you reach the point where you want to be is not very different. What I miss in the whole story is what has to be done behind the scenes to make this system work. Normally, there are database administrators and some kind of roles. From the presentation that you gave it was not clear to me what that means for an organization. How many people are occupied to ensure that the database is robust, the tool is robust and so
on. That I missed a little bit. Then simplicity is the ultimate goal; that’s what we are looking for. And not so much the structure behind it, the rules behind it, just use it.
Tom: You said that you have seen similar things in other archiving systems. You and your colleague used the term archiving, and we use the term preservation. The distinction here is that those files will not be lost, while they could be lost in an archiving system because there will be nothing that can read them in the future. The key question for us is whether that aspect of preservation is something that is key or useful.
Someone: True.
Cor and Hein, intermingled: That depends on the area you are working in. Like Jo said, for us three years back is not relevant. For other people it might be very important. For us archiving already solves our problem, because for at least a couple of years we can get the things back. We are not looking 25 years back but for a shorter time. But you are right, we are intermixing terms here, but the difference is there. It depends on how long long-term is.
Jo: There is also a recent example here. We had to rebuild old software which was five years old. Even for that you had to get the machines standing in the corner of the room and in the end get it running. Because the people initially operating these units were not available anymore, you have to search and get it running again. But your system could not solve that; perhaps it could solve the documentation part and make sure that you can read a .pdf or C file or whatever in a specific format. But getting everything together is such a huge job, and for the moments and single instances you need it – it’s too much money spent.
Addi: I do not recognize that problem. I think that we had this discussion before. All the documents from the past that I want to read I can read. I do not see the problem at all.
Jo: I can at least read all the documents that I have created over 30 years.
Addi: When I look on the internet, the pdf is perfect.
And even pdf files nowadays are scanned and key words are generated from the pdfs and can be found by the search engines. I don’t see a problem in that preserving in a special file format to do that, pdf is excellent. Most people are using pdf for that reason.
Elena: That is for textual documents.
Jo: For text and graphics. In the past what we had was LaTeX – a typical example for engineers programming text – and WordPerfect, which came with a DOS-oriented semi-graphical environment. And then we had pdf, and that worked, and Excel. Before Excel we had something else, but all this is still readable today. Even documents from the old [Commodore???] 64 are still readable today. It is amazing how people keep these things alive in one way or another.
Kathleen: What about beyond just the basic document itself? We tried to talk about the preservation of the links between the documents and the decisions that led up to the contextual aspects. Even if you can open the pdf and read it perfectly, all the associations may not be preserved.
Addi: Yes, the language problem, that I can recognize. It is correct. But they will be redundant. For the moment you are right. The first time someone writes a document, he starts to generate his own vocabulary. At that moment it is very difficult to know if the document will be really important or not. You cannot tell this at that moment. After ten years, if someone is interested in the document, to find it back it is very difficult to know what the meaning of these words
was before at this moment in time. And if I talk with a lawyer, it is a profession to do this in such a way that in ten years you can prove what the real meaning of the words in that document was. It is a five-year study after you have graduated from university.
Hans-Ulrich: If you have documents that have some directly accessible contents, pdf is only a text document and you can extract a picture or select the text, but if you have a document where you really need to see specific objects then you will have a problem: to flatten them on the pdf level (a preserved print-out), or to do it in a way that you can directly access information and treat information for further use. Mining knowledge out of the pdf document, is it possible? That is the essential question. If you have documents that belong to a set with references and mappings and somehow connected objects then you are lost with pdf. And this is the technical world.
Dominic: We have to clarify what a document means. When we talk about documents in the SHAMAN engineering scenario we do not mean text documents. We are talking about the entire population of product data that is involved in the engineering process. So, if you take that into account, will your statements that you don’t have these problems because you have pdf still hold true?
Addi: Yeah. We have product documentation in development projects, and I don’t see the problem over there. We have a lot of solutions to do this.
Dominic: So, actual design and simulation data?
Addi: Yes – I do not recognize that problem at all.
Edwin: In my daily practice, the problem I see more is the availability of a simple archiving tool and even the discipline of engineers to use it. To me your project is the third step and not my immediate concern, so to me your project is a bit academic.
Elena: Can you say what has impressed you most, Chun?
Chun: The most interesting part is about the collaborative decision-making process and social analysis. It was very interesting.
Elena: Anything that was innovative in the presentation that has potential for solving problems?
Addi: Some elements.
Elena: Which ones?
Addi: I see a number of elements that would be nice to have in a total solution. When you have certain solutions available, search engines and things like that in the company, there are certain elements it would be nice to have. Like suggestion tools helping to find the right words, key word generation, helping to archive – these are good elements. But we need to have packages of solutions in the company that can use these sub-solutions.
Cor: Like I said, the Open Conjurer. That can be extended to a full-fledged meeting tool. You really can use it: you go in with impulse, with documents and actions to discuss, and you come out with conclusions, new action points and the minutes of the meeting. And you have participants in the room and also outside the room – I mean on the other side of the world. And also everyone has the ability to connect to write. A big issue in meetings is that some fool is making minutes and can’t follow the meeting, because he is too busy capturing what others have said. It would be better if all can write themselves their statements
or action points; you have a better quality. You can look at the documents and share them, together with all the rest – that would be good too. It is also better socially connected. You can directly connect all the people who are in the project; you do not need to make a mailing list. From within the product you can bring people together and information together and, in the end, archive everything together.
Elena: Something else that struck you, an innovation, something useful?
Edwin: I thought the collaboration part. Maybe not practical yet, I am not an expert. It’s certainly relevant, but it is not practical. But it is a development project.
Elena: Addi, you said that you do not recognize the problem because you already have everything. Do you think that there is any possibility to improve the technology that exists with what you have seen today?
Addi: No doubt that you can improve it. I am sure that it can be improved. I am convinced that we will see developments in this area in two years. To help find back information, to make it usable – that’s key.
Elena: From what you have seen today, what is potentially useful?
Addi: You try to combine solutions there. That’s what you are doing. It is a new trend with open innovation. You have documents and you can link them to groups and people to groups.
Jo: The first demonstration of the archiving and storing of documents. With a normal interface and a machine-to-machine interface it would also be interesting. We have existing tools and we will not replace them. In my case I don’t have a back end which is doing proper archiving of the documents. Having a system with a machine-to-machine interface, we could couple them, of course. A question in relation to that: Where are the documents stored? Is it local storage? Is it cloud storage?
Hans-Ulrich: iRODS is an environment that is grid-based and you can define any storage you want underneath.
iRODS ensures the refresh and redundancy, new version appearance and format changes, by definition as a preservation system. I think this is important in engineering repositories, because you need to trace when there is a system change and what kind of refresh or update procedures I need to do to be able to read later. You can update at minimum when files (pdf or engineering documents) come from different vendors.
Dominic: What you call a back end that you do not have, we have and we can build upon. There are different back ends we can bind to. iRODS is our primary example, but usage of cloud storage is also possible because we have different extraction layers.
Hans-Ulrich: If you look at the OAIS model, from ingest to access we have them all. SHAMAN is the first project that covers everything from pre-ingest to access.
Tom: I get the feeling that what you would prefer is for all of this stuff to be out of your sight. For you to use the systems that you are currently using but with the capacity to integrate with these?
Jo: Yes and no. I am talking now about advanced development and I am not going to demonstrate it here. We have a tool which is adopted by engineers to do the tracking of their work on a monthly basis. And in that tool they can also store the documents which are related to their work. So they do that, but this is a loose end. The documents are there, we do not trash them. This is not the issue. We keep them but you cannot search in these documents, you cannot get them back. If you would like to find this information you have to know the project
name or you have to go to the project cluster and find the name and then try to guess when the job was done. There it stops. Simple storage of documents. If you have preservation after that through a machine-to-machine interface, so that instead of storing it there in a simple directory you feed it through to a preservation system, then you have both. For the engineers you will have a perfectly acceptable system as of today that they like to use, so you do not change that, but you add functionality to it to take care of the preservation. Then it can have another interface for searching, for finding, for enriching the information. But in the first instance, you have a machine-to-machine interface. So for the current engineers working day to day nothing changes. So, you add functionality without changing the front end.
Hans-Ulrich: I fully agree with what you say. And there is some criticism that we take into account. We have not put a lot of emphasis on the user interfaces. We only tried to show that the metadata are created and can be used for search. To do this in a real application you need to put more emphasis on a nice-to-use interface.
Jo: We also learned that the user interface is sometimes more important than the engine behind it. People use their eyes to buy the tools. We see simply, for instance… we have the example of the APD; it was perhaps good, but we could not bring the engineers as far as to use it. Then you have created a beautiful monster and nobody is using it. That’s also not good. You put a lot of effort into it, a lot of back ends and processing power behind it, but they are not using it. It was just an empty framework of titles and projects without content.
Hans-Ulrich: That is a typical difference between the European and American approach.
European research puts almost all of its first emphasis on engines and core functionality, but American companies first consider the user interface, then they make two or three releases and during the releases so much pressure comes from the customers that they improve the engine itself.
Elena: That is one of the applications where the system can be of immediate help in fulfilling the company function. Are there any other such applications that you already use where you feel that one of the demonstrated elements would be a wonderful addition? … Nothing for the moment. There is not much potential in these features to use with your present systems?
Addi: A logging of decision making. It can be chosen. It is equally important.
Tom: Was there something missing that you thought we might be presenting? Was there something you were looking forward to and didn’t hear?
Elena: Or you knew everything about SHAMAN before coming?
Edwin: Neither. Of course, I didn’t know everything about SHAMAN. Through Kees I knew a bit about the project. As I said, there is a big gap between my daily needs and your development. But I did not expect you to solve my daily needs. So, the meeting was what I expected in that sense.
Kathleen: So what was your understanding of the term preservation? I feel that you think it is not relevant to what you are doing; perhaps you came into touch with a new concept?
Edwin: I think I understand the concept as you explained it. It is more than just archiving – maintaining the accessibility over a longer period. But that is three steps away from my problem of getting engineers to store it in the first place. That is rather a big gap between my daily experience and the topic of your project. Which is fine.
Kathleen: You agree that they should be doing that. In an ideal world there will be archiving and preservation?
Edwin: Yes, sure, but I am a practical person. I worry about steps 1 and 2, but not about 3 and 4.
Addi: That is one of the points that people are making – there are lots of tools to do that, software tools to help companies in their project management, to document things, to store the documents, to have release procedures for whether it is ready or not, who is allowed to approve a certain document. All these things exist. But the real problem is to get your team to do all that stuff. That consumes a lot of time and effort and it is difficult for people to do it. It simply does not happen.
Edwin: Because we tend to stress a lot the things you have not solved. But you never planned to solve them.
Addi: If you can give us a solution that people can just dump it was and it’s done. That is OK. That is ease of use. That is something that we really want to have.
Jo: Also, because if I really look back at how many times people really go back three or four years. It almost never happens. Because we are in an industry that moves so fast. They look into it, but always within scope…. It is so short – one or two years, and from then on you should not look back.
Addi: There is one problem that I have hit many times. Many times I need to have documents or things that were done by people 10-15 years ago. So, I know the problem. The problem is that the hardware people were working on is almost destroyed or it is not working any more or it was even erased. The information is not there. So, you have to go back and trace the people who were working on it and hope that they have paper copies or have it on a hard disk. It is a lot of work to find back the information. But a lot of information can be recovered, because it is still there. It is just so time consuming. That some people are spending a lot of time to recover that information is not balancing all the work that you need to do to tell your people to start storing everything, because that is much more expensive.
If you look at our company, we do it this way because it is only now and then and we do not know beforehand how much information we really need to find back.
Hein: I think it is more useful if you go to Health and Wellness [section of the division].
Jo: I will not say that it is not useful. Definitely not. What Addi is saying is also correct. Twenty years ago I was working on teletext and even today I get questions on teletext. Then I answer them off the top of my head or refer them to another guy who is still doing it today. If there is an archive system you will simply type in “teletext” and get a bunch of papers that you cannot read during the coming four months.
Hein: I did not mean that the tool is useful or not, I just meant that the business where it will be of more use is health care.
Hans-Ulrich: Even from the legal obligations point of view.
Jo: We have legal obligations but for that we do need much archiving or preservation.
Hein: In health care you have a legal obligation to document all the decisions.
Cor: For our people it is the same problem as many years ago - backing up. At that time everybody said “Guys, you have to make a backup, because the hard disk may crash” and everybody said “Ah – it never happens.” Until it happened and you learned. Nowadays, no one talks of it because it is done automatically. It’s done but you don’t see it, nobody knows it. OK, your PC gets a little slower sometimes. Something like that, I think, the archiving and preservation should do for people.
Addi: The first example in that area is Apple’s Time Machine. That thing I do a lot. It’s always backed up to the last file. You can go back in history and can just get out a file from twenty days ago, two years ago, four years ago. But you don’t have to do anything. It is just connected. You don’t have to press any button, nothing.
Hans-Ulrich: I am also a Time Machine user and my experience with it is not good.
Addi: Nothing is perfect, it stays a machine. My experience so far is only positive. When you look from the user level, the user is not aware. I just bought it, connected it as they said and I never realized it again. Until the moment your Apple crashes and you learn that you can restore it in twenty minutes from five hours up front. And you were never aware of any backup. Maybe it’s too simple, but OK.
Hans-Ulrich: If you look at the preservation area, Apple is providing backup in a simple way. You also can control the time. If this crashes, Apple is gone. In preservation the problem is that you have a combination of hardware, software, format tracking and the ability to put some elements in formats that are reproducible for future generations. I am not sure if people who are looking in the Metalogic file are able to read the first 1992-93 elements out from some backup, which may still exist. The systems are no longer there, the people who knew the languages and, especially, the use paradigm are no longer there – retired, left the company. That is the one part. The second part is, the more we have digital-born documents, the more research capability and identification of that what I like to have get in trouble. We had a study in SHAMAN activities, showing the increasing demand for storage. Storage today is not a big problem, go to the shop, buy another ten terabytes (Edwin: store your pictures on it and drop it to the ground). There is more produced.
But if you really like to use something in a dedicated environment like in engineering or whatever task you have, you need to differentiate between valuable and non-valuable information. In all the discussions which we had, the main problem is that nobody says that you can purge and delete it. Everyone says: store it, so that you are on the safe side, nothing is lost, you have everything. But it does not count, as nothing helps the community to retrieve and to differentiate between valuable and non-valuable information. That is also the area that you have to think about in the discussions about archiving, preservation, metadata generation and retrieval processes. It does not help you to have everything on thousands of discs if you don’t know how to access it and identify it and how to reduce the volume. Even if we have meta-search engines, grid computing, cloud computing, multiple CPUs involved that can do the task, as long as you cannot specify the body of searching, as long as you do not know how the metadata generation itself was working, it does not help you. This is the reason most of the people say: I don’t search, I just phone the guy who can know it.
Tom: I am struck by the fact that two things seem to emerge as problem areas, even if preservation itself is not perceived as a problem area. One is the language issue that you mentioned and the other is the people issue – knowing who was involved. You are contacted about teletext because people in the company know you have worked with teletext. But people die. So, you are left with the language problem. When the people are no longer around to talk to, to interpret the documents that you’ve got, you are thrust back to dealing with the language problem. Does it suggest that it is the crucial issue in all of this?
Addi: Yes, but only sometimes. Because there is another thing if you go very far back into the past to all the material that is there. It is only minor if you compare it with all the new data that is generated today.
If you look at the amount of value in the history, it is very limited in relation to the value that is generated today. In that sense it is also very doubtful for me how interesting digging out all the old stuff really is. OK, we can generate some value and say wow, if we
did it better we could convince the engineers that they were working on the wrong topic. But if you ask the engineers to dig into the past, look and learn the meaning of the words and see what you can do with it, I would prefer the people to say – look what the competition is doing today. That is much more value.
Elena: As far as I understood, one thing that might be an incentive to get a system based on SHAMAN ideas would be if it is a back system, invisible to users, that does all the job of preservation…
Jo: Not fully invisible. Initially, as I have said, to get documents in I would make it invisible or as invisible as possible. But for search it has to be here. Otherwise you put everything in a basket and cannot find it. It does not make any sense. But having a tool that we use today that does not have preservation at the end and then making a new tool where you have to do the work again, taking over the titles and descriptions and key words to pull in another tool, people will not do it. We’d rather use the existing front end and use there the key words and description and pull that together with the documents through a machine-to-machine interface to SHAMAN, and the SHAMAN front end can pop up with an extra window to verify if what he is trying to do is correct; that is another thing. It can happen in between, because if the engineer at that moment does not want to make an extra effort he can push the submit button and then it goes with the information it has. Otherwise, you can enrich it.
Tom: So, automatic thesaurus generation, automatic metadata creation, once the file is in the system. Because then that even removes existing work from the engineer. [Sounds of agreement]
Dominic: I can clarify. You mentioned twice, and I got the impression that you think that what we have shown you in the first presentation is the SHAMAN front end.
Jo: No, no. As you said yourself it is far from finished.
I only say that if you make it a really decoupled tool…
Dominic: That is exactly what we are doing. Because we used this PLM system only as a representative of any PLM system. So, the approach with this ADI-layer that is pulling data from a PLM system or a population of PLM systems is generic. It is a generic bridge from any PLM system to an archival system. So, if you have any high-end PLM systems deployed it will be only a matter of writing an adapter to get the data from this system to the archive as well. So, what you said would be nice, to take the existing engineering front-end system and add this back-end tool, is exactly what we have presented in the first presentation. Maybe that was not clear enough.
Hans-Ulrich: It would be good to take the existing PHILIPS production system, but we could not do it.
Elena: What other incentives would be there? The cost was mentioned, as the affordable right cost. What other factors in adopting such a system could there be?
Edwin: The cost is not only the licence fee, but also, if I have to employ somebody full-time to maintain the system, the operating costs. But it also falls on the cost.
Hans-Ulrich: How would you calculate the return on investment versus this operating cost?
Edwin: You can’t. Because the return is in the future and nobody believes it ever happens, so nobody sees the return on the investment. It’s the problem you described in your first presentation. It’s naïve, but it’s practice.
Hans-Ulrich: If you say it reduces search situations because he needs sometimes access to all the documents. Here the telephone rings to know something about telefax technology…
Edwin: You have to convince some budget-owning manager to spend money to save money in some ten years. That is not a very successful scheme.
Addi: I would like to add another thing. Someone is calling him to ask about what he did in the past. And he gets more answers than he has questions. All the extra information he gets is adding lots of value. You don’t find it in the search engine by just typing in questions. You do not even know the question, so you don’t get the answers. If you call someone who worked in the field before, you get a lot of new information and the right questions.
Dominic: That’s when the capturing and preservation of the social context comes into play, because if you have an archive where you have some, e.g., teletext documentation and the social context has been captured, you can easily find out who worked on the project and look if he is still on the roster. These are not contradictory approaches; they complement each other.
Addi: But you need to do more. But it also means for me to ask how you can do it. If you take this kind of solution, it needs to be integrated with all the other tools in the company, so that they are able to work together. If you don’t do that…
Vittore: But if you take the cost, could you look at it, instead of as return on investment, as insurance? You do not expect to have a car accident, but you pay insurance. Can it be the same for Philips or any industry?
Addi: That is a difficult question. If you ask me: five years from now, when you buy all these solutions, will they still be available? All these solutions are available in other software packages here and there. And if you take software and the company, these things you get just like that.
And if you go to Microsoft, they are also developing this kind of solution and they also know what the people and the companies would like to have and they are also thinking in this direction. Look at the specification of SharePoint 2010. A lot of these things you will find there.
Elena thanks the participants for giving their valuable time and asks them to fill in the questionnaire.
ANNEX 5: Conferences attended and papers submitted
1 Conferences attended
12th International Conference on Asia-Pacific Digital Libraries, ICADL 2010
14th International Conference on Knowledge-Based and Intelligent Information & Engineering System 2010
AIIM Roadshow
2nd International Conference on Knowledge Engineering and Ontology Development 2010
48th Annual Meeting of the Association for Computational Linguistics
4th Iberian Grid Infrastructure Conference - Ibergrid 2010
7th International Conference on Preservation of Digital Objects (iPRES 2010)
8th European Conference on Digital Archiving (ECA)
Association of Art Historians (AAH) Conferences 2010
Balisage: The Markup Conference 2010
Blender Conference 2010
CM Symposium on Applied Computing 2010
COLLIN 2010 - 1st Symposium on Collective Intelligence: FernUniversitaet in Hagen
Dagstuhl Seminar: “Automation in Digital Preservation” 2010
DLID-ICADL-10, Digital Libraries for International Development in conjunction with International Conference on Asian Digital Libraries
DocEng2010: ACM Symposium on Document Engineering
Documents Document Recognition and Retrieval 2010
ICPR (International Conference on Pattern Recognition) 2010
ICT 2010
IFLA 2010: Gothenburg, Sweden.
Digital resources of Lithuanian language heritage today: achievements, problems, perspectives, 2010, Institute of Lithuanian literature and folklore, Vilnius.
Communication and information management models and expressions, Vilnius University, Vilnius.
International Conference on Data Storage and Data Engineering - DSDE 2010
International Conference on Digital Libraries 2010: Shaping the information paradigm (ICDL 2010)
International Conference on Machine Learning
International Symposium on Collaboration in Conserving time-based art, Smithsonian Museums
International Symposium The Digital Oblivion
IS&T Archiving 2010 Preservation Strategies and Imaging Technologies for Cultural Heritage Institutions and Memory Organisations Conference
Open Repositories 2010 (OR 2010) Madrid
Optics, Photonics and Digital Technologies for Multimedia Applications Conference
Research and Advanced Technology for Digital Libraries, 14th European Conference
2 Conference papers published in journals
Barateiro, J., Antunes, G., Freitas, F., Borbinha, J. 2010. Designing Digital Preservation Solutions: A Risk Management-Based Approach. In International Journal of Digital Curation, Vol. 5, No. 1. ISSN: 17468256
Darányi, S., Wittek, P., Dobreva, M.: Using wavelet analysis for text categorization in digital libraries: a first experiment with Strathprints. To Appear in International Journal on Digital Libraries.
Maceviciute E. & Wilson T. (2010). "Information behaviour research and information systems development: the SHAMAN project, an example of collaboration." Information Research, 15(4) paper 445. [Available at http://InformationR.net/ir/15-4/paper445.html]
Menzies, K., Dunsire, G. and Birrell, D. An Investigation of Information Systems Interoperability in UK Universities: Findings and Recommendations. Submitted to the International Journal on Digital Libraries, Special Issue on ECDL 2010, December 2010.
Thierry Jacquin, Hervé Déjean, Jean-Pierre Chanod Towards Document Process Preservation: Xerox Launches Document Process Modelling Technology Xeproc©, Special Theme on Digital Preservation, ERCIM News 80, Jan. 2010
Wittek, P., Tan, C.L.: Compactly Supported Basis Functions as Support Vector Kernels for Classification. To Appear in IEEE Transactions on Pattern Analysis and Machine Intelligence.
Wolfgang Wilkes. W, Brunsmann. J, Heutelbeck D, Hemmje M, Hundsdörfer A, Heidbrink. H.; Towards Support for Long-Term Digital Preservation in Product Life Cycle Management. IPRESS 2009. International Journal of Digital Curation in Volume 6, Issue 1
3 Conference paper published in an online newsletter
Borbinha, J. 2010. SHAMAN - Sustaining Heritage Access through Multivalent Archiving. ERCIM News 2010(80) - http://ercim-news.ercim.eu/en80/special/shaman
4 Conference papers published in proceedings
Aitken, B., Innocenti, P., Ross, S. & Konstantelos, L. User requirements for a next generation digital preservation framework: analysis and implementation. In IS&T Archiving 2010 Preservation Strategies and Imaging Technologies for Cultural Heritage Institutions and Memory Organisations (pp. 48-52) Springfield, VA: Society for Imaging Science and Technology.
Antunes, G., Barateiro, J. & Borbinha, J. A reference architecture for digital preservation. In A. Rauber, M. Kaiser, R. Guenther and P. Constantopoulos, (Eds.). 7th International Conference on Preservation of Digital Objects (iPRES 2010), September 19 - 24, 2010. (pp. ). Vienna: Austrian Computer Society.
Antunes, G., Barateiro, J., Borbinha, J., Silvestre, D. & Freitas, F. 2010. Digital preservation based on federations of data grids. In Proceedings of the 4th Iberian Grid Infrastructure Conference - Ibergrid 2010. (pp. ) Braga, Portugal: Netbiblo
Barateiro, J., Antunes, G. & Borbinha, J. Aligning OAIS with enterprise architecture. In 8th European Conference on Digital Archiving (April 28 - 30, 2010) Geneva, Switzerland.
Borbinha, J. 2010. Automation in Digital Preservation. In Dagstuhl Seminar 10291 Proceedings, ISSN: 1862-4405
Brunsmann, J. (2010). Archiving Pushed Inferences from Sensor Data Streams. 1st International Workshop on Semantic Sensor Web.
Brunsmann, J. (2010). The Design of a Social Semantic Search Engine: Preserving archived collaborative engineering knowledge with ontology matching. 2nd International Conference on Knowledge Engineering and Ontology Development.
Chanod, J.-P., Dobreva, M., Rauber, A., Ross, S., Casarosa, V. (Eds.), Proceedings of the Dagstuhl Seminar 10291 “Automation in Digital Preservation”, LZI, ISSN 1862-4405, 2010. Available: http://drops.dagstuhl.de/portals/index.php?semnr=10291
Chanod, J.-P., Dobreva, M., Rauber, A., Ross, S., Casarosa, V. Issues in Digital Preservation: Towards a New Research Agenda. In: J-P. Chanod, M. Dobreva, A. Rauber, S. Ross (Eds.) Proceedings of the Dagstuhl Seminar 10291 “Automation in Digital Preserva
Darányi, S., Wittek, P., Dobreva, M. Toward a 5M model of digital libraries. In: Mostafa, J., Oard, D., Parikh, T., Rajput, N. Digital Libraries for International Development Workshop in conjunction with joint JCDL/ICADL 2010, Gold Coast, Australia, 2
Darányi, S., Wittek, P., Dobreva, M.: Position paper: Adding a 5M layer to the 5S model of digital libraries. Proceedings of DLID-ICADL-10, Digital Libraries for International Development in conjunction with International Conference on Asian Digital Libraries
Darányi, S., Wittek, P., Dobreva, M.: Toward a 5M Model of Digital Libraries. Proceedings of DLID-ICADL10, Digital Libraries for International Development in conjunction with International Conference on Asian Digital Libraries. June, 2010.
Déjean, H. & Meunier, J-L. Reflexions on the INEX Structure Extraction Competition, DAS (Document Analysis System), Boston, MA, USA, 9-11 June, 2010
Déjean, H. Numbered Sequence Detection in Documents, DRR San Jose, CA, USA, 20-22 January 2010
Innocenti, P. Evolution and preservation of digital art: case studies from ZKM and AEC, in Association of Art Historians (AAH) Conference 2010, 16 April 2010, University of Glasgow, Glasgow, United Kingdom
Innocenti, P. Theories, methods and testbeds for curation and preservation of digital art. In IS&T Archiving 2010 Preservation Strategies and Imaging Technologies for Cultural Heritage Institutions and Memory Organisations (pp. 1-4) Springfield, VA: Society for Imaging Science and Technology.
Innocenti, P., Konstantelos, L., Ross, S., Maceviciute, E., Wilson, T.D., Ludwig, J. and Pempe, W. (2010). Assessing digital preservation infrastructures: implementing a framework for library, engineering and eScience organizations. In Archiving 2010, Den Haag, the Netherlands, May 2010 (pp. 18-23). Springfield, VA: Society for Imaging Science and Technology.
Lecerf, L. and Chidlovskii, B. Scalable indexing for layout based document retrieval and ranking, Proceedings ACM Symposium on Applied Computing, pp. 28-32, March 2010.
Maceviciute, E. and Wilson, T.D. (2010). Evaluating the SHAMAN digital preservation framework for memory institutions. In International Conference on Digital Libraries 2010: Shaping the information paradigm (ICDL 2010), 23-26 February 2010, New Delhi (Vol
Menzies, K., Birrell, D. and Dunsire, G. New Evidence on the Interoperability of Information Systems within UK Universities. In: M. Lalmas, J. M. Jose, A. Rauber, F. Sebastiani, I. Frommholz (Eds.): Research and Advanced Technology for Digital Libraries, 14th
Meunier, J-L. Automated Quality Assurance for Document Logical Analysis, ICPR (International Conference on Pattern Recognition), Istanbul, Turkey, 23rd August 2010
Schott, M., Kraetzer, C., Specht, N., Dittmann, J. and Vielhauer, C. Ensuring Integrity and Authenticity for Images in Digital Long-Term Preservation. Proc. of Optics, Photonics and Digital Technologies for Multimedia Applications, Conference 7
Vion-Dury, J-Y. Stand-alone encoding of document history (or one step beyond XML diff). In Balisage: The Markup Conference 2010, Montréal, Canada, August 3 - 6, 2010.
Vion-Dury, J-Y. Diffing, patching and merging XML documents: toward a generic calculus of editing deltas. DocEng2010: ACM Symposium on Document Engineering, Manchester, United Kingdom, Sept. 20-24, 2010
Wittek, P., Darányi, S., Dobreva, M.: Matching evolving Hilbert spaces and language for semantic access to digital libraries. Proceedings of ICADL-10, International Conference on Asian Digital Libraries. June, 2010.
Wittek, P., Darányi, S., Dobreva, M.: Matching evolving Hilbert spaces and language for semantic access to digital libraries. In: Chowdhury, G., Khoo, C., Hunter, J. (Eds.). The Role of Digital Libraries in a Time of Global Change. Proc. of the 12th International Conference on Asia-Pacific Digital Libraries, ICADL 2010, Gold Coast, Australia, June 21-25, 2010, (pp. 262-263) Berlin: Springer-Verlag. (LNCS 6102).
5 Conference paper published as a book chapter.
Brocks, H., Kranstedt, A., Jaschke, G. & Hemmje, M. (2010). Modelling context for digital preservation. In Edward Szczerbicki and Ngoc Thanh Nguyen, (Eds.). Smart information and knowledge management: advances, challenges, and critical issues. (pp. 226-260). Berlin: Springer-Verlag.
6 Submissions accepted but not published.
Aitken, B., Innocenti, P. and Ross, S. (2010). Gathering the requirements for a next generation digital preservation framework. Presented at: IS&T Archiving 2010 Preservation Strategies and Imaging Technologies for Cultural Heritage Institutions and Memory Organisations Conference
Barateiro, J., G. Antunes, et al. (2010). Aligning OAIS with enterprise architecture. Presented at: 8th European Conference on Digital Archiving. Geneva, Switzerland.
Déjean, H. (2010). Numbered sequence detection in documents. Presented at: "Document Recognition and Retrieval XVII" conference.
Innocenti, P., Aitken, B., Ross, S. (2010). Gathering the requirements for a next generation digital preservation framework. Presented at: 8th European Conference on Digital Archiving. Geneva, Switzerland.
7 Submissions not accepted.
Brunsmann, J. and W. Wilkes. State-of-the-art of long-term archiving in product lifecycle management.
Darányi, S., Wittek, P., Dobreva, M.: A novel use of wavelet analysis for text categorization in digital preservation workflows. Submitted to the European Research Consortium for Informatics and Mathematics News, 2010.
Darányi, S., Wittek, P.: Toward a phase space model of evolving semantic content for digital libraries. Submitted to European Conference on Digital Libraries, 2010.
Innocenti, P., S. Ross, et al. (2010). Assessing Digital Preservation Frameworks for Memory Institutions, Industrial Design & Engineering and e-Science Domains: the Approach of the SHAMAN Project. IEEE eScience Security Workshop 2010. Not accepted.
Wittek, P., Darányi, S.: Semantic flows: visualizing evolving content in document collections. Submitted to 14th International Conference on Knowledge-Based and Intelligent Information & Engineering System, 2010.
Wittek, P., Tan, C.L.: A Non-Parametric Ordering of Data Points to Identify Clustering Structure. Submitted to International Conference on Machine Learning, 2010.
Wittek, P.: The Impact of Stemming and Stop Words on Sentence Similarity in Vector Spaces. Submitted to 48th Annual Meeting of the Association for Computational Linguistics, 201
Zabos, A. Demonstrating the next generation preservation framework. Archiving 2010
Zabos, A. Informationsaustausch der Zukunft. ECA 2010.