
Proceedings of the 4th Ph.D. Retreat of the HPI Research School on Service-oriented Systems Engineering
edited by Christoph Meinel, Hasso Plattner, Jürgen Döllner, Mathias Weske, Andreas Polze, Robert Hirschfeld, Felix Naumann, Holger Giese
Technische Berichte Nr. 31 des Hasso-Plattner-Instituts für Softwaresystemtechnik an der Universität Potsdam





Bibliographic information of the Deutsche Nationalbibliothek: the Deutsche Nationalbibliothek lists this publication in the Deutsche Nationalbibliografie; detailed bibliographic data are available online. Universitätsverlag Potsdam, Am Neuen Palais 10, Potsdam. The series Technische Berichte des Hasso-Plattner-Instituts für Softwaresystemtechnik an der Universität Potsdam is edited by the professors of the Hasso-Plattner-Institut für Softwaresystemtechnik an der Universität Potsdam. The manuscript is protected by copyright. Published online on the publication server of the Universität Potsdam, URN urn:nbn:de:kobv:517-opus. Also published in print by Universitätsverlag Potsdam.

Contents
1 Context-aware Reputation in SOA and Future Internet (Rehab Alnemr)
2 Abstraction of Process Specifications (Artem Polyvyanyy)
3 Information Integration in Services Computing (Mohammed AbuJarour)
4 Declarative and Event-based Context-oriented Programming (Malte Appeltauer)
5 Towards Service-Oriented, Standards-Based, Image-Based Provisioning, Interaction with and Styling of Geovirtual 3D Environments (Dieter Hildebrandt)
6 Reliable Digital Identities for SOA and the Web (Ivonne Thomas)
7 Introducing the Model Mapper Enactor Pattern (Hagen Overdick)
8 A Runtime Environment for Online Processing of Operating System Kernel Events (Michael Schöbel)
9 Computational Analysis of Virtual Team Collaboration in the Early Stages of Engineering Design (Matthias Uflacker)
10 Handling of Closed Networks in FMC-QE (Stephan Kluth)
11 Modelling Security in Service-oriented Architectures (Michael Menzel)
12 Automatic Extraction of Locking Protocols (Alexander Schmidt)
13 Service-Based, Interactive Portrayal of 3D Geovirtual Environments (Benjamin Hagedorn)
14 An Overview of the Current Approaches for Building and Executing Mashups (Emilian Pascalau)
Fall 2009 Workshop

15 Requirements Traceability in Service-oriented Computing (Michael Perscheid)
16 Models at Runtime for Monitoring and Adapting Software Systems (Thomas Vogel)
17 Services for Real-Time Computing (Uwe Hentschel)
18 On Programming Models for Multi-Core Computers (Frank Feinbube)
19 Towards a Service Landscape for a Real-Time Project Manager Dashboard (Thomas Kowark)
20 Towards Visualization of Complex, Service-Based Software Systems (Jonas Trümper)
21 Web Service Generation and Data Quality Web Services (Tobias Vogel)
22 Model-Based Extension of AUTOSAR for Architectural Online Reconfiguration (Basil Becker)

Context-aware Reputation in SOA and Future Internet
Rehab Alnemr
In this report I continue to elaborate on the different constructs of my work toward the final goal of designing and modeling context-aware reputation systems. Throughout the report, I illustrate the different goals and benefits of developing such systems.
1 Introduction
Reputation is used in our social communities as a tool for regulating society by means of circulating evaluations. It has also become an important tool in computer science, especially since the emergence of e-markets. In these markets, consumers get product information from auction sites as well as weblogs, thus mobilizing and combining the experience of many consumers. Reputation has become even more crucial with the increased deployment of Service-Oriented Architectures (SOA), the Internet of Services (IoS), and, recently, cloud computing. In the near future, reputation will be a factor in negotiating over computing resources in the cloud. It plays an even bigger role in mitigating risk when the price involved is high, and in fostering cooperation and increasing trust among the different entities in these systems. Transforming the Internet from a network of information into a network of knowledge and services requires a more complex and cognitive view of reputation. Existing reputation-based systems and models are not cognitive enough to reflect the real nature of reputation notions. Understanding reputation concepts and constructs is essential, especially when dealing with humans and artificial-intelligence agents. In my previous work [4] [5], I introduced a context-aware reputation framework that enables reputation portability between different virtual communities.
The basic constructs of this framework are: the Reputation Object (which holds the contexts to be evaluated along with the corresponding values and their calculation models), Trust Reputation Centers (reputation providers), and the RRTM (a categorization of reputation models used as a reference). Reputation in this framework addresses different contexts to be evaluated and is generic enough to be transferred among communities (platforms), which fits the SOA and cloud vision. Later, in [3], I introduced the use of Attention Allocation Points (AAPs), a technique used in economics, to address the problem of information abundance by pinpointing the information most important to the current transaction. These pieces of information are later used to build the reputation of the entities involved in the process. Thus, the process of rating is transformed into a process

of evaluation. The building blocks of this evaluation process ultimately construct a new concept: Reputation-as-a-Service (RaaS). The relation between reputation and quality is explored in the same reference, since understanding reputation requires a correct understanding of the notions of quality to enable the evaluation process. Understanding, and later measuring, Service Reputation requires distinguishing it from other concepts. This distinction helps in differentiating what is being measured, which is a critical factor in configuring Service Level Agreements (SLAs). This separation of concepts also helps in evaluating quality processes along each phase of the service life cycle and among the different roles of the service parties, e.g., the service provider. Following this line of work, I am working with the POSR [1] team on studying quality notions and processes to obtain the right information to build the RaaS and to assign Reputation Objects to services. In parallel, SLAs are being investigated, since they include performance-measurement information needed in the evaluation process. For allocating the right kind of information to be used in reputation evaluation, I am working with a team at Freie Universität Berlin, specialized in Complex Event Processing, to allocate attention points. This is done using their Rule Responder [8] project. In the following section I start by introducing new definitions for quality notions and service reputation, along with the proposed model-driven approach to service reputation [2]. Section 3 illustrates our work in analyzing SLAs and introduces a new hybrid approach for distributing trust in SLA choreography [6]. The work with the Freie Universität Berlin team [9] is introduced in Section 4. This is followed by the conclusion and next steps.
2 Model-driven approach to Service Reputation
This section elaborates on our study of quality notions, conducted to obtain the right kind of information for the reputation evaluation process. The investigation revealed several misconceptions and a diversity of terminology usage. Moreover, it showed that current approaches use limited sources to evaluate quality. Therefore, we try to provide concrete definitions of Quality of Service (QoS), Quality of Results (QoR), and Service Quality, and to differentiate between them. The quality attributes associated with these concepts vary from subjective to objective measures, depending on the context in which the service is used. The discussion in this section includes: the proposed new meta-model, the combined sources for a service's quality assessment, and the new definition of the Service Reputation Object.
2.1 Quality notions in SOA
The requirements specified by the service consumer in the process of service selection include both functional and non-functional features of the service. A critical requirement for service selection is quality, but assessing the quality of a certain service is not straightforward. Gathering sufficient information for such an assessment involves determining what to ask for and from which source to get it. The service description provides information about the functional features of a service, but this is not enough to select one.

We propose an extensible meta-model that specifies the definitions, relations, and differences between quality notions. In our approach, the quality attributes of a specific service are derived from the proposed meta-model by means of model-to-model transformation. A service call corresponds to a model-to-text transformation, where the values of the service quality attributes are specified. This is accomplished by using a combined approach to collect the required information from different sources: users' ratings, the service's past interactions, user preferences, and invocation analysis. Using these sources, a Service Reputation Object is derived from the model for each service. The object holds values of different quality attributes and represents a context-oriented reputation for a service. In the literature, reputation is considered a quality attribute. In our work, we view Service Reputation as an upper concept that encapsulates quality notions, not the other way around.
2.1.1 New Definitions
The terminologies associated with quality and reputation concepts lack unified definitions in the literature; e.g., QoS is sometimes confused with QoR, and there is no concrete definition for the notion of Service Quality (SQ) yet. Our definitions are:
Quality of Results describes the extent to which the outputs of a service meet or exceed the service consumer's expectations (degree of satisfaction), in terms of results soundness, completeness, and correctness.
Service Quality describes the extent to which the service has been invoked in a convenient manner (from the service consumer's point of view).
Quality of Service describes the general features that are used to evaluate service efficiency in its context.
Reputation describes the notion of profiling an entity to evaluate the expectation of its performance in several contexts.
2.1.2 Quality Attributes
The above definitions classify the attributes into three categories: general attributes (applicable to any service), domain-based attributes (applicable to a specific domain only), and service-specific attributes (applicable to a single service only). Some approaches categorize quality attributes as a strict separation of general attributes, such as price and delivery time, and domain-based attributes (also called business-related attributes), such as how many days are left before the flight on a travel-service website. Others categorize them as user-centric (where quality attributes stem from user needs and preferences only) and non-user-centric attributes. In our approach we describe the nature of the attributes as dynamic (general or domain-based according to the context), subjective (affected by several evaluation sources such as user preferences, history, and invocation analysis), and service-specific (applicable to a single service only). In the quantification phase of a service's quality profiling, each attribute should be defined in terms of: domain, value, measuring method, weight relative to the domain,

potential gain of rating, and temporal characteristic (decaying or increasing). Several attributes are used in the study of quality of service and information quality, such as: completeness, reliability, accuracy, accessibility, consistency, timeliness, availability, relevancy, efficiency, usability, security, and trust.
2.2 A meta-model for Service Reputation
A system model abstracts a system's specifications from any specific implementation platform; this is referred to as a Platform-Independent Model (PIM). Adjusting the abstract model (PIM) to the chosen implementation platform to get a Platform-Specific Model (PSM) is achieved through a set of model-to-model transformations. Finally, software artifacts, e.g., code and reports, are generated from the PSM by means of model-to-text transformations. Our model describes the different quality attributes independently of any environment or preference settings; this corresponds to the PIM in software engineering. To derive a more concrete service quality model, which describes a specific service in certain environments and settings and conforms to specific user preferences, we apply model-to-model transformation techniques. The result is a model that corresponds to a PSM in software engineering. In the PSM of service quality, the service's specific domain attributes are considered and dynamically categorized according to the PIM. Eventually, we apply model-to-text transformation techniques to the PSM to get the software artifacts for the considered service. In our approach, the resulting software artifacts are concrete values for the service quality attributes of the considered service. Our proposed meta-model for the quality notions associated with a service is depicted in Figure 1.
Figure 1: A meta-model for quality notions in SOC
In this model, we present a classification of quality attributes into three different abstract categories: QoS, SQ, and QoR.
Moreover, we emphasize the generalization-specialization relationship between QoS on one side and SQ and QoR on the other. This classification is based on the quality attributes used to assess a service call. General quality attributes fall under the QoS category, domain-based quality attributes fall under the QoR category, and service-specific quality attributes under the SQ category. Extensibility is one of the main features of our proposed model. Quality attributes are not limited to a predefined set; rather, a service consumer can adapt the model to his own definition of quality by adding further quality attributes that are, in most cases, domain-specific and require domain experts. The model is also dynamic, which is reflected in the dynamic categorization of quality attributes (generic, domain-based, and service-specific). In one application,

cost can be considered a QoR attribute, but when it comes to cost vs. purpose, it falls into the SQ category. Figure 2 depicts our approach through an example.
Figure 2: Model-to-model and model-to-text transformations
2.3 Discussion: From Quality Attributes to Service Reputation
In this section, we discuss specific aspects of our approach, namely: evaluation sources, invocation analysis, and the Service Reputation Object. One of the uses of our work is to enhance service selection by investigating a service's reputation. This is done by providing enough information about several quality aspects of a service (profiling) to be able to select the most appropriate one. Previous work in the field uses only one or two of the following evaluation sources:
1. Feedback from users: ratings per service are aggregated into a reputation value.
2. Service history: some or all of the service's past interactions.
3. Service provider's reputation: service reputation is inherited directly from the service provider's reputation.
4. Advertisement: the service provider advertises his services himself.
In our approach we combine several sources to get a full service profile, from which we construct Service Reputation Objects. Instead of using binary reports from service consumers, we use a feedback report detailing the rating of each quality attribute involved in the process, in light of the service's configuration and the user's preferences. Invocation analysis is introduced as an automatic source of information to enrich the process of deriving service quality.
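As an illustrative sketch of the combined approach above, the following Python fragment merges three of the named evaluation sources into a single per-attribute rating. The weights, field names, and the weighted-sum rule are assumptions chosen for illustration, not the framework's actual calculation model.

```python
from dataclasses import dataclass

# Hypothetical combination of evaluation sources into one rating.
# Weights and source names are illustrative assumptions.

@dataclass
class Evidence:
    user_feedback: float        # per-attribute rating from feedback reports
    history: float              # score derived from the service's past interactions
    invocation_analysis: float  # measured automatically during service calls

def combine(evidence: Evidence, weights=(0.4, 0.3, 0.3)) -> float:
    """Weighted combination of the three evaluation sources."""
    w_fb, w_hist, w_inv = weights
    return (w_fb * evidence.user_feedback
            + w_hist * evidence.history
            + w_inv * evidence.invocation_analysis)

rating = combine(Evidence(user_feedback=0.8, history=0.9, invocation_analysis=0.7))
print(round(rating, 2))  # 0.8
```

A real profile would apply such a combination per quality attribute, not to the service as a whole.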

Service Reputation Object
Our approach led us to a new reputation concept for a service. Giving a service a reputation value is not a novel idea; some researchers have already addressed it, but most of them either collect binary reports from users and aggregate the result into a single rating value, or use only the service provider's reputation, which is again a single value. Being a single value, reputation is usually thought of as a quality attribute. In our work, we not only consider reputation as an object, but also consider Service Reputation an upper concept that encapsulates quality notions, not the other way around. Referring to the reputation definition in Section 2.1.1, profiling an entity means recording its behavior and analyzing its characteristics in order to predict or assess its ability in a certain sphere, or to identify a certain pattern. These characteristics are various quality attributes that are tested and calculated per service call, creating in the end a list of values used to construct a service profile. Hence, we redefine service reputation as:
Service Reputation describes the notion of profiling a single service by collecting its performance information using different quality attributes to construct the service profile.
Service Reputation Object: an object that holds service reputation information, consisting of a set of collective values calculated by functions that depend on the nature of the corresponding quality attribute, aggregated over several service calls.
A Service Reputation Object (SRO), which holds detailed values of the quality attributes, can be visualized as in Figure 3. A single quality attribute's semantic description is formalized as QA_i ∈ {QA_1, .., QA_n}; an algorithm used to calculate one or more quality attributes is Alg_k ∈ {Alg_1, .., Alg_y}; a single service call is SC_j ∈ {SC_1, .., SC_m}.
The functions used to calculate the quality value are not necessarily aggregation or averaging. Moreover, the same function need not be used for every element of the array; it can differ for each attribute according to its nature, i.e., average, maximum, multiplication, etc., where F_w ∈ {F_1, .., F_x}. The final calculated value for a single attribute over all service calls is CalcValue_i ∈ {CalcValue_1, .., CalcValue_n}, with CalcValue_i = F_w(Value_i) aggregated over the service calls SC_1, .., SC_m.
Figure 3: Constructing a Service Reputation Object by analyzing quality notions
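The formalization above can be sketched in a few lines of Python: each quality attribute QA_i is paired with its own aggregation function F_w, applied over the values observed across the service calls SC_1..SC_m. The attribute names and the particular functions are assumptions for illustration only.

```python
# Illustrative SRO construction: per-attribute aggregation functions
# applied over observed service-call values. Attribute names and the
# chosen functions are hypothetical.

service_calls = [  # one record of attribute values per service call SC_j
    {"availability": 0.99, "response_time": 120, "accuracy": 0.90},
    {"availability": 0.95, "response_time": 180, "accuracy": 0.85},
    {"availability": 0.97, "response_time": 150, "accuracy": 0.95},
]

# F_w: a (possibly different) function per attribute, chosen by its nature
aggregators = {
    "availability": min,                       # pessimistic: worst observed
    "response_time": max,                      # worst-case latency
    "accuracy": lambda vs: sum(vs) / len(vs),  # average
}

def build_sro(calls, aggregators):
    """CalcValue_i = F_w(values of QA_i over all service calls)."""
    return {qa: f([c[qa] for c in calls]) for qa, f in aggregators.items()}

sro = build_sro(service_calls, aggregators)
print(sro["availability"], sro["response_time"])  # 0.95 180
```

This mirrors the point made above: the SRO is a vector of per-attribute values, each computed by a function suited to that attribute, rather than a single scalar rating.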

3 Distributed Trust Management for Validating SLA Choreographies
For business workflow automation in a service-enriched environment, services scattered across heterogeneous Virtual Organisations (VOs) can be aggregated in a producer-consumer manner, building hierarchical structures of added value. To preserve the supply chain, the Service Level Agreements (SLAs) corresponding to the underlying choreography of services should also be incrementally aggregated. This cross-VO hierarchical SLA aggregation requires validation, for which a distributed trust system becomes a prerequisite. This section elaborates on the proposed hybrid approach of using reputation systems and PKIs to distribute trust in SOA and to identify violation-prone services at the service selection stage. It also actively contributes to breach management at the time of penalty enforcement. This work is part of the phase that aims to use SLAs to get quality information, which later leads to constructing the reputation service. The hybrid distributed trust system enables the previously introduced rule-based runtime validation framework for hierarchical SLA aggregations. The discussion includes: the justification and significance of a hybrid trust model for the validation of hierarchical SLA aggregations, and the conceptual elements of our hybrid PKI- and reputation-based trust model.
3.1 A Framework for Validation of Hierarchical SLA Aggregations
A Service Level Agreement (SLA) is a formally negotiated contract between a service provider and a service consumer to ensure the expected level of a service. In a service-enriched environment such as the Grid, cooperating workflows may result in a service choreography spanning several Virtual Organisations and involving many business partners. Service Level Agreements are made between services at various points of the service choreography.
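To make the idea of incremental SLA aggregation concrete, the sketch below composes one provider's SLA with those of the sub-providers it consumes. The composition rules used here (availabilities multiply, response times add under an assumed sequential invocation) are common simplifying assumptions for illustration, not the aggregation model proposed in the cited work.

```python
# Hedged sketch of bottom-up SLA aggregation along a supply chain.
# Composition rules are simplifying assumptions, not the paper's model.

def aggregate_sla(own, children):
    """Compose one provider's SLA with the SLAs of its sub-providers."""
    avail = own["availability"]
    latency = own["response_time"]
    for child in children:
        avail *= child["availability"]     # all services must be up
        latency += child["response_time"]  # sequential invocation assumed
    return {"availability": avail, "response_time": latency}

# leaf services, aggregated upwards toward the consumer-facing provider
storage = {"availability": 0.999, "response_time": 20}
compute = {"availability": 0.99, "response_time": 50}
portal = aggregate_sla({"availability": 0.995, "response_time": 30},
                       [storage, compute])
print(portal["response_time"])  # 100
```

Only the composed `portal` SLA would be exposed upwards; the child SLAs stay hidden inside the provider's own view, which is exactly what makes validation a distributed problem.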
Service choreography is usually distributed across several Virtual Organisations and various administrative domains. The complete aggregation information of the SLAs below a certain level in the chain is known only to the corresponding service provider, and only a filtered part is exposed upwards to the immediate consumer. This is why, during the validation process, the composed SLAs have to be decomposed incrementally down the supply chain of services and validated in their corresponding service provider's domain. A validation framework for the composed SLAs therefore faces many design constraints and challenges: a trade-off between privacy and trust, distributed query processing, and automation, to name the most essential ones. In our proposed model, the privacy concerns of the partners are protected by the SLA View model [7], whereas the requirements of trust and security are addressed through a reputation-based trust system built upon a distributed PKI (Public Key Infrastructure) based security system. Additionally, we use Rule Responder [8] to weave the outer shell of the validation system by providing the required infrastructure for the automation of the role descriptions of partners, as well as the steering and redirection of the distributed validation queries. Every service provider is limited to its own view. The

whole SLA choreography is seen as an integration of several SLA Views (details can be found in [7]). SLA Views can be implemented using the Rule Responder architecture. Rule Responder adopts the approach of multi-agent systems. There are three kinds of agents, namely: Organisational Agents (OAs), Personal Agents (PAs), and External Agents (EAs). An EA is an entity that invokes the system from outside. A Virtual Organisation is represented by an OA, which is the main point of entry for communication with the outer world. A PA corresponds to the SLA View of a service provider. Each individual agent is described by its syntactic resources of personal information, the semantic descriptions that annotate the information resources with metadata and describe their meaning with precise business vocabularies (ontologies), and a pragmatic behavioral decision layer for reacting autonomously. The flow of information is from External to Organisational to Personal Agents. During service choreography, services may form temporary compositions with other services scattered across different VOs. The question of which parent VO acts as the root CA in this case is solved by including a third-party trust manager, as in dynamic ad hoc networks. In the case of an SLA violation, in addition to enforcing the penalty, the affected party is likely to keep a note of the violating service in order to avoid it in the future. Moreover, a fair business environment demands even more: future consumers of the failing service also have a right to know about its past performance. Reputation-based trust systems are widely used to maintain the reputation of different business players and to provide this kind of knowledge. Our hybrid trust model combines PKI-based and reputation-based trust systems to harvest the advantages of both techniques.
The main points of the model are: first, the PKI-based trust model has a third-party trust manager that acts as a root CA and authenticates the member VOs; these VOs are themselves CAs, as they can further authenticate the services they contain. Second, the selection of services at the pre-SLA stage is done using reputation, to prevent SLA violations; services' reputations are updated after each SLA validation. Third, while the trust model provides trust and security, the SLA Views protect privacy.
3.2 Single Sign-On and Reputation Centers
In the proposed model, a third party acts as a root CA and authenticates the member VOs. These VOs are themselves CAs, as they can further authenticate the services they contain. Each member is given a certificate. With Single Sign-On, the user does not have to sign in again and again in order to traverse the chain of trusted partners (VOs and services). This can be achieved with the Cross-CA Hierarchical Trust Model, where the root CA provides certificates to its subordinate CAs, and these subordinates can further issue certificates to other CAs (subordinates), services, or users. In our previous work [4], the Trust Reputation Center (TRC) acts as a trusted third party. As depicted in Figure 4, this reputation-based trust model corresponds directly to Rule Responder's agents and their mutual communication. The PAs consult the OAs, and the OAs in turn consult the TRC, which is equivalent to the third-party CA in the PKI-based system.
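The hierarchical trust relation described above can be sketched without any real cryptography: an entity is trusted if walking its chain of issuers reaches the root CA. The issuer relation is modeled as a plain dictionary and all names are invented for illustration; a real deployment would verify certificate signatures rather than dictionary entries.

```python
# Conceptual sketch (no real cryptography) of the Cross-CA Hierarchical
# Trust Model: the third-party root CA certifies VO-level CAs, which in
# turn certify their services. Names are illustrative.

issuer_of = {
    "service-a1": "CA-VO1",
    "service-b2": "CA-VO2",
    "CA-VO1": "RootCA",
    "CA-VO2": "RootCA",
}

def trusted(entity: str, root: str = "RootCA") -> bool:
    """An entity is trusted if its issuer chain ends at the root CA."""
    seen = set()
    while entity != root:
        if entity in seen or entity not in issuer_of:
            return False  # broken chain or cycle
        seen.add(entity)
        entity = issuer_of[entity]
    return True

print(trusted("service-a1"))     # True: service-a1 -> CA-VO1 -> RootCA
print(trusted("rogue-service"))  # False: no issuer on record
```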

Figure 4: Correspondence between PKI, reputation-based systems, and the Rule Responder architecture
3.3 Proposed Trust Management Model
The processes involved in our model are:
Validation of the complete SLA aggregation: the validation query is required to traverse all the SLA Views lying across heterogeneous administrative domains and to be validated locally at each SLA View. The multi-agent architecture of Rule Responder provides communication middleware for the distributed stakeholders, namely the client, the VOs, and the various service providers.
Use of reputation in the selection phase: reputation transfer is required at two stages: the service selection stage and the penalty enforcement stage. In the process of service selection, reputation transfer helps to select the least violation-prone services, taking proactive measures to avoid SLA violations. Out of all the available services, the client first filters the best services complying with its happiness criteria. Then the client compares the credentials from the reputation objects of these services; the reputation object is traced, and the client can select the best service according to its already devised criteria. We assume that, out of the redundant services that fulfil the client's requirements, the service with the highest reputation is selected.
Use of PKI and reputation in breach management: this hybrid trust model is used in breach management after the occurrence of an SLA violation [7]. Refer to [6] for more details on the use-case scenario and this paper.
4 Towards Semantic Event-Driven Systems
This section elaborates on the work of allocating the right kind of information to be used in reputation evaluation, using Complex Event Processing (CEP) techniques and Semantic Web technologies.
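The selection step described above, filtering by the client's criteria and then taking the highest-reputation candidate, can be sketched as follows. The service fields and thresholds are illustrative assumptions.

```python
# Sketch of reputation-based service selection: filter by the client's
# "happiness criteria", then pick the highest-reputation survivor
# (the least violation-prone service). Fields are hypothetical.

services = [
    {"name": "S1", "price": 10, "availability": 0.99, "reputation": 0.80},
    {"name": "S2", "price": 12, "availability": 0.98, "reputation": 0.92},
    {"name": "S3", "price": 8,  "availability": 0.90, "reputation": 0.95},
]

def select(services, max_price=15, min_availability=0.95):
    candidates = [s for s in services
                  if s["price"] <= max_price
                  and s["availability"] >= min_availability]
    # among equally suitable services, take the highest reputation
    return max(candidates, key=lambda s: s["reputation"]) if candidates else None

print(select(services)["name"])  # S2 (S3 fails the availability criterion)
```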
The discussion in this section includes: a description of a new conceptual proposal for a semantic event-driven Internet, and the contribution of an architectural design artefact consisting of two essential levels of enhancement. A common denominator is the use of Semantic Web technologies on each level, i.e.,

for processing data (tied to events) and for semantically formalizing attention allocation points, which are applied to the detected and predicted situation instances via a semantic matchmaking process (ontology- and rule-based constraint inference logic). Here, we introduce our work in progress on a semantic, proactive, quality-assured, event-driven information processing approach towards the vision of semantic event-driven systems, which comprises two levels of proposed enhancements:
1. Semantic Complex Event Processing (SCEP), the combination of CEP and Semantic Web technologies, to achieve better machine understandability of (event) data and highly automated real-time processing of large and heterogeneous event sources.
2. Semantic quality, trust, and reputation assessment of the produced information, to support precise information dissemination and focused user attention allocation.
We are currently working on the grounded integration of the two layers and on the implementation and evaluation of the approach with industrially relevant use cases.
4.1 Semantic Complex Event Processing
Real-world occurrences can be defined as events happening over space and time. An event instance is a concrete semantic object containing data describing the event. An event pattern is a template that matches certain sets of events. One of the critical success factors of event-driven systems is the capability of detecting complex events from simple event notifications. The promise of combining event processing with semantic technologies, such as rules and ontologies, which leads to Semantic Complex Event Processing (SCEP), is that event processing rule engines can understand what is happening in terms of events and (process) states, and that they will know what reactions and processes they can invoke and what events they can signal.
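The idea of detecting a complex event from simple event notifications can be sketched in a few lines: a pattern check over an event stream that signals a derived "price surge" event once enough simple price-increase events for the same stock have been observed. The event fields and the threshold are invented for illustration; a real CEP engine would match such patterns continuously over windows of the stream.

```python
# Minimal sketch of complex-event detection from simple events.
# Event structure and the surge rule are illustrative assumptions.

events = [
    {"type": "price_change", "stock": "XY", "delta": +1.0},
    {"type": "price_change", "stock": "XY", "delta": +0.5},
    {"type": "price_change", "stock": "ZZ", "delta": -0.2},
    {"type": "price_change", "stock": "XY", "delta": +0.8},
]

def detect_surge(stream, stock, n=3):
    """Complex event: at least n price increases observed for `stock`."""
    increases = [e for e in stream
                 if e["type"] == "price_change"
                 and e["stock"] == stock and e["delta"] > 0]
    return len(increases) >= n

print(detect_surge(events, "XY"))  # True: three positive deltas for XY
```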
Semantic (meta)models of events can improve the quality of event processing by using event metadata in combination with ontologies and rules (knowledge bases). Event knowledge bases can represent complex event data models which link to existing semantic domain knowledge, such as domain vocabularies/ontologies and existing domain data. Semantic inference is used to infer relations between events, such as transitivity or equality between event types and their properties. Temporal and spatial reasoning on events can be done based on their data properties, e.g., using a time ontology describing temporal quantities. In our semantic event model, an event instance is a set of RDF triples and an event pattern is a graph pattern. More complex events are formed by combining smaller patterns in various ways. A complex conjunctive event filter is a complex graph pattern over the RDF graph. A map of the attribute/value pairs to a set of RDF triples (an RDF graph) can be used as an event instance. The interesting part of this event data model is the linking of existing knowledge (non-event concepts) to the event instances; for example, the name of the stock which an event is about is not identified by a simple string name but by a URI which links to the semantic knowledge about the stock. This knowledge can be used later in the processing of events, e.g., in the condition part of an Event-Condition-Action (ECA) reaction rule.

Figure 5: Architecture of Semantic Complex Event Processor

Figure 5 shows our architectural vision for semantic complex event processing. It combines a knowledge base, which includes ontologies and rules, with an incoming event stream from event producers, e.g., sensors or event adapters. The system has to combine static and real-time knowledge references and generate new knowledge. Having such a semantic event pattern/filter makes it possible to detect complex events derived from the events that have already happened, in combination with the knowledge of the processing system. In the first step, the raw-level events are classified and mapped to a secondary level of events; sequence and syntactic checks are also processed in this step. In the next step, the SCEP engine processes the events more expressively, based on their semantic relationships to other events and to non-event concepts existing in the knowledge base. The knowledge base can be seen as the TBox (assertions on concepts) and the event object stream as the ABox (assertions on individuals).

4.2 Proposed Approach through Stock Market Use Case

Semantic event-driven systems have use cases in different fields such as e-health, business activity monitoring, fraud detection, logistics and cargo, etc. In this example, a broker agent in the financial stock market (who studies price movements and patterns) has to decide whether to buy Company XY stocks for his customers.
Several factors contribute to his decision: recent trends seen in the market movements, events detected in the company itself, and the reputation of the sources from which the broker gets his intelligence. We assume that the broker has information from two sources: source 1 has a high reputation in real-estate market movements, while source 2 has a high reputation in commodity stocks. Attention to these two particular markets can form the final decision. The next step is to formalize this information into a corresponding complex event. During this process, a new piece of intelligence appears: several agricultural lands are to be converted into construction areas due to a change of political administration in the city. This new intelligence is entered as a new event in the process, together with the source's reputation and area of expertise. Then, the trend mining engine processes information about trends in the political sector to assess the possibility of another change in the administration.

5 Conclusion and Future Work

Correct understanding and subsequent modeling of the reputation concept helps to: enhance e-market dynamics to reflect real-life cognitive models of interactions, connect reputation models to other elements in the architecture (e.g., agent memory), base the decision-making process on correct and context-related parameters, and construct SLAs in e-contracts that are context-related, customized, and realistic. I will continue working on the three previously mentioned parts of my vision in parallel by cooperating with different teams.

Abstraction of Process Specifications
Artem Polyvyanyy

Software engineers constantly deal with problems of designing, analyzing, and improving process specifications, e.g., source code, service compositions, or process models. Process specifications are abstractions of behavior observed, or intended to be implemented, in reality, and result from creative engineering practice. Usually, process specifications are formalized as directed graphs, where edges capture temporal relations between decisions, synchronization points, and work activities. Every process specification is a compromise between two poles: on the one hand, engineers strive to operate with fewer modeling constructs which conceal irrelevant details, while on the other hand, the details are required to achieve the desired level of customization for envisioned process scenarios. In our research, we approach the problem of varying abstraction levels of process specifications. Formally, the developed abstraction mechanisms exploit the structure of a process specification and allow the generalization of low-level details into concepts of a higher abstraction level. The reverse procedure can be addressed as process specialization.

Keywords: Process abstraction, process structure, process modeling

1 Introduction

Process specifications, e.g., service compositions and business process models, represent exponential amounts of process execution scenarios with linear numbers of modeling constructs. Nevertheless, real-world process specifications cannot be grasped quickly by software engineers due to their size and sophisticated structure, creating a demand for techniques to deal with this complexity. The research topic of process abstraction emerged from a joint research project with a health insurance company. Operational processes of the company are captured in about EPCs.

The company faced the problem of information overload in the process specifications when employing the models in use cases other than process execution, e.g., process analysis by management. To reduce the modeling effort, the company requested the development of automated mechanisms to derive abstract, i.e., simplified, process specifications from the existing ones. The research results derived during the project are summarized in [6]. Abstraction is the result of the generalization or elimination of properties in an entity or a phenomenon in order to reduce it to a set of essential characteristics. Information loss is the fundamental property of abstraction and is its intended outcome. When working with process specifications, engineers operate with abstractions of real-world concepts. In our research, we develop mechanisms to perform abstractions of formal process specifications. The challenge lies in identifying what the units of process

logic suitable for abstraction are, and then performing the abstraction. Once abstraction artifacts are identified, they can be eliminated or replaced by concepts of higher abstraction levels which conceal, but also represent, the abstracted detailed process behavior. Finally, individual abstractions must be controlled in order to achieve an abstraction goal, i.e., a process specification that suits the needs of a use case.

Figure 1: Process abstraction (BPMN notation)

Figure 1 shows an example of two process specifications (given as BPMN process models) which are in the abstraction relation. The model at the top of the figure is the abstract version of the model at the bottom. Abstract tasks are highlighted with a grey background; the corresponding concealed fragments are enclosed within regions with a dashed borderline. The fragments have structures such that the resulting abstract process captures the core behavior of the detailed one. The abstract process has dedicated routing and work activity modeling constructs and conceals detailed behavior descriptions, i.e., each abstracted fragment is composed of several work activities. The research challenge lies in proposing mechanisms which allow examining every process fragment prior to performing abstraction, and in suggesting mechanisms which coordinate individual abstractions, i.e., assign higher priority to abstracting certain fragments rather than others. The rest of the paper is organized as follows: The next section presents the connectivity-based framework designed to approach the discovery of process fragments suitable for abstraction. Sect. 3 discusses issues relevant to the control of process abstraction. Sect. 4 discusses a technique which aids the validation of process correctness and is founded on ideas from Sect. 2. The paper closes with conclusions which summarize our findings and ideas on next research steps.

2 Discovery of Process Fragments

A necessary part of a solution for process abstraction is a mechanism for the discovery of process fragments, i.e., parts of process logic suitable for abstraction. The chances of making a correct decision on which part of a process specification to abstract from can only be maximized if all potential candidates for conducting an abstraction are considered. To achieve this completeness, we employ the connectivity property of

process graphs, i.e., directed graphs used to capture process specifications. Connectivity is a property of a graph: a graph is k-connected if there exists no set of k-1 elements, each a vertex or an edge, whose removal makes the graph disconnected, i.e., leaves no path between some pair of elements in the graph. Such a set is called a separating (k-1)-set. 1-, 2-, and 3-connected graphs are referred to as connected, biconnected, and triconnected, respectively. Each separating set of a process graph can be addressed as the set of boundary elements of a process fragment, where a boundary element is incident with elements inside and outside the fragment and connects the fragment to the main flow of the process. Let m be a parameter; the discovery of all separating m-sets of the process graph (graph decomposition) leads to the discovery of all process fragments with m boundary elements, i.e., potential abstraction candidates. In general, one can speak about (n, e)-connectivity of process graphs. A graph is (n, e)-connected if there exists no set of n nodes and no set of e edges whose removal makes the graph disconnected. Observe that an (n, e)-connected graph is (n+e+1)-connected. Much research was carried out by the compiler theory community to gain value from the triconnected decomposition of process specifications, i.e., the discovery of triconnected fragments in process graphs. The decompositions which proved useful are the (2, 0)-decomposition, or the tree of the triconnected components, cf. [9], and the (0, 2)-decomposition, cf. [2]. Triconnected process graph fragments form hierarchies of single-entry-single-exit (SESE) fragments and are used for process analysis, process comparison, process comprehension, etc. For these decompositions, linear-time algorithms exist [1, 3]. Recently, these techniques were introduced to the business process management community [10, 11].
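The definition of a separating m-set can be illustrated with a brute-force sketch (exponential, purely for illustration; the decompositions cited above use linear-time algorithms instead). All graph data here is invented:

```python
from itertools import combinations

def is_connected(nodes, edges):
    """BFS/DFS connectivity check on an undirected view of the graph."""
    nodes = set(nodes)
    if len(nodes) <= 1:
        return True
    adj = {n: set() for n in nodes}
    for u, v in edges:
        if u in adj and v in adj:
            adj[u].add(v)
            adj[v].add(u)
    start = next(iter(nodes))
    seen, stack = {start}, [start]
    while stack:
        for m in adj[stack.pop()]:
            if m not in seen:
                seen.add(m)
                stack.append(m)
    return seen == nodes

def separating_node_sets(nodes, edges, m):
    """All separating m-sets of vertices: removing such a set
    disconnects the remaining graph."""
    result = []
    for cand in combinations(nodes, m):
        rest = [n for n in nodes if n not in cand]
        kept = [(u, v) for u, v in edges if u in rest and v in rest]
        if not is_connected(rest, kept):
            result.append(set(cand))
    return result

# A small example graph: a diamond B-C-E-D with a tail A-B and E-F.
nodes = "ABCDEF"
edges = [("A", "B"), ("B", "C"), ("B", "D"), ("C", "E"), ("D", "E"), ("E", "F")]
print(separating_node_sets(nodes, edges, 1))  # cut vertices: [{'B'}, {'E'}]
print(separating_node_sets(nodes, edges, 2))  # includes the pair {'C', 'D'}
```

The separating 1-sets are the cut vertices; the separating pair {C, D} bounds the diamond fragment, matching the intuition that separating sets are the boundary elements of process fragments.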
We employed triconnected process graph fragments to propose mechanisms of process abstraction [7, 8]; we discover the triconnected process fragments and generalize them into tasks of a higher abstraction level.

Figure 2: Process graph and its SESE fragments

Figure 2 shows an example of a process graph. Routing decisions and synchronization points can be distinguished by the degree of the corresponding vertex, i.e., the number of incident edges, and by the orientation of the incident edges, i.e., incoming or outgoing. Process starts (ends) have no incoming (outgoing) edges. Moreover, Figure 2 visualizes the triconnected fragments of the graph (SESE fragments). Each triconnected fragment is enclosed in a region and is formed by the edges inside or intersecting the region. Fragments enclosed by regions with dotted borderlines can be discovered by performing the (0, 2)-decomposition of the process graph, whereas regions with dashed borderlines define fragments additionally discovered by the (2, 0)-decomposition. Observe that trivial fragments composed of a single vertex of the process graph are not visualized.

Figure 3: (a) Abstract process graph, (b) (3, 0)-connected fragment abstraction

The abstract process graph, obtained from the graph shown in Figure 2, is given in Figure 3(a). The abstraction is performed following the principles from [8], i.e., by aggregating the triconnected fragments into concepts of a higher abstraction level. The graph from Figure 3(a) has a single separating pair {B, S}. The next abstractions of the triconnected fragments of the graph will result in the aggregation of either the unstructured fragment composed of nodes {C, ..., R} \ {E, L, P}, or the fragment with entry node B and exit node S. In order to increase the granularity of the process fragments that are used to perform abstractions in [10, 11], one can start looking for multiple-entries-multiple-exits (MEME) fragments within the triconnected fragments.

Figure 4: Connectivity-based process graph decomposition framework

Figure 4 visualizes a connectivity-based process graph decomposition framework, i.e., a scheme for process fragment discovery. In the figure, each dot represents a connectivity property of the process graph (process fragment) subject to decomposition, e.g., (0, 0) means that the graph is connected if no nodes and no edges are removed. Edges in the figure hint at which decomposition can be performed for a graph with a certain connectivity level. For instance, one can decompose a (0, 0)-connected graph by looking for single nodes or edges which make it disconnected. Similarly, the triconnected graphs can be decomposed into (3, 0)-, (2, 1)-, (1, 2)-, or (0, 3)-connected fragments.
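The framework's stepping idea, decomposing at the next connectivity level by removing one vertex and decomposing the remainder, can be sketched with a brute-force stand-in for the linear-time algorithms (all graph data invented for illustration):

```python
from itertools import combinations

def disconnects(nodes, edges, removed):
    """True if deleting `removed` vertices leaves the rest disconnected."""
    rest = [n for n in nodes if n not in removed]
    if len(rest) <= 1:
        return False
    adj = {n: set() for n in rest}
    for u, v in edges:
        if u in adj and v in adj:
            adj[u].add(v)
            adj[v].add(u)
    seen, stack = {rest[0]}, [rest[0]]
    while stack:
        for m in adj[stack.pop()]:
            if m not in seen:
                seen.add(m)
                stack.append(m)
    return len(seen) < len(rest)

def separating_pairs(nodes, edges):
    """Brute-force stand-in for a linear-time separating-pair discovery."""
    return [set(p) for p in combinations(nodes, 2)
            if disconnects(nodes, edges, set(p))]

def separating_triples(nodes, edges):
    """(3, 0)-decomposition rationale: for each vertex v, every separating
    pair of G - v together with v forms a separating 3-set of G. With a
    linear-time pair discovery this yields a square-time procedure."""
    triples = set()
    for v in nodes:
        rest = [n for n in nodes if n != v]
        kept = [(a, b) for a, b in edges if v not in (a, b)]
        for pair in separating_pairs(rest, kept):
            triples.add(frozenset(pair | {v}))
    return triples

# Example: an undirected 5-cycle has exactly five separating 3-sets,
# one for each pair of non-adjacent remaining vertices.
nodes = "ABCDE"
edges = [("A", "B"), ("B", "C"), ("C", "D"), ("D", "E"), ("E", "A")]
print(sorted(sorted(t) for t in separating_triples(nodes, edges)))
```

Repeating the same reduction, removing a vertex and decomposing the rest, gives the O(n^(k-1)) bound for (k, 0)-decomposition mentioned below.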
In general, an (n, e)-connected process graph (process fragment) can be (n+1, e)- or (n, e+1)-decomposed. Observe that in this way highly connected process fragments get gradually decomposed. An (n, e)-decomposition (n + e > 3) allows the decomposition of unstructured process graphs into MEME fragments with n + e entries and exits. The process graph in Figure 3(b) is obtained by abstracting a (3, 0)-connected fragment defined by the separating set {F, G, S} (F and G are the entries and S the exit of the fragment, highlighted with a grey background). For reasonable n + e combinations, it is possible to perform the decomposition in low polynomial time. For example, the (3, 0)-decomposition of a (2, 0)-connected graph can be accomplished by removing a vertex from the graph and then running the triconnected decomposition [1]. Each discovered separating pair together with the removed vertex forms a separating triple of a 4-connected fragment. The procedure is repeated for each vertex of the original graph; hence, a square-time decomposition procedure is obtained. Following the described rationale, one can accomplish a (k, 0)-decomposition in O(n^(k-1)) time. By following the principles of the connectivity-based decomposition framework, we not only discover process fragments used to perform process abstraction, but also learn their structural characteristics. Structural information is useful at other stages of abstraction, e.g., when introducing control over abstraction. Initial work on classifying and checking the correctness of process specifications based on discovered process fragments was accomplished in [4].

3 Abstraction Control

The task of adjusting the abstraction level of process specifications requires intensive intellectual work and in most cases can only be accomplished manually by process analysts. However, for certain use cases it is possible to derive automated, or to support semi-automated, abstraction control mechanisms. The task of abstraction control lies in telling significant process elements from insignificant ones and abstracting the latter. In [6], work activities are classified as insignificant if they are rarely observed during process execution. We were able to establish the abstraction control because the investigated processes were annotated with information on the average time required to execute work activities. Process fragments which contain insignificant work activities get abstracted. Hence, we actually deal with the significance of fragments which represent detailed work specifications. Significant and insignificant process fragments can be distinguished once a technique for fragment comparison is in place, i.e., once a partial order relation is defined for the process fragments.
The average time required to execute the work activities in the process provides an opportunity to derive such a partial order relation, i.e., fragments which require less time are considered insignificant. Other examples of criteria and the corresponding abstraction use cases are discussed in [5]. Once an abstraction criterion, e.g., the average execution time of work activities, is accepted for abstraction, one can identify a minimal and a maximal value of the criterion for a given process specification. In our example, the minimal value corresponds to the most rarely observed work activity of the process and the maximal value corresponds to the average execution time of the whole process. By specifying a criterion value from this interval, one identifies the insignificant process fragments, i.e., those that contain work activities for which the criterion value is lower than the specified value. Afterwards, insignificant fragments get abstracted. In [5], we proposed an abstraction slider as a mechanism to control abstraction. An abstraction slider is an object that can be described by a slider interval, defined by a minimal and a maximal allowed value for an abstraction criterion; it has a state, i.e., a value which defines the desired abstraction level of a process specification, and exposes behavior, i.e., an operation which changes its state.
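The slider mechanism just described can be sketched as follows (a minimal illustration with invented fragment data and class names, not the implementation from [5]):

```python
from dataclasses import dataclass

@dataclass
class Fragment:
    name: str
    avg_exec_time: float  # abstraction criterion: average execution time

@dataclass
class AbstractionSlider:
    """Slider over a criterion interval [minimum, maximum]; its state
    selects the desired abstraction level of a process specification."""
    minimum: float
    maximum: float
    state: float = 0.0

    def set_state(self, value: float) -> None:
        # clamp the desired abstraction level into the slider interval
        self.state = max(self.minimum, min(self.maximum, value))

    def insignificant(self, fragments):
        """Fragments whose criterion value is below the slider state are
        insignificant and become candidates for abstraction."""
        return [f for f in fragments if f.avg_exec_time < self.state]

fragments = [Fragment("check stock", 2.0),
             Fragment("manufacture products", 40.0),
             Fragment("send bill", 1.5)]
times = [f.avg_exec_time for f in fragments]
# Interval: minimal criterion value up to the whole-process execution time.
slider = AbstractionSlider(min(times), sum(times))
slider.set_state(5.0)
print([f.name for f in slider.insignificant(fragments)])
# → ['check stock', 'send bill']
```

Moving the state towards the maximum abstracts ever more fragments, until only the most significant behavior remains.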

4 Correctness of Process Specifications

Process specifications define the allowed process execution scenarios. As a result of creative modeling practices, models can contain errors, e.g., have scenarios with improper termination, or contain activities that can never become enabled and, hence, executed. The basic correctness criterion for process specifications, originally defined for WF-nets, is behavioral soundness. In a sound process, for each activity of a specification there exists an execution scenario (a process instance) which contains this activity. Moreover, one can uniquely recognize the events of starting and finalizing a process instance. SESE fragments have proven useful for decomposing the task of analyzing the behavioral correctness of process specifications. For instance, if a SESE fragment of a process specification has been shown to be sound, in the context of overall model correctness it can be addressed as a single edge passing control flow from its entry to its exit [4, 10].

Figure 5: (2, 0)-decomposition of the process specification (BPMN notation)

The connectivity property of process graphs, in particular the (2, 0)- and (0, 2)-decompositions of process specifications, is extensively studied in the literature for the purpose of SESE fragment discovery [2, 7, 9, 11]. However, these techniques cannot be applied to an arbitrary structural class of process specifications in a straightforward manner. Figure 5 shows the (2, 0)-decomposition of a process specification.
Observe that not all (2, 0)-fragments form SESE fragments: the ones enclosed in regions with a dashed borderline do not have dedicated entry and/or exit nodes, whereas the fragments enclosed in regions with a dotted borderline are SESE fragments. The primary reason for this is that the process specification contains mixed gateways, i.e., control flow routing nodes with multiple incoming and multiple outgoing edges. Despite the fact that the process from Figure 5 appears to be block-structured, it can expose complex execution scenarios, with control flow entering and leaving structured block patterns through both gateways which mark a block. In [4], we show that such complex behavior can be localized by examining structural patterns in hidden unstructured regions of control flow. As an outcome, the correctness of the behavior of process specifications within these regions can be validated in linear time.

Figure 6: The tree of the triconnected components of the process from Figure 5

Figure 6 provides an alternative view on the (2, 0)-decomposition of the process specification from Figure 5. Here, each node represents a (2, 0)-fragment and edges hint at the containment relation of fragments. For instance, fragment P1 is contained in fragment S1, and contains fragments S2 and S3. Observe that one obtains a tree structure and that fragment names hint at their structural class, e.g., S for sequence, P for parallel block (structurally, but not semantically), and L for loops, cf. [4]. In the process specification from Figure 5, fragments S5, L1, L2, and P2 are non-SESE fragments; the corresponding fragment nodes are highlighted with a dark grey background in Figure 6. Together with the fragments represented by adjacent nodes in Figure 6, they constitute a control flow region forming a SESE fragment of hidden complex behavior with the entry and exit nodes of fragment S2. The region is highlighted with a grey background in Figure 6. In [4], we observe that process behavior within the above-described regions is determined by loop fragments, i.e., by fragments L1 and L2 in the running example. A (2, 0)-fragment is a non-SESE fragment if and only if it has a boundary node that is a mixed gateway which has at least one incoming and at least one outgoing edge both among internal and among external fragment edges. Moreover, a non-SESE fragment is either a loop fragment, or it shares a boundary node with a loop fragment. In the class of free-choice process specifications, the boundary nodes of a loop fragment cannot introduce concurrency into the process instance of a sound process specification.
Hence, the boundary nodes of the L1 and L2 fragments in the process specification from Figure 5 must implement exclusive-or semantics. Furthermore, as loops share boundary nodes with other non-SESE fragments, they imply behavioral constraints on all (2, 0)-fragments of the region. For further information, cf. [4].

5 Conclusions

In our research, we develop methods which allow the derivation of high-abstraction-level process specifications from detailed ones. In order to discover fragments suitable for abstraction, we exploit the structure of process specifications, which are usually formalized as directed graphs. As an outcome, the developed techniques can be generalized to any process modeling notation which uses directed graphs as the underlying formalism. Bringing a process specification to a level of abstraction that fulfills emergent engineering needs is a highly intellectual task without a single perfect solution. By employing the technique for the discovery of abstraction fragments, one can approach the problem as a manual engineering effort. Besides, when it is sufficient to fulfill a certain use case, cf. [6], one can define the principles for the semi-automated or fully automated composition of individual abstractions.

As future steps aimed at strengthening the achieved results, we plan to validate the applicability of the connectivity-based process graph decomposition framework for the purpose of process abstraction with industry partners and to look for process abstraction use cases for which automated control mechanisms can be proposed. Finally, studies regarding the methodology of abstraction need to complement the technical results.

References

[1] Carsten Gutwenger and Petra Mutzel. A Linear Time Implementation of SPQR-Trees. In GD, pages 77-90, London, UK. Springer Verlag.
[2] Richard Johnson. Efficient Program Analysis using Dependence Flow Graphs. PhD thesis, Cornell University, Ithaca, NY, USA.
[3] Richard Johnson, David Pearson, and Keshav Pingali. The Program Structure Tree: Computing Control Regions in Linear Time. ACM Press.
[4] Artem Polyvyanyy, Luciano García-Bañuelos, and Mathias Weske. Unveiling Hidden Unstructured Regions in Process Models. In CoopIS, Vilamoura, Algarve, Portugal. Springer Verlag.
[5] Artem Polyvyanyy, Sergey Smirnov, and Mathias Weske. Process Model Abstraction: A Slider Approach. In EDOC, Munich, Germany. IEEE Computer Society.
[6] Artem Polyvyanyy, Sergey Smirnov, and Mathias Weske. Reducing Complexity of Large EPCs. In MobIS: EPK, Saarbruecken, Germany. GI.
[7] Artem Polyvyanyy, Sergey Smirnov, and Mathias Weske. On Application of Structural Decomposition for Process Model Abstraction. In BPSC, Leipzig, Germany. GI.
[8] Artem Polyvyanyy, Sergey Smirnov, and Mathias Weske. The Triconnected Abstraction of Process Models. In BPM, Ulm, Germany. Springer Verlag.
[9] Robert E. Tarjan and Jacobo Valdes. Prime Subprogram Parsing of a Program. In POPL, New York, NY, USA. ACM.
[10] Jussi Vanhatalo, Hagen Völzer, and Frank Leymann. Faster and More Focused Control-Flow Analysis for Business Process Models Through SESE Decomposition. In ICSOC. Springer Verlag.
[11] Jussi Vanhatalo, Hagen Völzer, and Jana Koehler. The Refined Process Structure Tree. In BPM, Milan, Italy.

Information Integration in Services Computing
Mohammed AbuJarour

Information integration has been the typical approach to

epcstatement = (eventdefinition | ruledefinition)
eventdefinition = 'EVENTS' <file name>
ruledefinition = [('SYNCHRONOUS' | 'ASYNCHRONOUS')] 'RULE' <rule name> patterndefinition {rulemodifier}
rulemodifier = (wheredefinition | whithindefinition | resultdefinition)
patterndefinition = [('STRICTSEQUENCE' | 'STRICTPARTITION' | 'SKIPTILLNEXT')] 'PATTERN' '{' eventstructure '}'
eventstructure = (eventspecification | eventsequence | eventalternative | eventnegation)
eventsequence = '[' eventstructure {',' eventstructure} ']'
eventalternative = '(' eventstructure '|' eventstructure {'|' eventstructure} ')'
eventnegation = '~' (eventspecification | eventsequence | eventalternative)
eventspecification = <event type> [eventlength] [':' <event name>]
eventlength = '[' [(<start> '.' <end> | ('<' | '=' | '>') <number>)] ']'
wheredefinition = 'WHERE' '{' wherecondition {',' wherecondition} '}'
wherecondition = (fieldjoin | relation)
fieldjoin = '[' <event field name> ']'
relation = value relationop value
value = fieldspec [operation fieldspec]
fieldspec = (<integer const> | <float const> | <event name> (fieldspecevent | fieldspecarray))
fieldspecevent = '.' <event field name>
fieldspecarray = '[' ']' '.' ('len' | ('avg' | 'max' | 'min') '.' <event field name>)
operation = ('+' | '-' | '*' | '/' | '&' | '|')
relationop = ('<' | '<=' | '=' | '>=' | '>' | '!=')
whithindefinition = 'WITHIN' <time>
resultdefinition = 'RETURN' ('SEQUENCE' | '{' value {',' value} '}')

Table 1: Language for event pattern specification

The Windows Monitoring Kernel provides the basic event logging infrastructure. Instead of writing events to a logfile, the runtime environment handles events as they occur. The event processing is based on deterministic finite automata which are derived from a formal description of event constellations. Event constellations can be specified using the grammar shown in Table 1. The language uses concepts from EventScript [1] and SASE+ [2].
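The automaton-based evaluation mentioned above can be illustrated with a toy sketch (not the actual runtime): a plain sequence pattern compiles to a finite automaton that advances on matching event types, either skipping irrelevant events or resetting on them, loosely corresponding to the SKIPTILLNEXT and STRICTSEQUENCE modes of the language.

```python
def compile_sequence(pattern):
    """Compile a sequence of event types into a tiny finite automaton:
    state i expects pattern[i]; reaching len(pattern) means a match.
    Toy illustration only -- the real runtime derives its automata from
    the full EPC rule language, not just plain sequences."""
    def run(events, skip_till_next=True):
        state = 0
        for ev in events:
            if ev == pattern[state]:
                state += 1
                if state == len(pattern):
                    return True
            elif not skip_till_next:
                state = 0  # strict sequence: any other event resets
        return False
    return run

detector = compile_sequence(["open", "read", "close"])
print(detector(["open", "write", "read", "close"]))  # True: "write" is skipped
print(detector(["open", "write", "read", "close"], skip_till_next=False))  # False
```

Because each incoming event only advances (or resets) an automaton state, the runtime can evaluate rules as events occur instead of post-processing a logfile.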
An EPC (event processing) file contains a sequence of epcstatements, which are either an eventdefinition or a ruledefinition. An eventdefinition is used to define the available event types and the structure of the event data. Each event contains event header information (such as timestamps) and event data fields (arbitrary information about the occurring event). Currently, the EPC compiler parses a regular C header file with Windows Monitoring Kernel event definitions and extracts the event structure information. The datatypes of the event data fields are saved to ensure type-safe usage of event data fields in other parts of the EPC file.

A Runtime Environment for Online Processing of Operating System Kernel Events

After defining the available event types, a set of rules can be specified in an EPC file. Each rule definition contains at least a name and a pattern definition. Additionally, the mode of rule processing (either synchronous or asynchronous) and additional rule modifiers can be specified. The mode of a rule defines whether events are processed in the execution context in which they occur (synchronous processing, which blocks the activity causing the event) or not (asynchronous processing, no blocking). Rule modifiers are used to fine-tune the rule definition, e.g., by specifying additional relations between certain events or by setting a time limit for the pattern occurrence. The basic elements of each pattern definition are either single events or arrays of events. Single events are specified by giving an event type name; arrays are specified by an event type name and []. Optionally, length limitations can be specified in the brackets, e.g., a[<5] defines that fewer than five events of type a are expected to match the pattern. Single events and arrays can be named by using a colon: type:name specifies a named single event and type[]:name specifies a named array of events. These names can be used, e.g., in rule modifiers to compare specific event data fields. Event patterns can be constructed analogously to regular expressions: an event sequence [a,b,c] defines that events of types a, b, and c have to occur subsequently in an eventstream to match the pattern. An alternative (a|b) defines that either an event of type a or an event of type b is expected. A negation ~X specifies that a certain pattern X does not occur in the eventstream. These structure elements can be nested in arbitrary ways, e.g.
[a,~[b,~c],(d e)] is a valid pattern specification. Repetitions of composed sub-patterns (kleene star operators) are currently not supported - for repetitions of single events arrays can be used. Additionally to the event pattern structure, the pattern semantic has to be specified. There are three different semantic modes for a pattern specification: STRICTSEQUENCE: all specified pattern elements occur in a strict sequence in the combined eventstream containing all occurring events in the system. STRICTPARTITION: all pattern elements occur in a strict sequence in a subeventstream (partition) containing only events of a specific execution context. SKIPTILLNEXT: no strict sequence of pattern elements is required, irrelevant events are skipped. 8-6 Fall 2009 Workshop
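The difference between the three modes can be illustrated with a small sketch. This is not the authors' kernel implementation, only a plain-Python model of how a two-element pattern [a,b] would be interpreted under each mode; all function and variable names are illustrative:

```python
# Illustrative model of the three pattern semantics (hypothetical names).
# An event is modeled as a pair (event_type, execution_context).

def strict_sequence(events, pattern):
    """STRICTSEQUENCE: pattern elements occur back-to-back in the combined stream."""
    types = [t for t, _ in events]
    n = len(pattern)
    return any(types[i:i + n] == pattern for i in range(len(types) - n + 1))

def strict_partition(events, pattern, context):
    """STRICTPARTITION: strict sequence within one execution context's partition."""
    return strict_sequence([e for e in events if e[1] == context], pattern)

def skip_till_next(events, pattern):
    """SKIPTILLNEXT: pattern elements occur in order; other events are skipped."""
    stream = iter(t for t, _ in events)
    return all(p in stream for p in pattern)  # `in` consumes the iterator in order

stream = [("a", 1), ("x", 2), ("b", 1)]         # an event from thread 2 interleaves
print(strict_sequence(stream, ["a", "b"]))      # False: "x" breaks the sequence
print(strict_partition(stream, ["a", "b"], 1))  # True: thread 1's partition is a, b
print(skip_till_next(stream, ["a", "b"]))       # True: "x" is skipped
```

In the real runtime these decisions are of course made incrementally, per event, by the compiled automaton, rather than by scanning a stored stream.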

4 ONLINE EVENT PROCESSING

After specifying the event pattern, additional conditions can be defined. These conditions make it possible to define relations between event data fields or to define join fields. In the current prototypical implementation, a single event data field or a pair of event data fields can be combined and compared to another single event data field or pair of fields. The common arithmetic and relational operators can be used. Special array operators provide access to the number of array elements and to the minimum, maximum, or average value of the array elements. These basic operations are sufficient for our proof-of-concept implementation. Support for arbitrary calculations might be added in future versions.

Event data fields can be used as join fields by using square brackets, e.g. WHERE {[name]}. This example specifies that all events used in the PATTERN statement must contain a data field name and that all matching events must have the same value assigned to this field. Join fields which are part of the execution context define an eventstream partition that is considered in the STRICTPARTITION semantic.

A rule can have a timeframe which defines the maximal difference between the timestamps of the first and the final detected event of a pattern. In our prototype, the timeframe can be defined based on the CPU cycle counter which is used for event timestamping. Alternatively, the time can be specified in seconds; the runtime environment then calculates the corresponding value for the CPU cycle counter.

The last element in a rule description is the RETURN-statement. There are two possible results a rule can provide: (1) the rule can return the whole event sequence which matched the pattern, or (2) the rule can return a set of event data fields.

Concepts that are planned as extensions of the language are: (1) a DO-statement for specifying actions which are executed immediately after a defined pattern is detected.
The set of useful actions and security aspects has to be investigated. (2) String operations, e.g. for file name comparisons.

In the following, an example for a pattern specification is given. The rule shown in Listing 1 can be used to detect lock contention, i.e. concurrent acquire operations for the same synchronization object.

Listing 1: Detect lock contention

    SYNCHRONOUS RULE lockcontention
    SKIPTILLNEXT PATTERN { [lock:a, ~lock:b, lock:c] }
    WHERE { [ObjAddress],
            a.Operation == 1,
            b.Operation == 0,
            c.Operation == 1 }

This rule analyzes lock events which belong to the same synchronization object (i.e. the event field ObjAddress has the same value). The first event, lock:a, tries to acquire the synchronization object (Operation is 1), the second event, lock:b, releases the object (Operation is 0), and the third event also tries to acquire the object. By negating the second event (i.e. the event must not occur in the eventstream), the pattern can detect lock contention: a synchronization object can only be acquired by one single thread, therefore a release operation can only be generated if the synchronization object was acquired successfully. If there is no contention on the synchronization object, event lock:a and event lock:b are generated by the same thread and the pattern detection aborts. If the release is not detected and another acquire event occurs, it must come from a different thread and the lock is contended. The exclusion of recursively acquired locks would require a more complex rule.

The rule is executed in a synchronous way; therefore, the execution of the activity that generates the lock:c event is blocked if the described pattern occurs. In such cases, the locking strategy might be adapted or an entry might be logged for later analysis.

The runtime environment consists of the following components: the compiler for EPC specifications and a user-mode application which loads compiled rule definitions into the system. The compiler was implemented using the Coco/R framework. It compiles a rule specification (compliant with the described grammar) into a deterministic finite automaton. Due to space restrictions we omit details about the compilation process; converting regular expressions into automata is a well-investigated topic.

Conditions and actions are assigned to automaton transitions. Conditions are checked to determine whether a transition is valid at the current state of the automaton, considering the current event; e.g., the expected event type and the WHERE conditions are checked via conditions. Actions are executed if the transition is actually chosen. Actions might save specific data fields of the current event for later reference or modify certain aspects of the runtime environment for the specific rule.

The binary representation of the automaton is loaded by the runtime system. If an event occurs, the runtime system identifies the rules that are interested in the specific event. The transition from the initial automaton state to the next state (i.e. when the first event relevant for the pattern occurs) starts an automaton run. Each run is represented in memory by a runtime state representation. The data structure for the runtime state representation is generated by the compiler. It contains the current automaton state, event field information (which is initialized by transition actions), and information about the results that are returned if a pattern is detected. These items can be derived from the EPC script.

When a new event occurs, the event information is stored in an execution-context-local buffer; this means, e.g., that each thread has its own buffer for occurring events. The event information is then fed into the event processor. The event processor manages a global buffer for event data. This buffer contains event data that is relevant for rule results and event data that does not require immediate processing. First, the event processor checks if there are any synchronous rules waiting for the new event. If this is the case, the event processor evaluates the specific rule automata. Only if the event data is relevant for rule results (e.g. if the rule has to return the whole event sequence) is the event information copied to the global event buffer. Secondly, if there are any asynchronous rules waiting for the new event, the event data is copied to the global event buffer (if not already done as a result of processing a synchronous rule) and a reference is stored in the asynchronous event queue.
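The described dispatch order (context-local staging, synchronous rules first, at most one copy into the global buffer, then a reference for asynchronous processing) can be sketched as follows. This is a simplified user-mode model, not the kernel implementation; all class and field names are assumptions:

```python
from collections import deque

class EventProcessor:
    """Toy model of the dispatch path described above (illustrative names)."""

    def __init__(self, sync_rules, async_rules):
        self.sync_rules = sync_rules    # callables: event -> bool (needed for results?)
        self.async_rules = async_rules  # callables, executed later by a worker thread
        self.global_buffer = []         # event data kept beyond the context-local buffer
        self.async_queue = deque()      # references (indices) into the global buffer

    def dispatch(self, event):
        ref = None
        # 1. Synchronous rules run immediately, in the context causing the event.
        for rule in self.sync_rules:
            if rule(event) and ref is None:   # copy only if needed for rule results
                self.global_buffer.append(event)
                ref = len(self.global_buffer) - 1
        # 2. Asynchronous rules: copy once (if not already copied), store a reference.
        if self.async_rules:
            if ref is None:
                self.global_buffer.append(event)
                ref = len(self.global_buffer) - 1
            self.async_queue.append(ref)
        # The context-local buffer slot holding `event` may now be reused.

    def drain(self):
        """Done by a dedicated thread in the real system; sequential here."""
        while self.async_queue:
            event = self.global_buffer[self.async_queue.popleft()]
            for rule in self.async_rules:
                rule(event)

ep = EventProcessor(sync_rules=[lambda e: e["type"] == "lock"],
                    async_rules=[lambda e: None])
ep.dispatch({"type": "lock"})   # copied for the synchronous rule, then referenced
ep.dispatch({"type": "io"})     # copied only because an asynchronous rule waits
print(len(ep.global_buffer), len(ep.async_queue))  # prints: 2 2
```

The point of the single-copy rule is visible in `dispatch`: an event needed by both a synchronous and an asynchronous rule lands in the global buffer exactly once.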

Afterwards, the execution-context-local event buffer can be reused for the next event. Events from the asynchronous event queue are processed by a special thread.

As already described, the runtime state of a specific rule encapsulates all information relevant for a potential event pattern match. Each rule has its own memory pool for such runtime state information. The pool size can be configured either to contain a specific number of state entries or to have a specific size. The actual pool size then determines how many event pattern matches can be detected in parallel. This concept of memory pooling was chosen for the following reasons: (1) the amount of memory required to process a specific rule is predictable, (2) allocating and deallocating memory for runtime state information (which in general requires only a few bytes) is inefficient, and (3) allocating the memory before the start of rule processing ensures that the required memory is indeed available.

The global buffer managed by the event processor is subdivided into smaller buffer units. Each of these units maintains a reference counter which describes how many event data entries are still referenced by runtime state entries. If the reference counter reaches zero, the particular buffer unit can be re-used to store event data.

Different heuristics are investigated to improve the performance of the runtime environment: (1) Adaptive event activation and deactivation. The runtime environment can determine the set of events that is relevant for the loaded rules and for current automaton runs. By interacting with the instrumentation framework, the generation of irrelevant events can be turned off to reduce system disturbance. (2) Evaluation ordering. The compiler generates deterministic automata, i.e. there is always exactly one valid transition for a specific state and a specific event. Transitions and conditions are evaluated sequentially. Based on the probabilities of evaluation results, reordering the evaluation sequences can lead to increased performance of rule processing.

The online processing of event streams offers the advantage that the eventstream itself does not have to be stored to disk. Synchronous event processing allows blocking activities if they generate undesired patterns. Furthermore, the runtime environment API provides functions which can be used to build self-aware applications and to implement concepts from autonomic computing. More details and examples can be found in [5].

5 Summary

Online processing of operating system kernel events can be used to detect specific event constellations in the stream of events occurring in the system. The possibility to react to detected constellations (synchronously or asynchronously) can be utilized to implement a variety of system adaptation policies; e.g., resources may be reassigned, malicious activities can be blocked, or deadlocks can be prevented. The design of the runtime environment integrated into the operating system kernel for online processing of event streams is the main contribution of my work. The proof-of-concept implementation shows the feasibility and demonstrates different usage scenarios.

References

[1] Norman H. Cohen and Karl Trygve Kalleberg. EventScript: An Event-processing Language Based on Regular Expressions with Actions. In LCTES '08: Proceedings of the 2008 ACM SIGPLAN-SIGBED Conference on Languages, Compilers, and Tools for Embedded Systems, New York, NY, USA, 2008. ACM.
[2] Yanlei Diao, Neil Immerman, and Daniel Gyllstrom. SASE+: An Agile Language for Kleene Closure over Event Streams.
[3] Yoav Etsion, Dan Tsafrir, Scott Kirkpatrick, and Dror G. Feitelson. Fine Grained Kernel Logging with KLogger: Experience and Insights. In Proceedings of EuroSys 2007, March 2007.
[4] Andreas Polze and Dave Probert. Teaching Operating Systems: The Windows Case. In SIGCSE '06: Proceedings of the 37th SIGCSE Technical Symposium on Computer Science Education, New York, NY, USA, 2006. ACM Press.
[5] Michael Schöbel and Andreas Polze. A Runtime Environment for Online Processing of Operating System Kernel Events. In Proceedings of the Seventh International Workshop on Dynamic Analysis.
[6] Alexander Schmidt and Michael Schöbel. Analyzing System Behavior: How the Operating System Can Help. In Proceedings of INFORMATIK 2007, LNI Nr. 110, Band 2, 2007.
[7] Michael Schöbel. Operating System Abstractions for Service-Based Systems. In Proceedings of the Fall 2006 Workshop of the HPI Research School on Service-Oriented Systems Engineering, 2006.
[8] Michael Schöbel. The Windows Monitoring Kernel. In Proceedings of the 2nd Ph.D. Retreat of the HPI Research School on Service-Oriented Systems Engineering.
[9] Michael Schöbel and Andreas Polze. Kernel-mode Scheduling Server for CPU Partitioning: A Case Study Using the Windows Research Kernel. In SAC '08: Proceedings of the 2008 ACM Symposium on Applied Computing, New York, NY, USA, 2008. ACM.

Computational Analysis of Virtual Team Collaboration in the Early Stages of Engineering Design

Matthias Uflacker

This report briefly outlines and motivates the content of my thesis, which is drawing close to completion.

1 Summary of the Dissertation

The early stages of engineering projects are considered the most critical phase of a product lifecycle. They mark a vibrant, creative, and dynamic process at the outset of conceptual design. Decisions made in this phase determine the rest of the engineering process, and any misconceptions and omissions have significant impact on the project's success [32]. Researchers are exploring factors for successful design collaboration that encourage innovative potential, but appropriate observation methods are needed to respond to the augmented virtualization and dispersion of team environments. How can computer-mediated team interactions be efficiently captured and explored? Can we further use this information to identify collaboration signatures that may indicate beneficial or detrimental movements in the creation of new concepts?

The dissertation describes how virtual, unstructured, and heterogeneous design team interactions can be captured to support the assessment and comparison of process characteristics in real time. It presents models, software technologies, and methods which automate the collection of online collaboration activities and establish a central interface to actor and asset relationships over the course of time. A case study in eleven distributed design projects suggests that this approach can point out significant team performance indicators during the early stages of concept creation and prototyping.

My research is structured into three phases. In the first phase, the work develops a theoretical foundation and a service-oriented software implementation to provide a platform for the computational capture of distributed online team activities.
Team collaboration networks are introduced as a model to establish a generic foundation for the graph-based, semantic representation of associations between actors and shared information objects over time. A resource-oriented software system provides a service interface to describe and access team collaboration networks, and to explore trends and signatures in the communication behavior of design teams.

In a second phase, the system is integrated into the concept development phases of eleven distributed, inter-disciplinary engineering projects over a period of eight months in each case. The information sharing activities scanned in e-mail archives, wiki spaces, and public team folders are translated into team collaboration networks and provide a basis for the visual and quantitative examination of communication structures and evolving trends.

In a final research phase, the generated models are taken into a case study analysis of potential team performance indicators. Significant dependencies are revealed between patterns in the correspondence of the observed teams and self-reported team satisfaction (R² = 0.48). Overall, the results of the analysis substantiate the beneficial impact of design thinking principles in the early stages of engineering projects. The findings suggest that those teams which put emphasis on external communication (to end-users, etc.), internal information sharing (through documentation), and diversity in the solution space (through iterative prototyping) generally perform better.

The dissertation merges recent trends in information system technology with prevailing research questions in engineering design and design thinking. With the digital footprint of team communication steadily growing, computational data acquisition represents a promising approach to improve empirical observation techniques in the study of conceptual design. The contribution of this work is a flexible service platform to conduct unobtrusive design process analysis and to create insights into virtual collaboration practice during the front end of innovation. Team collaboration networks provide a configurable, semantic data layer for the temporal and structural analysis of communication signatures in distributed teams. The results of the case study reveal a closer understanding of factors in computer-mediated communication that are critical for success.
In particular, it is shown that communication structures in virtual design collaboration can serve as a surrogate for qualitative aspects of team performance. This is significant because it demonstrates that objective collaboration metrics may indicate design team performance live and in situ. With that, this work provides a foundation for present and future research in design management and collaboration support.

2 Motivation

2.1 A Window on Conceptual Design Practice

Despite previous efforts to improve the assessment and transparency of performance-relevant process characteristics at project run-time, a deeper understanding of the requirements (i.e., how to monitor) and the relevant metrics (i.e., what to monitor) is required. This work aims to further promote the field of automated design observations by suggesting answers to both the how and the what: it develops a novel approach to the IT-supported measurement, processing, and analysis of online collaboration activities and scrutinizes the expressiveness of the obtained data structures with regard to indicating team performance. With the instantiation of appropriate sensors during the early stages of conceptual design, the research tries to improve the real-time* evaluation capabilities for engineering processes and to generate new insights from computational team observations.

2.2 Challenges for Design Research

The scientific exploration of what designers do when they do design relies on data that is collected either directly by researchers observing the design process or indirectly through a design proxy. Methods for direct data collection comprise ethnographic studies such as the classification of visual or verbal design activity recordings and transcripts [11,21]. Indirect data collection implies, e.g., conducting ex-post interviews and studying logbooks, documentation, or designed artifacts to get insight into the design process [19]. While both approaches to data collection introduce considerable costs and efforts to the research, each comes with additional distinct drawbacks. Direct data collection usually relies on a controlled environment, design laboratories, or artificial design tasks that support thorough real-time observations of the design process. Indirect analysis, on the other hand, can be criticized for the loss of accuracy and objectivity that is introduced by the mediating proxy. In distributed design scenarios, the costs of data collection methods are aggravated even further, as multiple design locations and concurrent, individual activities need to be considered. The acquisition of data is ultimately decelerated, allowing only for a retrospective analysis of the process, rather than taking a live view on the researched subject. As a result, field observations and experiments in design research are often coarse or limited to a small number of samples. More efficient data collection approaches are required to increase the resolution of embedded design observations.
In this context, the increasing proliferation of online information handling and digital design knowledge repositories has been identified to provide worthwhile sources for design research [22]. How can this information be efficiently leveraged and what can we learn from it?

2.3 Challenges for IT Systems Engineering

Challenges in the design of IT systems to support design research stem from the complexities of the targeted multi-user environment. These are defined by a framework of requirements related to human behavior, organizations, and technology [15]. Conceptual design is inherently unstructured and ad hoc, rendering the computational measurement, comparison, and prediction of complex process variables idiosyncratic [8]. The possibilities for automated data collection and interpretation of team behavior are naturally limited. A second dimension of complexity is spanned by a constantly transforming tool landscape in engineering design, which is and will remain heterogeneous. The unstructured and varying nature of team interaction in the different phases of design demands a diverse set of support tools with different strengths and characteristics. Currently, it is common to evaluate technology-mediated team communication based on the observation of a single domain or communication channel only. However, any observation and analysis focused on a single communication tool or collaboration suite can comprise only fractional parts of the information handling. In order to complete the picture, capabilities to centrally monitor and access a broader range of distributed and concurrent communication activities are needed. What are the characteristics of a computer system to record, formalize, and leverage those activities?

* In respect of the ambivalent meaning of this notion, especially in computer science, real-time refers here to the ability to collect and prepare data from user observations steadily and almost instantaneously, i.e., within the range of minutes.

3 Studying Team Communication to Improve Innovation Potential

The long-term objective of this research is ultimately shaped by the ambition to extend knowledge about the conceptual design process and ways to increase its innovation potential. The dissertation aims to contribute to this goal by improving design research methodology with an instrument for the incessant capture and evaluation of online communication activities.

3.1 Rationale

Building on previous works in design theory and shifting paradigms in IT-supported collaboration, three outside factors predominantly provide the underlying rationale for this study. In the following, I argue that the way design project teams work, communicate, and interact in a computer-mediated environment ultimately affects the potential of an organization to grow and innovate. Therefore, IT-based communication represents a supplemental, yet important subject for design research.

Economic Growth Needs Innovation

The dynamics of economic life have always been influenced by wave-like movements, many times triggered by new technology and innovation entering the market [18].
It is generally accepted that innovation is a motor for economic growth and should therefore be maximized [1]. Companies are constantly challenged with what has been described by Joseph A. Schumpeter as creative destruction: the incessant generation of viable solutions, products, or services, which is imperative to stay ahead and survive in a global market [27]. It is necessary to understand how innovation occurs in order to systematically increase innovative potential. The foundation for innovation is laid through new concept creation during the fuzzy front-end of innovation [16]. This front-end of innovation is poorly understood and presents one of the greatest opportunities for improving the innovation process [17, 25]. Iterative methodologies such as user-centered design [23] and design thinking [3] have been shown to provide successful approaches to encourage the generation of new concepts during the front-end of innovation [10, 33]. At the same time, there is little understanding of how design thinking or user-centeredness can be measured and evaluated.

Innovation Needs Communication

Schön [26] describes design as a reflective conversation with the materials of a design situation. He sketches design as the process of seeing-moving-seeing: interpreting and making sense of the world (seeing), performing actions to effect a desired change (moving), and assessing the effects in the changed environment (seeing), which causes further actions or the completion of the design cycle [4]. In a team-based environment, the design situation is constantly manipulated in a social, interacting, and collaborative manner. This perspective on design reveals the importance of communicating and sharing a precise picture of relevant design information with the other process participants. Having a holistic view on the fluctuating design situation in a team, i.e., achieving common ground among the stakeholders, is a crucial, yet challenging task in engineering design [24]. The correlation between communication and the outcome of a design process has been the subject of many previous works. Essays such as The Mythical Man-Month [2] and Peopleware [7], but also other more research-oriented studies [9, 31], have shown over and over that efficient team communication is a dominant factor for the well-being and success of engineering projects. Recent studies of the design process [6, 30] further suggest that communication within design teams is instrumental to successful design activity.
While it is generally accepted that communication is a critical factor for design, only little is known about how certain communication behavior influences the outcome quality.

Communication Needs IT

The increasing proliferation of software services in computer-supported co-operative work (CSCW) and virtual collaboration has changed work environments and the way people communicate and share information. Groupware applications have moved to the Internet and the World Wide Web, providing an open interface for connecting people and information. In fact, communicating over the Internet has become standard and indispensable, especially in global organizations. About 62% of all employed Americans have Internet access, and virtually all of those (98%) use e-mail on the job [13]. Web 2.0 services and other communication methods such as instant messaging are increasingly moving into the workplace [29]. Storing and distributing project data in online services creates an unequaled level of information availability. This expansion of online instruments in the information handling and negotiation activities of design teams can be leveraged to better understand communication structures during the early phases of innovation. With the digital footprint of communication growing, computational observation and analysis methods for team processes become potentially more feasible and powerful.

3.2 Digital Traces as Surrogates for Communication Behavior

While the connections between economic growth, innovative potential, and communication are undisputed in the literature, the correlations between the technical encoding of online communication and its semantic value in a collaborative environment are relatively unexplored. Probably the most fundamental question in the computational analysis and interpretation of unstructured communication is to what extent the technical representation can serve as a surrogate in the interpretation of the original intent of a message. This is a problem deeply rooted in the nature of communication itself. Shannon and Weaver [28] count it among the category of semantic problems: how precisely do the observable symbols convey the desired meaning? Recent studies give first evidence that data at the technical level of communication might be an observable surrogate for the semantic intent [22]. If and how data in the form of recorded interaction events could ultimately serve as an indicator of differences in design team activity and innovative performance remains largely an open question. If dependencies between observable patterns in team communication and the objective qualities of a design process can be identified, the envisioned instrument would qualify itself as an improvement to design research methodology. The case study and data analysis conducted later in this work sought to explore and interpret such correlations. The question if and how this kind of observation itself affects the behavior of the observed designers is interesting and relevant, but outside the scope of this work.

4 A Service Approach to Real-time Communication Analysis

The dissertation sought to expand the exploration window in empirical design research by providing new capabilities for automated data collection and analysis.
The focus is on technology-enabled, distributed engineering design spaces, in which computer-mediated team interactions are common and widespread. Responding to the diversity of virtual collaboration environments, the work introduces a distributed approach to capturing and measuring team interactions from miscellaneous online communication streams. A set of software services provides the functionality to incrementally capture and query detailed communication properties. This establishes a central point of access to the communication records and to meta-information about how design teams virtually communicate information over the course of a project.

4.1 Research Questions and Hypothesis

In the effort to construct, apply, and evaluate an automated, service-based approach to design team observation, this work raises and answers three principal research questions.

Research Question 1. How can the chronological appearance of design team interactions be modeled and represented in a computer-processable format?

The computer-supported analysis of team collaboration requires the formalization and recording of the activities under observation. The first research question asks for an appropriate data structure which is able to maintain a temporal representation of the identified collaboration properties. The requirements for the formalization of generally informal activities are complex. To ensure applicability in the myriad of different scenarios, the data structure cannot be designed for a particular predetermined environment or project, nor may it interfere with the natural creative modes of the subjects under study. This raises the demand for a generic data schema which supports the flexible configuration of extensible, yet unambiguous semantics.

Research Question 2. What are the structural and dynamic properties of a software system that facilitates the integration of concurrent communication capture and analysis into dispersed design environments?

The second question addresses the architectural layout of a software system to handle the formalization process and to provide capabilities for data inspection. The system design clearly needs to respond to the peculiarities of prevailing design environments without interfering with their existing setups. Multiple workspaces distributed in time and space, concurrent team interactions, and a diversity of media and groupware for virtual collaboration demand a scalable solution. Service-orientation defines an architectural paradigm to construct distributed and loosely coupled software systems that promote integrability, flexibility, and reusability [12]. The instantiation of a service-based software system to conduct computational observations and analyses in distributed and heterogeneous design environments is new and unprecedented. What services does this system need to provide and how can it be integrated into the design process?

Research Question 3.
Which communication patterns in conceptual design may serve as observable surrogates for applied design thinking principles?

To scrutinize the value of the collected data for design research, the dissertation tries to identify recurring characteristics that may stand for designerly ways of interacting during the creation of new concepts [5]. Design thinking is a methodology to stimulate creativity and innovation in early-stage engineering by integrating a set of principles, attitudes, and methods into the design process [3]. How is design thinking reflected in the online communication behavior of teams? Can we observe structural properties (patterns) that indicate the observance of design thinking principles? Focusing on potential design thinking proxies in the captured communication activities provides a meaningful starting point for the analysis of performance correlations. Revealing such correlations through computational observation and analysis would endorse the presented approach. In the context of this research question, the following hypothesis is formulated:

Hypothesis. The computational analysis of online communication in conceptual design processes can reveal quantitative characteristics that correlate with independent team performance measures.

Computational Analysis of Virtual Team Collaboration in the Early Stages of Engineering Design

Identifying a significant correlation between properties of the formalized communication behavior and independently measured team performance is a necessary, but not sufficient, requirement for the computational evaluation of design process qualities. Too many indeterminable factors influence team performance, preventing a complete and definite assertion by means of IT. However, correlations could provide indicators for what is relevant and worth observing in a design process. The presented solution would suggest itself for the implementation of a design management dashboard to monitor those indicators in real time.

4.2 Step 1: Development of a Model for Team Collaboration Capture

The development of an appropriate data model to describe the individual communication activities of design teams is a necessary first step for the analysis. While the observation of technology-mediated design activities has been the subject of many previous works, a more flexible and generic approach is required, which supports application in the diversity of existing and future IT-supported design environments. The work introduces team collaboration networks (TCN), a graph structure in which the occurrence, attributes, and relationships of heterogeneous actors and information objects are represented over the course of a project. A system of team collaboration networks (TCN-S) combines multiple network instances to support the conduction of parallel team observations. A TCN-S is organized by a set of domain- and network-specific ontologies to support integration, re-usability, and comparison of individual network structures.

4.3 Step 2: Service Implementation & Testbed Integration

In a second step, the work introduces a resource-based implementation of a TCN-S.
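To make the data model concrete: a team collaboration network can be read as a typed, timestamped graph over actors and information objects. The sketch below illustrates that reading in miniature; all class, field, and relation names are hypothetical, and the actual TCN-S is organized by ontologies rather than hard-coded kinds.

```python
# Hypothetical miniature of a team collaboration network (TCN): a typed,
# timestamped graph of actors and information objects. Names are illustrative
# only; the dissertation's actual schema is ontology-driven.
from dataclasses import dataclass, field

@dataclass(frozen=True)
class Node:
    id: str
    kind: str                      # e.g. "actor" or "information-object"

@dataclass
class TCN:
    nodes: dict = field(default_factory=dict)
    edges: list = field(default_factory=list)   # (src, dst, relation, timestamp)

    def add_node(self, node):
        self.nodes[node.id] = node

    def relate(self, src, dst, relation, t):
        self.edges.append((src, dst, relation, t))

    def activity_between(self, t0, t1):
        # query the chronological appearance of interactions (cf. RQ 1)
        return [e for e in self.edges if t0 <= e[3] <= t1]

tcn = TCN()
tcn.add_node(Node("alice", "actor"))
tcn.add_node(Node("wiki:concept-a", "information-object"))
tcn.relate("alice", "wiki:concept-a", "edited", t=1.0)
print(len(tcn.activity_between(0.0, 2.0)))   # 1
```

A real instantiation would persist such edges incrementally as the capture services observe new communication events.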
The platform provides a rich API for the continuous recording and evaluation of virtual collaboration activities in heterogeneous groupware applications. The services were applied over an eight-month period of early-stage concept creation in eleven engineering design projects and used to collect data from email archives, wiki pages, and online shared folders. The generated team collaboration networks provide the basis for the upcoming exploration of the communication structures.

4.4 Step 3: Case Study in Team Performance Evaluation

Finally, a case study is conducted to evaluate the captured online activities in design collaboration with the developed tools and to test the hypothesis. In the focus of this analysis are patterns which may reflect applied design thinking principles in the collected data sets. In particular, the work seeks to identify structures in email- and wiki-based communication that correlate with the different, independent team performance measures. Correlations would reveal potential indicators for beneficial or detrimental process characteristics. With relevant, observable metrics in place, the platform would legitimate itself as a foundation for team observation, continuing research in real-time design process evaluation and management.

5 Research Methodology

The three illustrated phases of this work are guided by the principles of research in information systems (IS). The IS research framework developed by Hevner and March [14] forms a symbiosis of design science and behavioral science, defining a bilateral process to create knowledge and to contribute to a socio-technological environment. This synergetic approach to information systems research is visualized in Figure 1.

Figure 1: Research framework by Hevner and March [14] (environment, IS research, and knowledge base, connected by relevance and rigor; numbered flows 1-6 between business needs, develop/build, justify/evaluate, and applicable knowledge).

Design science seeks to extend the boundaries of human and organizational capabilities by creating new and innovative artifacts [14]. It attempts to create things that serve human purposes [20]. This work's purpose is to create a tool that supports design researchers by enabling new observation and analysis capabilities. In this context, the work starts with a detailed exploration of the environment in design research and practice. It identifies needs and requirements in the IT-based conduction, observation, and analysis of innovation processes in conceptual design (1, Fig. 1). Building on existing
instruments, models, and formalisms, the work continues with developing software artifacts to better address the identified needs in design research (2). A service-based solution is presented, which provides a flexible system for capturing and leveraging observed communication structures of design teams. The created models and tool instantiations are evaluated in a case study with engineering design teams (3). Behavioral science has its roots in natural science research methods. It seeks to develop hypotheses and empirically justify theories that explain or predict organizational and human phenomena surrounding the use of information systems [14]. This work conducts an analysis of the communication behavior of engineering teams (4, Fig. 1). Addressing the organizational and technological environment of conceptual engineering design processes, theories about the explanatory power of online communication structures are developed (5). With the data collected during the project case studies, it is tested whether these structures can be used to predict independent team performance measures (6).

6 Contribution

The contribution of this work to the field of design research is expected to entail beneficial input to both theory and practice. This section briefly highlights the results of the research process in terms of input to a common knowledge base and to the research environment (cf. Figure 1).

6.1 Contributions to the Knowledge Base

With team collaboration networks, the work presents a solution to a research problem rooted in the formal requirements of computational models and the informality of design processes. How can the unstructured collaboration activities in conceptual design be captured for computer-based observation and real-time analysis?
A system of networks constitutes a configurable semantic model for representing and exploring meta-information about the temporal relationships between actors and arbitrary information assets in multiple design projects. An analysis of communication artifacts that were scanned in different groupware activities of real-life design teams shows that this approach is expedient. Correlations between the observed online communication behavior and supporting performance measurements could be demonstrated for the analyzed use cases, suggesting a beneficial impact of design thinking principles. In particular, a significant interdependency between the results of a team performance diagnostic [34] and two independent variables in the email-based communication could be identified (R² = 0.48). Other recent design research studies already build on the results of this work. Skogstad [30] has developed new theory about how designers gain the insights needed to create novel solutions and how reviewers can have both positive and negative effects on the design process. Parts of the hypotheses have been tested with team collaboration networks. Overall, the developed constructs for the observation of team communication provide a basis for achieving new insights into design aspects and performance

relevant properties of virtual collaborative processes. Future research projects may build upon these foundations and refine the results through additional case studies and improved software instantiations.

6.2 Contributions to the Environment

The work provides tools and techniques to design researchers that help to study the ever-widening variety of technologies and phenomena that arise within distributed engineering team activity. The platform largely decouples the data collection process from the actual analysis and allows for consistent and continuous observation. The service-oriented design simplifies the automated recording of distributed communication data and expedites an unobtrusive integration of team work analysis into existing and future research projects. A novel approach to real-time team diagnostics is established. The service platform also provides new capabilities for supporting design teams. With a more precise understanding of performance-relevant indicators, real-time awareness for potential drawbacks and process impediments is created. New starting points for the design and implementation of improved tools and dashboards for the management of design processes are created.

References

[1] D.B. Audretsch. Innovation, growth and survival. International Journal of Industrial Organization, 13(4).
[2] F.P. Brooks. The Mythical Man-Month: Essays on Software Engineering. Addison-Wesley, Reading, MA.
[3] Tim Brown. Design thinking. Harvard Business Review, pages 85-92, June.
[4] W.J. Clancey. Situated Cognition: On Human Knowledge and Computer Representations. Cambridge University Press.
[5] N. Cross. Designerly ways of knowing: design discipline versus design science. Design Issues, 17(3):49-55.
[6] N. Cross and A. Clayburn. Observations of teamwork and social processes in design. Design Studies, 16(2).
[7] T. DeMarco and T. Lister. Peopleware: Productive Projects and Teams. Dorset House, 2nd edition.
[8] K.J. Dooley and A.H. Van de Ven.
Explaining complex organizational dynamics. Organization Science.
[9] J.E. Driskell, P.H. Radtke, and E. Salas. Virtual teams: Effects of technological mediation on team performance. Group Dynamics: Theory, Research, and Practice, 7(4).

[10] C.L. Dym, A.M. Agogino, O. Eris, D.D. Frey, and L.J. Leifer. Engineering design thinking, teaching, and learning. IEEE Engineering Management Review, 34(1):65-92.
[11] O. Eris. Perceiving, Comprehending, and Measuring Design Activity Through the Questions Asked while Designing. PhD thesis, Stanford University, Stanford, CA.
[12] T. Erl. Service-Oriented Architecture: Concepts, Technology, and Design. Prentice Hall PTR, Upper Saddle River, NJ, USA.
[13] Deborah Fallows. Email at work. Pew Internet & American Life Project.
[14] A.R. Hevner and S.T. March. The information systems research cycle. IEEE Computer, 36(11).
[15] A.R. Hevner, S.T. March, J. Park, and S. Ram. Design Science in Information Systems Research. MIS Quarterly, 28(1):75-106.
[16] J. Kim and D. Wilemon. Strategic issues in managing innovation's fuzzy front-end. European Journal of Innovation Management, 5(1):27-39.
[17] P.A. Koen, G. Ajamian, S. Boyce, A. Clamen, E. Fisher, S. Fountoulakis, A. Johnson, P. Puri, and R. Seibert. Fuzzy front end: Effective methods, tools, and techniques. The PDMA Toolbook for New Product Development.
[18] N.D. Kondratieff and W.F. Stolper. The long waves in economic life. The Review of Economics and Statistics, 17(6).
[19] Ade Mabogunje. Measuring Conceptual Design Process Performance in Mechanical Engineering: A Question Based Approach. PhD thesis, Stanford University, Stanford, CA.
[20] S.T. March and G.F. Smith. Design and natural science research on information technology. Decision Support Systems, 15(4).
[21] A.J. Milne and L. Leifer. Information Handling and Social Interaction of Multi-Disciplinary Design Teams in Conceptual Design: A Classification Scheme Developed from Observed Activity Patterns. In Proc. of the ASME Design Theory & Methodology Conference.
[22] Andrew J. Milne.
An Information-theoretic Approach to the Study of Ubiquitous Computing Workspaces Supporting Geographically Distributed Engineering Design Teams as Group-Users. PhD thesis, Stanford University, Stanford, CA.
[23] J. Nielsen. Usability Engineering. Morgan Kaufmann.
[24] M.J. Perry, R. Fruchter, and D. Rosenberg. Co-ordinating distributed knowledge: A study into the use of an organisational memory. Cognition, Technology & Work, 1(3).
[25] D.G. Reinertsen. Taking the fuzziness out of the fuzzy front end. Research Technology Management, 42(6):25-31.
[26] D.A. Schön. Designing as reflective conversation with the materials of a design situation. Research in Engineering Design, 3(3).

[27] Joseph A. Schumpeter. Capitalism, Socialism, and Democracy. Harper and Brothers, New York.
[28] C.E. Shannon and W. Weaver. The Mathematical Theory of Communication. Urbana: University of Illinois Press.
[29] Eulynn Shiu and Amanda Lenhart. How Americans use instant messaging. Pew Internet & American Life Project.
[30] Philipp Leo Skogstad. A Unified Innovation Process Model For Engineering Designers and Managers. PhD thesis, Stanford University, Stanford, CA.
[31] L.F. Thompson and M.D. Coovert. Teamwork online: The effects of computer conferencing on perceived confusion, satisfaction, and postdiscussion accuracy. Group Dynamics: Theory, Research, and Practice, 7(2).
[32] S. Vosinakis, P. Koutsabasis, M. Stavrakis, N. Viorres, and J. Darzentas. Supporting conceptual design in collaborative virtual environments. In Proc. of the 11th Panhellenic Conference on Informatics, PCI 2007.
[33] K. Vredenburg, J.Y. Mao, P.W. Smith, and T. Carey. A survey of user-centered design practice. In Proceedings of the SIGCHI Conference on Human Factors in Computing Systems: Changing Our World, Changing Ourselves. ACM, New York, NY, USA.
[34] Ruth Wageman, J.R. Hackman, and Erin V. Lehman. Team diagnostic survey: Development of an instrument. The Journal of Applied Behavioral Science, 41(4):373.


Handling of Closed Networks in FMC-QE

Stephan Kluth

The following report summarizes the investigations for the handling of closed networks in FMC-QE. In a closed network, the population of service requests is fixed and the external service time is zero. While the standard FMC-QE load generation model is a good representation of many other real-world scenarios, it is imprecise for this class of models. In order to cope with this problem, some investigations on solution methods for closed networks have been done. As a result, the summation method, an iterative solution method for closed queuing network models, has been adapted to FMC-QE. In this report, this adaptation is explained. In examples, the summation method in FMC-QE is compared to the standard FMC-QE M/M/m model and to M/M/1/K models. For comparative performance values, the investigated examples are also evaluated using classical solution methods for closed queuing networks.

1 Introduction

In the following report, the performance predictions for closed queuing networks, modeled and evaluated with queuing theory approaches, are compared to the predictions of the corresponding FMC-QE model and tableau. Therefore, first, the networks are modeled and analyzed with queueing theory techniques and later on, an FMC-QE model and FMC-QE tableaux are set up. It will be seen that the standard FMC-QE performance evaluation techniques are not a good approximation for this class of models. Therefore, FMC-QE will be extended in order to address this class of problems.

2 Closed Tandem Network

As a simple but significant example, first, a tandem network consisting of two servers connected to each other in a closed network is examined. First, the example is calculated using standard queuing theory approaches, calculating the global balance equations and solving the linear equation system. Then the example is calculated using the standard FMC-QE load model with an external service time of zero time units.
It will be seen that the approximation errors in this model are too large. In a second FMC-QE model, the performance values are calculated using an M/M/1/K model. This model leads to correct solutions for this special case of a closed network, but in order to compute the performance values, some results of the calculations had to be known in advance. In the third FMC-QE model, the summation method [1] is adapted to FMC-QE.

2.1 Original Example

The original model, shown in figure 1, is defined in [2].

Figure 1: Closed Tandem Network - Original Model [2] (two servers, 1 and 2, in a cycle)

In this example, the two servers have exponentially distributed service times with mean values of 5 s (µ_1 = 1/5 [SRq_1/s]) and 2.5 s (µ_2 = 1/2.5 [SRq_1/s]) and a FCFS service discipline. There are 3 service requests in the network (n_ges = 3), which leads to the state transition diagram shown in figure 2 [2].

Figure 2: Closed Tandem Network - State transition diagram [2] (states (3,0), (2,1), (1,2), (0,3))

In [2], the global balance equations are set up as:

p(3,0) µ_1 = p(2,1) µ_2,
p(2,1)(µ_1 + µ_2) = p(3,0) µ_1 + p(1,2) µ_2,
p(1,2)(µ_1 + µ_2) = p(2,1) µ_1 + p(0,3) µ_2,
p(0,3) µ_2 = p(1,2) µ_1.   (1)

This leads to the steady-state probabilities [2]:

p(3,0) = 0.5333, p(2,1) = 0.2667, p(1,2) = 0.1333, p(0,3) = 0.0667.   (2)

Using these steady-state probabilities, the marginal probabilities are computed as [2]:

p_1(0) = p_2(3) = p(0,3) = 0.0667,   p_1(1) = p_2(2) = p(1,2) = 0.1333,
p_1(2) = p_2(1) = p(2,1) = 0.2667,   p_1(3) = p_2(0) = p(3,0) = 0.5333.   (3)

After computing these probabilities, the performance values, starting with the utilization ρ_i, are derived as [2]:

ρ_1 = 1 - p_1(0) = 0.9333,   ρ_2 = 1 - p_2(0) = 0.4667.   (4)

The arrival rates (throughput in the closed network) are then derived as [2]:

λ = λ_1 = λ_2 = ρ_1 µ_1 = ρ_2 µ_2 = 0.1867 [SRq/s].   (5)
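The balance calculation above is easy to reproduce numerically: the state transitions form a birth-death chain, so the steady-state probabilities are proportional to (µ_1/µ_2)^j, where j is the number of requests at server 2. A short cross-check:

```python
# Closed tandem network with K = 3 service requests (example from the text):
# server 1: mu1 = 1/5 SRq/s, server 2: mu2 = 1/2.5 SRq/s, FCFS.
# The global balance equations reduce to a birth-death structure, so the
# probabilities of the states (3,0), (2,1), (1,2), (0,3) are proportional
# to (mu1/mu2)^j, with j requests at server 2.
mu1, mu2 = 1 / 5, 1 / 2.5
K = 3
r = mu1 / mu2
w = [r**j for j in range(K + 1)]        # unnormalized weights for state (K-j, j)
G = sum(w)
p = [wj / G for wj in w]                # p[j] = p(K-j, j)

rho1 = 1 - p[K]                         # server 1 idle only in state (0,3)
rho2 = 1 - p[0]                         # server 2 idle only in state (3,0)
lam = rho1 * mu1                        # throughput; equals rho2 * mu2

n1 = sum((K - j) * p[j] for j in range(K + 1))
n2 = sum(j * p[j] for j in range(K + 1))
R1, R2 = n1 / lam, n2 / lam             # Little's law per server

print(round(p[0], 4), round(p[K], 4))   # 0.5333 0.0667
print(round(lam, 4))                    # 0.1867
print(round(R1, 4), round(R2, 4))       # 12.1429 3.9286
```

The printed values match (2), (5), and (7).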

The mean numbers of service requests n_i are [2]:

n_1 = Σ_{k=1}^{3} k p_1(k) = 2.2667 [SRq_1],   n_2 = Σ_{k=1}^{3} k p_2(k) = 0.7333 [SRq_2].   (6)

The mean response times R_i are [2]:

R_1 = n_1 / λ_1 = 12.1429 [s],   R_2 = n_2 / λ_2 = 3.9286 [s].   (7)

2.2 M/M/1 Approximation

The corresponding FMC-QE model is shown in figure 3.

Figure 3: Closed Tandem Network - FMC-QE Model ((a) Service Request Structure, (b) Server Structure, (c) Dynamic Behavior)

In a first approximation for this closed network, the performance values are derived in an FMC-QE Tableau with M/M/1 servers:

n_{i,q} = ρ_i² / (1 - ρ_i),   n_{i,s} = ρ_i,   n_i = ρ_i / (1 - ρ_i)   (8)

and an external time X_ext = 0 (the arrival rate λ was adjusted until X_ext = 0 for n_ges = 3). This tableau is shown in table 1.

Table 1: Closed Tandem Network - M/M/1 Tableau (also in Appendix); experimental parameters: n_ges = 3, λ_bott = 0.2000, f = 0.7101, λ = 0.1420.

The approximation errors (relative error δ_x = |Δx| / x [3]) in the prediction of the overall arrival rate:

λ_GlobalBalanceCalculation = 0.1867,   λ_{M/M/1 Approx.} = 0.1420,
δ_λ = |λ_GlobalBalanceCalculation - λ_{M/M/1 Approx.}| / λ_GlobalBalanceCalculation = 0.239   (9)

and the response times of the servers (especially R_1):

R_{1,GlobalBalanceCalculation} = 12.1429,   R_{1,M/M/1 Approx.} = 17.2474,
R_{2,GlobalBalanceCalculation} = 3.9286,   R_{2,M/M/1 Approx.} = 3.8763,
δ_{R_1} = 0.296,   δ_{R_2} = 0.013   (10)

are very high, because in this solution the number of service requests in the system is only a mean number, derived by calculations for open networks, and not a constant value as usual for closed networks.

2.3 M/M/1/K Model

In a second calculation, the two servers are represented by M/M/1/K servers with a capacity of K = 3, and so the server formulas are:

ρ_i = λ_i / µ_i,

n_{i,q} = ρ_i/(1 - ρ_i) - ρ_i(1 + K ρ_i^K)/(1 - ρ_i^{K+1})  for ρ_i ≠ 1,   K(K-1)/(2(K+1))  for ρ_i = 1,

n_{i,s} = 1 - (1 - ρ_i)/(1 - ρ_i^{K+1})  for ρ_i ≠ 1,   K/(K+1)  for ρ_i = 1,

n_i = ρ_i/(1 - ρ_i) - (K+1) ρ_i^{K+1}/(1 - ρ_i^{K+1})  for ρ_i ≠ 1,   K/2  for ρ_i = 1.   (11)

In this tableau, shown in table 2, the different arrival rates λ_i are adjusted using the overall arrival rate λ and the traffic flow coefficients v_i in order to fit the value 0.1867 of the original example [2], calculated via the global balance equations.

Table 2: Closed Tandem Network - M/M/1/K Tableau (also in Appendix); experimental parameters: n_ges = 3, λ = 0.4000.

In this special case of the tandem network, the performance values are exactly the same for the M/M/1/K model and the calculation via the global balance equations, but this is not true for every closed network (in the second example of this section, the values are not correct). Furthermore, for the calculation of this model, the results (especially the arrival rates) had to be known in advance in order to adjust the traffic flow coefficients, or a more complex equation system had to be solved in order to retrieve the results. In this closed tandem model, the effective arrival rates

λ_{i,eff} = λ_i (1 - (1 - ρ_i) ρ_i^K / (1 - ρ_i^{K+1}))  for ρ_i ≠ 1,   λ_i (1 - 1/(K+1))  for ρ_i = 1   (12)

had to be the same for every server.
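Assuming the M/M/1/K formulas above, the tableau values for the tandem example can be cross-checked numerically; the per-server arrival rates λ_1 = 0.4 and λ_2 = 0.2 are taken from Table 2 (v_2 = 0.5):

```python
# M/M/1/K check for the closed tandem example (K = 3), using the arrival
# rates from Table 2: lambda_1 = 0.4, lambda_2 = 0.2; mu = (0.2, 0.4).
K = 3

def mm1k(lam, mu, K):
    rho = lam / mu
    if rho == 1:
        n, p_full = K / 2, 1 / (K + 1)
    else:
        n = rho / (1 - rho) - (K + 1) * rho**(K + 1) / (1 - rho**(K + 1))
        p_full = (1 - rho) * rho**K / (1 - rho**(K + 1))
    return n, lam * (1 - p_full)    # mean number, effective arrival rate

n1, eff1 = mm1k(0.4, 0.2, K)        # server 1: rho_1 = 2.0
n2, eff2 = mm1k(0.2, 0.4, K)        # server 2: rho_2 = 0.5
print(round(n1 + n2, 4), round(eff1, 4), round(eff2, 4))  # 3.0 0.1867 0.1867
```

The total population comes out as exactly n_ges = 3 [SRq], and both effective arrival rates equal the 0.1867 [SRq/s] of the global balance calculation, as claimed in the text.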
So the arrival rates λ_i had to be adjusted through the traffic flow coefficients (λ_i = v_i λ). Also the overall number of service requests in the system n_ges, with

n_i = ρ_i/(1 - ρ_i) - (K+1) ρ_i^{K+1}/(1 - ρ_i^{K+1})  for ρ_i ≠ 1,   K/2  for ρ_i = 1   (13)

and

n_ges = Σ_{i=1}^{2} n_i   (14)

had to be 3 [SRq] (n_ges = 3 [SRq]). For this proof-of-concept tandem network, the M/M/1/K model was of course calculable, but for larger networks, this model and calculation are not feasible.

2.4 Summation Method

In FMC-QE, the arrival rate is an input parameter and the number of service requests in the system is a result. In closed systems, this is normally the opposite, so classical approaches do not fit. The summation method [1] is an exception, since there the arrival rate is also an input parameter in a closed model. In the summation method, the mean number of service requests in each node is a function of the throughput of the node [2]:

n_i = f_i(λ_i).   (15)

[1] propose the following formulas for f_i(λ_i) [2]:

f_i(λ_i) = ρ_i / (1 - (K-1)/K · ρ_i)   for Type 1, 2, 4 (m_i = 1),
f_i(λ_i) = m_i ρ_i + ρ_i / (1 - (K-m_i-1)/(K-m_i) · ρ_i) · p_i(m_i)   for Type 1 (m_i > 1),
f_i(λ_i) = λ_i / µ_i   for Type 3,   (16)

with the utilization [2]:

ρ_i = λ_i / (m_i µ_i)   (17)

and the waiting probabilities (for Type-1 servers with m_i > 1) [2]:

p_i(m_i) = ((m_i ρ_i)^{m_i} / (m_i! (1 - ρ_i))) / (Σ_{k=0}^{m_i-1} (m_i ρ_i)^k / k! + (m_i ρ_i)^{m_i} / (m_i! (1 - ρ_i))).   (18)

The function f_i(λ_i) is correct for Type-3 servers (infinite servers) and an approximation for Types 1, 2 and 4 [2]. If f_i is given for every basic server station (number of basic server stations = I) in the network, the overall number of service requests in the system is given by [2]:

Σ_{i=1}^{I} n_i = Σ_{i=1}^{I} f_i(λ_i) = K   (19)

and including the traffic flow coefficients v_i, the overall number of service requests in the system is a function of the arrival rate [2]:


Σ_{i=1}^{I} f_i(v_i λ) = g(λ) = K.   (20)

For the usage of the summation method in the tableau, the basic server formulas are substituted by the summation formulas (16), and then the solution is derived in an iterative calculation, modified from [2]: Since the desired bottleneck utilization f (λ = f · λ_bott) is 0 for the lower bound of the arrival rate and 1 for the upper bound (arrival rate λ = bottleneck throughput λ_bott), the bounds are f_l = 0 and f_u = 1 in the first step. Then, in the second step, f = (f_l + f_u)/2 and the tableau is solved. If the overall number of service requests in the system is K ± ε, then the solution is found; otherwise, the bounds are set to f_u = f if the overall number of service requests in the system is greater than K, and f_l = f if it is smaller than K, and the next iteration starts with the second step.

Table 3 shows the corresponding tableau of the closed tandem network example.

Table 3: Closed Tandem Network - Summation Method Tableau (also in Appendix); experimental parameters: n_ges = 3, λ_bott = 0.2000, f = 0.9144, λ = 0.1829.

The approximation errors in the prediction of the overall arrival rate and the response times of the servers are:

λ_GlobalBalanceCalculation = 0.1867,   λ_{SUM Approx.} = 0.1829,
δ_λ = |λ_GlobalBalanceCalculation - λ_{SUM Approx.}| / λ_GlobalBalanceCalculation = 0.0204,
R_{1,GlobalBalanceCalculation} = 12.1429,   R_{1,SUM Approx.} = 12.8078,
R_{2,GlobalBalanceCalculation} = 3.9286,   R_{2,SUM Approx.} = 3.5961,
δ_{R_1} = 0.0548,   δ_{R_2} = 0.0846.   (21)
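The iterative calculation described above is a plain bisection on the bottleneck utilization f. A sketch for the tandem example, where both stations are single-server so only the first case of (16) is needed (a fixed iteration count stands in for the ε test):

```python
# Bisection for the summation method on the closed tandem network
# (two single-server stations, K = 3), using f_i from Eq. (16) for m_i = 1.
K = 3
mu = [0.2, 0.4]                  # service rates of server 1 and 2 [SRq/s]
lam_bott = min(mu)               # bottleneck throughput = 0.2

def f_single(rho, K):
    # approximate mean number of requests at a single-server station
    return rho / (1 - (K - 1) / K * rho)

fl, fu = 0.0, 1.0
for _ in range(50):              # bisection on the bottleneck utilization f
    f = (fl + fu) / 2
    lam = f * lam_bott
    n_ges = sum(f_single(lam / m, K) for m in mu)
    if n_ges > K:
        fu = f
    else:
        fl = f

print(round(f, 4), round(lam, 4))   # 0.9144 0.1829
```

The bisection converges to f = 0.9144 and λ = 0.1829, the experimental parameters of Table 3.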

3 Closed Central Server Example

The second example is a classical CPU - disk(s) closed model (closed central server). In this part, a solution following the theorem of Gordon and Newell [4] is compared to the FMC-QE summation method.

3.1 Original Model

The original example is the example in [5], re-engineered in figure 4.

Figure 4: Closed Central Server Example - Original Model Reengineered (CPU (1) with routing probabilities p_{1,1} = 0.1, p_{1,2} = 0.4 to Disk1 (2), and p_{1,3} = 0.5 to Disk2 (3))

In [5], the model was calculated using the theorem of Gordon and Newell [4]. For a system with three service requests inside (K = 3 [SRq]), the calculations lead to the following results [5] (recalculated manually and using WinPEPSY [6]^1 for higher precision):

e_1 = 1.000, e_2 = 0.400, e_3 = 0.500,
ρ_1 = 0.6939, ρ_2 = 0.3469, ρ_3 = 0.6939,
D_1 = 0.3469, D_2 = 0.1388, D_3 = 0.1735,
n_{1,q} = 0.5714, n_{2,q} = 0.1224, n_{3,q} = 0.5714,
n_{1,s} = 0.6939, n_{2,s} = 0.3469, n_{3,s} = 0.6939,
n_1 = 1.2653, n_2 = 0.4693, n_3 = 1.2653,
R_1 = 3.6471, R_2 = 3.3824, R_3 = 7.2941, R = 8.6471   (22)

3.2 FMC-QE Model

In the corresponding FMC-QE model, shown in figure 5, some transformations have been done. The different steps will not be shown here in detail, but these were a transformation to a Petri net, the integration of the external load generation, a feed-forward - feed-backward transformation for the loop around the CPU, a hierarchy for the disk branch, and another hierarchy for the whole service request. The corresponding tableau in table 4 shows the performance predictions of the FMC-QE model.

1 prbazan/pepsy/
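The Gordon-Newell reference values in (22) can be reproduced with Buzen's convolution algorithm. In this sketch, the service rates µ = (0.5, 0.4, 0.25) are inferred from ρ_i/D_i of the published results, so they are an assumption rather than taken directly from [5]:

```python
# Buzen's convolution algorithm for the closed central server model (K = 3):
# visit ratios e = (1, 0.4, 0.5); service rates mu = (0.5, 0.4, 0.25) are
# inferred here from rho_i / D_i of the published results (an assumption).
e = [1.0, 0.4, 0.5]
mu = [0.5, 0.4, 0.25]
K = 3
x = [ei / mi for ei, mi in zip(e, mu)]   # relative utilizations (2, 1, 2)

G = [1.0] + [0.0] * K                    # normalization constants G(0..K)
for xi in x:                             # Buzen: G(k) += x_i * G(k-1), k ascending
    for k in range(1, K + 1):
        G[k] += xi * G[k - 1]

X = G[K - 1] / G[K]                      # throughput of the reference station
D = [ei * X for ei in e]                 # station throughputs
rho = [xi * X for xi in x]               # station utilizations
# mean number of requests: n_i = sum_{k=1..K} x_i^k * G(K-k) / G(K)
n = [sum(xi**k * G[K - k] for k in range(1, K + 1)) / G[K] for xi in x]
R = [ni / Di for ni, Di in zip(n, D)]    # response times via Little's law

print([round(v, 4) for v in rho])        # [0.6939, 0.3469, 0.6939]
print([round(v, 4) for v in R])          # [3.6471, 3.3824, 7.2941]
```

With G(3) = 49 and G(2) = 17, the throughputs, utilizations, and response times agree with (22) to four decimals.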

Figure 5: Closed Central Server Example - FMC-QE Model ((a) Service Request Structure, (b) Server Structure, (c) Dynamic Behavior)

Table 4: Closed Central Server Example - Tableau (also in Appendix); experimental parameters: n_ges = 3, λ_bott = 0.4500, f = 0.6895, λ = 0.3103.

The approximation errors (relative error δ_x = |Δx| / x [3]) at the server level are very small (except for the runaway value n_{2,q}, for which the error is still under 16%):

δ_{ρ_1} = 0.0063, δ_{ρ_2} = 0.0061, δ_{ρ_3} = 0.0063,
δ_{D_1} = 0.0061, δ_{D_2} = 0.0065, δ_{D_3} = 0.0063,
δ_{n_{1,q}} = 0.0266, δ_{n_{2,q}} = 0.1593, δ_{n_{3,q}} = 0.0266,
δ_{n_{1,s}} = 0.0063, δ_{n_{2,s}} = 0.0061, δ_{n_{3,s}} = 0.0063,
δ_{n_1} = 0.0086, δ_{n_2} = 0.0460, δ_{n_3} = 0.0086,
δ_{R_1} = 0.0149, δ_{R_2} = 0.0403, δ_{R_3} = 0.0149.   (23)

On the net level (R and D), the approximation error seems very high at first, but in the FMC-QE model the repetition of the calculation (loop around the CPU) is transformed into a higher traffic flow within the Reliable Calculation.
The arrival rate at the highest level is lower than in the original model, which is just another interpretation of the model and not an error. With this normalization, the approximation errors are: δ_D = 0,0061, δ_R = 0,... (24)

4 Conclusions

The first two approaches to the integration of closed Queueing Networks into FMC-QE, more precisely the approximation using M/M/m servers with an external service time of zero time units, as well as the second model using M/M/m/K servers, were either too imprecise or inapplicable. The third approach, however, the integration of the summation method [1, 2] into FMC-QE, explained and exemplified in this report, extends FMC-QE to the class of closed Queueing Networks, providing an iterative algorithm and a good approximation for closed systems. The approximation errors in the examples, evaluated using the summation method, are, except for the runaway value n_2,q in the second example, within the maximal error of 15% as described in [1] and often even better than the average error of 5% for the number of service requests and the response time, also described in [1]. The values for the throughput are also precise, as described in [1].
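The core of this approach can be sketched compactly. The following Python snippet is an illustrative sketch, not the FMC-QE implementation (function names are mine): it applies the summation method to the closed tandem network from the Appendix, with two single-server stations of service rates 0,2 and 0,4, visit ratios of one, and n_ges = 3 jobs. It searches for the throughput λ at which the summed station populations equal the job count, using bisection between 0 and λ_bott.

```python
from typing import List

def sum_method(mu: List[float], n_total: int, tol: float = 1e-9) -> float:
    """Approximate the throughput lambda of a closed network of M/M/1
    stations: find lambda such that the summed per-station populations
    equal the total job count n_total."""
    # Population estimate for an M/M/1 station under the summation method
    def n_i(rho: float) -> float:
        return rho / (1.0 - (n_total - 1) / n_total * rho)

    lam_bott = min(mu)            # bottleneck throughput (visit ratios = 1)
    lo, hi = 0.0, lam_bott        # lambda lies between 0 and lambda_bott
    while hi - lo > tol:          # bisection on the monotone population sum
        lam = (lo + hi) / 2.0
        total = sum(n_i(lam / m) for m in mu)
        if total < n_total:
            lo = lam
        else:
            hi = lam
    return lam

# Closed tandem network from the Appendix: mu1 = 0.2, mu2 = 0.4, 3 jobs
lam = sum_method([0.2, 0.4], 3)
print(round(lam, 4))  # 0.1829
```

The resulting λ ≈ 0,1829, i.e. f = λ/λ_bott ≈ 0,9144, matches tableau (c) of Table 5 in the Appendix.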

References

[1] Gunter Bolch, Georg Fleischmann, and R. Schreppel. Ein funktionales Konzept zur Analyse von Warteschlangennetzen und Optimierung von Leistungsgrößen. In Ulrich Herzog and Martin Paterok, editors, Messung, Modellierung und Bewertung von Rechensystemen, 4. GI/ITG-Fachtagung, Erlangen, 29. September - 1. Oktober 1987, Proceedings, volume 154 of Informatik-Fachberichte. Springer.
[2] Gunter Bolch, Stefan Greiner, Hermann de Meer, and Kishor Shridharbhai Trivedi. Queueing Networks and Markov Chains: Modeling and Performance Evaluation With Computer Science Applications. John Wiley & Sons, Inc.
[3] Ilja N. Bronstein, Konstantin A. Semendjajew, Gerhard Musiol, and Heiner Muehlig. Taschenbuch der Mathematik. Verlag Harri Deutsch, Thun, Frankfurt am Main, 2nd edition.
[4] W. J. Gordon and G. F. Newell. Closed queueing systems with exponential servers. Operations Research, 15.
[5] Martin Haas and Werner Zorn. Methodische Leistungsanalyse von Rechensystemen. R. Oldenbourg Verlag GmbH, München, Wien.
[6] Matthias Kirschnick. The Performance Evaluation and Prediction SYstem for Queueing NetworkS PEPSY-QNS. Technical Report, Computer Science Department Operating Systems - IMMD IV, Friedrich-Alexander-University Erlangen-Nürnberg, Erlangen, Germany, June.

Appendix

Table 5: Closed Tandem Network - Tableaux. (a) M/M/1: n_ges = 3, λ_bott = 0,2000, f = 0,7101, λ = 0,1420. (b) M/M/1/K: n_ges = 3, λ = 0,4000. (c) Summation Method: n_ges = 3, λ_bott = 0,2000, f = 0,9144, λ = 0,1829. (The tableaux' Service Request, Server, Dynamic Evaluation and Multiplexer sections, covering the Executer and Client servers and the multiplexers Server 1 and Server 2, are not legibly recoverable here.)

Table 6: Closed Central Server Example - Tableau. Experimental parameters: n_ges = 3, λ_bott = 0,4500, f = 0,6895, λ = 0,3103. (The full tableau, a copy of Table 4, is not legibly recoverable here.)


Modelling Security in Service-oriented Architectures

Michael Menzel, Hasso-Plattner-Institute

Service-oriented Architectures (SOA) facilitate the provision and orchestration of business services to enable a faster adaptation to changing business demands. Web Services provide a technical foundation to realize this paradigm and support a variety of different security mechanisms and approaches. Security requirements are codified in Web Service policies that control the service's behavior in terms of secure interactions with other participants in an SOA. To facilitate and simplify the generation of enforceable security policies, I foster a model-driven approach based on the modelling of security intentions in system design models. These security intentions are translated to my security meta-model for SOA, which is used to generate Web Service policies. This report summarizes my research work of the past six months, which has been focused on the modelling of security requirements. I defined SecureSOA as a security design language that enables the definition of security intentions for SOA. In this report, the abstract syntax and notation of SecureSOA are introduced. In addition, the schema is described that has been chosen to integrate SecureSOA into any system design language. As an example, I will demonstrate the integration of SecureSOA into Fundamental Modelling Concept (FMC) Block Diagrams.

1 Introduction

IT infrastructures have evolved into distributed and loosely coupled service-based systems that are capable of exposing a company's assets and resources as business services. To implement this paradigm, the Web Service specifications provide a technical foundation based on XML messaging. In addition, Web Services facilitate the usage of different security patterns, mechanisms and algorithms to secure the interaction between the participants in a Service-oriented Architecture.
These specifications enable the enforcement of security goals such as confidentiality, integrity, authentication and identification in a Web Service-based system. Web Service policies (WS-Policy and WS-SecurityPolicy) are used to state these security requirements on a technical layer, concerning the usage of Web Service specifications such as WS-Security. They are deployed at the Web Services that have to enforce these requirements. In addition, policies enable services to communicate their requirements to service consumers, e.g. to inform them about required identity
information, trusted parties or required mechanisms to secure exchanged information. However, Web Service policies are complex, hard to understand and even harder to codify. To simplify the security configuration of Web Services, tool support is offered by all major Web Service platforms. These platforms provide preconfigured policies that can be selected using profiles or bindings. However, this approach requires developers to have some security knowledge, since an appropriate binding has to be chosen and additional security-related configuration might still be necessary. In addition, these tools do not take the overall system architecture into consideration. To overcome this limitation and to enable a simplified generation of security policies, I foster a model-driven approach that integrates simple security intentions into SOA system models. SOA system models provide an abstract view on different aspects such as participants, information, and workflows. The integration of security intentions enables a modeller to state basic requirements on a technology-independent level. For instance, specific information can be annotated as confidential, or services might require a specific set of trustworthy identity information, e.g. credit card information. Modelling security has been a research topic in recent years. Several approaches have emerged [6, 10], but none of them is suitable to express simple security intentions that can easily be integrated into different modelling languages. In [2], Basin and Lodderstedt introduce SecureUML, a security design language to describe role-based access control and authorisation constraints. In addition, they describe a general schema to integrate security modelling languages into arbitrary system design languages. I have adapted this schema to provide a security modelling language that enables the definition of security intentions in SOA.
This language is called SecureSOA and is the foundation of my model-driven approach. Security intentions modelled in a language based on SecureSOA are translated to my meta-model for security in SOA that has been introduced in [7]. This translation is driven by security patterns as described in [8]. Finally, the security requirements defined in the meta-model are used to generate enforceable WS-SecurityPolicy documents. The structure of this report is as follows. In Section 2, I outline my model-driven approach for SOA. Section 3 introduces my concept to enhance design modelling languages with security intentions, while SecureSOA is described as a security design language to model these intentions in the next section. A SecureSOA dialect based on FMC is introduced in Section 5. The next section presents a use case that is modelled using the dialect. Section 7 presents related work, while Section 8 concludes the report.

2 Model-driven Security in SOA

My model-driven approach should enable SOA architects to state security intentions at the modelling layer and to facilitate the generation of enforceable security configurations [7]. As illustrated in Figure 1, my approach consists of three layers. To state security requirements, I foster an annotation of system design models, e.g. FMC block diagrams or BPMN models, with security intentions that are defined by my
Figure 1: Model-driven Security in SOA

security modelling language SecureSOA. These languages are combined into a modelling dialect that merges modelling elements from both languages, as described in the next section. The modelled security intentions represent security goals that must be enforced and refer to a security profile. Profiles are used to abstract from technical details that should be hidden from the modeller. Instead of specifying, for instance, the algorithms, key strengths and other technical details, the modelled instance of an intention refers to a profile that provides this information. However, intentions and profiles are not sufficient to generate security policies, since additional technical information is required. Therefore, I defined a security constraint model that is used to capture these technical details. Information at the modelling layer is gathered and translated to this model. To perform the transformation from security intentions to security constraints, further knowledge might be needed: expert knowledge might be required to determine an appropriate strategy to secure services and resources, since multiple solutions might exist to satisfy a security goal. For example, confidentiality can be implemented by securing a channel using SSL or by securing parts of the transferred messages. To describe these strategies and their preconditions in a standardised way, I foster the usage of security patterns as described in [8]. Security patterns have been introduced by Yoder and Barcalow in 1997 [13] and are based on the idea of design patterns as described by Christopher Alexander [1]. Based on this work, I defined a formalised system of security configuration patterns that describes patterns for each security intention and that is used to resolve an appropriate set of security constraints.
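This resolution step can be illustrated with a small sketch. The following Python fragment is purely illustrative (the class, profile and pattern names are hypothetical, not the actual SecureSOA pattern system): a modelled intention references a profile that hides technical details, and a pattern turns both into a concrete set of security constraints.

```python
from dataclasses import dataclass

@dataclass
class Intention:
    kind: str        # e.g. "DataConfidentiality"
    subject: str     # annotated object or data transfer object
    profile: str     # referenced security profile

# A security profile predefines technical details such as algorithms.
PROFILES = {
    "default-confidentiality": {"algorithm": "AES-256", "transport": "SSL"},
}

# A (trivial) pattern system: each intention kind maps to a strategy
# that resolves the intention into security constraints.
PATTERNS = {
    "DataConfidentiality": lambda i, p: [
        {"constraint": "encrypt", "target": i.subject, **p}
    ],
}

def resolve(intention: Intention) -> list:
    """Resolve a modelled intention into concrete security constraints."""
    profile = PROFILES[intention.profile]
    return PATTERNS[intention.kind](intention, profile)

constraints = resolve(
    Intention("DataConfidentiality", "PaymentInformation",
              "default-confidentiality"))
print(constraints)
```

In the real approach, the pattern additionally checks preconditions (e.g. whether a secure channel is available) before choosing a strategy.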
The final step in my model-driven approach is the transformation of the security configuration into enforceable security policy languages, depending on the capabilities of the target environment.

3 Modelling Security Intentions for SOA

To integrate security intentions into system design languages, an enhancement of these languages is required. In general, three approaches can be distinguished to implement such an enhancement:

1. Light-weight extensions. The easiest way to enhance a particular system design language is the usage of extension points provided by the language itself. For instance, UML provides stereotypes (enhancing the semantics of modelling elements) and tags (name/value pairs) to extend UML modelling elements. Light-weight UML extensions are used by UMLsec to express security requirements. The advantage of using extension points is the simplified integration of security requirements into existing modelling tools. However, the visualisation of complicated security requirements might get confusing. Moreover, not all modelling languages define extension points to enhance modelling elements.

2. Heavy-weight extensions. Another approach to enhance modelling languages is based on the extension of their meta-model. For example, this approach is used by Rodríguez to define his security extensions for BPMN and UML [10]. The fact that the definition and integration of security requirements is done specifically for one particular system design modelling language, based on its meta-model, is the major disadvantage of this approach.

3. Defining a new language. To avoid the drawbacks mentioned above, a new modelling language can be defined. Such a modelling language integrates security elements and contains specifically redefined elements of the system design modelling language. SecureUML uses this approach to model security requirements as an integral part of system models. To this end, Basin and Lodderstedt described a generic approach to create new security design languages by integrating security modelling languages into system design modelling languages [2]. The advantage of their schema is its flexibility: a security modelling language can be defined once with certain extension points and then be integrated into different design modelling languages.
The resulting language is called a modelling dialect. I have adopted this approach to model security intentions as shown in Figure 2. The schema consists of the following parts:

1. A security modelling language is used to express security requirements for a specific purpose. While SecureUML provides a security modelling language to model authorization constraints, I have defined SecureSOA, which enables the modelling of security intentions for Web Services.

2. The structure of a system is described using a system design modelling language. While different types of modelling languages can be used, my approach is based on FMC Block Diagrams, which are used to visualise system architectures.

3. Both languages are integrated by merging their vocabulary using the extension points of the security modelling language. The resulting language is called a dialect.
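In object-oriented terms, this merge can be sketched as follows. The Python fragment below is an illustrative sketch with hypothetical class names (the actual integration is defined on the meta-model level): elements of the system design language subclass the extension points of the security modelling language, so that security intentions can refer to them.

```python
# SecureSOA side: extension points of the security modelling language.
class Object:          # entity that security intentions can refer to
    pass

class Information:     # exchanged information
    pass

# FMC side, integrated by subclassing: an FMC agent is an Object in the
# sense of SecureSOA, an FMC value is Information.
class Agent(Object):
    pass

class Value(Information):
    pass

# Dialect-specific elements subclass related elements of both languages.
class Service(Agent):
    pass

svc = Service()
print(isinstance(svc, Object))  # True: an intention can target a service
```

The point of the schema is that `Object` and `Information` are defined once, while `Agent` and `Value` would be replaced when a different system design language (e.g. BPMN) is integrated.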

Figure 2: Schema for constructing security design languages

In order to provide extension points, the security modelling language has to formalise the entities that are the subject of security requirements and that can be identified in the system design models as well. For example, I formalise participants in an SOA, such as services, as objects that participate in an interaction by exchanging information. The security intentions defined in my security language state requirements that refer to a particular object or information. FMC visualises system architectures that are composed of agents communicating over channels. Therefore, each agent is an object and can be integrated using subclassing.

4 SecureSOA - a security design language for SOA

SecureSOA enables the modelling of security intentions for Web Service-based systems and is defined by a MOF-based meta-model. The concrete syntax (notation) is defined using UML profiles. The SecureSOA meta-model [7] consists of two parts. The meta-model for SOA introduces the basic entities in my model and their relationships to describe interactions in a Service-oriented Architecture. Based on this model, a model for security intentions is provided as well.

4.1 A metamodel for SOA

As introduced in [11], one of the basic entities in my model is an object that consists of a set of attributes and can participate in an interaction, see Figure 3. An interaction is always performed on a medium that is connected to the objects. For instance, in the scope of Web Services, an object could be a Web Service client or a Web Service itself. In addition, each interaction also involves the exchange of information. To enable a detailed description of Web Service messaging, I model transferred information as data transfer objects as introduced by Fowler in [3]. Figure 4 shows
Figure 3: The Security Base Model

the adaptation of this concept to my model. A data transfer object represents serialised information and is information itself; however, it can also contain further information. This recursive structure facilitates the description of SOAP messages and their message parts. In addition, Figure 4 visualises the mapping to the SOAP message structure as defined in the SOAP messaging framework specification [4]: a SOAP envelope is a data transfer object that can contain different message parts that are data transfer objects themselves.

Figure 4: Messaging in SOA

Moreover, a data transfer object has a target and an issuer. This reflects that a data transfer object can be sent across several objects acting as intermediaries. Therefore, issuer and target do not necessarily correspond to the objects that are involved in an interaction exchanging a data transfer object. In the scope of Web Service technology, WS-Addressing [5] would be used to represent the issuer and target in a SOAP message by including a SOAP header that is also a data transfer object.

4.2 Modelling Security Intentions

Security intentions are defined specifically for one security aspect in terms of Web Service security and are related to one or more security goals. I have defined the following set of security intentions: User Authentication, Non-Repudiation, Identity Provisioning, Data Authenticity, Data Confidentiality, and Trust. Data Confidentiality, for instance, requires the security goal confidentiality for a particular piece of information, while identity
Figure 5: Modelling Security Intentions

provisioning states that the trustworthy identification and authentication of a user at a particular object is required. However, SecureSOA is not limited to these intentions and supports custom enhancements by adding additional security intentions. As shown in Figure 5, each security intention is related to a security profile. The fundamental idea is to hide technical details at the modelling layer: the modeller should not be bothered with details such as the security algorithms and mechanisms that are used to enforce an intention. This set of information is predefined in a security profile and is referenced by the security intention. Moreover, security intentions state requirements for a specific subject that is either an object or a data transfer object. Therefore, each security intention has an intention subject: object intentions refer to an object, while information intentions refer to a data transfer object.

5 A SecureSOA dialect based on FMC

SecureSOA offers the possibility to express security intentions in various modelling languages. I have chosen FMC Compositional Structure Diagrams (Block Diagrams) to visualise software systems with security annotations, since FMC offers a suitable foundation to describe an SOA on a technical layer in terms of the involved participants and their communication channels.

Figure 6: FMC Meta-Model

The FMC meta-model is depicted in Figure 6. Agents interact by performing read or write operations on a channel. To integrate SecureSOA into FMC, the entities in FMC have to be mapped to their corresponding entities in SecureSOA. As aforementioned, the easiest way to perform the integration is to subclass elements of SecureSOA: Object is subclassed by Agent, while Information is subclassed by Value. However, there is no corresponding element in FMC that can be mapped to service, client, STS and data transfer object, although their parents have been mapped to entities in FMC. To integrate these elements, it is necessary to add new elements to the meta-model of the dialect that subclass related elements in FMC and SecureSOA, as shown in Figure 7.

Figure 7: FMC Meta-Model

Finally, the element interaction has to be mapped to FMC. Subclassing will not work as an integration technique, since an interaction is not just a channel in FMC: it is composed of a channel in combination with an operation that is performed on this channel. Therefore, associations and an OCL constraint must be defined to perform the integration.

6 Example

The following section introduces a common claim-based service composition scenario as shown in Figure 8. The order scenario contains an order process, in which a user is requesting goods using an online store web application. This application invokes a composed order service that uses two external services: a payment and a shipping service. The payment service represents an external service which handles the payment of the order process. In order to do so, the service needs payment information including a payment amount and credit card information like card type, card holder, card number, expiration date and a security code. The shipping service initiates the shipping of the goods using the recipient's address.
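The claim-based interactions of such a scenario can be made concrete with a small data sketch. The following Python fragment is purely illustrative (the service, STS and claim names are hypothetical renderings of the scenario, not SecureSOA syntax): a service lists the identity claims it requires and the security token services (STS) it trusts, and access is possible only if the trusted STSs can issue every required claim.

```python
# Required identity claims per service in the order scenario.
REQUIRED_CLAIMS = {
    "payment-service": {"credit-card-information"},
    "shipping-service": {"recipient-address"},
}

# Trust relationships: service -> trusted security token services.
TRUST = {
    "payment-service": {"trusted-bank"},
    "shipping-service": {"registration-office"},
}

# Claims each identity provider (STS) is capable of issuing.
ISSUABLE = {
    "trusted-bank": {"credit-card-information"},
    "registration-office": {"recipient-address"},
}

def can_access(service: str) -> bool:
    """True if some trusted STS can issue every claim the service needs."""
    issuable = set().union(*(ISSUABLE[s] for s in TRUST[service]))
    return REQUIRED_CLAIMS[service] <= issuable

print(can_access("payment-service"))  # True
```

This mirrors the annotations in the scenario: services carry the identity provisioning intention, while identity providers are annotated with the identity information they offer.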
Based on SecureSOA, the notation of each FMC actor has been enhanced to indicate its type (client, service or STS) using stereotypes. Users in this example have an account at their trusted bank and at the registration office, which act as identity providers managing the users' digital identities. A user can authenticate at these identity providers to request a security token that can be used to access a specific service. In addition, this SecureSOA dialect is used to annotate security intentions at various actors in this use case. In particular, the payment service has established a trust relationship with the trusted bank, while the shipping service trusts information from
Figure 8: The Order Scenario with Security Intentions

the registration office. Therefore, the money transfer service and the speed shipping service are annotated with the security intention identity provisioning, while the identity providers are annotated with the identity information that they offer. For instance, the money transfer service requires credit card information, while the STS trusted bank is capable of issuing this information. To secure the exchanged information, the intentions data authenticity and data confidentiality are annotated as well in this example.

7 Related Work

The domain of model-driven security in the context of SOA and business processes is an emerging research area. Previous work by Rodríguez et al. [9], [10] discusses an approach to express security requirements in the context of business processes by defining a meta-model that links security requirement stereotypes to activity elements of a business process, and proposes graphical annotation elements to visually enrich the process model with related security requirements. A model-driven scenario based on their annotations is considered future work. Model-driven security and the automated generation of security-enhanced software artefacts and security configurations have been a topic of interest in recent years. Similar to SecureUML, Jürjens presented the UMLsec extension for UML [6] in order to express security-relevant information within UML diagrams. This approach relies on
the usage of UML profiles, tags, and stereotypes to express requirements such as confidentiality, access control and non-repudiation. One focus of UMLsec lies on the verification of these security requirements, which facilitates a verification of security protocols modelled with UMLsec. However, the verification of security protocols and system models also implies that all security-relevant information must be included in the UML model, which results in models that are difficult to understand due to their complexity. Although UMLsec can be adapted to verify communication-related requirements in SOA, it does not provide a simple, high-level notation for security intentions.

8 Conclusion and Future Work

System design languages provide a suitable abstract perspective to specify security goals on a more accessible level, as we have shown in [12]. In this report, I presented an approach to enhance arbitrary system design models with security annotations. My approach is based on a universal schema that has been introduced by Lodderstedt and Basin in [2]. To express security intentions related to Web Service security, I have defined SecureSOA, which has been integrated into FMC as a system design language. Moreover, I presented an order service scenario that was the basis to illustrate the expression of security intentions in FMC concerning trust relationships, identity provisioning, and confidentiality. The specification of security requirements on an abstract level is the basis for my model-driven approach, which addresses the difficulty of generating security configurations for Web Service systems. The foundation is my generic security model that specifies security goals, policies, and constraints based on a set of basic entities as described in [7]. Security intentions and related requirements defined at the modelling layer can be mapped to this model.
To resolve concrete security protocols and mechanisms, a security pattern-driven approach has been introduced in [8] that resolves appropriate security protocols with regard to specific preconditions. The gathered information can be mapped to policy specifications such as WS-SecurityPolicy. Altogether, my proposed modelling enhancement constitutes a suitable foundation to describe and implement a model-driven transformation of abstract security intentions to enforceable security configurations in different application domains. As described in [8], security patterns are a promising approach to resolve the additional information needed in the transformation process. In the next step, I will use my security model and the pattern system to provide an automated translation of security intentions modelled with SecureSOA to WS-SecurityPolicy.

References

[1] Christopher Alexander, Sara Ishikawa, Murray Silverstein, Max Jacobsen, Ingrid Fiksdahl-King, and Shlomo Angel. A Pattern Language: Towns - Buildings - Construction. Oxford University Press.

[2] David Basin, Jürgen Doser, and Torsten Lodderstedt. Model driven security: from UML models to access control infrastructures. ACM Transactions on Software Engineering and Methodology, 15(1):39-91.
[3] Martin Fowler. Patterns of Enterprise Application Architecture. Addison-Wesley Longman Publishing Co., Inc., Boston, MA, USA.
[4] Martin Gudgin, Marc Hadley, Noah Mendelsohn, Jean-Jacques Moreau, Henrik Frystyk Nielsen, Anish Karmarkar, and Yves Lafon. SOAP Version 1.2 Part 1: Messaging Framework (Second Edition). Specification.
[5] Martin Gudgin, Marc Hadley, and Tony Rogers. Web Services Addressing 1.0. Specification.
[6] Jan Jürjens. UMLsec: Extending UML for Secure Systems Development. In UML '02: Proceedings of the 5th International Conference on The Unified Modeling Language.
[7] Michael Menzel and Christoph Meinel. A security meta-model for service-oriented architectures. In Proc. SCC.
[8] Michael Menzel, Ivonne Thomas, and Christoph Meinel. Security requirements specification in service-oriented business process management. In ARES.
[9] Alfonso Rodríguez, Eduardo Fernández-Medina, and Mario Piattini. Towards a UML 2.0 extension for the modeling of security requirements in business processes. In TrustBus, pages 51-61.
[10] Alfonso Rodríguez, Eduardo Fernández-Medina, and Mario Piattini. A BPMN extension for the modeling of security requirements in business processes. IEICE Transactions, 90-D(4).
[11] Christian Wolter, Michael Menzel, and Christoph Meinel. Modelling security goals in business processes. In Proc. GI Modellierung 2008, GI LNI, Berlin, Germany.
[12] Christian Wolter, Michael Menzel, Andreas Schaad, Philip Miseldine, and Christoph Meinel. Model-driven business process security requirement specification. Journal of Systems Architecture, Special Issue on Secure Web Services.
[13] Joseph Yoder and Jeffrey Barcalow. Architectural patterns for enabling application security.


Automatic Extraction of Locking Protocols

Alexander Schmidt

1 Introduction

Monitoring and tracing applications raise many challenging problems. First, one has to find the right amount of tracing points in the system, so that the information gathered is sufficient for a particular analysis. Secondly, once the trace points have been determined, one has to decide what information to store. If too little information is stored, it might not be possible to reconstruct any useful control flow of the system. On the other hand, if too much information is stored, it might confuse the analysis process. Finally, the tracing solution must not degrade the performance of the system under inspection too much. How much is too much depends heavily on the purpose the monitoring tool is used for: developers may accept performance degradations of an order of magnitude in order to find the reason for an error, while system administrators running production systems will agree only to a small performance penalty. One particular problem in this area is how to ensure that any retrieved data is valid with respect to the system. That means, if a monitoring tool retrieves a data object, can you rely on that data, or is there a possibility that the data is corrupt? This matter becomes even more important when multi-threaded systems are concerned, which is more likely nowadays with the advent of many- and multi-core architectures. In such systems, when data objects are shared between different actors, or threads, the system implements a consistency model on its data [10], i.e., all actors agree on when they see updates on shared data objects. There are multiple ways to achieve that, for example distributed shared memory systems, or hardware and software transactional memory systems. Another approach to adhere to a consistency model is to design a locking protocol, which defines how participating actors determine who gets access to a shared object.
The solution to the producer-consumer problem is an example of such a locking protocol. By applying the locking protocol, all accesses to a shared data object are serialized into a sequence, and it is this serialization that protects shared data objects from corruption. This issue has long been neglected by both monitoring and tracing tools. To overcome that problem, we present the KStruct approach, a data structure inspection tool that allows accessing data objects at runtime while ensuring the consistency of the retrieved data. KStruct was carefully designed with locking protocols in mind. In order to define a locking protocol, we introduced the KStruct Access domain-specific language [15], a subset of the C programming language. KStruct Access is based on the observation that many multi-threaded systems implement locking protocols at the object or data structure level. For that reason, KStruct Access allows annotating structure definitions and declaring locking brackets, i.e., which data structure is protected by which lock. For each data structure that should be monitored, there must be a definition in KStruct Access, which is created during the Annotation phase. Based on that, the KStruct compiler extracts this knowledge and compiles it into a driver that is capable of accessing the system under inspection. The driver incorporates the KStruct Runtime system in order to get access to the system's locking functions and data heap. The driver further provides several interfaces to reveal the monitored data. We denote the phase of using the driver the Audit phase. As the KStruct runtime system lets you interactively browse through the

object heap of the system, it may become necessary at some point to extend the annotated data structures, which closes the basic KStruct cycle.

Figure 1: The KStruct AAA approach.

So far, all locking protocols have been defined manually for a small subset of the data structures of the Windows Research Kernel. To provide access to a greater number of data structures, a more convenient way had to be found to derive the locking protocol for a particular data structure. Within this paper, we extend the previous KStruct cycle by an Analysis phase, which is run by our tool KStruct Advice. KStruct Advice leverages static

[19] It allows writing programs that run on general-purpose graphics processors (GPGPU) by NVIDIA. Due to the hardware architecture of these devices, many complex computational problems can be solved much faster than on current CPUs. These problems include physical computations and video processing. Nowadays, development for CUDA is done using some C extensions. The code is precompiled with a compiler provided by NVIDIA and finally compiled to binaries accessing CUDA-enabled graphics drivers. Since CUDA works with all modern graphics cards from NVIDIA, even NVIDIA ION [14], it is available in hundreds of thousands of computers. This makes it particularly interesting for research on parallel computing.

Programming Model

The design goals of the CUDA programming model are to enable programmers to develop parallel algorithms that scale to hundreds of cores using thousands of threads [19]. The developers should not need to think about the mechanics of a parallel programming language and should be able to employ the CPU as well as the GPU. The application parts that are executed on the graphics card are called kernels. They are executed by many lightweight CUDA threads, which can communicate using slow device memory or fast shared memory. While a kernel is running, the CPU is free to handle other workloads.
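To make the block/thread indexing used by kernels concrete, the same computation can be simulated on the CPU as two nested loops (a sketch of my own; the grid and block dimensions here are arbitrary example values, not CUDA defaults):

```c
/* CPU analogue of the CUDA indexing scheme: every (block, thread) pair
 * processes exactly one array element. gridDim and blockDim mirror
 * CUDA's built-in variables. */
void square_all(int *data, int gridDim, int blockDim) {
    for (int blockIdx = 0; blockIdx < gridDim; blockIdx++) {
        for (int threadIdx = 0; threadIdx < blockDim; threadIdx++) {
            /* derive absolute thread id from block id and relative id */
            int threadId = blockIdx * blockDim + threadIdx;
            int parameter = data[threadId];          /* read          */
            data[threadId] = parameter * parameter;  /* compute/write */
        }
    }
}
```

On the GPU, the loop bodies run concurrently as independent threads; the absolute id is what guarantees that each thread touches a distinct element.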
Each kernel has some equivalent tasks to fulfill. Listing 1 shows such a typical kernel execution scheme. First, each thread calculates its unique thread identifier using constructs provided by the CUDA environment. This identifier can be used to make control decisions and to compute the memory addresses of the input parameters. Finally, the calculated result is written back to global memory.

// derive absolute thread id from block id and relative thread id
int threadId = blockIdx.x * blockDim.x + threadIdx.x;
int parameter = dataGPU[threadId];  // read from array (using id as index)
result = parameter * parameter;     // calculate the result
dataGPU[threadId] = result;         // write to array (using id as index)

Listing 1: CUDA kernel execution scheme

There is a strict separation between kernel routines and normal program routines. Kernel code cannot access host memory. Kernel methods must not be recursive and must not use static variables.

Evaluation with an example

In order to gain a deep understanding of the programming model, I chose an active research problem that is complex enough to demonstrate the abilities and limits of

On Programming Models for Multi-Core Computers

CUDA. The n-queens puzzle fulfilled these requirements. Its goal is to find all ways to place a given number n of queens on a chessboard with n times n fields. A configuration is only valid if no queen attacks another one, which holds if no queen is placed in a row, column, or diagonal that is used by another queen. All solutions for this puzzle are known up to 26 queens. Preußer et al. from the University of Dresden [5] calculated all 22,317,699,616,364,044 valid configurations using specialized FPGA boards. Another project is by Figueroa from the Universidad de Concepción de Chile [6], who uses an approach similar to [17], where a middleware is provided that distributes work packages over the Internet. While in both cases calculations are done by the personal computers of registered users, the latter does not exploit their graphics cards. This is critical, because the statistics of the project [16] show that the use of graphics cards brings a huge performance benefit compared to common CPUs. Both projects are based on an optimized single-threaded algorithm written by J. Somers [23]. After investigating the problem and programming my own solution, I decided to take the algorithm of J. Somers as the basis for my CUDA version as well. This decision was founded on the fact that this algorithm does not use recursion, consumes little memory, and is widely accepted. The first and most important step was to parallelize the algorithm. I therefore modified it in a way that allowed the precalculation of board settings for a given number of rows. Its results are then used as the input for an algorithm that calculates all solutions starting from the given setting. Applied to CUDA, the first algorithm runs on the host and the second one as a kernel on the graphics card.
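For reference, the core of such a non-recursive solver can be sketched in plain C as follows (my own reconstruction of the idea behind Somers' algorithm, not his original code): occupied columns and both diagonal directions are tracked as bitmasks, and backtracking is driven by an explicit row index instead of recursion.

```c
/* Count all solutions of the n-queens puzzle without recursion (n < 32).
 * cols/diagL/diagR hold the squares attacked in each row as bitmasks. */
unsigned long long count_solutions(int n) {
    unsigned long long count = 0;
    unsigned all = (1u << n) - 1;       /* n low bits set: all columns */
    unsigned cols[32] = {0}, diagL[32] = {0}, diagR[32] = {0}, avail[32];
    int row = 0;
    avail[0] = all;
    while (row >= 0) {
        if (avail[row]) {
            unsigned bit = avail[row] & (0u - avail[row]); /* lowest free column */
            avail[row] ^= bit;          /* mark this column as tried */
            if (row == n - 1) { count++; continue; }
            cols[row + 1]  = cols[row] | bit;
            diagL[row + 1] = (diagL[row] | bit) << 1;
            diagR[row + 1] = (diagR[row] | bit) >> 1;
            row++;
            avail[row] = all & ~(cols[row] | diagL[row] | diagR[row]);
        } else {
            row--;                      /* backtrack */
        }
    }
    return count;
}
```

In the CUDA variant described above, the host would fix the queens of the first few rows and each GPU thread would run this loop starting from one of the precomputed partial boards.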
After the basic program was running, I modified it step by step to use fast shared memory instead of slow mapped device memory for the arrays. As shared memory is scarce on graphics cards, using more of it reduced the number of threads I could deploy.

16-queens problem      GeForce GTX 275    GeForce 8600M GS
NQueens v0.13          3.3 s              3.0 s
NQueens v0.14          2.8 s              3.5 s
NQueens v0.15          2.4 s              4.3 s
NQueens v0.17          2.5 s              4.7 s
NQueens v0.22          2.1 s              2.7 s

Figure 3: Runtime comparison of different versions of the NQueens program. The higher the version number, the more shared memory is used instead of device memory. While the optimizations lead to better performance on the latest graphics card (GeForce GTX 275), they result in worse performance on former CUDA-enabled cards (GeForce 8600M GS).

3 Programming Models for Parallel Computing

In Figure 3 the performance impact of these optimization steps is shown. Using a graphics card with the latest CUDA-enabled architecture from NVIDIA (GeForce GTX 275), the shift from device to shared memory led to a performance increase: the benefit of faster memory accesses exceeded the penalty of the reduced thread count. On the other hand, the same optimizations led to decreased performance on the former architecture (GeForce 8600M GS). Surprisingly, after the last optimization step the performance is significantly improved and outperforms the original program version. In the last version, shared memory is used for all arrays and the memory footprint per thread is minimized as well. The reason for the different results of the optimizations on the different architectures is a performance problem that arises from incoherent memory accesses on earlier CUDA-enabled architectures. This evaluation shows that general-purpose graphics cards are still far from CPUs, because programmers cannot even make assumptions about the effects that optimizations of a program will have. A program that is optimized for one architecture will not necessarily run fast on another CUDA-enabled architecture.

Conclusion

During my experience with the CUDA programming model I learned that there are many restrictions a programmer must cope with. First, there is the unavailability of recursion and the separation of CUDA subroutines from normal routines. This is not only very unfamiliar to C programmers, but leads to replicated code as well. The second is the focus on minimizing the memory usage of the algorithms. As a programmer, one has to think 'more carefully about memory access patterns and data structure sizes' [11]. Another problem with memory that I faced was the unpredictability of the resulting register count. I was not able to derive a pattern for the number of occupied registers from code changes.
Unfortunately, one needs to know the number of occupied registers at coding time to optimize the code accordingly. The last and most surprising restriction is that CUDA threads are only allowed to run for a very short period of time. Long-running CUDA kernels trigger a driver restart and will thus be canceled. If a program needs to run for several seconds or more, the display mode of the operating system must either be deactivated or additional graphics hardware must be used to take over the rendering work. There is still a lot of work to do until general-purpose programming for graphics cards is as comfortable as programming for CPUs.

3.2 Pushing the Limits: Processes and Threads

The usage of multiple processes and threads as workers is a widespread programming model. Inspired by an article by Mark Russinovich [20] and in cooperation with a vendor of complex enterprise applications, we tried to figure out what the limits in the usage of processes and threads are. The first approach, which is described in this section, was the creation of a model to predict the number of threads that can be created on a 64-bit Windows operating system. While lots of information is available for 32-bit Windows [20], there is little information available for a 64-bit Windows operating system.

After studying the Windows internals to figure out how much memory a thread consumes [21], we came up with a fairly simple model:

max_threads = (available_memory + available_pagefile_memory - program_heap) / (kernel_thread_size + thread_committed_memory)

kernel_thread_size = size(kernel_stack) + size(TEB) + size(ETHREAD)

The maximal thread count equals the amount of memory that is available for thread stacks divided by the size of a thread. A thread's size is the sum of its user-mode as well as its kernel-mode representation. Due to some additional kernel constructs, the kernel representation of a thread is negligibly bigger than in our model. In order to check the accuracy of this model, we compared it against some benchmark results.

Figure 4: Maximal thread count that can be achieved for different memory sizes on a 64-bit Windows 7 operating system using C, .NET, and Java, compared to a computed value based on our model.

Figure 4 shows the results of our evaluation. We found that our model fits very well for native C. The divergence is due to Address Space Layout Randomization (ASLR), which is included in Windows operating systems since Windows Vista. Using the Java environment, half of this thread count can be created. That penalty is due to the overhead of the thread management of the Java Virtual Machine. With the .NET framework, the possible thread count is even smaller. This is due to the fact that .NET relies on committed memory, which potentially wastes resources when they remain unused. Further inspection of the Shared Source Common Language Infrastructure (SSCLI) [12] would be necessary to check what improvements are possible here. While studying the Windows internals and investigating the limiting factors for thread creation within the operating system, many interesting questions arose that we will try to answer in the future.
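In code, the model is a single division; the function below restates it, with all sizes as parameters because the concrete Windows values (kernel stack, TEB, and ETHREAD sizes, per-thread committed memory) vary by version and are not given here:

```c
/* Thread-count model from above: the memory available for threads
 * divided by the per-thread cost (kernel-side representation plus
 * committed user-mode memory). All arguments are byte counts. */
unsigned long long max_threads(unsigned long long available_memory,
                               unsigned long long available_pagefile_memory,
                               unsigned long long program_heap,
                               unsigned long long kernel_stack,
                               unsigned long long teb,
                               unsigned long long ethread,
                               unsigned long long thread_committed_memory)
{
    /* kernel_thread_size = size(kernel_stack) + size(TEB) + size(ETHREAD) */
    unsigned long long kernel_thread_size = kernel_stack + teb + ethread;
    return (available_memory + available_pagefile_memory - program_heap)
           / (kernel_thread_size + thread_committed_memory);
}
```

Plugging in measured values for a given Windows build then yields the "computed" curve of Figure 4.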

4 Conclusion

This paper gives an overview of current architectural shifts and their impact on service computing. In Section 2, energy efficiency considerations are presented. The need for operating system and middleware support to reduce the power consumption of a system by deactivating unnecessary system parts is illustrated by the example of memory bricks. Furthermore, in Section 3 the emerging importance of programming models for multi-core computers is discussed. Taking CUDA as one representative, general-purpose GPU programming models were studied. The restrictions of these models were shown by evaluating a solution to the n-queens problem, and some of the limits and problems of parallel computing were examined. As shown in this paper, we face many interesting architectural shifts today. In order to let service computing benefit the most from these shifts, we need to get a better understanding of them and to revise our service notion appropriately.

References

[1] Advanced Micro Devices. ATI Stream Software Development Kit (SDK).
[2] Apple. Apple - Mac OS X - New technologies in Snow Leopard.
[3] Wu-chun Feng, Xizhou Feng, and Rong Ce. Green Supercomputing Comes of Age. IT Professional, 10(1):17-23, January.
[4] Dell. DELL Leitfaden für Stromversorgung & Kühlung. EMEA-de.pdf.
[5] Thomas B. Preußer, Bernd Nägel, and Rainer G. Spallek.
[6] Israel Figueroa.
[7] C. A. R. Hoare. Communicating Sequential Processes (CSP). Prentice Hall.
[8] Chung-hsing Hsu and Wu-chun Feng. A power-aware run-time system for high-performance computing. In SC '05: Proceedings of the 2005 ACM/IEEE Conference on Supercomputing, page 1, Washington, DC, USA. IEEE Computer Society.
[9] Intel. Intel Threading Building Blocks.
[10] Khronos. OpenCL - the open standard for parallel programming of heterogeneous systems.

[11] William Mark. Future graphics architectures. Queue, 6(2):54-64.
[12] Microsoft. Shared source common language infrastructure. March.
[13] Microsoft. Parallel computing developer center.
[14] NVIDIA. NVIDIA CUDA-enabled products. cuda_learn_products.html.
[15] NVIDIA. NVIDIA CUDA - Programming Guide. NVIDIA Corporation, August.
[16] Vijay Pande. - client statistics by OS.
[17] Vijay Pande. - distributed computing.
[18] J. Richling, J. H. Schönherr, G. Mühl, and M. Werner. Towards energy-aware multicore scheduling. PIK - Praxis der Informationsverarbeitung und Kommunikation, 32:88-95.
[19] Greg Ruetsch and Brent Oster. Getting started with CUDA.
[20] Mark Russinovich. Pushing the limits of Windows: Processes and threads. July.
[21] Michael Schöbel. NtCreateThread: memory allocations in kernel mode.
[22] David C. Snowdon, Etienne Le Sueur, Stefan M. Petters, and Gernot Heiser. Koala: a platform for OS-level power management. In EuroSys '09: Proceedings of the 4th ACM European Conference on Computer Systems, New York, NY, USA. ACM.
[23] Jeff Somers. The n queens problem - a study in optimization.

Towards a Service Landscape for a Real-Time Project Manager Dashboard
Thomas Kowark

Continuously improving possibilities for electronic communication have facilitated a shift from co-located towards geographically distributed development teams. While these virtual development teams combine the cumulative knowledge of their members without the need for costly travel or relocations, they also have to cope with the challenges that mainly indirect communication implies. Problems within the team structure or the communication behavior are much harder to recognize and can jeopardize project success. Hence, project managers should be able to get an overview of the communication patterns of their team during all, and especially early, stages of the development process. This report presents the extensions that have been applied to the platform in order to utilize it for capturing and analyzing the digital communication artifacts created by the members of software development teams. Furthermore, a case study is outlined that re-enacts a large-scale development process during the course of a university lecture. It thus provides extensive data about the digital communication footprints of software developers that can be compared to analog observations to reason about indicators for communication problems present within the digital traces.

1 Introduction

The digital footprint of developers is steadily increasing for various reasons. First of all, in geographically distributed teams (e.g., virtual development teams), electronic communication is often the fastest and most cost-effective way of communicating with other team members. But even co-located teams are producing more and more digital artifacts of their work because of modern collaborative application lifecycle management (ALM) tools such as Trac, CodeBeamer, or RedMine [9]. These tools aggregate the formerly widespread landscape of CASE tools into single applications.
Thus, only one system needs to be set up in order to provide functionality like bug tracking, wikis, or time management. However, some communication aspects, like instant messaging or e-mail, are in most cases still covered by external tools, even though some ALM solutions support those functionalities.

Platform

In order to provide a unified representation of this heterogeneous data, the platform has been developed [11]. It uses the Resource Description

Framework (RDF) [2] and the Web Ontology Language (OWL) [3] to define semantic descriptions of the underlying models along with all relations between the model elements (e.g., the different relations a person can have to an e-mail). The semantic description of domain ontologies is quite beneficial, since it (1) makes it easy to extend the platform with support for new types of resources and (2) simplifies the generation of meaningful queries on the data, since the data models are extensively described. So far, the platform has been used for a post-hoc analysis of the digital communication of virtual design teams. Those studies have revealed patterns within the team communication behavior that can be used to assess expected team performance in the early stages of the examined projects and to capture a variety of possible communication problems [10]. Accordingly, project managers might be able to intervene as soon as such patterns become apparent and thereby help to improve the overall project outcome. Additionally, side projects have been started to further extend the platform with means to visualize the gathered data or to enrich some of the artifacts with information about their importance for the overall team communication. The distinctive character of the evaluated projects has raised the question whether the observations and findings also hold true in other set-ups and team structures. We have therefore extended the existing platform to capture digital artifacts that are very common in software development processes and additionally set up a mid-sized software development project that serves as a case study. These two measures help us to investigate the digital communication behavior of software developers and compare it to the patterns found for engineering design teams.
The remainder of this research report is structured as follows: Section 2 presents related work in the field of project team communication analysis. Section 3 gives an overview of our recent research activities and the extensions that have been developed for the platform. Section 4 presents the outline of an upcoming case study, as well as other projects regarding the further improvement of the platform, and concludes the report.

2 Related Work

The analysis of development team communication in the field of software engineering has been the topic of preceding research and development efforts. Various collaborative application lifecycle management tools like the aforementioned Trac, RedMine, or CodeBeamer try to provide a single point of access for developers and, additionally, allow basic queries on the data present within the systems. The number of available data sources differs between those products, as does the flexibility of defining queries on this data. An approach that aggregates data from different data sources has been developed by Ohira et al. [6]. The Empirical Project Monitor relies on numerous feeder applications that parse data sources like source code management systems, bug trackers, or e-mail archives. It provides a number of preset visualizations for this

data. Additionally, an underlying communication model tries to detect flaws within the collaboration behavior based on empirical studies. Reiner [8] presented a proposal for a knowledge modeling framework that supports the collaboration of design teams. Communication information has been deduced from explicit interactions between members of a design team by a software tool that he developed as a prototypical implementation of the proposed framework. While all those approaches are able to create the same views on the available communication artifacts as our approach, the difference can be found within the representation of the internal models. The semantically annotated models of the platform can be easily extended with new ontologies by adding the respective resource description to the application. Additionally, external services that parse the information available in various sources can feed the networks with new data. This also allows simply re-using existing resource definitions for information from different implementations of the same concepts (e.g., different wiki types). The aforementioned implementations would require extensive adaptations of internal models to achieve the same behavior, and defining arbitrary queries on this data would require knowledge about those internal data structures.

3 Recent Work

During the last six months, the focus has been put on extensions to the platform that, in addition to the existing support for e-mails, wiki pages, and WebDAV folders, allow for an investigation of team dynamics, especially within software development teams. As shown by the highlighted boxes in Figure 1, the platform is now able to handle information retrieved from Subversion repositories (SVN), as well as wiki pages and bug tracking information available within CodeBeamer, a solution for collaborative application lifecycle management.
Both additions to the platform included the implementation of a so-called feeder service that connects to the respective data source (i.e., the SVN repository and the CodeBeamer web service interfaces) and generates a JavaScript Object Notation (JSON) representation of the data. These JSON objects have to follow the resource definitions specified within Resource Description Framework (RDF) documents that have been created for both conceptual models. The resulting ontologies allow the information to be incorporated into the team communication networks [11] created by the platform.

Using wiki, SVN, and e-mail information from previous projects of the development exercise of a software engineering lecture, we tried to identify patterns within those graphs that indicate strong or weak individual or team performances. Since only little information regarding the roles of the team members and their distinct characteristics had been collected during this lecture, only weak indicators, but no statistically significant evidence, could be derived that certain communication and

source code management behavior is beneficial for the project outcome.

Figure 1: Extensions to the service-landscape. E-mail archives, wiki logs, Subversion revisions, and CodeBeamer tickets/wiki feed the team communication networks, on which graph analyses are run for developers and researchers.

An example of such an indicator is the observation that team members with extreme check-in behavior seem to be more easily remembered by the project tutors. Generally, students who were responsible for many commits were considered to be heavy performers; those with very few were thought to have struggled to keep up with their team members. While those factors might indicate that a certain person is a good, or at least diligent, programmer, other factors are also important for project success. Recent studies indicate that the social network between the developers can be used to predict project failures [5]. Hence, good programming skills of single persons are no guarantee of an overall good team performance. To take this into account, a case study with a closer resemblance to real-life development processes and a more structured observation process became necessary to provide the foundation for meaningful data collection, analysis, and, accordingly, hypothesis creation [4]. The outline of the examined project and the methods used for analyzing the communication behavior of the project members are presented in the following section.

4 A Software Engineering Case Study

The implementation of the extensions for SVN and CodeBeamer information is the foundation for a case study that is performed during the upcoming winter term. The purpose of this study is to determine whether indicators exist within the collection of digital artifacts created by software developers that are evidential for individual team roles, the performance of individual team members, or the performance of the entire development team.
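The "extreme check-in behavior" indicator discussed above can be illustrated with a toy computation (entirely hypothetical; the platform's actual analyses operate on the semantic networks, and the 1.5-standard-deviation threshold is an arbitrary choice of mine):

```c
/* Flag a developer whose commit count lies further than 1.5 population
 * standard deviations from the team mean. Comparison is done on squared
 * deviations to avoid a sqrt() dependency. */
int has_extreme_checkins(const int *commits, int n, int who) {
    double mean = 0.0, var = 0.0;
    for (int i = 0; i < n; i++) mean += commits[i];
    mean /= n;
    for (int i = 0; i < n; i++)
        var += (commits[i] - mean) * (commits[i] - mean);
    var /= n;                           /* population variance         */
    double dev = commits[who] - mean;
    return dev * dev > 2.25 * var;      /* equivalent to |dev| > 1.5*sd */
}
```

Such a flag is at best a weak signal, which matches the paper's finding that commit counts alone yield no statistically significant evidence about performance.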

4.1 Project Outline

The focus of the study is the observation of the development teams of a software engineering lecture during the project part of the course. The project goal is to develop a basic enterprise resource planning (ERP) system for small to mid-sized companies in a joint effort by all approximately 80 participants. The students are guided by both research assistants and senior students as tutors, but the main responsibility for organizing the group work remains their own. The entire project is further subdivided into 13 smaller teams, each of which is responsible for a certain set of requirements. The initial requirements given to each team are not prioritized. Hence, the teams have to perform user research to define the requirements that the target product has to fulfill. Furthermore, certain requirements not only have an impact on single teams, but also affect multiple ones. Thus, the teams not only need to coordinate the work within their own teams, but also have to ensure the interoperability of components developed by different teams.

The SCRUM process was chosen as the basic framework guiding the collaboration. A team of six Product Owners is responsible for defining and prioritizing the requirements, as well as presenting them to the individual development teams. Furthermore, the Product Owners are responsible for evaluating the progress of the teams at the end of each development cycle, a so-called Sprint. Sprints last three weeks and are synchronized between all sub-teams. Within the sprints, the teams are required to have weekly meetings with the tutors to present recent developments, problems, and next steps. Delegates of the teams, so-called SCRUM-Masters, conduct an additional weekly meeting to further coordinate the work of the sub-teams by, for example, defining interfaces, discussing architectural decisions, or agreeing on communication guidelines.
The teams are provided with a Subversion repository for source code management, global and team-internal mailing lists, and a CodeBeamer installation for feature tracking and wiki functionality.

4.2 Observation Process

In order to avoid distortion of the gathered data, the tutors are not granted access to the unified data representation of the team communication networks and thus do not have the possibility to perform queries on that data. They do, however, have access to all artifacts created by the students and would theoretically be able to deduce all information that is stored within the platform. However, as the amount of artifacts grows over time, this process is supposed to become more and more complicated and time-consuming. On the other hand, tutors get to know the members of the teams they supervise and, accordingly, might search through the available data in a more target-oriented manner. As already mentioned, the tutors also have to log their impressions of the weekly team meetings. These logs have to include observations about the student behavior, e.g.:

Who are the team leaders?
Which team member is responsible for which aspect of the project?
How did the team perform during the last development cycle?
Are there any obvious problems within the team itself?
How is the communication between interdependent teams handled?

All information that is gathered during those meetings can later be used for a detailed analysis of the information stored within the team communication networks of the platform to answer the following research questions:

Do the team communication networks contain indicators for the observations made by the tutors?
Is it possible to determine patterns that indicate problems within the teams before they are noticed by the tutors?

In addition to these observations, the students are requested to perform the Belbin Team Role Inventory test [1] at the beginning of the project. This test assesses how the individual team members presumably behave within the team environment and what their alleged main characteristics are. Furthermore, the students are asked to complete a survey at the end of the lecture in which they evaluate the teamwork and the support by the tutors. With this data, it is possible to analyze the networks with a focus on the following questions:

Is it possible to deduce team communication network patterns that indicate certain roles of team members?
Which combinations of team roles might be beneficial for the project outcome, and how does this affect the team communication network?

4.3 Future Work

To further broaden the database for the evaluation of the platform as a project communication analysis tool, and not only focus on software development teams, the investigation of the communication artifacts created during the engineering design lecture that Uflacker et al. used for their initial research work will be continued as well.
This lecture is a cooperation with Stanford University and provides insights into the collaboration of heterogeneous, globally distributed engineering design teams. While the main objective during the upcoming months is the aforementioned data analysis, further evolution of the platform itself will also take place. The first topic in this area is the performance of the application. The expected size of the graph representing the development teams of the case study project is likely to reveal possible

215 REFERENCES Archives Wiki Logs Subversion Revisions Codebeamer Tickets / Wiki.. Team Communication Networks Graph Analysis Developer d.query d.see.. Researcher Figure 2: The service-landscape. performance bottlenecks in the storage of the networks in a relational database. We will therefore evaluate different approaches for storing the team communication networks in persistent storage spaces. Furthermore, evaluating new services that utilize the data captured within the and, thus, help evolving the landscape for a manager s dashboard (see Figure 2) will be a major topic for further research. This, for example, implies that new ways to visualize the gathered data have to be found since the new ontologies have different characteristics than the existing ones. While an approach to this problem has been developed with the d.see application, its applicability to the new models has yet to be determined. Another aspects that has to be considered is the generation of queries. While the existing SPARQL [7] interface provides a powerful mechanism to query for arbitrary values, the long term target is the usage of the platform by project managers. To achieve this goal, means have to be identified to use the semantic annotations of the models to create d.query, a simple interface for the query creation that does not rely on the authors to learn a distinctive query language but use natural language. References [1] Meredith Belbin. Management Teams. John Wiley & Sons, Fall 2009 Workshop 19-7

[2] D. Brickley and R. Guha. RDF Vocabulary Description Language 1.0: RDF Schema. W3C Recommendation, February 2004.
[3] M. Dean and G. Schreiber. OWL Web Ontology Language Reference. W3C Recommendation, February 2004.
[4] Kathleen M. Eisenhardt. Building theories from case study research. The Academy of Management Review, 14(4), 1989.
[5] Andrew Meneely, Laurie Williams, Will Snipes, and Jason Osborne. Predicting failures with developer networks and social network analysis. In SIGSOFT '08/FSE-16: Proceedings of the 16th ACM SIGSOFT International Symposium on Foundations of Software Engineering, pages 13–23, New York, NY, USA, 2008. ACM.
[6] Masao Ohira, Reishi Yokomori, Makoto Sakai, Ken-ichi Matsumoto, Katsuro Inoue, Michael Barker, and Koji Torii. Empirical Project Monitor: A system for managing software development projects in real time. In International Symposium on Empirical Software Engineering, Redondo Beach, USA.
[7] Eric Prud'hommeaux and Andy Seaborne. SPARQL Query Language for RDF. W3C Recommendation, January 2008.
[8] Kurt A. Reiner. A framework for knowledge capture and a study of development metrics in collaborative engineering design. PhD thesis, Stanford University, Stanford, CA, USA. Adviser: Larry J. Leifer.
[9] M. Rita Thissen, Jean M. Page, Madhavi C. Bharathi, and Toyia L. Austin. Communication tools for distributed software development teams. In SIGMIS-CPR '07: Proceedings of the 2007 ACM SIGMIS CPR Conference on Computer Personnel Research, pages 28–35, New York, NY, USA, 2007. ACM.
[10] Matthias Uflacker, Philipp Skogstad, Alexander Zeier, and Larry Leifer. Analysis of virtual design collaboration with team communication networks. In Proceedings of the 17th International Conference on Engineering Design (ICED '09), volume 8, 2009.
[11] Matthias Uflacker and Alexander Zeier. Capturing team information spaces with resource-based information networks. In IADIS International Conference WWW/Internet, Freiburg, Germany.

Towards Visualization of Complex, Service-Based Software Systems

Jonas Trümper

Traditional development tools such as IDEs and debuggers provide only partial support for developers to cope with a large system's complexity. Parallel execution in particular poses a huge challenge for developers, as it raises the system's runtime complexity by orders of magnitude: for example, synchronization has to be handled, and each execution thread has its own local stack and state. This work at the HPI Research School aims at developing concepts and tools for software visualization that help to cope with the complexity of such large software systems in various ways. Current research includes, but is not limited to, software visualization for debugging performance issues caused by flawed synchronization of shared memory access.

1 Motivation: Improve Productivity of the Software Development Process

Large software systems, in particular service-oriented software systems, typically consist of millions of lines of code, are maintained over a long period of time, and are developed by a large, diverse team. This poses an enormous challenge to developers in several dimensions. For example, (1) the knowledge about the whole system is typically distributed; that is, a single developer is no longer able to memorize the complete system structure with all its details. More precisely, each developer working on the system typically has detailed knowledge about one or a few parts of the system and is only roughly aware of the big picture. (2) The dependencies between system components may not be explicitly documented or visible. Dynamic binding due to polymorphism in object-oriented software systems complicates this even further, as the exact dependencies are only visible at runtime. (3) Documentation and actual system implementations often exhibit significant differences in practice.
Hence, the only reliable information sources are the actual implementation artifacts, e.g., source code and binaries. As a consequence, creating a mental map of such a large software system and understanding its internal dependencies by reading source code is a complex and time-consuming task for an individual. Locating bugs or identifying performance bottlenecks, however, requires even more: the specific interaction between the system's actors has to be understood. Concurrent execution introduces new classes of complexity. Among others, the source code becomes more complex, as synchronization must be handled, too. In addition to that, the complexity of the system's runtime behavior typically rises with each additional thread¹ running in parallel. And third, a new class of bugs and performance problems is introduced by the possibility of parallel access to shared memory. When it comes to multithreaded or even distributed applications, the complexity of the conventional debugging workflow (i.e., debugger, breakpoints, and step-by-step execution) typically rises by orders of magnitude. As software visualization is the art and science of generating visual representations of various aspects of software and its development process [1], it aims to help comprehend software systems and to improve the productivity of the software development process [1]. A lot of research exists in the area of software visualization for single-threaded applications; however, the visualization tools are typically incapable of exploring or analyzing parallel executions. Extending those single-threaded visualization tools is neither always possible nor straightforward, just as single-threaded applications cannot simply be switched to be multithreaded. Software analysis tools, especially for parallel executions, are supposed to help find answers to questions like: Why is the system's execution in that part of the implementation so slow? What is going on in this parallel execution? Which code segments actually run in parallel? Where is sequential execution forced by dependencies? Which visualization techniques actually provide real benefit in answering those types of questions is still unclear. In addition, each question or developer task may require completely different visualization strategies. So there is probably no one-size-fits-all visualization that is applicable to all kinds of issues introduced by parallel execution, and visualization needs to be tailored for each concrete task.
The remainder of this paper focuses on multithreaded software systems and on performance bugs caused by flawed synchronization between threads. The concepts are intended to be applicable to service-oriented systems as well. The paper is structured as follows: Chapter 2 outlines the peculiarities of debugging concurrent executions. Chapter 3 discusses existing work. Chapter 4 briefly presents a concept for software visualization of multithreaded applications using the example of performance bugs. Chapter 5 outlines planned next steps for the research.

2 Locating Performance Bugs in Concurrent Execution

Performance bugs in concurrent executions with multiple threads that access shared memory are typically caused either by an actually slow implementation or by flawed synchronization of the shared memory access. Whereas the former cause can be identified comparatively easily by means of, e.g., Call Stack Sampling (see Section 3.1), the latter cause is hard to identify: most of the calls to the flawed code run as fast as expected, but very few, random calls are very slow. Manually identifying the cause of a few outliers in a set of regular executions by reading the source code is hardly possible and typically requires guesswork. Likewise, examining such unintended behavior with a conventional debugger is a hard task in concurrent executions, because the concurrent paths have to be tracked mentally and the separate execution states have to be considered. Furthermore, it is typically not possible to predict which of the executions of a specific piece of code will constitute itself as an outlier, and so all executions of the respective code would have to be stepped through manually. The lack of proper tool support again and again causes developers to manually add debug output to the source code in order to track a system's runtime behavior and execution costs by hand. Tedious manual analysis of the execution costs has to be done afterwards. Unfortunately, the very important and valuable original execution context is lost in this kind of post-mortem analysis.

¹ Within this paper, the term thread is used to identify a separate execution with local storage and stack. That is, the term thread is used both for a separate execution within a process on a local machine and for a service on a separate machine.

3 Related Work

3.1 Data Acquisition and Performance Analysis

With regard to data acquisition concerning the runtime behavior of software systems, lightweight techniques were introduced that aim to support performance-related tasks. Call Stack Profiling (e.g., [2]) is a technique that records an instrumented system's call stack at specific time intervals. This, in turn, permits deriving averaged execution costs per function/method. Call Graph Profilers like gprof [2], introduced by Graham et al., additionally provide callee/call-site context and summed-up execution costs per callee/call-site combination.
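Why aggregate costs are problematic for the bug class at hand can be made concrete with a toy calculation; the call durations below are invented for illustration, not measured data:

```python
# Toy illustration (invented numbers): averaged profiling data, as produced
# by sampling profilers, can hide the rare outlier that a full function
# boundary trace still contains.

durations_ms = [2.1, 2.0, 2.2, 1.9, 2.0, 250.0, 2.1, 2.0]  # one slow outlier

# A sampling profiler effectively reports an average cost per function:
average = sum(durations_ms) / len(durations_ms)
print(f"average cost: {average:.1f} ms")  # neither clearly fast nor slow

# A full trace keeps every single call, so the outlier can be singled out,
# here via a simple "10x the median" rule (an arbitrary choice):
median = sorted(durations_ms)[len(durations_ms) // 2]
outliers = [(i, d) for i, d in enumerate(durations_ms) if d > 10 * median]
print("outliers:", outliers)
```

The average alone would only suggest a generally slow function, while the per-call view pinpoints the single irregular execution.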
With regard to the above-mentioned class of performance bugs (a few outliers in a huge set of regular executions), Call Stack Sampling and Call Graph Profiling provide only average execution costs. Developers may notice a generally slowed execution of a particular piece of code, but they won't be able to tell the slow outliers from the rest. Consequently, other data acquisition techniques, such as instrumenting profilers (e.g., PIN [7]), which are able to record Function Boundary Traces (FBT), are better suited for the identification of outliers. As a consequence of recording the full call history, time warping is possible during post-mortem analysis: the analyzed time slot within the recorded trace can be selected freely, and so time can be virtually turned back. Developers are able to reconstruct the cause chain and context that led to an event or function call without having to trigger another system run.

3.2 Visualization Tools

As a matter of fact, the size of FBTs typically exceeds the size of sampled and/or aggregated traces by orders of magnitude (e.g., several gigabytes of data), and so FBTs are considered not human-readable, although they may be stored in plain text. Thus, FBTs need to be visualized in a way that allows developers to explore their content efficiently. ThreadScope, proposed by Wheeler and Thain [8], is a tool that is able to visualize the execution of large, multithreaded applications as connected graphs. The graphs enable developers to visually identify bottlenecks in executions and possible deadlocks. The authors also propose techniques to reduce the graph size for specific developer tasks, e.g., filtering read-only access to memory. This simplifies the identification of erroneous synchronization that causes values to be overwritten unintentionally. However, their tool does not feature essential navigation and visualization techniques to simplify the analysis. The graphs become very large for typical logging durations, and as such, exploring them in full is again a tedious task. Zhao and Stasko present a tool to visualize multithreaded applications [9]. The tool supports visualization of lock usage, execution history, and a high-level view on thread activity. However, it lacks a vital feature: users can analyze the activity of single threads, but there is no synchronized multi-thread view which would permit actually examining parallel activities and their context. ParaVision, a tool by Nutt et al. [5], provides multiple views on the runtime behavior of parallelized systems. Unfortunately, the views do not scale for systems with a high number of threads running in parallel and for large execution traces.

4 Visualizing Function Boundary Traces of Multithreaded Systems to Locate Performance Bugs

Mentally tracking multiple concurrent executions within a debugger is a complex, expensive, and sometimes even close-to-impossible task. In this work, the approach is to provide substantial support by means of visualization so that developers can track and analyze those parallel executions with ease.
The concept is based on a single assumption: developers who want to locate performance bugs have typically read the source code before and know where synchronization objects (namely locks) are accessed. With this premise, the general concept is as follows: each separate (chosen) thread of execution is visualized in a separate graph, which permits analyzing each thread's activity separately and in parallel. The execution sequences are visualized using a stack-depth-based approach so that the resulting graph is as small as possible (see Figure 1). Time is mapped along the x-axis of the visualization and stack depth along the y-axis. Calling relations between functions are implicit; that is, if function foo calls bar, then bar is drawn below foo. Since the whole graph exceeds a typical screen size, panning and zooming are implemented so that developers can adjust the shown time frame according to their needs. Overview maps enable quick navigation in the graph and provide additional orientation within the visualized execution trace.
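The described layout can be sketched in a few lines; the enter/exit event format and the example data below are assumptions for illustration, not the prototype's actual trace format:

```python
# Hedged sketch: derive (function, depth, start, end) bars for one thread
# from function enter/exit events, matching the described layout (time on
# the x-axis, stack depth on the y-axis, callees drawn below their callers).

events = [  # (tick, kind, function) -- invented example data
    (0, "enter", "foo"),
    (1, "enter", "bar"),   # foo calls bar -> bar is one level deeper
    (3, "exit",  "bar"),
    (4, "exit",  "foo"),
]

def layout(events):
    stack, bars = [], []
    for tick, kind, name in events:
        if kind == "enter":
            stack.append((name, tick))
        else:
            name, start = stack.pop()
            # depth equals the number of callers still on the stack
            bars.append((name, len(stack), start, tick))
    return bars

print(layout(events))  # "bar" gets depth 1, drawn below "foo" at depth 0
```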

Figure 1: The prototype tool showing a sequence cutout of two threads. On top, overviews depict each thread's activity and allow the user to navigate directly to sections of interest.

5 Next Steps

5.1 Synchronized Navigation Across Thread Boundaries

Navigation within the visualized execution trace(s) has to be as intuitive as possible, so synchronization between the separate thread views is a must: modifying the shown time range in one of the views should also update the other views accordingly. However, synchronization is not trivial in this case. A non-linear time mapping is applied to execution times during import, because execution durations typically vary a lot. Without time mapping, long-lasting function/method executions would span multiple screens, whereas very short executions may only span a single pixel. As the current implementation applies the time mapping separately for each thread, there is no global time that could be used for synchronization purposes. It would probably be desirable to calculate such a time mapping on the fly for the currently shown time slot across the shown threads in order to ensure a consistent global time.
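A possible non-linear time mapping illustrates the trade-off; the logarithmic form and its constants are assumptions for illustration, as the text does not specify the actual mapping function:

```python
# Hedged sketch of a non-linear time mapping: durations are log-compressed
# so long calls fit on screen while very short calls stay visible. The
# concrete function and constants are illustrative assumptions.
import math

def map_duration(ticks, px_per_decade=40.0, min_px=1.0):
    """Map a duration in ticks to an on-screen width in pixels."""
    return max(min_px, px_per_decade * math.log10(1 + ticks))

print(round(map_duration(1), 1))        # a 1-tick call still gets ~12 px
print(round(map_duration(100_000), 1))  # a 100000-tick call needs ~200 px
```

Because such a mapping is applied per thread, equal wall-clock instants no longer align across thread views, which is exactly the synchronization problem described above.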

Figure 2: An execution trace interpreted as a black-and-white 2-dimensional raster image with the dimensions #functions x #ticks.

5.2 Evaluating Techniques to Identify Relevant Sections in Execution Traces

As a matter of fact, Function Boundary Traces usually describe thousands to millions of sequential or parallel function executions. Depending on the purpose for which the trace was recorded, the relevant sections therein may make up only a small percentage of the recorded data. So navigating to the relevant sections of the trace, where the synchronization of the analyzed applications is handled, can be a tedious task. Hence, the idea is to provide developers with pointers to the relevant time slots within such a trace. Kuhn and Greevy [4] propose to interpret execution traces as a signal in time and introduce a visualization that enables recognition of rising and dropping stack depth. Their assumption is that complex functionality typically causes large stacks to be created at runtime. Consequently, their visualization is tailored to analyzing the complexity of explicitly traced features. Inspired by Kuhn and Greevy, we aim to evaluate a similar interpretation of execution traces as a signal in time to identify outliers within the whole recorded execution. More precisely, the goal is to apply known image processing techniques to execution traces. Image registration is commonly used to, e.g., determine land mass movements in images generated by airborne microwave scanners. The movements are determined by calculating the correlation between two images and thus deriving the relative shift between them. We aim to re-use this idea in order to identify outliers caused by flawed synchronization. An execution trace can be thought of as a 2-dimensional matrix (or even a black-and-white 2-dimensional raster image) having the dimensions (#funcs, #ticks).
Whenever a function is active, the value of that cell/pixel is 1, otherwise 0 (see Figure 2). Using the Discrete/Fast Fourier Transform (DFT/FFT) [6], it is possible to efficiently calculate frequency representations of the recorded threads (see Figure 3). An almost regular execution containing a single outlier will cause the frequency representation of thread 2 to contain two peaks: the highest peak representing the regular executions and a lower peak representing the outlier. A virtually correct execution can be obtained by asking the user to select a part of the execution that is correct, i.e., has acceptable execution costs. The differences between the frequency representations of the flawed execution and the virtual execution with correct synchronization provide hints to the outliers.
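The idea can be sketched with a naive pure-Python DFT on an invented 0/1 activity signal; this is only an illustration of the principle, not the prototype's implementation:

```python
# Hedged sketch: one thread's activity as a binary signal over ticks. A
# regular execution has clean spectral peaks; an intermediate hang shifts
# energy to other frequencies, so the two spectra differ. Signals invented.
import cmath

def dft_magnitudes(signal):
    """Naive O(n^2) DFT, sufficient for a short illustrative signal."""
    n = len(signal)
    return [abs(sum(signal[t] * cmath.exp(-2j * cmath.pi * k * t / n)
                    for t in range(n)))
            for k in range(n)]

regular   = [1, 0] * 32                          # foo() on every other tick
irregular = [1, 0] * 20 + [0] * 8 + [1, 0] * 8   # same pattern with a hang

diff = [abs(a - b) for a, b in
        zip(dft_magnitudes(regular), dft_magnitudes(irregular))]
print(max(diff) > 0.0)  # True: the hang is visible as a spectral difference
```

The non-zero entries of `diff` correspond to the frequencies contributed by the irregular section, which is what the proposed comparison against a user-selected "correct" part of the trace would exploit.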

Figure 3: Example of time and frequency domain representations of an execution trace. The time domain panels show the stack depth of two threads over time (lock release/retrieve between regular foo() executions), once for regular executions and once for irregular executions with an intermediate hang in thread 1; the frequency domain panels show the amplitude spectra obtained by applying the FFT to thread 1 (thread 2 analogously).

The inverse Fourier transform of a frequency spectrum containing only the differences (outliers) will provide a virtual trace with a repeating execution of the outlier. Superposing this virtual trace on top of the recorded thread trace then enables developers to identify the outliers within the actual thread traces. Alternatively, data mining techniques could be applied; e.g., the Subgroup Discovery Problem [3] describes a similar problem: a huge data set containing mostly regular values and only a few irregular values. The goal then is to identify the largest possible subset having different characteristics than all other elements (such as the very few irregular executions).

5.3 Further Applications of Frequency Spectra Analysis

Interpreting execution traces as a signal in time can probably be useful in tackling other debugging problems as well. For example, a frequency domain comparison of two traces of the same feature (one recorded before introducing a bug and one afterwards) will give some indication of the differences and thus likely help to find the bug in the proximity of the differences.

References

[1] Stefan Diehl. Software Visualization: Visualizing the Structure, Behaviour, and Evolution of Software. Springer, Berlin.
[2] Susan L. Graham, Peter B. Kessler, and Marshall K. McKusick. gprof: A call graph execution profiler. In SIGPLAN '82: Proceedings of the 1982 SIGPLAN Symposium on Compiler Construction, New York, NY, USA. ACM.
[3] Willi Klösgen. Handbook of Data Mining and Knowledge Discovery. Oxford University Press, Inc., New York, NY, USA.
[4] Adrian Kuhn and Orla Greevy. Exploiting the analogy between traces and signal processing. In ICSM '06: Proceedings of the 22nd IEEE International Conference on Software Maintenance, Washington, DC, USA, 2006. IEEE Computer Society.
[5] G. J. Nutt, A. J. Griff, J. E. Mankovich, and J. D. McWhirter. Extensible parallel program performance visualization. In International Symposium on Modeling, Analysis, and Simulation of Computer Systems, page 205.
[6] Tao Pang. An Introduction to Computational Physics. Cambridge University Press, New York, NY, USA.
[7] Vijay Janapa Reddi, Alex Settle, Daniel A. Connors, and Robert S. Cohn. Pin: A binary instrumentation tool for computer architecture research and education. In WCAE '04: Proceedings of the 2004 Workshop on Computer Architecture Education, page 22, New York, NY, USA. ACM.
[8] Kyle Wheeler and Douglas Thain. Visualizing massively multithreaded applications with ThreadScope. Concurrency and Computation: Practice and Experience.
[9] Qiang A. Zhao and John T. Stasko. Visualizing the execution of threads-based parallel programs. Technical report, Georgia Institute of Technology.

Web Service Generation and Data Quality Web Services

Tobias Vogel

1 Overview

My research on service-orientation during my first five months in the Research School was twofold. In the first, more practical part, I examined how existing applications can be provided for service-oriented architectures. More concretely, I integrated a service to wrap web applications into the PoSR framework, which has been submitted to the IEEE Services Cup. Second, I investigated the provisioning of data quality Web Services for duplicate detection. I examined differences between traditional and service-oriented duplicate detection and identified seven separate problem classes and their corresponding properties.

2 Potsdam Service Repository (PoSR)

The joint PoSR project is a collaboration between the Business Process Technology group and the Information Systems group at HPI. It serves as a showcase to investigate the benefits and usefulness of Web Service composition: existing Web Services are found by a crawler or generated on top of existing web applications. References to both the services and their metadata are stored in the repository, where service requesters can search for appropriate services. Services can be graphically aggregated to composite Web Services, while at the same time user interfaces are automatically created, both for the single services and for their compositions, based on the discovered WSDL descriptions. My contribution was the web application wrapper, for which I developed a methodology to semi-automatically generate Web Service (SOAP) interfaces for multi-stepped web applications. Online services, e.g., shopping sites or travel information sites, obtain their user input via single HTML forms or sequences of them. With this user-provided information, web pages are created dynamically; this part of the web is called the Deep Web. This interaction model works well for interactions between humans and web applications.
It is not feasible if computer programs have to use the functionality offered by web applications autonomously, for example to integrate a web application's functionality as a business process into existing landscapes. To allow this, it might be desirable to obtain Web Service interfaces for this functionality; however, in general, they are not offered. Thus, they can be generated by third parties on top of the existing, published web applications, which is possible with the developed Generator. The generated Web Services use the web applications on behalf of the user by imitating the HTTP communication between the server and the user's web browser, i.e., by sending appropriate sequences of HTTP requests to the web server. Figure 1 shows how the wrapper integrates into the communication between client and server and thus provides a SOAP interface for the web application. The particular communication protocol, i.e., the sequence and characteristics of the respective HTTP calls, is implemented in the Web Service. The following sections show how this knowledge is derived from web applications.

Figure 1: The upper part of the figure shows the intended interaction with the web application: a human user employing a web browser. Below, a wrapper is located between the web application and the client, accepting SOAP calls and submitting corresponding HTTP requests to the web server to imitate the upper behavior. Now, the web application is accessible as a Web Service.

2.1 Web Application Model

Web applications are computer programs that use HTML pages as their user interface. User input is provided by HTML forms. Unlike traditional applications, web applications expose internal information, e.g., variable names or data types, in these forms. Furthermore, in multi-stepped web applications, not all information is disclosed in each form; it might appear in only one form or another, instead. This can be illustrated in a table (Figure 2) where columns are steps (or forms) and rows are the web application's variables, which appear occasionally. This is called the web application's model, the set of all its internal variables.

Figure 2: A web application's model containing 6 model elements from 3 forms.

The Web Service caller needs a concise Web Service description with all the relevant, use-case-dependent input parameters, which are normally submitted little by little when visiting the multi-stepped web application with a browser. The generated Web Service takes these parameters and sends them to the web application. However, to function properly, static, non-use-case-dependent variables (e.g., hidden variables) have to be submitted as well and thus have to be contained in the Web Service's implementation. The information needed for this is retrieved from the model. The main challenge is to aggregate form elements appearing in different forms to the same variable in the model. This can be achieved with the aforementioned exposed information, illustrated in Listing 1.

43 <tr>
44   <td>
45     <label for="telfield">
46       Tel:
47     </label>
48     <input type="text" name="telephone" value=""
49            id="telfield" class="addressfield"
50            maxlength="40">
51   </td>
52   <td>
53     Insert your telephone number
54   </td>
55 </tr>

Listing 1: Example snippet from an address entry form

Each element of a form has a name and a type (line 48), which are the core information of any form element. Furthermore, there might be an initial value that is set when the form is loaded. Additional meta information might be available: in the example, an id and a class attribute are given (line 49). While these may already provide some hints for a human user, better descriptions can be found in the visibly rendered text, as in lines 46 and 53. These descriptions can be provided as labels (lines 45 to 47) or as text in close proximity (line 53) to the form element [5, 8].

2.2 Form Element Matching

To wrap a web application, the relevant function has to be run through by a human user while the HTTP traffic is monitored by the Web Service generator and complete forms, as well as their form elements, are analyzed. In each step, an attempt is made to match all new form elements to existing model elements that have been found in previous steps. The decision whether or not to match two elements is based on a similarity measure. All of the information mentioned in Section 2.1 is used to estimate this similarity; however, different weights are applied, owing to the different degrees of uncertainty of the attributes. The most reliable attribute is the form element's name, which is unlikely to change across different forms. Descriptive information (surrounding text, class attributes, etc.) might not be unique to a form element, which is why it adds a lower contribution to the overall similarity measure. The values of sent and received form elements provide the third contribution. In my experiments, I chose the weights 1.0/0.75/0.5 (in the same order). I further applied a threshold of 0.6 to relax the matching, so that not every form element is forcefully matched to an existing model element, but new model elements can be created (Figure 3; compare to Figure 2 in step 3).
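The matching step can be sketched as follows. Only the weights (1.0/0.75/0.5) and the 0.6 threshold come from the text above; the concrete similarity function (difflib string ratios over name, description, and value) and the example data are assumptions for illustration:

```python
# Hedged sketch of weighted form-element-to-model-element matching.
from difflib import SequenceMatcher

W_NAME, W_DESC, W_VALUE = 1.0, 0.75, 0.5   # weights from the text
THRESHOLD = 0.6                            # threshold from the text

def sim(a, b):
    return SequenceMatcher(None, a, b).ratio()

def score(form_el, model_el):
    s = (W_NAME  * sim(form_el["name"],  model_el["name"]) +
         W_DESC  * sim(form_el["desc"],  model_el["desc"]) +
         W_VALUE * sim(form_el["value"], model_el["value"]))
    return s / (W_NAME + W_DESC + W_VALUE)  # normalize to [0, 1]

model = [{"name": "telephone", "desc": "Tel", "value": ""}]

for el in [{"name": "telephone", "desc": "Telephone number", "value": ""},
           {"name": "captcha",   "desc": "Type the letters", "value": ""}]:
    best = max(model, key=lambda m: score(el, m))
    if score(el, best) >= THRESHOLD:
        print(el["name"], "-> aggregated to", best["name"])
    else:
        model.append(el)  # below threshold: becomes a new model element
        print(el["name"], "-> new model element")
```

Here the "telephone" field clears the threshold and is aggregated to the existing model element, while the unrelated "captcha" field falls below it and creates a new model element, mirroring the behavior described for F_4 in Figure 3.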
The Web Service is provided with the full model as well as the information when to send which form with which model elements. The actual values are read from the Web-Service-invoking SOAP message. To allow the Web Service caller to provide meaningful data, the model element data is also used to create a WSDL file containing the descriptions, default values, types, etc. for all user-definable (i.e., non-hidden, non-disabled) model elements.

2.3 Outcome

The approach successfully matches corresponding form elements to model elements. With this, it is possible to create single-call Web Services on the basis of multi-stepped web applications. However, the current prototypical implementation does not work on every existing web application because, for example, the generated Web Services are incapable of executing JavaScript, understanding HTTPS, or solving CAPTCHAs. I documented the final outcome in a paper [7] which was presented at the Workshop on Engineering Service-Oriented Applications (WESOA) in November.

Figure 3: Matching algorithm during step 3. The left part shows five already known model elements (M_A to M_E) at the top and four form elements (F_1 to F_4) discovered in this step. The lines between them illustrate the similarity with regard to the three different similarity metrics. The table on the right shows the summed-up similarities between all possible pairs. A shaded cell indicates that the form element is regarded as a representative of the corresponding model element and is thus aggregated to it. Similarities below the threshold are ignored. M_C and M_D are not matched and therefore are not represented by any form elements in this step. F_4 could not be matched either and thus constitutes a new model element of its own.

3 Data Quality Web Services

Data quality plays an important role for entrepreneurial success. However, many companies do not care about the data quality in their ERP or CRM systems, as recent studies show.¹ Several measures can be taken to increase data quality, e.g., data normalization, duplicate detection, data fusion, etc. In my research, I concentrate on the detection of duplicates. Traditional duplicate detection employs well-established algorithms and heuristics which, in short, search through a database and estimate the similarity of pairs of tuples based on data type, value, and additional information to identify these pairs as possible duplicates. However, sometimes the amount of available information is restricted: the schema might not be up to date, the mapping is unclear, privacy issues prevent full access to all the data, etc. Thus, the research question is which information can be left out while still achieving appropriate results; in other words: which information is essential for a duplicate detection process and which information has to be inferred from the data. Web Services share many characteristics with the restrictions mentioned before.
¹ data-quality-study-reveals-businesses-still-face-significant-challenges/

Usually, they do not have access to the full amount of data and rely on the provided input information, generally available information, and/or other Web Services. They are invoked on demand with exactly the information that is to be decided about, e.g., when only a small number of items has to be tested for similarity in an ad-hoc manner. Further, they provide a clearly specified functionality while remaining as general as possible to ensure a broad number of possible service requesters. These properties turn Web Services into an ideal foundation for evaluating duplicate detection algorithms under the limitations described above. There are a number of providers and registries for data quality Web Services². They provide services for data alignment (e.g., formatting telephone numbers properly), data completion (e.g., adding postcodes to cities), or data verification (e.g., checking ISBNs against book titles). However, there are no Web Services for duplicate detection.

3.1 Comparison Between Traditional and Service-Oriented Duplicate Detection

Traditional duplicate detection algorithms operate on whole databases and have relatively fast access to all tuples, which allows them to perform statistical analysis on this data, e.g., on the distribution of values in specific columns or the detection of key constraints. These algorithms further base their similarity measures on the tables' data types. This makes it possible to compare telephone numbers differently from names of persons. Attribute names also provide useful information, because, for example, similar family names are more significant than matching first names. Furthermore, traditional algorithms are provided with a mapping between the schemas of the two data items to compare. Finally, the attributes of the items are distinguishable.
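The benefit of knowing data types can be illustrated with a small sketch. The dispatch table, the two comparators, and the type labels are illustrative assumptions; they are not taken from the paper.

```python
import re
from difflib import SequenceMatcher

def phone_sim(a, b):
    """Phone numbers: compare digits only, so formatting differences vanish."""
    digits_a, digits_b = re.sub(r"\D", "", a), re.sub(r"\D", "", b)
    return 1.0 if digits_a == digits_b else 0.0

def name_sim(a, b):
    """Person names: fuzzy string similarity tolerates typos."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()

# Knowing the data type lets us pick the appropriate comparator.
SIM_BY_TYPE = {"phone": phone_sim, "name": name_sim}

def attr_sim(a, b, data_type):
    return SIM_BY_TYPE[data_type](a, b)

print(attr_sim("(030) 1234-56", "030 123456", "phone"))  # → 1.0
```

Without the type information, a service would have to fall back on a single generic comparator, and differently formatted but identical phone numbers would look dissimilar.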
In contrast, duplicate detection Web Services would operate only on the provided values, as they do not access the Web Service requester's data storage. They might not have data type information, since they are only provided with attributes and their values, which are deserialized as strings. Further, the attribute names might also be missing, so that only lists of values are provided. The mapping can be given (e.g., by the order of the attributes) but might also be unknown (or the order is insignificant). Finally, even field separators might be missing, e.g., when two pieces of textual information are compared. To sum up, there are four different pieces of information that might or might not be available:

1. Field separators
2. Mapping
3. Attribute names
4. Data types
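How the comparison degrades as these pieces of information disappear can be sketched as follows. This is a minimal sketch under assumptions made here: generic string similarity as the only comparator, positional alignment when a mapping is given, and best-pairwise alignment when it is not.

```python
from difflib import SequenceMatcher

def string_sim(a, b):
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()

def record_sim(r1, r2, has_separators=True, has_mapping=True):
    """Compare two records (tuples of strings) under different availability
    assumptions. Without field separators, each record is one opaque string;
    without a mapping, fields are aligned by best pairwise similarity."""
    if not has_separators:
        return string_sim(" ".join(r1), " ".join(r2))
    if has_mapping:
        # fields correspond by position
        return sum(string_sim(a, b) for a, b in zip(r1, r2)) / len(r1)
    # no mapping: align each field of r1 with its most similar field of r2
    return sum(max(string_sim(a, b) for b in r2) for a in r1) / len(r1)

pair = (("Peter", "Miller"), ("Miller", "Peter"))
print(record_sim(*pair, has_mapping=False))  # field order ignored → 1.0
```

Note how the same value pair scores differently depending on whether a mapping is assumed: with positional alignment the swapped fields look dissimilar, without it they match perfectly.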

Combinatorially, this results in 2^4 = 16 different combinations. However, further examination has shown that only seven combinations are possible or practically relevant.

3.2 Classes of Different Duplicate Detection Problems

Figure 4: Seven different classes of duplicate detection illustrated as a decision tree. Left means available, right means not available.

The four decisions result in seven classes of duplicate detection problems, depicted in Figure 4. They are briefly characterized in the following.

1. Availability of separators, mapping, attribute names, and data types: this is most similar to the traditional duplicate detection task. However, in this case only a few tuples are available.

2. Availability of separators, mapping, and attribute names: data type information is missing. This can be the case when the input is provided in JSON format, for example. The mapping is derived from the attribute names or the order of the attributes. Data type information has to be inferred before it can be used for the similarity estimation.

3. Availability of separators and mapping: attribute names are missing. Thus, only lists of values can be provided, while the mapping might be specified by the order of the items in the list, for example. To apply corresponding algorithms, the attributes have to be classified. The better this classification works, the more tailored the algorithms that can be used.
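The collapse from 16 raw combinations to seven classes can be reproduced with two dependency rules. These rules are one plausible reading of the decision tree, not necessarily the paper's exact argument: without field separators, the other three questions are moot; and when attribute names are available, a mapping can be derived from them.

```python
from itertools import product

def normalize(sep, mapping, names, types):
    """Collapse a raw (separators, mapping, names, types) combination into
    its effective class key, under two assumed dependency rules."""
    if not sep:
        # no separators: mapping, names, and types are meaningless
        return (False, False, False, False)
    if names:
        mapping = True  # a mapping is derivable from the attribute names
    return (True, mapping, names, types)

classes = {normalize(*combo) for combo in product([True, False], repeat=4)}
print(len(classes))  # the 16 raw combinations collapse to 7 classes
```

Under these two rules the count matches the seven classes of Figure 4, which suggests they capture the intended dependencies.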

