| Title: | COMBINING INFORMATION EXTRACTION AND DATA INTEGRATION IN THE ESTEST SYSTEM |
Author(s): | Dean Williams and Alexandra Poulovassilis |
| Abstract: | We describe an approach which builds on techniques from Data Integration and Information Extraction
in order to make better use of the unstructured data found in application domains
such as the Semantic Web which require the integration of information from
structured data sources, ontologies and text.
We describe the design and implementation of the ESTEST system which integrates available
structured and semi-structured data sources into a virtual global schema which is used to
partially configure an information extraction process.
The information extracted from the text is merged with this virtual global database and is
available for query processing over the entire integrated resource.
As a result of this semantic integration, new queries can now be answered
which would not be possible from the structured and semi-structured data alone.
We give some experimental results from the ESTEST system in use. |
| Title: | A FRAMEWORK FOR THE DEVELOPMENT AND DEPLOYMENT OF EVOLVING APPLICATIONS - Elaborating on the Model Driven Architecture Towards a Change-resistant Development Framework |
Author(s): | Georgios Voulalas and Georgios Evangelidis |
| Abstract: | Software development is an R&D intensive activity, dominated by human creativity and diseconomies of scale. Current efforts focus on design patterns, reusable components and forward-engineering mechanisms as the right next stage in cutting the Gordian knot of software. Model-driven development improves productivity by introducing formal models that can be understood by computers. Through these models the problems of portability, interoperability, maintenance, and documentation are also successfully addressed. However, the problem of evolving requirements, which is more prevalent within the context of business applications, additionally calls for efficient mechanisms that ensure consistency between models and code, and enable seamless and rapid accommodation of changes, without severely interrupting the operation of the deployed application. This paper introduces a framework that supports rapid development and deployment of evolving web-based applications, based on an integrated database schema. The proposed framework can be seen as an extension of the Model Driven Architecture targeting a specific family of applications. |
| Title: | SMART BUSINESS OBJECT - A New Approach to Model Business Objects for Web Applications |
Author(s): | Xufeng (Danny) Liang and Athula Ginige |
| Abstract: | At present, there is a growing need to accelerate the development of web applications and to support
continuous evolution of web applications due to evolving business needs. The object persistence capability
and web interface generation capability in contemporary MVC (Model View Controller) web application
development frameworks and model-to-code generation capability in Model-Driven Development tools have
simplified the modelling of business objects for developing web applications. However, there is still a
mismatch between the current technologies and the essential support for high-level, semantic-rich modelling
of web-ready business objects for rapid development of modern web applications. Therefore, we propose a
novel concept called Smart Business Object (SBO) to solve the above-mentioned problem. In essence,
SBOs are web-ready business objects. SBOs have high-level, web-oriented attributes such as email, URL,
video, image, document, etc. This allows SBOs to be modelled at a higher level of abstraction than
traditional modelling approaches. A lightweight, near-English modelling language called SBOML (Smart
Business Object Modelling Language) is proposed to model SBOs. We have created a toolkit to streamline
the creation (modelling) and consumption (execution) of SBOs. With these tools, we are able to build fully
functional web applications in a very short time without any coding. |
| Title: | MEASURING EFFECTIVENESS OF COMPUTING FACILITIES IN ACADEMIC INSTITUTES - A NEW SOLUTION FOR A DIFFICULT PROBLEM |
Author(s): | Smriti Sharma and Veena Bansal |
| Abstract: | There has been a constant effort to evaluate the success of Information Technology in organizations. This kind of investment is extremely hard to evaluate because of the difficulty of identifying tangible benefits, as well as high uncertainty about achieving the expected value. Although a lot of research has taken place in this direction, not much has been written about evaluating IT in non-profit organizations such as educational institutions. Measures for evaluating the success of IT in such institutes are markedly different from those of business organizations. The purpose of this paper is to build further upon the existing body of research by proposing a new model for measuring the effectiveness of computing facilities in academic institutes. As a baseline, DeLone & McLean’s model for measuring the success of Information Systems (DeLone & McLean 1992, DeLone & McLean 2003) is used, as it is the pioneering model in this regard. |
| Title: | DISCOVERY AND AUTO-COMPOSITION OF SEMANTIC WEB SERVICES |
Author(s): | Philippe Larvet and Bruno Bonnin |
| Abstract: | In order to facilitate the on-demand delivery of new services for mobile terminals as well as for fixed phones, we propose a user-centric solution based on Semantic Service-Oriented Architecture (SSOA) for instant building and delivery of new services composed of existing Web services discovered and assembled on the fly. This solution, based on semantic descriptions of Web services, is made of three main mechanisms: a semantic service discoverer, transparent to the user, finds the pertinent Web services matching the user's original request, expressed vocally, by SMS or as simple text; a semantic service composer, using the semantic descriptions of the Web services, combines and orchestrates the discovered services in order to build a new service fully matching the user's request; and a service deliverer makes the new service immediately accessible to the user. |
| Title: | PCA-BASED DATA MINING PROBABILISTIC AND FUZZY APPROACHES WITH APPLICATIONS IN PATTERN RECOGNITION |
Author(s): | Luminita State, Catalina Cocianu, Panayiotis Vlamos and Viorica Stefanescu |
| Abstract: | The aim of the paper is to develop a new learning-by-examples PCA-based algorithm for extracting skeleton information from data to assure both good recognition performance and generalization capabilities. Here the generalization capabilities are viewed twofold: on one hand, to identify the right class for new samples coming from one of the classes taken into account and, on the other hand, to identify the samples coming from a new class. The classes are represented in the measurement/feature space by continuous repartitions, that is, the model is given by a family of density functions indexed by H, where H stands for the finite set of hypotheses (classes). The basis of the learning process is represented by samples of possibly different sizes coming from the considered classes. The skeleton of each class is given by the principal components obtained for the corresponding sample. The recognition algorithm results from a defuzzification technique that identifies the class whose skeleton is the “nearest” to the tested example, where the closeness degree is expressed in terms of the amount of disturbance determined by the decision of allotting it to the corresponding class. |
| Title: | PROGRAM VERIFICATION TECHNIQUES FOR XML SCHEMA-BASED TECHNOLOGIES |
Author(s): | Suad Alagic, Mark Royer and David Briggs |
| Abstract: | Representation and verification techniques for XML Schema types,
structures, and applications in the program verification system PVS are
presented. Type derivations by restriction and extension as defined in
XML Schema are represented in the PVS type system using predicate
subtyping. Availability of parametric polymorphism in PVS makes it
possible to represent XML sequences and sets via PVS theories.
Powerful PVS logic capabilities are used to express complex
constraints of XML Schema and its applications. Transaction
verification methodology developed in the paper is based on
declarative, logic-based specification of the frame constraints and
the actual transaction updates. A sample XML application given in the
paper includes constraints typical for XML schemas such as keys and
referential integrity, and in addition ordering and range constraints.
The developed proof strategy is demonstrated by a sample transaction
verification with respect to this schema. The overall approach has a
model theory based on the view of XML types and structures as
theories. The core of this model theory is also presented in the
paper. |
| Title: | ADMIRE FRAMEWORK: DISTRIBUTED DATA MINING ON DATA GRID PLATFORMS |
Author(s): | Nhien An Le Khac, Tahar Kechadi and Joe Carthy |
| Abstract: | In this paper, we present the ADMIRE architecture, a new framework for developing novel and innovative data mining techniques to deal with very large and distributed heterogeneous datasets in both commercial and academic applications. The main ADMIRE components are detailed, as well as its interfaces, which allow users to efficiently develop and implement their data mining application techniques on a Grid platform such as Globus Toolkit, DGET, etc. |
| Title: | USAGE TRACKING LANGUAGE: A META LANGUAGE FOR MODELLING TRACKS IN TEL SYSTEMS |
Author(s): | Christophe Choquet and Sébastien Iksal |
| Abstract: | In the context of distance learning and teaching, the re-engineering process needs feedback on the learners' usage of the learning system. The feedback is given by numerous vectors, such as interviews, questionnaires, videos or log files. We consider that it is important to interpret tracks in order to compare the designer’s intentions with the learners’ activities during a session. In this paper, we present the usage tracking language (UTL). This language is designed to be generic, and we present an instantiation of a part of it with IMS-Learning Design, the representation model we chose for our three years of experiments. |
| Title: | ON CONTEXT AWARE PREDICATE SEQUENCE QUERIES |
Author(s): | Hagen Höpfner |
| Abstract: | Due to the limited input capabilities of small mobile information
system clients like mobile phones, it is not a must to support a
descriptive query language like SQL. Furthermore, information systems
with mobile clients have to address characteristics resulting from
clients' mobility as well as from wireless communications. These
additional functions can be supported by a reasonable, well-defined
notation of queries. Moreover, such systems should be context
aware. In this paper we present a query notation named "context aware
predicate sequence queries" which respects these issues.
|
| Title: | CLICKSTREAM DATA MINING ASSISTANCE - A Case-Based Reasoning Task Model |
Author(s): | Cristina Wanzeller and Orlando Belo |
| Abstract: | This paper presents a case-based reasoning system to assist users in knowledge discovery from clickstream data. The system is especially oriented to store and make use of the knowledge acquired from the experience in solving specific clickstream data mining problems inside a corporate environment. We describe the main design, implementation and characteristics of this system. The system was implemented as a prototype Web-based application, centralizing the past mining processes in a corporate memory. Its main goal is the decentralized recommendation of the most suitable mining strategies to address the problem at hand, accepting as inputs the characteristics of the available clickstream data and the analysis requirements. The system also takes advantage of and integrates related corporate information resources, supporting a semi-automated data gathering approach across the organization. |
| Title: | A DATA MINING APPROACH TO LEARNING PROBABILISTIC USER BEHAVIOR MODELS FROM DATABASE ACCESS LOG |
Author(s): | Mikhail Petrovskiy |
| Abstract: | The problem of user behavior modeling arises in many fields of computer science and software development. User models play an important role in recommendation and collaborative filtering systems, in intrusion detection systems and in some tasks of software engineering. In this paper we investigate a data mining approach for learning probabilistic user behavior models from database usage logs. We propose a simple but effective procedure for translating database traces into a representation suitable for the application of user behavior modeling techniques based on sequential, associative or classification data mining models. However, most existing methods have a serious drawback: they rely on the order of actions and ignore time intervals between actions. To avoid this problem we propose a novel method based on a combination of a decision tree classification algorithm and an empirical time-dependent feature map, motivated by potential functions theory. As a result, the designed method generates understandable probabilistic user behavior models that take time dependencies into account. The performance of the proposed method was experimentally estimated on real-world data. The comparison to the results of state-of-the-art data mining methods has confirmed outstanding performance of our method in predictive user behavior modeling and has demonstrated competitive results in anomaly detection. |
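A time-dependent feature map of the kind mentioned in this abstract can be pictured with the following minimal sketch. It is not the author's method: the exponential-decay kernel, the `tau` parameter and the name `time_features` are assumptions made for illustration. Each action type gets one feature whose value is the sum of decayed contributions from its past occurrences, so elapsed time, and not only the order of actions, shapes the vector that a classifier such as a decision tree would receive.

```python
import math

def time_features(events, now, action_types, tau=60.0):
    """Map a log of (timestamp, action) pairs to one decayed feature per action type.

    Hypothetical kernel: each past occurrence contributes exp(-(now - t) / tau).
    Events later than `now` or with an unknown action type are ignored.
    """
    features = {a: 0.0 for a in action_types}
    for t, action in events:
        if t <= now and action in features:
            features[action] += math.exp(-(now - t) / tau)
    return [features[a] for a in action_types]

# Illustrative log: two SELECTs and one UPDATE, timestamps in seconds.
log = [(0.0, "SELECT"), (30.0, "UPDATE"), (60.0, "SELECT")]
vec = time_features(log, now=60.0, action_types=["SELECT", "UPDATE", "DELETE"])
```

Recent actions weigh close to 1 and older ones fade towards 0, so two sessions with the same action order but different pauses produce different vectors.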
| Title: | VIRTUAL MUSEUM – AN IMPLEMENTATION OF A MULTIMEDIA OBJECT-ORIENTED DATABASE |
Author(s): | Rodrigo Filev Maia and Jorge Rady Almeida Junior |
| Abstract: | This paper describes the main characteristics involved in the process of using multimedia content in Internet sites and presents a proposal for an implementation of an object-oriented database to meet the multimedia data requirements of a dynamic website. An implementation of the proposed architecture is described, consisting of a virtual museum made for the Contemporary Art Museum of the USP, called Virtual MAC, which was elected the 3rd best virtual museum in the world by INFOLAC Web 2005 (UNESCO). The main objective of Virtual MAC is to create a virtual collection of works of art and make it available on the Internet. Our analysis shows that it is more appropriate to use the Object Oriented paradigm instead of Relational Modelling due to the nature of the multimedia data and the structure of the dynamic web site used for the Virtual MAC. |
| Title: | AN ANALYSIS OF THE EFFECTS OF SPATIAL LOCALITY ON THE CACHE PERFORMANCE OF BINARY SEARCH TREES |
Author(s): | Thomas B. Puzak and Chun-Hsi Huang |
| Abstract: | The topological structure of binary search trees does not translate well into the linear nature of a computer's memory system, resulting in high cache miss rates on data accesses. This paper analyzes the cache performance of search operations on several varieties of binary trees. Using uniform and nonuniform key distributions, the number of cache misses encountered per search is measured for Vanilla, AVL, and two types of Cache Aware Trees. Additionally, concrete measurements of the degree of spatial locality observed in the trees are provided. This allows the trees to be evaluated for situational merit, and definitive explanations of their performance to be given. Results show that the balancing operations of AVL trees effectively negate any spatial locality gained through naive allocation schemes. Furthermore, for uniform input this paper shows that large cache lines are only beneficial to trees that consider the cache's line size in their allocation strategy. Results in the paper demonstrate that adaptive cache aware allocation schemes that approximate the key distribution of a tree have universally better performance than static systems that favor a particular key distribution. |
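The link between node placement and cache behaviour can be illustrated with a small simulation. This is only a sketch under stated assumptions and not the measurement setup of the paper: the tree is stored in a level-order (Eytzinger) array, each node is assumed to occupy `node_size` bytes, and the number of distinct cache lines touched by one search stands in for the cache miss count. The function names and sizes are hypothetical.

```python
def eytzinger(sorted_keys):
    """Lay out sorted keys in level order: children of slot i sit at 2i+1 and 2i+2."""
    out = [None] * len(sorted_keys)
    it = iter(sorted_keys)

    def fill(i):
        # In-order traversal of the implicit tree assigns keys in sorted order.
        if i < len(out):
            fill(2 * i + 1)
            out[i] = next(it)
            fill(2 * i + 2)

    fill(0)
    return out

def search_lines(layout, key, node_size=16, line_size=64):
    """Return (found, number of distinct cache lines touched) for one search."""
    lines = set()
    i = 0
    while i < len(layout):
        # Assumed address of slot i is i * node_size; map it to its cache line.
        lines.add(i * node_size // line_size)
        if key == layout[i]:
            return True, len(lines)
        i = 2 * i + 1 if key < layout[i] else 2 * i + 2
    return False, len(lines)

layout = eytzinger(list(range(15)))
found, touched = search_lines(layout, 0)
```

With these assumed sizes, four nodes share a cache line, so the top levels of the tree are fetched together and a search that visits four nodes touches fewer than four lines.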
| Title: | A UNIFIED APPROACH FOR SOFTWARE PROCESS REPRESENTATION AND ANALYSIS |
Author(s): | Vassilis C. Gerogiannis, George Kakarontzas and Ioannis Stamelos |
| Abstract: | This paper presents a unified approach for software process management which combines object-oriented (OO) structures with formal models based on (high-level timed) Petri nets. This pairing may prove beneficial not only for the integrated representation of software development processes, human resources and work products, but also for analysing properties and detecting errors of a software process specification before the process is put to actual use. The use of OO models provides the advantages of graphical abstraction, a high level of understanding and a manageable representation of the classes and instances of a software process. The resulting OO models are mechanically transformed into a high-level timed Petri net representation to derive a model for formally proving process properties as well as applying managerial analysis. We demonstrate the applicability of our approach by addressing a software process modelling example problem used in the literature to exercise various software process modelling notations. |
| Title: | USING LINGUISTIC TECHNIQUES FOR SCHEMA MATCHING |
Author(s): | Ozgul Unal and Hamideh Afsarmanesh |
| Abstract: | Organizations from a variety of domains have now clearly realized the need for collaboration to achieve
higher goals and/or to be more productive. Among others, the collaboration requirement has become
prominent in the Biodiversity domain. Nevertheless, like in any other collaborative network, different
Biodiversity nodes represent a variety of heterogeneous structuring/organization of the information, which
is very challenging. Automatic resolution of semantic and schematic heterogeneity still remains a bottleneck
for providing integrated access to and data sharing among heterogeneous, autonomous, and distributed
biodiversity databases in a network of biodiversity organizations. In order to deal with this problem,
matching components among database schemas need to be identified and heterogeneity needs to be
resolved, by creating the corresponding mappings in a process called schema matching. One important step
in this process is the identification of the syntactic and semantic similarity among elements from different
schemas, usually referred to as Linguistic Matching. The Linguistic Matching component of a schema
matching and integration system, called SASMINT, is the focus of this paper. Unlike other systems, which
typically utilize only a limited number of similarity metrics, SASMINT makes effective use of NLP
techniques for the Linguistic Matching and proposes a weighted usage of several syntactic and semantic
similarity metrics. In order to demonstrate the accuracy of the weighted sum of metrics, a number of tests have
been carried out, results of which are presented in this paper. Since it is not easy for the user to determine
the weights, SASMINT provides a component called Sampler as another novelty, to support automatic
generation of weights. |
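The idea of a weighted sum of similarity metrics can be sketched as follows. The two metrics below (a character-level ratio from Python's `difflib` and a token-level Jaccard overlap), the equal weights and all function names are stand-ins chosen for illustration; they are not the metrics, weights or Sampler component of SASMINT.

```python
import re
from difflib import SequenceMatcher

def tokens(name):
    """Split an element name such as 'speciesName' or 'species_name' into lowercase tokens."""
    return {t.lower() for t in re.findall(r"[A-Za-z][a-z]*|\d+", name)}

def edit_similarity(a, b):
    """Character-level similarity in [0, 1], ignoring case."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()

def jaccard_similarity(a, b):
    """Token-level overlap in [0, 1]."""
    ta, tb = tokens(a), tokens(b)
    return len(ta & tb) / len(ta | tb) if ta | tb else 0.0

def weighted_similarity(a, b, weights=(0.5, 0.5)):
    """Normalized weighted sum of the two metrics above (weights are assumptions)."""
    metrics = (edit_similarity(a, b), jaccard_similarity(a, b))
    return sum(w * m for w, m in zip(weights, metrics)) / sum(weights)

score = weighted_similarity("speciesName", "species_name")
```

Changing `weights` shifts the balance between spelling-level and token-level evidence, which is the kind of choice the abstract says is hard for a user to make by hand.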
| Title: | FORMAL FRAMEWORK FOR SEMANTIC INTEROPERABILITY |
Author(s): | Nadia Yaacoubi Ayadi, Mohamed Ben Ahmed and Yann Pollet |
| Abstract: | We address in this paper the general issue of "a
posteriori" semantic interoperability between systems relying on
semantically heterogeneous schemas, having been designed for the
purpose of independent specific goals and activities. As
conceptual schemas, we opt here for UML conceptual hierarchies of
classes, integrating a very general notion of specialisation,
including exclusive/inclusive and/or complete/incomplete
constraints, etc., and of which the classical inheritance link is
a particular case.
In our approach, we formalize all the knowledge embedded in
hierarchies in terms of a consistent logical description. Using
this set of predicates, we interpret a UML hierarchy as a property
lattice by adding new abstractions of classes (that can be empty or
non empty). However, a property lattice is not unique.
According to a particular attribute scaling, we may obtain different
lattices that are all equivalent to the initial schema. So, we
introduce the notion of conceptual structure that is the equivalence
set of all equivalent lattices. In this context, given a set of
assertions stating the existence of various semantic links between
properties (attributes, concepts) of two schemas $S_{1}$ and
$S_{2}$, our problem is to automatically build an "interoperation"
structure equivalent to relevant parts of $S_{1}$ and $S_{2}$
schemas. So, we propose an algorithm to incrementally reorganize
property lattices,
based on schema logical formulation. For the purpose of the reorganization
algorithm, we propose a set of elementary operators. |
| Title: | ON THE EVALUATION OF TREE PATTERN QUERIES |
Author(s): | Yangjun Chen |
| Abstract: | The evaluation of XPath expressions can be handled as a tree embedding problem. In this paper, we propose two strategies on this issue. One is ordered-tree embedding based and the other is unordered-tree embedding based. For the ordered-tree embedding, our algorithm needs only O(|T|·|P|) time and O(|T|·|P|) space, where |T| and |P| stand for the numbers of nodes in the target tree T and the pattern tree P, respectively. We show that the unordered-tree embedding is NP-complete by a reduction from the constraint satisfaction problem (CSP). Based on such a reduction, an algorithm is devised for the unordered problem. In the case that the branching of pattern trees is limited, the algorithm works in polynomial time.
|
| Title: | CRYSTALLIZATION OF AGILITY - Back to Basics |
Author(s): | Asif Qumer and Brian Henderson-Sellers |
| Abstract: | There are a number of agile and traditional methodologies for software development. Agilists provide agile principles and agile values to characterize the agile methods, but there is no clear and inclusive definition of agile methods; consequently, it is not feasible to draw a clear distinction between traditional and agile software development methods in practice. The purpose of this paper is to explain the concept of agility in detail and then to suggest a definition of agile methods that would help us to rank agile methods or differentiate them from other available methods. |
| Title: | DATA MINING METHODS FOR GIS ANALYSIS OF SEISMIC VULNERABILITY |
Author(s): | Florin Leon and Gabriela M. Atanasiu |
| Abstract: | This paper aims to design data mining methods for evaluating the seismic vulnerability of regions in the built infrastructure. A supervised clustering methodology is employed, based on k-nearest neighbor graphs. Unlike other classification algorithms, the method has the advantage of taking into account any distribution of training instances and also data topology. For the particular problem of seismic vulnerability analysis using a Geographic Information System, the gradual formation of clusters (for different values of k) allows a decision-making stakeholder to visualize more clearly the details of the cluster areas. The performance of the k-nearest neighbor graph method is tested on three classification problems, and finally it is applied to a sample from a digital map of Iasi, a large city located in the North-Eastern part of Romania. |
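The gradual formation of clusters for growing k can be pictured with the sketch below. It is an illustrative reading of "supervised clustering on a k-nearest neighbor graph", not the authors' algorithm: each point is linked to those of its k nearest neighbors that carry the same class label, and clusters are taken to be the connected components of the resulting graph. The function name, the sample points and the labels are hypothetical.

```python
import math

def knn_graph_clusters(points, labels, k):
    """Count connected components of the same-label k-nearest-neighbor graph."""
    n = len(points)
    parent = list(range(n))

    def find(i):
        # Union-find lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in range(n):
        # Rank all other points by Euclidean distance to point i.
        others = sorted((j for j in range(n) if j != i),
                        key=lambda j: math.dist(points[i], points[j]))
        for j in others[:k]:
            # Only edges between points of the same class merge clusters.
            if labels[i] == labels[j]:
                parent[find(i)] = find(j)

    return len({find(i) for i in range(n)})

points = [(0, 0), (1, 0), (4, 0), (5, 0), (20, 0), (21, 0)]
labels = ["low", "low", "low", "low", "high", "high"]
clusters_k1 = knn_graph_clusters(points, labels, 1)
clusters_k2 = knn_graph_clusters(points, labels, 2)
```

In this toy data the two "low" pairs stay separate for k = 1 and merge for k = 2, while the "high" pair never joins them because cross-label edges are ignored.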
| Title: | SOFTWARE SYNTHESIS OF THE WEB-BASED QUESTIONNAIRE SYSTEM |
Author(s): | Masahiro Yamamoto |
| Abstract: | Web questionnaires have recently become more common in both business and personal use. However, business staff and ordinary individuals cannot implement them by themselves. They usually ask information technology professionals to build such questionnaires, which takes considerable time and money. It would be very useful if business staff and individuals could easily create questionnaires themselves. A software synthesis system that generates web-based questionnaire systems for such users has been developed. |
| Title: | WEB INFORMATION SYSTEM: A FOUR LEVEL ARCHITECTURE |
Author(s): | Roberto Paiano, Anna Lisa Guido and Leonardo Mangia |
| Abstract: | Business processes play a very important role in companies, and their explicit introduction into Information System architecture is a must. Given the interest shown in Web Applications, it is important to introduce a new web-oriented class of software that gives the manager the possibility of operating directly on the process (we will talk about process-oriented WIS - Web Information Systems). It is necessary to replace the three-level logic of traditional application development (Data, Business Logic, Presentation), which hides processes in the Business Logic, with a four-level logic that separates the process level from the application level: the definition and management of processes will no longer be tied solely to business logic. Our research (work in progress) focuses on an innovative framework (software architecture and methodology) for Information System development that links together the know-how acquired in Web Application design and in process definition concepts. |
| Title: | INFORMATION SYSTEM DESIGN AND PROTOTYPING USING FORM TYPES |
Author(s): | Jelena Pavićević, Ivan Luković, Pavle Mogin and Miro Govedarica |
| Abstract: | The paper presents the form type concept, which generalizes the screen forms that users utilize to communicate with an information system. The concept is semantically rich enough to specify an initial set of constraints that makes it possible to generate application prototypes together with the related implementation database schema. IIS*Case is a CASE tool based on the form type concept that supports conceptual modelling of an information system and its database schema. The paper outlines how this tool can generate XML specifications of application prototypes of an information system. The aim is to improve IIS*Case through the implementation of a module which can automatically produce an executable prototype of an information system. |
| Title: | EFFECTIVENESS OF WEB BASED PBL USING COURSE MANAGEMENT TECHNOLOGIES: A CASE STUDY |
Author(s): | Havva H. Basak and Serdar Ayan |
| Abstract: | Maritime education and training has typically focused on delivering practical courses for a practical vocation. In the modern environment, maritime personnel now need to be more professional, more open to change and more business-like in their thinking. This has led to changes in the education system that supports the maritime industries. Teaching thinking skills has become a major agenda for education. Problem-Based Learning (PBL) is a part of this thinking. The use of PBL within a web-based environment in the delivery of an undergraduate course has been investigated. The effects were evaluated by comparing the performance of students using web-based PBL with that of students using traditional PBL. The outcomes of the experiments were positive. By having real-life problems as focal points and students as active problem-solvers, the learning paradigm would shift towards the attainment of higher thinking skills. |
| Title: | A BAYESIAN NETWORK TO STRUCTURE A DATA QUALITY MODEL FOR WEB PORTALS |
Author(s): | Angélica Caro, Coral Calero, Houari Sahraoui, Ghazwa Malak and Mario Piattini |
| Abstract: | Technological advances and the use of the internet have favoured the appearance of a great diversity of web applications, among them Web portals. Through them, organizations develop their businesses in a highly competitive environment. One decisive factor for this competitiveness is the assurance of data quality. In previous work, a data quality model for Web portals was developed. The model is represented as a matrix that links the user expectations of data quality on the Web to the portal functionalities. A set of 34 attributes was classified into this matrix. However, the quality attributes in this model lack the operational structure needed to use them in actual assessment. In this paper we present how we have structured these attributes by means of a probabilistic approach, using Bayesian Networks. The final objective is to use the resulting Bayesian network to evaluate the data quality of a Web portal (or a subset of its characteristics). |
| Title: | A DETECTION METHOD OF STAGNATION SYMPTOMS BY USING PROJECT PROGRESS MODELS GENERATED FROM PROJECT REPORTS |
Author(s): | Satoshi Tsuji, Yoshitmo Ikkai and Masanori Akiyoshi |
| Abstract: | The purpose of this research is to extract "stagnation symptoms" from progress reports about research projects. A stagnation symptom is defined as a portion where remarkable stagnation is seen in the progress of a project. Concretely, according to project managers, stagnation symptoms can be classified into the following three kinds: the first is a bottleneck of the project grasped from one document, the second is clarified by comparing with the most recent document, and the third is clarified from changes of working object in a series of documents. We propose a method to extract stagnation symptoms with structural analysis of project progress. A progress model, a structural chart expressing the progress of a project, is generated from documents annotated beforehand with label tags that indicate contexts or attributes. The model involves a multilevel layer model by degree of detail, situation analysis by color, and relation analysis of details and basis by propagation of color. Stagnation symptoms are automatically extracted by applying stagnation symptom extraction rules to the progress model. The proposed method has been applied to a set of real progress reports. It could extract stagnation symptoms comparable to those extracted manually.
|