Daniel Stamate

Current Research

Soft Computing and Computational Intelligence: imperfect information representation and integration; rule-based reasoning under uncertainty, incompleteness and inconsistency with multivalued logics; and knowledge bases and databases with imperfect information and optimal querying

This theme addresses the problem of representing and reasoning with imperfect information using a logical approach based on multivalued logics, as well as the integration and querying of information coming from different sources, in a distributed environment.

The motivation comes from the area of knowledge acquisition, representation, and reasoning based on imperfect knowledge. In the real world, information may be incomplete or may have a bounded level of certainty. In addition, contradictions may occur when integrating information coming from various sources, as is the case when collecting knowledge from different experts. In multi-agent systems, different agents may give different answers to the same query. It is then important to be able to process the answers so as to extract the maximum information on which the various agents agree, or to detect the items on which the agents give conflicting answers. Incompleteness, uncertainty and inconsistency of information may be treated by using ready-to-employ hypotheses when information is completely missing, and multivalued logics with the particular algebraic structures of semilattice, lattice and bilattice when information is incomplete, uncertain or inconsistent.

In our framework, the information concerns the truth values of information items and is obtained through queries to the relevant sources. The answers to such queries are combined or integrated using a set of rules. In such a setting, imperfect information (that is, incomplete or uncertain information from a single source, or contradictory information coming from different sources) can be elegantly expressed and dealt with using, for instance, bilattices, together with a rule-based reasoning approach whose semantics is natural in these bilattice-based multivalued logics.
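As an illustration of how a bilattice can combine answers from several sources, the sketch below uses Belnap's four-valued bilattice, the smallest standard example. The encoding of truth values as (evidence for, evidence against) pairs and all function names are assumptions made for this example; it is not the rule semantics of the framework described above.

```python
# Belnap's four-valued bilattice, encoded as (evidence_for, evidence_against).
# This encoding and the names below are illustrative assumptions.
UNKNOWN = (0, 0)       # no information
TRUE = (1, 0)
FALSE = (0, 1)
INCONSISTENT = (1, 1)  # contradictory information


def k_join(a, b):
    """Knowledge-order join: accept everything either source asserts."""
    return (max(a[0], b[0]), max(a[1], b[1]))


def k_meet(a, b):
    """Knowledge-order meet: keep only what both sources agree on."""
    return (min(a[0], b[0]), min(a[1], b[1]))


def t_and(a, b):
    """Truth-order meet (conjunction)."""
    return (min(a[0], b[0]), max(a[1], b[1]))


def t_or(a, b):
    """Truth-order join (disjunction)."""
    return (max(a[0], b[0]), min(a[1], b[1]))


def neg(a):
    """Negation swaps the evidence for and against."""
    return (a[1], a[0])


# Two hypothetical agents answer the same query.
agent_a, agent_b = TRUE, FALSE
conflict = k_join(agent_a, agent_b)   # INCONSISTENT: the agents disagree
agreement = k_meet(agent_a, agent_b)  # UNKNOWN: nothing is agreed on
```

The knowledge-order meet corresponds to extracting what the agents agree on, while the knowledge-order join exposes conflicts as the value `INCONSISTENT` rather than discarding them.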

A related research direction we pursue concerns the problem of optimal querying of knowledge bases and databases with imperfect information. Techniques based on the concept of homomorphism have traditionally been used in database research to study the containment of queries evaluated against conventional data. We have extended and generalized these techniques so that the problem of query containment and equivalence (essential in optimal query evaluation) can be studied in the context of sources containing imperfect information.
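For readers unfamiliar with the conventional technique, the following is a minimal sketch of the classical homomorphism test for containment of conjunctive queries over ordinary data: Q1 is contained in Q2 exactly when Q2 can be mapped homomorphically onto Q1. It does not implement the generalization to imperfect information mentioned above. The query representation (variables written as strings starting with `?`) and all function names are assumptions chosen for the example.

```python
# A query is (head_terms, body), where body is a list of (predicate, args).
# Variables are strings starting with "?"; anything else is a constant.
# This representation is an illustrative assumption.

def is_var(term):
    return isinstance(term, str) and term.startswith("?")


def extend(mapping, source, target):
    """Extend mapping so each source term maps to the matching target term."""
    if len(source) != len(target):
        return None
    new = dict(mapping)
    for s, t in zip(source, target):
        if is_var(s):
            if new.setdefault(s, t) != t:  # clashes with an earlier binding
                return None
        elif s != t:                       # constants must match exactly
            return None
    return new


def find_homomorphism(q_from, q_to):
    """Map q_from onto q_to, preserving the head and every body atom."""
    head_from, body_from = q_from
    head_to, body_to = q_to
    start = extend({}, head_from, head_to)
    if start is None:
        return None

    def search(i, mapping):
        if i == len(body_from):
            return mapping
        pred, args = body_from[i]
        for pred_to, args_to in body_to:
            if pred_to == pred:
                nxt = extend(mapping, args, args_to)
                if nxt is not None:
                    found = search(i + 1, nxt)
                    if found is not None:
                        return found
        return None  # backtrack

    return search(0, start)


def is_contained_in(q1, q2):
    """Q1 is contained in Q2 iff there is a homomorphism from Q2 to Q1."""
    return find_homomorphism(q2, q1) is not None


# Q1(x) :- edge(x, y), edge(y, x)      Q2(x) :- edge(x, y)
q1 = (("?x",), [("edge", ("?x", "?y")), ("edge", ("?y", "?x"))])
q2 = (("?x",), [("edge", ("?x", "?y"))])
```

Here `is_contained_in(q1, q2)` holds because every answer of the more restrictive query `q1` is also an answer of `q2`, while the converse test fails.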

The applications of these approaches are in knowledge acquisition and representation, uncertain knowledge bases and databases, intelligent systems, and imperfect information integration and querying.

Data mining: dimensionality reduction and clustering techniques

We have introduced an approach to dimensionality reduction and data clustering inspired by computational models used to evaluate economic parameters. In particular, the dimensionality reduction techniques are based on calculating the highest risk or the lowest return associated with each attribute in the dataset.

Research under development concerns the formulation and implementation of new segmentation techniques that are efficient for processing very large datasets. These techniques are based on three phases that help make the clustering of very large datasets tractable and contribute to a good-quality clustering. The initial phase consists of a pre-clustering process whose result, formed of mini-cluster prototypes, feeds the second phase. The second phase is performed on data that are wholly resident in main memory, is computationally intensive, and generates macro-clusters formed of prototypes. The third phase integrates the whole data into the macro-clusters, leading to the final clusters.
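The three-phase structure can be sketched as follows. The text does not specify which algorithm is used in each phase, so the choices here are assumptions made purely for illustration: a single-pass leader-style pre-clustering with a distance threshold (`radius`), a weighted k-means over the prototypes, and a nearest-centre assignment of the full data. All function and parameter names are hypothetical.

```python
import math


def pre_cluster(points, radius):
    """Phase 1 (assumed): one pass over the data; a point joins the first
    mini-cluster whose prototype lies within radius, else starts a new one."""
    sums, counts = [], []
    for p in points:
        for i in range(len(sums)):
            proto = [s / counts[i] for s in sums[i]]
            if math.dist(proto, p) <= radius:
                sums[i] = [s + x for s, x in zip(sums[i], p)]
                counts[i] += 1
                break
        else:
            sums.append(list(p))
            counts.append(1)
    prototypes = [tuple(s / c for s in sm) for sm, c in zip(sums, counts)]
    return prototypes, counts


def nearest(centres, p):
    """Index of the centre closest to point p."""
    return min(range(len(centres)), key=lambda j: math.dist(centres[j], p))


def macro_cluster(prototypes, weights, k, iterations=20):
    """Phase 2 (assumed): in-memory weighted k-means over the prototypes,
    seeded deterministically with the first k prototypes."""
    centres = list(prototypes[:k])
    for _ in range(iterations):
        groups = [[] for _ in centres]
        for p, w in zip(prototypes, weights):
            groups[nearest(centres, p)].append((p, w))
        for j, members in enumerate(groups):
            if members:
                total = sum(w for _, w in members)
                dim = len(members[0][0])
                centres[j] = tuple(
                    sum(p[d] * w for p, w in members) / total
                    for d in range(dim)
                )
    return centres


def assign(points, centres):
    """Phase 3 (assumed): place every original point in its nearest macro-cluster."""
    return [nearest(centres, p) for p in points]


points = [(0, 0), (0.1, 0), (0, 0.2), (10, 10), (10.1, 9.9), (9.8, 10)]
prototypes, counts = pre_cluster(points, radius=1.0)
centres = macro_cluster(prototypes, counts, k=2)
labels = assign(points, centres)
```

Only the prototypes and their weights need to be held in memory during the second phase, which is what keeps the computationally intensive step independent of the size of the full dataset.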

These techniques, currently designed for numerical data, need to be extended to nominal and mixed data.

Relevant applications are in data pre-processing and segmentation, and in knowledge discovery in databases.
