Current Research
Soft Computing and Computational Intelligence: Imperfect information representation and integration, rule based reasoning under uncertainty, incompleteness and inconsistency with multivalued logics, and knowledge bases and databases with imperfect information and optimal querying
This theme addresses the problem of representing and reasoning with imperfect
information using a logical approach based on multivalued
logics, as well as the integration and querying of information coming from
different sources, in a distributed environment.
Motivation comes from the area of knowledge acquisition, representation, and
reasoning based on imperfect knowledge. In the real world, information
may be incomplete or may have a bounded level of certainty. Moreover,
contradictions may arise when integrating information
from various sources, as is the case when collecting knowledge from
different experts. In multi-agent systems, different agents may give different
answers to the same query. It is then important to be able to process the
answers so as to extract the maximum information on which the various agents
agree, or to detect the items on which the agents give conflicting answers.
Incompleteness, uncertainty and inconsistency of the information may be treated
by using ready-to-use hypotheses when information is completely missing, and
multivalued logics with the particular algebraic structures
of semilattice, lattice and bilattice
when information is incomplete, uncertain or inconsistent.
In our framework the information concerns the truth values of information
items, and is obtained through queries to the relevant sources. The answers to
such queries are combined or integrated using a set of rules. In such a
setting, imperfect information, i.e. incomplete or uncertain information from a
source, or contradictory information coming from different sources, can
be expressed and handled elegantly using, for instance, bilattices,
together with a rule-based reasoning approach whose semantics is
natural in these bilattice-based multivalued
logics.
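As an illustration only, the following minimal Python sketch uses Belnap's four-valued bilattice FOUR, the smallest standard bilattice, to combine answers from several sources and to evaluate one rule. It is not the framework described above: the encoding of a truth value as a pair (evidence for, evidence against), the function names, and the example reports are all assumptions made for this sketch.

```python
from functools import reduce
from typing import Iterable, NamedTuple


class Value(NamedTuple):
    """A truth value of FOUR, encoded as (evidence for, evidence against)."""
    support: bool
    refute: bool


TRUE = Value(True, False)
FALSE = Value(False, True)
UNKNOWN = Value(False, False)   # no information at all
CONFLICT = Value(True, True)    # contradictory information


def consensus(a: Value, b: Value) -> Value:
    """Meet in the knowledge order: keep only what both sources assert."""
    return Value(a.support and b.support, a.refute and b.refute)


def gullibility(a: Value, b: Value) -> Value:
    """Join in the knowledge order: accept everything any source asserts."""
    return Value(a.support or b.support, a.refute or b.refute)


def conj(a: Value, b: Value) -> Value:
    """Meet in the truth order (logical AND)."""
    return Value(a.support and b.support, a.refute or b.refute)


def disj(a: Value, b: Value) -> Value:
    """Join in the truth order (logical OR)."""
    return Value(a.support or b.support, a.refute and b.refute)


def neg(a: Value) -> Value:
    """Negation swaps the evidence for and the evidence against."""
    return Value(a.refute, a.support)


def agreed(answers: Iterable[Value]) -> Value:
    """Maximum information on which all sources agree."""
    return reduce(consensus, answers)


def merged(answers: Iterable[Value]) -> Value:
    """All information reported by any source; exposes conflicts."""
    return reduce(gullibility, answers)


# Hypothetical rule: alarm <- smoke AND NOT test_mode
smoke_reports = [TRUE, FALSE]        # two agents disagree
test_mode_reports = [FALSE, UNKNOWN]
alarm = conj(merged(smoke_reports), neg(merged(test_mode_reports)))
print(alarm == CONFLICT)  # True: the disagreement propagates to the conclusion
```

Merging with `gullibility` detects the items on which sources conflict, while `agreed` returns only the information common to all of them; `agreed([TRUE, FALSE])`, for example, is `UNKNOWN`.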
A related research direction concerns the problem of optimal querying
of knowledge bases and databases with imperfect information. Techniques
based on the concept of homomorphism have traditionally been used in
database research to study the containment of queries evaluated against
conventional data. We have extended and generalized these techniques so that
the problem of query containment and equivalence (essential in optimal query
evaluation) can be successfully studied in the context of sources containing
imperfect information.
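For reference, the classical homomorphism test for conventional data can be sketched as follows. This covers only the standard case of conjunctive queries over ordinary relations, not the generalization to imperfect information mentioned above. The query encoding, the convention that variables start with an uppercase letter, and the relation name `edge` are assumptions made for this sketch.

```python
from typing import Dict, List, Optional, Sequence, Tuple

Atom = Tuple[str, Tuple[str, ...]]           # (relation name, terms)
Query = Tuple[Tuple[str, ...], List[Atom]]   # (head terms, body atoms)


def is_var(term: str) -> bool:
    """Assumed convention: variables start with an uppercase letter."""
    return term[:1].isupper()


def extend(mapping: Dict[str, str], src: Sequence[str],
           dst: Sequence[str]) -> Optional[Dict[str, str]]:
    """Extend the mapping so that each src term is sent to the dst term."""
    if len(src) != len(dst):
        return None
    result = dict(mapping)
    for s, d in zip(src, dst):
        if is_var(s):
            if result.setdefault(s, d) != d:   # variable already mapped elsewhere
                return None
        elif s != d:                           # constants must be preserved
            return None
    return result


def contained_in(q1: Query, q2: Query) -> bool:
    """q1 is contained in q2 iff a homomorphism from q2 to q1 exists."""
    head1, body1 = q1
    head2, body2 = q2
    start = extend({}, head2, head1)           # heads must correspond
    if start is None:
        return False

    def search(i: int, mapping: Dict[str, str]) -> bool:
        if i == len(body2):
            return True
        relation, terms = body2[i]
        for relation1, terms1 in body1:        # try every target atom in q1
            if relation1 == relation:
                extended = extend(mapping, terms, terms1)
                if extended is not None and search(i + 1, extended):
                    return True
        return False

    return search(0, start)


# q_cycle(X) :- edge(X, Y), edge(Y, X)      q_out(X) :- edge(X, Y)
q_cycle: Query = (("X",), [("edge", ("X", "Y")), ("edge", ("Y", "X"))])
q_out: Query = (("X",), [("edge", ("X", "Y"))])
print(contained_in(q_cycle, q_out))  # True
print(contained_in(q_out, q_cycle))  # False
```

Two queries are equivalent when containment holds in both directions, which is what allows a query to be replaced by a cheaper equivalent one during optimization.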
The applications of these approaches are in knowledge acquisition and
representation, uncertain knowledge bases and databases, intelligent systems, and
imperfect information integration and querying.
Data mining: dimensionality reduction and clustering techniques
We have introduced an approach to dimensionality reduction and data clustering inspired by computational models used to evaluate economic parameters. In particular, the dimensionality reduction techniques are based on calculating the highest risk or the lowest return associated with each attribute in the dataset.
Research under development concerns the formulation and implementation of new segmentation techniques that are efficient for processing very large datasets. These techniques are based on three phases that help make the clustering of very large datasets tractable and contribute to good clustering quality. The initial phase is a pre-clustering process whose result, a set of mini-cluster prototypes, feeds the second phase. The second phase is performed on data that reside wholly in main memory; it is computationally intensive and generates macro-clusters formed of prototypes. The third phase integrates the whole dataset into the macro-clusters, leading to the final clusters.
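The three-phase structure can be sketched as follows for numerical data. This is only an illustration under stated assumptions: the text does not name the algorithms used in each phase, so this sketch substitutes a one-pass leader clustering for phase 1, a small k-means with farthest-first initialization for phase 2, and nearest-centre assignment for phase 3. The parameter `radius` and all function names are hypothetical.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, ...]


def dist2(p: Sequence[float], q: Sequence[float]) -> float:
    """Squared Euclidean distance."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def nearest(p: Sequence[float], centers: Sequence[Sequence[float]]) -> int:
    """Index of the centre closest to p."""
    return min(range(len(centers)), key=lambda i: dist2(p, centers[i]))


def pre_cluster(chunks: List[List[Point]], radius: float) -> List[Point]:
    """Phase 1 (assumed: leader clustering). One pass over the data, chunk by
    chunk; returns the centroids of the mini-clusters as prototypes."""
    leaders: List[Point] = []
    sums: List[List[float]] = []
    counts: List[int] = []
    for chunk in chunks:
        for p in chunk:
            i = nearest(p, leaders) if leaders else -1
            if i < 0 or dist2(p, leaders[i]) > radius ** 2:
                leaders.append(p)              # p starts a new mini-cluster
                sums.append(list(p))
                counts.append(1)
            else:
                sums[i] = [s + x for s, x in zip(sums[i], p)]
                counts[i] += 1
    return [tuple(s / c for s in total) for total, c in zip(sums, counts)]


def kmeans(points: List[Point], k: int, iters: int = 20) -> List[Point]:
    """Phase 2 (assumed: k-means). Runs in memory on the prototypes only."""
    centers = [points[0]]
    while len(centers) < k:                    # farthest-first initialization
        centers.append(max(points,
                           key=lambda p: min(dist2(p, c) for c in centers)))
    for _ in range(iters):
        groups: List[List[Point]] = [[] for _ in range(k)]
        for p in points:
            groups[nearest(p, centers)].append(p)
        centers = [tuple(sum(col) / len(g) for col in zip(*g)) if g else c
                   for g, c in zip(groups, centers)]
    return centers


def three_phase(chunks: List[List[Point]], k: int, radius: float):
    """Runs the three phases and returns (macro-cluster centres, labels)."""
    prototypes = pre_cluster(chunks, radius)   # phase 1: mini-cluster prototypes
    centers = kmeans(prototypes, k)            # phase 2: macro-clusters
    labels = [[nearest(p, centers) for p in chunk] for chunk in chunks]  # phase 3
    return centers, labels
```

Only phase 2 is iterative, and it sees the prototypes rather than the raw data, which is what keeps the expensive step small. Phases 1 and 3 each make a single pass over the chunks, which in practice would be read from disk rather than held in a list.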
These techniques, currently designed for numerical data, need to be extended
to nominal and mixed data.
Relevant applications are in data pre-processing and segmentation, and
knowledge discovery in databases.
Back to Daniel Stamate's homepage