Thesis Summary
My thesis investigates the mining of sequential data in order to
(1) provide interesting patterns by considering contextual information, and
(2) exploit such patterns for other tasks such as classification, prediction or anomaly detection.
The first part of this thesis aims at considering contextual information
associated with data during the frequent pattern mining process, in
order to provide the expert with patterns that are representative of
a context. Existing work prior to this thesis could not reveal that some
patterns strongly depend on context. We thus
introduce the notion of contextual frequent pattern, where a pattern is
associated with a context. In addition, we generalize the notion of
contextual pattern to various interestingness measures (other than frequency):
information gain, growth rate, etc. In both cases, we unveil and exploit
some essential theoretical properties of contextual patterns and provide
efficient algorithms.
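As an illustration only, the idea of a pattern whose frequency depends on the context can be sketched as follows. This is a minimal sketch under simplifying assumptions, not the formal definition or the algorithms of the thesis: each sequence is tagged with a single flat context label, a pattern is a plain subsequence, and the names `is_subsequence`, `contextual_frequency`, `frequent_contexts` and the toy `database` are all hypothetical.

```python
from collections import defaultdict


def is_subsequence(pattern, sequence):
    """Return True if the items of `pattern` occur in `sequence` in the same order."""
    it = iter(sequence)
    # `item in it` advances the iterator, so order is preserved.
    return all(item in it for item in pattern)


def contextual_frequency(pattern, database):
    """Relative frequency of `pattern` in each context.

    `database` is a list of (context, sequence) pairs (an assumed toy format).
    """
    total = defaultdict(int)
    hits = defaultdict(int)
    for context, sequence in database:
        total[context] += 1
        if is_subsequence(pattern, sequence):
            hits[context] += 1
    return {context: hits[context] / total[context] for context in total}


def frequent_contexts(pattern, database, min_support):
    """Contexts in which `pattern` reaches the minimum relative frequency."""
    freqs = contextual_frequency(pattern, database)
    return sorted(c for c, f in freqs.items() if f >= min_support)


# Toy data: the pattern <a, c> appears only in summer sequences.
database = [
    ("summer", ["a", "b", "c"]),
    ("summer", ["a", "c"]),
    ("winter", ["b", "a"]),
    ("winter", ["c", "b"]),
]

print(contextual_frequency(["a", "c"], database))
print(frequent_contexts(["a", "c"], database, min_support=0.6))
```

In this toy data, the pattern occurs in half of all sequences, so it would be missed with a global threshold of 0.6, yet it occurs in every summer sequence. Associating the pattern with the context in which it is frequent captures that dependency.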
The second part of this work concerns the use of contextual patterns to
address various data mining tasks. We mainly focus here on sequential
data in order to perform pattern-based classification, prediction and
anomaly detection. Taking contextual information into account is of great
help here. For instance, contextual patterns can highlight the fact that
a behavior considered anomalous in summer can be considered normal in
winter. It is therefore essential to understand what changes with the
context.
Our approaches have been evaluated on various real-world datasets and
have been shown to be efficient in practical applications.
Current version of the PhD thesis here (in French)
My topics
- Frequent pattern mining
- Context-aware patterns
- Pattern-based applications (classification, prediction, anomaly detection)
- Sensor data mining
- Sequential data mining