SIPTA school 08 - UEE 08
Abstracts & Bibliography

Abstracts

A unified view of uncertainty theories,
by Didier Dubois [1,7,8,10,11,12,13]

The modeling of uncertainty is motivated by two concerns: taming the variability of external phenomena and facing incomplete information in decision processes. These two concerns are not unrelated, but they are distinct, in the sense that variability is far from being the only cause of ignorance. However, the development of probability theory, as witnessed especially by the Bayesian school, tended to blur this distinction, suggesting that a unique probability distribution is enough to account for both randomness and incomplete information. More recently, new theories of uncertainty have emerged in which partial ignorance is acknowledged and represented separately from randomness: the theories of evidence, possibility and imprecise probabilities. The aim of this talk is to provide a (partially) unified view of these approaches. Its main point is that modern uncertainty theories combine probabilistic and set-valued representations, which allow for a clear separation between randomness and incompleteness.

The basic tool for representing incomplete information is a set of mutually exclusive values, one of which is the true one. This kind of uncertainty is naturally accounted for in logical representations. In the area of numerical modelling, the processing of incomplete information is basically carried out by interval analysis or constraint propagation methods. All of the above theories of uncertainty come down to introducing shades of plausibility within set-based representations of incompleteness:

- in imprecise probability theory, the most general of the three, information takes the form of a set of probability measures, which grows larger as information gets poorer.
- in evidence theory, information is represented by random sets, which correspond to special sets of probability measures.
- in possibility theory, information is summarized by fuzzy sets or fuzzy intervals, which are equivalent to nested random sets.
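
As a toy illustration of the last item, the following Python sketch (the three-element frame and the possibility degrees are invented for illustration) shows how a possibility distribution induces an upper probability (the possibility measure) and a lower probability (the necessity measure) for any event:

    # A possibility distribution pi on a finite frame; max(pi) must equal 1.
    pi = {"a": 1.0, "b": 0.7, "c": 0.2}

    def possibility(event):
        """Upper probability of the event: Pi(A) = max of pi over A."""
        return max(pi[x] for x in event)

    def necessity(event):
        """Lower probability of the event: N(A) = 1 - Pi(complement of A)."""
        complement = set(pi) - set(event)
        return 1.0 - possibility(complement) if complement else 1.0

    A = {"a", "b"}
    print(necessity(A), possibility(A))   # 0.8 1.0, so P(A) lies in [0.8, 1.0]

Every probability measure dominated by this possibility measure assigns the event a probability within these bounds, which is what makes possibility theory a simple imprecise probability model.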

The course points out some important issues to be addressed within uncertainty theories, such as the difference between generic and singular information, practical representations of imprecise probabilities on the real line, conditioning and fusion, uncertainty propagation, and decision making. The role of possibility theory, as the simplest representation of imprecise probability, will be emphasized.

 

Coherent lower previsions,
by Enrique Miranda and Gert de Cooman [1,2,5]

This tutorial presents a summary of Peter Walley's theory of coherent lower previsions. This theory is a generalisation of de Finetti's approach to subjective probability that allows for indecision. It uses the behavioural interpretation, where a subject establishes his acceptable buying and selling prices for a number of bounded random variables. The theory then verifies the consistency of these assessments using a number of conditions, the strongest of which is called coherence.

We see that coherent lower previsions can be given two other mathematically equivalent representations: closed and convex sets of linear previsions and coherent sets of almost-desirable gambles. The first connects the theory with robust Bayesian analysis, by providing a sensitivity analysis representation of lower and upper previsions, while the second is related to decision making. Moreover, we can always determine the consequences of a number of coherent assessments for other (new) bounded random variables, using a procedure called natural extension. We show how to use this procedure under the three equivalent representations mentioned above. We also see that most of the uncertainty models that appear in the literature can be embedded into the theory of coherent lower previsions, and similarly that most of the procedures for extending set functions to larger domains in measure theory are particular cases of natural extension.
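
On a finite possibility space, natural extension can be computed by linear programming: the lower prevision of a new gamble is the minimum expectation over all linear previsions compatible with the assessments. A minimal Python sketch (the assessments on a three-element space are invented, and assumed to avoid sure loss):

    import numpy as np
    from scipy.optimize import linprog

    # Lower-prevision assessments E_p[g_i] >= lb_i on two gambles g1, g2.
    gambles = np.array([[1.0, 0.0, 0.0],     # g1: indicator of outcome 0
                        [0.0, 1.0, 1.0]])    # g2: indicator of {1, 2}
    lower_bounds = np.array([0.2, 0.5])

    f = np.array([1.0, 2.0, 0.0])            # new gamble to extend to

    # Minimise E_p[f] subject to E_p[g_i] >= lb_i, sum(p) = 1, p >= 0.
    res = linprog(c=f,
                  A_ub=-gambles, b_ub=-lower_bounds,
                  A_eq=np.ones((1, 3)), b_eq=[1.0],
                  bounds=[(0, None)] * 3)
    print(res.fun)   # natural extension (lower prevision) of f: 0.2 here

The same program run on -f (with a sign flip on the optimum) yields the upper prevision, i.e. the upper envelope of the compatible linear previsions.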

Finally, we study how to update coherent lower previsions in the light of new information. We then define the so-called conditional lower previsions, see how to extend the notion of coherence to this case, and discuss the main difficulties that arise.

 

Algorithms and Approximation Methods for Imprecise Probability
by Fabio Gagliardi Cozman and Cassio P. de Campos

This short course has a simple goal: to present algorithms and methods that let one manipulate imprecision in probability values as efficiently as possible. Such imprecision is variously encoded through probability intervals, sets of probability distributions, and neighborhoods of distributions; these representations have surfaced in many areas and have led to many algorithms, sometimes developed in different communities that have little contact with one another. The course will try to present problems and solutions from these various communities in a somewhat unified form, using sets of probability measures as the main tool for representation, and resorting to established tools from optimization theory as much as possible. Existing software and packages will be discussed, and open problems will be examined.

The plan of the short course is as follows:
1) Probabilistic logic: the propositional case (linear programming and column generation methods, and challenges).
2) Probabilistic logic: the first-order case (kinds of logics, existing algorithms and their applications, recent efforts in "probabilistic relational models" for logic programming and description logics, solved and open problems).
3) Models from robust statistics (epsilon-contaminated, total variation) and Choquet capacities: closed-form solutions, approximation methods, applications (see the sketch after this list).
4) Algorithms for graph-theoretical models, in particular credal networks (propagation methods and the 2U algorithm; approximations based on propagation; exact methods based on optimization and multilinear programming).
5) Quick overview of existing software for manipulation of imprecision in probability values, and a discussion of the (many!) open problems in this area.
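
As an appetiser for item 3, the epsilon-contaminated class admits closed-form lower and upper expectations: for the class {(1 - eps) P0 + eps Q : Q arbitrary}, the lower (upper) expectation of a gamble f is (1 - eps) E_P0[f] + eps min f (respectively max f). A minimal Python sketch with invented numbers:

    import numpy as np

    p0 = np.array([0.5, 0.3, 0.2])     # reference distribution P0
    f = np.array([10.0, 0.0, -5.0])    # gamble on the same three outcomes
    eps = 0.1                          # contamination level

    base = p0 @ f                                # E_P0[f] = 4.0
    lower = (1 - eps) * base + eps * f.min()     # 3.1
    upper = (1 - eps) * base + eps * f.max()     # 4.6
    print(lower, upper)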

 

Credal Networks: Theory and Applications
by Cassio P. de Campos and Fabio Gagliardi Cozman [6,19,20,21,22]

Graph-theoretical models such as Bayesian networks have spread into a wide variety of areas, from Statistics, Engineering and Computer Science (especially Artificial Intelligence) to the Human and Biological Sciences. A Bayesian network encodes a single joint probability distribution for a set of random variables through a compact graph representation. Credal networks extend the usual Bayesian networks by encoding a set of joint probability distributions. This short course discusses theoretical and applied aspects of credal networks under the strong independence concept. We start with the graph-theoretical models that are the basis for credal network theory, discussing their properties, applications and extensions. Then the credal network model is introduced, including some ideas for handling qualitative and logical assessments. Learning, classification and belief updating problems are presented, followed by some algorithmic ideas and complexity issues. Finally, some applications of credal networks are discussed.
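
To fix ideas, here is a minimal Python sketch (all numbers invented) of inference in the smallest possible credal network, A -> B with binary variables, under strong independence: the bounds on P(B = 1) are attained at vertices of the local credal sets, so it suffices to enumerate all combinations of extreme points.

    from itertools import product

    p_a1 = [0.3, 0.5]             # extreme points for P(A = 1)
    p_b1_given_a0 = [0.1, 0.2]    # extreme points for P(B = 1 | A = 0)
    p_b1_given_a1 = [0.7, 0.9]    # extreme points for P(B = 1 | A = 1)

    values = [(1 - pa) * pb0 + pa * pb1            # law of total probability
              for pa, pb0, pb1 in product(p_a1, p_b1_given_a0, p_b1_given_a1)]
    print(min(values), max(values))   # bounds on P(B = 1): 0.28 and 0.55

Exhaustive enumeration is exponential in the network size, which is precisely why the propagation and optimisation algorithms discussed in the course matter.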

 

Independence Concepts in Imprecise Probability,
by Fabio Gagliardi Cozman [1,3]

The concepts of stochastic independence and stochastic conditional independence are central to probability theory: e.g., they are used to build large distributions out of marginal and conditional distributions; they are used in many convergence results; they are used to isolate parts of a model with dynamic components. Not surprisingly, research on imprecision in probability values has also paid attention to the concept of "independence". However, no single concept of independence (and of conditional independence) has emerged from the literature. Quite the contrary: there are several proposed concepts; even though all of them are related in one way or another to stochastic independence, they have interesting and diverse properties of their own. Given the central place of "independence" in probabilistic thinking, one should have a good command of these existing proposals. The goal of this short course is to discuss the existing concepts of independence in the context of imprecision in probability values, to discuss their properties, to compare their strengths and applications, to indicate their weaknesses, and to suggest a number of open problems that are ready to be tackled.

The plan of the course is as follows:
1) Review of stochastic independence, including its applications in modeling and asymptotics, and its graphoid properties.
2) Walley's epistemic independence: definition, properties, open problems.
3) Kuznetsov's independence: definition, properties, open problems.
4) Strong independence: varieties of strong independence (and their many definitions); attempts at axiomatizing strong independence, including Seidenfeld's recent derivation based on coherent choice functions; properties and open problems (see the sketch after this list).
5) Application in the theory of graph-theoretical models, with comparisons between varieties of strong independence, Walley's, and Kuznetsov's proposals.
6) Overview of other related concepts of irrelevance and non-interaction in the literature, and discussion of the challenges involved in defining independence when conditioning on events of zero probabilities is allowed.
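
The sketch announced in item 4 (a minimal Python example with invented marginal credal sets): under strong independence, the joint model for two variables is the set of all products P1 x P2 with each factor taken from its marginal credal set, so bounds on a joint event are obtained by optimising over pairs of extreme points.

    from itertools import product

    m1 = [0.2, 0.4]    # extreme points of the credal set for P(X = 1)
    m2 = [0.5, 0.8]    # extreme points of the credal set for P(Y = 1)

    joint = [p * q for p, q in product(m1, m2)]   # P(X = 1, Y = 1) = P1 * P2
    print(min(joint), max(joint))                 # 0.1 and 0.32

Epistemic independence yields a joint model that contains the strong extension, so its bounds are never tighter; the course examines when and how the two concepts disagree.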

 

Predictive inference: from Bayesian inference to Imprecise Probability,
by Jean-Marc Bernard [1,4,14,15,16,17,18]

There are two essential problems in statistical inference: from an observed sample of data, generated by a known or assumed model, one may want to learn about the model's unknown parameters (parametric inference), or to make predictions about future data (predictive inference). The two problems are closely related, and the talk will focus on the predictive problem, exemplified by the case of categorical data, assumed to be either multinomial (infinite population) or multi-hypergeometric (finite population). We shall pay particular attention to "immediate prediction", i.e. predicting a single future observation, also known as the "rule of succession" problem.

We shall review the answers of the two major theories of inference for this problem: frequentist inference and Bayesian inference. Both theories use probabilities, but essentially differ in the status of probabilities that are used, either frequentist or epistemic ones. In our case, a common Bayesian answer leads to a Dirichlet-multinomial probability distribution from which all inferences are drawn.

Another important difference lies in the principles that the inferences obey (e.g. coherence, symmetry, the embedding principle). A way to reconcile several principles is provided by a generalization of Bayesian inference which, in order to model uncertainty, allows for sets of distributions instead of a single one. This leads to an imprecise probability model, an important example being the imprecise Dirichlet-multinomial model (IDMM), which attempts to formalize prior ignorance. The final part of the talk will focus in particular on the IDMM and its properties.
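
To make the immediate-prediction problem concrete: following Walley [17], under the imprecise Dirichlet model with hyperparameter s > 0, after observing counts n_j over n multinomial trials, the predictive probability that the next observation falls in category j lies between n_j / (n + s) and (n_j + s) / (n + s). A minimal Python sketch (the data are invented; the IDMM analogue for finite populations is similar):

    def idm_prediction(counts, j, s=2.0):
        """Lower and upper predictive probabilities for category j
        under the imprecise Dirichlet model with hyperparameter s."""
        n = sum(counts.values())
        nj = counts.get(j, 0)
        return nj / (n + s), (nj + s) / (n + s)

    counts = {"red": 3, "green": 1}          # n = 4 observations
    print(idm_prediction(counts, "red"))     # (0.5, 0.8333...)
    print(idm_prediction(counts, "blue"))    # (0.0, 0.3333...): unseen category

Note how an unobserved category keeps a strictly positive upper probability: this is how the model expresses prior ignorance without committing to a precise rule of succession.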

 

Imprecise immediate predictions,
by Gert de Cooman [1,4,9,17,29,30,31,32,33]

This course deals with the issue of immediate prediction. Generally speaking, we are dealing with immediate prediction whenever we can represent, in some way, the available information in the form of an event tree [29], where the nodes represent situations that may occur, and where in each node we have local beliefs about which of the child nodes is going to be visited.

Immediate prediction occurs in a great variety of situations which are very relevant for probabilistic and statistical modelling and reasoning, e.g., in predictive inference and stochastic processes. I discuss the very general case in which the local predictive belief models, attached to the nodes, are imprecise probability models [1]. This leads in particular to the notion of an imprecise probability tree [30].

I show in particular how, using the rationality principles provided by the behavioural theory of imprecise probabilities, we can combine the local models into global ones: these then represent beliefs about which sequence of situations will occur. This combination can in general be done quite efficiently by a method of backward propagation. The discussion here will provide a particular interpretation to a number of notions, ideas and results in game-theoretic probability [9].
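
A minimal Python sketch of this backward propagation (the tree structure, local models and gamble are all invented): the global lower prevision of a gamble on the leaves is computed by recursively replacing each node's children with the local lower expectation of their values.

    def local_lower(values, credal_set):
        """Lower expectation of the children's values w.r.t. a local model,
        given as a finite list of probability mass functions."""
        return min(sum(p * v for p, v in zip(pmf, values))
                   for pmf in credal_set)

    # A depth-2 binary event tree with the same local credal set at each node.
    credal = [(0.4, 0.6), (0.5, 0.5)]
    gamble = [1.0, -1.0, 2.0, 0.0]             # values on the four leaves

    left = local_lower(gamble[:2], credal)     # -0.2
    right = local_lower(gamble[2:], credal)    # 0.8
    print(local_lower([left, right], credal)span)  # global lower prevision: 0.3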

I then go on to apply these ideas in three specific areas: (i) predictive inference in statistical reasoning, leading to a characterisation of the Imprecise Dirichlet Model [4,17,33]; (ii) quite general laws of large numbers for imprecise probability models [30,31]; and (iii) Markov chains in stochastic process theory [32]. I try to make clear in all these applications that the quite general and very efficient approach described above leads to surprisingly simple and elegant solutions to interesting problems.
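
For application (iii), a minimal Python sketch (transition bounds invented) in the spirit of [32]: lower expectations for an imprecise Markov chain propagate backwards in time by iterating the lower transition operator.

    def lower_transition(values, rows):
        """Apply the lower transition operator to a value vector; each row is
        a finite credal set of transition mass functions for one state."""
        return [min(sum(p * v for p, v in zip(pmf, values)) for pmf in row)
                for row in rows]

    rows = [[(0.8, 0.2), (0.7, 0.3)],   # transitions from state 0
            [(0.4, 0.6), (0.5, 0.5)]]   # transitions from state 1

    f = [1.0, 0.0]                      # gamble: indicator of state 0
    for _ in range(3):                  # three steps ahead
        f = lower_transition(f, rows)
    print(f)                  # lower expectation from each initial state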

Robust Bayesian Analysis
by Fabrizio Ruggeri [23-28]

The short course will illustrate ideas, methods and examples in Bayesian robustness, also discussing connections with, and differences from, the imprecise probability approach. The formalisation of an expert's opinions into a prior distribution on the parameter of interest, and its update into a posterior distribution via Bayes' theorem once data are observed, are at the same time the strength and the weakness of the Bayesian approach. The weakness comes from the arbitrariness in the choice of the prior distribution (and a similar argument applies to the other two elements of the Bayesian approach, i.e. the likelihood and the loss function). Bayesian robustness studies the effects of changes in the prior/likelihood/loss on the quantities of interest, e.g. the posterior mean or set probabilities.
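
As a simple instance of such a sensitivity analysis (all numbers invented): for Bernoulli data with k successes in n trials and a conjugate Beta(a, b) prior, the posterior mean is (a + k) / (a + b + n), so letting (a, b) range over a class of priors yields a range of posterior means. A minimal Python sketch:

    from itertools import product

    k, n = 7, 10               # observed data
    a_values = (1.0, 3.0)      # class of priors: a in [1, 3], b in [1, 3]
    b_values = (1.0, 3.0)

    means = [(a + k) / (a + b + n) for a, b in product(a_values, b_values)]
    print(min(means), max(means))   # roughly 0.571 and 0.714

Since the posterior mean is monotone in a and in b here, the extremes occur at the corners of the class, so checking the four corner priors suffices.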

In the lecture, we will review how the uncertainty about the prior can be modelled (mostly by a class of priors), the different sensitivity analyses proposed in the literature, as well as the widely used measures of robustness. Although most of the lecture will be on robustness with respect to the prior, loss robustness will be considered as well. Robust procedures and a few applications will also be illustrated.

 

What is risk?  What is probability?  Game-theoretic answers
by Glenn Shafer [9]

For 170 years, people have been arguing about whether probability is objective or subjective. From the game-theoretic point of view that emerges from my 2001 book with Vladimir Vovk [9], the question is instead whether a particular decision problem is embedded in a repetitive structure. The method of defensive forecasting, developed by Vovk in more recent work, gives probabilities for decision making in problems that are sufficiently repetitive, without any a priori assumption that the trials are identical. But when the focus is on the particular case rather than on long-run average performance, and when there is contention even about which long-run sequence the particular case should be compared with, we need instead methods of weighing evidence. These methods produce only upper and lower probabilities.

 


Suggested bibliography

[1] P. Walley. Statistical Reasoning with Imprecise Probabilities. Chapman and Hall, 1991.

[2] E. Miranda, A survey of the theory of coherent lower previsions. Universidad Rey Juan Carlos, May 2007.

[3] I. Couso, S. Moral, and P. Walley, Examples of Independence for Imprecise Probabilities. In: ISIPTA’99 - Proceedings of the First International Symposium on Imprecise Probabilities and Their Applications, Ghent, Belgium, 1999.

[4] P. Walley and J.-M. Bernard, Imprecise probabilistic prediction for categorical data, Technical Report CAF-9901, Laboratoire Cognition et Activités Finalisées, Université Paris 8, Saint-Denis, France, January 1999.

[5] G. de Cooman and E. Miranda, Symmetry of models versus models of symmetry, in Probability and Inference: Essays in Honor of Henry E. Kyburg, Jr., eds. William Harper and Gregory Wheeler, pp. 67-149, King's College Publications, London, 2007.

[6] F. G. Cozman, Credal networks, Artificial Intelligence Journal, vol. 120, pp. 199-233, 2000.

[7] D. Dubois and H. Prade, Théorie des Possibilités. Applications à la Représentation des Connaissances en Informatique, Masson, Paris, 1985.

[8] D. Dubois and H. Prade, Possibility Theory: An Approach to Computerized Processing of Uncertainty (updated translation of [7]), Plenum Press, New York, 1988.

[9] G. Shafer and V. Vovk, Probability and Finance: It's Only a Game!, Wiley, New York, 2001.

[10] G. L. S. Shackle, Decision, Order and Time in Human Affairs, 2nd edition, Cambridge University Press, Cambridge, UK, 1961.

[11] G. Shafer, A Mathematical Theory of Evidence, Princeton University Press, 1976.

[12] L. A. Zadeh, Fuzzy sets as a basis for a theory of possibility, Fuzzy Sets and Systems, 1: 3-28, 1978.

[13] G. de Cooman, Possibility theory. Part I: Measure- and integral-theoretic groundwork; Part II: Conditional possibility; Part III: Possibilistic independence. Int. J. of General Systems, 25: 291-371, 1997.

[14] J.-M. Bernard, Bayesian Interpretation of Frequentist Procedures for a Bernoulli Process, The American Statistician, 50(1), 7-13, 1996.

[15] J.-M. Bernard, Bayesian Inference for Categorized Data. In H. Rouanet et al., New Ways in Statistical Methodology: From Significance Tests to Bayesian Inference, Bern: Peter Lang, pp. 159-226, 1998.

[16] J.-M. Bernard, An Introduction to the Imprecise Dirichlet Model for Multinomial Data, International Journal of Approximate Reasoning, 39, 123-150, 2005.

[17] P. Walley, Inferences from multinomial data: learning about a bag of marbles, Journal of the Royal Statistical Society, Series B, 58(1), 3-57, 1996.

[18] P. Walley, Reconciling frequentist properties with the likelihood principle, Journal of Statistical Planning and Inference, 105, 35-65, 2002.

[19] M. Zaffalon, The naive credal classifier, Journal of Statistical Planning and Inference, 105(1), pp. 5-21, 2002.

[20] C. P. de Campos and F. G. Cozman, The Inferential Complexity of Bayesian and Credal Networks. In: International Joint Conference on Artificial Intelligence, pp. 1313-1318, 2005.

[21] C. P. de Campos and F. G. Cozman, Belief Updating and Learning in Semi-Qualitative Probabilistic Networks. In: Conference on Uncertainty in Artificial Intelligence, AUAI Press, pp. 153-160, 2005.

[22] A. Cano, M. Gomez-Olmedo and S. Moral, Credal Nets with Probabilities Estimated with an Extreme Imprecise Dirichlet Model. In: International Symposium on Imprecise Probability: Theories and Applications, pp. 57-66, 2007.

[23] D. Rios Insua and F. Ruggeri (eds.), Robust Bayesian Analysis, Springer-Verlag, New York, 2000.

[24] J.O. Berger, D. Rios Insua, and F. Ruggeri, Bayesian robustness. In Robust Bayesian Analysis (D. Rios Insua and F. Ruggeri, eds.), Springer-Verlag, New York, 2000.

[25] J.O. Berger, The robust Bayesian viewpoint (with discussion). In Robustness of Bayesian Analysis (J. Kadane, ed.), North Holland, Amsterdam, 1984.

[26] J.O. Berger, Statistical Decision Theory and Bayesian Analysis, Springer-Verlag, New York, 1985.

[27] J.O. Berger, Robust Bayesian analysis: sensitivity to the prior, Journal of Statistical Planning and Inference, vol. 25, pp. 303-328, 1990.

[28] J.O. Berger, An overview of robust Bayesian analysis (with discussion), TEST, vol. 3, pp. 5-59, 1994.

[29] G. Shafer. The Art of Causal Conjecture. The MIT Press, 1996.

[30] G. de Cooman and F. Hermans, Imprecise probability trees: Bridging two theories of imprecise probability, Artificial Intelligence, 2008. In press. DOI:10.1016/j.artint.2008.03.001 (http://dx.doi.org/10.1016/j.artint.2008.03.001)

[31] G. de Cooman and E. Miranda, Weak and strong laws of large numbers for coherent lower previsions, Journal of Statistical Planning and Inference, 138(8), 2409-2432, 2008. DOI:10.1016/j.jspi.2007.10.020 (http://dx.doi.org/10.1016/j.jspi.2007.10.020)

[32] G. de Cooman, F. Hermans and E. Quaeghebeur, Imprecise Markov chains and their limit behaviour. Submitted for publication (arXiv:0801.0980).

[33] G. de Cooman, E. Miranda and E. Quaeghebeur, Representation insensitivity in immediate prediction under exchangeability, International Journal of Approximate Reasoning, 2008. In press. DOI:10.1016/j.ijar.2008.03.010 (http://dx.doi.org/10.1016/j.ijar.2008.03.010)

   
Last Update: December 10th, 2008