Nonparametric Bayes and empirical Bayes for species sampling problems: classical questions, new directions and related issues (NBEB-SSP)
Project
Consider a population of individuals belonging to different species with unknown proportions. Given an
initial (observable) random sample from the population, how do we estimate the number of species in the
population, or the probability of discovering a new species in one additional sample, or the number of
hitherto unseen species that would be observed in additional unobservable samples? These are archetypal
examples of a broad class of statistical problems referred to as species sampling problems (SSP), namely:
statistical problems in which the objects of inference are functionals involving the unknown species
proportions and/or the species frequency counts induced by observable and unobservable samples from the
population. SSPs first appeared in ecology, and their importance has grown considerably in recent years,
driven by challenging applications in a wide range of scientific disciplines, including biosciences and
physical sciences, engineering sciences, machine learning, theoretical computer science and information
theory.
The objective of this project is the introduction and a thorough investigation of new nonparametric Bayes
and empirical Bayes methods for SSPs. The proposed advances will include: i) addressing challenging
methodological open problems in classical SSPs under the nonparametric empirical Bayes framework, which
is arguably the most developed framework for classical SSPs and currently the one most widely implemented
by practitioners; ii) fully exploiting and developing the potential of tools from mathematical analysis,
combinatorial probability and Bayesian nonparametric statistics to set forth a coherent modern approach to
classical SSPs, and then investigating the interplay between this approach and its empirical counterpart;
iii) extending the scope of the above studies to more challenging SSPs, and classes of generalized SSPs,
that have emerged recently in the fields of biosciences and physical sciences, machine learning and
information theory.