USC Probability and Statistics Seminars (Fall 2018)
Fridays 3:30pm in KAP 414; tea is usually provided at 3:00pm
Organizers: Steve Heilman and Stanislav Minsker
For seminars/colloquia on other topics, see the Department of Mathematics webpage.
Older seminars: Spring 2017, Fall 2016, Spring 2016, Fall 2015, Spring 2015, Fall 2014, 2013-2014
-
Fall 2018 seminars
August 31: Johannes Muhle-Karbe (Carnegie Mellon University)
Equilibrium Asset Pricing with Transaction Costs
We study a risk-sharing equilibrium where heterogeneous agents trade subject to quadratic transaction costs. The corresponding equilibrium asset prices and trading strategies are characterized by a system of fully coupled forward-backward stochastic differential equations. We show that a unique solution exists provided that the agents' preferences are sufficiently similar. (Joint work in progress with Martin Herdegen and Dylan Possamaï)
September 7: Roman Vershynin (UC Irvine)
From number theory to machine learning: hunting for smooth Boolean functions
The most fundamental kind of functions studied in computer science are Boolean functions. Most Boolean functions oscillate a lot, which is analogous to the fact that "most" continuous functions on R are nowhere differentiable. A "smooth" Boolean function can be generated by taking the sign of some polynomial of low degree in n variables. Such functions are called polynomial threshold functions, and they are widely used in machine learning as classification devices. Surprisingly, we do not know how many low-degree polynomial threshold functions there are! An approximate answer to this question has been known only for polynomials of degree 1, i.e. for linear functions. In a recent joint work with Pierre Baldi, we found a way to approximately count polynomial threshold functions of any fixed degree. This solves a problem of M. Saks that goes back to 1993 and much earlier. Our argument draws insights from analytic number theory, additive combinatorics, enumerative combinatorics, probability and discrete geometry. I will describe some of these connections, focusing particularly on a beautiful interplay of zeta and Möbius functions in number theory, hyperplane arrangements in enumerative combinatorics and random tensors in probability theory.
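For readers who want to experiment, here is a minimal sketch (ours, not from the talk; all names and parameters are illustrative) of degree-d polynomial threshold functions on the Boolean cube, with a crude Monte Carlo lower bound on how many distinct ones exist for a tiny n. Exact counting is precisely what is hard in general.

    import itertools
    import numpy as np

    rng = np.random.default_rng(0)

    def ptf_sign_pattern(coeffs, monomials, cube):
        # A degree-d PTF is x -> sign(p(x)) for a polynomial p of degree <= d:
        # p(x) = sum_j coeffs[j] * prod_{i in monomials[j]} x_i (empty product = 1).
        vals = sum(c * np.prod(cube[:, list(mono)], axis=1)
                   for c, mono in zip(coeffs, monomials))
        return tuple(np.sign(vals).astype(int))

    n, d = 3, 2
    cube = np.array(list(itertools.product([-1, 1], repeat=n)))
    monomials = [m for k in range(d + 1) for m in itertools.combinations(range(n), k)]

    # Sampling random Gaussian coefficients and collecting distinct sign patterns
    # only gives a lower bound on the number of degree-d PTFs.
    patterns = {ptf_sign_pattern(rng.standard_normal(len(monomials)), monomials, cube)
                for _ in range(20000)}
    print(f"found at least {len(patterns)} distinct degree-{d} PTFs on {{-1,1}}^{n}")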
September 14: Xin Tong (USC Marshall School of Business)
Neyman-Pearson classification
In many binary classification applications, such as disease diagnosis and spam detection, practitioners commonly face the need to limit type I error (that is, the conditional probability of misclassifying a class 0 observation as class 1) so that it remains below a desired threshold. To address this need, the Neyman-Pearson (NP) classification paradigm is a natural choice; it minimizes type II error (that is, the conditional probability of misclassifying a class 1 observation as class 0) while enforcing an upper bound, alpha, on the type I error. Although the NP paradigm has a century-long history in hypothesis testing, it has not been well recognized and implemented in classification schemes. Common practices that directly limit the empirical type I error to no more than alpha do not satisfy the type I error control objective because the resulting classifiers are still likely to have type I errors much larger than alpha. This talk introduces the speaker’s work on NP classification algorithms and their applications and raises current challenges under the NP paradigm.
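To make the contrast with naive empirical control concrete, the sketch below (ours; a simplified variant of the order-statistic constructions used in NP classification algorithms, with illustrative names and constants) picks the threshold as an order statistic of held-out class-0 scores so that the population type I error exceeds alpha only with probability at most delta.

    import numpy as np
    from scipy.stats import binom

    def np_order_statistic_threshold(scores0, alpha=0.05, delta=0.05):
        """Threshold t on class-0 scores (classify as class 1 when score > t)
        such that P(type I error > alpha) <= delta, via a binomial tail bound."""
        s = np.sort(scores0)
        n = len(s)
        for k in range(1, n + 1):
            # With t equal to the k-th smallest score, type I error exceeds alpha
            # only if at least k of the n scores fall below the (1 - alpha)
            # quantile, an event of probability binom.sf(k - 1, n, 1 - alpha).
            if binom.sf(k - 1, n, 1 - alpha) <= delta:
                return s[k - 1]
        raise ValueError("not enough class-0 data for this (alpha, delta)")

    rng = np.random.default_rng(1)
    t = np_order_statistic_threshold(rng.normal(size=500))
    print("threshold:", t)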
September 21: Wei-Min Shen (USC Information Sciences Institute)
Self-Reconfigurable “SuperBot” Robots and Surprise-Based Learning (SBL)
Self-reconfigurable robots/UAVs would autonomously adapt their logical or physical configurations and locomotion/manipulation capabilities (such as shape, size, and function) to the environments and tasks at hand. With modularity, versatility, self-healing ability, and low cost for mass production, such systems could provide powerful solutions for complex tasks in unstructured, uncertain, and dynamic environments. This talk first describes the major challenges for such research, including dynamic network topology, limited individual resources, lack of reliable global communication and synchronization, and constraints on centralized control. We then present a particular robotic system called "SuperBot" for self-assembly, self-healing, and self-reconfiguration, as well as coordinated precision locomotion and manipulation. Individual SuperBot units are modeled as "robotic stem cells" and use a biologically inspired mechanism called "digital hormones" to physically connect and disconnect with each other and collaborate via content-based messages without any globally unique identifiers. Featured demonstration movies will show their modular designs, distributed control, topology-triggered discovery and behaviors, and their potential applications. A non-parametric machine learning technology called "Surprise-Based Learning (SBL)" has been developed over 30 years to enable individual or global systems to autonomously learn from their own experiences in both deterministic and uncertain environments. Based on logic and probability, SBL can detect and rectify model deficiencies or anomalies, build and refine predictive models (such as POMDP), and recover from local failures. We will discuss some details of SBL from both theoretical and practical perspectives, including developmental psychology, cognitive architecture, system reliability, detectability and learnability of surprises, model convergence, correctness, and scalability, with collected simulation and physical experimental results.
September 28: Zachary Feinstein (Washington University in St. Louis)
Moving Scalarizations for Time Consistency in Dynamic Multivariate Problems in Finance
In this talk we will consider the time consistency problem for dynamic multivariate risk measures. Such risk measures arise in markets with transaction costs and, separately, when studying systemic risk. We will introduce the family of all market-consistent scalar multivariate risk measures. Duality of such risk measures will be presented. These results will be used to provide a recursive formulation of multivariate scalar risk measures. We relate this recursive formulation to a notion of a "moving scalarization", i.e., one in which the consistent risk weights applied to the assets (in markets with transaction costs) or banks (in systemic risk) must change dynamically in time.
October 5: Jianfeng Zhang (USC)
Viscosity Solutions to Parabolic Master Equations
The master equation is a type of PDE whose state variable involves the distribution of a certain underlying state process, and thus it is by nature an infinite-dimensional problem. It is a powerful tool for studying the limit behavior of large interacting systems, including mean field games and systemic risk. It also appears naturally in stochastic control problems with partial information and in time inconsistent problems. In this talk we propose a novel notion of viscosity solution for parabolic master equations, arising mainly from control problems, and investigate its wellposedness. Our main innovation is to restrict the involved measures to a certain set of semimartingale measures that satisfies the desired compactness.
October 12: no seminar
October 19: Matthew Junge (Department of Mathematics, Duke University)
Diffusion-Limited Annihilating Systems
We study a two-type annihilating system in which particles are placed with equal density on the integer lattice. Particles perform simple random walk and annihilate when they contact a particle of different type. The occupation probability of the origin was known to exhibit anomalous behavior in low-dimension when particles have equal speeds. Describing the setting with asymmetric speeds has been open for over 20 years. We prove a lower bound that matches physicists’ conjectures and discuss partial progress towards an upper bound. Joint with Michael Damron, Hanbaek Lyu, Tobias Johnson, and David Sivakoff.
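A toy one-dimensional simulation of the two-type annihilating dynamics (ours, for intuition only; the talk concerns the lattice in general dimension, and the ring size, horizon, and speed parameter below are illustrative). Setting q = 1 gives the equal-speed case; q < 1 makes type -1 slower.

    import numpy as np
    from collections import defaultdict

    rng = np.random.default_rng(2)
    L, T, q = 500, 300, 0.0        # ring size, steps, jump prob of type -1
    particles = list(zip(range(L), rng.choice([-1, 1], size=L)))

    for _ in range(T):
        sites = defaultdict(lambda: [0, 0])     # site -> [count of +1, count of -1]
        for x, t in particles:
            if t == 1 or rng.random() < q:      # asymmetric speeds when q < 1
                x = (x + rng.choice([-1, 1])) % L
            sites[x][0 if t == 1 else 1] += 1
        particles = []
        for x, (n_plus, n_minus) in sites.items():
            surplus = n_plus - n_minus          # opposite types annihilate in pairs
            if surplus:
                particles += [(x, 1 if surplus > 0 else -1)] * abs(surplus)

    print(f"surviving particle density after {T} steps: {len(particles) / L:.3f}")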
October 26: Steven Heilman (USC)
Symmetric Sets with Minimal Gaussian Surface Area
It is well known that a Euclidean set with fixed Gaussian volume and minimal Gaussian surface area is a half-space. On the other hand, finding the symmetric Euclidean set of fixed Gaussian volume and minimal Gaussian surface area has remained elusive since at least a work of Barthe from 2001. We show that if this set or its complement is convex and if a certain integral of its curvature is not close to 1, then the set must be a round cylinder. That is, with a small range of exceptions, we resolve the convex case of Barthe’s conjecture. Time permitting, we will discuss applications to theoretical computer science and to finding Euclidean partitions of fixed Gaussian volume and minimal Gaussian surface area.
November 2: Aaditya Ramdas (Department of Statistics and Data Science, Carnegie Mellon University)
Exponential line-crossing inequalities
This paper develops a class of exponential bounds for the probability that a martingale sequence crosses a time-dependent linear threshold. Our key insight is that it is both natural and fruitful to formulate exponential concentration inequalities in this way. We illustrate this point by presenting a single assumption and a single theorem that together strengthen many tail bounds for martingales, including classical inequalities (1960-80) by Bernstein, Bennett, Hoeffding, and Freedman; contemporary inequalities (1980-2000) by Shorack and Wellner, Pinelis, Blackwell, van de Geer, and de la Pena; and several modern inequalities (post-2000) by Khan, Tropp, Bercu and Touati, Delyon, and others. In each of these cases, we give the strongest and most general statements to date, quantifying the time-uniform concentration of scalar, matrix, and Banach-space-valued martingales, under a variety of nonparametric assumptions in discrete and continuous time. In doing so, we bridge the gap between existing line-crossing inequalities, the sequential probability ratio test, the Cramer-Chernoff method, self-normalized processes, and other parts of the literature.
(joint work with Steve Howard, Jas Sekhon and Jon McAuliffe, preprint https://arxiv.org/abs/1808.03204)
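For intuition, a classical special case of such a line-crossing bound (our illustration, not the paper's general theorem): for a simple random walk $S_t$ with $\pm 1$ steps, Ville's inequality applied to the exponential supermartingale $\exp(\lambda S_t - t\lambda^2/2)$ with $\lambda = 2b$ gives $P(\exists t \ge 1: S_t \ge a + bt) \le e^{-2ab}$ for $a, b > 0$. A quick Monte Carlo check, with a finite horizon approximating the infinite-time event:

    import numpy as np

    rng = np.random.default_rng(3)
    a, b = 10.0, 0.2                      # line a + b*t; bound exp(-2ab) ~ 0.018
    T, reps = 2000, 2000
    crossed = 0
    for _ in range(reps):
        S = np.cumsum(rng.choice([-1, 1], size=T))
        crossed += bool(np.any(S >= a + b * np.arange(1, T + 1)))
    print("empirical crossing frequency:", crossed / reps)
    print("time-uniform bound exp(-2ab):", np.exp(-2 * a * b))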
November 9: Joe Neeman (Department of Mathematics, UT-Austin)
The Gaussian double bubble
The Gaussian isoperimetric inequality states that if we want to partition R^n into two sets with prescribed Gaussian measure while minimizing the Gaussian surface area of the interface between the sets, then the optimal partition is obtained by cutting R^n with a hyperplane. We prove an extension to more than two parts. For example, the optimal way to partition R^3 into three parts involves cutting along three rays that meet at 120-degree angles at a common point. This is the Gaussian analogue of the famous Double Bubble Theorem in Euclidean space, which describes the shape that results from blowing two soap bubbles and letting them stick together.
November 16: no seminar (USC mathematical finance conference)
Monday November 19 [Note the nonstandard day]: Cesar Cuenca (Department of Mathematics, MIT)
Interlacing Particle Systems and Special Functions
I will talk about the beta orbital corners process, a natural interpolation (on the dimension of the base field) of orbital measures from random matrix theory. The new result is the convergence in probability of finitely many levels of the orbital corners process to a Gaussian process. Our approach is based on the study of asymptotics of a class of special functions (called the multivariate Bessel functions) via explicit formulas.
November 23: no seminar (Thanksgiving)
Wednesday November 28 [Math colloquium]: Tim Austin (Department of Mathematics, UCLA)
The Weak Pinsker Property
This talk is about the structure theory of measure-preserving systems: transformations of a finite measure space that preserve the measure. Many important examples arise from stationary processes in probability, and simplest among these are the i.i.d. processes. In ergodic theory, i.i.d. processes are called Bernoulli shifts. Some of the main results of ergodic theory concern an invariant of systems called their entropy, which turns out to be intimately related to the existence of maps from a general system to Bernoulli shifts.
I will give an overview of this area and its history, ending with a recent advance in this direction. A measure-preserving system has the weak Pinsker property if it can be split, in a natural sense, into a direct product of a Bernoulli shift and a system of arbitrarily low entropy. The recent result is that all ergodic measure-preserving systems have this property.
This talk will assume some familiarity with measure theory and the basic language of random variables and their distributions. Past exposure to entropy or ergodic theory will not be essential.
USC Probability and Statistics Seminars (Spring 2018)
Fridays 3:30pm in KAP 414; tea is usually provided at 3:00pm
Organizer: Stanislav Minsker
For seminars/colloquia on other topics, see the Department of Mathematics webpage.
Older seminars: Fall 2016, Spring 2016, Fall 2015, Spring 2015, Fall 2014, 2013-2014
-
Spring 2018 seminars
January 12: Pierre-François Rodriguez (UCLA, Department of Mathematics)
On random walks and local limits in dimension 2
We will discuss recent developments that use Poissonian "loop soups" à la Lawler-Werner, Le Jan and Sznitman to study random walks on large two-dimensional tori. We will show how the underlying correspondence can be used to effectively describe certain critical phenomena in such systems, which occur when the walk is run up to suitable timescales while forced to avoid a given point.
January 19: Andrew Stuart (Caltech)
Large Graph Limits of Learning Algorithms
Many problems in machine learning require the classification of high dimensional data. One methodology to approach such problems is to construct a graph whose vertices are identified with data points, with edges weighted according to some measure of affinity between the data points. Algorithms such as spectral clustering, probit classification and the Bayesian level set method can all be applied in this setting. The goal of the talk is to describe these algorithms for classification, and analyze them in the limit of large data sets. Doing so leads to interesting problems in the calculus of variations, in stochastic partial differential equations and in Markov chain Monte Carlo, all of which will be highlighted in the talk. These limiting problems give insight into the structure of the classification problem, and into algorithms for it.
The talk is based on collaboration with Matt Dunlop (Caltech), Dejan Slepcev (CMU) and Matt Thorpe (Cambridge).
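A minimal sketch of the common first step for the graph-based algorithms above: build an affinity graph on the data points and cluster using the second eigenvector of the graph Laplacian (spectral clustering). The toy data and bandwidth are our illustrative choices, not from the talk.

    import numpy as np

    rng = np.random.default_rng(4)
    # Two Gaussian clouds as toy "data points" (the vertices of the graph).
    X = np.vstack([rng.normal(-2, 1, size=(50, 2)), rng.normal(2, 1, size=(50, 2))])

    # Edge weights from a Gaussian affinity between data points.
    d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
    W = np.exp(-d2 / 2.0)
    np.fill_diagonal(W, 0.0)

    # Unnormalized graph Laplacian; the sign of its second eigenvector
    # (the Fiedler vector) separates the two clusters.
    L = np.diag(W.sum(1)) - W
    eigvals, eigvecs = np.linalg.eigh(L)
    labels = (eigvecs[:, 1] > 0).astype(int)
    print("cluster sizes:", np.bincount(labels))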
February 2: Jacob Bien (USC, Marshall School of Business)
Rare Feature Selection in High Dimensions
Many prediction problems include a large number of features that count the frequency of various events. In several common application areas, nearly all of these events are rare, leading to design matrices that are highly sparse. The challenge posed by such "rare features" has received very little attention despite its prevalence. We show, both theoretically and empirically, that not explicitly accounting for the rareness of features can greatly reduce the effectiveness of an analysis. We next propose a framework for aggregating rare features into denser features in a flexible manner that creates better predictors of the response. Our strategy leverages side information in the form of a tree that encodes feature similarity. We apply our method to data from Trip Advisor, in which we predict the numerical rating of a hotel based on the text of the associated review. Our method achieves high accuracy by making effective use of rare words; by contrast, the lasso is unable to identify highly predictive words if they are too rare.
February 16: Timothy Cannings (USC, Marshall School of Business)
Local nearest neighbour classification with applications to semi-supervised learning
We derive a new asymptotic expansion for the global excess risk of a local $k$-nearest neighbour classifier, where the choice of $k$ may depend upon the test point. We prove that, provided the $d$-dimensional marginal distribution of the features has a finite $\rho$th moment for some $\rho > 4$ (as well as other regularity conditions), a local choice of $k$ can yield a rate of convergence of the excess risk of $O(n^{-4/(d+4)})$, where $n$ is the sample size, whereas for the standard $k$-nearest neighbour classifier, our theory would require $d \geq 5$ and $\rho > 4d/(d-4)$ finite moments to achieve this rate. Our results motivate a new $k$-nearest neighbour classifier for semi-supervised learning problems, where the unlabelled data are used to obtain an estimate of the marginal feature density, and fewer neighbours are used for classification when this density estimate is small. The potential improvements over the standard $k$-nearest neighbour classifier are illustrated both through our theory and via a simulation study.
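A sketch of the semi-supervised recipe described above (ours; the specific rule mapping the density estimate to k, and all constants and names, are illustrative): estimate the marginal density from unlabelled data, and use fewer neighbours where that estimate is small.

    import numpy as np

    def knn_density(x, U, k=10):
        # kNN density estimate from unlabelled data U, up to the constant
        # volume factor, which cancels in the ratio used below.
        r = np.sort(np.linalg.norm(U - x, axis=1))[k - 1]
        return k / (len(U) * r ** U.shape[1] + 1e-12)

    def local_knn_classify(x, X, y, U, k_max=15):
        # Fewer neighbours where the estimated marginal density is small.
        f = knn_density(x, U)
        f_ref = knn_density(U.mean(axis=0), U)      # reference density level
        k = max(1, int(k_max * min(1.0, f / f_ref)))
        nn = np.argsort(np.linalg.norm(X - x, axis=1))[:k]
        return int(y[nn].mean() > 0.5), k

    rng = np.random.default_rng(5)
    X = rng.normal(size=(200, 2)); y = (X[:, 0] > 0).astype(int)
    U = rng.normal(size=(2000, 2))                  # unlabelled sample
    label, k_used = local_knn_classify(np.array([2.5, 0.0]), X, y, U)
    print(f"predicted class {label} using k = {k_used} neighbours")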
February 23: Wen Sun (USC, Marshall School of Business)
A General Framework for Information Pooling in Large-Scale Multiple Testing
This talk discusses a general framework for exploiting the sparsity information in two-sample multiple testing problems. We propose to first construct a covariate sequence, in addition to the usual primary test statistics, to capture the sparsity structure, and then incorporate the auxiliary covariates in inference via a three-step algorithm consisting of grouping, adjusting and pooling (GAP). The GAP procedure provides a simple and effective framework for information pooling. An important advantage of GAP is its capability of handling various dependence structures such as those arising from multiple testing for high-dimensional linear regression, differential correlation analysis, and differential network analysis. We establish general conditions under which GAP is asymptotically valid for false discovery rate control, and show that these conditions are fulfilled in a range of applications. Numerical results demonstrate that existing methods can be much improved by the proposed framework. An application to a breast cancer study for identifying gene-gene interactions will be discussed if time permits.
March 2: Akihiko Nishimura (UCLA)
Discontinuous Hamiltonian Monte Carlo for models with discrete parameters and discontinuous likelihoods
Hamiltonian Monte Carlo (HMC) is a powerful sampling algorithm employed by several probabilistic programming languages. Its fully automatic implementations have made HMC a standard tool for applied Bayesian modeling. While its performance is often superior to alternatives under a wide range of models, one prominent weakness of HMC is its inability to handle discrete parameters. In this talk, I present discontinuous HMC, an extension that can efficiently explore discrete spaces involving ordinal parameters as well as target distributions with discontinuous densities. The proposed algorithm is based on two key ideas: embedding of discrete parameters into a continuous space and simulation of Hamiltonian dynamics on a piecewise smooth density function. The latter idea has been explored under special cases in the literature, but the extensions introduced here are critical in turning the idea into a general and practical sampling algorithm. When properly tuned, discontinuous HMC is guaranteed to outperform a Metropolis-within-Gibbs algorithm as the two algorithms coincide under a specific (and sub-optimal) implementation of discontinuous HMC. We apply our algorithm to challenging posterior inference problems to demonstrate its wide applicability and competitive performance. To make the talk more accessible, I will start my talk with a review of the essential ideas and notions behind HMC. A brief review of Bayesian inference and Markov chain Monte Carlo will also be provided.
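Since the talk opens with a review of HMC, here is a minimal sketch of the standard algorithm with a leapfrog integrator for a smooth target (a standard normal); the discontinuous variant of the talk replaces this integrator on piecewise-smooth densities. All tuning constants are our illustrative choices.

    import numpy as np

    rng = np.random.default_rng(6)

    def hmc_step(q, log_p, grad_log_p, eps=0.1, n_leapfrog=20):
        """One step of standard HMC: resample momentum, integrate Hamiltonian
        dynamics with leapfrog, then accept/reject to correct discretization error."""
        p = rng.standard_normal(q.shape)
        q_new, p_new = q.copy(), p + 0.5 * eps * grad_log_p(q)
        for i in range(n_leapfrog):
            q_new = q_new + eps * p_new
            if i < n_leapfrog - 1:
                p_new = p_new + eps * grad_log_p(q_new)
        p_new = p_new + 0.5 * eps * grad_log_p(q_new)
        log_accept = (log_p(q_new) - 0.5 * p_new @ p_new) - (log_p(q) - 0.5 * p @ p)
        return q_new if np.log(rng.random()) < log_accept else q

    log_p = lambda x: -0.5 * x @ x        # standard normal target, up to a constant
    grad_log_p = lambda x: -x
    q, samples = np.zeros(2), []
    for _ in range(2000):
        q = hmc_step(q, log_p, grad_log_p)
        samples.append(q)
    print("sample variance (should be near 1):", np.var(samples[500:], axis=0))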
March 23: Joseph Salmon (TELECOM ParisTech )
Generalized Concomitant Multi-Task Lasso for sparse multimodal regression
For standard Lasso theory to hold, the regularization parameter should be proportional to the noise level, which is generally unknown in practice. A remedy is to consider estimators, such as the Concomitant Lasso, which jointly optimize over the regression coefficients and the noise level. However, when data from different sources are pooled to increase the sample size, or when dealing with multimodal data, noise levels differ and new dedicated estimators are needed. We provide new statistical and computational solutions to perform heteroscedastic regression, with an emphasis on functional brain imaging with magneto- and electroencephalographic (M/EEG) signals. This joint work with M. Massias, O. Fercoq and A. Gramfort is to appear in AISTATS 2018. Pdf: <https://arxiv.org/abs/1705.09778>. Python code: <https://github.com/mathurinm/SHCL>.
March 30: Yaming Yu (UC Irvine, Department of Statistics)
Successive sampling, monotonicity, and Bayesian bandits problems
Stochastic orders appear naturally in many problems in statistics and applied probability and can be used to derive useful inequalities. We discuss monotonicity in classical limit theorems, structural results in Bayesian bandit problems, and comparisons between successive sampling and conditional Poisson sampling in sample surveys.
April 6: Leila Setayeshgar (Providence College)
Large Deviations for a Class of Stochastic Semilinear Partial Differential Equations
Standard approaches to large deviations analysis for stochastic partial differential equations (SPDEs) are often based on approximations. These approximations are mostly technical and often onerous to carry out. In 2008, Budhiraja, Dupuis and Maroulas employed the weak convergence approach and showed that these approximations can be avoided for many infinite-dimensional models. Large deviations analysis for such systems instead relies on demonstrating existence, uniqueness and tightness properties of certain perturbations of the original process. In this talk, we use the weak convergence approach and establish the large deviation principle for the law of the solutions to a class of semilinear SPDEs. Our family of semilinear SPDEs contains, as special cases, both the stochastic Burgers' equation and the stochastic reaction-diffusion equation.
April 13: Henry Schellhorn (Claremont Graduate University, Department of Mathematics)
Density formula for functionals of compound Poisson processes using Malliavin calculus
We extend the work of Nourdin and Viens (Electronic Journal of Probability, 2009) to obtain a new exact formula for the density of the law of a random variable Z, which is measurable and differentiable with respect to a compound Poisson process. The main restriction is that the Lévy measure must charge the whole real line.
This talk is work in progress, and not all results have been exploited. For this reason, time will probably allow me to describe another project.
April 20: Guido Montufar (UCLA, Department of Mathematics)
Mixtures and Products in Two Graphical Models
This talk is about two graphical models with hidden variables. One is a mixture model and the other is a product of mixtures called restricted Boltzmann machine. We derive relations between theoretical properties of restricted Boltzmann machines and several natural notions from discrete mathematics and convex geometry. We take a closer look at the first non-trivial case, with three observed binary variables. Although the mixture and the product of mixtures look different from their parametrizations, we show that in this case they represent the same set of distributions on the interior of the probability simplex, and are equal up to closure. We give a semi-algebraic description of this model in terms of six binomial inequalities and obtain closed form expressions for the maximum likelihood estimates.
This talk is based on joint work with Jason Morton and Anna Seigal.
April 27: Leonard Wong (USC, Department of Mathematics)
Information geometry and optimal transport
Information geometry studies spaces of probability distributions using differential geometry. On the other hand, optimal transport is about finding efficient methods of transporting one distribution to another. Both fields apply geometric ideas in probability and have found numerous applications in statistics and machine learning. In this talk we explain some new connections between the two fields. We study a family of logarithmic divergences (distance-like quantities) which generalizes the Bregman divergence (of which the relative entropy is a prime example). These divergences have a dual structure in terms of an optimal transport map, satisfy a generalized Pythagorean theorem, and correspond to natural extensions of exponential family. Geometrically, they characterize (locally) statistical manifolds with constant curvatures.
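For reference (a standard definition, included here for the reader; the talk's logarithmic divergences generalize it): given a differentiable convex function $\varphi$, the Bregman divergence is
$D_\varphi(x, y) = \varphi(x) - \varphi(y) - \langle \nabla \varphi(y), x - y \rangle \geq 0,$
and the choice $\varphi(x) = \sum_i x_i \log x_i$ on the probability simplex recovers the relative entropy $\sum_i x_i \log(x_i / y_i)$.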
USC Probability and Statistics Seminars (Fall 2017)
Fridays 3:30pm in KAP 414; tea is usually provided at 3:00pm
Organizer: Stanislav Minsker
For seminars/colloquia on other topics, see the Department of Mathematics webpage.
Older seminars: Spring 2017, Fall 2016, Spring 2016, Fall 2015, Spring 2015, Fall 2014, 2013-2014
-
Fall 2017 seminars
August 25: Christian Keller (University of Michigan, Department of Mathematics)
Path-dependent Hamilton-Jacobi equations in infinite dimensions
Abstract: We propose notions of minimax and viscosity solutions for a class of fully nonlinear path-dependent PDEs with nonlinear, monotone, and coercive operators on Hilbert space. Our main result is well-posedness (existence, uniqueness, and stability) for minimax solutions. A particular novelty of our approach is a suitable combination of minimax and viscosity solution techniques. Thereby, we establish a comparison principle for path-dependent PDEs under conditions that are weaker even in the finite-dimensional case. In contrast to most of the related works on PDEs in infinite dimensions, perturbed optimization is entirely avoided. The path-dependent setting itself enables us to circumvent the lack of compactness in infinite-dimensional Hilbert spaces. As an application, our theory makes it possible to employ the dynamic programming approach to study optimal control problems for a fairly general class of (delay) evolution equations in the variational framework. Furthermore, differential games associated to such evolution equations can be investigated following the Krasovskii-Subbotin approach similarly as in finite dimensions. Our results are also potentially useful for applications to large deviations. This talk is based on joint work with Erhan Bayraktar.
September 1: Wuchen Li (UCLA, Department of Mathematics)
Optimal transport on finite graphs
Optimal transport theory provides powerful tools in both mathematics and applications. In this talk, we consider similar matters on finite graphs. Various recent developments related to Fokker-Planck equations, Shannon-Boltzmann entropy, Fisher information, as well as the Wasserstein metric on graphs will be presented. Some applications will also be presented, including evolutionary games and the "geometry" of finite graphs.
September 8: Wen-Xin Zhou (UC San Diego, Department of Mathematics)
Robust estimation and inference via multiplier bootstrap
Massive data are often contaminated by outliers and heavy-tailed errors. In the presence of heavy-tailed data, finite sample properties of the least squares-based methods, typified by the sample mean, are suboptimal both theoretically and empirically. To address this challenge, we propose the adaptive Huber regression for robust estimation and inference. The key observation is that the robustification parameter should adapt to sample size, dimension and moments for an optimal tradeoff between bias and robustness. For heavy-tailed data with bounded $(1+\delta)$-th moment for some $\delta>0$, we establish a sharp phase transition for robust estimation of regression parameters in both finite dimensional and high dimensional settings: when $\delta \geq 1$, the estimator achieves a sub-Gaussian rate of convergence without sub-Gaussian assumptions, while only a slower rate is available in the regime $0<\delta <1$, and the transition is smooth and optimal. In addition, non-asymptotic Bahadur representation and Wilks' expansion for finite sample inference are derived when higher moments exist. Based on these results, we investigate the theoretical underpinnings of both classical and contemporary statistical inference problems, where heavy-tailed data are present. In particular, we focus on uncertainty quantification methodologies, including the construction of confidence sets and large-scale simultaneous hypothesis testing. We demonstrate that the adaptive Huber regression, combined with the multiplier bootstrap procedure, provides a useful robust alternative to the method of least squares. The idea of adaptivity is to let data, which are probably collected with low quality and exhibit heavy tails, influence the choice of method by which they are analyzed. Together, the theoretical and empirical results reveal the effectiveness of the proposed method, and highlight the importance of having statistical methods that are robust to violations of the assumptions underlying their use.
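A minimal sketch of the adaptivity idea (ours, for the simplest case of mean estimation rather than regression): the Huber robustification parameter grows with the sample size, here at the illustrative rate $\sigma\sqrt{n/\log n}$, so the estimator interpolates between the median and the mean.

    import numpy as np

    def huber_mean(x, tau):
        """Huber M-estimator of the mean via iteratively reweighted averaging."""
        mu = np.median(x)
        for _ in range(100):
            r = x - mu
            w = np.minimum(1.0, tau / np.maximum(np.abs(r), 1e-12))  # psi(r) / r
            mu_new = np.sum(w * x) / np.sum(w)
            if abs(mu_new - mu) < 1e-10:
                break
            mu = mu_new
        return mu

    rng = np.random.default_rng(7)
    n = 5000
    x = rng.standard_t(df=2.5, size=n) + 1.0      # heavy-tailed data, true mean 1
    tau = np.std(x) * np.sqrt(n / np.log(n))      # adaptive robustification parameter
    print("sample mean:", x.mean(), " adaptive Huber:", huber_mean(x, tau))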
September 15: Jason Schweinsberg (UC San Diego, Department of Mathematics)
Rigorous results for a population model with selection
We consider a model of a population of fixed size N in which each individual acquires beneficial mutations at a constant rate. Each individual dies at rate one, and when a death occurs, an individual is chosen at random with probability proportional to the individual’s fitness to give birth. We obtain rigorous results for the rate at which mutations accumulate in the population, the distribution of the fitness levels of individuals in the population at a given time, and the genealogy of the population. Our results confirm nonrigorous predictions of Desai and Fisher (2007), Desai, Walczak, and Fisher (2013), and Neher and Hallatschek (2013).
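A crude discrete-time caricature of the model just described (ours; one Moran-type birth-death event per step, and all parameters illustrative):

    import numpy as np

    rng = np.random.default_rng(8)
    N, mu, s, T = 1000, 0.01, 0.02, 2000   # pop size, mutation prob/step, advantage, steps
    k = np.zeros(N, dtype=int)             # beneficial mutations carried by each individual

    for _ in range(T):
        k += rng.random(N) < mu            # each individual may acquire a new mutation
        fitness = (1.0 + s) ** k
        dead = rng.integers(N)                               # a uniformly chosen death
        parent = rng.choice(N, p=fitness / fitness.sum())    # birth prop. to fitness
        k[dead] = k[parent]

    print(f"mean mutation count: {k.mean():.2f}, std: {k.std():.2f}")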
September 22: Ilya Mironov (Google)
Differential Privacy: From Principled Foundations to Your Browser
We survey progress in the understanding of privacy in statistical databases over the last 10+ years, starting with early negative results followed by the emergence of the notion of differential privacy and its variants. In the second half of the talk we cover uses of differential privacy in the Chrome browser, and its recent applications in machine learning tasks such as text and image recognition.
Bio: Ilya Mironov is a Staff Research Scientist at Google Brain. After completing his PhD at Stanford in 2003, he joined Microsoft Research Silicon Valley, where he worked on cryptography, cryptanalysis, and privacy until 2014.
September 29: Jianfeng Zhang (USC, Department of Mathematics)
A Martingale Approach for Fractional Brownian Motions and Related Path Dependent PDEs
In this talk we consider dynamic backward problems in a framework where the forward state process satisfies a Volterra type SDE, with fractional Brownian motion as a typical example. Such a problem is important in financial markets with rough volatility, as confirmed by some empirical studies. These processes are neither Markov processes nor semimartingales. Our main result is a functional Ito formula, extending the seminal work of Dupire to our more general framework. In particular, unlike in Dupire's work where one needs only to consider the stopped paths, here we need to concatenate the observed path up to the current time with a certain smooth curve derived from the distribution of the future paths. This new feature is due to some intrinsic time inconsistency involved in this framework. We then derive the path dependent PDEs for the backward problems. The talk is based on a joint work with Frederi Viens.
October 6: Kenneth Alexander (USC, Department of Mathematics)
Intersections of independent renewal processes
We consider the intersection (that is, the common renewals) of two independent renewal processes, $\rho=\tau\cap\sigma$. This seemingly natural and classical problem, which has received relatively little attention in the literature, arises in the context of a pinning model in statistical mechanics. Assuming that $P(\tau_1 = n ) = \varphi(n)\, n^{-(1+\alpha)}$ and $P(\sigma_1 = n ) = \tilde\varphi(n)\, n^{-(1+ \tilde\alpha)}$ for some $\alpha,\tilde\alpha \geq 0$ and some slowly varying $\varphi,\tilde\varphi$, we find the asymptotic behavior of $P(\rho_1=n)$. The result may be viewed as a kind of inverse of standard renewal theorems, as we determine probabilities $P(\rho_1=n)$ while knowing asymptotically the renewal mass function $P(n\in\rho)=P(n\in\tau)P(n\in\sigma)$. The work is joint with Quentin Berger.
October 13: Larry Goldstein (USC, Department of Mathematics)
Non-asymptotic distributional bounds for the Dickman approximation of the running time of the Quickselect algorithm
Given a non-negative random variable $W$ and $\theta>0$, let the generalized Dickman transformation map the distribution of $W$ to that of $W^* =_d U^{1/\theta}(W+1)$, where $U \sim {\cal U}[0,1]$, a uniformly distributed variable on the unit interval, independent of $W$, and $=_d$ denotes equality in distribution. It is well known that $W^*$ and $W$ are equal in distribution if and only if $W$ has the generalized Dickman distribution ${\cal D}_\theta$. By use of Stein's method, one can show that the Wasserstein distance $d_1$ between $W$, a non-negative random variable with finite mean, and $D_\theta$ having distribution ${\cal D}_\theta$ obeys the inequality
$d_1(W,D_\theta) \le (1+\theta)d_1(W,W^*).$
The specialization of this bound to the case $\theta=1$ and coupling constructions yield
$d_1(W_n,D) \le \frac{8\log (en/2)+2}{n} \quad \mbox{for all $n \ge 1$, where} \quad W_n=\frac{1}{n}C_n-1$,
with $C_n$ the number of comparisons made by the Quickselect algorithm to find the smallest element of a list of $n$ distinct numbers. The running time for Quickselect to locate the $m^{th}$ smallest element of the list obeys a similar bound, and together recover the results of Hwang and Tsai (2002) that show distributional convergence of $W_n$ to the standard Dickman distribution in the asymptotic regime $m=o(n)$.
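Both objects above are easy to simulate. The sketch below (ours; sample sizes illustrative) draws approximate Dickman samples by iterating the transformation $W \mapsto U^{1/\theta}(W+1)$ with $\theta = 1$, and counts the comparisons Quickselect makes while locating the minimum:

    import numpy as np

    rng = np.random.default_rng(9)

    def dickman_sample(theta=1.0, iters=60):
        # Iterating W <- U^(1/theta) * (W + 1) converges in law to the
        # generalized Dickman distribution, the fixed point of the map.
        w = 0.0
        for _ in range(iters):
            w = rng.random() ** (1.0 / theta) * (w + 1.0)
        return w

    def quickselect_min_comparisons(a):
        # Comparisons made by Quickselect while locating the minimum.
        a, count = list(a), 0
        while len(a) > 1:
            pivot = a[rng.integers(len(a))]
            count += len(a) - 1                  # compare the pivot with all others
            smaller = [x for x in a if x < pivot]
            a = smaller if smaller else [pivot]  # min lies among 'smaller', else pivot
        return count

    reps, n = 1000, 1000
    W = np.array([quickselect_min_comparisons(rng.random(n)) / n - 1
                  for _ in range(reps)])
    D = np.array([dickman_sample() for _ in range(reps)])
    print(f"mean of W_n: {W.mean():.3f}; mean of Dickman samples: {D.mean():.3f} (exact: 1)")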
October 20: Meisam Razaviyayn (USC, Department of Industrial & Systems Engineering)
Two non-convex optimization problems: learning deep models and tuning hyper-parameters
In this talk, we study two important classes of optimization problems arising in machine learning applications:
1) Learning deep models and 2) Tuning hyper-parameters. In the first part of the talk, we consider the problem of training deep models. These models map the input (feature) data to the output (response) vector through the composition of multiple simple mappings. Training feedforward multi-layer neural networks or low-rank matrix decomposition are examples of such problems. We introduce a mathematical framework for analyzing the local optima of training these deep models. In particular, by leveraging the local openness property of the resulting mapping, we provide sufficient conditions under which the local optima of the objective function are globally optimal. Our result leads to the characterization of local optima of linear feedforward networks and provides sufficient conditions for the absence of spurious local optima in hierarchical non-linear neural networks. In the second part of the talk, we consider the problem of optimizing the cross validation (CV) for a given learning problem. We first develop a computationally efficient approximation of CV and provide theoretical guarantees for its performance. Then we use our approximation to provide an optimization algorithm for finding the optimal hyper-parameters in the empirical risk minimization framework. In our numerical experiments, we illustrate the accuracy and efficiency of our approximation as well as our proposed framework for choosing the optimal regularizer. This is joint work with Maher Nouiehed (USC), Ahmad Beirami (MIT), Shahin Shahrampour (Harvard), and Vahid Tarokh (Harvard).
October 27: Sanjoy Dasgupta (UC San Diego, Computer Science and Engineering)
Learning from partial correction
We introduce a new model of interactive learning in which an expert examines the predictions of a learner and partially fixes them if they are wrong. Although this kind of feedback is not i.i.d., we show statistical generalization bounds on the quality of the learned model. This is joint work with Mike Luby.
November 3: David X. Li (Shanghai Advanced Institute of Finance (SAIF) and Shanghai Jiaotong University (SJTU))
Theoretical Problems in Credit Portfolio Modeling
This talk first provides an overview of the copula function approach to credit portfolio modeling. Some of the key theoretical deficiencies in its current framework are then highlighted. Finally, some initial results based on the equilibrium approach to credit portfolio modeling are presented. This new approach is based on the extension of the copula function for random variables to the copula function for stochastic processes. The basic definition and properties of copulas for stochastic processes are discussed. This new approach allows us to theoretically link our credit portfolio modeling with classical equity portfolio modeling under the CAPM setting.
November 10: Vladimir V. Ulyanov (Higher School of Economics, Moscow)
On confidence sets for projectors of a covariance matrix
We offer a bootstrap procedure to obtain sharp confidence sets for the projectors of a covariance matrix from a given sample. This procedure can be applied for small or moderate sample sizes and large dimension of observations. The main result states the validity of the proposed procedure for finite samples with an explicit error bound on the bootstrap approximation. This bound involves some new sharp results on Gaussian comparison and Gaussian anti-concentration in high dimensions. These are joint results with V. Spokoiny (WIAS, Berlin) and A. Naumov (Skoltech, Moscow).
November 10: Alexey Naumov (Skoltech, Moscow)
Big ball probability with applications in statistical inference
We derive bounds on the Kolmogorov distance between the probabilities that two Gaussian elements hit a ball in a Hilbert space. The key property of these bounds is that they are dimension-free and depend on the nuclear (Schatten-one) norm of the difference between the covariance operators of the elements. We are also interested in the anti-concentration bound for the squared norm of a non-centered Gaussian element in a Hilbert space. All bounds are sharp and cannot be improved in general. We provide a list of motivating examples and applications of the derived results in statistical inference as well. The talk is based on joint results with F. Götze (Bielefeld University), V. Spokoiny (WIAS) and V. Ulyanov (Moscow State University).
December 1: Ujan Gangopadhyay (Department of Mathematics, USC)
We consider an urn model where the replacement matrices are random. The replacement matrices need be neither independent nor identically distributed. However, we assume that, conditioned on the entire past, the replacement matrices are independent of the color drawn in the same round. We also assume the matrices to have only a finite first moment, in contrast to the usual second moment assumption in the literature. We further require the conditional expectation of the replacement matrix given the past to be close to a (not necessarily nonrandom) irreducible matrix in an appropriate sense. We do not require any of the matrices to be balanced.
When the replacement matrices have $p>1$ finite moments, we prove almost sure convergence of the proportion vector, while the convergence is in probability when the replacement matrices have only a finite first moment. We also consider the growth rates of composition vectors and count vectors. We use stochastic approximation to analyze the model. We develop a new version of stochastic approximation with random step size and driving function. The related differential equation is of Lotka-Volterra type and can be analyzed directly.
This is a joint work with Krishanu Maulik.
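A toy simulation of such an urn (ours; here the random replacement matrices are i.i.d. with a nonrandom irreducible mean, a special case of the dependence allowed above, and all parameters are illustrative). With the symmetric mean matrix below, the proportion vector should settle near (0.5, 0.5).

    import numpy as np

    rng = np.random.default_rng(10)
    counts = np.array([1.0, 1.0])               # initial composition, two colors
    mean_R = np.array([[2.0, 1.0], [1.0, 2.0]]) # nonrandom irreducible mean matrix

    for _ in range(20000):
        color = rng.choice(2, p=counts / counts.sum())  # draw a ball
        R = rng.poisson(lam=mean_R)                     # random replacement matrix
        counts += R[color]                              # add back row `color`

    print("proportion vector ~", counts / counts.sum())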
USC Probability and Statistics Seminars (Spring 2017)
Fridays 3:30pm in KAP 414; tea is usually provided at 3:00pm
Organizer: Stanislav Minsker
For seminars/colloquia on other topics, see the Department of Mathematics webpage.
Older seminars: Fall 2016, Spring 2016, Fall 2015, Spring 2015, Fall 2014, 2013-2014
-
Spring 2017 seminars
January 13: Søren Asmussen (Aarhus University)
Lévy processes, phase-type distributions, and martingales
Lévy processes are defined as processes with stationary independent increments. In finance, they generalize the Black-Scholes model by accommodating jumps and non-Gaussian marginals, and some popular examples there are variance Gamma, CGMY and NIG processes. We survey here how to explicitly compute a number of relevant quantities by restricting to the dense class of compound Poisson processes with phase-type jumps in both directions and an added Brownian component (phase-type distributions are certain generalizations of the exponential distribution). The solutions are in terms of roots to polynomials, and the basic equations are derived by purely probabilistic arguments using martingale optional stopping, and the method also applies to regime switching. The approach is illustrated via a worked-out numerical example dealing with equity default swaps.
January 20: Adel Javanmard (USC Marshall School of Business)
Online Rules for Control of False Discovery Rate
Multiple hypothesis testing is a core problem in statistical inference and arises in almost every scientific field. For a given set of null hypotheses, Benjamini and Hochberg introduced the notion of false discovery rate (FDR), which is the expected proportion of false positives among rejected null hypotheses, and further proposed a testing procedure that controls FDR below a pre-assigned significance level. Nowadays FDR is the criterion of choice for large-scale multiple hypothesis testing. In this talk, we consider the problem of controlling FDR in an "online manner". Concretely, we consider an ordered, possibly infinite, sequence of null hypotheses where at each step the statistician must decide whether to reject the current null hypothesis having access only to the previous decisions. We introduce a class of generalized alpha-investing procedures and prove that any rule in this class controls FDR in an online manner.
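A sketch of one simple member of the alpha-investing family in the Foster-Stine style (ours; the spending rule and constants are illustrative, and the generalized class of the talk is broader): spend alpha-wealth on each test and earn a payout on each rejection.

    import numpy as np

    def alpha_investing(pvals, alpha=0.05, w0=0.025):
        """Online testing: test hypothesis j at a level funded by the current
        alpha-wealth; rejections replenish the wealth (Foster-Stine style)."""
        wealth, decisions = w0, []
        for j, p in enumerate(pvals, start=1):
            level = wealth / (1 + j)               # illustrative spending rule
            reject = p <= level
            wealth += alpha if reject else -level / (1 - level)
            decisions.append(reject)
        return decisions

    rng = np.random.default_rng(11)
    # 100 true nulls (uniform p-values) mixed with 20 strong signals.
    pvals = np.concatenate([rng.random(100), rng.beta(0.1, 10, size=20)])
    rng.shuffle(pvals)
    print("number of rejections:", sum(alpha_investing(pvals)))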
January 27: Venkat Chandrasekaran (Caltech)
Learning Regularizers from Data
Regularization techniques are widely employed in the solution of inverse problems in data analysis and scientific computing due to their effectiveness in addressing difficulties due to ill-posedness. In their most common manifestation, these methods take the form of penalty functions added to the objective in optimization-based approaches for solving inverse problems. The purpose of the penalty function is to induce a desired structure in the solution, and these functions are specified based on prior domain-specific expertise. For example, regularization is useful for promoting smoothness, sparsity, low energy, and large entropy in solutions to inverse problems in image analysis, statistical model selection, and the geosciences.
We consider the problem of learning suitable regularization functions from data in settings in which precise domain knowledge is not directly available; the objective is to identify a regularizer to promote the type of structure contained in the data. The regularizers obtained using our framework are specified as convex functions that can be computed efficiently via semidefinite programming, and they can be employed in tractable convex optimization approaches for solving inverse problems. Our approach for learning such semidefinite regularizers is based on computing certain structured factorizations of data matrices. We propose a method for this task that combines recent techniques for rank minimization problems along with the Operator Sinkhorn iteration. We discuss some of the theoretical properties of our algorithm as well as its utility in practice.
(Joint work with Yong Sheng Soh)
February 3: Edmond Jonckheere (USC, Department of Electrical Engineering)
Jonckheere-Terpstra test for nonclassical error versus log-sensitivity relationship of quantum spin network controllers
Selective information transfer in spin ring networks by landscape shaping control has the property that the error, 1-prob, where prob is the transfer success probability, and the sensitivity of the probability to spin coupling errors are "positively correlated", meaning that both are statistically increasing across a family of controllers of increasing error. Here, we examine the rank correlation between the error and another measure of performance – the logarithmic sensitivity – used in robust control to formulate fundamental limitations. Contrary to error versus sensitivity, the error versus logarithmic sensitivity correlation is less obvious, because its weaker trend is made difficult to detect by the noisy behavior of the logarithmic sensitivity across controllers of increasing error numerically optimized in a challenging landscape. This causes the Kendall tau test for rank correlation between the error and the log sensitivity to be pessimistic, with poor confidence. Here it is shown that the Jonckheere-Terpstra test, because it tests the alternative hypothesis of an ordering of the medians of some groups of log sensitivity data, alleviates this problem and hence singles out cases of anti-classical behavior of "positive correlation" between the error and the logarithmic sensitivity.
February 10: Denis Chetverikov (UCLA, Department of Economics)
On cross-validated Lasso
We derive a rate of convergence of the Lasso estimator when the penalty parameter $\lambda$ for the estimator is chosen using $K$-fold cross-validation; in particular, we show that in the model with the Gaussian noise and under fairly general assumptions on the candidate set of values of $\lambda$, the prediction norm of the estimation error of the cross-validated Lasso estimator is with high probability bounded from above up to a constant by $(s\log p /n)^{1/2}\cdot(\log^{7/8}(p n))$, where $n$ is the sample size of available data, $p$ is the number of covariates, and $s$ is the number of non-zero coefficients in the model. Thus, the cross-validated Lasso estimator achieves the fastest possible rate of convergence up to a small logarithmic factor $\log^{7/8}(p n)$. In addition, we derive a sparsity bound for the cross-validated Lasso estimator; in particular, we show that under the same conditions as above, the number of non-zero coefficients of the estimator is with high probability bounded from above up to a constant by $s\log^5(p n)$. Finally, we show that our proof technique generates non-trivial bounds on the prediction norm of the estimation error of the cross-validated Lasso estimator even if the assumption of the Gaussian noise fails; in particular, the prediction norm of the estimation error is with high-probability bounded from above up to a constant by $(s\log^2(p n)/n)^{1/4}$ under mild regularity conditions.
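A sketch of the procedure analyzed above, i.e., choosing $\lambda$ by $K$-fold cross-validation over a candidate grid (ours; the simulated design, the grid, and all constants are illustrative, and scikit-learn is assumed to be available):

    import numpy as np
    from sklearn.linear_model import Lasso
    from sklearn.model_selection import KFold

    rng = np.random.default_rng(12)
    n, p, s = 200, 500, 5
    X = rng.standard_normal((n, p))
    beta = np.zeros(p); beta[:s] = 1.0
    y = X @ beta + rng.standard_normal(n)          # Gaussian noise model

    lambdas = np.logspace(-3, 0, 30)               # candidate set for lambda
    cv_err = np.zeros(len(lambdas))
    for train, test in KFold(n_splits=5, shuffle=True, random_state=0).split(X):
        for i, lam in enumerate(lambdas):
            fit = Lasso(alpha=lam, max_iter=5000).fit(X[train], y[train])
            cv_err[i] += np.mean((y[test] - fit.predict(X[test])) ** 2)

    lam_cv = lambdas[np.argmin(cv_err)]
    nnz = np.sum(Lasso(alpha=lam_cv, max_iter=5000).fit(X, y).coef_ != 0)
    print(f"cross-validated lambda: {lam_cv:.4f}, nonzero coefficients: {nnz}")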
February 24: Julian Gold (UCLA, Department of Mathematics)
Isoperimetric shapes in supercritical bond percolation
We study the isoperimetric subgraphs of the infinite cluster $\textbf{C}_\infty$ of supercritical bond percolation on $\mathbb{Z}^d$, $d \geq 3$. We prove a shape theorem for these random graphs, showing that upon rescaling they tend almost surely to a deterministic shape. This limit shape is itself an isoperimetric set for a norm we construct. In addition, we obtain sharp asymptotics for a modification of the Cheeger constant of $\textbf{C}_\infty \cap [-n,n]^d$, settling a conjecture of Benjamini for this modified Cheeger constant. Analogous results are shown for the giant component in dimension two, with the caveat that we use the original definition of the Cheeger constant here, and a more complicated continuum isoperimetric problem emerges as a result.
March 3: Arash Amini (UCLA, Department of Statistics)
Semidefinite relaxations of the block model
Community detection has emerged in recent years as one of the fundamental problems of network analysis. Informally, one seeks to partition the network into cohesive groups of nodes, or communities, that reveal its large-scale connective structure. There is now a well-established mathematical model, namely, the Stochastic Block Model (SBM), which allows for a precise notion of community. The model has a number of attractive features, among them, a deep connection to the general class of exchangeable random graphs and the ability to approximate complex nonparametric models while being analytically tractable to study theoretical limitations of community detection. The flexibility and expressive power of SBM, however, comes at the expense of computational intractability. The optimal way of fitting SBM, via maximum likelihood estimation (MLE), requires combinatorial search in the worst case. In this talk, we consider some of the proposed semidefinite programming (SDP) relaxations of the problem and clarify their relations by viewing them within a unified framework, namely, as relaxations of the MLE over various sub-classes of SBM. We also derive a tighter relaxation for the balanced case where the communities are of equal size. We show that this new relaxation is consistent over a broader class of SBMs which we call the weakly assortative class. Previous consistency results for SDP relaxations require a stronger notion of assortativity, and we show that this stronger notion is indeed necessary in those cases.
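A sketch of one standard SDP relaxation for two balanced communities (ours; n, p, q are illustrative, and cvxpy with its default SDP solver is assumed): relax the MLE variable $X = xx^T$, $x \in \{-1,1\}^n$ balanced, to a PSD matrix with unit diagonal and zero total sum, then round via the top eigenvector.

    import numpy as np
    import cvxpy as cp

    rng = np.random.default_rng(13)
    n, p, q = 40, 0.7, 0.2                     # two balanced communities
    labels = np.repeat([1, -1], n // 2)
    edge_prob = np.where(np.outer(labels, labels) > 0, p, q)
    A = np.triu((rng.random((n, n)) < edge_prob).astype(float), 1)
    A = A + A.T                                # symmetric adjacency, no self-loops

    X = cp.Variable((n, n), symmetric=True)
    constraints = [X >> 0, cp.diag(X) == 1, cp.sum(X) == 0]  # balance constraint
    cp.Problem(cp.Maximize(cp.trace(A @ X)), constraints).solve()

    v = np.linalg.eigh(X.value)[1][:, -1]      # round via the top eigenvector
    acc = max(np.mean(np.sign(v) == labels), np.mean(np.sign(v) == -labels))
    print(f"fraction of nodes recovered: {acc:.2f}")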
March 10: Tomoyuki Ichiba (UC Santa Barbara, Department of Statistics and Applied Probability)
Walsh semimartingales and diffusion on metric graphs
In this talk we shall discuss diffusions on metric graphs. We start with a change-of-variable formula of Freidlin-Sheu type for Walsh semimartingales on a star graph. In the diffusion case we characterize such processes via a martingale problem. As a consequence of folding/unfolding semimartingales, we obtain a system of degenerate stochastic differential equations, and we examine its solution and convergence properties. The stationary distribution, strong Markov property and related statistical problems are also discussed. We then extend our considerations to diffusions on metric graphs. Part of this talk is based on joint work with I. Karatzas, V. Prokaj, A. Sarantsev and M. Yan.
April 7: Xunyu Zhou (Columbia University, Department of Industrial Engineering and Operations Research)
Discounting, Diversity, and Investment
This paper presents the class of weighted discount functions, which contains the discount functions commonly used in economics and finance. Weighted discount functions also describe the discounting behavior of groups, and they admit a natural notion of group diversity in time preference. We show that more diverse groups discount less heavily, and make more patient decisions. Greater group diversity leads to more risk-taking and delayed investment in a real options setting. We further provide a general result on the timing behavior of groups, and link it to that of individuals who are time-inconsistent. This is a joint work with Sebastian Ebert and Wei Wei.
April 14: Badal Joshi (California State University San Marcos)
Graphical equilibria and balanced stationary distributions in reaction networks
Reaction networks are commonly used to model a wide variety of biochemical systems. Deterministic models have traditionally been used in chemistry, where large numbers of molecules are involved. In biochemistry, the numbers of molecules are small, and thus stochastic modeling becomes essential. We explain the connection between the mass-action stochastic and deterministic models. We then focus on the underlying graphical structure of the reaction network and the graphical structure of the allowed transitions in the state space of the discrete/stochastic model. We explore the relationships between the symmetry properties of the reaction network, the symmetry properties of the underlying network of allowed transitions in the stochastic model, the deterministic equilibria, and the stationary distributions in the stochastic model.
April 21: Peter Baxendale (USC, Department of Mathematics)
Some very large deviations
Motivated by a problem in financial mathematics, we study the small time behavior of an additive functional of a fast diffusion process. The resulting behavior depends on the relative sizes of the small time and the speed of the diffusion. A simple time change converts the problem into one of moderate, or large, or very large deviations.
After reviewing results for the sums of i.i.d. random variables (Cramer’s theorem and beyond) we will identify the rate function for our very large deviation problem in terms of viscosity solutions of an associated deterministic control problem.
April 28: James-Michael Leahy (USC, Department of Mathematics)
On solutions of degenerate linear stochastic integro-differential equations in Sobolev spaces
We consider the question of existence, uniqueness and regularity of solutions of degenerate systems of linear stochastic integro-differential equations with variable coefficients in the scale of integer Sobolev spaces $W^{m,p}$, $m, p \geq 1$. The main motivation for considering this class of equations is the non-linear filtering problem for jump-diffusion processes with correlated noise. The regularity of solutions of these equations is intimately related to establishing convergent rates of numerical approximations and to deriving the existence of a conditional density. The main technique we employ is the method of vanishing viscosity, which relies on a priori estimates of solutions in the $W^{m,p}$-norms.
USC Probability and Statistics Seminars (Spring 2017)
Fridays 3:30pm in KAP 414; tea is usually provided at 3:00pm
Organizer: Stanislav Minsker
For seminars/colloquia on other topics, see the Department of Mathematics webpage.
Older seminars: Fall 2016, Spring 2016, Fall 2015, Spring 2015, Fall 2014, 2013-2014
-
Spring 2017 seminars
January 13: Søren Asmussen (Aarhus University)
Lévy processes, phase-type distributions, and martingales
Lévy processes are defined as processes with stationary independent increments. In finance, they generalize the Black-Scholes model by accommodating jumps and non-Gaussian marginals, and some popular examples there are variance Gamma, CGMY and NIG processes. We survey here how to explicitly compute a number of relevant quantities by restricting to the dense class of compound Poisson processes with phase-type jumps in both directions and an added Brownian component (phase-type distributions are certain generalizations of the exponential distribution). The solutions are in terms of roots to polynomials, and the basic equations are derived by purely probabilistic arguments using martingale optional stopping, and the method also applies to regime switching. The approach is illustrated via a worked-out numerical example dealing with equity default swaps.
January 20: Adel Javanmard (USC Marshall School of Business)
Online Rules for Control of False Discovery Rate
Multiple hypothesis testing is a core problem in statistical inference and arises in almost every scientific field. For a given set of null hypotheses, Benjamini and Hochberg introduced the notion of false discovery rate (FDR), which is the expected proportion of false positives among rejected null hypotheses, and further proposed a testing procedure that controls FDR below a pre-assigned significance level. Nowadays FDR is the criterion of choice for large- scale multiple hypothesis testing. In this talk, we consider the problem of controlling FDR in an “online manner”. Concretely, we consider an ordered, possibly infinite, sequence of null hypotheses where at each step the statistician must decide whether to reject current null hypothesis having access only to the previous decisions. We introduce a class of generalized alpha-investing procedures and prove that any rule in this class controls FDR in online manner.
January 27: Venkat Chandrasekaran (Caltech)
Learning Regularizers from Data
Regularization techniques are widely employed in the solution of inverse problems in data analysis and scientific computing due to their effectiveness in addressing difficulties due to ill-posedness. In their most common manifestation, these methods take the form of penalty functions added to the objective in optimization-based approaches for solving inverse problems. The purpose of the penalty function is to induce a desired structure in the solution, and these functions are specified based on prior domain-specific expertise. For example, regularization is useful for promoting smoothness, sparsity, low energy, and large entropy in solutions to inverse problems in image analysis, statistical model selection, and the geosciences.
We consider the problem of learning suitable regularization functions from data in settings in which precise domain knowledge is not directly available; the objective is to identify a regularizer to promote the type of structure contained in the data. The regularizers obtained using our framework are specified as convex functions that can be computed efficiently via semidefinite programming, and they can be employed in tractable convex optimization approaches for solving inverse problems. Our approach for learning such semidefinite regularizers is based on computing certain structured factorizations of data matrices. We propose a method for this task that combines recent techniques for rank minimization problems along with the Operator Sinkhorn iteration. We discuss some of the theoretical properties of our algorithm as well as its utility in practice.
(Joint work with Yong Sheng Soh)
February 3: Edmond Jonckheere (USC, Department of Electrical Engineering)
Jonckheere-Terpstra test for nonclassical error versus log-sensitivity relationship of quantum spin network controllers
Selective information transfer in spin ring networks by landscape shaping control has the property that the error, 1-prob, where prob is the transfer success probability, and the sensitivity of the probability to spin coupling errors are “positively correlated”, meaning that both are statistically increasing across a family of controllers of increasing error. Here, we examine the rank correlation between the error and another measure of performance – the logarithmic sensitivity – used in robust control to formulate fundamental limitations. Contrary to error versus sensitivity, the error versus logarithmic sensitivity correlation is less obvious, because its weaker trend is made difficult to detect by the noisy behavior of the logarithmic sensitivity across controllers of increasing error numerically optimized in a challenging landscape. As a result, the Kendall tau test for rank correlation between the error and the log sensitivity is pessimistic, with poor confidence. Here it is shown that the Jonckheere-Terpstra test, because it tests the alternative hypothesis of an ordering of the medians of some groups of log sensitivity data, alleviates this problem and hence singles out cases of anti-classical behavior of “positive correlation” between the error and the logarithmic sensitivity.
February 10: Denis Chetverikov (UCLA, Department of Economics)
On cross-validated Lasso
We derive a rate of convergence of the Lasso estimator when the penalty parameter $\lambda$ for the estimator is chosen using $K$-fold cross-validation; in particular, we show that in the model with Gaussian noise and under fairly general assumptions on the candidate set of values of $\lambda$, the prediction norm of the estimation error of the cross-validated Lasso estimator is with high probability bounded from above up to a constant by $(s\log p /n)^{1/2}\cdot(\log^{7/8}(p n))$, where $n$ is the sample size of available data, $p$ is the number of covariates, and $s$ is the number of non-zero coefficients in the model. Thus, the cross-validated Lasso estimator achieves the fastest possible rate of convergence up to a small logarithmic factor $\log^{7/8}(p n)$. In addition, we derive a sparsity bound for the cross-validated Lasso estimator; in particular, we show that under the same conditions as above, the number of non-zero coefficients of the estimator is with high probability bounded from above up to a constant by $s\log^5(p n)$. Finally, we show that our proof technique generates non-trivial bounds on the prediction norm of the estimation error of the cross-validated Lasso estimator even if the assumption of Gaussian noise fails; in particular, the prediction norm of the estimation error is with high probability bounded from above up to a constant by $(s\log^2(p n)/n)^{1/4}$ under mild regularity conditions.
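As a practical illustration of the estimator being analyzed (not of the analysis itself), the following sketch fits the Lasso with a K-fold cross-validated $\lambda$ using scikit-learn, assuming it is installed; the dimensions n, p, s mirror the notation above.

import numpy as np
from sklearn.linear_model import LassoCV

rng = np.random.default_rng(1)
n, p, s = 200, 500, 5                      # sample size, covariates, sparsity
X = rng.standard_normal((n, p))
beta = np.zeros(p); beta[:s] = 1.0
y = X @ beta + rng.standard_normal(n)      # Gaussian noise model

fit = LassoCV(cv=5, n_alphas=100).fit(X, y)          # K = 5 folds over a lambda grid
pred_err = np.linalg.norm(X @ (fit.coef_ - beta)) / np.sqrt(n)
print(fit.alpha_, pred_err, np.count_nonzero(fit.coef_))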
February 24: Julian Gold (UCLA, Department of Mathematics)
Isoperimetric shapes in supercritical bond percolation
We study the isoperimetric subgraphs of the infinite cluster $\textbf{C}_\infty$ of supercritical bond percolation on $\mathbb{Z}^d$, $d \geq 3$. We prove a shape theorem for these random graphs, showing that upon rescaling they tend almost surely to a deterministic shape. This limit shape is itself an isoperimetric set for a norm we construct. In addition, we obtain sharp asymptotics for a modification of the Cheeger constant of $\textbf{C}_\infty \cap [-n,n]^d$, settling a conjecture of Benjamini for this modified Cheeger constant. Analogous results are shown for the giant component in dimension two, with the caveat that we use the original definition of the Cheeger constant here, and a more complicated continuum isoperimetric problem emerges as a result.
March 3: Arash Amini (UCLA, Department of Statistics)
Semidefinite relaxations of the block model
Community detection has emerged in recent years as one of the fundamental problems of network analysis. Informally, one seeks to partition the network into cohesive groups of nodes, or communities, that reveal its large-scale connective structure. There is now a well-established mathematical model, namely, the Stochastic Block Model (SBM), which allows for a precise notion of community. The model has a number of attractive features, among them, a deep connection to the general class of exchangeable random graphs and the ability to approximate complex nonparametric models while being analytically tractable to study theoretical limitations of community detection. The flexibility and expressive power of SBM, however, comes at the expense of computational intractability. The optimal way of fitting SBM, via maximum likelihood estimation (MLE), requires combinatorial search in the worst case. In this talk, we consider some of the proposed semidefinite programming (SDP) relaxations of the problem and clarify their relations by viewing them within a unified framework, namely, as relaxations of the MLE over various sub-classes of SBM. We also derive a tighter relaxation for the balanced case where the communities are of equal size. We show that this new relaxation is consistent over a broader class of SBMs which we call the weakly assortative class. Previous consistency results for SDP relaxations require a stronger notion of assortativity, and we show that this stronger notion is indeed necessary in those cases.
March 10: Tomoyuki Ichiba (UC Santa Barbara, Department of Statistics and Applied Probability)
Walsh semimartingales and diffusion on metric graphs
In this talk we shall discuss diffusions on metric graphs. We start with a change-of-variable formula of Freidlin-Sheu type for Walsh semimartingales on a star graph. In the diffusion case we characterize such processes via a martingale problem. As a consequence of folding/unfolding of semimartingales, we obtain a system of degenerate stochastic differential equations, and we examine its solution and convergence properties. The stationary distribution, strong Markov property and related statistical problems are also discussed. Then we extend our considerations to diffusions on metric graphs. Part of this talk is based on joint work with I. Karatzas, V. Prokaj, A. Sarantsev and M. Yan.
April 7: Xunyu Zhou (Columbia University, Department of Industrial Engineering and Operations Research)
Discounting, Diversity, and Investment
This paper presents the class of weighted discount functions, which contains the discount functions commonly used in economics and finance. Weighted discount functions also describe the discounting behavior of groups, and they admit a natural notion of group diversity in time preference. We show that more diverse groups discount less heavily, and make more patient decisions. Greater group diversity leads to more risk-taking and delayed investment in a real options setting. We further provide a general result on the timing behavior of groups, and link it to that of individuals who are time-inconsistent. This is a joint work with Sebastian Ebert and Wei Wei.
April 14: Badal Joshi (California State University San Marcos)
Graphical equilibria and balanced stationary distributions in reaction networks
Reaction networks are commonly used to model a wide variety of biochemical systems. Deterministic models have traditionally been used in chemistry, where large numbers of molecules are involved. In biochemistry, the numbers of molecules are small, and stochastic modeling becomes essential. We explain the connection between the mass-action stochastic and deterministic models. We then focus on the underlying graphical structure of the reaction network and the graphical structure of the allowed transitions in the state space of the discrete/stochastic model. We explore the relationships between the symmetry properties of the reaction network, the symmetry properties of the underlying network of allowed transitions in the stochastic model, the deterministic equilibria, and the stationary distributions in the stochastic model.
April 21: Peter Baxendale (USC, Department of Mathematics)
Some very large deviations
Motivated by a problem in financial mathematics, we study the small time behavior of an additive functional of a fast diffusion process. The resulting behavior depends on the relative sizes of the small time and the speed of the diffusion. A simple time change converts the problem into one of moderate, or large, or very large deviations.
After reviewing results for the sums of i.i.d. random variables (Cramer’s theorem and beyond) we will identify the rate function for our very large deviation problem in terms of viscosity solutions of an associated deterministic control problem.
April 28: James-Michael Leahy (USC, Department of Mathematics)
On solutions of degenerate linear stochastic integro-differential equations in Sobolev spaces
We consider the question of existence, uniqueness and regularity of solutions of degenerate systems of linear stochastic integro-differential equations with variable coefficients in the scale of integer Sobolev spaces $W^{m,p}$, $m, p \geq 1$. The main motivation for considering this class of equations is the non-linear filtering problem for jump-diffusion processes with correlated noise. The regularity of solutions of these equations is intimately related to establishing convergence rates of numerical approximations and to deriving the existence of a conditional density. The main technique we employ is the method of vanishing viscosity, which relies on a priori estimates of solutions in the $W^{m,p}$-norms.
USC Probability and Statistics Seminars (Fall 2016)
Fridays 3:30pm in KAP 414; tea is usually provided at 3:00pm
Organizer: Stanislav Minsker
For seminars/colloquia on other topics, see the Department of Mathematics webpage.
Older seminars: Spring 2016, Fall 2015, Spring 2015, Fall 2014, 2013-2014
-
Fall 2016 seminars
August 26: Christos Thrampoulidis (California Institute of Technology)
Precise performance analysis of non-smooth convex optimization via Gaussian comparison inequalities.
The typical scenario that arises in modern large-scale inference problems is one where the ambient dimension of the unknown signal is very large (e.g. high-resolution images, Massive-MIMO, etc.), yet its desired properties lie in some low-dimensional structure (sparsity, binary structure, low-rankness, etc.). In the past couple of decades, non-smooth convex optimization has emerged as a powerful tool to reveal those structures. While the algorithms (LASSO, LAD, etc.) are fairly well-established, rigorous and unifying frameworks for their exact analysis are only recently emerging.
In this talk, we will present a novel and general framework to evaluate the asymptotic performance of such recovery methods under Gaussian measurement ensembles and under different measurement models (e.g. noisy linear, single-index model). For illustration, we will show simple and closed-form expressions for the mean-squared error of the generalized LASSO and we will evaluate the bit-error rate of the box relaxation for recovering BPSK signals in communications.
The analysis is based on Gaussian process methods; in particular, on a classical comparison inequality proved by Gordon in 1988. We will prove a stronger and tight version of Gordon’s original result in the presence of additional convexity assumptions. We call this the Convex Gaussian Min-max Theorem (CGMT).
September 2: Georg Menz (UCLA)
A two scale proof of the Eyring-Kramers formula (joint work with Andre Schlichting)
We consider a drift-diffusion process on a smooth potential landscape with small noise. We give a new proof of the Eyring-Kramers formula which asymptotically characterizes the spectral gap of the generator of the diffusion. The proof is based on a refinement of the two-scale approach introduced by Grunewald, Otto, Villani, and Westdickenberg and of the mean-difference estimate introduced by Chafai and Malrieu. The new proof exploits the idea that the process has two natural time-scales: a fast time-scale resulting from the fast convergence to a metastable state, and a slow time-scale resulting from exponentially long waiting times of jumps between metastable states. A nice feature of the argument is that it can be used to deduce an asymptotic formula for the log-Sobolev constant, which was previously unknown.
September 9: Konstantin Zuev (California Institute of Technology)
Hyperbolic Geometry of Complex Network Data
Recent years have witnessed an explosion of a new kind of data: network data. Large datasets of social, biological, technological, and information networks are analyzed by thousands of scientists around the world, making probabilistic modeling and statistical analysis of network data a mainstream research area in statistics, computer science, the social sciences, systems biology, and physics. One of the fundamental questions in the study of network data is to uncover the hidden evolution mechanisms that shape the structure and dynamics of large real networks. It has been empirically observed that many real networks, in spite of being very different in other respects, have heavy-tailed degree distributions, high clustering, and significant community structure. Since that discovery, several mechanisms have been proposed to explain some of these universal properties, but none of them captured all three properties at once. In this talk, I will fill the gap and show how the universal properties of complex networks naturally emerge from a new mechanism, called geometric preferential attachment (GPA). I will explain how latent network geometry coupled with preferential attachment of nodes to this geometry induces power-law degree distributions, strong clustering, and community structure. Using Internet data as an example, I will demonstrate that GPA generates networks that are similar to real networks.
September 16: Jason Lee (USC Marshall School of Business)
Matrix Completion, Saddlepoints, and Gradient Descent
Matrix completion is a fundamental machine learning problem with wide applications in collaborative filtering and recommender systems. Typically, matrix completion problems are solved by non-convex optimization procedures, which are empirically extremely successful. We prove that the symmetric matrix completion problem has no spurious local minima, meaning all local minima are also global. Thus the matrix completion objective has only saddlepoints and global minima.
Next, we show that saddlepoints are easy to avoid even for Gradient Descent – arguably the simplest optimization procedure. We prove that with probability 1, randomly initialized Gradient Descent converges to a local minimizer. The same result holds for a large class of optimization algorithms including proximal point, mirror descent, and coordinate descent.
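A minimal numpy sketch of the phenomenon described above: randomly initialized gradient descent on the symmetric factorized objective 0.5*||P_Omega(X X^T - M)||_F^2. The step size and iteration count are ad hoc illustrative choices.

import numpy as np

rng = np.random.default_rng(2)
n, r = 60, 3
U = rng.standard_normal((n, r))
M = U @ U.T                                 # ground-truth rank-r symmetric matrix
obs = np.triu(rng.random((n, n)) < 0.5)
mask = (obs | obs.T).astype(float)          # symmetric pattern of observed entries

X = 0.1 * rng.standard_normal((n, r))       # random initialization
eta = 0.002                                 # ad hoc step size
for _ in range(5000):
    R = mask * (X @ X.T - M)                # residual on observed entries (symmetric)
    X -= eta * 2 * R @ X                    # gradient of 0.5*||P_Omega(X X^T - M)||_F^2
print(np.linalg.norm(X @ X.T - M) / np.linalg.norm(M))   # relative recovery error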
September 23: Leonard Wong (USC, Department of Mathematics)
Exponentially concave functions: optimal transport, information geometry and finance
A function is said to be exponentially concave if its exponential is concave. We show that exponentially concave functions on the unit simplex have remarkable connections with sequential investment and a Monge-Kantorovich optimal transport problem. Motivated by the question of optimal trading frequency, we show that an exponentially concave function induces a new geometric structure on the simplex. This structure consists of a Riemannian metric and a pair of dually coupled affine connections defining two kinds of geodesics. Using this geometry we prove a generalized Pythagorean theorem and construct a displacement interpolation for the associated transport problem. Our results extend the classical dually flat geometry of Bregman divergence originated from statistical inference. This is joint work with Soumik Pal.
September 30: Martin Tassy (UCLA)
Variational principles for discrete systems
Previous works have shown that arctic circle phenomena and the limiting behavior of some integrable discrete systems can be explained by a variational principle. In this talk we will present the first results of the same type for a non-integrable discrete system: graph homomorphisms from Z^d to a regular tree. We will also explain how the technique used could be applied to other non-integrable models.
October 7: Deanna Needell (Claremont McKenna College)
Batched Stochastic Gradient Descent with Weighted Sampling and Linear Feasibility
We analyze a batched variant of Stochastic Gradient Descent (SGD) with a weighted sampling distribution for smooth and non-smooth objective functions. We show that when the batches can be computed in a distributed fashion, a significant speedup in the convergence rate is possible. We propose several computationally efficient schemes to approximate the optimal weights, and compute the proposed sampling distribution for the least-squares and hinge loss problems. We show both analytically and experimentally that substantial gains can be obtained using this hybrid approach. In addition, we present Motzkin's method for the linear feasibility problem and a variation that is computationally efficient. Together, these bridge recent ideas in stochastic optimization.
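The sketch below illustrates one standard instance of this idea for least squares: rows are sampled with probability proportional to their squared norms and the resulting gradients are averaged over a batch. The weighting and the decaying step size are common illustrative choices, not necessarily the optimal weights proposed in the talk.

import numpy as np

def weighted_batch_sgd(A, b, batch=10, iters=2000, seed=0):
    # Importance sampling: row i is drawn with probability ||a_i||^2 / ||A||_F^2,
    # and each sampled gradient is reweighted by 1/(n p_i) to remain unbiased.
    rng = np.random.default_rng(seed)
    n, d = A.shape
    probs = (A ** 2).sum(axis=1)
    probs /= probs.sum()
    x = np.zeros(d)
    for t in range(1, iters + 1):
        idx = rng.choice(n, size=batch, p=probs)
        g = np.mean([(A[i] @ x - b[i]) * A[i] / (n * probs[i]) for i in idx], axis=0)
        x -= g / t                          # illustrative decaying step size
    return x

rng = np.random.default_rng(3)
A = rng.standard_normal((1000, 20)); x_true = rng.standard_normal(20)
print(np.linalg.norm(weighted_batch_sgd(A, A @ x_true) - x_true))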
October 14: Inga Maslova (USC Marshall School of Business)
Hybrid machine learning and wavelet models in remote sensing
Machine learning and signal processing techniques are gaining popularity in application areas related to remote sensing. This talk will introduce the results of using multivariate relevance vector machine models combined with wavelet analysis techniques to estimate, model, and forecast evapotranspiration (ET) from local weather station variables and Landsat satellite images.
With the development of surface energy balance analyses, remote sensing has become a spatially explicit and quantitative methodology for understanding evapotranspiration (ET), a critical requirement for water resources planning and management. Limited temporal resolution of satellite images and cloudy skies present major limitations that impede continuous estimates of ET.
First, a methodology combining wavelet multiresolution analysis with a machine learning algorithm, the multivariate relevance vector machine, to predict 16 days of future daily reference evapotranspiration will be introduced. This methodology laid the ground for forecasting the spatial distribution of ET using Landsat satellite imagery. Next, the results of using satellite images and the relevance vector machine (RVM) to model ET will be presented.
October 21: Wei Wu (New York University)
Minimal spanning tree on quasi-planar graphs
The minimal spanning tree model has been widely applied to combinatorial optimization and to the study of disordered physical systems. For the infinite lattice $\mathbb{Z}^d$, rigorous results for the geometry of the minimal spanning forest were recently proved for $d = 2$ and still remain open for $d \geq 3$. We made partial progress by proving that the minimal spanning forest measure is supported on a single tree for quasi-planar graphs, such as two-dimensional slabs. Our proof uses the connections between minimal spanning forests and critical bond percolation, and certain generalizations of gluing lemmas for bond percolation.
Based on joint work with Charles Newman and Vincent Tassion.
October 28: Joel Tropp (California Institute of Technology)
Universality laws for randomized dimension reduction
Dimension reduction is the process of embedding high-dimensional data into a lower dimensional space to facilitate its analysis. In the Euclidean setting, one fundamental technique for dimension reduction is to apply a random linear map to the data. The question is how large the embedding dimension must be to ensure that randomized dimension reduction succeeds with high probability.
This talk describes a phase transition in the behavior of the dimension reduction map as the embedding dimension increases. The location of this phase transition is universal for a large class of datasets and random dimension reduction maps. Furthermore, the stability properties of randomized dimension reduction are also universal. These results have many applications in numerical analysis, signal processing, and statistics.
Joint work with Samet Oymak.
November 4: Ery Arias-Castro (UC San Diego)
Distribution-Free Detection of Structured Anomalies: Permutation and Rank-Based Scans
The scan statistic is by far the most popular method for anomaly detection, widely used in syndromic surveillance, signal and image processing, and target detection based on sensor networks, among other applications. The use of the scan statistic in such settings yields a hypothesis testing procedure, where the null hypothesis corresponds to the absence of anomalous behavior. If the null distribution is known, then calibration of a scan-based test is relatively easy, as it can be done by Monte Carlo simulation. When the null distribution is unknown, calibration is less straightforward.
We investigate two procedures. The first one is a calibration by permutation and the other is a rank-based scan test, which is distribution-free and less sensitive to outliers. Furthermore, the rank scan test requires only a one-time calibration for a given data size, making it computationally much more appealing. In both cases, we quantify the performance loss with respect to an oracle scan test that knows the null distribution. We show that using one of these calibration procedures results in only a very small loss of power in the context of a natural exponential family. This includes the classical normal location model, popular in signal processing, and the Poisson model, popular in syndromic surveillance. We perform numerical experiments on simulated data further supporting our theory and also on a real dataset from genomics.
Joint work with Rui M. Castro, Ervin Tánczos, and Meng Wang
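A minimal sketch of calibration by permutation for a one-dimensional scan; the window length and the number of permutations are arbitrary illustrative choices.

import numpy as np

def scan_stat(x, w):
    # Largest windowed sum of length w: a simple 1-D scan statistic.
    cs = np.concatenate([[0.0], np.cumsum(x)])
    return (cs[w:] - cs[:-w]).max()

def permutation_scan_test(x, w=20, n_perm=999, seed=0):
    # Under a null of exchangeable observations, permuting the data preserves
    # the null distribution of the scan, so it need not be known explicitly.
    rng = np.random.default_rng(seed)
    obs = scan_stat(x, w)
    null = np.array([scan_stat(rng.permutation(x), w) for _ in range(n_perm)])
    return (1 + np.sum(null >= obs)) / (n_perm + 1)   # permutation p-value

rng = np.random.default_rng(1)
x = rng.standard_normal(500)
x[100:120] += 1.0                                     # an anomalous stretch
print(permutation_scan_test(x))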
November 11: Chenchen Mou (UCLA)
Uniqueness and existence of viscosity solutions for a class of integro-differential equations.
We present comparison theorems and existence of viscosity solutions for a class of nonlocal equations. This class of equations includes Bellman-Isaacs equations containing operators of Levy type with measures depending on the state variable and control parameters.
November 18: Zhongyang Li (University of Connecticut)
Phase transitions in the 1-2 model
A configuration in the 1-2 model is a subgraph of the hexagonal lattice, in which each vertex is incident to 1 or 2 edges. By assigning weights to configurations at each vertex, we can define a family of probability measures on the space of these configurations, such that the probability of a configuration is proportional to the product of weights of configurations at vertices.
We study the phase transition of the model by investigating the probability measures with varying weights. We explicitly identify the critical weights, in the sense that the edge-edge correlation decays to 0 exponentially in the subcritical case, and converges to a non-zero constant in the supercritical case, under the limit measure obtained from torus approximation. These results are obtained by a novel measure-preserving correspondence between configurations in the 1-2 model and perfect matchings on a decorated graph, which appears to be a more efficient way to solve the model, compared to the holographic algorithm used by computer scientists to study the model.
When the weights are uniform, we prove a weak mixing property for the finite-volume measures – this implies the uniqueness of the infinite-volume measure and the fast mixing of a Markov chain Monte Carlo sampling. The major difficulty here is the absence of stochastic monotonicity.
December 9: Soumik Pal (University of Washington)
Markov chains on partitions and their diffusion analogs
A popular family of models of random partitions is called the Chinese Restaurant Process. We imagine n customers being seated randomly and sequentially at tables of a restaurant according to a fixed stochastic rule. Grouping customers by the tables gives us a partition of n. Consider a Markov chain on such partitions where we remove a randomly chosen customer and reseat her. How can one describe the limit of such a Markov chain as n tends to infinity? We will construct such limits as diffusions on partitions of the unit interval. Examples of such random partitions of the unit interval are given by the complement of the zeros of the Brownian motion or the Brownian bridge. The processes of ranked interval lengths of our partitions are members of a family of diffusions introduced by Ethier and Kurtz (1981) and Petrov (2009) that are stationary with respect to the Poisson-Dirichlet distributions. Our construction is a piece of a more complex diffusion on the space of real trees, stationary with respect to the law of the Brownian Continuum Random Tree, whose existence has been conjectured by David Aldous. Joint work with Noah Forman, Doug Rizzolo, and Matthias Winkel.
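For readers unfamiliar with the model, here is a short sketch of the Chinese Restaurant Process and of one re-seating step of the Markov chain described above; the concentration parameter theta is an illustrative choice.

import numpy as np

def crp_seating(n, theta=1.0, seed=0):
    # Customer k+1 joins an existing table with probability proportional to
    # its occupancy, or opens a new table with probability proportional to theta.
    rng = np.random.default_rng(seed)
    tables = [1]
    for _ in range(1, n):
        probs = np.array(tables + [theta], dtype=float)
        j = rng.choice(len(probs), p=probs / probs.sum())
        if j == len(tables):
            tables.append(1)
        else:
            tables[j] += 1
    return tables

def reseat_once(tables, theta=1.0, rng=None):
    # One step of the chain: remove a uniformly chosen customer, reseat by CRP.
    rng = rng or np.random.default_rng()
    i = rng.choice(len(tables), p=np.array(tables) / sum(tables))
    tables[i] -= 1
    tables = [t for t in tables if t > 0]
    probs = np.array(tables + [theta], dtype=float)
    j = rng.choice(len(probs), p=probs / probs.sum())
    if j == len(tables):
        tables.append(1)
    else:
        tables[j] += 1
    return tables

print(sorted(crp_seating(1000), reverse=True)[:5])    # largest table occupancies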
USC Probability and Statistics Seminars (Spring 2016)
Fridays 3:30pm in KAP 414; tea is usually provided at 3:00pm
Organizer: Tobias Johnson
For seminars/colloquia on other topics, see the Department of Mathematics webpage.
Older seminars: Fall 2015, Spring 2015, Fall 2014, 2013-2014
-
Spring 2016 seminars
January 22: Panki Kim (Seoul National University)
Estimates of Dirichlet heat kernel for symmetric Markov processes
In this talk, we consider a large class of symmetric pure jump Markov processes dominated by isotropic unimodal Lévy processes with weak scaling conditions. We first establish sharp two-sided heat kernel estimates for these processes in $C^{1,\rho}$ open sets. As a corollary of our main result, we obtain sharp two-sided Green function estimates and a scale invariant boundary Harnack inequality with explicit decay rates in $C^{1,\rho}$ open sets. This is a joint work with Tomasz Grzywny and Kyung-Youn Kim.
January 29: Torstein Nilssen (University of Oslo, USC)
Rough path transport equation with discontinuous drift
In this talk I will present a linear transport equation perturbed by a fractional Brownian motion. One has to use the theory of rough paths to give meaning to the stochastic integral term. I will present a local time technique which shows the regularizing effect of the noise in the equation.
February 5: Jie Zhong (Mathematical Sciences Research Institute)
Parametrix method for skew diffusions
In this talk, I will introduce the parametrix method in order to obtain the existence and the regularity properties of the density of a skew diffusion and provide a Gaussian upper bound. This expansion leads to a probabilistic representation and an exact simulation method.
February 19: Zhenjie Ren (Ecole Polytechnique)
Comparison result for viscosity solutions to the fully nonlinear path-dependent PDE’s
Motivated by non-Markovian stochastic control problems, we work with the recently introduced notion of viscosity solutions to path-dependent PDEs. Here, we prove a comparison result for viscosity solutions of (possibly degenerate) fully nonlinear parabolic path-dependent PDEs. Our argument follows the regularization method as introduced by Jensen, Lions & Souganidis in the corresponding finite-dimensional PDE setting. The present argument significantly simplifies the comparison proof in the work by Ekren, Touzi & Zhang, but requires an $L_p$-type continuity (with respect to the path) for the viscosity semi-solutions and for the nonlinearity defining the equation.
February 26: Guang Cheng (Purdue University)
Bayesian Aggregation for Extraordinarily Large Dataset
In this talk, a set of scalable Bayesian inference procedures is developed for a general class of nonparametric regression models. Specifically, we first perform independent nonparametric Bayesian inference on each subset split from a massive dataset, and then aggregate those local results into global counterparts. This aggregation step is explicit and involves no additional computational cost. By a careful partition, we show that our aggregated inference results achieve the oracle rule in the sense that they are equivalent to those obtained directly from the entire data (which are computationally prohibitive). For example, an aggregated credible ball achieves the desired credibility level and frequentist coverage while possessing the same radius as the oracle ball. This oracle matching phenomenon occurs due to a delicate geometric structure of the infinite-dimensional parameter space in consideration.
March 4: Mahdi Soltanolkotabi (University of Southern California)
From Generic chaining to structured signal recovery via (non)convex optimization
Many problems in science and engineering ask for solutions to underdetermined systems of linear equations. The last decade has witnessed a flurry of activity in understanding when and how it is possible to solve such problems using convex optimization. Structured signal recovery via convex methods has arguably revolutionized signal acquisition, enabling signals to be measured with remarkable fidelity using a small number of measurements. Despite many success stories, in this talk I will argue that over-insistence on convex methods has stymied progress in some application domains. I will discuss my ongoing research efforts to “unshackle” structured signal recovery from the confines of convexity, opening the door for new applications.
In particular, I will present a unified theoretical framework for sharply characterizing the convergence rates of various (non-)convex iterative schemes for solving such problems. At the heart of this new analysis is a powerful result due to Gordon that allows for the precise characterization of “restricted” singular values of random Gaussian matrices. It has been conjectured and demonstrated empirically that these restricted singular values are universal. That is, surprisingly the same exact formula holds for a wide variety of non-Gaussian and non-i.i.d. random matrices. I will end by discussing exciting recent results towards settling this conjecture via generic chaining methods and the beautiful math behind proving them.
March 11: Nate Strawn (Georgetown University)
Posterior Concentration in High-Dimensional Regression
Many interesting data sets admit compressible representations in terms of a fixed linear dictionary. This a priori structure facilitates many applications and algorithmic techniques, and this structure can often be refined to provide a structured Bayesian prior that reflects the geometry and uncertainty in a dataset. Once such a prior has been constructed and a noise model is established, a Bayesian framework may be used to perform statistical estimation and uncertainty quantification. In particular, it is important to know when the posterior concentrates around true parameters with high probability.
In this talk, we examine the phenomenon of posterior concentration in high-dimensional regression, and exhibit a universal concentration bound for under-sampled linear regression with a Gaussian noise model. In particular, we show that a large class of priors admit a quantitative finite-sample posterior concentration bound. This result establishes that reasonable concentration bounds may be obtained in the high-dimensional setting. Additionally, we show that this bound may be strengthened in certain cases.
This is joint work with Artin Armagan, Lawrence Carin, David Dunson, and Rayan Saab.
March 25: Erik Slivken (UC Davis)
Pattern-avoiding permutations and Brownian excursion
Permutations of length n that avoid a given pattern of length 3 are in bijection with Dyck paths of length 2n. Using particular bijections, we can say much about the shape and statistics of these permutations. Under certain scalings these permutations look like scaled Dyck paths, which in turn converge to Brownian excursion. The bijections also help us compute the distribution of the number of fixed points for certain pattern-avoiding classes. For permutations that avoid patterns of length k>3 we no longer have a bijection with well-studied objects such as Dyck paths. However, for monotone pattern-avoiding classes, there is a strong connection to random walks which allows us to prove various results about the shape and statistics of typical permutations from these classes. This is joint work with Benjamin Fineman, Christopher Hoffman, and Douglas Rizzolo.
April 1: Jianfeng Zhang (USC, Department of Mathematics)
Regularity of Stopping Times, with Applications in Path Dependent PDEs
It is commonly known in the stochastic analysis literature that stopping times do not have good regularity. This is particularly true when the underlying state process is degenerate, and this bad regularity causes major difficulty when we study degenerate PPDEs. Strongly motivated by a recent work of Bayraktar and Yao, we introduce a type of stopping time which has the desired regularity. Applications to PPDEs will also be discussed. The talk is based on joint work with Ekren.
April 8: Xin Tong (USC Marshall School of Business)
Neyman-Pearson (NP) Classification and NP-ROC
In many binary classification applications such as disease diagnosis, type I errors are often more important than type II errors, and practitioners have a great need to control type I errors under a desired threshold $\alpha$. However, common practices that tune empirical type I errors to $\alpha$ often lead to classifiers with type I errors much larger than $\alpha$. In statistical learning theory, the Neyman-Pearson (NP) binary classification paradigm installs a type I error constraint under some user-specified level $\alpha$ before it minimizes type II errors. Despite recent theoretical advances, the NP paradigm has not been implemented for many classification scenarios in practice. In this work, we propose an umbrella algorithm that adapts popular classification methods, including logistic regression, support vector machines, and random forests, to the NP paradigm. Powered by these NP classification methods, we propose the NP receiver operating characteristic (NP-ROC) curves, a variant of the receiver operating characteristic (ROC) curves, which have been widely used for evaluating the overall performance of binary classification methods at all possible type I error thresholds. Despite conceptual simplicity and wide applicability, ROC curves provide no reliable information on how to choose classifiers whose type I errors are under a desired threshold with high probability. In contrast, NP-ROC curves serve as effective tools to evaluate, compare and select binary classifiers with prioritized type I errors. We demonstrate the use and advantages of NP-ROC curves via simulation and real data case studies.
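A simplified sketch of the order-statistic idea behind such high-probability type I error control, assuming scipy is available: pick the threshold from held-out class-0 scores via a binomial tail bound. This illustrates the principle only, not the full umbrella algorithm.

import numpy as np
from scipy.stats import binom

def np_threshold(scores0, alpha=0.05, delta=0.05):
    # With t the j-th largest held-out class-0 score, the violation event
    # {type I error > alpha} means fewer than j scores exceeded the true
    # (1-alpha)-quantile, which has probability Binom(n, alpha).cdf(j-1).
    n = len(scores0)
    s = np.sort(scores0)[::-1]                        # descending order
    valid = [j for j in range(1, n + 1) if binom.cdf(j - 1, n, alpha) <= delta]
    if not valid:
        raise ValueError("sample too small for this (alpha, delta)")
    return s[max(valid) - 1]                          # lowest valid threshold, for power

rng = np.random.default_rng(0)
t = np_threshold(rng.standard_normal(1000))           # classify as class 1 if score > t
print(t)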
April 15: Sergey Lototsky (USC, Department of Mathematics)
Small Ball Probabilities for Gaussian Diffusions in Finite and Infinite Dimensions
A positive random variable is usually not very likely to become very small. The subject of small ball probabilities is the asymptotic analysis of these unlikely events, and the general question remains wide open even when the random variable is a functional of a Gaussian process. The objective of the talk is to derive the logarithmic asymptotics of small ball probabilities for L_2-type norms of solutions of linear stochastic differential equations. These logarithmic asymptotics are rather robust and often do not depend on the drift in the equation.
April 22: Ilias Diakonikolas (USC, Department of Computer Science)
Density estimation via piecewise polynomial approximation in sample near-linear time
In this talk, I will focus on the problem of density estimation, i.e., how to estimate (learn) a probability distribution based on random samples. I will describe a sample-optimal and computationally efficient algorithm to learn univariate distributions that are well-approximated by piecewise polynomial density functions. As a consequence of this algorithm, we obtain the first (near-)sample optimal and *near-linear time* density estimators for a wide range of well-studied structured distribution families.
If time permits, I will mention applications of the underlying algorithmic ideas to other inference tasks (e.g., regression).
Based on joint works with S. Chan, R. Servedio, X. Sun; and J. Acharya, J. Li, and L. Schmidt.
April 29: Peter Radchenko (USC Marshall School of Business)
Consistent clustering using an L-1 fusion penalty
We study the large sample behavior of a convex clustering framework, which minimizes the sample within-cluster sum of squares under an L-1 fusion constraint on the cluster centroids. This recently proposed approach has been gaining in popularity; however, its asymptotic properties have remained mostly unknown. Our analysis is based on a novel representation of the sample clustering procedure as a sequence of cluster splits determined by a sequence of maximization problems. We use this representation to provide a simple and intuitive formulation for the population clustering procedure, and demonstrate that the sample procedure consistently estimates its population analog. We also derive the corresponding rates of convergence. The proof conducts a careful simultaneous analysis of a collection of M-estimation problems, whose cardinality grows together with the sample size. Based on the new perspectives gained from the asymptotic investigation, we propose a key post-processing modification of the original clustering approach.
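A small sketch of the convex clustering objective itself, written with cvxpy (assumed installed); the penalty level lam is an arbitrary illustrative value.

import numpy as np
import cvxpy as cp

rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0, 0.3, (10, 2)), rng.normal(3, 0.3, (10, 2))])
n, d = X.shape

U = cp.Variable((n, d))                   # one fitted centroid per observation
lam = 1.0                                 # illustrative fusion penalty level
fidelity = cp.sum_squares(X - U)          # within-cluster sum of squares
fusion = sum(cp.norm(U[i] - U[j], 1) for i in range(n) for j in range(i + 1, n))
cp.Problem(cp.Minimize(fidelity + lam * fusion)).solve()

# observations whose fitted centroids coincide are assigned to the same cluster
print(np.round(U.value, 2))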
USC Probability and Statistics Seminars (Fall 2015)
Fridays 3:30pm in KAP 414; tea is usually provided at 3:00pm
Organizer: Tobias Johnson
For seminars/colloquia on other topics, see the Department of Mathematics webpage.
Older seminars: Spring 2015, Fall 2014, 2013-2014
-
Fall 2015 seminars
September 4: Jiming Jiang (UC Davis)
A Unified Monte-Carlo Jackknife for Small Area Estimation After Model Selection
We consider estimation of mean squared prediction error (MSPE) in small area estimation (SAE) when a procedure of model selection is involved prior to the estimation. We discuss the difficulty of achieving both second-order unbiasedness and positivity at the same time, the so-called double goal in MSPE estimation, and propose a simple alternative by estimating the logarithm of the MSPE. A unified Monte-Carlo jackknife method, called McJack, is proposed for estimating the log-MSPE. We prove the second-order unbiasedness of McJack, and demonstrate the performance of McJack in assessing uncertainty in SAE after model selection through empirical investigations that include simulation studies and real-data analyses.
This work is joint with P. Lahiri of the University of Maryland, and Thuan Nguyen of the Oregon Health & Science University.
September 18: Frederi Viens (Purdue University), joint with Math Finance Colloquium
Parameter estimation for Gaussian sequences: sharp asymptotic normality and non-normality
By using recent progress on how to measure total variation distance to normality on Wiener chaos, we develop a framework for estimating scale parameters for stationary and non-stationary Gaussian sequences via power-type variations, concentrating on the sharpness of total-variation convergence speeds for their asymptotic normality or non-normality. Applications are given to Ornstein-Uhlenbeck processes driven by fractional Gaussian noise, observed in discrete time, under long-horizon asymptotics, and to partially observed systems of such processes. The resulting estimators can be interpreted as least-squares estimators, and as generalized method of moments estimators. This represents joint work completed and in progress with Kh. es-Sebaiy and B. el Onsy (Marrakech).
October 2: Sixian Jin (Claremont Graduate University)
Series Representations of Fractional Conditional Expectations under Malliavin Calculus
Fractional Brownian motion is nowadays considered one of the most natural extensions of classical Brownian motion. Based on its Malliavin calculus, we introduce fractional conditional expectations, under which fractional Brownian motion has a “martingale” property similar to that of Brownian motion under ordinary conditional expectations. Then, according to the fractional Clark-Hausmann-Ocone formula, we represent fractional conditional expectations of a functional of fractional Brownian motion as a convergent series in L^2 space. When the target random variable is some function of a discrete trajectory of fractional Brownian motion, we obtain a backward Taylor series representation; when the target functional is generated by a continuous fractional filtration, the series representation is obtained by applying a “frozen path” operator and an exponential operator to the functional.
October 16: Nicolai Krylov (University of Minnesota), joint with Math Finance Colloquium
On the independence of the value function for stochastic differential games of the probability space
We show that the value function in a stochastic differential game does not change if we keep the same space (Ω,F) but introduce probability measures by means of Girsanov’s transformation depending on the policies of the players. We also show that the value function does not change if we allow the driving Wiener processes to depend on the policies of the players. Finally, we show that the value function does not change if we perform a random time change with the rate depending on the policies of the players.
October 23: Nicholas Cook (UCLA)
Random regular digraphs: singularity and spectrum
We consider two random matrix ensembles associated to large random regular digraphs: (1) the 0/1 adjacency matrix, and (2) the adjacency matrix with iid bounded edge weights. Motivated by universality conjectures, we show that the spectral distribution for the latter ensemble is asymptotically described by the circular law, assuming the graph has degree linear in the number of vertices. Towards establishing the same result for the unweighted adjacency matrix, we prove that it is invertible with high probability, even for sparse digraphs with degree growing only poly-logarithmically.
October 30: Harry Crane (Rutgers)
Pattern avoidance for random permutations
A permutation of [n] is a linear ordering of the elements {1,…,n}. A length n permutation v avoids a length k permutation w (called a pattern) if there is no length k subsequence of v that is order-isomorphic to w. Here we recast the classic enumerative question (how many permutations of [n] avoid a given pattern?) in probabilistic terms: what is the probability that a randomly generated permutation avoids a given pattern? We study this question for the class of random Mallows permutations, of which uniform random permutations are a special case. The probabilistic interpretation avails us of techniques from Poisson approximation, which we use to bound the total variation distance between the Poisson distribution and the distribution of the number of occurrences of fixed patterns in random Mallows permutations. For the uniform distribution, we obtain bounds on the number of pattern avoiding permutations of all finite sizes. For large or specially structured patterns, our bounds lead to a Poisson limit theorem for the number of occurrences and yield asymptotically exact estimates for the number of permutations that avoid such patterns. This is joint work with Stephen DeSalvo.
November 6: Tobias Johnson (USC)
Size biased couplings and the spectral gap for random regular graphs
Consider a random d-regular graph on n vertices. Its second eigenvalue is closely related to its expansion properties, and bounding it has been a major topic of research over the last thirty years. It’s conjectured by Van Vu that as n and optionally d tend to infinity, the second eigenvalue is bounded by C*sqrt(d) with probability tending to 1, regardless of the growth of d. This is currently known to hold only if d = o(n^(1/2)). We show that it holds so long as d remains smaller than n^(2/3). We use the Kahn-Szemerédi approach, which is based on showing concentration for Rayleigh quotients of the graph’s adjacency matrix. To prove this concentration, we build on work by Ghosh–Goldstein (2011) and by Arratia–Baxendale (2015) for showing concentration via size biased couplings. This is joint work with Nicholas Cook and Larry Goldstein.
November 13: Thaleia Zariphopoulou (UT Austin), joint with Math Finance Colloquium
CANCELLED
November 20: Alejandro Morales (UCLA)
Bijections and probabilistic puzzles related to factorizations of the long cycle
We study the factorizations of the permutation (1, 2, …, n) into k factors of given cycle types. Using the group algebra of the symmetric group, Jackson obtained for each k an elegant formula for counting these factorizations according to the number of cycles of each factor. In the cases k = 2, 3, Schaeffer and Vassilieva gave a combinatorial proof of Jackson’s formula. These counting results are indicative of a rich combinatorial theory which has remained elusive to this point, and it is the goal of this project to establish a series of bijections which unveil combinatorial properties and a probabilistic puzzle related to these factorizations into k factors for all k. The first bijection is an instance of a correspondence of Bernardi between such factorizations and tree-rooted maps: graphs embedded on surfaces with a distinguished spanning tree. This is joint work with Olivier Bernardi.
December 4: Stanislav Minsker (USC)
Efficient representation of data on smooth manifolds: non-asymptotic bounds and robustness
It has been empirically observed that many high-dimensional datasets are well approximated by low-dimensional structures. Over the past decade, this fact has motivated the investigation of modeling techniques which exploit these low-dimensional intrinsic structures, along with applications to high-dimensional statistics, machine learning, and signal processing. One particularly interesting example is the situation when the “low-dimensional structure” is a smooth submanifold of a high-dimensional Euclidean space. We study sparse multiscale representations and associated compression schemes for such data, and provide non-asymptotic probabilistic performance guarantees that depend only on the “intrinsic dimension” of the underlying submanifold. The talk is based on a joint work with M. Maggioni and N. Strawn.
January 15: Steven Heilman (UCLA)
Noncommutative invariance principles and Grothendieck’s Inequality
The seminal invariance principle of Mossel-O’Donnell-Oleszkiewicz implies the following. Suppose we have a multilinear polynomial Q, all of whose partial derivatives are small. Then the distribution of Q on i.i.d. uniform {-1,1} inputs is close to the distribution of Q on i.i.d. standard Gaussian inputs. The case that Q is a linear function recovers the Berry-Esseen Central Limit Theorem. In this way, the invariance principle is a nonlinear version of the central limit theorem. We prove the following version of the invariance principle. Suppose we have a multilinear polynomial Q with matrix coefficients, all of whose partial derivatives are small. Then the distribution of Q on i.i.d. uniform {-1,1} inputs is close to the distribution of Q on (carefully chosen) random Gaussian matrices. The exact statement must be phrased carefully in order to avoid being false. Time permitting, we discuss an application of this result to computational hardness for the noncommutative Grothendieck inequality. (joint with Thomas Vidick)
USC Probability and Statistics Seminars (Spring 2015)
Fridays 3:30pm in KAP 414; tea is usually provided at 3:00pm
Organizer: James Zhao
For seminars/colloquia on other topics, see the Department of Mathematics webpage.
-
Past Seminars
30 January (joint with Mathematical Finance Colloquium) – Zhenjie Ren, Ecole Polytechnique
Viscosity solution of semi linear path dependent PDE
The notion of viscosity solutions introduced by Ekren, Touzi and Zhang considers as test functions all smooth processes which are tangent in mean. When restricted to the Markovian case, this definition induces a larger set of test functions, and reduces to the notion of stochastic viscosity solutions analyzed by Bayraktar and Sirbu. We take advantage of this enlargement of the test functions and provide an easier proof of comparison. As a key ingredient of our methodology, we introduce a notion of punctual differentiation, similar to the corresponding concept in the standard theory of viscosity solutions, and we prove that semimartingales are almost everywhere punctually differentiable. This smoothness result can be viewed as the counterpart of the Aleksandroff smoothness result for convex functions. We have also recently developed a Perron-type argument for the existence of solutions.
13 February – Peter Ralph, USC (Computational Biology)
Descriptive statistics of a large random graph, the population pedigree
Often, the summary statistics of population genetics are framed in the setting of Kingman’s coalescent or related models. These statistics can be alternatively thought of as descriptive statistics of the realized population pedigree-with-recombination, in a way that has become much more useful in the era of whole-genome sequencing. For instance, pairwise number of nucleotide differences is proportional to “effective population size”, which is sometimes more usefully thought of as an estimate of the average length of the path through the pedigree to the most recent common ancestor at a randomly chosen locus (with an explicit standard error). Another example is the pairwise distribution of long tracts of shared genomic segments, which provides an estimate of a functional of the entire distribution of such paths.
20 February – Timothy Daley, USC (Computational Biology)
Non-parametric capture-recapture models for DNA sequencing experiments
Current DNA sequencing technologies involve sampling genomic fragments from a large pool, called a library. The library is constructed from a small initial amount of DNA using amplification procedures, hence each original fragment exists in thousands or millions of copies. This amplification, although necessary to produce enough material for the experiment, can introduce large biases and implies that the properties of the library cannot be known beforehand. Our goal is to infer properties of the experiment based on a small initial sample of the library. The capture-recapture framework naturally fits this scenario; however, next-generation sequencing experiments produce data several orders of magnitude larger than traditional capture-recapture experiments. This gives rise to challenges in extrapolating, but also opportunities for methods that utilize the size of the data for highly accurate inferences. We will discuss the application of non-parametric empirical Bayes models to predict critical aspects of sequencing experiments to allow for optimal allocation of sequencing resources in large-scale sequencing experiments.
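One classical nonparametric device in this spirit is the Good–Toulmin estimator for the number of new distinct fragments that would be seen if the sample were enlarged; the sketch below illustrates that idea only, and is not the speaker's estimator.

import numpy as np
from collections import Counter

def good_toulmin(fragment_ids, t):
    # Delta(t) = sum_{j>=1} (-1)^(j+1) t^j n_j, where n_j is the number of
    # distinct fragments seen exactly j times; numerically stable for t <= 1.
    n_j = Counter(Counter(fragment_ids).values())     # counts of counts
    return sum((-1) ** (j + 1) * t ** j * nj for j, nj in n_j.items())

rng = np.random.default_rng(0)
library = rng.integers(0, 50_000, size=10_000)        # hypothetical fragment IDs
print(good_toulmin(library, 1.0))   # predicted new fragments from doubling the sample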
27 February (joint with Mathematical Finance Colloquium) – Joscha Diehl, Technical University of Berlin
Singular stochastic PDEs on Lie groups
We study singular stochastic PDEs – equations whose forcing terms are so irregular that they are formally ill-posed – on compact Lie groups. One example is the (continuous) parabolic Anderson model. Our main tool is an adaptation of the theory of paracontrolled distributions (Gubinelli/Imkeller/Perkowski) using representation theory for (compact) groups.
This is work in progress with Antoine Dahlqvist (TU Berlin).
6 March – David Kempe, USC (Computer Science)
Incentivizing Exploration
We study a Bayesian multi-armed bandit (MAB) setting in which a principal seeks to maximize the sum of expected time-discounted rewards obtained by pulling arms, when the arms are actually pulled by selfish and myopic individuals. Since such individuals pull the arm with highest expected posterior reward (i.e., they always exploit and never explore), the principal must incentivize them to explore by offering suitable payments. Among others, this setting models crowdsourced information discovery and funding agencies incentivizing scientists to perform high-risk, high-reward research.
We explore the tradeoff between the principal’s total expected time-discounted incentive payments, and the total time-discounted rewards realized. Specifically, with a time-discount factor γ in (0,1), let OPT denote the total expected time-discounted reward achievable by a principal who pulls arms directly without having to incentivize selfish agents, in a MAB problem. We call a (payment, reward) combination (b,ρ) in [0,1]^2 achievable if for every MAB instance, using expected time-discounted payments of at most b*OPT, the principal can guarantee an expected time-discounted reward of at least ρ*OPT. Our main result is a complete characterization of achievable (payment, reward) pairs: (b,ρ) is achievable if and only if sqrt(b) + sqrt(1-ρ) ≥ sqrt(γ).
In proving this characterization, we analyze so-called time-expanded policies, which in each step let the agents choose myopically with some probability p, and incentivize them to choose “optimally” with probability 1-p. The analysis of time-expanded policies leads to a question that may be of independent interest: if the same MAB instance (without selfish agents) is considered under two different time-discount rates γ and η, how small can the ratio of OPT(η) to OPT(γ) be? We give a complete answer to this question, showing that OPT(η) ≥ ((1-γ)^2/(1-η)^2)·OPT(γ), and that this bound is tight.
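As a worked consequence of the characterization: setting b = 0 forces sqrt(1-ρ) ≥ sqrt(γ), i.e. ρ ≤ 1-γ, so with no payments the principal can guarantee only a (1-γ) fraction of OPT; conversely, guaranteeing ρ = 1 requires b ≥ γ. A one-line check of the boundary, with illustrative numbers:

def achievable(b, rho, gamma):
    # (b, rho) is achievable iff sqrt(b) + sqrt(1 - rho) >= sqrt(gamma)
    return b ** 0.5 + (1 - rho) ** 0.5 >= gamma ** 0.5

print(achievable(0.0, 0.1, 0.9), achievable(0.0, 0.2, 0.9))   # True, False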
27 March – Matthew Junge, University of Washington
Splitting hairs (with choice)
In the past decade, the computer science literature has studied the effect of introducing random choices to classical processes. For example, sequentially place n balls into n bins. For each ball, two bins are sampled uniformly and the ball is placed in the emptier of the two. This process does a much better job of evenly distributing the balls than the “choiceless” version where one places each ball uniformly (a quick simulation of this contrast appears after the next paragraph).
Consider the continuous version: form a random sequence in the unit interval by having the nth term be whichever of two uniformly placed points falls in the larger gap between the previous n-1 points. I’ll confirm the intuition that this sequence is a.s. equidistributed, solving an open problem posed by Itai Benjamini, Pascal Maillard and Elliot Paquette. The history goes back a century to Weyl and more recently to Kakutani. Several open problems will be discussed.
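The discrete two-choice contrast is easy to see in simulation; the sketch below compares the maximum bin load under one and two uniform choices (bins are sampled with replacement, and n is an arbitrary illustrative size).

import numpy as np

def max_load(n, choices, seed=0):
    # Place n balls into n bins; for each ball, sample `choices` bins
    # uniformly and put the ball in the least loaded of them.
    rng = np.random.default_rng(seed)
    bins = np.zeros(n, dtype=int)
    for _ in range(n):
        cand = rng.integers(0, n, size=choices)
        bins[cand[np.argmin(bins[cand])]] += 1
    return bins.max()

n = 100_000
print(max_load(n, 1), max_load(n, 2))   # roughly log n/log log n vs. log log n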
3 April – Tatiana Tatarinova, USC (Keck School of Medicine) & Children’s Hospital Los Angeles
Differential Evolution Approach to Detect Recent Admixture
The genetic structure of human populations is extraordinarily complex and of fundamental importance to studies of anthropology, evolution, and medicine. As increasingly many individuals are of mixed origin, there is an unmet need for tools that can infer multiple origins. Misclassification of such individuals can lead to incorrect and costly misinterpretation of genomic data, primarily in disease studies and drug trials. We present an advanced tool to infer ancestry that can identify the biogeographic origins of highly mixed individuals. reAdmix can incorporate an individual’s knowledge of ancestors (e.g. having some ancestors from Turkey or a Scottish grandmother).
10 April – Igor Cialenco, Illinois Institute of Technology
On Hypothesis testing for Stochastic PDEs
We study the simple hypothesis testing problem for the drift coefficient of a stochastic fractional heat equation driven by additive noise. We introduce the notion of an asymptotically most powerful test, and find explicit forms of such tests in two asymptotic regimes: large time asymptotics, and an increasing number of Fourier modes. The proposed statistics are based on the maximum likelihood ratio. Additionally, we obtain a series of important technical results of independent interest: we find the cumulant generating function of the log-likelihood ratio, and obtain sharp large deviation type results for large times and for a large number of Fourier modes.
Also, we will discuss how to estimate and control the Type I and Type II errors. We propose a new class of rejection regions and provide computable thresholds for T and N that guarantee that the statistical errors are smaller than a given upper bound. Finally, we illustrate the theoretical results by numerical simulations.
USC Probability and Statistics Seminars (Fall 2014)
Fridays 3:30pm in KAP 414; tea is usually provided at 3:00pm
Organizer: James Zhao <james.zhao@usc.edu>
For seminars/colloquia on other topics, see the Department of Mathematics webpage.
Older seminars: 2013-2014
-
Fall 2014 Seminars
September 12 – Aaron Abrams, Washington and Lee University
Finding patterns in randomness: A new perspective on the longest increasing subsequence problem
Given a random permutation of 1, …, N, how long is the longest monotonically increasing subsequence? Decades of work went into solving and understanding this problem, and by now the answer is well known: the expected length is asymptotically 2√N, and the fluctuations around it follow the famous Tracy-Widom distribution.
Now, given a random permutation of 1, …, N, how long is the longest subsequence that alternates between increasing and decreasing? This problem turns out to be much easier. A few years ago, Stanley showed that this length is asymptotically normally distributed around a mean of 2N/3.
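Both statistics are easy to compute, so the two limit laws can be checked empirically; here is a short sketch of ours, using patience sorting for the increasing case and a linear scan for the alternating case:

```python
import bisect
import math
import random

def lis_length(perm):
    """Longest increasing subsequence via patience sorting, O(N log N)."""
    piles = []
    for x in perm:
        i = bisect.bisect_left(piles, x)
        if i == len(piles):
            piles.append(x)
        else:
            piles[i] = x
    return len(piles)

def alternating_length(perm):
    """Longest alternating subsequence via a single pass, O(N)."""
    up = down = 1
    for prev, cur in zip(perm, perm[1:]):
        if cur > prev:
            up = down + 1
        elif cur < prev:
            down = up + 1
    return max(up, down)

random.seed(0)
N = 100_000
perm = random.sample(range(N), N)
print(lis_length(perm), "vs 2*sqrt(N) =", round(2 * math.sqrt(N)))
print(alternating_length(perm), "vs 2N/3 =", round(2 * N / 3))
```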
These two cases have quite different behaviors. We show that they are archetypes of a dichotomy: for any order pattern (of which increasing and alternating are examples), the answer will resemble either the increasing case or the alternating case. The difference is detected by a simple combinatorial feature called drift that arose in some of our earlier work.
(Joint work with E. Babson, H. Landau, Z. Landau, and J. Pommersheim)
September 19 – Ken Alexander, USC
Polymer pinning with sparse disorder
The standard setup in disordered pinning models is that a polymer configuration is modeled by the trajectory of a Markov chain which is rewarded or penalized by an amount ωn when it returns to a special state 0 at time n. More precisely, for a polymer of length N the Boltzmann weight is e^{βH}, where for a trajectory τ, H(τ) is the sum of the values ωn at the times n ≤ N of the returns of τ to 0. Typically the ωn are taken to be a quenched realization of an iid sequence, but here we consider the case of sparse disorder: ωn is 1 at the return times of a quenched realization of a renewal sequence {σj}, and 0 otherwise; in the interesting cases the gaps between renewals have infinite mean, and we assume the gaps have a regularly varying power-law tail. For β above a critical point, the polymer is pinned in the sense that τ asymptotically hits a positive fraction of the N renewals in σ. To see the effect of the disorder, one can compare this critical point to the one in the corresponding annealed system. We establish equality or inequality of these critical points depending on the tail exponents of the two renewal sequences (that is, σ and the return times of τ). This is joint work with Quentin Berger.
September 26 – Skip Garibaldi, UCLA IPAM
Probability and the lottery
Various schemes surrounding the lottery, both legal and criminal, both money-losing and money-making, raise interesting probability problems. We explain how to analyze these schemes, how the analysis can distinguish between criminals and problem gamblers, and how the analysis led to lottery operators changing policies and law enforcement cracking down on the criminal schemes.
This talk is about joint work with Aaron Abrams, Richard Arratia, Philip Stark, and others. It will be accessible to a general mathematical audience; the main pre-requisite is something like Math 307 or Math 407.
October 3 – Larry Goldstein, USC
Normal approximation for Gaussian projections and conic intrinsic volumes: Steining the Steiner formula
Intrinsic volumes of convex sets are natural geometric quantities that also play important roles in applications. In particular, the discrete probability distribution L(V_C) given by the sequence v_0, …, v_d of conic intrinsic volumes of a closed convex cone C in R^d summarizes key information about the success of convex programs used to solve for sparse vectors, and other unknowns, in high-dimensional regularized inverse problems. The concentration of V_C implies the existence of phase transitions, in the number of observations, for the probability of recovering the unknown. Additional information about the probability of recovery success is provided by a normal approximation for V_C. Such central limit theorems can be shown by first considering the squared length G_C of the projection of a Gaussian vector on the cone C. Applying a second-order Poincaré inequality, proved using Stein's method and Malliavin calculus, then produces a non-asymptotic total variation bound to the normal for L(G_C). A conic version of the classical Steiner formula in convex geometry shows that a normal limit for G_C implies one for V_C.
This work is joint with Ivan Nourdin and Giovanni Peccati.
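For the simplest cone, the nonnegative orthant C = R^d_+, the projection of a Gaussian vector simply zeroes out the negative coordinates, so G_C can be sampled directly; the Monte Carlo sketch below (our illustration, not from the talk) shows the approximate normality:

```python
import numpy as np

rng = np.random.default_rng(0)
d, reps = 400, 20_000

# G_C = squared length of the projection of a Gaussian vector onto
# C = R^d_+, i.e. the squared norm of the positive part.
g = rng.standard_normal((reps, d))
G_C = (np.maximum(g, 0.0) ** 2).sum(axis=1)

z = (G_C - G_C.mean()) / G_C.std()      # standardize and check normality
print("skewness:", np.mean(z**3))       # close to 0 for a normal fit
print("kurtosis:", np.mean(z**4))       # close to 3 for a normal fit
```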
October 10 – Jianfeng Zhang, USC
Some Thinking about Pathwise Stochastic Calculus
In the standard stochastic analysis literature, processes are typically understood in an a.s. sense, with stochastic integration and conditional expectation as notable examples. Motivated by our study of path-dependent PDEs, which are a convenient tool for non-Markovian (that is, path-dependent) problems, we are interested in pathwise solutions of certain equations. In particular, we would like to have regularity of the processes in terms of the underlying paths. For the backward situation, with conditional expectation as a typical example, the functional Itô formula seems to be the right tool. For the forward situation, with stochastic integration as a simple example, rough path theory appears to be the convenient tool. With the help of rough path theory, we will show that a diffusion (and, more generally, the solution of a stochastic PDE) can be continuous in ω under a certain topology.
October 17 – Han Liang Gan, Washington University in St Louis
Conditional random variable approximation with Stein’s method
The distribution of the number of rare events is a problem that has been studied extensively over a long period of time. At its core is the law of small numbers, which states that if you have many independent events, each occurring with very small probability, then the distribution of the number of events that occur is Poisson in the limit. If you relax the independence assumption, the limit becomes far more complicated. In this talk, we will focus on approximating the distribution of the number of rare events given that at least one rare event has occurred, and to this end we formulate Stein's method for conditional random variable approximation. We will also discuss conditional point process approximation, and the difficulties that arise when generalising to spatial point processes.
This is joint work with Aihua Xia.
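In the independent case, the conditional limit is simply a zero-truncated Poisson, which a toy simulation confirms (our sketch; the dependent case that motivates the talk is not captured here):

```python
import math
import numpy as np

rng = np.random.default_rng(0)
n, p, reps = 5_000, 1e-3, 200_000
lam = n * p

W = rng.binomial(n, p, size=reps)       # number of independent rare events
W = W[W >= 1]                           # condition on at least one event

for k in range(1, 6):                   # compare with zero-truncated Poisson(lam)
    pois = math.exp(-lam) * lam**k / math.factorial(k)
    print(k, round(float(np.mean(W == k)), 4), round(pois / (1 - math.exp(-lam)), 4))
```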
October 31 – Toby Johnson, USC
The frog model on trees
Imagine that each vertex of a graph contains a sleeping frog. At time 0, the frog at some designated vertex wakes up and begins a simple random walk. When it lands on a vertex, the sleeping frog there wakes up and begins its own simple random walk, which in turn wakes up any sleeping frogs it lands on, and so on. This process is called the frog model.
We consider a question posed by Serguei Popov in 2003: On an infinite d-ary tree, is the frog model recurrent or transient? That is, is each vertex visited infinitely or finitely often by frogs? We (mostly) resolve this question, showing that there is a phase transition between recurrence and transience as d grows. This work is joint with Christopher Hoffman and Matthew Junge.
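A finite-depth simulation cannot settle recurrence versus transience, but it does show the mechanics of the process; the sketch below (ours, with a reflecting truncation at the leaves) counts visits to the root:

```python
import random

def frog_root_visits(d: int, depth: int, rounds: int, seed: int = 0) -> int:
    """Frog model on a d-ary tree truncated at `depth`; vertices are
    encoded as tuples of child indices, () being the root."""
    rng = random.Random(seed)
    frogs = [()]                        # positions of awake frogs
    woken = {()}                        # vertices whose frog is awake
    visits = 0
    for _ in range(rounds):
        moved = []
        for v in frogs:
            nbrs = [v + (i,) for i in range(d)] if len(v) < depth else []
            if v:
                nbrs.append(v[:-1])     # parent (leaves only move up)
            w = rng.choice(nbrs)
            if w == ():
                visits += 1
            if w not in woken:          # wake the sleeping frog at w
                woken.add(w)
                moved.append(w)
            moved.append(w)
        frogs = moved
    return visits

print(frog_root_visits(d=2, depth=10, rounds=100))   # small d: many root visits
print(frog_root_visits(d=5, depth=6, rounds=100))    # large d: far fewer
```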
Monday November 3 (joint with CAMS Colloquium) – Michael Wolf, University of Zurich
Spectrum Estimation: A Unified Framework for Covariance Matrix Estimation and PCA in Large Dimensions
Covariance matrix estimation and principal component analysis (PCA) are two cornerstones of multivariate analysis. Classic textbook solutions perform poorly when the dimension of the data is of a magnitude similar to the sample size, or even larger. In such settings, there is a common remedy for both statistical problems: nonlinear shrinkage of the eigenvalues of the sample covariance matrix. The optimal nonlinear shrinkage formula depends on unknown population quantities and is thus not available. It is, however, possible to consistently estimate an oracle nonlinear shrinkage, which is motivated on asymptotic grounds. A key tool to this end is consistent estimation of the set of eigenvalues of the population covariance matrix (also known as the spectrum), an interesting and challenging problem in its own right. Extensive Monte Carlo simulations demonstrate that our methods have desirable finite-sample properties and outperform previous proposals.
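The basic phenomenon motivating shrinkage is easy to reproduce: even when every population eigenvalue equals 1, the sample eigenvalues spread over the whole Marchenko-Pastur bulk. A quick sketch of ours:

```python
import numpy as np

rng = np.random.default_rng(0)
n, p = 400, 200                         # sample size comparable to dimension

X = rng.standard_normal((n, p))         # population covariance = identity
eigs = np.linalg.eigvalsh(X.T @ X / n)  # eigenvalues of the sample covariance

c = p / n                               # all true eigenvalues are 1, yet:
print("sample eigenvalue range:", eigs.min(), eigs.max())
print("Marchenko-Pastur support:", (1 - c**0.5) ** 2, (1 + c**0.5) ** 2)
```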
November 7 – Yosef Rinott, Hebrew University of Jerusalem
A partial survey and some new thoughts on model selection
I will discuss some model selection methods, old and relatively new, such as AIC, BIC, Lasso and more. I will explain their goals and ‘philosophy’, and some of their properties. Time permitting I will discuss my recent work in this area.
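As a reminder of how the classical criteria trade fit against complexity, here is a small worked example of ours (AIC and BIC for Gaussian least squares, up to model-independent additive constants):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200
x = rng.standard_normal(n)
y = 1.0 + 2.0 * x + rng.standard_normal(n)      # true model: degree 1

def aic_bic(y, X):
    """AIC/BIC for Gaussian least squares, up to additive constants."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    rss = float(np.sum((y - X @ beta) ** 2))
    k = X.shape[1]
    fit = n * np.log(rss / n)
    return fit + 2 * k, fit + np.log(n) * k

for deg in (1, 2, 5):
    aic, bic = aic_bic(y, np.vander(x, deg + 1))
    print(f"degree {deg}: AIC = {aic:.1f}, BIC = {bic:.1f}")   # degree 1 should win
```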
November 14 (joint with Mathematical Finance Colloquium) – Carl Mueller, University of Rochester
Do stochastic PDEs hit points or have double points in the critical dimensions?
This talk will describe work in progress with R. Dalang, Y. Xiao, and S. Tindel. The stochastic heat equation is often used as a basic model for a moving polymer:
(1) ∂_t u = Δu + Ẇ(t,x).
Here, u = u(t,x) ∈ R^d is the position of the polymer, x ∈ R is the length along the polymer, and t is time. Ẇ(t,x) is a two-parameter vector-valued white noise. Note that u ∈ R^d; that is, u is vector-valued. This is consistent with the interpretation of u as the position of a polymer.
We say that the solution hits a point z if there is positive probability that u(t,x) = z for some (random) parameters (t,x). We say that the solution has a double point if there exist distinct pairs of parameters (t,x) and (s,y) with t,s>0, x,y ∈ R such that u(t,x) = u(s,y). Some time ago the speaker and R. Tribe proved that d = 6 is the critical dimension for the solution u to hit points. That is, for a generic point z, the solution u hits points iff d < 6. In the same way, the critical dimension for double points is d = 12. The proofs use arguments which are specific to (1).
The goal of our work, which is still in progress, is to adapt an argument of Talagrand to study this question for equations similar to (1) but which:
(1) have colored noise in place of white noise;
(2) have nonlinearities.
As usual, the critical case is by far the hardest. In fact, there are a number of results about the situation away from criticality, but they are not sharp enough to give the results we seek.
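Away from all these subtleties, the object itself is easy to simulate; the following crude explicit-Euler sketch of ours discretizes (1) on a periodic spatial grid and records how close the solution comes to a target point:

```python
import numpy as np

rng = np.random.default_rng(0)
d = 6                                   # critical dimension for hitting points
nx, nt = 200, 20_000
dx = 1.0 / nx
dt = 0.25 * dx**2                       # stability condition for explicit Euler

u = np.zeros((nx, d))                   # u(t, x) in R^d, x on a periodic grid
z = np.ones(d)                          # a generic target point
closest = np.inf

for _ in range(nt):
    lap = (np.roll(u, 1, axis=0) - 2 * u + np.roll(u, -1, axis=0)) / dx**2
    noise = np.sqrt(dt / dx) * rng.standard_normal((nx, d))   # space-time white noise
    u = u + dt * lap + noise
    closest = min(closest, float(np.linalg.norm(u - z, axis=1).min()))

print("closest approach to z:", closest)
```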
November 14 at 4.30pm (Special Colloquium) – Teng Zhang, Princeton University
Tyler’s M-estimator: subspace recovery and high-dimensional regime
Given a data set, Tyler's M-estimator is a widely used covariance matrix estimator that is robust to outliers and heavy-tailed distributions. We will discuss two recent results on this estimator. First, we show that when a certain percentage of the data points are sampled from a low-dimensional subspace, Tyler's M-estimator can be used to recover the subspace exactly. Second, in the high-dimensional regime where the number of samples n and the dimension p both go to infinity with p/n converging to a constant y ∈ (0,1), and the data samples are generated independently from the Gaussian distribution N(0,I), we show that the difference between the sample covariance matrix and a scaled version of Tyler's M-estimator tends to zero in spectral norm, and that the empirical spectral densities of both estimators converge to the Marcenko-Pastur distribution. We also prove that when the data samples are generated from an elliptical distribution, the limiting distribution of Tyler's M-estimator converges to a Marcenko-Pastur-type distribution. This is joint work with Xiuyuan Cheng and Amit Singer.
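Tyler's estimator is computed by a simple fixed-point iteration; the sketch below implements it (our illustration; the data set and parameters are arbitrary):

```python
import numpy as np

def tyler_m_estimator(X, iters=200, tol=1e-10):
    """Fixed-point iteration for Tyler's M-estimator of scatter.
    X is an (n, p) matrix of centered observations with n > p."""
    n, p = X.shape
    sigma = np.eye(p)
    for _ in range(iters):
        inv = np.linalg.inv(sigma)
        w = p / np.einsum('ij,jk,ik->i', X, inv, X)   # p / (x_i' sigma^{-1} x_i)
        new = (X * w[:, None]).T @ X / n
        new *= p / np.trace(new)                      # fix the (arbitrary) scale
        if np.linalg.norm(new - sigma) < tol:
            return new
        sigma = new
    return sigma

rng = np.random.default_rng(0)
X = rng.standard_normal((2_000, 10)) * np.sqrt(np.arange(1, 11))  # diagonal covariance
print(np.round(np.diag(tyler_m_estimator(X)), 2))    # roughly proportional to 1..10
```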
November 21 at 3pm (Special Colloquium) – Ngoc Mai Tran, University of Texas at Austin
Random permutations and random partitions
I will talk about various problems related to random permutations and random partitions. In particular, I discuss size-biased permutations, which have applications to statistical sampling. Then I will talk about random partitions obtained from projections of polytopes. These are related to random polytopes and zeros of random tropical polynomials.
November 21 at 4.30pm (Special Colloquium) – Joseph Neeman, University of Texas at Austin
Gaussian noise stability
Given two correlated Gaussian vectors, X and Y, the noise stability of a set A is the probability that both X and Y fall in A. In 1985, C. Borell proved that half-spaces maximize the noise stability among all sets of a given Gaussian measure. We will give a new, and simpler, proof of this fact, along with some extensions and applications. Specifically, we will discuss hitting times for the Ornstein-Uhlenbeck process, and a noisy Gaussian analogue of the “double bubble” problem.
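Borell's theorem is easy to see numerically: compare the noise stability of a half-space with that of a ball of the same Gaussian measure. A Monte Carlo sketch of ours in dimension 2, where the ball of measure 1/2 has the closed-form radius √(2 ln 2):

```python
import numpy as np

rng = np.random.default_rng(0)
d, rho, reps = 2, 0.5, 500_000

X = rng.standard_normal((reps, d))
Y = rho * X + np.sqrt(1 - rho**2) * rng.standard_normal((reps, d))  # corr(X, Y) = rho

in_half = lambda v: v[:, 0] <= 0                     # half-space of Gaussian measure 1/2
r2 = 2 * np.log(2)                                   # squared radius: ball of measure 1/2
in_ball = lambda v: (v**2).sum(axis=1) <= r2

print("half-space stability:", np.mean(in_half(X) & in_half(Y)))  # the maximum (Borell)
print("ball stability:      ", np.mean(in_ball(X) & in_ball(Y)))
```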
December 5 – Stephen DeSalvo, UCLA
Limit shapes of restricted integer partitions under non-multiplicative conditions
Limit shapes are an increasingly popular way to understand the large-scale characteristics of a random ensemble. The limit shape of unrestricted integer partitions has been studied by many authors, primarily under the uniform measure and the Plancherel measure. In addition, asymptotic properties of integer partitions subject to restrictions have also been studied, but mostly with respect to independent conditions of the form "parts of size i can occur at most a_i times." While there has been some progress on asymptotic properties of integer partitions under other types of restrictions, the techniques are mostly ad hoc. In this talk, we will present an approach to finding limit shapes of restricted integer partitions which extends the types of restrictions currently available, using a class of asymptotically stable bijections. This is joint work with Igor Pak.
Monday December 8 at 2pm (Special Colloquium) – Johannes Lederer, Cornell University
Tuning Parameters in High-Dimensional Statistics
High-dimensional statistics has become pivotal in many areas of science and engineering. In biology, for example, the advent of high-throughput technologies has generated a tremendous demand for statistical methods that allow for large and complex data sets.
Considerable effort has been made over the last couple of years to develop such methods. However, a major limitation of standard methods, such as Lasso, Ridge Regression, and Graphical Lasso, is their dependence on tuning parameters that are difficult to calibrate in practice. In this talk, I present two novel approaches to overcome this limitation. The first approach is a novel calibration scheme for high-dimensional estimation and prediction with standard estimators.
Our novel scheme is inspired by techniques for bandwidth selection in non-parametric statistics and is the first procedure that simultaneously provides fast computations, optimal finite-sample guarantees, and highly competitive practical performance. The second approach establishes model selection via the minimization of an objective function that avoids tuning parameters altogether. Apart from its value for regression, this approach is particularly useful as a basis for graph estimation. Despite the mathematical nature of my research, I will illustrate in my talk that our procedures can be readily applied to reveal hidden structures in high-dimensional data.
Wednesday December 10 at 2pm (Special Colloquium) – Xiaodong Li, University of Pennsylvania
Low-rank Recovery by Convex and Nonconvex Optimization with Applications in Network Data Analysis and Imaging
Low-rank structures are common in modern data analysis and signal processing, and they usually play essential roles in different estimation problems. Convex and nonconvex optimization methods are shown to be effective and efficient in the recovery of hidden low-rank structures from noisy and undersampled datasets. Two examples in statistics and signal processing are employed to illustrate the features of these two methods, respectively.
The first example is community detection, which concerns how to cluster N nodes of a given network dataset into r distinct groups according to edge densities. By making use of the hidden low-rank structure induced by the communities, a convex optimization method is proven to detect the communities accurately even in the presence of a portion of adversarial outlier nodes. Our theoretical results match the state-of-the-art results in the literature on computationally feasible community detection under the standard stochastic block model.
The second example concerns how to recover a signal from intensity measurements. This model has important applications in imaging, and it is equivalent to the recovery of a rank-one matrix from linear measurements. Initialized by a spectral method, a gradient descent scheme for a nonconvex objective is shown to recover the signal quickly and accurately.
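A toy real-valued version of this two-stage scheme, spectral initialization followed by gradient descent on the squared intensity residuals, fits in a few lines (our sketch, not the speaker's exact method):

```python
import numpy as np

rng = np.random.default_rng(0)
n, m = 64, 640                          # signal length, number of measurements

x = rng.standard_normal(n)              # ground-truth signal
A = rng.standard_normal((m, n))
y = (A @ x) ** 2                        # intensity-only measurements

# Spectral initialization: leading eigenvector of (1/m) sum_i y_i a_i a_i'.
M = (A * y[:, None]).T @ A / m
_, vecs = np.linalg.eigh(M)
z = vecs[:, -1] * np.sqrt(y.mean())     # y.mean() estimates ||x||^2

# Gradient descent on f(z) = (1/4m) sum_i ((a_i' z)^2 - y_i)^2.
step = 0.1 / y.mean()
for _ in range(500):
    Az = A @ z
    z -= step * (A.T @ ((Az**2 - y) * Az)) / m

err = min(np.linalg.norm(z - x), np.linalg.norm(z + x)) / np.linalg.norm(x)
print("relative error (up to global sign):", err)
```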
By studying these two examples carefully, we can get some intuition about the features and advantages of convex and nonconvex methods in low-rank recovery. Our analytical framework may provide insights for analyzing the statistical efficiency of computational methods in other large data problems.
USC Probability and Statistics Seminars – 2013/2014
Fridays, 3:30PM in KAP 414
Organizer: Quentin Berger <qberger@usc.edu>
For seminars/colloquia on other topics, see the Department of Mathematics webpage
For seminars for the current academic year, click here
-
September 13: Toby Johnson, University of Washington
Stein’s method and random regular graphs -
September 20: Tuan Nguyen, USC
Random covering in high dimension by a union of scaled convex sets -
September 27: Elton Hsu, Northwestern / Joint with Math Finance Colloquium
Near-Expiry Asymptotics of the Implied Volatility in Local and Stochastic Volatility Models -
October 4: Steve Kou, Columbia and NUS / Joint with Math Finance Colloquium
First passage times of two-dimensional Brownian motion -
October 11: Manuel Lladser, University of Colorado
Prediction of the discovery probability of an urn sample -
October 18: Leonid Petrov, Northeastern University
Markov dynamics on interlacing arrays -
October 25: Joel Tropp, California Institute of Technology
User-friendly tail bounds for sums of random matrices -
November 1st: Richard Arratia, USC
Bounded size bias coupling: a Gamma function bound, and universal Dickman-function behavior (joint work with Peter Baxendale) -
November 8: James Zhao, USC
Sampling Graphs with Given Degrees -
November 15: Tom Alberts, California Institute of Technology
The Gaussian Free Field, Conformal Field Theory, and Schramm-Loewner Evolution -
Saturday, December 7: Southern California Probability Symposium
Held at USC, organized by UC Santa Barbara http://www.pstat.ucsb.edu/scps13.html -
December 13: Anirban Basak, Stanford University
Ferromagnetic Ising measures on large locally tree-like graphs. -
December 18: Zhenjie Ren, École Polytechnique (WEDNESDAY, 12/18/13, KAP 245, 2:15 – 3:15 PM, joint with Math Finance)
Viscosity solution to elliptic path dependent PDEs
-
Colloquium, December 18: Nizar Touzi, École Polytechnique (WEDNESDAY, 12/18/13, KAP 414, 3:30-4:30 PM)
Martingale optimal transport and martingale inequalities
-
January 24: Special Colloquium, Xiangxiong Zhang, MIT (3.30pm, KAP 414)
Positivity-preserving high order accurate schemes for hyperbolic conservation laws -
January 31: Kiros Berhane, USC (Division of Biostatistics)
Bayesian mixed hidden Markov models for categorical outcomes with differential misclassification (joint work with Yue Zhang) -
February 14th: Hubert Lacoin, Université Paris Dauphine
A Mathematical Perspective on Metastable Wetting (joint work with Augusto Teixeira) -
February 21st: Jinchi Lv, USC (Marshall School of Business)
Impacts of High Dimensionality in Finite Samples -
March 7th: Stephen DeSalvo, UCLA
Completely effective error bounds for Stirling numbers and a Poisson limit theorem for set partitions. -
CANCELLED / March 14th: Carl Mueller, University of Rochester
Do Stochastic PDEs Hit Points in the Critical Dimension? -
March 28th: Gökhan Yildirim, USC
Directed polymers in a random environment with a defect line -
April 11th: Eviatar Procaccia, UCLA
Stationary aggregation processes -
April 18th: David Renfrew, UCLA
Product of random matrices -
April 25th: Peter Baxendale, USC
Small time asymptotics for an additive functional of a fast diffusion process: moderate, large and superlarge deviations. -
May 2nd: Samy Tindel, University of Lorraine / Joint Math Finance seminar
Viscosity solutions to fully nonlinear stochastic PDEs and rough paths -
June 13th: Michael Cranston, UC Irvine