Image of Yifan Chen
Applied Mathematics
Office: MS 7304
Email: yifanchen@math.ucla.edu
Programming:
Julia, Python, MATLAB

I am an Assistant Professor in the Department of Mathematics at UCLA. I was previously a Courant Instructor at the Courant Institute, New York University, from 2023 to 2025. I received my Ph.D. from Caltech, where I worked with Profs. Thomas Y. Hou, Houman Owhadi, and Andrew M. Stuart. I obtained my B.S. in Pure and Applied Mathematics from Tsinghua University.

My research lies at the intersection of applied and computational mathematics, applied probability, and statistics. I am interested in probabilistic inference and PDEs in scientific computing and data science. My recent work focuses on efficient sampling algorithms and generative modeling.

Check out my Research Page and Google Scholar for more details. Feel free to reach out by email if you share similar interests!

Topic sketches


I work on understanding when high-dimensional sampling of probability distributions is feasible and on developing efficient algorithms for applications in scientific computing and inverse problems.

  • Bias in high dimensions: Motivated by molecular dynamics applications, we show that unadjusted Langevin algorithms can exhibit a phenomenon we call "delocalization of bias": under certain assumptions, the bias is nearly dimension-independent when the quantity of interest depends on only a few variables. In contrast, unbiased samplers must pay a cost that grows with dimension as a power law. (A minimal sketch of the unadjusted Langevin iteration appears after this list.)
  • Improved dimensional scaling for unbiased ensemble samplers: My work introduces new derivative-free and derivative-based ensemble samplers with affine invariance that improve on the popular emcee package, especially in high dimensions. The affine-invariant ensemble HMC matches the state-of-the-art power-law dimension scaling of its single-chain counterparts, while achieving affine invariance through complementary ensembles and remaining embarrassingly parallel.
  • Gradient flows and variational inference: Variational, approximate approaches are often promising in high-dimensional applications. Our work builds a design framework for gradient flows in sampling that clarifies the choices of energy functionals, metrics, and numerics. We focus on Fisher-Rao gradient flows as a promising diffeomorphism-invariant approach and establish functional inequalities in the Fisher-Rao geometry. Parametric approximation of gradient flows connects to natural gradients; see Wasserstein natural gradients and also a connection to sequential-in-time neural Galerkin schemes. We develop a Kalman-filter-type Gaussian mixture approximation of Fisher-Rao natural gradients for efficient, multimodal, derivative-free sampling in large-scale Bayesian inverse problems; see a recent further improvement through specialized high-dimensional quadrature.
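To make the first bullet concrete, here is a minimal sketch of the unadjusted Langevin algorithm it refers to; the Gaussian target, dimension, and step size are illustrative choices, not taken from the paper.

```python
import numpy as np

def ula(grad_log_pi, x0, step, n_steps, rng=np.random.default_rng(0)):
    """Unadjusted Langevin algorithm: Euler-Maruyama discretization of
    dX = grad log pi(X) dt + sqrt(2) dW. Skipping the Metropolis correction
    introduces an O(step) bias but needs only one gradient call per step."""
    x = np.asarray(x0, dtype=float).copy()
    samples = np.empty((n_steps, x.size))
    for k in range(n_steps):
        x = x + step * grad_log_pi(x) + np.sqrt(2.0 * step) * rng.standard_normal(x.size)
        samples[k] = x
    return samples

# Toy example: a standard Gaussian target in d = 100 dimensions, grad log pi(x) = -x.
# A quantity of interest that depends on only a few variables (here, the first
# coordinate) is the kind of observable the delocalization-of-bias result concerns.
d = 100
samples = ula(grad_log_pi=lambda x: -x, x0=np.zeros(d), step=0.05, n_steps=20_000)
print("variance of first coordinate:", samples[:, 0].var())  # exact value is 1
```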

Generative modeling has become a powerful tool for handling high-dimensional data. I work on understanding the mathematical foundations of generative modeling and on its use for probabilistic inference, with applications in scientific domains.

  • Numerical designs for generative dynamics: Coming soon!
  • Probabilistic forecasting with generative modeling: We develop probabilistic forecasting methods based on stochastic interpolants, using generative diffusion started from a point source, with applications to fluid and video forecasting. We prove that Föllmer's process, an entropy-minimizing dynamics connecting a Dirac measure to a target distribution, minimizes a statistical estimation error (the KL divergence on path space) between the true and generated samples; one standard formulation of the process is written out after this list.
  • Posterior sampling with diffusion priors: I collaborate with scientists on developing rigorous probabilistic imaging methods (see PnP-MC, PnP-DM, SGDD) by integrating generative diffusion priors with efficient posterior sampling algorithms. See also crystal structure generation.
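For reference, here is one standard way to write Föllmer's process for a target distribution \(\mu\) on \(\mathbb{R}^d\); this is a sketch of the classical formulation, and the exact parameterization used in the papers above may differ.

\[
dX_t = \nabla_x \log\big(P_{1-t}\, g\big)(X_t)\, dt + dW_t, \qquad X_0 = 0, \quad t \in [0, 1],
\]

where \(g = d\mu / d\mathcal{N}(0, I_d)\) is the density of the target relative to the standard Gaussian and \((P_s g)(x) = \mathbb{E}_{Z \sim \mathcal{N}(0, I_d)}\big[ g(x + \sqrt{s}\, Z) \big]\) is the heat semigroup. The process starts at the point source \(X_0 = 0\), satisfies \(X_1 \sim \mu\), and among all such bridges it minimizes the KL divergence on path space to the Wiener measure, which is the entropy-minimizing property mentioned above.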

Recent machine learning approaches for PDEs and inverse problems often treat the PDE as data describing relationships between pointwise evaluations of the unknown function and its derivatives at collocation points. Fitting all of these data with neural network function approximators yields automated, machine-learning-based solvers.

Compared to neural networks, the statistical framework based on Gaussian processes (GPs) and its deterministic counterpart, kernel methods, offer more interpretable and theoretically grounded function approximators, with deep connections to meshless collocation methods and built-in uncertainty quantification. I contribute to a line of work on using GPs to automate the solution of nonlinear PDEs and inverse problems.

  • Solving nonlinear PDEs and inverse problems with GPs: We present a rigorous, efficient, and reliable framework with algorithms, error estimates, and uncertainty quantification; a toy collocation sketch appears below.
  • Hierarchical parameter learning: Our work establishes the large-data limit of hierarchical learning of GP covariance kernel parameters, using either probabilistic (empirical Bayes) or approximation-theoretic criteria; this also leads to new mathematical consistency results in spatial statistics.
  • Fast algorithms with sparse Cholesky: Our work develops new adaptations of sparse Cholesky algorithms, based on screening effects in spatial statistics, that compress the resulting kernel matrices (derived from pointwise evaluations of covariance kernels and their partial derivatives) with state-of-the-art near-linear complexity and a new, simpler analysis.

For fast kernel methods in high dimensions for scientific machine learning, see also randomly pivoted Cholesky.
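To give a flavor of the GP/kernel collocation viewpoint, here is a minimal sketch that solves a linear 1D Poisson problem by symmetric kernel collocation with a Gaussian kernel. It is only a toy: the framework in the papers above treats nonlinear PDEs and inverse problems (with error estimates and uncertainty quantification), and the kernel, length scale, and grid here are illustrative choices.

```python
import numpy as np

# Gaussian kernel k(x, y) = exp(-(x - y)^2 / (2 l^2)) as a function of r = x - y,
# together with the derivatives needed below: d^2/dx^2 k = d^2/dy^2 k = d2g(r)
# and d^4/(dx^2 dy^2) k = d4g(r).
l = 0.1
g   = lambda r: np.exp(-r**2 / (2 * l**2))
d2g = lambda r: (r**2 / l**4 - 1 / l**2) * g(r)
d4g = lambda r: (r**4 / l**8 - 6 * r**2 / l**6 + 3 / l**4) * g(r)

# Problem: -u'' = f on (0, 1), u(0) = u(1) = 0, with manufactured solution
# u(x) = sin(pi x), so that f(x) = pi^2 sin(pi x).
f = lambda x: np.pi**2 * np.sin(np.pi * x)
x_int = np.linspace(0.0, 1.0, 22)[1:-1]               # interior collocation points
x_all = np.concatenate([x_int, [0.0, 1.0]])           # plus the two boundary points
is_int = np.arange(len(x_all)) < len(x_int)

# Gram matrix of the constraints L_i^x L_j^y k(x_i, x_j), where L is -d^2/dx^2
# at interior points and pointwise evaluation at boundary points.
R = x_all[:, None] - x_all[None, :]
A = np.where(is_int[:, None] & is_int[None, :],  d4g(R),
    np.where(is_int[:, None] ^ is_int[None, :], -d2g(R), g(R)))
b = np.where(is_int, f(x_all), 0.0)
nugget = 1e-10 * np.max(np.diag(A))                   # tiny jitter for stability
coef = np.linalg.solve(A + nugget * np.eye(len(x_all)), b)

# Collocation solution u(x) = sum_j coef_j (L_j^y k)(x, x_j), compared to the truth.
x_test = np.linspace(0.0, 1.0, 201)
Phi = np.where(is_int[None, :], -d2g(x_test[:, None] - x_all[None, :]),
               g(x_test[:, None] - x_all[None, :]))
u = Phi @ coef
print("max error:", np.max(np.abs(u - np.sin(np.pi * x_test))))
```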


PDEs in the physical sciences present significant computational challenges, especially those involving heterogeneous media and high-frequency oscillations, where resolving the physics accurately requires substantial computational resources.

Multiscale methodology has long been advocated to address this problem by identifying low-complexity, homogenized models at coarse scales.

  • Exponentially convergent multiscale finite element methods: We develop ExpMsFEM, which extends previous work and achieves exponential convergence of the accuracy (see this paper for 2D Helmholtz equations and this paper for rough elliptic equations). The method accurately approximates highly oscillatory solutions using a small number of specialized basis functions that capture coarse-scale behavior while efficiently incorporating fine-scale features through localized simulations.
  • Multiscale upscaling with subsampled data: We propose a novel multiscale method that uses subsampled coarse variables for upscaling, together with corresponding tight subsampled Poincaré inequalities to analyze its accuracy.
  • Multiscale ideas for numerical linear algebra: Our new adaptation of the sparse Cholesky algorithm for kernel matrices arising in PDE problems with derivative-free and derivative-based measurements exploits a coarse-to-fine ordering of the measurements, in which derivatives are treated as finer than derivative-free measurements, to achieve sparsity; a minimal sketch of such an ordering appears below.
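To illustrate the coarse-to-fine idea, here is a minimal sketch of a greedy maximin ordering of scattered points and the induced sparsity pattern used by screening-based sparse Cholesky factorizations. The point cloud, the radius parameter rho, and the quadratic-time implementation are illustrative; the actual algorithms achieve near-linear complexity and also handle derivative measurements, as described above.

```python
import numpy as np

def maximin_ordering(pts):
    """Greedy coarse-to-fine ordering: repeatedly pick the point farthest from
    everything chosen so far; scales[k] records that distance (a length scale)."""
    n = len(pts)
    order, scales = [0], [np.inf]                 # an arbitrary first (coarsest) point
    dist = np.linalg.norm(pts - pts[0], axis=1)
    for _ in range(n - 1):
        i = int(np.argmax(dist))
        order.append(i)
        scales.append(dist[i])
        dist = np.minimum(dist, np.linalg.norm(pts - pts[i], axis=1))
    return np.array(order), np.array(scales)

def sparsity_pattern(pts, order, scales, rho=3.0):
    """Keep entry (i, j) when j comes earlier (is coarser) than i in the ordering
    and x_j lies within rho times the length scale attached to x_i."""
    pattern = set()
    for a, i in enumerate(order):
        for b in range(a + 1):                    # only coarser-or-equal points j
            j = order[b]
            if np.linalg.norm(pts[i] - pts[j]) <= rho * scales[a]:
                pattern.add((i, j))
    return pattern

rng = np.random.default_rng(0)
pts = rng.random((400, 2))                        # scattered points in the unit square
order, scales = maximin_ordering(pts)
pattern = sparsity_pattern(pts, order, scales)
print(f"kept {len(pattern)} of {400 * 401 // 2} lower-triangular entries")
```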