Wiley: WIREs Computational Statistics: Table of Contents

Covariance Estimation for Wide Data

Eran Raviv — Fri, 29 May 2026 05:30:21 -0700

Covariance matrix estimation is fundamental to multivariate analysis, with applications spanning finance, genomics, climate science, and signal processing. This review synthesizes recent advances in high-dimensional covariance estimation (thresholding, linear and nonlinear shrinkage, graphical models, and random matrix theory) under a unifying framework that clarifies their interconnections.

ABSTRACT

Reliable estimation of the covariance matrix is fundamental to multivariate analysis, second in importance only to the mean. Its accuracy directly impacts applications in economics, finance, chemistry, health science, bioinformatics, climate research, signal processing, and social network analysis. Modern data environments often involve wide data, where the number of variables is comparable to or exceeds the number of observations, making classical estimators unstable or singular. This review synthesizes recent advances in high-dimensional covariance estimation, focusing on thresholding procedures, linear and nonlinear shrinkage methods, graphical model-based approaches, and estimation techniques using random matrix theory. A unifying taxonomy is proposed that organizes these diverse techniques under a single conceptual framework, highlighting their interconnections and guiding the selection of appropriate estimators in wide-data settings.
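To illustrate why wide data breaks the classical estimator, here is a minimal sketch in Python with NumPy. It is not taken from the review: the scaled-identity target, the fixed intensity `alpha`, and the function name `linear_shrinkage` are all assumptions chosen for illustration. It shows one simple form of linear shrinkage restoring positive definiteness when the number of variables exceeds the number of observations.

```python
import numpy as np


def linear_shrinkage(X, alpha):
    """Shrink the sample covariance toward a scaled identity target.

    `alpha` is a hypothetical, user-chosen intensity in [0, 1]; data-driven
    choices of the intensity exist but are not implemented in this sketch.
    """
    S = np.cov(X, rowvar=False)              # p x p sample covariance
    p = S.shape[0]
    target = (np.trace(S) / p) * np.eye(p)   # identity scaled to the mean variance
    return (1.0 - alpha) * S + alpha * target


rng = np.random.default_rng(0)
n, p = 20, 50                                # wide data: more variables than observations
X = rng.standard_normal((n, p))

S = np.cov(X, rowvar=False)
S_shrunk = linear_shrinkage(X, alpha=0.3)

rank_S = np.linalg.matrix_rank(S)            # at most n - 1, so S cannot be inverted
min_eig_shrunk = np.linalg.eigvalsh(S_shrunk).min()  # strictly positive after shrinkage
```

With `p > n` the sample covariance has rank at most `n - 1`, so it is singular, whereas any `alpha > 0` in this sketch lifts every eigenvalue above zero.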

A Comprehensive Review of Functional Graphical Models

Yao Zhao, Kuang‐Yao Lee — Thu, 14 May 2026 01:17:48 -0700

We review both undirected and directed functional graphical models, which extend network learning from vectors to random functions. We unify theory and practice, highlighting estimation pipelines, computational challenges from infinite dimensionality and non-Gaussianity, and applications to brain connectivity.

ABSTRACT

Functional graphical models (FGMs) extend classical graphical models from multivariate vectors to multivariate random functions, enabling inference on conditional dependence and, in directed settings, directional or causal relationships across functional domains. This review provides a unified overview of recent developments in both undirected and directed FGMs. We summarize key theoretical foundations, estimation methods, and computational challenges arising from infinite dimensionality, operator non-invertibility, dimension reduction, and non-Gaussian behavior. Applications in brain connectivity, protein signaling, transportation systems, and longitudinal microbiome studies illustrate the broad potential of FGMs. We also highlight open research directions, including dynamic networks, adaptive truncation, and methodological extensions beyond Gaussian assumptions.

Text Mining in Bibliometrics and Science Mapping: A Methodological Review

Michelangelo Misuraca — Tue, 28 Apr 2026 20:52:57 -0700

Text mining has become a foundational component of contemporary bibliometrics and science mapping, enabling systematic analysis of the semantic structure, thematic evolution, and cognitive organization of scientific fields. Integrating textual evidence with relational indicators enriches knowledge maps and supports more comprehensive, content-sensitive representations of research dynamics.

ABSTRACT

Text mining has become central to bibliometrics, providing quantitative insight into the semantic structure of scientific communication. This review surveys current methodological approaches to text-based science mapping, including geometric embeddings, probabilistic models, network techniques, and neural embedding methods. The discussion examines how these approaches operate across different representations of text and evaluates their interpretability, stability, and statistical assumptions. Key issues include data quality, model validation, reproducibility, and the growing influence of large language models. Persistent challenges—language bias, topic instability, limited full-text access, and model opacity—raise open questions about dynamic, multimodal, and ethically grounded science mapping.

On the Foundational Arguments of Sufficient Dimension Reduction

R. Dennis Cook — Fri, 24 Apr 2026 23:21:57 -0700

Contemporary Sufficient Dimension Reduction, a versatile method for extracting material information from data, can serve as a preprocessor for classical modeling and inference, or as a standalone theory that leads directly to statistical inference.

ABSTRACT

Sufficient dimension reduction (SDR) refers to supervised methods of dimension reduction that apply in the context of regression, interpreted broadly. SDR started in the early 1990s with methodology to linearly reduce the predictor dimension without loss of information about the conditional distribution of the response given the predictors. The field grew quickly and today it is vast. The early ideas and methods have been formalized, extended, specialized and adapted to many problems in statistics. A comprehensive synopsis of everything covered by SDR would be truly substantial. Instead, the focus of this overview is on the ideas and philosophy of SDR. While the field is vast, there are a few foundational ideas that define the area generally and have been adapted to different problems. In this overview, we elucidate these ideas by explaining the historical ambience, exploring the genesis of the ideas and discussing how they are adapted for various problems. There is also literature on ‘dimensionality reduction’ that is beyond the scope of this overview. Examples include uniform manifold approximation and projection, neural PCA, kernel PCA, and locally linear embedding. Dimension reduction of data that are not meaningfully stochastic is also outside the scope of this article. The worlds of dimensionality reduction and sufficient dimension reduction were largely developed independently, even when in retrospect the developments overlap.

Contrastive Dimension Reduction: A Systematic Review

Sam Hawke, Eric Zhang, Jiawen Chen, Didong Li — Sun, 19 Apr 2026 18:41:48 -0700

A contrastive dimension reduction framework for case–control studies that extracts treatment-specific signals from high-dimensional data to enable downstream tasks such as clustering and prediction.

ABSTRACT

Contrastive dimension reduction (CDR) methods aim to extract signal unique to or enriched in a treatment (foreground) group relative to a control (background) group. This setting arises in many scientific domains, such as genomics, imaging, and time series analysis, where traditional dimension reduction techniques such as principal component analysis (PCA) may fail to isolate the signal of interest. In this review, we provide a systematic overview of existing CDR methods. We propose a pipeline for analyzing case–control studies together with a taxonomy of CDR methods based on their assumptions, objectives, and mathematical formulations, unifying disparate approaches under a shared conceptual framework. We highlight key applications and challenges in existing CDR methods and identify open questions and future directions. By providing a clear framework for CDR and its applications, we aim to facilitate broader adoption and motivate further developments in this emerging field.
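The foreground/background idea can be sketched with one simple contrastive variant, often called contrastive PCA, which takes the leading eigenvectors of the foreground covariance minus a multiple of the background covariance. The abstract does not single out this method; the function name `contrastive_directions`, the contrast weight `alpha`, and the simulated data below are assumptions made purely for illustration.

```python
import numpy as np


def contrastive_directions(X_fg, X_bg, alpha, k):
    """Top-k eigenvectors of C_fg - alpha * C_bg (a contrastive-PCA-style sketch).

    `alpha` is a hypothetical contrast weight; alpha = 0 reduces to ordinary
    PCA on the foreground data.
    """
    C_fg = np.cov(X_fg, rowvar=False)
    C_bg = np.cov(X_bg, rowvar=False)
    evals, evecs = np.linalg.eigh(C_fg - alpha * C_bg)
    order = np.argsort(evals)[::-1][:k]      # indices of the largest eigenvalues
    return evecs[:, order]


rng = np.random.default_rng(0)
n, p = 2000, 5
scale = np.array([5.0, 1.0, 1.0, 1.0, 1.0])  # coordinate 0: large shared nuisance variance

X_bg = rng.standard_normal((n, p)) * scale   # control (background) group
X_fg = rng.standard_normal((n, p)) * scale   # treatment (foreground) group
X_fg[:, 1] += 3.0 * rng.standard_normal(n)   # extra signal present only in the foreground

pca_dir = contrastive_directions(X_fg, X_bg, alpha=0.0, k=1)[:, 0]
cdr_dir = contrastive_directions(X_fg, X_bg, alpha=1.0, k=1)[:, 0]

pca_axis = int(np.argmax(np.abs(pca_dir)))   # coordinate dominating plain PCA
cdr_axis = int(np.argmax(np.abs(cdr_dir)))   # coordinate dominating the contrastive direction
```

In this simulated setup, plain PCA on the foreground is dominated by the shared high-variance coordinate, while the contrastive direction aligns with the coordinate that carries the foreground-only signal.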

This article is categorized under: Statistical Learning and Exploratory Methods of the Data Sciences > Manifold Learning; Statistical and Graphical Methods of Data Analysis > Dimension Reduction; Statistical and Graphical Methods of Data Analysis > Analysis of High Dimensional Data

Issue Information

Wed, 25 Mar 2026 18:40:19 -0700

WIREs Computational Statistics, Volume 18, Issue 2, June 2026.

Subgroup Identification via Multiple Change Point Detection: Methods and Applications

Yaguang Li, Jingli Wang, Da Zhao, Jialiang Li — Wed, 25 Mar 2026 00:00:00 -0700

Subgroup identification methods facilitate the discovery of clinically meaningful subpopulations with differing disease progression, improving personalized risk assessment and treatment strategies.

ABSTRACT

Subgroup identification is a significant research area in statistics and machine learning, aiming to partition a heterogeneous population into more homogeneous subgroups to enable precise inference and personalized decision-making. Among the various tools available, change point analysis has emerged as a powerful approach for detecting structural changes in data sequences, and it plays an increasingly important role in subgroup identification. In this paper, we provide a systematic review of recent advances in subgroup identification methods based on efficient multiple change point detection methods: (a) We first review the two-step multiple change point detection method (TSMCD) and its application from linear regression to survival analysis. (b) We then discuss the construction of the threshold variable, including the recent developments in the change plane regression model and the change surface regression model. This review aims to provide researchers with a comprehensive perspective to promote the further application and development of change point analysis in subgroup identification and precision medicine.