Statistics articles within Nature Communications

Featured

  • Article
    | Open Access

    Successful memorization could be decoded from brain activity. Here the authors decode human memory success from EEG recordings, suggesting memory is linked to context.

    • Yuxuan Li
    • , Jesse K. Pazdera
    •  & Michael J. Kahana
  • Article
    | Open Access

    SARS-CoV-2 variants with mutations in spike have emerged during the pandemic. Magaret et al. show that in Latin America, efficacy of the Ad26.COV2.S vaccine against moderate to severe–critical COVID-19 varied by sequence features, antibody escape scores, and neutralization-impacting features of the SARS-CoV-2 variant.

    • Craig A. Magaret
    • , Li Li
    •  & Peter B. Gilbert
  • Comment
    | Open Access

    Selecting omic biomarkers using both their effect size and their differential status significance (i.e., selecting the “volcano-plot outer spray”) has long been equally biologically relevant and statistically troublesome. However, recent proposals are paving the way to resolving this dilemma.

    • Thomas Burger
  • Article
    | Open Access

    In this study, the authors develop a mathematical modelling framework to estimate the impacts of non-pharmaceutical interventions and vaccination on COVID-19 incidence. The model accounts for changes in SARS-CoV-2 variant and population immunity, and here they use it to investigate epidemic dynamics in French Polynesia.

    • Lloyd A. C. Chapman
    • , Maite Aubry
    •  & Adam J. Kucharski
  • Article
    | Open Access

    Robust genome-wide association study (GWAS) methods that can utilise time-to-event information such as age-of-onset will help increase power in analyses for common health outcomes. Here, the authors propose a computationally efficient time-to-event model for GWAS.

    • Emil M. Pedersen
    • , Esben Agerbo
    •  & Bjarni J. Vilhjálmsson
  • Article
    | Open Access

    The serial interval (time between symptom onset in an infector and infectee) is usually estimated from contact tracing data, but this is not always available. Here, the authors develop a method for estimation of serial intervals using whole genome sequencing data and apply it to data from clusters of SARS-CoV-2 in Victoria, Australia.

    • Jessica E. Stockdale
    • , Kurnia Susvitasari
    •  & Caroline Colijn
  • Article
    | Open Access

    Conservation laws are crucial for analyzing and modeling nonlinear dynamical systems; however, identification of conserved quantities is often quite challenging. The authors propose here a geometric approach to discovering conservation laws directly from trajectory data that does not require an explicit dynamical model of the system or detailed time information.

    • Peter Y. Lu
    • , Rumen Dangovski
    •  & Marin Soljačić
  • Article
    | Open Access

    While experts analyze cytomorphology to diagnose myelodysplastic syndromes, definitive diagnosis requires complementary information such as karyotype and molecular genetics testing. Here, the authors present a computational method that automatically detects, characterizes and helps identify blood cell characteristics associated with this group of diseases.

    • José Guilherme de Almeida
    • , Emma Gudgin
    •  & Moritz Gerstung
  • Article
    | Open Access

    The global risk of record-breaking heatwaves is assessed, with the most at-risk regions identified. It is shown that record-smashing events that currently appear implausible could happen anywhere as a result of climate change.

    • Vikki Thompson
    • , Dann Mitchell
    •  & Julia M. Slingo
  • Article
    | Open Access

    Neutron scattering experiments are important for studying materials properties. Here, the authors present a probabilistic active learning approach for neutron spectroscopy with three-axes spectrometers and demonstrate optimization of beam time use by favoring informative regions of signal.

    • Mario Teixeira Parente
    • , Georg Brandl
    •  & Astrid Schneidewind
  • Article
    | Open Access

    Dimension reduction is an indispensable part of modern data science, and many algorithms have been developed. Here, the authors develop a theoretically justified, simple to use and reliable spectral method to assess and combine multiple dimension reduction visualizations of a given dataset from diverse algorithms.

    • Rong Ma
    • , Eric D. Sun
    •  & James Zou
  • Article
    | Open Access

    Authors have previously reported on the efficacy and safety of the recombinant spike protein nanoparticle vaccine, NVX-CoV2373, in healthy adults. In this work, they assess anti-spike binding IgG, anti-RBD binding IgG and neutralising antibody titer as correlates of risk and protection against COVID-19.

    • Youyi Fong
    • , Yunda Huang
    •  & Peter B. Gilbert
  • Article
    | Open Access

    COVID-19-related public health measures may have indirectly impacted mortality rates by causing or averting deaths. Here, the authors use data from Switzerland until April 2022 and estimate that, after accounting for deaths directly related to COVID-19, mortality was lower than expected, indicating some evidence of an overall positive impact of control measures.

    • Julien Riou
    • , Anthony Hauser
    •  & Garyfallos Konstantinoudis
  • Article
    | Open Access

    Additively manufactured materials contain different types of volumetric defects. Here, the authors utilize the most distinguishing morphological features among different defect types to propose a defect classification methodology.

    • Arun Poudel
    • , Mohammad Salman Yasin
    •  & Nima Shamsaei
  • Article
    | Open Access

    SARS-CoV-2 variants of concern have been associated with reduced vaccine effectiveness, even after a booster dose. In this study, authors aim to estimate vaccine effectiveness against hospitalisation with the Omicron and Delta variants, using different definitions of hospitalisation in secondary care data.

    • Julia Stowe
    • , Nick Andrews
    •  & Jamie Lopez Bernal
  • Article
    | Open Access

    Can AI learn from atmospheric data and improve weather forecasting? The neural network MetNet-2 achieves this by forecasting the fast changing variable of precipitation up to 12 h ahead more accurately and efficiently than traditional models based on hand-coded physics.

    • Lasse Espeholt
    • , Shreya Agrawal
    •  & Nal Kalchbrenner
  • Article
    | Open Access

    Dynamic remodeling of the actin cytoskeleton underlies cell movement, but is challenging to characterize at the molecular level. Here, the authors present a method to extract actin filament velocities in living cells, and compare their results to current models of cytoskeletal dynamics.

    • Cayla M. Miller
    • , Elgin Korkmazhan
    •  & Alexander R. Dunn
  • Article
    | Open Access

    Brain-inspired neural generative models can be designed to learn complex probability distributions from data. Here the authors propose a neural generative computational framework, inspired by the theory of predictive processing in the brain, that facilitates parallel computing for complex tasks.

    • Alexander Ororbia
    •  & Daniel Kifer
  • Article
    | Open Access

    The targeted discovery of molecules with specific structural and chemical properties is an open challenge in computational chemistry. Here, the authors propose a conditional generative neural network for the inverse design of 3D molecular structures.

    • Niklas W. A. Gebauer
    • , Michael Gastegger
    •  & Kristof T. Schütt
  • Article
    | Open Access

    The movements of individuals within and among cities influence critical aspects of our society, such as well-being, the spreading of epidemics, and the quality of the environment. Here, the authors use deep neural networks to discover non-linear relationships between geographical variables and mobility flows.

    • Filippo Simini
    • , Gianni Barlacchi
    •  & Luca Pappalardo
  • Article
    | Open Access

    Household air pollution derived from cooking fuels is a major source of health and environmental problems. Here, the authors provide detailed global, regional and country estimates of cooking fuel usage from 1990 to 2030 and project that 31% of people will still be mainly using polluting fuels in 2030.

    • Oliver Stoner
    • , Jessica Lewis
    •  & Heather Adair-Rohani
  • Article
    | Open Access

    Differential expression analysis of single-cell transcriptomics allows scientists to dissect cell-type-specific responses to biological perturbations. Here, the authors show that many commonly used methods are biased and can produce false discoveries.

    • Jordan W. Squair
    • , Matthieu Gautier
    •  & Grégoire Courtine
  • Article
    | Open Access

    Characterizing an unknown, complex system, like an accelerator, in multi-dimensional space is a challenging task. Here the authors report a Bayesian active learning method - Constrained Proximal Bayesian Exploration - for the characterization of a complex, constrained measurement as a function of multiple free parameters.

    • Ryan Roussel
    • , Juan Pablo Gonzalez-Aguilera
    •  & Auralee Edelen
  • Article
    | Open Access

    Forecasting models have been used extensively to inform decision making during the COVID-19 pandemic. In this preregistered and prospective study, the authors evaluated 14 short-term models for Germany and Poland, finding considerable heterogeneity in predictions and highlighting the benefits of combined forecasts.

    • J. Bracher
    • , D. Wolffram
    •  & Frost Tianjian Xu
  • Article
    | Open Access

    Accurate seasonal forecasts of sea ice are highly valuable, particularly in the context of sea ice loss due to global warming. A new machine learning tool for sea ice forecasting offers a substantial increase in accuracy over current physics-based dynamical model predictions.

    • Tom R. Andersson
    • , J. Scott Hosking
    •  & Emily Shuckburgh
  • Article
    | Open Access

    In many machine learning applications, one uses pre-trained neural networks, having limited access to training and test data. Martin et al. show how to predict trends in the quality of such neural networks without access to this information, relevant for reproducibility, diagnostics, and validation.

    • Charles H. Martin
    • , Tongsu (Serena) Peng
    •  & Michael W. Mahoney
  • Article
    | Open Access

    Networks describe the intricate patterns of interaction occurring within ecological systems, but they are unfortunately difficult to construct from data. Here, the authors show how Bayesian statistical techniques can separate structure from noise in networks gathered in observational studies of plant-pollinator systems.

    • Jean-Gabriel Young
    • , Fernanda S. Valdovinos
    •  & M. E. J. Newman
  • Article
    | Open Access

    Generating new sensible molecular structures is a key problem in computer aided drug discovery. Here the authors propose a graph-based molecular generative model that outperforms previously proposed graph-based generative models of molecules and performs comparably to several SMILES-based models.

    • Omar Mahmood
    • , Elman Mansimov
    •  & Kyunghyun Cho
  • Article
    | Open Access

    Influenza forecasting in the United States is challenging and consequential, with the ability to improve the public health response. Here the authors show the performance of the multiscale flu forecasting model, Dante, that won the CDC’s 2018/19 national, regional and state flu forecasting challenges.

    • Dave Osthus
    •  & Kelly R. Moran
  • Article
    | Open Access

    Gene regulatory networks are a useful means of inferring functional interactions from large-scale genomic data. Here, the authors develop a Bayesian framework integrating GWAS summary statistics with gene regulatory networks to identify genetic enrichments and associations simultaneously.

    • Xiang Zhu
    • , Zhana Duren
    •  & Wing Hung Wong
  • Article
    | Open Access

    In genome-wide association meta-analysis, it is often difficult to find an independent dataset of sufficient size to replicate associations. Here, the authors have developed MAMBA to calculate the probability of replicability based on consistency between datasets within the meta-analysis.

    • Daniel McGuire
    • , Yu Jiang
    •  & Dajiang J. Liu
  • Article
    | Open Access

    The Tafel slope in electrochemical catalysis is usually determined from experimental data and remains error-prone. Here, the authors develop a Bayesian approach for Tafel slope quantification, and apply it to study the prevalence of certain "cardinal" Tafel slopes in the electrochemical CO2 reduction literature.

    • Aditya M. Limaye
    • , Joy S. Zeng
    •  & Karthish Manthiram
  • Article
    | Open Access

    Accurate prediction of solubility represents a challenge for traditional computational approaches due to the complex nature of phenomena involved. Here the authors report a successful approach to solubility prediction in organic solvents and water using a combination of machine learning and computational chemistry.

    • Samuel Boobier
    • , David R. J. Hose
    •  & Bao N. Nguyen
  • Article
    | Open Access

    Distributed health data networks (DHDNs) leverage data from multiple healthcare systems, but often face major analytical challenges in the presence of missing data. This paper develops distributed multiple imputation methods that do not require sharing subject-level data across health systems.

    • Changgee Chang
    • , Yi Deng
    •  & Qi Long
  • Article
    | Open Access

    Theories of human categorization have traditionally been evaluated in the context of simple, low-dimensional stimuli. In this work, the authors use a large dataset of human behavior over 10,000 natural images to re-evaluate these theories, revealing interesting differences from previous results.

    • Ruairidh M. Battleday
    • , Joshua C. Peterson
    •  & Thomas L. Griffiths
  • Article
    | Open Access

    Time-dependent errors are one of the main obstacles to fully-fledged quantum information processing. Here, the authors develop a general methodology to monitor time-dependent errors, which could be used to make other characterisation protocols time-resolved, and demonstrate it on a trapped-ion qubit.

    • Timothy Proctor
    • , Melissa Revelle
    •  & Kevin Young
  • Article
    | Open Access

    The intermittency of solar resources is one of the primary challenges for the large-scale integration of renewable energy. Here Yin et al. used satellite data and climate model outputs to evaluate the geographic patterns of future solar power reliability, highlighting the tradeoff between the maximum potential power and the power reliability.

    • Jun Yin
    • , Annalisa Molini
    •  & Amilcare Porporato
  • Article
    | Open Access

    Principal component analysis is often used in studies of ancient DNA, but does not account for the age of the samples. Here, the authors present a factor analysis (FA) which corrects for this by including the effect of allele frequency drift over time.

    • Olivier François
    •  & Flora Jay
  • Article
    | Open Access

    The pyruvate dehydrogenase complex (PDC) is a multienzyme complex connecting glycolysis to mitochondrial oxidation of pyruvate. Cryo-EM analysis of PDC from Neurospora crassa reveals localization of fungi-specific protein X (PX) and confirms that it functions like the mammalian E3BP, recruiting the E3 component of PDC.

    • B. O. Forsberg
    • , S. Aibara
    •  & E. Lindahl
  • Perspective
    | Open Access

    Photon-induced charge separation phenomena are at the heart of light-harvesting applications but are challenging to describe with quantum mechanical models. Here the authors illustrate the potential of machine-learning approaches towards understanding the fundamental processes governing electronic excitations.

    • Florian Häse
    • , Loïc M. Roch
    •  & Alán Aspuru-Guzik
  • Article
    | Open Access

    Although power laws are observed during nanoindentation and the power-law exponents are estimated to be approximately 1.5-1.6 for face-centered cubic metals, the origin of the exponent remains unclear. In this paper, we show the power-law statistics in pop-in magnitudes and unveil the nature of the exponent.

    • Yuji Sato
    • , Shuhei Shinzato
    •  & Shigenobu Ogata
  • Article
    | Open Access

    In medical diagnosis a doctor aims to explain a patient’s symptoms by determining the diseases causing them, while existing diagnostic algorithms are purely associative. Here, the authors reformulate diagnosis as a counterfactual inference task and derive new counterfactual diagnostic algorithms.

    • Jonathan G. Richens
    • , Ciarán M. Lee
    •  & Saurabh Johri