One Million Days of Mortality Study

What is this?


The “One Million Days of Mortality” study is an Open Science project organised by to investigate the association between physical activity and mortality, using techniques from compositional data analysis in conjunction with the more familiar Cox’s proportional hazards model. Multiple research teams will analyse their own datasets using harmonised methods (implemented in easy-to-use software provided by ) that we hope will form the basis for the largest study of its kind. Our ambition is to collect over one million person-days of mortality risk exposure.
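To make the person-days target concrete, here is a minimal sketch of how exposure accumulates across participants. It assumes exposure is counted as the number of days between each participant's entry into follow-up and their exit (death or censoring); the function name and the example dates are hypothetical and are not taken from any study dataset.

```python
from datetime import date


def person_days(follow_ups):
    """Sum follow-up time in days over (entry_date, exit_date) pairs."""
    total = 0
    for entry, exit_ in follow_ups:
        if exit_ < entry:
            raise ValueError("exit date precedes entry date")
        # Subtracting two dates gives a timedelta; .days is the whole-day count.
        total += (exit_ - entry).days
    return total


# Hypothetical follow-up records for two participants (illustrative only).
cohort = [
    (date(2006, 1, 1), date(2011, 12, 31)),
    (date(2006, 3, 15), date(2009, 6, 30)),
]
print(person_days(cohort))
```

On this way of counting, pooling follow-up time across several contributing teams is what makes a target of one million person-days reachable.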

Each team is free to publish its own results. At the end of the study we will publish, with our collaborators, a harmonised meta-analysis of the results from all teams, which will give us a fuller understanding of how mortality rates are associated with the allocation of time between different physical behaviours across the day.

The study runs from 1 September 2018 to 1 September 2019. It already includes well-characterised datasets such as “NHANES 2005-06” and “UK Biobank”, and it remains open to new teams interested in applying our software to new datasets.

How does it work?


If you're interested in getting involved, just contact us at and you'll receive a welcome pack with more details.

In essence, we will provide your team with an online tool that implements our method for investigating the association between mortality rates and physical activity. You are free to use this tool as you see fit and to publish any findings based solely on your own dataset (we ask only that you give a brief acknowledgement of the assistance of ). In exchange, you'll provide us with a set of results based on your data, which will feed into our larger harmonised analysis.

All contributors will be included on the list of co-authors for the publication expected to arise from this work.

What do I need?


You just need a suitable dataset that hasn't already been included in the study. You can use objective data or self-reported data (we prefer objective, but it's not always available). We are interested in two breakdowns of the day:

  1. Waking day (SB, LIPA, MVPA)
  2. 24-hour day (Sleep, SB, LIPA, MVPA)


  • SB is sedentary behaviour
  • LIPA is light-intensity physical activity
  • MVPA is moderate-to-vigorous physical activity

Your data will need to provide at least one of these breakdowns. We'll need to ask a few more questions to establish data quality for the harmonised analysis, but we're keen to make our tools available to as many teams as possible.
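As an illustration of what a compositional treatment of these breakdowns can look like, the sketch below checks which breakdown a set of column names supports, rescales a day's minutes to proportions, and computes isometric log-ratio (pivot) coordinates, a transformation commonly used in compositional data analysis. It is not the study's software: the function names, the column labels, and the example minutes are hypothetical, and zero values are simply rejected rather than imputed.

```python
import math

WAKING = ("SB", "LIPA", "MVPA")
FULL_DAY = ("Sleep", "SB", "LIPA", "MVPA")


def available_breakdowns(columns):
    """Report which of the two breakdowns a set of column names can supply."""
    cols = set(columns)
    found = []
    if cols.issuperset(WAKING):
        found.append("waking day")
    if cols.issuperset(FULL_DAY):
        found.append("24-hour day")
    return found


def close(parts):
    """Rescale strictly positive parts so that they sum to 1 (closure)."""
    if any(p <= 0 for p in parts):
        raise ValueError("all parts must be strictly positive")
    total = sum(parts)
    return [p / total for p in parts]


def pivot_ilr(parts):
    """Isometric log-ratio (pivot) coordinates of a composition.

    Coordinate i contrasts part i with the geometric mean of the parts
    that come after it, so D parts yield D - 1 coordinates.
    """
    x = close(parts)
    coords = []
    for i in range(len(x) - 1):
        rest = x[i + 1:]
        k = len(rest)
        log_gmean_rest = sum(math.log(v) for v in rest) / k
        coords.append(math.sqrt(k / (k + 1)) * (math.log(x[i]) - log_gmean_rest))
    return coords


# Hypothetical waking day in minutes, ordered as (SB, LIPA, MVPA).
waking_minutes = [600, 300, 60]
print(available_breakdowns(["SB", "LIPA", "MVPA"]))
print(pivot_ilr(waking_minutes))
```

Because the coordinates depend only on ratios between parts, it makes no difference whether time is recorded in minutes, hours, or proportions of the day. Coordinates of this kind, rather than raw minutes, are what would typically be entered as covariates in a survival model such as Cox's proportional hazards model.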

We'll provide you with the tools you need to analyse your own dataset. It is easiest to use our online tool; however, we can also provide you with the Shiny application code to run on a local R server if you are uncomfortable with uploading your data, or if your dataset is large enough for run times to become an issue. In this case, you will need to download RStudio from .

Current Datasets


  • NHANES 2005-06
  • UK Biobank

(last updated 31/10/2018)