# Measuring the impact of apnea and obesity on circadian activity patterns using functional linear modeling of actigraphy data

- Jia Wang^{1},
- Hong Xian^{1, 2},
- Amy Licis^{3},
- Elena Deych^{1},
- Jimin Ding^{4},
- Jennifer McLeland^{3},
- Cristina Toedebusch^{3},
- Tao Li^{1},
- Stephen Duntley^{3} and
- William Shannon^{1} (corresponding author)

**9**:11

**DOI:** 10.1186/1740-3391-9-11

© Wang et al; licensee BioMed Central Ltd. 2011

**Received:** 5 August 2011

**Accepted:** 13 October 2011

**Published:** 13 October 2011

## Abstract

### Background

Actigraphy provides a way to objectively measure activity in human subjects. This paper describes a novel family of statistical methods that can be used to analyze this data in a more comprehensive way.

### Methods

A statistical method for testing differences in activity patterns measured by actigraphy across subgroups using functional data analysis is described. For illustration this method is used to statistically assess the impact of apnea-hypopnea index (apnea) and body mass index (BMI) on circadian activity patterns measured using actigraphy in 395 participants from 18 to 80 years old, referred to the Washington University Sleep Medicine Center for general sleep medicine care. Mathematical descriptions of the methods and results from their application to real data are presented.

### Results

Activity patterns were recorded by an Actical device (Philips Respironics Inc.) every minute for at least seven days. Functional linear modeling was used to detect the association between circadian activity patterns and apnea and BMI. Results indicate that participants in the high apnea group have statistically significantly lower activity during the day, and that BMI in our study population does not significantly impact circadian patterns.

### Conclusions

Compared with analysis using summary measures (e.g., average activity over 24 hours, total sleep time), Functional Data Analysis (FDA) is a novel statistical framework that more efficiently analyzes information from actigraphy data. FDA has the potential to reposition the focus of actigraphy data from general sleep assessment to rigorous analyses of circadian activity rhythms.

### Keywords

Apnea, BMI, circadian activity patterns, Functional Data Analysis

## 1. Introduction

Activity measured by wrist actigraphy has been shown to be a valid marker of entrained Polysomnography (PSG) sleep phase and is strongly correlated with entrained endogenous circadian phase [1]. Actigraphy data is recorded densely, such as every minute or every 15 seconds, for each patient over multiple days. This data is generally analyzed by reducing the time series activity values to summary statistics such as sleep/wake ratios,[2, 3] total sleep time,[2, 4] sleep efficiency,[5, 6] wake after sleep onset, [2, 3, 6] ratio of nighttime activity to daytime activity or total activity,[7, 8] standard deviation of sleep onset time,[9] and intra-daily variability [10]. More complex modeling of actigraphy includes spectral analysis,[7] cosinor analysis [7] and waveform eduction calculated as an "average waveform" for some period [11].

In this paper we propose a novel statistical framework, Functional Linear Modeling (FLM), a subset of Functional Data Analysis (FDA), for extracting and analyzing circadian activity information from actigraphy data through direct analysis of the raw activity values [12]. FLM extends standard linear regression to the analysis of functions, which in this case represent circadian activity patterns. FLM is performed by 1) converting a subject's raw actigraphy data to a functional form (i.e., a continuous curve over time), and 2) analyzing sets of functions to see if they differ statistically across groups. Our FLM-based analysis shows when during the 24-hour period the groups differ and by how much, which provides a valuable reference for clinical analysis and treatment and distinguishes our methods from existing circadian analysis approaches (see [13] for a review). Moreover, we adopted a non-parametric permutation F test to detect differences between groups, which makes the results robust to uncertainty about the distribution of the raw data. Using FLM, we show that the apnea-hypopnea index (apnea) has a statistically significant impact on circadian activity patterns, while body mass index (BMI) in this dataset has little impact.

## 2. Methods

### 2.1 Participants and Measures

Participants were recruited prospectively from the clinic at the Washington University in St. Louis Sleep Medicine Center. The sleep center is a multidisciplinary clinic at a tertiary medical facility. Clinic patients with a suspected diagnosis of obstructive sleep apnea (OSA), insomnia, or restless legs syndrome (RLS) were invited to participate. Pregnant women, individuals under the age of 18, and patients who reported working an evening or overnight shift were excluded from participation because their circadian clocks are known to differ biologically. Clinical covariates such as BMI, co-morbidities, concomitant medications, and presenting sleep complaints were collected. Participants underwent an overnight PSG when clinically indicated. These data were collected in accordance with the standards of the American Academy of Sleep Medicine (AASM) and were reviewed by a board certified sleep physician. PSG data were scored according to the AASM Manual for the Scoring of Sleep and Associated Events. This ongoing study has been approved by the Washington University School of Medicine Institutional Review Board.

Activity was measured using Actical devices (Philips Respironics Inc.), which were positioned on the non-dominant wrist of subjects at the initial sleep center visit and set to measure activity every minute for 7 days. Three hundred and ninety five patients have been recruited, of whom 305 have apnea and/or BMI measured. This subgroup comes from a larger NIH-funded study currently recruiting a cross section of 750 patients referred to the Washington University Sleep Medicine Center for the purpose of developing and validating functional data analysis methods for actigraphy data (HL092347).

### 2.2. Functional Data Analysis (FDA)

FDA is an emerging field in statistics that extends classical statistical methods for analyzing sets of numbers (scalars for univariate analyses, and vectors for multivariate analyses) to analyzing sets of functions [13][15]. FDA is a subset of the larger field called 'object data analysis' or 'object oriented data analysis' that uses statistical methods to analyze data that are in non-numeric form such as images, graphs (e.g., trees), or functions [14, 15]. The goal of object oriented data analysis is to analyze objects in their natural form (e.g., functions, graphs) to extract more information than generally can be extracted when the objects are converted into simpler summary measures (e.g., average activity level, total sleep time) where standard statistical methods can be applied.

#### 2.2.1 Functional smoothing

Let $y_{kj}$ be the discrete activity count for patient $k$ at time point $t_{kj}$. Then the model

$$y_{kj} = \mathrm{Activity}_k(t_{kj}) + \varepsilon_{kj}$$

represents activity, where $k = 1, 2, \ldots, N$, $N$ is the total number of patients, $j = 1, 2, \ldots, T_k$, $T_k$ is the total number of time points for patient $k$, and $\varepsilon_{kj}$ is the error term. In our dataset, observation times are minutes from midnight to midnight in 24 hours, so all subjects have the same number of measurements $T_k = 1440$.

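The function $\mathrm{Activity}_k(t_j)$ is represented as a linear combination of basis functions,

$$\mathrm{Activity}_k(t_j) = \sum_{i=1}^{n} a_{ik}\,\phi_i(t_j),$$

where $a_{ik}$ are scalar coefficients for patient $k$ and $\phi_i$ are basis functions. Possible basis functions include polynomials ($f(t) = a_1 t + a_2 t^2 + \ldots + a_n t^n$), the Fourier basis ($f(t) = a_0 + a_1 \sin(\omega t) + a_2 \cos(\omega t) + a_3 \sin(2\omega t) + a_4 \cos(2\omega t) + \ldots$), splines, and wavelets.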

We will use this functional representation for all analyses in this paper.

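The coefficients are estimated by least squares, that is, by minimizing the sum of squared errors between the observed counts and the basis expansion with coefficients $a_{ik}$ and basis functions $\phi_i$,

$$SMSSE(y_k \mid a_k) = \sum_{j=1}^{1440} \Big[\, y_{jk} - \sum_{i=1}^{9} a_{ik}\,\phi_i(t_j) \Big]^2,$$

where $y_k = (y_{1k}, y_{2k}, \ldots, y_{1440k})'$ and $a_k = (a_{1k}, a_{2k}, \ldots, a_{9k})'$. In matrix notation,

$$SMSSE(y_k \mid a_k) = (y_k - \Phi a_k)'(y_k - \Phi a_k),$$

where $\Phi$ is a 1440 × 9 matrix with columns for basis functions and rows for the basis values at each minute.

Taking the derivative of $SMSSE(y_k \mid a_k)$ with respect to $a_k$ gives $2\Phi'\Phi a_k - 2\Phi' y_k$, and setting this equal to 0 and solving for $a_k$ provides the least squares estimate,

$$\hat{a}_k = (\Phi'\Phi)^{-1}\Phi' y_k.$$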

The raw data do not need to be normalized since all analyses are done on the functional form of the data.

To avoid introducing variation between weekday and weekend activity patterns, only data from midnight Monday to midnight Friday were used in this paper, although this simplification is not required for analysis. The five weekdays of actigraphy data were averaged into a single 24 hour profile, and a smooth Fourier expansion function was fitted using a 24 hour periodicity and 9 basis functions. This produced a single 24 hour circadian activity pattern for each subject that can be used to estimate the patient's activity level at any time point throughout the day. We are developing and preparing to publish functional linear mixed models that analyze every day's activity data to incorporate day effects, weekday/weekend effects, and pre/post treatment effects, which will provide more insight into circadian rhythm patterns and within-subject variability.
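The smoothing step can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the code used in the study: the helper names `fourier_basis` and `smooth_profile` are hypothetical, the data are synthetic, and the basis is taken to be a constant plus four sine/cosine pairs with a 1440-minute period (9 functions in total).

```python
import numpy as np

def fourier_basis(minutes, n_basis=9, period=1440.0):
    """Return the len(minutes) x n_basis Fourier design matrix Phi.

    Assumes n_basis is odd: one constant plus (n_basis - 1) / 2 sine/cosine pairs.
    """
    t = np.asarray(minutes, dtype=float)
    omega = 2.0 * np.pi / period
    cols = [np.ones_like(t)]
    for h in range(1, (n_basis - 1) // 2 + 1):
        cols.append(np.sin(h * omega * t))
        cols.append(np.cos(h * omega * t))
    return np.column_stack(cols)

def smooth_profile(y, n_basis=9):
    """Least-squares coefficients a_hat = (Phi'Phi)^{-1} Phi'y for one 24 h profile."""
    minutes = np.arange(len(y))
    phi = fourier_basis(minutes, n_basis, period=float(len(y)))
    a_hat, *_ = np.linalg.lstsq(phi, y, rcond=None)
    return a_hat, phi @ a_hat

# Synthetic stand-in for five weekdays of minute-level counts (not study data).
rng = np.random.default_rng(0)
minutes = np.arange(1440)
true_curve = 200 + 150 * np.sin(2 * np.pi * (minutes - 480) / 1440)
days = true_curve + rng.normal(0, 40, size=(5, 1440))

mean_profile = days.mean(axis=0)          # average the five days into one profile
a_hat, fitted = smooth_profile(mean_profile)
print(a_hat.round(2))
```

`np.linalg.lstsq` solves the same normal equations as the closed-form estimate but avoids forming the matrix inverse explicitly, which is the usual numerically safer choice.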

#### 2.2.2 Functional Linear Models

Reducing actigraphy data to a summary statistic can mask differences across groups. For example, if one group of patients has high activity in the morning and low activity in the afternoon, and another group has a reversed pattern with the same magnitude of activity, low activity in the morning and high activity in the afternoon, their average activity may be similar, and a significant difference in circadian activity patterns would be missed. FLM avoids masking by extending the linear regression model to the analysis of smooth functions (i.e. circadian activity patterns), and differences such as described in this example become apparent.
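The masking effect is easy to reproduce numerically. The sketch below uses two hypothetical group mean curves (sinusoids chosen purely for illustration, not study data) with the same amplitude and opposite timing, and compares their 24 hour averages with their largest minute-by-minute difference.

```python
import numpy as np

hours = np.linspace(0.0, 24.0, 1440, endpoint=False)

# Hypothetical group mean curves: same amplitude, opposite timing.
morning_active = 200 + 100 * np.sin(2 * np.pi * hours / 24)
afternoon_active = 200 - 100 * np.sin(2 * np.pi * hours / 24)

# Summary statistic: the two 24 h averages are practically identical.
daily_mean_gap = abs(morning_active.mean() - afternoon_active.mean())

# Functional view: the curves differ substantially at specific times of day.
largest_pointwise_gap = np.max(np.abs(morning_active - afternoon_active))

print(f"difference in 24 h averages: {daily_mean_gap:.3f}")
print(f"largest difference at any minute: {largest_pointwise_gap:.1f}")
```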

In a functional linear model, the response, the coefficients ($\beta_0$, $\beta_1$), and the error term are functions. To illustrate the use of FLMs for analyzing actigraphy data, four subjects from our database with the highest apnea scores and four subjects with the lowest apnea scores were selected. Apnea here refers to the apnea-hypopnea index, which is used routinely in sleep medicine and measures the severity of sleep apnea, with high values indicating more severe disease. In Figure 2, the circadian activity patterns fitted by Fourier expansion for each of the 8 subjects are shown in separate plots with time recorded on the X axis, and activity level on the Y axis. The top 4 plots show the high apnea subjects (severe sleep apnea) and the bottom 4 plots show the low apnea subjects (mild or no sleep apnea). Visually there is a large difference between the circadian patterns in the high and low apnea subjects.

Using this subset of subjects, functional smoothing and linear modeling are illustrated in this section. In the following section the methods are applied to the full dataset.

First consider a standard linear regression model in which the activity of each subject is summarized by a single number,

$$\mathrm{Activity}_k = \beta_0 + \beta_1\,\mathrm{apnea}_k + \varepsilon_k,$$

where $k = 1, 2, \ldots, 8$ are the subjects in Figure 2, apnea is the group membership indicator with apnea = 1 for low apnea subjects and apnea = -1 for high apnea subjects, and $\varepsilon_k$ is the error term. The resulting model fit to this data is $\mathrm{Activity}_k = 247.9 + 169.9 \times \mathrm{apnea}$, P < 0.001, and $R^2 = 0.97$. The estimated mean activity in the 4 low apnea subjects is 247.9 + 169.9 = 417.8, and in the 4 high apnea subjects is 247.9 - 169.9 = 78.0. This statistical analysis confirms the clinical belief that apnea impacts activity, and confirms what is seen in Figure 2. However, it does not tell us when during the day activity levels are different.

The functional linear model extends this regression to

$$\mathrm{Activity}_k(t) = \beta_0(t) + \beta_1(t)\,\mathrm{apnea}_k + \varepsilon_k(t), \qquad (8)$$

where the $(t)$ notation indicates functions over the circadian period: $\mathrm{Activity}_k(t)$ is the activity function for subject $k$ (fitted by the Fourier expansion to the actigraphy data), $\beta_0(t)$ is the mean circadian activity pattern over all subjects, $\beta_1(t)$ is the functional coefficient indicating how the mean circadian activity pattern changes for low apnea subjects (apnea = 1, $\beta_0(t) + \beta_1(t)$) or for high apnea subjects (apnea = -1, $\beta_0(t) - \beta_1(t)$), and $\varepsilon_k(t)$ is the functional error term. In other words, the low apnea group is predicted to have a mean circadian activity pattern found by adding the two functions, $\beta_0(t) + \beta_1(t)$, and the high apnea group is predicted to have a mean circadian activity pattern found by subtracting the two functions, $\beta_0(t) - \beta_1(t)$. In Figure 3A, $\beta_0(t)$ is the thick black line representing the overall mean, $\beta_0(t) + \beta_1(t)$ is the thick red line for the mean of the low apnea group, and $\beta_0(t) - \beta_1(t)$ is the thick blue line for the mean of the high apnea group.
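A minimal sketch of this group comparison, assuming each subject's smoothed curve has already been evaluated on a common minute grid, is to solve the same least-squares problem separately at every time point. The function name `fit_flm`, the amplitudes, and the daytime "bump" shape are hypothetical and chosen only for illustration.

```python
import numpy as np

def fit_flm(curves, group):
    """Pointwise least squares for Activity_k(t) = beta0(t) + beta1(t) * group_k.

    curves: N x T array of curve values; group: length-N vector of +1 / -1 codes.
    """
    z = np.column_stack([np.ones(len(group)), group])   # design matrix Z (N x 2)
    beta, *_ = np.linalg.lstsq(z, curves, rcond=None)    # 2 x T: beta0(t), beta1(t)
    return z, beta

t = np.arange(1440)
shape = np.clip(np.sin(2 * np.pi * (t - 360) / 1440), 0, None)   # daytime bump

group = np.array([1, 1, 1, 1, -1, -1, -1, -1])   # +1 low apnea, -1 high apnea
amplitudes = np.array([300.0, 320.0, 280.0, 310.0, 100.0, 90.0, 110.0, 120.0])
curves = 20 + amplitudes[:, None] * shape[None, :]  # synthetic, not study data

z, beta = fit_flm(curves, group)
overall_mean = beta[0]            # beta0(t)
low_mean = beta[0] + beta[1]      # beta0(t) + beta1(t)
high_mean = beta[0] - beta[1]     # beta0(t) - beta1(t)
```

With equal group sizes and the ±1 coding, $\beta_0(t)$ is the average of all curves and $\beta_0(t) \pm \beta_1(t)$ reproduce the two group mean curves, which matches the interpretation given in the text.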

The model uses a design matrix $Z$ with rows indicating subjects and columns indicating the mean function (column 1) and the effect on activity due to apnea level $g$ (column 2). In standard matrix notation each row is a vector of 1's and -1's indicating whether the subject belongs to the high apnea group (1, -1) or the low apnea group (1, 1). The two functional linear coefficients are represented in matrix notation as a 'functional vector' $\beta(t) = (\beta_0(t), \beta_1(t))'$, and the error functions as $\varepsilon(t) = (\varepsilon_1(t), \varepsilon_2(t), \ldots, \varepsilon_N(t))'$. Equation 8 in matrix notation becomes

$$\mathrm{Activity}_k(t) = Z_k\,\beta(t) + \varepsilon_k(t),$$

where $Z_k$ is the $k$th row of the design matrix $Z$.

After estimating $\beta(t)$ in functional linear regression, we also want to measure the accuracy of the estimate. We calculate the point-wise 95% confidence limits for these effects using residuals from the model. This formulation is the same as the standard linear model except that instead of numeric coefficients we are now estimating functional coefficients defined over the 24 hour circadian period. A statistical test of the null hypothesis that the circadian activity patterns are the same in both groups is given by the function [12]:

$$F(t) = \frac{\mathrm{Var}\big[Z\hat{\beta}(t)\big]}{\frac{1}{N}\sum_{k=1}^{N}\big[\mathrm{Activity}_k(t) - Z_k\hat{\beta}(t)\big]^2},$$

where $Z$ is the design matrix and $\hat{\beta}(t)$ is a vector of the estimated regression coefficient functions.

Because of the nature of functional statistics, it is difficult to derive a theoretical null distribution for any given test statistic. Instead, we applied a non-parametric permutation test. If there is no relationship between activity pattern and apnea level, it should make no difference if we randomly rearrange the apnea group assignments. The advantage of this approach is that we no longer need to rely on distributional assumptions; the disadvantage is that we cannot test for the significance of an individual covariate among many. The *p* value of the test is calculated as the proportion of permutation *F* values that are larger than the *F* statistic for the observed pairing. We counted this proportion in two ways: a global test and a point-wise test. The global test provides a single number, the proportion of permutations whose maximum *F* value over time exceeds the observed maximum. The point-wise test provides a curve, the proportion of permutation *F* values at each time point that exceed the observed *F* value at that time point.

Figure 3B displays the statistical significance test for differences in circadian activity patterns continuously over time. The blue dashed and dotted lines show the critical values of the global and point-wise tests, respectively, at significance level α = 0.05, and the red solid curve represents the observed statistic F(t) at each time point. When F(t) is above the blue dashed or dotted line, it is concluded that the two apnea groups have significantly different mean circadian activity patterns at those time points. The global critical value (blue dashed line) is preferred since it represents a more conservative test. For these data, the two apnea groups are statistically different in activity from approximately 7 AM to 9 PM.
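The permutation procedure can be sketched as follows. This is an illustrative implementation under assumptions, not the authors' code: the function names `f_curve` and `permutation_f_test` are hypothetical, the variance in the numerator uses NumPy's default population variance, curves are evaluated on a coarse 15-minute grid to keep the example small, and the data are synthetic.

```python
import numpy as np

def f_curve(curves, z):
    """F(t): variance of fitted values over mean squared residual, per time point."""
    beta, *_ = np.linalg.lstsq(z, curves, rcond=None)
    fitted = z @ beta
    resid = curves - fitted
    return fitted.var(axis=0) / np.mean(resid ** 2, axis=0)

def permutation_f_test(curves, group, n_perm=200, alpha=0.05, seed=0):
    """Global and point-wise permutation tests for a two-group functional model."""
    rng = np.random.default_rng(seed)
    ones = np.ones(len(group))
    f_obs = f_curve(curves, np.column_stack([ones, group]))
    f_null = np.empty((n_perm, curves.shape[1]))
    for b in range(n_perm):
        shuffled = rng.permutation(group)          # rearrange group labels
        f_null[b] = f_curve(curves, np.column_stack([ones, shuffled]))
    max_null = f_null.max(axis=1)                  # max of F(t) per permutation
    return {
        "f_obs": f_obs,
        "p_global": np.mean(max_null >= f_obs.max()),
        "p_pointwise": np.mean(f_null >= f_obs, axis=0),
        "crit_global": np.quantile(max_null, 1 - alpha),
        "crit_pointwise": np.quantile(f_null, 1 - alpha, axis=0),
    }

# Synthetic example: 10 subjects per group, group difference only in daytime.
rng = np.random.default_rng(4)
n_times = 96                                       # one value every 15 minutes
group = np.repeat([1, -1], 10)
daytime = (np.arange(n_times) >= 32) & (np.arange(n_times) < 80)
curves = 200 + 50 * group[:, None] * daytime[None, :] + rng.normal(0, 20, (20, n_times))

result = permutation_f_test(curves, group)
significant = result["f_obs"] > result["crit_global"]   # time points flagged globally
```

Because each permutation's maximum is at least as large as its value at any single time point, the global critical value can never fall below the point-wise one, which is why the global test is the more conservative of the two.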

The statistical and computational details for fitting FLM models are well described elsewhere and are outside the scope of this paper. Readers interested in these details are referred to Ramsay and Silverman [12].

This illustration was meant as an introduction to the methodology only, and not as an indicator of a clinical conclusion. In the following section, these methods are applied to the entire 395 subject dataset to show how apnea and BMI impact circadian activity patterns.

## 3. Results

### 3.1 Demographic Information

Table 1: Demographic information and sample characteristics

| Variable | N (%) or Mean ± std (N Total 395) |
|---|---|
| | 196 (49.87%) |
| African-American | 134 (35.08%) |
| Caucasian | 237 (62.04%) |
| Snoring | 279 (70.63%) |
| Gasping | 93 (23.54%) |
| Morning headache | 67 (16.96%) |
| RLS symptoms | 26 (6.58%) |
| PLMS | 3 (0.76%) |
| Witnessed apneas | 146 (36.96%) |
| Insomnia | 42 (10.63%) |
| Excessive day sleepiness | 91 (23.04%) |
| Nonrestorative sleep | 9 (2.28%) |
| Class 4 | 145 (41.55%) |
| Class 3 | 136 (38.97%) |
| Class 2 | 53 (15.19%) |
| Class 1 | 15 (4.30%) |
| OSA | 292 (73.92%) |
| RLS | 5 (1.27%) |
| Insomnia | 8 (2.03%) |
| Hypersomnia | 20 (5.06%) |
| | 241 (60.86%) |
| | 34.66 ± 8.88 (Median = 34) |
| | 47.9 ± 14.8 |
| | 22.11 ± 28.11 (Median = 12.95) |

### 3.2 Smoothed Functional Actigraphy Data

### 3.3. Functional Linear Model (FLM) Results

We apply FLM to measure the impact of apnea and BMI on subject circadian activity patterns and test the null hypothesis that circadian activity patterns are the same regardless of apnea and BMI values. The alternative hypothesis is that apnea, BMI, and/or their interaction affect activity behavior in a statistically significant way. In addition to the tests of hypotheses, FLM provides a graphical view of the subgroup circadian activity patterns that can aid interpretation of behavioral differences.

Of the 395 subjects, 235 had data on apnea and actigraphy, 277 had BMI and actigraphy, and 232 had apnea, BMI, and actigraphy. The following analyses are based on these subsets.

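Table 2: Three Functional Linear Models

| Model | Effects | Functional linear model |
|---|---|---|
| Model 1 | apnea Main Effect Only | $\mathrm{Activity}_k(t) = \beta_0(t) + \beta_1(t)\,\mathrm{apnea}_k + \varepsilon_k(t)$ |
| Model 2 | BMI Main Effect Only | $\mathrm{Activity}_k(t) = \beta_0(t) + \beta_1(t)\,\mathrm{BMI}_k + \varepsilon_k(t)$ |
| Model 3 | apnea+BMI+interaction | $\mathrm{Activity}_k(t) = \beta_0(t) + \beta_1(t)\,\mathrm{apnea}_k + \beta_2(t)\,\mathrm{BMI}_k + \beta_3(t)\,\mathrm{apnea}_k \times \mathrm{BMI}_k + \varepsilon_k(t)$ |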

#### 3.3.1 Apnea Main Effect Models

The impact of apnea as a main effect on circadian activity patterns was tested with Model 1, Table 2. The null hypothesis is that the circadian actigraphy patterns are the same in the two apnea groups. Of the 235 subjects in this analysis, 118 have apnea less than the median apnea = 10.8, and 117 patients have apnea larger than or equal to 10.8.

#### 3.3.2. BMI main effect

We emphasize that the population of participants in this study had a higher overall BMI compared to the general population which may explain why the expected difference in circadian activity patterns across these groups was not observed.

#### 3.3.3 Apnea and BMI effect, with interaction

Table 3: Sample size for apnea, BMI model

| | apnea Low (< 10.75) | apnea High (≥ 10.75) | Total |
|---|---|---|---|
| | 61 | 94 | 155 |
| | 55 | 22 | 77 |
| Total | 116 | 116 | 232 |

It is an established statistical practice in a linear regression model to test the main effects of two covariates and the effect of the interaction of the two covariates. We extended this method to the functional linear model. The comparisons of all 4 groups in this section are actually the evaluation of the combination of the main and interaction effects which should be consistent with a 2-way ANOVA.
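The link to a 2-way ANOVA can be checked with a small sketch. Assuming both covariates are coded ±1 (the helper `interaction_design`, the cell counts, and the data below are hypothetical), the design matrix has four columns, and the fitted combination of main and interaction effects reproduces each of the four group mean curves even when the cells are unbalanced.

```python
import numpy as np

def interaction_design(apnea, bmi):
    """Columns: mean, apnea main effect, BMI main effect, apnea x BMI interaction."""
    apnea = np.asarray(apnea, dtype=float)
    bmi = np.asarray(bmi, dtype=float)
    return np.column_stack([np.ones_like(apnea), apnea, bmi, apnea * bmi])

# Hypothetical, deliberately unbalanced cell counts: (apnea code, BMI code, n).
cells = [(1, 1, 6), (-1, 1, 9), (1, -1, 5), (-1, -1, 2)]
apnea = np.concatenate([np.full(n, a) for a, b, n in cells])
bmi = np.concatenate([np.full(n, b) for a, b, n in cells])

# Synthetic curve values on a 15-minute grid (not study data).
rng = np.random.default_rng(3)
n_times = 96
curves = (200 + 40 * apnea[:, None] + 10 * bmi[:, None]
          + rng.normal(0, 15, (len(apnea), n_times)))

z = interaction_design(apnea, bmi)
beta, *_ = np.linalg.lstsq(z, curves, rcond=None)    # 4 x n_times coefficient curves

def predicted_mean(a, b):
    """Predicted mean curve for the group with codes (a, b)."""
    return np.array([1.0, a, b, a * b]) @ beta

for a, b, n in cells:
    observed = curves[(apnea == a) & (bmi == b)].mean(axis=0)
    print(a, b, n, np.allclose(predicted_mean(a, b), observed))
```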

#### 3.3.4 BMI as a Continuous Variable

## 4. Discussion

Traditionally, actigraphy data is transformed into summary numbers, such as total sleep time, sleep efficiency, wake after sleep onset, and other measurements. These transformations allow data analysts to test hypotheses using simple classical statistical methods. However, a large amount of information can be lost, and circadian patterns may be masked.

The merit of functional linear modeling lies in determining when along the 24-hour scale groups differ. By contrast, results from parameter tests in a cosinor approach would provide information about differences in harmonic content between groups. Another advantage of the functional linear modeling approach is exemplified in Figure 8, where BMI is used as a continuous variable instead of comparing groups with higher versus lower BMI values.

In this paper we have presented a novel approach for analyzing the full actigraphy data which we believe avoids significant information loss and masking effects. Representing actigraphy data as smooth continuous functions and applying Functional Linear Modeling methods allowed us to directly compare and test differences in circadian activity patterns across apnea and BMI subgroups. Other Functional Data Analysis methods are currently being developed in our lab and applied to this type of data: principal components analysis ([15]; Zeitzer, et al. 'Phenotyping apathy in individuals with Alzheimer's using functional principal component analysis', Revised and Resubmitted) for identifying sources of variability within circadian activity patterns across subgroups, and mixed effect models (Ding, et al., 'Functional Linear Mixed Effects Model for Actigraphy Data', In Preparation) for incorporating additional sources of within-subject variability, such as day-to-day or pre-treatment to post-treatment differences in activity.

## Declarations

### Acknowledgements

We are particularly grateful to the editor and reviewers who have greatly increased our knowledge of existing work in circadian rhythm data analysis. This work was supported by R01 HL092347 "New Data Analysis Methods for Actigraphy in Sleep Medicine" (Shannon, PI), the Washington University Dept. of Medicine's Biostatistics Center (Shannon, Director), and the Dept. of Neurology Sleep Center (Duntley, Director).

## Authors’ Affiliations

## References

1. Ancoli-Israel S, Cole R, Alessi C, Chambers M, Moorcroft W, Pollak CP: **The role of actigraphy in the study of sleep and circadian rhythms.** *Sleep* 2003, **26:** 342–392.
2. Jean-Louis G, von Gizycki H, Zizi F, Fookson J, Spielman A, Nunes J, Fullilove R, Taub H: **Determination of sleep and wakefulness with the actigraph data analysis software (ADAS).** *Sleep* 1996, **19:** 739–743.
3. Blood ML, Sack RL, Percy DC, Pen JC: **A comparison of sleep detection by wrist actigraphy, behavioral response, and polysomnography.** *Sleep* 1997, **20:** 388–395.
4. Kushida CA, Chang A, Gadkary C, Guilleminault C, Carrillo O, Dement WC: **Comparison of actigraphic, polysomnographic, and subjective assessment of sleep parameters in sleep-disordered patients.** *Sleep Med* 2001, **2:** 389–396.
5. Reid K, Dawson D: **Correlation between wrist activity monitor and electrophysiological measures of sleep in a simulated shiftwork environment for younger and older subjects.** *Sleep* 1999, **22:** 378–385.
6. Shinkoda H, Matsumoto K, Hamasaki J, Seo YJ, Park YM, Park KP: **Evaluation of human activities and sleep-wake identification using wrist actigraphy.** *Psychiatry Clin Neurosci* 1998, **52:** 157–159.
7. Satlin A, Teicher MH, Lieberman HR, Baldessarini RJ, Volicer L, Rheaume Y: **Circadian locomotor activity rhythms in Alzheimer's disease.** *Neuropsychopharmacology* 1991, **5:** 115–126.
8. Mishima K, Hishikawa Y, Okawa M: **Randomized, dim light controlled, crossover test of morning bright light therapy for rest-activity rhythm disorders in patients with vascular dementia and dementia of Alzheimer's type.** *Chronobiol Int* 1998, **15:** 647–654.
9. Gruber R, Sadeh A, Raviv A: **Instability of sleep patterns in children with attention-deficit/hyperactivity disorder.** *J Am Acad Child Adolesc Psychiatry* 2000, **39:** 495–501.
10. Van Someren EJ, Kessler A, Mirmiran M, Swaab DF: **Indirect bright light improves circadian rest-activity rhythm disturbances in demented patients.** *Biol Psychiatry* 1997, **41:** 955–963.
11. Pollak CP, Tryon WW, Nagaraja H, Dzwonczyk R: **How accurately does wrist actigraphy identify the states of sleep and wakefulness?** *Sleep* 2001, **24:** 957–965.
12. Ramsay J, Silverman BW: *Functional Data Analysis*. Second edition. New York; 2005.
13. Refinetti R, Cornélissen G, Halberg F: **Procedures for numerical analysis of circadian rhythms.** *Biological Rhythm Research* 2007, **38**(4)**:** 275–325.
14. Shannon WD, Banks D: **Combining classification trees using MLE.** *Stat Med* 1999, **18:** 727–740.
15. Ding J, Symanzik J, Sharif A, Wang J, Duntley S, Shannon WD: **Powerful Actigraphy Data Through Functional Representation.** *Chance* 2011, in press.

## Copyright

This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.