Maximum entropy spectral analysis for circadian rhythms: theory, history and practice
- Harold B Dowse^{1}Email author
DOI: 10.1186/1740-3391-11-6
© Dowse; licensee BioMed Central Ltd. 2013
Received: 16 April 2013
Accepted: 25 June 2013
Published: 11 July 2013
Abstract
There is an array of numerical techniques available to estimate the period of circadian and other biological rhythms. Criteria for choosing a method include accuracy of period measurement, resolution of signal embedded in noise or of multiple periodicities, and sensitivity to the presence of weak rhythms and robustness in the presence of stochastic noise. Maximum Entropy Spectral Analysis (MESA) has proven itself excellent in all regards. The MESA algorithm fits an autoregressive model to the data and extracts the spectrum from its coefficients. Entropy in this context refers to “ignorance” of the data and since this is formally maximized, no unwarranted assumptions are made. Computationally, the coefficients are calculated efficiently by solution of the Yule-Walker equations in an iterative algorithm. MESA is compared here to other common techniques. It is normal to remove high frequency noise from time series using digital filters before analysis. The Butterworth filter is demonstrated here and a danger inherent in multiple filtering passes is discussed.
Background
Physiological processes in almost all plants and animals have adapted to the cycles in the environment, be they daily (circadian), tidal, lunar, synodic lunar monthly or annual [1]. Oscillatory behavior with periods of less than 24-h, termed ultradian, are also commonly found, occasionally embedded in circadian or other rhythms [2]. This adaptation to cycles in the environment has occurred through the evolution of a biological timekeeper, a true temperature-compensated oscillator providing temporal information at all levels of physiology and behavior [1]. Thorough investigation of these oscillators requires that the periodic evolution of the processes in time be characterized precisely as to the length of the periods seen, as these are the manifestation of the clock process [3]. In addition, the relative robustness and regularity of the rhythms is of considerable interest. Numerical samplings of any process that evolves in time, taken at appropriate intervals, form time series, the stuff and substance of biological rhythm research.
Analysis of time series may be simply done. In early work by Bünning on bean plant leaf movements, periodicity was estimated by measuring peak-to-peak intervals on chymographs that registered leaf position [4]. Analysis technique has progressed considerably since that time and now offers an array of possibilities for estimates of period length [5]. This paper deals with a very useful method called Maximum Entropy Spectral Analysis, or MESA, developed by John Parker Burg in the 1960s in answer to shortcomings of the principal analysis technique up to that time, Fourier analysis [6–8]. We will first discuss Fourier analysis, noting the problems that MESA was developed to fix and how they can be circumvented with MESA. We will pay attention to the theoretical underpinnings so that this popular method will not be a “black box” and will show the basics of how the spectrum is computed. Given that the biologist necessarily works with time series that are either inherently irregular or contain major trends, tools that can ameliorate these problems when used in conjunction with MESA will be introduced and examples of their benefits discussed.
Biological rhythm data
Circadian rhythms are studied in systems ranging from intracellular fluorescence to complex behaviors such as running wheel activity; data acquisition and format vary accordingly. For example, when studying the activity of an enzyme, the variable may be continuous and an appropriate sampling interval must be chosen. This must be rapid enough to avoid “aliasing” in the periodicity region of interest. This occurs when the sampling interval is longer than the period being recorded and is famously seen in old western movies when the spokes of wagon wheels seem to be going backwards; sampling frequency must be no less than twice the frequency of the cyclic process of interest. This constraint is the Nyquist or foldover frequency [9]. Faster sampling is normal, however, to ensure no detail is lost and that accurate period estimates result. There are two important things to consider. The first, is resolution, which is the ability to separate two frequencies as being distinct, for example a 24-h circadian peak with a 24.8-h lunar daily peak. This is equivalent to optical resolution, in which two objects in an image can be separable [10]. Resolution is theoretically limited by the number of cycles in the data set, or in optics, by the diameter of the lens. A completely separate problem is the ability to discern a periodic signal in noise. This is sensitivity, and both are important.
Biological rhythm data are commonly not continuous and consist of unary events unlike the record left on a kymograph by a bean plant. Here, other constraints begin to play a role. Running wheel activity in mammals and the breaking of an infrared light beam by Drosophila are useful examples. Here, individual events occur, and are summed across arbitrary intervals or “bins”. Bin size affects the output of time series analysis and this effect can be profound when bin size is too small (Review: [11]). Bin sizes of 10 min up to an hour are common in rhythms work.
The autocovariance and autocorrelation functions
Given a particular signal, even if it appears clearly rhythmic in a simple time plot, it is important that an objective statistical test be employed to determine if significant periodicity is present. Such a robust test is autocorrelation analysis [5]. In this analysis, the time series is initially lined up with itself in register and correlation analysis is applied yielding the coefficient, r. In this case, no matter what the signal looks like, correspondence is one to one and r = 1. The two series are then set out of register or “lagged” by one interval. The result is a decrement in r. The drop depends on the series; if it is a noiseless sinusoid, the change will initially be small, but if it is white noise, the drop will be very large, since the value of any given point has no relation whatsoever to any other point either near or far in time. Lagging proceeds one interval at a time up to about N/3. The process is usually limited to this point since the power of the test is reduced with the decrement of each pair of lags off the ends of the series. r values are plotted as a function of the lag yielding the autocorrelogram function. In a rhythmic series, r will continue to decline, becoming negative and reaching a minimum when the peaks and valleys in the two series are out of phase by one half cycle. A second positive peak will occur when the peaks and valleys are back in phase, but one cycle out of register. The envelope of decay of the autocorrelation peaks is a function of the regularity in the series and this can provide a useful way of characterizing the regularity in the signal, as will be discussed below (Reviews: [5, 12]).
When computing the correlation coefficient, the output is normalized by dividing by the variance of the complete data set, but this need not be so and the output is then “covariance”, or the autocovariance function [5]. Autocorrelation is commonly employed, as it allows comparisons among wide-ranging experiments.
The autocorrelation function also yields a valuable way to quantify the regularity of the signal both in terms of variation in period and the presence of noise. The height of the third peak in this function, counting the peak at lag zero as one, is taken as the Rhythmicity Index, or RI. This value relies on the decay of the envelope of the function and is normally distributed so it may be used in statistical analyses (review: [12]). The RI of the test signal is 0.697.
Fourier analysis
Compromises inherent in Fourier analysis
Since the autcovariance and autocorrelation functions lose power with each pair of points lost, usually no more than one third of the data are used to compute correlation coefficients, adversely affecting the potential resolution in the spectrum. To alleviate this, the rest of the function is padded out with zeroes. This is an outright falsification of data points not in evidence, since there is no reason to suspect these data points would all be zeroes. An added problem occurs at the point where the zeroes abruptly start, since this abrupt discontinuity will cause artifactual peaks in the spectrum called “side lobes” owing to the Gibbs phenomenon [16]. To correct for this, the real data are blended into the zeros to soften the transition. Here, yet more actual data must be altered to allow for side lobe suppression. One further development that exacerbates these compromises is the Fast Fourier Transform, or FFT. In this algorithm, computational efficiency, and concomitantly speed of calculation, are increased by constraining the input series to consist of 2^{N} data points. Here again, the chances of a data set containing an integer power of 2 points are slim, and again, zeroes are added to pad the series out (Review [16]).
Further data corruption can occur when tighter spacing of spectral estimates is required. If the series is long, consisting of multiple cycles, this is usually not a problem. However, when short data sets are at hand, as is commonly the case with circadian rhythm work, there will be few cycles available. Fourier analysis is based on harmonics and these are constrained. In practice, this means that the spacing between estimates can be very wide. For a one-week long experiment with data sampled at half hour intervals (as in the test data) and analyzed using a simple DFT, spectral estimates are produced only for 22.4 and 24 h in the critical interval between 22 and 25 h. This leaves enormous gaps with little chance that the single estimate at 22.4 is even remotely close to the true period which is 23.0 h. Once again, it is straightforward to tighten up the interval between estimates, but once again, zeroes are added with further problems in using false data points for which there is no justification [16].
Maximum entropy spectral analysis
In the late 1960’s, John Parker Burg developed a new method for producing a spectrum that tackles these problems [17, 18]. It initially found acceptance in astrophysics and quickly spread to other fields. It began to be used for circadian rhythm work in the 1980s and is an excellent choice for a wide range of biological time series. This technique is called “Maximum Entropy Spectral Analysis” (MESA) [16–18]. MESA delivers the highest possible resolution, while eliminating side lobe problems. It is also extremely sensitive, as defined above. It is particularly useful in the short, noisy time series typical in biological systems [12, 19, 20].
Where: S(ω) is spectral power as a function of angular velocity (see above), P is the power passed by the PEF, p is the order of the PEF and a _{ k } is the set of PEF coefficients.
The algorithm commonly used by us and others calculates the filter in an iterative fashion and is based on the work of Anderson [22]. Each iteration extends the AR model by one. The number of coefficients in the prediction error filter employed to construct the spectrum is hence not fixed and requires some care in its choice. If a number that is too low is chosen, resolution and important detail can be lost. On the other side of the coin, if the number of coefficients is run up too high, there may be spurious peaks [21]. An objective method has been developed using the methods of Akaike [21], based on information theory. The filter length chosen is consistent with the most amount useful information that is being extracted as each iteration extends the length of the filter. This is used in the MESA software employed in our work, but we also commonly set a minimum filter length of about N/4 for biological rhythm analyses to ensure adequate representation of any long period cycles in the presence of noise, which can be considerable. N/3 is a good safe maximum [5, 21].
Data conditioning
Biological signals can be notoriously non-stationary and noisy. This variation can take the form of linear or nonlinear trends in amplitude, variations in period and the waveform. As with any signal analysis system, MESA output can be improved by conditioning the signal. It should be noted, however, that MESA is robust in the face of such problems from the start. Incorporated into our MESA program is a detrending step which fits a line by regression and subtracts it. This eliminates linear trend and removes the mean. Mean removal is highly recommended, as this DC component can obscure the rhythmic one if it is excessive [12]. Removal of nonlinear trend can be accomplished by high-pass filtering by numerous methods and will not be discussed here as it is beyond the scope of this work. Low pass filtering to remove high frequency noise, however, is of considerable interest and is commonly done in preparation for spectral analysis (Review: [12, 24]).
Where X_{t} is the original data series and Y_{t} is the output series. A and B are the filter coefficients: A = 9.656 and B = −3.4142. C is the “gain” or amplitude change of the filter and equals 10.2426. See [9, 12, 24] for a more detailed description of this filter. Owing to the recursion there is a 4-h phase delay in this example and this needs either to be made clear when the data are plotted or actually reversed. Reversal can easily be accomplished by running the filter in reverse. Since it is highly inadvisable to run a filter more than once to achieve additional smoothing before spectral analysis, as this will result in artifact [9], this reversal must only be done for display of simple plots for visualizing data. A single pass with the phase change is not an issue for MESA, since the attendant phase shift is of no consequence in this context. A second reversing pass with this filter actually resulted in a widening of the MESA peak (data not shown). After filtering, the RI (see above) is improved from 0.697 to 0.715.
MESA at work
MESA has seen notable success since first being implemented for use in biological rhythms in the 1980s. MESA is useful for an extremely wide range of living oscillatory processes. It was instrumental in discovering the presence of ultradian rhythms in Drosophila locomotor activity rhythms early on, most remarkably in flies bearing the per ^{ 0 } and per ^{ - } mutations, which have no overt circadian periodicity [25, 26]. These ultradian rhythms have been central in a competing hypothesis describing the mechanism of the circadian clock [26, 27]. It has been used extensively in circadian rhythm work since that time. Given its superior resolving power, it settled an old dispute about the presence of lunar rhythmicity in physiological activity in marine organisms [9]. When applied in conjunction with powerful trend removal techniques, it was instrumental in teasing out the role of the gene cryptochrome (cry) in the fly clock system. Luciferase activity was monitored in antennae bearing either tim-luc or per-luc constructs and a central role for CRY protein in the peripheral antennal clock was established [28, 29]. Ultradian and circadian rhythms were examined in premature infants (24–29 weeks) prior to their developing robust circadian periodicity enabling inferences on prenatal periodicity in normal pregnancies [30]. A cryptic human core body temperature of about one hour was elucidated [31]. Electroencephalography has yielded considerable information on ultradian periodicity in rats with MESA analysis combined with aggressive filtering enabling an incorporation of these high-frequency rhythms into models of sleep-wake dynamics [32]. A genetic component in strain differences among normal mice was discerned when locomotor activity was investigated with MESA, revealing robust ultradian components [33]. The presence of an endogenous vertical migration rhythm in Antarctic krill was verified [34]. Work on the cardiac pacemaker of the fly heart has benefitted from the use of MESA for measuring heart rate [35, 36]. When combined with a novel preliminary Fourier treatment to alter the sampling structure, the presence of rhythmicity in the spacing of pulses in the Drosophila mating song was confirmed and it was shown to be under the control of the period gene [37].
In summary, Maximum Entropy Spectral Analysis has proven itself to be a highly useful and versatile tool for the investigation of periodic biological phenomena.
Technical note
A full explanation of the mathematics underlying MESA and the ways in which algorithms have been implemented is beyond the scope of this paper. For those wishing to explore these topics in detail, the author recommends the following: For a good general introduction to the basic logic of MESA see Able’s review [16]; Burg’s original papers are the next step in seeing how the technique developed [17, 18]; the very thorough paper by Ulrych and Bishop [21] should be sufficient to answer almost any mathematical question on the procedure and the algorithm used in our version of the technique, which we implemented in FORTRAN, is found in Andersen’s contribution [22]. Some of these papers, notably those of Burg himself, are difficult to locate. The compendium edited by D.G. Childers entitled “Modern Spectrum Analysis” (1978, Wiley) has all these recommended papers and many more that are on point.
All software used in this lab, including the FORTRAN source code, is available free of charge from the author: dowse@maine.edu. A step by step annotated guide in its use has been published by this author [38].
Declarations
Authors’ Affiliations
References
- Pittendrigh CS: General Perspective. In Handbook of Behavioral Neurobiology 4 Biological Rhythms Edited by: Aschoff J. 1981, 57–55.View ArticleGoogle Scholar
- Lloyd D, Rossi E: Ultradian rhythms in life processes: An inquiry into fundamental principles of chronobiology and psychobiology. Springer-Verlag; 1992.View ArticleGoogle Scholar
- Winfree AT: The Geometry of biological Time. Springer-Verlag; 1980.Google Scholar
- Bünning E: The Physiological Clock. Springer-Verlag; 1964.View ArticleGoogle Scholar
- Chatfield C: The Analysis of Time Series: An Introduction. Chapman and Hall; 1989.Google Scholar
- Lanczos C: Applied Analysis. Prentice Hall; 1956.Google Scholar
- Lanczos C: Discourse on Fourier Series. Oliver and Boyd; 1966.Google Scholar
- Beauchamp K, Yuen CK: Digital Methods for Signal Analysis. George, Allen, Unwin; 1970.Google Scholar
- Hamming RW: Digital Filters. Prentice-Hall; 1983.Google Scholar
- Chui CK: An Introduction to Wavelets. Academic Press; 1992.Google Scholar
- Dowse HB, Ringo JM: Summing locomotor activity into “bins”: How to avoid artifact in spectral analysis. Biol Rhythm Research 1994, 25:2–14.View ArticleGoogle Scholar
- Dowse HB: Analyses for physiological and behavioral rhythmic. In Methods in Enzymology, Computer Methods, Part A, Volume 454. Edited by: Johnson ML, Brand L. 2009: Elsevier; 2009:141–174.Google Scholar
- Schuster A: On the investigation of hidden periodicities with application to a supposed 26-day period of meterological phenomena. Terrestrial Magnetism and Atmospheric Electricity. 1898, 3:13–41.View ArticleGoogle Scholar
- Whitaker E, Robinson G: The Calculus of Observations. Blackie and Son; 1924.Google Scholar
- Kendall MG: Contributions to the study of oscillatory time series. Cambridge University Press; 1946.Google Scholar
- Ables JG: Maximum entropy spectral analysis. Astron Astrophys Suppl Series 1974, 15:383–393.Google Scholar
- Burg JP: Maximum Entropy Spectral Analysis. In Modern Spectrum Analysis. Edited by: Childers DG. Wiley; 1978:34–41.Google Scholar
- Burg JP: A new analysis technique for time series data. In Modern Spectrum Analysis. Edited by: Childers DG. Wiley; 1978:42–48.Google Scholar
- Dowse HB, Ringo JM: The search for hidden periodicities in biological time series revisited. J Theoret Biol 1989, 139:487–515.View ArticleGoogle Scholar
- Dowse HB, Ringo JM: Comparisons between “periodograms” and spectral analysis: Apples are apples after all. J. Theoret. Biol. 1991,1991(148):139–144.View ArticleGoogle Scholar
- Ulrych T, Bishop T: Maximum entropy spectral analysis and autoregressive decomposition. Rev. Geophysics and Space Physics 1975, 13:183–300.View ArticleGoogle Scholar
- Andersen N: On the calculation of filter coefficients for maximum entropy spectral analysis. Geophys 1974, 39:69–72.View ArticleGoogle Scholar
- Kay SM, Marple SG Jr: Spectrum analysis, a modern perspective. IEEE Proc 1981, 69:1380–1419.View ArticleGoogle Scholar
- Levine J, Funes P, Dowse H, Hall J: Signal analysis of behavioral and molecular cycles. Biomed Central Neurosci 2002, 3:1–25.Google Scholar
- Dowse HB, Hall JC, Ringo JM: Circadian and ultradian rhythms in period mutants of Drosophila melanogaster . Behav Genet 1987, 17:19–35.PubMedView ArticleGoogle Scholar
- Dowse HB: Mid-range ultradian rhythms in Drosophila and the circadian clock problem. In Ultradian Rhythms from Molecules to Mind: A New Vision of Life. Edited by: Lloyd DL, Rossi E. Springer Verlag; 2008:175–199.View ArticleGoogle Scholar
- Dowse H, Ringo J: Further evidence that the circadian clock in Drosophila is a population of coupled ultradian oscillators. J Biol Rhythms 1987, 2:65–76.PubMedView ArticleGoogle Scholar
- Krishnan B, Levine J, Sisson K, Dowse H, Funes P, Hall J, Hardin P, Dryer S: A new role for cryptochrome in a Drosophila circadian oscillator. Nature 2001, 411:313–317.PubMedView ArticleGoogle Scholar
- Plautz J, Straume M, Stanewsky R, Jamison C, Brandes C, Dowse H, Hall J, Kay S: Quantitative analysis of Drosophila period gene transcription in living animals. J Biol Rhythms 1997, 12:204–217.PubMedView ArticleGoogle Scholar
- Tenriero S, Dowse H, D’Souza S, Minors DS, Chiswick M, Simms D, Waterhouse JM: Rhythms of temperature and heart rate in premature babies in intesive care. Early Hum Dev 1991, 27:33–52.View ArticleGoogle Scholar
- Lindsley G, Dowse H, Burgoon P, Kilka M, Stephenson L: A persistent circhoral ultradian rhythm is identified in human core temperature. Chronobiol Int 1999, 16:69–78.PubMedView ArticleGoogle Scholar
- Stephenson R, Joonbum L, Famina S, Caron A, Dowse H: Slep-wake behavior in the rat: ultradian rhythms in a light–dark cycle and continuous bright light. J Biol Rhythms 2012, 27:490–501.PubMedView ArticleGoogle Scholar
- Dowse H, Umemori J, Koide T: Ultradian components in the locomotor activity rhythms of the genetically normal mouse, Mus musculus . J Exp Biol 2010, 213:1788–1795.PubMedView ArticleGoogle Scholar
- Gaten E, Tarling G, Dowse H, Kyriacou C, Rosato E: Is vertical migration in Antarctic krill ( Euphausia superba ) influenced by an underlying circadian rhythm? J Genetics 2008, 87:473–483.View ArticleGoogle Scholar
- Dowse HB, Ringo JM, Power J, Johnson E, Kinney K, White L: A congenital heart defect in Drosophila caused by an action potential mutation. J Neurogenet 1995, 10:153–168.PubMedView ArticleGoogle Scholar
- Bodmer R, Wessels RJ, Johnson E, Dowse H: Heart development and function. In Comprehensive Molecular Insect Science V2. Edited by: Gilbert LI, Latrou K, Gill S. Elsevier; 2005:199–250.View ArticleGoogle Scholar
- Alt S, Ringo J, Talyn B, Bray W, Dowse H: The period gene controls courtship song cycles in Drosophila melanogaster . Anim Behav 1998, 56:87–97.PubMedView ArticleGoogle Scholar
- Dowse H: Statistical analysis of biological rhythm data. In Methods in Molecular Biology: Circadian Rhythms. Edited by: Rosato E. Humana; 2007:362–29–45.Google Scholar
Copyright
This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.