# Methods for serial analysis of long time series in the study of biological rhythms

- Antoni Díez-Noguera
^{2}Email author

**11**:7

https://doi.org/10.1186/1740-3391-11-7

© Díez-Noguera; licensee BioMed Central Ltd. 2013

**Received: **10 May 2013

**Accepted: **13 July 2013

**Published: **18 July 2013

## Abstract

When one is faced with the analysis of long time series, one often finds that the characteristics of circadian rhythms vary with time throughout the series. To cope with this situation, the whole series can be fragmented into successive sections which are analyzed one after the other, which constitutes a serial analysis. This article discusses serial analysis techniques, beginning with the characteristics that the sections must have and how they can affect the results. After consideration of the effects of some simple filters, different types of serial analysis are discussed systematically according to the variable analyzed or the estimated parameters: scalar magnitudes, angular magnitudes (time or phase), magnitudes related to frequencies (or periods), periodograms, and derived and / or special magnitudes and variables. The use of wavelet analysis and convolutions in long time series is also discussed. In all cases the fundamentals of each method are exposed, jointly with practical considerations and graphic examples. The final section provides information about software available to perform this type of analysis.

## Keywords

## Background

Virtually all living beings exhibit circadian rhythms that are manifested as oscillations with periods close to 24 hours in almost all their physiological variables. The study of these rhythms is thus based on the characterization and quantification of these oscillations, and, for this, one can use various statistical and mathematical techniques [1], which often come from wave analysis theory. These techniques include quantification of various descriptive parameters in the time domain: average values, variability (variance) of the oscillation, amplitude, the duration of the activity phase (alpha), the mean value in the alpha phase, and a whole host of indices derived from these and other similar magnitudes. Another set of techniques to study the characteristics of the rhythm in the domain of frequency are: spectral analysis, periodic regression (fitting to sinusoids), periodogram, calculation of phases, etc. In all these cases the presence of a periodic process is assumed, which is repeated with a certain frequency, either known or unknown.

In many such studies, a descriptive analysis is enough to characterize the circadian rhythm of one or more individuals under certain circumstances, and it is common to compare these parameters among different individuals that may be under different environmental conditions or under different experimental conditions. But a different situation occurs when performing long experimental studies in which a variable is recorded (usually spontaneous motor activity) for long periods of time [2]. In these cases the main objective of the study is usually to observe and quantify the evolution and the changes in the rhythm throughout the study. To do this, one applies the aforementioned techniques, repetitively at different times of the period analyzed. So we can “cut” the recorded data series in several sections to be analyzed individually and then compare the results.

However, when very long and practically continuous registers are available, it is of greater interest the study of the changes that occur over the time, in a continuous or quasi-continuous (daily) way. This is particularly frequent when monitoring changes of phase, in response to external stimuli, or to environmental changes. But virtually all variables from the descriptive analysis of a circadian rhythm can be continuously monitored over time [3]. This type of analysis provides invaluable information to the knowledge of the circadian system and their responses. This is also of great interest in the study of the processes of maturation or degeneration that occur throughout the life of the individuals of a given species.

In this article we will review the techniques that allow the analysis of circadian rhythms through a relatively long period of time, considering the most frequently used and also their software availability and its informative value. In all cases, the techniques will be introduced briefly along with some comments about its features and applicability. To properly illustrate their characteristics, both synthetic and real experimental datasets will be used in examples, to highlight some features of their behavior.

### The serial analysis concept

The study of changes in the characteristics of a rhythm over time is a kind of “multiresolution analysis” since the factor “time” is considered at two clearly differentiated levels: one level of resolution is related to the circadian rhythm itself (approximately 24 hours) and the other is the level of resolution of the timescale in which it is intended to detect changes of circadian rhythm, which can range from a few weeks to several months or years. In some special cases these changes are also periodicities like, weekly, monthly or seasonal. This leads to a set of techniques in which the primary data series is constituted by the results from the daily analysis of the original data, and will not be considered in this article.

The first step necessary to perform the analysis over time is to segment the data series into different sections with the same size, in each of which an analysis in the “circadian range” will be done. In this way we will obtain a series of results that will constitute a new series in a “global range of time”. This type of analysis is often applied to data sets that are analyzed in the circadian range by fitting a sine function in what is called “serial analysis” or “serial section analysis” [4–6]. Although strictly speaking the term “serial analysis” is widely used for the investigation of specific sequences in ordered collections of data (e.g. analysis of base sequences in genetics) we consider that we can keep using this term, without ambiguity, in the area we are now considering. Another possible term that we can use for this analysis could be that of “sequential analysis” since the analysis is performed sequentially in one section after another, but this term also has a specific meaning in statistics different from what we are considering here. Here we will generalize the use of the term “serial analysis” to cases where the analysis of successive sections will not be based on fitting the data to a sinusoid in the circadian range. This will be the case of the periodograms, phase analysis, etc.

In order to establish a consistent formal notation, we will consider a generic variable as X (temperature, motor activity, plasma concentration, blood pressure, etc.), whose different values over time are expressed as a function of time x(t). The series (being analyzed) is constituted by the successive observations made at different times t_{1}, t_{2} … t_{i} (i = 1 … N) giving the values x (t_{1}), x (t_{2}), … x (t_{i}). For simplicity, in the case of a discrete series of values, we can do x_{i} ≡ x (t_{i}), so that the series will consist of {x_{1}, x_{2} … x_{i}}. In all cases (and if not otherwise stated) it is assumed that the series is uniformly sampled with a constant sampling interval Δt = t_{i}-t_{i-1}. In this series, we can define different sections as subsets of m values of X, which are generically designated with the letter Y. Thus the series Y_{k} (i.e. the section k) will be formed by l values of X, starting at the sample k, so that: Y_{k} = {y_{1}, y_{2} … y_{l}}_{k} = {x_{k}, x_{k+1}, … x_{k+l-1}}. In the next part will show a more precise formulation, depending on the characteristics of the section.

Regarding the nature of the variable X, practically all methods of analysis are defined for continuous quantitative variables, like temperature or blood pressure, but in many cases the variable has a discrete quantitative (countable) character, as the number of movements. In these cases one can consider that the value of the variable (e.g. the number of counts in a fixed time) is proportional to the probability that an individual makes a move or, in any case, is proportional to the abstract variable “the tendency to move”. According to this, one can treat these variables as a continuous one. To be strictly correct these values should be transformed by the application of a linear filter [6], but this should be necessary only for extremely discrete values (only 2 or 3 levels). For values greater than 5 they can be used directly without any transform.

### Characteristics and definition of sections

_{1}, Y

_{2}… Y

_{j}(j = 1, 2 … n; n = N · l/s + 1) is Y

_{j}= {y

_{1}, y

_{2}, … y

_{l}}

_{j}= {x

_{(j-1)s+1}, x

_{(j-1)s+2}… x

_{(j-1)s+l}}, which is shown more clearly in Figure 1. Thus y

_{1}, y

_{2}, … y

_{l}, represent the elements of any section, while appending a subscript j, then y

_{j,1}, y

_{j,2}, … y

_{j,l}represent the elements of a particular section Y

_{j}. Hereafter and unless otherwise indicated, all values of the intervals (s, l, and T) are expressed in number of samples, so that their value in real time units will be calculated by multiplying by the value of Δt sampling interval. From each section a characteristic value of the circadian rhythm will be obtained. The array of these values {z

_{1}, z

_{2}, … z

_{n}} will be the new Z series whose evolution is analyzed over time and that is the objective of the serial analysis.

_{Y}. The data set X is a sinusoidal function with period T = 12 and a constant of 1. The step value is set at s = 1, so that the sections overlap in order to obtain the “instant” value of the mean for each time point. The lengths l that are tested correspond to l = 3 T/4, T and 5 T/4. It can be clearly seen that for l = T, the Z value reflects the actual value of the average over the entire range studied, whereas for l = 3 T/4 and 5 T/4, an oscillation occurs in the value of m with the same period as the original series X.

For the selection of the jump s, one must take into account whether the value calculated at circadian level, could be considered as a continuous process that exists all the time, or exists only for intervals of length equal to T. In the first case it can be considered that there exists an “instant” value (e.g. the heart rate), which allows the calculation of Z for each point of the original series, i.e. Z can take values at any time between 0 and T. In the second case (e.g. the calculation of an amplitude), it would be advisable to use a value of s = T, to avoid the errors discussed in the previous paragraph.

When one uses values of s lower than l, an overlap occurs between the successive sections, which should be taken into account when one checks the statistical significance of the values of the resulting variable Z. This applies only where the variable Z, calculated from sections, is defined under some hypothesis, such as being different from zero, or a fixed value, which allows the estimation of the statistical significance. This could be the case that Z corresponds, for example, to an average or a variance. If there are no overlapping sections, the significance level p, for each value of Z, can be calculated independently, because the data sets used to calculate each value of Z are different from each other. Conversely, in the event of having overlapping sections (s < l), the successive values of Z, share several values of X. Thus a part of a data set used to calculate a Z value can be used to calculate other Z values. In these cases, to maintain a fixed level of significance, a correction should be applied to calculate the level of significance for each single test of Z, depending on the number of statistical tests (Z values) that share a value of X for its calculation. If we call this number m (m = integer(s/l)-1), the corrected significance level will be calculated by Sidak’s formula [7] p_{m} = 1-(1-p)^{1/m} which, for values of m > 3, is approached by the Bonferroni correction [8] p_{m} = p/m, where p is the probability level for the overall set of tests.

The last aspect to be considered is the expected rate of change in the process being studied. In this case a balance is required between the minimum length of the section needed for calculating the parameter Z (e.g. the phase), the magnitude of the jump s, and the expected rate of change. The rate of change is defined by the time-constant of the process, which in physics is defined as the time required for the variable to reach a fraction of 0.632 of its final value (0.632 = 1-1/e). In order to follow the changing process properly, the value of the step s must not exceed the expected time-constant of the process, and it is recommended to keep it below half that value. With regard to the choice of the length of the section l, this should be kept as short as possible in order to detect changes in the moment at which they occur. Anyway, this value will be conditioned by the length required for the calculation of the parameters corresponding to Z. Thus, for example, in the case of a serial periodogram, it is necessary to use a length l ≥ 10 T, to perform the calculation confidently.

It has to be kept in mind that, when very long sections are used, a smoothing will occur in the series of values of Z. This smoothing is equivalent to the average of a number of Z values equal to the number of complete cycles T included in each section (see Figure 2B). Thus a value of Z obtained from sections of length l = mT, is equivalent to the smoothing of m elements calculated in the Z series (calculated with a length l = T). This effect may be desirable in cases where the variable Z is very unstable, but it should be avoided when dealing with rapidly changing processes. In many cases where the studied variable may change from cycle to cycle, one single cycle is enough for the determination of Z values, and very often the three intervals take the same value, i.e. T = l = s.

A final consideration relates to the mode of presenting the results graphically that, in general, differs from the regular cartesian graph with the time axis in the abscissas. Since this type of analysis is performed on large data sets, which are usually represented as actograms or “double plotted graphs”, most often the results series Z is represented, in parallel to the above graphs. That is, the variable Z is represented in abscissas, and vertically, starting from the top, successive values of Z (derived from sections) are shown in descending order. When the variable Z corresponds to a time (or phase) within the cycle T, the horizontal axis goes from 0 to T, and often it is superimposed to the “Double plotted graph” of X to facilitate the location of phases within the series analyzed.

### Analysis of sections and types of variables

As already mentioned the “results series” Z is formed by the sequence of values resulting from the analysis of successive sections and obtained from the segmentation of the original series X. Thus, Z is a variable, or a feature of each section, as could be the average amplitude of a sinusoidal oscillation, the period, the variance, etc. In order to describe the different possibilities, five types of results can be considered: scalar magnitudes, angular magnitudes (time or phase), magnitudes related to frequencies (or periods), periodograms, and derived and/or special magnitudes and variables. Each one of these types has some common formal and methodological properties that will be discussed below.

There is another type of analysis called “wavelet analysis” which also allows seeing the evolution of the characteristics of the circadian rhythm through time. Although it uses a very specific methodology that is different from above, a comment on this technique will also be included jointly with the convolutions.

#### Scalar magnitudes

In case of using sections with a length different of T, it is very important that the length l be multiple of T (as already discussed in the previous section) to avoid the presence of false fluctuations in the analyzed parameter. Another consideration to be taken into account when trying to plot individual values of the series (e.g.: minimum or maximum values) will be the application of filters, since in the case of using values calculated from the entire section, random variations or noise present in the data is compensated by the calculation itself, whereas when the result is a single value from the series it can be seriously disturbed by the noise. Applying a filter to the whole data set before the calculation is a practice advisable in most cases, as it reduces the noise in the series, decreasing the residual variance and increasing the accuracy in the estimation of parameters.

The use of numeric filters constitutes a broad field in the theory of signal analysis and there is a wide variety of types and features, and highly specific filters can be designed for the removal (or amplification) of definite frequency ranges. Except in special cases, the use of very specialized filters can modify the presence of frequency components in an unwanted manner, distorting seriously the characteristics of the series being analyzed. Instead, what is really advisable is the use of simple filters for noise reduction, which is usually constituted by high frequency components.

_{i}= (x

_{i-n}+ … + x

_{i}+ … + x

_{i+n}) / (2n +1). The moving average produces a smoothing effect on the data that removes high frequency components, depending on the amplitude of the interval M = 2n + 1, as shown in Figure 4A. The choice of n will depend on the highest frequency to be detected in the subsequent analysis. To determine the effect of smoothing we can calculate the transfer function H(f), according to the formula: H(f) = abs [sin(π · f · M) / (M · sin(π · f))], where H(f) represents the relationship between the signal before and after filtering for each frequency f, where a f = 1 is the frequency corresponding to two sampling intervals: f

_{N}= 1/(2 Δt), that is known as the Nyquist frequency. As an example, for a sampling interval of Δt = 15 min., a frequency f = 0.4 corresponds to a period T = 1/f = 1/(0.4 · f

_{N}) = 2 Δt /0.4 = 2 · 15/0.4 = 75 min. As seen in Figure 4A, the frequency reduction is not uniform and has a strong ripple, but remains at very acceptable levels. It is possible to design a filter with a much flatter response, but it is actually much more complex. Regarding the sample interval, we must remember that the shorter period that one can analyze in a series is determined by the Nyquist frequency, giving a value of T =1/f

_{N}= 2 Δt.

In real series there is a reduction of low frequencies somewhat higher that in theory, as shown in the example of Figure 4B, calculated from a series of “real pink noise” (1/f). The same figure shows the effect of applying another interesting filter: a running median. This type of filter is highly recommended because it eliminates very effectively spurious or aberrant data from the series. In the case of the moving average, these values can produce a displacement that may distort the series significantly. In addition, the moving median produces a similar reduction at high frequencies as the moving average does but with a noisy transfer function. In spite of this, the filtering effect is practically unaffected. For these reasons it would be always advisable to smooth out the data with a running median before proceeding to the serial analysis of data.

#### Angular magnitudes (time or phase)

Tracking a characteristic moment of the circadian cycle along successive days is usually one of the most characteristic aspects of numerous circadian rhythm studies. The choice of this point is controversial, and there is no consensus in defining a unique moment that can be used in all cases. In many cases where the motor activity of an animal is recorded, the beginning of the active phase is usually a very stable indicator of the phase of the animal [9, 10]. In other cases, for certain diurnal animals, the offset of activity is better [11], while in other cases more complex patterns hinder the selection of a characteristic point. In many cases the right selection of one point or the other determines the quality of results, as in the case of the calculation of phase changes to obtain a phase response curve [12].

In many cases when the pattern of the studied variable is far from sinusoid, the time of acrophase will not necessarily coincide with the maximum value of the actual series, and for this reason many authors criticize its use, but we will see later that the phase φ (or acrophase) is probably the best parameter of centralization, even in these non-sinusoidal cases.

Before continuing with the discussion of phase indicators, it should be noted that, as in the case of scalar magnitudes, it is advisable to use non-overlapping sections with s = l = T (daily sections), since we are analyzing a value (a phase) whose definition involves a complete cycle (only one T) and also assumes that the value can change from day to day.

Despite the advantages already mentioned, this method has practical drawbacks to maintain the wave profile more or less centered in the range of the analyzed section.

Regardless of the parameter of centrality, in numerous studies the focus is on determining the start of the active phase or the end of it, especially when patterns have non-sinusoidal waveforms (generally square) or when working under the hypothesis of various oscillators (e.g. morning and evening components [20]). It is also common to study the two points simultaneously, and such is the case of studying the duration of the alpha phase [9]. Again, if the contrast between the phases of activity and rest is marked, the estimation of these parameters is relatively simple. Often this estimate is done “*de visu*” on the graph itself (“double plot”), where the values are estimated graphically, but it is possible to use analytical methods that are more accurate and totally objective, such as those shown below. We must bear in mind that if the studied variable X has a certain inertia (e.g., temperature) the use of graphical methods can be very complicated, because of the difficulty of establishing the threshold between high and low values.

The procedure that often works best is to perform (for each section) a regression analysis between the data and the Heaviside function (the unit step function) using a variable displacement ψ between the series and the Heaviside function. The offset value ψ varies from 0 to n, and the value of ψ at which the higher correlation index is obtained is chosen as the indicator of the start of activity phase. Mathematically, Heaviside function is represented by the letter H, but to avoid confusion with the transfer function used above, we’ll use the symbol θ, which is also used in some cases of continuous functions. This function is defined as θ(t) = {0: t < c, 1: t ≥ c}, i.e. is zero for any time lower than an arbitrary value c, and 1 for times greater than or equal to c, thus representing a unit step. The procedure, in fact, looks for the time point when the data set has a behavior similar to θ(t). Formally at that point there is a “positive flank”, which is indicated by the symbol ψ+. To calculate the end of the phase of activity, one uses the same procedure but using the complementary function defined as θ'(t) = 1-θ(t) = {1: t < c, 0: t ≥ c }, defining a negative step, so that the point found corresponds to the negative flank ψ-.

Another possibility to estimate the position of flanks is to use a square wave for the regression instead of the Heaviside functions (see Figure 7). This procedure improves slightly the previous one because the Heaviside function only takes into account an increase (or decrease) on the series values, which means that a part of the series is not properly adjusted to the actual values because in the real data set both an increase and a decrease take place. To solve this problem a first estimate of the positive and negative flanks is made using the above procedure. After that, the positive flank is kept (c = ψ+) and successive regression analysis are performed, as in the previous case, but with a square wave function q(t) = {1: c ≤ t < d, 0: otherwise} and changing d till finding the best fit that will define the negative flank d. Then the procedure is repeated but keeping d, and obtaining a new value for c. These new values of flanks, c and d, are compared with the values ψ + and ψ- obtained initially. If they are equal, they are considered valid, but if different, the procedure is repeated using the new values c and d as the initial estimates of flanks, and the procedure continues until the two pairs of values coincide. Although the procedure is a bit complex, it can be automated, and has the advantage that it gives very good estimates, especially in series with irregular circadian patterns.

Finally, we will discuss other methods for estimating parameters of centrality or flanks, which are much simpler, but their application is restricted to highly uniform and noise-free series. It consists simply in determining the time at which the series has its minimum value or its maximum value, or the point at which the greater increase or decrease takes place between two successive values (positive and negative flanks respectively). To apply these methods, a previous smoothing of considerable amplitude is necessary. Another possibility is locating the point at which the data series crosses a certain threshold (mean, median or percentile), but here it is necessary to apply a moving average with an amplitude close to one third the length of the cycle analyzed, to remove local variations. A serious inconvenient of these techniques is that the smoothing effect can be so great that they cause distortion in the wave and errors in the estimation of parameters. Therefore these techniques are not very suitable and should be used only in the case of very “soft” waveforms.

A consideration which is common to all angular magnitudes that have been considered in this section is the monitoring of the phase changes between sections, to avoid changes greater than half a cycle T/2 = π rad = 180°. In an example where the sections are analyzed with T = 24 hours, it could be possible, as a result of a calculation, to obtain a phase shift of 18 hours, which actually corresponds to a change of −6 hours, considering the shorter jump. In this case, one would have to subtract 24 hours (T = 2π rad = 360°) from the calculated value to obtain the correct value. It must be noted that the value of the phase shift has to be calculated from the expected value in the corresponding section and not the absolute value of the phase. Suppose we obtain values 4 h, 6 h and 8 h in subsequent sections. If in the next section we obtain 10 h, it is clear that it is equal to the expected value and the phase shift from the expected value will be zero, although there is an absolute increase of 2 h, with respect to the previous section. To monitor the phase and make corrections properly, one must take the k values of the preceding sections and extrapolate the next point on the basis of linear regression with the previous points. It is possible to obtain relatively simple formulae for this extrapolation for fixed values of k ≤ 6. In this way, false jumps will be eliminated in the calculated series of phases.

#### Magnitudes related to frequencies (or periods)

where c_{i} and θ_{i} are the amplitude and phase of each component i. Each one of these components is called a harmonic, and the graphical representation of the amplitudes of the successive harmonics is the spectrum of the series. This spectrum is characteristic for each rhythmic pattern shape and clearly indicates the different frequency components involved in defining a pattern. The decomposition of a waveform into its harmonic components is done by spectral analysis or Fourier analysis [24, 25].

This type of analysis can also be done serially on sections that may overlap or not. In case of overlapping sections, as already mentioned in previous cases, the procedure will produce a smoothing over the values of all the parameters resulting from the analysis. If one makes a serial Fourier analysis, it is important that the length of the analyzed sections is equal to or an integer multiple of the period used for the main analysis. This is because under this condition (orthogonality), the estimates of harmonics are independent of the other components, so that if one harmonic has certain amplitude, this does not affect the magnitude of another harmonic, while if the condition of orthogonality is not satisfied, the estimation of each harmonic affects the other components, which leads to an erroneous or biased spectrum. In this type of analysis, it is particularly useful to employ power spectrum (instead of the amplitude) in which the magnitude of each harmonic is expressed as the fraction that represents the square of its amplitude with respect the sum of the squares of all harmonics. This allows to quantify the importance of each component as a rate (or percentage), regardless of its real amplitude. In this case the spectrum is called power spectrum.

In all these analyses, one must establish a value for the period, that must be known or estimated “a priori”. This is the fundamental period to be used throughout the whole analysis and that, logically, will be the rhythm of the series. When this period is not known with accuracy, we shall see other techniques aimed precisely at determining the value of the period of the rhythm.

#### Periodograms

In the previous section, quantities related to specific frequencies (or periods) were calculated, meaning that the value of the frequency at which the data should be adjusted is known or assumed in advance. Now we will deal with the case in which the value of the frequency (or period) is unknown, and the aim of the analysis is precisely finding its value. For these calculations a spectrogram can be used, or one may conduct a serial analysis consisting of successive periodograms. Usually, quite long sections are required, which must encompass a minimum of 8–10 days (or cycles). The jump between successive sections usually is a complete cycle. Thus what is obtained is a daily succession of periodograms. Each periodogram is a row in a graphic matrix and each cell will correspond to one of the tested periods in the specific section, colored according a gray scale, or other chromatic scale. The most suitable periodograms for this kind of representations are the Lomb-Scargle [28, 29] and the Sokolove-Bushell [30]. The first one should be used when sections are shorter than 8 cycles.

Each periodogram has its own characteristics, whose discussion is beyond the scope of this article. Everything that can be explained in the “normal” periodograms (not serial) also applies in this serial application. Before explaining the differences and characteristics of these two periodograms we should insist on a common feature of all periodograms, for a proper interpretation. This is the fact that the values shown in a periodogram are obtained independently of each other, and not in a jointly way. This means that if in three adjacent points of a periodogram corresponding (for example) to periods of 1200, 1210 and 1220 minutes have values of 35, 38 and 42% of the total variance, that does not mean that each of these periodicities are present on the wave with such percentages of variance simultaneously. The true interpretation is that when a period of 1200 minutes is tested, this period explains 35% of the total variance, and if we perform a new (and different) analysis with a period of 1210 minutes, then it explains 38%. Therefore, each tested period is an independent test, performed over the same dataset, which obliges to apply the Bonferroni correction (explained above) on the probability level required in each test (or point) to maintain the desired global level of significance.

It should be noted that in the case of serial Fourier analysis (graphic matrices) several frequencies are also shown simultaneously for each section, but in that case the different harmonics are calculated simultaneously and orthogonally, so that the percentages of variance correspond to the distribution of the total variance of series, among the different harmonic components, while in the periodogram this is not so. Although the two periodograms generate a Chi2 type variable, one can transform these values in percentages of explained variance, which is much easier to interpret.

_{P}is

where P is the period expressed in samples,
are the column means after arranging the series (of N elements) in an array of P columns, and K is the number of rows of the resulting array. Q_{P} follows a Chi^{2} distribution with as many degrees of freedom as cycles in each section (see a description of the method of calculation in [30]). From the value of Q_{P}, the amount of variance explained by the rhythm can be calculated [31] just multiplying Q_{P} by 100/N.

_{i}must be normalized to N(m = 0,s = 1), and

All other formulas remain the same as in the previous periodogram. The maximum value of P(ω) is 1, when the periodicity accounts for all the variance of the series; because of this, 100· P(ω) can be interpreted as the percentage of explained variance.

#### Derived and/or special magnitudes and variables

The number of variables that can be studied in a serial analysis is virtually unlimited, since it is possible to define a multitude of indices and variables from those already indicated or others different, depending on the properties under investigation. As example, a few cases will be discussed.

One case is the duration of the activity phase or alpha phase. It can be expressed in absolute terms or as a percentage with respect the duration of the cycle. To calculate the length of the alpha phase, one can calculate the difference between the positive and the negative flank, as already calculated above.

Some authors determine the degree of entrainment, expressing the portion of activity that takes place during a certain part of the light cycle [35]. To calculate this value, one uses sections with a length and step equal to the period of the cycle of illumination, and then one calculates the amount of activity in the first half of the section, with respect to the activity recorded in the entire section.

This index has proved to be useful to follow the effect of treatments in demented patients in long periods of time, and because of this, it is well suited for its serial application.

Where M_{10} and L_{5} are the activity values for the most active 10 hours period and the least active 5 hours period in the average 24 hours pattern (sections should include 5 or more cycles). This index provides a good indication about the degree of coupling between the activity rest cycle and the external *zeitgeber* (light–dark cycle).

### Wavelet analysis

The wavelet analysis is a relatively new technique for analyzing a process in time and frequency simultaneously. Its main advantage is that the process does not require the date to be stationary or to have a constant spectral structure, so it is especially suitable for the analysis of rhythmic processes whose characteristics vary in time. Although there is not a serial analysis technique, the calculation methodology has many similarities to this type of analysis, so here we will make brief reference to these techniques and refer the reader interested in them to the extensive mathematical literature existing and more specifically to two recent articles [36, 37] on its application in chronobiology.

Its application in chronobiology is scarce and there is no clear consensus on how to implement this technique. In addition to the articles mentioned, early studies were conducted in the late 1990s in which wavelet analysis was used for the characterization of ultradian rhythms [38], for monitoring phase changes [39] or for signal recognition [40] and more recently in studies on variations of the period [41].

The analysis is performed by dividing the series in different sets of sections with lengths that are half of the previous one, and their associated frequencies will double. In each resulting set of sections a wavelet with the same length of the section is applied to the data by cross-multiplying their terms. Wavelets are shrunk or lengthened to fit the section length. They are not sinusoids and have very characteristic waveforms that form a set of orthogonal functions with a domain of existence limited by the length of each section (this interval defines the “support” of the wavelet). This type of analysis has a clear discrete character (DWT Discrete Wavelet Transform), so it could be considered similar to performing various serial analyses with non-overlapping sections with a different length and step (each length is associated to a frequency) in each analysis.

### Practical application and conclusions

The serial analysis of long duration data series includes a wide variety of methods, such as we have seen, and it is not easy to find applications capable of performing these calculations in a compact form. Consequently, one often needs to develop specific programs or routines to perform the calculations.

“ActogramJ” Java app by Schmid, Helfrich-Föster & Yoshii [http://actogramj.neurofly.de],

“BRASS” Excel app by Millar [http://millar.bio.ed.ac.uk],

“Chronos-Fit” by Lemmer [http://www.fileguru.com/Chronos-Fit/info],

“Circadian software” by Refinetti [http://www.circadian.org/main.html],

“ClockLab” from Actimetrics [http://www.actimetrics.com/ClockLab/],

“CronoLab” by Mojón, Fernández & Hermida [http://www.tsc.uvigo.es/BIO/Bioing/ChrLDoc1.html],

“El Temps” by Diez-Noguera [http://www.el-temps.com],

“Free chronobiology software” by Hut [http://hutlab.nl],

“The Chronobiology Kit” from Stanford Software Systems [http://query.com/chronokit/],

“Time Series Analysis-Cosinor” from Expert Soft Technologie [http://www.euroestech.net/index.php].

In the case one prefers to write his/her own programs, one can use any of the many existing programming languages -- such as Delphi, Basic, C++, C#, and Java --, several of which include programming environments that facilitate the work very much. A serious disadvantage is the requirement of specific programming skills, which most users do not possess. For this purpose there are mathematical applications that use a special language for commands, operations and functions, such as MATLAB, allowing the execution of small programs or scripts that are extremely useful for such calculations. This language includes a great number of predefined functions (statistics, graphs, spectral analysis, regression, etc.) that facilitate the realization of the calculations automatically. Because it is a very high level language, many processes are preprogrammed and ready to use, which saves programming time and effort. It is worth to point out that MATLAB is a language that can be handled relatively easily, at very acceptable levels without high expertise, and this explains its great popularity and acceptance in the field of computing and numerical analysis. The existing large user community has developed many applications that are freely available in the Internet, and in many cases it is not difficult to find a program that suits one’s specific needs.

Another similar application is Mathematica, with similar characteristics as MATLAB but with different programming strategies. Both programs have additional libraries that extend the capabilities of analysis to specific fields, such as statistics, waveform analysis, wavelet analysis, filters design, etc. all of them very useful for our purpose of analysis.

In short we can say that the serial analysis is a tool of the first order to analyze the rhythmic characteristics of long time series, in which rhythm properties evolve over time. In fact it is nothing else that the repeated application of conventional rhythm analysis techniques along different sections of a time series. As already mentioned in this article, the correct application of these methods requires some considerations and precautions, already mentioned above, to avoid the commission of errors, as when choosing the length of sections and the step size. Likewise the interpretation of the results must be conducted cautiously under the knowledge of the specific characteristics of each technique, as in the case of Fourier or spectral analysis.

In general terms, one can say that when a time series includes more than 20 successive cycles, it is necessary to use the serial analysis to see the evolution and changes that have taken place during that time period. If this analysis is not performed and the whole data series is treated as a single unit, the presence of errors is highly likely due to the non-stationarity of the process, since most of the conventional analysis methods are defined for stationary series whose properties remain constant throughout the investigated period, a situation that is very rare in living beings.

Although most of the techniques described here have a rather complex mathematical basis, the existence and availability of numerous computer applications, more or less specific, greatly facilitates the practical implementation of these methods. We believe that the systematic application of these analytical techniques, would improve the description of many experimental observations allowing a more accurate analysis.

## Declarations

### Acknowledgments

The author’s work was partially supported by the Ministerio de Educación y Ciencia (project: BFU2008-00199).

## Authors’ Affiliations

## References

- Refinetti R, Cornelissen G, Halberg F:
**Procedures for numerical analysis of circadian rhythms.***Biol Rhythm Res*2007,**38:**275–325.PubMedView ArticleGoogle Scholar - Richter CP:
**Inherent twenty-four hour and lunar clocks of a primate, the squirrel monkey.***Comm Behav Biol*1968,**1:**305–332.Google Scholar - Richter CP:
**Heavy water as a tool for study of the forcas that control length of period of 24-hour clock of the hamster.***Proc Natl Acad Sci USA*1977,**74:**1295–1299.PubMedView ArticleGoogle Scholar - Arbogast B, Lubanovic W, Halberg F, Cornélissen G, Bingham C:
**Chronobiologic serial sections of several orders.***Chronobiologia*1983,**10:**59–69.PubMedGoogle Scholar - Halberg F, Engeli M, Hamburger C, Hillman D:
**Spectral resolution of low-frequency, small-amplitude rhythms in excreted 17- ketosteroid: probable androgen induced circaseptan desychronization.***Acta Endocrionl (Copenh)*1965,**103**(supl)**:**5–54.Google Scholar - Helfrich-Förster C:
**Differential Control of morning and evening components in the activity rhythm of drosophila melanogaster—Sex-specific differences suggest a different quality of activity.***J Biol Rhythms*2000,**15:**135–154.PubMedView ArticleGoogle Scholar - Sidak Z:
**Rectangular confidence regions for the means of multivariate normal distributions.***J American Statistical Association*1967,**62:**626–633.Google Scholar - Bonferroni EC:
**Teoria statistica delle classi e calcolo delle probabilità.***Pubblicazioni del R Istituto Superiore di Scienze Economiche e Commerciali di Firenze*1936,**8:**3–62.Google Scholar - Pittendrigh CS, Daan S:
**A functional analysis of circadian pacemakers in nocturnal rodents. I: the stability and lability of spontaneous frequency.***J Comp Physiol A*1976,**106:**223–252.View ArticleGoogle Scholar - DeCoursey PJ, Plus S, Sandlin C, Wethey D, Schull J:
**Relationsgip of circadian temperature and activity rhythms in two rodent species.***Physiol Behav*1998,**65:**457–463.PubMedView ArticleGoogle Scholar - Everts LG, Strijkstra AM, Hut RA, Hoffmann IE, Millesi E:
**Seasonal variation in daily activity patterns of free-ranging European ground squirrels (Spermophilus citellus).***Chronobiol Int*2004,**21:**57–71.PubMedView ArticleGoogle Scholar - Aschoff J:
**Response curves in circadian periodicity.**In*Circadian clocks*. Edited by: Aschoff J. Amsterdam: North-Holland; 1965:95–111.Google Scholar - Koehler WK, Fleissner G:
**Internal sesynchronization of bilaterally ordanized circadian oscillators in the visual system of insects.***Nature*1978,**274:**708–710.PubMedView ArticleGoogle Scholar - Vokac M, Vokac Z:
**Serial analysis of circadian rhythms section by best-fitting periods.***Am J Hypertens*2002,**15:**79A.Google Scholar - Mitsutake G, Otsuka K, Cornelissen G, Herold M, Gunther R, Dawes C, Burch JB, Watson D, Halberg F:
**Circadian and infradian rhythms in mood.***Biomed Pharmoacother*2001,**55:**94s-100s.View ArticleGoogle Scholar - Comas M, Beersma DGM, Hut RA, Daan S:
**Circadian phase resetting in response to light–dark and dark–light transitions.***J Biol Rhythms*2008,**23:**425–434.PubMedView ArticleGoogle Scholar - Kantermann T, Juda M, Merrow M, Roenneberg T:
**The human circadian Clock’s seasonal adjustment is disrupted by daylight saving time.***Curr Biol*2007,**17:**1–5.View ArticleGoogle Scholar - Hut RA, Scheper A, Daan S:
**Can the circadian system of a diurnal and a nocturnal rodent entrain to ultraviolet light?***J Comp Physiol*2000,**186:**707–715.View ArticleGoogle Scholar - Kenagy GJ:
**Center-of-gravity of circadian activity and its relation to free-running period in Two rodent species.***J Interdiscipl Cycle Res*1980,**11:**1–8.View ArticleGoogle Scholar - Potdar S, Sheeba V:
**Large Ventral lateral neurons determine the phase of evening activity peak across photoperiods in drosophila melanogaster.***J Biol Rhythms*2012,**27:**267–279.PubMedView ArticleGoogle Scholar - Bliss CI:
**Periodic regression in biology and climatology.***The Connecticut Agric Exp Station Bull*1958,**615:**3–47.Google Scholar - Batschelet E:
*Circular statistics in biology*. London: Academic Press; 1981.Google Scholar - Bingham C, Arbogast B, Lee JK, Halberg F:
**Inferential Statistical methods for estimating and comparing cosinor parameters.***Chronobiologia*1982,**9:**397–439.PubMedGoogle Scholar - Grafakos L:
*Classical Fourier analysis*. 2nd edition. New York: Springer; 2008.Google Scholar - Fourier JBJ:
*Théorie Analytique de la*. Chaleur Paris: F. Didiot; 1882.Google Scholar - Cambas T, Diez-Noguera A:
**Evolution of rat motor activity circadian rhythm under three different light patterns.***Physiol Behav*1991,**40:**63–68.View ArticleGoogle Scholar - Cambras T, Castejón L, Díez-Noguera A:
**Social interaction and sex differences influence rat temperature circadian rhythm under LD cycles and constant light.***Physiol Behav*2011,**103:**365–371.PubMedView ArticleGoogle Scholar - Lomb NR:
**Least-squares frequency analysis of unequally spaced data.***Astrophys Space Sci*1976,**39:**447–462.View ArticleGoogle Scholar - Scargle JD:
**Studies in astronomical time series analysis. II. Statistical aspects of spectral analysis of unevenly spaced data.***Astrophys J*1982,**302:**757–763.Google Scholar - Sokolove PG, Bushell WN:
**The chi square periodogram: its utility for analysis of circadian rhythms.***J Theor Biol*1978,**72:**131–160.PubMedView ArticleGoogle Scholar - Canal-Corretger MM, Vilaplana J, Cambras T, Diez-Noguera A:
**Functioning of the rat circadian system is modified by light applied in critical postnatal days.***Am J Physiol*2001,**280:**R1023-R1030.Google Scholar - Ruf T:
**The lomb-scargle periodogram in biological rhythm research: analysis of incomplete and unequally spaced time-series.***Biol Rhythm Res*1999,**30:**178–201.View ArticleGoogle Scholar - Van Dongen HPA, Olofsen E, VanHartevelt JH, Kruyt EW:
**A Procedure of multiple period searching in unequally spaced time-series with the lomb-scargle method.***Biol Rhythm Res*1999,**30:**149–177.PubMedView ArticleGoogle Scholar - Zeichmeister M, Kürster M:
**The generalised lomb-scargle periodogram.***A & A*2009,**496:**577–584.View ArticleGoogle Scholar - Vivanco P, Rol MA, Madrid JA:
**Two steady-entrainment phases and graded masking effects by light generate different circadian chronotypes in Octodon degus.***Chronobiol Intl*2009,**26:**219–241.View ArticleGoogle Scholar - Van Someren EJ, Swaab DF, Colenda CC, Cohen W, McCall WV, Rosenquist PB:
**Bright light therapy: improved sensitivity to its effects on rest-activity rhythms in Alzheimer patients by application of nonparametric methods.***Chronobiol Int*1999,**16:**505–518.PubMedView ArticleGoogle Scholar - Leise TL, Harrington ME:
**Wavelet-based time series analysis of circadian rhythms.***J Biol Rhythms*2011,**26:**454–463.PubMedView ArticleGoogle Scholar - Leise TL, Indic P, Paul MJ, Schwartz WJ:
**Wavelets meets actogram.***J Biol Rhythms*2013,**28:**62–68.PubMedView ArticleGoogle Scholar - Poon AMS, Wu BM, Poon PWF, Cheung EPW, Chan FHY, Lam FK:
**Effect of cage size on ultradian locomotor rhythm of laboratory mice.***Physiol Behav*1998,**62:**1253–1258.View ArticleGoogle Scholar - Chan FHY, Wu BM, Lam FK, Poon PWF, Poon AMS:
**Multiscale characterization of chronobiological signals based on the discrete wavelet transform.***IEEE Transact Biomed Engineering*2000,**47:**88–95.View ArticleGoogle Scholar - Wu BM, Chan FHY, Lam FK, Poon AMS, Poon PWF, Chow DCS:
*Characterization of the chronobiological signals based on the continuous wavelet transform*. Chicago, IL: BMES Conf Proc IEEE; 1997:1620–1623.Google Scholar - Meeker K, Harang R, Webb AB, Welsh DK, Doyle FJ, Bonnet G, Herzog ED, Petzold LR:
**Wavelet measurement suggests cause of period instability in mammalian circadian neurons.***J Biol Rhythms*2011,**26:**353–362.PubMedView ArticleGoogle Scholar

## Copyright

This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.