The purpose of this study was to determine whether different methods of binary seasonality classification agree when applied to time series derived from diagnosis codes in observational data. We used databases of varying size, type, and provenance to rule out the possibility that any discordance was merely an artifact of database choice. The results, shown in Fig. 1, indicate that the methods are generally inconsistent with one another, with discordance observed in 60 to 80% of time series across the 10 databases. As Tables 3, 4, and 5 reveal, the methods also exhibit considerable within-database variation even when considering only the proportion of time series classified as seasonal. Because this variation appears across all databases and significance levels, its source is not the data but the methods themselves.

Sources of discord

Ultimately, the discord stems from the different ways in which the methods assess seasonality. While there are similarities, each method focuses on a different aspect of a time series (Table 2). For instance, half the methods (ET, AA, AR, ED) fit a time series with a hypothetical model and test the model for seasonality, while the other half (FR, KW, WE, QS) test different aspects of a time series directly, without a hypothesized model. To take the discussion further and generalize where we can, we distinguish between types of concordance and types of peaks. Regarding concordance, we define “positive concordance” as unanimous agreement among the methods that a time series is seasonal, and “negative concordance” as unanimous agreement that it is non-seasonal. For a given time series, the methods are therefore discordant when there is neither positive nor negative concordance. Regarding peaks, we say that peaks are “persistent” if they occur year after year, and “consistent” if they occur in the same month each year. We make this distinction because peaks relate to two aspects of time series analysis that are central to seasonality: variation and autocorrelation. Peaks can, of course, come in different sizes; time series with large peaks exhibit greater variation than those with small peaks. Persistent peaks (be they small or large) suggest the possibility of underlying cyclical behavior in the time series. Consistent peaks, to the extent that they are consistent, indicate autocorrelation in the time series. We’ll use Figs. 2 and 3 to navigate the remainder of the discussion.
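
To make these definitions concrete, the following sketch (in Python; not the code used in this study) classifies the joint verdict of the eight methods for a single time series. The method names follow Table 2, and the example verdicts are invented for illustration.

    from typing import Dict

    METHODS = ["ET", "AA", "AR", "ED", "FR", "KW", "WE", "QS"]  # the eight binary methods (Table 2)

    def concordance(results: Dict[str, bool]) -> str:
        """Classify one time series as 'positive', 'negative', or 'discordant'.

        `results` maps each method name to its binary verdict
        (True = seasonal, False = non-seasonal).
        """
        votes = [results[m] for m in METHODS]
        if all(votes):
            return "positive"    # unanimous: seasonal
        if not any(votes):
            return "negative"    # unanimous: non-seasonal
        return "discordant"      # any mixture of verdicts

    # Hypothetical example: a single dissenting method makes the series discordant.
    example = {m: True for m in METHODS}
    example["AR"] = False
    print(concordance(example))  # -> "discordant"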

From Fig3.ts1 (N = 2809) and Fig3.ts9 (N = 1498), we learn that the methods exhibit concordance only 4307/11,137 = 38.7% of the time. Figure 2 provides further insight into the extent of discord among the methods. Of the 40 unique combinations, some occur more frequently than others, owing to similarities in the testing procedures (Table 2). For instance, methods that group time series data by month and test for differences among the groups assess seasonality differently than methods that fit a hypothetical model and then determine seasonality by minimizing forecast error. Acknowledging these differences is important not only for understanding the amount of observed discord, but also for recognizing that they amount to a disagreement about how seasonality is defined. Indeed, if the methods were highly concordant despite their contrasting approaches, we would have to concede that those approaches are ultimately just different ways of expressing the same aspect of a time series. This can be seen more clearly in Fig. 3. In Fig3.ts1, …, Fig3.ts4 we observe time series that, to the human eye, appear seasonal and very similar. Identifying such time series as seasonal is a very old idea in time series analysis, with Beveridge [24] and Yule [25] employing harmonic functions to model time series with cyclical behavior. However, despite an obvious cyclical pattern and visual similarities, Fig3.ts2, Fig3.ts3, and Fig3.ts4 all exhibit discord. The reason is that, except for ED, the methods do not test for seasonality by fitting the data with harmonic functions. Thus, the different methods of seasonality assessment ultimately amount to different definitions of seasonality.
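
As a hedged illustration of how the concordance rate and the combination counts behind Fig. 2 can be tallied, the sketch below builds an 11,137 × 8 verdict matrix. The matrix is filled with random placeholder verdicts rather than the study’s results, so the printed numbers will not match those reported above; only the tallying logic is of interest.

    import numpy as np

    rng = np.random.default_rng(0)
    # Placeholder verdicts: one row per time series, one Boolean column per method.
    verdicts = rng.random((11_137, 8)) < 0.5

    # Count how often each distinct pattern of verdicts occurs.
    combos, counts = np.unique(verdicts, axis=0, return_counts=True)
    n_concordant = counts[combos.all(axis=1)].sum() + counts[(~combos).all(axis=1)].sum()

    print(f"unique combinations observed: {len(combos)} (of 2**8 = 256 possible)")
    print(f"concordance rate: {n_concordant / len(verdicts):.1%}")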

As we’ve mentioned, the behavior of peaks plays an important role in concordance. We’ll use Fig. 3 further to explore the relationship between peaks, variation, and discord, and to provide general principles as to when a method is more likely to classify a time series as seasonal rather than non-seasonal.

Positive concordance

Since each method assesses seasonality differently, positive concordance is achieved only when multiple conditions are simultaneously present. Persistent and consistent peaks matter most for ED, AA, AR, and ET. Peaks will result in a seasonal classification by ED as long as there is a sufficient difference between the peaks and troughs in the data. However, even with persistent and consistent peaks, variation over time (particularly among the peaks) can lead to a non-seasonal classification by AA, AR, or ET (Fig3.ts2, Fig3.ts3, and Fig3.ts4). Indeed, we confirmed experimentally that we can achieve positive concordance for the time series in Fig3.ts2, Fig3.ts3, and Fig3.ts4 by removing the data prior to 2016. Since time series with persistent and consistent peaks have high correlation between seasonal lags, they will be classified as seasonal by QS. For FR, KW, and WE, variation matters most: in the absence of the prominent peaks seen in Fig3.ts1, …, Fig3.ts4, sufficient variation in the time series can still lead FR, KW, and WE to a seasonal classification (Fig3.ts6). With regard to positive concordance, then, there is tension among the methods: variation may cause some methods to classify seemingly seasonal time series as non-seasonal (Fig3.ts2, Fig3.ts3, and Fig3.ts4) and seemingly non-seasonal time series as seasonal (Fig3.ts5, …, Fig3.ts8).
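
The role of persistent, consistent peaks for a correlation-based test such as QS can be illustrated with synthetic monthly counts. The sketch below is not the QS statistic itself, only a plain sample autocorrelation at the seasonal lag; the peak month, peak height, and noise level are arbitrary choices for illustration.

    import numpy as np

    rng = np.random.default_rng(1)
    months = np.arange(10 * 12)  # ten years of monthly counts
    # A peak in the same month every year (persistent and consistent), plus noise.
    series = 100 + 50 * (months % 12 == 6) + rng.normal(0, 5, months.size)

    def lag_autocorr(x: np.ndarray, lag: int) -> float:
        """Sample autocorrelation of x at the given lag."""
        x = x - x.mean()
        return float(np.dot(x[:-lag], x[lag:]) / np.dot(x, x))

    print(f"lag-12 autocorrelation: {lag_autocorr(series, 12):.2f}")  # high: yearly peaks line up
    print(f"lag-1 autocorrelation:  {lag_autocorr(series, 1):.2f}")   # much lower: adjacent months do not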

Negative concordance

The relationship between negative concordance and variation is more straightforward. The time series in Fig3.ts5, …, Fig3.ts9 are similar in that one cannot determine the results of the methods by visual inspection alone (recall that any linear trend in each of the original series has been removed prior to method application). Given the similarity of the time series in Fig3.ts5, …, Fig3.ts9, it’s reasonable to wonder why they do not all exhibit negative concordance. Ultimately, time series that are constant, or stationary around a constant mean with minimal variation, will result in negative concordance among the methods. However, even a time series with both large peaks and variation will exhibit negative concordance if there is no monthly or yearly autocorrelation (for instance, a time series generated from N(μ, σ²)). As was noted in the Results section, the 1498 time series for which the methods exhibit negative concordance have a mean variance of 0 to four decimal places.
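
To illustrate the N(μ, σ²) case mentioned above, the sketch below generates pure Gaussian noise with no monthly or yearly structure; the mean and standard deviation are arbitrary. The two printed summaries are simply the kinds of signals discussed earlier, month-group differences and seasonal-lag autocorrelation, both of which should be negligible here.

    import numpy as np

    rng = np.random.default_rng(2)
    noise = rng.normal(loc=100, scale=10, size=10 * 12)  # ten years of monthly "counts"

    # Calendar-month means barely differ, so tests comparing monthly groups find nothing.
    monthly_means = noise.reshape(10, 12).mean(axis=0)
    print(f"spread of monthly means: {monthly_means.std():.2f}")  # small relative to scale=10

    # No autocorrelation at the seasonal lag, so correlation-based tests find nothing either.
    x = noise - noise.mean()
    acf12 = float(np.dot(x[:-12], x[12:]) / np.dot(x, x))
    print(f"lag-12 autocorrelation: {acf12:.2f}")  # expected to be near zero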

Generalization and limitations

We’ve explained general scenarios in which we can expect negative and positive concordance, but further generalization is more difficult. As Fig. 3 reveals, discordant time series fall into many different combinations of discord (M = 2168, …, 1267), making it difficult to predict which particular combination to expect based on visual inspection of a time series alone. However, an immediate consequence of this study is that researchers using different methods are implicitly defining seasonality differently. Given the discordance among the methods, researchers relying on different methods are likely to obtain different results, leading to conflicting understandings of the seasonality of a given time series.

Finally, we note that this evaluation was limited to 10 observational databases and eight methods of binary seasonality classification, and different results may have been observed under different design choices. As explained above, aspects of a time series that influence seasonality classification include variance, autocorrelation, peak persistence, and peak consistency, so time series constructed to manipulate one or more of those aspects could alter concordance. We chose 10 observational databases; adding dozens or hundreds of other databases might reveal different levels of concordance among the methods. Similarly, we chose eight methods of binary seasonality classification, and a different set of methods may have resulted in different levels of concordance.
