Chapter 2 Time Series Analysis of Passenger Data: Pre- and Post- September 11, 2001
Introduction
In the previous chapter, we noted that some of the differences between the pre- and post-September 2001 data could be attributed to the nature of time series data. Differences due to seasonality confound the measurement of the September 2001 impact. In order to understand the time series characteristics of the passenger data, we turn our attention to different sets of data that measure monthly passenger movement. The following sections will analyze three sets of monthly data: air revenue passenger miles, rail passenger miles, and vehicle miles traveled. The data do not separate local and long-distance travel.
To measure the effect of September 11, 2001, we forecast these three series based on the data from January 1990 through August 2001 and then compare the forecasts with what actually occurred. Air showed the greatest differences: the values of the aviation time series did not begin to enter the prediction intervals until December 2003, indicating that air passenger miles only began to approach the previously expected values in 2004. Rail miles did not appear to experience an immediate impact from September 11, 2001. Vehicle miles experienced a one-month drop in September 2001 and an additional one-month drop in September 2002.
The following sections provide the details behind these findings.
About the Data
The first three figures (figures 9, 10, and 11) provide graphs of the passenger time series data: air revenue passenger miles (RPM), rail passenger miles (PM), and vehicle miles traveled (VMT). The air RPM are found in the T-1 dataset compiled for the T-100 database, taken from the U.S. Department of Transportation, Research and Innovative Technology Administration, Bureau of Transportation Statistics (BTS), Office of Airline Information (OAI). The data for rail PM are compiled from the U.S. Department of Transportation, Federal Railroad Administration (FRA), Office of Safety. VMT data are taken from the Traffic Volume Trends reports from the Federal Highway Administration, Office of Highway Policy Information.3 The datasets studied are all monthly; each time series begins in January 1990 and runs through June 2004. Each of the three graphs contains a vertical line indicating September 2001. (Appendix C provides a table with all the data.)
Seasonal Analysis
A quick perusal of the three passenger datasets reveals that the data are strongly seasonal. But it is difficult to compare the degree of seasonality across the three time series. The next three graphs attempt to simplify the comparison. Figures 12, 13, and 14 provide histograms of the seasonality of each series, which required an additional assumption that the seasonality of each series does not change much over time to justify averaging the monthly components. The seasonality is measured as the percent deviation from the underlying trend and consists of the average for the five years of monthly data prior to September 2001, that is, September 1996 through August 2001. The method for decomposing the seasonality and the trend will be dealt with in the next section.
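The seasonality measure described above (percent deviation from the underlying trend, averaged by calendar month) can be sketched as follows. This is a minimal illustration, not the report's actual procedure: the function name and the sample numbers are hypothetical, and the trend values are assumed to come from a prior decomposition step.

```python
from collections import defaultdict


def average_seasonal_deviation(months, actual, trend):
    """Return {calendar month: mean percent deviation of actual from trend}.

    months -- calendar month (1-12) of each observation
    actual -- observed values
    trend  -- estimated trend values for the same observations
    """
    buckets = defaultdict(list)
    for m, y, t in zip(months, actual, trend):
        # Percent deviation of the observation from its trend value.
        buckets[m].append(100.0 * (y - t) / t)
    # Average the deviations within each calendar month.
    return {m: sum(v) / len(v) for m, v in sorted(buckets.items())}


# Hypothetical values for two Januaries and two Julys (not report data).
months = [1, 7, 1, 7]
actual = [90.0, 115.0, 92.0, 120.0]
trend = [100.0, 100.0, 100.0, 100.0]

seasonal = average_seasonal_deviation(months, actual, trend)
print(seasonal)
```

Averaging across years is only meaningful under the assumption stated in the text, namely that the seasonal pattern does not change much over the averaging window.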
Note that the vertical axis for each graph runs from +25% to -25% deviation. In this way, the degree of deviation is comparable across the figures. While the patterns of seasonality are comparable (less travel in winter, and more travel in summer), we note that rail PM tends to vary the most and VMT the least. Part of this may be accounted for by the fact that rail deals with fewer miles, while VMT incorporates more; extreme values have less of an impact when the number of observations is large.
The next section describes how the seasonal and trend components were created.
STAMP Modeling
One approach to studying seasonality is to decompose a time series into three components: the trend, the seasonal factors, and the irregular component. STAMP, which stands for Structural Time Series Analyser, Modeller and Predictor,4 allows us to take a set of time series data and break it down (or “decompose” it) into components that cannot be observed directly but have intuitive appeal. The trend component represents the long-term direction of the data; the seasonal component reflects changes due to the time within the year; and the irregular component captures what is “left over,” or not explained by the trend or seasonal behavior. (For an explanation of the theory, see appendix D.) Each of these components can be stochastic, or changing over time (S); fixed, or unchanging over time (F); or nonexistent (N). Through statistical testing, we identified the best-fitting model for each time series for the period January 1990 through August 2001 (see table 1). By fitting the data prior to September 2001, we can forecast that model through the current time and compare these pre-September 2001 forecasts with the actual data.
While all three series have models with levels and seasonality changing stochastically over time, the air RPM and VMT exhibit fixed underlying trends. The rail PM has no long-term trend for the period under study.
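For readers who prefer notation, the model structure just described can be written in the textbook form of a basic structural time series model. This is a sketch under assumptions: the symbols are introduced here for illustration, the dummy-variable form of the seasonal component is assumed, and the exact specification used in STAMP and in appendix D may differ.

```latex
\begin{aligned}
y_t &= \mu_t + \gamma_t + \varepsilon_t, & \varepsilon_t &\sim N(0, \sigma_\varepsilon^2) \\
\mu_{t+1} &= \mu_t + \beta + \eta_t, & \eta_t &\sim N(0, \sigma_\eta^2) \\
\gamma_{t+1} &= -\sum_{j=1}^{11} \gamma_{t+1-j} + \omega_t, & \omega_t &\sim N(0, \sigma_\omega^2)
\end{aligned}
```

Here $y_t$ is the monthly observation, $\mu_t$ the level, $\gamma_t$ the seasonal component, and $\varepsilon_t$ the irregular component. A stochastic level or seasonal corresponds to $\sigma_\eta^2 > 0$ or $\sigma_\omega^2 > 0$. A fixed trend corresponds to a constant slope $\beta$ (air RPM and VMT), and no trend corresponds to $\beta = 0$ (rail PM).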
Using the above models, we forecast from September 2001 through December 2004 to help us understand the changes that occurred over that forecast period. We next provide the graphs of the three sets of data, with forecasts from September 2001 through December 2004 (figures 15, 16, and 17). The prediction intervals for the forecasts are also provided.
The following three graphs (figures 18, 19, and 20) compare the forecasts based on the pre-September 2001 data with the actual data from September 2001 forward. The large I-shaped markers on the graphs indicate the range of values within the prediction interval.
As expected, air RPM experienced the greatest impact. The forecasted and actual RPM have yet to match one another; only in 2004 did the actual air RPM cross into and remain in the 95 percent prediction interval, which indicates that aviation only began to return to the expected level of RPM in 2004. Rail PM actuals tended to be close to the forecasted values. VMT shows little difference between the actuals and the forecasts, with the exception of September 2001 and September 2002, thereby indicating little long-term impact on overall VMT levels.
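The comparison of actual values against a prediction interval can be sketched as below. This is an illustrative assumption about how such a check might be coded, not the report's procedure: the helper names and the numbers are hypothetical, and the interval bounds would in practice come from the fitted forecasting model.

```python
def inside_interval(actual, lower, upper):
    """Flag each observation that lies within its prediction interval."""
    return [lo <= y <= hi for y, lo, hi in zip(actual, lower, upper)]


def first_sustained_return(flags):
    """Index from which every later flag is True, or None if the last is False."""
    idx = None
    # Walk backward from the end while observations stay inside the interval.
    for i in range(len(flags) - 1, -1, -1):
        if flags[i]:
            idx = i
        else:
            break
    return idx


# Hypothetical monthly values and interval bounds (not report data).
actual = [30.0, 35.0, 41.0, 44.0, 46.0]
lower = [40.0, 40.0, 40.0, 42.0, 43.0]
upper = [50.0, 50.0, 50.0, 52.0, 53.0]

flags = inside_interval(actual, lower, upper)
start = first_sustained_return(flags)
print(flags, start)
```

Looking for the first month after which the series stays inside the interval mirrors the "cross into and remain in" wording used for air RPM.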
To define the impact of September 2001 on the actual data more accurately, we modeled the impact of September 11, 2001, using the full set of data (January 1990 to June 2004), with the three-step intervention procedure developed by Ord and Young (2004). Three components of the intervention are tested: an additive outlier (AO) on September 2001, a temporary decay (TC) starting October 2001, and a level shift (LS) starting November 2001.
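The three intervention terms can be represented as regressor series: a one-month pulse for the AO, a geometrically decaying pulse for the TC, and a step for the LS. The sketch below is an assumption about the usual form of such regressors, not a reproduction of the Ord and Young procedure. The function names are hypothetical, and the `retain` value of 0.8 (the fraction of the effect carried into the next month) is purely illustrative.

```python
def month_index(year, month, start_year=1990, start_month=1):
    """Zero-based position of a month in a series starting January 1990."""
    return (year - start_year) * 12 + (month - start_month)


def intervention_regressors(n, ao_at, tc_at, ls_at, retain):
    """Build AO, TC, and LS regressors of length n.

    retain -- assumed fraction of the temporary effect kept each month
    """
    # Additive outlier: 1 in a single month, 0 elsewhere.
    ao = [1.0 if t == ao_at else 0.0 for t in range(n)]
    # Temporary change: starts at 1, then decays geometrically.
    tc = [retain ** (t - tc_at) if t >= tc_at else 0.0 for t in range(n)]
    # Level shift: 0 before the start month, 1 from then on.
    ls = [1.0 if t >= ls_at else 0.0 for t in range(n)]
    return ao, tc, ls


n = month_index(2004, 6) + 1  # January 1990 through June 2004
ao, tc, ls = intervention_regressors(
    n,
    ao_at=month_index(2001, 9),   # September 2001
    tc_at=month_index(2001, 10),  # October 2001
    ls_at=month_index(2001, 11),  # November 2001
    retain=0.8,                   # illustrative value only
)
print(n, sum(ao), tc[month_index(2001, 11)], sum(ls))
```

Each regressor would then enter the time series model with its own coefficient, and the significance of each coefficient is what the text reports.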
For air RPM, the AO was significant, as was expected; the TC of 8 percent 5 also proved to be significant. The LS term was not significant – indicating that there may not be a permanent shift downward in the trend of the time series.
For rail PM, none of the three-step intervention terms were significant, indicating that September 2001 did not have a significant impact on the series. However, as can be seen in the graph, there was an unexpected short-term drop in the rail PM the following year (September through November), which may indicate an avoidance of travel on the one-year anniversary.
For VMT, the AO at September 2001 was significant, but then the data returned to the expected pattern. So it may be that people avoided car travel in September 2001, but then returned to their usual driving behavior the following month. The summary of these intervention results is provided in table 2.
The next section conducts an analysis comparing the NHTS data with the time series analysis; the time series summaries used in the analysis are available in appendix E.
Comparison of Pre- and Post- 9/11 Monthly Trips
In order to simplify the comparison between the time series analysis and the NHTS data, only the forecast results from the time series analysis for the months of the NHTS data collection are considered. The three tables in appendix E show, according to the time series forecasting, both the actual values and the forecasts, along with the corresponding calculated forecast errors.
The travel estimates by mode presented in appendix E can be compared to the NHTS trip estimates to indicate what portion, if any, of the changes in the number of trips between the NHTS pre- and post-9/11 datasets can be attributed to the events of 9/11 and the following months covered by the NHTS survey.
Both the pre- and post-9/11 NHTS datasets were reweighted to give annual trip estimates for the U.S. population. However, the survey periods covered by each dataset do not give equivalent time periods before and after September 11, 2001. The NHTS survey was conducted from March 2001 to May 2002. The pre- and post-9/11 datasets divide the persons surveyed during that period into two groups: March 2001 to September 11, 2001, in the pre-9/11 dataset, and September 12, 2001, to May 2002 in the post-9/11 dataset. The pre-9/11 dataset covers three full months and parts of four other months. The post-9/11 dataset covers five full months and parts of four other months. The repeated months between the two datasets include March 2001 and March 2002, April 2001 and April 2002, May 2001 and May 2002, and the pre- and post-9/11 portions of September 2001. Because the two datasets cover unequal time periods, their total estimated trips cannot be compared directly. However, it is possible to compare the trip estimates by calculating a monthly average number of trips from each dataset to adjust for the unequal time periods. Table 3 gives the comparison of NHTS monthly trip estimates to the travel estimates by mode in tables E1 through E3.
Unfortunately, the comparisons are very rough. For highways, vehicle miles traveled includes all personal vehicle, public transit, and freight travel over all public highways in the United States. This is compared to NHTS trip estimates that include only long-distance personal vehicle trips. For passenger rail, Amtrak passenger data are compared to the NHTS long-distance trips by rail, which likely include some commuter rail trips. There are no long-distance bus data available for comparison to the NHTS bus trip estimates.
Air trips provide the most interesting comparison. Average monthly air RPM experienced a drop of 26.4 percent between the pre-9/11 time period and the post-9/11 time period. When the post-9/11 time period was forecast using previous historical trends, the drop was only 7.8 percent. The difference between those two percentage decreases, about 18.6 percentage points, can be viewed as the additional decrease in air travel beyond normal historical seasonal trends. The estimated decline from NHTS data of about 22 percent is roughly comparable to the 26 percent decline in air RPM. The 22 percent cannot be divided directly into seasonal and nonseasonal components. Using the proportional change in air RPM as a proxy, however, gives a decline of roughly 6 to 7 percent for normal seasonal trends, and the rest, about 15 to 16 percent, can be attributed to other factors, including 9/11.
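The proportional split in the preceding paragraph can be reproduced with a few lines of arithmetic. The percentages are those quoted in the text; the variable names are introduced here for illustration, and the proportional allocation is the proxy assumption the text itself describes.

```python
# Percent declines quoted in the text.
rpm_actual_drop = 26.4    # observed drop in average monthly air RPM
rpm_forecast_drop = 7.8   # drop implied by the pre-9/11 forecast
nhts_drop = 22.0          # approximate drop in NHTS air trip estimates

# Additional decrease beyond normal seasonal trends, in percentage points.
excess = rpm_actual_drop - rpm_forecast_drop

# Share of the RPM drop explained by normal seasonal trends.
seasonal_share = rpm_forecast_drop / rpm_actual_drop

# Apply the same share to the NHTS decline.
nhts_seasonal = nhts_drop * seasonal_share
nhts_other = nhts_drop - nhts_seasonal

print(round(excess, 1), round(nhts_seasonal, 1), round(nhts_other, 1))
```

The result, about 6.5 percent seasonal and 15.5 percent from other factors, falls within the ranges stated in the text.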
3 Quality information on the air RPM and rail PM can be found at: http://www.bts.gov/programs/economics_and_finance/transportation_services_index/html/source_and_documentation_and_data_quality.html. Information on the quality of the VMT data can be found at: http://www.fhwa.dot.gov/ohim/tvtw/tvtpage.htm.
4 See Koopman et al. (2000) for more detail on STAMP.
5 Four decay rates were tested: 6, 7, 8, and 9 percent. For air RPM, the best fitting decay rate was 8 percent, which results in a half-life of 3 months beyond September, or December 2001.