
Methodological Considerations, Data Reliability, and Data Comparability

Wednesday, December 21, 2011

Survey Methodology

The 2001 NHTS was conducted by Westat, a social science research firm in Rockville, Maryland, for the U.S. Department of Transportation. The NHTS is designed to provide detailed cross-sectional information on daily and long-distance passenger travel in the United States.

Sample Design

A nationally representative sample of 60,000 individuals in 26,000 households participated in the 2001 NHTS. The sample included all members of the household, including children under the age of five. Sampling for the NHTS involved a random digit dial (RDD) telephone sampling design. 2001 NHTS data were collected from March 2001 to May 2002. Data were collected about all household members either directly from the respondent or through a proxy (see end note 12). A household in which at least 50 percent of the adults completed the survey was considered a responding household and included in the data file. Individuals from sampled households were asked to complete a travel diary documenting their daily trips to aid recall, and to use the diary when responding to the interviewer. The overall response rate was 41 percent. (The household screener interview rate was 58 percent, and the usable household rate was 71 percent.)
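The reported rates are consistent with the overall response rate being the product of the two stage-level rates. The short sketch below checks that arithmetic. The function name is hypothetical, and the multiplicative relationship is an assumption inferred from the figures, not a formula stated in the report.

```python
def overall_response_rate(screener_rate: float, usable_household_rate: float) -> float:
    """Combine two stage-level response rates by multiplication.

    Assumption: the overall rate is the product of the household screener
    rate and the usable household rate, which matches the reported figures.
    """
    return screener_rate * usable_household_rate


# Rates reported in the text: 58 percent screener, 71 percent usable household.
rate = overall_response_rate(0.58, 0.71)
print(f"{rate:.1%}")  # about 41 percent, matching the reported overall rate
```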

Data Reliability

Estimates produced using data from the NHTS are subject to two types of error: sampling and nonsampling. Nonsampling errors are errors made in the collection and processing of data. Sampling errors occur because the data are collected from a sample rather than a census of the population.

Nonsampling Errors

Nonsampling error is the term used to describe variations in the estimates that may be caused by population coverage limitations, as well as data collection, processing, and reporting procedures. The sources of nonsampling errors are typically problems like unit and item nonresponse, differences in respondents' interpretations of the meaning of the questions, response differences related to the particular time the survey was conducted, and mistakes in data processing. In general, it is difficult to identify and estimate either the amount of nonsampling error or the bias caused by this error. In the 2001 NHTS, design efforts were made to prevent such errors from occurring and to compensate for them where possible. For instance, a travel diary was used to aid the recall of daily trips. In addition, details on the travel day were collected within six days of it occurring, while events of that day were still relatively fresh in the minds of the respondents. Other more standard procedures, such as online computer-assisted telephone interview (CATI) editing, were used as well.

Standard Errors and Weights

In order to produce national estimates from the 2001 NHTS data, the sample data were weighted. Weighting adjusts for selection probabilities at the household level and for household and individual nonresponse. The 2001 NHTS data files contain two kinds of weights:

  1. weights for "usable" households, in which person interviews were completed with at least 50 percent of adults in the household (26,038 households in the sample), and
  2. weights for "100 percent" households, in which person interviews were completed with all adults in the household (22,178 households in the sample).
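To illustrate how weights enter an estimate, the sketch below computes a weighted proportion from a handful of records. It is a minimal illustration only: the function name, the indicator values, and the weight values are made up for the example and are not NHTS data.

```python
from typing import Sequence


def weighted_mean(values: Sequence[float], weights: Sequence[float]) -> float:
    """Return the weighted mean of values; with 0/1 values this is a weighted proportion."""
    if len(values) != len(weights):
        raise ValueError("values and weights must have the same length")
    total_weight = sum(weights)
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive number")
    return sum(v * w for v, w in zip(values, weights)) / total_weight


# Hypothetical toy records (not survey data): 1 if the person reported a walk trip.
made_walk_trip = [1, 0, 0, 1]
# Hypothetical person weights: how many people each respondent represents.
person_weight = [1200.0, 800.0, 1500.0, 500.0]

share = weighted_mean(made_walk_trip, person_weight)
print(share)  # 0.425, compared with an unweighted share of 0.5
```

Respondents with larger weights stand in for more of the population, so the weighted share can differ noticeably from the raw sample share.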

All estimates in this report are weighted (see end note 13). In addition to properly weighting the responses, special procedures for estimating the statistical significance of the estimates were employed because the data were collected using a complex sample design. Complex sample designs, like that used in the NHTS and other large-scale federal surveys, result in data that do not comply with the assumptions normally required to assess the statistical significance of the results. Frequently, the standard errors of the estimates are larger than would be expected if the sample were a simple random sample and the observations were independent and identically distributed random variables. Replication methods of variance estimation were used to reflect the actual sample design used in the 2001 NHTS. A form of the jackknife replication method (JK2) using 99 replicates was used to compute approximately unbiased estimates of the standard errors of the estimates in the report, using the statistical software WesVarPC. The jackknife methods were used to estimate the precision of the reported national percentages and means.
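The general idea behind replication variance estimation can be sketched as follows: the estimate is recomputed once per set of replicate weights, and the spread of those replicate estimates around the full-sample estimate gives the standard error. This is a generic illustration, not the WesVarPC implementation. The function name and the toy values are hypothetical, and the scaling factor is left as a parameter because it depends on the replication scheme; the default of 1.0 is an assumption, and the User's Guide should be consulted for the value appropriate to the NHTS replicate weights.

```python
import math
from typing import Sequence


def replicate_standard_error(
    full_estimate: float,
    replicate_estimates: Sequence[float],
    factor: float = 1.0,  # assumption: scheme-dependent multiplier, check the survey documentation
) -> float:
    """Standard error from replicate estimates.

    Computes sqrt(factor * sum((theta_r - theta)^2)), where theta is the
    full-sample estimate and theta_r is the estimate from replicate r.
    """
    sum_sq = sum((r - full_estimate) ** 2 for r in replicate_estimates)
    return math.sqrt(factor * sum_sq)


# Hypothetical toy values (a real NHTS run would use 99 replicate estimates).
full = 10.0
replicates = [10.1, 9.9, 10.2, 9.8]
se = replicate_standard_error(full, replicates)
print(round(se, 4))  # 0.3162
```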

Statistical Procedures

Comparisons made in the text were tested for statistical significance to ensure that the differences are larger than might be expected due to sampling variation. All differences described in the text are statistically significant at a 0.05 level. When comparing estimates between categorical groups (e.g., sex, income), the difference in the estimates (mean or percent) was computed along with a confidence interval. If the confidence interval contained the value of zero, then the estimates had no detectable statistically significant difference.

The confidence interval of the difference of two proportions was computed as:

$$(\mathrm{Est}_1 - \mathrm{Est}_2) \pm 1.96\,\sqrt{\mathrm{se}_1^2 + \mathrm{se}_2^2}$$

The confidence interval of the difference of two means was computed as:

$$(\mathrm{Est}_1 - \mathrm{Est}_2) \pm 1.96\,\sqrt{\mathrm{se}_1^2 + \mathrm{se}_2^2}$$

where $\mathrm{Est}_1$ and $\mathrm{Est}_2$ are the independent estimates being compared, and $\mathrm{se}_1$ and $\mathrm{se}_2$ are their corresponding standard errors.
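The comparison rule described above can be sketched in a few lines: compute the difference of the two estimates, build the 95 percent confidence interval from the two standard errors, and call the difference significant only if the interval excludes zero. The function names and the input values are hypothetical and chosen purely for illustration.

```python
import math
from typing import Tuple


def difference_ci(
    est1: float, se1: float, est2: float, se2: float, z: float = 1.96
) -> Tuple[float, float]:
    """Confidence interval for est1 - est2, assuming independent estimates."""
    diff = est1 - est2
    half_width = z * math.sqrt(se1 ** 2 + se2 ** 2)
    return diff - half_width, diff + half_width


def is_significant(ci: Tuple[float, float]) -> bool:
    """A difference is treated as significant when the interval excludes zero."""
    low, high = ci
    return not (low <= 0.0 <= high)


# Hypothetical proportions and standard errors (not taken from the report).
ci_large_gap = difference_ci(0.30, 0.010, 0.25, 0.012)
ci_small_gap = difference_ci(0.30, 0.010, 0.29, 0.012)

print(is_significant(ci_large_gap))  # True: the interval lies entirely above zero
print(is_significant(ci_small_gap))  # False: the interval contains zero
```

The same function serves for means and for proportions, since the report applies the same formula to both.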

Data Comparability: Changes from Prior Surveys

The 2001 survey combines two earlier surveys: the Nationwide Personal Transportation Survey (NPTS) and the American Travel Survey (ATS). The ATS, conducted in 1995 by the Census Bureau for BTS, was a survey of trips of 100 miles or more taken over the course of a calendar year. There were methodological difficulties in trying to use the 1995 NPTS and the 1995 ATS together to form a picture of total household travel by the American public. The combined survey approach for the 2001 NHTS was designed to resolve this by providing one data source for the full continuum of person travel. In addition to combining the two surveys, the threshold for longer trips was lowered to 50 miles or more to obtain a better sample of the often-overlooked trips in the 50- to 100-mile range.

For the first time in the NPTS series, travel data were collected for household members under the age of five years. All previous surveys collected travel only from household members age five and older, on the assumption that children under the age of five only made trips with other household members. However, this overlooked trips made by this young group with day care providers as part of a preschool activity, or with other nonhousehold members.

There are many differences in questionnaire format that affect estimates. For example, in the 2001 NHTS, a specific probe was included about walking trips to more accurately capture such trips. As a result, it is possible that the increase in the proportion of all trips that are walk trips compared to the 1995 NPTS may be due to this additional probe, rather than a true increase in the actual numbers of walk trips.

In addition to changes in the survey design and administration, other factors affected travel behavior and possibly data collection during the 2001 NHTS. The September 11, 2001 attacks on the World Trade Center Towers and the Pentagon, the security measures that followed, and the ensuing sense of insecurity in the nation severely disrupted travel in the United States for months, changing the amount and modes of travel during that period. In addition, during the last few months of 2001, the public's suspicion regarding unanticipated mail packages was heightened after letters containing anthrax were sent to individuals through the mail. Although the impact of this on travel is yet to be determined, it may have affected 2001 NHTS response rates because the survey had a mail component.

Further information on the survey, sample design, comparability with past surveys, data editing and processing, data file structures, weights, etc. is available in the 2001 NHTS User's Guide that will be released with the data.

End Notes

12. In the NHTS data collection, an adult household member always served as the proxy for a child under age 14. Proxies were also requested for persons age 14 and 15 years. However, if an adult household member requested that the interviewer speak directly with these teenagers, the interview was conducted with the subject. Proxies were not initially requested for household members 16 years and older, but were allowed under limited conditions. Proxy interviews were conducted for 23.4 percent of the respondents 16 and older.

13. Weights used in this report: The 50 percent daily travel and person weights (WTTRDFIN and WTPERFIN) and the 50 percent travel period weights (WTTPFIN) were used. WTPERFIN sums to the population of all noninstitutionalized individuals in the United States while WTTRDFIN and WTTPFIN sum to the annualized estimate of the total number of daily and long-distance trips taken by individuals in the nation.