Appendix C - Sample Design, Survey Methodology, and Estimation
SAMPLE DESIGN
The sample for the Commodity Flow Survey (CFS) is a stratified three-stage probability design where the first-stage sample units are establishments, the second-stage units are 2-week periods of 1993, and the third-stage units are shipments. In a probability sample, (1) there are distinct samples that can be selected, (2) each sample has a known probability of selection, and (3) one of the distinct samples is chosen.
In the first stage, approximately 200,000 domestic establishments were selected from a universe of 800,000 establishments engaged in mining, manufacturing, wholesale, and selected retail and service activities, as well as auxiliaries (e.g., warehouses) of multiestablishment companies. Establishments classified in farming, forestry, fishing, oil and gas extraction, government, construction, or transportation, and most establishments in retail and services, are not covered by the CFS.
Establishments were selected from the 1992 Standard Statistical Establishment List (SSEL) of business establishments with paid employees. The SSEL, maintained by the Bureau of the Census, is a central multipurpose computerized name and address file of all known multiestablishment firms and single-establishment employer firms. Establishments having 1991 payroll and classified in the kinds of business of interest to the survey were eligible for selection. The establishments in the survey universe were stratified by Standard Industrial Classification¹ (SIC), National Transportation Analysis Region (NTAR), and Type of Operation Code (TOC). (The Department of Transportation (DOT) developed the NTARs to create geographic regions that could be used in conjunction with other DOT data to measure and analyze nationwide patterns of transportation demands and activities.) Within each stratum, (1) the establishments were divided into certainty and noncertainty establishments based on employment size, (2) certainties (typically large firms) were automatically selected, and (3) a sample of noncertainty establishments was selected with probability proportional to estimated size, where the measure of size was based on annual payroll. The manner in which the sample was selected ensured that, if an establishment was twice as large as another establishment, it would typically have twice the chance of being selected. The final sample contained 106,362 certainty establishments and 90,814 noncertainty establishments.
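The selection of noncertainty establishments with probability proportional to size can be illustrated with a minimal sketch. It assumes systematic selection with a random start, which is one common way to draw such a sample; the source does not state which selection procedure was actually used. The function names and the payroll figures are hypothetical.

```python
import random

def pps_inclusion_probabilities(payrolls, n_sample):
    """Inclusion probability of each unit: n_sample * size / total size."""
    total = sum(payrolls)
    probs = [n_sample * p / total for p in payrolls]
    if any(pr > 1 for pr in probs):
        # Assumption: such a unit would be handled as a certainty unit.
        # (The survey itself used an employment-size rule for certainties.)
        raise ValueError("unit too large for PPS; treat it as a certainty")
    return probs

def pps_systematic_sample(payrolls, n_sample, rng=None):
    """Pick n_sample indices by systematic PPS sampling with a random start."""
    rng = rng or random.Random()
    step = sum(payrolls) / n_sample          # sampling interval on the size scale
    start = rng.uniform(0, step)             # random start within first interval
    targets = [start + k * step for k in range(n_sample)]
    selected, cumulative, t = [], 0.0, 0
    for i, size in enumerate(payrolls):
        cumulative += size
        # A unit is selected when a target falls inside its size interval.
        while t < n_sample and targets[t] < cumulative:
            selected.append(i)
            t += 1
    return selected

# Hypothetical annual payrolls for four noncertainty establishments.
payrolls = [100, 200, 300, 400]
probs = pps_inclusion_probabilities(payrolls, n_sample=2)
sample = pps_systematic_sample(payrolls, n_sample=2, rng=random.Random(0))
```

In this sketch the second establishment has twice the payroll of the first and therefore twice the inclusion probability, which is the property the paragraph above describes.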
In the second stage, establishments selected for the CFS were asked to report for a predetermined 2-week period in each of the four quarters of calendar year 1993. Entire 2-week periods were used to reduce the effect of any daily or weekly bias. Each week of the quarter began a different 2-week reporting period, resulting in 13 possible reporting periods originating in the first quarter. Each sampled establishment was randomly assigned one of these thirteen 2-week reporting periods in the first quarter. To avoid potential quarterly cycles, reporting periods in subsequent quarters were assigned so that an establishment did not report at the same time each quarter. In all, responses were obtained from each establishment for 8 out of 52 weeks during 1993.
In the third stage of sampling, for each of the 2-week periods determined in the second stage, a reporting establishment selected a systematic sample of its shipments from its files. The questionnaire provided sampling instructions that typically resulted in a sample of between 20 and 50 shipments being selected each quarter.
SURVEY METHODOLOGY
The 1993 Commodity Flow Survey (CFS) is an establishment-based shipper survey that used mailout/mailback data collection. Respondents were asked to select a sample of their outbound shipments and to report, for each sampled shipment, the major commodity, weight, value, transportation mode(s), origin, destination, and indicators of whether the shipment was an export, hazardous material, or containerized. For exports, we also collected the mode of export and the city and country of destination. For multi-commodity shipments, the respondents were instructed to report the commodity that made up the greatest percentage of the shipment's weight.
Two report forms were used for the survey: the CFS-1000 (the primary questionnaire) and the CFS-2000, which was sent in the fourth quarter to a subsample of establishments. The CFS-2000 contained additional questions about the establishment's transportation equipment and access to shipping facilities. See appendix E for sample questionnaires.
ESTIMATION
Estimates in this survey are derived from weighted shipment data and are then adjusted using several factors to account for nonresponse, undercoverage, and response errors. Selected establishments reported for a sample of their shipments. We weighted these shipments to represent the establishment's shipments for the year. Each establishment's data were then weighted by the inverse of the establishment's probability of being selected into the sample, which allows data from selected establishments to represent nonselected establishments. We also used results from the economic censuses of Mineral Industries, Manufactures, Wholesale, Retail, and Service to construct adjustment factors at the establishment level and at the SIC level. We adjusted individual establishments to the census to correct for sampling error and nonsampling error in the selection of shipments within the establishment. We performed the SIC-level adjustment to correct for sampling error in the selection of establishments and to account for undercoverage and establishment nonresponse.
¹ Standard Industrial Classification Manual: 1987. For sale by Superintendent of Documents, U.S. Government Printing Office, Washington, D.C. 20402. Stock No. 041-001-00314-2.
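The weighting described under ESTIMATION can be sketched as a simple weighted sum. This is an illustration under stated assumptions, not the survey's actual procedure: the shipment weights and adjustment factors are taken as given inputs because the source does not specify how they were computed, and the field names (`selection_prob`, `adjustment`, `shipments`) are hypothetical.

```python
def weighted_total(establishments):
    """Estimate a total (e.g., shipment value) from sampled shipments.

    Each establishment is a dict with hypothetical fields:
      selection_prob: probability the establishment was sampled (1.0 = certainty)
      adjustment:     combined adjustment factor (assumed given)
      shipments:      list of (shipment_weight, value) pairs, where
                      shipment_weight expands a sampled shipment to the
                      establishment's shipments for the year (assumed given)
    """
    total = 0.0
    for est in establishments:
        # Inverse of the selection probability: lets a sampled establishment
        # represent nonselected establishments as well as itself.
        est_weight = 1.0 / est["selection_prob"]
        # Expand sampled shipments to the establishment's annual shipments.
        annual_value = sum(w * v for w, v in est["shipments"])
        total += est_weight * est["adjustment"] * annual_value
    return total

# Hypothetical data: one certainty and one noncertainty establishment.
sample = [
    {"selection_prob": 1.0, "adjustment": 1.0,
     "shipments": [(10, 100.0), (10, 50.0)]},
    {"selection_prob": 0.25, "adjustment": 1.5,
     "shipments": [(5, 20.0)]},
]
estimate = weighted_total(sample)
```

Here the certainty establishment contributes 1,500 and the noncertainty establishment contributes 4 x 1.5 x 100 = 600, for an estimated total of 2,100.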