CPANDA Data Set Versions
Why researchers have multiple versions of a data set
The data sets in CPANDA are provided by the organizations or researchers that originally collected the data. In the process of collecting and analyzing data, it is common for multiple versions of a data set to be created. For example, data may have been omitted or recoded for specific analytic purposes and then saved as a variation on the original data set. Some versions of the data set may be "cleaner" than earlier versions, which occasionally include ineligible survey respondents. Researchers may also hire analysts to conduct specific analyses of the data, resulting in still more versions of the data set.
Data set versions archived at CPANDA
In some cases, many years may have passed before the data were archived with CPANDA. For this reason, the version of a particular data set deposited with CPANDA may not be identical to the version that was used to generate findings reported in published documents.
CPANDA makes every effort to obtain the most complete version of each data set it archives and to document as completely as possible any significant differences between the archived version of the data and what is known about the version that was used to generate published findings. Users of CPANDA data sets should review this documentation carefully before conducting new analyses of the data.
Occasionally, a more complete version of a particular data set becomes available and is added to the Archive, replacing a less complete version. When this happens, CPANDA will notify users via announcements on the web site.