Though it is widely recognized that people with intellectual and developmental disabilities (IDD) face significant health disparities, the comprehensive data sets needed for population-level health surveillance of people with IDD are lacking. This paucity of data makes it difficult to track and accurately describe health differences, improvements, and changes in access. Many states maintain administrative health databases that, to date, have not been widely used for research purposes. In order to evaluate the feasibility of using administrative databases for research purposes, the authors attempted to validate Massachusetts' administrative health database by comparing it to a large safety net hospital system's patient data regarding cancer screening, and to the state's service enrollment tables. The authors found variable representativeness overall; the sub-population of adults who live in 24-hr supported residences were better represented than adults who live independently or with family members. They also found a fairly low false negative rate for cancer screening data as compared with the “gold standard” of hospital records. Despite some limitations, these results suggest that state-level administrative databases may represent an exciting new avenue for health research. These results should lend context to efforts to study cancer and health screening variables using administrative databases. The present study methods may also have utility to researchers in other states for critically evaluating other state IDD service databases. This type of evaluation can assist researchers in contextualizing their data, and in tailoring their research questions to the abilities and limitations of this kind of database.
It is widely accepted that adults with intellectual and developmental disabilities (IDD) face significant health disparities (Iacono & Sutherland, 2006; Krahn, Hammond, & Turner, 2006). Despite research and advocacy interest in disparity reduction, the United States lacks comprehensive data sets and methods for tracking population health of people with IDD (Janicki et al., 2002; Owens, Kerker, Zigler, & Horwitz, 2006). There is a clear need for additional data sources regarding the health status of this population (Krahn, Fox, Campbell, Ramon, & Jesien, 2010).
National surveys, such as the National Health Interview Survey (NHIS) and the National Health and Nutrition Examination Survey (NHANES), dramatically underrepresent the population of people with IDD. For example, in 2004, people with “mental retardation” made up about 0.1% of the NHIS sample, while they are estimated to make up at least 1%–2% of the general population (Luckasson et al., 2002; Wilkinson, Lauer, Freund, & Rosen, 2011). Underrepresentation of people with IDD in national health data may have several causes, including difficulties with gathering a sample that adequately represents the heterogeneous population of people with IDD, difficulties with identifying the subgroup of people with IDD, unintended participation barriers associated with response modes, difficulties with identification of IDD in medical records, and problems with linking one data source to another (Lennox et al., 2005; Leonard & Wen, 2002). Problems with sampling methods (in some states, group homes are considered exempt from door-to-door sampling) and the way in which “disability” is classified in the NHIS likely also contribute to the disparity in research representation (Centers for Disease Control and Prevention/National Center for Health Statistics, 2009).
In other countries, centralized health surveillance databases provide important data documenting positive and negative changes in health differences between people with IDD and the general population (Lin, 2009; Mulcahy, Mulvany, & Timmons, 1996; Petterson et al., 2005; Smiley, Cooper, Miller, Robertson, & Simpson, 2002). For example, Dutch health surveillance databases are compiled from primary care records; in contrast to the United States, nearly everyone residing in the Netherlands has a community-based general practitioner who oversees and tracks their care (Wullink, van Schrojenstein Lantman-de Valk, Dinant, & Metsemakers, 2007). This kind of centralized database allows stakeholders to better plan for service needs, track disparities over time, and understand causation, successes, and needed areas of improvement, and it enables more targeted, effective advocacy efforts (Leonard, Petterson, Bower, & Sanders, 2003). In the United States, however, population health data are often collected through surveys rather than national registries, and it is unlikely that the country will move to a centralized registry system in the foreseeable future, due to important differences between its health system and health care organization elsewhere (Krahn et al., 2010). Centralized administration of social and medical services may be conducive to the creation and maintenance of health surveillance databases; Finland, for example, uses a streamlined system to track receipt of all services (Westerinen, Kaski, Virta, Almqvist, & Iivanainen, 2007). In the United States, this lack of comprehensive surveillance data has long been recognized by the research community as a barrier to the study and reduction of health disparities. We were therefore interested in investigating other potential sources of research data.
Though we lack centralized registries, some hospitals and hospital systems maintain clinical data warehouses (CDWs). CDWs store patient records, allowing researchers and clinicians to extract variables of interest from patients' clinical charts. However, while CDWs typically contain large numbers of records, a hospital system may serve only small numbers of patients with IDD. In addition, extracting specific variables from the entire medical record can be time consuming, especially in studies with large sample sizes. Records and charts in a CDW track the individual's care and health care utilization at that particular hospital only, and may not provide accurate data for patients who change providers or are seen in multiple health care systems. Though patients with IDD may receive services at the hospital system, and therefore appear in the CDW database, their records are not distinguished from those of individuals without disabilities. Finally, CDWs contain records for all of a hospital system's patients, making it impractical to use CDW data for large-scale health surveillance of the population of people with IDD, as researchers would need to sort through high numbers of records to identify the few belonging to patients with IDD. Nonetheless, CDW data are often considered a relative “gold standard” against which to compare other data sources, because CDWs contain the patient's entire chart and document all care received at that institution.
Currently, many state departments that provide services to people with IDD maintain administrative health databases for tracking client health and facilitating communication between providers. While these databases are often created as part of efforts to optimize care delivery for service users, and were not intended for research purposes, they may represent a new source of valuable data for assessing and tracking population health. These administrative health databases may not represent the entire population of people with IDD, but they have the potential to be highly representative of specific subpopulations, such as people receiving services through Medicaid Home and Community-Based Service waivers, particularly those living in group homes and facilities. Using these administrative databases for secondary analyses of health and health disparities poses potential challenges: misinterpreting data fields designed for administrative purposes, incomplete records, fields that may not directly inform research questions, and changes to data over time resulting from systemic changes (Glasson & Hussain, 2008). However, with careful use, administrative data sets may be a cost-effective way to conduct limited surveillance of a large population, or over a broad time span. Careful linkage may further improve the usefulness of administrative databases for surveillance activities (Glasson & Hussain, 2008). There is a need for research examining the accuracy and degree of representation of these administrative databases in order to guide their use as research tools.
In this study, we evaluated Massachusetts' administrative health database for representativeness and false negative rates. This database is maintained by the Massachusetts Department of Developmental Services (DDS) for administrative purposes, and was not created for research use. The state mandates record maintenance by state-funded residential service providers and DDS service coordinators for certain service recipients. For other service recipients, the state recommends maintenance by DDS service coordinators with known information. We evaluated its representativeness by comparing clients with records in the health database to a known list of clients receiving services in the state, and compared representativeness from different residential settings. We also evaluated the false negative rates of several cancer screening tests in the health database by comparing cancer screening records in this database to records maintained by the CDW at a large, urban safety net hospital network that includes many satellite clinics and community health centers. We recognize that because administrative health databases were not created for research purposes and track individual rather than population health, they might not be the perfect tool for conducting population-level research. However, we wanted to identify both gaps and areas of strength in order to facilitate future research using administrative data sets.
We chose to look at cancer screening for two reasons. First, limited U.S. data and data from other countries suggest that people with IDD receive fewer cancer screenings than the general population (Wilkinson, Cerreto, & Culpepper, 2007; Sullivan et al., 2003), though people with IDD contract cancer at least as often as people without disabilities (Sullivan, Hussain, Threlfall, & Bittles, 2004). People with syndrome-specific and/or profound IDD may have higher incidences of some cancers (such as cancers of the gastrointestinal tract and uterine cancer) than the general population (Patja, Eero, & Iivanainen, 2001). It should be noted that people with IDD have lower incidences of engaging in behaviors associated with elevated cancer risk, such as smoking (Patja et al., 2001), making it especially important to enable research that contributes to our understanding of cancer screening and prevention in people with IDD. We therefore felt that cancer screening would be a topic of interest to IDD researchers who might wish to use administrative data sets. Second, the presence or absence of cancer screening was straightforward to look up in the hospital CDW records, and did not necessitate excessive viewing of patient medical records or potential identifiers. Though other outcomes, such as routine primary care visits and immunizations, would also be of interest to evaluate in the future, we chose to focus our study on cancer screening, as it was straightforward to compare in the two data sets.
In this study, we chose to focus our research questions solely on the representativeness and utility of the administrative database, in hopes of enabling future research focused on cancer screening rates. For this reason, we chose not to calculate and compare cancer screening rates for widely recommended screenings between people with IDD and the general population. We selected Massachusetts because the commonwealth has near-universal health insurance coverage due to a combination of mandatory coverage and generous state assistance, which eliminated insurance coverage and access to screening as potential confounding factors. In addition, the state comprises the large Boston metropolitan area as well as suburban and rural areas, and we felt that representing a mix of urban/suburban and rural areas would be useful to researchers in similar states. We decided to analyze breast, colorectal, and prostate cancer screenings because reliable tests for these cancers exist, are generally accessible, and have relatively high utilization rates in the general public. Because our main purpose was comparing data sources, rather than assessing compliance with cancer screening recommendations, we cast a wider net by examining cancer screening utilization in patients ages 40 and up.
Our specific objectives were to examine health information in the Massachusetts administrative health database to assess: (a) the degree to which it represents the population of adults with IDD eligible for the department's services compared to a comprehensive enrollment table also maintained by the department; and (b) the false negative and false positive rates of specific cancer and health screening variables, using matched medical records from a large hospital system in Boston as a gold standard. We defined false negatives as tests that were performed but not entered in the administrative database, and false positives as tests that were not performed but which appeared in the administrative database. However, due to privacy concerns that prevented us from searching other health facilities' records for tests that appeared in the administrative database but not the hospital database, we chose to focus our study on false negatives.
Our first objective was to evaluate the representativeness of the administrative database as compared to a known master list of individuals receiving state services; we hypothesized that representation would be fairly good. Our second objective was to evaluate the false negative and false positive rates of the administrative database as compared to a large hospital system data warehouse. We hypothesized that the administrative database would be more likely to have a low false positive rate (tests that were not performed being entered into the database) than a low false negative rate (tests that were performed but not entered into the database), because entries are based on clinical records, while service providers' staff time for administrative duties is limited. We also hypothesized that false negative rates would be lower for tests such as mammography that are “notable events” requiring advance preparation or a trip to a different facility. We expected to find variation by residential setting.
Representativeness of the administrative health database is defined as how well it reflects the overall population of service-eligible adults with IDD in the state.
In order to assess representativeness, the clients in the administrative health database were compared to a comprehensive master list of clients receiving state services, created from state service enrollment tables.
Administrative health database
The Massachusetts Department of Developmental Services (DDS) administrative health database is maintained in order to track client health and facilitate communication between providers in an effort to optimize health care delivery for persons who receive services from DDS, the state department charged with overseeing services for people with IDD. The record includes information about health insurance coverage, emergency contacts, consent level and guardian status, current medications, allergies, current medical problems and diagnoses, height and weight, functional status, use of adaptive devices, special needs at medical appointments, contact information for health care providers, immunizations and infectious disease testing, past medical history, and family history. It is important to distinguish the administrative health database from electronic medical records (EMR) maintained by medical professionals in order to track medical care. The department requires that all service providers (such as group home management agencies) update this database at least annually for their active residential service recipients, and recommends that the database be updated any time there are significant changes in the person's information. Clients are more likely to appear in the database if they actively receive certain types of services from the department, including residential services, placement services, and individualized home supports where the provider is responsible for health care coordination for the person. We expected that reporting would be higher for those who receive state-funded services (i.e. residential services and care coordination), as the state can mandate reporting by state-funded service providers. In contrast, reporting was expected to be lower for people who receive fewer services, and for people who live with their families, as the state cannot mandate inclusion of clients for whom a state-funded support provider is not responsible.
We also obtained information about each person's service enrollment, such as in state-funded residential or day programs, directly from enrollment tables maintained by the department and linked by unique identifier to people in the health database. The enrollment tables are updated more frequently, as enrollment determines payment to the service provider, and they therefore serve as a master list of all state enrollees. Using them ensured that our service data were more current than the health database's annual updates would allow.
The Department of Developmental Services permitted extraction of several variables of interest from the database, creating a subset of data. This subset was compared to the comprehensive master list described below.
Master list of the population of known clients with IDD in Massachusetts
It should be noted that only adults with IDDs who are active service recipients are included in this list. Therefore, the master list may not capture some adults with mild intellectual disabilities who elected not to request state services due to the provision of support from other sources, who may have never been formally diagnosed, or who did not meet the state's eligibility criteria. There may also be adults with IDDs who moved to the state recently, or never required services from the department, who would not appear even in the master list of clients. However, it is felt that the master list includes all adults with IDDs who are enrolled in any of the services offered by the department, and is the closest thing available to a gold standard for comparison.
Rationale. Residential setting has been shown to be associated with likelihood of cancer screening (Wilkinson, Lauer, Freund, & Rosen, 2011; Davies & Duff, 2001), and the receipt of residential services is associated with an assigned responsibility for health and safety to the service provider. Since residential setting appears to be an important variable when studying cancer screening, we wanted to highlight the administrative health database's potential generalizability to certain settings when interpreting future data on cancer screening.
Measures. We were interested in comparing number of people represented in the health database to number of people represented in the state service enrollment tables, which functions as a master list.
Procedure. We expressed the percentage of known clients who appear in the administrative health database by residential setting, so that the types of residential settings with higher and lower representation would be readily apparent. To compare the administrative health database to the master list of clients, a state consultant who already had access to both lists identified the number of people in each residential category (i.e. independent, supported living facilities such as group homes, etc.) in the health database and on the master list.
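The comparison itself reduces to simple proportions. As a hypothetical sketch (the category names and counts below are invented so that the proportions mirror the percentages reported in the Results; the real counts come from identifiable state records available only to the consultant):

```python
# Invented counts per residential setting, chosen to reproduce the
# representativeness percentages reported in the Results section.
master_list = {  # clients on the state enrollment tables (master list)
    "24-hr supported residence": 10000,
    "independent/with family": 12000,
    "individualized supports": 1200,
    "shared living": 1100,
    "other (non-department-funded)": 8000,
}
in_health_db = {  # of those, clients with a record in the health database
    "24-hr supported residence": 9200,
    "independent/with family": 3360,
    "individualized supports": 876,
    "shared living": 561,
    "other (non-department-funded)": 720,
}

def representativeness(master, subset):
    """Percent of known clients in each setting who appear in the health DB."""
    return {s: round(100 * subset[s] / master[s], 1) for s in master}

print(representativeness(master_list, in_health_db))
```

Expressed this way, each setting's representativeness is independent of the others, which is what allows settings with very different population sizes to be compared directly.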
We wanted to assess the rate at which preventive screening tests that were performed were entered into the administrative health database, as a measure of the accuracy of representation of diagnostic tests. Hospital records were used as the gold standard indicating test performance. Information about false negative rate of the administrative health database can inform the design of future studies using these (and similar) data sets. We did not focus on evaluating or reporting false positive rates of tests, as we only had data from one hospital and privacy concerns would have made it logistically impossible to check every potential health care setting where a test might have been performed.
Data sources. In addition to the administrative health database described above, we used data from the clinical data warehouse at a large urban safety net hospital.
Clinical Data Warehouse (CDW)
The Boston Medical Center hospital system uses a single relational database, called the CDW, that consolidates data from the various software systems used at the hospital and outlying health centers into a single source. The CDW integrates data from many of the hospital's major electronic data systems, including the software used for inpatients, outpatients, the emergency room, the operating room, and billing. The data include laboratory and radiology reports, appointment and visit information, medications, and visit and chronic diagnoses (ICD-9 codes), as well as physician notes. Access to the database is by request through the enterprise analytics team at the hospital. The CDW is quite comprehensive, and was considered the gold standard for this study.
Measures. Variables examined include whether age- and gender-appropriate cancer screenings (mammography, colonoscopy, prostate cancer screening) had taken place within the recommended time frame, which we defined as within the past two years for mammograms for women ages 42 and above (at the time this study was conducted, mammography was recommended beginning at age 40), within the past two years for prostate cancer screening for men ages 42 and above, and within the past 10 years for colorectal cancer screening (colonoscopy) for people over age 50.
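These recency rules can be expressed as a small date check. The following is a minimal sketch under our own naming assumptions (the test labels and function are ours, not fields from either database):

```python
from datetime import date

# Recommended look-back window, in years, for each screening test
# (as defined in the Measures description above).
SCREENING_WINDOW_YEARS = {"mammography": 2, "psa": 2, "colonoscopy": 10}

def within_window(test, test_date, as_of):
    """True if the most recent test falls within the recommended interval."""
    years = SCREENING_WINDOW_YEARS[test]
    cutoff = date(as_of.year - years, as_of.month, as_of.day)
    return test_date >= cutoff

# A mammogram from mid-2010 is within two years of the end of 2011.
print(within_window("mammography", date(2010, 6, 1), date(2011, 12, 31)))
```

A production version would also need to handle edge cases such as a February 29 reference date; for the purpose of classifying tests as within or outside the window, the simple year subtraction above suffices.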
Procedure. We requested data from the clinical data warehouse for all people in the appropriate age range for the screening tests who had an ICD-9 code (diagnosis code used for medical billing purposes) of intellectual disability documented in their record. Though there are many ICD-9 codes associated with various kinds of IDDs, we chose to pull only those charts that contained diagnoses of 317.0–319.0, the codes for mild, moderate, severe, profound, and unspecified intellectual disabilities. These codes were selected as they were straightforward in nature, and all individuals with these diagnoses would be eligible for state services. Each test (mammography, prostate cancer screening, and colonoscopy) was run as a separate query so that we could identify people of the appropriate age and gender for the test. CDW staff identified Current Procedural Terminology (CPT) codes (numerical codes assigned to medical procedures used for medical billing purposes) for the test of interest during the appropriate time range (defined above). These data sets from the CDW were then shared with the state consultant mentioned previously, who already had full access to the DDS administrative health database. The CDW data sets contained patient age range, a binary response indicating whether the test of interest had been performed, and patient Social Security numbers, which the consultant used to match CDW records to records in the administrative health database. A de-identified database of findings for patients who were represented in both databases was then prepared that linked the hospital database and the administrative health database. (Table 1 demonstrates representation by residential setting; group sizes for each test are provided in Tables 2 and 3.)
The study included only patients who were represented in both databases, as we felt that trying to match all people in the administrative database to the CDW would compromise people's privacy and anonymity. The cancer-screening test of interest was compared across databases (completed vs. not completed in each database). In the CDW database, we defined “listed as complete” as presence of a CPT code for the test; in the administrative health database, we defined “listed as complete” as a completion date for the test in the state health record within the appropriate time frame for the test (past two years for mammography and prostate cancer screening, past 10 years for colonoscopy). This process was used for each variable. Complete details about the time period examined for each test and the recommended screening interval (at the time of study completion) are provided in Table 2.
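Once records are matched, the false negative rate is computed only over tests that the gold standard (CDW) shows as completed. The following is a minimal, hypothetical sketch of that comparison; the patient IDs and values are invented, and the real linkage used identifiers available only to the state consultant:

```python
# Each source reduced to (linkage_id -> test completed?) for one screening test.
cdw = {"p1": True, "p2": True, "p3": False, "p4": True}       # CPT code present?
admin_db = {"p1": True, "p2": False, "p3": False, "p4": True}  # completion date in window?

def false_negative_rate(gold, admin):
    """Share of tests completed per the gold standard but absent from the
    administrative database, among patients present in both sources."""
    matched = gold.keys() & admin.keys()          # only patients in both databases
    done = [p for p in matched if gold[p]]        # tests the gold standard shows as done
    missed = [p for p in done if not admin[p]]    # done, but not in the admin database
    return len(missed) / len(done)

# Here one of three completed tests is missing from the admin database.
print(false_negative_rate(cdw, admin_db))
```

Note that false positives (present in the administrative database but absent from the CDW) cannot be computed this way, since a test missing from one hospital's CDW may simply have been performed elsewhere; this is why the study reports only false negatives.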
For the geographic sub-analysis, the consultant pulled the field containing each client's service area, which was noted as either inside or outside of the greater Boston metro area. (The Massachusetts Department of Developmental Services uses a system of regional service areas, with the Boston metro area appearing as a pre-defined administrative area, which was validated with clinicians as the catchment area for the hospital.)
Because identifiable linkages were performed by a consultant who already had access to identifiable portions of the administrative health database, we ensured that clients' identifiable data were not viewed by any additional personnel. The identifying linkages were destroyed once the records had been found and linked.
This project was deemed “exempt” by the Boston University Medical Campus Institutional Review Board, and was approved by the state's research review committee before we were given access to the data.
Table 1 shows the representativeness by type of residential setting for clients in the administrative health database.
As expected, clients living in state-funded settings with 24-hr staff support had very high rates (92%) of representation in the database, likely because completion of the health data is required by providers in these types of settings. However, this group represents only about a third of all the adults with IDD eligible for services in Massachusetts. Thirty-six percent of adults with IDD in Massachusetts live independently or with family. Though these people comprise the largest group of service-eligible adults in the state, they had the third lowest representation in the database, with only 28% of members of this group having completed health records in the database. Clients receiving individualized supports, who comprise only 3.5% of the department's population, had good representation (73%) though not as high as those living in 24-hr supported residences. Another small percentage of the department's enrolled population (3.3%) lives in home share settings, also known as shared living. This group had fair representation, with 51% of these clients represented. Predictably, the almost 25% of service-eligible people in the state who live in settings that are not funded by the department, such as nursing homes, had the lowest representation of any residential group, with only 9% represented. (This category of “other” did not include people who live alone or people who live with their families.)
Table 2 shows the sample size of the extracted data by test, as well as the time period examined and the recommended screening interval at the time of study completion. Table 3 shows the false negative rates of each screening test examined for this objective (colonoscopy, prostate cancer testing, and mammography). For tests with higher false negative rates (tests listed as “done” in the hospital database but not entered in the DDS database), additional sub-analyses based on the client's geographic area, divided into the metropolitan Boston area and outside Boston, are presented in Table 4.
As shown in the tables, the lowest false negative rate was noted for colonoscopy, which had a false negative rate of 0.9%. Mammography had a false negative rate of 7.5%. As expected, the highest false negative rate was found for prostate specific antigen (PSA) testing, the screening test most commonly used for prostate cancer. As Table 4 demonstrates, false negative rates were similar within and outside of the greater Boston metro area.
In this comparison between a state administrative health database and a comprehensive data set from a large hospital system's CDW, the state database showed variable overall representativeness of the underlying population; people who live independently or with their families had lower representation than many other residential groups. This is an especially important gap given the relatively large percentage of all people with IDD who live with their families. The nearly complete representation for people who live in 24-hr supported residences such as group homes, compared with the lower representation for adults in other residential settings, is likely explained by the state's ability to mandate completion of records by service providers who receive state funding, but not by non-state-funded service providers, self-advocates, or families. In particular, alternative systems for managing care exist in settings such as nursing homes, and it is therefore neither required nor expected that staff in those settings would use the state systems to record health information.
Though this administrative database was not designed for research purposes, our evaluation shows it to be a fairly robust source of certain health data. Due to the high representativeness of people with IDD who live in supported residences, administrative databases such as this one could be valuable data sources for research questions related to health care in supported residences. They would be helpful in evaluating the impact of health promotion activities designed to take place in the supported home setting and in evaluating care utilization and access of this population. While this evaluation confirms that the data does not include all people with IDD in the state, the findings may nonetheless have utility in identifying differences in healthcare access or utilization patterns in order to allow targeting of public health efforts. As the data may be most representative of groups with the highest support needs, it is likely that efforts to improve access to or utilization of generic community health care services identified by research using this database may also have positive impacts on those who require less support.
In addition, the external validation of the administrative health database with hospital CDW records showed an overall low rate of false negatives for common cancer screening tests. (False positives were not reported, as logistical constraints and privacy concerns prevented us from obtaining data from multiple health care settings, which limited our ability to evaluate this accurately.) As expected, the false negative rate was highest for prostate cancer screening using the PSA test, which consists of a simple blood draw, and lowest for colorectal cancer screening using colonoscopy, a test which requires extensive preparation. False negative rates for mammography were also lower than false negative rates for prostate cancer screening. We theorize that lower false negative rates for mammography and colorectal cancer screening may be associated with the “break in routine” represented by these tests, in that they often require visiting an additional office or clinic, such as a mammogram facility. Colonoscopy requires extensive preparation, such as fasting and a liquid diet. These tests may be more notable events than a simple in-office blood draw such as the PSA, and may therefore be more likely to be noted and documented by the patient's direct support worker. In addition, little research exists regarding how much information about lab tests physicians offer their patients with IDD. Prior research has documented communication gaps between adults with IDD/support workers and health care providers (Wilkinson, Dreyfus, Bowen, & Bokhour, 2012). Further research to evaluate whether these differences are symptomatic of larger gaps in informing patients and their caregivers about more routine testing would be useful.
Limitations of this study included difficulties with ICD-9 codes (diagnostic codes used primarily for medical billing): our empirical observation suggests that providers may or may not enter a disability-related ICD-9 code when treating a person with IDD for a non-disability-related medical condition, such as a sore throat or a sinus infection. It is therefore possible that our method of identifying people with IDD by ICD-9 code failed to identify some patients who received care unrelated to their disability and/or whose providers did not include an ICD-9 code related to IDD. However, our goal was to identify a sample of people with IDD for the purpose of matching, and we felt that attempts to identify and match every person with IDD would present significant risks to privacy. We therefore accepted the risk of missing records inherent in ICD-9 code use in order to avoid the risk of compromising privacy. We also recognize that we used a relatively small group of ICD-9 codes, and that the codes used to identify records will affect results. In addition, some state-level characteristics may not generalize to states other than Massachusetts. For example, while Massachusetts' near-total health insurance coverage removed access to insurance as a confounding factor, researchers examining other databases may need to account for differences in insurance status and cost. Other state-level characteristics that might not be broadly applicable include the existence of a large safety-net hospital system experienced in caring for patients with IDD, and the availability of a clinical data warehouse. Our partnership with a state consultant who was already familiar with the administrative database and the master list created from enrollment tables greatly facilitated this study and ensured adequate protection of privacy; this partnership may not be available in other locations.
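The code-based sampling approach described above can be illustrated with a short sketch. The ICD-9 codes and record format below are illustrative assumptions, not the study's actual code list or data:

```python
# Hypothetical sketch of identifying a sample of patients with IDD by
# ICD-9 code. The prefixes below (intellectual disability ranges) are
# assumed examples; the study used its own, relatively small code set.

IDD_ICD9_PREFIXES = ("317", "318", "319")  # assumed illustrative codes

def has_idd_code(diagnosis_codes):
    """True if any recorded ICD-9 code falls in the assumed IDD set."""
    return any(code.startswith(IDD_ICD9_PREFIXES) for code in diagnosis_codes)

# Toy records: p02 has IDD but was seen only for sinusitis, so a
# code-based sample misses them -- the limitation noted above.
records = [
    {"patient_id": "p01", "icd9": ["318.0", "461.9"]},  # IDD code present
    {"patient_id": "p02", "icd9": ["461.9"]},           # sinusitis only
]
idd_sample = [r["patient_id"] for r in records if has_idd_code(r["icd9"])]
print(idd_sample)  # ['p01']
```

The choice of prefixes drives the result, which is why the text notes that the particular codes used to identify records will affect findings.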
Use of the state's enrollment tables as a master list presents a limitation in that, by definition, it captures only people who are enrolled in state services. However, it is thought to be a comprehensive listing of people known to the state as having IDD, and it represented the closest thing to a "gold standard" available to us for this purpose.
In addition, we did not investigate tests that appear in the administrative health database but do not appear in the CDW records as having been completed. Based on the state's size and number of health care facilities, it is reasonable to assume that people who live at a geographic distance from the hospital system studied might receive primary care, including cancer screenings, closer to home. Concern for subjects' privacy prevented us from attempting to locate medical records at facilities other than the included hospital system, and we were therefore unable to verify receipt of every test. Finally, cancer screening was selected as a focal topic for this study based on its saliency as an area of needed improvement for adults with IDD and the relative ease of accessing cancer-screening data. Researchers interested in using administrative health databases to study other topics may face challenges or issues that our study did not address.
As administrative databases begin to be used more frequently for research purposes, it becomes increasingly important to understand the limitations of these data sets in order to avoid bias in studies conducted using these data. Researchers should note the potential limitations of each administrative database under consideration, such as differential representativeness of certain subgroups of the covered population and any administrative artifacts of how information is entered or coded that may affect the validity or reliability of the information (Iezzoni, 2002). These factors must be explored and accounted for in research design and interpretation in order to maintain the utility of the data source for research. In this study, we found that records in the administrative database were organized and mostly ready for analysis, as numerous fields use categorical response options. However, certain fields, such as race, were not required for entry and therefore had a high degree of missing data. Other fields, such as medication name, use open text entry and would require additional cleaning prior to analysis. Access to a state consultant knowledgeable about how the information in the database is obtained and entered, as well as the policy context for the database's use, assisted in accurate interpretation. For researchers accessing other data sources, access to staff familiar with the database, when possible, may be highly beneficial.
Our method is easy and cost-effective to replicate with other data sources, and it provides insight into a largely untapped source of health surveillance information. Our results also suggest that researchers using administrative databases should evaluate their databases periodically, through repetition of this kind of study, in order to appropriately address potential biases through research design. The results of this study should lend context to efforts to study cancer and health screening variables using this database. We also hope that researchers in other states who have access to similar state databases might adopt these methods in order to critically evaluate their databases and assess their potential for use in research.
Nechama W. Greenwood (e-mail: firstname.lastname@example.org), Department of Family Medicine, Boston University School of Medicine, Dowling 5, 771 Albany St., Boston, MA 02118, USA; Joanne Wilkinson, Boston University, Family Medicine; Emily Lauer, Consultant to the Center for Developmental Disabilities Evaluation and Research, University of Massachusetts Medical School; Karen M. Freund, Tufts University School of Medicine, Institute for Clinical Research and Health Policy Studies; and Amy K. Rosen, Boston University School of Public Health, Health Policy and Management.