CMS (Medicare) Data


The CHS dataset includes data from the Centers for Medicare & Medicaid Services (CMS), formerly known as the Health Care Financing Administration (HCFA).

Summary of available CMS data:

Claims data, including cost data from inpatient, outpatient, and carrier forms, are available for CHS participants up to 2009. Updates are ongoing. Researchers who log in to the internal CHS site (password protected) may also:

  • explore the CMS database for the CHS cohort
  • download the Data Dictionary for CMS claim data and other CMS training materials provided by Duke University
  • view the frequency tables of participant counts by calendar year for FFS enrollment, CVD endpoints, and other diagnosis and procedure codes in the CMS claim data


Use of CMS Data in Publications:

  1. Any publication using CMS data cannot include cells with 10 or fewer observations. If investigators are uncertain whether their manuscripts meet this requirement, they may submit their documents to CMS for review (see Section 9 of the CMS DUA for the relevant text). Investigators therefore may not submit their manuscripts for publication unless they are sure, or have obtained CMS’s assurance, that the manuscript meets the minimum cell size requirement.
  2. Manuscripts resulting from analyses using these linked datasets will be provided by each study to CMMI (the Center for Medicare and Medicaid Innovation, a new center at CMS). The CHS P&P Coordinator will forward such manuscripts to CMS upon CHS approval.

Cell Size Requirement Details:

The small cell rule applies to any variable derived in whole or in part from CMS data. For example, an atrial fibrillation variable that is partially derived from CMS claims and partially derived from ECG readings and CHS-collected data would still count for the purposes of the small cell rule.

The small cell rules from the CMS DUA are:

The User agrees not to disclose direct findings, listings, or information derived from the file(s) specified in section 5, with or without direct identifiers, if such findings, listings, or information can, by themselves or in combination with other data, be used to deduce an individual’s identity. Examples of such data elements include, but are not limited to geographic location, age if > 89, sex, diagnosis and procedure, admission/discharge date(s), or date of death.

The User agrees that any use of CMS data in the creation of any document (manuscript, table, chart, study, report, etc.) concerning the purpose specified in section 4 (regardless of whether the report or other writing expressly refers to such purpose, to CMS, or to the files specified in section 5 or any data derived from such files) must adhere to CMS’ current cell size suppression policy. This policy stipulates that no cell (e.g. admittances, discharges, patients, services) 10 or less may be displayed. Also, no use of percentages or other mathematical formulas may be used if they result in the display of a cell 10 or less. By signing this Agreement you hereby agree to abide by these rules and, therefore, will not be required to submit any written documents for CMS review. If you are unsure if you meet the above criteria, you may submit your written products for CMS review. CMS agrees to make a determination about approval and to notify the user within 4 to 6 weeks after receipt of findings. CMS may withhold approval for publication only if it determines that the format in which data are presented may result in identification of individual beneficiaries.

References to sections 4 and 5 simply indicate that you are using the CMS data for an approved CHS project.

The main criterion is that any cell of 10 or less in tables or aggregate data must be suppressed in all publications and presentations. This does not apply to internal use, although all printed material involving small cells should be shredded.
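The suppression rule above can be sketched in a few lines of Python. This is only an illustrative sketch, not a CMS or CHS tool: the function names, the `CMS_THRESHOLD` constant, and the example counts are hypothetical, and the sketch conservatively treats every cell of 10 or less (including zero) as suppressed, following the wording quoted from the DUA.

```python
from typing import Dict, Optional

# Assumption: threshold taken from the DUA wording "10 or less".
CMS_THRESHOLD = 10


def suppress_small_cells(
    counts: Dict[str, int], threshold: int = CMS_THRESHOLD
) -> Dict[str, Optional[int]]:
    """Return a copy of counts with every cell at or below threshold set to None."""
    return {label: (n if n > threshold else None) for label, n in counts.items()}


def format_cell(value: Optional[int]) -> str:
    """Render a cell for a publication table; suppressed cells show no number."""
    return "suppressed" if value is None else str(value)


# Hypothetical counts, for illustration only.
example = {"a": 10, "b": 11, "c": 3, "d": 250}
safe = suppress_small_cells(example)
for label, value in safe.items():
    print(label, format_cell(value))
```

Percentages or rates computed from a suppressed cell would need to be withheld as well, since the DUA also prohibits formulas that result in the display of a small cell.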

An example of how a different NHLBI cohort study handled small cell sizes is shown below:


Example is from: Validation of the Atherosclerotic Cardiovascular Disease Pooled Cohort Risk Equations. JAMA. 2014;311(14):1406-1415. doi:10.1001/jama.2014.2630 and is Copyright © 2014 American Medical Association [table used under Fair Use for educational purposes]

The outcome is a composite variable that uses CMS-derived CVD events to supplement events detected by REGARDS cohort surveillance. Note that events and person-years are suppressed both for the small cell and for at least one other cell, so that the small cell cannot be recovered by calculation. The total number of events was 234 and the displayed cells sum to 212, so the suppressed cells together contain 22 events and at least one of them is > 10; the authors nevertheless had to suppress 2 cells so that subtraction from the total could not reveal the small cell. Adjusted and predicted rates are shown, however. Crude rates are absent and would need to be suppressed in any table intended for publication or presentation.
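The need to suppress a second cell can be illustrated with a short Python sketch. It is a simple heuristic for a single row with a published total, not CMS's official procedure and not the method used by the paper's authors. The function name and the individual cell values are hypothetical; the values were invented only so that the totals match the 234 and 212 mentioned above and are not the actual values from the published table.

```python
from typing import Dict, Optional


def suppress_with_complement(
    counts: Dict[str, int], threshold: int = 10
) -> Dict[str, Optional[int]]:
    """Hide small cells, plus one extra cell if a lone hidden cell could be
    recovered by subtracting the visible cells from the published total."""
    # Primary suppression: hide every cell at or below the threshold.
    hidden = {label for label, n in counts.items() if n <= threshold}
    # Complementary suppression: a single hidden cell equals
    # total - sum(visible), so hide the smallest visible cell too.
    if len(hidden) == 1:
        visible = [label for label in counts if label not in hidden]
        if visible:
            hidden.add(min(visible, key=lambda label: counts[label]))
    return {label: (None if label in hidden else n) for label, n in counts.items()}


# Hypothetical cell values (invented for illustration; they sum to 234).
cells = {"group_1": 100, "group_2": 112, "group_3": 8, "group_4": 14}
table = suppress_with_complement(cells)
shown = sum(n for n in table.values() if n is not None)
hidden_total = sum(cells.values()) - shown

print(table)         # group_3 and group_4 are None
print(shown)         # 212
print(hidden_total)  # 22
```

With two cells hidden, a reader can learn only their combined total, not the size of the small cell itself. A real table with both row and column totals would need the same check in each direction.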

The Coordinating Center will handle the requirement in the memorandum of understanding between CMS and NHLBI that CMS be provided a courtesy copy of papers, for informational purposes only. Other CMS rules (e.g., that the data should not be disseminated and should be stored securely) are already part of the standard CHS Data and Materials Distribution Agreement (DMDA) that all users of CHS data have signed.

Alternatively, investigators may request an exemption from this rule; the request must be submitted to CMS before journal submission. Note that this process may take between 4 and 6 weeks. Instructions can be found here. The current CHS contact at CMS is Alex Laberge.