Form 5500 Data Sets
The Form 5500 Annual Report is the primary source of information about the operations, funding and investments of approximately 800,000 retirement and welfare benefit plans. The datasets below are the raw, unedited data from all of the Form 5500 and Form 5500-SF filings for each year, including the data reported in the various schedules.
With the introduction of EFAST2, filers have sent the Department many incomplete electronic filings that were later corrected. To reduce the number of duplicate filing records for a single plan, beginning with the 2009 Form 5500, two sets of raw data are provided.
The dataset listed as “Latest” includes only the latest, most correct filing for each plan. Specifically, this dataset contains, for each unique plan, the most recently received filing with a filing status of “FILING_RECEIVED”. For more information on the different filing statuses, see EFAST2 FAQ 38.
The dataset listed as “All” includes all filings received by the Department without regard to filing status or the number of attempts to file. This dataset may contain multiple filings for a single plan.
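As a rough sketch, the “Latest” selection rule could be approximated from rows of an “All” dataset as shown below. This is not the Department's actual procedure. The column names (SPONSOR_EIN, PLAN_NUMBER, FILING_STATUS, DATE_RECEIVED) are hypothetical placeholders, and the ISO-style date format is an assumption; check the data dictionary for the real field names and formats.

```python
from datetime import datetime


def latest_received(records,
                    key_fields=("SPONSOR_EIN", "PLAN_NUMBER"),  # hypothetical plan key
                    status_field="FILING_STATUS",               # hypothetical column name
                    date_field="DATE_RECEIVED"):                # hypothetical column name
    """Return one record per plan: the most recently received filing
    whose status is FILING_RECEIVED. Other statuses are skipped."""
    latest = {}
    for rec in records:
        if rec[status_field] != "FILING_RECEIVED":
            continue
        key = tuple(rec[field] for field in key_fields)
        # Assumption: dates are ISO-formatted strings such as "2010-09-15".
        received = datetime.fromisoformat(rec[date_field])
        if key not in latest or received > latest[key][0]:
            latest[key] = (received, rec)
    return [rec for _, rec in latest.values()]
```

Passing the key and field names as parameters keeps the sketch usable once the actual column names are known.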
The 2008 and prior year Form 5500 datasets are complete and will not change. The 2009 and future Form 5500 datasets will be updated monthly. Any filers submitting delinquent Form 5500s for plan years prior to 2009 will use the current form available on EFAST2. The data for that filing will be included in the current form dataset.
The datasets are provided as zipped text files. The text files are comma delimited, with the double quote (") as a text qualifier. The first row contains the field names. The “Latest” and “All” datasets include three system-generated fields in addition to filer-provided data: filing status, date received, and an indication of whether there is a valid signature. In the “All” datasets, these three fields can help to distinguish between the duplicates.
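The file layout described above (zipped, comma delimited, double-quote text qualifier, header row) can be read with the Python standard library. The following is a minimal sketch. The function name read_zipped_csv is illustrative, and the UTF-8 encoding is an assumption, since the encoding of the files is not stated here.

```python
import csv
import io
import zipfile


def read_zipped_csv(zip_source, member=None, encoding="utf-8"):
    """Yield each data row of a zipped, comma-delimited text file as a dict
    keyed by the field names in the first row.

    zip_source: a path or a binary file-like object holding the zip archive.
    member: name of the text file inside the archive; defaults to the first entry.
    encoding: an assumption; adjust if the files use a different encoding.
    """
    with zipfile.ZipFile(zip_source) as archive:
        name = member or archive.namelist()[0]
        with archive.open(name) as raw:
            text = io.TextIOWrapper(raw, encoding=encoding, newline="")
            # The double quote is the text qualifier, so commas inside
            # quoted values are kept as part of the value.
            reader = csv.DictReader(text, delimiter=",", quotechar='"')
            for row in reader:
                yield row
```

For example, `for row in read_zipped_csv("form5500_latest.zip"): ...` would iterate over the filings one at a time without loading the whole file into memory; the file name here is hypothetical.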
As of the January 2011 data update, the data structure changed to ensure that when filers submit multiple service provider codes on certain lines in the Schedule C, all of the codes are included. Please see the technical note for a more detailed explanation of the structure change.
The data dictionary, provided for each year's form, shows variable names and where they are found on the Form 5500 and various schedules.
For older datasets, you will need to submit a Freedom of Information Act (FOIA) request. To submit this request, please follow the instructions on our FOIA page.