|
|
|
Sample Design
|
|
The statistical goal of the project was to measure year-to-year change in
the extent to which health plans subject to Part 7 of Title I of ERISA are in
compliance with various provisions of that Part. For purposes of this project,
the universe of private sector health plans was divided into three segments - multiemployer
plans, single-employer plans sponsored by large firms, and
single-employer plans sponsored by small firms.
Firms with 100 or more
employees were considered to be large. A separate compliance measurement
effort was conducted for each of the three segments of the health plan
universe. The same statistical goal applied to measurements for each of the three
segments of the universe - to measure year-to-year changes in violation rates
to within 10 percentage points with probabilities of type I and type II error
of 5 percent or less. The caps on the two types of error guard against erroneous
conclusions that EBSA could draw after completing its project and a follow-up project
in some future year. Type I error would arise if the true universe
violation rate had not changed at all and EBSA falsely concluded that the
violation rate had changed. Type II error would arise if the true universe
violation rate changed by 10 percentage points, and EBSA falsely concluded
that it had not significantly changed.
|
|
The sample size calculations were implemented using two sample size
calculation tools:
- The sample size calculation routine based on a “two-sample t-test” that is built into the SAS Analyst application, and
- SAS code for calculating the power of a two by two chi-square test downloaded from the SAS Web site.(1)
|
|
In applying the sample size calculation based on the two-sample t-test, the
first sample is the base year (2001) sample, the second is the sample from
whatever future year the project is repeated. The second tool does not compute
sample size directly. It computes power(2) for a specified range of sample
sizes. Each run of the program reports the statistical power that results from
10 to 20 sample sizes evenly spaced across a specified interval. By running
this program three or four times and adjusting the specified sample size
interval as necessary, it is a simple matter to zoom in on the minimum sample
size that produces power of 95 percent or more. Compared to the second tool, the
first has the advantage of computing sample size in a single run rather than
through a sequence of runs. It has the disadvantage of requiring that the
standard deviation underlying an estimated percentage p be approximated as the square root of
p(1-p). As discussed below, the approximation turns out to be quite good, so
both tools were used.
|
|
Both of the sample size calculation tools assume that the universe size is
infinite. The sample size computed using these tools was therefore adjusted
downward to account for actual sizes of the three universes using the standard
formula for finite population correction.(3)
|
|
Multiemployer Sample - The sample size can be calculated to achieve the target variance provided
the estimated violation rate does not exceed a specified level. For surveys
where no ceiling on the percentages to be estimated can be provided, a
variance-maximizing estimate of 50 percent can be used. The problem with this
approach is that the sample size will be larger than necessary if the
estimated percentages turn out to be much lower than 50 percent.(4)
For the multiemployer sample, the violation rate ceiling used was 25 percent, based on an
earlier EBSA project that estimated violation rates for Part 7 of ERISA.
|
|
Using these assumptions, a sample size of 488 was computed based on the
chi-square sample size routine. The two-sample t-test requires the standard
deviation, which was approximated as the square root of .25(1-.25) which is
.433. The routine based on the two-sample
t-test also requires specification of null and alternate hypotheses. The null
hypothesis is that the violation rate stays at its base year level of 25 percent. The alternate hypothesis is that
the initial rate of 25 percent changes by 10 percentage points. The sample size
produced using the two-sample t-test procedure is 489, which is nearly
identical to the chi-square sample size, whether the alternate hypothesis is
specified as 15 percent or 35 percent.
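The arithmetic behind these figures can be sketched with a normal approximation. This is only an illustration, not the SAS Analyst routine or the downloaded chi-square code: the function name `two_sample_size` is hypothetical, and the sketch assumes both years share the standard deviation implied by the 25 percent ceiling.

```python
import math
from statistics import NormalDist


def two_sample_size(p, delta, alpha=0.05, beta=0.05):
    """Per-year sample size needed to detect a change of `delta` in a rate.

    Assumption of this sketch: a normal approximation in which both samples
    have standard deviation sqrt(p * (1 - p)). It is not the SAS routine.
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided type I error
    z_beta = NormalDist().inv_cdf(1 - beta)        # type II error (power = 1 - beta)
    sd = math.sqrt(p * (1 - p))
    return 2 * ((z_alpha + z_beta) * sd / delta) ** 2


# Standard deviation at the 25 percent ceiling, about .433 as stated in the text.
sd_25 = math.sqrt(0.25 * (1 - 0.25))

# Infinite-population sample size for a 10 percentage point change.
n_multi = math.ceil(two_sample_size(0.25, 0.10))
print(round(sd_25, 3), n_multi)
```

Under these assumptions the result lands close to the 488 and 489 reported above. A routine based on the t distribution would be expected to return a marginally larger figure than a normal approximation.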
|
|
The sampling frame for multiemployer plans was the 1997 5500 file
maintained by EBSA’s Office of Information Management for purposes of the
Freedom of Information Act. It includes all types of employee benefit plans.
Health plans were identified based on a question on the Form 5500 that
indicates all of the types of welfare benefits that the plan provides. A code
of ‘A’ flags health benefits. Plans entering an ‘A’ were classified as
health plans regardless of what other codes were entered. Other codes that can
be entered in this field identify dental and vision plans. Plans indicating
that they provide dental or vision benefits were not included unless they also
indicated provision of health benefits.
|
|
Multiemployer plans were identified based on an entry of ‘C’
(multiemployer plan) or ‘D’ (multiple-employer-collectively bargained
plan) in the type of plan entity field. Plans entering plan entity code ‘F’
(group insurance arrangement) were also classified as multiemployer plans if
they also indicated that they were collectively bargained. It was clear from
the sponsor names that many of the plans identifying themselves as multiemployer
plans did so incorrectly. The list was therefore manually
reviewed to eliminate all obvious single-employer plans.
|
|
The edited sampling frame that resulted from this process numbered 2,169
multiemployer plans. Correcting the infinite population sample size of 489
(the more conservative of the two estimates) for the size of the universe results in
a multiemployer sample size of 399.
|
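The finite population adjustment for the multiemployer sample can be illustrated as follows. The formula cited in footnote 3 is not reproduced in this section, so the sketch assumes the common textbook form n = n0 / (1 + n0/N); the function name is hypothetical.

```python
def finite_population_correction(n_infinite, universe_size):
    """Shrink an infinite-population sample size for a finite universe.

    Assumed form (not confirmed by the report): n = n0 / (1 + n0 / N).
    """
    return n_infinite / (1 + n_infinite / universe_size)


# Multiemployer figures from the text: n0 = 489, N = 2,169 plans.
n_multi_corrected = round(finite_population_correction(489, 2169))
print(n_multi_corrected)
```

With the multiemployer inputs this assumed form agrees with the corrected sample size of 399 given above.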
|
Concepts for Large and Small Firm Sample Design -
Sampling single-employer health plans is not as easy as sampling multiemployer
plans because there is no satisfactory sampling frame for these
plans. The series 5500 reports do not constitute a satisfactory sampling frame
because most health plans are exempt from filing under ERISA. To our
knowledge, no firm or government agency maintains a comprehensive national
list of health plans. It may be possible to construct a list of insured plans
by obtaining lists of insurers from the states and lists of health plans from
insurers. A sample frame could then be constructed by combining this list with
a list of self-insured plans based on 5500 filings. That process was
considered time-consuming, expensive, and uncertain. It was therefore decided
to sample single-employer health plans via the firms that sponsor them.
|
|
In the parlance of sampling theory, firms serve as the “primary sampling
units” because it is firms that are directly selected for the sample.
Because the analysis is conducted at the plan level, plans are the elementary
units. This type of divergence between the primary sampling units and the
elementary units implies that the sample is a cluster sample rather than a
simple random sample. If each firm had no more than one plan, then plan
characteristics could be regarded as firm characteristics, and the sample as a
simple random one. Because some firms sponsor more than one health plan, the
large and small firm samples are properly regarded as cluster samples. Because
a large majority of firms sponsor only one health plan, this cluster sample is
close to being a simple random sample.
|
|
Three alternative rules could have been used to associate health plans with
sample firms:
- Any plan that covers workers at sample firms, even if sponsored by a parent company;
- Any plan sponsored by a sample firm or any of its branches or subsidiaries; or
- Plans sponsored by sample firms (or their branches), but not by their subsidiaries.
|
|
All of these alternatives were considered statistically viable. The first
introduces a statistical weighting issue, but was selected because it was
considered to be the most consistent with procedures normally followed in EBSA
investigations. Under this approach, plans covering workers of a parent firm
and at least one subsidiary would require investigation if the parent or any
of the participating subsidiaries fell into the sample. The probability of
selection for each plan therefore depends on the probabilities of selection
for the subsidiaries that participate in the plan. The probability of
selection for each subsidiary depends only on whether it is large or small
(with 100 employees being the dividing line). To accurately compute
statistical weights, national office coordinators were asked to determine, and
investigators to verify, counts of the number of large and small subsidiaries
participating in each plan.
|
|
To compute sizes of the large and small firm samples using cluster sampling
theory would require three kinds of information about the health plan universe
in addition to that required for a simple random sample:
- The distribution of the number of plans per firm;
- An estimate of within-firm homogeneity (delta) in violation rates;
- A distribution of single-employer health plans by the number of subsidiaries participating in the plan.
|
|
All three of these data requirements pose serious problems.
|
|
Data to meet the first requirement initially appeared to be available. The
Bureau of Labor Statistics once published an article in the Monthly Labor
Review that reported a distribution of firms by number of health plans based
on its Employee Benefit Surveys. Foster-Higgins-Mercer and KPMG
each report distributions of health plans per firm based on annual surveys
conducted by each of those firms. For two reasons, each of these sources
significantly overestimates the number of ERISA plans that EBSA would have to
investigate.
|
|
First, these surveys included plans for workers at all locations of
multi-location firms. Given the chosen strategy for associating plans with
sample firm locations, the fact that each of a firm’s subsidiaries sponsors
its own plans has no bearing on cluster size. Whether the parent or one of
the subsidiaries it covers falls in the sample, investigators would find only
one plan covering workers of that firm. Thus data from any of these surveys
would overestimate cluster size for this project.
|
|
The second reason that these surveys would overestimate cluster size arises
from the ambiguity concerning the word “plan.” In response to surveys such
as those above, many companies that offer health insurance from multiple
carriers would count each carrier’s offering as a separate plan. The entire
set of health insurance offerings may be regarded as one plan under ERISA,
however. Based on the ERISA definition, EBSA would recognize one plan and
would open only one case that examines health insurance offered to the plan by
any of the carriers.
|
|
Employer identification numbers (EINs) on the series 5500 data could also
be used to count the number of health plans per firm. In addition to being
subject to the multiple-location problem mentioned above, large firms may
sponsor small plans, most of which would be exempt from filing. Thus 5500 data
are also unable to provide usable estimates of health plans offered at
individual firm locations.
|
|
The second data requirement (within-firm homogeneity in violation rates) is
highly problematic. Not only does it require knowledge of the quantities EBSA
is attempting to measure (violation rates) before they are measured, but it
requires knowledge of the extent to which those quantities vary from plan to
plan within firms having more than one plan. It seems reasonable to speculate
that there would be a substantial tendency for plans within the same firm to
be uniform in their compliance status. It does not seem reasonable to quantify
that speculation in the absence of any supporting data.(5)
|
|
The third data requirement is to estimate the distribution of plans by
their probability of selection. Within the large firm sample, the probability
of selection for firms is designed to be uniform. Probabilities of selection
for plans will not be uniform, however. As explained above, plans covering
workers at multiple subsidiaries will be investigated if the parent or any of
the subsidiaries it covers is selected for the sample. We are aware of no data
that permit an estimate of the distribution of plans by the number of
subsidiaries they cover, so this data requirement also remains unfulfilled.
|
|
The lack of data with which to credibly meet any of these data
requirements led to acceptance of a simple random sample design as the only feasible
approach. To the extent that firms offer only one health benefits package at
each location or consider the variety of benefits packages offered to be a
single ERISA plan, the approximation is accurate. In the small firm sample,
the approximation is undoubtedly very accurate. In the large firm universe,
the available data can provide only a (possibly substantial) overestimate of
the extent to which firms have multiple ERISA health plans covering workers at
individual locations.
|
|
Textbook formulas for computing the size of cluster samples cover only the
simplest cluster sample designs where either cluster size is constant or the
sampling fraction within each cluster is uniform. Cluster size for this
project (number of ERISA health plans per location of a firm) is clearly not
constant. Uniform sampling fractions are problematic when many clusters are of
size one, because any sampling fraction less than one will cause entire
clusters to drop out of the sample. Fortunately there are software packages
that can be used to estimate variance for more complex cluster sample designs.
|
|
Despite large gaps in the data required to implement a sample design for
this project, cluster design tools offer the only approach to answering one
fundamental design question - the number of plans to investigate for firms
with multiple plans. Simple random samples do not involve sub-sampling, so the
associated theory offers no guidance on this subject. This question is not
trivial because most cluster samples present a tradeoff between some number of
clusters sub-sampled at one rate and a higher number of clusters sub-sampled
at a lower rate, where both designs achieve the target variance, and thus
precision. The choice between the alternative designs is normally made on the
basis of cost.
|
|
To answer the question of the optimal number of plans to investigate per
firm, the Office of Policy and Research used a software package capable of
estimating variance from complex surveys, version 8.0 of the SAS/STAT software, which
includes a variance estimation procedure called PROC SURVEYMEANS. The analysis
using this procedure required estimates of the three factors mentioned above
as necessary for estimating the size of cluster samples. Guesses regarding
these factors were used, and the sensitivity of the conclusion to these
guesses was examined. The SAS program simulates the consequences of
alternative ceilings or caps on the number of plans investigated per firm. A
cap of three, for example, would mean that all plans of firms with three or
fewer plans would be investigated. At firms with more than three plans, three
plans would be randomly selected for investigation.
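The capping rule described here can be sketched in a few lines. This is not the SAS simulation program, whose code is not shown in the report; the function name and the plan labels are hypothetical and serve only to illustrate the rule.

```python
import random


def plans_to_investigate(plans, cap, rng=random):
    """Apply a cap on the number of plans investigated at one firm.

    Firms with `cap` or fewer plans have every plan investigated.
    Firms with more plans have `cap` plans drawn at random.
    """
    if len(plans) <= cap:
        return list(plans)
    return rng.sample(plans, cap)


# Hypothetical firms: one with two plans, one with five, under a cap of three.
print(plans_to_investigate(["plan A", "plan B"], cap=3))
print(plans_to_investigate(["p1", "p2", "p3", "p4", "p5"], cap=3))
```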
|
|
The simulations showed that estimated variance varied considerably between
simulations with the same assumptions due solely to chance, and that the
distribution of plans per sample firm was an important determinant of the
variance. Thus, to ensure with a high degree of confidence that the target
variance would be met, the program computes the 95th percentile of the variance.
For each set of assumptions, the sample size was selected to achieve the
target variance in 95 percent of the simulations. The figure shows how the number of
large firms to be sampled and the number of plans to be investigated varies
with the cap. Because the numerical assumptions underlying these estimates are
mere guesses, the sample size estimates are not usable. The usable conclusion
is that investigating all plans of sample firms minimizes not only the number
of firms to be visited, but also the number of plans to be investigated.
Fortunately, this conclusion proved insensitive to reasonable changes in the
three determining factors.(6) For this reason, it was decided to investigate all
health plans covering workers at the selected location of each sample firm.
|
|
Two of the ERISA Part 7 statutes(7) are applicable only to plans having at
least two participants who are current employees. To reduce the chances that
plans located would be exempt from these statutes it was decided to limit the
universe to firms having at least three employees.
|
|
A comprehensive database of U.S. companies maintained by Dun and Bradstreet
(D & B) was selected as the sampling frame. This database includes records
for branch locations. According to the D & B definition, branches are
locations of a company with no separate legal responsibility for their debts.
For this reason, branch locations were believed to lack the authority to
sponsor their own health plans. Although it is possible that a small number of
firms sponsor separate health plans for one or more of their branches,
including branches in the samples would have complicated the investigation of
health plans for branches in the far more common situation where branch
workers are covered under a headquarters plan. Experienced EBSA investigators
judged the existence of separate plans for branches to be too rare to justify
the added investigatory complexity. Branches were therefore excluded from the
sample.
|
|
The universe for the study was restricted in two other ways intended to
simplify investigations and reduce their cost without significantly
compromising the findings. First, sponsor firms were geographically limited to
those located in either the District of Columbia or one of the 50 states.
Second, firms were limited to those having at least three employees. Although
some firms with fewer than three employees sponsor ERISA health plans, most
firms that small do not sponsor health plans, and many of those that do are
not ERISA plans. The effort to screen large numbers of such tiny firms for
ERISA health plans was judged too great to justify the small expansion in the
scope of the study.
|
|
At the request of EBSA, D & B drew two separate simple random samples
from their database - 1,604 private-sector firms having 3-99 employees, and
622 private-sector firms with 100 or more employees. These numbers of firms
were calculated so that the number of in-scope firms with health plans would
at least equal the target sample sizes.
|
|
The D & B database has no flag to distinguish private sector from
public sector organizations. It does have an eight-digit Standard Industrial
Classification (SIC) code. A list of 17 D & B SIC codes (or ranges of codes) was
used to exclude from the D & B sampling frame organizations such as public
secondary schools that were clearly public sector organizations and
organizations whose plans were judged likely to qualify for the ERISA church
plan exemption. (See Attachment 1.)
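One way such an exclusion list could be applied is sketched below. The patterns are a subset of those in Attachment 1, with "x" read as a wildcard for any digit. The matching logic, the function names, and the sample codes are assumptions made for illustration and do not describe how D & B actually filtered its database.

```python
# Subset of the Attachment 1 patterns; "x" stands for any digit.
EXCLUDED_PATTERNS = ["43xx xxxx", "8211 03xx", "8661 xxxx", "9xxx xxxx"]


def matches(sic_code, pattern):
    """Return True if an eight-digit SIC code fits a pattern with 'x' wildcards."""
    digits = sic_code.replace(" ", "")
    template = pattern.replace(" ", "")
    if len(digits) != len(template):
        return False
    return all(t == "x" or t == d for d, t in zip(digits, template))


def is_excluded(sic_code, patterns=EXCLUDED_PATTERNS):
    """Return True if the code falls under any excluded pattern."""
    return any(matches(sic_code, p) for p in patterns)


# Hypothetical eight-digit codes used only to exercise the matcher.
for code in ["4311 0000", "8211 0301", "5045 0000"]:
    print(code, is_excluded(code))
```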
|
|
Calculation of Large and Small Firm Sample Sizes
- In the EBSA project that was the source of the estimated 25 percent violation rate
ceiling, plans were selected for investigation through EBSA’s normal
targeting methods rather than through random sampling. Violation rates in
randomly selected cases will undoubtedly be lower than in targeted cases, but
the magnitude of the difference is unclear. The sample size calculation for
the large and small firms was based on a 22 percent violation rate ceiling. This
ceiling resulted from the judgment that three percentage points is the
smallest conceivable amount by which single-employer violation rates in
targeted cases could exceed those for random cases.
|
|
Just as in the multiemployer sample, the infinite population sample size
was computed using both of the available tools. The sample size computed using
the t-test procedure was 448. The chi-square sample size procedure estimated a
sample size of 446. The larger, and thus more conservative, sample size of 448
was corrected for the actual finite populations. After adjustment for a
population size(8) of 134,016, the large firm sample size became 444. The small
firm population size of 4,957,773 was sufficiently large to leave the infinite
population sample size of 448 unaffected by the finite population correction
after rounding. These estimates of the size of the large and small firm
universes were provided by D & B at the time of sample selection.
|
|
Strategy for Contacting Firms and Multiemployer Plans
- Achieving the target number of investigations of small firm plans, large
firm plans, and multiemployer plans required contacting more than the target
number of sample units due to firms/plans being out-of-scope, unreachable, or
the subject of a non-project investigation in the past 12 months.(9) (The most
common reason that firms, especially small firms, were out-of-scope was that
they did not sponsor health plans.)(10) The number of firms and plans to contact
was therefore unknown at the start of the project. An approximation of the
number of firms and plans to contact could have been calculated given
estimates of the rates at which contacts would yield in-scope health plans,
but a more accurate method was chosen.
|
|
A longer-than-needed list of sample units was prepared for each of the
three samples and sorted into random order. The first round involved
contacting firms and plans up to the target number of investigations from the
top of the randomly ordered list. Based on experience from this round, the
size of the second round of contacts was estimated. The target number of
investigations for each sample was thus approached incrementally.
|
|
Calculating Sample Weights - The sample weights are the ratio of the universe size to the sample size.
For purposes of the weighting calculation, the sample size is the number of
attempted contacts, as opposed to the number of plans investigated. Weights
computed in this manner support estimates of the results that would have been
found had the project screening and investigation methodology been applied to
the universe of private sector health plans. Attempted contacts to sample
units that did not lead to investigations because the sample unit was
out-of-scope, unreachable, or ineligible for investigation due to a recent
prior investigation (see Table 1) thus represent corresponding segments of the
health plan universe that would not have led to investigation had the project
targeted the entire universe. Among unreachable sample units (multiemployer
plans or firms), there was an unknown number of in-scope plans. No attempt
has been made to impute the number of such plans or their violation rates. The
inability to represent this portion of the universe results in some degree of
underestimation of the number of health plans in Table 2. Violation rates could be biased
for the same reason in either direction, depending on whether violation rates
among plans of unreachable firms were higher or lower than rates among
reachable plans.
|
|
Due to the incremental contact strategy, the number of attempted contacts
was not known until near the end of the project. The final counts (including
plans investigated under recent, non-project Part 7 investigations) are shown
in the sample size column below and in Table 1.
|
|
The probabilities of selection are therefore:
|
|
| Sample | Universe Size | Sample Size | Prob. of Selection | Reciprocal of Prob. of Selection |
| Large Firm | 134,016 | 623 | 0.00465 | 215.11 |
| Small Firm | 4,957,773 | 1,604 | 0.000324 | 3090.88 |
| Multis | 2,169 | 510 | 0.2351 | 4.25 |
| Total | 5,093,958 | 2,737 | | |
|
|
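The probabilities and reciprocals in the table above follow directly from the universe and sample sizes. A minimal sketch of that arithmetic, with hypothetical names for the dictionary and function:

```python
# (universe size, sample size) for each sample, as printed in the table.
SAMPLES = {
    "Large Firm": (134_016, 623),
    "Small Firm": (4_957_773, 1_604),
    "Multis": (2_169, 510),
}


def selection_probability(universe_size, sample_size):
    """Probability that any one unit in the universe is drawn."""
    return sample_size / universe_size


# The base weight is the reciprocal of the probability of selection.
weights = {
    name: 1 / selection_probability(universe, sample)
    for name, (universe, sample) in SAMPLES.items()
}

for name, weight in weights.items():
    print(f"{name}: {weight:.2f}")
```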
For multiemployer plans and for plans of large and small firms that cover
no subsidiaries, the statistical weights are simply the reciprocals of the
probabilities of selection, as shown in the last column. The probability of
selection, Pi, for a plan i that covers Li large
subsidiaries(11) and Si small
subsidiaries is:
$$P_i = 1 - (1 - P_S)^{S_i} \, (1 - P_L)^{1 + L_i}$$
|
|
PS and PL are the probabilities of selection for small and large firms (or
subsidiaries). This formula, which is derived in Attachment 2, was applied
solely in the large firm sample because no plans covering subsidiaries were
identified through the small firm sample.
|
|
The weight for plan i is the reciprocal of Pi. Some of the weights that
result from applying this formula using the large and small firm probabilities
of selection shown above are:
|
| Probabilities of Selection and Weights for Plans Covering Selected Numbers of Large and Small Subsidiaries |
| Subsidiaries Covered by Plan: Large Firms | Subsidiaries Covered by Plan: Small Firms | Plan Prob. of Selection | Weight |
| 0 | 0 | 0.004641 | 215.47 |
| 0 | 1 | 0.004963 | 201.51 |
| 0 | 2 | 0.005284 | 189.25 |
| 1 | 0 | 0.009260 | 107.99 |
| 1 | 1 | 0.009580 | 104.38 |
| 1 | 2 | 0.009900 | 101.01 |
| 2 | 0 | 0.013858 | 72.16 |
| 2 | 1 | 0.014177 | 70.54 |
| 2 | 2 | 0.014495 | 68.99 |
| 2 | 3 | 0.014814 | 67.50 |
|
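The plan-level probability formula can be checked numerically with a short sketch. The function and constant names are hypothetical. The large firm probability of 0.004641 is read from the zero-subsidiary row of the table above, and the small firm probability of 0.000324 from the preceding table; because these inputs are rounded, results may differ from the printed table in the last digit.

```python
def plan_selection_probability(p_large, p_small, large_subs, small_subs):
    """Probability that a plan enters the sample.

    The plan is missed only if the sponsoring firm (counted with the large
    subsidiaries, hence 1 + large_subs) and every covered small subsidiary
    are all left out of the sample.
    """
    miss_small = (1 - p_small) ** small_subs
    miss_large = (1 - p_large) ** (1 + large_subs)
    return 1 - miss_small * miss_large


P_LARGE = 0.004641  # rounded value from the zero-subsidiary table row
P_SMALL = 0.000324  # rounded small firm probability of selection

# A plan covering the parent plus one large subsidiary and no small ones.
p_one_large = plan_selection_probability(P_LARGE, P_SMALL, large_subs=1, small_subs=0)
weight_one_large = 1 / p_one_large
print(f"{p_one_large:.6f} {weight_one_large:.2f}")
```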
|
|
Reliability of Estimates
|
|
EBSA attempted to minimize all types of error in this project.
Nevertheless, violation rates estimated from this survey may differ from the
true universe violation rates for a number of reasons:
|
|
Sampling Error - This error refers to the risk that the true violation rate among sample
plans and firms differed from the true violation rate among all plans and
firms simply because the random sample did not perfectly represent the
corresponding universe. This is the error that sampling theory attempts to
control and statistical theory attempts to measure with tools such as
confidence intervals.
|
|
Tables 3 and 4 provide lower and upper 95 percent confidence limits for
violation rates for each sample and statute. The first row of Table 3, for
example, shows a lower confidence limit of 41 percent and an upper confidence limit
of 50 percent for the 45 percent point estimate of the overall Part 7 violation rate for all
plans. The confidence limits indicate that there is a 95 percent chance that the
interval from 41 percent to 50 percent brackets the true overall Part 7 violation rate.
|
|
Response Bias - If the sample units from whom data cannot be collected are meaningfully
different from sample units from whom data can be collected, the resulting
response bias is a source of measurement error. Although response bias
generally cannot be directly measured, a response rate is often computed to
assess the potential for response bias. In this project, the response rate
concept can be applied to phase I, to phase II, and to the project as a whole.
For the large and small firm samples, the first phase involved calls by national office
coordinators to sample firms provided by D & B.
Coordinators were unable to contact 342 firms, 87 percent of which were in the small
firm sample (Table 1). Thus for this phase of the effort, the response rate
was 87.3 percent. Table A shows the derivation of this percentage and the
considerable variation in these response rates across samples.
|
|
The second phase of the project was the investigation of plans determined
in the first phase to be in-scope. EBSA has authority to investigate all
in-scope health plans and consistently invoked this authority to achieve a 100
percent rate of response for the second phase.
|
|
Computing the response rates for phases I and II combined is more difficult
because there is no way of knowing the percentage of unreachable firms that
sponsored in-scope health plans, so the denominator of the overall response
rate is not known. Because 70 percent of small firms that could be contacted were
out-of-scope, it seems likely that among unreachable firms, the percentage
out-of-scope would be at least that high. That assumption underlies the
estimates that appear in the bottom row of Table A.
|
|
Because the actual percentage of unreachable firms that were out-of-scope
could be as low as 0 percent or as high as 100 percent, combined phase I-phase II response
rates are also computed using these assumptions. The result is a range of
possible overall response rates from a low of 78 percent (if all unreachable firms
are in-scope) to a high of 98 percent (if all unreachable firms are out-of-scope).
The response rate derived from the assumption that unreachable firms are
in-scope to the same extent as reachable firms is 86 percent, and it seems reasonable
to hope that this estimate is low.
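The range of overall response rates can be reproduced from the counts printed in the Total columns of Table A. The sketch below is illustrative only: the variable names are hypothetical, and the counts of 12 and 38 are used exactly as they appear in the Table A numerator and denominator, with their definitions left to Table 1.

```python
IN_SCOPE_REACHABLE = 1_267      # reachable units found to be in-scope
OUT_OF_SCOPE_REACHABLE = 1_090  # reachable units found to be out-of-scope
UNREACHABLE = 342               # units that could not be contacted
NUMERATOR_EXTRA = 12            # added to the numerator in Table A
DENOMINATOR_EXTRA = 38          # added to the denominator in Table A


def percent(numerator, denominator):
    return 100 * numerator / denominator


reachable = IN_SCOPE_REACHABLE + OUT_OF_SCOPE_REACHABLE
phase_one_rate = percent(reachable, reachable + UNREACHABLE)


def combined_rate(share_of_unreachables_in_scope):
    """Overall rate when a given share of unreachable units is assumed in-scope."""
    usable = IN_SCOPE_REACHABLE + NUMERATOR_EXTRA
    in_scope = IN_SCOPE_REACHABLE + DENOMINATOR_EXTRA
    return percent(usable, in_scope + UNREACHABLE * share_of_unreachables_in_scope)


in_scope_share = IN_SCOPE_REACHABLE / reachable
high = combined_rate(0.0)               # all unreachables out-of-scope
low = combined_rate(1.0)                # all unreachables in-scope
middle = combined_rate(in_scope_share)  # same in-scope share as reachables
print(f"{phase_one_rate:.1f} {high:.1f} {low:.1f} {middle:.1f}")
```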
|
|
Error in Identification of Firms with In-Scope Plans - National office coordinators contacted sample firms to determine whether
they sponsored health plans. Sample firms determined to have in-scope health
plans were referred to the field for investigation. In some cases, the
investigators found that the initial determination by the national office was
wrong and that, in fact, the firm did not sponsor an in-scope plan. There was
no comparable check for firms determined by the national office not to have
health plans. Thus it is likely that national office coordinators failed to
identify all firms that had health plans.
|
|
Coordinators began their contacts with firms by identifying themselves as
employees of the Employee Benefits Security Administration because less
direct approaches were regarded as unethical. One reason that in-scope health
plans may have been missed is that firms falsely claimed not to have a health
plan because they knew they were speaking to a representative of the agency
that investigates health plans. It is likely that violation rates among plans
that were not identified were different from the violation rates measured,
especially if deliberate evasion occurred.
|
|
Sampling Frame Non-Coverage - EBSA relied on the Form 5500 filings as the sampling frame for
multiemployer plans, and on D & B for firm data. It is possible that multiemployer
plans or firms with plans were missing from these frames. The
potential for error from this source is probably small, however. Plans as
large as most multiemployer plans are very unlikely to avoid filing partly
because EBSA has a Division of Reporting Compliance that identifies
non-filers. Maintenance of the D & B database is a high priority for that
company as it is the foundation for a number of that company’s products. It
is frequently used as a sampling frame for surveys of firms.
|
|
Investigator Error - As described in the body of the report, EBSA devoted considerable resources
to training investigators for Part 7 investigations. Nevertheless, human error
in identification or reporting of violations may have occurred.
|
|
|
|
|
|
Table A. Response Rates in the Three Project Samples
| Response Rate for | Sample: Large | Sample: Multi | Sample: Small | Total: Numerator | Total: Denominator | Total: Percent |
| Phase I - Determining if sample unit has in-scope plan | 93.4% | 99.2% | 81.4% | 1,267+1,090 | 1,267+1,090+342 | 87.3% |
| Phase II - Investigating in-scope plans | 100% | 100% | 100% | | | 100% |
| Phases I and II combined - Percentage of in-scope units with usable data | | | | | | |
| Assuming unreachables are always out-of-scope | 98.3% | 96.3% | 99.5% | 1,267+12 | 1,267+38 | 98.0% |
| Assuming unreachables are always in-scope | 90.7% | 95.4% | 56.8% | 1,267+12 | 1,267+38+342 | 77.7% |
| Percentage of reachables found to be in-scope | 81.4% | 84.7% | 30.2% | 1,267 | 1,267+1,090 | 53.8% |
| Assuming unreachables are in-scope to the same extent as reachables | 92.0% | 95.5% | 81.1% | 1,267+12 | 1,267+38+342 x 0.538 | 85.9% |
Source: Table 1
|
|
|
|
|
|
|
|
|
|
|
|
|
Standard Industrial Classification Codes of Organizations Excluded from the Universes for the Samples of Large and Small Firms
| SIC | Meaning |
| 43xx xxxx | U.S. Postal Service |
| 8049 9905 | Christian Science practitioner |
| 8211 01xx | Catholic elementary and secondary schools |
| 8211 03xx | Public elementary and secondary schools |
| 8211 99xx | Elementary and secondary schools, nec(12) |
| 8221 0202 | Theological seminaries |
| 8222 xxxx | Junior colleges |
| 8299 9904 | Bible school |
| 8299 9913 | Religious school |
| 8231 03xx | General public libraries |
| 8412 0101 | Art gallery, noncommercial |
| 8422 0103 | Zoological garden, noncommercial |
| 8661 xxxx | Churches, temples, and shrines and non-church religious organizations (convent, monastery, religious instruction) |
| 8699 0201 | Christian Science reading room |
| 8699 0204 | Reading room, religious materials |
| 8999 0601 | Christian Science lecturers |
| 9xxx xxxx | Governmental and non-classifiable organizations |
|
|
|
A plan $i$ is in the sample if the sponsoring firm, or any of its
subsidiaries that have employees covered under plan $i$, is in the
sample. Assume that firms with subsidiaries have 100 or more employees and
therefore fall into the large category.
|
|
Let $L_i$ be the number of large subsidiaries having employees covered under
plan $i$.
|
|
Let $S_i$ be the number of small subsidiaries having employees covered under
plan $i$.
|
|
Let $P_L$ be the probability of selection for large firms.
|
|
Let $P_S$ be the probability of selection for small firms.
|
|
Let $P_i$ be the probability of selection for plan $i$.
|
|
$1 + L_i$ is the number of large subsidiaries or parents covered under plan $i$.
|
|
$1 - P_L$ is the probability that one large subsidiary or parent is not in the
sample.
|
|
$(1 - P_L)^{1 + L_i}$ is the probability that none of the $1 + L_i$ large firms
or subsidiaries fall in the sample.
|
|
$1 - P_S$ is the probability that one small subsidiary is not in the sample.
|
|
$(1 - P_S)^{S_i}$ is the probability that none of the $S_i$ small subsidiaries
fall in the sample.
|
|
Treating the selections as independent:
|
$1 - P_i = P(\text{plan } i \text{ is not in the sample})$
$= P(\text{none of the } S_i \text{ small covered subsidiaries are in the sample and none of the } 1 + L_i \text{ large covered firms or subsidiaries are in the sample})$
$= P(\text{none of the } S_i \text{ small covered subsidiaries are in the sample}) \times P(\text{none of the } 1 + L_i \text{ large covered firms or subsidiaries are in the sample})$
|
|
Substituting the two expressions derived above, we have:
|
|
$1 - P_i = (1 - P_S)^{S_i} (1 - P_L)^{1 + L_i}$
|
|
Solving for $P_i$ yields:
|
|
$P_i = 1 - (1 - P_S)^{S_i} (1 - P_L)^{1 + L_i}$
|
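The plan selection probability derived above can be evaluated numerically. The following is a minimal sketch under the same assumptions as the derivation (independent selections, and a parent firm that always counts as large). The function name, argument names, and example rates are hypothetical and do not come from the report.

```python
def plan_selection_probability(p_small, p_large, n_small_subs, n_large_subs):
    """Return P_i = 1 - (1 - P_S)^S_i * (1 - P_L)^(1 + L_i).

    p_small, p_large: selection probabilities for small and large firms.
    n_small_subs, n_large_subs: counts S_i and L_i of covered subsidiaries.
    """
    if not (0.0 <= p_small <= 1.0 and 0.0 <= p_large <= 1.0):
        raise ValueError("selection probabilities must lie in [0, 1]")
    if n_small_subs < 0 or n_large_subs < 0:
        raise ValueError("subsidiary counts must be non-negative")

    # Probability that none of the S_i small subsidiaries is sampled.
    p_no_small = (1.0 - p_small) ** n_small_subs
    # Probability that neither the parent nor any of the L_i large
    # subsidiaries is sampled (hence the exponent 1 + L_i).
    p_no_large = (1.0 - p_large) ** (1 + n_large_subs)
    return 1.0 - p_no_small * p_no_large


if __name__ == "__main__":
    # Hypothetical rates: 1% for small firms, 10% for large firms.
    print(plan_selection_probability(0.01, 0.10, n_small_subs=3, n_large_subs=2))
```

With no subsidiaries at all, the expression reduces to the large-firm selection probability, which gives a quick sanity check on the exponent $1 + L_i$.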
|
|
|
Footnotes
|
1. The statistical basis for this code is: Agresti, A. (1990), Categorical Data Analysis, New York: John Wiley & Sons, Inc.
2. Power is defined as one minus the probability of type II error. For this initiative, it is the probability of correctly concluding that the true universe violation rate has changed given that it actually did change by 10 percentage points. The goal of capping the probability of type II error at 5 percent may also be stated as achieving power of at least 95 percent.
3. If a simple random sample of size n’ achieves the target variance for an infinite population, then the sample size n that achieves the same variance in a population of size N is:
4. For example, the assumption that estimated percentages will not exceed 25 percent reduces sample size by 25 percent compared to the sample size calculated using no advance knowledge and assuming that variance-maximizing estimates of 50 percent are possible.
5. An examination of records from a 1999 pilot project coordinated by the EBSA Office of Health Plan Standards and Compliance Assistance found no more than three firms where more than one plan was investigated. This number was judged too small to provide usable empirical information, especially since the cases were not randomly targeted.
6. It was also fortunate that there was no need to choose a sub-sampling fraction on the basis of cost, because the relationship between sample design and travel cost would have been difficult to estimate.
7. Namely, the Health Insurance Portability and Accountability Act (HIPAA) and the Women’s Health and Cancer Rights Act (WHCRA).
8. See footnote #3 for formula.
9. The random identification of the multiemployer and single-employer plans meant that we did not have reasonable cause under Section 504(b), so we did not open a new case on a sample entity’s health plan for which we had an open case or closed case in the preceding 12 months. Section 504(b) of ERISA states that: “The Secretary may not under the authority of this section require any plan to submit to the Secretary any books or records of the plan more than once in any 12 month period, unless the Secretary has reasonable cause to believe there may exist a violation of this title or any regulation or order thereunder.”
10. Table 1 provides the complete list of the reasons why firms/plans were found to be out-of-scope along with frequency counts for each reason and sample.
11. Parent companies are assumed to be large, so the number of large firms or subsidiaries is one more than the reported number of large subsidiaries.
12. Nec means "not elsewhere classified." Such schools would include non-Catholic religious schools, and possibly public-private hybrid schools such as charter schools.
|
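The sample size relationships described in the footnotes on finite populations and on the 25 percent cap can be illustrated numerically. The following is a sketch under stated assumptions, not a reproduction of the report's own formula. It assumes the usual variance of a simple random sample drawn without replacement, S²(1/n - 1/N), and a required sample size proportional to p(1 - p) for an estimated proportion p. The function names and example values are hypothetical.

```python
def finite_population_sample_size(n_infinite, population_size):
    """Sample size n giving the same variance as n_infinite does for an
    infinite population.

    Assumes Var = S^2 * (1/n - 1/N); setting this equal to S^2 / n'
    and solving for n gives n = n' / (1 + n' / N).
    """
    if n_infinite <= 0 or population_size <= 0:
        raise ValueError("sample and population sizes must be positive")
    return n_infinite / (1.0 + n_infinite / population_size)


def relative_sample_size(p_cap, p_worst=0.5):
    """Ratio of the sample size needed when proportions are capped at p_cap
    to the size needed at the variance-maximizing proportion p_worst.

    Assumes required sample size is proportional to p * (1 - p).
    """
    return (p_cap * (1.0 - p_cap)) / (p_worst * (1.0 - p_worst))


if __name__ == "__main__":
    # Hypothetical case: n' = 400 for an infinite population, N = 1,000.
    print(finite_population_sample_size(400, 1000))
    # Capping estimates at 25 percent instead of 50 percent.
    print(relative_sample_size(0.25))
```

Under these assumptions, a cap of 25 percent gives a ratio of 0.75, which matches the 25 percent reduction in sample size that the footnote describes.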