

Quality assessment is likely to focus on the following elements of the
economic evaluation, each of which can have an important impact on the
validity of the overall results of that study.

  • Methods of deriving the effectiveness data

  • Measurement of resource data

  • Valuation of resource data

  • Measurement and valuation of health benefits (utilities)

  • Method of synthesising the costs and effects

  • Analysis of uncertainty

  • Generalisability of the results

This is not an exhaustive list, but an understanding of these issues,
which are discussed in more detail below, will provide insight into the
quality assessment of economic evaluations. Quality assessment of decision
models is not covered in detail here due to the technical nature of the
material. It is recommended that more detailed information on good practice
in decision modelling be consulted.

There is a hierarchy of sources of evidence ranging from a formal systematic
review to expert opinion and authors’ assumptions. Where possible, economic evaluations
should use effectiveness data obtained from a systematic review. However,
non-systematic synthesis of effectiveness data may be justifiable when
it is the only available source of evidence.

The type of effectiveness data included in an economic evaluation can
vary from a single efficacy parameter obtained from a meta-analysis of
RCTs to epidemiological data mapping the natural history of disease. Quality
assessment of the clinical effectiveness data incorporated in an economic
evaluation will depend on the type of clinical data used; whether the
data were obtained from a single study or from the literature or from
expert opinion; and whether modelling techniques were used.

When the effectiveness data have been derived from a single study, quality
assessment should be undertaken as described in Chapter 1. However,
additional elements will also need to be assessed: for example, whether
the study time horizon is adequate to capture all the relevant health
outcomes and, if statistical modelling techniques have been used
to extrapolate the data, whether the extrapolation methods and assumptions
used were appropriate.

When the effectiveness data have been synthesised from a variety of sources,
assessment should focus on the quality of the literature review and the
methods used to synthesise the data, including:

  • Whether a search
    strategy was used

  • Which databases
    were searched

  • Whether there
    were clear inclusion and exclusion criteria

  • Whether sufficient
    information was given about the quality of the included studies

Quality assessment of cost analysis should consider which costs were
evaluated in the study, the measurement of the associated resource quantities,
and the valuation (cost) of those resources. Some of the issues that need
to be assessed are common to all economic evaluations, while others are
specific to the type of approach used.

For any economic evaluation all costs relevant to the study question
and the perspective adopted or viewpoint from which the analysis has been
undertaken should have been included. For example, patient travel costs
are a cost from the patient’s perspective and a cost from society’s perspective,
but not a cost from the hospital’s perspective.

Measurement of resource data

Resource use is measured in physical units such as equipment, staff,
dressings and drugs. Issues to consider are as follows:

  • The sources used
    to collect resource utilisation data should be reported clearly (e.g.
    clinical trials, administrative databases, clinical databases, medical
    records and published literature)

  • Resource quantities
    should be reported independently from the costs, so that assessment of
    the measurement method is facilitated

  • Any assumptions
    in the measurement of resources should be explicitly reported and justified

  • If an expert was
    consulted to estimate some of the resources, the methods used should be
    described

For trial-based economic evaluations, the most
valid resource estimates are considered to be those collected prospectively
alongside effectiveness data, utilising the robust infrastructure established
for the trial.

If the resources used were identified through a
review of the literature, details of the process employed to identify
and select the patterns of resource utilisation and the quantities used
should have been given.

Valuation of resource data

For the valuation of resources, the relevant issues to consider are
as follows:

  • All the sources
    used to obtain unit costs should be reported and be relevant for the specific
    study setting

  • All costs should
    be adjusted to a specific price year so that the effects of inflation
    are removed from the cost estimation

  • If
    the time horizon for estimating costs was longer than one year, discounting
    should have been performed in order to reflect time preferences

  • If prices
    were used instead of costs, and cost-to-charge ratios calculated, these
    should reflect the true opportunity costs of the strategies compared
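
The price-year adjustment and discounting described above can be illustrated with a minimal Python sketch. The price index values, costs and the 3.5% discount rate are hypothetical, and the convention that costs in the first year are left undiscounted is an assumption; an appraised study may legitimately use a different index, rate or timing convention.

```python
def inflate_to_price_year(cost, index_source_year, index_price_year):
    """Re-express a cost in the chosen price year using a price index."""
    return cost * index_price_year / index_source_year


def present_value(yearly_amounts, discount_rate):
    """Discount a stream of yearly amounts to present value.

    Assumption: the first amount occurs in year 0 and is not discounted.
    """
    return sum(
        amount / (1 + discount_rate) ** year
        for year, amount in enumerate(yearly_amounts)
    )


# Hypothetical unit cost and price index values (not taken from any real index).
unit_cost_source_year = 100.0
index_source_year, index_price_year = 200.0, 220.0
unit_cost_price_year = inflate_to_price_year(
    unit_cost_source_year, index_source_year, index_price_year
)  # 110.0

# Hypothetical three-year cost stream, discounted at an assumed 3.5% per year.
yearly_costs = [1000.0, 1000.0, 1000.0]
discounted_total = present_value(yearly_costs, 0.035)
undiscounted_total = sum(yearly_costs)
```

When assessing a study, the point is simply to check that both steps were performed and reported, and that undiscounted totals can be recovered from the paper.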

Utilities may be measured using either a generic valuation tool, such
as the SF-6D or the EQ-5D, or a disease-specific tool which may have been
obtained using either standard gamble or time trade-off techniques. Tools
differ considerably (a full discussion is given in the books by Drummond
and Brazier) and the choice of tool can affect both
the results obtained and their usefulness in priority setting. As a
minimum, assessment should consider who provided the scores (patients,
clinicians, general public, etc.), which tool was used (EQ-5D, SF-6D,
etc.) and when the scores were elicited (at baseline, during treatment,
after treatment, etc.). A useful overview and comparison of the impact
of different measures in rheumatoid arthritis is available.

The true economic value of one intervention compared with another depends
on the additional costs and benefits. Incremental cost-effectiveness ratios
(ICERs) capture this relative value. Unless a treatment is
clearly dominant (both cheaper and more effective), ICERs
should have been calculated, as this is the only appropriate
way of capturing the true economic value. A paper should report sufficient
data to ascertain dominance from the figures given, rather than relying
on a statement from the authors, which may be made in error and be
misleading. Cost-effectiveness results should have been reported in both
a disaggregated and an aggregated way. That is, undiscounted and discounted
health benefit and cost results should have been reported both separately
and as part of the ICERs. It is also appropriate to report the net benefit
statistic, which is sometimes used to overcome the statistical issues
raised when dealing with a ratio such as the ICER.
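
As an illustration of how dominance, the ICER and the net benefit statistic relate to the reported figures, the following Python sketch computes all three from mean costs and effects. The function name, the input values and the willingness-to-pay threshold are hypothetical; net benefit is expressed here in monetary terms as threshold × incremental effect − incremental cost.

```python
def incremental_analysis(cost_new, effect_new, cost_old, effect_old, threshold):
    """Compare a new intervention with a comparator.

    Returns the incremental cost and effect, a verdict, the ICER (None when
    a ratio is not needed or not defined) and the incremental net monetary
    benefit at the given threshold.
    """
    d_cost = cost_new - cost_old
    d_effect = effect_new - effect_old

    if d_cost < 0 and d_effect > 0:
        verdict = "dominant"      # cheaper and more effective
    elif d_cost > 0 and d_effect < 0:
        verdict = "dominated"     # more costly and less effective
    else:
        verdict = "trade-off"     # an ICER is needed to judge value

    icer = d_cost / d_effect if verdict == "trade-off" and d_effect != 0 else None
    net_benefit = threshold * d_effect - d_cost
    return {
        "incremental_cost": d_cost,
        "incremental_effect": d_effect,
        "verdict": verdict,
        "icer": icer,
        "net_benefit": net_benefit,
    }


# Hypothetical means: costs in currency units, effects in QALYs.
result = incremental_analysis(12000.0, 6.0, 10000.0, 5.5, threshold=20000.0)
# result["icer"] is 4000.0 per QALY; result["net_benefit"] is 8000.0

cheaper_and_better = incremental_analysis(9000.0, 6.0, 10000.0, 5.5, threshold=20000.0)
# cheaper_and_better["verdict"] is "dominant", so no ICER is reported
```

A reviewer can apply the same arithmetic to the disaggregated costs and effects in a paper to check a reported ICER or an authors' claim of dominance.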

A well-conducted economic evaluation should investigate, as thoroughly
as possible, the following sources of uncertainty:

  • Parameter uncertainty,
    which occurs because parameters are estimated from samples and their true
    value is unknown

  • Methodological
    uncertainty, which arises from the analytical methods used in the evaluation,
    particularly where there is disagreement around the methods used (e.g.
    the inclusion of indirect costs, discounting of health benefits, discount
    rates)

  • Modelling uncertainty,
    which can arise due to the simplifying assumptions that are often required
    to facilitate modelling

Methods of evaluating uncertainty include statistical comparisons, bootstrapping,
sensitivity analyses (one-way or multi-way sensitivity analyses, threshold
analyses and analyses of extremes or worst/best case analysis) and probabilistic
sensitivity analyses. The method(s) employed will vary depending on what
is being assessed and the types of data that were used as input parameters
in the economic evaluation.

Statistical tests comparing effects, costs or cost-effectiveness are
appropriate for studies that have derived their effectiveness and costs
from patient level data. The quality assessment of the statistical comparisons
performed should focus on the appropriateness of the type of tests used
and the results reported (e.g. 95% confidence intervals; p-values).

Bootstrapping is a statistical method that can be applied to capture
uncertainty where patient level data are used. Because the ICER is
a ratio, normal parametric statistical methods based on the standard error
cannot be used. Non-parametric bootstrapping is an alternative method
which allows a comparison of the arithmetic means without making any assumptions
about the sampling distribution. However, it should be noted that economic
evaluations can use a net benefit statistic rather than an ICER to overcome
the statistical problems associated with a ratio.
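
A minimal sketch of non-parametric bootstrapping applied to the net benefit statistic is given below, using only the Python standard library. The patient-level data, the threshold, the number of replicates and the use of a simple percentile interval are all illustrative assumptions rather than a recommended specification. Patients are resampled as (cost, effect) pairs so that any correlation between costs and effects is preserved.

```python
import random
import statistics


def bootstrap_incremental_net_benefit(treatment, control, threshold,
                                      n_boot=2000, seed=0):
    """Bootstrap the difference in mean net benefit between two trial arms.

    treatment, control: lists of (cost, effect) pairs, one per patient.
    Returns the mean of the bootstrap replicates and a 95% percentile interval.
    """
    rng = random.Random(seed)

    def mean_net_benefit(sample):
        return statistics.fmean(threshold * effect - cost for cost, effect in sample)

    replicates = []
    for _ in range(n_boot):
        # Resample each arm with replacement, keeping the original arm sizes.
        t_sample = rng.choices(treatment, k=len(treatment))
        c_sample = rng.choices(control, k=len(control))
        replicates.append(mean_net_benefit(t_sample) - mean_net_benefit(c_sample))

    replicates.sort()
    lower = replicates[int(0.025 * n_boot)]
    upper = replicates[int(0.975 * n_boot) - 1]
    return statistics.fmean(replicates), (lower, upper)


# Hypothetical patient-level data: (cost, QALYs) per patient.
treatment_arm = [(1200, 0.70), (1500, 0.80), (900, 0.65), (2000, 0.85), (1100, 0.75)]
control_arm = [(800, 0.60), (1000, 0.62), (700, 0.55), (1300, 0.70), (900, 0.58)]

mean_inb, (ci_lower, ci_upper) = bootstrap_incremental_net_benefit(
    treatment_arm, control_arm, threshold=20000
)
```

In a real trial the arms would contain far more patients; the tiny samples here only keep the example readable.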

Sensitivity analyses of parameter uncertainty are usual in economic
evaluations that obtain their data from systematic or other reviews. The
aim of the sensitivity analyses is to evaluate the sensitivity of the
results to changes in the parameter estimates. N-way sensitivity analyses
and threshold analysis can only vary a few parameters at the same time
in practice. In contrast, probabilistic sensitivity analysis (PSA) (see
below) can vary all parameters at the same time, subject to data availability.

The following issues should be assessed:

  • Whether the parameters
    chosen were justified

  • Whether variations
    were performed across meaningful ranges of values

  • Whether the robustness
    of the results was assessed according to a previously agreed level of
    ‘acceptable variation’
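
The idea of a one-way sensitivity analysis can be sketched in Python as follows. The toy model, base-case values, parameter ranges and the threshold standing in for a previously agreed level of acceptable variation are all hypothetical; each parameter is varied between its low and high value while the others are held at their base-case values.

```python
def icer(params):
    """Toy model: ICER from mean costs and QALYs of two strategies."""
    d_cost = params["cost_new"] - params["cost_old"]
    d_effect = params["qaly_new"] - params["qaly_old"]
    return d_cost / d_effect


def one_way_sensitivity(model, base_case, ranges):
    """Evaluate the model at the low and high value of each parameter in turn."""
    results = {}
    for name, (low, high) in ranges.items():
        results[name] = tuple(model({**base_case, name: value}) for value in (low, high))
    return results


# Hypothetical base case and ranges; a study should justify both.
base_case = {"cost_new": 12000.0, "cost_old": 10000.0, "qaly_new": 6.0, "qaly_old": 5.5}
ranges = {"cost_new": (11000.0, 14000.0), "qaly_new": (5.75, 6.5)}

results = one_way_sensitivity(icer, base_case, ranges)
# results["cost_new"] is (2000.0, 8000.0); results["qaly_new"] is (8000.0, 2000.0)

# Robustness judged against an assumed, pre-specified threshold.
threshold = 20000.0
robust = all(value <= threshold for pair in results.values() for value in pair)
```

Reporting the ICER at both ends of each range, as here, lets a reader see which parameters drive the result and whether the conclusion changes within the stated ranges.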

Uncertainty around analytical methods is also assessed through the use
of sensitivity analysis. For example, the impact of different discount
rates and the use of discounting (or not) on health benefits should have
been assessed in studies with a long time horizon.

Box 5.3: Checklist for assessing economic evaluations

Study design

Was the research question stated?

Was the economic importance of the research question stated?

Was/were the viewpoint(s) of the analysis clearly stated and justified?

Was a rationale reported for the choice of the alternative programmes
or interventions compared?

Were the alternatives being compared clearly described?

Was the form of economic evaluation stated?

Was the choice of form of economic evaluation justified in relation to
the questions addressed?


Data collection

Was/were the source(s) of effectiveness estimates used stated?

Were details of the design and results of the effectiveness study given
(if based on a single study)?

Were details of the methods of synthesis or meta-analysis of estimates
given (if based on an overview of a number of effectiveness studies)?

Were the primary outcome measure(s) for the economic evaluation clearly
stated?
Were the methods used to value health states and other benefits stated?

Were the details of the subjects from whom valuations were obtained given?

Were productivity changes (if included) reported separately?

Was the relevance of productivity changes to the study question discussed?

Were quantities of resources reported separately from their unit cost?

Were the methods for the estimation of quantities and unit costs described?

Were currency and price data recorded?

Were details of price adjustments for inflation or currency conversion
given?
Were details of any model used given?

Was there a justification for the choice of model used and the key parameters
on which it was based?

Analysis and interpretation of results

Was the time horizon of costs and benefits stated?

Was the discount rate stated?

Was the choice of rate justified?

Was an explanation given if costs or benefits were not discounted?

Were the details of statistical test(s) and confidence intervals given
for stochastic data?

Was the approach to sensitivity analysis described?

Was the choice of variables for sensitivity analysis justified?

Were the ranges over which the parameters were varied stated?

Were relevant alternatives compared? (i.e. Were appropriate comparisons
made when conducting the incremental analysis?)

Was an incremental analysis reported?

Were major outcomes presented in a disaggregated as well as aggregated
form?
Was the answer to the study question given?

Did conclusions follow from the data reported?

Were conclusions accompanied by the appropriate caveats?

Were generalisability issues addressed?

Based on Drummond’s checklist

Probabilistic sensitivity analysis (PSA) can only be used to deal with parameter
uncertainty in modelling-based economic evaluations. PSA addresses what is
also referred to as second-order uncertainty: the uncertainty surrounding
the value of a parameter. This is achieved by assigning a probability
distribution rather than a point estimate to each parameter. The quality
assessment in this case should focus on whether:

  • Appropriate
    distributions were assigned to the model parameters

  • Relevant
    assumptions were tested, for example assumptions about model structure
    or interpretation of the available evidence
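
To make the mechanics of PSA concrete, the following Python sketch replaces each point estimate in a deliberately simple model with a probability distribution and records how often the new strategy has a positive incremental net benefit. The model structure, the choice of beta distributions for probabilities and gamma distributions for costs, and every numerical value are illustrative assumptions, not taken from any study.

```python
import random


def run_psa(threshold, n_sims=5000, seed=1):
    """Return the proportion of simulations in which the new strategy
    has a positive incremental net monetary benefit at the threshold."""
    rng = random.Random(seed)
    favourable = 0
    for _ in range(n_sims):
        # Probabilities of treatment success: beta distributions (bounded 0 to 1).
        p_new = rng.betavariate(60, 40)
        p_old = rng.betavariate(50, 50)
        # Costs per patient: gamma distributions (non-negative, right-skewed).
        # gammavariate(shape, scale) has mean shape * scale.
        cost_new = rng.gammavariate(100, 20.0)  # mean 2000
        cost_old = rng.gammavariate(100, 10.0)  # mean 1000
        # Assumed fixed QALY gain per treatment success.
        qaly_per_success = 0.5

        d_effect = (p_new - p_old) * qaly_per_success
        d_cost = cost_new - cost_old
        if threshold * d_effect - d_cost > 0:
            favourable += 1
    return favourable / n_sims


# Probability of being cost-effective at an assumed threshold of 20000 per QALY.
prob_cost_effective = run_psa(threshold=20000.0)
```

Repeating the calculation over a range of thresholds gives the points of a cost-effectiveness acceptability curve. Structural assumptions, such as the fixed QALY gain used here, are not varied by the simulation and would need to be tested separately.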

Generalisability refers to the extent to which the results obtained
can be applied to different settings. The relevance of the intervention,
the patient population and the resources which have been included in the
economic evaluation will determine whether the results can be generalised.
Uncertainty regarding the generalisability of the results to the relevant
study setting would usually be assessed through sensitivity analyses.
A useful discussion on this issue is available.

Several reliable, comprehensive, and easy to use checklists are available
to guide the quality assessment of economic evaluations. The most widely
used is the BMJ checklist. Both a 10-item version and an expanded
35-item version are available. In addition, a 36th
item relating to generalisability may be added if it is relevant to the
review (see Box 5.3). Although this checklist does not provide
detailed coverage of some issues relevant to modelling studies, it can
be augmented using specific items such as model type, structural assumptions,
time horizon, cycle length and health states. Alternatively, a checklist
developed to assess the quality of the models used in economic evaluations
can be used as a complement to the BMJ checklist.

In some cases the validity of an economic evaluation may be difficult
to assess due to limitations in reporting, an issue common to many studies
and covered in Chapter 1.

Several quality scoring systems have been devised for use in assessing
the methodological quality of economic evaluations. These are generally
based on completing checklists, assigning values to the different items
considered, and summing these values to obtain a final score, which is
intended to reflect the quality level of the appraised study.

Six published quality scoring systems for economic evaluations have
been identified, but none of these are considered to be sufficiently valid
and reliable for use as a method of quality assessment. Given the limitations presented by
quality scoring systems, their use is not recommended. Rather, it is preferable
to present a checklist or a descriptive critical assessment based on appropriate
guidelines or checklists, which should describe the methods and results,
strengths and weaknesses and the implications of the strengths and weaknesses
on the reliability of the conclusions.