Should We Be Measuring Effect Size in Applied Behavior Analysis?
by Sigurdur Oli Sigurdsson & John Austin, Ph.D.
Western Michigan University
One of the most persistent debates in applied behavior
analysis is the extent to which descriptive and
inferential statistics can be useful analytic
tools. Effect–size (ES) measurements, however,
have received little attention from behavior analysts.
Effect–size (Cohen's d) for between–groups
research designs is the difference between the means
of two groups (experimental and control), expressed
in standard deviation units (Rosenthal, Rosnow & Rubin,
2000). (Although there are measures of ES other
than Cohen's d, ES and Cohen's d will
be treated as synonymous in this article.)
Effect–size estimations (and statistical power analyses) are enjoying
increased attention in statistical circles. In fact, there is a discernible
shift away from traditional hypothesis testing among mainstream psychologists,
because hypothesis tests can neither demonstrate ES nor provide information
on the probability that an effect will replicate. Effect–size demonstrations for
applied data can therefore provide a common ground for behavior analysis practitioners
and mainstream practitioners, and can lead to a more widespread appeal of our
work.
Whereas statistical hypothesis testing is often misunderstood and incorrectly
applied, the d statistic is a simple measure of difference in standard units,
and it is easy to calculate. In its simplest form, the effect–size
statistic (d) is calculated as follows:
d = (Experimental mean - Control mean) / (Standard deviation of control)
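As a rough illustration of the formula above, the calculation can be sketched in a few lines of Python. The function name effect_size and the data are hypothetical and invented purely for illustration. The sketch also assumes the sample standard deviation (n - 1 denominator), which the formula does not specify.

```python
from statistics import mean, stdev


def effect_size(experimental, control):
    """Return d = (experimental mean - control mean) / SD of control."""
    # Assumption: sample SD (n - 1 denominator); the formula in the text
    # does not say whether a sample or population SD is intended.
    sd_control = stdev(control)
    if sd_control == 0:
        raise ValueError("control SD is zero, so d is undefined")
    return (mean(experimental) - mean(control)) / sd_control


# Hypothetical scores, made up for illustration only.
control = [10, 12, 11, 13, 14]
experimental = [15, 17, 16, 18, 19]

print(round(effect_size(experimental, control), 2))
```

Standardizing by the control group's standard deviation alone, as this formula does, is often labeled Glass's delta elsewhere. Other variants of d divide by a pooled standard deviation instead.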
One might ask: How can it be beneficial for applied
behavior analysts to report ES? The goals of this
article are: (a) to describe three reasons why calculating
ES would be beneficial to applied behavior analysis,
and (b) to describe six technical issues in need
of further discussion before behavior analysts
move ahead with ES calculations.
Three reasons why calculating and reporting
ES could be beneficial to the behavior analytic
community:
1. Calculating ES for behavior analytic interventions provides us with a quantifiable
measure of the effects of interventions on dependent variables. Such a measure
can be used by practitioners to predict the effectiveness of an intervention.
For example, if a review of the literature on feedback indicated that supervisory
feedback given to sales persons consistently yielded an ES over 0.8, a smaller
effect reported in a new study could suggest a breakdown in the implementation
of feedback procedures (i.e., a low degree of independent variable integrity).
Reports of ES can also be used to compare the effectiveness of various interventions,
for example, daily versus weekly feedback, or feedback versus feedback and
reinforcement.
During a recent review of feedback in organizational behavior management interventions,
Alvero, Bucklin, & Austin (2001) had difficulty characterizing the
efficacy of the many varieties of feedback encountered in the empirical literature
(A.M. Alvero, personal communication, October 31, 2001). Alvero et al.
used the same criteria as were used by the authors of the original review
of the organizational behavior management feedback literature (Balcazar,
Hopkins, & Suarez, 1985). Balcazar et al. developed a system in which
each study reviewed was categorized as having consistent, mixed and/or no
effects across the applications of feedback in the study. Variations of feedback
applications (e.g., feedback with goals; feedback with reinforcement; feedback
alone) were then identified and the consistency of effects was summarized
by feedback variety. Clearly, this is a cursory approach to identifying the
most effective varieties of feedback. However, because the authors did not
have the data they needed to conduct more quantitative analyses (e.g., reports
of ES), this was a reasonable approach. Reports of ES for these studies would
have made the authors’ task more feasible, and perhaps would even have
made the results more interesting and useful.
2. Calculating ES for our data would aid in evaluations of the monetary value
of applied behavior analysis interventions. How much a certain amount of
behavior change (d) will benefit an organization can be calculated and used
to predict future gains if the change is maintained. For example, if an ES
of 1.0 is recorded during an intervention over a certain period, it can be
demonstrated that the intervention is cost–effective, in terms of money
saved (e.g., because of a reduction in injuries) or earned (e.g., because
of increased productivity), as compared to baseline. If the behavior change
is maintained, the benefits of the intervention can easily be calculated
for a certain period of time. Although calculating a percent change between
experimental phases may yield similar information, the ES measure also takes
variation in behavior into account. Variation is an important aspect of behavior
that is not captured by simply calculating a percent change.
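The contrast between percent change and ES can be sketched with a small example. All data below are hypothetical and invented for illustration. The sketch also assumes that the baseline phase plays the role of the control condition in the d formula and that the sample standard deviation is used, neither of which the text prescribes.

```python
from statistics import mean, stdev


def percent_change(baseline, intervention):
    """Percent change in the mean from baseline to intervention."""
    return 100 * (mean(intervention) - mean(baseline)) / mean(baseline)


def effect_size(baseline, intervention):
    """d with the baseline phase treated as the control (an assumption)."""
    return (mean(intervention) - mean(baseline)) / stdev(baseline)


# Two hypothetical baselines with the same mean (20) but different variability.
stable_baseline = [19, 20, 21, 20, 20]
variable_baseline = [10, 30, 15, 25, 20]
intervention = [24, 25, 26, 25, 25]  # mean 25

for label, baseline in [("stable", stable_baseline),
                        ("variable", variable_baseline)]:
    print(label,
          round(percent_change(baseline, intervention), 1),
          round(effect_size(baseline, intervention), 2))
```

Both baselines yield the same 25 percent change, yet d is far larger for the stable baseline than for the variable one. In this sketch, that gap is the information about variation that percent change leaves out.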
3. Calculating ES for our data would make applied behavioral data more accessible
and convincing to practitioners from other fields (e.g., traditional psychology
and business). No matter how impressive the data, the statement, "behavior
analytic interventions have consistently been reported to demonstrate improvements
in variable X" would not necessarily impress a mainstream psychologist.
Instead, consider the statement, "behavior analytic interventions have
consistently demonstrated ES ranging from 0.5 to 2.5 (mean = 1.6, standard
deviation = 0.5) for improvements in variable X". The second statement
includes a quantification of results, and offers an opportunity for comparison
with other types of interventions.
However, before ES calculations are conducted indiscriminately for behavioral
data, some issues regarding the elements of the calculations themselves have
to be addressed: