Organizational Behavior Management Network

Should We Be Measuring Effect Size in Applied Behavior Analysis?

by Sigurdur Oli Sigurdsson & John Austin, Ph.D.
Western Michigan University

One of the most persistent debates in applied behavior analysis is the extent to which descriptive and inferential statistics can be useful analytic tools. Effect-size (ES) measurements, however, have received little attention from behavior analysts. For between-groups research designs, effect size (Cohen's d) is the difference between the means of two groups (experimental and control), expressed in standard deviation units (Rosenthal, Rosnow & Rubin, 2000). (Although there are measures of ES other than Cohen's d, ES and Cohen's d will be treated as synonymous in this article.)


Effect-size estimations (and statistical power analyses) are enjoying increased attention in statistical circles. In fact, there is a discernible shift away from traditional hypothesis testing among mainstream psychologists, because significance tests neither demonstrate ES nor indicate the probability that an effect will replicate. Reporting ES for applied data can therefore provide common ground between behavior analysts and mainstream practitioners, and can broaden the appeal of our work.


Whereas statistical hypothesis testing is often misunderstood and incorrectly applied, the d statistic is a simple measure of difference in standard units, and it is easy to calculate. In its simplest form, the effect-size statistic (d) is calculated as follows:


d = (Experimental mean - Control mean) / (Standard deviation of control)
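As a rough illustration of the formula above, here is a minimal Python sketch. The function name `effect_size` and the data are hypothetical and not taken from any study. The sketch also assumes the sample standard deviation (n - 1 denominator), which the formula itself does not specify. Note that standardizing by the control group's standard deviation alone is the variant often called Glass's delta; Cohen's d is more commonly computed with a pooled standard deviation.

```python
from statistics import mean, stdev


def effect_size(experimental, control):
    """Return (experimental mean - control mean) / control standard deviation."""
    # Assumption: sample standard deviation (n - 1 denominator).
    sd_control = stdev(control)
    if sd_control == 0:
        raise ValueError("control standard deviation is zero; d is undefined")
    return (mean(experimental) - mean(control)) / sd_control


# Hypothetical scores, for illustration only.
control = [10, 12, 14, 16, 18]
experimental = [15, 17, 19, 21, 23]

print(round(effect_size(experimental, control), 2))
```

With these made-up numbers the means differ by 5 and the control standard deviation is about 3.16, so d comes out near 1.58.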

One might ask: How can it be beneficial for applied behavior analysts to report ES? The goals of this article are (a) to describe three reasons why calculating ES would be beneficial to applied behavior analysis, and (b) to describe six technical issues in need of further discussion before behavior analysts move ahead with ES calculations.


Three reasons why calculating and reporting ES could be beneficial to the behavior analytic community:
1. Calculating ES for behavior analytic interventions provides us with a quantifiable measure of the effects of interventions on dependent variables. Such a measure can be used by practitioners to predict the effectiveness of an intervention. For example, if a review of the literature on feedback indicated that supervisory feedback given to salespeople consistently yielded an ES over 0.8, a smaller effect reported in a new study could suggest a breakdown in the implementation of feedback procedures (i.e., a low degree of independent variable integrity). Reports of ES can also be used to compare the effectiveness of various interventions, such as daily versus weekly feedback, or feedback alone versus feedback plus reinforcement.


During a recent review of feedback in organizational behavior management interventions, Alvero, Bucklin, & Austin (2001) had difficulty characterizing the efficacy of the many varieties of feedback encountered in the empirical literature (A. M. Alvero, personal communication, October 31, 2001). Alvero et al. used the same criteria as the authors of the original review of the organizational behavior management feedback literature (Balcazar, Hopkins, & Suarez, 1985). Balcazar et al. developed a system in which each study reviewed was categorized as having consistent, mixed, and/or no effects across the applications of feedback in the study. Variations of feedback applications (e.g., feedback with goals; feedback with reinforcement; feedback alone) were then identified, and the consistency of effects was summarized by feedback variety. Clearly, this is a cursory approach to identifying the most effective varieties of feedback. However, because the authors did not have the data they needed to conduct more quantitative analyses (e.g., reports of ES), it was a reasonable one. Reports of ES for these studies would have made the authors' task more feasible, and perhaps would even have made the results more interesting and useful.


2. Calculating ES for our data would aid in evaluations of the monetary value of applied behavior analysis interventions. How much a certain amount of behavior change (d) will benefit an organization can be calculated and used to predict future gains if the change is maintained. For example, if an ES of 1.0 is recorded during an intervention over a certain period, the corresponding change can be expressed as money saved (e.g., because of a reduction in injuries) or earned (e.g., because of increased productivity) relative to baseline, and used to show whether the intervention is cost-effective. If the behavior change is maintained, the benefits of the intervention can easily be calculated for a given period of time. Although calculating a percent change between experimental phases may yield similar information, the ES measure also takes variation in behavior into account. Variation is an important aspect of behavior that is not captured by simply calculating a percent change.


3. Calculating ES for our data would make applied behavioral data more accessible and convincing to practitioners from other fields (e.g., traditional psychology and business). No matter how impressive the data, the statement, "behavior analytic interventions have consistently been reported to demonstrate improvements in variable X," would not necessarily impress a mainstream psychologist. Instead, consider the statement, "behavior analytic interventions have consistently demonstrated ES ranging from 0.5 to 2.5 (mean = 1.6, standard deviation = 0.5) for improvements in variable X." The second statement quantifies the results and offers an opportunity for comparison with other types of interventions.
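The point made under the second reason, that ES reflects variation while percent change does not, can be sketched with a small example. All data below are hypothetical. The sketch also assumes that the baseline phase plays the role of the control condition and that the sample standard deviation is used; neither choice is prescribed by the formula given earlier.

```python
from statistics import mean, stdev


def percent_change(baseline, intervention):
    """Percent change in the mean from baseline to intervention."""
    return 100 * (mean(intervention) - mean(baseline)) / mean(baseline)


def effect_size(baseline, intervention):
    """Mean difference in units of the baseline standard deviation."""
    # Assumption: baseline phase stands in for the control condition.
    return (mean(intervention) - mean(baseline)) / stdev(baseline)


# Two hypothetical baselines with the same mean (50) but different variability.
stable_baseline = [48, 50, 52, 50, 50]
variable_baseline = [30, 70, 50, 40, 60]
intervention = [58, 60, 62, 60, 60]  # mean of 60 in both comparisons

for label, baseline in [("stable", stable_baseline), ("variable", variable_baseline)]:
    pc = percent_change(baseline, intervention)
    d = effect_size(baseline, intervention)
    print(f"{label}: percent change = {pc:.1f}, d = {d:.2f}")
```

Both comparisons show the same 20 percent improvement, yet d is roughly 7.07 against the stable baseline and roughly 0.63 against the variable one. The same percent change can therefore correspond to very different effect sizes, depending on how variable the behavior was to begin with.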

However, before ES calculations are conducted indiscriminately for behavioral data, some issues regarding the elements of the calculations themselves have to be addressed:
