
Program Dynamic
A useful statistical tool is
the smoothed scatterplot. Smoothed scatterplots can reveal trends in
the data that are not obvious from a traditional scatterplot.
There are many types of smoothed scatterplots.
One type plots the mean Y score (i.e., the mean outcome value) at each value
of the predictor variable, X, using a scatterplot format. The data are
"smoothed" in the sense that there is only a single Y at each value of X,
namely the mean Y score.
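The smoothing step described above can be sketched in a few lines of Python. This is only an illustration of the idea, not ZumaStat's own code: the function name smoothed_points and the ratings below are made up for the example.

```python
from collections import defaultdict
from statistics import mean


def smoothed_points(x_values, y_values):
    """Return (x, mean of Y at that x) pairs, sorted by x.

    Hypothetical helper for illustration; not part of ZumaStat or SPSS.
    """
    groups = defaultdict(list)
    for x, y in zip(x_values, y_values):
        groups[x].append(y)  # collect every Y score observed at this X
    return [(x, mean(ys)) for x, ys in sorted(groups.items())]


# Made-up ratings on a 1 to 5 scale, for illustration only.
x = [1, 1, 2, 2, 2, 3, 3, 4, 5, 5]
y = [1, 3, 2, 3, 4, 3, 5, 4, 4, 5]

points = smoothed_points(x, y)
for x_value, mean_y in points:
    print(x_value, mean_y)
```

Plotting these pairs gives one point per value of X, which is what makes the plot "smoothed."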
ZumaStat plots such a smoothed scatterplot in
Excel based on variables in your active SPSS data set.
In the example below, we first show the
ZumaStat interface, then we show how the scatterplot looks in a standard SPSS
scatterplot. We then show the smoothed scatterplot that ZumaStat generates
in Excel.
In this program, the left side of the dialog box lists the variables that
are in the SPSS active data file (see the example below). Note the 'Sort'
and 'Labels' buttons below the listbox. The 'Labels' button is a toggle
switch that shows or hides the labels associated with the variables.
In this example, the labels are not shown. The 'Sort' button is a
toggle switch that allows the user to sort the variable list by file order
or alphabetically by variable name.
The actions produced by these buttons are instantaneous.
After choosing the predictor and criterion
variables from the active SPSS data file, the user presses 'Create.'
ZumaStat will automatically
start Excel and create a plot of the smoothed data.
How it Appears on Your Screen

The Output
The two variables are each scored on a 1 to 5 rating scale.
Their correlation is 0.66. Here is how the traditional scatterplot in SPSS appears:

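Not very informative, is it? The problem is that at
least one score occurs at each of the possible combinations of X and Y, so
the plot is a full grid of points, and overlapping cases hide how many
observations fall at each point. Now here is the smoothed scatterplot: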

The trend in the aggregate data is much clearer. At the
top of the Excel worksheet, ZumaStat provides the mean Y values that are
being plotted, the number of cases on which each mean is based and the
standard deviation associated with each mean. The sample sizes tell us
how much to trust a given mean value in terms of its stability. The
standard deviation tells us how much variability there is in scores about
the mean. Other things being equal, the larger these standard deviations
are, the weaker the correlation between X and Y. These values also give
insight into non-homogeneous residual variance: under the usual regression
assumptions, the standard deviations should be about the same at every
value of X. In this example, they tend to decrease as X increases,
suggesting a possible model violation.
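As a rough sketch of how such a summary table could be computed outside of ZumaStat, the Python code below reports the number of cases, the mean and the standard deviation of Y at each value of X. The function names and the data are hypothetical. The code uses the sample standard deviation (n - 1 in the denominator), which is an assumption; this page does not say which formula ZumaStat uses. The ratio of the largest to the smallest standard deviation is offered only as an informal check on homogeneity, not as a formal test.

```python
from collections import defaultdict
from statistics import mean, stdev


def summarize_by_x(x_values, y_values):
    """Return one row (x, n, mean, sd) per distinct value of X.

    Hypothetical helper. The sd is the sample standard deviation and is
    None when a value of X has only one case, since it is then undefined.
    """
    groups = defaultdict(list)
    for x, y in zip(x_values, y_values):
        groups[x].append(y)
    rows = []
    for x, ys in sorted(groups.items()):
        sd = stdev(ys) if len(ys) > 1 else None
        rows.append({"x": x, "n": len(ys), "mean": mean(ys), "sd": sd})
    return rows


def sd_ratio(rows):
    """Largest sd divided by smallest sd, or None if it cannot be computed."""
    sds = [row["sd"] for row in rows if row["sd"] is not None]
    if len(sds) < 2 or min(sds) == 0:
        return None
    return max(sds) / min(sds)


# Made-up data, for illustration only.
x = [1, 1, 1, 2, 2, 2, 4]
y = [1, 3, 5, 2, 3, 4, 4]

rows = summarize_by_x(x, y)
for row in rows:
    print(row)
print("largest sd / smallest sd:", sd_ratio(rows))
```

A ratio near 1 is consistent with homogeneous variability about the means. Means based on very few cases, like the single case at X = 4 here, deserve little weight.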
The plot can be selected, copied, and pasted into Word,
PowerPoint, or other software programs.