Smoothed Scatterplot
Examples of ZumaStat Programs

Expands the Capabilities of SPSS and Excel 

Uses Summary Statistics as Input



Program Dynamic

A useful statistical tool is the smoothed scatterplot.  Smoothed scatterplots can reveal trends in the data that are not obvious from a traditional scatterplot.

There are many types of smoothed scatterplots.  One type plots the mean Y score (i.e., the mean outcome value) at each value of the predictor variable, X, using a scatterplot format.  The data are "smoothed" in the sense that there is only a single Y at each value of X, namely the mean Y score. 
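The computation behind this type of smoothing can be sketched in a few lines of Python.  This is only an illustration of the idea, not ZumaStat's own code; the function name mean_y_by_x and the sample ratings are hypothetical.

```python
from collections import defaultdict


def mean_y_by_x(xs, ys):
    """Return (x, mean of y) pairs, one per distinct x value, sorted by x."""
    if len(xs) != len(ys):
        raise ValueError("xs and ys must have the same length")
    groups = defaultdict(list)
    for x, y in zip(xs, ys):
        groups[x].append(y)  # collect every Y score observed at this X
    return [(x, sum(vals) / len(vals)) for x, vals in sorted(groups.items())]


# Hypothetical ratings on a 1-to-5 scale (illustrative data only).
x = [1, 1, 2, 2, 2, 3, 4, 4, 5, 5]
y = [1, 2, 2, 3, 1, 3, 4, 5, 4, 5]

for x_value, mean_y in mean_y_by_x(x, y):
    print(x_value, round(mean_y, 2))
```

Plotting the resulting pairs gives one point per value of X, which is what makes the trend easier to see.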

ZumaStat plots such a smoothed scatterplot in Excel based on variables in your active SPSS data set.

In the example below, we first show the ZumaStat interface, then how the data look in a standard SPSS scatterplot, and finally the smoothed scatterplot that ZumaStat generates in Excel.

In this program, the left side of the dialog box lists the variables in the active SPSS data file (see the example below).  Note the 'Sort' and 'Labels' buttons below the listbox.  The 'Labels' button is a toggle that shows or hides the labels associated with the variables.  In this example, the labels are not shown.  The 'Sort' button is a toggle that orders the variable list either by file order or alphabetically by variable name.  Both buttons take effect immediately.

After choosing the predictor and criterion variables from the active SPSS data file, the user presses 'Create.'  ZumaStat will automatically start Excel and create a plot of the smoothed data. 

How it Appears on Your Screen

 

 

The Output

The two variables are each scored on a 1 to 5 rating scale.  Their correlation is 0.66.  Here is how the traditional scatterplot in SPSS appears:

 

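Not very informative, is it?  The problem is that both variables take only a few distinct values, and at least one score occurs at each of the possible combinations of X and Y.  The plot therefore shows a complete grid of points and gives no indication of how many cases fall on each combination.  Now here is the smoothed scatterplot: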

 

 

The trend in the aggregate data is much clearer.  At the top of the Excel worksheet, ZumaStat provides the mean Y values that are being plotted, the number of cases on which each mean is based, and the standard deviation associated with each mean.  The sample sizes indicate how stable, and therefore how trustworthy, each mean is.  The standard deviations indicate how much the scores vary about each mean.  Other things being equal, the larger these standard deviations are, the lower the correlation.  The standard deviations also give insight into non-homogeneous residuals: if the residual variance is homogeneous, all the standard deviations should be about the same.  In this example, they tend to decrease as X increases, suggesting a possible model violation.
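The summary values described above (mean, number of cases, and standard deviation at each value of X) can be reproduced with a short Python sketch.  It assumes the sample standard deviation (n - 1 in the denominator); the page does not say which formula ZumaStat uses.  The function name summarize_by_x and the data are hypothetical.

```python
from collections import defaultdict
from statistics import mean, stdev


def summarize_by_x(xs, ys):
    """Return one row per distinct x with the n, mean, and SD of y."""
    groups = defaultdict(list)
    for x, y in zip(xs, ys):
        groups[x].append(y)
    rows = []
    for x, vals in sorted(groups.items()):
        # Assumption: sample SD (n - 1); it is undefined for a single case.
        sd = stdev(vals) if len(vals) > 1 else None
        rows.append({"x": x, "n": len(vals), "mean": mean(vals), "sd": sd})
    return rows


# Hypothetical data in which the spread of Y shrinks as X increases.
x = [1, 1, 1, 1, 2, 2, 2, 3, 3, 4]
y = [1, 3, 5, 3, 2, 4, 3, 3, 4, 5]

rows = summarize_by_x(x, y)
for row in rows:
    print(row)
```

A mean based on very few cases (here, the single case at X = 4) deserves little weight, and standard deviations that shrink or grow steadily across X are a sign that the residual variance may not be homogeneous.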

The plot can be selected, copied, and pasted into Word, PowerPoint, or other software programs.