PSPP: MEANS

15.9 MEANS

MEANS [TABLES =] 
      {var_list} 
        [ BY {var_list} [BY {var_list} [BY {var_list} … ]]]

      [ /{var_list} 
         [ BY {var_list} [BY {var_list} [BY {var_list} … ]]] ]

      [/CELLS = [MEAN] [COUNT] [STDDEV] [SEMEAN] [SUM] [MIN] [MAX] [RANGE]
        [VARIANCE] [KURT] [SEKURT] 
        [SKEW] [SESKEW] [FIRST] [LAST] 
        [HARMONIC] [GEOMETRIC] 
        [DEFAULT]
        [ALL]
        [NONE] ]

      [/MISSING = [TABLE] [INCLUDE] [DEPENDENT]]

You can use the MEANS command to calculate the arithmetic mean and similar statistics, either for the dataset as a whole or for categories of data.

The simplest form of the command is

MEANS v.

which calculates the mean, count and standard deviation for v. If you specify a grouping variable, for example

MEANS v BY g.

then the mean, count and standard deviation of v are calculated separately for each category of g. Instead of these default statistics, you can specify the statistics in which you are interested:

MEANS x y BY g
      /CELLS = HARMONIC SUM MIN.

This example calculates the harmonic mean, the sum and the minimum values of x and y grouped by g.
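
To try this example without an existing data file, a small dataset can be entered inline first. The following is only a sketch: the variable names x, y and g match the example above, but the data values are invented for illustration.

* Illustrative data only, the values are made up.
DATA LIST LIST /x y g.
BEGIN DATA.
1 10 1
2 20 1
4 40 2
8 80 2
END DATA.

MEANS x y BY g
      /CELLS = HARMONIC SUM MIN.

With this data g has two categories, so each requested statistic should be reported for x and for y within g = 1 and within g = 2.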

The CELLS subcommand specifies which statistics to calculate. The available statistics are:

MEAN The arithmetic mean.
COUNT The count of the values.
STDDEV The standard deviation.
SEMEAN The standard error of the mean.
SUM The sum of the values.
MIN The minimum value.
MAX The maximum value.
RANGE The difference between the maximum and minimum values.
VARIANCE The variance.
FIRST The first value in the category.
LAST The last value in the category.
SKEW The skewness.
SESKEW The standard error of the skewness.
KURT The kurtosis.
SEKURT The standard error of the kurtosis.
HARMONIC The harmonic mean.
GEOMETRIC The geometric mean.

In addition, three special keywords are recognized:

DEFAULT This is the same as MEAN COUNT STDDEV.
ALL All of the above statistics will be calculated.
NONE No statistics will be calculated (only a summary will be shown).

More than one table can be specified in a single command. Tables are separated from one another by a ‘/’. For example,

MEANS TABLES =
      c d e BY x
      /a b BY x y
      /f BY y BY z.

has three tables (the ‘TABLES =’ is optional). The first table has three dependent variables c, d and e and a single categorical variable x. The second table has two dependent variables a and b, and two categorical variables x and y. The third table has a single dependent variable f and a categorical variable formed by the combination of y and z.
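
The difference between listing several variables after one BY and chaining several BY keywords can be seen by running both forms on the same data. This is a sketch with made-up values; only the variable names f, y and z come from the example above.

* Illustrative data only, the values are made up.
DATA LIST LIST /f y z.
BEGIN DATA.
10 1 1
20 1 2
30 2 1
40 2 2
END DATA.

* Two categorical variables, y and z.
MEANS f BY y z.

* One categorical variable formed by the combination of y and z.
MEANS f BY y BY z.

Following the description above, the first command treats y and z as two categorical variables, while the second breaks f down by each combination of y and z.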

By default, a value is omitted from the analysis only if a missing value (either system-missing or user-missing) is encountered in one of the variables directly involved in its calculation. This behaviour can be modified with the /MISSING subcommand, which accepts three options: TABLE, INCLUDE and DEPENDENT.

/MISSING = TABLE causes cases to be dropped if any variable is missing in the table specification currently being processed, regardless of whether it is needed to calculate the statistic.

/MISSING = INCLUDE causes user-missing values, whether in the dependent variables or in the categorical variables, to be taken at face value rather than excluded.

/MISSING = DEPENDENT causes user-missing values in the dependent variables to be taken at face value. However, cases that have user-missing values for the categorical variables are omitted from the calculation.
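
The effect of these options can be explored with a small dataset in which some values are declared user-missing. This is a sketch under stated assumptions: the variables v and g, the data values, and the choice of 99 and 9 as user-missing codes are all invented for illustration.

* Illustrative data only, 99 and 9 are arbitrary missing-value codes.
DATA LIST LIST /v g.
BEGIN DATA.
3 1
5 1
99 1
4 9
6 2
END DATA.

MISSING VALUES v (99).
MISSING VALUES g (9).

* Default handling of missing values.
MEANS v BY g.

* User-missing values in both v and g are taken at face value.
MEANS v BY g
      /MISSING = INCLUDE.

* User-missing values in v are taken at face value, cases with user-missing g are omitted.
MEANS v BY g
      /MISSING = DEPENDENT.

Comparing the counts reported by the three commands should show which cases each setting keeps: for instance, the case with v = 99 is expected to contribute to the statistics only under INCLUDE and DEPENDENT, and the case with g = 9 only under INCLUDE.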