Previous: Testing data consistency, Up: Data Screening and Transformation [Contents][Index]
Many statistical tests rely upon certain properties of the data.
One common property, upon which many linear tests depend, is
normality: the data must have been drawn from a normal distribution.
It is therefore necessary to check for normality before deciding upon the
test procedure to use. One way to do this uses the EXAMINE
command.
In Example 5.5, a researcher was examining the failure rates of equipment produced by an engineering company. The file repairs.sav contains the mean time between failures (mtbf) of the items of equipment included in the study. Before performing linear analysis on the data, the researcher wanted to check whether the data is normally distributed.
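A normal distribution has a skewness of zero and a kurtosis of zero
(the kurtosis reported here is measured relative to the normal distribution).
Looking at the skewness of mtbf in Example 5.5, the value of 1.85 is more
than three times its standard error of .58, so the mtbf figures have
strong positive skew and are unlikely to have been
drawn from a normally distributed variable.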
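The skew can also be checked visually. As a minimal sketch, assuming the PLOT subcommand of EXAMINE and its NPPLOT keyword are available in the installed version of PSPP and that the output driver in use can render charts, the following requests a normal probability plot for mtbf. Points that bend away from the straight reference line suggest a departure from normality.

```pspp
PSPP> examine mtbf /plot=npplot.
```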
Positive skew can often be compensated for by applying a logarithmic
transformation.
This is done with the COMPUTE
command in the line
compute mtbf_ln = ln (mtbf).
Rather than redefining the existing variable, this use of COMPUTE
defines a new variable mtbf_ln which is
the natural logarithm of mtbf.
The final command in this example calls EXAMINE
on this new variable,
and it can be seen from the results that both the skewness (-.16) and
kurtosis (-.09) for mtbf_ln are well within one standard error of zero.
This provides some confidence that the mtbf_ln variable is
normally distributed and thus safe for linear analysis.
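The logarithm is not the only transformation that can be tried. As an illustration only (mtbf_sqrt and mtbf_inv are hypothetical variable names and are not part of Example 5.5), a square root, which is a milder correction for positive skew than the logarithm, or a reciprocal, which is a stronger one, could be computed and examined in the same way. Both are usable here only because every mtbf value is positive (the minimum is 1.63).

```pspp
PSPP> compute mtbf_sqrt = sqrt (mtbf).
PSPP> compute mtbf_inv = 1 / mtbf.
PSPP> examine mtbf_sqrt mtbf_inv /statistics=descriptives.
```

Whichever transformation brings the skewness and kurtosis closest to zero would be the candidate to carry forward. Note that the reciprocal reverses the ordering of the values, which matters when interpreting later results.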
In the event that no suitable transformation can be found,
then it would be worth considering
an appropriate non-parametric test instead of a linear one.
See NPAR TESTS for information about non-parametric tests.
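NPAR TESTS can also be used to test normality itself rather than relying on skewness and kurtosis alone. The following is a minimal sketch, assuming the KOLMOGOROV-SMIRNOV subcommand is supported by the installed version of PSPP. No mean or standard deviation is given for the reference normal distribution, so they are expected to be estimated from the data.

```pspp
PSPP> npar tests /kolmogorov-smirnov (normal) = mtbf_ln.
```

A large significance value would be consistent with normality, although with a small sample such a test has limited power to detect departures from it.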
PSPP> get file='/usr/local/share/pspp/examples/repairs.sav'.
PSPP> examine mtbf /statistics=descriptives.
PSPP> compute mtbf_ln = ln (mtbf).
PSPP> examine mtbf_ln /statistics=descriptives.

Output:

1.2 EXAMINE. Descriptives
#====================================================#=========#==========#
#                                                    #Statistic|Std. Error#
#====================================================#=========#==========#
#mtbf    Mean                                        #   8.32  |   1.62   #
#        95% Confidence Interval for Mean Lower Bound#   4.85  |          #
#                                         Upper Bound#  11.79  |          #
#        5% Trimmed Mean                             #   7.69  |          #
#        Median                                      #   8.12  |          #
#        Variance                                    #  39.21  |          #
#        Std. Deviation                              #   6.26  |          #
#        Minimum                                     #   1.63  |          #
#        Maximum                                     #  26.47  |          #
#        Range                                       #  24.84  |          #
#        Interquartile Range                         #   5.83  |          #
#        Skewness                                    #   1.85  |    .58   #
#        Kurtosis                                    #   4.49  |   1.12   #
#====================================================#=========#==========#

2.2 EXAMINE. Descriptives
#====================================================#=========#==========#
#                                                    #Statistic|Std. Error#
#====================================================#=========#==========#
#mtbf_ln Mean                                        #   1.88  |    .19   #
#        95% Confidence Interval for Mean Lower Bound#   1.47  |          #
#                                         Upper Bound#   2.29  |          #
#        5% Trimmed Mean                             #   1.88  |          #
#        Median                                      #   2.09  |          #
#        Variance                                    #    .54  |          #
#        Std. Deviation                              #    .74  |          #
#        Minimum                                     #    .49  |          #
#        Maximum                                     #   3.28  |          #
#        Range                                       #   2.79  |          #
#        Interquartile Range                         #    .92  |          #
#        Skewness                                    #   -.16  |    .58   #
#        Kurtosis                                    #   -.09  |   1.12   #
#====================================================#=========#==========#