# Economic data analysis

CRIME CLEARANCE RATE (a) General Linear Model

The General Linear Model is of the form

$$Y_t = \beta_0 + \beta_1 X_{1t} + \beta_2 X_{2t} + \varepsilon_t$$

where $Y_t$ is the dependent variable, $\beta_0$ is the y-intercept, $\beta_1$ and $\beta_2$ are the partial regression coefficients for the independent variables $X_{1t}$ and $X_{2t}$, and $\varepsilon_t$ is the error term for observation (time) $t$. This equation can be extended to several independent variables. Non-linear equations, such as the production function applied to the crime clear-up rate above, can be reduced to a linear form by a logarithmic (common or natural) transformation. For this exercise, the general linear model after data transformation by natural logarithm is

$$\ln CR = 3.32 - 0.26\,t + 0.37 \ln F + 0.45 \ln X_1 + 0.84 \ln X_2$$

The variable $t$ has a negative coefficient, indicating a decreasing trend in the clearance rate over time.
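The log transformation described above can be illustrated with a minimal sketch. It assumes a hypothetical multiplicative model $Y = A X_1^{b_1} X_2^{b_2} e^{u}$ with made-up parameter values and synthetic data (not the crime data of this exercise), takes natural logarithms of both sides, and fits the resulting linear model by ordinary least squares. The helper names `solve_linear_system` and `ols` are illustrative only.

```python
import math
import random


def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Augmented matrix, so row operations update both sides together.
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    # Back substitution.
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def ols(rows, y):
    """Ordinary least squares via the normal equations (X'X) b = X'y."""
    design = [[1.0] + list(r) for r in rows]  # prepend the intercept column
    k = len(design[0])
    xtx = [[sum(d[i] * d[j] for d in design) for j in range(k)] for i in range(k)]
    xty = [sum(d[i] * yi for d, yi in zip(design, y)) for i in range(k)]
    return solve_linear_system(xtx, xty)


# Hypothetical parameters and synthetic data, for illustration only.
random.seed(0)
A, b1, b2 = 2.0, 0.5, 0.8
x1 = [random.uniform(1, 10) for _ in range(200)]
x2 = [random.uniform(1, 10) for _ in range(200)]
y = [A * u**b1 * v**b2 * math.exp(random.gauss(0, 0.05)) for u, v in zip(x1, x2)]

# ln Y = ln A + b1 ln X1 + b2 ln X2 + u is linear in the logged variables.
log_rows = [(math.log(u), math.log(v)) for u, v in zip(x1, x2)]
log_y = [math.log(v) for v in y]
coef = ols(log_rows, log_y)  # [estimate of ln A, estimate of b1, estimate of b2]
```

The fitted intercept estimates $\ln A$ rather than $A$, so it has to be exponentiated to recover the original scale factor.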

Statistical software packages such as SAS, SPSS, and Minitab have procedures (e.g. PROC GLM in SAS) for fitting the General Linear Model for analysis of variance (ANOVA) of experimental designs and for regression analysis. These procedures allow the user to specify the dependent and independent variables.

Test for the assumption of no serial correlation: the Durbin-Watson test, applicable only to first-order autoregressive systems (Boyd)

One of the assumptions of the general linear model is that the error terms $\varepsilon_t$ are independently distributed with mean zero and variance $\sigma^2$. For a time series, violation of this assumption leads to autocorrelation: the error term at time $i$ has an effect on the error term at another time $j$ (Boyd).

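The test

Null hypothesis: $H_0: \rho = 0$

Alternative hypothesis: $H_a: \rho \neq 0$

where $\rho$ is the first-order autocorrelation coefficient of the error terms. The Durbin-Watson test statistic is (Powell)

$$DW = \frac{\sum_{t=2}^{N} (e_t - e_{t-1})^2}{\sum_{t=1}^{N} e_t^2}$$

where $e_t$ is the regression residual at time $t$ and $N$ is the number of observations.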
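As a small sketch of how the statistic is computed, the function below takes a list of regression residuals and returns the sum of squared successive differences divided by the sum of squared residuals, which is the standard definition of the Durbin-Watson statistic. The function name and the toy residual series are illustrative, not taken from the exercise.

```python
def durbin_watson(residuals):
    """Durbin-Watson statistic: sum((e_t - e_{t-1})^2) / sum(e_t^2)."""
    if len(residuals) < 2:
        raise ValueError("need at least two residuals")
    denominator = sum(e * e for e in residuals)
    if denominator == 0:
        raise ValueError("residuals are all zero")
    numerator = sum(
        (residuals[t] - residuals[t - 1]) ** 2 for t in range(1, len(residuals))
    )
    return numerator / denominator


# Toy series: identical residuals (strong positive dependence) give 0,
# while residuals that flip sign every period push the statistic above 2.
dw_constant = durbin_watson([1.0, 1.0, 1.0, 1.0])
dw_alternating = durbin_watson([1.0, -1.0, 1.0, -1.0])
```

The statistic always lies between 0 and 4. Values near 2 are consistent with no first-order serial correlation, values well below 2 point to positive correlation, and values well above 2 point to negative correlation.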

Tables of critical values are available for the DW test. There are three decision regions: accept the null hypothesis, reject the null hypothesis, and an indecision region in which the test is inconclusive (Boyd).
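The three decision regions can be sketched as a small function. It assumes the usual bounds form of the test, in which the table supplies a lower bound and an upper bound for the given sample size and number of regressors, and it handles negative correlation by folding the statistic to `min(DW, 4 - DW)`. The bounds used in the example are hypothetical placeholders; real values must be read from a Durbin-Watson table.

```python
def dw_decision(dw, d_lower, d_upper):
    """Classify a Durbin-Watson statistic into one of three decision regions.

    d_lower and d_upper are the tabulated bounds for the sample size and
    number of regressors; they are inputs here, not computed.
    """
    if not 0.0 <= dw <= 4.0:
        raise ValueError("DW must lie between 0 and 4")
    if not 0.0 < d_lower < d_upper:
        raise ValueError("bounds must satisfy 0 < d_lower < d_upper")
    # Fold values above 2 so negative correlation is tested symmetrically.
    folded = min(dw, 4.0 - dw)
    if folded < d_lower:
        return "reject H0"
    if folded > d_upper:
        return "accept H0"
    return "inconclusive"


# Hypothetical bounds, for illustration only (not taken from a table).
D_LOWER, D_UPPER = 1.2, 1.6
decisions = {dw: dw_decision(dw, D_LOWER, D_UPPER) for dw in (0.9, 1.4, 2.12, 3.2)}
```

With these placeholder bounds, a statistic close to 2 falls in the acceptance region, while statistics near either end of the 0 to 4 range are rejected.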

An approximate large-sample test would reject $H_0$ if $DW < 2\left(1 - z(\alpha)/\sqrt{N}\right)$, where $z(\alpha)$ is the upper standard normal critical value for a size $\alpha$ test (Powell).
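The large-sample rule can be sketched numerically. The sketch assumes the rule has the form $DW < 2(1 - z(\alpha)/\sqrt{N})$, which follows from the approximation $DW \approx 2(1 - \hat{\rho})$ together with $\sqrt{N}\hat{\rho}$ being approximately standard normal under the null hypothesis. The sample size of 28 is a hypothetical value chosen for illustration.

```python
import math
from statistics import NormalDist


def dw_large_sample_cutoff(n, alpha=0.05):
    """Approximate one-sided cutoff 2 * (1 - z(alpha) / sqrt(n)).

    A statistic below the cutoff suggests positive first-order
    serial correlation at (approximately) level alpha.
    """
    if n < 2:
        raise ValueError("n must be at least 2")
    if not 0.0 < alpha < 1.0:
        raise ValueError("alpha must be strictly between 0 and 1")
    z = NormalDist().inv_cdf(1.0 - alpha)  # upper-alpha standard normal quantile
    return 2.0 * (1.0 - z / math.sqrt(n))


cutoff = dw_large_sample_cutoff(28, alpha=0.05)  # hypothetical sample size
rejects = 2.12 < cutoff  # a DW of 2.12 is compared against the cutoff
```

Because the cutoff relies on a normal approximation, it is only a rough guide for small samples, where the tabulated critical values are the safer reference.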

The equations derived above have DW values close to 2 (no serial correlation). For the car-theft data, DW = 2.12 for 1974-2001, DW = 2.01 for 1974-1987, and DW = 1.92 for 1988-2001.

Test for normality

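The Shapiro-Wilk (1965) test is used to test whether data are normally distributed. The Wilks Shapiro test statistic (Heckert, 2003) is defined as

$$W = \frac{\left(\sum_{i=1}^{n} w_i X'_i\right)^2}{\sum_{i=1}^{n} (X_i - \bar{X})^2}$$

where $X'_i$ are the ordered sample values, $\bar{X}$ is the sample mean of the data, and $w' = (w_1, w_2, \ldots, w_n)$, or

$$w' = M' V^{-1} \left[ (M' V^{-1})(V^{-1} M) \right]^{-1/2}$$

where $M$ denotes the expected values of standard normal order statistics for a sample of size $n$ and $V$ is the corresponding covariance matrix.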

W may be thought of as the squared correlation coefficient between the ordered sample values ($X'$) and the $w_i$. The $w_i$ are approximately proportional to the normal scores $M_i$. W is a measure of the straightness of the normal probability plot, and small values indicate departures from normality.
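The squared-correlation interpretation can be illustrated with a simplified sketch. It is not the exact Shapiro-Wilk W, which requires the covariance matrix of the normal order statistics. Instead it correlates the ordered sample with approximate normal scores from Blom's formula, which is closer in spirit to a probability plot correlation coefficient. The function names and sample data are illustrative.

```python
import math
from statistics import NormalDist, mean


def normal_scores(n):
    """Approximate expected standard normal order statistics (Blom's formula)."""
    nd = NormalDist()
    return [nd.inv_cdf((i - 0.375) / (n + 0.25)) for i in range(1, n + 1)]


def squared_correlation_with_normal_scores(data):
    """Squared correlation between the ordered sample and the normal scores."""
    x = sorted(data)
    if len(x) < 3:
        raise ValueError("need at least three observations")
    m = normal_scores(len(x))
    x_bar, m_bar = mean(x), mean(m)
    sxy = sum((a - x_bar) * (b - m_bar) for a, b in zip(x, m))
    sxx = sum((a - x_bar) ** 2 for a in x)
    smm = sum((b - m_bar) ** 2 for b in m)
    if sxx == 0:
        raise ValueError("data have zero variance")
    return sxy * sxy / (sxx * smm)


# A sample that is an exact linear function of the normal scores plots as a
# straight line, so the statistic is 1 up to rounding error.
straight = squared_correlation_with_normal_scores(
    [5.0 + 2.0 * s for s in normal_scores(20)]
)
# A strongly right-skewed sample bends the plot and lowers the statistic.
skewed = squared_correlation_with_normal_scores(
    [math.exp(2.0 * s) for s in normal_scores(20)]
)
```

For an actual test of normality, the critical values depend on the sample size, so a statistical package's implementation should be preferred over this approximation.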

Many statistical software packages include procedures for testing normality.

The Dataplot statistical software has a PPCC PLOT command based on a concept similar to the one described above. The syntax of its Wilks Shapiro normality test command is

WILKS SHAPIRO NORMALITY TEST

where the command is followed by the response variable being tested and then by a qualification, which is optional.

Dataplot uses Algorithm AS R94 (SWILK subroutine…

**Author:** Essay Raptor