PROS Andria PD, Suhartono Two level parameter fulltext


SWUP

Two-level parameter estimation of the GSTARX-GLS model

Andria Prima Ditago and Suhartono

Statistics Department, Sepuluh Nopember Institute of Technology, Surabaya 60119, Indonesia

Abstract

The Generalized Space-Time Autoregressive (GSTAR) model is a special form of the VAR model and is commonly used for modeling and forecasting space-time data. GSTAR parameters are usually estimated by OLS, which has the weakness of producing inefficient estimators. GLS is therefore a more appropriate method. This study builds a two-level GSTARX model by adding a calendar variation model as the predictor. At the first level, the effect of the predictor is estimated with a linear regression model; at the second level, a GSTAR model is fitted to the errors of the first-level model. The calendar variation considered is the Ramadhan effect. A simulation study shows that the GSTAR-GLS model produces more efficient estimators than GSTAR-OLS, as seen from its smaller standard errors.

Keywords calendar variations, efficient, GLS, GSTARX, Ramadhan, two-level

1. Introduction

One approach for handling space-time data is the Generalized Space-Time Autoregressive (GSTAR) model. It is more flexible than the STAR model, of which it is an extension (Borovkova et al., 2008). Unlike STAR, GSTAR does not require the parameters to be the same for all locations. GSTAR is therefore more realistic, because in practice the parameters usually differ between locations. The asymptotic properties of the GSTAR parameter estimators and the weighting between locations have been studied theoretically by Lopuhaa and Borovkova (2005).

GSTAR has been applied to oil production and to Gross Domestic Product (GDP) data for 16 Western European countries (Nurani, 2002). In addition, a comparison of VARMA and GSTAR models (Suhartono, 2005) showed that the GSTAR model gives more accurate forecasts. However, in terms of the model building process, both theoretically and as implemented in statistical software packages, the VARMA model was found to be more flexible and complete.

Studies of GSTAR parameter estimation are still limited to Ordinary Least Squares (OLS) (Borovkova et al., 2008) and Maximum Likelihood (Terzi, 1995). In a multivariate model with correlated residuals, OLS has the weakness of producing inefficient estimators. The Generalized Least Squares (GLS) method was developed to address this. GLS is commonly applied to the Seemingly Unrelated Regression (SUR) model, because it is one of the approaches for estimating SUR parameters (Baltagi, 1980). SUR is a system of several regression equations in which each equation has its own response and possibly its own predictors. Its advantage is that it accommodates correlation between the errors of different equations. SUR models were first applied to gross investment demand in two companies (Zellner, 1962), where the GLS estimates for the overall system were more efficient than the OLS estimates for each equation. SUR models have also been applied in spatio-temporal settings (Wang & Kockelman, 2007). Applying SUR to estimate the GSTAR parameters has been reported to ensure that the model errors are multivariate white noise (Wutsqa & Suhartono, 2010).

The GSTAR model can be extended to GSTARX, where X denotes a predictor or input. The predictor can be on a metric or a non-metric scale. A metric predictor is handled with a transfer function model, whereas a non-metric predictor is handled with dummy variables. A non-metric predictor may represent the effect of an intervention, outliers, or calendar variations.

This study uses a calendar variation model as the predictor to capture the Ramadhan effect. The GSTARX model is examined through a simulation study, with the aim of obtaining a model building procedure that suits the conditions of real data.

2. Materials and methods

GSTAR parameters can be estimated with two methods, OLS and GLS. For example, the GSTAR($1_1$) model for N locations can be written, for location i, as

$$Z_i = X_i \beta_i + e_i,$$

where $\beta_i = (\phi_{i0}, \phi_{i1})'$, so that the stacked parameter vector is $\beta = (\phi_{10}, \phi_{11}, \ldots, \phi_{N0}, \phi_{N1})'$. In matrix form, this can be written

$$
\begin{pmatrix} z_1(1) \\ z_1(2) \\ \vdots \\ z_1(T) \\ \vdots \\ z_N(1) \\ z_N(2) \\ \vdots \\ z_N(T) \end{pmatrix}
=
\begin{pmatrix}
z_1(0) & V_1(0) & \cdots & 0 & 0 \\
z_1(1) & V_1(1) & \cdots & 0 & 0 \\
\vdots & \vdots & & \vdots & \vdots \\
z_1(T-1) & V_1(T-1) & \cdots & 0 & 0 \\
\vdots & \vdots & \ddots & \vdots & \vdots \\
0 & 0 & \cdots & z_N(0) & V_N(0) \\
0 & 0 & \cdots & z_N(1) & V_N(1) \\
\vdots & \vdots & & \vdots & \vdots \\
0 & 0 & \cdots & z_N(T-1) & V_N(T-1)
\end{pmatrix}
\begin{pmatrix} \phi_{10} \\ \phi_{11} \\ \vdots \\ \phi_{N0} \\ \phi_{N1} \end{pmatrix}
+
\begin{pmatrix} e_1(1) \\ e_1(2) \\ \vdots \\ e_1(T) \\ \vdots \\ e_N(1) \\ e_N(2) \\ \vdots \\ e_N(T) \end{pmatrix},
$$

where $V_i(t) = \sum_{j \neq i} w_{ij} z_j(t)$. The OLS estimator of $\beta$ is obtained by minimizing $e'e$ with $e = Z - X\beta$, which gives

$$\hat{\beta} = (X'X)^{-1} X'Z.$$

The GLS estimator is obtained by minimizing the generalized sum of squares $\varepsilon'\Omega^{-1}\varepsilon$, where $\varepsilon = Y - X\beta$, that is,

$$\varepsilon'\Omega^{-1}\varepsilon = (Y - X\beta)'\Omega^{-1}(Y - X\beta).$$

Differentiating with respect to the parameters gives the estimator

$$\hat{\beta} = (X'\Omega^{-1}X)^{-1} X'\Omega^{-1}Y, \qquad (1)$$

where $\Omega = \Sigma \otimes I$ with

$$
\Sigma = \begin{pmatrix}
\sigma_{11} & \sigma_{12} & \cdots & \sigma_{1N} \\
\sigma_{21} & \sigma_{22} & \cdots & \sigma_{2N} \\
\vdots & \vdots & \ddots & \vdots \\
\sigma_{N1} & \sigma_{N2} & \cdots & \sigma_{NN}
\end{pmatrix}
\quad \text{and} \quad
I = \begin{pmatrix}
1 & 0 & \cdots & 0 \\
0 & 1 & \cdots & 0 \\
\vdots & \vdots & \ddots & \vdots \\
0 & 0 & \cdots & 1
\end{pmatrix},
$$

in which $\Sigma$ is the error variance-covariance matrix of size (N x N) and I is the identity matrix of size (T x T).
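The two estimators above can be illustrated with a minimal NumPy sketch. It assumes the observations are stacked location by location, so that the error covariance is $\Sigma \otimes I$, and it estimates $\Sigma$ from the OLS residuals (a feasible-GLS step that is an assumption of this sketch, not a procedure stated in the text). The function names `gstar_design`, `ols`, and `gls`, and the random toy data, are hypothetical.

```python
import numpy as np

def gstar_design(z, W):
    """Stack a GSTAR(1_1) model as y = X beta + e.

    z : (T + 1, N) array, rows are times 0..T, columns are locations.
    W : (N, N) spatial weight matrix with zero diagonal.
    """
    T, N = z.shape[0] - 1, z.shape[1]
    lagged = z[:-1]                  # z_i(t-1) for t = 1..T
    V = lagged @ W.T                 # V_i(t-1) = sum_j w_ij z_j(t-1)
    y = z[1:].T.reshape(-1)          # z_1(1..T), then z_2(1..T), ...
    X = np.zeros((N * T, 2 * N))     # block-diagonal design matrix
    for i in range(N):
        rows = slice(i * T, (i + 1) * T)
        X[rows, 2 * i] = lagged[:, i]
        X[rows, 2 * i + 1] = V[:, i]
    return y, X

def ols(y, X):
    """beta_hat = (X'X)^{-1} X'y."""
    return np.linalg.solve(X.T @ X, X.T @ y)

def gls(y, X, Sigma, T):
    """beta_hat = (X' Omega^{-1} X)^{-1} X' Omega^{-1} y, Omega = Sigma kron I_T."""
    Omega_inv = np.kron(np.linalg.inv(Sigma), np.eye(T))
    return np.linalg.solve(X.T @ Omega_inv @ X, X.T @ Omega_inv @ y)

# Hypothetical toy data, used only to exercise the functions.
rng = np.random.default_rng(0)
N, T = 3, 50
W = (np.ones((N, N)) - np.eye(N)) / (N - 1)
z = rng.normal(size=(T + 1, N))
y, X = gstar_design(z, W)

beta_ols = ols(y, X)
resid = (y - X @ beta_ols).reshape(N, T)
Sigma_hat = resid @ resid.T / T      # assumption: Sigma estimated from OLS residuals
beta_gls = gls(y, X, Sigma_hat, T)
beta_gls_identity = gls(y, X, np.eye(N), T)
```

With $\Sigma = I$ the GLS formula reduces to the OLS formula, so `beta_gls_identity` coincides with `beta_ols`.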

In this study, the parameters of the GSTARX model are estimated in two levels. The first-level model is a regression model with calendar variation:

$$Y^{*}_{i,t} = \beta_0 + f(D_{g,t}, D_{g,t-1}) + u_{i,t}, \qquad (2)$$

where $f(D_{g,t}, D_{g,t-1}) = \sum_{g} \alpha_g D_{g,t} + \sum_{g} \gamma_g D_{g,t-1}$ is the total calendar variation effect, $D_{g,t}$ and $D_{g,t-1}$ are dummy variables for the month of Eid and the month before Eid, respectively, g is the number of days prior to the date of Eid, $u_{i,t}$ is the error component, and i indexes the location. The second-level model is a GSTAR model, written here for three locations:

$$
\begin{pmatrix} u_1(t) \\ u_2(t) \\ u_3(t) \end{pmatrix}
=
\begin{pmatrix} \phi_{10} & 0 & 0 \\ 0 & \phi_{20} & 0 \\ 0 & 0 & \phi_{30} \end{pmatrix}
\begin{pmatrix} u_1(t-1) \\ u_2(t-1) \\ u_3(t-1) \end{pmatrix}
+
\begin{pmatrix} \phi_{11} & 0 & 0 \\ 0 & \phi_{21} & 0 \\ 0 & 0 & \phi_{31} \end{pmatrix}
\begin{pmatrix} 0 & w_{12} & w_{13} \\ w_{21} & 0 & w_{23} \\ w_{31} & w_{32} & 0 \end{pmatrix}
\begin{pmatrix} u_1(t-1) \\ u_2(t-1) \\ u_3(t-1) \end{pmatrix}
+
\begin{pmatrix} e_1(t) \\ e_2(t) \\ e_3(t) \end{pmatrix}.
\qquad (3)
$$

From Eq. (3), the second-level model can be written in the form

$$
\begin{pmatrix} u_1(t) \\ u_2(t) \\ u_3(t) \end{pmatrix}
=
\begin{pmatrix}
\phi^{*}_{11} & \phi^{*}_{12} & \phi^{*}_{13} \\
\phi^{*}_{21} & \phi^{*}_{22} & \phi^{*}_{23} \\
\phi^{*}_{31} & \phi^{*}_{32} & \phi^{*}_{33}
\end{pmatrix}
\begin{pmatrix} u_1(t-1) \\ u_2(t-1) \\ u_3(t-1) \end{pmatrix}
+
\begin{pmatrix} e_1(t) \\ e_2(t) \\ e_3(t) \end{pmatrix},
$$

where $\phi^{*}_{ii} = \phi_{i0}$ for i = 1, 2, 3 and $\phi^{*}_{ij} = w_{ij}\phi_{i1}$ for i, j = 1, 2, 3 with $i \neq j$.
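The mapping from the GSTAR parameters to the coefficient matrix $\Phi^{*}$ can be checked numerically. The sketch below assumes the uniform weights of Eq. (7) and uses the first-case OLS estimates reported in Table 3 as inputs; the variable names are illustrative only.

```python
import numpy as np

# First-case OLS estimates from Table 3 (phi_i0 and phi_i1 for i = 1, 2, 3).
phi0 = np.array([0.283, 0.175, 0.329])
phi1 = np.array([0.247, 0.282, 0.246])

# Uniform spatial weights as in Eq. (7).
W = np.array([[0.0, 0.5, 0.5],
              [0.5, 0.0, 0.5],
              [0.5, 0.5, 0.0]])

# phi*_ii = phi_i0 on the diagonal, phi*_ij = w_ij * phi_i1 off the diagonal.
Phi_star = np.diag(phi0) + np.diag(phi1) @ W
print(Phi_star)
```

Up to rounding, the off-diagonal entries agree with the first-case GSTAR-OLS coefficient matrix reported in the results section.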

The simulation study of the two-level GSTARX model proceeds as follows.

Step 1: Determine the effects of calendar variation during the specified period.

Table 1. Eid celebration for the period 1990 to 2010.

| Year | Date | Year | Date | Year | Date |
|------|------|------|------|------|------|
| 1990 | 27-28 April | 1998 | 30-31 January | 2005 | 3-4 November |
| 1991 | 16-17 April | 1999 | 19-20 January | 2006 | 23-24 October |
| 1992 | 4-5 April | 2000 | 8-9 January | 2007 | 12-13 October |
| 1993 | 25-26 March | 2000 | 28-29 December | 2008 | 1-2 October |
| 1994 | 14-15 March | 2001 | 17-18 December | 2009 | 21-22 September |
| 1995 | 3-4 March | 2002 | 6-7 December | 2010 | 10-11 September |
| 1996 | 21-22 February | 2003 | 25-26 November | | |
| 1997 | 9-10 February | 2004 | 14-15 November | | |

Step 2: Determine the coefficient matrix of the vector AR(1) model:

$$
\Phi_1 = \begin{pmatrix}
0.30 & 0.10 & 0.10 \\
0.15 & 0.20 & 0.15 \\
0.20 & 0.20 & 0.25
\end{pmatrix}.
$$

Step 3: Generate residuals at the three locations from a multivariate normal distribution with mean zero and variance-covariance matrix $\Sigma$:

- for the first case, the residuals are not correlated between the three locations:

$$
\Sigma = \begin{pmatrix}
1.00 & 0.00 & 0.00 \\
0.00 & 1.00 & 0.00 \\
0.00 & 0.00 & 1.00
\end{pmatrix},
$$

- for the second case, the residuals are correlated between the three locations:

$$
\Sigma = \begin{pmatrix}
1.00 & 0.20 & 0.40 \\
0.20 & 1.00 & 0.30 \\
0.40 & 0.30 & 1.00
\end{pmatrix}.
$$

Step 5: Determine the dummy variables for the periods of calendar variation (see Table 1).

Step 6: Estimate the parameters of the first-level model, Eq. (2), using the OLS method.

Step 7: Determine the spatial weights (W) to be used.

Step 8: Estimate the parameters of the second-level model, Eq. (3), using the OLS and GLS methods.

Step 9: Calculate the efficiency (%) of the GLS method as

$$\frac{SE(\hat{\beta}_{OLS}) - SE(\hat{\beta}_{GLS})}{SE(\hat{\beta}_{OLS})} \times 100.$$

Step 10: Obtain the out-of-sample forecasts by adding the forecasts of the first-level and second-level models.
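Steps 2 and 3 can be sketched as a short simulation. The code below generates a vector AR(1) series from the coefficient matrix of Step 2 and the second-case covariance matrix of Step 3. The series length of 252 months (January 1990 to December 2010), the burn-in period, and the seed are assumptions of this sketch; the function name `simulate_var1` is hypothetical.

```python
import numpy as np

# Coefficient matrix of the vector AR(1) model (Step 2).
Phi1 = np.array([[0.30, 0.10, 0.10],
                 [0.15, 0.20, 0.15],
                 [0.20, 0.20, 0.25]])

# Error covariance for the second case (Step 3): correlated residuals.
Sigma_case2 = np.array([[1.0, 0.2, 0.4],
                        [0.2, 1.0, 0.3],
                        [0.4, 0.3, 1.0]])

def simulate_var1(Phi, Sigma, T, burn_in=100, seed=0):
    """Generate u(t) = Phi u(t-1) + e(t) with e(t) ~ N(0, Sigma)."""
    rng = np.random.default_rng(seed)
    N = Phi.shape[0]
    e = rng.multivariate_normal(np.zeros(N), Sigma, size=T + burn_in)
    u = np.zeros((T + burn_in, N))
    for t in range(1, T + burn_in):
        u[t] = Phi @ u[t - 1] + e[t]
    return u[burn_in:]               # drop the burn-in so the start value has no influence

# Assumed length: 21 years of monthly data.
u = simulate_var1(Phi1, Sigma_case2, T=252)

# The process is stationary when all eigenvalues of Phi1 lie inside the unit circle.
spectral_radius = np.max(np.abs(np.linalg.eigvals(Phi1)))
```

Replacing `Sigma_case2` with the identity matrix gives the first case, in which the residuals are uncorrelated between locations.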

3. Results and discussion

The first step is to identify the effects of calendar variation from the time series plots for the specified period according to Table 1. The time series plots of the simulated data with calendar variation effects are shown in Figure 1.

[Figure 1. Time series plots of the simulated data at location 1, location 2, and location 3 (panels labeled Lokasi 1, Lokasi 2, Lokasi 3; horizontal axis: month and year, January 1990 to 2010). Each Eid month from April 1990 to September 2010 is marked and labeled on the plots.]

Figure 1 shows the time series plots of the vector AR(1) model data containing the effects of calendar variation. Dotted vertical lines are included in the plots to emphasize the months of Eid during this period. The first stage in GSTARX modeling is to estimate the parameters of the first-level model, with the following results.

The estimated first-level models for locations 1, 2, and 3, Eqs. (4) to (6) respectively, have the form

$$Y^{*}_{i,t} = \sum_{g} \hat{\alpha}_{i,g} D_{g,t} + \sum_{g} \hat{\gamma}_{i,g} D_{g,t-1} + u_{i,t}, \qquad i = 1, 2, 3, \qquad (4)\text{-}(6)$$

with g = 0, 2, 3, 5, 7, 8, 9, 11, 13, 15, 16, 18, 20, 22, 24, 26, 27, 29 and the following coefficient estimates, where $\hat{\alpha}_{i,g}$ is the coefficient of $D_{g,t}$ and $\hat{\gamma}_{i,g}$ is the coefficient of $D_{g,t-1}$.

| g | Location 1, $D_{g,t}$ | Location 1, $D_{g,t-1}$ | Location 2, $D_{g,t}$ | Location 2, $D_{g,t-1}$ | Location 3, $D_{g,t}$ | Location 3, $D_{g,t-1}$ |
|----|---------|---------|---------|--------|--------|--------|
| 0 | 20.393 | 180.075 | 15.722 | 98.745 | 10.054 | 70.318 |
| 2 | 33.371 | 170.700 | 22.357 | 94.296 | 14.601 | 65.182 |
| 3 | 37.016 | 165.080 | 27.291 | 91.449 | 17.600 | 64.286 |
| 5 | 50.248 | 155.698 | 36.187 | 84.693 | 19.967 | 59.777 |
| 7 | 64.142 | 145.139 | 44.394 | 80.371 | 24.099 | 56.552 |
| 8 | 69.747 | 138.766 | 46.717 | 76.810 | 26.345 | 53.652 |
| 9 | 76.113 | 135.634 | 52.549 | 73.089 | 27.756 | 52.560 |
| 11 | 86.302 | 126.863 | 59.875 | 67.158 | 31.430 | 48.713 |
| 13 | 99.889 | 116.702 | 67.474 | 61.992 | 35.226 | 42.706 |
| 15 | 110.003 | 105.843 | 75.243 | 54.402 | 41.077 | 39.547 |
| 16 | 116.849 | 98.684 | 78.555 | 52.652 | 42.505 | 37.231 |
| 18 | 128.300 | 90.571 | 87.987 | 47.232 | 47.522 | 35.304 |
| 20 | 141.775 | 79.903 | 95.874 | 41.442 | 51.610 | 31.128 |
| 22 | 152.563 | 71.224 | 104.161 | 33.403 | 55.500 | 27.282 |
| 24 | 163.960 | 60.283 | 109.898 | 27.763 | 58.961 | 22.820 |
| 26 | 175.097 | 49.823 | 117.330 | 21.733 | 62.822 | 15.888 |
| 27 | 181.178 | 45.548 | 122.653 | 18.428 | 64.409 | 16.558 |
| 29 | 193.453 | 33.440 | 132.196 | 13.458 | 67.915 | 9.942 |
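The dummy variables entering the first-level models can be constructed from the Eid dates in Table 1. The sketch below assumes that g equals the day of the month of the first Eid day minus one, which reproduces the 18 index values 0, 2, ..., 29 appearing in Eqs. (4) to (6), and that the series is monthly starting in January 1990. The helper names `month_index` and `calendar_dummies` are hypothetical.

```python
import numpy as np
from datetime import date

# First day of each Eid celebration, taken from Table 1.
eid_dates = [
    date(1990, 4, 27), date(1991, 4, 16), date(1992, 4, 4), date(1993, 3, 25),
    date(1994, 3, 14), date(1995, 3, 3), date(1996, 2, 21), date(1997, 2, 9),
    date(1998, 1, 30), date(1999, 1, 19), date(2000, 1, 8), date(2000, 12, 28),
    date(2001, 12, 17), date(2002, 12, 6), date(2003, 11, 25), date(2004, 11, 14),
    date(2005, 11, 3), date(2006, 10, 23), date(2007, 10, 12), date(2008, 10, 1),
    date(2009, 9, 21), date(2010, 9, 10),
]

def month_index(d, start_year):
    """Zero-based index of the month of d in a monthly series starting in January."""
    return (d.year - start_year) * 12 + (d.month - 1)

def calendar_dummies(eid_dates, start_year, n_months):
    """Return the g values and the dummy matrices for D_{g,t} and D_{g,t-1}."""
    g_values = sorted({d.day - 1 for d in eid_dates})   # assumption: g = day - 1
    col = {g: k for k, g in enumerate(g_values)}
    D_eid = np.zeros((n_months, len(g_values)))         # 1 in the month of Eid
    D_prev = np.zeros((n_months, len(g_values)))        # 1 in the month before Eid
    for d in eid_dates:
        t, g = month_index(d, start_year), d.day - 1
        if 0 <= t < n_months:
            D_eid[t, col[g]] = 1.0
        if 0 <= t - 1 < n_months:
            D_prev[t - 1, col[g]] = 1.0
    return g_values, D_eid, D_prev

g_values, D_eid, D_prev = calendar_dummies(eid_dates, start_year=1990, n_months=252)
```

Stacking `D_eid` and `D_prev` side by side gives a design matrix with 36 columns, matching the 36 coefficients reported for each location.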

The first-level estimates in Eqs. (4) to (6) yield the model residuals $u_{i,t}$. These residuals are then used for estimation at the second level, i.e., the GSTAR model with parameters between locations (spatial parameters).

A characteristic of the GSTAR model is the weighting between locations. The spatial weighting method used in this study is limited to normalized partial cross-correlation inference (NIPKS). This method is based on how high or low the partial cross-correlations between locations are. The statistical inference uses a 95% confidence interval.

Table 2. Normalized partial cross-correlation inference estimates for the simulated data, case one.

| Parameter | Estimate | 95% CI lower | 95% CI upper | Conclusion |
|-----------|----------|--------------|--------------|------------|
| P12(1) | 0.186 | 0.073 | 0.300 | Valid and concurrent |
| P13(1) | 0.185 | 0.072 | 0.298 | Valid and concurrent |
| P21(1) | 0.212 | 0.099 | 0.326 | Valid and concurrent |
| P23(1) | 0.197 | 0.084 | 0.310 | Valid and concurrent |
| P31(1) | 0.140 | 0.027 | 0.253 | Valid and concurrent |
| P32(1) | 0.267 | 0.154 | 0.380 | Valid and concurrent |

Based on the cross-correlations between locations at time lag 1, the statistical inference in Table 2 shows that the confidence intervals indicate relationships of the same size. The decisions obtained are therefore valid and comparable, which indicates no difference in weighting between locations. Thus, the appropriate weighting in this case is uniform:

$$
W = \begin{pmatrix}
0 & 0.5 & 0.5 \\
0.5 & 0 & 0.5 \\
0.5 & 0.5 & 0
\end{pmatrix}. \qquad (7)
$$
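One possible reading of this decision rule is sketched below: the cross-correlations are treated as "valid" when every 95% interval in Table 2 excludes zero, and as comparable when all intervals share a common overlap, in which case uniform weights $w_{ij} = 1/(N-1)$ are used. This interpretation and the names `all_valid`, `all_overlap`, and `uniform_weights` are assumptions of the sketch, not a definition of the NIPKS method.

```python
import numpy as np

# 95% confidence intervals (lower, upper) from Table 2.
intervals = {
    "P12(1)": (0.073, 0.300),
    "P13(1)": (0.072, 0.298),
    "P21(1)": (0.099, 0.326),
    "P23(1)": (0.084, 0.310),
    "P31(1)": (0.027, 0.253),
    "P32(1)": (0.154, 0.380),
}

# Assumed rule 1: every interval lies above zero.
all_valid = all(lower > 0 for lower, _ in intervals.values())

# Assumed rule 2: the intervals have a common intersection.
largest_lower = max(lower for lower, _ in intervals.values())
smallest_upper = min(upper for _, upper in intervals.values())
all_overlap = largest_lower <= smallest_upper

def uniform_weights(n_locations):
    """Uniform spatial weights: zero diagonal, each row sums to one."""
    W = np.full((n_locations, n_locations), 1.0 / (n_locations - 1))
    np.fill_diagonal(W, 0.0)
    return W

W = uniform_weights(3) if (all_valid and all_overlap) else None
```

For three locations this gives off-diagonal weights of 0.5, as in Eq. (7).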

For the second case, the weights are the same as in Eq. (7). Using these location weights, the parameter estimates of the GSTAR($1_1$) model are shown in Table 3.

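Table 3. Comparison of parameter estimates from the OLS and GLS methods for the first and second cases.

| Case | Parameter | OLS estimate | OLS SE | GLS estimate | GLS SE | GLS efficiency (%) |
|------|-----------|--------------|--------|--------------|--------|--------------------|
| 1 | $\phi_{10}$ | 0.283 | 0.060 | 0.299 | 0.060 | 0.000 |
| 1 | $\phi_{11}$ | 0.247 | 0.082 | 0.230 | 0.082 | 0.000 |
| 1 | $\phi_{20}$ | 0.175 | 0.061 | 0.169 | 0.061 | 0.000 |
| 1 | $\phi_{21}$ | 0.282 | 0.074 | 0.288 | 0.074 | 0.000 |
| 1 | $\phi_{30}$ | 0.329 | 0.061 | 0.335 | 0.061 | 0.000 |
| 1 | $\phi_{31}$ | 0.246 | 0.082 | 0.240 | 0.082 | 0.000 |
| 2 | $\phi_{10}$ | 0.271 | 0.067 | 0.254 | 0.062 | 6.625 |
| 2 | $\phi_{11}$ | 0.114 | 0.082 | 0.132 | 0.078 | 4.531 |
| 2 | $\phi_{20}$ | 0.161 | 0.064 | 0.176 | 0.062 | 3.416 |
| 2 | $\phi_{21}$ | 0.293 | 0.077 | 0.279 | 0.076 | 2.169 |
| 2 | $\phi_{30}$ | 0.143 | 0.070 | 0.133 | 0.064 | 8.185 |
| 2 | $\phi_{31}$ | 0.270 | 0.084 | 0.280 | 0.079 | 5.873 |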
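As a worked check of the efficiency formula in Step 9, the sketch below recomputes the GLS efficiency from the standard errors printed in Table 3. Because the printed standard errors are rounded to three decimals, the recomputed values only approximate the reported ones, which were presumably obtained from unrounded standard errors. The function name `gls_efficiency` is illustrative.

```python
def gls_efficiency(se_ols, se_gls):
    """Efficiency (%) = (SE_OLS - SE_GLS) / SE_OLS * 100."""
    return (se_ols - se_gls) / se_ols * 100.0

# Standard errors printed in Table 3 (rounded to three decimals).
se_ols_case1 = [0.060, 0.082, 0.061, 0.074, 0.061, 0.082]
se_gls_case1 = [0.060, 0.082, 0.061, 0.074, 0.061, 0.082]
se_ols_case2 = [0.067, 0.082, 0.064, 0.077, 0.070, 0.084]
se_gls_case2 = [0.062, 0.078, 0.062, 0.076, 0.064, 0.079]
reported_case2 = [6.625, 4.531, 3.416, 2.169, 8.185, 5.873]

eff_case1 = [gls_efficiency(o, g) for o, g in zip(se_ols_case1, se_gls_case1)]
eff_case2 = [gls_efficiency(o, g) for o, g in zip(se_ols_case2, se_gls_case2)]

for value, reported in zip(eff_case2, reported_case2):
    print(f"recomputed {value:.3f}  reported {reported:.3f}")
```

In the first case every efficiency is zero, since the OLS and GLS standard errors coincide; in the second case each recomputed value lies within one percentage point of the reported value.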

Table 3 shows how the standard errors of the OLS and GLS estimates of the GSTAR model differ. In the first case the standard errors of the two methods are the same. In the second case the GLS standard errors are smaller than the OLS standard errors for all parameters, so parameter estimation using GLS can be said to be better than OLS. The GLS efficiency ranges from about 2.2% to 8.2% and exceeds five percent for three of the six parameters. In addition, the standard errors of each GSTAR parameter can be compared through the probability density function (p.d.f.) curves in Figure 2.

[Figure 2 panels: psi10 and psi11 (location 1), psi20 and psi21 (location 2), psi30 and psi31 (location 3). Each panel shows the OLS and GLS density curves (vertical axis: Density), with a reference value marked at 0.3, 0.1, 0.2, 0.15, 0.25, and 0.2, respectively.]

Figure 2. Distribution plots of the parameter estimates $\phi_{i0}$ (left) and $\phi_{i1}$ (right) with the OLS and GLS methods for (a) location 1, (b) location 2, and (c) location 3.

The GLS estimates are more efficient than the OLS estimates for each parameter, as shown by the narrower p.d.f. curves (drawn in blue). The dotted vertical line shows the actual value of each parameter. Visually, the parameter estimates are close to the true values. For the first case, the GSTAR($1_1$)-OLS model can be written as

$$
\begin{pmatrix} u_1(t) \\ u_2(t) \\ u_3(t) \end{pmatrix}
=
\begin{pmatrix}
0.283 & 0.124 & 0.124 \\
0.141 & 0.175 & 0.141 \\
0.123 & 0.123 & 0.329
\end{pmatrix}
\begin{pmatrix} u_1(t-1) \\ u_2(t-1) \\ u_3(t-1) \end{pmatrix}
+
\begin{pmatrix} e_1(t) \\ e_2(t) \\ e_3(t) \end{pmatrix}
$$

and the GSTAR($1_1$)-GLS model can be written as

$$
\begin{pmatrix} u_1(t) \\ u_2(t) \\ u_3(t) \end{pmatrix}
=
\begin{pmatrix}
0.299 & 0.115 & 0.115 \\
0.144 & 0.169 & 0.144 \\
0.120 & 0.120 & 0.335
\end{pmatrix}
\begin{pmatrix} u_1(t-1) \\ u_2(t-1) \\ u_3(t-1) \end{pmatrix}
+
\begin{pmatrix} e_1(t) \\ e_2(t) \\ e_3(t) \end{pmatrix}.
$$

For the second case, the GSTAR($1_1$)-OLS model can be written as

$$
\begin{pmatrix} u_1(t) \\ u_2(t) \\ u_3(t) \end{pmatrix}
=
\begin{pmatrix}
0.271 & 0.057 & 0.057 \\
0.147 & 0.161 & 0.147 \\
0.135 & 0.135 & 0.143
\end{pmatrix}
\begin{pmatrix} u_1(t-1) \\ u_2(t-1) \\ u_3(t-1) \end{pmatrix}
+
\begin{pmatrix} e_1(t) \\ e_2(t) \\ e_3(t) \end{pmatrix}
$$

and the GSTAR($1_1$)-GLS model can be written as

$$
\begin{pmatrix} u_1(t) \\ u_2(t) \\ u_3(t) \end{pmatrix}
=
\begin{pmatrix}
0.254 & 0.066 & 0.066 \\
0.140 & 0.176 & 0.140 \\
0.140 & 0.140 & 0.133
\end{pmatrix}
\begin{pmatrix} u_1(t-1) \\ u_2(t-1) \\ u_3(t-1) \end{pmatrix}
+
\begin{pmatrix} e_1(t) \\ e_2(t) \\ e_3(t) \end{pmatrix}.
$$

4. Conclusion and remarks

The simulation study shows that estimating a GSTAR model with calendar variation effects directly (simultaneously) is problematic, because it results in a non-linear model. This study therefore proposes a model building procedure, the two-level GSTARX, in which X is based on a regression model with calendar variation. The first-level model is used to estimate the parameters of the calendar variation effects, whereas the second-level model estimates the spatial parameters from the errors of the first-level model. The two-level model building procedure follows Wei (2006). The simulation results show that if the residuals are correlated between all locations, or between only some locations, the GSTAR-GLS model gives more efficient parameter estimates than the GSTAR-OLS model, as shown by its smaller standard errors. If the residuals are not correlated between the equations, the standard errors of the GSTAR-OLS and GSTAR-GLS models are the same.

Further research is needed on estimating the first-level model with a multivariate regression model. A simulation study involving a combination of data types (metric and non-metric) is also needed.

References

Baltagi, B. (1980). On seemingly unrelated regressions with error components. Econometrica, 48, 1547–1551.

Borovkova, S.A., Lopuhaa, H.P., & Ruchjana, B.N. (2008). Consistency and asymptotic normality of least square estimators in Generalized STAR models. Journal Compilation Statistica Neerlandica, 62(4), 482–508.

Lopuhaa, H.P., & Borovkova, S. (2005). Asymptotic properties of least squares estimators in generalized STAR models. Technical Report, Delft University of Technology.

Nurani, B. (2002). Pemodelan kurva produksi minyak bumi menggunakan model generalisasi S-TAR. Jurnal Forum Statistika dan Komputasi, IPB, Bogor.

Pfeifer, P.E., & Deutsch, S.J. (1980). A three stage iterative procedure for space-time modeling. Technometrics, 22(1), 35–47.

Suhartono (2005). Perbandingan antara model GSTAR dan VARIMA untuk peramalan data deret waktu dan lokasi. Prosiding Seminar Nasional Statistika, ITS, Surabaya.

Suhartono, & Subanar (2006). The optimal determination of space weight in GSTAR model by using crosscorrelation inference. Journal of Quantitative Method, Journal Devoted to the Mathematical and Statistical Aplication in Various Field, 2(2), 45–53.

Terzi, S. (1995). Maximum likelihood estimation of a generalized STAR (p; 1p) model. Journal of the Italian Statistical Society, 4(3), 377–393.


Wang, X., & Kockelman, K. (2007). Specification and estimation of a spatially and temporally autocorrelated seemingly unrelated regression model: Application to crash rates in China. Journal of Transportation, 34, 281–300.

Wei, W.W.S. (2006). Time series analysis univariate and multivariate methods. Canada: Addison-Wesley Publishing Company, Inc.

Wutsqa, D.U., & Suhartono (2010). Seasonal multivariate time series forecasting on tourism data by using VAR-GSTAR model. Jurnal ILMU DASAR, 11(1), 101–109.

Zellner, A. (1962). An efficient method of estimating seemingly unrelated regression equations and tests for aggregation bias. Journal of the American Statistical Association, 57, 348–368.


(1)

- for the first case, the residual is not correlated in three locations:

, 00 , 1 00 , 0 00 , 0

00 , 0 00 , 1 00 , 0

00 , 0 00 , 0 00 , 1

     

   

=

- for the second case, the residuals are correlated in three locations: .

00 , 1 30 , 0 40 , 0

30 , 0 00 , 1 20 , 0

40 , 0 20 , 0 00 , 1

     

   

=

Step 5: Determining the dummy variable for the period of calendar variations (see Table 1).

Step 6: Perform parameter estimation model for first level using the OLS method, such as Eq. (1).

Step 7: Determining the spatial weights (W) are used.

Step 8: Perform parameter estimation model for second level using the OLS and GLS method, such as Eq. (2).

Step 9: Calculate the efficiency (%) of GLS method, with form . 100 x )

ˆ ( SE

) ˆ ( SE ) ˆ ( SE

OLS GLS OLS

β β β −

Step 10: This phase is done by adding up the value of out-sample forecasting of the first and second level model.

3.

Results and discussion

The first step is to identify the effects of calendar variation from plot time series for a specified period according to Table 1. Plot time series of data simulation the effects of calendar variation shown in Figure 1.

Year Month

2010 2008 2006 2004 2002 2000 1998 1996 1994 1992 1990

Jan Jan Jan Jan Jan Jan Jan Jan Jan Jan Jan 200

150

100

50

0

L

o

k

a

s

i

1

A pr '90 A pr '91

A pr '92 M ar '93

M ar '94 M ar '95

F eb '96 F eb '97

Jan '98 Jan '99

Jan '00 D ec '00

D ec '01 D ec '02

N ov '03 N ov '04

N ov '05 O ct '06

O ct '07 O ct '08

Sep '09 Sep '10

9 8 9

8

10 9

10 9 10

9

11 10

11 10 11

10 12 11

12 11 12

11 1 12 1

12 1

12 2 1 2

1

3 2

3 2 3

2 4 3

4 3 4

3

Year Month

2010 2008 2006 2004 2002 2000 1998 1996 1994 1992 1990

Jan Jan Jan Jan Jan Jan Jan Jan Jan Jan Jan 140 120 100 80 60 40 20 0

L

o

k

a

s

i

2

A pr '90 A pr '91

A pr '92 M ar '93

M ar '94 M ar '95

F eb '96 F eb '97

Jan '98 Jan '99

Jan '00 D ec '00

D ec '01 D ec '02

N ov '03 N ov '04

Nov '05 O ct '06

O ct '07 O ct '08

Sep '09 Sep '10

9 8 9

8

10 9

10 9 10

9 11 10

11 10 11

10 12 11 12

11 12

11 1 12 1

12 1

12 2 1 2

1 3 2

3 2 3

2 4 3 4 3 4

3

Year Month

2010 2008 2006 2004 2002 2000 1998 1996 1994 1992 1990

Jan Jan Jan Jan Jan Jan Jan Jan Jan Jan Jan 80 70 60 50 40 30 20 10 0

L

o

k

a

s

i

3

A pr '90 A pr '91

A pr '92 M ar '93

M ar '94 M ar '95

F eb '96 F eb '97

Jan '98 Jan '99

Jan '00 D ec '00

D ec '01 D ec '02

N ov '03 N ov '04

N ov '05 O ct '06

O ct '07 O ct '08

S ep '09 Sep '10

9 8 9

8

10 9

10 9 10

9 11 10

11 10 11

10 12 11

12 11 12

11 1 12 1 12 1

12 2 1 2

1

3 2

3 2 3

2 4 3

4 3 4

3


(2)

SWUP Figure 1 shows of the time series plot of the vector AR(1) model with the data containing the effects of calendar variations. A vertical (dotted line) was included in this plot to emphasize the months of Eid that occurred during this period. Stage one in modeling GSTARX is to estimate of parameters the first level. Such as the following results.

- Model for location 1

. 440 , 33 548 , 45 823 , 49 283 , 60 224 , 71 903 , 79 571 , 90 684 , 98 843 , 105 702 , 116 863 , 126 634 , 135 766 , 138 139 , 145 698 , 155 080 , 165 700 , 170 075 , 180 453 , 193 178 , 181 097 , 175 960 , 163 563 , 152 775 , 141 300 , 128 849 , 116 003 , 110 889 , 99 302 , 86 113 , 76 747 , 69 142 , 64 248 , 50 016 , 37 371 , 33 393 , 20 , 1 1 , 29 1 , 27 1 , 26 1 , 24 1 , 22 1 , 20 1 , 18 1 , 16 1 , 15 1 , 13 1 , 11 1 , 9 1 , 8 1 , 7 1 , 5 1 , 3 1 , 2 1 , 0 , 29 , 27 , 26 , 24 , 22 , 20 , 18 , 16 , 15 , 13 , 11 , 9 , 8 , 7 , 5 , 3 , 2 , 0 * , 1 t t t t t t t t t t t t t t t t t t t t t t t t t t t t t t t t t t t t t t u D D D D D D D D D D D D D D D D D D D D D D D D D D D D D D D D D D D D Y + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + = − − − − − − − − − − − − − − − − − − (4)

- Model for location 2

. 458 , 13 428 , 18 733 , 21 763 , 27 403 , 33 442 , 41 232 , 47 652 , 52 402 , 54 992 , 61 158 , 67 089 , 73 810 , 76 371 , 80 693 , 84 449 , 91 296 , 94 745 , 98 196 , 132 653 , 122 330 , 117 898 , 109 161 , 104 874 , 95 987 , 87 555 , 78 243 , 75 474 , 67 875 , 59 549 , 52 717 , 46 394 , 44 187 , 36 291 , 27 357 , 22 722 , 15 , 2 1 , 29 1 , 27 1 , 26 1 , 24 1 , 22 1 , 20 1 , 18 1 , 16 1 , 15 1 , 13 1 , 11 1 , 9 1 , 8 1 , 7 1 , 5 1 , 3 1 , 2 1 , 0 , 29 , 27 , 26 , 24 , 22 , 20 , 18 , 16 , 15 , 13 , 11 , 9 , 8 , 7 , 5 , 3 , 2 , 0 * , 2 t t t t t t t t t t t t t t t t t t t t t t t t t t t t t t t t t t t t t t u D D D D D D D D D D D D D D D D D D D D D D D D D D D D D D D D D D D D Y + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + = − − − − − − − − − − − − − − − − − − (5)

- Model for location 3

. 942 , 9 558 , 16 888 , 15 820 , 22 282 , 27 128 , 31 304 , 35 231 , 37 547 , 39 706 , 42 713 , 48 560 , 52 652 , 53 552 , 56 777 , 59 286 , 64 182 , 65 318 , 70 915 , 67 409 , 64 822 , 62 961 , 58 500 , 55 610 , 51 522 , 47 505 , 42 077 , 41 226 , 35 430 , 31 756 , 27 345 , 26 099 , 24 967 , 19 600 , 17 601 , 14 054 , 10 , 3 1 , 29 1 , 27 1 , 26 1 , 24 1 , 22 1 , 20 1 , 18 1 , 16 1 , 15 1 , 13 1 , 11 1 , 9 1 , 8 1 , 7 1 , 5 1 , 3 1 , 2 1 , 0 , 29 , 27 , 26 , 24 , 22 , 20 , 18 , 16 , 15 , 13 , 11 , 9 , 8 , 7 , 5 , 3 , 2 , 0 * , 3 t t t t t t t t t t t t t t t t t t t t t t t t t t t t t t t t t t t t t t u D D D D D D D D D D D D D D D D D D D D D D D D D D D D D D D D D D D D Y + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + = − − − − − − − − − − − − − − − − −

(6)

From the estimation parameters such as the first level of Eq. (4) to Eq. (6) obtained residual models (i.e., ui,t). Furthermore, the residual models used for estimation at second level, i.e., GSTAR models with parameter between locations (spatial).

Characteristics of GSTAR models is weighted with the location. Spatial weighting method used in this study is limited only by inference Partial Correlation Normalized Cross (NIPKS). This method is based on the high or low value of the partial cross correlation between locations. Statistical inference process is done by using a 95% confidence interval.


(3)

Table 2. Estimates normalization of cross correlation inference partial data simulation case one.

Parameter Estimates 95% confidence interval Conclusion

Lower Upper

P12(1) 0.186 0.073 0.300 Valid and concurrent

P13(1) 0.185 0.072 0.298 Valid and concurrent

P21(1) 0.212 0.099 0.326 Valid and concurrent

P23(1) 0.197 0.084 0.310 Valid and concurrent

P31(1) 0.140 0.027 0.253 Valid and concurrent

P32(1) 0.267 0.154 0.380 Valid and concurrent

Based on the calculation of the amount of the cross-correlation between the location at the time to lag 1, the process of inference statistics in Table 2 shows that the confidence interval gives the same amount (the relationship). Thus, the decision obtained are valid and comparable, it showed no difference in weighting between locations. Thus, the appropriate weighting method in this case is uniform

. 0 5 . 0 5 . 0

5 . 0 0 5 . 0

5 . 0 5 . 0 0

     

   

=

W (7)

For the second case, the weighting method is the same as Eq. (7). By using the weight of locations the results of parameter estimation GSTAR(11) model shown in Table 3.

Table 3. Comparison of parameter estimates from OLS and GLS method for the first and

second case.

Case Parameter OLS GLS Efisiensi (%)

GLS

Estimasi SE Estimasi SE

1

1 10

φ 0.283 0.060 0.299 0.060 0.000

1 11

φ

0.247 0.082 0.230 0.082 0.000

1 20

φ

0.175 0.061 0.169 0.061 0.000

1 21

φ

0.282 0.074 0.288 0.074 0.000

1 30

φ

0.329 0.061 0.335 0.061 0.000

1 31

φ 0.246 0.082 0.240 0.082 0.000

2

1 10

φ 0.271 0.067 0.254 0.062 6.625

1 11

φ

0.114 0.082 0.132 0.078 4.531

1 20

φ

0.161 0.064 0.176 0.062 3.416

1 21

φ

0.293 0.077 0.279 0.076 2.169

1 30

φ

0.143 0.070 0.133 0.064 8.185

1 31

φ 0.270 0.084 0.280 0.079 5.873

Table 3 shows the results of estimation GSTAR models there is a difference in the standard error of the estimation OLS with GLS method. The standard error of the GLS method is smaller than the OLS, the difference is largely occurs on all parameters. For the second case can be stated that the parameter estimation using GLS better than OLS. This can be seen in


(4)

SWUP almost all GLS efficiency coefficient is worth above five percent. In addition, comparison of the efficiency of the standard error of each parameter GSTAR model can also be shown through the curve probability distribution function (p.d.f) in Figure 2.

0.5 0.4 0.3 0.2 0.1

7

6 5 4 3

2 1 0

psi10

D

e

n

s

it

y

0.3

O LS GLS

0.4 0.3 0.2 0.1 0.0 -0.1 -0.2 5

4

3

2

1

0

psi11

D

e

n

s

it

y

0.1

O LS GLS

0.4 0.3 0.2

0.1 0.0 7 6 5 4 3 2 1 0

psi20

D

e

n

s

it

y

0.2

OLS GLS

0.5 0.4 0.3 0.2 0.1 0.0 6 5 4 3 2 1 0

psi21

D

e

n

s

it

y

0.15

OLS GLS

0.4 0.3 0.2 0.1 0.0 7

6 5 4 3 2 1 0

psi30

D

e

n

s

it

y

0.25

O LS GLS

0.6 0.5 0.4 0.3 0.2 0.1 0.0 5

4

3

2

1

0

psi31

D

e

n

s

it

y

0.2

O LS GLS

Figure 2. Parameter distribution plot for φi10 (left) dan φi11 (right) with OLS and GLS method (a) location 1, (b) location 2, dan (c) location 3.

Each parameter is estimated more efficiently by the GLS method than by the OLS method, as indicated by the narrower p.d.f. curve drawn in blue. In addition, the vertical dotted line marks the true coefficient value of each parameter. Visually, the parameter estimates are close to the true values. Furthermore, for the first case the GSTAR(1₁)-OLS model can be written as

$$
\begin{bmatrix} u_1(t) \\ u_2(t) \\ u_3(t) \end{bmatrix}
=
\begin{bmatrix}
0.283 & 0.124 & 0.124 \\
0.141 & 0.175 & 0.141 \\
0.123 & 0.123 & 0.329
\end{bmatrix}
\begin{bmatrix} u_1(t-1) \\ u_2(t-1) \\ u_3(t-1) \end{bmatrix}
+
\begin{bmatrix} e_1(t) \\ e_2(t) \\ e_3(t) \end{bmatrix}
$$

and the GSTAR(1₁)-GLS model can be written as

$$
\begin{bmatrix} u_1(t) \\ u_2(t) \\ u_3(t) \end{bmatrix}
=
\begin{bmatrix}
0.299 & 0.115 & 0.115 \\
0.144 & 0.169 & 0.144 \\
0.120 & 0.120 & 0.335
\end{bmatrix}
\begin{bmatrix} u_1(t-1) \\ u_2(t-1) \\ u_3(t-1) \end{bmatrix}
+
\begin{bmatrix} e_1(t) \\ e_2(t) \\ e_3(t) \end{bmatrix}.
$$



For the second case, the GSTAR(1₁)-OLS model can be written as

$$
\begin{bmatrix} u_1(t) \\ u_2(t) \\ u_3(t) \end{bmatrix}
=
\begin{bmatrix}
0.271 & 0.057 & 0.057 \\
0.147 & 0.161 & 0.147 \\
0.135 & 0.135 & 0.143
\end{bmatrix}
\begin{bmatrix} u_1(t-1) \\ u_2(t-1) \\ u_3(t-1) \end{bmatrix}
+
\begin{bmatrix} e_1(t) \\ e_2(t) \\ e_3(t) \end{bmatrix}
$$

and the GSTAR(1₁)-GLS model can be written as

$$
\begin{bmatrix} u_1(t) \\ u_2(t) \\ u_3(t) \end{bmatrix}
=
\begin{bmatrix}
0.254 & 0.066 & 0.066 \\
0.140 & 0.176 & 0.140 \\
0.140 & 0.140 & 0.133
\end{bmatrix}
\begin{bmatrix} u_1(t-1) \\ u_2(t-1) \\ u_3(t-1) \end{bmatrix}
+
\begin{bmatrix} e_1(t) \\ e_2(t) \\ e_3(t) \end{bmatrix}.
$$
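The coefficient matrices in the fitted models follow from the parameter estimates in Table 3 once the location weights are fixed. The sketch below assumes uniform weights (each of the two neighbouring locations receives weight 0.5), which is inferred from the reported matrices rather than stated in this section, and builds the matrix as diag(φ_i0) + diag(φ_i1) W. The function name `gstar_coefficient_matrix` is hypothetical.

```python
import numpy as np


def gstar_coefficient_matrix(phi0, phi1, weights):
    """Return diag(phi0) + diag(phi1) @ W for a first-order GSTAR model."""
    return np.diag(phi0) + np.diag(phi1) @ weights


# Uniform weights for three locations: zero on the diagonal, 0.5 elsewhere
# (assumption inferred from the fitted matrices).
W = (np.ones((3, 3)) - np.eye(3)) / 2.0

# Second case, GLS estimates taken from Table 3
phi0_gls = np.array([0.254, 0.176, 0.133])  # own-lag parameters
phi1_gls = np.array([0.132, 0.279, 0.280])  # spatial-lag parameters

Phi_gls = gstar_coefficient_matrix(phi0_gls, phi1_gls, W)
print(np.round(Phi_gls, 3))

# One-step-ahead prediction of the first-level errors from a hypothetical u(t-1)
u_prev = np.array([1.0, -0.5, 0.2])
u_pred = Phi_gls @ u_prev
print(np.round(u_pred, 3))
```

The diagonal of the resulting matrix holds the own-lag parameters, and each off-diagonal entry in row i equals half of the spatial parameter of location i, which agrees with the second-case GLS model up to rounding.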

4.

Conclusion and remarks

The results of this research, particularly the simulation study, show that estimating a GSTAR model with calendar variation effects directly (simultaneously) is problematic, because it leads to a non-linear model. This study therefore proposes a model-building procedure, the two-level GSTARX, in which X is based on a regression model with calendar variation. The first-level model is used to estimate the parameters of the calendar variation effects, whereas the second-level model estimates the spatial parameters from the errors of the first-level model. The two-level model-building procedure follows Wei (2006). The simulation results show that if the residuals are correlated between all locations, or between only some locations, the GSTAR-GLS model yields more efficient parameter estimates than the GSTAR-OLS model. This is shown by the standard errors of the GSTAR-GLS model being smaller than those of the GSTAR-OLS model. If the residuals are not correlated between the equations, the standard errors of the GSTAR-OLS and GSTAR-GLS models are the same.
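The efficiency comparison summarized above can be illustrated with a small simulation. The sketch below is not the authors' simulation design: the parameter values, residual covariance, sample size, and seed are all hypothetical, and GLS is carried out as feasible GLS in the seemingly unrelated regression form (Zellner, 1962), which is an assumption about the procedure. Both sets of standard errors use the same residual covariance estimate, so the comparison isolates the effect of the estimator.

```python
import numpy as np

rng = np.random.default_rng(0)
n_loc, n_obs = 3, 300

# Hypothetical true parameters and uniform location weights
phi0 = np.array([0.3, 0.2, 0.3])     # own-lag coefficients
phi1 = np.array([0.25, 0.25, 0.25])  # spatial-lag coefficients
W = (np.ones((n_loc, n_loc)) - np.eye(n_loc)) / (n_loc - 1)
Phi = np.diag(phi0) + np.diag(phi1) @ W

# Hypothetical residual covariance, correlated between all locations
Sigma = np.array([[1.0, 0.7, 0.7],
                  [0.7, 1.0, 0.7],
                  [0.7, 0.7, 1.0]])

# Simulate u(t) = Phi u(t-1) + e(t)
e = rng.multivariate_normal(np.zeros(n_loc), Sigma, size=n_obs)
u = np.zeros((n_obs, n_loc))
for t in range(1, n_obs):
    u[t] = Phi @ u[t - 1] + e[t]

# Regressors per location: own lag and weighted neighbour lag
lagged = u[:-1]
neighbours = lagged @ W.T
X = [np.column_stack([lagged[:, i], neighbours[:, i]]) for i in range(n_loc)]
y = [u[1:, i] for i in range(n_loc)]
n = n_obs - 1

# OLS, equation by equation
beta_ols = [np.linalg.solve(Xi.T @ Xi, Xi.T @ yi) for Xi, yi in zip(X, y)]
resid = np.column_stack([yi - Xi @ bi for Xi, yi, bi in zip(X, y, beta_ols)])
Sigma_hat = resid.T @ resid / n
se_ols = np.concatenate([
    np.sqrt(np.diag(Sigma_hat[i, i] * np.linalg.inv(X[i].T @ X[i])))
    for i in range(n_loc)
])

# Feasible GLS: weight the stacked system by the inverse of Sigma_hat
S_inv = np.linalg.inv(Sigma_hat)
A = np.block([[S_inv[i, j] * X[i].T @ X[j] for j in range(n_loc)]
              for i in range(n_loc)])
b = np.concatenate([sum(S_inv[i, j] * X[i].T @ y[j] for j in range(n_loc))
                    for i in range(n_loc)])
cov_gls = np.linalg.inv(A)
beta_gls = cov_gls @ b
se_gls = np.sqrt(np.diag(cov_gls))

print("OLS SE:", np.round(se_ols, 4))
print("GLS SE:", np.round(se_gls, 4))
```

Setting the off-diagonal entries of `Sigma` to zero gives a rough analogue of the uncorrelated case, in which the two sets of standard errors should be nearly equal.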

For further research, it is necessary to estimate the parameters of the first-level model with a multivariate regression model. A simulation study involving a combination of data types (metric and non-metric) is also needed.

References

Baltagi, B. (1980). On seemingly unrelated regressions with error components. Econometrica, 48, 1547–1551.

Borovkova, S.A., Lopuhaa, H.P., & Ruchjana, B.N. (2008). Consistency and asymptotic normality of least squares estimators in generalized STAR models. Statistica Neerlandica, 62(4), 482–508.

Lopuhaa, H.P., & Borovkova, S. (2005). Asymptotic properties of least squares estimators in generalized STAR models. Technical Report, Delft University of Technology.

Nurani, B. (2002). Pemodelan kurva produksi minyak bumi menggunakan model generalisasi S-TAR. Jurnal Forum Statistika dan Komputasi, IPB, Bogor.

Pfeifer, P.E., & Deutsch, S.J. (1980). A three stage iterative procedure for space-time modeling. Technometrics, 22(1), 35–47.

Suhartono (2005). Perbandingan antara model GSTAR dan VARIMA untuk peramalan data deret waktu dan lokasi. Prosiding Seminar Nasional Statistika, ITS, Surabaya.

Suhartono, & Subanar (2006). The optimal determination of space weight in GSTAR model by using crosscorrelation inference. Journal of Quantitative Method, Journal Devoted to the Mathematical and Statistical Application in Various Field, 2(2), 45–53.

Terzi, S. (1995). Maximum likelihood estimation of a generalized STAR (p; 1p) model. Journal of the


Wang, X., & Kockelman, K. (2007). Specification and estimation of a spatially and temporally autocorrelated seemingly unrelated regression model: Application to crash rates in China. Journal of Transportation, 34, 281–300.

Wei, W.W.S. (2006). Time series analysis: Univariate and multivariate methods. Canada: Addison-Wesley Publishing Company, Inc.

Wutsqa, D.U., & Suhartono (2010). Seasonal multivariate time series forecasting on tourism data by using VAR-GSTAR model. Jurnal ILMU DASAR, 11(1), 101–109.

Zellner, A. (1962). An efficient method of estimating seemingly unrelated regression equations and tests for aggregation bias. Journal of the American Statistical Association, 57, 348–368.