Lesson 1: Functional forms of regression

(1)

FUNCTIONAL FORMS



Truong Dang Thuy

truong@dangthuy.net




(2)

Linear model



Consider a linear regression function:

$$Y = \beta_0 + \beta_1 X + u$$

$\beta_1$: the change in Y when X increases by 1 unit.

Sometimes the relationship is not linear.

Common functional forms:

Log-linear
Log-lin
Lin-log
Reciprocal
Polynomial




(3)

Functional forms



Linear model: $Y = \beta_0 + \beta_1 X$

Log-linear: $\ln Y = \beta_0 + \beta_1 \ln X$

Lin-log: $Y = \beta_0 + \beta_1 \ln X$

Log-lin: $\ln Y = \beta_0 + \beta_1 X$



(4)

Functional forms



Reciprocal (negative beta): $Y = \beta_0 + \beta_1 \frac{1}{X}$, with $\beta_1 < 0$

Reciprocal (positive beta): $Y = \beta_0 + \beta_1 \frac{1}{X}$, with $\beta_1 > 0$






(5)

Example dataset



Viet Nam provincial data (file ‘gdpprov.xlsx’)

gdp: provincial GDP (mil. VND)

labfo: number of laborers in each province (1000 persons)




(6)

[Screenshot of the main window, with labels: record of commands, record of results, variables (data), commands, taskbar]




(7)

Import data



Copy from Excel




(8)

Data description





(9)

Linear function




(10)

LOG-LINEAR MODEL



The Cobb-Douglas production function:

$$Q_i = B_1 L_i^{B_2} K_i^{B_3}$$

can be transformed into a linear model by taking natural logs of both sides:

$$\ln Q_i = \ln B_1 + B_2 \ln L_i + B_3 \ln K_i$$

The slope coefficients can be interpreted as elasticities.

If $(B_2 + B_3) = 1$, we have constant returns to scale.

If $(B_2 + B_3) > 1$, we have increasing returns to scale.

If $(B_2 + B_3) < 1$, we have decreasing returns to scale.




(11)

Log-linear model




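. gen lgdp = ln(rgdp)
(10 missing values generated)

. gen llabor = ln(labfo)

. gen linvest = ln(rinvest)
(17 missing values generated)

. reg lgdp llabor linvest

      Source |       SS       df       MS              Number of obs =     271
-------------+------------------------------           F(  2,   268) =  477.42
       Model |  175.619057     2  87.8095284           Prob > F      =  0.0000
    Residual |  49.2915017   268  .183923514           R-squared     =  0.7808
-------------+------------------------------           Adj R-squared =  0.7792
       Total |  224.910559   270  .833002069           Root MSE      =  .42886

------------------------------------------------------------------------------
        lgdp |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
      llabor |    .508612   .0643267     7.91   0.000      .381962     .635262
     linvest |    .644785   .0405325    15.91   0.000     .5649824    .7245876
       _cons |    3.06333   .4515804     6.78   0.000     2.174233    3.952426
------------------------------------------------------------------------------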
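As a quick check on the returns-to-scale rule, the two elasticities reported in the regression output can be summed. The Python sketch below copies the llabor and linvest coefficients from that output; the function and variable names are illustrative, not part of the original session. It compares point estimates only, so it is not a substitute for a formal test of $B_2 + B_3 = 1$.

```python
# Point estimates copied from the Stata output for `reg lgdp llabor linvest`.
b_labor = 0.508612    # elasticity of GDP with respect to labor (llabor)
b_invest = 0.644785   # elasticity of GDP with respect to investment (linvest)


def returns_to_scale(b2: float, b3: float, tol: float = 1e-9) -> str:
    """Classify returns to scale from the sum of the two elasticities."""
    total = b2 + b3
    if abs(total - 1.0) <= tol:
        return "constant"
    return "increasing" if total > 1.0 else "decreasing"


scale_sum = b_labor + b_invest
scale_kind = returns_to_scale(b_labor, b_invest)
print(f"B2 + B3 = {scale_sum:.6f} -> {scale_kind} returns to scale")
```

The sum is about 1.153, so the point estimates suggest increasing returns to scale for this sample.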



(12)

LOG-LIN OR GROWTH MODELS



The rate of growth of real GDP:

$$RGDP_t = RGDP_0 (1 + r)^t$$

can be transformed into a linear model by taking natural logs of both sides:

$$\ln RGDP_t = \ln RGDP_0 + t \ln(1 + r)$$

Letting $B_1 = \ln RGDP_0$ and $B_2 = \ln(1 + r)$, this can be rewritten as:

$$\ln RGDP_t = B_1 + B_2 t$$

$B_2$ is considered a semi-elasticity or an instantaneous growth rate.

The compound growth rate ($r$) is equal to $(e^{B_2} - 1)$.
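The link between the instantaneous rate $B_2$ and the compound rate $r$ can be illustrated with a short calculation. This is a minimal Python sketch; the slope value used is hypothetical and is not an estimate from the slides.

```python
import math


def compound_growth_rate(b2: float) -> float:
    """Convert the log-lin slope B2 = ln(1 + r) into the compound rate r."""
    return math.exp(b2) - 1.0


# Hypothetical slope, chosen only for illustration (not an estimate from the data).
b2_example = 0.05
r_example = compound_growth_rate(b2_example)
print(f"instantaneous rate: {b2_example:.4f}, compound rate: {r_example:.4f}")
```

For small slopes the two rates are close; the gap widens as $B_2$ grows.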




(13)

LOG-LIN MODEL



. sum t

    Variable |       Obs        Mean    Std. Dev.       Min        Max
-------------+--------------------------------------------------------
           t |       290           3    1.416658          1          5




(14)

LOG-LIN MODEL




(15)

LIN-LOG MODELS



Lin-log models follow this general form:

$$Y_i = B_1 + B_2 \ln X_i + u_i$$

Note that $B_2$ is the absolute change in Y in response to a relative (proportional) change in X: for small changes, $\Delta Y \approx B_2 (\Delta X / X)$.

If X increases by 1%, predicted Y increases by about $B_2 / 100$ units.




(16)

Exercise – lin-log model




Data: from VHLSS 2010

income: individual annual income (1000 VND)

healthcost: individual annual cost for health care (1000 VND)

Use the data in ‘healthcost.dta’ to run the regression

$$hcshare = \beta_0 + \beta_1 \ln(income) + u$$

where hcshare is the share of health cost in income.




(17)

Health cost with Lin-log model




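. gen lincome = ln(income)

. reg hcshare lincome

      Source |       SS       df       MS              Number of obs =    3475
-------------+------------------------------           F(  1,  3473) =  135.35
       Model |  2.84335206     1  2.84335206           Prob > F      =  0.0000
    Residual |  72.9563097  3473  .021006712           R-squared     =  0.0375
-------------+------------------------------           Adj R-squared =  0.0372
       Total |  75.7996618  3474  .021819131           Root MSE      =  .14494

------------------------------------------------------------------------------
     hcshare |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
     lincome |  -.0341629   .0029364   -11.63   0.000    -.0399202   -.0284056
       _cons |    .421608   .0322026    13.09   0.000       .35847     .484746
------------------------------------------------------------------------------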
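To read the lincome coefficient in practical terms, the exact change in the fitted share for a given percentage rise in income is $B_2 \ln(1 + p/100)$. The Python sketch below applies this to the slope reported in the output; the helper name `share_change` is illustrative and not part of the original session.

```python
import math

# Slope copied from the Stata output for `reg hcshare lincome`.
b_lincome = -0.0341629


def share_change(b: float, pct_increase: float) -> float:
    """Exact change in Y when X rises by pct_increase percent in Y = B1 + B2 ln X."""
    return b * math.log(1.0 + pct_increase / 100.0)


change_1pct = share_change(b_lincome, 1.0)      # close to b / 100
change_double = share_change(b_lincome, 100.0)  # equals b * ln(2)
print(f"1% higher income:   {change_1pct:+.6f}")
print(f"100% higher income: {change_double:+.6f}")
```

A 1% rise in income lowers the fitted health-cost share by roughly 0.0003, while a doubling of income lowers it by about 0.024, that is, $B_2 \ln 2$ rather than the full $B_2$.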



(18)

RECIPROCAL MODELS



Reciprocal models follow this general form:

$$Y_i = B_1 + B_2 \left(\frac{1}{X_i}\right) + u_i$$

Note that:

As X increases indefinitely, the term $B_2 (1/X_i)$ approaches zero and Y approaches the limiting or asymptotic value $B_1$.

The slope is:

$$\frac{dY}{dX} = -B_2 \left(\frac{1}{X^2}\right)$$

Therefore, if $B_2$ is positive, the slope is negative throughout, and if $B_2$ is negative, the slope is positive throughout.




(19)

Exercise – Reciprocal model



Use the data in ‘healthcost.dta’ to run the regression

$$hcshare = \beta_0 + \beta_1 \frac{1}{income} + u$$







(20)

Exercise – Reciprocal model




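. reg hcshare invincome

      Source |       SS       df       MS              Number of obs =    3475
-------------+------------------------------           F(  1,  3473) =  133.21
       Model |  2.79994649     1  2.79994649           Prob > F      =  0.0000
    Residual |  72.9997153  3473   .02101921           R-squared     =  0.0369
-------------+------------------------------           Adj R-squared =  0.0367
       Total |  75.7996618  3474  .021819131           Root MSE      =  .14498

------------------------------------------------------------------------------
     hcshare |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
   invincome |   942.4843   81.65964    11.54   0.000     782.3786     1102.59
       _cons |    .023971   .0032251     7.43   0.000     .0176478    .0302943
------------------------------------------------------------------------------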
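The fitted reciprocal curve can be explored with a few lines of arithmetic. The Python sketch below uses the two coefficients from the output and assumes that invincome is 1/income (the command that creates it is not shown); the income levels are hypothetical and chosen only for illustration.

```python
# Coefficients copied from the Stata output for `reg hcshare invincome`.
b_cons = 0.023971        # asymptotic health-cost share as income grows
b_invincome = 942.4843   # coefficient on invincome (assumed to be 1 / income)


def predicted_share(income: float) -> float:
    """Fitted hcshare = B1 + B2 * (1 / income)."""
    return b_cons + b_invincome / income


def slope(income: float) -> float:
    """dY/dX = -B2 / X^2 for the reciprocal model."""
    return -b_invincome / income ** 2


# Hypothetical income levels in 1000 VND, for illustration only.
for income in (10_000, 50_000, 200_000):
    print(income, round(predicted_share(income), 4), slope(income))
```

Because the coefficient on invincome is positive, the slope is negative at every income level and the fitted share falls toward the constant, about 0.024, as income rises.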




(21)

POLYNOMIAL REGRESSION MODELS



The following regression predicting GDP is an example of a quadratic function, or more generally, a second-degree polynomial in the variable time:

$$RGDP_t = A_1 + A_2\, time + A_3\, time^2 + u_t$$

The slope is not constant; it changes with time:

$$\frac{dRGDP}{dtime} = A_2 + 2 A_3\, time$$

Exercise: run the above model with ‘gdpprov.dta’
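To see how the slope of the quadratic model moves with time, the derivative can be evaluated period by period. This is a minimal Python sketch; the coefficients are hypothetical and are not estimates from ‘gdpprov.dta’.

```python
def quadratic_slope(a2: float, a3: float, time: float) -> float:
    """Slope of A1 + A2*time + A3*time^2 with respect to time."""
    return a2 + 2.0 * a3 * time


# Hypothetical coefficients, for illustration only.
a2_example, a3_example = 100.0, 15.0

# Evaluate the slope at time = 1, ..., 5.
slopes = [quadratic_slope(a2_example, a3_example, t) for t in range(1, 6)]
print(slopes)
```

With a positive $A_3$ the slope rises by $2 A_3$ each period; with a negative $A_3$ it would fall and eventually change sign.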




(22)

SUMMARY OF FUNCTIONAL FORMS



MODEL        FORM                         SLOPE ($dY/dX$)     ELASTICITY ($\frac{dY}{dX} \cdot \frac{X}{Y}$)

Linear       $Y = B_1 + B_2 X$            $B_2$               $B_2 (X/Y)$
Log-linear   $\ln Y = B_1 + B_2 \ln X$    $B_2 (Y/X)$         $B_2$
Log-lin      $\ln Y = B_1 + B_2 X$        $B_2 Y$             $B_2 X$
Lin-log      $Y = B_1 + B_2 \ln X$        $B_2 (1/X)$         $B_2 (1/Y)$
Reciprocal   $Y = B_1 + B_2 (1/X)$        $-B_2 (1/X^2)$      $-B_2 (1/(XY))$
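The slope formulas for the five functional forms can be checked numerically against a finite-difference derivative. The Python sketch below does this at a single point; the parameter values, the evaluation point and all names are hypothetical and serve only as a sanity check.

```python
import math

B1, B2 = 2.0, 0.5  # hypothetical parameters, for illustration only

# Each entry: (Y as a function of X, analytic slope dY/dX, analytic elasticity).
MODELS = {
    "linear":     (lambda x: B1 + B2 * x,
                   lambda x, y: B2,
                   lambda x, y: B2 * x / y),
    "log-linear": (lambda x: math.exp(B1 + B2 * math.log(x)),
                   lambda x, y: B2 * y / x,
                   lambda x, y: B2),
    "log-lin":    (lambda x: math.exp(B1 + B2 * x),
                   lambda x, y: B2 * y,
                   lambda x, y: B2 * x),
    "lin-log":    (lambda x: B1 + B2 * math.log(x),
                   lambda x, y: B2 / x,
                   lambda x, y: B2 / y),
    "reciprocal": (lambda x: B1 + B2 / x,
                   lambda x, y: -B2 / x ** 2,
                   lambda x, y: -B2 / (x * y)),
}


def numeric_slope(f, x, h=1e-6):
    """Central finite-difference approximation of dY/dX."""
    return (f(x + h) - f(x - h)) / (2 * h)


x0 = 3.0
results = {}
for name, (f, slope, elasticity) in MODELS.items():
    y0 = f(x0)
    # Store (analytic slope, numeric slope, analytic elasticity).
    results[name] = (slope(x0, y0), numeric_slope(f, x0), elasticity(x0, y0))
    print(name, results[name])
```

In each row the analytic and numeric slopes should agree to several decimal places, and the elasticity equals the slope multiplied by $X/Y$.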




(23)

COMPARING ON BASIS OF $R^2$

We cannot directly compare two models that have different dependent variables.

We can transform the models as follows and compare RSS:

Step 1: Compute the geometric mean (GM) of the dependent variable, call it $Y^*$.

Step 2: Divide $Y_i$ by $Y^*$ to obtain: $\tilde{Y}_i = Y_i / Y^*$

Step 3: Estimate the equation with $\ln Y_i$ as the dependent variable using $\tilde{Y}_i$ in lieu of $Y_i$ (i.e., use $\ln \tilde{Y}_i$ as the dependent variable).

Step 4: Estimate the equation with $Y_i$ as the dependent variable using $\tilde{Y}_i$ as the dependent variable instead of $Y_i$.
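Steps 1 and 2 amount to a simple rescaling of the dependent variable. The Python sketch below is one way to carry them out, assuming all values are positive; the data and the function names are hypothetical. The two regressions in Steps 3 and 4 would then be run on the rescaled series.

```python
import math


def geometric_mean(values):
    """Geometric mean via the mean of logs; all values must be positive."""
    return math.exp(sum(math.log(v) for v in values) / len(values))


def scale_by_geometric_mean(values):
    """Return (Y*, [Y_i / Y*]) as in Steps 1 and 2."""
    y_star = geometric_mean(values)
    return y_star, [v / y_star for v in values]


# Hypothetical data, for illustration only.
y = [2.0, 8.0, 4.0]
y_star, y_tilde = scale_by_geometric_mean(y)
log_y_tilde = [math.log(v) for v in y_tilde]  # dependent variable for Step 3
print(y_star, y_tilde)
```

After rescaling, the logs of the new series sum to zero, which puts the RSS of the linear and the log specification on a comparable scale.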




(24)

MEASURES OF GOODNESS OF FIT



$R^2$: Measures the proportion of the variation in the regressand explained by the regressors.

Adjusted $R^2$: Denoted as $\bar{R}^2$, it takes degrees of freedom into account:

$$\bar{R}^2 = 1 - (1 - R^2) \frac{n - 1}{n - k}$$

Akaike’s Information Criterion (AIC): Adds a harsher penalty for adding more variables to the model, defined as:

$$\ln AIC = \frac{2k}{n} + \ln\left(\frac{RSS}{n}\right)$$

The model with the lowest AIC is usually chosen.

Schwarz’s Information Criterion (SIC): Alternative to the AIC criterion, expressed as:

$$\ln SIC = \frac{k}{n} \ln n + \ln\left(\frac{RSS}{n}\right)$$

The penalty factor here is harsher than that of AIC.
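These measures can be applied to the two health-cost regressions, which share the same dependent variable (hcshare) and so can be compared directly. The Python sketch below assumes the common textbook forms $\bar{R}^2 = 1 - (1 - R^2)(n-1)/(n-k)$, $\ln AIC = 2k/n + \ln(RSS/n)$ and $\ln SIC = (k/n)\ln n + \ln(RSS/n)$, and takes RSS, TSS and n from the reported Stata output; the function names are illustrative.

```python
import math


def adjusted_r2(r2: float, n: int, k: int) -> float:
    """Adjusted R^2 = 1 - (1 - R^2)(n - 1)/(n - k), with k counting the intercept."""
    return 1.0 - (1.0 - r2) * (n - 1) / (n - k)


def ln_aic(rss: float, n: int, k: int) -> float:
    """ln AIC = 2k/n + ln(RSS/n) (assumed textbook form)."""
    return 2.0 * k / n + math.log(rss / n)


def ln_sic(rss: float, n: int, k: int) -> float:
    """ln SIC = (k/n) ln n + ln(RSS/n) (assumed textbook form)."""
    return k / n * math.log(n) + math.log(rss / n)


# Values copied from the two Stata outputs with hcshare as the dependent variable.
n, k = 3475, 2                 # observations; parameters including the constant
tss = 75.7996618               # total sum of squares (same in both outputs)
rss = {"lin-log": 72.9563097, "reciprocal": 72.9997153}

fit = {}
for name, value in rss.items():
    r2 = 1.0 - value / tss
    fit[name] = (adjusted_r2(r2, n, k), ln_aic(value, n, k), ln_sic(value, n, k))
    print(name, fit[name])
```

The adjusted $R^2$ values computed this way round to the figures in the outputs (0.0372 and 0.0367), and the lin-log model has the slightly lower ln AIC and ln SIC, although the difference between the two fits is small.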