MLE - Is about MLE

Course: Categorical Data Analysis (SSTE081), Technische Universiteit Delft
Academic year: 2015/2016

Stat 653 HW2                                                        Divya Nair

Exercise 1. Find the MLE (Maximum Likelihood Estimator) for the following parameters:

1. Probability of success $p$ in the Bernoulli$(p)$ model.

Solution. Let $X$ be a Bernoulli random variable with parameter $p$, and let $X_1, \dots, X_n$ be independent random samples of $X$. Recall that the probability density function for the Bernoulli distribution with parameter $p$ is $f(x) = p^x (1-p)^{1-x}$, where $x = 0, 1$. Then the likelihood function of the sample is

$$l(x_1, \dots, x_n, p) = f(x_1, \dots, x_n, p) = \prod_{i=1}^{n} f(x_i, p) \quad \{\text{since the } X_i \text{ are independent}\} = \prod_{i=1}^{n} p^{x_i}(1-p)^{1-x_i} = p^{\sum_{i=1}^{n} x_i} (1-p)^{\,n - \sum_{i=1}^{n} x_i}.$$

Taking the natural logarithm of both sides gives

$$L(x_1, \dots, x_n, p) = \ln l(x_1, \dots, x_n, p) = \left(\sum_{i=1}^{n} x_i\right) \ln p + \left(n - \sum_{i=1}^{n} x_i\right) \ln(1-p).$$

Since $L(x_1, \dots, x_n, p)$ is a continuous function of $p$, it has a maximum value. This value can be found by taking the derivative of $L$ with respect to $p$ and setting it equal to $0$. So,

$$\frac{\partial L(x_1, \dots, x_n, p)}{\partial p} = \frac{\sum_{i=1}^{n} x_i}{p} - \frac{n - \sum_{i=1}^{n} x_i}{1-p} = 0$$

gives $p = \frac{1}{n}\sum_{i=1}^{n} x_i$. Hence $\hat{p} = \frac{1}{n}\sum_{i=1}^{n} x_i = \bar{X}_n$.

2. Probability of success $p$ in the Binomial$(n, p)$ model.

Solution. Let $X$ be a Binomial random variable with parameter $p$, and let $X_1, \dots, X_m$ be independent random samples of $X$. Recall that the probability density function for the Binomial distribution with parameter $p$ is $f(x) = \binom{n}{x} p^x (1-p)^{n-x}$, where $x = 0, \dots, n$. Then the likelihood function of the sample is

$$l(x_1, \dots, x_m, p) = f(x_1, \dots, x_m, p) = \prod_{i=1}^{m} f(x_i, p) \quad \{\text{since the } X_i \text{ are independent}\} = \prod_{i=1}^{m} \binom{n}{x_i} p^{x_i} (1-p)^{n-x_i}.$$

Taking the natural logarithm of both sides gives

$$L(x_1, \dots, x_m, p) = \sum_{i=1}^{m} \left[ \ln \binom{n}{x_i} + x_i \ln p + (n - x_i) \ln(1-p) \right] = \sum_{i=1}^{m} \ln \binom{n}{x_i} + \left(\sum_{i=1}^{m} x_i\right) \ln p + \left(mn - \sum_{i=1}^{m} x_i\right) \ln(1-p).$$
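The closed-form result of part 1, $\hat{p} = \bar{X}_n$, can be sanity-checked numerically. The sketch below (plain Python; the simulated data and the assumed true value $p = 0.3$ are illustrative, not from the exercise) confirms that the log-likelihood peaks at the sample mean:

```python
import math
import random

def bernoulli_loglik(p, xs):
    # Log-likelihood from part 1: (sum x_i) ln p + (n - sum x_i) ln(1 - p)
    s, n = sum(xs), len(xs)
    return s * math.log(p) + (n - s) * math.log(1 - p)

random.seed(0)
# Simulated Bernoulli sample with assumed true p = 0.3
xs = [1 if random.random() < 0.3 else 0 for _ in range(10_000)]

p_hat = sum(xs) / len(xs)  # closed-form MLE: the sample mean

# A grid search over (0, 1) should land on (essentially) the same value,
# since the log-likelihood is strictly concave in p.
grid = [i / 1000 for i in range(1, 1000)]
p_grid = max(grid, key=lambda p: bernoulli_loglik(p, xs))

print(p_hat, p_grid)
```

The grid maximizer agrees with the sample mean up to the grid resolution, which is exactly what the derivative calculation above predicts.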
Since $L(x_1, \dots, x_m, p)$ is a continuous function of $p$, it has a maximum value, found by taking the derivative of $L$ with respect to $p$ and setting it equal to $0$. So,

$$\frac{\partial L(x_1, \dots, x_m, p)}{\partial p} = 0 + \frac{\sum_{i=1}^{m} x_i}{p} - \frac{mn - \sum_{i=1}^{m} x_i}{1-p} = 0$$

gives $p = \frac{\sum_{i=1}^{m} x_i}{mn}$. Hence $\hat{p} = \frac{\sum_{i=1}^{m} x_i}{mn}$.

3. Probability of success $p$ in the Geometric$(p)$ model.

Solution. Let $X$ be a Geometric random variable with parameter $p$, and let $X_1, \dots, X_n$ be independent random samples of $X$. Recall that the probability density function for the Geometric distribution with parameter $p$ is $f(x) = p(1-p)^x$, where $x = 0, 1, 2, \dots$. Then the likelihood function of the sample is

$$l(x_1, \dots, x_n, p) = \prod_{i=1}^{n} f(x_i, p) = \prod_{i=1}^{n} p(1-p)^{x_i} = p^{n} (1-p)^{\sum_{i=1}^{n} x_i}.$$

Taking the natural logarithm of both sides gives

$$L(x_1, \dots, x_n, p) = n \ln p + \left(\sum_{i=1}^{n} x_i\right) \ln(1-p).$$

Since $L(x_1, \dots, x_n, p)$ is a continuous function of $p$, it has a maximum value, found by setting the derivative equal to $0$. So,

$$\frac{\partial L(x_1, \dots, x_n, p)}{\partial p} = \frac{n}{p} - \frac{\sum_{i=1}^{n} x_i}{1-p} = 0$$

gives $p = \frac{n}{n + \sum_{i=1}^{n} x_i}$. Hence $\hat{p} = \frac{n}{n + \sum_{i=1}^{n} x_i} = \frac{1}{1 + \bar{X}_n}$.

4. Intensity $\lambda$ in the Poisson$(\lambda)$ model.

Solution. Let $X$ be a Poisson random variable with parameter $\lambda$, and let $X_1, \dots, X_n$ be independent random samples of $X$. Recall that the probability density function for the Poisson distribution with parameter $\lambda$ is $f(x) = \frac{e^{-\lambda} \lambda^x}{x!}$, where $x = 0, 1, 2, \dots$. The log-likelihood of the sample is

$$L(x_1, \dots, x_n, \lambda) = -n\lambda + \left(\sum_{i=1}^{n} x_i\right) \ln \lambda - \sum_{i=1}^{n} \ln(x_i!).$$

Setting the derivative with respect to $\lambda$ equal to $0$,

$$\frac{\partial L}{\partial \lambda} = -n + \frac{\sum_{i=1}^{n} x_i}{\lambda} = 0$$

gives $\hat{\lambda} = \frac{1}{n}\sum_{i=1}^{n} x_i = \bar{X}_n$.

Exercise 2. Find the Fisher information for the parameters estimated in Exercise 1.

1. Probability of success $p$ in the Bernoulli$(p)$ model.

Solution. The Fisher information for the parameter $\hat{p}$ is given by $I(\hat{p}) = E\!\left[\left(\frac{\partial L(x,p)}{\partial p}\right)^{2}\right]$ or $I(\hat{p}) = -E\!\left[\frac{\partial^2 L(x,p)}{\partial p^2}\right]$, where $L(x,p)$ is the log-likelihood of a single observation. From Exercise 1 part 1,

$$\frac{\partial L(x,p)}{\partial p} = \frac{x}{p} - \frac{1-x}{1-p}, \qquad \frac{\partial^2 L(x,p)}{\partial p^2} = -\frac{x}{p^2} - \frac{1-x}{(1-p)^2}.$$

Recall that $E(X) = p$ where $X$ is the Bernoulli random variable. Then the Fisher information is

$$I(\hat{p}) = -E\!\left[-\frac{X}{p^2} - \frac{1-X}{(1-p)^2}\right] = \frac{E(X)}{p^2} + \frac{E(1-X)}{(1-p)^2} = \frac{p}{p^2} + \frac{1-p}{(1-p)^2} = \frac{1}{p} + \frac{1}{1-p} = \frac{1}{p(1-p)}.$$

2. Probability of success $p$ in the Binomial$(n, p)$ model.

Solution. The Fisher information for the parameter $\hat{p}$ is given by $I(\hat{p}) = E\!\left[\left(\frac{\partial L(x,p)}{\partial p}\right)^{2}\right]$ or $I(\hat{p}) = -E\!\left[\frac{\partial^2 L(x,p)}{\partial p^2}\right]$. From Exercise 1 part 2, we have

$$\frac{\partial L(x,p)}{\partial p} = \frac{\partial}{\partial p}\left[\ln \binom{n}{x} + x \ln p + (n - x)\ln(1-p)\right] = \frac{x}{p} - \frac{n-x}{1-p}.$$
Recall that $E(X) = np$ where $X$ is the Binomial random variable. Then the Fisher information is

$$I(\hat{p}) = -E\!\left[-\frac{X}{p^2} - \frac{n-X}{(1-p)^2}\right] = \frac{np}{p^2} + \frac{n - np}{(1-p)^2} = \frac{n}{p} + \frac{n}{1-p} = \frac{n}{p(1-p)}.$$

3. Probability of success $p$ in the Geometric$(p)$ model.

Solution. The Fisher information for the parameter $\hat{p}$ is given by $I(\hat{p}) = E\!\left[\left(\frac{\partial L(x,p)}{\partial p}\right)^{2}\right]$ or $I(\hat{p}) = -E\!\left[\frac{\partial^2 L(x,p)}{\partial p^2}\right]$. From Exercise 1 part 3, we have

$$\frac{\partial L(x,p)}{\partial p} = \frac{\partial}{\partial p}\left[\ln p + x \ln(1-p)\right] = \frac{1}{p} - \frac{x}{1-p}.$$

Recall that $E(X) = \frac{1-p}{p}$ where $X$ is the Geometric random variable with support $x = 0, 1, 2, \dots$. Then the Fisher information is

$$I(\hat{p}) = -E\!\left[-\frac{1}{p^2} - \frac{X}{(1-p)^2}\right] = \frac{1}{p^2} + \frac{E(X)}{(1-p)^2} = \frac{1}{p^2} + \frac{1}{p(1-p)} = \frac{(1-p) + p}{p^2(1-p)} = \frac{1}{p^2(1-p)}.$$

4. Intensity $\lambda$ in the Poisson$(\lambda)$ model.

Solution. The Fisher information for the parameter $\hat{\lambda}$ is given by $I(\hat{\lambda}) = E\!\left[\left(\frac{\partial L(x,\lambda)}{\partial \lambda}\right)^{2}\right]$ or $I(\hat{\lambda}) = -E\!\left[\frac{\partial^2 L(x,\lambda)}{\partial \lambda^2}\right]$. From Exercise 1 part 4, we have $\frac{\partial L(x,\lambda)}{\partial \lambda} = \frac{\partial}{\partial \lambda}\left[-\lambda + x \ln \lambda - \ln(x!)\right] = \frac{x}{\lambda} - 1$, and so $\frac{\partial^2 L(x,\lambda)}{\partial \lambda^2} = -\frac{x}{\lambda^2}$. Recall that $E(X) = \lambda$ where $X$ is a Poisson random variable with parameter $\lambda$. Then the Fisher information is

$$I(\hat{\lambda}) = -E\!\left[-\frac{X}{\lambda^2}\right] = \frac{E(X)}{\lambda^2} = \frac{1}{\lambda}.$$

Exercise 3. Determine whether the estimators found in Exercise 1 attain the Cramér–Rao lower bound.

1. Probability of success $p$ in the Bernoulli$(p)$ model.

Solution. As shown in Exercise 1 part 1, $\hat{p} = \frac{1}{n}\sum_{i=1}^{n} x_i$. The variance of this estimator is

$$\operatorname{var}(\hat{p}) = \frac{\sum_{i=1}^{n} \operatorname{var}(x_i)}{n^2} \quad \{\text{since the } X_i \text{ are independent}\} = \frac{n\,p(1-p)}{n^2} = \frac{p(1-p)}{n}.$$

Also, from Exercise 2 part 1, we have $\frac{1}{n \cdot I(\hat{p})} = \frac{p(1-p)}{n}$. Hence the Cramér–Rao inequality $\operatorname{var}(\hat{p}) \ge \frac{1}{n \cdot I(\hat{p})}$ holds with equality, and $\hat{p}$ is the most efficient estimator.

2. Probability of success $p$ in the Binomial$(n, p)$ model.

Solution. As shown in Exercise 1 part 2, $\hat{p} = \frac{\sum_{i=1}^{m} x_i}{mn}$. The variance of this estimator is

$$\operatorname{var}(\hat{p}) = \operatorname{var}\!\left(\frac{\sum_{i=1}^{m} x_i}{mn}\right) = \frac{\sum_{i=1}^{m} \operatorname{var}(x_i)}{m^2 n^2} \quad \{\text{since the } X_i \text{ are independent}\} = \frac{m \cdot np(1-p)}{m^2 n^2} = \frac{p(1-p)}{mn}.$$

Also, from Exercise 2 part 2, we have $\frac{1}{m \cdot I(\hat{p})} = \frac{p(1-p)}{mn}$. Hence the Cramér–Rao inequality $\operatorname{var}(\hat{p}) \ge \frac{1}{m \cdot I(\hat{p})}$ holds with equality, and $\hat{p}$ is the most efficient estimator.

3. Intensity $\lambda$ in the Poisson$(\lambda)$ model.

Solution. As shown in Exercise 1 part 4, $\hat{\lambda} = \frac{1}{n}\sum_{i=1}^{n} x_i$. The variance of this estimator is

$$\operatorname{var}(\hat{\lambda}) = \frac{\sum_{i=1}^{n} \operatorname{var}(x_i)}{n^2} \quad \{\text{since the } X_i \text{ are independent}\} = \frac{n\lambda}{n^2} = \frac{\lambda}{n}.$$
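The variance computation $\operatorname{var}(\hat{\lambda}) = \lambda/n$ can be checked by simulation. A minimal sketch in plain Python (an assumed $\lambda = 2$ and $n = 40$, with a hand-rolled Knuth sampler since only the standard library is used) estimates the variance of $\hat{\lambda}$ by Monte Carlo:

```python
import math
import random

random.seed(3)
lam, n, reps = 2.0, 40, 20_000  # assumed illustrative values

def poisson_draw(lam):
    # Knuth's algorithm: count how many extra uniforms are needed
    # before the running product drops to exp(-lam) or below.
    limit, k, prod = math.exp(-lam), 0, random.random()
    while prod > limit:
        k += 1
        prod *= random.random()
    return k

# Each replicate: the MLE lambda_hat = sample mean of n Poisson draws.
lam_hats = [sum(poisson_draw(lam) for _ in range(n)) / n for _ in range(reps)]

mean = sum(lam_hats) / reps
var_mc = sum((v - mean) ** 2 for v in lam_hats) / reps
print(var_mc, lam / n)  # both ≈ 0.05 = lambda/n
```

The Monte Carlo variance lands on $\lambda/n$ to within simulation noise, consistent with $\hat{\lambda}$ attaining the Cramér–Rao bound $1/(n \cdot I(\hat{\lambda})) = \lambda/n$.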
Also, from Exercise 2 part 4, we have $\frac{1}{n \cdot I(\hat{\lambda})} = \frac{\lambda}{n}$. Hence the Cramér–Rao inequality $\operatorname{var}(\hat{\lambda}) \ge \frac{1}{n \cdot I(\hat{\lambda})}$ holds with equality, and $\hat{\lambda}$ is the most efficient estimator.

Exercise 4. Consider the 2 × 2 contingency table given below, which describes Belief in After Life ($Y$) by Gender ($X$). Compute the following:

1. Point estimates for $\pi_1$, $\pi_2$, $\Delta$.
2. The distribution of $\hat{\Delta}$.
3. A 95% confidence interval for $\hat{\Delta}$.
4. A hypothesis test of whether gender has an effect on belief in after life.

                    Belief in After Life
Gender       Y                 N                 Total
M            398 ($n_{11}$)    104 ($n_{12}$)    502
F            509 ($n_{21}$)    116 ($n_{22}$)    625
Total        907               220               n = 1127

1. Point estimates for $\pi_1$, $\pi_2$ and $\Delta$ are as follows:

a. $\hat{\pi}_1 = p_1 = \frac{n_{11}}{n_{11} + n_{12}} = \frac{398}{502} \approx 0.7928$

b. $\hat{\pi}_2 = p_2 = \frac{n_{21}}{n_{21} + n_{22}} = \frac{509}{625} \approx 0.8144$

c. $\hat{\Delta} = \hat{\pi}_1 - \hat{\pi}_2 \approx -0.0216$

2. Each subject in the given data can be described in terms of Bernoulli trials. Let $M_i = 1$ if male $i$ believes in after life and $0$ otherwise, and let $F_i = 1$ if female $i$ believes in after life and $0$ otherwise. Then $n_{11} = \sum_{i=1}^{n_1} M_i \sim \text{Binomial}(n_1, \pi_1)$ and $n_{21} = \sum_{i=1}^{n_2} F_i \sim \text{Binomial}(n_2, \pi_2)$. Since $n_1$ and $n_2$ are large, by the Central Limit Theorem each binomial random variable can be approximated by a normal random variable: $n_{11} \sim N(n_1\pi_1,\, n_1\pi_1(1-\pi_1))$ and $n_{21} \sim N(n_2\pi_2,\, n_2\pi_2(1-\pi_2))$. Thus

$$\hat{\Delta} = \hat{\pi}_1 - \hat{\pi}_2 = \frac{n_{11}}{n_1} - \frac{n_{21}}{n_2} \sim N\!\left(\pi_1 - \pi_2,\ \frac{\pi_1(1-\pi_1)}{n_1} + \frac{\pi_2(1-\pi_2)}{n_2}\right).$$

3. The 95% confidence interval / interval estimate for $\hat{\Delta}$ is given by $\hat{\Delta} \pm Z_{\alpha/2} \cdot \sigma_{\hat{\Delta}}$. The standard deviation of $\hat{\Delta}$ is

$$\sigma_{\hat{\Delta}} = \sqrt{\frac{\pi_1(1-\pi_1)}{n_1} + \frac{\pi_2(1-\pi_2)}{n_2}} \approx \sqrt{\frac{p_1(1-p_1)}{n_1} + \frac{p_2(1-p_2)}{n_2}} = \sqrt{\frac{0.7928(1 - 0.7928)}{502} + \frac{0.8144(1 - 0.8144)}{625}} \approx 0.0239.$$

Thus, the 95% confidence interval is

$$\hat{\Delta} \pm Z_{\alpha/2} \cdot \sigma_{\hat{\Delta}} = -0.0216 \pm 1.96 \times 0.0239 = (-0.0683,\ 0.0252).$$

We are 95% confident that the true value of $\Delta$ lies within this interval.

4. Our null hypothesis is that gender has no effect on belief in after life, and the alternative hypothesis is that gender has an effect on belief in after life:

$$H_0: \Delta = 0, \qquad H_a: \Delta \ne 0.$$
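The numbers in Exercise 4 can be reproduced in a few lines of plain Python. This sketch uses the unpooled (Wald) standard error from part 3 and the 1.96 normal critical value, and adds the Wald z-statistic as one way to carry out the test in part 4:

```python
import math

# Counts from the afterlife-belief contingency table
n11, n1 = 398, 502   # males who believe / total males
n21, n2 = 509, 625   # females who believe / total females

p1 = n11 / n1                     # pi_1 hat ≈ 0.7928
p2 = n21 / n2                     # pi_2 hat ≈ 0.8144
delta = p1 - p2                   # Delta hat ≈ -0.0216

# Unpooled (Wald) standard error of Delta hat
se = math.sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)

# 95% confidence interval: Delta hat ± 1.96 * se
lo, hi = delta - 1.96 * se, delta + 1.96 * se

# Wald z-statistic for H0: Delta = 0
z = delta / se

print(round(delta, 4), (round(lo, 4), round(hi, 4)), round(z, 3))
```

Since the interval $(-0.0683,\ 0.0252)$ contains $0$ (equivalently $|z| \approx 0.90 < 1.96$), the data do not give evidence at the 5% level that gender affects belief in after life.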
