Q6
(a) (X, Y) has bivariate normal distribution BN(μ₁, μ₂, σ₁², σ₂², ρ).
- (i) Show that X and Y are independent if and only if ρ = 0. (6 marks)
- (ii) If (X, Y) follows BN(3, 1, 16, 25, 3/5), obtain P(3 < Y < 8 | X = 7), given Φ(2) = 0.9772 and Φ(-0.25) = 0.4017, where Φ(x) represents the area under the standard normal curve from -∞ to x. (6 marks)
- (iii) If (X, Y) follows BN(0, 0, 1, 1, 0), what will be the distribution of Z = Y/X? (4 marks)
- (iv) State the multivariate extension of (i) when X̃ follows Nₚ(μ̃, Σ). (4 marks)

(b) Define principal components and canonical correlation. How can one attain data reduction using principal components? If (X₁, X₂) has covariance matrix Σ = [[1, ρ], [ρ, 1]], then find the principal components. (15 marks)

(c) For the simple linear regression model y = β₀ + β₁x + ε, where β₀ and β₁ are parameters and ε has zero mean and an unknown variance σ², find the estimates of β₀ and β₁ by the principle of least squares as well as the method of maximum likelihood. Examine whether they are identical. (15 marks)
Directive word: Prove
The lead directive in this question is to prove (the "Show that" in part (a)(i)). The directive word signals the depth of analysis expected, the structure of your answer, and the weight of evidence you must bring.
How this answer will be evaluated
Approach
Prove the independence condition in (a)(i) using factorization of joint density; for (a)(ii)-(iv), calculate conditional distributions and identify the Cauchy distribution; for (b), define concepts then derive eigenvalues/eigenvectors for PC extraction; for (c), derive both estimators and compare. Allocate ~40% time to part (a) [20 marks], ~30% each to (b) and (c) [15 marks each], with explicit theorem statements and step-by-step derivations throughout.
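The factorization argument for (a)(i) can be sketched as follows. This is a minimal outline based on the standard bivariate normal density; the marginal labels f_X and f_Y are notation introduced here for illustration and do not appear in the question.

```latex
\begin{align*}
f(x,y) &= \frac{1}{2\pi\sigma_1\sigma_2\sqrt{1-\rho^2}}
  \exp\!\left\{-\frac{1}{2(1-\rho^2)}\left[
    \left(\frac{x-\mu_1}{\sigma_1}\right)^2
    - 2\rho\,\frac{(x-\mu_1)(y-\mu_2)}{\sigma_1\sigma_2}
    + \left(\frac{y-\mu_2}{\sigma_2}\right)^2\right]\right\} \\[6pt]
\text{If } \rho = 0:\quad
f(x,y) &= \underbrace{\frac{1}{\sigma_1\sqrt{2\pi}}
  \exp\!\left\{-\frac{(x-\mu_1)^2}{2\sigma_1^2}\right\}}_{f_X(x)}
  \cdot
  \underbrace{\frac{1}{\sigma_2\sqrt{2\pi}}
  \exp\!\left\{-\frac{(y-\mu_2)^2}{2\sigma_2^2}\right\}}_{f_Y(y)} \\[6pt]
\text{If } X, Y \text{ are independent}:\quad
\operatorname{Cov}(X,Y) &= \rho\,\sigma_1\sigma_2 = 0
  \;\Longrightarrow\; \rho = 0 \quad (\sigma_1, \sigma_2 > 0)
\end{align*}
```

The first implication uses the fact that a joint density equal to the product of the marginals is exactly the definition of independence; the converse holds for any pair of variables with finite variances, since independence always forces zero covariance.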
Key points expected
- (a)(i) Prove ρ=0 ⇔ independence by showing joint density factorizes into marginal densities, using the bivariate normal PDF structure
- (a)(ii) Compute conditional distribution Y|X=x ~ N(μ₂ + ρ(σ₂/σ₁)(x-μ₁), σ₂²(1-ρ²)), evaluate it at x = 7, then standardize and use Φ values
- (a)(iii) Identify Z=Y/X as ratio of independent N(0,1) variables, hence standard Cauchy distribution
- (a)(iv) State multivariate extension: X̃ ~ Nₚ(μ̃, Σ) has independent components iff Σ is diagonal
- (b) Define PCs as uncorrelated linear combinations maximizing variance; define canonical correlation as correlation between linear combinations of two variable sets; data reduction by retaining top k PCs; derive eigenvalues (1±ρ) and eigenvectors for given Σ
- (c) Derive LSE by minimizing Σ(yᵢ-β₀-β₁xᵢ)²; derive MLE using normal error assumption; show identical estimators but different variance estimators
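- Compare LSE (needs no distributional assumption on ε) vs MLE (needs one; the two coincide for β₀ and β₁ under normal errors) and note σ²_MLE = SSE/n vs the unbiased estimator s² = SSE/(n-2) that usually accompanies least squares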
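The comparison in part (c) can be illustrated numerically. The sketch below is a minimal example, assuming normally distributed errors for the likelihood; the function names `least_squares_fit` and `normal_neg_log_likelihood` and the five data points are hypothetical and chosen only for illustration.

```python
import numpy as np


def least_squares_fit(x, y):
    """Closed-form least-squares estimates for y = b0 + b1*x + error."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    n = x.size
    x_bar, y_bar = x.mean(), y.mean()
    sxx = np.sum((x - x_bar) ** 2)            # corrected sum of squares of x
    sxy = np.sum((x - x_bar) * (y - y_bar))   # corrected sum of cross products
    b1 = sxy / sxx
    b0 = y_bar - b1 * x_bar
    sse = np.sum((y - b0 - b1 * x) ** 2)      # residual sum of squares
    return {
        "b0": b0,
        "b1": b1,
        "sigma2_mle": sse / n,             # ML estimate under normal errors
        "sigma2_unbiased": sse / (n - 2),  # usual unbiased estimate
    }


def normal_neg_log_likelihood(b0, b1, sigma2, x, y):
    """Negative log-likelihood, ASSUMING i.i.d. N(0, sigma2) errors."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    resid = y - b0 - b1 * x
    return 0.5 * x.size * np.log(2 * np.pi * sigma2) + np.sum(resid ** 2) / (2 * sigma2)


# Hypothetical data, for illustration only.
x_demo = [1, 2, 3, 4, 5]
y_demo = [2.1, 3.9, 6.2, 7.8, 10.1]

fit = least_squares_fit(x_demo, y_demo)
nll_at_fit = normal_neg_log_likelihood(
    fit["b0"], fit["b1"], fit["sigma2_mle"], x_demo, y_demo
)
# Moving (b0, b1) away from the least-squares values raises the negative
# log-likelihood, because for fixed sigma2 it depends on them only via SSE.
nll_nearby = normal_neg_log_likelihood(
    fit["b0"] + 0.1, fit["b1"] - 0.05, fit["sigma2_mle"], x_demo, y_demo
)
print(fit)
print(nll_at_fit < nll_nearby)
```

For a fixed σ², the normal log-likelihood depends on β₀ and β₁ only through the residual sum of squares, so maximizing it is the same problem as minimizing the least-squares criterion. The two variance estimates differ only in their divisor.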
Evaluation rubric
| Dimension | Weight | Max marks | Excellent | Average | Poor |
|---|---|---|---|---|---|
| Setup correctness | 20% | 10 | Correctly writes full bivariate normal PDF for (a)(i); properly identifies conditional normal parameters in (a)(ii); recognizes independence of X,Y in (a)(iii); states multivariate normal definition with precision in (a)(iv); sets up eigenvalue problem correctly in (b); specifies all regression assumptions in (c) | Writes most formulas correctly but misses constants or conditions; partial setup for eigenvalue problem; vague on regression assumptions | Incorrect PDF or missing key components; wrong conditional distribution formula; fails to identify distribution of ratio; omits multivariate extension; incorrect eigenvalue setup |
| Method choice | 20% | 10 | Uses factorization criterion for independence proof; applies standard conditional normal theory with proper standardization; recognizes Cauchy via ratio of normals; uses spectral decomposition for PCs; applies calculus minimization for LSE and likelihood maximization for MLE with clear distinction | Correct general methods but inefficient or partially justified; some mixing of approaches; incomplete optimization steps | Wrong method for independence (e.g., correlation only); incorrect standardization; fails to identify distribution type; wrong PC extraction method; confused LSE/MLE derivation |
| Computation accuracy | 20% | 10 | Precise calculation: conditional mean = 1 + (3/5)(5/4)(4) = 4, conditional SD = √(25(1 − 9/25)) = 4, z-scores (3−4)/4 = −0.25 and (8−4)/4 = 1, probability = Φ(1) − Φ(−0.25) = 0.8413 − 0.4017 = 0.4396 (the upper z-score is 1, so the supplied Φ(2) = 0.9772 is not the value needed; Φ(1) = 0.8413 comes from standard tables); eigenvalues 1±ρ with orthogonal eigenvectors [1,1]/√2 and [1,-1]/√2; correct normal equations and MLE solutions β̂₁=Sxy/Sxx, β̂₀=ȳ-β̂₁x̄ | Minor arithmetic errors in probability or eigenvectors; correct formulas but calculation mistakes; partial derivation of estimators | Major computational errors in conditional parameters; wrong probability value; incorrect eigenvalues/eigenvectors; wrong estimator formulas |
| Interpretation | 20% | 10 | Explains that zero correlation implies independence under joint normality but not in general; interprets conditional probability in context; explains why the Cauchy distribution has no finite mean or variance; clarifies that PCs are uncorrelated and capture maximum variance sequentially; explains geometric meaning of LSE (projection) vs MLE (likelihood maximization) and conditions for equivalence | Some interpretation present but shallow or partially correct; misses key insights about normality requirement or PC variance maximization | No interpretation of results; fails to explain why results matter; purely mechanical computation without insight |
| Final answer & units | 20% | 10 | Clear boxed answers: (a)(ii) P = Φ(1) − Φ(−0.25) ≈ 0.4396, about 44%; (a)(iii) Z ~ Cauchy(0,1); (a)(iv) Σ diagonal ⇔ independence; (b) PC1 = (X₁+X₂)/√2, PC2 = (X₁-X₂)/√2 with variances 1+ρ and 1-ρ (for ρ > 0; the order swaps when ρ < 0); (c) explicit estimator formulas with note on identical β̂ but different σ² estimators; proper notation throughout | Answers present but poorly formatted or missing some parts; inconsistent notation; unclear final statements | Missing final answers; wrong conclusions; no clear presentation of results; confused notation |
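The numerical parts of (a)(ii) and (b) can be checked with a short script. This is a minimal sketch, not a model answer: the helper names `phi`, `conditional_normal_params` and `principal_components_2x2` are hypothetical, Φ is computed from the error function rather than read from the tabulated values in the question, and ρ = 0.5 in the eigenvalue check is an arbitrary illustrative value.

```python
import math

import numpy as np


def phi(z):
    """Standard normal CDF, computed from the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def conditional_normal_params(mu1, mu2, var1, var2, rho, x):
    """Mean and SD of Y | X = x for (X, Y) ~ BN(mu1, mu2, var1, var2, rho)."""
    s1, s2 = math.sqrt(var1), math.sqrt(var2)
    mean = mu2 + rho * (s2 / s1) * (x - mu1)
    sd = s2 * math.sqrt(1.0 - rho ** 2)
    return mean, sd


def principal_components_2x2(rho):
    """Eigenvalues (largest first) and eigenvectors of [[1, rho], [rho, 1]]."""
    sigma = np.array([[1.0, rho], [rho, 1.0]])
    eigvals, eigvecs = np.linalg.eigh(sigma)  # eigh returns ascending order
    order = np.argsort(eigvals)[::-1]
    return eigvals[order], eigvecs[:, order]


# Part (a)(ii): (X, Y) ~ BN(3, 1, 16, 25, 3/5), conditioning on X = 7.
cond_mean, cond_sd = conditional_normal_params(3, 1, 16, 25, 3 / 5, 7)
z_low = (3 - cond_mean) / cond_sd
z_high = (8 - cond_mean) / cond_sd
prob = phi(z_high) - phi(z_low)
print(cond_mean, cond_sd, z_low, z_high, round(prob, 4))

# Part (b): illustrative check with rho = 0.5 (arbitrary choice).
vals, vecs = principal_components_2x2(0.5)
print(vals)  # eigenvalues 1 + rho and 1 - rho, largest first
print(vecs)  # columns proportional to [1, 1] and [1, -1], up to sign
```

Eigenvectors are determined only up to sign, so the columns returned by `np.linalg.eigh` may differ from [1,1]/√2 and [1,-1]/√2 by a factor of −1; the principal components they define are equivalent.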
More from Statistics 2024 Paper I
- Q1 (a) Two events A and B are such that P(A) = 1/3, P(B) = 1/4 and P(A|B) + P(B|A) = 2/3. Evaluate the following: (i) P(A^c ∪ B^c) (5 marks) (…
- Q2 (a) Let the joint probability density function of two random variables X and Y be f(x, y) = x/3, for 0 < 2x < 3y < 6; 0, otherwise. Compute…
- Q3 (a) Let moment generating function of random variable X exist in the neighbourhood of zero and if $$E(X^n) = \frac{1}{5} + (-1)^n \frac{2}{…
- Q4 (a) Find the most powerful test of size α(= 0·05) for testing H₀: μ = 0 vs. H₁: μ = 1, given a random sample of size 25 from N(μ, 16) popul…
- Q5 (a) How will you justify the usage of the principle of least squares in estimating the parameters of a linear regression model? With usual…
- Q6 (a) (X, Y) has bivariate normal distribution BN(μ₁, μ₂, σ₁², σ₂², ρ). (i) Show that X and Y are independent if and only if ρ = 0. (6 marks)…
- Q7 (a) A very big population is divided into two strata. The allocation of units of stratified random sample of size n for the two strata unde…
- Q8 (a) A 2²-factorial design was used to develop the yield of a crop. Two factors A and B were used at two levels: low (–1) and high (+1). The…