Statistics 2024 · Paper I · 50 marks · Directive: Prove

Q6

(a) (X, Y) has bivariate normal distribution BN(μ₁, μ₂, σ₁², σ₂², ρ).
    (i) Show that X and Y are independent if and only if ρ = 0. (6 marks)
    (ii) If (X, Y) follows BN(3, 1, 16, 25, 3/5), obtain P(3 < Y < 8 | X = 7), given Φ(2) = 0.9772 and Φ(-0.25) = 0.4017, and Φ(x) represents the area under the standard normal curve from -∞ to x. (6 marks)
    (iii) If (X, Y) follows BN(0, 0, 1, 1, 0), what will be the distribution of Z = Y/X? (4 marks)
    (iv) State the multivariate extension of (i) when X̃ follows Nₚ(μ̃, Σ). (4 marks)

(b) Define principal components and canonical correlation. How can one attain data reduction using principal components? If (X₁, X₂) has covariance matrix Σ = [[1, ρ], [ρ, 1]], then find the principal components. (15 marks)

(c) For the simple linear regression model y = β₀ + β₁x + ε, where β₀ and β₁ are parameters and ε has zero mean and an unknown variance σ², find the estimates of β₀ and β₁ by the principle of least squares as well as the method of maximum likelihood. Examine whether they are identical. (15 marks)

Directive word: Prove

This question asks you to prove. The directive word signals the depth of analysis expected, the structure of your answer, and the weight of evidence you must bring.

How this answer will be evaluated

Approach

For (a)(i), prove the independence condition by factorizing the joint density. For (a)(ii), derive the conditional distribution of Y given X = 7 and standardize; for (a)(iii), identify the ratio as a standard Cauchy variable; for (a)(iv), state the corresponding condition on Σ. For (b), define both concepts, then derive the eigenvalues and eigenvectors of Σ to extract the principal components. For (c), derive both estimators and compare them. Allocate about 40% of your time to part (a) (20 marks) and about 30% each to (b) and (c) (15 marks each), stating theorems explicitly and showing every derivation step.
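
The factorization argument for (a)(i) can be sketched as follows. This is an outline rather than a full model answer; the shorthand u = (x − μ₁)/σ₁ and v = (y − μ₂)/σ₂ is introduced here for brevity and does not appear in the question.

```latex
% Joint density of BN(\mu_1, \mu_2, \sigma_1^2, \sigma_2^2, \rho), |\rho| < 1,
% with the shorthand u = (x-\mu_1)/\sigma_1 and v = (y-\mu_2)/\sigma_2:
f_{X,Y}(x,y) = \frac{1}{2\pi\sigma_1\sigma_2\sqrt{1-\rho^2}}
  \exp\!\left\{ -\frac{u^2 - 2\rho u v + v^2}{2(1-\rho^2)} \right\}

% Sufficiency (\rho = 0 implies independence): the cross term vanishes and
f_{X,Y}(x,y) = \frac{1}{\sigma_1\sqrt{2\pi}}\, e^{-u^2/2}
  \cdot \frac{1}{\sigma_2\sqrt{2\pi}}\, e^{-v^2/2} = f_X(x)\, f_Y(y)

% Necessity (independence implies \rho = 0):
X, Y \text{ independent} \;\Rightarrow\; \operatorname{Cov}(X,Y) = 0
  \;\Rightarrow\; \rho = \frac{\operatorname{Cov}(X,Y)}{\sigma_1 \sigma_2} = 0
```

The necessity direction holds for any pair of variables with finite variances; only the sufficiency direction uses the bivariate normal form.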

Key points expected

  • (a)(i) Prove ρ=0 ⇔ independence by showing joint density factorizes into marginal densities, using the bivariate normal PDF structure
  • (a)(ii) Compute conditional distribution Y|X=7 ~ N(μ₂ + ρ(σ₂/σ₁)(x-μ₁), σ₂²(1-ρ²)), then standardize and use Φ values
  • (a)(iii) Identify Z=Y/X as ratio of independent N(0,1) variables, hence standard Cauchy distribution
  • (a)(iv) State multivariate extension: X̃ ~ Nₚ(μ̃, Σ) has independent components iff Σ is diagonal
  • (b) Define PCs as uncorrelated linear combinations maximizing variance; define canonical correlation as correlation between linear combinations of two variable sets; data reduction by retaining top k PCs; derive eigenvalues (1±ρ) and eigenvectors for given Σ
  • (c) Derive LSE by minimizing Σ(yᵢ-β₀-β₁xᵢ)²; derive MLE using normal error assumption; show identical estimators but different variance estimators
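  • Compare LSE (distribution-free) vs MLE (requires normality of ε), and note that the variance estimators differ: the MLE is σ̂² = SSE/n, while the usual unbiased estimator accompanying least squares is s² = SSE/(n-2)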
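
The conditional-normal computation in (a)(ii) can be checked numerically. The sketch below uses only Python's standard library; the function name `conditional_prob` and its parameter names are illustrative choices, not part of the question. Since the standardized limits are -0.25 and 1, the upper tail value it needs is Φ(1) ≈ 0.8413.

```python
from math import sqrt
from statistics import NormalDist


def conditional_prob(mu1, mu2, var1, var2, rho, x, lo, hi):
    """Return (mean, sd, P(lo < Y < hi | X = x)) for (X, Y) ~ BN(mu1, mu2, var1, var2, rho)."""
    s1, s2 = sqrt(var1), sqrt(var2)
    # Y | X = x is normal with these parameters
    cond_mean = mu2 + rho * (s2 / s1) * (x - mu1)
    cond_sd = s2 * sqrt(1 - rho ** 2)
    cond = NormalDist(cond_mean, cond_sd)
    return cond_mean, cond_sd, cond.cdf(hi) - cond.cdf(lo)


# Values from the question: BN(3, 1, 16, 25, 3/5), conditioning on X = 7
mean, sd, p = conditional_prob(3, 1, 16, 25, 3 / 5, x=7, lo=3, hi=8)
print(mean, sd, round(p, 4))
```

With the question's values, the conditional mean and standard deviation both come out as 4 and the probability is about 0.44.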

Evaluation rubric

Each dimension below carries a weight of 20% and a maximum of 10 marks.

Setup correctness (weight 20%, max 10 marks)
  • Excellent: Correctly writes the full bivariate normal PDF for (a)(i); properly identifies the conditional normal parameters in (a)(ii); recognizes the independence of X and Y in (a)(iii); states the multivariate normal definition with precision in (a)(iv); sets up the eigenvalue problem correctly in (b); specifies all regression assumptions in (c)
  • Average: Writes most formulas correctly but misses constants or conditions; partial setup for the eigenvalue problem; vague on regression assumptions
  • Poor: Incorrect PDF or missing key components; wrong conditional distribution formula; fails to identify the distribution of the ratio; omits the multivariate extension; incorrect eigenvalue setup

Method choice (weight 20%, max 10 marks)
  • Excellent: Uses the factorization criterion for the independence proof; applies standard conditional normal theory with proper standardization; recognizes the Cauchy distribution via the ratio of normals; uses spectral decomposition for PCs; applies calculus minimization for LSE and likelihood maximization for MLE with a clear distinction
  • Average: Correct general methods but inefficient or partially justified; some mixing of approaches; incomplete optimization steps
  • Poor: Wrong method for independence (e.g., correlation only); incorrect standardization; fails to identify the distribution type; wrong PC extraction method; confused LSE/MLE derivation

Computation accuracy (weight 20%, max 10 marks)
  • Excellent: Precise calculation: conditional mean = 1 + (3/5)(5/4)(4) = 4, conditional SD = 5√(1-9/25) = 4, Z-scores (3-4)/4 = -0.25 and (8-4)/4 = 1, probability = Φ(1) - Φ(-0.25) ≈ 0.8413 - 0.4017 = 0.4396 (the upper Z-score is 1, so Φ(1) is required rather than the quoted Φ(2) = 0.9772); eigenvalues 1±ρ with orthogonal eigenvectors [1,1]/√2 and [1,-1]/√2; correct normal equations and MLE solutions β̂₁ = Sxy/Sxx, β̂₀ = ȳ - β̂₁x̄
  • Average: Minor arithmetic errors in probability or eigenvectors; correct formulas but calculation mistakes; partial derivation of estimators
  • Poor: Major computational errors in conditional parameters; wrong probability value; incorrect eigenvalues/eigenvectors; wrong estimator formulas

Interpretation (weight 20%, max 10 marks)
  • Excellent: Explains that zero correlation implies independence under joint normality but not in general; interprets the conditional probability in context; explains why the Cauchy distribution has no finite mean or variance; clarifies that PCs are uncorrelated and capture maximum variance sequentially; explains the geometric meaning of LSE (projection) vs MLE (likelihood maximization) and the conditions for their equivalence
  • Average: Some interpretation present but shallow or partially correct; misses key insights about the normality requirement or PC variance maximization
  • Poor: No interpretation of results; fails to explain why results matter; purely mechanical computation without insight

Final answer & units (weight 20%, max 10 marks)
  • Excellent: Clear boxed answers: (a)(ii) P ≈ 0.44, i.e. about 44%; (a)(iii) Z ~ Cauchy(0,1); (a)(iv) Σ diagonal ⇔ independent components; (b) PC1 = (X₁+X₂)/√2, PC2 = (X₁-X₂)/√2 with variances 1+ρ and 1-ρ (for ρ > 0; the order swaps when ρ < 0); (c) explicit estimator formulas with a note on identical β̂ but different σ² estimators; proper notation throughout
  • Average: Answers present but poorly formatted or missing some parts; inconsistent notation; unclear final statements
  • Poor: Missing final answers; wrong conclusions; no clear presentation of results; confused notation
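
The comparison in (c) can be illustrated with a small numerical sketch. The data below are made up purely for illustration, and the names `fit_simple_regression` and `gaussian_loglik` are hypothetical helpers, not anything prescribed by the question. The sketch assumes normal errors for the likelihood, which is the extra assumption the MLE needs.

```python
from math import log, pi


def fit_simple_regression(x, y):
    """Closed-form least-squares fit of y = b0 + b1*x, plus both variance estimators."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b1 = sxy / sxx                 # slope from the normal equations
    b0 = y_bar - b1 * x_bar        # intercept from the normal equations
    sse = sum((yi - b0 - b1 * xi) ** 2 for xi, yi in zip(x, y))
    return {
        "b0": b0,
        "b1": b1,
        "sigma2_mle": sse / n,             # MLE under normal errors
        "sigma2_unbiased": sse / (n - 2),  # usual unbiased estimator
    }


def gaussian_loglik(b0, b1, sigma2, x, y):
    """Log-likelihood of the model assuming independent N(0, sigma2) errors."""
    n = len(x)
    sse = sum((yi - b0 - b1 * xi) ** 2 for xi, yi in zip(x, y))
    return -0.5 * n * log(2 * pi * sigma2) - sse / (2 * sigma2)


# Made-up illustrative data (an assumption, not from the question)
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]

fit = fit_simple_regression(x, y)
at_ls = gaussian_loglik(fit["b0"], fit["b1"], fit["sigma2_mle"], x, y)
# Moving the coefficients away from the least-squares values lowers the likelihood
shifted = gaussian_loglik(fit["b0"] + 0.1, fit["b1"] - 0.05, fit["sigma2_mle"], x, y)
print(fit, at_ls > shifted)
```

For fixed σ², the normal log-likelihood depends on β₀ and β₁ only through the residual sum of squares, so maximizing it is the same as minimizing SSE. That is why the coefficient estimates coincide while the variance estimators differ by the factor n/(n-2).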
