Statistics 2025 Paper I 50 marks Derive

Q6

(a)(i) If (X, Y) follows bivariate normal BN(μ₁, μ₂, σ₁², σ₂², ρ), then obtain
(A) E(e^X)
(B) E(e^(X+Y))
(C) Var(e^X) and
(D) Correlation between e^X and e^Y.
3+3+3+3=12 marks

(a)(ii) If (X, Y) have the joint probability density function
g(x,y) = y e^(-y(x+1)), for x ≥ 0, y ≥ 0; 0 elsewhere,
then find the regression curve of X on Y and comment on the nature of the curve.
8 marks

(b) Let X = (X₁, X₂, X₃)' ~ N₃(μ, Σ), in which μ = (2 1 3)' and
Σ =
[  9   2  -2 ]
[  2   2  -3 ]
[ -2  -3   9 ]
Obtain
(i) E{X₁ | X₂ = x₂, X₃ = x₃} and
(ii) Var{X₁ | X₂ = x₂, X₃ = x₃}.
15 marks

(c) Consider the model: Y = X θ + ε, where ε is an n×1 vector of unobservable random variables such that E(ε) = 0 and D(ε) = σ²Ω, σ > 0 unknown, Ω is a positive definite matrix of known constants and rank(X) = k < n. Then
(i) Derive least square estimator of θ and
(ii) Derive an unbiased estimator of σ².
9+6=15 marks

Directive word: Derive

This question asks you to derive: reach each result step by step from the stated assumptions instead of quoting it. The directive word signals the depth of analysis expected, the structure of your answer, and the weight of evidence you must bring.

How this answer will be evaluated

Approach

Derive every required quantity systematically across the three parts. Spend about 35% of your time on (a), covering the MGF technique for the lognormal moments and the regression-curve derivation; about 30% on (b), the conditional multivariate normal distribution via the Schur complement; and about 35% on (c), the GLS estimator via the Aitken transformation and the unbiased estimator of σ². Begin each part by stating the distributional assumptions, show every derivation step, and conclude with explicit final expressions.
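The part (a) steps can be sketched as the worked derivation below. It assumes the standard joint MGF of the bivariate normal; the symbols M(t₁, t₂) and g_Y are notation introduced here for illustration and do not appear in the question.

```latex
% Notation assumed here: M(t_1,t_2) is the joint MGF of (X,Y);
% g_Y is the marginal density of Y.
% (a)(i): every moment is the MGF evaluated at a suitable point.
\begin{align*}
M(t_1,t_2) &= E\!\left(e^{t_1X+t_2Y}\right)
  = \exp\!\Big(t_1\mu_1+t_2\mu_2+\tfrac12\big(t_1^2\sigma_1^2
    +2\rho t_1t_2\sigma_1\sigma_2+t_2^2\sigma_2^2\big)\Big),\\
E(e^{X}) &= M(1,0)=\exp\!\big(\mu_1+\tfrac12\sigma_1^2\big),\\
E(e^{X+Y}) &= M(1,1)=\exp\!\big(\mu_1+\mu_2
    +\tfrac12(\sigma_1^2+\sigma_2^2+2\rho\sigma_1\sigma_2)\big),\\
\operatorname{Var}(e^{X}) &= M(2,0)-M(1,0)^2
  = e^{2\mu_1+\sigma_1^2}\big(e^{\sigma_1^2}-1\big),\\
\operatorname{Cov}(e^{X},e^{Y}) &= M(1,1)-M(1,0)\,M(0,1)
  = e^{\mu_1+\mu_2+\frac12(\sigma_1^2+\sigma_2^2)}
    \big(e^{\rho\sigma_1\sigma_2}-1\big),\\
\operatorname{Corr}(e^{X},e^{Y}) &=
  \frac{e^{\rho\sigma_1\sigma_2}-1}
       {\sqrt{\big(e^{\sigma_1^2}-1\big)\big(e^{\sigma_2^2}-1\big)}}.
\end{align*}
% (a)(ii): marginal of Y, conditional density of X given Y=y, then its mean.
\begin{align*}
g_Y(y) &= \int_0^\infty y\,e^{-y(x+1)}\,dx = e^{-y}, \qquad y>0,\\
g(x\mid y) &= \frac{g(x,y)}{g_Y(y)} = y\,e^{-yx}, \qquad x\ge 0,\\
E(X\mid Y=y) &= \int_0^\infty x\,y\,e^{-yx}\,dx = \frac{1}{y}, \qquad y>0.
\end{align*}
```

The correlation is free of μ₁ and μ₂, and the conditional density in (a)(ii) is exponential with rate y, which is why its mean is 1/y.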

Key points expected

  • Part (a)(i): Use the MGF of the bivariate normal to derive E(e^X)=exp(μ₁+σ₁²/2), E(e^(X+Y))=exp(μ₁+μ₂+(σ₁²+σ₂²+2ρσ₁σ₂)/2), Var(e^X)=exp(2μ₁+σ₁²)(exp(σ₁²)-1), and Corr(e^X,e^Y)=(exp(ρσ₁σ₂)-1)/√((exp(σ₁²)-1)(exp(σ₂²)-1)), using lognormal properties
  • Part (a)(ii): Obtain the marginal density of Y, e^(-y) for y>0, then the conditional density of X given Y=y, y e^(-yx) for x ≥ 0, and derive E(X|Y=y)=1/y, a hyperbolic regression curve showing negative association
  • Part (b): Partition Σ into Σ₁₁=9, Σ₁₂=(2 -2) and Σ₂₂, the 2×2 covariance block of (X₂, X₃); the conditional mean is μ₁+Σ₁₂Σ₂₂⁻¹(x₂-1, x₃-3)' and the conditional variance is Σ₁₁-Σ₁₂Σ₂₂⁻¹Σ₂₁
  • Part (c)(i): Derive GLS estimator θ̂=(X'Ω⁻¹X)⁻¹X'Ω⁻¹Y via Aitken transformation or direct minimization of generalized sum of squares
  • Part (c)(ii): Derive unbiased estimator σ̂²=(Y-Xθ̂)'Ω⁻¹(Y-Xθ̂)/(n-k) using trace properties and idempotent matrix arguments
  • Correct handling of positive definiteness conditions for Ω and invertibility requirements throughout
  • Proper verification that E(θ̂)=θ (unbiasedness) and E(σ̂²)=σ² in part (c)
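The matrix steps in parts (b) and (c) can be checked numerically with a short NumPy sketch. This is an illustration under stated assumptions, not a model answer: the function names `gls_estimates` and `aitken_ols` are hypothetical, and the design matrix, Ω and response used to exercise them are made-up test data, since the question gives none.

```python
import numpy as np

# Part (b): X1 is block 1, (X2, X3) is block 2 (values taken from the question).
mu = np.array([2.0, 1.0, 3.0])
Sigma = np.array([[9.0, 2.0, -2.0],
                  [2.0, 2.0, -3.0],
                  [-2.0, -3.0, 9.0]])
S11 = Sigma[0, 0]      # Var(X1)
S12 = Sigma[0, 1:]     # Cov(X1, (X2, X3))
S22 = Sigma[1:, 1:]    # covariance block of (X2, X3)

# Coefficients Sigma_12 Sigma_22^{-1}; S22 is symmetric, so solving
# S22 b = S12 gives the same vector without forming the inverse.
beta = np.linalg.solve(S22, S12)
intercept = mu[0] - beta @ mu[1:]   # E(X1 | x2, x3) = intercept + beta @ (x2, x3)
cond_var = S11 - beta @ S12         # Schur complement of S22 in Sigma


def gls_estimates(X, y, Omega):
    """Hypothetical helper: GLS estimate of theta and unbiased estimate of sigma^2."""
    n, k = X.shape
    Oinv_X = np.linalg.solve(Omega, X)                        # Omega^{-1} X
    theta_hat = np.linalg.solve(X.T @ Oinv_X, Oinv_X.T @ y)   # (X'O^{-1}X)^{-1} X'O^{-1} y
    resid = y - X @ theta_hat
    sigma2_hat = resid @ np.linalg.solve(Omega, resid) / (n - k)
    return theta_hat, sigma2_hat


def aitken_ols(X, y, Omega):
    """Hypothetical helper: OLS on the Aitken-transformed model L^{-1}y = L^{-1}X theta + e."""
    L = np.linalg.cholesky(Omega)     # Omega = L L'
    Xs = np.linalg.solve(L, X)
    ys = np.linalg.solve(L, y)
    theta, *_ = np.linalg.lstsq(Xs, ys, rcond=None)
    return theta


# Made-up data (an assumption, not from the question) to compare the two routes.
rng = np.random.default_rng(0)
n, k = 6, 2
X_demo = np.column_stack([np.ones(n), np.arange(n, dtype=float)])
A = rng.normal(size=(n, n))
Omega_demo = A @ A.T + n * np.eye(n)   # symmetric positive definite by construction
y_demo = rng.normal(size=n)
theta_gls, sigma2_gls = gls_estimates(X_demo, y_demo, Omega_demo)
theta_aitken = aitken_ols(X_demo, y_demo, Omega_demo)
```

For part (b) this gives coefficients 4/3 and 2/9 with zero intercept, so E{X₁ | x₂, x₃} = (4/3)x₂ + (2/9)x₃, and a conditional variance of 61/9. The two GLS routes agree on the demo data, which is the point of the Aitken transformation.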

Evaluation rubric

Setup correctness (weight 18%, max marks 9)
  • Excellent: Correctly identifies the bivariate normal MGF, the joint density support in (a)(ii), the Σ partition in (b) with the correct submatrices, and the GLS assumptions in (c), including the rank condition
  • Average: Basic distribution assumptions stated but with minor errors in support specification or Σ indexing; GLS setup mostly correct but missing explicit rank verification
  • Poor: Confuses conditional with marginal distributions, uses the wrong density support, or fundamentally misidentifies the transformation needed for GLS

Method choice (weight 22%, max marks 11)
  • Excellent: Selects the MGF approach for the moments in (a)(i), integration by parts or recognition of Gamma/Exponential forms for (a)(ii), the Schur complement formula for (b), and the Aitken transformation or direct minimization of the generalized sum of squares for (c); justifies why each method is appropriate
  • Average: Uses correct general methods but with suboptimal choices (e.g., direct integration instead of MGF properties); applies formulas without showing why they apply
  • Poor: Attempts inappropriate methods such as naive OLS in place of GLS, ignores the conditional structure of the multivariate normal, or uses moment generating functions without recognizing the lognormal connection

Computation accuracy (weight 24%, max marks 12)
  • Excellent: Flawless execution: correct MGF exponents, accurate inversion of the 2×2 block Σ₂₂ in (b), correct (X'Ω⁻¹X)⁻¹ derivation, and exact unbiasedness verification with the proper degrees of freedom (n-k)
  • Average: Minor arithmetic slips in exponent algebra or matrix elements; correct structure but computational errors in final numerical coefficients; off-by-one errors in degrees of freedom
  • Poor: Major computational errors: wrong determinant, incorrect matrix inversion, σ₁² confused with σ₁, or fundamental errors in quadratic form expectations

Interpretation (weight 18%, max marks 9)
  • Excellent: Interprets the regression curve E(X|Y=y)=1/y as a rectangular hyperbola showing an inverse relationship; notes that Corr(e^X, e^Y) depends on σ₁ and σ₂ as well as ρ but not on μ₁ or μ₂; discusses the efficiency gain of GLS over OLS via the Gauss-Markov theorem applied to the transformed model
  • Average: States that the curve is decreasing but misses the hyperbolic classification; notes GLS is 'better' without explaining the BLUE property in the transformed model
  • Poor: No interpretation of the regression curve shape; treats derived formulas as endpoints without connecting them to statistical meaning or practical implications

Final answer & units (weight 18%, max marks 9)
  • Excellent: All nine quantities explicitly boxed: four in (a)(i), the regression curve with its domain in (a)(ii), the conditional mean and variance expressions in (b), and the two matrix-form estimators in (c); dimensions clearly stated for matrix results
  • Average: Final answers present but some buried in text; missing explicit statement of the conditional variance formula or unclear on estimator dimensions
  • Poor: Missing final answers for sub-parts, incorrect dimensional consistency (e.g., scalar where a matrix is required), or answers without proper mathematical closure
