Statistics 2023 Paper I · 50 marks · Solve

Q6

(a) Let **X** = (X₁ X₂ X₃)' be distributed as N₃(μ, Σ), where μ = (2 −1 3)' and Σ = $\begin{pmatrix} 4 & 1 & 0 \\ 1 & 2 & 1 \\ 0 & 1 & 3 \end{pmatrix}$. Find
(i) the conditional distribution of (X₁ X₂)' given X₃ = 2;
(ii) the partial correlation coefficient ρ₁₂.₃ and the multiple correlation coefficient R₁.₂₃. (8+7 marks)

(b) (i) Describe the complete analysis of two-way classified data with multiple (but equal) observations per cell, clearly stating the assumptions used. Also state two examples where such type of analysis is used.
(ii) Let three mutually independent variables Y₁, Y₂ and Y₃ having common variance σ² and E(Y₁) = β₁ + β₂, E(Y₂) = β₁ + β₃, E(Y₃) = β₁ + β₂ be given. Show that the linear parametric function p₁β₁ + p₂β₂ + p₃β₃ is estimable if and only if p₁ = p₂ + p₃, clearly stating the assumptions used, if any. (5 marks)

(c) (i) State briefly three reasons why an analyst may wish to perform a principal component analysis. (6 marks)
(ii) Define canonical correlations and give two examples of their application. Describe the procedure of working out canonical correlations and canonical variates. (9 marks)

Directive word: Solve

This question asks you to solve. The directive word signals the depth of analysis expected, the structure of your answer, and the weight of evidence you must bring.

How this answer will be evaluated

Approach

Allocate roughly 35% of your time to part (a), which carries 15 marks and is computation-heavy, 25% to part (b) on two-way ANOVA and estimability, and 40% to part (c) on PCA and canonical correlations. For each sub-part, state the problem clearly before solving it. Show every matrix step in (a), give a structured ANOVA decomposition in (b)(i) and a rigorous linear-algebra proof in (b)(ii), and support the concepts in (c) with real-world Indian examples. End each part with a precise final answer and its interpretation.
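Before writing out the proof for (b)(ii), the claimed condition can be sanity-checked numerically. The sketch below assumes NumPy and uses the standard fact that p'β is estimable exactly when p' lies in the row space of the design matrix X, so appending p' as an extra row must not raise the rank. The helper name `is_estimable` is hypothetical, introduced only for this illustration.

```python
import numpy as np

# Design matrix from E(Y1) = b1 + b2, E(Y2) = b1 + b3, E(Y3) = b1 + b2.
X = np.array([[1.0, 1.0, 0.0],
              [1.0, 0.0, 1.0],
              [1.0, 1.0, 0.0]])


def is_estimable(p, X):
    """Return True if p'beta is estimable, i.e. p lies in the row space of X.

    Hypothetical helper: compares rank(X) with the rank of X stacked on p'.
    """
    p = np.asarray(p, dtype=float)
    rank_X = np.linalg.matrix_rank(X)
    rank_aug = np.linalg.matrix_rank(np.vstack([X, p]))
    return bool(rank_aug == rank_X)


rank_X = np.linalg.matrix_rank(X)       # rows 1 and 3 coincide, so the rank is 2
print(rank_X)
print(is_estimable([2, 1, 1], X))       # p1 = p2 + p3 holds
print(is_estimable([1, 0, 0], X))       # p1 = p2 + p3 fails
```

A rank check of this kind only confirms individual cases; the written answer still needs the general argument that the row space of X is spanned by (1 1 0) and (1 0 1).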

Key points expected

  • Part (a)(i): Correctly partition Σ into Σ₁₁, Σ₁₂, Σ₂₁, Σ₂₂ and apply conditional distribution formula N₂(μ₁ + Σ₁₂Σ₂₂⁻¹(x₃-μ₃), Σ₁₁ - Σ₁₂Σ₂₂⁻¹Σ₂₁) with x₃=2
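  • Part (a)(ii): Compute the partial correlation ρ₁₂.₃ = (σ₁₂ − σ₁₃σ₂₃/σ₃₃)/√[(σ₁₁ − σ₁₃²/σ₃₃)(σ₂₂ − σ₂₃²/σ₃₃)] and the multiple correlation R₁.₂₃ = √[σ₁'V⁻¹σ₁/σ₁₁], where σ₁' = (σ₁₂, σ₁₃) and V is the 2×2 covariance matrix of (X₂ X₃)' (distinct from the scalar Σ₂₂ = σ₃₃ used in the partition for part (a)(i))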
  • Part (b)(i): Describe two-way ANOVA with replication: model yᵢⱼₖ = μ + αᵢ + βⱼ + (αβ)ᵢⱼ + εᵢⱼₖ, assumptions (normality, homoscedasticity, independence), ANOVA table with SS_T, SS_A, SS_B, SS_AB, SS_E, and examples like agricultural field trials (ICRISAT crop studies) or industrial quality control
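  • Part (b)(ii): Write E(Y) = Xβ with design matrix X, note that X has rank 2 because rows 1 and 3 coincide, and apply the estimability condition: p'β with p = (p₁, p₂, p₃)' is estimable if and only if p' = l'X for some vector l, i.e. p' lies in the row space of X. Since that row space is spanned by (1 1 0) and (1 0 1), this gives p₁ = p₂ + p₃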
  • Part (c)(i): Three reasons for PCA: dimensionality reduction (e.g., reducing NSSO household survey variables), multicollinearity remediation in regression, and data visualization/pattern detection in large datasets
  • Part (c)(ii): Define canonical correlations as correlations between linear combinations u=a'X and v=b'Y maximizing correlation; examples: relationship between economic indicators and social development indices, or agricultural inputs vs outputs; describe eigenvalue solution of Σ₁₁⁻¹Σ₁₂Σ₂₂⁻¹Σ₂₁ and extraction of canonical variates
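The formulas for part (a) listed above can be evaluated directly. The following is a minimal sketch, assuming NumPy, that partitions Σ with (X₁, X₂) as the first block and X₃ as the second, then computes the conditional mean and covariance, the partial correlation and the multiple correlation. Variable names such as `cond_mean` and `R_1_23` are illustrative choices, not notation from the question.

```python
import numpy as np

# Parameters taken from the question.
mu = np.array([2.0, -1.0, 3.0])
Sigma = np.array([[4.0, 1.0, 0.0],
                  [1.0, 2.0, 1.0],
                  [0.0, 1.0, 3.0]])
x3 = 2.0

# (a)(i): partition with block 1 = (X1, X2) and block 2 = X3.
S11 = Sigma[:2, :2]
S12 = Sigma[:2, 2:]
S22 = Sigma[2:, 2:]
S22_inv = np.linalg.inv(S22)

cond_mean = mu[:2] + S12 @ S22_inv @ (np.array([x3]) - mu[2:])
cond_cov = S11 - S12 @ S22_inv @ S12.T

# (a)(ii): partial correlation of X1 and X2 given X3, read off the
# conditional covariance matrix.
rho_12_3 = cond_cov[0, 1] / np.sqrt(cond_cov[0, 0] * cond_cov[1, 1])

# Multiple correlation of X1 on (X2, X3).
s1 = Sigma[0, 1:]            # (sigma_12, sigma_13)
V = Sigma[1:, 1:]            # covariance matrix of (X2, X3)
R_1_23 = np.sqrt(s1 @ np.linalg.solve(V, s1) / Sigma[0, 0])

print(cond_mean)             # approx [ 2.     -1.3333]
print(cond_cov)              # approx [[4. 1.] [1. 1.6667]]
print(rho_12_3, R_1_23)      # both approx 0.3873
```

Here the two correlations coincide because σ₁₃ = 0, so X₃ on its own carries no linear information about X₁; both equal √(3/20).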

Evaluation rubric

| Dimension | Weight | Max marks | Excellent | Average | Poor |
|---|---|---|---|---|---|
| Setup correctness | 20% | 10 | Correctly partitions covariance matrix for (a), identifies all parameters for conditional distribution; properly states ANOVA model with all terms for (b)(i); correctly sets up design matrix and identifies rank deficiency for (b)(ii); accurately defines PCA objectives and canonical correlation framework for (c) | Minor errors in matrix partitioning or model specification; incomplete ANOVA model or partially correct design matrix; vague definitions of PCA reasons or canonical correlation | Major errors in setup: wrong matrix dimensions, incorrect ANOVA model specification, failure to identify estimability conditions, or fundamentally misunderstood PCA/canonical correlation concepts |
| Method choice | 20% | 10 | Selects optimal formulas: Schur complement for conditional covariance, precise partial/multiple correlation formulas; standard two-way ANOVA with interaction; Gauss-Markov approach for estimability; eigen decomposition for canonical correlations | Correct but inefficient methods, or minor deviations from optimal approach; acceptable alternative formulas with more computation | Incorrect method selection: using simple correlation instead of partial, ignoring interaction in ANOVA, attempting direct inversion without partitioning, or wrong optimization approach for canonical analysis |
| Computation accuracy | 20% | 10 | Flawless arithmetic: correct matrix inversions (Σ₂₂⁻¹ = 1/3), accurate conditional mean (2, −4/3)' and covariance [[4,1],[1,5/3]], precise ρ₁₂.₃ = √(3/20) and R₁.₂₃ = √(3/20); correct F-ratios for ANOVA; valid proof steps for estimability; correct eigenvalue extraction for canonical correlations | Minor computational slips: sign errors, arithmetic mistakes in fractions, or incorrect degrees of freedom; partially correct proof with gaps | Major computational errors: wrong matrix inversion, incorrect conditional distribution, miscalculated correlations, wrong ANOVA decomposition, or invalid proof logic |
| Interpretation | 20% | 10 | Interprets conditional distribution parameters meaningfully; explains ρ₁₂.₃ as correlation net of X₃ effect and R₁.₂₃ as predictive power; clarifies interaction interpretation in ANOVA; explains why p₁ = p₂ + p₃ ensures estimability; relates PCA to variance maximization and canonical correlations to inter-group associations with Indian examples | Basic interpretation without depth; mechanical statement of results without conceptual linkage; minimal context for applications | No interpretation provided, or completely misinterprets results (e.g., confusing conditional with marginal distribution, misreading correlation directions) |
| Final answer & units | 20% | 10 | All final answers boxed/highlighted: conditional distribution N₂((2, −4/3)', [[4,1],[1,5/3]]), ρ₁₂.₃ = √(3/20) ≈ 0.387, R₁.₂₃ = √(3/20) ≈ 0.387; complete ANOVA table structure; clear estimability condition; three concise PCA reasons; canonical correlation definition with procedure steps and examples | Most answers present but poorly organized; missing some final values; incomplete ANOVA table; partial listing of PCA reasons | Missing critical final answers, disorganized presentation, or answers without proper labeling (e.g., no distinction between partial and multiple correlation) |
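The eigenvalue route to canonical correlations mentioned in the rubric can be sketched in a few lines. The example below assumes NumPy and, purely for illustration, reuses Σ from part (a) with (X₁, X₂) as the first set and X₃ as the second; that split is an assumption made here, not something the question asks for. The function name `canonical_correlations` is hypothetical. The squared canonical correlations are the eigenvalues of Σ₁₁⁻¹Σ₁₂Σ₂₂⁻¹Σ₂₁, and each eigenvector a gives a canonical variate u = a'X, scaled here to unit variance.

```python
import numpy as np


def canonical_correlations(S11, S12, S22):
    """Population canonical correlations and first-set coefficient vectors.

    Hypothetical helper: solves the eigenproblem of S11^{-1} S12 S22^{-1} S21.
    Returns (rho, A) with rho in descending order and the columns of A
    scaled so that Var(a'X) = 1.
    """
    M = np.linalg.solve(S11, S12) @ np.linalg.solve(S22, S12.T)
    eigvals, eigvecs = np.linalg.eig(M)
    eigvals = np.clip(eigvals.real, 0.0, None)   # guard against round-off
    order = np.argsort(eigvals)[::-1]
    k = min(S11.shape[0], S22.shape[0])          # number of canonical pairs
    rho = np.sqrt(eigvals[order][:k])
    A = eigvecs.real[:, order][:, :k]
    variances = np.einsum('ij,jk,ki->i', A.T, S11, A)
    return rho, A / np.sqrt(variances)


Sigma = np.array([[4.0, 1.0, 0.0],
                  [1.0, 2.0, 1.0],
                  [0.0, 1.0, 3.0]])
rho, A = canonical_correlations(Sigma[:2, :2], Sigma[:2, 2:], Sigma[2:, 2:])
print(rho)   # single canonical correlation, sqrt(4/21), approx 0.4364
```

With a one-variable second set there is only one canonical pair, and its correlation equals the multiple correlation of X₃ on (X₁, X₂). The matching coefficient vector for the second set is proportional to Σ₂₂⁻¹Σ₂₁a.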
