Question

(a) Let **X** = (X₁ X₂ X₃)' be distributed as N₃ (μ, Σ), where μ = (2 −1 3)' and Σ = $\begin{pmatrix} 4 & 1 & 0 \\ 1 & 2 & 1 \\ 0 & 1 & 3 \end{pmatrix}$. Find (i) the conditional distribution of (X₁ X₂)' given X₃ = 2. (ii) partial correlation coefficient ρ₁₂.₃ and multiple correlation coefficient R₁.₂₃ (8+7 marks)

(b) (i) Describe the complete analysis of two-way classified data with multiple (but equal) observations per cell, clearly stating the assumptions used. Also state two examples where such type of analysis is used. (ii) Let three mutually independent variables Y₁, Y₂ and Y₃ having common variance σ² and E(Y₁) = β₁ + β₂, E(Y₂) = β₁ + β₃, E(Y₃) = β₁ + β₂ be given. Show that the linear parametric function p₁β₁ + p₂β₂ + p₃β₃ is estimable if and only if p₁ = p₂ + p₃, clearly stating the assumptions used, if any. (5 marks)

(c) (i) State briefly three reasons why an analyst may wish to perform a principal component analysis. (6 marks) (ii) Define canonical correlations and give two examples of their application. Describe the procedure of working out canonical correlations and canonical variates. (9 marks)

UPSC Answer Check · Accepted Answer

Solve this multi-part numerical and theoretical question by allocating approximately 35% of your time to part (a), given its 15 marks and computational load, 25% to part (b) covering ANOVA and estimability, and 40% to part (c) on PCA and canonical correlations. Begin by identifying the problem in each sub-part, show all computational steps with matrix operations for (a), present a structured ANOVA decomposition for (b)(i) and a rigorous linear algebra proof for (b)(ii), and explain the concepts in (c) with real-world Indian examples. Conclude each part with precise final answers and interpretations.
- Part (a)(i): Correctly partition Σ into Σ₁₁, Σ₁₂, Σ₂₁, Σ₂₂ and apply conditional distribution formula N₂(μ₁ + Σ₁₂Σ₂₂⁻¹(x₃-μ₃), Σ₁₁ - Σ₁₂Σ₂₂⁻¹Σ₂₁) with x₃=2
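- Part (a)(ii): Compute partial correlation ρ₁₂.₃ = (σ₁₂ - σ₁₃σ₂₃/σ₃₃)/√[(σ₁₁-σ₁₃²/σ₃₃)(σ₂₂-σ₂₃²/σ₃₃)] and multiple correlation R₁.₂₃ = √[σ₁'Σ₂₂⁻¹σ₁/σ₁₁] where σ₁' = (σ₁₂, σ₁₃) and Σ₂₂ here denotes the 2×2 covariance matrix of (X₂ X₃)', not the scalar block σ₃₃ used in (a)(i)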
- Part (b)(i): Describe two-way ANOVA with replication: model yᵢⱼₖ = μ + αᵢ + βⱼ + (αβ)ᵢⱼ + εᵢⱼₖ, assumptions (normality, homoscedasticity, independence), ANOVA table with SS_T, SS_A, SS_B, SS_AB, SS_E, and examples like agricultural field trials (ICRISAT crop studies) or industrial quality control
- Part (b)(ii): Set up design matrix X, show rank deficiency, derive condition for estimability via Cβ where C = (p₁,p₂,p₃), prove p₁ = p₂ + p₃ using linear independence of rows and estimability condition C = LX for some L
- Part (c)(i): Three reasons for PCA: dimensionality reduction (e.g., reducing NSSO household survey variables), multicollinearity remediation in regression, and data visualization/pattern detection in large datasets
- Part (c)(ii): Define canonical correlations as correlations between linear combinations u=a'X and v=b'Y maximizing correlation; examples: relationship between economic indicators and social development indices, or agricultural inputs vs outputs; describe eigenvalue solution of Σ₁₁⁻¹Σ₁₂Σ₂₂⁻¹Σ₂₁, whose eigenvalues are the squared canonical correlations, and extraction of canonical variates from the corresponding eigenvectors
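
The formulas for part (a) can be checked with exact rational arithmetic. The following is a minimal sketch using only the Python standard library; the variable names (`cond_mean`, `cond_cov`, `rho_12_3`, `R2`) are illustrative choices, not notation from the question paper.

```python
from fractions import Fraction as F
import math

# Parameters from part (a) of the question.
mu = [F(2), F(-1), F(3)]
Sigma = [[F(4), F(1), F(0)],
         [F(1), F(2), F(1)],
         [F(0), F(1), F(3)]]
x3 = F(2)

# (i) Conditional distribution of (X1, X2)' given X3 = x3.
# The conditioning block is the scalar sigma_33, so its inverse is 1/sigma_33.
s33 = Sigma[2][2]
cond_mean = [mu[i] + Sigma[i][2] / s33 * (x3 - mu[2]) for i in range(2)]
cond_cov = [[Sigma[i][j] - Sigma[i][2] * Sigma[2][j] / s33 for j in range(2)]
            for i in range(2)]

# (ii) Partial correlation rho_12.3, read off the conditional covariance.
rho_12_3 = float(cond_cov[0][1]) / math.sqrt(float(cond_cov[0][0] * cond_cov[1][1]))

# Multiple correlation R_1.23: R^2 = s' S^{-1} s / sigma_11,
# where s = (sigma_12, sigma_13)' and S = Cov((X2, X3)').
s = [Sigma[0][1], Sigma[0][2]]
S = [[Sigma[1][1], Sigma[1][2]],
     [Sigma[2][1], Sigma[2][2]]]
det_S = S[0][0] * S[1][1] - S[0][1] * S[1][0]
S_inv = [[S[1][1] / det_S, -S[0][1] / det_S],
         [-S[1][0] / det_S, S[0][0] / det_S]]
quad = sum(s[i] * S_inv[i][j] * s[j] for i in range(2) for j in range(2))
R2 = quad / Sigma[0][0]
R_1_23 = math.sqrt(R2)

print("conditional mean:", cond_mean)
print("conditional covariance:", cond_cov)
print("rho_12.3:", rho_12_3)
print("R_1.23^2:", R2, " R_1.23:", R_1_23)
```

With these inputs the conditional mean works out to (2, -4/3)', the conditional covariance to [[4, 1], [1, 5/3]], and both ρ₁₂.₃² and R₁.₂₃² to 3/20. The two coefficients coincide for this Σ because σ₁₃ = 0.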

Dimension	Weight	Max marks	Excellent	Average	Poor
Setup correctness	20%	10	Correctly partitions covariance matrix for (a), identifies all parameters for conditional distribution; properly states ANOVA model with all terms for (b)(i); correctly sets up design matrix and identifies rank deficiency for (b)(ii); accurately defines PCA objectives and canonical correlation framework for (c)	Minor errors in matrix partitioning or model specification; incomplete ANOVA model or partially correct design matrix; vague definitions of PCA reasons or canonical correlation	Major errors in setup: wrong matrix dimensions, incorrect ANOVA model specification, failure to identify estimability conditions, or fundamentally misunderstood PCA/canonical correlation concepts
Method choice	20%	10	Selects optimal formulas: Schur complement for conditional covariance, precise partial/multiple correlation formulas; standard two-way ANOVA with interaction; Gauss-Markov approach for estimability; eigen decomposition for canonical correlations	Correct but inefficient methods, or minor deviations from optimal approach; acceptable alternative formulas with more computation	Incorrect method selection: using simple correlation instead of partial, ignoring interaction in ANOVA, attempting direct inversion without partitioning, or wrong optimization approach for canonical analysis
Computation accuracy	20%	10	Flawless arithmetic: correct matrix inversions (Σ₂₂⁻¹ = 1/3), accurate conditional mean (2, -4/3)' and covariance [[4,1],[1,5/3]], precise ρ₁₂.₃ = √(3/20) and R₁.₂₃ = √(3/20); correct F-ratios for ANOVA; valid proof steps for estimability; correct eigenvalue extraction for canonical correlations	Minor computational slips: sign errors, arithmetic mistakes in fractions, or incorrect degrees of freedom; partially correct proof with gaps	Major computational errors: wrong matrix inversion, incorrect conditional distribution, miscalculated correlations, wrong ANOVA decomposition, or invalid proof logic
Interpretation	20%	10	Interprets conditional distribution parameters meaningfully; explains ρ₁₂.₃ as correlation net of X₃ effect and R₁.₂₃ as predictive power; clarifies interaction interpretation in ANOVA; explains why p₁=p₂+p₃ ensures estimability; relates PCA to variance maximization and canonical correlations to inter-group associations with Indian examples	Basic interpretation without depth; mechanical statement of results without conceptual linkage; minimal context for applications	No interpretation provided, or completely misinterprets results (e.g., confusing conditional with marginal distribution, misreading correlation directions)
Final answer & units	20%	10	All final answers boxed/highlighted: conditional distribution N₂((2, -4/3)', [[4,1],[1,5/3]]), ρ₁₂.₃ = √(3/20) ≈ 0.387, R₁.₂₃ = √(3/20) ≈ 0.387 (the two coincide here because σ₁₃ = 0); complete ANOVA table structure; clear estimability condition; three concise PCA reasons; canonical correlation definition with procedure steps and examples	Most answers present but poorly organized; missing some final values; incomplete ANOVA table; partial listing of PCA reasons	Missing critical final answers, disorganized presentation, or answers without proper labeling (e.g., no distinction between partial and multiple correlation)
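
The estimability condition in part (b)(ii) can also be checked mechanically: p'β is estimable exactly when p lies in the row space of the design matrix X, that is, when appending p as an extra row leaves the rank unchanged. The sketch below assumes the expectations exactly as printed in the question (Y₁ and Y₃ share the mean β₁ + β₂); the helpers `rank` and `is_estimable` are hypothetical names written for this illustration.

```python
from fractions import Fraction


def rank(rows):
    """Rank of a small matrix via Gauss-Jordan elimination over the rationals."""
    m = [[Fraction(v) for v in r] for r in rows]
    rk = 0
    for col in range(len(m[0])):
        # Find a row at or below position rk with a non-zero entry in this column.
        pivot = next((r for r in range(rk, len(m)) if m[r][col] != 0), None)
        if pivot is None:
            continue
        m[rk], m[pivot] = m[pivot], m[rk]
        # Eliminate this column from every other row.
        for r in range(len(m)):
            if r != rk and m[r][col] != 0:
                factor = m[r][col] / m[rk][col]
                m[r] = [a - factor * b for a, b in zip(m[r], m[rk])]
        rk += 1
    return rk


# Design matrix: rows give the coefficients of (beta1, beta2, beta3)
# in E(Y1), E(Y2), E(Y3) as stated in the question.
X = [[1, 1, 0],
     [1, 0, 1],
     [1, 1, 0]]


def is_estimable(p):
    """p'beta is estimable iff p is in the row space of X."""
    return rank(X + [list(p)]) == rank(X)


for p in [(2, 1, 1), (1, 1, 0), (0, 1, -1), (1, 0, 0), (1, 1, 1)]:
    print(p, "estimable" if is_estimable(p) else "not estimable",
          "| p1 == p2 + p3:", p[0] == p[1] + p[2])
```

Every vector in the row space has the form a(1, 1, 0) + b(1, 0, 1) = (a + b, a, b), which is why the rank test agrees with p₁ = p₂ + p₃ on each trial vector.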

Q6

Directive word: Solve

More from Statistics 2023 Paper I