Q6
(a) Let **X** = (X₁ X₂ X₃)' be distributed as N₃ (μ, Σ), where μ = (2 −1 3)' and Σ = $\begin{pmatrix} 4 & 1 & 0 \\ 1 & 2 & 1 \\ 0 & 1 & 3 \end{pmatrix}$. Find
(i) the conditional distribution of (X₁ X₂)' given X₃ = 2.
(ii) partial correlation coefficient ρ₁₂.₃ and multiple correlation coefficient R₁.₂₃ (8+7 marks)
(b) (i) Describe the complete analysis of two-way classified data with multiple (but equal) observations per cell, clearly stating the assumptions used. Also state two examples where such type of analysis is used.
(ii) Let three mutually independent variables Y₁, Y₂ and Y₃ having common variance σ² and E(Y₁) = β₁ + β₂, E(Y₂) = β₁ + β₃, E(Y₃) = β₁ + β₂ be given. Show that the linear parametric function p₁β₁ + p₂β₂ + p₃β₃ is estimable if and only if p₁ = p₂ + p₃, clearly stating the assumptions used, if any. (5 marks)
(c) (i) State briefly three reasons why an analyst may wish to perform a principal component analysis. (6 marks)
(ii) Define canonical correlations and give two examples of their application. Describe the procedure of working out canonical correlations and canonical variates. (9 marks)
Directive word: Solve
This question asks you to solve. The directive word signals the depth of analysis expected, the structure of your answer, and the weight of evidence you must bring.
How this answer will be evaluated
Approach
Solve this multi-part numerical and theoretical question by allocating approximately 35% of your time to part (a), given its 15 marks and computational load, 25% to part (b) covering ANOVA and estimability, and 40% to part (c) on PCA and canonical correlations. Begin each sub-part by identifying the problem clearly. For (a), show every computational step, including the matrix operations. For (b)(i), present a structured ANOVA decomposition, and for (b)(ii), give a rigorous linear-algebra proof. For (c), aim for conceptual clarity supported by real-world Indian examples. Conclude each part with a precise final answer and its interpretation.
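For part (b)(ii), the linear-algebra proof can be sketched as follows. This is a minimal outline, assuming the usual definition that p'β is estimable when some linear function l'Y has expectation p'β for every β. The symbols X (design matrix), l and p are notation introduced here and are not given in the question.

$$
\begin{aligned}
E(Y) &= X\beta, \qquad X = \begin{pmatrix} 1 & 1 & 0 \\ 1 & 0 & 1 \\ 1 & 1 & 0 \end{pmatrix}, \qquad \operatorname{rank}(X) = 2, \\
E(l'Y) &= l'X\beta = (l_1 + l_2 + l_3)\beta_1 + (l_1 + l_3)\beta_2 + l_2\beta_3, \\
p' = l'X &\;\Longrightarrow\; p_1 = l_1 + l_2 + l_3,\quad p_2 = l_1 + l_3,\quad p_3 = l_2 \;\Longrightarrow\; p_1 = p_2 + p_3, \\
p_1 = p_2 + p_3 &\;\Longrightarrow\; E(p_2 Y_1 + p_3 Y_2) = (p_2 + p_3)\beta_1 + p_2\beta_2 + p_3\beta_3 = p_1\beta_1 + p_2\beta_2 + p_3\beta_3 .
\end{aligned}
$$

The third line gives the "only if" direction and the fourth exhibits an explicit linear unbiased estimator, p₂Y₁ + p₃Y₂, for the "if" direction. Only the expectation structure is used, so independence and common variance are not needed for estimability itself.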
Key points expected
- Part (a)(i): Correctly partition Σ into Σ₁₁, Σ₁₂, Σ₂₁, Σ₂₂ and apply conditional distribution formula N₂(μ₁ + Σ₁₂Σ₂₂⁻¹(x₃-μ₃), Σ₁₁ - Σ₁₂Σ₂₂⁻¹Σ₂₁) with x₃=2
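- Part (a)(ii): Compute partial correlation ρ₁₂.₃ = (σ₁₂ - σ₁₃σ₂₃/σ₃₃)/√[(σ₁₁-σ₁₃²/σ₃₃)(σ₂₂-σ₂₃²/σ₃₃)] and multiple correlation R₁.₂₃ = √[σ₁'Σ₂₂⁻¹σ₁/σ₁₁], where σ₁' = (σ₁₂, σ₁₃) and Σ₂₂ here denotes the 2×2 covariance matrix of (X₂, X₃), not the scalar Σ₂₂ = σ₃₃ used in part (a)(i)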
- Part (b)(i): Describe two-way ANOVA with replication: model yᵢⱼₖ = μ + αᵢ + βⱼ + (αβ)ᵢⱼ + εᵢⱼₖ, assumptions (normality, homoscedasticity, independence), ANOVA table with SS_T, SS_A, SS_B, SS_AB, SS_E, and examples like agricultural field trials (ICRISAT crop studies) or industrial quality control
- Part (b)(ii): Set up the design matrix X, show that it is rank deficient (rank 2 with three parameters), and use the estimability condition that Cβ with C = (p₁,p₂,p₃) is estimable if and only if C = LX for some L, i.e. C lies in the row space of X, to prove p₁ = p₂ + p₃
- Part (c)(i): Three reasons for PCA: dimensionality reduction (e.g., reducing NSSO household survey variables), multicollinearity remediation in regression, and data visualization/pattern detection in large datasets
- Part (c)(ii): Define canonical correlations as correlations between linear combinations u=a'X and v=b'Y maximizing correlation; examples: relationship between economic indicators and social development indices, or agricultural inputs vs outputs; describe eigenvalue solution of Σ₁₁⁻¹Σ₁₂Σ₂₂⁻¹Σ₂₁ and extraction of canonical variates
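The part (a) formulas in the key points above can be checked numerically. The following is a minimal NumPy sketch, not a model answer: μ and Σ come from the question, while the variable names (`S11`, `cond_mean`, `rho_12_3`, and so on) are chosen here for illustration.

```python
import numpy as np

# Parameters taken directly from the question statement.
mu = np.array([2.0, -1.0, 3.0])
Sigma = np.array([[4.0, 1.0, 0.0],
                  [1.0, 2.0, 1.0],
                  [0.0, 1.0, 3.0]])

# (a)(i): block 1 is (X1, X2), block 2 is X3, conditioning on X3 = 2.
S11 = Sigma[:2, :2]          # 2x2 covariance of (X1, X2)
S12 = Sigma[:2, 2:]          # 2x1 cross-covariance with X3
S22 = Sigma[2:, 2:]          # 1x1 variance of X3
x3 = 2.0

S22_inv = np.linalg.inv(S22)
cond_mean = mu[:2] + (S12 @ S22_inv).ravel() * (x3 - mu[2])
cond_cov = S11 - S12 @ S22_inv @ S12.T   # Schur complement

# (a)(ii): partial correlation read off the conditional covariance.
rho_12_3 = cond_cov[0, 1] / np.sqrt(cond_cov[0, 0] * cond_cov[1, 1])

# Multiple correlation of X1 on (X2, X3).
s1 = Sigma[0, 1:]            # (sigma_12, sigma_13)
S_23 = Sigma[1:, 1:]         # covariance matrix of (X2, X3)
R_1_23 = np.sqrt(s1 @ np.linalg.solve(S_23, s1) / Sigma[0, 0])

print("conditional mean:", cond_mean)        # (2, -4/3)
print("conditional covariance:\n", cond_cov) # [[4, 1], [1, 5/3]]
print("rho_12.3:", rho_12_3)                 # sqrt(3/20), about 0.387
print("R_1.23:", R_1_23)                     # sqrt(3/20), about 0.387
```

The two correlations coincide for this Σ because σ₁₃ = 0, so X₃ adds nothing to the prediction of X₁ beyond what it contributes through X₂.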
Evaluation rubric
| Dimension | Weight | Max marks | Excellent | Average | Poor |
|---|---|---|---|---|---|
| Setup correctness | 20% | 10 | Correctly partitions covariance matrix for (a), identifies all parameters for conditional distribution; properly states ANOVA model with all terms for (b)(i); correctly sets up design matrix and identifies rank deficiency for (b)(ii); accurately defines PCA objectives and canonical correlation framework for (c) | Minor errors in matrix partitioning or model specification; incomplete ANOVA model or partially correct design matrix; vague definitions of PCA reasons or canonical correlation | Major errors in setup: wrong matrix dimensions, incorrect ANOVA model specification, failure to identify estimability conditions, or fundamentally misunderstood PCA/canonical correlation concepts |
| Method choice | 20% | 10 | Selects optimal formulas: Schur complement for conditional covariance, precise partial/multiple correlation formulas; standard two-way ANOVA with interaction; Gauss-Markov approach for estimability; eigen decomposition for canonical correlations | Correct but inefficient methods, or minor deviations from optimal approach; acceptable alternative formulas with more computation | Incorrect method selection: using simple correlation instead of partial, ignoring interaction in ANOVA, attempting direct inversion without partitioning, or wrong optimization approach for canonical analysis |
| Computation accuracy | 20% | 10 | Flawless arithmetic: correct matrix inversions (Σ₂₂⁻¹ = 1/3), accurate conditional mean (2, -4/3)' and covariance [[4,1],[1,5/3]], precise ρ₁₂.₃ = √(3/20) and R₁.₂₃ = √(3/20) (the two coincide because σ₁₃ = 0); correct F-ratios for ANOVA; valid proof steps for estimability; correct eigenvalue extraction for canonical correlations | Minor computational slips: sign errors, arithmetic mistakes in fractions, or incorrect degrees of freedom; partially correct proof with gaps | Major computational errors: wrong matrix inversion, incorrect conditional distribution, miscalculated correlations, wrong ANOVA decomposition, or invalid proof logic |
| Interpretation | 20% | 10 | Interprets conditional distribution parameters meaningfully; explains ρ₁₂.₃ as correlation net of X₃ effect and R₁.₂₃ as predictive power; clarifies interaction interpretation in ANOVA; explains why p₁=p₂+p₃ ensures estimability; relates PCA to variance maximization and canonical correlations to inter-group associations with Indian examples | Basic interpretation without depth; mechanical statement of results without conceptual linkage; minimal context for applications | No interpretation provided, or completely misinterprets results (e.g., confusing conditional with marginal distribution, misreading correlation directions) |
| Final answer & units | 20% | 10 | All final answers boxed/highlighted: conditional distribution N₂((2, -4/3)', [[4,1],[1,5/3]]), ρ₁₂.₃ = √(3/20) ≈ 0.387, R₁.₂₃ = √(3/20) ≈ 0.387; complete ANOVA table structure; clear estimability condition; three concise PCA reasons; canonical correlation definition with procedure steps and examples | Most answers present but poorly organized; missing some final values; incomplete ANOVA table; partial listing of PCA reasons | Missing critical final answers, disorganized presentation, or answers without proper labeling (e.g., no distinction between partial and multiple correlation) |
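The rubric asks for correct eigenvalue extraction in the canonical correlation procedure of part (c)(ii). As a rough illustration of that procedure, the sketch below works from the eigenvalues of Σ₁₁⁻¹Σ₁₂Σ₂₂⁻¹Σ₂₁. The covariance blocks are hypothetical numbers invented for this example, and the function name `canonical_correlations` is likewise an assumption, not part of the question. The sketch assumes X has no more variables than Y and that all canonical correlations are nonzero.

```python
import numpy as np

def canonical_correlations(S11, S12, S22):
    """Return canonical correlations and coefficient vectors (as columns).

    Assumes dim(X) <= dim(Y) and nonzero canonical correlations.
    """
    S21 = S12.T
    # M = S11^{-1} S12 S22^{-1} S21; its eigenvalues are the squared
    # canonical correlations.
    M = np.linalg.solve(S11, S12) @ np.linalg.solve(S22, S21)
    eigvals, eigvecs = np.linalg.eig(M)
    order = np.argsort(eigvals.real)[::-1]          # largest first
    rho2 = np.clip(eigvals.real[order], 0.0, 1.0)
    A = eigvecs.real[:, order]
    # Scale each a so that Var(a'X) = a' S11 a = 1.
    A = A / np.sqrt(np.einsum("ij,ik,kj->j", A, S11, A))
    # b is proportional to S22^{-1} S21 a; scale so that Var(b'Y) = 1.
    B = np.linalg.solve(S22, S21 @ A)
    B = B / np.sqrt(np.einsum("ij,ik,kj->j", B, S22, B))
    return np.sqrt(rho2), A, B

# Hypothetical covariance blocks for two X variables and two Y variables.
S11 = np.array([[1.0, 0.4], [0.4, 1.0]])
S22 = np.array([[1.0, 0.3], [0.3, 1.0]])
S12 = np.array([[0.3, 0.1], [0.2, 0.1]])

rho, A, B = canonical_correlations(S11, S12, S22)
print("canonical correlations:", rho)
print("coefficients a (columns):\n", A)
print("coefficients b (columns):\n", B)
```

Column j of `A` and `B` gives the j-th pair of canonical variates u = a'X and v = b'Y, each scaled to unit variance, so their covariance a'Σ₁₂b equals the j-th canonical correlation.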
More from Statistics 2023 Paper I
- Q1 (a) Out of 1000 persons born, only 900 reach the age of 15 years, and out of every 1000 who reach the age of 15 years, 950 reach the age of…
- Q2 (a) Let X, Y, Z be three mutually independent standard exponential variates and W₁ = X + Y + Z, W₂ = (X + Y)/(X + Y + Z), W₃ = X/(X + Y). T…
- Q3 (a) (i) If X is a random variable with finite variance, show that $\lim_{n \to \infty} n^2 P\{|X| > n\} = 0$. (10 marks) (ii) In a certain recruitment tes…
- Q4 (a) What is the role of properties of completeness and sufficiency in Statistical Inference ? Explain. In U (0, θ), find out Uniformly Mini…
- Q5 (a) (i) If **X** = (X₁ X₂ X₃)' is distributed as N₃ (μ, Σ), find the distribution of [(X₁ – X₂) (X₂ – X₃)]'. (5 marks) (ii) Suppose that **…
- Q6 (a) Let **X** = (X₁ X₂ X₃)' be distributed as N₃ (μ, Σ), where μ = (2 −1 3)' and Σ = $\begin{pmatrix} 4 & 1 & 0 \\ 1 & 2 & 1 \\ 0 & 1 & 3 \…
- Q7 (a) Discuss the difference between sampling for variables and sampling for attributes with examples. For a qualitative characteristic, find…
- Q8 (a) Differentiate between randomised block design and balanced incomplete block design. In usual notations, for a balanced incomplete block…