Statistics 2025 Paper I | 50 marks | Solve

Q8

(a)(i) What are principal components? Show that the principal components are uncorrelated. (10 marks)

(a)(ii) Obtain the principal components and the amount of variation explained by each principal component associated with the following dispersion matrix: $\Sigma = \begin{pmatrix} 4 & 2 & 1 \\ 2 & 3 & 1 \\ 1 & 1 & 2 \end{pmatrix}$ Comment on the results. (10 marks)

(b) For the given data, the yield of the treatment B in the second block is missing and is denoted as 'y'. Estimate the missing value, and analyse the data by assuming the level of significance = 0·05. [Given that F(3, 4) = 6·59; and F(2, 3) = 9·55] (20 marks)

(c) Distinguish between Sampling and Non-sampling Errors. What are their sources? How these errors can be controlled? (10 marks)


Directive word: Solve

This question asks you to solve. The directive word signals the depth of analysis expected, the structure of your answer, and the weight of evidence you must bring.


How this answer will be evaluated

Approach

This is a multi-part numerical and theoretical question requiring proof, computation, and analysis. In line with the marks, allocate approximately 40% of your time to part (a), covering PCA theory and computation, 40% to part (b), for missing value estimation and ANOVA, and 20% to part (c), for the conceptual comparison of errors. Begin with the definition and proof in (a)(i), proceed to a systematic eigenvalue computation in (a)(ii), then estimate the missing value in (b) using Yates' method and complete the ANOVA, and conclude with a structured comparison for (c).
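A minimal sketch of the uncorrelatedness proof for part (a)(i), assuming X is a p-variate random vector with dispersion matrix Σ. The symbols λ_i, γ_i and Λ are notation introduced here for illustration: (λ_i, γ_i) are the eigenvalue and orthonormal eigenvector pairs of Σ, Γ = (γ_1, …, γ_p), and Z = Γ'X is the vector of principal components.

$$
\begin{aligned}
Z_i &= \gamma_i' X, \qquad \Sigma \gamma_i = \lambda_i \gamma_i, \qquad \gamma_i' \gamma_j = \delta_{ij}, \\
\operatorname{Cov}(Z_i, Z_j) &= \gamma_i' \Sigma \gamma_j = \lambda_j \, \gamma_i' \gamma_j = 0 \quad (i \neq j), \\
\operatorname{Var}(Z_i) &= \gamma_i' \Sigma \gamma_i = \lambda_i, \\
\operatorname{Cov}(Z) &= \Gamma' \Sigma \Gamma = \Lambda = \operatorname{diag}(\lambda_1, \dots, \lambda_p).
\end{aligned}
$$

If an eigenvalue is repeated, the eigenvectors within its eigenspace can still be chosen orthonormal, so the argument goes through unchanged. Taking traces of the last line gives tr(Σ) = λ_1 + … + λ_p, which is why λ_i / tr(Σ) is read as the proportion of total variation explained by the i-th component.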

Key points expected

  • Part (a)(i): Define principal components as linear combinations maximizing variance; prove uncorrelatedness using orthogonal transformation property (Z = Γ'X where Γ is eigenvector matrix)
  • Part (a)(ii): Compute eigenvalues of Σ (characteristic equation: -λ³ + 9λ² - 20λ + 13 = 0, where 9 = tr Σ, 20 = sum of the 2×2 principal minors, and 13 = |Σ|), obtain eigenvectors, calculate proportion of variance explained by each PC, comment on dimensionality reduction
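  • Part (b): Estimate missing value y using Yates' formula for RBD: y = (rB + tT - G)/((r-1)(t-1)), where r is the number of blocks, t the number of treatments, B and T the totals of the known yields in the block and treatment containing the missing plot, and G the grand total of the known yields; reconstruct ANOVA table with adjusted degrees of freedom (error and total d.f. each reduced by one), compare calculated F with given critical values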
  • Part (c): Distinguish sampling error (random, arises because only a part of the population is observed, measurable under probability sampling, decreases with n) from non-sampling error (systematic, non-measurable, does not decrease with n); list sources of non-sampling error (coverage, non-response, measurement, processing, frame errors); give control methods (probability sampling, pre-testing, training, validation, imputation techniques)
  • Correct application of spectral decomposition theorem and verification that trace equals sum of eigenvalues
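The computation in part (a)(ii) can be checked numerically. The following is a minimal sketch in standard-library Python, not a prescribed method: it assumes the closed-form trigonometric solution for the eigenvalues of a symmetric 3×3 matrix and obtains each eigenvector as a cross product of rows of Σ - λI (valid only when the eigenvalue is simple). All function and variable names are illustrative.

```python
import math

# Dispersion matrix from part (a)(ii).
SIGMA = [[4.0, 2.0, 1.0],
         [2.0, 3.0, 1.0],
         [1.0, 1.0, 2.0]]


def det3(m):
    """Determinant of a 3x3 matrix by cofactor expansion along the first row."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))


def eigenvalues_sym3(a):
    """Eigenvalues of a symmetric 3x3 matrix in descending order.

    Trigonometric method; assumes `a` is not a multiple of the identity.
    """
    q = (a[0][0] + a[1][1] + a[2][2]) / 3.0
    off = a[0][1] ** 2 + a[0][2] ** 2 + a[1][2] ** 2
    p = math.sqrt((sum((a[i][i] - q) ** 2 for i in range(3)) + 2.0 * off) / 6.0)
    b = [[(a[i][j] - (q if i == j else 0.0)) / p for j in range(3)]
         for i in range(3)]
    r = max(-1.0, min(1.0, det3(b) / 2.0))  # clamp against rounding error
    phi = math.acos(r) / 3.0
    lam1 = q + 2.0 * p * math.cos(phi)
    lam3 = q + 2.0 * p * math.cos(phi + 2.0 * math.pi / 3.0)
    lam2 = 3.0 * q - lam1 - lam3  # trace equals the sum of eigenvalues
    return [lam1, lam2, lam3]


def cross(u, v):
    """Cross product of two 3-vectors."""
    return [u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0]]


def unit_eigenvector(a, lam):
    """Unit eigenvector for a simple eigenvalue `lam` of `a`.

    The eigenvector is orthogonal to every row of (a - lam*I), so it is the
    cross product of two independent rows; the largest candidate is used.
    """
    m = [[a[i][j] - (lam if i == j else 0.0) for j in range(3)]
         for i in range(3)]
    candidates = [cross(m[0], m[1]), cross(m[0], m[2]), cross(m[1], m[2])]
    v = max(candidates, key=lambda c: sum(x * x for x in c))
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


eigvals = eigenvalues_sym3(SIGMA)
eigvecs = [unit_eigenvector(SIGMA, lam) for lam in eigvals]
shares = [lam / sum(eigvals) for lam in eigvals]

for k, (lam, vec, share) in enumerate(zip(eigvals, eigvecs, shares), start=1):
    coeffs = ", ".join(f"{c:+.3f}" for c in vec)
    print(f"PC{k}: lambda = {lam:.3f}, share = {share:.1%}, coefficients = ({coeffs})")
```

For this Σ the eigenvalues come out at roughly 6.05, 1.64 and 1.31, so the first component accounts for about 67% of the total variation of 9. The sign of each coefficient vector is arbitrary; only the direction matters.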

Evaluation rubric

Each dimension is listed with its weight and maximum marks, followed by the descriptors for Excellent, Average, and Poor answers.

Setup correctness (weight: 20%, max marks: 10)
  • Excellent: Correctly states definitions of principal components using variance maximization criterion; properly sets up characteristic equation |Σ - λI| = 0 for part (a); correctly identifies RBD structure and applies Yates' missing value formula for part (b); accurately distinguishes error types with proper classification framework for part (c)
  • Average: States basic definitions with minor errors in mathematical notation; sets up eigenvalue problem but with computational errors in determinant expansion; applies missing value formula with wrong substitution; lists error types without clear distinction
  • Poor: Confuses principal components with factor analysis or other techniques; fails to set up characteristic equation; uses incorrect formula for missing value or ignores ANOVA adjustment; conflates sampling and non-sampling errors or omits key distinctions

Method choice (weight: 20%, max marks: 10)
  • Excellent: Selects orthogonal transformation proof for uncorrelatedness; chooses efficient eigenvalue computation (e.g., noting that the cubic λ³ - 9λ² + 20λ - 13 = 0 has no rational root and solving it numerically or by the trigonometric method); applies correct ANOVA procedure with proper error term selection; uses appropriate error control strategies matched to specific sources
  • Average: Uses correct but inefficient methods (e.g., unguided trial and error for the roots); applies ANOVA with minor errors in error term selection; provides generic control methods without source-specific matching
  • Poor: Selects incorrect proof method (e.g., covariance matrix approach without orthogonality); guesses eigenvalues without any systematic approach; applies completely wrong ANOVA design; provides irrelevant or incorrect control methods

Computation accuracy (weight: 20%, max marks: 10)
  • Excellent: Accurate eigenvalues (λ₁≈6.05, λ₂≈1.64, λ₃≈1.31, summing to tr Σ = 9) with correct eigenvector normalization; precise missing value calculation with correct substitution of block/treatment totals; accurate F-ratio computation and comparison with critical values
  • Average: Correct eigenvalues with minor eigenvector errors; approximate missing value with arithmetic errors; correct ANOVA structure but arithmetic errors in SS calculations
  • Poor: Major errors in eigenvalue computation (wrong roots); completely wrong missing value; fundamental errors in ANOVA calculations or failure to adjust degrees of freedom

Interpretation (weight: 20%, max marks: 10)
  • Excellent: Interprets first PC explaining ~67% variance indicating strong first component dominance; explains dimensionality reduction feasibility; interprets ANOVA results with clear conclusion on treatment significance; provides contextual examples (e.g., Census 2011 non-sampling error control)
  • Average: States variance proportions without interpretation of practical significance; states significance without explaining meaning for experimental design; lists error sources without Indian statistical system context
  • Poor: No interpretation of computational results; fails to conclude on hypothesis tests; omits practical implications entirely

Final answer & units (weight: 20%, max marks: 10)
  • Excellent: Clear presentation of three principal components with variance explained; explicit missing value estimate and complete ANOVA table with conclusion; structured tabular comparison of error types with specific control measures; proper mathematical notation throughout
  • Average: Present answers but with disorganized structure; incomplete ANOVA table; missing some components of error comparison
  • Poor: Missing final answers for key parts; no ANOVA conclusion; incomplete or absent error comparison; poor mathematical notation
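The missing-value step in part (b) can be sketched in Python as below. The yields are hypothetical and are NOT the exam data (the question's data table is not reproduced in this text); the layout of 3 blocks by 4 treatments and all function names are likewise assumptions made for illustration. The sketch applies Yates' formula y = (rB + tT - G)/((r-1)(t-1)) and reports the degrees of freedom after losing one each from error and total.

```python
# Hypothetical RBD yields (NOT the exam data): rows are blocks, columns are
# treatments A-D. None marks the missing plot (treatment B, block 2).
yields = [
    [12.0, 14.0, 10.0, 13.0],  # block 1
    [11.0, None, 9.0, 12.0],   # block 2
    [13.0, 15.0, 12.0, 14.0],  # block 3
]


def yates_estimate(table):
    """Return (estimate, block index, treatment index) for one missing plot."""
    r, t = len(table), len(table[0])  # r blocks, t treatments
    missing = [(i, j) for i in range(r) for j in range(t) if table[i][j] is None]
    if len(missing) != 1:
        raise ValueError("this sketch handles exactly one missing plot")
    bi, tj = missing[0]
    # Totals of the known yields only.
    B = sum(x for x in table[bi] if x is not None)             # affected block
    T = sum(row[tj] for row in table if row[tj] is not None)   # affected treatment
    G = sum(x for row in table for x in row if x is not None)  # grand total
    return (r * B + t * T - G) / ((r - 1) * (t - 1)), bi, tj


def fill(table, bi, tj, value):
    """Copy of the table with the missing plot replaced by `value`."""
    rows = [row[:] for row in table]
    rows[bi][tj] = value
    return rows


def error_ss(table):
    """Error sum of squares of a complete RBD table."""
    r, t = len(table), len(table[0])
    grand = sum(sum(row) for row in table)
    cf = grand * grand / (r * t)  # correction factor
    total_ss = sum(x * x for row in table for x in row) - cf
    block_ss = sum(sum(row) ** 2 for row in table) / t - cf
    treat_ss = sum(sum(row[j] for row in table) ** 2 for j in range(t)) / r - cf
    return total_ss - block_ss - treat_ss


y_hat, bi, tj = yates_estimate(yields)
completed = fill(yields, bi, tj, y_hat)

r, t = len(yields), len(yields[0])
error_df = (r - 1) * (t - 1) - 1  # one d.f. lost for the estimated plot
total_df = r * t - 2              # total d.f. is also reduced by one

print(f"estimated missing yield: {y_hat:.3f}")
print(f"error SS = {error_ss(completed):.3f} on {error_df} d.f.; total d.f. = {total_df}")
```

Yates' estimate is the value that minimises the error sum of squares of the completed table, which is why the ANOVA is then run on the completed table with the reduced error degrees of freedom. The F-ratios are compared with the critical values supplied in the question.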
