Question

(a)(i) What are principal components ? Show that the principal components are uncorrelated. (10 marks)

(a)(ii) Obtain the principal components and the amount of variation explained by each principal component associated with the following dispersion matrix :
Σ = $\begin{pmatrix} 4 & 2 & 1 \\ 2 & 3 & 1 \\ 1 & 1 & 2 \end{pmatrix}$
Comment on the results. (10 marks)

(b) For the given data, the yield of the treatment B in the second block is missing and is denoted as 'y'. Estimate the missing value, and analyse the data by assuming the level of significance = 0·05.
[Given that F(3, 4) = 6·59; and F(2, 3) = 9·55] (20 marks)

(c) Distinguish between Sampling and Non-sampling Errors. What are their sources ? How can these errors be controlled ? (10 marks)

UPSC Answer Check · Accepted Answer

This is a multi-part numerical and theoretical question requiring proof, computation, and analysis. Allocate roughly 40% of your time to part (a), covering PCA theory and computation; 40% to part (b), for missing-value estimation and ANOVA; and 20% to part (c), for the conceptual comparison of errors. Begin with the definition and proof in (a), proceed to a systematic eigenvalue computation, then estimate the missing value by Yates' method and carry out the complete ANOVA, and conclude with a structured comparison for (c).
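
The proof asked for in part (a)(i) fits in a few lines. A sketch, assuming $X$ is a $p$-variate random vector with dispersion matrix $\Sigma$, eigenvalues $\lambda_1 \ge \dots \ge \lambda_p$, and orthonormal eigenvectors $\gamma_1, \dots, \gamma_p$ forming the columns of $\Gamma$ (the symbols $\gamma_i$, $\Lambda$ and $Z_i$ are notation introduced here for illustration):

$$
\begin{aligned}
Z &= \Gamma' X, \qquad Z_i = \gamma_i' X, \\
\operatorname{Cov}(Z) &= \Gamma' \Sigma \Gamma = \Lambda = \operatorname{diag}(\lambda_1, \dots, \lambda_p), \\
\operatorname{Cov}(Z_i, Z_j) &= \gamma_i' \Sigma \gamma_j = \lambda_j \, \gamma_i' \gamma_j = 0 \quad (i \ne j), \\
\operatorname{Var}(Z_i) &= \gamma_i' \Sigma \gamma_i = \lambda_i .
\end{aligned}
$$

The third line uses $\Sigma \gamma_j = \lambda_j \gamma_j$ and the orthogonality of the eigenvectors, so the components are uncorrelated and the $i$-th one has variance $\lambda_i$.
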
- Part (a)(i): Define principal components as linear combinations maximizing variance; prove uncorrelatedness using orthogonal transformation property (Z = Γ'X where Γ is eigenvector matrix)
- Part (a)(ii): Compute eigenvalues of Σ (characteristic equation: -λ³ + 9λ² - 20λ + 13 = 0, which has no rational root and so must be solved numerically), obtain eigenvectors, calculate proportion of variance explained by each PC, comment on dimensionality reduction
- Part (b): Estimate missing value y using Yates' formula for RBD: y = (rB + tT - G)/((r-1)(t-1)), reconstruct ANOVA table with adjusted degrees of freedom, compare calculated F with given critical values
- Part (c): Distinguish sampling error (random, measurable, decreases with n) vs non-sampling error (systematic, non-measurable); list sources (coverage, non-response, measurement, processing, frame errors); control methods (probability sampling, pre-testing, training, validation, imputation techniques)
- Correct application of spectral decomposition theorem and verification that trace equals sum of eigenvalues
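
The computation in part (a)(ii) can be cross-checked numerically. The following is a minimal sketch using only the Python standard library; the helper names (`det3`, `eigenvalues_sym3`, `unit_eigenvector`, `quad_form`) are made up for this illustration. It assumes a symmetric 3×3 matrix with distinct eigenvalues, uses the closed-form trigonometric solution for the eigenvalues, and builds each eigenvector as a cross product of two rows of Σ − λI.

```python
import math

# Dispersion matrix from part (a)(ii).
SIGMA = [[4.0, 2.0, 1.0],
         [2.0, 3.0, 1.0],
         [1.0, 1.0, 2.0]]


def det3(m):
    """Determinant of a 3x3 matrix by cofactor expansion along the first row."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))


def eigenvalues_sym3(a):
    """Eigenvalues of a symmetric, non-scalar 3x3 matrix, largest first."""
    q = (a[0][0] + a[1][1] + a[2][2]) / 3.0          # mean of the eigenvalues
    off = a[0][1] ** 2 + a[0][2] ** 2 + a[1][2] ** 2
    p = math.sqrt((sum((a[i][i] - q) ** 2 for i in range(3)) + 2.0 * off) / 6.0)
    b = [[(a[i][j] - (q if i == j else 0.0)) / p for j in range(3)]
         for i in range(3)]
    r = max(-1.0, min(1.0, det3(b) / 2.0))           # clamp against rounding
    phi = math.acos(r) / 3.0
    lam1 = q + 2.0 * p * math.cos(phi)
    lam3 = q + 2.0 * p * math.cos(phi + 2.0 * math.pi / 3.0)
    lam2 = 3.0 * q - lam1 - lam3                     # trace = sum of eigenvalues
    return [lam1, lam2, lam3]


def unit_eigenvector(a, lam):
    """Unit eigenvector for a simple eigenvalue lam of a.

    Every row of (a - lam*I) is orthogonal to the eigenvector, so the cross
    product of two independent rows points along it; the pair with the
    largest cross product is used for numerical stability.
    """
    m = [[a[i][j] - (lam if i == j else 0.0) for j in range(3)]
         for i in range(3)]
    best_norm, best_vec = 0.0, None
    for i, j in ((0, 1), (0, 2), (1, 2)):
        u, w = m[i], m[j]
        v = [u[1] * w[2] - u[2] * w[1],
             u[2] * w[0] - u[0] * w[2],
             u[0] * w[1] - u[1] * w[0]]
        norm = math.sqrt(sum(x * x for x in v))
        if best_vec is None or norm > best_norm:
            best_norm, best_vec = norm, v
    return [x / best_norm for x in best_vec]


def quad_form(u, a, w):
    """u' a w, i.e. Cov(u'X, w'X) when a is the dispersion matrix of X."""
    return sum(u[i] * a[i][j] * w[j] for i in range(3) for j in range(3))


eigvals = eigenvalues_sym3(SIGMA)
eigvecs = [unit_eigenvector(SIGMA, lam) for lam in eigvals]
shares = [lam / sum(eigvals) for lam in eigvals]

for k, (lam, vec, share) in enumerate(zip(eigvals, eigvecs, shares), start=1):
    coeffs = ", ".join(f"{c:+.3f}" for c in vec)
    print(f"PC{k}: variance = {lam:.4f}, share = {share:.1%}, "
          f"coefficients = ({coeffs})")

# Covariances between distinct components should vanish up to rounding.
max_cross_cov = max(abs(quad_form(eigvecs[i], SIGMA, eigvecs[j]))
                    for i in range(3) for j in range(3) if i != j)
print(f"largest |Cov(Z_i, Z_j)| for i != j: {max_cross_cov:.2e}")
```

Two quick sanity checks on any hand computation follow from the same quantities: the eigenvalues must sum to the trace of Σ (here 9) and multiply to its determinant (here 13). The sign of each eigenvector is arbitrary, so a hand-derived component may differ from the printed one by a factor of −1.
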

Dimension	Weight	Max marks	Excellent	Average	Poor
Setup correctness	20%	10	Correctly states definitions of principal components using variance maximization criterion; properly sets up characteristic equation |Σ - λI| = 0 for part (a); correctly identifies RBD structure and applies Yates' missing value formula for part (b); accurately distinguishes error types with proper classification framework for part (c)	States basic definitions with minor errors in mathematical notation; sets up eigenvalue problem but with computational errors in determinant expansion; applies missing value formula with wrong substitution; lists error types without clear distinction	Confuses principal components with factor analysis or other techniques; fails to set up characteristic equation; uses incorrect formula for missing value or ignores ANOVA adjustment; conflates sampling and non-sampling errors or omits key distinctions
Method choice	20%	10	Selects orthogonal transformation proof for uncorrelatedness; chooses efficient eigenvalue computation (e.g., testing the candidate rational roots ±1, ±13 first and, finding none, solving the cubic numerically); applies correct ANOVA procedure with proper error term selection; uses appropriate error control strategies matched to specific sources	Uses correct but inefficient methods (e.g., direct expansion without factorization); applies ANOVA with minor errors in error term selection; provides generic control methods without source-specific matching	Selects incorrect proof method (e.g., covariance matrix approach without orthogonality); uses trial-and-error for eigenvalues without systematic approach; applies completely wrong ANOVA design; provides irrelevant or incorrect control methods
Computation accuracy	20%	10	Accurate eigenvalues (λ₁≈6.05, λ₂≈1.64, λ₃≈1.31) with correct eigenvector normalization; precise missing value calculation with correct substitution of block/treatment totals; accurate F-ratio computation and comparison with critical values	Correct eigenvalues with minor eigenvector errors; approximate missing value with arithmetic errors; correct ANOVA structure but arithmetic errors in SS calculations	Major errors in eigenvalue computation (wrong roots); completely wrong missing value; fundamental errors in ANOVA calculations or failure to adjust degrees of freedom
Interpretation	20%	10	Interprets first PC explaining ~67% variance indicating strong first component dominance; explains dimensionality reduction feasibility; interprets ANOVA results with clear conclusion on treatment significance; provides contextual examples (e.g., Census 2011 non-sampling error control)	States variance proportions without interpretation of practical significance; states significance without explaining meaning for experimental design; lists error sources without Indian statistical system context	No interpretation of computational results; fails to conclude on hypothesis tests; omits practical implications entirely
Final answer & units	20%	10	Clear presentation of three principal components with variance explained; explicit missing value estimate and complete ANOVA table with conclusion; structured tabular comparison of error types with specific control measures; proper mathematical notation throughout	Present answers but with disorganized structure; incomplete ANOVA table; missing some components of error comparison	Missing final answers for key parts; no ANOVA conclusion; incomplete or absent error comparison; poor mathematical notation
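
The missing-value step in part (b) can be sketched as follows. The data table for the question is not reproduced on this page, so the layout below is a hypothetical 3-block × 4-treatment example invented purely to exercise the formula; `yates_missing_value` and `layout` are illustrative names. The function applies y = (rB + tT − G)/((r − 1)(t − 1)), where B and T are the totals of the known yields in the affected block and treatment and G is the total of all known yields.

```python
def yates_missing_value(layout):
    """Yates' estimate of a single missing observation in an RBD.

    layout[i][j] is the yield of treatment j in block i; exactly one
    entry must be None (the missing plot).
    """
    r, t = len(layout), len(layout[0])        # blocks, treatments
    missing = [(i, j) for i in range(r) for j in range(t)
               if layout[i][j] is None]
    if len(missing) != 1:
        raise ValueError("expected exactly one missing observation")
    bi, tj = missing[0]
    block_total = sum(v for v in layout[bi] if v is not None)             # B
    treatment_total = sum(row[tj] for row in layout
                          if row[tj] is not None)                         # T
    grand_total = sum(v for row in layout for v in row if v is not None)  # G
    return ((r * block_total + t * treatment_total - grand_total)
            / ((r - 1) * (t - 1)))


# HYPOTHETICAL data (not the exam's table): rows are blocks, columns are
# treatments A-D, and treatment B in the second block is missing.
layout = [[12.0, 14.0, 11.0, 15.0],
          [13.0, None, 12.0, 16.0],
          [11.0, 13.0, 10.0, 14.0]]

estimate = yates_missing_value(layout)
r, t = len(layout), len(layout[0])
error_df = (r - 1) * (t - 1) - 1   # one error d.f. is lost for the estimate

print(f"estimated missing yield: {estimate:.2f}")
print(f"error degrees of freedom after adjustment: {error_df}")
```

The hypothetical yields were constructed to be exactly additive in block and treatment effects, so the estimate should recover the value that additivity implies. With real data the filled-in value is then used in the usual RBD analysis, with the error (and total) degrees of freedom reduced by one.
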

Q8

Directive word: Solve

More from Statistics 2025 Paper I