Question

(a) (i) What are orthogonal polynomials ? How do you fit an orthogonal polynomial of degree 'p' ?

(ii) For the model Y_(n×1) = X_(n×k) β_(k×1) + u_(n×1), E(uu') = σ² I_n, where X_(n×k) is a matrix of rank k (k < n), find out the value of E[Y'(I_n - X(X'X)⁻¹X')Y]. 10+10=20

(b) Consider an artificial population of three farms. Their selection probabilities and the wheat production (in '000 tons) are as follows:

Farm unit (i)	1	2	3
Selection probability (pᵢ)	0·3	0·2	0·5
Wheat production (yᵢ)	11	6	25

Draw all possible samples of size 2 with replacement (order is to be considered). Show that Horvitz-Thompson estimator of total wheat production is unbiased. 15

(c) What is a missing plot technique ? Derive the missing value formula for a Latin Square Design. How would you proceed to analyse such a design ? 15

UPSC Answer Check · Accepted Answer

Begin with (a)(i) by defining orthogonal polynomials through the orthogonality condition Σφᵢ(x)φⱼ(x)=0 for i≠j, summed over the observed x values, then describe how a polynomial of degree p is fitted using the recurrence relation. For (a)(ii), recognise Y'(Iₙ-X(X'X)⁻¹X')Y as the residual sum of squares and apply E[u'Mu]=σ²tr(M) to obtain (n-k)σ². In (b), enumerate all 9 ordered samples drawn with replacement, compute the inclusion probabilities πᵢ=1-(1-pᵢ)², and verify E[Ŷ_HT]=Y. For (c), derive the Latin Square missing value formula ŷ=[t(R+C+T)-2G]/[(t-1)(t-2)] and outline the adjusted ANOVA procedure. Allocate about 40% of the time to (a) and about 30% each to (b) and (c).
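
The recurrence approach for (a)(i) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names `orthogonal_polynomials` and `fit` and the data are hypothetical, and the coefficients aᵣ=Σxφᵣ²/Σφᵣ² and bᵣ=Σφᵣ²/Σφᵣ₋₁² are the usual three-term recurrence choices, which the answer guide does not spell out. Exact fractions are used so that the orthogonality check is not blurred by rounding.

```python
from fractions import Fraction


def orthogonal_polynomials(xs, p):
    """Values of phi_0..phi_p at the points xs, built by the three-term recurrence.

    Assumes p is smaller than the number of distinct x values.
    """
    xs = [Fraction(x) for x in xs]
    phis = [[Fraction(1)] * len(xs)]            # phi_0(x) = 1
    prev_norm = None
    for r in range(p):
        cur = phis[r]
        norm = sum(v * v for v in cur)          # sum of phi_r(x)^2
        a = sum(x * v * v for x, v in zip(xs, cur)) / norm
        if r == 0:
            nxt = [(x - a) * v for x, v in zip(xs, cur)]
        else:
            b = norm / prev_norm
            old = phis[r - 1]
            nxt = [(x - a) * v - b * w for x, v, w in zip(xs, cur, old)]
        phis.append(nxt)
        prev_norm = norm
    return phis


def fit(xs, ys, p):
    """Fit y = sum_r c_r * phi_r(x); each c_r is computed independently of the others."""
    phis = orthogonal_polynomials(xs, p)
    coefs = [
        sum(Fraction(y) * v for y, v in zip(ys, phi)) / sum(v * v for v in phi)
        for phi in phis
    ]
    fitted = [sum(c * phi[i] for c, phi in zip(coefs, phis)) for i in range(len(xs))]
    return phis, coefs, fitted


# Hypothetical data: y = x^2 + 1 on five equally spaced points.
xs = [1, 2, 3, 4, 5]
ys = [2, 5, 10, 17, 26]
phis, coefs, fitted = fit(xs, ys, 2)
print([int(v) for v in phis[1]], [int(v) for v in phis[2]])
print(sum(a * b for a, b in zip(phis[1], phis[2])))   # cross-product of phi_1 and phi_2
```

Because the φᵣ are mutually orthogonal, raising the degree from p to p+1 only adds one new coefficient; the earlier coefficients do not need to be recomputed.
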
- (a)(i) Definition: orthogonal polynomials satisfy Σφᵢ(x)φⱼ(x)=0 for i≠j over the point set; fitting uses the recurrence φᵣ₊₁(x)=(x-aᵣ)φᵣ(x)-bᵣφᵣ₋₁(x) with aᵣ=Σxφᵣ²(x)/Σφᵣ²(x) and bᵣ=Σφᵣ²(x)/Σφᵣ₋₁²(x), and the coefficient of φᵣ in the fitted polynomial is Σyφᵣ(x)/Σφᵣ²(x)
- (a)(ii) Recognition that M=Iₙ-X(X'X)⁻¹X' is the symmetric, idempotent residual maker matrix with MX=0, so that Y'MY=u'Mu; hence E[Y'MY]=E[u'Mu]=σ²tr(M)=σ²(n-k), using tr(Iₙ)=n and tr(X(X'X)⁻¹X')=tr((X'X)⁻¹X'X)=tr(I_k)=k
- (b) Enumeration of 9 ordered samples: (1,1),(1,2),(1,3),(2,1),(2,2),(2,3),(3,1),(3,2),(3,3) with their probabilities pᵢpⱼ; calculation of first-order inclusion probabilities πᵢ=1-(1-pᵢ)², giving π₁=0.51, π₂=0.36, π₃=0.75; verification that E[Ŷ_HT]=Σ(yᵢ/πᵢ)·πᵢ=Σyᵢ=Y=42, where Ŷ_HT sums yᵢ/πᵢ over the distinct units in the sample
- (c) Missing plot technique: Yates' method for estimating missing observations by minimizing error sum of squares; derivation using ∂SSE/∂y=0 for Latin Square layout
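- (c) Latin Square missing value formula: ŷ = (tRᵢ + tCⱼ + tTₖ - 2G) / [(t-1)(t-2)] where t is the order of the square, Rᵢ, Cⱼ, Tₖ are the totals of the known observations in the row, column and treatment containing the missing plot, and G is the grand total of all known observations; the analysis proceeds with the estimate inserted, the error and total degrees of freedom each reduced by 1, and the treatment SS corrected for its upward bias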
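
The claim that the missing value formula minimises the error sum of squares can be checked numerically. The sketch below is an illustration only: the 4×4 layout (treatment index (i+j) mod t), the yields, and the function names are all hypothetical. It uses the standard Latin Square identity SSE = Σy² - (ΣR² + ΣC² + ΣT²)/t + 2G²/t² and compares the SSE at the estimate with the SSE one unit either side.

```python
from fractions import Fraction


def missing_value(y, trt, t, i0, j0):
    """Estimate for the plot at (i0, j0): [t(R + C + T) - 2G] / [(t-1)(t-2)]."""
    cells = [(i, j) for i in range(t) for j in range(t) if (i, j) != (i0, j0)]
    k0 = trt[i0][j0]
    R = sum(y[i][j] for i, j in cells if i == i0)          # row total, known plots only
    C = sum(y[i][j] for i, j in cells if j == j0)          # column total, known plots only
    T = sum(y[i][j] for i, j in cells if trt[i][j] == k0)  # treatment total, known plots only
    G = sum(y[i][j] for i, j in cells)                     # grand total, known plots only
    return Fraction(t * (R + C + T) - 2 * G, (t - 1) * (t - 2))


def sum_sq(values):
    return sum(v * v for v in values)


def error_ss(y, trt, t):
    """Error SS of a complete t x t Latin square."""
    rows = [sum(row) for row in y]
    cols = [sum(y[i][j] for i in range(t)) for j in range(t)]
    trts = [
        sum(y[i][j] for i in range(t) for j in range(t) if trt[i][j] == k)
        for k in range(t)
    ]
    g = sum(rows)
    total = sum_sq(v for row in y for v in row)
    return (total
            - Fraction(1, t) * (sum_sq(rows) + sum_sq(cols) + sum_sq(trts))
            + Fraction(2, t * t) * g * g)


def error_ss_with(y, trt, t, i0, j0, x):
    """Error SS after inserting x into the missing plot."""
    filled = [row[:] for row in y]
    filled[i0][j0] = x
    return error_ss(filled, trt, t)


# Hypothetical data; None marks the missing plot at row 1, column 1.
t = 4
trt = [[(i + j) % t for j in range(t)] for i in range(t)]
y_obs = [[12, 15, 11, 14],
         [13, None, 16, 12],
         [10, 14, 13, 15],
         [16, 12, 14, 11]]

est = missing_value(y_obs, trt, t, 1, 1)
base = error_ss_with(y_obs, trt, t, 1, 1, est)
print(est, base)
print(error_ss_with(y_obs, trt, t, 1, 1, est + 1) - base)
print(error_ss_with(y_obs, trt, t, 1, 1, est - 1) - base)
```

Both differences are positive and equal, which is what one expects from a quadratic in the missing value whose minimum sits at the estimate.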

Dimension	Weight	Max marks	Excellent	Average	Poor
Setup correctness	20%	10	Correctly defines orthogonal polynomials with orthogonality condition in (a)(i); properly identifies M=I-X(X'X)⁻¹X' as idempotent projection matrix in (a)(ii); accurately sets up 9 ordered samples with correct probability structure in (b); correctly states Latin Square model with row, column, treatment effects in (c)	Defines orthogonal polynomials vaguely without orthogonality condition; recognizes residual form but misidentifies matrix properties; lists samples but misses some combinations or probability calculations; mentions Latin Square but confuses with RCBD	Confuses orthogonal polynomials with simple polynomial regression; fails to identify the matrix form entirely; fundamental errors in sample enumeration; no understanding of missing plot purpose
Method choice	20%	10	Uses recurrence relation method for orthogonal polynomial fitting; applies trace operator property tr(AB)=tr(BA) for (a)(ii); employs Horvitz-Thompson estimator with correct inclusion probabilities in (b); uses calculus-based minimization of SSE for missing value derivation in (c)	Describes fitting without clear recurrence; attempts direct expansion for expectation; uses wrong estimator or confuses with SRS; states formula without derivation	No systematic fitting method; brute force matrix multiplication without simplification; completely wrong estimator choice; no derivation attempt for missing value
Computation accuracy	20%	10	Accurate trace calculation yielding (n-k)σ²; correct π₁=0.51, π₂=0.36, π₃=0.75 and verification E[Ŷ_HT]=42; precise algebraic manipulation leading to ŷ=(t(R+C+T)-2G)/((t-1)(t-2)) with correct denominator (t-1)(t-2)	Minor arithmetic errors in trace calculation; small errors in inclusion probabilities or verification; algebraic slips in missing value derivation but correct structure	Major computational errors in expectation; wrong probabilities summing incorrectly; fundamentally wrong missing value formula
Interpretation	20%	10	Explains why orthogonal polynomials reduce multicollinearity and computational instability; interprets (n-k)σ² as expected residual SS with n-k degrees of freedom; explains why H-T estimator is design-unbiased for varying probabilities; clearly distinguishes between error df reduction and treatment SS adjustment in missing plot analysis	States computational advantages without explaining multicollinearity; notes answer without degrees of freedom interpretation; states unbiasedness without design-based reasoning; mentions df reduction without explaining why	No interpretation of computational benefits; no statistical meaning attached to results; no understanding of design-based inference; confused about analysis adjustments
Final answer & units	20%	10	Clear final answers: (a)(ii) E[·]=(n-k)σ²; (b) verified Ŷ_HT unbiased with total production 42 ('000 tons); (c) complete missing value formula with analysis steps including adjusted treatment SS and reduced error df by 1	Correct answers but missing units or incomplete final expressions; partial analysis description; missing some components of the final answer	Missing or wrong final answers; no units where relevant; incomplete or incorrect analysis procedure
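
The inclusion probabilities and the expectation quoted in the rubric for part (b) can be reproduced by brute-force enumeration. The following is a minimal sketch using the data from the question; the variable names are arbitrary, and the Horvitz-Thompson estimate is taken over the distinct units in each ordered sample, which is the assumption behind πᵢ=1-(1-pᵢ)².

```python
from fractions import Fraction
from itertools import product

# Data from the question: selection probabilities and wheat production ('000 tons).
p = {1: Fraction(3, 10), 2: Fraction(2, 10), 3: Fraction(5, 10)}
y = {1: 11, 2: 6, 3: 25}
n = 2  # draws, with replacement

# First-order inclusion probability: unit i appears in at least one of the n draws.
pi = {i: 1 - (1 - p[i]) ** n for i in p}

expectation = Fraction(0)
for sample in product(p, repeat=n):            # all 9 ordered samples
    prob = Fraction(1)
    for unit in sample:
        prob *= p[unit]                        # independent draws
    estimate = sum(Fraction(y[i]) / pi[i] for i in set(sample))  # distinct units only
    expectation += prob * estimate
    print(sample, prob, estimate)

print(pi)
print(expectation, sum(y.values()))
```

The last line prints the expected value of the estimator next to the true total, so the two can be compared directly.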

Q8

Directive word: Derive


More from Statistics 2022 Paper I