Question

(a) Define general linear model with usual assumptions. If y₁ = β₁ + u₁, y₂ = –β₁ + β₂ + u₂, y₃ = –β₂ + u₃, where u₁, u₂, u₃ are mutually independent random variables with mean zero and variance σ², then find the least square estimators of β₁ and β₂. (10 marks)

(b) Given X ~ N₃(μ, Σ), where μ = (2, 4, 3)' and
    ⎛8  2  3⎞
Σ = ⎜2  4  1⎟
    ⎝3  1  3⎠
(i) find the regression function of X₁ on X₂ and X₃, and (ii) compute the conditional variance of X₁ given X₂ and X₃. (10 marks)

(c) What is a uniformity trial? Explain how it can be used to determine optimum shape and size. (10 marks)

(d) In a 2⁶ factorial experiment, the key block is given as: (1), ab, cd, ef, ace, abef, abcd, bce, cdef, acf, ade, abcdef, bde, bcf, adf, bdf. Identify the confounded effects. (10 marks)

(e) If the coefficients of variation of x and y are equal and the correlation coefficient between x and y is ρ = 2/3, compute the efficiency of ratio estimator relative to the mean of a simple random sample. (10 marks)

UPSC Answer Check · Accepted Answer

This is a computational-cum-descriptive question requiring precise derivations and calculations across five sub-parts. Allocate approximately 20% of the time to part (a) for the matrix formulation of the GLM and the LSE derivation, 20% to part (b) for multivariate normal conditional distributions, 15% to part (c) for explaining uniformity trials in an agricultural field trial context, 25% to part (d) for systematic identification of confounded effects in the 2⁶ factorial, and 20% to part (e) for the ratio estimator efficiency computation. Begin each part with a clear statement of the method, show all computational steps, and conclude with boxed final answers.
- Part (a): Correct matrix formulation of GLM y = Xβ + u with assumptions E(u)=0, Var(u)=σ²I and X of full column rank; proper construction of design matrix X (rows (1, 0), (–1, 1), (0, –1)) and derivation of LSE β̂ = (X'X)⁻¹X'y yielding β̂₁ = (2y₁ - y₂ - y₃)/3 and β̂₂ = (y₁ + y₂ - 2y₃)/3
- Part (b)(i): Correct partitioning of Σ into Σ₁₁, Σ₁₂, Σ₂₁, Σ₂₂ and computation of regression coefficients β = Σ₁₂Σ₂₂⁻¹ for E(X₁|X₂,X₃) = μ₁ + Σ₁₂Σ₂₂⁻¹(x₂-μ₂, x₃-μ₃)'
- Part (b)(ii): Computation of conditional variance Var(X₁|X₂,X₃) = Σ₁₁ - Σ₁₂Σ₂₂⁻¹Σ₂₁ using Schur complement
- Part (c): Definition of uniformity trial as trial with uniform treatment to assess field variability; explanation of how coefficient of variation and soil heterogeneity index guide selection of plot shape (long narrow for fertility gradient) and size (balancing variance reduction vs cost)
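- Part (d): Systematic identification of confounded effects by finding the interactions that have an even number of letters in common with every treatment combination in the key block; recognition that a key block of 16 out of 64 combinations means 4 blocks, so two independent effects and their generalized interaction are confounded, namely ABCD, ABEF and CDEF (ABCD × ABEF = CDEF)
- Part (e): Application of the large-sample ratio estimator efficiency formula RE = Cᵧ²/(Cᵧ² + Cₓ² - 2ρCₓCᵧ) = 1/(1 + Cₓ²/Cᵧ² - 2ρCₓ/Cᵧ); with Cₓ = Cᵧ this simplifies to 1/(2(1 - ρ)), giving RE = 3/2 (150%) for ρ = 2/3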
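The matrix computations named for parts (a) and (b) can be checked numerically. The following is a minimal sketch using only the Python standard library with exact fractions; the helper `inv2` and all variable names are illustrative choices, not part of the question or of any prescribed solution.

```python
from fractions import Fraction as F

def inv2(m):
    """Invert a 2x2 matrix stored as nested lists (hypothetical helper)."""
    (a, b), (c, d) = m
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

# Part (a): rows of X follow y1 = b1 + u1, y2 = -b1 + b2 + u2, y3 = -b2 + u3.
X = [[F(1), F(0)], [F(-1), F(1)], [F(0), F(-1)]]
XtX = [[sum(row[i] * row[j] for row in X) for j in range(2)] for i in range(2)]
XtX_inv = inv2(XtX)
# Row i of (X'X)^-1 X' holds the weights on (y1, y2, y3)
# in the least squares estimator of beta_(i+1).
lse_weights = [[sum(XtX_inv[i][j] * row[j] for j in range(2)) for row in X]
               for i in range(2)]

# Part (b): partition Sigma with X1 first and (X2, X3) second.
mu = [F(2), F(4), F(3)]
Sigma = [[F(8), F(2), F(3)], [F(2), F(4), F(1)], [F(3), F(1), F(3)]]
S11 = Sigma[0][0]
S12 = Sigma[0][1:]
S22_inv = inv2([r[1:] for r in Sigma[1:]])
# Regression coefficients Sigma12 * Sigma22^-1 (a row vector).
coef = [sum(S12[i] * S22_inv[i][j] for i in range(2)) for j in range(2)]
# Intercept of E(X1 | x2, x3) = intercept + coef[0]*x2 + coef[1]*x3.
intercept = mu[0] - sum(c * m for c, m in zip(coef, mu[1:]))
# Conditional variance: Sigma11 - Sigma12 * Sigma22^-1 * Sigma21.
cond_var = S11 - sum(c * s for c, s in zip(coef, S12))

print(lse_weights)
print(coef, intercept, cond_var)
```

Under these inputs the weights come out as (2/3, –1/3, –1/3) for β̂₁ and (1/3, 1/3, –2/3) for β̂₂, the regression function as E(X₁|x₂, x₃) = –20/11 + (3/11)x₂ + (10/11)x₃, and the conditional variance as 52/11.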

Dimension	Weight	Max marks	Excellent	Average	Poor
Setup correctness	20%	10	Correctly writes matrix form y = Xβ + u for (a) with explicit X matrix; properly partitions Σ for (b); defines uniformity trial with agricultural context for (c); identifies 2⁶ factorial structure for (d); states ratio estimator assumptions for (e)	Partially correct setup with minor matrix dimension errors or missing some assumptions; incomplete Σ partitioning; vague uniformity trial definition; recognizes factorial but misses confounding structure; states efficiency formula with errors	Incorrect model specification, wrong matrix dimensions, or missing crucial assumptions; fails to partition multivariate normal; no agricultural context; misunderstands factorial design; wrong efficiency formula
Method choice	20%	10	Uses normal equations/LSE principle for (a); applies conditional distribution theory for multivariate normal in (b); employs CV and heterogeneity analysis for (c); uses generalized interaction method for (d); applies correct relative efficiency derivation for (e)	Correct broad method but inefficient approach or missing optimization steps; some confusion between regression and conditional variance methods; basic uniformity explanation without optimization linkage; trial-and-error for confounding; correct formula but messy derivation	Wrong method entirely (e.g., MLE instead of LSE, simple linear regression instead of conditional MVN); no method for uniformity optimization; random guessing for confounded effects; incorrect efficiency comparison base
Computation accuracy	20%	10	Accurate matrix inversion (X'X)⁻¹ for (a); correct Σ₂₂⁻¹ and Schur complement for (b); correct numerical values for regression coefficients and conditional variance; accurate identification of all confounded effects in (d); precise efficiency value for (e)	Minor arithmetic errors in matrix operations; correct method but wrong final numbers in one part; partial identification of confounded effects; correct formula substitution but calculation error in final efficiency	Major computational errors in matrix algebra; wrong inverse or determinant; completely wrong conditional variance; fails to identify confounding structure; nonsensical efficiency value
Interpretation	20%	10	Interprets LSE as BLUE under Gauss-Markov for (a); explains regression surface geometry for (b); links uniformity trials to Indian agricultural experiments (e.g., IARI field trials) for (c); explains practical consequences of confounding for (d); interprets efficiency value for survey design decisions in (e)	Basic interpretation without theoretical depth; mentions BLUE but doesn't explain; superficial uniformity application; notes confounding without explaining aliasing consequences; states efficiency >1 or <1 without practical meaning	No interpretation of results; fails to connect to statistical theory or practical applications; meaningless or wrong interpretation of efficiency; ignores consequences of confounding
Final answer & units	20%	10	Clear boxed final answers: explicit β̂₁, β̂₂ formulas for (a); numerical regression function and conditional variance for (b); specific shape/size recommendations for (c); complete list of confounded effects for (d); exact efficiency ratio for (e) with statement of superiority/inferiority	Final answers present but poorly formatted; one missing answer; incomplete list of confounded effects; efficiency without comparison statement; missing units where relevant	Missing or wrong final answers; incomplete solutions; no conclusion on which estimator is better in (e); fails to identify any confounded effects
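The confounded effects in (d) and the efficiency in (e) can be cross-checked by brute force. This is a sketch under stated assumptions: it uses the even-overlap rule (an effect is confounded if it shares an even number of letters with every treatment combination in the key block) and the usual first-order approximation to the variance of the ratio estimator; the variable names are illustrative.

```python
from fractions import Fraction
from itertools import combinations

# Part (d): key block from the question; "" stands for the combination (1).
key_block = ["", "ab", "cd", "ef", "ace", "abef", "abcd", "bce", "cdef",
             "acf", "ade", "abcdef", "bde", "bcf", "adf", "bdf"]
factors = "abcdef"

confounded = []
for r in range(1, len(factors) + 1):
    for combo in combinations(factors, r):
        effect = set(combo)
        # Even overlap with every key-block combination => confounded with blocks.
        if all(len(effect & set(t)) % 2 == 0 for t in key_block):
            confounded.append("".join(combo).upper())

# Part (e): relative efficiency of the ratio estimator versus the SRS mean,
# 1 / (1 + (Cx/Cy)^2 - 2*rho*(Cx/Cy)), a large-sample approximation.
rho = Fraction(2, 3)
cv_ratio = Fraction(1)  # Cx / Cy, equal coefficients of variation
efficiency = 1 / (1 + cv_ratio**2 - 2 * rho * cv_ratio)

print(confounded)
print(efficiency)
```

With the given key block the search returns ABCD, ABEF and CDEF, and the efficiency evaluates to 3/2, so the ratio estimator is about 50% more efficient than the simple random sample mean in this case.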

Q5

Directive word: Solve

More from Statistics 2022 Paper I