Question

(a) For a two-variable linear regression model Yᵢ = a + bXᵢ + eᵢ, where E(eᵢ) = 0, Var(eᵢ) = σ²ₑ, Cov(eᵢ, eⱼ) = 0 for i ≠ j, (i,j) ∈ {1, 2, ..., n}, if â and b̂ are the least squares estimators of a and b respectively, derive expressions for Var(â), Var(b̂) and Cov(â, b̂). 10 marks

(b) Let X = (X₁ X₂ X₃)' ~ N₃(μ, Σ), where μ = (1 2 1)' and Σ = [[9, 2, 2], [2, 3, 0], [2, 0, 2]]. Find the joint distribution of Y₁ = X₁ + X₂ + X₃ and Y₂ = X₂ - X₃. 10 marks

(c) If X₁, X₂, ..., Xₙ is a random sample from a standard normal population, then using quadratic forms show that the sample mean X̄ = (1/n)∑ⱼ₌₁ⁿ Xⱼ and sample variance S² = [1/(n-1)]∑ⱼ₌₁ⁿ(Xⱼ - X̄)² are stochastically independent. 10 marks

(d) Assume that in a population of very large number of items, proportion of defective items is 0·30. What should be the size of the sample, if a simple random sample is to be drawn from this population to estimate the percent defective within 2 percent of the true value with 95·5 percent probability? [Given P(0 ≤ Z ≤ 1·96) = 0·475; and P(0 ≤ Z ≤ 2·005) = 0·4775]. 10 marks

(e) How do the size and shape of plots and blocks affect the results of field experiments? 10 marks

UPSC Answer Check · Accepted Answer

This question demands rigorous derivation and calculation across five statistical sub-parts. Allocate roughly 20% of the time to each: for (a), derive Var(â), Var(b̂) and Cov(â, b̂) using the matrix or summation approach; for (b), apply the linear-transformation property of the multivariate normal; for (c), use quadratic forms and Cochran's theorem; for (d), solve the sample size formula for proportions; for (e), discuss experimental design principles with Indian agricultural examples such as IARI field trials. Present each part distinctly with clear labeling.
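As a quick sanity check on the matrix approach for part (a), the sketch below compares σ²ₑ(X'X)⁻¹ with the closed-form summation expressions on a small design. The X values and σ²ₑ = 1 are made-up illustration inputs, and the function names are hypothetical, not part of the question.

```python
from fractions import Fraction


def ols_variances_matrix(xs, sigma2=Fraction(1)):
    """Var(a_hat), Var(b_hat), Cov(a_hat, b_hat) read off sigma2 * (X'X)^{-1}.

    For the design matrix with a column of ones and a column of x values,
    X'X = [[n, sum x], [sum x, sum x^2]], so the 2x2 inverse is explicit.
    """
    xs = [Fraction(x) for x in xs]
    n = len(xs)
    sum_x = sum(xs)
    sum_x2 = sum(x * x for x in xs)
    det = n * sum_x2 - sum_x * sum_x  # equals n * sum((x - xbar)^2)
    if det == 0:
        raise ValueError("all x values are equal; the slope is not estimable")
    var_a = sigma2 * sum_x2 / det
    var_b = sigma2 * n / det
    cov_ab = -sigma2 * sum_x / det
    return var_a, var_b, cov_ab


def ols_variances_closed_form(xs, sigma2=Fraction(1)):
    """The same three quantities from the summation formulas."""
    xs = [Fraction(x) for x in xs]
    n = len(xs)
    xbar = sum(xs) / n
    sxx = sum((x - xbar) ** 2 for x in xs)
    var_a = sigma2 * sum(x * x for x in xs) / (n * sxx)
    var_b = sigma2 / sxx
    cov_ab = -sigma2 * xbar / sxx
    return var_a, var_b, cov_ab


# Hypothetical design points, chosen only for illustration.
design = [1, 2, 3, 4, 5]
print(ols_variances_matrix(design))
print(ols_variances_closed_form(design))
```

Both routes agree because the determinant of X'X equals n∑(Xᵢ-X̄)², which is the step that turns the matrix inverse into the summation formulas.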
- Part (a): Derivation of Var(â) = σ²ₑ[∑Xᵢ²/(n∑(Xᵢ-X̄)²)], Var(b̂) = σ²ₑ/∑(Xᵢ-X̄)², and Cov(â,b̂) = -σ²ₑX̄/∑(Xᵢ-X̄)² using normal equations or matrix approach
- Part (b): Application of linear transformation Y = AX where A = [[1,1,1],[0,1,-1]] to obtain Y ~ N₂(Aμ, AΣA') with computed mean (4,1)' and covariance matrix [[22,1],[1,5]], since Var(Y₁) = 9 + 3 + 2 + 2(2 + 2 + 0) = 22
- Part (c): Decomposition of total sum of squares using idempotent matrices, showing Q₁ = nX̄² and Q₂ = (n-1)S² are independent via rank additivity and Cochran's theorem
- Part (d): Sample size calculation n = Z²ₐ/₂ p(1-p)/d² with p=0.30, d=0.02, Z=2.005, giving n = 4.020025 × 0.21 / 0.0004 ≈ 2110.5, rounded up to n = 2111
- Part (e): Discussion of plot size effects on soil heterogeneity control, shape effects on border bias, and block arrangement for local control with reference to Indian agricultural experiments like varietal trials at IARI
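The numerical results in parts (b) and (d) can be recomputed with a few lines of plain Python. This is a minimal sketch using exact rational arithmetic; the helper names (`matmul`, `transpose`, `sample_size`) are illustrative, and d = 0.02 assumes the 2 percent margin is an absolute margin in percentage points, as in the key points above.

```python
import math
from fractions import Fraction


def matmul(p, q):
    """Product of two matrices given as lists of rows."""
    return [[sum(p[i][k] * q[k][j] for k in range(len(q)))
             for j in range(len(q[0]))]
            for i in range(len(p))]


def transpose(m):
    """Transpose of a matrix given as a list of rows."""
    return [list(row) for row in zip(*m)]


# Part (b): Y = AX, so Y ~ N2(A mu, A Sigma A').
A = [[1, 1, 1],
     [0, 1, -1]]
mu = [[1], [2], [1]]
Sigma = [[9, 2, 2],
         [2, 3, 0],
         [2, 0, 2]]

mean_Y = matmul(A, mu)
cov_Y = matmul(matmul(A, Sigma), transpose(A))
print("mean of Y:", mean_Y)
print("covariance of Y:", cov_Y)


def sample_size(p, d, z):
    """Smallest integer n with n >= z^2 * p * (1 - p) / d^2.

    Arguments are decimal strings so that Fraction keeps them exact.
    """
    p, d, z = Fraction(p), Fraction(d), Fraction(z)
    return math.ceil(z * z * p * (1 - p) / (d * d))


# Part (d): p = 0.30, d = 0.02, Z = 2.005 (95.5 percent two-sided).
print("sample size:", sample_size("0.30", "0.02", "2.005"))
```

Rounding up rather than to the nearest integer keeps the achieved margin no larger than d.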

Dimension	Weight	Max marks	Excellent	Average	Poor
Setup correctness	20%	10	Correctly states all model assumptions for (a), identifies transformation matrix for (b), specifies standard normal setup for (c), identifies parameters p=0.30, d=0.02, Z=2.005 for (d), and defines plot/block terminology for (e)	States most assumptions correctly but misses independence of errors in (a) or uses wrong Z-value in (d); vague on experimental design terms in (e)	Wrong model specification, confuses parameters, or fundamentally misunderstands the statistical setup in multiple parts
Method choice	20%	10	Uses optimal methods: normal equations/matrix algebra for (a), linear transformation property of MVN for (b), Cochran's theorem/quadratic forms for (c), standard sample size formula for (d), and principles of local control/randomization for (e)	Correct general approach but inefficient or partially correct methods; e.g., direct differentiation instead of matrix approach in (a), or missing Cochran's theorem in (c)	Wrong methodological approach such as treating (b) as univariate, ignoring quadratic forms in (c), or using descriptive statistics instead of design principles in (e)
Computation accuracy	20%	10	All algebraic manipulations correct: proper handling of ∑(Xᵢ-X̄)² terms in (a), accurate matrix multiplication AΣA' in (b), correct idempotent matrix ranks in (c), precise calculation n=2110.51→2111 in (d)	Minor computational errors like arithmetic mistakes in matrix multiplication or rounding errors in sample size, but methodologically sound	Major computational errors: wrong variance expressions, incorrect covariance matrix elements, or sample size off by factor of 10 or more
Interpretation	20%	10	Interprets negative covariance in (a) as precision trade-off, explains why Y₁,Y₂ are correlated in (b), clarifies why independence requires normality in (c), discusses practical implications of large sample in (d), and relates plot size to fertility gradients in Indian soils for (e)	Basic interpretation without deeper insight; mentions correlation exists but doesn't explain geometric intuition or practical consequences	No interpretation provided or completely wrong interpretation of statistical results across parts
Final answer & units	20%	10	All final answers clearly stated: explicit variance-covariance formulas in (a), complete N₂ distribution specification in (b), clear independence demonstration in (c), sample size n=2111 with justification in (d), and actionable recommendations for field experiment design in (e)	Final answers present but incomplete or poorly formatted; missing distribution parameters or vague recommendations	Missing final answers, wrong units (e.g., percent vs proportion confusion in d), or incomplete distribution specifications
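For part (c), the algebraic conditions behind the quadratic-form argument can be checked numerically. The sketch below builds A₁ = (1/n)J and A₂ = I - (1/n)J for an arbitrary n (n = 4 here is only an illustration), then confirms idempotency, A₁A₂ = 0, and ranks 1 and n-1 via the trace. It checks the matrix conditions only; independence of nX̄² and (n-1)S² then follows from the theory for normal samples. All helper names and the sample vector are hypothetical.

```python
from fractions import Fraction


def matmul(p, q):
    """Product of two square matrices given as lists of rows."""
    size = len(p)
    return [[sum(p[i][k] * q[k][j] for k in range(size)) for j in range(size)]
            for i in range(size)]


def trace(m):
    """Trace of a square matrix; equals the rank when m is idempotent."""
    return sum(m[i][i] for i in range(len(m)))


def quad_form(x, m):
    """Quadratic form x' m x."""
    size = len(x)
    return sum(x[i] * m[i][j] * x[j] for i in range(size) for j in range(size))


n = 4  # illustrative sample size
A1 = [[Fraction(1, n) for _ in range(n)] for _ in range(n)]   # (1/n) J
A2 = [[(1 if i == j else 0) - A1[i][j] for j in range(n)]     # I - (1/n) J
      for i in range(n)]
zero = [[0] * n for _ in range(n)]

print("A1 idempotent:", matmul(A1, A1) == A1)
print("A2 idempotent:", matmul(A2, A2) == A2)
print("A1 A2 = 0:", matmul(A1, A2) == zero)
print("ranks:", trace(A1), trace(A2))

# X'A1X = n * Xbar^2 and X'A2X = sum (Xj - Xbar)^2 = (n - 1) * S^2.
x = [1, 2, 3, 6]  # made-up observations
print("Q1, Q2:", quad_form(x, A1), quad_form(x, A2))
```

The two quadratic forms add up to ∑Xⱼ², and the ranks add up to n, which is the rank-additivity condition used with Cochran's theorem.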

Q5

Directive word: Derive

More from Statistics 2025 Paper I