Question

(a) (i) In stratified sampling under optimum allocation, how will you proceed to select units from different strata, if one or more nᵢ's happens to be greater than Nᵢ (i ≥ 2) ?

(ii) A sample survey was conducted in a certain district of Himachal Pradesh. Four strata A, B, C and D of villages were formed according to the acreage of fruit trees as obtained from revenue records. A random sample of villages was selected from each stratum and the number of apple orchards in each selected village was noted. The data are shown below :

| Stratum | Total number of villages (Nᵢ) | Number of villages in sample (nᵢ) | Number of orchards in the selected villages |
|---------|------------------------------|-----------------------------------|---------------------------------------------|
| A (0 – 3 acres) | 275 | 15 | 2, 5, 1, 9, 6, 7, 0, 4, 7, 0, 5, 0, 0, 3, 0 |
| B (3 – 6 acres) | 146 | 10 | 21, 11, 7, 5, 6, 19, 5, 24, 30, 24 |
| C (6 – 15 acres) | 93 | 12 | 3, 10, 4, 11, 38, 11, 4, 46, 4, 18, 1, 39 |
| D (15 acres and above) | 62 | 11 | 30, 42, 20, 38, 29, 22, 31, 28, 66, 14, 15 |

Estimate the number of orchards in the district.

(b) (i) For a second order polynomial model with one predictor variable, derive the least squares normal equations clearly stating the conditions assumed. How will you interpret the parameters in this model ?

(ii) Describe why it is recommended to work with predictor variables centred around the mean. Comment on fitted values of the response variable in this case. Prove your claim.

(c) What are split-plot designs ? When do you recommend the use of such designs ? If e₁ and e₂ are the main plot and sub-plot errors respectively, both estimated in units of a single sub-plot, explain why e₁ is expected to be larger than e₂.

UPSC Answer Check · Accepted Answer

This multi-part question demands solving numerical problems alongside theoretical derivations and explanations. Allocate approximately 35% effort to part (a) combining optimum allocation adjustment and stratified estimation with Himachal Pradesh data; 35% to part (b) covering polynomial regression derivation, centering benefits, and proof; and 30% to part (c) explaining split-plot designs with error comparison. Structure as: brief theoretical setup → step-by-step calculations/derivations → interpretation of results in context.
- For (a)(i): Explain the iterative adjustment procedure when nᵢ > Nᵢ in optimum allocation—set nᵢ = Nᵢ for such strata, recompute allocation for remaining strata using revised formula, and repeat until all nᵢ ≤ Nᵢ
- For (a)(ii): Calculate the stratum sample means ȳᵢ, weight each by its stratum size Nᵢ (the sample sizes are already given, so no allocation step is needed), compute the stratified estimate Ŷ = ΣNᵢȳᵢ with its standard error, and present the final estimate of total orchards in the district
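- For (b)(i): Derive normal equations for Y = β₀ + β₁X + β₂X² by minimizing Σ(Yᵢ - β₀ - β₁Xᵢ - β₂Xᵢ²)² with respect to β₀, β₁ and β₂; interpret β₀ as the mean response at X = 0, β₁ as the slope of the curve at X = 0 (linear effect), and β₂ as the curvature (quadratic effect)
- For (b)(ii): Explain that centering (X - X̄) reduces the correlation between the linear and quadratic terms (and removes it when the X values are symmetric about X̄), which stabilizes the variance-covariance matrix of the estimates; prove the fitted values remain identical by algebraic expansion, showing the centred model is a one-to-one reparameterization of the original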
- For (c): Define split-plot designs as experiments with two sizes of experimental units where whole plots receive one factor and sub-plots receive another; recommend when one factor is harder/costlier to change; explain e₁ > e₂ due to additional whole-plot error component from main plot-to-main plot variation
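
The iterative adjustment described for (a)(i) can be sketched in Python as follows. This is a minimal illustration, not a prescribed solution: the function name `capped_neyman_allocation`, the stratum sizes, the standard deviations and the total sample size are all hypothetical, and the stratum standard deviations Sᵢ are assumed known and positive. The allocations are returned as real numbers; rounding to integers is left out.

```python
def capped_neyman_allocation(N, S, n):
    """Neyman (optimum) allocation with n_i <= N_i enforced iteratively.

    N: stratum sizes; S: stratum standard deviations (assumed known, > 0);
    n: total sample size. Returns real-valued allocations n_i.
    """
    if len(N) != len(S):
        raise ValueError("N and S must have the same length")
    if n > sum(N):
        raise ValueError("total sample size exceeds population size")
    k = len(N)
    capped = set()  # strata that are enumerated completely (n_i = N_i)
    while True:
        # Sample size still to be spread over the strata not yet capped.
        remaining = n - sum(N[i] for i in capped)
        denom = sum(N[i] * S[i] for i in range(k) if i not in capped)
        alloc = [
            float(N[i]) if i in capped else remaining * N[i] * S[i] / denom
            for i in range(k)
        ]
        # Any uncapped stratum whose allocation exceeds its size gets capped.
        over = [i for i in range(k) if i not in capped and alloc[i] > N[i]]
        if not over:
            return alloc
        capped.update(over)


if __name__ == "__main__":
    # Hypothetical strata: the third is small but highly variable.
    print(capped_neyman_allocation([400, 300, 20], [2.0, 3.0, 60.0], 100))
```

In the hypothetical example the first pass gives the third stratum more than its 20 units, so it is taken in full and the remaining 80 units are re-allocated between the other two strata in proportion to NᵢSᵢ.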
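
The derivation and proof expected in part (b) can be outlined as below. The notation is an assumption made for this sketch: $x_i$ for the predictor values, $z_i = x_i - \bar{x}$ for the centred predictor, $\gamma_j$ for the coefficients of the centred model, and $A$ for the matrix linking the two design matrices. The errors are assumed to have zero mean, constant variance and no correlation, and at least three distinct $x_i$ are assumed so that the solution is unique.

```latex
% Model and least squares criterion
Y_i = \beta_0 + \beta_1 x_i + \beta_2 x_i^2 + \varepsilon_i, \qquad
Q = \sum_{i=1}^{n} \left( Y_i - \beta_0 - \beta_1 x_i - \beta_2 x_i^2 \right)^2 .

% Setting \partial Q / \partial \beta_j = 0 for j = 0, 1, 2 gives the normal equations
\begin{aligned}
\sum Y_i       &= n \hat\beta_0 + \hat\beta_1 \sum x_i + \hat\beta_2 \sum x_i^2 , \\
\sum x_i Y_i   &= \hat\beta_0 \sum x_i + \hat\beta_1 \sum x_i^2 + \hat\beta_2 \sum x_i^3 , \\
\sum x_i^2 Y_i &= \hat\beta_0 \sum x_i^2 + \hat\beta_1 \sum x_i^3 + \hat\beta_2 \sum x_i^4 .
\end{aligned}

% Centred model: z_i = x_i - \bar{x}
\gamma_0 + \gamma_1 z_i + \gamma_2 z_i^2
  = \left( \gamma_0 - \gamma_1 \bar{x} + \gamma_2 \bar{x}^2 \right)
  + \left( \gamma_1 - 2 \gamma_2 \bar{x} \right) x_i + \gamma_2 x_i^2 ,
\quad\text{so}\quad
\beta_2 = \gamma_2 , \;\;
\beta_1 = \gamma_1 - 2 \gamma_2 \bar{x} , \;\;
\beta_0 = \gamma_0 - \gamma_1 \bar{x} + \gamma_2 \bar{x}^2 .

% Invariance of fitted values: Z = XA with A nonsingular
Z (Z^{\top} Z)^{-1} Z^{\top}
  = X A \left( A^{\top} X^{\top} X A \right)^{-1} A^{\top} X^{\top}
  = X (X^{\top} X)^{-1} X^{\top}
\;\Longrightarrow\; \hat{Y}_{\text{centred}} = \hat{Y}_{\text{original}} .
```

Because the map between $(\beta_0, \beta_1, \beta_2)$ and $(\gamma_0, \gamma_1, \gamma_2)$ is one-to-one, both models span the same column space, so the projection of $Y$ onto it, and hence every fitted value, is unchanged.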

| Dimension | Weight | Max marks | Excellent | Average | Poor |
|-----------|--------|-----------|-----------|---------|------|
| Setup correctness | 20% | 10 | Correctly identifies all components: for (a), recognizes when optimum allocation gives nᵢ > Nᵢ and applies the iterative adjustment; for (b), properly specifies the polynomial model assumptions (zero-mean, independent, homoscedastic errors, with normality needed only for inference); for (c), accurately distinguishes whole-plot and sub-plot error structures | Basic identification of components with minor errors in stating conditions or one missing assumption; partial recognition of the nᵢ > Nᵢ problem without clear resolution steps | Misidentifies key elements: confuses optimum with proportional allocation, omits essential assumptions for least squares, or fails to distinguish error types in split-plot design |
| Method choice | 20% | 10 | Selects appropriate methodology throughout: iterative Neyman allocation adjustment for (a)(i), stratified estimation of the total with Nᵢ weights for (a)(ii), matrix/algebraic derivation approach for (b), and clear experimental design principles for (c) | Generally correct methods with suboptimal choices, e.g., using proportional instead of optimum allocation, or a descriptive rather than rigorous proof approach for centering | Incorrect methods selected: simple random sampling formulas for stratified data, ordinary regression without polynomial terms, or confused error structure explanation |
| Computation accuracy | 20% | 10 | Precise calculations: correct stratum means (A: 3.27, B: 15.2, C: 15.75, D: 30.45), accurate weighted total estimate (about 6,470 orchards), correct normal equation derivation with proper partial derivatives, and valid algebraic proof for centering invariance | Minor computational slips: arithmetic errors in stratum totals, slightly incorrect weights, or algebraic omissions in the derivation that don't invalidate the core logic | Major computational errors: wrong stratum means, incorrect finite population correction application, fundamentally flawed normal equations, or invalid proof structure |
| Interpretation | 20% | 10 | Rich contextual interpretation: explains why iterative allocation preserves optimality, interprets the Himachal Pradesh estimate in an agricultural planning context, clarifies the practical meaning of polynomial parameters (turning points, rates), and relates e₁ > e₂ to precision trade-offs in agricultural experiments | Adequate interpretation with limited context: mechanical parameter definitions without practical insight, or generic statements about split-plots without experimental relevance | Missing or incorrect interpretation: fails to explain what parameters mean for policy/management, or provides no practical significance for error differences |
| Final answer & units | 20% | 10 | Complete, labeled answers: explicit final orchard estimate with standard error and confidence interpretation for (a); clear statement that fitted values are invariant under centering for (b); precise recommendation conditions and error inequality justification for (c) | Present but incomplete answers: numerical estimate without uncertainty measure, stated invariance without proof completion, or partial error comparison | Missing final answers, wrong units (orchards vs. orchards per village), or completely unjustified claims about error relationships |
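
The arithmetic for part (a)(ii) can be checked with a short Python sketch using the data from the question. The names `strata` and `stratified_total` are illustrative, and the standard error uses the usual stratified formula with finite population correction, Σ Nᵢ(Nᵢ - nᵢ)sᵢ²/nᵢ, which is an assumption about what the examiner expects.

```python
from math import sqrt
from statistics import mean, variance

# Stratum label -> (N_i, orchards counted in the sampled villages)
strata = {
    "A": (275, [2, 5, 1, 9, 6, 7, 0, 4, 7, 0, 5, 0, 0, 3, 0]),
    "B": (146, [21, 11, 7, 5, 6, 19, 5, 24, 30, 24]),
    "C": (93, [3, 10, 4, 11, 38, 11, 4, 46, 4, 18, 1, 39]),
    "D": (62, [30, 42, 20, 38, 29, 22, 31, 28, 66, 14, 15]),
}


def stratified_total(strata):
    """Return (estimated total, standard error) for a stratified sample."""
    total = 0.0
    var_total = 0.0
    for N, y in strata.values():
        n = len(y)
        total += N * mean(y)  # N_i * ybar_i
        # variance() is the sample variance with divisor n - 1.
        var_total += N * (N - n) * variance(y) / n
    return total, sqrt(var_total)


if __name__ == "__main__":
    for label, (N, y) in strata.items():
        print(label, round(mean(y), 2))
    estimate, se = stratified_total(strata)
    print(round(estimate), round(se))
```

The stratum contributions Nᵢȳᵢ are about 898.3, 2219.2, 1464.8 and 1888.2, giving an estimated total of about 6,470 orchards in the district.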

Q8

Directive word: Solve


More from Statistics 2021 Paper I