Q8
(a) (i) What are orthogonal polynomials? How do you fit an orthogonal polynomial of degree 'p'? (ii) For the model Y_(n×1) = X_(n×k) β_(k×1) + u_(n×1), E(uu') = σ² Iₙ, where X_(n×k) is a matrix of rank k (k < n), find the value of E[Y'(Iₙ - X(X'X)⁻¹X')Y]. 10+10=20

(b) Consider an artificial population of three farms. Their selection probabilities and wheat production (in '000 tons) are as follows:

| Farm unit (i) | 1 | 2 | 3 |
|---|---|---|---|
| Selection probability (pᵢ) | 0.3 | 0.2 | 0.5 |
| Wheat production (yᵢ) | 11 | 6 | 25 |

Draw all possible samples of size 2 with replacement (order is to be considered). Show that the Horvitz-Thompson estimator of total wheat production is unbiased. 15

(c) What is the missing plot technique? Derive the missing value formula for a Latin Square Design. How would you proceed to analyse such a design? 15
Directive word: Derive
This question asks you to derive, i.e. obtain the required results from first principles, showing every step. The directive word signals the depth of analysis expected, the structure of your answer, and the weight of evidence you must bring.
See our UPSC directive words guide for a full breakdown of how to respond to each command word.
How this answer will be evaluated
Approach
Begin with (a)(i), defining orthogonal polynomials via the orthogonality condition Σφᵢ(x)φⱼ(x)=0 for i≠j, then describe the three-term recurrence method for fitting a polynomial of degree p. For (a)(ii), recognize the quadratic form as the residual sum of squares and apply E[u'Mu]=σ²tr(M) to obtain (n-k)σ². In (b), enumerate all 9 ordered samples with replacement, compute the inclusion probabilities πᵢ=1-(1-pᵢ)², and verify E[Ŷ_HT]=Y. For (c), derive the Latin Square missing value formula ŷ=[t(R+C+T)-2G]/[(t-1)(t-2)] and outline the adjusted ANOVA procedure. Allocate ~40% of your time to (a) and ~30% each to (b) and (c).
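The trace step in (a)(ii) is easy to sanity-check numerically. A minimal sketch, assuming an arbitrary full-rank design matrix (the dimensions n=10, k=3 and the random seed are illustrative choices, not from the question):

```python
import numpy as np

# Check tr(I_n - X(X'X)^{-1} X') = n - k for an arbitrary full-rank X.
# n, k, and the random X below are made-up illustrative values.
rng = np.random.default_rng(0)
n, k = 10, 3
X = rng.standard_normal((n, k))                    # rank k almost surely
M = np.eye(n) - X @ np.linalg.inv(X.T @ X) @ X.T   # residual-maker matrix
print(round(np.trace(M)))                          # n - k = 7
```

Because M is symmetric and idempotent, E[Y'MY] = E[u'Mu] = σ² tr(M) = (n-k)σ², which is the expected answer.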
Key points expected
- (a)(i) Definition: orthogonal polynomials φ₀, φ₁, …, φₚ satisfy Σφᵢ(x)φⱼ(x)=0 for i≠j over the observed point set; fitting uses the three-term recurrence φᵣ₊₁(x)=(x-aᵣ)φᵣ(x)-bᵣφᵣ₋₁(x), with aᵣ = Σxφᵣ²(x)/Σφᵣ²(x) and bᵣ = Σφᵣ²(x)/Σφᵣ₋₁²(x)
- (a)(ii) Recognition that Iₙ-X(X'X)⁻¹X' is the residual maker matrix M; E[Y'MY]=E[u'Mu]=σ²tr(M)=σ²(n-k) using tr(Iₙ)=n and tr(X(X'X)⁻¹X')=k
- (b) Enumeration of the 9 ordered samples (1,1),(1,2),(1,3),(2,1),(2,2),(2,3),(3,1),(3,2),(3,3) with their probabilities pᵢpⱼ; calculation of first-order inclusion probabilities πᵢ=1-(1-pᵢ)²; verification that E[Ŷ_HT] = Σᵢ πᵢ·(yᵢ/πᵢ) = Σᵢ yᵢ = Y = 42
- (c) Missing plot technique: Yates' method for estimating missing observations by minimizing error sum of squares; derivation using ∂SSE/∂y=0 for Latin Square layout
- (c) Latin Square missing value formula: ŷ = (tRᵢ + tCⱼ + tTₖ - 2G) / [(t-1)(t-2)] where R,C,T are respective totals and G is grand total; analysis proceeds with reduced degrees of freedom and bias correction in treatment SS
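The unbiasedness check in (b) can be carried out by brute-force enumeration over the 9 ordered samples; a sketch in plain Python, using the data from the question:

```python
from itertools import product

p = {1: 0.3, 2: 0.2, 3: 0.5}   # selection probabilities
y = {1: 11, 2: 6, 3: 25}       # wheat production ('000 tons)

# First-order inclusion probabilities for 2 ordered draws with replacement.
pi = {i: 1 - (1 - p[i]) ** 2 for i in p}   # 0.51, 0.36, 0.75

# E[Y_HT] = sum over all 9 ordered samples of P(sample) * HT estimate,
# where the HT estimate sums y_i / pi_i over the DISTINCT units drawn.
expected = sum(
    p[a] * p[b] * sum(y[i] / pi[i] for i in {a, b})
    for a, b in product(p, repeat=2)
)
print(round(expected, 6))   # 42.0 = 11 + 6 + 25, the true total
```

The enumeration reproduces E[Ŷ_HT] = 42, confirming design-unbiasedness for this population.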
Evaluation rubric
| Dimension | Weight | Max marks | Excellent | Average | Poor |
|---|---|---|---|---|---|
| Setup correctness | 20% | 10 | Correctly defines orthogonal polynomials with orthogonality condition in (a)(i); properly identifies M=I-X(X'X)⁻¹X' as idempotent projection matrix in (a)(ii); accurately sets up 9 ordered samples with correct probability structure in (b); correctly states Latin Square model with row, column, treatment effects in (c) | Defines orthogonal polynomials vaguely without orthogonality condition; recognizes residual form but misidentifies matrix properties; lists samples but misses some combinations or probability calculations; mentions Latin Square but confuses with RCBD | Confuses orthogonal polynomials with simple polynomial regression; fails to identify the matrix form entirely; fundamental errors in sample enumeration; no understanding of missing plot purpose |
| Method choice | 20% | 10 | Uses recurrence relation method for orthogonal polynomial fitting; applies trace operator property tr(AB)=tr(BA) for (a)(ii); employs Horvitz-Thompson estimator with correct inclusion probabilities in (b); uses calculus-based minimization of SSE for missing value derivation in (c) | Describes fitting without clear recurrence; attempts direct expansion for expectation; uses wrong estimator or confuses with SRS; states formula without derivation | No systematic fitting method; brute force matrix multiplication without simplification; completely wrong estimator choice; no derivation attempt for missing value |
| Computation accuracy | 20% | 10 | Accurate trace calculation yielding (n-k)σ²; correct π₁=0.51, π₂=0.36, π₃=0.75 and verification E[Ŷ_HT]=42; precise algebraic manipulation leading to ŷ=[t(R+C+T)-2G]/[(t-1)(t-2)] with correct denominator (t-1)(t-2) | Minor arithmetic errors in trace calculation; small errors in inclusion probabilities or verification; algebraic slips in missing value derivation but correct structure | Major computational errors in expectation; wrong probabilities summing incorrectly; fundamentally wrong missing value formula |
| Interpretation | 20% | 10 | Explains why orthogonal polynomials reduce multicollinearity and computational instability; interprets (n-k)σ² as expected residual SS with n-k degrees of freedom; explains why H-T estimator is design-unbiased for varying probabilities; clearly distinguishes between error df reduction and treatment SS adjustment in missing plot analysis | States computational advantages without explaining multicollinearity; notes answer without degrees of freedom interpretation; states unbiasedness without design-based reasoning; mentions df reduction without explaining why | No interpretation of computational benefits; no statistical meaning attached to results; no understanding of design-based inference; confused about analysis adjustments |
| Final answer & units | 20% | 10 | Clear final answers: (a)(ii) E[·]=(n-k)σ²; (b) verified Ŷ_HT unbiased with total production 42 ('000 tons); (c) complete missing value formula with analysis steps including adjusted treatment SS and reduced error df by 1 | Correct answers but missing units or incomplete final expressions; partial analysis description; missing some components of the final answer | Missing or wrong final answers; no units where relevant; incomplete or incorrect analysis procedure |
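The missing-value formula for part (c) can likewise be checked numerically: impute via Yates' formula, then confirm the imputed value minimizes the error sum of squares. A sketch with a made-up 3×3 Latin square (all data below are hypothetical):

```python
import numpy as np

# Hypothetical 3x3 Latin square; cell (0,0), which received treatment A,
# is the missing plot. All observations are invented for illustration.
treat = [["A", "B", "C"], ["B", "C", "A"], ["C", "A", "B"]]
y = np.array([[np.nan, 8.0, 7.0],
              [6.0, 9.0, 10.0],
              [11.0, 5.0, 12.0]])
t = 3
R = np.nansum(y[0, :])          # total of available obs in the missing row
C = np.nansum(y[:, 0])          # ... in the missing column
T = sum(y[i, j] for i in range(t) for j in range(t)
        if treat[i][j] == "A" and not np.isnan(y[i, j]))  # ... for treatment A
G = np.nansum(y)                # grand total of available obs
y_hat = (t * (R + C + T) - 2 * G) / ((t - 1) * (t - 2))   # Yates' formula

# Error SS of the filled-in square: SST - SSRows - SSCols - SSTreatments.
def sse(x):
    z = y.copy()
    z[0, 0] = x
    trt = np.array([z[np.array(treat) == k].sum() for k in "ABC"])
    return ((z ** 2).sum() - (z.sum(1) ** 2).sum() / t
            - (z.sum(0) ** 2).sum() / t - (trt ** 2).sum() / t
            + 2 * z.sum() ** 2 / t ** 2)

grid = np.linspace(y_hat - 5, y_hat + 5, 10001)
best = grid[np.argmin([sse(x) for x in grid])]
print(round(y_hat, 3), round(best, 3))   # both 2.5: formula minimizes SSE
```

Since ∂SSE/∂ŷ = 0 yields exactly ŷ = [t(R+C+T)-2G]/[(t-1)(t-2)], the grid minimizer coincides with the formula; the subsequent ANOVA then runs on the filled-in square with the error df reduced by one and the treatment SS corrected for upward bias.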
Practice this exact question
Write your answer, then get a detailed evaluation from our AI trained on UPSC's answer-writing standards. Free first evaluation — no signup needed to start.
Evaluate my answer →

More from Statistics 2022 Paper I
- Q1 (a) Let X and Y be independent random variables with exponential distribution having respective means $\frac{1}{\lambda_1}$ and $\frac{1}{\…
- Q2 (a) Let a random variable X have exponential distribution with mean 1/θ, θ > 0. To test H₀ : θ = 3 against H₁ : θ = 2, construct sequential…
- Q3 (a) (i) How large a sample must be taken in order that the probability will be at least 0·90 that the sample mean will be within 0·4 – neig…
- Q4 (a) Consider Poisson distribution $$P_{\theta}(X = j) = \frac{e^{-\theta} \theta^{j}}{j!} = p_{j}, j = 0, 1, 2, ....$$ Let $f_{j}$ be the f…
- Q5 (a) Define general linear model with usual assumptions. If y₁ = β₁ + u₁, y₂ = –β₁ + β₂ + u₂, y₃ = –β₂ + u₃, where u₁, u₂, u₃ are mutually i…
- Q6 (a) In a set of two-way classified data according to k levels of factor A and r levels of factor B, there is one observation in each cell.…
- Q7 (a) Consider the following data given for a BIBD with v = b = 4, r = k = 3, λ = 2 and N = 12 : Analyse the design. [Given that : F₃,₅ (0·05…
- Q8 (a) (i) What are orthogonal polynomials ? How do you fit an orthogonal polynomial of degree 'p' ? (ii) For the model Y_(n×1) = X_(n×k) β_(k…