Statistics 2024 Paper I 50 marks Solve

Q8

(a) A 2²-factorial design was used to develop the yield of a crop. Two factors A and B were used at two levels: low (–1) and high (+1). The experiment was replicated two times with completely randomized way. The data obtained are as follows:

| Factor A | Factor B | Estimated Average Effect |
| --- | --- | --- |
| – | – | |
| + | – | 8 |
| – | + | –5 |
| + | + | 2 |

The sum of squares of all the yields = 510.5
The grand total of all the yields = 50.00

(i) Analyze the data and identify the significant factors. (12 marks)
(ii) Develop the regression model and predict the yield when A and B both are at low level (–1). (8 marks)
[Given, F₍₁, ₄, ₀.₀₅₎ = 7.71]

(b) To estimate the population mean Ȳ of a characteristic Y, a simple random sample of size 1000 was selected from a population of size 1000000 by without replacement. The population mean of an auxiliary character X is X̄ = 15. The other results are given below: s²ᵧ = 20, s²ₓ = 25, sₓᵧ = 15, x̄ = 14, ȳ = 10.

(i) Estimate Ȳ using difference, ratio and regression estimators. (6 marks)
(ii) Estimate the MSE of these estimators. Which estimator should we choose to estimate Ȳ? (9 marks)

(c) Write down the model used in the analysis of a two-way classification with interactions, stating the assumptions. What are the hypotheses tested in this scenario? Obtain the expression for the sum of squares and complete the ANOVA. (15 marks)

Directive word: Solve

This question asks you to solve. The directive word signals the depth of analysis expected, the structure of your answer, and the weight of evidence you must bring.

How this answer will be evaluated

Approach

Solve this multi-part numerical problem by allocating approximately 40% of your time to part (a) [20 marks], 30% to part (b) [15 marks], and 30% to part (c) [15 marks]. Begin with a clear model specification and ANOVA table for the 2² factorial in (a), follow with a systematic calculation of the difference, ratio and regression estimators in (b), and give the complete theoretical derivation of the two-way ANOVA with interaction in (c). Present the computational steps in tabular form, with explicit F-test conclusions and MSE comparisons.
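The ANOVA for part (a) can be checked with a short script. This is a minimal sketch, assuming the tabulated values are the estimated effects in standard order (A = 8, B = –5, AB = 2); the variable names and the `predict` helper are illustrative and not part of the question.

```python
# Sketch of the 2^2 factorial ANOVA for part (a).
# Assumption: the table rows (+,-), (-,+), (+,+) give the estimated
# effects of A, B and AB respectively.
r = 2                                   # replicates
n_total = 4 * r                         # 8 observations in all
effects = {"A": 8.0, "B": -5.0, "AB": 2.0}
sum_sq_yields = 510.5                   # given: sum of squares of all yields
grand_total = 50.0                      # given: grand total of all yields
f_crit = 7.71                           # given: F(1, 4, 0.05)

correction = grand_total ** 2 / n_total          # correction factor
ss_total = sum_sq_yields - correction            # total sum of squares

# effect = contrast / (2r) and SS = contrast^2 / (4r), so SS = r * effect^2
ss = {name: r * e ** 2 for name, e in effects.items()}
ss_error = ss_total - sum(ss.values())
df_error = 4 * (r - 1)
ms_error = ss_error / df_error

f_ratio = {name: s / ms_error for name, s in ss.items()}   # each effect has 1 d.f.
significant = [name for name, f in f_ratio.items() if f > f_crit]

# Regression model in coded units: each coefficient is half the effect.
b0 = grand_total / n_total
coef = {name: e / 2 for name, e in effects.items()}

def predict(a, b, with_interaction=False):
    """Predicted yield at coded levels a, b (each -1 or +1)."""
    y_hat = b0 + coef["A"] * a + coef["B"] * b
    if with_interaction:
        y_hat += coef["AB"] * a * b
    return y_hat

for name in effects:
    print(name, ss[name], round(f_ratio[name], 2))
print("SSE:", ss_error, "MSE:", ms_error, "significant:", significant)
print("prediction at (-1, -1):", predict(-1, -1))
```

Under this reading, the error sum of squares is obtained by subtraction from the total, which is why only the grand total and the raw sum of squares are needed.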

Key points expected

  • For (a)(i): Calculate main effects A and B, interaction effect AB, construct complete ANOVA table with 3 d.f. for treatments and 4 d.f. for error, compare F-calculated with F-critical=7.71 to identify significant factors
  • For (a)(ii): Develop regression equation Y = β₀ + β₁A + β₂B + β₁₂AB with coded variables, substitute A=-1, B=-1 to predict yield at low-low combination
  • For (b)(i): Compute difference estimator ȳ_D = ȳ + (X̄ - x̄), ratio estimator ȳ_R = ȳ(X̄/x̄), and regression estimator ȳ_lr = ȳ + b(X̄ - x̄) where b = s_xy/s_x²
  • For (b)(ii): Estimate the MSE of each estimator to first order, with f = n/N: MSE(ȳ_D) = (1-f)(s_y² + s_x² - 2s_xy)/n, MSE(ȳ_R) ≈ (1-f)(s_y² + R̂²s_x² - 2R̂s_xy)/n with R̂ = ȳ/x̄, and MSE(ȳ_lr) ≈ (1-f)s_y²(1-r²)/n with r² = s_xy²/(s_x²s_y²); select the estimator with minimum MSE
  • For (c): State model y_ijk = μ + α_i + β_j + (αβ)_ij + ε_ijk with assumptions (normality, independence, homoscedasticity, Σα_i=Σβ_j=Σ(αβ)_ij=0), hypotheses H₀: all α_i=0, all β_j=0, all (αβ)_ij=0, derive SSA, SSB, SSAB, SSE with degrees of freedom and complete ANOVA table
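The estimators and MSE comparisons in part (b) can be verified numerically. The following is a minimal sketch using the standard first-order MSE formulas under SRSWOR; the difference estimator is taken with unit coefficient, as in the key point above, and all variable names are illustrative.

```python
# Sketch for part (b): difference, ratio and regression estimators under SRSWOR.
n, N = 1000, 1_000_000
X_bar = 15.0                       # known population mean of X
s2_y, s2_x, s_xy = 20.0, 25.0, 15.0
x_bar, y_bar = 14.0, 10.0

f = n / N                          # sampling fraction
fpc = (1 - f) / n                  # (1 - f) / n factor common to all MSEs

# Point estimates
y_diff = y_bar + (X_bar - x_bar)               # difference estimator, coefficient 1
R_hat = y_bar / x_bar
y_ratio = R_hat * X_bar                        # ratio estimator
b = s_xy / s2_x                                # sample regression coefficient
y_reg = y_bar + b * (X_bar - x_bar)            # regression estimator

# Estimated MSEs (first-order approximations for ratio and regression)
r2 = s_xy ** 2 / (s2_x * s2_y)                 # squared sample correlation
mse = {
    "difference": fpc * (s2_y + s2_x - 2 * s_xy),
    "ratio": fpc * (s2_y + R_hat ** 2 * s2_x - 2 * R_hat * s_xy),
    "regression": fpc * s2_y * (1 - r2),
}
best = min(mse, key=mse.get)

print("estimates:", y_diff, round(y_ratio, 4), y_reg)
for name, value in mse.items():
    print(name, round(value, 6))
print("smallest estimated MSE:", best)
```

With these inputs the regression estimator has the smallest estimated MSE, followed by the ratio and then the difference estimator.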

Evaluation rubric

| Dimension | Weight | Max marks | Excellent | Average | Poor |
| --- | --- | --- | --- | --- | --- |
| Setup correctness | 20% | 12 | Correctly identifies 2² factorial with r=2 replications giving 8 total observations; properly specifies sampling design in (b) as SRSWOR with n=1000, N=1000000, f=0.001; accurately writes two-way ANOVA model with interaction including all constraints and assumptions | Identifies basic design structure but makes minor errors in replication count or degrees of freedom; model specification in (c) misses one assumption or constraint | Confuses factorial design structure, incorrect sample size or sampling fraction, omits interaction term or constraints in model specification |
| Method choice | 20% | 12 | Selects Yates' algorithm or contrast method for factorial effects; applies correct finite population correction; uses optimal regression coefficient for regression estimator; chooses standard ANOVA decomposition for two-way classification | Uses correct general methods but with minor formula variations; acceptable alternative approaches with some inefficiency | Applies incorrect method (e.g., treats as 2³ design, uses with-replacement variance formulas, omits interaction in two-way analysis) |
| Computation accuracy | 20% | 12 | Precise calculation of effects (reading the tabulated values as the estimated effects: A=8, B=-5, AB=2), correct SS values (SS_A=128, SS_B=50, SS_AB=8, SSE=12 on 4 d.f.) leading to F_A≈42.67, F_B≈16.67, F_AB≈2.67; accurate estimator values ȳ_D=11, ȳ_R≈10.71, ȳ_lr=10.6; exact ANOVA expressions with correct df breakdown | Minor arithmetic errors in one component (e.g., effect calculation or variance estimation) with mostly correct methodology | Major computational errors in multiple parts, incorrect sum of squares decomposition, wrong F-values, or invalid estimator calculations |
| Interpretation | 20% | 12 | Correctly concludes that factors A and B are significant (F_A, F_B > 7.71) while the AB interaction is not (F_AB < 7.71); interprets negative B effect as yield decrease; justifies regression estimator choice via minimum MSE; explains practical meaning of interaction test in agricultural context | Correct significance conclusions with weak practical interpretation; acceptable estimator comparison without clear recommendation | Wrong significance conclusions, fails to interpret effect directions, no comparison or justification for estimator selection |
| Final answer & units | 20% | 12 | Predicted yield at (-1,-1) as 4.75 from ŷ = 6.25 + 4A - 2.5B (or 5.75 if the non-significant AB term is retained, with clear justification); all estimators with explicit values; complete ANOVA table with all SS, df, MS, F values; proper units (yield units, squared units for MSE) | Final answers present with minor rounding issues or missing one component; incomplete ANOVA table | Missing final answers, no ANOVA table, incorrect prediction values, or complete absence of units throughout |
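The sum-of-squares decomposition expected in part (c) can be illustrated in code. This is a minimal sketch for a balanced layout with p levels of A, q levels of B and r observations per cell; the function name `two_way_anova` and the small data set are hypothetical, chosen only to exercise the formulas.

```python
# Sketch of the two-way ANOVA decomposition with interaction (balanced data).
def two_way_anova(cells):
    """cells[i][j] holds the r replicate observations at level i of A, level j of B."""
    p, q, r = len(cells), len(cells[0]), len(cells[0][0])
    n = p * q * r
    values = [y for row in cells for cell in row for y in cell]
    cf = sum(values) ** 2 / n                          # correction factor G^2 / (pqr)

    row_tot = [sum(sum(cell) for cell in row) for row in cells]
    col_tot = [sum(sum(cells[i][j]) for i in range(p)) for j in range(q)]
    cell_tot = [sum(cell) for row in cells for cell in row]

    ss_total = sum(y * y for y in values) - cf
    ss_a = sum(t * t for t in row_tot) / (q * r) - cf
    ss_b = sum(t * t for t in col_tot) / (p * r) - cf
    ss_cells = sum(t * t for t in cell_tot) / r - cf
    ss_ab = ss_cells - ss_a - ss_b                     # interaction by subtraction
    ss_error = ss_total - ss_cells                     # within-cell variation

    ss = {"A": ss_a, "B": ss_b, "AB": ss_ab, "Error": ss_error}
    df = {"A": p - 1, "B": q - 1, "AB": (p - 1) * (q - 1), "Error": p * q * (r - 1)}
    ms = {k: ss[k] / df[k] for k in ss}
    f_ratio = {k: ms[k] / ms["Error"] for k in ("A", "B", "AB")}
    return ss, df, ms, f_ratio

# Hypothetical 2 x 2 layout with 2 observations per cell (illustration only).
example = [[[1, 3], [2, 4]],
           [[5, 7], [9, 11]]]
ss, df, ms, f_ratio = two_way_anova(example)
for source in ("A", "B", "AB", "Error"):
    print(source, ss[source], df[source], ms[source], f_ratio.get(source, ""))
```

Each F-ratio is compared with the tabulated F on its own numerator degrees of freedom and pq(r-1) error degrees of freedom, testing the interaction first and the main effects afterwards.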
