Statistics 2023 · Paper II · 50 marks · Explain

Q6

(a) Explain the principle of least squares. How it is used in fitting trend in time series analysis? Explain the fitting of trend for the curve $y=ab^tc^{t^2}$. (15 marks)

(b) Define stationary time series. How would you test the stationarity of the given time series? Write the importance of stationary time series. Check the following time series for stationarity.
(i) $Y_t = Y_{t-1} + U_t$
(ii) $Y_t = \delta + Y_{t-1} + U_t$
(iii) $Y_t = \delta Y_{t-1} + U_t$ ; $-1 \leq \delta \leq 1$
(15 marks)

(c) State the different methods of detecting the presence of heteroscedasticity. Explain in brief the Goldfeld-Quandt Test for detecting the presence of heteroscedasticity. Also write the assumption required to apply this test.

For a data on consumption expenditure in relation to income for a cross section of 30 families, after dropping the middle 4 observations, the OLS regression based on the first 13 and the last 13 observations and their associated residual sum of squares are as follows:

Regression based on the first 13 observations: $\hat{Y}_i = 3.4094 + 0.6968 X_i$ $(r^2 = 0.8887, RSS_1 = 377.17, df = 11)$

Regression based on the last 13 observations: $\hat{Y}_i = -28.0272 + 0.7941 X_i$ $(r^2 = 0.7681, RSS_2 = 1536.8, df = 11)$

Check the presence of heteroscedasticity for the above given results and write your conclusion.
$(F_{(11, 11, 5\%)} = 2.82, F_{(11, 11, 1\%)} = 4.46, F_{(13, 13, 5\%)} = 2.53, F_{(13, 13, 1\%)} = 3.82)$
(20 marks)


Directive word: Explain

This question asks you to explain. The directive word signals the depth of analysis expected, the structure of your answer, and the weight of evidence you must bring.


How this answer will be evaluated

Approach

Explain the theoretical foundations first, then demonstrate the computational application. In line with the 15/15/20 mark split, allocate about 30% of your time to part (a) on least squares and trend fitting, about 30% to part (b) on stationarity concepts and testing the three given models, and about 40% to part (c) on heteroscedasticity detection with complete Goldfeld-Quandt test execution. Structure as: theoretical exposition → mathematical derivation → numerical computation → statistical inference.
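For the mathematical-derivation step in part (a), a minimal sketch of the log-linearization is given here. It assumes every observed $y_t > 0$ so that logarithms exist, and the shorthand $Y$, $A$, $B$, $C$ is introduced for illustration only (it does not appear in the question):

$$
\begin{aligned}
\log y &= \log a + t \log b + t^2 \log c \\
Y &= A + Bt + Ct^2, \qquad Y = \log y,\ A = \log a,\ B = \log b,\ C = \log c \\
S &= \sum_{t} \left( Y_t - A - Bt - Ct^2 \right)^2 \quad \text{(to be minimized over } A, B, C\text{)} \\
\sum Y &= nA + B \sum t + C \sum t^2 \\
\sum tY &= A \sum t + B \sum t^2 + C \sum t^3 \\
\sum t^2 Y &= A \sum t^2 + B \sum t^3 + C \sum t^4
\end{aligned}
$$

Solving the three normal equations gives $\hat{A}$, $\hat{B}$, $\hat{C}$, and the fitted constants follow as $\hat{a} = \text{antilog}(\hat{A})$, $\hat{b} = \text{antilog}(\hat{B})$, $\hat{c} = \text{antilog}(\hat{C})$. Note that this minimizes the squared residuals of $\log y$, not of $y$ itself.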

Key points expected

  • Part (a): Principle of least squares (minimizing sum of squared residuals), its application in linear and non-linear trend fitting, and complete working for y=ab^tc^{t^2} using logarithmic transformation to linear form
  • Part (b): Formal definition of weak/strong stationarity (for weak stationarity: constant mean, constant variance, and autocovariance depending only on the lag), Dickey-Fuller or graphical methods for testing, importance for valid inference, and classification of (i) random walk (non-stationary), (ii) random walk with drift (non-stationary), (iii) AR(1) process (stationary when |δ|<1, non-stationary at the boundary values δ = ±1 that the question also allows)
  • Part (c): Listing detection methods (graphical, Park test, Glejser test, White test, Goldfeld-Quandt test), complete Goldfeld-Quandt procedure with assumptions (normality, homoscedasticity under null, increasing/decreasing variance pattern)
  • Correct computation of F-statistic = (RSS2/df2)/(RSS1/df1), which reduces to RSS2/RSS1 = 1536.8/377.17 = 4.075 because both groups have 11 degrees of freedom; compare against F with (11,11) df
  • Proper hypothesis testing conclusion: F_calculated (4.075) > F_critical at 5% (2.82), so reject the null of homoscedasticity and conclude heteroscedasticity is present at the 5% level; at the 1% level, 4.075 < 4.46, so the null is not rejected
  • Recognition that RSS2 > RSS1 indicates increasing variance with income, confirming heteroscedasticity in consumption expenditure data
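The Goldfeld-Quandt arithmetic in the points above can be checked with a short Python sketch. The function name `goldfeld_quandt_f` and the variable names are hypothetical, chosen for illustration; the numbers and critical values are the ones given in the question.

```python
def goldfeld_quandt_f(rss_low, rss_high, df_low, df_high):
    """Ratio of residual variances: high-variance group over low-variance group."""
    return (rss_high / df_high) / (rss_low / df_low)

# Data from the question: 30 families, middle 4 dropped, 2 parameters per regression.
n, dropped, k = 30, 4, 2
df = (n - dropped) // 2 - k          # 13 observations - 2 parameters = 11

rss1, rss2 = 377.17, 1536.8          # first 13 and last 13 observations
f_stat = goldfeld_quandt_f(rss1, rss2, df, df)

# Critical values supplied in the question for (11, 11) degrees of freedom.
f_crit_5pct, f_crit_1pct = 2.82, 4.46

reject_at_5pct = f_stat > f_crit_5pct
reject_at_1pct = f_stat > f_crit_1pct

print(f"df = {df}, F = {f_stat:.3f}")
print(f"Reject homoscedasticity at 5%: {reject_at_5pct}")
print(f"Reject homoscedasticity at 1%: {reject_at_1pct}")
```

The (13, 13) critical values in the question are distractors: the degrees of freedom are the number of observations in each group minus the two estimated parameters, not the group size.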

Evaluation rubric

| Dimension | Weight | Max marks | Excellent | Average | Poor |
| --- | --- | --- | --- | --- | --- |
| Setup correctness | 20% | 10 | Correctly states all assumptions for least squares, properly defines stationarity with mean/variance/autocovariance conditions, accurately lists Goldfeld-Quandt assumptions (normality, homoscedastic null, ordered heteroscedasticity), and correctly identifies the three time series models by their standard names | Defines stationarity partially (only mean/variance), mentions some assumptions but omits key ones like normality for G-Q test, and shows basic recognition of model types without precise classification | Confuses least squares with other estimation methods, defines stationarity incorrectly or incompletely, omits critical assumptions, or misidentifies the AR(1) vs random walk distinction |
| Method choice | 20% | 10 | Selects logarithmic transformation for part (a) curve fitting, chooses appropriate stationarity tests (ADF or visual inspection) for part (b), correctly applies Goldfeld-Quandt with proper ordering and middle observation deletion, and uses correct F-distribution with (11,11) df | Attempts transformation for (a) but with errors, uses basic differencing concept for (b), applies G-Q test with minor errors in procedure or df selection | Fails to transform non-linear curve, uses inappropriate stationarity tests, applies G-Q test completely wrongly (wrong ordering, wrong test statistic, or wrong distribution) |
| Computation accuracy | 20% | 10 | Accurate derivation of normal equations for transformed curve, correct classification of all three time series with mathematical justification, precise calculation F = 1536.8/377.17 = 4.075 (or ~4.08), and correct comparison with critical values | Minor arithmetic errors in calculations, correct classification of 2/3 series, approximate F-value with correct conclusion, or correct calculation with wrong df | Major computational errors in normal equations, misclassifies 2+ series, calculates F = RSS1/RSS2 (reversed), or uses completely wrong critical values |
| Interpretation | 20% | 10 | Explains why logarithmic transformation enables OLS application, clearly distinguishes unit root non-stationarity from stationary AR(1) with proper variance analysis, interprets F-test result correctly with both 5% and 1% significance levels, and explains economic meaning (increasing variance in consumption with income) | Basic explanation of transformation purpose, some confusion between random walk and AR(1) properties, correct conclusion at 5% but misses 1% comparison, limited economic interpretation | No explanation of why transformation works, fundamental confusion about stationarity implications, wrong conclusion from F-test, or no interpretation of economic significance |
| Final answer & units | 20% | 10 | Clear final answers: explicit trend equation form for (a), definitive stationarity classification for all three series with reasoning, explicit hypothesis test conclusion stating 'heteroscedasticity present at 5% significance level' with proper statistical notation, and structured presentation with labeled parts | Final answers present but scattered, some classifications stated without justification, conclusion present but lacks precision on significance levels, adequate organization | Missing final answers, incomplete classifications, no clear conclusion on heteroscedasticity, or disorganized response making evaluation difficult |
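The variance analysis that the rubric expects for part (b) can be sketched as follows. Two assumptions are made here that the question does not state: $U_t$ is white noise with mean 0 and variance $\sigma^2$, and the starting value $Y_0$ is a fixed constant.

$$
\begin{aligned}
\text{(i)}\quad & Y_t = Y_0 + \sum_{s=1}^{t} U_s, \qquad E(Y_t) = Y_0, \qquad \operatorname{Var}(Y_t) = t\sigma^2 \\
\text{(ii)}\quad & Y_t = Y_0 + t\delta + \sum_{s=1}^{t} U_s, \qquad E(Y_t) = Y_0 + t\delta, \qquad \operatorname{Var}(Y_t) = t\sigma^2 \\
\text{(iii)}\quad & Y_t = \delta^t Y_0 + \sum_{s=0}^{t-1} \delta^s U_{t-s}, \qquad E(Y_t) = \delta^t Y_0, \qquad \operatorname{Var}(Y_t) = \sigma^2 \, \frac{1 - \delta^{2t}}{1 - \delta^2} \quad (|\delta| < 1)
\end{aligned}
$$

In (i) the variance grows with $t$, and in (ii) both the mean and the variance grow with $t$, so neither series is stationary. In (iii) with $|\delta| < 1$, the mean tends to 0 and the variance tends to the constant $\sigma^2 / (1 - \delta^2)$ as $t$ grows, so the series is (asymptotically) stationary. At $\delta = \pm 1$ the variance is again $t\sigma^2$, so the series is non-stationary at both boundary values.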
