Statistics

UPSC Statistics 2022 — Paper II

All 8 questions from UPSC Civil Services Mains Statistics 2022 Paper II (400 marks total). Every stem is reproduced in full, with directive-word analysis, marks, word limits, and answer-approach pointers.

Questions: 8
Total marks: 400
Year: 2022
Paper: Paper II

Topics covered

  • Statistical Quality Control and Operations Research (1)
  • Statistical Quality Control and Reliability Theory (1)
  • Manpower planning, queuing theory and inventory management (1)
  • Linear programming, crew scheduling and sequential sampling (1)
  • Time series, econometrics, life tables, psychometrics (1)
  • Time series analysis, econometrics, industrial statistics (1)
  • Demography and vital statistics (1)
  • Index numbers and queuing theory (1)

Section A

Q1
50 marks; compulsory; directive: distinguish; topic: Statistical Quality Control and Operations Research

(a) Distinguish between process control and product control. Explain the various sources of variation encountered in a process control study. Suggest how they can be eliminated from the process. 10 marks

(b) The management of ABC company is considering the question of marketing a new product. The fixed cost required in the project is ₹ 4,000. Three factors are uncertain, viz., selling price, variable cost and annual sales volume. The product has life of only one year. The management has the data on three factors as under:

| Selling Price (₹) | Probability | Variable Cost (₹) | Probability | Sales Volume (units) | Probability |
|---|---|---|---|---|---|
| 3 | 0·2 | 1 | 0·3 | 2000 | 0·3 |
| 4 | 0·5 | 2 | 0·6 | 3000 | 0·3 |
| 5 | 0·3 | 3 | 0·1 | 5000 | 0·4 |

Consider the sequence of thirty random numbers 81, 32, 60, 04, 46, 31, 67, 25, 24, 10, 40, 02, 39, 68, 08, 59, 66, 90, 12, 64, 79, 31, 86, 68, 82, 89, 25, 11, 98, 16 and using the sequence (first 3 random numbers for the first trial, etc.), simulate the average profit for the above project on the basis of 10 trials. 10 marks

(c) If N(t) is a Poisson process and s < t, find P(N(s) = k | N(t) = n) and comment. 10 marks

(d) What are the assumptions made in the theory of games? Describe the maximin principle and minimax principle. Explain the algebraic method for games without saddle point. 10 marks

(e) What are the importances of censoring in life-testing experiments? Discuss the estimation of parameters involved in exponential distribution with mean θ, using type-2 censored sample. 10 marks

Answer approach & key points

Distinguish requires clear differentiation followed by explanation. Allocate approximately 20% time to part (a) on process vs product control with variation sources, 25% to part (b) simulation with proper random number mapping, 15% to part (c) conditional probability derivation for Poisson, 20% to part (d) game theory assumptions and algebraic method, and 20% to part (e) censoring importance and MLE for exponential Type-2. Begin with definitions, proceed to analytical derivations or computational steps, and conclude with interpretations.

  • Part (a): Clear distinction between process control (monitoring during production) and product control (acceptance sampling); identification of chance causes (random, inherent) and assignable causes (special, identifiable) of variation; remedial measures for each source
  • Part (b): Correct probability interval mapping for selling price (00-19, 20-69, 70-99), variable cost (00-29, 30-89, 90-99), and sales volume (00-29, 30-59, 60-99); proper profit calculation as (Price - Variable Cost) × Volume - Fixed Cost; accurate simulation table with 10 trials and average profit computation
  • Part (c): Derivation of P(N(s)=k|N(t)=n) using independent increments property resulting in Binomial(n, s/t) distribution; recognition that conditional distribution depends only on ratio s/t not on rate parameter λ
  • Part (d): Assumptions: two players, finite strategies, zero-sum, simultaneous moves, complete information; maximin (player A's security level) and minimax (player B's security level) principles; algebraic method using mixed strategies with probability variables p and q, solving simultaneous equations for value of game
  • Part (e): Importance of censoring: time/cost efficiency, ethical considerations, handling heavy-tailed distributions; Type-2 censoring with r failures out of n items; MLE derivation for θ with estimator T/r where T is total time on test, showing unbiasedness and variance properties
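
The mapping and profit formula listed for part (b) can be sketched in a few lines of Python. This is only an illustrative sketch: the variable names are hypothetical, and it assumes that within each trial the three random numbers are used in the order selling price, variable cost, sales volume, which the question does not state explicitly.

```python
# Sketch of the Q1(b) simulation. ASSUMPTION: within each trial the random
# numbers are consumed in the order price, variable cost, sales volume.
FIXED_COST = 4000

# Each table maps the upper end of a two-digit interval to the outcome,
# following the cumulative probabilities in the question.
PRICE = [(19, 3), (69, 4), (99, 5)]
VAR_COST = [(29, 1), (89, 2), (99, 3)]
VOLUME = [(29, 2000), (59, 3000), (99, 5000)]

RANDOM_NUMBERS = [81, 32, 60, 4, 46, 31, 67, 25, 24, 10,
                  40, 2, 39, 68, 8, 59, 66, 90, 12, 64,
                  79, 31, 86, 68, 82, 89, 25, 11, 98, 16]


def lookup(table, rn):
    """Return the outcome whose interval contains the random number rn."""
    for upper, value in table:
        if rn <= upper:
            return value
    raise ValueError(f"random number out of range: {rn}")


def simulate(random_numbers):
    """Return the profit of each trial, three random numbers per trial."""
    profits = []
    for i in range(0, len(random_numbers), 3):
        r_price, r_cost, r_volume = random_numbers[i:i + 3]
        price = lookup(PRICE, r_price)
        cost = lookup(VAR_COST, r_cost)
        volume = lookup(VOLUME, r_volume)
        profits.append((price - cost) * volume - FIXED_COST)
    return profits


profits = simulate(RANDOM_NUMBERS)
average_profit = sum(profits) / len(profits)
print(profits)
print(average_profit)
```

Under the stated ordering assumption the ten trial profits average ₹ 2,100; a different assignment of random numbers to factors would give a different figure, so the ordering should be stated in the answer.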
Q2
50 marks; directive: solve; topic: Statistical Quality Control and Reliability Theory

(a) Samples of size n = 5 units are taken from a process every hour. The x̄ and R̄ values for a particular quality characteristic are determined. After 25 samples have been collected, we obtain x̄̄ = 20 and R̄ = 4·56.
(i) What are the three-sigma control limits for x̄ and R?
(ii) Estimate the process standard deviation if both the charts exhibit control.
(iii) Assume that the process output is normally distributed. If the specifications are 19 ± 5, what are your conclusions regarding the process capability?
(iv) If the process mean shifts to 24, what is the probability of not detecting this shift on the first subsequent sample?
(d₂ = 2·326, D₁ = 0, D₂ = 4·918, D₃ = 0, D₄ = 2·114, A = 1·342, A₂ = 0·577, A₃ = 1·427, C₄ = 0·940, B₃ = 0, B₄ = 2·089) 15 marks

(b) Define a Weibull distribution with scale parameter α and shape parameter β. Obtain the hazard function and reliability function of the model. Show also that the distribution satisfies increasing, constant and decreasing failure rate based on suitable choice of the shape parameter. 15 marks

(c) A company uses the following acceptance-sampling procedure: A sample equal to 10% of the lot is taken. If 2% or less of the items in the sample are defective, the lot is accepted, otherwise it is rejected. If the submitted lot varies in size from 5000 units to 10000 units, what can you say about the protection by this plan? If 0·05 is the LTPD, does this scheme offer reasonable protection to the consumer? 20 marks

Answer approach & key points

This is a numerical problem requiring systematic calculation across three parts. Allocate approximately 35% time to part (a) with its four sub-parts on control charts (15 marks), 30% to part (b) on Weibull distribution derivations (15 marks), and 35% to part (c) on acceptance sampling analysis (20 marks). Begin each part with clear identification of given parameters, show all formulas before substitution, and conclude with explicit interpretation of results.

  • Part (a)(i): Correct application of x̄ chart limits using A₂R̄ and R chart limits using D₃R̄, D₄R̄ with n=5
  • Part (a)(ii): Estimation of process standard deviation using σ̂ = R̄/d₂ = 4.56/2.326
  • Part (a)(iii): Calculation of Cp and Cpk indices comparing process capability with specification limits 19±5
  • Part (a)(iv): Calculation of β-risk (Type II error) using normal distribution for shifted mean μ=24
  • Part (b): Derivation of Weibull hazard function h(t) = (β/α)(t/α)^(β-1) and reliability R(t) = exp[-(t/α)^β], with IFR/CFR/DFR classification based on β
  • Part (c): Analysis of variable sample size plan, calculation of acceptance probability using binomial/Poisson approximation, and evaluation against LTPD=0.05 for consumer protection
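
The four calculations in part (a) can be checked numerically. The following is a rough sketch using only the constants given in the question; the variable names are illustrative, and the β-risk is computed for the x̄ chart alone on the assumption that the process standard deviation is unchanged by the shift.

```python
from math import sqrt
from statistics import NormalDist

# Data and tabulated constants from the question (n = 5).
n = 5
xbarbar, rbar = 20.0, 4.56
A2, D3, D4, d2 = 0.577, 0.0, 2.114, 2.326
LSL, USL = 19.0 - 5.0, 19.0 + 5.0

# (i) Three-sigma control limits.
xbar_lcl = xbarbar - A2 * rbar
xbar_ucl = xbarbar + A2 * rbar
r_lcl, r_ucl = D3 * rbar, D4 * rbar

# (ii) Process standard deviation estimated from the average range.
sigma_hat = rbar / d2

# (iii) Capability indices; the process is centred at 20, not at the nominal 19.
cp = (USL - LSL) / (6 * sigma_hat)
cpk = min(USL - xbarbar, xbarbar - LSL) / (3 * sigma_hat)

# (iv) Probability that the next sample mean still falls inside the limits
# when the true mean is 24 (ASSUMPTION: sigma is unchanged by the shift).
shifted = NormalDist(mu=24.0, sigma=sigma_hat / sqrt(n))
beta = shifted.cdf(xbar_ucl) - shifted.cdf(xbar_lcl)

print(round(xbar_lcl, 3), round(xbar_ucl, 3), round(r_lcl, 3), round(r_ucl, 3))
print(round(sigma_hat, 4), round(cp, 3), round(cpk, 3), round(beta, 4))
```

Both indices come out below 1, so the process is not capable of meeting 19 ± 5 as it stands, and the shift to 24 would be missed on the first sample only about 6% of the time.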
Q3
50 marks; directive: solve; topic: Manpower planning, queuing theory and inventory management

(a) It is planned to raise a research team to a strength of 50 chemists, which is to be maintained. The wastage of recruits depends on their length of service which is as follows:

| Year | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 |
|---|---|---|---|---|---|---|---|---|---|---|
| Total percentage who have left by the end of year | 5 | 36 | 55 | 63 | 68 | 73 | 79 | 87 | 97 | 100 |

What is the required number of recruitments per year necessary to maintain the required strength? There are 8 senior posts for which the length of service is the main criterion. What is the average length of service after which the next entrant expects promotion to one of these posts? (20 marks)

(b) Explain the structure of a queuing system. Explain M/M/1 queuing system and obtain steady-state solution. Also calculate busy period distribution. (15 marks)

(c) A company that operates for 50 weeks in a year is concerned about its stocks of copper cable. This costs ₹ 240 a metre and there is a demand for 8000 metres a week. Each replenishment costs ₹ 1,050 for administration and ₹ 1,650 for delivery, while holding costs are estimated at 25 percent of value held a year. Assuming that no shortages are allowed, what is the optimal inventory policy for the company? How would this analysis differ if the company wants to maximize its profits rather than minimize cost? What is the gross profit if the company sells the cable for ₹ 360 a metre? (15 marks)

Answer approach & key points

This is a multi-part numerical problem requiring you to solve three distinct operations research scenarios. Allocate approximately 40% of time to part (a) given its 20 marks, and 30% each to parts (b) and (c). Begin with clear problem identification for each sub-part, show all working steps with proper formulae, and conclude with precise numerical answers with units. For part (b), balance theoretical explanation with mathematical derivation.

  • Part (a): Calculate annual wastage rates from cumulative percentages, determine survival probabilities, compute required annual recruitment using renewal equation, and find average service length for promotion using weighted probability distribution
  • Part (a): Correctly interpret 'total percentage who have left' as cumulative distribution and derive conditional probabilities of leaving in each specific year
  • Part (b): Define queuing system components (arrival process, service mechanism, queue discipline) and derive steady-state probabilities for M/M/1 using balance equations with ρ = λ/μ < 1
  • Part (b): Obtain explicit formulas for P₀ = 1-ρ, Pₙ = ρⁿ(1-ρ), and derive busy period distribution using Takács formula or generating function approach
  • Part (c): Apply EOQ model with D = 400,000 metres/year, C₀ = ₹2,700, Cₕ = ₹60/metre/year, calculate optimal Q*, cycle time, and total minimum cost
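  • Part (c): Distinguish cost minimization from profit maximization by incorporating the revenue function, show that the optimal quantity remains unchanged under a constant price, and compute gross profit as revenue minus purchase and inventory costs: (360-240) × 400,000 less the ₹360,000 annual ordering-plus-holding cost at Q* = 6,000 metres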
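
The EOQ arithmetic for part (c) is short enough to verify directly. This is a minimal sketch with illustrative variable names; treating gross profit as the sales margin net of the annual ordering and holding cost is an assumption about what the question intends.

```python
from math import sqrt

# Inputs from the question.
weeks_per_year, weekly_demand = 50, 8000
unit_cost, selling_price = 240.0, 360.0
order_cost = 1050.0 + 1650.0        # administration + delivery per replenishment
holding_rate = 0.25                 # fraction of value held per year

annual_demand = weeks_per_year * weekly_demand      # D, metres per year
holding_cost = holding_rate * unit_cost             # Ch, rupees per metre per year

# Classic EOQ with no shortages.
q_star = sqrt(2 * order_cost * annual_demand / holding_cost)
cycle_weeks = q_star / weekly_demand
inventory_cost = order_cost * annual_demand / q_star + holding_cost * q_star / 2

# ASSUMPTION: gross profit = margin on sales minus annual inventory cost.
sales_margin = (selling_price - unit_cost) * annual_demand
profit_after_inventory = sales_margin - inventory_cost

print(q_star, cycle_weeks, inventory_cost)
print(sales_margin, profit_after_inventory)
```

Because revenue does not depend on the order quantity when the price is fixed, maximizing profit and minimizing cost lead to the same Q*; only the reported figure changes.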
Q4
50 marks; directive: solve; topic: Linear programming, crew scheduling and sequential sampling

(a) Use penalty method to solve the following linear programming problem :

Maximize Z = x₁ + 2x₂ + 3x₃ - x₄

subject to the constraints
x₁ + 2x₂ + 3x₃ = 15
2x₁ + x₂ + 5x₃ = 20
x₁ + 2x₂ + x₃ + x₄ = 10
x₁, x₂, x₃, x₄ ≥ 0 (20 marks)

(b) An airline that operates seven days a week has the time-table shown below. Crew must have a minimum layover of 5 hours between flights. Obtain the pairing of flights that minimizes layover time away from home. For any given pairing, crew will be based at the city that results in the smaller layover :

Delhi-Jaipur

| Flight No. | Departure | Arrival |
|---|---|---|
| 1 | 7:00 AM | 8:00 AM |
| 2 | 8:00 AM | 9:00 AM |
| 3 | 1:30 PM | 2:30 PM |
| 4 | 6:30 PM | 7:30 PM |

Jaipur-Delhi

| Flight No. | Departure | Arrival |
|---|---|---|
| 101 | 8:00 AM | 9:15 AM |
| 102 | 8:30 AM | 9:45 AM |
| 103 | 12 Noon | 1:15 PM |
| 104 | 5:30 PM | 6:45 PM |

For each pair, also mention the city where the crew should be based. (15 marks)

(c) What are sequential sampling plans? Suggest a sequential sampling plan for which p₁ = 0·01, α = 0·05, p₂ = 0·06 and β = 0·10. (15 marks)

Answer approach & key points

Begin with the directive 'solve' for part (a), applying the Big-M penalty method to convert equality constraints and maximize the objective. Allocate approximately 40% of time to part (a) given its 20 marks, 30% to part (b) for crew scheduling optimization, and 30% to part (c) for sequential sampling theory and design. Structure as: (a) complete LP solution with simplex iterations, (b) layover time matrix and optimal pairing, (c) definition followed by ASN and OC curve construction.

  • Part (a): Convert to standard form using Big-M penalty for equality constraints; introduce artificial variables A₁, A₂ with -M coefficient in objective
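  • Part (a): Execute simplex iterations showing entering and leaving variables until optimality is reached with Z_max = 15, x₁ = 2.5, x₂ = 2.5, x₃ = 2.5, x₄ = 0 (check: the first constraint gives Z = 15 - x₄, and these values satisfy all three equalities)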
  • Part (b): Construct 4×4 layover time matrix for Delhi-based and Jaipur-based crews; calculate layovers respecting 5-hour minimum
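  • Part (b): Take the element-wise minimum of the two matrices and solve the resulting assignment problem (Hungarian method); with layovers under 5 hours rolled over to the next day, the minimum total layover is 52.5 hours, e.g. 1-102 (Jaipur base), 2-104 (Delhi), 3-101 (Delhi), 4-103 (Jaipur), with 1-103 and 4-102 as an alternative optimum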
  • Part (c): Define sequential sampling as item-by-item inspection with decision boundaries; state Wald's SPRT principles
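  • Part (c): Calculate decision parameters h₁ ≈ 1.22, h₂ ≈ 1.57, s ≈ 0.028 and construct the acceptance line -h₁ + sn and rejection line h₂ + sn; evaluate the ASN at p = 0, p₁, s and p₂ (roughly 43, 60, 70 and 40 items respectively) and describe the OC curve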
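
For part (b), the layover matrix and the pairing can be cross-checked by brute force, since a 4×4 assignment has only 24 permutations. The sketch below is illustrative rather than the examiner's method (the Hungarian method is what the answer should show); function names are hypothetical, and it assumes a layover shorter than 5 hours rolls over to the same flight on the next day.

```python
from itertools import permutations

MIN_LAYOVER = 5 * 60      # minutes
DAY = 24 * 60             # minutes


def hm(hours, minutes=0):
    """Clock time as minutes after midnight."""
    return hours * 60 + minutes


# flight number -> (departure, arrival), from the timetable in the question
DELHI_JAIPUR = {1: (hm(7), hm(8)), 2: (hm(8), hm(9)),
                3: (hm(13, 30), hm(14, 30)), 4: (hm(18, 30), hm(19, 30))}
JAIPUR_DELHI = {101: (hm(8), hm(9, 15)), 102: (hm(8, 30), hm(9, 45)),
                103: (hm(12), hm(13, 15)), 104: (hm(17, 30), hm(18, 45))}


def layover(arrival, departure):
    """Wait between arrival and the next eligible departure.
    ASSUMPTION: a gap under 5 hours rolls over to the next day."""
    gap = (departure - arrival) % DAY
    return gap if gap >= MIN_LAYOVER else gap + DAY


def best_base(i, j):
    """Smaller layover for pairing (i, j) and the base city that gives it."""
    delhi = layover(DELHI_JAIPUR[i][1], JAIPUR_DELHI[j][0])   # waits in Jaipur
    jaipur = layover(JAIPUR_DELHI[j][1], DELHI_JAIPUR[i][0])  # waits in Delhi
    return (delhi, "Delhi") if delhi <= jaipur else (jaipur, "Jaipur")


def solve():
    """Enumerate all pairings and keep the first one with minimum total layover."""
    best_total, best_pairs = None, None
    for perm in permutations(JAIPUR_DELHI):
        pairs = [(i, j) + best_base(i, j) for i, j in zip(DELHI_JAIPUR, perm)]
        total = sum(p[2] for p in pairs)
        if best_total is None or total < best_total:
            best_total, best_pairs = total, pairs
    return best_total, best_pairs


total_minutes, pairing = solve()
print(total_minutes / 60)
for i, j, minutes, base in pairing:
    print(i, j, minutes / 60, base)
```

The enumeration keeps the first optimum it meets; ties exist, so an answer that reports a different pairing with the same total is equally valid.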

Section B

Q5
50 marks; compulsory; directive: calculate; topic: Time series, econometrics, life tables, psychometrics

(a) Apply the method of link relatives to the following data and calculate the seasonal indices :

Price of Rice (in ₹ per 10 kg)

| Quarter | 2001 | 2002 | 2003 | 2004 |
|---------|------|------|------|------|
| 1 | 75 | 86 | 90 | 100 |
| 2 | 60 | 65 | 72 | 78 |
| 3 | 54 | 63 | 66 | 72 |
| 4 | 59 | 80 | 82 | 93 |

10 marks

(b) Derive the means and variances of the sampling distributions of the OLS estimates of α and β in the two-variable linear model Y = α + βX + u. 10 marks

(c) Consider, in the usual notations, the equation y = Y₁β + X₁γ + u, where y is an (n × 1) vector, Y₁ is an (n × (g-1)) matrix, X₁ is an (n × k) matrix. Derive the equations for the two-stage least square method of estimation. 10 marks

(d) If the survivorship function l(x) in life table is linear between x and x+1, and complete expectations of life at ages 40 and 41 for a particular group of persons are 21·39 years and 20·91 years respectively and l(40) = 41176, find the number of persons that attain the age 41. 10 marks

(e) Compute the T-scores corresponding to test score x for the following frequency distribution :

| x | 1 | 2 | 3 | 4 | 5 |
|-----|---|---|---|---|---|
| f | 2 | 3 | 8 | 6 | 1 |

(Cumulative Normal Distribution Table is given in Page No. 9) 10 marks

Answer approach & key points

This multi-part question requires precise calculation and derivation across five distinct statistical domains. Allocate approximately 20% time to each sub-part: (a) construct link relatives table and seasonal indices using chain relatives method; (b) derive OLS estimators' properties using Gauss-Markov assumptions; (c) set up 2SLS normal equations showing projection onto instruments; (d) apply linear survivorship assumption to solve for l(41); (e) compute percentile ranks then transform to T-scores using given normal table. Present each part clearly with step-by-step working.

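  • For (a): Calculate link relatives by expressing each quarter's value as a percentage of the preceding quarter, average them quarter-wise, chain the averages to get chain relatives (first quarter = 100), correct for the discrepancy in the second first-quarter value, and normalize the seasonal indices so that they total 400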
  • For (b): Derive E(α̂) = α and E(β̂) = β showing unbiasedness, then derive Var(α̂) = σ²ΣX²/(nΣx²) and Var(β̂) = σ²/Σx² using matrix or scalar algebra
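  • For (c): State first stage projection Ŷ₁ = X(X'X)⁻¹X'Y₁, where X holds all predetermined variables of the system and Pₓ = X(X'X)⁻¹X'; then run second stage OLS of y on Ŷ₁ and X₁ to obtain the 2SLS estimator of the stacked coefficients δ = (β', γ')', namely δ̂₂ₛₗₛ = (Z'PₓZ)⁻¹Z'Pₓy where Z = [Y₁|X₁]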
  • For (d): Use linearity of l(x) to write L(40) = ½[l(40) + l(41)], so that e°₄₀ = ½ + [l(41)/l(40)](e°₄₁ + ½); then solve l(41) = l(40)(e°₄₀ - ½)/(e°₄₁ + ½) = 41176 × 20.89/21.41 ≈ 40176
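  • For (e): Compute cumulative frequencies, percentile ranks P = (100/N)(C - f/2), where C is the cumulative frequency up to and including the score and f is that score's own frequency; find the corresponding z-scores from the normal table, then T = 50 + 10z for each score value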
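
The T-score steps for part (e) can be sketched as follows. This is a minimal illustration that uses the mid-interval percentile rank (cumulative frequency below the score plus half the score's own frequency); the function name is hypothetical, and the inverse normal CDF from the standard library stands in for the printed table.

```python
from statistics import NormalDist

scores = [1, 2, 3, 4, 5]
freqs = [2, 3, 8, 6, 1]


def t_scores(scores, freqs):
    """Map each raw score to T = 50 + 10z via its mid-interval percentile rank."""
    total = sum(freqs)
    below = 0                      # cumulative frequency below the current score
    result = {}
    for x, f in zip(scores, freqs):
        proportion = (below + f / 2) / total
        z = NormalDist().inv_cdf(proportion)
        result[x] = 50 + 10 * z
        below += f
    return result


t = t_scores(scores, freqs)
for x in scores:
    print(x, round(t[x], 1))
```

With a printed table the z-values are read to two or three decimals, so hand-computed T-scores may differ from these in the last digit.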
Q6
50 marks; directive: explain; topic: Time series analysis, econometrics, industrial statistics

(a) Explain Akaike information criterion for order selection in an ARMA (p, q) process. 15 marks

(b) Define autocorrelation coefficient. What are its consequences for ordinary least squares? Discuss the maximum likelihood estimation of the model, in the usual notations, Y = Xβ + u with AR (autoregressive)(1) disturbance. 20 marks

(c) Explain the method of collection of industrial data. Describe the (i) official publications for data collection and (ii) statistics collected by the various official agencies pertaining to industrial production. 15 marks

Answer approach & key points

Explain the theoretical foundations across all three parts with appropriate mathematical derivations where required. Allocate approximately 30% effort to part (a) on AIC and ARMA order selection, 40% to part (b) on autocorrelation and MLE estimation given its higher marks, and 30% to part (c) on industrial data collection methods. Structure with clear sectional headings, begin each part with precise definitions, develop through step-by-step reasoning, and conclude with practical implications or limitations.

  • Part (a): Definition of AIC as -2log(L) + 2k where L is likelihood and k is number of parameters; trade-off between goodness-of-fit and model complexity; comparison with BIC/AICc; application to ARMA(p,q) via minimization over candidate orders
  • Part (b): Autocorrelation coefficient ρ_k = Cov(u_t, u_{t-k})/Var(u_t); consequences for OLS: estimates remain unbiased but are inefficient, conventional standard errors are biased, and t/F tests are invalid; MLE derivation for AR(1) errors using the error covariance matrix Ω and the transformation that whitens it, concentrated likelihood, and iterative estimation
  • Part (c): Methods: census vs sample surveys, ASI (Annual Survey of Industries) schedule, establishment surveys; Official publications: ASI Summary Results, Index of Industrial Production (IIP), Economic Census; Agencies: CSO (now NSO), DIPP (now DPIIT), Labour Bureau, RBI industrial data
  • Mathematical rigor: Proper likelihood functions, matrix notation for GLS transformation, stationarity conditions for AR(1) parameter |ρ| < 1
  • Applied context: Indian industrial statistics system, ASI coverage of registered manufacturing, limitations of informal sector data
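
For part (b), the exact log-likelihood that the MLE discussion builds on can be written out as below. This is a sketch in one common notation (x_t' is the t-th row of X, σ_ε² the innovation variance, both symbols introduced here for illustration), assuming normal innovations and a stationary AR(1) disturbance.

```latex
\begin{aligned}
u_t &= \rho u_{t-1} + \varepsilon_t, \qquad
\varepsilon_t \sim \text{i.i.d. } N(0, \sigma_\varepsilon^2), \quad |\rho| < 1, \\
\ln L(\beta, \rho, \sigma_\varepsilon^2)
 &= -\frac{n}{2}\ln\!\left(2\pi\sigma_\varepsilon^2\right)
    + \frac{1}{2}\ln\!\left(1-\rho^2\right) \\
 &\quad - \frac{1}{2\sigma_\varepsilon^2}\Big[(1-\rho^2)\,(y_1 - x_1'\beta)^2
    + \sum_{t=2}^{n}\big\{(y_t - \rho y_{t-1}) - (x_t - \rho x_{t-1})'\beta\big\}^2\Big].
\end{aligned}
```

The term ½ ln(1 - ρ²) comes from the stationary distribution of the first disturbance; dropping it and the first observation gives the conditional likelihood that underlies the Cochrane-Orcutt iteration.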
Q7
50 marks; directive: explain; topic: Demography and vital statistics

(a) What are the various indices of mortality measure? Explain the purpose and procedure for standardizing them. (20 marks)

(b) With usual notations, obtain logistic curve as given by P(t) = L / (1 + e^(r(β-t))) ; t > 0, β > 0, r > 0 for population growth model. Also discuss its any three properties. (15 marks)

(c) In what way do total fertility rate (TFR), gross reproduction rate (GRR) and net reproduction rate (NRR) differ from one another as a measure of reproduction? (15 marks)

Answer approach & key points

The directive 'explain' demands conceptual clarity with logical exposition. For part (a) carrying 20 marks, allocate ~40% effort covering CDR, ASDR, IMR, MMR with direct and indirect standardization procedures; for (b) with 15 marks, spend ~30% on deriving the logistic curve from differential equation dP/dt = rP(1-P/L) and discussing properties like inflection point, asymptotic behavior, and symmetry; for (c) with 15 marks, devote remaining ~30% to contrasting TFR, GRR, NRR through formulas, assumptions, and replacement-level interpretations. Structure: brief intro → systematic part-wise treatment → integrated conclusion on demographic measurement evolution.

  • Part (a): Lists at least 5 mortality indices (CDR, ASDR, IMR, MMR, U5MR) with formulas; explains purpose of standardization (eliminating age-structure bias for inter-population/temporal comparison); describes direct standardization (applying standard population weights) and indirect standardization (applying standard rates to study population) with step-wise procedure
  • Part (a): Cites Indian context, e.g. Sample Registration System (SRS) mortality estimates or NFHS data used for interstate comparisons
  • Part (b): Derives logistic curve by solving dP/dt = rP(1-P/L) with initial condition P(0) = P₀; shows integration steps, substitution, and algebraic manipulation to reach given form with β = (1/r)ln[(L-P₀)/P₀]
  • Part (b): Discusses three properties—(i) sigmoid/S-shaped curve with inflection point at P=L/2, t=β; (ii) upper asymptote L (carrying capacity); (iii) growth rate parameter r determining steepness; may add symmetry or point of diminishing returns
  • Part (c): Distinguishes TFR (age-specific fertility rates summed, no mortality adjustment, both sexes), GRR (TFR × proportion female births, no mortality, female births only), NRR (GRR adjusted by survival probabilities lₓ to reproductive age, female generation replacement measure); notes NRR=1 indicates exact replacement, TFR≈2.1 replacement level for India
  • Part (c): Clarifies that TFR is period measure, GRR/NRR are generation measures; NRR most complete for population projection while TFR most commonly reported
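
The derivation asked for in part (b) can be laid out as below. It is a sketch that assumes the growth law dP/dt = rP(1 - P/L) with initial size P₀ < L/2, which is what makes β positive.

```latex
\begin{aligned}
\frac{dP}{dt} &= rP\left(1 - \frac{P}{L}\right)
 \;\Longrightarrow\;
 \left(\frac{1}{P} + \frac{1}{L-P}\right) dP = r\,dt
 \;\Longrightarrow\;
 \ln\frac{P}{L-P} = rt + c, \\
c &= \ln\frac{P_0}{L-P_0} = -r\beta,
 \qquad \beta = \frac{1}{r}\ln\frac{L-P_0}{P_0}, \\
\frac{P}{L-P} &= e^{r(t-\beta)}
 \;\Longrightarrow\;
 P(t) = \frac{L}{1 + e^{r(\beta - t)}}.
\end{aligned}
```

Setting t = β gives P = L/2, which is the inflection point, and letting t grow without bound gives the upper asymptote L.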
Q8
50 marks; directive: explain; topic: Index numbers and queuing theory

(a) How does the concept of wholesale price index work? Describe the major components of wholesale price index. Explain the methodology of index numbers of area, production and yield in agriculture. (15 marks)

(b) Explain G/M/1 model and show that the steady-state arrival point system has a geometric distribution. (20 marks)

(c) If e(x) is the average number of complete years of life lived by each of l(x) persons in life table population after attaining age x, and q(x) is the probability of dying within one year following the attainment of age x, prove that

q(x) = (1 - (e(x) - e(x+1))) / (1 + e(x+1)) (15 marks)

Answer approach & key points

The directive 'explain' demands clear exposition with logical reasoning and supporting evidence. Structure your answer as follows. (a) ~30% time/space (15 marks): define WPI, list its three major components (primary articles, fuel & power, manufactured products), then detail the Laspeyres/Fisher methodology for agricultural index numbers with base year selection. (b) ~40% time/space (20 marks): define G/M/1 assumptions, derive the embedded Markov chain, use generating functions or recursive relations to prove the geometric steady-state distribution πₙ = (1-σ)σⁿ. (c) ~30% time/space (15 marks): note that e(x) counts complete years only (the curtate expectation), write e(x) and e(x+1) as sums of l(x+k) divided by l(x) and l(x+1), obtain the recursion e(x) = p(x)[1 + e(x+1)], and substitute p(x) = 1 - q(x) to reach the required expression. Conclude each part with a brief summary of significance.

  • For (a): WPI definition as base-weighted index measuring wholesale price movements; three commodity groups with their weights (primary articles ~23%, fuel & power ~13%, manufactured products ~64% in India's WPI)
  • For (a): Agricultural index methodology — fixed base vs chain base, Laspeyres formula P₀₁ = Σp₁q₀/Σp₀q₀, area and yield indices as relatives with geometric/ arithmetic mean aggregation
  • For (b): G/M/1 model specification — general inter-arrival distribution, exponential service (rate μ), single server; embedded Markov chain at arrival epochs with transition probabilities
  • For (b): Proof of geometric distribution — derive root σ of generating equation A*(μ(1-z)) = z where σ ∈ (0,1), show πⱼ = (1-σ)σʲ satisfies balance equations and normalization
  • For (c): Life table relationships: e(x) = [l(x+1) + l(x+2) + …]/l(x) (curtate expectation, since only complete years are counted), p(x) = l(x+1)/l(x) = 1 - q(x); factor out l(x+1) to get e(x) = p(x)[1 + e(x+1)] and rearrange to derive the identity
  • For (c): Verification that the derived expression behaves sensibly at the end of the table (e(x+1) = 0 gives q(x) = 1 - e(x)) and is consistent with the curtate expectation of life definition
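
The identity in part (c) follows in two lines once e(x) is read as the curtate expectation. The steps below are a sketch under that reading, with p(x) = l(x+1)/l(x).

```latex
\begin{aligned}
e(x) &= \frac{1}{l(x)}\sum_{k=1}^{\infty} l(x+k)
      = \frac{l(x+1)}{l(x)}\left[1 + \frac{1}{l(x+1)}\sum_{k=1}^{\infty} l(x+1+k)\right]
      = p(x)\,\bigl[1 + e(x+1)\bigr], \\
q(x) &= 1 - p(x) = 1 - \frac{e(x)}{1 + e(x+1)}
      = \frac{1 - \bigl(e(x) - e(x+1)\bigr)}{1 + e(x+1)}.
\end{aligned}
```

The same manipulation with the complete expectation T(x)/l(x) would leave extra half-year terms, which is why the curtate definition matters here.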
