Statistics 2021 Paper I (50 marks, compulsory; directive word: Solve)

Q1

(a) A production unit manufacturing surgical masks is concerned about the quality of their masks. A random sample of n masks are inspected to estimate 'p', the probability of manufacturing a defective mask. How large a sample is required so that the estimate of p lies in the range p ± 0.1 with probability 0.95? (10 marks)

(b) An insurance company studies a sample of 150 policy-holders. There are three categories of policies: auto, home and medical. The following results are obtained about the policies held by the policy-holders:
(i) 30 have only home insurance
(ii) 10 have only medical insurance
(iii) 98 have auto insurance, but not all three types of insurance
(iv) 27 have medical insurance, but not all three types of insurance
(v) 13 have auto and medical insurance
Given that a policy-holder has medical insurance, calculate the probability that he has home insurance. (10 marks)

(c) Let X and Y be independent and identically distributed exponential random variables with mean λ > 0. Define $$Z = \begin{cases} 1, & \text{if} \quad X < Y \\ 0, & \text{if} \quad X \geq Y \end{cases}$$ Find E[X|Z = 1] + E[X|Z = 0]. (10 marks)

(d) Let X₁, X₂, ..., Xₙ be a random sample from $$f(x, \theta) = \frac{\log(\theta)}{\theta - 1}\theta^x; \quad 0 < x < 1, \quad \theta > 1$$ Is there a function of θ, say g(θ), for which there exists an unbiased estimator whose variance attains the C-R lower bound? If yes, find it. If not, show why not. (10 marks)

(e) Let f(x, θ) be the Cauchy pdf $$f(x, \theta) = \frac{\theta}{\pi} \frac{1}{\theta^2 + x^2}; \quad -\infty < x < \infty, \quad \theta > 0$$
(i) Show that this family does not have Monotone Likelihood Ratio (MLR).
(ii) If X is one observation from f(x, θ), show that |X| is sufficient for θ and hence the distribution of |X| does have an MLR.
(5+5 marks)

Directive word: Solve

This question asks you to solve: work every sub-part through to an explicit final answer, showing each step of the derivation. The directive word signals the depth of analysis expected, the structure of your answer, and the weight of evidence you must bring.

How this answer will be evaluated

Approach

Solve each sub-part systematically with clear mathematical derivations. For (a), apply the normal approximation to the binomial to determine the sample size; for (b), use set theory and conditional probability with a Venn diagram; for (c), exploit the memoryless property of the exponential distribution together with symmetry arguments; for (d), verify the regularity conditions and apply the equality condition of the Cramér-Rao inequality; for (e), construct the likelihood ratio and apply the factorization theorem. Allocate approximately 15% of your time to (a), 15% to (b), 20% to (c), 25% to (d), and 25% to (e), given their analytical complexity.
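The arithmetic for (a) and (b) can be sketched in a few lines of Python. This is a minimal illustration rather than a model answer: the function names are hypothetical, (a) assumes the conservative choice p = 0.5, and (b) assumes every one of the 150 policy-holders holds at least one policy, which the question does not state explicitly.

```python
from fractions import Fraction
from math import ceil
from statistics import NormalDist


def sample_size(margin, confidence, p=0.5):
    """Smallest integer n with n >= z^2 * p(1-p) / margin^2 (normal approximation)."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)  # about 1.96 for 0.95
    return ceil(z * z * p * (1 - p) / margin ** 2)


def home_given_medical():
    """Solve the Venn regions of part (b) and return P(home | medical)."""
    total, only_home, only_medical = 150, 30, 10
    auto_not_all, medical_not_all, auto_and_medical = 98, 27, 13
    # Unknown regions: am = auto & medical only, hm = home & medical only,
    # ahm = all three.
    # (iv):   only_medical + am + hm = 27     ->  am + hm  = 17
    # (v):    am + ahm = 13
    # total:  98 + 30 + 10 + hm + ahm = 150   ->  hm + ahm = 12
    #         (assumes nobody in the sample holds zero policies)
    am_plus_hm = medical_not_all - only_medical
    hm_plus_ahm = total - auto_not_all - only_home - only_medical
    hm_minus_ahm = am_plus_hm - auto_and_medical  # (am + hm) - (am + ahm)
    hm = (hm_plus_ahm + hm_minus_ahm) // 2
    ahm = hm_plus_ahm - hm
    am = auto_and_medical - ahm
    medical = only_medical + am + hm + ahm
    return Fraction(hm + ahm, medical)


print(sample_size(0.1, 0.95))   # 97
print(home_given_medical())     # 12/31
```

Rounding up rather than to the nearest integer matters in (a): the formula gives about 96.04, and n = 96 would fall just short of the required precision.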

Key points expected

  • (a) Sample size formula n = z²₀.₀₂₅ × p(1-p)/d² with z₀.₀₂₅ = 1.96, d = 0.1 and conservative p = 0.5, giving n ≥ 96.04, i.e. n = 97 after rounding up
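  • (b) Complete Venn diagram construction, assuming every policy-holder holds at least one policy: auto∩medical only = 9, home∩medical only = 8, all three = 4 (only auto and auto∩home only are not separately determined; together they total 89), so 31 hold medical insurance and 12 of them also hold home insurance, yielding conditional probability 12/31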
  • (c) E[X|Z=1] = E[X|X<Y] = λ/2 by memoryless property and E[X|Z=0] = λ + λ/2 = 3λ/2, sum = 2λ
  • (d) Verification of regularity conditions, per-observation Fisher information I(θ) = [(θ-1)² - θ(log θ)²]/[θ²(θ-1)²(log θ)²], and proof that only linear functions of g(θ) = E[X] = θ/(θ-1) - 1/log θ admit an unbiased estimator attaining the C-R bound, with X̄ attaining it for g(θ)
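  • (e)(i) Counterexample showing f(x, θ₂)/f(x, θ₁) is not monotone in x for θ₂ > θ₁: the ratio equals θ₁/θ₂ < 1 at x = 0 and tends to θ₂/θ₁ > 1 as x → -∞ and as x → +∞, so it falls and then rises
  • (e)(ii) Factorization theorem application showing |X| sufficient (the density depends on x only through x² = |x|²), and proof that the density of T = |X|, 2θ/[π(θ²+t²)] for t > 0, has MLR in t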
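
A sketch of the working behind key point (d), assuming the usual regularity conditions hold (the support 0 < x < 1 does not depend on θ). The score of the sample factors as

$$\frac{\partial}{\partial\theta}\log L(\theta) = \sum_{i=1}^{n}\left[\frac{1}{\theta\log\theta} - \frac{1}{\theta-1} + \frac{x_i}{\theta}\right] = \frac{n}{\theta}\left[\bar{x} - g(\theta)\right], \qquad g(\theta) = \frac{\theta}{\theta-1} - \frac{1}{\log\theta}$$

This is the equality condition of the C-R inequality, so X̄ is unbiased for g(θ) and attains the bound. By the standard attainment theorem the same holds for a + bX̄ as an estimator of a + b g(θ), and for no other function of θ. The per-observation Fisher information and the resulting bound are

$$I(\theta) = \frac{1}{\theta^2(\log\theta)^2} - \frac{1}{\theta(\theta-1)^2} = \frac{g'(\theta)}{\theta}, \qquad \frac{[g'(\theta)]^2}{n\,I(\theta)} = \frac{\theta\,g'(\theta)}{n} = \frac{1}{n}\left[\frac{1}{(\log\theta)^2} - \frac{\theta}{(\theta-1)^2}\right] = \operatorname{Var}(\bar{X})$$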

Evaluation rubric

Setup correctness (weight 20%, max marks 10)
  • Excellent: Correctly identifies distributions and parameters for all parts: binomial with normal approximation for (a), disjoint Venn regions for (b), i.i.d. exponential with memoryless property for (c), one-parameter exponential family structure for (d), and scale-family structure of the Cauchy density for (e)
  • Average: Correct setup for 3-4 parts but misses key assumptions like independence in (c) or regularity conditions in (d), or uses wrong distribution for (a)
  • Poor: Fundamental errors in problem identification: treats (a) as exact binomial without approximation, misinterprets 'not all three' in (b), or fails to recognize exponential distribution properties in (c)

Method choice (weight 20%, max marks 10)
  • Excellent: Optimal methods throughout: conservative sample size formula for (a), systematic Venn diagram approach for (b), conditional expectation via memoryless property for (c), Cramér-Rao machinery with score function for (d), and direct likelihood ratio analysis for (e)
  • Average: Correct methods for most parts but suboptimal approaches like integration by parts for (c) instead of memoryless property, or missing factorization shortcut in (e)(ii)
  • Poor: Incorrect methods such as Chebyshev's inequality for (a), naive conditional probability without set analysis for (b), or failure to use factorization theorem for sufficiency in (e)

Computation accuracy (weight 20%, max marks 10)
  • Excellent: Precise calculations: n = 97 for (a), exact fraction 12/31 for (b), clean derivation yielding 2λ for (c), correct Fisher information and variance bound verification for (d), explicit likelihood ratio expressions for (e)
  • Average: Minor arithmetic errors in 1-2 parts such as Venn diagram miscount in (b) or algebraic slip in Fisher information calculation for (d)
  • Poor: Major computational errors: wrong z-value in (a), incorrect set cardinality calculations in (b), integration errors in (c), or failure to compute score function variance in (d)

Interpretation (weight 20%, max marks 10)
  • Excellent: Clear interpretation of results: practical meaning of sample size in quality control for (a), insurance risk assessment implication for (b), symmetry exploitation explanation for (c), characterization of which functions g(θ) admit a bound-attaining unbiased estimator for (d), and insight into why |X| creates MLR despite Cauchy's pathologies for (e)
  • Average: Correct answers with minimal interpretation, or partial interpretation missing connections between parts like why exponential symmetry matters in (c)
  • Poor: Answers without interpretation or incorrect interpretation such as confusing sufficient statistics with complete statistics in (e), or misinterpreting conditional expectations

Final answer & units (weight 20%, max marks 10)
  • Excellent: All final answers clearly boxed with appropriate notation: n = 97 for (a), 12/31 for (b), 2λ for (c), explicit g(θ) characterization for (d), and clear MLR demonstration with concluding statement for (e)
  • Average: Correct final answers but missing in 1-2 parts, or present without clear labeling which sub-part they belong to
  • Poor: Missing final answers, wrong answers due to earlier errors propagated, or answers without proper mathematical notation (e.g., decimal approximations where exact fractions required)
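
The target values for (c) and (e) can be sanity-checked numerically before committing them to a final answer. The sketch below is only an illustration: the function names, the seed, the sample count and the choice λ = 2, θ₁ = 1, θ₂ = 2 are all assumptions made for the check, not part of the question.

```python
import random


def conditional_means(mean, n=200_000, seed=0):
    """Monte Carlo estimates of (E[X | X < Y], E[X | X >= Y])."""
    rng = random.Random(seed)
    sums = [0.0, 0.0]   # index 1: X < Y (Z = 1); index 0: X >= Y (Z = 0)
    counts = [0, 0]
    for _ in range(n):
        # expovariate takes the rate, which is 1 / mean
        x = rng.expovariate(1 / mean)
        y = rng.expovariate(1 / mean)
        z = 1 if x < y else 0
        sums[z] += x
        counts[z] += 1
    return sums[1] / counts[1], sums[0] / counts[0]


def cauchy_ratio(x, theta1, theta2):
    """f(x, theta2) / f(x, theta1) for the Cauchy density of part (e)."""
    return (theta2 / theta1) * (theta1 ** 2 + x ** 2) / (theta2 ** 2 + x ** 2)


e1, e0 = conditional_means(2.0)
print(e1, e0, e1 + e0)  # close to λ/2 = 1, 3λ/2 = 3 and 2λ = 4

# The ratio is 0.5 at x = 0 and larger on both sides, so it is not monotone in x,
# but it is increasing in |x|.
print([cauchy_ratio(x, 1, 2) for x in (-10, 0, 10)])
```

A simulation like this does not replace the derivation, but a sum far from 2λ or a ratio that looks monotone in x would signal an algebraic slip.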
