Q1 50M Compulsory solve Probability theory and statistical inference
(a) A production unit manufacturing surgical masks is concerned about the quality of their masks. A random sample of n masks is inspected to estimate 'p', the probability of manufacturing a defective mask. How large a sample is required so that the estimate of p lies in the range p ± 0.1 with probability 0.95 ? (10 marks)
(b) An insurance company studies a sample of 150 policy-holders. There are three categories of policies : auto, home and medical. The following results are obtained about the policies held by the policy-holders :
(i) 30 have only home insurance
(ii) 10 have only medical insurance
(iii) 98 have auto insurance, but not all three types of insurance
(iv) 27 have medical insurance, but not all three types of insurance
(v) 13 have auto and medical insurance
Given that a policy-holder has medical insurance, calculate the probability that he has home insurance. (10 marks)
(c) Let X and Y be independent and identically distributed exponential random variables with mean λ > 0.
Define
$$Z = \begin{cases} 1, & \text{if} \quad X < Y \\ 0, & \text{if} \quad X \geq Y \end{cases}$$
Find E[X|Z = 1] + E[X|Z = 0]. (10 marks)
(d) Let X₁, X₂, ..., Xₙ be a random sample from
$$f(x, \theta) = \frac{\log(\theta)}{\theta - 1}\theta^x; \quad 0 < x < 1, \quad \theta > 1$$
Is there a function of θ, say g(θ), for which there exists an unbiased estimator whose variance attains the C-R lower bound ? If yes, find it. If not, show why not. (10 marks)
(e) Let f(x, θ) be the Cauchy pdf
$$f(x, \theta) = \frac{\theta}{\pi} \frac{1}{\theta^2 + x^2}; -\infty < x < \infty, \theta > 0$$
(i) Show that this family does not have Monotone Likelihood Ratio (MLR).
(ii) If X is one observation from f(x, θ), show that |X| is sufficient for θ and hence the distribution of |X| does have an MLR. (5+5 marks)
Answer approach & key points
Solve each sub-part systematically with clear mathematical derivations. For (a), apply normal approximation to binomial for sample size determination; for (b), use set theory and conditional probability with Venn diagram analysis; for (c), exploit memoryless property of exponential distribution and symmetry arguments; for (d), verify regularity conditions and apply Cramér-Rao inequality; for (e), construct likelihood ratio and apply factorization theorem. Allocate approximately 15% time to (a), 15% to (b), 20% to (c), 25% to (d), and 25% to (e) given their analytical complexity.
- (a) Sample size via the normal approximation: n = z²₀.₀₂₅ p(1−p)/d² with d = 0.1; the conservative choice p = 0.5 gives n = (1.96)²(0.25)/(0.01) ≈ 96.04, so n = 97
- (b) Complete Venn diagram construction: solving the count constraints gives all three = 4, home∩medical only = 8, auto∩medical only = 9, so 31 policy-holders have medical insurance and the conditional probability is (8 + 4)/31 = 12/31
- (c) E[X|Z=1] = E[X|X<Y] = λ/2 by memoryless property and E[X|Z=0] = λ + λ/2 = 3λ/2, sum = 2λ
- (d) Verification of regularity conditions, Fisher information calculation I(θ) = [(θ−1)² − θ(log θ)²]/[θ²(θ−1)²(log θ)²], and the exponential-family argument that the C-R bound is attained exactly for affine functions of E_θ(X) = θ/(θ−1) − 1/log θ
- (e)(i) Counterexample showing L(θ₂)/L(θ₁) is not monotone by comparing likelihood ratios at x = 0 and x → ∞ for θ₂ > θ₁
- (e)(ii) Factorization theorem application showing |X| sufficient, and proof that the density of |X|, g(t; θ) = 2θ/[π(θ² + t²)] for t > 0, has MLR in t
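The arithmetic in (a) and (b) can be cross-checked numerically; a minimal sketch (variable names and the brute-force loop are mine) that computes the conservative sample size and solves the Venn-diagram constraints exactly:

```python
from math import ceil
from fractions import Fraction

# Q1(a): normal approximation gives n >= z^2 p(1-p)/d^2; the conservative
# bound p(1-p) <= 1/4 removes the dependence on the unknown p.
z, d = 1.96, 0.1                       # z ~ 97.5th percentile of N(0,1)
n = ceil(z**2 * 0.25 / d**2)           # conservative sample size

# Q1(b): write ahm = #(all three), am = #(auto & medical only),
# hm = #(home & medical only). The stated counts give:
#   am + ahm = 13          (auto and medical)
#   10 + am + hm = 27      (medical but not all three)
#   98 + ahm + 30 + 10 + hm = 150   (total policy-holders)
for ahm in range(14):
    am = 13 - ahm
    hm = 17 - am
    if ahm + hm == 12:                 # total-count constraint
        p_home_given_medical = Fraction(hm + ahm, 10 + am + hm + ahm)
```

Running the loop, only one non-negative solution satisfies every constraint, which fixes the conditional probability uniquely.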
Q2 50M derive Limit theorems and characteristic functions
(a) Let Y₁, Y₂, Y₃, ... be independent and identically distributed Poisson random variables with parameter 1. Use the central limit theorem to establish
$$n! \simeq \sqrt{2\pi n}\left(\frac{n}{e}\right)^n$$
for large value of positive integer n. (20 marks)
(b) Let X₁, X₂, ..., Xₙ be a random sample such that log Xᵢ ~ N(θ, θ) distribution with θ > 0 unknown. Show that one of the solutions of the likelihood equation is the unique MLE of θ. Obtain asymptotic distribution of MLE of θ. (15 marks)
(c) (i) State the sufficient conditions for a function φ(t) to be a characteristic function.
(ii) Investigate if the following functions are characteristic functions :
1. $e^{-t^4}$
2. $[1 + |t|]^{-1}$
Justify your answer. (5+10 marks)
Answer approach & key points
Derive the Stirling approximation in part (a) by applying CLT to Poisson sums and carefully manipulating the resulting normal approximation. For part (b), derive the likelihood equation, verify the MLE solution, and obtain its asymptotic normality via Fisher information. In part (c), state Bochner's theorem precisely, then investigate the two functions using properties of positive definiteness and Polya's criteria. Allocate approximately 40% time to (a), 30% to (b), and 30% to (c), ensuring rigorous justification at each step.
- Part (a): Define Sₙ = Y₁ + ... + Yₙ ~ Poisson(n), apply CLT to (Sₙ - n)/√n → N(0,1), and use P(Sₙ = n) with Stirling's manipulation
- Part (a): Equate Poisson pmf at n to normal density approximation and solve for n! to obtain √2πn(n/e)ⁿ
- Part (b): Construct log-likelihood l(θ) = -n/2 log(2πθ) - 1/(2θ)Σ(log Xᵢ - θ)², derive score function and likelihood equation
- Part (b): Verify the second-order condition l″(θ̂) < 0 to confirm the root is the unique MLE, then apply standard asymptotic theory: √n(θ̂ − θ) → N(0, I(θ)⁻¹)
- Part (c)(i): State Bochner's theorem: φ(0)=1, continuous at 0, positive definite (non-negative definite matrices from φ(tᵢ-tⱼ))
- Part (c)(ii): Show e^{−t⁴} fails: if it were a characteristic function, φ″(0) = 0 would give E[X²] = 0, forcing X degenerate at 0 with φ(t) ≡ 1, a contradiction; alternatively exhibit a failure of positive definiteness
- Part (c)(ii): Verify [1+|t|]⁻¹ satisfies Polya's criteria (convex on t>0, φ(0)=1, even, continuous, φ(∞)=0) hence is characteristic function
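Stirling's approximation derived in (a) is easy to sanity-check numerically; a sketch (the function name is mine) comparing n! with √(2πn)(n/e)ⁿ at n = 50:

```python
import math

# CLT route: P(S_n = n) is approximated by the N(n, n) density at its mean,
# 1/sqrt(2*pi*n); equating with the Poisson pmf e^{-n} n^n / n! gives Stirling.
def stirling(n):
    return math.sqrt(2 * math.pi * n) * (n / math.e) ** n

# relative error of Stirling's formula is about 1/(12n), so the ratio
# n!/stirling(n) should sit just above 1
ratio = math.factorial(50) / stirling(50)
```

At n = 50 the ratio is roughly 1 + 1/(12·50) ≈ 1.0017, confirming the asymptotic.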
Q3 50M construct Sequential probability ratio test and convergence
(a) Let X and Y be two independent random variables following exponential distribution with mean $\frac{1}{\lambda}$ and $\frac{1}{\mu}$ respectively, $\lambda > 0$, $\mu > 0$. Suppose that $(X_1, X_2, ..., X_n)$ and $(Y_1, Y_2, ..., Y_n)$ are sequences of observations on X and Y respectively. A random variable $U_i$ is defined as $$U_i = \begin{cases} 1, & \text{if} \quad X_i \geq Y_i, \quad i = 1, 2, ..., n \\ 0, & \text{otherwise} \end{cases}$$ Construct Wald's SPRT procedure based on $U_i$'s for testing H : $\lambda = \mu$ versus K : $\lambda = 2\mu$ with strength $(\alpha, \beta)$. (20 marks)
(b) Let $Y_i$, $i \geq 1$ be independent and identically distributed $U(-1, 1)$ random variables. Determine if the following sequences converge in probability : (i) $\left\{\frac{Y_i}{i}\right\}$ (ii) $\left\{(Y_i)^i\right\}$ (5+10 marks)
(c) Let X₁, X₂, ..., Xₙ be a random sample from uniform distribution U(− θ, θ), θ > 0. Find the complete sufficient statistic for θ. Hence, obtain the best unbiased estimator of θ. (15 marks)
Answer approach & key points
Construct the Wald's SPRT procedure for part (a) by deriving the likelihood ratio for Bernoulli outcomes, then determine convergence properties for sequences in part (b) using appropriate limit theorems, and finally derive the complete sufficient statistic and MVUE for part (c). Allocate approximately 40% time to part (a) given its 20 marks, and 30% each to parts (b) and (c), which carry 15 marks apiece. Structure with clear headings for each sub-part, showing derivations step-by-step and concluding with explicit final answers.
- For (a): Derive P(X_i ≥ Y_i) = μ/(λ+μ) under H and K, showing U_i ~ Bernoulli with p = 1/2 under H and p = 1/3 under K
- For (a): Construct Wald's SPRT with likelihood ratio Λ_n = (2/3)^T_n × (4/3)^(n-T_n) where T_n = ΣU_i, and specify continuation region with bounds A ≈ (1-β)/α and B ≈ β/(1-α)
- For (b)(i): Show Y_i/i → 0 in probability using Chebyshev's inequality or direct calculation of P(|Y_i/i| > ε)
- For (b)(ii): Analyze (Y_i)^i convergence by considering cases Y_i ∈ (-1,1), showing convergence to 0 in probability
- For (c): Identify T = max(|X_(1)|, |X_(n)|) or equivalently max(-X_(1), X_(n)) as complete sufficient statistic using factorization theorem and completeness of uniform family
- For (c): Derive E[T] = nθ/(n+1) and construct unbiased estimator θ̂ = (n+1)T/n, verifying it is the UMVUE via Lehmann-Scheffé theorem
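The unbiasedness claim behind the UMVUE in (c) can be checked numerically; a sketch (the parameter choices n = 5, θ = 2 and the midpoint rule are mine) integrating against the density of T = max|Xᵢ|:

```python
# |X_i| ~ U(0, theta) iid, so T = max|X_i| has density
# f_T(t) = n t^(n-1) / theta^n on (0, theta).
# Midpoint-rule check that E[(n+1)T/n] = theta.
n, theta, m = 5, 2.0, 100_000
h = theta / m
ET = sum((i + 0.5) * h                              # value of t
         * n * ((i + 0.5) * h) ** (n - 1) / theta ** n  # density f_T(t)
         * h                                        # interval width
         for i in range(m))
estimator_mean = (n + 1) / n * ET                   # should equal theta
```

Since E[T] = nθ/(n+1) exactly, the rescaled estimator's expectation lands on θ up to quadrature error.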
Q4 50M prove UMVUE, joint distributions and non-parametric tests
(a) Let X₁, X₂, ..., Xₙ be a random sample from Poisson distribution with mean λ > 0. Define a statistic W = (1 − 1/n)^T, T = Σᵢ₌₁ⁿ Xᵢ (i) Show that T is a complete sufficient statistic. (ii) Show that W is unbiased for e^(−λ). (iii) Show that even though W is UMVUE, it does not attain the CRLB for g(λ) = e^(−λ). (20 marks)
(b) Let $f(x, y) = \frac{y^{3/2} e^{-yx^2/2} e^{-y}}{\sqrt{2\pi}}$, $-\infty < x < \infty$, $y > 0$. (i) Obtain the marginal distribution of Y and conditional distribution of X given Y. (ii) Find E(Y), V(Y), E(X|Y), V(X|Y). (iii) Use (ii) to find E(X), V(X). (5+5+5 marks)
(c) A company's trainees are randomly assigned to groups which are put through a certain industrial inspection procedure by three different methods. At the end of the instruction period they are tested for inspection performance quality. The following are their scores :
Method A : 80, 83, 79, 85, 90, 68
Method B : 82, 84, 60, 72, 86, 67, 91
Method C : 93, 65, 77, 78, 88
Using the appropriate non-parametric test, determine at the 0·05 level of significance whether the three methods are equally effective. (15 marks)
Answer approach & key points
Prove all required results systematically, spending approximately 40% of time on part (a) given its 20 marks, 30% on part (b) for 15 marks, and 30% on part (c) for 15 marks. Structure as: (a) establish completeness via exponential family, sufficiency via factorization, unbiasedness via expectation calculation, and CRLB non-attainment via variance comparison; (b) integrate to obtain marginal Gamma distribution, derive conditional Normal, then apply law of total expectation/variance; (c) state Kruskal-Wallis test assumptions, compute ranks, calculate H-statistic, and compare with χ² critical value.
- Part (a)(i): Apply factorization theorem to show T is sufficient; use completeness property of Poisson exponential family with natural parameter space containing an open set
- Part (a)(ii): Calculate E[W] = E[(1−1/n)^T] using the Poisson pgf E[s^T] = e^{nλ(s−1)} at s = 1 − 1/n to verify unbiasedness for e^(−λ)
- Part (a)(iii): Compute Var(W), derive CRLB for g(λ)=e^(-λ), and explicitly show strict inequality Var(W) > CRLB
- Part (b): Identify Y ~ Gamma(2, 1) with marginal density y e^(−y) (integrating out x contributes √(2π/y), leaving y^{3/2}e^(−y)/√y); X|Y ~ N(0, 1/Y); apply E(X) = E[E(X|Y)] = 0 and V(X) = E[V(X|Y)] + V[E(X|Y)] = E[1/Y] = 1
- Part (c): Apply Kruskal-Wallis H-test: pool and rank all 18 observations, compute rank sums per method, calculate H = [12/N(N+1)]Σ(Ri²/ni) - 3(N+1), compare with χ²₂,₀.₀₅ = 5.991
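The Kruskal-Wallis statistic for (c) can be computed directly; a minimal sketch using plain ranks, which suffices here because all 18 scores are distinct (no tie correction needed):

```python
# Kruskal-Wallis H-test for the three training methods
A = [80, 83, 79, 85, 90, 68]
B = [82, 84, 60, 72, 86, 67, 91]
C = [93, 65, 77, 78, 88]

# pooled ranks: 1 for the smallest score, ..., N for the largest
rank = {v: i + 1 for i, v in enumerate(sorted(A + B + C))}
N = len(A) + len(B) + len(C)                       # 18 observations

# H = [12 / N(N+1)] * sum(R_i^2 / n_i) - 3(N+1)
H = 12 / (N * (N + 1)) * sum(
    sum(rank[v] for v in g) ** 2 / len(g) for g in (A, B, C)
) - 3 * (N + 1)
# H is far below the critical value 5.991 = chi^2 with 2 df at 0.05,
# so the three methods may be regarded as equally effective
```

The rank sums come out to 61, 62 and 48, giving H ≈ 0.197, so the null hypothesis of equal effectiveness is not rejected.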
Q5 50M Compulsory derive Linear regression, experimental designs, sampling theory
(a) For a simple linear regression model Yᵢ = β₀ + β₁Xᵢ + εᵢ, i = 1, ..., n
(i) Derive the least square estimators of β₀ and β₁, clearly stating the conditions assumed.
(ii) For eᵢ = Yᵢ - Ŷᵢ where Ŷᵢ is the fitted value, show that
1. Σᵢ₌₁ⁿ eᵢ = 0
2. Σᵢ₌₁ⁿ Yᵢ = Σᵢ₌₁ⁿ Ŷᵢ
3. Σᵢ₌₁ⁿ Xᵢeᵢ = 0
4. Σᵢ₌₁ⁿ Ŷᵢeᵢ = 0
5. The regression line passes through (X̄, Ȳ). 5+5
(b) In usual notations, if v, b, r, k and λ are the parameters of a Balanced Incomplete Block Design, then show that :
(i) b ≥ r + 1 ≥ λ + 2
(ii) v ≤ b ≤ (r² - 1)/λ
10
(c) For the multiple linear regression model with two predictor variables X₁ and X₂, show that the estimate of regression coefficient of X₁ is unchanged when X₂ is added to the regression model, whenever X₁ and X₂ are uncorrelated.
10
(d) A sample of size n is drawn from a population having N units by simple random sampling without replacement. A sub-sample of n₁ units is drawn from the n units by simple random sampling without replacement. Let ȳ₁ denote the mean based on n₁ units and ȳ₂, the mean based on n₂ = n - n₁ units. Consider the estimator of the population mean Ȳₙ given by :
Ŷₙ = wȳ₁ + (1-w)ȳ₂ ; 0 < w < 1
Show that E(Ŷₙ) = Ȳₙ, and obtain its variance.
10
(e) How is the efficiency of a design measured ? Derive the expression to measure the efficiency of a Randomised Block Design over a Completely Randomised Design. 10
Answer approach & key points
This question demands rigorous step-by-step mathematical proofs with clear logical progression. Allocate time proportionally: ~20% for (a)(i)-(ii) on SLR properties, ~20% for (b) on BIBD inequalities, ~20% for (c) on multiple regression orthogonality, ~20% for (d) on two-phase sampling variance, and ~20% for (e) on design efficiency. Begin each sub-part by stating assumptions, proceed with systematic derivation, and conclude with the required result clearly boxed.
- (a)(i) Correct setup of normal equations minimizing Σ(Yᵢ - β₀ - β₁Xᵢ)²; explicit statement of Gauss-Markov conditions (E(εᵢ)=0, Var(εᵢ)=σ², Cov(εᵢ,εⱼ)=0)
- (a)(ii) All five residual properties proved using normal equations: Σeᵢ=0 from first normal equation; ΣXᵢeᵢ=0 from second; ΣŶᵢeᵢ=0 via substitution; (X̄,Ȳ) on regression line verified
- (b) BIBD parameter relationships bk = vr and r(k−1) = λ(v−1): since k < v, bk = vr gives b > r, hence b ≥ r + 1, and r(k−1) = λ(v−1) gives r > λ, hence r + 1 ≥ λ + 2; for (ii), v ≤ b is Fisher's inequality, and the upper bound on b follows from the same two identities with k < v
- (c) Multiple regression: β̂₁ = (S₁ᵧS₂₂ − S₂ᵧS₁₂)/(S₁₁S₂₂ − S₁₂²) in cross-product notation; when S₁₂ = 0, β̂₁ reduces to S₁ᵧ/S₁₁, the simple regression coefficient of Y on X₁
- (d) Sub-sampling: condition on the first-phase sample s; E(ȳ₁ | s) = E(ȳ₂ | s) = ȳₙ, so E(Ŷₙ) = Ȳ unconditionally; derive the variance via V(Ŷₙ) = V[E(Ŷₙ | s)] + E[V(Ŷₙ | s)] with SRSWOR variances and finite population corrections (note ȳ₁ and ȳ₂ are not independent: given s they are perfectly negatively correlated)
- (e) Efficiency defined as the ratio of error variances (precision) of the two designs; RE(RBD : CRD) = σ̂²(CRD)/σ̂²(RBD), estimated from the RBD analysis as [(r−1)E_b + r(t−1)E_e]/[(rt−1)E_e] with r blocks, t treatments, E_b the block mean square and E_e the error mean square
- Proper mathematical notation throughout: summation limits, subscripts, expectation and variance operators clearly distinguished
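The five residual identities in (a)(ii) hold for any data once the coefficients solve the normal equations; a sketch with illustrative data of my own choosing:

```python
# fit Y = b0 + b1*X by least squares and verify the residual identities
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.1, 3.9, 6.2, 8.1, 9.8]      # illustrative data, not from the paper
n = len(xs)
xbar, ybar = sum(xs) / n, sum(ys) / n

# normal-equation solutions: b1 = Sxy/Sxx, b0 = ybar - b1*xbar
b1 = sum((x - xbar) * (y - ybar) for x, y in zip(xs, ys)) \
     / sum((x - xbar) ** 2 for x in xs)
b0 = ybar - b1 * xbar                # forces the line through (xbar, ybar)

fitted = [b0 + b1 * x for x in xs]
e = [y - f for y, f in zip(ys, fitted)]

checks = (sum(e),                                   # 1: sum of residuals = 0
          sum(ys) - sum(fitted),                    # 2: sum Y = sum Y-hat
          sum(x * ei for x, ei in zip(xs, e)),      # 3: sum X*e = 0
          sum(f * ei for f, ei in zip(fitted, e)))  # 4: sum Y-hat*e = 0
```

All four sums vanish to machine precision, and property 5 is built into the definition of b0.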
Q6 50M derive Multivariate analysis, correlation, cluster sampling, multivariate normal distribution
(a) For a multiple linear regression model with three covariates X₁, X₂ and X₃, let rᵢⱼ denote the correlation coefficient between Xᵢ and Xⱼ. For a data, it was found r₁₂ = 0·77, r₂₃ = 0·52, r₁₃ = 0·72.
(i) Check the consistency of the above data.
(ii) If r₁₃ is unknown, obtain the limits within which r₁₃ lies given the above values for r₁₂ and r₂₃. 20
(b) In cluster sampling with equal size clusters, obtain the unbiased estimate of population mean. Also obtain its sampling variance as
$$V(\bar{\bar{y}}) = \frac{(1-f)(NM-1)S^2}{M^2(N-1)n}\left\{1+(M-1)\rho_{cl}\right\},$$
where notations have their usual meanings. 15
(c) Let Z = (X, Yᵀ)ᵀ be a 3×1 random vector with X scalar and Y of order 2×1, where Z ~ N₃((0, 0, 1)ᵀ, [[1, 2, 1], [2, 5, 2], [1, 2, 2]]).
Show that conditional on X, the two components of Y are independent but marginally they are not.
15
Answer approach & key points
Derive the required mathematical results systematically across all three parts. For part (a)(i)-(ii), apply correlation matrix properties and determinant conditions first, then use partial correlation bounds. For part (b), build the cluster sampling theory from first principles with ANOVA decomposition. For part (c), partition the multivariate normal distribution and derive conditional distributions. Allocate approximately 40% time to part (a) given its 20 marks, 30% each to parts (b) and (c). Structure as: direct derivations without lengthy introductions, clear theorem statements, step-by-step proofs, and boxed final expressions.
- For (a)(i): Verify positive semi-definiteness of correlation matrix by checking det(R) ≥ 0 or all principal minors non-negative; compute 1 - r₁₂² - r₂₃² - r₁₃² + 2r₁₂r₂₃r₁₃ ≥ 0
- For (a)(ii): Derive bounds using r₁₃ ∈ r₁₂r₂₃ ± √[(1−r₁₂²)(1−r₂₃²)] = 0.4004 ± 0.5450, i.e. the interval (−0.145, 0.945)
- For (b): Define cluster sampling estimator ȳ̄ = (1/nM)ΣᵢΣⱼ yᵢⱼ; prove unbiasedness E(ȳ̄) = Ȳ; derive variance via between-cluster and within-cluster SS decomposition
- For (b): Express variance in ICC form using ρcl = (S_b² - S_w²)/(S_b² + (M-1)S_w²) or equivalent definition; manipulate to reach target formula
- For (c): Partition covariance matrix Σ = [[Σ_XX, Σ_XY], [Σ_YX, Σ_YY]]; derive conditional distribution Y|X ~ N(μ_Y + Σ_YXΣ_XX⁻¹(X-μ_X), Σ_YY - Σ_YXΣ_XX⁻¹Σ_XY)
- For (c): Show conditional covariance matrix is diagonal (implying independence given X₁) while marginal covariance Σ_YY is not diagonal
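Both the correlation bound in (a) and the conditional-covariance claim in (c) reduce to a few lines of arithmetic; a sketch (layout and variable names are mine):

```python
import math

# (a) admissible range of r13 given r12 and r23, from det(R) >= 0
r12, r23, r13 = 0.77, 0.52, 0.72
half_width = math.sqrt((1 - r12**2) * (1 - r23**2))
lo, hi = r12 * r23 - half_width, r12 * r23 + half_width
# consistency of the full triple: determinant of the correlation matrix
det = 1 - r12**2 - r23**2 - r13**2 + 2 * r12 * r23 * r13

# (c) conditional covariance of Y given X:
# Sigma_YY - Sigma_YX Sigma_XX^{-1} Sigma_XY, with Sigma_XX = 1, Sigma_YX = (2, 1)
Sxx = 1.0
Syy = [[5.0, 2.0], [2.0, 2.0]]
Syx = [2.0, 1.0]
cond = [[Syy[i][j] - Syx[i] * Syx[j] / Sxx for j in range(2)]
        for i in range(2)]
# cond is the 2x2 identity: components independent given X,
# while Syy itself is not diagonal, so they are marginally dependent
```

The computed interval is roughly (−0.145, 0.945), and det > 0 confirms the stated triple is consistent.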
Q7 50M derive Factorial experiments, principal components, regression estimator
(a) (i) What is confounding in factorial experiments ?
(ii) A $2^6$ factorial experiment is conducted in blocks of size $2^3$. Write the confounded effects such that no main effect or two factor interaction are confounded. Give the list of independent and generalised interactions confounded along with the elements of key block only.
(iii) Give the break-up of degrees of freedom for a $2^n$ factorial experiment in $2^k$ blocks.
(b) What are principal components ? Describe how to compute the principal components of the vectors X₁ = $\begin{bmatrix} 1 \\ 0 \\ -1 \end{bmatrix}$ and X₂ = $\begin{bmatrix} -1 \\ 1 \\ 0 \end{bmatrix}$. Give X₁ and X₂ in terms of the principal components.
(c) Define Regression estimator. Show bias = – Cov ($\bar{x}$, b). Under what conditions is bias negligible ? Find the mean square error of the estimator to first degree of approximation. Give comparison of Regression estimator with Ratio estimator.
Answer approach & key points
Derive the required expressions systematically across all five sub-parts. For (a)(i)-(iii), allocate ~35% time covering the confounding definition, the specific 2^6 in 2^3-blocks construction (eight blocks require three independent generators, e.g. ABD, ACE, BCF, confounding seven effects in all), and the general df breakdown. For (b), spend ~25% time on PCA computation: centre the data, form the cross-product matrix, find eigenvalues (3, 1), eigenvectors, and express X₁, X₂ in PC terms. For (c), allocate ~40% time deriving regression estimator bias, MSE approximation, and comparison with ratio estimator via Cochran's approach. Begin with definitions, proceed through step-by-step derivations, and conclude with clear interpretations.
- (a)(i) Define confounding as mixing of treatment effects with block effects; distinguish complete vs partial confounding
- (a)(ii) Choose three independent interactions, e.g. ABD, ACE, BCF, whose generalized interactions are BCDE, ACDF, ABEF and DEF; the seven confounded effects ABD, ACE, BCF, BCDE, ACDF, ABEF, DEF contain no main effect or two-factor interaction; key (principal) block: (1), abc, ade, bdf, cef, abef, acdf, bcde
- (a)(iii) State df breakdown for a single replicate: total 2ⁿ − 1, split into blocks (the 2ᵏ − 1 confounded effects) and the 2ⁿ − 2ᵏ unconfounded factorial effects; error df arise only on replication or by pooling higher-order interactions
- (b) Define PCs as uncorrelated linear combinations maximizing variance; the vectors are already centred, so compute the cross-product matrix [2 -1; -1 2], eigenvalues λ₁=3, λ₂=1, eigenvectors [1/√2, -1/√2]ᵀ and [1/√2, 1/√2]ᵀ; express X₁ = (1/√2)PC₁ + (1/√2)PC₂, X₂ = (-1/√2)PC₁ + (1/√2)PC₂
- (c) Define regression estimator Ŷ_reg = ȳ + b(X̄ - x̄); derive bias = -Cov(x̄, b), which is O(1/n) and hence negligible for large n; derive MSE ≈ (1/n - 1/N)S²_y(1-ρ²) to first order; compare: MSE(regression) ≤ MSE(ratio) to first order, with equality only when the regression line passes through the origin (β = R = Ȳ/X̄)
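The key block for (a)(ii) can be generated mechanically once generators are fixed; this sketch assumes the generator choice ABD, ACE, BCF, which is one valid option rather than the only one:

```python
from itertools import combinations

# principal (key) block for a 2^6 design in blocks of size 2^3:
# a treatment combination belongs to the key block iff it shares an even
# number of letters with every generator
gens = [set('abd'), set('ace'), set('bcf')]

key_block = [''.join(c) or '(1)'                  # () denotes the control "(1)"
             for r in range(7)
             for c in combinations('abcdef', r)
             if all(len(set(c) & g) % 2 == 0 for g in gens)]
# -> ['(1)', 'abc', 'ade', 'bdf', 'cef', 'abef', 'acdf', 'bcde']
```

The block has exactly 2³ = 8 elements, and the remaining seven blocks follow by multiplying the key block by any treatment combination outside it.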
Q8 50M solve Stratified sampling, polynomial regression, split-plot designs
(a) (i) In stratified sampling under optimum allocation, how will you proceed to select units from different strata, if one or more nᵢ's happens to be greater than Nᵢ (i ≥ 2) ?
(ii) A sample survey was conducted in a certain district of Himachal Pradesh. Four strata A, B, C and D of villages were formed according to the acreage of fruit trees as obtained from revenue records. A random sample of villages was selected from each stratum and the number of apple orchards in each selected village was noted. The data are shown below :
| Stratum | Total number of villages (Nᵢ) | Number of villages in sample (nᵢ) | Number of orchards in the selected villages |
|---------|------------------------------|-----------------------------------|---------------------------------------------|
| A (0 – 3 acres) | 275 | 15 | 2, 5, 1, 9, 6, 7, 0, 4, 7, 0, 5, 0, 0, 3, 0 |
| B (3 – 6 acres) | 146 | 10 | 21, 11, 7, 5, 6, 19, 5, 24, 30, 24 |
| C (6 – 15 acres) | 93 | 12 | 3, 10, 4, 11, 38, 11, 4, 46, 4, 18, 1, 39 |
| D (15 acres and above) | 62 | 11 | 30, 42, 20, 38, 29, 22, 31, 28, 66, 14, 15 |
Estimate the number of orchards in the district.
(b) (i) For a second order polynomial model with one predictor variable, derive the least squares normal equations clearly stating the conditions assumed. How will you interpret the parameters in this model ?
(ii) Describe why it is recommended to work with predictor variables centred around the mean. Comment on fitted values of the response variable in this case. Prove your claim.
(c) What are split-plot designs ? When do you recommend the use of such designs ? If e₁ and e₂ are the main plot and sub-plot errors respectively, both estimated in units of a single sub-plot, explain why e₁ is expected to be larger than e₂.
Answer approach & key points
This multi-part question demands solving numerical problems alongside theoretical derivations and explanations. Allocate approximately 35% effort to part (a) combining optimum allocation adjustment and stratified estimation with Himachal Pradesh data; 35% to part (b) covering polynomial regression derivation, centering benefits, and proof; and 30% to part (c) explaining split-plot designs with error comparison. Structure as: brief theoretical setup → step-by-step calculations/derivations → interpretation of results in context.
- For (a)(i): Explain the iterative adjustment procedure when nᵢ > Nᵢ in optimum allocation—set nᵢ = Nᵢ for such strata, recompute allocation for remaining strata using revised formula, and repeat until all nᵢ ≤ Nᵢ
- For (a)(ii): Calculate stratum sample means ȳᵢ and form the stratified estimate of the district total Ŷ = ΣNᵢȳᵢ = 275(49/15) + 146(152/10) + 93(189/12) + 62(335/11) ≈ 6470 orchards, with a standard error if required
- For (b)(i): Derive normal equations for Y = β₀ + β₁X + β₂X² by minimizing Σ(Yᵢ - β₀ - β₁Xᵢ - β₂Xᵢ²)²; interpret β₀ as response at X=0, β₁ as linear rate of change, β₂ as curvature/acceleration
- For (b)(ii): Explain that centering (X - X̄) eliminates correlation between linear and quadratic terms, stabilizes variance-covariance matrix; prove fitted values remain identical using algebraic expansion showing predicted Y unchanged
- For (c): Define split-plot designs as experiments with two sizes of experimental units where whole plots receive one factor and sub-plots receive another; recommend when one factor is harder/costlier to change; explain e₁ > e₂ due to additional whole-plot error component from main plot-to-main plot variation
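The stratified estimate in (a)(ii) follows from Ŷ = ΣNᵢȳᵢ; a sketch with the survey data transcribed from the table (the dictionary layout is mine):

```python
# stratum: (N_i = total villages, sampled orchard counts)
strata = {
    'A': (275, [2, 5, 1, 9, 6, 7, 0, 4, 7, 0, 5, 0, 0, 3, 0]),
    'B': (146, [21, 11, 7, 5, 6, 19, 5, 24, 30, 24]),
    'C': (93,  [3, 10, 4, 11, 38, 11, 4, 46, 4, 18, 1, 39]),
    'D': (62,  [30, 42, 20, 38, 29, 22, 31, 28, 66, 14, 15]),
}

# stratified estimate of the district total: sum of N_i * (stratum sample mean)
total = sum(N * sum(y) / len(y) for N, y in strata.values())
# -> about 6470 orchards in the district
```

The stratum means are 49/15, 15.2, 15.75 and 335/11, giving an estimated total of roughly 6470 orchards.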