Q1 50M Compulsory solve Probability theory and statistical inference
(a) Two events A and B are such that P(A) = 1/3, P(B) = 1/4 and P(A|B) + P(B|A) = 2/3. Evaluate the following: (i) P(A^c ∪ B^c) (5 marks) (ii) P(A|B^c) + P(B|A^c) (5 marks)
(b) Suppose the joint probability function of two random variables X and Y is f(x, y) = (xy^(x-1))/3; x = 1, 2, 3 and 0 < y < 1. Compute the following: (i) P(X ≥ 2 and Y ≥ 1/2) (5 marks) (ii) P(X ≥ 2) (5 marks)
(c) Let X₁, X₂, ... be a sequence of independent and identically distributed random variables with mean (μ) and variance (σ²) < ∞, and assume Sₙ = X₁ + X₂ + ... + Xₙ. Show that WLLN does not hold for sequence ⟨Sₙ⟩ of random variables. (10 marks)
(d) Write the criterion of a good estimator. Let X₁, X₂ be iid P(λ) random variables, then show that T = X₁ + X₂ is sufficient while T = X₁ + 2X₂ is not sufficient for estimating λ. (10 marks)
(e) For testing H₀: μ = 100 vs. H₁: μ ≠ 100, a random sample of size 50 is drawn from a normal population with unknown mean μ and variance 200. If α = 0.05, then obtain the critical region. (10 marks)
Answer approach & key points
Solve this multi-part numerical problem by allocating time proportionally to marks: approximately 20% for (a), 20% for (b), 20% for (c), 20% for (d), and 20% for (e). Begin each sub-part by stating the relevant formula or theorem, show complete step-by-step working, and conclude with clearly boxed final answers. For (c) and (d), include brief theoretical justification before numerical demonstration.
- For (a): Write p = P(A∩B); then P(A|B) + P(B|A) = 4p + 3p = 7p = 2/3 gives P(A∩B) = 2/21, and De Morgan's law gives P(A^c ∪ B^c) = 1 - P(A∩B) = 19/21
- For (a)(ii): Calculate conditional probabilities using P(A|B^c) = [P(A) - P(A∩B)]/P(B^c) = (5/21)/(3/4) = 20/63 and P(B|A^c) = [P(B) - P(A∩B)]/P(A^c) = (13/84)/(2/3) = 13/56, yielding 20/63 + 13/56 = 277/504
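- For (b): X is discrete and Y is continuous, so sum over x and integrate over y; for (i) sum x=2,3 and integrate y from 1/2 to 1, giving (1/3)[(1 - 1/4) + (1 - 1/8)] = 13/24; for (ii) integrate out y over (0, 1) to get P(X=x) = 1/3 for each x, so P(X ≥ 2) = 2/3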
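- For (c): The WLLN for ⟨S_n⟩ concerns the average T_n = (S₁ + S₂ + ... + S_n)/n. Since S₁ + ... + S_n = Σⱼ (n-j+1)Xⱼ, Var(T_n) = σ²(n+1)(2n+1)/(6n) → ∞, so the Markov/Chebyshev condition Var(T_n) → 0 fails and T_n - E(T_n) does not converge in probability to 0 (unlike the sample mean S_n/n, whose variance σ²/n → 0)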
- For (d): State sufficiency criterion (Factorization theorem), show T=X₁+X₂ ∼ P(2λ) allows factorization, while T=X₁+2X₂ does not
- For (e): Construct z-test with σ²=200, n=50, so the standard error is √(200/50) = 2; the critical region |z| > 1.96 becomes |x̄ - 100| > 3.92, i.e. x̄ < 96.08 or x̄ > 103.92
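The arithmetic in parts (a) and (e) can be checked with a short script. This is only a sketch: the variable names are illustrative, exact fractions are used for (a), and the z quantile for (e) comes from the standard library rather than a printed table.

```python
from fractions import Fraction
from math import sqrt
from statistics import NormalDist

# Q1(a): with p = P(A ∩ B), P(A|B) + P(B|A) = p/P(B) + p/P(A) = 2/3.
p_a, p_b = Fraction(1, 3), Fraction(1, 4)
p_ab = Fraction(2, 3) / (1 / p_a + 1 / p_b)

p_union_of_complements = 1 - p_ab            # De Morgan: A^c ∪ B^c = (A ∩ B)^c
p_a_given_not_b = (p_a - p_ab) / (1 - p_b)
p_b_given_not_a = (p_b - p_ab) / (1 - p_a)
part_ii = p_a_given_not_b + p_b_given_not_a

# Q1(e): two-sided z-test with known variance 200, n = 50, alpha = 0.05.
z_crit = NormalDist().inv_cdf(1 - 0.05 / 2)
std_error = sqrt(200 / 50)
lower = 100 - z_crit * std_error
upper = 100 + z_crit * std_error

print("P(A and B) =", p_ab)
print("P(A^c or B^c) =", p_union_of_complements)
print("P(A|B^c) + P(B|A^c) =", part_ii)
print("Reject H0 if sample mean <", round(lower, 2), "or >", round(upper, 2))
```

Working in `Fraction` avoids rounding, so the printed values can be compared directly with the hand calculation.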
Q2 50M calculate Joint distributions and limiting distributions
(a) Let the joint probability density function of two random variables X and Y be f(x, y) = x/3, for 0 < 2x < 3y < 6; 0, otherwise. Compute the following: (i) E(Y|X = x) (10 marks) (ii) E(var(Y|X = x)) (10 marks)
(b) Find the distribution function of random variable X, for which the characteristic function is φ(t) = e^(-t²), -∞ < t < ∞. Also compute P(X > 2√2) in terms of Φ(z), where Φ(z) = ∫_{-∞}^{z} (1/√(2π)) e^(-θ²/2) dθ. (15 marks)
(c) Let X₁, X₂, ..., X₂ₙ be iid N(0, 1) variates. Find the limiting distribution of [(X₁/X₂) + (X₃/X₄) + ... + (X₂ₙ₋₁/X₂ₙ)] / [X₁² + X₂² + ... + Xₙ²]. (15 marks)
Answer approach & key points
Calculate the required quantities systematically across all parts: spend approximately 40% time on part (a) covering conditional expectation and conditional variance with proper region identification; 30% on part (b) for characteristic function inversion and normal probability computation; and 30% on part (c) for establishing the limiting distribution using Slutsky's theorem and properties of ratio distributions. Begin with clear region sketches for (a), apply Fourier inversion for (b), and justify convergence arguments for (c).
- For (a)(i): Correctly identify the region 0 < 2x < 3y < 6 (0 < x < 3, 2x/3 < y < 2), derive marginal f_X(x) = 2x(3-x)/9 for 0 < x < 3, and obtain conditional pdf f_{Y|X}(y|x) = 3/(2(3-x)), i.e. uniform on (2x/3, 2), leading to E(Y|X=x) = (3+x)/3
- For (a)(ii): Compute Var(Y|X=x) = (2 - 2x/3)²/12 = (3-x)²/27 and then calculate E[Var(Y|X)] by integrating against f_X(x) to obtain 1/10
- For (b): Recognize φ(t) = e^{-t²} corresponds to N(0, 2) via inversion formula or matching with standard normal CF, then express P(X > 2√2) = 1 - Φ(2) using variance σ² = 2
- For (c): Identify that the numerator S_n = Σ(X_{2k-1}/X_{2k}) is a sum of n iid standard Cauchy terms (so E(S_n) does not exist and S_n/n ~ Cauchy(0,1) exactly for every n), while the denominator T_n = ΣX_i² ~ χ²_n satisfies T_n/n → 1 in probability by the WLLN
- Handling of the ratio in (c): Write S_n/T_n = (S_n/n)/(T_n/n) and apply Slutsky's theorem, which does not require the numerator and denominator to be independent (here they share X₁, ..., Xₙ); the limiting distribution is standard Cauchy(0,1)
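The integrals in part (a) are polynomial, so they can be verified exactly. The sketch below assumes the region 0 < x < 3, 2x/3 < y < 2, so that Y given X = x is uniform on (2x/3, 2); the helper names `poly_mul` and `poly_integral` are hypothetical and written only for this check.

```python
from fractions import Fraction as F

def poly_mul(p, q):
    """Multiply two polynomials given as ascending coefficient lists."""
    out = [F(0)] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out

def poly_integral(p, lo, hi):
    """Exact definite integral of a polynomial over [lo, hi]."""
    def antiderivative(x):
        return sum(c * x ** (k + 1) / (k + 1) for k, c in enumerate(p))
    return antiderivative(F(hi)) - antiderivative(F(lo))

# Marginal of X: f_X(x) = (x/3)(2 - 2x/3) = 2x/3 - 2x^2/9 on (0, 3).
f_x = [F(0), F(2, 3), F(-2, 9)]
# Uniform(2x/3, 2): mean 1 + x/3, variance (2 - 2x/3)^2 / 12 = (3 - x)^2 / 27.
cond_mean = [F(1), F(1, 3)]
cond_var = [F(1, 3), F(-2, 9), F(1, 27)]

total_mass = poly_integral(f_x, 0, 3)
e_y = poly_integral(poly_mul(cond_mean, f_x), 0, 3)
e_cond_var = poly_integral(poly_mul(cond_var, f_x), 0, 3)

print("integral of f_X:", total_mass)
print("E(Y) = E[E(Y|X)]:", e_y)
print("E[Var(Y|X)]:", e_cond_var)
```

Checking that the marginal integrates to 1 is a cheap guard against having misread the region.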
Q3 50M solve Statistical inference and estimation theory
(a) Let moment generating function of random variable X exist in the neighbourhood of zero and if
$$E(X^n) = \frac{1}{5} + (-1)^n \frac{2}{5} + \frac{2^{n+1}}{5}; \quad n = 1, 2, 3, \cdots$$
then find the values of the following:
(i) $P(|X - 0.75| \leq 1.5)$ (10 marks)
(ii) $P(|X - \mu| < \sigma)$; $\mu = E(X)$ and $\sigma^2 = \text{var}(X)$ (10 marks)
[Use $\sqrt{1.84} = 1.36$]
(b) (i) Write the importance of Cramer-Rao inequality and Rao-Blackwell theorem. (5 marks)
(ii) Let $X \sim B(1, \theta)$, then find the uniformly minimum variance unbiased estimator (UMVUE) of $\theta(1-\theta)$. (10 marks)
(c) Obtain the maximum likelihood estimates of $\alpha$ and $\beta$ for a random sample from the exponential population
$$f(x; \alpha, \beta) = Ce^{-\beta(x-\alpha)}, \alpha \leq x < \infty, \beta > 0$$ (15 marks)
Answer approach & key points
Solve this multi-part numerical problem by first identifying the probability distribution from the given moment pattern in part (a), then applying appropriate estimation theory for parts (b) and (c). Allocate approximately 35% time to part (a) (20 marks), 25% to part (b) (15 marks), and 40% to part (c) (15 marks) based on computational complexity. Structure as: distribution identification → probability calculations → theoretical exposition → UMVUE derivation → MLE derivation with likelihood analysis.
- For (a): Rewrite E(X^n) = (1/5)(1)^n + (2/5)(-1)^n + (2/5)(2)^n and identify X as a discrete random variable with P(X=-1)=2/5, P(X=1)=1/5, P(X=2)=2/5; since the MGF exists near zero, the moments determine the distribution uniquely
- For (a)(i): Calculate P(|X-0.75|≤1.5) = P(-0.75≤X≤2.25) by enumerating which support points satisfy the inequality, yielding P(X=1)+P(X=2) = 3/5
- For (a)(ii): Compute μ=E(X)=0.6 and σ²=Var(X)=2.2-0.36=1.84, then find P(|X-0.6|<1.36) = P(-0.76<X<1.96) = P(X=1) = 1/5
- For (b)(i): Explain Cramer-Rao inequality provides variance lower bound for unbiased estimators enabling efficiency comparison; Rao-Blackwell theorem enables improvement of unbiased estimators via conditioning on sufficient statistics
- For (b)(ii): Derive UMVUE of θ(1-θ) using Lehmann-Scheffé theorem for a random sample X₁, ..., Xₙ (n ≥ 2): identify T=ΣX_i as complete sufficient statistic, start from the unbiased estimator X₁(1-X₂), and condition on T to obtain E[X₁(1-X₂)|T] = T(n-T)/[n(n-1)], which equals the sample variance Σ(X_i-X̄)²/(n-1)
- For (c): Note that normalization forces C = β; obtain MLEs by writing likelihood L(α,β)=β^n exp[-βΣ(x_i-α)] with constraint α≤x_(1), show likelihood increases with α so α̂=X_(1), then maximize with respect to β to get β̂=n/[Σ(X_i-X_(1))] = 1/(X̄-X_(1))
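For part (a), one way to check a candidate distribution is to compare its moments with the given formula and then enumerate the support points. The sketch below assumes the three-point distribution on {-1, 1, 2} read off from the moment formula, and uses the rounded value 1.36 supplied in the question for σ.

```python
from fractions import Fraction as F

# Candidate pmf read off from E(X^n) = (1/5)(1)^n + (2/5)(-1)^n + (2/5)(2)^n.
pmf = {-1: F(2, 5), 1: F(1, 5), 2: F(2, 5)}

def moment(n):
    """n-th raw moment of the candidate pmf."""
    return sum(p * x ** n for x, p in pmf.items())

def given_moment(n):
    """Moment formula stated in the question."""
    return F(1, 5) + (-1) ** n * F(2, 5) + F(2 ** (n + 1), 5)

moments_match = all(moment(n) == given_moment(n) for n in range(1, 11))

mu = moment(1)
variance = moment(2) - mu ** 2
sigma = F(136, 100)  # the question's hint: sqrt(1.84) = 1.36

prob_i = sum(p for x, p in pmf.items() if abs(x - F(3, 4)) <= F(3, 2))
prob_ii = sum(p for x, p in pmf.items() if abs(x - mu) < sigma)

print("moments match:", moments_match)
print("mu =", mu, " variance =", variance)
print("P(|X - 0.75| <= 1.5) =", prob_i)
print("P(|X - mu| < sigma) =", prob_ii)
```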
Q4 50M solve Hypothesis testing and non-parametric methods
(a) Find the most powerful test of size α(= 0·05) for testing H₀: μ = 0 vs. H₁: μ = 1, given a random sample of size 25 from N(μ, 16) population. (20 marks)
(b) A lot consists of some defective items. A random sample of 25 items has 6 defective items with probability p₁ = θ and 19 non-defective items with probability p₂ = 1 – θ. Then estimate θ using the following:
(i) MLE method
(ii) Minimum χ²-method
(iii) Modified minimum χ²-method (15 marks)
(c) Differentiate between Mann-Whitney U-test and Wilcoxon sign test. The following data pertain to APGAR scores of 15 pregnant women in two care programmes A and B:
Programme A : 8 7 6 2 5 8 7 3
Programme B : 9 9 7 8 10 9 6
Is there a significant difference in APGAR scores of pregnant women under the two care programmes?
[Given, U₍₀.₀₅₎ = 10] (15 marks)
Answer approach & key points
Solve this multi-part numerical problem by allocating approximately 40% time to part (a) given its 20 marks, 30% to part (b) covering three estimation methods, and 30% to part (c) involving both differentiation and non-parametric testing. Structure as: (a) derive Neyman-Pearson lemma application with critical region, (b) present three estimation approaches with clear derivations, (c) tabulate differences then perform Mann-Whitney U-test with ranking and decision.
- Part (a): Apply Neyman-Pearson lemma to derive most powerful test; identify critical region as sample mean > k; compute critical value k = 0 + 1.645×(4/5) = 1.316 using the standard error σ/√n = 4/5 under H₀; state rejection rule and the power at μ = 1, P(Z > (1.316 - 1)/0.8) = 1 - Φ(0.395) ≈ 0.35
- Part (b)(i): Derive MLE as θ̂ = 6/25 = 0.24 using binomial likelihood L(θ) = θ⁶(1-θ)¹⁹ and log-likelihood differentiation
- Part (b)(ii): Set up minimum χ² by minimizing Σ(Oi-Ei)²/Ei with expected frequencies 25θ and 25(1-θ); since 19 - 25(1-θ) = -(6 - 25θ), this reduces to χ² = (6-25θ)²/[25θ(1-θ)], which attains its minimum value 0 at θ̂ = 6/25, the same as the MLE
- Part (b)(iii): Apply modified minimum χ² (Neyman's modification), which replaces Ei by Oi in the denominator: χ²_mod = (6-25θ)²(1/6 + 1/19), again minimized at θ̂ = 6/25 = 0.24
- Part (c): Differentiate Mann-Whitney (two independent samples, ordinal/rank data) vs Wilcoxon signed-rank (paired/matched samples); correctly identify independent samples scenario here
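- Part (c) computation: Rank pooled data (1-15) using midranks for ties, giving rank sums R_A = 44.5 (n=8) and R_B = 75.5 (n=7); calculate U_A = 44.5 - 8×9/2 = 8.5 and U_B = 75.5 - 7×8/2 = 47.5 (check: U_A + U_B = 56 = 8×7); compare U = min(U_A, U_B) = 8.5 with critical value U₀.₀₅ = 10; since U ≤ 10, reject H₀ and conclude there is a significant difference between the two programmes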
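The critical value in part (a) and the rank computation in part (c) are easy to get wrong by hand, so a small check may help. This is a sketch using only the standard library; `midranks` is a hypothetical helper that assigns tied values the average of the ranks they occupy, and the decision rule "reject when U is at most the tabulated value" is the usual convention for the Mann-Whitney table.

```python
from math import sqrt
from statistics import NormalDist

# Q4(a): reject H0 when the sample mean exceeds k = mu0 + z_{0.95} * sigma / sqrt(n).
std_error = 4 / sqrt(25)
k = 0 + NormalDist().inv_cdf(0.95) * std_error
power = 1 - NormalDist(mu=1, sigma=std_error).cdf(k)

# Q4(c): Mann-Whitney U with midranks for ties.
prog_a = [8, 7, 6, 2, 5, 8, 7, 3]
prog_b = [9, 9, 7, 8, 10, 9, 6]

def midranks(values):
    """Map each distinct value to the average of its 1-based sorted positions."""
    s = sorted(values)
    n = len(s)
    return {v: (s.index(v) + 1 + n - s[::-1].index(v)) / 2 for v in set(s)}

rank = midranks(prog_a + prog_b)
r_a = sum(rank[v] for v in prog_a)
r_b = sum(rank[v] for v in prog_b)
n_a, n_b = len(prog_a), len(prog_b)
u_a = r_a - n_a * (n_a + 1) / 2
u_b = r_b - n_b * (n_b + 1) / 2
u = min(u_a, u_b)
reject_h0 = u <= 10  # tabulated critical value given in the question

print("critical value k =", round(k, 3), " power at mu = 1:", round(power, 3))
print("R_A =", r_a, " R_B =", r_b, " U =", u, " reject H0:", reject_h0)
```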
Q5 50M Compulsory derive Linear regression, multivariate normal distribution, sampling design
(a) How will you justify the usage of the principle of least squares in estimating the parameters of a linear regression model? With usual notations, for the regression model y = Xβ + ε, show that the least square estimator of β is β̂ = (X'X)⁻¹X'y (10 marks)
(b) (i) If X̃ is distributed as N₃(μ̃, Σ), find the distribution of [X₁ - X₂; X₂ - X₃]. (5 marks)
(ii) If X₁, X₂ and X₃ are three variables, obtain the expression for the partial correlation coefficient between X₁ and X₂ eliminating the effect of X₃, ρ₁₂·₃, in terms of simple correlation coefficients. (5 marks)
(c) X₁ and X₂ are independent data sets of order (n₁ × p) and (n₂ × p) respectively from Nₚ(μ̃, Σ). Show that (n₁n₂D²)/n is distributed as T²(p, n-2), where n = n₁ + n₂, and T² and D² represent the Hotelling's T² and Mahalanobis D² respectively. (10 marks)
(d) For the population U = {a, b, c, d, e}, consider the following sampling design: P({a, b, d}) = 1/6, P({a, b, e}) = 1/6, P({a, d, e}) = 1/6, P({b, c, d}) = 1/6, P({b, c, e}) = 1/6, P({c, d, e}) = 1/6. Calculate the first-order and second-order inclusion probabilities. Hence show that it is a matter of a stratified design. Identify the strata with their units. (10 marks)
(e) Let the incidence matrix of a design be N = [[1, 1, 1, 0], [1, 1, 0, 1], [1, 0, 1, 1], [0, 1, 1, 1]]. Show that— (i) the design is connected balanced; (ii) its efficiency factor is E = 8/9. (6+4=10 marks)
Answer approach & key points
A 'derive' question requires rigorous mathematical proof and step-by-step derivation across all sub-parts. Allocate approximately 20% time to part (a) on least squares justification and derivation, 20% to part (b) on multivariate normal transformations and partial correlation, 20% to part (c) on Hotelling's T² distribution, 20% to part (d) on inclusion probabilities and stratified design identification, and 20% to part (e) on connectedness, balance and efficiency factor. Begin with clear statement of assumptions, proceed through systematic derivations with matrix algebra where needed, and conclude with explicit verification of claimed properties.
- Part (a): Justify least squares via Gauss-Markov theorem (BLUE property under Gauss-Markov assumptions) or via maximum likelihood under normality; derive β̂ = (X'X)⁻¹X'y by minimizing S(β) = ε'ε = (y-Xβ)'(y-Xβ) using matrix differentiation
- Part (b)(i): Apply linear transformation property of multivariate normal; define transformation matrix A = [[1, -1, 0], [0, 1, -1]]; derive distribution as N₂(Aμ̃, AΣA') with explicit mean and covariance structure
- Part (b)(ii): Derive ρ₁₂·₃ = (ρ₁₂ - ρ₁₃ρ₂₃)/√[(1-ρ₁₃²)(1-ρ₂₃²)] using residual correlation formula or partial covariance matrix inversion
- Part (c): Define D² = (x̄₁ - x̄₂)'S⁻¹(x̄₁ - x̄₂); use independence of sample means and pooled covariance; apply Wishart and Hotelling's T² construction to show (n₁n₂/n)D² ~ T²(p, n-2)
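- Part (d): Calculate πᵢ = Σ_{s∋i} P(s) for first-order inclusion probabilities (π_a = π_c = 1/2, π_b = π_d = π_e = 2/3); calculate πᵢⱼ = Σ_{s∋i,j} P(s) for second-order (π_ac = 0 and πᵢⱼ = 1/3 for every other pair); since a and c never appear together and πᵢⱼ = πᵢπⱼ for units in different groups, identify strata as {a, c} with one unit drawn and {b, d, e} with two units drawn, each by simple random sampling without replacement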
- Part (e)(i): Verify connectedness via incidence matrix rank or graph connectivity; verify balance via constant λ = Σⱼ nᵢⱼnᵢ'ⱼ for all i ≠ i' pairs
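- Part (e)(ii): With v = b = 4, r = k = 3 and λ = 2, calculate efficiency factor E = λv/(rk) = (2×4)/(3×3) = 8/9; equivalently, C = rI - NN'/k = (8/3)I - (2/3)J has nonzero eigenvalue 8/3 with multiplicity v-1 = 3, and dividing by r = 3 gives E = 8/9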
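Parts (d) and (e) are both enumeration exercises, so they can be checked mechanically. The sketch below computes the inclusion probabilities from the listed samples, tests a candidate stratification (the grouping {a, c} and {b, d, e} is the hypothesis being checked, not an input from the question), and then reads r, k and λ off the incidence matrix.

```python
from fractions import Fraction
from itertools import combinations

# Q5(d): the six equally likely samples.
design = {s: Fraction(1, 6) for s in ("abd", "abe", "ade", "bcd", "bce", "cde")}
units = "abcde"

pi1 = {u: sum(p for s, p in design.items() if u in s) for u in units}
pi2 = {(i, j): sum(p for s, p in design.items() if i in s and j in s)
       for i, j in combinations(units, 2)}

# Candidate strata: units in different strata should be selected independently.
strata = ("ac", "bde")
stratum_of = {u: g for g in strata for u in g}
cross_pairs_factorize = all(
    pi2[i, j] == pi1[i] * pi1[j]
    for (i, j) in pi2 if stratum_of[i] != stratum_of[j]
)

# Q5(e): rows of N are treatments, columns are blocks.
N = [[1, 1, 1, 0], [1, 1, 0, 1], [1, 0, 1, 1], [0, 1, 1, 1]]
v = len(N)
replications = {sum(row) for row in N}
block_sizes = {sum(col) for col in zip(*N)}
concurrences = {sum(x * y for x, y in zip(N[i], N[j]))
                for i, j in combinations(range(v), 2)}

r, k, lam = replications.pop(), block_sizes.pop(), concurrences.pop()
balanced = not replications and not block_sizes and not concurrences
efficiency = Fraction(lam * v, r * k)

print("first-order:", pi1)
print("pi_ac =", pi2["a", "c"], " cross pairs factorize:", cross_pairs_factorize)
print("r, k, lambda =", r, k, lam, " balanced:", balanced, " E =", efficiency)
```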
Q6 50M prove Bivariate normal distribution, principal components, linear regression estimation
(a) (X, Y) has bivariate normal distribution BN(μ₁, μ₂, σ₁², σ₂², ρ).
(i) Show that X and Y are independent if and only if ρ = 0. (6 marks)
(ii) If (X, Y) follows BN(3, 1, 16, 25, 3/5), obtain P(3 < Y < 8 | X = 7), given Φ(2) = 0.9772 and Φ(-0.25) = 0.4017, and Φ(x) represents the area under the standard normal curve from -∞ to x. (6 marks)
(iii) If (X, Y) follows BN(0, 0, 1, 1, 0), what will be the distribution of Z = Y/X? (4 marks)
(iv) State the multivariate extension of (i) when X̃ follows Nₚ(μ̃, Σ). (4 marks)
(b) Define principal components and canonical correlation. How can one attain data reduction using principal components? If (X₁, X₂) has covariance matrix Σ = [[1, ρ], [ρ, 1]], then find the principal components. (15 marks)
(c) For the simple linear regression model y = β₀ + β₁x + ε, where β₀ and β₁ are parameters and ε has zero mean and an unknown variance σ², find the estimates of β₀ and β₁ by the principle of least squares as well as the method of maximum likelihood. Examine whether they are identical. (15 marks)
Answer approach & key points
Prove the independence condition in (a)(i) using factorization of joint density; for (a)(ii)-(iv), calculate conditional distributions and identify the Cauchy distribution; for (b), define concepts then derive eigenvalues/eigenvectors for PC extraction; for (c), derive both estimators and compare. Allocate ~40% time to part (a) [20 marks], ~30% each to (b) and (c) [15 marks each], with explicit theorem statements and step-by-step derivations throughout.
- (a)(i) Prove ρ=0 ⇔ independence by showing joint density factorizes into marginal densities, using the bivariate normal PDF structure
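- (a)(ii) Compute conditional distribution Y|X=x ~ N(μ₂ + ρ(σ₂/σ₁)(x-μ₁), σ₂²(1-ρ²)); at x = 7 this is N(1 + (3/5)(5/4)(4), 25(16/25)) = N(4, 16), so standardizing gives P(3 < Y < 8 | X = 7) = Φ(1) - Φ(-0.25), to be evaluated with Φ values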
- (a)(iii) Identify Z=Y/X as ratio of independent N(0,1) variables, hence standard Cauchy distribution
- (a)(iv) State multivariate extension: X̃ ~ Nₚ(μ̃, Σ) has independent components iff Σ is diagonal
- (b) Define PCs as uncorrelated linear combinations maximizing variance; define canonical correlation as correlation between linear combinations of two variable sets; data reduction by retaining top k PCs; derive eigenvalues (1±ρ) and eigenvectors for given Σ
- (c) Derive LSE by minimizing Σ(yᵢ-β₀-β₁xᵢ)²; derive MLE using normal error assumption; show identical estimators but different variance estimators
- Compare LSE (distribution-free) vs MLE (requires normality) and note σ²_MLE = SSE/n vs σ²_LSE = SSE/(n-2)
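The conditional probability in (a)(ii) and the principal components in (b) can be checked numerically. This is a sketch only: the normal CDF comes from the standard library rather than the table values quoted in the question, and `rho_pc = 0.5` is an arbitrary illustrative value, not one given in the paper.

```python
from math import sqrt
from statistics import NormalDist

# Q6(a)(ii): (X, Y) ~ BN(3, 1, 16, 25, 3/5), conditioning on X = 7.
mu1, mu2, var1, var2, rho = 3, 1, 16, 25, 0.6
x = 7
cond_mean = mu2 + rho * sqrt(var2 / var1) * (x - mu1)
cond_sd = sqrt(var2 * (1 - rho ** 2))
conditional = NormalDist(cond_mean, cond_sd)
prob = conditional.cdf(8) - conditional.cdf(3)

# Q6(b): for Sigma = [[1, rho], [rho, 1]] the eigenpairs are
# (1 + rho, (1, 1)/sqrt(2)) and (1 - rho, (1, -1)/sqrt(2)).
rho_pc = 0.5  # hypothetical value, used only to verify the eigen-equations
sigma = [[1.0, rho_pc], [rho_pc, 1.0]]
eigenpairs = [
    (1 + rho_pc, (1 / sqrt(2), 1 / sqrt(2))),
    (1 - rho_pc, (1 / sqrt(2), -1 / sqrt(2))),
]

def satisfies_eigen_equation(matrix, value, vector, tol=1e-12):
    """Check matrix @ vector == value * vector componentwise."""
    product = [sum(m * c for m, c in zip(row, vector)) for row in matrix]
    return all(abs(p - value * c) < tol for p, c in zip(product, vector))

pcs_ok = all(satisfies_eigen_equation(sigma, lam, vec) for lam, vec in eigenpairs)

print("Y | X = 7 has mean", cond_mean, "and sd", cond_sd)
print("P(3 < Y < 8 | X = 7) =", round(prob, 4))
print("eigenpairs verified:", pcs_ok)
```

For positive ρ the first component (X₁ + X₂)/√2 carries the share (1 + ρ)/2 of the total variance, which is the basis for data reduction in this two-variable case.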
Q7 50M prove Stratified sampling and BIBD
(a) A very big population is divided into two strata. The allocation of units of stratified random sample of size n for the two strata under Neyman allocation are n'_1 and n'_2, and under other type of allocation are n_1 and n_2. Define r = n'_1/n'_2 and μ = n_1/(rn_2). Then prove that the efficiency of stratified random sampling with respect to stratified random sampling under Neyman allocation is given by e = μ(r+1)²/((μr + 1)(μ + r)). (20 marks)
(b) A bank has 40000 clients in its computer files, divided into 4000 branches, each managing exactly 10 clients. To estimate the proportion of clients for whom the bank has granted loan, a simple random sample of 40 branches is selected. From the selected sample, for each branch i, a list of clients (A_i) having a loan is prepared; i = 1, 2, ..., 40. The data observed from the selected sample are Σ(i=1 to 40) A_i = 200 and Σ(i=1 to 40) A_i² = 1156.
(i) What type of sampling is this? (3 marks)
(ii) State the expression of the parameter to estimate and obtain its unbiased estimate. (6 marks)
(iii) Estimate the variance of the unbiased estimator obtained in part (ii). (6 marks)
(c) (i) Verify whether the following BIBD are possible: (1) v = b = 22, r = k = 7, λ = 2; (2) v = 10, b = 18, r = 9, k = 5, λ = 4. Given that the design is resolvable.
(ii) Given below is the incidence matrix (N) of a block design. Find the degrees of freedom associated with the adjusted treatment sum of squares and the degrees of freedom for the error sum of squares.
Answer approach & key points
This question demands rigorous mathematical derivation and proof for part (a), followed by applied numerical analysis for parts (b) and (c). Spend approximately 35% of time on part (a) given its 20 marks and proof complexity; allocate 25% to part (b) covering cluster sampling identification, unbiased estimation and variance calculation; and 40% to part (c) on BIBD verification and degrees of freedom computation. Structure as: (a) state assumptions and derive efficiency ratio step-by-step; (b) identify single-stage cluster sampling, construct appropriate estimators using given sums; (c) verify necessary conditions for BIBD existence and compute rank of C-matrix for degrees of freedom.
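- Part (a): Define stratum weights W₁, W₂ and standard deviations S₁, S₂; ignoring the finite population correction (very big population), Var(ȳ_st) = W₁²S₁²/n₁ + W₂²S₂²/n₂; under Neyman allocation n'₁ = nW₁S₁/(W₁S₁+W₂S₂), n'₂ = nW₂S₂/(W₁S₁+W₂S₂), so r = W₁S₁/(W₂S₂); substitute n₁ = μrn₂ with n₁ + n₂ = n and take the ratio of the Neyman variance to the variance under the other allocation to obtain the efficiency formula
- Part (b)(i): Identify this as single-stage cluster sampling with equal cluster sizes: the 4000 branches are the clusters, 40 of them are selected by simple random sampling, and all 10 clients in each selected branch are observed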
- Part (b)(ii): Parameter is population proportion P = ΣA_i/(MN) where M=10, N=4000; unbiased estimator is p̂ = (ΣA_i)/(mM) = 200/(40×10) = 0.5 where m=40
- Part (b)(iii): Variance estimator requires between-cluster mean square s_b² = [ΣA_i² - (ΣA_i)²/m]/(m-1) = [1156 - 1000]/39 = 4, then v(p̂) = (N-m)s_b²/(NmM²) = (0.99 × 4)/(40 × 100) = 0.00099 with finite population correction 1 - m/N = 0.99
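- Part (c)(i): Verify BIBD conditions: vr = bk, λ(v-1) = r(k-1), b ≥ v, and for resolvable designs b ≥ v + r - 1; Design (1) satisfies 22×7 = 22×7 and λ(v-1) = 42 = r(k-1), but it is symmetric with v even, so r - λ = 5 would have to be a perfect square, hence it is not possible; Design (2) satisfies 10×9 = 18×5 = 90 and λ(v-1) = 36 = r(k-1), and b = 18 = v + r - 1, so a resolvable design would be affine resolvable and require k²/v = 25/10 to be an integer, hence it is not possible as a resolvable design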
- Part (c)(ii): For given incidence matrix N, compute C = rI_v - Nk⁻¹N' or treatment information matrix, find rank(C) = v-1 for connected design giving adjusted treatment SS df = v-1, error df = n-v-b+1 or appropriate based on design parameters
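The cluster-sampling estimates in (b) and the existence checks in (c)(i) can be sketched as follows. The function names are hypothetical; the checks encode standard necessary conditions (the parameter identities, Fisher's inequality, the perfect-square condition for symmetric designs with even v, and the k²/v condition when b = v + r - 1), so passing them would not by itself prove that a design exists.

```python
from fractions import Fraction
from math import isqrt

# Q7(b): N branches of M clients each, m branches sampled.
N, M, m = 4000, 10, 40
sum_a, sum_a2 = 200, 1156

p_hat = Fraction(sum_a, m * M)
s_b2 = Fraction(sum_a2 * m - sum_a ** 2, m * (m - 1))   # between-cluster mean square
var_p_hat = (1 - Fraction(m, N)) * s_b2 / (m * M ** 2)

# Q7(c)(i): necessary conditions for a BIBD(v, b, r, k, lambda).
def basic_conditions(v, b, r, k, lam):
    return v * r == b * k and lam * (v - 1) == r * (k - 1) and b >= v

def symmetric_even_v_ok(v, b, r, lam):
    """For a symmetric design with even v, r - lambda must be a perfect square."""
    if v != b or v % 2:
        return True
    return isqrt(r - lam) ** 2 == r - lam

def resolvable_ok(v, b, r, k):
    """k must divide v, b >= v + r - 1, and equality forces k^2/v to be an integer."""
    if v % k or b < v + r - 1:
        return False
    return b > v + r - 1 or (k * k) % v == 0

design_1 = dict(v=22, b=22, r=7, k=7, lam=2)
design_2 = dict(v=10, b=18, r=9, k=5, lam=4)

d1_possible = basic_conditions(**design_1) and symmetric_even_v_ok(22, 22, 7, 2)
d2_possible = basic_conditions(**design_2) and resolvable_ok(10, 18, 9, 5)

print("p_hat =", p_hat, " s_b^2 =", s_b2, " v(p_hat) =", float(var_p_hat))
print("design (1) passes:", d1_possible, " design (2) passes as resolvable:", d2_possible)
```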
Q8 50M solve Factorial design and ANOVA
(a) A 2²-factorial design was used to develop the yield of a crop. Two factors A and B were used at two levels: low (–1) and high (+1). The experiment was replicated twice in a completely randomized manner. The data obtained are as follows:
| Factor A | Factor B | Estimated Average Effect |
|---|---|---|
| – | – | |
| + | – | 8 |
| – | + | –5 |
| + | + | 2 |
The sum of squares of all the yields = 510.5
The grand total of all the yields = 50.00
(i) Analyze the data and identify the significant factors. (12 marks)
(ii) Develop the regression model and predict the yield when A and B both are at low level (–1). (8 marks)
[Given, F₍₁, ₄, ₀.₀₅₎ = 7.71]
(b) To estimate the population mean Ȳ of a characteristic Y, a simple random sample of size 1000 was selected without replacement from a population of size 1000000. The population mean of an auxiliary character X is X̄ = 15. The other results are given below: s²ᵧ = 20, s²ₓ = 25, sₓᵧ = 15, x̄ = 14, ȳ = 10.
(i) Estimate Ȳ using difference, ratio and regression estimators. (6 marks)
(ii) Estimate the MSE of these estimators. Which estimator should we choose to estimate Ȳ? (9 marks)
(c) Write down the model used in the analysis of a two-way classification with interactions, stating the assumptions. What are the hypotheses tested in this scenario? Obtain the expression for the sum of squares and complete the ANOVA. (15 marks)
Answer approach & key points
Solve this multi-part numerical problem by allocating approximately 35% time to part (a) [20 marks], 25% to part (b) [15 marks], and 40% to part (c) [15 marks]. Begin with clear model specification and ANOVA table construction for the 2² factorial in (a), followed by systematic calculation of difference, ratio and regression estimators in (b), and complete theoretical derivation of two-way ANOVA with interaction in (c). Present all computational steps in tabular format with explicit F-test conclusions and MSE comparisons.
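- For (a)(i): From effects A = 8, B = –5, AB = 2 with r = 2 replicates, compute SS = r × (effect)², i.e. SS_A = 128, SS_B = 50, SS_AB = 8; total SS = 510.5 - 50²/8 = 198 and SSE = 12, construct complete ANOVA table with 3 d.f. for treatments and 4 d.f. for error (MSE = 3), compare F_A = 42.67, F_B = 16.67, F_AB = 2.67 with F-critical=7.71 to identify A and B as significant
- For (a)(ii): Develop regression equation Y = β₀ + β₁A + β₂B + β₁₂AB with coded variables, where β̂₀ = 50/8 = 6.25 and each coefficient is half the corresponding effect (4, –2.5, 1); substitute A=-1, B=-1 to predict yield at low-low combination: 6.25 - 4 + 2.5 = 4.75 when only the significant main effects are retained, or 5.75 with the interaction term included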
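- For (b)(i): Compute difference estimator ȳ_D = ȳ + (X̄ - x̄) = 11, ratio estimator ȳ_R = ȳ(X̄/x̄) = 150/14 ≈ 10.71, and regression estimator ȳ_lr = ȳ + b(X̄ - x̄) = 10.6 where b = s_xy/s_x² = 0.6
- For (b)(ii): Calculate estimated MSE for each estimator with f = n/N = 0.001: MSE(ȳ_D) = (1-f)(s_y² + s_x² - 2s_xy)/n ≈ 0.0150, approximate MSE(ȳ_R) = (1-f)(s_y² + R̂²s_x² - 2R̂s_xy)/n ≈ 0.0113 with R̂ = ȳ/x̄, and MSE(ȳ_lr) = (1-f)(s_y²(1-ρ̂²))/n ≈ 0.0110 with ρ̂² = 0.45; select the estimator with minimum MSE, here the regression estimator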
- For (c): State model y_ijk = μ + α_i + β_j + (αβ)_ij + ε_ijk with assumptions (normality, independence, homoscedasticity, Σα_i=Σβ_j=Σ(αβ)_ij=0), hypotheses H₀: all α_i=0, all β_j=0, all (αβ)_ij=0, derive SSA, SSB, SSAB, SSE with degrees of freedom and complete ANOVA table
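The numerical work in parts (a) and (b) can be reproduced with a few lines. This sketch assumes the tabulated values are the estimated effects of A, B and AB in a 2² design with two replicates (so each sum of squares is r × effect²), and it uses the usual first-order MSE approximations for the ratio and regression estimators; all variable names are illustrative.

```python
# Q8(a): ANOVA from estimated effects in a replicated 2^2 design.
effects = {"A": 8.0, "B": -5.0, "AB": 2.0}
reps, runs = 2, 8
f_crit = 7.71                      # F(1, 4, 0.05) as given in the question

ss = {name: reps * e ** 2 for name, e in effects.items()}
ss_total = 510.5 - 50.0 ** 2 / runs
ss_error = ss_total - sum(ss.values())
ms_error = ss_error / 4            # error d.f. = 8 - 1 - 3 = 4
f_ratio = {name: s / ms_error for name, s in ss.items()}
significant = {name for name, f in f_ratio.items() if f > f_crit}

def predict(a, b, terms):
    """Fitted yield in coded units; each coefficient is half the effect."""
    x = {"A": a, "B": b, "AB": a * b}
    return 50.0 / runs + sum(effects[t] / 2 * x[t] for t in terms)

pred_significant_only = predict(-1, -1, significant)
pred_full_model = predict(-1, -1, effects)

# Q8(b): difference, ratio and regression estimators under SRSWOR.
n, N = 1000, 1_000_000
sy2, sx2, sxy = 20.0, 25.0, 15.0
x_bar, y_bar, X_bar = 14.0, 10.0, 15.0
scale = (1 - n / N) / n
slope, ratio = sxy / sx2, y_bar / x_bar

estimate = {
    "difference": y_bar + (X_bar - x_bar),
    "ratio": y_bar * X_bar / x_bar,
    "regression": y_bar + slope * (X_bar - x_bar),
}
mse_est = {
    "difference": scale * (sy2 + sx2 - 2 * sxy),
    "ratio": scale * (sy2 + ratio ** 2 * sx2 - 2 * ratio * sxy),
    "regression": scale * sy2 * (1 - sxy ** 2 / (sx2 * sy2)),
}
best = min(mse_est, key=mse_est.get)

print("SS:", ss, " SSE:", ss_error, " F:", f_ratio, " significant:", significant)
print("prediction at (-1, -1):", pred_significant_only, "or", pred_full_model)
print("estimates:", estimate)
print("estimated MSEs:", mse_est, " smallest:", best)
```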