Monday, July 29, 2013

Statistical Inference

Introduction

Statistics is divided into two broad categories, descriptive statistics and statistical inference (also called inferential statistics); statistical inference is the second of these.
Statistical inference is the act of drawing conclusions, or forming hypotheses, from collected observational data.
Statistical inference also describes a set of procedures that can be used to make predictions about future outcomes of systems affected by random variation. A system must meet some initial requirements to be considered for statistical inference studies: it should produce reasonable answers when applied to well-defined situations, and those answers should generalize to other situations.
We use statistical inference to understand behavior in samples in order to learn about the behavior of a population that is often too large to observe in full, or otherwise inaccessible to the researcher.
Statistical inference needs to use and/or derive a statistical model for the data at hand, from which future predictions are derived. The success of any statistical inference task depends heavily on the chosen statistical model; indeed, the model is at the heart of any statistical inference task.
A statistical inference task can be divided into two parts.
1)      What is needed for a statistical inference test?
Statistical inference requires a statistical model of the random process that is assumed to generate the data; this model is known when randomization has been used.
2)      What is the output of a statistical inference test?
The output of a statistical inference test is a conclusion, or statistical proposition. These propositions can be enumerated as follows (the first two are illustrated in the sketch after this list):
·         An estimate: a particular value that best approximates some parameter of interest.
·         A confidence interval: an interval constructed from a dataset drawn from a population of interest such that, under repeated sampling of such datasets, the intervals would contain the true parameter value with probability equal to the stated confidence level.
·         Acceptance or rejection of a given hypothesis.
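As a minimal sketch of the first two outputs, the snippet below computes a point estimate of a population mean and a 95% confidence interval from a sample. The data are synthetic, and the t-interval's normality assumption is an assumption of this example, not something stated above.

```python
import numpy as np
from scipy import stats

# Synthetic sample: 50 hypothetical measurements from the population of interest.
rng = np.random.default_rng(42)
sample = rng.normal(loc=3.0, scale=1.0, size=50)

mean_hat = sample.mean()   # point estimate of the population mean
sem = stats.sem(sample)    # standard error of the mean

# 95% t-based confidence interval: under repeated sampling, about 95% of
# intervals constructed this way would contain the true mean.
ci_low, ci_high = stats.t.interval(0.95, df=len(sample) - 1,
                                   loc=mean_hat, scale=sem)
print(f"estimate = {mean_hat:.3f}, 95% CI = ({ci_low:.3f}, {ci_high:.3f})")
```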
Descriptive statistics, by contrast, is a preliminary step of any data analysis. It only presents facts about the observed data: the correctness of the random sampling, how the data are distributed, how far the data deviate from their central average (the mean); in a general sense, it gives an overall summary of the data.
Any statistical analysis begins with describing the observed data (descriptive statistics), which gives a general intuition about the source of the data and how the data are distributed in terms of the researcher's interests.

Also, in this preliminary step, the data analyst can curate the data, preprocess it, and prepare it for a specific statistical inference test.


Hypothesis Testing
Definition
Hypothesis testing, or significance testing, is a method by which we test a claim or hypothesis about a parameter in a population using data measured in a sample. In this method, we test the hypothesis by determining the likelihood that the sample statistic would have been observed if the hypothesis regarding the population parameter were true.
The method of hypothesis testing can be summarized in four steps, as follows (a worked sketch follows the list):
1)      First, we identify a hypothesis or claim that we feel should be tested.
For example, we might want to test the claim that the mean number of hours that children in the United States watch TV is 3 hours.
2)      We select a criterion upon which we decide whether the claim being tested is true or not.
For example, we decide how large a discrepancy between the sample mean and the claimed 3 hours we are willing to attribute to chance before rejecting the claim (a level of significance).
3)      We select a random sample from the population and measure the sample mean.
4)      We compare what we observe in the sample to what we expect to observe if the claim we are testing is true. We expect the sample mean to be around 3 hours. If the discrepancy between the sample mean and the claimed population mean is small, then we will likely decide that the claim we are testing is indeed true. If the discrepancy is too large, then we will likely decide to reject the claim as being not true.
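As a rough illustration of these four steps, the sketch below runs a one-sample t-test of the TV-watching claim on made-up data. The sample values, the sample size of 25, and the 0.05 significance level are all assumptions of this example.

```python
import numpy as np
from scipy import stats

# Step 3: a made-up random sample of TV hours for 25 hypothetical children.
rng = np.random.default_rng(0)
hours = rng.normal(loc=3.4, scale=0.8, size=25)

# Steps 1-2: the claim (null hypothesis) is mean = 3 hours; criterion alpha = 0.05.
# Step 4: compare the sample mean to the claim with a one-sample t-test.
t_stat, p_value = stats.ttest_1samp(hours, popmean=3.0)
print(f"sample mean = {hours.mean():.2f}, t = {t_stat:.2f}, p = {p_value:.4f}")

if p_value < 0.05:
    print("Discrepancy too large: reject the claim of 3 hours.")
else:
    print("Discrepancy small: do not reject the claim of 3 hours.")
```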

Decision theory: I tried to learn about decision theory, but I could not understand it. It is not covered in the course, and I had a hard time figuring out what exactly it is; it is something related to game theory, which I don't have any clue about.

Parameter Estimation
In mathematical modeling, hypotheses about the structure and inner workings of the behavioral process of interest are stated in terms of parametric families of probability distributions called models. The goal of modeling is to deduce the form of the underlying process by testing the viability of the model.
Once a model and its parameters are specified for the research question of interest, we can begin to evaluate its goodness of fit, that is, how well it fits the observed data. Goodness of fit is assessed by finding the parameter values of the model that best fit the data; this procedure is called parameter estimation.
Types of parameter estimation
1.      Least-squares estimation (LSE): in this approach, we estimate parameters by minimizing the discrepancy between observed values and the values the model predicts. It is closely associated with regression analysis, where we minimize the residual sum of squares, defined by
RSS = Σᵢ (yᵢ − ŷᵢ)^2
where ŷᵢ is the value the model predicts for observation i. LSE is useful for obtaining a descriptive measure that summarizes observed data, but it provides no basis for testing hypotheses or constructing confidence intervals (a minimal sketch follows this item).
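As a minimal sketch of least-squares estimation on made-up data, the snippet below fits a straight line y = a·x + b and reports the residual sum of squares defined above. The data values and the linear model are assumptions of this example.

```python
import numpy as np

# Made-up data that follow a roughly linear trend.
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 3.9, 6.2, 7.8, 10.1])

# Least-squares fit of y = a*x + b: np.linalg.lstsq minimizes the RSS.
A = np.column_stack([x, np.ones_like(x)])
(a, b), *_ = np.linalg.lstsq(A, y, rcond=None)

y_hat = a * x + b
rss = np.sum((y - y_hat) ** 2)   # RSS = sum over i of (y_i - y_hat_i)^2
print(f"a = {a:.3f}, b = {b:.3f}, RSS = {rss:.4f}")
```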

2.      Maximum likelihood estimation (MLE)
It is a widely used method of parameter estimation in statistics; the resulting estimators have many optimal properties, as follows.
A.      Sufficiency: complete information about the parameter of interest is contained in its MLE estimator.
B.      Consistency: the true parameter value that generated the data is recovered asymptotically, i.e., for samples of sufficiently large size.
C.      Efficiency: the MLE achieves the lowest possible variance of parameter estimates (asymptotically).
D.      Parameterization invariance: the same MLE solution is obtained independent of the parameterization used.
E.      MLE is a prerequisite for the chi-square test, the G-square test, Bayesian methods, inference with missing data, and the modeling of random effects.


Example
Binomial Distribution
Let us consider one observation and one parameter of the binomial distribution. Suppose that y represents the number of successes in a sequence of 10 Bernoulli trials (tossing a coin 10 times), and that the probability of a success on any one trial, represented by the parameter w, is 0.2. The PDF in this case is

P(y | n = 10, w = 0.2) = C(10, y) · 0.2^y · (1 − 0.2)^(10 − y),   y = 0, 1, …, 10

where C(10, y) is the binomial coefficient.
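To make this concrete, here is a small sketch that evaluates this PDF with scipy, using n = 10 and w = 0.2 as in the example above:

```python
from scipy.stats import binom

n, w = 10, 0.2
for y in range(n + 1):
    # Probability of observing exactly y successes in 10 trials with w = 0.2.
    print(f"P(y = {y:2d}) = {binom.pmf(y, n, w):.4f}")
```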
Likelihood function
Given a dataset and the model of interest, the likelihood function lets us find, among all the probability densities that the model prescribes, the one that is most likely to have produced the data.
The likelihood function is obtained by reversing the roles of the data and the parameter in the PDF, treating y as fixed and w as variable. For the binomial example,
L(w | y) = C(10, y) · w^y · (1 − w)^(10 − y)
L(w | y) is the likelihood function, which represents the likelihood of the parameter w given the observed data y, and it is truly a function of w.
We can use the log-likelihood function instead of the likelihood function itself, because the two are monotonically related to each other: the value of w that maximizes one also maximizes the other (a numerical sketch follows).
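As a minimal numerical sketch of maximum likelihood estimation for this binomial example, assume we observed y = 2 successes in n = 10 trials (a made-up outcome) and maximize the log-likelihood over w. For the binomial, the MLE has the closed form ŵ = y/n, so the numerical answer should land near 0.2.

```python
from scipy.optimize import minimize_scalar
from scipy.stats import binom

n, y = 10, 2   # hypothetical observed outcome: 2 successes in 10 trials

def neg_log_likelihood(w):
    # -log L(w | y): the binomial log-PDF viewed as a function of w.
    return -binom.logpmf(y, n, w)

res = minimize_scalar(neg_log_likelihood, bounds=(1e-6, 1 - 1e-6), method="bounded")
print(f"numerical MLE of w = {res.x:.4f}")
print(f"closed-form y/n    = {y / n:.4f}")
```

Because the log-likelihood is monotone in the likelihood, maximizing the log-PDF gives the same ŵ as maximizing the PDF itself, and it is numerically better behaved.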

