Statistical Inference
Introduction
Statistical inference is the second of the two broad categories into which statistics is divided: descriptive statistics and statistical inference (also called inferential statistics).
Statistical inference is the process of drawing conclusions, or forming hypotheses, from collected observational data. It also describes a set of procedures that can be used to make predictions about the future outcomes of systems affected by random variation. A system must meet some initial requirements to be considered for statistical inference studies: it should produce reasonable answers when applied to well-defined situations, and those answers should generalize to other situations.
We use statistical inference to understand the behavior of a sample in order to learn about the behavior of the population, which is often too large or too inaccessible for the researcher to measure in full.
Statistical inference requires using, and/or deriving, a statistical model for the data at hand, from which future predictions are derived. The success of any statistical inference task depends heavily on the chosen statistical model; indeed, the model is at the heart of any statistical inference task.
Statistical inference can be divided into two parts.
1) What is needed for a statistical inference test?
Statistical inference needs a statistical model of the random process that is supposed to have generated the data; this model is known when randomization has been used.
2) What is the output of a statistical inference test?
The output of a statistical inference test is a conclusion, or a statistical proposition. These propositions or conclusions can be enumerated as follows:
· An estimate: a particular value that best approximates some parameter of interest.
· Confidence interval: an interval constructed using a dataset drawn from a given population of interest, so that, under repeated sampling of such datasets, the intervals would contain the true parameter value with probability equal to the stated confidence level (the sketch after this list demonstrates this repeated-sampling interpretation).
· Acceptance or rejection of a given hypothesis.
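To make the repeated-sampling interpretation of a confidence interval concrete, here is a minimal Python sketch; the population parameters and sample size are hypothetical, chosen only for illustration. It draws many samples from a known population and counts how often the normal-approximation 95% interval covers the true mean.

import numpy as np

rng = np.random.default_rng(0)
true_mean, true_sd = 3.0, 1.0      # hypothetical population parameters
n, n_repeats = 50, 10_000          # sample size and number of repeated samplings

covered = 0
for _ in range(n_repeats):
    sample = rng.normal(true_mean, true_sd, size=n)
    se = sample.std(ddof=1) / np.sqrt(n)   # standard error of the mean
    lo = sample.mean() - 1.96 * se         # normal-approximation
    hi = sample.mean() + 1.96 * se         # 95% confidence interval
    covered += (lo <= true_mean <= hi)

print("empirical coverage:", covered / n_repeats)   # close to 0.95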
Whereas descriptive statistics is a preliminary step of any data analysis: it only presents facts about the observed data, such as the correctness of the random sampling, how the data are distributed, and how the data deviate from their central average (the mean). In a general sense, it gives an overall summary of the data.
Any statistical analysis therefore begins with describing the observed data (descriptive statistics); this gives a general intuition about the source of the data and how the data are distributed in terms of the researcher's interests.
Also, in this preliminary step, the data analyst can curate the data, preprocess it, and prepare it for a specific statistical inference test.
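As a small sketch of this preliminary step (the observations below are hypothetical), the following Python snippet summarizes a dataset before any inferential test is run.

import numpy as np

data = np.array([2.5, 3.1, 2.9, 3.4, 2.8, 3.0, 3.3, 2.7])  # hypothetical observations

print("count :", data.size)
print("mean  :", data.mean())           # the central average
print("std   :", data.std(ddof=1))      # deviation from the mean
print("range :", data.min(), "to", data.max())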
Hypothesis Testing
Definition
Hypothesis testing, or significance testing, is a method by which we test a claim or hypothesis about a parameter in a population, using data measured in a sample. In this method, we test a hypothesis by determining the likelihood that a sample statistic could have been selected if the hypothesis regarding the population parameter were true.
The method of hypothesis testing can be summarized in four steps, as follows:
1) First, we identify a hypothesis or claim that we feel should be tested.
For example, we might want to test the claim that the mean number of hours that children in the United States watch TV is 3 hours per week.
2) We select a criterion upon which we decide whether the claim being tested is true or not.
For example, we might decide in advance how unlikely the observed sample would have to be, under the claim that children watch 3 hours of TV per week, before we reject that claim.
3) We select a random sample from the population and measure the sample mean.
4) We compare what we observe in the sample to what we expect to observe if the claim we are testing is true. We expect the sample mean to be around 3 hours. If the discrepancy between the sample mean and the claimed population mean is small, then we will likely decide that the claim we are testing is indeed true. If the discrepancy is too large, then we will likely decide to reject the claim as not true.
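As an illustration, here is a minimal Python sketch of these four steps using a one-sample t-test (the sample values are hypothetical, and scipy is assumed to be available). The t-test is one common way to formalize how large the discrepancy must be; the steps above do not commit to a particular test.

import numpy as np
from scipy import stats

# Hypothetical sample: weekly TV hours measured for 20 children
sample = np.array([2.8, 3.5, 4.1, 2.2, 3.0, 3.9, 2.6, 3.3, 4.0, 2.9,
                   3.1, 2.4, 3.7, 3.2, 2.5, 3.8, 3.4, 2.7, 3.6, 3.0])

claimed_mean = 3.0   # step 1: the claim to test
alpha = 0.05         # step 2: the criterion (significance level)

print("sample mean:", sample.mean())   # step 3: measure the sample mean

# Step 4: compare what we observe to what the claim predicts
t_stat, p_value = stats.ttest_1samp(sample, popmean=claimed_mean)
print("t =", t_stat, ", p =", p_value)
if p_value < alpha:
    print("discrepancy too large: reject the claim")
else:
    print("discrepancy small: retain the claim")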
Decision Theory: I tried to learn about decision theory, but I could not understand it. It is not covered in the course, and I had a hard time figuring out what exactly it is; it seems to be related to game theory, which I have no background in.
Parameter Estimation
In mathematical modeling, hypotheses about the structure and inner workings of the behavioral process of interest are stated in terms of parametric families of probability distributions, called models. The goal of modeling is to deduce the form of the underlying process by testing the viability of such models.
When a model is specified with its parameters for the research question of interest, we can begin to evaluate its goodness of fit, that is, how well it fits the observed data. Goodness of fit is assessed by finding the parameter values of the model that best fit the data; this procedure is called parameter estimation.
Types of parameter estimation
1. Least-squares estimation (LSE): in this type, we estimate parameters by minimizing the discrepancy between the observed values and the values predicted by the model. It is strongly associated with regression analysis, where we minimize the residual sum of squares, defined as

RSS = Σ_i (y_i − ŷ_i)²

where y_i is the observed value and ŷ_i is the model's prediction. LSE is useful for obtaining a descriptive measure for the purpose of summarizing observed data, but it provides no basis for testing hypotheses or constructing confidence intervals.
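A minimal sketch of LSE in a regression setting (the data points are hypothetical, numpy assumed available): fit a straight line by minimizing the residual sum of squares defined above.

import numpy as np

# Hypothetical points scattered around the line y = 2x + 1
x = np.array([0.0, 1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([1.2, 2.9, 5.1, 6.8, 9.2, 10.9])

# np.polyfit returns the least-squares slope and intercept
slope, intercept = np.polyfit(x, y, deg=1)
y_hat = slope * x + intercept            # predicted values

rss = np.sum((y - y_hat) ** 2)           # the quantity LSE minimizes
print("slope:", slope, "intercept:", intercept, "RSS:", rss)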
2. Maximum likelihood estimation (MLE): a widely used method of parameter estimation in statistics; it has many optimal properties, as follows.
A. Sufficiency: complete information about the parameter of interest is contained in its MLE estimator.
B. Consistency: the true parameter value that generated the data is recovered asymptotically, that is, for samples of sufficiently large size.
C. Efficiency: the lowest possible variance of parameter estimates is achieved asymptotically.
D. Parameterization invariance: the same MLE solution is obtained independent of the parameterization used.
In addition, MLE is a prerequisite for the chi-square test, the G-square test, Bayesian methods, inference with missing data, and modeling of random effects.
Example
Binomial Distribution
Let us consider one observation and one parameter of the binomial distribution. Suppose that y represents the number of successes in a sequence of 10 Bernoulli trials (tossing a coin 10 times), and that the probability of success on any one trial, represented by the parameter w, is 0.2. The probability mass function in this case is

P(y | n = 10, w = 0.2) = C(10, y) · w^y · (1 − w)^(10 − y),   y = 0, 1, ..., 10

where C(10, y) is the binomial coefficient.
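A short sketch evaluating this pmf numerically (scipy assumed available):

from scipy import stats

n, w = 10, 0.2    # 10 Bernoulli trials, success probability 0.2
for y in range(n + 1):
    print(y, stats.binom.pmf(y, n, w))   # P(y successes out of n)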
Likelihood function
Given a dataset and a model of interest, the likelihood function finds, among all the probability densities that the model prescribes, the one that is most likely to have produced the data.
In this example the likelihood function is

L(w | y) = C(10, y) · w^y · (1 − w)^(10 − y)

L(w | y) represents the likelihood of the parameter w given the observed data y; it is the same expression as the pmf above, but now regarded as a function of w rather than of y.
We can use the log-likelihood function instead of the likelihood function itself, because the two are monotonically related to each other: the value of w that maximizes one also maximizes the other.
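To close the example, here is a minimal sketch (with a hypothetical count of 7 successes) that maximizes the log-likelihood over a grid of w values; for the binomial, the maximum is known in closed form at w = y/n, which the grid search recovers.

import numpy as np

n, y = 10, 7                          # hypothetical: 7 successes in 10 trials
w = np.linspace(0.001, 0.999, 999)    # candidate parameter values

# Log-likelihood of w given y; the binomial coefficient is dropped
# because it does not depend on w
log_lik = y * np.log(w) + (n - y) * np.log(1 - w)

print("grid-search MLE:", w[np.argmax(log_lik)])   # about 0.7
print("closed form y/n:", y / n)                   # the known binomial MLE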