### Collection of Dissertation Data
Data collection plays a vital role in completing a good dissertation. Collect only the data that bear on the research question; unnecessary or unimportant data should be left out. Data collection typically involves the following:
- Structured or semi-structured interviews
- Structured observation
- Questions put to the target population
- Questionnaires
### Choosing Appropriate Statistical Tests
Many students are unsure how to choose an appropriate statistical test, and the problem surfaces when they carry out the analysis for a thesis or dissertation. Most are unaware of how much the choice matters, and do not know the procedure for making it. Choosing the right statistical test is therefore of prime importance.
### The Various Statistical Tests and Their Characteristics
##### Multiple Regression
Multiple regression models a dependent variable as a function of several independent variables. It shows how the dependent variable changes when one independent variable is varied while the others are held fixed. The analysis estimates the average value of the dependent variable for given values of the independent variables, i.e., the conditional expectation of Y given X. Estimation is always made for the dependent variable given the independent variables, and the variation of the dependent variable around that estimate is characterized by a probability distribution. Regression models may be simple (one independent variable) or multiple (several), and linear or nonlinear.
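As a minimal sketch of the idea, the following fits a multiple regression by ordinary least squares with NumPy. The data, the variable names (`x1`, `x2`, `y`), and the helper `fit_ols` are made up for illustration. The response is constructed exactly as 1 + 2·x1 + 3·x2, so the fit should recover those coefficients.

```python
import numpy as np

def fit_ols(X, y):
    """Return OLS coefficients [intercept, b1, b2, ...] for y regressed on X."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    # Prepend a column of ones so the first coefficient is the intercept.
    design = np.column_stack([np.ones(len(X)), X])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    return coef

# Hypothetical data: two independent variables and a noise-free response.
x1 = np.array([0.0, 1.0, 2.0, 3.0, 4.0, 5.0])
x2 = np.array([1.0, 0.0, 2.0, 1.0, 3.0, 2.0])
y = 1.0 + 2.0 * x1 + 3.0 * x2

coef = fit_ols(np.column_stack([x1, x2]), y)

# Estimated conditional mean of y at x1 = 2, x2 = 1.
y_hat = coef @ np.array([1.0, 2.0, 1.0])
```

Each slope in `coef` is the change in the estimated mean of `y` for a one-unit change in that variable with the other held fixed. Real data would include noise, so the estimates would only approximate the underlying values.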
##### Canonical Correlation
Canonical correlation analysis builds on both multiple regression and correlation analysis. Whereas multiple regression examines the relation between a single dependent variable and a linear combination of the independent variables, canonical correlation analysis examines the relation between a linear combination of a set of dependent variables and a linear combination of a set of independent variables.
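One way to compute canonical correlations, sketched here with NumPy, is to take orthonormal bases of the two centered variable sets and read off the singular values of their cross-product. The function name `canonical_correlations` and the simulated data are assumptions for illustration, and the sketch assumes both sets have full column rank and more cases than variables.

```python
import numpy as np

def canonical_correlations(X, Y):
    """Canonical correlations between the column sets X and Y (rows = cases)."""
    Xc = X - X.mean(axis=0)
    Yc = Y - Y.mean(axis=0)
    # Orthonormal bases for the two centered column spaces.
    Qx, _ = np.linalg.qr(Xc)
    Qy, _ = np.linalg.qr(Yc)
    # The singular values of Qx' Qy are the canonical correlations.
    s = np.linalg.svd(Qx.T @ Qy, compute_uv=False)
    return np.clip(s, 0.0, 1.0)

# Simulated data: Y is an exact linear function of X, so every
# canonical correlation should equal 1 (a simple sanity check).
rng = np.random.default_rng(0)
X = rng.normal(size=(30, 2))
Y_linear = X @ np.array([[1.0, 2.0], [0.0, 1.0]])
rho = canonical_correlations(X, Y_linear)
```

With real data the two sets are not exact functions of each other, and the canonical correlations fall between 0 and 1, largest first.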
##### Logistic Regression
Logistic regression is used when the dependent variable is categorical with a binary outcome. It is also known as the logistic model or logit model. It fits the data to a logistic function, whose curve is known as the logistic curve. It is a special case of the generalized linear model and is used for binomial regression. Logistic regression can accommodate many independent variables, both numerical and categorical.
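The logistic function itself is short enough to write out. The sketch below shows how a fitted model would turn a set of predictor values into a probability; the coefficients and the predictors (hours studied, review session attended) are hypothetical, not estimates from any real data.

```python
import math

def logistic(z):
    """Logistic function: maps a real number to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def predict_probability(intercept, coefficients, values):
    """P(outcome = 1) for one case under a logistic regression model."""
    z = intercept + sum(b * x for b, x in zip(coefficients, values))
    return logistic(z)

# Hypothetical model: one numerical predictor (hours studied = 2.0) and
# one categorical predictor coded 0/1 (attended a review session = 1).
# Linear predictor: z = -2.0 + 0.5 * 2.0 + 1.0 * 1.0 = 0.0
p = predict_probability(-2.0, [0.5, 1.0], [2.0, 1.0])
```

A linear predictor of zero corresponds to a probability of 0.5, the midpoint of the logistic curve.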
##### Factor Analysis
Factor analysis is a branch of multivariate analysis concerned with the internal relationships among a set of variates. Its aim is to account for the covariances of the observed variates in terms of a much smaller number of hypothetical variates, or factors. In correlation terms, the first question is whether any correlation exists, i.e., whether the correlation matrix differs from the identity matrix. If it does, the next question is whether there is a random variable such that the partial correlation coefficients between the observed variates, after eliminating that random variable, are all zero. If not, two random variables are postulated and the partial correlation coefficients after eliminating those two are examined. The process continues until all the partial correlations between the observed variates are zero. The factor model consists of a matrix of constant factor loadings, the common factors, and the specific factors.
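In symbols, and assuming the usual orthogonal factor model with $p$ observed variates and $m < p$ common factors (the notation is introduced here for illustration), the model and the covariance structure it implies can be written as:

$$
\mathbf{x} = \boldsymbol{\mu} + \boldsymbol{\Lambda}\mathbf{f} + \mathbf{e},
\qquad
\operatorname{Cov}(\mathbf{x}) = \boldsymbol{\Lambda}\boldsymbol{\Lambda}^{\top} + \boldsymbol{\Psi}
$$

Here $\boldsymbol{\Lambda}$ is the $p \times m$ matrix of factor loadings, $\mathbf{f}$ holds the common factors (assumed uncorrelated with unit variance), and $\mathbf{e}$ holds the specific factors, assumed uncorrelated with $\mathbf{f}$ and with one another, so that their covariance matrix $\boldsymbol{\Psi}$ is diagonal. All covariance between different observed variates then comes from $\boldsymbol{\Lambda}\boldsymbol{\Lambda}^{\top}$, which is why the partial correlations vanish once the common factors are eliminated.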
##### Analysis of Variance: ANOVA and MANOVA
ANOVA (analysis of variance) compares group means and is typically used when there are more than two groups. MANOVA (multivariate analysis of variance) is used when there are several dependent variables: it tests whether one or more predictor variables affect those dependent variables jointly. The p-value from MANOVA is used to decide whether to reject or fail to reject the null hypothesis.
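For the one-way ANOVA case, the F statistic is the ratio of between-group to within-group mean squares. The sketch below computes it in plain Python; the function name `one_way_anova_f` and the three small groups are invented for illustration.

```python
def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a one-way ANOVA."""
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    means = [sum(g) / len(g) for g in groups]
    # Variation of the group means around the grand mean.
    ss_between = sum(len(g) * (m - grand_mean) ** 2
                     for g, m in zip(groups, means))
    # Variation of the observations around their own group mean.
    ss_within = sum((x - m) ** 2
                    for g, m in zip(groups, means) for x in g)
    df_between = len(groups) - 1
    df_within = n_total - len(groups)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within

# Hypothetical scores for three groups of three cases each.
groups = [[1, 2, 3], [2, 3, 4], [5, 6, 7]]
f_stat, df_b, df_w = one_way_anova_f(groups)
```

For these data the between-group sum of squares is 26 and the within-group sum of squares is 6, giving F = 13 on (2, 6) degrees of freedom. The p-value would then be read from the F distribution with those degrees of freedom.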
##### Discriminant Function Analysis
The objective of discriminant analysis is to classify objects or individuals into non-overlapping, mutually exclusive classes on the basis of a priori information. The method appears to have originated in the study of intergroup distance by R. A. Fisher and later by P. C. Mahalanobis. The discriminant they used is a linear composite of the explanatory variables. The basic principle is to assign each object to the group for which its misclassification error is smallest.
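A linear discriminant for two groups can be sketched with NumPy as follows. This is one common formulation (Fisher's linear discriminant with a cutoff midway between the projected group means, which assumes equal prior probabilities and equal misclassification costs). The function names and the data are hypothetical.

```python
import numpy as np

def fit_fisher_lda(a, b):
    """Return (w, c): discriminant direction and cutoff for groups a and b."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    mean_a, mean_b = a.mean(axis=0), b.mean(axis=0)
    # Pooled within-group scatter matrix.
    scatter = (a - mean_a).T @ (a - mean_a) + (b - mean_b).T @ (b - mean_b)
    # Direction that best separates the group means relative to the scatter.
    w = np.linalg.solve(scatter, mean_a - mean_b)
    # Cutoff: midpoint between the two projected group means.
    c = w @ (mean_a + mean_b) / 2.0
    return w, c

def classify(x, w, c):
    """Assign x to group 'A' if its discriminant score exceeds the cutoff."""
    return "A" if w @ np.asarray(x, dtype=float) > c else "B"

# Hypothetical training data: two measurements per object.
group_a = [[1, 2], [2, 3], [3, 3]]
group_b = [[6, 5], [7, 8], [8, 7]]
w, c = fit_fisher_lda(group_a, group_b)
labels = [classify([2, 2], w, c), classify([7, 7], w, c)]
```

The discriminant score `w @ x` is the linear composite of the explanatory variables, and a new object goes to whichever group's side of the cutoff its score falls on.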
##### Test for Means (t-Test)
The t-test compares the means of two groups. Besides general two-group mean comparisons, it is also the appropriate test for post-test two-group designs. In a dissertation, the t-test is the one to use in these situations.
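A minimal sketch of the two-sample t statistic, using only the Python standard library, is given below. It assumes two independent groups with equal variances (the pooled-variance form); the function name and the data are made up for illustration.

```python
import math
from statistics import mean, variance

def pooled_t_statistic(a, b):
    """Return (t, df) for the equal-variance two-sample t-test."""
    n_a, n_b = len(a), len(b)
    df = n_a + n_b - 2
    # Pooled estimate of the common variance of the two groups.
    pooled_var = ((n_a - 1) * variance(a) + (n_b - 1) * variance(b)) / df
    std_error = math.sqrt(pooled_var * (1 / n_a + 1 / n_b))
    return (mean(a) - mean(b)) / std_error, df

# Hypothetical post-test scores for two groups.
control = [1, 2, 3, 4, 5]
treated = [2, 3, 4, 5, 6]
t_stat, df = pooled_t_statistic(control, treated)
```

Here the group means are 3 and 4, the standard error is 1, and so t = -1 on 8 degrees of freedom. Since |t| is well below the two-sided 5% critical value of about 2.306 for 8 degrees of freedom, the difference would not be declared significant.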
##### Goodness of Fit and Deviations (Chi-Square)
The chi-square test compares observed data with expected data. Goodness of fit refers to how well the observed data match what is expected. The question is whether the deviation between observed and expected values is due to real factors or to chance. The null hypothesis of the chi-square test states that there is no significant difference between the observed and expected values; the test shows how large the deviation is and how good the fit is.
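The chi-square goodness-of-fit statistic is the sum of squared deviations between observed and expected counts, each divided by the expected count. The following plain-Python sketch uses invented counts and a hypothetical helper name.

```python
def chi_square_statistic(observed, expected):
    """Pearson chi-square statistic: sum of (O - E)^2 / E over categories."""
    if len(observed) != len(expected):
        raise ValueError("observed and expected must have the same length")
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

# Hypothetical counts: 100 cases across five categories, with equal
# expected frequencies under the null hypothesis.
observed = [18, 22, 20, 25, 15]
expected = [20, 20, 20, 20, 20]
chi2 = chi_square_statistic(observed, expected)
df = len(observed) - 1
```

For these counts the statistic is 2.9 on 4 degrees of freedom. That is below the 5% critical value of about 9.488, so the deviations are consistent with chance and the null hypothesis is not rejected.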
##### Structural Equation Modeling and Path Analysis
Path analysis involves causal modeling: the model is used to explore the correlations among variables. The method is closely related to, and often treated as a form of, Structural Equation Modeling (SEM). The path analysis model has two types of variables: observable, endogenous variables, which act as indicators, and non-observable, exogenous variables, which are hypothetical and cannot be measured directly.
The aims of path analysis are as follows:
- To identify patterns of correlation.
- To explain possible variation.
Path analysis differs from multiple regression, ANOVA, and similar methods in that it is used to decide whether to reject, accept, or modify the model as a whole.