A modelling approach to the analysis of complex survey data
- Authors: Dlangamandla, Olwethu
- Date: 2021-10-29
- Subjects: Sampling (Statistics) , Linear models (Statistics) , Multilevel models (Statistics) , Logistic regression analysis , Complex survey data
- Language: English
- Type: Master's theses , text
- Identifier: http://hdl.handle.net/10962/192955 , vital:45284
- Description: Surveys are an essential tool for collecting data, and most surveys use complex sampling designs. Complex sampling designs are used mainly to enhance the representativeness of the sample by accounting for the underlying structure of the population. This often results in data that are non-independent and clustered. Ignoring complex design features such as clustering, stratification, multistage sampling and unequal probability sampling may result in inaccurate and incorrect inference. An overview of design-based and model-based approaches to inference for complex survey data, and of the differences between them, is given. This study adopts a model-based approach. The objective of this study is to discuss and describe the modelling approach to analysing complex survey data. This is done by introducing the principal inference methods under which data from complex surveys may be analysed. In particular, the theory and methods of model fitting for the analysis of complex survey data are presented. We begin by discussing the unique features of complex survey data and explore appropriate methods of analysis that account for the complexity inherent in such data. We also explore the widely applied logistic regression modelling of binary data in a complex sample survey context. In particular, four forms of logistic regression models are fitted: generalized linear models, multilevel models, mixed effects models and generalized linear mixed models. Simulated complex survey data are used to illustrate the methods and models. Various R packages are used for the analysis. The results presented and discussed in this thesis indicate that a logistic mixed model with first and second level predictors has a better fit than a logistic mixed model with first level predictors.
In addition, a logistic multilevel model with first and second level predictors and nested random effects provides a better fit to the data than the other fitted logistic multilevel models. Similar results were obtained from fitting a generalized logistic mixed model with first and second level predictor variables and a generalized linear mixed model with first and second level predictors and nested random effects. , Thesis (MSc) -- Faculty of Science, Statistics, 2021
- Date Issued: 2021-10-29
An exploratory study of students’ expectations and perceptions of service quality in a South African higher education institution
- Authors: Williams, Alyssa Shawntay
- Date: 2018
- Subjects: SERVQUAL (Service quality framework) , Relationship marketing , Consumer satisfaction , Sampling (Statistics) , College students Attitudes , Universities and colleges South Africa
- Language: English
- Type: text , Thesis , Masters , MCom
- Identifier: http://hdl.handle.net/10962/63844 , vital:28496
- Description: Within the past few years, higher education institutions have come under intense pressure to restructure, increase funding and grow student numbers, whilst still preserving the service quality they offer. The purpose of this study is to measure students’ expectations and perceptions of service quality in a higher education institution and to establish how significant a gap exists between what is expected and what is perceived. The instrument utilised in the present study is SERVQUAL. A convenience sampling approach was adopted. Descriptive statistics were used to analyse the data for the objectives concerning the gap between students’ expectations and perceptions, and inferential statistics were used to test the hypotheses regarding differences in that gap between faculties. The study found gaps in all dimensions, ordered from highest to lowest: Reliability, Responsiveness, Assurance, Empathy, Tangibility. In addition, differences in means between faculties were tested, and the only dimension with a significant difference was Empathy. These results were used to offer recommendations to management, faculties and departments of the higher education institution under study about where they are deficient, so that they can improve their services, enhance their service quality and increase their competitive advantage without financial strain. Overall, the conclusion of the present study was that students and higher education institutions need to have a mutual interest in their relations. This means that as much as higher education institutions need to provide high service quality to students, students need to be willing to provide feedback and interact.
- Date Issued: 2018
A Cox proportional hazard model for mid-point imputed interval censored data
- Authors: Gwaze, Arnold Rumosa
- Date: 2011
- Subjects: Statistics -- Econometric models , Survival analysis (Biometry) , Mathematical statistics -- Data processing , Nonparametric statistics , Sampling (Statistics) , Multiple imputation (Statistics)
- Language: English
- Type: Thesis , Masters , MSc (Biostatistics and Epidemiology)
- Identifier: vital:11780 , http://hdl.handle.net/10353/385 , http://hdl.handle.net/10353/d1001135
- Description: There has been increasing interest in survival analysis with interval-censored data, where the event of interest (such as infection with a disease) is not observed exactly but is only known to have happened between two examination times. However, because so much research has focused on right-censored data, many statistical tests and techniques are available for right censoring, whereas methods for interval-censored data are less abundant. In this study, right-censoring methods are used to fit a proportional hazards model to interval-censored data. The interval-censored observations were transformed using mid-point imputation, a method which assumes that an event occurs at the midpoint of its recorded interval. The results gave conservative regression estimates, but a comparison with the conventional methods showed that the estimates were not significantly different. However, the censoring mechanism and the interval lengths should be given serious consideration before deciding to use mid-point imputation on interval-censored data.
- Date Issued: 2011
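The mid-point imputation step described in this abstract can be sketched as follows. This is a minimal illustration rather than the author's code: the function name `midpoint_impute`, the use of `None` to mark a subject whose event was not observed by the last examination, and the example intervals are all assumptions made for the sketch.

```python
from typing import List, Optional, Tuple


def midpoint_impute(left: float, right: Optional[float]) -> Tuple[float, int]:
    """Turn one interval-censored observation into a (time, event) pair.

    left  : last examination time at which the event had not yet occurred
    right : first examination time at which the event had occurred, or
            None if the event was never observed (assumed convention)
    """
    if right is None:
        # No upper bound: treat as right-censored at the last examination.
        return left, 0
    if right < left:
        raise ValueError("interval upper bound is below its lower bound")
    # Assume the event happened at the arithmetic midpoint of the interval.
    return (left + right) / 2.0, 1


# Hypothetical intervals: (last negative exam, first positive exam).
intervals: List[Tuple[float, Optional[float]]] = [(2.0, 4.0), (0.0, 3.0), (5.0, None)]
imputed = [midpoint_impute(lo, hi) for lo, hi in intervals]
print(imputed)
```

The resulting (time, event) pairs have the shape expected by standard right-censoring software, which is what makes it possible to fit a proportional hazards model afterwards.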
Randomization in a two armed clinical trial: an overview of different randomization techniques
- Authors: Batidzirai, Jesca Mercy
- Date: 2011
- Subjects: Clinical trials -- Statistical methods , Biometry , Sampling (Statistics)
- Language: English
- Type: Thesis , Masters , MSc (Biostatistics and Epidemiology)
- Identifier: vital:11781 , http://hdl.handle.net/10353/395
- Description: Randomization is the key element of any sensible clinical trial. It is the only way we can be sure that patients have been allocated to the treatment groups without bias and that the treatment groups are broadly similar before the start of the trial. The randomization scheme used to allocate patients to the treatment groups plays a role in achieving this goal. This study uses SAS simulations to carry out categorical data analysis and to compare two main randomization schemes, unrestricted randomization (simple randomization) and restricted randomization (the minimization method), in dental studies with small samples. Results show that minimization produces almost equally sized treatment groups, whereas simple randomization is weak in balancing prognostic factors. Nevertheless, simple randomization can also produce balanced groups by chance, even in small samples. Statistical power is also higher under minimization than under simple randomization, but bigger samples might be needed to boost the power.
- Date Issued: 2011
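The contrast between the two schemes named in this abstract can be sketched in a few lines. The thesis itself used SAS; the Python below is only an illustrative sketch, and the arm labels, the prognostic factors (`sex`, `smoker`), the seed and the deterministic variant of minimization with random tie-breaking are all assumptions, not details taken from the thesis.

```python
import random
from collections import Counter

ARMS = ("A", "B")


def simple_randomization(n, rng):
    """Unrestricted scheme: each patient is allocated by a fair coin toss."""
    return [rng.choice(ARMS) for _ in range(n)]


def minimization(patients, rng):
    """Restricted scheme: allocate each patient to the arm that currently
    holds fewer patients sharing that patient's prognostic factor levels."""
    counts = {arm: Counter() for arm in ARMS}  # per-arm counts of (factor, level)
    allocation = []
    for factors in patients:
        # Imbalance score of an arm: matching patients summed over factors.
        totals = {
            arm: sum(counts[arm][item] for item in factors.items())
            for arm in ARMS
        }
        lowest = min(totals.values())
        # Ties are broken at random so allocation is not fully predictable.
        arm = rng.choice([a for a in ARMS if totals[a] == lowest])
        for item in factors.items():
            counts[arm][item] += 1
        allocation.append(arm)
    return allocation


rng = random.Random(2011)  # fixed seed so the sketch is reproducible
patients = [
    {"sex": rng.choice("MF"), "smoker": rng.choice("YN")} for _ in range(20)
]
print("simple      :", Counter(simple_randomization(len(patients), rng)))
print("minimization:", Counter(minimization(patients, rng)))
```

Running the sketch with different seeds gives a feel for how often a coin toss leaves a small trial unbalanced, while minimization keeps the factor-level counts in the two arms close to each other.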