Statistical analysis for a medical thesis is the step that confuses most MD, MS, DNB, and MSc Nursing students. You have collected your data, but which test do you run? Choosing the wrong statistical test is one of the most common reasons examiners question a thesis during the viva. This practical guide explains which statistical test to use, how to choose based on your data type, and how to plan your analysis before data collection begins.
1. Why Statistical Analysis Matters in Your Medical Thesis
Statistical analysis is not just a formality — it is the scientific backbone of your entire research. Without the correct analysis, even the best-designed study produces results that examiners will question. Moreover, choosing the wrong test can lead to incorrect conclusions, which is a serious academic problem.
Most importantly, your choice of statistical tests must be declared in your synopsis before data collection begins. Planning your analysis early — at the synopsis stage — is not optional. It is one of the first things the IEC and your thesis guide will check.
Always declare your statistical plan in your Methods section before IEC submission. List every test by name and justify why it suits your data type and study design.
2. Understand Your Data Type First
Before selecting any test, you must identify what type of data you have. This single decision determines everything else about your analysis.
🔢 Nominal Data
Categories with no order. Examples: blood group, gender, religion, diagnosis type.
→ Use: Chi-square, Fisher's exact test
📊 Ordinal Data
Categories with order but unequal gaps. Examples: pain score (mild/moderate/severe), NYHA class, severity grade.
→ Use: Mann-Whitney U, Kruskal-Wallis
📏 Continuous Data
Measured numbers with equal intervals. Examples: blood pressure, haemoglobin, serum creatinine, age.
→ Use: t-test, ANOVA, Pearson's correlation
⏱️ Time-to-Event Data
Time until an event occurs. Examples: time to recovery, survival after diagnosis, hospital stay duration.
→ Use: Kaplan-Meier, Log-rank test
Before applying any test to continuous data, check whether the data are normally distributed. In SPSS, the Shapiro-Wilk test is commonly used for samples under 50 and the Kolmogorov-Smirnov test for larger samples. If the data are normally distributed, use parametric tests; if not, use the non-parametric alternatives.
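The data-type rules above can be sketched as a small lookup function. This is only an illustration of the decision logic in this guide: the function name `suggest_test` and its parameters are hypothetical, and the `normal` flag assumes you have already checked normality (for example with Shapiro-Wilk in SPSS).

```python
def suggest_test(data_type, groups=2, paired=False, normal=True):
    """Return the test this guide suggests for a simple group comparison.

    Assumptions (not from SPSS): data_type is one of "nominal", "ordinal",
    "continuous" or "time-to-event"; paired=True means two measurements
    on the same patients.
    """
    if data_type == "nominal":
        return "Chi-square test (Fisher's exact test if expected counts are small)"
    if data_type == "time-to-event":
        return "Kaplan-Meier with log-rank test"
    if data_type not in ("ordinal", "continuous"):
        raise ValueError(f"unknown data type: {data_type}")
    if data_type == "ordinal":
        normal = False  # ordinal data always takes the non-parametric route
    if paired:
        return "Paired t-test" if normal else "Wilcoxon signed-rank"
    if groups == 2:
        return "Independent t-test" if normal else "Mann-Whitney U"
    return "One-way ANOVA" if normal else "Kruskal-Wallis"


# Example: haemoglobin compared across three groups, data not normal
print(suggest_test("continuous", groups=3, normal=False))
```

A helper like this does not replace judgment about study design, but writing the rules down once makes it easier to justify each test in the synopsis.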
3. Which Statistical Test to Use: Complete Decision Table
Use this table as your quick reference guide when planning statistical analysis for your medical thesis:
| Research Question | Data Type | Parametric Test | Non-Parametric Alternative |
|---|---|---|---|
| Compare means of 2 independent groups | Continuous | Independent t-test | Mann-Whitney U |
| Compare means before and after (same group) | Continuous | Paired t-test | Wilcoxon signed-rank |
| Compare means of 3+ groups | Continuous | One-way ANOVA | Kruskal-Wallis |
| Compare proportions between 2 groups | Categorical | Chi-square test | Fisher's exact test |
| Find relationship between 2 continuous variables | Continuous | Pearson's correlation | Spearman's correlation |
| Predict outcome from variables | Continuous/Binary | Linear regression (continuous outcome) / Logistic regression (binary outcome) | — |
| Assess diagnostic accuracy | Binary outcome | ROC curve, Sensitivity/Specificity | — |
| Assess agreement between observers | Continuous/Categorical | Bland-Altman (continuous) / Kappa (categorical) | — |
When more than 20% of cells in a Chi-square table have an expected frequency below 5, switch to Fisher's exact test. Overlooking this is one of the most frequently caught mistakes in medical thesis statistical analysis.
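To see where the expected frequencies come from, here is a minimal sketch that computes them from the row and column totals and applies the 20% rule. The function names and the example counts are hypothetical; SPSS reports the same information in a footnote under the Chi-square table.

```python
def expected_counts(table):
    """Expected count for each cell = row total * column total / grand total."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    return [[r * c / n for c in col_totals] for r in row_totals]


def needs_fisher(table, min_expected=5, max_share=0.20):
    """True when more than 20% of cells have an expected count below 5."""
    cells = [e for row in expected_counts(table) for e in row]
    small = sum(1 for e in cells if e < min_expected)
    return small / len(cells) > max_share


# Hypothetical 2x2 table: rows = diabetic / non-diabetic,
# columns = complication yes / no
small_study = [[2, 10], [8, 20]]
print(expected_counts(small_study))
print(needs_fisher(small_study))
```

Note that the rule is about expected counts, not observed counts: a cell can contain an observed value below 5 and still be acceptable if its expected value is 5 or more.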
4. Most Common Tests Explained Simply
1. Chi-Square Test — For Categorical Data
The Chi-square test checks whether there is a significant association between two categorical variables. For instance, use it to compare the proportion of complications between diabetic and non-diabetic groups. This is probably the most frequently used inferential test in medical thesis research. Remember: at least 80% of cells should have an expected frequency of 5 or more.
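The calculation behind the test can be sketched for a 2×2 table as follows. This is an illustrative version of the Pearson Chi-square without continuity correction, using only the Python standard library; the function name and the counts are made up, and the p-value shortcut used here is valid only for 1 degree of freedom (a 2×2 table).

```python
import math


def chi_square_2x2(table):
    """Pearson Chi-square statistic and p-value for a 2x2 table (1 df)."""
    (a, b), (c, d) = table
    n = a + b + c + d
    row_totals = (a + b, c + d)
    col_totals = (a + c, b + d)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed - expected) ** 2 / expected
    # For 1 degree of freedom the upper-tail probability is erfc(sqrt(chi2 / 2))
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


# Hypothetical counts: rows = diabetic / non-diabetic,
# columns = complication yes / no
chi2, p = chi_square_2x2([[20, 30], [35, 15]])
print(f"chi-square = {chi2:.2f}, p = {p:.4f}")
```

SPSS also prints a continuity-corrected value for 2×2 tables, so its output will show more than one Chi-square line for the same data.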
2. Independent t-test — Comparing Two Groups
Use the independent t-test when comparing the mean of a continuous variable between two separate groups. For example, comparing mean serum creatinine between hypertensive and normotensive patients. This test assumes normally distributed data — therefore, always run Shapiro-Wilk first in SPSS.
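As a rough illustration of what the test computes, the sketch below calculates the pooled-variance (equal variances assumed) t statistic and its degrees of freedom with the Python standard library. The function name and the creatinine values are hypothetical, and the p-value is left to SPSS or a t table because the standard library has no t distribution.

```python
import math
from statistics import mean, variance


def independent_t(group_a, group_b):
    """Student's t statistic (equal variances assumed) and degrees of freedom."""
    n1, n2 = len(group_a), len(group_b)
    pooled_var = (
        (n1 - 1) * variance(group_a) + (n2 - 1) * variance(group_b)
    ) / (n1 + n2 - 2)
    standard_error = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    t = (mean(group_a) - mean(group_b)) / standard_error
    return t, n1 + n2 - 2


# Hypothetical serum creatinine values (mg/dL)
hypertensive = [1.2, 1.4, 1.1, 1.5, 1.3]
normotensive = [0.9, 1.0, 0.8, 1.1, 0.7]
t, df = independent_t(hypertensive, normotensive)
print(f"t = {t:.2f}, df = {df}")
```

SPSS reports this value in the "Equal variances assumed" row, alongside a second row for unequal variances.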
3. Paired t-test — Before and After Comparison
The paired t-test is ideal for pre-post study designs — the most common design in MSc Nursing and MD intervention studies. If you are measuring blood pressure before and after a drug intervention in the same patients, the paired t-test is your go-to test. If difference scores are not normally distributed, use Wilcoxon signed-rank instead.
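The paired t-test works on the difference scores, which the following sketch makes explicit. The function name and the blood pressure readings are invented for illustration, and only the t statistic and degrees of freedom are computed; the p-value would come from SPSS or a t table.

```python
import math
from statistics import mean, stdev


def paired_t(before, after):
    """Mean difference, t statistic and degrees of freedom for paired data."""
    if len(before) != len(after):
        raise ValueError("before and after must have the same length")
    diffs = [b - a for b, a in zip(before, after)]
    n = len(diffs)
    mean_diff = mean(diffs)
    t = mean_diff / (stdev(diffs) / math.sqrt(n))
    return mean_diff, t, n - 1


# Hypothetical systolic blood pressure (mmHg) in the same six patients
before = [150, 160, 145, 155, 165, 158]
after = [140, 152, 139, 147, 153, 154]
mean_diff, t, df = paired_t(before, after)
print(f"mean difference = {mean_diff}, t = {t:.2f}, df = {df}")
```

Because everything is computed from `diffs`, it is the difference scores, not the raw before and after values, that need to be checked for normality.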
4. One-way ANOVA — Three or More Groups
ANOVA compares the means of three or more independent groups simultaneously. For example, comparing haemoglobin levels across three severity groups of chronic kidney disease. When ANOVA gives a significant result, you need a post-hoc test — Tukey's HSD or Bonferroni — to identify which specific groups differ.
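The F statistic that ANOVA reports is the ratio of between-group to within-group variability. A minimal standard-library sketch, with a hypothetical function name and made-up haemoglobin values, looks like this; the p-value and post-hoc tests are left to SPSS.

```python
from statistics import mean


def one_way_anova(*groups):
    """F statistic with between-group and within-group degrees of freedom."""
    all_values = [x for g in groups for x in g]
    grand_mean = mean(all_values)
    k, n = len(groups), len(all_values)
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    df_between, df_within = k - 1, n - k
    f = (ss_between / df_between) / (ss_within / df_within)
    return f, df_between, df_within


# Hypothetical haemoglobin (g/dL) in three CKD severity groups
mild = [12, 13, 14]
moderate = [10, 11, 12]
severe = [8, 9, 10]
f, df1, df2 = one_way_anova(mild, moderate, severe)
print(f"F({df1}, {df2}) = {f:.2f}")
```

A large F only says that at least one group mean differs; it does not say which one, which is why the post-hoc step is needed.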
5. Pearson's Correlation — Finding Relationships
Pearson's r measures the strength and direction of the linear relationship between two continuous, normally distributed variables. The r value ranges from -1 to +1, and the sign shows the direction. As a rule of thumb, an absolute value above 0.7 indicates a strong relationship, while an absolute value below 0.3 indicates a weak one.
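The formula behind Pearson's r can be sketched in a few lines. The function names and data are illustrative only, and the label "moderate" for values between 0.3 and 0.7 is an assumption added here, since the guide names only the strong and weak bands.

```python
import math
from statistics import mean


def pearson_r(x, y):
    """Pearson's correlation coefficient for two equal-length lists."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def strength(r):
    """Label the size of r, ignoring its sign."""
    size = abs(r)
    if size > 0.7:
        return "strong"
    if size < 0.3:
        return "weak"
    return "moderate"  # assumed label for the middle band


# Hypothetical paired measurements
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
r = pearson_r(x, y)
print(f"r = {r:.2f} ({strength(r)})")
```

Using `abs(r)` matters: an r of -0.8 is just as strong as +0.8, only in the opposite direction.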
6. ROC Curve Analysis — Diagnostic Studies
ROC analysis is essential for studies assessing the diagnostic accuracy of a biomarker or clinical test. It gives you the Area Under the Curve (AUC) and the sensitivity and specificity at each cut-off; PPV and NPV are then calculated at your chosen cut-off and depend on how common the disease is in your sample. An AUC above 0.8 indicates good diagnostic accuracy, and above 0.9 indicates excellent accuracy.
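The measures named above can be sketched from first principles. In this illustration the function names and all numbers are hypothetical, and the AUC is computed as the proportion of diseased/healthy pairs in which the diseased patient has the higher biomarker value (ties count half), which assumes that higher values indicate disease.

```python
def diagnostic_metrics(tp, fp, fn, tn):
    """Sensitivity, specificity, PPV and NPV from a 2x2 table at one cut-off."""
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
    }


def auc(diseased_scores, healthy_scores):
    """AUC as the share of diseased/healthy pairs ranked correctly."""
    wins = 0.0
    for d in diseased_scores:
        for h in healthy_scores:
            if d > h:
                wins += 1
            elif d == h:
                wins += 0.5
    return wins / (len(diseased_scores) * len(healthy_scores))


# Hypothetical counts at a chosen cut-off
print(diagnostic_metrics(tp=45, fp=10, fn=5, tn=40))

# Hypothetical biomarker values
print(auc([5, 6, 7, 8], [3, 4, 5, 6]))
```

The pairwise definition is slow for large samples but shows what the AUC means: the chance that the test ranks a randomly chosen diseased patient above a randomly chosen healthy one.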
Always declare your statistical plan — every test by name — in your Methods section before IEC submission. Your examiner will ask why you chose each test.
5. How to Run Tests in SPSS: Step by Step
SPSS (version 26 or 27) is widely used for statistical analysis in medical colleges. Here is a quick reference for running the most common tests:
Chi-square test: Analyze → Descriptive Statistics → Crosstabs → Select row & column → Statistics → Chi-square → OK
Independent t-test: Analyze → Compare Means → Independent Samples T-test → Test variable → Grouping variable → Define groups → OK
Paired t-test: Analyze → Compare Means → Paired Samples T-test → Move both variables (pre & post) → OK
ROC curve: Analyze → Classify → ROC Curve (in older versions, Analyze → ROC Curve) → Test variable (biomarker) → State variable (disease: 0/1) → Value of state variable: 1 → Display ROC curve → OK → Note AUC & CI
Pearson's correlation: Analyze → Correlate → Bivariate → Move both variables → Pearson → Two-tailed → OK → Check r & p value
Set your significance level (alpha) at 0.05 before running any test, and treat p < 0.05 as statistically significant. For multiple comparisons, consider applying a Bonferroni correction to control the Type I error rate.
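The arithmetic behind the Bonferroni correction is simple enough to sketch. The function names are hypothetical, and the family-wise error formula assumes the tests are independent, which is rarely exactly true in a thesis dataset.

```python
def bonferroni_alpha(alpha, n_tests):
    """Per-test significance threshold after Bonferroni correction."""
    return alpha / n_tests


def familywise_error(alpha, n_tests):
    """Chance of at least one false positive across independent tests."""
    return 1 - (1 - alpha) ** n_tests


# With 20 uncorrected tests at 0.05, a false positive somewhere is likely
print(f"uncorrected family-wise error: {familywise_error(0.05, 20):.2f}")

# Each test must now reach this stricter threshold
print(f"Bonferroni threshold: {bonferroni_alpha(0.05, 20):.4f}")
```

In practice this means comparing each p-value against the stricter threshold instead of 0.05, or equivalently multiplying each p-value by the number of tests.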
6. Common Statistical Mistakes: Avoid These!
- Using t-test without checking normality: Always run Shapiro-Wilk first. If not normal, use Mann-Whitney or Wilcoxon instead.
- Chi-square with small cell frequencies: When more than 20% of cells have an expected count below 5, use Fisher's exact test. SPSS reports the number of such cells in a footnote under the Chi-square table.
- Not reporting effect size: A p-value alone tells you whether a difference is statistically significant, not how large it is. Always report the mean difference with its confidence interval, or an effect size such as Cohen's d.
- Multiple testing without correction: Running 20 tests at p<0.05 means about one false positive is expected by chance, even if no real differences exist. Apply Bonferroni correction for multiple comparisons.
- Confusing correlation with causation: A significant Pearson's r only means two variables move together. It does NOT mean one causes the other.
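For the effect-size point in the list above, Cohen's d for two independent groups can be sketched as the mean difference divided by the pooled standard deviation. The function name and the values are hypothetical.

```python
import math
from statistics import mean, variance


def cohens_d(group_a, group_b):
    """Cohen's d: mean difference in units of the pooled standard deviation."""
    n1, n2 = len(group_a), len(group_b)
    pooled_sd = math.sqrt(
        ((n1 - 1) * variance(group_a) + (n2 - 1) * variance(group_b))
        / (n1 + n2 - 2)
    )
    return (mean(group_a) - mean(group_b)) / pooled_sd


# Hypothetical haemoglobin values (g/dL) in two groups
treated = [10, 12, 14]
control = [8, 10, 12]
print(f"Cohen's d = {cohens_d(treated, control):.2f}")
```

Unlike the p-value, d does not shrink or grow with sample size alone, which is why it is a useful companion figure in the results table.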
7. Quick Reference: Parametric vs Non-Parametric
| Parametric Test | Non-Parametric Alternative | When to Switch |
|---|---|---|
| Independent t-test | Mann-Whitney U | Data not normally distributed |
| Paired t-test | Wilcoxon signed-rank | Difference scores not normal |
| One-way ANOVA | Kruskal-Wallis | Groups not normally distributed |
| Pearson's correlation | Spearman's correlation | Ordinal data or non-normal |
| Chi-square test | Fisher's exact test | Expected count < 5 in more than 20% of cells |
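For the last row of the table, Fisher's exact test for a 2×2 table can be sketched with the standard library. This is an illustrative implementation of the usual two-sided version, which sums the probabilities of all tables with the same margins that are no more likely than the observed one. The function name and counts are hypothetical.

```python
from math import comb


def fisher_exact_two_sided(table):
    """Two-sided Fisher's exact p-value for a 2x2 table."""
    (a, b), (c, d) = table
    row1, row2, col1 = a + b, c + d, a + c

    def weight(x):
        # Number of ways to get x in the top-left cell with margins fixed
        return comb(row1, x) * comb(row2, col1 - x)

    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    observed = weight(a)
    # Integer weights avoid floating-point ties when comparing probabilities
    extreme = sum(
        weight(x) for x in range(lowest, highest + 1) if weight(x) <= observed
    )
    return extreme / comb(row1 + row2, col1)


# Hypothetical small study: rows = exposed / unexposed,
# columns = outcome yes / no
p = fisher_exact_two_sided([[8, 2], [1, 5]])
print(f"p = {p:.4f}")
```

Because the test enumerates every possible table, it stays valid however small the expected counts are, which is exactly the situation where Chi-square becomes unreliable.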
❓ Frequently Asked Questions
Quick answers to common questions about statistical analysis for a medical thesis
Which software should I use for my thesis statistics?
SPSS (version 26 or 27) is widely accepted in medical colleges. R is an excellent free alternative, and Stata is a capable commercial one. OpenEpi is a free web-based tool that works well for basic tests and sample size calculations.
What is the difference between parametric and non-parametric tests?
Parametric tests assume your data follow a normal distribution and are more powerful when this holds. Non-parametric tests make fewer assumptions about the distribution and are safer when normality cannot be confirmed. Always check normality using Shapiro-Wilk before deciding.
When should I use Fisher's exact test instead of Chi-square?
Use Fisher's exact test when more than 20% of the cells in your contingency table have an expected frequency below 5, or when your total sample size is less than 20. SPSS prints a footnote under the Chi-square table showing how many cells have an expected count below 5, and it reports Fisher's exact test for 2×2 tables.
Do I need to declare my statistical tests in the synopsis?
Yes. Your statistical analysis plan must be declared in the Methods section of your synopsis before IEC submission. List every test you plan to use by name and explain why it is appropriate for your data type and study design.
What does p < 0.05 mean?
A p-value below 0.05 means that, if there were truly no difference or association, a result at least as extreme as yours would be expected less than 5% of the time. This is the conventional threshold for statistical significance in medical research. However, statistical significance does not always mean clinical significance.