
You’re making great progress on your research idea! You have a research question and hypothesis, and you’ve decided on your research team, study design/methodology, and research ethics implications. Now you’re ready to plan your data analyses, which is the best part! Okay, planning the stats isn’t the best part, but actually *doing* the statistics will be, because that’s where you get the *answer* you’ve been working towards. So this is an exciting part of the process, even if it might appear intimidating (particularly if you’re navigating it by yourself). If your primary analysis will involve text-based responses (e.g., looking for themes), that’s something that Generative AI can help you with! Be sure to log in to Copilot using your DC email before you upload any of this text, though, because you don’t want your data to go outside of DC.
As a reminder, you don’t need any special statistical software to do a lot of the basic statistics you’re likely to need: you can do many things in Excel if you just enable the Analysis ToolPak add-in. However, if you don’t love Excel and/or think your analyses will go beyond what Excel can do, then I highly recommend that you download the free program JASP, as it is quite user-friendly. Because it is popular, there are also many free resources (either from JASP or from statistics professors) for you to tap into if you get stuck. Specifically, you might be interested in this how-to JASP resource, which draws its data and examples from diversity research.
It’s important to select the right statistical test for your data. One way to get this answer is to use a decision tree like this one (there is also a video version at that same url). You’ll also want to know which measure of effect size is appropriate for your analysis, because reporting an effect size tells readers how large (and therefore how practically “important”) any significant differences you find actually are.
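To make the effect size idea concrete, here is a minimal sketch of one common measure, Cohen’s d, for comparing two independent groups. It is written in Python only for illustration (JASP can report effect sizes for you), and the function name `cohens_d` and the scores are made up for the example rather than taken from any real study.

```python
from math import sqrt
from statistics import mean, variance


def cohens_d(group_a, group_b):
    """Cohen's d for two independent groups, using the pooled standard deviation."""
    n_a, n_b = len(group_a), len(group_b)
    # statistics.variance returns the sample variance (n - 1 in the denominator).
    pooled_variance = (
        (n_a - 1) * variance(group_a) + (n_b - 1) * variance(group_b)
    ) / (n_a + n_b - 2)
    return (mean(group_a) - mean(group_b)) / sqrt(pooled_variance)


# Hypothetical test scores, invented purely for illustration.
workshop = [78, 85, 90, 72, 88]
no_workshop = [70, 75, 80, 68, 77]

d = cohens_d(workshop, no_workshop)
print(f"Cohen's d = {d:.2f}")
```

By a widely used rule of thumb, a d of around 0.2 is considered small, 0.5 medium, and 0.8 large, though what counts as meaningful depends on your field. Other analyses call for other measures, so treat this as one example rather than the default.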
Believe it or not, there are some ethical considerations related to statistical analyses. For example, you should be intentional about how many statistical tests/comparisons you run (p-hacking is an ethical problem). Additionally, once you collect your data, you’ll want to clean it by removing outliers and checking whether it’s normally distributed (if it isn’t, you can transform it or use a non-parametric test). Keep track of all of these decisions. Most of the time, the specific decision you make is less important than your transparency about it when you eventually publish your research. I’m happy to help with this! 😊
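If you do end up running several comparisons, one simple way to stay honest about it is to adjust your p-values for the number of tests. Below is a minimal sketch of the Bonferroni correction, which multiplies each p-value by the number of tests and caps the result at 1. This is only one of several correction methods, and the function name `bonferroni_adjust`, the p-values, and the 0.05 cutoff are all assumptions made for the example.

```python
def bonferroni_adjust(p_values):
    """Multiply each p-value by the number of tests, capping the result at 1."""
    number_of_tests = len(p_values)
    return [min(1.0, p * number_of_tests) for p in p_values]


# Hypothetical p-values from three planned comparisons (not real results).
p_values = [0.010, 0.020, 0.040]
alpha = 0.05  # assumed significance threshold

adjusted = bonferroni_adjust(p_values)
significant = [p < alpha for p in adjusted]

for raw, adj, sig in zip(p_values, adjusted, significant):
    print(f"raw p = {raw:.3f}, adjusted p = {adj:.3f}, significant: {sig}")
```

In this made-up example, all three raw p-values fall below 0.05, but only the first one survives the adjustment. Whichever correction you choose, note it down along with your other analysis decisions.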
You can also see my previous post about statistics.