Using tools to take your data from unruly to structured will normally take 70-80% of your time. So don’t be disheartened when you feel like you’ve been staring at the same spreadsheet error for far too long. Think about it like a delicious meal…
Analysis is the same: when you’re working with the data, you’re growing the right ingredients, putting them in the right place and preparing them so that everything is ready for you to do the analysis effectively and enjoyably. Now that I’ve taken that analogy too far, let’s look at the tools you’ll need. I’ve placed the tools along the DMLE spectrum to add some context on what you should be thinking about learning and when you should think about learning it.
There is an abundance of tools and I’ve only captured some of the bigger names. The idea is that once you learn one tool within a sub-category, the other tools in that sub-category become easier to pick up.
I’m going to try to point you to the best resources I’ve found to help you with these skills. With the growth of MOOCs and online learning, there are a few high-quality and mostly free resources for learning these tools. I’ve called out some of my favourites for each below:
When it comes to spreadsheets, I am a big fan of Google Sheets. It’s free, easy to use and, in my opinion, has a better user experience than Microsoft Excel. However, Excel is the more powerful option, so if you’re working with a really large dataset (over roughly 200,000 rows) then I would suggest Excel (if you aren’t interested in Python).
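If you are curious what the Python route looks like, here is a minimal sketch that counts the rows in a CSV and applies the rough 200,000-row rule of thumb from above. It only uses Python's standard library. The function names, the threshold constant and the tiny sample dataset are all made up for illustration, and the threshold is a rough guide rather than a hard limit of any tool.

```python
import csv
import io

# Rule-of-thumb threshold from the paragraph above (approximate, not a hard limit).
LARGE_DATASET_ROWS = 200_000


def count_data_rows(csv_file):
    """Count the rows in an open CSV file, excluding the header row."""
    reader = csv.reader(csv_file)
    next(reader, None)  # skip the header row, if there is one
    return sum(1 for _ in reader)


def suggest_tool(row_count, threshold=LARGE_DATASET_ROWS):
    """Return a spreadsheet suggestion based on the row count."""
    if row_count > threshold:
        return "Excel (or Python)"
    return "Google Sheets"


# Tiny made-up dataset so the example runs on its own.
sample = io.StringIO("city,sales\nLondon,120\nLagos,95\nLima,80\n")
rows = count_data_rows(sample)
print(rows, suggest_tool(rows))
```

To try it on your own data, you could swap the sample for a real file, for example `open("my_data.csv", newline="")`, where `my_data.csv` is a placeholder name for whatever file you have to hand.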
I don’t have a preference on data visualisation tools. I enjoy using most of them, although the free offerings from Tableau and Power BI are particularly enticing. If your company uses a specific tool then definitely start there; otherwise, cutting your teeth on either Tableau or Power BI will give you the basic understanding to then pick up most other data viz tools. Again, I would recommend picking a random dataset and playing around as you learn!
If you’ve arrived at this stage of the blog, you should first be deciding between Python and R. Back in the day when I had to make the same decision, I chose Python. Why? In short, Python has a gentler learning curve at the beginning and it was easier for me to get my hands dirty quickly. I also believe Python is more prevalent across most organisations and tooling systems, which also made it an easier choice for me.