No Bullshit Guide to Statistics progress update

Over the years several readers have suggested (sometimes demanded!) that I write a book on statistics. Indeed, since the company’s mission is to make the most useful parts of math accessible to the people, it makes sense to pursue statistics as the next title. Statistics is some of the most useful math out there! The 21st century is going to be all about data, so it makes sense to learn about the concepts and tools you need to analyze data, discover patterns, and make decisions.

I’ve now been working on the No Bullshit Guide to Statistics for three years, so I figured it’s about time for an update to let y’all know how it’s going. My goals with this blog post are to share the detailed book outline and chapter previews, and to ask for your help validating certain assumptions about readers’ background (math and programming skills) and their motivation to learn statistics. Please jump to the short survey before continuing with the rest of the blog post. It won’t take longer than 2 mins.

Fixing the introductory statistics curriculum

Let’s talk about the problems with the teaching of statistics. Understanding statistics is essential for many fields of academic research, and also useful in industry. Why is it that first-year statistics courses suck so bad? It seems that students’ conceptual understanding of statistical ideas only marginally improves after taking a STATS 101 course. Is this because statistics is a really difficult subject to teach, or are we teaching it wrong?

I’ve been looking into this question for the last three years and I finally have a plan for how we can improve things. I’ll start with a summary of the statistics curriculum: the set of topics students are supposed to learn in STATS 101. I’ll list all the topics of the “classical” curriculum based on analytical approximations like the t-test. This is the approach currently taught in most high schools and universities around the world.

The “classical” curriculum has a number of problems. The main problem is that it’s based on difficult-to-understand concepts, and these concepts are often presented as procedures to follow without understanding the details. The classical curriculum is also very narrow, since it covers only the slim subset of statistical analyses that can be reduced to math formulas and used blindly by plugging in the numbers. At the end of the introductory stats course, students know a few “recipes” for statistical analysis they can apply if they ever run into one of the few scenarios where the recipe can be used (comparison of two proportions, comparison of two means, etc.). That’s nice, but in practice this leaves learners totally unprepared to solve any stats problem that doesn’t fit the memorized templates, which is most of the problems they will need to solve in their day-to-day life.

The current statistics curriculum is simply outdated: it was developed at a time when the only computational tools available were simple algebraic formulas for computing test statistics and lookup tables for finding p-values. The focus on formulas and analytical approximations in the classical curriculum also limits learners’ development of adjacent skills like programming and data management. Clearly there is room for improvement here. We can’t let the next generation of scientists, engineers, and business folks grow up without basic data literacy.

Something must be done.