Python coding skills for statistics

Learning statistics is much easier when you have a computational platform for doing statistics calculations and visualizations. You can do basic stats calculations using pen and paper for small datasets, but you’ll need a computer to help you with larger datasets. Common computational platforms for doing statistics include JASP, jamovi, SPSS, R, and Python, among many others. You can even do statistics calculations using spreadsheet software like Excel, LibreOffice Calc, or Google Sheets. I believe Python is the best computational platform for learning statistics. Specifically, an interactive notebook environment like JupyterLab provides best-in-class tools for data visualizations and probability calculations.

But what about learners who are not familiar with Python? Should we abandon non-tech learners and say they can’t learn statistics because they don’t know how to use Python? Naaaah, we ain’t having none of that! Instead, my plan is to bring non-technical learners up to speed by teaching them the Python basics they need for statistics. Anyone can learn Python; it’s really not a big deal. I hope to convince you of this fact in this blog post, which is intended as a Python crash course for the absolute beginner.
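To give a rough idea of the kind of "Python basics for statistics" in question, here is a minimal sketch that computes a few descriptive statistics using only Python's built-in `statistics` module. The `scores` list is made-up data for illustration and is not taken from the post.

```python
import statistics

# Hypothetical sample data, invented purely for illustration
scores = [61, 72, 75, 80, 82, 88, 95]

mean = statistics.mean(scores)      # arithmetic average
median = statistics.median(scores)  # middle value of the sorted data
stdev = statistics.stdev(scores)    # sample standard deviation (n - 1 denominator)

print("mean:", mean)
print("median:", median)
print("standard deviation:", round(stdev, 2))
```

The same three lines of calculation work unchanged whether the list has seven values or seven thousand, which is the main practical advantage over pen and paper.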

Continue reading “Python coding skills for statistics”

What stats do people want to learn?

If I’ve learned anything about the startup world, it is that you have to listen to your customers, which in my case are the readers of the No Bullshit Guide textbooks. With this principle in mind, I sent out a survey to readers interested in the upcoming statistics book, requesting feedback on its general direction, based on the stats curriculum and book proposal blog posts, the concept map, and the detailed book outline.

I’ll summarize the results of the survey below (140+ respondents) and comment on some of the readers’ suggestions and advice. The survey is still open in case you want to add your feedback, or feel free to send me an email directly. My email is ivan at this domain.

Continue reading “What stats do people want to learn?”

No Bullshit Guide to Statistics progress update

Over the years several readers have suggested (sometimes demanded!) that I write a book on statistics. Indeed, since the company’s mission is to make the most useful parts of math accessible to the people, it makes sense to pursue statistics as the next title. Statistics is some of the most useful math out there! The 21st century is going to be all about data, so it makes sense to learn about the concepts and tools you need to analyze data, discover patterns, and make decisions.

I’ve now been working on the No Bullshit Guide to Statistics for three years, so I figured it’s about time for an update to let y’all know how it’s going. My goals with this blog post are to share with you the detailed book outline and chapter previews, and to ask for your help validating certain assumptions about the readers’ background (math and programming skills) and their motivation to learn statistics. Please jump to the short survey before continuing with the rest of the blog post. It won’t take longer than 2 mins.

Continue reading “No Bullshit Guide to Statistics progress update”

Fixing the introductory statistics curriculum

Let’s talk about the problems with the teaching of statistics. Understanding statistics is essential for many fields of academic research, and also useful in industry. Why is it that first-year statistics courses suck so bad? It seems that students’ conceptual understanding of statistics ideas improves only marginally after taking a STATS 101 course. Is this because statistics is a really difficult subject to teach, or are we teaching it wrong?

I’ve been looking into this question for the last three years and I finally have a plan for how we can improve things. I’ll start with a summary of the statistics curriculum, the set of topics students are supposed to learn in STATS 101. I’ll list all the topics of the “classical” curriculum based on analytical approximations like the t-test. This is the approach currently taught in most high schools and universities around the world.

The “classical” curriculum has a number of problems. The main problem is that it’s based on difficult-to-understand concepts, and these concepts are often presented as procedures to follow without understanding the details. The classical curriculum is also very narrow, since it covers only the slim subset of statistical analyses that can be described as math formulas and used blindly by plugging in the numbers. At the end of the introductory stats course, students know a few “recipes” for statistical analysis they can apply if they ever run into one of the few scenarios where the recipe can be used (comparison of two proportions, comparison of two means, etc.). That’s nice, but in practice this leaves learners totally unprepared to solve all the stats problems that don’t fit the memorized templates, which is most of the problems they will need to solve in their day-to-day life. The current statistics curriculum is simply outdated: it was developed in times when the only computational tools available were simple algebraic formulas for computing test statistics and lookup tables for finding p-values. The focus on formulas and analytical approximations in the classical curriculum also limits learners’ development of adjacent skills like programming and data management. Clearly there is room for improvement here. We can’t let the next generation of scientists, engineers, and business folks grow up without basic data literacy.
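To make the “recipe” idea concrete, here is a minimal sketch of one such formula-based procedure: the test statistic for a comparison of two means. It assumes Welch’s version of the two-sample t-test (the text above does not say which variant is meant), and the function name `welch_t` and both data lists are hypothetical, invented for illustration.

```python
import math
import statistics

def welch_t(xs, ys):
    """Return (t, df) for Welch's two-sample t-test comparing two means."""
    n1, n2 = len(xs), len(ys)
    m1, m2 = statistics.mean(xs), statistics.mean(ys)
    v1, v2 = statistics.variance(xs), statistics.variance(ys)  # sample variances
    se2 = v1 / n1 + v2 / n2            # squared standard error of the difference
    t = (m1 - m2) / math.sqrt(se2)     # the "plug in the numbers" test statistic
    # Welch-Satterthwaite approximation for the degrees of freedom
    df = se2**2 / ((v1 / n1)**2 / (n1 - 1) + (v2 / n2)**2 / (n2 - 1))
    return t, df

# Hypothetical measurements for two groups (made-up numbers)
group_a = [5.1, 4.9, 5.6, 5.8, 6.0, 5.3]
group_b = [4.4, 4.8, 5.0, 4.6, 5.2, 4.7]

t, df = welch_t(group_a, group_b)
print("t statistic:", round(t, 3))
print("degrees of freedom:", round(df, 2))
```

In the classical workflow, the last step is to look up the p-value for this `t` and `df` in a printed table. If SciPy is installed, `scipy.stats.ttest_ind(group_a, group_b, equal_var=False)` runs the same Welch test and reports the p-value directly.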

Something must be done.

Continue reading “Fixing the introductory statistics curriculum”