The state of open educational resources in 2017

I spent the last couple of weeks exploring the open educational resources (OER) landscape and wanted to write down my thoughts and observations about the field. The promise of an OER “revolution” that will put quality learning material into the hands of every student has been around for several decades, but we have yet to see OER displace the established publishers. Why hasn’t “open content” taken off more, and what can we do to make things happen in the coming decade?

It’s easy to become over-optimistic about the possibilities afforded by new technologies in the educational space. It’s important to keep in mind the tremendous friction associated with adopting new tech and new ways of teaching, and with breaking into established educational practices. Let’s “aim low” and set realistic targets that can be achieved even in the face of resistance from current stakeholders in the system. What is the common core of useful tools and practices that everyone can agree on, and that we can achieve in the next three to five years?

 

Why?

Improving access to education is a win-win for everyone. In fact, education is so important that it doesn’t make sense to have commercial interests mixed in with the production of educational content. Whenever possible, students, teachers, and schools need to be in active control of the content, rather than being passive consumers.

The ability to continuously update and improve the content will lead to higher-quality content in the long term. That is, of course, assuming the barriers to collaboration are low enough for productive collaboration to occur.

 

What?

There are two distinct categories of educational materials we can identify.

  • Content: text, exercises, images, videos, audio files, simulations, interactive demonstrations, links
  • Structure: curriculum, course plan, lecture plan, prerequisite structure, and topic guides

Existing OER efforts [1,2,3] are predominantly focused on the production of content items, but there are also recent efforts that provide entire curricula, including course structure, lesson plans, exercises, etc. The Common Core State Standards for math provide an overall structure so that content from different sources can be used interchangeably. In general, though, it’s not clear that content from different OER sources can be easily remixed and combined.

 

Who?

Let’s review the main players in the “education space” and point out the roles they play.

  • School board administrators oversee the operations of the “education machine” and are most interested in quantifiable results and metrics. They are interested in new technology and resources that will improve educational outcomes, but tend to be very slow to act because of the red tape involved in large administrations.
  • School administrators operate under a lot of financial pressure, with school board administrators breathing down their necks. They must also balance parents’ demands, keep their teachers happy, and make sure “operations” continue smoothly.
  • Teachers are the foot soldiers. They are faced with the daily tasks of keeping students learning, policing the classroom, and surviving on a limited teacher salary. They are usually not very technologically savvy, and won’t instantly adopt new tools unless they can clearly see the benefits.
  • Students are the patients in the system. The whole point of the system is to support students’ learning process, but they are given very little choice and have very little control over their learning.
  • Parents often want to get involved in their kids’ education. Some parents have the ability to help their kids, others less so. Some parents are willing to get involved in school projects and contribute their time.
  • Content producers like book authors, editors, and publishers are responsible for getting “educational products” shipped, from textbooks to exams to exam-prep courses. The old players in the game (textbook publishers, educational content providers, exam sellers) are large organizations that make their profits through large deals with governments and school boards. The actual content authors are not necessarily concerned with the quality of the content they produce, and are often far removed from classrooms.
  • Technologists working in EdTech startups hope that bringing technology into classrooms will improve learning outcomes. The profit motive leads to a lot of technological innovation, but selling products and content to schools is notoriously hard so very few EdTech startups have made it big.
  • Governments usually want what is best for the population, but they are so far removed from the action that they are unlikely to know the best direction to move in. They are also likely to be lobbied by special interest groups.
  • Private donors often fund educational initiatives and non-profits. These people have their heart in the right place and want to encourage new efforts to improve the status quo. The ad-hoc nature of their efforts can often lead to duplication of effort.

Observe that the “main actors” in the educational system—students and teachers—are only a small part of this entire pipeline, and have very little say about how things are organized. The need to standardize the educational system has taken power away from the teacher, and the need for one-size-fits-all education removes students’ joy of learning. This sucks.

 

How can we fix this?

Who will build the educational system of the future? What will it look like? How can we put teachers in control of the educational content? How can we make students take ownership of their learning? These are important questions whose answers are far from simple or obvious. The best I can do at this point is point out some high-level themes and directions that I believe could lead to useful results.

  • Better textbooks. I’m totally biased about this topic, but I think learning any subject benefits tremendously from following a structured sequence of lessons—something like a book. I’m talking about printed books, but it’s possible the book will evolve to new formats in the future. Regardless of the medium, the key benefit of the book-like “main text” is that it leads the student along a path, like a story, with a beginning, a middle, and an end.
  • Better support materials. Video lessons, exercises, images, demonstrations, and other “bonus material” are excellent for reinforcing the material students are learning. It’s then up to the student to pursue the subject to any depth they choose.
  • Project-based learning. Instead of being passive “receivers” of information, students can work on personal projects related to the subjects they’re supposed to be learning. Such projects will help them become autonomous learners.
  • Better authoring and collaboration tools. If we want the vision of quality OER produced by the community to become a reality, we need to build the best tools possible for authors to collaborate. Teachers must be able to submit typos and fixes. The best place to store educational content is GitHub: the pull-request model is a proven technology. Perhaps a domain-specific layer on top of GitHub could make things simpler for authors?
  • Incentives for authors. This is the hard one. Why would a teacher or author dedicate hours of their busy life to create or improve OER content? The situation is different from software, where people make contributions to fix bugs and solve their own problems. It’s comparatively more work for someone who wants to use a book to customize or edit it. And if they edit it, what are the incentives to contribute back? And for the main author who receives contributions, why would they incorporate them?
  • Free software that runs on school premises through which students can access educational content. If the “OER platform” is sufficiently easy to use, then school administrators will have to put in only one new system, which will act as a conduit for all OER and other resources. If a single “school server” system handles authentication, authorization, and student data, then this will greatly simplify things. The fact that it’s free should make it very competitive with commercial offerings.
  • Converter between different OER formats. It’s clear that every OER repository out there has useful content, but how can we use a mix of content from different sources? We need something like pandoc for OER (see the sketch below).
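To make the last point concrete, here is a minimal sketch of what an “OER pandoc” wrapper could look like, written as a small Python script that shells out to the pandoc command-line tool. The function name, the format names, and the file layout are illustrative assumptions, not an existing standard.

import subprocess

def convert_lesson(src_path, dst_path, src_fmt="markdown", dst_fmt="latex"):
    """Convert a single lesson file from src_fmt to dst_fmt using pandoc."""
    subprocess.check_call([
        "pandoc",
        "--from", src_fmt,
        "--to", dst_fmt,
        src_path,
        "--output", dst_path,
    ])

# example: convert a markdown lesson to LaTeX so it can be included in a book
convert_lesson("lesson01.md", "lesson01.tex")

A real converter would also need to map metadata (licenses, prerequisite links, exercise answers) between repositories, which is where most of the hard work would lie.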

I hope that the open source model will prevail and OER will be adopted as the predominant source of educational material in the future. I think putting the content authoring and editing tools in the hands of teachers, parents, and students is the important focus point that will enable all the rest. Paraphrasing the lyrics of Rage Against The Machine, it’s not about books, but about the machines for producing them.

 

Obstacles to OER adoption

The main reasons why OER are not more widely used are as follows:

  • Content quality. Many people think OER are of inferior quality compared to commercial materials. Certainly OER have “high variance,” with some content being good and some being bad. In general, whenever OER content reproduces the pedagogical approach of mainstream textbooks (teaching procedures instead of explaining, and focusing on repetition), the OER will be just as low quality as the mainstream material.
  • There is a perception of insufficient resources. It’s possible to find 70% of what you want for your class, but be missing the other 30%, which makes it impossible to use the material.
  • The resources are difficult to find. There are OER search engines and many repositories, but it remains true that every teacher who wishes to adopt OER for their course must invest lots of time to find resources. This could be improved with ready-made course packs and curated lists, but that will require lots of work and constant updates.
  • Switching costs. The OER model is new so teachers and administrators are not used to it.
    • Lack of “vendor” to buy from, and ask for support.
    • How does CC licensing work?
    • Will there be updates and corrections?
  • There is no common format to allow mixing OER from different sources.

 

Next steps

I’ve come to the realization that getting quality OER into the hands of kids around the world is a much bigger task than I previously thought. It’s not like the people who worked in the field before were stupid and didn’t use the right approach. It’s just really hard! The obstacles are not only technological, but also social and psychological. Is multi-author collaboration on textbooks even possible? Every author has their own voice and approach, so it’s not guaranteed their “voices” can merge without an outstanding amount of coordination.

When faced with big challenges, the best strategy is to “bite off” small pieces. For the coming weeks, I’ll focus on the “typo fixes” workflow and write a prototype for anonymous readers to contribute fixes to an existing text. Let’s see if it is possible to hide the complexity of git while keeping the power of GitHub pull requests. Easy does it. One step at a time.

 

References

[1] http://www.ck12.org/

[2] https://openstax.org/

[3] https://www.khanacademy.org/

 

Git for authors

Version control is very useful for storing text documents like papers and books. It’s amazing how easy it is to track changes to documents and communicate these changes with other authors. In my career as a researcher, I’ve had the chance to introduce many colleagues to Mercurial and git for storing paper manuscripts. Also, when working on my math books, I’ve had the fortune to work with an editor who understands version control and performed her edits directly in the books’ source repo. This blog post is a brainstorming session on what a git user interface specific to authors’ needs could look like.

The other day I was onboarding a new author and had a chance to explain to him the basics of git, and I realized how complicated the action verbs are. To save your work, you need to put files in the staging area using git add <filename>, commit the change to the local repo, then push the changes to the remote repo. These commands, and the corresponding commands for pulling changes from the remote repo to your local one and updating your working directory from the local repo, are very logical once you get used to them, and represent necessary complexity. The diagram below illustrates the different git verbs newcomers to git need to get used to.

git verbs explained

(Credit: Kieran Healy’s excellent guide to git)

 

So what would git for authors look like?

It’s my non-expert opinion that this is too much complexity for the average non-technical person. Imagine a teacher who wants to use an OER textbook with her students, and in the process of producing the document for her class she finds some typos, which she wants to contribute back to the OER textbook project. Let’s do a thought experiment and imagine a humane interface that would make sense for this task. To make the thought experiment more concrete, we’ll personify the teacher as Jane, a university professor who is in charge of a first-year physics class.

We’ll assume GitHub is used as the storage backend, but most of the authors’ OER browsing and collaboration happens on a different site (say ezOER.com) whose users are authors, teachers, students, and parents. Suppose the OER book that Jane wants to use is College Physics by OpenStax, and this book is available in “source” format from the GitHub repo openstax/physics, which we’ll refer to as upstream below. Given this preexisting setup, here are the steps the teacher would use:

  1. Log in to ezOER.com
  2. Copy openstax/physics to janesmith/physicsbook (note we don’t say “fork” because that word has a different connotation as to the permanence and authority of the repo)
  3. Clone janesmith/physicsbook to her ~/Documents/School/Textbooks/OpenStaxPhysics
  4. Follow the instructions for “building” the book locally (e.g., running pdflatex three times)
  5. Perform customizations like:
    1. Change cover page
    2. Remove chapters she doesn’t plan on covering in her class
    3. Add a custom preface with references specific to her class
    4. Choose values for “configuration variables” like font size, paper size, etc.
  6. Generate the custom book for her class (PDF for print, PDF for screen, .epub, and .mobi); a minimal build-script sketch is shown below
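The customization and build steps could be automated with a small script. Here is a minimal sketch, assuming a hypothetical config.tex file of “configuration variables” that the main book.tex includes, and pandoc as one possible route to the eBook formats; the file names and variable names are made up for illustration.

import subprocess

# configuration variables chosen by the teacher (names are hypothetical)
CONFIG = {
    "bookfontsize": "11pt",
    "bookpapersize": "letterpaper",
}

def write_config(path="config.tex"):
    # write the variables as LaTeX macros that book.tex can use
    with open(path, "w") as f:
        for key, value in CONFIG.items():
            f.write("\\newcommand{\\%s}{%s}\n" % (key, value))

def build():
    write_config()
    # the print and screen PDFs
    subprocess.check_call(["pdflatex", "book.tex"])
    # an .epub via pandoc; .mobi could then be produced by an ebook converter
    subprocess.check_call(["pandoc", "book.tex", "-o", "book.epub"])

build()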

At this point, she can distribute the eBooks to her students using her school’s LMS “file uploads” feature and set up the print PDF for print-on-demand using lulu.com, so students will be able to order the book in print. Her students will benefit from a world-class textbook for $20-30 when printed as a two-volume softcover, black-and-white book. No payment or further engagement with ezOER.com would be required.

If she doesn’t like her school’s LMS, she could “host” her custom book on ezOER.com. These are the steps she would take to publish her changes to her public-copy repository ezOER.com/janesmith/physicsbook:

  1. git save: combines the effect of git add and git commit using a two-prompt wizard
  2. git publish

She could then share links to the “build” directory of ezOER.com/janesmith/physicsbook.

Now suppose that halfway through the course she finds some typos in Chapter 2 of the book, which she wants to correct, and furthermore she wants to share her corrections with the “upstream” copy of the textbook. (Bear with me on this scenario; we’ll have to think more about good incentives for sharing your corrections with others, but for the purpose of this thought experiment let’s assume Jane is feeling altruistic today.) These are the commands she’ll have to use to “suggest edits” to the upstream authors who manage openstax/physics:

  1. Make the corrections in her working directory
  2. git save
  3. git publish (to her copy)
  4. git suggestedits, which pops up a wizard asking her to give a short label for her edit suggestions and to pick the commits that should be part of the “suggested edit” (a pull request behind the scenes). The suggestedits command would perform the following steps behind the scenes (a sketch of the last step is shown after this list):
    git checkout -b typoFixesChapter2
    git rebase -i   (choosing only the correction commits, and not the customization commits)
    open a GitHub pull request
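For the “open a pull request” step, here is a minimal sketch using the GitHub API via the requests library. The repo names, the branch name, and the token handling are taken from the hypothetical scenario above; a real implementation would live on the ezOER.com side and handle errors and authentication properly.

import requests

def open_pull_request(token, title, body):
    # create a pull request on the upstream repo (hypothetical names from the example)
    resp = requests.post(
        "https://api.github.com/repos/openstax/physics/pulls",
        headers={"Authorization": "token " + token},
        json={
            "title": title,                         # e.g. "Typo fixes for Chapter 2"
            "body": body,
            "head": "janesmith:typoFixesChapter2",  # the hidden branch on Jane's copy
            "base": "master",
        },
    )
    resp.raise_for_status()
    return resp.json()["html_url"]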

To keep things simple, Jane will never be shown the typoFixesChapter2 branch; the rest of the workflow will be done entirely through the ezOER.com web interface. For example, if the upstream maintainers want her to change something in her “suggested edits” (pull request), she’ll make these changes through the web interface rather than edit the branch typoFixesChapter2 and push again. For all intents and purposes, Jane is always working on the master branch of her copy of the book.

I think the new verbs save, publish, and suggestedits would be easier to use and would correspond more closely to authors’ needs.
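To show how thin such a layer could be, here is a rough Python 3 sketch of save and publish as wrappers around git. The two prompts, the remote name, and the “commit everything” behaviour are guesses at what the wizard might do, not a finished design.

import subprocess

def git_save():
    # prompt 1: show the author what changed since the last save
    subprocess.check_call(["git", "status", "--short"])
    if input("Save all of the above changes? [y/n] ").lower() != "y":
        return
    # prompt 2: ask for a one-line description, then add + commit
    message = input("Describe your changes in one line: ")
    subprocess.check_call(["git", "add", "--all"])
    subprocess.check_call(["git", "commit", "-m", message])

def git_publish():
    # push the saved changes to the author's public copy of the book
    subprocess.check_call(["git", "push", "origin", "master"])

git_save()
git_publish()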

More power tools for authors

Assuming the source format is text-based, git’s basic diff functionality will prove useful for “watching” changes made to large collections of text. If the source is LaTeX documents, ezOER could run latexdiff to generate diff documents showing “rendered” differences between revisions, also known as red-blue diffs.

The build process could be automated using a generic continuous integration server. A script could run after each commit to regenerate the book in various PDF and eBook formats, and also generate diffs. We could even have “language check” scripts that act like linters for text.
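As a rough sketch of what the per-commit script might look like, assuming a single main source file and the pdflatex and latexdiff command-line tools (the file names and the number of pdflatex passes are illustrative):

import subprocess

BOOK = "book.tex"   # hypothetical main source file

def build_pdf(tex_file):
    # pdflatex typically needs multiple passes to resolve references
    for _ in range(3):
        subprocess.check_call(["pdflatex", "-interaction=nonstopmode", tex_file])

def build_diff(tex_file):
    # extract the previous revision of the source from git
    with open("previous.tex", "w") as prev:
        subprocess.check_call(["git", "show", "HEAD~1:" + tex_file], stdout=prev)
    # latexdiff marks deletions and additions (the red-blue diff)
    with open("diff.tex", "w") as out:
        subprocess.check_call(["latexdiff", "previous.tex", tex_file], stdout=out)
    build_pdf("diff.tex")

build_pdf(BOOK)
build_diff(BOOK)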


 

I’ve thought about this previously, but now the “authoring workflow” is becoming clearer. I need something like this for managing Minireference Co.’s (closed-source) content, but I plan to build all the tooling as open source. I would love to hear your feedback about this idea in the comments below.

Linear algebra concept maps

I spent the last week drawing. More specifically, drawing in concept space. Drawing concept maps for the linear algebra book.

Without going into too much detail, the context is that the old concept map was too overloaded with information, so I decided to redo it. I had to split the concept map across three pages, because there’s a lot of stuff to cover. Check it out.

Math basics and how they relate to geometric and computational aspects of linear algebra

The skills from high school math you need to “import” to your study of linear algebra are geometry, functions, and the tricks for solving systems of equations (e.g., the values $x$ and $y$ that simultaneously satisfy the equations $x+y=3$ and $3x+y=5$ are $x=1$ and $y=2$).

The first thing you’ll learn in linear algebra is the Gauss–Jordan elimination procedure, which is a systematic approach for solving systems of $n$ equations with $n$ unknowns. You’ll also learn how to compute matrix products, matrix determinants, and matrix inverses. This is all part of Chapter 3 in the book.
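To make the procedure concrete, here is Gauss–Jordan elimination applied to the small system mentioned above, written as an augmented matrix:

$\left[\begin{array}{cc|c} 1 & 1 & 3 \\ 3 & 1 & 5 \end{array}\right]
\ \xrightarrow{R_2 \to R_2 - 3R_1}\
\left[\begin{array}{cc|c} 1 & 1 & 3 \\ 0 & -2 & -4 \end{array}\right]
\ \xrightarrow{R_2 \to -\frac{1}{2}R_2}\
\left[\begin{array}{cc|c} 1 & 1 & 3 \\ 0 & 1 & 2 \end{array}\right]
\ \xrightarrow{R_1 \to R_1 - R_2}\
\left[\begin{array}{cc|c} 1 & 0 & 1 \\ 0 & 1 & 2 \end{array}\right],$

from which we read off the solution $x=1$, $y=2$.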

In Chapter 4, we’ll learn about vector spaces and subspaces. Specifically, we’ll discuss points in $\mathbb{R}^3$, lines in $\mathbb{R}^3$, planes in $\mathbb{R}^3$, and $\mathbb{R}^3$ itself. The basic computational skills you picked up in Chapter 3 can be used to solve interesting geometric problems in vector spaces of any number of dimensions $\mathbb{R}^n$.

Linear transformations and theoretical topics

The concept of a linear transformation $T:\mathbb{R}^n \to \mathbb{R}^m$ is the extension of the idea of a function of a real variable $f:\mathbb{R} \to \mathbb{R}$. Linear transformations are linear functions that take $n$-vectors as inputs and produce $m$-vectors as outputs.

Understanding linear transformations is synonymous with understanding linear algebra. There are many properties of a linear transformation that we might want to study. The practical side of linear transformations is that they are a vector upgrade to your existing skill set for modelling the world with functions. You’ll also learn how to study, categorize, and understand linear transformations using new theoretical tools like eigenvalues and eigenvectors.
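For a small taste of what those tools tell you: the linear transformation represented by the matrix $\begin{bmatrix} 2 & 0 \\ 0 & 3 \end{bmatrix}$ stretches the $x$-direction by a factor of $2$ and the $y$-direction by a factor of $3$, so its eigenvectors are $(1,0)$ and $(0,1)$, with eigenvalues $2$ and $3$ respectively.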

Matrices and applications

Another fundamental idea in linear algebra is the equivalence between linear transformations $T:\mathbb{R}^n \to \mathbb{R}^m$ and matrices $M \in \mathbb{R}^{m\times n}$. Specifically, the abstract idea of a linear transformation $T:\mathbb{R}^n \to \mathbb{R}^m$, when we fix a particular choice of basis $B_i$ for the input space and $B_o$ for the output space of $T$, can be represented as a matrix of coefficients $_{B_o}[M_T]_{B_i} \in \mathbb{R}^{m\times n}$. The precise mathematical term for this equivalence is isomorphism. The isomorphism between linear transformations and their matrix representations means we can characterize the properties of a linear transformation by analyzing its matrix representation.
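As a small concrete example, take the linear transformation $T(x,y) = (x+y,\, 3y)$ acting on $\mathbb{R}^2$. With respect to the standard basis for both the input and output spaces, its matrix representation has the images of the basis vectors as its columns: $T(1,0)=(1,0)$ and $T(0,1)=(1,3)$, so $M_T = \begin{bmatrix} 1 & 1 \\ 0 & 3 \end{bmatrix}$.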

Chapter 7 in the book contains a collection of short “applications essays” that describe how linear algebra is applied to various domains of science and business. Chapter 8 is a mini-intro to probability theory and Chapter 9 is an intro course on quantum mechanics. All the applications are completely optional, but I guarantee you’ll enjoy reading them. The power of linear algebra made manifest.

 

If you’re a seasoned blog reader, and you just finished reading this post, I know what you’re feeling… a moment of anxiety washes over you—is a popup asking you to sign up going to appear from somewhere? Is there going to be a call to action of some sort?

Nope.

Problem sets ready

Sometime in mid-December I set out to create problem sets for the book. My friend Nizar Kezzo offered to help me write the exercises for Chapter 2 and Chapter 4, and I made a plan to modernize the calculus questions a bit, quickly write a few more questions, and be done in a couple of weeks.

That was four months ago! Clearly, I was optimistic (read: unrealistic) about my productivity. Nizar did his part right on schedule, but it took me forever to write nice questions for the other chapters and to proofread everything. After all, if the book is no bullshit, the problem sets must also be no bullshit. I’m quite happy with the results!

noBS problem sets: letter format or 2up format.

Please, if you find any typos or mistakes in the problem sets, drop me a line so I can fix them before v4.1 goes to print.

Tools

In addition to working on the problem sets, I made some updates to the main text. I also developed some scripts to use in combination with latexdiff to show only the pages with changes. This automation saved me a lot of time, as I didn’t have to page through 400pp of text, but could review only the subset of pages that had changes in them.

If you would like to see the changes made to the book from v4.0 to v4.1 beta, check out noBSdiff_v4.0_v4.1beta.pdf.

Future

Today I handed over the problems to my editor, and once she has taken a look at them, I’ll merge the problems into the book and release v4.1. The coming months will be focused on the business side. I know I keep saying that, but now I think the book is solid and complete, so I will be much more confident when dealing with distributors and bookstores. Let’s scale this!

Ghetto CRM

Say you want to extract the names and emails from all the messages under a given tag in your Gmail. In my case, it’s the 60 readers who took part in the “free PDF if you buy the print version” offer. I’d like to send them an update.

I started clicking around in Gmail and compiling the list by hand, but Gmail’s UI is NOT designed for this: you can’t select the text of the email field because a popup shows up, and yada yada…. If you’re reading this, you probably got here because you have the same problem, so I don’t need to explain.

Yes, this is horribly repetitive, and yes, it can be automated using Python (the script below is Python 2, using the standard imaplib and email modules):

import imaplib
import email
from email.utils import parseaddr
import getpass


user = raw_input("Enter your GMail username:")
pwd = getpass.getpass("Enter your password: ")

m = imaplib.IMAP4_SSL('imap.gmail.com', 993)    
m.login(user,pwd)    

# the IMAP client object is stored in m
# list all tags (i.e. mailboxes) using:
# m.list()


# select the desired tag
m.select('miniref/lulureaders', readonly=True)
typ, data = m.search(None, 'ALL')


# build a list of people from (both FROM and TO headers)
people = []
for i in range(1, len(data[0].split(' '))+1 ):
    typ, msg_data = m.fetch(str(i), '(RFC822)')
    for response_part in msg_data:
        if isinstance(response_part, tuple):
            msg = email.message_from_string(response_part[1])
            name1, addr1 = parseaddr( msg['to'] )
            name2, addr2 = parseaddr( msg['from'] )
            d1 = { "name":name1, "email":addr1 }
            d2 = { "name":name2, "email":addr2 }
            people.extend([d1,d2])
            # uncomment below to see wat-a-gwaan-on 
            #for header in [ 'subject', 'to', 'from' ]:
            #    print '%-8s: %s' % (header.upper(), msg[header])
            #print "-"*70

# lots of people, duplicate entries
len(people)

# filter uniq
# awesome trick by gnibbler 
# via http://stackoverflow.com/questions/11092511/python-list-of-unique-dictionaries
people =  {d['email']:d for d in people}.values()     # uniq by email

# just uniques
len(people)

# print as comma separated values for import into mailing list
for reader in people:
    print reader['email'] + ", " + reader['name']
    
# ciao!
m.close()