{"id":32,"date":"2012-03-10T04:24:01","date_gmt":"2012-03-10T04:24:01","guid":{"rendered":"http:\/\/minireference.com\/blog\/?p=32"},"modified":"2012-03-10T04:24:01","modified_gmt":"2012-03-10T04:24:01","slug":"measuring-readability","status":"publish","type":"post","link":"https:\/\/minireference.com\/blog\/measuring-readability\/","title":{"rendered":"Measuring readability"},"content":{"rendered":"<p>The <a href=\"http:\/\/en.wikipedia.org\/wiki\/Flesch%E2%80%93Kincaid_readability_test\">Flesch-Kincaid readability test<\/a><br \/>\nis a very simple metric based on how long the sentences are (words per sentence)<br \/>\nand how long the words are (syllables per word) in a text.<\/p>\n<p>Complicated, long words used in scientific jargon will give<br \/>\nlow readability scores.<br \/>\nCarelessly written text with run-on sentences and lots of <em>which<\/em><br \/>\nand <em>that<\/em> will also score low on the readability scale.<br \/>\nShort sentences with simple words are considered more readable.<\/p>\n<p>Toby Donaldson at SFU has <a href=\"http:\/\/csil-web.cs.surrey.sfu.ca\/cmpt120fall2010\/wiki\/TextReadability\/\">an implementation of Flesch-Kincaid<\/a> in Python. 
I decided to check how the three chapters of the book score.<\/p>\n<p><!--more--><\/p>\n<p>In a command prompt:<br \/>\n<code><br \/>\ncat math\/*.txt &gt; \/tmp\/all_math.txt<br \/>\ncat calculus\/*.txt &gt; \/tmp\/all_calc.txt<br \/>\ncat physics\/*.txt &gt; \/tmp\/all_phys.txt<br \/>\n<\/code><\/p>\n<p>Then downloaded and installed<br \/>\n<code><br \/>\nimport flesch<\/code><\/p>\n<p>math = open(&#8220;\/tmp\/all_math.txt&#8221;).read()<br \/>\nflesch.summarize( math )<br \/>\nTotal # syllables: 26730<br \/>\nTotal # words: 17719<br \/>\nTotal # sentences: 1248<br \/>\nFlesch reading ease score (FRES): 64.8417724083<br \/>\nFlesch-Kincaid grade level: 11.7480791982<\/p>\n<p>flesch.summarize( calc )<br \/>\nTotal # syllables: 43247<br \/>\nTotal # words: 28103<br \/>\nTotal # sentences: 1634<br \/>\nFlesch reading ease score (FRES): 59.2303055328<br \/>\nFlesch-Kincaid grade level: 13.2762936474<\/p>\n<p>flesch.summarize( phys )<br \/>\nTotal # syllables: 42982<br \/>\nTotal # words: 27984<br \/>\nTotal # sentences: 1740<br \/>\nFlesch reading ease score (FRES): 60.6107049743<br \/>\nFlesch-Kincaid grade level: 12.8064754047<\/p>\n<p>flesch.summarize( la )<br \/>\nTotal # syllables: 24280<br \/>\nTotal # words: 15074<br \/>\nTotal # sentences: 972<br \/>\nFlesch reading ease score (FRES): 54.8681963758<br \/>\nFlesch-Kincaid grade level: 13.464711137<\/p>\n<p>flesch.summarize( EnM )<br \/>\nTotal # syllables: 30724<br \/>\nTotal # words: 19138<br \/>\nTotal # sentences: 1188<br \/>\nFlesch reading ease score (FRES): 54.7087328366<br \/>\nFlesch-Kincaid grade level: 13.6363072411<\/p>\n<p>&nbsp;<\/p>\n<p>Its actually pretty accurate. The math chapter is<br \/>\nfor people in high school. The Phys is 1st year<br \/>\nuniversity level and the calculus seems to be<br \/>\na little more advanced. 
Linear algebra with<br \/>\nwords like &#8220;eigenvector&#8221; obviously wins as the<br \/>\nhardest thing.<\/p>\n<p>Very cool.<\/p>\n<p>EDIT: Just for comparison, I will now test my MSc thesis.<br \/>\n<code><br \/>\nflesch.summarize( th )<\/code><\/p>\n<p>Total # syllables: 61114<br \/>\nTotal # words: 34436<br \/>\nTotal # sentences: 1985<br \/>\nFlesch reading ease score (FRES): 39.1269891464<br \/>\nFlesch-Kincaid grade level: 16.1173708441<\/p>\n<p>So yeah. I have some range.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>The Flesch-Kincaid readability test is a very simple metric that calculates how long the sentences and how big the words used in a text are. Complicated, long words used in scientific jargon will give low readability scores. Carelessly written text with run on sentences and lots of which and that will score low on the [&hellip;]<\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[10],"tags":[],"class_list":["post-32","post","type-post","status-publish","format-standard","hentry","category-metrics"],"_links":{"self":[{"href":"https:\/\/minireference.com\/blog\/wp-json\/wp\/v2\/posts\/32","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/minireference.com\/blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/minireference.com\/blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/minireference.com\/blog\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/minireference.com\/blog\/wp-json\/wp\/v2\/comments?post=32"}],"version-history":[{"count":0,"href":"https:\/\/minireference.com\/blog\/wp-json\/wp\/v2\/posts\/32\/revisions"}],"wp:attachment":[{"href":"https:\/\/minireference.com\/blog\/wp-json\/wp\/v2\/media?parent=32"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/minireference.com\/blog\/wp-json
\/wp\/v2\/categories?post=32"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/minireference.com\/blog\/wp-json\/wp\/v2\/tags?post=32"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}