# WordOcc

A word frequency tool that outputs sorted results in CSV format, with support for stop words.

# Requirements

- Python 2.6
- TextBlob https://textblob.readthedocs.org/en/dev/

# Usage

## Basic

    python wordocc.py a_interesting_text.txt

This writes the word counts to _wordocc.csv_ in the current directory, for example:

    top,43
    image,31
    sample,29
    ...

## Options

    wordocc.py -h
    Usage: wordocc.py [options] FILE

    Options:
      -h, --help            show this help message and exit
      -s STOP_WORDS, --stop-words=STOP_WORDS
                            path to stop word file
      -o OUTPUT, --output=OUTPUT
                            csv output filename (default: wordocc.csv)
      -e ENCODING, --encoding=ENCODING
                            file encoding (default: utf-8)

## Stop words

Stop words are words that carry little interest for a statistical study, such as articles and conjunctions. You can provide a file containing those words (one per line). The following files can help:

- English: http://snowball.tartarus.org/algorithms/english/stop.txt
- French: http://snowball.tartarus.org/algorithms/french/stop.txt

Use the -s option to specify the file path:

    python wordocc.py -s /home/jdoe/en/stop.txt a_interesting_text.txt
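
To illustrate how stop word filtering affects the counts, here is a minimal sketch of the idea. It is not the code of wordocc.py: the function names `load_stop_words`, `count_words` and `to_csv` are hypothetical, a simple regular expression stands in for TextBlob's tokenizer, and the sketch is written for Python 3 rather than Python 2.6. It also assumes that text after a `|` in a stop word file is a comment, which appears to be the convention in the Snowball lists.

```python
import csv
import io
import re
from collections import Counter


def load_stop_words(lines):
    """Build a stop word set from an iterable of lines (one word per line)."""
    stop_words = set()
    for line in lines:
        # Assumption: anything after '|' is a comment and is ignored.
        word = line.split("|", 1)[0].strip().lower()
        if word:
            stop_words.add(word)
    return stop_words


def count_words(text, stop_words=frozenset()):
    """Count lowercase word occurrences, skipping stop words."""
    # Assumption: a plain regex tokenizer replaces TextBlob in this sketch.
    words = re.findall(r"[^\W\d_]+(?:'[^\W\d_]+)?", text.lower())
    return Counter(w for w in words if w not in stop_words)


def to_csv(counts):
    """Render counts as 'word,count' rows, most frequent first."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    # Sort by descending count, then alphabetically so ties are stable.
    for word, count in sorted(counts.items(), key=lambda kv: (-kv[1], kv[0])):
        writer.writerow([word, count])
    return buffer.getvalue()


stop_words = load_stop_words(["the | article", "a", "", "  | comment only"])
sample = "The top image and a sample image on top of the top sample."
print(to_csv(count_words(sample, stop_words)))
```

With the two stop words above, "the" and "a" are dropped and the remaining words are listed from most to least frequent, starting with `top,3`. Adding "and", "of" and "on" to the stop word list would leave only the three content words.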