

A word frequency tool that outputs sorted results in csv format, supporting stop words.




python a_interesting_text.txt

This writes the word counts to wordocc.csv in the current directory.
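As a rough illustration of what such a tool does, here is a minimal sketch of counting words and writing them out as sorted CSV. The function names `count_words` and `write_csv`, the tokenizing regex, the lowercasing, and the `word,count` column layout are all assumptions made for this example; the tool's actual output format may differ.

```python
import csv
import io
import re
from collections import Counter


def count_words(text, stop_words=frozenset()):
    """Count lowercase word occurrences, skipping any stop words."""
    words = re.findall(r"\w+", text.lower())
    return Counter(w for w in words if w not in stop_words)


def write_csv(counts, stream):
    """Write one (word, count) row per word, most frequent first."""
    writer = csv.writer(stream)
    # Sort by descending count, then alphabetically to break ties.
    for word, count in sorted(counts.items(), key=lambda kv: (-kv[1], kv[0])):
        writer.writerow([word, count])


# Example run on an in-memory string instead of a real file.
counts = count_words("The cat and the hat", stop_words={"and"})
buffer = io.StringIO()
write_csv(counts, buffer)
print(buffer.getvalue())
```

Sorting on `(-count, word)` keeps the ordering deterministic when several words share the same frequency.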


Options -h
Usage: [options] FILE

  -h, --help            show this help message and exit
  -s STOP_WORDS, --stop-words=STOP_WORDS
                        path to stop word file
  -o OUTPUT, --output=OUTPUT
                        csv output filename (default: wordocc.csv)
  -e ENCODING, --encoding=ENCODING
                        file encoding (default: utf-8)
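
The help layout above resembles what Python's standard `optparse` module prints. Whether the tool actually uses `optparse` is an assumption; the sketch below only shows how a parser with the same three options and defaults could be declared. `build_parser` is a hypothetical name.

```python
from optparse import OptionParser


def build_parser():
    """Declare the -s, -o and -e options listed in the help text."""
    parser = OptionParser(usage="usage: %prog [options] FILE")
    parser.add_option("-s", "--stop-words", dest="stop_words",
                      help="path to stop word file")
    parser.add_option("-o", "--output", dest="output", default="wordocc.csv",
                      help="csv output filename (default: %default)")
    parser.add_option("-e", "--encoding", dest="encoding", default="utf-8",
                      help="file encoding (default: %default)")
    return parser


# Parse a sample command line; the positional FILE ends up in args.
options, args = build_parser().parse_args(
    ["-s", "stop.txt", "a_interesting_text.txt"])
print(options.stop_words, options.output, options.encoding, args)
```

With no `default` given, `options.stop_words` is `None` when `-s` is omitted, so no stop word file is used unless one is named explicitly.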

Stop words

Stop words are words that carry little interest for statistical analysis, such as articles and conjunctions.

You can provide a file containing these words, one per line. The following files can help:

Use the -s option to specify the file path:

python -s /home/jdoe/en/stop.txt a_interesting_text.txt
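
To illustrate the one-word-per-line file format, here is a minimal sketch of loading a stop word file and filtering with it. `load_stop_words` is a hypothetical helper, and skipping blank lines and lowercasing entries are assumptions, not documented behavior of the tool.

```python
import os
import tempfile


def load_stop_words(path, encoding="utf-8"):
    """Read one stop word per line, ignoring blank lines and case."""
    with open(path, encoding=encoding) as handle:
        return {line.strip().lower() for line in handle if line.strip()}


# A temporary file stands in for a real stop word list here.
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False,
                                 encoding="utf-8") as tmp:
    tmp.write("the\nand\n\nOf\n")

stop_words = load_stop_words(tmp.name)
os.remove(tmp.name)

words = "the rise and fall of the empire".split()
print([w for w in words if w not in stop_words])
# prints ['rise', 'fall', 'empire']
```

Storing the words in a set keeps each membership test cheap, which matters when the input text is large.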