wordocc/README.md

# WordOcc

A word frequency tool that counts the words in a text file and writes the sorted results in CSV format. Stop words are supported.

# Requirements

 - Python 2.6
 - TextBlob https://textblob.readthedocs.org/en/dev/

# Usage

## Basic

	python wordocc.py a_interesting_text.txt

The result is written to wordocc.csv, for example:

	top,43
	image,31
	sample,29
	...
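
For readers curious how such a file can be produced, here is a minimal sketch using only the standard library. It is not the code of wordocc.py (which relies on TextBlob): the function names `count_words` and `write_csv` and the regex tokenizer are assumptions made for illustration. It targets Python 3; `collections.Counter` is not available on Python 2.6.

```python
import csv
import io
import re
from collections import Counter


def count_words(text):
    """Return a Counter mapping each lowercased word to its number of occurrences."""
    # Naive tokenizer (an assumption): runs of word characters and apostrophes.
    words = re.findall(r"[\w']+", text.lower())
    return Counter(words)


def write_csv(counts, stream):
    """Write word,count rows to stream, most frequent word first."""
    writer = csv.writer(stream, lineterminator="\n")
    # Sort by descending count, then alphabetically so ties are deterministic.
    for word, n in sorted(counts.items(), key=lambda item: (-item[1], item[0])):
        writer.writerow([word, n])


if __name__ == "__main__":
    sample = "top image top sample top image"
    out = io.StringIO()
    write_csv(count_words(sample), out)
    print(out.getvalue())
```

Sorting on the pair (negative count, word) keeps the output stable when several words share the same count.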

## Options

	wordocc.py -h
	Usage: wordocc.py [options] FILE

	Options:
	  -h, --help            show this help message and exit
	  -s STOP_WORDS, --stop-words=STOP_WORDS
	                        path to stop word file
	  -o OUTPUT, --output=OUTPUT
	                        csv output filename
	  -e ENCODING, --encoding=ENCODING
	                        file encoding (default: utf-8)
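
The help text above has the layout produced by the standard `optparse` module. The following sketch shows how a parser with the same three options could be declared. It is an illustration, not the actual source of wordocc.py: the `build_parser` name is hypothetical, and the `wordocc.csv` default for `--output` is inferred from the Basic section rather than from the help text.

```python
from optparse import OptionParser


def build_parser():
    """Build a parser whose options mirror the help text above."""
    parser = OptionParser(usage="usage: %prog [options] FILE")
    parser.add_option("-s", "--stop-words", dest="stop_words",
                      help="path to stop word file")
    # Assumption: wordocc.csv is the default, as suggested by the Basic section.
    parser.add_option("-o", "--output", dest="output", default="wordocc.csv",
                      help="csv output filename")
    parser.add_option("-e", "--encoding", dest="encoding", default="utf-8",
                      help="file encoding (default: utf-8)")
    return parser


if __name__ == "__main__":
    # Parse an explicit argument list instead of sys.argv for demonstration.
    options, args = build_parser().parse_args(["-s", "stop.txt", "text.txt"])
    print(options.stop_words, options.output, options.encoding, args)
```

Note that `-s` takes the stop word file and `-e` takes the encoding name, so the two are easy to mix up on the command line.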


## Stop words

### Introduction

Stop words are words that carry little meaning for statistical analysis, such as articles and conjunctions.

You have to provide a file containing those words (one per line). The following files can serve as a starting point:

 - English: http://snowball.tartarus.org/algorithms/english/stop.txt
 - French: http://snowball.tartarus.org/algorithms/french/stop.txt
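
As an illustration of the one-word-per-line format, the sketch below loads a stop word file into a set and filters a word list with it. The helper names `load_stop_words` and `remove_stop_words` are hypothetical and not taken from wordocc.py. The Snowball lists appear to annotate entries with comments introduced by `|`, so this sketch drops everything after that character; whether wordocc.py does the same is an assumption.

```python
import io


def load_stop_words(path, encoding="utf-8"):
    """Read a stop word file (one word per line) into a set of lowercased words."""
    stop_words = set()
    with io.open(path, encoding=encoding) as handle:
        for line in handle:
            # Assumption: text after "|" is a comment, as in the Snowball lists.
            word = line.split("|", 1)[0].strip().lower()
            if word:  # skip blank and comment-only lines
                stop_words.add(word)
    return stop_words


def remove_stop_words(words, stop_words):
    """Return the words that are not stop words, preserving their order."""
    return [word for word in words if word.lower() not in stop_words]
```

A set is used because membership tests stay fast even for long stop word lists.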

### Usage

	python wordocc.py -s /home/jdoe/en/stop.txt a_interesting_text.txt

Initial add 2015-12-08 01:30:02 +11:00
