I’ve written a small set of tools to (semi-)automatically collect and analyze data from online discussions in Facebook groups and on Facebook pages. The tools are available on GitHub. They can automatically collect posts and comments, including their hierarchical structure and some metadata, from public groups and pages. For closed groups, you have to save the HTML output manually and then parse it with a provided Python script.
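To make the "hierarchical structure" of posts and comments more concrete, here is a minimal sketch of how such a thread could be represented and traversed in Python. It is not taken from the tools themselves: the `Entry` class, its field names, and the `walk` helper are assumptions made purely for illustration.

```python
from dataclasses import dataclass, field
from typing import Iterator, List, Tuple


@dataclass
class Entry:
    """A post or a comment. All field names here are hypothetical."""
    author: str
    text: str
    replies: List["Entry"] = field(default_factory=list)


def walk(entry: Entry, depth: int = 0) -> Iterator[Tuple[int, Entry]]:
    """Yield (depth, entry) pairs in depth-first order."""
    yield depth, entry
    for reply in entry.replies:
        yield from walk(reply, depth + 1)


# A made-up thread: one post, two comments, one nested reply.
post = Entry("Alice", "Which parser do you use?", [
    Entry("Bob", "The standard library one.", [
        Entry("Alice", "Thanks!"),
    ]),
    Entry("Carol", "I wrote my own."),
])

# Print the thread with indentation that reflects the nesting level.
for depth, entry in walk(post):
    print("  " * depth + f"{entry.author}: {entry.text}")
```

Keeping replies nested inside their parent preserves who answered whom, while the depth-first traversal gives a flat view when an analysis only needs the text.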
Once the data is collected, statistical analyses can be run on it. So far, one analysis is implemented: identifying and counting nouns, as described in a previous blog post.
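As a rough illustration of the counting step, the sketch below tallies nouns in text that has already been part-of-speech tagged. It assumes Penn Treebank style tags (noun tags start with `NN`) and hand-tagged sample input. The `count_nouns` function is hypothetical, and the method from the earlier post may well differ.

```python
from collections import Counter
from typing import Iterable, Tuple


def count_nouns(tagged: Iterable[Tuple[str, str]]) -> Counter:
    """Count lowercased tokens whose tag starts with "NN" (NN, NNS, NNP, NNPS)."""
    return Counter(word.lower() for word, tag in tagged if tag.startswith("NN"))


# Hand-tagged example input (assumption); a real run would take its tags
# from a part-of-speech tagger.
tagged_comment = [
    ("The", "DT"), ("group", "NN"), ("discussed", "VBD"),
    ("the", "DT"), ("election", "NN"), ("and", "CC"),
    ("the", "DT"), ("group", "NN"), ("rules", "NNS"),
]

noun_counts = count_nouns(tagged_comment)
print(noun_counts.most_common())
```

Lowercasing before counting merges "Group" and "group" into one entry. That is convenient for frequency lists, but it also folds proper nouns into common ones, so it is a choice rather than a requirement.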