JFreq

A tool for counting words, quickly

JFreq Ranking & Summary

  • License: Freeware
  • Publisher Name: Viral Bioinformatics
  • Operating Systems: Windows All
  • File Size: 3 KB

JFreq Description

JFreq is a tool that takes plain text documents and turns them into a word frequency matrix. JFreq tries to be quick, and not to take too much memory. It could be better at both, but it's quite usable.

Plain text files can be added directly, or by the folder-load. If folders are offered, JFreq only looks one level down into them for documents and assumes that everything it finds is a plain text file. It is helpful to make sure this is true.

During the counting process JFreq can, optionally:

  • lowercase everything
  • remove currency symbols
  • remove digits
  • remove stop words with a list you provide
  • apply a stemmer for one of 12 European languages
  • perform a content analysis with a dictionary you provide

JFreq output is a folder containing your new word (or category) frequency matrix in a choice of formats, optionally gzipped to save space on your disk. The formats are:

  • LDA-C: Blei's sparse matrix format used for fitting topic models, but quite generally useful for word frequency data.
  • MTX: The Matrix Market sparse matrix format used in numerical analysis, in the 'coordinate integer' format.
  • CSV: Everybody's first choice of output format. Not well-suited for large scale word-frequency data but reasonable for small document collections and for content analyses.
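
To make the counting options and the two sparse output formats more concrete, here is a minimal Python sketch. It is not JFreq's own code: the function names (count_words, to_ldac, to_mtx), the tokenization rule, the 0-based LDA-C term indices, and the documents-as-rows layout of the MTX output are all assumptions made for illustration. Stemming, currency-symbol removal, and dictionary-based content analysis are left out.

```python
import re
from collections import Counter


def count_words(text, lowercase=True, remove_digits=True, stop_words=frozenset()):
    """Count word tokens in one document (hypothetical helper, not JFreq's code)."""
    if lowercase:
        text = text.lower()
    if remove_digits:
        text = re.sub(r"\d", "", text)
    # Assumed tokenization: runs of letters (and digits, if they were kept).
    tokens = re.findall(r"[^\W_]+", text)
    return Counter(t for t in tokens if t not in stop_words)


def to_ldac(doc_counts):
    """Return (vocabulary, lines) in LDA-C style.

    Each line reads: <number of unique terms> <term index>:<count> ...
    Term indices are assumed to be 0-based positions in the sorted vocabulary.
    """
    vocab = sorted({w for counts in doc_counts for w in counts})
    index = {w: i for i, w in enumerate(vocab)}
    lines = []
    for counts in doc_counts:
        pairs = sorted((index[w], c) for w, c in counts.items())
        lines.append(" ".join([str(len(pairs))] + [f"{i}:{c}" for i, c in pairs]))
    return vocab, lines


def to_mtx(doc_counts, vocab):
    """Return Matrix Market 'coordinate integer' lines.

    Rows are documents and columns are words (an assumption); indices are 1-based.
    """
    index = {w: i for i, w in enumerate(vocab)}
    entries = [
        (d + 1, index[w] + 1, c)
        for d, counts in enumerate(doc_counts)
        for w, c in sorted(counts.items())
    ]
    header = "%%MatrixMarket matrix coordinate integer general"
    size = f"{len(doc_counts)} {len(vocab)} {len(entries)}"
    return [header, size] + [f"{r} {c} {v}" for r, c, v in entries]


docs = ["The cat sat on the mat.", "The dog sat on 2 logs."]
counts = [count_words(d, stop_words={"the", "on"}) for d in docs]
vocab, ldac_lines = to_ldac(counts)
mtx_lines = to_mtx(counts, vocab)

print(vocab)       # ['cat', 'dog', 'logs', 'mat', 'sat']
print(ldac_lines)  # ['3 0:1 3:1 4:1', '3 1:1 2:1 4:1']
print(mtx_lines[1])  # 2 5 6
```

Both sparse formats store only the non-zero counts, which is why they scale better than CSV: a CSV matrix needs one cell per document-word pair, and most of those cells are zero in a large collection.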

