A command-line tool & Ruby class for computing Pearson's r and other stats
ruby-regress is a tool for computing correlations and regression equations from two-variable input. It is designed to function as a drop-in replacement for Gary Perlman's regress, at least for those who use only the basic functionality that regress provides.
The problem with Gary Perlman's excellent |STAT programs is twofold:
If you need bulletproof robustness you're probably better off dealing with Perlman's terms of access and using |STAT; if you want ease of installation, try ruby-regress.
Installing ruby-regress using rubygems is absurdly easy:
gem install ruby-regress
which installs the regress executable.
Alternatively, to install from source, download the most recent source from GitHub:
git clone git://github.com/doches/ruby-regress.git
then build and install the gem:
cd ruby-regress
rake build
sudo rake install
ruby-regress installs a single command-line tool called regress, which reads from STDIN and prints a report containing the correlation coefficient, plus some descriptive statistics, to STDOUT. For example, if we have a file in the current directory called data.txt containing two datasets, one per column:
1 12.0
2 11.0
3 13.0
4 14.0
we can get the correlation coefficient between these two variables by:
regress < data.txt
which will dump a load of statistical information about the datasets to the terminal.
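To make the numbers in that report less opaque, here is a minimal Ruby sketch of the underlying arithmetic for the sample data above. It is not ruby-regress's own code or API: the method name pearson_and_line and the shape of its return value are assumptions made for illustration. It computes Pearson's r and the least-squares line using the standard textbook formulas.

```ruby
# Hypothetical helper, NOT part of ruby-regress: Pearson's r plus the
# least-squares regression line for paired observations.
def pearson_and_line(xs, ys)
  if xs.empty? || xs.size != ys.size
    raise ArgumentError, "need two non-empty columns of equal length"
  end

  n = xs.size.to_f
  mean_x = xs.sum / n
  mean_y = ys.sum / n

  # Sums of cross-products and squared deviations from the means.
  sxy = xs.zip(ys).sum { |x, y| (x - mean_x) * (y - mean_y) }
  sxx = xs.sum { |x| (x - mean_x)**2 }
  syy = ys.sum { |y| (y - mean_y)**2 }

  slope = sxy / sxx
  {
    r: sxy / Math.sqrt(sxx * syy),        # correlation coefficient
    slope: slope,                         # b in y = a + b*x
    intercept: mean_y - slope * mean_x    # a in y = a + b*x
  }
end

# The two columns of data.txt from the example above.
result = pearson_and_line([1, 2, 3, 4], [12.0, 11.0, 13.0, 14.0])
puts format("r = %.2f, y = %.2f + %.2f * x",
            result[:r], result[:intercept], result[:slope])
```

For this data the deviations work out to r = 0.8 and the line y = 10.5 + 0.8x, which gives a figure to compare against whatever regress reports.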
ruby-regress only understands one- or two-column input. Give it two columns and you'll get a regression plus some descriptive stats (mean, range, etc.); give it a single column of input and you'll only get the descriptive stats.
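The one-column versus two-column behaviour can be pictured with a short Ruby sketch. This is an illustration only, not how ruby-regress is actually implemented: parse_columns and describe are hypothetical names, and the sketch assumes whitespace-separated numeric fields with the same number of fields on every line.

```ruby
# Hypothetical parser, NOT ruby-regress's own code: turn whitespace-separated
# lines into one or two arrays of floats (one array per column).
def parse_columns(text)
  rows = text.each_line.map(&:split).reject(&:empty?)
  widths = rows.map(&:size).uniq
  unless widths.size == 1 && [1, 2].include?(widths.first)
    raise ArgumentError, "expected one or two columns on every line"
  end
  rows.map { |row| row.map { |cell| Float(cell) } }.transpose
end

# Hypothetical descriptive statistics for a single column of floats.
def describe(values)
  {
    n: values.size,
    mean: values.sum / values.size,
    min: values.min,
    max: values.max,
    range: values.max - values.min
  }
end

columns = parse_columns("1 12.0\n2 11.0\n3 13.0\n4 14.0\n")
columns.each_with_index do |column, index|
  puts "column #{index + 1}: #{describe(column)}"
end
puts "two columns found, so a regression would also apply" if columns.size == 2
```

With a single column, parse_columns returns one array and only the descriptive statistics apply; with two, each column gets its own summary and the pair can additionally be correlated.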