
Web Page for Software:
Bayesian Regression: Nonparametric and Parametric Models
Version 11June2016
(beta version; older non-beta versions available below)
by George Karabatsos (home page)
Free standalone, menu-driven software for Bayesian regression analysis.
Supported by NSF Research Grant SES-1156372.
Publications about the Bayesian Regression software (BibTeX citations):
Karabatsos, G. (2015). A menu-driven software package for Bayesian regression analysis.
ISBA Bulletin, 22(4), 13-16.
Karabatsos, G. (2016, in press). A Menu-Driven Software Package of Bayesian Nonparametric (and Parametric) Mixed Models for Regression Analysis and Density Estimation.
Behavior Research Methods.
AERA Prof. Development Course: "Bayesian Nonparametric Regression for Education Research"
COURSE MATERIAL: Course Slides. Course exercises are provided in the 2016 paper (link) below.
New features of software version 11June2016:
 Software provides regression with ridge priors, and with LASSO priors, for Bayesian linear models, and for Bayesian infinite-mixture (probits) regression models;
 Software allows for construction of covariates with truncated linear splines, truncated cubic splines, natural cubic splines, B-splines, and Hardy's multiquadratic splines (multivariate and univariate);
 Software provides Dirichlet-process-based ridge regression with heteroscedasticity-consistent covariance estimation of the regression coefficients, along with Vibrations of Effects (VoE) analyses (see paper).
New features since software version 23May2016:
 Compatible with Windows 7, 8, 8.1, or 10.
 Power ridge regression and generalized ridge regression models, fit by marginal maximum likelihood.
 K-means clustering analysis of the data, and of the posterior distribution of model parameters.
 Improved principal components analysis (as recommended by MATLAB). 
Software Description

 The Bayesian Regression software package currently includes 100 Bayesian models for data analysis,
including 5 models for Bayesian density estimation.
 The data analysis can provide:
 Prediction analysis.
The predictions of the dependent variable, given chosen covariate (predictor) values,
can be made in terms of the:
 mean;
 variance, for a variance regression analysis;
 quantiles, for a quantile regression analysis
(e.g., a median regression analysis pertains to a quantile probability of 1/2);
 probability density function or cumulative distribution function,
for density regression analysis;
 survival function, hazard function, or cumulative hazard function, for survival analysis;
 Binary or ordinal regression analysis, including Item Response Theory (IRT) analysis;
 Variable (predictor) selection analysis;
 Regression with splines (thin-plate splines, and/or locally-constant splines);
 Causal analysis;
 Meta-analysis;
 Cluster analysis;
 The analysis of censored data (e.g., survival data); and
 The analysis of spatial data
(e.g., via spatial weight covariates or thin-plate splines, applied to latitude and longitude data).
 Models include:
 Bayesian nonparametric, infinite-mixture regression models, defined by:
 a probit regression model for the mixture weights (the infinite-probits model);
 a general stick-breaking prior, to define 2-level hierarchical models.
Currently, priors include those defined by a Dirichlet process; the Pitman-Yor (PY) process;
the normalized stable process (a special PY process); the beta process (2-parameter);
the geometric weights prior (a restricted stick-breaking prior);
and the normalized inverse-Gaussian process.
 2-level and 3-level normal random-effects models (sometimes called HLMs);
 Normal linear models;
 Probit models and logit models, for binary (0,1) and for ordinal dependent variables.
They also include scale-mixture probit models that model the link function as an unknown parameter;
 Models that provide automatic covariate (predictor) selection,
using stochastic-search variable selection (SSVS),
or using fast ridge regression via marginal maximum likelihood estimation;
 Since these regression models are Bayesian (with a proper prior distribution on the regression coefficients),
they can automatically handle covariates (predictors) that are multicollinear.
 New models will be added to the software over time (suggestions are welcomed).
 More details are provided in the software Help menu.
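
As an illustration of the stick-breaking priors listed above, the following Python sketch draws truncated mixture weights. It is not the software's own code (the software is menu-driven and needs no code writing). The function name, the truncation level, and the parameter names alpha and discount are assumptions made for this example. With discount = 0 the weights follow the standard Dirichlet process construction; with 0 < discount < 1 they follow the standard Pitman-Yor construction.

```python
import random


def stick_breaking_weights(alpha, num_sticks, rng, discount=0.0):
    """Draw truncated stick-breaking mixture weights.

    Hypothetical helper for illustration only.
    discount = 0      -> Dirichlet process with precision alpha
    0 < discount < 1  -> Pitman-Yor process
    """
    if not 0.0 <= discount < 1.0 or alpha <= -discount:
        raise ValueError("need 0 <= discount < 1 and alpha > -discount")
    weights = []
    remaining = 1.0  # length of the stick not yet broken off
    for k in range(1, num_sticks):
        # Break fraction V_k ~ Beta(1 - discount, alpha + k * discount)
        v = rng.betavariate(1.0 - discount, alpha + k * discount)
        weights.append(remaining * v)
        remaining *= 1.0 - v
    # Truncation: the last component takes whatever mass is left,
    # so the weights sum to 1.
    weights.append(remaining)
    return weights


rng = random.Random(0)  # fixed seed so the run is repeatable
weights = stick_breaking_weights(alpha=2.0, num_sticks=20, rng=rng)
print(len(weights), sum(weights))
```

Smaller values of alpha put most of the mass on the first few weights, which corresponds to fewer effective mixture components.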

Software Features

 The software can be operated almost entirely with the computer mouse.
No code writing is needed to run a Bayesian analysis.
 Using appropriate menu and/or push-button options of the software, you can easily and quickly:
 Import a data set that is in a comma-delimited (.csv) file format
(variable names in the first row, numerical (non-text) data in all other rows);
 Provide basic summaries of the data set, through various descriptive statistics and graphs;
 Set up the data for a regression analysis, by constructing new variables or modifying existing variables,
involving either:
 Simple transformations of variables, including z-score or Box-Cox transformations,
binary coding (0,1; or -1,1), or
sum of variables;
 The construction of new variables that represent effect sizes;
 The construction of new variables
that represent interactions between covariates,
polynomials,
or incorporate spatial information (e.g., latitude, longitude),
in order to set up a spatial data analysis;
 Dimension (variable) reduction methods, including principal components,
multidimensional scaling, K-means clustering, scaling via true-score test theory,
and propensity scoring to set up a causal analysis of observational data;
 The setup of a time-series autoregression analysis,
by the construction of lag terms of chosen order;
 The handling of missing values, including:
 Nearest-neighbor hot-deck imputation of missing values;
 The processing of plausible values;
 Simple changes to the data set (e.g., rename, delete, or move variables in the data).
 Select the Bayesian regression model for data analysis,
along with the dependent variable, covariates, and prior distribution.
When necessary, you can also select:
 The grouping/nesting variable (for a 2-level or 3-level model);
 Observation weights, for unequally weighted observations, as in meta-analysis;
 The interval bounds for censored dependent variable observations, as in survival analysis;
 Select the Markov Chain Monte Carlo (MCMC) sampling parameters
(number of MCMC samples, burn-in period, thinning intervals),
for estimating the posterior distribution of the model;
 Output results of the data analysis under the chosen Bayesian model,
through text and graphical output files, that report:
 The estimates of the posterior distribution of the model parameters;
 The model's predictive fit to the given data set;
 The predictions of the model as a function of
one or more covariates (predictors) you choose;
 The MCMC convergence of all of these estimates, through trace plots,
CUSUM statistics, and subsampling methods that calculate
the 95% MC confidence intervals of the estimates.
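
The MCMC sampling parameters (burn-in period, thinning interval) and the idea of a 95% Monte Carlo (MC) confidence interval can be illustrated with a short Python sketch. This is not the software's internal code, and the page does not say which subsampling method the software uses; the batch-means method below is a common stand-in. The function names and the toy autoregressive chain are assumptions made for this example.

```python
import math
import random


def burn_and_thin(chain, burn_in, thin):
    """Drop the first `burn_in` draws, then keep every `thin`-th draw."""
    return chain[burn_in::thin]


def batch_means_interval(draws, num_batches=20):
    """Estimate a posterior mean with a 95% MC interval via batch means.

    Uses a normal approximation (1.96 standard errors); illustrative only.
    """
    batch_size = len(draws) // num_batches
    if batch_size < 1:
        raise ValueError("not enough draws for the requested batches")
    used = draws[: batch_size * num_batches]
    means = [
        sum(used[i * batch_size:(i + 1) * batch_size]) / batch_size
        for i in range(num_batches)
    ]
    estimate = sum(means) / num_batches
    # Sample variance of the batch means gives the MC standard error.
    variance = sum((m - estimate) ** 2 for m in means) / (num_batches - 1)
    mc_se = math.sqrt(variance / num_batches)
    return estimate, estimate - 1.96 * mc_se, estimate + 1.96 * mc_se


# Toy chain: autocorrelated draws that start far from their long-run mean of 0.
rng = random.Random(1)
x, chain = 10.0, []
for _ in range(5000):
    x = 0.9 * x + rng.gauss(0.0, 1.0)
    chain.append(x)

kept = burn_and_thin(chain, burn_in=1000, thin=5)
estimate, lower, upper = batch_means_interval(kept)
print(len(kept), estimate, lower, upper)
```

A narrow MC interval suggests that the number of retained samples is adequate for that estimate; a wide one suggests running more MCMC samples.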

Software Requirements
 The Bayesian Regression software is a standalone software package.
 The software can run on a 64-bit Windows (PC) computer (also 32-bit for older software versions).

SOFTWARE INSTALLATION (2 steps)

 Download the Bayesian software installation file:
New beta version for 64-bit Computers (23 MB). Includes 100 models.
Version for 64-bit Computers (23 MB). Includes 85 models.
 Click the downloaded file, BayesInstaller_web64bit.exe, to install the software.
This installation automatically includes web-based installation of MATLAB Compiler Runtime.
While installing, please be sure that you select "Add a shortcut to the desktop."
(Also, make sure you have an internet connection, and disable any firewall/proxy settings.)
Alternative installation instructions. Requires: MATLAB Compiler 64-bit, BayesReg64bit.exe.
Previous Software Releases:
Software versions for Windows XP, Vista, 7, 8, or 8.1:
30June2015: 64-bit Computers (5 MB). 83 models. (MATLAB Compiler 64-bit, BayesReg64bit.exe)
25March2015: For 64-bit and 32-bit Computers (6.9 MB). 59 models.
Or: MATLAB Compiler 32-bit, BayesReg32bit.exe (Version 25March2015. Includes 59 models).
Running the Software

 The Bayesian Regression software is opened by clicking the icon (file) BayesRegression.exe.
The Help menu gives step-by-step instructions on how to analyze data,
using a model of your choice.
 The Bayesian regression software provides several example data files that can be used
to illustrate the software through data analysis.
 To access the example data files, first click the File menu of the software,
and run the menu option "Create Bayes Data Examples file folder" (you only need to run this once).
Then click the File menu again, to import and open an example data file from this folder.
(One example data file can be downloaded from here).

Output Files

 The Bayesian Regression software outputs the results of a data analysis into text (.txt) files
with time-stamped names. They include comma-delimited (.csv) and space-delimited text files
(such as posterior samples *.MC1, residual fit statistics *.RES, and .MODEL files).
 The text output files can be viewed in free Notepad++ or TextPad.
Such software is recommended because it opens multiple output files in separate tabs.
 The delimited text output files can be analyzed and graphed in:
spreadsheet software (free spreadsheet software available from OpenOffice);
or in the free R software, after importing the comma-delimited output file using the R command:
ImportedData = read.csv(file.choose());
 The Bayesian Regression software can also output data analysis
results into graphs, as figure (*.fig) files.
A figure file can be saved in another file format (e.g., *.eps, *.bmp, *.emf, *.jpg, and *.pdf).
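
For readers who prefer Python to a spreadsheet or R, a comma-delimited file with variable names in the first row and numbers in all other rows can be read with the standard library, as sketched below. The exact column layout of the software's output files is not described on this page, so the column names beta0 and beta1 and the in-memory sample are assumptions that stand in for a real output file.

```python
import csv
import io
import statistics


def read_numeric_csv(handle):
    """Read a comma-delimited table: names in row 1, numbers below.

    Returns a dict that maps each column name to a list of floats.
    """
    reader = csv.reader(handle)
    names = [name.strip() for name in next(reader)]
    columns = {name: [] for name in names}
    for row in reader:
        if not row:  # skip blank lines
            continue
        for name, cell in zip(names, row):
            columns[name].append(float(cell))
    return columns


# Hypothetical stand-in for an output file. For a real file, use:
#     with open(path, newline="") as handle: ...
sample = io.StringIO("beta0,beta1\n0.9,2.1\n1.1,1.9\n1.0,2.0\n")
columns = read_numeric_csv(sample)

# Column means, e.g. as rough posterior-mean summaries of sampled parameters.
summary = {name: statistics.mean(values) for name, values in columns.items()}
print(summary)
```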
