Web Page for Software:
Bayesian Regression: Nonparametric and Parametric Models
(beta version; older non-beta versions available below)
by George Karabatsos (home page)
Free stand-alone, menu-driven software for Bayesian regression analysis.
Supported by NSF Research Grant SES-1156372.
Publications about the Bayesian Regression software (BibTex citations):
Karabatsos, G. (2015). A menu-driven software package for Bayesian regression analysis.
ISBA Bulletin, 22(4), 13-16.
Karabatsos, G. (2016, in press). A Menu-Driven Software Package of
Nonparametric (and Parametric) Mixed Models for Regression Analysis and
Density Estimation. Behavior Research Methods.
AERA Prof. Development Course: "Bayesian Nonparametric Regression for Education Research"
COURSE MATERIAL: Course Slides. Course exercises are provided in the 2016 paper (link) below.
New features of software version 11June2016:
-- Software provides regression with ridge priors, and with LASSO priors, for Bayesian linear models, and for Bayesian infinite-mixture (probits) regression models;
-- Software allows for construction of covariates with truncated linear splines, truncated cubic splines, natural cubic splines, B-splines, and Hardy's multiquadric splines (multivariate and univariate);
-- Software provides Dirichlet-process based ridge regression with heteroscedasticity-consistent covariance estimation of the regression coefficients, along with Vibrations of Effects (VoE) analyses (see paper).
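As an illustration of the truncated linear splines mentioned above, the following Python sketch builds spline covariates from a single variable. It is not the software's own code (the software is menu-driven and requires no programming); the function name truncated_linear_basis and the knot locations are hypothetical choices made only for this example.

```python
def truncated_linear_basis(x_values, knots):
    """Return one row per observation: [x, (x - k1)+, (x - k2)+, ...].

    (x - k)+ equals x - k when x > k and 0 otherwise, so each knot
    lets the fitted regression line change slope at that point.
    """
    rows = []
    for x in x_values:
        row = [x] + [max(x - k, 0.0) for k in knots]
        rows.append(row)
    return rows


# Hypothetical covariate values and knots, for illustration only.
x = [0.0, 1.0, 2.0, 3.0]
basis = truncated_linear_basis(x, knots=[1.0, 2.0])
for row in basis:
    print(row)
```

Each column of the resulting table can then serve as a covariate in a linear model, where a ridge or LASSO prior shrinks the coefficients of knots that are not needed.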
New features since software version 23May2016:
-- Compatible with Windows 7, 8, 8.1, and 10.
-- Power ridge regression and generalized ridge regression models, fit by marginal maximum likelihood.
-- K-means clustering analysis of the data, and of the posterior distribution of model parameters.
-- Improved principal components analysis (as recommended by MATLAB).
- The Bayesian Regression software package currently includes 100 Bayesian models for regression analysis,
including 5 models for Bayesian density estimation.
- The data analysis can provide:
- Prediction analysis.
The predictions of the dependent variable, given chosen covariate (predictor) values,
can be made in terms of the:
- variance, for a variance regression analysis;
- quantiles, for a quantile regression analysis
(e.g., a median regression analysis pertains to a quantile probability of 1/2);
- probability density function or cumulative distribution function,
for density regression analysis;
- survival function, hazard function, or cumulative hazard function, for survival analysis;
- Binary or ordinal regression analysis, including Item Response Theory (IRT) analysis;
- Variable (predictor) selection analysis;
- Regression with splines (thin-plate splines, and/or locally-constant splines);
- Causal analysis;
- Cluster analysis;
- The analysis of censored data (e.g., survival data); and
- The analysis of spatial data
(e.g., via spatial weight covariates or thin plate splines, applied to latitude and longitude data).
- Models include:
- Bayesian nonparametric, infinite-mixture regression models, defined by:
- a probit regression model for the mixture weights (the infinite-probits model);
- a general stick-breaking prior, to define 2-level hierarchical models.
Currently, priors include those defined by a Dirichlet process; the Pitman-Yor (PY) process;
the normalized stable process (a special PY process); the beta process (2-parameter);
the geometric weights prior (a restricted stick-breaking prior);
and the normalized inverse-Gaussian process.
- 2-level and 3-level normal random-effects models (sometimes called HLMs);
- Normal linear models;
- Probit models and logit models, for binary (0,1) and for ordinal dependent variables.
They also include scale-mixture probit models that model the link function as an unknown parameter;
- Models that provide automatic covariate (predictor) selection,
using stochastic-search variable selection (SSVS),
or using fast ridge regression via marginal maximum likelihood estimation;
- Since these regression models are Bayesian (with a proper prior distribution on the regression coefficients),
they can automatically handle covariates (predictors) that have multicollinearity.
- New models will be added to the software over time (suggestions are welcomed).
- More details are provided in the software Help menu.
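To give a feel for the stick-breaking priors listed among the models above, the following Python sketch draws mixture weights by stick-breaking. It is a minimal illustration under stated assumptions, not the software's implementation: the function name, the default parameter values, and the truncation to a finite number of components are all choices made for this example (the models themselves are infinite mixtures). A discount of 0 gives Dirichlet-process weights; a discount between 0 and 1 gives Pitman-Yor weights.

```python
import random


def stick_breaking_weights(n_components, alpha=1.0, discount=0.0, rng=None):
    """Draw truncated stick-breaking weights (illustrative sketch).

    For j = 1, 2, ..., draw v_j ~ Beta(1 - discount, alpha + j * discount)
    and set w_j = v_j * prod_{l < j} (1 - v_l).
    Returns the first n_components weights and the leftover stick length.
    """
    rng = rng or random.Random()
    weights = []
    remaining = 1.0  # length of the stick that has not been broken off yet
    for j in range(1, n_components + 1):
        v = rng.betavariate(1.0 - discount, alpha + j * discount)
        weights.append(remaining * v)
        remaining *= 1.0 - v
    return weights, remaining


# Hypothetical settings: 10 components, Dirichlet process with alpha = 2.
weights, leftover = stick_breaking_weights(10, alpha=2.0, rng=random.Random(1))
print(weights)
print(leftover)
```

The weights and the leftover piece always sum to 1 (up to rounding), so a larger alpha spreads the probability mass over more mixture components.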
- The software can be run almost exclusively by the computer mouse.
No code writing is needed to run a Bayesian analysis.
- Using appropriate menu and/or push-button options of the software, you can easily and quickly:
- Import a data set that is in a comma-delimited (.csv) file format
(variable names in the first row, numerical (non-text) data in all other rows);
- Provide basic summaries of the data set, through various descriptive statistics and graphs;
- Set up the data for a regression analysis, by constructing new variables or modifying existing variables, through:
- Simple transformations of variables, including z-score or Box-Cox transformations,
binary coding (0,1; or -1,1), or
sum of variables;
- The construction of new variables that represent effect sizes;
- The construction of new variables
that represent interactions between covariates,
or incorporate spatial information (e.g., latitude, longitude),
in order to set up a spatial data analysis;
- Dimension (variable) reduction methods, including principal components,
multidimensional scaling, K-means clustering, scaling via true-score test theory,
and propensity scoring to
set up a causal analysis of observational data;
- The setup of a time-series autoregression analysis,
by the construction of lag terms of chosen order;
- The handling of missing values, including:
- Nearest-neighbor hot-deck imputation of missing values;
- The processing of plausible values;
- Simple changes to the data set (e.g., rename, delete, or move variables in the data).
- Select the Bayesian regression model for data analysis,
along with the dependent variable, covariates, and prior distribution.
When necessary, you can also select:
- The grouping/nesting variable (for a 2-level or 3-level model);
- Observation weights, for unequally weighted observations, as in meta-analysis;
- The interval bounds for censored dependent variable observations, as in survival analysis;
- Select the Markov chain Monte Carlo (MCMC) sampling parameters
(number of MCMC samples, burn-in period, thinning interval),
for estimating the posterior distribution of the model;
- Output results of the data analysis under the chosen Bayesian model,
through text and graphical output files, that report:
- The estimates of the posterior distribution of the model parameters;
- The model's predictive fit to the given data set;
- The predictions of the model as a function of
one or more covariates (predictors) you choose;
- The MCMC convergence of all of these estimates, through trace plots
and subsampling methods that calculate
95% Monte Carlo (MC) confidence intervals of the estimates.
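The MCMC sampling parameters and the MC confidence intervals mentioned above can be illustrated with a short Python sketch. It discards a burn-in period, thins the chain, and then computes an approximate 95% Monte Carlo confidence interval for a posterior mean using batch means with a normal approximation. Batch means is a stand-in chosen for this example; the software's own subsampling method may differ. The function names, the synthetic chain, and all numerical settings are hypothetical.

```python
import math
import random


def thin_chain(samples, burn_in, thin):
    """Drop the first burn_in samples, then keep every thin-th sample."""
    return samples[burn_in::thin]


def batch_means_interval(samples, n_batches=20):
    """Approximate 95% MC confidence interval for the mean (batch means)."""
    batch_size = len(samples) // n_batches
    if batch_size < 1:
        raise ValueError("not enough samples for the requested number of batches")
    means = [
        sum(samples[b * batch_size:(b + 1) * batch_size]) / batch_size
        for b in range(n_batches)
    ]
    grand_mean = sum(means) / n_batches
    variance = sum((m - grand_mean) ** 2 for m in means) / (n_batches - 1)
    std_error = math.sqrt(variance / n_batches)
    return grand_mean - 1.96 * std_error, grand_mean + 1.96 * std_error


# Synthetic autocorrelated chain standing in for MCMC output (not real results).
rng = random.Random(0)
chain, value = [], 0.0
for _ in range(5000):
    value = 0.5 * value + rng.gauss(0.0, 1.0)
    chain.append(value)

kept = thin_chain(chain, burn_in=1000, thin=5)
lower, upper = batch_means_interval(kept)
print(len(kept), lower, upper)
```

A narrow interval suggests that enough MCMC samples were drawn for the estimate to be stable; a wide interval suggests running the sampler longer.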
- The Bayesian Regression software is a stand-alone software package.
- The software can run on a 64-bit Windows (PC) computer (also 32-bit for older software versions).
- Download the Bayesian software installation file:
New beta version for 64-bit Computers (23 MB). Includes 100 models.
Version for 64-bit Computers (23 MB). Includes 85 models.
- Click the downloaded file, BayesInstaller_web64bit.exe, to install the software.
This installation automatically includes web-based installation of MATLAB Compiler Runtime.
While installing, please be sure that you select "Add a shortcut to the desktop."
(Also, have an internet connection and disable any firewall/proxy settings.)
Alternative installation instructions. Requires: MATLAB Compiler 64-bit, BayesReg64bit.exe.
Previous Software Releases:
Software versions for either Windows XP, Vista, 7, 8, or 8.1:
30June2015: 64-bit Computers (5 MB). 83 models. (MATLAB Compiler 64-bit, BayesReg64bit.exe)
25March2015: For 64-bit and 32-bit Computers (6.9 MB). 59 models.
Or: MATLAB Compiler 32-bit, BayesReg32bit.exe (Version 25March2015. Includes 59 models).
- The Bayesian Regression software is opened by clicking the icon (file) BayesRegression.exe.
The Help menu gives step-by-step instructions on how to analyze data,
using a model of your choice.
- The Bayesian regression software provides several example data files that can be used
to illustrate the software through data analysis.
- To access the example data files, first click the File menu of the software,
and run the menu option "Create Bayes Data Examples file folder" (you only need to run this once).
Then click the File menu again, to import and open an example data file from this folder.
(One example data file can be downloaded from here).
- The Bayesian Regression software outputs the
results of a data analysis into text (.txt) files
with time-stamped names. They include comma-delimited (.csv) and space-delimited text files
(such as posterior samples *.MC1, residual fit statistics *.RES, and .MODEL files).
- The text output files can be viewed in free Notepad++ or TextPad.
Such software is recommended
because it opens multiple output files in separate tabs.
- The delimited text output files can be analyzed and graphed in:
spreadsheet software (free spreadsheet software available from OpenOffice);
or in the free R software, after importing the comma-delimited output file using the R command:
ImportedData = read.csv(file.choose());
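For users who prefer Python, a comma-delimited output file can be summarized in a similar way. The sketch below is only an illustration under stated assumptions: it assumes the file has column names in its first row and numeric values in all other rows, and the column names (beta0, beta1) and values are made up and written to a temporary file so that the example runs on its own. They are not actual software output.

```python
import csv
import os
import statistics
import tempfile


def summarize_columns(path):
    """Return {column name: (mean, median)} for a numeric .csv file."""
    with open(path, newline="") as handle:
        reader = csv.DictReader(handle)
        columns = {name: [] for name in reader.fieldnames}
        for row in reader:
            for name, text in row.items():
                columns[name].append(float(text))
    return {
        name: (statistics.mean(values), statistics.median(values))
        for name, values in columns.items()
    }


# Write a tiny made-up file so the example is self-contained.
with tempfile.NamedTemporaryFile(
    mode="w", suffix=".csv", delete=False, newline=""
) as tmp:
    tmp.write("beta0,beta1\n1.0,2.0\n2.0,4.0\n3.0,6.0\n")
    example_path = tmp.name

summary = summarize_columns(example_path)
os.remove(example_path)
print(summary)
```

To use it on a real output file, pass that file's path to summarize_columns in place of the temporary file.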
- The Bayesian Regression software can also output data analysis
results into graphs, as figure (*.fig) files.
A figure file can be saved as another file format (e.g., *.eps, *.bmp, *.emf, *.jpg, and *.pdf).