|
Web Page for Software:
Bayesian Regression: Nonparametric and Parametric Models
Version 12 January 2018
by George Karabatsos (home page)
Free stand-alone, menu-driven software for Bayesian regression analysis, for Windows 10 or 7.
Supported by NSF Research Grant SES-1156372.
Publications about the Bayesian Regression software (BibTeX citations):
Karabatsos, G. (2015). A menu-driven software package for Bayesian regression analysis.
ISBA Bulletin, 22(4), 13-16.
Karabatsos, G. (2016, in press). A Menu-Driven Software Package of Bayesian Nonparametric (and Parametric) Mixed Models for Regression Analysis and Density Estimation.
Behavior Research Methods, 49, 335-362.
AERA Prof. Development Course: "Bayesian Nonparametric Regression for Education Research"
COURSE MATERIAL: Course Slides. |
Software Description
|
- The Bayesian Regression software package currently includes 100 Bayesian models for data analysis,
including 5 models for Bayesian density estimation.
- The data analysis can provide:
- Prediction analysis.
The predictions of the dependent variable, given chosen covariate (predictor) values,
can be made in terms of the:
- mean;
- variance, for a variance regression analysis;
- quantiles, for a quantile regression analysis
(e.g., a median regression analysis pertains to a quantile probability of 1/2);
- probability density function or cumulative distribution function,
for density regression analysis;
- survival function, hazard function, or cumulative hazard function, for survival analysis;
- Binary or ordinal regression analysis, including Item Response Theory (IRT) analysis;
- Variable (predictor) selection analysis;
- Regression with splines
(truncated linear or cubic splines, natural cubic splines, locally-constant splines, B-splines,
and univariate or multivariate multiquadric splines or thin-plate splines);
- Causal analysis;
- Meta-analysis;
- Cluster analysis;
- Censored data analysis (e.g., survival data);
- Spatial data analysis
(e.g., via spatial weight covariates or thin-plate splines, applied to latitude and longitude data);
- Vibrations of Effects (VoE) analyses (paper).
- Models include:
- Bayesian nonparametric, infinite-mixture regression models, defined by:
- a probit regression model for the mixture weights (the infinite-probits model);
- a general stick-breaking prior, to define 2-level hierarchical models.
Currently, priors include those defined by the Dirichlet process; the Pitman-Yor (PY) process;
the normalized stable process (a special PY process); the beta process (2-parameter);
the geometric weights prior (a restricted stick-breaking prior);
and the normalized inverse-Gaussian process.
- 2-level and 3-level normal random-effects models (sometimes called hierarchical linear models, or HLMs);
- Normal linear models;
- Probit models and logit models, for binary (0,1) and for ordinal dependent variables.
They also include scale-mixture probit models that model the link function as an unknown parameter;
- Models that provide automatic covariate (predictor) selection, using prior distributions for
stochastic-search variable selection (SSVS), the LASSO, or ridge regression;
- Since these regression models are Bayesian (with a proper prior distribution on the regression coefficients),
they can automatically handle covariates (predictors) that exhibit multicollinearity.
- New models will be added to the software over time (suggestions are welcomed).
- More details are provided in the software Help menu.
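As a rough illustration of what a stick-breaking prior does, the Python sketch below draws truncated mixture weights by repeatedly breaking off a random fraction of the remaining "stick". It is a generic textbook construction, not the software's own implementation. The function name `stick_breaking_weights` and its parameters (`num_weights`, `alpha`, `discount`, `seed`) are hypothetical names chosen for this example. Setting `discount=0` corresponds to the Dirichlet process, and a value between 0 and 1 corresponds to the Pitman-Yor process.

```python
import random


def stick_breaking_weights(num_weights, alpha=1.0, discount=0.0, seed=0):
    """Draw truncated stick-breaking mixture weights (illustrative sketch).

    discount = 0      : Dirichlet process, V_j ~ Beta(1, alpha)
    0 < discount < 1  : Pitman-Yor process,
                        V_j ~ Beta(1 - discount, alpha + j * discount)
    Returns the first `num_weights` weights and the leftover stick length.
    """
    rng = random.Random(seed)
    weights = []
    remaining = 1.0  # length of the stick that has not been broken off yet
    for j in range(1, num_weights + 1):
        v = rng.betavariate(1.0 - discount, alpha + j * discount)
        weights.append(v * remaining)  # break off a fraction v of what is left
        remaining *= 1.0 - v
    return weights, remaining


if __name__ == "__main__":
    dp_weights, leftover = stick_breaking_weights(10, alpha=2.0)
    print("Dirichlet process weights:", [round(w, 3) for w in dp_weights])
    print("Mass left for the remaining components:", round(leftover, 3))
```

The weights plus the leftover mass always sum to 1, which is why a finite truncation can approximate an infinite mixture when the leftover is small.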
|
Software Features
|
- The software can be operated almost entirely with the mouse.
No code writing is needed to run a Bayesian analysis.
- Using appropriate menu and/or push-button options of the software, you can easily and quickly:
- Import a data set that is in comma-delimited (.csv) file format
(variable names in the first row; numerical, non-text data in all other rows);
- Provide basic summaries of the data set, through various descriptive statistics and graphs;
- Set up the data for a regression analysis, by constructing new variables or modifying existing variables,
involving either:
- Simple transformations of variables, including z-score or Box-Cox transformations,
binary coding (0,1; or -1,1), or
sum of variables;
- The construction of new variables that represent effect sizes;
- The construction of new variables that represent interactions between covariates or polynomial terms,
or that incorporate spatial information (e.g., latitude, longitude) to set up a spatial data analysis;
- Dimension (variable) reduction methods, including principal components,
multidimensional scaling, K-means clustering, scaling via true-score test theory,
and propensity scoring to set up a causal analysis of observational data;
- The setup of a time-series autoregression analysis,
by constructing lag terms of a chosen order;
- The handling of missing values, including:
- Nearest-neighbor hot-deck imputation of missing values;
- The processing of plausible values;
- Simple changes to the data set (e.g., rename, delete, or move variables in the data).
- Select the Bayesian regression model for data analysis,
along with the dependent variable, covariates, and prior distribution.
When necessary, you can also select:
- The grouping/nesting variable (for a 2-level or 3-level model);
- Observation weights, for unequally weighted observations, as in meta-analysis;
- The interval bounds for censored dependent variable observations, as in survival analysis;
- Select the Markov chain Monte Carlo (MCMC) sampling parameters
(number of MCMC samples, burn-in period, thinning interval),
for estimating the posterior distribution of the model;
- Output results of the data analysis under the chosen Bayesian model,
through text and graphical output files, that report:
- The estimates of the posterior distribution of the model parameters;
- The model's predictive fit to the given data set;
- The predictions of the model as a function of
one or more covariates (predictors) you choose;
- The MCMC convergence of all of these estimates, through trace plots, CUSUM statistics,
and subsampling methods that calculate the 95% Monte Carlo (MC) confidence intervals of the estimates.
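To make the MCMC sampling parameters in the list above more concrete, the Python sketch below shows how a burn-in period and a thinning interval reduce a chain of draws, and how a 95% Monte Carlo interval for a posterior mean can be approximated. The interval uses batch means, a common textbook method chosen here as a stand-in; the software's own subsampling procedure may differ. All function and parameter names (`keep_samples`, `batch_means_interval`, `num_batches`) are hypothetical.

```python
import math
import random


def keep_samples(chain, burn_in, thin):
    """Drop the first `burn_in` draws, then keep every `thin`-th draw."""
    return chain[burn_in::thin]


def batch_means_interval(samples, num_batches=20, z=1.96):
    """Approximate 95% Monte Carlo interval for the mean, via batch means."""
    batch_size = len(samples) // num_batches
    if batch_size < 1:
        raise ValueError("need at least one sample per batch")
    # Mean of each consecutive, non-overlapping batch of draws.
    means = [
        sum(samples[b * batch_size:(b + 1) * batch_size]) / batch_size
        for b in range(num_batches)
    ]
    grand_mean = sum(means) / num_batches
    var_of_means = sum((m - grand_mean) ** 2 for m in means) / (num_batches - 1)
    half_width = z * math.sqrt(var_of_means / num_batches)
    return grand_mean - half_width, grand_mean + half_width


if __name__ == "__main__":
    # Toy autocorrelated chain standing in for the MCMC draws of one parameter.
    rng = random.Random(0)
    chain, value = [], 5.0
    for _ in range(6000):
        value = 0.8 * value + rng.gauss(0.0, 1.0)
        chain.append(value)
    kept = keep_samples(chain, burn_in=1000, thin=5)
    print("draws kept:", len(kept))
    print("approximate 95% MC interval:", batch_means_interval(kept))
```

A narrow interval suggests the posterior mean estimate has stabilized; a wide one suggests running more MCMC samples.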
|
Software Requirements |
- The Bayesian Regression software is a stand-alone software package.
- The software runs on a 64-bit Windows (PC) computer (older software versions also run on 32-bit Windows).
|
Software Installation (2 steps)
|
- Download the Bayesian software (64-bit) installation file. Software includes 100 models.
- Click the downloaded file, BayesInstaller_web64bit.exe, to install the software.
This installation automatically includes a web-based installation of MATLAB Compiler Runtime.
While installing, please be sure that you select "Add a shortcut to the desktop."
(Also, make sure you have an internet connection, and disable any firewall/proxy settings.)
Alternative installation instructions. Requires: MATLAB Runtime 64-bit, BayesReg64bit.exe.
Previous Software Releases (64-bit and 32-bit versions). |
Running the Software
|
- The Bayesian Regression software is opened by clicking the icon (file) BayesRegression.exe.
The Help menu gives step-by-step instructions on how to analyze data,
using a model of your choice.
- The Bayesian Regression software provides several example data files that can be used
to illustrate the software through data analysis.
- To access the example data files, first click the File menu of the software,
and run the menu option "Create Bayes Data Examples file folder" (you only need to run this once).
Then click the File menu again, to import and open an example data file from this folder.
(One example data file can be downloaded from here).
|
Output Files
|
- The Bayesian Regression software outputs the results of a data analysis into text (.txt) files
with time-stamped names. They include comma-delimited (.csv) and space-delimited text files
(such as posterior samples *.MC1, residual fit statistics *.RES, and *.MODEL files).
- The text output files can be viewed in the free Notepad++ editor or in TextPad.
Such software is recommended because it opens multiple output files in separate tabs.
- The delimited text output files can be analyzed and graphed in:
spreadsheet software (free spreadsheet software available from OpenOffice);
or in the free R software, after importing the comma-delimited output file using the R command:
ImportedData = read.csv(file.choose());
- The Bayesian Regression software can also output data analysis
results into graphs, as figure (*.fig) files.
A figure file can be saved as another file format (e.g., *.eps, *.bmp, *.emf, *.jpg, and *.pdf).
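For readers who prefer Python to the R command shown in the list above, the sketch below reads a comma-delimited output file and reports the mean and standard deviation of each column. It assumes that the first row holds column names and that every other cell is numeric; the actual layout of the software's output files should be checked before relying on it. The function name `summarize_csv` is hypothetical.

```python
import csv
import statistics


def summarize_csv(path):
    """Return {column name: (mean, standard deviation)} for a numeric CSV file.

    Assumes a header row of column names followed by numeric rows.
    """
    with open(path, newline="") as handle:
        reader = csv.reader(handle)
        header = [name.strip() for name in next(reader)]
        columns = {name: [] for name in header}
        for row in reader:
            if not row:
                continue  # skip blank lines
            for name, cell in zip(header, row):
                columns[name].append(float(cell))
    summary = {}
    for name, values in columns.items():
        if not values:
            continue
        spread = statistics.stdev(values) if len(values) > 1 else 0.0
        summary[name] = (statistics.mean(values), spread)
    return summary
```

A call such as `summarize_csv("my_output.csv")`, where the file name is a placeholder for one of the time-stamped output files, returns a dictionary that can be printed or passed on to plotting code.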
|