Regression
- LOTUS_regression.regression.mzm_regression(X, Y, sigma=None, tolerance=0.01, max_iter=50, do_autocorrelation=True, do_heteroscedasticity=False, extra_heteroscedasticity=None, heteroscedasticity_merged_flag=None, seasonal_harmonics=(3, 4, 6, 12), constrain_ilt_gap=False, ilt_predictor_index_dict=None)
Performs the regression for a single bin.
- Parameters
X (np.ndarray) – (nsamples, npredictors) Array of predictor basis functions
Y (np.array) – (nsamples) Observations
sigma (np.array) – (nsamples) Square root of the diagonal elements of the covariance matrix for Y
tolerance (float, optional) – Iterations stop when the relative difference in the AR1 coefficient is less than this threshold. Default 1e-2
max_iter (int, optional) – Maximum number of iterations to perform. Default 50
do_autocorrelation (bool, optional) – If True, do the AR1 autocorrelation correction on the covariance matrix. Default True.
do_heteroscedasticity (bool, optional) – If True, do the heteroscedasticity correction on the covariance matrix. Default False.
extra_heteroscedasticity (np.ndarray, optional) – (nsamples, nextrapredictors) Extra predictor functions to use in the heteroscedasticity correction.
heteroscedasticity_merged_flag (np.ndarray, optional) – (nsamples) A flag marking time periods that should be treated independently in the heteroscedasticity correction. For example, [0, 0, 0, 0, 2, 2, 2, 1, 1, 1] creates 3 independent time periods in which the heteroscedasticity correction is applied.
seasonal_harmonics (Iterable[float], optional. Default (3, 4, 6, 12)) – The monthly harmonics to use in the heteroscedasticity correction.
constrain_ilt_gap (bool, optional. Default False.) – If True, a constraint is added so that the ILT terms in the gap period enforce continuity. This must be set in conjunction with ilt_predictor_index_dict.
ilt_predictor_index_dict (dict, optional. Default None.) – If using constrain_ilt_gap, this must be a dictionary {predictor_name: index_in_X} which contains the indices of the predictors ‘gap_cons’, ‘pre_const’, ‘post_const’, ‘gap_linear’, ‘linear_pre’, ‘linear_post’
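When X is taken from a pandas DataFrame of predictors, the {predictor_name: index_in_X} mapping can be read off the column positions. The following is a minimal sketch under that assumption; the helper build_ilt_index_dict and the example DataFrame are hypothetical and not part of LOTUS_regression, and the predictor names are copied as listed above.

```python
import numpy as np
import pandas as pd

# Predictor names exactly as listed in the parameter description above.
ILT_PREDICTOR_NAMES = ['gap_cons', 'pre_const', 'post_const',
                       'gap_linear', 'linear_pre', 'linear_post']


def build_ilt_index_dict(predictors):
    """Hypothetical helper: map each ILT predictor name to its column index."""
    return {name: predictors.columns.get_loc(name) for name in ILT_PREDICTOR_NAMES}


# Illustrative predictors: one extra column followed by the six ILT terms.
example_predictors = pd.DataFrame(
    np.zeros((4, 7)),
    columns=['constant'] + ILT_PREDICTOR_NAMES,
)

# Column order of X matches the DataFrame, so the indices stay valid.
X = example_predictors.to_numpy()
ilt_predictor_index_dict = build_ilt_index_dict(example_predictors)
```

The resulting dictionary would then be passed together with constrain_ilt_gap=True. get_loc raises a KeyError if one of the names is missing, which surfaces a misconfigured predictor file early.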
- Returns
results – a dictionary of outputs with keys:
gls_results
The raw regression output. This is an instance of RegressionResults which is documented at http://www.statsmodels.org/dev/generated/statsmodels.regression.linear_model.RegressionResults.html
residual
Residuals of the fit in the original coordinate system.
transformed_residuals
Residuals of the fit in the GLS transformed coordinates.
autocorrelation
The AR1 correlation constant.
numiter
Number of iterative steps performed.
covariance
Calculated covariance of Y that is input to the GLS model.
- Return type
dict
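To illustrate how tolerance and max_iter interact with the AR1 correction, here is a minimal NumPy sketch of an iterative generalized least squares fit with an AR1 error model. It is not the implementation used by mzm_regression (which returns a statsmodels RegressionResults object); the function ar1_gls_sketch, the synthetic data, and the exact convergence test are assumptions made for illustration.

```python
import numpy as np


def ar1_gls_sketch(X, Y, tolerance=0.01, max_iter=50):
    """Illustrative iterative GLS with an AR1 error model (not the library code)."""
    n = len(Y)
    # |i - j| for every pair of samples, used to build the AR1 correlation matrix.
    lags = np.abs(np.subtract.outer(np.arange(n), np.arange(n)))
    rho = 0.0  # start from uncorrelated errors (ordinary least squares)
    for numiter in range(1, max_iter + 1):
        cov = rho ** lags  # element (i, j) is rho ** |i - j|
        cov_inv = np.linalg.inv(cov)
        # GLS estimate: (X^T C^-1 X)^-1 X^T C^-1 Y
        beta = np.linalg.solve(X.T @ cov_inv @ X, X.T @ cov_inv @ Y)
        residual = Y - X @ beta
        # Lag-1 autocorrelation of the residuals gives the updated AR1 coefficient.
        new_rho = (residual[1:] @ residual[:-1]) / (residual @ residual)
        # Stop when the relative change in the AR1 coefficient is small enough.
        converged = abs(new_rho - rho) <= tolerance * abs(new_rho)
        rho = new_rho
        if converged:
            break
    return {'beta': beta, 'residual': residual,
            'autocorrelation': rho, 'numiter': numiter}


# Synthetic monthly series: constant + linear trend + AR1 noise (all assumed values).
rng = np.random.default_rng(0)
n = 240
t = np.arange(n) / 12.0  # time in years
X = np.column_stack([np.ones(n), t])
noise = np.zeros(n)
for i in range(1, n):
    noise[i] = 0.6 * noise[i - 1] + rng.normal(scale=0.5)
Y = 2.0 + 0.5 * t + noise

result = ar1_gls_sketch(X, Y)
```

In this sketch the keys autocorrelation, residual, and numiter mirror the names in the results dictionary described above, but the values are only comparable in spirit.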
- LOTUS_regression.regression.regress_all_bins(predictors, mzm_data, time_field='time', debug=False, sigma=None, post_fit_trend_start=None, include_monthly_fits=False, return_raw_results=False, constrain_ilt_gap=False, **kwargs)
Performs the regression for a dataset in all bins.
- Parameters
predictors (pd.DataFrame) – DataFrame of predictors to use in the regression. Index should be a time field.
mzm_data (xr.DataArray) – DataArray containing the monthly zonal mean data in a variety of bins. The data should be three dimensional with one dimension representing time. The other two dimensions are typically latitude and a vertical coordinate.
time_field (str, optional. Default 'time') – Name of the time field in the mzm_data structure.
sigma (xr.DataArray, optional. Default None) – If not None then the regression is weighted as if sigma is the standard deviation of mzm_data. Should be in the same format as mzm_data.
post_fit_trend_start (datetimelike, optional. Default None) – If set to a datetime-like object (example: ‘2000-01-01’), a linear trend is post fit to the residuals with the specified start date. If this is set, you should not include a linear term in the predictors or the results will not be valid.
constrain_ilt_gap (bool, optional. Default False) – If True, a constraint is added to the regression so that the fit terms in the gap period maintain continuity when doing ILT trends. The predictors must have keys ‘gap_cons’, ‘post_const’, ‘pre_const’, ‘linear_pre’, ‘linear_post’, and ‘gap_linear’ for this option to work. This is the standard case when using the ‘predictors_baseline_ilt_linear_gap’ file.
kwargs – Other arguments passed to mzm_regression
- Returns
- Return type
xr.Dataset
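The input layout described above can be sketched with synthetic data. The example below builds a predictors DataFrame indexed by time and a three-dimensional DataArray with a matching time coordinate. The column names 'constant' and 'linear', the dimension names 'lat' and 'alt', and all numeric values are assumptions for illustration; run_regression is a hypothetical wrapper that only forwards the documented arguments.

```python
import numpy as np
import pandas as pd
import xarray as xr


def make_synthetic_inputs(n_months=120, seed=0):
    """Build illustrative predictors and monthly zonal mean data."""
    rng = np.random.default_rng(seed)
    time = pd.date_range('2000-01-01', periods=n_months, freq='MS')
    years = np.arange(n_months) / 12.0

    # Predictors: index is the time field, one column per basis function.
    predictors = pd.DataFrame(
        {'constant': np.ones(n_months), 'linear': years},
        index=time,
    )

    # Data: time plus two bin dimensions (latitude and a vertical coordinate).
    lat = np.array([-45.0, 0.0, 45.0])
    alt = np.array([20.0, 30.0])
    values = (5.0 + 0.1 * years[:, None, None]
              + rng.normal(scale=0.2, size=(n_months, lat.size, alt.size)))
    mzm_data = xr.DataArray(
        values,
        coords={'time': time, 'lat': lat, 'alt': alt},
        dims=('time', 'lat', 'alt'),
    )
    return predictors, mzm_data


def run_regression(predictors, mzm_data):
    """Hypothetical wrapper; imports lazily so the code above runs without the package."""
    from LOTUS_regression.regression import regress_all_bins
    return regress_all_bins(predictors, mzm_data, time_field='time')


predictors, mzm_data = make_synthetic_inputs()
```

With the package installed, run_regression(predictors, mzm_data) would return the xr.Dataset described above. Additional keyword arguments such as tolerance or do_heteroscedasticity are forwarded to mzm_regression through kwargs.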