LOTUS Regression
0.8.0
  • Technical Note on the Baseline Regression Implementation
  • Trends with the GOZCARDS Dataset
  • Seasonal trends with the GOZCARDS Dataset
  • Trends with the SAGE II / OSIRIS / OMPS-LP Dataset
  • Predictors
  • Extra Predictors
  • Running the Regression for a Single Bin
  • Changing the Linear Components
  • API Reference
LOTUS Regression
  • »
  • Trends with the GOZCARDS Dataset
  • View page source

Trends with the GOZCARDS DatasetΒΆ

Here we calculate trends using the GOZCARDS dataset by regressing to the VMR monthly zonal means using seasonal terms in our predictors.

[2]:
import xarray as xr
import numpy as np
from LOTUS_regression.regression import regress_all_bins
from LOTUS_regression.predictors.seasonal import add_seasonal_components
from LOTUS_regression.predictors import load_data

The GOZCARDS data is in multiple NetCDF4 files. Load them all in and concatenate on the time dimension. Also only take values in the latitude range -60 to 60.

[3]:
GOZCARDS_FILES = r'/media/users/data/LOTUS/GOZCARDS/*.nc4'

data = xr.decode_cf(xr.open_mfdataset(GOZCARDS_FILES, concat_dim='time', group='Merged', engine='netcdf4'))

data = data.loc[dict(lat=slice(-60, 60))]

print(data)
<xarray.Dataset>
Dimensions:               (data_source: 6, lat: 12, lev: 25, overlap: 6, time: 384)
Coordinates:
  * lat                   (lat) float32 -55.0 -45.0 -35.0 ... 35.0 45.0 55.0
  * lev                   (lev) float32 1e+03 681.3 464.2 ... 0.2154 0.1468 0.1
  * time                  (time) datetime64[ns] 1979-01-15 ... 2012-12-15
  * data_source           (data_source) int32 1 2 3 4 5 6
  * overlap               (overlap) int32 1 2 3 4 5 6
Data variables: (12/13)
    data_source_name      (time, data_source) |S20 dask.array<chunksize=(12, 6), meta=np.ndarray>
    overlap_start_time    (time, overlap) datetime64[ns] dask.array<chunksize=(12, 6), meta=np.ndarray>
    overlap_end_time      (time, overlap) datetime64[ns] dask.array<chunksize=(12, 6), meta=np.ndarray>
    overlap_used_source   (time, overlap, data_source) int8 dask.array<chunksize=(12, 6, 6), meta=np.ndarray>
    overlap_source_total  (time, overlap, data_source, lev, lat) float64 dask.array<chunksize=(12, 6, 6, 25, 12), meta=np.ndarray>
    average               (time, lev, lat) float32 dask.array<chunksize=(12, 25, 12), meta=np.ndarray>
    ...                    ...
    std_dev               (time, lev, lat) float32 dask.array<chunksize=(12, 25, 12), meta=np.ndarray>
    std_error             (time, lev, lat) float32 dask.array<chunksize=(12, 25, 12), meta=np.ndarray>
    minimum               (time, lev, lat) float32 dask.array<chunksize=(12, 25, 12), meta=np.ndarray>
    maximum               (time, lev, lat) float32 dask.array<chunksize=(12, 25, 12), meta=np.ndarray>
    offset                (time, data_source, lev, lat) float32 dask.array<chunksize=(12, 6, 25, 12), meta=np.ndarray>
    offset_std_error      (time, data_source, lev, lat) float32 dask.array<chunksize=(12, 6, 25, 12), meta=np.ndarray>
Attributes:
    OriginalFileVersion:           1.01
    OriginalFileProcessingCenter:  Jet Propulsion Laboratory / California Ins...
    InputOriginalFile:             All available original source data for the...
    DataQuality:                   All data screening according to the specif...
    DataSource:                    SAGE-I v5.9_rev, SAGE-II v6.2, HALOE v19, ...
    DataSubset:                    Full Dataset

Load some standard predictors and add a constant

[4]:
predictors = load_data('pred_baseline_pwlt.csv')

print(predictors.columns)
Index(['enso', 'solar', 'qboA', 'qboB', 'aod', 'linear_pre', 'linear_post',
       'constant'],
      dtype='object')

Currently our predictors have no seasonal dependence. Add in some seasonal terms with different numbers of Fourier components.

[5]:
predictors = add_seasonal_components(predictors, {'constant': 4, 'linear_pre': 2, 'linear_post': 2, 'qboA': 2, 'qboB': 2})

print(predictors[:5])
                enso     solar      qboA      qboB       aod  linear_pre  \
time
1979-01-01  0.499274  1.568564 -0.991782 -0.925437 -0.395539   -1.800000
1979-02-01  0.281624  1.622217 -1.024426 -1.028774 -0.396092   -1.791667
1979-03-01 -0.070762  1.312212 -1.147750 -0.659631 -0.397189   -1.783333
1979-04-01  0.219438  1.127112 -1.138165 -0.399319 -0.399997   -1.775000
1979-05-01  0.291988  1.002103 -1.205567  0.058949 -0.403881   -1.766667

            linear_post  constant  qboA_sin0  qboA_cos0  ...  \
time                                                     ...
1979-01-01          0.0       1.0  -0.000000  -0.991782  ...
1979-02-01          0.0       1.0  -0.520774  -0.882181  ...
1979-03-01          0.0       1.0  -0.974957  -0.605631  ...
1979-04-01          0.0       1.0  -1.137875  -0.025696  ...
1979-05-01          0.0       1.0  -1.061723   0.571085  ...

            linear_post_sin1  linear_post_cos1  constant_sin0  constant_cos0  \
time
1979-01-01               0.0               0.0       0.000000       1.000000
1979-02-01               0.0               0.0       0.508356       0.861147
1979-03-01               0.0              -0.0       0.849450       0.527668
1979-04-01               0.0              -0.0       0.999745       0.022576
1979-05-01              -0.0              -0.0       0.880683      -0.473706

            constant_sin1  constant_cos1  constant_sin2  constant_cos2  \
time
1979-01-01       0.000000       1.000000       0.000000       1.000000
1979-02-01       0.875539       0.483147       0.999579      -0.029025
1979-03-01       0.896456      -0.443132       0.096613      -0.995322
1979-04-01       0.045141      -0.998981      -0.997707      -0.067683
1979-05-01      -0.834370      -0.551205      -0.090190       0.995925

            constant_sin3  constant_cos3
time
1979-01-01       0.000000       1.000000
1979-02-01       0.846029      -0.533137
1979-03-01      -0.794497      -0.607268
1979-04-01      -0.090190       0.995925
1979-05-01       0.919817      -0.392347

[5 rows x 32 columns]

Perform the regression and convert the coefficients to percent anomaly.

[6]:
results = regress_all_bins(predictors, data['average'], tolerance=0.1)

# Convert to ~ percent
results /= data['average'].mean(dim='time')
results *= 100

print(results)
<xarray.Dataset>
Dimensions:               (lat: 12, lev: 25)
Coordinates:
  * lev                   (lev) float32 1e+03 681.3 464.2 ... 0.2154 0.1468 0.1
  * lat                   (lat) float32 -55.0 -45.0 -35.0 ... 35.0 45.0 55.0
Data variables: (12/64)
    enso                  (lev, lat) float64 dask.array<chunksize=(25, 12), meta=np.ndarray>
    enso_std              (lev, lat) float64 dask.array<chunksize=(25, 12), meta=np.ndarray>
    solar                 (lev, lat) float64 dask.array<chunksize=(25, 12), meta=np.ndarray>
    solar_std             (lev, lat) float64 dask.array<chunksize=(25, 12), meta=np.ndarray>
    qboA                  (lev, lat) float64 dask.array<chunksize=(25, 12), meta=np.ndarray>
    qboA_std              (lev, lat) float64 dask.array<chunksize=(25, 12), meta=np.ndarray>
    ...                    ...
    constant_cos2         (lev, lat) float64 dask.array<chunksize=(25, 12), meta=np.ndarray>
    constant_cos2_std     (lev, lat) float64 dask.array<chunksize=(25, 12), meta=np.ndarray>
    constant_sin3         (lev, lat) float64 dask.array<chunksize=(25, 12), meta=np.ndarray>
    constant_sin3_std     (lev, lat) float64 dask.array<chunksize=(25, 12), meta=np.ndarray>
    constant_cos3         (lev, lat) float64 dask.array<chunksize=(25, 12), meta=np.ndarray>
    constant_cos3_std     (lev, lat) float64 dask.array<chunksize=(25, 12), meta=np.ndarray>
[7]:
import LOTUS_regression.plotting.trends as trends
trends.pre_post_with_confidence(results, x='lat', y='lev', ylim=(100, 0.5), log_y=True, figsize=(16, 6),
                                x_label='Latitude [$^\circ$]', y_label='Pressure [hPa]', pre_title='Pre 1997',
                                post_title='Post 1997')
../_images/examples_GOZCARDS_Trends_11_0.png
Next Previous

© Copyright 2017, USask ARG and the LOTUS Group.

Built with Sphinx using a theme provided by Read the Docs.