pandas.DataFrame.ewm¶
- DataFrame.ewm(com=None, span=None, halflife=None, alpha=None, min_periods=0, adjust=True, ignore_na=False, axis=0, times=None, method='single')[source]¶
Provide exponentially weighted (EW) calculations.
Exactly one parameter:
com
,span
,halflife
, oralpha
must be provided.- Parameters
- comfloat, optional
Specify decay in terms of center of mass
\(\alpha = 1 / (1 + com)\), for \(com \geq 0\).
- spanfloat, optional
Specify decay in terms of span
\(\alpha = 2 / (span + 1)\), for \(span \geq 1\).
- halflifefloat, str, timedelta, optional
Specify decay in terms of half-life
\(\alpha = 1 - \exp\left(-\ln(2) / halflife\right)\), for \(halflife > 0\).
If
times
is specified, the time unit (str or timedelta) over which an observation decays to half its value. Only applicable tomean()
, and halflife value will not apply to the other functions.New in version 1.1.0.
- alphafloat, optional
Specify smoothing factor \(\alpha\) directly
\(0 < \alpha \leq 1\).
- min_periodsint, default 0
Minimum number of observations in window required to have a value; otherwise, result is
np.nan
.- adjustbool, default True
Divide by decaying adjustment factor in beginning periods to account for imbalance in relative weightings (viewing EWMA as a moving average).
When
adjust=True
(default), the EW function is calculated using weights \(w_i = (1 - \alpha)^i\). For example, the EW moving average of the series [\(x_0, x_1, ..., x_t\)] would be:
\[y_t = \frac{x_t + (1 - \alpha)x_{t-1} + (1 - \alpha)^2 x_{t-2} + ... + (1 - \alpha)^t x_0}{1 + (1 - \alpha) + (1 - \alpha)^2 + ... + (1 - \alpha)^t}\]When
adjust=False
, the exponentially weighted function is calculated recursively:
\[\begin{split}\begin{split} y_0 &= x_0\\ y_t &= (1 - \alpha) y_{t-1} + \alpha x_t, \end{split}\end{split}\]- ignore_nabool, default False
Ignore missing values when calculating weights.
When
ignore_na=False
(default), weights are based on absolute positions. For example, the weights of \(x_0\) and \(x_2\) used in calculating the final weighted average of [\(x_0\), None, \(x_2\)] are \((1-\alpha)^2\) and \(1\) ifadjust=True
, and \((1-\alpha)^2\) and \(\alpha\) ifadjust=False
.When
ignore_na=True
, weights are based on relative positions. For example, the weights of \(x_0\) and \(x_2\) used in calculating the final weighted average of [\(x_0\), None, \(x_2\)] are \(1-\alpha\) and \(1\) ifadjust=True
, and \(1-\alpha\) and \(\alpha\) ifadjust=False
.
- axis{0, 1}, default 0
If
0
or'index'
, calculate across the rows.If
1
or'columns'
, calculate across the columns.- timesstr, np.ndarray, Series, default None
New in version 1.1.0.
Only applicable to
mean()
.Times corresponding to the observations. Must be monotonically increasing and
datetime64[ns]
dtype.If 1-D array like, a sequence with the same shape as the observations.
Deprecated since version 1.4.0: If str, the name of the column in the DataFrame representing the times.
- methodstr {‘single’, ‘table’}, default ‘single’
New in version 1.4.0.
Execute the rolling operation per single column or row (
'single'
) or over the entire object ('table'
).This argument is only implemented when specifying
engine='numba'
in the method call.Only applicable to
mean()
- Returns
ExponentialMovingWindow
subclass
Notes
See Windowing Operations for further usage details and examples.
Examples
>>> df = pd.DataFrame({'B': [0, 1, 2, np.nan, 4]}) >>> df B 0 0.0 1 1.0 2 2.0 3 NaN 4 4.0
>>> df.ewm(com=0.5).mean() B 0 0.000000 1 0.750000 2 1.615385 3 1.615385 4 3.670213 >>> df.ewm(alpha=2 / 3).mean() B 0 0.000000 1 0.750000 2 1.615385 3 1.615385 4 3.670213
adjust
>>> df.ewm(com=0.5, adjust=True).mean() B 0 0.000000 1 0.750000 2 1.615385 3 1.615385 4 3.670213 >>> df.ewm(com=0.5, adjust=False).mean() B 0 0.000000 1 0.666667 2 1.555556 3 1.555556 4 3.650794
ignore_na
>>> df.ewm(com=0.5, ignore_na=True).mean() B 0 0.000000 1 0.750000 2 1.615385 3 1.615385 4 3.225000 >>> df.ewm(com=0.5, ignore_na=False).mean() B 0 0.000000 1 0.750000 2 1.615385 3 1.615385 4 3.670213
times
Exponentially weighted mean with weights calculated with a timedelta
halflife
relative totimes
.>>> times = ['2020-01-01', '2020-01-03', '2020-01-10', '2020-01-15', '2020-01-17'] >>> df.ewm(halflife='4 days', times=pd.DatetimeIndex(times)).mean() B 0 0.000000 1 0.585786 2 1.523889 3 1.523889 4 3.233686