DataFrameGroupBy.
transform
Call function producing a like-indexed DataFrame on each group and return a DataFrame having the same indexes as the original object filled with the transformed values
Function to apply to each group.
Can also accept a Numba JIT function with engine='numba' specified.
engine='numba'
If the 'numba' engine is chosen, the function must be a user defined function with values and index as the first and second arguments respectively in the function signature. Each group’s index will be passed to the user defined function and optionally available for use.
'numba'
values
index
Changed in version 1.1.0.
Positional arguments to pass to func
'cython' : Runs the function through C-extensions from cython.
'cython'
'numba' : Runs the function through JIT compiled code from numba.
None : Defaults to 'cython' or globally setting compute.use_numba
None
compute.use_numba
New in version 1.1.0.
For 'cython' engine, there are no accepted engine_kwargs
engine_kwargs
For 'numba' engine, the engine can accept nopython, nogil and parallel dictionary keys. The values must either be True or False. The default engine_kwargs for the 'numba' engine is {'nopython': True, 'nogil': False, 'parallel': False} and will be applied to the function
nopython
nogil
parallel
True
False
{'nopython': True, 'nogil': False, 'parallel': False}
Keyword arguments to be passed into func.
See also
DataFrame.groupby.apply
DataFrame.groupby.aggregate
DataFrame.transform
Notes
Each group is endowed the attribute ‘name’ in case you need to know which group you are working on.
The current implementation imposes three requirements on f:
f must return a value that either has the same shape as the input subframe or can be broadcast to the shape of the input subframe. For example, if f returns a scalar it will be broadcast to have the same shape as the input subframe.
if this is a DataFrame, f must support application column-by-column in the subframe. If f also supports application to the entire subframe, then a fast path is used starting from the second chunk.
f must not mutate groups. Mutation is not supported and may produce unexpected results.
When using engine='numba', there will be no “fall back” behavior internally. The group data and group index will be passed as numpy arrays to the JITed user defined function, and no alternative execution attempts will be tried.
Examples
>>> df = pd.DataFrame({'A' : ['foo', 'bar', 'foo', 'bar', ... 'foo', 'bar'], ... 'B' : ['one', 'one', 'two', 'three', ... 'two', 'two'], ... 'C' : [1, 5, 5, 2, 5, 5], ... 'D' : [2.0, 5., 8., 1., 2., 9.]}) >>> grouped = df.groupby('A') >>> grouped.transform(lambda x: (x - x.mean()) / x.std()) C D 0 -1.154701 -0.577350 1 0.577350 0.000000 2 0.577350 1.154701 3 -1.154701 -1.000000 4 0.577350 -0.577350 5 0.577350 1.000000
Broadcast result of the transformation
>>> grouped.transform(lambda x: x.max() - x.min()) C D 0 4 6.0 1 3 8.0 2 4 6.0 3 3 8.0 4 4 6.0 5 3 8.0