metaclean3.features

Module Contents

Functions

scale_values(data)

Scales each column of the given pandas.DataFrame to the range [0, 1).

get_bin_density(data, bins[, dens_agg_type, n_cores, ...])

Calculates a density value for each bin.

get_bin_moments(data, bins[, mmt])

Calculates the `mmt`th moment of each bin.

Attributes

G_RAP_GPU

metaclean3.features.G_RAP_GPU
metaclean3.features.scale_values(data: pandas.DataFrame)

Scales each column of the given pandas.DataFrame to the range [0, 1).

Args:

data (pandas.DataFrame): A numeric (int/float) matrix.

Returns:

pandas.DataFrame: The scaled version of the given matrix.
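
The column-wise scaling described above can be sketched as follows. This is a minimal illustration, not the library's implementation: the helper name `scale_to_unit_interval`, the min-max formula, the handling of constant columns, and the way the upper bound is kept strictly below 1 are all assumptions.

```python
import numpy as np
import pandas as pd


def scale_to_unit_interval(data: pd.DataFrame) -> pd.DataFrame:
    """Hypothetical helper: min-max scale each column into [0, 1)."""
    col_min = data.min()
    col_range = data.max() - col_min
    # Assumption: constant columns (zero range) are mapped to 0
    # instead of triggering a division by zero.
    col_range = col_range.replace(0, 1)
    scaled = (data - col_min) / col_range
    # Shrink by one machine epsilon so each column maximum lands just below 1.
    return scaled * (1.0 - np.finfo(float).eps)


example = pd.DataFrame({"a": [1, 2, 3], "b": [10.0, 10.0, 10.0]})
scaled_example = scale_to_unit_interval(example)
```

Each column minimum maps to 0 and each column maximum maps to a value just under 1, which matches the half-open interval in the description.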

metaclean3.features.get_bin_density(data: numpy.ndarray | pandas.DataFrame, bins: list | numpy.ndarray | pandas.Series, dens_agg_type: str = 'max', n_cores: int = -1, p: int = 2, eps: float = 0.1, dens_k_dtm: int = 15)

Calculates a density value for each bin.

Args:

data (numpy.ndarray | pandas.DataFrame): Numeric matrix (2D array).

bins (list | numpy.ndarray | pandas.Series): List containing the bin value for each data row.

dens_agg_type (str, optional): Aggregation method per bin. See pandas.DataFrame.agg(). Defaults to 'max'.

n_cores (int, optional): The number of cores to use while calculating the density feature. Set to -1 to use all cores. Defaults to -1.

p (int, optional): See scipy.spatial.cKDTree.query(). Defaults to 2.

eps (float, optional): See scipy.spatial.cKDTree.query(). Defaults to 0.1.

dens_k_dtm (int, optional): See scipy.spatial.cKDTree.query(). Defaults to 15.

Returns:

numpy.ndarray: A 1D array containing a density value for each bin.
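
Because the parameters above point to scipy.spatial.cKDTree.query() and pandas.DataFrame.agg(), one plausible reading is a nearest-neighbour density per row that is then aggregated per bin. The sketch below illustrates that idea only. The helper name `knn_density_per_bin`, the density formula (inverse of the mean neighbour distance), and the mapping of `dens_k_dtm` to `k` and `n_cores` to `workers` are assumptions, not the documented behaviour of get_bin_density.

```python
import numpy as np
import pandas as pd
from scipy.spatial import cKDTree


def knn_density_per_bin(data, bins, agg="max", k=15, p=2, eps=0.1, workers=-1):
    """Hypothetical helper: k-nearest-neighbour density, aggregated per bin."""
    points = np.asarray(data, dtype=float)
    k = min(k, len(points))  # cannot ask for more neighbours than rows
    tree = cKDTree(points)
    # Distances from every row to its k nearest rows (each row is its own
    # nearest neighbour at distance 0). eps > 0 permits approximate search.
    dists, _ = tree.query(points, k=k, eps=eps, p=p, workers=workers)
    dists = dists.reshape(len(points), -1)  # keep 2D even when k == 1
    # Assumed density definition: small mean distance means high density.
    row_density = 1.0 / (1.0 + dists.mean(axis=1))
    # One value per bin, ordered by sorted bin label.
    return pd.Series(row_density).groupby(np.asarray(bins)).agg(agg).to_numpy()


points = np.array(
    [[0.0, 0.0], [0.1, 0.0], [0.0, 0.1],        # tight cluster, bin 0
     [10.0, 10.0], [12.0, 10.0], [10.0, 12.0]]  # spread-out cluster, bin 1
)
bin_labels = [0, 0, 0, 1, 1, 1]
densities = knn_density_per_bin(points, bin_labels, agg="max", k=3)
```

In this toy input the tightly packed bin receives a higher density value than the spread-out bin.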

metaclean3.features.get_bin_moments(data: numpy.ndarray | pandas.DataFrame, bins: list | numpy.ndarray | pandas.Series, mmt: int = 2)

Calculates the `mmt`th moment of each bin.

Args:

data (numpy.ndarray | pandas.DataFrame): Numeric matrix (2D array). If data has more than one column, the sum of each row is used.

bins (list | numpy.ndarray | pandas.Series): List containing the bin value for each data row.

mmt (int, optional): The order of the moment to compute. Defaults to 2.

Returns:

numpy.ndarray: A 1D array containing the `mmt`th moment for each bin.
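
The row-sum-then-moment behaviour described above can be sketched as follows. This is an illustration under assumptions: the helper name `central_moment_per_bin` is hypothetical, and the use of the central moment (taken about each bin's mean) is a guess, since the documentation does not say which kind of moment is computed.

```python
import numpy as np
import pandas as pd


def central_moment_per_bin(data, bins, mmt=2):
    """Hypothetical helper: mmt-th central moment of the row sums in each bin."""
    values = np.asarray(data, dtype=float)
    if values.ndim == 2:
        # Collapse multiple columns into one value per row, as documented.
        values = values.sum(axis=1)
    grouped = pd.Series(values).groupby(np.asarray(bins))
    # Assumption: central moment, i.e. mean of (x - mean(x)) ** mmt per bin.
    moments = grouped.apply(lambda v: ((v - v.mean()) ** mmt).mean())
    return moments.to_numpy()  # ordered by sorted bin label


matrix = np.array([[1, 1], [2, 2], [3, 3], [5, 5]])
bin_labels = [0, 0, 0, 1]
moments = central_moment_per_bin(matrix, bin_labels, mmt=2)
```

With `mmt=2` this reduces to the population variance of the row sums within each bin.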