sklearn miscellaneous

Published: 2023-04-08 14:18:18  Author: 鱼市口

StandardScaler in preprocessing

 

Standardize features by removing the mean and scaling to unit variance.
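As a quick check of what "removing the mean and scaling to unit variance" means, the sketch below compares StandardScaler with a manual z-score on a small made-up array. The array values are arbitrary illustration data, not from the original notes.

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

# Arbitrary toy data: 3 samples, 2 features on different scales.
X = np.array([[1.0, 10.0],
              [2.0, 20.0],
              [3.0, 60.0]])

# Standardize with scikit-learn.
X_sklearn = StandardScaler().fit_transform(X)

# Same computation by hand: subtract the per-feature mean and divide by
# the per-feature population standard deviation (ddof=0).
X_manual = (X - X.mean(axis=0)) / X.std(axis=0)

print(np.allclose(X_sklearn, X_manual))
```

Each column of the result has mean 0 and variance 1, regardless of the original scale of the feature.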

 

An instance created with scaler = StandardScaler() provides a .transform method, which can be called once the scaler has been fitted.

with_std : bool, default=True   with_mean : bool, default=True   copy : bool, default=True

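>>> from sklearn.preprocessing import StandardScaler
>>> data = [[0, 0], [0, 0], [1, 1], [1, 1]]  # toy data for illustration
>>> scaler = StandardScaler()
>>> print(scaler.fit(data))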

Attributes ----------------------

scale_ : ndarray of shape (n_features,) or None

Per feature relative scaling of the data to achieve zero mean and unit(1) variance. Generally this is calculated using np.sqrt(var_). If a variance is zero, we can’t achieve unit variance, and the data is left as-is, giving a scaling factor of 1. scale_ is equal to None when with_std=False.
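The zero-variance case described above can be seen in a small sketch. The data here is made up so that the second column is constant.

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

# Toy data: the second column is constant, so its variance is zero.
X = np.array([[1.0, 5.0],
              [2.0, 5.0],
              [3.0, 5.0]])

scaler = StandardScaler().fit(X)

# scale_ equals sqrt(var_) for the first feature; for the constant
# feature the scaling factor falls back to 1.
print(scaler.var_)
print(scaler.scale_)

# The constant column is only centered, so it becomes all zeros
# instead of producing a division by zero.
X_scaled = scaler.transform(X)
print(X_scaled)
```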

New in version 0.17: scale_

mean_ : ndarray of shape (n_features,) or None

The mean value for each feature in the training set. Equal to None when with_mean=False.

var_ : ndarray of shape (n_features,) or None

The variance for each feature in the training set. Used to compute scale_. Equal to None when with_std=False.

n_features_in_ : int

Number of features seen during fit.

New in version 0.24.

feature_names_in_ : ndarray of shape (n_features_in_,)

Names of features seen during fit. Defined only when X has feature names that are all strings.

New in version 1.0.

n_samples_seen_ : int or ndarray of shape (n_features,)

The number of samples processed by the estimator for each feature. If there are no missing samples, n_samples_seen_ will be an integer, otherwise it will be an array of dtype int. If sample_weight is used it will be a float (if no missing data) or an array of dtype float that sums the weights seen so far. It is reset on new calls to fit, but increments across partial_fit calls.
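A minimal sketch of how n_samples_seen_ behaves across partial_fit and fit calls, and with a missing value. The batch sizes and random data are arbitrary choices for illustration.

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)
batch_a = rng.normal(size=(10, 3))  # first batch: 10 samples
batch_b = rng.normal(size=(5, 3))   # second batch: 5 samples

scaler = StandardScaler()

# partial_fit accumulates the count across calls.
scaler.partial_fit(batch_a)
seen_after_a = scaler.n_samples_seen_   # 10
scaler.partial_fit(batch_b)
seen_after_b = scaler.n_samples_seen_   # 15

# fit discards the accumulated state and starts over.
scaler.fit(batch_b)
seen_after_fit = scaler.n_samples_seen_  # 5

# With a NaN in one feature, the count is reported per feature.
batch_nan = batch_b.copy()
batch_nan[0, 1] = np.nan
per_feature = StandardScaler().fit(batch_nan).n_samples_seen_
print(per_feature)  # one count per feature; the middle one is 4
```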

 

sklearn.feature_selection.f_regression

Univariate linear regression tests returning F-statistic and p-values.

Quick linear model for testing the effect of a single regressor, sequentially for many regressors.
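A small sketch of f_regression on synthetic data, where the target is constructed (as an assumption for this example) to depend only on the first feature. It also checks the F-statistic for that feature against the usual relation to the Pearson correlation r with centered data, F = r² / (1 − r²) × (n − 2).

```python
import numpy as np
from sklearn.feature_selection import f_regression

rng = np.random.default_rng(0)
n_samples = 200
X = rng.normal(size=(n_samples, 3))

# Hypothetical target: driven by feature 0 plus noise; features 1 and 2
# are unrelated to y.
y = 3.0 * X[:, 0] + rng.normal(scale=0.5, size=n_samples)

# One F-statistic and one p-value per feature, each tested on its own.
f_stat, p_values = f_regression(X, y)
best = int(np.argmax(f_stat))
print(f_stat, p_values, best)

# Manual F for feature 0 from its correlation with y.
r = np.corrcoef(X[:, 0], y)[0, 1]
f_manual = r**2 / (1.0 - r**2) * (n_samples - 2)
print(np.isclose(f_stat[0], f_manual))
```

Because every feature is tested independently, the scores say nothing about redundancy between features; they are mainly useful as a first-pass filter, for example inside SelectKBest.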