ValueError: Input contains NaN, infinity or a value too large for dtype('float64') in Python
Dung Do Tien
Sep 04 2021
Hi guys. In Python, I want to train a model on data stored in a pandas DataFrame.
When I standardize the data with scikit-learn's StandardScaler, the following error can occur.
from sklearn.preprocessing import StandardScaler
# Training data (pandas.DataFrame type)
X = training_data()
# Standardization
sc = StandardScaler()
sc.fit(X)
But the call to sc.fit(X) throws the following exception:
ValueError: Input contains NaN, infinity or a value too large for dtype('float64').
How can I solve it?
Thanks for any suggestions.
1 answer found.
Nguyen Truong Giang Sep 04 2021
To avoid this, you need to remove NaN and infinite values from the input data before fitting the scaler.
For example, you can remove every column of X that contains at least one NaN with the code below.
import numpy as np
# Remove columns containing NaN from X (drop returns a new DataFrame, so reassign it)
X = X.drop(X.columns[np.isnan(X).any()], axis=1)
Description of each function
np.isnan(X): Returns a boolean matrix that is True for NaN elements and False for all other elements
np.isnan(X).any(): Returns True for each column containing NaN and False for the other columns
X.columns[np.isnan(X).any()]: Returns the names of the columns containing NaN
X.drop('col', axis=1): Removes the column named col from X
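The error message also mentions infinity, which a NaN-only check does not catch. The following is a minimal sketch of one way to handle both cases at once, assuming all columns are numeric. The helper name drop_non_finite_columns and the sample DataFrame are hypothetical and only stand in for your own training data.

```python
import numpy as np
import pandas as pd


def drop_non_finite_columns(X: pd.DataFrame) -> pd.DataFrame:
    """Return a copy of X without the columns that hold NaN or +/-infinity.

    Hypothetical helper for illustration; assumes every column is numeric.
    """
    # Turn +/-inf into NaN so that a single isna() check covers both cases.
    cleaned = X.replace([np.inf, -np.inf], np.nan)
    # Boolean Series indexed by column name: True where a column has a missing value.
    has_missing = cleaned.isna().any()
    # Keep only the columns that have no missing values.
    return cleaned.loc[:, ~has_missing]


# Hypothetical sample data standing in for training_data()
X = pd.DataFrame({
    "a": [1.0, 2.0, 3.0],
    "b": [4.0, np.nan, 6.0],   # contains NaN
    "c": [7.0, np.inf, 9.0],   # contains infinity
})

X_clean = drop_non_finite_columns(X)
print(list(X_clean.columns))
```

The resulting X_clean can then be passed to sc.fit(X_clean). If you would rather keep all columns and discard the affected rows instead, calling dropna() on the DataFrame after the replace step is a possible alternative.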