org.sonar.l10n.py.rules.python.S6741.html
This rule raises an issue when the pandas.DataFrame.values attribute is used instead of the pandas.DataFrame.to_numpy() method.
Why is this an issue?
The values attribute and the to_numpy() method in pandas both return a NumPy representation of the DataFrame. However, the to_numpy() method is recommended over the values attribute for several reasons:
- Future compatibility: The values attribute is considered a legacy feature, while to_numpy() is the recommended way to extract data and is considered more future-proof.
- Data type consistency: If the DataFrame has columns with different data types, NumPy will choose a common data type that can hold all the data. This may lead to loss of information, unexpected type conversions, or increased memory usage. The to_numpy() method lets you select the common type manually by passing the dtype argument.
- View vs. copy: The values attribute can return a view or a copy of the data depending on whether the data needs to be transposed. This can lead to confusion when modifying the extracted data. The to_numpy() method, on the other hand, has a copy argument that can force it to always return a new NumPy array, ensuring that any changes you make won't affect the original DataFrame.
- Missing values control: The to_numpy() method lets you specify, through its na_value argument, the value used for missing values in the DataFrame, while the values attribute will always use numpy.nan for missing values.
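As a minimal sketch of the options listed above, the following calls to_numpy() with the dtype, na_value, and copy arguments. The small numeric DataFrame and its column names (a, b) are invented for illustration and are not part of the rule.

```python
import numpy as np
import pandas as pd

# Hypothetical data: an integer column and a float column with one missing value.
df = pd.DataFrame({
    "a": [1, 2, 3],
    "b": [0.5, np.nan, 2.5],
})

# dtype: pick the common type explicitly instead of relying on the default.
as_float32 = df.to_numpy(dtype="float32")

# na_value: replace missing values in the resulting array.
filled = df.to_numpy(na_value=0.0)

# copy=True: request a new array, so writing to it leaves the DataFrame untouched.
copied = df.to_numpy(copy=True)
copied[0, 0] = 99.0
```

After running this, df still holds 1 in its first row of column a, since only the copied array was modified.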
How to fix it
Use the to_numpy() method instead of the values attribute to get a NumPy representation of the DataFrame.
Code examples
Noncompliant code example
import pandas as pd
df = pd.DataFrame({
    'X': ['A', 'B', 'A', 'C'],
    'Y': [10, 7, 12, 5]
})
arr = df.values  # Noncompliant: using the 'values' attribute is not recommended
Compliant solution
import pandas as pd
df = pd.DataFrame({
    'X': ['A', 'B', 'A', 'C'],
    'Y': [10, 7, 12, 5]
})
arr = df.to_numpy()  # Compliant
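The DataFrame used in these examples mixes a string column with an integer column, so the common type of the resulting array is expected to be object. The sketch below checks that and shows one possible way to obtain a numeric array by selecting the numeric column before converting. The variable names mixed and numeric are chosen here for illustration.

```python
import pandas as pd

df = pd.DataFrame({
    'X': ['A', 'B', 'A', 'C'],
    'Y': [10, 7, 12, 5]
})

# Strings and integers share no numeric type, so the array falls back to object.
mixed = df.to_numpy()

# Selecting only the numeric column and passing dtype gives a float array.
numeric = df[['Y']].to_numpy(dtype='float64')
```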
Resources
Documentation
- Pandas Documentation - pandas.DataFrame.to_numpy()
- Pandas Documentation - pandas.DataFrame.values