All Downloads are FREE. Search and download functionalities are using the official Maven repository.

org.sonar.l10n.py.rules.python.S6740.html Maven / Gradle / Ivy

There is a newer version: 4.23.0.17664
Show newest version

This rule raises an error when the dtype parameter is not provided when using pandas.read_csv or pandas.read_table.

Why is this an issue?

The pandas library provides an easy way to load data from documents hosted locally or remotely, for example with the pandas.read_csv or pandas.read_table functions:

import pandas as pd

df = pd.read_csv("my_file.csv")

Pandas will infer the type of each columns of the CSV file and specify the datatype accordingly, making this code perfectly valid. However this snippet of code does not convey the proper intent of the user, and can raise questions such as:

  • What information can I access in df?
  • What are the names of the columns available in df?

These questions arise as there are no descriptions of what kind of data is loaded into the data frame, making the code less understandable and harder to maintain.

A straightforward way to fix these issues is by providing the schema of the data through the usage of the dtype parameter.

How to fix it

To fix this issue provide the dtype parameter to the read_csv or read_table function.

Code examples

Noncompliant code example

import pandas as pd

def foo():
  return pd.read_csv("my_file.csv") # Noncompliant: it is unclear which type of data the data frame holds.

Compliant solution

import pandas as pd

def foo():
  return pd.read_csv(
          "my_file.csv",
          dtype={'name': 'str', 'age': 'int'}) # Compliant

Resources

Documentation





© 2015 - 2024 Weber Informatics LLC | Privacy Policy