How do you import pandas?
import pandas as pd
How do you read in a CSV?
df = pd.read_csv("data.csv")
How do I get the first 4 rows of a df?
df.head(4)
What is the default number of rows for head()?
5
How do I get the last 5 rows of a df?
df.tail()
How do I get a random sample of 5 rows from the df?
df.sample(5)
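A quick sketch of the row-peeking calls above on a made-up DataFrame. The column names and values are invented for illustration, and random_state is an optional argument added here only to make the sample repeatable:

```python
import pandas as pd

# Made-up data for illustration: 10 rows, index 0..9
df = pd.DataFrame({"n": range(10), "letter": list("abcdefghij")})

first_four = df.head(4)    # rows 0..3
default_head = df.head()   # no argument: first 5 rows
last_five = df.tail()      # no argument: last 5 rows (5..9)

# random_state makes the draw repeatable; omit it for a fresh sample each call
random_five = df.sample(5, random_state=0)
```

Note that df.sample() with no argument returns a single row, so the 5 has to be passed explicitly.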
How do I get the number of rows and columns of a df?
df.shape
How do I get the dtypes and non-null counts?
df.info()
How do I get the column names of a df?
df.columns
How do I get the data types per column?
df.dtypes
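A small sketch of the structure checks above, assuming a made-up two-column DataFrame (the names qty and price are hypothetical). It mainly shows that shape, columns and dtypes are attributes, while info() is a method that prints:

```python
import pandas as pd

# Made-up data for illustration
df = pd.DataFrame({"qty": [1, 2, 3], "price": [0.5, 1.5, 2.5]})

n_rows, n_cols = df.shape        # shape is a (rows, columns) tuple
column_names = list(df.columns)  # df.columns is an Index; list() gives plain names
dtypes = df.dtypes               # Series mapping column name -> dtype

df.info()  # prints its summary and returns None, so there is nothing to assign
```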
How do I get the count of missing values per column?
df.isna().sum()
How do we get the % of missing values per column?
df.isna().mean() * 100
How do we get the total number of missing values across the whole df?
df.isna().sum().sum()
How do we drop rows with any missing values?
df.dropna()
How do we do a simple fill of missing values (here, with 0)?
df.fillna(0)
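The missing-value answers above can be tried together on a toy DataFrame. This is only a sketch with invented numeric data; the point to notice is that dropna() and fillna() return new DataFrames and leave df unchanged unless the result is assigned back:

```python
import numpy as np
import pandas as pd

# Made-up numeric data with three gaps
df = pd.DataFrame({
    "x": [1.0, np.nan, 3.0, np.nan],
    "y": [10.0, 20.0, np.nan, 40.0],
})

missing_per_col = df.isna().sum()       # x: 2, y: 1
pct_missing = df.isna().mean() * 100    # x: 50.0, y: 25.0
total_missing = df.isna().sum().sum()   # 3

dropped = df.dropna()   # keeps only rows with no missing values (just row 0 here)
filled = df.fillna(0)   # every missing value replaced by 0
```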
How do we get descriptive statistics (count, mean, std, quantiles, etc.) for every numeric column?
df.describe()
How do we get descriptive statistics (count, mean, std, quantiles, etc.) for every column?
df.describe(include="all")
How do we get descriptive statistics (count, mean, std, quantiles, etc.) for a specific column?
df['col_name'].describe()
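To see how the three describe() variants differ, here is a minimal sketch on a made-up DataFrame with one numeric and one text column (price and fruit are hypothetical names):

```python
import pandas as pd

# Made-up data: one numeric column, one text column
df = pd.DataFrame({
    "price": [1.0, 2.0, 3.0, 4.0],
    "fruit": ["apple", "pear", "apple", "fig"],
})

numeric_summary = df.describe()            # only the numeric column "price"
full_summary = df.describe(include="all")  # adds unique/top/freq rows for "fruit"
one_column = df["price"].describe()        # a Series for a single column
```

In full_summary, statistics that do not apply to a column (for example the mean of fruit) show up as NaN.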
How do we get the mean of a column?
df['col'].mean()
How do we get the median of a column?
df['col'].median()
How do we get the standard deviation of a column?
df['col'].std()
How do we get the minimum value of a column?
df['col'].min()
How do we get the max value of a column?
df['col'].max()
How do we get the index label of the maximum value in each column?
df.idxmax()
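A short sketch of the single-column statistics and idxmax() above, using invented data with a letter index so it is visible that idxmax() returns index labels rather than positions. Note that std() computes the sample standard deviation (ddof=1) by default:

```python
import pandas as pd

# Made-up data with a non-default index
df = pd.DataFrame(
    {"col": [2, 4, 4, 10], "other": [7, 1, 3, 5]},
    index=["w", "x", "y", "z"],
)

mean = df["col"].mean()      # 5.0
median = df["col"].median()  # 4.0
std = df["col"].std()        # sample std (ddof=1): sqrt(12), about 3.46
lowest = df["col"].min()     # 2
highest = df["col"].max()    # 10

# One index label per column: where that column reaches its maximum
max_labels = df.idxmax()     # col -> "z", other -> "w"
```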