How to read data.csv into a pandas DataFrame df with ID as the index column?
df = pd.read_csv('data.csv', index_col=0)  # assumes ID is the first column; index_col='ID' selects it by name
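A small sketch of the two ways to pick the index column. The CSV text and the column names x and z are made up for illustration, and io.StringIO stands in for the real data.csv file.

```python
import io
import pandas as pd

# Hypothetical file contents standing in for data.csv
csv_text = "ID,x,z\n10,1.5,a\n20,2.5,b\n"

# index_col accepts a column position or a column name;
# both select ID here because ID is the first column
df_by_pos = pd.read_csv(io.StringIO(csv_text), index_col=0)
df_by_name = pd.read_csv(io.StringIO(csv_text), index_col="ID")

print(df_by_name.index.name)  # ID
```

Selecting by name keeps working if the column order in the file changes.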
How to get summary statistics (count, mean, std, min, 25%, …) from a pandas DataFrame df?
df.describe()
How to extract the column 'z' from df?
df.loc[:, 'z']
How to plot a Series s using the pandas plot method with no further parameters?
s.plot()
What are the outputs of:
1. df.isna()
2. df.isna().sum()
3. df.isna().sum().sum()
What is the overall result after step 3?
The total number of missing values in the whole DataFrame, as a single integer.
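The three calls can be compared on a toy frame. The data below is invented for illustration: step 1 yields a boolean DataFrame with the same shape as df, step 2 a Series of per-column counts, and step 3 one number.

```python
import numpy as np
import pandas as pd

# Hypothetical frame with three missing cells
df = pd.DataFrame({"a": [1.0, np.nan, 3.0], "b": [np.nan, np.nan, 6.0]})

mask = df.isna()               # DataFrame of booleans, same shape as df
per_column = df.isna().sum()   # Series: number of missing values per column
total = df.isna().sum().sum()  # scalar: number of missing values overall

print(per_column["a"], per_column["b"], total)  # 1 2 3
```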
How to get the row for user ID 3292879998 as a Series, if the index column is the user ID and its dtype is Int64?
df.loc[3292879998]
How to convert the index column of a pandas df from Float64 to Int64?
df.index = df.index.astype('Int64')
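A minimal sketch that combines the conversion with a label lookup. The frame, the steps column, and the second ID are hypothetical, and the starting index here is plain float64 rather than the nullable Float64.

```python
import pandas as pd

# Hypothetical frame whose user IDs were parsed as floats
df = pd.DataFrame({"steps": [100, 200]}, index=[3292879998.0, 4020332650.0])

# Convert the index to the nullable integer dtype
df.index = df.index.astype("Int64")

# The label is unique in this index, so .loc returns a Series
row = df.loc[3292879998]
print(type(row).__name__)  # Series
```

If the same ID appeared on several rows, df.loc with that label would return a DataFrame instead of a Series.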
How to use lists of keys and values to generate a dict?
new_dict = dict(zip(keys, values))
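A short illustration with made-up lists, including the case where the two lists differ in length.

```python
keys = ["a", "b", "c"]
values = [1, 2, 3]

# zip pairs the items by position; dict turns the pairs into entries
new_dict = dict(zip(keys, values))
print(new_dict)  # {'a': 1, 'b': 2, 'c': 3}

# zip stops at the shorter input, so the unmatched key 'c' is dropped
short = dict(zip(keys, [1, 2]))
print(short)  # {'a': 1, 'b': 2}
```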
How to drop the first row of a pandas df if its index is NaN?
# Drop the first row
df = df.drop(index=pd.NA)
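Two alternatives that do not depend on how the missing label is represented (NaN or pd.NA), sketched on an invented frame: filtering on the index, and dropping by position.

```python
import numpy as np
import pandas as pd

# Hypothetical frame whose first index label is missing
df = pd.DataFrame({"a": [1, 2, 3]}, index=[np.nan, 1.0, 2.0])

# Keep only the rows whose index label is not missing
cleaned = df[df.index.notna()]

# Positional alternative: drop the first row whatever its label is
without_first = df.iloc[1:]

print(len(cleaned))  # 2
```

The filter removes every row with a missing label, while iloc removes exactly the first row, so the two only agree when the first row is the only one affected.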
How to count the number of attendants if the attendant ID is the index column of a df?
len(df.index)
How to get all rows from df where gender is neither 'Male' nor 'Female'?
df.loc[~df['gender'].isin(['Male', 'Female'])]
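A quick check of what the negated isin filter keeps, using invented values. Note that the match is case-sensitive and that missing values are kept, since isin returns False for them.

```python
import pandas as pd

# Hypothetical gender column with a missing value and a lowercase variant
df = pd.DataFrame({"gender": ["Male", "Female", "Other", None, "male"]})

# ~ inverts the boolean mask produced by isin
others = df.loc[~df["gender"].isin(["Male", "Female"])]

print(list(others.index))  # [2, 3, 4]
```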
How to replace all cells that are True with 'no'?
df.replace(True, 'no', inplace=True)
How to select only col1 and col2 from df filtered with the boolean condition bc?
df.loc[bc, ['col1', 'col2']]
How to count the rows of a df (or of a df selection)?
df.shape[0]
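The filtered column selection and the row count fit together as in this sketch. The frame and the condition on col3 are assumptions made for the example.

```python
import pandas as pd

# Hypothetical frame with three columns
df = pd.DataFrame({"col1": [1, 2, 3], "col2": [4, 5, 6], "col3": [7, 8, 9]})

# bc is a boolean Series aligned with the rows of df
bc = df["col3"] > 7

# Rows where bc is True, restricted to the two requested columns
subset = df.loc[bc, ["col1", "col2"]]

# shape is (rows, columns), so shape[0] is the row count of the selection
n_rows = subset.shape[0]
print(n_rows)  # 2
```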
How to count the number of rows for each distinct value of 'location' in df?
df['location'].value_counts()
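An example with invented locations. By default value_counts leaves out missing values; passing dropna=False counts them as well.

```python
import pandas as pd

# Hypothetical location column with one missing entry
df = pd.DataFrame({"location": ["Berlin", "Paris", "Berlin", None]})

# Counts per distinct value, sorted from most to least frequent
counts = df["location"].value_counts()

# Same, but with the missing entry counted as its own group
counts_with_nan = df["location"].value_counts(dropna=False)

print(counts["Berlin"], counts["Paris"])  # 2 1
```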
What does pd.cut do?
pandas.cut(x, bins) sorts the values of the array or Series x into categories. If bins is an integer, it creates that many equal-width intervals over the range of x. bins can also be a list specifying the bin edges.
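Both forms of bins can be sketched as follows. The ages, the edges, and the labels are made up for the example.

```python
import pandas as pd

ages = pd.Series([5, 17, 30, 64, 80])

# Integer bins: three equal-width intervals spanning the range of the data
equal_width = pd.cut(ages, bins=3)

# List bins: explicit edges; intervals are closed on the right by default,
# so 18 would fall into (0, 18]
labeled = pd.cut(ages, bins=[0, 18, 65, 100], labels=["child", "adult", "senior"])

print(list(labeled))  # ['child', 'child', 'adult', 'adult', 'senior']
```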
What does pd.qcut do?
pandas.qcut(x, q) divides x into q quantile-based bins, which are intervals with approximately the same number of observations.
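A small sketch on invented data showing that the bins are balanced by count rather than by width.

```python
import pandas as pd

values = pd.Series(range(1, 13))  # the integers 1 to 12

# q=4 requests quartiles: four bins with equal numbers of observations
quartiles = pd.qcut(values, q=4)

# Number of observations that landed in each bin, in bin order
per_bin = quartiles.value_counts().sort_index()
print(list(per_bin))  # [3, 3, 3, 3]
```

With heavily repeated values the bin edges can coincide, in which case qcut raises an error unless duplicates='drop' is passed.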