NumPy
Python library for working with arrays of data (only one data type per array); faster than Python lists for numerical operations
Creating a NumPy array
np.array([1,2,3])
Shapes:
- 1D array
- 2D array
- 3D array
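A small sketch of the three shapes listed above. The variable names (a1, a2, a3) are made up for illustration; ndim gives the number of dimensions and shape gives the size along each one.

```python
import numpy as np

# 1D array: a single row of values
a1 = np.array([1, 2, 3])

# 2D array: rows and columns (2 rows, 3 columns)
a2 = np.array([[1, 2, 3], [4, 5, 6]])

# 3D array: a stack of 2D arrays (2 blocks, each 2 x 2)
a3 = np.array([[[1, 2], [3, 4]], [[5, 6], [7, 8]]])

for arr in (a1, a2, a3):
    # ndim = number of dimensions, shape = size along each dimension
    print(arr.ndim, arr.shape)
```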
Pandas
DataFrame
multiple different data types in different columns possible
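A minimal example of a DataFrame whose columns hold different data types. The column names and values are invented for illustration only.

```python
import pandas as pd

# Each column can have its own data type (made-up example data)
df = pd.DataFrame({
    "name": ["Ann", "Bob", "Cid"],      # text
    "age": [31, 45, 27],                # integers
    "height_m": [1.70, 1.82, 1.65],     # floats
})

# One dtype per column; the exact text dtype depends on the pandas version
print(df.dtypes)
```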
Pandas: Loading data
pd.read_csv("file.csv")
Pandas: Connecting to a database (to read data from a database directly into a DataFrame)
Pandas: check first or last 5 rows
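One possible way to do this, assuming a SQLite database (the notes do not say which database is meant). The in-memory database and the people table are hypothetical stand-ins so the sketch runs on its own; pd.read_sql_query takes a SQL string and an open connection.

```python
import sqlite3
import pandas as pd

# Hypothetical throwaway database, created in memory for the example
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE people (name TEXT, age INTEGER)")
conn.executemany("INSERT INTO people VALUES (?, ?)", [("Ann", 31), ("Bob", 45)])
conn.commit()

# Read the query result directly into a DataFrame
df_db = pd.read_sql_query("SELECT * FROM people", conn)
conn.close()

print(df_db)
```

For a database file on disk, the path would replace ":memory:" in sqlite3.connect.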
df.head(); df.tail()
Pandas: check basic info
df.info()
Pandas: check shape of DataFrame (number of rows, number of columns)
df.shape
Pandas: check names of columns
df.columns
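The inspection calls above, tried on a small made-up DataFrame (the city and temp_c columns are illustrative, not from any real dataset):

```python
import pandas as pd

# Made-up data with 7 rows so head() and tail() differ
df = pd.DataFrame({
    "city": ["A", "B", "C", "D", "E", "F", "G"],
    "temp_c": [12.5, 14.0, 9.8, 20.1, 17.3, 11.0, 15.6],
})

first = df.head()    # first 5 rows by default; df.head(3) gives the first 3
last = df.tail()     # last 5 rows by default

df.info()            # prints column names, non-null counts and dtypes
print(df.shape)      # tuple: (number of rows, number of columns)
print(list(df.columns))
```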
Pandas: check number of missing values - count how many missing (NaN/None) values per column (zeros are not counted as missing)
df.isnull().sum()
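A short sketch showing that isnull() flags NaN/None entries and not zeros. The data is made up; counting zeros needs a separate comparison.

```python
import numpy as np
import pandas as pd

# Made-up numeric data containing both zeros and missing values
df = pd.DataFrame({
    "a": [0, 1, np.nan, 3],
    "b": [0.0, 0.0, 2.0, np.nan],
})

missing = df.isnull().sum()   # NaN count per column
zeros = (df == 0).sum()       # zero count per column (a different question)

print(missing)
print(zeros)
```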
Pandas: count of each distinct value in a column incl. missing values (value_counts() leaves out missing values unless dropna=False is passed)
df[feature].value_counts(dropna=False)
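To see the effect of the dropna parameter, here is a small comparison on made-up data. By default value_counts() skips missing values; dropna=False adds a row for them.

```python
import numpy as np
import pandas as pd

# Made-up column with two missing entries
df = pd.DataFrame({"color": ["red", "blue", "red", np.nan, "red", np.nan]})
feature = "color"

default_counts = df[feature].value_counts()           # NaN rows are left out
all_counts = df[feature].value_counts(dropna=False)   # NaN gets its own count

print(default_counts)
print(all_counts)
```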
Pandas vs sqlite3 in Python when querying a database
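A rough side-by-side, assuming a SQLite database: the same query run through a plain sqlite3 cursor and through pandas. The sales table and its contents are hypothetical. sqlite3 returns a list of tuples without column names; pandas returns a labelled DataFrame.

```python
import sqlite3
import pandas as pd

# Hypothetical in-memory database for the comparison
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (item TEXT, qty INTEGER)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?)",
    [("pen", 3), ("ink", 5), ("pen", 2)],
)
conn.commit()

query = "SELECT item, SUM(qty) AS total FROM sales GROUP BY item ORDER BY item"

# sqlite3: plain Python objects (list of tuples, no column labels)
rows = conn.execute(query).fetchall()

# pandas: same query, result arrives as a DataFrame with column names
df_sales = pd.read_sql_query(query, conn)
conn.close()

print(rows)
print(df_sales)
```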
Differences NumPy & Pandas