What are the types of data sources mentioned?
These categories represent various origins and types of data used in analysis.
What is data collection?
The process of gathering raw data from various sources for analysis and decision-making.
Data collection is essential for obtaining accurate information.
Why is data collection important?
It provides accurate information needed for analysis, planning, and decision-making.
Accurate data is crucial for effective decision-making.
What are common methods of data collection?
These methods help gather data from various sources.
What are primary data?
Data collected directly from the source for a specific purpose.
Primary data is often more reliable for specific research needs.
What are secondary data?
Data already collected by others and reused for analysis.
Secondary data can save time and resources in research.
What tools are used for data collection?
These tools facilitate the gathering of data efficiently.
What problems can occur during data collection?
These issues can compromise the quality of the collected data.
What is data accuracy?
The degree to which data correctly represents real-world values.
High data accuracy is essential for reliable analysis.
What is data validity?
The degree to which data measures what it is intended to measure.
Valid data is crucial for drawing correct conclusions.
What is data consistency?
The property that data is uniform and does not contradict itself across systems.
Consistent data is vital for maintaining integrity in datasets.
What is data cleansing (data cleaning)?
The process of identifying and correcting or removing errors and inconsistencies in data.
Data cleansing is essential for improving data quality.
Why is data cleansing important?
It improves data quality, accuracy, and reliability for analysis.
Clean data leads to better decision-making.
What types of errors require data cleansing?
Addressing these errors is crucial for data integrity.
What is missing data?
Data fields with no recorded values.
Missing data can skew analysis results.
How can missing data be handled?
Proper handling of missing data is necessary for accurate analysis.
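As a minimal sketch of two common ways to handle missing values (dropping incomplete records, or filling gaps with the mean of the recorded values), the Python below may help. The sample records, the field name `age`, and the helper names `drop_missing` and `fill_with_mean` are all hypothetical and chosen only for illustration.

```python
from statistics import mean

# Hypothetical sample records; None marks a missing value.
records = [
    {"id": 1, "age": 30},
    {"id": 2, "age": None},
    {"id": 3, "age": 40},
]

def drop_missing(rows, field):
    """Return only the rows where `field` has a recorded value."""
    return [r for r in rows if r.get(field) is not None]

def fill_with_mean(rows, field):
    """Return copies of the rows, replacing missing `field` values
    with the mean of the recorded ones."""
    known = [r[field] for r in rows if r.get(field) is not None]
    fill = mean(known)  # raises StatisticsError if nothing is recorded
    return [
        {**r, field: fill if r.get(field) is None else r[field]}
        for r in rows
    ]

dropped = drop_missing(records, "age")   # keeps ids 1 and 3
filled = fill_with_mean(records, "age")  # id 2 gets the mean of 30 and 40
```

Dropping is simpler but discards information. Filling keeps every record but can hide the fact that a value was never observed, so which approach fits depends on the analysis.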
What is duplicate data?
Repeated records representing the same data item.
Duplicate data can inflate counts and totals, leading to inaccurate results.
How do you remove duplicate data?
By using unique identifiers and data-matching techniques.
Effective removal of duplicates is essential for data integrity.
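The idea of matching records on an identifier can be sketched as follows. This is an illustration only: the sample customers, the choice of a normalized email address as the matching key, and the helper names `email_key` and `remove_duplicates` are assumptions, not part of the original material.

```python
def email_key(row):
    """Matching key: the email with surrounding spaces removed, lowercased."""
    return row["email"].strip().lower()

def remove_duplicates(rows, key_fn):
    """Keep the first row seen for each key and skip later repeats."""
    seen = set()
    unique = []
    for row in rows:
        key = key_fn(row)
        if key in seen:
            continue
        seen.add(key)
        unique.append(row)
    return unique

# Hypothetical records: ids 1 and 2 are the same person typed differently.
customers = [
    {"id": 1, "email": "ana@example.com"},
    {"id": 2, "email": " Ana@Example.com "},
    {"id": 3, "email": "ben@example.com"},
]

deduped = remove_duplicates(customers, email_key)  # keeps ids 1 and 3
```

Keeping the first occurrence is one possible policy. In practice the most recent or most complete record might be the better one to keep.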
What is data standardization?
Converting data into a consistent format (e.g., date or text format).
Standardized data is easier to analyze and compare.
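For the date example, one possible sketch converts several input formats to ISO 8601 (`YYYY-MM-DD`). The list of accepted formats and the function name `standardize_date` are assumptions made for illustration. A real dataset would need its own list.

```python
from datetime import datetime

# Assumed input formats; day-first order is a choice, not a universal rule.
KNOWN_FORMATS = ("%Y-%m-%d", "%d/%m/%Y", "%d.%m.%Y")

def standardize_date(text):
    """Return the date as an ISO 8601 string, trying each known format in turn."""
    cleaned = text.strip()
    for fmt in KNOWN_FORMATS:
        try:
            return datetime.strptime(cleaned, fmt).date().isoformat()
        except ValueError:
            continue  # not this format, try the next one
    raise ValueError(f"unrecognized date format: {text!r}")

raw_dates = ["2024-03-05", "05/03/2024", " 5.3.2024 "]
standardized = [standardize_date(d) for d in raw_dates]
```

All three inputs end up in the same format, so they can be sorted and compared as plain strings.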
What is data validation?
Checking data against rules to ensure correctness.
Validation helps maintain data quality.
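Rule-based checking might look like the sketch below. The two rules (an age between 0 and 120, and an email containing `@`) and the function name `validate` are hypothetical examples, not rules taken from the original material.

```python
def validate(record):
    """Return a list of rule violations; an empty list means the record passed."""
    errors = []
    age = record.get("age")
    # Rule 1 (assumed): age must be an integer in a plausible range.
    if not isinstance(age, int) or not 0 <= age <= 120:
        errors.append("age must be an integer between 0 and 120")
    # Rule 2 (assumed): email must at least contain an '@' sign.
    if "@" not in str(record.get("email", "")):
        errors.append("email must contain '@'")
    return errors

good = validate({"age": 30, "email": "ana@example.com"})
bad = validate({"age": 150, "email": "not-an-email"})
```

Returning every violation, instead of stopping at the first, makes it easier to report all the problems in a record at once.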
What is data normalization in cleansing?
Organizing data to reduce redundancy and improve consistency.
Normalization is key for efficient data management.
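To illustrate reducing redundancy, the sketch below splits a flat list of orders, where the customer name is repeated on every row, into a customer lookup and a list of orders that refer to customers by id. The sample data, field names, and the function name `normalize` are assumptions made for this example.

```python
# Hypothetical flat data: the customer name repeats on every order.
flat_orders = [
    {"order_id": 1, "customer_id": 10, "customer_name": "Ana", "item": "pen"},
    {"order_id": 2, "customer_id": 10, "customer_name": "Ana", "item": "ink"},
    {"order_id": 3, "customer_id": 11, "customer_name": "Ben", "item": "pad"},
]

def normalize(rows):
    """Split flat rows into a customers lookup and orders that reference it."""
    customers = {}
    orders = []
    for row in rows:
        # Each customer is stored once, keyed by id.
        customers[row["customer_id"]] = {"name": row["customer_name"]}
        # Orders keep only the id, not the repeated name.
        orders.append({
            "order_id": row["order_id"],
            "customer_id": row["customer_id"],
            "item": row["item"],
        })
    return customers, orders

customers, orders = normalize(flat_orders)
```

After the split, a customer's name is stored in one place, so correcting it requires a single change instead of one per order.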
How are data collection and data cleansing related?
Collected data must be cleansed to ensure it is accurate and usable.
The cleansing process is vital after data collection.
What happens if data is not cleansed?
It can lead to incorrect analysis and poor decisions.
Unclean data can severely impact outcomes.
When should data cleansing be done?
After data collection and regularly during data use.
Ongoing cleansing ensures data remains reliable.