What is a SQL database?
What is a NoSQL database?
When should you use a SQL vs a NoSQL database?
Additional interview questions around these topics:
How do you handle data consistency in a NoSQL database? What are the trade-offs between consistency, availability, and partition tolerance in the CAP theorem? How do you design a schema for a relational database? What are the advantages and disadvantages of using a document-based NoSQL database? How do you handle data normalization and denormalization in a SQL database? What are some strategies for optimizing query performance in a SQL database? How do you handle data migration when transitioning from a SQL to a NoSQL database or vice versa?
How do you design a schema for a relational database?
What are the advantages and disadvantages of using a document-based NoSQL database?
Advantages:
- Flexible Schema: Accommodating diverse data structures and changes over time.
- Scalability: Designed for horizontal scalability. Suitable for handling large volumes of data.
- Ease of Use: Often use JSON-like formats
- Aggregation: Many offer powerful aggregation frameworks for complex data processing and transformation.
Disadvantages:
- Lack of Joins: Typically do not support joins. Harder to query data across multiple collections.
- Data Redundancy: Flexible schema can lead to data redundancy and inconsistency, with the same data may be stored in multiple documents.
- Complex Transactions: Handling ACID transactions can be more complex compared to relational databases.
How do you handle data normalization and denormalization in a SQL database (and what is it)?
Normalization:
Denormalization:
What are some strategies for optimizing query performance in a SQL database?
What is indexing, when should you use it and what are the pros and cons?
What is Indexing?
When Should You Use It?
Pros of Indexing:
Cons of Indexing:
What is database caching? What are the pros and cons?
Overview of Database Caching:
Pros of Database Caching:
Cons of Database Caching:
What are Constraints in SQL?
Constraints are used to specify the rules concerning data in the table. It can be applied for single or multiple fields in an SQL table during the creation of the table or after creating using the ALTER TABLE command. The constraints are:
NOT NULL - Restricts NULL value from being inserted into a column. CHECK - Verifies that all values in a field satisfy a condition. DEFAULT - Automatically assigns a default value if no value has been specified for the field. UNIQUE - Ensures unique values to be inserted into the field. INDEX - Indexes a field providing faster retrieval of records. PRIMARY KEY - Uniquely identifies each record in a table. FOREIGN KEY - Ensures referential integrity for a record in another table.
What is a Join? List its different types.
The SQL Join clause is used to combine records (rows) from two or more tables in a SQL database based on a related column between the two.
There are four different types of JOINs in SQL:
(INNER) JOIN: Retrieves records that have matching values in both tables involved in the join. This is the widely used join for queries.
LEFT (OUTER) JOIN: Retrieves all the records/rows from the left and the matched records/rows from the right table.
RIGHT (OUTER) JOIN: Retrieves all the records/rows from the right and the matched records/rows from the left table.
FULL (OUTER) JOIN: Retrieves all the records where there is a match in either the left or right table.
What is an Index? Explain its different types.
A database index is a data structure that provides a quick lookup of data in a column or columns of a table. It enhances the speed of operations accessing data from a database table at the cost of additional writes and memory to maintain the index data structure.
CREATE INDEX index_name /* Create Index /
ON table_name (column_1, column_2);
DROP INDEX index_name; / Drop Index */
There are different types of indexes that can be created for different purposes:
Unique and Non-Unique Index:
Unique indexes are indexes that help maintain data integrity by ensuring that no two rows of data in a table have identical key values. Once a unique index has been defined for a table, uniqueness is enforced whenever keys are added or changed within the index.
CREATE UNIQUE INDEX myIndex
ON students (enroll_no);
Non-unique indexes, on the other hand, are not used to enforce constraints on the tables with which they are associated. Instead, non-unique indexes are used solely to improve query performance by maintaining a sorted order of data values that are used frequently.
Clustered and Non-Clustered Index:
Clustered indexes are indexes whose order of the rows in the database corresponds to the order of the rows in the index. This is why only one clustered index can exist in a given table, whereas, multiple non-clustered indexes can exist in the table.
The only difference between clustered and non-clustered indexes is that the database manager attempts to keep the data in the database in the same order as the corresponding keys appear in the clustered index.
Clustering indexes can improve the performance of most query operations because they provide a linear-access path to data stored in the database.
What is Cursor? How to use a Cursor?
A database cursor is a control structure that allows for the traversal of records in a database. Cursors, in addition, facilitates processing after traversal, such as retrieval, addition, and deletion of database records. They can be viewed as a pointer to one row in a set of rows.
Working with SQL Cursor:
DECLARE a cursor after any variable declaration. The cursor declaration must always be associated with a SELECT Statement. Open cursor to initialize the result set. The OPEN statement must be called before fetching rows from the result set. FETCH statement to retrieve and move to the next row in the result set. Call the CLOSE statement to deactivate the cursor. Finally use the DEALLOCATE statement to delete the cursor definition and release the associated resources.
List the different types of relationships in SQL.
One-to-One - This can be defined as the relationship between two tables where each record in one table is associated with the maximum of one record in the other table.
One-to-Many & Many-to-One - This is the most commonly used relationship where a record in a table is associated with multiple records in the other table.
Many-to-Many - This is used in cases when multiple instances on both sides are needed for defining a relationship.
Self-Referencing Relationships - This is used when a table needs to define a relationship with itself.
Can you explain the architecture of PostgreSQL?
The architecture of PostgreSQL follows the client-server model.
The server side comprises of background process manager, query processer, utilities and shared memory space which work together to build PostgreSQL’s instance that has access to the data. The client application does the task of connecting to this instance and requests data processing to the services. The client can either be GUI (Graphical User Interface) or a web application. The most commonly used client for PostgreSQL is pgAdmin.
What are steps to take toward zero-downtime schema changes / migrations
ALTER TABLE foos ADD COLUMN bar varchar DEFAULT 'baz' NOT NULL)CONCURRENTLY flag)Also:
- Set lock_timeout in your migration scripts to a pause your app can tolerate. It’s better to abort a deploy than take your application down
- Keep Postgres up to date. The locking code is improved with every release.
What are Data Definition Language (DDL), Data Manipulation Language (DML), and Data Control Language (DCL) commands?
Data Definition Language (DDL), is a subset of SQL commands that define the structure or schema of the database. e.g CREATE, ALTER, TRUNCATE, RENAME and DROP
Data Manipulation Language (DML), is an element in SQL language that deals with managing and manipulating data in the database. e.g. SELECT, INSERT, UPDATE, DELETE, MERGE
Data Control Language (DCL), are those commands in SQL that deal with controls, rights, and permission in the database system. e.g. GRANT, REVOKE
What are database transactions?
A database transaction is a sequence of one or more SQL operations executed as a single unit of work. These operations can include insertions, updates, deletions, or queries.
Characterized by the ACID properties:
Specifics: