PANW Flashcards

(241 cards)

1
Q

What is a rate limiter?

A

A system that restricts the number of requests a user or service can make within a time window.

2
Q

Difference between a sliding window and a fixed window?

A

fixed: simpler, allows bursts at boundaries
sliding: more accurate, prevents bursts but slightly more expensive

3
Q

What data structure does a rate limiter use?

A

HashMap + Queue (timestamps per user)
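
A minimal sketch of the HashMap + Queue idea as a sliding-window limiter. The class name, parameters, and the injectable `clock` are illustrative assumptions, not a reference implementation.

```python
import time
from collections import defaultdict, deque


class SlidingWindowLimiter:
    """Allow at most max_requests per user within the last window_seconds."""

    def __init__(self, max_requests, window_seconds, clock=time.monotonic):
        self.max_requests = max_requests
        self.window = window_seconds
        self.clock = clock  # injectable so the behavior can be tested
        self.hits = defaultdict(deque)  # user_id -> queue of request timestamps

    def allow(self, user_id):
        now = self.clock()
        q = self.hits[user_id]
        # Drop timestamps that have fallen out of the window.
        while q and now - q[0] >= self.window:
            q.popleft()
        if len(q) >= self.max_requests:
            return False
        q.append(now)
        return True
```

Each user's queue holds at most `max_requests` timestamps, so memory per user stays bounded.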

4
Q

How do you scale a rate limiter?

A

Use Redis (shared state), load balancing, consistent hashing

5
Q

What is an event deduplicator?

A

A system that ensures each event is processed only once.

6
Q

Core data structure for deduplicator?

A

HashSet or HashMap for O(1) lookup

7
Q

What is the biggest issue for an event deduplicator?

A

unbounded memory growth

8
Q

What is the solution to the deduplicator's memory issue?

A

TTL (time-based expiration) + cleanup
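
A rough sketch of a deduplicator that bounds memory with a TTL plus a cleanup pass. `TTLDeduplicator`, `is_duplicate`, and `cleanup` are hypothetical names; a real system might lean on Redis key expiry instead.

```python
import time


class TTLDeduplicator:
    """Remember event IDs for ttl_seconds, then forget them."""

    def __init__(self, ttl_seconds, clock=time.monotonic):
        self.ttl = ttl_seconds
        self.clock = clock
        self.seen = {}  # event_id -> expiry time

    def is_duplicate(self, event_id):
        now = self.clock()
        expiry = self.seen.get(event_id)
        if expiry is not None and expiry > now:
            return True
        # First sighting (or the old entry expired): record it.
        self.seen[event_id] = now + self.ttl
        return False

    def cleanup(self):
        """Remove expired entries; returns how many were dropped."""
        now = self.clock()
        expired = [k for k, exp in self.seen.items() if exp <= now]
        for k in expired:
            del self.seen[k]
        return len(expired)
```

The tradeoff: an event that reappears after its TTL has passed is treated as new.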

9
Q

How do we optimize a deduplicator for large scale?

A

Bloom filter (less memory, allows false positives)

10
Q

What is TTL?

A

a numerical value in network packets (IP) or DNS records that dictates how long data should exist or be cached before expiring. It prevents data from circulating indefinitely, manages congestion, and ensures timely updates

11
Q

What is a bloom filter?

A

a space-efficient, probabilistic data structure used to test set membership, determining if an element is definitely not in a set or possibly in the set
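
A toy Bloom filter to make the "definitely not / possibly in" behavior concrete. The sizes and the SHA-256-based hashing are assumptions chosen for readability, not tuned values.

```python
import hashlib


class BloomFilter:
    def __init__(self, size_bits=1024, num_hashes=3):
        self.size = size_bits
        self.num_hashes = num_hashes
        # One byte per bit keeps the sketch simple; a real filter packs bits.
        self.bits = bytearray(size_bits)

    def _positions(self, item):
        # Derive num_hashes positions by salting the item with an index.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{item}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.size

    def add(self, item):
        for pos in self._positions(item):
            self.bits[pos] = 1

    def might_contain(self, item):
        # False means definitely absent; True means possibly present.
        return all(self.bits[pos] for pos in self._positions(item))
```

Added items always report `True`; items never added usually report `False`, but can collide and report `True` (a false positive).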

12
Q

How to design MaxStack in O(1)?

A

use two stacks
- main stack
- max stack (tracks current max)
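
A short sketch of the two-stack approach. It covers push, pop, top, and get_max in O(1); it does not cover a "pop the max element" operation, which needs a different structure. Method names are illustrative.

```python
class MaxStack:
    def __init__(self):
        self._main = []   # the actual values
        self._maxes = []  # _maxes[i] = max of _main[0..i]

    def push(self, x):
        self._main.append(x)
        # The new running max is x or the previous max, whichever is larger.
        self._maxes.append(x if not self._maxes else max(x, self._maxes[-1]))

    def pop(self):
        self._maxes.pop()
        return self._main.pop()

    def top(self):
        return self._main[-1]

    def get_max(self):
        return self._maxes[-1]
```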

13
Q

What is the difference between horizontal scaling and vertical scaling?

A

horizontal: add more machines
vertical: upgrade one machine

14
Q

When would we prefer horizontal scaling?

A

large-scale systems -> better fault tolerance + scalability

15
Q

What is a load balancer?

A

distributes incoming traffic across multiple servers

16
Q

Why is a load balancer needed?

A

to prevent overload and improve availability

17
Q

What is caching?

A

storing frequently accessed data for faster retrieval

18
Q

What is an example tool for caching?

A

Redis

19
Q

Why would we use Redis for caching?

A

Redis offers low-latency reads and writes, making it particularly suitable for use cases that require a cache

20
Q

What is the tradeoff of caching?

A

speed vs data freshness

21
Q

Difference between SQL vs NoSQL?

A

SQL: structured, relational
NoSQL: flexible, scalable

22
Q

When would we use NoSQL?

A

high throughput, flexible schema (like event logs)

23
Q

What is an API?

A

interface for communication between systems

24
Q

What is the difference between HTTP and HTTPS?

A

HTTPS is encrypted (secure), HTTP is not

25
What is the difference between TCP vs UDP?
TCP: reliable, slower
UDP: fast, no guarantee
26
What is TCP?
A connection-oriented transport-layer protocol that establishes a connection between a client and a server and delivers data reliably and in order
27
What is UDP?
a fast, connectionless, and lightweight transport layer protocol. It enables low-latency communication by sending data packets ("datagrams") without establishing a formal connection or guaranteeing delivery, ordering, or error correction.
28
What is DNS?
Translates a domain name into an IP address so a browser can locate the correct server
29
What is p95/p99 latency?
p95 = the latency that 95% of requests are faster than
p99 = the latency that 99% of requests are faster than; tail latency (important for reliability)
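
A small worked example of computing p95/p99 from latency samples, assuming the nearest-rank definition of a percentile (monitoring tools may interpolate differently). The `percentile` helper is hypothetical.

```python
import math


def percentile(samples, p):
    """Nearest-rank percentile: smallest sample with at least p% of samples <= it."""
    ordered = sorted(samples)
    rank = math.ceil(p * len(ordered) / 100)
    return ordered[max(rank, 1) - 1]


# 100 requests: 98 fast ones and two slow outliers (milliseconds).
latencies_ms = [20] * 98 + [900, 1500]
p50 = percentile(latencies_ms, 50)
p95 = percentile(latencies_ms, 95)
p99 = percentile(latencies_ms, 99)
```

Here the median and p95 both look healthy while p99 exposes the slow outliers, which is why tail percentiles are tracked separately.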
30
Why is tail latency important?
It reflects the near-worst-case user experience
31
What is throughput?
number of requests handled per second
32
How do you validate AI-generated code?
- test cases
- edge cases
- code review
- performance checks
33
What are the risks of AI code?
- security vulnerabilities
- incorrect logic
- hidden bugs
34
What are metrics for AI systems?
- latency
- accuracy
- error rate
- cost
35
What is prompt injection?
malicious input that manipulates AI behavior
36
What is a security risk in backend systems?
- injection attacks
- data leaks
- improper validation
37
What matters more: correctness or speed?
depends, but PANW leans toward secure + reliable first
38
How do you debug code in interviews?
1. understand expected behavior
2. trace through example
3. identify edge cases
4. fix + explain
39
Why Palo Alto?
- interest in cybersecurity
- real-world impact
- fast-growing company
- alignment with my backend/infra interests
40
What is the time complexity of accessing an element in an array?
O(1)
41
What is the time complexity of accessing an element in an array?
O(1)
42
What is the time complexity of inserting into a dynamic array (amortized)?
O(1)
43
What data structure uses FIFO?
Queue
44
What data structure uses LIFO?
Stack
45
What is a hash table?
A data structure that maps keys to values using a hash function.
46
Average vs worst-case lookup in a hash map?
Average: O(1), Worst: O(n)
47
What causes hash collisions?
Two keys hashing to the same index
48
How do you resolve collisions?
Chaining or open addressing
49
What is a linked list?
A sequence of nodes where each node points to the next
50
Time complexity to search a linked list?
O(n)
51
What is a binary tree?
A tree where each node has at most 2 children
52
What is a binary search tree (BST)?
Left < root < right
53
Average vs worst-case search in BST?
Avg: O(log n), Worst: O(n)
54
What is a balanced tree?
A tree where height is minimized (e.g., AVL, Red-Black)
55
What is a heap?
A complete binary tree satisfying heap property
56
Min heap vs max heap?
Min: root smallest, Max: root largest
57
Heap insertion/removal complexity?
O(log n)
58
What is binary search time complexity?
O(log n)
59
When can you use binary search?
On sorted data
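
For reference, a minimal iterative binary search over a sorted list. It returns -1 when the target is absent, which is one common convention among several.

```python
def binary_search(items, target):
    """Return the index of target in sorted items, or -1 if it is absent."""
    lo, hi = 0, len(items) - 1
    while lo <= hi:
        mid = (lo + hi) // 2
        if items[mid] == target:
            return mid
        if items[mid] < target:
            lo = mid + 1  # target can only be in the right half
        else:
            hi = mid - 1  # target can only be in the left half
    return -1
```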
60
What is DFS?
Depth-first traversal (stack/recursion)
61
What is BFS?
Breadth-first traversal (queue)
62
BFS vs DFS key difference?
BFS explores level-by-level, DFS goes deep first
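
A small sketch contrasting the two visit orders on a hypothetical adjacency-list graph (the node names are made up for illustration).

```python
from collections import deque


def bfs(graph, start):
    order, seen, queue = [], {start}, deque([start])
    while queue:
        node = queue.popleft()  # FIFO: finish a level before going deeper
        order.append(node)
        for nxt in graph[node]:
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return order


def dfs(graph, start):
    order, seen, stack = [], set(), [start]
    while stack:
        node = stack.pop()  # LIFO: follow the most recent branch first
        if node in seen:
            continue
        seen.add(node)
        order.append(node)
        # Push neighbors in reverse so the first neighbor is explored first.
        stack.extend(reversed(graph[node]))
    return order


graph = {"A": ["B", "C"], "B": ["D"], "C": ["E"], "D": [], "E": []}
bfs_order = bfs(graph, "A")  # A, B, C, D, E
dfs_order = dfs(graph, "A")  # A, B, D, C, E
```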
63
What is dynamic programming?
Solving problems by storing the solutions to overlapping subproblems
64
Memoization vs tabulation?
Top-down vs bottom-up
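
Fibonacci is a convenient illustration of both styles. In this sketch the memoized version uses `functools.lru_cache`, and the bottom-up version keeps only the last two table entries.

```python
from functools import lru_cache


@lru_cache(maxsize=None)
def fib_memo(n):
    """Top-down: recurse, caching each subproblem the first time it is solved."""
    return n if n < 2 else fib_memo(n - 1) + fib_memo(n - 2)


def fib_tab(n):
    """Bottom-up: build from the base cases upward, no recursion."""
    if n < 2:
        return n
    prev, curr = 0, 1
    for _ in range(2, n + 1):
        prev, curr = curr, prev + curr
    return curr
```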
65
What is greedy algorithm?
Makes locally optimal choice each step
66
What is backtracking?
Exploring all possibilities and undoing choices
67
What is space complexity?
Memory usage of an algorithm
68
What is scalability?
Ability to handle increasing load
69
Vertical vs horizontal scaling?
Bigger machine vs more machines
70
What is load balancing?
Distributing traffic across servers
71
What is caching?
Storing frequently accessed data for faster retrieval
72
What is a database index?
Structure to speed up queries
73
SQL vs NoSQL?
Structured relational vs flexible schema
74
What is sharding?
splitting data across multiple databases
75
What is replication?
Copying data across servers
76
CAP theorem?
Consistency, Availability, Partition tolerance (pick 2)
77
What is eventual consistency?
Data becomes consistent over time
78
What is an IP address?
Unique identifier for a device on a network
79
What is DNS?
Translates domain names to IP addresses
80
What is a firewall?
System that filters network traffic
81
What is a port?
Logical communication endpoint
82
What is latency?
Delay in communication
83
What is bandwidth?
Amount of data transferable
84
What is a process?
Running program instance
85
What is a thread?
Lightweight unit of execution
86
Process vs thread?
Threads share memory, processes don’t
87
What is context switching?
Switching CPU between processes/threads
88
What is deadlock?
Processes stuck waiting on each other
89
What are the 4 conditions for deadlock?
Mutual exclusion, hold & wait, no preemption, circular wait
90
What is encapsulation?
Hiding internal state
91
What is inheritance?
Deriving from another class
92
What is polymorphism?
Same interface, different behavior
93
What is abstraction?
Hiding implementation details
94
What is a primary key?
Unique identifier
95
What is a foreign key?
Reference to another table
96
What is normalization?
Reducing redundancy
97
ACID properties?
Atomicity, Consistency, Isolation, Durability
98
What is rate limiting?
Restricting number of requests per time window
99
Example rate limiter approach?
Token bucket or sliding window
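
A token bucket sketch, assuming tokens refill continuously at a fixed rate. The names and the injectable `clock` are illustrative.

```python
import time


class TokenBucket:
    def __init__(self, capacity, refill_per_second, clock=time.monotonic):
        self.capacity = capacity
        self.rate = refill_per_second
        self.clock = clock
        self.tokens = float(capacity)  # start full, so an initial burst is allowed
        self.last = clock()

    def allow(self, cost=1):
        now = self.clock()
        # Refill for the elapsed time, never exceeding capacity.
        self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= cost:
            self.tokens -= cost
            return True
        return False
```

Unlike a sliding window, this permits bursts up to `capacity` while capping the long-run rate at `refill_per_second`.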
100
What is idempotency?
Repeating the same request → same result as making it once
101
What is REST API?
Stateless web service architecture
102
What is a microservice?
Small, independent service
103
Monolith vs microservices?
Single system vs distributed services
104
What are the key steps when approaching a system design question?
clarify requirements -> define constraints -> high-level design -> deep dive -> identify bottlenecks -> discuss tradeoffs
105
Functional vs non-functional requirements?
functional = features, non-functional = scalability, latency, reliability
106
What is latency vs throughput?
latency = time per request, throughput = requests per second
107
When is horizontal scaling preferred?
large-scale distributed systems (better fault tolerance)
108
L4 vs L7 load balancing?
L4 = transport layer (IP/port)
L7 = application layer (HTTP routing)
109
What algorithms do load balancers use?
round robin, least connections, IP hashing
110
What is caching and where can it exist?
storing data for faster access (client, CDN, server, DB)
111
What are some cache invalidation strategies?
TTL, write-through, write-back, cache-aside
112
What is a cache miss?
data not found in cache -> fetch from DB
113
What is a CDN?
globally distributed servers serving cached content closer to users
114
When should you use a CDN?
static assets, media, global apps
115
What is replication?
copying data across nodes for redundancy
116
CP vs AP systems?
CP = consistent but may be unavailable; AP = available but eventually consistent
117
What is eventual consistency?
data syncs over time across nodes
118
What is a message queue?
asynchronous communication between services
119
Why use message queues?
decoupling, reliability, buffering spikes
120
What is rate limiting and why important?
limits requests to prevent abuse/overload
121
Common rate limiting algorithms?
token bucket, leaky bucket, sliding window
122
What is a reverse proxy?
server that forwards client requests to backend servers
123
What is the OSI model?
7-layer networking model (physical -> application)
124
What are the 7 OSI layers?
physical, data link, network, transport, session, presentation, application
125
What does TCP provide?
reliable, ordered, error-checked delivery
126
What does UDP provide?
fast, connectionless, no guarantees
127
TCP 3-way handshake?
SYN -> SYN-ACK -> ACK
128
Why is TCP reliable?
acknowledgements, retransmissions, sequencing
129
What is HTTP?
The protocol for actual communication where a client sends requests and receives responses from a server
130
What is HTTPS?
HTTP over TLS encryption
131
What is TLS?
provides security by encrypting data so it cannot be read or changed by unauthorized parties during transit
132
What is DNS resolution process?
browser -> resolver -> root -> TLD -> authoritative server -> IP
133
What is NAT?
maps private IPs to public IPs
134
What is a firewall?
filters traffic based on rules
135
What is a socket?
endpoint for sending/receiving data
136
What is packet loss?
data packets failing to reach destination
137
What is a process vs thread?
process = isolated memory
thread = shared memory
138
What is a context switch?
CPU switching between tasks
139
What is scheduling?
OS deciding which process runs next
140
What are scheduling algorithms?
round robin, FCFS, priority scheduling
141
What is virtual memory?
abstraction of memory using disk
142
What is paging?
dividing memory into fixed-size pages
143
What is page fault?
accessing data not in memory -> fetch from disk
144
What is deadlock?
processes waiting indefinitely
145
What are deadlock prevention strategies?
avoid circular wait, allow preemption, ordering resources
146
What is a race condition?
multiple threads accessing shared data unsafely
147
What is a mutex?
lock ensuring only one thread accesses resource
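
A minimal sketch of a mutex guarding shared state, using Python's `threading.Lock`. The `Counter` class and `run` helper are hypothetical.

```python
import threading


class Counter:
    def __init__(self):
        self.value = 0
        self._lock = threading.Lock()

    def increment(self):
        # Only one thread at a time may run this read-modify-write.
        with self._lock:
            self.value += 1


def run(num_threads=4, per_thread=1000):
    counter = Counter()

    def work():
        for _ in range(per_thread):
            counter.increment()

    threads = [threading.Thread(target=work) for _ in range(num_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return counter.value
```

Without the lock, concurrent `+= 1` operations can interleave and lose updates, which is the race condition described in the previous card.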
148
What is a semaphore?
counter-based synchronization tool
149
What are the 4 pillars of OOP?
encapsulation, inheritance, polymorphism, abstraction
150
What is an interface?
contract defining methods without implementation
151
What is composition vs inheritance?
composition = "has-a"
inheritance = "is-a"
152
Why prefer composition over inheritance?
More flexible, less tightly coupled
153
What is SOLID?
5 design principles for maintainable code
154
Single responsibility principle?
one class = one responsibility
155
Open/closed principle?
open for extension, closed for modification
156
Dependency injection?
passing dependencies instead of creating them inside class
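
An illustrative sketch: `TokenIssuer` receives its clock instead of constructing one, so a test can pass a fake. Both class names are made up for this example.

```python
class FakeClock:
    """Test double that always reports a fixed time."""

    def __init__(self, now):
        self._now = now

    def now(self):
        return self._now


class TokenIssuer:
    def __init__(self, clock, ttl_seconds=60):
        self.clock = clock  # injected dependency, not created here
        self.ttl = ttl_seconds

    def issue(self, user):
        return {"user": user, "expires_at": self.clock.now() + self.ttl}


token = TokenIssuer(FakeClock(100)).issue("alice")
```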
157
How do OS, networking, and system design connect?
OS manages resources -> networking connects systems -> system design scales them
158
Why is TCP important for distributed systems?
ensures reliable communication between services
159
How does caching reduce latency?
avoids repeated expensive operations
160
Why are race conditions dangerous in distributed systems?
lead to inconsistent state
161
How does load balancing improve reliability?
prevents single point of failure
162
What is *args in Python?
It allows a function to accept a variable number of positional arguments, stored as a tuple.
163
What is **kwargs in Python?
It allows a function to accept a variable number of keyword arguments, stored as a dictionary.
164
Order of parameters in a function definition?
regular params → *args → **kwargs
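
A quick example of how the pieces bind. The function `describe` and its argument names are arbitrary.

```python
def describe(first, *args, **kwargs):
    # first  -> the regular positional parameter
    # args   -> tuple of any extra positional arguments
    # kwargs -> dict of any extra keyword arguments
    return first, args, kwargs


result = describe(1, 2, 3, retries=5, verbose=True)
```

Here `first` is `1`, `args` is `(2, 3)`, and `kwargs` is `{"retries": 5, "verbose": True}`.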
165
When should you use *args?
When you don’t know how many positional arguments will be passed
166
When should you use **kwargs?
When you want flexible named parameters or optional configs
167
Why are args/kwargs useful in real systems?
They make functions flexible, reusable, and extensible (common in APIs, frameworks)
168
What is CI (Continuous Integration)?
Automatically building and testing code whenever changes are merged
169
What is CD (Continuous Delivery vs Deployment)?
Delivery = ready to release manually, Deployment = automatically released to production
170
Why is CI/CD important?
Faster releases, fewer bugs, consistent deployments
171
What triggers a CI pipeline?
Code push, pull request, or merge to main branch
172
Typical CI/CD pipeline stages?
Build → Test → Package → Deploy
173
What happens in the build stage?
Compile code, install dependencies, create artifacts
174
What happens in the test stage?
Run unit, integration, and sometimes end-to-end tests
175
What is an artifact?
A built output (e.g., binary, Docker image)
176
Why use Docker in CI/CD?
Consistent environments across dev, test, and prod
177
What is a container?
Lightweight, isolated runtime environment
178
What is Kubernetes used for in CD?
Orchestrating and managing container deployments
179
What are secrets in CI/CD?
API keys, tokens, credentials stored securely
180
How should secrets be handled?
Use secret managers, never hardcode them
181
What is dependency scanning?
Checking libraries for vulnerabilities
182
What is static code analysis?
Analyzing code for bugs/security issues without running it
183
What is blue-green deployment?
Two environments; switch traffic to new version instantly
184
What is canary deployment?
Release to a small % of users first
185
What is rolling deployment?
Gradually replace old instances with new ones
186
What is rollback?
Reverting to a previous stable version
187
Why is monitoring important in CD?
Detect failures after deployment
188
What is a failed pipeline?
When a stage (build/test/deploy) fails → stops progression
189
What would you do if a deployment fails?
Rollback, check logs, fix issue, re-run pipeline
190
How do you make a pipeline faster?
Parallelize tests, cache dependencies, incremental builds
191
How do you ensure reliability?
Automated tests, monitoring, rollback strategies
192
How do you secure a pipeline?
Secret management, access control, dependency scanning
193
What is REST?
A common way to structure communication using HTTP
194
What is GraphQL?
A query language that allows clients to request exactly the data they need in a single request, which helps reduce overfetching and underfetching
195
What is gRPC?
A remote procedure call (RPC) framework that runs over HTTP/2 and uses a binary format (Protocol Buffers) to send data faster and more efficiently than text-based formats like JSON
196
What are stateless services?
Services that do not retain memory of past requests, making them easier to scale
197
What are stateful services?
Services that remember information about each client, which can simplify certain features but adds overall system complexity
198
What is synchronous communication?
A model where the client waits for a response from the server before continuing its operations
199
What is asynchronous communication?
A model where the client moves on immediately after sending a request and handles the response later when it arrives
200
What are load balancers?
Tools that distribute incoming requests across multiple servers to ensure no single server becomes overwhelmed
201
What are proxies?
Intermediaries between clients and servers that act as security filters or content accelerators
202
What are forward proxies?
Work on behalf of clients making requests
203
What are reverse proxies?
Sit in front of servers to manage requests
204
What are API gateways?
Systems that handle requests for microservices by checking authorization, limiting request volume, and routing requests to the appropriate service
205
What are web servers?
Software that receives HTTP requests to serve web pages or API responses
206
What are SQL Databases?
Databases using structured schemas, best suited for systems requiring clear relationships and strong consistency
207
What are NoSQL Databases?
Flexible databases that store unstructured or semi-structured data, ideal for handling many data types or large amounts of data across multiple servers
208
What is indexing?
The creation of fast lookup paths within a database to speed up queries
209
What is normalization?
Organizing data to remove duplication and improve data integrity
210
What is denormalization?
Intentionally adding duplication back into a database to improve the speed of read operations
211
What is replication?
Copying data to multiple servers so the system remains functional if one server fails
212
What is sharding?
Splitting data into smaller pieces across different machines to increase storage and processing capacity
213
What is partitioning?
Dividing data based on specific criteria, such as time or region, to spread the load
214
What is the CAP theorem?
A principle stating that distributed systems can only fully guarantee two out of three properties at once: consistency, availability, and partition tolerance
215
What is PACELC?
An extension of the CAP theorem that considers trade-offs between latency and consistency even when no network failure is occurring
216
What are transactions?
A way to group database operations so they all either succeed together or fail together
217
What are isolation levels?
Controls that determine how transactions interact to prevent issues like "dirty reads" or lost updates
218
What is data modeling?
The process of designing how data relates and is stored so the system can grow without requiring major changes
219
What is caching?
Storing frequently accessed data in a way that allows for faster retrieval, reducing database load and user latency
220
What are some caching strategies?
LRU, LFU, TTL, Write-through caching, write-behind caching
221
What is LRU?
least recently used - Removes the items that haven't been accessed for the longest time when space runs out
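
A compact LRU sketch built on `collections.OrderedDict`, one common way to do it in Python. The `LRUCache` name and its methods are illustrative.

```python
from collections import OrderedDict


class LRUCache:
    def __init__(self, capacity):
        self.capacity = capacity
        self.data = OrderedDict()  # oldest entry first, newest last

    def get(self, key, default=None):
        if key not in self.data:
            return default
        self.data.move_to_end(key)  # mark as most recently used
        return self.data[key]

    def put(self, key, value):
        if key in self.data:
            self.data.move_to_end(key)
        self.data[key] = value
        if len(self.data) > self.capacity:
            self.data.popitem(last=False)  # evict the least recently used
```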
222
What is LFU?
least frequently used - Removes the items that are accessed the least often
223
What is TTL?
time to live - Sets an expiration time on cached data
224
What is write-through caching?
Updates both the cache and the database immediately
225
What is write-behind caching?
Delays database updates to improve system speed
226
What is cache invalidation?
The process of updating or clearing outdated cache entries to ensure users receive fresh data
227
What are messaging and queueing systems?
Systems that store messages temporarily so services can communicate and process work asynchronously
228
What are some messaging and queueing systems?
Kafka, RabbitMQ
229
What is event-driven architecture?
A design where events trigger actions in different parts of the system
230
What is publish-subscribe model?
A model where one service publishes messages and other services subscribe to receive them, helping systems stay responsive and scalable
231
What is service discovery?
a mechanism that allows microservices to find each other automatically
232
What are circuit breakers?
A safety mechanism that detects when a service fails and temporarily stops sending requests to it to prevent worsening the failure
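
A simplified circuit breaker sketch. The threshold, the cool-down, and the single trial call after the cool-down are assumptions; production libraries add more states and metrics.

```python
import time


class CircuitBreaker:
    def __init__(self, failure_threshold=3, reset_seconds=30.0, clock=time.monotonic):
        self.failure_threshold = failure_threshold
        self.reset_seconds = reset_seconds
        self.clock = clock
        self.failures = 0
        self.opened_at = None  # None means the circuit is closed

    def call(self, func, *args, **kwargs):
        if self.opened_at is not None:
            if self.clock() - self.opened_at < self.reset_seconds:
                raise RuntimeError("circuit open: call skipped")
            # Cool-down elapsed: let one trial call through (half-open).
        try:
            result = func(*args, **kwargs)
        except Exception:
            self.failures += 1
            if self.failures >= self.failure_threshold:
                self.opened_at = self.clock()  # open (or re-open) the circuit
            raise
        # Success closes the circuit and resets the failure count.
        self.failures = 0
        self.opened_at = None
        return result
```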
233
What is idempotency?
A property where repeating the same request multiple times does not cause unintended side effects, such as charging a customer twice
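
One common way to get this property is an idempotency key supplied by the client. The sketch below uses an in-memory dict and a made-up `PaymentService`; a real service would persist the keys.

```python
class PaymentService:
    def __init__(self):
        self.charges = []    # side effects actually performed
        self._receipts = {}  # idempotency key -> stored response

    def charge(self, idempotency_key, customer, amount):
        # A retry with the same key returns the stored result, no new charge.
        if idempotency_key in self._receipts:
            return self._receipts[idempotency_key]
        self.charges.append((customer, amount))
        receipt = {"customer": customer, "amount": amount, "charge_no": len(self.charges)}
        self._receipts[idempotency_key] = receipt
        return receipt
```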
234
What is horizontal scaling?
Adding more machines to a system to handle increased load
235
What is vertical scaling?
Upgrading the resources (like CPU or RAM) of existing machines
236
What are read replicas?
Database copies used specifically to handle read operations, offloading work from the primary database
237
What is authentication?
The process of verifying a user's identity
238
What is authorization?
The process of checking what actions a verified user is allowed to perform
239
What are failover strategies?
Automatically redirecting traffic when a server or geographic region fails
240
What is leader election?
Algorithms (like Raft) that ensure distributed systems agree on which server should make primary decisions
241
What are health checks, monitoring, and alerting?
Tools and processes that provide visibility into system health so teams can respond to problems quickly