Graphx Flashcards

(32 cards)

1
Q

Web graphs

A

*Vertices represent pages
*edges represent hyder links

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
2
Q

Knowledge graphs

A

*Vertices represent entities, eg; locations,people,events
*edges represent relationships, eg; capital of, occured in

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
3
Q

Graph model

A

Graph[VD,ED]
VD: verteice
ED : edge

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
4
Q

examples of
Knowledge Graph

A

graph[String,String]
DV: String= entity names
ED: String = relationship

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
5
Q

examples of graph[String,String]
Road Network

A

DV: (longitude, latitude)
ED: (streedName, speedLimit)

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
6
Q

slides show graph creation code

A
How well did you know this?
1
Not at all
2
3
4
5
Perfectly
7
Q

Graph partitioning; Edge cut

A

*Each vertex is assigned to one partition
*Some edges are replicated

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
8
Q

Graph partitioning; Vertex cut

A

*Each edge is assigned one partition
*Some vertices are replicated

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
9
Q

GraphX adopts which graph partitioning for what function

A

adopts Vertex cut to minimize communication cost

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
10
Q

Simple graph processing

A
  • Vertices and edges can be treated as RDDs
  • Regular RDD operations can operate on them
How well did you know this?
1
Not at all
2
3
4
5
Perfectly
11
Q

Vertex Operations

A
  • Filter
  • mapValues
    *diff(other: VertexRDD[VD])
How well did you know this?
1
Not at all
2
3
4
5
Perfectly
12
Q

Vertex Operation; Filter

A

Keeps only a subset of vertices that satisfy the predicated

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
13
Q

Vertex Operation; mapValues

A

Modify the value on each
vertex

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
14
Q

Vertex Operation; diff(other: VertexRDD[VD]):

A

Removes all
vertices in the other set of vertices

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
15
Q

Edge operations

A

*mapValues
*reverse

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
16
Q

Edge operations; mapValues

A

Modify the value on each
edge

17
Q

Edge operations;Reverse

A

Reverse the direction of each
edge (V1 -> V2) → (V2 -> V2)

18
Q

Iterative Graph Processing operations

A
  • Each vertex send a
    message along outgoing
    edges
  • Each vertex updates its
    vertex data by
    aggregating incoming
    messages
  • Repeat until a stopping
    condition
19
Q

Pregel API

A

def pregelA(
vprog: (VertexId, VD, A) => VD,
sendMsg: EdgeTriplet[VD, ED] =>
Iterator[(VertexId, A)],
mergeMsg: (A, A) => A)
: Graph[VD, ED]

20
Q

Graph Analytics: Connected
Components
Goal

A

Assign each vertex the smallest ID in
its connected component

21
Q

Graph Analytics: Connected
Components
message

A

Each vertex sends its current
label (initially its own ID) to neighbors.

22
Q

Graph Analytics: Connected
Components
Update Rule

A

On each superstep, a vertex
adopts the smallest label received from its
neighbors if it’s smaller than its current
label

23
Q

Graph Analytics: Connected
Components
Terminiation

A

When no vertex changes its
label (i.e., messages stop propagating),
the algorithm converges

24
Q

Graph Analytics: Single-source Shortest
Path (SSSP)
Goal

A

Find the minimum distance from a
source vertex to all others.

25
Graph Analytics: Single-source Shortest Path (SSSP) message
Each vertex sends updated distance estimates to neighbors (e.g., its own distance + edge weight).
26
Graph Analytics: Single-source Shortest Path (SSSP) Update rule
A vertex keeps the smallest distance received and forwards new distance estimates if updated
27
Graph Analytics: Single-source Shortest Path (SSSP) Termination
When no shorter distances are propagated, the algorithm halts.
28
Graph Analytics: Page Rank Goal
Compute an "importance" score for each vertex based on its incoming links
29
Graph Analytics: Page Rank Message
Each vertex sends a fraction of its current rank (e.g., rank / out-degree) to neighbors.
30
Graph Analytics: Page Rank Update rule
Each vertex sums incoming contributions, applies damping (e.g., new_rank = 0.15 + 0.85 * sum), and updates its rank
31
Graph Analytics: Page Rank Termination
After a fixed number of iterations or when rank changes fall below a threshold
32