New big spatial data Flashcards

(15 cards)

1
Q

When should you use point geometries to represent spatial data?

A

Use points to represent single locations such as tweets, incidents, crimes, or any feature with only a coordinate.

2
Q

When should you use line geometries?

A

Use lines (LineStrings) for linear features such as roads, rivers, or paths.

3
Q

When should you use polygon geometries?

A

Use polygons for areas such as states, counties, buildings, or boundaries.

4
Q

What geometry types does Beast support?

A

Point, LineString, Polygon, MultiPoint, MultiLineString, MultiPolygon, and GeometryCollection
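
As a quick memory aid, each of these types can be written in Well-Known Text (WKT), a common text encoding for geometries. The sketch below is only illustrative: the coordinates are made up, `geometry_type` is a hypothetical helper, and nothing here is Beast's own API.

```python
# One illustrative WKT string per geometry type (coordinates are made up).
WKT_EXAMPLES = {
    "Point": "POINT (-117.33 33.97)",
    "LineString": "LINESTRING (0 0, 1 1, 2 1)",
    "Polygon": "POLYGON ((0 0, 4 0, 4 3, 0 3, 0 0))",
    "MultiPoint": "MULTIPOINT ((0 0), (1 2))",
    "MultiLineString": "MULTILINESTRING ((0 0, 1 1), (2 2, 3 3))",
    "MultiPolygon": "MULTIPOLYGON (((0 0, 1 0, 1 1, 0 0)), ((2 2, 3 2, 3 3, 2 2)))",
    "GeometryCollection": "GEOMETRYCOLLECTION (POINT (1 1), LINESTRING (0 0, 1 1))",
}


def geometry_type(wkt: str) -> str:
    """Return the upper-case type keyword that starts a WKT string."""
    return wkt.split("(", 1)[0].strip().upper()


for name, wkt in WKT_EXAMPLES.items():
    print(f"{name:20s} {wkt}")
```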

5
Q

How is big spatial data related to the four V’s of big data?

A

Volume: Spatial datasets can include millions of features (e.g., states, roads, satellite imagery).

Velocity: Data from IoT, autonomous vehicles, or sensors arrives continuously.

Variety: Many input formats (CSV, GPX, KML, GeoJSON, shapefiles, GeoTIFF).

Veracity: Spatial data quality varies across sources (government data, satellites, crowdsourcing).

6
Q

How does spatial partitioning help with query processing?

A

It divides data into balanced partitions, enabling faster range filters, joins, and load balancing. Beast uses R*-Grove for efficient high-utilization partitions.
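
To see why partitioning speeds up a range filter, here is a minimal sketch that uses a plain uniform grid. This is a deliberately simpler stand-in, not R*-Grove (a uniform grid does not balance skewed data), and all function names are hypothetical.

```python
from collections import defaultdict


def grid_cell(x, y, bounds, n):
    """Map a point to a (col, row) cell of an n x n grid laid over bounds."""
    min_x, min_y, max_x, max_y = bounds
    col = int((x - min_x) / (max_x - min_x) * n)
    row = int((y - min_y) / (max_y - min_y) * n)
    # Clamp so points on or beyond the edges still map to a valid cell.
    return min(max(col, 0), n - 1), min(max(row, 0), n - 1)


def partition(points, bounds, n):
    """Group points by grid cell; each cell stands in for one partition."""
    parts = defaultdict(list)
    for x, y in points:
        parts[grid_cell(x, y, bounds, n)].append((x, y))
    return parts


def range_query(parts, bounds, n, query):
    """Scan only the partitions whose cells overlap the query rectangle."""
    qx1, qy1, qx2, qy2 = query
    c1, r1 = grid_cell(qx1, qy1, bounds, n)
    c2, r2 = grid_cell(qx2, qy2, bounds, n)
    hits, scanned = [], 0
    for col in range(c1, c2 + 1):
        for row in range(r1, r2 + 1):
            cell_points = parts.get((col, row))
            if cell_points is None:
                continue  # empty partition, nothing to read
            scanned += 1
            hits.extend(
                (x, y) for x, y in cell_points
                if qx1 <= x <= qx2 and qy1 <= y <= qy2
            )
    return hits, scanned


bounds = (0.0, 0.0, 10.0, 10.0)
points = [(1, 1), (2, 8), (9, 9), (5, 5), (7, 2)]
parts = partition(points, bounds, n=2)
hits, scanned = range_query(parts, bounds, 2, query=(0, 0, 4, 4))
print(hits, "found after scanning", scanned, "of", len(parts), "partitions")
```

The query touches one grid cell, so the other partitions are never read. That pruning is the benefit; keeping the partitions balanced is what the more sophisticated partitioners add.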

7
Q

Why is efficient spatial partitioning important in distributed systems?

A

Good partitioning ensures even workload distribution, reduces skew, and speeds up spatial operations.

8
Q

What challenges occur when parsing irregular spatial file formats?

A

Spatial formats such as shapefiles, GeoJSON, and block-compressed ZIP files require:
Decompression
Detecting record boundaries
Handling binary vs text formats
Avoiding partial-record splitting during parallel load
Beast solves this with split-aware parallel parsing.

9
Q

How does Beast handle parsing irregular or compressed formats in parallel?

A

For every split (except the first), it skips to the next compressed block boundary, starts decompressing, then skips to the next record boundary, so that each partition processes only whole records.
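
The record-boundary step can be sketched on its own. The example below assumes uncompressed, newline-delimited text, so the block-decompression step is left out entirely; `read_split` is a hypothetical name and this is not Beast's code. A record belongs to the split in which it starts, so a split may read past its end to finish its last record.

```python
import io


def read_split(data: bytes, start: int, end: int):
    """Return the whole records (lines) that start inside [start, end)."""
    stream = io.BytesIO(data)
    if start > 0:
        # Not the first split: back up one byte and discard through the
        # next newline, which lands on the next record boundary.
        stream.seek(start - 1)
        stream.readline()
    records = []
    # Keep reading while the next record starts before `end`; the last
    # record may run past `end`, which is intended.
    while stream.tell() < end:
        line = stream.readline()
        if not line:
            break  # end of file
        records.append(line.rstrip(b"\n").decode())
    return records


data = b"a,1\nbb,2\nccc,3\ndddd,4\n"
split_size = 8
splits = [(s, min(s + split_size, len(data)))
          for s in range(0, len(data), split_size)]
per_split = [read_split(data, s, e) for s, e in splits]
all_records = [rec for part in per_split for rec in part]
print(per_split)
```

Every record appears in exactly one split even though the split boundaries fall in the middle of records.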

10
Q

What makes visualizing big spatial data challenging?

A

Large datasets are too big to draw at once, and a single-level image loses detail when zoomed in.

11
Q

How does Beast overcome visualization challenges?

A

Provides plotImage for single-level images.

Provides plotPyramid for multilevel tile-based maps (similar to web map zooming).
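
To make the "web map zooming" comparison concrete, the sketch below computes which tile contains a location at each zoom level under the standard Web Mercator tiling scheme used by many web maps. Whether plotPyramid uses exactly this scheme is an assumption; the function names are hypothetical.

```python
import math


def tiles_at_level(zoom: int) -> int:
    """Each level splits every tile into four, so level z has 4**z tiles."""
    return 4 ** zoom


def tile_index(lon: float, lat: float, zoom: int):
    """Return the (x, y) tile containing a lon/lat point at a zoom level."""
    n = 2 ** zoom  # tiles per axis at this level
    x = int((lon + 180.0) / 360.0 * n)
    lat_rad = math.radians(lat)
    y = int((1.0 - math.asinh(math.tan(lat_rad)) / math.pi) / 2.0 * n)
    # Clamp for points on the far edges of the map.
    return min(max(x, 0), n - 1), min(max(y, 0), n - 1)


for zoom in range(4):
    print(zoom, tiles_at_level(zoom), tile_index(-117.33, 33.97, zoom))
```

Only the tiles visible at the current zoom level need to be fetched, which is why a pyramid stays sharp where a single image would blur.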

12
Q

What is the difference between vector and raster data?

A

Vector data: Geometries such as points, lines, polygons; used for buildings, states, roads.

Raster data: Gridded pixel data (e.g., satellite images, temperature, vegetation), stored as tiles. Each tile is a 2D array of pixels whose values have a type such as Int or Float.

13
Q

What operations can be applied to raster data?

A

Local pixel-wise operations, focal (neighborhood) operations, filtering pixels, flattening, rescaling, and raster-vector joins (e.g., RaptorJoin).
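
The difference between local and focal operations is easiest to see on a tiny tile. This is a minimal sketch on plain Python lists, with hypothetical function names rather than Beast's API: a local operation looks at one pixel at a time, while a focal operation looks at a pixel together with its neighbors.

```python
def local_op(tile, fn):
    """Local operation: apply fn to every pixel independently."""
    return [[fn(v) for v in row] for row in tile]


def focal_mean(tile):
    """Focal operation: replace each pixel with the mean of its 3x3 window.

    Windows are clipped at the tile edges, so border pixels average
    fewer neighbors.
    """
    rows, cols = len(tile), len(tile[0])
    out = []
    for r in range(rows):
        out_row = []
        for c in range(cols):
            window = [
                tile[rr][cc]
                for rr in range(max(0, r - 1), min(rows, r + 2))
                for cc in range(max(0, c - 1), min(cols, c + 2))
            ]
            out_row.append(sum(window) / len(window))
        out.append(out_row)
    return out


kelvin = [
    [280.0, 281.0, 282.0],
    [283.0, 284.0, 285.0],
    [286.0, 287.0, 288.0],
]
celsius = local_op(kelvin, lambda k: k - 273.15)  # pixel-wise conversion
smoothed = focal_mean(kelvin)                     # neighborhood smoothing
print(celsius[0])
print(smoothed[1])
```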

14
Q

What does the raster data model look like in Beast?

A

Each tile contains:
A 2D array of pixels
Tile ID
Geolocation metadata
A RasterRDD[T] holds all tiles.
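
A rough way to picture this model is a small record per tile. The sketch below is an illustration only: the field names and the `Tile` class are hypothetical, the geolocation metadata is reduced to a top-left origin and a pixel size, and a plain list stands in for the distributed RasterRDD[T].

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Tile:
    tile_id: int
    pixels: List[List[float]]  # 2D array, indexed as pixels[row][col]
    origin_x: float            # x of the tile's top-left corner
    origin_y: float            # y of the tile's top-left corner
    pixel_size: float          # width and height of one pixel in map units

    def pixel_center(self, row: int, col: int):
        """Convert a (row, col) grid position to map coordinates."""
        x = self.origin_x + (col + 0.5) * self.pixel_size
        y = self.origin_y - (row + 0.5) * self.pixel_size  # rows go downward
        return x, y


# A plain list of tiles stands in for the distributed collection.
raster = [
    Tile(0, [[1.0, 2.0], [3.0, 4.0]], origin_x=0.0, origin_y=2.0, pixel_size=1.0),
    Tile(1, [[5.0, 6.0], [7.0, 8.0]], origin_x=2.0, origin_y=2.0, pixel_size=1.0),
]
print(raster[0].pixel_center(0, 0))
print(raster[1].pixel_center(1, 1))
```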

15
Q

Why is combining raster and vector data useful?

A

It enables analytics such as overlaying population (vector) on temperature (raster) or combining satellite data with administrative boundaries.
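
A simple instance of such a combination is a zonal statistic: the average raster value inside a polygon. The sketch below takes the naive route of testing every pixel center against the polygon, which is not how RaptorJoin works and would not scale to large data; the data values and names are made up for illustration.

```python
def point_in_polygon(x, y, polygon):
    """Ray-casting test; polygon is a list of (x, y) vertices."""
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        if (y1 > y) != (y2 > y):  # edge crosses the horizontal ray
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def zonal_mean(pixels, origin_x, origin_y, pixel_size, polygon):
    """Average the pixels whose centers fall inside the polygon."""
    values = []
    for row, line in enumerate(pixels):
        for col, value in enumerate(line):
            cx = origin_x + (col + 0.5) * pixel_size
            cy = origin_y - (row + 0.5) * pixel_size
            if point_in_polygon(cx, cy, polygon):
                values.append(value)
    return sum(values) / len(values) if values else None


temperature = [
    [10.0, 12.0, 14.0],
    [11.0, 13.0, 15.0],
    [16.0, 18.0, 20.0],
]
county = [(0.0, 1.0), (2.0, 1.0), (2.0, 3.0), (0.0, 3.0)]  # made-up boundary
mean = zonal_mean(temperature, 0.0, 3.0, 1.0, county)
print(mean)
```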
