graphics processing units (GPUs)
message passing
send message routine
A routine used by a processor in machines with private memories to pass a message to another processor
receive message routine
A routine used by a processor in machines with private memories to accept a message from another processor
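The send and receive routines can be illustrated with a small simulation. This is only a sketch under stated assumptions: the `Node` class, the `send`/`receive` helpers, and `run_demo` are hypothetical names, and Python threads with a `queue.Queue` inbox stand in for separate processors and the interconnect. Each node touches only its own `private` dictionary; the only way data moves between nodes is by an explicit message.

```python
import queue
import threading


class Node:
    """One simulated processor with private memory and an inbox (hypothetical)."""

    def __init__(self, name):
        self.name = name
        self.inbox = queue.Queue()  # stand-in for the network link to this node
        self.private = {}           # data no other node reads or writes directly


def send(dest, payload):
    """Send message routine: pass a value to another node's inbox."""
    dest.inbox.put(payload)


def receive(node, timeout=1.0):
    """Receive message routine: block until a message arrives for this node."""
    return node.inbox.get(timeout=timeout)


def run_demo():
    a, b = Node("A"), Node("B")
    a.private["x"] = 21

    def worker_b():
        # B cannot read a.private; it only sees what arrives as a message.
        value = receive(b)
        b.private["y"] = value * 2
        send(a, b.private["y"])

    t = threading.Thread(target=worker_b)
    t.start()
    send(b, a.private["x"])   # A sends its value to B
    result = receive(a)       # A waits for B's reply
    t.join()
    return result
```

Calling `run_demo()` has node A send a value to node B and wait for the doubled value to come back. The receive also acts as synchronization, since A cannot proceed until B's reply arrives.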
clusters
collections of computers connected via I/O over standard network switches to form a message-passing multiprocessor
software as a service (SaaS)
rather than selling software that is installed and run on customers’ own computers, the software is run at a remote site and made available to customers over the Internet, typically via a Web interface
network bandwidth
bisection bandwidth
fully connected network
multistage network
a network that supplies a small switch at each node
crossbar network
a network that allows any node to communicate with any other node in one pass through the network
memory-mapped I/O
An I/O scheme in which portions of the address space are assigned to I/O devices, and reads and writes to those addresses are interpreted as commands to the I/O device
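A toy model may make the address-decoding idea concrete. This is a sketch, not real hardware access: `ConsoleDevice`, `AddressSpace`, the base address `0xFFFF0000`, and the two-register layout are all assumptions made up for illustration. Loads and stores to ordinary addresses go to memory, while loads and stores inside the device's range are routed to the device and treated as commands.

```python
class ConsoleDevice:
    """Hypothetical output device: register 0 = data, register 1 = status."""

    def __init__(self):
        self.output = []

    def write_reg(self, offset, value):
        if offset == 0:
            # A write to the data register is a command: emit one character.
            self.output.append(chr(value))

    def read_reg(self, offset):
        if offset == 1:
            return 1  # status register: this toy device is always ready
        return 0


class AddressSpace:
    """Routes each access to RAM or to the device, based on the address."""

    DEVICE_BASE = 0xFFFF0000  # assumed range reserved for the device
    DEVICE_SIZE = 2           # two registers

    def __init__(self, device):
        self.ram = {}
        self.device = device

    def _is_device(self, addr):
        return self.DEVICE_BASE <= addr < self.DEVICE_BASE + self.DEVICE_SIZE

    def store(self, addr, value):
        if self._is_device(addr):
            self.device.write_reg(addr - self.DEVICE_BASE, value)
        else:
            self.ram[addr] = value

    def load(self, addr):
        if self._is_device(addr):
            return self.device.read_reg(addr - self.DEVICE_BASE)
        return self.ram.get(addr, 0)
```

In this model the program uses the same `load` and `store` operations for memory and for I/O; only the address decides which one it reaches.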
device driver
a program that controls an I/O device that is attached to the computer
polling
GPU memory structure
GPUs rely on smaller streaming caches and on multithreading of SIMD instructions to hide the long latency of DRAM
organization of multiprocessor with multiple private address spaces
weakness of private memory
when should you use polling?