Pthreads Sync for IPC
• Initialized with PTHREAD_PROCESS_SHARED:
- pthread_mutexattr_t
- pthread_condattr_t
• Sync info must be available to all processes - create the mutex in a shared memory region (See video #274 for code sample)
Copy (Messages) vs Shared Memory
Copy (Messages)
• CPU cycles to copy data to/from port
• Suits smaller messages
Map (Shmem)
• CPU cycles to map memory into address space
• CPU cycles to copy data to channel
• Set up once, use many times => good payoff
• Large block used once => can still be a good payoff
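A rough cost model for the tradeoff above. This is only a sketch: the function names, the per-byte copy cost, and the one-time mapping cost are all made-up illustrative values, not measurements. It assumes messages are copied twice (to and from the port) and shared memory needs one copy into the channel plus a one-time setup.

```python
def copy_cost(n_messages, msg_bytes, copy_per_byte=1.0):
    """Messages: data is copied to the port and from the port on every exchange."""
    return n_messages * 2 * msg_bytes * copy_per_byte


def map_cost(n_messages, msg_bytes, map_setup=50_000.0, copy_per_byte=1.0):
    """Shmem: pay the (hypothetical) mapping cost once, then one copy into the channel."""
    return map_setup + n_messages * msg_bytes * copy_per_byte


# One small message: copying is cheaper than setting up a mapping.
small_once = (copy_cost(1, 100), map_cost(1, 100))
# Many small messages over the same mapping: setup cost is amortized.
small_many = (copy_cost(1000, 100), map_cost(1000, 100))
# One large block: the extra copy outweighs the setup cost.
large_once = (copy_cost(1, 1_000_000), map_cost(1, 1_000_000))
```

With these assumed numbers, mapping wins once the total bytes exchanged exceed the setup cost divided by the per-byte copy cost.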
Freeing Physical Memory
• When should pages be swapped?
Memory usage above threshold (high watermark)
CPU usage below threshold (low watermark)
• Which pages should be swapped?
History based prediction - Least Recently Used (LRU)
Pages without dirty bit set
Avoid non-swappable pages
Categorize pages into different types
Second chance variation of LRU - make 2 passes
before deciding which page to swap
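The second chance idea can be sketched as a small simulation. Assumptions for illustration only: each frame carries a software "referenced" flag (real hardware sets an access bit), newly loaded pages start with the flag clear, and the names `access` and `second_chance_evict` are hypothetical.

```python
from collections import deque


def second_chance_evict(frames):
    """Pick a victim from `frames`, a deque of [page, referenced] entries.

    A page whose reference bit is set gets the bit cleared and moves to the
    back (its second chance). The first page found with the bit clear is
    evicted, so at most two passes over the frames are needed.
    """
    while True:
        page, referenced = frames.popleft()
        if referenced:
            frames.append([page, False])
        else:
            return page


def access(frames, page, capacity):
    """Touch `page`; return the evicted page, or None if nothing was evicted."""
    for entry in frames:
        if entry[0] == page:
            entry[1] = True  # hit: mark as recently used
            return None
    victim = None
    if len(frames) >= capacity:
        victim = second_chance_evict(frames)
    frames.append([page, False])
    return victim


frames = deque()
evicted = [access(frames, p, capacity=3) for p in "ABCADE"]
```

In this trace A is re-referenced before memory fills up, so B is evicted first even though A is the oldest page.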
Memory Allocation
Memory Allocator
• Determines VA->PA mapping, checks validity/perm
Kernel Level Allocators
• Allocate pages for kernel state, process state
• Keep track of free memory
User-level allocators
• Dynamic process state (heap) using malloc/free
• Once malloc'd, kernel not involved. Examples: dlmalloc, jemalloc, Hoard, tcmalloc
Homogeneous Architectures
• Each node can do any processing step
+ Keeps front end simple - doesn’t mean each node has all data but can get to all data
- How to benefit from caching?
• Scale by adding more nodes
Priority Scheduling
Copy-on-Write
If new process is created, why copy all the memory?
• Instead, map new VA to original page
• Write protect original page
• If only read, this will save memory & time
• On write, page fault and copy/update page tables
• Only copy pages on write => Copy-on-Write
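A toy model of the steps above, assuming a dictionary-based page table and per-frame reference counts. The classes `PhysicalMemory` and `Process` are hypothetical and only illustrate the idea; a real kernel does this in the page fault handler.

```python
class PhysicalMemory:
    """Hypothetical pool of physical frames with reference counts."""

    def __init__(self):
        self.frames = {}    # frame number -> bytearray contents
        self.refcount = {}  # frame number -> number of mappings
        self.next_frame = 0

    def alloc(self, data=b""):
        frame = self.next_frame
        self.next_frame += 1
        self.frames[frame] = bytearray(data)  # copies the data
        self.refcount[frame] = 1
        return frame


class Process:
    def __init__(self, mem):
        self.mem = mem
        self.page_table = {}  # virtual page -> [frame, writable]

    def map_new(self, vpage, data):
        self.page_table[vpage] = [self.mem.alloc(data), True]

    def fork(self):
        child = Process(self.mem)
        for vpage, entry in self.page_table.items():
            entry[1] = False                       # write-protect original page
            child.page_table[vpage] = [entry[0], False]  # map child to same frame
            self.mem.refcount[entry[0]] += 1
        return child

    def read(self, vpage):
        return bytes(self.mem.frames[self.page_table[vpage][0]])

    def write(self, vpage, data):
        entry = self.page_table[vpage]
        if not entry[1]:                           # simulated page fault
            frame = entry[0]
            if self.mem.refcount[frame] > 1:       # still shared: copy the page
                self.mem.refcount[frame] -= 1
                entry[0] = self.mem.alloc(self.mem.frames[frame])
            entry[1] = True                        # page table updated
        self.mem.frames[entry[0]][:] = data


mem = PhysicalMemory()
parent = Process(mem)
parent.map_new(0, b"hello")
child = parent.fork()          # no page is copied here
child.write(0, b"world")       # first write triggers the copy
```

If neither process ever writes, only one frame is ever allocated, which is the memory and time saving noted above.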
Split Device Driver Model
• Approach: device access ctrl split btwn
front end driver in guest VM (device API)
back-end driver in service VM (or host)
• Modified guest drivers
• Limited to paravirtualized guests
+ Eliminates emul OHs
+ Allow better mgmt of shared devices
Internet Service
• Any service available through a web interface (mail, weather, banking, etc.)
3 components:
• Presentation - static content
• Business logic - dynamic content
• Database tier - data store
Problems with Trap & Emulate
• x86 pre 2005:
4 rings, no root/non-root modes yet
HV = ring 0, guest OS = ring 1
• 17 instructions do not trap, fail silently
• e.g. interrupt enable/disable bit, POPF/PUSHF
• HV doesn’t know there was an error, assumes successful
Strict consistency
• Updates are visible everywhere immediately
• In practice, even on SMP there are no guarantees
• Even harder on DSM
• Doesn’t really exist
More Sync Constructs (All need hardware support)
What are the overheads associated with scheduling? Do you understand the tradeoffs associated with the frequency of preemption and scheduling/what types of workloads benefit from frequent vs. infrequent intervention of the scheduler (short vs. long timeslices)?
Device Drivers
Summary of Page based DSM
Performance Considerations
DSM performance metric == access latency
How do we make the most of local memory?
Migration
• Makes sense for SRSW (single reader, single writer)
• Requires data movement
Replication (Caching)
• More general, requires consistency management
• Best for low latency
CPU - Device Interconnect
inode example
(Quiz on video #354 & 355)
Memory Virtualization Full
• Full virtualization
• Guest OS expects contig phys mem start @ 0
• Virtual (VM), Physical (what guest thinks is physical), Machine (actual physical memory)
Option 1:
• Guest page table: VA => PA, HypVisor: PA => MA
• Still uses TLB, MMU for PA=>MA, software VA=>PA
Option 2:
• Guest page tables VA=>PA
• Hypervisor shadow PT: VA=>MA
• Hypervisor maintains consistency
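The two options can be compared with a small sketch. Everything here is illustrative: page tables are plain dictionaries keyed by page number, and the `Hypervisor` class and its method names are hypothetical. The point is that the shadow table is the composition of the guest's VA=>PA table with the hypervisor's PA=>MA table, and it has to be refreshed whenever the guest changes a mapping.

```python
def build_shadow(guest_pt, p2m):
    """Compose guest VA=>PA with hypervisor PA=>MA into VA=>MA."""
    return {va: p2m[pa] for va, pa in guest_pt.items() if pa in p2m}


class Hypervisor:
    def __init__(self, p2m):
        self.p2m = dict(p2m)  # guest-physical page -> machine page
        self.guest_pt = {}    # guest's own table: VA page -> PA page
        self.shadow_pt = {}   # table the hardware would walk: VA page -> MA page

    def guest_set_mapping(self, va, pa):
        """Guest page-table update, intercepted to keep the shadow consistent."""
        self.guest_pt[va] = pa
        if pa in self.p2m:
            self.shadow_pt[va] = self.p2m[pa]
        else:
            self.shadow_pt.pop(va, None)

    def translate_two_step(self, va):
        """Option 1: walk VA=>PA, then PA=>MA."""
        return self.p2m[self.guest_pt[va]]

    def translate_shadow(self, va):
        """Option 2: single lookup in the shadow page table."""
        return self.shadow_pt[va]


hv = Hypervisor({0: 100, 1: 205})
hv.guest_set_mapping(7, 0)
hv.guest_set_mapping(8, 1)
first = (hv.translate_two_step(7), hv.translate_shadow(7))
hv.guest_set_mapping(7, 1)  # guest remaps VA 7; shadow must follow
second = hv.translate_shadow(7)
```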
Hosted Virtualization - Type 2
Virtual File System
Demand Paging
• Virtual memory pages are not always in physical memory
• Physical page frame saved & restored to/from secondary storage
• Pages swapped in/out of mem & swap partition
• Page can be ‘pinned’ - disables swapping
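A minimal simulation of swapping with pinned pages. Assumptions: the `DemandPager` class is hypothetical, eviction is simple FIFO over unpinned pages (the notes do not fix a policy here), and page contents are placeholder strings.

```python
class DemandPager:
    def __init__(self, num_frames):
        self.num_frames = num_frames
        self.resident = {}   # page -> contents currently in physical memory
        self.swap = {}       # page -> contents saved on secondary storage
        self.pinned = set()  # pages that must never be swapped out
        self.faults = 0

    def pin(self, page):
        self.access(page)    # bring the page in before pinning it
        self.pinned.add(page)

    def access(self, page):
        if page in self.resident:
            return self.resident[page]
        self.faults += 1     # page fault: page is not in physical memory
        if len(self.resident) >= self.num_frames:
            self._swap_out_one()
        # Restore from swap if present, else create placeholder contents.
        self.resident[page] = self.swap.pop(page, f"data-{page}")
        return self.resident[page]

    def _swap_out_one(self):
        for victim in list(self.resident):  # oldest resident page first
            if victim not in self.pinned:
                self.swap[victim] = self.resident.pop(victim)
                return
        raise MemoryError("all resident pages are pinned")


pager = DemandPager(num_frames=2)
pager.pin("A")
pager.access("B")
pager.access("C")            # memory full: B is swapped out, A stays (pinned)
```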
POSIX Shared Memory
Shared Memory Design Considerations