A normal sequential program consists of a single thread of execution.
Threads provide a way for programmers to express concurrency
(i.e. multiple programs, or sequences of instructions, running or
appearing to run at the same time) in a program
There are multiple threads of execution
Threads may perform the same task.
Threads may perform different tasks.
Properties of Threads:
Resource Utilization: blocked/waiting threads give up their resources (e.g. the CPU)
Priority: higher priority gets more CPU time; lower priority gets less
Modularization: a way to organize the program's tasks of execution
Important Notes on Threads:
A thread can create new threads using thread_fork
All threads share the program's global variables and heap, but each has a
private stack
The operating system controls which thread runs; the original thread and the new
thread run concurrently.
Once the threads start running, you cannot predict their interleaving, since you
have no control over scheduling.
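The notes above use OS/161's thread_fork; as a rough analogue in standard C, a minimal sketch with POSIX pthreads (pthread_create plays the role of thread_fork; the function and variable names here are invented for illustration). It shows a global shared by both threads and a local that lives only on the child's private stack:

```c
#include <pthread.h>

static int shared_global = 0;  /* globals are visible to every thread */

static void *child(void *arg) {
    (void)arg;
    int local = 42;            /* lives on this thread's private stack */
    shared_global = local;     /* writes to shared memory are seen by all threads */
    return NULL;
}

int run_fork_demo(void) {
    pthread_t t;
    /* pthread_create is the analogue of thread_fork: the new thread starts
       running child() concurrently with its creator */
    pthread_create(&t, NULL, child, NULL);
    pthread_join(t, NULL);     /* wait for the child to finish */
    return shared_global;
}
```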
Concurrent Program Execution
A program has at least one thread, and all of its threads share that program's address space.
It is the operating system's job to keep the threads' stacks from overlapping --> while also
giving each stack as much room as possible
A multithreaded program gives:
the illusion of being able to run multiple things at the same time
better organization of the program
better performance of the program
Implementing Concurrent Threads
Hardware support: P (processors) x C (cores) x M (hardware threads per core)
Up to P*C*M threads can execute simultaneously.
Timesharing: multiple threads take turns on the same hardware (sharing the CPU); rapidly switching
between threads so that all make progress; they use the CPU one at a time
Hardware support + Timesharing: P*C*M + timesharing
Timesharing and Context Switches
When timesharing, the switch from one thread to another is called a context switch
During a context switch, the system must:
decide which thread will run next (the operating system scheduler's job)
save the state (register values) of the current thread
load the state (register values) of the new thread
make sure the resumed thread continues on the CPU at the exact position
and with the exact state it had when it was switched out
Thread context must be saved/restored carefully
switchframe_switch (callee): saves all the callee-saved registers
thread_switch (caller): the caller-saved registers are saved around its calls (by the C calling convention)
What causes context switch?
thread_yield: running thread voluntarily allows other threads to run; give up CPU voluntarily
Four causes of a context switch:
the thread goes to the Blocked state --> to wait for something
Preemption: the thread involuntarily gives up the CPU because the scheduling quantum countdown expires --> interrupt --> context switch
thread_yield: the thread voluntarily calls thread_yield
the thread terminates
Thread States
Thread states (3 states)
running: the thread is executing on the CPU right now; if the CPU supports only one thread
at a time, then only one thread is in this state.
ready: the thread is not waiting for anything except the CPU; it is waiting to run instructions
blocked: the thread is waiting for something other than the CPU (e.g. I/O or a lock); it cannot
continue until that becomes available
Transitions:
ready to running: the thread is dispatched by a context switch
running to ready: the thread yields or is preempted
running to blocked: the thread needs something unavailable, so it blocks to wait for it
blocked to ready: when that something becomes available, the blocked thread is moved
back to the ready state
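The four legal transitions above can be written down as a tiny checker (a sketch; the enum and function names are invented for illustration):

```c
typedef enum { READY, RUNNING, BLOCKED } thread_state;

/* returns 1 iff (from -> to) is one of the four legal transitions:
   dispatch, yield/preemption, block, and wakeup */
int legal_transition(thread_state from, thread_state to) {
    return (from == READY   && to == RUNNING)   /* dispatch (context switch) */
        || (from == RUNNING && to == READY)     /* yield or preemption */
        || (from == RUNNING && to == BLOCKED)   /* wait for a resource */
        || (from == BLOCKED && to == READY);    /* resource became available */
}
```

Note there is no blocked-to-running edge: a woken thread must go through the ready state and be scheduled again.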
Thread Stack after Voluntary Context Switch
the program calls thread_yield first to yield the CPU
thread_yield then calls thread_switch to perform a context switch
thread_switch chooses a new thread, then calls switchframe_switch to perform the low-level context switch
Timesharing and Preemption:
timesharing -- the scheduling quantum is the upper bound on how long a thread can run before it
must yield the CPU; this stop is involuntary.
preemption forces a running thread to stop
Since the CPU can run only one thread, and that thread is not aware that it needs to stop, the only way to stop a thread is through hardware -- using an interrupt!
Interrupt
an interrupt is an event that occurs during the execution of a program
interrupts are caused by system devices
the CPU knows which device caused the interrupt, and the OS handles it
the CPU has a trap table that stores the addresses of handler code.
CPU --> receives an interrupt --> stops executing the user's program --> executes the code
at the address stored in the trap table (the interrupt handler, operating system code)
Interrupt handler
creates a trap frame; saves every register in the system
calls the actual handler --> performs device-specific processing
restores the saved thread context from the trap frame and resumes execution of the thread
The trap frame is created immediately after the interrupt
The interrupt handler's stack frame follows the trap frame
Difference between trap frame and switch functions
the trap frame saves all the registers of the interrupted program (everything that was live before the interrupt)
switchframe_switch saves only the callee-saved registers, since thread_switch is called like an
ordinary function and the caller-saved registers are already preserved by the C calling convention
the trap frame is created by the CPU, not the program (it cannot be predicted when, so every register
must be saved), while the switch functions are called by the program!
Preemptive Scheduling
Preemptive Scheduling uses the scheduling quantum
The quantum cannot be too long --> timesharing becomes ineffective; it cannot be too short --> too much time is spent context switching and not much work gets done!
Every time a thread is dispatched it gets a fresh quantum; leftover quantum does not carry over
Every computer has a clock --> it runs a countdown --> when the countdown expires --> the clock fires an interrupt --> this causes the CPU to execute the
interrupt handler, so the OS takes control (and can perform a context switch: yield -> thread_switch -> switchframe_switch)
Two ways to implement the quantum:
on every context switch, tell the clock to start a new countdown --> really slow
the clock counts down over a short fixed period and repeats; on each timer interrupt, check (against the recorded time when the thread started running)
whether the thread has exceeded its quantum; if yes, preempt it, otherwise let it keep running.
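The second scheme can be sketched as a tiny single-threaded simulation (the tick period, quantum length, and names are hypothetical values chosen for illustration): the clock ticks at a fixed short period, and on each tick the handler compares how long the current thread has been running against the quantum.

```c
#define TICK_MS    10   /* clock interrupt period (hypothetical value) */
#define QUANTUM_MS 50   /* scheduling quantum (hypothetical value) */

static int now_ms = 0;        /* simulated wall clock */
static int run_start_ms = 0;  /* recorded when the current thread was dispatched */
static int preemptions = 0;

static void clock_interrupt(void) {
    now_ms += TICK_MS;
    if (now_ms - run_start_ms >= QUANTUM_MS) {
        preemptions++;          /* quantum exceeded: force a context switch */
        run_start_ms = now_ms;  /* the next thread gets a fresh quantum */
    }
}

/* run n_ticks timer interrupts and report how many preemptions occurred */
int simulate_ticks(int n_ticks) {
    now_ms = run_start_ms = preemptions = 0;
    for (int i = 0; i < n_ticks; i++)
        clock_interrupt();
    return preemptions;
}
```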
Synchronization
All threads in a concurrent program share access to the program's global variables and the heap (read & write).
Any block of code in a concurrent program where multiple threads read and write a shared variable (either global or heap)
is called a critical section
Critical Section
The order in which the instructions in that section execute matters
We have no control over preemption and no control over which thread runs next --> the OS has that control
Context switches happen at the granularity of assembly instructions, not C statements, so a single C statement can be interrupted partway through.
There are multiple programs running in the background; therefore, interrupts can happen at any time.
Race Condition
a race condition exists when the program's result depends on the order of execution.
it can cause: incorrect output, returned errors, memory errors
Race conditions occur when multiple threads read and write the same memory at the same time
Sources of race conditions:
implementation
compiler
CPU
the compiler and CPU can introduce race conditions through optimizations -- rearranging the code
By applying a memory model, the CPU and compiler know which optimizations are safe/unsafe
In C99, volatile disables the compiler's register-allocation optimization for a variable
register allocation optimization: the compiler keeps the up-to-date value in a register for as long as possible
functions a and b might use different registers --> neither knows which copy holds the up-to-date value
volatile --> turns register optimization off; it forces a load from memory on every use
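A small sketch of the effect (names invented): one thread spins on a flag that another thread sets; declaring the flag volatile forces the spinning loop to re-load it from memory on every iteration instead of caching it in a register. (volatile alone is not a general synchronization mechanism -- it only suppresses this one class of compiler optimization.)

```c
#include <pthread.h>
#include <unistd.h>

volatile int done = 0;   /* volatile: every read must come from memory */

static void *setter(void *arg) {
    (void)arg;
    usleep(1000);        /* let the main thread start spinning first */
    done = 1;
    return NULL;
}

int wait_for_flag(void) {
    pthread_t t;
    pthread_create(&t, NULL, setter, NULL);
    while (!done)
        ;                /* re-loads done each iteration because it is volatile */
    pthread_join(t, NULL);
    return done;
}
```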
Identify race conditions
inspect each variable: is it possible for multiple threads to read/write it at the same time?
exceptions: constants, and memory that all threads only read
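A classic way to see a race condition (a sketch; names invented): two threads increment a shared counter with no lock. counter++ is a load, an add, and a store, so increments from the two threads can interleave and be lost; the final value varies from run to run and is often less than the expected total.

```c
#include <pthread.h>

#define N 100000
static long counter = 0;    /* shared; incremented without any lock */

static void *worker(void *arg) {
    (void)arg;
    for (int i = 0; i < N; i++)
        counter++;          /* load-add-store: not atomic, updates can be lost */
    return NULL;
}

long run_racy_demo(void) {
    counter = 0;
    pthread_t a, b;
    pthread_create(&a, NULL, worker, NULL);
    pthread_create(&b, NULL, worker, NULL);
    pthread_join(a, NULL);
    pthread_join(b, NULL);
    return counter;         /* at most 2*N; often less because of lost updates */
}
```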
Enforcing Mutual Exclusion With Lock
Acquire/Release must ensure that only one thread at a time can hold the lock, even if several attempt to Acquire at the same time.
If a thread cannot Acquire the lock immediately, it must wait until the lock is available.
A thread must Acquire the lock before executing code that touches the shared variable.
C code alone cannot stop interrupts and context switches; we need hardware help:
the hardware provides a way to implement atomic test-and-set (atomic means it cannot be interrupted!) for synchronization primitives like locks
MIPS and ARM use a pair of instructions to implement test-and-set:
load-linked: ll reads the current value of the lock and marks the location as watched;
store-conditional: sc attempts to write the new value;
it succeeds only if no other write to that location has happened since the ll, and fails otherwise
if sc succeeds: look at tmp (the old value of the lock returned by ll)
if tmp is false, the lock was free and we now hold it
if tmp is true, the lock was held, so we keep spinning
if sc fails: also keep spinning and retry
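The spin-acquire loop above can be sketched in portable C with the GCC/Clang __atomic_test_and_set builtin, which provides the same atomic read-modify-write guarantee that ll/sc implements on MIPS/ARM (a sketch; the names are invented):

```c
#include <pthread.h>

#define N 100000
static char lock_flag = 0;   /* 0 = free, 1 = held */
static long counter = 0;     /* protected by the spinlock */

static void spin_acquire(void) {
    /* atomically store 1 and return the old value; if the old value was
       already 1 someone else holds the lock, so keep spinning */
    while (__atomic_test_and_set(&lock_flag, __ATOMIC_ACQUIRE))
        ;  /* busy-wait */
}

static void spin_release(void) {
    __atomic_clear(&lock_flag, __ATOMIC_RELEASE);  /* store 0: free the lock */
}

static void *worker(void *arg) {
    (void)arg;
    for (int i = 0; i < N; i++) {
        spin_acquire();
        counter++;           /* critical section: kept deliberately tiny */
        spin_release();
    }
    return NULL;
}

long run_spinlock_demo(void) {
    counter = 0;
    pthread_t a, b;
    pthread_create(&a, NULL, worker, NULL);
    pthread_create(&b, NULL, worker, NULL);
    pthread_join(a, NULL);
    pthread_join(b, NULL);
    return counter;          /* exactly 2*N: no increments are lost */
}
```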
Spinlocks and locks
Both have an owner!
a spinlock is owned by a CPU
a lock is owned by a thread
Holding a spinlock disables interrupts on that CPU
minimize the spinning
preemption is disabled on that CPU --> because spinning threads busy-wait --> they occupy the CPU
DO NOT use spinlocks to protect large critical sections
some CPUs can queue interrupts that arrive while a spinlock is held
Thread Blocking
When a thread blocks, it stops running:
The scheduler chooses a new thread to run
A context switch from the blocking thread to a new thread occurs
The blocking thread is queued in a Wait Queue
Eventually, a blocked thread is signaled and awakened by another thread.
Semaphores
A semaphore keeps track of the number of available resources; it synchronizes in terms of resource counts
Ownership! A lock cares about it, but a semaphore does not
a thread can V (release) a semaphore without ever having P'd (acquired) it
A counter keeps track of the resources; the counter is non-negative.
If the counter is zero, the thread goes to sleep and waits for the counter to become positive
If the counter is greater than zero, the thread takes a resource and decrements the counter
P (take a resource: decrement the counter, blocking while it is zero) vs V (return a resource: increment the counter, waking a waiter)
There is no way to read the counter's value; the only operations are P & V
Type of Semaphores:
Binary Semaphore:
similar to a lock: created with one initial resource; P takes the resource, V returns it.
do P first, then V --> offers mutual exclusion
initial and maximum resource count are 1
Counting semaphore:
a semaphore with an arbitrary number of resources; no maximum
the order of P and V does not matter
Barrier semaphore:
uses a semaphore to force one thread to wait for another thread to finish.
start the semaphore with zero resources;
the waiting thread does P on the semaphore once, and the thread being waited on does V when it finishes
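A minimal sketch of the barrier pattern with POSIX semaphores (sem_init/sem_wait/sem_post stand in for the course's sem_create/P/V; the names are invented): the semaphore starts at 0, so the waiting thread blocks in P until the other thread finishes and does V.

```c
#include <pthread.h>
#include <semaphore.h>

static sem_t done_sem;       /* starts at 0: "first has not finished yet" */
static int order[2];
static int next_slot = 0;

static void *first(void *arg) {
    (void)arg;
    order[next_slot++] = 1;  /* do this thread's work */
    sem_post(&done_sem);     /* V: signal that the work is done */
    return NULL;
}

static void *second(void *arg) {
    (void)arg;
    sem_wait(&done_sem);     /* P: block until first() has posted */
    order[next_slot++] = 2;
    return NULL;
}

int run_barrier_demo(void) {
    sem_init(&done_sem, 0, 0);  /* initial count 0 forces second() to wait */
    pthread_t a, b;
    pthread_create(&b, NULL, second, NULL);
    pthread_create(&a, NULL, first, NULL);
    pthread_join(a, NULL);
    pthread_join(b, NULL);
    sem_destroy(&done_sem);
    return order[0] == 1 && order[1] == 2;  /* first always runs before second */
}
```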
Producer/Consumer Synchronization
struct semaphore *Items, *Spaces;
Items = sem_create("Buffer Items", 0);
Spaces = sem_create("Buffer Spaces", N);
Producer's Pseudo-code:
P(Spaces)
add item to the buffer
V(Items)
Consumer's Pseudo-code:
P(Items)
remove item from the buffer
V(Spaces)
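The pseudo-code above, fleshed out with POSIX semaphores for a single producer and a single consumer (a sketch; the buffer size, item count, and names are invented). With one producer and one consumer, the two semaphores alone are enough: each index is touched by only one thread, and the semaphores synchronize access to each buffer slot.

```c
#include <pthread.h>
#include <semaphore.h>

#define BUF_SIZE 4
#define N_ITEMS 100

static int buffer[BUF_SIZE];
static int in = 0, out = 0;      /* each used by exactly one thread */
static sem_t items, spaces;
static long consumed_sum = 0;

static void *producer(void *arg) {
    (void)arg;
    for (int i = 1; i <= N_ITEMS; i++) {
        sem_wait(&spaces);       /* P(Spaces): wait for a free slot */
        buffer[in] = i;          /* add item to the buffer */
        in = (in + 1) % BUF_SIZE;
        sem_post(&items);        /* V(Items): one more item available */
    }
    return NULL;
}

static void *consumer(void *arg) {
    (void)arg;
    for (int i = 0; i < N_ITEMS; i++) {
        sem_wait(&items);        /* P(Items): wait for an item */
        consumed_sum += buffer[out];  /* remove item from the buffer */
        out = (out + 1) % BUF_SIZE;
        sem_post(&spaces);       /* V(Spaces): one more free slot */
    }
    return NULL;
}

long run_prodcons_demo(void) {
    consumed_sum = 0;
    sem_init(&items, 0, 0);          /* no items yet */
    sem_init(&spaces, 0, BUF_SIZE);  /* all slots free */
    pthread_t p, c;
    pthread_create(&p, NULL, producer, NULL);
    pthread_create(&c, NULL, consumer, NULL);
    pthread_join(p, NULL);
    pthread_join(c, NULL);
    return consumed_sum;         /* 1+2+...+100 = 5050 */
}
```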
Condition Variables
wait makes it safe to go to sleep and give up the lock at the same time, so other threads can change the condition; later the sleeper is woken by another thread and reacquires the lock
wait: release the lock --> go to sleep --> (when woken) reacquire the lock
Difference:
lock: mutual exclusion
semaphore: resources
CV: conditions
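A minimal sketch of the wait protocol with a POSIX condition variable (names invented): pthread_cond_wait atomically releases the mutex and sleeps, then reacquires the mutex before returning -- exactly the release --> sleep --> reacquire sequence above. The waiter always re-checks the condition in a loop, since wakeups can be spurious.

```c
#include <pthread.h>

static pthread_mutex_t m = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t cv = PTHREAD_COND_INITIALIZER;
static int ready = 0;            /* the condition, protected by m */

static void *signaller(void *arg) {
    (void)arg;
    pthread_mutex_lock(&m);
    ready = 1;                   /* change the condition under the lock */
    pthread_cond_signal(&cv);    /* wake one waiter */
    pthread_mutex_unlock(&m);
    return NULL;
}

int run_cv_demo(void) {
    pthread_t t;
    pthread_mutex_lock(&m);
    pthread_create(&t, NULL, signaller, NULL);
    while (!ready)                   /* re-check: wakeups can be spurious */
        pthread_cond_wait(&cv, &m);  /* atomically releases m, sleeps,
                                        then reacquires m before returning */
    pthread_mutex_unlock(&m);
    pthread_join(t, NULL);
    return ready;
}
```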
Deadlocks
a deadlocked program is stopped forever; it never executes or makes any progress
the program stops running
the OS cannot detect deadlock, since it cannot distinguish a deadlocked thread from one that is just waiting for something
Solutions:
No Hold and Wait: prevent a thread from requesting a resource while it owns another resource (no waiting, including sleeping and busy-waiting, while holding)
Resource Ordering: order the resource types and require that each thread acquire resources in increasing resource-type order (acquire in increasing order)
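A sketch of Resource Ordering with two POSIX mutexes (names invented): every thread acquires lock_a before lock_b, so the circular wait needed for deadlock can never form -- both threads always complete.

```c
#include <pthread.h>

/* global order: lock_a is always acquired before lock_b */
static pthread_mutex_t lock_a = PTHREAD_MUTEX_INITIALIZER;
static pthread_mutex_t lock_b = PTHREAD_MUTEX_INITIALIZER;
static pthread_mutex_t done_m = PTHREAD_MUTEX_INITIALIZER;
static int both_done = 0;

static void *worker(void *arg) {
    (void)arg;
    /* every thread respects the same acquisition order: a, then b */
    pthread_mutex_lock(&lock_a);
    pthread_mutex_lock(&lock_b);
    /* ... use both resources ... */
    pthread_mutex_unlock(&lock_b);
    pthread_mutex_unlock(&lock_a);

    pthread_mutex_lock(&done_m);
    both_done++;
    pthread_mutex_unlock(&done_m);
    return NULL;
}

int run_ordering_demo(void) {
    both_done = 0;
    pthread_t t1, t2;
    pthread_create(&t1, NULL, worker, NULL);
    pthread_create(&t2, NULL, worker, NULL);
    pthread_join(t1, NULL);
    pthread_join(t2, NULL);
    return both_done;    /* 2: both threads finished, no deadlock */
}
```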