Gökçe Aydos
understand semaphore’s working principle
Roboter mit Rädern
Rotary encoder is a sensor. We find these kinds of robots in transportation (autonomous vehicles, conveyor robot, warehouse (Lagerhalle) robot, wheelchair).
Example robots: Wheeled robots
int encoder_value;
...
void encoder_read() { ...
= ...;
encoder_value }
void odometry_process() { ...
+= encoder_value;
odometry }
void journal() { ...
(..., uptime, odometry);
fprintf}
(ENCODER) encoder_read();
INTERRUPT_FROM(TIMER_10HZ) { odometry_process; journal; } INTERRUPT_FROM
Odometry: Wegstreckenzähler
Imagine you want to program a robot to get encoder values to calculate how much distance the robot has traveled. This value is useful for localization of the robot. In parallel you have a journaling system which logs what the robot has done before.
Your program is interrupt driven. Every time when the encoder makes a measurement, there is an interrupt the rest of the functions are started 10 times per second.
Do you see any problems?
encoder_val
Do you see any problems?
. . .
Demonstration of shared-variable-threads.
top -H -p $(pidof main)
shows the priorities of the
threads and their processing time usage.
Analogy: You have a 2-wheel robot. Each of the motors have an encoder that signals non-deterministically rotation data. Another task processes this data and creates a sum.
Rijksmuseum, CC0, via Wikimedia Commons
How would you solve the problem/s we had in the beginning with a semaphore?
encoder_value
Use vimdiff
:
Emily(emily_is_here);
post(otto_is_here); wait
():
Otto(otto_is_here);
post(emily_is_here); wait
Can’t we use pthread.join()
for the same purpose? No,
because pthread.join()
waits until the thread exits.
Use vimdiff
Would the following solution work in the last problem?
Imagine we modified encoder_value
to a FIFO with a
capacity of 3. How can we leverage a semaphore that there are no more
than 3 encoder_value
s in the FIFO?
max_count = 3
Instead of using a semaphore, we could use a loop like:
int encoder_read_done;
void* odometry_process() {
...
while (!encoder_read_done);
...
}
What are the pros/cons?
This is a busy-waiting loop. It wastes CPU time. Semaphores block the threads, so the CPU can be used for other tasks.
However blocking and returning back to the thread can take more time than busy-waiting. If we know that we won’t need much time until an event, then busy-waiting could save us time.
Every OS that implements POSIX includes semaphores.
QNX is popular is mission-critical systems like nuclear plants, medical equipment, defense systems.
~200M vehicles use QNX — Source: Strategy Analytics via Blackberry for ADAS and digital cockpits.
Fan, Real-time Embedded Systems, 2015
Very suitable for introduction, includes many code examples. Many of the resources in this work is based on this book.
The standard document. Most of the man
pages are based
on this doc. Includes the rationale behind some concepts. Especially
relevant: Realtime
services index
More resources I skimmed, but did not use thoroughly:
Arpaci-Dusseau et.al., Operating Systems: Three Easy Pieces, 2018
Enjoyed reading. Contains a chapters about concurrency.
Tian et.al., (ed.), Handbook of real-time computing, 2022
Based on the latest research, written by many experts. Targeted at researchers.
Kopetz et.al, Real-Time Systems - Design Principles for Distributed Embedded Applications, 2022
Based on the lecture notes in Vienna University
Hüning, Echtzeitbetriebssystem, Embedded Systems für IoT, 2018
Principles of real-time OS, a case study based on Renesas Synergy RTOS
Seck, Aufbau Echtzeitbetriebssystem OS9000, 2014
German case study of OS-9 RTOS, part of lecture series about real-time systems.
Introduction to real-time systems
ROS is popular framework for robotics. Covers ROS programming related aspects. Real-time tutorial: ROS2 demo: Understanding real-time programming
Example of needing multiple resources: task requires some memory and 16K memory is available in small chunks 1K each.
if a consumer task T2 must wait for the producer T1:
T1(5,2)
means that T1 is released every 5 time cycles
and 2 time cycles after its release T1 vacates or procures a
semaphore.J14
is the fourth release of the task T1Industrial application: T2 may require the latest data from T1. It should not be the duplicate of an old data.
. . .
Would a single semaphore with max_count = 2
not
suffice?
Two tasks are released at the same time and want to meet again when they are finished. A semaphore is vacated when the corresponding task starts and is procured when the task ends.
A single semaphore would not suffice, because the tasks could be released many times.
() {
T1...
(sem🔵);
procure(sr🔵);
access(sem🟢);
procure(sr🟢);
access(sem🟢);
vacate(sem🔵);
vacate...
}
() {
T2...
// same as T1()
...
}
If tasks require multiple kinds of resources there is potential for a deadlock (T1 acquires R1, T2 preempts+acquires R2 + wants R1 + is blocked, T1 wants R2 + is blocked). A solution is to define an order for the usage of resources.
Each task that wants to acquire SR2, must first get SR1.
Do you see a problem below?
main
executes func_A()
, func_A
executes func_B()
. Every function invocation requires
exclusive access to SR
. However the thread had procured
S
and it cannot procure it the second time and will block
even it has already exclusive access. We could increase its capacity,
but this would mean that more than one task can access SR
and we require exclusive access. Recursive calls are not compatible with
semaphores. (The same problem applies if func_A
would call
itself.
We already procured S
in main
. Why should
we procure it again? The problem is that the semaphore does not know
which task procured it.
Mutex can help, because it can be un-locked only by its owner.
Semaphore has a second problem which is below.
Semaphore has two problems: 1) It does not track which task procured it 1) It belongs to the common but not a single task. Every task has access to a semaphore.
Mutex is like an unnamed binary semaphore and has the following superpowers: - the first thread that locks the mutex gains its ownership - can be unlocked only by its owner (Note that the standard mutex does not output an error if it is unlocked by another thread. See the table in the standard) - can be locked recursively - can use different protocols like priority inheritance protocol or the priority ceiling protocol, where a thread that locks the mutex inherits the annotated priority
Mutex is a low-level primitive which is used to construct condition variables
Semaphore manages multiple memory chunks in heap, mutex enables exclusive access to the control block.
Now you are armed with mutex. How would you solve the problem we had in the beginning using a mutex?
. . .
We could protect the access to the shared buffer with a mutex: Before the producer or consumer want to access the value, they have to lock it.
DEMONSTRATE
But we still have problems: 1. Some values written by the producer are not read by the consumer, because the producer sometimes requires less time and locks the mutex again before the consumer can read it. 2. Sometimes the opposite happens. The producer requires more time and the consumer reads the same value multiple times.
The whole construct is also called a monitor.
Bedingungsvariable
Note that the notation includes two task waiting lists.
Guarding mutex ensures exclusive access to the resource. If a mutex is locked, then the task in queued in the mutex waiting list. The diamond indicates the condition that the thread must fulfill to proceed - if the task does not fulfill it, then it is queued in the waiting list of the diamond and unlocks the mutex - if the condition is fulfilled, the task signals all the tasks in the waiting list and they are unblocked. After every task has been successfully unblocked, the mutex is unlocked.
Barriers are used for synchronization of tasks. A condition variable can be used for this purpose.
If a task is finished processing, it locks the mutex, increments
barrier_count
and checks
barrier_count >= threshold
- If true, then it signals
all the tasks waiting in the queue of the condition variable, so that
they are unblocked and can continue processing. After they are
released the task unlocks the mutex - else the task is
queued in the waiting list of the condition variable + it unlocks
(atomic)
Note that POSIX also supports barriers as a standalone primitive.
Refer to the figure 18.25 for a detailed UML diagram
The producer locks the mutex and checks data_is_ready
.
If - true, unlocks the mutex and wait in loop - false, it produces data,
sets data_is_ready
, signals the tasks waiting in the queue
of the condition variable (consumer) and unlocks the mutex
The consumer locks the mutex and checks data_is_ready
.
If - true, processes data, resets data_is_ready
, signals
similar to above, unlocks the mutex - false, unlocks the mutex and waits
in a loop checking for data_is_ready
.
Note that the condition can be different for the producer and the consumer.