Dovetail introduces the new interrupt type flag IRQF_OOB, denoting an out-of-band handler to the generic interrupt API routines:

- setup_irq() for early registration of special interrupts
- request_irq() for device interrupts
- __request_percpu_irq() for per-CPU interrupts

An IRQ action handler bearing this flag runs on the out-of-band stage, regardless of the current interrupt state of the in-band stage. If no out-of-band stage is present, the flag is ignored, and the interrupt handler runs on the in-band stage as usual.
Conversely, out-of-band handlers are dismissed using the usual calls, such as:

- free_irq() for device interrupts
- free_percpu_irq() for per-CPU interrupts

Out-of-band IRQ handling has the following constraints:
- If the interrupt line is shared, every other action handler registered for it must bear the IRQF_OOB flag too, or the request will fail. If meeting real-time requirements is your goal, sharing an IRQ line among multiple devices operating from different execution stages (in-band vs out-of-band) can only be a bad idea design-wise. You should resort to this in desperate hardware situations only.
- Out-of-band handlers cannot be threaded: IRQF_NO_THREAD is implicit, and IRQF_ONESHOT is ignored.

Installing an out-of-band handler for a device interrupt:
#include <linux/interrupt.h>

static irqreturn_t oob_interrupt_handler(int irq, void *dev_id)
{
	...
	return IRQ_HANDLED;
}

static int __init driver_init_routine(void)
{
	int ret;

	...

	ret = request_irq(DEVICE_IRQ, oob_interrupt_handler,
			  IRQF_OOB, "Out-of-band device IRQ",
			  device_data);
	if (ret)
		goto fail;

	return 0;

fail:
	/* Unwind upon error. */
	...
	return ret;
}
Your companion core will most likely want to be notified each time a new interrupt context is entered, typically in order to block any further task rescheduling on its end. Conversely, this core will also want to be notified when such context is exited, so that it can start its rescheduling procedure, applying any change to the scheduler state which occurred during the execution of the interrupt handler(s), such as waking up a thread which was waiting for the incoming event.
To provide such support, Dovetail calls irq_enter_pipeline()
on
entry to the pipeline when it receives an IRQ from the hardware, then
irq_exit_pipeline()
right before it leaves the interrupt frame. It defines empty
placeholders for these hooks, which are picked in the absence of a
companion core in the kernel tree:
linux/include/dovetail/irq.h
/* SPDX-License-Identifier: GPL-2.0 */
#ifndef _DOVETAIL_IRQ_H
#define _DOVETAIL_IRQ_H
/* Placeholders for pre- and post-IRQ handling. */
static inline void irq_enter_pipeline(void) { }
static inline void irq_exit_pipeline(void) { }
#endif /* !_DOVETAIL_IRQ_H */
As an illustration, the EVL core overrides these placeholders by interposing the following file, which comes earlier in the inclusion order of C headers, providing its own set of hooks:
linux-evl/include/asm-generic/evl/irq.h
/* SPDX-License-Identifier: GPL-2.0 */
#ifndef _ASM_GENERIC_EVL_IRQ_H
#define _ASM_GENERIC_EVL_IRQ_H
#include <evl/irq.h>
static inline void irq_enter_pipeline(void)
{
#ifdef CONFIG_EVL
	evl_enter_irq();
#endif
}

static inline void irq_exit_pipeline(void)
{
#ifdef CONFIG_EVL
	evl_exit_irq();
#endif
}

#endif /* !_ASM_GENERIC_EVL_IRQ_H */
This routine turns on/off out-of-band delivery for the given IRQ, for which an action must be set (i.e. the interrupt must have been requested). This call comes in handy when the IRQ was already requested without mentioning the IRQF_OOB flag: in such a case, there is still the option to switch the interrupt delivery stage manually by calling irq_switch_oob().
- The IRQ number to switch the delivery mode for.
- A boolean indicating whether out-of-band delivery should be enabled.
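For instance, a driver which requested its line without IRQF_OOB might switch delivery stages at run time. This is only a sketch; DEVICE_IRQ is a placeholder for the driver's actual interrupt number:

```c
#include <linux/interrupt.h>

/* Sketch only: DEVICE_IRQ stands for the IRQ number the driver
 * obtained earlier via request_irq() without IRQF_OOB. */
static void set_delivery_stage(bool oob)
{
	/* Switch delivery of DEVICE_IRQ to the out-of-band stage
	 * (oob == true), or back to the in-band stage (oob == false). */
	irq_switch_oob(DEVICE_IRQ, oob);
}
```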
Since the regular local_irq_*() kernel API only controls interrupt disabling for the in-band stage when interrupt pipelining is enabled, we need a replacement for the original implementation which actually flips the interrupt enable/disable flag in the CPU. When CONFIG_IRQ_PIPELINE is disabled, this set is mapped 1:1 onto the original local_irq_*() API.
| Original/Virtual call | Non-virtualized call |
|---|---|
| local_save_flags(flags) | flags = hard_local_save_flags() |
| local_irq_disable() | hard_local_irq_disable() |
| local_irq_enable() | hard_local_irq_enable() |
| local_irq_save(flags) | flags = hard_local_irq_save() |
| local_irq_restore(flags) | hard_local_irq_restore(flags) |
| irqs_disabled() | hard_irqs_disabled() |
| irqs_disabled_flags(flags) | hard_irqs_disabled_flags(flags) |
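As a sketch, a short critical section which must really keep the CPU from taking interrupts, rather than merely stalling the in-band stage, would use the hard variants:

```c
	unsigned long flags;

	/* Actually turn off interrupts in the CPU, not just the
	 * virtual disable flag of the in-band stage. */
	flags = hard_local_irq_save();

	/* ... critical section, immune from preemption by either
	 * stage on this CPU ... */

	hard_local_irq_restore(flags);
```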
Just like the in-band stage is affected by the state of the virtual interrupt disable flag, the interrupt state of the out-of-band stage is controlled by a dedicated stall bit flag in the out-of-band stage status. In combination with the interrupt disable bit in the CPU, this software bit controls interrupt delivery to the out-of-band stage.
When this stall bit is set, interrupts which might be pending in the event log of the out-of-band stage for a given CPU are not played. Conversely, the out-of-band handlers attached to pending IRQs are fired when the stall bit is cleared. The following table represents the equivalent calls affecting the stall bit for each stage:
| In-band stage | Out-of-band stage |
|---|---|
| local_save_flags(flags) | flags = oob_irq_save() |
| local_irq_disable() | oob_irq_disable() |
| local_irq_enable() | oob_irq_enable() |
| local_irq_save(flags) | flags = oob_irq_save() |
| local_irq_restore(flags) | oob_irq_restore(flags) |
| irqs_disabled() | oob_irqs_disabled() |
| irqs_disabled_flags(flags) | -none- |
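By analogy, an out-of-band code path could protect a per-CPU section from preemption by other out-of-band handlers by stalling its own stage; a sketch:

```c
	unsigned long flags;

	/* Stall the out-of-band stage: further out-of-band IRQs are
	 * logged for this CPU, but not played until we unstall. */
	flags = oob_irq_save();

	/* ... out-of-band critical section ... */

	/* Unstall, playing any IRQ which was logged meanwhile. */
	oob_irq_restore(flags);
```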
The pipeline exposes two generic IPI vectors which autonomous cores may use in SMP configuration for signaling the following events across CPUs:
- RESCHEDULE_OOB_IPI, the cross-CPU task reschedule request. This is available to the core's scheduler for kicking the task rescheduling procedure on remote CPUs, when the state of their respective runqueue has changed. For instance, a task sleeping on CPU #1 may be unblocked by a system call issued from CPU #0: in this case, the scheduler code running on CPU #0 is supposed to tell CPU #1 that it should reschedule. Typically, the EVL core does so from its test_resched() routine.

- TIMER_OOB_IPI, the cross-CPU timer reschedule request. Because software timers are in essence per-CPU beasts, this IPI is available to the core's timer management code for kicking the hardware timer programming procedure on remote CPUs, when the state of some software timer has changed. Typically, stopping a timer from a remote CPU, or migrating a timer from one CPU to another, should trigger such a signal. The EVL core does so from its evl_program_remote_tick() routine, which is called whenever the timer with the earliest timeout date enqueued on a remote CPU may have changed.
In addition, the pipeline core defines CALL_FUNCTION_OOB_IPI
for its
own use, in order to implement the smp_call_function_oob()
routine. The latter is semantically equivalent to the regular
smp_call_function_single()
routine, except that it runs the callback on the out-of-band stage.
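Assuming the signature mirrors smp_call_function_single() (CPU number, function, argument, wait flag), which the text implies but does not spell out, a cross-call could be sketched as:

```c
static void oob_callback(void *info)
{
	/* Runs on the out-of-band stage of the target CPU. */
}

static void run_oob_on(int cpu, void *info)
{
	/* Run oob_callback(info) out-of-band on the given CPU,
	 * waiting for its completion before returning. */
	smp_call_function_oob(cpu, oob_callback, info, true);
}
```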
As their respective names suggest, those three IPIs can be sent from out-of-band context as well as in-band, by calling the irq_send_oob_ipi() service.
The IPI number to send. There are only three legit values for this argument: either RESCHEDULE_OOB_IPI, TIMER_OOB_IPI or CALL_FUNCTION_OOB_IPI. This is a low-level service with not much parameter checking, so any other value is likely to cause havoc.
A CPU bitmask specifying the target CPU(s) which should receive the IPI. The current CPU is silently excluded from this mask, so the calling CPU cannot send an IPI to itself using this call.
In order to receive these IPIs, an out-of-band handler must have been set for them, mentioning the [IRQF_OOB flag]({{< relref "#request-oob-irq" >}}).
irq_send_oob_ipi() serializes callers internally so that it may be used from either stage: in-band or out-of-band.
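Putting this together, a core might install an out-of-band handler for RESCHEDULE_OOB_IPI, then signal a remote CPU. This sketch assumes the bitmask argument is a regular struct cpumask pointer:

```c
static irqreturn_t resched_ipi_handler(int ipi, void *dev_id)
{
	/* Runs on the out-of-band stage of the signaled CPU. */
	return IRQ_HANDLED;
}

/* An IRQF_OOB handler must have been installed for
 * RESCHEDULE_OOB_IPI beforehand (registration not shown). */
static void kick_remote_resched(int cpu)
{
	struct cpumask mask;

	/* The sending CPU is silently excluded from the mask. */
	cpumask_clear(&mask);
	cpumask_set_cpu(cpu, &mask);
	irq_send_oob_ipi(RESCHEDULE_OOB_IPI, &mask);
}
```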
In some very specific cases, we may need to inject an IRQ into the
pipeline by software as if such hardware event had happened on the
current CPU. irq_inject_pipeline()
does exactly this.
The IRQ number to inject. A valid interrupt descriptor must exist for this interrupt.
irq_inject_pipeline()
fully emulates the receipt of a hardware
event, which means that the common interrupt pipelining logic applies
to the new event:
- first, any out-of-band handler is considered for delivery,
- then such event may be passed down the pipeline to the common in-band handler(s) in the absence of out-of-band handler(s).
The pipeline priority rules apply accordingly:

- if the caller is in-band, an out-of-band handler is registered for the IRQ event, and the out-of-band stage is unstalled, the execution stage is immediately switched to out-of-band for running the latter, then restored to in-band before irq_inject_pipeline() returns.

- if the caller is out-of-band and there is no out-of-band handler, the IRQ event is deferred until the in-band stage resumes execution on the current CPU, at which point it is delivered to any in-band handler(s).

- in any case, should the current stage receive the IRQ event, the virtual interrupt state of that stage is always considered before deciding whether this event should be delivered immediately to its handler by irq_inject_pipeline() (unstalled case), or deferred until the stage is unstalled (stalled case).
This call returns zero on successful injection, or -EINVAL if the IRQ has no valid descriptor.
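A minimal call site, with DEVICE_IRQ standing in for a real interrupt number, might check the documented error code as follows:

```c
	int ret;

	/* Emulate the receipt of DEVICE_IRQ on the current CPU. */
	ret = irq_inject_pipeline(DEVICE_IRQ);
	if (ret == -EINVAL) {
		/* No valid interrupt descriptor for DEVICE_IRQ. */
	}
```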
If you are looking for a way to schedule the execution of a routine in the in-band interrupt context from the out-of-band stage, you may want to consider the extended irq_work API which provides a high-level interface to this feature.
Sometimes, running the full interrupt delivery logic which irq_inject_pipeline() implements for feeding an interrupt into the pipeline may be overkill, when we can make assumptions about the current execution context and which stage should handle the event. The following fast helpers can be used instead in this case:
The IRQ number to inject into the in-band stage. A valid interrupt descriptor must exist for this interrupt.
irq_post_inband() may be used to mark an interrupt as pending directly into the current CPU's log for the in-band stage. This is useful in either of these cases:

- you know that the out-of-band stage is current, therefore this event has to be deferred until the in-band stage resumes on the current CPU later on. This means that you can simply post it to the in-band stage directly.

- you know that the in-band stage is current but stalled, therefore this event can't be immediately delivered, so marking it as pending into the in-band stage is enough.
Interrupts must be hard disabled in the CPU before calling this routine.
The IRQ number to inject into the out-of-band stage. A valid interrupt descriptor must exist for this interrupt.
irq_post_oob() may be used to mark an interrupt as pending directly into the current CPU's log for the out-of-band stage. This is useful in only one situation: you know that the out-of-band stage is current but stalled, therefore this event can't be immediately delivered, so marking it as pending into the out-of-band stage is enough.

Interrupts must be hard disabled in the CPU before calling this routine; note that if the out-of-band stage is stalled as expected on entry to this helper, interrupts should be hard disabled in the CPU already anyway.
Due to the NMI-like nature of interrupts running out-of-band code from the standpoint of the main kernel, such code might preempt in-band activities in the middle of a critical section. For this reason, it would be unsafe to call any in-band routine from an out-of-band context.
However, we may schedule execution of in-band work handlers from
out-of-band code, using the regular irq_work_queue()
and
irq_work_queue_on()
services which have been extended by the IRQ
pipeline core. A work request is scheduled from the out-of-band stage
for running on the in-band stage on the issuing/requested CPU as soon
as the out-of-band activity quiesces on this processor. As its name
implies, the work handler runs in (in-band) interrupt context.
The interrupt pipeline always uses a synthetic IRQ as the notification signal for the IRQ work
machinery, instead of an architecture-specific interrupt vector. This
special IRQ is labeled in-band work when reported by
/proc/interrupts
. irq_work_queue()
may invoke the work handler
immediately only if called from the in-band stage with hard irqs on.
In all other cases, the handler execution is deferred until the
in-band log is synchronized.
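For instance, an out-of-band handler could defer in-band processing this way; a sketch using the stock irq_work API, with names of our own choosing:

```c
#include <linux/irq_work.h>

static void inband_work_handler(struct irq_work *work)
{
	/* Runs in (in-band) interrupt context, once out-of-band
	 * activity has quiesced on this CPU. */
}

static struct irq_work inband_work;

static void driver_setup(void)
{
	init_irq_work(&inband_work, inband_work_handler);
}

/* May be called from the out-of-band stage, e.g. from an
 * IRQF_OOB handler. */
static void defer_to_inband(void)
{
	irq_work_queue(&inband_work);
}
```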
The pipeline introduces an additional type of interrupts, which are purely software-originated, with no hardware involved. These IRQs can be triggered by any kernel code. A synthetic IRQ (aka SIRQ) is inherently a per-CPU event. Because the common pipeline flow applies to synthetic interrupts, it is possible to attach such interrupt to out-of-band and/or in-band handlers, just like device interrupts.
Synthetic interrupts abide by the normal rules with respect to interrupt masking: such an IRQ may be deferred until the stage it should be handled from is unstalled.
Synthetic interrupts and softirqs differ in essence: the latter only exist in the in-band context, and therefore cannot trigger out-of-band activities. Synthetic interrupts used to be called virtual IRQs (or virqs for short) by the legacy I-pipe implementation, Dovetail’s ancestor; the rename clears up the confusion with the way abstract interrupt numbers defined within interrupt domains may be called elsewhere in the kernel code base (i.e. virtual interrupts too).
Synthetic interrupt vectors are allocated from the
synthetic_irq_domain, using the irq_create_direct_mapping()
routine.
A synthetic interrupt handler can be installed for running on the in-band stage upon a scheduling request (i.e. being posted) from an out-of-band context as follows:
#include <linux/irq_pipeline.h>

static irqreturn_t sirq_handler(int sirq, void *dev_id)
{
	do_in_band_work();

	return IRQ_HANDLED;
}

static struct irqaction sirq_action = {
	.handler = sirq_handler,
	.name = "In-band synthetic interrupt",
	.flags = IRQF_NO_THREAD,
};

unsigned int alloc_sirq(void)
{
	unsigned int sirq;

	sirq = irq_create_direct_mapping(synthetic_irq_domain);
	if (!sirq)
		return 0;

	setup_percpu_irq(sirq, &sirq_action);

	return sirq;
}
A synthetic interrupt handler can be installed for running from the out-of-band stage upon a trigger from an in-band context as follows:
static irqreturn_t sirq_oob_handler(int sirq, void *dev_id)
{
	do_out_of_band_work();

	return IRQ_HANDLED;
}

unsigned int alloc_sirq(void)
{
	unsigned int sirq;
	int ret;

	sirq = irq_create_direct_mapping(synthetic_irq_domain);
	if (!sirq)
		return 0;

	ret = __request_percpu_irq(sirq, sirq_oob_handler,
				   IRQF_OOB,
				   "Out-of-band synthetic interrupt",
				   dev_id);
	if (ret) {
		irq_dispose_mapping(sirq);
		return 0;
	}

	return sirq;
}
The execution of sirq_handler()
in the in-band context can be
scheduled (or posted) from the out-of-band context in two different
ways:
Either:

	irq_inject_pipeline(sirq);

or:

	unsigned long flags;

	flags = hard_local_irq_save();
	irq_post_inband(sirq);
	hard_local_irq_restore(flags);
Assuming that no interrupt may be pending in the event log for the
out-of-band stage at the time this code runs, the second method relies on the
invariant that in a pipeline interrupt model, IRQs pending for the
in-band stage will have to wait for the out-of-band stage to quiesce before they
can be handled. Therefore, it is pointless to check for synchronizing the
interrupts pending for the in-band stage from the out-of-band stage, which the
irq_inject_pipeline()
service would do systematically.
irq_post_inband()
simply marks the event as pending in the event
log of the in-band stage for the current CPU, then returns. This event
would be played as a result of synchronizing the log automatically when
the current CPU switches back to the in-band stage.
It is also valid to post a synthetic interrupt to be handled on the
in-band stage from an in-band context, using
irq_inject_pipeline()
. In such a case, the normal rules of interrupt
delivery apply, depending on the state of the virtual interrupt
disable flag for the in-band
stage: the IRQ is immediately delivered, with the call to
irq_inject_pipeline()
returning only after the handler has run.
Conversely, the execution of sirq_oob_handler() on the out-of-band stage can be triggered from the in-band context as follows:
irq_inject_pipeline(sirq);
Since the out-of-band stage has precedence over the in-band stage for execution
of any pending event, this IRQ is immediately delivered, with the call
to irq_inject_pipeline()
returning only after the handler has run.
It is also valid to post a synthetic interrupt to be handled on the
out-of-band stage from an out-of-band context, using
irq_inject_pipeline()
. In such a case, the normal rules of interrupt
delivery apply, depending on the state of the virtual interrupt
disable flag for the out-of-band stage.
Calling irq_post_oob(sirq)
from the in-band stage to trigger an
out-of-band event is most often not the right way to do this, because
this service would not synchronize the interrupt log before
returning. In other words, the sirq
event would still be pending for
the out-of-band stage despite the fact that it should have preempted the in-band
stage before returning to the caller.