JSR-282 SI 22: 6.0 Allow the pinning of a schedulable object to a processor.
--------------------------------------------

Last Updated: 19 December 2006

Summary
-------------

Provide simple support to allow the pinning of schedulable objects to
processors in multiprocessor (SMP) systems. A schedulable object pinned
to one or more processors means that only those processors can execute
that schedulable object.

Specification References
-----------------------

5.2, 5.1, 11.3, 12.3

Problem being addressed
------------------------------

Currently, the RTSJ is silent on multiprocessor issues. It attempts not
to preclude multiprocessor implementations but provides no direct
support. The java.lang.Runtime class allows the number of processors
available to the JVM to be determined by the "int availableProcessors()"
method, but does not allow Java threads to be pinned to processors.
Furthermore, on some SMPs the number of processors may vary during the
execution of the program.

JSR 302 anticipates that future safety-critical systems (SCS) will
contain SMPs. (Currently, there does not seem to be hard evidence that
SMPs have been used in SCS. However, there is evidence that future
systems will use SMPs. For example, the LynxSecure Separation Kernel has
recently been announced:
http://www.lynuxworks.com/rtos/secure-rtos-kernel.php.)

Further Motivation
-------------------

Whilst many applications do not need more control over the mapping of
threads to processors in an SMP environment, there are occasions when
such control is important. They include:

1. To allow more flexible approaches to scheduling. -- Although the
state of the art in schedulability analysis for multiprocessor systems
continues to advance, the current state is such that partitioned systems
offer more guaranteed schedulability than global systems.
Quoting from Ted Baker (private communication), a known expert on fixed
priority scheduling on multiprocessor systems:

"The choice between global and partitioned approaches to multiprocessor
scheduling is a conundrum. Setting aside pragmatic questions about queue
contention overhead and differences in cache behavior, the theoretical
results are equivocal. In favor of global scheduling, it has long been
known from queueing theory that single-queue (global) FIFO
multiprocessor scheduling is superior to queue-per-processor
(partitioned) FIFO scheduling, with respect to average response time.
Apparently in favor of partitioned scheduling, the application of well
known single processor scheduling algorithms appears to be superior to
the global application of those same algorithms for some task sets with
hard deadlines. For example, it is known that all periodic
implicit-deadline task sets with utilization below $m(2^{1/2} - 1)$ can
be scheduled on m processors using a first-fit-decreasing-utilization
(FFDU) partitioning algorithm and local rate monotonic scheduling, but
Dhall's example shows that there are hard-deadline periodic task sets
with total utilization arbitrarily close to 1.0 that cannot meet all
deadlines if scheduled on m processors using global rate monotonic
scheduling. Dhall's example also applies to global EDF scheduling, yet
FFDU partitioned EDF scheduling is guaranteed up to utilization
$(m+1)/2$. However, the supposed advantage of partitioned scheduling
above disappears if one considers hybrid global priority schemes. The
Dhall example can easily be handled by the $EDF-US(1/2)$ or
$EDF(k_{min})$ schemes, in which top priority is given to a few "heavy"
tasks, as can any implicit-deadline sporadic task system with
utilization up to $(m+1)/2$. This is exactly the same bound as for FFDU
partitioned scheduling!
The experiments we performed on large numbers of pseudo-randomly
generated task sets were intended to provide some additional evidence on
which to base a choice between these two approaches. From those
experiments, statistically, the chance of being able to satisfy all the
deadlines of a randomly chosen periodic or sporadic task set appears to
be highest with partitioned scheduling. In particular, partitioned EDF
scheduling appeared to be the overall best performer in this statistical
sense. At the same time, there are certainly specific task sets where
global scheduling is more effective. While the schedulability tests used
in the experiments probably could be improved, it remains unclear
whether they can be improved enough to erase the statistical margin of
partitioned scheduling with the available schedulability tests."

2. To support temporal isolation. -- Where an application consists of
tasks of mixed criticality level, some form of protection between the
different levels is required. The strict typing model of Ada provides a
strong degree of protection in the spatial domain. The CPU budgeting
facility provides a limited form of temporal protection but at the
expense of flexibility. More flexible temporal protection is obtainable
by allowing tasks in each criticality level to be executed on partitions
of the processor set.

3. To obtain performance benefits. -- For example, dedicating one CPU to
a particular process will ensure maximum execution speed for that
process. Restricting a process to run on a single CPU also prevents the
performance cost caused by the cache invalidation that occurs when a
process ceases to execute on one CPU and then recommences execution on a
different CPU.

4. To be able to respond to dynamic changes to the processor set. -- In
a parallel computing environment the set of processors allocated to an
application may vary depending on the global state of the system.
An application may be able to optimize its algorithms if it is informed
when these changes in the processor set occur.

Proposed Solution Summary
--------------------------------------

There is no POSIX standard in this area, although an initial proposal
was developed (see email exchange at the end of this SI). Consequently,
it is difficult to ensure that the API proposed here is implementable on
a POSIX-compliant RTOS. However, the POSIX proposal and the work done on
SMP Linux (http://www.die.net/doc/linux/man/man2/sched_setaffinity.2.html)
suggest the following.

Add to the RealtimeSystem class:

  public static java.util.BitSet availableProcessors();

  public static boolean setAffinitySupported();
  public static boolean affinityChangeNotificationSupported();

  public static AsyncEvent ProcessorRemoved, ProcessorAdded;

  public final static java.util.BitSet setDefaultAffinity(
      java.util.BitSet Processors) throws ProcessorAffinityException;
  public final static java.util.BitSet setDefaultNoHeapAffinity(
      java.util.BitSet Processors) throws ProcessorAffinityException;
  public final static java.util.BitSet getDefaultAffinity()
      throws ProcessorAffinityException;
  public final static java.util.BitSet getDefaultNoHeapAffinity()
      throws ProcessorAffinityException;

Add a new exception:

  public class ProcessorAffinityException extends Exception;

In the RealtimeThread class:

  public java.util.BitSet setAffinity(java.util.BitSet Processors)
      throws ProcessorAffinityException;
  public java.util.BitSet getAffinity();

In the BoundAsyncEventHandler class:

  public java.util.BitSet setAffinity(java.util.BitSet Processors)
      throws ProcessorAffinityException;
  public java.util.BitSet getAffinity();

Semantics of Proposed Solution
----------------------------------------

The proposal is for a minimum interface that allows the pinning of a
schedulable object to one or more processors. The challenge is to define
the API so that it allows a range of OS facilities to be supported.
The minimum functionality is for the OS to allow the VM to determine how
many processors are available for the execution of the Java application.
The set of processors that the RT JVM is aware of is represented by a
BitSet that is returned by availableProcessors() in the RealtimeSystem
class:

  public static java.util.BitSet availableProcessors();
  // returns a bit set where
  //   .length()      = the number of processors the VM can determine
  //                    will be available to it
  //   .cardinality() = the number of processors allocated to the JVM

For example, in a 64-processor system, the VM may be aware of all 64 or
only a subset of those. This is the length of the bit set. Of these
processors, the VM will know which processors have been allocated to it
(either logical processors or physical processors, depending on the OS).
Each of the available processors is set to one in the bit set. Hence,
the cardinality of the bit set represents the number of processors that
the VM thinks are currently available to it. The returned bit set is a
new object that is allocated in the current memory area.

The API allows for systems that support the dynamic addition and removal
of processors from the set allocated to the VM. If an OS does not
support this facility then the set will not dynamically change. An OS is
also allowed to maintain a set of logical processors allocated to the VM
and to transparently change its logical-to-physical mapping. Again, from
the VM's perspective the set has not changed. However, it should be
noted that this may have an impact on the application if a) it is
handling interrupts directly on the processor or b) the change
undermines any feasibility analysis assumptions. For many RTSJ
applications this may not be a problem. In all of the above
circumstances affinityChangeNotificationSupported() returns false.

If the OS does support dynamic changes to the processor set, the
assumption is that it will inform the VM of the changes (e.g. via a
signal).
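The intended bit-set encoding can be illustrated with plain
java.util.BitSet (the proposed availableProcessors() method does not yet
exist, so the sketch below builds the set by hand; the machine sizes and
processor numbers are invented for illustration):

```java
import java.util.BitSet;

// Illustration of the proposed encoding: a system where the VM is
// aware of 8 processors, of which processors 0, 1 and 5 have been
// allocated to the JVM (those bits are set to one).
public class AffinityBitSetDemo {
    public static void main(String[] args) {
        BitSet available = new BitSet(8);
        available.set(0);
        available.set(1);
        available.set(5);

        // Note: java.util.BitSet.length() is the index of the highest
        // set bit plus one, not a fixed capacity, so it equals the
        // processor count only when the highest-numbered processor the
        // VM can see is allocated.
        System.out.println("length      = " + available.length());      // 6
        System.out.println("cardinality = " + available.cardinality()); // 3
    }
}
```

The comment on length() highlights a detail the specification text may
need to pin down: BitSet has no fixed size, so "the length of the bit
set" is well defined only under an agreed convention.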
The VM will pass this information to the application via the firing of
the appropriate asynchronous event (ProcessorRemoved or ProcessorAdded)
declared in the RealtimeSystem class. In these circumstances
affinityChangeNotificationSupported() returns true. An application can
specify an ASEH to run in response to the firing of the above events.
The assumption is that the application will maintain its own list of
which SOs are mapped to which processors (logical or physical). It will
then undertake whatever reconfiguration it deems appropriate.

Failure Model: If a processor fails and the platform cannot
transparently recover, the VM abnormally ends (with assumed fail-stop
semantics). Any recovery must be performed outside of the VM. This is
because a processor failure can leave the application and VM in an
inconsistent state (e.g. with a corrupt heap) from which it is unlikely
to be able to recover.

The API supports the setting of the affinity of real-time threads and
bound ASEHs by the programmer. If the OS does not support this facility
then all of the associated operations, given below, throw
UnsupportedOperationException, and any call to setAffinitySupported()
returns false.

The default affinity can be set at run-time. Two defaults are provided:
one for heap-using SOs and one for no-heap SOs.

  public final static java.util.BitSet setDefaultAffinity(
      java.util.BitSet Processors) throws ProcessorAffinityException;
  public final static java.util.BitSet setDefaultNoHeapAffinity(
      java.util.BitSet Processors) throws ProcessorAffinityException;

There is no association maintained between the parameter passed and the
default, i.e. copy semantics: changing the parameter object at a later
stage will NOT result in a change of the default.

  public final static java.util.BitSet getDefaultAffinity()
      throws ProcessorAffinityException;
  public final static java.util.BitSet getDefaultNoHeapAffinity()
      throws ProcessorAffinityException;

The returned object is allocated in the current memory area.
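The copy semantics described above can be sketched in plain Java. The
class below is a hypothetical stand-in, not the proposed RealtimeSystem
implementation; it only shows that cloning the argument on the way in
(and the stored value on the way out) gives the required behaviour:

```java
import java.util.BitSet;

// Sketch (hypothetical, assuming only java.util.BitSet) of the copy
// semantics required of setDefaultAffinity/getDefaultAffinity.
public class DefaultAffinity {
    private static BitSet defaultAffinity = new BitSet();

    // Stores a clone of the argument and returns the previous default,
    // mirroring the shape of the proposed signature.
    public static synchronized BitSet setDefaultAffinity(BitSet processors) {
        BitSet previous = defaultAffinity;
        defaultAffinity = (BitSet) processors.clone(); // copy semantics
        return previous;
    }

    // The getter also returns a copy, so callers cannot mutate the
    // stored default through the returned reference.
    public static synchronized BitSet getDefaultAffinity() {
        return (BitSet) defaultAffinity.clone();
    }

    public static void main(String[] args) {
        BitSet procs = new BitSet();
        procs.set(0);
        procs.set(1);
        setDefaultAffinity(procs);

        procs.set(7); // mutate the caller's object afterwards...
        // ...the stored default is unaffected by the change:
        System.out.println(getDefaultAffinity()); // prints {0, 1}
    }
}
```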
The default "default affinity" is scheduler dependent and must be
documented.

The affinity of a specific SO can be set via the following.

In the RealtimeThread class:

  public java.util.BitSet setAffinity(java.util.BitSet Processors)
      throws ProcessorAffinityException;

Changes to the parameter object do not change the affinity; changes only
occur when the setAffinity method is called. The actual affinity will be
changed between the time the thread finishes its current release and the
time it starts its next release. It must be complete by the time the
next release starts.

Throws IllegalArgumentException if the size of the given bit set does
not match the current size of the bit set returned from
availableProcessors(), or if the given bit set is null.

Throws ProcessorAffinityException if a processor is unavailable.

  public java.util.BitSet getAffinity();

This returns the last bit set that was set by a call to setAffinity (or
the default if there was no call).

In the BoundAsyncEventHandler class:

  public java.util.BitSet setAffinity(java.util.BitSet Processors)
      throws ProcessorAffinityException;

Throws IllegalArgumentException - as above.
Throws ProcessorAffinityException - as above.

  public java.util.BitSet getAffinity();

Discussion Points
---------

1. With the current proposal, only bound asynchronous event handlers can
be pinned, not unbound ones.

2. With the current proposal, Java threads cannot be pinned.

3. I have kept it as simple as possible. One could imagine other methods
that, for example, return an array of the SOs that are currently pinned
to a processor. At the moment, the assumption is that the programmer
maintains any information they need.

4. The current assumption is that there are no API changes needed to
support the two priority inheritance protocols on an SMP (the priority
assignment algorithm will change) and that wait-free queues will also
work in an SMP environment.
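The argument checks that setAffinity must perform can be sketched with
plain java.util.BitSet. Everything here is hypothetical (the helper
name, the 4-processor system, and the stand-in for
availableProcessors() are invented for illustration):

```java
import java.util.BitSet;

// Sketch (hypothetical helper, not part of the proposal) of the
// validation described for setAffinity.
public class AffinityValidation {

    // Stand-in for RealtimeSystem.availableProcessors(): here we
    // pretend the VM sees 4 processors, all allocated to it.
    static BitSet availableProcessors() {
        BitSet b = new BitSet(4);
        b.set(0, 4); // set bits 0..3
        return b;
    }

    // Rejects a null bit set or one naming processors beyond those
    // returned from availableProcessors(), as the text requires.
    static void validateAffinity(BitSet requested) {
        BitSet available = availableProcessors();
        if (requested == null
                || requested.length() > available.length()) {
            throw new IllegalArgumentException("bad affinity bit set");
        }
        // A full implementation would additionally throw
        // ProcessorAffinityException if a bit set in `requested` is
        // clear in `available` (processor currently unavailable).
    }

    public static void main(String[] args) {
        BitSet ok = new BitSet();
        ok.set(1);
        validateAffinity(ok); // accepted

        BitSet tooBig = new BitSet();
        tooBig.set(9); // processor 9 does not exist on this system
        try {
            validateAffinity(tooBig);
        } catch (IllegalArgumentException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```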