by Maarten Struys and Michel Verhagen (Jul. 3, 2003)
Foreword: With the arrival of Visual Studio .NET 2003, with integrated support for Smart Device Programmability, it is possible to develop applications for a broad range of devices by using managed code. Software developers can now use new exciting languages like Visual Basic .NET and Visual C# for device development. Although this sounds promising, one question is still to be answered: Is it possible to make use of the real-time capabilities of Windows CE .NET while using managed code to write applications for an embedded device? This technical article, originally published by the Microsoft Developer Network (MSDN), will answer that question and suggest a possible scenario in which real-time behavior can be combined with Microsoft .NET functionality.
Real-Time Behavior of the .NET Compact Framework
by Maarten Struys and Michel Verhagen PTS Software
A Managed World and an Unmanaged World
Some of the advantages of a managed environment like the Microsoft Common Language Runtime, such as writing safer and platform-independent software, can be a disadvantage in a real-time environment. Typically, you cannot afford to wait for a just-in-time (JIT) compiler to compile a method before using it, and you cannot wait for a garbage collector to reclaim previously allocated memory. Both of these features can interfere with deterministic system behavior. It is possible to force the garbage collector to run by calling GC.Collect(). However, you generally want the garbage collector to decide for itself when to run, because it is highly optimized. To allow hard real-time behavior, you need a way to separate hard real-time functionality, written in native (unmanaged) Microsoft Win32 code, from other functionality, written in managed code. Platform Invoke, or P/Invoke, lets you do just that.
Platform Invoke at Work
According to MSDN Help, P/Invoke is the functionality provided by the common language runtime to enable managed code to call unmanaged native dynamic-link library (DLL) entry points. In other words, P/Invoke gives you an escape route from managed Microsoft .NET code to unmanaged Win32 code. To use this mechanism within Microsoft Windows CE .NET, the native Win32 functions that you want to call must be exported from a DLL. Because the managed .NET environment does not know anything about C++ name mangling, the functions to be called from a managed application must also use C naming conventions. To use functionality from a DLL, you build a wrapper class around the function entry points in your managed application. Listing 1 shows an example of a small, unmanaged DLL. Listing 2 shows how to call it from managed code. Because this mechanism works for all exported DLL functions, and because almost all Win32 APIs are exported from coredll.dll, it also provides a way to call almost any Win32 API. We used P/Invoke in our test to have a managed application call into an unmanaged real-time thread.
    // This is the function GetTimingInfo that exists in the
    // unmanaged Win32 DLL. The function is fed with information,
    // originating in an Interrupt Service Thread in the same
    // DLL. On request of the managed application, timing
    // information is copied using a double buffering mechanism.
    RTCF_API DWORD GetTimingInfo(LPDWORD lpdwAvgPerfTicks,
                                 LPDWORD lpdwMax,
                                 LPDWORD lpdwMin,
                                 LPDWORD lpdwDeltaMax,
                                 LPDWORD lpdwDeltaMin)
    {
        g_bRequestData = TRUE;
        if (WaitForSingleObject(g_hNewDataEvent, 1000) == WAIT_OBJECT_0)
        {
            *lpdwAvgPerfTicks = g_dwBufferedAvgPerfTicks;
            *lpdwMax = g_dwBufferedMax;
            *lpdwMin = g_dwBufferedMin;
            *lpdwDeltaMax = g_dwBufferedDeltaMax;
            *lpdwDeltaMin = g_dwBufferedDeltaMin;
            return 1;
        }
        else
            return 0;
    }

    // GetTimingInfo prototype
    #ifdef RTCF_EXPORTS
    #define RTCF_API __declspec(dllexport)
    #else
    #define RTCF_API __declspec(dllimport)
    #endif

    extern "C"
    {
        RTCF_API BOOL Init();
        RTCF_API BOOL DeInit();
        RTCF_API DWORD GetTimingInfo(LPDWORD lpdwAvgPerfTicks,
                                     LPDWORD lpdwMax,
                                     LPDWORD lpdwMin,
                                     LPDWORD lpdwDeltaMax,
                                     LPDWORD lpdwDeltaMin);
    }

Listing 1. Win32 DLL to be called from within managed code
    // Wrapper class to be able to P/Invoke into a DLL.
    // Exported functions in the DLL are imported by this
    // wrapper. Note the use of compiler attributes to identify
    // the physical DLL that hosts the exported functions.
    using System;
    using System.Runtime.InteropServices;

    namespace CFinRT
    {
        public class WCEThreadIntf
        {
            [DllImport("RTCF.dll")]
            public static extern bool Init();

            [DllImport("RTCF.dll")]
            public static extern bool DeInit();

            [DllImport("RTCF.Dll")]
            public static extern uint GetTimingInfo(
                ref uint perfAvg,
                ref uint perfMax,
                ref uint perfMin,
                ref uint perfTickMax,
                ref uint perfTickMin);
        }
    }

    // Call an unmanaged function from within managed code
    public void CollectValue()
    {
        if (WCEThreadIntf.GetTimingInfo(ref aveSleepTime,
                                        ref maxSleepTime,
                                        ref minSleepTime,
                                        ref curMaxSleepTime,
                                        ref curMinSleepTime) != 0)
        {
            curMaxSleepTime = (uint)(float)((curMaxSleepTime * scaleValue) / 1.19318);
            curMinSleepTime = (uint)(float)((curMinSleepTime * scaleValue) / 1.19318);
            aveSleepTime = (uint)(float)((aveSleepTime * scaleValue) / 1.19318);
            maxSleepTime = (uint)(float)((maxSleepTime * scaleValue) / 1.19318);
            minSleepTime = (uint)(float)((minSleepTime * scaleValue) / 1.19318);
        }

        StoreValue();
        counter = (counter + 1) % samplesInMinute;
    }

Listing 2. Calling into unmanaged code
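The division by 1.19318 in Listing 2 reads like a conversion from raw performance-counter ticks to microseconds. The following C++ sketch illustrates that idea under two assumptions that the article does not state explicitly: that 1.19318 is the counter frequency in MHz (ticks per microsecond), and that scaleValue is a plain multiplier. The names ticksToMicroseconds and kCounterMHz are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Assumption: the performance counter runs at 1.19318 MHz, which is the
// divisor used in Listing 2. Other hardware will use a different frequency.
constexpr double kCounterMHz = 1.19318;

// Convert raw counter ticks to whole microseconds. The result is truncated,
// much like the cast to uint in the managed code.
// 'scale' plays the role of scaleValue in Listing 2 (assumed to be a
// simple multiplier; its real meaning is not given in the article).
std::uint32_t ticksToMicroseconds(std::uint32_t ticks,
                                  double scale = 1.0,
                                  double counterMHz = kCounterMHz)
{
    assert(counterMHz > 0.0);  // a zero or negative frequency makes no sense
    return static_cast<std::uint32_t>((ticks * scale) / counterMHz);
}
```

With these assumptions, 120 ticks correspond to 100 microseconds, the ideal IST period at 10 kHz. On other hardware the frequency would have to be queried at run time, for example with QueryPerformanceFrequency, rather than hard-coded.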
A Real-Time Scenario
A system needs hard real-time functionality to retrieve information from an external source. The information is stored in the system and will be presented to the user in some graphical way. Figure 1 shows a possible scenario for this problem.
 Figure 1. Real-time scenario using both managed and unmanaged code
A real-time thread living inside a native Win32 DLL receives an interrupt from an external source. The thread processes the interrupt and stores relevant information to be presented to the user. On the right side, a separate UI thread, written in managed code, reads information that was previously stored by the real-time thread. Given the fact that context switches between processes are expensive, you want the entire system to live within the same process. If you separate real-time functionality from user interface functionality by putting real-time functionality in a DLL and providing an interface between that DLL and the other parts of the system, you have achieved your goal of having one single process dealing with all parts of the system. Communication between the UI thread and the real-time (RT) thread is possible by means of using P/Invoke to get into the native Win32 code.
The Actual Test
You want to make the test representative, yet as simple as possible, so that it can be repeated easily on other systems. For that purpose, the source code to run the experiment yourself is available for download. The test requires a way to feed interrupts into the system and a way to output probe signals so that the performance of the system can be measured. You feed the system with a square wave (block wave) produced by a signal generator. Of course, the Windows CE .NET operating system image must be capable of hosting the .NET Compact Framework. Paul Yao has written an article indicating which Windows CE .NET modules and components must be present to run managed applications; see Microsoft .NET Compact Framework for Windows CE .NET. Because the aim of the test is to be representative and reproducible, any suitable interrupt source will do as input. Listing 3 shows how to hook a physical interrupt to an Interrupt Service Thread.
    RTCF_API BOOL Init()
    {
        BOOL bRet = FALSE;
        DWORD dwIRQ = IRQ; // in our case IRQ = 5

        // Get a SysIntr for the specified IRQ
        if (KernelIoControl(IOCTL_HAL_TRANSLATE_IRQ,
                            &dwIRQ, sizeof(DWORD),
                            &g_dwSysIntr, sizeof(DWORD),
                            NULL))
        {
            // create an event that will activate our IST
            g_hEvent = CreateEvent(NULL, FALSE, FALSE, NULL);

            if (g_hEvent)
            {
                // Connect the interrupt to our event and
                // create our Interrupt Service Thread.
                // The IST itself is outlined in the pseudo code
                // later in this article
                InterruptDisable(g_dwSysIntr);

                if (InterruptInitialize(g_dwSysIntr, g_hEvent, NULL, 0))
                {
                    g_bFinish = FALSE;
                    g_hThread = CreateThread(NULL, 0, IST, NULL, 0, NULL);
                    if (g_hThread)
                    {
                        bRet = TRUE;
                    }
                    else
                    {
                        InterruptDisable(g_dwSysIntr);
                        CloseHandle(g_hEvent);
                        g_hEvent = NULL;
                    }
                }
            }
        }
        return bRet;
    }

Listing 3. Connecting a physical interrupt to an interrupt service thread
To test the real-time behavior of an application that uses managed code and the .NET Compact Framework, we created a Windows CE .NET platform based on the Standard SDK. We also included the RTM version of the .NET Compact Framework in the platform. The operating system runs on a Geode GX1 at 300 megahertz (MHz). We feed the system with a square wave, connected directly to the IRQ5 line on the PC104 bus (pin 23). The frequency of the square wave is 10 kilohertz (kHz). An interrupt is generated on each rising edge. The interrupt is processed by an interrupt service thread (IST). In the IST we send probe pulses to the parallel port so that we can view an output signal. We also store the time at which the IST was activated, using the high-resolution QueryPerformanceCounter API. To be able to measure timing information over a long period of time, we store the maximum and minimum time in addition to the average time. The time from interrupt occurrence to probe output is an indication of the IRQ-to-IST latency. The timing information acquired with the high-resolution timer indicates when the IST is activated. Ideally the interval between activations is 100 microseconds for an interrupt rate of 10 kHz. All timing information is passed to the graphical user interface at regular intervals.
Because the .NET Compact Framework itself cannot be used in hard real-time situations like those described earlier, we decided to use it for presentation purposes only and to use a DLL, written in Microsoft eMbedded Visual C++ 4.0, for all real-time functionality. For communication between the DLL and the .NET Compact Framework graphical user interface (GUI), a double buffering mechanism is used in combination with P/Invoke. The GUI requests new timing information at regular intervals, using a System.Threading.Timer object. The DLL decides when it has time available to pass information to the GUI. Until data is ready, the GUI is blocked. The refresh rate of the information presented in the GUI is user selectable. For our test we used a refresh interval of 50 milliseconds.
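The request-and-copy hand-off described above can be sketched in portable C++. This is only an illustration, not the code from the DLL: the names TimingStats, TimingExchange, beginRequest, publishIfRequested and waitForData are hypothetical, and std::mutex with std::condition_variable stands in for the Win32 event used in Listing 1. A production IST would probably avoid taking a lock at all.

```cpp
#include <atomic>
#include <cassert>
#include <chrono>
#include <condition_variable>
#include <cstdint>
#include <limits>
#include <mutex>

// Statistics the real-time side accumulates between two requests.
struct TimingStats {
    std::uint32_t min = std::numeric_limits<std::uint32_t>::max();
    std::uint32_t max = 0;
    std::uint64_t sum = 0;
    std::uint32_t count = 0;

    void record(std::uint32_t sample) {
        if (sample < min) min = sample;
        if (sample > max) max = sample;
        sum += sample;
        ++count;
    }
};

// Hypothetical stand-in for the request flag, the "new data" event and the
// buffered copies that live inside the DLL.
class TimingExchange {
public:
    // Reader side: announce that a fresh snapshot is wanted.
    void beginRequest() {
        std::lock_guard<std::mutex> lock(mutex_);
        ready_ = false;
        requested_.store(true, std::memory_order_release);
    }

    // Real-time side: call after every sample. Copies and resets the live
    // statistics only when a request is pending; returns true if it did.
    bool publishIfRequested(TimingStats& live) {
        if (!requested_.load(std::memory_order_acquire)) return false;
        {
            std::lock_guard<std::mutex> lock(mutex_);
            buffered_ = live;  // second buffer, only read by the reader side
            ready_ = true;
        }
        requested_.store(false, std::memory_order_release);
        live = TimingStats{};  // start a new measurement window
        readyCv_.notify_one();
        return true;
    }

    // Reader side: block until the snapshot is ready or the timeout expires.
    bool waitForData(std::chrono::milliseconds timeout, TimingStats& out) {
        assert(timeout.count() >= 0);
        std::unique_lock<std::mutex> lock(mutex_);
        if (!readyCv_.wait_for(lock, timeout, [this] { return ready_; })) {
            requested_.store(false, std::memory_order_release);
            return false;  // comparable to GetTimingInfo returning 0
        }
        out = buffered_;
        return true;
    }

private:
    std::mutex mutex_;
    std::condition_variable readyCv_;
    std::atomic<bool> requested_{false};
    bool ready_ = false;
    TimingStats buffered_;
};
```

In this sketch the measuring thread would call record and publishIfRequested after every interrupt, while the display thread calls beginRequest followed by waitForData once per refresh. The measuring thread never waits for the reader; it only copies its statistics when asked, which is the property the double buffering is meant to provide.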
The following pseudo code explains the operation of the IST and the mechanism by which the GUI retrieves information, stored in the native Win32 DLL.
Interrupt Service Thread:

    Wait On IRQ 5
    send probe pulse to the parallel port
    Measure time with QueryPerformanceCounter
    Store measured time (min, max, current, average) locally
    if (userInterfaceRequestsData)
    {
        copy measured time information
        reset statistic measure values
        set dataReady event
        userInterfaceRequestsData = false
    }

Managed code, periodic update of display data:

    disable timer // See pitfalls
    call with P/Invoke into the DLL
        // The following code is implemented in the DLL
        userInterfaceRequestsData = true
        wait for dataReady event
        return measured values
    draw measured values on the display, each time using new graphics objects
    update marker // A running vertical bar on the display
    enable timer

During the test we hooked up an oscilloscope and made printouts of both the scope and the Windows CE .NET graphical display 10 minutes into the experiment. Figure 2 shows the interrupt latency measured with the oscilloscope. In the best case the latency is 14.0 microseconds and in the worst case it is 54.4 microseconds, which means a jitter of 40.4 microseconds. Figure 3 shows the period with which the IST is activated. This figure is a screen shot of the actual user interface. Ideally the IST should run every 100 microseconds, which is also the average time during our measurement (the blue line in the middle). We also measured the overall minimum (green) and maximum (red) times, in addition to the minimum and maximum times over the sample period of 50 milliseconds (the white block). The deviation we found during the test period is limited to +/- 40 microseconds.
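The two figures quoted above follow from simple arithmetic: the ideal IST period is the reciprocal of the interrupt frequency, and the jitter is the difference between worst-case and best-case latency. A minimal C++ sketch, with hypothetical function names, makes the calculation explicit.

```cpp
#include <cassert>

// Ideal time between two IST activations, in microseconds,
// for an interrupt source running at frequencyHz.
double idealPeriodMicroseconds(double frequencyHz)
{
    assert(frequencyHz > 0.0);
    return 1e6 / frequencyHz;
}

// Jitter is the spread between worst-case and best-case latency.
double jitterMicroseconds(double bestCase, double worstCase)
{
    assert(worstCase >= bestCase);
    return worstCase - bestCase;
}
```

For the 10 kHz input signal, idealPeriodMicroseconds(10000.0) gives 100 microseconds, and jitterMicroseconds(14.0, 54.4) gives the 40.4 microseconds reported for the managed application.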
Figure 2. Managed application: IRQ-to-IST latency
 Figure 3. Managed application: IST activation times after running 10 minutes
The Results
We measured over a longer period of time to make sure that both the garbage collector and the JIT compiler were frequently active. Thanks to the folks at Microsoft, who provided us with a performance counters registry key, we were able to monitor the behavior of the .NET Compact Framework. Setting this key activates a number of performance counters within the .NET Compact Framework. We mainly used this performance information to verify that the JIT compiler and the garbage collector actually ran. It also gave a nice indication of the number of objects used during the course of the test.
    // Our periodic timer method in which we want to collect new
    // data and refresh the screen
    private void OnTimer(object source)
    {
        // Temporarily stop the timer, to prevent against
        // a whole bunch of OnTimer calls to be invoked
        if (theTimer != null)
        {
            theTimer.Change(Timeout.Infinite, dp.Interval);
        }
        Pen blackPen = new Pen(Color.Black);
        Pen yellowPen = new Pen(Color.Yellow);
        Graphics gfx = CreateGraphics();

        td.SetTimePointer(dp.CurrentSample, gfx, blackPen);

        for (int i = 0; i < dp.SamplesPerMeasure; i++)
        {
            td.ShowValue(dp.CurrentSample, dp[i], gfx, i);
        }

        dp.CollectValue();
        td.SetTimePointer(dp.CurrentSample, gfx, yellowPen);

        gfx.Dispose();
        yellowPen.Dispose();
        blackPen.Dispose();

        // Restart the timer again for the next update
        if (theTimer != null)
        {
            theTimer.Change(dp.Interval, dp.Interval);
        }
    }

Listing 4. Handling timer messages in a managed world
As you can see in listing 4, we instantiate a number of objects, two pens and a graphics object, each time we update the screen. The functions td.ShowValue and td.SetTimePointer also create brushes. Because td.SetTimePointer is called twice per screen update, a total of six objects are created during each update of the screen. Because we update the screen every 50 milliseconds, 120 objects are created each second. Over 10 minutes of execution, 72,000 objects are created. All of these objects are potentially subject to garbage collection. In table 1, the number of allocated objects roughly corresponds to these theoretical values.
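The object counts above can be reproduced with a small calculation. The C++ sketch below uses a hypothetical function name and assumes that every screen update happens exactly on schedule, which a real run will not quite achieve.

```cpp
#include <cassert>
#include <cstdint>

// Number of objects allocated over a run, assuming one screen update
// every intervalMs milliseconds with no missed or delayed updates.
std::uint64_t expectedAllocations(unsigned objectsPerUpdate,
                                  unsigned intervalMs,
                                  unsigned runMinutes)
{
    assert(intervalMs > 0);
    const std::uint64_t runMs =
        static_cast<std::uint64_t>(runMinutes) * 60u * 1000u;
    const std::uint64_t updates = runMs / intervalMs;
    return updates * objectsPerUpdate;
}
```

With six objects per update and a 50 millisecond interval, expectedAllocations(6, 50, 10) yields 72,000 for the 10 minute run and expectedAllocations(6, 50, 100) yields 720,000 for a 100 minute run.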
Table 1. .NET Compact Framework performance results after running the test for 10 minutes

We have included performance counter results for both a 10 minute and a 100 minute run. This data was recorded during the actual test. As you can see, after running for 10 minutes, garbage collection had occurred without a noticeable drop in performance. Table 2 shows the performance counters for a run of approximately 100 minutes. Full garbage collection occurred in this run. During this run, only 461,499 objects were created instead of the expected 720,000. This is approximately 36 percent fewer objects than expected. The difference is likely caused by the performance counters themselves, which, according to Microsoft, impose a performance penalty of about 30 percent on the managed application. However, the real-time behavior of the system was not influenced, as shown in figure 4.
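The gap between expected and observed allocations can be checked with the following C++ sketch. The function name is hypothetical; the two counts are the ones quoted in the text.

```cpp
#include <cassert>
#include <cstdint>

// Percentage by which the observed count falls short of the expected count.
double shortfallPercent(std::uint64_t expected, std::uint64_t observed)
{
    assert(expected > 0 && observed <= expected);
    return 100.0 * static_cast<double>(expected - observed)
                 / static_cast<double>(expected);
}
```

Calling shortfallPercent(720000, 461499) gives a value just under 36 percent, which is in the same range as the roughly 30 percent penalty attributed to the performance counters.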
Table 2. .NET Compact Framework performance results after running the test for 100 minutes

 Figure 4. Managed application: IST activation times after running 100 minutes
The remote process viewer provides more proof of the fact that the garbage collector and the JIT compiler did not influence real-time behavior. Figure 5 shows a screen dump of the remote process viewer for the managed application. All threads in the application (except the real-time thread with priority 0) run at normal priorities (251). During our measurements we did not find that the JIT compiler and garbage collector needed kernel blocking to perform their tasks.
 Figure 5. Remote process viewer showing the managed application
Pitfalls
During the test, increasing the frequency of the square wave led to unexpected results in the managed application. Especially when the screen needed to be redrawn frequently because areas of it were invalid, the application randomly hung the system. Investigating this problem revealed behavior that experienced Win32 programmers might not expect. In a Win32 application, using a timer results in a WM_TIMER message each time the timer expires. However, WM_TIMER messages are low-priority messages in the message queue, posted only when there are no higher-priority messages to be processed. This behavior can lead to missed timer ticks, but because SetTimer does not give you an accurate timer to begin with, this is not an issue, especially if the timer is used to update a graphical user interface (GUI). In the managed application, however, we use a System.Threading.Timer object. This timer calls a delegate every time it expires, and the delegate runs on a separate thread taken from a thread pool. If the system is too busy with other activities, for example redrawing an entire screen, more timer delegates, each on its own thread, are activated before previously activated delegates have finished. This can consume all available threads in the thread pool, causing the system to hang. The solution is shown in listing 4. Each time a timer delegate is activated, we stop the timer by invoking the Change method of the Timer object, indicating that we do not want the next timer callback until we have processed the current one. This might result in inaccurate timer intervals, but in our case the timer is only used to refresh the screen, so inaccurate timing is not an issue.
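The underlying idea, never letting a second timer callback start while the first is still running, can also be expressed as a re-entrancy guard. The C++ sketch below is a related technique rather than the article's approach: instead of stopping and restarting the timer, it simply drops ticks that arrive while the previous one is still being handled. The class name NonOverlappingCallback is hypothetical.

```cpp
#include <atomic>
#include <cassert>
#include <functional>
#include <utility>

// Runs a timer callback only if the previous invocation has finished.
// Overlapping ticks are dropped (and counted), never queued.
class NonOverlappingCallback {
public:
    explicit NonOverlappingCallback(std::function<void()> work)
        : work_(std::move(work)) {
        assert(work_);  // an empty callable cannot be invoked
    }

    // Returns true if the work ran, false if this tick was dropped.
    bool onTick() {
        if (busy_.exchange(true, std::memory_order_acquire)) {
            ++dropped_;
            return false;
        }
        // Clear the busy flag even if the work throws.
        struct Release {
            std::atomic<bool>& flag;
            ~Release() { flag.store(false, std::memory_order_release); }
        } release{busy_};
        work_();
        return true;
    }

    unsigned dropped() const { return dropped_.load(); }

private:
    std::function<void()> work_;
    std::atomic<bool> busy_{false};
    std::atomic<unsigned> dropped_{0};
};
```

As with the Change call in listing 4, the price is an irregular update interval when the system is busy, which is acceptable for a display refresh but not for anything that must run on every tick.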
Proof of Results
To be able to compare the results of our experiment with typical results in the same setting, we also wrote a Win32 application that invokes the same DLL with real-time functionality. The Win32 application is functionally identical to the managed application. It provides the system with a graphical user interface in which timing information is displayed in a window. This application paints timing results upon reception of WM_TIMER messages, using only Win32 APIs. We did not find any significant difference in performance, as figures 6 and 7 show. In figure 6 the interrupt latency is again measured with an oscilloscope. For the Win32 application, the best-case latency is 14.4 microseconds. In the worst case, the latency is 55.2 microseconds, meaning a jitter of 40.8 microseconds. These results are nearly identical to those of the test run with the .NET Compact Framework managed application.
Figure 6. Win32 application: IRQ-to-IST latency
Figure 7 shows the period with which the IST is activated, again for the Win32 application. Here too, the results match those of the .NET Compact Framework managed application. The sources for both the managed application and the Win32 application can be downloaded.
 Figure 7. Win32 application: IST activation times after running 10 minutes
Conclusion
It is important to understand that we are not suggesting the .NET Compact Framework for any real-time work by itself. We suggest that it can be used as a presentation layer. In such a system, the .NET Compact Framework can "peacefully coexist" with real-time functionality, without affecting the real-time behavior of Windows CE .NET. In this article we have not benchmarked the graphics capabilities of the .NET Compact Framework. In our situation we did not find any significant difference between an application written entirely in Win32 and an application partly written in a managed environment with C#. Given the higher programmer productivity and the richness of the .NET Compact Framework, there are many advantages to writing presentation layers in managed code and hard real-time functionality in unmanaged code. The clear separation between these two types of functionality is something you get for free with this approach.
Acknowledgements
We had been thinking for quite a while about testing the usability of the .NET Compact Framework in real-time scenarios. This test was only possible through cooperation with people and companies that could provide us with the proper hardware and measuring equipment. We would therefore like to thank Willem Haring of Getronics for his support, ideas and hospitality during this project. We would also like to thank the folks at Delem for their hospitality and for providing us with the equipment needed to execute our tests.
About the authors: Michel Verhagen works at PTS Software in the Netherlands. Michel is a Windows CE .NET consultant with four years of experience with Windows CE. His main expertise lies in the area of Platform Builder.
Maarten Struys also works at PTS Software, where he is responsible for the real-time and embedded competence centre. Maarten is an experienced Windows (CE) developer, having worked with Windows CE since its introduction. Since 2000, Maarten has been working with managed code in .NET environments. He is also a freelance journalist for the two leading magazines about embedded systems development in the Netherlands. He recently opened a website with information about .NET in the embedded world.