Sunday, August 28, 2011

The problem with sequence coverage. (part 2)

Previously I explained why relying on sequence coverage alone is not a good idea, as it is possible to have 100% sequence coverage but not 100% code coverage. However, I only described a scenario that used a branch with 2 paths, i.e. the most common form of conditional branch. There is one other member of the conditional branch family in IL, and that is the switch instruction; this instruction can have many paths. This time I am using the code from the Newtonsoft.Json library because a) it has tests and b) it is very well covered, at 83% sequence coverage but only 72% (by my calculations) branch coverage. The subject of this investigation is BsonReader::ReadType(BsonType). This method has a very large switch statement, one that is actually compiled to a switch instruction in IL, with a default and several fall-throughs; a fall-through is where two or more case statements share the same code. The method itself has 98% sequence coverage and 82% branch coverage; the only code that is uncovered is the handler for the default: path, which is not unexpected, as it is a handler for an enum that should never be set to any value outside its allowed range. Looking at the branch coverage report, we have the following results (the switch instruction we are interested in is at IL offset 8).
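To illustrate the shape of such a method, here is a simplified sketch (not the actual Newtonsoft.Json source, and the enum is a stand-in): each case label gets its own entry in the IL switch table, even when several labels fall through to one shared body, so every label is a separate path as far as branch coverage is concerned.

```csharp
using System;

// Simplified sketch, not the actual Newtonsoft.Json source. Each case
// label gets its own entry in the IL switch table, even when several
// labels share one body, so every label is a separate branch path.
enum SampleBsonType { Number, String, Symbol, Long, TimeStamp }

static class Sketch
{
    public static string ReadValue(SampleBsonType type)
    {
        switch (type)
        {
            case SampleBsonType.String:
            case SampleBsonType.Symbol:     // fall-through: shares the String body
                return "string";
            case SampleBsonType.Number:
            case SampleBsonType.Long:
            case SampleBsonType.TimeStamp:  // fall-through: shares the Number body
                return "number";
            default:                        // the uncovered default: path
                throw new ArgumentOutOfRangeException("type");
        }
    }
}
```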
Now the first path (path 0) is unvisited, but we knew that; the next unvisited branch is #14, and the next is #17. Luckily for us, the enum used by the switch instruction is well defined.
From this we can deduce that the method is never called during testing with the values Symbol and TimeStamp, even though the code they would call is covered; in fact, we can see from the code that both of these enum values are case labels in the switch statement and are part of fall-throughs. So again we see how branch coverage helps identify 'potential' issues and test candidates.
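One way to close such a gap, where the enum is well defined, is to generate a test case per enum value. Here is a hedged NUnit-style sketch reusing the stand-in Sketch class from above (BsonReader::ReadType itself is not public, so a real test would have to drive it through the public reading API instead):

```csharp
using System;
using NUnit.Framework;

[TestFixture]
public class SwitchPathTests
{
    // One test case per defined enum value, so every switch path except
    // default: is attempted at least once.
    public static Array AllValues = Enum.GetValues(typeof(SampleBsonType));

    [TestCaseSource("AllValues")]
    public void EveryValueTakesItsOwnSwitchPath(SampleBsonType type)
    {
        Assert.IsNotNull(Sketch.ReadValue(type));
    }
}
```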

Friday, August 26, 2011

The problem with sequence coverage.

Sequence coverage is probably the simplest coverage metric: the sequence point information is packaged in PDB files and can be read using tools like Mono.Cecil. But just because a method has 100% sequence coverage does not mean you have 100% code coverage.
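For the curious, here is a minimal sketch of pulling sequence points out of an assembly and its PDB with Mono.Cecil; the DebugInformation API shown is from more recent Cecil releases (older versions exposed a SequencePoint on each instruction instead).

```csharp
using System;
using Mono.Cecil;
using Mono.Cecil.Cil;

// Minimal sketch: dump every sequence point in an assembly.
static class DumpSequencePoints
{
    static void Main(string[] args)
    {
        var module = ModuleDefinition.ReadModule(
            args[0], new ReaderParameters { ReadSymbols = true });

        foreach (TypeDefinition type in module.Types)
        foreach (MethodDefinition method in type.Methods)
        {
            if (!method.HasBody) continue;
            foreach (SequencePoint sp in method.DebugInformation.SequencePoints)
            {
                if (sp.StartLine == 0xFEEFEE) continue; // compiler-generated 'hidden' line
                Console.WriteLine("{0} IL_{1:X4} -> line {2}",
                    method.FullName, sp.Offset, sp.StartLine);
            }
        }
    }
}
```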

I'll use an example from OpenCover's own dogfood tests to demonstrate what I mean. Here is a method whose report shows 100% coverage (sequence point coverage, that is).


However, I see an issue: on line 101 there is a condition, i.e. a branch, and yet if the visit count is 1 then there is no possibility that both paths of that branch have been tested. We can therefore infer that even if we had 10,000 visits there would be no guarantee that every path was covered, even in such a simple method.
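To make this concrete, here is a hypothetical stand-in (not the actual dogfood method) where a single call visits every sequence point yet leaves one branch path untaken:

```csharp
// Hypothetical stand-in, not the actual dogfood method. A single call
// Guard(true) executes every statement (100% sequence coverage), yet the
// false path of the branch is never taken (50% branch coverage).
public static string Guard(bool flag)
{
    var result = "start";
    if (flag)               // one branch, two paths
        result += ":flag";  // visited by Guard(true)
    return result;          // visited either way
}
```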

Looking at the raw OpenCover results from which the coverage report was generated, we get:
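Something like the following abbreviated fragment (reconstructed from memory of the OpenCover output schema, so the attribute names are approximate; vc is the visit count and, for branch points, path distinguishes the paths of a single branch):

```xml
<SequencePoints>
  <SequencePoint vc="1" offset="0"  sl="100" />
  <SequencePoint vc="1" offset="10" sl="101" />
  <SequencePoint vc="1" offset="22" sl="102" />
</SequencePoints>
<BranchPoints>
  <BranchPoint vc="1" offset="10" path="0" />  <!-- visited once -->
  <BranchPoint vc="0" offset="10" path="1" />  <!-- never visited -->
</BranchPoints>
```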


We can see that each sequence point has been visited once; however, the branch coverage shows that only one of the paths for the condition we identified has been visited (in this case the true path), which is good as that is what we deduced.

So if you are using code coverage tools, do NOT rely on sequence point coverage alone to determine how well covered your code is. Luckily OpenCover now, as of 25th Aug 2011, supports branch coverage, and ReportGenerator 1.2 displays most of the information needed to help you identify possible coverage mismatches.

Wednesday, August 10, 2011

OpenCover Performance Impact (part 2)

I think I now have a handle on why I was getting the results I reported earlier, i.e. why OpenCover and PartCover are not magical performance boosters that add Go Faster stripes to your code.

After a heads-up from leppie and his investigations into using OpenCover on his IronScheme project, I realised that I needed to spend some time optimizing how I get data from the profiler and aggregate it into the report. In case you are wondering, the IronScheme test that took just shy of 1 minute to run on my machine took over 60 minutes when running under the profiler. Ouch!

The problem

First of all I should explain what sort of data OpenCover gathers (and why), and then I can describe what I did to improve performance. OpenCover records each visit to a sequence point and stores these visits in shared memory; I did it this way as I am hoping to be able to use the order of visits for some form of path coverage analysis at a later date. After 8000 visits it informs the host process that there is a block ready for processing. The host takes this block, makes a copy, releases the shared memory back to the profiler and then processes the data. After processing the data the host then waits for the next message. It was this latter stage that was the bottleneck: the host was spending so much time aggregating the data that the profiler was already waiting with the next 8000 points.
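A simplified sketch of the host side of that handshake (illustrative only; the shared-memory and event names here are hypothetical, not OpenCover's actual ones):

```csharp
using System.IO.MemoryMappedFiles;
using System.Threading;

// Illustrative host-side loop; names and sizes are hypothetical, not
// OpenCover's actual implementation.
static class HostLoop
{
    const int BlockSize = 8000 * 8; // 8000 visit records of two 32-bit values

    static void Main()
    {
        using (var mmf = MemoryMappedFile.OpenExisting("Profiler_Visits"))
        using (var view = mmf.CreateViewAccessor(0, BlockSize))
        {
            var dataReady  = EventWaitHandle.OpenExisting("Profiler_DataReady");
            var bufferFree = EventWaitHandle.OpenExisting("Profiler_BufferFree");

            while (dataReady.WaitOne())
            {
                var copy = new byte[BlockSize];
                view.ReadArray(0, copy, 0, copy.Length); // copy the block out
                bufferFree.Set();  // release the shared memory back to the profiler
                Process(copy);     // ...but this step was holding the profiler up
            }
        }
    }

    static void Process(byte[] block) { /* aggregate visit counts */ }
}
```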

An (interim) solution

I say interim solution as I am not finished with performance improvements yet, but I decided that what I had implemented so far was okay for release.

First I looked at how the results were being aggregated and noticed that a lot of the time was being spent looking up sequence points so that their visit counts could be updated. I switched this to a list and mapped the visit count data onto the model at the end of the profiling run. This helped, but only by bringing the profiling run down to ~40 minutes.
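The change amounts to something like this (an illustrative sketch, not the OpenCover source): counts are accumulated in a flat list indexed by the unique sequence-point id and only folded back into the model once, at the end.

```csharp
using System.Collections.Generic;

// Illustrative sketch, not the OpenCover source: accumulate visit counts
// in a flat list indexed by unique sequence-point id instead of looking
// each point up in the model on every visit.
class VisitAccumulator
{
    private readonly List<int> _counts = new List<int>();

    public void RecordVisit(int uniquePointId)
    {
        while (_counts.Count <= uniquePointId) _counts.Add(0);
        _counts[uniquePointId]++;
    }

    // Called once at the end of the run to fold counts back into the model.
    public int VisitCountFor(int uniquePointId)
    {
        return uniquePointId < _counts.Count ? _counts[uniquePointId] : 0;
    }
}
```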

I then realised that I just had to get the data out of the way quickly and process it later, so I added a processing thread and a ConcurrentQueue. This was an interesting turn of events: the target process now finished in 4 minutes, but the host took nearly 40 minutes to process the data, memory usage went up to 2.5GB and the backlog reached 40K messages. Hmmm....
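The shape of that change, as a hedged sketch (again, not the actual OpenCover code): the receiving loop only enqueues raw blocks, and a background thread drains the queue and aggregates.

```csharp
using System.Collections.Concurrent;
using System.Threading;

// Hedged sketch, not the actual OpenCover code: decouple receiving blocks
// from aggregating them with a ConcurrentQueue and a background thread.
class BlockProcessor
{
    private readonly ConcurrentQueue<byte[]> _queue = new ConcurrentQueue<byte[]>();

    public BlockProcessor()
    {
        new Thread(Drain) { IsBackground = true }.Start();
    }

    // Called from the receiving loop; returns immediately so the shared
    // memory can be handed back to the profiler without waiting.
    public void Enqueue(byte[] block)
    {
        _queue.Enqueue(block);
    }

    private void Drain()
    {
        byte[] block;
        while (true)
        {
            if (_queue.TryDequeue(out block))
                Aggregate(block);
            else
                Thread.Sleep(1); // nothing pending; yield briefly
        }
    }

    private void Aggregate(byte[] block) { /* update visit counts */ }
}
```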

After some toying around, whilst looking for inspiration, I noticed that the marshaling of the structure (2 integers) was where most of the time was being spent. I switched this to using BitConverter, which also meant that I could avoid the memory pinning required by the marshaling. Now the target process still ran in just under 4 minutes, but the backlog very rarely reached 20 messages, memory usage stayed at a comfortable level (<100MB), and the results were ready virtually as soon as the target process had closed. I also upped the number of visit points per packet to 16000, but this didn't show any noticeable improvement in performance.
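The before/after looks roughly like this (the field names are hypothetical; the record is just the two 32-bit values mentioned above):

```csharp
using System;
using System.Runtime.InteropServices;

// Field names are hypothetical; the record is just two 32-bit integers.
struct VisitPoint
{
    public uint UniqueId;
    public uint Flags;
}

static class VisitPointReader
{
    // Before: Marshal.PtrToStructure per record, which requires pinning
    // the buffer so the GC cannot move it while it is being read.
    public static VisitPoint ReadSlow(byte[] buffer, int offset)
    {
        var handle = GCHandle.Alloc(buffer, GCHandleType.Pinned);
        try
        {
            var ptr = handle.AddrOfPinnedObject() + offset;
            return (VisitPoint)Marshal.PtrToStructure(ptr, typeof(VisitPoint));
        }
        finally
        {
            handle.Free();
        }
    }

    // After: read the two fields straight out of the byte buffer with
    // BitConverter; no marshaling layer and no pinning required.
    public static VisitPoint ReadFast(byte[] buffer, int offset)
    {
        return new VisitPoint
        {
            UniqueId = BitConverter.ToUInt32(buffer, offset),
            Flags    = BitConverter.ToUInt32(buffer, offset + 4)
        };
    }
}
```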

I decided this was enough for now and released a version of the profiler.

But what about the earlier results?

Those earlier results, though, were still cause for thought. Why should the OpenCover dogfood tests be faster but the IronScheme test be so much slower? Well, the IronScheme tests were doing a lot of loops and were running parts of the code many thousands of times, whereas the dogfood tests were unit tests where the code was only run a few times before moving on to the next test fixture and the next section of code. I am now thinking that the difference is due to the optimization normally performed by the JIT compiler, which is turned off by the profiler: when running the tests without the profiler, the JIT compiler spends time optimizing the code, but that time is not recovered because the code is not run enough times to give a net gain, compared to when the JIT compiler just compiles the non-optimised, instrumented code that the profiler produces.

So, in conclusion, you may see some speed improvements when running tests where your code is only visited a few times, but if you are executing code intensively then don't be surprised if performance is degraded.