That's not supposed to happen. Various websites claim that when it does, it's due to a lack of synchronization between the RAM and CPU. Normally, you shouldn't be seeing any of these issues. I gained a few FPS by going from 8/4 to 8/8/4, with the two 8GB modules in the corresponding slots.
I'll take a detailed look at what does what, but like I said, FLEX is supposed to split the memory into sections and force asynchronous dual channel, which is why CPU-Z reports it as such. Not all of the memory runs in dual channel in that case, but if any issues arise, it would be because of a lack of compatibility or a bad interaction.
EDIT: Okay, I ran AIDA64 memory benchmarks. I use standard Hynix memory.
8/8/4
Read: 23463MB/s
Write: 24958MB/s
Copy: 23009MB/s
Latency: 71.1ns
8/8
Read: 24051MB/s
Write: 25092MB/s
Copy: 23200MB/s
Latency: 72.9ns
Okay...that doesn't tell me much. If anything, I gained a couple hundred MB/s from the memory controller being offloaded and the dual channel covering the full 16GB (the extra ~600MB/s on reads is nice but impossible to notice). There's nothing here to show that 2 modules are much better than 3 for me, at least not in the way you describe. Latency even went up...
Then again, the i7-4710HQ + GTX980M combo beats the **** out of any game you throw at it (except maybe Star Citizen on Very High) so it's kinda hard to determine any performance differences.
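For anyone who wants the 8/8/4 vs 8/8 gap as percentages rather than raw numbers, here's a quick sketch in Python. It only uses the AIDA64 figures I posted above; the variable and function names are just mine for illustration.

```python
# AIDA64 results from this post: bandwidth in MB/s, latency in ns.
three_modules = {"read": 23463, "write": 24958, "copy": 23009, "latency": 71.1}  # 8/8/4
two_modules = {"read": 24051, "write": 25092, "copy": 23200, "latency": 72.9}    # 8/8


def percent_change(before, after):
    """Relative change from `before` to `after`, in percent."""
    return (after - before) / before * 100


# Change when going from 8/8/4 to 8/8 for each metric.
changes = {
    metric: percent_change(three_modules[metric], two_modules[metric])
    for metric in three_modules
}

for metric, change in changes.items():
    print(f"{metric}: {change:+.2f}%")
```

Every metric moves by less than 3%, and keep in mind that for latency a positive change means it got slower.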
To put things into perspective, here are the scores for a single 8GB module:
Read: 12405MB/s
Write: 12551MB/s
Copy: 12304MB/s
Latency: 67.7ns
So, the difference between 8/8 and 8/8/4 is minimal, and I'll underline again: if there are issues, they're going to be because of a lack of synchronization. Sadly, I can't confirm your theory either way since I don't have your issue.
Still, it's nice to see that dual channel actually DOES make memory bandwidth almost exactly twice as high (latency is another story, the single module is actually the lowest there).
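If you want to check the dual channel vs single module scaling yourself, this is a rough sketch of the arithmetic in Python, using only the 8/8 and single 8GB numbers from this post. The names are made up for the example.

```python
# AIDA64 results from this post: bandwidth in MB/s, latency in ns.
dual_channel = {"read": 24051, "write": 25092, "copy": 23200, "latency": 72.9}   # 8/8
single_module = {"read": 12405, "write": 12551, "copy": 12304, "latency": 67.7}  # 1x 8GB

# Ratio of dual channel to single module for each metric.
# For bandwidth, higher is better; for latency, a ratio above 1 means slower.
scaling = {
    metric: dual_channel[metric] / single_module[metric]
    for metric in dual_channel
}

for metric, ratio in scaling.items():
    print(f"{metric}: {ratio:.2f}x")
```

Read, write and copy all land between about 1.9x and 2.0x, while latency ends up slightly worse with two modules.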