The fact remains that if you underexpose by 5 stops you will lose 5 stops of detail at the bottom end. Is this correct?
No, not necessarily...
Perhaps there is nothing in the scene that is very dark, and there are a lot of bright areas that exceed the sensor's/pixel's capacity (maybe a boating scene in bright light with water reflections and white boats/sails/etc.). Underexposing that scene will extend the recorded DR.
Perhaps a scene does have dark areas as well as overly bright areas that exceed capacity, but the dark areas are not so dark that the recorded signal is lost in the system noise. Again, underexposing that scene will extend the recorded DR.
Or perhaps the scene has both light and dark areas that exceed capacity/capability. Underexposing that scene will shift the recorded DR.
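As a toy sketch of the first scenario (all numbers hypothetical: a 12-stop sensor window, a bright scene whose top few stops exceed clipping):

```python
# Toy model: scene patch brightnesses in stops above the sensor's noise floor.
# Hypothetical sensor window: 0 (noise floor) to 12 (FWC/clipping).
def captured(scene_stops, shift=0):
    """Stops that land inside the sensor window after an exposure shift."""
    return [s + shift for s in scene_stops if 0 <= s + shift <= 12]

bright_scene = list(range(4, 16))       # highlights at stops 13-15 exceed clipping

print(len(captured(bright_scene)))      # -> 9 stops recorded at the metered exposure
print(len(captured(bright_scene, -3)))  # -> 12 stops: -3 EV pulls the highlights in
```

Underexposing by 3 stops brings the clipped highlights into the window without pushing anything below the floor, so more of the scene's range is recorded.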
You will reduce the recorded DR *if* no pixels saturate to FWC... you lose it from the top, not the bottom (i.e. you make brighter areas darker). Yes, this does mean dark areas of the scene may drop below the minimum, but zero is still zero. The loss of DR is because nothing was recorded at 255.
Do not confuse the 5-12 stops of DR your output (print/display) is capable of with the 12-14 stops of DR the sensor may be capable of. And realize that even if your monitor is capable of 12 stops, your processing/raw conversion may be reducing it further (i.e. high-contrast images). Just because you are not seeing it doesn't mean it wasn't recorded.
Most of this probably sounds like an endorsement for underexposing/using invariance. But it really isn't.
The problem with recording a lack of light is not only a possible reduction of DR. It is also a lack of information and accuracy (color/etc.). Let's say it takes 10 generated electrons to be distinguishable. In a low-light scenario, when some pixels reach 10 the bottom of the recorded DR is established, but due to photon shot noise others will still be below 10.
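A rough simulation of that shot-noise point, assuming the 10-electron threshold from the example and Poisson photon statistics:

```python
import math
import random

random.seed(0)

THRESHOLD = 10   # electrons needed to be distinguishable (the example above)
MEAN = 10        # pixels whose *average* signal sits exactly at that threshold
N = 100_000

def poisson(mean):
    """Sample a photon/electron count (Knuth's algorithm; fine for small means)."""
    limit = math.exp(-mean)
    k, p = 0, 1.0
    while True:
        p *= random.random()
        if p <= limit:
            return k
        k += 1

below = sum(poisson(MEAN) < THRESHOLD for _ in range(N))
print(f"{below / N:.1%} of pixels read below the threshold")
```

Even though every pixel averages exactly 10 electrons, just under half of them land below 10 on any given frame (the Poisson CDF at 9 for a mean of 10 is about 0.46), so the bottom of the recorded range is ragged rather than clean.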
And if we follow that out, every stop of gradation requires 2x as many electrons as the previous (10, 20, 40, 80, 160). Each step is still a difference of 1 stop, but the sampling frequency/accuracy within the higher stops is greater.
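To put numbers on that doubling, here is a sketch assuming an idealized linear 14-bit ADC (real sensors only approximate this):

```python
# Count the raw levels available inside each one-stop band of an idealized
# linear 14-bit encoding.
BIT_DEPTH = 14
total_levels = 2 ** BIT_DEPTH             # 16384 codes

levels_per_stop = []
top = total_levels
for _ in range(BIT_DEPTH):
    bottom = top // 2
    levels_per_stop.append(top - bottom)  # codes in this one-stop band
    top = bottom

print(levels_per_stop[:3])   # -> [8192, 4096, 2048]  (brightest stops)
print(levels_per_stop[-3:])  # -> [4, 2, 1]           (darkest stops)
```

Half of all the codes describe the single brightest stop, while the darkest stop gets one code. That is the accuracy difference between the top and bottom of the range.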
Most of this doesn't matter until you start editing/pushing color/exposures around. If you take a really dark image (i.e. black on black) and properly expose it so that it records in the dark end, you record very little data with very little separation between the tones (plus shot noise). If you attempt to edit such an image to any significant degree you will quickly run into banding and noise issues that cannot be resolved due to this lack of data/accuracy. And have you noticed the color noise that exists in dark areas? That's because a pixel receives enough light to register while surrounding ones don't, so the pixel outputs as its native color or an errant color.
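A rough sketch of why that push falls apart (hypothetical numbers: tones captured entirely in the bottom six stops of a 14-bit raw, then pushed +6 EV in editing):

```python
# Deep shadows of a 14-bit raw: values 0-63 cover the bottom six stops.
dark_capture = list(range(64))             # every level those shadows can hold
pushed = [v * 64 for v in dark_capture]    # +6 EV (x64) push in editing

# Only 64 distinct levels now span a wide tonal range, so adjacent tones
# sit 64 codes apart: visible banding once stretched.
print(len(set(pushed)))        # -> 64 distinct levels
print(pushed[1] - pushed[0])   # -> 64 codes between adjacent tones
```

Sixty-four levels stretched over what should be thousands of codes is exactly the gap structure that shows up as banding, and no amount of later processing can recover tones that were never separated in the first place.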
This might sound like an endorsement of ETTR... and technically it is. But in practice many/most situations don't allow for such refinement, and the risk of causing/extending highlight clipping is greater... and if a scene isn't heavily biased to the dark end, there isn't a lot of benefit anyway. My preference is to preserve highlights and ignore ETTR *unless* the scene is very dark.
What's funny is that this is what automated exposure will do for you if you let it; if it is a very bright scene it will underexpose (save highlights), and if a scene is particularly dark it will overexpose (ETTR). But a "correct" manual exposure will happily clip either end...
Can you recover black, i.e. 0,0,0?
0,0,0 in the raw file, no.
0,0,0 anywhere else, probably (w/ issues noted above).