Canon EOS 5D Mark II - The Unified Grand Theory of Black Pixels

©2008-12-13 Henrik Herranen, updated 2009-01-08

This page is http://www.iki.fi/leopold/Photo/Canon5D2BlackPixels/


2009-01-08: Issue Seems to Be Corrected

A bit surprisingly, Canon's new firmware v1.0.7 for the Canon EOS 5D Mark II seems to remove the black dot problem right at the RAW level, according to Andrew Yip and some other bloggers / testers. I am most pleased to hear the good news, particularly as it doesn't involve a hardware fix and doesn't seem to have any adverse effect on performance. Kudos to Canon for correcting the problem so fast!

And, to put our money where our mouths are: my wife ordered our 5D Mark II (+ 32 GB memory cards + Eg-S focusing screen + extra battery) today!


0. Original Article

Contents:
  1. General
  2. The Models
  3. Testing the Models, Part 1
  4. Testing the Models, Part 2
  5. Corrective Measures
  6. Conclusions

1. General

Many photographers have noticed a weird behaviour pattern with Canon's brand new consumer flagship camera, the EOS 5D Mark II. For some reason it tends to create black dots a few pixels to the right side of highly over-exposed highlights like Christmas tree lights and night lighting (when using normal landscape camera orientation; otherwise the black pixels are on the side which would be normal camera right). This phenomenon seems to manifest itself only mildly at ISO 100, but very clearly at higher ISOs. The phenomenon seems to be 100 % repeatable - particularly with people posting "proof" that their camera isn't afflicted by the problem. It isn't a JPEG engine problem because it can be seen in both JPEG and RAW output.

This article demonstrates a simple DSLR image signal path model that, while being pure black box guesswork, will demonstrate a plausible way the phenomenon could happen. The point of the model is that it is plausible both mathematically and physically - I am looking at the problem as an engineering problem, not as something where things happen by magic.

Although I may be completely off-track with my guesses, I find it highly likely that something like my model is actually happening inside the new Canon EOS 5D Mark II (called 5D2 from now on).

Oh, and because someone is bound to ask: my motivation for writing this page is my desire to be able to buy as good a 5D2 as possible. While it seems to take perfectly good pictures in the daytime, I'm not at all thrilled at the thought that I would not be happy with my night time or astro photographs because the camera has added funky details to them. I am now running my computer on a 4 megapixel monitor, which already shows the problem quite clearly, and I am sure my monitor resolution won't go down during the following years. Thus I hope the 5D2 to be a logical continuation to the good old original 5D, which has given my family endless hours of joy.


2. The Models

To be able to explain what is happening, we have to create a model of the image signal path. In the model we have to make realistic assumptions of what is going on inside the 5D2. We also have to leave out all those components that are not necessary for understanding the system so as not to make the model incomprehensible.

To make things easier to understand, I am going to assign specific voltages and limits to analog signals. They are almost certainly not nearly correct. However, the exact voltages are not important. It's their interactions that matter, and those are easier to understand if we assign some physical properties to them.

2.1 The General Model

The relevant signal path is as follows:

  1. First we have our offending Christmas decoration with lots of pointy lights which will later trigger all of our problems.

    Range: The Sun is the limit.

  2. The shutter limits the amount of light that enters the sensor.

    Range: To make everything easier for us we'll assume that our shutter speed is 1/100s when ISO is set to 100, 1/200s when ISO is set to 200 etc. This way we will always get equal exposure.

  3. The sensor collects light at each of its over 21 million pixel locations. The analog pixel values are transmitted in a rapid fashion one horizontal line at a time, from left to right onto the next processing pipeline stages. Because it seems that the Canon EOS 1Ds Mark III and the 5D2 share the same sensor, and because the black pixel problem doesn't appear on the 1Ds3, we will assume the sensor is an ideal component. Note that this means that we are not going to model sensor noise which, while important, isn't relevant for this discussion.

    Range: Let's assume that black would be 0 volts and absolute high white level would be 1 volt. And let's further assume that the highest level the sensor can output is slightly higher than that to make sure we actually will always get a full amplitude value at the later A/D conversion stage. Make it 1.1 V. Again, it is highly unlikely that these figures are even close to the truth, but it won't really matter. It just is more convenient to have something even remotely tangible to count with.

  4. DSLRs have several parallel read channels, which means that two or four pixels are transmitted at a time. It would make sense to send green pixels of a line on one channel and the red or blue pixels on another channel. This would keep colour interference to a minimum and thus increase colour purity. Thus we will make the very important assumption that one signal path from now on will only carry adjacent pixels of the same colour!

    Range: As before: 1 V is maximum white, 1.1 V is the absolute maximum value.

  5. Somewhere between the sensor and the ISO amplifier, or perhaps even inside the ISO amplifier, signal degradation occurs. We will later model the degradation as a linear filter, and this will be a very important part in explaining what goes wrong with the 5D2. A separate Signal path degradation model will be presented in the next section.

    Range: As before.

  6. The ISO amplifier amplifies the signal according to the user selected sensitivity.

    Range: As agreed upon earlier, the input to the ISO amplifier is limited to 0-1.1 V. Let's assume that amplification at ISO 100 is 1, which means that the output of the ISO amplifier is the same as the input (this is almost certainly not the case, but as before, it doesn't really matter). Then, amplification at ISO 400 would be 4. However, when using ISO 400 1.1 V would not be increased to 4.4 V because the analog amplifier must have a limited output. We'll assume that the output is limited to the same range as the input: to 1.1 V. Thus, at ISO 400, 0.1 V is amplified to 0.4 V, but 0.5 V is not amplified to 2.0 V, but instead limited to 1.1 V.

  7. The A/D converter converts the data from the analog to the digital domain. It outputs 14-bit values which will later be losslessly compressed to form the 5D2's RAW files and they will also be fodder for the lower quality sRAW1, sRAW2 and JPEG formats. While it is entirely possible for further signal degradation to happen between the ISO amplifier and the A/D converter, it is not important for the black pixel phenomenon, and thus it will not be modelled.

    Range: 0 to 1 V are converted to values between 0 and the maximum value. Although the maximum 14-bit value is 16383, we'll normalize this highest value to 1.0 so that we'll have similar scales on our whole signal path. This, as well as the conveniently chosen voltages, will make it easier to draw graphs later.
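
The shutter rule and the last two stages of the list above can be sketched in a few lines of Python. This is only an illustration on the article's arbitrary voltage scale: the function names `shutter_time`, `iso_amplifier` and `adc` are made up for this sketch, and clipping negative voltages to zero is an added assumption that the list does not spell out.

```python
def shutter_time(iso):
    """Exposure time in seconds under the rule 1/100 s at ISO 100, 1/200 s at ISO 200, etc."""
    return 1.0 / iso


def iso_amplifier(volts, iso, limit=1.1):
    """Amplify by ISO/100 and clip the result to 0..limit volts.

    The gain of 1 at ISO 100 and the 1.1 V ceiling are the article's
    arbitrary figures; the floor at 0 V is an assumption of this sketch.
    """
    gain = iso / 100.0
    return min(max(volts * gain, 0.0), limit)


def adc(volts, full_scale=1.0, bits=14):
    """Clip to 0..full_scale volts and convert to an integer code 0..2**bits - 1."""
    clipped = min(max(volts, 0.0), full_scale)
    return round(clipped / full_scale * (2 ** bits - 1))


print(iso_amplifier(0.1, 400))  # 0.4: amplified normally
print(iso_amplifier(0.5, 400))  # 1.1: would be 2.0 V, limited by the amplifier
print(adc(1.1))                 # 16383: anything above 1 V reads as full scale
```

The two amplifier calls are the worked example from stage 6: 0.1 V becomes 0.4 V, while 0.5 V hits the 1.1 V ceiling.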

2.2 Signal Path Degradation

As mentioned earlier, the pixels that are output from the sensor are transmitted to the ISO amplifier. To recreate the dreaded black pixels we have to make this pathway non-ideal in a certain way.

The frequencies in the signal path are very high for analog signals: even with 4 read channels and 3.9 frames per second, each channel has to carry at least 20 million pixels each second. This means that even the slightest electrical mismatch, or even the traces on a circuit or a circuit board, could have an effect on the signal.
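
The read-out rate quoted above is easy to check. The sketch below assumes a round 21 million pixels (the text only says "over 21 million") and that the pixels are divided evenly between the 4 channels; the variable names are made up for this example.

```python
pixels_per_frame = 21_000_000  # assumption: "over 21 million", rounded down
frames_per_second = 3.9
read_channels = 4

per_channel_rate = pixels_per_frame * frames_per_second / read_channels
print(f"{per_channel_rate / 1e6:.1f} Mpixel/s per channel")  # 20.5 Mpixel/s per channel
print(f"{1e9 / per_channel_rate:.1f} ns per pixel")          # 48.8 ns per pixel
```

So each analog sample has less than 50 nanoseconds to settle before the next one arrives, which is why small imperfections in the path can matter.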

Signal path degradation is modeled to have the following impulse response:

The image is read as follows: each number on the horizontal axis is the time it takes to transmit one sample. The vertical scale shows how the analog signal changes over time because of one full scale input sample.

The impulse that is intended to be read is output at time point -1, at which time the impulse response is 0. The read point is exactly one sample later, at time point 0, which is where the top of the impulse lies. Then, the impulse goes down so that it is zero by time 1 when the next sample is supposed to be read. This is all good and well.

Notice, however, that there is a low-amplitude continuation in the impulse response so that there is an ever-so-slight negative value at time point 2. This means that every sample that has been read will have a very slight effect on the sample being read two sample cycles later. The effect is very small: the value at 2 is only -0.01, or -1% of the value. While it may seem very innocent, it is exactly this one per cent error that is going to cause all of our problems later.

If you are not accustomed to high-frequency designs, you may ask why anyone would be so stupid as to make such a system where an old signal "rings" so long that it (slightly) confuses the reading of the next signal. Well, rest assured. No-one puts these kinds of inaccuracies there on purpose. It just happens to be that any real system with limited bandwidth (and there's no such thing as unlimited bandwidth) has this kind of ringing. The only question is how much ringing occurs and whether it can be handled in such a way that it won't adversely affect the performance of the system.
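
Sampled at the read points, the impulse response described in this section boils down to three numbers: a main value at time 0, zero at time 1, and -0.01 at time 2. Below is a minimal sketch of such a filter. The function name `degrade`, the `unity_dc` switch and the handling of the first two samples (the line is assumed to continue to the left with its first value) are all assumptions of this sketch. With `unity_dc=True` the main tap is 1.01 instead of 1.0 so that flat areas pass through unchanged; that choice is inferred from the numeric tables in section 3, not stated in the text.

```python
def degrade(samples, ring=0.01, unity_dc=True):
    """Apply y[n] = main * x[n] - ring * x[n-2] to one line of samples.

    ring     -- size of the negative value at time point 2 (0.01 in the text)
    unity_dc -- if True, main = 1 + ring so a constant signal is unchanged
                (an assumption); if False, main = 1.0
    """
    main = 1.0 + ring if unity_dc else 1.0
    out = []
    for n, x in enumerate(samples):
        # Assumption: before the start of the line, the first value repeats.
        two_earlier = samples[n - 2] if n >= 2 else samples[0]
        out.append(main * x - ring * two_earlier)
    return out


# A single full-scale sample leaves a small negative echo two samples later.
print(degrade([0.0, 0.0, 1.0, 0.0, 0.0, 0.0], unity_dc=False))
# [0.0, 0.0, 1.0, 0.0, -0.01, 0.0]
```

The echo is 1 % of whatever was read two samples earlier, regardless of how dark the current sample is. That is harmless as long as the earlier sample is of similar brightness.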


3. Testing the Models, Part 1

So we have a model. Now we could test it with an actual signal.

3.1 Proving the Negative: Highlight Within Sensor Range

We'll start by proving that the sensor works just fine most of the time and with normal material. At the same time we will get accustomed to the graph format that shows what happens in the signal path.

Let us digitize the following horizontal pixel line, consisting of 12 pixels:

First, the individual pixels are read by the sensor. The brightest values are at 95 % of pure white, so presenting them will be well inside the capacity of the sensor. Now we will see what happens with the signal during the stages presented earlier in our system model. We'll examine cases at ISO 100 as well as ISO 400 to see how the system manages.

3.1.1 Highlight Within Sensor Range, ISO 100

All right, we'll start by using base ISO and a 1/100 s exposure time. What we get is the following graph:

Because the signal isn't overdriven at any point, the light hitting the sensor has a 1:1 correspondence with the sensor read-out signal (drawn 0.1 samples apart so that both would show). Signal path degradation changes the signal a little bit before it enters the ISO amplifier - remember that the value from two samples to the left has a slight effect on the current sample - but the error is insignificant. As we defined the amplification factor for ISO 100 to be 1, the ISO amplifier input and output are the same (also drawn 0.1 samples apart so that both would show). Finally, we see that the signal that has gone through the A/D conversion (and been normalized to a scale between 0 and 1) closely resembles the original light hitting the sensor.

All in all, for the input of
(0.050 0.050 0.600 0.950 0.950 0.950 0.950 0.800 0.050 0.050 0.050 0.050)
we got the output of
(0.050 0.050 0.606 0.959 0.954 0.950 0.950 0.799 0.041 0.043 0.050 0.050)

This output would be a very good representation of the input. It would look like this:

You may see a difference between the original and the result signal, superimposed vertically below, but I dare anyone to recognize this slight difference in a real image:

3.1.2 Highlight Within Sensor Range, ISO 400

All right. At ISO 100 everything was okey-dokey. How would our situation change if we upped our sensitivity by two stops to ISO 400, and proportionally reduced our exposure time by two stops to 1/400 s?

Because exposure time has been driven down to 1/4 of what it was before, the sensor will only get 1/4 of the signal. So instead of a top value of 0.95, the sensor now reads a top value of approximately 0.24. The signal stays weak right until it is amplified by the ISO amplifier, after which we have exactly the same image data as before, and the final values are just as they were the last time:
(0.050 0.050 0.606 0.959 0.954 0.950 0.950 0.799 0.041 0.043 0.050 0.050)

We have now proved that our model doesn't present us black pixels if there are no overexposed highlights in the picture, and that this behaviour is not dependent on the camera sensitivity setting. In other words, so far we are consistent with real 5D2 behaviour.
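
For readers who want to replay this test, here is a rough sketch of the whole signal path for one 12-pixel line. It is not the program used to draw the graphs. The function name `simulate`, the main filter tap of 1.01 and the treatment of the first two pixels are assumptions, chosen because they reproduce the numbers above to within rounding.

```python
def simulate(light, iso, sensor_max=1.1, amp_max=1.1, ring=0.01):
    """Run one pixel line through shutter, sensor, degradation, ISO amp and A/D."""
    gain = iso / 100.0
    # Shutter + sensor: exposure time is 1/ISO s, so the sensor sees light/gain,
    # and it cannot output more than sensor_max volts.
    sensor = [min(v / gain, sensor_max) for v in light]
    out = []
    for n, x in enumerate(sensor):
        # Degradation: 1 % of the sample two positions earlier is subtracted.
        # Assumptions: main tap 1 + ring, first value repeats before the line.
        two_earlier = sensor[n - 2] if n >= 2 else sensor[0]
        degraded = (1.0 + ring) * x - ring * two_earlier
        # ISO amplifier with limited output, then A/D normalized to 0..1.
        amplified = min(max(degraded * gain, 0.0), amp_max)
        out.append(min(amplified, 1.0))
    return out


light = [0.05, 0.05, 0.6, 0.95, 0.95, 0.95, 0.95, 0.8, 0.05, 0.05, 0.05, 0.05]
iso100 = simulate(light, 100)
iso400 = simulate(light, 400)
print(" ".join(f"{v:.4f}" for v in iso100))
print(" ".join(f"{v:.4f}" for v in iso400))
```

Both lines print the same values. As long as nothing is clipped, every stage is linear, so dividing the light by 4 at the shutter and multiplying by 4 in the amplifier cancel out exactly.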

3.2 Going for the Kill: Highlight Outside of Sensor Range

We have now seen that our model doesn't create black pixel anomalies if all brightness values are inside the range of the whole signal path. However, what would happen if we had "whiter than white", or, in other words, a light so bright that the system would not have a chance to even closely represent it?

We'll simulate this by taking the previous 12-pixel input, and replacing the almost-white 0.95 pixel values with values that are 2.28 stops brighter. That would be 4.6. This is very realistic and actually a bit conservative when there are small lights or specular highlights in an otherwise darkish picture. This applies to Christmas trees, cityscapes, water shot against the sun and many other cases.
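
As a quick check of the arithmetic: one stop is a factor of two, so "2.28 stops brighter" means multiplying by 2 to the power of 2.28. The variable names below are made up for this sketch. The value 4.595 is the figure used in the data tables that follow.

```python
import math

near_white = 0.95      # brightest value in the first test line
stops_brighter = 2.28  # figure quoted in the text

highlight = near_white * 2 ** stops_brighter
print(f"{highlight:.2f}")                      # 4.61

# Going the other way from the value used in the tables (4.595):
print(f"{math.log2(4.595 / near_white):.2f}")  # 2.27

# At ISO 400 the exposure time is a quarter, so the sensor is offered:
at_iso_400 = 4.595 / 4
print(f"{at_iso_400:.5f}")                     # 1.14875
```

Both figures round to the 4.6 used in the text, and the ISO 400 value is still above the 1.1 V the sensor can deliver.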

So, this time we'll use the following pixels as our source material:

The red pixels are the ones that cannot be represented using 8-bit colour. On the non-linear display scale of 0..255 that every computer except some Macs uses, those pixels would have an RGB value of [506, 506, 506].

3.2.1 Highlight Outside of Sensor Range, ISO 100

All right, let's draw a graph of what would happen if we took the shot of the really bright light at ISO 100.

Because the light is much brighter, the light entering the sensor is literally off the scale. While our scale goes up to 1.2, the light level is at 4.6. However, as our maximum sensor well depth and voltage is only 1.1 V, the values get saturated right at the starting point. As a result, our original data which is
(0.050 0.050 0.600 4.595 4.595 4.595 4.595 0.800 0.050 0.050 0.050 0.050)
becomes
(0.050 0.050 0.606 1.100 1.100 1.100 1.100 0.798 0.040 0.043 0.050 0.050)
after the ISO amplifier, and is then saturated by the A/D converter to be
(0.050 0.050 0.606 1.000 1.000 1.000 1.000 0.798 0.040 0.043 0.050 0.050)

As we can see, though the badly over-white values of 4.6 were saturated to 1.0, otherwise image fidelity is good. Only pixels #9 and #10 seem to be a little off, but even their errors don't have a significant impact on the result:

Again, comparing to the original won't show much of a difference:

So, we didn't get the dreaded black pixels. Yet. Just as they don't like to appear on 5D2 shots done with low ISO. But let's up the sensitivity to values where black pixels appear in real life.

3.2.2 Highlight Outside of Sensor Range, ISO 400

Now we increase sensitivity by two stops and, to compensate, shorten our exposure time accordingly:

Because we are only using one fourth of the exposure time we did the last time, the amount of light entering the sensor in the over-the-top white pixels has decreased from 4.6 to 1.15, and now they fit our scale. However, now we see that the error that made pixel #9 a little bit off the last time now has a much greater effect. Actually, pixel #9 is very close to zero!

How does our numerical analysis look this time? Original sensor values
(0.013 0.013 0.150 1.149 1.149 1.149 1.149 0.200 0.013 0.013 0.013 0.013)
become
(0.050 0.050 0.606 1.100 1.100 1.100 1.100 0.764 0.007 0.043 0.050 0.050)
after the ISO amplifier, and are then saturated by the A/D converter to
(0.050 0.050 0.606 1.000 1.000 1.000 1.000 0.764 0.007 0.043 0.050 0.050)

The numbers confirm that sample #9 is not at all what it should be: it should be a relatively nice darkish grey shade of 0.050, but what we get is 0.007. That is way off! What would it look like?

Well well well, what have we here? Does this look like a black pixel to you? It does to me! Let's compare to the original data:

Yes, there it is. A black pixel that appears at high ISO right next to a highlight that is many stops brighter than what the system can represent. We have a perfect early production 5D Mark II!
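
Why does pixel #9 collapse at ISO 400 but not at ISO 100? The sketch below follows that one pixel through the model. The function name `dark_pixel_after_highlight` and the main filter tap of 1.01 are assumptions of this sketch, the latter chosen because it matches the published values to within rounding.

```python
def dark_pixel_after_highlight(true_value, highlight, iso, sensor_max=1.1, ring=0.01):
    """Final value of a dark pixel that is read two samples after a highlight."""
    gain = iso / 100.0
    dark_at_sensor = true_value / gain
    # The highlight saturates at sensor_max no matter how short the exposure is.
    bright_at_sensor = min(highlight / gain, sensor_max)
    # Degradation: 1 % of the highlight is subtracted from the dark pixel.
    degraded = (1.0 + ring) * dark_at_sensor - ring * bright_at_sensor
    # ISO amplifier (limited output), then A/D normalized to 0..1.
    amplified = min(max(degraded * gain, 0.0), sensor_max)
    return min(amplified, 1.0)


for iso in (100, 200, 400):
    value = dark_pixel_after_highlight(0.05, 4.595, iso)
    print(f"ISO {iso}: {value:.4f}")
```

This gives about 0.040 at ISO 100 and 0.007 at ISO 400, in line with the tables above. The subtracted error is 1 % of the clipped 1.1 V in both cases, but at ISO 400 the dark pixel itself is only a quarter as strong when the error is subtracted, and the amplifier then multiplies the damage by four.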


4. Testing the Models, Part 2

As soon as I get my simulation program ready the model will be tested with real, two-dimensional low-resolution pictures. Maybe in a day or two, or a week, or never. Who knows?

5. Corrective Measures

Now that we have established a model that seems to be working for the set requirements (very bright details, high ISO), we can start thinking about how to minimize or remove the effect altogether. I can think of at least these alternatives:
  1. Slower read-out. By slowing pixel read-out, the settling time will be over before the next pixel is read. Too bad this may not be possible if the read-out speed is not adjustable, and in any case it would make continuous shooting slower.
  2. Better signal path. If the signal path between the sensor and ISO amplifier is not inside an ASIC chip, cleaning the signal path could help in increasing the bandwidth and removing the problem. This could be as easy as changing the value or adding one resistor, capacitor or inductor. But it can only be implemented if the signal path is not inside a chip.
  3. Limiting the signal at the sensor read-out stage. If the signal is limited right at the sensor read-out stage according to maximum ISO, the later signal path degradation doesn't create such an undershoot. This can only be implemented if the signal path is not inside an ASIC, and is also otherwise problematic: the limiting should be according to ISO in such a way that ISO 100 would limit the post-sensor read-out signal to 1.1 V (using my arbitrary scale), ISO 200 would limit it to 0.55 V, etc.
  4. If the problematic signal path is inside an ASIC and its speed cannot be adjusted, the only alternative is doing a redesign and creating a new full mask version of the offending chip. This is both expensive and time consuming. The processing itself takes anything from 5 to 12 weeks and that doesn't include time needed for the redesign. This is very much an undesirable option.
  5. Software recognition of potential error pixels and interpolating around them. This is a workaround, not a fix. And I very much suspect that it isn't even possible to tweak the in-camera JPEG engine to mask off pixels in such a way without significantly slowing down the camera, if at all. For RAW images, of course, everything is possible, but the workaround is still not very attractive.
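
Alternative 3 can be tried out in the model. The sketch below adds a hypothetical `limit_at_readout` switch that clips the sensor signal to 1.1 V divided by the ISO gain before the degradation happens. As in the rest of this model, the voltages are arbitrary, and the function name, the main filter tap of 1.01 and the edge handling are assumptions of this sketch.

```python
def simulate(light, iso, limit_at_readout=False, sensor_max=1.1, ring=0.01):
    """One pixel line through the model, optionally with an ISO-dependent limiter."""
    gain = iso / 100.0
    # Alternative 3: limit to 1.1 V at ISO 100, 0.55 V at ISO 200, and so on.
    ceiling = sensor_max / gain if limit_at_readout else sensor_max
    sensor = [min(v / gain, ceiling) for v in light]
    out = []
    for n, x in enumerate(sensor):
        # Assumptions: main tap 1 + ring, first value repeats before the line.
        two_earlier = sensor[n - 2] if n >= 2 else sensor[0]
        degraded = (1.0 + ring) * x - ring * two_earlier
        amplified = min(max(degraded * gain, 0.0), sensor_max)
        out.append(min(amplified, 1.0))
    return out


light = [0.05, 0.05, 0.6, 4.595, 4.595, 4.595, 4.595, 0.8, 0.05, 0.05, 0.05, 0.05]
plain = simulate(light, 400)
limited = simulate(light, 400, limit_at_readout=True)
print(f"pixel #9 without limiter: {plain[8]:.4f}")
print(f"pixel #9 with limiter:    {limited[8]:.4f}")
```

With the limiter in place, the ISO 400 result for pixel #9 goes from roughly 0.007 back to roughly 0.040, which is the same as the ISO 100 result. In this model the limiter makes the undershoot independent of ISO, but it does not remove it.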

6. Conclusions

In this document I have shown a simple DSLR signal path model that closely mimics the black pixel phenomenon of the early production Canon EOS 5D Mark II DSLR cameras. While there are no guarantees that this model is right, it is technically plausible and just might be essentially correct. I have also presented corrective measures that may or may not be applicable to the problem.

I hope Canon can solve this problem as soon as possible, because I wish for my family to be able to buy a perfect 5D2 as soon as possible.


PS.
If anyone is willing to let me use two similar shots demonstrating the near absence / appearing of the black dot problem at ISO 100 / 400, I'd be happy to post small crops here. The shots should be of the same subject at the same brightness, the only differences being ISO and shutter speed. Thank you.

©2008-2009 Henrik Herranen