Difference between revisions of "HFI data compression"

From Planck PLA 2015 Wiki
Latest revision as of 09:45, 10 December 2014

Data compression

Data compression scheme

The output of the readout electronics unit (REU) consists of one value for each of the 72 science channels (bolometers and thermometers) for each modulation half-period. This number, [math]S_{\rm REU}[/math], is the sum of the 40 16-bit ADC signal values measured within the given half-period. The data processor unit (DPU) performs a lossy quantization of [math]S_{\rm REU}[/math].

We define a compression slice of 254 [math]S_{\rm REU}[/math] values, corresponding to about 1.4 s of observation for each detector and to a strip on the sky about 8° long. The mean [math]\langle S_{\rm REU} \rangle[/math] of the data within each compression slice is computed, and data are demodulated using this mean:

[math]S_{{\rm demod},i} = (S_{{\rm REU},i} - \langle S_{\rm REU} \rangle) \ast(-1)^{i}[/math],

where [math]1 \le i \le 254[/math] is the running index within the compression slice.

The mean [math]\langle S_{\rm demod} \rangle[/math] of the demodulated data [math]S_{{\rm demod},i}[/math] is computed and subtracted, and the resulting slice data are quantized according to a step size [math]Q[/math] that is fixed per detector:

[math] S_{{\rm DPU},i} = \mbox{round} \left[( S_{{\rm demod},i} - \langle S_{\rm demod} \rangle) /Q \right ] [/math].

This is the lossy part of the algorithm: the quantization step [math]Q[/math] is tuned to reach the required compression factor, and the quantization adds noise whose variance is approximately 2% of the white-noise variance of the data. This is discussed below.

The two means [math]\langle S_{\rm REU} \rangle[/math] and [math]\langle S_{\rm demod} \rangle[/math] are computed as 32-bit words and sent through the telemetry, together with the [math]S_{{\rm DPU},i}[/math] values. Variable-length encoding of the [math]S_{{\rm DPU},i}[/math] values is performed on board, and the inverse decoding is applied on the ground.
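The demodulation and quantization steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the on-board DPU code: the function names `compress_slice` and `decompress_slice` are hypothetical, the means are kept as floating-point numbers rather than 32-bit words, the variable-length encoding is omitted, and Python's `round` (which rounds ties to even) stands in for the on-board rounding rule, which the text does not specify.

```python
import random

SLICE_LEN = 254  # samples per compression slice, as stated in the text


def compress_slice(s_reu, q):
    """Return (mean_reu, mean_demod, codes) for one compression slice.

    Hypothetical helper: floats are used throughout for simplicity.
    """
    n = len(s_reu)
    mean_reu = sum(s_reu) / n
    # Demodulate: remove the slice mean and flip the sign of every other
    # sample. The running index i starts at 1.
    demod = [(s - mean_reu) * (-1) ** i for i, s in enumerate(s_reu, start=1)]
    mean_demod = sum(demod) / n
    # Lossy step: express each sample as an integer number of steps q.
    codes = [round((d - mean_demod) / q) for d in demod]
    return mean_reu, mean_demod, codes


def decompress_slice(mean_reu, mean_demod, codes, q):
    """Invert compress_slice, up to the quantization error."""
    return [
        (c * q + mean_demod) * (-1) ** i + mean_reu
        for i, c in enumerate(codes, start=1)
    ]


# Toy slice (assumed values): an offset, a modulated component, unit noise.
rng = random.Random(0)
q = 0.5
signal = [
    1000.0 + 50.0 * (-1) ** i + rng.gauss(0.0, 1.0)
    for i in range(1, SLICE_LEN + 1)
]
mean_reu, mean_demod, codes = compress_slice(signal, q)
restored = decompress_slice(mean_reu, mean_demod, codes, q)
max_err = max(abs(a - b) for a, b in zip(signal, restored))
```

In this sketch the reconstruction error of each sample is bounded by half a quantization step, which is the only loss introduced by the scheme.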

Performance of the data compression during the mission

Optimal use of the bandpass available for the downlink was obtained initially by using a value of [math]Q = \sigma/2.5[/math] for all bolometer signals. After 12 December 2009, and only for the 857 GHz detectors, the value was reset to [math]Q = \sigma/2.0[/math] to avoid data loss due to exceeding the downlink rate limit. With these settings the load during the mission never exceeded the allowed bandpass width, as shown in the following figure.

Evolution of the total load during the mission for the 72 HFI channels. The variations are mainly due to the time spent by the high-frequency channels in the Galactic region, which has very large data gradients, and depends on the satellite scanning strategy. The bandpass-width limit was [math]80\,{\rm kb\,s^{-1}}[/math] and was never reached during the mission.


Setting the quantization step in flight

The only parameter that enters the Planck-HFI compression algorithm is the size of the quantization step, in units of [math]\sigma[/math], the white-noise standard deviation for each channel. This quantity was adjusted during the mission by studying [math]p_0[/math], the mean frequency of the central quantization bin [math][-Q/2,Q/2][/math], within each compression slice (254 samples). For pure Gaussian noise, this frequency is related to the step size (in units of [math]\sigma[/math]) by

[math] \hat Q =2\sqrt{2}\, \text{Erf}^{-1}(p_0)\approx 2.5 p_0, [/math]

where the approximation is valid for [math]p_0<0.4[/math]. In Planck, however, the channel signal is not purely Gaussian, since glitches and the periodic crossing of the Galactic plane add some strong outliers to the distribution. Using the frequency of these outliers above [math]5\sigma[/math], [math]p_\text{out}[/math], simulations show that the following formula gives a valid estimate:

[math] \hat Q_\text{cor}=2.5 \frac{p_0}{1-p_\text{out}}. [/math]
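As a numerical illustration of these two estimators, the following Python sketch evaluates the exact relation and its linear approximation for a step of 0.5 (in units of the noise standard deviation). The function names are hypothetical, and the inverse error function is written through the standard normal quantile, using the identity that the inverse error function of p equals the normal quantile at (1 + p)/2 divided by the square root of 2.

```python
import math
from statistics import NormalDist


def central_bin_fraction(q):
    """Probability that unit-variance Gaussian noise falls in [-q/2, q/2]."""
    return math.erf(q / (2.0 * math.sqrt(2.0)))


def step_from_p0(p0):
    """Exact inversion 2*sqrt(2)*erfinv(p0), via the normal quantile."""
    # 2*sqrt(2) * erfinv(p0) simplifies to 2 * inv_cdf((1 + p0) / 2).
    return 2.0 * NormalDist().inv_cdf((1.0 + p0) / 2.0)


def step_from_p0_linear(p0, p_out=0.0):
    """Linearised estimate 2.5 * p0 / (1 - p_out) quoted in the text."""
    return 2.5 * p0 / (1.0 - p_out)


p0 = central_bin_fraction(0.5)      # fraction of samples in the central bin
q_exact = step_from_p0(p0)          # recovers the input step
q_linear = step_from_p0_linear(p0)  # linear approximation, slightly lower
```

For this value of the step the linear approximation differs from the exact inversion by roughly one percent, consistent with the stated validity range.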

The following figure shows an example of the [math]\hat Q[/math] and [math]\hat Q_\text{cor}[/math] timelines that were used to monitor and adjust the quantization setting.

Example of the "bin0" frequency timeline (1 point per compression slice) [math]p_0[/math] (labeled "ZERO") for one of the channels. The -1 ("NEG1") and +1 ("POS1") bin frequencies are also shown. The lower plot shows the frequency timeline of outliers, where the glitches and periodic crossing of the Galaxy are visible. The estimated raw step size [math]\hat Q=0.47[/math] and corrected value [math]\hat Q_\text{cor}=0.49[/math] are automatically computed and the step size can be adjusted according to these control plots.

Impact of the data compression on science

The effect of a pure quantization process of step [math]Q[/math] (in units of [math]\sigma[/math]) on the statistical moments of a signal is well known [1]. When the step is typically below the noise level (which is largely the case for Planck), one can apply the Quantization Theorem, which states that the process is equivalent to the addition of uniform random noise in the [math][-Q/2,Q/2][/math] range. The net effect of quantization is therefore to add quadratically to the signal a variance of [math]Q^2/12[/math]. For [math]Q\approx 0.5[/math] this corresponds to a 2% increase in the noise variance.

The spectral effect of the non-linear quantization process is theoretically much more complicated and depends on the signal and noise details. As a rule of thumb, a pure quantization adds some auto-correlation function that is suppressed by a factor [math]\exp[-4\pi^2(\sigma/Q)^2][/math] [2].

Note, however, that Planck does not perform a pure quantization process. A baseline that depends on the data (specifically the mean of each compression slice) is subtracted. Furthermore, for the science data, circles on the sky are coadded. Coaddition is again performed when projecting the rings onto the sky (mapmaking).

To study the full effect of the Planck-HFI data compression algorithm on our main science products, we simulated a realistic data timeline, corresponding to the observation of a pure CMB sky. The compressed/decompressed signal was then back-projected onto the sky using the Planck scanning strategy. The two maps (with and without the compression step) were analysed using the anafast HEALPix procedure and both reconstructed [math]C_\ell[/math] were compared. The result is shown for a quantization step [math]Q=0.5[/math].

Effect of the Planck compression algorithm on the reconstructed power spectrum ([math]C_\ell[/math]) after data projection and mapmaking, according to the simulation. The upper plot shows the input reconstructed CMB power spectrum (black), the CMB+noise spectrum for this channel (blue, barely visible) and the reconstructed [math]C_\ell[/math] when including the data compression in the chain (red). The lower plot shows the ratio of these last two spectra and confirms the white nature of the final added noise at a level that can still be computed by the Quantization Theorem (2% for the value [math]Q=0.5[/math] used here).


It is remarkable that the full procedure of baseline subtraction + quantization + ring-making + mapmaking still leads to the 2% increase of the variance that is predicted by the simple timeline quantization (for [math]\sigma/Q=2[/math], i.e., [math]Q=0.5[/math] in units of [math]\sigma[/math]). Furthermore, we checked that the noise added by the compression algorithm is white.
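The timeline-level prediction can be checked with a short Monte Carlo sketch in Python. It covers only the pure quantization of unit-variance Gaussian noise, not the baseline subtraction, ring-making or mapmaking of the full simulation; the function name, sample count and seed are arbitrary choices made for this illustration.

```python
import math
import random


def quantization_error_variance(q, n_samples=200_000, seed=1):
    """Mean squared quantization error for unit-variance Gaussian noise."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_samples):
        x = rng.gauss(0.0, 1.0)
        err = round(x / q) * q - x  # quantized value minus original value
        total += err * err
    return total / n_samples


q = 0.5                                  # step in units of sigma
measured = quantization_error_variance(q)
predicted = q * q / 12.0                 # variance of uniform noise in [-q/2, q/2]
# Relative increase of the noise rms if the added variance is uncorrelated
# with the signal (the assumption behind the quadratic addition).
rms_increase = math.sqrt(1.0 + predicted) - 1.0
# Rule-of-thumb suppression factor of the added auto-correlation.
suppression = math.exp(-4.0 * math.pi ** 2 * (1.0 / q) ** 2)
```

With a step of 0.5 the predicted added variance is 0.25/12, about 2% of the noise variance, which corresponds to an rms increase of about 1%. The suppression factor is negligibly small for this step, in line with the statement that the added noise is white.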

It is not expected that the compression introduces any non-Gaussianity, since the pure quantization process adds no skewness and less than 0.001 of kurtosis, and the coaddition of circles and then rings erases any non-Gaussian contribution, according to the central limit theorem.

References

  1. A Study of Rough Amplitude Quantization by Means of Nyquist Sampling Theory, B. Widrow, IRE Transactions on Circuit Theory, CT-3(4), 266-276, (1956).
  2. On the autocorrelation function of quantized signal plus noise, E. Banta, IEEE Transactions on Information Theory, 11, 114-117, (1965).

REU: Readout Electronic Unit

ADC: analog to digital converter

DPU: Data Processing Unit

HFI: (Planck) High Frequency Instrument

CMB: Cosmic Microwave background

HEALPix: Hierarchical Equal Area isoLatitude Pixelation of a sphere. See HEALPix: A Framework for High-Resolution Discretization and Fast Analysis of Data Distributed on the Sphere, K. M. Górski, E. Hivon, A. J. Banday, B. D. Wandelt, F. K. Hansen, M. Reinecke, M. Bartelmann, ApJ, 622, 759-771, (2005).