Check for the CASA Incremental Storage Manager Flagging Bug

From CASA Guides
Revision as of 22:21, 4 December 2014 by Jott (talk | contribs) (Created page with " An issue in CASA 4.2.1 and earlier has been discovered with the CASA Table System Incremental Store Manager (ISM) such that when flagdata is run on data with large numbers o...")


An issue in CASA 4.2.1 and earlier has been discovered in the CASA Table System Incremental Storage Manager (ISM): when flagdata is run on data with a large number of antennas, a discrepancy can be created between the FLAG_ROW column and the FLAG column. The result is that much more data is flagged than intended. Unfortunately, the nature of the discrepancy means that the affected data will appear unflagged in, for example, plotms, but will be interpreted as flagged by gaincal and similar tasks. (Once applycal is run using an affected gain table, the data will also appear flagged in plotms.)

This situation is most likely to happen under the following combination:

(1) The number of antennas is large

(2) The number of integrations per scan is large

(3) You are attempting to flag a whole antenna(s), i.e. inserting manual flags into what is typically step 10: 'Initial flagging'

A rule of thumb is that the bug will be triggered when Ntimestep*NBaselines > 10000

where Ntimestep = ScanDuration / Exposure (i.e. number of rows per scan/antenna/spw).

For example, in one test dataset with 40 antennas, when one bad antenna was flagged for all sources, gaincal interpreted the flag state such that all data for the flux calibrator were deemed flagged. Also note that condition (2) will typically make ALMA TDM more susceptible than FDM due to the shorter integration time.
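
The rule of thumb above can be evaluated with a few lines of Python. This is only a rough sketch: the text does not define NBaselines, so the function assumes it counts cross-correlation antenna pairs, N(N-1)/2, and the scan duration and exposure used in the call are made-up values chosen for illustration. The function name and parameters are hypothetical and not part of CASA.

```python
def ism_risk_estimate(n_antennas, scan_duration_s, exposure_s, threshold=10000):
    """Evaluate the rule of thumb Ntimestep * NBaselines > threshold."""
    # Ntimestep = ScanDuration / Exposure, as defined in the text.
    n_timestep = scan_duration_s / exposure_s
    # Assumption: NBaselines counts cross-correlation antenna pairs only.
    n_baselines = n_antennas * (n_antennas - 1) // 2
    product = n_timestep * n_baselines
    return product, product > threshold


# Hypothetical scan: 40 antennas, 300 s scan, 6 s integrations.
product, at_risk = ism_risk_estimate(n_antennas=40, scan_duration_s=300.0, exposure_s=6.0)
print(product, at_risk)
```

A result above the threshold only indicates that the dataset falls in the regime where the bug is likely; it does not show that the ISM is actually corrupted.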

This problem has existed throughout the history of CASA; it lies very deep in the fundamental code base. It has only now been found because (1) is now quite large for ALMA, and because of the specific way ALMA data are stored: auto-correlations and cross-correlations for the same antenna are sometimes separated by many rows in the MS. It may also appear in Jansky VLA data.

The bug was fixed for the current CASA 4.2.2 patch, which is available from the CASA download page (https://casa.nrao.edu/casa_obtaining.shtml). If you have further questions, please consult the NRAO/CASA helpdesk (http://help.nrao.edu) or the ALMA helpdesk (http://help.almascience.org).

You may also use our CASA forum, part of the NRAO Science forums (https://science.nrao.edu/forums/), to discuss the issue with other users.

Regards,

the CASA team



We have now developed a tool to find out whether your data are affected by this Incremental Storage Manager (ISM) bug. As noted above, the bug is only present in CASA 4.2.1 and earlier. The testing tool, however, is included in CASA 4.3 and later.


The tool is available as the table tool method "testincrstman", which takes a single string argument naming the column to be tested:

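mytb = tbtool()                      # create a table tool instance
mytb.open('test.ms')                 # open the measurement set
ok = mytb.testincrstman('FLAG_ROW')  # True if the check passes, False otherwise
mytb.close()                         # close the table again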

If the check passes, the method returns True; otherwise it returns False and posts a log message specifying the affected sector (bucket, row).

The check tests whether the rowId numbers associated with the ISM buckets are monotonically ascending. When they are not, the ISM is guaranteed to be corrupted.
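
The idea behind this check can be illustrated in plain Python. This is not the CASA implementation: the function name is hypothetical, the list of per-bucket start rows is invented input, and treating two equal row ids as a violation (strictly ascending order) is an assumption.

```python
def first_non_ascending(bucket_start_rows):
    """Return the index of the first bucket whose start row is not greater
    than the previous bucket's start row, or None if the order is ascending."""
    for i in range(1, len(bucket_start_rows)):
        # Assumption: equal row ids also count as a break in the ordering.
        if bucket_start_rows[i] <= bucket_start_rows[i - 1]:
            return i
    return None


# Invented start rows: the third bucket (index 2) breaks the ascending order.
print(first_non_ascending([0, 120, 90, 400]))
```

An ordering test of this kind can confirm that something is wrong when it finds a break, but an ascending sequence says nothing about whether the stored values themselves are right.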

The check can also return True when the ISM is not (yet) corrupted but the column values are nevertheless not aligned as the user expects. It is not possible to detect this situation.

In conclusion:

- When this method returns False, the ISM is certainly corrupted and the MS has to be re-processed.

- When it returns True, the ISM is not corrupted, but:
  -> The values held may not be correct
  -> The ISM can still potentially become corrupted under the condition Ntimestep*NBaselines > 10000
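
The conclusions above could be wrapped in a small helper, sketched here under some assumptions. The helper names check_ism_columns and summarize are hypothetical; the only table tool calls used are open, testincrstman and close, as shown earlier on this page. Only FLAG_ROW is checked by default, because which other columns are stored with the ISM depends on the MS.

```python
def check_ism_columns(table, ms_path, columns=("FLAG_ROW",)):
    """Run testincrstman on each column and return {column: True/False}."""
    results = {}
    table.open(ms_path)
    try:
        for col in columns:
            results[col] = bool(table.testincrstman(col))
    finally:
        # Always close the table, even if a check raises an error.
        table.close()
    return results


def summarize(results):
    """Turn the per-column results into a one-line verdict."""
    bad = sorted(col for col, ok in results.items() if not ok)
    if bad:
        return "ISM corrupted in: " + ", ".join(bad) + "; re-process the MS"
    return "no ISM corruption detected; values may still be misaligned"
```

Inside CASA 4.3 or later this might be called as summarize(check_ism_columns(tbtool(), 'test.ms')). The wording of the True case is deliberately cautious, since a passing check does not guarantee that the stored values are correct.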