Topic: Products and Algorithms / Satellite Data Products & Algorithms / HDF VData Export (locked)
By DonCatanzaro Date 2006-03-18 05:36
Hi All,

Once again I am coming to the forum.  I can't tell you how much I appreciate all your help. I have made progress but am stumped again; it seems to come in fits and starts.  For those who have not followed this in excruciating detail, my goal is to get chlorophyll a, nLw, and K490 data out of the binned cumulative product for analysis in my GIS, all on a Windows platform.  I am a biologist/GIS guy, and while I've been doing this for a while, this project has really been racking my brain because all these formats and conversions are foreign to me. I am more comfortable with digital air photos and Landsat TM!

While HDFView 2.3 worked fine for displaying maps of the SMI product (which, I found out, lost the shallow zones during conversion to 8-bit data), it does not display the data for the binned product, and I figure that has something to do with the subordinate files.  I should caveat this: at least, I can't make it work.  I know HDFView is reading something properly because I can pull up the properties of each subordinate data set (e.g., chloro) and see metadata such as path (/level-3 binned data/), tag/ref (1962, 12), name (chlor_a sum), type (32-bit floating point), and array size (1).

I've been following the steps that Norman posted for me on one of my previous queries http://oceancolor.gsfc.nasa.gov/forum/oceancolor/topic_show.pl?tid=1018

Here are the steps so far:
1)  Download four files A20020012006031.L3b_CU.main, A20020012006031.L3b_CU.x04, A20020012006031.L3b_CU.x07 and A20020012006031.L3b_CU.x08.

2)  Decompressed them with bunzip2.

3)  Installed HDP (as far as I can tell, it works great on windows platform).

4)  Ran hdp dumpsds -h -o sds.txt A20020012006031.L3b_CU.main so I could look at the header.  Everything looks good, but what would I know!

5) Ran hdp dumpvd -h -o vd.txt A20020012006031.L3b_CU.main so I could look at the header

6)  Ran hdp dumpvd -n chlor_a -d -o chlor_a.bin -b A20020012006031.L3b_CU.main (I also made an ASCII file so I could look at it).
Interestingly, the ASCII file is two columns wide, not an 8640 x 4320 matrix of values.  Any reason for that?

7)  Installed netpbm (version 10.27) to get the pgm conversion tools

8)  Ran rawtopgm 8640 4320 chlor_a.bin > chloro.pgm

9)  Tried to open the file in ERMapper as a binary in BIL (Band Interleaved by Line ), BIP (Band Interleaved by Pixel), and BSQ (Band Sequential format) with no luck.  Did not expect the BIL or BIP to work because I did not think pgm is structured that way but I was not sure so I gave it a shot.  I used the values of 8640 cells 4320 lines and 1 band and a 32-bit unsigned integer (ERMapper can't handle 32-bit floating). 

So, I then tried to open the PGM file in a generic viewer (Advanced Batch Converter 3.9) and the image was basically gray noise.

I know the forum is not an expert on the PGM file format, but does anyone have any idea where I went wrong?  I thought I did everything correctly, but it sure does not look like it. I was expecting a picture that resembles the earth.

By the way, if I try to convert the PGM file to a TIFF with pnmtotiff, it does not work; I get an error that says "_TIFFVSetField:  Standard Output:  Bad value 0 for "RowsPerStrip"".  That made me wonder if I had my columns and rows transposed, but when I ran rawtopgm with 4320 x 8640 the resulting file was taller than it is wide, so that can't be right.

Also, by the way, I'd like to import the file into ERMapper before converting it to a TIFF because a TIFF will lose the 32-bit information in the HDF chloro file.

So I am stuck again with a bunch of questions on how to do what I want to do.  What's worse is when I look at the headers and the PDF describing the binned product it says that in the subordinate files the data are

_sum (4-byte real): weighted sum of binned pixel values for corresponding geophysical parameter.
_sum_sq (4-byte real): weighted sum of squares of binned pixel values for corresponding geophysical parameter.

which means I then have to pull out the weights to compute the average (and I have no idea yet how to get the weights).

Thanks again for all your help ! 

-Don

PS If I have to do this again, I'll have to find a Linux box, that is for sure.
By Anonymous Date 2006-03-18 21:02
Don,

The bin files are not maps; they are actually just lists of filled bins, with associated bin numbers provided in the .main file.  Specialized software is required to convert from bin number to lon/lat, to enable mapping of the data.  No generic utilities will be able to do this for you.  SeaDAS is required (unless you want to write code yourself). Here's a post on how to run SeaDAS under Windows:

  http://oceancolor.gsfc.nasa.gov/forum/oceancolor/topic_show.pl?pid=3314;hlm=adv;hl=knoppix#pid3314

However, I question your original reason for going from mapped to binned data.  I doubt we "lost the shallow zones" when converting to 8-bit.  It is more likely that we never had them, because we could not get a retrieval in the coastal zones that you wish to study.  Have you examined the Level-2 images from our browser, to verify that we do actually have retrievals in your area of interest? It's also possible that we have retrievals, but the data was flagged for some quality issues at Level-2 (e.g., stray light) and subsequently masked at Level-3. In either case, the associated bins will not exist in the binned product, which is why they don't exist in the SMI product.  What is really lost between the bin file and the associated SMI is digital resolution (float to 8-bit), spatial consistency (equal area to equiangular), and statistical knowledge (number of observations, standard deviation info).  The dynamic range of the data is effectively maintained between the binned and mapped products, though the chlorophylls above 64 mg/m^3 will be pegged at 64. No additional masking occurs between the bin and SMI products.
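
To make the "digital resolution" and "pegged at 64" points concrete, here is a minimal Python sketch of 8-bit logarithmic scaling. The slope (0.015), intercept (-2.0), and top data code (254) are assumptions chosen for illustration because they give an upper limit near 64 mg/m^3; the real scaling values should be read from the metadata of the SMI file itself. The function names are hypothetical and are not part of any NASA software.

```python
import math

# Assumed scaling constants, for illustration only; read the real ones
# from the SMI file's metadata.
SLOPE = 0.015
INTERCEPT = -2.0
MAX_CODE = 254  # assumed highest data code (255 assumed reserved for "no data")


def chl_to_byte(chl):
    """Quantize a chlorophyll value (mg/m^3) to an 8-bit code, clamping at both ends."""
    code = round((math.log10(chl) - INTERCEPT) / SLOPE)
    return max(0, min(MAX_CODE, code))


def byte_to_chl(code):
    """Recover the chlorophyll value represented by an 8-bit code."""
    return 10.0 ** (SLOPE * code + INTERCEPT)


# Values inside the range come back slightly altered (resolution loss);
# values above the range all collapse onto the top code (saturation).
for chl in (0.05, 1.0, 30.0, 500.0):
    code = chl_to_byte(chl)
    print(f"{chl:8.2f} -> code {code:3d} -> {byte_to_chl(code):8.3f}")
```

Under these assumed constants, a pixel whose true value is far above the range is still present in the mapped image; it simply carries the top code, which is the behavior described above.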

Keep in mind that our processing is tuned to produce the best quality global products.  This sometimes requires processing decisions that improve the global averages at the expense of the coastal regions.  For example, we utilize bio-optical models within the atmospheric correction and geometric normalization steps which are not valid in turbid (case-2) waters.  We are also very conservative with cloud masking, to the point that we may mask high aerosol concentrations as clouds (concentration levels which may be common to coastal regions).  When people wish to utilize SeaWiFS or MODIS for coastal ocean studies, they generally order the Level-1A data for their area of interest and process through Level-2 and beyond with SeaDAS, while adjusting the processing parameters as they deem most appropriate for their region and requirements. This is the primary reason we distribute all of our processing codes to the research community through SeaDAS.

I realize this is a lot more work, but I suspect what you are trying to do with the bin files may prove to be a waste of your time.

-- Bryan
  
By DonCatanzaro Date 2006-03-18 23:14 Edited 2006-03-18 23:21
Hi Bryan,

Thanks for your post and reply, but now I am really puzzled.  The only reason I went to the binned product is Sean's post in reply to my previous question (http://oceancolor.gsfc.nasa.gov/forum/oceancolor/topic_show.pl?tid=1137).  The main problem I encountered is that portions of South Florida, the Great Lakes, and the Caspian Sea are water but don't have data in the SMI product (especially nlw551).  Sean recommended that I get the binned data because the scaling of the 8-bit data may have affected my areas of interest.

As far as your other questions, I have not looked at the L-2 images from the browser yet, great idea ! The place within my area of interest that seems the worst is Lake Erie ...

Also, you stated "It's also possible that we have retrievals, but the data was flagged for some quality issues at Level-2 (e.g., stray light) and subsequently masked at Level-3."   I would imagine that as you temporally bin the data into a 4 year cumulative product, you'd have to have a couple of valid data points for those shallow zones.  Thus some data would appear for the entire Lake Erie.  Is that a correct assumption?

Finally you state "Specialized software is required to convert from bin number to lon/lat, to enable mapping of the data."  By specialized software do you mean that currently SeaDAS is the only software that can read HDFs with subordinate data or do you mean any HDF software (which someone like me might consider to be specialized) ?  If the former, you might want to put that in a FAQ and also a couple of places on the website like http://oceancolor.gsfc.nasa.gov/DOCS/ocformats.html and inside the PDFs on that page which explain the various products.  The PDFs have generic language such as for the Level-3 binned product PDF (Ocean_Level-3_Binned_Data_Products.pdf) where it states

"These specifications are given in terms of the logical implementation of the products in HDF and are not a physical description of file contents. Therefore, HDF software must be used to create or read these products."

I read that statement and it means to me that any HDF software should be able to read the data, not that SeaDAS is the only software currently capable of doing this, but I am new to this data.

Maybe I should back up a bit and explain my project before we go too much further.  We are trying to run GARP models (Genetic Algorithm for Rule-Set Production - see www.lifemapper.org) on invasive species originating from the Ponto-Caspian and Baltic Seas and coming to the USA.  In order to accomplish this, we need data in the source and target areas.  I am trying to use chlorophyll a as a surrogate for productivity and nLw and K490 as surrogates for water clarity or turbidity.  For my purposes, an annual or cumulative average is fine, thus the focus on a cumulative and/or binned product.

If I understand correctly, the Level-3 browse products are made from the SMI product.   So I am assuming the L-3 browse products will have the same problems as the SMI for those shallow areas.  This then leaves me with L-3 binned products or L-2 products.  Given that the binned products are temporally binned across the year, it seemed like a good idea to get those products.

Whew, with all of that, I am still pretty confused on which data product will be appropriate for me to use.

-Don
By Anonymous Date 2006-03-19 00:24
Don,

Sean is correct that you can get dynamic range from the bin files that you may lose in the SMI files, because there is an upper limit on the scaling range (e.g., 64 for chlorophyll).  I can see that this would be a potential problem if you are interested in nLw_551 in highly reflective waters, because the upper-limit on the scaling range may not be sufficient.  But it's not that the SMI doesn't have data for those pixels, it's just that the value is pegged at the upper limit of the scaling range.  All mapped products from the same period are identically masked. 

Yes, as we accumulate data over months and years, the possibility of obtaining one good observation increases.

Re specialized software:  I mean generic HDF viewers may be able to give you some information regarding the bin file meta-data, they may even be able to read the content, but they won't have any idea how to map the data into a global image.  You can read the bin files with standard IDL functions, for example, but you'd have to write or obtain code to convert the bin numbers to earth locations.  If you use IDL, there is software on our website which can do it:

   http://oceancolor.gsfc.nasa.gov/cgi/idllibrary.cgi?dir=hdf
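
For readers without IDL, the bin-number-to-location step can be sketched in a few lines of Python. This is a minimal sketch, assuming the usual layout of the equal-area sinusoidal bin grid: rows of equal latitude height counted from the south pole, bins numbered from 1 starting at 180 degrees west in each row, and a row count of 2160 for SeaWiFS files or 4320 for the MODIS files discussed here. The function names are hypothetical; the result should be checked against SeaDAS or the IDL library before being relied on.

```python
import bisect
import math


def build_grid(numrows):
    """Precompute, for each latitude row, its center latitude, its number
    of bins, and the bin number of its first (westernmost) bin."""
    latbin, numbin, basebin = [], [], []
    next_base = 1  # bin numbers are assumed to start at 1
    for row in range(numrows):
        lat = (row + 0.5) * 180.0 / numrows - 90.0
        count = int(2 * numrows * math.cos(math.radians(lat)) + 0.5)
        latbin.append(lat)
        numbin.append(count)
        basebin.append(next_base)
        next_base += count
    return latbin, numbin, basebin


def bin_to_latlon(bin_number, grid):
    """Return the (latitude, longitude) of a bin's center in decimal degrees."""
    latbin, numbin, basebin = grid
    row = bisect.bisect_right(basebin, bin_number) - 1
    if row < 0 or bin_number - basebin[row] >= numbin[row]:
        raise ValueError(f"bin number {bin_number} is outside the grid")
    col = bin_number - basebin[row]
    lon = 360.0 * (col + 0.5) / numbin[row] - 180.0
    return latbin[row], lon


# Example: a SeaWiFS-style grid with 2160 rows.
grid = build_grid(2160)
lat, lon = bin_to_latlon(4449078, grid)
print(f"{lat:.5f} {lon:.5f}")
```

Because the bin file stores only the filled bins, mapping the data means running this conversion for each stored bin number and then placing each value into whatever output grid the GIS expects.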

I'm not really familiar with the various utilities that claim to read HDF4.  I think most of them have limited capabilities, mostly to read raster images from HDF files. If they can read a general SDS or VDATA, you'd still have to be able to tell them what to do with it.  You need something like IDL or MATLAB for that (or C or Fortran linked with the HDF library).

Based on what you explained, I would suggest that you find a way to install SeaDAS, either on a Linux box or using Knoppix on a Windows box.  Then obtain the bin files at  the desired accumulation period (e.g., yearly).  With SeaDAS, you can run the software that we run to convert the bin file to SMI, but you will be able to specify the scaling range so that nLw_551 is not saturated in Lake Erie.  You will even be able to specify 16-bit format to get better digital resolution.  And, you will have the ability to map any of the products in the bin file, so you can also look at nLw_667, for example.

-- Bryan

 
By @norman Date 2006-03-20 14:35 Edited 2012-01-17 17:09
Just to add another option to the mix (or another
monkey wrench depending upon your point of view),
what follows is a message I sent out a few years ago
describing a program I have written to read SeaWiFS
bin files.  To make this work for our MODIS L3BIN files,
one just needs to change the following line in the source
code from:

#define NUMROWS               2160

to:

#define NUMROWS               4320

and compile and link the program.  Feel free to use or
ignore.  (SeaDAS is still the recommended software if
you can get it installed.)

Regards,

Norman


Subject: program to read SeaWiFS level-3 bin data
Date: Thu, 13 Feb 2003 19:44:05 -0500

To whoever may be interested,

I have written a program that reads SeaWiFS
level-3 bin files (HDF format) and writes
out the data as an ASCII table.  I have done
this because I have received requests for
such a beast from various different quarters,
and because such a program (one that did not
require IDL or a SeaDAS GUI or something similar)
did not appear to exist.

I also discovered (whether because of the lack
of a program or because of a lack of interest)
that few people appear to actually use the bin
files -- preferring the already gridded Standard
Mapped Image (SMI) files instead.  (The data in
the SMI files have, of course, been reduced from
32-bit floating-point values to 8-bit integers.
[Note that this has changed since I wrote the
above; many parameters are now stored as
32-bit floating point or 16-bit integer values
in our L3 SMI files.])

The program, called swreadl3b, is invoked in one
of three ways.

swreadl3b main_file_path parameter bin_number

or

swreadl3b main_file_path parameter north south west east

or

swreadl3b main_file_path parameter lat lon radius

In each case, main_file_path points to the uncompressed
bin file whose name, by convention, ends with ".main", and
parameter is the name of one of the products stored in
the subordinate files (e.g. "chlor_a","K_490","nLw_443",
etc.).  (The subordinate file containing the selected
parameter is expected to be in the same directory as the
main file -- also uncompressed.)

The first invocation returns data for a selected bin
number if that bin is stored in the file.

The second invocation extracts data for all bins
inside the box specified by north, south, west, and
east (specified in decimal degrees).

The third invocation extracts data for all bins
that are within radius kilometers of the central
latitude and longitude.
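
The box and radius selections described above can be sketched as follows. This is a minimal Python sketch and not the swreadl3b source: the function names are hypothetical, the tests are applied to bin center coordinates (whether the program tests centers or edges is not stated), a box whose west value exceeds its east value is assumed to cross the 180-degree meridian, and the Earth is assumed to be a sphere of radius 6371 km.

```python
import math

EARTH_RADIUS_KM = 6371.0  # assumed mean spherical radius


def in_box(lat, lon, north, south, west, east):
    """True if the point lies inside the box; west > east is taken to mean
    that the box crosses the 180-degree meridian."""
    if not (south <= lat <= north):
        return False
    if west <= east:
        return west <= lon <= east
    return lon >= west or lon <= east


def great_circle_km(lat1, lon1, lat2, lon2):
    """Haversine distance between two points, in kilometers."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(min(1.0, math.sqrt(a)))


def in_radius(lat, lon, center_lat, center_lon, radius_km):
    """True if the point is within radius_km of the center point."""
    return great_circle_km(lat, lon, center_lat, center_lon) <= radius_km


# A box running from 130 E eastward across the date line to 160 W.
print(in_box(40.0, 175.0, 45, 30, 130, -160))
print(in_box(40.0, 0.0, 45, 30, 130, -160))
```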

Program output looks as follows.  (The records are
181 characters wide not counting the line-feed, so
you may want to maximize your window to display them
without wrapping.)

                                                                                        chlor_a         chlor_a
    bin centerlat  centerlon     north     south       west       east    n   N         sum_obs sum_squared_obs          weight  time_trend_bits                     l2_flag_bits sel
------- --------- ---------- --------- --------- ---------- ---------- ---- --- --------------- --------------- --------------- ---------------- -------------------------------- ---
4449078  29.87500  -60.11212  29.91667  29.83333  -60.16017  -60.06407   20   7  4.46598411e-01  2.07058135e-02  1.15604782e+01 0111100000000000 00100000000000000000000000000001  11
4449079  29.87500  -60.01602  29.91667  29.83333  -60.06407  -59.96797   15   9  4.50908273e-01  2.18728725e-02  1.13889046e+01 0111100000000000 00100000000000000000000000000001  11
4449080  29.87500  -59.91991  29.91667  29.83333  -59.96797  -59.87186   23  12  6.72879457e-01  3.31562646e-02  1.61209564e+01 0111100000000000 00100000000001000000000000000001  11
   .
   .
   .

Mean values can be obtained by dividing the
sum_obs field by the weight field.  Other
statistics (standard deviation, etc.) are
also possible.  (Refer to the SeaWiFS bin file documentation,
http://oceancolor.gsfc.nasa.gov/cgi/tech_memo.pl?32 , published in other places.)
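
As a worked example of the division described above, the following Python sketch parses one record of the program output and computes the mean. The column names follow the header of the sample output. The standard deviation shown is only a simple estimate (weighted mean of squares minus squared mean) and is an assumption, not the formula from the bin file documentation, which should be consulted for the proper estimator. Function names are hypothetical.

```python
import math

COLUMNS = [
    "bin", "centerlat", "centerlon", "north", "south", "west", "east",
    "n", "N", "sum_obs", "sum_squared_obs", "weight",
    "time_trend_bits", "l2_flag_bits", "sel",
]
INT_FIELDS = ("bin", "n", "N")
FLOAT_FIELDS = ("centerlat", "centerlon", "north", "south", "west", "east",
                "sum_obs", "sum_squared_obs", "weight")


def parse_row(line):
    """Split one whitespace-delimited output record into a dict."""
    fields = line.split()
    if len(fields) != len(COLUMNS):
        raise ValueError(f"expected {len(COLUMNS)} fields, got {len(fields)}")
    row = dict(zip(COLUMNS, fields))
    for key in INT_FIELDS:
        row[key] = int(row[key])
    for key in FLOAT_FIELDS:
        row[key] = float(row[key])
    return row


def bin_stats(row):
    """Return (mean, std); std is a simple assumed estimate, see the text."""
    mean = row["sum_obs"] / row["weight"]
    variance = max(0.0, row["sum_squared_obs"] / row["weight"] - mean * mean)
    return mean, math.sqrt(variance)


sample = ("4449078  29.87500  -60.11212  29.91667  29.83333  -60.16017  -60.06407"
          "   20   7  4.46598411e-01  2.07058135e-02  1.15604782e+01"
          " 0111100000000000 00100000000000000000000000000001  11")
mean, std = bin_stats(parse_row(sample))
print(f"mean chlor_a = {mean:.5f}, std estimate = {std:.5f}")
```

For the first sample record this gives a mean chlor_a of roughly 0.0386 mg/m^3.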

I have run the program a few times with different
arguments and then manipulated the output with
Perl, GMT, etc. to generate the attached images.

The bin boundaries displayed in the first attachment
were generated with this command.

swreadl3b S20022742002304.L3b_MO.main chlor_a 32 24 -86 -78

(Bins in the specified region that are not outlined
are not present in the input file.)

The second attachment (North Pacific) started with
this command.

swreadl3b S20022742002304.L3b_MO.main chlor_a 45 30 130 -160

The third and fourth attachments (South Atlantic) are
just two different projections of the same data which
were extracted with this command.

swreadl3b S20022742002304.L3b_MO.main chlor_a -46 -48 1500

I cannot vouch for the usefulness of the last
three fields that are output, viz. time_trend_bits
(time_rec), l2_flag_bits (flags_set), and sel
(sel_cat).  In fact, I suspect that the
time trend bitmask stored in the bin files
is incorrect.  (In the above sample output one
would expect the set bits to be distributed
all across the 16-bit field for a monthly bin
file, yet only the least significant bits ever
seem to be set for any of the bins I have seen.)

The error may be in the timebin.c code.
The following fragment in particular looks
suspicious.

  if (strncmp(oprod_type, yearc, strlen(oprod_type))==0) {
  /* set bit for this month using month of start time of input file */
        tmp = file_smonths[ii] - 1;
  }else if (strncmp(oprod_type, monthc, strlen(oprod_type))==0){
  /* set bit for this day using day of start time of input file */
        tmp = (file_sdays[ii] + 1)/2 - 1;
  }else if (strncmp(oprod_type, day8c, strlen(day8c))==0){
  /* set bit for day of week, with first bit = start of output period */
        tmp = file_smdays[ii] - period_smdays;
  } else if (strncmp(oprod_type, dayc, strlen(dayc))==0) {
        tmp = start_orbit - sorbit;
  }        
  t_bit[ii] = tmp << 1;

I suspect that the last line might work
better if the bit-shift operands were
switched.

  t_bit[ii] = 1 << tmp;
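
The difference between the two shifts can be illustrated with a short Python sketch. It assumes that the per-file t_bit values are combined with a bitwise OR into one 16-bit mask, and that the time_trend_bits column in the sample output is printed least significant bit first; neither assumption is confirmed by the fragment above. The function names are hypothetical.

```python
def bit_index(day_of_month):
    """Bit position for a monthly product, as in the quoted fragment:
    days 1-2 map to 0, days 3-4 to 1, and so on up to 15 for day 31."""
    return (day_of_month + 1) // 2 - 1


def accumulate(days, shift):
    """OR together the per-file values produced by the given shift expression."""
    mask = 0
    for day in days:
        mask |= shift(bit_index(day))
    return mask & 0xFFFF


def lsb_first(mask, width=16):
    """Render a mask as a bit string with the least significant bit on the left."""
    return "".join("1" if (mask >> i) & 1 else "0" for i in range(width))


days = range(1, 32)  # one input file for every day of a 31-day month
as_written = accumulate(days, lambda tmp: tmp << 1)   # t_bit[ii] = tmp << 1
as_proposed = accumulate(days, lambda tmp: 1 << tmp)  # t_bit[ii] = 1 << tmp

print(lsb_first(as_written))   # 0111100000000000
print(lsb_first(as_proposed))  # 1111111111111111
```

Under these assumptions, the expression as written can only ever set bits 1 through 4, which reproduces the pattern in the sample output, while the switched operands set one distinct bit per two-day slot.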

I am attaching the source code (and
Makefile) for my swreadl3b program.
You can get the HDF libraries and include files from
http://www.hdfgroup.org/release4/obtain.html .

If you have any comments or suggestions,
send them along.

As Jim Acker likes to say,
"Thanks for reading all the way down."

Norman





Responsible NASA Official: Gene C. Feldman
Curator: OceanColor Webmaster
Authorized by: Gene C. Feldman
Updated: 27 November 2007