Ocean Color Forum

The Ocean Color Forum has transitioned over to the Earthdata Forum (https://forum.earthdata.nasa.gov/). The information below is retained for historical reference. Please sign in to the Earthdata Forum for active user support.

Topic: Crafting URLs to download data? (locked)
- By bruce Date 2012-04-24 15:19
I have a list of dates with various lat/lon ranges (they don't correspond to any of the "predefined" regions).  Is it possible to craft a URL that will get all of the granules falling within the lat/lon range on a given day (or at least get me a list of appropriate URLs) for a given satellite (VIIRS in this case)?

Thanks!
Bruce
- By norman Date 2012-04-24 17:44 Edited 2020-04-08 15:39
############################################################################
# Note that the browse.pl code has changed since the following text was    #
# written.  Please see this post for an updated approach.                  #
############################################################################


Hi Bruce,

It's ugly, but you could do something like this.

set url=http://oceancolor.gsfc.nasa.gov/cgi/browse.pl

wget -qO - \
$url'?sub=level1or2list&sen=v0&per=DAY&day=15390&n=-35&s=-36&w=37&e=38' \
| perl -n -0777 \
       -e 'if(/filenamelist&id=(\d+\.\d+)/){' \
       -e 'print `wget "'$url'?sub=filenamelist&id=$1&prm=CHL" -qO -`;' \
       -e '}' \
       -e 'elsif(/(V\d+\.L2_NPP_OC)/){' \
       -e 'print "$1\n";' \
       -e '}' \
| wget -B http://oceandata.sci.gsfc.nasa.gov/cgi/getfile/ \
--content-disposition -i -

The above will download the VIIRS level-2 files collected on
20 Feb 2012 (day=15390 is the number of days elapsed
since 1 Jan 1970) in the region bounded by 35 and 36 South
and 37 and 38 East.
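
As an aside, if you need that day number for an arbitrary date, GNU date
can compute it (a quick Bourne-shell sketch, assuming GNU coreutils):

# days elapsed since 1 Jan 1970 (the unix epoch) for a given UTC date
date -u -d 2012-02-20 +%s | awk '{print int($1/86400)}'   # prints 15390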

For 3 Mar 2012 between 49 and 51 South and 70 and 73 East,
you could incant the following.

wget -qO - \
$url'?sub=level1or2list&sen=v0&per=DAY&day=15402&n=-49&s=-51&w=70&e=73' \
| perl -n -0777 \
       -e 'if(/filenamelist&id=(\d+\.\d+)/){' \
       -e 'print `wget "'$url'?sub=filenamelist&id=$1&prm=CHL" -qO -`;' \
       -e '}' \
       -e 'elsif(/(V\d+\.L2_NPP_OC)/){' \
       -e 'print "$1\n";' \
       -e '}' \
| wget -B http://oceandata.sci.gsfc.nasa.gov/cgi/getfile/ \
--content-disposition -i -

(The above pipelines assume a *nix Cshell.)
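
If you work in a Bourne-type shell (bash/ksh) instead, the only change
needed is the variable assignment; the rest of the pipeline is the same:

url=http://oceancolor.gsfc.nasa.gov/cgi/browse.pl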

Regards,
Norman
- By bruce Date 2012-04-24 18:08
Thanks!
- By daurin Date 2012-05-10 20:55
Norman,

I am trying to do something similar to Bruce, and was hoping you could clarify some of the parameters in your script.  In particular, what are valid inputs for sub, sen, day, and prm?  These don't seem to correspond to the "post-data" parameters as described in http://oceancolor.gsfc.nasa.gov/forum/oceancolor/topic_show.pl?tid=3081 (e.g. sensor=aqua).  Will all missions count days from 1/1/1970 in this fashion?  Since this script pulls an L2 with all default geophysical values, what does prm=CHL do?

Finally, at the risk of sounding hopelessly naive, is there any method by which this approach could be combined with geographic subsetting of the requested file prior to download?  I have ~13,000 stations (and counting) worldwide to match up with three missions (Aqua, Terra, and SeaWiFS), and am concerned about transfer times and storage space, even if I go after the L2 instead of L1A (i.e., in order to implement custom atmospheric correction).

Thanks,
Dirk
- By norman Date 2012-05-10 21:15
Hi Dirk,

The "post-data" parameters you refer to are for
Sean Bailey's file search script.  The parameters
in my post control the browse script.  As far as I
know, I am the only one using a day count from
1970 (based on the unix epoch), but yes, I use the
same counter for all missions.  The "prm" parameter
controls whether the browser displays true-color or
chlorophyll or SST quicklook images, but it also has
some effect on returned file names which is why I
passed it in my example for Bruce.
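
For reference, the browse.pl parameters used in this thread's queries
are listed below (illustrative only, not an exhaustive list of what the
script accepts):

  sub=level1or2list   ask browse.pl for a list of level-1 or level-2 granules
  sub=filenamelist    ask for the file names behind a returned list id
  sen=v0 / am / sw    sensor: VIIRS, MODIS-Aqua, SeaWiFS
  per=DAY             search period
  day=15390           day number counted from 1 Jan 1970 (the unix epoch)
  n, s, w, e          bounding box, in degrees north and east
  prm=CHL / TC        quicklook parameter (chlorophyll or true color);
                      also affects the returned file names
  dnm=D               daytime scenes only
  typ=MLAC            restrict SeaWiFS searches to MLAC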

The code was not written with these sorts of operations
in mind; I just knew that certain combinations of commands
would yield the requested result, so I posted an example.
I'm afraid your request is not in the cards right now.

When I need to process a long time series of data, I sometimes
do something like this.

(1)  Use the browser to get a list of files that meet my criteria.
(2)  Set up a loop that calls Sean's getfile to download each
file, process it, and then delete all the parent files before
moving on to the next file (a rough sketch is below).
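
A rough sketch of step (2), assuming the file list from step (1) has
been saved one name per line (the list file name and the processing
step are placeholders):

# download each file via getfile, process it, then remove it before the next
while read f; do
  echo "$f" | wget -q -B http://oceandata.sci.gsfc.nasa.gov/cgi/getfile/ \
                   --content-disposition -i -
  # ... run your processing on the downloaded file here (placeholder) ...
  rm -f "$f"*        # delete the parent file(s) before moving on
done < filelist.txt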

I hope this helps.

Norman
- By daurin Date 2012-05-14 18:58
Norman,
Thanks for the explanation.  I have put together an automated query based on your code and some crude ksh that I was hoping you could help me to debug. 

The goal is to read in a file with columns "station", "days-since-1/1/1970", "lat", "lon", and download AQUA, TERRA, and SeaWiFS from those times/positions. I have worked out the main parameter/value pairs for the perl script as far as I can from viewing the source code at http://oceancolor.gsfc.nasa.gov/cgi/browse.pl after running various queries.

My script nearly works, but for some reason downloads the Aqua file twice, the Terra file twice, and grabs both the GAC and MLAC for SeaWiFS.

I'm attaching the script, log, and input files.

Thanks,
Dirk 
Attachment: wget_matchup.ksh (2k)
Attachment: wget_matchup.log (9k)
Attachment: ag_testfile.csv (76B)
- By norman Date 2012-05-14 19:34 Edited 2012-05-14 19:37
Hi Dirk,

If you're only interested in the ocean color files,
you could add dnm=D to eliminate the nighttime
scenes.

http://oceancolor.gsfc.nasa.gov/cgi/browse.pl?sub=level1or2list&sen=am&per=DAY&day=12112&n=24.177&w=-159.819&dnm=D

I'm not sure, without a more careful study of your code,
why you are getting duplicate downloads unless the scenes
in question overlap two of your search areas.

You can restrict your SeaWiFS searches to MLAC by adding
typ=MLAC .

http://oceancolor.gsfc.nasa.gov/cgi/browse.pl?sub=level1or2list&sen=sw&typ=MLAC&per=DAY&day=12112&n=24.177&w=-159.819&dnm=D

By the way, your second and third example dates predate all three sensors.

Norman
- By daurin Date 2012-05-14 19:58
Norman,

The second and third dates are meant to be placeholders in the example; I was just seeing how the script behaved with no valid matches.
Thanks for the dnm=D and typ=MLAC tips, that did the trick!

Dirk
- By daurin Date 2012-06-12 22:25
Norman, I would like to adapt your code above to query just the L1A Aqua file. 

I tried something like this:

query="?sub=level1or2list&sen=am&type=LAC&per=DAY&day=13879&n=47&s=24&w=113&e=146&dnm=D"
ocfile=$wget -qO - \
   $url$query \
   | perl -n -0777 \
       -e 'if(/filenamelist&id=(\d+\.\d+)/){' \
       -e 'print `wget "'$url'?sub=filenamelist&id=$1" -qO -`;' \
       -e '}' \
       -e 'elsif(/(A\d+\.L1A_LAC)/){' \
       -e 'print "$1\n";' \
       -e '}' \
| wget -B http://oceandata.sci.gsfc.nasa.gov/cgi/getfile/ \
--content-disposition -i -

...but it still just returns L2.
Thanks,
Dirk
- By norman Date 2012-06-13 12:50 Edited 2013-04-09 13:53
Hi Dirk,

Clearly, I did not design the code with this sort of usage
in mind.  Nevertheless, you should be able to get what
you want by adding "&prm=TC" to your query string.

(You can leave out the "type=LAC" portion of the URL
for MODIS data.)

Norman
- By daurin Date 2012-06-13 13:29
Norman,  Sorry if I'm twisting your code to suit my evil plans, but essentially my need is the same as Bruce's at the top of this thread: to craft a URL that will get all the granules falling within a lat/lon range on a given day.  In fact, it has been enormously helpful.

I've tried your suggestion, but I'm still getting just the L2 files.  Here is my actual test code block:

#!/bin/ksh
# Download MODIS imagery from lat/lon limits at time (days since 1/1/70)

date
starttime=`date +%s`

INPUT=~/Dropbox/Projects/GeoCAPE/spatial_tsm/plumes_times_locations_test.txt
url=http://oceancolor.gsfc.nasa.gov/cgi/browse.pl
url2=http://oceandata.sci.gsfc.nasa.gov/cgi/getfile/
L1AAdir=/raid/Imagery/GEOCAPE/modisa/L1A/
L1ATdir=/raid/Imagery/GEOCAPE/modist/L1A/
L1BAdir=/raid/Imagery/GEOCAPE/modisa/L1B/
L1BTdir=/raid/Imagery/GEOCAPE/modist/L1B/
L2Adir=/raid/Imagery/GEOCAPE/modisa/L2/
L2Tdir=/raid/Imagery/GEOCAPE/modist/L2/

tick=0
nrecords=`wc -l $INPUT |awk '{print $1}'`
print "Total records: $nrecords"

typeset -i dat
# dat is the sorted day number since 1/1/70
[ ! -f $INPUT ] && { echo "$INPUT file not found"; exit 99; }
while read dat north south east west
do
  ((tick+=1))

  print "Finding imagery for:"
  date -d "UTC 1970-01-01 $dat days" -u --utc

  echo "#################<<<<<<<<<<<<<<<<< Sequential Day: $tick >>>>>>>>>>>>>>>>>>######################"

  # AQUA
  echo "************AQUA on $dat between $south and $north N, and between $west and $east E"
  query="?sub=level1or2list&sen=am&per=DAY&day=$dat&n=$north&s=$south&w=$west&e=$east&dnm=D&prm=TC"
  ocfile=$(wget -qO - \
    $url$query \
    | perl -n -0777 \
           -e 'if(/filenamelist&id=(\d+\.\d+)/){' \
           -e 'print `wget "'$url'?sub=filenamelist&id=$1" -qO -`;' \
           -e '}' \
           -e 'elsif(/(A\d+\.L1A_LAC)/){' \
           -e 'print "$1\n";' \
           -e '}' -)
  set -A ocaqua $ocfile
  loca=${#ocaqua[*]}
  print "Matches: ${ocaqua[*]}"
  for (( x=0; x<= ${#ocaqua[*]}-1; x +=1)); do
    print "File $x is ${ocaqua[$x]}"
    # Check for the full granule
    ls $L1AAdir${ocaqua[$x]} > /tmp/dump 2>&1
    q1=$?
    ls $L1AAdir${ocaqua[$x]}.bz2 > /tmp/dump 2>&1
    q2=$?  

    if [[ $q1 != 0 && $q2 != 0 ]];then
      print "Downloading..."
      # echo ${ocaqua[$x]} |
      # wget -B http://oceandata.sci.gsfc.nasa.gov/cgi/getfile/ \
      #  --content-disposition -i -
      # mv ${ocaqua[$x]}* $L1AAdir
    else
      print "File found on local drive"
    fi
  done
done < $INPUT

...and here is the output:

dirk@oceanbio3:~/Dropbox/rs_scripts$ ./wget_test.ksh
Wed Jun 13 09:13:24 EDT 2012
Total records: 1
Finding imagery for:
Tue Jan  1 00:00:00 UTC 2008
#################<<<<<<<<<<<<<<<<< Sequential Day: 1 >>>>>>>>>>>>>>>>>>######################
************AQUA on 13879 between 24.7500 and 47.2500 N, and between 113.4000 and 146.6000 E
Matches: A2008001025000.L2_LAC_OC A2008001025500.L2_LAC_OC A2008001030000.L2_LAC_OC A2008001043000.L2_LAC_OC A2008001043500.L2_LAC_OC A2008001044000.L2_LAC_OC A2008001061000.L2_LAC_OC A2008001061500.L2_LAC_OC
File 0 is A2008001025000.L2_LAC_OC
File found on local drive
File 1 is A2008001025500.L2_LAC_OC
File found on local drive
File 2 is A2008001030000.L2_LAC_OC
File found on local drive
File 3 is A2008001043000.L2_LAC_OC
File found on local drive
File 4 is A2008001043500.L2_LAC_OC
File found on local drive
File 5 is A2008001044000.L2_LAC_OC
File found on local drive
File 6 is A2008001061000.L2_LAC_OC
File found on local drive
File 7 is A2008001061500.L2_LAC_OC
File found on local drive
dirk@oceanbio3:~/Dropbox/rs_scripts$
- By norman Date 2012-06-13 16:22
Hi Dirk,

I'm not as familiar with the Bourne Shell, but I find
that the following works for me (tcsh).

set url=http://oceancolor.gsfc.nasa.gov/cgi/browse.pl

wget -qO - \
$url'?sub=level1or2list&sen=am&per=DAY&day=13879&n=47.25&s=24.75&w=113.4&e=146.6' \
| perl -n -0777 \
       -e 'if(/filenamelist&id=(\d+\.\d+)/){' \
       -e 'print `wget "'$url'?sub=filenamelist&id=$1&prm=TC" -qO -`;' \
       -e '}' \
       -e 'elsif(/(A\d+\.L1A_LAC)/){' \
       -e 'print "$1\n";' \
       -e '}' \
| wget -B http://oceandata.sci.gsfc.nasa.gov/cgi/getfile/ \
--content-disposition -i -

The above downloads 14 files for me.

ls -l
total 1882664
-rw-r--r-- 1 norman norman 207816338 2010-03-20 06:13 A2008001025000.L1A_LAC.bz2
-rw-r--r-- 1 norman norman 231256342 2010-03-20 05:47 A2008001025500.L1A_LAC.bz2
-rw-r--r-- 1 norman norman 209823502 2010-03-20 05:51 A2008001030000.L1A_LAC.bz2
-rw-r--r-- 1 norman norman 242869172 2010-03-20 05:40 A2008001043000.L1A_LAC.bz2
-rw-r--r-- 1 norman norman 232182468 2010-03-20 05:47 A2008001043500.L1A_LAC.bz2
-rw-r--r-- 1 norman norman 199515573 2010-03-20 05:42 A2008001044000.L1A_LAC.bz2
-rw-r--r-- 1 norman norman 227956311 2010-03-20 05:53 A2008001061000.L1A_LAC.bz2
-rw-r--r-- 1 norman norman 218411296 2010-03-20 05:35 A2008001061500.L1A_LAC.bz2
-rw-r--r-- 1 norman norman  24448746 2008-01-02 11:57 A2008001163000.L1A_LAC.bz2
-rw-r--r-- 1 norman norman  28992762 2008-01-02 12:05 A2008001163500.L1A_LAC.bz2
-rw-r--r-- 1 norman norman  28354125 2008-01-02 12:01 A2008001164000.L1A_LAC.bz2
-rw-r--r-- 1 norman norman  23563523 2008-01-02 12:27 A2008001181000.L1A_LAC.bz2
-rw-r--r-- 1 norman norman  23894426 2008-01-02 12:28 A2008001181500.L1A_LAC.bz2
-rw-r--r-- 1 norman norman  26765168 2008-01-02 12:27 A2008001182000.L1A_LAC.bz2

Norman
- By norman Date 2012-06-13 16:26
Oh... I see the issue.  Move your prm=TC to the
second wget invocation.
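
That is, drop prm=TC from the query string and put it on the inner
filenamelist request instead, as in the working example above:

# query string without prm=TC:
query="?sub=level1or2list&sen=am&per=DAY&day=$dat&n=$north&s=$south&w=$west&e=$east&dnm=D"
# ...and prm=TC on the second (inner) wget:
       -e 'print `wget "'$url'?sub=filenamelist&id=$1&prm=TC" -qO -`;' \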
- By daurin Date 2012-06-13 16:39
That's perfect.  It's working fine now.  Thanks once again!
