Ocean Color Forum
Frequently Asked Questions / Data Access FAQ / Bulk data downloads via HTTP
By @sean Date 2009-09-15 14:50
Can I download data in bulk via HTTP?

Yes.  It is possible to mimic FTP bulk data downloads using the HTTP-based data distribution server.

CAVEATS
1) The following examples are provided for informational purposes only.
2) No product endorsement is implied. 
3) There is no guarantee that these options will work for all situations.
4) The examples below are not an exhaustive description of the possibilities.

Using command-line utilities:

w.get:
(*note: 'w.get' stands for the wget utility; remove the period between 'w' and 'g' before executing the commands)

1) Equivalent of the FTP command "mget *SST4*" in /MODISA/L2/2006/005

w.get -q -O - http://oceandata.sci.gsfc.nasa.gov/MODISA/L2/2006/005/ |grep SST4|w.get --force-html -i -
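The pipeline above uses grep to pick the matching lines out of the directory listing. As a minimal sketch, the filtering step could be wrapped in a small shell function so the URL list can be inspected before anything is downloaded. The name list_links is hypothetical, and the function assumes the listing carries absolute URLs in double-quoted href attributes, which may not hold for every page.

```shell
# list_links PATTERN
# Read HTML on stdin and print every double-quoted href value that
# contains the fixed string PATTERN, one URL per line.
# Hypothetical helper; assumes href="..." attributes with absolute URLs.
list_links() {
  grep -o 'href="[^"]*"' | cut -d '"' -f 2 | grep -F -- "$1"
}

# Possible usage (wget is written as w.get elsewhere in this post):
#   wget -q -O - http://oceandata.sci.gsfc.nasa.gov/MODISA/L2/2006/005/ | list_links SST4 > urls.txt
#   wget -i urls.txt
```

Saving the list to a file first makes it possible to check what will be fetched, at the cost of one extra step.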

2) Use the file search utility to find and download OCTS daily L3 binned data from November 1, 1996 through January 1, 1997

w.get -q --post-data="sensor=octs&sdate=1996-11-01&edate=1997-01-01&dtype=L3b&addurl=1&results_as_file=1&search=*DAY*" -O - http://oceandata.sci.gsfc.nasa.gov/cgi/file_search.cgi |w.get -i -
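The POST string in the command above is long and easy to mistype. A minimal sketch of one way to assemble it from its parts follows. The function name build_search_query is hypothetical; the field names (sensor, sdate, edate, dtype, addurl, results_as_file, search) are copied from the example above, and the values are assumed to need no URL encoding.

```shell
# build_search_query SENSOR START_DATE END_DATE DTYPE PATTERN
# Print the POST data string for the file search utility.
# Hypothetical helper; values are inserted as-is (no URL encoding).
build_search_query() {
  printf 'sensor=%s&sdate=%s&edate=%s&dtype=%s&addurl=1&results_as_file=1&search=%s\n' \
    "$1" "$2" "$3" "$4" "$5"
}

# Possible usage (wget is written as w.get elsewhere in this post):
#   wget -q --post-data="$(build_search_query octs 1996-11-01 1997-01-01 L3b '*DAY*')" \
#        -O urls.txt http://oceandata.sci.gsfc.nasa.gov/cgi/file_search.cgi
```

Quoting the search pattern keeps the shell from expanding the asterisks against local file names.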

3) Download SeaWiFS data (which requires a username and password)

w.get --user=username --password=passwd http://oceandata.sci.gsfc.nasa.gov/restrict/getfile/S2006010174900.L2_GAC_OC.bz2

cURL:

Unlike w.get, cURL has no option for reading a plain list of URLs from a file (although it can download multiple URLs given on the command line).
However, a loop is easy to write in a shell or scripting language (Perl, Python, etc.); the examples below use a BASH for loop:

1) grab MODIS L2 files for 2006 day 005 (Jan 5th, 2006)

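# Fetch the directory listing, take the URL from each link line (assumed to be
# the 4th double-quote-delimited field), then download each file in turn.
for file in $(curl -s http://oceandata.sci.gsfc.nasa.gov/MODISA/L2/2006/005/ | grep href | cut -d '"' -f 4);
do
  curl -L -O "$file";
done;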
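If a bulk download is interrupted, rerunning the for loop above fetches every file again. The following is a minimal sketch of a loop that reads URLs from standard input and skips files already present in the current directory. The names fetch_all and FETCH_CMD are hypothetical; FETCH_CMD exists only so the loop can be dry-run with echo. The check is by file name only, so a partially downloaded file would also be skipped.

```shell
# fetch_all
# Read URLs from stdin, one per line, and download each one unless a file
# with the same base name already exists in the current directory.
# FETCH_CMD is a hypothetical override for dry runs (e.g. FETCH_CMD=echo);
# by default each URL is fetched with: curl -f -L -O URL
fetch_all() {
  local url
  while IFS= read -r url; do
    [ -z "$url" ] && continue                 # ignore blank lines
    if [ -e "${url##*/}" ]; then              # ${url##*/} is the file name part
      echo "skipping ${url##*/} (already present)" >&2
      continue
    fi
    ${FETCH_CMD:-curl -f -L -O} "$url" || echo "failed: $url" >&2
  done
}

# Possible usage:
#   curl -s http://oceandata.sci.gsfc.nasa.gov/MODISA/L2/2006/005/ | grep href | cut -d '"' -f 4 | fetch_all
```

Downloads still run one at a time, so this stays within the server's connection limits.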


2) Use the file search utility to find and download OCTS daily L3 binned data from November 1, 1996 through January 1, 1997

# POST the search query, keep the lines that contain URLs, then download each file in turn.
for file in $(curl -s -d "sensor=octs&sdate=1996-11-01&edate=1997-01-01&dtype=L3b&addurl=1&results_as_file=1&search=*DAY*" http://oceandata.sci.gsfc.nasa.gov/cgi/file_search.cgi |grep http);
do
  curl -L -O "$file";
done;


3) Download SeaWiFS data (which requires a username and password)

curl -u username:passwd -L -O  http://oceandata.sci.gsfc.nasa.gov/restrict/getfile/S2006010174900.L2_GAC_OC.bz2
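A password given on the command line ends up in the shell history and can be visible in the process list. One alternative, sketched here rather than recommended, is a ~/.netrc entry: curl reads it when given the -n option, and wget consults it when no credentials are passed. The function name add_netrc_entry is hypothetical, and the credentials shown are the same placeholders used above.

```shell
# add_netrc_entry HOST USER PASSWORD [FILE]
# Append a login entry to FILE (default: ~/.netrc) and make the file
# readable by its owner only.
# Hypothetical helper; it does not check for an existing entry for HOST.
add_netrc_entry() {
  local file="${4:-$HOME/.netrc}"
  ( umask 077; printf 'machine %s login %s password %s\n' "$1" "$2" "$3" >> "$file" )
  chmod 600 "$file"
}

# Possible usage:
#   add_netrc_entry oceandata.sci.gsfc.nasa.gov username passwd
#   curl -n -L -O http://oceandata.sci.gsfc.nasa.gov/restrict/getfile/S2006010174900.L2_GAC_OC.bz2
```

The password is still stored in plain text, so this only helps if the file permissions are kept tight.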

Web Browser options:

Firefox add-on 'DownThemAll'

If you prefer a GUI-based option, there is an add-on for the Firefox web browser called 'DownThemAll'.  It is easy to configure to download only
what you want from the page (it even has a default filter for archived products: gz, tar, bz2, etc.).  It lets you limit the number of concurrent downloads, which is
important when downloading from our servers because we limit connections to one per file and 3 files per IP, so don't try the "accelerate"
features, as your IP may get blocked.

Another alternative that works with more than just Firefox (but isn't free) is "Internet Download Manager".

Like 'DownThemAll', it has features to grab all the links on a page and to limit the number of concurrent downloads.  It also advertises download
acceleration. Do NOT use this feature with our servers, as your IP may get blocked.
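
The same connection limits apply to command-line downloads. As a rough sketch, a list of URLs can be fetched with at most 3 files in flight at once by using xargs. The names fetch_parallel and FETCH_CMD are hypothetical, and the -P option is an extension found in GNU and BSD xargs, not part of POSIX.

```shell
# fetch_parallel [MAX]
# Read URLs from stdin and download them with at most MAX transfers running
# at the same time (default 3, matching the per-IP limit described above).
# Each URL gets a single connection; no download "acceleration" is used.
# FETCH_CMD is a hypothetical override for dry runs (e.g. FETCH_CMD=echo).
fetch_parallel() {
  xargs -n 1 -P "${1:-3}" ${FETCH_CMD:-curl -f -L -O}
}

# Possible usage, with urls.txt holding one URL per line:
#   fetch_parallel 3 < urls.txt
```

Setting MAX above 3 would exceed the stated limit and may get your IP blocked.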

Responsible NASA Official: Gene C. Feldman
Curator: OceanColor Webmaster
Authorized by: Gene C. Feldman
Updated: 27 November 2007