Ocean Color Forum
Up Topic Frequently Asked Questions / Data Access FAQ / Change to Subscription Data Access via FTP (locked)
- By SeanBailey Date 2010-03-22 13:25
This message is relevant for users who have active non-extracted data
subscriptions.  None of the changes outlined here will affect extracted or
mapped subscriptions.

The OBPG is in the process of moving away from FTP for data distribution in
favor of HTTP.  This is driven by a number of issues that some of you might
have encountered with the OBPG FTP server such as slow response and timeouts.
Much of that has to do with the infrastructure behind the FTP server, but some
of it stems from the age of the FTP protocol itself, especially when dealing
with firewalls.  To that end, we will be introducing a change in how the OBPG
data subscriptions are staged.

The current scheme stages files, which are actually symbolic links, in your
subscription directories.  Users poll the site with an FTP client and
download any new files.  For this scheme to work, the actual
data files must be present on the FTP server in some way.  It usually works
okay, but occasionally we encounter a problem that causes all of the physical
space to be used, which prevents new data from being staged until space is
available again.

The new scheme will instead create zero-length marker files, each with the
same name as the file that would have been staged in the old scheme.  This
scheme has several advantages:

   a. Since the marker files have no size, we should never have a space issue
on the subscriptions partition again.

   b. We no longer have to make copies of a significant amount of data, since
the files will not have to reside on the FTP server.

   c. The subscription system will be able to stage the data faster since it
takes virtually no time to create a zero-length file.

The name of each marker file can be plugged into an HTTP getfile service call
(discussed below) that downloads the corresponding data file.  This will
require a small change to your download procedures.  We envision two main
scenarios:

Scenario 1:  You poll the subscription directory for new files, and download
any files that have appeared since the last poll.

In this case, the process that would have downloaded the new files via FTP
would instead use a command-line HTTP client such as wget or curl to initiate
an HTTP transfer.  The polling mechanism need not change.
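
As a rough illustration of Scenario 1, here is a minimal sketch that compares
the latest directory listing against a record of names already downloaded and
fetches only the new ones with wget.  Only the getfile URL comes from this
post; the function name, the listing file (one marker-file name per line,
produced by whatever FTP polling you already do), and the "seen" state file
are assumptions made for the example.

```shell
#!/bin/sh
# Sketch only: fetch files that are new since the previous poll.
# BASE_URL is the getfile service from this post; everything else
# (function name, listing file, "seen" state file) is hypothetical.
BASE_URL="${BASE_URL:-http://oceandata.sci.gsfc.nasa.gov/cgi/getfile}"

# fetch_new LISTING SEEN DEST
#   LISTING: file with one marker-file name per line (from your FTP poll)
#   SEEN:    state file of names already downloaded (created if missing)
#   DEST:    local directory for the downloaded data files
fetch_new() {
    listing=$1
    seen=$2
    dest=$3
    [ -f "$seen" ] || : > "$seen"
    work=$(mktemp -d) || return 1
    sort -u "$listing" > "$work/listing"
    sort -u "$seen" > "$work/seen"
    # comm -13 prints only lines unique to the second input: the new names.
    comm -13 "$work/seen" "$work/listing" > "$work/new"
    while IFS= read -r name; do
        [ -n "$name" ] || continue
        if wget -q -O "$dest/$name" "$BASE_URL/$name"; then
            # Record the name only after a successful transfer.
            printf '%s\n' "$name" >> "$seen"
        else
            # Drop the partial file; the name is retried on the next poll.
            rm -f "$dest/$name"
        fi
    done < "$work/new"
    rm -rf "$work"
}
```

A poll cycle would then be something like
fetch_new listing.txt seen.txt ./data, run after each FTP listing.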

Scenario 2:  You periodically sync your local directory with the subscription
directory, and the FTP client determines which files you need and downloads
them automatically.

In this case, you will need a process that examines the downloaded files for
those that are zero length.  For each zero-length file, you will need to
invoke the HTTP transfer.
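
One possible shape for the Scenario 2 process is sketched below.  It assumes
the markers stay untouched in the synced directory (so the FTP client does not
see a size mismatch and fetch them again) and that the real files are written
to a separate data directory; that layout, the function name, and the ".part"
temporary suffix are assumptions for the example, not part of the OBPG
service.

```shell
#!/bin/sh
# Sketch only: fetch the real file for every zero-length marker in a
# synced directory.  BASE_URL is the getfile service from this post;
# the function name and two-directory layout are hypothetical.
BASE_URL="${BASE_URL:-http://oceandata.sci.gsfc.nasa.gov/cgi/getfile}"

# fill_markers SYNC_DIR DATA_DIR
fill_markers() {
    sync_dir=$1
    data_dir=$2
    for marker in "$sync_dir"/*; do
        [ -f "$marker" ] || continue          # skip directories, unmatched glob
        [ -s "$marker" ] && continue          # non-empty: not a marker
        name=${marker##*/}                    # strip the directory part
        [ -s "$data_dir/$name" ] && continue  # already fetched on an earlier run
        # Download to a temporary name so a failed transfer never looks complete.
        if curl --fail --silent --show-error \
                --output "$data_dir/$name.part" "$BASE_URL/$name"; then
            mv "$data_dir/$name.part" "$data_dir/$name"
        else
            rm -f "$data_dir/$name.part"
        fi
    done
}
```

Running fill_markers ./subscription ./data after each sync would pick up any
markers that arrived since the previous run.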

The HTTP-getfile service call looks like this:

Using wget:

    wget http://oceandata.sci.gsfc.nasa.gov/cgi/getfile/$file

Using curl:

    curl --output $file http://oceandata.sci.gsfc.nasa.gov/cgi/getfile/$file

You replace $file with the name of the file you want to download, i.e. the
name of the zero-length file.  Note there are two instances of $file in the
curl command.

wget and curl each have a number of options for dealing with timeouts and
retrying transfers that you can use.
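
For instance, the two calls might be wrapped as follows.  This is only a
sketch: the function names are hypothetical and the numeric values are
illustrative, not OBPG recommendations.

```shell
#!/bin/sh
# Sketch only: one download with explicit timeout and retry settings.
# BASE_URL is the getfile service from this post; the function names
# and the numeric values are illustrative assumptions.
BASE_URL="${BASE_URL:-http://oceandata.sci.gsfc.nasa.gov/cgi/getfile}"

# get_with_wget FILE: up to 5 attempts, 60 s network timeout,
# with a growing pause of at most 10 s between attempts.
get_with_wget() {
    wget --tries=5 --timeout=60 --waitretry=10 "$BASE_URL/$1"
}

# get_with_curl FILE: treat HTTP errors as failures, allow 30 s to
# connect, retry transient failures up to 5 times, 10 s apart.
get_with_curl() {
    curl --fail --connect-timeout 30 --retry 5 --retry-delay 10 \
         --output "$1" "$BASE_URL/$1"
}
```

Both functions return the client's exit status, so a calling script can test
for success before treating a file as downloaded.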

This new scheme was implemented on Mar 22, 2010.  We have a few test
subscriptions that have transitioned to the zero-length marker files.  If you
would like to experiment with them, they can be accessed in these directories
on oceans.gsfc.nasa.gov:

     subscriptions/812
     subscriptions/813
     subscriptions/826
     subscriptions/827
     subscriptions/833

We would like this transition to be as smooth as possible for everyone.  If
you have any questions or concerns about this, please let us know.  You can
post questions in the Data Access board.

Regards,

NASA Ocean Biology Processing Group