AOC - CV Archive Workaround

Downloading archive data between CV and AOC, in either direction, is slow.  This page describes a workaround.

There are two network links between CV and AOC: a 10Gb/s Internet2 tunnel, which is used for specific things, and a 45Mb/s MPLS link, which is the default.  Because the Internet2 link is unencrypted, we cannot use it for all our traffic.  Also, because the server that hosts the ALMA archive hosts other things as well, we cannot put it on the Internet2 link.  This means that all archive downloads from CV to AOC go over the 45Mb/s MPLS link, which is often too slow for large archive downloads.
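To make the difference between the two links concrete, here is a rough estimate of the ideal transfer time over each.  The 100 GB file size is a made-up example, `transfer_seconds` is a hypothetical helper name, and the calculation ignores protocol overhead and competing traffic, so real transfers will be slower.

```shell
# transfer_seconds is a hypothetical helper for a back-of-the-envelope estimate.
# $1 = size in gigabytes (10^9 bytes), $2 = link rate in megabits per second
transfer_seconds() {
    # 1 GB = 8000 megabits, so seconds = GB * 8000 / (Mb/s)
    awk -v gb="$1" -v mbps="$2" 'BEGIN { printf "%.0f\n", gb * 8000 / mbps }'
}

transfer_seconds 100 45       # MPLS link: prints 17778 (about 4.9 hours)
transfer_seconds 100 10000    # Internet2 tunnel: prints 80
```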

To work around this problem, you can use SSH dynamic port forwarding, which sends your downloads through the Internet2 tunnel.  For this, you will need two terminal windows.

Getting ALMA data to NM

These instructions assume you are not logged into the ALMA Archive web portal.  If you are, the curl options may not work, so log out of the portal before following these instructions.

These instructions do NOT work for proprietary data at this time.

1. Select your data from the ALMA Archive.
2. Select the File List download method to get a list of URLs.
3. Open two terminal windows on nmpost-master or an nmpost cluster node.
4. In the first terminal window, create a port forward that uses your UID as the local port number.  Using your UID will keep you from interfering with other users.
      ssh -D localhost:$(id -u) cvpost-master.cv.nrao.edu
5. Then, in the second terminal window, use the curl command to download each URL in the File List.  For example:
      cd /lustre/$(whoami)
      curl -O --socks5-hostname localhost:$(id -u) https://almascience.nrao.edu/dataPortal/requests/anonymous/250749371/ALMA/2011.0.00101.S_2012-04-17_001_of_001.tar/2011.0.00101.S_2012-04-17_001_of_001.tar
6. You can continue to download data in this manner until you exit the first terminal window, which will close your port forward.
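When the File List contains many URLs, the curl step can be repeated in a loop.  The following is a minimal sketch, not part of the original instructions.  It assumes the URLs have been saved one per line in a file called urls.txt (an assumed name), that the port forward from the first terminal window is still running, and that `download_list` is a hypothetical helper name.

```shell
# CURL may be overridden (for example with a stub) to preview the commands.
CURL=${CURL:-curl}

# download_list is a hypothetical helper name.
# $1 = file with one URL per line, $2 = local port of the ssh -D forward
download_list() {
    while IFS= read -r url; do
        # Skip blank lines and lines starting with '#'
        case $url in
            ''|'#'*) continue ;;
        esac
        # Stop at the first failed download so it can be retried
        "$CURL" -O --socks5-hostname "localhost:$2" "$url" || return 1
    done < "$1"
}

# urls.txt is an assumed file name; the port matches the UID-based forward.
if [ -f urls.txt ]; then
    download_list urls.txt "$(id -u)"
fi
```

Run it from the directory where the files should be saved, since curl -O writes each file into the current directory.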

Getting Other Data from CV to NM

The two clusters, nmpost and cvpost, are connected via the Internet2 link, so the following is the quickest way to copy data from CV to NM.

1. Log in to nmpost-master or an nmpost cluster node.
2. Use rsync to copy the data like so:
      rsync -va cvpost-master.cv.nrao.edu:/lustre/naasc/krowe/bigfile.tar /lustre/krowe

Getting Other Data from NM to CV

The two clusters, nmpost and cvpost, are connected via the Internet2 link, so the following is the quickest way to copy data from NM to CV.

1. Log in to cvpost-master or a cvpost cluster node.
2. Use rsync to copy the data like so:
      rsync -va nmpost-master.aoc.nrao.edu:/lustre/krowe/bigfile.tar /lustre/naasc/krowe
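Both rsync recipes follow the same pattern: log in on the destination side and name the other site's master node as the source.  The following is a minimal sketch of a wrapper for that pattern.  `pull_over_i2` is a hypothetical helper name, and the `--partial` and `--progress` options are an addition not found in the original commands.  With `--partial`, rsync keeps a partially transferred file so that rerunning the command after an interruption can reuse it, and `--progress` shows how far the transfer has gotten.

```shell
# RSYNC may be overridden (for example with a stub) to preview the command.
RSYNC=${RSYNC:-rsync}

# pull_over_i2 is a hypothetical helper, not an existing tool.
# $1 = remote master node, $2 = path on the remote side, $3 = local destination
pull_over_i2() {
    "$RSYNC" -va --partial --progress "$1:$2" "$3"
}

# Example, run on an nmpost node (host and paths taken from the steps above):
# pull_over_i2 cvpost-master.cv.nrao.edu /lustre/naasc/krowe/bigfile.tar /lustre/krowe
```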