Data transfer is a routine activity for most faculty, whether it’s sharing research data with colleagues, downloading research databases, or backing up vital data. When the volume of data you’re transferring is in the tens or hundreds of megabytes, any tool can get the job done. When you have gigabytes, or tens of gigabytes, of data to move, more strategy is called for.
The tool and strategy you should use depend on the kind of data you have, the size of the data, whether you need to do the transfer once or repeatedly, and the computer and tools you’re most comfortable with. Some ideas are outlined below; NACS’s Research Computing Support (RCS) also maintains a more detailed discussion, with links to sites from which you can get data transfer tools.
Two basic strategies can reduce the volume of data you actually need to transfer: compression and synchronization. Unless your data is already in a compressed form (say, MP3 files), compression can save a great deal of time and network capacity. Many transfer tools can even compress on the fly. If your files contain sensitive information, you may also wish to encrypt the data you’re transferring, although this imposes a small time penalty.
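To get a rough feel for whether compression is worth the effort on a given kind of data, a small test like the following can help. It is only a sketch using Python's standard gzip module: the function name compression_ratio and both sample inputs are invented for illustration, with random bytes standing in for data that is already compressed.

```python
import gzip
import os


def compression_ratio(data: bytes) -> float:
    """Return compressed size divided by original size (smaller is better)."""
    if not data:
        raise ValueError("data must be non-empty")
    return len(gzip.compress(data)) / len(data)


# Hypothetical sample data, for illustration only.
text_like = b"sample_id,temperature,pressure\n" * 10_000  # highly repetitive text
random_like = os.urandom(300_000)  # stands in for already-compressed data

print(f"text-like ratio:   {compression_ratio(text_like):.3f}")
print(f"random-like ratio: {compression_ratio(random_like):.3f}")
```

Repetitive, text-like data should shrink to a small fraction of its original size, while the random sample should stay at roughly its original size, which mirrors the MP3 case above. Results on real research data will fall somewhere in between.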
The second strategy, particularly when you’re regularly moving the same data, is to use a synchronization tool that recognizes that only part of your data is new and needs to be transferred. This can be particularly convenient if you have an entire directory tree you wish to send over the network.
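The idea behind synchronization can be sketched as follows: fingerprint every file on both sides, then send only the files whose fingerprints differ. This is a simplified illustration, not how any particular synchronization tool works internally. The helper names (file_digest, build_manifest, files_to_send) are made up for this example, and it assumes the destination's fingerprints can be obtained somehow, for instance by running the same script on the remote machine.

```python
import hashlib
from pathlib import Path


def file_digest(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the SHA-256 of a file, read in chunks so large files fit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root: Path) -> dict[str, str]:
    """Map each file's path (relative to root) to the digest of its contents."""
    return {
        path.relative_to(root).as_posix(): file_digest(path)
        for path in sorted(root.rglob("*"))
        if path.is_file()
    }


def files_to_send(source: dict[str, str], destination: dict[str, str]) -> list[str]:
    """Return the files that are new or changed at the source."""
    return sorted(
        name for name, digest in source.items() if destination.get(name) != digest
    )
```

Comparing the two manifests yields the short list of files that actually need to cross the network. Production tools typically go further, for example by sending only the changed portions of a large file.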
A final technique that may apply in some cases is to make the best possible use of the network, either by setting up multiple parallel data-transfer streams or by creating a special-purpose GridFTP node. RCS staff can help you analyze your data transfer needs, choose a method, and set up your system.
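The parallel-stream idea can be illustrated by dividing a file into byte ranges and moving each range independently. The sketch below is a local stand-in only: it copies between two files on the same machine using threads, and all function names (plan_ranges, copy_range, parallel_copy) are hypothetical. It shows how the work is divided, not how GridFTP or any other transfer tool implements it, and a local copy like this will not itself run faster.

```python
import os
from concurrent.futures import ThreadPoolExecutor


def plan_ranges(total_size: int, streams: int) -> list[tuple[int, int]]:
    """Split total_size bytes into up to `streams` contiguous (offset, length) ranges."""
    if total_size < 0 or streams < 1:
        raise ValueError("total_size must be >= 0 and streams >= 1")
    base, extra = divmod(total_size, streams)
    ranges, offset = [], 0
    for index in range(streams):
        length = base + (1 if index < extra else 0)
        if length:
            ranges.append((offset, length))
            offset += length
    return ranges


def copy_range(src: str, dst: str, offset: int, length: int, block: int = 1 << 20) -> None:
    """Copy one byte range; each call opens its own handles, like one stream."""
    with open(src, "rb") as fin, open(dst, "r+b") as fout:
        fin.seek(offset)
        fout.seek(offset)
        remaining = length
        while remaining > 0:
            data = fin.read(min(block, remaining))
            if not data:
                break
            fout.write(data)
            remaining -= len(data)


def parallel_copy(src: str, dst: str, streams: int = 4) -> None:
    """Copy src to dst as several concurrent range copies (local stand-in for network streams)."""
    total = os.path.getsize(src)
    with open(dst, "wb") as fout:
        fout.truncate(total)  # pre-size the destination so ranges can land in any order
    with ThreadPoolExecutor(max_workers=streams) as pool:
        futures = [
            pool.submit(copy_range, src, dst, offset, length)
            for offset, length in plan_ranges(total, streams)
        ]
        for future in futures:
            future.result()  # re-raise any error from a worker
```

Over a real network, whether several streams help depends on the path between the two machines, so it is worth measuring before committing to this approach.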
RCS staff will also coordinate with NACS Network Engineers to ensure they are aware of research data transfer needs in various campus locations. This will help inform future network upgrade plans. In addition, in a few cases, it may be possible to upgrade network connections to higher speeds to support critical research requirements.