Uploading/downloading/editing files#
Uploading/downloading files#
To transfer files to and from the HPC, see the section about transferring files in the HPC manual.
dos2unix#
After uploading files from Windows, you may run into problems caused by the difference in line endings between Windows (carriage return + line feed) and Linux (line feed only). See also https://kuantingchen04.github.io/line-endings/.
For example, you may see an error when submitting a job script that was edited on Windows:
sbatch: error: Batch script contains DOS line breaks (\r\n)
sbatch: error: instead of expected UNIX line breaks (\n).
To fix this problem, run the dos2unix command on the file:
$ dos2unix filename
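To see what the conversion does, the following minimal sketch creates a small script with Windows line endings, counts the lines containing a carriage return, and strips them. The file name job.sh is hypothetical, and tr is used here only as a stand-in for dos2unix so the sketch also runs on machines where dos2unix is not installed.

```shell
# Work in a temporary directory so the demo leaves no files behind in $HOME.
workdir=$(mktemp -d)
script="$workdir/job.sh"   # hypothetical file name

# Write a two-line script with Windows (CRLF) line endings.
printf '#!/bin/bash\r\necho hello\r\n' > "$script"

# Count the lines that contain a carriage return.
cr=$(printf '\r')
before=$(grep -c "$cr" "$script")

# Strip the carriage returns (stand-in for: dos2unix "$script").
tr -d '\r' < "$script" > "$script.unix" && mv "$script.unix" "$script"

# grep -c exits non-zero when nothing matches, hence the "|| true".
after=$(grep -c "$cr" "$script" || true)
echo "lines with CR before: $before, after: $after"
```

On the HPC itself, running dos2unix on the file as shown above achieves the same result in one step.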
Symlinks for data/scratch#
When connecting, we end up in the home directory, so it is convenient to be able to reach our data and scratch storage from there. To make this possible, we create symlinks to them in our home directory. The commands below create two symbolic links (they're like "shortcuts" on your desktop) pointing to the respective storage locations:
$ cd $HOME
$ ln -s $VSC_SCRATCH scratch
$ ln -s $VSC_DATA data
$ ls -l scratch data
lrwxrwxrwx 1 vsc40000 vsc40000 31 Mar 27 2009 data -> /user/data/gent/vsc400/vsc40000
lrwxrwxrwx 1 vsc40000 vsc40000 34 Jun 5 2012 scratch -> /user/scratch/gent/vsc400/vsc40000
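The behaviour of such a link can be tried out safely before touching your real home directory. The sketch below is only an illustration: it uses two temporary directories as stand-ins for $HOME and $VSC_DATA (an assumption, so that it runs on any Linux machine), creates the link, and shows that a file created through the link ends up in the target directory.

```shell
# Temporary stand-ins for $HOME and $VSC_DATA (assumption for this demo).
demo_home=$(mktemp -d)
demo_data=$(mktemp -d)

cd "$demo_home"

# Create a symbolic link named "data" that points to the storage directory.
ln -s "$demo_data" data

# readlink prints the target a symlink points to.
target=$(readlink data)
echo "data -> $target"

# A file created through the link is stored in the target directory.
touch data/example.txt
ls "$demo_data"
```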
Editing with nano#
Nano is one of the simplest editors available on Linux. To open Nano, just type nano. To edit a file, use nano the_file_to_edit.txt. You will be presented with the contents of the file and a menu at the bottom with commands like ^O Write Out. The ^ is the Control key, so ^O means Ctrl-O. The main commands are:
- Open ("Read"): ^R
- Save ("Write Out"): ^O
- Exit: ^X
More advanced editors (beyond the scope of this page) are vim and emacs. A simple tutorial on how to get started with vim can be found at http://www.openvim.com/.
Copying faster with rsync#
rsync is a fast and versatile copying tool. It can be much faster than scp when copying large datasets. It's famous for its "delta-transfer algorithm", which reduces the amount of data sent over the network by only sending the differences between files.
You will need to run rsync from a computer where it is installed. Installing rsync is easiest on Linux: it comes pre-installed with many distributions.
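A quick way to find out whether rsync is available on your computer is to ask the shell where the command lives. This is a minimal sketch; the variable name rsync_status is made up for the illustration.

```shell
# command -v prints the path of a command and fails if it is not found.
if command -v rsync >/dev/null 2>&1; then
    rsync_status="installed"
    # Show the first line of the version banner.
    rsync --version | head -n 1
else
    rsync_status="missing"
    echo "rsync not found; install it with your package manager"
fi
echo "rsync is $rsync_status"
```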
For example, to copy a folder with lots of CSV files:
$ rsync -rzv testfolder vsc40000@login.hpc.ugent.be:data/
This will copy the folder testfolder and its contents to $VSC_DATA on the HPC, assuming the data symlink is present in your home directory (see the symlinks section).
The -r flag means "recursively", the -z flag means that compression is enabled (this is especially handy when dealing with CSV files because they compress well) and the -v flag enables more verbosity (more details about what's going on).
To copy large files using rsync, you can use the -P flag: it both shows progress and allows resuming partially transferred files.
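These flags can be tried out without a network connection by copying between two local directories. The sketch below is only an illustration under that assumption: the temporary directories and the file values.csv are made up, and the copy is skipped when rsync is not installed.

```shell
# Temporary source and destination directories (stand-ins for real data).
src=$(mktemp -d)
dest=$(mktemp -d)
mkdir "$src/testfolder"
printf 'a,b\n1,2\n' > "$src/testfolder/values.csv"

if command -v rsync >/dev/null 2>&1; then
    # -r: recursive, -z: compress, -v: verbose, -P: progress + keep partial files.
    # No trailing slash on the source, so testfolder itself is created in $dest.
    rsync -rzv -P "$src/testfolder" "$dest/"
    result=$(ls "$dest/testfolder")
else
    result="rsync-missing"
fi
echo "$result"
```

Compression gains little for a local copy like this one; it matters when the data travels over the network, as in the commands on this page.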
To copy files to your local computer, you can also use rsync:
$ rsync -rzv vsc40000@login.hpc.ugent.be:data/bioset local_folder
This will copy the folder bioset and its contents on $VSC_DATA to a local folder named local_folder.
See man rsync or https://linux.die.net/man/1/rsync for more information about rsync.
Exercises#
- Download the file /etc/hostname to your local computer.
- Upload a file to a subdirectory of your personal $VSC_DATA space.
- Create a file named hello.txt and edit it using nano.
Now that you have a basic understanding, see the next chapter for some more in-depth concepts.