published in GNU/Linux
tags: rsync hugo

The battle of transfer protocols or "How do I update this site"

All right, so as you may know, this website is made with Hugo, the awesome static site generator written in Go. I have also used the amazing base16 theme by Hylke Visser, and I have tweaked it a lil’ bit so it fits my needs. Actually, after going through the documentation, I want to try and learn Go. Next time, I guess.

Today I’m going to talk about how I update this static website.

One thing I really like about Hugo is that it comes with a tiny webserver, so you can build your website on your local machine and see what your last update looks like. You can use, for example:

$ hugo server --buildDrafts

And your machine starts serving the website on port 1313. This is something I find really cool, because it means you get a development and a testing environment in one place. I like the simplicity of developing/testing locally and then uploading the updates to the production server. Check Hugo’s quickstart guide to see how easily you can generate a static site with this tool.
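In case you’ve never tried it, bootstrapping a site boils down to a couple of commands (a minimal sketch; mysite and the post name are placeholders of my own):

$ hugo new site mysite
$ cd mysite
$ hugo new posts/my-first-post.md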

Each time you’re done updating your site, you can just hit:

$ hugo

And voilà! Your static website has been generated under the public/ directory. Now you can upload it anywhere. But as always, there are several ways I could think of, namely FTP, SCP, SFTP and rsync. Let’s take a look at them, shall we?

FTP

Ok, to be fair, I was not even considering this one, mainly because all transmissions in the File Transfer Protocol are in plaintext. It just amazes me how many hosting companies offer only this system to manage remote files. Just take a quick look at the Wikipedia article and you’ll get my point. It’s fine to use when both machines are connected through a VPN, but I’m not creating a VPN just so I can use FTP instead of the multiple alternatives.

But what about FTPS?

FTPS is an extension of FTP that adds support for TLS/SSL. This gives me a security layer that would meet my goals. There’s only one problem: I would have to install an FTP server that supports FTPS (like the FileZilla server) and then configure it to only use secure connections.

But I already have an SSH server running on the remote machine, and there are many ways to transfer a file over SSH… so what’s the point of adding another server? After all, SSH is a cryptographic protocol by design, not an extension of a non-cryptographic one, so there’s no plaintext mode to misconfigure. Therefore, let’s see what options I can use over SSH.

SFTP, SCP, RSYNC

Here’s where I started doubting. In the beginning, every one of these options looked good to me. They all work over SSH, so there’s no need to install new software on the remote machine. I’ve used all of them for different purposes, such as backing up my data, accessing my smartphone files remotely or quickly copying certain files from/to a Raspberry Pi. At that point I could say they were all good enough for me, but as the goal of this blog is to learn new things (and do them right), I’ve done a little research on the differences between these three programs in order to decide which one is the best to use by default.

SCP

SCP stands for “Secure Copy”. It works like the UNIX command cp, but over SSH. You can use it either to copy files on a local machine or to copy them to/from a remote host. It’s a very simple command-line utility, and you can specify a few options. After a quick search I found some issues with this tool:

  • If the transfer is interrupted, it can’t be resumed.
  • It makes a plain copy from A to B, regardless of what’s already in B.

Hence, this tool makes sense for a first upload, but not for the following updates, as I don’t want to upload the whole website each time.
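For that first upload, the command would look something like this (the host and paths are placeholders):

$ scp -r public/ foo@remotemachine:/path/to/website/directory/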

SFTP

SFTP stands for “SSH File Transfer Protocol”. This protocol allows for many more operations than SCP. The sftp program is interactive: it lets you navigate through the directories of the remote machine and put or get whatever you want. Even though this system would easily let me see what’s on the remote machine and quickly upload only what I want… it just looks too complex. I want something that updates my files on the remote machine without uploading them all again and without me having to check manually what I have to upload. Something I can script once, then execute and zoop, it’s done.
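For reference, a typical interactive session looks something like this (again, host and paths are placeholders):

$ sftp foo@remotemachine
sftp> cd /path/to/website/directory/
sftp> put public/index.html
sftp> exit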

RSYNC

Rsync is a copying tool. Like SCP, it can copy both locally and to/from another host. It works over remote shell protocols or by using its own daemon. What I find interesting about this program is that it’s very versatile and allows for a lot of fine-tuning. But there’s more. It uses a delta-transfer algorithm, which only sends the differences between the source files and the destination files. It also lets me write my own filter rules, so I can script ahead of time which directories I want to update, delete or preserve. It looks like the perfect tool for what I want to do.
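As a taste of those filter rules, here’s a hypothetical filter file (the file name and patterns are examples of my own): a ‘P’ rule protects matching files on the destination from deletion, while a ‘-’ rule excludes them from the transfer.

# rsync-filter
P not-hugo/
- *.tmp

You would then point rsync at it with --filter='merge rsync-filter'.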

Final setup

In order to make my life easier, I use SSH public-key authentication. I’ve also written the following (trivial) script, which I call zoop.sh:
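If you haven’t set up key authentication yet, it boils down to two commands (the key type and host here are just examples):

$ ssh-keygen -t ed25519
$ ssh-copy-id foo@remotemachine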

#!/bin/bash
# This script should be placed in the hugo working directory.
# Build the site, then mirror public/ to the server over SSH.
hugo && rsync -rtuv \
	--delete-after \
	--exclude='not-hugo/' \
	public/ foo@remotemachine:/path/to/website/directory/

The script generates the static site with hugo and then calls rsync with the following options:

  1. -r: Copy directories recursively.
  2. -t: Preserve modification times.
  3. -u: Skip files that are newer on the destination than on the source.
  4. -v: Be verbose.
  5. --delete-after: Once the transfer is done, delete files on the destination that no longer exist on the source.
  6. --exclude='not-hugo/': Keeps rsync from transferring or deleting certain files; in this example, the not-hugo directory on the server is left untouched.
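Since --delete-after removes files on the server, it’s worth previewing what rsync would do by adding -n (dry run) to the same command:

$ rsync -rtuvn --delete-after --exclude='not-hugo/' \
	public/ foo@remotemachine:/path/to/website/directory/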

And that’s it! Once I’ve made and tested my changes to the site, I just call zoop.sh, type my passphrase and the site is updated securely and efficiently :)