Project:Infrastructure/Experiments/Rsync-on-Cloud

From Gentoo Wiki
In the future it may be necessary to run rsync on cloud infrastructure; this experiment tested the cost and feasibility of doing so.

Gentoo has a fairly large rsync network, mostly run by the community. It's likely too large (and always was): it was designed circa 2002, when bandwidth was very costly and having a low-latency mirror near users was highly beneficial.

The experiment

The goal was to run an rsync node as cheaply as possible. To do this, we chose GCP (because Antarus works there) and chose the smallest node we could get away with. Circa 2019 on GCP this was an n1-standard-1 (1 vCPU, 3GB of RAM). We ran this experiment for 3 months (all bills were paid by Alec, FWIW). In the end the costs were ~100$ / month.

Code repository for the experiment can be found here.

The setup

We built a container that, on startup, loads the rsync data from the master-mirror and then offers rsync service. A background task in the container syncs a 'shadow' copy of the rsync tree every interval (afaik it was 30m) and then atomically swaps it with the tree offered by the rsync service, so the container always serves a consistent, up-to-date tree to users. This necessitated the n1-standard-1 sizing, as we needed 3GB of RAM to fit two copies of the rsync tree in memory. The rsync tree fits entirely in memory and a single vCPU can sustain many concurrent clients (likely 50-100 easily on a single core).
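The shadow-copy swap described above can be sketched as follows. This is a minimal illustration, not the code from the experiment's repository: it assumes the rsync service exports a path that is a symlink, and that the live and shadow trees are separate directories. The function name `swap_served_tree` and all paths are hypothetical, and Python is used only for illustration.

```python
import os


def swap_served_tree(new_tree: str, served_link: str) -> None:
    """Repoint served_link (a symlink) at new_tree in one atomic step.

    Assumption: the rsync service exports served_link, not the tree
    directories themselves.
    """
    tmp_link = f"{served_link}.tmp.{os.getpid()}"
    # Clear a stale temporary link left behind by an interrupted run.
    if os.path.lexists(tmp_link):
        os.unlink(tmp_link)
    os.symlink(new_tree, tmp_link)
    # rename(2) replaces the old symlink atomically on POSIX, so a new
    # connection resolves either the old tree or the new one.
    os.replace(tmp_link, served_link)
```

A refresh loop would sync into whichever directory is not currently served, then call `swap_served_tree` on it.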

This (fairly ephemeral) container ran in the GCP us-central1 region behind a layer 4 load balancer with an external static IPv4 address. The address was published in public DNS in the Americas to attract traffic from Gentoo users in the US.

The conclusion

The experiment ran for 3 months and cost 320$, so approximately 100$ / month. About 2/3 of the cost went to the IPv4 address and egress / ingress. In GCP, just having a layer 4 load balancer has a minimum cost per month (18$), and there is an ingress cost after some amount of traffic (we never exceeded this; rsync is dominated by egress in any case). The egress cost was around 120$ and we egressed 1TB of traffic during the 3 month period, for an average cost of around 12 cents per egress GB.
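The arithmetic behind these figures can be reproduced with a short sketch. It uses only the numbers quoted above and assumes 1TB is counted as 1000 GB; the variable names are illustrative.

```python
# Figures quoted in the text (US dollars, 3 month experiment).
TOTAL_COST = 320.0
EGRESS_COST = 120.0
LB_MINIMUM_PER_MONTH = 18.0
MONTHS = 3
EGRESS_GB = 1000.0  # assumption: 1TB counted as 1000 GB

monthly_cost = TOTAL_COST / MONTHS                     # about 107
egress_cost_per_gb = EGRESS_COST / EGRESS_GB           # about 0.12
lb_share = LB_MINIMUM_PER_MONTH * MONTHS / TOTAL_COST  # about 0.17
egress_share = EGRESS_COST / TOTAL_COST                # about 0.38

print(f"average monthly cost: {monthly_cost:.0f}$")
print(f"egress cost per GB:   {egress_cost_per_gb:.2f}$")
print(f"load balancer share:  {lb_share:.0%}")
print(f"egress share:         {egress_share:.0%}")
```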

While we could slim the service down more (we don't actually *need* the L4 load balancer, so we could cut 20% of cost there), it seemed unlikely we could scale this to the entire rsync service cheaply. Napkin math suggested the normal rsync service did between 3-5TB per month globally (recall that this experiment only targeted the Americas). At 12 cents per egress GB, that would cost up to around 600$ / month solely in egress network cost, which is more expensive than the value such a service might add over our existing deployment.
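The napkin math for the global projection looks roughly like this. The sketch assumes a flat 12 cents per GB with no tiered discounts and decimal terabytes; `monthly_egress_cost` is a hypothetical helper name.

```python
EGRESS_COST_PER_GB = 0.12  # dollars, the average observed in the experiment
GB_PER_TB = 1000           # assumption: decimal terabytes


def monthly_egress_cost(tb_per_month: float) -> float:
    """Estimated monthly egress bill in dollars at a flat per-GB rate."""
    return tb_per_month * GB_PER_TB * EGRESS_COST_PER_GB


low = monthly_egress_cost(3)   # lower end of the 3-5TB estimate
high = monthly_egress_cost(5)  # upper end of the 3-5TB estimate
print(f"estimated egress cost: {low:.0f}$ to {high:.0f}$ per month")
```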