[1016] in Coldmud discussion meeting
[COLD] Reducing backups to 1 second, ALWAYS ... (or there abouts)
daemon@ATHENA.MIT.EDU (Tue Jun 11 20:15:13 1996
)
Date: Tue, 11 Jun 1996 17:53:45 -0600 (MDT)
From: Brandon Gillespie <brandon@tombstone.sunrem.com>
To: coldstuff@cold.org
Ok, in the never ending quest for reducing the time of a backup I believe
Miro (Jenner) has found the best solution. To keep myself from
forgetting it I'm posting here (and for feedback).
Basically, right now the driver has done _virtually_ all it can do to
reduce the time involved in a backup. What occurs now is:
* driver goes atomic
* db sync -- objects in memory are written to disk, usually takes
microseconds.
* binary db copy -- copy entire db to a new location, this is the
slowdown--right now its doing a raw read/write--which is about as
fast as we will ever get. The slowdown on backups as the current
time is one thing: disk/filesystem speed.
* driver returns to normal operation
The reason you are atomic is you must keep the db syncronized on disk
during the entire copying process, if you dont it will change halfway
through the backup and you will have a corrupt backup.
Jenner suggests...
* driver goes atomic
* db sync to disk
* db becomes read-only, changed objects stay in ram-cache
* driver returns to normal operation
* fork binary db copy as another process, when it completes a signal
is sent to the driver.
* driver receives signal, sets the db as read/write again.
Because of this the time involved in the backup will ONLY depend upon how
large your ram-cache is, and how active your server is (a more active
server will have more out-of-sync objects in the ram-cache). Overall I
doubt it would take more than a second to sync the db except in the rare
situations.
There is one major problem: the ram-cache can reach capacity. You can
get by this by simply starting the driver with a larger ram cache--but
then you have the problem in that your server may eventually allocate
all of that cache. You will not be like a MOO, as after the backup was
finished the cache would be pushed to disk and the empty holders (still
allocated) would be pushed to swap.
Another suggestion would be to have a regular cache size and a backup
cache size. The backup cache size would be several times larger (perhaps
even growing as needed). When the backup was completed it would reduce
the cache back to its normal size, writing objects to disk as needed. If
you are using phk_malloc() (or your OS malloc is very good), the object
cache holders would be free'd, thereby freeing up pages which are
RETURNED to the operating system, and your footprint suddenly shrinks.
-Brandon Gillespie-