Package: dump
Version: 0.4b44-1
Severity: important
Hello,
I am running a restore, and here is output from top:
  PID USER PR NI  VIRT  RES SHR S %CPU %MEM     TIME+ COMMAND
11070 root 20  0 7549m 4.2g 284 D  6.3 53.7 831:32.08 restore
The only reason the RSS is so much lower than the VSS is that part of
it has swapped out.
Here's what happens.
On restore, it reads the first 2.7GB or so of the backup, by that
point allocating around 800MB of RAM. It then spends a long time
creating all the directories in the backup set, and as it does so the
RAM usage gradually increases to many GBs. Once it starts creating
files, the RAM is up north of 5GB. As it extracts the dump, the RAM
continues to climb. I had to move the restore process to a different
machine than the server from which it was made, because that system
had only (!) 4GB RAM. Extracting over NFS is working, so far, but
this system has 8GB RAM and the restore is only about 2/3 done at this
point.
The dump in question was made from a filesystem containing 1.8TB of
BackupPC data across about 24 million inodes. BackupPC works with a
hardlink farm, and every backup has a directory skeleton created
(though only the "full" backups have files hardlinked into the storage
pool).
I didn't specifically watch while dump was running, but I would have
noticed if it had tried to allocate 8GB of RAM.
In addition, restore filled up /tmp because it tried to put 2.8GB
there, causing issues with other programs running on the system.
(Worked around with -T)
I think some people may face a situation where a backup is
unrestorable because the restore process demands a far beefier system
than the backup process does!
-- System Information:
Debian Release: 7.1
APT prefers stable
APT policy: (500, 'stable')
Architecture: amd64 (x86_64)
Foreign Architectures: i386
Kernel: Linux 3.2.0-4-amd64 (SMP w/4 CPU cores)
Locale: LANG=en_US.UTF-8, LC_CTYPE=en_US.UTF-8 (charmap=UTF-8)
Shell: /bin/sh linked to /bin/dash
Versions of packages dump depends on:
ii e2fslibs 1.42.5-1.1
ii libblkid1 2.20.1-5.3
ii libc6 2.13-38
ii libcomerr2 1.42.5-1.1
ii libncurses5 5.9-10
ii libreadline6 6.2+dfsg-0.1
ii libselinux1 2.1.9-5
ii libuuid1 2.20.1-5.3
ii tar 1.26+dfsg-0.1
dump recommends no packages.
dump suggests no packages.
-- no debconf information
Acknowledgement sent
to Elliott Mitchell <[email protected]>:
Extra info received and forwarded to list. Copy sent to Alexander Zangerl <[email protected]>.
(Fri, 30 Jun 2017 03:15:02 GMT)
Perhaps there should be a caution about filesystems with large numbers
of i-nodes. Notice the numbers provided work out to just under 330 bytes
for every i-node.
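The per-i-node figure can be checked with a few lines of Python. This is only a sketch of the arithmetic: it uses the VIRT value from the `top` output and the "about 24 million inodes" figure from the report, and it assumes that the `m` suffix in `top` means MiB. The function name is made up for illustration.

```python
# Figures from the original report.
VIRT_MIB = 7549          # "7549m" in the VIRT column of top
INODES = 24_000_000      # "about 24 million inodes"


def bytes_per_inode(virt_mib: int, inodes: int) -> float:
    """Average virtual memory per i-node, assuming top's 'm' suffix is MiB."""
    return virt_mib * 1024 * 1024 / inodes


print(f"{bytes_per_inode(VIRT_MIB, INODES):.1f} bytes per i-node")
```

Since the restore was only about 2/3 done when the sample was taken, the final figure per i-node may well end up higher.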
The current `restore` program no longer acts as the traditional 4.4BSD
`restore` did. Instead of restoring a near-exact image of the filesystem
to a clean filesystem, it has to remap each file to a new i-node. I
think this behavior is a *vast* improvement, but it means large numbers
of i-nodes result in large memory consumption during restore.
In order to perform this task, `restore` has to generate a huge table to
map old i-node numbers to new filenames. 330 bytes per i-node isn't too
bad as far as this goes. Perhaps some optimization can be done, but with
this many i-nodes you're simply bumping into the problem of how small a
hash table or tree can be while still performing the needed function.
(geeze, memory and processor power are so cheap nowadays...)
--
(\___(\___(\______ --=> 8-) EHM <=-- ______/)___/)___/)
\BS ( | [email protected] PGP 87145445 | ) /
\_CS\ | _____ -O #include <stddisclaimer.h> O- _____ | / _/
8A19\___\_|_/58D2 7E3D DDF4 7BA6 <-PGP-> 41D1 B375 37D0 8714\_|_/___/5445
Acknowledgement sent
to Elliott Mitchell <[email protected]>:
Extra info received and forwarded to list. Copy sent to Alexander Zangerl <[email protected]>.
(Tue, 18 Jul 2017 04:00:03 GMT)
I should mention a reasonable alternative method.
I /think/ it should be reasonable to replace restoresymtable with a
small directory tree. The first level of directories would correspond to
the first digit of a restored file's old i-node number, and each directory
would contain a hard link to the new file.
Say i-nodes 0001, 0002, 1000, 1001, and 2134 were restored; the links might be:
0/001
0/002
1/000
1/001
2/134
When the file replacing i-node 0001 is created, a link to the newly
created file is added to the FS. Depending upon the number of
i-nodes a few levels of directories might be needed. The point is to
replace the gigantic symbol table hash which needs to fit in memory, with
a directory tree which the new filesystem can hopefully handle reasonably
efficiently. Directories would need symbolic links instead of hard
links.
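The layout described above can be sketched in Python. This is not code from `restore`; the function names, the `width` and `levels` parameters, and the `inode-links` directory name are all hypothetical. It only handles regular files via hard links; directories, which would need symbolic links, are left out.

```python
import os
import tempfile


def link_path(root, old_ino, width=4, levels=1):
    """Path for old_ino: one directory per leading digit, the rest is the link name."""
    if not 0 < levels < width:
        raise ValueError("levels must be between 1 and width - 1")
    digits = f"{old_ino:0{width}d}"
    return os.path.join(root, *digits[:levels], digits[levels:])


def record(root, old_ino, restored_file, **layout):
    """Hard-link an already restored regular file under its old i-node number."""
    path = link_path(root, old_ino, **layout)
    os.makedirs(os.path.dirname(path), exist_ok=True)
    os.link(restored_file, path)
    return path


def lookup(root, old_ino, **layout):
    """Return the link path for old_ino, or None if it was never recorded."""
    path = link_path(root, old_ino, **layout)
    return path if os.path.lexists(path) else None


# Demo in a scratch directory; hard links require one filesystem for both ends.
with tempfile.TemporaryDirectory() as fs:
    table = os.path.join(fs, "inode-links")
    restored = os.path.join(fs, "restored-file")
    open(restored, "w").close()
    for ino in (1, 2, 1000, 1001, 2134):
        record(table, ino, restored)
    print(sorted(os.path.relpath(os.path.join(d, f), table)
                 for d, _, files in os.walk(table) for f in files))
```

The trade-off is that every lookup becomes a filesystem operation on the target, so memory use drops at the cost of extra metadata I/O and extra directory entries during the restore.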
--
(\___(\___(\______ --=> 8-) EHM <=-- ______/)___/)___/)
\BS ( | [email protected] PGP 87145445 | ) /
\_CS\ | _____ -O #include <stddisclaimer.h> O- _____ | / _/
8A19\___\_|_/58D2 7E3D DDF4 7BA6 <-PGP-> 41D1 B375 37D0 8714\_|_/___/5445
Acknowledgement sent
to Alexander Zangerl <[email protected]>:
Extra info received and forwarded to list. Copy sent to Alexander Zangerl <[email protected]>.
(Thu, 24 Aug 2017 14:09:03 GMT)
On Mon, 17 Jul 2017 20:55:04 -0700, Elliott Mitchell writes:
>I should mention a reasonable alternative method.
i'm not sure i'd want to go to a dir tree for this; if
one just wants to handle level 0 backups then my understanding
is that the restoresymtable could be skipped completely.
there's no code for such a skip at this time, but i suspect that
that would be a cheap remedy - at least for full backups.
>I /think/ it should be reasonable to replace restoresymtable with a
>small directory tree.
right now i don't see why the symtable cannot be dumped incrementally
instead of being held in memory.
regards
az
--
Alexander Zangerl + GPG Key 2FCCF66BB963BD5F + http://snafu.priv.at/
"It's a pity that punched card equipment is now almost all gone. There's
nothing better for grabbing a tie and breaking the wearer's neck."
-- Mike Andrews
Acknowledgement sent
to Elliott Mitchell <[email protected]>:
Extra info received and forwarded to list. Copy sent to Alexander Zangerl <[email protected]>.
(Thu, 24 Aug 2017 19:30:03 GMT)
Subject: Re: Bug#726731: #726731: dump: Huge RAM usage on restore
Date: Thu, 24 Aug 2017 12:10:55 -0700
On Thu, Aug 24, 2017 at 11:52:26PM +1000, Alexander Zangerl wrote:
> On Mon, 17 Jul 2017 20:55:04 -0700, Elliott Mitchell writes:
> >I should mention a reasonable alternative method.
>
> i'm not sure i'd want to go to a dir tree for this; if
> one just wants to handle level 0 backups then my understanding
> is that the restoresymtable could be skipped completely.
>
> there's no code for such a skip at this time, but i suspect that
> that would be a cheap remedy - at least for full backups.
That would be mode -x or -X, which extracts files rather than
attempting to do a full restore. That certainly has benefits if you're
merely trying to retrieve copies of files from a dump. Since the original
bug specifically mentioned "restore", implying mode -r, that doesn't sound
like an acceptable resolution (though possibly acceptable to the original
reporter).
> >I /think/ it should be reasonable to replace restoresymtable with a
> >small directory tree.
>
> right now i don't see why the symtable cannot be dumped incrementally
> instead of being held in memory.
If the i-nodes in a dump can be *guaranteed* to be in-order then such
should be possible. I'm guessing right now either hash table(s) or a
tree is being used, since an old i-node number needs to be mapped to a
new i-node number/file. If they're guaranteed in-order then that can be
optimized to a list-type structure with a provision for skipping ahead
quickly.
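As an illustration of such a list-type structure, here is a minimal Python sketch. It assumes i-nodes arrive in strictly increasing order and that only an old-to-new i-node number mapping is needed, which is a simplification of whatever `restore` actually stores. The class and method names are hypothetical. Binary search stands in for the "skipping ahead" provision.

```python
from array import array
from bisect import bisect_left


class InOrderMap:
    """Append-only old -> new i-node map; keys must arrive in increasing order."""

    def __init__(self):
        # Two packed arrays of unsigned 64-bit integers, kept in parallel.
        self.old = array("Q")
        self.new = array("Q")

    def append(self, old_ino, new_ino):
        if self.old and old_ino <= self.old[-1]:
            raise ValueError("i-nodes must arrive in strictly increasing order")
        self.old.append(old_ino)
        self.new.append(new_ino)

    def get(self, old_ino):
        """Binary search over the sorted keys; None if old_ino is absent."""
        i = bisect_left(self.old, old_ino)
        if i < len(self.old) and self.old[i] == old_ino:
            return self.new[i]
        return None


m = InOrderMap()
for old, new in [(2, 100), (5, 101), (9, 102)]:
    m.append(old, new)
print(m.get(5), m.get(6))
```

Each entry costs two packed integers rather than a hash-table node, but the structure breaks down as soon as a dump delivers i-nodes out of order, which is exactly the guarantee in question.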
I'm pretty sure e2dumpfs upholds this guarantee, but do *all* dump
implementations uphold this guarantee? (ideally `restore` would be able
to handle foreign dumps)
Speaking of which, I'm inclined to suggest `dump` should have a -t option
similar to the -t option of `mount`. Alas -t for `restore` has already
been allocated for a behavior like `tar`'s -t behavior.
--
(\___(\___(\______ --=> 8-) EHM <=-- ______/)___/)___/)
\BS ( | [email protected] PGP 87145445 | ) /
\_CS\ | _____ -O #include <stddisclaimer.h> O- _____ | / _/
8A19\___\_|_/58D2 7E3D DDF4 7BA6 <-PGP-> 41D1 B375 37D0 8714\_|_/___/5445