Bug#679811: another s390x dpkg-shlibdeps crash
Niko Tyni
2012-12-12 22:00:02 UTC
Another instance of this bug on s390x is at
https://buildd.debian.org/status/fetch.php?pkg=gimp&arch=s390x&ver=2.8.2-2&stamp=1353706105
*** glibc detected *** /usr/bin/perl: malloc(): memory corruption: 0x00000200004480b8 ***
[...]
dpkg-shlibdeps: error: dpkg-query --control-path libc6:s390x shlibs died from signal 6
Looking at this more systematically on buildd.debian.org, I went through
all the 291 s390x build logs of package versions that have had at least
two build attempts. I found a dozen matches for 'dpkg-shlibdeps: error':

y/yapet/0.8~pre2-1/s390x_1322439632_log.bz2:dpkg-shlibdeps: error: dpkg-query --control-path libgcc1 symbols died from signal 11
v/vips/7.28.2-1+b1/s390x_1333573321_log.bz2:dpkg-shlibdeps: error: dpkg-query --control-path libjpeg8:s390x symbols died from signal 11
t/timemachine/0.3.3-1/s390x_1322675528_log.bz2:dpkg-shlibdeps: error: dpkg-query --control-path libglib2.0-0 symbols died from signal 11
o/openscenegraph/3.0.1-3/s390x_1331737417_log.bz2:dpkg-shlibdeps: error: dpkg-query --control-path libc6 shlibs died from signal 6
g/gnat-gps/5.0-10/s390x_1337451822_log.bz2:dpkg-shlibdeps: error: objdump died from signal 11
g/gdal/1.9.0-3/s390x_1338917204_log.bz2:dpkg-shlibdeps: error: dpkg-query --control-path libdapclient3:s390x shlibs died from signal 11
f/firebird2.5/2.5.2~svn+54698.ds4-1/s390x_1341081509_log.bz2:dpkg-shlibdeps: error: dpkg-query --control-path libicu48:s390x symbols died from signal 11
e/evolution/3.2.2-1+b1/s390x_1336510533_log.bz2:dpkg-shlibdeps: error: dpkg-query --control-path libebook-1.2-12 symbols died from signal 6
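
For anyone wanting to repeat the search, here is a minimal Python sketch of
this kind of scan. It is not the script actually used for the list above. The
directory layout (one `<arch>_<timestamp>_log.bz2` file per build attempt
somewhere under a root directory) is an assumption inferred from the paths
shown, and the names `scan_text`, `scan_tree` and `summarize` are made up for
illustration. Signal numbers 6 and 11 are SIGABRT and SIGSEGV on Linux.

```python
import bz2
import re
from collections import Counter
from pathlib import Path

# Matches lines such as:
#   dpkg-shlibdeps: error: dpkg-query --control-path libc6 shlibs died from signal 6
#   dpkg-shlibdeps: error: objdump died from signal 11
CRASH_RE = re.compile(
    r"dpkg-shlibdeps: error: (?P<cmd>.+?) died from signal (?P<sig>\d+)"
)

# Linux signal numbers for the two values seen in the logs.
SIGNAL_NAMES = {6: "SIGABRT", 11: "SIGSEGV"}


def scan_text(text):
    """Return a (command, signal number) pair for each crash line in a log."""
    return [(m.group("cmd"), int(m.group("sig"))) for m in CRASH_RE.finditer(text)]


def scan_tree(root, arch="s390x"):
    """Yield (log path, command, signal) for each crash line under root.

    Assumes compressed logs are named <arch>_<timestamp>_log.bz2.
    """
    for path in sorted(Path(root).rglob(f"{arch}_*_log.bz2")):
        # Build logs may contain stray non-UTF-8 bytes, so decode leniently.
        with bz2.open(path, "rt", errors="replace") as fh:
            for cmd, sig in scan_text(fh.read()):
                yield path, cmd, sig


def summarize(hits):
    """Count hits per signal name, given (path, command, signal) triples."""
    return Counter(SIGNAL_NAMES.get(sig, f"signal {sig}") for _, _, sig in hits)
```

Calling `summarize(scan_tree(root))` on a local copy of the logs gives a quick
SIGABRT versus SIGSEGV breakdown; where such a copy lives is up to whoever
runs it.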

Not all of these contain a Perl backtrace, so I'm not sure whether this
is Perl itself crashing or something else. Two separate issues seem
improbable, though.

Applying the same method to all the other architectures, going back
180 days from now, gives two additional matches:

e/evolution/3.4.3-1/s390_1340147597_log.bz2:dpkg-shlibdeps: error: dpkg-query --control-path libnss3:s390 symbols died from signal 6
l/llvm-3.1/3.1-2/s390_1341191288_log.bz2:dpkg-shlibdeps: error: dpkg-query --control-path libstdc++6:s390 symbols died from signal 6

so this failure mode seems to be specific to s390* unless I goofed
up some way.

This isn't exhaustive as some of the failures (like shogun, mentioned
in this bug) were not given back before a new source upload superseded
them. I can't see any way to determine all failed builds on
buildd.debian.org without reading all the logs in; the best thing I
found is /srv/buildd.debian.org/db/log which I'm using as an index.

FWIW I can't see any clear correlation between the buildd host and the
failure rate so a hardware problem seems to be out of the picture.

Cc'ing the s390 list in case somebody has clever ideas. This is
unreproducible so far; if somebody can reproduce it reliably, please
take a snapshot of the whole system or something before it goes away.
--
Niko Tyni ***@debian.org
Philipp Kern
2012-12-13 11:30:01 UTC
Post by Niko Tyni
so this failure mode seems to be specific to s390* unless I goofed
up some way.
Yes, it is.
Post by Niko Tyni
This isn't exhaustive as some of the failures (like shogun, mentioned
in this bug) were not given back before a new source upload superseded
them. I can't see any way to determine all failed builds on
buildd.debian.org without reading all the logs in; the best thing I
found is /srv/buildd.debian.org/db/log which I'm using as an index.
I started a grep over all s390x build logs yesterday, and it completed
overnight.
Post by Niko Tyni
FWIW I can't see any clear correlation between the buildd host and the
failure rate so a hardware problem seems to be out of the picture.
Cc'ing the s390 list in case somebody has clever ideas. This is
unreproducible so far; if somebody can reproduce it reliably, please
take a snapshot of the whole system or something before it goes away.
abort()s don't leave coredumps as far as I know, and those problems are
unreproducible, as a given-back build usually works. Normally these
are one-offs. The build log grep seems to confirm that.

So we are talking about these:

./g/gimp/2.8.2-2/s390x_1353706105_log.bz2:*** glibc detected *** /usr/bin/perl: malloc(): memory corruption: 0x00000200004480b8 ***
./l/linux/3.2.30-1/s390x_1348704958_log.bz2:*** glibc detected *** /bin/sh: malloc(): memory corruption: 0x00000000800288e0 ***
./s/sofa-framework/1.0~beta4-6.1/s390x_1338312919_log.bz2:*** glibc detected *** /usr/bin/perl: malloc(): memory corruption (fast): 0x0000000083aebdd0 ***
./p/pdl/1:2.4.11-3/s390x_1338241813_log.bz2:*** glibc detected *** /usr/bin/perl: malloc(): memory corruption: 0x0000000080c28b60 ***
./p/pdl/1:2.4.11-3/s390x_1338241813_log.bz2:*** glibc detected *** /usr/bin/perl: malloc(): memory corruption: 0x0000000080c28dc0 ***
./p/pdl/1:2.4.11-4/s390x_1338393181_log.bz2:*** glibc detected *** /usr/bin/perl: malloc(): memory corruption: 0x0000000080c28b60 ***
./p/pdl/1:2.4.11-4/s390x_1338393181_log.bz2:*** glibc detected *** /usr/bin/perl: malloc(): memory corruption: 0x0000000080c28dc0 ***
./p/pdl/1:2.4.11-1/s390x_1338082289_log.bz2:*** glibc detected *** /usr/bin/perl: malloc(): memory corruption: 0x0000000080c28b40 ***
./p/pdl/1:2.4.11-1/s390x_1338082289_log.bz2:*** glibc detected *** /usr/bin/perl: malloc(): memory corruption: 0x0000000080c28d80 ***
./p/pdl/1:2.4.11-2/s390x_1338139001_log.bz2:*** glibc detected *** /usr/bin/perl: malloc(): memory corruption: 0x0000000080c28b40 ***
./p/pdl/1:2.4.11-2/s390x_1338139001_log.bz2:*** glibc detected *** /usr/bin/perl: malloc(): memory corruption: 0x0000000080c28d80 ***
./p/pdl/1:2.4.10+dfsg-1/s390x_1333684026_log.bz2:*** glibc detected *** /usr/bin/perl: malloc(): memory corruption: 0x0000000080c154f0 ***
./p/pdl/1:2.4.10+dfsg-1/s390x_1333684026_log.bz2:*** glibc detected *** /usr/bin/perl: malloc(): memory corruption: 0x0000000080c15670 ***
./h/haskell-attempt/0.3.1.1-1+b2/s390x_1328292955_log.bz2:*** glibc detected *** /usr/bin/perl: malloc(): memory corruption: 0x00000200001f7838 ***
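
The pdl entries repeat the same addresses across versions (2.4.11-3 and -4
share two, as do -1 and -2), which a small sketch like the following makes
easier to see. It assumes grep-style `path:line` output in exactly the shape
of the list above; the names `parse_line` and `versions_by_address` are
hypothetical and this is not the tooling used to produce the list.

```python
import re
from collections import defaultdict

# One grep hit: ./<letter>/<package>/<version>/<arch>_<stamp>_log.bz2:<glibc message>
LINE_RE = re.compile(
    r"^\./\w/(?P<pkg>[^/]+)/(?P<ver>[^/]+)/(?P<arch>[a-z0-9]+)_(?P<stamp>\d+)_log\.bz2:"
    r"\*\*\* glibc detected \*\*\* (?P<prog>\S+): (?P<msg>.+): (?P<addr>0x[0-9a-f]+) \*\*\*$"
)


def parse_line(line):
    """Split one grep hit into its fields, or return None if it does not match."""
    m = LINE_RE.match(line.strip())
    if m is None:
        return None
    rec = m.groupdict()
    rec["addr"] = int(rec["addr"], 16)  # keep the address as an integer
    return rec


def versions_by_address(lines):
    """Map (package, program) -> {address: set of versions reporting it}."""
    table = defaultdict(lambda: defaultdict(set))
    for line in lines:
        rec = parse_line(line)
        if rec is not None:
            table[(rec["pkg"], rec["prog"])][rec["addr"]].add(rec["ver"])
    return {key: dict(addrs) for key, addrs in table.items()}
```

Any address whose set holds more than one version recurred across separate
builds, which is what makes pdl stand out in this list.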

Indeed I saw it mostly with perl, but it's not confined to it. It doesn't
make me happy, but it happens quite rarely. pdl would probably give the
best chance of debugging this. 0x80c28d80 is on the heap in any case
(i.e. not an anonymous mmap); my assumption is still that something in the
allocator or the kernel is fishy.

In any case feel free to give back builds failing with this reason. That's
what I did so far.

Kind regards
Philipp Kern
