Discussion:
Proper way to specify Java build-dep
Matthias Klose
2008-05-07 08:00:20 UTC
Greetings,
My package babel recently closed bug 477845 by changing a build
dependency from java-gcj-compat to default-jdk-builddep. Unfortunately,
this made it FTBFS on hppa, s390, arm and alpha, where before it built
everywhere except arm.
What is the right thing to do in this situation? Do I exclude those
architectures from the package build list? Is there a "java-works"
architecture variable which will automatically add arches as their java
implementations work?
[Please CC me in replies.]
These arches currently don't work, right. I'm testing a solution for
this. In general just B-D on default-jdk-builddep is okay.
it is only correct if your package doesn't build other architecture
dependent packages, which is not the case for babel.

the build failure on s390 is unexpected; is it possible to extract a
test case?
Just curious, what will happen regarding testing transitions in the
meantime? And long-term, can I assume that the package will build and
transition as soon as the infrastructure is in place, without another
upload?
the infrastructure is in place; the packages can move to testing once
the hppa, arm and alpha binaries are not in unstable anymore.

Matthias
--
To UNSUBSCRIBE, email to debian-java-***@lists.debian.org
with a subject of "unsubscribe". Trouble? Contact ***@lists.debian.org
Bastian Blank
2008-05-07 09:40:09 UTC
Package: libc6
Version: 2.7-10
Severity: important
Post by Matthias Klose
the build failure on s390 is unexpected; is it possible to extract a
test case?
| java: pthread_mutex_lock.c:71: __pthread_mutex_lock: Assertion `mutex->__data.__owner == 0' failed.

So another package failed about that (after mono and libto$bla). It
looks like a race condition somewhere in the libpthread.

Bastian
--
The more complex the mind, the greater the need for the simplicity of play.
-- Kirk, "Shore Leave", stardate 3025.8
Aurelien Jarno
2008-05-10 22:30:11 UTC
Post by Bastian Blank
Package: libc6
Version: 2.7-10
Severity: important
Post by Matthias Klose
the build failure on s390 is unexpected; is it possible to extract a
test case?
| java: pthread_mutex_lock.c:71: __pthread_mutex_lock: Assertion `mutex->__data.__owner == 0' failed.
So another package failed about that (after mono and libto$bla). It
looks like a race condition somewhere in the libpthread.
Looking quickly at the code the problem is that LLL_MUTEX_LOCK (mutex)
fails to acquire the mutex. It can be a bug in atomic.h or a bug in the
futexes implementation of the kernel.

It would be nice to have an strace of the problem to see the futex
syscall before this assertion.

Also a small testcase of the problem would be really helpful to debug
it.
--
.''`. Aurelien Jarno | GPG: 1024D/F1BCDB73
: :' : Debian developer | Electrical Engineer
`. `' ***@debian.org | ***@aurel32.net
`- people.debian.org/~aurel32 | www.aurel32.net
Julien Danjou
2008-10-25 17:30:31 UTC
Post by Aurelien Jarno
Looking quickly at the code the problem is that LLL_MUTEX_LOCK (mutex)
fails to acquire the mutex. It can be a bug in atomic.h or a bug in the
futexes implementation of the kernel.
It would be nice to have an strace of the problem to see the futex
syscall before this assertion.
Here's what I can get from #468793.
In this test, if the number of threads is <= 2, it's ok.
With something like ./tchmttest typical casket 3 1000 1000 it fails 50%
of the time.

I've tried to strace the test but unfortunately when stracing,
everything is fine.
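
A reduced test case along the lines requested earlier in the thread might
look like the sketch below. It is not taken from tchmttest or from
#468793: the names stress_state, stress_worker and run_stress are
hypothetical, and whether such a small loop triggers the assertion on
s390 is an untested assumption. It only exercises the same pattern, three
or more threads contending on one default mutex.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

/* Hypothetical shared state: one mutex guarding one counter. */
struct stress_state {
    pthread_mutex_t lock;
    long counter;
    int iterations;
};

/* Each worker locks, increments and unlocks `iterations` times. */
static void *stress_worker(void *arg)
{
    struct stress_state *st = arg;
    for (int i = 0; i < st->iterations; i++) {
        int rc = pthread_mutex_lock(&st->lock);
        assert(rc == 0);
        st->counter++;
        rc = pthread_mutex_unlock(&st->lock);
        assert(rc == 0);
    }
    return NULL;
}

/* Runs `nthreads` workers; returns the final counter, or -1 on failure. */
long run_stress(int nthreads, int iterations)
{
    struct stress_state st = { .counter = 0, .iterations = iterations };

    if (nthreads <= 0 || pthread_mutex_init(&st.lock, NULL) != 0)
        return -1;

    pthread_t *tids = malloc(sizeof(*tids) * (size_t)nthreads);
    if (tids == NULL) {
        pthread_mutex_destroy(&st.lock);
        return -1;
    }

    int started = 0;
    for (; started < nthreads; started++) {
        if (pthread_create(&tids[started], NULL, stress_worker, &st) != 0)
            break;
    }
    /* Join only the threads that were actually created. */
    for (int i = 0; i < started; i++)
        pthread_join(tids[i], NULL);

    free(tids);
    pthread_mutex_destroy(&st.lock);
    return started == nthreads ? st.counter : -1;
}
```

Called as run_stress(3, 1000) from a small main and built with
gcc -pthread, a working libpthread should return 3000 every time. An
abort inside pthread_mutex_lock would point at the same assertion
without involving Java or Tokyo Cabinet.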

Is there anything from an outsider that could help?

Cheers,
--
Julien Danjou
.''`. Debian Developer
: :' : http://julien.danjou.info
`. `' http://people.debian.org/~acid
`- 9A0D 5FD9 EB42 22F6 8974 C95C A462 B51E C2FE E5CD
Carlos O'Donell
2008-10-27 13:20:05 UTC
Post by Julien Danjou
Is there anything from an outsider that could help?
I've seen this on-and-off again on the hppa-linux port. The issue has,
in my experience, been a compiler problem. My standard operating
procedure is to methodically add volatile to the atomic.h operations
until it goes away, and then work out the compiler mis-optimization.

The bug is almost always a situation where the lll_unlock is scheduled
before owner = 0, and the assert catches the race condition where you
unlock but have not yet cleared the owner.
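
The ordering described above can be illustrated with a toy lock. This is
a sketch using C11 atomics, not glibc's code: struct toy_mutex,
toy_trylock and toy_unlock are hypothetical names, and the real mutex
layout and lll_* macros differ. The point is only that the owner field
has to be cleared before the store that releases the lock word.

```c
#include <assert.h>
#include <stdatomic.h>

/* Hypothetical, simplified stand-in for a mutex; not the glibc layout. */
struct toy_mutex {
    atomic_int lock;  /* 0 = free, 1 = held */
    int owner;        /* id of the holder, 0 when free */
};

/* Try to take the lock; returns 1 on success, 0 if it is already held. */
int toy_trylock(struct toy_mutex *m, int tid)
{
    int expected = 0;
    if (!atomic_compare_exchange_strong_explicit(&m->lock, &expected, 1,
                                                 memory_order_acquire,
                                                 memory_order_relaxed))
        return 0;
    /* Same kind of check as the failing assertion: a new holder
     * expects the previous holder to have cleared the owner field. */
    assert(m->owner == 0);
    m->owner = tid;
    return 1;
}

/* Release the lock: clear the owner first, then publish lock = 0. */
void toy_unlock(struct toy_mutex *m)
{
    m->owner = 0;
    /* The release store keeps the owner write from being ordered after
     * the unlock, for both the compiler and the hardware. */
    atomic_store_explicit(&m->lock, 0, memory_order_release);
}
```

If the two statements in toy_unlock were swapped, or the release store
were downgraded to a plain store that the compiler could reorder,
another thread could win the compare-and-swap while owner is still
non-zero and trip the assert in toy_trylock.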

$0.02.

Cheers,
Carlos.
--
To UNSUBSCRIBE, email to debian-bugs-dist-***@lists.debian.org
with a subject of "unsubscribe". Trouble? Contact ***@lists.debian.org
Andrew Haley
2008-10-27 14:20:12 UTC
Post by Carlos O'Donell
Post by Julien Danjou
Is there anything from an outsider that could help?
I've seen this on-and-off again on the hppa-linux port. The issue has,
in my experience, been a compiler problem. My standard operating
procedure is to methodically add volatile to the atomic.h operations
until it goes away, and then work out the compiler mis-optimization.
The bug is almost always a situation where the lll_unlock is scheduled
before owner = 0, and the assert catches the race condition where you
unlock but have not yet cleared the owner.
Are you sure this is a compiler problem? Unless you use explicit atomic
memory accesses or volatile the compiler is supposed to re-order memory
access. Perhaps I'm misunderstanding you.

Andrew.
Carlos O'Donell
2008-10-27 14:30:11 UTC
Post by Andrew Haley
Post by Carlos O'Donell
I've seen this on-and-off again on the hppa-linux port. The issue has,
in my experience, been a compiler problem. My standard operating
procedure is to methodically add volatile to the atomic.h operations
until it goes away, and then work out the compiler mis-optimization.
The bug is almost always a situation where the lll_unlock is scheduled
before owner = 0, and the assert catches the race condition where you
unlock but have not yet cleared the owner.
Are you sure this is a compiler problem? Unless you use explicit atomic
memory accesses or volatile the compiler is supposed to re-order memory
access. Perhaps I'm misunderstanding you.
Sorry, parsing the above statement requires knowing something about
how lll_unlock is implemented in glibc.

The lll_unlock function is supposed to be a memory barrier.

The function is usually an explicit atomic operation, or a volatile
asm implementing the futex syscall i.e. INTERNAL_SYSCALL macro.

Cheers,
Carlos.
Andrew Haley
2008-10-27 15:40:15 UTC
Post by Carlos O'Donell
Post by Andrew Haley
Post by Carlos O'Donell
I've seen this on-and-off again on the hppa-linux port. The issue has,
in my experience, been a compiler problem. My standard operating
procedure is to methodically add volatile to the atomic.h operations
until it goes away, and then work out the compiler mis-optimization.
The bug is almost always a situation where the lll_unlock is scheduled
before owner = 0, and the assert catches the race condition where you
unlock but have not yet cleared the owner.
Are you sure this is a compiler problem? Unless you use explicit atomic
memory accesses or volatile the compiler is supposed to re-order memory
access. Perhaps I'm misunderstanding you.
Sorry, parsing the above statement requires knowing something about
how lll_unlock is implemented in glibc.
The lll_unlock function is supposed to be a memory barrier.
The function is usually an explicit atomic operation, or a volatile
asm implementing the futex syscall i.e. INTERNAL_SYSCALL macro.
I understand all that, but the question still stands: is the compiler
really moving a memory write past a memory barrier? ISTR we did have
a discussion on gcc-list about that, but it was a while ago and should
now be fixed.

Andrew.
Carlos O'Donell
2008-10-27 16:30:28 UTC
Post by Andrew Haley
I understand all that, but the question still stands: is the compiler
really moving a memory write past a memory barrier? ISTR we did have
a discussion on gcc-list about that, but it was a while ago and should
now be fixed.
This issue no longer affects the PA port, but I can't speak for s390.

The PA port is the only port for which I do regular gcc / glibc testing.

Cheers,
Carlos.