Bug#499833: [SOLVED] - chccwdev cannot set device offline in Lenny
Stephen Powell
2010-02-03 16:10:02 UTC
I finally found the cause of this pesky bug!
At some point, an "aptitude full-upgrade" seemed to fix this problem.
But not always.

The problem is that I could never come up with a consistent failure
scenario. Well, today, I finally did. It turns out
that, for me, the failure only occurs when using four device numbers:
0400, 0401, 0402, and 0403. When using any other device numbers,
everything works fine. Searching my machine, I found four mystery
files:

/etc/sysconfig/hardware/config-ccw-0.0.0400
/etc/sysconfig/hardware/config-ccw-0.0.0401
/etc/sysconfig/hardware/config-ccw-0.0.0402
/etc/sysconfig/hardware/config-ccw-0.0.0403

These were all empty files, zero bytes each, the kind one would get with
"touch" executed against a non-existent file name. There was another
file in the same directory,

/etc/sysconfig/hardware/config-ccw-0.0.0300

but this is for the (virtual) OSA card. It is not a DASD device. And its file
size is non-zero.

I'm not sure how these files got there. They may be leftovers from
a process that did not complete for some reason. Anyway, I erased the
files and now everything works as expected! The device does not
come online automatically anymore, and when I vary the device off,
it stays off! Hooray! Sorry for all your trouble.
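
For anyone who wants to check their own system for similar leftovers, here is a minimal Python sketch that lists the config-ccw-* files and their sizes. The directory and file-name pattern come from the paths above; the function name list_ccw_configs is made up for illustration and is not part of sysconfig-hardware.

```python
from pathlib import Path

PREFIX = "config-ccw-"


def list_ccw_configs(directory="/etc/sysconfig/hardware"):
    """Return (bus_id, size_in_bytes) for each config-ccw-* file, sorted by name."""
    entries = []
    for path in sorted(Path(directory).glob(PREFIX + "*")):
        if path.is_file():
            # The bus ID is whatever follows the "config-ccw-" prefix.
            bus_id = path.name[len(PREFIX):]
            entries.append((bus_id, path.stat().st_size))
    return entries


if __name__ == "__main__":
    for bus_id, size in list_ccw_configs():
        note = "empty" if size == 0 else "has content"
        print(f"{bus_id}\t{size}\t{note}")
```

The script only reads; whether a given file should be deleted is a decision for the administrator.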
--
To UNSUBSCRIBE, email to debian-bugs-dist-***@lists.debian.org
with a subject of "unsubscribe". Trouble? Contact ***@lists.debian.org
Frans Pop
2010-02-03 17:10:02 UTC
Post by Stephen Powell
I finally found the cause of this pesky bug!
At some point, an "aptitude full-upgrade" seemed to fix this problem.
That was probably at a time when sysconfig-hardware was broken... The next
fixed version of that package would have reintroduced your "problem".
Post by Stephen Powell
The problem is that I could never come up with a consistent failure
scenario. Well, today, I finally did. It turns out
that, for me, the failure only occurs when using four device numbers:
0400, 0401, 0402, and 0403. When using any other device numbers,
everything works fine. Searching my machine, I found four mystery
files:
They are not mystery files at all. They are part of sysconfig-hardware and
their exact purpose is to bring up devices during system boot!

They were almost certainly created when you installed the system because
you selected at that time to activate the devices.
Post by Stephen Powell
These were all empty files, zero bytes each, the kind one would get with
"touch" executed against a non-existent file name.
The files are essentially a trigger to bring the device up, so they don't
need any content. But they can contain configuration settings (depending
on the type of device).
Stephen Powell
2010-02-03 18:20:02 UTC
Post by Frans Pop
They are not mystery files at all. They are part of sysconfig-hardware and
their exact purpose is to bring up devices during system boot!
I'm sure that they are no mystery to you, but they *were* a mystery
to me until you explained their purpose.
I knew next to nothing about sysconfig-hardware, except that
it configured the OSA during boot. Now I know a little more.
Post by Frans Pop
They were almost certainly created when you installed the system because
you selected at that time to activate the devices.
That makes sense. That is how I did my first migration of data from
cdl minidisks to CMS minidisks: with the installer. However, after
migrating I deleted 0200-0203 and renamed 0400-0403 to 0200-0203.
This change was made in the CP directory entry for the virtual machine
in z/VM, without the knowledge of sysconfig-hardware.
So whenever a dynamically linked device showed up using one of the
device numbers 0400-0403, sysconfig-hardware was convinced it needed
to be brought online immediately! What I don't understand is why,
when it is *manually* varied offline *after* being brought online
automatically, sysconfig-hardware thinks that it has to be brought right
back online again! I'm not sure if this behavior is considered "working
as designed", but it's definitely not working as *desired*, at least not from my
point of view. Nevertheless, it's easy enough to circumvent. Just
delete the file.

My production minidisks, 0200-0203, are brought online automatically,
but in a different way. They are part of the "dasd" option passed to
the "dasd_mod" module via an "options" record in a file called
/etc/modprobe.d/dasd. This file is included in the initial RAM
file system image, of course. sysconfig-hardware is not needed to bring
these devices online. But if I *manually* take one of these devices
offline, it *stays* offline! It is only brought online *automatically*
during the boot process. (I can't get "/" offline while running,
of course, but I can get "/boot", "/home", and my swap partition offline
while running, if necessary.)
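
As an illustration of what such a "dasd" option covers, here is a minimal Python sketch that expands a device-number range into individual bus IDs. It assumes an options record along the lines of "options dasd_mod dasd=0200-0203" (the exact file contents are not shown in this thread), treats device numbers as hexadecimal, and assumes the "0.0." prefix seen in the config-ccw file names. The helper name expand_dasd_option is hypothetical, and keywords that the real parameter may accept are not handled.

```python
import re


def expand_dasd_option(value):
    """Expand a value such as '0200-0203,0300' into a list of bus IDs."""
    bus_ids = []
    for item in value.split(","):
        # Drop a trailing feature list such as "(ro)", if one is present.
        item = re.sub(r"\(.*\)$", "", item.strip())
        if not item:
            continue
        first, _, last = item.partition("-")
        # Accept both "0200" and "0.0.0200"; device numbers are hexadecimal.
        start = int(first.split(".")[-1], 16)
        end = int(last.split(".")[-1], 16) if last else start
        for devno in range(start, end + 1):
            bus_ids.append(f"0.0.{devno:04x}")
    return bus_ids


if __name__ == "__main__":
    print(expand_dasd_option("0200-0203"))
```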

All of this is how my Lenny system works. I haven't done a Squeeze
install yet. Not on the s390 platform, that is.

I will leave it up to your discretion whether to re-open this bug
and assign it to sysconfig-hardware (using a more appropriate title,
such as "device manually varied offline comes right back on again",
or some similar title), or to leave it closed. If this problem
has been fixed in Squeeze, then I'm content. And if it hasn't been,
I now know how to work around it.
Frans Pop
2010-02-03 19:30:01 UTC
Post by Stephen Powell
So whenever a dynamically linked device showed up using one of the
device numbers 0400-0403, sysconfig-hardware was convinced it needed
to be brought online immediately! What I don't understand is why,
when it is *manually* varied offline *after* being brought online
automatically, sysconfig-hardware thinks that it has to be brought right
back online again!
Because sysconfig-hardware gets triggered by udev when the kernel tells
udev there's a new device...

When you dynamically add hardware that way I guess it's a new device for
the kernel, just like inserting/removing/reinserting a USB stick in a PC.
Post by Stephen Powell
But if I *manually* take one of these devices offline, it *stays*
offline!
Because they don't disappear for the kernel.
Post by Stephen Powell
I will leave it up to your discretion whether to re-open this bug [...]
There is no bug. Everything works as designed.
Stephen Powell
2010-02-03 20:20:03 UTC
Post by Frans Pop
Because sysconfig-hardware gets triggered by udev when the kernel tells
udev there's a new device...
When you dynamically add hardware that way I guess it's a new device for
the kernel, just like inserting/removing/reinserting a USB stick in a PC.
Hmm. I'm not so sure about that. If I DETACHED the device and then
LINKED it again, I would certainly agree with you. It's a "new"
device. But taking the device offline is not the equivalent of
a DETACH and LINK sequence. In fact, taking the device offline
is a *prerequisite* for a DETACH, if you want to do it cleanly.
The CP DETACH command and the CP LINK command each cause "machine-check"
conditions for "channel report pending", and are handled by the
kernel's machine-check interrupt handler. Varying the device offline
does no such thing. It's just an internal state change in the DASD driver.
I still think this is a bug. It may not be a bug in sysconfig-hardware.
It may be a bug in udev, I don't know. Nevertheless, I still
think there's a bug *somewhere*.
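
The distinction drawn above, between a device that is offline but still present and one that has been detached, can be observed from user space. Here is a minimal Python sketch that reads the "online" attribute of a CCW device in sysfs. It assumes the usual /sys/bus/ccw/devices/<bus-id>/online layout; the function name ccw_state is made up for illustration.

```python
from pathlib import Path


def ccw_state(bus_id, sysfs_root="/sys/bus/ccw/devices"):
    """Return 'online', 'offline', or 'absent' for a CCW bus ID."""
    attr = Path(sysfs_root) / bus_id / "online"
    try:
        # The attribute holds "1" when the device is online, "0" otherwise.
        return "online" if attr.read_text().strip() == "1" else "offline"
    except FileNotFoundError:
        # No sysfs entry: the kernel does not know the device (e.g. detached).
        return "absent"


if __name__ == "__main__":
    for bus_id in ("0.0.0400", "0.0.0401", "0.0.0402", "0.0.0403"):
        print(bus_id, ccw_state(bus_id))
```

A device that has merely been varied offline should report "offline" rather than "absent", which is the point of the argument: it has not disappeared from the kernel's view.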

However, I do not wish to be contentious about this.
I know what caused it, and I know how to prevent it in the future.
And that's good enough for me. You have other matters
to attend to, I'm sure.

Cheers,
SMP
Frans Pop
2010-02-03 21:10:02 UTC
Post by Stephen Powell
It may not be a bug in sysconfig-hardware. It may be a bug in udev, I
don't know. Nevertheless, I still think there's a bug *somewhere*.
I'm fairly certain that it's not a bug in sysconfig-hardware, nor in
s390-tools, and probably also not in udev. *If* there is a bug (as opposed
to incorrect configuration of the system), the most likely
candidate is the kernel, which is where the triggers that set things in
motion come from.

If you want to pursue it further, you'll have to narrow it down and
show that something is actually not behaving correctly. A good starting
point would be to check what udev actions get generated when you do things
(using udevadm monitor).
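
To make that output easier to sift through, here is a minimal Python sketch that picks the events for one device out of captured udevadm monitor text. It assumes event lines of the form "SOURCE[timestamp] action devpath (subsystem)", where the source tag is KERNEL, UEVENT, or UDEV depending on the udev version. The device path in the example is invented for illustration, not captured from a real system, and the function names are hypothetical.

```python
import re

# One event line: source tag, timestamp, action, device path, subsystem.
EVENT_RE = re.compile(
    r"^(?P<source>KERNEL|UEVENT|UDEV)\s*"
    r"\[(?P<time>\d+\.\d+)\]\s+"
    r"(?P<action>\S+)\s+"
    r"(?P<devpath>\S+)\s+"
    r"\((?P<subsystem>[^)]+)\)\s*$"
)


def parse_event(line):
    """Return a dict of fields for an event line, or None for other lines."""
    match = EVENT_RE.match(line.strip())
    return match.groupdict() if match else None


def events_for(lines, bus_id):
    """Keep only events whose device path ends in the given bus ID."""
    events = []
    for line in lines:
        event = parse_event(line)
        if event and event["devpath"].endswith("/" + bus_id):
            events.append(event)
    return events


if __name__ == "__main__":
    # Invented example line; real paths and timestamps will differ.
    sample = ["KERNEL[1265228401.100000] add      /devices/css0/0.0.0004/0.0.0400 (ccw)"]
    for event in events_for(sample, "0.0.0400"):
        print(event["source"], event["action"], event["devpath"])
```

Comparing the actions logged when a device is varied offline with those logged for a DETACH followed by a LINK would show whether the kernel really announces a new device in the first case.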

Until you can identify that something is really wrong, there's no point (as
explained in earlier replies) in reopening the BR.