| Summary: | modprobe-small: incorrect alias-based probe with dep_bb_fd < 0 | ||
|---|---|---|---|
| Product: | Busybox | Reporter: | Jiri J. <jirij.jabb> |
| Component: | Other | Assignee: | unassigned |
| Status: | RESOLVED FIXED | ||
| Severity: | normal | CC: | busybox-cvs |
| Priority: | P3 | ||
| Version: | unspecified | ||
| Target Milestone: | --- | ||
| Hardware: | PC | ||
| OS: | Linux | ||
| Host: | Target: | ||
| Build: | |||
| Attachments: |
modprobe debug log, ro filesystem, probes ata_generic.ko
modprobe debug log, rw filesystem, probes ata_piix.ko
modules.dep.bb from a-z ordered /lib/modules
modules.dep.bb from z-a ordered /lib/modules
Added some behaviors for modprobe |
||
I have to admit, I was wrong about the first/last match, both pick up the first match due to
    if (result) {
        i++;
        continue;
    }
which throws me into even greater mystery, since both cases should get the same final module. So the bug is most likely somewhere else.
It might be unrelated, however it might also help:
Both the directory search (dirscan) and module body searches (filename order) are done in reverse when the filesystem is mounted read-write. It appears to be normal (a->z) when the filesystem is read-only. This also applies to depmod.
I.e. with a read-only filesystem, the search looks like
--------------------------------------------------
modprobe: 'kernel/arch/x86/kernel/scx200.ko' module name doesn't match
modprobe: 'kernel/crypto/aead.ko' module name doesn't match
modprobe: 'kernel/crypto/chainiv.ko' module name doesn't match
modprobe: 'kernel/crypto/crc32c.ko' module name doesn't match
modprobe: 'kernel/crypto/crypto.ko' module name doesn't match
...
modprobe: 'kernel/net/ipv6/sit.ko' module name doesn't match
modprobe: 'kernel/net/llc/llc.ko' module name doesn't match
modprobe: 'kernel/net/sctp/sctp.ko' module name doesn't match
modprobe: 'kernel/net/sunrpc/sunrpc.ko' module name doesn't match
modprobe: 'kernel/net/sunrpc/xprtrdma/svcrdma.ko' module name doesn't match
modprobe: 'kernel/net/sunrpc/xprtrdma/xprtrdma.ko' module name doesn't match
modprobe: dirscan complete
modprobe: find_alias('pci:v00008086d00007010sv00000000sd00000000bc01sc01i80')
modprobe: opened modules.dep.bb.new:-1
modprobe: parse_module('kernel/arch/x86/kernel/scx200.ko')
modprobe: alias:'pci:v0000100Bd00000515sv*sd*bc*sc*i*'
...
modprobe: alias:'symbol:scx200_gpio_base'
modprobe: parse_module('kernel/crypto/aead.ko')
modprobe: alias:'symbol:crypto_alloc_aead'
...
modprobe: dep:'crypto_algapi crypto'
modprobe: parse_module('kernel/crypto/chainiv.ko')
modprobe: dep:'crypto_algapi rng crypto_blkcipher'
modprobe: parse_module('kernel/crypto/crc32c.ko')
modprobe: dep:'crypto_hash'
modprobe: parse_module('kernel/crypto/crypto.ko')
modprobe: alias:'symbol:crypto_has_alg'
modprobe: alias:'symbol:crypto_destroy_tfm'
--------------------------------------------------
while on the read-write filesystem (where modules.dep.bb can be created), it's
--------------------------------------------------
modprobe: 'kernel/net/sunrpc/xprtrdma/xprtrdma.ko' module name doesn't match
modprobe: 'kernel/net/sunrpc/xprtrdma/svcrdma.ko' module name doesn't match
modprobe: 'kernel/net/sunrpc/sunrpc.ko' module name doesn't match
modprobe: 'kernel/net/sctp/sctp.ko' module name doesn't match
modprobe: 'kernel/net/llc/llc.ko' module name doesn't match
modprobe: 'kernel/net/ipv6/sit.ko' module name doesn't match
modprobe: 'kernel/net/ipv6/ipv6.ko' module name doesn't match
modprobe: 'kernel/net/ipv4/tunnel4.ko' module name doesn't match
...
modprobe: 'kernel/crypto/chainiv.ko' module name doesn't match
modprobe: 'kernel/crypto/aead.ko' module name doesn't match
modprobe: 'kernel/arch/x86/kernel/scx200.ko' module name doesn't match
modprobe: dirscan complete
modprobe: find_alias('pci:v00008086d00007010sv00000000sd00000000bc01sc01i80')
modprobe: opened modules.dep.bb.new:3
modprobe: parse_module('kernel/net/sunrpc/xprtrdma/xprtrdma.ko')
modprobe: dep:'sunrpc ib_core rdma_cm'
modprobe: grow stringbuf to 149
modprobe: parse_module('kernel/net/sunrpc/xprtrdma/svcrdma.ko')
modprobe: dep:'ib_core sunrpc rdma_cm'
modprobe: parse_module('kernel/net/sunrpc/sunrpc.ko')
modprobe: alias:'symbol:rpc_call_null'
--------------------------------------------------
Failed to reproduce this. For me, both RO and RW filesystems return modules in the same order, and
modprobe pci:v00008086d00007010sv00000000sd00000000bc01sc01i80
loads ata_generic.ko in both cases. This is on reiser3. What filesystem do you use?
You can check the order of readdir() with the find applet. It lists the files using the same internal routine, recursive_action(), which is used by modprobe. Just run "find /lib/modules/n.n.n". What do you see when it is mounted RO and when it is mounted RW?
First of all, yes, it's true that "find" has the same syndrome, but at least I know why.
Let's assume I already have all modules in "modules" and "find modules/" returns them in alphabetical (a->z) order.
I do the following:
mkdir mods0 mods1 mods2 mods3 mods4 mods5 mods6 mods7
cp -r modules/* mods0/.
cp -r mods0/* mods1/.
cp -r mods1/* mods2/.
cp -r mods2/* mods3/.
cp -r mods3/* mods4/.
cp -r mods4/* mods5/.
cp -r mods5/* mods6/.
cp -r mods6/* mods7/.
Modules in mods0 will be in reversed (z->a) order, while the a->z order will be restored in mods1, broken in mods2, restored in mods3, etc.
I guess the alphabet doesn't even matter, the thing that matters is storage order.
Since I can't reproduce it on my desktop OS, it's probably filesystem-specific. The filesystem used here is tmpfs, however I've disabled shmem in the kernel (EMBEDDED->SHMEM), so it's in fact ramfs, as it says:
The shmem is an internal filesystem used to manage shared memory.
It is backed by swap and manages resource limits. It is also exported
to userspace as tmpfs if TMPFS is enabled. Disabling this
option replaces shmem and tmpfs with the much simpler ramfs code,
which may be appropriate on small systems without swap.
Which might make sense, "cp" copies "a...." files as first and "z...." as last, while the storage is done the other way, so "z...." is on the top.
I guess it shouldn't matter that much on an embedded system, though modprobe-small.c seems to disagree with me.
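The storage-order effect described above can also be checked without the find applet. The following is a minimal sketch that prints entries in the raw order readdir() hands them back; the helper name list_readdir_order is hypothetical and is not part of busybox, and unlike recursive_action() it does not descend into subdirectories.

```c
#include <assert.h>
#include <dirent.h>
#include <stdio.h>
#include <string.h>

/* Print the entries of one directory in the raw order readdir()
 * returns them (no sorting). "." and ".." are skipped.
 * Returns the number of entries printed, or -1 if the directory
 * cannot be opened. Hypothetical helper, not busybox code. */
int list_readdir_order(const char *path)
{
    DIR *dir;
    struct dirent *ent;
    int count = 0;

    assert(path != NULL);
    dir = opendir(path);
    if (dir == NULL)
        return -1;

    while ((ent = readdir(dir)) != NULL) {
        if (strcmp(ent->d_name, ".") == 0 || strcmp(ent->d_name, "..") == 0)
            continue;
        printf("%s\n", ent->d_name);
        count++;
    }
    closedir(dir);
    return count;
}
```

Calling it on the same module subdirectory in mods0 and mods1 should show whether the two copies really come back in opposite orders on a given filesystem.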
Anyway,
> loads ata_generic.ko in both cases.
.. which is wrong.
I don't have any hardware ata_generic.ko might drive, moreover:
# grep 00007010 /mnt/lib/modules/2.6.29.6/modules.alias
alias pci:v00008086d00007010sv*sd*bc*sc*i* piix
alias pci:v00008086d00007010sv*sd*bc*sc*i* ata_piix
The funny thing is, that it works with modules.dep.bb generated from reverse-ordered modules directory.
I'll post both my modules.dep.bb files (from both trees - generated by depmod -n) as soon as possible.
Created attachment 669 [details]
modprobe debug log, ro filesystem, probes ata_generic.ko
Created attachment 671 [details]
modprobe debug log, rw filesystem, probes ata_piix.ko
In the end, it really doesn't matter whether the filesystem is read-only or not, that's why I attached the old logs (promised in the first post).
(sorry for the gzip format, size limit is in here)
Created attachment 673 [details]
modules.dep.bb from a-z ordered /lib/modules
Created attachment 675 [details]
modules.dep.bb from z-a ordered /lib/modules
Both depmod outputs (modules.dep.bb) are pretty much the same, the alias belongs to ata_piix.ko in both files, however modprobe-small probably parses them in a wrong way.
>> loads ata_generic.ko in both cases.
>.. which is wrong.
>I don't have any hardware ata_generic.ko might drive, moreover:
>
># grep 00007010 /mnt/lib/modules/2.6.29.6/modules.alias
>alias pci:v00008086d00007010sv*sd*bc*sc*i* piix
>alias pci:v00008086d00007010sv*sd*bc*sc*i* ata_piix
pci:v00008086d00007010sv00000000sd00000000bc01sc01i80 matches
in ata_piix: alias=pci:v00008086d00007010sv*sd*bc*sc*i*
in ata_generic: alias=pci:v*d*sv*sd*bc01sc01i*
In order to make modprobe-small prefer ata_piix, we need to introduce the concept of "better" and "worse" matches. Any ideas what this weight might be?
>in ata_generic:
>alias=pci:v*d*sv*sd*bc01sc01i*
Yeah, I was aware of such case, however I haven't checked it, sorry.
The algorithm for finding "the better case" should IMHO match the more exact case. In this example, ata_generic is clearly a "fallback" - anything else should be preferred, if the alias matches.
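That both aliases match the same modalias string can be confirmed with plain fnmatch(). This is only a sketch: the wrapper name alias_matches is hypothetical, and passing 0 as the flags argument is an assumption rather than something taken from modprobe-small.c.

```c
#include <assert.h>
#include <fnmatch.h>
#include <stddef.h>

/* Return 1 if the modalias string matches the alias pattern, else 0.
 * fnmatch() returns 0 on a match. Flags are 0 here, which is an
 * assumption and not necessarily what modprobe-small uses. */
int alias_matches(const char *pattern, const char *modalias)
{
    assert(pattern != NULL && modalias != NULL);
    return fnmatch(pattern, modalias, 0) == 0;
}
```

With the modalias from this report, both the ata_piix pattern and the ata_generic fallback pattern match, which is why the first module scanned wins.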
Some ideas I have right now are based on
1) string length - fallback matches have shorter alias names,
due to the asterisks
2) asterisk count - more exact matches are more likely to have
fewer asterisks
Using those (+ something else), we could create an algorithm which decides whether the new match is more exact than the old one. It's not so simple, since pci aliases aren't the only ones supposed to use asterisks (like pnp:dYMH0021* versus pnp:dYMH0021 which would score 1:1 from the above ideas).
I don't know how this is done in the normal modutils, whether there's some sort of a "database" with calculated score for a given alias (which we can't afford) or something like that.
> Some ideas I have right now are based on
3) asterisk processing - how many characters does an asterisk
replace? A smaller value should
indicate a more exact match.
I took a look at fnmatch() flags, none of them looks interesting though.
Since we don't have any custom fnmatch implementation, we would need to write the algorithm on our own.
I've tried to write down an algorithm that does the "3rd idea". I ended up doing substring matches and counting characters back to the last match, when I realized all this can be done much more easily.
Consider the following example:
#include <stdio.h>
#include <string.h>

#define ASTERISK_CHR '*'

/* Return how many characters of in_alias are covered by the
 * asterisks in cmp_alias (cmp_alias is assumed to match in_alias). */
int alias_match(const char *in_alias,
                const char *cmp_alias)
{
    const char *pos;
    int count;

    /* count asterisks */
    count = 0;
    pos = strchr(cmp_alias, ASTERISK_CHR);
    while (pos != NULL)
    {
        pos = strchr(pos+1, ASTERISK_CHR);
        count++;
    }

    /* calculate wildcard replacement:
     * input length minus the literal (non-asterisk) characters */
    count = (int)strlen(in_alias) - ((int)strlen(cmp_alias) - count);
    return count;
}

int main(int argc, char **argv)
{
    int num1;
    int num2;

    /* 1: input, 2: old alias, 3: new alias to compare */
    if (argc < 4)
        return 1;

    /* if the number of asterisk-replaced characters
     * in the new alias is lesser, use it, otherwise
     * use the old alias (this includes num1==num2) */
    num1 = alias_match(argv[1], argv[2]);
    num2 = alias_match(argv[1], argv[3]);
    printf("use %s\n", num1 > num2 ? argv[3] : argv[2]);
    return 0;
}
$ ./a.out \
> v00008086d00007010sv00000000sd00000000bc01sc01i80 \
> v*d*sv*sd*bc01sc01i* \
> v00008086d00007010sv*sd*bc*sc*i*
use v00008086d00007010sv*sd*bc*sc*i*
I believe it should be accurate enough for the rare cases where two or more aliases match the provided pattern.
If you disagree, you can simply modify the function to return the original asterisk count as well and do some more checks.
module-init-tools should be suffering from the same problem.
In their case, the order will be controlled by modules.alias.
Whether ata_piix or ata_generic will be first in modules.alias is determined by readdir() order in which depmod reads modules.
IOW: it also will be basically random.
Can you test this theory?
(In reply to comment #13)
> module-init-tools should be suffering from the same problem.
>
> In their case, the order will be controlled by modules.alias.
>
> Whether ata_piix or ata_generic will be first in modules.alias
> is determined by readdir() order in which depmod reads modules.
>
> IOW: it also will be basically random.
>
> Can you test this theory?
>
Indeed I can. And I did.
It's true that readdir() affects them as well, all three files (.alias, .dep, .symbols) are in reverse when depmod is run on "copied" modules directory.
Now the interesting stuff - modprobe seems to load everything that matches the alias. Both ata_generic and ata_piix gets loaded, regardless of line order in any of those files. I can confirm it on my libata-using server machine - ata_generic is loaded even when it's unused (I can safely rmmod it).
Perhaps a config option allowing use of the above algorithm (for more memory-constrained systems) would be good in the modprobe-small case.
> Now the interesting stuff - modprobe seems to load everything that matches the
> alias. Both ata_generic and ata_piix gets loaded, regardless of line order in
> any of those files.
Is the *order* in which they are insmod'ed matches modules.alias order?
I guess if ata_generic is insmod'ed first, it will "take" the device, and later ata_piix will just hang around, with no devices to handle.
This may be a satisfactory outcome, though... would it be satisfactory for you?
(In reply to comment #15)
> Is the *order* in which they are insmod'ed matches modules.alias order?
I believe so. Their order seems to be shifted in lsmod output.
> I guess if ata_generic is insmod'ed first, it will "take" the device, and later
> ata_piix will just hang around, with no devices to handle.
I wouldn't say that. I can freely modprobe ata_generic (by hand) before modprobing ata_piix and it won't hurt anything. I guess ata_generic won't even recognize it. However, you're right that ata_generic might take over it in other cases, that's why some people had to blacklist ata_generic on their desktop machines to get the maximum speed benefit.
> This may be a satisfactory outcome, though... would it be satisfactory for you?
As for me, I'm going to use the normal module-init-tools. I have several reasons for that: nice lsmod output, blacklist support, squashfs-compressed modules.* files being almost the same size as compressed modules.dep.bb (~900 bytes), rmmod not recursively removing modules, and so on. It isn't going to be a really embedded OS, users will need to interact with it using the ash shell, so I'm going with the more user-friendly way, even if it adds ~1K to the binary.
My opinion is that the "simplified modutils" should be used mainly on embedded devices, where the manufacturer can simply put modprobes in a startup script (since he knows the exact hardware) without needing some kind of uevent probing script.
For now, I added a comment in modprobe-small.c explaining the problem.
So what is the status of priorities in modprobe? Are there any plans to extend modprobe functionality (e.g. classifying aliases into categories - the most important being those with VID/DID, then the more generic ones)? Or even an option to load all modules that correspond to one alias?
Created attachment 2209 [details]
Added some behaviors for modprobe
I wrote a patch that adds behaviors other than 'load first matched' to modprobe. One new criterion is the longest modalias; another is the modalias with the longest prefix without wildcards (which should be perfect for PCI/USB devices, since they have VID/DID at the start of the alias).
Possibly it isn't optimal - but it's working.
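The "longest prefix without wildcards" criterion can be illustrated roughly as follows. This is a sketch of the idea only, not code from the attached patch; the function names literal_prefix_len and pick_more_specific are hypothetical, and both patterns are assumed to have already matched the requested modalias.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Length of the leading part of an alias pattern that contains no
 * fnmatch wildcard characters ('*', '?' or '[').
 * Hypothetical helper, not taken from the attached patch. */
size_t literal_prefix_len(const char *pattern)
{
    assert(pattern != NULL);
    return strcspn(pattern, "*?[");
}

/* Of two patterns that both matched, return the one with the longer
 * wildcard-free prefix. On a tie the first argument is kept, so the
 * existing "first match" behavior is preserved. */
const char *pick_more_specific(const char *a, const char *b)
{
    return literal_prefix_len(b) > literal_prefix_len(a) ? b : a;
}
```

For the two aliases in this report the ata_piix pattern wins, because its prefix runs through the vendor and device IDs while the ata_generic pattern hits a wildcard right after "pci:v".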
Well, the solution needed here is to closely follow module-init-tools.
There is a comment in modprobe_small.c about the exact nature
of the problem:
/*
 * Given modules definition and module name (or alias, or symbol)
 * load/remove the module respecting dependencies.
 * NB: also called by depmod with bogus name "/",
 * just in order to force modprobe.dep.bb creation.
 */
static void process_module(char *name, const char *cmdline_options)
{
    module_info *info;
    ...
    ...
    if (!module_count) {
        /* Scan module directory. This is done only once.
         * It will attempt module load, and will exit(EXIT_SUCCESS)
         * on success. */
        ...
        dbg1_error_msg("dirscan complete");
        /* Module was not found, or load failed, or is_rmmod */
        if (module_found_idx >= 0) { /* module was found */
            info = &modinfo[module_found_idx];
        } else { /* search for alias, not a plain module name */
            info = find_alias(name);
        }
    } else {
        info = find_alias(name);
    }
    // Problem here: there can be more than one module
    // for the given alias. For example,
    // "pci:v00008086d00007010sv00000000sd00000000bc01sc01i80" matches
    // ata_piix because it has an alias "pci:v00008086d00007010sv*sd*bc*sc*i*"
    // and ata_generic, it has an alias "alias=pci:v*d*sv*sd*bc01sc01i*"
    // Standard modprobe would load them both.
    // In this code, find_alias() returns only the first matching module.
That's where the problem is: find_alias() returns one module_info;
but in order to mimic module-init-tools, it needs to return
*a list* of matching module_info's.
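A rough sketch of what "returning a list" could look like is below. It is not the busybox implementation: struct mod_entry is a simplified, hypothetical stand-in for module_info (one alias per module), find_all_aliases is an invented name, and the fnmatch() flags of 0 are an assumption.

```c
#include <assert.h>
#include <fnmatch.h>
#include <stddef.h>

/* Simplified stand-in for busybox's module_info (hypothetical layout,
 * with a single alias pattern per module for brevity). */
struct mod_entry {
    const char *pathname;
    const char *alias;
};

/* Store pointers to every entry whose alias pattern matches name,
 * in scan order, up to max entries. Returns how many were stored.
 * Unlike a first-match search, the scan does not stop at the first hit. */
size_t find_all_aliases(const struct mod_entry *mods, size_t n,
                        const char *name,
                        const struct mod_entry **out, size_t max)
{
    size_t found = 0;
    size_t i;

    assert(mods != NULL && name != NULL && out != NULL);
    for (i = 0; i < n && found < max; i++) {
        if (fnmatch(mods[i].alias, name, 0) == 0)
            out[found++] = &mods[i];
    }
    return found;
}
```

The caller would then load (or remove) each returned module in turn, which mirrors the module-init-tools behavior of loading both ata_generic and ata_piix.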
rmmod should not require module directory
When I have my system partition (containing the modules directory) unmounted, I can't use rmmod. But if I create an empty folder in place of the module directory, I'm able to use rmmod (the module is loaded in memory)... So I think a small fix should be made to ignore the error return if the directory doesn't exist.
Fixed in git:
commit 07e5555a8f7469f6f45cacd7fc188816ae644f74
Author: Denys Vlasenko <vda.linux@googlemail.com>
Date: Mon Apr 21 16:59:36 2014 +0200
modprobe-small: (un)load all modules which match the alias, not only first one
Closes 627 and 7034.
Commonly seen case is (un)loading of an alias which matches ata_generic and a more specific ata module. For example:
modprobe [-r] pci:v00008086d00007010sv00000000sd00000000bc01sc01i80 (ata_generic and pata_acpi)
modprobe [-r] pci:v00001106d00000571sv00001509sd00009022bc01sc01i8a (ata_generic and pata_via) |
Hello,
I've built a small linux-based distribution where I mount /lib/modules as a squashfs read-only filesystem and I've finally tracked down the issue of loading ata_generic instead of ata_piix both in QEMU and on a real machine.
It's because modprobe can't open /lib/modules/* for write, resulting in dep_bb_fd == -1 (and thus < 0), which apparently isn't a fatal case for find_alias() as its author was aware of such a case:
    if (result && dep_bb_fd < 0)
        return result;
This indeed returns the result as soon as possible, because further searching and parsing wouldn't make sense when we can't write the modules.dep.bb file. In the other case, however, the loop finishes, find_alias() writes a new modules.dep.bb file and returns LAST result found.
Two or more modules with the same alias shouldn't even exist (correct me if I'm wrong), so this shouldn't be an issue. Unfortunately, it is. I did a grep on several parts of the alias and only ata_piix.ko matches.
I'm able to attach - on request - three gzipped logs of "modprobe pci:v00008086d00007010sv00000000sd00000000bc01sc01i80" with empty /proc/modules and nonexistent modules.dep.bb on a:
- read-only filesystem (which incorrectly loads ata_generic)
- read-write filesystem (which correctly loads ata_piix)
- read-only filesystem with deleted ata_generic (loads ata_piix)
A snippet of the first case:
--------------------------------------------------
modprobe: alias:'pci:v00008086d00002653sv*sd*bc*sc*i*'
modprobe: alias:'pci:v00008086d00002652sv*sd*bc*sc*i*'
modprobe: grow stringbuf to 4553
modprobe: dep:'libata'
modprobe: parse_module('kernel/drivers/ata/ata_generic.ko')
modprobe: alias:'pci:v*d*sv*sd*bc01sc01i*'
modprobe: alias:'pci:v00001179d00000105sv*sd*bc*sc*i*'
modprobe: alias:'pci:v00001179d00000103sv*sd*bc*sc*i*'
modprobe: alias:'pci:v00001179d00000102sv*sd*bc*sc*i*'
modprobe: alias:'pci:v000016CAd00000001sv*sd*bc*sc*i*'
modprobe: alias:'pci:v00001045d0000C558sv*sd*bc*sc*i*'
modprobe: alias:'pci:v00001106d00000561sv*sd*bc*sc*i*'
modprobe: alias:'pci:v00003388d00008013sv*sd*bc*sc*i*'
modprobe: alias:'pci:v00001060d0000673Asv*sd*bc*sc*i*'
modprobe: alias:'pci:v00001060d0000886Asv*sd*bc*sc*i*'
modprobe: alias:'pci:v00001060d00000101sv*sd*bc*sc*i*'
modprobe: alias:'pci:v00009412d00006565sv*sd*bc*sc*i*'
modprobe: alias:'pci:v00001042d00003020sv*sd*bc*sc*i*'
modprobe: dep:'libata'
modprobe: found alias 'pci:v00008086d00007010sv00000000sd00000000bc01sc01i80' in module 'kernel/drivers/ata/ata_generic.ko'
modprobe: recurse on dep 'libata'
modprobe: process_module('libata','(null)')
modprobe: already_loaded:0 is_rmmod:0
modprobe: process_module('libata'): options:'(null)'
modprobe: find_alias('libata')
modprobe: found 'libata' in module 'kernel/drivers/ata/libata.ko'
modprobe: parse_module('kernel/drivers/ata/libata.ko')
modprobe: alias:'symbol:ata_cable_sata'
modprobe: alias:'symbol:ata_cable_ignore'
modprobe: alias:'symbol:ata_cable_unknown'
--------------------------------------------------
and the second case:
--------------------------------------------------
modprobe: alias:'pci:v00008086d00007111sv*sd*bc*sc*i*'
modprobe: alias:'pci:v00008086d00007111sv000015ADsd00001976bc*sc*i*'
modprobe: alias:'pci:v00008086d00007010sv*sd*bc*sc*i*'
modprobe: dep:'libata'
modprobe: found alias 'pci:v00008086d00007010sv00000000sd00000000bc01sc01i80' in module 'kernel/drivers/ata/ata_piix.ko'
modprobe: parse_module('kernel/drivers/ata/ata_generic.ko')
modprobe: alias:'pci:v*d*sv*sd*bc01sc01i*'
modprobe: alias:'pci:v00001179d00000105sv*sd*bc*sc*i*'
modprobe: alias:'pci:v00001179d00000103sv*sd*bc*sc*i*'
modprobe: alias:'pci:v00001179d00000102sv*sd*bc*sc*i*'
modprobe: alias:'pci:v000016CAd00000001sv*sd*bc*sc*i*'
modprobe: alias:'pci:v00001045d0000C558sv*sd*bc*sc*i*'
modprobe: alias:'pci:v00001106d00000561sv*sd*bc*sc*i*'
modprobe: alias:'pci:v00003388d00008013sv*sd*bc*sc*i*'
modprobe: alias:'pci:v00001060d0000673Asv*sd*bc*sc*i*'
modprobe: alias:'pci:v00001060d0000886Asv*sd*bc*sc*i*'
modprobe: alias:'pci:v00001060d00000101sv*sd*bc*sc*i*'
modprobe: alias:'pci:v00009412d00006565sv*sd*bc*sc*i*'
modprobe: alias:'pci:v00001042d00003020sv*sd*bc*sc*i*'
modprobe: dep:'libata'
modprobe: parse_module('kernel/drivers/ata/ahci.ko')
modprobe: alias:'pci:v*d*sv*sd*bc01sc06i01*'
modprobe: alias:'pci:v0000105Ad00003F20sv*sd*bc*sc*i*'
modprobe: alias:'pci:v000011ABd00006121sv*sd*bc*sc*i*'
--------------------------------------------------
I have to admit, I haven't found the real (code) reason for this bug, it might be a counter mismatch (off by one) or something similar.
target: i486-linux-uclibc-gcc
git: 2009-09-25 (4ea0ca8)
Thanks.