I tried to compile bird with uClibc. On a production machine it crashes with a segfault in malloc. Here is the backtrace from the coredump:

#0 0xa777b97d in malloc () from /.../libuClibc-0.9.33.2.so
#1 0x0807a856 in bird_xmalloc ()
#2 0x0807a2ad in mb_alloc ()
#3 0x0805bcb3 in bgp_get_bucket ()
#4 0x0805c84d in bgp_rt_notify ()
#5 0x0804a9ef in do_rt_notify ()
#6 0x0804aff3 in rt_notify_basic ()
#7 0x0804b116 in rte_announce ()
#8 0x0804b55c in rte_recalculate ()
#9 0x0804baf4 in rte_update2 ()
#10 0x0805ed62 in bgp_rx ()
#11 0x08074f7e in sk_read ()
#12 0x080756ad in io_loop ()
#13 0x0804a273 in main ()

I'll try to reproduce the crash with an unstripped build in the next few days.
Here are two backtraces:

#0 0xa77932ec in malloc (bytes=184) at libc/stdlib/malloc-standard/malloc.c:944
#1 0x080777f8 in bird_xmalloc (size=184) at xmalloc.c:29
#2 0x0807724f in mb_alloc (p=0x94597f8, size=168) at resource.c:339
#3 0x08058f75 in bgp_new_bucket (hash=20169, new=0xafdec0a0, p=0x9446680) at ../../../proto/bgp/attrs.c:734
#4 bgp_get_bucket (p=p@entry=0x9446680, n=n@entry=0x951a6a0, attrs=attrs@entry=0x94b4870, originate=0) at ../../../proto/bgp/attrs.c:861
#5 0x08059b0f in bgp_rt_notify (P=0x9446680, tbl=0x9443a58, n=0x951a6a0, new=0x94524c4, old=0x0, attrs=0x94b4870) at ../../../proto/bgp/attrs.c:939
#6 0x0804a55f in do_rt_notify (ah=ah@entry=0x9570738, net=net@entry=0x951a6a0, new=new@entry=0x94524c4, old=0x0, tmpa=0x0, refeed=0) at ../../nest/rt-table.c:346
#7 0x0804ab63 in rt_notify_basic (ah=ah@entry=0x9570738, net=net@entry=0x951a6a0, new=new@entry=0x94524c4, old=0x0, tmpa=0x0, refeed=0) at ../../nest/rt-table.c:393
#8 0x0804ac86 in rte_announce (tab=tab@entry=0x9443a58, type=type@entry=1, net=net@entry=0x951a6a0, new=0x94524c4, old=0x0, before_old=0x0, tmpa=0x0) at ../../nest/rt-table.c:580
#9 0x0804b0cc in rte_recalculate (ah=ah@entry=0x9548118, net=net@entry=0x951a6a0, new=0x94524c4, tmpa=0x0, src=0x944e7f4) at ../../nest/rt-table.c:886
#10 0x0804b664 in rte_update2 (ah=0x9548118, net=0x951a6a0, new=0x94524c4, src=0x944e7f4) at ../../nest/rt-table.c:1053
#11 0x0805c024 in bgp_rte_withdraw (src=<optimized out>, last_id=<optimized out>, path_id=<optimized out>, pxlen=<optimized out>, prefix=<optimized out>, p=<optimized out>) at ../../../proto/bgp/packets.c:1024
#12 bgp_do_rx_update (attr_len=<optimized out>, attrs=<optimized out>, nlri_len=24, nlri=0x94d82a5 "\024\260l`\030\301]\021\023\260l`\030\301]\020\030\301]\022\026\301]\020", '\377' <repeats 16 times>, withdrawn_len=<optimized out>, withdrawn=<optimized out>, conn=0x9447d7c) at ../../../proto/bgp/packets.c:1130
#13 bgp_rx_update (len=<optimized out>, pkt=<optimized out>, conn=0x9447d7c) at ../../../proto/bgp/packets.c:1303
#14 bgp_rx_packet (len=<optimized out>, pkt=<optimized out>, conn=0x9447d7c) at ../../../proto/bgp/packets.c:1524
#15 bgp_rx (sk=0x94c12d8, size=1448) at ../../../proto/bgp/packets.c:1569
#16 0x08071f65 in sk_read (s=0x94c12d8) at io.c:1734
#17 0x08072694 in io_loop () at io.c:1975
#18 0x08049dee in main (argc=2, argv=0xafdec794) at main.c:825

(gdb) bt
#0 0xa77aa817 in malloc (bytes=184) at libc/stdlib/malloc-standard/malloc.c:1153
#1 0x080777f8 in bird_xmalloc ()
#2 0x0807724f in mb_alloc ()
#3 0x08058f75 in bgp_get_bucket ()
#4 0x08059b0f in bgp_rt_notify ()
#5 0x0804a55f in do_rt_notify ()
#6 0x0804ab63 in rt_notify_basic ()
#7 0x0804ac86 in rte_announce ()
#8 0x0804b0cc in rte_recalculate ()
#9 0x0804b664 in rte_update2 ()
#10 0x0805c024 in bgp_rx ()
#11 0x08071f65 in sk_read ()
#12 0x08072694 in io_loop ()
#13 0x08049dee in main ()

The coredump from the first trace is attached. Bird compiled with glibc doesn't crash.
Coredump: ftp://seti.kr.ua/core.tbz
Here is the valgrind log:

# valgrind --tool=memcheck --track-origins=yes ./bird -d
==12995== Memcheck, a memory error detector
==12995== Copyright (C) 2002-2013, and GNU GPL'd, by Julian Seward et al.
==12995== Using Valgrind-3.10.1 and LibVEX; rerun with -h for copyright info
==12995== Command: ./bird -d
==12995==
==12995== Syscall param socketcall.sendto(msg) points to uninitialised byte(s)
==12995==    at 0x402AA6B: __socketcall (in /lib/libuClibc-0.9.33.2.so)
==12995==    by 0x40864F4: sendto (socketcalls.c:520)
==12995==    by 0x8075539: nl_send (netlink.c:92)
==12995==    by 0x8075584: nl_request_dump (netlink.c:110)
==12995==    by 0x8076498: kif_do_scan (netlink.c:566)
==12995==    by 0x8072A3C: kif_start (krt.c:191)
==12995==    by 0x804F5DC: proto_rethink_goal (proto.c:641)
==12995==    by 0x804F8E6: protos_commit (proto.c:580)
==12995==    by 0x807003F: config_do_commit (conf.c:255)
==12995==    by 0x80701D9: config_commit (conf.c:348)
==12995==    by 0x8049DD7: main (main.c:814)
==12995==  Address 0xaebb8a4d is on thread 1's stack
==12995==  in frame #3, created by nl_request_dump (netlink.c:99)
==12995==  Uninitialised value was created by a stack allocation
==12995==    at 0x8075560: nl_request_dump (netlink.c:99)
==12995==
==12995== Syscall param socketcall.setsockopt(optval) points to uninitialised byte(s)
==12995==    at 0x402AA6B: __socketcall (in /lib/libuClibc-0.9.33.2.so)
==12995==    by 0x4086558: setsockopt (socketcalls.c:549)
==12995==    by 0x8070CCD: sk_setup (io.c:1156)
==12995==    by 0x8071B8F: sk_open (io.c:1358)
==12995==    by 0x806153F: ospf_sk_open (iface.c:111)
==12995==    by 0x806153F: ospf_iface_add (iface.c:470)
==12995==    by 0x8051A55: olock_run_event (locks.c:175)
==12995==    by 0x80708D1: ev_run_list (event.c:135)
==12995==    by 0x80722E6: io_loop (io.c:1848)
==12995==    by 0x8049DED: main (main.c:825)
==12995==  Address 0xaebb88f6 is on thread 1's stack
==12995==  in frame #2, created by sk_setup (io.c:1128)
==12995==  Uninitialised value was created by a stack allocation
==12995==    at 0x8070C58: sk_setup (io.c:1132)
==12995==
==12995== Syscall param sendmsg(msg.msg_name) points to uninitialised byte(s)
==12995==    at 0x402AA6B: __socketcall (in /lib/libuClibc-0.9.33.2.so)
==12995==    by 0x4086470: sendmsg (socketcalls.c:472)
==12995==    by 0x80713BF: sk_sendmsg (io.c:1523)
==12995==    by 0x80713BF: sk_maybe_write (io.c:1602)
==12995==    by 0x805FCF4: ospf_send_to (packet.c:565)
==12995==    by 0x805FFD9: ospf_hello_send (hello.c:364)
==12995==    by 0x80610E0: ospf_iface_sm (iface.c:385)
==12995==    by 0x80616A1: ospf_iface_add (iface.c:489)
==12995==    by 0x8051A55: olock_run_event (locks.c:175)
==12995==    by 0x80708D1: ev_run_list (event.c:135)
==12995==    by 0x80722E6: io_loop (io.c:1848)
==12995==    by 0x8049DED: main (main.c:825)
==12995==  Address 0xaebb8810 is on thread 1's stack
==12995==  in frame #2, created by sk_maybe_write (io.c:1568)
==12995==  Uninitialised value was created by a stack allocation
==12995==    at 0x8071280: sk_maybe_write (io.c:1568)
==12995==
==12995== Invalid write of size 1
==12995==    at 0x8059002: bgp_new_bucket (attrs.c:754)
==12995==    by 0x8059002: bgp_get_bucket (attrs.c:861)
==12995==    by 0x8059B0E: bgp_rt_notify (attrs.c:939)
==12995==    by 0x804A55E: do_rt_notify (rt-table.c:346)
==12995==    by 0x804AB62: rt_notify_basic (rt-table.c:393)
==12995==    by 0x804C53C: do_feed_baby (rt-table.c:1776)
==12995==    by 0x804C53C: rt_feed_baby (rt-table.c:1827)
==12995==    by 0x804E87B: proto_feed_more (proto.c:940)
==12995==    by 0x80708D1: ev_run_list (event.c:135)
==12995==    by 0x80722E6: io_loop (io.c:1848)
==12995==    by 0x8049DED: main (main.c:825)
==12995==  Address 0x419c530 is 0 bytes after a block of size 168 alloc'd
==12995==    at 0x400F14D: malloc (in /usr/lib/valgrind/vgpreload_memcheck-x86-linux.so)
==12995==    by 0x80777F7: bird_xmalloc (xmalloc.c:29)
==12995==    by 0x807724E: mb_alloc (resource.c:339)
==12995==    by 0x8058F74: bgp_new_bucket (attrs.c:734)
==12995==    by 0x8058F74: bgp_get_bucket (attrs.c:861)
==12995==    by 0x8059B0E: bgp_rt_notify (attrs.c:939)
==12995==    by 0x804A55E: do_rt_notify (rt-table.c:346)
==12995==    by 0x804AB62: rt_notify_basic (rt-table.c:393)
==12995==    by 0x804C53C: do_feed_baby (rt-table.c:1776)
==12995==    by 0x804C53C: rt_feed_baby (rt-table.c:1827)
==12995==    by 0x804E87B: proto_feed_more (proto.c:940)
==12995==    by 0x80708D1: ev_run_list (event.c:135)
==12995==    by 0x80722E6: io_loop (io.c:1848)
==12995==    by 0x8049DED: main (main.c:825)
==12995==
==12995== Invalid read of size 1
==12995==    at 0x40135F7: bcmp (in /usr/lib/valgrind/vgpreload_memcheck-x86-linux.so)
==12995==    by 0x804D57A: adata_same (route.h:441)
==12995==    by 0x804D57A: ea_same (rt-attr.c:503)
==12995==    by 0x8058E6E: bgp_get_bucket (attrs.c:837)
==12995==    by 0x8059B0E: bgp_rt_notify (attrs.c:939)
==12995==    by 0x804A55E: do_rt_notify (rt-table.c:346)
==12995==    by 0x804AB62: rt_notify_basic (rt-table.c:393)
==12995==    by 0x804C53C: do_feed_baby (rt-table.c:1776)
==12995==    by 0x804C53C: rt_feed_baby (rt-table.c:1827)
==12995==    by 0x804E87B: proto_feed_more (proto.c:940)
==12995==    by 0x80708D1: ev_run_list (event.c:135)
==12995==    by 0x80722E6: io_loop (io.c:1848)
==12995==    by 0x8049DED: main (main.c:825)
==12995==  Address 0x419d638 is 0 bytes after a block of size 168 alloc'd
==12995==    at 0x400F14D: malloc (in /usr/lib/valgrind/vgpreload_memcheck-x86-linux.so)
==12995==    by 0x80777F7: bird_xmalloc (xmalloc.c:29)
==12995==    by 0x807724E: mb_alloc (resource.c:339)
==12995==    by 0x8058F74: bgp_new_bucket (attrs.c:734)
==12995==    by 0x8058F74: bgp_get_bucket (attrs.c:861)
==12995==    by 0x8059B0E: bgp_rt_notify (attrs.c:939)
==12995==    by 0x804A55E: do_rt_notify (rt-table.c:346)
==12995==    by 0x804AB62: rt_notify_basic (rt-table.c:393)
==12995==    by 0x804C53C: do_feed_baby (rt-table.c:1776)
==12995==    by 0x804C53C: rt_feed_baby (rt-table.c:1827)
==12995==    by 0x804E87B: proto_feed_more (proto.c:940)
==12995==    by 0x80708D1: ev_run_list (event.c:135)
==12995==    by 0x80722E6: io_loop (io.c:1848)
==12995==    by 0x8049DED: main (main.c:825)
==12995==
==12995== Invalid read of size 4
==12995==    at 0x805993C: bgp_encode_attrs (attrs.c:607)
==12995==    by 0x805B28D: bgp_create_update (packets.c:356)
==12995==    by 0x805B28D: bgp_fire_tx (packets.c:671)
==12995==    by 0x805B4BB: bgp_kick_tx (packets.c:714)
==12995==    by 0x80708D1: ev_run_list (event.c:135)
==12995==    by 0x80722E6: io_loop (io.c:1848)
==12995==    by 0x8049DED: main (main.c:825)
==12995==  Address 0x419c530 is 0 bytes after a block of size 168 alloc'd
==12995==    at 0x400F14D: malloc (in /usr/lib/valgrind/vgpreload_memcheck-x86-linux.so)
==12995==    by 0x80777F7: bird_xmalloc (xmalloc.c:29)
==12995==    by 0x807724E: mb_alloc (resource.c:339)
==12995==    by 0x8058F74: bgp_new_bucket (attrs.c:734)
==12995==    by 0x8058F74: bgp_get_bucket (attrs.c:861)
==12995==    by 0x8059B0E: bgp_rt_notify (attrs.c:939)
==12995==    by 0x804A55E: do_rt_notify (rt-table.c:346)
==12995==    by 0x804AB62: rt_notify_basic (rt-table.c:393)
==12995==    by 0x804C53C: do_feed_baby (rt-table.c:1776)
==12995==    by 0x804C53C: rt_feed_baby (rt-table.c:1827)
==12995==    by 0x804E87B: proto_feed_more (proto.c:940)
==12995==    by 0x80708D1: ev_run_list (event.c:135)
==12995==    by 0x80722E6: io_loop (io.c:1848)
==12995==    by 0x8049DED: main (main.c:825)
==12995==
^C./bird: can't resolve symbol '__libc_freeres' in lib '/usr/lib/valgrind/vgpreload_core-x86-linux.so'.
==12995==
==12995== HEAP SUMMARY:
==12995==     in use at exit: 5,819,481 bytes in 30,235 blocks
==12995==   total heap usage: 126,286 allocs, 96,051 frees, 22,617,323 bytes allocated
==12995==
==12995== LEAK SUMMARY:
==12995==    definitely lost: 0 bytes in 0 blocks
==12995==    indirectly lost: 0 bytes in 0 blocks
==12995==      possibly lost: 0 bytes in 0 blocks
==12995==    still reachable: 5,819,481 bytes in 30,235 blocks
==12995==         suppressed: 0 bytes in 0 blocks
==12995== Rerun with --leak-check=full to see details of leaked memory
==12995==
==12995== For counts of detected and suppressed errors, rerun with: -v
==12995== ERROR SUMMARY: 1007024 errors from 6 contexts (suppressed: 0 from 0)

These look like bugs in bird. It is strange that it works fine with glibc. Maybe glibc allocates a bit more memory than requested, so an access just past the end of a malloc'd chunk doesn't cause a segfault?
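One way to test the slack hypothesis is to ask the allocator how many bytes it actually handed out for a given request. The following is only a minimal sketch, not a diagnosis: `malloc_slack` is a hypothetical helper name, and it assumes the libc provides `malloc_usable_size()` in `<malloc.h>` (glibc does; whether a particular uClibc build exports it is an assumption to check).

```c
#include <malloc.h>   /* malloc_usable_size(): non-standard extension (glibc) */
#include <stddef.h>
#include <stdlib.h>

/*
 * Hypothetical helper: returns how many bytes beyond `request` the
 * allocator made usable for this block, or (size_t)-1 if malloc failed.
 * A result of 0 means the block has no slack at all, so even a one-byte
 * access past the end lands outside the usable area.
 */
size_t malloc_slack(size_t request)
{
    void *p = malloc(request);
    if (p == NULL)
        return (size_t)-1;

    /* Usable size is never smaller than the size that was requested. */
    size_t usable = malloc_usable_size(p);

    free(p);
    return usable - request;
}
```

Calling `malloc_slack(168)` and `malloc_slack(184)` (the sizes seen in the traces above) in a small program linked against each libc would show whether the two allocators round these requests differently. Either way, slack would only hide the out-of-bounds access; it would not make it valid.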
As you say, these look like bird bugs. I am closing this bug for now. Thanks,