After updating from v2019.02.09 to v2021.02, I am not able to connect to my board via SSH. The OpenSSH server on my device closes the connection abruptly and exits with 255 before the keys can be exchanged. I have tried to use a password, but this gave the same result. I am using the default sshd_config as this was working with v2019.02.09. I attached the server and client logs as well as the defconfig for more information.
Created attachment 8831: OpenSSH client and server logs and defconfig
We've noticed a similar issue and haven't quite narrowed it down. So far we have observed that everything works from 2020.02 LTS through the latest master with a GCC 8.x Buildroot internal toolchain. However, when we switched to GCC 9.x (Bootlin stable toolchain) while still on 2020.02 LTS, we saw this same behavior at packet type 50 (SSH_MSG_USERAUTH_REQUEST, see https://tools.ietf.org/html/rfc4252#section-6).
Additional/similar bug report:
    https://bugs.busybox.net/show_bug.cgi?id=13626
Older OpenSSH login problem report:
    http://lists.busybox.net/pipermail/buildroot/2020-August/289111.html
    http://lists.busybox.net/pipermail/buildroot/2020-September/291853.html
Dropbear login problem with password and BR2_TARGET_GENERIC_PASSWD_SHA512:
    http://lists.busybox.net/pipermail/buildroot/2020-August/288682.html
And another reported OpenSSH login problem:
    http://lists.busybox.net/pipermail/buildroot/2020-August/289111.html
    http://lists.busybox.net/pipermail/buildroot/2020-September/291379.html
By the way, you disabled BR2_TARGET_ENABLE_ROOT_LOGIN; how did you set up the root and/or test account/password?
I overwrite the default root password with the correct SHA512 and add the authorized public key using the post-build.sh script of my board.
I can reproduce (maybe the same) problem on an RPi4 with this defconfig:
    BR2_arm=y
    BR2_cortex_a72=y
    BR2_ARM_FPU_NEON_VFPV4=y
    BR2_TOOLCHAIN_EXTERNAL=y
    BR2_TARGET_GENERIC_PASSWD_SHA512=y
    BR2_ROOTFS_DEVICE_CREATION_DYNAMIC_EUDEV=y
    BR2_ROOTFS_MERGED_USR=y
    BR2_SYSTEM_BIN_SH_BASH=y
    BR2_SYSTEM_DHCP="eth0"
    BR2_SYSTEM_DEFAULT_PATH="/bin:/sbin:/usr/bin:/usr/sbin"
    BR2_TARGET_TZ_INFO=y
    BR2_ROOTFS_POST_BUILD_SCRIPT="board/raspberrypi4/post-build.sh"
    BR2_ROOTFS_POST_IMAGE_SCRIPT="board/raspberrypi4/post-image.sh"
    BR2_LINUX_KERNEL=y
    BR2_LINUX_KERNEL_CUSTOM_TARBALL=y
    BR2_LINUX_KERNEL_CUSTOM_TARBALL_LOCATION="$(call github,raspberrypi,linux,967d45b29ca2902f031b867809d72e3b3d623e7a)/linux-967d45b29ca2902f031b867809d72e3b3d623e7a.tar.gz"
    BR2_LINUX_KERNEL_DEFCONFIG="bcm2711"
    BR2_LINUX_KERNEL_DTS_SUPPORT=y
    BR2_LINUX_KERNEL_INTREE_DTS_NAME="bcm2711-rpi-4-b"
    BR2_LINUX_KERNEL_NEEDS_HOST_OPENSSL=y
    BR2_PACKAGE_BUSYBOX_SHOW_OTHERS=y
    BR2_PACKAGE_STRACE=y
    BR2_PACKAGE_RPI_FIRMWARE=y
    BR2_PACKAGE_RPI_FIRMWARE_VARIANT_PI4=y
    BR2_PACKAGE_RPI_FIRMWARE_CONFIG_FILE="board/raspberrypi4/config_4.txt"
    BR2_PACKAGE_DBUS=y
    BR2_PACKAGE_LIBCAP=y
    BR2_PACKAGE_OPENSSH=y
    BR2_PACKAGE_KMOD_TOOLS=y
    BR2_PACKAGE_UTIL_LINUX_AGETTY=y
    BR2_PACKAGE_UTIL_LINUX_FSCK=y
    BR2_PACKAGE_UTIL_LINUX_MOUNT=y
    BR2_TARGET_ROOTFS_EXT2=y
    BR2_TARGET_ROOTFS_EXT2_4=y
    BR2_TARGET_ROOTFS_EXT2_SIZE="120M"
    # BR2_TARGET_ROOTFS_TAR is not set
    BR2_PACKAGE_HOST_DOSFSTOOLS=y
    BR2_PACKAGE_HOST_GENIMAGE=y
    BR2_PACKAGE_HOST_MTOOLS=y
On the serial console I get the following log when the ssh login aborts/fails:
    [ 110.415395] audit: type=1326 audit(110.409:3): auid=4294967295 uid=1001 gid=1001 ses=4294967295 pid=248 comm="sshd" exe="/usr/sbin/sshd" sig=31 arch=40000028 syscall=403 compat=0 ip=0xb6b9b766 code=0x0
The strace output looks like this:
    243 write(6, "\0\0\0e\0\0\0\23ecdsa-sha2-nistp256\0\0\0J\0"..., 105 <unfinished ...>
    248 read(5, <unfinished ...>
    243 <... write resumed>) = 105
    248 <... read resumed>"\7\0\0\0e\0\0\0\23ecdsa-sha2-nistp256\0\0\0J"..., 106) = 106
    243 poll([{fd=6, events=POLLIN}, {fd=7, events=POLLIN}], 2, -1 <unfinished ...>
    248 clock_gettime64(CLOCK_BOOTTIME, <unfinished ...>) = ?
    248 +++ killed by SIGSYS +++
    243 <... poll resumed>) = 2 ([{fd=6, revents=POLLIN|POLLHUP}, {fd=7, revents=POLLHUP}])
    243 --- SIGCHLD {si_signo=SIGCHLD, si_code=CLD_KILLED, si_pid=248, si_uid=1001, si_status=SIGSYS, si_utime=4, si_stime=1} ---
The call to clock_gettime64() is aborted with SIGSYS, although there is already a (duplicated) entry for it in openssh-8.4p1/sandbox-seccomp-filter.c (maybe __NR_clock_gettime64 is not defined); see e.g. [1].
[1] http://lists.busybox.net/pipermail/buildroot/2020-August/289369.html
The following patch/hack fixed the problem for my testcase:
    --- openssh-8.4p1/sandbox-seccomp-filter.c_orig 2021-03-23 23:15:02.131964000 +0100
    +++ openssh-8.4p1/sandbox-seccomp-filter.c 2021-03-23 23:24:24.388408285 +0100
    @@ -189,6 +189,11 @@
     #ifdef __NR_clock_gettime
         SC_ALLOW(__NR_clock_gettime),
     #endif
    +
    +#ifndef __NR_clock_gettime64
    +#define __NR_clock_gettime64 403
    +#endif
    +
     #ifdef __NR_clock_gettime64
         SC_ALLOW(__NR_clock_gettime64),
     #endif
    @@ -252,6 +257,11 @@
     #ifdef __NR_clock_nanosleep
         SC_ALLOW(__NR_clock_nanosleep),
     #endif
    +
    +#ifndef __NR_clock_nanosleep_time64
    +#define __NR_clock_nanosleep_time64 407
    +#endif
    +
     #ifdef __NR_clock_nanosleep_time64
         SC_ALLOW(__NR_clock_nanosleep_time64),
     #endif
When I use the Arm ARM 2020.11 toolchain (GCC 10.2, GDB 10.1, glibc 2.31, Binutils 2.35.1) to build 2021.02, OpenSSH is killed when I try to log in. I ran strace on that setup and got this:
    [pid 14957] write(2, "debug3: mm_request_send entering"..., 52 <unfinished ...>
    [pid 15027] write(7, "\0\0\0/\0\0\0\6\0\0\0'input_userauth_reque"..., 51 <unfinished ...>
    [pid 14957] <... write resumed>) = 52
    [pid 15027] <... write resumed>) = 51
    [pid 14957] poll([{fd=5, events=POLLIN}, {fd=6, events=POLLIN}], 2, -1 <unfinished ...>
    [pid 15027] write(7, "\0\0\08\0\0\0\7\0\0\0000user_specific_delay:"..., 60 <unfinished ...>
    [pid 14957] <... poll resumed>) = 2 ([{fd=5, revents=POLLIN}, {fd=6, revents=POLLIN}])
    [pid 15027] <... write resumed>) = 60
    [pid 14957] read(6, <unfinished ...>
    [pid 15027] clock_gettime(CLOCK_BOOTTIME, <unfinished ...>
    [pid 14957] <... read resumed>"\0\0\0/", 4) = 4
    [pid 15027] <... clock_gettime resumed>{tv_sec=3303, tv_nsec=218403125}) = 0
    [pid 14957] read(6, <unfinished ...>
    [pid 15027] write(7, "\0\0\0[\0\0\0\7\0\0\0Sensure_minimum_time_"..., 95 <unfinished ...>
    [pid 14957] <... read resumed>"\0\0\0\6\0\0\0'input_userauth_request: "..., 47) = 47
    [pid 15027] <... write resumed>) = 95
    [pid 14957] write(2, "debug2: input_userauth_request: "..., 59 <unfinished ...>
    [pid 15027] clock_nanosleep_time64(CLOCK_REALTIME, 0, {tv_sec=0, tv_nsec=24392751410935043}, <unfinished ...>
    [pid 14957] <... write resumed>) = 59
    [pid 15027] <... clock_nanosleep_time64 resumed> <unfinished ...>) = ?
    [pid 14957] poll([{fd=5, events=POLLIN}, {fd=6, events=POLLIN}], 2, -1 <unfinished ...>
    [pid 15027] +++ killed by SIGSYS +++
    <... poll resumed>) = 2 ([{fd=5, revents=POLLIN}, {fd=6, revents=POLLIN}])
    --- SIGCHLD {si_signo=SIGCHLD, si_code=CLD_KILLED, si_pid=15027, si_uid=1001, si_status=SIGSYS, si_utime=8, si_stime=4} ---
It looks similar to what you reproduced; the only difference is that here it is the call to clock_nanosleep_time64() that is aborted with SIGSYS.
When I use the Arm ARM 2019.2 toolchain (GCC 9.2.1, GDB 8.3.0, glibc 2.30, Binutils 2.33.1) to build 2021.02, OpenSSH is not killed and I am able to log in. Do you have any idea what could cause this? For reference, I am using Linux kernel v4.14.78 in my project.
(In reply to Geert Lens from comment #7) I have no other advice at the moment than to avoid the (buggy?) prebuilt toolchains and switch to a Buildroot-built one (and/or avoid/fix uClibc?), to patch OpenSSH for your specific toolchain/OpenSSH failure (see the patch suggested above), or to disable the seccomp filter for OpenSSH (I have not yet investigated whether or how that is possible).
Hello, I believe it's an issue between the kernel headers version and glibc >= 2.31. I'm able to reproduce it with the Arm ARM 2020.11 and Bootlin stable 2020.08-1 toolchains, but not with the Bootlin bleeding-edge 2020.08-1 toolchain. The Arm ARM 2020.11 toolchain provides 4.20.3 kernel headers, while the Bootlin bleeding-edge 2020.08-1 toolchain provides 5.4.61. Both use glibc 2.31. Since glibc 2.31, glibc carries its own table of system call numbers, taken from kernel 5.4 [1], but syscalls like __NR_clock_gettime64 were only added to the kernel headers in 5.1 [2]. If that is right, glibc can issue clock_gettime64 even though the toolchain's kernel headers do not define __NR_clock_gettime64, so the #ifdef-guarded allow entry in OpenSSH's seccomp filter is compiled out. It seems we need a toolchain with kernel headers >= 5.1 to work around the issue.
[1] https://sourceware.org/git/?p=glibc.git;a=commit;h=4cf0d223052dabb9caed29e1e91e1d61933e14fb
[2] https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=48166e6ea47d23984f0b481ca199250e1ce0730a
Best regards, Romain
See https://lore.kernel.org/lkml/20190719170343.GA13680@linux.intel.com/
"Using __NR_clock_gettime64 instead of __NR_clock_gettime breaks userspace applications that use seccomp filtering to block syscalls, as applications are completely unaware of the newly added of __NR_clock_gettime64, e.g. sshd gets zapped on syscall(403) when attempting to ssh into the system."
The issue does not seem to be fixed by running kernel 5.10.7 when the system was built with a toolchain using 4.19.x kernel headers.
Hi, Thank you for all your help clarifying why this is happening. For the time being I have applied the patch/hack that Peter provided, and I will switch to a toolchain that provides the needed headers once one is available. Best regards, Geert
*** Bug 13626 has been marked as a duplicate of this bug. ***