Package: coreutils
Version: 8.32-4+b1
Version: 9.1-1
Severity: normal
Dear Maintainer,
-- >8 --
$ echo ' Q' | pr -fia
2023-04-25 20:53 Page 1
a Q
$ echo ' Q' | pr -fią
pr: '-i' extra characters or invalid number in the argument: ‘\205’
Try 'pr --help' for more information.
$ echo 'a Q' | pr -fea
2023-04-25 20:56 Page 1
Q
$ echo 'ą Q' | pr -feą
pr: '-e' extra characters or invalid number in the argument: ‘\205’
Try 'pr --help' for more information.
-- >8 --
POSIX Issue 7 and 8 Draft 2.1 say:
104054 −e[char][gap]
104055 Expand each input <tab> to the next greater column position specified by the
104056 formula n*gap+1, where n is an integer > 0. If gap is zero or is omitted, it shall
104057 default to 8. All <tab> characters in the input shall be expanded into the
104058 appropriate number of <space> characters. If any non-digit character, char, is
104059 specified, it shall be used as the input <tab>. If the first character of the −e option-
104060 argument is a digit, the entire option-argument shall be assumed to be gap.
104067 −i[char][gap] In output, replace <space> characters with <tab> characters wherever one or more
104068 adjacent <space> characters reach column positions gap+1, 2* gap+1, 3* gap+1, and
104069 so on. If gap is zero or is omitted, default tab settings at every eighth column
104070 position shall be assumed. If any non-digit character, char, is specified, it shall be
104071 used as the output <tab>. If the first character of the −i option-argument is a digit,
104072 the entire option-argument shall be assumed to be gap.
104119 LC_CTYPE
104120 Determine the locale for the interpretation of sequences of bytes of text data as
104121 characters (for example, single-byte as opposed to multi-byte characters in
104122 arguments and input files) and which characters are defined as printable (character
104123 class print). Non-printable characters are still written to standard output, but are
104124 not counted for the purpose for column-width and line-length calculations.
Which very obviously and explicitly says that -eą must work
(if the locale has an ą, which mine obviously does).
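The ‘\205’ in the errors above is consistent with the option-argument being split by byte rather than by character: in UTF-8, ą is the two bytes 0xC4 0x85, and 0x85 is octal 205. A minimal sketch of that reading, assuming (not checked against pr's source here) that the parser takes one non-digit byte as [char] and treats whatever follows as [gap]. Both helper names are hypothetical:

```c
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper, not coreutils code: mimic a parser that consumes
   the first *byte* of a non-digit option-argument as [char] and returns
   the leftover it would then try to parse as [gap]. */
const char *rest_after_first_byte(const char *optarg)
{
    if (*optarg != '\0' && !(*optarg >= '0' && *optarg <= '9'))
        return optarg + 1;
    return optarg;
}

/* Hypothetical helper: write each byte of s into out as a three-digit
   octal escape, stopping early if out (of size outsize > 0) is full. */
void octal_escape(const char *s, char *out, size_t outsize)
{
    size_t used = 0;

    out[0] = '\0';
    for (; *s != '\0' && used + 5 <= outsize; s++)
        used += (size_t) snprintf(out + used, outsize - used, "\\%03o",
                                  (unsigned int) (unsigned char) *s);
}
```

Passing "\xC4\x85" (ą in UTF-8) through rest_after_first_byte() and then octal_escape() leaves \205 in the buffer, the same leftover the error message quotes.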
Best,
наб
-- System Information:
Debian Release: 12.0
APT prefers unstable
APT policy: (500, 'unstable')
Architecture: x32 (x86_64)
Foreign Architectures: amd64, i386
Kernel: Linux 6.1.0-2-amd64 (SMP w/2 CPU threads; PREEMPT)
Kernel taint flags: TAINT_PROPRIETARY_MODULE, TAINT_OOT_MODULE, TAINT_UNSIGNED_MODULE
Locale: LANG=en_GB.UTF-8, LC_CTYPE=en_GB.UTF-8 (charmap=UTF-8), LANGUAGE not set
Shell: /bin/sh linked to /usr/bin/dash
Init: systemd (via /run/systemd/system)
LSM: AppArmor: enabled
Versions of packages coreutils depends on:
ii libacl1 2.3.1-3
ii libattr1 1:2.5.1-4
ii libc6 2.36-9
ii libgmp10 2:6.2.1+dfsg1-1.1
ii libselinux1 3.4-1+b5
coreutils recommends no packages.
coreutils suggests no packages.
-- no debconf information
Control: retitle -1 coreutils: pr: -ien don't accept characters for tab override, only bytes
Same applies to -n, of course.
-- >8 --
$ printf '%s\n' a b c | pr -fna
2023-04-25 22:26 Page 1
1aa
2ab
3ac
$ printf '%s\n' a b c | pr -fną
pr: '-n' extra characters or invalid number in the argument: ‘\205’
Try 'pr --help' for more information.
-- >8 --
Funnily enough, -s is fine since, as an extension, it uses the entire
option-argument instead of the first character.
Taking everything before the first digit in -ien as the [char]
would match the -s extension and is much simpler than mbrtowc()ing
optarg. It's also probably correct for most encodings? So that's a plus.
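The split suggested above could look roughly like this. It is only a sketch, not a patch against pr: tab_char_len() is a hypothetical name, and it assumes [char] is simply every byte before the first ASCII digit.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, not coreutils code: return the length in bytes of
   the [char] part of a -i/-e/-n option-argument, taken to be everything
   before the first ASCII digit.  The [gap] part, if any, then starts at
   optarg + tab_char_len(optarg). */
size_t tab_char_len(const char *optarg)
{
    return strcspn(optarg, "0123456789");
}
```

For UTF-8 this cannot cut a character in half, since every byte of a multi-byte sequence is 0x80 or above and so is never an ASCII digit. That does not hold for every encoding: GB18030 four-byte sequences, for one, contain bytes in the 0x30-0x39 range.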
Best,
наб
Changed Bug title to 'coreutils: pr: -ien don't accept characters for tab override, only bytes' from 'coreutils: pr: -ie don't accept characters for tab override, only bytes'.
Request was from наб <[email protected]>
to [email protected].
(Tue, 25 Apr 2023 20:36:04 GMT).