Acknowledgement sent
to Stephane Chazelas <[email protected]>:
New Bug report received and forwarded. Copy sent to Michael Stone <[email protected]>.
(Sun, 24 Jan 2010 11:33:05 GMT)
Subject: /usr/bin/uniq: uniq tells 2 lines with different invalid utf-8 characters are duplicate
Date: Sun, 24 Jan 2010 11:31:40 +0000
Package: coreutils
Version: 8.4-1
Severity: normal
File: /usr/bin/uniq
~$ locale charmap
UTF-8
~$ locale collate-codeset
UTF-8
~$ sort .zsh-history|uniq -D|sed -n l
cd Pyr\202n\202es$
cd Pyr\351n\351es$
The two lines differ only in their invalid UTF-8 bytes (\202
versus \351), yet uniq reports them as duplicates.
"sort -u" and "comm" also treat them as identical:
~$ echo '\0300\n\0301' | sort -u | sed -n l
\300$
~$ sed -n l a
cd Pyr\202n\202es$
~$ sed -n l b
cd Pyr\351n\351es$
~$ comm -12 a b | sed -n l
cd Pyr\351n\351es$
If this is the expected behavior, I think it should be better
documented: the statement "Comparisons honor the rules specified
by the `LC_COLLATE' locale category." is not enough to cover
such unintuitive behavior.
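A possible workaround, sketched on the assumption that the
behavior comes from locale-aware collation: forcing the C locale
makes the comparison byte-wise, so lines that differ only in
invalid bytes stay distinct. The two bytes are the ones from the
"sort -u" example above; printf replaces echo only because not
every shell's echo expands octal escapes, and the variable names
are illustrative.

```shell
# Two one-byte lines, 0xC0 and 0xC1; neither is valid UTF-8 on its own.
# LC_ALL=C forces byte-wise comparison in sort and uniq.
# tr strips the padding that some wc implementations add.
sorted=$(printf '\300\n\301\n' | LC_ALL=C sort -u | wc -l | tr -d ' ')
uniqed=$(printf '\300\n\301\n' | LC_ALL=C uniq | wc -l | tr -d ' ')
echo "sort -u kept $sorted lines, uniq kept $uniqed lines"
```

Setting LC_ALL=C for the whole pipeline also changes the sort
order of valid non-ASCII text, so it only suits cases where
byte order is acceptable.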
-- System Information:
Debian Release: squeeze/sid
APT prefers unstable
APT policy: (500, 'unstable'), (50, 'experimental')
Architecture: i386 (i686)
Kernel: Linux 2.6.32-trunk-686 (SMP w/1 CPU core)
Locale: LANG=en_GB.UTF-8, LC_CTYPE=en_US.ISO-8859-15 (charmap=UTF-8) (ignored: LC_ALL set to en_US.UTF-8)
Shell: /bin/sh linked to /bin/bash
Versions of packages coreutils depends on:
ii libacl1 2.2.49-1 Access control list shared library
ii libc6 2.10.2-5 Embedded GNU C Library: Shared lib
ii libselinux1 2.0.89-4 SELinux runtime shared libraries
coreutils recommends no packages.
coreutils suggests no packages.
-- debconf-show failed
Acknowledgement sent
to Stephane Chazelas <[email protected]>:
Extra info received and forwarded to list. Copy sent to Michael Stone <[email protected]>.
(Sun, 24 Jan 2010 11:42:05 GMT)