Debian Bug report logs - #431231
tr: no UTF-8 support

Package: coreutils; Maintainer for coreutils is Michael Stone <[email protected]>; Source for coreutils is src:coreutils.

Reported by: Juhapekka Tolvanen <[email protected]>

Date: Sat, 30 Jun 2007 20:30:01 UTC

Severity: normal

Tags: confirmed, upstream

Merged with 139861, 388689, 613155, 649729, 721324

Found in versions coreutils/8.13-3, coreutils/5.97-5.3, coreutils/8.21-1, coreutils/5.97-5, coreutils/8.5-1, coreutils/5.96-3, coreutils/9.1-1, coreutils/6.10~20071127-1


Report forwarded to [email protected], Michael Stone <[email protected]>:
Bug#431231; Package coreutils.


Acknowledgement sent to Juhapekka Tolvanen <[email protected]>:
New Bug report received and forwarded. Copy sent to Michael Stone <[email protected]>.


Message #5 received at [email protected]:

From: Juhapekka Tolvanen <[email protected]>
To: Debian Bug Tracking System <[email protected]>
Subject: tr fails with UTF-8
Date: Sat, 30 Jun 2007 23:29:25 +0300
Package: coreutils
Version: 5.97-5.3
Severity: important


juhtolv@juhtolv:/home/juhtolv % echo 'huuhaa öljy äiti über' | tr '[:lower:]' '[:upper:]'
HUUHAA öLJY äITI üBER
juhtolv@juhtolv:/home/juhtolv % echo 'huuhaa öljy äiti über' | tr '[:lower:]' '[:upper:]' | tr 'åäöü' 'ÅÄÖÜ'
HUUHAA ÖLJY ÄITI ÜBER
juhtolv@juhtolv:/home/juhtolv %

See locale below.


-- System Information:
Debian Release: 4.0
  APT prefers stable
  APT policy: (990, 'stable'), (500, 'testing-proposed-updates'), (500, 'proposed-updates'), (101, 'testing'), (99, 'unstable')
Architecture: i386 (i686)
Shell:  /bin/sh linked to /bin/bash
Kernel: Linux 2.6.18-5-686
Locale: LANG=fi_FI.utf8, LC_CTYPE=fi_FI.utf8 (charmap=UTF-8)

Versions of packages coreutils depends on:
ii  libacl1                       2.2.41-1   Access control list shared library
ii  libc6                         2.5-9+b1   GNU C Library: Shared libraries
ii  libselinux1                   2.0.15-2   SELinux shared libraries

coreutils recommends no packages.

-- no debconf information


-- 
Juhapekka "naula" Tolvanen * http colon slash slash iki dot fi slash juhtolv
"Denn du bist was du isst und ihr wisst was es ist. Es ist mein Teil – nein.
Mein Teil – nein. Da das ist mein Teil – nein. Mein Teil – nein."  Rammstein
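
Editorial note on the report above: the second tr command in the transcript probably succeeds only by coincidence. In UTF-8 each of å, ä, ö, ü and its uppercase form is a two-byte sequence with the same lead byte (0xC3), so a byte-for-byte mapping happens to produce valid output. The sketch below is not from the thread. It assumes a tr that works on bytes (forced here with LC_ALL=C), and the helper names hex and bytewise_fix are hypothetical.

```shell
# Print stdin as a compact hex string. od spacing differs between
# systems, so spaces and newlines are stripped.
hex() { od -An -tx1 | tr -d ' \n'; }

# The reporter's second tr call, forced into byte-wise mode with LC_ALL=C
# (an assumption standing in for a tr that is not multibyte aware).
bytewise_fix() { LC_ALL=C tr 'åäöü' 'ÅÄÖÜ'; }

printf 'ö' | hex; echo                  # c3b6
printf 'ö' | bytewise_fix | hex; echo   # c396, the UTF-8 encoding of Ö

# Side effect: any other character containing one of the mapped bytes is
# changed too. Ĥ is c4 a4, and byte a4 is mapped to 84.
printf 'Ĥ' | hex; echo                  # c4a4
printf 'Ĥ' | bytewise_fix | hex; echo   # c484, the UTF-8 encoding of Ą
```

So the explicit mapping is only safe for input known to contain nothing but those four letters outside ASCII.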



Information forwarded to [email protected], Michael Stone <[email protected]>:
Bug#431231; Package coreutils.


Acknowledgement sent to [email protected] (Bob Proulx):
Extra info received and forwarded to list. Copy sent to Michael Stone <[email protected]>.


Message #10 received at [email protected]:

From: [email protected] (Bob Proulx)
To: Juhapekka Tolvanen <[email protected]>, [email protected]
Subject: Re: Bug#431231: tr fails with UTF-8
Date: Sat, 30 Jun 2007 15:33:12 -0600
merge 431231 139861
thanks

Juhapekka Tolvanen wrote:
> ...report of a locale problem in tr deleted...

See also these related issues.

  http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=139861
  http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=388689

It is a known deficiency in coreutils in general that the utilities
are not multibyte aware.  The following can be found in the upstream
source package TODO file.

  Adapt tools like wc, tr, fmt, etc. (most of the textutils) to be
    multibyte aware.  The problem is that I want to avoid duplicating
    significant blocks of logic, yet I also want to incur only minimal
    (preferably `no') cost when operating in single-byte mode.

Some vendors have hacked in patches to make the utilities multibyte
aware but none of those patches have been considered clean enough to
incorporate into the upstream source yet.  Debian's maintainer has
stated that he does not want to diverge from upstream this radically.
The patches are very messy and incomplete.  The best course of action
would be to get this resolved upstream with the functionality properly
integrated.

Bob
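
Editorial note: "not multibyte aware" can be illustrated at the byte level. The sketch below is not part of the thread. It forces byte-wise behaviour with LC_ALL=C as a stand-in for the tr versions discussed here, and the helper names hex and bytewise_upper are hypothetical.

```shell
# Print stdin as a compact hex string. od spacing differs between
# systems, so spaces and newlines are stripped.
hex() { od -An -tx1 | tr -d ' \n'; }

# Class-based case conversion, one byte at a time (LC_ALL=C is an
# assumption used to make the behaviour reproducible).
bytewise_upper() { LC_ALL=C tr '[:lower:]' '[:upper:]'; }

printf 'öl' | hex; echo                    # c3b66c
printf 'öl' | bytewise_upper | hex; echo   # c3b64c
```

The ASCII byte 6c (l) becomes 4c (L), while the two bytes c3 b6 that encode ö are not lowercase letters on their own and pass through unchanged. That matches the "öLJY" output in the original report.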



Information forwarded to [email protected], Michael Stone <[email protected]>:
Bug#431231; Package coreutils.


Acknowledgement sent to Juhapekka Tolvanen <[email protected]>:
Extra info received and forwarded to list. Copy sent to Michael Stone <[email protected]>.


Message #15 received at [email protected]:

From: Juhapekka Tolvanen <[email protected]>
To: Bob Proulx <[email protected]>, [email protected]
Cc: Hilko Bengen <[email protected]>, Colin Watson <[email protected]>
Subject: Re: Bug#431231: tr fails with UTF-8
Date: Tue, 25 Dec 2007 06:48:59 +0200
On Sun, 01 Jul 2007, +01:24:14 EEST (UTC +0300),
Bob Proulx <[email protected]> pressed some keys:

> merge 431231 139861
> thanks
> 
> Juhapekka Tolvanen wrote:
> > ...report of a locale problem in tr deleted...
> 
> See also these related issues.
> 
>   http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=139861
>   http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=388689
> 
> It is a known deficiency in coreutils in general that the utilities
> are not multibyte aware.  The following can be found in the upstream
> source package TODO file.

Look at this:

% echo 'huuhaa öljy äiti über' | tr '[:lower:]' '[:upper:]'
HUUHAA öLJY äITI üBER
% echo 'huuhaa öljy äiti über' | /opt/heirloom/5bin/tr '[:lower:]' '[:upper:]'
HUUHAA ÖLJY ÄITI ÜBER

That Heirloom Toolchest is available here:

http://heirloom.sourceforge.net/

IMNSHO the Debian project should package that toolchest, too. Maybe the
GNU tools should be replaced with them, but that would be very radical.

Other software from the Heirloom project should also be packaged for
Debian, because it has superior UTF-8 support compared to its GNU
counterparts. For example, its roff implementation can handle UTF-8 and
OpenType fonts.


-- 
Juhapekka "naula" Tolvanen * http colon slash slash iki dot fi slash juhtolv
"eiga wo miyou kimi no yakusoku dohri te wo tsunai de. yoru ni wa owakare
desu ringo to ichigo ga kusaru mae ni. yume wa hirogaru kimi no yakusoku
dohri kisu wo shi nagara."                                       Dir en grey
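
Editorial note: where installing another tr is not an option, one possible workaround for text that fits in ISO-8859-1 is to convert to that single-byte encoding, translate bytes, and convert back. This is only a sketch and does not come from the thread. The function name upper_latin1 is hypothetical, iconv is assumed to be installed, and iconv will report an error on characters outside ISO-8859-1.

```shell
# Uppercase UTF-8 text via a single-byte detour (sketch, Latin-1 only).
# In ISO-8859-1 the lowercase letters are a-z, e0-f6 and f8-fe; the
# matching uppercase letters are A-Z, c0-d6 and d8-de. The octal escapes
# below spell out those byte ranges, skipping d7 and f7, which are the
# multiplication and division signs.
upper_latin1() {
    iconv -f UTF-8 -t ISO-8859-1 |
        LC_ALL=C tr 'a-z\340-\366\370-\376' 'A-Z\300-\326\330-\336' |
        iconv -f ISO-8859-1 -t UTF-8
}

printf 'huuhaa öljy äiti über\n' | upper_latin1   # HUUHAA ÖLJY ÄITI ÜBER
```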




Information forwarded to [email protected], Michael Stone <[email protected]>:
Bug#431231; Package coreutils.


Acknowledgement sent to Colin Watson <[email protected]>:
Extra info received and forwarded to list. Copy sent to Michael Stone <[email protected]>.


Message #20 received at [email protected]:

From: Colin Watson <[email protected]>
To: Juhapekka Tolvanen <[email protected]>
Cc: Bob Proulx <[email protected]>, [email protected], Hilko Bengen <[email protected]>
Subject: Re: Bug#431231: tr fails with UTF-8
Date: Mon, 31 Dec 2007 17:14:08 +0000
On Tue, Dec 25, 2007 at 06:48:59AM +0200, Juhapekka Tolvanen wrote:
> Other software from the Heirloom project should also be packaged for
> Debian, because it has superior UTF-8 support compared to its GNU
> counterparts. For example, its roff implementation can handle UTF-8 and
> OpenType fonts.

Somebody else is welcome to try to do so, but I don't have the time, I'm
afraid.

Cheers,

-- 
Colin Watson                                       [[email protected]]




Information forwarded to [email protected], Michael Stone <[email protected]>:
Bug#431231; Package coreutils.


Acknowledgement sent to Lucas Nussbaum <[email protected]>:
Extra info received and forwarded to list. Copy sent to Michael Stone <[email protected]>.


Message #25 received at [email protected]:

From: Lucas Nussbaum <[email protected]>
To: [email protected], [email protected], [email protected]
Cc: [email protected]
Subject: Re: Bug#431231: tr fails with UTF-8
Date: Tue, 22 Jan 2008 20:56:51 +0100
forcemerge 139861 388689 431231
tags 139861 + upstream confirmed wontfix
found 139861 6.10~20071127-1
thanks

Hi,

I'm merging these bugs (all about tr not supporting UTF-8), which still
affect the current coreutils in experimental. "wontfix" indicates that
this is not going to be fixed by a Debian-specific patch, but that the
problem should be fixed upstream first.
-- 
| Lucas Nussbaum
| [email protected]   http://www.lucas-nussbaum.net/ |
| jabber: [email protected]             GPG: 1024D/023B3F4F |




Forcibly Merged 139861 388689 431231. Request was from Lucas Nussbaum <[email protected]> to [email protected]. (Tue, 22 Jan 2008 19:57:06 GMT)


Tags added: upstream, confirmed, wontfix. Request was from Lucas Nussbaum <[email protected]> to [email protected]. (Tue, 22 Jan 2008 19:57:07 GMT)


Bug marked as found in version 6.10~20071127-1. Request was from Lucas Nussbaum <[email protected]> to [email protected]. (Tue, 22 Jan 2008 19:57:09 GMT)


Removed tag(s) wontfix. Request was from Jonathan Nieder <[email protected]> to [email protected]. (Fri, 11 Feb 2011 10:06:07 GMT)


Merged 139861 388689 431231 613155. Request was from Benoît Knecht <[email protected]> to [email protected]. (Sun, 10 Jul 2011 09:57:25 GMT)


Changed Bug title to 'tr: no UTF-8 support' from 'tr fails with UTF-8'. Request was from Benoît Knecht <[email protected]> to [email protected]. (Sun, 10 Jul 2011 09:57:27 GMT)


Forcibly Merged 139861 388689 431231 613155 649729. Request was from Bob Proulx <[email protected]> to [email protected]. (Fri, 03 Feb 2012 07:03:57 GMT)


Marked as found in versions coreutils/8.21-1. Request was from Bob Proulx <[email protected]> to [email protected]. (Sun, 30 Nov 2014 21:54:15 GMT)


Merged 139861 388689 431231 613155 649729 721324. Request was from Bob Proulx <[email protected]> to [email protected]. (Sun, 30 Nov 2014 21:54:20 GMT)


Information forwarded to [email protected], Michael Stone <[email protected]>:
Bug#431231; Package coreutils. (Fri, 16 Feb 2018 10:48:03 GMT)


Acknowledgement sent to [email protected]:
Extra info received and forwarded to list. Copy sent to Michael Stone <[email protected]>. (Fri, 16 Feb 2018 10:48:03 GMT)


Marked as found in versions coreutils/9.1-1. Request was from Thorsten Glaser <[email protected]> to [email protected]. (Mon, 13 Mar 2023 14:42:07 GMT)



Debian bug tracking system administrator <[email protected]>. Last modified: Tue May 13 08:44:04 2025; Machine Name: buxtehude
