

This queue is for tickets about the SVN-Notify CPAN distribution.

Report information
The Basics
Id: 28456
Status: resolved
Worked: 10 min
Priority: 0/
Queue: SVN-Notify

People
Owner: Nobody in particular
Requestors: jo [...] bitvalve.org
Cc:
AdminCc:

Bug Information
Severity: Normal
Broken in: (no value)
Fixed in: 2.70



Subject: UTF-8 special characters in subject line not encoded correctly
Date: Wed, 25 Jul 2007 00:02:09 +0200
To: bug-SVN-Notify [...] rt.cpan.org
From: Jörg <jo [...] bitvalve.org>
Hello,

I use SVN::Notify version 2.66 on a Debian etch GNU/Linux 2.6.16.18-amd64 System.

svn, version 1.4.2 (r22196)
perl, v5.8.8

We write German log messages and German Umlauts do not work properly:

I use the following command in the post-commit script:

    LANG=en_US.utf8 /usr/bin/svnnotify -p "${REPOS}" -r "${REV}" -s "${SENDMAIL}" -l "${SVNLOOK}" -H "HTML::ColorDiff" -t "...@example.org"

[Setting LANG=de_DE.utf8 does not work for us, since the output of svnlook is then localized and SVN::Notify cannot handle that correctly, it seems.]

The log message inline in the email is encoded correctly, it is
"Hello UTF-8 Test: Umlaute ä ö ü"

but the subject field of the email is encoded like this:
"Subject: [37] Hello UTF-8 Test: =?UTF-8?Q?=20Umlaute=20=EF=BF=BD=20=EF=BF=BD=20=EF=BF=BD?="

But the byte string EF BF BD is Unicode character U+FFFD, which is the REPLACEMENT CHARACTER for incoming characters whose value is unknown.

Is this a bug, or am I too stupid to configure SVN::Notify correctly?

Any help is appreciated.

Thanks in advance,
Jörg
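One way a subject like the one reported above can arise is when bytes that are not valid UTF-8 get decoded as UTF-8 with replacement before the header is Q-encoded. The Python sketch below reproduces that byte pattern. It is an illustration only: the Latin-1 input is a hypothetical stand-in, `q_encode` is a simplified RFC 2047 Q-encoder written for this example (no line folding), and none of it is SVN::Notify's actual code path.

```python
def q_encode(text: str, charset: str = "utf-8") -> str:
    """Simplified RFC 2047 Q-encoding: one encoded word, no line folding."""
    out = []
    for byte in text.encode(charset):
        # Keep ASCII letters and digits; escape every other byte as =XX.
        if byte < 128 and chr(byte).isalnum():
            out.append(chr(byte))
        else:
            out.append(f"={byte:02X}")
    return f"=?{charset.upper()}?Q?{''.join(out)}?="


# Hypothetical input: the umlauts arrive as Latin-1 bytes ...
latin1_bytes = " Umlaute ä ö ü".encode("latin-1")

# ... but are decoded as UTF-8. Each invalid byte becomes U+FFFD.
damaged = latin1_bytes.decode("utf-8", errors="replace")

if __name__ == "__main__":
    print(q_encode(damaged))           # contains =EF=BF=BD for each umlaut
    print(q_encode(" Umlaute ä ö ü"))  # contains =C3=A4, =C3=B6, =C3=BC
```

The first result matches the damaged subject in the report; the second is what a correctly decoded subject would look like after Q-encoding.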
From: jo [...] bitvalve.org
On Tue, 24 Jul 2007, 18:02:26, jo@bitvalve.org wrote:
> I use SVN::Notify version 2.66
I am sorry. We actually use version 2.64.

Jörg
From: GROUSSE [...] cpan.org
On Tue Jul 24 18:02:26 2007, jo@bitvalve.org wrote:
> Is this a bug, or am I too stupid to configure SVN::Notify correctly?
I have the same issue. I tried to play with charset and language options without result.
On Tue Jul 24 18:02:26 2007, jo@bitvalve.org wrote:
> The log message inline in the email is encoded correctly, it is
> "Hello UTF-8 Test: Umlaute ä ö ü"
>
> but the subject field of the email is encoded like this:
> "Subject: [37] Hello UTF-8 Test:
> =?UTF-8?Q?=20Umlaute=20=EF=BF=BD=20=EF=BF=BD=20=EF=BF=BD?="
>
> But the byte string EF BF BD is Unicode character U+FFFD, which is the
> REPLACEMENT CHARACTER for incoming characters whose value is unknown.
>
> Is this a bug, or am I too stupid to configure SVN::Notify correctly?

I believe it's a bug. Does this patch fix the issue for you?

Index: Notify.pm
===================================================================
--- Notify.pm (revision 3385)
+++ Notify.pm (working copy)
@@ -1131,7 +1131,7 @@
     # Q-Encoding (RFC 2047)
     if (PERL58) {
         require Encode;
-        Encode::from_to($self->{subject}, $self->{charset}, 'MIME-Q');
+        Encode::from_to($self->{subject}, 'utf8', 'MIME-Q');
     }
     return $self;
On Wed Feb 06 14:03:50 2008, DWHEELER wrote:
> I believe it's a bug. Does this patch fix the issue for you?

Better yet, could you check the latest out of Subversion and give it a try? I've just rewritten the handling of character encodings.

    https://svn.kineticode.com/SVN-Notify/trunk/

You'll want to use something like:

    /usr/bin/svnnotify -p "${REPOS}" -r "${REV}" -s "${SENDMAIL}" -l "${SVNLOOK}" -H "HTML::ColorDiff" -t "...@example.org" --charset utf-8 --language de_DE

See also the new --svn-charset and --diff-charset options.

Thanks,
David
Subject: [rt.cpan.org #28456] UTF-8 special characters in subject line not encoded correctly
Date: Sat, 16 Feb 2008 13:56:18 +0100
To: bug-SVN-Notify [...] rt.cpan.org
From: Jörg <jo [...] bitvalve.org>
David Wheeler via RT wrote:
> <URL: http://rt.cpan.org/Ticket/Display.html?id=28456 > > > On Wed Feb 06 14:03:50 2008, DWHEELER wrote: >
>> I believe it's a bug. Does this patch fix the issue for you?
No! This patch does not change anything. Still the same problem.
> > Better yet, could you check the latest out of Subversion and give it a > try? I've just rewritten the handling of character encodings. > > https://svn.kineticode.com/SVN-Notify/trunk/ > > You'll want to use something like: > > /usr/bin/svnnotify -p "${REPOS}" -r "${REV}" -s > "${SENDMAIL}" -l "${SVNLOOK}" -H "HTML::ColorDiff" -t "...@example.org" > --charset utf-8 --language de_DE > > See also the new --svn-charset and --diff-charset options.
Yes, I think this works, though I could not test it on our production system. I tried revision 3424 on my local machine using your testsendmail program, and it now gives the correct UTF-8 Unicode sequences when writing umlauts, even in the Subject line.

Is the new handling for character encodings already in the current upstream version (2.67)? If not, could you please add it?

Thank you,
Jörg
Subject: Re: [rt.cpan.org #28456] UTF-8 special characters in subject line not encoded correctly
Date: Sun, 17 Feb 2008 09:59:47 -0800
To: bug-SVN-Notify [...] rt.cpan.org
From: "David E. Wheeler" <dwheeler [...] cpan.org>
On Feb 16, 2008, at 04:57, Jörg via RT wrote:
>> Better yet, could you check the latest out of Subversion and give >> it a >> try? I've just rewritten the handling of character encodings. >> >> https://svn.kineticode.com/SVN-Notify/trunk/ >> >> You'll want to use something like: >> >> /usr/bin/svnnotify -p "${REPOS}" -r "${REV}" -s >> "${SENDMAIL}" -l "${SVNLOOK}" -H "HTML::ColorDiff" -t >> "...@example.org" >> --charset utf-8 --language de_DE >> >> See also the new --svn-charset and --diff-charset options.
> > Yes, I think this works. > Though I could not test it on our productive system. > But I tried revision 3424 on my local machine using your testsendmail > program and this now gives the correct UTF-8 Unicode sequences when > writing umlauts, even in the Subject line.
Brilliant, thanks for the confirmation.
> Is the new handling for character encodings already in the current > upstream version (2.67)? If not, could you please add it?
No, not yet, but it will be later this week. Lots of changes coming. Best, David
From: martin [...] unicorn.tv
Hello.

I ran into the same problem with utf8 text appearing incorrectly. I used version 2.65 as packaged with Ubuntu 7.10, where I would get utf8 text displayed correctly in the diffs, but in the mail subject and svn commit logs the text would look like this:

    "h?\195?\164r testar ja ?\195?\133?\195?\132?\195?\150 igen!!!"

correct would be:

    "här testar ja åäö igen!!!"

So today I installed the svn trunk version on our dev machine at work and used the following svn post-commit hook:

    svnnotify -r "$REV" -p "$REPOS" \
      -C --subject-prefix "[cs] " -d \
      -H "HTML::ColorDiff" \
      --to "svnlog@..." \
      --from "dev-server <svn@...>" \
      --charset utf-8 --language sv_SE

Now the text in the subject is displayed correctly! But instead, text in commit log & diff is broken. It is now broken in another way:

    Subject: åäö tredje gången gillt??
    Log message: åäö tredje gÃ¥ngen gillt??

And same type of coding error in the diff. This new problem looks like when you try to display latin1 text as utf8.

I am very willing to test out any changes to this, as this is an issue that is hitting me daily since our commit logs are mostly in Swedish.
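The "gÃ¥ngen" form in the log message is consistent with UTF-8 bytes being interpreted as Latin-1 (the reverse of the direction guessed at in the report). The short Python sketch below shows the round trip; the function names are made up for this illustration, and it says nothing about where in SVN::Notify or svnlook the misinterpretation actually happened.

```python
def misread_utf8_as_latin1(text: str) -> str:
    """Encode as UTF-8, then wrongly decode those bytes as Latin-1."""
    return text.encode("utf-8").decode("latin-1")


def undo_misread(garbled: str) -> str:
    """Reverse the mistake: recover the bytes, then decode them as UTF-8."""
    return garbled.encode("latin-1").decode("utf-8")


if __name__ == "__main__":
    garbled = misread_utf8_as_latin1("gången")
    print(garbled)                # the two-character form seen in the log message
    print(undo_misread(garbled))  # back to the intended word
```

Because every byte value is a valid Latin-1 character, the wrong decode never fails; it silently turns each multi-byte UTF-8 sequence into two or more unrelated characters.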
Subject: Re: [rt.cpan.org #28456] UTF-8 special characters in subject line not encoded correctly
Date: Mon, 18 Feb 2008 10:44:45 -0800
To: bug-SVN-Notify [...] rt.cpan.org
From: "David E. Wheeler" <david [...] justatheory.com>
On Feb 18, 2008, at 08:06, Martin Lindhe via RT wrote:
> So today I installed the svn trunk version on our dev machine at work > and used the following svn post-commit hook: > > svnnotify -r "$REV" -p "$REPOS" \ > -C --subject-prefix "[cs] " -d \ > -H "HTML::ColorDiff" \ > --to "svnlog@..." \ > --from "dev-server <svn@...>" \ > --charset utf-8 --language sv_SE > > Now the text in the subject is displayed correctly!
W00t!
> But instead, text in commit log & diff is broken. It is now broken in > another way: > > Subject: åäö tredje gången gillt?? > Log message: åäö tredje gÃ¥ngen gillt?? > > And same type of coding error in the diff. This new problem looks like > when you try to display latin1 text as utf8.
Hrm. In what encoding are your log messages and files kept in svn? UTF-8? And what happens if you omit the --language bit?

Could you send me the output of svnlook to use in my tests? I would need something like this:

    LANG=sv_SE.utf-8 svnlook info /path/to/svnroot -r 3426 > info.txt

And something like this:

    LANG=sv_SE.utf-8 svnlook diff /path/to/svnroot -r 3426 > diff.txt

Then send me those text files. A small commit that demonstrates the above would be best. Please do make sure that the output looks right in your text editor, though, and tweak the LANG environment variable if you need to get it right.
> I am very willing to test out any changes to this as this is a issue > that is hitting me daily since our commit logs are mostly in Swedish > language
Thanks, I appreciate it. Best, David
Subject: Re: [rt.cpan.org #28456] UTF-8 special characters in subject line not encoded correctly
Date: Mon, 18 Feb 2008 16:53:34 -0800
To: bug-SVN-Notify [...] rt.cpan.org
From: "David E. Wheeler" <dwheeler [...] cpan.org>
On Feb 18, 2008, at 10:45, david@justatheory.com via RT wrote:
>> But instead, text in commit log & diff is broken. It is now broken in >> another way: >> >> Subject: åäö tredje gången gillt?? >> Log message: åäö tredje gÃ¥ngen gillt?? >> >> And same type of coding error in the diff. This new problem looks >> like >> when you try to display latin1 text as utf8.
I think I've fixed this. Can you update your checkout of SVN::Notify and try again? It should be fixed as of r3434. https://svn.kineticode.com/SVN-Notify/trunk/ Thanks, David
From: martin [...] unicorn.tv
On Mon Feb 18 19:54:10 2008, DWHEELER wrote:
> I think I've fixed this. Can you update your checkout of SVN::Notify > and try again? It should be fixed as of r3434. > > https://svn.kineticode.com/SVN-Notify/trunk/ > > Thanks, > > David >
I just tried with r3438. Text is displayed correctly again in log message & in diff!!

But hmm, you broke something else. In the subject there are now "X" instead of Swedish letters. Like this:

    Subject: XXXXXX gogogogo
    Should be: åäö gogogogo

Apart from that, which does look intentional (?), it is starting to work!

And regarding the earlier question about text format: yes, the source code is in utf8, and I believe svn uses utf8 to store the log messages in as well, not sure about that...
Subject: Re: [rt.cpan.org #28456] UTF-8 special characters in subject line not encoded correctly
Date: Tue, 19 Feb 2008 11:26:59 -0800
To: bug-SVN-Notify [...] rt.cpan.org
From: "David E. Wheeler" <dwheeler [...] cpan.org>
On Feb 19, 2008, at 00:35, Martin Lindhe via RT wrote:
> I just tried with r3438. Text is displayed correctly again in log > message & in diff!!
Awesome.
> But hmm you broke somehting else. In subject there are now "X" instead > of swedish letters. Like this: > > Subject: XXXXXX gogogogo > Should be: åäö gogogogo > > apart from that, which does look intentional (?), it is starting to > work! > > And regarding earlier question about text format yes the source code > is > in utf8, and i believe svn uses utf8 to store the log messages in > aswell, not sure about that...
Ah, I see. I've just fixed that, too, I think. Please update your checkout and give it another try. Thanks! David
From: martin [...] unicorn.tv
On Tue Feb 19 14:27:29 2008, DWHEELER wrote:
> Ah, I see. I've just fixed that, too, I think. Please update your > checkout and give it another try. > > Thanks! > > David >
I just tried r3447, and the problem is actually back to the beginning :(

    Subject: ?\195?\165?\195?\164?\195?\182 test!!!
    Log message: ?\195?\165?\195?\164?\195?\182 test!!!

utf8 is displayed correctly in diff.

Also, I had to fix a bug in Notify.pm to get it working at all. Trying to figure out how to submit that now.
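Each `?\NNN` group in the output above looks like one byte written as a three-digit decimal number: 195 165 is C3 A5 in hexadecimal, which is the UTF-8 encoding of "å". Under that assumption, the Python sketch below turns the escaped text back into readable UTF-8. The helper name and the regular expression are illustrative and not part of SVN::Notify or Subversion.

```python
import re

# Matches a literal "?\" followed by exactly three decimal digits.
_ESCAPE = re.compile(rb"\?\\(\d{3})")


def unescape_decimal_bytes(text: str) -> str:
    """Replace each ?\\NNN escape with the byte NNN, then decode as UTF-8."""
    raw = text.encode("ascii")
    raw = _ESCAPE.sub(lambda m: bytes([int(m.group(1).decode("ascii"))]), raw)
    return raw.decode("utf-8")


if __name__ == "__main__":
    escaped = r"?\195?\165?\195?\164?\195?\182 test!!!"
    print(unescape_decimal_bytes(escaped))
```

The sketch assumes the escaped values are valid byte values (0 to 255) and that the underlying bytes are UTF-8; input that breaks either assumption raises an exception instead of being silently repaired.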
Subject: Re: [rt.cpan.org #28456] UTF-8 special characters in subject line not encoded correctly
Date: Wed, 20 Feb 2008 12:17:40 -0800
To: bug-SVN-Notify [...] rt.cpan.org
From: "David E. Wheeler" <dwheeler [...] cpan.org>
On Feb 20, 2008, at 01:05, Martin Lindhe via RT wrote:
> I just tried r3447, and the problem is actually back to the > beginning :( > > Subject: ?\195?\165?\195?\164?\195? \182 test!!! > Log message: ?\195?\165?\195?\164?\195?\182 test!!! > > utf8 is displayed correctly in diff
Yes, this seems to be a peculiarity of svnlook. See here:

    http://subversion.tigris.org/servlets/ReadMsg?list=users&msgNo=74948

If you set --language and --charset to come out to a valid locale on your system, it should fix the issue.
> Also, i had to fix a bug in Notify.pm to get it working at all. Trying > to figure how to submit that now
Much obliged. Best, David
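To check whether a given --language and --charset pair "comes out to a valid locale", one can compare the combined name against the locales installed on the machine. The Python sketch below does this under two assumptions that the thread does not confirm: that the two values are joined as `language.charset` (as in sv_SE.UTF-8), and that the system provides a POSIX-style `locale -a` command. All function names are hypothetical.

```python
import subprocess


def locale_name(language: str, charset: str) -> str:
    """Join language and charset the way LANG values are usually written."""
    return f"{language}.{charset}"


def _normalize(name: str) -> str:
    # `locale -a` often prints "utf8" where users write "UTF-8".
    return name.lower().replace("-", "")


def is_available(name: str, installed: list) -> bool:
    """True if `name` matches an installed locale, ignoring case and hyphens."""
    wanted = _normalize(name)
    return any(_normalize(item) == wanted for item in installed)


def installed_locales() -> list:
    """Return the locale names reported by `locale -a`."""
    result = subprocess.run(
        ["locale", "-a"],
        capture_output=True,
        encoding="utf-8",
        errors="replace",
        check=True,
    )
    return result.stdout.split()


if __name__ == "__main__":
    name = locale_name("sv_SE", "utf-8")
    print(name, is_available(name, installed_locales()))
```

If the check reports that the locale is missing, the combination is unlikely to help until that locale is generated on the system.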
From: martin [...] unicorn.tv
On Wed Feb 20 16:23:25 2008, DWHEELER wrote:
> On Feb 20, 2008, at 01:05, Martin Lindhe via RT wrote: >
> > I just tried r3447, and the problem is actually back to the > > beginning :( > > > > Subject: ?\195?\165?\195?\164?\195? \182 test!!! > > Log message: ?\195?\165?\195?\164?\195?\182 test!!! > > > > utf8 is displayed correctly in diff
> > Yes, this seems to be a peculiarity of svnlook. See here: > > http://subversion.tigris.org/servlets/ReadMsg?list=users&msgNo=74948 > > If you set --language and --charset to come out to a valid locale on > your system, it should fix the issue.
Well, I read your message about svnlook, and maybe that is indeed the issue. However, I can happily report that with r3453 it appears to fully work with utf8 in the subject, the log message, and the diff!!!

Thanks a lot for the help getting this fixed.

PS. Here is my post-commit hook:

    #!/bin/sh
    REPOS="$1"
    REV="$2"
    svnnotify -r "$REV" -p "$REPOS" \
      -C --subject-prefix "[cs] " -d \
      -H "HTML::ColorDiff" \
      --to "svnlog@..." \
      --from "dev-server <svn@dev-server>" \
      --charset utf-8 --language sv_SE
Subject: Re: [rt.cpan.org #28456] UTF-8 special characters in subject line not encoded correctly
Date: Thu, 21 Feb 2008 14:33:40 -0800
To: bug-SVN-Notify [...] rt.cpan.org
From: "David E. Wheeler" <dwheeler [...] cpan.org>
On Feb 21, 2008, at 00:25, Martin Lindhe via RT wrote:
> Well i read your message about svnlook and maybe that is indeed the > issue. However i can happily report that with r3453 it appears to > fully > work with utf8 in both subject, log message and the diff!!!
Great, I guess your OS has sv_SE.UTF-8 as a valid locale. Wish mine supported UTF-8. :-(
> Thanks alot for the help getting this fixed. > > PS. here is my post-commit hook:
Yes, helpful, thank you. I hope to get a new version out soon. David