
This queue is for tickets about the HTML-Truncate CPAN distribution.

Report information
The Basics
Id: 34732
Status: resolved
Worked: 6.5 hours (390 min)
Priority: 0/
Queue: HTML-Truncate

People
Owner: ashley [...] cpan.org
Requestors: lorenzo.iannuzzi [...] staff.dada.net
Cc:
AdminCc:

Bug Information
Severity: (no value)
Broken in: (no value)
Fixed in: (no value)



Subject: truncate on space
Date: Mon, 07 Apr 2008 10:42:37 +0200
To: bug-html-truncate [...] rt.cpan.org
From: Lorenzo Iannuzzi <lorenzo.iannuzzi [...] staff.dada.net>
Hi Ashley,

I found your module useful. The only thing it lacked for me was an option to truncate on word boundaries, without leaving broken words. I added an option to do this; see the patch. I also noticed that, despite what the documentation says, the chars limit doesn't take the ellipsis into account.

Thanks for your work, and sorry if I made any mistakes.

Lorenzo Iannuzzi
Skype: innakis
Index: C:/Documents and Settings/l.iannuzzi/Documenti/workspace/importer/lib/HTML/Truncate.pm
===================================================================
--- C:/Documents and Settings/l.iannuzzi/Documenti/workspace/importer/lib/HTML/Truncate.pm (revision 4366)
+++ C:/Documents and Settings/l.iannuzzi/Documenti/workspace/importer/lib/HTML/Truncate.pm (working copy)
@@ -103,6 +103,7 @@
         _repair => undef,
         _skip_tags => \%skip,
         _stand_alone_tags => \%stand_alone,
+        _space_truncation => undef,
     }, $class;
 
     while ( my ( $k, $v ) = splice(@_, 0, 2) )
@@ -137,7 +138,7 @@
 =head2 $ht->chars
 
 Set/get. The number of characters remaining after truncation,
-including the C<ellipsis>. The C<style> attribute determines whether
+excluding the C<ellipsis>. The C<style> attribute determines whether
 the chars will only count text or HTML and text. Only "text" is
 supported currently.
 
@@ -350,7 +351,12 @@
 
             if ( $length > $chars )
             {
-                $self->{_renewed} .= substr($txt, 0, ( $chars ) );
+                $txt = substr($txt, 0, ( $chars ) + 1 );
+                if ( $self->{_space_truncation} )
+                {
+                    $txt =~ s/(.*)\s*\b.*$/$1/;
+                }
+                $self->{_renewed} .= $txt;
                 $self->{_renewed} =~ s/\s+\Z//;
                 $self->{_renewed} .= $self->ellipsis();
                 last TOKENS;
@@ -430,6 +436,26 @@
     }
 }
 
+=head2 $ht->space_truncation
+
+Set/get, true/false. If true, will attempt to truncate on word boundaries,
+i.e. on last occurring space before truncation limit.
+
+=cut
+
+sub space_truncation {
+    my $self = shift;
+    if ( @_ )
+    {
+        $self->{_space_truncation} = shift;
+        return 1; # say we did it, even if untrue value
+    }
+    else
+    {
+        return $self->{_space_truncation};
+    }
+}
+
 #
 
 sub _load_chars_from_percent {
Subject: Re: [rt.cpan.org #34732] truncate on space
Date: Mon, 7 Apr 2008 08:05:10 -0700
To: bug-HTML-Truncate [...] rt.cpan.org
From: Ashley <apv [...] sedition.com>
Thanks a lot. It may take me until the weekend to look at this. I'll get back to you then.

-Ashley
Subject: Re: [rt.cpan.org #34732] truncate on space
Date: Thu, 15 May 2008 10:47:07 +0200
To: bug-HTML-Truncate [...] rt.cpan.org
From: Lorenzo Iannuzzi <lorenzo.iannuzzi [...] staff.dada.net>
>> + $txt =~ s/(.*)\s*\b.*$/$1/;

I changed the above regex to the following, which works:

s/(.*)\b\W+\w+$/$1/

--
Lorenzo Iannuzzi
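To see why the substitution needed changing, here is a minimal sketch that applies both versions to a plain string. The helper `truncate_with` is hypothetical and not part of HTML::Truncate; it only mimics the patched branch (take `$chars + 1` characters, substitute, strip trailing whitespace) without any HTML handling.

```perl
use strict;
use warnings;

# Hypothetical helper, for illustration only: mimics the patched branch
# on a plain string, using whichever substitution is requested.
sub truncate_with {
    my ( $txt, $chars, $which ) = @_;

    # Keep one extra character, as the patch does, so a cut that lands
    # exactly at the end of a word can be told apart from a mid-word cut.
    $txt = substr( $txt, 0, $chars + 1 );

    if ( $which eq 'original' ) {
        # Greedy (.*) swallows the whole string and \b matches at the end
        # of a trailing word, so a mid-word cut is left untouched.
        $txt =~ s/(.*)\s*\b.*$/$1/;
    }
    else {
        # Requires non-word characters followed by a trailing run of word
        # characters, so the last (possibly partial) word is dropped.
        $txt =~ s/(.*)\b\W+\w+$/$1/;
    }

    $txt =~ s/\s+\Z//;    # same trailing-whitespace cleanup as the patch
    return $txt;
}

my $text = "The quick brown fox";
print truncate_with( $text, 12, 'original' ), "\n";    # The quick bro
print truncate_with( $text, 12, 'revised' ),  "\n";    # The quick
```

Note that when the text contains no non-word character before the cut (a single long word), the revised substitution does not match either, and the `$chars + 1` characters are kept as they are.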
I worked on updating the module a bit today; the tests need to be a bit better and I'm updating some other internal stuff and the Makefile. I'll do more tonight or over the weekend and get version 0.12 out as soon as I can. Thanks for reminding me about this!
Version 0.12 is on the way up to the CPAN.

http://search.cpan.org/dist/HTML-Truncate/lib/HTML/Truncate.pm

I *think* this covers your request correctly. I found things in a bit of disrepair, so I rewrote a lot of the guts and added some new tests. The module should in general be much better and more accurate now.

I also took a cue from you to add a cleanly() method, which is something I do in my own TT2 stuff; like s/[\s[:punct:]]+\z// or a user-provided regex. It seems to play nicely with on_space().

Thanks!
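As a rough illustration of the kind of trailing cleanup described above, the following sketch strips trailing whitespace and punctuation from an already-truncated plain string before an ellipsis is appended. The helper name `clean_tail` and its signature are assumptions made for this example; this is not the module's own cleanly() implementation.

```perl
use strict;
use warnings;

# Hypothetical stand-alone helper; name and signature are illustrative only.
sub clean_tail {
    my ( $txt, $re ) = @_;

    # Default pattern taken from the comment above; the caller may pass
    # a different compiled regex instead.
    $re = qr/[\s[:punct:]]+\z/ unless defined $re;

    $txt =~ s/$re//;
    return $txt;
}

# Default pattern: the dangling comma and space are removed before the ellipsis.
print clean_tail("The quick, "), "...\n";                 # The quick...

# Caller-supplied pattern: drop a trailing number instead.
print clean_tail( "Chapter 12", qr/\s*\d+\z/ ), "\n";     # Chapter
```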