Skip Menu |

This queue is for tickets about the Text-Hunspell CPAN distribution.

Report information
The Basics
Id: 84054
Status: resolved
Priority: 0/
Queue: Text-Hunspell

People
Owner: Nobody in particular
Requestors: paul [...] frixxon.co.uk
Cc:
AdminCc:

Bug Information
Severity: Important
Broken in: 2.06
Fixed in: (no value)



Subject: [RFE] [PATCH] Multiple speller objects are not supported
Text::Hunspell currently doesn't support multiple speller objects, and it fails in a rather surprising way. If you try to create a second object (to support loading another language's dictionary, for example), you can no longer look up words on the first handle. A bit of digging shows that Hunspell.xs maintains a single static handle to the library. This seems to have been a workaround for THIS references not working. The underlying fault was a comment line in the typemap that was causing the compiler to miss out the line that initialised THIS, so it segfaulted all the time. This patch looks rather big, so I'll explain: 1. The typemap has been corrected by removing the comments that caused compilation problems. 2. Hunspell.xs has been reworked to remove the static handle and restore THIS references. 3. Hunspell.xs now loses the delete() function, which didn't make sense, but it gains a DESTROY object so that the library object can be garbage collected (these two lines taken straight from perlxs documentation). 4. Test case added for multiple spellers. If you add this test case to 2.06, you'll see it failing. 5. Removed delete() from other test cases and Hunspell.pm documentation. Hopefully this all looks cleaner. In my testing, I've successfully used this to spell-check a document in four languages (four Text::Hunspell objects) and a personal dictionary, with no problems. (Just checking: do you prefer receiving patches through this form still, or should I put in pull requests on github?)
Subject: non-static-handle-patch.txt
diff -urN Text-Hunspell-2.06/Hunspell.pm Text-Hunspell-2.06.new/Hunspell.pm --- Text-Hunspell-2.06/Hunspell.pm 2013-03-09 10:34:53.000000000 +0000 +++ Text-Hunspell-2.06.new/Hunspell.pm 2013-03-19 12:30:38.011736000 +0000 @@ -98,9 +98,6 @@ print "Morphological modification generator...\n"; print Data::Dumper::Dumper(\@suggestions); - # Deletes the underlying Hunspell C/C++ object - $speller->delete($speller); - =head1 DESCRIPTION @@ -187,12 +184,6 @@ TODO WHY IS THIS DIFFERENT FROM generate2() ??? EXPLAIN. -=head2 C<< $speller->delete($speller) >> - -Deletes the speller class. - -TODO WHY IS THIS NEEDED?? Called on C<$speller> and needs C<$speller> ??? - =head1 BUGS Probably. Yes, definitely. diff -urN Text-Hunspell-2.06/Hunspell.xs Text-Hunspell-2.06.new/Hunspell.xs --- Text-Hunspell-2.06/Hunspell.xs 2013-03-09 10:26:06.000000000 +0000 +++ Text-Hunspell-2.06.new/Hunspell.xs 2013-03-19 12:14:09.174192000 +0000 @@ -19,16 +19,6 @@ /* $Id: Hunspell.xs,v 1.5 2002/08/29 20:28:00 moseley Exp $ */ -static Hunspell *handle; - -/* Needed for static initialization inside XS functions. - A hack, yes, since 'THIS' is always empty it seems. -*/ -static Hunspell * get_hunspell_handle () { - assert(handle != NULL); - return handle; -} - static void * get_mortalspace ( size_t nbytes ) { SV * mortal; mortal = sv_2mortal( NEWSV(0, nbytes ) ); @@ -47,34 +37,19 @@ char *aff; char *dic; CODE: - /* Store the new object as static shared handle. - Every new will overwrite it. Ugly, but reasonable. */ - handle = new Hunspell(aff, dic); - RETVAL = handle; + RETVAL = new Hunspell(aff, dic); OUTPUT: RETVAL - -int -Hunspell::delete(h) - Hunspell *h; - CODE: - delete h; - /* If we deleted our shared static object (most likely) - then remove the reference to it, so it doesn't get used */ - if (h == handle) - handle = NULL; - - /* And return something to the caller too. */ - RETVAL = 1; +void +Hunspell::DESTROY() int Hunspell::add_dic(dic) char *dic; CODE: - handle = get_hunspell_handle(); - RETVAL = handle->add_dic(dic); + RETVAL = THIS->add_dic(dic); OUTPUT: RETVAL @@ -83,8 +58,7 @@ Hunspell::check(buf) char *buf; CODE: - handle = get_hunspell_handle(); - RETVAL = handle->spell(buf); + RETVAL = THIS->spell(buf); OUTPUT: RETVAL @@ -96,8 +70,7 @@ char **wlsti; int i, val; PPCODE: - handle = get_hunspell_handle(); - val = handle->suggest(&wlsti, buf); + val = THIS->suggest(&wlsti, buf); for (int i = 0; i < val; i++) { PUSHs(sv_2mortal(newSVpv( wlsti[i] ,0 ))); free(wlsti[i]); @@ -110,8 +83,7 @@ char **wlsti; int i, val; PPCODE: - handle = get_hunspell_handle(); - val = handle->analyze(&wlsti, buf); + val = THIS->analyze(&wlsti, buf); for (i = 0; i < val; i++) { PUSHs(sv_2mortal(newSVpv(wlsti[i], 0))); free(wlsti[i]); @@ -125,8 +97,7 @@ char **wlsti; int i, val; PPCODE: - handle = get_hunspell_handle(); - val = handle->stem(&wlsti, buf); + val = THIS->stem(&wlsti, buf); for (int i = 0; i < val; i++) { PUSHs(sv_2mortal(newSVpv( wlsti[i] ,0 ))); free(wlsti[i]); @@ -141,8 +112,7 @@ char **wlsti; int i, val; PPCODE: - handle = get_hunspell_handle(); - val = handle->generate(&wlsti, buf, sample); + val = THIS->generate(&wlsti, buf, sample); for (int i = 0; i < val; i++) { PUSHs(sv_2mortal(newSVpv( wlsti[i] ,0 ))); free(wlsti[i]); @@ -171,8 +141,7 @@ array[i] = SvPV( *elem, PL_na ); } - handle = get_hunspell_handle(); - val = handle->generate(&wlsti, buf, array, len); + val = THIS->generate(&wlsti, buf, array, len); for (int i = 0; i < val; i++) { PUSHs(sv_2mortal(newSVpv( wlsti[i] ,0 ))); diff -urN Text-Hunspell-2.06/t/05-core.t Text-Hunspell-2.06.new/t/05-core.t --- Text-Hunspell-2.06/t/05-core.t 2013-03-09 10:17:47.000000000 +0000 +++ Text-Hunspell-2.06.new/t/05-core.t 2013-03-19 12:27:06.477776000 +0000 @@ -2,7 +2,7 @@ use warnings; use Data::Dumper; -use Test::More tests => 6; +use Test::More tests => 5; use Text::Hunspell; my $speller = Text::Hunspell->new(qw(./t/test.aff ./t/test.dic)); @@ -30,7 +30,3 @@ \@suggestions => [ qw(lói ló lót) ], q(List of suggestions should be correct) ); - -# Curtains down -ok($speller->delete($speller), q(delete method presumably worked)); - diff -urN Text-Hunspell-2.06/t/07-add-dictionary.t Text-Hunspell-2.06.new/t/07-add-dictionary.t --- Text-Hunspell-2.06/t/07-add-dictionary.t 2013-03-09 10:28:20.000000000 +0000 +++ Text-Hunspell-2.06.new/t/07-add-dictionary.t 2013-03-19 12:27:07.666028000 +0000 @@ -2,7 +2,7 @@ use warnings; use Data::Dumper; -use Test::More tests => 5; +use Test::More tests => 4; use Text::Hunspell; my $speller = Text::Hunspell->new(qw(./t/test.aff ./t/test.dic)); @@ -24,6 +24,3 @@ $speller->check($word), qq(Word '$word' is in the supplemental dictionary) ); - -# Curtains down -ok($speller->delete($speller), q(delete method presumably worked)); diff -urN Text-Hunspell-2.06/t/08-two-dics.t Text-Hunspell-2.06.new/t/08-two-dics.t --- Text-Hunspell-2.06/t/08-two-dics.t 1970-01-01 01:00:00.000000000 +0100 +++ Text-Hunspell-2.06.new/t/08-two-dics.t 2013-03-19 12:27:05.961087000 +0000 @@ -0,0 +1,40 @@ +use strict; +use warnings; + +use Test::More tests => 7; +use Text::Hunspell; + +my $speller = Text::Hunspell->new(qw(./t/test.aff ./t/test.dic)); +die unless $speller; +ok($speller, qq(Created a Text::Hunspell object [$speller])); + +my $word = q(lótól); +ok( + $speller->check($word), + qq(Word '$word' found in first dictionary) +); + +my $speller2 = Text::Hunspell->new(qw(./t/supp.aff ./t/supp.dic)); +die unless $speller2; +ok($speller2, qq(Created another Text::Hunspell object [$speller2])); + +ok( + $speller->check($word), + qq(Word '$word' found in first dictionary) +); + +ok( + !$speller2->check($word), + qq(Word '$word' not found in second dictionary) +); + +$word = q(munkey); +ok( + !$speller->check($word), + qq(Word '$word' not found in first dictionary) +); + +ok( + $speller2->check($word), + qq(Word '$word' found in second dictionary) +); diff -urN Text-Hunspell-2.06/t/supp.aff Text-Hunspell-2.06.new/t/supp.aff --- Text-Hunspell-2.06/t/supp.aff 1970-01-01 01:00:00.000000000 +0100 +++ Text-Hunspell-2.06.new/t/supp.aff 2013-03-19 12:27:50.427076000 +0000 @@ -0,0 +1 @@ +SET UTF-8 diff -urN Text-Hunspell-2.06/typemap Text-Hunspell-2.06.new/typemap --- Text-Hunspell-2.06/typemap 2013-03-09 10:17:47.000000000 +0000 +++ Text-Hunspell-2.06.new/typemap 2013-03-19 12:12:01.895918000 +0000 @@ -2,7 +2,6 @@ TYPEMAP Hunspell * O_OBJECT -Hunspell O_OBJECT # From: "perlobject.map" Dean Roehrich, version 19960302 # O_OBJECT -> link an opaque C or C++ object to a blessed Perl object. @@ -15,10 +14,8 @@ INPUT O_OBJECT -# warn(\" sv_isobject($arg) $arg SvTYPE(SvRV($arg)) SVt_PVMG $type\"); if( sv_isobject($arg) && (SvTYPE(SvRV($arg)) == SVt_PVMG) ) -# $var = ($type)SvIV((SV*)SvRV( $arg )); - $var = INT2PTR ($type, SvIV (SvRV ($arg))); + $var = ($type)SvIV((SV*)SvRV( $arg )); else{ warn( \"${Package}::$func_name() -- $var is not a blessed SV reference\" ); XSRETURN_UNDEF;
From: paul [...] frixxon.co.uk
On Tue Mar 19 08:54:58 2013, hisdeedsaredust wrote: Show quoted text
> This patch looks rather big, so I'll explain: > 3. Hunspell.xs now loses the delete() function, which didn't make sense, > but it gains a DESTROY object so that the library object can be garbage > collected (these two lines taken straight from perlxs documentation).
OK, this bit was stupid, as it might break existing scripts. This replacement patch retains delete(), but it does nothing other than warn that it is deprecated.
Subject: non-static-handle-patch2.txt
diff -urN Text-Hunspell-2.06/Hunspell.pm Text-Hunspell-2.06.new/Hunspell.pm --- Text-Hunspell-2.06/Hunspell.pm 2013-03-09 10:34:53.000000000 +0000 +++ Text-Hunspell-2.06.new/Hunspell.pm 2013-03-19 12:30:38.011736000 +0000 @@ -98,9 +98,6 @@ print "Morphological modification generator...\n"; print Data::Dumper::Dumper(\@suggestions); - # Deletes the underlying Hunspell C/C++ object - $speller->delete($speller); - =head1 DESCRIPTION @@ -187,12 +184,6 @@ TODO WHY IS THIS DIFFERENT FROM generate2() ??? EXPLAIN. -=head2 C<< $speller->delete($speller) >> - -Deletes the speller class. - -TODO WHY IS THIS NEEDED?? Called on C<$speller> and needs C<$speller> ??? - =head1 BUGS Probably. Yes, definitely. diff -urN Text-Hunspell-2.06/Hunspell.xs Text-Hunspell-2.06.new/Hunspell.xs --- Text-Hunspell-2.06/Hunspell.xs 2013-03-09 10:26:06.000000000 +0000 +++ Text-Hunspell-2.06.new/Hunspell.xs 2013-03-19 16:22:30.818653000 +0000 @@ -19,16 +19,6 @@ /* $Id: Hunspell.xs,v 1.5 2002/08/29 20:28:00 moseley Exp $ */ -static Hunspell *handle; - -/* Needed for static initialization inside XS functions. - A hack, yes, since 'THIS' is always empty it seems. -*/ -static Hunspell * get_hunspell_handle () { - assert(handle != NULL); - return handle; -} - static void * get_mortalspace ( size_t nbytes ) { SV * mortal; mortal = sv_2mortal( NEWSV(0, nbytes ) ); @@ -47,34 +37,28 @@ char *aff; char *dic; CODE: - /* Store the new object as static shared handle. - Every new will overwrite it. Ugly, but reasonable. */ - handle = new Hunspell(aff, dic); - RETVAL = handle; + RETVAL = new Hunspell(aff, dic); OUTPUT: RETVAL - int Hunspell::delete(h) Hunspell *h; CODE: - delete h; - /* If we deleted our shared static object (most likely) - then remove the reference to it, so it doesn't get used */ - if (h == handle) - handle = NULL; - - /* And return something to the caller too. */ + warn("Text::Hunspell::delete() is deprecated and no replacement is needed"); RETVAL = 1; + OUTPUT: + RETVAL + +void +Hunspell::DESTROY() int Hunspell::add_dic(dic) char *dic; CODE: - handle = get_hunspell_handle(); - RETVAL = handle->add_dic(dic); + RETVAL = THIS->add_dic(dic); OUTPUT: RETVAL @@ -83,8 +67,7 @@ Hunspell::check(buf) char *buf; CODE: - handle = get_hunspell_handle(); - RETVAL = handle->spell(buf); + RETVAL = THIS->spell(buf); OUTPUT: RETVAL @@ -96,8 +79,7 @@ char **wlsti; int i, val; PPCODE: - handle = get_hunspell_handle(); - val = handle->suggest(&wlsti, buf); + val = THIS->suggest(&wlsti, buf); for (int i = 0; i < val; i++) { PUSHs(sv_2mortal(newSVpv( wlsti[i] ,0 ))); free(wlsti[i]); @@ -110,8 +92,7 @@ char **wlsti; int i, val; PPCODE: - handle = get_hunspell_handle(); - val = handle->analyze(&wlsti, buf); + val = THIS->analyze(&wlsti, buf); for (i = 0; i < val; i++) { PUSHs(sv_2mortal(newSVpv(wlsti[i], 0))); free(wlsti[i]); @@ -125,8 +106,7 @@ char **wlsti; int i, val; PPCODE: - handle = get_hunspell_handle(); - val = handle->stem(&wlsti, buf); + val = THIS->stem(&wlsti, buf); for (int i = 0; i < val; i++) { PUSHs(sv_2mortal(newSVpv( wlsti[i] ,0 ))); free(wlsti[i]); @@ -141,8 +121,7 @@ char **wlsti; int i, val; PPCODE: - handle = get_hunspell_handle(); - val = handle->generate(&wlsti, buf, sample); + val = THIS->generate(&wlsti, buf, sample); for (int i = 0; i < val; i++) { PUSHs(sv_2mortal(newSVpv( wlsti[i] ,0 ))); free(wlsti[i]); @@ -171,8 +150,7 @@ array[i] = SvPV( *elem, PL_na ); } - handle = get_hunspell_handle(); - val = handle->generate(&wlsti, buf, array, len); + val = THIS->generate(&wlsti, buf, array, len); for (int i = 0; i < val; i++) { PUSHs(sv_2mortal(newSVpv( wlsti[i] ,0 ))); diff -urN Text-Hunspell-2.06/t/05-core.t Text-Hunspell-2.06.new/t/05-core.t --- Text-Hunspell-2.06/t/05-core.t 2013-03-09 10:17:47.000000000 +0000 +++ Text-Hunspell-2.06.new/t/05-core.t 2013-03-19 12:27:06.477776000 +0000 @@ -2,7 +2,7 @@ use warnings; use Data::Dumper; -use Test::More tests => 6; +use Test::More tests => 5; use Text::Hunspell; my $speller = Text::Hunspell->new(qw(./t/test.aff ./t/test.dic)); @@ -30,7 +30,3 @@ \@suggestions => [ qw(lói ló lót) ], q(List of suggestions should be correct) ); - -# Curtains down -ok($speller->delete($speller), q(delete method presumably worked)); - diff -urN Text-Hunspell-2.06/t/07-add-dictionary.t Text-Hunspell-2.06.new/t/07-add-dictionary.t --- Text-Hunspell-2.06/t/07-add-dictionary.t 2013-03-09 10:28:20.000000000 +0000 +++ Text-Hunspell-2.06.new/t/07-add-dictionary.t 2013-03-19 12:27:07.666028000 +0000 @@ -2,7 +2,7 @@ use warnings; use Data::Dumper; -use Test::More tests => 5; +use Test::More tests => 4; use Text::Hunspell; my $speller = Text::Hunspell->new(qw(./t/test.aff ./t/test.dic)); @@ -24,6 +24,3 @@ $speller->check($word), qq(Word '$word' is in the supplemental dictionary) ); - -# Curtains down -ok($speller->delete($speller), q(delete method presumably worked)); diff -urN Text-Hunspell-2.06/t/08-two-dics.t Text-Hunspell-2.06.new/t/08-two-dics.t --- Text-Hunspell-2.06/t/08-two-dics.t 1970-01-01 01:00:00.000000000 +0100 +++ Text-Hunspell-2.06.new/t/08-two-dics.t 2013-03-19 12:27:05.961087000 +0000 @@ -0,0 +1,40 @@ +use strict; +use warnings; + +use Test::More tests => 7; +use Text::Hunspell; + +my $speller = Text::Hunspell->new(qw(./t/test.aff ./t/test.dic)); +die unless $speller; +ok($speller, qq(Created a Text::Hunspell object [$speller])); + +my $word = q(lótól); +ok( + $speller->check($word), + qq(Word '$word' found in first dictionary) +); + +my $speller2 = Text::Hunspell->new(qw(./t/supp.aff ./t/supp.dic)); +die unless $speller2; +ok($speller2, qq(Created another Text::Hunspell object [$speller2])); + +ok( + $speller->check($word), + qq(Word '$word' found in first dictionary) +); + +ok( + !$speller2->check($word), + qq(Word '$word' not found in second dictionary) +); + +$word = q(munkey); +ok( + !$speller->check($word), + qq(Word '$word' not found in first dictionary) +); + +ok( + $speller2->check($word), + qq(Word '$word' found in second dictionary) +); diff -urN Text-Hunspell-2.06/t/supp.aff Text-Hunspell-2.06.new/t/supp.aff --- Text-Hunspell-2.06/t/supp.aff 1970-01-01 01:00:00.000000000 +0100 +++ Text-Hunspell-2.06.new/t/supp.aff 2013-03-19 12:27:50.427076000 +0000 @@ -0,0 +1 @@ +SET UTF-8 diff -urN Text-Hunspell-2.06/typemap Text-Hunspell-2.06.new/typemap --- Text-Hunspell-2.06/typemap 2013-03-09 10:17:47.000000000 +0000 +++ Text-Hunspell-2.06.new/typemap 2013-03-19 12:12:01.895918000 +0000 @@ -2,7 +2,6 @@ TYPEMAP Hunspell * O_OBJECT -Hunspell O_OBJECT # From: "perlobject.map" Dean Roehrich, version 19960302 # O_OBJECT -> link an opaque C or C++ object to a blessed Perl object. @@ -15,10 +14,8 @@ INPUT O_OBJECT -# warn(\" sv_isobject($arg) $arg SvTYPE(SvRV($arg)) SVt_PVMG $type\"); if( sv_isobject($arg) && (SvTYPE(SvRV($arg)) == SVt_PVMG) ) -# $var = ($type)SvIV((SV*)SvRV( $arg )); - $var = INT2PTR ($type, SvIV (SvRV ($arg))); + $var = ($type)SvIV((SV*)SvRV( $arg )); else{ warn( \"${Package}::$func_name() -- $var is not a blessed SV reference\" ); XSRETURN_UNDEF;
Hi Paul, On Tue Mar 19 12:48:03 2013, hisdeedsaredust wrote: Show quoted text
> On Tue Mar 19 08:54:58 2013, hisdeedsaredust wrote:
> > This patch looks rather big, so I'll explain: > > 3. Hunspell.xs now loses the delete() function, which didn't make sense, > > but it gains a DESTROY object so that the library object can be garbage > > collected (these two lines taken straight from perlxs documentation).
> > OK, this bit was stupid, as it might break existing scripts. This > replacement patch retains delete(), but it does nothing other than warn > that it is deprecated.
thanks for your effort, but as you may know “Patches, patches… we don't need no stinkin’ patches!” (via http://en.wikipedia.org/wiki/Troop_Beverly_Hills originally based on http://en.wikipedia.org/wiki/Stinking_badges ). Can you please give me a pull request for that? Not that I'm sure where the repository is because COSIMO does not appear to use one? COSIMO, wherefore are thy repository? And if it does not exist, set up one on http://github.com/ or wherever. Regards, -- Shlomi Fish (who is in a silly, but hopefully amusing, mood).
From: paul [...] frixxon.co.uk
On Tue Mar 19 19:32:42 2013, SHLOMIF wrote: Show quoted text
> Can you please > give me a pull request for that? Not that I'm sure where the > repository is because COSIMO does not appear to use one? > > COSIMO, wherefore are thy repository? And if it does not exist, set up > one on http://github.com/ or wherever.
https://github.com/cosimo/perl5-text-hunspell/pull/1
On Wed Mar 20 03:04:59 2013, hisdeedsaredust wrote: Show quoted text
> On Tue Mar 19 19:32:42 2013, SHLOMIF wrote:
> > Can you please > > give me a pull request for that? Not that I'm sure where the > > repository is because COSIMO does not appear to use one? > > > > COSIMO, wherefore are thy repository? And if it does not exist, set up > > one on http://github.com/ or wherever.
> > https://github.com/cosimo/perl5-text-hunspell/pull/1
Thanks Paul! I'll go through this during the weekend. Thanks much again. I'll add the github repository in the META info, so everyone will know where to find the sources.
Subject: Re: [rt.cpan.org #84054] [RFE] [PATCH] Multiple speller objects are not supported
Date: Tue, 26 Mar 2013 20:43:37 +0100
To: bug-Text-Hunspell [...] rt.cpan.org
From: Cosimo Streppone <cosimo [...] streppone.it>
On 03/20/2013 08:04 AM, Paul Flo Williams via RT wrote: Show quoted text
Hi Paul, this looks great. Thanks for this patch. It cleans up one of the parts I had tagged with that "TODO" comment "why is this needed at all" :-) Thanks much again. I'll apply it now and release a new version on CPAN soon after. -- Cosimo
2.07 has been just pushed out to CPAN. Considering this closed.