Skip Menu |

Preferred bug tracker

Please visit the preferred bug tracker to report your issue.

This queue is for tickets about the Geo-ShapeFile CPAN distribution.

Report information
The Basics
Id: 83258
Status: resolved
Priority: 0/
Queue: Geo-ShapeFile

People
Owner: SLAFFAN [...] cpan.org
Requestors: nwetters [...] cpan.org
Cc:
AdminCc:

Bug Information
Severity: Wishlist
Broken in: 2.52
Fixed in: 2.56



Subject: speedup for contains_point
contains_point calls parts() and points() many times, when only one call is necessary. Here's a fix. sub contains_point { my ( $self, $point ) = @_; return 0 unless $self->bounds_contains_point( $point ); my $a = 0; my ( $x0, $y0 ) = ( $point->X, $point->Y ); my @parts = $self->parts; my @points = $self->points; foreach my $index (1 .. $self->num_parts) { my ( $x1, $y1 ); $index -= 1; # shift to a 0 index my $beg = $parts[$index] || 0; my $end = $parts[$index+1] || 0; $end -= 1; if($end < 0) { $end = $#points; } foreach my $p2 (@points[$beg .. $end]){ my $x2 = $p2->X - $x0; my $y2 = $p2->Y - $y0; if ( defined( $y1 ) && ( ( $y2 >= 0 ) != ( $y1 >= 0 ) ) ) { my $isl = $x1*$y2 - $y1*$x2; if ( $y2 > $y1 ) { --$a if $isl > 0; } else { ++$a if $isl < 0; } } ( $x1, $y1 ) = ( $x2, $y2 ); } } return $a; }
Hello Nigel, I now have mainternership of this module so am working through the bug queue. Are you able to provide some benchmarking results to quantify the speedup? I ran some profiling a while ago and the main slowdown was accessing the X and Y values (if I remember correctly). In any case, development is now at https://github.com/shawnlaffan/Geo-ShapeFile so you can send a pull request for any changes you might wish to see. Regards, Shawn. On Mon Feb 11 04:50:20 2013, NWETTERS wrote: Show quoted text
> contains_point calls parts() and points() many times, when only one > call is necessary. Here's a > fix. > > sub contains_point { > my ( $self, $point ) = @_; > > return 0 unless $self->bounds_contains_point( $point ); > > my $a = 0; > my ( $x0, $y0 ) = ( $point->X, $point->Y ); > > my @parts = $self->parts; > my @points = $self->points; > foreach my $index (1 .. $self->num_parts) { > my ( $x1, $y1 ); > > $index -= 1; # shift to a 0 index > > my $beg = $parts[$index] || 0; > my $end = $parts[$index+1] || 0; > $end -= 1; > if($end < 0) { $end = $#points; } > > foreach my $p2 (@points[$beg .. $end]){ > my $x2 = $p2->X - $x0; > my $y2 = $p2->Y - $y0; > > if ( defined( $y1 ) && ( ( $y2 >= 0 ) != ( $y1 >= 0 ) ) ) { > my $isl = $x1*$y2 - $y1*$x2; > if ( $y2 > $y1 ) { > --$a if $isl > 0; > } else { > ++$a if $isl < 0; > } > } > ( $x1, $y1 ) = ( $x2, $y2 ); > } > } > return $a; > }
I've looked into this and taken a slightly different approach. Rather than copy the code from get_part into contains_point, I have modified get_part to use array refs. This will give some speed improvements, and is more robust to (admittedly unlikely) later changes in the internal structure of the objects. See details at https://github.com/shawnlaffan/Geo-ShapeFile/commit/d2991ef2587edbe50c1d083de7feb99cf180b779 I will close this ticket when the next release is on CPAN. Regards, Shawn. On Sun Feb 09 03:17:02 2014, SLAFFAN wrote: Show quoted text
> Hello Nigel, > > I now have mainternership of this module so am working through the bug > queue. > > Are you able to provide some benchmarking results to quantify the > speedup? I ran some profiling a while ago and the main slowdown was > accessing the X and Y values (if I remember correctly). > > In any case, development is now at https://github.com/shawnlaffan/Geo- > ShapeFile so you can send a pull request for any changes you might > wish to see. > > Regards, > Shawn. > > > > On Mon Feb 11 04:50:20 2013, NWETTERS wrote:
> > contains_point calls parts() and points() many times, when only one > > call is necessary. Here's a > > fix. > > > > sub contains_point { > > my ( $self, $point ) = @_; > > > > return 0 unless $self->bounds_contains_point( $point ); > > > > my $a = 0; > > my ( $x0, $y0 ) = ( $point->X, $point->Y ); > > > > my @parts = $self->parts; > > my @points = $self->points; > > foreach my $index (1 .. $self->num_parts) { > > my ( $x1, $y1 ); > > > > $index -= 1; # shift to a 0 index > > > > my $beg = $parts[$index] || 0; > > my $end = $parts[$index+1] || 0; > > $end -= 1; > > if($end < 0) { $end = $#points; } > > > > foreach my $p2 (@points[$beg .. $end]){ > > my $x2 = $p2->X - $x0; > > my $y2 = $p2->Y - $y0; > > > > if ( defined( $y1 ) && ( ( $y2 >= 0 ) != ( $y1 >= 0 ) ) ) { > > my $isl = $x1*$y2 - $y1*$x2; > > if ( $y2 > $y1 ) { > > --$a if $isl > 0; > > } else { > > ++$a if $isl < 0; > > } > > } > > ( $x1, $y1 ) = ( $x2, $y2 ); > > } > > } > > return $a; > > }
2.56 has been released so I am now closing this ticket. This version also uses spatial indexing to speed up the contains_point search. Thanks for your report. Shawn. On Fri Feb 14 19:11:18 2014, SLAFFAN wrote: Show quoted text
> I've looked into this and taken a slightly different approach. > > Rather than copy the code from get_part into contains_point, I have > modified get_part to use array refs. This will give some speed > improvements, and is more robust to (admittedly unlikely) later > changes in the internal structure of the objects. > > See details at https://github.com/shawnlaffan/Geo- > ShapeFile/commit/d2991ef2587edbe50c1d083de7feb99cf180b779 > > I will close this ticket when the next release is on CPAN. > > > Regards, > Shawn. > > > > On Sun Feb 09 03:17:02 2014, SLAFFAN wrote:
> > Hello Nigel, > > > > I now have mainternership of this module so am working through the > > bug > > queue. > > > > Are you able to provide some benchmarking results to quantify the > > speedup? I ran some profiling a while ago and the main slowdown was > > accessing the X and Y values (if I remember correctly). > > > > In any case, development is now at > > https://github.com/shawnlaffan/Geo- > > ShapeFile so you can send a pull request for any changes you might > > wish to see. > > > > Regards, > > Shawn. > > > > > > > > On Mon Feb 11 04:50:20 2013, NWETTERS wrote:
> > > contains_point calls parts() and points() many times, when only one > > > call is necessary. Here's a > > > fix. > > > > > > sub contains_point { > > > my ( $self, $point ) = @_; > > > > > > return 0 unless $self->bounds_contains_point( $point ); > > > > > > my $a = 0; > > > my ( $x0, $y0 ) = ( $point->X, $point->Y ); > > > > > > my @parts = $self->parts; > > > my @points = $self->points; > > > foreach my $index (1 .. $self->num_parts) { > > > my ( $x1, $y1 ); > > > > > > $index -= 1; # shift to a 0 index > > > > > > my $beg = $parts[$index] || 0; > > > my $end = $parts[$index+1] || 0; > > > $end -= 1; > > > if($end < 0) { $end = $#points; } > > > > > > foreach my $p2 (@points[$beg .. $end]){ > > > my $x2 = $p2->X - $x0; > > > my $y2 = $p2->Y - $y0; > > > > > > if ( defined( $y1 ) && ( ( $y2 >= 0 ) != ( $y1 >= 0 ) ) ) { > > > my $isl = $x1*$y2 - $y1*$x2; > > > if ( $y2 > $y1 ) { > > > --$a if $isl > 0; > > > } else { > > > ++$a if $isl < 0; > > > } > > > } > > > ( $x1, $y1 ) = ( $x2, $y2 ); > > > } > > > } > > > return $a; > > > }