Bug #127067 for TextFileParser: Recommend rename distribution to Text::File::Parser (or similar)

Fri Sep 07 18:12:57 2018 jkeenan [...] pobox.com - Ticket created

Subject:	Recommend rename distribution to Text::File::Parser (or similar)
Date:	Fri, 7 Sep 2018 18:12:39 -0400
To:	bug-textfileparser [...] rt.cpan.org
From:	James E Keenan <jkeenan [...] pobox.com>

Balaji Ramasubramanian,

My attention was recently called to https://metacpan.org/release/TextFileParser, which appears to be your first release to CPAN. Thank you very much for writing this distribution. Welcome to CPAN.

I would, however, like to point out some things about your release that may discourage other users from using it. Fortunately, these things are correctable -- and I hope you will undertake such corrections.

#####

First, by naming your module "TextFileParser" rather than, say, Text::File::Parser, you have committed the faux pas of needlessly generating a new top-level namespace on CPAN. CPAN distributions are generally named following a convention where the top-level namespace refers to a large class of computing problems; the lower-level parts of the namespace get more specific. If a CPAN distribution has as its focus a problem which falls into a large, well-known class of computing problems, then it ought not to create a new top-level namespace; rather, it should start at the top-level and proceed downward.

Concretely, this means that a distribution whose purpose is the parsing of text files should be placed in the top-level namespace "Text". Suppose that the purpose of your distribution were to parse a particular type of plain-text file known as "Gamma" files. Then you would probably want to name your module something like:

    Text::Parse::Gamma

You would *not* want to create a new distribution called:

    TextParseGamma

Anyone coming across "TextParseGamma" would immediately spot that as being the work of someone unfamiliar with the conventions of CPAN and the worldwide Perl community. That would be an impediment to getting other people to use your code -- which, after all, is the purpose of CPAN.

#####

Second, when documenting CPAN modules (much like when doing academic research or filing a patent application), it is important to discuss prior art in the subject area and why a new CPAN module is needed. In your case -- the parsing of CSV files -- there are two CPAN distributions which have been in development and widespread production use for more than twenty years:

https://metacpan.org/pod/Text::CSV

and its XS variant

https://metacpan.org/pod/Text::CSV_XS

Those distributions are well maintained and well respected, as evidenced by the fact that other CPAN contributors have built other CPAN distributions dependent upon them. (For example, I have written https://metacpan.org/pod/Text::CSV::Hashify).

Now, this being Perl, There Is More Than One Way to Do It(TM), so fresh approaches to computing problems are always welcome. But if you're going to push some code to CPAN that is focused on a problem which other people have already addressed, at the very least you must discuss that prior art, provide links to those distributions and suggest why someone would use your distribution instead of those others.

#####

My hunch is that, like many first-time CPAN contributors, you have not thoroughly understood what CPAN users expect out of new distributions. Don't worry, you're not alone in this. I made the same or similar mistakes when I started contributing to CPAN. Nonetheless, I recommend the following:

* Read these two documents which are part of the Perl 5 core distribution:

** perldoc perlnewmod
** perldoc perlmodlib

... available at your command-line or perldoc.perl.org. These documents are aimed at helping you avoid the most typical missteps by new CPAN authors. Review what you have done so far in the light of the guidance they provide, and revise accordingly.

* Subscribe to the 'module-authors' mailing list at lists.perl.org. (Also available via news interface as 'perl.module-authors' from the nntp.perl.org server.) This is not a high-volume mailing list, but it is the place where you should go to get feedback on (a) the naming of CPAN distributions and (b) where your ideas stand in relation to prior art. Once you have revised your distribution in the light of having read 'perlnewmod' and 'perlmodlib', you would do well to post on the 'module-authors' list about your module before re-uploading to CPAN.

* Should you choose to revise the name of your CPAN distribution so as to use a more conventional top-level namespace like 'Text' (rather than 'TextFileParser'), once you have used the PAUSE interface to upload your new distribution you would simply click on the 'Delete Files' link and schedule your releases under the 'TextFileParser' namespace for deletion -- or, more precisely, for removal to http://backpan.perl.org/authors/id/.

Once again, thanks for contributing to CPAN and the worldwide Perl community. I hope the suggestions I've made will enable you to make more and improved contributions in the future.

Thank you very much.
James E Keenan
CPAN ID: JKEENAN

Fri Sep 07 18:20:52 2018 jkeenan [...] cpan.org - Correspondence added

On Fri Sep 07 18:12:57 2018, jkeenan@pobox.com wrote:

> #####
>
> Second, when documenting CPAN modules (much like when doing academic
> research or filing a patent application), it is important to discuss
> prior art in the subject area and why a new CPAN module is needed. In
> your case -- the parsing of CSV files -- there are two CPAN
> distributions which have been in development and widespread production
> use for more than twenty years:
>
> https://metacpan.org/pod/Text::CSV
>
> and its XS variant
>
> https://metacpan.org/pod/Text::CSV_XS
>
> Those distributions are well maintained and well respected, as evidenced
> by the fact that other CPAN contributors have built other CPAN
> distributions dependent upon them. (For example, I have written
> https://metacpan.org/pod/Text::CSV::Hashify).
>
> Now, this being Perl, There Is More Than One Way to Do It(TM), so fresh
> approaches to computing problems are always welcome. But if you're
> going to push some code to CPAN that is focused on a problem which other
> people have already addressed, at the very least you must discuss that
> prior art, provide links to those distributions and suggest why someone
> would use your distribution instead of those others.

Subsequent to posting the above I downloaded your code and read the main module. I now realize that your module is not specific to the parsing of CSV files, so this point is largely, but not entirely, moot. I do think, however, that you should not use 'TextFileParser' as a top-level namespace.


Fri Sep 07 18:20:53 2018 The RT System itself - Status changed from 'new' to 'open'

Sat Sep 15 03:36:12 2018 balaji.ramasubramanian [...] gmail.com - Correspondence added

Hi James,

After your email, I read a little bit about the naming of CPAN modules. I'm sorry I hadn't read this earlier.

Do you think it would make sense to rename this class to Parser::Text or do you think Text::Parser is better? Which would be easier to find and maintain?

If I write child classes for specific formats later - say the SPICE format, or the SPEF format, or something else - then what would you suggest naming those classes?

If I select Text::Parser as base class I would have to write SPICE::Parser or SPEF::Parser which would probably not be ideal.

But if I select Parser::Text as base class, I could write Parser::SPICE, Parser::SPEF etc., and all Parsers can then sit in one namespace.

Which do you think is the ideal solution?

Thanks,
Balaji

Sat Sep 15 03:36:14 2018 balaji.ramasubramanian [...] gmail.com - Taken

Sat Sep 15 03:38:59 2018 balaji.ramasubramanian [...] gmail.com - Correspondence added

Also, if we do rename the distribution, how do we change other things? For example, what happens to the existing distro on CPAN?

Thanks,
Balaji

Sat Sep 15 16:02:00 2018 jkeenan [...] pobox.com - Correspondence added

Subject:	Re: [rt.cpan.org #127067] Recommend rename distribution to Text::File::Parser (or similar)
Date:	Sat, 15 Sep 2018 16:01:26 -0400
To:	bug-TextFileParser [...] rt.cpan.org
From:	James E Keenan <jkeenan [...] pobox.com>

On 09/15/2018 03:36 AM, Balaji Ramasubramanian via RT wrote:

> <URL: https://rt.cpan.org/Ticket/Display.html?id=127067 >
>
> Hi James,
>
> After your email, I read a little bit about the naming of CPAN modules. I'm sorry I hadn't read this earlier.
>
> Do you think it would make sense to rename this class to Parser::Text or do you think Text::Parser is better? Which would be easier to find and maintain?
>
> If I write child classes for specific formats later - say the SPICE format, or the SPEF format, or something else - then what do you suggest the name of that class to be like?
>
> If I select Text::Parser as base class I would have to write SPICE::Parser or SPEF::Parser which would probably not be ideal.
>
> But if I select Parser::Text as base class, I could write Parser::SPICE, Parser::SPEF etc., and all Parsers can then sit in one namespace.
>
> Which do you think is the ideal solution?
>
> Thanks,
> Balaji

1. Your module handles the processing of data in plain-text format. Therefore I think the module should be placed in the top-level namespace (TLS) "Text". That is a very well-established TLS.

2. At that point you need to step back and consider an important question about the long-term evolution of your module. Will your module always be limited to the parsing of plain-text *files* -- or might it someday handle the processing of plain-text *streams* as well? That is, will your read() method ever be able to handle STDIN as well as files? That has a bearing on how you should rename your module.

If your module is going to be limited to processing files -- which is a perfectly reasonable choice -- then I recommend your module be renamed:

    Text::File::Parser

... at which point, if you post subclasses to CPAN in the future, they can be named:

    Text::File::Parser::CSV
    Text::File::Parser::SPEF
    Text::File::Parser::SPICE

... and so forth.

If, however, your module will eventually handle STDIN as well as files, then you should probably eliminate the "File::" level, resulting in:

    Text::Parser
    Text::Parser::CSV
    Text::Parser::SPEF
    Text::Parser::SPICE

And if you *do* plan to handle processing of STDIN, *now* would be an excellent time to implement that functionality. (Note that I'm not saying you *should* handle STDIN. I'm simply saying that if you want to do that, it's better to tackle all the problems associated with that while your module is young.)

3. On a different note: In your documentation in lib/TextFileParser.pm, you should settle on either '$parser' or '$pars' but you should not use both. I wrote a little program by copying-and-pasting from the POD and quickly got syntax errors when I used both variable names. Otherwise, your documentation is well written.

4. On 09/15/2018 03:39 AM, Balaji Ramasubramanian via RT wrote:

> <URL: https://rt.cpan.org/Ticket/Display.html?id=127067 >
>
> Also, if we do rename the distribution, how do we change other things. For example, what happens to the existing distro on CPAN?
>
> Thanks,
> Balaji

Once you have revised the distribution and are ready to upload it to CPAN,

a. Make sure that you have incremented $VERSION as you normally would. (I.e., do not start the renamed module's $VERSION at 0.01.)

b. Make sure that your documentation states that your module supersedes "TextFileParser".

c. I presume you're familiar with the PAUSE interface at pause.perl.org, as you have already uploaded tarballs to CPAN. So once you've uploaded your module, look for the "Delete Files" link. Clicking on that will take you to a menu that will enable you to mark earlier uploads for deletion (actually, for transfer to backpan.perl.org). The deletion is scheduled for three days hence.

5. If you would like to get your revised code reviewed prior to upload to CPAN:

a. Send me a tarball or a link to a tarball and I'll happily review it.

b. Or, better still, consider putting your source code up on github.com -- a way of software development that is becoming very standard these days -- and posting a link to your github repository in a posting to the module-authors@perl.org mailing list.

Thank you very much.
Jim Keenan
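A minimal sketch of the question raised in point 2, namely a read() method that accepts either a file name or an already-open filehandle such as \*STDIN. The package name My::Text::Parser and the methods new, read and get_records are hypothetical, chosen only for illustration; they are not taken from the TextFileParser source. The sketch assumes any reference passed in is a readable filehandle.

```perl
use strict;
use warnings;

# Hypothetical package name, for illustration only.
package My::Text::Parser;

# Per point 4a, a renamed module continues the old version sequence
# rather than restarting at 0.01 (the number here is made up).
our $VERSION = '0.02';

sub new {
    my ($class) = @_;
    return bless { records => [] }, $class;
}

# read() takes either a file name (plain string) or a filehandle
# reference such as \*STDIN. Assumption: any reference is a handle.
sub read {
    my ($self, $source) = @_;
    my ($fh, $opened_here);
    if (ref $source) {
        $fh = $source;
    }
    else {
        open $fh, '<', $source or die "Cannot open '$source': $!";
        $opened_here = 1;
    }
    $self->{records} = [];
    while (my $line = <$fh>) {
        chomp $line;
        push @{ $self->{records} }, $line;
    }
    # Only close handles this method opened itself.
    close $fh if $opened_here;
    return $self;
}

sub get_records {
    my ($self) = @_;
    return @{ $self->{records} };
}

package main;

# An in-memory filehandle stands in for STDIN here.
my $text = "alpha\nbeta\n";
open my $in, '<', \$text or die "Cannot open in-memory handle: $!";
my $parser = My::Text::Parser->new;
$parser->read($in);
print join(',', $parser->get_records), "\n";    # prints "alpha,beta"
```

With an interface of this shape, `$parser->read(\*STDIN)` and `$parser->read('input.txt')` would go through the same loop, which is the case where dropping the "File::" level from the name makes sense.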

Sun Dec 23 23:22:06 2018 balaji.ramasubramanian [...] gmail.com - Correspondence added

Hi James,

I just took the time to edit the module name and make the recommended changes. Please take a look at the GitHub page: https://github.com/balajirama/Text-Parser.git

Also, there is a new CPAN module distro that I have already uploaded to PAUSE. I will also be taking down the now superseded TextFileParser module, which has the offensive TLNS.

Thank you very much for all your inputs on this module. It has made the module a better piece of software. I am very new to this process, and am glad I could contribute.

In fact I have been long unsure why people always write code that always looks the same:

    open INFILE, "<file";
    ## Write monolith code to read content of file
    close INFILE;

The object-oriented way to allow for each text file format to just be an inherited version of the main class seems perfectly logical to me. That was my inspiration to write this class. But your suggestions will now hopefully make it even more useful to everyone.

Thanks,
Balaji
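The inheritance idea described above can be sketched roughly as follows: a base class owns the open/read/close loop once, and each format only overrides a per-line hook. The class names My::LineParser and My::LineParser::Fields, and the methods read_file, save_record and records, are hypothetical and not the Text::Parser API. The comma split is deliberately naive; real CSV parsing should go through Text::CSV, as discussed earlier in this ticket.

```perl
use strict;
use warnings;

# Hypothetical base class: the file-handling loop lives here, once.
package My::LineParser;

sub new {
    my ($class) = @_;
    return bless { records => [] }, $class;
}

sub read_file {
    my ($self, $path) = @_;
    open my $fh, '<', $path or die "Cannot open '$path': $!";
    while (my $line = <$fh>) {
        chomp $line;
        $self->save_record($line);    # per-format hook
    }
    close $fh;
    return $self;
}

# Default hook: store each line unchanged. Subclasses override this.
sub save_record {
    my ($self, $line) = @_;
    push @{ $self->{records} }, $line;
    return;
}

sub records {
    my ($self) = @_;
    return @{ $self->{records} };
}

# Hypothetical subclass for a simple comma-separated format.
package My::LineParser::Fields;
our @ISA = ('My::LineParser');

sub save_record {
    my ($self, $line) = @_;
    # Naive split for illustration; it does not handle quoted fields.
    push @{ $self->{records} }, [ split /,/, $line ];
    return;
}

package main;
use File::Temp qw(tempfile);

# Write a small sample file, then parse it with the subclass.
my ($tmp_fh, $tmp_path) = tempfile(UNLINK => 1);
print {$tmp_fh} "a,b\nc,d\n";
close $tmp_fh;

my @rows = My::LineParser::Fields->new->read_file($tmp_path)->records;
print scalar(@rows), " rows, first field of row 2: $rows[1][0]\n";
```

A new format then needs only its own save_record; the open and close calls are never repeated.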

Sun Dec 23 23:22:06 2018 balaji.ramasubramanian [...] gmail.com - Status changed from 'open' to 'resolved'

Mon Dec 24 11:00:32 2018 jkeenan [...] cpan.org - Correspondence added

On Sun Dec 23 23:22:06 2018, BALAJIRAM wrote:

> Hi James,
>
> I just took the time to edit the module name and make the recommended
> changes. Please take a look at the GitHub page:
> https://github.com/balajirama/Text-Parser.git

That looks better. You might want to consider using Devel::Cover to identify places in your source code which the test suite is not yet exercising.

Thank you very much.
Jim Keenan

Mon Dec 31 10:59:59 2018 balaji.ramasubramanian [...] gmail.com - Correspondence added

Subject:	Re: [rt.cpan.org #127067] Recommend rename distribution to Text::File::Parser (or similar)
Date:	Mon, 31 Dec 2018 07:59:19 -0800
To:	bug-TextFileParser [...] rt.cpan.org
From:	Balaji <balaji.ramasubramanian [...] gmail.com>

Hi James,

Okay, but right now, the Text::Parser module seems to be uploaded to CPAN as "UNAUTHORIZED". It shows up in big bold red and I can't find Text::Parser on CPAN through a search.

Do you know why? What should I do?

Thanks,
Balaji

Mon Dec 31 14:41:35 2018 jkeenan [...] cpan.org - Correspondence added

On Mon Dec 31 10:59:59 2018, BALAJIRAM wrote:

> Hi James,
>
> Okay, but right now, the Text::Parser module seems to be uploaded to CPAN
> as "UNAUTHORIZED". It shows up in big bold red

Where are you seeing that? Please provide URL.

> and I can't find
> Text::Parser on CPAN through a search.

Confirmed.

> Do you know why? What should I do?

Not yet.

Mon Dec 31 14:53:50 2018 jkeenan [...] cpan.org - Correspondence added

On Mon Dec 31 14:41:35 2018, JKEENAN wrote:

> > Do you know why? What should I do?
>
> Not yet.

I recommend you get on IRC, go to irc.perl.org and /j #toolchain. Pose your question there, as there are people more knowledgeable than me there. (Note: At the moment I'm having trouble logging onto pause.perl.org myself -- so that may be part of the problem.)