This queue is for tickets about the Pod-Perldoc CPAN distribution.

Report information
The Basics
Id: 120229
Status: open
Priority: 0/
Queue: Pod-Perldoc

People
Owner: Nobody in particular
Requestors: zefram [...] fysh.org
Cc:
AdminCc:

Bug Information
Severity: (no value)
Broken in: (no value)
Fixed in: (no value)



Subject: perldoc rudely interferes with pager configuration
Date: Mon, 13 Feb 2017 14:07:54 +0000
To: bug-Pod-Perldoc [...] rt.cpan.org
From: Zefram <zefram [...] fysh.org>
Since Pod::Perldoc 3.25, perldoc edits the environment variable $ENV{LESS} when running the pager, thus rudely altering pager configuration that belongs to the user. The change that it makes, adding "-R", is somewhat understandable as a reaction to [rt.cpan.org #88204] with the ToTerm output method, though as I explained there it is still the wrong approach to that issue. But the configuration change gets made even if I explicitly select a different output method, such as if I use PERLDOC=-oman to get the (former default) ToMan output method, which does not exhibit ToTerm's problem. In this case there is no excuse at all for dishonouring the user's pager configuration. -zefram
On Mon Feb 13 09:25:32 2017, zefram@fysh.org wrote:
> Since Pod::Perldoc 3.25, perldoc edits the environment variable $ENV{LESS}
> when running the pager, thus rudely altering pager configuration
> that belongs to the user. The change that it makes, adding "-R",
> is somewhat understandable as a reaction to [rt.cpan.org #88204] with
> the ToTerm output method, though as I explained there it is still the
> wrong approach to that issue. But the configuration change gets made
> even if I explicitly select a different output method, such as if I use
> PERLDOC=-oman to get the (former default) ToMan output method, which
> does not exhibit ToTerm's problem. In this case there is no excuse at
> all for dishonouring the user's pager configuration.
>
> -zefram
Yes, sorry. I have published 3.27_02 to CPAN which addresses this issue. Please try that release. If it fixes the problem, I will publish _02 as 3.28 official.
Subject: Re: [rt.cpan.org #120229] perldoc rudely interferes with pager configuration
Date: Thu, 2 Mar 2017 04:46:50 +0000
To: Mark Allen via RT <bug-Pod-Perldoc [...] rt.cpan.org>
From: Zefram <zefram [...] fysh.org>
Mark Allen via RT wrote:
>Yes, sorry. I have published 3.27_02 to CPAN which addresses this issue.
It no longer hits ToMan, but it hasn't been properly limited to ToTerm. -zefram
On Wed Mar 01 23:46:59 2017, zefram@fysh.org wrote:
> Mark Allen via RT wrote:
> >Yes, sorry. I have published 3.27_02 to CPAN which addresses this issue.
>
> It no longer hits ToMan, but it hasn't been properly limited to ToTerm.
>
> -zefram
OK, would you help me understand how you would prefer I address this issue?
Subject: Re: [rt.cpan.org #120229] perldoc rudely interferes with pager configuration
Date: Thu, 2 Mar 2017 18:09:13 +0000
To: Mark Allen via RT <bug-Pod-Perldoc [...] rt.cpan.org>
From: Zefram <zefram [...] fysh.org>
Mark Allen via RT wrote:
>OK, would you help me understand how you would prefer I address this issue?
Well, I'd *prefer* that you leave the user's pager configuration untouched and pick a default formatter that's compatible with that. But that's a bigger change than I've sought with this ticket. If we take as read that this environment diddling is required with the ToTerm formatter, then it ought to be applied *only* with that formatter. The question then is merely how to detect the ToTerm formatter, to which there are two obvious answers. Firstly, there could be a method in the generic formatter API, akin to is_pageable, indicating whether to modify the environment. Or secondly, Pod::Perldoc could perform an exact string comparison on the whole of the class name (where 3.27_02 has a case-insensitive substring match). -zefram
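The difference between the two matching strategies in the second option can be sketched as follows. This is an illustrative Python sketch, not Pod::Perldoc code (which is Perl). The function names, the needle `"toterm"`, and the class name `Pod::Perldoc::ToTermcap` are hypothetical; the exact pattern used by 3.27_02 is not shown in this ticket.

```python
# Full class name of the formatter that the environment tweak is meant for.
TOTERM_CLASS = "Pod::Perldoc::ToTerm"


def loose_match(formatter_class, needle="toterm"):
    """Case-insensitive substring test (the needle is an assumption)."""
    return needle.lower() in formatter_class.lower()


def exact_match(formatter_class):
    """Exact, case-sensitive comparison against the whole class name."""
    return formatter_class == TOTERM_CLASS


if __name__ == "__main__":
    # "Pod::Perldoc::ToTermcap" is a made-up formatter used only to show
    # how the loose test can catch classes it was never meant for.
    for name in (TOTERM_CLASS, "Pod::Perldoc::ToTermcap", "Pod::Perldoc::ToMan"):
        print(name, loose_match(name), exact_match(name))
```

The loose test also fires for any unrelated formatter whose name happens to contain the needle, while the exact test fires for one class only.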
On Thu Mar 02 13:09:25 2017, zefram@fysh.org wrote:
> The question then is merely how to detect the ToTerm formatter, to
> which there are two obvious answers. Firstly, there could be a method
> in the generic formatter API, akin to is_pageable, indicating whether
> to modify the environment. Or secondly, Pod::Perldoc could perform an
> exact string comparison on the whole of the class name (where 3.27_02
> has a case-insensitive substring match).
What do you think of this idea? https://github.com/mrallen1/Pod-Perldoc/pull/33
Subject: Re: [rt.cpan.org #120229] perldoc rudely interferes with pager configuration
Date: Fri, 3 Mar 2017 06:02:04 +0000
To: Mark Allen via RT <bug-Pod-Perldoc [...] rt.cpan.org>
From: Zefram <zefram [...] fysh.org>
Mark Allen via RT wrote:
>What do you think of this idea?
>
>https://github.com/mrallen1/Pod-Perldoc/pull/33
There are two parts to this: the pager_configuration API, and the reduced amount of environmental change.

The pager_configuration interface is an obvious generalisation of having the formatter declare whether it needs the configuration change. (Incidentally, I was surprised that you didn't add an empty pager_configuration to Pod::Perldoc::BaseTo.) If the choice is made to embrace the idea of per-formatter customisation of the pager, then I would certainly go in this direction of generalisation. But this is not the place to stop: I would generalise further.

Whatever provides the code to actually change the environment settings needs to know about specific pagers and how to configure them. The individual formatters are very poorly placed to be burdened with this kind of knowledge: it's way off their core concern. That's a maintainability problem. There's also a maintainability problem where pager customisations ought to be shared between formatters, as is arguably the case with ToANSI and ToTerm.

So if I were going the generalisation route, I'd introduce another layer of abstraction: have the formatter declare *what kind of textual features appear in its output*. I'd leave it up to framework code (in Pod::Perldoc or factored out) to mediate between those requirements and the pagers available. This system would supersede the existing is_pageable flag, and would give the framework a lot of flexibility in how to respond to output features, all of which could be influenced by user configuration of which the formatters do not need knowledge.

ToTerm would declare "output contains ANSI text attribute controls", and in response to that the framework might decide (a) that the pager configuration can already handle that, so just pass it through; (b) that the pager configuration needs a change to handle it, so tweak environment and pass the output through; (c) that the pager can't handle it but the terminal can, so output it without a pager; (d) that the ANSI controls should be translated into some other kind of controls, with filtered output going to a pager; (e) that the ANSI controls should be removed, with filtered output going to a pager; or (f) that a different formatter should be used instead. Options to perldoc(1) or in the environment could influence this, and it could perform test executions of pagers, and so on. It can get arbitrarily sophisticated in future versions, without the individual formatters having to change.

However, if I were in your place I would probably go to the opposite extreme. I don't think there's enough value in doing the pager configuration properly to offset its complexity and the burden of incorporating an unbounded amount of knowledge about specific pagers. As discussed elsewhere, I'd be looking to remove the pager tweaking entirely (along with the need for it), if not for Perl 5.26 then for 5.27.1. With that expectation, I'd look to minimise the present hack, on the basis that it's a stopgap that will shortly go away. I'd keep the hack entirely in Pod::Perldoc, and base it on exact match of the formatter class name.

As regards the proposed reduction in the amount of environmental change, I don't think this achieves anything useful. It retains all the downside of dishonouring the user's configuration, including that it will entirely break some reasonable configuration on a current OS (more(1) on Debian with $ENV{MORE} unset). As far as I can see the change just makes it less successful at setting the -R option.

-zefram
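The feature-declaration idea described in this message might look roughly like the sketch below. It is a Python illustration only, not a proposed Pod::Perldoc API: the names `Action`, `plan` and `strip_ansi`, the feature label `"ansi"`, and the three boolean inputs are all assumptions. It models only responses (a), (b), (c) and (e); translation to other controls (d) and switching formatter (f) are left out.

```python
import re
from enum import Enum

# Matches ANSI SGR (text attribute) sequences such as ESC[1m and ESC[0m.
ANSI_SGR = re.compile(r"\x1b\[[0-9;]*m")


class Action(Enum):
    PASS_THROUGH = "pass output to the pager unchanged"        # response (a)
    TWEAK_PAGER = "adjust pager settings, then pass through"   # response (b)
    NO_PAGER = "write to the terminal without a pager"         # response (c)
    STRIP = "remove the controls, then page"                   # response (e)


def strip_ansi(text):
    """Remove ANSI text attribute controls from formatter output."""
    return ANSI_SGR.sub("", text)


def plan(features, pager_handles_ansi, pager_can_be_tweaked, terminal_handles_ansi):
    """Pick a response from what the formatter declares about its output.

    The formatter only says which features its output contains; everything
    about pagers and terminals is decided here, in framework code.
    """
    if "ansi" not in features:
        return Action.PASS_THROUGH
    if not terminal_handles_ansi:
        return Action.STRIP
    if pager_handles_ansi:
        return Action.PASS_THROUGH
    if pager_can_be_tweaked:
        return Action.TWEAK_PAGER
    return Action.NO_PAGER
```

With this split, a formatter such as ToTerm would only need to declare `{"ansi"}`, and the decision logic could grow more sophisticated without any formatter changing.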
I've looked at this issue now, and I have some comments and questions.

On Fri Mar 03 01:03:07 2017, zefram@fysh.org wrote:
> Mark Allen via RT wrote:
> >What do you think of this idea?
> >
> >https://github.com/mrallen1/Pod-Perldoc/pull/33
>
> There are two parts to this: the pager_configuration API, and the reduced
> amount of environmental change.
>
> The pager_configuration interface is an obvious generalisation of
> having the formatter declare whether it needs the configuration
> change. (Incidentally, I was surprised that you didn't add an empty
> pager_configuration to Pod::Perldoc::BaseTo.) If the choice is made
> to embrace the idea of per-formatter customisation of the pager, then
> I would certainly go in this direction of generalisation. But this is
> not the place to stop: I would generalise further.
>
> [...]
I think these are excellent ideas. However, they would require a maintainer interested and willing to make these changes. At the moment, our problem with reverting all of these changes is that Unicode will still be broken on Mac. macOS Sierra still includes the old version of groff that doesn't support Unicode adequately. So, moving from theoretical good to practical necessity: this change, as beneficial and well thought-out as it is, does not need to be implemented right now to untangle the current situation.
>
> However, if I were in your place I would probably go to the opposite
> extreme. I don't think there's enough value in doing the pager
> configuration properly to offset its complexity and the burden of
> incorporating an unbounded amount of knowledge about specific pagers.
> As discussed elsewhere, I'd be looking to remove the pager tweaking
> entirely (along with the need for it), if not for Perl 5.26 then for
> 5.27.1. With that expectation, I'd look to minimise the present hack,
> on the basis that it's a stopgap that will shortly go away. I'd keep
> the hack entirely in Pod::Perldoc, and base it on exact match of the
> formatter class name.
What makes you think the stopgap would go away? I see no reason for that. I think the approach of making it work on broken systems by adding the necessary flags (and possibly changing the pipeline) is correct. I would like to keep it with some alterations, proposed below.
>
> As regards the proposed reduction in the amount of environmental change,
> I don't think this achieves anything useful. It retains all the downside
> of dishonouring the user's configuration, including that it will entirely
> break some reasonable configuration on a current OS (more(1) on Debian
> with $ENV{MORE} unset). As far as I can see the change just makes it
> less successful at setting the -R option.
Having reviewed the code and the issues, I have the following observations:

* The old pipeline fails with Unicode on macOS because of the groff version.
* The new pipeline uses escape codes which `less` and `more` do not like, more or less. (Excuse the pun.)
* There is a behavior check to not disturb Windows or DOS and to not override the environment if it already is defined. I fail to understand how it is still overwriting the environment, as you noted.

My proposal is, therefore:

* Move the old pipeline to the default again. This will reduce the need for -R except for:
* Either:
  * Move macOS individually (perhaps with derivatives) to the new pipeline, or
  * Check groff version (this is already done in the code) and use the new pipeline with "-R" (if not otherwise defined) in case it is too old to display Unicode correctly.
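The groff-version branch of this proposal could be sketched along these lines. This is a Python illustration, not the check that Pod::Perldoc actually performs: the function names are hypothetical, the minimum version `1.20.1` is an assumption, and the parser assumes the first line of `groff --version` looks like `GNU groff version 1.22.4`. The fallback formatter is a required argument because the thread has not settled what it should be.

```python
import re

# Assumed minimum groff version with adequate Unicode support.
MIN_GROFF = (1, 20, 1)


def parse_groff_version(version_output):
    """Extract a version tuple from `groff --version` output, or None."""
    match = re.search(r"version\s+(\d+(?:\.\d+)*)", version_output)
    if match is None:
        return None
    return tuple(int(part) for part in match.group(1).split("."))


def choose_formatter(version_output, fallback):
    """Prefer ToMan when groff looks new enough, otherwise use the fallback."""
    version = parse_groff_version(version_output)
    if version is not None and version >= MIN_GROFF:
        return "ToMan"
    return fallback
```

For example, `choose_formatter("GNU groff version 1.19.2", "ToText")` selects the fallback, as does any output from which no version can be parsed.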
Subject: Re: [rt.cpan.org #120229] perldoc rudely interferes with pager configuration
Date: Thu, 12 Oct 2017 12:03:56 +0100
To: Sawyer X via RT <bug-Pod-Perldoc [...] rt.cpan.org>
From: Zefram <zefram [...] fysh.org>
Sawyer X via RT wrote:
>At the moment, our problem with reverting all of these changes is that >Unicode will still be broken on Mac. macOS Sierra still includes the >old version of groff that doesn't support Unicode adequately.
I presume that by "reverting all of these changes" you mean returning to defaulting to the ToMan formatter. It would help if you were explicit about which formatter you mean; terms like "the old pipeline" and "the new pipeline" are unnecessarily unclear.

This groff issue was presumably the original motivation for changing the default away from ToMan. No one is proposing returning to that default (at least in its unconditional form). The problems we see now arise from the question of which formatter should be the default given that ToMan is not acceptable. The main candidates for this are ToTerm and ToText.
>What makes you think the stopgap would go away? I see no reason for that.
The context for the stopgap was the work immediately prior to the release of 5.26. The question of what version of Pod::Perldoc would go into 5.26 was distinct from the question of how it should be fixed in the long term. An actual fix, of changing the default formatter, was vetoed for 5.26, so for 5.26 we were looking just at tweaks to the environment-mutation approach. It's a stopgap in the sense that that was expected to be used only for 5.26, with a proper fix coming along for 5.28.
>I fail to understand how it is still overwriting the environment, as you noted.
Altering the environment is the whole purpose of that bit of code. It writes into %ENV.
> * Check groff version (this is already done in the code)
I'm OK with ToMan being used as the default conditional on the groff version. That's a refinement beyond what we've been discussing so far. The more important question, which we need to resolve first, is what should be the default when ToMan can't be used.
>and use the new pipeline with "-R" (if not otherwise defined) in case >it is too old to display Unicode correctly.
Anything that sticks "-R" into the environment is asking for trouble. That's a portability problem, and as you noted we don't have any volunteers to take on the full porting job. This is why I say that ToTerm is not suitable to be the default formatter: the fallback has to be ToText. -zefram
On Thu Oct 12 07:04:20 2017, zefram@fysh.org wrote:
> Sawyer X via RT wrote:
> > At the moment, our problem with reverting all of these changes is that
> > Unicode will still be broken on Mac. macOS Sierra still includes the
> > old version of groff that doesn't support Unicode adequately.
>
> I presume that by "reverting all of these changes" you mean returning to
> defaulting to the ToMan formatter.
Yes. That was my intention.
> It would help if you were explicit
> about which formatter you mean; terms like "the old pipeline" and "the
> new pipeline" are unnecessarily unclear.
My apologies. By "all of these changes", I meant returning to the ToMan to (n|g)roff pipeline ("the old pipeline").
> This groff issue was presumably the original motivation for changing the
> default away from ToMan. No one is proposing returning to that default
> (at least in its unconditional form). The problems we see now arise from
> the question of which formatter should be the default given that ToMan
> is not acceptable. The main candidates for this are ToTerm and ToText.
My understanding was that ToMan was not acceptable simply because of Unicode in macOS. Is that correct?
> > What makes you think the stopgap would go away? I see no reason for
> > that.
>
> The context for the stopgap was the work immediately prior to the release
> of 5.26. The question of what version of Pod::Perldoc would go into 5.26
> was distinct from the question of how it should be fixed in the long term.
> An actual fix, of changing the default formatter, was vetoed for 5.26,
> so for 5.26 we were looking just at tweaks to the environment-mutation
> approach. It's a stopgap in the sense that that was expected to be used
> only for 5.26, with a proper fix coming along for 5.28.
I think I understand.
>
> > I fail to understand how it is still overwriting the environment, as
> > you noted.
>
> Altering the environment is the whole purpose of that bit of code.
> It writes into %ENV.
I see what you mean. I thought "overwriting" was meant in the sense of "there's a value in $ENV{LESS} it is overwriting".
>
> > * Check groff version (this is already done in the code)
>
> I'm OK with ToMan being used as the default conditional on the groff
> version. That's a refinement beyond what we've been discussing so far.
Indeed, but that is what I'm proposing. ToMan should be made the default again, except for older versions of groff, where we know it would screw up Unicode.
> The more important question, which we need to resolve first, is what > should be the default when ToMan can't be used.
If I understand correctly, your objection to this being ToTerm is the change to the environment. Is that right?
>
> > and use the new pipeline with "-R" (if not otherwise defined) in case
> > it is too old to display Unicode correctly.
>
> Anything that sticks "-R" into the environment is asking for trouble.
> That's a portability problem, and as you noted we don't have any
> volunteers to take on the full porting job. This is why I say that
> ToTerm is not suitable to be the default formatter: the fallback has to
> be ToText.
I agree sticking "-R" is, in a sense, "looking for trouble", but I think it might still be a better solution than ToText. I don't know what the differences in display are between ToTerm and ToText. From ToTerm's perspective, we know the problem is simply control characters on pagers. If the pager has no configuration, I don't find it exceedingly terrible to add an environment change to fix it, since we're already calling the pager ourselves.

I should add that Mark has now decided to step down from maintaining this, so we have one fewer maintainer willing to do any work.
Subject: Re: [rt.cpan.org #120229] perldoc rudely interferes with pager configuration
Date: Fri, 13 Oct 2017 12:37:58 +0100
To: Sawyer X via RT <bug-Pod-Perldoc [...] rt.cpan.org>
From: Zefram <zefram [...] fysh.org>
Sawyer X via RT wrote:
>If I understand correctly, your objection to this being ToTerm is the >change to the environment. Is that right?
That's part of it, but there are multiple failure modes. My original complaint, before any environment twiddling was implemented, was that it renders badly, all "ESC[1mSYNOPSISESC[0m". There are other problematic outcomes too. Rough categorisation of failure modes:

0. If perldoc ever passes "-R" to the pager (via either environment or command line), then sometimes this will be applied to a pager that doesn't support -R, so you'll get an error message instead of documentation.

1. If perldoc ever refrains from passing "-R" to the pager, then it'll quite often hand the escape sequences off to a pager that's not configured to handle them, so you'll get nicely-paged "ESC[1mSYNOPSISESC[0m" instead of the intended rendering.

2. If the terminal is one that doesn't use the ANSI escape sequence schema, then successfully configuring the pager to pass them through (which is what -R does) yields terminal-dependent garbage instead of the intended rendering.

Note that taking items 0 and 1 together means that you can't win. There is no criterion for adding "-R" that doesn't trigger at least one of these failure modes, and most criteria trigger both. There are also some situations where it's impossible to get the intended output even given perfect knowledge of how to configure pagers: user's pager can't be configured to pass through escape sequences; terminal doesn't support those escape sequences.
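The three failure modes can be illustrated with a toy model. This Python sketch is an assumption-laden simplification, not a description of real pagers: each setup is reduced to three booleans, and only the two blanket policies "always add -R" and "never add -R" are examined, which is narrower than the claim above about criteria in general. The names `outcome` and `failures` are hypothetical.

```python
from itertools import product


def outcome(add_r, pager_supports_r, pager_passes_ansi, terminal_is_ansi):
    """Classify what the user sees for one setup under one policy."""
    if add_r and not pager_supports_r:
        return "mode 0"  # pager rejects the unknown switch
    escapes_reach_terminal = add_r or pager_passes_ansi
    if not escapes_reach_terminal:
        return "mode 1"  # escape sequences are shown literally
    if not terminal_is_ansi:
        return "mode 2"  # terminal misinterprets the sequences
    return "ok"


def failures(add_r):
    """Collect the failure modes a blanket policy hits across all setups."""
    results = {outcome(add_r, *setup) for setup in product([True, False], repeat=3)}
    return results - {"ok"}
```

In this model `failures(True)` contains modes 0 and 2 and `failures(False)` contains modes 1 and 2, so neither blanket policy is safe for every setup.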
>I don't know what the differences in display are between ToTerm and >ToText.
ToText produces plain text, no escape sequences. That's the only difference, AFAICS. Both emit non-ASCII characters as UTF-8. (UTF-8 itself poses a rendering problem, but I'll let that slide for now.)
>If the pager has no configuration, I don't find it exceedingly terrible >to add an environment change to fix it,
It's terribly unportable. We don't know what switch to pass for most pagers; for pagers where we know how to spell "-R" we don't know whether the pager actually supports that switch; and we don't know whether the terminal supports those escape sequences anyway. It's impossible for us to actually fix the issue this way. If it were actually possible to fix the rendering by sticking something in the environment, and we could in all cases know what to stick in the environment to do it, then I'd be pretty OK with that. But neither of those conditions holds. My objection to tweaking the environment isn't an objection in principle to ever touching the environment; it's an objection to altering configuration that we don't understand. -zefram
On Fri Oct 13 07:38:49 2017, zefram@fysh.org wrote:
> Sawyer X via RT wrote:
> >If I understand correctly, your objection to this being ToTerm is the > >change to the environment. Is that right?
>
> That's part of it, but there are multiple failure modes. My original
> complaint, before any environment twiddling was implemented, was that it
> renders badly, all "ESC[1mSYNOPSISESC[0m". There are other problematic
> outcomes too. Rough categorisation of failure modes:
>
> 0. If perldoc ever passes "-R" to the pager (via either environment
> or command line), then sometimes this will be applied to a pager
> that doesn't support -R, so you'll get an error message instead of
> documentation.
>
> 1. If perldoc ever refrains from passing "-R" to the pager, then
> it'll quite often hand the escape sequences off to a pager
> that's not configured to handle them, so you'll get nicely-paged
> "ESC[1mSYNOPSISESC[0m" instead of the intended rendering.
>
> 2. If the terminal is one that doesn't use the ANSI escape sequence
> schema, then successfully configuring the pager to pass them through
> (which is what -R does) yields terminal-dependent garbage instead of
> the intended rendering.
>
> Note that taking items 0 and 1 together means that you can't win.
> There is no criterion for adding "-R" that doesn't trigger at least one
> of these failure modes, and most criteria trigger both. There are also
> some situations where it's impossible to get the intended output even
> given perfect knowledge of how to configure pagers: user's pager can't
> be configured to pass through escape sequences; terminal doesn't support
> those escape sequences.
I agree. This means we are playing some kind of odds here.
> >If the pager has no configuration, I don't find it exceedingly terrible
> >to add an environment change to fix it,
>
> It's terribly unportable. We don't know what switch to pass for most
> pagers; for pagers where we know how to spell "-R" we don't know whether
> the pager actually supports that switch; and we don't know whether the
> terminal supports those escape sequences anyway. It's impossible for
> us to actually fix the issue this way.
It is not a "fix" then, it's a hack. However, let us not allow perfection to be the enemy of "good."
> If it were actually possible to fix the rendering by sticking something > in the environment, and we could in all cases know what to stick in the > environment to do it, then I'd be pretty OK with that. But neither > of those conditions holds. My objection to tweaking the environment > isn't an objection in principle to ever touching the environment; it's > an objection to altering configuration that we don't understand.
I think limiting the risk here is reasonable. We can return the pipeline to ToMan, knowing this works, except the following situation:

* old ngroff
* macOS

I'll add that currently perldoc already runs external programs, which means we are already assuming on a well-intentioned environment. If a user has something odd in their environment, well, that's the rub.

Currently, we have broken Unicode on macOS which is a far greater factor of users than anyone who is on macOS, has an old groff, *and* decided to use some other non-standard pager. I'm okay with that.
Subject: Re: [rt.cpan.org #120229] perldoc rudely interferes with pager configuration
Date: Sun, 22 Oct 2017 23:04:11 +0100
To: Sawyer X via RT <bug-Pod-Perldoc [...] rt.cpan.org>
From: Zefram <zefram [...] fysh.org>
Sawyer X via RT wrote:
>I agree. This means we are playing some kind of odds here.
...
>It is not a "fix" then, it's a hack. However, let us not allow perfection
>to be the enemy of "good."
In these comments, you seem to be implying that there is a moral equivalence between the ToTerm and ToText mechanisms. (Though you don't mention either of those formatters explicitly, so it's not clear.) In fact there is no such equivalence in their portability. There is a sharp difference between ToTerm raising these huge portability problems and ToText which reliably works.
>I think limiting the risk here is reasonable. We can return the pipeline
>to ToMan, knowing this works, except the following situation:
>
>* old ngroff
>* macOS
I'm fine with conditional use of ToMan, indeed I welcome a return to the ToMan default on the platform that I use personally, provided that the fallback where ToMan is not usable is a properly portable one.

Once again, your discussion of ToMan here is a distraction. These tickets are concerned with the non-ToMan situation.
>I'll add that currently perldoc already runs external programs, which >means we are already assuming on a well-intentioned environment.
Of course, we are always relying on the user's environment to be coherent. None of this has been concerned with environments that are a priori faulty. All of the situations that I have raised that ToTerm has difficulty with are reasonable situations, with environmental configuration presumed to be correct.
>Currently, we have broken Unicode on macOS which is a far greater factor >of users than anyone who is on macOS, has an old groff, *and* decided >to use some other non-standard pager. I'm okay with that.
I don't understand this comment. What exactly is broken on macOS? By "currently", do you mean perldoc with its ToTerm default (as in the CPAN version), or perldoc with the ToText default (as in blead), or something else? I'm mystified by your reference to "some other non-standard pager". To be clear, the portability issues with ToTerm arise on macOS just as on any other platform. I recommend that the CPAN instance of Pod-Perldoc should incorporate the same change to Pod::Perldoc with which I customised it in blead; then it should also remove the (now never called) pager_configuration method from Pod::Perldoc::ToTerm; then from that base it should incorporate some conditional preference for ToMan. The first part of this is the most important, and the last part should be postponed if the correct form for the condition is not immediately apparent. -zefram
On Sun Oct 22 18:04:24 2017, zefram@fysh.org wrote:
> Sawyer X via RT wrote:
> >I agree. This means we are playing some kind of odds here.
> ...
> >It is not a "fix" then, it's a hack. However, let us not allow perfection
> >to be the enemy of "good."
> > In these comments, you seem to be implying that there is a moral > equivalence between the ToTerm and ToText mechanisms. (Though you > don't mention either of those formatters explicitly, so it's not clear.) > In fact there is no such equivalence in their portability. There is a > sharp difference between ToTerm raising these huge portability problems > and ToText which reliably works.
I was comparing ToMan and ToTerm. I wasn't referring to the ToText. The ToText removes the ability to have any highlighting (of any form), rendering the documents the least readable in their text form (excluding rendering issues). I'm looking for a way to avoid that.
>
> >I think limiting the risk here is reasonable. We can return the pipeline
> >to ToMan, knowing this works, except the following situation:
> >
> >* old ngroff
> >* macOS
> > I'm fine with conditional use of ToMan, indeed I welcome a return > to the ToMan default on the platform that I use personally, provided > that the fallback where ToMan is not usable is a properly portable one.
The only "perfect" solution in which we do not lose the value of typographical emphasis is the long-term one you suggested which we cannot implement at the moment. I prefer not to handicap these emphases.
> Once again, your discussion of ToMan here is a distraction. These tickets > are concerned with the non-ToMan situation.
Cutting out the ToMan "distraction," I'll reiterate that we are able to limit the ToTerm output to a smaller set of users that I find acceptable. I have noted this several times. (I have simply included the overall solution which mentions going back to ToMan as a full answer.)
>
> >I'll add that currently perldoc already runs external programs, which > >means we are already assuming on a well-intentioned environment.
> > Of course, we are always relying on the user's environment to be > coherent. None of this has been concerned with environments that are > a priori faulty. All of the situations that I have raised that ToTerm > has difficulty with are reasonable situations, with environmental > configuration presumed to be correct.
Can you indicate other reasonable situations in which it would break, other than the old groff on macOS using a non-standard pager? I can only imagine a situation of: On Mac, using old groff, using a non-less pager that uses its binary, but doesn't support "-R". Are there any other situations? If this is it, I think it's minimal. If it were this vs. no typographical emphasis for anyone, I would pick this.
>
> >Currently, we have broken Unicode on macOS which is a far greater factor > >of users than anyone who is on macOS, has an old groff, *and* decided > >to use some other non-standard pager. I'm okay with that.
> > I don't understand this comment. What exactly is broken on macOS? > By "currently", do you mean perldoc with its ToTerm default (as in > the CPAN version), or perldoc with the ToText default (as in blead), > or something else? I'm mystified by your reference to "some other > non-standard pager".
macOS carries a copy of groff that breaks ToMan. That's all. "Currently" referred to CPAN version.
> > To be clear, the portability issues with ToTerm arise on macOS just as > on any other platform.
At the moment, we are only aware of macOS carrying such an old version of groff.
Subject: Re: [rt.cpan.org #120229] perldoc rudely interferes with pager configuration
Date: Mon, 23 Oct 2017 00:28:09 +0100
To: Sawyer X via RT <bug-Pod-Perldoc [...] rt.cpan.org>
From: Zefram <zefram [...] fysh.org>
Sawyer X via RT wrote:
>The ToText removes the ability to have any highlighting (of any form), >rendering the documents the least readable in their text form (excluding >rendering issues). I'm looking for a way to avoid that.
OK. Defaulting to ToMan where we're confident that it'll work would seem to be the main way to avoid that downside as much as possible. We cannot avoid getting unhighlighted text in some situations, though. Don't let the (sensible) desire for nice formatting get in the way of achieving legibility for everyone.
>I'll reiterate that we are able to limit the ToTerm output to a smaller >set of users that I find acceptable. I have noted this several timees.
Apparently I didn't understand the previous times you've noted it. This is the first time I've been aware of you defending ToTerm on the basis that we can limit its usage to a relatively small set of users. So I think what you're proposing, though you still haven't actually said it explicitly, is roughly that the default should be ToMan on platforms other than macOS, and should be ToTerm on macOS. Maybe we'd also avoid ToMan on Windows, as we used to, though I don't know what you're then proposing the default should be on Windows. Maybe also some condition based on detecting the version of the roff toolchain, though in your previous message you didn't say whether that's ANDed or ORed with macOS, so I don't know where this fits into the proposal.

I think this is a poor basis on which to defend ToTerm. It's got exactly the same problems for macOS users that we have noted on other platforms. Only using it on macOS would indeed reduce the number of affected users, but this amounts to giving those users exactly the mechanism that we've rejected for the majority on the basis that it's broken. The problem here is that the condition for choosing ToTerm, namely the macOS platform, has nothing to do with whether ToTerm would actually work.

For ToTerm to ever be the default, the conditions for selecting it need to be conditions that give us confidence that we can actually pull it off. This would be, roughly,

* we recognise the specific pager nominated by the environment; and

* by checking the specific version of the pager that will be executed we are confident that it is a version that supports -R; and

* the terminal type nominated by the environment is one that we specifically recognise as taking ANSI escape sequences.

If those conditions are met then we could confidently arrange to pass "-R" as a command-line argument to the pager and select the ToTerm formatter. If these conditions are not met then we're back in the usual situation of ToTerm being hopelessly unportable, and must fall back on ToText. It should be easy enough to arrange for the ToTerm-selecting conditions to include most of the macOS users about which you're concerned. Considering ToMan as well, presumably we'd still prefer ToMan, and so only look at whether ToTerm is an acceptable default if we've already decided that ToMan is not acceptable.
>Can you indicate other reasonable situations in which it would break, >other than the old groff on macOS using a non-standard pager?
Er, what? Where I've described things breaking, that's been specifically about ToTerm breaking. The version of groff has no influence on ToTerm. I can't figure out what you're trying to ask here.
>macOS carries a copy of groff that breaks ToMan. That's all.
Right, so, to be clear, the current perldoc defaults (in either CPAN or blead versions) don't produce any broken Unicode on macOS. -zefram
On Sun Oct 22 19:28:20 2017, zefram@fysh.org wrote:
> Sawyer X via RT wrote: > [...]
> >I'll reiterate that we are able to limit the ToTerm output to a smaller > >set of users that I find acceptable. I have noted this several times.
> > Apparently I didn't understand the previous times you've noted it. > This is the first time I've been aware of you defending ToTerm on the > basis that we can limit its usage to a relatively small set of users.
I apologize for my lack of clarity. This was my original point.
> So I think what you're proposing, though you still haven't actually said > it explicitly, is roughly that the default should be ToMan on platforms > other than macOS, and should be ToTerm on macOS.
Yes. That is correct.
> Maybe we'd also avoid > ToMan on Windows, as we used to, though I don't know what you're then > proposing the default should be on Windows.
Do you have an opinion on what the default should be on Windows?
> Maybe also some condition > based on detecting the version of the roff toolchain, though in your > previous message you didn't say whether that's ANDed or ORed with macOS, > so I don't know where this fits into the proposal.
I thought it should be ANDed, but now that you raise this I realize that was naive thinking. From what I understood, the need to use ToTerm was because reports came in from macOS that ToMan was rendered inappropriately. This means that, theoretically, one could have an older version of groff on other operating systems. This would mean that in all such cases ToMan is sub-optimal.

Here's my updated proposal:

1. Check groff version and less binary.
2. If groff is new enough, use ToMan.
3. If groff is too old and our pager is less, use ToTerm.
4. If groff is too old and our pager is not less, use ToText.

Our use-case of macOS seems covered, and in the event of a different pager, or of any other operating system that also suffers from a groff that screws up Unicode, we will opt for the least featureful (but working) ToText.
> > I think this is a poor basis on which to defend ToTerm. It's got exactly > the same problems for macOS users that we have noted on other platforms.
I understand. I think this new proposal will solve the situation for other platforms that could suffer from ToTerm.
> The problem > here is that the condition for choosing ToTerm, namely the macOS platform, > has nothing to do with whether ToTerm would actually work.
You're right. That was short-sighted.
> For ToTerm to ever be the default, the conditions for selecting it need > to be conditions that give us confidence that we can actually pull it off. > This would be, roughly, > > * we recognise the specific pager nominated by the environment; and > > * by checking the specific version of the pager that will be executed > we are confident that it is a version that supports -R; and > > * the terminal type nominated by the environment is one that we > specifically recognise as taking ANSI escape sequences. > > If those conditions are met then we could confidently arrange to pass "-R" > as a command-line argument to the pager and select the ToTerm formatter. > If these conditions are not met then we're back in the usual situation > of ToTerm being hopelessly unportable, and must fall back on ToText. > It should be easy enough to arrange for the ToTerm-selecting conditions > to include most of the macOS users about which you're concerned. > Considering ToMan as well, presumably we'd still prefer ToMan, and so > only look at whether ToTerm is an acceptable default if we've already > decided that ToMan is not acceptable.
This seems to me like the best approach.
>
> >Can you indicate other reasonable situations in which it would break, > >other than the old groff on macOS using a non-standard pager?
> > Er, what? Where I've described things breaking, that's been specifically > about ToTerm breaking. The version of groff has no influence on ToTerm. > I can't figure out what you're trying to ask here.
I meant ToMan. It doesn't matter. :)
Subject: Re: [rt.cpan.org #120229] perldoc rudely interferes with pager configuration
Date: Mon, 23 Oct 2017 15:37:40 +0100
To: Sawyer X via RT <bug-Pod-Perldoc [...] rt.cpan.org>
From: Zefram <zefram [...] fysh.org>
Sawyer X via RT wrote:
>Do you have an opinion on what the default should be on Windows?
I don't have firm platform knowledge. I expect that Windows ordinarily lacks a roff toolchain, making ToMan inoperable. It seems from the previous versions of Pod::Perldoc that ToTerm is also ordinarily unworkable on Windows: we know that the usual pager doesn't support -R. I don't know what its usual terminals do with the escape sequences. So if we're making flat per-platform decisions, it looks like Windows needs to default to ToText (as it does in the current CPAN version of Pod::Perldoc). But...
>This means that, theoretically, one could have an older version of >groff on other operating systems.
Yes. I think we're converging on the concept that we should detect what's actually workable at runtime, rather than make coarse per-platform assumptions. We need such detection logic for ToMan and for ToTerm. -zefram
On Mon Oct 23 10:37:58 2017, zefram@fysh.org wrote:
> Sawyer X via RT wrote:
> >Do you have an opinion on what the default should be on Windows?
> > I don't have firm platform knowledge. I expect that Windows ordinarily > lacks a roff toolchain, making ToMan inoperable. It seems from > the previous versions of Pod::Perldoc that ToTerm is also ordinarily > unworkable on Windows: we know that the usual pager doesn't support -R. > I don't know what its usual terminals do with the escape sequences. > So if we're making flat per-platform decisions, it looks like Windows > needs to default to ToText (as it does in the current CPAN version of > Pod::Perldoc). But...
I'll verify this and report back then.
> >This means that, theoretically, one could have an older version of > >groff on other operating systems.
> > Yes. I think we're converging on the concept that we should detect > what's actually workable at runtime, rather than make coarse per-platform > assumptions. We need such detection logic for ToMan and for ToTerm.
\o/ Thank you for your patience.
On Wed Oct 25 17:47:16 2017, xsawyerx wrote:
> On Mon Oct 23 10:37:58 2017, zefram@fysh.org wrote:
> > Sawyer X via RT wrote:
> > >Do you have an opinion on what the default should be on Windows?
> > > > I don't have firm platform knowledge. I expect that Windows ordinarily > > lacks a roff toolchain, making ToMan inoperable. It seems from > > the previous versions of Pod::Perldoc that ToTerm is also ordinarily > > unworkable on Windows: we know that the usual pager doesn't support -R. > > I don't know what its usual terminals do with the escape sequences. > > So if we're making flat per-platform decisions, it looks like Windows > > needs to default to ToText (as it does in the current CPAN version of > > Pod::Perldoc). But...
> > I'll verify this and report back then. > > >
> > >This means that, theoretically, one could have an older version of > > >groff on other operating systems.
> > > > Yes. I think we're converging on the concept that we should detect > > what's actually workable at runtime, rather than make coarse per-platform > > assumptions. We need such detection logic for ToMan and for ToTerm.
> > \o/ > > Thank you for your patience.
Here is my pull request: https://github.com/mrallen1/Pod-Perldoc/pull/36. This is against mrallen1's account, but I can move it to the DualLife account as well.
Subject: Re: [rt.cpan.org #120229] perldoc rudely interferes with pager configuration
Date: Thu, 30 Nov 2017 20:40:33 +0000
To: Sawyer X via RT <bug-Pod-Perldoc [...] rt.cpan.org>
From: Zefram <zefram [...] fysh.org>
Sawyer X via RT wrote:

Your detection of less(1) in the pager list is dubious. You use a substring search for the character sequence "less", but that may be misleading because it could appear as substring in a $ENV{PAGER} that is for a different pager.

The use of `$less_bin --version` is dubious, because, per ->pagers_guessing, $less_bin may contain shell redirection characters, such that "--version" wouldn't necessarily function as a command-line argument to less(1). You may need to parse pager strings in more detail.

The version check on less(1) involves a regexp copied from the groff version check, which will never match actual "less --version" output.

Where the acceptability of the formatter depends on a sufficiently recent less(1), you need to ensure that that is the pager actually used, which you are not doing. All you are doing is detecting that a sufficiently recent less(1) is somewhere in the list of candidate pagers. You probably want to filter the pager list to contain only acceptable versions of less(1). However, where the first pager setting comes from $ENV{PAGER} or $ENV{PERLDOC_PAGER} it would be rude to dishonour that. An environmental pager setting, where supplied, should be the only candidate pager, such that you will pass up the opportunity to use ToTerm rather than use a different pager.

When you are depending on -R, you also need to actually pass -R to the pager, which you're not reliably doing. You haven't changed the logic for this, which sets $ENV{LESS} if it wasn't set but doesn't do anything to pass -R if $ENV{LESS} was set. If you're parsing shell code in pager settings, it's probably best to add -R there and leave the environment untouched.

To make ToTerm acceptable, you need to also check that the terminal is of a type that accepts the ANSI escape sequences. The existing logic will avoid trying ToTerm if $ENV{TERM} has certain values indicating the lack of cursor addressability, but that's not sufficient, in two respects. Firstly, an addressable cursor does not in the slightest imply the use of ANSI escape sequences. And secondly, the check needs to be the opposite way round, because the consequences of sending unrecognised escape sequences are much worse than the consequences of trying to page on a dumb terminal. So you need to check that $ENV{TERM} is set to a type that you specifically know accepts these sequences. The default behaviour, for a terminal type that is not specifically recognised, has to be to conservatively not send the dubious sequences.

-zefram
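For illustration only, the two detection points raised above (identify the pager by the program it actually runs rather than by substring, and parse the version from real "less --version" output) could be sketched as follows. This is Python rather than the Perl of Pod::Perldoc, and the function names are hypothetical; it assumes the first line of "less --version" starts with the word "less" followed by an integer version, and it treats any setting it cannot parse as "not less".

```python
import os
import re
import shlex


def pager_program(pager_setting):
    """Return the basename of the program a pager setting would run, or None."""
    try:
        words = shlex.split(pager_setting)
    except ValueError:
        # Unbalanced quotes or similar: refuse to guess.
        return None
    if not words:
        return None
    return os.path.basename(words[0])


def is_less(pager_setting):
    """True only when the command word itself is less, not a mere substring."""
    return pager_program(pager_setting) == "less"


def parse_less_version(version_output):
    """Extract the integer version from the start of `less --version` output."""
    match = re.match(r"less (\d+)", version_output)
    return int(match.group(1)) if match else None
```

A setting such as "useless-pager" contains the substring "less" but is rejected here, and groff-style version text yields no version at all. The minimum version that supports -R is deliberately left out, since the thread does not state it.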
On Thu Nov 30 15:40:55 2017, zefram@fysh.org wrote:
> Sawyer X via RT wrote: > > Your detection of less(1) in the pager list is dubious. You use a > substring search for the character sequence "less", but that may be > misleading because it could appear as substring in a $ENV{PAGER} that > is for a different pager.
I agree. I thought about this and eventually stole it from the detection for adding "-R" to the pager. I figured it would at least be as consistent as it is now, despite being inaccurate.
> > The use of `$less_bin --version` is dubious, because, per > ->pagers_guessing, $less_bin may contain shell redirection characters, > such that "--version" wouldn't necessarily function as a command-line > argument to less(1). You may need to parse pager strings in more detail.
Good point. I'll improve this by finding the executable file and then calling only that with "--version", which would remove any redirection characters which might have appeared.
> > The version check on less(1) involves a regexp copied from the groff > version check, which will never match actual "less --version" output.
Whoops! Fixed and PR was updated.
> > Where the acceptability of the formatter depends on a sufficiently > recent less(1), you need to ensure that that is the pager actually > used, which you are not doing. All you are doing is detecting that a > sufficiently recent less(1) is somewhere in the list of candidate pagers. > You probably want to filter the pager list to contain only acceptable > versions of less(1).
Makes sense. That part was evidently fragile, since all I did was check whether one of the possible pagers is less. I couldn't tell whether it would actually be used. The code then tries each one independently, hoping it will work (literally checking the result of "system()"), so there is no assurance about which are available or will work. Reducing the list to "which are available" and then checking that the first one is "less" makes more sense.
> However, where the first pager setting comes from > $ENV{PAGER} or $ENV{PERLDOC_PAGER} it would be rude to dishonour that. > An environmental pager setting, where supplied, should be the only > candidate pager, such that you will pass up the opportunity to use ToTerm > rather than use a different pager.
It is possible to simply let those values override my inspection of possible pagers. Instead it currently only adds them as optional pager candidates. I think the reason is that a user might have an environment configured by the operating system for a certain pager which doesn't actually exist or isn't installed. I am of the opinion that, if you explicitly want something, that's what you should get (even if you didn't personally state it explicitly, but some other configuration did) and that, if that fails, you should be notified. In other words, if Debian automatically set my EDITOR to vim but I don't have vim installed, I would like to know about it, instead of having whatever used the EDITOR configuration silently try something else and open a different editor. If we both agree on this, I can change the guessing function so that those settings explicitly override whatever pagers are available and then, as a secondary measure, try to reduce the list to existing pagers, and if the first one is less, continue with the logic. Does that make sense?
> > When you are depending on -R, you also need to actually pass -R to the > pager, which you're not reliably doing. You haven't changed the logic > for this, which sets $ENV{LESS} if it wasn't set but doesn't do anything > to pass -R if $ENV{LESS} was set. If you're parsing shell code in pager > settings, it's probably best to add -R there and leave the environment > untouched.
I had left it as is since ToTerm sets it in its own formatter. I can move this out of the way, but this would make it a ToTerm-specific change that appears outside ToTerm. Your suggestion would be safer, though, since it would parse the pager string and make sure "-R" is in the right place. Not sure which is best.
> To make ToTerm acceptable, you need to also check that the terminal is of > a type that accepts the ANSI escape sequences. The existing logic will > avoid trying ToTerm if $ENV{TERM} has certain values indicating the lack > of cursor addressability, but that's not sufficient, in two respects. > Firstly, an addressable cursor does not in the slightest imply the > use of ANSI escape sequences. And secondly, the check needs to be the > opposite way round, because the consequences of sending unrecognised > escape sequences are much worse than the consequences of trying to page > on a dumb terminal. So you need to check that $ENV{TERM} is set to a > type that you specifically know accepts these sequences. The default > behaviour, for a terminal type that is not specifically recognised, > has to be to conservatively not send the dubious sequences.
So we would have to both make sure that we have a less that can understand escape codes as escape codes and know that the terminal supports them and can display them properly. For the first we have a solution; for the latter we will need to create a list of known terminals that support escape codes and check $ENV{TERM} against it. If we do not know your terminal, we cannot be sure it supports escape codes, and thus move to ToText as a fallback.

If so, this means the logic is:

* If you have an old version of groff
* And your less is old, or
* Your terminal is unknown (meaning we implicitly treat it as not supporting escape codes), then:
-> You get ToText.

* If you have an old version of groff
* And your less is new enough
* And we know your terminal (and implicitly know it to support escape codes), then:
-> You get ToTerm.

* If you have a new enough version of groff, then:
-> You get ToMan.

Did I understand it correctly?
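The three cases above can be written down as a small decision function. This is only a sketch in Python, not code from Pod::Perldoc or the pull request: the function name, the boolean inputs and the terminal allowlist are all hypothetical, and the allowlist entries are placeholders for whatever set of $ENV{TERM} values ends up being trusted.

```python
# Hypothetical allowlist: TERM values we would be willing to trust to accept
# ANSI escape sequences. Anything not listed is treated as unsupported.
ANSI_TERMINALS = frozenset({"xterm", "xterm-256color", "screen", "linux", "rxvt"})


def choose_formatter(groff_new_enough, less_new_enough, term):
    """Pick a default formatter following the logic summarised in the thread."""
    if groff_new_enough:
        # ToMan is preferred whenever the roff toolchain is good enough.
        return "ToMan"
    if less_new_enough and term in ANSI_TERMINALS:
        # Old groff, but a suitable less and a recognised terminal.
        return "ToTerm"
    # Old groff and either an old less or an unrecognised terminal.
    return "ToText"
```

Note that an unset or unknown TERM falls through to ToText, which matches the conservative default asked for earlier in the ticket.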
CC: xsawyerx [...] gmail.com
Subject: Re: [rt.cpan.org #120229] perldoc rudely interferes with pager configuration
Date: Fri, 1 Dec 2017 13:56:20 +0000
To: Sawyer X via RT <bug-Pod-Perldoc [...] rt.cpan.org>
From: Zefram <zefram [...] fysh.org>
Sawyer X via RT wrote:
>I agree. I thought about this and eventually stole it from the detection >for adding "-R" to the pager.
That logic (in its current form) has rather different requirements. It's not adding "-R" to the command line, but to $ENV{LESS}, so a false positive doesn't matter there. A pager other than less(1) will just ignore the environment variable. But a false positive is a big problem for the new logic.
>Does that make sense?
Yes.
>I had left it as is since ToTerm sets it in its own formatter.
ToTerm is doing a bad job. It's not strictly necessary to remove its inadequate "-R" logic to replace with the correct stuff, but it *is* necessary to implement correct "-R" addition. The existing crap logic would be superfluous but harmless if it remained.
>So we would have to both make sure that we have less that can understand >escape codes as escape codes and to know that the terminal supports them >and can display them properly.
Yes. This is a rather strange requirement, but comes from the strange behaviour of the -R option. less(1) isn't interpreting the escape codes itself and then rendering a fancy text image to the actual terminal, which is what it does with backspace overstriking and which it could readily do for escape sequences. Instead it only understands how the escape sequences work syntactically, and passes the escape sequences through to the terminal without any knowledge of how the terminal will interpret them. So the burden is rather unpleasantly placed on the user to ensure that the terminal is suitable before supplying "-R" to less(1).
>Did I understand it correctly?
Yes. -zefram
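As a rough illustration of the "correct -R addition" discussed in this ticket, the option can be added to the pager's command line while leaving $ENV{LESS} alone. The sketch below is Python, not Pod::Perldoc code, and the function name is hypothetical. It assumes the setting has already been recognised as a plain less(1) invocation without shell redirections, and it only detects a standalone "-R" word, not one bundled with other flags.

```python
import shlex


def with_raw_control_chars(pager_setting):
    """Return the pager command with -R added as a command-line argument."""
    words = shlex.split(pager_setting)
    if not words:
        raise ValueError("empty pager setting")
    if "-R" not in words[1:]:
        # Insert directly after the program name, before any other arguments.
        words.insert(1, "-R")
    return " ".join(shlex.quote(word) for word in words)
```

Because the environment is never modified, whatever the user has put in LESS is passed through to the pager untouched.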