From: Alan G. I. <ala...@gm...> - 2021-05-22 12:32:51
|
If I run the example at https://rst2html5.readthedocs.io/en/latest/README.html#example I do not see the documented output. Instead, I see the h1 elements demoted to h2, which is unwanted. Removing the second title produces the expected output. I'm using an up-to-date docutils (revision 8757). Thank you, Alan Isaac |
From: Guenter M. <mi...@us...> - 2021-05-22 21:56:12
|
On 2021-05-22, Alan G. Isaac wrote: > If I run the example at > https://rst2html5.readthedocs.io/en/latest/README.html#example > I do not see the documented output. Please be aware, that the "rst2html5" documented here is a 3rd-party project, not Docutils' html5 writer! (cf. https://rst2html5.readthedocs.io/en/latest/authors.html) > Instead, I see the h1 elements demoted to h2, which is unwanted > Removing the second title produces the expected output. > I'm using an up to date docutils (revision 8757). This is an intended and documented change: - Change the `initial_header_level`_ setting's default to "2", as browsers use the `same style for <h1> and <h2> when nested in a section`__. .. _initial_header_level: docs/user/config.html#initial-header-level __ https://stackoverflow.com/questions/39547412/same-font-size-for-h1-and-h2-in-article --- RELEASE-NOTES.txt Günter |
From: Alan G. I. <ala...@gm...> - 2021-05-23 18:40:31
|
But of course you already did provide this option. I still claim this is a bad default (who on earth is using docutils without CSS), but my problem is already addressed. Thanks! Alan On 5/23/2021 8:04 AM, Alan G. Isaac wrote: > Please, please revert this change. > A bad default display decision by browser makers > does not imply that docutils users should not > have access to h1 elements. > > If you cannot revert this, please provide an option > to control it. > > Note that the current behavior the situation is close > to impossible. If I have a document with a single > top level header, it becomes a h1 element. > If I later include in it a document with a top level header, > they both become h2 elements. I have many documents like this. > > Alan Isaac |
From: Guenter M. <mi...@us...> - 2021-05-24 08:28:48
|
On 2021-05-23, Alan G. Isaac wrote: > But of course you already did provide this option. Actually, this was not me: the initial-header-level__ option is there "for ages". __ https://docutils.sourceforge.io/docs/user/config.html#initial-header-level > I still claim this is a bad default (who on earth is using docutils > without CSS), but my problem is already addressed. I changed the default value for the "html5" writer to bring it in line with HTML5 behaviour after making it write <section> instead of <div class="section">. Of course you should use CSS, but I believe it is a bad idea to force everyone to define the visual features for every <Hn> level in order to override current browser default just to keep using <H1> for both, title and first-level section heading when the "web consensus" moved to using <H1> for the title and <H1+n> for section headings -- independent of my personal preference in this question. >> If I have a document with a single >> top level header, it becomes a h1 element. Yes, by default, a lone top-level section title is promoted to `document title`__ (<H1 class="title">). You can, however, use the doctitle_xform__ setting if it should be a section heading instead. Then you will get a <H1> with html4css1 and <H2> with html5. __ https://docutils.sourceforge.io/docs/ref/rst/restructuredtext.html#document-title __ https://docutils.sourceforge.io/docs/user/config.html#doctitle-xform >> If I later include in it a document with a top level header, >> they both become h2 elements. I have many documents like this. This very much depends: * If both documents use the same adornment style for their respective top-level headings, none will be promoted to document title. * If the master document uses an adornment style that is never reused in the rest of the document and its children, this will become the document title (<H1 class="title">) unless doctitle_xform == False. 
It may be a good idea to check your documents for current and intended behaviour and, if necessary, correct either the doctitle_xform setting or the style of the initial heading in master documents. Günter |
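The promotion rule described in the message above can be sketched in a few lines of Python. This is only a simplified model, not Docutils code: the names `section_levels` and `promoted_title` are hypothetical, headings are reduced to (title, adornment style) pairs, and the model ignores subtitle promotion as well as any body text that precedes the first heading.

```python
from typing import List, Optional, Tuple

# Hypothetical representation: (title text, adornment character).
Heading = Tuple[str, str]


def section_levels(headings: List[Heading]) -> List[int]:
    """Assign a level to each heading: every adornment style that has not
    been seen before opens the next deeper level (first style = level 1)."""
    seen: List[str] = []
    levels: List[int] = []
    for _title, style in headings:
        if style not in seen:
            seen.append(style)
        levels.append(seen.index(style) + 1)
    return levels


def promoted_title(headings: List[Heading],
                   doctitle_xform: bool = True) -> Optional[str]:
    """Return the heading that would be promoted to document title,
    or None if no heading qualifies (simplified model)."""
    if not doctitle_xform or not headings:
        return None
    # The first heading is always level 1; it is "lone" only if no
    # later heading reuses its adornment style.
    if section_levels(headings).count(1) == 1:
        return headings[0][0]
    return None


chapter = [("Chapter One", "="), ("Details", "-")]
combined = chapter + [("Chapter Two", "="), ("More", "-")]
master = [("Collection", "*")] + combined

print(promoted_title(chapter))    # Chapter One
print(promoted_title(combined))   # None
print(promoted_title(master))     # Collection
```

In this model a stand-alone chapter gets a document title, the plain concatenation of two chapters does not, and a master document whose first heading uses an otherwise unused adornment style does again.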
From: Alan G. I. <ala...@gm...> - 2021-05-24 16:18:42
|
On 5/24/2021 4:28 AM, Guenter Milde via Docutils-users wrote: > I changed the default value for the "html5" writer to bring it in line > with HTML5 behaviour after making it write <section> instead of <div > class="section">. Of course you should use CSS, but I believe it is a bad > idea to force everyone to define the visual features for every <Hn> level > in order to override current browser default just to keep using <H1> for > both, title and first-level section heading when the "web consensus" > moved to using <H1> for the title and <H1+n> for section headings -- > independent of my personal preference in this question. I am not going to drag out this discussion, but I must make one last appeal to revert the default behavior. Changing the meaning of a section header because it occurs a second time seems to me to be fundamentally broken behavior. So I must ask: how much consultation did you do on this? E.g., Did David agree that this is good behavior? I already pointed out that it means that combining two documents (e.g., with an `include`) that have compatible header notation and individually format perfect produces a broken combined document. That's exactly how I discovered this change. This alone should cause immense hesitation. Also, this decision seems out of step with the current understanding of HTML document structure. For example, consider: https://www.impactplus.com/blog/multiple-h1-headlines-okay https://webdesign.tutsplus.com/articles/the-truth-about-multiple-h1-tags-in-the-html5-era--webdesign-16824 Thus I again urge reverting this change. However, this is my last comment on this matter. Thanks for all your work! Alan Isaac |
From: Guenter M. <mi...@us...> - 2021-05-25 13:25:55
|
Dear Alan,

On 2021-05-24, Alan G. Isaac wrote:

> I am not going to drag out this discussion, but I must make one
> last appeal to revert the default behavior.

I am actually glad you don't give up easily, as this makes me think again about the document structure and design and hopefully helps to improve implementation and documentation. There are, however, still open questions regarding the core of your problem with the current behaviour and I currently come to different conclusions, so please bear with me.

> Changing the meaning of a section header because it occurs
> a second time seems to me to be fundamentally broken behavior.
> So I must ask: how much consultation did you do on this?
> E.g., Did David agree that this is good behavior?

Changing the meaning of a section header when it occurs just one time is actually the standard behaviour of rST laid down by David in the reStructuredText Markup Specification.

https://docutils.sourceforge.io/docs/ref/rst/restructuredtext.html#document

This means that with the default settings, the input ::

  Document Title
  --------------

  This is the content.

generates a document without sections (here rendered as Docutils XML) ::

  <document ids="document-title" names="document\ title" source="/tmp/titles.rst" title="Document Title">
      <title>Document Title</title>
      <paragraph>This is the content.</paragraph>
  </document>

OTOH, changing the input to ::

  1st Section Title
  -----------------

  Content of section 1

  2nd Section Title
  -----------------

  Content of section 2

leads to a document with two sections but no document title ::

  <document source="/tmp/titles.rst">
      <section ids="st-section-title" names="1st\ section\ title">
          <title>
              1st Section Title
          <paragraph>
              Content of section 1
      <section ids="nd-section-title" names="2nd\ section\ title">
          <title>
              2nd Section Title
          <paragraph>
              Content of section 2

Of course, this also holds if the 2nd section is from a different file and included after the 1st section.
"html4css1" has a special feature: it uses a <h1> tag for both, document title and 1st-level section titles. https://docutils.sourceforge.io/FAQ.html#unexpected-results-from-tools-rst2html-py-h1-h1-instead-of-h1-h2-why > I already pointed out that it means that combining two > documents (e.g., with an `include`) that have compatible > header notation and individually format perfect produces > a broken combined document. Producing broken documents is bad, so I would like to improve the situation. Could you provide a minimal working (rsp. breaking) example and specify what you would expect and what is broken, please? > Also, this decision seems out of step with the current > understanding of HTML document structure. The changing of the default `initial_header_level`__ setting for HTML5 was triggered by a rendering problem observed in Firefox, Chromium and Opera after the mapping of Docutils "section" nodes from <div class="section"> to <section>. I consulted the `MDN Web Docs`__ as well as the `HTML Specifications`__ before settling on this change. __ https://docutils.sourceforge.io/docs/user/config.html#initial-header-level __ https://developer.mozilla.org/en-US/docs/Web/HTML/Element/Heading_Elements __ https://developer.mozilla.org/en-US/docs/Web/HTML/Element/Heading_Elements#specifications The most current `"living standard"`__ says: Sections may contain headings of any rank, but authors are strongly encouraged to either use only h1 elements, or to use elements of the appropriate rank for the section's nesting level. __ https://html.spec.whatwg.org/multipage/sections.html#headings-and-sections The "html5" writer does the latter, using: <body> h1 <section> h2 <section> h3 ... ... also, if the document heading is missing because the top-level section is nevertheless nested inside the <body> of the document. 
In my understanding, this provides a good match of Docutils document structure to HTML5. However, I agree that HTML document structure (especially the understanding of a section's "nesting level") is a complex issue, and I may have missed something.

> For example, consider:
> https://www.impactplus.com/blog/multiple-h1-headlines-okay

This post starts quoting Webmaster Trends Analyst John Mueller

  [...] it makes no difference to Google whether you use no H1 tags or 100.

which endorses both the "html4css1" way of re-using <h1> for document title and section title and the "html5" way of skipping <h1> if there is no document title. The section about screen readers states

  Whether you decide it's better to use no H1s or several, consider if it will have an impact on someone who uses a screen reader for accessibility.

and the summary says

  Although multiple top-level headings won't hurt your Google rankings, it may be preferable to stick to maintaining only one H1 per page until more information is available on the topic.

> https://webdesign.tutsplus.com/articles/the-truth-about-multiple-h1-tags-in-the-html5-era--webdesign-16824

This article is interesting to read (although a bit dated). It states that "it is perfectly fine to use as many <h1> tags as your document calls for; that is one per sectioning root or content section." but also admits "It is permissible by the HTML5 spec to use lower level headings than <h1> to label a section". The "New <h1> Usage Rules" seem to contradict the "living standard"'s recommendation and the handling in current browsers. They can be reconciled using the caveat given at the end: "But if you do decide to use a tag other than <h1> for a section label, just ensure you follow the same rules as listed above, replacing <h1> in each rule with your chosen tag." The example problem given there is a page containing several independent "articles".
Docutils currently does not offer a way to distinguish independent "articles" from interrelated "sections" inside one rST document. > Thus I again urge reverting this change. Just reverting to ``initial_header_level == 1`` would require another solution to the rendering problems in current browsers. Alternatives include "not using <section> elements" and "styling every heading level in the mandatory CSS style-sheet" which I am open to discuss. Thank you for the feedback, Günter |
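The mapping from section depth to heading tag discussed in this message can be written down as a small Python sketch. It is a model under stated assumptions, not the writer's actual code: `heading_tag` is a hypothetical helper, the document title is assumed to be rendered as <h1> regardless of the setting, and capping the rank at 6 is an assumption made only because HTML defines no heading element beyond <h6>.

```python
def heading_tag(section_depth: int, initial_header_level: int = 2) -> str:
    """Tag for a section title at the given nesting depth.

    section_depth: 1 for a top-level section, 2 for a subsection, ...
    initial_header_level: rank used for top-level sections
        (2 models the new "html5" default, 1 the "html4css1" default).
    """
    if section_depth < 1:
        raise ValueError("section_depth starts at 1")
    rank = initial_header_level + section_depth - 1
    # Assumption of this sketch: ranks beyond 6 are clamped to <h6>.
    return f"h{min(rank, 6)}"


for depth in (1, 2, 3):
    print(depth, heading_tag(depth), heading_tag(depth, initial_header_level=1))
```

With the default of 2, top-level sections come out as h2 whether or not the document has a title, which is the behaviour reported at the start of the thread. Passing 1 restores h1 for top-level sections.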
From: Alan G. I. <ala...@gm...> - 2021-05-25 18:25:39
|
> Alan wrote: >> I already pointed out that it means that combining two >> documents (e.g., with an `include`) that have compatible >> header notation and individually format perfect produces >> a broken combined document. On 5/25/2021 9:25 AM, Guenter Milde via Docutils-users wrote: > Producing broken documents is bad, so I would like to improve the situation. > Could you provide a minimal working (rsp. breaking) example and specify what > you would expect and what is broken, please? I may have miscommunicated here. By using the `--initial-header-level` option, I was able to fix things for my use. Nevertheless, here is an example of what broke. I have documents (call them chapters), where the top level section header is the chapter title. Each of these has a single top-level header. These format perfectly, individually. Sometimes I combine these into a single document. Under the new defaults, since there are multiple top-level sections, the combined document does not maintain the desired formatting. I understand the point you made about the double use of the h1 element in previous writers. However, I always understood (misunderstood?) this as an effort to provide access to a document-level title while still providing access to *all* of the header elements within a document. This seemed useful, since some documents have many levels. Nevertheless, I typically turned it off with the `--no-doc-title` option. Finally, to your core concern: sectioning. My preferred solution would allow multiple h1 elements *outside* of section elements, thus conforming to the idea that header depth should conform to section depth. Note that the standard has the concept of implicit sections, but `body` is also considered an explicit section. (That's my non-expert understanding.) The explicit body section can contain multiple h1 elements. 
I see that there is some debate about whether a document *should* contain multiple h1 elements, but there does not seem to be any debate that a document *should not* skip levels. Demotion of the top level header to h2 therefore seems to me to be clearly forbidden, while the use of multiple h1 elements does not. Again, I simply share my non-expert opinion and experience. As I said, I have a solution that is working for me. And again, thank you for all your work on this! Alan |
From: Guenter M. <mi...@us...> - 2021-06-01 07:54:58
|
Dear Alan,

On 2021-05-25, Alan G. Isaac wrote:
> On 5/25/2021 9:25 AM, Guenter Milde via Docutils-users wrote:

> ... here is an example of what broke.
> I have documents (call them chapters), where the
> top level section header is the chapter title.
> Each of these has a single top-level header.
> These format perfectly, individually.
> Sometimes I combine these into a single document.
> Under the new defaults, since there are multiple
> top-level sections, the combined document does not
> maintain the desired formatting.

I can reproduce this. In the output of the combined file, section headings look different from the output of the chapter files. However, this is consistent for "html4", "html5", "latex", and "odf" and reflects the missing document title in the combined document.

Setting the "initial-header-level" to 1 is a possible workaround in this use-case. However, it leads to bad formatting (with the browser default styling for headings) when the combined document has a title::

  Collection of recent unpublished work
  *************************************

  .. include:: text1.txt
  .. include:: text2.txt

> I understand the point you made about the double use of
> the h1 element in previous writers. However, I
> always understood (misunderstood?) this as an
> effort to provide access to a document-level title
> while still providing access to *all* of the header
> elements within a document. This seemed useful, since
> some documents have many levels.

"Saving" elements for use with lower-level headings may have been sensible with HTML 4.01 but is no longer necessary with HTML5 and <section> elements: here we have the possibility of infinite nesting and can also use ARIA attributes (role="heading" together with "aria-level") to set a heading level.

> Nevertheless, I typically turned it off with the `--no-doc-title`
> option.

This results in documents without title also for the stand-alone chapters. A missing document title may be OK for home-use but is generally not recommended for published work.
How would you cite the combined document? This is, IMV, also the reason behind the obsolete search engine optimization (SEO) rule "HTML documents must have exactly one <h1> element": One <h1> can be extracted as document title, more than one <h1> is ambiguous (just as multiple top-level headings in rST source). HTML5's outline algorithm allows to determine a document title also when several <h1> elements are used, but this depends on document structure and is not guaranteed to work for any document (see below). > Finally, to your core concern: sectioning. > My preferred solution would allow multiple h1 elements > *outside* of section elements, thus conforming to the idea that > header depth should conform to section depth. > Note that the standard has the concept of implicit sections, > but `body` is also considered an explicit section. > (That's my non-expert understanding.) > The explicit body section can contain multiple h1 elements. I share this understanding -- this would not violate the standard. However, it would not align header rank to section depth: the <h1> elements start implicit sub-sections (level 2). There is no heading in level 1 (i.e. no document title). > I see that there is some debate about whether a document *should* > contain multiple h1 elements, but there does not seem to be any debate > that a document *should not* skip levels. Demotion of the top > level header to h2 therefore seems to me to be clearly forbidden > while multiple h1 elements does not. IMV, both are instances of a titleless document, valid but not ideal. Cf. the statement cited in https://www.impactplus.com/blog/multiple-h1-headlines-okay: it makes no difference to Google whether you use no H1 tags or 100. See also the usage notes in https://developer.mozilla.org/en-US/docs/Web/HTML/Element/Heading_Elements * Both cases are valid HTML documents (in HTML4 as well as in HTML5). 
If sections without heading work best for a special use case, fine.

* It is recommended not to skip levels and to have a heading in each section level.

  - With explicit sections, you cannot skip section levels but you can have sections without heading.
  - With implicit sections, you cannot have a section without heading but you can skip levels.

  (In HTML5, the <body> element is considered an explicit section and thus can be without heading.)

To handle the "hole" at level 1, you can either skip <h1> elements and keep heading rank and outline level in sync, or you can use <h1> for outline level 2 (top sections) and <h:math:`(n-1)`> for outline level :math:`n`. The "initial-header-level" setting lets you configure which "coping strategy" is used. The best fix is to provide a document title in the source.

As Docutils is already generating explicit sections when converting the rST source to a document tree, I prefer to follow the tip in https://html.spec.whatwg.org/multipage/sections.html#headings-and-sections

  Authors are also encouraged to explicitly wrap sections in elements of sectioning content, instead of relying on the implicit sections generated by having multiple headings in one element of sectioning content.

and keep the current default behaviour.

Best regards, Günter |
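The two "coping strategies" for a titleless document can be compared with a small rank check, sketched here in Python. The helper `skipped_levels` is hypothetical and looks only at the sequence of heading ranks in document order. It is not the HTML outline algorithm and knows nothing about explicit <section> nesting, so it reflects the "do not skip heading ranks" reading rather than the section-depth reading.

```python
from typing import List, Tuple


def skipped_levels(ranks: List[int]) -> List[Tuple[int, int, int]]:
    """Return (index, previous_rank, rank) for each heading whose rank is
    more than one step deeper than the heading before it. The first
    heading is compared against a virtual rank 0, so a document that
    starts with <h2> is reported as skipping <h1>."""
    problems = []
    previous = 0
    for index, rank in enumerate(ranks):
        if rank > previous + 1:
            problems.append((index, previous, rank))
        previous = rank
    return problems


# Two chapters combined without a document title:
print(skipped_levels([2, 3, 2, 3]))     # initial-header-level 2 -> [(0, 0, 2)]
print(skipped_levels([1, 2, 1, 2]))     # initial-header-level 1 -> []
# The same chapters below a document title rendered as <h1>:
print(skipped_levels([1, 2, 3, 2, 3]))  # initial-header-level 2 -> []
```

Under this rank-only check, the titleless document with the new default is flagged once, at its first heading, while adding a document title removes the gap without changing the setting.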