Skip to content
This repository was archived by the owner on Aug 3, 2024. It is now read-only.

Use .hie files for the Hyperlinker backend #977

Merged

Conversation

harpocrates
Copy link
Collaborator

@harpocrates harpocrates commented Dec 12, 2018

Now that .hie files can be generated by GHC, Haddock should use them in the hyperlinker backend instead of going via the TypecheckedModule (which will have to go away for Hi Haddock to work).

Fixes #496.
Fixes #715.

@harpocrates
Copy link
Collaborator Author

harpocrates commented Dec 12, 2018

Some bigger TODO items:

  • .hie files shouldn't be put beside .hs files

  • .hie files should be generated only when hyperlinked sources are requested

  • get to the bottom of Typeable warnings the get produced on base:

    GHC error in desugarer lookup in Ghci1:
      Can't find interface-file declaration for type constructor or class Typeable
        Probable cause: bug in .hi-boot file, or inconsistent .hi file
        Use -ddump-if-trace to get an idea of which file caused the error
    

    (Looks like this is a GHC-side issue caused by calling typecheckModule with -fwrite-ide-info. I should file a bug...)

  • make sure we don't have performance regressions (in particular: if the new type annotation stuff is causing a huge performance regression, maybe it should be behind a flag?). The time it takes Hadrian to run Haddock (Hadrian passes --hyperlinked-source by default) has jumped from 240s to 370s...

  • accept the output

  • handle linking to/from back-ticked identifiers (like `foo`) and parenthesized operators (like (+))

  • https://phabricator.haskell.org/D5463 (merged, see https://gitlab.haskell.org/ghc/ghc/commit/9fb2702dec3e9419e1a229f8cd678324e89fdddf)

  • cross-package links are currently broken (either make them work or disable them)

@wz1000
Copy link
Collaborator

wz1000 commented Dec 17, 2018

There are a few more issues about the interaction between the haddock tokenizer and .hie files.

  • The haddock tokenizer classifies `foo` and (+) as 3 separate tokens, while they are treated as a single token by ghc, so hyperlinking is broken for these.

  • The haddock tokenizer doesn't understand {-# LINE ... #-} pragmas, so hyperlinking is broken for files that include this pragma(see this). This is not a new issue created by this PR, but also affects hyperlinked source generated by haddock without using .hie files. Relevant issue: LINE pragmas cause missing documentation #441

@harpocrates harpocrates reopened this Dec 17, 2018
@harpocrates
Copy link
Collaborator Author

There are a few more issues about the interaction between the haddock tokenizer and .hie files.

  • The haddock tokenizer classifies `foo` and (+) as 3 separate tokens, while they are treated as a single token by ghc, so hyperlinking is broken for these.

  • The haddock tokenizer doesn't understand {-# LINE ... #-} pragmas, so hyperlinking is broken for files that include this pragma(see this). This is not a new issue created by this PR, but also affects hyperlinked source generated by haddock without using .hie files. Relevant issue: LINE pragmas cause missing documentation #441

Good point for infix identifiers! Haddock uses GHC's tokenizer - the problem is that `foo` (and (+)) is actually three tokens (see https://github.com/ghc/ghc/blob/1ebe8438cd29448291229d5ce23c9f73216e9650/compiler/parser/Parser.y#L3354).

However, I don't think we should let this be blocked by position-altering pragmas. I think that's a more fundamental issue about how GHC conflates positions in files with positions used for error reporting.

@harpocrates
Copy link
Collaborator Author

Here's more information on the performance (both time and allocations) for all boot libs: https://docs.google.com/spreadsheets/d/1xmUdUoVqbYtIRYdz5TgWT6rwHsVGjYEgEmMwWClU9-c/edit?usp=sharing.

@wz1000
Copy link
Collaborator

wz1000 commented Dec 28, 2018

@harpocrates Is this for building haddock docs with --hyperlinked-source?

With an SSD, I see a slowdown of ~15% compiling ghc stage2 with -fwrite-ide-info: https://gist.github.com/wz1000/8678464b4c88be8359d39462ae7f0a90

I measured this by building ghc normally, then blowing away compiler/stage2 and building with and without -fwrite-ide-info

Once Hi Haddock lands, haddocks performance should increase significantly as it wouldn't need to invoke ghc at all.

@harpocrates
Copy link
Collaborator Author

harpocrates commented Dec 28, 2018 via email

@wz1000
Copy link
Collaborator

wz1000 commented Dec 28, 2018

It would be useful to know whether the bulk of the slowdown in in the ghc stage that generates .hie files or in the haddock stage that writes hyperlinked sources. Since on my machine, I get <15% slowdown when building stage2 with -fwrite-ide-info, wherease you get a 50% overall slowdown.

@harpocrates
Copy link
Collaborator Author

Here is the script I ran twice - once for haskell:ghc-head and once from harpocrates:feature/hie-based-hyperlinker (and from the GHC tree):

set -e

for run in run1 run2 run3 run4
do
  mkdir "profs/$run"
  for file in $(find _build -name "*.haddock")
  do
    echo "Processing $file..."
    echo "| Removing $file"
    rm $file
    echo "| Rebuilding"
    ./hadrian/build.sh -c  --flavour=user $file
    echo "| Moving haddock.prof"
    mv haddock.prof "profs/$run/$(basename $file | cut -d. -f1)"
  done
done

The custom user flavour isn't very complex - just making sure Haddock is built and run with some profiling options:

userFlavour :: Flavour
userFlavour = defaultFlavour
  { name = "user"
  , args = mconcat [ builder (Haddock BuildPackage) ? pure [ "+RTS", "-p", "-RTS" ]
                   , builder (Ghc CompileHs) ? package haddock ? pure [ "-prof", "-fprof-auto", "-rtsopts" ]
                   , builder (Ghc LinkHs) ? package haddock ? pure [ "-prof", "-fprof-auto", "-rtsopts" ]
                   , args defaultFlavour
                   ]
  }

Here are the .prof files I got: https://owncloud-tng.galois.com/index.php/s/1nt96QTchrWeTNZ

@wz1000
Copy link
Collaborator

wz1000 commented Dec 28, 2018

I meant that I get a 15% slowdown building ghc with -fwrite-ide-info, without building any haddock docs at all. It would be good to know if the disparity is due to

  • Using an SSD, since .hie files naturally lead to a substantial and largely unavoidable increase in disk IO, which would impact setups with slower storage formats disproportionately.
  • The new hyperlinker code added to haddock by this PR being slow
  • Some other reason.

@harpocrates
Copy link
Collaborator Author

Given the first lines of pretty much all the new .prof files, it looks like generating .hie files is the slow part. The new hyperlinker code could probably be sped up a bit too though.

        Sat Dec 22 19:54 2018 Time and Allocation Profiling Report  (Final)

           haddock +RTS -p -t_build/stage1/libraries/Cabal/Cabal/build/haddock.t ...


        total time  =       25.36 secs   (25362 ticks @ 1000 us, 1 processor)
        total alloc = 59,578,959,808 bytes  (excludes profiling overheads)

COST CENTRE                    MODULE                                SRC                                                                                      %time %alloc

extract_renamed_stuff          HscMain                               compiler/main/HscMain.hs:(402,1)-(435,18)                                                 25.6   31.3
tc_rn_src_decls                TcRnDriver                            compiler/typecheck/TcRnDriver.hs:(500,4)-(572,7)                                           9.6    6.5
annotate.typ                   Haddock.Backends.Hyperlinker.Renderer utils/haddock/haddock-api/src/Haddock/Backends/Hyperlinker/Renderer.hs:153:5-56            4.3    4.0
spanToNewline                  Haddock.Backends.Hyperlinker.Parser   utils/haddock/haddock-api/src/Haddock/Backends/Hyperlinker/Parser.hs:(126,1)-(156,21)      3.9    3.5
ppHyperlinkedModuleSource.html Haddock.Backends.Hyperlinker          utils/haddock/haddock-api/src/Haddock/Backends/Hyperlinker.hs:80:5-50                      2.9    5.2
solve_loop                     TcInteract                            compiler/typecheck/TcInteract.hs:(231,9)-(235,44)                                          2.7    1.0
...

@wz1000
Copy link
Collaborator

wz1000 commented Dec 28, 2018

That says 25% of the time is spent in extract_renamed_stuff. However, now I doubt whether we need renamed source at all to generate .hie files. Initially there were some issues with some name uniques not matching due to differences introduced by the typechecker, but I believe I should have solved those with the name_remappping stuff.

The traversal of the renamed source enables hyperlinking of imports/exports/annotations/foreign declarations and other stuff that is not in the TypecheckedSource. Maybe there is a better way to extract those specifically, and not extract the rest of AST that is already there in the typechecked source.

I believe it would be worth testing to see if not including RenamedSource in the hie file generation stuff(and thus not calling extract_renamed_stuff) yields a significant performance improvement.

@harpocrates
Copy link
Collaborator Author

harpocrates commented Dec 28, 2018 via email

@wz1000
Copy link
Collaborator

wz1000 commented Dec 28, 2018

Isn’t extract_renamed_stuff the top-level function for where HIE files get created and written out (so that entry includes not just time spent traversing the renamed source, but also time spent compacting the AST and writing it to a file)?

Oh, yes, I misremembered. However, the format will not change if we don't traverse the RenamedSource. All that needs to be changed in enrichHie is

         flat_asts = concat
           [ tasts
-          , rasts
-          , imps
-          , exps
           ]

and we can also avoid setting keep_rn to True in HscMain when given -fwrite-ide-info

@harpocrates
Copy link
Collaborator Author

By "format" I don't mean the Haskell types in HieTypes, I mean the structures that people expect to be inside .hie files.

Would you like to make a GHC patch for this? Then we can post some performance numbers there, as well as check that nobody is relying on the information that is about to be removed from .hie files.

@mpickering
Copy link
Collaborator

There are not that many cost centres defined in the GHC code base so I wouldn't trust the information to precisely point to what is taking up the time.

@wz1000
Copy link
Collaborator

wz1000 commented Dec 29, 2018

@wz1000
Copy link
Collaborator

wz1000 commented Dec 29, 2018

Parts of the AST that are unaccounted for without using renamed source:

  • Type signatures
  • Typeclass/instance/data/type declarations
  • Imports
  • Exports
  • Deriving Declarations
  • Fixity declarations
  • Default Declarations
  • Foreign declarations
  • Warning declarations
  • Ann Declarations
  • Rule Declarations

@harpocrates
Copy link
Collaborator Author

@wz1000 Any updates on potential performance improvements from the GHC side?

I'm gonna see if I can optimize the rest of the stuff in Haddock.Backends.Hyperlinker.Parser. In particular, I suspect there will be significant performance gains if we transform the preprocessing to work on StringBuffer instead of String. I expect to get about a 5% improvement from that.

@wz1000
Copy link
Collaborator

wz1000 commented Jan 8, 2019

@harpocrates the approach I suggested earlier(not traversing the renamed AST) will not work. Type signatures can be buried arbitrarily deep in the AST, and to hyperlink those, we need the Renamed Source(since LHsType GhcRn is simply converted to Type by the typechecker, thus losing the structure of the original source).

The other items in my checklist above should be possible to extract without a full traversal of the renamed AST. To see the possible gains, you can benchmark the branch I linked above. If the gains are substantial, we might want to add a flag that enables this behaviour at the cost of information for type signatures.

wz1000 and others added 9 commits January 14, 2019 10:54
Move the logic for handling temporary output directories up and
make .hie files write to there. Although having shorter lived
temp directories is better, they were so short lived that the .hie
files were gone by the time the hyperlinker backend was ready to run!
This is working around a quirk of the GHC lexer. It seems to be working
well enough though. See the attached test case for more motivation.
Optimized some of the batch HIE transformations to share
intermediate results. This sort of thing probably belongs as
a GHC-side utility.

Also, avoid recomputing a module name based SrcMap.

This should be a performance-only change.
Also avoid an unnecessary call to splice URL in those cases.
This is yet another example of why we need to stop using CPP.
Clang's CPP inserts a newline after lines that _look_ like (but don't
end up actually being) preprocessor directives.

See the comment on `needPragHack` for more on this sad state of affairs.
@harpocrates
Copy link
Collaborator Author

Good news. By limiting (for now) which expressions have their types serialized into the HIE files, performance returns to levels that make this mergeable (2-3% percent faster but allocate 2-3% more) without sacrificing any features.

We'll now see if anyone else objects: https://gitlab.haskell.org/ghc/ghc/merge_requests/214.

@wz1000
Copy link
Collaborator

wz1000 commented Jan 25, 2019

@harpocrates 2-3% faster than the previous version with .hie files or 2-3% faster than the old way of generating hyperlinked source?

@harpocrates
Copy link
Collaborator Author

@wz1000 Than the old way, otherwise it wouldn't really matter, right?

@harpocrates
Copy link
Collaborator Author

Actually, with the latest tweaks to that merge request, both allocations and total time drop by a couple percentage points compared to the old way. I do love changes that are net wins in all dimensions: time, allocations, bug fixes, and features! 😀

I intend to accept the new output and merge this as soon as the GHC MR gets merged.

This is twofold:

  1. ask GHC to treat these as regular tokens
  2. we manually update the internal position as well as returning tokens

If we don't do 1, we end up with huge chunks of text that get considered
as whitespace (because a `LINE`/`COLUMN` pragma caused us to jump to that
next place).

If we do 1 but not 2, the positions we associate with tokens won't match
those from HIE files, and we consequently won't get any links/type
annotations.
Fixes some annoyances around LINE/COLUMN pragmas not being
recognized properly due to the lexer inserting extra zero length tokens.

Also, avoid rendering a zero length token.
Calls out to 'spliceURL' are often useless. Also,
given the size of these functions, we should tell
GHC to inline them.
@harpocrates harpocrates merged commit 1b26460 into haskell:ghc-head Jan 31, 2019
@harpocrates harpocrates deleted the feature/hie-based-hyperlinker branch January 31, 2019 09:37
@harpocrates
Copy link
Collaborator Author

Final numbers, in case anyone is interested:

  • 9-10% faster on the sum of boot libs running --html and --hyperlinked-sources
  • 6-7% less allocations on the sum of boot libs running --html and --hyperlinked-sources

Thanks to everyone for being patient!

@alexbiehl
Copy link
Member

alexbiehl commented Jan 31, 2019 via email

alanz pushed a commit that referenced this pull request Mar 25, 2020
This is a large architectural change to the Hyperlinker.

  * extract link (and now also type) information from `.hie` instead
    of doing ad-hoc SYB traversals of the `RenamedSource`. Also
    adds a superb type-on-hover feature (#715).

 * re-engineer the lexer to avoid needless string conversions. By going
    directly through GHC's `P` monad and taking bytestring slices, we
    avoid a ton of allocation and have better handling of position
    pragmas and CPP.

In terms of performance, the Haddock side of things has gotten _much_
more efficient. Unfortunately, much of this is cancelled out by the
increased GHC workload for generating `.hie` files. For the full set of
boot libs (including `ghc`-the-library)

  * the sum of total time went down by 9-10% overall
  * the sum of total allocations went down by 6-7%

Haddock is moving towards working entirely over `.hi` and `.hie` files.
This change means we no longer need the `RenamedSource` from
`TypecheckedModule` (something which is _not_ in `.hi` files).

Along the way a bunch of things were fixed:

 * Cross package (and other) links are now more reliable (#496)
 * The lexer tries to recover from errors on every line (instead of at CPP
    boundaries)
 * `LINE`/`COLUMN` pragmas are taken into account
 * filter out zero length tokens before rendering
 * avoid recomputing the `ModuleName`-based `SrcMap`
 * remove the last use of `Documentation.Haddock.Utf8` (see  #998)
 * restructure temporary folder logic for `.hi`/`.hie` model
alanz pushed a commit that referenced this pull request Apr 28, 2020
This is a large architectural change to the Hyperlinker.

  * extract link (and now also type) information from `.hie` instead
    of doing ad-hoc SYB traversals of the `RenamedSource`. Also
    adds a superb type-on-hover feature (#715).

 * re-engineer the lexer to avoid needless string conversions. By going
    directly through GHC's `P` monad and taking bytestring slices, we
    avoid a ton of allocation and have better handling of position
    pragmas and CPP.

In terms of performance, the Haddock side of things has gotten _much_
more efficient. Unfortunately, much of this is cancelled out by the
increased GHC workload for generating `.hie` files. For the full set of
boot libs (including `ghc`-the-library)

  * the sum of total time went down by 9-10% overall
  * the sum of total allocations went down by 6-7%

Haddock is moving towards working entirely over `.hi` and `.hie` files.
This change means we no longer need the `RenamedSource` from
`TypecheckedModule` (something which is _not_ in `.hi` files).

Along the way a bunch of things were fixed:

 * Cross package (and other) links are now more reliable (#496)
 * The lexer tries to recover from errors on every line (instead of at CPP
    boundaries)
 * `LINE`/`COLUMN` pragmas are taken into account
 * filter out zero length tokens before rendering
 * avoid recomputing the `ModuleName`-based `SrcMap`
 * remove the last use of `Documentation.Haddock.Utf8` (see  #998)
 * restructure temporary folder logic for `.hi`/`.hie` model
alanz pushed a commit that referenced this pull request Apr 29, 2020
This is a large architectural change to the Hyperlinker.

  * extract link (and now also type) information from `.hie` instead
    of doing ad-hoc SYB traversals of the `RenamedSource`. Also
    adds a superb type-on-hover feature (#715).

 * re-engineer the lexer to avoid needless string conversions. By going
    directly through GHC's `P` monad and taking bytestring slices, we
    avoid a ton of allocation and have better handling of position
    pragmas and CPP.

In terms of performance, the Haddock side of things has gotten _much_
more efficient. Unfortunately, much of this is cancelled out by the
increased GHC workload for generating `.hie` files. For the full set of
boot libs (including `ghc`-the-library)

  * the sum of total time went down by 9-10% overall
  * the sum of total allocations went down by 6-7%

Haddock is moving towards working entirely over `.hi` and `.hie` files.
This change means we no longer need the `RenamedSource` from
`TypecheckedModule` (something which is _not_ in `.hi` files).

Along the way a bunch of things were fixed:

 * Cross package (and other) links are now more reliable (#496)
 * The lexer tries to recover from errors on every line (instead of at CPP
    boundaries)
 * `LINE`/`COLUMN` pragmas are taken into account
 * filter out zero length tokens before rendering
 * avoid recomputing the `ModuleName`-based `SrcMap`
 * remove the last use of `Documentation.Haddock.Utf8` (see  #998)
 * restructure temporary folder logic for `.hi`/`.hie` model
alanz pushed a commit that referenced this pull request Apr 30, 2020
This is a large architectural change to the Hyperlinker.

  * extract link (and now also type) information from `.hie` instead
    of doing ad-hoc SYB traversals of the `RenamedSource`. Also
    adds a superb type-on-hover feature (#715).

 * re-engineer the lexer to avoid needless string conversions. By going
    directly through GHC's `P` monad and taking bytestring slices, we
    avoid a ton of allocation and have better handling of position
    pragmas and CPP.

In terms of performance, the Haddock side of things has gotten _much_
more efficient. Unfortunately, much of this is cancelled out by the
increased GHC workload for generating `.hie` files. For the full set of
boot libs (including `ghc`-the-library)

  * the sum of total time went down by 9-10% overall
  * the sum of total allocations went down by 6-7%

Haddock is moving towards working entirely over `.hi` and `.hie` files.
This change means we no longer need the `RenamedSource` from
`TypecheckedModule` (something which is _not_ in `.hi` files).

Along the way a bunch of things were fixed:

 * Cross package (and other) links are now more reliable (#496)
 * The lexer tries to recover from errors on every line (instead of at CPP
    boundaries)
 * `LINE`/`COLUMN` pragmas are taken into account
 * filter out zero length tokens before rendering
 * avoid recomputing the `ModuleName`-based `SrcMap`
 * remove the last use of `Documentation.Haddock.Utf8` (see  #998)
 * restructure temporary folder logic for `.hi`/`.hie` model
lf- added a commit to lf-/haddock that referenced this pull request May 6, 2022
Fixes haskell#1481.

There were two bugs in this:
* We were assuming that we were always getting a relative path to the
  module in question, while Nix gives us file:// URLs sometimes. This
  change checks for those and stops prepending `..` to them.
* We were not linking to the file under the module. This seems
  to have been a regression introduced by haskell#977. That is, the URLs were
  going to something like
  file:///nix/store/3bwbsy0llxxn1pixx3ll02alln56ivxy-ghc-9.0.2-doc/share/doc/ghc/html/libraries/base-4.15.1.0/src
  which does not have the appropriate HTML file or fragment for the item
  in question at the end.

There is a remaining instance of the latter bug, but not in the
hyperlinker: the source links to items reexported from other modules are
also not including the correct file name. e.g. the reexport of Entity in
esqueleto, from persistent.

NOTE: This needs to get tested with relative-path located modules. It seems
correct for Nix based on my testing.

Testing strategy:

```
nix-shell '<nixpkgs>' --pure -A haskell.packages.ghc922.aeson
mkdir /tmp/aesonbuild && cd /tmp/aesonbuild
export out=/tmp/aesonbuild/out
genericBuild

ln -sf $HOME/co/haddock/haddock-api/resources .
./Setup haddock --with-haddock=$HOME/path/to/haddock/exec --hyperlink-source
```
Kleidukos pushed a commit that referenced this pull request May 7, 2022
Fixes #1481.

There were two bugs in this:
* We were assuming that we were always getting a relative path to the
  module in question, while Nix gives us file:// URLs sometimes. This
  change checks for those and stops prepending `..` to them.
* We were not linking to the file under the module. This seems
  to have been a regression introduced by #977. That is, the URLs were
  going to something like
  file:///nix/store/3bwbsy0llxxn1pixx3ll02alln56ivxy-ghc-9.0.2-doc/share/doc/ghc/html/libraries/base-4.15.1.0/src
  which does not have the appropriate HTML file or fragment for the item
  in question at the end.

There is a remaining instance of the latter bug, but not in the
hyperlinker: the source links to items reexported from other modules are
also not including the correct file name. e.g. the reexport of Entity in
esqueleto, from persistent.

NOTE: This needs to get tested with relative-path located modules. It seems
correct for Nix based on my testing.

Testing strategy:

```
nix-shell '<nixpkgs>' --pure -A haskell.packages.ghc922.aeson
mkdir /tmp/aesonbuild && cd /tmp/aesonbuild
export out=/tmp/aesonbuild/out
genericBuild

ln -sf $HOME/co/haddock/haddock-api/resources .
./Setup haddock --with-haddock=$HOME/path/to/haddock/exec --hyperlink-source
```
duog pushed a commit to duog/haddock that referenced this pull request Jul 29, 2022
Fixes haskell#1481.

There were two bugs in this:
* We were assuming that we were always getting a relative path to the
  module in question, while Nix gives us file:// URLs sometimes. This
  change checks for those and stops prepending `..` to them.
* We were not linking to the file under the module. This seems
  to have been a regression introduced by haskell#977. That is, the URLs were
  going to something like
  file:///nix/store/3bwbsy0llxxn1pixx3ll02alln56ivxy-ghc-9.0.2-doc/share/doc/ghc/html/libraries/base-4.15.1.0/src
  which does not have the appropriate HTML file or fragment for the item
  in question at the end.

There is a remaining instance of the latter bug, but not in the
hyperlinker: the source links to items reexported from other modules are
also not including the correct file name. e.g. the reexport of Entity in
esqueleto, from persistent.

NOTE: This needs to get tested with relative-path located modules. It seems
correct for Nix based on my testing.

Testing strategy:

```
nix-shell '<nixpkgs>' --pure -A haskell.packages.ghc922.aeson
mkdir /tmp/aesonbuild && cd /tmp/aesonbuild
export out=/tmp/aesonbuild/out
genericBuild

ln -sf $HOME/co/haddock/haddock-api/resources .
./Setup haddock --with-haddock=$HOME/path/to/haddock/exec --hyperlink-source
```

(cherry picked from commit ab53ccf)
hubot pushed a commit to ghc/ghc that referenced this pull request May 17, 2024
Fixes haskell/haddock#1481.

There were two bugs in this:
* We were assuming that we were always getting a relative path to the
  module in question, while Nix gives us file:// URLs sometimes. This
  change checks for those and stops prepending `..` to them.
* We were not linking to the file under the module. This seems
  to have been a regression introduced by haskell/haddock#977. That is, the URLs were
  going to something like
  file:///nix/store/3bwbsy0llxxn1pixx3ll02alln56ivxy-ghc-9.0.2-doc/share/doc/ghc/html/libraries/base-4.15.1.0/src
  which does not have the appropriate HTML file or fragment for the item
  in question at the end.

There is a remaining instance of the latter bug, but not in the
hyperlinker: the source links to items reexported from other modules are
also not including the correct file name. e.g. the reexport of Entity in
esqueleto, from persistent.

NOTE: This needs to get tested with relative-path located modules. It seems
correct for Nix based on my testing.

Testing strategy:

```
nix-shell '<nixpkgs>' --pure -A haskell.packages.ghc922.aeson
mkdir /tmp/aesonbuild && cd /tmp/aesonbuild
export out=/tmp/aesonbuild/out
genericBuild

ln -sf $HOME/co/haddock/haddock-api/resources .
./Setup haddock --with-haddock=$HOME/path/to/haddock/exec --hyperlink-source
```
hubot pushed a commit to ghc/ghc that referenced this pull request May 17, 2024
Fixes haskell/haddock#1481.

There were two bugs in this:
* We were assuming that we were always getting a relative path to the
  module in question, while Nix gives us file:// URLs sometimes. This
  change checks for those and stops prepending `..` to them.
* We were not linking to the file under the module. This seems
  to have been a regression introduced by haskell/haddock#977. That is, the URLs were
  going to something like
  file:///nix/store/3bwbsy0llxxn1pixx3ll02alln56ivxy-ghc-9.0.2-doc/share/doc/ghc/html/libraries/base-4.15.1.0/src
  which does not have the appropriate HTML file or fragment for the item
  in question at the end.

There is a remaining instance of the latter bug, but not in the
hyperlinker: the source links to items reexported from other modules are
also not including the correct file name. e.g. the reexport of Entity in
esqueleto, from persistent.

NOTE: This needs to get tested with relative-path located modules. It seems
correct for Nix based on my testing.

Testing strategy:

```
nix-shell '<nixpkgs>' --pure -A haskell.packages.ghc922.aeson
mkdir /tmp/aesonbuild && cd /tmp/aesonbuild
export out=/tmp/aesonbuild/out
genericBuild

ln -sf $HOME/co/haddock/haddock-api/resources .
./Setup haddock --with-haddock=$HOME/path/to/haddock/exec --hyperlink-source
```

(cherry picked from commit ab53ccf089ea703b767581ac14be0f6c78a7678a)
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants