Skip to content
This repository was archived by the owner on Aug 3, 2024. It is now read-only.

Hoogle backend improvements #432

Closed
wants to merge 7 commits into from

Conversation

panhania
Copy link
Contributor

@panhania panhania commented Aug 8, 2015

This implements improvements suggested in #415 and #417.

While default methods are now removed from Hoogle output, type families are not. So, this:

class Foo a where

    type Bar a b
    type Baz a

    type Baz a = [(a, a)]

    bar :: Bar a a
    bar = undefined


instance Foo [a] where

    type Bar [a] Int = [(a, Bool)]
    type Bar [a] Bool = [(Int, a)]

    type Baz [a] = (a, a, a)

will produce following output:

class Foo a where type family Bar a b type family Baz a Baz a = [(a, a)]
bar :: Foo a => Bar a a
instance Test.Foo [a]

@ndmitchell - if you want to get rid of type families as well, let me know.

@ndmitchell
Copy link

Leaving the type families is handy, since they are legitimately part of the class declaration. Could we get semi-colons/braces around the type family, so it becomes legal Haskell? e.g.

class Foo a where {type family Bar a b; type family Baz a; type Baz a = [(a, a)]}

If not, it's not too terrible, as long as there are only type declarations in there I can insert the newlines myself.

@panhania
Copy link
Contributor Author

I have added braces, semicolons and some whitespace to the output so that it is both easy to parse and readable for human:

class Foo a where {
    type family Bar a b;
    type family Baz a;
    type Baz a = [(a, a)];
}

@ndmitchell
Copy link

Currently all Haddock --hoogle declarations are single lines, so this would certainly be a significant departure - but I think as long as the { is at the end it should be pretty easy to recognise the requirement to read multiple lines, and it does seem more reasonable than trying to pack it all on one line.

@Fuuzetsu
Copy link
Member

@mrhania it's great that you fixed these but could we get some basic tests to make sure it doesn't silently break again? AFAIK @ndmitchell that pretty much all the Hoogle issues are regressions because there was nothing to track changes.

It can easily be the same as html-tests (i.e. diff-based) but simpler as we don't need to strip anything.

@panhania
Copy link
Contributor Author

Currently all Haddock --hoogle declarations are single lines, so this would certainly be a significant departure

Hmm, I haven't thought about that this way. It can be easily reverted, so if you think that change will make your work harder, tell me.

@mrhania it's great that you fixed these but could we get some basic tests to make sure it doesn't silently break again? AFAIK @ndmitchell that pretty much all the Hoogle issues are regressions because there was nothing to track changes.

It can easily be the same as html-tests (i.e. diff-based) but simpler as we don't need to strip anything.

I know, I was planning to do it. I just needed to think over how to refactor common test code because latex-test, html-test, hypsrc-test (and potenially hoogle-test) all share the same exact boilerplate and the only difference is output comparison method. Which isn't nice and I feel bad writing obviously duplicated code...

@ndmitchell
Copy link

@mrhania it's fine as it is - as long as what Haddock produces is legal Haskell and predictable then writing code to deal with it is pretty easy. The other thing that makes my life much easier is if Haddock doesn't change the Hoogle output regularly by accident, which is why I'm very keen on the tests :).

@Fuuzetsu
Copy link
Member

On master. @mrhania has a branch with test improvements which will should include Hoogle tests.

@Fuuzetsu Fuuzetsu closed this Aug 21, 2015
@bgamari
Copy link
Collaborator

bgamari commented Nov 23, 2016

@mrhania, it looks like the tests for this are currently broken; do you suppose you could have a look?

@panhania
Copy link
Contributor Author

panhania commented Dec 9, 2016

@bgamari I will try to look into that later today.

@panhania
Copy link
Contributor Author

Okay, so one of the reasons behind Hoogle tests failing is that there are default methods and associated types printed after the class statement which is something that I fixed in this PR (as discussed in #417). Looks like the old behaviour was reintroduced in 5628084 (around this place). Considering that this commit was made by @simonpj there must be a reason for it, so maybe current Haddock behaviour is desired?

@Fuuzetsu
Copy link
Member

This kind of breakage will always happen without an intermediate type with stable print out. Not sure how much sense it makes to try and keep up on top of Outputable instances in GHC.

Further, what actually caused it is half the problem: current behaviour changes output to Hoogle which means it can't be used, so it's not desired.

I also ran other tests recently and that was not the only issue by the way.

@panhania
Copy link
Contributor Author

panhania commented Jan 1, 2017

This kind of breakage will always happen without an intermediate type with stable print out. Not sure how much sense it makes to try and keep up on top of Outputable instances in GHC.

What does Hoogle use to parse these types? If it uses GHC API utilities then it shouldn't really be a problem as long as GHC is internally "sane" and both Haddock and Hoogle are compiled using the same GHC version.

Further, what actually caused it is half the problem: current behaviour changes output to Hoogle which means it can't be used, so it's not desired.

I think we really need an official format specification document for the Hoogle database that clearly states what is allowed and what is not. Otherwise we can never be sure that tests are correct and up to date. I don't think it is "healthy" that I defined some new output syntax in this PR and @ndmitchell adapted Hoogle to work with it - it should be the other way around!

@Fuuzetsu
Copy link
Member

Fuuzetsu commented Jan 2, 2017

Hoogle doesn't use ghc api and Neil always adapted to ever changing Haddock output. IIRC when I spoke with him he said any format would be ok so we can pick the output, we just have to establish something stable from our end.

@panhania
Copy link
Contributor Author

panhania commented Jan 3, 2017

Well, maybe Neil is fine with adapting Hoogle to Haddock output but I still think that this is quite "unhealthy" attitude. After all, it is his tool and he should know better what information he actually cares about. And adapting generated output is far easier than making changes in the parser.

I honestly fail to see any disadvantages of having a short, official specification document for the Hoogle database format. Without it if something stops working you have to figure out what is actually broken: is there a bug in the Hoogle parser, in the Haddock output or in the Haddock test suite? And if someone wants to create an alternative to Haddock (for whatever reason) he doesn't need to scavenge in the pull request discussions collecting bits of information about all the quirks in its output format.

@bgamari
Copy link
Collaborator

bgamari commented Jan 4, 2017

For what it's worth I think @mrhania has a point here. There is no harm in specifying the format. Of course, someone needs to step up to do this; @mrhania perhaps you'd be interested in picking this up?

@Fuuzetsu
Copy link
Member

Fuuzetsu commented Jan 9, 2017

I'm not arguing against a format. In fact I'm highly in favour. Main point is that to implement such a thing, we have to stop relying on Outputable, at least directly.

The format itself should be fairly trivial to come up with and specify (Hoogle does not need that much info), the implementation is likely the biggest chunk of work.

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants