Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[#1395] Marshalling Set and HashSet #1405

Merged
merged 11 commits into from
Oct 10, 2019
1 change: 1 addition & 0 deletions dhall/dhall.cabal
Original file line number Diff line number Diff line change
Expand Up @@ -407,6 +407,7 @@ Library
exceptions >= 0.8.3 && < 0.11,
filepath >= 1.4 && < 1.5 ,
haskeline >= 0.7.2.1 && < 0.8 ,
hashable >= 1.2 && < 1.3 ,
lens-family-core >= 1.0.0 && < 2.1 ,
megaparsec >= 6.5.0 && < 7.1 ,
memory >= 0.14 && < 0.16,
Expand Down
167 changes: 160 additions & 7 deletions dhall/src/Dhall.hs
Original file line number Diff line number Diff line change
Expand Up @@ -71,6 +71,10 @@ module Dhall
, sequence
, list
, vector
, setFromDistinctList
, setIgnoringDuplicates
, hashSetFromDistinctList
, hashSetIgnoringDuplicates
, Dhall.map
, pairFromMapEntry
, unit
Expand Down Expand Up @@ -119,6 +123,7 @@ import Data.Either.Validation (Validation(..), ealt, eitherToValidation, validat
import Data.Fix (Fix(..))
import Data.Functor.Contravariant (Contravariant(..), (>$<), Op(..))
import Data.Functor.Contravariant.Divisible (Divisible(..), divided)
import Data.Hashable (Hashable)
import Data.List.NonEmpty (NonEmpty (..))
import Data.Map (Map)
import Data.Monoid ((<>))
Expand All @@ -136,7 +141,7 @@ import Dhall.Import (Imported(..))
import Dhall.Parser (Src(..))
import Dhall.TypeCheck (DetailedTypeError(..), TypeError, X)
import GHC.Generics
import Lens.Family (LensLike', set, view)
import Lens.Family (LensLike', view)
import Numeric.Natural (Natural)
import Prelude hiding (maybe, sequence)
import System.FilePath (takeDirectory)
Expand All @@ -149,11 +154,13 @@ import qualified Data.Functor.Compose
import qualified Data.Functor.Product
import qualified Data.Map
import qualified Data.Maybe
import qualified Data.List
import qualified Data.List.NonEmpty
import qualified Data.Semigroup
import qualified Data.Scientific
import qualified Data.Sequence
import qualified Data.Set
import qualified Data.HashSet
import qualified Data.Text
import qualified Data.Text.IO
import qualified Data.Text.Lazy
Expand All @@ -167,10 +174,12 @@ import qualified Dhall.Parser
import qualified Dhall.Pretty.Internal
import qualified Dhall.TypeCheck
import qualified Dhall.Util
import qualified Lens.Family

-- $setup
-- >>> :set -XOverloadedStrings
-- >>> :set -XRecordWildCards
-- >>> import Dhall.Pretty.Internal (prettyExpr)

type Extractor s a = Validation (ExtractErrors s a)
type MonadicExtractor s a = Either (ExtractErrors s a)
Expand Down Expand Up @@ -405,8 +414,8 @@ inputWithSettings settings (Type {..}) txt = do
let EvaluateSettings {..} = _evaluateSettings

let transform =
set Dhall.Import.normalizer _normalizer
. set Dhall.Import.startingContext _startingContext
Lens.Family.set Dhall.Import.normalizer _normalizer
. Lens.Family.set Dhall.Import.startingContext _startingContext

let status = transform (Dhall.Import.emptyStatus _rootDirectory)

Expand Down Expand Up @@ -499,8 +508,8 @@ inputExprWithSettings settings txt = do
let EvaluateSettings {..} = _evaluateSettings

let transform =
set Dhall.Import.normalizer _normalizer
. set Dhall.Import.startingContext _startingContext
Lens.Family.set Dhall.Import.normalizer _normalizer
. Lens.Family.set Dhall.Import.startingContext _startingContext

let status = transform (Dhall.Import.emptyStatus _rootDirectory)

Expand Down Expand Up @@ -791,6 +800,126 @@ list = fmap Data.Foldable.toList . sequence
vector :: Type a -> Type (Vector a)
vector = fmap Data.Vector.fromList . list

{-| Decode a `Set` from a `List`

>>> input (setIgnoringDuplicates natural) "[1, 2, 3]"
fromList [1,2,3]

Duplicate elements are ignored.

>>> input (setIgnoringDuplicates natural) "[1, 1, 3]"
fromList [1,3]

-}
setIgnoringDuplicates :: (Ord a) => Type a -> Type (Data.Set.Set a)
setIgnoringDuplicates = fmap Data.Set.fromList . list

{-| Decode a `HashSet` from a `List`

>>> input (hashSetIgnoringDuplicates natural) "[1, 2, 3]"
fromList [1,2,3]

Duplicate elements are ignored.

>>> input (hashSetIgnoringDuplicates natural) "[1, 1, 3]"
fromList [1,3]

-}
hashSetIgnoringDuplicates :: (Hashable a, Ord a)
=> Type a
-> Type (Data.HashSet.HashSet a)
hashSetIgnoringDuplicates = fmap Data.HashSet.fromList . list

{-| Decode a `Set` from a `List` with distinct elements

>>> input (setFromDistinctList natural) "[1, 2, 3]"
fromList [1,2,3]

An error is thrown if the list contains duplicates.

> >>> input (setFromDistinctList natural) "[1, 1, 3]"
> *** Exception: Error: Failed extraction
>
> The expression type-checked successfully but the transformation to the target
> type failed with the following error:
>
> One duplicate element in the list: 1
>

> >>> input (setFromDistinctList natural) "[1, 1, 3, 3]"
> *** Exception: Error: Failed extraction
>
> The expression type-checked successfully but the transformation to the target
> type failed with the following error:
>
> 2 duplicates were found in the list, including 1
>

-}
setFromDistinctList :: (Ord a, Show a) => Type a -> Type (Data.Set.Set a)
setFromDistinctList = setHelper Data.Set.size Data.Set.fromList

{-| Decode a `HashSet` from a `List` with distinct elements

>>> input (hashSetFromDistinctList natural) "[1, 2, 3]"
fromList [1,2,3]

An error is thrown if the list contains duplicates.

> >>> input (hashSetFromDistinctList natural) "[1, 1, 3]"
> *** Exception: Error: Failed extraction
>
> The expression type-checked successfully but the transformation to the target
> type failed with the following error:
>
> One duplicate element in the list: 1
>

> >>> input (hashSetFromDistinctList natural) "[1, 1, 3, 3]"
> *** Exception: Error: Failed extraction
>
> The expression type-checked successfully but the transformation to the target
> type failed with the following error:
>
> 2 duplicates were found in the list, including 1
>

-}
hashSetFromDistinctList :: (Hashable a, Ord a, Show a)
=> Type a
-> Type (Data.HashSet.HashSet a)
hashSetFromDistinctList = setHelper Data.HashSet.size Data.HashSet.fromList


setHelper :: (Eq a, Foldable t, Show a)
=> (t a -> Int)
-> ([a] -> t a)
-> Type a
-> Type (t a)
setHelper size toSet (Type extractIn expectedIn) = Type extractOut expectedOut
where
extractOut (ListLit _ es) = case traverse extractIn es of
Success vSeq
| sameSize -> Success vSet
| length duplicates == 1 -> extractError err1
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think something like this would make it cheaper to distinguish Success and Failure:

Suggested change
| length duplicates == 1 -> extractError err1
| otherwise -> extractError err
where err | length duplicates == 1

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why would it be cheaper?
I don't mind, but err1 and errN take up a few lines, I think the structure code may be harder to read...

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In the current situation it seems that we need to compute length duplicates to get the Failure constructor. I believe that if we move the length duplicates check into the computation of the error message, the caller can determine the constructor without computing length duplicates.

Maybe GHC is smart enough, but it's better not to rely on it too much.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But Failure is only accessed if it's not a Success. If it's not a success, it doesn't go in the conditions.

f x = case x of
    Just x | error "whoops" -> 1
    Nothing -> 0

ghci> f Nothing
0

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not quite sure what you mean. extractError results in Failure too.

I'm assuming usage like

case toMonadic (extract (setFromDistinctList elementType) expr) of
  Right s -> …
  Left _ -> … -- handle failure but ignore the error message

I'm arguing that even if expr contains duplicates, it shouldn't be necessary to force length duplicates, but the current code looks like it does.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK, I understand what you mean now, users should be able to pattern match on Failure without extra computation. That's a good point and sorry for being so dense ^^

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No worries. I wasn't very clear!

| otherwise -> extractError errN
where
vList = Data.Foldable.toList vSeq
vSet = toSet vList
sameSize = size vSet == Data.Sequence.length vSeq
duplicates = vList Data.List.\\ Data.Foldable.toList vSet
err1 = "One duplicate element in the list: "
<> (Data.Text.pack $ show $ head duplicates)
errN = Data.Text.pack $ unwords
[ show $ length duplicates
, "duplicates were found in the list, including"
, show $ head duplicates
]
Failure f -> Failure f
extractOut expr = typeError expectedOut expr

expectedOut = App List expectedIn

{-| Decode a `Map` from a @toMap@ expression or generally a @Prelude.Map.Type@

>>> input (Dhall.map strictText bool) "toMap { a = True, b = False }"
Expand Down Expand Up @@ -935,6 +1064,12 @@ instance Interpret a => Interpret [a] where
instance Interpret a => Interpret (Vector a) where
autoWith opts = vector (autoWith opts)

instance (Interpret a, Ord a, Show a) => Interpret (Data.Set.Set a) where
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you document how this instance handles duplicates? Maybe just reference setFromDistinctList.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There can't be any duplicates here because the source is a Set which turns into a Dhall List.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I take that as an indication that we should find better names for Interpret and Inject. ;)

(Take a look at the definition on the line below)

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh my God, don't I feel silly!
Thank you for taking some of the blame away from me :)

autoWith opts = setFromDistinctList (autoWith opts)

instance (Interpret a, Hashable a, Ord a, Show a) => Interpret (Data.HashSet.HashSet a) where
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same here.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same here.

autoWith opts = hashSetFromDistinctList (autoWith opts)

instance (Ord k, Interpret k, Interpret v) => Interpret (Map k v) where
autoWith opts = Dhall.map (autoWith opts) (autoWith opts)

Expand Down Expand Up @@ -1511,7 +1646,7 @@ class Inject a where

{-| Use the default options for injecting a value

> inject = inject defaultInterpretOptions
> inject = injectWith defaultInterpretOptions
-}
inject :: Inject a => InputType a
inject = injectWith defaultInterpretOptions
Expand Down Expand Up @@ -1656,8 +1791,26 @@ instance Inject a => Inject [a] where
instance Inject a => Inject (Vector a) where
injectWith = fmap (contramap Data.Vector.toList) injectWith

{-| Note that the ouput list will be sorted

>>> let x = Data.Set.fromList ["mom", "hi" :: Text]
>>> prettyExpr $ embed inject x
[ "hi", "mom" ]

-}

jiegillet marked this conversation as resolved.
Show resolved Hide resolved
instance Inject a => Inject (Data.Set.Set a) where
injectWith = fmap (contramap Data.Set.toList) injectWith
injectWith = fmap (contramap Data.Set.toAscList) injectWith

{-| Note that the ouput list may not be sorted

>>> let x = Data.HashSet.fromList ["hi", "mom" :: Text]
>>> prettyExpr $ embed inject x
[ "mom", "hi" ]

-}
instance Inject a => Inject (Data.HashSet.HashSet a) where
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder whether we should sort the elements… 🤔

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In math, sets have no notion of ordering so I would tend towards not sorting it...

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm having second thoughts, maybe we should sort it... Extra opinion @Gabriel439?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On second thought, this is something that we could change later on. For now, maybe just point out that there's no sorting, and add a doctest that demonstrates it.

Copy link
Collaborator Author

@jiegillet jiegillet Oct 8, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK, I've done one for HashSet (not pushed yet), but what about Set? Data.Set. toList and Data.Set.toAscList are both O(n), I might as well use toAscList right?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think Data.Set.toList is defined via toAscList. toAscList is more explicit though. 👍

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You are correct. Good.

injectWith = fmap (contramap Data.HashSet.toList) injectWith

instance (Inject a, Inject b) => Inject (a, b)

Expand Down