Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Interpretation of PROHIBITED for property requirement 🗳️ #206

Open
MatthiasWeise opened this issue Oct 17, 2023 · 18 comments
Open

Interpretation of PROHIBITED for property requirement 🗳️ #206

MatthiasWeise opened this issue Oct 17, 2023 · 18 comments
Labels
schema Issues that impact the schema file

Comments

@MatthiasWeise
Copy link
Contributor

Problem:

PROHIBITED seems to be interpreted as negation, but is missleading from an engineering point of view due to the mandatory datatype settings.

Current interpretation of PROHIBITED:
NOT EXISTS (Pset=="My_Pset" AND Property=="My_Property" AND IfcMeasure=="IFCLABEL")
My expectation as engineer would however be more like this: (if I would like to exclude a property I do not care about the datatype)
NOT EXISTS (Pset=="My_Pset" AND Property=="My_Property")

Proposed solution:

  • make datatype optional (or ignore in PROHIBITED requirements)
  • add logical statements to documentation for interpretation of PROHIBITED, REQUIRED and OPTIONAL
    (for OPTIONAL: IF Pset AND Property THEN Datatype AND Value)

P.S.:
REQUIRED (minOccurs=1) is clear, although I would expect maxOccurs=1 instead of unbounded because the required property should be present once, but not more than once, and asking for a number of properties with a pattern does not make sense to me.

@aothms
Copy link

aothms commented Oct 17, 2023

Another option:

  • add a datatype 'any'

This has the 'advantage' that you remain able to specify:

  • prohibited Pset=="My_Pset" AND Property=="FireRating" AND IfcMeasure=="IFCLABEL"
  • prohibited Pset=="My_Pset" AND Property=="FireRating" AND IfcMeasure=="IFCIDENTIFIER"

but also

  • prohibited Pset=="My_Pset" AND Property=="FireRating" AND IfcMeasure=="any"

I admit the utility of this is limited, just wanted to have the option on the table. Conceptually I think it's better than implicitly ignoring a field. I think an any datatype was discussed earlier.

@MatthiasWeise
Copy link
Contributor Author

What was the reason for making datatype mandatory?

I agree that ignoring a statement for PROHIBITED is not a good idea, but adding a reserved keyword ANY would lead to more complexity. Unless there is a good reason for making datatype mandatory, I would be in favour to make the datatype parameter optional.

In general I would recommend to provide a formal logic for interpretation of the property facet parameters and the min/maxOccurs. My understanding, which however is not really documented, is:

REQUIRED = MUST EXISTS (there should be minOccurs properties where all parameters must evaluate to true)
PROHIBITED = MUST NOT EXISTS (there should be no property where all parameters evaluate to true)
OPTIONAL = CAN EXISTS but with the following condition -> IF Pset AND Property THEN Datatype and Value
Accordingly, OPTIONAL is an exception from pure logic as used in REQUIRED/PROHIBITED because we do not want to allow that datatype and value does not fit to the property.

Regarding check of Datatype and Value:
Given Datatype must match (if Datatype is missing in case of being optional it is always true)
Value must match if given, if Value is missing then any value must exist (or do we allow to be unset?)

Another question: check of Pset and Property names and Value are case sensitive, right?
Check of datatype is most likely case insensitive due to our upper case agreement (similar to given entity names).

@aothms
Copy link

aothms commented Oct 18, 2023

What was the reason for making datatype mandatory?

If I recall the discussion correctly it was something along the lines of: we know this might be a struggle for some, but in the long term is going to help greatly increase information quality. What is the use of asking to deliver a property if you don't specify its range?

Another question: check of Pset and Property names and Value are case sensitive, right?

If I recall the discussion correctly they are case sensitive to facilitate downstream applications not having to do case normalization. Ultimately this is also a question for the IFC spec. We have functions like [0], but I find the EXPRESS lang really vague on string equality (and doesn't even mention things like unicode normalization - at least my version).

https://ifc43-docs.standards.buildingsmart.org/IFC/RELEASE/IFC4x3/HTML/lexical/IfcUniquePropertyName.htm

@MatthiasWeise
Copy link
Contributor Author

If I recall the discussion correctly it was something along the lines of: we know this might be a struggle for some, but in the long term is going to help greatly increase information quality. What is the use of asking to deliver a property if you don't specify its range?

Makes sense for user defined properties, but for all IFC Psets/Properties I do not see a need to double check the datatype. Anyhow, we need an agreement for PROHIBITED and both solutions (ANY and optional datatype) are semantically the same.

@aothms
Copy link

aothms commented Oct 18, 2023

Makes sense for user defined properties, but for all IFC Psets/Properties I do not see a need to double check the datatype

That's indeed the question. Previously the idea has been that an IDS implementation should be as light weight as possible. E.g no inheritance, because it needs schema knowledge. You could make the same argument about properties from the spec: make the IDS as self-contained as possible to make it easy to evaluate. Also there is the possibility that you include multiple schemas, but that the schemas have revised definitions for the same property.

@CBenghi
Copy link
Contributor

CBenghi commented Oct 23, 2023

I don't like the idea of adding a datatype any. I also feel it is semantically the same of leaving the field empty (which is more concise and intuitive).

I would argue that in the audit tool and in documentation, we enforce the need for datatype when the facet is not prohibited.

This would be a schema/documentation/tool change, but we are already changing the schema to be more specific about the capitalization, see #199.

@CBenghi CBenghi added the schema Issues that impact the schema file label Oct 23, 2023
@MatthiasWeise
Copy link
Contributor Author

That's indeed the question. Previously the idea has been that an IDS implementation should be as light weight as possible. E.g no inheritance, because it needs schema knowledge. You could make the same argument about properties from the spec: make the IDS as self-contained as possible to make it easy to evaluate. Also there is the possibility that you include multiple schemas, but that the schemas have revised definitions for the same property.

It seems that we do not trust IFC implementations if we argue that the datatype attribute must be mandatory to ensure data quality. If that is the argument, wouldn't we have to check the property type as well (IfcPropertySingleValue, IfcPropertyBoundedValue etc.)? I am not saying we should do, as it would conflict with making IDS as light weight as possible.

berlotti added a commit that referenced this issue Oct 24, 2023
some quick fixes for #207
still an error on min/maxoccur but will wait for resolution on #206
@CBenghi CBenghi changed the title Interpretation of PROHIBITED (maxOccurs=0) for property requirement Interpretation of PROHIBITED for property requirement Oct 31, 2023
@CBenghi
Copy link
Contributor

CBenghi commented Nov 14, 2023

Long conversation in the call today lead to the following proposal for implementation agreement:

We are adding back the OPTIONAL value for the requirement enumeration.
Datatype will be changed to be optional in the xsd schema.

The evaluation logic becomes more complex, our implementation will have to become a conditional process depending on the scenarios as follows:

REQUIRED

  1. IDS has: PSET/PNAME : A pset pname has to exist (null is accepted as a pass, any value of any datatype is accepted)
  2. IDS HAS PSET/PNAME/DATATYPE: A pset pname has to exist. The IFC value has to be of the required IDS datatype
  3. IDS HAS PSET/PNAME/DATATYPE/VALUE: Like the previous, PLUS: the IFC value needs to comply with the IDS restriction

PROHIBITED

  1. IDS has: PSET/PNAME: We cannot have the pset/pname combination in the IFC file, regardless of any value
  2. IDS HAS PSET/PNAME/DATATYPE: This does not sound like a valid use case -> Prohibited by the tool
  3. IDS HAS PSET/PNAME/DATATYPE/VALUE: This does not sound like a valid use case -> Prohibited by the tool (there are ways to limit the values accepted in the positive by the REQUIRED/OPTIONAL branches)

OPTIONAL

  1. IDS has: PSET/PNAME : This is useless -> Prohibited by the tool
  2. IDS HAS PSET/PNAME/DATATYPE: We might not have the property at all -> pass. If we have the property: the value has got to be non null and of correct data type
  3. IDS HAS PSET/PNAME/DATATYPE/VALUE: We might not have the property at all -> pass. If we have the property: the value has got to be non null and of correct data type. The IFC value must comply with the IDS VALUE restriction

🗳️ This will have to be evaluated this week by all parties and it is up for vote on the meeting of the 21st of Novermber.

🛠️ We still have to evaluate if the optional cardinality makes sense for other facets. Please consider and make your case.

@CBenghi CBenghi changed the title Interpretation of PROHIBITED for property requirement Interpretation of PROHIBITED for property requirement 🗳️ Nov 14, 2023
@andyward
Copy link
Contributor

Just ahead of your call a couple of questions:

1. Valid/invalid use cases

PROHIBITED

  1. IDS HAS PSET/PNAME/DATATYPE/VALUE: This does not sound like a valid use case -> Prohibited by the tool (there are ways to limit the values accepted in the positive by the REQUIRED/OPTIONAL branches)

Why do we say the use case is not valid? While no doubt it's generally easy to restate a requirement in the positive form, why would we limit the expressiveness especially where the user's requirement may be expressed negatively - i.e. any requirement that is SHALL NOT / MUST NOT. Enums are a good example of where it's trickier to negate. (i.e. "I don't want IFCELECTRICAPPLIANCEs that have a PSet_AssetMaintenance.AssetFrailty property of HIGH" - where otherwise I would need to account for all the other Enum options in the positive case.)

It sounds like we're proposing authors re-write the requirement as a Prohibited Applicability statement (with presumably no requirement)? Suddenly if I wanted to invert the logic it's not a matter of a switch?

2. UX in IDS authoring tools

Presumably we want IDS authors to fall into the 'pit of success', and not create specs that are invalid or supported inconsistently by validators.

Would we then need to think about the UX and usability of any authoring tool? If Required/Optional/Prohibited is visually represented in an interface as some kind of tri-state switch/toggle/selection, now it may need to account for which of the property facet values are populated?

This table shows valid/invalid permutations:

Fields Entered Required Optional Prohibited
PSET / NAME
PSET / NAME / DATATYPE
PSET / NAME / DATATYPE / VALUE

But we also need to account for the sequencing of data entry may create 'invalid' facets. Eg. Consider this set of steps authoring a Property requirement:

  • I enter a PSET/PNAME, Optional is now technically not valid (or at least useless) - should we disable that option to leave just Required and Prohibited?
  • If I change the facet to be Prohibited with PSET/PNAME entered - should we now prevent a DATATYPE or VALUE being entered as that would also created an invalid combination?

This feels unnecessarily complicated for a user (not to mention to implement)? And the alternative is an error/warning from the audit tool. For me I would just relax PROHIBITED 2 & 3, perhaps throw a warning at audit, and leave it to the validators to implement. Ultimately it's just Boolean logic that we're artificially constraining...

3. Are we being overly-prescriptive about what is a required facet field

... especially on Properties where there's 4+ fields

We seem to be coming to the conclusion that DATATYPE and VALUE are optional - in as much as they can be left empty - which makes sense for cases where it's not relevant, or the user doesn't currently know/care. Do we also need to consider PSET being optional? "All doors must have a FireRating (...and I don't care which Pset)". I raised this in #189, but I think it would have a bearing on prohibited facets too.

While BSDD and data dictionaries could help here I'm conscious by putting too many barriers on authoring requirements we're making it difficult to express requirements, or encouraging workarounds. As I understood the principles, we aimed to make it simple to author IDS and to leave the complexity with the implementors.

@andyward
Copy link
Contributor

My thoughts on the validity of optionality on the other facet types

✅ = valid
❌ = "useless" - possible audit warning

Entities

Fields Entered Required Optional Prohibited
NAME
NAME / [PREDEFINEDTYPE]

Optional Entity requirements have no meaning? An Applicable Element is a IFCDOOR or it is not.

Attribute

Fields Entered Required Optional Prohibited
NAME ✅ [1a] ✅ [1b]
NAME / [VALUE]

[1a] The Attribute should be populated
[1b] The Attribute should not be populated

Optional: The IFC Schema is fixed. Optionality makes no sense?

Properties

Fields Entered Required Optional Prohibited
PSET / NAME
PSET / NAME / [DATATYPE] ❌ [2]
PSET / NAME / [DATATYPE] / [VALUE] ❌ [2]

[2] based on current proposal. My feeling is these should be ✅. e.g. The DATATYPE must not be IFCTEXT, The VALUE must not be HIGH.

Classification

Fields Entered Required Optional Prohibited
[SYSTEM]
[VALUE]
[SYSTEM] / [VALUE]

Optional = If the applicable element has classifications at least one should match the value/system

Material

Fields Entered Required Optional Prohibited
[VALUE]

Optional = If the applicable element has materials at least one should match the value

PartOf

Fields Entered Required Optional Prohibited
ENTITY
ENTITY / [RELATION]

Optional = If the element has a matching relationship the target should match the Entity

@MatthiasWeise
Copy link
Contributor Author

Why do we say the use case is not valid?

The problem with PROHIBITED and PSET/PNAME/DATATYPE/VALUE is, that you most likely do not want to negate the datatype. The discussion was conducted under the aspect of meaningfulness and user expectations. We may need to discuss the option PSET/PNAME/VALUE, but that was the reason to exclude that option.

@MatthiasWeise
Copy link
Contributor Author

Suddenly if I wanted to invert the logic it's not a matter of a switch?

In the meeting we discussed the expectation of a user to invert requirements. Let's say you define a property with its datatype and a list of allowed values. What is the expectation of the user when changing from REQUIRED to PROHIBITED?
Expectation from an engineering point of view is most likely that the property should not be included no matter what the datatype of value is. So, logical negation of PSET/PNAME/DATATYPE/VALUE is maybe not want a normal user would expect.

@andyward
Copy link
Contributor

The problem with PROHIBITED and PSET/PNAME/DATATYPE/VALUE is, that you most likely do not want to negate the datatype.

I get what you're saying now. DataType is slightly different to the other fields - even in then way we model it in the schema. It's not a full constraint in the way that the other components are - it's just a literal value. I'm thinking it's part of the VALUE in concept (like a hypothetical Unit might be e.g. on a IfcPropertySingleValue). If there's no Value there's no Datatype. So our mental model of the clause should probably be PSET/PNAME/VALUE/DATATYPE (with VALUE and DATATYPE switched). With DataType being subordinate to the Value rather than a distinct concept, negation perhaps makes more sense. (Note the Value can still be optional)
E.g. Pseudo-code

meetsRequirement = (Matches(element, PSET) AND Matches(element, PNAME) AND ValueMatches(element, VALUE, DATATYPE))

// where ValueMatches() checks the Value constraint is met and the DataType matches

isProhibited =NOT(meetsRequirement)

My assumption here is that dataType's value is optional and can be blank.

I can see we may want some kind of warning to the user, as negations are always a source of confusion. But feels a mistake to be saying this is not supported.

@CBenghi
Copy link
Contributor

CBenghi commented Nov 21, 2023

With regards to the exclusion of prohibited configurations, we evaluated the following PROHIBITED scenario

  • DATATYPE "IFCTECXT"
  • VALUE = "IDSProhibitedVALUE"

We assumed that we needed DATATYPE for parsing the value (e.g. measures), and it must match even in prohibited facets.

The following cases emerged:

  • VALUE IS NULL -> if the datatype is part of the expected constraints, then this fails. because we can't assess the datatype POSSIBLE SOURCE OF CONFUSION FOR END USERS

  • PROP IS IFCTEXT (this is easy and understood)

    • VALUE is matched: IFCTEXT("IDSProhibitedVALUE") -> FAIL
    • VALUE is not matched: IFCTEXT("Foo") -> PASS
  • PROP IS NOT IFCTEXT, but compatible (E.g. it is IFCLABEL) -> WE FAIL AT THIS ENTIRE LEVEL, because we assume that the datatype is part of the expected data schema

    • VALUE is matched: IFCLABEL("IDSProhibitedVALUE") -> FAIL
    • VALUE is not matched: IFCLABEL("Foo") -> FAIL POSSIBLE SOURCE OF CONFUSION FOR END USERS
  • PROP IS NOT COMPATIBLE IFCTEXT (E.g. it is IFCINTEGER) -> WE FAIL AT THIS ENTIRE LEVEL, because we assume that the datatype is part of the expected data schema

    • VALUE IS IDSVALUE -> FAIL
    • VALUE IS NOT IDSVALUE -> FAIL

CONCLUSION: To avoid the possible sources of confusion listed above for version 1.0 we decided to keep the constraints previously proposed.

@CBenghi
Copy link
Contributor

CBenghi commented Nov 21, 2023

As a clarification to the valid configurations for property facet we also state that VALUE without datatype is not valid, and we are now evaluating the behaviour of the possible clash of xs:basetype of the value restriction and the datatype attribute of the property.

Fields Entered Required Optional Prohibited base type notes
PSET / NAME No base type because value is missing
PSET / NAME / DATATYPE No base type because value is missing
PSET / NAME / DATATYPE / VALUE we need to write a basetype for xsd compatibility,
PSET / NAME / VALUE No base type because this is never allowed

The clash could be avoided with our own implementation of xs:restriction that had an optional basetype, but we will avoid this if possible.
TODO: 🛠️ The preferred solution is to create a map of xml basetypes to match the IFC datatypes

@CBenghi
Copy link
Contributor

CBenghi commented Nov 21, 2023

⚠️ All considerations in the call when it comes to PSET/PNAME were discussed with clear matches to a single IFC entry, perhaps we need to think again if this need to be revised including patterns.

@andyward
Copy link
Contributor

Not sure I understand why we'd require a DATATYPE - even when VALUEs are required. In the majority of the use cases I see the user won't know or care if a (property's text) value is IFCTEXT or IFCLABEL. They just want to check the value is valid. I get there's a engineering view that data types are important, but it's asking a lot for a client to decide if they want their Thermal u-values in IfcThermalTransmittanceMeasure or IfcReal - when both are fundamentally REAL under the covers. And forcing them to make that call is often going to raise, what will be for them, some false negatives.

If the main justification for this decision is this statement:

We assumed that we needed DATATYPE for parsing the value (e.g. measures), and it must match even in prohibited facets.

... then I'd challenge that. Surely implementors don't need to know the type of an IfcValue before they can read it to validate?

I keep coming back to this point:

3. Are we being overly-prescriptive about what is a required facet field

Slightly concerned we're making it more difficult than need be to author IDS requirements - in the goal of delivering perfect IFC. There's a whole set of users where 'good enough' is a big improvement over what they currently get.

@CBenghi
Copy link
Contributor

CBenghi commented Feb 2, 2024

Made the datatype optional in the schema presented on 2024-02-02

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
schema Issues that impact the schema file
Projects
None yet
Development

No branches or pull requests

4 participants