Collection.from_dict fails with type error if spatial extent is explicitly null #1551
I'd almost go as far as to say that we should deserialize with a warning and a default extent ... Postel's law, eh?
I don’t think we can set a default extent - I think it needs to be null - though maybe to get around this immediately we make the extent a sentinel value, like a point at null island?

I think the larger issue is that we need two different models - a lax one for reading and a strict one for writing. Currently, we have the pystac model and the pydantic model, and both are somewhere in the middle. With the recent performance improvements in pydantic, we may want to consider having both the strict and lax models in pydantic, and using them for the next major pystac release?
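To make that concrete, here is a minimal sketch of what a strict/lax split could look like with pydantic v2. The model names and validator are hypothetical; nothing like this exists in pystac or stac-pydantic today.

```python
from pydantic import BaseModel, field_validator


class StrictSpatialExtent(BaseModel):
    """Writing model: reject anything that is not spec-compliant."""

    bbox: list[list[float]]


class LaxSpatialExtent(BaseModel):
    """Reading model: tolerate an explicitly null spatial extent by falling
    back to a sentinel (here, global bounds) so deserialization never blows up."""

    bbox: list[list[float]]

    @field_validator("bbox", mode="before")
    @classmethod
    def _default_when_null(cls, value):
        # Substitute global bounds when the incoming value is explicitly null.
        return [[-180.0, -90.0, 180.0, 90.0]] if value is None else value
```

With something like this, `LaxSpatialExtent.model_validate({"bbox": None})` would succeed with global bounds, while the strict model would raise a `ValidationError`; reads could go through the lax model and writes through the strict one.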
Wouldn't global bounds (-180, -90, 180, 90) be a reasonable "default", with a warning?
I think that's one approach ... the other is to just have asymmetric serialization and deserialization, where deserialization is much more permissive. This relates to #1092 ... to me (other opinions are of course available) pystac gets more value out of not being tied to pydantic, but rather by hand-crafting that permissive deserialization to match real-world patterns/data.
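As a rough illustration of that hand-crafted permissive path (a hypothetical helper, not pystac's actual implementation), deserialization could warn and fall back to global bounds when the spatial extent is null:

```python
import warnings

GLOBAL_BBOX = [[-180.0, -90.0, 180.0, 90.0]]


def spatial_bbox_from_extent_dict(extent: dict) -> list:
    """Pull the spatial bbox out of a Collection 'extent' dict, tolerating an
    explicitly null or missing spatial extent instead of raising a TypeError."""
    spatial = (extent or {}).get("spatial")
    if not spatial or not spatial.get("bbox"):
        warnings.warn(
            "Collection has a null spatial extent; defaulting to global bounds",
            UserWarning,
        )
        return GLOBAL_BBOX
    return spatial["bbox"]
```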
If it's going to have a value, then a global extent sounds reasonable. But maybe changing this to allow null is better, if the ser/deser are asymmetric? I think the underlying problem is that we shouldn't be using pystac for construction of STAC Items from other metadata; there needs to be another package for that. Fewer, more "advanced" users are likely to use it, so it doesn't have to be as easy to work with (easy != simple). The types and validation should be such that you can only construct a valid STAC Item. Using pydantic and tightening up the validation is one way; the Builder pattern is another; Pyrsistent with evolvers is another.
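Purely for illustration, a builder along those lines might look like the sketch below. No such class exists in pystac, and the validation rules are just examples of the "only valid Items can be constructed" idea.

```python
from datetime import datetime


class ItemBuilder:
    """Illustrative builder: refuses to emit an Item dict until the required,
    validated pieces have been supplied."""

    def __init__(self, item_id: str):
        self._id = item_id
        self._geometry = None
        self._bbox = None
        self._datetime = None

    def with_geometry(self, geometry: dict, bbox: list) -> "ItemBuilder":
        # Reject coordinates outside the 180/90 bounds up front.
        lons, lats = bbox[0::2], bbox[1::2]
        if not all(-180 <= x <= 180 for x in lons) or not all(-90 <= y <= 90 for y in lats):
            raise ValueError("bbox coordinates out of range")
        self._geometry, self._bbox = geometry, bbox
        return self

    def with_datetime(self, dt: datetime) -> "ItemBuilder":
        self._datetime = dt
        return self

    def build(self) -> dict:
        if self._geometry is None or self._datetime is None:
            raise ValueError("geometry and datetime are required")
        return {
            "type": "Feature",
            "stac_version": "1.0.0",
            "id": self._id,
            "geometry": self._geometry,
            "bbox": self._bbox,
            # Assumes a timezone-aware datetime for a spec-valid timestamp.
            "properties": {"datetime": self._datetime.isoformat()},
            "links": [],
            "assets": {},
        }
```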
🤔 why not? To me, pystac seems like a great place to store all of the weird heuristics that you might need to do to create valid STAC from messy real-world data.
I think that's the idea behind "permissive" deserialization ... even if you give it stuff that would be invalid STAC, pystac will make it valid for you.
The main thing I had in mind with that comment is that you can currently construct invalid Items with pystac; a stricter model would force the user to sort out the messiness before constructing the Item. Maybe a combination of validate (with #1552) and other semantic validation that can't be expressed well in schema (e.g., no antimeridian crossing, all bbox and geometry points within 180/90 bounds) would improve this situation. I don't like the idea that you could provide invalid values and pystac will make them valid, for the same reason that I don't like languages like JavaScript that have such weak typing: it allows the user to be sloppy and get some answer, which may not be the one they actually intended. Maybe that permissive behavior belongs in another tool, but I think pystac should be explicit and do what you tell it to.
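A sketch of the kind of semantic checks that are hard to express in JSON Schema, using a hypothetical helper that encodes the stricter rules suggested above:

```python
def bbox_problems(bbox: list) -> list:
    """Return human-readable problems with a 2D bbox, per the stricter rules above."""
    problems = []
    west, south, east, north = bbox[:4]
    if not (-180 <= west <= 180 and -180 <= east <= 180):
        problems.append("longitude outside [-180, 180]")
    if not (-90 <= south <= 90 and -90 <= north <= 90):
        problems.append("latitude outside [-90, 90]")
    if west > east:
        problems.append("bbox crosses the antimeridian (west > east)")
    if south > north:
        problems.append("south is greater than north")
    return problems
```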
I take your point ... this is some built-in messiness with pystac where we're trying to please three different user groups (data producers, "developers", and data consumers). If we were to split things up, I'd want to target those groups specifically, rather than starting w/ technologies. Right now, my thinking goes like:
This code (being used for the pystac-client Quickstart):
This fails because the spatial extent of the collection is explicitly null. The collection advertises stac_version 1.0.0. A null extent is invalid with respect to the spec, but the tooling should be more permissive in what it will process.
The extent is:
Exception is:
I think it would be reasonable (though not desired) to fail to deserialize invalid items, but with an explicit error as to why it could not be deserialized. In this case, it's just an accidental TypeError from trying to subscript None.
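For reference, a minimal sketch of the failure mode (the actual Quickstart snippet, extent JSON, and traceback aren't reproduced here; the collection dict below is an assumed stand-in):

```python
import pystac

collection_dict = {
    "type": "Collection",
    "stac_version": "1.0.0",
    "id": "example-collection",
    "description": "Collection whose spatial extent is explicitly null",
    "license": "proprietary",
    "extent": {
        "spatial": None,  # invalid per the spec, but seen in the wild
        "temporal": {"interval": [[None, None]]},
    },
    "links": [],
}

# Raises a TypeError from subscripting None rather than a descriptive STAC error.
collection = pystac.Collection.from_dict(collection_dict)
```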