\documentclass[11pt,a4paper]{ivoa}
\input tthdefs
\input gitmeta
\title{Observation Data Model Core Components and its Implementation in the Table Access Protocol}
% see ivoatexDoc for what group names to use here; use \ivoagroup[IG] for
% interest groups.
\ivoagroup{DM}
\author{Mireille Louys}
\author{Doug Tody}
\author[http://www.ivoa.net/twiki/bin/view/IVOA/PatrickDowler]{Patrick Dowler}
\author{Daniel Durand}
\author[http://www.ivoa.net/twiki/bin/view/IVOA/LaurentMichel]
{Laurent Michel}
\author[http://www.ivoa.net/twiki/bin/view/IVOA/FrancoisBonnarel]
{Fran\c{c}ois Bonnarel}
\author{Alberto Micol}
\editor{????Alfred Usher Thor????}
% \previousversion[????URL????]{????Concise Document Label????}
\previousversion[http://www.ivoa.net/Documents/ObsCore/20161004/PR-ObsCore-v1.1-20161004.pdf]{PR-ObsCore-v1.1-20161004.pdf}
\previousversion[https://www.ivoa.net/documents/ObsCore/20111028/REC-ObsCore-v1.0-20111028.pdf]{REC-ObsCore-v1.0-20111028.pdf}
\usepackage{longtable}
\begin{document}
\begin{abstract}
This document defines the core components of the Observation data model that are necessary to perform data discovery
when querying data centers for astronomical observations of interest. It exposes use-cases to be carried out, explains
the model and provides guidelines for its implementation as a data access service based on the Table Access Protocol
(TAP). It aims to provide a simple model that is easy to understand and to implement for data providers who wish to publish
their data in the Virtual Observatory. This interface integrates data modeling and data access aspects in a single
service and is named ObsTAP. It will be referenced as such in the IVOA registries. In this document, the Observation
Data Model Core Components (ObsCoreDM) defines the core components of queryable metadata required for global discovery
of observational data. It is meant to allow a single query to be posed to TAP services at multiple sites to perform
global data discovery without having to understand the details of the services present at each site. It defines a
minimal set of basic metadata and thus allows for a reasonable cost of implementation by data providers. The
combination of the ObsCoreDM with TAP is referred to as an ObsTAP service. As with most of the VO Data Models,
ObsCoreDM makes use of STC, Utypes, Units and UCDs. The ObsCoreDM can be serialized as a VOTable. ObsCoreDM can make
reference to more complete data models such as Characterisation DM, Spectrum DM or Simple Spectral Line Data Model
(SSLDM).
ObsCore shares a large set of common concepts with the Dataset Metadata Data Model \cite{CITATIONCre16l1036}, which binds
together most of the data model concepts from the above models in a comprehensive and more general framework.
The current specification, in contrast, provides guidelines for implementing these concepts using the TAP protocol
and answering ADQL queries. It is dedicated to global discovery.
\end{abstract}
\section*{Acknowledgments}
We acknowledge support from the Astronomy ESFRI and Research Infrastructure Cluster -- ASTERICS project, funded by the
European Commission under the Horizon 2020 Programme (GA 653477) and former Euro-VO ICE and CoSADiE European projects.
SSC XMM Catalog service supported the implementation of the SAADA version of ObsTAP at Strasbourg Observatory as well
as the TapHandle application. The US-VAO project contributed to developing this specification and prototyping the use
of ObsTAP in the VAO portal. The CANFAR project also contributed to the reference implementation of ObsTAP at CADC,
Victoria, which serves a large and diverse set of data collections.
\section*{Conformance-related definitions}
The words ``MUST'', ``SHALL'', ``SHOULD'', ``MAY'', ``RECOMMENDED'', and
``OPTIONAL'' (in upper or lower case) used in this document are to be
interpreted as described in IETF standard RFC2119 \citep{std:RFC2119}.
The \emph{Virtual Observatory (VO)} is a
general term for a collection of federated resources that can be used
to conduct astronomical research, education, and outreach.
The \href{https://www.ivoa.net}{International
Virtual Observatory Alliance (IVOA)} is a global
collaboration of separately funded projects to develop standards and
infrastructure that enable VO applications.
\section{Introduction}
The first version of this model, ObsCore 1.0, originates from an initiative of the IVOA Take Up Committee that, in the
course of 2009, collected a number of use cases for data discovery (see Appendix A). These use cases address the
problem of an astronomer posing a world-wide query for scientific data with certain characteristics and eventually
retrieving or otherwise accessing selected data products thus discovered. The ability to pose a single scientific
query to multiple archives simultaneously is a fundamental use case for the Virtual Observatory. Providing a simple
standard protocol such as the one described in this document increases the chances that a majority of the data
providers in astronomy will be able to implement the protocol, thus allowing data discovery for almost all archived
astronomical observations.
Version 1.0 and Version 1.1 of ObsCore are focused on public data. However, optional fields such as obs\_release\_date and
data\_rights are provided to also support proprietary data.
The ObsCore data model is focused on describing the core metadata common to most data products distributed for
astronomical observations. It is the common basis that helps to search and discover datasets across various
VO-compatible archives via a customized TAP protocol: ObsTAP. ObsCore also provides the core data model for discovery and
description of specific types of astronomical data (e.g., images and spectra) via the ``typed'' VO data access
protocols. These type-specific protocols may extend ObsCore to more fully describe specific types of data, but the
intent is that all VO data access protocols share the same core description of the data.
To take into account pixelated data such as images, data cubes, and time series, this version makes
explicit the nature and length of the dataset axes as defined in the Characterisation data model
\cite{CITATIONIVO07l1036}. This covers the requirements for axis lengths (as a number of bins) expressed in the
use cases added to Appendix A: section A.3 for data cubes, A.4 for time series, and A.5 for event lists. In addition, it
corrects a few errors in the description of data model items found in version 1.0.
Consistency with the IVOA NDCube data model which represents N-Dimensional datasets has been improved. Therefore the
main data model component of ObsCore DM, which focuses on a data product, is renamed ``ObsDataset'' as in `NDCube' and
`IVOA DataSet Metadata' models, instead of `Observation' as named previously.
This data model does not expose the mapping of data axes to physical coordinate systems, as available for instance in
FITS WCS keywords. Such information belongs to the scope of the `NDCube' and `STCv2' data models and will be used in
future versions of DAL protocols.
The following subsections describe the fundamental building blocks used to achieve the goal of global data
discoverability and accessibility.
\subsection{First building block: Data Models}
Modeling of observational metadata has been an important activity of the IVOA since its creation in 2002. This modeling
effort has already resulted in a number of integrated and approved IVOA standards such as the Resource Metadata, Space
Time Coordinates (STC), Spectrum and SSA, and the Characterisation data models that are currently used in IVOA services
and applications.
%\includegraphics[width=15.923cm,height=11.942cm]{ObsCore-img002.jpg}
Figure 1. Architecture of an ObsTAP service: it is based on the ObsCore data model,
which re-uses parts of Characterisation, Spectrum, STC data models and the UCD and Units specifications. As a service
ObsTAP relies on ADQL, TAP, UWS, TAPRegExt, VOSI and VOTable. Examples and use-cases are exposed following the
recommendation for DALI examples.
\subsection{Second building block: the Table Access Protocol (TAP)}
TAP defines a service protocol for accessing tabular data such as astronomical catalogs, or more generally, database
tables. TAP allows a client to (step 1) browse through the various tables and columns (names, units, etc.) in an
archive to collect the information necessary to pose a query, then (step 2) actually perform a table query. The Table
Access Protocol (TAP) specification was developed and reached recommendation status in March 2010
\cite{CITATIONTAPl1036}.
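The two steps described above can be illustrated with a minimal Python sketch. It is not a TAP client: an in-memory
SQLite database stands in for the remote service, the SQLite table-info pragma stands in for the metadata that a TAP
service publishes in TAP\_SCHEMA, and the table name, its four columns and the two rows are assumptions made up for
this illustration. A real client would send ADQL to the service over HTTP.
\begin{verbatim}
```python
import sqlite3

# Stand-in for a TAP service: an in-memory SQLite database.
# The table name "obscore" and the two rows are hypothetical.
db = sqlite3.connect(":memory:")
db.execute(
    "CREATE TABLE obscore "
    "(obs_id TEXT, dataproduct_type TEXT, s_ra REAL, s_dec REAL)"
)
db.executemany(
    "INSERT INTO obscore VALUES (?, ?, ?, ?)",
    [("obs-1", "image", 10.68, 41.27),
     ("obs-2", "spectrum", 83.82, -5.39)],
)

# Step 1: browse the table metadata to learn which columns exist.
# (PRAGMA table_info plays the role of TAP_SCHEMA in this sketch;
# the second field of each returned row is the column name.)
columns = [row[1] for row in db.execute("PRAGMA table_info(obscore)")]

# Step 2: pose a query that uses only columns found in step 1.
wanted = ["obs_id", "s_ra", "s_dec"]
assert all(name in columns for name in wanted)
images = db.execute(
    "SELECT obs_id, s_ra, s_dec FROM obscore "
    "WHERE dataproduct_type = 'image'"
).fetchall()

print(columns)
print(images)
```
\end{verbatim}
The point of the sketch is the ordering: the client first learns the column names from the service metadata and only
then builds a query that refers to them.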
\subsection{The goal of this effort}
Building on the work done on data models and TAP, it becomes possible to define a standard service protocol to expose
standard metadata describing available datasets. In general, any data model can be mapped to a relational database and
exposed directly with the TAP protocol. The goal of ObsTAP is to provide such a capability based upon an essential
subset of the general observational data model.
Specifically, this effort aims at defining a database table to describe astronomical datasets (data products) stored in
archives that can be queried directly with the TAP protocol. This is ideal for global data discovery as any type of
data can be described in a straightforward and uniform fashion. The described datasets can be directly downloaded or
accessed via IVOA Data Access Layer (DAL) protocols.
The final capability required to support uniform global data discovery and access, with a client sending one and the
same query to multiple TAP services, is the stipulation that a uniform standard data model is exposed (through TAP)
using agreed naming conventions, formats, units, and reference systems. Defining this core data model and associated
query mechanism is what this document is for.
Thus the purpose of this document is twofold: (1) to define a simple data model to describe observational data, and (2)
to define a standard way to expose it through the TAP protocol to provide a uniform interface to discover observational
science data products of any type.
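The benefit of agreed column names can be sketched in a few lines of Python. The sketch assumes two hypothetical
archives, each represented by an in-memory SQLite database rather than a real TAP service, and a table reduced to two
columns. Because both archives expose the same names, one query text can be sent unchanged to each of them.
\begin{verbatim}
```python
import sqlite3

# One query text, valid for every archive that follows the convention.
QUERY = (
    "SELECT obs_id FROM obscore "
    "WHERE dataproduct_type = ? ORDER BY obs_id"
)


def make_archive(rows):
    """Build a toy archive exposing the agreed column names."""
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE obscore (obs_id TEXT, dataproduct_type TEXT)")
    db.executemany("INSERT INTO obscore VALUES (?, ?)", rows)
    return db


# Two hypothetical archives: different holdings, same table layout.
archives = {
    "archive_a": make_archive([("a-1", "image"), ("a-2", "cube")]),
    "archive_b": make_archive([("b-1", "spectrum"), ("b-2", "image")]),
}


def discover(product_type):
    """Send the same query to every archive and merge the answers."""
    found = []
    for name, db in archives.items():
        for (obs_id,) in db.execute(QUERY, (product_type,)):
            found.append((name, obs_id))
    return found


print(discover("image"))
```
\end{verbatim}
If one archive used a different column name for the data product type, the loop would need per-archive query
rewriting, which is what the uniform model is meant to avoid.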
This document is organized as follows:
\begin{itemize}
\item Section \ref{bkm:Ref159237242} briefly presents the types of use cases collected from the astronomical
community by the IVOA Take Up Committee.
\item Section \ref{bkm:Ref159237280} defines the core components of the Observation data model. The elements of the data
model are summarized in Figure 2. Mandatory ObsTAP fields are summarized in Table 1.
\item Section \ref{bkm:Ref159237315} specifies the required data model fields as they are used in the TAP service: table
names, column names, column data type, UCD, Utype from the Observation Core components data model, and required units.
\item Section \ref{bkm:Ref298341494} describes how to register an ObsTAP service in a Virtual Observatory registry.
More detailed information is available in the appendices.
\item Examples are cited in Appendix A
\item Section 6 summarizes updates of this document.
\item Appendix A describes all the use cases as defined by the IVOA Take Up Committee.
\item Appendix B contains a full description of the Observation data model Core Components.
\item Appendix C shows the detailed content of the TAP\_SCHEMA tables and how to build up and fill them for the
implementation of an ObsTAP service.
\end{itemize}
\section[Use cases]{Use cases}
\label{bkm:Ref159237242}Our primary focus is on data discovery. To this end a number of use-cases have been defined,
aimed at finding observational data products in the VO domain by broadcasting the same query to multiple archives
(global data discoverability and accessibility). To achieve this we need to give data providers a set of metadata
attributes that they can easily map to their database system in order to support queries of the sort listed below.
The goal is to be simple enough to be practical to implement, without attempting to exhaustively describe every
particular dataset.
The main features of these use-cases are as follows:
\begin{itemize}
\item Support multi-wavelength as well as positional and temporal searches.
\item Support any type of science data product (image, cube, spectrum, time series, instrumental data, etc.).
\item Directly support the sorts of file content typically found in archives (FITS, VOTable, compressed files,
instrumental data, etc.).
\end{itemize}
Further server-side processing of data is possible but is the subject of other VO protocols. More refined or advanced
searches may include extra knowledge obtained by prior queries to determine the range of data products available.
The detailed list of use cases proposed for data discovery is given in Appendix A.
\subsection{Role within the VO Architecture}
\begin{figure}
\centering
% As of ivoatex 1.2, the architecture diagram is generated by ivoatex in
% SVG; copy ivoatex/archdiag-full.xml to role_diagram.xml and throw out
% all lines not relevant to your standard.
% Notes don't generally need this. If you don't copy role_diagram.xml,
% you must remove role_diagram.pdf from SOURCES in the Makefile.
%\includegraphics[width=0.9\textwidth]{role_diagram.pdf}
%\caption{Architecture diagram for this document}
%\label{fig:archdiag}
\end{figure}
Fig.~\ref{fig:archdiag} shows the role this document plays within the
IVOA architecture \citep{2021ivoa.spec.1101D}.
\section[Observation Core Components Data Model]{Observation Core Components Data Model}
\label{bkm:Ref159237280}This section highlights and describes the core components of the Observation data model,
synthesized today in the Dataset Metadata DM specification. The term ``core components'' is meant to refer to those
elements of the larger ``Observation Data Model'' that are required to support the use cases listed in Appendix A. In
reality this effort is the outcome of a trade-off between what astronomers want and what data providers are ready to
offer. The aim is to achieve buy-in of data providers with a simple and ``good enough''
model to cover the majority of the use cases.
The project of elaborating a general data model for the metadata necessary to describe any astronomical observation was
launched at the first Data Model WG meeting held in Cambridge, UK at the IVOA meeting in May 2003. The first
Observation data model was sketched out relying on some key concepts: Dataset, Identification, Curation, physical
Characterisation and Provenance (either instrumental or software). A description of the early stages of this
development can be found in \cite{CITATIONCharDM2007l1036} (Observation IVOA note). Some of these concepts have already
been elaborated in existing data models, namely the Spectrum data model \cite{CITATIONJon07l1036} for general items
such as dataset identification and curation, and the Characterisation data model \cite{CITATIONIVO07l1036a} for the
description of the physical axes and properties of an observation, such as coverage, resolution, sampling, and
accuracy. The Core Components data model reuses the relevant elements from those models. Generalization of the
observational model to support data from theoretical models (e.g., synthetic spectra) is possible but is not addressed
here in order to keep the core model simple.
\subsection[UML description of the model]{UML description of the model}
This section provides a graphical overview of the Observation Core Components data model using the unified modeling
language (UML). The UML class diagram shown in Figure 2 depicts the overall Observation Data Model, detailing those
aspects that are relevant to the Core Components, while omitting those not relevant. The Characterisation classes
describing how the data span along the main physical measurement axes are simplified here showing only the attributes
necessary for data discovery. This is also the case for the DataID and Curation classes extracted from the
Spectrum/SSA data model where only a subset of the attributes are actually necessary for data discovery. For our
purposes here we show Characterisation classes only down to the level of the Support class (level 3).
%\includegraphics[width=17.489cm,height=19.59cm]{ObsCore-img003.png}
\label{bkm:Ref158037359}Figure 2. Depicted here are the classes used to organize
observational metadata. Classes may be linked either via association or aggregation. The minimal set of necessary
attributes for data discovery is shown in brown.
For the sake of clarity, the SpatialAxis, SpectralAxis and TimeAxis classes on the diagram are not expanded on the main
class diagram. Details for these axes are shown in Figure 3 for the spatial axis, Figure 4 for the spectral axis and
Figure 5 for the time axis.
%\includegraphics[width=14.605cm,height=10.239cm]{ObsCore-img004.png}
\label{bkm:Ref158037577}Figure 3. Details of the classes linked to the description of the
spatial axis for an Observation dataset. All axes in this model inherit the main structure from the
CharacterisationAxis class, but some specific attributes are necessary for Space coordinates.
%\includegraphics[width=17.427cm,height=10.495cm]{ObsCore-img005.png}
\label{bkm:Ref158037643}Figure 4. Spectral axis: details of the classes necessary to
describe the spectral properties of an Observation dataset. UCD and units are essential to disentangle various possible
spectral quantities.
%\includegraphics[width=13.524cm,height=9.075cm]{ObsCore-img006.png}
\label{bkm:Ref291003095}Figure 5. The classes from the Characterisation DM used to
describe time metadata.
Details on the ObsCoreDM axes definitions are available in the Characterisation data model standard document
\cite{CITATIONIVO07l1036a}. The hypertext documentation of the model is available on the IVOA web site under the
ObsCore wiki page \url{http://www.ivoa.net/internal/IVOA/ObsDMCoreComponents/}.
\subsection[Main Concepts of the ObsCore Data Model]{Main Concepts of the ObsCore Data Model}
The ObsCore data model is the result of the analysis of the data discovery use cases introduced in Section 2. Two sets
of elements have been identified: those necessary to support the provided use cases, and others that are generally
useful to describe the data but are not immediately required to support the use cases. In this section only the first
set is described. That set coincides with the set of parameters that any ObsTAP service must support. Please refer to
Appendix B for the detailed description of all model elements.
Table 1 lists the data model elements that any ObsTAP implementation must support (i.e., a column with that name must
exist, although in some cases it may be nillable). Provision of these mandatory fields ensures that any query based
on these parameters is guaranteed to be understood by all ObsTAP services.
NB: Data model fields are listed here with their TAP column name rather than the IVOA data model element identifiers
(Utype) to ease readability. See the associated Utypes in Appendix C.
%\begin{table}[h]
%\begin{center}
\begin{longtable}{|l|l|l|p{0.45\textwidth}|}
\hline
Column Name & Unit & Type & Description\\\hline
dataproduct\_type & unitless & String & Logical data product type (image etc.)\\\hline
calib\_level & unitless & enum integer & Calibration level \{0, 1, 2, 3, 4\} \\\hline
obs\_collection & unitless & String & Name of the data collection \\\hline
obs\_id & unitless & String & Observation ID \\\hline
obs\_publisher\_did & unitless & String & Dataset identifier given by the publisher\\\hline
access\_url & unitless & String & URL used to access (download) dataset\\\hline
access\_format & unitless & String & File content format (see App.~\ref{bkm:Ref297463580})\\\hline
access\_estsize & kbyte & integer & Estimated size of dataset in kilobytes\\\hline
target\_name & unitless & String & Astronomical object observed, if any\\\hline
s\_ra & deg & double & Central right ascension, ICRS\\\hline
s\_dec & deg & double & Central declination, ICRS\\\hline
s\_fov & deg & double & Diameter (bounds) of the covered region \\\hline
s\_region & unitless & String & Sky region covered by the data product (expressed in ICRS frame)\\\hline
s\_xel1 & unitless & integer & Number of elements along the first spatial axis\\\hline
s\_xel2 & unitless & integer & Number of elements along the second spatial axis\\\hline
s\_resolution & arcsec & double & Spatial resolution of data as FWHM\\\hline
t\_min & d & double & Start time in MJD\\\hline
t\_max & d & double & Stop time in MJD\\\hline
t\_exptime & s & double & Total exposure time\\\hline
t\_resolution & s & double & Temporal resolution\\\hline
t\_xel & unitless & integer & Number of elements along the time axis\\\hline
em\_min & m & double & Start in spectral coordinates\\\hline
em\_max & m & double & Stop in spectral coordinates\\\hline
em\_res\_power & unitless & double & Spectral resolving power\\\hline
em\_xel & unitless & integer & Number of elements along the spectral axis\\\hline
o\_ucd & unitless & String & UCD of observable (e.g. phot.flux.density, phot.count, etc.)\\\hline
pol\_states & unitless & String & List of polarization states or NULL if not applicable\\\hline
pol\_xel & unitless & integer & Number of polarization samples \\\hline
facility\_name & unitless & String & Name of the facility used for this observation \\\hline
instrument\_name & unitless & String & Name of the instrument used for this observation \\\hline
\end{longtable}
%\end{center}
\label{bkm:Ref460858868}Table 1. Mandatory fields of the Observation Core Components data
model with their names, recommended units, data types and descriptions.
%\end{table}
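Since every ObsTAP service must expose a column for each entry of Table 1, a simple completeness check is easy to
write. The Python sketch below copies the column names from Table 1; the function name and the example of a service
lacking the polarization columns are hypothetical and serve only as illustration.
\begin{verbatim}
```python
# Column names copied from Table 1, in table order.
MANDATORY_COLUMNS = (
    "dataproduct_type", "calib_level", "obs_collection", "obs_id",
    "obs_publisher_did", "access_url", "access_format", "access_estsize",
    "target_name", "s_ra", "s_dec", "s_fov", "s_region",
    "s_xel1", "s_xel2", "s_resolution",
    "t_min", "t_max", "t_exptime", "t_resolution", "t_xel",
    "em_min", "em_max", "em_res_power", "em_xel",
    "o_ucd", "pol_states", "pol_xel",
    "facility_name", "instrument_name",
)


def missing_columns(available):
    """Return the mandatory columns absent from `available`."""
    present = set(available)
    return [name for name in MANDATORY_COLUMNS if name not in present]


# Hypothetical service that forgot the polarization columns.
exposed = [c for c in MANDATORY_COLUMNS if not c.startswith("pol_")]
print(missing_columns(exposed))
```
\end{verbatim}
An empty result means that all mandatory columns are present. The check says nothing about whether the values in
those columns are filled or NULL.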
\subsection{Specific Data Model Elements}
In order to support the global data discoverability and accessibility requirements, some new concepts not previously
covered by any other data model have to be introduced. This section describes them: the data product type,
a classification of the various levels of calibration and processing applied to the data, and the file content and format,
enriched and extended from the concept described in the SSA protocol \cite{CITATIONTod2012l1036}. In addition, this
section clarifies how the terms Observation and Data Product are used in the ObsTAP context and discusses
composed products.
\subsubsection{Data Product Type}
\label{bkm:Ref286875933}The model defines a data product type attribute to describe the high level scientific
classification of the data product being considered. This is coded as a string that conveys a general idea of the
content and organization of a dataset. We consider a coarse classification of the types of dataset interesting for
science usage, covering: image, cube, spectrum, SED, time series, visibility data, event data, and derived measurements.
\begin{itemize}
\item \emph{image}: An astronomical image, typically a 2D image with two spatial axes, e.g., a FITS image. The image content
may be complex, e.g., an objective-grism observation would be considered a type of image, even though an extracted
spectrum would be a Spectrum data product.
\item \emph{cube}: A multidimensional astronomical image with 3 or more image axes, e.g., a spectral image cube, a polarization
cube, a full Stokes radio data cube, a time image cube, etc. The most common format for astronomical ``cube'' data
products is a multidimensional FITS image; however, other formats are allowed so long as they are adequately described.
\item \emph{spectrum}: Any dataset for which spectral coverage is the primary attribute, e.g., a 1D spectrum or a long slit
spectrum.
\item \emph{sed}: A spectral energy distribution, an advanced data product often produced by combining data from multiple
observations.
\item \emph{timeseries}: A one-dimensional array presenting some quantity as a function of time. A light curve is a typical
example of a time series dataset.
\item \emph{visibility}: A visibility (radio) dataset of some sort. Typically this is instrumental data, i.e.,
``visibility data''. A visibility dataset is often a complex object containing multiple
files or other substructures. A visibility dataset may contain data with spatial, spectral, time, and polarization
information for each measured visibility, hence can be used to produce higher level data products such as images,
spectra, time series, and so forth.
\item \emph{event}: An event-counting (e.g., X-ray or other high energy) dataset of some sort. Typically this is instrumental
data, i.e., ``event data''. An event dataset is often a complex object containing multiple
files or other substructures. An event dataset may contain data with spatial, spectral, and time information for each
measured event, although the spectral resolution (energy) is sometimes limited. Event data may be used to produce
higher level data products such as images or spectra.
\item \emph{measurements}: A list of measurements derived, after some analysis processing, from an original dataset of one of
the previous types, such as a source list, or more generally a list of `results' attached to such datasets.
\end{itemize}
Classification of astronomical data by data product type is inherently ambiguous; hence the classification scheme defined
here is intentionally kept as simple as possible. The data provider should pick the primary category most appropriate
for their data. Values must be specified in lower-case (in order to simplify queries). One of the defined
dataproduct\_type values must be used if appropriate for the data product in question, otherwise a NULL value is
permitted and a more precise definition of the data product type should be given in dataproduct\_subtype. Combination
of data product types is not allowed, i.e., either one of the above values or NULL must be specified.
Further information on the specific content of a data product can be provided by the dataproduct\_subtype data model
field defined in the data model appendix \ref{bkm:Ref291536287}, and by the related obs\_title
(\ref{bkm:Ref292046860}) and access\_format attributes (section \ref{bkm:Ref289893457}).
The intent of dataproduct\_type is to provide only a general indication of the category to which the data product
belongs, in order to facilitate global data discovery.
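The rules stated in this section for dataproduct\_type (lower-case values only, exactly one of the defined values or
NULL, no combinations) can be expressed as a short validation function. The Python sketch below is one possible
reading of those rules; the function name is hypothetical and None is assumed to represent a NULL entry.
\begin{verbatim}
```python
# The values listed in this section; None stands for a NULL entry.
DATAPRODUCT_TYPES = frozenset({
    "image", "cube", "spectrum", "sed",
    "timeseries", "visibility", "event", "measurements",
})


def is_valid_dataproduct_type(value):
    """True for NULL or exactly one of the defined lower-case values."""
    return value is None or value in DATAPRODUCT_TYPES


checks = {
    "image": is_valid_dataproduct_type("image"),      # defined value
    "NULL": is_valid_dataproduct_type(None),          # NULL is permitted
    "Image": is_valid_dataproduct_type("Image"),      # not lower-case
    "image,spectrum": is_valid_dataproduct_type("image,spectrum"),
}
print(checks)
```
\end{verbatim}
The last entry illustrates the rule against combinations: a string naming two types is not a member of the set and
is therefore rejected.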
\subsubsection{Calibration level}
\label{bkm:Ref158638048}\label{bkm:Ref287048333}The calibration level concept conveys to the user information on how
much data reduction/processing has been applied to the data. It is up to the data providers to consider how to map
their own internal classification to the classification scale suggested here.
\begin{itemize}
\item Level 0: Raw instrumental data, in a proprietary or internal data-provider defined format, that needs
instrument-specific tools to be handled.
\item Level 1: Instrumental data in a standard format (FITS, VOTable, SDFITS, ASDM, etc.) which could be manipulated with
standard astronomical packages.
\item Level 2: Calibrated, science ready data with the instrument signature removed.
\item Level 3: Enhanced data products like mosaics, resampled or drizzled images, or heavily processed survey fields. Level 3
data products may represent the combination of data from multiple primary observations.
\item Level 4: Analysis data products generated after some scientific data manipulation or interpretation.
\end{itemize}
The examples in the following subsection should help illustrate the use of the calib\_level attribute. Ambiguous cases
are left to the data provider to decide.
\paragraph[Examples of datasets and their calibration level]{Examples of datasets and their calibration level}
Here are examples of various datasets, classified according to the scheme defined above.
%\begin{table}[h]
%\begin{center}
\begin{tabular}{|l|p{0.25\textwidth}|l|p{0.35\textwidth}|}
\hline
Data product type & Data collection & Calibration Level & Comments\\\hline
image & IRAS/NASA & 2 & Science ready data\\\hline
image & IRIS/IRSA & 3 & Recalibrated from infrared IRAS images with removal of the sensor memory effect.\\\hline
image & HDFS/ACS GOODS data & 3 & Image associations mosaicking/stacking\\\hline
spectrum & XMM-Newton EPIC spectra & 1 & Raw instrumental spectrum.\\\hline
cube & EVLA spectral data cube & 2 & Radio spectral data cube in FITS format\\\hline
sed & NED SED & 3 & NED spectral energy distribution\\\hline
event & ROSAT/HEASARC & 1 & Instrumental data\\\hline
visibility & ALMA, Merlin, etc. & 1 & Instrumental data\\\hline
measurements & ESO tile catalog & 4 & Photometric catalog of extracted sources for a tile image\\\hline
timeseries & CTA reconstructed light curve & 4 & Reconstructed light curve following photons vs particles separation under some assumption \\\hline
\end{tabular}
%\end{center}
\label{T2}Table T2. Examples of datasets with their associated calibration level values.
%\end{table}
\subsubsection{Observation and Observation Dataset}
\label{bkm:Ref450327253}ObsTAP describes observations in a broad sense; exactly what comprises an
{\textquotedbl}observation{\textquotedbl} is not well defined within astronomy and is left up to the data provider to
define for their data. ObsTAP also describes archive data products (e.g., actual archive files).
The IVOA Dataset Metadata model (see http://www.ivoa.net/documents/DatasetDM/) clarifies the logical links between an
Observation and an ObservationDataset, i.e., a data product here. It distinguishes between an
{\textquotedbl}observation{\textquotedbl}, the description of an observing experiment, and its resulting datasets.
Therefore the term ObsDataset is adopted in this version as a replacement for Observation in the previous ObsCore 1.0
specification. It helps to handle various situations of combination, stacking, or packaging of the results of performing
an observation at the instrument level.
In general an Observation Dataset, as a result, may be composed of multiple individual data products. In this case all
the data products stemming from one observation should share the same observation identifier (obs\_id). The form of
the obs\_id string is up to the data provider so long as it uniquely identifies, within the context of the archive, all
data products resulting from the observation. The individual data products associated with an observation may have
different data product types, calibration levels, and so forth. ObsTAP only directly supports the description of
science data products, i.e., data products which contain science data having some physical (spatial, spectral,
temporal) coverage.
Two different approaches can be followed for exposing the instrumental data from an observation. One can either expose
the individual science data products resulting from the observation, all sharing the same obs\_id, or one can
``package'' the data products and expose the package as a single complex instrumental data product. Combinations of the
two approaches are also possible, e.g., a package of all the data products, plus additional records exposing selected
high priority individual data products, all sharing the same obs\_id.
If the data products comprising an observation are exposed individually then attributes such as the calibration level
can vary for different data products, e.g., the raw instrumental data as observed might be level 1, a standard pipeline
data product might be level 2, and a custom user-processed data product subsequently published back to the archive
might be level 3. All such data products would share the same obs\_id.
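As an illustration, the data products belonging to one observation can be listed by selecting on obs\_id. The query
below is only a sketch: the identifier value is a hypothetical placeholder, since the form of obs\_id is chosen by each
data provider.

\begin{verbatim}
-- List every data product exposed for one observation,
-- from the least to the most processed.
-- 'hypothetical-obs-0001' is a made-up identifier.
SELECT obs_publisher_did, dataproduct_type, calib_level, access_format
FROM ivoa.ObsCore
WHERE obs_id = 'hypothetical-obs-0001'
ORDER BY calib_level
\end{verbatim}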
If on the other hand all data from an observation is exposed as a single data product via ObsTAP this will likely be an
aggregate of some sort (tar file, directory, etc.) containing multiple files. This latter approach is limited to
instrumental data (level 0 or 1), even if objects within the aggregate observation file are higher level. From the
perspective of ObsTAP this would be instrumental data, and it is up to the user or client application consuming the
data to interpret the meaning of the data elements within the observation.
Which approach is best depends upon the anticipated scientific usage and is up to the data provider to determine. For
example if the observational data provided is most commonly used for multi-wavelength analysis, exposing individual
high level data products is likely to be the best approach. If the anticipated usage is dominated by complex analysis
of instrumental data, then exposing the entire observation as a standard package of instrumental data may be preferred.
\subsubsection{File Content and Format}
While dataproduct\_type specifies at a high level what a specific data product is, the access\_format attribute
specifies what is actually in the file. For example, an {\textquotedbl}image{\textquotedbl} could be a FITS image, an
image embedded in a FITS multi-extension format (MEF) file, a JPEG, etc. A {\textquotedbl}spectrum{\textquotedbl}
could be represented in the VO-compliant Spectrum format, or in some instrument-specific FITS binary table format. A
visibility dataset could be in FITS or ASDM format, or a variety of other radio data formats. A ROSAT or Chandra
observation might be presented as a `tar' file or directory containing instrument-specific observational files. There
are many such examples; we give only a few here to illustrate the concept.
Specifying the content and format of a data product is important as special software may be required to do anything
useful with the data. The user needs to know exactly what the data product is before deciding to download it for
analysis.
See section \ref{bkm:Ref289893457} for more details and implementation requirements.
\section{Implementation of ObsCore in a TAP Service}
\label{bkm:Ref159237315}The ObsCore model must be implemented within Table Access Protocol (TAP) services such that all
valid queries can be executed unchanged on any service that implements the model. Additional optional or
provider-defined columns are permitted (see section \ref{bkm:Ref421295535}) so long as all mandatory columns are
provided. The protocol does not specify any specific ordering of fields in the query response so long as the mandatory
parameters are present in the output stream.
Here we specify an explicit mapping of the model to relational database tables; in the context of TAP this means we are
specifying the logical tables as described in the TAP\_SCHEMA (the TAP-required database schema where the tables and
columns exposed by the service are described). This does not necessarily imply that the underlying database will have
the identical structure (what is exposed through TAP could be, for example, a database view of the underlying database
tables), but in most cases the relationship between TAP\_SCHEMA description and the underlying tables is
straightforward.
%\begin{table}[h]
%\begin{center}
\begin{tabular}{|p{0.2\textwidth}|p{0.2\textwidth}|p{0.6\textwidth}|}
\hline
schema\_name & table\_name & Description\\\hline
ivoa & ivoa.ObsCore & ObsCore 1.1\\\hline
\end{tabular}
%\end{center}
\label{T3}Table T3. TAP\_SCHEMA.tables for implementation of the ObsCore model.
%\end{table}
%\begin{table}[h]
%\begin{center}
\begin{longtable}{|p{0.20\textwidth}|p{0.28\textwidth}|p{0.22\textwidth}|p{0.1\textwidth}|p{0.20\textwidth}|}
\hline
table\_name & column\_name & data type & units & Constraint\\\hline
ivoa.ObsCore & dataproduct\_type & adql:VARCHAR & & \\\hline
ivoa.ObsCore & calib\_level & adql:INTEGER & & not null\\\hline
ivoa.ObsCore & obs\_collection & adql:VARCHAR & & not null\\\hline
ivoa.ObsCore & obs\_id & adql:VARCHAR & & not null\\\hline
ivoa.ObsCore & obs\_publisher\_did & adql:VARCHAR & & not null\\\hline
ivoa.ObsCore & access\_url & adql:CLOB & & \\\hline
ivoa.ObsCore & access\_format & adql:VARCHAR & & \\\hline
ivoa.ObsCore & access\_estsize & adql:BIGINT & kbyte & \\\hline
ivoa.ObsCore & target\_name & adql:VARCHAR & & \\\hline
ivoa.ObsCore & s\_ra & adql:DOUBLE & deg & \\\hline
ivoa.ObsCore & s\_dec & adql:DOUBLE & deg & \\\hline
ivoa.ObsCore & s\_fov & adql:DOUBLE & deg & \\\hline
ivoa.ObsCore & s\_region & adql:REGION & & \\\hline
ivoa.ObsCore & s\_resolution & adql:DOUBLE & arcsec & \\\hline
ivoa.ObsCore & s\_xel1 & adql:BIGINT & & \\\hline
ivoa.ObsCore & s\_xel2 & adql:BIGINT & & \\\hline
ivoa.ObsCore & t\_min & adql:DOUBLE & d & \\\hline
ivoa.ObsCore & t\_max & adql:DOUBLE & d & \\\hline
ivoa.ObsCore & t\_exptime & adql:DOUBLE & s & \\\hline
ivoa.ObsCore & t\_resolution & adql:DOUBLE & s & \\\hline
ivoa.ObsCore & t\_xel & adql:BIGINT & & \\\hline
ivoa.ObsCore & em\_min & adql:DOUBLE & m & \\\hline
ivoa.ObsCore & em\_max & adql:DOUBLE & m & \\\hline
ivoa.ObsCore & em\_res\_power & adql:DOUBLE & & \\\hline
ivoa.ObsCore & em\_xel & adql:BIGINT & & \\\hline
ivoa.ObsCore & o\_ucd & adql:VARCHAR & & \\\hline
ivoa.ObsCore & pol\_states & adql:VARCHAR & & \\\hline
ivoa.ObsCore & pol\_xel & adql:BIGINT & & \\\hline
ivoa.ObsCore & facility\_name & adql:VARCHAR & & \\\hline
ivoa.ObsCore & instrument\_name & adql:VARCHAR & & \\\hline
\end{longtable}
%\end{center}
\label{bkm:Ref286578377}Table T4. List of the minimal set of data model fields to
implement for an ObsTAP service. See tables of Appendix C for the full description of the TAP\_SCHEMA.columns table.
%\end{table}
Table 3 and Table 4 provide the primary information needed to describe the ObsCore model in terms of TAP\_SCHEMA tables
and columns.
The ``data type'' column values should follow the TAP standard specification and are bound to the TAP specification
version used to implement the model. The values shown here apply to TAP v1.0.
The content of the ``constraint'' column specified in Table 4 above is not part of the TAP\_SCHEMA.columns description,
but is required by the ObsCore model and specified here to make this clear to implementers. Additional standard
content for the individual columns is specified below.
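A client can check how a given service describes the table by querying TAP\_SCHEMA itself. The following is a minimal
sketch assuming the TAP 1.0 column names of TAP\_SCHEMA.columns; services may differ in the letter case they use for
the table name, so the string literal may need adjusting.

\begin{verbatim}
-- Inspect the columns a service declares for the ObsCore table.
SELECT column_name, datatype, unit, ucd, utype
FROM TAP_SCHEMA.columns
WHERE table_name = 'ivoa.ObsCore'
ORDER BY column_name
\end{verbatim}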
\subsection{Data Product Type (dataproduct\_type)}
The dataproduct\_type column contains a simple string value describing the primary nature of the data product. It
should assume one of these string values: image, cube, spectrum, sed, timeseries, visibility, event or measurements.
These values are described in section \ref{bkm:Ref286875933}. A NULL value is permitted, but only in the event that
none of the proposed values can be used to describe the dataset. The optional field dataproduct\_subtype
(\ref{bkm:Ref291536287}) may be used to more precisely define the nature of the dataset. Values in the
dataproduct\_type column must be written in lower case. Specifying this field along with the desired spatial and
spectral coverage will be enough to discover data of interest in many common cases.
Usage: select * from ivoa.ObsCore where dataproduct\_type='image' returns only image data.
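A discovery query combining the data product type with spatial and spectral constraints could look like the following
sketch. The position (16.0, 40.0 degrees) and the wavelength interval (2.1 to 2.3 micron) are arbitrary illustrative
values, and the spatial test assumes that the service populates s\_region and supports the ADQL geometry functions.

\begin{verbatim}
-- Images covering a sky position and overlapping a wavelength range.
SELECT obs_publisher_did, obs_collection, access_url, access_format
FROM ivoa.ObsCore
WHERE dataproduct_type = 'image'
  AND CONTAINS(POINT('ICRS', 16.0, 40.0), s_region) = 1
  AND em_min <= 2.3e-6   -- product starts before the range ends
  AND em_max >= 2.1e-6   -- and ends after the range starts (metres)
\end{verbatim}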
\subsection{Caveat when using dataproduct\_type=``measurements''}
Note that ``measurements'' extends the set of accepted values for dataproduct\_type in ObsCore 1.0. This extension is
meant to expose derived data products together with the progenitor observation dataset.
A few mandatory keywords for the axes description may not be applicable to such a data product. In this case the
spatial, energy, time, and polarization coverage may inherit the values from the ObsCore description of its
progenitor.
Progenitors and their derived data products must have the same obs\_id.
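Because progenitors and derived products share an obs\_id, they can be retrieved together with a self-join. This is a
sketch only; it assumes the progenitor is exposed as an image, which need not be the case for a given archive.

\begin{verbatim}
-- Pair each 'measurements' product with the image it was derived from.
SELECT m.obs_id,
       m.obs_publisher_did AS measurements_did,
       p.obs_publisher_did AS progenitor_did
FROM ivoa.ObsCore AS m
JOIN ivoa.ObsCore AS p ON m.obs_id = p.obs_id
WHERE m.dataproduct_type = 'measurements'
  AND p.dataproduct_type = 'image'
\end{verbatim}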
\subsection{Calibration Level (calib\_level)}
The calib\_level column tells the user the amount of calibration processing that has been applied to create the data
product. calib\_level must assume one value among \{0, 1, 2, 3, 4\}. Please refer to section \ref{bkm:Ref287048333} for
a full description of the various categories. Data providers decide which value best describes their data products.
Values in the calib\_level column must not be NULL.
Query usage: ``select * from ivoa.ObsCore where calib\_level {\textgreater}2'' returns enhanced data products.
\subsection{Collection Name (obs\_collection)}
The obs\_collection column identifies the data collection to which the data product belongs. A data collection is any
logical collection of datasets which are alike in some fashion. Typical data collections might be all the data from a
particular telescope, instrument, or survey. The value is either the registered shortname for the data collection, the
full registered IVOA identifier for the collection, or a data provider defined shortname for the collection. Often the
collection name will be set to the name of the instrument that generated the data. In that case we suggest specifying
the collection name as a string composed of the facility name, followed by a slash, followed by the instrument name.
Examples: HST/WFPC2, VLT/FORS2, CHANDRA/ACIS-S.
There are other cases where it makes no sense to use the instrument name, perhaps because the data product used data from
different instruments or facilities, or for other reasons. Examples: SDSS-DR7, etc.
In practice this is not a very precisely defined field. What is important is for the data provider to use a collection
name which is familiar to astronomers and specific enough to single out datasets of interest easily.
Values in the obs\_collection column must not be NULL.
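Since collection names are chosen by each provider, a client may first want to see which values a service actually
uses. One possible way, sketched here, is an aggregate query; on large tables it may be slow or subject to service
limits.

\begin{verbatim}
-- Count the data products in each collection exposed by the service.
SELECT obs_collection, COUNT(*) AS n_products
FROM ivoa.ObsCore
GROUP BY obs_collection
ORDER BY obs_collection
\end{verbatim}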
\subsection{Observation Identifier (obs\_id)}
The obs\_id column defines a unique identifier for an observation as explained in section \ref{bkm:Ref450327253}. In
the case where multiple data products are available for an observation (e.g. with different calibration levels), the
obs\_id value will be the same for each data product comprising the observation. This is equivalent to the dataset name
used by many archives, where a single dataset name may have many files associated with it. The returned obs\_id for an
archival observation should remain identical through time for future reference.
In the case of some advanced data products (with calib\_level ${\geq}$ 3), the data product may be the result of
combining data from multiple primary (physical) observations. In this case the resulting data product is a new
processed ``observation'' to which a new unique observation identifier should be assigned. If the advanced processing
results in several associated data products they should share the same obs\_id. Describing the provenance of such an
advanced data product is possible, but is out of scope for ObsTAP.
Values in the obs\_id column must not be NULL.
\subsection{Publisher Dataset Identifier (obs\_publisher\_did)}
The obs\_publisher\_did column contains the IVOA dataset identifier \cite{CITATIONPla07l1036} for the published data
product. This value must be unique within the namespace controlled by the dataset publisher (data center). The value
will also be globally unique since each publisher has a unique IVOA registered publisher ID. The same dataset may
however have more than one publisher dataset identifier if it is published in more than one location; the creator DID,
if defined for the given dataset, would be the same regardless of where the data is published.
The returned obs\_publisher\_did for a static data product should remain identical through time for future reference.
Values in the obs\_publisher\_did column must not be NULL.
\subsection{Access URL (access\_url)}
The access\_url column contains a URL that can be used to download the data product (as a file of some sort).
We specify the data type as CLOB (character large object) in the TAP service so that users will know they can only use
the access\_url column in the SELECT clause of a query. That is, users cannot specify this column as part of a
condition in the WHERE clause and implementers are free to generate the URL on the fly during output (rather than being
forced to store it statically in the database).
More details are given on the use of CLOB data types for the TAP SCHEMA in the TAP Standard document
\cite{CITATIONTAPl1036}, section 2.5 Table upload.
Access URLs are not guaranteed to remain valid and unchanged indefinitely. To access a specific data product after a
period of time (e.g., days or weeks) a query should be performed (e.g., using obs\_publisher\_did) to obtain a fresh
access URL.
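For example, a client that stored the publisher dataset identifier of a product could refresh its access URL with a
query such as the one below. The identifier shown is a made-up placeholder, not a real dataset.

\begin{verbatim}
-- Obtain a current access URL for a previously discovered data product.
-- The identifier below is hypothetical.
SELECT access_url, access_format, access_estsize
FROM ivoa.ObsCore
WHERE obs_publisher_did = 'ivo://example.org/hypothetical?dataset-0001'
\end{verbatim}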
\subsection{Access Format (access\_format)}
\label{bkm:Ref289893457}The access\_format column specifies the format of the data product if downloaded as a file. This
data model field is important both for data discovery and for the client to evaluate whether it will be able to
actually use the data product if downloaded.
MIME types are often used to specify file formats in existing protocols such as HTTP\cite{CITATIONIntl1036}. However
when dealing with astronomical observations as in ObsTAP services, more information about the format of the data is
required than can be specified by conventional MIME types. For instance we might want to distinguish between various
formats like multi-extension FITS (e.g. for CCD mosaic instruments or MUSE IFU data), or ASDM (e.g. for ALMA or other
interferometry observations). Even for something as fundamental to astronomy as FITS binary table there is currently
no standardized MIME type other than the generic application/fits.
While standard MIME types are limited when it comes to describing the many data formats actually in use within
astronomy, they are ideal for specifying common file types such as HTML and XML, the various graphics file types, text,
PDF, and so forth, all of which can be used to describe aspects of observational data. Furthermore the MIME type
scheme is extensible, allowing new formats which are not yet standardized to be specified. Hence what we propose here
is to adopt the MIME type mechanism to describe the file format of a science data product, defining new custom types as
needed. Note this is distinct from the science content which is specified by the data product type and subtype. The
same content can potentially be represented in multiple formats hence these are distinct.
The following table illustrates some common astronomical file formats. This list is by no means intended to be
comprehensive; rather, it defines standard values for some common formats and includes a few arbitrarily chosen ones to
illustrate the approach. We can extend this list as experience is gained
using ObsTAP to describe actual data archives.
%\begin{table}[h]
%\begin{center}
\begin{tabular}{|p{0.3\textwidth}|p{0.15\textwidth}|p{0.5\textwidth}|}
\hline
MIME-type & Shortname & Definition\\\hline
image/fits & fits & Any multidimensional regularly sampled FITS image or cube\\\hline
image/jpeg & jpeg & A 2D JPEG graphic image (likewise for GIF, PNG, etc.)\\\hline
application/fits & fits & Any generic FITS file\\\hline
application/x-fits-bintable & bintable & A FITS binary table (single BINTABLE extension)\\\hline
application/x-fits-mef & mef & A FITS multi-extension file (multiple extensions)\\\hline
application/x-fits-uvfits & uvfits & A FITS file in UVFITS format (likewise SDFITS etc.)\\\hline
application/x-fits-euro3d & euro3d & A FITS file in Euro3D format (multiobject spectroscopy)\\\hline
application/x-votable+xml & VOTable & Any generic VOTable file\\\hline
application/x-asdm & asdm & ALMA science data model (final export format still TBD)\\\hline
application/pdf & pdf & Any PDF file\\\hline
text/html & html & Text in HTML format\\\hline
text/xml & xml & Any generic XML file\\\hline
text/plain & txt & Any generic text file\\\hline
text/csv & csv & Tabular data in comma separated values format\\\hline
text/tab-separated-values & tsv & Tabular data in tab separated values format\\\hline
application/x-tar & tar & Multiple files archive in TAR format\\\hline
application/zip & zip & Multiple files archive in ZIP format\\\hline
application/x-directory & dir & Multiple files archive returned as a text list \\\hline
image/x-fits-gzip & fits & A GZIP-compressed FITS image\\\hline
image/x-fits-hcompress & fits & A FITS image using HCOMPRESS compression\\\hline
application/x-tar-gzip & gtar & A GZIP-compressed TAR file (x-gtar also sometimes used)\\\hline
application/x-votable+xml; content=datalink & datalink & A datalink response containing links to data sets or services attached to the current dataset\\\hline
\end{tabular}
%\end{center}
\label{bkm:Ref286578377}Table T5. Examples of MIME types for common astronomical file formats, with suggested short names.
%\end{table}
The value of access\_format should be a MIME type, either a standard MIME type, an extended MIME type from the above
table, or a new custom MIME type defined by the data provider. The short names suggested here are not used directly by
access\_format.
Custom file formats should be specified using a MIME type such as
{\textquotedbl}application/x-{\textless}whatever{\textgreater}{\textquotedbl}. This can be used for any file format
including custom binary file formats.
Observational datasets consisting of multiple instrument-specific files may be exposed in formats like
application/x-directory, application/x-tar or application/x-tar-gzip. Details of the package content and how to access
inner data products are described in the IVOA Data Link specification\cite{CITATIONPat15l1036}. See the example
presented in section \ref{bkm:Ref303703299}.
Compression is inherent in some file formats, e.g., ZIP or JPEG. In other formats it is optional and is indicated by
having multiple versions of the format, e.g. image/fits or image/x-fits-gzip for a GZIP-compressed FITS image (the
{\textquotedbl}x-{\textquotedbl} prefix is required for anything which is not a registered standard MIME type).
The access\_url field may also point to a datalink service. This is indicated by the
`application/x-votable+xml;content=datalink' access format.
This service returns a response listing files related to the discovered dataset (previews, tar
balls, etc.). It can also contain descriptions of services that perform operations on the dataset, such as cut-outs.
\subsection{Estimated Download Size (access\_estsize)}
The access\_estsize column contains the approximate size (in kilobytes) of the file available via the access\_url. This
is used only to gain some idea of the size of a data product before downloading it, hence only an approximate value is
required. Provision of dataset size estimates is important whenever it is possible that datasets can be very large.
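The format and size columns can be combined to restrict a query to products a client can handle. In the sketch below
the 100000 kilobyte threshold (about 100 MB) is an arbitrary choice; note that rows with a NULL access\_estsize do not
satisfy the size condition and are therefore not returned.

\begin{verbatim}
-- FITS images expected to be smaller than about 100 MB.
SELECT obs_publisher_did, access_url, access_estsize
FROM ivoa.ObsCore
WHERE access_format IN ('image/fits', 'application/fits')
  AND access_estsize < 100000   -- kilobytes
\end{verbatim}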
\subsection{Target Name (target\_name)}
The target\_name column contains the name of the target of the observation, if any. This is typically the name of an
astronomical object, but could be the name of a survey field.
The target name is most useful for output, to identify the target of an observation to the user. In queries it is
generally better to refer to astronomical objects by position, using a name resolver to convert the target name into a
coordinate (when possible).
\subsection{Central Coordinates (s\_ra, s\_dec)}
The coordinate system in which coordinates are expressed is ICRS. The s\_ra column specifies the ICRS Right Ascension of
the center of the observation. The s\_dec column specifies the ICRS Declination of the center of the observation.
\subsection{Spatial Extent (s\_fov)}
The s\_fov column (1D size of the field of view) contains the approximate size of the region covered by the data
product. For a circular region, this is the diameter (not the radius). For most data products the value given should
be large enough to include the entire area of the observation; coverage within the bounded region need not be complete,
for example if the specified FOV encompasses a rotated rectangular region. For observations which do not have a
well-defined boundary, e.g. radio or high energy observations, a characteristic value should be given.
The s\_fov attribute provides a simple way to characterize and use (e.g. for discovery computations) the approximate
spatial coverage of a data product. The spatial coverage of a data product can be more precisely specified using the
s\_region attribute (\ref{bkm:Ref158024378}).
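When s\_region is not available, s\_ra, s\_dec and s\_fov allow a rough coverage test: a position is probably covered
if it lies within half the field-of-view diameter of the centre. The sketch below uses the ADQL DISTANCE function,
which returns degrees. The position is arbitrary, and the test treats every footprint as a circle, so it only
approximates the true coverage.

\begin{verbatim}
-- Products whose approximate circular footprint contains (16.0, 40.0).
SELECT obs_publisher_did, s_ra, s_dec, s_fov
FROM ivoa.ObsCore
WHERE DISTANCE(POINT('ICRS', s_ra, s_dec),
               POINT('ICRS', 16.0, 40.0)) <= 0.5 * s_fov
\end{verbatim}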
\subsection{Spatial Coverage (s\_region)}
\label{bkm:Ref158024378}The s\_region column can be used to precisely specify the covered spatial region of a data
product.
It is often an exact, or almost exact, representation of the illumination region of a given observation defined in a
standard way by the concept of Support in the Characterisation data model.
We specify the data type as adql:REGION so that users can specify spatial queries using a single column and in a
limited number of ways. If implemented in TAP 1.0, and included in the select list of the query, the output is always
an STC-S string as described in \cite{CITATIONTAPl1036} [section 6]. In the WHERE clause, the s\_region column can be
used with the ADQL geometry functions (INTERSECTS, CONTAINS) to specify conditions; the service will generally have to
translate these into native SQL that enforces the same conditions or a suitable approximation. Implementers may
approximate the spatial query conditions by translating the INTERSECTS and CONTAINS function calls in the query.
Because ObsTAP relies on ADQL queries and builds on TAP, the mapping of the ObsCore model data types shown
in Table 1 should be adjusted to the definitions stated in the TAP version used for the ObsTAP service.
Region computations are an advanced query capability which may not be supported by all services. Services should
however specify s\_region when possible to more precisely specify the spatial coverage of an observation.
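On services that do support region computations, a footprint overlap test can be written as in the following sketch.
The circle centre and radius (0.1 degree) are illustrative values only.

\begin{verbatim}
-- Products whose footprint overlaps a circle of radius 0.1 deg.
SELECT obs_publisher_did, s_region
FROM ivoa.ObsCore
WHERE INTERSECTS(s_region, CIRCLE('ICRS', 16.0, 40.0, 0.1)) = 1
\end{verbatim}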
\subsection{Spatial Resolution (s\_resolution)}
The s\_resolution column specifies a reference value chosen by the data provider for the estimated spatial resolution of
the data product in arcseconds. This refers to the smallest spatial feature in the observed signal that can be
resolved.
In cases where the spatial resolution varies across the field the best spatial resolution (smallest resolvable spatial
feature) should be specified. In cases where the spatial frequency sampling of an observation is complex (e.g.,
interferometry) a typical value for the spatial resolution should be given; additional characterisation may be
necessary to fully specify the spatial characteristics of the data.
\subsection{Time Bounds (t\_min, t\_max)}
\label{bkm:Ref285666427}The t\_min column contains the start time of the observation specified in MJD. The t\_max
column contains the stop time of the observation specified in MJD. For data products that result from the combination
of multiple frames, t\_min must be the minimum of the start times, and t\_max the maximum of the stop times.
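A time selection is typically an interval overlap test on these two columns. The sketch below selects products whose
coverage overlaps the calendar year 2010, using MJD 55197 for 2010-01-01 and MJD 55562 for 2011-01-01; the choice of
year is arbitrary.

\begin{verbatim}
-- Products whose time coverage overlaps the year 2010.
SELECT obs_publisher_did, t_min, t_max
FROM ivoa.ObsCore
WHERE t_min < 55562.0    -- starts before 2011-01-01
  AND t_max >= 55197.0   -- ends on or after 2010-01-01
\end{verbatim}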
\subsection{Exposure Time (t\_exptime)}
\label{bkm:Ref285666434}The t\_exptime column contains the exposure time. For simple exposures, this is just t\_max -
t\_min expressed in seconds. For data where the detector is not active at all times (e.g. data products made by
combining exposures taken at different times), the t\_exptime will be smaller than t\_max - t\_min. For data where the
t\_exptime is not constant over the entire data product, the median exposure time per pixel is a good way to
characterize the typical value. In general, t\_exptime is used as an indicator of the relative
sensitivity (depth) within a single data collection (e.g. obs\_collection); data providers should supply a suitable
relative value when it is not feasible to define or compute the true exposure time.
For targeted observations, by contrast, the exposure time is often adjusted to achieve a similar signal-to-noise
ratio for different targets.
\subsection{Time Resolution (t\_resolution)}
The t\_resolution column is the minimal interpretable interval between two points along the time axis. This can be an
average or representative value. For products with no sampling along the time axis, the t\_resolution could be set to
the exposure time or could be null. If set to exposure time, one could compose a WHERE clause like: WHERE
t\_resolution {\textless} t\_exptime to find those products which are time resolved.
This implementation preference avoids dealing with undefined data model fields, as originally considered in the
Characterisation data model for an unresolved time axis: a NULL value is preferred to an undefined field.
\subsection{Spectral Bounds (em\_min, em\_max)}
\label{bkm:Ref285651639}The em\_min column specifies the minimum spectral value observed, expressed as a vacuum
wavelength in meters.
The em\_max column contains the maximum spectral value observed, expressed as a vacuum wavelength in meters.
As mentioned in the data model in Appendix B, at least 3 physical quantities could in principle be used to represent the
spectral axis: energy, wavelength or frequency; which is most appropriate depends upon the observation domain. For
ObsTAP we are less concerned with how to present data to the user than with providing a simple and uniform way to
describe astronomical data, hence we restrict the spectral bounds units to wavelength in meters in vacuum. Conversion
to other quantities could be performed either on the client side for an application encapsulating queries, and/or on
the server side, for a data provider to expose its data from other regimes to ObsTAP queries.
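For reference, the conversions to vacuum wavelength from frequency $\nu$ and photon energy $E$ follow the standard
relations below, where $c$ is the speed of light and $h$ the Planck constant. Here $\lambda_{\min}$ and
$\lambda_{\max}$ stand for em\_min and em\_max. Note that the ordering of the bounds is reversed: the highest frequency
or energy gives em\_min. The numerical values are rounded and purely illustrative.

\[
\lambda = \frac{c}{\nu} = \frac{hc}{E}, \qquad
\lambda_{\min} = \frac{c}{\nu_{\max}}, \qquad
\lambda_{\max} = \frac{c}{\nu_{\min}}
\]
\[
\nu = 1.4\ \mathrm{GHz} \;\Rightarrow\;
\lambda \approx \frac{2.998\times10^{8}\ \mathrm{m\,s^{-1}}}{1.4\times10^{9}\ \mathrm{s^{-1}}} \approx 0.214\ \mathrm{m},
\qquad
E = 1\ \mathrm{keV} \;\Rightarrow\;
\lambda \approx \frac{1.240\times10^{-6}\ \mathrm{eV\,m}}{10^{3}\ \mathrm{eV}} \approx 1.24\times10^{-9}\ \mathrm{m}
\]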
\subsection{Spectral Resolving Power (em\_res\_power)}
The em\_res\_power column contains the typical or characteristic spectral resolving power defined as
${\lambda}/{\delta}{\lambda}$. The value is dimensionless.
\subsection{Polarization states (pol\_states)}
Polarisation states can also be described as a simple list of values if the dataset contains polarization data.
See section \ref{bkm:Ref482802717} for details.
\subsection{Observable Axis Description (o\_ucd)}
The o\_ucd column specifies a UCD \cite{CITATIONPre07l1036} describing the nature of the observable within the data
product. The observable is the measured quantity for a sampled dataset, for example photon counts or flux
density stored in the pixel value within an image. Often for optical astronomical images the value would be phot.count;
for fully flux calibrated data a value such as phot.flux.density (usually specified in Jy) would be used. Any valid UCD
is permitted. If no appropriate UCD is defined the field should be left NULL (the IVOA provides a process by which new
UCDs can be defined).
In the case of event lists, all components could be considered as observables prior to sampling; o\_ucd must then be left
NULL, unless the data provider wants to highlight a specific axis like phot.count.
\subsection{Axes lengths (s\_xel1, s\_xel2, em\_xel, t\_xel, pol\_xel)}
The lengths of each data axis (spatial, spectral, time, polarization) defined as a number of elements along each of
these axes are included in this specification for ObsCore v1.1. This data model element was already defined as an
attribute of the CharacterisationAxis Class in the Characterisation Data model. This provides quantitative information
on the geometry of the data portion along the axes defined in the Characterisation Data Model. Various use-cases in
Appendix A, section A.6 illustrate these discovery scenarios.
\begin{itemize}
\item s\_xel1, s\_xel2 specify the number of values spanned along the spatial dimensions.
\item em\_xel, t\_xel specify the number of values spanned along the spectral and time axes respectively.
\item pol\_xel specifies the number of polarization states present in the dataset.
\end{itemize}
This information helps to plan data selection, data slicing or subsetting following data discovery and will be used for
building up extracted subsets on the fly.
For pixelated data this concept clearly represents the number of samples along each axis.
In the case of non-pixelated data, like event lists, where several events can be gathered in one time bin or energy bin
for instance, these attributes should be set to -1. The number of elements in such lists is a different property and
should be represented in the NDCube DM which tackles sparse data sets.
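The axis lengths can be used directly as discovery constraints. The following sketch looks for large spectral cubes;
the thresholds are arbitrary. Products for which these attributes are set to -1, as described above for event lists,
do not satisfy such conditions and are not returned.

\begin{verbatim}
-- Cubes with at least 1000 spectral channels and
-- a spatial grid of 512 x 512 elements or larger.
SELECT obs_publisher_did, s_xel1, s_xel2, em_xel
FROM ivoa.ObsCore
WHERE dataproduct_type = 'cube'
  AND em_xel >= 1000
  AND s_xel1 >= 512
  AND s_xel2 >= 512
\end{verbatim}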
\subsection{Additional Columns}
\label{bkm:Ref421295535}\label{bkm:Ref421297012}Service providers may include additional columns in the ivoa.ObsCore
table to expose additional metadata. These columns must be described in the TAP\_SCHEMA.columns table and in the output
from the VOSI-tables resource \cite{CITATIONVOSI2010l1036}. Users may access these columns by examining the column
metadata of individual services and then either using them explicitly in queries or selecting all columns
(e.g. ``select * from ivoa.ObsCore ...'' in an ADQL query). To keep the keywords used as optional fields homogeneous,
we recommend using, where possible, the items defined in the full data model (Appendix B) and flagged
as optional. ObsTAP-compliant services will support all columns defined as mandatory and possibly some of the optional
ones. Queries built using additional columns defined specifically for a given archive might not be portable.
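The column metadata of a given service can, for instance, be inspected with a query on TAP\_SCHEMA.columns such as
the sketch below. It assumes that the service stores the table name exactly as `ivoa.ObsCore'; some services may use
a different letter case, in which case the string literal has to be adapted.
\begin{lstlisting}[language=SQL,flexiblecolumns=true]
-- List every column a service declares for its ObsCore table,
-- including any service-specific additional columns.
-- The table_name literal is an assumption (case may differ).
SELECT column_name, datatype, unit, ucd, utype, description
FROM TAP_SCHEMA.columns
WHERE table_name = 'ivoa.ObsCore'
ORDER BY column_name
\end{lstlisting}
Columns returned by this query that are not listed in this specification are additional columns in the sense of this
section, and queries relying on them are specific to that service.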
\section{Registering an ObsTAP Service}
\label{bkm:Ref298341494}The standard identifier for this version of the ObsCore model now follows the IVOA Identifiers
v2.0 specification \cite{CITATIONDem16l1036} and should be ivo://ivoa.net/std/ObsCore\#core-1.1.
The ObsCore data model will be registered using this identifier in the StandardsRegExt (standards registry extension)
definition.
TAP services that implement the ObsCore model should be registered to indicate this fact so that users can easily find
all services that accept ObsCore queries. This can be done in any registry by using the keyword ``ObsCore'' to describe
the service. In addition, fine-grained registries may include the complete VODataService table set description.
The TAPRegExt (Table Access Protocol registry extension) \cite{CITATIONDem15l1036} provides a mechanism (the
`dataModel' element) to list one or more data models that are supported by a TAP service. The data model support uses
the ivo standard identifier (above). One or more `dataModel' elements may be included as child elements of the
`capability' element that describes the TAP service, that is, the `capability' element with the following attributes:
\begin{itemize}
\item standardID={\textquotedbl}ivo://ivoa.net/std/TAP{\textquotedbl} (or later version)
\item xmlns:tr={\textquotedbl}http://www.ivoa.net/xml/TAPRegExt/v1.0{\textquotedbl} (or later version)
\item xsi:type={\textquotedbl}tr:TableAccess{\textquotedbl}
\end{itemize}
For TAP services that support ObsCore-1.0 only, the `dataModel' element would be:

{\textless}dataModel ivo-id={\textquotedbl}ivo://ivoa.net/std/ObsCore/v1.0{\textquotedbl}{\textgreater}ObsCore-1.0{\textless}/dataModel{\textgreater}

as defined in version 1.0 of this specification. TAP services that support ObsCore-1.1, with the dimension
elements s\_xel1, s\_xel2, em\_xel, etc. and the Utype updates, should instead include:

{\textless}dataModel ivo-id={\textquotedbl}ivo://ivoa.net/std/ObsCore\#core-1.1{\textquotedbl}{\textgreater}ObsCore-1.1{\textless}/dataModel{\textgreater}

In general, the data model support in TAPRegExt can be used when a TAP service contains tables and columns described
with Utypes from a standard data model; it is not generally necessary to have all the Utypes (e.g. the complete model).
However, since the ObsCore data model is a physical model designed specifically to be implemented in TAP services, the
standard identifier must only be used to specify data model support in the TAPRegExt if the ivoa.ObsCore table is
available and contains all the mandatory columns\footnote{ Additional columns with optional ObsCore Utypes, Utypes from
other data models, or no Utypes at all are allowed}.
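
Once services declare the data model in this way, a client can locate them through a searchable registry. The
following sketch assumes a registry that exposes the relational schema of the IVOA RegTAP specification, which is
outside the scope of this document. The table and column names (rr.res\_detail, rr.capability, rr.interface,
detail\_xpath, detail\_value, intf\_role) and the function ivo\_nocasematch are taken from that specification, not
from ObsCore.
\begin{lstlisting}[language=SQL,flexiblecolumns=true]
-- Assumes a RegTAP-compliant registry (not defined by ObsCore).
-- The pattern matches both the 1.0 and the 1.1 identifiers.
SELECT ivoid, access_url
FROM rr.res_detail
NATURAL JOIN rr.capability
NATURAL JOIN rr.interface
WHERE detail_xpath = '/capability/dataModel/@ivo-id'
AND intf_role = 'std'
AND 1 = ivo_nocasematch(detail_value, 'ivo://ivoa.net/std/obscore%')
\end{lstlisting}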
\section{Changes from Previous Versions}
Errata on REC 2017 May 09:
Removed FWHM from Time resolution definition in Table 1. 2018/05/12
Updated UCD column for obs\_publisher\_did and publisher\_id to meta.ref.ivoid
Version REC 2017 May 09: (after final and careful read from A. Micol)
\begin{itemize}
\item corrected erroneous section number 7 into 6,
\item corrected datamodel identifier in section 5,
\item corrected UCD for obs\_publisher\_did and publisher\_id to meta.ref.uri instead of meta.ref.url,
\item reword t\_resolution paragraph in section .
\end{itemize}
Version PR 1.1 2017: Mar 05 2017: table 4 in Appendix B: definition of pol\_states and definition of o\_calib\_status
Version PR 1.1 Oct04: typos and corrections of xml document example in App. C section 3
Version PR 1.1 March 2016 to PR1.1 September 2016:
\begin{itemize}
\item In TAP\_SCHEMA table 3, changed datatype for access\_estsize to adql:BIGINT instead of adql:INTEGER and for
pol\_xel to adql:INTEGER instead of adql:BIGINT
\item In Abstract: clarified links with DataSet Metadata DM and emphasized the TAP/ADQL implementation aspects of this
specification
\item Define `measurements' as a new dataproduct\_type for source lists, catalogs, and products derived from data
extraction/interpretation.
\item Extend allowed values for calib\_level up to 4, to represent data products obtained as results of deep analysis
processing.
\item Cleanup for expressing Use-cases in Appendix A
\item IVOA IDs in TAPRegExt updated to comply with IVOA Identifiers version 2.0
\item Typo corrections from reviewers' comments listed on the RFC page
\item TAPRegExt section 5 rewrite
\item Use case appendix A: reformulate request for service capabilities
\end{itemize}
Version 1.0 to 1.1 March 2016:
\begin{itemize}
\item Homogenize case in Utypes strings
\item Improve ucd tags for s\_region now labeled with pos.outline;obs.field instead of phys.area;obsfield
\item Include axes dimensions (number of elements along one axis) expressed as s\_xel, em\_xel, t\_xel, etc., as an
extrapolation of the definition for pi\_xel, vo\_xel, etc.
\item Insert field s\_pixel\_scale that was missing in Appendix B Table 5 and App. C Table 7.
\item Homogenize root class name: Obs(ervation) changed to ObsDataset according to Dataset Metadata data model and Cube
DM
\item Enlarge the possible values for em\_ucd to allow the search for Doppler features (velocity cubes)
\item Correct minor Utype typos and inconsistencies.
\end{itemize}
% NOTE: IVOA recommendations must be cited from docrepo rather than ivoabib
% (REC entries there are for legacy documents only)
\bibliography{ivoatex/ivoabib,ivoatex/docrepo}
\appendix
\section{Use Cases in detail}
The ability to discover data of a certain kind (images, spectra, cubes, event lists) according to scientific criteria
(e.g., a given sky position, spectral coverage including spectral line X, spatial resolution better than Y, resolving
power greater than Z) is central to archival astronomy. A special Take Up Committee of the IVOA was formed in 2009 to
stimulate IVOA work in the area of catalog-based science data access to allow astronomers to easily query and access
scientific data. This committee came up with a list of data discovery use cases expressed as a set of constraints on
selected scientific parameters to be used to query for datasets of interest. The full list of use cases is summarized
below.
Please note that most science cases require a full TAP implementation, as well as support for STC regions
\cite{CITATIONSTCl1036}, in order to work.
Some of the use-cases listed by the committee require advanced functionalities such as ``search by type'' or ``query
from an input list'' and have not been fully developed here.
Typical use-cases are described below.
A wider set of working examples (beta release) is available at http://saada.unistra.fr/voexamples/show/ObsCore, a
DALI-compliant example service developed by Laurent Michel, Mireille Louys and Daniel Durand (May 2016, in progress).
\subsection*{Simple Examples}
\subsubsection*{Simple Query by Position}
Show me a list of all data that satisfies:
\begin{enumerate}
\item Datatype=any
\item contains RA=16.0 and DEC=40.0
\end{enumerate}
These data would be searched on all VO services by sending the following query:
\begin{lstlisting}[language=SQL,flexiblecolumns=true]
SELECT * FROM ivoa.Obscore
WHERE CONTAINS(POINT('ICRS',16.0,40.0),s_region)=1
\end{lstlisting}