From f1ec40f1532f15e64723443fbf1515f54a32f321 Mon Sep 17 00:00:00 2001 From: Samantha Wittke <32324155+samumantha@users.noreply.github.com> Date: Mon, 22 Jan 2024 16:22:02 +0200 Subject: [PATCH 01/37] rm sentinel from mosaics text --- docs/support/tutorials/gis/eo_guide.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/docs/support/tutorials/gis/eo_guide.md b/docs/support/tutorials/gis/eo_guide.md index 7810612897..2f7dd81103 100644 --- a/docs/support/tutorials/gis/eo_guide.md +++ b/docs/support/tutorials/gis/eo_guide.md @@ -108,7 +108,7 @@ Commercial datasets are usually available from data provider, while open dataset Some Finnish EO datasets are available locally at CSC. A STAC catalog for all spatial data available at CSC is currently in progress. You can find more information about it and its current content from the [Paituli STAC page](https://paituli.csc.fi/stac.html). -* **Sentinel and Landsat mosaics** of Finland in Puhti. Accessing data in Puhti requires CSC user account with a project where Puhti service is enabled. All Puhti users have **read** access to these datasets. You do not need to move the files: they can be used directly, unless you need to modify them, which requires you to make your own copy. +* **Landsat mosaics** of Finland in Puhti. Accessing data in Puhti requires CSC user account with a project where Puhti service is enabled. All Puhti users have **read** access to these datasets. You do not need to move the files: they can be used directly, unless you need to modify them, which requires you to make your own copy. * **Sentinel-2 L2A data** of Finland in Allas. These files are public, so anybody can download them, also from own computer or other services. 
* [More information and list of all spatial datasets in CSC computing environment](../../../data/datasets/spatial-data-in-csc-computing-env.md) From 5ab085d5c6e10a3d0811ecccfe18a9de2df3ab25 Mon Sep 17 00:00:00 2001 From: Samantha Wittke <32324155+samumantha@users.noreply.github.com> Date: Mon, 22 Jan 2024 17:21:35 +0200 Subject: [PATCH 02/37] general CDSE section added --- docs/support/tutorials/gis/eo_guide.md | 39 +++++++++++++++++++++++++- 1 file changed, 38 insertions(+), 1 deletion(-) diff --git a/docs/support/tutorials/gis/eo_guide.md b/docs/support/tutorials/gis/eo_guide.md index 2f7dd81103..8c6c9a0a5b 100644 --- a/docs/support/tutorials/gis/eo_guide.md +++ b/docs/support/tutorials/gis/eo_guide.md @@ -116,7 +116,7 @@ Some Finnish EO datasets are available locally at CSC. A STAC catalog for all sp **[SYKE/FMI, Finnish image mosaics](https://www.syke.fi/fi-FI/Tutkimus__kehittaminen/Tutkimus_ja_kehittamishankkeet/Hankkeet/Paikkatietoalusta_PTA)** : Sentinel-1, Sentinel-2 and Landsat mosaics, for several time periods per year. Some of them are available in Puhti, but not all. [FMI provides also a STAC catalog for these mosaics](https://pta.data.lit.fmi.fi/stac/root.json) -[**Copernicus Data Space Ecosystem**](https://dataspace.copernicus.eu/) provides worldwide main products for Sentinel-1, -2 and -3. It requires free registration. Includes possibility for visualisation and data processing. This was introduced in late 2023 and replaced the European Space Agency's SciHub. This service provides much more than a data download service, see for example all [analysing services of the Copernicus Data Space Ecosystem](https://dataspace.copernicus.eu/analyse). +[**Copernicus Data Space Ecosystem**](https://dataspace.copernicus.eu/) provides worldwide main products for Sentinel-1, -2 and -3. It requires free registration. Includes possibility for visualisation and data processing. This was introduced in late 2023 and replaced the European Space Agency's SciHub. 
This service provides much more than a data download service, see below for more information.

[**FinHub**](https://finhub.nsdc.fmi.fi/#/home) is the Finnish national mirror of SciHub; other national mirrors also exist. It covers Finland and the Baltics and offers Sentinel-2 L1C (but not L2A) and Sentinel-1 SLC, GRD and OCN products, and requires a separate registration. FinHub provides a Graphical User Interface (GUI) and Application Programming Interface (API) similar to the old SciHub for accessing the data. You can also use for example the [sentinelsat](https://sentinelsat.readthedocs.io/en/stable/) tool for downloading data from FinHub.

@@ -136,6 +136,43 @@ Some Finnish EO datasets are available locally at CSC. A STAC catalog for all sp
 
 To find other geospatial datasets, check out [CSC open spatial dataset list](https://research.csc.fi/open-gis-data).
 
+### Copernicus Data Space Ecosystem
+
+The Copernicus Data Space Ecosystem not only provides the possibility to browse, visualize and download Earth observation data from Copernicus and other programmes, it also provides several options for further processing the data in the cloud. Almost all of the services require self-registration, which everyone can do for free. Check out their website for all [available dataset descriptions](https://documentation.dataspace.copernicus.eu/Data.html) (note that duplicates may be present due to reprocessing with the newest baselines).
+
+The [Copernicus Data Space Ecosystem Browser](https://dataspace.copernicus.eu/browser/) serves as a central hub for accessing and exploring Earth observation and environmental data provided by the Copernicus Sentinel constellations, contributing missions, auxiliary engineering data, on-demand data and more (check out the [documentation on Data](https://documentation.dataspace.copernicus.eu/Data.html) for more details). Users can visualize, compare, analyze and download all this data.
+
+The [Copernicus Data Workspace](https://dataspace.copernicus.eu/workspace/) is a tool for managing and reviewing Earth observation-related products. This platform enables you to aggregate and review products, which can then be further processed or downloaded for various purposes. When products are selected for processing, you are provided with a list of processors that are capable of processing the relevant data types. The processors can be further parameterized to fine-tune the results.
+
+The [openEO Algorithm Plaza](https://marketplace-portal.dataspace.copernicus.eu) is a marketplace to discover and share various EO algorithms expressed as openEO process graphs.
+
+The [openEO Web Editor](https://openeo.dataspace.copernicus.eu/) is a web-based graphical user interface (GUI) that allows users (who are not familiar with a programming language) to interact with the openEO API and perform various tasks related to Earth observation data processing, such as querying available data, defining processing workflows, executing processes, and visualizing the results. It allows users to build complex processing chains by connecting different processing steps as building blocks and provides options to specify parameters and input data for each step.
+
+[Copernicus' own Jupyter Lab instances](https://jupyterhub.dataspace.copernicus.eu/) provide example notebooks and the possibility to add your own packages via pip. 10 GB of persistent space is available per user (deleted after 15 days without login). The example notebooks are also available on [Copernicus Data Space Ecosystem GitHub](https://github.com/eu-cdse/notebook-samples). JupyterHub provides several server options with 2-4 CPUs and 4-16 GB of RAM. Note that in addition to per-user limits, the total number of concurrent active users also appears to be limited.
+
+The Copernicus Data Space Ecosystem provides several different APIs to access the data.
A [Copernicus access token]((https://documentation.dataspace.copernicus.eu/APIs/Token.html)) is needed to make use of these interfaces.
+
+Catalog APIs, all connected to the same database:
+
+ - OData
+ - OpenSearch
+ - STAC (Spatio Temporal Asset Catalog)
+
+Streamlined Data Access APIs (SDA) enable users to access and retrieve Earth observation (EO) data from the Copernicus Data Space Ecosystem catalogue. These APIs also provide you with a set of tools and services to support data processing and analysis:
+
+ - SentinelHub
+ - OpenEO
+
+In addition, also direct EO data access with S3 is provided, as well as an on-demand production API and traceability service.
+
+The [Copernicus Request builder](https://shapps.dataspace.copernicus.eu/requests-builder/) lets you build requests for the different APIs via Graphical User Interface (GUI). It can also provide complete Python scripts using 'requests' or 'sentinelhub' Python packages. Requests can be sent immediately from the GUI or copied into own script/terminal.
+
+The [Sentinel Hub QGIS Plugin](https://documentation.dataspace.copernicus.eu/Applications/QGIS.html) allows you to view satellite image data from the Copernicus Data Space Ecosystem or from Sentinel Hub directly within a QGIS workspace. All datasets that are part of collections associated with your user are available. The current functionality of the QGIS Plugin is for visualization; it does not allow you to perform operations or access properties of the dataset.
+
+The [Copernicus dashboard](https://dashboard.dataspace.copernicus.eu/) shows the state of services and products.
+
+Different services have different limitations, which are described here in the [CDSE quota documentation](https://documentation.dataspace.copernicus.eu/Quotas.html). Compared to the previous system (SciHub), the number of concurrent downloads per user has increased from two to four.
+
 ## How can I process EO data at CSC?
You can find information about geocomputing using CSC resources and how to get started on [CSC geocomputing pages](https://research.csc.fi/geocomputing), including links to creating user accounts and all other practical information. From c31209197bdce0d17cec26679a6b8c4b6faa2310 Mon Sep 17 00:00:00 2001 From: Samantha Wittke <32324155+samumantha@users.noreply.github.com> Date: Mon, 22 Jan 2024 17:37:22 +0200 Subject: [PATCH 03/37] fix link --- docs/support/tutorials/gis/eo_guide.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/docs/support/tutorials/gis/eo_guide.md b/docs/support/tutorials/gis/eo_guide.md index 8c6c9a0a5b..56ec4e0016 100644 --- a/docs/support/tutorials/gis/eo_guide.md +++ b/docs/support/tutorials/gis/eo_guide.md @@ -150,7 +150,7 @@ The [openEO Web Editor](https://openeo.dataspace.copernicus.eu/) is a web-based [Copernicus own Jupyter Lab instances](https://jupyterhub.dataspace.copernicus.eu/) provide example notebooks, and the possibility to add own packages via pip. 10Gb of persistent space per user (deleted after 15 days without login). The example notebooks are also available on [Copernicus Data Space Ecosystem github](https://github.com/eu-cdse/notebook-samples). JupyterHub provides several server options with 2 - 4 CPUs and 4 - 16 Gb of RAM. Note that in addition to personal limits, also the total number of active users seems to be limited. -The Copernicus Data Space Ecosystem provides several different APIs to access the data. A [Copernicus access token]((https://documentation.dataspace.copernicus.eu/APIs/Token.html)) is needed to make use of these interfaces. +The Copernicus Data Space Ecosystem provides several different APIs to access the data. A [Copernicus access token](https://documentation.dataspace.copernicus.eu/APIs/Token.html) is needed to make use of these interfaces. 
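Obtaining such a token is a single POST request. Below is a minimal sketch using only Python's standard library; it assumes the password grant with the public `cdse-public` client described in the token documentation linked above, so check that page for the current endpoint and parameters:

```python
from urllib.parse import urlencode
from urllib.request import Request

# Assumed token endpoint; verify against the CDSE token documentation.
TOKEN_URL = ("https://identity.dataspace.copernicus.eu/auth/realms/CDSE/"
             "protocol/openid-connect/token")

def token_request(username: str, password: str) -> Request:
    """Build the POST request for a CDSE access token (password grant)."""
    data = urlencode({
        "grant_type": "password",
        "client_id": "cdse-public",  # public client name per the CDSE docs
        "username": username,
        "password": password,
    }).encode()
    return Request(TOKEN_URL, data=data, method="POST")

# Sending it requires a real CDSE account, e.g.:
# import json
# from urllib.request import urlopen
# token = json.loads(urlopen(token_request("user@example.com", "***")).read())["access_token"]
```

The `access_token` field of the response JSON is then passed in the `Authorization: Bearer` header of subsequent API calls.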
Catalog APIs, all connected to the same database: From e4a80f260d78bee88636cf6babbc574a1b9d6917 Mon Sep 17 00:00:00 2001 From: Samantha Wittke <32324155+samumantha@users.noreply.github.com> Date: Mon, 29 Jan 2024 10:34:01 +0200 Subject: [PATCH 04/37] add landsat mosaics back to data page --- docs/data/datasets/spatial-data-in-csc-computing-env.md | 7 +++++-- 1 file changed, 5 insertions(+), 2 deletions(-) diff --git a/docs/data/datasets/spatial-data-in-csc-computing-env.md b/docs/data/datasets/spatial-data-in-csc-computing-env.md index a6f6d5ceb0..88b3d3c135 100644 --- a/docs/data/datasets/spatial-data-in-csc-computing-env.md +++ b/docs/data/datasets/spatial-data-in-csc-computing-env.md @@ -19,10 +19,13 @@ Puhti has following datasets: * [Gridcells](http://www.paikkatietohakemisto.fi/geonetwork/srv/fin/catalog.search#/metadata/3fa1beeb-ea6b-42b1-8e76-eb2bc8ac6d24) * [Forest mask](https://www.paikkatietohakemisto.fi/geonetwork/srv/fin/catalog.search#/metadata/df99fbd3-44b3-4ffc-b84a-9459f318d545) * [Forest resource plots](http://www.paikkatietohakemisto.fi/geonetwork/srv/fin/catalog.search#/metadata/332e5abf-63c2-4723-9c2d-4a926bbe587a) +* **Landsat mosaics produced by SYKE and FMI** in Paikkatietoalusta project + - [Historical Landsat satellite image mosaics](https://ckan.ymparisto.fi/dataset/historical-landsat-satellite-image-mosaics-href-historialliset-landsat-kuvamosaiikit-href): 1985, 1990, 1995 + - [Historical Landsat NDVI mosaics: 1984-2011](https://ckan.ymparisto.fi/dataset/historical-landsat-image-index-mosaics-hind-historialliset-landsat-kuvaindeksimosaiikit-hind) -!!! warning "Satellite mosaics produced by SYKE and FMI in Paikkatietoalusta project were removed from Puhti on 21.11.2023" +!!! 
warning "Sentinel satellite mosaics produced by SYKE and FMI in Paikkatietoalusta project were removed from Puhti on 21.11.2023"
 
-    The removed datasets were: [Sentinel1 SAR mosaics](https://ckan.ymparisto.fi/dataset/sentinel-1-sar-image-mosaic-s1sar-sentinel-1-sar-kuvamosaiikki-s1sar), [Sentinel2 index mosaics](https://ckan.ymparisto.fi/dataset/sentinel-2-image-index-mosaics-s2ind-sentinel-2-kuvamosaiikit-s2ind), [Historical Landsat satellite image mosaics](https://ckan.ymparisto.fi/dataset/historical-landsat-satellite-image-mosaics-href-historialliset-landsat-kuvamosaiikit-href) and [Historical Landsat NDVI mosaics: 1984-2011](https://ckan.ymparisto.fi/dataset/historical-landsat-image-index-mosaics-hind-historialliset-landsat-kuvaindeksimosaiikit-hind). They are available from FMI's own object storage which has more data than was stored to Puhti local disks. The easiest way to find PTA sentinel mosaics from FMI is with [Paituli STAC](https://paituli.csc.fi/stac.html). Paituli STAC page includes also usage examples for R and Python.
+    The removed datasets were: [Sentinel1 SAR mosaics](https://ckan.ymparisto.fi/dataset/sentinel-1-sar-image-mosaic-s1sar-sentinel-1-sar-kuvamosaiikki-s1sar), [Sentinel2 index mosaics](https://ckan.ymparisto.fi/dataset/sentinel-2-image-index-mosaics-s2ind-sentinel-2-kuvamosaiikit-s2ind). They are available from FMI's own object storage, which holds more data than was stored on Puhti's local disks. The easiest way to find the PTA Sentinel mosaics from FMI is with [Paituli STAC](https://paituli.csc.fi/stac.html). The Paituli STAC page also includes usage examples for R and Python.
 
 NLS 2m DEM, lidar, infrared orthophotos and all SYKE datasets are updated in Puhti automatically every Monday.
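Since a STAC catalog is plain JSON, the mosaic collections behind the Paituli and FMI STAC links mentioned above can be explored even without a dedicated STAC client. A minimal sketch with Python's standard library; the `child_links` helper is illustrative, not part of any STAC client API:

```python
import json
from urllib.request import urlopen

# FMI's STAC root for the PTA mosaics, as linked in this guide.
FMI_STAC_ROOT = "https://pta.data.lit.fmi.fi/stac/root.json"

def child_links(catalog: dict) -> list:
    """Return the hrefs of the child catalogs/collections of a STAC catalog."""
    return [link["href"] for link in catalog.get("links", [])
            if link.get("rel") == "child"]

# Fetch the root catalog and list its collections (requires network access):
# catalog = json.load(urlopen(FMI_STAC_ROOT))
# print(child_links(catalog))
```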
From 3d89785c4a3b28336d475dc2784d6f55ab1cde13 Mon Sep 17 00:00:00 2001 From: Samantha Wittke <32324155+samumantha@users.noreply.github.com> Date: Mon, 29 Jan 2024 11:02:27 +0200 Subject: [PATCH 05/37] add draft download suggestion for CDSE to eoguide --- docs/support/tutorials/gis/eo_guide.md | 9 +++++++-- 1 file changed, 7 insertions(+), 2 deletions(-) diff --git a/docs/support/tutorials/gis/eo_guide.md b/docs/support/tutorials/gis/eo_guide.md index 56ec4e0016..5e0310b028 100644 --- a/docs/support/tutorials/gis/eo_guide.md +++ b/docs/support/tutorials/gis/eo_guide.md @@ -163,7 +163,7 @@ Streamlined Data Access APIs (SDA) enables users to access and retrieve Earth ob - SentinelHub - OpenEO -In addition, also direct EO data access with S3 is provided, as well as an on-demand production API and traceability service. +In addition, also direct EO data access to the CDSE object storage with S3 is provided, as well as an on-demand production API and traceability service. The [Copernicus Request builder](https://shapps.dataspace.copernicus.eu/requests-builder/) lets you build requests for the different APIs via Graphical User Interface (GUI). It can also provide complete Python scripts using 'requests' or 'sentinelhub' Python packages. Requests can be sent immediately from the GUI or copied into own script/terminal. @@ -171,7 +171,12 @@ The [Sentinel Hub QGIS Plugin](https://documentation.dataspace.copernicus.eu/App The [Copernicus dashboard](https://dashboard.dataspace.copernicus.eu/) shows the state of services and products. -Different services have different limitations, which are described here in the [CDSE quota documentation](https://documentation.dataspace.copernicus.eu/Quotas.html). Compared to the previous system (SciHub), the number of concurrent downloads per user has increased from two to four. +Different services have different limitations, which are described here in the [CDSE quota documentation](https://documentation.dataspace.copernicus.eu/Quotas.html). 
Compared to the previous system (SciHub), the number of concurrent downloads per user has increased from two to four for most APIs. + +For downloading data from the CDSE to CSC computing environment, we recommend using the **S3** access, via command line tools like `s3cmd` and `rclone`. Also access via Python is supported as in the examples shown in example Notebooks, using [SentinelHub](https://github.com/eu-cdse/notebook-samples/blob/main/sentinelhub/data_download_process_request.ipynb) or [basic OData API and requests](https://github.com/eu-cdse/notebook-samples/blob/main/geo/odata_basics.ipynb). + +You can copy data from CDSE object storage also directly to Allas, following the instructions on [Allas docs page](https://docs.csc.fi/data/Allas/accessing_allas/#copying-files-directly-between-object-storages) with endpoint `eodata.dataspace.copernicus.eu` and access and secret keys by following the [instructions about generating CDSE access keys](https://documentation.dataspace.copernicus.eu/APIs/S3.html#generate-secrets). + ## How can I process EO data at CSC? From 055e438234c36a4c25d659cc148209dd7047ff21 Mon Sep 17 00:00:00 2001 From: Samantha Wittke <32324155+samumantha@users.noreply.github.com> Date: Mon, 29 Jan 2024 16:39:46 +0200 Subject: [PATCH 06/37] fix internal docs link --- docs/support/tutorials/gis/eo_guide.md | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/docs/support/tutorials/gis/eo_guide.md b/docs/support/tutorials/gis/eo_guide.md index 5e0310b028..215edf99d1 100644 --- a/docs/support/tutorials/gis/eo_guide.md +++ b/docs/support/tutorials/gis/eo_guide.md @@ -173,9 +173,9 @@ The [Copernicus dashboard](https://dashboard.dataspace.copernicus.eu/) shows the Different services have different limitations, which are described here in the [CDSE quota documentation](https://documentation.dataspace.copernicus.eu/Quotas.html). 
Compared to the previous system (SciHub), the number of concurrent downloads per user has increased from two to four for most APIs. -For downloading data from the CDSE to CSC computing environment, we recommend using the **S3** access, via command line tools like `s3cmd` and `rclone`. Also access via Python is supported as in the examples shown in example Notebooks, using [SentinelHub](https://github.com/eu-cdse/notebook-samples/blob/main/sentinelhub/data_download_process_request.ipynb) or [basic OData API and requests](https://github.com/eu-cdse/notebook-samples/blob/main/geo/odata_basics.ipynb). +For downloading data from the CDSE to CSC computing environment, we recommend using the **S3** access, via command line tools like `s3cmd` and `rclone`. Also access via Python is supported as shown in the example Notebooks, using [SentinelHub](https://github.com/eu-cdse/notebook-samples/blob/main/sentinelhub/data_download_process_request.ipynb) or [basic OData API and requests](https://github.com/eu-cdse/notebook-samples/blob/main/geo/odata_basics.ipynb). -You can copy data from CDSE object storage also directly to Allas, following the instructions on [Allas docs page](https://docs.csc.fi/data/Allas/accessing_allas/#copying-files-directly-between-object-storages) with endpoint `eodata.dataspace.copernicus.eu` and access and secret keys by following the [instructions about generating CDSE access keys](https://documentation.dataspace.copernicus.eu/APIs/S3.html#generate-secrets). +You can copy data from CDSE object storage also directly to CSCs object storage Allas, following the instructions on [Allas docs page](../../../data/Allas/accessing_allas/#copying-files-directly-between-object-storages) with endpoint `eodata.dataspace.copernicus.eu` and access and secret keys by following the [instructions about generating CDSE access keys](https://documentation.dataspace.copernicus.eu/APIs/S3.html#generate-secrets). ## How can I process EO data at CSC? 
From 579ead106aade966d824bc40f0d52266fcbbc531 Mon Sep 17 00:00:00 2001
From: Samantha Wittke <32324155+samumantha@users.noreply.github.com>
Date: Mon, 29 Jan 2024 17:55:21 +0200
Subject: [PATCH 07/37] small fixes to text

---
 docs/support/tutorials/gis/eo_guide.md | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/docs/support/tutorials/gis/eo_guide.md b/docs/support/tutorials/gis/eo_guide.md
index 215edf99d1..6f58af39c6 100644
--- a/docs/support/tutorials/gis/eo_guide.md
+++ b/docs/support/tutorials/gis/eo_guide.md
@@ -173,9 +173,9 @@ The [Copernicus dashboard](https://dashboard.dataspace.copernicus.eu/) shows the
 
 Different services have different limitations, which are described here in the [CDSE quota documentation](https://documentation.dataspace.copernicus.eu/Quotas.html). Compared to the previous system (SciHub), the number of concurrent downloads per user has increased from two to four for most APIs.
 
-For downloading data from the CDSE to CSC computing environment, we recommend using the **S3** access, via command line tools like `s3cmd` and `rclone`. Also access via Python is supported as shown in the example Notebooks, using [SentinelHub](https://github.com/eu-cdse/notebook-samples/blob/main/sentinelhub/data_download_process_request.ipynb) or [basic OData API and requests](https://github.com/eu-cdse/notebook-samples/blob/main/geo/odata_basics.ipynb).
+For downloading data from the CDSE to the CSC computing environment, we recommend using the **S3** access to the CDSE object storage, via command line tools like `s3cmd` and `rclone` or Python's `boto3`. Other ways of accessing the data via Python are also supported and work in the CSC computing environment, as shown in the example Notebooks using [SentinelHub](https://github.com/eu-cdse/notebook-samples/blob/main/sentinelhub/data_download_process_request.ipynb) or [basic OData API and requests](https://github.com/eu-cdse/notebook-samples/blob/main/geo/odata_basics.ipynb).
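For the OData route, the queries in the linked notebook boil down to a filtered GET on the CDSE catalogue. A minimal sketch of building such a query with Python's standard library; the collection name and dates below are illustrative:

```python
from urllib.parse import urlencode

# CDSE OData product catalogue endpoint.
ODATA_URL = "https://catalogue.dataspace.copernicus.eu/odata/v1/Products"

def odata_query(collection: str, start: str, end: str, top: int = 10) -> str:
    """Build an OData product-search URL for a collection and time range."""
    flt = (f"Collection/Name eq '{collection}' "
           f"and ContentDate/Start gt {start} "
           f"and ContentDate/Start lt {end}")
    return f"{ODATA_URL}?{urlencode({'$filter': flt, '$top': top})}"

url = odata_query("SENTINEL-2",
                  "2024-01-01T00:00:00.000Z",
                  "2024-01-31T00:00:00.000Z")
```

The returned JSON can then be fetched with e.g. `requests.get(url).json()`; downloading the actual products additionally needs the access token or S3 keys described above.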
-You can copy data from CDSE object storage also directly to CSCs object storage Allas, following the instructions on [Allas docs page](../../../data/Allas/accessing_allas/#copying-files-directly-between-object-storages) with endpoint `eodata.dataspace.copernicus.eu` and access and secret keys by following the [instructions about generating CDSE access keys](https://documentation.dataspace.copernicus.eu/APIs/S3.html#generate-secrets).
+You can copy data from CDSE object storage also directly to CSC's object storage Allas, following the instructions on the [Allas docs page](../../../data/Allas/accessing_allas#copying-files-directly-between-object-storages) with the endpoint `eodata.dataspace.copernicus.eu` and access and secret keys obtained by following the [instructions about generating CDSE access keys](https://documentation.dataspace.copernicus.eu/APIs/S3.html#generate-secrets).
 
 ## How can I process EO data at CSC?

From 85638ebfa57789fa908ebcadecde0e741afd38f1 Mon Sep 17 00:00:00 2001
From: Kylli Ek
Date: Mon, 29 Jan 2024 17:59:15 +0200
Subject: [PATCH 08/37] Remove SWIFT, reorganize all connection set up, add CDSE.

---
 docs/support/tutorials/gis/gdal_cloud.md | 101 +++++++++++++++--------
 1 file changed, 66 insertions(+), 35 deletions(-)

diff --git a/docs/support/tutorials/gis/gdal_cloud.md b/docs/support/tutorials/gis/gdal_cloud.md
index fcfe964762..69604cfc9f 100644
--- a/docs/support/tutorials/gis/gdal_cloud.md
+++ b/docs/support/tutorials/gis/gdal_cloud.md
@@ -1,14 +1,23 @@
-# Using geospatial files directly from cloud, inc Allas
+# Using geospatial files directly from public repositories and S3 storage services, inc Allas
 
-[GDAL](../../../apps/gdal.md) is the main open-source library for reading and writing geospatial data and many more advanced tools rely on GDAL, including QGIS, ArcGIS, Python, R etc.
GDAL and most tools depending on it can read directly from an public URL or cloud storage services, which eliminates the need to download the files manually before data analysis. It can also write files to cloud storage services. Several cloud storage services are supported, inc CSC [Allas](../../../data/Allas/index.md), [LUMI-O](https://docs.lumi-supercomputer.eu/storage/lumio/), [Amazon S3](https://aws.amazon.com/pm/serv-s3/), [Google Cloud Storage](https://cloud.google.com/storage), [Microsoft Azure Blob Storage](https://azure.microsoft.com/en-us/products/storage/blobs/) etc. Reading data directly from an external service is usually slower than reading from local disks, but in many cases, these seconds are negligible compared to the full duration of an analysis, but it is important to have good Internet connection.
+[GDAL](https://gdal.org/) is the main open-source library for reading and writing geospatial data, and many more advanced tools rely on GDAL, including QGIS, ArcGIS, Python, R etc. GDAL has several virtual [network based file systems](https://gdal.org/user/virtual_file_systems.html#network-based-file-systems) that are meant for different APIs or use cases. GDAL and most tools depending on it can **read** directly from a public URL or S3 storage services. This eliminates the need to download the files manually before data analysis. GDAL can also **write** files to S3 storage services, but only some of the tools built on it support it.
-GDAL has several virtual [network based files systems](https://gdal.org/user/virtual_file_systems.html#network-based-file-systems), that are meant for different storage services or use cases. CSC Allas supports both SWIFT or S3 API. SWIFT is more secure, but the credentials need to be updated after 8 hours. S3 has permanent keys, and is therefore little bit easier to use. Both of these have a random reading and streaming API.
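These virtual file systems are addressed purely through path prefixes, e.g. `/vsicurl/` in front of an ordinary URL. A tiny helper keeps the prefixing consistent; this is a sketch of the naming convention only, and GDAL itself is not needed for the string handling:

```python
def vsicurl(url: str) -> str:
    """Wrap a plain HTTP(S)/FTP URL into a GDAL /vsicurl/ path."""
    return url if url.startswith("/vsicurl/") else "/vsicurl/" + url

path = vsicurl("https://bucket_name.a3s.fi/object_name")
print(path)  # /vsicurl/https://bucket_name.a3s.fi/object_name
```

The resulting path can then be passed to any GDAL-based reader, for example `gdal.Info(path)` from the `osgeo` package, or to `rasterio.open(path)`.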
+Reading data directly from an external service is usually slower than reading from local disks, but in many cases, these seconds are negligible compared to the full duration of an analysis, but it is important to have good Internet connection. -Below are described in more detail how to use GDAL with public files from URL (VSICURL), private files in S3 (VSIS3) or SWIFT (VSISWIFT) storage, but also other object storage services are supported. Special attention is on CSC Allas service and supercomputers. +S3 services are very common for storing bigger amounts of data, for example: -## VSICURL, reading public files from URL or cloud storage service +* CSC [Allas](../../../data/Allas/index.md), +* EuroHPC [LUMI-O](https://docs.lumi-supercomputer.eu/storage/lumio/), +* ESA [Copernicus Data Space Ecosystem S3](https://documentation.dataspace.copernicus.eu/APIs/S3.html), +* Amazon [S3](https://aws.amazon.com/pm/serv-s3/), +* Google [Cloud Storage](https://cloud.google.com/storage), +* Microsoft [Azure Blob Storage](https://azure.microsoft.com/en-us/products/storage/blobs/) etc. -Without any extra settings always should work [VSICURL](https://gdal.org/user/virtual_file_systems.html#vsicurl), that can be used for reading of files available through HTTP/FTP web protocols. Public objects in object storag usually also have an URL, so this works also for public object storage files. VSICURL supports also partial reading of files, so it works well with cloud-optimized file formats. VSICURL supports also basic authentication. +Below are described in more detail how to use GDAL with public files from URL (VSICURL) and private files in S3 (VSIS3) storage. Special attention is on CSC Allas service and supercomputers. For using GDAL in supercomputer, a module including [GDAL](../../../apps/gdal.md), must be activated. 
+ +## GDAL VSICURL, reading public files from URL + +[VSICURL](https://gdal.org/user/virtual_file_systems.html#vsicurl) can be used for reading of files available via URL. Public objects in S3 storage usually also have an URL, so this works also for public S3 files. VSICURL supports also partial reading of files, so it works well with cloud-optimized file formats. VSICURL supports also basic authentication. ``` # A public file @@ -25,39 +34,75 @@ gdalinfo /vsicurl/https://bucket_name.a3s.fi/object_name # Amazon S3 (us-west-2) gdalinfo /vsicurl/https://s3.us-west-2.amazonaws.com/bucket_name/object_name -#Depending on installation settings VSICURL may sometimes work also without the `/vsicurl/` before the URL. +#Depending on GDAL installation settings VSICURL may sometimes work also without the `/vsicurl/` before the URL. gdalinfo URL - ``` ## VSIS3, reading and writing files from/to S3 services -[VSIS3](https://gdal.org/user/virtual_file_systems.html#vsis3-aws-s3-files) is suitable for working with S3 services, for example CSC Allas, LUMI-O and Amazon S3. +[VSIS3](https://gdal.org/user/virtual_file_systems.html#vsis3-aws-s3-files) is suitable for working with S3 services. + +### S3 connection details -### S3 settings +For accessing the data from S3 services, first the connection details must be set correctly. Usually the following connection details are needed: -For accessing the data from S3 services are needed a few settings that can be given as environment variables or saved to `credentials` file. See the GDAL [VSIS3](https://gdal.org/user/virtual_file_systems.html#vsis3-aws-s3-files) page and the cloud storage documentation for details. Below are more detailed instructions for using CSC Allas object storage with GDAL or GDAL-based tools. - -#### S3 settings for Allas with CSC supercomputers -Setting up Allas S3 connection is easiest with CSC supercomputers. 
CSC supercomputers have [`allas-conf`](../../../data/Allas/using_allas/s3_client.md#configuring-s3-connection-in-supercomputers) command for setting up Allas connection in the `allas` module.: +* end-point URL +* region +* access and secret key + +Each service's user guide should specify the end-point and region and give instructions how to find the keys. It is recommended to save the keys and region name to `credentials` file, located in `C:\Users\username\.aws\credentials` on Windows or `~/.aws/credentials` on Mac or Linux. For example the Allas the credentials file could look like this: + +``` +[allas_project1] +AWS_ACCESS_KEY_ID=xxx +AWS_SECRET_ACCESS_KEY=yyy +AWS_DEFAULT_REGION = regionOne +``` + +The end-point URL is not needed for Amazon S3, but is needed for other services. Unfortunatelly it can not be given via `credentials` file, but needs to be given to GDAL as environment variable. For example to set Allas end-point: Windows command shell: `set AWS_S3_ENDPOINT=a3s.fi` or Linux/Max: `export AWS_S3_ENDPOINT=a3s.fi` + +#### S3 connection set up for Allas + +If you are using also CSC supercomputers, then the easiest option to set up Allas connection details, is to use [`allas-conf` command](../../../data/Allas/using_allas/s3_client.md#configuring-s3-connection-in-supercomputers) in CSC supercomputers: ``` module load allas allas-conf --mode s3cmd ``` -* `module load allas` makes other Allas tools available and sets AWS_S3_ENDPOINT environment variable, which needs to be run each time S3 is used. -* `allas-conf --mode s3cmd` must be run only when first setting up the connection or if starting to work with different CSC project. +* `module load allas` sets AWS_S3_ENDPOINT environment variable, which needs to be run each time S3 is used. +* `allas-conf --mode s3cmd` writes the keys and region name to `credentials` file (also prints them to Terminal) and must be run only when using first time or when changing to different CSC project. 
+ +After this, you are ready to use GDAL or GDAL-based tools on supercomputers. + +If you want to use Allas on some other machine, then copy `~/.aws/credentials` file from supercomputer to the other machine and set `AWS_S3_ENDPOINT` as described above. -#### S3 settings for Allas in general +If you are not using CSC supercomputers, you can install `allas-conf` to your Linux/Mac machine, follow the instructions in [CSC's Allas command line interface utilities repository](https://github.com/CSCfi/allas-cli-utils). -If you are using also CSC supercomputers, then the easiest is to set up S3 connection on a supercomputer and then: +#### S3 connection set up for Copernicus Data Space Ecosystem -1. Copy your `~/.aws/credentials` file from supercomputer to the other machine, `C:\Users\username\.aws\credentials` on Windows or `~/.aws/credentials` on Mac or Linux. -2. Set also AWS_S3_ENDPOINT environment variable to `a3s.fi`. Windows command shell: `set AWS_S3_ENDPOINT=a3s.fi` or Linux/Max: `export AWS_S3_ENDPOINT=a3s.fi` +ESA data, inc Sentinel data, is available via Copernicus Data Space Ecosystem S3. In general the same applies as decribed above, but `AWS_VIRTUAL_HOSTING` should be set to False: +``` +os.environ["AWS_S3_ENDPOINT"] = "eodata.dataspace.copernicus.eu" +os.environ["AWS_VIRTUAL_HOSTING"] = "FALSE" +``` -If you are not using also CSC supercomputers, you can install `allas-conf` to your Linux/Mac machine, follow the instructions in [CSC's Allas command line interface utilities repository](https://github.com/CSCfi/allas-cli-utils). 
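To illustrate how the URL and S3 access styles above relate, the same Allas object can be addressed in both of GDAL's path forms. A small helper, illustrative only and not part of GDAL or allas-conf:

```python
def allas_gdal_paths(bucket: str, obj: str) -> dict:
    """Build the two GDAL virtual path forms for one Allas object.

    /vsicurl/ works for public objects only; /vsis3/ uses the S3
    credentials configured as described above.
    """
    return {
        "vsicurl": f"/vsicurl/https://{bucket}.a3s.fi/{obj}",
        "vsis3": f"/vsis3/{bucket}/{obj}",
    }


paths = allas_gdal_paths("bucket_name", "object_name")
print(paths["vsicurl"])  # /vsicurl/https://bucket_name.a3s.fi/object_name
print(paths["vsis3"])    # /vsis3/bucket_name/object_name
```

Either string can then be passed as-is to GDAL tools such as `gdalinfo`.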
+
+#### Several connection profiles
+When working with several CSC projects or different S3 storages, it is possible to have several profiles in the `credentials` file:
+
+```
+[allas_project1]
+AWS_ACCESS_KEY_ID=xxx
+AWS_SECRET_ACCESS_KEY=yyy
+AWS_DEFAULT_REGION = regionOne
+
+[esa_cdse]
+AWS_ACCESS_KEY_ID=xxx
+AWS_SECRET_ACCESS_KEY=yyy
+```
+
+Then before using GDAL, the currently used profile must be set as an environment variable: `os.environ["AWS_PROFILE"] = "allas_project1"`

### Using S3

@@ -70,20 +70,6 @@ export CPL_VSIL_USE_TEMP_FILE_FOR_RANDOM_WRITE=YES
gdal_translate /vsis3/bucket_name/object_name /vsis3/bucket_name/object_name_cog -of COG
```

-## VSISWIFT, reading and writing files from/to SWIFT services
-
-[VSISWIFT](https://gdal.org/user/virtual_file_systems.html#vsiswift-openstack-swift-object-storage) is suitable for working with SWIFT services, for example CSC Allas. For setting up the connection use allas-conf. For example in Puhti or Mahti supercomputer:
-
-```
-module load allas
-allas-conf
-export SWIFT_AUTH_TOKEN=$OS_AUTH_TOKEN
-export SWIFT_STORAGE_URL=$OS_STORAGE_URL
-gdalinfo /vsiswift/bucket_name/object_name
-```
-
-The export commands are needed because GDAL is looking for different environment variables than what allas-conf is writing. These commands need to be given each time you start working with Puhti, because the token is valid for 8 hours. Inside batchjobs use [allas-conf -k](../../../data/Allas/allas_batchjobs.md).
-
## Other tools

From 143697010e5169074de89e9ae5c044ce4a269913 Mon Sep 17 00:00:00 2001
From: Samantha Wittke <32324155+samumantha@users.noreply.github.com>
Date: Mon, 29 Jan 2024 18:03:49 +0200
Subject: [PATCH 09/37] link fix?
--- docs/support/tutorials/gis/eo_guide.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/docs/support/tutorials/gis/eo_guide.md b/docs/support/tutorials/gis/eo_guide.md index 6f58af39c6..4102d78b7c 100644 --- a/docs/support/tutorials/gis/eo_guide.md +++ b/docs/support/tutorials/gis/eo_guide.md @@ -175,7 +175,7 @@ Different services have different limitations, which are described here in the [ For downloading data from the CDSE to CSC computing environment, we recommend using the **S3** access to the CDSE object storage, via command line tools like `s3cmd` and `rclone` or Pythons `boto3`. Also other ways of accessing the data via Python are supported and work in CSC computing environment as shown in the example Notebooks, using [SentinelHub](https://github.com/eu-cdse/notebook-samples/blob/main/sentinelhub/data_download_process_request.ipynb) or [basic OData API and requests](https://github.com/eu-cdse/notebook-samples/blob/main/geo/odata_basics.ipynb). -You can copy data from CDSE object storage also directly to CSCs object storage Allas, following the instructions on [Allas docs page](../../../data/Allas/accessing_allas#copying-files-directly-between-object-storages) with endpoint `eodata.dataspace.copernicus.eu` and access and secret keys from following the [instructions about generating CDSE access keys](https://documentation.dataspace.copernicus.eu/APIs/S3.html#generate-secrets). +You can copy data from CDSE object storage also directly to CSCs object storage Allas, following the instructions on [Allas docs page](../../../data/Allas/accessing_allas.md#copying-files-directly-between-object-storages) with endpoint `eodata.dataspace.copernicus.eu` and access and secret keys from following the [instructions about generating CDSE access keys](https://documentation.dataspace.copernicus.eu/APIs/S3.html#generate-secrets). ## How can I process EO data at CSC? 
From dfc6ddaec21ffac16370f9ae7f0b1fa3c2d5c41d Mon Sep 17 00:00:00 2001
From: Samantha Wittke <32324155+samumantha@users.noreply.github.com>
Date: Mon, 29 Jan 2024 18:06:07 +0200
Subject: [PATCH 10/37] fix typos

---
 docs/support/tutorials/gis/gdal_cloud.md | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/docs/support/tutorials/gis/gdal_cloud.md b/docs/support/tutorials/gis/gdal_cloud.md
index 69604cfc9f..74b0567ce0 100644
--- a/docs/support/tutorials/gis/gdal_cloud.md
+++ b/docs/support/tutorials/gis/gdal_cloud.md
@@ -60,7 +60,7 @@ AWS_SECRET_ACCESS_KEY=yyy
AWS_DEFAULT_REGION = regionOne
```

-The end-point URL is not needed for Amazon S3, but is needed for other services. Unfortunatelly it can not be given via `credentials` file, but needs to be given to GDAL as environment variable. For example to set Allas end-point: Windows command shell: `set AWS_S3_ENDPOINT=a3s.fi` or Linux/Max: `export AWS_S3_ENDPOINT=a3s.fi`
+The end-point URL is not needed for Amazon S3, but is needed for other services. Unfortunately it can not be given via `credentials` file, but needs to be given to GDAL as environment variable. For example to set Allas end-point: Windows command shell: `set AWS_S3_ENDPOINT=a3s.fi` or Linux/Mac: `export AWS_S3_ENDPOINT=a3s.fi`

#### S3 connection set up for Allas

@@ -82,7 +82,7 @@ If you are not using CSC supercomputers, you can install `allas-conf` to your Li

#### S3 connection set up for Copernicus Data Space Ecosystem

-ESA data, inc Sentinel data, is available via Copernicus Data Space Ecosystem S3. In general the same applies as decribed above, but `AWS_VIRTUAL_HOSTING` should be set to False:
+ESA data, incl. Sentinel data, is available via Copernicus Data Space Ecosystem S3.
In general the same applies as described above, but `AWS_VIRTUAL_HOSTING` should be set to False: ``` os.environ["AWS_S3_ENDPOINT"] = "eodata.dataspace.copernicus.eu" os.environ["AWS_VIRTUAL_HOSTING"] = "FALSE" From 785670107725d4c90953c3e5e8ec7071eeaa8b50 Mon Sep 17 00:00:00 2001 From: Kylli Ek Date: Tue, 30 Jan 2024 10:38:51 +0200 Subject: [PATCH 11/37] minor text changes --- docs/support/tutorials/gis/gdal_cloud.md | 26 +++++++++++++----------- 1 file changed, 14 insertions(+), 12 deletions(-) diff --git a/docs/support/tutorials/gis/gdal_cloud.md b/docs/support/tutorials/gis/gdal_cloud.md index 74b0567ce0..90bf3b8752 100644 --- a/docs/support/tutorials/gis/gdal_cloud.md +++ b/docs/support/tutorials/gis/gdal_cloud.md @@ -15,7 +15,7 @@ S3 services are very common for storing bigger amounts of data, for example: Below are described in more detail how to use GDAL with public files from URL (VSICURL) and private files in S3 (VSIS3) storage. Special attention is on CSC Allas service and supercomputers. For using GDAL in supercomputer, a module including [GDAL](../../../apps/gdal.md), must be activated. -## GDAL VSICURL, reading public files from URL +## Reading public files from URL [VSICURL](https://gdal.org/user/virtual_file_systems.html#vsicurl) can be used for reading of files available via URL. Public objects in S3 storage usually also have an URL, so this works also for public S3 files. VSICURL supports also partial reading of files, so it works well with cloud-optimized file formats. VSICURL supports also basic authentication. @@ -38,9 +38,9 @@ gdalinfo /vsicurl/https://s3.us-west-2.amazonaws.com/bucket_name/object_name gdalinfo URL ``` -## VSIS3, reading and writing files from/to S3 services +## Reading and writing files from/to S3 services -[VSIS3](https://gdal.org/user/virtual_file_systems.html#vsis3-aws-s3-files) is suitable for working with S3 services. 
+GDAL's [VSIS3](https://gdal.org/user/virtual_file_systems.html#vsis3-aws-s3-files) is for working with S3 services. ### S3 connection details @@ -57,14 +57,14 @@ Each service's user guide should specify the end-point and region and give instr [allas_project1] AWS_ACCESS_KEY_ID=xxx AWS_SECRET_ACCESS_KEY=yyy -AWS_DEFAULT_REGION = regionOne +AWS_DEFAULT_REGION=regionOne ``` The end-point URL is not needed for Amazon S3, but is needed for other services. Unfortunately it can not be given via `credentials` file, but needs to be given to GDAL as environment variable. For example to set Allas end-point: Windows command shell: `set AWS_S3_ENDPOINT=a3s.fi` or Linux/Max: `export AWS_S3_ENDPOINT=a3s.fi` #### S3 connection set up for Allas -If you are using also CSC supercomputers, then the easiest option to set up Allas connection details, is to use [`allas-conf` command](../../../data/Allas/using_allas/s3_client.md#configuring-s3-connection-in-supercomputers) in CSC supercomputers: +If you are using also CSC supercomputers, then the easiest option to set up Allas connection details, is to use [allas-conf command](../../../data/Allas/using_allas/s3_client.md#configuring-s3-connection-in-supercomputers) in CSC supercomputers: ``` module load allas @@ -84,8 +84,8 @@ If you are not using CSC supercomputers, you can install `allas-conf` to your Li ESA data, inc Sentinel data, is available via Copernicus Data Space Ecosystem S3. In general the same applies as described above, but `AWS_VIRTUAL_HOSTING` should be set to False: ``` -os.environ["AWS_S3_ENDPOINT"] = "eodata.dataspace.copernicus.eu" -os.environ["AWS_VIRTUAL_HOSTING"] = "FALSE" +export AWS_S3_ENDPOINT=eodata.dataspace.copernicus.eu +export AWS_VIRTUAL_HOSTING=FALSE ``` #### Several connection profiles @@ -116,12 +116,14 @@ gdal_translate /vsis3// /vsis3/ Options -> Variables new variable with name AWS_S3_ENDPOINT and value `a3s.fi`. 
- * [Example Python code for working with Allas and rasterio and geopandas](https://github.com/csc-training/geocomputing/blob/master/python/allas).
- * [Example R code for workign with Allas and terra and sf](https://github.com/csc-training/geocomputing/blob/master/R/allas/working_with_allas_from_R_S3.R).
+ * QGIS connects by default to Amazon S3, for connecting to some other service add to Settings -> Options -> Variables new variable with name AWS_S3_ENDPOINT, for Allas the value is `a3s.fi`.
+ * QGIS supports also point clouds from URL.
+ * [Example Python code for working with Allas and rasterio, geopandas and boto3](https://github.com/csc-training/geocomputing/blob/master/python/allas).
+ * [Example R code for working with Allas and terra, sf and aws.s3](https://github.com/csc-training/geocomputing/blob/master/R/allas/working_with_allas_from_R_S3.R).

From 5727378476fbc87deb5e84978a00ccbdc29b9820 Mon Sep 17 00:00:00 2001
From: Samantha Wittke <32324155+samumantha@users.noreply.github.com>
Date: Tue, 30 Jan 2024 20:29:09 +0200
Subject: [PATCH 12/37] Small fixes to CDSE instructions

---
 docs/support/tutorials/gis/eo_guide.md | 8 +++++---
 1 file changed, 5 insertions(+), 3 deletions(-)

diff --git a/docs/support/tutorials/gis/eo_guide.md b/docs/support/tutorials/gis/eo_guide.md
index 4102d78b7c..71f4209c55 100644
--- a/docs/support/tutorials/gis/eo_guide.md
+++ b/docs/support/tutorials/gis/eo_guide.md
@@ -138,7 +138,7 @@ Some Finnish EO datasets are available locally at CSC. A STAC catalog for all sp

### Copernicus Data Space Ecosystem

-The Copernicus Data Space Ecosystem does not only provide the possibility to browse, visualize and download Copernicus and other programs earth observation data, it also provides several options for further processing the data in the cloud. Almost all of the services require self registration, which everyone can do for free.
Check out their website for all [available dataset descriptions](https://documentation.dataspace.copernicus.eu/Data.html) (note that duplicates may be available due to reprocessing by newest baselines). +The Copernicus Data Space Ecosystem does not only provide the possibility to browse, visualize and download Copernicus and other programs earth observation data, it also provides several options for further processing the data in the cloud. Almost all of the services require self registration, which everyone can do for free. Check out their website for all [available dataset descriptions](https://documentation.dataspace.copernicus.eu/Data.html) (note that duplicates may be available due to reprocessing with newest baselines). The [Copernicus Data Space Ecosystem Browser](https://dataspace.copernicus.eu/browser/) serves as a central hub for accessing and exploring Earth observation and environmental data provided by the Copernicus Sentinel constellations, contributing missions, auxiliary engineering data, on-demand data and more (Check out the [documentation on Data](https://documentation.dataspace.copernicus.eu/Data.html) for more details) . Users can visualize, compare and analyze and download all this data. @@ -173,9 +173,11 @@ The [Copernicus dashboard](https://dashboard.dataspace.copernicus.eu/) shows the Different services have different limitations, which are described here in the [CDSE quota documentation](https://documentation.dataspace.copernicus.eu/Quotas.html). Compared to the previous system (SciHub), the number of concurrent downloads per user has increased from two to four for most APIs. -For downloading data from the CDSE to CSC computing environment, we recommend using the **S3** access to the CDSE object storage, via command line tools like `s3cmd` and `rclone` or Pythons `boto3`. 
Also other ways of accessing the data via Python are supported and work in CSC computing environment as shown in the example Notebooks, using [SentinelHub](https://github.com/eu-cdse/notebook-samples/blob/main/sentinelhub/data_download_process_request.ipynb) or [basic OData API and requests](https://github.com/eu-cdse/notebook-samples/blob/main/geo/odata_basics.ipynb). +#### Data download from CDSE -You can copy data from CDSE object storage also directly to CSCs object storage Allas, following the instructions on [Allas docs page](../../../data/Allas/accessing_allas.md#copying-files-directly-between-object-storages) with endpoint `eodata.dataspace.copernicus.eu` and access and secret keys from following the [instructions about generating CDSE access keys](https://documentation.dataspace.copernicus.eu/APIs/S3.html#generate-secrets). +For downloading data from the CDSE to CSC computing environment, we recommend using the **S3** access to the CDSE object storage, via command line tools like `s3cmd` and `rclone` or Pythons `boto3`. You can also read data directly with GDAL, following the instructions provided in [CSC GDAL cloud tutorial](gdal_cloud.md). Also other ways of accessing the data via Python are supported and work in CSC computing environment as shown in the example Notebooks by CDSE, e.g. using [SentinelHub](https://github.com/eu-cdse/notebook-samples/blob/main/sentinelhub/data_download_process_request.ipynb) or [basic OData API and requests](https://github.com/eu-cdse/notebook-samples/blob/main/geo/odata_basics.ipynb). + +You can copy data from CDSE object storage also directly to CSCs object storage Allas, following the instructions on [Allas docs page](../../../data/Allas/accessing_allas.md#copying-files-directly-between-object-storages) with endpoint `eodata.dataspace.copernicus.eu` and secret and access keys from following the [instructions about generating CDSE access keys](https://documentation.dataspace.copernicus.eu/APIs/S3.html#generate-secrets). 
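The copy set-up described above amounts to two S3 remotes in rclone's INI configuration file. A minimal sketch that writes such a file (the remote names and key values are placeholders; the option names follow rclone's S3 backend and the endpoints are the ones given above):

```python
import configparser

# Two S3 remotes for rclone: CDSE "eodata" and CSC Allas.
# Remote names ("cdse", "allas") are arbitrary; keys are placeholders.
remotes = {
    "cdse": {
        "type": "s3",
        "provider": "Other",
        "access_key_id": "xxx",
        "secret_access_key": "yyy",
        "endpoint": "eodata.dataspace.copernicus.eu",
    },
    "allas": {
        "type": "s3",
        "provider": "Other",
        "access_key_id": "xxx",
        "secret_access_key": "yyy",
        "endpoint": "a3s.fi",
    },
}

config = configparser.ConfigParser()
config.read_dict(remotes)
# rclone normally reads this from ~/.config/rclone/rclone.conf
with open("rclone.conf", "w") as f:
    config.write(f)
```

With such remotes in place, a command along the lines of `rclone copy cdse:eodata/... allas:bucket_name/` would copy objects from CDSE to Allas; the exact source path depends on the product.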
## How can I process EO data at CSC? From d66a9d1956e26d887b9fb7d433bc8f57e83bc481 Mon Sep 17 00:00:00 2001 From: Samantha Wittke <32324155+samumantha@users.noreply.github.com> Date: Wed, 31 Jan 2024 10:07:03 +0200 Subject: [PATCH 13/37] add geoconda mention to CDSE download section --- docs/support/tutorials/gis/eo_guide.md | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/docs/support/tutorials/gis/eo_guide.md b/docs/support/tutorials/gis/eo_guide.md index 71f4209c55..9f293c39e2 100644 --- a/docs/support/tutorials/gis/eo_guide.md +++ b/docs/support/tutorials/gis/eo_guide.md @@ -175,9 +175,9 @@ Different services have different limitations, which are described here in the [ #### Data download from CDSE -For downloading data from the CDSE to CSC computing environment, we recommend using the **S3** access to the CDSE object storage, via command line tools like `s3cmd` and `rclone` or Pythons `boto3`. You can also read data directly with GDAL, following the instructions provided in [CSC GDAL cloud tutorial](gdal_cloud.md). Also other ways of accessing the data via Python are supported and work in CSC computing environment as shown in the example Notebooks by CDSE, e.g. using [SentinelHub](https://github.com/eu-cdse/notebook-samples/blob/main/sentinelhub/data_download_process_request.ipynb) or [basic OData API and requests](https://github.com/eu-cdse/notebook-samples/blob/main/geo/odata_basics.ipynb). +For downloading data from the CDSE to CSC computing environment, we recommend using the **S3** access to the CDSE object storage, via command line tools like `s3cmd` and `rclone` or Pythons `boto3`. You can also read data directly with GDAL, following the instructions provided in [CSC GDAL cloud tutorial](gdal_cloud.md). Also other ways of accessing the data via Python are supported and work in CSC computing environment (Python packages are available in `geoconda` module) as shown in the example Notebooks by CDSE, e.g. 
using [SentinelHub](https://github.com/eu-cdse/notebook-samples/blob/main/sentinelhub/data_download_process_request.ipynb) or [basic OData API and requests](https://github.com/eu-cdse/notebook-samples/blob/main/geo/odata_basics.ipynb).

-You can copy data from CDSE object storage also directly to CSCs object storage Allas, following the instructions on [Allas docs page](../../../data/Allas/accessing_allas.md#copying-files-directly-between-object-storages) with endpoint `eodata.dataspace.copernicus.eu` and secret and access keys from following the [instructions about generating CDSE access keys](https://documentation.dataspace.copernicus.eu/APIs/S3.html#generate-secrets).
+You can copy data from CDSE object storage also directly to CSC's object storage Allas, following the instructions for setting up `rclone` on [Allas docs page](../../../data/Allas/accessing_allas.md#copying-files-directly-between-object-storages) with endpoint `eodata.dataspace.copernicus.eu` and secret and access keys from following the [instructions about generating CDSE access keys](https://documentation.dataspace.copernicus.eu/APIs/S3.html#generate-secrets).

## How can I process EO data at CSC?

From b7b84dc216a1868e2ab0f6bdf0a455ebe160c73e Mon Sep 17 00:00:00 2001
From: Samantha Wittke <32324155+samumantha@users.noreply.github.com>
Date: Wed, 31 Jan 2024 10:08:39 +0200
Subject: [PATCH 14/37] grammar fixes from code review

Co-authored-by: EetuHuuskoCSC <116141296+EetuHuuskoCSC@users.noreply.github.com>
---
 docs/support/tutorials/gis/eo_guide.md   |  4 ++--
 docs/support/tutorials/gis/gdal_cloud.md | 12 ++++++------
 2 files changed, 8 insertions(+), 8 deletions(-)

diff --git a/docs/support/tutorials/gis/eo_guide.md b/docs/support/tutorials/gis/eo_guide.md
index 9f293c39e2..9ad6c488e2 100644
--- a/docs/support/tutorials/gis/eo_guide.md
+++ b/docs/support/tutorials/gis/eo_guide.md
@@ -140,7 +140,7 @@ Some Finnish EO datasets are available locally at CSC.
A STAC catalog for all sp The Copernicus Data Space Ecosystem does not only provide the possibility to browse, visualize and download Copernicus and other programs earth observation data, it also provides several options for further processing the data in the cloud. Almost all of the services require self registration, which everyone can do for free. Check out their website for all [available dataset descriptions](https://documentation.dataspace.copernicus.eu/Data.html) (note that duplicates may be available due to reprocessing with newest baselines). -The [Copernicus Data Space Ecosystem Browser](https://dataspace.copernicus.eu/browser/) serves as a central hub for accessing and exploring Earth observation and environmental data provided by the Copernicus Sentinel constellations, contributing missions, auxiliary engineering data, on-demand data and more (Check out the [documentation on Data](https://documentation.dataspace.copernicus.eu/Data.html) for more details) . Users can visualize, compare and analyze and download all this data. +The [Copernicus Data Space Ecosystem Browser](https://dataspace.copernicus.eu/browser/) serves as a central hub for accessing and exploring Earth observation and environmental data provided by the Copernicus Sentinel constellations, contributing missions, auxiliary engineering data, on-demand data and more (Check out the [documentation on Data](https://documentation.dataspace.copernicus.eu/Data.html) for more details). Users can visualize, compare and analyze, and download all this data. The [Copernicus Data Workspace](https://dataspace.copernicus.eu/workspace/) is a tool for managing and reviewing Earth observation-related products. This platform enables you to aggregate and review products, which can then be further processed or downloaded for various purposes. When products are selected for processing, you are provided with a list of processors that are capable of processing relevant data types. 
The processors can be further parameterized to fine-tune the results. @@ -148,7 +148,7 @@ The [openEO Algorithm Plaza](https://marketplace-portal.dataspace.copernicus.eu) The [openEO Web Editor](https://openeo.dataspace.copernicus.eu/) is a web-based graphical user interface (GUI) that allows users (who are not familiar with a programming language) to interact with the openEO API and perform various tasks related to Earth observation data processing, such as querying available data, defining processing workflows, executing processes, and visualizing the results. It allows users to build complex processing chains by connecting different processing steps as building blocks and provides options to specify parameters and input data for each step. -[Copernicus own Jupyter Lab instances](https://jupyterhub.dataspace.copernicus.eu/) provide example notebooks, and the possibility to add own packages via pip. 10Gb of persistent space per user (deleted after 15 days without login). The example notebooks are also available on [Copernicus Data Space Ecosystem github](https://github.com/eu-cdse/notebook-samples). JupyterHub provides several server options with 2 - 4 CPUs and 4 - 16 Gb of RAM. Note that in addition to personal limits, also the total number of active users seems to be limited. +[Copernicus own Jupyter Lab instances](https://jupyterhub.dataspace.copernicus.eu/) provide example notebooks, and the possibility to add own packages via pip. 10Gb of persistent space per user (deleted after 15 days without login). The example notebooks are also available on [Copernicus Data Space Ecosystem GitHub](https://github.com/eu-cdse/notebook-samples). JupyterHub provides several server options with 2 - 4 CPUs and 4 - 16 Gb of RAM. Note that in addition to personal limits, also the total number of active users seems to be limited. The Copernicus Data Space Ecosystem provides several different APIs to access the data. 
A [Copernicus access token](https://documentation.dataspace.copernicus.eu/APIs/Token.html) is needed to make use of these interfaces.

diff --git a/docs/support/tutorials/gis/gdal_cloud.md b/docs/support/tutorials/gis/gdal_cloud.md
index 90bf3b8752..0b5da7f186 100644
--- a/docs/support/tutorials/gis/gdal_cloud.md
+++ b/docs/support/tutorials/gis/gdal_cloud.md
@@ -13,11 +13,11 @@ S3 services are very common for storing bigger amounts of data, for example:
* Google [Cloud Storage](https://cloud.google.com/storage),
* Microsoft [Azure Blob Storage](https://azure.microsoft.com/en-us/products/storage/blobs/) etc.

-Below are described in more detail how to use GDAL with public files from URL (VSICURL) and private files in S3 (VSIS3) storage. Special attention is on CSC Allas service and supercomputers. For using GDAL in supercomputer, a module including [GDAL](../../../apps/gdal.md), must be activated.
+More details on how to use GDAL with public files from URL (VSICURL) and private files in S3 (VSIS3) storage are given below. Special attention is on CSC Allas object storage service and supercomputers. For using GDAL on a supercomputer, a module including [GDAL](../../../apps/gdal.md) must be activated.

## Reading public files from URL

-[VSICURL](https://gdal.org/user/virtual_file_systems.html#vsicurl) can be used for reading of files available via URL. Public objects in S3 storage usually also have an URL, so this works also for public S3 files. VSICURL supports also partial reading of files, so it works well with cloud-optimized file formats. VSICURL supports also basic authentication.
+[VSICURL](https://gdal.org/user/virtual_file_systems.html#vsicurl) can be used for reading files available via URL. Public objects in S3 storage usually also have a URL, so this works also for public S3 files. VSICURL also supports partial reading of files, so it works well with cloud-optimized file formats. VSICURL also supports basic authentication.
``` # A public file @@ -64,7 +64,7 @@ The end-point URL is not needed for Amazon S3, but is needed for other services. #### S3 connection set up for Allas -If you are using also CSC supercomputers, then the easiest option to set up Allas connection details, is to use [allas-conf command](../../../data/Allas/using_allas/s3_client.md#configuring-s3-connection-in-supercomputers) in CSC supercomputers: +On CSC supercomputers the easiest option to set up Allas connection details is to use [allas-conf command](../../../data/Allas/using_allas/s3_client.md#configuring-s3-connection-in-supercomputers): ``` module load allas @@ -72,7 +72,7 @@ allas-conf --mode s3cmd ``` * `module load allas` sets AWS_S3_ENDPOINT environment variable, which needs to be run each time S3 is used. -* `allas-conf --mode s3cmd` writes the keys and region name to `credentials` file (also prints them to Terminal) and must be run only when using first time or when changing to different CSC project. +* `allas-conf --mode s3cmd` writes the keys and region name to `credentials` file (also prints them to Terminal) and must be run only when using first time or when changing the CSC project. After this, you are ready to use GDAL or GDAL-based tools on supercomputers. @@ -125,5 +125,5 @@ gdal_translate /vsis3// /vsis3/ Options -> Variables new variable with name AWS_S3_ENDPOINT, for Allas the value is `a3s.fi`. * QGIS supports also point clouds from URL. - * [Example Python code for working with Allas and rasterio, geopandas and boto3](https://github.com/csc-training/geocomputing/blob/master/python/allas). - * [Example R code for workign with Allas and terra, sf and aws.s3](https://github.com/csc-training/geocomputing/blob/master/R/allas/working_with_allas_from_R_S3.R). + * [Example Python code for working with Allas, rasterio, geopandas, and boto3](https://github.com/csc-training/geocomputing/blob/master/python/allas). 
+ * [Example R code for working with Allas, terra, sf, and aws.s3](https://github.com/csc-training/geocomputing/blob/master/R/allas/working_with_allas_from_R_S3.R).

From 7ba59a5227dae4695bb448ad0dd34f34fc958b26 Mon Sep 17 00:00:00 2001
From: Samantha Wittke <32324155+samumantha@users.noreply.github.com>
Date: Fri, 2 Feb 2024 17:07:21 +0200
Subject: [PATCH 15/37] eo guide link fixes

---
 docs/support/tutorials/gis/eo_guide.md | 10 +++++-----
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/docs/support/tutorials/gis/eo_guide.md b/docs/support/tutorials/gis/eo_guide.md
index 9ad6c488e2..7b7d54ceda 100644
--- a/docs/support/tutorials/gis/eo_guide.md
+++ b/docs/support/tutorials/gis/eo_guide.md
@@ -138,9 +138,9 @@ Some Finnish EO datasets are available locally at CSC. A STAC catalog for all sp

### Copernicus Data Space Ecosystem

-The Copernicus Data Space Ecosystem does not only provide the possibility to browse, visualize and download Copernicus and other programs earth observation data, it also provides several options for further processing the data in the cloud. Almost all of the services require self registration, which everyone can do for free. Check out their website for all [available dataset descriptions](https://documentation.dataspace.copernicus.eu/Data.html) (note that duplicates may be available due to reprocessing with newest baselines).
+The Copernicus Data Space Ecosystem does not only provide the possibility to browse, visualize and download Copernicus and other programs' earth observation data, it also provides several options for further processing the data in the cloud. Almost all of the services require self registration, which everyone can do for free. Check out the [CDSE website for all available dataset descriptions](https://documentation.dataspace.copernicus.eu/Data.html) (note that duplicates may be available due to reprocessing with newest baselines).
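The catalogue behind these dataset descriptions can also be queried programmatically through the CDSE OData API. A sketch of building such a query URL, assuming the catalogue endpoint documented by CDSE; the collection name and date filter are only example values:

```python
from urllib.parse import quote

# CDSE OData catalogue endpoint (see the CDSE API documentation);
# the collection name and date below are illustrative example values.
base = "https://catalogue.dataspace.copernicus.eu/odata/v1/Products"
flt = (
    "Collection/Name eq 'SENTINEL-2' and "
    "ContentDate/Start gt 2024-01-01T00:00:00.000Z"
)
# quote() percent-encodes spaces and quotes so the filter is URL-safe
url = f"{base}?$filter={quote(flt)}&$top=5"
print(url)
```

The resulting URL can then be fetched with any HTTP client, for example `requests.get(url).json()`, which returns product metadata as JSON.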
-The [Copernicus Data Space Ecosystem Browser](https://dataspace.copernicus.eu/browser/) serves as a central hub for accessing and exploring Earth observation and environmental data provided by the Copernicus Sentinel constellations, contributing missions, auxiliary engineering data, on-demand data and more (Check out the [documentation on Data](https://documentation.dataspace.copernicus.eu/Data.html) for more details). Users can visualize, compare and analyze, and download all this data. +The [Copernicus Data Space Ecosystem Browser](https://dataspace.copernicus.eu/browser/) serves as a central hub for accessing and exploring Earth observation and environmental data provided by the Copernicus Sentinel constellations, contributing missions, auxiliary engineering data, on-demand data and more (Check out the [CDSE documentation on Data](https://documentation.dataspace.copernicus.eu/Data.html) for more details). Users can visualize, compare and analyze, and download all this data. The [Copernicus Data Workspace](https://dataspace.copernicus.eu/workspace/) is a tool for managing and reviewing Earth observation-related products. This platform enables you to aggregate and review products, which can then be further processed or downloaded for various purposes. When products are selected for processing, you are provided with a list of processors that are capable of processing relevant data types. The processors can be further parameterized to fine-tune the results. @@ -148,7 +148,7 @@ The [openEO Algorithm Plaza](https://marketplace-portal.dataspace.copernicus.eu) The [openEO Web Editor](https://openeo.dataspace.copernicus.eu/) is a web-based graphical user interface (GUI) that allows users (who are not familiar with a programming language) to interact with the openEO API and perform various tasks related to Earth observation data processing, such as querying available data, defining processing workflows, executing processes, and visualizing the results. 
It allows users to build complex processing chains by connecting different processing steps as building blocks and provides options to specify parameters and input data for each step. -[Copernicus own Jupyter Lab instances](https://jupyterhub.dataspace.copernicus.eu/) provide example notebooks, and the possibility to add own packages via pip. 10Gb of persistent space per user (deleted after 15 days without login). The example notebooks are also available on [Copernicus Data Space Ecosystem GitHub](https://github.com/eu-cdse/notebook-samples). JupyterHub provides several server options with 2 - 4 CPUs and 4 - 16 Gb of RAM. Note that in addition to personal limits, also the total number of active users seems to be limited. +[Copernicus own Jupyter Lab instances](https://jupyterhub.dataspace.copernicus.eu/) provide example notebooks, and the possibility to add own packages via pip. 10Gb of persistent space per user (deleted after 15 days without login). The example notebooks are also available in the [Copernicus Data Space Ecosystem GitHub repository](https://github.com/eu-cdse/notebook-samples). JupyterHub provides several server options with 2 - 4 CPUs and 4 - 16 Gb of RAM. Note that in addition to personal limits, also the total number of active users seems to be limited. The Copernicus Data Space Ecosystem provides several different APIs to access the data. A [Copernicus access token](https://documentation.dataspace.copernicus.eu/APIs/Token.html) is needed to make use of these interfaces. @@ -175,9 +175,9 @@ Different services have different limitations, which are described here in the [ #### Data download from CDSE -For downloading data from the CDSE to CSC computing environment, we recommend using the **S3** access to the CDSE object storage, via command line tools like `s3cmd` and `rclone` or Pythons `boto3`. You can also read data directly with GDAL, following the instructions provided in [CSC GDAL cloud tutorial](gdal_cloud.md). 
Also other ways of accessing the data via Python are supported and work in CSC computing environment (Python packages are available in `geoconda` module) as shown in the example Notebooks by CDSE, e.g. using [SentinelHub](https://github.com/eu-cdse/notebook-samples/blob/main/sentinelhub/data_download_process_request.ipynb) or [basic OData API and requests](https://github.com/eu-cdse/notebook-samples/blob/main/geo/odata_basics.ipynb).
+For downloading data from the CDSE to the CSC computing environment, we recommend using the **S3** access to the CDSE object storage, via command line tools like `s3cmd` and `rclone` or Python's `boto3`. You can also read data directly with GDAL, following the instructions provided in the [CSC GDAL cloud tutorial](gdal_cloud.md). Other ways of accessing the data via Python are also supported and work in the CSC computing environment (Python packages are available in the `geoconda` module) as shown in the example notebooks by CDSE, e.g. using [SentinelHub](https://github.com/eu-cdse/notebook-samples/blob/main/sentinelhub/data_download_process_request.ipynb) or [OData API and requests](https://github.com/eu-cdse/notebook-samples/blob/main/geo/odata_basics.ipynb).

-You can copy data from CDSE object storage also directly to CSCs object storage Allas, following the instructions for setting up `rclone on [Allas docs page](../../../data/Allas/accessing_allas.md#copying-files-directly-between-object-storages) with endpoint `eodata.dataspace.copernicus.eu` and secret and access keys from following the [instructions about generating CDSE access keys](https://documentation.dataspace.copernicus.eu/APIs/S3.html#generate-secrets).
+You can copy data from CDSE object storage also directly to CSC's object storage Allas, following the instructions for setting up `rclone` on the [Allas documentation page](../../../data/Allas/accessing_allas.md#copying-files-directly-between-object-storages) with endpoint `eodata.dataspace.copernicus.eu` and secret and access keys by following the [instructions about generating CDSE access keys](https://documentation.dataspace.copernicus.eu/APIs/S3.html#generate-secrets).

## How can I process EO data at CSC?

From 9dc3f5acc72598c8e6f3dae79b9bb4b8fb25874b Mon Sep 17 00:00:00 2001
From: Samantha Wittke <32324155+samumantha@users.noreply.github.com>
Date: Fri, 2 Feb 2024 18:35:39 +0200
Subject: [PATCH 16/37] simplify CDSE download text

---
 docs/support/tutorials/gis/eo_guide.md | 7 ++-----
 1 file changed, 2 insertions(+), 5 deletions(-)

diff --git a/docs/support/tutorials/gis/eo_guide.md b/docs/support/tutorials/gis/eo_guide.md
index 7b7d54ceda..963905305c 100644
--- a/docs/support/tutorials/gis/eo_guide.md
+++ b/docs/support/tutorials/gis/eo_guide.md
@@ -173,12 +173,9 @@ The [Copernicus dashboard](https://dashboard.dataspace.copernicus.eu/) shows the

 Different services have different limitations, which are described here in the [CDSE quota documentation](https://documentation.dataspace.copernicus.eu/Quotas.html). Compared to the previous system (SciHub), the number of concurrent downloads per user has increased from two to four for most APIs.

-#### Data download from CDSE
-
-For downloading data from the CDSE to the CSC computing environment, we recommend using the **S3** access to the CDSE object storage, via command line tools like `s3cmd` and `rclone` or Python's `boto3`. You can also read data directly with GDAL, following the instructions provided in the [CSC GDAL cloud tutorial](gdal_cloud.md).
Other ways of accessing the data via Python are also supported and work in the CSC computing environment (Python packages are available in the `geoconda` module) as shown in the example notebooks by CDSE, e.g. using [SentinelHub](https://github.com/eu-cdse/notebook-samples/blob/main/sentinelhub/data_download_process_request.ipynb) or [OData API and requests](https://github.com/eu-cdse/notebook-samples/blob/main/geo/odata_basics.ipynb).
-
-You can copy data from CDSE object storage also directly to CSC's object storage Allas, following the instructions for setting up `rclone` on the [Allas documentation page](../../../data/Allas/accessing_allas.md#copying-files-directly-between-object-storages) with endpoint `eodata.dataspace.copernicus.eu` and secret and access keys by following the [instructions about generating CDSE access keys](https://documentation.dataspace.copernicus.eu/APIs/S3.html#generate-secrets).

+#### Data download from Copernicus Data Space Ecosystem
+For downloading data from the CDSE to the CSC computing environment, we recommend using the **S3** access to the CDSE object storage, via command line tools like `s3cmd` for downloading data to a computing environment or `rclone` for downloading data directly to CSC's object storage Allas (an example script for finding and downloading data with command line tools can be found in the [CSC geocomputing repository](https://github.com/csc-training/geocomputing/tree/master/Copernicus_data_download)). You can also read data directly with GDAL, following the instructions provided in the [CSC GDAL cloud tutorial](gdal_cloud.md). The CDSE also offers many possibilities for accessing the data via **Python** with `sentinelhub`, `openeo` or simpler ones like `requests`. All packages are available in the `geoconda` module on Puhti. Examples for using these packages are provided in the [CDSE GitHub repository](https://github.com/eu-cdse/notebook-samples).

## How can I process EO data at CSC?
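The `requests`-style OData access mentioned in the download section above boils down to building a filter URL against the CDSE catalogue. A minimal sketch, assuming the catalogue endpoint used in the CDSE OData examples; the collection name and dates below are illustrative placeholders:

```python
from urllib.parse import urlencode

# CDSE OData catalogue endpoint, as used in the CDSE example notebooks
ODATA_URL = "https://catalogue.dataspace.copernicus.eu/odata/v1/Products"

def build_query(collection, start, end, top=20):
    """Build a catalogue query URL filtering by collection and sensing start date."""
    odata_filter = (
        f"Collection/Name eq '{collection}' "
        f"and ContentDate/Start gt {start} "
        f"and ContentDate/Start lt {end}"
    )
    return f"{ODATA_URL}?{urlencode({'$filter': odata_filter, '$top': top})}"

# Illustrative query: Sentinel-2 products from January 2024; fetch the URL
# with e.g. requests.get(url).json() to get the product listing.
url = build_query("SENTINEL-2", "2024-01-01T00:00:00.000Z", "2024-01-31T00:00:00.000Z")
print(url)
```

This only constructs the catalogue request; actually downloading the listed products still requires authentication as described in the CDSE documentation.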
From cd53d27b300c8c52d4b43f8eedc2b998779a1aa7 Mon Sep 17 00:00:00 2001 From: Samantha Wittke <32324155+samumantha@users.noreply.github.com> Date: Fri, 2 Feb 2024 18:46:25 +0200 Subject: [PATCH 17/37] add note on more than data download --- docs/support/tutorials/gis/eo_guide.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/docs/support/tutorials/gis/eo_guide.md b/docs/support/tutorials/gis/eo_guide.md index 963905305c..0dccb13142 100644 --- a/docs/support/tutorials/gis/eo_guide.md +++ b/docs/support/tutorials/gis/eo_guide.md @@ -175,7 +175,7 @@ Different services have different limitations, which are described here in the [ #### Data download from Copernicus Data Space Ecosystem -For downloading data from the CDSE to CSC computing environment, we recommend using the **S3** access to the CDSE object storage, via command line tools like `s3cmd` for downloading data to a computing environment or `rclone` for downloading data directly to CSC object storage Allas (an example script for finding and downloading data with command line tools can be found in [CSC geocomputing repository](https://github.com/csc-training/geocomputing/tree/master/Copernicus_data_download). You can also read data directly with GDAL, following the instructions provided in [CSC GDAL cloud tutorial](gdal_cloud.md). The CDSE also offers many possibilities of accessing the data via **Python** with `sentinelhub`, `openeo` or more simple ones like using `requests`. All packages are available in the `geoconda` module on Puhti. Examples for using these packages are provided in [CDSE GitHub repository](https://github.com/eu-cdse/notebook-samples). 
+For downloading data from the CDSE to the CSC computing environment, we recommend using the **S3** access to the CDSE object storage, via command line tools like `s3cmd` for downloading data to a computing environment or `rclone` for downloading data directly to CSC's object storage Allas (an example script for finding and downloading data with command line tools can be found in the [CSC geocomputing repository](https://github.com/csc-training/geocomputing/tree/master/Copernicus_data_download)). You can also read data directly with GDAL, following the instructions provided in the [CSC GDAL cloud tutorial](gdal_cloud.md). The CDSE also offers many possibilities for accessing the data via **Python** with `sentinelhub`, `openeo` or simpler ones like `requests`. All packages are available in the `geoconda` module on Puhti. Examples for using these packages are provided in the [CDSE GitHub repository](https://github.com/eu-cdse/notebook-samples). The Python packages `sentinelhub` and `openeo` also provide many more capabilities than just data download.

## How can I process EO data at CSC?

From 62b7ec5f04f73684ddc418f31e54fdf80cb61eda Mon Sep 17 00:00:00 2001
From: Kylli Ek
Date: Mon, 5 Feb 2024 16:57:05 +0200
Subject: [PATCH 18/37] GDAL cloud: add CDSE links for S3

---
 docs/support/tutorials/gis/gdal_cloud.md | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/docs/support/tutorials/gis/gdal_cloud.md b/docs/support/tutorials/gis/gdal_cloud.md
index 0b5da7f186..b6ec0602b5 100644
--- a/docs/support/tutorials/gis/gdal_cloud.md
+++ b/docs/support/tutorials/gis/gdal_cloud.md
@@ -80,9 +80,9 @@ If you want to use Allas on some other machine, then copy `~/.aws/credentials` f

 If you are not using CSC supercomputers, you can install `allas-conf` to your Linux/Mac machine, follow the instructions in [CSC's Allas command line interface utilities repository](https://github.com/CSCfi/allas-cli-utils).
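The `~/.aws/credentials` file mentioned above can hold one profile per CSC project, and GDAL and `boto3` pick the profile named by the `AWS_PROFILE` environment variable. A small sketch of that mechanism; the profile names and keys are placeholders and the file is written to a temporary directory instead of `~/.aws`:

```python
import configparser
import os
import tempfile

# Placeholder credentials file, mirroring the [allas] / [allas_project1]
# profile layout used in the tutorial; the keys here are fake.
creds_text = """\
[allas]
AWS_ACCESS_KEY_ID=xxx
AWS_SECRET_ACCESS_KEY=yyy

[allas_project1]
AWS_ACCESS_KEY_ID=xxx
AWS_SECRET_ACCESS_KEY=yyy
"""

with tempfile.TemporaryDirectory() as tmp:
    creds_path = os.path.join(tmp, "credentials")
    with open(creds_path, "w") as f:
        f.write(creds_text)

    # Tools reading this file use the profile named by AWS_PROFILE
    os.environ["AWS_PROFILE"] = "allas_project1"

    config = configparser.ConfigParser()
    config.read(creds_path)
    profile = config[os.environ["AWS_PROFILE"]]
    print(sorted(config.sections()), profile["aws_access_key_id"])
```

Note that `configparser` lowercases option names, so `AWS_ACCESS_KEY_ID` in the file is read back as `aws_access_key_id`.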
-#### S3 connection set up for Copernicus Data Space Ecosystem
+#### S3 connection setup for Copernicus Data Space Ecosystem (CDSE)

-ESA data, inc Sentinel data, is available via Copernicus Data Space Ecosystem S3. In general the same applies as described above, but `AWS_VIRTUAL_HOSTING` should be set to False:
+ESA data, including Sentinel data, is available via [CDSE S3](https://dataspace.copernicus.eu/). Get [CDSE S3 credentials](https://documentation.dataspace.copernicus.eu/APIs/S3.html) and save them to the `credentials` file as described above. With CDSE, `AWS_VIRTUAL_HOSTING` should be set to `FALSE`:
```
export AWS_S3_ENDPOINT=eodata.dataspace.copernicus.eu
export AWS_VIRTUAL_HOSTING=FALSE

From e29a1822972be55882690728f156bdd427104d87 Mon Sep 17 00:00:00 2001
From: Kylli Ek
Date: Mon, 5 Feb 2024 16:59:19 +0200
Subject: [PATCH 19/37] minor working

---
 docs/support/tutorials/gis/gdal_cloud.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/docs/support/tutorials/gis/gdal_cloud.md b/docs/support/tutorials/gis/gdal_cloud.md
index b6ec0602b5..86474a5e63 100644
--- a/docs/support/tutorials/gis/gdal_cloud.md
+++ b/docs/support/tutorials/gis/gdal_cloud.md
@@ -1,6 +1,6 @@
-# Using geospatial files directly from public repositories and S3 storage services, inc Allas
-[GDAL](https://gdal.org/) is the main open-source library for reading and writing geospatial data and many more advanced tools rely on GDAL, including QGIS, ArcGIS, Python, R etc. GDAL has several virtual [network based files systems](https://gdal.org/user/virtual_file_systems.html#network-based-file-systems), that are meant for different APIs or use cases. GDAL and most tools depending on it can **read** directly from an public URL or S3 storage services. This eliminates the need to download the files manually before data analysis. GDAL can also **write** files to S3 storage services, but only some tools depending on it are supporting it.
+[GDAL](https://gdal.org/) is the main open-source library for reading and writing geospatial data and many more advanced tools rely on GDAL, including QGIS, ArcGIS, Python, R etc. GDAL has several virtual [network-based file systems](https://gdal.org/user/virtual_file_systems.html#network-based-file-systems), which are meant for different APIs or use cases. GDAL and most tools depending on it can **read** directly from a public URL or S3 storage services. This eliminates the need to download the files manually before data analysis. GDAL can also **write** files to S3 storage services, but only some tools dependent on GDAL support it.

Reading data directly from an external service is usually slower than reading from local disks, but in many cases, these seconds are negligible compared to the full duration of an analysis, but it is important to have good Internet connection.

From 4dd8609270d8256fd8e5cb456d5939e162accbf7 Mon Sep 17 00:00:00 2001
From: Kylli Ek
Date: Mon, 5 Feb 2024 17:01:13 +0200
Subject: [PATCH 20/37] AWS_PROFILE to Linux style

---
 docs/support/tutorials/gis/gdal_cloud.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/docs/support/tutorials/gis/gdal_cloud.md b/docs/support/tutorials/gis/gdal_cloud.md
index 86474a5e63..fa9c8a06cd 100644
--- a/docs/support/tutorials/gis/gdal_cloud.md
+++ b/docs/support/tutorials/gis/gdal_cloud.md
@@ -102,7 +102,7 @@ AWS_ACCESS_KEY_ID=xxx

AWS_SECRET_ACCESS_KEY=yyy
```

-Then before using GDAL, the currently used profile must be set as environment variable: `os.environ["AWS_PROFILE"] = "allas_project1"`
+Then before using GDAL, the currently used profile must be set as an environment variable: `export AWS_PROFILE=allas_project1`

### Using S3

From 7cea5fffdca71ae90c433c397ff6e157a20479f2 Mon Sep 17 00:00:00 2001
From: Kylli Ek
Date: Mon, 12 Feb 2024 15:57:43 +0200
Subject: [PATCH 21/37] update gdal installation differences section

---
 docs/apps/gdal.md | 2 +-
 1 file changed, 1 insertion(+), 1
deletion(-)

diff --git a/docs/apps/gdal.md b/docs/apps/gdal.md
index 6af5560abf..b90fe8ac9a 100644
--- a/docs/apps/gdal.md
+++ b/docs/apps/gdal.md
@@ -17,7 +17,7 @@ GDAL is available with following versions:

 * Also in Puhti: [r-env](r-env-for-gis.md#gdal-and-saga-gis-support) and [OrfeoToolBox](otb.md)

 !!! note
-    The stand-alone version doesn't have python bindings installed so e.g __gdal_calc__ works only in the geoconda installations. Also, the supported file formats vary slightly between the GDAL installations. For instance, the PostGIS driver is not available in stand-alone gdal but is included in the geoconda versions.
+    The stand-alone GDAL and r-env modules don't have Python bindings installed, so e.g. `gdal_calc` works only in the geoconda and qgis modules. Also, the supported file formats vary between the modules. For example, the PostGIS driver is not available in stand-alone GDAL and r-env, but is included in the geoconda and qgis modules. Stand-alone GDAL also does not support virtual file systems for working with data from HTTPS or S3 links. More drivers can be added to the stand-alone and r-env GDAL installations on request, please ask. The driver support of the geoconda and qgis GDAL installations cannot be changed.
## Usage

From 262bf02ca3be712780dc67343f227bc607bba4d5 Mon Sep 17 00:00:00 2001
From: Kylli Ek
Date: Mon, 12 Feb 2024 16:20:12 +0200
Subject: [PATCH 22/37] update gdal versions

---
 docs/apps/gdal.md | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/docs/apps/gdal.md b/docs/apps/gdal.md
index b90fe8ac9a..bbf51e2930 100644
--- a/docs/apps/gdal.md
+++ b/docs/apps/gdal.md
@@ -11,9 +11,9 @@ GDAL is available with following versions:

-* 3.5.0 - [geoconda-3.10.6](geoconda.md) in Puhti
+* 3.8.3 - in the newest [QGIS](qgis.md) in Puhti and LUMI
+* 3.6.2 - in the newest [geoconda](geoconda.md) in Puhti
 * 3.4.3 stand-alone: `gdal` in Puhti
-* 3.4.1 - [QGIS-3.31 module](qgis.md) in Puhti and LUMI
 * Also in Puhti: [r-env](r-env-for-gis.md#gdal-and-saga-gis-support) and [OrfeoToolBox](otb.md)

 !!! note

From f3925d2de222e2ebbe354911f452793ba8779307 Mon Sep 17 00:00:00 2001
From: Kylli Ek
Date: Mon, 12 Feb 2024 17:24:19 +0200
Subject: [PATCH 23/37] add Arttu, more STAC links

---
 .../spatial-data-in-csc-computing-env.md | 34 +++++++++++--------
 1 file changed, 20 insertions(+), 14 deletions(-)

diff --git a/docs/data/datasets/spatial-data-in-csc-computing-env.md b/docs/data/datasets/spatial-data-in-csc-computing-env.md
index 88b3d3c135..9e5c73febe 100644
--- a/docs/data/datasets/spatial-data-in-csc-computing-env.md
+++ b/docs/data/datasets/spatial-data-in-csc-computing-env.md
@@ -12,6 +12,7 @@ Puhti has following datasets:
 + 2m and 10m DEM and infrared orthophotos have virtual rasters, see Puhti virtual rasters below.
 + Stereoclassified lidar data has been slightly modified. The original NLS data had mistakes in headers, these have been fixed. Additionally lax-index files have been added.
 + Automatically classified lidar data, only data of year 2019
+ - The easiest way to find Paituli raster data is with [Paituli STAC](https://paituli.csc.fi/stac.html); it also has links to the local files in Puhti.
* **LUKE, multi-source national forest inventory**, 2013, 2015, 2017, 2019 and 2021. LUKE license changed in Aug 2019 to CC BY 4.0.
* **SYKE, All open spatial datasets** available from [SYKE open data service](https://www.syke.fi/fi-FI/Avoin_tieto/Paikkatietoaineistot/Ladattavat_paikkatietoaineistot). Weekly updates.
* **Finnish Forest Centre**, CC BY 4.0 license, **updated in 8/2023**
@@ -21,21 +22,34 @@ Puhti has following datasets:
 * [Forest resource plots](http://www.paikkatietohakemisto.fi/geonetwork/srv/fin/catalog.search#/metadata/332e5abf-63c2-4723-9c2d-4a926bbe587a)
 * **Landsat mosaics produced by SYKE and FMI** in Paikkatietoalusta project
   - [Historical Landsat satellite image mosaics](https://ckan.ymparisto.fi/dataset/historical-landsat-satellite-image-mosaics-href-historialliset-landsat-kuvamosaiikit-href): 1985, 1990, 1995
-  - [Historical Landsat NDVI mosaics: 1984-2011](https://ckan.ymparisto.fi/dataset/historical-landsat-image-index-mosaics-hind-historialliset-landsat-kuvaindeksimosaiikit-hind)
+  - [Historical Landsat NDVI mosaics](https://ckan.ymparisto.fi/dataset/historical-landsat-image-index-mosaics-hind-historialliset-landsat-kuvaindeksimosaiikit-hind): 1984-2011
+
+NLS 2m DEM, lidar, infrared orthophotos and all SYKE datasets are updated in Puhti automatically every Monday.
+
+The open spatial data is stored in Puhti: **/appl/data/geo**
+
+Accessing data in Puhti requires a CSC user account with a project where the Puhti service is enabled. All Puhti users have **read** access to these datasets. You do not need to move the files: they can be used directly, unless you need to modify them, which requires you to make your own copy. Open spatial data in Puhti is maintained by CSC personnel. If you notice any problems with the data or wish for a new dataset, contact CSC Servicedesk. !!!
warning "Sentinel satellite mosaics produced by SYKE and FMI in Paikkatietoalusta project were removed from Puhti on 21.11.2023"
    The removed datasets were: [Sentinel1 SAR mosaics](https://ckan.ymparisto.fi/dataset/sentinel-1-sar-image-mosaic-s1sar-sentinel-1-sar-kuvamosaiikki-s1sar), [Sentinel2 index mosaics](https://ckan.ymparisto.fi/dataset/sentinel-2-image-index-mosaics-s2ind-sentinel-2-kuvamosaiikit-s2ind). They are available from FMI's own object storage which has more data than was stored to Puhti local disks. The easiest way to find PTA sentinel mosaics from FMI is with [Paituli STAC](https://paituli.csc.fi/stac.html). Paituli STAC page includes also usage examples for R and Python.
-
-NLS 2m DEM, lidar, infrared ortophotos and all SYKE datasets are updated in Puhti automatically every Monday.
+## Spatial data in Allas
+
+CSC computing services users are welcome to share spatial data in [Allas](../Allas/index.md) with other users, if the data license terms allow this. This is a community service, meaning that any CSC user is welcome to contribute and add data to Allas. The data buckets in Allas are owned by data collaborators. If you would like to share some data you have in Allas and would like the dataset to be added to this page, contact CSC Servicedesk.

-The open spatial data is stored in Puhti: **/appl/data/geo**
+Currently available:
+
+1. **[Sentinel2 2A level images](https://a3s.fi/sentinel-readme/README.txt)**. Maria Yli-Heikkilä (LUKE) and Arttu Kivimäki (NLS/FGI) have downloaded Finnish data from the vegetation period (ca 10.5.-1.9.) from 2016 onwards. The easiest way to find Sentinel data stored in Allas is with [Paituli STAC](https://paituli.csc.fi/stac.html).
+
+For using data in Allas, see [CSC webinar about Allas and geospatial data](https://youtu.be/mnFXe2-dJ_g) and [Using geospatial files directly from cloud, inc Allas tutorial](../../support/tutorials/gis/gdal_cloud.md).

### Puhti virtual rasters

-CSC has added [virtual rasters](../../support/tutorials/gis/virtual-rasters.md) to NLS 2m and 10m elevation models and infrared ortophotos in Puhti. There are two variants of virtual rasters for the elevation models:
+Virtual rasters are a very practical concept for working with bigger raster datasets, see the CSC [Virtual rasters](../../support/tutorials/gis/virtual-rasters.md) page for a longer explanation and for how to create your own virtual rasters, including from STAC search results.
+
+#### NLS DEM and orthophotos ready-made virtual rasters
+CSC has added ready-made virtual rasters for the NLS 2m and 10m elevation models and infrared orthophotos in Puhti. There are two variants of virtual rasters for the elevation models:

1. The **direct** virtual rasters contain directly the source tif images without any hierarchical structure, overviews or pre-calculated statistics. The direct virtual raster is meant for using **only in scripts**. It should **not** be opened in QGIS, unless zoomed in and need to open only a few files etc:
 * 2m DEM: `/appl/data/geo/mml/dem2m/dem2m_direct.vrt`
@@ -63,15 +77,7 @@ Optional arguments:
 * -o: create overviews
 * -p: output name prefix

-## Spatial data in Allas
-
-CSC computing services users are welcome to share spatial data in [Allas](../Allas/index.md) with other users, if the data license terms allow this. This is a community service, meaning that any CSC user is welcome to contribute and add data to Allas. The data buckets in Allas are owned by data collaborators. If you would like some share some data you have in Allas, and would like the dataset be added to this page, contact CSC Servicedesk
-
-Currently available:
-
-1. **[Sentinel2 2A level images](https://a3s.fi/sentinel-readme/README.txt)**.
Maria Yli-Heikkilä (LUKE) has downloaded data of Finland from vegetation period (ca 10.5.-1.9.) in 2016-2020.
-For using data in Allas, see [CSC webinar about Allas and geospatial data](https://youtu.be/mnFXe2-dJ_g) and [Using geospatial files directly from cloud, inc Allas tutorial](../../support/tutorials/gis/gdal_cloud.md). The easiest way to find Sentinel data stored in Allas is with [Paituli STAC](https://paituli.csc.fi/stac.html).

## License and acknowledgement

From 7dd3d37a9f8f81411d3b0e05cbc81e33e073b04a Mon Sep 17 00:00:00 2001
From: Kylli Ek
Date: Mon, 12 Feb 2024 17:25:02 +0200
Subject: [PATCH 24/37] styling fix

---
 docs/data/datasets/spatial-data-in-csc-computing-env.md | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/docs/data/datasets/spatial-data-in-csc-computing-env.md b/docs/data/datasets/spatial-data-in-csc-computing-env.md
index 9e5c73febe..965df075a4 100644
--- a/docs/data/datasets/spatial-data-in-csc-computing-env.md
+++ b/docs/data/datasets/spatial-data-in-csc-computing-env.md
@@ -44,11 +44,11 @@ Currently available:

 For using data in Allas, see [CSC webinar about Allas and geospatial data](https://youtu.be/mnFXe2-dJ_g) and [Using geospatial files directly from cloud, inc Allas tutorial](../../support/tutorials/gis/gdal_cloud.md).

-### Puhti virtual rasters
+## Puhti virtual rasters

 Virtual rasters are a very practical concept for working with bigger raster datasets, see the CSC [Virtual rasters](../../support/tutorials/gis/virtual-rasters.md) page for a longer explanation and for how to create your own virtual rasters, including from STAC search results.

-#### NLS DEM and orthophotos ready-made virtual rasters
+### NLS DEM and orthophotos ready-made virtual rasters
 CSC has added ready-made virtual rasters for the NLS 2m and 10m elevation models and infrared orthophotos in Puhti. There are two variants of virtual rasters for the elevation models:
 1.
The **direct** virtual rasters contain directly the source tif images without any hierarchical structure, overviews or pre-calculated statistics. The direct virtual raster is meant for using **only in scripts**. It should **not** be opened in QGIS, unless zoomed in and need to open only a few files etc: @@ -60,7 +60,7 @@ CSC has added to NLS 2m and 10m elevation models and infrared ortophotos in Puh * 2m DEM: `/appl/data/geo/mml/dem2m/dem2m_hierarchical.vrt` -#### Puhti: create virtual rasters of DEM for custom area +### Puhti: create virtual rasters of DEM for custom area In some cases it might be useful to create virtual rasters that cover only your study area or some part of it. CSC has made a Python script for creating virtual rasters for custom area from NLS 2m and 10m DEM datasets in Puhti. It's used in the following way: From c8e54fe205124092a562ae20697a8a848f3e6478 Mon Sep 17 00:00:00 2001 From: Kylli Ek Date: Mon, 12 Feb 2024 17:26:27 +0200 Subject: [PATCH 25/37] Add Allas files being public --- docs/data/datasets/spatial-data-in-csc-computing-env.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/docs/data/datasets/spatial-data-in-csc-computing-env.md b/docs/data/datasets/spatial-data-in-csc-computing-env.md index 965df075a4..7e380eec69 100644 --- a/docs/data/datasets/spatial-data-in-csc-computing-env.md +++ b/docs/data/datasets/spatial-data-in-csc-computing-env.md @@ -40,7 +40,7 @@ CSC computing services users are welcome to share spatial data in [Allas](../All Currently available: -1. **[Sentinel2 2A level images](https://a3s.fi/sentinel-readme/README.txt)**. Maria Yli-Heikkilä (LUKE) and Arttu Kivimäki (NLS/FGI) have downloaded Finnihs data from vegetation period (ca 10.5.-1.9.) in 2016->. The easiest way to find Sentinel data stored in Allas is with [Paituli STAC](https://paituli.csc.fi/stac.html). +1. **[Sentinel2 2A level images](https://a3s.fi/sentinel-readme/README.txt)**. 
Maria Yli-Heikkilä (LUKE) and Arttu Kivimäki (NLS/FGI) have downloaded Finnish data from the vegetation period (ca 10.5.-1.9.) from 2016 onwards. The easiest way to find Sentinel data stored in Allas is with [Paituli STAC](https://paituli.csc.fi/stac.html).
+1. **[Sentinel2 2A level images](https://a3s.fi/sentinel-readme/README.txt)**. Maria Yli-Heikkilä (LUKE) and Arttu Kivimäki (NLS/FGI) have downloaded Finnish data from the vegetation period (ca 10.5.-1.9.) from 2016 onwards. The easiest way to find Sentinel data stored in Allas is with [Paituli STAC](https://paituli.csc.fi/stac.html). These files are public, so anybody can download them, also from your own computer or other services.

For using data in Allas, see [CSC webinar about Allas and geospatial data](https://youtu.be/mnFXe2-dJ_g) and [Using geospatial files directly from cloud, inc Allas tutorial](../../support/tutorials/gis/gdal_cloud.md).

From b98de5ae00f4c67ce6bd88a696a6d81fa5050f83 Mon Sep 17 00:00:00 2001
From: Kylli Ek
Date: Thu, 22 Feb 2024 18:39:51 +0200
Subject: [PATCH 26/37] CDSE part bigger changes, less about openEO and SentinelHub

---
 docs/support/tutorials/gis/eo_guide.md | 61 +++++++++++++------------
 1 file changed, 31 insertions(+), 30 deletions(-)

diff --git a/docs/support/tutorials/gis/eo_guide.md b/docs/support/tutorials/gis/eo_guide.md
index 0dccb13142..fa6d405bd9 100644
--- a/docs/support/tutorials/gis/eo_guide.md
+++ b/docs/support/tutorials/gis/eo_guide.md
@@ -18,9 +18,9 @@ This guide aims to help researchers to work with Earth Observation (EO) data usi

 For working with EO data in general, there are three main options:

-1) **EO specific services**, which provide both data and advanced ready-to-use processing environments.
+1) **EO specific services**, which provide both data and ready-to-use processing environments.
Usually these give a better user experience and efficiency, but the services might be limited in computing power, available tools and options for adding your own data or tools. These might have usage fees. Examples are [Copernicus Data Space Ecosystem](https://dataspace.copernicus.eu/), [Google Earth Engine](https://earthengine.google.com/), [SentinelHub](https://www.sentinel-hub.com/) and [Microsoft Planetary Computer](https://planetarycomputer.microsoft.com).

-2) **Cloud services** with access to EO data. Practically, the data is often stored in object-storage and can be accessed as independent service. They also provide general computing services, such as virtual machines, to which EO tools need to be installed by the end-user. These options usually have some fees, mainly for processing. The data download may be free of charge or have a small cost, depending on the amount of data needed. One example is [Amazon Web Services](https://registry.opendata.aws/); also the [Microsoft Planetary Computer](https://planetarycomputer.microsoft.com) and [Copernicus Data Space Ecosystem](https://dataspace.copernicus.eu/) somewhat fit this category.
+2) **Cloud services** with access to EO data. Practically, the data is often stored in object storage and can be accessed as an independent service. They also provide general computing services, such as virtual machines, to which EO tools need to be installed by the end-user. These options usually have some fees, mainly for processing and storage. The data download may be free of charge or have a small cost, depending on the amount of data needed. One example is [Amazon Web Services](https://registry.opendata.aws/); also the [Microsoft Planetary Computer](https://planetarycomputer.microsoft.com) somewhat fits this category.

3) **Own computing environment** - PC, local cluster, virtual machines. Data needs to be downloaded and all tools must be installed to this system. On the other hand, it gives more freedom to select the tools and set-up.
Usually this does not cause any extra costs, but the computing power is usually rather limited. @@ -30,7 +30,7 @@ At CSC, EO data can be processed and analyzed using a supercomputer, for example Puhti has also a lot of [pre-installed applications](#what-applications-are-available-on-puhti), so it is an environment ready to use. cPouta virtual machines are similar to commercial cloud services, where all set-up and installations are done by the end-user. In general, both services only support Linux software. -At CSC, [some Finnish EO datasets](#eo-data-at-csc) are available for direct use. In many cases, however, downloading EO data from other services (see [list of EO data download services](#eo-data-download-services)) is a required step of the process. Puhti and cPouta provide local storage of ~1-20 Tb. For more storage space, [Allas object storage](../../../data/Allas/index.md) can be used. +At CSC, [some Finnish EO datasets](#eo-data-at-csc) are available for direct use. In many cases, however, downloading EO data from other services (see [list of EO data download services](#eo-data-download-services)) is a required step of the process. Puhti and cPouta provide local storage of ~1-20 Tb. For more storage space, [Allas object storage](../../../data/Allas/index.md) can be used. Using CSC computing services requires basic Linux skills and ability to use some scripting language (for example Python, R, Julia) or command-line tools. In addition, supercomputers and virtual machines require you to understand some specific concepts, so it takes a few hours to get started. The [Puhti web interface](https://www.puhti.csc.fi/) makes the start considerably easier, providing a desktop environment in the web browser, which enables the use of tools with Graphical User Interfaces (GUI) and also tools like R Studio and JupyterLab for an easy start with R, Python and Julia. 
@@ -108,15 +108,15 @@ Commercial datasets are usually available from data provider, while open dataset Some Finnish EO datasets are available locally at CSC. A STAC catalog for all spatial data available at CSC is currently in progress. You can find more information about it and its current content from the [Paituli STAC page](https://paituli.csc.fi/stac.html). -* **Landsat mosaics** of Finland in Puhti. Accessing data in Puhti requires CSC user account with a project where Puhti service is enabled. All Puhti users have **read** access to these datasets. You do not need to move the files: they can be used directly, unless you need to modify them, which requires you to make your own copy. -* **Sentinel-2 L2A data** of Finland in Allas. These files are public, so anybody can download them, also from own computer or other services. +* **Landsat mosaics** in Puhti. +* **Sentinel-2 L2A data**, selection of cloud-free tiles in Allas. * [More information and list of all spatial datasets in CSC computing environment](../../../data/datasets/spatial-data-in-csc-computing-env.md) ### EO data download services **[SYKE/FMI, Finnish image mosaics](https://www.syke.fi/fi-FI/Tutkimus__kehittaminen/Tutkimus_ja_kehittamishankkeet/Hankkeet/Paikkatietoalusta_PTA)** : Sentinel-1, Sentinel-2 and Landsat mosaics, for several time periods per year. Some of them are available in Puhti, but not all. [FMI provides also a STAC catalog for these mosaics](https://pta.data.lit.fmi.fi/stac/root.json) -[**Copernicus Data Space Ecosystem**](https://dataspace.copernicus.eu/) provides worldwide main products for Sentinel-1, -2 and -3. It requires free registration. Includes possibility for visualisation and data processing. This was introduced in late 2023 and replaced the European Space Agency's SciHub. This service provides much more than a data download service, see below for more information. 
+[**Copernicus Data Space Ecosystem**](https://dataspace.copernicus.eu/) provides worldwide main Sentinel products. Includes possibility for visualisation and data processing. This was introduced in late 2023 and replaced ESA's SciHub. This service provides much more than a data download service, see below for more information. [**FinHub**](https://finhub.nsdc.fmi.fi/#/home) is the Finnish national mirror of SciHub; other national mirrors also exist. It covers Finland and the Baltics and offers Sentinel-2 L1C (but not L2A) and Sentinel 1 SLC, GRD and OCN products and requires own registration. FinHub provides a similar Graphical User Interface (GUI) and Application Programming Interface (API) to access the data as the old SciHub. You can also use for example the [sentinelsat](https://sentinelsat.readthedocs.io/en/stable/) tool for downloading data from FinHub. @@ -138,44 +138,43 @@ Some Finnish EO datasets are available locally at CSC. A STAC catalog for all sp ### Copernicus Data Space Ecosystem -The Copernicus Data Space Ecosystem does not only provide the possibility to browse, visualize and download Copernicus and other programs earth observation data, it also provides several options for further processing the data in the cloud. Almost all of the services require self registration, which everyone can do for free. Check out the [CDSE website for all available dataset descriptions](https://documentation.dataspace.copernicus.eu/Data.html) (note that duplicates may be available due to reprocessing with newest baselines). +[Copernicus Data Space Ecosystem](https://dataspace.copernicus.eu/) (CDSE) provides the possibility to browse, visualize, download and analyze EO data. Almost all of the services require self-registration. Different services have different limitations, see [CDSE Quotas and limitations](https://documentation.dataspace.copernicus.eu/Quotas.html). 
Compared to the ESA's previous service (SciHub), the number of concurrent downloads per user has increased from two to four for most APIs. -The [Copernicus Data Space Ecosystem Browser](https://dataspace.copernicus.eu/browser/) serves as a central hub for accessing and exploring Earth observation and environmental data provided by the Copernicus Sentinel constellations, contributing missions, auxiliary engineering data, on-demand data and more (Check out the [CDSE documentation on Data](https://documentation.dataspace.copernicus.eu/Data.html) for more details). Users can visualize, compare and analyze, and download all this data. +#### CDSE data -The [Copernicus Data Workspace](https://dataspace.copernicus.eu/workspace/) is a tool for managing and reviewing Earth observation-related products. This platform enables you to aggregate and review products, which can then be further processed or downloaded for various purposes. When products are selected for processing, you are provided with a list of processors that are capable of processing relevant data types. The processors can be further parameterized to fine-tune the results. +CDSE mainly includes different Sentinel datasets, but also some complimentary datasets, inc Landsat, see [full list of CDSE datasets](https://documentation.dataspace.copernicus.eu/Data.html). -The [openEO Algorithm Plaza](https://marketplace-portal.dataspace.copernicus.eu) is a marketplace to discover and share various EO algorithms expressed as openEO process graphs. +Note that duplicates may be available due to reprocessing with newest baselines. -The [openEO Web Editor](https://openeo.dataspace.copernicus.eu/) is a web-based graphical user interface (GUI) that allows users (who are not familiar with a programming language) to interact with the openEO API and perform various tasks related to Earth observation data processing, such as querying available data, defining processing workflows, executing processes, and visualizing the results. 
It allows users to build complex processing chains by connecting different processing steps as building blocks and provides options to specify parameters and input data for each step. +#### CDSE applications -[Copernicus own Jupyter Lab instances](https://jupyterhub.dataspace.copernicus.eu/) provide example notebooks, and the possibility to add own packages via pip. 10Gb of persistent space per user (deleted after 15 days without login). The example notebooks are also available in the [Copernicus Data Space Ecosystem GitHub repository](https://github.com/eu-cdse/notebook-samples). JupyterHub provides several server options with 2 - 4 CPUs and 4 - 16 Gb of RAM. Note that in addition to personal limits, also the total number of active users seems to be limited. +CDSE provides many applications for interacting with the data: +* [CDSE Browser](https://dataspace.copernicus.eu/browser/) - for accessing, exploring and downloading the data. +* [Copernicus Data Workspace](https://dataspace.copernicus.eu/workspace/) is a tool for managing and reviewing EO-related products, which can then be further processed or downloaded for various purposes. When products are selected for processing, you are provided with a list of processors that are capable of processing relevant data types. +* [CDSE Jupyter Notebooks](https://jupyterhub.dataspace.copernicus.eu/) provide to analyse the data using Jupyter Notebooks. Each user has 10Gb of persistent space (deleted after 15 days without login) and access to 2 - 4 CPUs with 4 - 16 Gb RAM. Note that in addition to personal limits, also the total number of active users seems to be limited. It is possible to add own packages via pip. [CDSE example notebooks](https://github.com/eu-cdse/notebook-samples) +* And many more, see [all CDSE applications](https://documentation.dataspace.copernicus.eu/Applications.html) -The Copernicus Data Space Ecosystem provides several different APIs to access the data. 
A [Copernicus access token](https://documentation.dataspace.copernicus.eu/APIs/Token.html) is needed to make use of these interfaces. +#### CDSE data APIs and data download -Catalog APIs, all connected to the same database: +- [CDSE Catalog APIs](https://dataspace.copernicus.eu/analyse/apis/catalogue-apis) support 3 different options for finding suitable data: OData, OpenSearch and STAC. +- [CDSE S3](https://documentation.dataspace.copernicus.eu/APIs/S3.html) for high-performance parallel access and download from CDSE object storage. - - Odata - - OpenSearch - - STAC (Spatio Temporal Asset Catalog) +Several example scripts are available for CDSE data download: -Streamlined Data Access APIs (SDA) enables users to access and retrieve Earth observation (EO) data from the Copernicus Data Space Ecosystem catalogue. These APIs also provide you with a set of tools and services to support data processing and analysis: +* [OpenSearch API + rclone by CSC](https://github.com/csc-training/geocomputing/tree/master/Copernicus_data_download) how to find data and download it to disk or CSC Allas object storage using `rclone`. +* [OData API with Python requests](https://github.com/eu-cdse/notebook-samples/blob/main/geo/odata_basics.ipynb) +* Alternatively [s3cmd and Python boto3](https://documentation.dataspace.copernicus.eu/APIs/S3.html) may be used for S3 downloads. - - SentinelHub - - OpenEO -In addition, also direct EO data access to the CDSE object storage with S3 is provided, as well as an on-demand production API and traceability service. +In principle also STAC has a lot of potential, because it would easier to download products also partially (only some bands or geographically). It is the newest of data API and at the moment does not support any other search criteria than collection, id, time and location, so for example cloud cover filtering is not possible (yet). So use for Sentinel2 data is currently limited, but for Sentinel1 it might be more useful. 
-The [Copernicus Request builder](https://shapps.dataspace.copernicus.eu/requests-builder/) lets you build requests for the different APIs via Graphical User Interface (GUI). It can also provide complete Python scripts using 'requests' or 'sentinelhub' Python packages. Requests can be sent immediately from the GUI or copied into own script/terminal. +You can also read data directly from S3 with GDAL or GDAL-based tools, see [CSC GDAL cloud tutorial](gdal_cloud.md). -The [Sentinel Hub QGIS Plugin](https://documentation.dataspace.copernicus.eu/Applications/QGIS.html) allows you to view satellite image data from the Copernicus Data Space Ecosystem or from Sentinel Hub directly within a QGIS workspace. All datasets are available that are part of collections associated with your user. The current functionality of the QGIS Plugin is for visualization; it does not allow you to perform operations or access properties of the dataset. +These data APIs are free of charge. -The [Copernicus dashboard](https://dashboard.dataspace.copernicus.eu/) shows the state of services and products. +#### OpenEO and SentinelHub -Different services have different limitations, which are described here in the [CDSE quota documentation](https://documentation.dataspace.copernicus.eu/Quotas.html). Compared to the previous system (SciHub), the number of concurrent downloads per user has increased from two to four for most APIs. - -#### Data download from Copernicus Data Space Ecosystem - -For downloading data from the CDSE to CSC computing environment, we recommend using the **S3** access to the CDSE object storage, via command line tools like `s3cmd` for downloading data to a computing environment or `rclone` for downloading data directly to CSC object storage Allas (an example script for finding and downloading data with command line tools can be found in [CSC geocomputing repository](https://github.com/csc-training/geocomputing/tree/master/Copernicus_data_download). 
You can also read data directly with GDAL, following the instructions provided in [CSC GDAL cloud tutorial](gdal_cloud.md). The CDSE also offers many possibilities of accessing the data via **Python** with `sentinelhub`, `openeo` or more simple ones like using `requests`. All packages are available in the `geoconda` module on Puhti. Examples for using these packages are provided in [CDSE GitHub repository](https://github.com/eu-cdse/notebook-samples). The Python packages `sentinelhub` and `openeo` also provide much more capabilities than data download.
+[OpenEO](https://dataspace.copernicus.eu/analyse/apis/openeo-api) and [SentinelHub](https://dataspace.copernicus.eu/analyse/apis/sentinel-hub) are CDSE-related services with an extensive service portfolio and the option to bring processing close to the data. Both have free-of-charge options, but also services with a fee. They provide different APIs, which can also be accessed via Python and R. The SentinelHub Catalog API is actually also a STAC API, with more functionality than CDSE's own STAC.

## How can I process EO data at CSC?

@@ -215,7 +214,7 @@ There is no single software perfect for every task and taste. The right software

[**Python**](../../../apps/python.md)

-* The [geoconda module](../../../apps/geoconda.md) provides many useful Python packages for raster data processing and analysis, such as `rasterio`, `rasterstats`, `scimage`, `sentinelhub`, `xarray` and tools for working with STAC.
+* The [geoconda module](../../../apps/geoconda.md) provides many useful Python packages for raster data processing and analysis, such as `rasterio`, `rasterstats`, `scimage`, `sentinelhub`, `xarray`, `boto3` and tools for working with STAC.
* [Machine learning modules](../../../apps/by_discipline.md#data-analytics-and-machine-learning) provide some common machine learning frameworks, also for deep learning..
[**QGIS**](../../../apps/qgis.md) - open source tool with GUI for working with spatial data including limited multispectral image processing capabilities. GUI with batch processing possibility and Python interface. Used for example for visualization, map algebra and other raster processing. Many plug-ins available, for EO data processing, check out the [QGIS Semi-automatic classification plugin](https://fromgistors.blogspot.com/p/semi-automatic-classification-plugin.html).

@@ -228,6 +227,8 @@ There is no single software perfect for every task and taste. The right software

[**SNAP**](../../../apps/snap.md) - ESA Sentinel Application Platform. Tool for processing of Sentinel data (+ support for other data sources). GUI, CLI (Graph Processing Tool, GPT) and Python interfaces. [SNAP GPT example for Puhti](https://github.com/csc-training/geocomputing/tree/master/snap).

+[**allas**](../../../apps/allas.md) - tools for working with S3 storage, including CSC Allas and CDSE S3: `rclone` and `s3cmd`.
+
If you need further applications, you can ask CSC to install them for you.

### Machine Learning with EO data

From 20444e7a0cef13761a83054e43dfa48a9e6ddb14 Mon Sep 17 00:00:00 2001
From: Kylli Ek
Date: Thu, 22 Feb 2024 18:50:55 +0200
Subject: [PATCH 27/37] style fixes

---
 docs/support/tutorials/gis/eo_guide.md | 9 +++++----
 1 file changed, 5 insertions(+), 4 deletions(-)

diff --git a/docs/support/tutorials/gis/eo_guide.md b/docs/support/tutorials/gis/eo_guide.md
index fa6d405bd9..ef109715bc 100644
--- a/docs/support/tutorials/gis/eo_guide.md
+++ b/docs/support/tutorials/gis/eo_guide.md
@@ -149,6 +149,7 @@ Note that duplicates may be available due to reprocessing with newest baselines.
 #### CDSE applications
 
 CDSE provides many applications for interacting with the data:
+
 * [CDSE Browser](https://dataspace.copernicus.eu/browser/) - for accessing, exploring and downloading the data.
* [Copernicus Data Workspace](https://dataspace.copernicus.eu/workspace/) is a tool for managing and reviewing EO-related products, which can then be further processed or downloaded for various purposes. When products are selected for processing, you are provided with a list of processors that are capable of processing relevant data types.
* [CDSE Jupyter Notebooks](https://jupyterhub.dataspace.copernicus.eu/) provide the possibility to analyse the data in Jupyter Notebooks. Each user has 10Gb of persistent space (deleted after 15 days without login) and access to 2 - 4 CPUs with 4 - 16 Gb RAM. Note that in addition to personal limits, also the total number of active users seems to be limited. It is possible to add own packages via pip. [CDSE example notebooks](https://github.com/eu-cdse/notebook-samples)

@@ -161,12 +162,12 @@ CDSE provides many applications for interacting with the data:

Several example scripts are available for CDSE data download:

-* [OpenSearch API + rclone by CSC](https://github.com/csc-training/geocomputing/tree/master/Copernicus_data_download) how to find data and download it to disk or CSC Allas object storage using `rclone`.
-* [OData API with Python requests](https://github.com/eu-cdse/notebook-samples/blob/main/geo/odata_basics.ipynb)
+* [OpenSearch API + rclone by CSC](https://github.com/csc-training/geocomputing/tree/master/Copernicus_data_download), option to save to CSC Allas or some other object storage.
+* [OData API + Python requests](https://github.com/eu-cdse/notebook-samples/blob/main/geo/odata_basics.ipynb)
* Alternatively [s3cmd and Python boto3](https://documentation.dataspace.copernicus.eu/APIs/S3.html) may be used for S3 downloads.


-In principle also STAC has a lot of potential, because it would easier to download products also partially (only some bands or geographically). 
It is the newest of data API and at the moment does not support any other search criteria than collection, id, time and location, so for example cloud cover filtering is not possible (yet). So use for Sentinel2 data is currently limited, but for Sentinel1 it might be more useful. +In principle also STAC has a lot of potential, because it would easier to download only needed data, for example only some bands or geographically only parts of data. It is the newest of data API and at the moment does not support any other search criteria than collection, time and location, so for example cloud cover filtering is not possible (yet). So use cases for Sentinel2 data is currently limited, but for Sentinel1 it might be more suitable. You can also read data directly from S3 with GDAL or GDAL-based tools, see [CSC GDAL cloud tutorial](gdal_cloud.md). @@ -214,7 +215,7 @@ There is no single software perfect for every task and taste. The right software [**Python**](../../../apps/python.md) -* The [geoconda module](../../../apps/geoconda.md) provides many useful Python packages for raster data processing and analysis, such as `rasterio`, `rasterstats`, `scimage`, `sentinelhub`, `xarray`, `boto3` and tools for working with STAC. +* The [geoconda module](../../../apps/geoconda.md) provides many useful Python packages for raster data processing and analysis, such as `rasterio`, `rasterstats`, `scimage`, `sentinelhub`, `xarray`, `boto3` and packages for working with STAC. * [Machine learning modules](../../../apps/by_discipline.md#data-analytics-and-machine-learning) provide some common machine learning frameworks, also for deep learning.. [**QGIS**](../../../apps/qgis.md) - open source tool with GUI for working with spatial data including limited multispectral image processing capabilities. GUI with batch processing possibility and Python interface. Used for example for visualization, map algebra and other raster processing. 
Many plug-ins available, for EO data processing, check out the [QGIS Semi-automatic classification plugin](https://fromgistors.blogspot.com/p/semi-automatic-classification-plugin.html). From 7a6df1f0946a02931df2c5625d48bd634644382c Mon Sep 17 00:00:00 2001 From: Kylli Ek Date: Thu, 22 Feb 2024 18:54:54 +0200 Subject: [PATCH 28/37] Paituli STAC status update --- docs/support/tutorials/gis/eo_guide.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/docs/support/tutorials/gis/eo_guide.md b/docs/support/tutorials/gis/eo_guide.md index ef109715bc..f588ad7566 100644 --- a/docs/support/tutorials/gis/eo_guide.md +++ b/docs/support/tutorials/gis/eo_guide.md @@ -106,7 +106,7 @@ Commercial datasets are usually available from data provider, while open dataset ### EO data at CSC -Some Finnish EO datasets are available locally at CSC. A STAC catalog for all spatial data available at CSC is currently in progress. You can find more information about it and its current content from the [Paituli STAC page](https://paituli.csc.fi/stac.html). +Some Finnish EO datasets are available locally at CSC. [Paituli STAC](https://paituli.csc.fi/stac.html) includes all raster data available at CSC. * **Landsat mosaics** in Puhti. * **Sentinel-2 L2A data**, selection of cloud-free tiles in Allas. From 52529f801cc0c97a38cce3f2ec7e8310bbfcc41d Mon Sep 17 00:00:00 2001 From: Kylli Ek Date: Thu, 22 Feb 2024 18:56:38 +0200 Subject: [PATCH 29/37] order change of CDSE limitations --- docs/support/tutorials/gis/eo_guide.md | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/docs/support/tutorials/gis/eo_guide.md b/docs/support/tutorials/gis/eo_guide.md index f588ad7566..6c6eba8c6f 100644 --- a/docs/support/tutorials/gis/eo_guide.md +++ b/docs/support/tutorials/gis/eo_guide.md @@ -138,7 +138,7 @@ Some Finnish EO datasets are available locally at CSC. 
[Paituli STAC](https://pa ### Copernicus Data Space Ecosystem -[Copernicus Data Space Ecosystem](https://dataspace.copernicus.eu/) (CDSE) provides the possibility to browse, visualize, download and analyze EO data. Almost all of the services require self-registration. Different services have different limitations, see [CDSE Quotas and limitations](https://documentation.dataspace.copernicus.eu/Quotas.html). Compared to the ESA's previous service (SciHub), the number of concurrent downloads per user has increased from two to four for most APIs. +[Copernicus Data Space Ecosystem](https://dataspace.copernicus.eu/) (CDSE) provides the possibility to browse, visualize, download and analyze EO data. Almost all of the services require self-registration. #### CDSE data @@ -171,7 +171,7 @@ In principle also STAC has a lot of potential, because it would easier to downlo You can also read data directly from S3 with GDAL or GDAL-based tools, see [CSC GDAL cloud tutorial](gdal_cloud.md). -These data APIs are free of charge. +These data APIs are free of charge. Different services have different limitations, see [CDSE Quotas and limitations](https://documentation.dataspace.copernicus.eu/Quotas.html). Compared to the ESA's previous service (SciHub), the number of concurrent downloads per user has increased from two to four for most APIs. 
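As a concrete sketch of the OData catalog queries mentioned above, the snippet below composes a products-search URL using only the Python standard library. The catalogue endpoint and the attribute names (`Collection/Name`, `ContentDate/Start`, `cloudCover`) are taken from the CDSE OData documentation, but treat them as assumptions and verify against the current docs; the helper function itself is only illustrative.

```python
from urllib.parse import quote

# Assumed CDSE catalogue endpoint -- check the CDSE OData documentation.
CATALOGUE = "https://catalogue.dataspace.copernicus.eu/odata/v1/Products"

def build_odata_query(collection, start, end, max_cloud=None, top=20):
    """Return a catalogue URL filtering by collection, sensing time and cloud cover."""
    parts = [
        f"Collection/Name eq '{collection}'",
        f"ContentDate/Start gt {start}",
        f"ContentDate/Start lt {end}",
    ]
    if max_cloud is not None:
        # Cloud cover is exposed as an OData attribute; the exact attribute
        # path below follows the CDSE examples and may need adjusting.
        parts.append(
            "Attributes/OData.CSC.DoubleAttribute/any("
            "att:att/Name eq 'cloudCover' and "
            f"att/OData.CSC.DoubleAttribute/Value le {max_cloud})"
        )
    filt = " and ".join(parts)
    return f"{CATALOGUE}?$filter={quote(filt)}&$top={top}"

url = build_odata_query("SENTINEL-2", "2024-06-01T00:00:00.000Z",
                        "2024-06-30T00:00:00.000Z", max_cloud=20)
print(url)
```

The resulting URL can then be fetched with any HTTP client; the JSON response contains product identifiers that can be downloaded over S3 or HTTP.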

#### OpenEO and SentinelHub

From a1fe6e8b68d6893ddad1577c8278290af09e1b Mon Sep 17 00:00:00 2001
From: Kylli Ek
Date: Thu, 22 Feb 2024 19:01:24 +0200
Subject: [PATCH 30/37] typo

---
 docs/support/tutorials/gis/eo_guide.md | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/docs/support/tutorials/gis/eo_guide.md b/docs/support/tutorials/gis/eo_guide.md
index 6c6eba8c6f..9d0b1314cf 100644
--- a/docs/support/tutorials/gis/eo_guide.md
+++ b/docs/support/tutorials/gis/eo_guide.md
@@ -164,10 +164,10 @@ Several example scripts are available for CDSE data download:

* [OpenSearch API + rclone by CSC](https://github.com/csc-training/geocomputing/tree/master/Copernicus_data_download), option to save to CSC Allas or some other object storage.
* [OData API + Python requests](https://github.com/eu-cdse/notebook-samples/blob/main/geo/odata_basics.ipynb)
-* Alternatively [s3cmd and Python boto3](https://documentation.dataspace.copernicus.eu/APIs/S3.html) may be used for S3 downloads.
+* Alternatively [s3cmd and Python boto3](https://documentation.dataspace.copernicus.eu/APIs/S3.html) can be used for S3 downloads.


-In principle also STAC has a lot of potential, because it would easier to download only needed data, for example only some bands or geographically only parts of data. It is the newest of data API and at the moment does not support any other search criteria than collection, time and location, so for example cloud cover filtering is not possible (yet). So use cases for Sentinel2 data is currently limited, but for Sentinel1 it might be more suitable.
+In principle, also STAC has a lot of potential, because it would be easier to download only the needed data, for example only some bands or only parts of the data geographically. It is the newest of the data APIs and at the moment does not support any other search criteria than collection, time and location, so for example cloud cover filtering is not possible (yet). 
So use cases for Sentinel-2 data are currently limited, but for Sentinel-1 it might be more suitable.

You can also read data directly from S3 with GDAL or GDAL-based tools, see [CSC GDAL cloud tutorial](gdal_cloud.md).

From efeb95c92836ef9cd8a47eddedd14cc6ee0ea5f1 Mon Sep 17 00:00:00 2001
From: Kylli Ek
Date: Thu, 22 Feb 2024 19:09:03 +0200
Subject: [PATCH 31/37] CDSE processing update

---
 docs/support/tutorials/gis/eo_guide.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/docs/support/tutorials/gis/eo_guide.md b/docs/support/tutorials/gis/eo_guide.md
index 9d0b1314cf..bc1a6fdd87 100644
--- a/docs/support/tutorials/gis/eo_guide.md
+++ b/docs/support/tutorials/gis/eo_guide.md
@@ -244,7 +244,7 @@ Below is a list of alternative EO processing services that might be useful, when

**[Microsoft planetary computer](https://planetarycomputer.microsoft.com)** offers JupyterHub together with Dask Gateway, both CPUs and GPUs are available. It is currently available in preview.

- **[Data and Information Access Services (DIAS)](https://www.copernicus.eu/en/access-data/dias)** offer cloud based Virtual Machines (VMs), dedicated baremetal servers, containers, operating system and software images. These services are specialized in EO and have user support available. All of them are commercial services. The new [**Copernicus Data Space Ecosystem**](https://dataspace.copernicus.eu/) combines some of the DIASes into one, including also free trials of the service. See the [Copernicus Data Space Ecosystem - Analyse page](https://dataspace.copernicus.eu/analyse) for more information on the services.
+[**CDSE**](https://dataspace.copernicus.eu/analyse) provides also processing services, mainly via OpenEO and SentinelHub services, soon also On-Demand Processing.

[**Sentinelhub**](https://www.sentinel-hub.com/explore/) is a commercial service that offers several different APIs.
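To illustrate the direct S3 reading mentioned above, here is a small sketch of pointing GDAL-based tools (rasterio, rioxarray, `gdal_translate`) at the CDSE object storage via GDAL's `/vsis3/` virtual file system. The endpoint host, bucket name and object key are assumptions based on the CDSE S3 documentation, and the credentials are placeholders for the S3 keys generated in the CDSE portal.

```python
import os

# GDAL configuration options for a non-AWS S3 endpoint; these environment
# variable names are standard GDAL /vsis3/ settings.
os.environ["AWS_S3_ENDPOINT"] = "eodata.dataspace.copernicus.eu"  # assumed CDSE endpoint
os.environ["AWS_ACCESS_KEY_ID"] = "<your-cdse-s3-key>"            # placeholder
os.environ["AWS_SECRET_ACCESS_KEY"] = "<your-cdse-s3-secret>"     # placeholder
os.environ["AWS_VIRTUAL_HOSTING"] = "FALSE"                       # path-style addressing

def vsis3_path(bucket, key):
    """Build a GDAL /vsis3/ path for an object in an S3 bucket."""
    return f"/vsis3/{bucket}/{key.lstrip('/')}"

# Hypothetical object key, for illustration only.
path = vsis3_path("eodata", "Sentinel-2/MSI/L2A/2024/06/01/example.jp2")
print(path)

# With rasterio installed (e.g. in the geoconda module), reading would then be:
# import rasterio
# with rasterio.open(path) as src:
#     print(src.profile)
```

Because only the needed blocks of a cloud-optimized file are fetched, this avoids downloading whole products when you need just a subset.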
From 371ebf843d9207096af926f698a9091e7255f806 Mon Sep 17 00:00:00 2001 From: Kylli Ek Date: Thu, 22 Feb 2024 19:11:29 +0200 Subject: [PATCH 32/37] remove sentinelhub from alternative processing services, included in CDSE --- docs/support/tutorials/gis/eo_guide.md | 2 -- 1 file changed, 2 deletions(-) diff --git a/docs/support/tutorials/gis/eo_guide.md b/docs/support/tutorials/gis/eo_guide.md index bc1a6fdd87..f6fa3fb7f8 100644 --- a/docs/support/tutorials/gis/eo_guide.md +++ b/docs/support/tutorials/gis/eo_guide.md @@ -246,8 +246,6 @@ Below is a list of alternative EO processing services that might be useful, when [**CDSE**](https://dataspace.copernicus.eu/analyse) provides also processing services, mainly via OpenEO and SentinelHub services, soon also On-Demand Processing. -[**Sentinelhub**](https://www.sentinel-hub.com/explore/) is a commercial service that offers several different APIs. - **Commercial clouds**: Amazon, Google Cloud and Microsoft Azure, all provide virtual machines and other processing services, all of them have some local data, see links above. ## Where can I get help? From cf3689432b5619b71214d91379b1086506697d7c Mon Sep 17 00:00:00 2001 From: Kylli Ek Date: Fri, 23 Feb 2024 16:28:25 +0200 Subject: [PATCH 33/37] Update eo_guide.md --- docs/support/tutorials/gis/eo_guide.md | 70 +++++++++++--------------- 1 file changed, 29 insertions(+), 41 deletions(-) diff --git a/docs/support/tutorials/gis/eo_guide.md b/docs/support/tutorials/gis/eo_guide.md index f6fa3fb7f8..7dc9daed81 100644 --- a/docs/support/tutorials/gis/eo_guide.md +++ b/docs/support/tutorials/gis/eo_guide.md @@ -1,6 +1,6 @@ # Earth Observation guide -This guide aims to help researchers to work with Earth Observation (EO) data using CSC's computing resources. The purpose of this guide is to give an overview of available options, so it would be easier to decide if CSC has suitable services for your EO research. 
It also helps you find the right data and tools for raster data based EO tasks. This guide focuses on spaceborne platforms. However, many tools and concepts also apply to airborne platforms. If you are interested in the fundamentals of EO, please check the [resources and further reading section](#resources-and-further-reading). +This guide aims to help researchers to work with Earth Observation (EO) data using CSC's computing resources. The purpose of this guide is to give an overview of available options, so it would be easier to decide if CSC has suitable services for your EO research. It helps you find the right data and tools for raster data based EO tasks. This guide focuses on spaceborne platforms. However, many tools and concepts also apply to airborne platforms. If you are interested in the fundamentals of EO, please check the [resources and further reading section](#resources-and-further-reading). **What are the benefits of using EO data?** @@ -20,7 +20,7 @@ For working with EO data in general, there are three main options: 1) **EO specific services**, which provide both data and ready-to-use processing environments. Usually these give better user experience and efficiency, but the services might be limited in computing power, available tools and options for adding own data or tools. These might have usage fees. Examples are [Copernicus Data Space Ecosystem](https://dataspace.copernicus.eu/), [Google Earth Engine](https://earthengine.google.com/), [SentinelHub](https://www.sentinel-hub.com/) and [Microsoft Planetary Computer](https://planetarycomputer.microsoft.com). -2) **Cloud services** with access to EO data. Practically, the data is often stored in object-storage and can be accessed as independent service. They also provide general computing services, such as virtual machines, to which EO tools need to be installed by the end-user. These options usually have some fees, mainly for processing and storage. 
The data download may be free of charge or have a small cost, depending on the amount of data needed. One example is [Amazon Web Services](https://registry.opendata.aws/); also the [Microsoft Planetary Computer](https://planetarycomputer.microsoft.com) somewhat fit this category.
+2) **Cloud services** with access to EO data. Practically, the data is often stored in object-storage and can be accessed as an independent service. They provide general computing services, such as virtual machines, to which EO tools need to be installed by the end-user. These options usually have some fees, mainly for processing and storage. The data download may be free of charge or have a small cost, depending on the amount of data needed. One example is [Amazon Web Services](https://registry.opendata.aws/); also the [Microsoft Planetary Computer](https://planetarycomputer.microsoft.com) somewhat fits this category.

3) **Own computing environment** - PC, local cluster, virtual machines. Data needs to be downloaded and all tools must be installed to this system. On the other hand, it gives more freedom to select the tools and set-up. Usually this does not cause any extra costs, but the computing power is usually rather limited.

@@ -28,7 +28,7 @@ CSC services do not fit well in this categorization, as they provide some featur

At CSC, EO data can be processed and analyzed using a supercomputer, for example [supercomputer Puhti](../../../computing/systems-puhti.md), or a virtual machine in the [cPouta cloud service](../../../cloud/pouta/pouta-what-is.md). Puhti's computing capacity can hardly be compared to any other EO service, in both available processing power and amount of memory. Both Puhti and cPouta have also GPU resources, which are especially useful for large simulations and deep learning use cases.

-Puhti has also a lot of [pre-installed applications](#what-applications-are-available-on-puhti), so it is an environment ready to use. cPouta virtual machines are similar to commercial cloud services, where all set-up and installations are done by the end-user. In general, both services only support Linux software.
cPouta virtual machines are similar to commercial cloud services, where all set-up and installations are done by the end-user. In general, both services only support Linux software. +Puhti has a lot of [pre-installed applications](#what-applications-are-available-on-puhti), so it is an environment ready to use. cPouta virtual machines are similar to commercial cloud services, where all set-up and installations are done by the end-user. In general, both services only support Linux software. At CSC, [some Finnish EO datasets](#eo-data-at-csc) are available for direct use. In many cases, however, downloading EO data from other services (see [list of EO data download services](#eo-data-download-services)) is a required step of the process. Puhti and cPouta provide local storage of ~1-20 Tb. For more storage space, [Allas object storage](../../../data/Allas/index.md) can be used. @@ -102,7 +102,7 @@ Commercial datasets are usually available from data provider, while open dataset !!! default "STAC" - Many data providers provide a Spatio Temporal Asset Catalog (STAC) of their datasets. These catalogs help in finding available data based on time and location with the possibility for multiple additional filters, such as cloud cover and resolution. The [STAC Index](https://www.stacindex.org/) provides a nice overview of available catalogs from all over the world, including [Paituli STAC](https://stacindex.org/catalogs/paituli-stac-finland#/). The STAC Index page also includes many resources for learning and utilizing STAC. Check out also CSC's [examples for utilizing STAC from Python](https://github.com/csc-training/geocomputing/blob/master/python/STAC) and [examples for utilizing STAC from R](https://github.com/csc-training/geocomputing/tree/master/R/STAC). + Many data providers provide a Spatio Temporal Asset Catalog (STAC) of their datasets. 
These catalogs help in finding available data based on time and location with the possibility for multiple additional filters, such as cloud cover and resolution. The [STAC Index](https://www.stacindex.org/) provides a nice overview of available catalogs from all over the world. The STAC Index page includes many resources for learning and utilizing STAC. Finnish data is available from [Paituli STAC](https://paituli.csc.fi/stac.html). Also check out CSC's examples for utilizing [STAC from Python](https://github.com/csc-training/geocomputing/blob/master/python/STAC) and [STAC from R](https://github.com/csc-training/geocomputing/tree/master/R/STAC). ### EO data at CSC @@ -114,50 +114,38 @@ Some Finnish EO datasets are available locally at CSC. [Paituli STAC](https://pa ### EO data download services -**[SYKE/FMI, Finnish image mosaics](https://www.syke.fi/fi-FI/Tutkimus__kehittaminen/Tutkimus_ja_kehittamishankkeet/Hankkeet/Paikkatietoalusta_PTA)** : Sentinel-1, Sentinel-2 and Landsat mosaics, for several time periods per year. Some of them are available in Puhti, but not all. [FMI provides also a STAC catalog for these mosaics](https://pta.data.lit.fmi.fi/stac/root.json) +**[SYKE/FMI, Finnish image mosaics](https://www.syke.fi/fi-FI/Tutkimus__kehittaminen/Tutkimus_ja_kehittamishankkeet/Hankkeet/Paikkatietoalusta_PTA)** : Sentinel-1, Sentinel-2 and Landsat mosaics, as well as index mosaics, for several time periods per year. These are included in [Paituli STAC](https://paituli.csc.fi/stac.html). -[**Copernicus Data Space Ecosystem**](https://dataspace.copernicus.eu/) provides worldwide main Sentinel products. Includes possibility for visualisation and data processing. This was introduced in late 2023 and replaced ESA's SciHub. This service provides much more than a data download service, see below for more information. +[**ESA Copernicus Data Space Ecosystem**](https://dataspace.copernicus.eu/) provides the main Sentinel products worldwide; see below for more information. 
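The STAC search described above (filtering by collection, location and time) can be sketched in Python with the `pystac-client` package. This is only a sketch under stated assumptions: the catalog URL and the collection id below are placeholders rather than values from the text, and only the small helper that builds the search parameters runs without network access.

```python
from datetime import date

def stac_search_params(bbox, start, end, collections):
    """Build keyword arguments for a pystac-client Client.search() call."""
    return {
        "collections": list(collections),
        "bbox": list(bbox),
        # STAC APIs take the time filter as an ISO 8601 interval
        "datetime": f"{start.isoformat()}/{end.isoformat()}",
    }

params = stac_search_params(
    bbox=(24.5, 60.1, 25.6, 60.5),        # minx, miny, maxx, maxy in WGS84
    start=date(2023, 6, 1),
    end=date(2023, 6, 30),
    collections=["sentinel_2_mosaics"],   # placeholder collection id
)
print(params["datetime"])  # -> 2023-06-01/2023-06-30

# With network access and the real catalog URL, the search itself would
# look roughly like this:
# from pystac_client import Client       # pip install pystac-client
# catalog = Client.open("<Paituli STAC API URL>")
# for item in catalog.search(**params).items():
#     print(item.id, list(item.assets))
```

The same parameter dictionary works against any STAC API endpoint, which is what makes the catalogs listed in this section interchangeable from the client side.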
-[**FinHub**](https://finhub.nsdc.fmi.fi/#/home) is the Finnish national mirror of SciHub; other national mirrors also exist. It covers Finland and the Baltics and offers Sentinel-2 L1C (but not L2A) and Sentinel 1 SLC, GRD and OCN products and requires own registration. FinHub provides a similar Graphical User Interface (GUI) and Application Programming Interface (API) to access the data as the old SciHub. You can also use for example the [sentinelsat](https://sentinelsat.readthedocs.io/en/stable/) tool for downloading data from FinHub. +[**FinHub**](https://finhub.nsdc.fmi.fi/#/home) covers Finland and the Baltics and offers Sentinel-2 L1C (but not L2A) and Sentinel-1 SLC, GRD and OCN products. No STAC is available. The [sentinelsat](https://sentinelsat.readthedocs.io/en/stable/) Python package is suitable for downloading data from FinHub; see the [CSC FinHub sentinelsat example](https://github.com/csc-training/geocomputing/tree/master/python/sentinel). -[**USGS EarthExplorer**](https://earthexplorer.usgs.gov/) provides among others US related datasets, also worldwide Landsat mission datasets. It requires free registration. Data can be browsed and downloaded via web interface and bulk download. USGS is the main provider of the new [Landsat Collection 2 data](https://www.usgs.gov/landsat-missions/landsat-data-access). +[**USGS EarthExplorer**](https://earthexplorer.usgs.gov/) is a huge datastore with a focus on US data, but it also includes worldwide Landsat datasets. USGS is the main provider of the new [Landsat Collection 2 data](https://www.usgs.gov/landsat-missions/landsat-data-access). [Landsat Collection 2 STAC](https://www.usgs.gov/landsat-missions/spatiotemporal-asset-catalog-stac) -[**NASA Earthdata**](https://search.earthdata.nasa.gov) provides among many others [harmonized Landsat 8 and Sentinel-2 dataset](https://hls.gsfc.nasa.gov/). It requires free registration and download is possible via web interface and bulk download. 
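The sentinelsat route to FinHub mentioned above can be sketched as follows. The FinHub URL comes from the link in the text; the credentials, area WKT and product type are placeholders, and only the small query-building helper runs without a FinHub account.

```python
def finhub_query_kwargs(area_wkt, start, end, platform="Sentinel-2", product_type=None):
    """Collect keyword arguments for a sentinelsat SentinelAPI.query() call."""
    kwargs = {
        "area": area_wkt,
        "date": (start, end),          # sentinelsat accepts 'YYYYMMDD' strings here
        "platformname": platform,
    }
    if product_type is not None:
        kwargs["producttype"] = product_type
    return kwargs

query = finhub_query_kwargs(
    "POLYGON((24.5 60.1, 25.6 60.1, 25.6 60.5, 24.5 60.5, 24.5 60.1))",  # placeholder area
    "20230601", "20230630",
    product_type="S2MSI1C",  # Sentinel-2 L1C, the level FinHub offers
)
print(sorted(query))  # -> ['area', 'date', 'platformname', 'producttype']

# With a FinHub account, the search and download would go roughly like:
# from sentinelsat import SentinelAPI   # pip install sentinelsat
# api = SentinelAPI("<user>", "<password>", "https://finhub.nsdc.fmi.fi")
# products = api.query(**query)
# api.download_all(products)
```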
+[**NASA Earthdata**](https://search.earthdata.nasa.gov) provides, among many others, the [harmonized Landsat 8 and Sentinel-2 dataset](https://hls.gsfc.nasa.gov/). [NASA STAC](https://cmr.earthdata.nasa.gov/search/site/docs/search/stac) -**[Amazon Web Service (AWS) open EO data](https://registry.opendata.aws/?search=tags:gis,earth%20observation,events,mapping,meteorological,environmental,transportation)** is a collection of worldwide EO datasets provided by different organizations, including Landsat and Sentinel. Some of the data can be downloaded only on "requestor pays" basis. Currently, [Sentinel-2 L2A Cloud-optimized Geotiffs](https://registry.opendata.aws/sentinel-2-l2a-cogs/) are available for free, also via STAC. +**[Amazon Web Service (AWS) open EO data](https://registry.opendata.aws/?search=tags:gis,earth%20observation,events,mapping,meteorological,environmental,transportation)** is a collection of worldwide EO datasets provided by different organizations, including Landsat and Sentinel. Some of the data can be downloaded only on a "requester pays" basis. Currently, [Sentinel-2 L2A Cloud-optimized Geotiffs](https://registry.opendata.aws/sentinel-2-l2a-cogs/) by Element 84 are available for free, including STAC. -**[Microsoft planetary computer](https://planetarycomputer.microsoft.com)** provides a STAC of all available data, which includes Sentinel, Landsat, MODIS. It is currently available in preview. +**[Microsoft planetary computer](https://planetarycomputer.microsoft.com)** provides a STAC of all available data, which includes Sentinel, Landsat and MODIS. [**Google Cloud Storage open EO data**](https://cloud.google.com/storage/docs/public-datasets), including Sentinel-2 L1C and Landsat Collection 1 data. Data can be downloaded for example with [FORCE](../../../apps/force.md). -[**Terramonitor**](https://www.terramonitor.com) provides pre-prosessed, analysis ready Sentinel-2 data from Finland available between 2018-2020. It is a commercial service. 
+[**Terramonitor**](https://www.terramonitor.com) provides pre-processed, analysis-ready Sentinel-2 data, also from Finland. It is a commercial service. + +Almost all of the services provide downloads via a web interface and bulk downloads via an API. Most services require free self-registration. !!! default "Other geospatial datasets" To find other geospatial datasets, check out [CSC open spatial dataset list](https://research.csc.fi/open-gis-data). -### Copernicus Data Space Ecosystem - -[Copernicus Data Space Ecosystem](https://dataspace.copernicus.eu/) (CDSE) provides the possibility to browse, visualize, download and analyze EO data. Almost all of the services require self-registration. - -#### CDSE data - -CDSE mainly includes different Sentinel datasets, but also some complimentary datasets, inc Landsat, see [full list of CDSE datasets](https://documentation.dataspace.copernicus.eu/Data.html). - -Note that duplicates may be available due to reprocessing with newest baselines. +### ESA Copernicus Data Space Ecosystem -#### CDSE applications +[Copernicus Data Space Ecosystem](https://dataspace.copernicus.eu/) (CDSE) provides the possibility to browse, visualize, download and analyze EO data. It started in late 2023 and replaced ESA's SciHub. CDSE mainly includes different Sentinel datasets, but also some complementary datasets, including Landsat; see the [full list of CDSE datasets](https://documentation.dataspace.copernicus.eu/Data.html). Note that duplicates may be available due to reprocessing with the newest baselines. -CDSE provides many applications for interacting with the data: +CDSE data APIs and data download: -* [CDSE Browser](https://dataspace.copernicus.eu/browser/) - for accessing, exploring and downloading the data. -* [Copernicus Data Workspace](https://dataspace.copernicus.eu/workspace/) is a tool for managing and reviewing EO-related products, which can then be further processed or downloaded for various purposes. 
When products are selected for processing, you are provided with a list of processors that are capable of processing relevant data types. -* [CDSE Jupyter Notebooks](https://jupyterhub.dataspace.copernicus.eu/) provide to analyse the data using Jupyter Notebooks. Each user has 10Gb of persistent space (deleted after 15 days without login) and access to 2 - 4 CPUs with 4 - 16 Gb RAM. Note that in addition to personal limits, also the total number of active users seems to be limited. It is possible to add own packages via pip. [CDSE example notebooks](https://github.com/eu-cdse/notebook-samples) -* And many more, see [all CDSE applications](https://documentation.dataspace.copernicus.eu/Applications.html) -#### CDSE data APIs and data download - -- [CDSE Catalog APIs](https://dataspace.copernicus.eu/analyse/apis/catalogue-apis) support 3 different options for finding suitable data: OData, OpenSearch and STAC. +* [CDSE Browser](https://dataspace.copernicus.eu/browser/) - web interface for accessing, exploring and downloading the data. +- [CDSE Catalog APIs](https://dataspace.copernicus.eu/analyse/apis/catalogue-apis) support 3 different options for finding suitable data: OData, OpenSearch and STAC. OData and OpenSearch provide similar functionality. In principle, STAC also has a lot of potential, because it would make it easier to download only the needed data, for example only some bands or only parts of the data geographically. It is the newest of the data APIs and at the moment does not support any other search criteria than collection, time and location, so for example cloud cover filtering is not possible (yet). So use cases for Sentinel-2 STAC are currently limited, but for Sentinel-1 it might be more suitable. - [CDSE S3](https://documentation.dataspace.copernicus.eu/APIs/S3.html) for high-performance parallel access and download from CDSE object storage. 
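The S3 access route above can be sketched with `boto3`. The `eodata` endpoint and bucket names follow the linked CDSE S3 documentation, but the product path and credentials below are placeholders, and only the small path-parsing helper runs without CDSE S3 credentials.

```python
def split_eodata_path(product_path):
    """Split a CDSE product path like '/eodata/Sentinel-2/...' into (bucket, key prefix)."""
    bucket, _, key = product_path.strip("/").partition("/")
    return bucket, key

# Placeholder product path; real paths come from the catalog search results
bucket, prefix = split_eodata_path("/eodata/Sentinel-2/MSI/L2A/2023/06/01/<product name>.SAFE")
print(bucket)  # -> eodata

# With CDSE S3 credentials, listing and downloading would go roughly like:
# import boto3                          # pip install boto3
# s3 = boto3.client(
#     "s3",
#     endpoint_url="https://eodata.dataspace.copernicus.eu",
#     aws_access_key_id="<CDSE access key>",
#     aws_secret_access_key="<CDSE secret key>",
# )
# for obj in s3.list_objects_v2(Bucket=bucket, Prefix=prefix).get("Contents", []):
#     s3.download_file(bucket, obj["Key"], obj["Key"].split("/")[-1])
```

Because each band and metadata file is a separate object, listing by prefix and downloading only the needed keys is what makes this route faster than whole-product zip downloads.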
Several example scripts are available for CDSE data download: @@ -166,16 +154,12 @@ Several example scripts are available for CDSE data download: * [OData API + Python requests](https://github.com/eu-cdse/notebook-samples/blob/main/geo/odata_basics.ipynb) * Alternatively [s3cmd and Python boto3](https://documentation.dataspace.copernicus.eu/APIs/S3.html) can be used for S3 downloads. - -In principle, also STAC has a lot of potential, because it would easier to download only needed data, for example only some bands or geographically only parts of data. It is the newest of data APIs and at the moment does not support any other search criteria than collection, time and location, so for example cloud cover filtering is not possible (yet). So use cases for Sentinel2 data is currently limited, but for Sentinel1 it might be more suitable. - You can also read data directly from S3 with GDAL or GDAL-based tools, see [CSC GDAL cloud tutorial](gdal_cloud.md). -These data APIs are free of charge. Different services have different limitations, see [CDSE Quotas and limitations](https://documentation.dataspace.copernicus.eu/Quotas.html). Compared to the ESA's previous service (SciHub), the number of concurrent downloads per user has increased from two to four for most APIs. +These data APIs are free of charge. Different services have different limitations, see [CDSE Quotas and limitations](https://documentation.dataspace.copernicus.eu/Quotas.html). Compared to ESA's previous SciHub service, the number of concurrent downloads per user has increased from two to four for most APIs. -#### OpenEO and SentinelHub +CDSE also includes the [OpenEO](https://documentation.dataspace.copernicus.eu/APIs/openEO/Collections.html) and [SentinelHub](https://documentation.dataspace.copernicus.eu/APIs/SentinelHub/Data.html) services, which provide more analysis-ready datasets with their own download services and APIs. Both have their own STAC. SentinelHub also provides OGC APIs. 
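The direct S3 reads with GDAL mentioned above can be sketched through GDAL's `/vsis3/` virtual file system, here via `rasterio`. The object key is a placeholder, the endpoint name is an assumption based on the CDSE S3 service above, and the actual `open()` call (shown as a comment) needs CDSE S3 credentials; only the small path helper runs offline.

```python
def vsis3_path(bucket, key):
    """Build a GDAL /vsis3/ virtual file system path for one object."""
    return f"/vsis3/{bucket}/{key.lstrip('/')}"

# GDAL configuration options needed for a non-AWS S3 endpoint such as CDSE;
# the access keys would come from your own CDSE S3 credentials.
gdal_s3_env = {
    "AWS_S3_ENDPOINT": "eodata.dataspace.copernicus.eu",
    "AWS_VIRTUAL_HOSTING": "FALSE",
}

path = vsis3_path("eodata", "Sentinel-2/MSI/L2A/<product name>.SAFE/B04.jp2")  # placeholder key
print(path.startswith("/vsis3/eodata/"))  # -> True

# With credentials set, the read itself would go roughly like:
# import rasterio   # GDAL-based
# with rasterio.Env(**gdal_s3_env):
#     with rasterio.open(path) as src:
#         band = src.read(1)  # fetches only the needed blocks over the network
```

This is the mechanism behind "reading only parts of the data": for tiled formats GDAL requests byte ranges instead of whole files.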
-[OpenEO](https://dataspace.copernicus.eu/analyse/apis/openeo-api) and [SentinelHub](https://dataspace.copernicus.eu/analyse/apis/sentinel-hub) are CDSE related services with extensive service portfolio with option to bring processing close to the data. Both have free of charge options, but also services with a fee. They provide different APIs, which can be accessed also via Python and R. SentinelHub Catalog API is actually also STAC API and with more functionality than CDSE own STAC. ## How can I process EO data at CSC? @@ -216,11 +200,11 @@ There is no single software perfect for every task and taste. The right software [**Python**](../../../apps/python.md) * The [geoconda module](../../../apps/geoconda.md) provides many useful Python packages for raster data processing and analysis, such as `rasterio`, `rasterstats`, `scimage`, `sentinelhub`, `xarray`, `boto3` and packages for working with STAC. -* [Machine learning modules](../../../apps/by_discipline.md#data-analytics-and-machine-learning) provide some common machine learning frameworks, also for deep learning.. +* [Machine learning modules](../../../apps/by_discipline.md#data-analytics-and-machine-learning) provide some common machine learning frameworks, including frameworks for deep learning. [**QGIS**](../../../apps/qgis.md) - open source tool with GUI for working with spatial data including limited multispectral image processing capabilities. GUI with batch processing possibility and Python interface. Used for example for visualization, map algebra and other raster processing. Many plug-ins available, for EO data processing, check out the [QGIS Semi-automatic classification plugin](https://fromgistors.blogspot.com/p/semi-automatic-classification-plugin.html). -[**R**](../../../apps/r-env-for-gis.md) - Puhti R installation includes a lot of geospatial packages, including several useful for EO data processing, such as `terra`, `CAST`, `raster` and `spacetime`, also `rstac` for working with STAC catalogs. 
+[**R**](../../../apps/r-env-for-gis.md) - Puhti R installation includes a lot of geospatial packages, including several useful for EO data processing, such as `terra`, `CAST`, `raster`, `rstac` and `spacetime`. [**Sen2Cor**](../../../apps/sen2cor.md) - a command-line tool for Sentinel-2 Level 2A product generation and formatting. @@ -244,7 +228,11 @@ Below is a list of alternative EO processing services that might be useful, when **[Microsoft planetary computer](https://planetarycomputer.microsoft.com)** offers JupyterHub together with Dask Gateway, both CPUs and GPUs are available. It is currently available in preview. -[**CDSE**](https://dataspace.copernicus.eu/analyse) provides also processing services, mainly via OpenEO and SentinelHub services, soon also On-Demand Processing. +[**CDSE**](https://dataspace.copernicus.eu/analyse) provides also processing services, mainly via [**OpenEO**](https://dataspace.copernicus.eu/analyse/apis/openeo-api) and [**SentinelHub**](https://dataspace.copernicus.eu/analyse/apis/sentinel-hub) with options to bring processing close to the data. Both have free of charge options and services with a fee. They provide different APIs, which can be accessed via Python or R. Soon also On-Demand Processing. + + * [Copernicus Data Workspace](https://dataspace.copernicus.eu/workspace/) is a tool for managing and reviewing EO-related products, which can then be further processed or downloaded for various purposes. When products are selected for processing, you are provided with a list of processors that are capable of processing relevant data types. + * [CDSE Jupyter Notebooks](https://jupyterhub.dataspace.copernicus.eu/) provide to analyse the data using Jupyter Notebooks. Each user has 10Gb of persistent space (deleted after 15 days without login) and access to 2 - 4 CPUs with 4 - 16 Gb RAM. Note that in addition to personal limits, also the total number of active users seems to be limited. It is possible to add own packages via pip. 
[CDSE example notebooks](https://github.com/eu-cdse/notebook-samples) * And many more, see [all CDSE applications](https://documentation.dataspace.copernicus.eu/Applications.html) **Commercial clouds**: Amazon, Google Cloud and Microsoft Azure, all provide virtual machines and other processing services, all of them have some local data, see links above. @@ -257,7 +245,7 @@ If you are interested in using CSC services for your EO research, please make yo * Find information about services and how to use them in [CSC's documentation pages](../../../index.md) * For information on geocomputing in CSC environment, check out the collection of [CSC's geocomputing learning materials](https://research.csc.fi/gis-learning-materials) and [CSC geocomputing examples on Github](https://github.com/csc-training/geocomputing) -You can find all the ways that you can get help from CSC specialists via [CSC contact page](../../contact.md). We are happy to help with technical problems around our services and are open for suggestions on which software should be installed to Puhti, or what kind of courses should be offered or materials/examples should be prepared. Please also let us know, if you would like to add a service to this page or find anything unclear. +You can find all the ways that you can get help from CSC specialists via the [CSC contact page](../../contact.md). We are happy to help with technical problems around our services and are open for suggestions on which software should be installed to Puhti, or what kind of courses should be offered or materials/examples should be prepared. Please let us know if you would like to add a service to this page or if you find anything unclear. 
## Acknowledgement From 74e5f5d69e0cc51c3644e3292fbdf11e68bcd6a0 Mon Sep 17 00:00:00 2001 From: Kylli Ek Date: Fri, 23 Feb 2024 16:32:12 +0200 Subject: [PATCH 34/37] Update eo_guide.md --- docs/support/tutorials/gis/eo_guide.md | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/docs/support/tutorials/gis/eo_guide.md b/docs/support/tutorials/gis/eo_guide.md index 7dc9daed81..383c87eca2 100644 --- a/docs/support/tutorials/gis/eo_guide.md +++ b/docs/support/tutorials/gis/eo_guide.md @@ -230,9 +230,9 @@ Below is a list of alternative EO processing services that might be useful, when [**CDSE**](https://dataspace.copernicus.eu/analyse) provides also processing services, mainly via [**OpenEO**](https://dataspace.copernicus.eu/analyse/apis/openeo-api) and [**SentinelHub**](https://dataspace.copernicus.eu/analyse/apis/sentinel-hub) with options to bring processing close to the data. Both have free of charge options and services with a fee. They provide different APIs, which can be accessed via Python or R. Soon also On-Demand Processing. - * [Copernicus Data Workspace](https://dataspace.copernicus.eu/workspace/) is a tool for managing and reviewing EO-related products, which can then be further processed or downloaded for various purposes. When products are selected for processing, you are provided with a list of processors that are capable of processing relevant data types. - * [CDSE Jupyter Notebooks](https://jupyterhub.dataspace.copernicus.eu/) provide to analyse the data using Jupyter Notebooks. Each user has 10Gb of persistent space (deleted after 15 days without login) and access to 2 - 4 CPUs with 4 - 16 Gb RAM. Note that in addition to personal limits, also the total number of active users seems to be limited. It is possible to add own packages via pip. 
[CDSE example notebooks](https://github.com/eu-cdse/notebook-samples) - * And many more, see [all CDSE applications](https://documentation.dataspace.copernicus.eu/Applications.html) +* [Copernicus Data Workspace](https://dataspace.copernicus.eu/workspace/) is a tool for managing and reviewing EO-related products, which can then be further processed or downloaded for various purposes. When products are selected for processing, you are provided with a list of processors that are capable of processing relevant data types. +* [CDSE Jupyter Notebooks](https://jupyterhub.dataspace.copernicus.eu/) provide to analyse the data using Jupyter Notebooks. Each user has 10Gb of persistent space (deleted after 15 days without login) and access to 2 - 4 CPUs with 4 - 16 Gb RAM. Note that in addition to personal limits, also the total number of active users seems to be limited. It is possible to add own packages via pip. [CDSE example notebooks](https://github.com/eu-cdse/notebook-samples) +* And many more, see [all CDSE applications](https://documentation.dataspace.copernicus.eu/Applications.html) **Commercial clouds**: Amazon, Google Cloud and Microsoft Azure, all provide virtual machines and other processing services, all of them have some local data, see links above. 
From 922ace96d47ed4035daf1446e662909b6ff1d378 Mon Sep 17 00:00:00 2001 From: Kylli Ek Date: Fri, 23 Feb 2024 16:38:49 +0200 Subject: [PATCH 35/37] Update eo_guide.md --- docs/support/tutorials/gis/eo_guide.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/docs/support/tutorials/gis/eo_guide.md b/docs/support/tutorials/gis/eo_guide.md index 383c87eca2..761e79b9fd 100644 --- a/docs/support/tutorials/gis/eo_guide.md +++ b/docs/support/tutorials/gis/eo_guide.md @@ -18,7 +18,7 @@ This guide aims to help researchers to work with Earth Observation (EO) data usi For working with EO data in general, there are three main options: -1) **EO specific services**, which provide both data and ready-to-use processing environments. Usually these give better user experience and efficiency, but the services might be limited in computing power, available tools and options for adding own data or tools. These might have usage fees. Examples are [Copernicus Data Space Ecosystem](https://dataspace.copernicus.eu/), [Google Earth Engine](https://earthengine.google.com/), [SentinelHub](https://www.sentinel-hub.com/) and [Microsoft Planetary Computer](https://planetarycomputer.microsoft.com). +1) **EO specific services**, which provide both data and ready-to-use processing environments. Usually these give better user experience and efficiency, but the services might be limited in computing power, available tools and options for adding own data or tools. These might have usage fees. Examples are [Copernicus Data Space Ecosystem](https://dataspace.copernicus.eu/), [Google Earth Engine](https://earthengine.google.com/) and [Microsoft Planetary Computer](https://planetarycomputer.microsoft.com). 2) **Cloud services** with access to EO data. Practically, the data is often stored in object-storage and can be accessed as an independent service. They provide general computing services, such as virtual machines, to which EO tools need to be installed by the end-user. 
These options usually have some fees, mainly for processing and storage. The data download may be free of charge or have a small cost, depending on the amount of data needed. One example is [Amazon Web Services](https://registry.opendata.aws/); also the [Microsoft Planetary Computer](https://planetarycomputer.microsoft.com) somewhat fits this category. From eaa708e811590253eaf37237af515f16d4cc2ff1 Mon Sep 17 00:00:00 2001 From: Kylli Ek Date: Mon, 26 Feb 2024 11:32:01 +0200 Subject: [PATCH 36/37] Update docs/support/tutorials/gis/eo_guide.md Co-authored-by: EetuHuuskoCSC <116141296+EetuHuuskoCSC@users.noreply.github.com> --- docs/support/tutorials/gis/eo_guide.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/docs/support/tutorials/gis/eo_guide.md b/docs/support/tutorials/gis/eo_guide.md index 761e79b9fd..3bc7dd8ad2 100644 --- a/docs/support/tutorials/gis/eo_guide.md +++ b/docs/support/tutorials/gis/eo_guide.md @@ -28,7 +28,7 @@ CSC services do not fit well in this categorization, as they provide some featur At CSC, EO data can be processed and analyzed using a supercomputer, for example [supercomputer Puhti](../../../computing/systems-puhti.md), or a virtual machine in the [cPouta cloud service](../../../cloud/pouta/pouta-what-is.md). Puhti's computing capacity can hardly be compared to any other EO service, in both available processing power and amount of memory. Both Puhti and cPouta also have GPU resources, which are especially useful for large simulations and deep learning use cases. -Puhti has a lot of [pre-installed applications](#what-applications-are-available-on-puhti), so it is an environment ready to use. +Puhti has a lot of [pre-installed applications](#what-applications-are-available-on-puhti), so it is a ready-to-use environment. 
cPouta virtual machines are similar to commercial cloud services, where all set-up and installations are done by the end-user. In general, both services only support Linux software. At CSC, [some Finnish EO datasets](#eo-data-at-csc) are available for direct use. In many cases, however, downloading EO data from other services (see [list of EO data download services](#eo-data-download-services)) is a required step of the process. Puhti and cPouta provide local storage of ~1-20 Tb. For more storage space, [Allas object storage](../../../data/Allas/index.md) can be used. From 9537b698ac30ee8100b151bea2f69ebc822f9727 Mon Sep 17 00:00:00 2001 From: Kylli Ek Date: Mon, 26 Feb 2024 11:32:18 +0200 Subject: [PATCH 37/37] Update docs/support/tutorials/gis/eo_guide.md Co-authored-by: EetuHuuskoCSC <116141296+EetuHuuskoCSC@users.noreply.github.com> --- docs/support/tutorials/gis/eo_guide.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/docs/support/tutorials/gis/eo_guide.md b/docs/support/tutorials/gis/eo_guide.md index 3bc7dd8ad2..8353834536 100644 --- a/docs/support/tutorials/gis/eo_guide.md +++ b/docs/support/tutorials/gis/eo_guide.md @@ -231,7 +231,7 @@ Below is a list of alternative EO processing services that might be useful, when [**CDSE**](https://dataspace.copernicus.eu/analyse) provides also processing services, mainly via [**OpenEO**](https://dataspace.copernicus.eu/analyse/apis/openeo-api) and [**SentinelHub**](https://dataspace.copernicus.eu/analyse/apis/sentinel-hub) with options to bring processing close to the data. Both have free of charge options and services with a fee. They provide different APIs, which can be accessed via Python or R. Soon also On-Demand Processing. * [Copernicus Data Workspace](https://dataspace.copernicus.eu/workspace/) is a tool for managing and reviewing EO-related products, which can then be further processed or downloaded for various purposes. 
When products are selected for processing, you are provided with a list of processors that are capable of processing relevant data types. -* [CDSE Jupyter Notebooks](https://jupyterhub.dataspace.copernicus.eu/) provide to analyse the data using Jupyter Notebooks. Each user has 10Gb of persistent space (deleted after 15 days without login) and access to 2 - 4 CPUs with 4 - 16 Gb RAM. Note that in addition to personal limits, also the total number of active users seems to be limited. It is possible to add own packages via pip. [CDSE example notebooks](https://github.com/eu-cdse/notebook-samples) +* [CDSE Jupyter Notebooks](https://jupyterhub.dataspace.copernicus.eu/) provide the ability to analyze the data using Jupyter Notebooks. Each user has 10 GB of persistent space (deleted after 15 days without login) and access to 2-4 CPUs with 4-16 GB RAM. Note that in addition to the personal limits, the total number of active users also seems to be limited. It is possible to add your own packages via pip. [CDSE example notebooks](https://github.com/eu-cdse/notebook-samples) * And many more, see [all CDSE applications](https://documentation.dataspace.copernicus.eu/Applications.html) **Commercial clouds**: Amazon, Google Cloud and Microsoft Azure, all provide virtual machines and other processing services, all of them have some local data, see links above.