Commit c7bb56d

Merge pull request #1556 from CSCfi/clean-conda-tutorial
Clean up old Conda tutorial and update Tykky
2 parents a125e44 + aaef337 commit c7bb56d

11 files changed: +187 -577 lines changed

docs/apps/python-data.md

+1-1
Original file line number | Diff line number | Diff line change
@@ -24,7 +24,7 @@ Collection of Python libraries for data analytics and machine learning.
2424

2525
**4.2.2022** All old Python Data versions which were based on direct Conda
2626
installations have been deprecated, and we encourage users to move to newer
27-
versions. Read more on our separate [Conda deprecation page](../support/deprecate-conda.md).
27+
versions. Read more on our separate [Conda deprecation page](../support/tutorials/conda.md).
2828

2929

3030
## Available

docs/apps/pytorch.md

+1-1
@@ -20,7 +20,7 @@ Machine learning framework for Python.
2020

2121
**4.2.2022** All old PyTorch versions which were based on direct Conda
2222
installations have been deprecated, and we encourage users to move to newer
23-
versions. Read more on our separate [Conda deprecation page](../support/deprecate-conda.md).
23+
versions. Read more on our separate [Conda deprecation page](../support/tutorials/conda.md).
2424

2525

2626
## Available

docs/apps/rapids.md

+1-1
@@ -20,7 +20,7 @@ Suite of libraries for data analytics and machine learning on GPUs.
2020

2121
**4.2.2022** All old RAPIDS versions which were based on direct Conda
2222
installations have been deprecated, and we encourage users to move to newer
23-
versions. Read more on our separate [Conda deprecation page](../support/deprecate-conda.md).
23+
versions. Read more on our separate [Conda deprecation page](../support/tutorials/conda.md).
2424

2525

2626
## Available

docs/apps/tensorflow.md

+1-1
@@ -21,7 +21,7 @@ Deep learning framework for Python.
2121

2222
**4.2.2022** All old TensorFlow versions which were based on direct Conda
2323
installations have been deprecated, and we encourage users to move to newer
24-
versions. Read more on our separate [Conda deprecation page](../support/deprecate-conda.md).
24+
versions. Read more on our separate [Conda deprecation page](../support/tutorials/conda.md).
2525

2626

2727
## Available

docs/computing/containers/tykky.md

+129-88
@@ -2,65 +2,63 @@
22

33
## Intro
44

5-
Tykky is a set of tools which make software installations to HPC systems easier and
5+
Tykky is a set of tools which make software installations on HPC systems easier and
66
more efficient using Apptainer containers.
77

88
Tykky use cases:
99

10-
* Conda installations, based on Conda `environment.yml`.
11-
* Pip installations, based on pip `requirements.txt`.
12-
* Container installations, based on existing Docker or Apptainer/Singularity images.
13-
* This includes installations from the Bioconda channel, see [this tutorial for
10+
- Conda installations, based on Conda `environment.yml`.
11+
- Pip installations, based on pip `requirements.txt`.
12+
- Container installations, based on existing Docker or Apptainer/Singularity images.
13+
- This includes installations from the Bioconda channel, see [this tutorial for
1414
an example](../../support/tutorials/bioconda-tutorial.md).
1515

16-
Tykky wraps installations inside
17-
an Apptainer/Singularity container to improve startup times,
18-
reduce IO load, and lessen the number of files on large parallel filesystems.
19-
Additionally, Tykky will generate wrappers so that installed
20-
software can be used (almost) as if it was not containerized. Depending
21-
on tool selection and settings, either the whole host filesystem or
22-
a limited subset is visible during execution and installation. This means that
23-
it's possible to wrap installation using e.g mpi4py relying on the host provided
24-
mpi installation.
16+
Tykky wraps installations inside an Apptainer/Singularity container to improve startup
17+
times, reduce I/O load, and lessen the number of files on large parallel file systems.
18+
Additionally, Tykky will generate wrappers so that installed software can be used
19+
(almost) as if it was not containerized. Depending on tool selection and settings,
20+
either the whole host file system or a limited subset is visible during execution
21+
and installation. This means that it's possible to wrap installations using e.g.
22+
`mpi4py` relying on the host-provided MPI installation.
2523

26-
This documentation covers a subset of the functionality and focuses on
27-
conda and Python, a large part of the advanced use-cases
28-
are not covered here yet.
24+
This documentation covers a subset of the functionality and focuses on Conda and
25+
Python. Most advanced use-cases are not covered here yet.
2926

3027
!!! Warning
31-
As Tykky is still under development some of the more advanced features might change in exact usage and API.
28+
As Tykky is still under development, some of the more advanced features might
29+
change with respect to exact usage and API.
3230

3331
## Tykky module
3432

35-
To access Tykky tools:
33+
To access Tykky tools:
3634

37-
1) Usually it is best to first unload all other modules:
35+
1) Usually it is best to first unload all other modules:
3836

39-
```
37+
```bash
4038
module purge
4139
```
4240

43-
2) Load Tykky module.
41+
2) Load the Tykky module:
4442

4543
```bash
4644
module load tykky
4745
```
4846

49-
## Conda based installation
47+
## Conda-based installation
5048

51-
First make sure that you have read and understood the license terms for miniconda and any used channels
52-
before using the command.
49+
First, make sure that you have read and understood the license terms for Miniconda
50+
and any used channels before using the command.
5351

54-
- [Miniconda end user license agreement](https://www.anaconda.com/end-user-license-agreement-miniconda).
52+
- [Miniconda end-user license agreement](https://www.anaconda.com/end-user-license-agreement-miniconda).
5553
- [Anaconda terms of service](https://www.anaconda.com/terms-of-service).
5654
- [A blog entry on Anaconda commercial edition](https://www.anaconda.com/blog/anaconda-commercial-edition-faq).
5755

58-
1) Create **conda environment file** env.yml:
56+
1) Create a **Conda environment file** `env.yml`:
5957

60-
* [Create manually a new file](https://conda.io/projects/conda/en/latest/user-guide/tasks/manage-environments.html#create-env-file-manually) or
61-
* [Create the file from existing conda installation](https://conda.io/projects/conda/en/latest/user-guide/tasks/manage-environments.html#sharing-an-environment). For example: `conda env export -n <target_env_name> > env.yml`.
62-
* If the existing environment is on a Windows and MacOS machine, it might need the [`--from-history` flag](https://conda.io/projects/conda/en/latest/user-guide/tasks/manage-environments.html#exporting-an-environment-file-across-platforms), to get a .yml file suitable for Linux.
63-
* If the existing environment is on a Linux machine with x86 CPU architecture, it is possible also to use [`--explicit` flag](https://conda.io/projects/conda/en/latest/user-guide/tasks/manage-environments.html#building-identical-conda-environments)
58+
- [Create manually a new file](https://conda.io/projects/conda/en/latest/user-guide/tasks/manage-environments.html#create-env-file-manually) or
59+
- [Create the file from an existing Conda installation](https://conda.io/projects/conda/en/latest/user-guide/tasks/manage-environments.html#sharing-an-environment). For example: `conda env export -n <target_env_name> > env.yml`.
60+
- If the existing environment is on a Windows or macOS machine, the [`--from-history` flag](https://conda.io/projects/conda/en/latest/user-guide/tasks/manage-environments.html#exporting-an-environment-file-across-platforms) might be required to get a `.yml` file suitable for Linux.
61+
- If the existing environment is on a Linux machine with x86 CPU architecture, it is also possible to use the [`--explicit` flag](https://conda.io/projects/conda/en/latest/user-guide/tasks/manage-environments.html#building-identical-conda-environments).
6462

6563
An example of a suitable `env.yml` file would be:
6664

@@ -73,48 +71,61 @@ dependencies:
7371
- nglview
7472
```
7573
74+
!!! info
75+
The `channels` field lists which channels the packages should be pulled from
76+
to this environment, whereas the `dependencies` field lists the actual Conda
77+
packages that will be installed into the environment. Note that Conda uses a
78+
channel priority for determining where to install packages from, i.e. it tries
79+
to first install packages from the first listed channel. If no package versions
80+
are specified, Conda always installs the latest versions.
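
As a rough illustration of the channel order and version pinning described in this note, the sketch below writes a small environment file. The channel list, package names and version numbers are assumptions chosen for the example, not the contents of the `env.yml` used elsewhere on this page.

```bash
# Illustrative only: channels, packages and versions below are assumptions.
workdir="$(mktemp -d)"

cat > "$workdir/env.yml" <<'EOF'
channels:
  - conda-forge    # listed first, so Conda tries this channel first
  - bioconda       # consulted for packages not found in the first channel
dependencies:
  - python=3.10    # pinned version
  - numpy=1.26     # pinned version
  - nglview        # no version given: Conda installs the latest one
EOF

cat "$workdir/env.yml"
```

Pinning versions makes it more likely that rebuilding the environment later gives the same result.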
7681

77-
2) Create new directory for installation <install_dir>. Likely `/projappl/<your_project>/..` is a good place.
82+
2) Create a new directory `<install_dir>` for the installation. `/projappl/<your_project>/...`
83+
is recommended.
7884

79-
3) Create installation
85+
3) Create the installation:
8086

8187
```bash
8288
conda-containerize new --prefix <install_dir> env.yml
8389
```
8490

85-
4) Add the bin directory `<install_dir>/bin` to the path.
91+
4) Add the `<install_dir>/bin` directory to your `$PATH`:
8692

8793
```bash
8894
export PATH="<install_dir>/bin:$PATH"
8995
```
9096

91-
5) You can call python and any other executables conda has installed in the same way as if you had activated the environment.
97+
5) Now you can call `python` and any other executables Conda has installed in the same
98+
way as if you had activated the environment.
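
The effect of prepending `<install_dir>/bin` to `PATH` can be checked with `command -v`. The sketch below does not use Tykky at all: it uses a temporary directory and a hypothetical stub script standing in for a generated wrapper, to show that the first matching entry in `PATH` is the one that gets called.

```bash
# Stand-in for <install_dir>; a real Tykky installation creates bin/ itself.
install_dir="$(mktemp -d)"
mkdir -p "$install_dir/bin"

# Hypothetical stub playing the role of a generated wrapper.
printf '#!/bin/sh\necho "wrapper called"\n' > "$install_dir/bin/python"
chmod +x "$install_dir/bin/python"

# Prepend the bin directory, as in step 4.
export PATH="$install_dir/bin:$PATH"

# Forget any cached command locations, then check what `python` resolves to.
hash -r
resolved="$(command -v python)"
echo "python resolves to: $resolved"
```

With a real installation, the same `command -v python` check should point into `<install_dir>/bin`.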
9299

93-
### pip with conda
100+
### Pip with Conda
94101

95-
To install some additional pip packages, add the `-r <req_file>` argument e.g:
102+
To install some additional pip packages, add the `-r <req_file>` argument, e.g.:
96103

97-
```
104+
```bash
98105
conda-containerize new -r req.txt --prefix <install_dir> env.yml
99106
```
100107

101-
### mamba
102-
The tool also supports using [mamba](https://github.com/mamba-org/mamba)
103-
for installing packages. Mamba often finds suitable packages much faster than conda, so it is a good option when required package list is long. Enable this feature by adding the `--mamba` flag.
108+
### Mamba
104109

105-
```
110+
The tool also supports using [Mamba](https://github.com/mamba-org/mamba) for installing
111+
packages. Mamba often finds suitable packages much faster than Conda, so it is a good
112+
option when the required package list is long. Enable this feature by adding the `--mamba`
113+
flag.
114+
115+
```bash
106116
conda-containerize new --mamba --prefix <install_dir> env.yml
107117
```
108118

119+
### End-to-end example
109120

110-
### End-to-end example
121+
Create a new Conda-based installation using the previous `env.yml` file.
111122

112-
Create new conda based installation using the previous `env.yml` file.
113-
```
123+
```bash
114124
mkdir MyEnv
115-
conda-containerize new --prefix MyEnv env.yml
125+
conda-containerize new --prefix MyEnv env.yml
116126
```
117-
After the installation finishes we can add the installation directory to our PATH
127+
128+
After the installation finishes, add the installation directory to your `PATH`
118129
and use it like normal.
119130

120131
```bash
@@ -130,74 +141,104 @@ Type "help", "copyright", "credits" or "license" for more information.
130141
>>>
131142
```
132143

133-
### Modifying a conda installation
144+
### Modifying a Conda installation
134145

135-
Tykky installed software resides in a container, so it can not be directly modified.
136-
Small Python packages can be added normally using `pip`, but then the Python packages are
137-
sitting on the parallel filesystem so this is not recommended for any larger installations.
146+
Tykky installations reside in a container, so they cannot be directly modified.
147+
Small Python packages can be added normally using `pip`, but then the Python packages
148+
will be sitting on the parallel file system, which is not recommended for any larger
149+
installations.
138150

139-
To actually modify the installation we can use the `update` keyword
140-
together with the `--post-install <file>` option which specifies a bash script
141-
with commands to run to update the installation. The commands are executed
142-
with the conda environment activated.
151+
To actually modify the installation, we can use the `update` keyword together with
152+
the `--post-install <file>` option, which specifies a bash script with commands to
153+
run to update the installation. The commands are executed with the Conda environment
154+
activated.
143155

144-
```
156+
```bash
145157
conda-containerize update <existing installation> --post-install <file>
146158
```
147159

148-
Where `<file>` could e.g contain:
160+
Where `<file>` could e.g. contain:
149161

150-
```
151-
conda install -y numpy
152-
conda remove -y nglview
162+
```bash
163+
conda install -y numpy
164+
conda remove -y nglview
153165
pip install requests
154166
```
155167

156-
In this mode the whole host system is available including all software and modules.
168+
In this mode the whole host system is available including all software and modules.
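
Putting the update steps together, a minimal sketch could look like the following. The file name `post.sh` and the `MyEnv` directory are assumptions for the example, and the update command only runs if `conda-containerize` and the installation directory are actually present; otherwise the script just prints what it would run.

```bash
# Write the post-install script (file name is an assumption for this example).
workdir="$(mktemp -d)"
post_install="$workdir/post.sh"

cat > "$post_install" <<'EOF'
conda install -y numpy
conda remove -y nglview
pip install requests
EOF

# Assumed to be an existing Tykky installation, e.g. from the end-to-end example.
install_dir="MyEnv"

if command -v conda-containerize >/dev/null 2>&1 && [ -d "$install_dir" ]; then
    conda-containerize update "$install_dir" --post-install "$post_install"
else
    echo "conda-containerize or $install_dir not found; would run:"
    echo "conda-containerize update $install_dir --post-install $post_install"
fi
```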
157169

158-
## Pip based installations
170+
## Pip-based installations
159171

160-
Sometimes you don't need a full blown conda environment or you might prefer pip
161-
to manage Python installations. For this case we can use:
172+
Sometimes you don't need a full-blown Conda environment or you might prefer pip
173+
to manage Python installations. In this case we can use:
162174

163-
```
175+
```bash
164176
pip-containerize new --prefix <install_dir> req.txt
165177
```
166-
Where `req.txt` is a standard pip requirements file.
167-
The notes and options for modifying a conda installation apply here as well.
168178

169-
Note that the Python version used by `pip-containerize` is the first Python executable found in the path, so it's affected by loaded modules.
179+
where `req.txt` is a standard pip requirements file. The notes and options for
180+
modifying a Conda installation apply here as well.
170181

171-
**Important:** This python can not be itself container-based as nesting is not possible.
182+
Note that the Python version used by `pip-containerize` is the first Python executable
183+
found in the path, so it's affected by loaded modules.
172184

173-
An additional flag `--slim` argument exists, which will instead use a pre-built minimal python
174-
container with a much newer version of python as a base. Without the `--slim` flag, the whole host system is available,
175-
and with the flag the system installations (i.e /usr, /lib64 ...) are no longer taken from the host, instead
176-
coming from within container.
185+
**Important:** This Python cannot itself be container-based, as nesting is not possible!
177186

178-
## Container based installations
187+
An additional `--slim` flag exists, which will instead use a pre-built minimal Python
188+
container with a much newer version of Python as a base. Without the `--slim` flag,
189+
the whole host system is available, whereas with the flag the system installations (i.e.
190+
`/usr`, `/lib64`, ...) are no longer taken from the host, but instead coming from
191+
within the container.
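
As a hedged sketch of a pip-based installation using the `--slim` flag, the script below writes a small requirements file and then calls `pip-containerize`. The package list and the temporary installation directory are assumptions for illustration (on the actual systems, a directory under `/projappl/<your_project>/` would be the more likely choice), and the placement of `--slim` before `--prefix` is likewise an assumption. The command is skipped if `pip-containerize` is not available.

```bash
workdir="$(mktemp -d)"

# Illustrative requirements file; the packages are assumptions.
cat > "$workdir/req.txt" <<'EOF'
requests
numpy>=1.24
EOF

# Illustrative installation directory.
install_dir="$workdir/pip-env"
mkdir -p "$install_dir"

if command -v pip-containerize >/dev/null 2>&1; then
    # --slim: use the pre-built minimal Python container as the base
    pip-containerize new --slim --prefix "$install_dir" "$workdir/req.txt"
else
    echo "pip-containerize not found; would run:"
    echo "pip-containerize new --slim --prefix $install_dir $workdir/req.txt"
fi
```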
179192

180-
Tykky also provides an option to:
181-
182-
* Generate wrappers for tools in existing Apptainer/Singularity containers, so that they can be used
183-
transparently (no need to prepend `singularity exec ...`, or modify scripts if switching between containerized versions and "normal" installation).
184-
* Install tools available in Docker images, including generating wrappers.
193+
## Container-based installations
185194

195+
Tykky also provides an option to:
196+
197+
- Generate wrappers for tools in existing Apptainer/Singularity containers so that
198+
they can be used transparently (no need to prepend `apptainer exec ...` or modify
199+
scripts if switching between containerized versions and "normal" installations).
200+
- Install tools available in Docker images, including generating wrappers.
201+
202+
```bash
203+
wrap-container -w /path/inside/container <container> --prefix <install_dir>
186204
```
187-
wrap-container -w </path/inside/container> <container> --prefix <install_dir>
188-
```
189205

190-
* `<container>` can be a local filepath or any [URL accepted by singularity](https://docs.sylabs.io/guides/3.7/user-guide/cli/singularity_pull.html) (e.g `docker://` `oras://` )
191-
* `-w` needs to be an absolute path (or comma separated list) inside the container. Wrappers will then be automatically
192-
created for the executables in the target directories / for the target path. If you do not know the path of executables in the container, open a shell inside the container and use [which command](https://linuxize.com/post/linux-which-command/). To open shell:
193-
* In case of existing local Apptainer/Singularity file: `singularity shell xxx.sif`.
194-
* In case of Docker or non-local Apptainer/Singularity file, create first the installation with some path and then start with created `_debug_shell`.
206+
- `<container>` can be a local filepath or any [URL accepted by
207+
Apptainer/Singularity](https://docs.sylabs.io/guides/3.7/user-guide/cli/singularity_pull.html)
208+
(e.g. `docker://`, `oras://`).
209+
- `-w` needs to be an absolute path (or comma-separated list) inside the container.
210+
Wrappers will then be automatically created for the executables in the target
211+
directories / for the target path. If you do not know the path of the executables
212+
in the container, open a shell inside the container and use the [which
213+
command](https://linuxize.com/post/linux-which-command/). To open a shell:
214+
- In case of an existing local Apptainer/Singularity file: `singularity shell image.sif`.
215+
- In case of a Docker or non-local Apptainer/Singularity file, first create the
216+
installation with some path and then start a shell with the created `_debug_shell`.
217+
218+
## Memory errors
219+
220+
With very large installations the resources available on the login node might
221+
not be enough, resulting in Tykky failing with a `MemoryError`. In this case, the
222+
installation needs to be done on a compute node, for example using an [interactive
223+
session](../../computing/running/interactive-usage.md#sinteractive-in-puhti):
224+
225+
```bash
226+
# Start interactive session, here with 12 GB memory and 15 GB local disk (increase if needed)
227+
sinteractive --account <project> --time 1:00:00 --mem 12000 --tmp 15
228+
229+
# Load Tykky
230+
module purge
231+
module load tykky
232+
233+
# Run the Tykky commands as described above, e.g.
234+
conda-containerize new --prefix <install_dir> env.yml
235+
```
195236

196237
## More complicated example
197238

198239
[Example in tool repository](https://github.com/CSCfi/hpc-container-wrapper/blob/master/examples/fftw.md).
199240

200241
## How it works
201242

202-
See the README in the source code repository.
203-
The source code can be found in the [GitHub repository](https://github.com/CSCfi/hpc-container-wrapper).
243+
See the `README` in the source code repository. The source code can be found in the
244+
[GitHub repository](https://github.com/CSCfi/hpc-container-wrapper).

docs/computing/running/fireworks.md

+1-1
@@ -33,7 +33,7 @@ pymongo==3.10.0
3333

3434
!!! Note
3535
Please do not install FireWorks in a Conda environment that is sitting directly on the shared
36-
Lustre file system. [CSC has deprecated the direct usage of Conda](../../support/deprecate-conda.md)
36+
Lustre file system. [CSC has deprecated the direct usage of Conda](../../support/tutorials/conda.md)
3737
installations on our supercomputers to avoid performance issues due to the large number of files
3838
brought by Conda. For reference, a Conda installation of FireWorks contains more than 24000
3939
files, most of which are read each time the application is run. This causes startup delays and

docs/computing/running/throughput.md

+1-1
@@ -209,7 +209,7 @@ workflows.
209209
[Singularity container]: ../containers/run-existing.md
210210
[mount your datasets with SquashFS]: ../containers/run-existing.md#mounting-datasets-with-squashfs
211211
[file striping]: ../lustre.md#file-striping-and-alignment
212-
[CSC has deprecated the direct usage of Conda environments]: ../../support/deprecate-conda.md
212+
[CSC has deprecated the direct usage of Conda environments]: ../../support/tutorials/conda.md
213213
[container wrapper tool Tykky]: ../containers/tykky.md
214214
[how to work efficiently with Lustre are documented here]: ../lustre.md#best-practices
215215
[Data storage guide for machine learning]: ../../support/tutorials/ml-data.md

docs/computing/usage-policy.md

+1-1
@@ -68,7 +68,7 @@ efficiently.
6868
Due to performance issues of Conda-based environments on parallel file systems,
6969
CSC has deprecated the _direct_ usage of Conda installations. This means that any
7070
Conda environments you intend to use must be installed within a container. See
71-
the page [Deprecating Conda](../support/deprecate-conda.md) for more information.
71+
[Conda best practices](../support/tutorials/conda.md) for more information.
7272

7373
!!! info "Tykky"
7474
Please consider the [Tykky container wrapper](containers/tykky.md) for easy
