
Commit 7a03360

[RFC] Support for air gapped mode
Signed-off-by: Lianhao Lu <[email protected]>
1 parent c539231 commit 7a03360

1 file changed: +81 -0 lines changed

@@ -0,0 +1,81 @@
# Support Air-gapped environment

This RFC describes how to support running OPEA microservices in an air-gapped environment.

## Author(s)

[Lianhao Lu](https://github.com/lianhao)

## Status

`Under Review`

## Objective

An air-gapped computer or network is one that has no network interfaces, either wired or wireless, connected to outside networks (e.g. the Internet). Many enterprises have network security policies that prohibit computers on the internal network from sending data to, or receiving data from, outside networks.

This RFC discusses how to support running OPEA microservices in such an air-gapped environment, including how to find out what kind of data needs to be pre-downloaded, where to store the pre-downloaded data, and how this affects deployment.

### Online data types
Some OPEA microservices need to download data from the Internet at runtime. This data falls into the following types:

#### Type 1: User configurable AI model data
Some OPEA microservices allow the user to configure the AI model they run against. We already support pre-downloading this kind of data and running the OPEA microservices with it in an air-gapped environment.

#### Type 2: User non-configurable AI model data
Some OPEA microservices silently download AI model data from the Internet at runtime. The data to be downloaded is either hardcoded in the OPEA microservice or not exposed to the end user as a configurable option. For example, the `dataprep` microservice needs to download the AI model `unstructuredio/yolo_x_layout` at runtime to process unstructured input data, and currently there is no configuration option for `dataprep` end users to specify which AI model to download for that purpose.

#### Type 3: Other data
Some OPEA microservices need to download additional data other than AI models at runtime. For example, `dataprep`, `retriever` and `gpt-sovits` need to download a subset of `nltk` data, and `speecht5` needs to download data from [intel-extension-for-transformers](https://github.com/intel/intel-extension-for-transformers/tree/main/intel_extension_for_transformers/neural_chat/assets/speaker_embeddings).

This RFC mainly covers the `Type 2` and `Type 3` online data types, since most of the microservices already support using offline data of `Type 1`.

## Motivation

When trying to deploy the OPEA microservices in some customers' air-gapped environments, we found that quite a few OPEA microservices need to download data from the Internet at runtime, and each of them requires its own special tweaks. We want to make sure that all OPEA microservices can run in an air-gapped environment in a uniform way.

## Design Proposal

### Ways to verify
To quickly verify whether an OPEA microservice supports air-gapped mode, and what kind of online data it downloads, we can use the following steps:

1. Deploy the microservice in one of the following 3 environments:
   - A real air-gapped environment (Docker or K8s)
   - A tweaked K8s environment with K8s DNS forwarding disabled (i.e. remove the `forward` part in `kubectl -n kube-system edit cm coredns`, and restart the coredns-related pods)
   - An environment with fake proxy settings, where both `http_proxy` and `https_proxy` are set to a non-existent proxy server, e.g. `http://localhost:54321`
2. Send requests to the microservice under verification.
   The requests should give decent coverage of the internal data flow of that microservice, because sometimes it is not the microservice itself that downloads online data, but one of its dependent modules.
3. Check the request return status and the microservice logs to find out whether it supports air-gapped mode and, if it does not, what kind of online data it is downloading.

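The fake-proxy option and the log check in step 3 can be sketched in Python as follows. This is a minimal illustration, not part of OPEA: the helper names `fake_proxy_env` and `find_download_attempts` are hypothetical, setting the upper-case proxy variables is an added assumption (some HTTP clients only read one spelling), and the error patterns are assumed examples of what a blocked download tends to log.

```python
import os
import re

# Unreachable proxy address, taken from the example in the steps above.
FAKE_PROXY = "http://localhost:54321"

# Assumed log fragments that often indicate a failed outbound download.
# This list is illustrative; extend it for the libraries a service uses.
NETWORK_ERROR_PATTERNS = [
    r"ProxyError",
    r"Connection refused",
    r"Max retries exceeded",
    r"Temporary failure in name resolution",
    r"Name or service not known",
]


def fake_proxy_env(base_env=None):
    """Return a copy of the environment with all proxy variables set to FAKE_PROXY."""
    env = dict(os.environ if base_env is None else base_env)
    # Lower-case names come from the RFC; upper-case ones are an assumption.
    for name in ("http_proxy", "https_proxy", "HTTP_PROXY", "HTTPS_PROXY"):
        env[name] = FAKE_PROXY
    return env


def find_download_attempts(log_text):
    """Return the log lines that match any of the assumed network error patterns."""
    pattern = re.compile("|".join(NETWORK_ERROR_PATTERNS))
    return [line for line in log_text.splitlines() if pattern.search(line)]
```

The returned environment could be passed to `subprocess.run(..., env=env)` when launching a microservice locally. Which log lines count as a download attempt depends on the libraries involved, so the pattern list would need tuning per microservice.
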
### Type 2 online data

Since we're not allowed to distribute AI model data in the microservice's container image, we need to make sure that there is a way to `mount` the user's pre-downloaded AI model data into the microservice's runtime so that the microservice itself can run in air-gapped mode. We also need to document what to download and how to `mount` it in the deployment document of that microservice.

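As one possible shape for such a mount, the sketch below assumes the model is fetched through the Hugging Face Hub cache, and that the user has pre-downloaded it on a connected machine into a directory that is then mounted into the container. `HF_HOME` and `HF_HUB_OFFLINE` are environment variables read by `huggingface_hub`; the helper name `offline_model_env` and any mount path passed to it are hypothetical, and whether a given microservice honors these variables would need to be verified per service.

```python
import os


def offline_model_env(mount_dir):
    """Build env vars that point Hugging Face libraries at a pre-populated cache.

    mount_dir is the (hypothetical) path where the pre-downloaded model
    data is mounted inside the container.
    """
    if not os.path.isdir(mount_dir):
        # Fail early instead of letting the service try to download at runtime.
        raise FileNotFoundError(f"model cache is not mounted: {mount_dir}")
    return {
        "HF_HOME": mount_dir,     # root of the Hugging Face cache
        "HF_HUB_OFFLINE": "1",    # tell huggingface_hub not to make network calls
    }
```

Failing early when the directory is missing makes a forgotten mount show up at startup rather than as a network error on the first request.
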
### Type 3 online data

To minimize deployment complexity, small online data that is NOT shared by multiple microservices should be downloaded into the container image, so that the microservice doesn't need to download it at runtime.

For online data that is used by multiple microservices, e.g. `nltk` data, the approach depends on its size: we can pre-download it into the container image if doing so does not increase the image size by more than 5%. For larger data, we should follow the `Type 2` online data method to support running in air-gapped mode.

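The size rule for shared data can be written down as a small decision helper. This is only a sketch: the function name `shared_data_placement` and its return labels are hypothetical, and the sizes would in practice be measured from the built image and the downloaded data.

```python
def shared_data_placement(data_bytes, image_bytes, max_increase=0.05):
    """Decide where shared Type 3 data should live.

    Returns "bake-into-image" when adding the data grows the image by at
    most max_increase (5% by default, per the rule above), otherwise
    "mount-at-runtime", i.e. the Type 2 method.
    """
    if image_bytes <= 0:
        raise ValueError("image_bytes must be positive")
    increase = data_bytes / image_bytes
    if increase <= max_increase:
        return "bake-into-image"
    return "mount-at-runtime"
```

For example, 40 MB of data on a 1000 MB image is a 4% increase and would be baked in, while 60 MB on the same image is 6% and would be mounted instead.
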
## Alternatives Considered

Using the same `Type 2` method for `Type 3` online data.

## Compatibility

n/a.

## Miscellaneous

Other information users and developers may care about:

- Engineering Impact:
  - Increases the container image build time.
  - Decreases the container startup time.

- Staging plan:
  - Use the methods listed in the `Ways to verify` section above to find all the microservices that don't support air-gapped mode, and create corresponding GitHub issues.
  - For each such microservice, figure out the online data type and enhance it to support air-gapped mode.

- CI env to test this functionality: Since this is a common requirement for all OPEA microservices, we probably need to set up a CI/CD test task to test it?
