Fraud-detection with encrypted data
PERSONA: Application developer
This example focuses on the same use case as before, fraud detection, but doesn't run the container image used so far. Instead, it runs the workload as a Jupyter notebook or an OpenShift AI workbench.
The Confidential Workflow
The main difference here is that the Jupyter notebook has two sealed secrets attached: one for Azure storage access, and another for the dataset decryption key.
- The fraud-detection container starts.
- Two of the volumes attached to this pod are sealed secrets (the Azure credentials and the decryption key).
  - The CoCo internal components find the sealed secrets and begin attestation to retrieve their contents.
  - Attestation starts with the confidential container generating a report that proves it is a genuine CoCo running on a secure, trusted platform.
  - The CoCo then sends this report to Trustee. Only after Trustee verifies that the report is genuine and correct does it confirm that the container is secure by releasing the requested secret.
  - The secret is then inserted into the corresponding sealed secret, which acts as a normal volume mount.
- The container downloads the public model.
- It then tries to pull the encrypted dataset. The Azure credentials are not available in the container; they are held remotely by Trustee.
  - The Azure SAS (a connection string in a real deployment) required to access the blob has been loaded as a sealed secret in the CoCo pod. If everything went well, it is available in a volume defined in the podspec.
- Once the blob is accessed, the dataset still needs to be decrypted. The decryption key is not present in the container either; it too is held remotely by Trustee.
  - The key required to decrypt the dataset has been loaded as a sealed secret in the CoCo pod. If everything went well, it is available in a volume defined in the podspec.
  - In addition, in the Jupyter notebook it is also possible to try lazy attestation and fetch the key manually.
- The container uses the released key to decrypt the credit card datasets in memory.
- The (now-decrypted) private data is fed into the public model for processing, all within the protected container.
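The lazy-attestation option mentioned in the workflow can be sketched in Python. This is a minimal sketch, assuming the in-guest confidential-data-hub REST endpoint listens on its default port 8006 with a /cdh/resource path layout; both are assumptions based on guest-components defaults, and the call itself only succeeds inside a running CoCo pod:

```python
# Sketch of "lazy" attestation from inside the notebook: instead of reading a
# sealed-secret volume, fetch the key on demand from the in-guest
# confidential-data-hub (CDH) REST endpoint. Port and URL layout are
# ASSUMPTIONS (guest-components defaults); only works inside a CoCo pod.
import urllib.request

CDH_BASE = "http://127.0.0.1:8006"   # assumed confidential-data-hub listener

def cdh_resource_url(repo: str, resource_type: str, tag: str) -> str:
    """Build the CDH URL for a KBS resource kbs:///<repo>/<type>/<tag>."""
    return f"{CDH_BASE}/cdh/resource/{repo}/{resource_type}/{tag}"

url = cdh_resource_url("default", "fraud-dataset", "dataset_key")
# Inside the CoCo pod, this GET triggers attestation and returns the key:
# key = urllib.request.urlopen(url).read()
```

The GET itself is left commented out because it can only succeed inside an attested pod.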
CoCo-specific implementation steps
The main goal here is to show how much work is actually needed to convert the plain application to securely run with CoCo.
There are three main changes added to the podspec:
- runtimeClassName: kata-remote in the podspec. This single line is all it takes to enable CoCo for the pod.
- Sealed secrets. These are added to the podspec as normal secrets, but, as we saw before, they contain only a reference to the actual secret provided by Trustee.
- Persistent storage. For CoCo in a public cloud using peer-pods, the pod VM is external to the worker node while PVCs are mounted on the worker node, so there is no proper, maintainable mechanism to mount the PVC in the pod VM instead and secure it from the cluster. It is left to the application to mount the storage directly and use client-side encryption/decryption.
  - In this example, we replace the PVCs with podvm storage, meaning the CVM's encrypted disk is used to store data. However, once the pod terminates, the data is lost too.
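Putting the three changes together, a podspec for this example could look roughly like the following sketch. The image, volume names, and mount paths are illustrative assumptions, not the workshop's exact manifest:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: fraud-encrypted-datasets
spec:
  runtimeClassName: kata-remote        # change 1: the single line that enables CoCo
  containers:
  - name: notebook
    image: quay.io/example/fraud-notebook:latest   # illustrative image
    volumeMounts:
    - name: sealed-azure-sas
      mountPath: /sealed/azure-sas     # unsealed inside the pod VM after attestation
    - name: sealed-dataset-key
      mountPath: /sealed/dataset-key
    - name: scratch
      mountPath: /data                 # change 3: podvm storage instead of a PVC
  volumes:
  - name: sealed-azure-sas             # change 2: sealed secrets mounted like normal secrets
    secret:
      secretName: sealed-azure-sas
  - name: sealed-dataset-key
    secret:
      secretName: sealed-dataset-key
  - name: scratch
    emptyDir: {}                       # backed by the CVM's encrypted disk; lost on pod termination
```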
-
Add the application secrets into Trustee
Let's add the application secrets (the decryption key and the Azure credentials) into Trustee. Here we are in the trusted cluster.
If you haven't done it before, download the decryption key and upload it into Trustee. Remember that FD_SECRET_NAME=fraud-dataset.
Now let’s also add the Azure storage secret.
### Azure SAS - sealed secret
AZURE_SAS_SECRET_NAME=fraud-azure-sas
oc create secret generic $AZURE_SAS_SECRET_NAME \
--from-literal azure-sas="sp=r&st=2025-10-27T15:42:27Z&se=2028-10-27T22:57:27Z&spr=https&sv=2024-11-04&sr=b&sig=vjaRotd7de%2B3QwlzHVaHF2GVyehw1xb3fFiXe9E7YOI%3D" \
-n trustee-operator-system
And then instruct Trustee to load that secret into its deployment, by updating the KbsConfig and restarting the Trustee deployment.
echo "Default Kbsconfig - kbsSecretResources:"
oc get kbsconfig trusteeconfig-kbs-config -n trustee-operator-system -o json \
| jq '.spec.kbsSecretResources'
echo ""
oc patch kbsconfig trusteeconfig-kbs-config \
-n trustee-operator-system \
--type=json \
-p="[
{\"op\": \"add\", \"path\": \"/spec/kbsSecretResources/-\", \"value\": \"$AZURE_SAS_SECRET_NAME\"}
]"
echo ""
echo "Updated Kbsconfig - kbsSecretResources:"
oc get kbsconfig trusteeconfig-kbs-config -n trustee-operator-system -o json \
| jq '.spec.kbsSecretResources'
oc rollout restart deployment/trustee-deployment -n trustee-operator-system
You should see a fraud-azure-sas and fraud-dataset secret in the KbsConfig.
Create the sealed secret
Let's now create the sealed secret that contains the pointer to the actual secret in Trustee. Here we move to the untrusted cluster.
AZ_SECRET=$(podman run -it quay.io/confidential-devhub/coco-tools:0.3.0 /tools/secret seal vault --resource-uri kbs:///default/${AZURE_SAS_SECRET_NAME}/azure-sas --provider kbs | grep -v "Warning")
FD_SECRET_NAME=fraud-dataset
KEY_SECRET=$(podman run -it quay.io/confidential-devhub/coco-tools:0.3.0 /tools/secret seal vault --resource-uri kbs:///default/${FD_SECRET_NAME}/dataset_key --provider kbs | grep -v "Warning")
# namespace here is fraud-detection!
oc create namespace fraud-detection
oc create secret generic sealed-azure-sas --from-literal=azure-sas=$AZ_SECRET -n fraud-detection
oc create secret generic sealed-dataset-key --from-literal=key=$KEY_SECRET -n fraud-detection
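It can be instructive to look at what these sealed secrets actually contain. The sketch below assumes the token follows the CoCo guest-components sealed-secret layout (sealed.fakejwsheader.&lt;base64url JSON&gt;.fakesignature); the field names are assumptions if the tool's format differs:

```python
# Sketch: a "sealed secret" is a JWS-like token whose payload is only a
# POINTER to the real secret in Trustee -- never the secret itself.
# The token layout and field names are ASSUMPTIONS based on the CoCo
# guest-components sealed-secret convention.
import base64
import json

def unseal_pointer(sealed: str) -> dict:
    """Decode the payload segment of a sealed secret (no signature check)."""
    _, _, payload, _ = sealed.split(".")
    padded = payload + "=" * (-len(payload) % 4)   # restore base64url padding
    return json.loads(base64.urlsafe_b64decode(padded))

# Build a sample token the way the seal tool would (assumed structure):
body = {"version": "0.1.0", "type": "vault", "provider": "kbs",
        "name": "kbs:///default/fraud-azure-sas/azure-sas"}
token = ("sealed.fakejwsheader."
         + base64.urlsafe_b64encode(json.dumps(body).encode()).decode().rstrip("=")
         + ".fakesignature")

print(unseal_pointer(token)["name"])
```

Because the token carries only the kbs:/// resource URI, it is safe to store it as a plain Kubernetes secret in the untrusted cluster.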
Deploy the notebook
Let’s create a notebook and run it as CoCo.
This notebook specifically uses the Python SDK to download the encrypted data from Azure, for two reasons:
- It closely aligns with regular interactive AI workflows, which use Python SDKs to download data from S3, Azure, MinIO, and so on.
- It provides an example of programmatic storage access for AI workloads when using the peer-pods approach.
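The download step of the first notebook can be sketched as follows. The sealed-secret mount path, storage account, container, and blob names are assumptions for illustration, not the workshop's exact values:

```python
# Sketch of the notebook's download step: read the SAS token from the
# sealed-secret volume mount and fetch the encrypted blob with the Azure SDK.
# Mount path, account, container, and blob names are ASSUMPTIONS.
from pathlib import Path

def blob_url(account: str, container: str, blob: str, sas: str) -> str:
    """Compose a full blob URL from its parts plus a SAS query string."""
    return f"https://{account}.blob.core.windows.net/{container}/{blob}?{sas.lstrip('?')}"

sas_file = Path("/sealed/azure-sas/azure-sas")       # sealed-secret mount (assumed path)
sas = sas_file.read_text().strip() if sas_file.exists() else "sp=r&sig=placeholder"

url = blob_url("fraudstore", "datasets", "card_transactions.csv.encrypted", sas)
# Inside the pod, the actual download would be:
# from azure.storage.blob import BlobClient          # pip install azure-storage-blob
# encrypted = BlobClient.from_blob_url(url).download_blob().readall()
```

Note that the SAS token only ever exists inside the attested pod VM; the notebook code never needs it hard-coded.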
Before running this notebook, ensure that ROOT_VOLUME_SIZE in the peer-pods configmap is set to at least 20 GB, as the steps in the guide install a lot of Python packages. If you modify that value, remember as always to restart the OSC deployment!
There are two ways to deploy the notebook:
- Via an OpenShift AI (OAI) workbench: everything is handled by OpenShift AI. A Notebook object is created, and OAI takes care of deploying it and exposing it to the user. This way, we integrate CoCo with traditional OpenShift AI workbenches, which take care of most of the work.
- Via a plain Jupyter notebook: a simple custom pod, with networking handled by a custom service and route. This is the simplest and fastest way to deploy.
Openshift AI workbench
Prerequisites
PERSONA: Operational security expert
First of all, we need to make sure we can install OpenShift AI. The main requirement for OAI is worker nodes big enough to run OAI, OSC, and Trustee at the same time. In ARO, 3 workers of size Standard_D8s_v5 should be enough. If you don't have them, you can manually resize the worker nodes, deploy a new cluster with bigger workers, or add more worker nodes.
Install OAI
PERSONA: Application developer
To simplify the installation of OAI, and since it's not the focus of this workshop, we provide a script that handles it automatically:
curl -L https://raw.githubusercontent.com/confidential-devhub/workshop-on-ARO-showroom/refs/heads/main/helpers/install-oai.sh -o install-oai.sh
chmod +x install-oai.sh
./install-oai.sh
Once completed, you will receive a link to the RHOAI dashboard and also a direct link to the notebook itself.
Note: the script also creates a new image signature verification policy. The OAI workbench runs other signed images from registry.redhat.io, so we can create a new policy for them.
Now you can go through the notebook.
Plain Jupyter notebook
If you open the YAML for the plain Jupyter notebook, you will notice that, once again, the only difference from a normal deployment is runtimeClassName: kata-remote.
oc apply -f https://raw.githubusercontent.com/confidential-devhub/workshop-on-ARO-showroom/refs/heads/main/helpers/fraud-encrypted-datasets/notebook.yaml
Switch to the newly created fraud-detection namespace.
oc project fraud-detection
Wait for the pod to be created.
watch oc get pods/fraud-encrypted-datasets
The pod is ready when its STATUS is Running.
The Jupyter notebook will be available at the following URL, and the login password is aro_workshop123:
FD_ROUTE=$(oc get route fraud-encrypted-datasets-route -n fraud-detection -o jsonpath='{.spec.host}')
echo ""
echo "Click on the following URL to open the notebook in a new tab:"
echo "https://${FD_ROUTE}"
Run the notebook
Starting from fraud-detection/1_download_data.ipynb, go through the notebooks in order:
- fraud-detection/1_download_data.ipynb: download the encrypted datasets
- fraud-detection/2_decrypt_data.ipynb: decrypt the datasets
- fraud-detection/3_run_model.ipynb: run the model
- fraud-detection/4_cleanup.ipynb: clean everything up to restart the demo
Considerations
Note how a secret like /sealed/azure-value/azure-sas can be read and used in the Jupyter notebook, but if you try to oc exec into the pod and read it, it won't work.
The difference is that the notebook runs inside the container, whereas oc exec is executed from the outside. This shows exactly the CoCo threat model: an application or developer inside the CoCo pod is, of course, allowed to read the secrets granted to it. A cluster/infra/platform admin, however, is not trusted, so there is no way for them to access this data.