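Our data scientists need isolated Jupyter environments that can be allocated on demand, a common but tricky requirement. Our first approach was to create compute instances on OCI (Oracle Cloud Infrastructure) by hand and configure them with Ansible scripts. The process was slow, error-prone, and hard to manage consistently, and as request volume grew, operational costs climbed steeply. We soon realized we needed a more automated, declarative solution.
Containerizing Jupyter and deploying it on OCI's Kubernetes engine (OKE) was the obvious first step. Deploying with a Helm chart alone, however, did not fully meet our requirements. We needed deep integration with OCI Block Volume for data persistence, token injection wired into our internal Java authentication system, and fine-grained resource quotas per project. These customization needs pointed to a more powerful pattern: the Kubernetes Operator.
The debate centered on the implementation language. Go with Operator SDK is the community mainstream, but our core stack is Java: the team's understanding of the JVM, our existing monitoring toolchain, and the Java SDK for talking to OCI are all mature. Rebuilding that toolchain in Go would be expensive. We ultimately chose Java, specifically Quarkus Operator SDK, which offers an efficient, cloud-native way to build Operators while integrating with our existing Java ecosystem.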
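Defining our API: the JupyterNotebook Custom Resource
A declarative API is the core of an Operator. We first define a Custom Resource Definition (CRD) that describes what a Jupyter environment should look like. This CRD is the contract through which we interact with the system.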
JupyterNotebook.yaml
apiVersion: apiextensions.k8s.io/v1
kind: CustomResourceDefinition
metadata:
  name: jupyternotebooks.datascience.example.com
spec:
  group: datascience.example.com
  versions:
    - name: v1alpha1
      served: true
      storage: true
      # Enables the /status subresource, which the Operator's status updates rely on.
      subresources:
        status: {}
      schema:
        openAPIV3Schema:
          type: object
          properties:
            spec:
              type: object
              properties:
                image:
                  type: string
                  description: "The JupyterLab container image to use."
                  default: "jupyter/scipy-notebook:latest"
                cpuRequest:
                  type: string
                  pattern: '^[0-9]+m?$'
                  description: "CPU resource request."
                  default: "500m"
                memoryRequest:
                  type: string
                  pattern: '^[0-9]+([KMGT]i?)?$'
                  description: "Memory resource request."
                  default: "1Gi"
                storageSize:
                  type: string
                  pattern: '^[0-9]+([KMGT]i?)?$'
                  description: "The size of the persistent volume, e.g., 10Gi."
                ociBlockVolumeId:
                  type: string
                  description: "Optional OCI Block Volume OCID to re-attach an existing volume."
              required:
                - storageSize
            status:
              type: object
              properties:
                phase:
                  type: string
                  description: "The current phase of the notebook: PENDING, PROVISIONING, RUNNING, FAILED."
                url:
                  type: string
                  description: "The access URL for the JupyterLab instance."
                pvcName:
                  type: string
                  description: "Name of the provisioned PersistentVolumeClaim."
  scope: Namespaced
  names:
    plural: jupyternotebooks
    singular: jupyternotebook
    kind: JupyterNotebook
    shortNames:
      - jnb
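Here spec defines the parameters users can configure: the image, CPU and memory requests, storage size, and one key field, ociBlockVolumeId. This field lets a user re-attach an existing OCI block volume, which is essential for restoring a working environment. status is filled in by our Operator and reports the environment's current state and access URL to the user.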
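To make the schema's pattern constraints concrete, here is a minimal sketch that applies the same two regular expressions with plain java.util.regex. The class name SpecPatterns and its methods are hypothetical helpers for illustration; they are not part of the Operator or of any SDK.

```java
import java.util.regex.Pattern;

/** Hypothetical helper mirroring the two pattern constraints from the CRD schema. */
final class SpecPatterns {
    // Same expression as cpuRequest in the CRD.
    static final Pattern CPU = Pattern.compile("^[0-9]+m?$");
    // Same expression as memoryRequest and storageSize in the CRD.
    static final Pattern QUANTITY = Pattern.compile("^[0-9]+([KMGT]i?)?$");

    private SpecPatterns() {}

    /** True for whole numbers with an optional milli suffix, e.g. "2" or "500m". */
    static boolean isValidCpu(String value) {
        return value != null && CPU.matcher(value).matches();
    }

    /** True for whole numbers with an optional K/M/G/T or Ki/Mi/Gi/Ti suffix. */
    static boolean isValidQuantity(String value) {
        return value != null && QUANTITY.matcher(value).matches();
    }
}
```

Note that these patterns are stricter than Kubernetes quantities in general: fractional values such as 0.5 or 1.5Gi are rejected, so users have to write 500m or 1536Mi instead.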
Operator project skeleton and dependencies
We build the Operator with Quarkus. Its dependency injection and native image compilation suit this kind of long-running controller well.
Core dependencies in pom.xml:
<dependencies>
    <!-- Quarkus Operator SDK -->
    <dependency>
        <groupId>io.quarkiverse.operatorsdk</groupId>
        <artifactId>quarkus-operator-sdk</artifactId>
    </dependency>
    <!-- Kubernetes Java Client -->
    <dependency>
        <groupId>io.fabric8</groupId>
        <artifactId>kubernetes-client</artifactId>
    </dependency>
    <!-- OCI Java SDK. The Block Volume client (BlockstorageClient) ships in
         oci-java-sdk-core, so no separate block storage artifact is listed. -->
    <dependency>
        <groupId>com.oracle.oci.sdk</groupId>
        <artifactId>oci-java-sdk-core</artifactId>
    </dependency>
    <dependency>
        <groupId>com.oracle.oci.sdk</groupId>
        <artifactId>oci-java-sdk-identity</artifactId>
    </dependency>
    <!-- Logging -->
    <dependency>
        <groupId>org.jboss.logmanager</groupId>
        <artifactId>log4j2-jboss-logmanager</artifactId>
    </dependency>
    <!-- Other Quarkus dependencies -->
    <dependency>
        <groupId>io.quarkus</groupId>
        <artifactId>quarkus-arc</artifactId>
    </dependency>
    <dependency>
        <groupId>io.quarkus</groupId>
        <artifactId>quarkus-junit5</artifactId>
        <scope>test</scope>
    </dependency>
</dependencies>
The OCI Java SDK needs credentials. In production we authenticate with an OCI Instance Principal, which is safer than hard-coding keys.
application.properties:
# OCI SDK Configuration
# This assumes the operator is running on an OCI compute instance with appropriate instance principal policies.
oci.auth.strategy=instance_principal
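Core logic: implementing the Reconciler
The Reconciler is the brain of the Operator. Whenever a JupyterNotebook resource changes (is created, updated, or deleted), the reconcile method is called. Its job is to observe the current cluster state and act to move it toward the desired state defined in spec. This process must be idempotent.
1. Reconciler structure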
import io.fabric8.kubernetes.client.KubernetesClient;
import io.javaoperatorsdk.operator.api.reconciler.*;
import io.javaoperatorsdk.operator.api.reconciler.Context;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

@ControllerConfiguration
public class JupyterNotebookReconciler implements Reconciler<JupyterNotebook>, ErrorStatusHandler<JupyterNotebook> {

    private static final Logger log = LoggerFactory.getLogger(JupyterNotebookReconciler.class);

    private final KubernetesClient client;
    // OCI Service clients would be injected here
    // private final OciStorageService ociStorageService;

    public JupyterNotebookReconciler(KubernetesClient client /*, OciStorageService ociStorageService */) {
        this.client = client;
        // this.ociStorageService = ociStorageService;
    }

    @Override
    public UpdateControl<JupyterNotebook> reconcile(JupyterNotebook resource, Context<JupyterNotebook> context) {
        final String namespace = resource.getMetadata().getNamespace();
        final String name = resource.getMetadata().getName();
        log.info("Reconciling JupyterNotebook {} in namespace {}", name, namespace);

        // Step 1: Manage Persistent Volume Claim (PVC)
        // This is the most critical part for stateful workloads.
        // We will implement this logic in detail.

        // Step 2: Manage Deployment
        // Ensure the Jupyter pod is running with the correct spec.

        // Step 3: Manage Service
        // Expose the Jupyter pod for access.

        // Step 4: Update Status
        // Report the current state back to the CRD.

        return UpdateControl.noUpdate(); // For now
    }

    @Override
    public ErrorStatusUpdateControl<JupyterNotebook> updateErrorStatus(JupyterNotebook resource, Context<JupyterNotebook> context, Exception e) {
        log.error("Error during reconciliation for JupyterNotebook {}", resource.getMetadata().getName(), e);
        resource.getStatus().setPhase("FAILED");
        // The status schema has no dedicated message field, so the error text is stored in url.
        resource.getStatus().setUrl(e.getMessage());
        return ErrorStatusUpdateControl.updateStatus(resource);
    }
}
2. Visualizing the reconcile flow
Before diving into the code, let's map out the logic of the whole reconcile loop with a Mermaid diagram.
graph TD
    A[JupyterNotebook CR Event] --> B{Reconcile Triggered};
    B --> C{PVC Exists?};
    C -- No --> D[Create PVC];
    C -- Yes --> E{Deployment Exists?};
    D --> E;
    E -- No --> F[Create Deployment];
    E -- Yes --> G{Service Exists?};
    F --> G;
    G -- No --> H[Create Service];
    G -- Yes --> I{Update Status};
    H --> I;
    I --> J[Reconciliation Ends];
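The flow in the diagram can be sketched without any Kubernetes dependency to show what idempotent means here. The class below is a hypothetical in-memory stand-in for the cluster, not the fabric8 client: each pass creates only what is missing, so a second pass over an unchanged resource does nothing.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

/** Hypothetical in-memory model of the check-then-create flow; not the real Reconciler. */
final class ReconcileSketch {
    // Names of the objects that currently "exist" in the simulated cluster.
    private final Set<String> existing = new LinkedHashSet<>();

    /** Creates the named object only if it is missing; returns true if it was created. */
    private boolean ensure(String objectName) {
        return existing.add(objectName);
    }

    /** One pass through the flow: PVC, then Deployment, then Service. Returns what was created. */
    List<String> reconcile(String notebookName) {
        List<String> created = new ArrayList<>();
        for (String suffix : List.of("-pvc", "-deployment", "-service")) {
            String objectName = notebookName + suffix;
            if (ensure(objectName)) {
                created.add(objectName);
            }
        }
        return created;
    }

    /** Immutable view of the simulated cluster state. */
    Set<String> snapshot() {
        return Set.copyOf(existing);
    }
}
```

Calling reconcile("demo") twice creates three objects on the first call and none on the second, which is the property the real reconcile method has to preserve when it talks to the API server.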
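3. Key implementation: integrating PVCs with OCI Block Volume
This is what turns a generic Jupyter deployment into an OCI-native solution. We need a service that wraps the OCI SDK calls.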
import com.oracle.bmc.core.BlockstorageClient;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

// A simplified service to interact with OCI Block Storage
// In a real project, this would be a proper Quarkus CDI bean.
public class OciStorageService {

    private static final Logger log = LoggerFactory.getLogger(OciStorageService.class);

    private final BlockstorageClient blockStorageClient;
    private final String compartmentId;

    public OciStorageService(/*...dependency injection...*/) {
        // AuthenticationProvider provider = ... (e.g., InstancePrincipalsAuthenticationDetailsProvider)
        // this.blockStorageClient = BlockstorageClient.builder().build(provider);
        // this.compartmentId = ... from config
        this.blockStorageClient = null; // Placeholder for compilation
        this.compartmentId = "ocid1.compartment.oc1..xxxxx";
    }

    /**
     * Ensures a block volume exists and returns its OCID.
     * If volumeId is provided, it verifies its existence.
     * If not, it creates a new one.
     * This method must be idempotent.
     *
     * sizeInGB must be a plain number such as "50"; a Kubernetes quantity
     * such as "50Gi" would make Long.parseLong fail.
     */
    public String ensureBlockVolume(String desiredName, String sizeInGB, String existingVolumeId) throws Exception {
        if (existingVolumeId != null && !existingVolumeId.isBlank()) {
            log.info("Verifying existing OCI Block Volume: {}", existingVolumeId);
            // In a real implementation, call OCI SDK to GetVolume and check its state.
            // If not found or in a bad state, throw an exception.
            return existingVolumeId;
        }

        log.info("Creating new OCI Block Volume with name: {}", desiredName);
        // This is a placeholder for the actual SDK call.
        // CreateVolumeDetails details = CreateVolumeDetails.builder()
        //     .compartmentId(compartmentId)
        //     .displayName(desiredName)
        //     .sizeInGBs(Long.parseLong(sizeInGB))
        //     .availabilityDomain("...AD...") // Must match the k8s nodes' AD
        //     .build();
        // CreateVolumeRequest request = CreateVolumeRequest.builder().createVolumeDetails(details).build();
        // CreateVolumeResponse response = blockStorageClient.createVolume(request);
        // String newVolumeId = response.getVolume().getId();
        // Here you would wait for the volume to become AVAILABLE.

        // For demonstration purposes, returning a fake OCID
        return "ocid1.volume.oc1.phx.xxxxxxxxxxxxx_FAKE";
    }
}
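One gap between the CRD and this service is the size format: storageSize arrives as a Kubernetes-style quantity such as 10Gi, while sizeInGBs expects a plain number. The sketch below shows one way to bridge the two. VolumeSizes and toWholeGiB are hypothetical names, the accepted suffixes are limited to those the CRD pattern allows, and treating one OCI GB as one GiB is an assumption of this sketch.

```java
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper converting a storageSize value into whole GiB, rounding up. */
final class VolumeSizes {
    // Same shape as the storageSize pattern in the CRD, with capture groups added.
    private static final Pattern QUANTITY = Pattern.compile("^([0-9]+)([KMGT]i?)?$");
    private static final long GIB = 1L << 30;
    // Decimal (K, M, G, T) and binary (Ki, Mi, Gi, Ti) multipliers in bytes.
    private static final Map<String, Long> MULTIPLIERS = Map.of(
            "K", 1_000L, "M", 1_000_000L, "G", 1_000_000_000L, "T", 1_000_000_000_000L,
            "Ki", 1L << 10, "Mi", 1L << 20, "Gi", 1L << 30, "Ti", 1L << 40);

    private VolumeSizes() {}

    /** Converts e.g. "10Gi" to 10 and "500Mi" to 1 (rounded up to a whole GiB). */
    static long toWholeGiB(String quantity) {
        Matcher m = QUANTITY.matcher(quantity);
        if (!m.matches()) {
            throw new IllegalArgumentException("Unsupported quantity: " + quantity);
        }
        long multiplier = m.group(2) == null ? 1L : MULTIPLIERS.get(m.group(2));
        // multiplyExact turns a silent overflow into an ArithmeticException.
        long bytes = Math.multiplyExact(Long.parseLong(m.group(1)), multiplier);
        return bytes / GIB + (bytes % GIB == 0 ? 0 : 1);
    }
}
```

A caller would pass the result to the SDK builder instead of parsing the raw string. OCI block volumes also have a minimum size (50 GB at the time of writing), so a real implementation would likely clamp or reject smaller requests.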
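This service is where we integrate tightly with the cloud platform. A common pitfall is the OCI Availability Domain (AD). An AD must be specified when creating a block volume, and it must match the AD of the OKE worker node that will mount it. Otherwise the Pod stays Pending because the volume cannot be attached.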
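A guard against this pitfall could compare the two AD values before creating or attaching a volume. The sketch below assumes that a full AD name looks like Uocm:PHX-AD-1 (a tenancy-specific prefix, a colon, then the short name) and that the worker node's zone label carries only the short name; both formats are assumptions to verify against your own tenancy, and AvailabilityDomains is a hypothetical helper.

```java
/** Hypothetical helper for comparing OCI availability domain names. */
final class AvailabilityDomains {
    private AvailabilityDomains() {}

    /** Strips an optional prefix before the colon, e.g. "Uocm:PHX-AD-1" becomes "PHX-AD-1". */
    static String shortName(String adName) {
        int colon = adName.indexOf(':');
        return colon < 0 ? adName : adName.substring(colon + 1);
    }

    /** True when the volume's AD and the node's zone refer to the same availability domain. */
    static boolean sameDomain(String volumeAd, String nodeZone) {
        return shortName(volumeAd).equalsIgnoreCase(shortName(nodeZone));
    }
}
```

With dynamic provisioning and volumeBindingMode: WaitForFirstConsumer, the CSI driver can pick the AD after the Pod is scheduled, so a check like this matters mainly when re-attaching a volume given by ociBlockVolumeId.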
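Now we integrate this logic into the reconcile method. We use the CSI (Container Storage Interface) driver to connect Kubernetes to OCI Block Volume. First we create a StorageClass, which tells Kubernetes how to dynamically provision OCI block volumes.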
oci-bv-storageclass.yaml
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: oci-bv
provisioner: blockvolume.csi.oraclecloud.com
parameters:
  # This should be configured based on your tenancy
  # compartmentOCID: "ocid1.compartment.oc1..xxxxx"
reclaimPolicy: Retain
allowVolumeExpansion: true
volumeBindingMode: WaitForFirstConsumer
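reclaimPolicy: Retain is crucial. It ensures that when the PVC is deleted, the underlying OCI block volume is not deleted, which prevents data loss.
Next is the Reconciler code that handles the PVC: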
// Inside JupyterNotebookReconciler.reconcile method
// ...
PersistentVolumeClaim existingPvc = client.persistentVolumeClaims()
        .inNamespace(namespace)
        .withName(name + "-pvc")
        .get();

if (existingPvc == null) {
    log.info("PVC {} not found, creating.", name + "-pvc");
    // In a real scenario, call ociStorageService here.
    // For now, we assume dynamic provisioning via StorageClass.
    Map<String, Quantity> requests = new HashMap<>();
    requests.put("storage", Quantity.parse(resource.getSpec().getStorageSize()));

    PersistentVolumeClaim pvc = new PersistentVolumeClaimBuilder()
            .withNewMetadata()
                .withName(name + "-pvc")
                .withNamespace(namespace)
                .withOwnerReferences(new OwnerReferenceBuilder()
                        .withApiVersion(resource.getApiVersion())
                        .withKind(resource.getKind())
                        .withName(resource.getMetadata().getName())
                        .withUid(resource.getMetadata().getUid())
                        .build())
            .endMetadata()
            .withNewSpec()
                .withStorageClassName("oci-bv") // Use our OCI StorageClass
                .withAccessModes("ReadWriteOnce")
                .withResources(new ResourceRequirementsBuilder().withRequests(requests).build())
            .endSpec()
            .build();

    client.persistentVolumeClaims().inNamespace(namespace).create(pvc);

    // Persist the phase and reschedule, so that the next pass continues with the Deployment.
    // Without rescheduleAfter nothing would trigger another reconcile for this resource.
    resource.getStatus().setPhase("PROVISIONING");
    return UpdateControl.updateStatus(resource)
            .rescheduleAfter(10, java.util.concurrent.TimeUnit.SECONDS);
}
// ...
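We set the CR instance as the owner of the PVC (ownerReferences), which is a best practice. When the JupyterNotebook resource is deleted, Kubernetes garbage collection automatically cleans up the PVC it owns.
4. Creating the Deployment and Service
Once the PVC exists, we can create the workload. Because the StorageClass uses volumeBindingMode: WaitForFirstConsumer, the PVC only reaches the Bound state after the Pod that uses it has been scheduled, so the Reconciler must not wait for Bound before creating the Deployment.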
// Inside JupyterNotebookReconciler.reconcile method, after PVC check
Deployment existingDeployment = client.apps().deployments()
        .inNamespace(namespace)
        .withName(name + "-deployment")
        .get();

if (existingDeployment == null) {
    log.info("Deployment {} not found, creating.", name + "-deployment");
    Deployment deployment = new DeploymentBuilder()
            .withNewMetadata()
                .withName(name + "-deployment")
                .withNamespace(namespace)
                .withOwnerReferences(new OwnerReferenceBuilder()
                        .withApiVersion(resource.getApiVersion())
                        .withKind(resource.getKind())
                        .withName(name)
                        .withUid(resource.getMetadata().getUid())
                        .build())
            .endMetadata()
            .withNewSpec()
                .withReplicas(1)
                .withNewSelector()
                    .addToMatchLabels("app", name)
                .endSelector()
                .withNewTemplate()
                    .withNewMetadata()
                        .addToLabels("app", name)
                    .endMetadata()
                    .withNewSpec()
                        .addNewVolume()
                            .withName("notebook-storage")
                            .withNewPersistentVolumeClaim()
                                .withClaimName(name + "-pvc")
                            .endPersistentVolumeClaim()
                        .endVolume()
                        .addNewContainer()
                            .withName("jupyter-notebook")
                            .withImage(resource.getSpec().getImage())
                            .withImagePullPolicy("Always")
                            .withNewResources()
                                .addToRequests("cpu", Quantity.parse(resource.getSpec().getCpuRequest()))
                                .addToRequests("memory", Quantity.parse(resource.getSpec().getMemoryRequest()))
                            .endResources()
                            .addNewPort()
                                .withContainerPort(8888)
                                .withName("http")
                            .endPort()
                            .addNewVolumeMount()
                                .withName("notebook-storage")
                                .withMountPath("/home/jovyan/work") // Standard path for Jupyter stacks
                            .endVolumeMount()
                            // This is how you would inject auth tokens from a secret
                            .addNewEnv()
                                .withName("JUPYTER_TOKEN")
                                .withNewValueFrom()
                                    // Argument order is (key, secret name, optional)
                                    .withNewSecretKeyRef("token", "jupyter-auth-token", false)
                                .endValueFrom()
                            .endEnv()
                        .endContainer()
                    .endSpec()
                .endTemplate()
            .endSpec()
            .build();

    client.apps().deployments().inNamespace(namespace).create(deployment);
}

// Similar logic for creating a ClusterIP Service to expose port 8888
// ...
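In this code we mount the PVC at /home/jovyan/work in the Jupyter container, the working directory of the official Jupyter images, so all work is saved on the OCI block volume. We also inject Jupyter's access token through an environment variable from a pre-existing Secret (jupyter-auth-token), which decouples the Operator from our authentication system.
5. Updating status
The last step of the reconcile loop is to update the CR's status field to report the real-world state to the user.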
// Inside JupyterNotebookReconciler.reconcile method, at the end
Service service = client.services().inNamespace(namespace).withName(name + "-service").get();
Deployment deployment = client.apps().deployments().inNamespace(namespace).withName(name + "-deployment").get();

// A simple readiness check: the Service must exist and at least one replica must be ready.
// Requiring the Service also avoids a NullPointerException when the URL is built below.
boolean isReady = service != null &&
        deployment != null && deployment.getStatus() != null &&
        deployment.getStatus().getReadyReplicas() != null &&
        deployment.getStatus().getReadyReplicas() > 0;

if (isReady) {
    String currentPhase = resource.getStatus().getPhase();
    if (!"RUNNING".equals(currentPhase)) {
        log.info("JupyterNotebook {} is now running.", name);
        resource.getStatus().setPhase("RUNNING");
        // Construct an internal cluster URL
        resource.getStatus().setUrl(String.format("http://%s.%s.svc.cluster.local:8888", service.getMetadata().getName(), namespace));
        return UpdateControl.updateStatus(resource);
    }
} else {
    // If not ready, keep it in PROVISIONING state
    if (!"PROVISIONING".equals(resource.getStatus().getPhase())) {
        resource.getStatus().setPhase("PROVISIONING");
        return UpdateControl.updateStatus(resource);
    }
}

// If everything is stable, no update is needed.
return UpdateControl.noUpdate();
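The status decision can also be isolated as pure functions, which makes it easy to unit test without a cluster. This is only a sketch: NotebookStatusSketch and its method names are hypothetical, and the URL format simply repeats the name-service convention used in the snippet.

```java
/** Hypothetical pure-function version of the status decision; no Kubernetes types involved. */
final class NotebookStatusSketch {
    private NotebookStatusSketch() {}

    /** RUNNING once at least one replica is ready, PROVISIONING otherwise (including null). */
    static String derivePhase(Integer readyReplicas) {
        return readyReplicas != null && readyReplicas > 0 ? "RUNNING" : "PROVISIONING";
    }

    /** A status write is only needed when the phase actually changes. */
    static boolean needsStatusUpdate(String currentPhase, String desiredPhase) {
        return !desiredPhase.equals(currentPhase);
    }

    /** In-cluster URL of the notebook's Service on port 8888. */
    static String clusterUrl(String notebookName, String namespace) {
        return String.format("http://%s-service.%s.svc.cluster.local:8888", notebookName, namespace);
    }
}
```

Skipping the write when the phase is unchanged is what keeps a steady-state reconcile from producing status updates on every pass.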
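That completes the reconcile loop. When a data scientist runs kubectl apply -f my-notebook.yaml, this Java Operator automatically orchestrates all the resources on OCI and Kubernetes and delivers a persistent, resource-isolated Jupyter environment.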
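Limitations and future work
This implementation is only a starting point. The current error handling relies on simple requeues; a more robust system needs a retry strategy with exponential backoff. For deletion, we have not yet implemented a finalizer. Without one, the CR instance disappears as soon as a user deletes the JupyterNotebook resource, leaving us no chance to run cleanup logic such as safely detaching and handling the OCI block volume. The next step is to add a finalizer that runs our cleanup code before the resource is deleted, so that cloud resources are not leaked. Beyond that, we can integrate OCI's logging and monitoring services so that logs and performance metrics from Jupyter instances are pushed automatically to a unified platform, giving operations fuller observability.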
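As an illustration of the exponential backoff mentioned above, the sketch below computes a capped delay for a given retry attempt. Backoff and delayFor are hypothetical names and the values are examples only; Java Operator SDK has its own retry configuration, which this sketch does not use or replace.

```java
import java.time.Duration;

/** Hypothetical helper computing capped exponential backoff delays. */
final class Backoff {
    private Backoff() {}

    /**
     * Returns initial * 2^attempt, never more than max.
     * attempt is zero-based: attempt 0 yields the initial delay.
     */
    static Duration delayFor(int attempt, Duration initial, Duration max) {
        if (attempt < 0) {
            throw new IllegalArgumentException("attempt must be >= 0");
        }
        // Cap the exponent so the shift itself cannot overflow a long.
        long factor = 1L << Math.min(attempt, 30);
        long millis;
        try {
            millis = Math.multiplyExact(initial.toMillis(), factor);
        } catch (ArithmeticException overflow) {
            return max; // Anything that overflows is certainly above the cap.
        }
        Duration delay = Duration.ofMillis(millis);
        return delay.compareTo(max) > 0 ? max : delay;
    }
}
```

With an initial delay of 2 seconds and a cap of 5 minutes, attempts 0 through 3 wait 2, 4, 8, and 16 seconds, and later attempts settle at the cap. In the Reconciler such a delay could feed a rescheduled retry rather than an immediate requeue.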