
Kubernetes: driver-py not found and missing application resource errors

Posted: 16 Sep 2025, 18:23
by Anonymous
I am trying to run my PySpark file in a K8s environment with a custom image. It fails with:

Code: Select all

/opt/entrypoint.sh: line 128: exec: driver-py: not found

This was because driver-py was removed from entrypoint.sh in v3+, so I changed my Docker image to add the last line:

Code: Select all

FROM apache/spark:3.5.1-java17-python3

ENV SPARK_HOME=/opt/spark
ENV PATH=$SPARK_HOME/bin:$PATH

# Set working directory
WORKDIR ${SPARK_HOME}

# Install extra Python packages as root
USER root

RUN pip install --no-cache-dir numpy pandas smbprotocol openpyxl pyspark==3.5.1

ADD mssql-jdbc-12.10.1.jre11.jar /opt/spark/jars/
ADD ojdbc11.jar /opt/spark/jars/

## added later
RUN ln -s /opt/spark/bin/spark-submit /usr/bin/driver-py
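For context on why the symlink does not behave like the old command: in pre-3.x images, entrypoint.sh handled driver-py itself instead of exec'ing it as a binary. A rough sketch of that dispatch (the function name and exact rewriting here are reconstructions, not the real script):

```shell
#!/bin/sh
# Hypothetical sketch (not the actual entrypoint.sh): "driver-py" was a
# case branch that rewrote the operator's arguments into a spark-submit
# command line, appending the primary .py file from $PYSPARK_PRIMARY.
dispatch() {
  case "$1" in
    driver-py)
      shift
      # the real script built a spark-submit invocation along these lines
      echo "spark-submit $* $PYSPARK_PRIMARY"
      ;;
    *)
      # with the case branch removed in 3.x, the script falls through to
      # a plain exec, which is where "driver-py: not found" comes from
      echo "exec: $1: not found" >&2
      return 127
      ;;
  esac
}

PYSPARK_PRIMARY=/maprfs-csi/test.py
dispatch driver-py --properties-file /opt/spark/conf/spark.properties
```

Symlinking driver-py to spark-submit restores the command name, but not this rewriting step.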
Now I get the following:

Code: Select all

Error: Missing application resource.

Usage: spark-submit [options] <app jar | python file | R file> [app arguments]
Usage: spark-submit --kill [submission ID] --master [spark://...]
Usage: spark-submit --status [submission ID] --master [spark://...]
Usage: spark-submit run-example [options] example-class [example args]
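A plausible reading of this new failure, judging by the Args list in the driver pod description further down: the operator still starts the container with `driver-py --properties-file ... --class org.apache.spark.deploy.PythonRunner` and passes the script via the PYSPARK_PRIMARY environment variable, so after the symlink spark-submit runs with no positional application file, which is exactly what this usage message complains about. A minimal simulation of that argument shape (`check_args` is a made-up helper, not Spark code):

```shell
#!/bin/sh
# Made-up helper (not Spark code) mimicking spark-submit's requirement
# that a positional application resource remain after options are consumed.
check_args() {
  app=""
  while [ $# -gt 0 ]; do
    case "$1" in
      --properties-file|--class) shift 2 ;;  # option plus its value
      *) app="$1"; shift ;;                  # first bare arg = app resource
    esac
  done
  if [ -z "$app" ]; then
    echo "Error: Missing application resource." >&2
    return 1
  fi
  echo "would submit: $app"
}

# The argument shape from the pod's Args has no bare application file,
# so this fails the same way the driver pod does:
check_args --properties-file /opt/spark/conf/spark.properties \
           --class org.apache.spark.deploy.PythonRunner || echo "exit status: $?"
```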
Here is my yaml:

Code: Select all

apiVersion: "sparkoperator.k8s.io/v1beta2"
kind: SparkApplication
metadata:
  name: juni-test
  namespace: hpesbo
spec:
  type: Python
  sparkVersion: 3.5.1
  mode: cluster
  image: **********************:spark3.5-test
  imagePullPolicy: Always

  mainApplicationFile: "file:///maprfs-csi/test.py"

  restartPolicy:
    type: Never
  imagePullSecrets:
    - imagepull

  sparkConf:
    spark.mapr.user.secret: long-live-spark-secret
    spark.driver.extraClassPath: "local:///maprfs-csi/snowflake-jdbc-3.13.30.jar:local:///maprfs-csi/spark-snowflake_2.12-2.12.0-spark_3.4.jar"
    spark.executor.extraClassPath: "local:///maprfs-csi/snowflake-jdbc-3.13.30.jar:local:///maprfs-csi/spark-snowflake_2.12-2.12.0-spark_3.4.jar"

  deps:
    jars:
      - local:///maprfs-csi/snowflake-jdbc-3.13.30.jar
      - local:///maprfs-csi/spark-snowflake_2.12-2.12.0-spark_3.4.jar
  volumes:
    - name: maprfs-volume
      persistentVolumeClaim:
        claimName: spark-pvc
    - name: rsa-key-volume
      secret:
        secretName: snowflake-rsa-key
    - name: log-volume
      persistentVolumeClaim:
        claimName: spark-pvc-logs

  driver:
    cores: 1
    coreLimit: "1000m"
    memory: "8g"
    labels:
      version: 3.5.1
    annotations:
      sidecar.istio.io/inject: "false"
    volumeMounts:
      - name: maprfs-volume
        mountPath: /maprfs-csi
      - name: rsa-key-volume
        mountPath: /keys
        readOnly: true
      - name: log-volume
        mountPath: /log-csi

  executor:
    cores: 1
    coreLimit: "1000m"
    instances: 3
    memory: "16g"
    labels:
      version: 3.5.1
    annotations:
      sidecar.istio.io/inject: "false"
    volumeMounts:
      - name: maprfs-volume
        mountPath: /maprfs-csi
      - name: rsa-key-volume
        mountPath: /keys
        readOnly: true
      - name: log-volume
        mountPath: /log-csi
I tried switching mainApplicationFile from "local:///maprfs-csi/test.py" to

Code: Select all

mainApplicationFile: "file:///maprfs-csi/test.py"

and to mainApplicationFile: "/maprfs-csi/test.py". All of them throw the same error. The file definitely exists in the PVC. Here is the driver pod description:

Code: Select all

Name: spark-test-driver
Namespace:
Priority: 0
Service Account:
Node: /
Start Time: Tue, 16 Sep 2025 19:52:31 +0530
Labels: spark-app-selector=
        spark-role=driver
        sparkoperator.com/app-name=spark-test
        sparkoperator.com/launched-by-spark-operator=true
        sparkoperator.com/submission-id=
        version=3.5.1
Annotations: cni.projectcalico.org/podIP: /32
             sidecar.istio.io/inject: false
Status: Failed
IP:
Controlled By: SparkApplication/spark-test

Containers:
  spark-kubernetes-driver:
    Container ID: docker://
    Image: :spark3.5-test
    Image ID: docker-pullable://
    Ports: 7078/TCP, 7079/TCP, 4040/TCP
    Args:
      driver-py
      --properties-file
      /opt/spark/conf/spark.properties
      --class
      org.apache.spark.deploy.PythonRunner
    State: Terminated
      Reason: Error
      Exit Code: 1
      Started: Tue, 16 Sep 2025 19:52:50 +0530
      Finished: Tue, 16 Sep 2025 19:52:55 +0530
    Ready: False
    Restart Count: 0
    Limits:
      cpu: 1
      memory: 11468Mi
    Requests:
      cpu: 1
      memory: 11468Mi
    Environment:
      SPARK_DRIVER_BIND_ADDRESS: (v1:status.podIP)
      SPARK_LOCAL_DIRS: /var/data/spark-
      PYSPARK_PRIMARY: /maprfs-csi/test.py
      PYSPARK_MAJOR_PYTHON_VERSION: 2
      SPARK_CONF_DIR: /opt/spark/conf
      NEW_RELIC_METADATA_KUBERNETES_CLUSTER_NAME:
      NEW_RELIC_METADATA_KUBERNETES_NODE_NAME: (v1:spec.nodeName)
      NEW_RELIC_METADATA_KUBERNETES_NAMESPACE_NAME:
      NEW_RELIC_METADATA_KUBERNETES_POD_NAME: spark-test-driver
      NEW_RELIC_METADATA_KUBERNETES_CONTAINER_NAME: spark-kubernetes-driver
    Mounts:
      /keys from rsa-key-volume (ro)
      /log-csi from log-volume (rw)
      /maprfs-csi from maprfs-volume (rw)
      /opt/spark/conf from spark-conf-volume (rw)
      /var/run/secrets/kubernetes.io/serviceaccount from (ro)

Conditions:
  Type              Status
  Initialized       True
  Ready             False
  ContainersReady   False
  PodScheduled      True

Events:
  Type     Reason       Age  From               Message
  ----     ------       ---  ----               -------
  Normal   Scheduled    25s  default-scheduler  Successfully assigned /spark-test-driver to
  Warning  FailedMount  25s  kubelet            MountVolume.SetUp failed for volume "spark-conf-volume" : configmap "spark-test--driver-conf-map" not found
  Normal   Pulling      9s   kubelet            Pulling image ":spark3.5-test"
  Normal   Pulled       7s   kubelet            Successfully pulled image ":spark3.5-test"
  Normal   Created      7s   kubelet            Created container spark-kubernetes-driver
  Normal   Started      7s   kubelet            Started container spark-kubernetes-driver
I found a few issues reported on GitHub for the same errors, but none of them has a solution. What am I missing here?