içeri aktarılamıyor Yerel Mac dizüstü bilgisayarımda Spark 1.4.1 kullanıyorum ve pyspark
'u sorunsuz bir şekilde kullanabiliyorum. Spark, Homebrew ile kuruldu ve Anaconda Python kullanıyorum. Ben /usr/local/Cellar/apache-spark/1.4.1/
dizinde herhangi bir yere dosya taşımak o zaman, spark-submit
Spark-submit, SparkContext
from pyspark import SparkContext
if __name__ == "__main__":
sc = SparkContext("local","test")
sc.parallelize([1,2,3,4])
sc.stop()
: Burada
15/09/04 08:51:09 ERROR SparkContext: Error initializing SparkContext.
java.io.FileNotFoundException: Added file file:test.py does not exist.
at org.apache.spark.SparkContext.addFile(SparkContext.scala:1329)
at org.apache.spark.SparkContext.addFile(SparkContext.scala:1305)
at org.apache.spark.SparkContext$$anonfun$15.apply(SparkContext.scala:458)
at org.apache.spark.SparkContext$$anonfun$15.apply(SparkContext.scala:458)
at scala.collection.immutable.List.foreach(List.scala:318)
at org.apache.spark.SparkContext.<init>(SparkContext.scala:458)
at org.apache.spark.api.java.JavaSparkContext.<init>(JavaSparkContext.scala:61)
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.lang.reflect.Constructor.newInstance(Constructor.java:422)
at py4j.reflection.MethodInvoker.invoke(MethodInvoker.java:234)
at py4j.reflection.ReflectionEngine.invoke(ReflectionEngine.java:379)
at py4j.Gateway.invoke(Gateway.java:214)
at py4j.commands.ConstructorCommand.invokeConstructor(ConstructorCommand.java:79)
at py4j.commands.ConstructorCommand.execute(ConstructorCommand.java:68)
at py4j.GatewayConnection.run(GatewayConnection.java:207)
at java.lang.Thread.run(Thread.java:745)
15/09/04 08:51:09 ERROR SparkContext: Error stopping SparkContext after init error.
java.lang.NullPointerException
at org.apache.spark.network.netty.NettyBlockTransferService.close(NettyBlockTransferService.scala:152)
at org.apache.spark.storage.BlockManager.stop(BlockManager.scala:1216)
at org.apache.spark.SparkEnv.stop(SparkEnv.scala:96)
at org.apache.spark.SparkContext.stop(SparkContext.scala:1659)
at org.apache.spark.SparkContext.<init>(SparkContext.scala:565)
at org.apache.spark.api.java.JavaSparkContext.<init>(JavaSparkContext.scala:61)
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native
Method)
at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.lang.reflect.Constructor.newInstance(Constructor.java:422)
at py4j.reflection.MethodInvoker.invoke(MethodInvoker.java:234)
at py4j.reflection.ReflectionEngine.invoke(ReflectionEngine.java:379)
at py4j.Gateway.invoke(Gateway.java:214)
at py4j.commands.ConstructorCommand.invokeConstructor(ConstructorCommand.java:79)
at py4j.commands.ConstructorCommand.execute(ConstructorCommand.java:68)
at py4j.GatewayConnection.run(GatewayConnection.java:207)
at java.lang.Thread.run(Thread.java:745)
Traceback (most recent call last):
File "test.py", line 35, in <module> sc = SparkContext("local","test")
File "/usr/local/Cellar/apache-spark/1.4.1/libexec/python/lib/pyspark.zip/pyspark/context.py", line 113, in __init__
File "/usr/local/Cellar/apache-spark/1.4.1/libexec/python/lib/pyspark.zip/pyspark/context.py", line 165, in _do_init
File "/usr/local/Cellar/apache-spark/1.4.1/libexec/python/lib/pyspark.zip/pyspark/context.py", line 219, in _initialize_context
File "/usr/local/Cellar/apache-spark/1.4.1/libexec/python/lib/py4j-0.8.2.1-src.zip/py4j/java_gateway.py", line 701, in __call__
File "/usr/local/Cellar/apache-spark/1.4.1/libexec/python/lib/py4j-0.8.2.1-src.zip/py4j/protocol.py", line 300, in get_return_value
py4j.protocol.Py4JJavaError: An error occurred while calling None.org.apache.spark.api.java.JavaSparkContext.
: java.io.FileNotFoundException: Added file file:test.py does not exist.
at org.apache.spark.SparkContext.addFile(SparkContext.scala:1329)
at org.apache.spark.SparkContext.addFile(SparkContext.scala:1305)
at org.apache.spark.SparkContext$$anonfun$15.apply(SparkContext.scala:458)
at org.apache.spark.SparkContext$$anonfun$15.apply(SparkContext.scala:458)
at scala.collection.immutable.List.foreach(List.scala:318)
at org.apache.spark.SparkContext.<init>(SparkContext.scala:458)
at org.apache.spark.api.java.JavaSparkContext.<init>(JavaSparkContext.scala:61)
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.lang.reflect.Constructor.newInstance(Constructor.java:422)
at py4j.reflection.MethodInvoker.invoke(MethodInvoker.java:234)
at py4j.reflection.ReflectionEngine.invoke(ReflectionEngine.java:379)
at py4j.Gateway.invoke(Gateway.java:214)
at py4j.commands.ConstructorCommand.invokeConstructor(ConstructorCommand.java:79)
at py4j.commands.ConstructorCommand.execute(ConstructorCommand.java:68)
at py4j.GatewayConnection.run(GatewayConnection.java:207)
at java.lang.Thread.run(Thread.java:745)
kodum edilir: Ancak, en kısa sürede
spark-submit
kullanmaya çalışırken, aşağıdaki hatayı alıyorum iyi çalışıyor. Ben şöyle benim ortam değişkenleri ayarlamak zorunda:
export SPARK_HOME="/usr/local/Cellar/apache-spark/1.4.1"
export PATH=$SPARK_HOME/bin:$PATH
export PYTHONPATH=$SPARK_HOME/libexec/python:$SPARK_HOME/libexec/python/lib/py4j-0.8.2.1-src.zip
Eminim bir şey benim ortamında yanlış ayarlanmış değilim, ama izini gibi olamaz.
'spark-submit/text.py' kullanmayı deneyin, 'spark-submit' Python komut dosyanızı bulamıyor gibi görünüyor. –
Tüm yolu denedim ve hala aynı hatayı alıyorum. Ayrıca klasördeki izinleri kontrol ettim ve bu sorun görünmüyor. – caleboverman
´test.py´ öğesini tutan dizini PYTHONPATH öğenize eklemeye çalışın. –