2015-09-04

6

Spark-submit can't import SparkContext

I'm running Spark 1.4.1 on my local Mac laptop and am able to use pyspark interactively without any problems. Spark was installed through Homebrew and I'm using Anaconda Python. However, as soon as I try to use spark-submit, I get the error below; oddly, if I move the file anywhere inside the /usr/local/Cellar/apache-spark/1.4.1/ directory, spark-submit runs it fine. Here is my code:

from pyspark import SparkContext 

if __name__ == "__main__": 
    sc = SparkContext("local","test") 
    sc.parallelize([1,2,3,4]) 
    sc.stop() 

And here is the error:

15/09/04 08:51:09 ERROR SparkContext: Error initializing SparkContext. 
java.io.FileNotFoundException: Added file file:test.py does not exist. 
    at org.apache.spark.SparkContext.addFile(SparkContext.scala:1329) 
    at org.apache.spark.SparkContext.addFile(SparkContext.scala:1305) 
    at org.apache.spark.SparkContext$$anonfun$15.apply(SparkContext.scala:458) 
    at org.apache.spark.SparkContext$$anonfun$15.apply(SparkContext.scala:458) 
    at scala.collection.immutable.List.foreach(List.scala:318) 
    at org.apache.spark.SparkContext.<init>(SparkContext.scala:458) 
    at org.apache.spark.api.java.JavaSparkContext.<init>(JavaSparkContext.scala:61) 
    at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method) 
    at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62) 
    at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45) 
    at java.lang.reflect.Constructor.newInstance(Constructor.java:422) 
    at py4j.reflection.MethodInvoker.invoke(MethodInvoker.java:234) 
    at py4j.reflection.ReflectionEngine.invoke(ReflectionEngine.java:379) 
    at py4j.Gateway.invoke(Gateway.java:214) 
    at py4j.commands.ConstructorCommand.invokeConstructor(ConstructorCommand.java:79) 
    at py4j.commands.ConstructorCommand.execute(ConstructorCommand.java:68) 
    at py4j.GatewayConnection.run(GatewayConnection.java:207) 
    at java.lang.Thread.run(Thread.java:745) 
15/09/04 08:51:09 ERROR SparkContext: Error stopping SparkContext after init error. 
java.lang.NullPointerException 
    at org.apache.spark.network.netty.NettyBlockTransferService.close(NettyBlockTransferService.scala:152) 
    at org.apache.spark.storage.BlockManager.stop(BlockManager.scala:1216) 
    at org.apache.spark.SparkEnv.stop(SparkEnv.scala:96) 
    at org.apache.spark.SparkContext.stop(SparkContext.scala:1659) 
    at org.apache.spark.SparkContext.<init>(SparkContext.scala:565) 
    at org.apache.spark.api.java.JavaSparkContext.<init>(JavaSparkContext.scala:61) 
    at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method) 
    at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62) 
    at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45) 
    at java.lang.reflect.Constructor.newInstance(Constructor.java:422) 
    at py4j.reflection.MethodInvoker.invoke(MethodInvoker.java:234) 
    at py4j.reflection.ReflectionEngine.invoke(ReflectionEngine.java:379) 
    at py4j.Gateway.invoke(Gateway.java:214) 
    at py4j.commands.ConstructorCommand.invokeConstructor(ConstructorCommand.java:79) 
    at py4j.commands.ConstructorCommand.execute(ConstructorCommand.java:68) 
    at py4j.GatewayConnection.run(GatewayConnection.java:207) 
    at java.lang.Thread.run(Thread.java:745) 
Traceback (most recent call last): 
    File "test.py", line 35, in <module> sc = SparkContext("local","test") 
    File "/usr/local/Cellar/apache-spark/1.4.1/libexec/python/lib/pyspark.zip/pyspark/context.py", line 113, in __init__ 
    File "/usr/local/Cellar/apache-spark/1.4.1/libexec/python/lib/pyspark.zip/pyspark/context.py", line 165, in _do_init 
    File "/usr/local/Cellar/apache-spark/1.4.1/libexec/python/lib/pyspark.zip/pyspark/context.py", line 219, in _initialize_context 
    File "/usr/local/Cellar/apache-spark/1.4.1/libexec/python/lib/py4j-0.8.2.1-src.zip/py4j/java_gateway.py", line 701, in __call__ 
    File "/usr/local/Cellar/apache-spark/1.4.1/libexec/python/lib/py4j-0.8.2.1-src.zip/py4j/protocol.py", line 300, in get_return_value 
py4j.protocol.Py4JJavaError: An error occurred while calling None.org.apache.spark.api.java.JavaSparkContext. 
: java.io.FileNotFoundException: Added file file:test.py does not exist. 
    at org.apache.spark.SparkContext.addFile(SparkContext.scala:1329) 
    at org.apache.spark.SparkContext.addFile(SparkContext.scala:1305) 
    at org.apache.spark.SparkContext$$anonfun$15.apply(SparkContext.scala:458) 
    at org.apache.spark.SparkContext$$anonfun$15.apply(SparkContext.scala:458) 
    at scala.collection.immutable.List.foreach(List.scala:318) 
    at org.apache.spark.SparkContext.<init>(SparkContext.scala:458) 
    at org.apache.spark.api.java.JavaSparkContext.<init>(JavaSparkContext.scala:61) 
    at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method) 
    at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62) 
    at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45) 
    at java.lang.reflect.Constructor.newInstance(Constructor.java:422) 
    at py4j.reflection.MethodInvoker.invoke(MethodInvoker.java:234) 
    at py4j.reflection.ReflectionEngine.invoke(ReflectionEngine.java:379) 
    at py4j.Gateway.invoke(Gateway.java:214) 
    at py4j.commands.ConstructorCommand.invokeConstructor(ConstructorCommand.java:79) 
    at py4j.commands.ConstructorCommand.execute(ConstructorCommand.java:68) 
    at py4j.GatewayConnection.run(GatewayConnection.java:207) 
    at java.lang.Thread.run(Thread.java:745) 

My environment variables are set as follows:

export SPARK_HOME="/usr/local/Cellar/apache-spark/1.4.1" 
export PATH=$SPARK_HOME/bin:$PATH 
export PYTHONPATH=$SPARK_HOME/libexec/python:$SPARK_HOME/libexec/python/lib/py4j-0.8.2.1-src.zip 

I'm sure something is configured wrong in my environment, but I can't seem to track it down.
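To illustrate the failure mode (plain Python, not Spark internals; the temp directories are made up for the demo): a bare relative name like test.py only resolves if the current working directory happens to contain the file, which matches the "Added file file:test.py does not exist" error above.

```python
import os
import tempfile

# The script lives in one directory; we resolve its bare name from another.
scripts_dir = tempfile.mkdtemp()
other_dir = tempfile.mkdtemp()
open(os.path.join(scripts_dir, "test.py"), "w").close()

os.chdir(other_dir)
print(os.path.isfile("test.py"))                             # False: bare name misses the file
print(os.path.isfile(os.path.join(scripts_dir, "test.py")))  # True: the full path works
```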

+1

Try using 'spark-submit /text.py'; it looks like 'spark-submit' can't find your Python script. –

+0

I tried the full path and still get the same error. I also checked the permissions on the folder, and that doesn't appear to be the problem. – caleboverman

+4

Try adding the directory that holds 'test.py' to your PYTHONPATH. –

Answer

0

Python files executed by spark-submit must be on the PYTHONPATH. Either add the full path of the directory:

export PYTHONPATH=full/path/to/dir:$PYTHONPATH 

or, if you are already inside the directory that contains the Python script, you can add '.' to your PYTHONPATH instead:

export PYTHONPATH='.':$PYTHONPATH 
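To see why either export helps, here is a small sketch of the lookup involved; `find_on_pythonpath` is a hypothetical helper written for illustration, not Spark's actual code.

```python
import os

def find_on_pythonpath(name, pythonpath):
    """Return the first match for `name` among PYTHONPATH entries, or None.

    Mimics searching a colon-separated path list; an empty entry
    counts as the current directory, like '.' does.
    """
    for entry in pythonpath.split(os.pathsep):
        candidate = os.path.join(entry or ".", name)
        if os.path.isfile(candidate):
            return candidate
    return None
```

With '.' on the path, a bare script name resolves only when you launch from the directory that holds it; the full-path form works from anywhere, which is why it is the more robust of the two exports.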

Thanks for pointing this out, @Def_Os!
