Spark için tamamen yeniyim, Spark üzerinde öğrenme devam ediyor. Uygulamada, aşağıdaki gibi birkaç sorunla karşı karşıya. Birden çok adım ve sessiz uzun. UNIX ortamında kıvılcım kabuğu kullanıyorum. Aşağıdaki gibi hatalar alıyorum.<console>: 22: hata: bulunamadı: değer sc
Adım 1
$ spark-shell Welcome to ____ __ /__/__ ___ _____/ /__ _\ \/ _ \/ _ `/ __/ '_/ /___/ .__/\_,_/_/ /_/\_\ version 1.3.1 /_/ Using Scala version 2.10.4 (Java HotSpot(TM) 64-Bit Server VM, Java 1.7.0_25) Type in expressions to have them evaluated. Type :help for more information. 2016-04-22 07:44:31,5095 ERROR JniCommon fs/client/fileclient/cc/jni_MapRClient.cc:1473 Thread: 20535 mkdirs failed for /user/cni/.sparkStaging/application_1459074732364_1192326, error 13 org.apache.hadoop.security.AccessControlException: User cni(user id 5689) has been denied access to create application_1459074732364_1192326 at com.mapr.fs.MapRFileSystem.makeDir(MapRFileSystem.java:1100) at com.mapr.fs.MapRFileSystem.mkdirs(MapRFileSystem.java:1120) at org.apache.hadoop.fs.FileSystem.mkdirs(FileSystem.java:1851) at org.apache.hadoop.fs.FileSystem.mkdirs(FileSystem.java:631) at org.apache.spark.deploy.yarn.Client.prepareLocalResources(Client.scala:224) at org.apache.spark.deploy.yarn.Client.createContainerLaunchContext(Client.scala:384) at org.apache.spark.deploy.yarn.Client.submitApplication(Client.scala:102) at org.apache.spark.scheduler.cluster.YarnClientSchedulerBackend.start(YarnClientSchedulerBackend.scala:58) at org.apache.spark.scheduler.TaskSchedulerImpl.start(TaskSchedulerImpl.scala:141) at org.apache.spark.SparkContext.(SparkContext.scala:381) at org.apache.spark.repl.SparkILoop.createSparkContext(SparkILoop.scala:1016) at $iwC$$iwC.(:9) at $iwC.(:18) at (:20) at .(:24) at .() at .(:7) at .() at $print() at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:606) at org.apache.spark.repl.SparkIMain$ReadEvalPrint.call(SparkIMain.scala:1065) at org.apache.spark.repl.SparkIMain$Request.loadAndRun(SparkIMain.scala:1338) at org.apache.spark.repl.SparkIMain.loadAndRunReq$1(SparkIMain.scala:840) at org.apache.spark.repl.SparkIMain.interpret(SparkIMain.scala:871) at org.apache.spark.repl.SparkIMain.interpret(SparkIMain.scala:819) at org.apache.spark.repl.SparkILoop.reallyInterpret$1(SparkILoop.scala:856) at org.apache.spark.repl.SparkILoop.interpretStartingWith(SparkILoop.scala:901) at org.apache.spark.repl.SparkILoop.command(SparkILoop.scala:813) at org.apache.spark.repl.SparkILoopInit$$anonfun$initializeSpark$1.apply(SparkILoopInit.scala:123) at org.apache.spark.repl.SparkILoopInit$$anonfun$initializeSpark$1.apply(SparkILoopInit.scala:122) at org.apache.spark.repl.SparkIMain.beQuietDuring(SparkIMain.scala:324) at org.apache.spark.repl.SparkILoopInit$class.initializeSpark(SparkILoopInit.scala:122) at org.apache.spark.repl.SparkILoop.initializeSpark(SparkILoop.scala:64) at org.apache.spark.repl.SparkILoop$$anonfun$org$apache$spark$repl$SparkILoop$$process$1$$anonfun$apply$mcZ$sp$5.apply$mcV$sp(SparkILoop.scala:973) at org.apache.spark.repl.SparkILoopInit$class.runThunks(SparkILoopInit.scala:157) at org.apache.spark.repl.SparkILoop.runThunks(SparkILoop.scala:64) at org.apache.spark.repl.SparkILoopInit$class.postInitialization(SparkILoopInit.scala:106) at org.apache.spark.repl.SparkILoop.postInitialization(SparkILoop.scala:64) at org.apache.spark.repl.SparkILoop$$anonfun$org$apache$spark$repl$SparkILoop$$process$1.apply$mcZ$sp(SparkILoop.scala:990) at org.apache.spark.repl.SparkILoop$$anonfun$org$apache$spark$repl$SparkILoop$$process$1.apply(SparkILoop.scala:944) at org.apache.spark.repl.SparkILoop$$anonfun$org$apache$spark$repl$SparkILoop$$process$1.apply(SparkILoop.scala:944) at scala.tools.nsc.util.ScalaClassLoader$.savingContextLoader(ScalaClassLoader.scala:135) at org.apache.spark.repl.SparkILoop.org$apache$spark$repl$SparkILoop$$process(SparkILoop.scala:944) at org.apache.spark.repl.SparkILoop.process(SparkILoop.scala:1058) at org.apache.spark.repl.Main$.main(Main.scala:31) at org.apache.spark.repl.Main.main(Main.scala) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:606) at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:569) at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:166) at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:189) at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:110) at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala) java.lang.NullPointerException at org.apache.spark.sql.SQLContext.(SQLContext.scala:145) at org.apache.spark.sql.hive.HiveContext.(HiveContext.scala:49) at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method) at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57) at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45) at java.lang.reflect.Constructor.newInstance(Constructor.java:526) at org.apache.spark.repl.SparkILoop.createSQLContext(SparkILoop.scala:1027) at $iwC$$iwC.(:9) at $iwC.(:18) at (:20) at .(:24) at .() at .(:7) at .() at $print() at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:606) at org.apache.spark.repl.SparkIMain$ReadEvalPrint.call(SparkIMain.scala:1065) at org.apache.spark.repl.SparkIMain$Request.loadAndRun(SparkIMain.scala:1338) at org.apache.spark.repl.SparkIMain.loadAndRunReq$1(SparkIMain.scala:840) at org.apache.spark.repl.SparkIMain.interpret(SparkIMain.scala:871) at org.apache.spark.repl.SparkIMain.interpret(SparkIMain.scala:819) at org.apache.spark.repl.SparkILoop.reallyInterpret$1(SparkILoop.scala:856) at org.apache.spark.repl.SparkILoop.interpretStartingWith(SparkILoop.scala:901) at org.apache.spark.repl.SparkILoop.command(SparkILoop.scala:813) at org.apache.spark.repl.SparkILoopInit$$anonfun$initializeSpark$1.apply(SparkILoopInit.scala:130) at org.apache.spark.repl.SparkILoopInit$$anonfun$initializeSpark$1.apply(SparkILoopInit.scala:122) at org.apache.spark.repl.SparkIMain.beQuietDuring(SparkIMain.scala:324) at org.apache.spark.repl.SparkILoopInit$class.initializeSpark(SparkILoopInit.scala:122) at org.apache.spark.repl.SparkILoop.initializeSpark(SparkILoop.scala:64) at org.apache.spark.repl.SparkILoop$$anonfun$org$apache$spark$repl$SparkILoop$$process$1$$anonfun$apply$mcZ$sp$5.apply$mcV$sp(SparkILoop.scala:973) at org.apache.spark.repl.SparkILoopInit$class.runThunks(SparkILoopInit.scala:157) at org.apache.spark.repl.SparkILoop.runThunks(SparkILoop.scala:64) at org.apache.spark.repl.SparkILoopInit$class.postInitialization(SparkILoopInit.scala:106) at org.apache.spark.repl.SparkILoop.postInitialization(SparkILoop.scala:64) at org.apache.spark.repl.SparkILoop$$anonfun$org$apache$spark$repl$SparkILoop$$process$1.apply$mcZ$sp(SparkILoop.scala:990) at org.apache.spark.repl.SparkILoop$$anonfun$org$apache$spark$repl$SparkILoop$$process$1.apply(SparkILoop.scala:944) at org.apache.spark.repl.SparkILoop$$anonfun$org$apache$spark$repl$SparkILoop$$process$1.apply(SparkILoop.scala:944) at scala.tools.nsc.util.ScalaClassLoader$.savingContextLoader(ScalaClassLoader.scala:135) at org.apache.spark.repl.SparkILoop.org$apache$spark$repl$SparkILoop$$process(SparkILoop.scala:944) at org.apache.spark.repl.SparkILoop.process(SparkILoop.scala:1058) at org.apache.spark.repl.Main$.main(Main.scala:31) at org.apache.spark.repl.Main.main(Main.scala) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:606) at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:569) at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:166) at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:189) at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:110) at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala) :10: error: not found: value sqlContext import sqlContext.implicits._ ^ :10: error: not found: value sqlContext import sqlContext.sql ^
Adım 2:
Sadece göz ardı uyarı/yukarıdaki hataları ve benim koduyla geçti. Bunu okudum, eğer kıvılcım kabuğu kullanırsam, otomatik olarak oluşturulacaktır.
<pre>
scala> val textFile = sc.textFile("README.md")
<console>:13: error: not found: value sc
val textFile = sc.textFile("README.md")
</pre>
3. Adım: söylediğini gibi bulunamadı sc, onu yaratan çalıştı. kıvılcım olarak
scala> import org.apache.spark._
import org.apache.spark._
scala> import org.apache.spark.streaming._
import org.apache.spark.streaming._
scala> import org.apache.spark.streaming.StreamingContext._
import org.apache.spark.streaming.StreamingContext._
scala> val conf = new SparkConf().setMaster("local[2]").setAppName("NetworkWordCount").set("spark.ui.port", "44040").set("spark.driver.allowMultipleContexts", "true")
conf: org.apache.spark.SparkConf = [email protected]
scala> val ssc = new StreamingContext(conf, Seconds(2))
16/04/22 08:19:18 WARN SparkContext: Another SparkContext is being constructed (or threw an exception in its constructor). This may indicate an error, since only one SparkContext may be running in this JVM (see SPARK-2243). The other SparkContext was created at:
org.apache.spark.SparkContext.<init>(SparkContext.scala:80)
org.apache.spark.repl.SparkILoop.createSparkContext(SparkILoop.scala:1016)
$line3.$read$$iwC$$iwC.<init>(<console>:9)
$line3.$read$$iwC.<init>(<console>:18)
$line3.$read.<init>(<console>:20)
$line3.$read$.<init>(<console>:24)
$line3.$read$.<clinit>(<console>)
$line3.$eval$.<init>(<console>:7)
$line3.$eval$.<clinit>(<console>)
$line3.$eval.$print(<console>)
sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
java.lang.reflect.Method.invoke(Method.java:606)
org.apache.spark.repl.SparkIMain$ReadEvalPrint.call(SparkIMain.scala:1065)
org.apache.spark.repl.SparkIMain$Request.loadAndRun(SparkIMain.scala:1338)
org.apache.spark.repl.SparkIMain.loadAndRunReq$1(SparkIMain.scala:840)
org.apache.spark.repl.SparkIMain.interpret(SparkIMain.scala:871)
org.apache.spark.repl.SparkIMain.interpret(SparkIMain.scala:819)
org.apache.spark.repl.SparkILoop.reallyInterpret$1(SparkILoop.scala:856)
ssc: org.apache.spark.streaming.StreamingContext = [email protected]
Yani ihmal, o (tabii aynı zamanda hatasını gösterebilir dedi) uyarıyor söyledi ve RDD oluşturmak geçti. Yine, İşte emin değilim, Bu bir hata mı/Uyarı ???
adim 4
Aşağıdaki şekilde RDD oluşturuldu.
<pre>
scala> var fil = ssc.textFile("/mapr/datalake/01.Call_ID.txt")
<console>:21: error: value textFile is not a member of org.apache.spark.streaming.StreamingContext
var fil = ssc.textFile("/mapr/datalake/01.Call_ID.txt")
^
</pre>
Burada textFile, streamingContext öğesinin bir üyesi değil. Bütün bunlara kızmaya gidiyorum. Ayrıca, bir şirketin şirket dizüstü bilgisayarında (JFYI) komut dosyalarını çalıştırıyorum.
HADOOP_USER_NAME = HDF'ler kıvılcım-kabuğu ve HADOOP_USER_NAME = CNI kıvılcım kabuğu çalıştı Ama bana aynı hataları veriyor. i olduğuna göre, nasıl izinleri kontrol etmek için (CNI benim kullanıcı adıdır) ??? – subro
Ayarlanmış bir izin yok olabilir. En azından herhangi bir dosya erişim izniniz olup olmadığını görmesi gereken hadoop fs -ls/çalıştırmayı deneyin. – SChorlton
$ Hadoop fs -ls/ Bulunan 9 ürün drwxr-xr-x - mapr mapr 1 2016/04/05 16:29/apps drwxrwxrwx - mapr mapr 1 2014/09/29 15:45/ drwxr- kriterler xr-x - mapr mapr 16 2016-02-02 00:03/datalake drwxr-xr-x - mapr mapr 0 2013-12-10 16:35/hbase drwxr-xr-x - mapr mapr 0 2014-09 -27 08:14/tablolar drwxrwxrwx - mapr mapr 1124 2016-04-22 22:40/tmp drwxr-xr-x - mapr mapr 18 2016-04-14 14:06/kullanıcı drwxr-xr-x - mapr mapr 1 2013-12-10 16:35/var drwxrwxrwt - mapr mapr 356 2016-04-22 11:43/yarnlogs – subro