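What to do when Spark 2.3.1 + Kafka 0.9 throws an exception while consuming messages in Direct mode
Many people without much experience are at a loss when Spark 2.3.1 with Kafka 0.9 throws an exception while consuming messages in Direct mode. This article summarizes the cause of the problem and how to fix it.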
Consuming messages with Spark 2.3.1 + Kafka in Direct mode
Maven dependencies
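<dependency>
    <groupId>org.apache.spark</groupId>
    <artifactId>spark-streaming-kafka-0-8_2.11</artifactId>
    <version>2.3.1</version>
</dependency>
<dependency>
    <groupId>org.apache.spark</groupId>
    <artifactId>spark-streaming_2.11</artifactId>
    <version>2.3.1</version>
</dependency>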
Here 2.3.1 is the Spark version.
Direct mode code
import kafka.serializer.StringDecoder
import org.apache.spark.streaming.kafka.KafkaUtils
import org.apache.spark.streaming.{Seconds, StreamingContext}
import org.apache.spark.{SparkConf, SparkContext}

object Test {
  val zkQuorum = "mirrors.mucang.cn:2181"
  val groupId = "nginx-cg"
  val topic = Map("nginx-log" -> 1)
  val KAFKA_INTERVAL = 10

  case class NginxInof(domain: String, ip: String)

  def main(args: Array[String]): Unit = {
    val sparkConf = new SparkConf().setAppName("NginxLogAnalyze").setMaster("local[*]")
    val sparkContext = new SparkContext(sparkConf)
    val streamContext = new StreamingContext(sparkContext, Seconds(KAFKA_INTERVAL))
    val kafkaParam = Map[String, String](
      "bootstrap.servers" -> "xx.xx.cn:9092",
      "group.id" -> "nginx-cg",
      "auto.offset.reset" -> "largest"
    )
    val topic = Set("nginx-log")
    val kafkaStream = KafkaUtils.createDirectStream(streamContext, kafkaParam, topic)
    val counter = kafkaStream
      .map(_.toString().split(" "))
      .map(item => (item(0).split(",")(1) + "-" + item(2), 1))
      .reduceByKey((x, y) => (x + y))
    counter.foreachRDD(rdd => {
      rdd.foreach(println)
    })
    streamContext.start()
    streamContext.awaitTermination()
  }
}
auto.offset.reset is set to largest because this Kafka version is too old to support latest.
Exception
Caused by: java.lang.NoSuchMethodException: scala.runtime.Nothing$.<init>(kafka.utils.VerifiableProperties)
    at java.lang.Class.getConstructor0(Class.java:3082)
    at java.lang.Class.getConstructor(Class.java:1825)
    at org.apache.spark.streaming.kafka.KafkaRDD$KafkaRDDIterator.<init>(KafkaRDD.scala:153)
    at org.apache.spark.streaming.kafka.KafkaRDD.compute(KafkaRDD.scala:136)
    at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:324)
    at org.apache.spark.rdd.RDD.iterator(RDD.scala:288)
    at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
    at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:324)
    at org.apache.spark.rdd.RDD.iterator(RDD.scala:288)
    at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
    at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:324)
    at org.apache.spark.rdd.RDD.iterator(RDD.scala:288)
    at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:96)
    at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:53)
    at org.apache.spark.scheduler.Task.run(Task.scala:109)
    at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:345)
    ... 3 more
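The top frames of the trace show a reflective constructor lookup (Class.getConstructor) failing on scala.runtime.Nothing$. The sketch below is only a rough illustration of that failure mode in plain Scala, without Spark or Kafka. Props, TextDecoder, and decoderConstructor are hypothetical stand-ins invented for this example; they are not the Spark or Kafka classes.

```scala
import scala.reflect.{ClassTag, classTag}

object NothingDecoderDemo {
  // Hypothetical stand-in for a properties class passed to decoders
  class Props

  // Hypothetical stand-in for a decoder with a single-argument constructor
  class TextDecoder(props: Props) {
    def fromBytes(bytes: Array[Byte]): String = new String(bytes, "UTF-8")
  }

  // Looks up a public constructor taking Props on the runtime class of D
  def decoderConstructor[D: ClassTag]: java.lang.reflect.Constructor[_] =
    classTag[D].runtimeClass.getConstructor(classOf[Props])

  def main(args: Array[String]): Unit = {
    // A concrete decoder type has the expected constructor
    println(decoderConstructor[TextDecoder].getDeclaringClass.getName)

    // With Nothing as the type argument, the runtime class is
    // scala.runtime.Nothing$, which has no such constructor
    try decoderConstructor[Nothing]
    catch {
      case e: NoSuchMethodException =>
        println(s"NoSuchMethodException: ${e.getMessage}")
    }
  }
}
```

When the type argument is Nothing, the lookup has no matching constructor to find, which is consistent with the exception shown above.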
Solution
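When the Kafka properties are verified, the types that Scala infers by default cannot be used: without explicit type parameters the decoder types resolve to Nothing, which has no constructor that accepts the Kafka properties. Specify the types explicitly and use the decoder classes that ship with Kafka: createDirectStream[String, String, StringDecoder, StringDecoder]
Here StringDecoder must be kafka.serializer.StringDecoder.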
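A minimal sketch of the corrected call, assuming the same spark-streaming-kafka-0-8 dependency and the placeholder broker address, group, and topic used in this article. The object name DirectStreamFixed and the final print step are illustrative choices, not part of the original job.

```scala
import kafka.serializer.StringDecoder
import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, StreamingContext}
import org.apache.spark.streaming.kafka.KafkaUtils

object DirectStreamFixed {
  def main(args: Array[String]): Unit = {
    val sparkConf = new SparkConf().setAppName("NginxLogAnalyze").setMaster("local[*]")
    val streamContext = new StreamingContext(sparkConf, Seconds(10))

    val kafkaParam = Map[String, String](
      "bootstrap.servers" -> "xx.xx.cn:9092", // placeholder broker address
      "group.id" -> "nginx-cg",
      "auto.offset.reset" -> "largest"
    )
    val topics = Set("nginx-log")

    // Explicit key/value types plus Kafka's own StringDecoder for both,
    // so the decoder types are no longer inferred as Nothing
    val kafkaStream =
      KafkaUtils.createDirectStream[String, String, StringDecoder, StringDecoder](
        streamContext, kafkaParam, topics)

    // Each record is a (key, value) pair; print a sample of the values
    kafkaStream.map(_._2).print()

    streamContext.start()
    streamContext.awaitTermination()
  }
}
```

With the type parameters in place, the stream yields (String, String) pairs, so downstream parsing can work on the message value directly.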