na.drop does not work

#1

Hi there

val t1=Seq(("a","1"),("b","2")).toDF("id","value")
val t2=Seq(("a","3"),("c","2")).toDF("id","value2")
t1.join(t2,Array("id"),"left").show()

[image: output of the left join]

t1.join(t2,Array("id"),"left").na.drop().show()

[image: output of the left join after na.drop()]

def countMatch(s: String, pattern: String): Int = {
  s.sliding(pattern.length).count(_ == pattern)
}

val counterWord = udf((s: String) => {
  keywordSeq.map(x => (x, Util.countMatch(s, x))).filter(_._2 != 0)
})

When I run

t1.join(t2,Array("id"),"left").na.drop().withColumn("counter",counterWord(col("value2"))).withColumn("size",size($"counter")).filter($"size" === 1).show()

I get a Java NullPointerException.

But when I run

t1.join(t2,Array("id"),"left").na.drop().withColumn("counter",counterWord(col("value2"))).withColumn("size",size($"counter")).persist().filter($"size" === 1).show()

it works.

I wonder why I get the right answer when I use persist(). It seems that without persist(), the code runs in the wrong order.
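
For reference, here is a minimal sketch of a null-tolerant version of the helper. It assumes the exception comes from the UDF being called with a null value2, which the post does not confirm. Spark SQL does not guarantee the order in which filters and UDFs are evaluated, so a UDF may see a null even when na.drop() appears earlier in the chain. The names NullSafeUtil and the sample keywordSeq are hypothetical and only used for illustration.

```scala
// Hypothetical null-safe variant of the helper from the post (illustration only).
object NullSafeUtil {
  // Returns 0 instead of throwing when the input is null or the pattern is empty.
  // (sliding(0) would throw, so the empty pattern is guarded as well.)
  def countMatch(s: String, pattern: String): Int =
    if (s == null || pattern == null || pattern.isEmpty) 0
    else s.sliding(pattern.length).count(_ == pattern)

  // Assumed stand-in for keywordSeq, which the post does not define.
  val keywordSeq: Seq[String] = Seq("2", "3")

  // Plain function with the same shape as the UDF body; it could be
  // wrapped with udf(...) in the same way as counterWord.
  def counterWord(s: String): Seq[(String, Int)] =
    keywordSeq.map(x => (x, countMatch(s, x))).filter(_._2 != 0)
}
```

With a guard like this, the result should no longer depend on whether the null filter runs before the UDF, so persist() would not be needed to avoid the exception.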