Use of Java Streams from Scala

Hello,
I recently started learning Scala and am using it in an existing, larger Java project. I have an existing API that returns a java.util.stream.Stream and wanted to use it from my Scala code. I solved my problem by using iterator.asScala, but the first solution I tried, suggested by a comment on Stack Overflow, was simply to use Stream’s methods with Scala function literals. This led to problems with the type checker that I’d like to understand.
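For reference, the iterator.asScala version I ended up with looks roughly like this (a sketch assuming Scala 2.12 and scala.collection.JavaConverters; the grouping logic just mirrors the minimized example below):

```scala
import java.util.stream.{Stream => JStream}
import scala.collection.JavaConverters._

// Convert the Java Stream to a Scala Iterator and use the ordinary
// Scala collection methods, where type inference works as usual.
val normalWords: Map[Int, String] =
  JStream.of("hello", "world")
    .iterator.asScala
    .toSeq
    .groupBy(_.length)
    .mapValues(_.map(_.toUpperCase).mkString)
    .toMap
// normalWords: Map(5 -> "HELLOWORLD")
```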

Here is some minimized code that causes the errors:

import java.util.stream.Collectors.{groupingBy, joining, mapping}

class StackOverflowReproduce extends App {
    val normalWords = (java.util.stream.Stream.of("hello", "world")
        collect groupingBy(_.length, mapping(_.toUpperCase(), joining()))
    )
}

This causes:

StackOverflowReproduce.scala:7: error: missing parameter type for expanded function ((x$1: <error>) => x$1.length)
    collect groupingBy(_.length, mapping(_.toUpperCase(), joining()))
                   ^
StackOverflowReproduce.scala:7: error: value toUpperCase is not a member of ?0
    collect groupingBy(_.length, mapping(_.toUpperCase(), joining()))
                                       ^

As far as I understand it, the compiler cannot infer the types of the two _ placeholders, but replacing them with (_: String) causes a StackOverflowError when compiling.
How would I fix this problem if it occurred in code that had no alternative approach like this one?


Have you tried using the arrow (=>) notation instead?

class StackOverflowReproduce extends App {
    val normalWords = (java.util.stream.Stream.of("hello", "world")
        collect groupingBy(str => str.length, mapping(str => str.toUpperCase(), joining()))
    )
}

As a general rule, each underscore stands for a separate parameter. If you have two underscores in a single lambda, Scala treats them as two arguments, so _ + _ is short for (x, y) => x + y, not x => x + x.
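A quick illustration of that expansion:

```scala
// Two underscores expand to two parameters, in order of appearance:
val sum = List(1, 2, 3).reduce(_ + _)        // (x, y) => x + y, so 6
// To refer to one argument twice, name it explicitly:
val doubled = List(1, 2, 3).map(x => x + x)  // List(2, 4, 6)
```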

I could be wrong, but I think the problem you’re running into is that Scala isn’t automatically applying SAM conversion to the arguments you’re passing to the Java groupingBy method.

My experience with this has mostly been in concurrency situations. So, for example, if I want to create a Runnable from a function and I try to do:

val thread = new Thread( () => println("Hi") )

the compilation will fail. However, if I instead do…

val runnable: Runnable = () => println("Hi")
val thread = new Thread( runnable )

Things will compile (and work) just fine.

This behavior seems to be intended (or at least known), judging by this quote from the Scala 2.12.0 release notes I linked above:

Note that only lambda expressions are converted to SAM type instances, not arbitrary expressions of FunctionN type:

I’ve also found that type inference doesn’t work perfectly here either. If I were you, I would try:

  • Giving groupingBy explicit type parameters.
  • Extracting _.length and _.toUpperCase() into vals with explicit types; I think they need to be some variant of java.util.function.Function.

And see if that fixes your issue.

Cheers!

Thank you for your suggestions.
@MarkCLewis: I tried the other notation, but in this case it should not make a difference: there are actually two separate lambdas. groupingBy takes one as its first parameter, while mapping, its second parameter, in turn takes a lambda.

@farmdawgnation: The SAM conversion sounds like it is probably part of the problem. Unfortunately, giving groupingBy explicit type parameters is not possible, because its third type parameter is used as the second type parameter of the Collector returned by mapping, which is a wildcard (I tried [String, Int, _, String], which gave an error about an unbound wildcard). Without nested Collectors this problem does not appear, so the wildcard is probably the cause.

This looks like a limitation in type inference. Higher order methods in Scala are designed so that type inference can proceed one argument list at a time, from left to right, so that it knows the argument types to expect for the function literals by the time it gets to them. In these Java methods, it goes the other way: the function comes first, and the data structure that would constrain the type parameters second. Sadly, we don’t support that.
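To illustrate the left-to-right point with two hypothetical methods (foldC and foldU are made-up names, not library code):

```scala
// Curried: List(1, 2, 3) in the first argument list fixes T = Int,
// so the compiler knows the lambda's parameter types in the second.
def foldC[T, R](xs: List[T], z: R)(f: (R, T) => R): R = xs.foldLeft(z)(f)
val curried = foldC(List(1, 2, 3), 0)(_ + _)   // compiles: 6

// Function first, Java-style: when the compiler reaches the lambda,
// nothing has constrained T yet, so `_ + _` would need explicit types.
def foldU[T, R](f: (R, T) => R, xs: List[T], z: R): R = xs.foldLeft(z)(f)
val uncurried = foldU((acc: Int, x: Int) => acc + x, List(1, 2, 3), 0)
```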

To make these APIs easy to use from Scala, you’d be best off writing some wrappers that use currying in the same style as our collections. Luckily, a lot of the simpler use cases work: https://gist.github.com/adriaanm/892d6063dd485d7dd221
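A minimal sketch of such a wrapper (groupByMapped is a hypothetical name; this version sidesteps the Collector types by converting to a Scala Iterator internally, but it shows the curried shape that makes inference work, assuming Scala 2.12):

```scala
import java.util.stream.{Stream => JStream}
import scala.collection.JavaConverters._

// Currying in the style of the Scala collections: the stream in the
// first parameter list fixes the element type T, so the compiler knows
// the lambdas' argument types by the time it checks the later lists.
def groupByMapped[T, K, V](s: JStream[T])(key: T => K)(value: T => V): Map[K, Seq[V]] =
  s.iterator.asScala.toSeq.groupBy(key).mapValues(_.map(value)).toMap

// Underscore lambdas now infer their parameter type (T = String):
val grouped = groupByMapped(JStream.of("hello", "world"))(_.length)(_.toUpperCase)
// grouped: Map(5 -> Seq("HELLO", "WORLD"))
```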

Okay, for my current problem I will stick with the conversion to an iterator, which actually makes the code shorter, but I’ll keep that in mind for similar methods where it isn’t possible. Thanks.

I noticed that the compiler crashes even when I supply all the types:

import java.util.stream.Collector
import java.util.stream.Collectors.{joining, mapping}

val upcase: java.util.function.Function[String, String] = str => str.toUpperCase()
val join: Collector[CharSequence, _, String] = joining()
val mapper: Collector[String, _, String] = mapping(upcase, join)

$ scalac StackOverflow.scala

error: java.lang.StackOverflowError
    at scala.reflect.internal.tpe.TypeMaps$TypeMap.mapOver(TypeMaps.scala:101)
    at scala.reflect.internal.tpe.TypeMaps$FindTypeCollector.traverse(TypeMaps.scala:1076)
    ...

Is this considered a compiler bug that I should report on the tracker? The page about reporting bugs mentions that most StackOverflowErrors are not considered bugs, but increasing the stack size as recommended still doesn’t produce a result (such as a more helpful error message).