When using TreeSet, how to get ceiling or floor value efficiently?


#1

There are functions named ceiling and floor in Java’s TreeSet. However I didn’t find something like them in Scala’s TreeSet.

I use code like set.from(123).min to do such thing. However in this way a new set is created, thus the complexity is O(n) right? But actually I only need that value, the complexity should be O(logn)

Is there any function functioning like ceiling and floor?

Thanks guys!


#2

You can use the min and max methods.


#3

I know that. But I want to get the max value below an arbitrary value x. Use plain max will only get the max value in the whole tree. So I have to use to method and then max.

Does it hurt performance? It is mentioned in here few years ago. Is it sill the same?


#4

How about .maxBy(_ < x)?


#5

maxBy method is used for finding the first element which yields the largest value measured by function f.

And the f you gave is _ < x, which yields Boolean. Seems like Scala considers True is larger than False, since
TreeSet(1, 3, 5).maxBy(_ < 5) returns 1, due to 1 and 3 both yield True.

maxBy is not the one I want in this case.

TreeSet(1, 3, 5).maxBy(x => if (x < 5) x else Int.MinValue) this will return the desired result 3. However, I think the complexity of it is O(n).


#6

Strange that TreeSet has slice(Int, Int), but no method to get a single
element at a given position.

Of course, you can do

def get[A](set: SortedSet[A], pos: Int): A = set.slice(pos, pos + 1).head

Now you can do a binary search for elements.


#7

slice looks like the combination of from and to, which means it’s still the same thing. slice returns a new TreeSet, which means a new red-black tree will be created, thus the complexity of O(n) is inevitable.

The point is, in a BST the complexity should be O(logn) if you want to find a specific value. In Java, ceiling and floor are provided to do this.


#8

A reasonable implementation of TreeSet.slice should only go with
O(n_in_slice*log(n_total))


#9

Yeah, I agree with u. I said O(n) because I guess it was similar to heap.

But the point is you can go with O(log n_total) if you only want that specific value.