I have a very big list, and I want to do something like this:

`list.sample(xx) // returns a sub-list containing xx random elements from the list`

How can I implement this? Thank you.

This may not be the fastest implementation but it would be quick and easy to code.

You can use a combination of `Random.shuffle` and `take` to implement this.
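A minimal sketch of the shuffle-and-take idea, in case it helps. The `ShuffleSample` object and its `sample` method are hypothetical names made up for illustration; only `Random.shuffle` and `take` come from the standard library.

```scala
import scala.util.Random

object ShuffleSample {
  // Returns up to k elements chosen at random, without replacement.
  // Shuffles a full copy of the input, so time and extra memory grow with xs.size.
  def sample[A](xs: Seq[A], k: Int, rng: Random = new Random): Seq[A] =
    rng.shuffle(xs).take(k)
}
```

Usage would look like `ShuffleSample.sample(myList, 5)`. If `k` is larger than the list, `take` simply returns the whole shuffled list.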

See this blog post: How to shuffle (randomize) a list in Scala (List, Vector, Seq, String) | alvinalexander.com

The shuffling algorithm should have a complexity of `O(n)`, IIRC.

A more efficient way would be to randomly select indices and stick them in a `Set` until the `Set` is a given size.
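Roughly, the index-set idea could look like the sketch below. `IndexSample` and `sample` are hypothetical names, and the sketch assumes the data is available as an `IndexedSeq` such as a `Vector`, since looking up `xs(i)` on a `List` has to walk the list from the head each time.

```scala
import scala.collection.mutable
import scala.util.Random

object IndexSample {
  // Draws random indices until k distinct ones have been collected,
  // then looks up the corresponding elements.
  def sample[A](xs: IndexedSeq[A], k: Int, rng: Random = new Random): Vector[A] = {
    require(k >= 0 && k <= xs.length, "k must be between 0 and xs.length")
    // The Set silently ignores indices that were already drawn.
    val chosen = mutable.LinkedHashSet.empty[Int]
    while (chosen.size < k) chosen += rng.nextInt(xs.length)
    chosen.iterator.map(i => xs(i)).toVector
  }
}
```

This works best when `k` is much smaller than the size of the collection. As `k` gets close to the size, more and more draws hit indices that are already in the `Set`, and the loop slows down.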

I am afraid shuffle is not efficient enough here, since the list is very big (more than 10 million elements). Thank you.

This might work for you.

Whoops, I made a little mistake with it. This should be better: Scastie - An interactive playground for Scala.

Well, you may try to search for known algorithms to do this.

I would guess that a simple approach would be something like this: Scastie - An interactive playground for Scala.

Let me know if you have any questions about the implementation.
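For reference, one known algorithm for this kind of problem is reservoir sampling (Algorithm R). This is not necessarily what the linked Scastie does; it is just a sketch of that technique, with `ReservoirSample` and `sample` as hypothetical names. It walks the input once from front to back, so it never needs indexed access and suits a plain `List`.

```scala
import scala.collection.mutable.ArrayBuffer
import scala.util.Random

object ReservoirSample {
  // Algorithm R: keeps a reservoir of at most k elements during a single pass.
  def sample[A](xs: Iterable[A], k: Int, rng: Random = new Random): Vector[A] = {
    require(k >= 0, "k must be non-negative")
    val reservoir = ArrayBuffer.empty[A]
    var seen = 0 // number of elements visited so far
    val it = xs.iterator
    while (it.hasNext) {
      val x = it.next()
      seen += 1
      if (reservoir.length < k) {
        // Fill the reservoir with the first k elements.
        reservoir += x
      } else {
        // Keep the new element with probability k / seen,
        // overwriting a uniformly chosen slot.
        val j = rng.nextInt(seen)
        if (j < k) reservoir(j) = x
      }
    }
    reservoir.toVector
  }
}
```

The extra memory is proportional to `k` rather than to the size of the list. If the list has fewer than `k` elements, the result is just all of them.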

I see that there’s a second thread about this over at How do you think of my subset function? — I’ll respond there.

Can Scala officially add some of the functions that are available in Breeze? Currently I have to import Breeze for this kind of data analysis, for instance for the `sample()` function.

It seems normal to me that you would use a separate library for that.
