Deprecated range syntax


#14

It is fairly easy to write a routine that works right by using multiplication and division rather than addition. In fact, I did exactly that a long time ago for my scalar class (which represents physical scalars with units). Here it is:

def scalarSteps(start: Scalar, end: Scalar, step: Scalar): Vector[Scalar] = {
  val inc = abs(step)
  val sgn = Real(signum(step)) // convert to "Double"
  val start1 = Real(start / inc)
  val end1 = Real(end / inc) + 1e-10 * sgn

  (BigDecimal(start1) to end1 by sgn)
    .map(_.toDouble).map(_ * inc).toVector
}

A simpler version of this (replace Scalar with Double) could be provided by default for Doubles in Scala, so that the deprecated syntax could be maintained and would work correctly. That would spare users the effort of figuring out how to use BigDecimal. It would also incur a tiny performance penalty, but I would gladly take the slight hit in return for the convenience.


#15

For what it’s worth, it just occurred to me that if the human race had chosen base 8 (octal) instead of base 10 as the standard numeral system, we wouldn’t have this problem. People say we use base 10 because we have ten fingers, but actually we have 8 fingers and two thumbs! Too late to fix that one, I guess!


#16

That doesn’t work right on 0.1 to 0.299999999999 by 0.2.


#17

(How) is this different from 1 to 7 by 2?


#18

Binary-coded integers can represent whole decimal numbers exactly. Binary fractions, which is what Float and Double are, cannot represent decimal fractions exactly. So 1 to 7 by 2 is calculated without error.
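A minimal check makes the distinction concrete (nothing here beyond the standard library):

```scala
// Whole numbers are represented exactly in binary, so an Int range
// steps without any drift.
assert((1 to 7 by 2).toList == List(1, 3, 5, 7))

// Most decimal fractions have no exact binary representation, so the
// analogous Double arithmetic picks up rounding error immediately.
assert(0.1 + 0.2 != 0.3) // the sum is actually 0.30000000000000004
```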


#19

I’m sorry, but I truly do not see why you think Range.Double is “unproblematic”. What do you think about this behavior:

Welcome to Scala 2.12.4 (OpenJDK 64-Bit Server VM, Java 1.8.0_171).
Type in expressions for evaluation. Or try :help.

scala> Range.Double(0.0, 7.0, 1.0).last
res0: Double = 6.0

scala> Range.Double(0.0, 0.7, 0.1).last
res1: Double = 0.7000000000000001

Best, Oliver


#20

Even if you step with precision you run into surprises. What should the behavior of 0.1 until 3*0.1 by 0.1 be? More sneakily, suppose you have def steps(x: Double) = x until 3*x by x. Shall this sometimes give you three elements and sometimes two?

The imprecision can easily arrive in the input which is why a solution with precise stepping isn’t really a solution.
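A quick sketch of how the error arrives in the input before any range is even constructed:

```scala
// 3 * 0.1 rounds to 0.30000000000000004, which is strictly greater
// than the Double closest to 0.3, so an exclusive upper bound computed
// this way admits one more element than the literal 0.3 would.
val upper = 3 * 0.1
assert(upper != 0.3)
assert(upper > 0.3)
```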


#21

That’s just a bug. The code comment is:

// XXX This may be incomplete.

And with the appropriate override,

scala> Range.Double(0, .7, .1).last
<console>:12: warning: method apply in object Double is deprecated (since 2.12.6): use Range.BigDecimal instead
       Range.Double(0, .7, .1).last
             ^
res1: Double = 0.6

#22

I still insist that in the age of literal types and macros, it’s not too much magic to insist on literals or at least take warning action. 0 to .7 by .1, give me BDs or Doubles or whatever seems to be expected.

Or at least enable a Propensive library to do it for me.


#23

That doesn’t work right on 0.1 to 0.299999999999 by 0.2.

But it’s “close enough for government work,” as they say!

I actually use something like this to discretize a bounding area for a numerical algorithm, and I definitely need to capture the end point. But I need to capture the end point even if it is in the middle of a step, which is a slightly different problem. I could just add the end point to the end of the sequence, but then I would usually be repeating the end point. So I came up with this little scheme:

def scalarStepsx(start: Scalar, end: Scalar, step: Scalar): Vector[Scalar] = {
  // same as scalarSteps except guaranteed to include end point
  val steps = scalarSteps(start, end, step)
  if (areClose(steps.last, end)) steps else steps :+ end
}

def areClose(x: Scalar, y: Scalar) =
  if (y == 0) x == 0 else abs(x / y - 1) < 1e-13


#24

That one also has counterexamples where it does the wrong thing (e.g. 0.1 to 0.300000000001 by 0.1). None of these are suitable for a library method that should act “intuitively”.


#25

@som-snytt - I don’t have any objection to a working macro. I’m not likely to be able to write one in a reasonable amount of time myself, though.

As Russ’s examples indicate, it’s tricky to get it working. The only really safe thing to do is pass literal numeric arguments into the BigDecimal string constructor, picking them directly out of the text of the code (not the Double literal computed by the compiler).
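To illustrate why the string constructor matters (a sketch; note that Scala's `BigDecimal(Double)` factory goes through the Double's decimal `toString`, while Java's raw `double` constructor does not):

```scala
// For a simple literal, the Double's shortest decimal rendering
// round-trips, so the two constructions happen to agree.
assert(BigDecimal("0.3") == BigDecimal(0.3))

// For a computed value, the rounding error is already baked into the
// Double, and no constructor can recover the intended 0.3.
assert(BigDecimal("0.3") != BigDecimal(0.1 * 3))

// Java's raw double constructor exposes the underlying binary value.
assert(new java.math.BigDecimal(0.3).toString.startsWith("0.29999"))
```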


#26

I can’t sneak anything by you!

Seriously though, a person is extremely unlikely to actually use a number like 0.300000000001, and roundoff error will be a couple orders of magnitude less than 1e-12. Hence, I don’t see it as a practical issue. Nevertheless, I can understand that you cannot allow even the tiniest “loophole” in the standard language and library.


#27

Some applications actually hinge upon these kinds of differences–those that have chunked intervals where the intervals are used as a denominator, for instance, or those that count on hitting the endpoint exactly in order to generate a difference between a to b and a until b. This can be really important to get right if you’re, say, trying to generate angles between 0 and 2*Pi; overshooting on the last endpoint giving you a second approximately-zero angle can be a big deal.

I’d love to have a better story here, but unfortunately it is all too easy to have an “intuitive” result that’s just wrong. For example, people will reason, “Well, if I hit the endpoint exactly, to and until will be different, so I’ll just boost the endpoint up/down a tiny bit to make them the same,” and then they get weird unexpected behavior because it’s fighting secret heuristics in the algorithm put there to try to preserve a different kind of intuition.
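A sketch of the angle pitfall (illustrative code, not from the thread): whether repeated addition of 2*Pi/n lands on, short of, or past 2*Pi is decided by accumulated rounding, so the element count cannot be predicted from the math alone.

```scala
val n = 12
val step = 2 * math.Pi / n

// Accumulate angles by repeated addition, stopping before 2*Pi.
val byAddition = Iterator.iterate(0.0)(_ + step)
  .takeWhile(_ < 2 * math.Pi)
  .toVector

// Depending on rounding, this yields n angles or n + 1, where the
// extra element is an approximately-zero duplicate modulo 2*Pi.
assert(byAddition.length == n || byAddition.length == n + 1)
```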


#28

I can see that arithmetic with Doubles is imprecise, and that makes a naive range of Doubles unintuitive. But I don’t really see the problem anymore when you can make the steps of the range precise by using a BigDecimal underneath. The argument now is that you can give an imprecise result of a calculation with Doubles as input to the range (e.g. 0.1 until 3*0.1 by 0.1). But isn’t this just the case for everything one might do with Doubles? If that’s a reason not to have a range of Doubles, then shouldn’t you just remove Double itself?

For instance:

scala> Ordering[Double].equiv(0.3, 0.1 * 3)
res0: Boolean = false

Should we now deprecate Ordering[Double]?

Also, if you force people to use Range.BigDecimal instead, this is what’s going to happen:

scala> def someInput = 0.1 * 3
someInput: Double

scala> val range = BigDecimal("0.1") until someInput by 0.1
range: scala.collection.immutable.NumericRange.Exclusive[scala.math.BigDecimal] = NumericRange 0.1 until 0.30000000000000004 by 0.1

scala> range.last.toDouble
res1: Double = 0.3

Uglier code for the same result.


#29

I was already thrown by

scala> BigDecimal(.1 * 3)
res0: scala.math.BigDecimal = 0.30000000000000004

As was mentioned on the other thread, I think at least the folded constant should do the more obvious thing:

.1 * 3 : BigDecimal

where the expected type has to guide something somehow. Probably just every term is BigDecimal.

In the meantime, I think a lint rule is called for. Abide, abide. I mean Scalafix.

If anyone watched Agents of S.H.I.E.L.D., I’d like the t-shirt that says, “I can Scalafix this!”


#30

There’s a limit to how much we can protect people. But the bottom line is that Double represents decimal fractions imprecisely, and NumericRange has an API that presupposes accurate treatment of endpoints. There’s an inherent conflict there. We shouldn’t present an API and then blame the user for assuming that it works reliably because of course Double is imprecise.

So either we need an alternate API, e.g. 0.1 to 0.7 size 7 and 0.1 to 0.7 every 0.1 where you promise you will hit the endpoints regardless (and the step size for every is not strictly adhered to); or we need to bail on Double entirely and/or leave the deprecations forever that tell people that what they’re trying to do can’t be made reliable because of the mismatch between endpoint assumptions requiring something that Double can’t deliver.


#31

I would add that arithmetic mixing floating-point types (e.g. Double) with fixed-point, aka decimal, types (e.g. BigDecimal) strongly smells like a broken design.

Decimals are only precise if your numbers don’t have more digits than your BigDecimal is configured to handle. For example, BigDecimal is imprecise for one third (0.3333…) and, with default precision, for (1e50 + 1).

Decimals are much more expensive than Doubles. A Double is 8 bytes, and its operations are simple hardware-supported instructions, i.e. very fast. Decimals are user-level objects dozens of bytes in size, and every operation is complex and user-defined, i.e. very slow.

Doubles work very well for most science, engineering, and applied-math use cases when used properly. If you ever find that Doubles are not precise enough, then almost always one of the following three things is true:

(1) You have a pure math problem requiring many digits, like calculating the first million digits of pi, or finding the next biggest known prime number. In that case, neither Double nor BigDecimal will save you, and you will need your own custom types. (Ok, maybe BigDecimal may somehow work, but only if used very cleverly)

(2) You have some financial or legal use case that calls for decimals. For example, calculating an account balance, or appointing seats in parliament according to election results. In this case, BigDecimal will work, but only after you have made sure the rounding (MathContext) is exactly according to the rules.

(3) You are using the wrong algorithm. Ask yourself whether the end result will critically change if some numbers are slightly altered. If yes, your algorithm will not work. In particular, testing for equality is almost always an error. Testing for ordering is only fine if you can tolerate an unexpected ordering of numbers that end up close to each other.

For example, to get a Range of Doubles, a valid algorithm would be to first calculate the first and last number and number of intervals and then calculate all numbers from Int indices. On the other hand, repeatedly adding and comparing to some boundary is probably not useful.
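That index-based approach might look like the following sketch (`doubleSteps` is an illustrative name, not a library method):

```scala
// Compute each element independently from an Int index, so rounding
// error never accumulates, and return `end` itself as the final
// element so the right endpoint is hit exactly by construction.
def doubleSteps(start: Double, end: Double, n: Int): Vector[Double] =
  (0 to n).map { i =>
    if (i == n) end
    else start + (end - start) * i / n
  }.toVector

val xs = doubleSteps(0.0, 0.7, 7)
assert(xs.length == 8)
assert(xs.last == 0.7) // exact, unlike 0.1 * 7
```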

Best, Oliver


#32

For starters, what on earth is the problem with “until”, which has also been deprecated?

Double is an imprecise type. The use of precision-less equality for Doubles should certainly be removed from the language. But the use of the comparison operators is precisely what the floating-point types were designed for. As range is built on top of comparison, it is a perfectly legitimate use of Double.

Have I ever been caught out by Double’s imprecision? Yes, of course, but that is part of the problem domain. It is not accidental complexity. BigDecimal should not be the default.


#33

to includes the right endpoint. until does not. When you can’t tell where the endpoint is because Double is imprecise, this results in a pretty non-intuitive API.

The default isn’t supposed to be BigDecimal. The default should be that we don’t provide a confusing API. You can then get the desired functionality some other way that is predictable (like mapping Int ranges, if you want to be fast, or BigDecimal if you want to avoid having to do the math yourself).