I was exploring immutable RedBlackTree
tree implementation in Scala (2.12.4) when I noticed something that wasn’t clear to me:
private[this] def upd[A, B, B1 >: B](tree: Tree[A, B], k: A, v: B1, overwrite: Boolean)(implicit ordering: Ordering[A]): Tree[A, B1] = if (tree eq null) { RedTree(k, v, null, null) } else { val cmp = ordering.compare(k, tree.key) if (cmp < 0) balanceLeft(isBlackTree(tree), tree.key, tree.value, upd(tree.left, k, v, overwrite), tree.right) else if (cmp > 0) balanceRight(isBlackTree(tree), tree.key, tree.value, tree.left, upd(tree.right, k, v, overwrite)) else if (overwrite || k != tree.key) mkTree(isBlackTree(tree), k, v, tree.left, tree.right) else tree }
Comparing keys via compare
is followed by explicit key not “equals
” condition.
I compared similar parts of mutable RedBlackTree
implementation and haven’t found anything similar. So I started to look for the collections that may work differently for the mutable and immutable version. So TreeSet
, overwrite
is passed as false compared to true TreeMap
: this makes sense, for tree map, inserting the same key will replace the value, while this is not needed for TreeSet
. Although key equality is still checked, it’s still not clear why.
Simple test show that mutable and immutable TreeSet work really differently in the following example.
Let’s define class extending Ordered
with two fields that compare
distinguishes only the first one:
case class Dummy(x: Int, y: Int) extends Ordered[Dummy] { override def compare(that: Dummy): Int = Ordering[Int].compare(this.x, that.x) }
Now try to insert the different one by “equals
” instances into mutable and immutable versions of sets:
val mutableTree = scala.collection.mutable.TreeSet.empty[Dummy] mutableTree.add(Dummy(0, 0)) mutableTree.add(Dummy(0, 1)) mutableTree var immutableTree = scala.collection.immutable.TreeSet.empty[Dummy] immutableTree += Dummy(0, 0) immutableTree += Dummy(0, 1) immutableTree
So were the results (2.12.4):
Welcome to Scala 2.12.4 (Java HotSpot(TM) 64-Bit Server VM, Java 1.8.0_121). Type in expressions for evaluation. Or try :help. scala> case class Dummy(x: Int, y: Int) extends Ordered[Dummy] { | override def compare(that: Dummy): Int = Ordering[Int].compare(this.x, that.x) | } defined class Dummy scala> val mutableTree = scala.collection.mutable.TreeSet.empty[Dummy] mutableTree: scala.collection.mutable.TreeSet[Dummy] = TreeSet() scala> mutableTree.add(Dummy(0, 0)) res0: Boolean = true scala> mutableTree.add(Dummy(0, 1)) res1: Boolean = false scala> mutableTree res2: scala.collection.mutable.TreeSet[Dummy] = TreeSet(Dummy(0,0)) scala> var immutableTree = scala.collection.immutable.TreeSet.empty[Dummy] immutableTree: scala.collection.immutable.TreeSet[Dummy] = TreeSet() scala> immutableTree += Dummy(0, 0) scala> immutableTree += Dummy(0, 1) scala> immutableTree res5: scala.collection.immutable.TreeSet[Dummy] = TreeSet(Dummy(0,1))
As we can see mutable TreeSet
has not got Dummy(0, 1)
while the immutable TreeSet
did.
More interesting, is what will happen if we try to remove an element:
scala> mutableTree.remove(Dummy(0, 1)) res6: Boolean = true scala> mutableTree res7: scala.collection.mutable.TreeSet[Dummy] = TreeSet()
Mutable version worked exactly based on compare.
For Immutable version:
scala> immutableTree -= Dummy(0, 0) scala> immutableTree res9: scala.collection.immutable.TreeSet[Dummy] = TreeSet()
Nice, immutable did the same (it has Dummy(0, 1)
and removing Dummy(0, 0)
works).
If we look at immutable RedBlackTree
implementation for the del
function, we can see that there is simple:
val cmp = ordering.compare(k, tree.key) if (cmp < 0) delLeft else if (cmp > 0) delRight else append(tree.left, tree.right)
and there is no specific case for cmp == 0 && k != tree.key
.
This is not a bug as this violates Ordered contract
It is important that the equals method for an instance of Ordered[A] be consistent with the compare method. However, due to limitations inherent in the type erasure semantics, there is no reasonable way to provide a default implementation of equality for instances of Ordered[A]. Therefore, if you need to be able to use equality on an instance of Ordered[A] you must provide it yourself either when inheriting or instantiating.
I was still interested in finding out why this condition was added, so I went to git and found an original commit. It was exactly what I had expected for an overwrite parameter but there was no clear explanation why key compare was added:
Here we had an issue that RedBlack does not work the same way
for sets – which are not supposed to replace an element if
it is the same (wrt equals) and maps – which should replace
the corresponding values.Adding an overwrite parameter which decides whether to overwrite
added keys if they are the same in the ordering.
And also worth noting is that HashSet has nothing to do with ordering so they end up having both values inserted:
scala> var immutableHash = scala.collection.immutable.HashSet.empty[Dummy] immutableHash: scala.collection.immutable.HashSet[Dummy] = Set() scala> immutableHash += Dummy(0, 0) scala> immutableHash += Dummy(0, 1) scala> immutableHash res11: scala.collection.immutable.HashSet[Dummy] = Set(Dummy(0,0), Dummy(0,1))
No comments