Hi all,
A request for some design advice. Get yourself a brew, it’s a long one…
Background:
I have an abstraction to represent merges (in the sense of merging file content, eg. Git merge) that looks like this…
trait MergeResult[Element]:
def transformElementsEnMasse[TransformedElement](
transform: IndexedSeq[Element] => IndexedSeq[TransformedElement]
)(using equality: Eq[TransformedElement]): MergeResult[TransformedElement]
/** Yields content in a form suitable for being potentially nested within
* either side of a larger [[MergedWithConflicts]]. This is necessary because
* Git doesn't model nested conflicts, where one side of a conflict can
* contain a smaller conflict. <p>This occurs because splicing can generate
* such nested conflicts housed within a larger conflict between move
* destination anchors on one side (with the splice) and some other
* conflicting content on the opposite side to the anchors.
*
* @return
* The elements as single [[IndexedSeq]].
* @see
* https://github.com/sageserpent-open/kineticMerge/issues/160
*/
def flattenContent: IndexedSeq[Element]
The merge result abstraction generalises an indexed sequence of elements to be either a resolved merge or a conflicted merge, which at the moment is either a single indexed sequence for a fully resolved merge, or two indexed sequences for the left and right halves of a conflicted merge.
Note that no attempt is made to localise fine-grained conflicts - it’s all or nothing over the entire file content, so you get a ‘left-side’ and a ‘right-side’, these correspond to Git stage entries in a conflicted merge.
So far, so good - that all works very nicely.
It gets a bit more complicated in that the various transforms applied end up falling into four categories:
- A filtration, so the sequence element type doesn’t change, but elements can be dropped.
- A map-accumulate, in this case the sequence element type doesn’t change, but elements can be added or dropped or transmogrified - all in the context of previously processed elements.
- A left-fold, but as it turns out, this is really another map-accumulate in disguise.
- A flat-map of sorts, where the inner-type yielded by the operation being flat-mapped is an indexed sequence that should be concatenated into the resulting sequence. The element types change too.
- A map; the element types change.
Note that all these transforms are expressed wrt to the indexed sequence abstraction, not MergeResult.
There is also a rather hokey flattening operation on MergeResult that converts one to a simple indexed sequence (regardless of whether resolved or conflicted), and the less said about that the better, only it is unavoidable. Shrug.
What I want to do:
A. Finesse MergeResult so that it can model localised conflicts - so whereas for a fully resolved merge one would have an implied indexed sequence of elements, one with conflicts could be thought of as being a dag of elements with (hopefully) mostly long straight runs representing partially resolved output and the odd competing runs of elements representing local conflicts (or one run competing against a direct shortcut between elements - this models a deletion versus edit conflict).
B. Make those five kinds of transform into a first-class part of the API for MergeResult. This isn’t just to scratch an itch, the point is that with finer grained conflicts, there is no longer a single sequence of elements to pass to the transform - the operations apply in a more localised form, tracking the straight sections and competing sides of the local conflicts separately. This also has to take context into account for the map-accumulate / left-fold situations.
C. Finesse the flattening, so that instead of unconditionally flattening a MergeResult into an indexed sequence, there is an operation of MergeResult[MergeResult[Element]] => MergeResult[Element] that will inline a localised conflict directly into an enclosing non-conflicted straight run, and will inline a straight run into the relevant side of an enclosing conflict. Where one conflict is nested within one side of another, then the same kind of hokey outcome as with the current flattening takes place, and that’s OK.
Where I’ve got to:
It’s fairly straightforward to cutover MergeResult to internally represent the localised conflicts using a dag approach, and the operations to build one up pretty much write themselves. So that’s item A done.
I’m less certain about item B. My current thoughts are to hoist those four operations into the API of MergeResult, being a bit cavalier about the difference between a MergeResult on one hand as being a collection-like / monad-like thing and IndexedSeq on the other also being collection-like / monad-like.
So:
trait MergeResult[Element]:
def mapAccumulate[State, Transformed](seed: State)(
operation: (State, Element) => (State, Transformed)
): (State, MergeResult[Transformed])
def map[Transformed](
transform: Element => Transformed
): MergeResult[Transformed]
def flatMap[Transformed](innerTransform: Element => IndexedSeq[Transformed]): MergeResult[Transformed]
Note that the flat-map’s inner transform yields an indexed sequence - so this is rather like flat-mapping over a collection to yield optional inner results in the standard library; a mixed flat-map.
Now for item C. The only place where this is required is after doing one of the map-accumulates, not when doing the mixed flat-mapping. Of course, if I admitted that one of the two map-accumulates actually end up building a MergeResult[MergeResult[Element]], then perhaps this could be expressed more directly in the API. A kind of flatMapAccumulate?
Curious to know your thoughts…