#2296 How about... List.groupBy()

SlimerDude Fri 6 Jun 2014

A little method I've found myself cut'n'pasting into various projects is the ability to chop a list up into sub-sections, such as:

list := [1, 2, 3, 4, 5, 6, 7, 8, 9]

grpsOf3 := groupsOf(list, 3) // --> [[1, 2, 3], [4, 5, 6], [7, 8, 9]]

grpsOf2 := groupsOf(list, 2) // --> [[1, 2], [3, 4], [5, 6], [7, 8], [9]]

And I was wondering if it might be useful enough to put in the core List type as:

Obj[][] List.groupsOf(Int n)

list := [1, 2, 3, 4, 5, 6, 7, 8, 9]

grpsOf3 := list.groupsOf(3) // --> [[1, 2, 3], [4, 5, 6], [7, 8, 9]]

grpsOf2 := list.groupsOf(2) // --> [[1, 2], [3, 4], [5, 6], [7, 8], [9]]

Of course, it could be named better! I dunno, split() or sumfin.

Oh, and my current implementation is:

Obj[][] groupsOf(Obj[] all, Int groupsOf) {
    grps := all.typeof.emptyList.rw
    (all.size / groupsOf).times |i| {
        start := i * groupsOf
        grps.add(all[start..<start+groupsOf])
    }
    mod := all.size % groupsOf
    if (mod > 0) 
        grps.add(all[-mod..-1])            
    
    return grps
}

ivan Mon 9 Jun 2014

Here's mine ;)

fansh> groupsOf := |Int n, List l -> List[]| { 
  l.reduce([,]) |Obj[][] r, Obj item, Int index -> Obj[][]| { 
    if(index % n == 0) 
      r.add([item]); 
    else 
      r.last.add(item); 
    return r
  }
}
|sys::Int,sys::List->sys::List[]|
fansh> groupsOf(2, [1,2,3,4,5])
[[1, 2], [3, 4], [5]]

SlimerDude Mon 9 Jun 2014

Oh, a fancy pants solution! Nice, I like it!

Though I would replace the initial reduction [,] with l.typeof.emptyList.rw, to keep the returned list of the same type as the one passed in.

Probably do the same with [item] too.

[item] -> l.typeof.params["V"].emptyList.rw.add(item)

But then you have to cater for generic lists being passed in:

[item] -> (l.typeof.params["V"] ?: Obj?#).emptyList.rw.add(item)

tomcl Mon 9 Jun 2014

An idoiomatic Fantom library List method to do this would be more general:

V [] [] GroupBeforeIndex( V [] all, Int->Bool splitHere)

Where the returned list of lists is all cut into groups each starting with an index n for which splitHere(n) is true.

Then what you want would be written as:

x.groupBeforeIndex { it % 3 == 0 }

I may have got some types etc wrong here, but I hope the idea is clear...

SlimerDude Mon 9 Jun 2014

Hi Tom,

Having the method be more general purpose is a good idea. Then calling it split() or similar would make more sense.

To fit in with the other List methods, I believe the signature would look more like:

V[][] split(|V item, Int index -> Bool| c)

Then, as you say, a grouping func would look like:

[1,2,3,4,5].split |item, index -> Bool| { return (index % 2) == 0 }

It would seem both Ivan and I would use it, if no one else!

tomcl Mon 9 Jun 2014

Yes!

My preference for a name is splitBefore or something like because otherwise I'd never remember whether the split was before or after the selected indexes.

Anyway,this method as you have written it would be coherent with the other list methods and probably more useful than some!

Tom

andy Mon 9 Jun 2014

Out of curiosity - what is your typical use case for this?

SlimerDude Mon 9 Jun 2014

For a live example, see the list of articles on Fantom-Factory, you'll notice they're in groups of 9.

brian Mon 9 Jun 2014

This one seems a little more iffy to me - I've never needed to do anything like that, so I agree with Andy would love to hear multiple use cases.

Also, this one would be good to get some links from other APIs such as Python, Ruby, etc to see what they named it and how it works (to me that is the most convincing that would we should have it)

tomcl Mon 9 Jun 2014

SlimerDudes "every n" subdivision happens quite often.

I have to say the generalisation - although it is really neat - does not necessarily help much because I can't immediately think of a use case other than strings (dealt with in other ways) or parsing. Say to take a token list that includes EOL tokens and split it into a list of tokens one for each line. But there are more general and powerful ways to do parsing so this is maybe not a good example.

Python does this a slightly different (and maybe more useful) way, using groupby which accepts a function mapping from list elements to keys and groups by unique key value

itertools.groupby(iterable[, key])

Make an iterator that returns consecutive keys and groups from the iterable. The key is a function computing a key value for each element. If not specified or is None, key defaults to an identity function and returns the element unchanged. Generally, the iterable needs to already be sorted on the same key function.

The operation of groupby() is similar to the uniq filter in Unix. It generates a break or new group every time the value of the key function changes (which is why it is usually necessary to have sorted the data using the same key function). That behavior differs from SQL’s GROUP BY which aggregates common elements regardless of their input order.*

That would deal with SlimerDude's use cases:

groupby |index, value|{ index / 3 }

Or, without anything new, we could do it like this:

Obj [][] grps := [,]
x.each |val, index| 
{ 
   ((index / 3) == grps.len) ? grps.add( [val] ) : grps[-1].add(val)
}

SlimerDude Thu 24 Sep 2015

I had a quick look around for similar and found:

SlimerDude Thu 8 Oct 2015

Extrapolating from @tomcls generic idea, how about a groupBy() method on List that creates a Map using the list items as the values, but creating keys from a given func:

Obj:V[] groupBy(|V item, Int index -> Obj| c)

Usage:

animals := ["cat", "cow", "donkey", "dog", "snake"]

buckets := animals.groupBy { it.chars.first.toChar }

// --> [c:[cat, cow], d:[donkey, dog], s:[snake]]

This strikes me as far more useful to a wider audience.

As long as the returned Map was ordered, you could split lists (in the manner this thread has been talking about) by taking the Map vals, like this:

split := [1,2,3,4,5].groupBy |v, i| { i % 3}.vals

// --> [[1,2,3], [4,5]]

A simple non-generic method for use today looks like:

static Obj:Obj[] groupBy(Obj[] list, |Obj item, Int index->Obj| keyFunc) {
    list.reduce(Obj:Obj[][:] { it.ordered = true}) |Obj:Obj[] bucketList, val, i| {
        key := keyFunc(val, i)
        bucketList.getOrAdd(key) { Obj[,] }.add(val)
        return bucketList
    }
}

animals := groupBy(["cat", "cow", "donkey", "dog", "snake"]) |Str v, i| { v.chars.first.toChar }
echo(animals)  // --> [c:[cat, cow], d:[donkey, dog], s:[snake]]

split := groupBy([1,2,3,4,5]) |v, i| { i % 2}.vals
echo(split)    // --> [[1,2,3], [4,5]]

SlimerDude Thu 12 Nov 2015

Just discovered that lodash, the popular JS library, has a groupBy() function.

_.groupBy(collection, iteratee)

Creates an object composed of keys generated from the results of running each element of collection through iteratee. The corresponding value of each key is an array of the elements responsible for generating the key.

_.groupBy([4.2, 6.1, 6.4], function(n) {
  return Math.floor(n);
});
// → { '4': [4.2], '6': [6.1, 6.4] }

Jeremy C Fri 4 Oct 2019

+1'ing this old topic. I find my self using this quite a lot as well.

Mengu Mon 6 Jan 2020

and i'm going to go ahead and suggest partition for that :-)

Login or Signup to reply.