Groupby
Groups data by one or more attributes and/or computes aggregate measures over the groups.
Usage
groupby (indata, group-expr : outdata)
Details
Tuples of the relation indata are grouped by the
attributes listed in group-expr, and each aggregate
function in group-expr is evaluated once per group.
The resulting outdata contains only those attributes
and aggregate function results specified by group-expr.
Notes
- Aggregate functions should follow the same format
specified in the description of the
Aggregate operator.
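As a rough illustration of that format, the sketch below adds one more function modeled on the sum and average functions in the example on this page: a public static method that takes an ArrayList of a group's values and returns a double. The class name MoreFunctions and the method max are hypothetical, and the signature is only inferred from that example; the description of the Aggregate operator remains the authority on the required format.

```java
import java.util.ArrayList;

public class MoreFunctions
{
    // Hypothetical aggregate: largest value in a group.
    // Signature assumed from the sum/average example on this page
    // (public static, raw ArrayList in, double out).
    public static double max (ArrayList a_data)
    {
        // An empty group yields negative infinity.
        double max = Double.NEGATIVE_INFINITY;
        for (int i = 0; i < a_data.size(); i++) {
            // Parse each value from its string form, as sum() does in the example.
            double value = Double.parseDouble(a_data.get(i).toString());
            if (value > max) {
                max = value;
            }
        }
        return max;
    }
}
```

Following the pattern of the example plan, such a function would presumably be referenced in group-expr as MoreFunctions.max(pages) max_pages.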
Known Bugs
- Functions cannot accept literals as arguments.
Example
Using the input:
RELATION books: title char, author char, pub_date date, pages number
Title1|Author1|09-01-1991|34
Title2|Author2|12-23-1954|479
Title3|Author2|05-09-2002|733
Title4|Author3|01-01-1968|32
Title5|Author2|07-03-2001|1152
and the externally compiled MyFunctions.java that contains:
import java.util.ArrayList;

public class MyFunctions
{
    // Arithmetic mean of the group's values (NaN for an empty group).
    public static double average (ArrayList a_data)
    {
        return MyFunctions.sum(a_data) / ((double) a_data.size());
    }

    // Sum of the group's values, each parsed from its string form.
    public static double sum (ArrayList a_data)
    {
        double sum = 0;
        for (int i = 0; i < a_data.size(); i++) {
            sum += Double.parseDouble(a_data.get(i).toString());
        }
        return sum;
    }
}
when executing the plan:
PLAN test
{
    INPUT: stream books
    OUTPUT: stream result
    BODY
    {
        groupby (books, "author, MyFunctions.sum(pages) sum_pages, MyFunctions.average(pages) avg_pages" : result)
    }
}
generates the following output:
----------------------------------------------
RELATION: test
attrs: author, sum_pages, avg_pages
----------------------------------------------
Author2|2364.0|788.0
Author3|32.0|32.0
Author1|34.0|34.0
----------------------------------------------
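To make the arithmetic in this output easy to check, the following standalone sketch reproduces the grouping in plain Java, outside the plan language. It is not the operator's implementation: the class GroupBySketch and its method groupByAuthor are hypothetical names introduced here, and the sketch simply groups the pages values by author and computes the same sum and average that MyFunctions does.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class GroupBySketch
{
    // Groups "title|author|pub_date|pages" rows by author and returns
    // one "author|sum_pages|avg_pages" line per group.
    public static List<String> groupByAuthor (List<String> rows)
    {
        // LinkedHashMap keeps groups in first-seen order; the order the
        // real operator emits groups in may differ.
        Map<String, List<Double>> groups = new LinkedHashMap<>();
        for (String row : rows) {
            String[] fields = row.split("\\|");
            groups.computeIfAbsent(fields[1], k -> new ArrayList<>())
                  .add(Double.parseDouble(fields[3]));
        }

        List<String> out = new ArrayList<>();
        for (Map.Entry<String, List<Double>> e : groups.entrySet()) {
            double sum = 0;
            for (double pages : e.getValue()) {
                sum += pages;
            }
            double avg = sum / e.getValue().size();
            out.add(e.getKey() + "|" + sum + "|" + avg);
        }
        return out;
    }

    public static void main (String[] args)
    {
        List<String> rows = Arrays.asList(
            "Title1|Author1|09-01-1991|34",
            "Title2|Author2|12-23-1954|479",
            "Title3|Author2|05-09-2002|733",
            "Title4|Author3|01-01-1968|32",
            "Title5|Author2|07-03-2001|1152");
        for (String line : groupByAuthor(rows)) {
            System.out.println(line);
        }
    }
}
```

For Author2 the sketch adds 479 + 733 + 1152 = 2364 and divides by 3 to get 788, matching the row above. The rows come out in first-seen order (Author1, Author2, Author3), which differs from the order shown in the operator's output.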