Apply
Executes an external function for each tuple of a relation.
Usage
apply (indata, func-expr, new-attr : outdata)
Details
Each tuple of the relation indata is passed to the external
function specified in func-expr. The function's return value is
then joined back to the originating tuple as new-attr to produce
outdata (i.e., a dependent join). A func-expr is a call to the
external function; it names the attributes of the indata relation
(and any literal constants) that are to be passed as arguments.
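The dependent join described above can be sketched in plain Java. This is only an illustrative model, not Theseus code: the class ApplySketch, its apply method, and the list-of-lists representation of a relation are all assumptions made for the example. The source also does not say what happens when a function returns null, so dropping that tuple is likewise an assumption.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Function;

// Illustrative model only; none of these names come from Theseus.
class ApplySketch {

    // For each input tuple, call func and append its results as new attributes.
    static List<List<Object>> apply(List<List<Object>> indata,
                                    Function<List<Object>, ArrayList<Object>> func) {
        List<List<Object>> outdata = new ArrayList<>();
        for (List<Object> tuple : indata) {
            ArrayList<Object> results = func.apply(tuple);
            if (results == null) {
                continue; // assumption: a tuple with no result is dropped
            }
            List<Object> joined = new ArrayList<>(tuple); // original attributes
            joined.addAll(results);                       // plus the new ones
            outdata.add(joined);
        }
        return outdata;
    }

    public static void main(String[] args) {
        List<List<Object>> books = new ArrayList<>();
        books.add(Arrays.<Object>asList("Title1", "Author1"));
        books.add(Arrays.<Object>asList("Title2", "Author2"));

        // Stand-in for an external function: upper-cases the first attribute.
        List<List<Object>> out = apply(books, tuple -> {
            ArrayList<Object> r = new ArrayList<>();
            r.add(((String) tuple.get(0)).toUpperCase());
            return r;
        });
        for (List<Object> row : out) {
            System.out.println(row);
        }
    }
}
```

Each output tuple carries every input attribute followed by the function's results, which is why the example output later in this page lists the original columns first and new_pub_date last.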
Notes
- Functions should be written in Java and adhere to the following
  constraints:
  - Functions must be statically declared.
  - Functions must accept only Java Object input parameters and must
    return a java.util.ArrayList as a result. At runtime, the system
    does not interpret the values (they need only be Objects); it
    hands them to the function for casting and processing.
- The ArrayList returned by the function can contain any type of
  object. However, since some Theseus operators (such as Select)
  process data based on type, the system must characterize each
  returned value as CHAR, NUMBER, DATE, or DOM, with OTHER as the
  fallback. To do this, the following logic is used:
  - String Java types are cast as Theseus CHAR types.
  - Integer, Float, and Double Java types are cast as Theseus
    NUMBER types.
  - Date Java types are cast as Theseus DATE types.
  - W3C DOM (org.w3c.dom) Node and Document types are cast as
    Theseus DOM types.
  - All other types are cast as Theseus OTHER types.
- Function calls can accept multiple parameters (attributes) from
  the input relation indata; use commas to separate the parameters.
- Multiple results can be returned (i.e., multiple new attributes
  can be created): the ArrayList contains the values that will be
  matched to the comma-delimited list of new attribute names.
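As a rough illustration of the notes above, the following sketch shows a function that takes two attributes and returns two values, together with a helper that mirrors the type-characterization rules. The class MoreFunctions and the methods describe and theseusType are hypothetical names invented here; theseusType only restates the documented mapping and is not the system's actual implementation. Because the runtime representation of a number attribute is not documented, the sketch parses the argument's string form rather than casting it.

```java
import java.util.ArrayList;
import java.util.Date;
import org.w3c.dom.Node;

// Hypothetical example class; not part of Theseus. Declare it public in its
// own MoreFunctions.java if the runtime needs to load it by name.
class MoreFunctions {

    // Two input attributes in, two result values out.
    // ArrayList<Object> erases to java.util.ArrayList at runtime.
    public static ArrayList<Object> describe(Object a_title, Object a_pages) {
        String title = String.valueOf(a_title);
        // Assumption: the value may arrive as a String or as a numeric object.
        int pages = (int) Double.parseDouble(String.valueOf(a_pages).trim());

        ArrayList<Object> result = new ArrayList<>();
        result.add(title + " (" + pages + " pages)");   // String  -> CHAR
        result.add(Integer.valueOf((pages + 1) / 2));   // Integer -> NUMBER (sheets of paper)
        return result;
    }

    // Restates the documented mapping from Java types to Theseus types.
    static String theseusType(Object value) {
        if (value instanceof String) return "CHAR";
        if (value instanceof Integer || value instanceof Float
                || value instanceof Double) return "NUMBER";
        if (value instanceof Date) return "DATE";
        if (value instanceof Node) return "DOM"; // Document extends Node
        return "OTHER";
    }

    public static void main(String[] args) {
        for (Object value : describe("Title1", "34")) {
            System.out.println(value + " : " + theseusType(value));
        }
    }
}
```

In a plan, such a function would presumably be invoked with something like "MoreFunctions.describe(title, pages)" as func-expr and a comma-delimited new-attr such as "label, sheets"; the attribute names and the exact quoting of multiple names are assumptions, since this page shows only a single new attribute.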
Known Bugs
Example
Using the input:
RELATION books: title char, author char, pub_date date, pages number
Title1|Author1|09-01-1991|34
Title2|Author2|12-23-1954|479
Title3|Author2|05-09-2002|733
Title4|Author3|01-01-1968|32
Title5|Author2|07-03-2001|1152
and the externally compiled MyFunctions.java that contains:
import java.text.*;
import java.util.*;

public class MyFunctions
{
    // Parses a_dateStr using the pattern a_format1 and re-renders it using
    // the pattern a_format2. All arguments arrive as Objects and are cast
    // to String here. Returns a one-element ArrayList, or null on failure.
    public static ArrayList reformatDate (Object a_dateStr,
                                          Object a_format1, Object a_format2)
    {
        ArrayList result = null;
        try {
            String dateStr = (String)a_dateStr;
            String format1 = (String)a_format1;
            String format2 = (String)a_format2;
            SimpleDateFormat fmtr1 = new SimpleDateFormat(format1);
            SimpleDateFormat fmtr2 = new SimpleDateFormat(format2);
            java.util.Date theDate = fmtr1.parse(dateStr);
            result = new ArrayList();
            result.add(fmtr2.format(theDate));
        }
        catch (Exception e) {
            // Covers bad casts, invalid patterns, and unparseable dates.
            System.err.println("Error calling reformatDate("+
                a_dateStr+", "+a_format1+", "+a_format2+")");
            e.printStackTrace();
        }
        return result;
    }
}
when executing the plan:
PLAN test
{
INPUT: stream books
OUTPUT: stream result
BODY
{
apply (books, "MyFunctions.reformatDate(pub_date, 'MM-dd-yyyy', 'EEE d MMM yyyy HH:mm:ss Z')", "new_pub_date" : result)
}
}
generates the following output:
----------------------------------------------
RELATION: test_result
attrs: title, author, pub_date, pages, new_pub_date
----------------------------------------------
Title2|Author2|12-23-1954|479|Thu 23 Dec 1954 00:00:00 -0800
Title1|Author1|09-01-1991|34|Sun 1 Sep 1991 00:00:00 -0700
Title5|Author2|07-03-2001|1152|Tue 3 Jul 2001 00:00:00 -0700
Title4|Author3|01-01-1968|32|Mon 1 Jan 1968 00:00:00 -0800
Title3|Author2|05-09-2002|733|Thu 9 May 2002 00:00:00 -0700
----------------------------------------------
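The day names, month names, and offsets in new_pub_date depend on the JVM's defaults, because SimpleDateFormat constructed from a pattern alone uses the default locale and time zone. The -0800 and -0700 offsets above suggest a US Pacific default. The sketch below is a hypothetical harness (the class ReformatDateCheck and its extra zone parameter are not part of MyFunctions.java or Theseus) that pins both settings so the conversion can be checked outside a plan.

```java
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.ArrayList;
import java.util.Date;
import java.util.Locale;
import java.util.TimeZone;

// Hypothetical test harness; not part of Theseus or of MyFunctions.java.
class ReformatDateCheck {

    // Same conversion as reformatDate, but with the locale and time zone
    // pinned so the output does not depend on the machine's settings.
    static ArrayList<Object> reformat(Object a_dateStr, Object a_format1,
                                      Object a_format2, TimeZone zone) {
        ArrayList<Object> result = null;
        try {
            SimpleDateFormat in = new SimpleDateFormat((String) a_format1, Locale.US);
            SimpleDateFormat out = new SimpleDateFormat((String) a_format2, Locale.US);
            in.setTimeZone(zone);
            out.setTimeZone(zone);
            Date theDate = in.parse((String) a_dateStr);
            result = new ArrayList<>();
            result.add(out.format(theDate));
        } catch (ParseException | ClassCastException | IllegalArgumentException e) {
            // Unparseable date, non-String argument, or invalid pattern:
            // fall through and return null, as the original function does.
        }
        return result;
    }

    public static void main(String[] args) {
        TimeZone pacific = TimeZone.getTimeZone("America/Los_Angeles");
        String pattern = "EEE d MMM yyyy HH:mm:ss Z";
        System.out.println(reformat("12-23-1954", "MM-dd-yyyy", pattern, pacific));
        System.out.println(reformat("not-a-date", "MM-dd-yyyy", pattern, pacific));
    }
}
```

With the Pacific zone, the first call should reproduce the Title2 row's new_pub_date value; the second returns null because the input cannot be parsed.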