
3. Installation

3.1 System Requirements  
3.2 Linux Installation  
3.3 Windows Installation  
3.4 Database Configuration  



3.1 System Requirements

To install and use KOJAK you'll approximately need the following amounts of disk space:

This version of KOJAK comes in C++ and Java versions and supports Linux (C++ and Java) as well as Windows (Java only). Linux distributions such as RedHat or Suse as well as Windows 2000 and XP should support it out of the box. It also requires access to a database system such as MySQL (4.0 or later) and/or Oracle with appropriate ODBC and/or JDBC drivers and driver managers.

KOJAK was primarily developed and tested under Linux RedHat 8.0 and 9.0 as well as Suse 9.2 with MySQL 4.1.0-alpha and MyODBC 3.51.11. We successfully tested it with both the iODBC 3.0.6 and UnixODBC 2.2.9 driver managers. We also successfully ran it with Oracle 10g using the Oracle JDBC Thin Client as well as an experimental open-source Oracle ODBC driver for Linux from http://fndapl.fnal.gov/~dbox/oracle/odbc/.

The closer your environment is to some combination of the components listed above, the higher the chances that things will work right out of the box. This release is new and fairly complex, so there is a definite chance of problems when installing it in a different environment.

The C++ version of KOJAK can be about 1.5 times faster than the Java version; however, the actual speedup depends significantly on how much time is spent on database access, which is the same for C++ and Java. If most time is spent waiting for the database server, elapsed run times of the C++ and Java versions will be very similar. The higher speed of the C++ version may come at the cost of a more difficult installation of the necessary ODBC driver support.



3.2 Linux Installation

Both the C++ and Java versions of KOJAK can be run on Linux platforms such as Suse Linux 9.2 or RedHat 9.0. To install KOJAK under Linux choose an installation location and then uncompress and untar the file `kojak-X.Y.Z.tar.gz' (or unzip the file `kojak-X.Y.Z.zip') in the parent directory of that location. `X.Y.Z' is a placeholder for the actual version number. For example:

 
% cd install-dir
% tar xzf kojak-2.2.0.tar.gz

This will create the KOJAK tree in the directory `kojak-X.Y.Z/' (`install-dir/kojak-2.2.0/' in the example). All pathnames mentioned below will be relative to that directory which we will usually refer to as the "KOJAK directory".
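As a minimal sketch, an installation script could derive the name of the directory that unpacking creates from the archive name. The function name kojak_dir_name is hypothetical and not part of the KOJAK distribution; it only strips the `.tar.gz' or `.zip' suffix described above.

```shell
# Hypothetical helper (not shipped with KOJAK): derive the name of the
# KOJAK directory from the name of a distribution archive.
kojak_dir_name() {
    name=${1##*/}          # drop any leading directory components
    name=${name%.tar.gz}   # strip the tarball suffix, if present
    name=${name%.zip}      # strip the zip suffix, if present
    printf '%s\n' "$name"
}
```

For the archive used in the example above, `kojak_dir_name kojak-2.2.0.tar.gz` prints kojak-2.2.0, which is the directory to change into after unpacking.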

Both the C++ and Java versions are already precompiled and should be ready to use. The C++ version is compiled with the iODBC driver manager library iODBC 3.0.6 and the Java version was compiled with Java J2SDK 1.4.2. If that matches your local setup you are done and can go on to 3.4 Database Configuration. If you want to use a different ODBC driver manager such as UnixODBC go on to 3.2.1 Recompiling the C++ Sources. If you want to use a different version of Java or have it installed in a non-standard location go on to 3.2.2 Java Configuration for Linux.

To run KOJAK use the run-kojak script and supply c++ or java as the first argument to select which version should be run (the C++ version is run by default if neither c++ nor java is supplied). For example:

 
% cd <KOJAK directory> 
% run-kojak c++ -c config/example3-no-db.dat
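The selection rule just described can be sketched as a small shell function. This is only an illustration of the documented behavior under Linux; the function name select_kojak_version is hypothetical and the actual run-kojak script may be implemented differently.

```shell
# Hypothetical sketch of the version selection described above; the real
# run-kojak script may be implemented differently.
select_kojak_version() {
    case "$1" in
        c++|java) printf '%s\n' "$1" ;;    # explicit choice as first argument
        *)        printf '%s\n' "c++" ;;   # Linux default: the C++ version
    esac
}
```

With this rule, a call without a leading c++ or java argument, such as `select_kojak_version -c config/example3-no-db.dat`, selects the C++ version.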

There are actually two script files: run-kojak for Linux and run-kojak.bat for Windows. When you execute the run-kojak command the OS will automatically select the appropriate script.

Note that for most of the configuration files shipping with KOJAK you need to have a database server and appropriate drivers set up first before you can run them (see section 3.4 Database Configuration).

A Java-based GUI is currently in a prototype stage and will ship with one of the next versions of KOJAK. This GUI will make it easier to edit configuration files and run KOJAK.

3.2.1 Recompiling the C++ Sources  
3.2.2 Java Configuration for Linux  



3.2.1 Recompiling the C++ Sources

If you want to use a different ODBC driver manager or a different compiler or compiler settings, you can recompile KOJAK with the help of the top-level `Makefile'. Edit any of the variables in the `Makefile' if necessary and then call make from the KOJAK directory to recompile the system. Note that all the provided source code was generated automatically from STELLA sources (see http://www.isi.edu/isd/LOOM/Stella/index.html for information on the STELLA programming language).

If you want to use KOJAK with the UnixODBC driver manager instead of the iODBC manager it ships with, you need to recompile it by doing the following:

  1. Edit the top-level `Makefile' in the KOJAK directory so that its CFLAGS and LDFLAGS will point to the appropriate locations for your UnixODBC installation (the include files provided in the top-level `include' directory come from iODBC but should work for UnixODBC as well). Also edit the value of the ODBC-LIB variable to use -lodbc as its value.

  2. Run make in the KOJAK directory which will recompile it with the UnixODBC libraries on your machine. This will require GNU make as well as a recent version of g++ such as g++ 3.3.4.

  3. If your UnixODBC library is installed in a non-standard place, you will need to update LD_LIBRARY_PATH in the run-kojak script accordingly.
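Before running make it can help to confirm that the required tools are on the command path. The following is a minimal sketch; the function name check_build_tools is hypothetical and nothing like it ships with KOJAK.

```shell
# Hypothetical pre-build check (not part of KOJAK): report any of the
# named tools that cannot be found on the current command path.
check_build_tools() {
    missing=0
    for tool in "$@"; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            echo "missing build tool: $tool" >&2
            missing=1
        fi
    done
    return "$missing"    # 0 if every tool was found, 1 otherwise
}
```

A typical use would be `check_build_tools make g++ && make` from the KOJAK directory. Whether the installed make is GNU make can be checked separately with `make --version`, which GNU make answers with a line starting with "GNU Make".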



3.2.2 Java Configuration for Linux

The Java version of KOJAK is already pre-compiled for Java J2SDK 1.4.2 and archived in the `kojak.jar' archive which can be found in `native/java/lib/'. KOJAK should run without any problems under Java 1.4. Newer (or slightly older) versions of Java (e.g., the new version 1.5) should work as well, but we have not tested them.

The default Java configuration in the `run-kojak' script looks for a java executable in the current command path. If you want to use a different version of Java or if it is not installed in the standard path, please edit the JAVA variable in the `run-kojak' script accordingly.
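To find out which Java version a given executable provides before editing the JAVA variable, a sketch along the following lines may help. The function names are hypothetical; the only assumption about Java itself is that `java -version` writes a line containing version "..." to standard error.

```shell
# Hypothetical helpers (not part of KOJAK).

# Extract the quoted version string from `java -version` style output
# read on standard input, e.g. 1.4.2_05 from: java version "1.4.2_05"
parse_java_version() {
    sed -n 's/.*version "\([^"]*\)".*/\1/p' | head -n 1
}

# Print the version reported by the given java executable
# (defaults to the java found on the command path).
java_version_of() {
    "${1:-java}" -version 2>&1 | parse_java_version
}
```

Calling `java_version_of` without arguments reports on the java found in the command path; passing a full path to another java executable reports on that installation instead.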



3.3 Windows Installation

The Java version of KOJAK now supports Windows operating systems such as Windows 2000 and Windows XP (the C++ version is not yet available for Windows). To install KOJAK under Windows choose an installation location (e.g., `C:\Program Files\') and then unzip the file `kojak-X.Y.Z.zip' in that location using a utility such as WinZip. If you use a different utility, make sure it translates line endings in text files from the Unix to the Windows convention. `X.Y.Z' is a placeholder for the actual version number. This will create the KOJAK tree in the folder `kojak-X.Y.Z\' (e.g., `C:\Program Files\kojak-2.2.0\'). All pathnames mentioned below will be relative to that directory, which we will usually refer to as the "KOJAK directory".

For most of this manual we will use Unix syntax for pathnames. To translate those into appropriate Windows pathnames simply substitute \ for the Unix / pathname separator. IMPORTANT: if you supply a physical Windows pathname somewhere in a KOJAK command that takes a quoted string as an argument, you will need to double the \ character, since it is also the escape character for strings (but you don't have to do that for pathnames in configuration files). For example:

 
(load-kojak-configuration :config-file "C:\\kojak\\myconfig.dat")
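If such commands are generated by a script, the doubling can be automated. The following is a minimal sketch for a Unix-style shell; the function name escape_backslashes is hypothetical and not part of KOJAK.

```shell
# Hypothetical helper (not part of KOJAK): double every backslash in a
# Windows pathname so it can be embedded in a quoted KOJAK string.
escape_backslashes() {
    printf '%s\n' "$1" | sed 's/\\/\\\\/g'
}
```

Given the single-quoted argument 'C:\kojak\myconfig.dat', the function prints the doubled form used in the example above.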

The Java version is already precompiled with Java J2SDK 1.4.2 and should now be ready to use. If that matches your local setup you are done and can go on to 3.4 Database Configuration. If you want to use a different version of Java or have it installed in a non-standard location go on to 3.3.1 Java Configuration for Windows.

To run KOJAK you have to launch a Command Prompt window and then run the run-kojak.bat script supplying java as the first argument (in fact, the Java version is run by default under Windows unless the first argument is c++). For example:

 
C:\> cd <KOJAK directory> 
C:\...> run-kojak java -c config\example3-no-db.dat

There are actually two script files: run-kojak for Linux and run-kojak.bat for Windows. When you execute the run-kojak command the OS will automatically select the appropriate script.

Note that for most of the configuration files shipping with KOJAK you need to have a database server and appropriate drivers set up first before you can run them (see section 3.4 Database Configuration).

A Java-based GUI is currently in a prototype stage and will ship with one of the next versions of KOJAK. This GUI will make it easier to edit configuration files and run KOJAK without using a Command Prompt window.

3.3.1 Java Configuration for Windows  



3.3.1 Java Configuration for Windows

The Java version of KOJAK is already pre-compiled for Java J2SDK 1.4.2 and archived in the `kojak.jar' archive which can be found in `native\java\lib\'. KOJAK should run without any problems under Java 1.4. Newer (or slightly older) versions of Java (e.g., the new version 1.5) should work as well, but we have not tested them.

The default Java configuration in the `run-kojak.bat' script looks for a java executable in the current command path. If you want to use a different version of Java or if it is not installed in the standard path, please edit the %JAVA% variable in the `run-kojak.bat' script accordingly.



3.4 Database Configuration

The Group Finder uses a relational database server such as MySQL and/or Oracle to store group and membership hypotheses, configuration information and analysis metadata, as well as to import evidence data if necessary. Additionally, the Group Finder can work directly with information stored in existing Oracle or MySQL databases (given an appropriate mapping specification) without having to translate the whole database or load it into memory.

While it is possible to run KOJAK without access to a database server (see for example the configuration file `config/example3-no-db.dat') this is not recommended for analyzing large datasets. If KOJAK analyzes a large dataset without a database server, it needs to load all data into main memory which might be prohibitive depending on dataset size. If it can use a database server, it will be able to load data very selectively and off-load a lot of aggregation and processing to the database which greatly improves scalability and reduces the memory footprint.

If you want to use MySQL as the database server, make sure you have a recent version of MySQL 4.0 or later installed or available on a server (see http://www.mysql.com/). Alternatively, you can use an Oracle database server such as Oracle 10g (see http://www.oracle.com/). Earlier versions of Oracle are likely to work as well, but we have only tested KOJAK with version 10g so far. KOJAK can also work with MySQL and Oracle simultaneously, for example, to store the internal KOJAK database under MySQL and access evidence from an Oracle server.

3.4.1 KOJAK DB and EDB Schema  
3.4.2 ODBC and Driver Managers  
3.4.3 JDBC Drivers  



3.4.1 KOJAK DB and EDB Schema

The Group Finder relies on a set of database tables we call the "KOJAK database" to store group and membership hypotheses and various configuration information, and to import evidence data from flat files. Before you can run KOJAK with a database you have to create these KOJAK database tables in your database server.

The KOJAK database tables should ideally reside in a separate schema to avoid conflict with or loss of existing information. Under MySQL we create a new KOJAK database for this purpose; with Oracle we create a new KOJAK user to hold this schema. It is, however, possible to add these tables directly to an existing schema if so desired. In this case it is very important to ensure that there are no preexisting tables with the same names in the schema where they are added.

3.4.1.1 MySQL Setup  
3.4.1.2 Oracle Setup  



3.4.1.1 MySQL Setup

To create the KOJAK database under MySQL use the following steps in Linux (you will need the appropriate privileges to create these schema objects). For Windows users the steps should be similar, just use the appropriate Windows MySQL client:

 
% cd <KOJAK directory>
% mysql -u <dbuser> -p
mysql> source kbs/kojak-db-schema-mysql.sql;
mysql> source kbs/kojak-edb-schema-mysql.sql;

Make sure you load the two files exactly in that order. The first file will create the new KOJAK database and add the various KOJAK hypothesis and configuration tables. If you don't want to create a new KOJAK database but instead want to add the tables to an existing database, you need to edit the database creation and use commands in `kbs/kojak-db-schema-mysql.sql' accordingly before you load it.
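The same steps can be sketched non-interactively. The function below is hypothetical and not shipped with KOJAK; it feeds both schema files, in the required order, to a single mysql client session so that a database selected by the first file stays in effect for the second, as it does when both files are sourced from one interactive session. The MYSQL_CLIENT variable is an assumption introduced here so that a different client command can be substituted.

```shell
# Hypothetical helper (not part of KOJAK): load both MySQL schema files
# in the required order within one client session.
#   $1 = KOJAK directory, $2 = database user
load_kojak_schema_mysql() {
    db_schema="$1/kbs/kojak-db-schema-mysql.sql"
    edb_schema="$1/kbs/kojak-edb-schema-mysql.sql"
    for f in "$db_schema" "$edb_schema"; do
        if [ ! -f "$f" ]; then
            echo "missing schema file: $f" >&2
            return 1
        fi
    done
    # KOJAK DB schema first, then the EDB schema; -p prompts for the password.
    cat "$db_schema" "$edb_schema" | ${MYSQL_CLIENT:-mysql} -u "$2" -p
}
```

Called as `load_kojak_schema_mysql <KOJAK directory> <dbuser>`, it refuses to start if either file is missing, which avoids loading only half of the schema.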



3.4.1.2 Oracle Setup

To create the KOJAK database under Oracle use the following steps (you will need appropriate SYSDBA privileges to create these schema objects). First edit the file `kbs/kojak-db-schema-oracle.sql' to use the appropriate password information for user KOJAK. Alternatively, you can have a DBA create the KOJAK user for you and simply comment out or delete the user creation statements in the script. Under Linux you can use the SQL*Plus tool to load these scripts, for example:

 
% cd <KOJAK directory>
% sqlplus /nolog
SQL> connect sys as sysdba
SQL> start kbs/kojak-db-schema-oracle.sql
SQL> start kbs/kojak-edb-schema-oracle.sql

Alternatively (or for Windows users), you can paste the content of these files into the iSQL*Plus client of the Oracle Enterprise Manager. If you don't want to create a new KOJAK user but instead want to add the tables to an existing schema, you need to edit the user creation and connect commands in `kbs/kojak-db-schema-oracle.sql' accordingly before you load it.



3.4.2 ODBC and Driver Managers

If you want to use the C++ version of KOJAK (currently only supported for Linux) you need to have an appropriate ODBC driver (e.g., MyODBC 3.51 if you are using MySQL) and an ODBC driver manager (such as iODBC 3.0.6 or UnixODBC 2.2.9) installed on the machine where you run KOJAK.

Visit http://www.openlinksw.com/iodbc/ (or http://www.unixodbc.org/) for information on ODBC drivers and driver managers for Linux. The C++ version of KOJAK is precompiled with the iODBC driver manager library `libiodbc.so.2.1.6'. It might also be necessary to uninstall conflicting versions such as MyODBC-2.50 and UnixODBC, which come pre-installed with some flavors of Linux. If you want to use KOJAK with the UnixODBC driver manager instead, you need to recompile it with the different libraries (see section 3.2.1 Recompiling the C++ Sources).

If you are using MySQL you will need to install the MyODBC 3.51 driver (see http://dev.mysql.com/downloads/connector/odbc/3.51.html for more information). If you are using Oracle you can use a free open-source ODBC driver for Oracle under Linux available from http://fndapl.fnal.gov/~dbox/oracle/odbc/. This is an unsupported alpha release (version 0.5.5) that takes some work to install, but we have used this version successfully. There is also a somewhat expensive commercial driver available from Easysoft http://www.easysoft.com/ which is a better alternative; however, due to its cost we have not yet experimented with this driver.

3.4.2.1 .odbc.ini File  



3.4.2.1 .odbc.ini File

After you have ensured the installation of proper ODBC drivers and driver managers for your database server, copy the provided `.odbc.ini' file to your home directory (editing the local copy in the KOJAK directory will not have any effect!) and adapt the [kojak] entry with the appropriate access information (user, password, driver, server, etc.). Data source specifications given to KOJAK will use this information as defaults for fields that are not provided in ODBC connection strings specified in the configuration file.
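The copy step can be sketched as follows. The function name install_odbc_ini is hypothetical and not part of KOJAK; as a precaution that the text above does not require, it keeps a backup of any `.odbc.ini' that already exists in the target directory.

```shell
# Hypothetical helper (not part of KOJAK): copy the provided .odbc.ini
# into a home directory, keeping a backup of any existing file.
#   $1 = KOJAK directory, $2 = target directory (defaults to $HOME)
install_odbc_ini() {
    src="$1/.odbc.ini"
    dest="${2:-$HOME}/.odbc.ini"
    if [ ! -f "$src" ]; then
        echo "no .odbc.ini found in $1" >&2
        return 1
    fi
    if [ -f "$dest" ]; then
        cp -p "$dest" "$dest.bak" || return 1   # preserve the old settings
    fi
    cp "$src" "$dest"
}
```

After running `install_odbc_ini <KOJAK directory>`, the [kojak] entry still has to be edited by hand in the copy in your home directory.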



3.4.3 JDBC Drivers

If you are running the Java version of KOJAK it will need to use JDBC to communicate with database servers. KOJAK currently supports the following two JDBC driver class implementations:

  1. com.mysql.jdbc.Driver for use with MySQL
  2. oracle.jdbc.driver.OracleDriver for use with Oracle (Thin Client)

KOJAK currently ships with various driver implementations of these classes (jar files located in directory `native/java/lib/'). For MySQL we use `mysql-connector-java-3.0.17-ga-bin.jar' and for Oracle we use `ojdbc14.jar' which is the Java 1.4 driver from the Oracle 10g distribution. These driver files are communicated to KOJAK via the environment variables in the `run-kojak' and `run-kojak.bat' scripts. If you want/need to use a different JDBC driver (for one or both of the driver classes shown above), you can do so by editing the appropriate variables in the `run-kojak' and/or `run-kojak.bat' scripts. You only need to supply a driver for the database system you are using, i.e., if you are only using MySQL you don't need to supply an Oracle driver and vice versa.

For MySQL a set of newer, alternative drivers is also shipped in the `native/java/lib/' directory, since different versions of MySQL seem to need different JDBC drivers. These drivers are `mysql-connector-java-3.1.10-bin.jar' and `mysql-connector-java-3.2.0-alpha-bin.jar'. If you experience problems with the default JDBC driver for MySQL, try substituting one of these alternatives and see whether it fixes the problem.
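When substituting a different driver jar it can be useful to confirm that it actually contains one of the two driver classes listed above. The following sketch assumes the unzip utility is installed; both function names are hypothetical and not part of KOJAK.

```shell
# Hypothetical helpers (not part of KOJAK).

# Convert a Java class name into the entry name it has inside a jar,
# e.g. com.mysql.jdbc.Driver -> com/mysql/jdbc/Driver.class
class_to_jar_entry() {
    printf '%s.class\n' "$(printf '%s' "$1" | tr . /)"
}

# Succeed if the jar file $1 lists an entry for the class $2.
# Assumes the unzip utility is available on the command path.
jar_has_class() {
    unzip -l "$1" 2>/dev/null | grep -qF "$(class_to_jar_entry "$2")"
}
```

For instance, `jar_has_class native/java/lib/ojdbc14.jar oracle.jdbc.driver.OracleDriver` run from the KOJAK directory is expected to succeed if that jar provides the Oracle driver class named above.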



This document was generated by Hans Chalupsky on October, 30 2007 using texi2html