Script Configuration

From OpenTextMining

The gen_otmi.rb script uses a configuration file to set (1) the working directories and (2) the runtime options.

  1. The working directories are:
    • Base Directory
      • BASE_DIR - base directory for other directories
    • Input Directories
      • XML_DIR - directory for input XML files
      • REFS_DIR - directory for references info
    • Output Directories
      • LOG_DIR - directory for log files produced
      • OTMI_DIR - directory for output OTMI files
  2. The runtime options are:
    • STOPLIST - remove stopwords listed in OTMI_STOP file
    • TESTING - run in test mode
    • UNICODE - use Unicode character-set pattern matching
    • WILDCARD - process all files matching '*.xml' in XML_DIR

This is the default config file loaded by gen_otmi.rb:

# Config file 'otmi_config.rb'
#
# This is the default config file
#

# Base Directory
BASE_DIR = "."

# Input Directories
XML_DIR = "#{BASE_DIR}/xml"
REFS_DIR = "#{BASE_DIR}/refs"

# Output  Directories
LOG_DIR = "#{BASE_DIR}/log"
OTMI_DIR = "#{BASE_DIR}/otmi"

# Options
STOPLIST = false
TESTING = false
UNICODE = true
WILDCARD = false
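
The four input and output directories are derived from BASE_DIR by string interpolation, so changing BASE_DIR alone relocates all of them. A minimal sketch of that behaviour follows; the path "/data/otmi" is a hypothetical value chosen for illustration and does not come from the OTMI distribution.

```ruby
# Hypothetical alternate base directory; only this line differs from the default.
BASE_DIR = "/data/otmi"

# Input directories (derived from BASE_DIR)
XML_DIR = "#{BASE_DIR}/xml"
REFS_DIR = "#{BASE_DIR}/refs"

# Output directories (derived from BASE_DIR)
LOG_DIR = "#{BASE_DIR}/log"
OTMI_DIR = "#{BASE_DIR}/otmi"

puts XML_DIR   # prints /data/otmi/xml
puts OTMI_DIR  # prints /data/otmi/otmi
```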

To change the default run configuration, supply an alternate config file. gen_otmi.rb looks for this file in the following order:

  1. command-line argument ('--config' or '-c')
  2. environment variable OTMI_CONF
  3. file 'otmi.conf' in the current directory

If none of these is found, gen_otmi.rb falls back to the preloaded default config file listed above.
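
The search order above can be sketched in a few lines of Ruby. This is an illustration and not the code in gen_otmi.rb: the helper name locate_config is hypothetical, and the sketch assumes that the path follows '--config' or '-c' as a separate argument.

```ruby
# Hypothetical helper (not part of gen_otmi.rb): returns the path of the
# alternate config file, or nil when the built-in defaults should be used.
def locate_config(argv, env = ENV, dir = Dir.pwd)
  # 1. Command-line argument: '--config PATH' or '-c PATH'
  #    (assumption: the path is passed as the next argument)
  idx = argv.index { |arg| arg == "--config" || arg == "-c" }
  return argv[idx + 1] if idx && argv[idx + 1]

  # 2. Environment variable OTMI_CONF
  from_env = env["OTMI_CONF"]
  return from_env if from_env && !from_env.empty?

  # 3. File 'otmi.conf' in the current directory
  candidate = File.join(dir, "otmi.conf")
  return candidate if File.file?(candidate)

  # Nothing found: the caller falls back to the preloaded defaults
  nil
end

puts locate_config(["-c", "etc/otmi.conf"], {})
```

Returning nil instead of a default path keeps the "no alternate file" case explicit, so the caller decides how to load the preloaded configuration.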

The configuration can be further modified with individual command-line arguments (see Script Command Line). The run uses the configuration assembled from the configuration file and the command-line arguments. Note also that a directory can be 'turned off' by passing the special '0' argument to its directory switch; see Script Command Line for further details.
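
One way to picture how the final configuration is assembled is a merge in which command-line values win over config-file values. The sketch below is a guess at that logic, not the script's actual code: the names apply_overrides and DIRECTORY_KEYS are hypothetical, the actual switch names are documented in Script Command Line, and representing a 'turned off' directory as nil is an assumption.

```ruby
# Hypothetical list of directory settings that accept the special '0' value.
DIRECTORY_KEYS = [:xml_dir, :refs_dir, :log_dir, :otmi_dir].freeze

# Returns a new hash in which command-line overrides replace the values
# read from the config file. A directory override of "0" turns that
# directory off (modelled here as nil, which is an assumption).
def apply_overrides(config, overrides)
  merged = config.dup
  overrides.each do |key, value|
    if DIRECTORY_KEYS.include?(key) && value == "0"
      merged[key] = nil
    else
      merged[key] = value
    end
  end
  merged
end

from_file = { xml_dir: "./xml", log_dir: "./log", wildcard: false }
from_cli  = { log_dir: "0", wildcard: true }
p apply_overrides(from_file, from_cli)
```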

This is an example of the run configuration as logged by gen_otmi.rb and written to the run log file (e.g. gen_otmi.log.1184145517):

### BEGIN CONFIG ###

Config File:
  conf_file = /Users/tony/My Developer/otmi/etc/otmi.conf

Base Directory:
  base_dir = /Users/tony/My Developer/otmi/test

Input Directories:
  xml_dir = /Users/tony/My Developer/otmi/test/xml
  refs_dir = /Users/tony/My Developer/otmi/test/refs

Output Directories:
  log_dir = /Users/tony/My Developer/otmi/test/log
  otmi_dir = /Users/tony/My Developer/otmi/test/otmi

Options:
  stoplist = false
  testing = false
  unicode = true    ($ascii = false)
  wildcard = true

### END CONFIG   ###
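
To check which settings a past run actually used, the logged block can be read back into a hash. The following is a minimal sketch that assumes the log layout is exactly as shown above ('key = value' lines between the BEGIN and END markers, with an optional trailing parenthetical note); parse_logged_config is a hypothetical helper and is not part of the OTMI scripts. The sample text is an excerpt of the log above.

```ruby
# Hypothetical helper: extracts 'key = value' pairs from the config block
# of a run log. Values are kept as strings; paths may contain spaces.
def parse_logged_config(text)
  config = {}
  inside = false
  text.each_line do |line|
    if line.start_with?("### BEGIN CONFIG")
      inside = true
    elsif line.start_with?("### END CONFIG")
      inside = false
    elsif inside && (m = line.match(/^\s*(\w+) = (.*?)(?:\s+\(.*\))?$/))
      # The optional group drops a trailing note such as '($ascii = false)'
      config[m[1]] = m[2].strip
    end
  end
  config
end

SAMPLE_LOG = <<~'LOG'
  ### BEGIN CONFIG ###

  Config File:
    conf_file = /Users/tony/My Developer/otmi/etc/otmi.conf

  Options:
    unicode = true    ($ascii = false)
    wildcard = true

  ### END CONFIG   ###
LOG

p parse_logged_config(SAMPLE_LOG)
```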