Script Command Line
From OpenTextMining
The script gen_otmi.rb has a number of command-line options, which can be listed by adding the flag '--help' (or the short form '-h') to the command line. (Note that running the script with no arguments also produces this help listing.)
% gen_otmi.rb --help
Purpose:
    This script produces an OTMI file for each xml_file listed
Usage:
    $ gen_otmi.rb [option...] [xml_file...]
Examples:
    $ gen_otmi.rb -s file_1.xml file_2.xml
    $ gen_otmi.rb -wt
    $ gen_otmi.rb -r ./refs -x ./xml -w
General Options:
    -h, --help                  print help test and exit
    -v, --version               print version and exit
    -c, --config CONF_FILE      locate config file
Set Options:
    -a, --ascii                 use ASCII char set in pattern match
    -s, --stoplist              remove stopwords listed in OTMI_STOP file
    -t, --testing               run in test mode
    -u, --unicode               use Unicode char set pattern match
    -w, --wildcard              wildcard matches all '*.xml' in XML_DIR
Set Base Directory:
    -b, --base-dir BASE_DIR     override BASE_DIR in config file
Set Input Directories:
    -x, --xml-dir XML_DIR       override XML_DIR in config file
    -r, --refs-dir REFS_DIR     override REFS_DIR in config file
Set Output Directories:
    -l, --log-dir LOG_DIR       override LOG_DIR in config file
    -o, --otmi-dir OTMI_DIR     override OTMI_DIR in config file
Note that a special directory argument ('0') is supported, which has the effect of cancelling or 'zeroing' that directory. In practice, this means that the following hold:
--xml-dir (-x) 0    # no input (XML) files are read, and so no output (OTMI) files are produced
--refs-dir (-r) 0   # no refs files are read from REFS_DIR, and so no refs are added to the output (OTMI) files
--log-dir (-l) 0    # no logging is performed
--otmi-dir (-o) 0   # no output (OTMI) files are produced
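The '0' convention above can be illustrated with a minimal Ruby sketch built on the standard OptionParser library. This is not the gen_otmi.rb source: the helper names resolve_dir and parse_dirs, and the choice of nil to represent a cancelled directory, are assumptions made for illustration. Only the switch names are taken from the help listing.

```ruby
require 'optparse'

# Hypothetical helper: map a directory argument to a path, or to nil when
# the special value '0' cancels ("zeroes") that directory.
def resolve_dir(arg)
  arg == '0' ? nil : arg
end

# Hypothetical parser for the directory switches. The switch names mirror
# the help listing; the structure is an assumption, not the real script.
def parse_dirs(argv)
  dirs = {}
  parser = OptionParser.new do |opts|
    opts.on('-x', '--xml-dir XML_DIR')   { |d| dirs[:xml]  = resolve_dir(d) }
    opts.on('-r', '--refs-dir REFS_DIR') { |d| dirs[:refs] = resolve_dir(d) }
    opts.on('-l', '--log-dir LOG_DIR')   { |d| dirs[:log]  = resolve_dir(d) }
    opts.on('-o', '--otmi-dir OTMI_DIR') { |d| dirs[:otmi] = resolve_dir(d) }
    opts.on('-w', '--wildcard')          { dirs[:wildcard] = true }
  end
  files = parser.parse(argv) # returns the remaining non-option arguments
  [dirs, files]
end

dirs, files = parse_dirs(%w[-l 0 -r 0 -o otmi -x xml -w])
puts "logging disabled" if dirs.key?(:log) && dirs[:log].nil?
puts "refs disabled"    if dirs.key?(:refs) && dirs[:refs].nil?
```

In this sketch a key that is present with a nil value means the directory was explicitly cancelled, while an absent key means the switch was not given, so a caller could still fall back to the config file value.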
So, a minimal run of the script (using short argument switches) would be:
% gen_otmi.rb -l 0 -r 0 -o otmi -x xml -w
All input (XML) files from the XML_DIR (directory 'xml') are processed and output (OTMI) files are written out to the OTMI_DIR (directory 'otmi'). No logging is performed. No references are processed.
Likewise, consider the following script invocation (using long argument switches for the directories):
% gen_otmi.rb --log-dir log --refs-dir 0 --otmi-dir 0 -w
All input (XML) files from the XML_DIR (specified by the configuration) are processed. No output (OTMI) files are written out. No references are processed. Log files are written out to the LOG_DIR (directory 'log').
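The interplay between the config file and the command-line overrides in the invocation above can be sketched in Ruby as follows. The CONFIG_DIRS values and the effective_dirs helper are hypothetical, chosen only to illustrate the override-then-cancel behaviour described here. The actual defaults come from the config file and are not shown in this page.

```ruby
# Hypothetical config values standing in for the real config file.
CONFIG_DIRS = { xml: 'xml', refs: 'refs', log: 'log', otmi: 'otmi' }.freeze

# Hypothetical helper: command-line values override the config, and the
# special value '0' cancels a directory (represented here as nil).
def effective_dirs(config, overrides)
  config.merge(overrides).transform_values { |d| d == '0' ? nil : d }
end

# Mirrors: gen_otmi.rb --log-dir log --refs-dir 0 --otmi-dir 0 -w
dirs = effective_dirs(CONFIG_DIRS, { log: 'log', refs: '0', otmi: '0' })
dirs.each { |name, path| puts "#{name}: #{path || '(disabled)'}" }
```

Here XML_DIR keeps its configured value because no '--xml-dir' override was given, while the refs and OTMI directories are cancelled.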
