Monthly Report for April 2007

 

 

Project No.: 260671

Funding Agency: Defense Logistics Agency - Defense Technical Information Center

Award No.: SP4700-05-P-0148

Project Title: Tools for Automatic Extraction of Metadata from DTIC Electronic Documents Collections - Phase II

Project Period: 09/19/05 - 04/30/07

 

Work Accomplishments during period

 

        Wrote templates for subclasses (1 line of authors, 2 lines of corporate authors)

        Wrote postprocessors for dates (stripping off the prefix) and for breaking up personal authors and corporate authors

        Completed stress testing for the integrated version

        Working on thread and timeout integration

        Noted a problem with unexpected exceptions being written to the console and solved it

        Updated the README file

        Submitted a paper on validation to STEV 2007

        Changed the code by taking idm10 to clean 10 out of the loop; using a string buffer was the main source of the performance increase. Processing 170 files with 4 templates now takes 3 min (instead of 15 min).

        Removed extra logging messages and made the remaining ones more meaningful

        The testbed for the demo can be found at /home/dtic/dtic_testbed/dticNew2k/testbedfordemo

        Updated the validation spec to include the abstract

        Resolved the case sensitivity issue

        The config file path can now be given as an argument (it was previously hard-coded to c:\dticsoftware).

        Removed the title template

        The latest software with all these updates can be found at /home/dtic/newsoftware
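
The performance change noted above (moving work out of the loop and building output with a string buffer) can be illustrated with a minimal sketch. It assumes the code is Java, which this report does not state. The class name LoopHoistSketch, the helper makePrefix, and the sample data are hypothetical and do not come from the project code. StringBuilder is used here as the unsynchronized counterpart of StringBuffer.

```java
import java.util.Arrays;
import java.util.List;

public class LoopHoistSketch {

    // Hypothetical stand-in for a loop-invariant step that does not
    // depend on the current line.
    static String makePrefix(int templateCount) {
        return "[t" + templateCount + "] ";
    }

    // Slow pattern: the invariant step runs on every pass, and each
    // String concatenation copies the whole accumulated text.
    static String concatSlow(List<String> lines, int templateCount) {
        String out = "";
        for (String line : lines) {
            String prefix = makePrefix(templateCount); // recomputed each pass
            out = out + prefix + line + "\n";
        }
        return out;
    }

    // Faster pattern: the invariant step is hoisted out of the loop and
    // the output is appended to a single growable buffer.
    static String concatFast(List<String> lines, int templateCount) {
        String prefix = makePrefix(templateCount); // computed once
        StringBuilder out = new StringBuilder();
        for (String line : lines) {
            out.append(prefix).append(line).append('\n');
        }
        return out.toString();
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList("title", "author", "date");
        // Both versions must produce identical output.
        System.out.println(concatSlow(lines, 4).equals(concatFast(lines, 4)));
    }
}
```

Both methods return the same text; only the amount of repeated work differs. Actual gains depend on input size and on how costly the hoisted step is.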

 

Problem Areas and Corrective Actions

 

None

 

Deviations in Cost/Schedule

 

Tutorial and workshop delayed until final delivery in Phase 3

 

Work to be Accomplished Next Period

1) Repeat form/stress test using NASA documents

2) Prepare for final reports to management