Chukwa Change Log

Trunk (unrelease changes)

Release 0.1.2 - 2009-05-14

  NEW FEATURES

    CHUKWA-236. Added migration script for moving database schema for Chukwa 0.1.1 to Chukwa 0.1.2. (Eric Yang)

    CHUKWA-100. Added ability to run aggregation by command line utility. (Eric Yang)

    CHUKWA-78. Added down sample SQL aggregation for job data, task data and utilization data. (Eric Yang)

    CHUKWA-17. Collect PS output every 10 minutes and parse into key/value pairs sequence file. (Cheng Zhang via Eric Yang)

    CHUKWA-55. Aggregate data from HDFS bytes usage, and Mapreduce Job slot time to compute user usage of the cluster. (Eric Yang)

    CHUKWA-50. Added parser for extract Job History and Job Conf into key value pairs, 
    and database loader dictionary file. (Contribute by Cheng Zhang via Eric Yang)

    CHUKWA-69.  Calculate trapezoid area for a given series of data.  (Contribute by Cheng Zhang via Eric Yang)

    CHUKWA-62.  Add script to start tailing files.  (asrabkin)
    
    CHUKWA-12.  Add instrumention Api for Chukwa components.  (Jerome Boulon via asrabkin)

    CHUKWA-14.  Added permalink support to HICC GUI. (Eric Yang)

    CHUKWA-45.  Added docs to folder structure. (Corinne Chandel via asrabkin)

    CHUKWA-13.  Create a new Macro class for macro substitution.
                Changed Aggregator to use Macro class.
                Added macro substitution support to Chukwa Charting.
                Added Macro test case. (Eric Yang)

    HADOOP-4989. Added capability to add scatter chart. (Eric Yang)

  IMPROVEMENTS

    CHUKWA-226. Changed HDFS usage collection frequency from 10 minutes to 60 minutes. (Cheng Zhang via Eric Yang)

    CHUKWA-219. Improved usability of host selector, and updated caching for host list. (Eric Yang)

    CHUKWA-104. Remove permission setting code for Log4JMetricsContext and ChukwaDailyRollingFileAppender.  (Jerome Boulon via Eric Yang)

    CHUKWA-178. Extend aggregation by 5 minutes for cluster metrics. (Eric Yang)

    CHUKWA-176. Rearrange the parameter order for aggregator.sh. (Eric Yang)

    CHUKWA-174. Added test cases to test database partitioning, database aggregation, and data loading. (Eric Yang)

    CHUKWA-173. Parameterize configuration, and enable substitution at build time. (Jerome Boulon via Eric Yang)

    CHUKWA-131. Added additional Mapred job/task metrics.  (Eric Yang)

    CHUKWA-145. Tuned hadoop parameters for demux. (Jerome Boulon via Eric Yang)

    CHUKWA-163. Updated reference to mdl.xml file. (Terence Kwan via Eric Yang)

    CHUKWA-157. Added javadoc target, api-xml, api-report, and change log 2 html. (Eric Yang)

    CHUKWA-141. Updated Chukwa Docs overview page for 0.1.2. (Corinne Chandel via Eric Yang)

    CHUKWA-138. Updated Chukwa Admin Guide. (Corinne Chandel via Eric Yang)

    CHUKWA-140. Updated Chukwa Docs tab for 0.1.2. (Corinne Chandel via Eric Yang)

    CHUKWA-112. Updated README file.  (Corinne Chandel via Eric Yang)

    CHUKWA-134. Add release audit target. (Giridharan Kesavan via Eric Yang)

    CHUKWA-80. Extracted rpm spec file from build.xml file.  Fix start up script, and config script. (Eric Yang)

    CHUKWA-26. DemuxManager, ArchiveManager and PostProcessorManager are now a single daemon process each.
               Each one working independently from others, as soon as something is available.
               Start-data-processor is now using those new daemons instead of pocessSink.sh
               Daily will process a daily compaction only when all hourly would have been done.
               Demux is now able to send NSCA commands to Nagios. (Jerome Boulon via Eric Yang)

    CHUKWA-81. Fix file descriptor leak.  (asrabkin)

    CHUKWA-29. Added TaskTracker and DataNode client trace log parser and database loader.  (Chris Douglas via Eric Yang)

    CHUKWA-86. Improved JDBC compatibility layer (Cheng Zhang via Eric Yang)

    CHUKWA-64. Changed version number to 0.1.2, and release number to alpha. (Eric Yang)

    CHUKWA-59. Collect HDFS Usage information for users. (Contribute by Cheng Zhang via Eric Yang)

    CHUKWA-54. Filters web based input from HICC to prevent cross site scripting attack. (Eric Yang)

    CHUKWA-31. Added duration logging for SQL statements. (Eric Yang)
 
    HADOOP-5373. Collectors track lifetime-received chunks. (asrabkin)

    HADOOP-5370. Collectors don't write empty sink files (asrabkin)

    HADOOP-5228. Chukwa tests shouldn't write to /tmp. (asrabkin)

    HADOOP-5035. Improved Y axis ticker labelling.
                 Used TreeMap to build non-time series data for charting. 
                 Improved handling of "not a number "values. (Eric Yang)

    HADOOP-5030. Changed RPM install location to the value specified by build.properties file. (Eric Yang)

    HADOOP-5029. Added mdl script to manually load chukwa sequence file to database. (Eric Yang)

    HADOOP-4843. Enable job history log file streaming into Chukwa by using JobTrackerInstrumentation API. (Eric Yang)

    HADOOP-4827. Replace Consolidator with Aggregator macros 
                 Updated database table schema 
                 Fix interval for aggregation execute 
                 Improved SQL macros support, add group_avg() and past_*_minutes key word (Eric Yang)

  OPTIMIZATIONS

  BUG FIXES

    CHUKWA-250. Updated hadoop jar reference in chukwa config. (Cheng Zhang via Eric Yang)

    CHUKWA-211. Manage symlink correctly for RPM upgrade and uninstall. (Eric Yang)

    CHUKWA-245. Includes ChukwaJobTrackerInstrumentation class as part of the input tools compilation. (Eric Yang)

    CHUKWA-232. Fixed classpath for chukwa core jar file. (Eric Yang)

    CHUKWA-215. Corrected postProcess.sh environment setup. (Jerome Boulon via Eric Yang)

    CHUKWA-223. Renamed read only session caching object to cache.*, and skip those values for caching. (Eric Yang)

    CHUKWA-239. Demux settings now work out of the box. (Ari Rabkin)

    CHUKWA-238.  Resolve race condition in archiving. (Ari Rabkin)

    CHUKWA-243. Set execute permission on bin scripts.  (Ari Rabkin)

    CHUKWA-228.  Added rpm.hdfsusage.uid option to run HDFS usage as a separate user. (Cheng Zhang via Eric Yang)

    CHUKWA-227. Improved the user privilege for reading HDFS usage.  (Cheng Zhang via Eric Yang)

    CHUKWA-229. Fix descriptor leak for ExecPlugin. (Ari Rabkin via Eric Yang)

    CHUKWA-212. Fix file descriptor leak in MDL.  (Jerome Boulon via Eric Yang)

    CHUKWA-220. Correct min, max settings for yaxis charting. (Terence Kwan via Eric Yang)

    CHUKWA-103. Fix documentation broken link. (Eric Yang)

    CHUKWA-222. Reduce Sar version dependency. (Ari Rabkin)

    CHUKWA-208. Skip copying of test/sample directory if it doesn't exist. (Eric Yang)

    CHUKWA-206. Removed hard coded path from configuration files, and sample data. (Eric Yang)

    CHUKWA-201. Removed sourcing of Timeline widget from simile.mit.edu. (Eric Yang)

    CHUKWA-193. Remove unrecognized tag from Job History Log file parser. (Cheng Zhang via Eric Yang)

    CHUKWA-189. Added JAVA_LIBRARY_PATH to chukwa-env.sh for enabling compression. (Jerome Boulon via Eric Yang)

    CHUKWA-186. Remove duplicated timestamp from aggregation script. (Eric Yang)

    CHUKWA-187. Correction to status for job and task status. (Cheng Zhang via Eric Yang)

    CHUKWA-183. Added HICC startup script. (Eric Yang)

    CHUKWA-177. Added test case for verify the value between JSON values and database values. (Terence Kwan via Eric Yang)

    CHUKWA-181. Changed chukwa check point file to $CHUKWA_LOG_DIR. (Jerome Boulon via Eric Yang)

    CHUKWA-182. Added nagios appender configuration. (Eric Yang)

    CHUKWA-158. Consolidate namenode address to a single configuration object. (Eric Yang)

    CHUKWA-168. Added watchdog for database. (Eric Yang)

    CHUKWA-175.  Removed error message for shutting down data processors. (Eric Yang)

    CHUKWA-155.  Store final job status only (Cheng Zhang via Eric Yang)

    CHUKWA-164.  Use year corresponding to the sender time stamp. (Cheng Zhang via Eric Yang)

    CHUKWA-166.  Handle null parameter case for XSSFilter. (Terence Kwan via Eric Yang)

    CHUKWA-156.  Test Macro testcase changed to use timestamp check for the generated macros. (Eric Yang)

    CHUKWA-154.  Handle adaptor exception, close file pointers on failure condition.  (Jerome Boulon via Eric Yang)

    CHUKWA-139.  Rewrite collector bail out code.  (Cheng Zhang via Eric Yang)

    CHUKWA-119.  Removed dependency of ChukwaAgent from ChunkImpl for preventing multiple 
                 MetricsContext to be initialized in the same VM. (Jerome Boulon via Eric Yang)

    CHUKWA-121.  Added logic to detect partition number less than 0.  (Eric Yang)

    CHUKWA-132.  Handle multi-line output in Job History file more gracefully. (Cheng Zhang via Eric Yang)

    CHUKWA-124.  Fixed JDBC connection initialization and closing.  Added escapeQuotes utility method. (Eric Yang)

    CHUKWA-122. Updated logic for finding time partition for Macro.  (Eric Yang)

    CHUKWA-125. Updated chukwa demux template to include JobHistory parser. (Eric Yang)

    CHUKWA-98.  Added Daemon watcher to capture signal for pid file management. (Cheng Zhang via Eric Yang)

    CHUKWA-120.  Added the missing commons-cli library. (Jerome Boulon via Eric Yang)

    CHUKWA-61.  Added report widget for accounting information. (Eric Yang)

    CHUKWA-77.  Added Database Schema for Hadoop accounting information. (Eric Yang)

    CHUKWA-92.  AbstractMetricsContext was using the wrong value (Jerome Boulon via asrabkin)

    CHUKWA-70.  Rewrite FileAdaptor.  (Jerome Boulon via asrabkin)

    CHUKWA-93.  Fix NPE in SeqFileWriter.  (Jiaqi Tan via asrabkin)

    CHUKWA-1. Remove lzo job configuration from Chukwa data processors. (Contribute by Jerome Boulon via Eric Yang)

    CHUKWA-40. Check for null pointer exception before unregister adaptor. (Contribute By Jerome Boulon via Eric Yang)

    CHUKWA-75. Filter out iostat values that is greater than 1+e10.

    CHUKWA-47. ChukwaJobTrackerInstrumentation class extends JobTrackerMetricsInst and added 
               finalizedJob method for stop streaming job history log file. (Jerome Boulon via Eric Yang)

    CHUKWA-66. Corrected TerminatorThread logger class name. (Eric Yang)

    CHUKWA-65. Redirect metrics log file to CHUKWA_LOG_DIR, and setup read/writable permission on the metrics
               log files. (Eric Yang)


    CHUKWA-73. Added Socket Timeout 60 seconds. (Jerome Boulon via Eric Yang)

    CHUKWA-58. Changed watchdog to look for CHUKWA_PID_DIR. (Eric Yang)

    CHUKWA-43.  ChukwaLog4jAppender should send the current file offset instead of sending 0.  (Jerome Boulon via asrabkin)

    CHUKWA-41.  ChukwaLog4jAppender does not escape \n for exception.  (Jerome Boulon via asrabkin)

    CHUKWA-28.  Late initalization of log4j adaptor.  (Jerome Boulon via asrabkin)

    CHUKWA-48.  Cleanup code to resolve compiler warnings.  (asrabkin)

    CHUKWA-9.  MetricDataLoader should close JDBC connection.  (Jerome Boulon via asrabkin)

    CHUKWA-37.  Remove ChuckwaArchiveBuilder. (asrabkin)

    CHUKWA-33.  Reformat code to fit hadoop style.  (asrabkin)
 
    CHUKWA-8.  Remove deprecated conf files.  (eyang via asrabkin)

    CHUKWA-11.  Remove non .template files from conf dir. (asrabkin)

    HADOOP-5409. Changed opt directory copy to a optional operation, if opt exist. (Eric Yang)

    HADOOP-5138. Fixing failing test cases by separating TestCharFileTailing
    and TestFileTailing adaptors. (Jerome Boulon via acmurthy)

    HADOOP-5401. Standardize control port conf option name as chukwaAgent.control.port (asrabkin)

    HADOOP-5087. Fix regex for command parsing. (asrabkin)

    HADOOP-5057. Better test coverage for missing checkpoint cases. (asrabkin)

    HADOOP-5055. Changed alert.conf location from $CHUKWA_HOME/conf/alert.conf to $CHUKWA_CONF_DIR/alert.conf. (Eric Yang)

    HADOOP-5054. Improved database partitioning by date. (Eric Yang)

    HADOOP-5051. * Added macro token subsitution for sum(table_name) 
                 * Added correct hdfs throughput aggregation SQL macros. (Eric Yang)

    HADOOP-5032. Export CHUKWA_CONF_DIR in chukwa-config.sh.

    HADOOP-4960. Changed metrics time from current system time to data source time.

    HADOOP-4959. Support parsing of top output for Redhat EL 5.1. (Eric Yang)

    HADOOP-4916. * Added external property file to reference location and ownership of chukwa 
                 * Added ability to control the user name to run chukwa (Eric Yang)

    HADOOP-4914. Added description fields to chukwa init.d scripts.

    HADOOP-4889. Move chown from post install phase to build phase of the RPM file. (Eric Yang)

    HADOOP-4884. Change date format from dd/mm/yyyy to yyyy/mm/dd for display in the chart tool tips. (Eric Yang)

    HADOOP-4860. Changed test cases into 3 different test classes to prevent Agent from loading chukwa_check_point file which interfered with test cases. (Eric Yang)

    HADOOP-4825. Replaced jps with ps ax for shutdown scripts.
                 Clean up reference of jar file from top level instead of build directory. (Eric Yang)

    HADOOP-4805. Removed black list collector feature from Chukwa Agent HTTP Sender. (Eric Yang)

    HADOOP-4796. Change component unit test targets from "ant test-agent" to "ant -Dtestcase=TestAgent test". (Eric Yang)

    HADOOP-4791. Add build configuration parameter to specify where Chukwa will be installed for RPM packaging. (Eric Yang)


Release 0.1.1 - Unreleased

  IMPROVEMENTS

    HADOOP-4431. Add versionning/tags to Chukwa Chunk. 
    (Jerome Boulon via Johan)

    HADOOP-4433. Improve data loader for collecting metrics and log files.
    (Eric Yang via omalley)

    HADOOP-5205.  Change the value of CHUKWA_IDENT_STRING from demo to TODO-AGENTS-INSTANCE-NAME in chukwa-env.sh.template (Jerome Boulon via asrabkin)

    HADOOP-5033.  Simplified ChukwaWriter API.  (asrabkin)

  NEW FEATURES

    HADOOP-3719. Initial checkin of Chukwa, which is a data collection and 
    analysis framework. (Jerome Boulon, Andy Konwinski, Ari Rabkin, 
    and Eric Yang)
