Attendees: Robert Lupton, K-T Lim, Bill Chickering,
            Mike Freemon, Serge Monkewitz, Jacek


High level summary
==================
  - application-level logging:
     - considering using log4cxx, plus one-line interface
       and configuration
     - swig it to python
  - distributed logging:
     - considering using Apache Flume
  - next steps:
    - build a prototype
    - evaluate
    - make final decision (this is probably SAT level)
    - expecting to have app-level prototype in late January,
      with Flume in February [Bill will build it]


Related reading
===============
https://dev.lsstcorp.org/trac/wiki/db/Qserv/Logging
(it will be updated shortly after these notes are sent out)


Detailed notes
==============

the wiki page does not cover our existing event system
  - it could be considered as an option for distributed logging
  - although it has many issues


google logging
  - automatically selects module name (we dislike it)
  - doesn't seem very popular


yes, pex_logging has optimizations and tries to minimize
computation when logging is turned off


pay attention to performance, e.g., ideally Robert would
like an option to compile out logging altogether


it'd be useful to define even more levels than log4cxx offers
(e.g., several levels within "debug") - need to research if
that is doable


Robert dislikes the API in pex_logging and wants something
as easy as printf, a la:

    LOG("a.b.c", n, message, args, ...);

   where "a.b.c" is the hierarchical component name, n is
   the severity, and args are printf/boost::format/stream
   style arguments for the message.
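
A minimal sketch of such a one-line call, written in Python against the
standard logging module as a stand-in for the eventual wrapper (the notes
propose log4cxx swigged to Python; nothing below is the real interface).
The LOG function name and its argument order simply mirror the line above.

```python
import logging


def LOG(name, level, message, *args):
    """One-line logging call: hierarchical name, severity, printf-style message."""
    logger = logging.getLogger(name)
    # Cheap check first: when the level is disabled, the message is never
    # formatted. (The caller still evaluates the argument expressions.)
    if logger.isEnabledFor(level):
        logger.log(level, message, *args)


if __name__ == "__main__":
    logging.basicConfig(level=logging.INFO,
                        format="%(name)s %(levelname)s %(message)s")
    LOG("a.b.c", logging.INFO, "processed %d rows in %.2f s", 42, 0.5)
    LOG("a.b.c", logging.DEBUG, "dropped at INFO level: %s", "detail")
```

The second call produces no output because "a.b.c" inherits the INFO
threshold from the root logger.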


need to be able to attach metadata to individual log messages


log4cxx directly vs wrapping?
  - log4cxx plus one-line simple format, and dealing with
    configuration


need to be able to dynamically turn on/off distributed logging
(isolate our software from distributed logging)
  - proposed model: configure system to produce log in log4j
    format, distributed logging simply consumes the data


same logging system from python and C++ or they can diverge?
  - same. It is a requirement!


logging from python
  - swig the c++ implementation
  - yes, even for simple, python-only admin tools


configuring logging
  - from c++: through api or a text file
  - from python: through a text file only
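
To illustrate text-file configuration with per-component levels, here is
a sketch using Python's own logging.config.fileConfig. This is an
assumption for illustration only: a log4cxx-based system would read a
log4j-style file instead, and the component name "qserv.master" is
hypothetical.

```python
import io
import logging
import logging.config

# Hypothetical configuration text; in practice this would live in a file.
CONFIG_TEXT = """
[loggers]
keys=root,master

[handlers]
keys=console

[formatters]
keys=plain

[logger_root]
level=WARNING
handlers=console

[logger_master]
level=DEBUG
handlers=
qualname=qserv.master
propagate=1

[handler_console]
class=StreamHandler
level=NOTSET
formatter=plain
args=(sys.stderr,)

[formatter_plain]
format=%(asctime)s %(name)s %(levelname)s %(message)s
"""


def configure_from_text(text):
    """Apply a logging configuration given as text (fileConfig accepts file-like objects)."""
    logging.config.fileConfig(io.StringIO(text), disable_existing_loggers=False)


configure_from_text(CONFIG_TEXT)
# Emitted: DEBUG is enabled for everything under qserv.master.
logging.getLogger("qserv.master.dispatch").debug("dispatching chunk query")
# Dropped: other components fall back to the root WARNING threshold.
logging.getLogger("qserv.worker").debug("not shown")
```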


two main uses for logging:
  - manual debugging
  - automated parsing of logs
    - want key/value for that
       - can configure log4cxx to write in json format
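
A rough sketch of what key/value output for automated parsing could look
like, again using Python's standard logging module as a stand-in rather
than log4cxx. The JsonFormatter class, the "metadata" convention for
attaching per-message key/value pairs, and the field names are all
assumptions made for this example.

```python
import json
import logging


class JsonFormatter(logging.Formatter):
    """Render each log record as one JSON object per line."""

    def format(self, record):
        entry = {
            "logger": record.name,
            "level": record.levelname,
            "message": record.getMessage(),
        }
        # Hypothetical convention: per-message metadata is passed as
        # extra={"metadata": {...}} and merged into the JSON object.
        entry.update(getattr(record, "metadata", {}))
        return json.dumps(entry, sort_keys=True)


if __name__ == "__main__":
    handler = logging.StreamHandler()
    handler.setFormatter(JsonFormatter())
    log = logging.getLogger("qserv.worker")  # hypothetical component name
    log.addHandler(handler)
    log.setLevel(logging.INFO)
    log.info("query finished", extra={"metadata": {"queryId": 17, "rows": 1024}})
```

One JSON object per line keeps the output usable for manual debugging
while letting a downstream consumer parse it without regular expressions.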


problem with existing event system
  - ingesting into rdbms slow, querying key/value in rdbms slow
  - it does not batch
  - issues with multi-threading
  - but using event system as a transport protocol is an option,
    it is fast, scales, reliable


sending log data to distributed logging system
  - through local files, or streaming directly
  - if local files: space mgmt issues (disks getting full)
    - but acts as a buffer, if distrib logging down, can resync
      and catch up
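
The "local file as a buffer" idea can be sketched as a small shipper that
remembers how far into the file it has read. This is illustrative only:
the LogFileShipper class and the send callback are hypothetical, and it
does not address the space management issue (no rotation or cleanup) or
persist its offset across restarts.

```python
import os


class LogFileShipper:
    """Ship new lines from a local log file, remembering how far it got."""

    def __init__(self, path):
        self.path = path
        self.offset = 0  # byte position of the first unshipped line

    def ship(self, send):
        """Send every complete unshipped line; return how many were sent.

        If send() raises (e.g. the collector is down), the offset stays at
        the failed line, so the next call resumes from there.
        """
        if not os.path.exists(self.path):
            return 0
        sent = 0
        with open(self.path, "rb") as f:
            f.seek(self.offset)
            while True:
                line = f.readline()
                if not line.endswith(b"\n"):
                    break  # end of file, or a partially written line
                send(line.decode("utf-8").rstrip("\n"))
                self.offset = f.tell()  # advance only after a successful send
                sent += 1
        return sent
```

While the distributed system is unavailable the application keeps
appending to the file; once it is back, a single ship() call catches up
on everything written in the meantime.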


Flume vs Kafka
  - Flume is simpler, easier to deploy than Kafka
  - Kafka focuses a lot on high throughput / high performance
  - both use ZooKeeper
  - Flume is not just for hdfs
  - both solid, well supported, Apache products
  --> prototype with Flume


if we use Flume, would we still use event system for monitoring?
Options:
  - use event system
  - do event monitoring in Flume
  - generate events from Flume and use event system after Flume


This discussion is all about application-level logging, not
system-level logging (done by NCSA)
  - NCSA has good tools that they use and like
  - Mike will send some info, if it looks interesting to us
    we will schedule a phone call to discuss


Jacek




On 12/17/2013 10:53 PM, Jacek Becla wrote:
> Hi,
>
> I've put together a sketch proposal for logging system
> based on earlier research done by our student Bill
> on distributed logging systems, plus a couple of
> hours of research/reading up about pex_logging,
> log4cxx, Flume and Kafka, see:
>
> https://dev.lsstcorp.org/trac/wiki/db/Qserv/Logging
>
> I'm planning to discuss it tomorrow (Wed) at 11:00
> pacific with Bill and K-T.
>
> If anyone is interested in joining, we'll keep
> the line open: 866 740 1260, pass 9268664.
>
> Jacek
>
