We won't have a quorum to officially discuss logging
with the rest of the DM team, so we will proceed with
a regular Qserv meeting. We will discuss logging and
will do a short demo at that meeting, so anyone who
wants to join should feel free to call in at:
866 740 1260, pass 9268664
We can set up a Hangout video call ad hoc if needed.

Jacek



On 03/12/2014 11:37 AM, Jacek Becla wrote:
> Hi all,
>
> I'd like to schedule a meeting about logging to continue
> the discussion we started back in December (see below if you
> don't recall).
>
> Bill has now built a log4cxx-based prototype which he can
> demo, so sometime in the next week or so feels like a good
> time to pick up the discussion. Could we possibly do it
> tomorrow at 1pm PDT, or some time Monday? (based on DM
> calendar, Robert is gone for quite some time starting Tuesday...)
>
> Those interested in the discussion, please fill out the poll at:
>
> http://doodle.com/an9yguusdhq8rzqb
>
> Jacek
>
>
>
>
>
> On 12/18/2013 01:57 PM, Jacek Becla wrote:
>> Attendees: Robert Lupton, K-T Lim, Bill Chickering,
>>               Mike Freemon, Serge Monkewitz, Jacek
>>
>>
>> High level summary
>> ==================
>>     - application-level logging:
>>        - considering using log4cxx, plus one-line interface
>>          and configuration
>>        - swig it to python
>>     - distributed logging:
>>        - considering using Apache Flume
>>     - next steps:
>>       - build a prototype
>>       - evaluate
>>       - make final decision (this is probably SAT level)
>>       - expecting to have app-level prototype in late January,
>>         with Flume in February [Bill will build it]
>>
>>
>> Related reading
>> ===============
>> https://dev.lsstcorp.org/trac/wiki/db/Qserv/Logging
>> (it will be updated shortly after these notes are sent out)
>>
>>
>> Detailed notes
>> ==============
>>
>> the wiki page does not cover our existing event system
>>     - it could be considered as an option for distributed logging
>>     - although it has many issues
>>
>>
>> google logging
>>     - automatically selects module name (we dislike it)
>>     - doesn't seem very popular
>>
>>
>> yes, pex_logging has optimizations and tries to minimize
>> computation if logging is turned off
>>
>>
>> pay attention to performance, e.g., ideally Robert would
>> like an option to compile out logging altogether
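To make the "minimize computation if logging is turned off" point concrete, here is a minimal sketch using Python's standard `logging` module as a stand-in (it is not pex_logging or the log4cxx prototype). The logger name `demo.perf` and the helper `expensive_summary` are hypothetical. Python has no preprocessor, so the compile-out option is not shown; in C++ that would presumably be a macro that expands to nothing under a build flag, which is an assumption and not something decided at the meeting.

```python
import logging

calls = {"count": 0}

def expensive_summary():
    """Hypothetical costly computation we only want to pay for when logging is on."""
    calls["count"] += 1
    return "summary"

log = logging.getLogger("demo.perf")   # hypothetical logger name
log.addHandler(logging.NullHandler())  # discard output, we only count calls
log.propagate = False
log.setLevel(logging.WARNING)          # DEBUG is turned off

# Guard the call so the argument is never computed while DEBUG is disabled.
if log.isEnabledFor(logging.DEBUG):
    log.debug("state: %s", expensive_summary())
disabled_calls = calls["count"]

log.setLevel(logging.DEBUG)            # DEBUG is turned on
if log.isEnabledFor(logging.DEBUG):
    log.debug("state: %s", expensive_summary())
enabled_calls = calls["count"]
```

The explicit level check is the runtime analogue of compiling logging out: the cost of a disabled message is one comparison, and the expensive argument is never evaluated.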
>>
>>
>> it'd be useful to define even more levels than log4cxx offers
>> (e.g., several levels within "debug") - need to research if
>> that is doable
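As an illustration of what "several levels within debug" could look like, here is a sketch with Python's standard `logging` module. It says nothing about whether log4cxx can do the same, which still needs the research mentioned above. The level names `DEBUG2`/`DEBUG3`, their numeric values, and the logger name are assumptions made for the example.

```python
import logging

# Hypothetical finer-grained levels below the standard DEBUG (10).
DEBUG1 = logging.DEBUG       # 10, the standard debug level
DEBUG2 = logging.DEBUG - 3   # 7
DEBUG3 = logging.DEBUG - 6   # 4
logging.addLevelName(DEBUG2, "DEBUG2")
logging.addLevelName(DEBUG3, "DEBUG3")

class ListHandler(logging.Handler):
    """Collects (level name, message) pairs in memory."""
    def __init__(self):
        super().__init__()
        self.records = []

    def emit(self, record):
        self.records.append((record.levelname, record.getMessage()))

log = logging.getLogger("demo.levels")  # hypothetical logger name
collector = ListHandler()
log.addHandler(collector)
log.propagate = False
log.setLevel(DEBUG2)                    # accept DEBUG and DEBUG2, drop DEBUG3

log.log(DEBUG1, "coarse")
log.log(DEBUG2, "finer")
log.log(DEBUG3, "finest")               # below the threshold, dropped
seen = collector.records
```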
>>
>>
>> Robert dislikes the API in pex_logging, wants something
>> as easy as printf, a la:
>>
>>       LOG("a.b.c", n, message, args, ...);
>>
>>      where "a.b.c" is the hierarchical component name, n is
>>      the severity, and args are printf/boost::format/stream
>>      style arguments for the message.
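A rough sketch of the one-line interface described above, written against Python's standard `logging` module purely for illustration. The real proposal is a C++ wrapper around log4cxx swigged to Python, so the function body here is an assumption; only the call shape `LOG(name, level, message, args...)` comes from the notes.

```python
import io
import logging

def LOG(name, level, message, *args):
    """One-line call: hierarchical component name, severity, printf-style args."""
    # The logging module applies the % formatting only if the record is emitted.
    logging.getLogger(name).log(level, message, *args)

# Usage: send everything under component "a" to an in-memory stream.
stream = io.StringIO()
handler = logging.StreamHandler(stream)
handler.setFormatter(logging.Formatter("%(name)s %(levelname)s %(message)s"))
parent = logging.getLogger("a")
parent.addHandler(handler)
parent.setLevel(logging.INFO)   # "a.b.c" inherits this threshold
parent.propagate = False

LOG("a.b.c", logging.WARNING, "chunk %d took %.1f s", 42, 3.5)
LOG("a.b.c", logging.DEBUG, "dropped, below the INFO threshold")
output = stream.getvalue()
```

The hierarchical name does the configuration work: setting a threshold on "a" covers "a.b.c" without the caller touching any logger objects.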
>>
>>
>> need to be able to attach metadata to individual log messages
>>
>>
>> log4cxx directly vs wrapping?
>>     - log4cxx plus one-line simple format, and dealing with
>>       configuration
>>
>>
>> need to be able to dynamically turn on/off distributed logging
>> (isolate our software from distributed logging)
>>     - proposed model: configure system to produce log in log4j
>>       format, distributed logging simply consumes the data
>>
>>
>> same logging system from python and C++ or they can diverge?
>>     - same. It is a requirement!
>>
>>
>> logging from python
>>     - swig the c++ implementation
>>     - yes, even for simple, python-only admin tools
>>
>>
>> configuring logging
>>     - from c++: through api or a text file
>>     - from python: through a text file only
>>
>>
>> two main uses for logging:
>>     - manual debugging
>>     - automated parsing of logs
>>       - want key/value for that
>>          - can configure log4cxx to write in json format
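For the automated-parsing use case, a key/value record per message might look like the sketch below. It uses Python's standard `logging` and `json` modules as a stand-in and does not show the log4cxx JSON configuration. The `JsonFormatter` class, the `metadata` key passed through `extra`, and the field names are all assumptions for the example. It also shows one way per-message metadata could be attached.

```python
import io
import json
import logging

class JsonFormatter(logging.Formatter):
    """Hypothetical formatter that writes each record as one JSON object."""
    def format(self, record):
        payload = {
            "logger": record.name,
            "level": record.levelname,
            "message": record.getMessage(),
        }
        # Per-message metadata, if the caller supplied any via `extra`.
        payload.update(getattr(record, "metadata", {}))
        return json.dumps(payload, sort_keys=True)

stream = io.StringIO()
handler = logging.StreamHandler(stream)
handler.setFormatter(JsonFormatter())
log = logging.getLogger("demo.json")   # hypothetical logger name
log.addHandler(handler)
log.propagate = False
log.setLevel(logging.INFO)

log.info("query finished", extra={"metadata": {"chunk": 42, "elapsed_s": 3.5}})
parsed = json.loads(stream.getvalue())  # what a downstream parser would see
```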
>>
>>
>> problem with existing event system
>>     - ingesting into rdbms slow, querying key/value in rdbms slow
>>     - it does not batch
>>     - issues with multi-threading
>>     - but using event system as a transport protocol is an option,
>>       it is fast, scales, reliable
>>
>>
>> sending log data to distributed logging system
>>     - through local files, or streaming directly
>>     - if local files: space mgmt issues (disks getting full)
>>       - but acts as a buffer, if distrib logging down, can resync
>>         and catch up
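The "local file acts as a buffer" idea can be sketched as a shipper that remembers how far it has read and resumes from that offset after an outage. This is a toy illustration with hypothetical names (`ship_new_lines`, `app.log`), not how Flume or any other collector is implemented.

```python
import os
import tempfile

def ship_new_lines(log_path, offset):
    """Return (complete lines appended since `offset`, new offset)."""
    with open(log_path, "rb") as f:
        f.seek(offset)
        data = f.read()
    # Only ship complete lines; a partially written line waits for the next pass.
    end = data.rfind(b"\n") + 1
    lines = data[:end].decode("utf-8").splitlines()
    return lines, offset + end

with tempfile.TemporaryDirectory() as tmpdir:
    path = os.path.join(tmpdir, "app.log")

    with open(path, "a") as f:
        f.write("msg 1\nmsg 2\n")
    first, offset = ship_new_lines(path, 0)

    # Collector is "down": the application keeps appending to the local file.
    with open(path, "a") as f:
        f.write("msg 3\nmsg 4\n")

    # Collector is back: resume from the saved offset and catch up.
    second, offset = ship_new_lines(path, offset)
```

The space-management concern from the notes still applies: something has to rotate or truncate the file once the shipped offset has moved past old data.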
>>
>>
>> Flume vs Kafka
>>     - Flume is simpler, easier to deploy than Kafka
>>     - Kafka focuses a lot on high throughput / high performance
>>     - both use ZooKeeper
>>     - Flume is not just for hdfs
>>     - both solid, well supported, Apache products
>>     --> prototype with Flume
>>
>>
>> if we use Flume, would we still use event system for monitoring?
>> Options:
>>     - use event system
>>     - do event monitoring in Flume
>>     - generate events from Flume and use event system after Flume
>>
>>
>> This discussion is all about application level logging, not for
>> system level logging (done by NCSA)
>>     - NCSA has good tools that they use and like
>>     - Mike will send some info, if it looks interesting to us
>>       we will schedule a phone call to discuss
>>
>>
>> Jacek
>>
>>
>>
>>
>> On 12/17/2013 10:53 PM, Jacek Becla wrote:
>>> Hi,
>>>
>>> I've put together a sketch proposal for a logging system
>>> based on earlier research done by our student Bill
>>> on distributed logging systems, plus a couple of
>>> hours of research/reading up about pex_logging,
>>> log4cxx, Flume and Kafka, see:
>>>
>>> https://dev.lsstcorp.org/trac/wiki/db/Qserv/Logging
>>>
>>> I'm planning to discuss it tomorrow (Wed) at 11:00
>>> pacific with Bill and K-T.
>>>
>>> If anyone is interested in joining, we'll keep
>>> the line open: 866 740 1260, pass 9268664.
>>>
>>> Jacek
>>>
>>
>
