This is a second attempt to organize a meeting about logging.
Those interested in attending, please let me know your
availability through:
http://doodle.com/sby9zh26di86nm9g
We will provide updated documentation describing
the latest version of the prototype late tonight.
Thanks,
Jacek
>
> On 03/12/2014 11:37 AM, Jacek Becla wrote:
>> Hi all,
>>
>> I'd like to schedule a meeting about logging to continue
>> the discussion we started back in December (see below if you
>> don't recall).
>>
>> Bill has now built a log4cxx-based prototype which he can
>> demo, so sometime in the next week or so feels like a good
>> time to pick up the discussion. Could we possibly do it
>> tomorrow at 1pm PDT, or some time Monday? (based on DM
>> calendar, Robert is gone for quite some time starting Tuesday...)
>>
>> Those interested in the discussion, please fill out the poll at:
>>
>> http://doodle.com/an9yguusdhq8rzqb
>>
>> Jacek
>>
>>
>>
>>
>>
>> On 12/18/2013 01:57 PM, Jacek Becla wrote:
>>> Attendees: Robert Lupton, K-T Lim, Bill Chickering,
>>> Mike Freemon, Serge Monkewitz, Jacek
>>>
>>>
>>> High level summary
>>> ==================
>>> - application-level logging:
>>>   - considering using log4cxx, plus one-line interface
>>>     and configuration
>>>   - swig it to python
>>> - distributed logging:
>>>   - considering using Apache Flume
>>> - next steps:
>>>   - build a prototype
>>>   - evaluate
>>>   - make final decision (this is probably SAT level)
>>>   - expecting to have app-level prototype in late January,
>>>     with Flume in February [Bill will build it]
>>>
>>>
>>> Related reading
>>> ===============
>>> https://dev.lsstcorp.org/trac/wiki/db/Qserv/Logging
>>> (it will be updated shortly after these notes are sent out)
>>>
>>>
>>> Detailed notes
>>> ==============
>>>
>>> the wiki page does not cover our existing event system
>>> - it could be considered as an option for distributed logging
>>> - although it has many issues
>>>
>>>
>>> google logging
>>> - automatically selects module name (we dislike it)
>>> - doesn't seem very popular
>>>
>>>
>>> yes, pex_logging has optimizations and tries to minimize
>>> computation if logging is turned off
>>>
>>>
>>> pay attention to performance, e.g., ideally Robert would
>>> like an option to compile out logging altogether
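One way the "compile out" option might be done is a build flag that turns the logging macro into a no-op. This is only a sketch: the flag SKETCH_NO_LOGGING is hypothetical (defined inline here so the example stands alone; it would normally come from the compiler command line), and the LOG macro is a stand-in that writes to stderr, not the pex_logging or log4cxx interface.

```cpp
#include <cassert>
#include <cstdio>

// Hypothetical build flag; normally passed as -DSKETCH_NO_LOGGING.
#define SKETCH_NO_LOGGING

#ifdef SKETCH_NO_LOGGING
// Expands to a no-op: the arguments disappear at preprocessing time,
// so they are never evaluated and no logging code is generated.
#define LOG(name, level, ...) ((void)0)
#else
// Stand-in implementation: prints "<name> <level>: <message>" to stderr.
#define LOG(name, level, ...)                             \
    do {                                                  \
        std::fprintf(stderr, "%s %d: ", (name), (level)); \
        std::fprintf(stderr, __VA_ARGS__);                \
        std::fputc('\n', stderr);                         \
    } while (0)
#endif

// Counts how often the "expensive" argument is actually computed.
static int g_evaluations = 0;
inline int costly() { ++g_evaluations; return 7; }

// A loop with a log statement in it; with the flag set the body is empty.
inline void hotLoop(int n) {
    assert(n >= 0);
    for (int i = 0; i < n; ++i) {
        LOG("a.b.c", 10, "i=%d costly=%d", i, costly());
    }
}
```

With the flag defined, calling hotLoop never runs costly(); removing the #define restores the log output without touching the call sites.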
>>>
>>>
>>> it'd be useful to define even more levels than log4cxx offers
>>> (e.g., several levels within "debug") - need to research if
>>> that is doable
>>>
>>>
>>> Robert dislikes the api in pex_logging, wants something
>>> as easy as printf, a la:
>>>
>>> LOG("a.b.c", n, message, args, ...);
>>>
>>> where "a.b.c" is the hierarchical component name, n is
>>> the severity, and args are printf/boost::format/stream
>>> style arguments for the message.
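A minimal sketch of what such a one-line interface could look like, assuming printf-style arguments and a plain in-memory sink rather than log4cxx. The names LVL_*, g_thresholds, g_sink, effectiveThreshold, logEnabled and logWrite are made up for illustration and are not part of pex_logging or log4cxx. The threshold for "a.b.c" is inherited from "a.b", then "a", then the root, and the macro checks it before the message arguments are evaluated, so a disabled message costs only the lookup.

```cpp
#include <cassert>
#include <cstdarg>
#include <cstdio>
#include <map>
#include <string>
#include <vector>

// Hypothetical severity values for this sketch; higher is more severe.
constexpr int LVL_DEBUG = 10;
constexpr int LVL_INFO = 20;
constexpr int LVL_WARN = 30;

// Per-component thresholds; the empty name stands for the root.
static std::map<std::string, int> g_thresholds = {{"", LVL_INFO}};
// In-memory sink so the sketch has no external dependencies.
static std::vector<std::string> g_sink;

// Walks "a.b.c" -> "a.b" -> "a" -> root and returns the first threshold set.
inline int effectiveThreshold(std::string name) {
    for (;;) {
        auto it = g_thresholds.find(name);
        if (it != g_thresholds.end()) return it->second;
        if (name.empty()) return LVL_INFO;  // no root entry: fall back to INFO
        std::string::size_type dot = name.rfind('.');
        name = (dot == std::string::npos) ? std::string() : name.substr(0, dot);
    }
}

inline bool logEnabled(const char* name, int level) {
    return level >= effectiveThreshold(name);
}

// Formats printf-style and appends "<name> <level>: <message>" to the sink.
inline void logWrite(const char* name, int level, const char* fmt, ...) {
    assert(name != nullptr && fmt != nullptr);
    char buf[512];  // longer messages are truncated in this sketch
    va_list ap;
    va_start(ap, fmt);
    std::vsnprintf(buf, sizeof(buf), fmt, ap);
    va_end(ap);
    g_sink.push_back(std::string(name) + " " + std::to_string(level) + ": " + buf);
}

// One-line interface: the format arguments are not evaluated when the
// component's effective threshold filters the message out.
#define LOG(name, level, ...)                                                    \
    do {                                                                         \
        if (logEnabled((name), (level))) logWrite((name), (level), __VA_ARGS__); \
    } while (0)
```

Usage would then be a single line such as LOG("a.b.c", LVL_WARN, "retry %d of %d", attempt, maxAttempts); where attempt and maxAttempts are whatever the caller has at hand.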
>>>
>>>
>>> need to be able to attach metadata to individual log messages
>>>
>>>
>>> log4cxx directly vs wrapping?
>>> - log4cxx plus one-line simple format, and dealing with
>>> configuration
>>>
>>>
>>> need to be able to dynamically turn on/off distributed logging
>>> (isolate our software from distributed logging)
>>> - proposed model: configure system to produce log in log4j
>>> format, distributed logging simply consumes the data
>>>
>>>
>>> same logging system from python and C++, or can they diverge?
>>> - same. It is a requirement!
>>>
>>>
>>> logging from python
>>> - swig the c++ implementation
>>> - yes, even for simple, python-only admin tools
>>>
>>>
>>> configuring logging
>>> - from c++: through api or a text file
>>> - from python: through a text file only
>>>
>>>
>>> two main uses for logging:
>>> - manual debugging
>>> - automated parsing of logs
>>>   - want key/value for that
>>>   - can configure log4cxx to write in json format
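For the automated-parsing use case, each message could be emitted as one JSON object per line, with the metadata carried as key/value pairs. The sketch below builds such a line by hand. It does not use log4cxx or its layouts, and the field names "logger", "level" and "message", the Metadata alias, and the helpers jsonEscape and toJsonLine are all assumptions for illustration.

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

// Escapes quotes, backslashes, newlines and tabs; other control
// characters are passed through unchanged in this sketch.
inline std::string jsonEscape(const std::string& in) {
    std::string out;
    for (char c : in) {
        switch (c) {
            case '"':  out += "\\\""; break;
            case '\\': out += "\\\\"; break;
            case '\n': out += "\\n";  break;
            case '\t': out += "\\t";  break;
            default:   out += c;
        }
    }
    return out;
}

// Ordered key/value metadata attached to a single log message.
using Metadata = std::vector<std::pair<std::string, std::string>>;

// Renders one log record as a single-line JSON object.
// All metadata values are written as JSON strings.
inline std::string toJsonLine(const std::string& name, int level,
                              const std::string& message, const Metadata& meta) {
    assert(!name.empty());
    std::string line = "{\"logger\":\"" + jsonEscape(name) + "\",\"level\":" +
                       std::to_string(level) + ",\"message\":\"" +
                       jsonEscape(message) + "\"";
    for (const auto& kv : meta) {
        line += ",\"" + jsonEscape(kv.first) + "\":\"" + jsonEscape(kv.second) + "\"";
    }
    line += "}";
    return line;
}
```

A record with metadata {"chunkId": "1234"} then comes out as one self-describing line that a downstream consumer can parse without knowing the message format.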
>>>
>>>
>>> problem with existing event system
>>> - ingesting into rdbms slow, querying key/value in rdbms slow
>>> - it does not batch
>>> - issues with multi-threading
>>> - but using event system as a transport protocol is an option,
>>> it is fast, scales, reliable
>>>
>>>
>>> sending log data to distributed logging system
>>> - through local files, or streaming directly
>>> - if local files: space mgmt issues (disks getting full)
>>> - but acts as a buffer, if distrib logging down, can resync
>>> and catch up
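The "local file as a buffer" idea can be sketched as a shipper that remembers how far it has read, so it can resume and catch up once the distributed logging system is back. This is an illustration only: shipFrom is a hypothetical helper, the vector stands in for the remote system, and file rotation and disk cleanup (the space management issue noted above) are not handled.

```cpp
#include <cassert>
#include <fstream>
#include <string>
#include <vector>

// Reads complete lines from `path` starting at byte `offset`, appends them
// to `out`, and returns the offset to resume from on the next call.
// A trailing line without '\n' is treated as still being written and is
// left for a later pass.
inline std::streamoff shipFrom(const std::string& path, std::streamoff offset,
                               std::vector<std::string>& out) {
    assert(offset >= 0);
    std::ifstream in(path, std::ios::binary);
    if (!in) return offset;  // file not there yet: nothing shipped
    in.seekg(offset);
    std::string line;
    while (std::getline(in, line)) {
        if (in.eof()) break;  // hit end of file before a newline: partial line
        out.push_back(line);
        offset = in.tellg();  // remember the position after the last full line
    }
    return offset;
}
```

If the consumer is down, the caller simply keeps the last returned offset and calls shipFrom again later; everything written in the meantime is picked up in order.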
>>>
>>>
>>> Flume vs Kafka
>>> - Flume is simpler, easier to deploy than Kafka
>>> - Kafka focuses a lot on high throughput / high performance
>>> - both use ZooKeeper
>>> - Flume is not just for hdfs
>>> - both solid, well supported, Apache products
>>> --> prototype with Flume
>>>
>>>
>>> if we use Flume, would we still use event system for monitoring?
>>> Options:
>>> - use event system
>>> - do event monitoring in Flume
>>> - generate events from Flume and use event system after Flume
>>>
>>>
>>> This discussion is all about application level logging, not for
>>> system level logging (done by NCSA)
>>> - NCSA has good tools that they use and like
>>> - Mike will send some info, if it looks interesting to us
>>> we will schedule a phone call to discuss
>>>
>>>
>>> Jacek
>>>
>>>
>>>
>>>
>>> On 12/17/2013 10:53 PM, Jacek Becla wrote:
>>>> Hi,
>>>>
>>>> I've put together a sketch proposal for logging system
>>>> based on earlier research done by our student Bill
>>>> on distributed logging systems, plus a couple of
>>>> hours of research/reading up about pex_logging,
>>>> log4cxx, Flume and Kafka, see:
>>>>
>>>> https://dev.lsstcorp.org/trac/wiki/db/Qserv/Logging
>>>>
>>>> I'm planning to discuss it tomorrow (Wed) at 11:00
>>>> pacific with Bill and K-T.
>>>>
>>>> If anyone is interested in joining, we'll keep
>>>> the line open: 866 740 1260, pass 9268664.
>>>>
>>>> Jacek
>>>>
>>>
>>
>