This is a second attempt to organize a meeting about logging.
Those interested in attending, please let me know your
availability through:

http://doodle.com/sby9zh26di86nm9g

We will provide updated documentation describing the
latest version of the prototype late tonight.

Thanks,
Jacek

>
> On 03/12/2014 11:37 AM, Jacek Becla wrote:
>> Hi all,
>>
>> I'd like to schedule a meeting about logging to continue
>> the discussion we started back in December (see below if you
>> don't recall).
>>
>> Bill has now built a log4cxx-based prototype which he can
>> demo, so sometime in the next week or so feels like a good
>> time to pick up the discussion. Could we possibly do it
>> tomorrow at 1pm PDT, or some time Monday? (based on DM
>> calendar, Robert is gone for quite some time starting Tuesday...)
>>
>> Those interested in the discussion, please fill the poll at:
>>
>> http://doodle.com/an9yguusdhq8rzqb
>>
>> Jacek
>>
>>
>> On 12/18/2013 01:57 PM, Jacek Becla wrote:
>>> Attendees: Robert Lupton, K-T Lim, Bill Chickering,
>>> Mike Freemon, Serge Monkewitz, Jacek
>>>
>>>
>>> High level summary
>>> ==================
>>> - application-level logging:
>>>   - considering using log4cxx, plus one-line interface
>>>     and configuration
>>>   - swig it to python
>>> - distributed logging:
>>>   - considering using Apache Flume
>>> - next steps:
>>>   - build a prototype
>>>   - evaluate
>>>   - make final decision (this is probably SAT level)
>>> - expecting to have app-level prototype in late January,
>>>   with Flume in February [Bill will build it]
>>>
>>>
>>> Related reading
>>> ===============
>>> https://dev.lsstcorp.org/trac/wiki/db/Qserv/Logging
>>> (it will be updated shortly after these notes are sent out)
>>>
>>>
>>> Detailed notes
>>> ==============
>>>
>>> the wiki page does not cover our existing event system
>>>   - it could be considered as an option for distributed logging
>>>   - although it has many issues
>>>
>>>
>>> google logging
>>>   - automatically selects module name (we
>>>     dislike it)
>>>   - doesn't seem very popular
>>>
>>>
>>> yes, pex_logging has optimizations and tries to minimize
>>> computation if logging is turned off
>>>
>>>
>>> pay attention to performance, e.g., ideally Robert would
>>> like an option to compile out logging altogether
>>>
>>>
>>> it'd be useful to define even more levels than log4cxx offers
>>> (e.g., several levels within "debug") - need to research if
>>> that is doable
>>>
>>>
>>> Robert dislikes the api in pex_logging, wants something
>>> as easy as printf, a la:
>>>
>>>     LOG("a.b.c", n, message, args, ...);
>>>
>>> where "a.b.c" is the hierarchical component name, n is
>>> the severity, and args are printf/boost::format/stream
>>> style arguments for the message.
>>>
>>>
>>> need to be able to attach metadata to individual log messages
>>>
>>>
>>> log4cxx directly vs wrapping?
>>>   - log4cxx plus one-line simple format, and dealing with
>>>     configuration
>>>
>>>
>>> need to be able to dynamically turn on/off distributed logging
>>> (isolate our software from distributed logging)
>>>   - proposed model: configure system to produce log in log4j
>>>     format, distributed logging simply consumes the data
>>>
>>>
>>> same logging system from python and C++ or they can diverge?
>>>   - same. It is a requirement!
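
The one-line interface described in the notes above (hierarchical component name, severity, printf-style arguments, per-message metadata, no formatting cost when a level is off) can be illustrated with a minimal sketch. It uses Python's standard `logging` module purely as a stand-in for log4cxx; `LOG`, `JsonFormatter`, the keyword-argument metadata convention and `demo` are hypothetical names chosen for illustration, not the prototype's API.

```python
import io
import json
import logging


class JsonFormatter(logging.Formatter):
    """Render each record as one JSON object per line (key/value for parsers)."""

    def format(self, record):
        payload = {
            "logger": record.name,
            "level": record.levelname,
            "message": record.getMessage(),
        }
        # Per-message metadata attached by LOG() below (assumed convention).
        payload.update(getattr(record, "metadata", {}))
        return json.dumps(payload, sort_keys=True)


def LOG(name, level, message, *args, **metadata):
    """Hypothetical one-line interface: LOG("a.b.c", n, message, args, ...)."""
    logger = logging.getLogger(name)
    # Skip all formatting work when the level is disabled for this component.
    if logger.isEnabledFor(level):
        logger.log(level, message, *args, extra={"metadata": metadata})


def demo():
    """Configure component "a" at INFO and return the emitted JSON lines."""
    stream = io.StringIO()
    handler = logging.StreamHandler(stream)
    handler.setFormatter(JsonFormatter())
    component = logging.getLogger("a")
    component.addHandler(handler)
    component.setLevel(logging.INFO)   # "a.b.c" inherits this level
    component.propagate = False        # keep output out of the root logger

    LOG("a.b.c", logging.INFO, "loaded %d chunks in %.1f s", 12, 0.5, worker="w01")
    LOG("a.b.c", logging.DEBUG, "not emitted: %s", "debug is off")
    return stream.getvalue().splitlines()
```

Calling `demo()` yields a single JSON line for the INFO message; the DEBUG call is dropped before any string formatting happens. A C++ version over log4cxx would presumably be a macro rather than a function, which is also what would make a compile-out option possible.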
>>>
>>>
>>> logging from python
>>>   - swig the c++ implementation
>>>   - yes, even for simple, python-only admin tools
>>>
>>>
>>> configuring logging
>>>   - from c++: through api or a text file
>>>   - from python: through a text file only
>>>
>>>
>>> two main uses for logging:
>>>   - manual debugging
>>>   - automated parsing of logs
>>>     - want key/value for that
>>>     - can configure log4cxx to write in json format
>>>
>>>
>>> problem with existing event system
>>>   - ingesting into rdbms slow, querying key/value in rdbms slow
>>>   - it does not batch
>>>   - issues with multi-threading
>>>   - but using event system as a transport protocol is an option,
>>>     it is fast, scales, reliable
>>>
>>>
>>> sending log data to distributed logging system
>>>   - through local files, or streaming directly
>>>   - if local files: space mgmt issues (disks getting full)
>>>     - but acts as a buffer, if distrib logging down, can resync
>>>       and catch up
>>>
>>>
>>> Flume vs Kafka
>>>   - Flume is simpler, easier to deploy than Kafka
>>>   - Kafka focuses a lot on high throughput / high performance
>>>   - both use ZooKeeper
>>>   - Flume is not just for hdfs
>>>   - both solid, well supported, Apache products
>>>   --> prototype with Flume
>>>
>>>
>>> if we use Flume, would we still use event system for monitoring?
>>> Options:
>>>   - use event system
>>>   - do event monitoring in Flume
>>>   - generate events from Flume and use event system after Flume
>>>
>>>
>>> This discussion is all about application level logging, not for
>>> system level logging (done by NCSA)
>>>   - NCSA has good tools that they use and like
>>>   - Mike will send some info, if it looks interesting to us
>>>     we will schedule a phone call to discuss
>>>
>>>
>>> Jacek
>>>
>>>
>>> On 12/17/2013 10:53 PM, Jacek Becla wrote:
>>>> Hi,
>>>>
>>>> I've put together a sketch proposal for logging system
>>>> based on earlier research done by our student Bill
>>>> on distributed logging systems, plus a couple of
>>>> hours of research/reading up about pex_logging,
>>>> log4cxx, Flume and Kafka, see:
>>>>
>>>> https://dev.lsstcorp.org/trac/wiki/db/Qserv/Logging
>>>>
>>>> I'm planning to discuss it tomorrow (Wed) at 11:00
>>>> pacific with Bill and K-T.
>>>>
>>>> If anyone is interested in joining, we'll keep
>>>> the line open: 866 740 1260, pass 9268664.
>>>>
>>>> Jacek
>>>>
>>>
>>
>