Should have sent this to this list too.
-------- Original Message --------
Subject: notes from discussion about logging (db team + Steve)
Date: Tue, 08 Oct 2013 11:57:27 -0700
From: Jacek Becla <[log in to unmask]>
Organization: SLAC National Accelerator Laboratory
To: LSST Data Management <[log in to unmask]>
Attendees: Steve, Serge, Daniel, Bill, Douglas, Jacek
Purpose of the meeting: to capture database
requirements and sync thoughts of the db team,
so that we are well prepared for DM-wide
discussion about logging.
Action items:
- capture requirements / statement what is needed
a) db team is capturing db-related input through:
https://dev.lsstcorp.org/trac/wiki/db/Qserv/Logging
(will add a section based on this meeting discussions)
b) need to do that for middleware / DM-wide. KT will
surely coordinate
- document (summarize) what we already learned about
existing tools
- Bill will experiment with the most promising systems:
he will migrate the logging system he wrote for Qserv,
without committing code, just to get some hands-on
experience (~1 day of work)
Requirements (brought up at the meeting):
- want a flexible system; every piece of code might
generate a different structure, so key/value is best
- want to impose some structure, like timestamp,
thread id, component id
- want to dynamically turn on/off parts of logging,
and query immediately after
- estimate expected rate... 1 million/sec aggregate???
- want to easily filter out subsets, drill in various
directions, eg all info for short window of time,
all info for a given component, etc
- want to query on local logs, not only on the centralized
logging server
- not too concerned if we lose a line of log here and there
- need to support simple, easy logging for developers,
eg log to screen or file is good, setting up a server
is bad
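
The key/value requirement with a few imposed common fields can be sketched as follows (a minimal illustration, not an agreed design; the function and field names here are made up):

```python
import json
import threading
import time

def format_event(component, level, **fields):
    """Build one structured log record as a single JSON line:
    imposed common fields (timestamp, thread id, component, level)
    plus free-form key/value pairs from the caller."""
    record = {
        "timestamp": time.time(),
        "thread_id": threading.get_ident(),
        "component": component,
        "level": level,
    }
    record.update(fields)  # arbitrary per-call structure, per the requirement
    return json.dumps(record)

# Hypothetical usage: a developer just prints to screen/file,
# no server setup needed.
print(format_event("qserv.worker", "INFO", query_id=42, chunks=17))
```

One line per record keeps the "log to screen or file" path trivial while still letting downstream tools filter on component or time window.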
observation: different people have different ideas
about logging levels; need a project-level policy
best candidates:
- Apache Flume (Java)
- Kafka (Java)
- Facebook's Scribe (C++)
Apache Flume looks most attractive
- can monitor directories, watch files
- free format structure
- common solution is to put data in hdfs,
query using hive, sql-like api
- can be set up with "routers", where a router collects
logs for the local application and data never hits
the logging server
- application could write to a file (or cout),
Flume would watch the files. Files can be structured,
eg json
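
For concreteness, a file-watching agent of that kind might be configured roughly like this (an illustrative Flume NG properties fragment; the agent/source/sink names and paths are made up, not a tested config):

```properties
# Illustrative only -- names and paths are hypothetical
agent.sources = qservLogs
agent.channels = mem
agent.sinks = toHdfs

# Spooling-directory source: watch a directory that
# applications write (eg json) log files into
agent.sources.qservLogs.type = spooldir
agent.sources.qservLogs.spoolDir = /var/log/qserv
agent.sources.qservLogs.channels = mem

agent.channels.mem.type = memory

# Common pattern from the notes: land data in HDFS,
# then query it with Hive's SQL-like interface
agent.sinks.toHdfs.type = hdfs
agent.sinks.toHdfs.hdfs.path = hdfs://namenode/logs/qserv
agent.sinks.toHdfs.channel = mem
```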
binary vs text logs
- [Daniel:]
- want to work with logs without writing text processing
- if purely text, need to do text parsing
- happy if we have both binary and text
- but text only is not good enough
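
Daniel's point can be illustrated with a short sketch (Python, assuming JSON-per-line text logs; the records shown are invented): structured text avoids ad-hoc parsing, because each line deserializes directly into a record instead of being split with regexes.

```python
import json

# Two hypothetical log lines as they might appear on disk:
lines = [
    '{"timestamp": 1381251447.1, "component": "qserv.czar", '
    '"msg": "query start", "query_id": 7}',
    '{"timestamp": 1381251447.9, "component": "qserv.worker", '
    '"msg": "chunk done", "query_id": 7}',
]

# No text processing: each line is one record.
records = [json.loads(line) for line in lines]

# Drill in one direction, eg all records for a given component:
for rec in records:
    if rec["component"] == "qserv.worker":
        print(rec["msg"], rec["query_id"])
```

A binary encoding of the same records would still be wanted for efficiency; the point is only that even the text form should be machine-parseable.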
Steve:
- free form definitely needed (apps team expressed this
many times)
- pex harness didn't work (logging daemon writing to db);
the whole backend didn't keep up
- bigger / larger scale requirements - need to check w/KT
Jacek