Jacek,

I'm happy to discuss it. FYI, the system already accepts multiple
emptyChunk files (one per db), but it defaults to emptyChunks.txt if
none is found. Have a look at populateEmptyChunkInfo() in metadata.py.
Also, note that in the czar log you should see a message saying what it
was looking for.

Sorry, it's not configurable. It will always look in
$CWD/empty_<dbname>.txt.

At the time, I was hoping we could eliminate the empty chunks file with
a lightweight secondary index, but that got shelved because I didn't
have a good solution to handle updates.

-Daniel

On 05/16/2014 01:03 PM, Becla, Jacek wrote:
> Daniel
>
> I feel like it'd be good to get to the bottom of it this FY,
> at minimum we should allow multiple emptyChunk lists to
> coexist (one per database). I'll add to the list of to-discuss
> topics for next Wed.
>
> Thanks,
> Jacek
>
>
> On 05/16/2014 12:29 PM, Wang, Daniel Liwei wrote:
>> Ideally, we would rebuild it upon any changes to the dirTable. There was
>> code in indexing.py (I think?) that was a placeholder attempt of code
>> that could generate it.
>>
>> There are a couple ways of generating it:
>> 1) from the range of numbers defined by the min and max chunk number,
>> filter out chunks determined to be non-empty by the existence of
>> dirtable_NNN tables. This is what my scripts did, and it was a hassle to
>> get it to work.
>>
>> 2) Do a special all-chunks query (count(*) or similar), but don't squash
>> on errors-- add them to the empty chunks list.
>>
>> 3) Populate the empty chunks file when you create a database, or when a
>> czar becomes aware that a database exists. Every time you load data, you
>> know what chunks you are creating, so remove those chunks from the empty
>> chunks file/list. The czar always checks the db entry in css to see if
>> its empty chunks file is out of date. It is impossible to delete
>> partitioned rows, so chunks never become empty after being non-empty.
>>
>> 4) Compute a non-empty chunk list from the sec-index list: select
>> distinct chunkId from blah (will go obsolete soon) and create a
>> hash-table or std::map, and use it. The czar can compute and cache this
>> the first time the db is accessed.
>>
>> I know, it's annoying.
>> -Daniel
>>
>>
>> On 05/16/2014 11:59 AM, Jacek Becla wrote:
>>> Daniel
>>>
>>> Can you remind me what the plans are with regards to the
>>> emptyChunk.txt? Will it be going away any time soon?
>>> It can still lead to a lot of confusion: I had a fully
>>> working environment based on PT1.1 data set, but after
>>> I ran qserv-testdata it silently overwrote the version
>>> I had (in build/dist/etc), and as a result I started
>>> getting an error:
>>>
>>> Table 'LSST.Object_1234567890' doesn't exist
>>>
>>> which I did not connect with bad emptyChunk.txt for
>>> quite some time.
>>>
>>> Thanks,
>>> Jacek
>>>
>>> ########################################################################
>>> Use REPLY-ALL to reply to list
>>>
>>> To unsubscribe from the QSERV-L list, click the following link:
>>> https://listserv.slac.stanford.edu/cgi-bin/wa?SUBED1=QSERV-L&A=1
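
For anyone reading this thread later, here is a rough sketch of two ideas discussed above: the per-database lookup with a fallback to emptyChunks.txt, and generation option 1 (the min..max chunk range minus the chunks known to be non-empty). This is not the code in metadata.py. The function names, the base_dir parameter, and the assumption that the fallback file sits in the same directory as the per-database file are all made up for illustration.

```python
import os


def empty_chunk_path(db_name, base_dir=None, fallback="emptyChunks.txt"):
    """Pick the empty-chunks file for db_name (hypothetical helper).

    Prefers <base_dir>/empty_<db_name>.txt and falls back to
    <base_dir>/<fallback> when the per-database file is missing.
    Assumption: the fallback lives in the same directory.
    """
    if base_dir is None:
        base_dir = os.getcwd()  # the thread says the lookup uses $CWD
    per_db = os.path.join(base_dir, "empty_%s.txt" % db_name)
    if os.path.exists(per_db):
        return per_db
    return os.path.join(base_dir, fallback)


def empty_chunks(min_chunk, max_chunk, non_empty):
    """Option 1: chunk ids in [min_chunk, max_chunk] that are not non-empty.

    non_empty is any iterable of chunk ids known to hold data, for
    example ids parsed from existing per-chunk table names.
    """
    present = set(non_empty)
    return [c for c in range(min_chunk, max_chunk + 1) if c not in present]
```

The same set difference would also cover option 3: start from the full range when the database is created, and pass the chunks created by each load as non_empty.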