Hi Jacek,

On Jun 9, 2014, at 11:50 PM, Jacek Becla wrote:

Serge, Daniel

Please see:

https://jira.lsstcorp.org/browse/DM-854

Are you touching the part of the code that is responsible
for that? If not, I am happy to try fixing it; any hints on
where to look are very welcome.

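The problem is that the worker takes the SQL and creates result tables using:

CREATE TABLE foo SELECT ….;

The czar does not rewrite the select list to guarantee that this works: when two select list entries produce the same column name (o1.objectId and o2.objectId both come out as objectId), the CREATE TABLE fails with a duplicate column name error. Really fixing this is non-trivial, because you have to rewrite every entry in the select list to have a unique alias. It’s tempting to just alias each select list entry with its own quoted text. E.g.: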


SELECT DISTINCT o1.objectId, o2.objectId
FROM   Object o1,
       Object o2
WHERE  scisql_angSep(o1.ra_PS, o1.decl_PS, o2.ra_PS, o2.decl_PS) < 1
  AND  o1.objectId <> o2.objectId


would become:

SELECT DISTINCT o1.objectId AS `o1.objectId`, o2.objectId AS `o2.objectId`
FROM   Object o1,
       Object o2
WHERE  scisql_angSep(o1.ra_PS, o1.decl_PS, o2.ra_PS, o2.decl_PS) < 1
  AND  o1.objectId <> o2.objectId

In general that doesn’t work, for at least four reasons:
 1. MySQL imposes a 64 character column name limit.
 2. The user could have used one of the aliases you are adding for another column, expression, or table reference.
 3. The statement might select the same column or expression more than once.
 4. You can get conflicts between table stars (e.g. o1.* and o2.* expand to the same column names).
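
Reasons 1-3 can be checked mechanically. Here is a minimal Python sketch, assuming the select list is already available as a list of expression strings; the function name and its arguments are hypothetical and not part of Qserv:

```python
MAX_IDENTIFIER_LEN = 64  # MySQL column name limit (reason 1)


def naive_alias_problems(select_exprs, user_aliases=()):
    """Report where using each expression's own text as its alias breaks.

    select_exprs: select list entries as strings (hypothetical input format).
    user_aliases: aliases the user already used elsewhere in the query.
    Returns a list of (expression, reason) pairs.
    """
    problems = []
    seen = set(user_aliases)  # reason 2: names already taken by the user
    for expr in select_exprs:
        if len(expr) > MAX_IDENTIFIER_LEN:
            problems.append((expr, "too long"))
        if expr in seen:  # reasons 2 and 3: alias already in use
            problems.append((expr, "duplicate"))
        seen.add(expr)
    return problems
```

The comparison is case-sensitive for simplicity; a real check would have to follow MySQL's own identifier comparison rules.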

I’d say if you want to fix 1-3, you should generate totally opaque aliases, and maintain a mapping from the opaque aliases back to the column references/expressions in the original query. This mapping is probably needed so that the result table the user gets back doesn’t have incomprehensible column names.
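
A rough Python sketch of the opaque-alias idea, under the assumption that the select list is a list of expression strings and that the names already used in the query are known. The alias scheme (c1, c2, ...) and the function name are made up for illustration:

```python
from itertools import count


def assign_opaque_aliases(select_exprs, reserved=frozenset()):
    """Give every select list entry a unique opaque alias.

    reserved: names already used in the query (hypothetical input) that
    generated aliases must avoid.
    Returns (rewritten select list, mapping from alias to original text).
    """
    counter = count(1)
    rewritten = []
    alias_to_expr = {}
    for expr in select_exprs:
        alias = f"c{next(counter)}"
        while alias in reserved:  # skip names the user already took
            alias = f"c{next(counter)}"
        # One alias per position, so a repeated expression stays unique.
        alias_to_expr[alias] = expr
        rewritten.append(f"{expr} AS `{alias}`")
    return ", ".join(rewritten), alias_to_expr
```

For the near neighbor query above, this would yield something like o1.objectId AS `c1`, o2.objectId AS `c2`, and the returned mapping is what lets the final result table get readable column names back.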

I don’t think you can fix 4 without making the czar aware of table schemas. You need to expand table stars into constituent column references and then generate unique aliases as before.
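
To illustrate the table star expansion, here is a small Python sketch. It assumes the czar has a schema lookup available as a plain dict and knows the table references in the FROM clause; both inputs and the function name are hypothetical:

```python
def expand_table_stars(select_exprs, table_refs, schemas):
    """Expand "*" and "ref.*" entries into explicit column references.

    table_refs: FROM clause reference name -> table name,
                e.g. {"o1": "Object", "o2": "Object"}.
    schemas:    table name -> ordered list of column names.
    """
    expanded = []
    for expr in select_exprs:
        if expr == "*":
            # A bare star covers every table reference, in FROM order.
            for ref, table in table_refs.items():
                expanded.extend(f"{ref}.{col}" for col in schemas[table])
        elif expr.endswith(".*"):
            ref = expr[:-2]
            expanded.extend(f"{ref}.{col}" for col in schemas[table_refs[ref]])
        else:
            expanded.append(expr)
    return expanded
```

Each expanded column reference can then be given a unique alias like any other select list entry.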

I would like to have it fixed before giving Qserv to
the Google Cloud team, as it prevents us from running
the near neighbor query.

You can always get around this by adding aliases yourself. Something like:

SELECT DISTINCT o1.objectId AS oid1, o2.objectId AS oid2
FROM   Object o1,
       Object o2
WHERE  scisql_angSep(o1.ra_PS, o1.decl_PS, o2.ra_PS, o2.decl_PS) < 1
  AND  o1.objectId <> o2.objectId

should work.

Daniel has some result marshalling rework planned (not sure when it’s scheduled) that may impact what needs to be done in terms of query rewriting (from what I recall, the CREATE TABLEs on the worker are going away). For these reasons I hope asking the user to add aliases is OK for the near term.


