Hi Adrian,

Yes, this should also work from python.

You just need to add ?xrdcl.unzip=fn to the metalink URL (nothing else) and remove the #fn from the files
listed inside the metalink.
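
A minimal sketch of those two steps in Python, for illustration only. The helper names (strip_member_fragment, clean_metalink, metalink_unzip_url) are hypothetical and not part of the XRootD client, and the sketch assumes a Metalink 4 (RFC 5854) document whose <url> entries carry the '#fn' suffix directly after the path:

```python
import re
import xml.etree.ElementTree as ET

# Metalink 4 namespace (RFC 5854); assumed to match the .meta4 file in use.
NS = "urn:ietf:params:xml:ns:metalink"


def strip_member_fragment(url):
    """Drop the first '#member' part, keeping any '?cgi' that follows it."""
    return re.sub(r"#[^?&]*", "", url, count=1)


def clean_metalink(xml_text):
    """Return the metalink XML with '#fn' removed from every <url> entry."""
    ET.register_namespace("", NS)  # keep the default namespace on output
    root = ET.fromstring(xml_text)
    for url in root.iter("{%s}url" % NS):
        if url.text:
            url.text = strip_member_fragment(url.text.strip())
    return ET.tostring(root, encoding="unicode")


def metalink_unzip_url(abs_path, member):
    """Build the local metalink URL with the xrdcl.unzip CGI appended."""
    return "file://localhost" + abs_path + "?xrdcl.unzip=" + member
```

The resulting URL has the same shape as the file://localhost/...meta4?xrdcl.unzip=AliAOD.root example quoted further down in this thread.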

Making xrdcl.unzip work for the links/files inside the metalink is more complicated. If this is something
you would be interested in, please create an issue on GitHub, but I don't think we will be able to accommodate
this feature request in 4.10.1.

It is the client that extracts the file from the zip archive: the client reads the central directory record
of the zip archive in order to find the offset of the requested file within the archive.
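
To illustrate the idea of a central directory lookup (this is not the XRootD client code, just a sketch using Python's standard zipfile module on a small in-memory archive; the helper name member_location is made up for the example):

```python
import io
import zipfile

# Build a small in-memory archive so the example is self-contained.
buf = io.BytesIO()
with zipfile.ZipFile(buf, "w", zipfile.ZIP_STORED) as zf:
    zf.writestr("first.txt", b"hello")
    zf.writestr("AliAOD.root", b"payload bytes")


def member_location(archive, member):
    """Look up a member in the central directory.

    Returns (offset of the member's local header, stored size in bytes).
    """
    with zipfile.ZipFile(archive) as zf:
        info = zf.getinfo(member)  # raises KeyError if the member is absent
        return info.header_offset, info.compress_size


offset, size = member_location(buf, "AliAOD.root")
```

With the offset and size known, a reader only needs to fetch that byte range of the archive rather than the whole file.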

Cheers,
Michal
________________________________________
From: Adrian Sevcenco
Sent: 10 September 2019 13:11
To: Michal Kamil Simon; [log in to unmask]
Subject: Re: python :: cp process fails when it shouldn't (another utility can download file)

On 9/10/19 1:22 PM, Michal Kamil Simon wrote:
> Hi Adrian,
Hi!

> Thanks for reporting this problem!
>
> I just pushed a patch that will allow for adding opaque info to local files:
> https://github.com/xrootd/xrootd/commit/45f4d8cf1cf2e7bdfc69c461dd47d45ae838748c
great! thanks a lot!!!

> With this fix you will be able to add the '?xrdcl.unzip=fn' CGI to the local metalink (I ran some tests
> and it extracts the right file correctly).
So, would metfile.meta4?xrdcl.unzip=zipped_file work for the Python
CopyProcess?

Also, would these parameters be added? (the URLs from the metalink carry the
ALICE authorization envelope)

> The patch will be included in 4.10.1 which we are now preparing.
Any chance of adding this feature also to the links inside the
metalink? I do not know what is happening inside, so I just ask :)

If the unzipping is done on the server side, maybe it would make more
sense to work with the actual URL (from the metalink), the way that you
initially told me to do?
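
For reference, the '#member' to xrdcl.unzip rewrite suggested at the start of this thread could be sketched as below. The function name hash_to_unzip_cgi is hypothetical, and the sketch assumes the path part of the URL never contains a '#' (as is the case for GUID-based ALICE paths):

```python
def hash_to_unzip_cgi(url):
    """Rewrite 'path#member[?cgi]' as 'path?xrdcl.unzip=member[&cgi]'."""
    base, sep, rest = url.partition("#")
    if not sep:
        return url  # no '#member' part, nothing to rewrite
    member, has_cgi, cgi = rest.partition("?")
    # If the base already carries a CGI, chain with '&' instead of '?'.
    joiner = "&" if "?" in base else "?"
    rewritten = base + joiner + "xrdcl.unzip=" + member
    if has_cgi:
        rewritten += "&" + cgi  # the original '?' becomes '&'
    return rewritten
```

URLs without a '#' are returned unchanged, so the same helper can be applied to every physical file name before the authorization token is attached or after.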

Thanks a lot!
Adrian



>
> Cheers,
> Michal
> ________________________________________
> From: Adrian Sevcenco
> Sent: 09 September 2019 12:18
> To: Michal Kamil Simon; [log in to unmask]
> Subject: Re: python :: cp process fails when it shouldn't (another utility can download file)
>
> On 9/9/19 12:56 PM, Michal Kamil Simon wrote:
>> Hi Adrian,
>>
>> Could you try adding the /xrdcl.unzip=AliAOD.root/ cgi to the original
>> metalink URL, like:
>>
>> file://localhost/home/adrian/tmp/_alice_data_2018_LHC18m_000291397_pass1_withTRDtracking_AOD208_0884_AliAOD.root.meta4?xrdcl.unzip=AliAOD.root
> Well, I do not think that the argument accepts anything other than a
> file:
>
> ll
> total 3266976
> -rw-r--r-- 1 adrian adrian 3345375231 Sep  9 13:06 AliAOD.root.zip
> -rw-rw-r-- 1 adrian adrian       3289 Sep  9 13:02 _alice_data_2018_LHC18m_000291397_pass1_withTRDtracking_AOD208_0884_AliAOD.root.meta4
>
> xrdcp -p -P -f _alice_data_2018_LHC18m_000291397_pass1_withTRDtracking_AOD208_0884_AliAOD.root.meta4?xrdcl.unzip=AliAOD.root AliAOD.root
> xrdcp: No such file or directory processing _alice_data_2018_LHC18m_000291397_pass1_withTRDtracking_AOD208_0884_AliAOD.root.meta4?xrdcl.unzip=AliAOD.root
>
> xrdcp -p -P -f file://localhost/_alice_data_2018_LHC18m_000291397_pass1_withTRDtracking_AOD208_0884_AliAOD.root.meta4?xrdcl.unzip=AliAOD.root AliAOD.root
> xrdcp: No such file or directory processing /_alice_data_2018_LHC18m_000291397_pass1_withTRDtracking_AOD208_0884_AliAOD.root.meta4?xrdcl.unzip=AliAOD.root
>
> xrdcp -p -P -f file://localhost/${PWD}/_alice_data_2018_LHC18m_000291397_pass1_withTRDtracking_AOD208_0884_AliAOD.root.meta4?xrdcl.unzip=AliAOD.root AliAOD.root
> xrdcp: No such file or directory processing //home/adrian/work-GRID/jalien_py/t/_alice_data_2018_LHC18m_000291397_pass1_withTRDtracking_AOD208_0884_AliAOD.root.meta4?xrdcl.unzip=AliAOD.root
>
> xrdcp -p -P -f file:///localhost/${PWD}/_alice_data_2018_LHC18m_000291397_pass1_withTRDtracking_AOD208_0884_AliAOD.root.meta4?xrdcl.unzip=AliAOD.root AliAOD.root
> xrdcp: No such file or directory processing /localhost//home/adrian/work-GRID/jalien_py/t/_alice_data_2018_LHC18m_000291397_pass1_withTRDtracking_AOD208_0884_AliAOD.root.meta4?xrdcl.unzip=AliAOD.root
>
> xrdcp -p -P -f file://${PWD}/_alice_data_2018_LHC18m_000291397_pass1_withTRDtracking_AOD208_0884_AliAOD.root.meta4?xrdcl.unzip=AliAOD.root AliAOD.root
> xrdcp: No such file or directory processing /home/adrian/work-GRID/jalien_py/t/_alice_data_2018_LHC18m_000291397_pass1_withTRDtracking_AOD208_0884_AliAOD.root.meta4?xrdcl.unzip=AliAOD.root
>
> Thank you!
> Adrian
>
>>
>> (it seems that there is some friction between the metalink handling and the zip
>> archive handling)
>>
>> Cheers,
>> Michal
>> ________________________________________
>> From: Adrian Sevcenco
>> Sent: 06 September 2019 16:30
>> To: Michal Kamil Simon; [log in to unmask]
>> Subject: Re: python :: cp process fails when it shouldn't (another
>> utility can download file)
>>
>> Hi Michal!
>>
>> On 9/2/19 5:52 PM, Michal Kamil Simon wrote:
>>   > Yep, it looks about right :-)
>> So, I use the ?xrdcl.unzip=AliAOD.root&authz=TOKEN
>> format, but it seems that the zip file is downloaded instead of the file
>> being extracted from the archive...
>>
>> the detailed log is here:
>> https://cernbox.cern.ch/index.php/s/JNaLKsaC5pyrhMP
>>
>> do you have any idea/hint why the file is not extracted?
>>
>> Thanks a lot!!
>> Adrian
>>
>>
>>
>>   >
>>   > Michal
>>   > ________________________________________
>>   > From: Adrian Sevcenco
>>   > Sent: 02 September 2019 16:46
>>   > To: Michal Kamil Simon; [log in to unmask]
>>   > Subject: Re: python :: cp process fails when it shouldn't (another
>>   > utility can download file)
>>   >
>>   > On 9/2/19 5:29 PM, Michal Kamil Simon wrote:
>>   >> Hi Adrian,
>>   > Hi!
>>   >
>>   >> You can replace the '#' with '?xrdcl.unzip='. However, if the URL
>>   >> already contains a CGI, you also have to replace the '?' that follows
>>   >> with '&', e.g.:
>>   >>
>>   >> root://eosalice.cern.ch:1094//15/62933/56da6906-9149-11e7-ba1b-579516ed5c66#AliAOD.root?authz=LONG_ALICE_TOKEN
>>   >>
>>   >> gets transformed into:
>>   >>
>>   >> root://eosalice.cern.ch:1094//15/62933/56da6906-9149-11e7-ba1b-579516ed5c66?xrdcl.unzip=AliAOD.root&authz=LONG_ALICE_TOKEN
>>   >
>>   > Yep, it is doable, as the full URL (physical URL + token) is constructed by
>>   > me... so I can do something like:
>>   >
>>   > # I have the guarantee that the files ALICE uploads have no '#' in the name
>>   > pfn_components = pfn.split('#')
>>   >
>>   > if len(pfn_components) > 1:
>>   >     # part before '#' is the archive URL, part after it is the zip member
>>   >     full_url = pfn_components[0] + '?xrdcl.unzip=' + pfn_components[1] + '&authz=LONG_ALICE_TOKEN'
>>   > else:
>>   >     full_url = pfn + '?authz=LONG_ALICE_TOKEN'
>>   >
>>   > Does it sound right?
>>   > Thanks a lot for help!!
>>   > Adrian
>>   >
>>   >
>>   >
>>   >>
>>   >>
>>   >>
>>   >> Regarding the /CopyProcess.add_job(...)/ method, I could add parameters
>>   >> that allow specifying the file name to extract from a zip archive.
>>   >>
>>   >> Regarding support for the native root '#' format, we will have to check with
>>   >> Andy whether this won't harm any existing use cases (as '#' is a legal
>>   >> character that could be used in a file name).
>>   >>
>>   >> Cheers,
>>   >> Michal
>>   >> ________________________________________
>>   >> From: Adrian Sevcenco
>>   >> Sent: 02 September 2019 15:37
>>   >> To: Michal Kamil Simon; [log in to unmask]
>>   >> Subject: Re: python :: cp process fails when it shouldn't (another
>>   >> utility can download file)
>>   >>
>>   >> On 9/2/19 2:09 PM, Michal Kamil Simon wrote:
>>   >> > Hi Adrian,
>>   >> Hi!
>>   >>
>>   >> > From what I see in the logs you use the following file name:
>>   >> >
>>   >> > root://eosalice.cern.ch:1094//15/62933/56da6906-9149-11e7-ba1b-579516ed5c66#AliAOD.root
>>   >> >
>>   >> > The '#' is the root syntax for unpacking root files. It is not supported
>>   >> > in the xrootd client; instead you have to use the /xrdcl.unzip/ CGI tag, e.g.
>>   >> >
>>   >> > root://eosalice.cern.ch:1094//15/62933/56da6906-9149-11e7-ba1b-579516ed5c66?xrdcl.unzip=AliAOD.root
>>   >>
>>   >> Oh!!! So, could I use a simple logic like:
>>   >> replace the last '#' in the string with '?xrdcl.unzip='
>>   >>
>>   >> ALICE stores files under a GUID (that last uid),
>>   >> and when I request access to an lfn I get the GUID and the authz envelope
>>   >> for accessing the file... so it is guaranteed that I will always get a
>>   >> URL with a GUID...
>>   >>
>>   >> Given this, do you think that I could use the logic from above?
>>   >>
>>   >> > alternatively I can expose extracting of zip files (root files use zip
>>   >> > format for bundling)
>>   >> > in the /CopyProcess.add_job(...)/ method.
>>   >> That would be great! If possible, it would be best if
>>   >> the same '#file' format is recognized (as this is the URL that I get
>>   >> when requesting lfn access)
>>   >>
>>   >> Thanks a lot!!
>>   >> Adrian
>>   >>
>>   >> >
>>   >> > Hope this helps!
>>   >> >
>>   >> > Cheers,
>>   >> > Michal
>>   >> >
>>   >> > ________________________________________
>>   >> > From: Adrian Sevcenco
>>   >> > Sent: 01 September 2019 22:13
>>   >> > To: [log in to unmask]
>>   >> > Cc: Michal Kamil Simon
>>   >> > Subject: python :: cp process fails when it shouldn't (another utility
>>   >> > can download file)
>>   >> >
>>   >> > Hi! I have a really baffling situation where my Python tool cannot
>>   >> > download a file and another tool (Java-based, uses xrdcp) can download
>>   >> > the same file...
>>   >> >
>>   >> > The detailed logs for my cp are here:
>>   >> > https://cernbox.cern.ch/index.php/s/JNaLKsaC5pyrhMP
>>   >> >
>>   >> > The Java-based tool somehow seems to ignore the external XRD_
>>   >> > variables, so I cannot get a log of its cp process.
>>   >> >
>>   >> > Could some expert please take a look and give me a hint as to why my cp
>>   >> > fails while the other tool can download just fine?
>>   >> >
>>   >> > Thank you!!
>>   >> > Adrian
>>   >> >
>>   >>
>>   >>
>>   >> --
>>   >> ----------------------------------------------
>>   >> Adrian Sevcenco, Ph.D. |
>>   >> Institute of Space Science - ISS, Romania |
>>   >> adrian.sevcenco at {cern.ch,spacescience.ro} |
>>   >> ----------------------------------------------
>>   >>
>>   >
>>   >
>>   > --
>>   > ----------------------------------------------
>>   > Adrian Sevcenco, Ph.D. |
>>   > Institute of Space Science - ISS, Romania |
>>   > adrian.sevcenco at {cern.ch,spacescience.ro} |
>>   > ----------------------------------------------
>>   >
>>   >
>>   >
>>
>>
>> --
>> ----------------------------------------------
>> Adrian Sevcenco, Ph.D. |
>> Institute of Space Science - ISS, Romania |
>> adrian.sevcenco at {cern.ch,spacescience.ro} |
>> ----------------------------------------------
>>
>
>
> --
> ----------------------------------------------
> Adrian Sevcenco, Ph.D.                       |
> Institute of Space Science - ISS, Romania    |
> adrian.sevcenco at {cern.ch,spacescience.ro} |
> ----------------------------------------------
>
>
>


--
----------------------------------------------
Adrian Sevcenco, Ph.D.                       |
Institute of Space Science - ISS, Romania    |
adrian.sevcenco at {cern.ch,spacescience.ro} |
----------------------------------------------

