In Python 3, a `str` is Unicode text, and the C API creates one from a byte buffer by decoding that buffer as UTF-8. This means one cannot construct a `str` from an arbitrary byte sequence: if the bytes are not valid UTF-8, the decode raises a `UnicodeDecodeError`. Instead, byte sequences should be returned as `bytes` objects.

As an example where this fails in `pyxrootd`, the [`File::Read` method] tries to build a string from the result of reading a file:

```cpp
pyresponse = Py_BuildValue( "s#", buffer, bytesRead );
```

The `s#` notation [means]

> Convert a C string and its length to a Python `str` object using `'utf-8'` encoding. If the C string pointer is _NULL_, the length is ignored and `None` is returned.

This will fail in general, as not every byte sequence is valid UTF-8. A sufficient fix might be to use `y#` instead, which builds a `bytes` object from the buffer without decoding it.

[`File::Read` method]: https://github.com/xrootd/xrootd/blob/4b98210385f1fb9eafb49db58c1f8d6983cee055/bindings/python/src/PyXRootDFile.cc#L143-L209
[means]: https://docs.python.org/3/c-api/arg.html#c.Py_BuildValue
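
The decode failure can be reproduced in pure Python, without `pyxrootd`. The sketch below is only an analogy: `as_str` and `as_bytes` are hypothetical helpers that stand in for what the `s#` and `y#` format units do to a buffer, and `raw` is made-up sample data rather than anything read through XRootD.

```python
# Made-up binary data standing in for the contents of a file read.
# The bytes 0xff and 0xfe never occur in well-formed UTF-8.
raw = b"\xff\xfe\x00\x01binary payload"


def as_str(data: bytes) -> str:
    """Hypothetical stand-in for `s#`: decode the buffer as UTF-8."""
    return data.decode("utf-8")


def as_bytes(data: bytes) -> bytes:
    """Hypothetical stand-in for `y#`: return the buffer without decoding."""
    return bytes(data)


try:
    as_str(raw)
    decoded_ok = True
except UnicodeDecodeError as exc:
    # Strict UTF-8 decoding rejects the buffer at the first invalid byte.
    decoded_ok = False
    print(f"decoding failed: {exc.reason}")

# The bytes path never inspects the contents, so it round-trips any input.
assert as_bytes(raw) == raw
```

Text that happens to be valid UTF-8 survives both paths, which is probably why the problem only shows up when reading binary files.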