bytes in privData?

It seems like I can't set the content of a PrivFile to arbitrary bytes.

$ propellor --set 'PrivFile "mysecret.key"' 'mycontext' < ~/mysecret.key
find . | grep -v /.git/ | grep -v /tmp/ | grep -v /dist/ | grep -v /doc/ | egrep '\.hs$' | xargs hothasktags | perl -ne 'print; s/Propellor\.Property\.//; print' | sort > tags  2>/dev/null || true
cabal build
Building propellor-2.2.1...
Preprocessing library propellor-2.2.1...
In-place registering propellor-2.2.1...
Preprocessing executable 'propellor' for propellor-2.2.1...
Preprocessing executable 'propellor-config' for propellor-2.2.1...
[70 of 70] Compiling Main             ( src/config.hs, dist/build/propellor-config/propellor-config-tmp/Main.o )
Linking dist/build/propellor-config/propellor-config ...
ln -sf dist/build/propellor-config/propellor-config propellor


Enter private data on stdin; ctrl-D when done:
propellor: <stdin>: hGetContents: invalid argument (invalid byte sequence)

done! --Joey

RSS

comment 1

I imagine that adding fileEncoding stdin to setPrivData will fix this crash, but I'd expect there are also other problems with encodings for privdata that haskell doesn't like. Similar fixes would probably be needed in several other places.

Probably cleaner and better to convert PrivData from a String to a ByteString, and so avoid encodings being applied to it. I think this could be done without changing the file format; the privdata file uses Read/Show for serialization, and happily ByteString uses the same Read/Show format as String does.

So, changing the type and following the compile errors should get you there, I think!

Comment by joey — Mon Apr 20 01:04:26 2015

Remove comment

comment 2

Would you accept a patch converting PrivData from String to ByteString?

Comment by gueux — Tue Apr 21 12:59:42 2015

Remove comment

comment 3

Absolutely. Thought that went w/o saying.

Comment by joey — Tue Apr 21 16:52:06 2015

Remove comment

comment 4

I tried to do the conversion, but then it started a kind of chain reaction... (PrivData=ByteString to writeFileProtected to Line=ByteString to ... to readProcess to ...) Should I use FilePath=String? ... To be honest, the patch became a lot bigger that what I am comfortable with. :-)

I guess you should have a look at it...

At least, I think there is a type bug in Propellor.Property.File:

hasPrivContent' :: (IsContext c, IsPrivDataSource s) => (String -> FilePath -> IO ()) -> s -> FilePath -> c -> Property HasInfo

but it should be

hasPrivContent' :: (IsContext c, IsPrivDataSource s) => (FilePath -> String -> IO ()) -> s -> FilePath -> c -> Property HasInfo

(it is hidden by FilePath = String)

Comment by gueux — Thu Apr 23 09:21:07 2015

Remove comment

comment 5

Can you put the patch up somewhere? I'll take a look. Might see a way to short-curcuit the bytestring before everything becomes one..

One way might be:

writeFileProtected :: FileContent content => FilePath -> content -> IO ()

Which would also at least partly avoid foot-shooting over which parameter is which. (Fixed that type signature.)

Comment by joey — Thu Apr 23 13:25:50 2015

Remove comment

comment 6

I've followed the same path in the wip-bytestring-privdata branch.

It needs to round trip through String anyway to handle Read/Show serialization the same as before. I think this is doable without falling over on invalid encodings, but it's certianly ugly.

And yeah, changing Line to ByteString and all the other follow-on changes just don't seem right. Everything that uses withPrivData would need to deal with it being a ByteString, and would need to worry about encoding problems when it needed to convert to a String, or Text, or whatever.

So this feels like kicking the can down the road in the wrong direction...

Maybe it would be better to handle this by adding a type to wrap up an encoded ByteString in the PrivData. Could use base64 or something like that for the encoding. Then only consumers of these ByteStrings would be a little complicated by needing to unwrap it.

Then it would be handly to give --set, --dump and --edit some special handling of fields encoded like that. They could operate on raw ByteStrings when handling such fields, and take care of the encoding details.

Add a new constructor to PrivDataField for binary files:

| PrivBinaryFile FilePath

And a function to get the encoder and decoder:

type Encoder = ByteString -> PrivData
type Decoder = PrivData -> ByteString

privDataEncoding :: PrivDataField -> Maybe (Encoder, Decoder)

Then --set, --dump, and --edit could use that to encode and decode the data.

And finally, a withBinaryPrivData that uses ByteString.

(Maybe this could be made more type safe though..)

Comment by joey — Tue Apr 28 19:24:12 2015

Remove comment

mostly done

A recent change converted PrivData to a newtype. There are no longer any things that directly use PrivData; all use should be via accessor functions like privDataLines and privDataVal. Which helps with this.

So, I've instead implemented a privDataByteString :: PrivData -> ByteString, and I've adjusted the privdata serialization so it shouldn't crash on arbitrarily encoded data when eg, a binary file is fed into propellor --set.

(Note that I was wrong earlier when I said it'd be safe to change the serialization to use ByteString; it must use String. While "foo" can be Read as a ByteString same as a string, "foo\1000", when Read as a ByteString, truncates the big unicode character to a single byte. So, PrivData is still stored as Strings internally.)

The final step would be to make hasPrivContent use privDataByteString instead of privDataLines. Which needs some more work to add Properties to ensure a file contains a ByteString. This should be pretty easy to do, but I lost steam..

Comment by joey — Tue Sep 15 00:26:10 2015

Remove comment

comment 8

This has bit me one too many times, so here is a full implementation. There could be some dedup work to merge fileProperty and byteProperty, but I'd rather not break API with a trivial bug fix.

https://github.com/arcticwaters/propellor/commit/f5a921760ccabf0a3ebdda626c19ee6ecbe89629

Comment by andrew — Wed Jun 15 06:44:55 2016

Remove comment

comment 9

Thank you, Andrew!

Before I merge this, I want to clear up something that confuses me; you characterized this as a bug that has bit you. How did the pre-bytestring File.hasContentProtected exhibit buggy behavior?

AFAICS, it already supported binary privdata, just in a suboptimal way.

(Also fileProperty and bytesProperty should indeed be deduped; a second patch that merges them, even with an API change, would be ok.)

Comment by joey — Fri Jun 17 01:33:05 2016

Remove comment

comment 10

I've recreated my propellor repository a few times and have had to write out .pfx files. Essentially a binary format of .pem and .key. I had no problem getting the pfx file into privData, but propellor bails when writing the binary data on the host. This patch tackles the writing on host bit (not the writing to privData). You've used hPutStr to write out data which errors on certain bytes (because hPutStr assumes a character encoding?). 0x00 is a likely candidate. I don't recall the exact error, but at least Haskell noticed this and gave an error rather than writing out a partial file.

I'll see if I can get a deduping patch to tidy up fileProperty and byteProperty.

Comment by andrew — Fri Jun 17 05:25:08 2016

Remove comment

comment 11

Hmm, the way Strings are used for PrivData takes advantage of ghc's "filename encoding", which is supposed to allow arbitrary bytes to be included in filenames; unicode surrogate characters are used to map them to unicode.

But, Property.File is using readFile, witeFile, and writeFileProtected, which will bail on invalid unicode as the filename encoding is not used. Your patch avoids that problem I see.

Comment by joey — Fri Jun 17 13:16:17 2016

Remove comment

comment 12

I found a reasonable way to refactor it without the duplication, so have landed the patch.

Comment by joey — Sun Jun 19 17:53:17 2016

Remove comment

Add a comment