Nice and simple this week: a short example with one of my most-used algorithms – std::transform. An earlier post talked about the way in which I am sorting a list of files. At the time I wrote that post I was getting the list of files directly from the file system; however, I also want to be able to write the file metadata out to a JSON file and read it back in.
A simplified version of the JSON file looks like this:
{
    "compile.bat" : [77, 990630480],
    "keycodes.h" : [2728, 1124154608],
    "q3_ui.bat" : [5742, 1033844878],
    "q3_ui.q3asm" : [544, 1033844878],
    "q3_ui.sh" : [1124, 990630480],
    "q3_ui.vcproj" : [63532, 1124230338],
    "ui.def" : [29, 990630480],
    "ui.q3asm" : [568, 990630480]
}
(The full version includes a directory structure as well. I have removed the directory structure for simplicity.)
As before, I have my FileMetaData structure and a vector of FileMetaData objects:
struct FileMetaData
{
    std::string fileName_;
    std::size_t size_;
    std::time_t lastWriteTime_;
};

typedef std::vector< FileMetaData > FileMetaDataVector;
I want to read in the file, which gives me the data as a JSON string, and transform it into a FileMetaDataVector object. For simplicity, I am using the boost property tree classes as a quick and easy way to get JSON input working (this will probably not be my final solution but it is good enough for now).
Reading the JSON data in and parsing it as JSON is easy:
FileMetaDataVector getFileMetaData( std::istream& istr )
{
    boost::property_tree::ptree propertiesTree;
    boost::property_tree::read_json( istr, propertiesTree );
The nice thing about the property tree is that (to quote from the documentation) “each node is also an STL-compatible Sequence for its child nodes”. An STL-compatible sequence means that I can use its iterators as input to the standard algorithms. The JSON input has been transformed into a boost::property_tree::ptree object. Now I want to transform it into a FileMetaDataVector object. Because boost::property_tree::ptree is STL compatible I can treat it just like I would any other container:
    FileMetaDataVector result;
    std::transform(
        std::begin( propertiesTree ), std::end( propertiesTree ),
        std::back_inserter( result ),
        valueToFileMetaData );
    return result;
}
I need to write valueToFileMetaData – the function to convert a child of boost::property_tree::ptree into a FileMetaData object. The implementation isn’t of interest for this post; the prototype for the function looks like this:
FileMetaData valueToFileMetaData(
    boost::property_tree::ptree::value_type const& );
I have said it before and I will say it again – one of the best things the STL does for us is to give us a framework. Write an algorithm or a container that fits into that framework and you instantly have access to everything else in that framework.
Incidentally, the code I have that reads in the file metadata from the file system also uses standard algorithms. I am using the boost filesystem library, and in particular, directory_iterator. I didn’t use that code as an example because directory_iterator returns files and directories and I need to do a little work to split them out, but the principle is the same – because Boost treats a directory as an STL-compatible container I can use STL algorithms to operate on it.
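As a rough illustration of that principle, the sketch below splits a directory's entries into files and directories with std::partition_copy. It is not the code from my project: it uses the C++17 std::filesystem library as a stand-in for Boost.Filesystem (the two have very similar interfaces), and the DirectoryListing type and listDirectory function are hypothetical names made up for the example.

```cpp
#include <algorithm>
#include <filesystem>
#include <iterator>
#include <vector>

namespace fs = std::filesystem;

// Hypothetical result type: the entries of one directory, split by kind.
struct DirectoryListing
{
    std::vector< fs::directory_entry > directories_;
    std::vector< fs::directory_entry > files_;
};

DirectoryListing listDirectory( fs::path const& dir )
{
    DirectoryListing result;

    // A default-constructed directory_iterator acts as the end iterator.
    // Entries satisfying the predicate go to the first output, the rest
    // to the second.
    std::partition_copy(
        fs::directory_iterator( dir ), fs::directory_iterator(),
        std::back_inserter( result.directories_ ),
        std::back_inserter( result.files_ ),
        []( fs::directory_entry const& entry )
        { return entry.is_directory(); } );

    // Directory iteration order is unspecified, so sort for stable output.
    std::sort( result.directories_.begin(), result.directories_.end() );
    std::sort( result.files_.begin(), result.files_.end() );
    return result;
}
```

Everything that is not a directory lands in files_ here, which includes symlinks to non-directories and other special files; a real implementation might want a stricter predicate.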
Could you please elaborate on why the boost::property_tree library won’t make it to your program’s final version? I’m curious about the factors you take into consideration when choosing such a tool. What alternatives do you consider?
It’s been 6 months since I wrote this post and boost::property_tree is still in my code. At the time I wrote the post I had just started using the boost library and was having a few problems with it. It appeared to be slow; however, that was due to me getting the performance tests wrong – it isn’t as bad as it first appeared. property_tree is more general than required for JSON, and getting the information I needed out of it was proving tricky. It’s still tricky, but I now have a number of helper functions that make life easier.
At the time, the main factor was the speed of getting JSON reading functionality into my project. I was already using boost so all I had to do was #include the appropriate headers. Looking to the future, things I will take into consideration include speed (boost::property_tree isn’t as slow as I thought but that doesn’t mean it’s as fast as a less general option), error handling (I haven’t experimented with boost::property_tree error handling yet – it might be just fine) and simplicity of use (boost::property_tree doesn’t fare too well here).
If I decide that boost::property_tree isn’t what I want, I’ll take a look through the list of parsers at http://json.org/ and see if there’s anything with an appropriate license that meets my needs.
There is one more option – the “write a parser myself” option. That seems like reinventing the wheel; however, there’s one feature that I haven’t seen in any of the JSON parsers I have looked at. I want to be able to have comments in the JSON file, and I want the file to be capable of being round-tripped whilst maintaining those comments.
The “round trip” requirement means that I can’t just run the file through a preprocessor to remove the comments. The comments have to be read in, and stored in a way that lets them be written out later. Several years ago I went to a talk by Sean Parent at Adobe where he proposed that comments should always be attached to a construct in the file. When that construct is written out, the comment is written out with it.
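To make the "attach the comment to a construct" idea a little more concrete, here is a very rough sketch of the writing half only. Everything in it is an assumption made for illustration: the CommentedMember type, the writeMembers and toString functions, the choice of // as the comment syntax (standard JSON has no comments at all), and the shortcut of storing each value as an already-serialised string.

```cpp
#include <cstddef>
#include <ostream>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical node type: one "key" : value construct together with the
// comment lines that belong to it.
struct CommentedMember
{
    std::vector< std::string > comments_;
    std::string key_;
    std::string value_;  // already-serialised JSON value, e.g. "[77, 990630480]"
};

// Writes each member preceded by its own comments, so a comment travels
// with its construct wherever that construct is written.
void writeMembers(
    std::ostream& out,
    std::vector< CommentedMember > const& members )
{
    out << "{\n";
    for( std::size_t i = 0; i != members.size(); ++i )
    {
        CommentedMember const& member = members[ i ];
        for( std::string const& comment : member.comments_ )
        {
            out << "  // " << comment << "\n";
        }
        out << "  \"" << member.key_ << "\" : " << member.value_;
        out << ( i + 1 != members.size() ? ",\n" : "\n" );
    }
    out << "}\n";
}

std::string toString( std::vector< CommentedMember > const& members )
{
    std::ostringstream out;
    writeMembers( out, members );
    return out.str();
}
```

The sketch sidesteps the hard questions: which construct a comment belongs to when it is read in, what happens to comments at the end of a file or inside an array, and how the original whitespace is preserved.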
I haven’t worked out the details of this yet. I have thought about the problem enough to realize that the devil is probably lurking in those details. I am not sure that there is a solution to the problem – at least not a solution that does everything I want – but part of the point of my personal projects is to play around with stuff like this. I may or may not solve the problem, but I’ll learn something interesting along the way.