Google's Coding Culture and C++
I stumbled upon this blog entry from Joe Beda, a software developer at Google. Joe writes:
There is, by and large, only one code base at Google. This has many advantages. Most obvious is that it is really easy to look at and contribute to code in other projects without having to talk to anyone, get special permissions or fill out forms in triplicate. That is just the tip of the iceberg, though. Having one codebase means that there is a very high degree of code sharing. Need to base 64 encode/decode something? No problem, there is a standard Google routine for that. Found a bug? Just fix it and check it in after getting it code reviewed by a documented owner. One of the reasons that environments like Perl, Python, C#, Java, etc. flourish is that they have large and well thought out libraries of useful code. For a variety of reasons, C++ has never had this. (I could theorize but that would be off topic.) Google has solved this problem by building up a large library of well documented and easy to integrate code. This not only lowers the bar for new projects but makes it easy to switch projects as you don't have to learn new conventions.
Code sharing, particularly company-wide sharing, is generally a good thing. However, I was a bit surprised by the remark about the lack of reusability in C++. I haven't worked with C++ in ages; I stopped when Java came about, and in case you didn't notice, that was a decade ago! I had hoped that the situation in the C++ world would have improved since then. Unfortunately, it appears that I am to be sadly disappointed.
My major pain point with C++ a decade ago was a serious lack of uniformity. Developers could work in several styles. The major styles at that time were a C coding style, a Smalltalk-like style (i.e. clone() methods and lots of allocations on the heap), a canonical C++ style (i.e. copy constructors and stack-based objects) and finally a template-based style (i.e. STL). So when you put together a team of C++ developers, you had people coming from too many different coding cultures, and unless you laid down the law pretty early, you would end up with complete chaos. The typical coding standards documents of that day didn't help because, aside from recommending a naming standard (e.g. Hungarian notation, incidentally a throwback from C), they had little to say about which style to follow. Furthermore, many developers were simply not cognizant of the conflicting C++ styles. The conflict was real enough that by 1998 Jim Coplien, who had written a book on the canonical C++ style, published another book titled "Multi-Paradigm Design for C++".
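To make the four styles concrete, here is a rough sketch of the same tiny problem (computing an area) written each way. All of the type and function names (CRect, Shape, Square, Rect, total_area) are hypothetical and chosen only for illustration; they do not come from any particular code base.

```cpp
#include <cstddef>
#include <vector>

// 1. C style: a plain struct plus free functions that take a pointer.
struct CRect { int w; int h; };
int crect_area(const CRect* r) { return r->w * r->h; }

// 2. Smalltalk-like style: an abstract base class, objects living on the
//    heap, and copying done through a virtual clone() method.
class Shape {
public:
    virtual ~Shape() {}
    virtual Shape* clone() const = 0;  // caller owns the returned object
    virtual int area() const = 0;
};

class Square : public Shape {
public:
    explicit Square(int side) : side_(side) {}
    Shape* clone() const { return new Square(*this); }
    int area() const { return side_ * side_; }
private:
    int side_;
};

// 3. Canonical C++ style: a value type with a copy constructor and an
//    assignment operator, normally created on the stack.
class Rect {
public:
    Rect(int w, int h) : w_(w), h_(h) {}
    Rect(const Rect& other) : w_(other.w_), h_(other.h_) {}
    Rect& operator=(const Rect& other) {
        w_ = other.w_;
        h_ = other.h_;
        return *this;
    }
    int area() const { return w_ * h_; }
private:
    int w_;
    int h_;
};

// 4. Template-based style: a generic algorithm over an STL container that
//    works for any element type providing an area() member function.
template <typename T>
int total_area(const std::vector<T>& shapes) {
    int sum = 0;
    for (typename std::vector<T>::const_iterator it = shapes.begin();
         it != shapes.end(); ++it) {
        sum += it->area();
    }
    return sum;
}
```

None of these is wrong in isolation. The trouble starts when one module hands out heap-allocated Shape pointers that the caller must delete, while the module next door expects copyable values it can drop into a std::vector.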
Fast forward to today, and Joe Beda writes in a subsequent entry:
Here are some more thoughts.
- There *are* good sets of libraries out there for doing stuff. One example would be boost. These have limited uptake I think for a couple of reasons. First, downloading and building these is a pain. They don't already come with your compiler. Second, these libraries are either laser focused on a specific area (libtiff?) or are more collections/algorithms base. Boost, for example, doesn't include an easy interface for dealing with zip files or for writing a single threaded select loop based network server.
- The C++ standards process is slow and ponderous. While there are downsides to these environments being controlled by a single entity or a focused group, the end result is much faster advancement of the language/runtime/environment.
- There are so many ways to do things in C++ that invariably arguments come up over how to do something. Even within one company, people can't agree. For some reason C++ programmers seem to be more stubborn on a lot of these points. The result is that everyone just writes their own thing.
- It is way too easy to write super hard to use APIs in C++. Templates make this problem worse. If you do this stuff day in and day out you get to know it and it makes sense to you, but most C++ libraries are hard to use on a casual basis. I think that there is a disease where people think that it is their obligation to use templates in as confusing a way as possible. Any time a class has a templated base class, you are going to go right over the head of 95% of the developers out there.
- Opinions are divided around RTTI and exceptions. Code written to deal with exceptions doesn't interop well with code that turns exceptions off. Almost no one turns on RTTI. In any case, the result is that these are walls to reusing code. If you don't use exceptions you have to come up with your own error code space or use platform concepts like HRESULTs.
- Everyone likes to replace their allocator but there is no standard way to do that for various libraries. If a library lets you replace the allocator it is usually something that has to be done on a library by library basis. So many problems would be solved if the CRT and language had a way to do global hooking of the malloc/free and new/delete at the link level.
- DLLs, shared libraries, compiler differences, differing STL support all make it harder to do things in a standard way.
- Windows doesn't have a good standard build system. The upshot is that libraries are only on Unix or the Windows build is cobbled together. Environments like Cygwin are really poor compromises. It would be better for everyone if the unix make/configure systems would be updated to work natively on Windows also. It isn't like Windows can change anytime soon without breaking the world. The MS compilers are now free -- the make system is really the only missing part now.
- Each codebase has its own way for deciding how header files are included. Figuring stuff like that out is just a headache.
- Lots and lots of different string types.
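The point in the list above about exceptions versus error codes is easier to see in code. What follows is only a sketch of one common workaround, a thin wrapper that catches exceptions at the library boundary and translates them into a home-grown error code space. The ErrorCode enum and both function names are hypothetical, invented here for illustration.

```cpp
#include <cstdlib>
#include <stdexcept>
#include <string>

// Hypothetical, project-specific error code space.
enum ErrorCode { kOk = 0, kInvalidArgument = 1, kOutOfRange = 2, kUnknown = 3 };

// Exception-style API: reports failure by throwing.
int parse_percent(const std::string& text) {
    if (text.empty()) throw std::invalid_argument("empty input");
    char* end = 0;
    long value = std::strtol(text.c_str(), &end, 10);
    if (*end != '\0') throw std::invalid_argument("not a number");
    if (value < 0 || value > 100) throw std::out_of_range("not in 0..100");
    return static_cast<int>(value);
}

// Boundary wrapper for callers that do not handle exceptions: every
// exception is caught here and mapped onto an ErrorCode instead.
// *out is only written on success.
ErrorCode try_parse_percent(const std::string& text, int* out) {
    try {
        *out = parse_percent(text);
        return kOk;
    } catch (const std::invalid_argument&) {
        return kInvalidArgument;
    } catch (const std::out_of_range&) {
        return kOutOfRange;
    } catch (...) {
        return kUnknown;
    }
}
```

Every library that crosses such a boundary needs a wrapper like this, and every project tends to invent its own ErrorCode, which is exactly the kind of wall to reuse the list describes.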
So if I could summarize the latest state of C++, it appears to have truly gotten worse over the past decade. People are creating new mind games with templates, not everyone uses RTTI, STL is still not yet standard and finally, of course, nobody can decide on a string class. These thoughts are coming from a Google and former Microsoft employee. Microsoft is well known for being a C++ shop, so I take this as a glimpse of the inner workings of Microsoft. What's most interesting, of course, is that the first blog entry seems to imply that C++ is not used at Google. Despite C++'s shortcomings, I would be surprised if this were in fact true. Can someone out there let the world know whether C++ is used at Google?
Last modified 2005-04-04 07:24 AM