
Manageability


How to Choose an Open Source Library

  • Posted by cperez
  • Published: 2008-05-27

Having a lot of open source to choose from is definitely a good thing. The downside is that you have to do some actual work to figure out which option is best for your project. So here are some rules of thumb on how to quickly evaluate and pick an open source library. The emphasis is on "library" because the intent is to incorporate it into our own work, as opposed to simply using it.

  1. Usage - Google's PageRank algorithm ranks web pages based on the quantity and quality of other web pages linking to them. It's similar to the rule of thumb of looking at citations to discern the quality of an academic paper. In a similar spirit, an open source library's quality can be discerned from the number and quality of other open and closed source projects that use it. For example, the Hibernate project has a ton of projects that not only use it but also provide tools to support it.
  2. Extensibility - An essential quality of most popular projects is the presence of a well defined mechanism for extension. Look for this in the library you are evaluating. The lack of this quality can put you in a bind that's extremely difficult to extricate yourself from. For example, Eclipse and JEdit have well defined plugin architectures that make it workable for multiple contributors to enhance the platform without stepping on one another.
  3. Velocity - Is the project improving at a consistent and steady pace? Some well known open source projects (e.g. JDOM) have been progressing at a glacial pace. This observation alone should make you turn to a more vibrant project like DOM4J. JDOM is an example of a project that has fallen victim to its premature popularity.
  4. Scaffolding - You should be seriously critical of a project if it lacks an easy way to build. The lack of a good build system implies a lack of support for collaborative development. In other words, the originators don't find it necessary because, most likely, nobody else is collaborating with them.
  5. API Usability Testing - It should be obvious that any good software project requires extensive testing. However, the presence of extensive unit tests also indicates other qualities, namely a discipline of supporting iterative development. Good projects are built iteratively, which implies a need to regression test against previous iterations and also implies experimentation with the form of the APIs. APIs, just like user interfaces, require an iterative approach to implement well. If you can't see any indication of such an approach being applied to the APIs, then expect APIs that feel like UIs that never went through usability testing.
  6. How To's - Look for information on how to actually use the library. It is even better if that information is available as unit tests. Ignore "marketecture", that is, spiels the project writes about itself. Look instead for evidence like blogs, external articles or third party books that describe actual usage of the library. If someone else has invested the time to write about how they leveraged the tool, then it contributes some credibility. In addition, you get yourself up and running faster.
  7. Developed NOT in a Vacuum - The problem with all too many libraries is that they're developed in a vacuum. Look for indicators that the library is developed as part of a much larger project. This indicates that most of the functionality is added because of an actual requirement as opposed to some developer's whim. The last thing we want to use is a library with feature creep.
  8. Attention to Detail - When you see indications of attention to the minutiae, you may have struck gold. For example, LOG4J takes great pains to fine tune the time it takes to log events, as opposed to simply providing a logging framework. The design of LOG4J is a consequence of these painstaking efforts and not of the collective wisdom (i.e. the least common denominator) of a committee.
  9. Sponsorship - Believe it or not, many of the best open source projects come from paid efforts. The good ones usually come out of a corporation's internal project, a consulting engagement or a research group. The better ones come from a continuous and steady stream of sponsorship. Money doesn't buy you quality; however, money does buy you continuous and steady development. Remember, boredom can kill even the best projects, and cash is a good distraction from monotony.
  10. Don't forget the Code - What about the code? If you do have the time, then do look at the code. After all, that's where the rubber meets the road. You can run the code against many of the static code checkers that are available for Java. They'll give you a quick sample of the quality of the underlying code. They won't tell you whether the code is any good, but they'll at least weed out the ugly.
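One way to apply the ten rules of thumb above is to record a rough score per criterion for each candidate library. The following is only a minimal sketch: the `LibraryScorecard` class, the criterion names, the 0 to 5 scale and the unweighted average are all assumptions made for illustration, not something prescribed by the rules themselves.

```java
import java.util.EnumMap;
import java.util.Map;

// Hypothetical scoring sheet for the ten rules of thumb.
// The class name, the 0-5 scale and the equal weighting are illustrative assumptions.
class LibraryScorecard {
    enum Criterion {
        USAGE, EXTENSIBILITY, VELOCITY, SCAFFOLDING, API_USABILITY_TESTING,
        HOW_TOS, NOT_IN_A_VACUUM, ATTENTION_TO_DETAIL, SPONSORSHIP, CODE
    }

    private final Map<Criterion, Integer> scores = new EnumMap<>(Criterion.class);

    // Records a score between 0 (absent) and 5 (excellent) for one criterion.
    LibraryScorecard rate(Criterion criterion, int score) {
        if (score < 0 || score > 5) {
            throw new IllegalArgumentException("score must be between 0 and 5");
        }
        scores.put(criterion, score);
        return this;
    }

    // Averages over all ten criteria; criteria that were never rated count as 0.
    double average() {
        int total = 0;
        for (Criterion criterion : Criterion.values()) {
            total += scores.getOrDefault(criterion, 0);
        }
        return (double) total / Criterion.values().length;
    }

    public static void main(String[] args) {
        // Made-up scores for an imaginary candidate library.
        LibraryScorecard candidate = new LibraryScorecard()
                .rate(Criterion.USAGE, 5)
                .rate(Criterion.VELOCITY, 2);
        System.out.println("Average score: " + candidate.average());
    }
}
```

Counting unrated criteria as zero is a deliberate choice in this sketch: a library you have not investigated on some criterion should not look better than one you have.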

Remember also to consider suitability. That is, are you comfortable with the level of standards compliance, the licensing restrictions and the availability of support? The best technical solution is not necessarily the best for your needs.

Finally, in no way should the choice to use open source interfere with the success of your projects. The main benefit of open source is options. Rather than the traditional "buy vs build?" question, it is now more like "buy vs build vs borrow?"

Mega Components Reusability - Due Diligence

  • Posted by cperez
  • Published: 2008-05-27

Reusability, despite what many software practitioners have been lamenting, is ubiquitous. Unfortunately, our Computer Science education leaves us poorly predisposed towards observing it in the wild. From the Computer Science perspective, Reusability is framed by the programming language constructs that support it.

Structured Programming developed the construct of the callable procedure. So rather than have similar and redundant code cluttered all over the place, structured programming invented the call stack. A less sophisticated version of this is the construct of macro expansion. Macros can be quite powerful since they work at the meta-level; however, they typically lack the ability to call themselves recursively. The next step in the evolution was to package a group of procedures and their corresponding data structures into a module. Reusability was further improved with the introduction of module inheritance and interface polymorphism. This begot the Object Oriented paradigm that is prevalent today.

So this kind of Reusability is measured by the number of classes we either invoke or inherit from. It's a naive notion of Reusability. Fortunately, Object Oriented practice has led to the development of frameworks. These are rich Object Oriented constructions that employ multiple Design Patterns to drastically reduce the coding effort in a particular domain. Reusability has evolved beyond language constructs into employing Design Patterns, which capture repetitive patterns encountered in practice as reusable design guides.

Before Design Patterns, the popular approach to improving Reusability was the invention of new languages. Fourth Generation (4GL) or even Fifth Generation (5GL) languages were billed as being able to capture the business model and logic at such a high level of abstraction that it was divorced from the minutiae of mundane coding. It was an elegant idea; unfortunately, this approach has failed miserably in practice. The 4GLs and 5GLs were too inflexible and applicable only to very limited domains. Today, these languages are called 'Domain Specific Languages', a name that more concisely characterizes their applicability and therefore sets more realistic expectations as to their capability.

Domain Specific Languages (DSLs) of course are not really new. One may argue that the most prevalent form of reuse has been via DSLs. Take for example the ubiquitous Relational Database Management System (RDBMS). These are programmed exclusively via a DSL called SQL that is based on a Relational paradigm for managing data. The same can be said about Operating Systems (OS) like Unix that employ the triad of Processes, Pipes and Files as the basis of their own DSL. There are very few enterprise systems that do not employ an RDBMS. There are even fewer systems that don't employ an OS.

Practitioners, however, don't treat an OS like Linux or a database like Oracle as a reusable component in the conventional sense. These mega-scale components are platforms that we build on top of. These systems are very much like frameworks. They do all the heavy lifting, and as a programmer you have well defined plug-in points for customization. These plug-in points are defined by the respective DSLs provided by the platform.

So if we can so comfortably accept Linux or Oracle as part of a solution, then shouldn't we be able to be more accepting of other platforms? The answer may be a resounding no, because we've been burned too often by monolithic platforms that tried to do everything but the kitchen sink. The common observation goes like this: "you don't have SAP fit your business, rather you fit your business to SAP". Just like the 4GLs of old, there's a high probability that the kitchen sink approach makes too many assumptions that render it inapplicable to your business.

The key to successful Reusability is in selecting the "right level" of abstraction. Abstractions at the right level are orthogonal to other abstractions. So, for example, I don't need an Oracle specific OS to run Oracle. The process management infrastructure of an OS is independent of the database management constructs of an RDBMS. Some interplay between the abstractions is realistically unavoidable. However, a lower layer abstraction makes possible such things as the OS portability layer that is used in most RDBMS implementations. Java, for example, provides that portability layer so that any Java based RDBMS is portable to any OS that Java supports.
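As a small illustration of Java acting as that portability layer, consider how a Java based database might locate its storage files. This is only a sketch: the `PortableStorage` class, the `tableFile` helper and the directory layout are hypothetical, while `java.nio.file.Paths` is the standard API that hides the platform's path separator.

```java
import java.nio.file.Path;
import java.nio.file.Paths;

// Illustrative only: a hypothetical helper that a Java based database might use
// to locate its storage files without any OS specific path handling.
class PortableStorage {
    // Builds baseDir/database/table.dat using the platform's own separator.
    static Path tableFile(String baseDir, String database, String table) {
        return Paths.get(baseDir, database, table + ".dat");
    }

    public static void main(String[] args) {
        Path file = tableFile("data", "inventory", "parts");
        // The same code runs unchanged on any OS that Java supports;
        // only the printed separator differs.
        System.out.println(System.getProperty("os.name") + " -> " + file);
    }
}
```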

There are many frameworks/platforms out there that provide their abstractions at the "right level". Here are a few of them worth considering: "Document Management Systems", "Workflow Engines", "Rules Engines" and "Identity Management Systems". These are integrated with the many other frameworks out there that support integration, like "EAI", "ETL" and "Portals". Although the list here is exclusively open source, there are also many closed source alternatives that are equally malleable.

There are many key aspects to consider in the selection of a mega-component (see "How to Choose an Open Source Project"). The first item that needs to be nailed down, however, is the integration strategy (see "10 Commandments of SOA Salvation"). The integration strategy should preserve as much as possible of what exists and have sufficient flexibility to address the core problem at hand. In other words, it maintains isolation between components but is able to cohesively bind mega-components into a well coordinated solution. Fortunately, web based systems, by virtue of providing an introspectable web interface, provide a good enough component model to build a foundation from. This is in stark contrast with the database integrated systems and client server systems of yesteryear. Systems integrated via a database have too many implicit means of interaction that are usually intractable. Client server systems have ossified user interfaces that cannot be introspected and therefore cannot be modified on demand. It is no surprise that mashable interfaces have been all the rage. In fact, you should consider the browser as a mega-component.

A mega-component should be selected such that its integration is compatible with the integration strategy. This therefore requires an understanding of the mega-component's component model. Does it have a component model at all? Does it have a framework to plug in customizations via code or DSLs? Does it expose public remoteable APIs? It is important to characterize the degree of reusability of each component. The minimum degree would be that a remoteable API exists or is achievable (see: Screenscraping).
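The questions above can be turned into a rough classification of each candidate mega-component. The sketch below is illustrative only: the `ReusabilityCheck` class, the `Degree` levels and their ordering are assumptions, not an established taxonomy. The only part taken from the text is the minimum bar, namely that a remoteable API exists or can be achieved through screenscraping.

```java
// Hypothetical checklist for rating a mega-component's component model.
// The level names and their ordering are assumptions made for illustration.
class ReusabilityCheck {
    enum Degree { NONE, SCREEN_SCRAPABLE, REMOTE_API, PLUGIN_FRAMEWORK }

    // Maps the answers to the three questions onto the highest applicable level.
    static Degree classify(boolean hasPluginFramework,
                           boolean hasRemoteApi,
                           boolean hasScrapableWebInterface) {
        if (hasPluginFramework) return Degree.PLUGIN_FRAMEWORK;
        if (hasRemoteApi) return Degree.REMOTE_API;
        if (hasScrapableWebInterface) return Degree.SCREEN_SCRAPABLE;
        return Degree.NONE;
    }

    // Minimum bar: a remoteable API exists or is achievable via screenscraping.
    static boolean meetsMinimum(Degree degree) {
        return degree != Degree.NONE;
    }

    public static void main(String[] args) {
        // An imaginary web application with no API and no plug-in points.
        Degree degree = classify(false, false, true);
        System.out.println(degree + " meets minimum: " + meetsMinimum(degree));
    }
}
```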

To summarize, I am advocating a development approach that does not myopically assume that the only mega-components to consider are the database and the OS. There are other mega-components out there that one must perform due diligence on. Don't be so arrogant as to think that nobody else has built the functionality that you are building. In all likelihood, someone already has, and it'll take you less time to learn and reuse what already exists than to start from scratch. The software industry is sufficiently mature that the mega-components are out there if you are in fact willing to take a look.

Open Source Resource and Task Management Projects written in Java

Another list of interesting and related Open Source tools written in Java

  • Posted by cperez
  • Published: 2008-05-24

I've done a little browsing on some applications that you may find useful in managing your development organization. All applications are open source and written in Java (not necessarily 100% pure). This should give someone an idea of building a much better and more complete mashup someday.

  • XPlanner - XPlanner is a project planning and tracking tool for eXtreme Programming (XP) teams. In the XP planning process, the customers pick the features to be added (user stories) to each development iteration. The developers estimate the effort to complete the stories either at the story level or by decomposing the story into tasks and estimating those. Information about team development velocity from the previous iteration is used to estimate if the team can complete the stories proposed by the customer. If the team appears to be overcommitted, the set of stories is renegotiated with the customer. The XPlanner tool was created to support this process and address issues experienced in a long-term real-life XP project.
  • WebPBC - WebPBC (Web-based Project Budget Consolidator) is a Web-based application that enables small to medium-sized companies to do budget consolidation on their projects. WebPBC is suitable for companies that coordinate projects that are geographically distributed. The Web interface allows managers, accountants and organizers from anywhere in the world to access, update and consolidate budget data stored on a central server.
  • Vishnu - Vishnu is a powerful tool for performing all types of scheduling. It includes an automated scheduler that can find optimized schedules and a browser-based user interface for viewing and editing schedules and data.
  • TrackIt - TrackIt is a web based project tracking tool designed from the ground up to provide maximum flexibility, customization, and most importantly, usefulness to the developer. Features include customizable RSS Feeds, customizable rich content area, a high level view of all ticket types and Eclipse Plugins.
  • Teamwork - Teamwork is a software application specifically for team work management. It is a tool for managing the work of a group of people and, at the same time, a tool for keeping their communication coordinated and managed. Originally open source.
  • Scarab - Scarab is a highly customizable artifact tracking system. Primary features include data entry, queries, reports, notifications to interested parties, collaborative accumulation of comments, dependency tracking. Additionally it is fully customizable, multi-lingual, servlet based, imports/exports XML and skinnable.
  • Rapla - Rapla started as a simple room booking software, but in the last five years it evolved into a fully configurable framework for event and resource management. The primary target is universities. Rapla allows coordination between the lecturers and the administration. It offers multiple ways to view the available resources and schedule events.
  • Memoranda - Memoranda is intended for people whose daily work is shared between a few different projects. It is a tool that helps you keep track of your projects, irrespective of their nature.
  • Open Workbench - Open Workbench is an open source desktop application that provides robust project scheduling and management functionality. Already the scheduling standard for more than 100,000 project managers worldwide, Open Workbench is a free and powerful alternative to Microsoft Project. The internal engine is Java based but the UI is based on MFC.
  • MPXJ - This library provides a set of facilities to allow project information to be manipulated in Java. MPXJ supports three file formats, Microsoft Project Exchange (MPX), Microsoft Project (MPP,MPT), and Microsoft Project Data Interchange (MSPDI). MPP functionality depends on the POI library produced by the Apache Jakarta project. MSPDI functionality depends on the Sun JAXB
  • JTrac - JTrac is a generic issue-tracking web-application that can be easily customized by adding custom fields and drop-downs. Features include customizable workflow, field level permissions, e-mail integration, file attachments and a detailed history view.
  • JETeam - JETeam is a J2EE application that aims to help members of a team working together. JETeam features include keeping up to date with the current status of each project, creation of tasks that are assigned to developers, attaching notes to tasks, receiving notifications on updates, maintaining a knowledge base of all solutions/problems encountered during the project development.
  • GanttProject - A Swing based project scheduling application featuring gantt chart, resource management, calendaring, import/export (MS Project, HTML, PDF, spreadsheets).
  • FUTURe - FUTURe is an application that will deal with time management, not only for an individual but also for groups/projects. It is inspired by Tools for Thought by Howard Rheingold.
  • E-Gantt - E-Gantt is a Gantt Chart library for Java Swing and a scheduling visualization tool. The library is typically used for editing and visualizing complicated work schedules. E-Gantt has been successfully integrated in many open source projects and large commercial projects within industries such as scheduling research, medical research, US military defense projects and network administration tools.
  • Cougaar Planning - Cougaar is a product of two consecutive, multi-year DARPA research programs into large-scale agent systems spanning eight years of effort. The first program conclusively demonstrated the feasibility of using advanced agent-based technology to conduct rapid, large scale, distributed logistics planning and replanning. The second program developed information technologies to enhance the survivability of these distributed agent-based systems operating in extremely chaotic environments. The Cougaar Planning project supports the planning Domain and the Task/Allocation/AllocationResult blackboard object structure.
  • ALF - The Application Lifecycle Framework (ALF) Project enables development and IT tools to be orchestrated in support of the consumer’s business needs. ALF provides the logical definition of the overall interoperability business process. This technology handles the exchange of information from one tool to another, the business logic governing the sequencing of tools in support of the application lifecycle process, and the routing of significant events as tools interact. ALF achieves this by providing a common infrastructure (SOAP Web Services, BPEL orchestration engine and the ALF Event Manager), and a set of domain vocabularies that define the events, objects and attributes. Together these address the issues of tool interoperability and interchangeability, process segmentation, reusability, and versioning. ALF provides various Common Services (logging, notifications, security, etc.) that are easily integrated into BPEL processes to create richer interoperability processes.
  • OpenProj - OpenProj is a desktop alternative to Microsoft Project. It can open existing Microsoft or Primavera files. OpenProj employs an advanced scheduling engine and provides Gantt Charts, Network Diagrams (PERT Charts), WBS and RBS charts, Earned Value costing and more. OpenProj has been included with Star Office in Europe.
  • Project.Net - Project.net is a collaborative project execution application. Project.net's software addresses the needs of multiple stakeholders - executives, portfolio managers, project managers, and team members - for current and accurate information on the status of important projects and the performance of distributed project teams. The Project.net application is a scalable, customizable Web-based product that facilitates project management, collaboration and execution across the enterprise.

Let me know if I missed a project that is equally relevant.
