Friday, April 20, 2012

Oracle, Google and copyrighting APIs

The second quarter of 2012 has started out with a fairly intense showdown between Google and Oracle.

The conflict?  Java, its implementation within Google's Android, and whether Oracle (the semi-new owner of Java) deserves a chunk of Google's cash over a copyright violation in that usage.

Whether Oracle does or does not get paid will be resolved in the near future, I'm sure.  But one of the things going on in the courtroom that hasn't been stressed enough is the suggestion by Oracle that the copyright law should be amended to include APIs.

Taking the stand yesterday was Joshua Bloch, Java guru.  Joshua is a former Sun employee who is now on the Android team at Google.  Oracle's lawyers were basically trying to get him to admit that his knowledge of Java's APIs (several of which he worked on) impacted the way he designed APIs while working on Android's implementation at Google.  They were also intent on exploiting the pride that comes with the creativity good API developers possess by getting Mr. Bloch to admit that APIs are a "creative work", something that falls within the definition of a copyrightable item.

While I do agree that writing elegant code is a bit of an art form, and can bring about a sense of pride and creativity akin to writing a novel or creating a painting, it is rare that all of this creativity comes from one centralized brain.

Furthermore, while I think that APIs are creative efforts, they are merely components of the bigger picture.  On their own, they don't do much.  If one were to make an analogy to cooking, the ingredients of a recipe would be the APIs, and the dish would be the application.  And like ingredients, APIs can be home grown or bought from a vendor.  But no single vendor has the right to claim they invented tomatoes and that nobody else can grow them, even if they were to somehow convince everyone they were the first to figure out how.  And if they could, how do you suppose that would impact the price of tomatoes?  How many suspicious allegations would get thrown about if someone came up with a fruit that's similar, whether it was done honestly or not?  And how many great dishes that depend upon tomatoes would never have been invented because of the controlled availability and fear of legal action?  Or what if an artist could prevent other artists from using a particular brush stroke because they "invented it"?  APIs are nearly on that level of abstraction and expression.

Well-written APIs, as I'm sure Bloch would agree, usually follow a number of well-established patterns and practices.  In fact, most APIs are mixes of already established ideas that have been shared within the development community on a global scale.  This has allowed us to grow from simple monochrome text calculators to satellite-powered, complex mapping applications on your cell phone in the short span of 15-20 years.  All of this progress is the result of building upon existing knowledge, which is the very essence of API design.

If entities can suddenly claim ownership of this knowledge because it's implemented in a copyrightable API, this brings enormous liability to just about every development effort out there, past, present and future.

I can virtually guarantee that most APIs published today have portions of identical code in them.  Sometimes there's only one logical way to solve a problem, and if methods can be found to improve the process of solving that problem, developers should be free to make those improvements.  Both for the benefit of the end-users and the company providing the service facilitated by the developer's knowledge.

Under a "copyrighted API" law, a huge mess of "I did it first" litigation could become the new way to patent troll.  And it could easily block the progress or birth of the next Facebook or Dropbox.

And not only that, legally responsible release of software would mean proper copyright protection would have to be put in place before it sees the light of day.  And to be completely responsible, any change to improve the process would require a fair amount of research to confirm that your improved version doesn't somehow violate someone else's copyright.  And if you had duplicated someone's effort?  You'd have to throw it away, wasting the time and money it took to get you to that improvement.

This law would make software development cumbersome and nearly impossible to do legally.  I know I've personally written code without knowledge of someone else's work, only to find out later that it was nearly identical (differing only by a couple of interchanged lines and different naming patterns).  And I'm sure other developers have come up with code similar to mine without even knowing what I'm working on.  Which means most of us come up with very similar solutions when presented with the same or a similar problem.  Guess why?  Education.  Just like how doctors are shown and trained in certain procedures, developers are trained to approach problems in a similar way.  Independently ask a group of developers to write a function/API method to swap two numbers.  I bet you'll find the implementations to be nearly identical from person to person.  Helpful sites like StackOverflow are potentially a HUGE liability given how many functions exist there that people just copy into their own projects.

So to what degree should an API be eligible for copyright if that's the case?

Such a law will stifle innovation, and slow down the progression of software and systems development we have grown to expect by forcing a legal dependency into the process.  Oracle better be careful.  Especially given how developers jump from company to company.  Suppose a Microsoft employee on the SQL team went to work for Oracle.  Could Microsoft make a case of copyright infringement based on the relatively safe assumption that some of this ex-employee's knowledge cultivated at Microsoft may have landed in Oracle's codebase?

It is a shame that Larry Ellison has lost touch with his roots.  He himself didn't invent the notion of relational databases, and in fact built his company on the ideas of Edgar Codd.  He also built his company based on the knowledge he gained from a database project while working at Ampex.  Surely if an API copyright law were in place back then, his former employer might have suspected some of the code he created there made it into his product at SDL (the former name of Oracle Corp).  And he has clearly forgotten the frustration he had over attempting to work with IBM over System R compatibility and how stifling that must have felt at the time.

I think it's safe to say that Oracle may not even exist today if such a law were in play at that time.

As a developer, I'm watching this case closely.  If the end result culminates in a new copyright law, the impact of this case will be far more than Google having to pay some sort of retroactive licensing fee to Oracle.  It will affect how we all develop software.  And Larry, if you're reading this (haha.. right).. PLEASE..  Get your check from Google if you must, but realize that APIs have existed outside of copyright since nearly the dawn of computers themselves.  And you have benefited greatly from it.  Please don't burn the bridge behind you.

Wednesday, April 11, 2012

WCF "just works", even when you may not want it to.

So my current project requires a notion of file upload/download between the client and a WCF service.

As such, I was required to implement MTOM/Streaming capabilities to accommodate large files.
The usage of this is documented fairly well on a very generic level at MSDN.

To make a long story short, there are some semi-undocumented gotchas that can send you searching and doing trial and error until you find the problem to be a small misconfiguration that's corrected with a mere couple of keystrokes.  How annoying!

So here we go...

WCF in .NET has a default service/binding context that allows your service to "just work" for typical situations.
Unfortunately, an MTOM/Streaming implementation does not qualify as a typical situation.  And equally unfortunate, you won't find out things are not going quite right until run-time, when you begin to exercise the limits of the typical situation.

If you're mis-configured and step outside of the norm, you will see anywhere from helpful to vague exceptions and dialogs such as these:

The maximum message size quota for incoming messages (65536) has been exceeded. To increase the quota, use the MaxReceivedMessageSize property on the appropriate binding element

or something along the lines of this:

The message cannot be processed at the receiver, due to a mismatch at the EndpointDispatcher. This may be because of either a contract mismatch (mismatched Actions between sender and receiver) or a binding/security mismatch between the sender and the receiver.  Check that sender and receiver have the same contract and the same binding

or maybe this:

Content Type text/xml; charset=utf-8 was sent to a service expecting application/soap+xml; charset=utf-8.  The client and service bindings may be mismatched.

or the ever so vague:

HTTP 400: Bad Request

Yet you've changed things appropriately on client and server in your config files and things still don't work.  Suddenly the "just works" concept has created a bunch of false impressions of what is actually happening.

If you're suddenly finding that WCF is ignoring all of your maxReceivedMessageSize or messageEncoding and other customizations in your binding settings, odds are the service is ignoring your custom binding and using the built-in "typical" one.  There are a couple of places you need to check to ensure you're telling WCF to look at the right binding.  One is obvious (and widely documented). The other is not.

There are a couple of things that can cause WCF to ignore your custom binding:
First, the obvious...

In your Web/App.config in the WCF service, you must specify a name for your custom binding (the name attribute here):

        <binding name="FileStreamBinding" maxReceivedMessageSize="2147483647" messageEncoding="Mtom" transferMode="Buffered">

Now, make sure your bindingConfiguration in your endpoint matches the name of the custom binding you want the service to obey:

    <service name="SomeNamespace.SomeOtherNameSpace.SomeCoolService">
      <endpoint address="basic" binding="basicHttpBinding" bindingConfiguration="FileStreamBinding"
        name="WebClientHttp" contract="SomeService.ICoolServiceContract" />
      <endpoint address="mex" binding="mexHttpBinding" name="MetadataServices"
        contract="IMetadataExchange" isSystemEndpoint="false" />

This is the most documented thing to check.  I include it here to keep all the troubleshooting in one place, not to repeat the efforts of others.
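
To show where those two fragments live, here's a rough sketch of the surrounding config.  The wrapper elements (system.serviceModel, bindings, services) are the standard WCF layout; the binding and service values are copied from the snippets above.  I've left out behaviors and anything else a real service may need (the mex endpoint, for example, usually needs a metadata behavior), so treat this as a skeleton and not a complete file:

```xml
<configuration>
  <system.serviceModel>
    <bindings>
      <basicHttpBinding>
        <!-- The custom binding. Its name is what the endpoint refers to below. -->
        <binding name="FileStreamBinding" maxReceivedMessageSize="2147483647"
                 messageEncoding="Mtom" transferMode="Buffered" />
      </basicHttpBinding>
    </bindings>
    <services>
      <service name="SomeNamespace.SomeOtherNameSpace.SomeCoolService">
        <!-- bindingConfiguration must match the binding name exactly. -->
        <endpoint address="basic" binding="basicHttpBinding"
                  bindingConfiguration="FileStreamBinding"
                  name="WebClientHttp" contract="SomeService.ICoolServiceContract" />
      </service>
    </services>
  </system.serviceModel>
</configuration>
```

If bindingConfiguration is missing or doesn't name an existing binding, the endpoint doesn't get your customized settings.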

Still doesn't work?

Now the not so obvious... Look at the .svc file in your WCF host.  You know, the file that you usually right click and "View in Browser" to see the instructions for consuming the service.  In our case, we'll say it's called "SomeCoolService.svc"

Typically, this file has one line in it that looks something like this:

<%@ ServiceHost Language="C#" Debug="true" Service="SomeNamespace.SomeOtherNameSpace.SomeService"  %>

The service described here needs to correlate to the service you've defined in your config:

    <service name="SomeNamespace.SomeOtherNameSpace.SomeCoolService">
      <endpoint address="basic" binding="basicHttpBinding" bindingConfiguration="FileStreamBinding"
        name="WebClientHttp" contract="SomeService.ICoolServiceContract" />
      <endpoint address="mex" binding="mexHttpBinding" name="MetadataServices"
        contract="IMetadataExchange" isSystemEndpoint="false" /> 

See the error?  The final element of the service name in the config is "SomeCoolService".  In the actual .svc file, the final element is "SomeService".

Since the .svc file defines the running service, this is the service name that the application is aware of at run time.  When the initialization process interrogates the config file for custom service bindings, it can't find any service defined as "SomeService".  So, in an effort to make WCF "just work", the default typical bindings are used instead.

The solution here is clear.  Make the service names match in the .svc file and the .config file:

    <service name="SomeNamespace.SomeOtherNameSpace.SomeService">
      <endpoint address="basic" binding="basicHttpBinding" bindingConfiguration="FileStreamBinding"
        name="WebClientHttp" contract="SomeService.ICoolServiceContract" />
      <endpoint address="mex" binding="mexHttpBinding" name="MetadataServices"
        contract="IMetadataExchange" isSystemEndpoint="false" /> 

(I'd probably rename the .svc file as well to keep things consistent) :)
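
Since the client has to agree with the service on these settings too, here's a hedged sketch of what the matching client-side config might look like.  The endpoint address and the client's contract name are assumptions for illustration (a generated proxy usually has its own contract name); only the binding values come from the service config above:

```xml
<configuration>
  <system.serviceModel>
    <bindings>
      <basicHttpBinding>
        <!-- Same settings as the service-side binding. -->
        <binding name="FileStreamBinding" maxReceivedMessageSize="2147483647"
                 messageEncoding="Mtom" transferMode="Buffered" />
      </basicHttpBinding>
    </bindings>
    <client>
      <!-- Hypothetical address and contract name: substitute your own. -->
      <endpoint address="http://localhost/SomeCoolService.svc/basic"
                binding="basicHttpBinding" bindingConfiguration="FileStreamBinding"
                contract="SomeService.ICoolServiceContract" name="WebClientHttp" />
    </client>
  </system.serviceModel>
</configuration>
```

The same rule applies here as on the server: if the client endpoint's bindingConfiguration doesn't name your binding, the client falls back to the default settings, 65536-byte quota included.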

In conclusion, I can see where the inclusion of this "default" binding is useful.  Perhaps even essential to keep the learning curve low in order to get developers with lesser knowledge and/or patience to accept and use it over the traditional web service models.

But as illustrated above, sometimes this "just works" behavior creates a situation where a harsher, all-out failure might have saved time hunting down the bug.