IN 2010 Oracle accused Google of pilfering its
intellectual property (IP) for use in the Android mobile platform. It
has since presented oodles of forensic evidence, including e-mails among
Google executives and bits of allegedly copied program code. On May 7th
a federal jury in San Francisco found in its favour. Sort of.
Google,
the jurors decided, had indeed copied Oracle's IP related to bits of
its Java infrastructure. For a start, the search giant purloined nine
lines of Oracle's code for its own version of Java, out of 15m that make
up the contentious software. Damages for this misdeed, which will be
set at a later stage of the trial, cannot exceed $150,000 by statute.
More controversially, Google was also deemed to have infringed Oracle's
copyright by mimicking "the overall structure, sequence and organisation
of copyrighted works", even where it had not directly copied any code.
Curiously,
the jurors could not agree whether this infringement was in fact
acceptable under the law. This means that Oracle cannot collect damages
from Google (it was seeking up to $1 billion) or require Android to be
partially rewritten, at least for now. To add to the confusion, it
emerged that one juror had discussed the case with her husband, which
the law forbids. Google has called for a mistrial. It seems likely
that this first part of the case, which now moves on to humdrum patent
disputes, will be either retried or appealed.
So, what is all the
fuss about? Oracle's copyright-related accusations centred on two bits
of software plumbing: application programming interfaces (APIs) and Java
virtual machines (JVMs).
Start with APIs. These are the links
that allow software developers to create applications that interact
seamlessly with a programming language (like Java or C++) or a service
(like Facebook or Twitter). Without an API, programmers would first have
to suss out how the gears and cogs inside the target platform work, and
then construct software to mesh with those. Moreover, different
hardware platforms would require separate software versions, which would
need to be constantly updated as languages or services are tweaked by
their makers. APIs limit such inefficiencies.
Fortunately for
programmers, they do not need to write software in machine code, an
impenetrable string of 0s and 1s that a computer processor understands.
Instead, a separate program called a compiler translates code written in
a particular "high-level" language (whose vocabulary and syntax are not
entirely unlike those of natural language) into machine-readable
commands. APIs make coders' lives easier still, by providing access to
ready-made chunks of code to perform some basic, well-defined tasks,
from simple ones like displaying dates to the more complicated, such as
creating encryption keys.
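The two tasks just mentioned can be sketched in a few lines of present-day Java. This is an illustration only, not code from the case: the class and method names (ReadyMadeTasks, formatDate, newAesKey) are invented for the purpose, whereas the library calls they rely on belong to the standard Java API.

```java
import java.security.NoSuchAlgorithmException;
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;
import javax.crypto.KeyGenerator;
import javax.crypto.SecretKey;

// Hypothetical demo class; only the library calls inside it are real API.
class ReadyMadeTasks {

    // The simple task: turn a date into text without hand-written calendar logic.
    static String formatDate(LocalDate date) {
        return date.format(DateTimeFormatter.ISO_LOCAL_DATE);
    }

    // The complicated task: ask the library for a fresh 128-bit AES key.
    static SecretKey newAesKey() {
        try {
            KeyGenerator generator = KeyGenerator.getInstance("AES");
            generator.init(128);
            return generator.generateKey();
        } catch (NoSuchAlgorithmException e) {
            // Every standard Java platform is required to supply AES.
            throw new IllegalStateException("AES key generator unavailable", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(formatDate(LocalDate.of(2012, 5, 7))); // prints 2012-05-07
        System.out.println(newAesKey().getEncoded().length + " bytes of key material");
    }
}
```

In neither case does the programmer need to know how calendars or ciphers are handled inside the library. Knowing the name of the call and what it returns is enough.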
An API for a particular language is
paired with a functional counterpart, a library containing snippets of
code in that language which perform the tasks in question. These can be
integral parts of languages, paid and licensed add-ons, or some
combination of public source and free-but-copyrighted code. Then there
is an instruction manual in plain, albeit technical, English. It includes
descriptions of what each snippet does, together with a command (known
as a function call) that, if inserted into a program's source code, acts
as a shortcut to the relevant section of the library. Any snippet in
the library could be written from scratch—but this takes time and,
crucially, fails to take advantage of the extensive testing the existing
code in the library has been subject to. It is easier, and safer,
simply to bung a reference to the required function into the newly
created program.
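The choice between writing a snippet from scratch and calling the library can be shown side by side. The sketch below assumes a sorting task, which the text does not mention. Arrays.sort is a genuine standard-library call, but the class CallOrWrite and both of its methods are hypothetical names chosen for illustration.

```java
import java.util.Arrays;

// Hypothetical demo class contrasting a home-made snippet with a library call.
class CallOrWrite {

    // Written from scratch: a plain insertion sort that returns a sorted copy.
    static int[] handWrittenSort(int[] input) {
        int[] a = input.clone();
        for (int i = 1; i < a.length; i++) {
            int value = a[i];
            int j = i - 1;
            // Shift larger elements one place right to make room for value.
            while (j >= 0 && a[j] > value) {
                a[j + 1] = a[j];
                j--;
            }
            a[j + 1] = value;
        }
        return a;
    }

    // The shortcut: a single function call into the library's well-tested code.
    static int[] librarySort(int[] input) {
        int[] a = input.clone();
        Arrays.sort(a);
        return a;
    }

    public static void main(String[] args) {
        int[] data = {37, 9, 173, 166, 15};
        System.out.println(Arrays.toString(handWrittenSort(data)));
        System.out.println(Arrays.toString(librarySort(data)));
    }
}
```

Both methods yield the same sorted list. The second takes one line to invoke and leans on code that has been tested far more widely than any home-made loop.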
To run on a particular piece of hardware, a
program written in a high-level language must first be converted, or
"compiled", into machine code (this typically happens after the program
has been completed and prior to distribution). When the hardware runs
the compiled program and reaches the function call, it jumps to the
relevant section of the library (which is included in the completed code
and compiled with it), runs the function's code, and jumps back to the
main flow of the program.
Besides snippets of code in a high-level
language, some APIs' code libraries contain portions pre-compiled
for specific hardware platforms, with the appropriate one picked
automatically when the remainder of the program is compiled for a given
device. Java API code libraries contain only high-level code. A Java
program is compiled all at once. This is where virtual machines come in.
A
virtual machine is a computer program which simulates a physical
processor. It allows applications designed for one platform, Microsoft
Windows, say, to run on another, like Apple Macintosh. A Java VM is not
itself written in Java but in another language like C++, and then
compiled into the machine code for the device on which it has been
installed. Every combination of processor and operating system (Apple's
iMac running on an Intel chip, say) therefore has its own unique JVM.
Just
as real processors understand a specific machine vernacular, all JVMs
speak a machine-code-like version of Java (called Java byte-code). In
effect, they act as translators between Java byte-code and the physical
hardware's machine language. In theory, then, any Java program only
needs to be compiled once and should run on any JVM, prompting Java's
developer, Sun Microsystems (which Oracle bought in 2009), to hail it as
"write once, run anywhere".
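A rough way to see the idea at work is a program that reports which virtual machine and operating system it finds itself on. The class name WhereAmI and the method describeHost are made up for this sketch. The system properties it reads are standard ones.

```java
// Hypothetical demo: one compiled class file, whatever the host machine.
class WhereAmI {

    // Describes the virtual machine and platform executing this byte-code.
    static String describeHost() {
        return System.getProperty("java.vm.name")
                + " on " + System.getProperty("os.name")
                + "/" + System.getProperty("os.arch");
    }

    public static void main(String[] args) {
        System.out.println("Same byte-code, running on: " + describeHost());
    }
}
```

Compiling the file once with javac produces WhereAmI.class, which holds byte-code rather than machine code. Copied unchanged to a machine with a different processor or operating system, that file should print a different description, provided a compatible JVM is installed there.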
In practice, however, Oracle offers
four types of JVM which support distinct dialects of Java byte-code,
tailored for smart cards, mobiles, desktops and servers. A program
compiled for a server JVM may not necessarily work on a mobile JVM, or
vice versa, as some elements needed to carry it out may be missing from
the other sort of virtual machine. A slimmed-down mobile JVM, for
instance, lacks the ability to perform complex server tasks, which are a
drain on processing power and would unnecessarily slow down a
smartphone. A server JVM, meanwhile, does not need to be
frugal with battery power.
Oracle also licenses other
companies to create their own JVMs, on the condition that they can show
that their virtual machines are capable of running any software written
for at least one of the four classes of virtual device. This lets
device-makers create bespoke JVMs for their gadgets.
Google
created its own version of Java, which it dubbed Dalvik, for its Android
mobile platform, complete with Dalvik APIs, libraries and VMs. Although
Dalvik and Java differ on the surface, their structure and many
features are identical. As a consequence, a Java program can be adapted
to work in Dalvik and vice versa. Crucially, programmers who know one
are, thanks to the languages' fundamental similarities, proficient in the other.
When a Dalvik program is compiled for use on the Android platform,
however, its byte-code is different from Java's—and therefore
incompatible with other JVMs.
To create all its Dalvik
paraphernalia Google relied on open-source projects, only some of which
had secured licences from Oracle. It supplemented them with code of its
own, without obtaining a licence. The upshot is that 37 of Dalvik's 173
APIs are functionally identical to Java's (which itself sports a total
of 166), albeit implemented using different underlying code.
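What "functionally identical, albeit implemented using different underlying code" means can be sketched with a toy example. Nothing below is taken from Oracle's or Google's software. The interface Maximum and the two classes that implement it are hypothetical, chosen only to show one declaration backed by two unrelated bodies of code.

```java
import java.util.Arrays;

// Hypothetical demo: one API declaration, two independent implementations.
class SameApiDifferentCode {

    // The shared face of the API: a name, its inputs and its output.
    interface Maximum {
        int max(int a, int b);
    }

    // First implementation: a direct comparison.
    static class ComparisonMaximum implements Maximum {
        public int max(int a, int b) {
            return a >= b ? a : b;
        }
    }

    // Second implementation: sort the pair and take the larger element.
    static class SortingMaximum implements Maximum {
        public int max(int a, int b) {
            int[] pair = {a, b};
            Arrays.sort(pair);
            return pair[1];
        }
    }

    public static void main(String[] args) {
        Maximum first = new ComparisonMaximum();
        Maximum second = new SortingMaximum();
        int[][] inputs = {{3, 9}, {-4, -7}, {5, 5}};
        for (int[] in : inputs) {
            // A caller sees the same answer from either implementation.
            System.out.println(first.max(in[0], in[1]) + " " + second.max(in[0], in[1]));
        }
    }
}
```

A program that calls max cannot tell which implementation it has been handed, since the inputs and outputs match. Only the code behind the declaration differs.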
All
this irked Oracle in several ways, prompting the lawsuit. First, the
company alleged that Google pinched bits of its code for Dalvik's
API-associated libraries. Google admitted this but said it had removed
the contentious snippets long ago. The jury agreed with Google, apart
from the nine lines mentioned in its verdict. Second, Oracle accused
Google of copying its language designs, using its API descriptions, and
building a virtual machine incompatible with other elements of the Java
infrastructure, without obtaining permission or licences. Here, the
jurors agreed with Oracle.
In doing so, they were told by the
presiding judge to assume that it is not just the particular wording of
the plain-English API descriptions, the function calls, or the
underlying code that are protected by copyright. So are the functions
themselves, regardless of how they are implemented in software, at least
so long as the functions' inputs and outputs are indistinguishable.
Some observers found this odd, given that there is currently no clear
doctrine about whether API functionality is in fact subject to
copyright.
Either way, despite concluding that infringement had
occurred, the jury still deadlocked on whether Google's actions fall
within the "fair-use" doctrine, which in the context of software might
be construed as permitting Google to figure out and emulate all that
Java does without seeking a licence or permission. The judge accepted
this partial verdict and may yet bring his own opinion to bear on the
question of doctrine at a later stage of the trial.
Google
insists that API functions, as separate from code, cannot be subject to
copyright. That, Google has warned, would be like claiming ownership of
ordinary words in a language. If its call for a mistrial is heeded, it
will rehearse those arguments anew. If not, it is likely to appeal
against the ruling, possibly all the way to the Supreme Court.
Many
tech types are jittery about a verdict fully in favour of Oracle.
Equivalent API functions based on distinct source code abound across all
aspects of hardware, software and services, on the internet and
offline. If the court ultimately sides with Oracle, the ruling might reshape the
nature of technological development.