"Open Source", "Free Software" and other beasts - summary Introduction Many people will hear about Linux in the news, being the cool new operating system that everyone can use free of charge. Those who become interested in it enough or actually start working with it, will learn that it is made out of many independant "open source" components. Now, after enough time (perhaps very soon), they will learn that the term "free software" (where free is free as in "free speech" and not free as in "free beer") can be used as an alternative to the adjective "open source". But what is open source and free software? What distiniguishes them from other software that is available to the public at no cost or is distributed as shareware? Note that the terms "free software" and "open source" would be used throught the lecture to refer to the same phenomena. I do not religiously stick to either term. Recently I've seen the term OSS/FS (short for "open source software/free software") around. I'm not going to use it because it is too sluggish, and is bound to make fanatics of the "free software" term mad anyhow. Software Licenses and "Proprietary" Software Note that this section deals with the legal details of distributing software, and the so-called licenses that dictate what can be done with them. Note that I am not a lawyer, so I may interpret things incorrectly. Let me know if you find something incorrect here (or in this document as general) and I'll fix it. Software out of being a sequence of bits, that can be transcribed to a paper, spoken or otherwise transported is considered speech and so is protected by the Free Speech principile of Liberalism. Thus, writing software and distributing is a constitutional right in most liberal countries. (possibly with some exceptions or deviations, which are besides the point) Nevertheless, a piece of software, as any other text, can be copyrighted. 
Copyright involves making sure that the software, as given to someone other than its originator or copyright holder, will be restricted in use or modification. An originator can outline what he believes to be a proper use of the software in a code license (which applies to the code) or an "End-User License Agreement" (or EULA, which applies to given binaries).

Proprietary software, i.e. software whose use, modification or distribution is encumbered, was a relatively new phenomenon if you take a look at the early history of computing. It actually started even before the time when Microsoft, then a very small company, wrote Altair Basic, and Bill Gates published the famous (or possibly infamous) open letter to Hobbyists. IBM and other companies distributed proprietary software for mainframe systems a long time before the Personal Computer revolution.

The PC revolution, however, made the situation more critical. Computers became faster, more powerful, with larger memory, and more common as time went by. At the moment, there are hundreds of millions of Pentiums and other computers out there, and millions of new ones keep being sold. Yet, the majority of these computers mostly run software that cannot be modified or distributed, at least not effectively or legally.

The free software (or open-source) movement started as an antithesis to the tendency of vendors to hide the details of their software from the public. The Linux operating system with its various components (most of which are available for other systems as well, and are not affiliated with the Linux kernel in particular) is the most visible showcase of this phenomenon. By installing Linux it is possible to turn an everyday personal computer into a full-fledged UNIX clone, which is a fully capable GNU system. This can cost little if any money, and the various components of the operating system are all freely modifiable and can be re-distributed in their modified form.
It is not the only place where free software can be used. It is in fact possible to turn a Windows installation into a Linux-like GNU system as well (see Cygwin for instance).

Meaning of the terms

According to the Free Software Definition, free software includes four freedoms:

0. The freedom to run the program, for any purpose.
1. The freedom to study how the program works, and adapt it to your needs. Access to the source code is a precondition for this.
2. The freedom to redistribute copies so you can help your neighbour.
3. The freedom to improve the program, and release your improvements to the public, so that the whole community benefits. Access to the source code is a precondition for this.

The Open Source Definition is similar, but some licenses can qualify as open source and not as free software. This is usually not an issue, because the majority of open source software out there is free as well.

Despite common belief, selling free/open-source software is perfectly legitimate. In fact one can charge as much as one pleases for it. Nevertheless, most free software is distributed at no cost or very cheaply on the Internet and other media. This is because its freely distributable nature leaves it little sale value, so there usually is no point in attempting to mandate a charge for it.

Another common misconception is that it sometimes cannot be modified or customized for internal use. In fact, all free software (but not all open source software) can. Only when you wish to distribute it (free of charge or commercially) may you have to distribute your changes (depending on the license).

Using open source software to process proprietary content, or having it be processed by non-free programs, is also always allowed. Thus, an open-source C compiler can be used to compile the code of proprietary programs like the Oracle Database Server.

History

This section is not a definitive overview of the history of the free software movement.
It focuses on the issues regarding the usage of the common terms.

Early Days, AT&T UNIX, BSD

The free software movement (before it was called that) started organically from individuals who distributed code they wrote under the Public Domain or under what would now be considered open source or semi-open source licenses.

AT&T UNIX, which started in 1969, was the first showcase for this movement. Several Bell Labs engineers led by Ken Thompson developed UNIX for their own use and, because of legal restrictions AT&T faced, decided to distribute it to academic and other organizations for free with the source included (that license did not qualify as open source, but it was pretty close). UNIX eventually sported the C programming language, which made it easier to write code that would run on many platforms, and the UNIX sources included a C compiler that was itself written in C.

Around the early 70's the only computers capable of running UNIX were mainframes and the so-called "mini-computers", so there initially weren't many installations, as only large organizations could afford to buy computers to deploy UNIX on. That changed as integrated circuits and computers became cheaper and more powerful. Very soon, cheap UNIX-based servers and workstations became commonplace and the number of UNIX installations exploded.

Footnote: Present day: Linux, etc. on PCs.

Link to Nadav Har'El's coverage of the BSD and early AT&T UNIX history:

The University of California at Berkeley forked its own version of AT&T UNIX and started re-writing parts of the code, incorporating many changes of its own. The original parts were licensed under the BSD license, which is a copyright license that is virtually Public Domain. The BSD system became very popular (perhaps even more than the AT&T one).

Another thing that greatly benefited the early open-source movement was the ARPAnet, which eventually became the Internet.
Gradually, sources and modifications to various UNIX programs were regularly posted and exchanged on the Internet under free or semi-free licenses.

Richard Stallman, the GNU project, and the "Free Software" term

After a while, the legal restrictions imposed on AT&T subsided, and it started to "smell money" and believe it could do better selling UNIX commercially. It created the AT&T System V system, touted it as better than the earlier AT&T UNIX and the BSDs, and sold it to vendors under a very restrictive license that forced them to keep the source code to themselves. Even cooperation between two different vendors was not allowed.

Gradually, vendors licensed the System V source code and ported it to their own architectures. This caused an explosion of proprietary UNIX systems. What Sun Microsystems initially did was actually to take the BSD source code, diverge from it and distribute the result without full access to the code (something that was allowed by its license). A similar thing happened with other software distributed under similar licenses.

To answer this threat, a new phenomenon sprang into existence: the "free software" movement, the GNU project and the copyleft licenses, all led by one dynamic personality: Richard M. Stallman.

Richard Stallman (aka RMS) published the GNU Manifesto in 1985, which coined the term "free software" and explained the rationale behind it. The Manifesto was also a creed for the GNU project, which aimed to be a complete UNIX-compatible replacement for UNIX systems, while being completely original work. The software of the GNU project was released as free software, under the terms of the GNU General Public License (or GPL for short).

Gradually, the GNU project created more and more C code to replace the UNIX and BSD utilities. It was already installable and usable on various flavours of UNIX, and became a fully independent system once the Linux kernel was written.

The GPL is a free software license that has many fine points.
The most important concepts in it are:

1. Copyleft - making sure that derived works that are distributed to the outside include the source and are distributed under the same license. Note that this does not apply to modifications done for internal or private use.
2. Restrictive Integration by Other Codebases - GPL code can only be linked against code with free software licenses that match some criteria. This property has been recently referred to as "viral".

The incentive to restrict software this way, rather than following the traditional, virtually public domain BSD license, was to make sure that the core GNU system would always remain free.

The Linux Kernel, GNU/Linux and the Debian Free Software Guidelines

In 1991 Linus Torvalds, then a student at the University of Helsinki, began writing the "Linux" kernel - a 32-bit kernel for a UNIX-like operating system. Kernel development advanced rapidly, and the kernel was released under the GPL from an early stage. To complete the system and make it into a usable UNIX system, the Linux developers used various existing user-land utilities and libraries from the GNU project and other sources (such as the X Window System), and wrote a few user-land things from scratch. From an early stage, this entire system was dubbed "Linux" as well.

Richard Stallman instead advocates the name "GNU/Linux" (pronounced "ggnu-Linux"), which acknowledges the fact that the GNU project contributed the lion's share of the system (as well as some pre-requisites of the Linux kernel itself). Most people haven't consistently followed this piece of advice. (I will not consistently refer to it as GNU/Linux either, in this article or elsewhere.)

The importance of the Linux kernel was that it was the last brick in materializing a full GNU system. Since GNU tools tend to be more complete, feature-rich and generally superior to the tools of other systems, this has made Linux one of the most powerful UNIX systems out there.
Nowadays, most UNIX servers out there and almost all workstations run the GNU/Linux system. Linux was, thus, the spearhead that led the acceptance of free software into the mainstream.

Debian GNU/Linux was a Linux distribution that was eventually endorsed by the GNU project. One of the things that made it unique was that it distinguished between "free" and "non-free" packages as far as the user is concerned. The guidelines for determining which software is "free" in the Debian sense were phrased by Bruce Perens and can be found here:

http://www.debian.org/social_contract.html

Note that they deviate from the free software definition and include some licenses that are not free. I.e., "Debian Free" is a superset of free software according to the Stallman definition. This fact is important because later on, the Debian Free Software Guidelines formed the basis for the open-source definition.

The "Cathedral and the Bazaar" and the coining of the term "Open-Source"

Eric Steven Raymond (now also known as ESR) wrote an essay titled "The Cathedral and the Bazaar", which contrasted the "Bazaar" way of managing a software project with the old "Cathedral" way that was used by almost all non-free projects and by most free ones. "Bazaar" projects are characterized by frequent and incremental releases, treating the users as co-developers, and generally getting a lot of peer review and cooperation. Despite common belief, the core group of project contributors usually remains relatively small, except for very large projects.

The article is considered one of the seminal works on free software and was followed by other works in what is collectively known as the "Cathedral and the Bazaar" (or CatB for short) series. It has made Eric Raymond a famous person, at least among the community of free software hackers.
On February 3rd 1998, in Palo Alto, California, a brainstorming session which Raymond attended coined the term "open source" as an alternative to "free software". Their incentive was that when talking to a businessman, "free software" would either be understood as software that costs nothing, or be associated with the relatively anti-Capitalist views held by Richard Stallman (who claims proprietary software is immoral). They decided that the term "open source" would be a better candidate for acceptance in the corporate world.

During the following week, Eric Raymond and Bruce Perens launched the web-site and formed the Open Source Definition. This was based on the Debian Free Software Guidelines.

The term "open source" caught on. Very soon, Richard Stallman decided to reject it on the premise that the freedom of software is more important than the "openness" of its code. While he does not oppose the openness of the code, and acknowledges the fact that free software is open source as well, its freedom remains more important to him. For more information read:

While some people have continuously stuck to the term free software and a few others converted to using open source entirely, most knowledgeable people don't completely reject either term, and use each one whenever they see fit.

Recent Developments

Since 1997, Linux and other open-source systems have become more and more popular. Linux saw a lot of success in the server market, where cheap PCs that can be bought in stores can, by installing Linux, serve as an almost full replacement for more costly UNIX servers. Even where the latter are used, they very often run open source server software.

Linux has become the number one choice for constructing clusters: large sets of computers that are networked together to form a fast computation system, with power that rivals or exceeds that of super-computers. There are various kinds of clusters around. Some of them operate at a relatively high level.
Others try to make the system believe it has as many processors as there are nodes.

Linux also had a lot of success in the embedded market, serving as the framework for creating software that is embedded in hardware.

The Internet boom not only made free software more essential for its operation, but also enabled more and more users and developers to share their code, get help and work together on advancing it.

At the moment, Linux has had much more limited success as a choice for a desktop system. While being the only operating system that is gaining market share, it still has a very low share in comparison to Microsoft's solutions. Many projects have started to supply users with desktop and GUI environments and applications. Some of them are very mature, usable and successful. Only time will tell if and when Linux becomes the default solution for the desktop.

Recently, Apple's Mac OS X was released. It is based on Darwin, which is an open-source BSD-derived system. Mac OS X can run UNIX applications natively, and supports the X Window System, which is the de-facto GUI framework for UNIXes (including Linux). It is therefore a popular UNIX choice for PowerMac computers, albeit not the only one, since Linux, various open-source BSD clones and other UNIXes can run there as well.

The recent recession in the information technology market did not seem to slow down the development of open source software. Freshmeat is as busy as ever with releases of new software, and since the recession started, important new releases have been made of many major applications and of even more less important ones.

Difference between "open source" and "free software"

The term "free software" was coined by Richard Stallman, and is associated with the Free Software Foundation (http://www.fsf.org/). The term "open source" was coined at a brainstorming session that Eric Raymond attended, and is advocated by him and other people at the Open Source Initiative.
Nevertheless, those who consider themselves to be in either camp, much less those who use either or both terms, do not necessarily hold the opinions of these figures. Therefore I will not globally associate these opinions with the "free software movement" or the "open source movement", because both include many users and developers with heterogeneous opinions on the subject. Moreover, the two movements are pretty much one and the same. Nevertheless, it is important to summarize these opinions, because they recur in many places.

Stallmanism

Proprietary software is legal but illegitimate and immoral. Manufacturing proprietary software causes a lot of unhappy social and psychological side-effects. The knowledge that software cannot be shared makes people reluctant to share, when sharing is a natural and good part of living in a human society. The inability of people to modify software for their own needs makes them feel helpless and at the mercy of external software.

Free software, on the other hand, is the natural conclusion derived from the basic facts of information, computing and software, and is highly moral. People, companies and other organizations can modify it and customize it for their own use should the need arise, and so it actually benefits them.

Raymondism

Proprietary software is not illegitimate, just problematic in the economic sense. Open source software gives many advantages to the end-users and is generally a good thing. Copyleft licenses are important in making sure certain software is not abused.

It is not immoral to use proprietary software, it's just risky. Using or producing software that is not 100% open-source but pretty close can be a good idea, depending on its license and the general attitude of its developers.

In for Free Beer

This approach basically says this: "I like free software because I can get a lot of useful software without charge.
I may like contributing to free software because it helps other people, makes me happy and may indirectly benefit me technically or financially. But proprietary software is perfectly valid as well, if it's done right, and I may choose to use it or contribute to it. In short: write code, use whatever tool you wish, and be happy."

The most prominent figure who holds this view is Linus Torvalds, but there are many others, some of them quite prominent. Such figures, however, tend to be less loud than the "religious" advocates of the other two views, and thus it may seem that they are in a minority. Part of the reason is that many of them inherently tend to value productive code and decisions over advocacy.

Conclusion

While some figures out there prominently stick to one ideology or another, most people hold a mixture of the three (or more?) approaches, or are just happy using free software or contributing to it, without thinking too much about its philosophy.

The terms themselves are used interchangeably by many people. "Open source" has become more common, partly because "free software" can be taken to mean software that is given free of charge (the standard "free as in free speech" versus "free as in free beer" distinction). Moreover, both the Free Software Foundation and the people associated with the Open Source Initiative are on friendly terms with each other and answer questions, give feedback, accept contributions, etc. from each other or from people that do not belong to either camp.

As I said earlier, the fact that some licenses qualify as open source and not as free software is usually negligible. While some esoteric software has been released under custom licenses that are open source but not free software, most of the important software out there (and most software generally started by individuals) is free as well. [*]

Footnote [*] < and for such lists.
Other Criteria of Open Source Software

GPL Compatibility

Making software free is not necessarily enough to make it compatible with the GNU GPL. The GPL places some restrictions on which licenses its code can be linked against, and some perfectly good free software licenses do not qualify (examples are the Mozilla Public License, the Qt Public License, and even the original BSD license).

It is advisable that, whenever possible, a developer or vendor choose a license that is compatible with the GPL, because otherwise there may be problems integrating his code with GPL code or using both a GPL library and a non-GPL-compatible library. (I am not a lawyer, so I cannot conclusively say when it is legal or not.)

Mozilla is an example of a large project that started out with its own custom (albeit now relatively common), non-GPL-compatible license, and recently adopted a triple license of the Mozilla Public License, the GNU General Public License, and the GNU Lesser General Public License in order to make it compatible with the GPL and to standardize its integrability. The Qt library, whose commercial vendor and originator is Troll Tech Inc., also adopted the GPL alongside its own QPL, to relieve the various legal problems that KDE (a desktop system for UNIXes which is based on it) faced when using GPL code.

Copyleft

Copyleft means that a derived work of copyleft software that is not confined to internal or personal use must include the source code and be released under the same terms as the original work. Copyleft is common in many licenses including the GPL, the Lesser General Public License, the QPL, etc.

Many licenses are not copyleft, most notably the BSD and MIT/X11 licenses. Software under such a license can be turned into a proprietary software product by a third party, and often has been.

Open Source vs. Sourceware

Open source does not mean any software that is accompanied by its source, although many people who are new to the term would be tempted to think that.
It is possible to write non-open-source software while accompanying it with the source. Examples of such cases are:

1. The Microsoft Visual C++ Run-Time Library and the Microsoft Foundation Classes, which are accompanied by their source.
2. xv - http://www.trilon.com/xv/ - a very popular shareware image viewer for the X Window System that has been distributed with its source code.
3. qmail - http://www.qmail.org/ - a popular mail server whose source code is available and which can be deployed free of charge, but whose license specifies that it is illegal to distribute modified binaries (at least outside the organization). This is enough to make it non-open-source, but it is still a very popular program.

None of these packages qualifies as free software, but they are all accompanied by the source. There are many others around. A quick search on Freshmeat will find many such packages.

In order for a program to be open source it needs to be free of various restrictions as specified in the open-source definition: http://www.opensource.org/docs/definition.php. To be free software as well, it must also be free of some other restrictions. [*]

Footnote [*]: <) attempted to answer some of these, with a focus on negative myths.

Myth #1 is completely false, as bugs can still be found by analyzing the disassembly of the machine code. There have been many closed-source packages in which many bugs were discovered (like Microsoft Outlook or Microsoft IIS).

Myth #2 has a grain of truth in it. However, some open source packages have nevertheless had very poor security records because of poor programming practices. Some closed-source offerings, on the other hand, have a very good security record. In most packages, security bugs occurred due to sloppy programming practice or lack of auditing of the code. They can mostly be avoided whether or not the package's source code is available to the public.

Myth #5 is not entirely true.
While it is possible to fork a piece of open-source software, most packages were not effectively forked. Eric Raymond covers the customs that relate to forking a package in "Homesteading the Noosphere", and Rick Moen explains in his "Fear of Forking" essay why, when major packages forked, it was not necessarily a bad thing.

Moreover, proprietary software has been forked many times as well. There are many flavours of System V UNIX out there, and there used to be many more. Microsoft released three different lines of Windows flavours, with two or more available simultaneously, and has many localized versions (which are often incompatible with one another).

Where I Stand

It is customary in documents of this kind to convey the personal opinion of the author. This document will not be an exception.

I am a user, developer and advocate of free software. However, I do not think that proprietary software is inherently immoral or destructive. I know some vendors of such software abuse their customers. However, I generally see them as suppliers of goods which took a lot of time to develop, and which they perfectly naturally wish to sell for money. The fact that open-source developers develop similar goods and distribute them for little or no cost under a less restrictive open-source license does not invalidate this.

I agree with most of what Eric Raymond said in the "Cathedral and the Bazaar" series, part of which is that proprietary software is problematic. However, I think that a world dominated by free software (which I hope to see soon) can accommodate some proprietary software without it having a generally harmful effect on the computer world at large.

I do not hate Microsoft; I just think that their systems are much inferior to GNU/Linux, which I like better. I still use Windows when I find it appropriate, or when I need to. (I'm not an "I only use free software" kind of guy.)
I realize the superiority of Linux may have stemmed from the fact it is free software, but I don't use it only because it is free software. I just like working with it better.

I don't see Microsoft or other suppliers of proprietary software as enemies of the free software movement. I expect that people will continue to buy some proprietary software even after Linux and free software take over, assuming they do. I think Microsoft will eventually port their software to Linux if it gains enough market share. While they may lose the revenue generated from selling Windows and providing various services for it, I don't think they will disappear entirely. And they may be able to find different revenue streams.

Open source, however, can change the rules of the game, and I believe it will. In a world dominated by open source, proprietary vendors must realize that they need to supply their customers with quality software, listen to what they say and act upon it, and constantly try to keep their software above the open-source competition. There is no point in hiding the details of the specs or protocols, and completely hiding the source code is not as important as many of them now think. If Microsoft survives in a Linux environment, we will see a much less abusive Microsoft.

My general ideology used to be a variation of "in for free beer". Use, code, and be guiltless and happy. A recent encounter with a piece of proprietary software that was free for some uses, whose license changed so that I could no longer use it, modified that view slightly. In the future, I'll be more careful about relying upon proprietary software, because it may become inaccessible to me, but otherwise I still don't regard its vendors as immoral. I still use some not-entirely-free software because I like it and am used to it, or because it gets the job done.

My stance regarding the war between the terms "open source" and "free software" is that I use either one when I find it appropriate, and am not fanatical about either term.
It depends on the context, who I speak to, what I wish to imply, what sounds right, or the first thing that pops into my head. I usually prefer saying "Linux" over "GNU/Linux" because it is shorter and snappier, and people will understand what I'm talking about. I do sometimes resort to "GNU/Linux", but not very often, and use the term "a GNU system" even more.

The fact that I don't stick to either "open source" or "free software" stems from the fact that I respect both the Free Software Foundation and the Open Source Initiative, and believe that the free software movement and the open source movement are pretty much one and the same. I also like both terms.

I don't use "GNU/Linux", despite the fact that a large and integral part of the system stems from the GNU project, for marketing reasons. "GNU/Linux" is longer than "Linux" and does not add more information, just a lot of pseudo-ideology. Add to that the fact that many people will pronounce it "jee-en-you-slash-Linux" when they first see it, and you get something that makes a very bad marketing name. Here's a nice quote from Linus Torvalds on why "Linux" is superior to "386BSD" (a 90's BSD clone that was free software as well):

> > Other than the fact Linux has a cool name, could someone explain why I
> > should use Linux over BSD?
>
> No. That's it. The cool name, that is. We worked very hard on
> creating a name that would appeal to the majority of people, and it
> certainly paid off: thousands of people are using linux just to be able
> to say "OS/2? Hah. I've got Linux. What a cool name". 386BSD made the
> mistake of putting a lot of numbers and weird abbreviations into the
> name, and is scaring away a lot of people just because it sounds too
> technical.

Well, GNU/Linux is a step in the wrong direction in this regard.

As a developer I try to use Public Domain-like licenses (a pure PD, or variations of the BSD license or the MIT/X11 license) for software I produce.
I don't mind people making derived code proprietary, much less integrating it inside proprietary products. If the original code, which I modify or derive from, is distributed under a different license, I respect the original license, whatever it may be.

I agree that some systems are critical enough to justify GPLing or LGPLing them. Even so, there are many BSD, MIT/X11, or otherwise virtually public domain projects out there, and they don't seem to suffer from it. Assuming that the GPL is the perfect license for everything is a relatively immature notion, in my opinion.

Links and References

TODO: add references, links to biographies or homepages of ESR, RMS, Linus, Ken Thompson, Bruce Perens, etc.

TODO: Add references to CatB, the various documents on the GNU site, the Mythical Man-Month, Peopleware (which I did not read but heard good things about).

Author

Shlomi Fish

Thanks

Thanks to Chen Shapira for taking the time to read several early drafts of this document and provide useful comments. Thanks to Nadav Har'El for summarizing the early history of UNIX and the BSDs for me. And naturally thanks to Richard Stallman for starting the GNU project, Eric Raymond for writing the "Cathedral and the Bazaar" series, and so on all the way back to thanking God for creating the universe.

To Do List

* Convert to DocBook/XML.
* DocBookify the contents (i.e. hyperlink appropriate things).
* Mention some other projects besides Linux and other operating systems. (?)

Copyright

This document is copyrighted by Shlomi Fish under the Open Publication License version 2.0 or greater. The OPL is a commonly used free content license intended primarily for written texts. Let me know if you need it under a different one, but I believe the OPL should suffice for almost every need.