Standing on the Shoulders of the Guy in the UK Office
“If I have seen further, it is by standing on the shoulders of giants.” ~ Isaac Newton.
Isaac Newton wrote this to Robert Hooke, a British scientist, describing how his work built on the work of his predecessors. In science, we often think of these giants as old, dead men in history books. In computer science, many of the giants of the field are still alive. In open source, they are people you meet at conferences. In innersource, they are your coworkers who write code, including those you’ve never met.
Open-source software is released under a license in which the copyright holder grants users the rights to use, change, and distribute the software to anyone for any purpose.
Innersource refers to the sharing, reuse, and distributed development of code inside an organization.
While open source has proved its value by producing better code through more eyeballs and by leveraging mindshare to organize development without centralized control, open source is hard for companies and especially hard for oil and gas. Lawyers get involved. Release processes are slow, non-existent, or written for the worst-case scenario. Innersource offers many of the benefits of open source with less institutional pushback. It can also be a stepping stone to open source.
Oil and gas companies are adept at information sharing up and down an organizational hierarchy and across the company within discipline groups. However, innersource needs code sharing across the entire company, not just within the normal communication silos. Financial analysts and geoscientists both use pandas, a Python library for data analysis. Petrophysicists and procurement analysts both use natural language processing to find patterns in corpora of text documents too large to read. Traditional boundaries aren’t helpful when it comes to sharing code. The distinguishing characteristics for useful groupings are tools and techniques, not discipline, geography, or business unit. Being able to discover code written anywhere in an organization is a necessary part of an effective innersource program.
A centralized enterprise instance of GitHub, GitLab, or Bitbucket running behind the company firewall is the typical answer for how to start innersourcing. However, it is not uncommon for multiple code sharing systems to sprout up in large organizations, so that many people see only a portion of the code projects. This creates the need to either enforce a single option or have methods for discoverability across platforms. Scrapers that grab metadata from code repositories and provide it at a central location are one method to ensure discovery. Recommendation engines that use enterprise code repositories to connect people writing similar code are another idea, though there are not yet any off-the-shelf solutions in 2019.
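As a rough illustration of the central-location idea, the sketch below folds repository metadata from two platforms into one searchable list. It assumes the metadata has already been fetched; the sample payloads, the RepoRecord schema, and the function names are all hypothetical, and the payload keys are only meant to resemble what each platform's repository listing returns.

```python
from dataclasses import dataclass, field


@dataclass
class RepoRecord:
    """Platform-neutral metadata for one repository (hypothetical schema)."""
    platform: str
    name: str
    url: str
    description: str = ""
    topics: list[str] = field(default_factory=list)


def from_github(payload: dict) -> RepoRecord:
    # Key names are assumed to follow a GitHub-style repository payload.
    return RepoRecord(
        platform="github",
        name=payload["full_name"],
        url=payload["html_url"],
        description=payload.get("description") or "",
        topics=[t.lower() for t in payload.get("topics", [])],
    )


def from_gitlab(payload: dict) -> RepoRecord:
    # Key names are assumed to follow a GitLab-style project payload.
    return RepoRecord(
        platform="gitlab",
        name=payload["path_with_namespace"],
        url=payload["web_url"],
        description=payload.get("description") or "",
        topics=[t.lower() for t in payload.get("topics", [])],
    )


def search(index: list[RepoRecord], term: str) -> list[RepoRecord]:
    """Return records whose topics or description mention the term."""
    term = term.lower()
    return [r for r in index if term in r.topics or term in r.description.lower()]


# Made-up sample payloads standing in for scraper output.
github_payloads = [
    {"full_name": "petrophysics/well-report-nlp",
     "html_url": "https://github.example.com/petrophysics/well-report-nlp",
     "description": "Extracts lithology terms from well reports",
     "topics": ["nlp", "petrophysics"]},
    {"full_name": "finance/forecast-tools",
     "html_url": "https://github.example.com/finance/forecast-tools",
     "description": "Cash-flow forecasting helpers",
     "topics": ["pandas"]},
]
gitlab_payloads = [
    {"path_with_namespace": "procurement/contract-miner",
     "web_url": "https://gitlab.example.com/procurement/contract-miner",
     "description": None,
     "topics": ["NLP", "contracts"]},
]

index = [from_github(p) for p in github_payloads]
index += [from_gitlab(p) for p in gitlab_payloads]

for record in search(index, "nlp"):
    print(record.platform, record.name, record.url)
```

Searching by tool or technique rather than by team is the point: here a petrophysics project and a procurement project surface together because both are tagged with the same technique, even though they live on different platforms.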
Success in innersource is difficult to measure. Although innersource can be quantified in terms of the number of code contributions to existing projects from outside the original team, this only captures a small slice of the actual benefits. In a company that does innersource well, code is more secure, because there are more people reading it. Projects are written faster, because more code can be reused. As the rates of code reuse and external contribution increase, code users have heightened expectations for good documentation, and the original developers have a greater need for better test coverage. This results in more reliable and maintainable code. Code sharing also enables more effective use of data systems. Microservices are becoming increasingly preferred over monolithic architectures, partly to take advantage of previously written code. Shared microservices and APIs together can punch holes in data silos, allowing data to flow more easily. Small scripts shared between power users can augment the capabilities of costly GUI-based software tools. When more people can find examples of similar code being written by others, opportunities for mentorship increase, development time decreases, and sparsely distributed expert knowledge can more efficiently be distributed to where it is needed.
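The one measure mentioned above, contributions from outside the original team, is simple to compute once commit authors are known. The sketch below is one possible way to express it; the function name, the author names, and the idea of representing a team as a set of names are all assumptions made for illustration, and the number it produces says nothing about the less tangible benefits.

```python
def external_contribution_share(commit_authors, owning_team):
    """Fraction of commits authored by people outside the owning team."""
    if not commit_authors:
        return 0.0
    external = sum(1 for author in commit_authors if author not in owning_team)
    return external / len(commit_authors)


# Hypothetical commit history for one project: one author name per commit.
commit_authors = ["ana", "ben", "ana", "chidi", "dee"]
owning_team = {"ana", "ben"}

share = external_contribution_share(commit_authors, owning_team)
print(f"{share:.0%} of commits came from outside the original team")
```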
As the number of people who can code continues to climb and the time required to write useful code decreases, the amount of code written inside organizations will increase, the disciplines of people writing code will diversify, and the need for effective means to share and discover code across large organizations will become even more pressing. All large organizations will eventually have an innersource program. The companies that implement a good innersource program earlier will be at an advantage. When you can stand on the shoulders of someone in a different office, and they can stand on your shoulders, you can both make progress very quickly.