Tuesday, February 7, 2017

When Is Software Done?

I have some very exciting news.  A piece of software I've been working on for over 2 years has been released to the general public!  This would be a little exciting if it were software I'd been working on for some big company.  It's very exciting because it's software I've been working on for my company.  That's right!  My company is ready to start selling software and start making money!

I'm not gonna use this blog post to talk about my company and what it does.  You can read about that in our press release.  Instead, I'm going to talk about the software industry and the concept of done.  Because, as with everything, it's more complicated than it seems.

Software is never really done

Actually, that's a misconception.  Software really can be done.  But done is sort of a quantum state--there and not there at the same time.  First and foremost, anyone can understand that software that works is complete.  If the software's purpose is to process credit cards, and the credit cards process correctly, then it's done; it's working.  But even in that simple example you can see how modern society doesn't end there.  There are constantly changing regulations on how the card has to be processed.  There are new security features, like chips, that we need to handle as they come out.  Different companies will change the way they format the data in their magnetic strips.  And that's not to mention the fact that there are always new and different banks and credit card companies that you'll need to deal with.

But that's really the problem isn't it?  We perceive done software as this never changing stone that can do absolutely anything.  The reality is, it does one thing--what it was programmed to do.  The world around the software is ever changing.  When new software comes out to make searching for information easier (aka Google),  we begin to rely on that and we want software that can make our company appear at the top of relevant search results.  The world around us is not static.  Software is.  For that reason, it's hard to make software that is done in a world that isn't.

But software really can be done

Software that does its job well enough might never need changing.  A great example is a 1980s computer still running the HVAC at some schools.  There are hundreds of similar systems out there.  Examples of done software include things like the pipe command in Linux, or the dir command on Windows.  This software hasn't needed to change in a long, long time.  And there's a reason for that.

Software that only needs to do one thing can be done

If you can clearly and completely define a single thing that the software needs to do, it can be finished.  The important part of that sentence is completely, because "process credit cards" is not the full definition.  There's reading magnetic strips and chip data.  There's encryption and decryption.  There's integrating with varying credit card companies.  There's keeping up with industry regulations.  And note that each of those things is not a complete definition either.  Chips and strips have varying data formats that change by company.  Encryption and decryption change every other week as someone finds new ways to make things more or less secure.  "Varying" credit card companies is its own ball of wax.  And regulations don't even have a reliable way to be discovered by lawyers, let alone a software developer.
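To make that layering concrete, here's a hypothetical sketch--every name below is invented for illustration, not a real payment API--of how "process credit cards" keeps decomposing into more definitions:

```typescript
// Hypothetical decomposition of "process credit cards" -- every name
// here is invented for illustration, not a real payment API.
interface CardData {
  pan: string;      // primary account number
  network: string;  // each network formats stripe and chip data differently
}

interface CardReader {
  readMagneticStripe(raw: Uint8Array): CardData;
  readChip(raw: Uint8Array): CardData;
}

// Even this stub hides more definitions: encryption schemes rotate,
// every network is its own integration, and regulations shift underneath.
interface CardProcessor {
  encrypt(data: CardData): Uint8Array;
  submit(encrypted: Uint8Array, network: string): Promise<boolean>;
}
```

Each interface here would decompose again the same way, which is exactly why the definition is never complete.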

Integration is a dirty word

The biggest problem with the concept of done software is more subtle than it seems.  Integration.  Integrating your software with someone else's software makes the two pieces of software into one piece of software from the done perspective.  If your software is done, and they change theirs, then your integration fails and you have to change it.  So, the second you want to integrate, you introduce a huge exponential factor that makes finishing your software harder.

Changes bite hard

Even then, if you think about this in terms of two simple pieces of software, it's understandable.  But the second your software is something complicated that won't be done anytime soon (like credit card processing tools), you create a scenario where your integration can't be complete until the rest of your software is.

But that's not even the worst of it.  The worst part comes back to what it means to be finished.  Software that you start working on before it is completely defined will take longer to finish than it would otherwise.  And this is not what most CEOs want to hear.  In manufacturing, the sooner you start on a product, the sooner it's finished.  In software this isn't always the case.  If you change your mind halfway through, you can completely change the way the software needs to be written.

More often than you might think, what appears to be a subtle change can be nearly impossible for software that was built with certain conditions in mind.  A common comparison in the software world is a house.  If the customer says "we're going to live in Florida" then the house designer might do things like lower the roof angle since snow won't be a problem but hurricanes are, and replace the frame with hardier wooden beams to resist the extra moisture in the air.  If the customer makes no other changes to the design except deciding to live in New York instead, the frame and roof, and various other bits, have to be completely changed.  And those changes result in a very different house in the end.

I can't tell you how many meetings I've been in where software developers, seeing the dangers of change, try to get clarification by asking something like "What if the user has two middle names?" only to be dismissed by management with "That'll never happen".  Then, having structured the database and APIs around this particular situation never happening, they have to account for it at a later date when "never" turns out to be "rarely".  I always advise my co-workers to be careful about these situations.
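As a hypothetical sketch (invented names, not any real schema), here's the difference between a data model built around "that'll never happen" and one that leaves room for "rarely":

```typescript
// Hypothetical illustration: a model built around "that'll never happen".
interface RigidName {
  first: string;
  middle: string;   // exactly one middle name, by management decree
  last: string;
}

// When "never" turns out to be "rarely", a structure like this absorbs
// the change without reshaping every table and API that touches names.
interface FlexibleName {
  given: string;
  middleNames: string[];  // zero, one, or two middle names all fit
  family: string;
}

const edgeCase: FlexibleName = {
  given: "Anna",
  middleNames: ["Maria", "Louise"],
  family: "Smith",
};
```

The rigid version isn't wrong code; it's a complete implementation of an incomplete definition.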

Realistically, you don't usually want done software

In the end, done is not what we really need.  What we need is functioning software.  If it works for the moment, we can let the developers keep working away at fixing it, improving it, and adding more features and integrations as we need them.  This is why Agile is so popular in software development these days.  Most companies have accepted that bringing on their own software developers is not a temporary investment.  They are going to keep them around forever as the company's needs grow and change.  Because the world is changing, the software must change too.

Take my recent release as an example.  This software is functioning, and it's doing so well.  It's brilliantly effective at doing its job, but its job is complicated and ever-changing.  The mortgage industry is inundated with ever-changing regulations, and we're working our way through dozens of integrations at the moment.  So we'll continue to work on improving and integrating this software for years to come, but it works!  It's out there!  And it's ready for your company to use NOW!

And that is a wonderful feeling.

Thursday, August 18, 2016

Encrypt All The Things

I love to study new technology as it comes out, but today's blog post is about a very old technology finally being handled in an intelligent way.  Let's start with the old technology.

It's all https

Security on the internet is hard to explain, but in general it just--sucks.  The truth is that when you log into your bank's website, the data that handles that login is (usually) encrypted.  That is to say, the data you're sending to the bank is gibberish that the bank's servers know how to translate back into your username and password, among other things.

Encryption is complicated, but the important thing to understand about all of this is that your computer has to know how to translate your username and password into gibberish, and the bank's computer has to know how to translate that gibberish back into your username and password.  For this to happen, those computers have to start by sharing the translator with each other; we call this the handshake.  Usually, it goes something like this (yes, with this many steps--yay TCP!).

Your Computer - "Hey bank, I'd like to see your login page"
Bank Computer - "Sure thing, here it is"
Your Computer - "What is this gibberish?  Must be encrypted.  Hey bank, I'd like to see your encrypted login page."
Bank Computer - "Sure thing, here's the encrypted login page and translation instructions."
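As a toy sketch of the "translate into gibberish" idea--real TLS negotiates its keys during the handshake and is far more involved--here's symmetric encryption turning a password into gibberish and back:

```typescript
// Toy sketch only: both sides already share a key (the "translator").
// Real TLS establishes this key during the handshake.
import { createCipheriv, createDecipheriv, randomBytes } from "node:crypto";

const key = randomBytes(32);  // the shared translator
const iv = randomBytes(12);   // per-message randomness

// Turn readable text into gibberish only the key holder can reverse.
function toGibberish(plaintext: string): { data: Buffer; tag: Buffer } {
  const cipher = createCipheriv("aes-256-gcm", key, iv);
  const data = Buffer.concat([cipher.update(plaintext, "utf8"), cipher.final()]);
  return { data, tag: cipher.getAuthTag() };
}

// The bank's side: translate the gibberish back.
function fromGibberish(data: Buffer, tag: Buffer): string {
  const decipher = createDecipheriv("aes-256-gcm", key, iv);
  decipher.setAuthTag(tag);
  return Buffer.concat([decipher.update(data), decipher.final()]).toString("utf8");
}
```

Anyone snooping on the wire sees only the gibberish; without the key, the password never appears.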

It's all a sham

SSL/TLS encryption is the fancy term for this method of security today.  The problem is subtle--or more specifically, the problem is subtlety.  It's this:

You see, the only way to know if a website is secure today is for you, the user, to see that little green https lock icon in the address bar.  There's no check to ensure that you don't accidentally log in on a lookalike page that's missing it.

The difference between the real page and the fake from your perspective is whether or not you were paying close enough attention, which even many security experts fail to do.  The difference in terms of security, however, is everything.

Modern browsers don't manage this for you at all.  In fact, security experts were caught recently by a fake Google site, where the only difference was the URL.  And the reverse is true too: http://google.com is not safe, but neither is https://docs.google.com/user/12345/login_page, because attackers can lure you to a page with the wrong URL just as easily as to one that isn't encrypted.

The problem is sharing

Programmers attempted to build security around https into browsers to let you know when a website was unsafe, but there's no way to know which websites should be using encryption and which shouldn't.  So instead they chose to check whether the encryption a website presents really does come from that website--which they can only do when there is any encryption in place at all.

This is where you get those gigantic warning pages about untrusted websites.

The way browsers know that https://self-signed.badssl.com should be red and https://google.com should be green is a Certificate Authority, often called a CA.  Because the programmers who wrote your browser can't possibly know every website that should have https, they rely on the website to tell your computer, as in the handshake dialogue above.  When you get back those translation instructions (called the certificate) from the website, the browser goes to a CA it knows and trusts to verify that those really are the correct instructions for that website.  If they're not, you're just encrypting your data to send it to someone who's pretending to be the website, which would defeat the purpose.

If the CA hasn't heard of the certificate, the browser shows that gigantic warning page, which makes you feel very safe.  But remember that if a site doesn't even bother to encrypt the page, there's no warning at all, even if it's a page that's usually encrypted.  Because CAs are the only accepted authority on which sites should have which certificates, they have to be right, every time.  For this reason, and because of the costs associated with that, CAs have always charged a decent amount of money (hundreds of dollars per year on the low end) to grant your website a certificate that they will tell the browser is safe.

You can create your own certificates (translations) for your websites, called self-signed certificates, but your users will get that gigantic warning page every time they try to go to your site unless they do some manual configuration to accept the certificate on their computer.  This is why most websites don't even bother to encrypt: it costs too much, and if you do it yourself, browsers actually make the experience worse for your users than if you had done nothing at all.

Enter Let's Encrypt

The world of CAs is going a little nuts right now, as a newcomer to the market has decided to issue certificates for free and completely automate the process.  What's unusual is that the newcomer is actually being accepted by all major browsers.  This is incredible.  Now people who make websites can encrypt them just for fun!  There's no longer any reason not to see that beautiful green https at the top of your browser.  It doesn't cost anything!

Now, I may not sound like it, but I'm usually quite skeptical of new technology.  I like to use it and study it, but I don't usually trust it in production until it's more tried and true.  Let's Encrypt still acts like new technology.  It's unwieldy and hard to use, but it works, and because I care about security it's worth it to me.  From the perspective of someone who doesn't like it when things are hard, however, here's a StackOverflow post I wrote about issues with setting it up.  It's not everything, but it has some links that should help you along the way.

What's great about Let's Encrypt being free is that everyone should be doing it now.  If you make websites, encrypt them.  I don't care if it's just a blog about baking cookies, or a picture page with no logins.  Encrypt it!  You see, since programmers never could figure out a good way to identify whether a website should have a certificate, we can create an internet for them where they don't have to.  If every legitimate website is encrypted, then they can tell the browsers to throw ugly scary warning pages when there is no encryption, and take that onus off of the user.

This means a lot to me, as a programmer who understands how important your personal information is.  The internet is dangerous, and this is one way that we can make it just a little bit safer.

Encrypt all the things!

Thursday, April 14, 2016

Your code smells? Let it!

Code smell is a term used pretty commonly in software development.  It's a way for one developer to express to another that their experience is telling them something is wrong, even though they don't see it on the surface.  At least, that's what it's supposed to mean.

But software developers don't do well with "trust your gut" as a methodology, so we make an effort to identify and quantify things like this.  For that reason, over time we have managed to put together lists of things in your code that are "code smells" and explain why we don't like them.

For illustration, I'm going to refer to personal experience and talk about an application I'm currently working on.  A common code-smell that developers will talk about is singletons.  So that will be the basis for my example, but the message should be clear even without the example.

Why does it smell?

The first thing you should ask the developer on your team who's most knowledgeable about the code in question (that may be yourself) when you encounter a code smell is "Why does it smell?".  This is the part I see skipped--A LOT.  This is the most important part of the process.  Let's face it.  There are legitimate reasons for things to smell.  Your mechanic's shop should not smell the same as a bank vault!  Each code base is going to have its own set of smells, and that's ok, as long as the answer to "Why does it smell?" is reasonable.

In a project I've stepped into, singletons are everywhere.  I asked one of the developers about them and his answer was one sentence: "We have memory constraints."  Oh!  That makes sense.  This application lives on (crappy) web service connections, and is written in an older language, so it's logical to store those connection objects for reuse later, rather than waiting on a crappy garbage collector to delete them and running out of memory.  This code base smells like they're using singletons as global objects.  Guess what?  They are!  And it's OK.  That was their intent!
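A minimal sketch (invented names, not their actual code) of the kind of singleton they were using--one shared connection object, allocated once and reused:

```typescript
// Invented illustration of a memory-conscious singleton: the expensive
// connection object is created once and shared by every caller.
class ServiceConnection {
  private static instance: ServiceConnection | null = null;

  private constructor(public readonly endpoint: string) {}

  // Lazily create the single shared instance on first use.
  static get(): ServiceConnection {
    if (ServiceConnection.instance === null) {
      ServiceConnection.instance = new ServiceConnection("https://example.invalid/api");
    }
    return ServiceConnection.instance;
  }
}
```

Every caller gets the same object, so the memory cost is paid exactly once--which was the whole point.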

Of course, the answer is not always valid.  If the answer is "That's the way it's done" or something else vague like that, it's time to investigate yourself.  It may be that there is a legitimate reason and you haven't encountered it yet, but this is a code smell, and so you need to find the source.  You cannot, and should not go changing code to "fix the smell" if you don't know why it smells.

In my example, if I had started getting rid of singletons to "fix it" and making them instances of the service all throughout the application, I would have surely crashed production!  Don't go fixing things that you don't KNOW are broken.  Code smells are just that: a smell.  Some smells are supposed to happen!

What if it's bad?

So you've identified a code smell.  You asked "Why does it smell?" and you figured out, through some form of investigation, that it's bad.  This is the smell of burning rubber over a box of wires.  It's a bad smell, we need to fix it.

You're not supposed to fix the smell!

So many developers get this bit wrong.  The smell is your marker.  It identifies that there's a problem in your code base.  The smell itself is NOT the problem!  99% of code smells are pointers to an architectural problem!  The smell "goes away" when you fix the root problem.  Saying "x is a code smell" is not the same as saying "x is bad".  The code you're looking at solved a problem.  It is good code.  It just may not need to exist if you fix the part of the code that is bad.  The part that caused the smell to begin with.

As an example, at my current day job, they identified the singletons but knew that "singletons are a code smell".  They made the standard mistake: they thought that "code smell" means "bad", so they said "singletons are bad".  Then they tried to remove the singletons.  This actually happened in this code base.  They did something I'm seeing a lot on blogs: they wrapped their singletons in a Dependency Injection library and pretended that, since the instance is being passed to their classes via dependency injection, it's no longer a singleton.

Don't hide the smell!

Ok, so the new guy is just starting.  He didn't do any of the above, but he saw the singleton pattern, got all excited to help, and swapped it out for dependency injection...  What did he do?  He masked the smell.  The smell is there for a reason.  It's like covering up the dog turd in the middle of the floor by putting a towel over it.  Whether this is a kennel, where that dog turd needs to be kept in the dog-turd-specific area (maybe a singleton resolver class or some such), or a perfume store, where it needs to be cleared out immediately at all costs, he has made it harder to fix, and harder to identify, for everyone.

In a code-base where the smell identifies an actual problem, to fix the smell masks the real problem and keeps you from finding it longer.

In a code-base where the smell points to outside forces beyond our control, the fix makes it harder to maintain, and harder for new developers to step in and learn about the code base.

Never, ever hide the smell.

What about your singleton problem?

With a big problem in an existing code-base like this, I'm starting small.  The first thing to do is unmask all the singletons.  I'm creating a group of static classes that wrap up the singletons in the application.  Some day in the future, I'll be able to remove the Dependency Injection framework entirely and start over, using it correctly (for actual dependency injection).  But until then, I've had to think differently.  For now, in this particular code-base, whenever I see that dependency injection is being used--that, for me, is a code smell.
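As a hedged sketch of what that unmasking can look like (names invented, not my actual code): a static wrapper makes the shared instance explicit and easy to grep for, instead of hiding it behind a DI container.

```typescript
// Invented stand-in for one of the app's expensive connection objects.
class PaymentService {
  readonly createdAt = Date.now();
}

// The "unmasked" singleton: a static class that says, out loud, that
// this instance is shared application-wide.
class SharedConnections {
  private static payment: PaymentService | null = null;

  static getPaymentService(): PaymentService {
    if (SharedConnections.payment === null) {
      SharedConnections.payment = new PaymentService();
    }
    return SharedConnections.payment;
  }
}
```

Nothing is fixed yet, but now every shared instance is visible in one place instead of being disguised as an injected dependency.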

Friday, April 8, 2016

Dependency Injection - You're doing it wrong!

So I have worked at a lot of places and seen a lot of different styles of programming.  Early on in my career, I became acquainted with the concept of dependency injection.  It was a hard topic to grasp, but I have learned to understand it deeply.  Now, when I step into a new application that uses it, I can very quickly see the flaws in the implementation, and there's a common one I want to talk about today: global singletons.  But we'll get to that in a minute.

What is Dependency Injection?

Dependency Injection is exactly what it sounds like.  You use it to inject your dependencies.   The unique part about Dependency Injection though, is that you can do this at runtime.  Now this sounds fancier than it is.  By inject we don't mean they're downloaded for you.  You still have to have all of the parts installed where you want to run your app.

Dependency Injection is somewhat of a complicated topic for a newbie.  Let's start with defining the word dependency here.  Specifically, DI is about classes, but everyone who talks about DI talks about interfaces.  This is because DI does something unusual among software patterns: it doesn't decrease the amount of code you have to write, it increases it.  That's right--when you start using Dependency Injection, you can expect to write a lot more code, perhaps a third more than you're currently writing, on top of your existing workload.  For every class you write, you need to write a matching interface that declares all of the class's public properties, and the class needs to implement that interface.

Then Why Use It?

Like all software patterns, DI is about making your code easier to work with.  The magic of DI is the interface.  Interfaces are simple stubs of classes that essentially mean "any class that implements this interface must have these public properties".  That's it.  It's a set of rules for creating a class.  This makes it easier to write your class later, because you know it only needs public properties X, Y, and Z.  Let's look at a real life example.  I'm pulling this example from a library written by a good friend of mine who's worked with me at some of my big name jobs.  Here's the full library for reference: https://github.com/danielkrainas/squire

First, let's look at an interface he created for storing data.

namespace Squire.Storage
{
    using System;
    using System.Collections.Generic;
    using System.IO;
    using System.Linq;
    using System.Text;
    using System.Threading.Tasks;

    public interface IBlob : IBlobItem
    {
        void SetPermissions(BlobPermissions permissions);

        void Delete();

        void PerformRead(Action<Stream> readerActivity);

        void PerformWrite(Action<Stream> writerActivity);

        void CopyTo(IBlobContainer container, string copyName = "");
    }
}

Pretty simple right?  Basically his interface offers CRUD.  Create, Read, Update, Delete, and a few other minor features.  OK, it's not perfect CRUD, but I'm sure he intended for it to be that way.

If you've written code for storage, you can imagine how having this template makes it a little easier to write the class that implements IBlob, but the value of this interface is the ability to use it for Dependency Injection.  Let's take a look at another Interface (because they're short) that uses the IBlob interface as if it were an object.

namespace Squire.Storage
{
    using System;
    using System.Collections.Generic;
    using System.Linq;
    using System.Text;
    using System.Threading.Tasks;

    public interface IBlobContainer : IBlobItem
    {
        IBlob GetBlob(string name);

        void CreateIfNotExists();

        void SetPermissions(BlobContainerPermissions permissions);

        IBlobContainer GetContainer(string containerName);

        void Delete();

        IEnumerable<IBlob> SearchBlobs(string filter, bool recursive = false);

        IEnumerable<IBlobContainer> SearchContainers(string filter, bool recursive = false);

        IEnumerable<IBlobItem> Search(string filter, bool recursive = false);

        IEnumerable<IBlobItem> Contents { get; }
    }
}

So we can create objects and refer to the types by the interface instead of by the class.  But ultimately, an interface is not a class.  You cannot create an instance of an interface.  You can only create an instance of a class.  What we get is a state where you can rely on any class that implements your interface to have those same public properties.  They call this a Contract.

Wait, this is about inheritance?

No.  This is where we talk about the second thing people talk about with Dependency Injection: Inversion of Control.  The benefit of having these interfaces is that you no longer have to worry about HOW something will be implemented.  You can pass off individual classes even between developers and they no longer have to know what each other are doing.  They have an interface that binds them.  That interface tells you in a very concise way what the architecture, the structure of your application is.  The individual class implementations can be good or bad, but you don't have to know how one class works in order to work with another.  Instead, you only need to know what public properties that class will provide you, and you can build your logic around it.

Inversion of Control is where you decide which class to inject at runtime.  You get control of your implementation at runtime, instead of at compile time.  Your control is "inverted" from the traditional implementation.  If you think about a traditional implementation of the above interfaces, BlobContainer would have an instance of Blob.  Which means that if you change Blob, you can easily break BlobContainer.  But if BlobContainer relies on an IBlob interface, then you can change Blob all day.  As long as it implements IBlob, BlobContainer will be unaffected by those changes.  BlobContainer now has control over the contract it chooses to support with objects that implement IBlob, rather than the other way around.
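Here's the same idea re-sketched in TypeScript (hypothetical names, not Squire's actual API): BlobContainer depends only on the IBlob contract, so any implementation can be injected without BlobContainer changing.

```typescript
// Hypothetical re-sketch of the interface idea, not Squire's real API.
interface IBlob {
  read(): string;
}

class DiskBlob implements IBlob {
  read(): string { return "from disk"; }
}

class MemoryBlob implements IBlob {
  read(): string { return "from memory"; }
}

class BlobContainer {
  // The dependency is injected; BlobContainer never names a concrete class.
  constructor(private readonly blob: IBlob) {}

  describe(): string { return this.blob.read(); }
}
```

`new BlobContainer(new DiskBlob())` and `new BlobContainer(new MemoryBlob())` behave differently without a single line of BlobContainer changing--that's the control being inverted.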

This is what Dependency Injection gives us.  It creates a clear application structure that we can rely on.  But it has a bonus feature.  Interfaces can be implemented by multiple classes, and since we can swap out those classes at runtime, we can swap out our implementation.  For this reason, if you're working in a statically typed language, you should always use this interface pattern.  In fact, the pattern has been around a while.  It's called the Revealing Interface Pattern, and it has varying names and slightly modified versions in many languages, even those that are not statically typed.

So what's wrong?

Because you can change your implementation at runtime, you can do some crazy things.  The craziest I have seen is people using their DI library to create "global singletons" that they can access throughout their application.  Globals are bad.  Singletons are bad because people treat them like globals.  Wrapping singletons in a Dependency Injection library adds extra weight, and adding that extra weight in order to do something that's bad in the first place is just dumb.

Use Interfaces to make your code easier to follow, and IoC in order to make it easier to change out dependencies (read: classes).  Don't use DI to make your dependencies global.  Dependency Injection: You're doing it wrong.

Thursday, March 24, 2016

Javascript Broke, and no one noticed

So, on Tuesday at around 11:30, the Javascript world went into cardiac arrest.  The details are interesting only if you're as deep in the code as I am, so here's a summary for the tl;dr crowd.

What happened?

Code builds on itself.  No one (well, almost no one) codes in binary anymore because we came out with code that wraps groups of binary into smaller, more readable pieces.  That then got wrapped the same way, and so on.  That's how the software world works.  Some articles about this incident even use a Jenga tower as a reference, and that's not very far off (sadly).  This isn't just from language to language either.  Particularly in a language that's been around for a while (like Javascript), there are libraries of code within the language to do the same thing (wrap complex bits in smaller, more readable pieces).  One of those libraries has been around for a loooooooooong time and pretty much everyone relied on it, somewhere deep down the Jenga tower (we call it the software stack).  For the devs in my audience, some of the software that directly relied on left-pad (the library that was the most important here) included Node.js and Babel.  For the non-devs: it was pretty deep down the stack--so deep that most people weren't even aware they were relying on it.

Azer Koçulu is the man of the hour.  He built the tiny piece of code (literally 11 lines) that everyone was using.  He was responsible for hundreds of emergency meetings on Tuesday (I don't know the exact number, but you can imagine a lot of execs calling their IT departments screaming).  He's a big deal in the OSS (open source software) community.  He's published over 270 packages to npm, the place most people (including Node.js and Babel) get their Javascript code.  On Tuesday he un-published all 270+ of his packages, including one innocuous 11-line piece of code called left-pad.  After that moment, code that already had left-pad installed kept working fine (really, it did).  But fresh projects and deployment builds failed, because they couldn't retrieve one little dependency.
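For the curious, here's a from-memory sketch of what a left-pad-style function looks like (not the exact published source):

```typescript
// Sketch of a left-pad-style function: pad a string on the left with a
// fill character until it reaches the target length.
function leftPad(str: string, len: number, ch: string = " "): string {
  let out = String(str);
  while (out.length < len) {
    out = ch + out;
  }
  return out;
}
```

For example, `leftPad("5", 3, "0")` gives `"005"`.  That's the scale of the thing the entire Jenga tower was leaning on.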

Why did he do it?

I'm not going to steal the man's thunder.  He published a blog post in his own words saying why he did it.  Here's the tl;dr version.

Koçulu had a package named kik.  A messenger company named kik asked Koçulu to give them the package name so they could use it instead.  Koçulu refused.  The company pleaded with npm.  npm sided with kik.  Koçulu was offended and decided not to be involved with npm in the future.

It's a bit more complicated than that: Koçulu was good friends with most of the npm team, including the guy who took away his package.  kik sent Koçulu threatening emails, and Koçulu sent kik rude emails.  The whole thing took just under 3 weeks to come to this.

For some perspective, here are some of the highlights from the email chain, which was published by the kik team in their own blog post on the matter.

our trademark lawyers are going to be banging on your door and taking down your accounts and stuff like that


fuck you. don’t e-mail me back.


Is there something we could do for you in compensation to get you to change the name?


you can buy it for $30.000 for the hassle of giving up with my pet project for bunch of corporate dicks

What I expected

While Koçulu was kind of a dick to kik, and kik was kind of a dick to Koçulu, all of that was more or less expected as far as I'm concerned.  Open source developers, especially ones as active as Azer, are notoriously anti-establishment and are likely to respond harshly to even gentle requests from companies.  And kik has trademark rights to consider, so you expect them not to be nice when someone tells them no.  They actually mentioned it in the email chain:

we’d have no choice but to do all that because you have to enforce trademarks or you lose them.

All of that is more or less the sort of banter you'd expect if you're as deep in the software world as I am.  But that's not where it ended.  kik went and got npm involved.

npm's take

Over the 3 weeks following the initial email thread with Koçulu, kik sent npm numerous emails about how rude he had been to them, repeatedly asking "can you guys help?".  Eventually, npm sent one email to both parties with this message.


Hi, Azer.
I hear your frustration. The desire to continue to use the kik and kik-starter package names, is clear.
Our goal is to make publishing and installing packages as frictionless as possible. In this case, we believe that most users who would come across a kik package, would reasonably expect it to be related to kik.com. In this context, transferring ownership of these two package names achieves that goal. I understand that you’ve committed time and energy to the packages already, and we don’t take that lightly. I’m hopeful that you’ll be able to republish this project with a new name.
Can you provide an npm account to transfer the name to?
Thank you both for your patience and understanding.

So npm's decision was to transfer the package from one account to another.  As anyone who's lost a battle would be, Koçulu was offended and decided to remove all of his code from this particular package manager.  He even sent a reply with an explanation.


Isaac; I’m very disappointed with your decision here. I know you for years and would never imagine you siding with corporate patent lawyers threatening open source contributors.
There are hundreds of modules like Kik, for example, Square; https://www.npmjs.com/package/square.
So you’ll let these corporate lawyers register whatever name they want ? Noone is looking for a Kik package because they don’t have one.
I want all my modules to be deleted including my account, along with this package. I don’t wanna be a part of NPM anymore. If you don’t do it, let me know how do it quickly. I think I have the right of deleting all my stuff from NPM.

He deleted code?

Well, no, actually.  He didn't delete anything.  He just "un-published" his packages.  You can still get access to all of his open source code on his github account.  He simply doesn't want his code in the package management system that screwed him over.  Sadly, the package manager he pulled his code from is the one that everyone uses, so it broke everyone's code (remember that Jenga tower again?).  It was relatively easy to fix things, but most people weren't aware of what was broken, so it resulted in a very big freak out (which is why I called it cardiac arrest and not death).

But I didn't notice

OK.  To be fair, most people didn't notice.  Most build processes are smarter than to publish code to production when something like that breaks.  But the developers who were trying to publish code to production on Tuesday noticed, and that's what the big freakout was about.  In the software world, many companies publish code to production multiple times per day, so you can see how quickly this becomes a corporate meeting.

In fact, npm "fixed" the brokenness by re-publishing the left-pad package.   npm's Laurie Voss made the call here.  And later, npm published a followup about the decision.

What's the controversy here?

So most of the discussion is about the argument between Azer and kik, and about this "unprecedented" move by npm to re-publish an un-published package.  I'm ok with all of that.  I expected the argument, and I am even ok with npm choosing to publish someone's package without their consent.  It is open source after all.  I tend to side with Voss on this one:

In the meantime, several thousand open source projects have been repaired, and I'm sleeping fine tonight.

I have the same problem that Koçulu had to begin with: I think that npm should not be handing over projects to other users.  That's a dispute between the parties involved, and for npm to act as arbitrator in the situation is inappropriate.  Moreover, it's wrong to take a project away from someone like Koçulu, who has published over 270 open source packages to npm and knows the team on a first-name basis, and hand it to someone like the kik team, who sends nasty, threatening emails to that same developer.

We're talking about a community here, not just one guy.  They may have made the kik team happier, but they damaged the community as a whole.  They took the non-contributor and put him above one of their top contributors.  That's very wrong to me.

For completeness, and those who want it, here's the code that broke the "internet" for 2.5 hours.

module.exports = leftpad;

function leftpad (str, len, ch) {
  str = String(str);

  var i = -1;

  if (!ch && ch !== 0) ch = ' ';

  len = len - str.length;

  while (++i < len) {
    str = ch + str;
  }

  return str;
}
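If you're wondering what those eleven lines actually do, here's a quick demonstration.  I've restated the function below so the examples are self-contained; the sample calls are my own, not from the package's test suite:

```javascript
// Same logic as the left-pad module above, inlined for the example.
function leftpad(str, len, ch) {
  str = String(str);
  var i = -1;
  // Default to a space, but allow 0 as a padding character.
  if (!ch && ch !== 0) ch = ' ';
  len = len - str.length;
  while (++i < len) {
    str = ch + str;
  }
  return str;
}

console.log(leftpad('7', 3, '0'));   // "007"
console.log(leftpad('foo', 5));      // "  foo" (pads with spaces by default)
console.log(leftpad('foobar', 3));   // "foobar" (already long enough, unchanged)
```

That's it.  A string padder.  That's the dependency that thousands of projects, Babel included, couldn't install for a couple of hours.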

Wednesday, February 3, 2016

Github is weird

So I'm coding all the time.  As a result, I find it somewhat hilarious that my github stats don't show it, mostly because I do a lot of my coding in private repositories, not on Github.  In fact, as of late, I'm thrilled to be using my own GoGS server.  It runs great on one of my raspberry pis.

I've been coding a lot of javascript lately, so this post will be heavy in js terminology.  Don't mind that.  The sum of it is in the title and the screenshots.

What I actually did

Last night (this morning?) I forked a repository of an npm package I'm using in one of my projects.  I needed some new features added to it to make it work with the project, so I coded them.  The work spreads across 4 commits.

It wasn't much, and my commit to fix the tests didn't fully work because the original author wrote tests that assumed a specific timezone, so this wasn't yet pull request material.  To keep it simple, I put the commits on the master branch and simply pointed my project at my fork instead of the released versions of the project.
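For context, pointing a project at a fork instead of the npm release is a one-line change in package.json.  The names below are placeholders, not the actual package I forked:

```json
{
  "dependencies": {
    "some-package": "github:myusername/some-package#master"
  }
}
```

npm understands the `github:user/repo#ref` shorthand (at least in recent versions), so `npm install` pulls straight from the fork instead of the registry.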

Later, after some sleep, I started researching a base for my pull request, only to realize that the author hadn't even looked at pull requests since July of 2015.  So I looked at some other forks and considered basing my potential pull request on a more recent one.  Instead of just doing that and abandoning what is definitely the basis for the project on npm, as a courtesy to the author I posted an issue noting that the project appears to be dead and asking if we should redirect to a more active fork.

So, naturally, I expected Github to track my activity and show it in my contributions page.

What Github says I did

I noticed a few months back that github doesn't put anything on your contributions page if you are working on a branch that isn't "master".  That must be annoying for people who don't use master as their main branch, but I can understand it: it results in "eventual" consistency, with contributions showing up when your branch is merged into master.  But it still seems silly.

What's worse, I noticed when I left my last job that when you leave a team, your contributions to that team's private repositories no longer show up on Github either.  But I guess I get that too...to the same extent.

As of a few weeks ago, I noticed that issues showed up as contributions on this display.  I can understand that issues are important, but to track them while not tracking actual commits....whatever I guess.

All of that helps it make more sense when you see that, even though my contributions are to a public repository on the master branch, they don't show up because the repository is a fork...But then I realized something extremely unpleasant.

That's not just wrong, it's detrimental

Leave aside the fact that this encourages people to spam meaningless issues onto every repository they can find.

I'd like to remind you instead that I'm using the forked repo in production.  That means the information github is showing is actually the opposite of the contributions I've made to software development today.

I don't know what to say.  Maybe I'm using github wrong.  Let me know if you think so.  Otherwise, feel free to share your own github stories.  What's weird about github to you?

Tuesday, December 22, 2015

You don't know what you don't know

I've been programming since I was little.  I'm not just someone who wrote some code once when I was 15.  I code all the time because I enjoy it.  This industry is both very deep and very wide.  The ocean of available knowledge is what attracts me to it.  I bore easily, and the idea that I can always be learning something new is extremely appealing.  But, though it may change, at this point in time, I'm the exception, not the rule.

Not every programmer started as a kid.  Not every programmer does it for the sake of programming.  Most do it for the ego trip, or worse, the money.  But having the level of passion that I have for programming is something to be proud of.  That means that many people who do this job for other reasons--well, they fake it.

But it's complicated

Like I said, the industry is a vast ocean of information.  This means that the tiniest amount of knowledge in an obscure enough area (it doesn't even have to be that deep) can give a person the appearance of being more skilled than someone who has spent thousands of hours more studying other areas of development.  Let's name these people for simplicity.  Phil will be our name for the programmer who happens to know something extremely obscure (let's say he's an awk expert).  And Bill will be our name for the programmer who knows a lot of things, but has never heard of this obscure technology.

The thing is, Phil gets respect over Bill because of this knowledge.  "Even Bill doesn't know it!" say his co-workers.  Sometimes, Bill even compliments him.  "I wish I had your knowledge".  Phil doesn't know what Bill does, but he doesn't have to.  He was hired for this specific purpose.  And this is all great.  Phil gets lots of praise.  His ego inflates and he might even start to get the impression that he's actually a better programmer than Bill.

And then things get complicated

Bill, Phil, and the rest of the team are invited to a meeting by Mark, the new project manager who is trying to plan out the next piece of software that the team has to build.  Bill is excited.  He presents a new piece of technology and explains how it will fit well on the existing technology stack.  Phil gets excited as well, and explains how this new branch of his obscure technology is perfect (maybe) for the problem area.

The team, having spent years complimenting Phil on how much smarter than Bill he was, backs the idea to use this obscure technology on the project.  A decision is made and Bill has to make it work.

Do you see it now?

Bill is now responsible for a project written in a technology he has no knowledge of.  Phil, while extremely confident, will have no actual involvement in the project anyway.  The project will be written in an extremely obscure technology, making it harder for the team to get up to speed and learn what they need to get the job done.

The project will fail

It's worse than just the team learning about the technology.  Phil doesn't have knowledge of the existing production stack, the vital knowledge that Bill has to use when making decisions about what tools to use for the project.  Using this obscure technology will actually be impossible for this company, but even the developers won't discover that soon enough for it to help.  The business has no clue.  The company will spend a lot of money on changing infrastructure as the developers discover they need this and that to make it work.  And then, when they're mostly there, they'll realize that it's close enough to impossible that the business is not going to waste money on it anymore.  And it gets worse than that, but let's not focus on the negatives.

What about Phil?

Well, Phil hasn't been working on the project.  He won't get blamed when it fails.  Because he understands most of the technologies involved, he may get consulted on parts of it, and seeing the rudimentary mistakes of the other developers learning his technology, he may even think it's failing purely because they're incompetent.  It's not his fault either.  His line of thought is perfectly reasoned based on the information he has.  But then, that's the mistake, isn't it?  The information he has is incomplete.

This is a common issue in the pursuit of knowledge, and I wrote this post to explain why it manifests itself so strongly in software development.  But with that explanation, it's also inspiration for a new way of thinking.  As much time as I spend studying obscure, deep, complex bits of software, I don't spend as much time talking about it.  Like Phil, I certainly want to talk about the cool stuff I was studying, but I don't usually get to.

Back to basics

Instead, I spend most of my time talking about simple things.  Even skilled developers often have a weak understanding of testing, readability, and sustainability.  That's what I recommend you talk about also.  If you're ever in Bill's shoes in a meeting trying to deal with an ugly situation, stop.  Don't argue about a topic you don't understand.  Fall back to the basics.  Talk about sustainability.  Talk about testability.  Talk about readability.  But above all, remember.  It's not about today.  It's about the long term.  In 6 months, you're going to have to explain what happened.  Be clear and concise.  And above all, remember, you don't know what you don't know.