.NET DC Meetup – Prototyping On the Cheap

I was fortunate to attend the .NET DC Meetup last night (3/19/2019), where Steve Goguen spoke about using F# and Linqpad in a talk titled "Prototyping on the Cheap."

Steve opened with a relevant clip from The Founder about the Speedee System; the film details McDonald's and its change from a single location to the franchise powerhouse it is today.

In the scene, the McDonald brothers prototype the most efficient kitchen layout possible to ensure fast, reliable hamburger production. Previously, hamburgers took 30 minutes to make and deliver at drive-ins; after this change, burger production dropped to 30 seconds. It would have been prohibitively expensive to make those changes without prototyping. Note the tools they use: a ruler, chalk, a ladder, and imagination.

Steve then went through how F#, with its minimalist approach to declaring types and data structures (like a hash of hashes), lets you use the code itself to prototype the relationships between entities in a system. The example he used was the menu system for Chipotle. I have almost no experience with F#, and I still found his example easy to follow.

Besides using F# Data Structures as a way to show off relationships between entities, Steve took us through Bogus, a library for generating data.

Up until this point, I had never heard of Bogus, and I have just one question: where has this thing been all my life? I hope Steve shares his code examples, but Bogus coupled with F# makes it trivial to generate sample data (image from last night's meetup below):

Steve Goguen, taking us through how you can seed Bogus with the same number to have a deterministic randomness.

From the image above, the F# used to generate fake data is on the left, with the result, in Linqpad, shown on the right.

From this demonstration, a few things stood out to me:
1. Linqpad's Dump() method is really useful and reminds me of the Perl library Data::Dumper, except in Linqpad it can dump full-blown HTML.
2. As Steve says in his presentation, Bogus does really well if you let it generate random data without constraints. If you provide constraints (for instance, asking Bogus for random birthdates within a 20-year timeframe), then Bogus can take much longer to generate the data you need. One way around this is to generate the constrained data once and then randomly pull from that set (see the sketch below).
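Since Steve's examples were in F# and I haven't seen his code yet, here's a rough C# sketch of the same idea to jog my memory later. The rules and names here are my own invention, not his; the only parts taken from Bogus itself are Randomizer.Seed and the Faker datasets.

using System;
using Bogus;

public static class FakePeople
{
    public static void Main()
    {
        // Seeding the randomizer makes the "random" data deterministic across runs.
        Randomizer.Seed = new Random(8675309);

        var faker = new Faker("en");
        for (int i = 0; i < 5; i++)
        {
            // With the same seed, every run prints the same five people.
            Console.WriteLine($"{faker.Name.FullName()} - {faker.Internet.Email()}");
        }
    }
}

Setting Randomizer.Seed is what gives you the "deterministic randomness" from the photo above: the same seed produces the same fake people every run.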

In the Second Act, Steve went over FsCheck, to which I have to ask again: where has this been all my life?

FsCheck allows you to generate random values of given types as inputs for your application and check whether or not it handles those random inputs well. In terms of Prototyping, FsCheck provides a quick way to sanely flesh out where your idea may run into data problems (or where your assumptions are too broad or narrow), and you can use it in Linqpad.
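To make sure I remember the gist, here's a minimal sketch of a property-based test in C#, assuming FsCheck's C# facade (Prop.ForAll and QuickCheckThrowOnFailure); the exact method names may differ between versions, and the property itself is just a toy example of mine.

using System.Linq;
using FsCheck;

public static class Properties
{
    public static void Main()
    {
        // Property: reversing a list twice gives back the original list.
        // FsCheck generates the inputs (including nulls and empty arrays) for us.
        Prop.ForAll<int[]>(xs => xs == null || xs.Reverse().Reverse().SequenceEqual(xs))
            .QuickCheckThrowOnFailure();
    }
}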

Steve then went over Reactive Extensions in C# with Observables, and how to use Observables to prototype showing changes when items are selected or de-selected, as a way to show stakeholders different options.
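I didn't capture Steve's Observable examples, but the shape of the idea looks something like this hypothetical sketch using System.Reactive's Subject; the item names and events are mine, not his.

using System;
using System.Reactive.Subjects;

public static class SelectionPrototype
{
    public static void Main()
    {
        // A stream of (item, selected) events standing in for UI checkbox toggles.
        var selections = new Subject<(string Item, bool Selected)>();

        // React to changes: this is where a prototype would update the UI for stakeholders.
        using (selections.Subscribe(e =>
            Console.WriteLine($"{e.Item} was {(e.Selected ? "selected" : "de-selected")}")))
        {
            selections.OnNext(("Burrito", true));
            selections.OnNext(("Burrito", false));
            selections.OnNext(("Tacos", true));
        }
    }
}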

Finally, in Act Three, Steve showed us how to use all of this, along with Linqpad's built-in UI controls and some extension methods, to generate Winforms and Web UIs, in Linqpad.

Linqpad has a number of controls built in, and Steve showed us how we can use these controls to prototype a Winforms UI. But that's not all; Steve also showed us how to use an extension method to Dump IHtmlString as HTML in Linqpad, along with pulling in Bootstrap and jQuery (and Dumping it with the glorious Dump() method), to prototype entire HTML interactions in Linqpad.

Steve Goguen, showing us how to create entire interactive webpage prototypes in Linqpad.
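I don't have Steve's IHtmlString extension method, but Linqpad's built-in Util.RawHtml gets you most of the way there. A minimal sketch: this only runs inside Linqpad (Util and Dump() are Linqpad-provided helpers), and the markup is made up for illustration.

// Runs inside Linqpad only.
void Main()
{
    // Render raw HTML in Linqpad's output pane instead of the default object dump.
    Util.RawHtml("<h3>Order summary</h3><ul><li>Burrito</li><li>Chips</li></ul>").Dump();
}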

The entire talk was well put together and made me excited for Linqpad, F#, and those newfound (to me) prototyping libraries. Many thanks to Steve Goguen for speaking on this subject. If/When the slides and code snippets are released, I’ll update this post with links to them. For now, Steve’s site is here, here’s his Github and Twitter.


My Salary Progression in Tech

There was a recent tweet that asked people to share their salary as a means of helping others, so I'll do that. I don't think it's enough, however[1]. It will give you power in your own negotiations, but it's not enough[1], again. To help you more, I highly recommend Patrick McKenzie's (@patio11) article on Salary Negotiation. In my case I used some of the same techniques, though as this was 2010, his article hadn't been written yet. It's one of the best articles I've seen on salary negotiation. Seriously, go read it.

My salary increases over the years have been due to a few things:
1. Knowing what I brought to the table and selling that.
2. Showing that I bring more value than I’m asking for.
3. Not being emotionally invested in the outcome. (Which is somewhat ironic as being emotionally invested is what ends up getting me in trouble later on).

I have never threatened to leave a job if I wasn't given a raise, as I feel like that leads to a "Gosh, what will happen with George if we give it to him? Will he stick around? He's already mentioned leaving!" mentality. I also don't ask twice. If I don't get it, then I step back, learn what's valuable to the other party, and do more of that, visibly[1] (once again). Anywhere you see a bump for the same employer is where I've asked for a raise (discounting cost-of-living raises, as I did not ask for those).

In instances where I've changed jobs, which I've done quite a few times throughout my career, it was generally for more money or an intangible benefit (for instance, I loved working at Higher Logic, but left to join Jewelbots because I really believed in Sara Chipps' mission). I left Silkroad technology (even though I loved the team) because I had moved to Northern Virginia and couldn't make it in NoVA on North Carolina wages (the cost of living jumped by 35%). Similarly, I left The Motley Fool to join Higher Logic because there was an opportunity for a bit of a pay increase, and as a new father I couldn't turn that down (though the Fool is pretty awesome to work for).

A final note: this is just salary. I'm not including 401K employer contributions, bonuses, or anything of that nature; it clouds the base-pay issue, and if you're living month to month (like we were), base pay is all that really matters. I will say that base pay isn't the full story. Jewelbots couldn't offer health insurance, but they fully covered my COBRA, and since Higher Logic had an amazing healthcare plan, that was a really good place to be in.

I should also note (so final_note_2), that @vcsjones is the one that got me realizing I should ask for more money. We were having a conversation on our way down to RevolutionConf 2016, and we stopped at a local brewery he suggested for a beer and food. I asked him what he made — I had had a few sips of beer, to be fair — but credit to him, he told me. They aren’t my facts to tell, but he is the one that helped me see the local market was not what Glassdoor made it out to be.

So here it is, my salary progression in tech (note: as I just opened my business, there is no revenue to report).

| Year | Company | Position | Salary | Time | Location | Lang. |
|------|---------|----------|--------|------|----------|-------|
| 2004 | US Army | HR Admin | $26,460 (E-5) | 2 years | Ft. Bragg, NC | VBA |
| 2007 | Mainheim Statesville | IT Admin | $42,000 | 1+ year | Statesville, NC | Perl, C# |
| 2008 | Silkroad technology | Junior Programmer | $48,000 | 1+ year | Winston-Salem, NC | C#, ASP.NET |
| 2009 | CACI Inc. | Developer | $85,000 | 1+ year | Chantilly, VA | C#, Winforms |
| 2010 | CACI Inc. | Team Lead | $120,000* | | | |
| 2011 | The Motley Fool | Developer | $87,000 | 3+ yrs | Alexandria, VA | C#, ASP.NET MVC |
| 2012 | The Motley Fool | Developer | $91,000 | | | C#, ASP.NET MVC |
| 2013 | The Motley Fool | Developer | $94,000 | | | Python, Angular |
| 2014 | The Motley Fool | Developer | $97,000 | | | Angular, C# |
| 2014 | Higher Logic | Senior Developer / DBA | $120,000* | 1+ year | Rosslyn, VA | C#, ASP.NET & MVC |
| 2015 | Jewelbots | VP, Software | $115,000 | 1+ year | Remote (NYC) | Ionic, Angular, C (firmware) |
| 2016 | | Solutions Architect | $170,000* | 2+ years | Reston, VA | C#, Perl, JS |
| 2017 | | Solutions Architect | $185,000* | | Reston, VA | |
| 2018 | | Solutions Architect | $187,000 | | Springfield, VA | |
| 2019 | | Solutions Architect | $189,000 | | Springfield, VA | |
| 2019 | Hollow Wall Technology Services | Owner | $0 | Current | Springfield, VA | |

*: The Asterisk (*) indicates when I’ve asked for raises or otherwise negotiated for that salary.

[1]: Privilege is a large part of the equation: the privilege to not care; the privilege to be a white dude in an industry that (either intentionally or unintentionally) caters to white dudes. Yes, this reflects playing the game on easy mode. I have no doubts there. I'm writing that it's not enough because I am too privileged to be able to see what non-white dudes should do. So if you're a white dude reading this, make it better for everyone by not being cheap on compensation (and recognize any potential bias or privilege you may have).

Software Teardowns: Console.WriteLine (Part 2: Unix)

Last time we tore down the Windows implementation of Console.WriteLine and made it all the way to the closed source of the Win32 API. This time, we're going through the Unix version of Console.WriteLine, and along the way we'll be able to go deeper, since the Unix-side implementation (and the C libraries underneath it) is open source.

We left off at the platform implementation of the ConsolePal class, and if you remember, ConsolePal.Unix.cs is a drop-in, compile-time file replacement.

The UML diagram at this point resembles the following (omitting properties and methods not relevant to this post):

In an effort to move a little faster, I’ll refer to the previous post where code is the same, and explain it here when it isn’t.

In order to write to the console, we must (again) OpenStandardOutput() which looks as follows:

public static Stream OpenStandardOutput()
{
    return new UnixConsoleStream(SafeFileHandleHelper.Open(() => Interop.Sys.Dup(Interop.Sys.FileDescriptors.STDOUT_FILENO)), FileAccess.Write);
}

This differs significantly from its Windows OpenStandardOutput() counterpart, starting with a call to SafeFileHandleHelper.Open(Func<SafeFileHandle> fdFunc).

namespace Microsoft.Win32.SafeHandles
{
    internal static class SafeFileHandleHelper
    {
         /* Snip.... */
 
         /// <summary>Opens a SafeFileHandle for a file descriptor created by a provided delegate.</summary>
        /// <param name="fdFunc">
        /// The function that creates the file descriptor. Returns the file descriptor on success, or an invalid
        /// file descriptor on error with Marshal.GetLastWin32Error() set to the error code.
        /// </param>
        /// <returns>The created SafeFileHandle.</returns>
        internal static SafeFileHandle Open(Func<SafeFileHandle> fdFunc)
        {
            SafeFileHandle handle = Interop.CheckIo(fdFunc());

            Debug.Assert(!handle.IsInvalid, "File descriptor is invalid");
            return handle;
        }
    }
}

Several things of note about the above code: even though it is only called from the Unix side, it lives in the Microsoft.Win32.SafeHandles namespace, and the file is named SafeFileHandleHelper.Unix.cs.

The next interesting bit is that this takes in a Func delegate comprised of a call to Interop.Sys.Dup(Interop.Sys.FileDescriptors.STDOUT_FILENO), which leads us down a new path. Previously we had seen Interop refer to native Windows calls; but since .NET Core is cross platform, it also has to interop with *nix environments. The Dup call is new to me, so I'll spend a moment trying to track down why it's called that. A quick Github search shows me that it's a wrapper for a SystemNative_Dup call, which I don't quite yet understand:

internal static partial class Interop
{
    internal static partial class Sys
    {
        [DllImport(Libraries.SystemNative, EntryPoint = "SystemNative_Dup", SetLastError = true)]
        internal static extern SafeFileHandle Dup(SafeFileHandle oldfd);
    }
}

If my understanding holds true, I should be able to look around and find a SystemNative_Dup either in the framework’s CLR, or in a native standard library. (Time to Google again).

I found a pal_io.h (header file), and a pal_io.c that contains the SystemNative_Dup function call. From our last blog post on this subject, we found out that PAL stands for Platform Abstraction Layer; so this native code file handles IO at the PAL level.
This file is located at ./src/Native/Unix/System.Native/pal_io.c.

intptr_t SystemNative_Dup(intptr_t oldfd)
{
    int result;
#if HAVE_F_DUPFD_CLOEXEC
    while ((result = fcntl(ToFileDescriptor(oldfd), F_DUPFD_CLOEXEC, 0)) < 0 && errno == EINTR);
#else
    while ((result = fcntl(ToFileDescriptor(oldfd), F_DUPFD, 0)) < 0 && errno == EINTR);
    // do CLOEXEC here too
    fcntl(result, F_SETFD, FD_CLOEXEC);
#endif
    return result;
}

The first bit I want to tear down here is the HAVE_F_DUPFD_CLOEXEC preprocessor block. Since this is a preprocessor definition, the code underneath changes based on whether that definition is enabled (generally through a compiler directive in Visual Studio, or through a command-line flag for GCC or MSBuild). A quick search shows that HAVE_F_DUPFD_CLOEXEC is defined in one place, but used in two places:

In src/Native/Unix/Common/pal_config.h.in (comments added by me):

#pragma once

#cmakedefine PAL_UNIX_NAME @PAL_UNIX_NAME@ //line 3
//SNIP
#cmakedefine01 HAVE_F_DUPFD_CLOEXEC //line 9

The interesting part about this is #cmakedefine01 is a pre-defined (hehe) define in cmake; so it makes sense that they use cmake as part of their build toolchain.

As far as what HAVE_F_DUPFD_CLOEXEC may mean, there are references to F_DUPFD_CLOEXEC in some Linux codebases, particularly in /include/uapi/linux/fcntl.h, which has the following definition:

/* Create a file descriptor with FD_CLOEXEC set. */
#define F_DUPFD_CLOEXEC	(F_LINUX_SPECIFIC_BASE + 6)

And a google search turns up the following documentation for fcntl.h which is short for “File Control”:

F_DUPFD
Return a new file descriptor which shall be allocated as described in File Descriptor Allocation, except that it shall be the lowest numbered available file descriptor greater than or equal to the third argument, arg, taken as an integer of type int. The new file descriptor shall refer to the same open file description as the original file descriptor, and shall share any locks. The FD_CLOEXEC flag associated with the new file descriptor shall be cleared to keep the file open across calls to one of the exec functions.
F_DUPFD_CLOEXEC
Like F_DUPFD, but the FD_CLOEXEC flag associated with the new file descriptor shall be set.

In other words, using this returns a duplicate of the file descriptor (DUP is short for duplicate), but with the CLOEXEC (close-on-exec) flag set on the new file descriptor, so it is automatically closed when one of the exec functions is called. With that sorted, we're back to this line of code:

 while ((result = fcntl(ToFileDescriptor(oldfd), F_DUPFD_CLOEXEC, 0)) < 0 && errno == EINTR);

This while loop does a check against the oldfd, and converts it to a FileDescriptor:

/**
* Converts an intptr_t to a file descriptor.
* intptr_t is the type used to marshal file descriptors so we can use SafeHandles effectively.
*/
inline static int ToFileDescriptorUnchecked(intptr_t fd)
{
    return (int)fd;
}

/**
* Converts an intptr_t to a file descriptor.
* intptr_t is the type used to marshal file descriptors so we can use SafeHandles effectively.
*/
inline static int ToFileDescriptor(intptr_t fd)
{
    assert(0 <= fd && fd < sysconf(_SC_OPEN_MAX));

    return ToFileDescriptorUnchecked(fd);
}

and when that check has completed, it calls the standard library's __libc_fcntl(), which calls the native syscall do_fcntl, which has the following function signature:

static long do_fcntl(int fd, unsigned int cmd, unsigned long arg,
		struct file *filp)

The first argument is the file descriptor to operate on (by convention, a file descriptor in POSIX land is a non-negative integer). STDOUT, which is what we care about, has a file descriptor value of 1; the STDOUT_FILENO constant is set to 1.

The call casts the oldfd to a system-appropriate file descriptor, passes the F_DUPFD_CLOEXEC command, and starts the search for a new descriptor at 0.

So to recap where we’re at; we’ve crossed from the .NET Core Framework into the native calls necessary to open STDOUT_FILENO and ensure it’s open so we can write to it.

Now that we’ve opened the File Descriptor, we can open a stream containing that File Descriptor; and we’ll do that with UnixConsoleStream; with this line of code:

public static Stream OpenStandardInput()
{
    return new UnixConsoleStream(SafeFileHandleHelper.Open(() => Interop.Sys.Dup(Interop.Sys.FileDescriptors.STDIN_FILENO)), FileAccess.Read);
}

The UnixConsoleStream class is an internal class located in the ConsolePal.Unix.cs file. It derives from the abstract base class ConsoleStream, and it does two particularly Unix-y things on instantiation:

internal UnixConsoleStream(SafeFileHandle handle, FileAccess access)
                : base(access)
{
        Debug.Assert(handle != null, "Expected non-null console handle");
        Debug.Assert(!handle.IsInvalid, "Expected valid console handle");
        _handle = handle;

        // Determine the type of the descriptor (e.g. regular file, character file, pipe, etc.)
        Interop.Sys.FileStatus buf;
        _handleType =
        Interop.Sys.FStat(_handle, out buf) == 0 ?
                (buf.Mode & Interop.Sys.FileTypes.S_IFMT) :
                Interop.Sys.FileTypes.S_IFREG; // if something goes wrong, don't fail, just say it's a regular file
}

First, it uses FStat (a wrapper over a native Unix syscall) to check the status of the file descriptor. Much like all framework calls to native code, there's an Interop class made for this purpose. Here's the one for FStat:

[DllImport(Libraries.SystemNative, EntryPoint = "SystemNative_FStat", SetLastError = true)]
internal static extern int FStat(SafeFileHandle fd, out FileStatus output);

internal static class FileTypes
{
    internal const int S_IFMT = 0xF000;
    internal const int S_IFIFO = 0x1000;
    internal const int S_IFCHR = 0x2000;
    internal const int S_IFDIR = 0x4000;
    internal const int S_IFREG = 0x8000;
    internal const int S_IFLNK = 0xA000;
    internal const int S_IFSOCK = 0xC000;
}

(This is getting too long for me to teardown DllImportAttribute, so I’ll do that in a future post).
Above this call are the constants also referenced in the UnixConsoleStream snippet above, particularly S_IFMT and S_IFREG, which stand for "type of file" and "regular file", respectively. (Honestly, I can't see how S_IFMT stands for type of file; I would have expected "format". Could someone who does systems programming chime in on why it's named S_IFMT?)

Because Unix-like systems and their C libraries are largely open source software, we get to actually dive into the C code behind these calls. In glibc, the fstat function looks like this:

#include <sys/stat.h>

/* This definition is only used if inlining fails for this function; see
   the last page of <sys/stat.h>.  The real work is done by the `x'
   function which is passed a version number argument.  We arrange in the
   makefile that when not inlined this function is always statically
   linked; that way a dynamically-linked executable always encodes the
   version number corresponding to the data structures it uses, so the `x'
   functions in the shared library can adapt without needing to recompile
   all callers.  */

#undef fstat
#undef __fstat
int
attribute_hidden
__fstat (int fd, struct stat *buf)
{
  return __fxstat (_STAT_VER, fd, buf);
}

weak_hidden_alias (__fstat, fstat)

This is where things get really complicated. There are multiple versions of fstat, and depending on the version, different code will be called. Or, as explained by the man page:


Over time, increases in the size of the stat structure have led to three successive versions of stat(): sys_stat() (slot __NR_oldstat), sys_newstat() (slot __NR_stat), and sys_stat64() (slot __NR_stat64) on 32-bit platforms such as i386. The first two versions were already present in Linux 1.0 (albeit with different names); the last was added in Linux 2.4. Similar remarks apply for fstat() and lstat().

Hello technical debt?

Don’t worry, there’s more:

The glibc stat() wrapper function hides these details from applications, invoking the most recent version of the system call provided by the kernel, and repacking the returned information if required for old binaries.

On modern 64-bit systems, life is simpler: there is a single stat() system call and the kernel deals with a stat structure that contains fields of a sufficient size.

The underlying system call employed by the glibc fstatat() wrapper function is actually called fstatat64() or, on some architectures, newfstatat().

And this is where Unix majorly differs from Windows. I've ignored it up until now for simplicity, but Unix is not a singular operating system so much as it's a style of operating system. If you ask someone if they're running Unix, you'll also need to ask what variant they're using: is it a GNU/Linux variant? A BSD variant? A proprietary Unix? And that doesn't even tell you everything you need to know if you're interacting with the OS. You still need to know which C standard library implementation is running: musl, glibc, or some other C library. That any of it works is a testament to software developers.

Back to our UnixConsoleStream. Now that we have OpenStandardOutput() executed, we need to write to it:

public static TextWriter Out => EnsureInitialized(ref s_out, () => CreateOutputWriter(OpenStandardOutput()));

The next step is to create the output writer. This part starts with a call to this private method:

private static TextWriter CreateOutputWriter(Stream outputStream)
{
return TextWriter.Synchronized(outputStream == Stream.Null ?
       StreamWriter.Null :
       new StreamWriter(
          stream: outputStream,
          encoding: OutputEncoding.RemovePreamble(), // This ensures no prefix is written to the stream.
          bufferSize: DefaultConsoleBufferSize,
          leaveOpen: true) { AutoFlush = true });
}

The preamble mentioned is the byte-order-mark (BOM). When you write to a file, the BOM can be useful because it helps identify the encoding; when you write to the console, it would just show up as a few stray characters at the start of the output, so the console writer strips it.
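If you want to see the preamble for yourself, here's a quick sketch using plain .NET (UTF8Encoding and GetPreamble, nothing console-specific):

using System;
using System.Text;

public static class PreambleDemo
{
    public static void Main()
    {
        // UTF8Encoding can be told whether to emit the byte-order-mark (the "preamble").
        var withBom = new UTF8Encoding(encoderShouldEmitUTF8Identifier: true);
        var withoutBom = new UTF8Encoding(encoderShouldEmitUTF8Identifier: false);

        Console.WriteLine(BitConverter.ToString(withBom.GetPreamble())); // EF-BB-BF
        Console.WriteLine(withoutBom.GetPreamble().Length);              // 0
    }
}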

The next part is that AutoFlush is set to true. This is important because when you write to a file, the file is not immediately written to. A buffer fills up, and once that buffer is full it’s “Flushed” to the file. This can cause problems if you’re looking for immediate feedback on a console window, so turning on AutoFlush alleviates that.
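And a small sketch of the difference AutoFlush makes (again plain .NET rather than framework internals): without the Flush call or AutoFlush = true, the text may not appear until the writer is disposed.

using System;
using System.IO;

public static class AutoFlushDemo
{
    public static void Main()
    {
        // With AutoFlush off, writes sit in the StreamWriter's buffer until Flush/Dispose.
        using (var writer = new StreamWriter(Console.OpenStandardOutput()) { AutoFlush = false })
        {
            writer.Write("buffered...");
            writer.Flush(); // force the buffered text out now
        }
    }
}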

The TextWriter.Synchronized static method is located here:

public static TextWriter Synchronized(TextWriter writer)
{
    if (writer == null)
       throw new ArgumentNullException(nameof(writer));

    return writer is SyncTextWriter ? writer : new SyncTextWriter(writer);
}

The SyncTextWriter, as the name suggests, ensures that writing is synchronized, and the only bit that seems strange is a new attribute here: [MethodImpl(MethodImplOptions.Synchronized)]. (Not posting the full source due to its length, but it looks a lot like TextWriter, except it has this attribute.) In fact, it's a child class of TextWriter, and it calls the base class's version of all the methods while adding the above attribute to each call.

MethodImplOptions.Synchronized is an enum of method implementation flags, and as the comments state, it is used to specify certain properties of the method when it is compiled:

namespace System.Runtime.CompilerServices
{
    // This Enum matchs the miImpl flags defined in corhdr.h. It is used to specify 
    // certain method properties.
    [Flags]
    public enum MethodImplOptions
    {
        Unmanaged = 0x0004,
        NoInlining = 0x0008,
        ForwardRef = 0x0010,
        Synchronized = 0x0020,
        NoOptimization = 0x0040,
        PreserveSig = 0x0080,
        AggressiveInlining = 0x0100,
        AggressiveOptimization = 0x0200,
        InternalCall = 0x1000
    }
}

Unfortunately if I want to go deeper I need to dig into Roslyn. So I’ll do that, but only for a second. I’m out of my depth, so I search for Synchronized; and find this comment, which points me (almost by accident) in the right direction… except now I’m so far out of my depth I’m not sure which way is up. I’m looking for what IL would be generated for a Synchronized method; but can’t find it on my own searching.
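As best I can tell, that's because there isn't special IL to find: Synchronized is recorded as a flag in the method's metadata, and the runtime honors it by taking a lock around the call (on the instance for instance methods, on the type for static methods). The sketch below shows the equivalence as I understand it; the Counter class is just an illustration of mine.

using System.Runtime.CompilerServices;

public class Counter
{
    private int _count;

    // The runtime serializes calls by locking on 'this' for instance methods.
    [MethodImpl(MethodImplOptions.Synchronized)]
    public void Increment() => _count++;

    // Roughly the hand-written equivalent (locking on 'this' is discouraged in
    // application code, but it illustrates what the flag does).
    public void IncrementEquivalent()
    {
        lock (this)
        {
            _count++;
        }
    }
}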

Back to the TextWriter (well, SyncTextWriter; but since it calls the base-class methods with special options, we’ll look at TextWriter and just pretend it’s synchronized).

// Writes a string followed by a line terminator to the text stream.
//
public virtual void WriteLine(string value)
{
     if (value != null)
     {
         Write(value);
     }
     Write(CoreNewLineStr);
}

The interesting case is that it doesn’t write a null string to the console (I wonder why not?). The first call is to Write(string value) :

public virtual void Write(string value)
{
    if (value != null)
    {
        Write(value.ToCharArray());
    }
}

Which itself calls Write(char[] buffer)

// Writes a character array to the text stream. This default method calls
// Write(char) for each of the characters in the character array.
// If the character array is null, nothing is written.
//
public virtual void Write(char[] buffer)
{
    if (buffer != null)
    {
        Write(buffer, 0, buffer.Length);
    }
}

Which itself calls the version of Write(char[] buffer, int index, int count):

// Writes a range of a character array to the text stream. This method will
// write count characters of data into this TextWriter from the
// buffer character array starting at position index.
//
public virtual void Write(char[] buffer, int index, int count)
{
    if (buffer == null)
    {
        throw new ArgumentNullException(nameof(buffer), SR.ArgumentNull_Buffer);
    }
    if (index < 0)
    {
       throw new ArgumentOutOfRangeException(nameof(index), SR.ArgumentOutOfRange_NeedNonNegNum);
    }
    if (count < 0)
    {
        throw new ArgumentOutOfRangeException(nameof(count), SR.ArgumentOutOfRange_NeedNonNegNum);
    }
    if (buffer.Length - index < count)
    {
        throw new ArgumentException(SR.Argument_InvalidOffLen);
    }
    for (int i = 0; i < count; i++) Write(buffer[index + i]);
}

Now that we’ve covered the innards of what is happening, let’s step back to where this all began, ConsolePal.Unix.cs.


public override void Write(byte[] buffer, int offset, int count)
{
    ValidateWrite(buffer, offset, count);
    ConsolePal.Write(_handle, buffer, offset, count);
}

This calls the base ConsoleStream ValidateWrite method, which does bounds checking on the inputs:

protected void ValidateWrite(byte[] buffer, int offset, int count)
{
    if (buffer == null)
        throw new ArgumentNullException(nameof(buffer));
    if (offset < 0 || count < 0)
        throw new ArgumentOutOfRangeException(offset < 0 ? nameof(offset) : nameof(count), SR.ArgumentOutOfRange_NeedNonNegNum);
    if (buffer.Length - offset < count)
        throw new ArgumentException(SR.Argument_InvalidOffLen);
    if (!_canWrite) throw Error.GetWriteNotSupported();
}

And this calls the Unix Specific ConsolePal.Write method, which should send us back down the Unix rabbithole.

ConsolePal.Write(_handle, buffer, offset, count);
/// <summary>Writes data from the buffer into the file descriptor.</summary>
/// <param name="fd">The file descriptor.</param>
/// <param name="buffer">The buffer from which to write data.</param>
/// <param name="offset">The offset at which the data to write starts in the buffer.</param>
/// <param name="count">The number of bytes to write.</param>
private static unsafe void Write(SafeFileHandle fd, byte[] buffer, int offset, int count)
{
    fixed (byte* bufPtr = buffer)
    {
        Write(fd, bufPtr + offset, count);
    }
}

Much like before, this safe version uses a fixed statement to pin the buffer in memory (so the garbage collector can't move it while we hold a raw pointer), and then it calls the internal version with a byte* to the buffer instead of a byte[]:

private static unsafe void Write(SafeFileHandle fd, byte* bufPtr, int count)
{
    while (count > 0)
    {
        int bytesWritten = Interop.Sys.Write(fd, bufPtr, count);
        if (bytesWritten < 0)
        {
            Interop.ErrorInfo errorInfo = Interop.Sys.GetLastErrorInfo();
            if (errorInfo.Error == Interop.Error.EPIPE)
            {
                // Broken pipe... likely due to being redirected to a program
                // that ended, so simply pretend we were successful.
                return;
            }
            else if (errorInfo.Error == Interop.Error.EAGAIN) // aka EWOULDBLOCK
            {
                // May happen if the file handle is configured as non-blocking.
                // In that case, we need to wait to be able to write and then
                // try again. We poll, but don't actually care about the result,
                // only the blocking behavior, and thus ignore any poll errors
                // and loop around to do another write (which may correctly fail
                // if something else has gone wrong).
                Interop.Sys.Poll(fd, Interop.Sys.PollEvents.POLLOUT, Timeout.Infinite, out Interop.Sys.PollEvents triggered);
                continue;
            }
            else
            {
                // Something else... fail.
                throw Interop.GetExceptionForIoErrno(errorInfo);
            }
        }

        count -= bytesWritten;
        bufPtr += bytesWritten;
    }
}

This code then calls the native Write call, checks for system specific behavior around polling, and writes again.
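As a quick aside on the fixed statement above: pinning is something you can experiment with in any unsafe context. A minimal sketch, assuming <AllowUnsafeBlocks> is enabled in the project and a non-empty buffer:

public static class PinningDemo
{
    public static unsafe int FirstByte(byte[] buffer)
    {
        // 'fixed' pins the array so the GC can't relocate it while we hold a raw pointer.
        fixed (byte* ptr = buffer)
        {
            return *ptr;
        }
    }
}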

So finally (!) we’ve gotten to the point where .NET Core Framework code calls the native sys_call for Write; and this is the point where it all happens. All of that setup for this.

Let’s recap where we are:
1. User Calls Console.WriteLine(string val);
2. Console.WriteLine calls OpenStandardOutput()
3. OpenStandardOutput() is compiled with different drop-ins for Windows and Linux; in this case ConsolePal.Unix.cs is compiled and used.
4. ConsolePal.Unix.cs ensures the correct native syscall is made to duplicate a file descriptor for FILENO.STDOUT. The syscall is made in a loop so it can be retried if it is interrupted (EINTR).
5. Once the native syscall executes, the stream is opened using the same conditional compilation we saw earlier, with UnixConsoleStream being created.
6. The UnixConsoleStream is created with the correct filehandle and the access being requested (read or write); and the instantiation checks to ensure the file can be accessed in the manner requested and is available (does it exist?). If so, it writes to the buffer of the file descriptor.
7. The TextWriter is created appropriately; and its Write method will be called.
8. It calls the appropriate stream’s Write method.
9. The UnixConsoleStream calls its internal ValidateWrite method, and then its internal unsafe Write method, which calls another version of Write that takes in a pointer to the bytes being passed in.
10. It's at this point where we call the native Write method and actually write out to the system console.

Now we have to find the syscall in the .NET Core Framework code. Since the calls follow a naming convention of SystemNative_<call>, I’ll look for SystemNative_Write, and sure enough, I find it.

[DllImport(Libraries.SystemNative, EntryPoint = "SystemNative_Write", SetLastError = true)]
internal static extern unsafe int Write(int fd, byte* buffer, int bufferSize);

This calls into a C file called pal_io.c, and in particular its SystemNative_Write function:

int32_t SystemNative_Write(intptr_t fd, const void* buffer, int32_t bufferSize)
{
    assert(buffer != NULL || bufferSize == 0);
    assert(bufferSize >= 0);

    if (bufferSize < 0)
    {
        errno = ERANGE;
        return -1;
    }

    ssize_t count;
    while ((count = write(ToFileDescriptor(fd), buffer, (uint32_t)bufferSize)) < 0 && errno == EINTR);

    assert(count >= -1 && count <= bufferSize);
    return (int32_t)count;
}

This block is interesting in that it does some checking, and then calls the syscall write (note the lowercase) in a while loop. Which write gets called depends on which C library is in use; I'm referencing glibc since I know it's pretty common.
Because I've been looking at glibc, I know it prepends its syscalls with two underscores, so I use that to find the right write function:

/* Write NBYTES of BUF to FD.  Return the number written, or -1.  */
ssize_t
__libc_write (int fd, const void *buf, size_t nbytes)
{
  if (nbytes == 0)
    return 0;
  if (fd < 0)
    {
      __set_errno (EBADF);
      return -1;
    }
  if (buf == NULL)
    {
      __set_errno (EINVAL);
      return -1;
    }

  __set_errno (ENOSYS);
  return -1;
}
libc_hidden_def (__libc_write)
stub_warning (write)

weak_alias (__libc_write, __write)
libc_hidden_weak (__write)
weak_alias (__libc_write, write)
libc_hidden_weak (write)

And it’s at this point where I’ve gone about as far as I can without knowing more magic behind the way glibcis set up. I would have expected to see the buffer being written here; but all I see are processor definitions after a serious of ifstatements, none of which look like they do anything with the buffer.

I’ve enjoyed this dive into how a string is written to the console. I’d love to hear from you if you have picked up a thread I’m missing here; particularly around the syscall write and how that magic works.

Fixing the “depends_on” crisis in .NET Core by implementing the Circuit Breaker Pattern for Docker-Compose and Postgres

With Docker Compose version 2, a person using docker-compose could combine depends_on (with condition: service_healthy) and a healthcheck to ensure a container's dependencies were ready before it started. This was incredibly useful for waiting for a database to be ready to accept connections before attempting to connect. It looked like the following:

version: '2.3'
services:
  stats-processor:
    build: ./
    depends_on:
      db:
        condition: service_healthy
  db:
    image: postgres:10.3-alpine
    restart: always
    environment:
      POSTGRES_USER: user
      POSTGRES_PASSWORD: password
    ports: 
      - "5432:5432"
    healthcheck:
      test: ["CMD-SHELL", "pg_isready -U postgres"]
      interval: 10s
      timeout: 5s
      retries: 5

In Docker Compose version 3, the depends_on condition behavior was removed, the reasoning being that the application, or the orchestration software, should implement this behavior, not docker-compose. I respectfully disagree[1]; but that's neither here nor there.

To “fix” this issue for cases where an individual is using .NET Core to connect to Postgres, I’ve come up with the following, based on the Circuit Breaker Pattern. The pattern is described as:


The basic idea behind the circuit breaker is very simple. You wrap a protected function call in a circuit breaker object, which monitors for failures. Once the failures reach a certain threshold, the circuit breaker trips, and all further calls to the circuit breaker return with an error, without the protected call being made at all. 

Martin Fowler, “CIRCUITBREAKER”
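To make Fowler's description concrete before we get to the Postgres-specific code, here's a minimal, hypothetical sketch of the general shape (no half-open state, no timers, purely illustrative and entirely my own names):

using System;

public class CircuitBreaker
{
    private readonly int _threshold;
    private int _failures;

    public CircuitBreaker(int threshold) => _threshold = threshold;

    public T Execute<T>(Func<T> protectedCall)
    {
        // Once we've tripped, stop calling the protected function at all.
        if (_failures >= _threshold)
            throw new InvalidOperationException("Circuit is open; not attempting the call.");

        try
        {
            T result = protectedCall();
            _failures = 0; // a success resets the failure count
            return result;
        }
        catch
        {
            _failures++;   // failures accumulate until the circuit trips
            throw;
        }
    }
}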

It’s a very useful pattern and it is precisely the ‘right’ answer for the problem we’re facing: How do we ensure Postgres is ready to accept connections before we connect to it? Here’s the way I chose:

private static bool retryConnect(int tryTimes, string connectionString)
        {
            int times = tryTimes;
            using (NpgsqlConnection conn = new NpgsqlConnection(connectionString))
            {
                while (times > 0 && conn.FullState != ConnectionState.Open)
                {
                    try
                    {
                        if (conn.FullState == ConnectionState.Connecting) { Console.WriteLine("Connecting...");  Thread.Sleep(5000); break; }
                        if (conn.FullState != ConnectionState.Open) { Console.WriteLine("Opening Connection..."); conn.Open(); Thread.Sleep(5000); }
                        if (conn.FullState == ConnectionState.Open)
                        {
                            Console.WriteLine("We have connected!");
                        }
                    }
                    catch (SocketException ex)
                    {
                        Console.WriteLine("SocketException Exception: {0} ", ex);
                        Thread.Sleep(5000);
                        times--;
                    }
                    catch (NpgsqlException nex)
                    {
                        Console.WriteLine("NpgsqlException Exception: {0} ", nex);
                        Thread.Sleep(5000);
                        times--;
                    }
                }
                if (conn.FullState==ConnectionState.Open)
                {
                    Console.WriteLine("Connected!");
                    conn.Close();
                    return true;
                }
                return false;
            }
        }

The NpgsqlConnection class maintains a state machine of the status of the connection in its FullState property, which uses the ConnectionState enumeration to declare which state it's in; we use this as the internal property to determine whether we need to keep trying or not.

private static bool retryConnect(int tryTimes, string connectionString)

The method is static because it's used directly from the Main method of a .NET Core console application.

int times = tryTimes;
using (NpgsqlConnection conn = new NpgsqlConnection(connectionString))

Assigning the tryTimes variable to times (how many times should we try to connect) isn't required, since in C# the int is passed in by value (meaning external invocations wouldn't be affected by mutating the variable inside this method); but I do it because I'm not really concerned about it here.
I am using the using block to ensure the connection is cleaned up when I'm done with it. I don't really know the internals of the class (and if it's a good abstraction, I shouldn't have to), so I'll put it in a using block.

The next part is a `while` loop that specifies how long we should try:

while (times > 0 && conn.FullState != ConnectionState.Open)

Keep trying until we run out of tries (3, in my example) or the connection has been opened. If it's opened before we've used our third try, we stop; if we've tried three times and it's still not open, we also stop. I used a `while` loop because the logic made sense when reading the code: "While we haven't connected and haven't run out of tries, keep trying to connect."

The next three blocks handle the possible values of the FullState property.

if (conn.FullState == ConnectionState.Connecting) { 
    Thread.Sleep(5000); break; 
}
if (conn.FullState != ConnectionState.Open) { 
  conn.Open(); Thread.Sleep(5000); 
}
if (conn.FullState == ConnectionState.Open)
{
  break;
}

If we’re currently trying to connect, give it 5 seconds to connect (This amount is variable depending on how much you have going on in your database; you can raise or lower the limit to your taste and particulars). If we didn’t put a sleep in here, we could effectively run out of chances to give it a chance to try to connect before we wanted to.
If the connection isn’t open, try to open it (We’re betting that trying to open a connection that’s already trying to be opened results in a no-op. This is an assumption and could be wrong). And of course, Sleep for 5 seconds. It’s not entirely clear if we need to sleep here, as the Open() function is synchronous.
Finally, if the connection is indeed open, break out of the loop and let’s allow it to be evalulated again. This line isn’t needed; but was included in the original code to give us a chance to Console.WriteLine()and debug through the console.

The next two blocks handle errors that could happen while trying to connect:

catch (SocketException ex)
{
    Console.WriteLine("SocketException Exception: {0} ", ex);
    Thread.Sleep(5000);
    times--;
}
catch (NpgsqlException nex)
{
    Console.WriteLine("NpgsqlException Exception: {0} ", nex);
    Thread.Sleep(5000);
    times--;
}

If you’re trying to use NpgsqlConnection you’re ultimately using an abstraction over opening a Socket, which means that not only do you have to worry about things going wrong with Npgsql, you also have to worry about underlying network exceptions. In my case, when the database wasn’t ready, it would issue a Connection Refused, and perhaps paradoxically this does not raise an NpgsqlException, it raises a SocketException.
In our case, if we receive an exception (and we're expecting one the first time, at least), reduce the number of tries remaining and do nothing for 5 seconds (to hopefully give the database time to become available). This is also one of those settings you'd tweak for your environment; in some instances I've seen databases take a minute to become available when started from docker (typically due to the number of operations at startup, or whether the database's volumes were removed before starting it up[2]).

Finally, we handle the state we really care about; is this connection open?

if (conn.FullState== ConnectionState.Open)
{
    Console.WriteLine("Connected!");
    conn.Close();
    return true;
}
return false;

If the connection is open, our work is done: close the connection and return true (the retryConnect method returns bool). Otherwise return false, as we've exhausted our number of tries and could not connect. The connection would be closed anyway when the NpgsqlConnection class is disposed, but we're going to be good citizens and be explicit about it.

So that’s the code, explained in full. It’s very rough (as I created it about an hour ago); but it works for my use-case. I wouldn’t recommend it being blindly copied into a production codebase without a bit of testing. Using the code is pretty easy, and demonstrated below in the while loop. This while loop exists to ensure we’re going to wait for the database before trying to do anything. Incidentally (it’s turtles all the way down), this code will wait until the database is ready (as as an application it can’t do anything if the database isn’t up); and therefore the application will sit there until the database comes up. This works for this particular application; but your scenario (like a web application), may need a more robust answer.

 bool succeeded = false;
 while (!succeeded)
 {
     succeeded = retryConnect(3, connectionString);
 }
 //do something that requires database to be available here

Overall I wish this were an option in Docker Compose version 3 yaml files; but as it is not we’re forced to solve it ourselves. Please sound off with any corrections you’d make to this code.

[1]: Not everyone's use-case is web-scale orchestration. Some people have simpler desires, and depends_on fulfills those simpler desires, even if it doesn't cover the entire gamut of ways containers could fail. Put simply, removing it is akin to removing airbags from cars because people could get hurt by airbags, instead of realizing airbags have their own uses, and even if they don't cover everything, they're still useful. Maybe an airbag switch would be a better feature than removing the airbags.

[2]: Yes, I know you shouldn't really run a database in Docker; however, for development it's frightfully useful and keeps you from having those niggling issues where each developer has to have the right settings and the right installed software on their host in order to be productive (not to mention, when any system setting changes, every developer has to make that change on their system). By putting it in Docker, you reduce the number of pieces you have to have installed to be productive to just Docker, and that's a net positive. If it's not good for your use case, don't do it.

Software Teardowns: Console.WriteLine (Part 1: Windows)

I love teardowns, and I spend a lot of time looking at the articles that tear down different consumer products to show how they’re made. Software teardowns are harder though. It’s usually difficult to figure out a discrete piece to look at and tear it down without a whole lot of problems. Or at least, that’s what I’ve always thought.

Today, I’m going to take a look at the Console.WriteLine method in .NET Core; and attempt to ‘tear it down’. I’ll take it as far as I can, and we’ll see where I get. I haven’t ‘pre-planned’ this (though I have trawled through the .NET Core source code before on an unrelated issue, and am reasonably sure this won’t suck). I’ll attempt to stick to just the act of writing a character to the console (and printing a new line); but I may see other things I want to talk about along the way. Ado aside, here we go.

The source code for the .NET Core framework (called corefx by whoever names these things) can be found under the .NET Foundation's github page, here. The .NET Core Framework (I'll just call it corefx or The Framework from here on out) is meant to be a managed library of pre-built classes and features you can use as a programmer. You'll hear "Standard Library" thrown about in various languages; corefx is .NET Core's Standard Library. You could create your own if you wanted to (the Mono team did); but generally, unless there are licensing concerns, don't re-invent the wheel. A note about licensing: corefx is MIT licensed, so there shouldn't be any licensing issues. (You can always find licensing information in each source code file, as well as a 'repo-level' LICENSE.MD, LICENSE, or LICENSE.TXT file or some such. CoreFx's license is here.)

For this post I'll be referring to this revision of the framework. It's currently 'master' as well; but master is nebulous as to whatever's checked in, and I want to be able to point to specific pieces that may change in the future.

The first thing of note is that The Framework makes use of several third-party libraries and source code(s) (todo: figure out the plural of source code: “source codes”, “sources code” or “source code”), and they spell out a notice here as to what they use and where it’s from. Several interesting tidbits, including “Bit Twiddling Hacks”, “decode_fuzzer.c” and Brotli. I have no idea what Brotli is; but I’m curious to know.

That aside, the central question I have is:

Given this simple program; what is the resulting look of the code that prints characters to the console that gets used on Windows? On Debian-Based Linux Machines? (I’m not as interested in the compiled IL right now).

using System;

public class Program {
    public static void Main(string[] args) {
      Console.WriteLine("Hello World!");
    }
}

Because I am importing System, I would expect the Console class to be right underneath System; and lo and behold it is.

The source code structure surrounding it is interesting.

If I click on src, underneath corefx, I see a whole bunch of namespaces that I typically use in applications:

well these look familiar…

The one at the top is interesting; it's called Common. Even Microsoft can't get away from a dumping-ground folder, eh?

It looks like at least part of the Common is a sync-ing point between projects, at least it says that on the tin.

Upon further inspection, due to the myriad of internal classes (I've spot-checked, and all are marked with the internal access modifier), I'd classify this code as necessary to build and develop the framework itself, but not intended for consumers of the framework.

That detour finished; I click on System.Console and open it up to inspect what’s going on. There shouldn’t be that much here, right?

The first thing I notice is that System.Console has its own .sln file; so presumably if we ever need to make changes to System.Console, you don’t have to load the entire .NET Framework source. This is probably a recurring theme; but interesting to note.

This ends up being a common theme. .sln, .props, ref, src, and tests… I wonder if it’s code-gen’d?

The second thing is there is a ref folder, and it contains a partial Console class with all of the property and method signatures needed for this class, but no implementation. I'm not really sure why it's here; my guess (way out of my element, mind you) is that it's there so someone could check API compatibility without needing the whole framework source. As I said, I have no idea, and I'll have to Google more to figure out why it exists. Interestingly enough, there's a short-link at the top of this file that points to the API design guidelines for third parties to request their library/class be made part of the System.* namespace. They even include an example of a recent process to add classes to the namespace. It is worth a read.

In the ref folder, besides the partial definition of the System.Console class; there are also Configuration.props files and a System.Console.csproj file.

I really have no idea what “ref” is for but it looks important.

It still uses the non-SDK tooling (which isn’t surprising — but it makes you wonder — if Microsoft can’t use their own SDK tooling for the .NET Core Framework Library; what’s the hope for the rest of us?). (Editor’s note, I wrote this at the beginning of 2018; 1 year later, it still uses the non-SDK tooling).

I assume the .props file is to give some hints as to how to build this; there are two interesting build properties that I haven’t seen elsewhere (yet):

    <IsNETCoreApp>true</IsNETCoreApp>
    <IsUAP>true</IsUAP>

IsNETCoreApp is used here, and it (unsurprisingly) is used during building the Framework, and is used to reduce the build to only projects that are a part of .NET Core.

IsUAP is an acronym I hadn’t heard before; and a quick Googling suggests it’s the same as UWP (Universal Windows Program). Not sure if this is a marketing change, or if UAP actually means something different than UWP. I thought UWP was The Right Acronym For This Thing (TRAFTT); and it appears that UAP is an earlier naming. Considering the build tooling isn’t using the latest; they probably haven’t gotten around to the name change yet (and who would want to?)

Now that Github has stopped rate limiting my searches, I can go back to explaining what is going on here. I should probably switch over to grep; but I’m trying to keep this lightweight. You know, besides all the commentary.

IsUAP is slightly more interesting; as it also has runtime considerations. It’s used during building to determine where to put the bin folder but it’s also used by test code to ‘do things differently’ depending on if it’s UAP or not. It looks like UAP uses a WinRTWebSocket, and non-UAP uses WinHttpWebSockets. I’m literally monkey-typing what I see here, and there are lots more questions than answers, but there it is. There are several places where UAP differs from non-UAP; I’m just highlighting one.

The tests for System.Console are interesting. Anything that tests the Console is by definition an integrated test; so I’m cracking these to see how they delineate Unit Tests from Integration Tests.

Interestingly (I know, I know, I use that word too much), it appears that in order to test the System.Console class, the framework authors wrote a helper class that intercepts what is sent to the console and inspects the underlying MemoryStream, instead of actually sending it to the console.

It also appears there’s a ManualTests class to actually have a user test the console.

I have lots of questions that I’d probably get answered by running the tests themselves (but that’s not why I’m here today).

Now to the meat of what I wanted to talk about; the Console class itself.

It appears the Console class has its own file, and then there are platform-specific ConsolePal classes that act as the implementations for particular platforms (Windows, Unix). I'm guessing "PAL" stands for Platform Abstraction Layer; but again, that's a guess. A quick Google search reveals even more goodies, a Github project I never knew existed!

The System.Console.csproj file contains build switches to determine which ConsolePal class to bring in. The ConsolePal.<platform>.cs class is an internal class; and purely a drop-in replacement for implementations on different platforms.

The class sends hints to the compiler not to inline the methods; this appears to be done for ease-of-debugging reasons:

        // Give a hint to the code generator to not inline the common console methods. The console methods are 
        // not performance critical. It is unnecessary code bloat to have them inlined.
        //
        // Moreover, simple repros for codegen bugs are often console-based. It is tedious to manually filter out 
        // the inlined console writelines from them.
        //

The particular overload of WriteLine I’m concerned with is this one:

public static void WriteLine(String value) 
{
  Out.WriteLine(value);
}

Out in this context refers to a public static property of type TextWriter.

public static TextWriter Out => EnsureInitialized(ref s_out, () => CreateOutputWriter(OpenStandardOutput()));

s_out refers to a private TextWriter member, and that is passed to EnsureInitialized along with a Func<T>; EnsureInitialized has the following definition:

internal static T EnsureInitialized<T>(ref T field, Func<T> initializer) where T : class =>
            LazyInitializer.EnsureInitialized(ref field, ref InternalSyncObject, initializer);

Since it's turtles all the way down, EnsureInitialized in turn calls the LazyInitializer class (which, as the name suggests, follows the Lazy Initialization (aka Lazy Loading) pattern).

This LazyInitializer class is written to be thread safe, and overall, if you're interested in "How would I lazily initialize members in a high-performance situation?", the code for this class is worth a look. Upon further inspection (my eyes glazed over the public modifier), it looks like you can use this in your own code (so that's cool, and I'll have to try it).
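For example, here's a quick sketch of using it in your own code (ExpensiveReport and ReportCache are made-up names for illustration):

using System;
using System.Threading;

public class ExpensiveReport { /* imagine something costly to build */ }

public class ReportCache
{
    private ExpensiveReport _report;

    // The factory only runs once, even if multiple threads race to read Report.
    public ExpensiveReport Report =>
        LazyInitializer.EnsureInitialized(ref _report, () => new ExpensiveReport());
}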

The next part of the EnsureInitialized method call for Out is the CreateOutputWriter(OpenStandardOutput()) method. Briefly, it does the following:

public static Stream OpenStandardOutput()
{
    return ConsolePal.OpenStandardOutput();
}

OpenStandardOutput() is an abstraction over the system specific method of opening Standard Output,

And CreateOutputWriter does the following:

private static TextWriter CreateOutputWriter(Stream outputStream)
{
    return SyncTextWriter.GetSynchronizedTextWriter(outputStream == Stream.Null ?
        StreamWriter.Null :
        new StreamWriter(
            stream: outputStream,
            encoding: new ConsoleEncoding(OutputEncoding), // This ensures no prefix is written to the stream.
            bufferSize: DefaultConsoleBufferSize,
            leaveOpen: true) { AutoFlush = true });
}

It’s important to note that the outputStream that is a parameter is the previous ConsolePal.OpenStandardOutput() method; so we’ll examine that further when we break off into the Windows vs. Unix implementations.
At this point, I’m going to go down two paths; what Console.WriteLine does on Windows in this blog post, and Unix in another blog post.

Windows

The first step as we saw above was to OpenStandardOutput(), and for Windows that method can be found here:

public static Stream OpenStandardOutput()
{
    return GetStandardFile(Interop.Kernel32.HandleTypes.STD_OUTPUT_HANDLE, FileAccess.Write);
}

It returns a Stream; and makes a call to GetStandardFile with two parameters.

Interop.Kernel32.HandleTypes.STD_OUTPUT_HANDLE

From my C days (I’m still a junior C developer, mind you), I can guess that STD_OUTPUT_HANDLE is a constant, and navigating to Interop.Kernel32.HandleTypes confirms it:

internal partial class Interop
{
    internal partial class Kernel32
    {
        internal partial class HandleTypes
        {
            internal const int STD_INPUT_HANDLE = -10;
            internal const int STD_OUTPUT_HANDLE = -11;
            internal const int STD_ERROR_HANDLE = -12;
        }
    }
}

The internal tells me this is for the framework itself (not us), and it being a partial class tells me there will be another section of this class elsewhere. I can't seem to find it (again, I'm basing this off of source code available via Github, not having it built locally), so I have unanswered questions as to what this is for and why it is partial.

STD_OUTPUT_HANDLE is a constant that is set to -11, so that value has meaning for GetStandardFile (likely Win32 has hardcoded -11 as the handle identifier for standard output).

FileAccess.Write is an Enum Flag that is a clean way to tell the underlying Win32 libraries that we want to ‘write’ to the Console Output “stream”.

As an archeological note, it appears that the System.IO.FileSystem.Primitives project that FileAccess sits in uses an older version of .NET Core, as evidenced by its use of project.json.

The method declaration for GetStandardFile is as follows:

private static Stream GetStandardFile(int handleType, FileAccess access)
{
    IntPtr handle = Interop.Kernel32.GetStdHandle(handleType);

    // If someone launches a managed process via CreateProcess, stdout,
    // stderr, & stdin could independently be set to INVALID_HANDLE_VALUE.
    // Additionally they might use 0 as an invalid handle.  We also need to
    // ensure that if the handle is meant to be writable it actually is.
    if (handle == IntPtr.Zero || handle == s_InvalidHandleValue ||
        (access != FileAccess.Read && !ConsoleHandleIsWritable(handle)))
    {
        return Stream.Null;
    }

    return new WindowsConsoleStream(handle, access, GetUseFileAPIs(handleType));
}

Let’s break this down.

IntPtr is a pointer-sized integer struct; here it holds the handle we'll be writing to.

Whether it’s a 32-bit or 64-bit size is dependent on a compiler flag:

#if BIT64
using nint = System.Int64;
#else
using nint = System.Int32;
#endif

I have multiple questions here (having never seen the internals of this struct before), and I have to say most of this code is deep in the weeds. To see which overload of the struct's constructor gets used, I'll have to see what Interop.Kernel32.GetStdHandle(int) does:

using System;
using System.Runtime.InteropServices;

internal partial class Interop
{
    internal partial class Kernel32
    {
        [DllImport(Libraries.Kernel32, SetLastError = true)]
        internal static extern IntPtr GetStdHandle(int nStdHandle);  // param is NOT a handle, but it returns one!
    }
}

Ok. At this point we are making a call into the Win32 libraries; specifically, the kernel32 library. I'm going to check whether I can dig into the internals of the Libraries.Kernel32 class, but I likely won't be able to (darn you, closed source). Yeah, a quick Google search indicates this is the end of the line for determining how it gets a handle. On the Unix side we'll be able to dig deeper than this, but for Windows we'll have to stop here on getting the handle. If you ever want to know about the internals of Windows, make sure you read The Old New Thing, Raymond Chen's blog; it is (with apologies to a certain comedian) quite amazing.
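Even though we can't see inside kernel32.dll, we can make the same P/Invoke call ourselves. Here's a minimal sketch of my own (declaring the import against kernel32.dll directly, rather than through the framework's Interop class):

using System;
using System.Runtime.InteropServices;

internal static class StdHandleProbe
{
    private const int STD_OUTPUT_HANDLE = -11; // same constant we saw above

    [DllImport("kernel32.dll", SetLastError = true)]
    private static extern IntPtr GetStdHandle(int nStdHandle);

    private static void Main()
    {
        // Returns an opaque handle value; IntPtr.Zero or -1 would indicate trouble.
        IntPtr handle = GetStdHandle(STD_OUTPUT_HANDLE);
        Console.WriteLine($"stdout handle: {handle}");
    }
}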

Once it gets the handle, it checks to make sure the handle is valid to write to:

// If someone launches a managed process via CreateProcess, stdout,
// stderr, & stdin could independently be set to INVALID_HANDLE_VALUE.
// Additionally they might use 0 as an invalid handle.  We also need to
// ensure that if the handle is meant to be writable it actually is.

The comment above is great; it doesn’t just restate what the code does; it tells why the code does what it does. If you’re going to write comments in code, be like this commenter.
And the code itself:

if (handle == IntPtr.Zero || handle == s_InvalidHandleValue ||
    (access != FileAccess.Read && !ConsoleHandleIsWritable(handle)))
{
    return Stream.Null;
}

IntPtr.Zero is likely a Windows-wide ‘null’ or sentinel value for a pointer.

s_InvalidHandleValue is by default:

private static IntPtr s_InvalidHandleValue = new IntPtr(-1);

and for the final branch of the guard to trigger, access must not be read-only (read/write would be OK) and ConsoleHandleIsWritable must return false.

This code gave me a headache for a minute until I realized this entire if statement is meant to be a guard clause: if any of these conditions hold, we can't write, and should therefore return Stream.Null. I can't easily find System.IO.Stream (though as I understand it, it is an abstract class); it looks like CoreFX has moved it into the System.Runtime.Extensions project, and the cleanest explanation I can find for Stream.Null is that child classes of Stream (some or all) redefine a Null member by hiding it with the new modifier:

public new static readonly StreamWriter Null = new StreamWriter(Stream.Null, UTF8NoBOM, MinBufferSize, true);
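Whatever its exact home in the source, Stream.Null (and the StreamWriter.Null built on top of it) behaves like a classic null object: writes are silently discarded. A quick sketch of my own to illustrate why returning it from GetStandardFile is safe:

using System;
using System.IO;

Stream sink = Stream.Null;
sink.Write(new byte[] { 0x41, 0x42 }, 0, 2); // no-op; the bytes go nowhere

TextWriter writer = TextWriter.Null;
writer.WriteLine("this also goes nowhere");  // no-op, and no exception either

Console.WriteLine("Writes to the Null stream/writer simply disappear.");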

Before we pass the guard clause, there's one remaining method in the if statement: ConsoleHandleIsWritable(IntPtr).

I’m pasting it below (with its comments) in its entirety; and the comments are A++ (would vote A+ again):

// Checks whether stdout or stderr are writable.  Do NOT pass
// stdin here! The console handles are set to values like 3, 7, 
// and 11 OR if you've been created via CreateProcess, possibly -1
// or 0.  -1 is definitely invalid, while 0 is probably invalid.
// Also note each handle can independently be invalid or good.
// For Windows apps, the console handles are set to values like 3, 7, 
// and 11 but are invalid handles - you may not write to them.  However,
// you can still spawn a Windows app via CreateProcess and read stdout
// and stderr. So, we always need to check each handle independently for validity
// by trying to write or read to it, unless it is -1.
private static unsafe bool ConsoleHandleIsWritable(IntPtr outErrHandle)
{
    // Windows apps may have non-null valid looking handle values for 
    // stdin, stdout and stderr, but they may not be readable or 
    // writable.  Verify this by calling WriteFile in the 
    // appropriate modes. This must handle console-less Windows apps.
    int bytesWritten;
    byte junkByte = 0x41;
    int r = Interop.Kernel32.WriteFile(outErrHandle, &junkByte, 0, out bytesWritten, IntPtr.Zero);
    return r != 0; // In Win32 apps w/ no console, bResult should be 0 for failure.
}

And here’s the signature of Interop.Kernel32.WriteFile():

 internal partial class Kernel32
{
    [DllImport(Libraries.Kernel32, SetLastError = true)]
    internal static extern unsafe int WriteFile(
        IntPtr handle,
        byte* bytes,
        int numBytesToWrite,
        out int numBytesWritten,
        IntPtr mustBeZero);
}

It looks like this checks whether the handle is actually writable by trying to write to it: from what I can ascertain (again, without seeing the Kernel32.WriteFile source), it attempts to write zero bytes to the supplied handle, and the return value should be non-zero on success (which is interesting, because for exit codes 0 generally means success).

And once we're past the guard clause, the interesting part for us is this line (here):

return new WindowsConsoleStream(handle, access, GetUseFileAPIs(handleType));


Before diving into that class, I first want to resolve what GetUseFileAPIs does, so I’ll find that, and it turns out to be a private method in this class:

private static bool GetUseFileAPIs(int handleType)
{
    switch (handleType)
    {
        case Interop.Kernel32.HandleTypes.STD_INPUT_HANDLE:
            return Console.InputEncoding.CodePage != Encoding.Unicode.CodePage || Console.IsInputRedirected;

        case Interop.Kernel32.HandleTypes.STD_OUTPUT_HANDLE:
            return Console.OutputEncoding.CodePage != Encoding.Unicode.CodePage || Console.IsOutputRedirected;

        case Interop.Kernel32.HandleTypes.STD_ERROR_HANDLE:
            return Console.OutputEncoding.CodePage != Encoding.Unicode.CodePage || Console.IsErrorRedirected;

        default:
            // This can never happen.
            Debug.Assert(false, "Unexpected handleType value (" + handleType + ")");
            return true;
    }
}

While I applaud their confidence in the default case, I'll keep the commentary to just the part of the switch we care about:

case Interop.Kernel32.HandleTypes.STD_OUTPUT_HANDLE:
    return Console.OutputEncoding.CodePage != Encoding.Unicode.CodePage || Console.IsOutputRedirected;

And now we're into OutputEncoding. I don't quite understand the 'why' behind this, but the method returns true (use the file APIs) when the output encoding is not Unicode, or when the output is being redirected; otherwise the console APIs are used. I still have questions, but from what I can gather the console API assumes Unicode, and redirected output typically means it's going to a file.

WindowsConsoleStream

I'm relatively sure at this point we're nearing the "we're about to print to the console" stage (it's in the name, right?). WindowsConsoleStream is a private class that provides the ability to stream to the Windows console.

The constructor for this class follows:

internal WindowsConsoleStream(IntPtr handle, FileAccess access, bool useFileAPIs) : base(access)
{
    Debug.Assert(handle != IntPtr.Zero && handle != s_InvalidHandleValue, "ConsoleStream expects a valid handle!");
    _handle = handle;
    _isPipe = Interop.Kernel32.GetFileType(handle) == Interop.Kernel32.FileTypes.FILE_TYPE_PIPE;
    _useFileAPIs = useFileAPIs;
}

The Debug.Assert is a nice touch (I'll go into it more below), and the only thing we haven't seen before is the GetFileType call on the handle, which determines whether we're piping the output.

A quick GitHub search turns up a set of constants; FILE_TYPE_PIPE is set to 0x0003:
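Presumably it looks something like this (FILE_TYPE_PIPE being the 0x0003 value; the neighboring values are the standard Win32 ones, written from memory rather than copied from the corefx source):

internal partial class Interop
{
    internal partial class Kernel32
    {
        internal partial class FileTypes
        {
            internal const int FILE_TYPE_DISK = 0x0001;
            internal const int FILE_TYPE_CHAR = 0x0002;
            internal const int FILE_TYPE_PIPE = 0x0003;
        }
    }
}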

And since WindowsConsoleStream is derived from ConsoleStream, here's the constructor for that class as well:

internal ConsoleStream(FileAccess access)
{
    Debug.Assert(access == FileAccess.Read || access == FileAccess.Write);
    _canRead = ((access & FileAccess.Read) == FileAccess.Read);
    _canWrite = ((access & FileAccess.Write) == FileAccess.Write);
}

Nothing new here; it's just setting internal variables for whether we can read from or write to the Console. Debug.Assert is rather interesting and bears its own blog post; but in short: Debug.Assert is a throwback to C programming, where invariants (conditions that should always hold; if they don't, something is fundamentally broken) were checked using a macro (because C), and that macro generally looked like ASSERT(ThingThatShouldNeverBeFalse) (the uppercase ASSERT was to let other programmers know it was a preprocessor macro). That practice carries on, except now Debug.Assert is a real method, with an implementation in the .NET Core Runtime.

There are several interesting parts here (and I refuse to let myself be sucked into them; but I’ll mention them nonetheless):

[System.Diagnostics.Conditional("DEBUG")]
public static void Assert(bool condition, string message, string detailMessage)
{
    if (!condition)
    {
        string stackTrace;

        try
        {
            stackTrace = Internal.Runtime.Augments.EnvironmentAugments.StackTrace;
        }
        catch
        {
            stackTrace = "";
        }

        WriteLine(FormatAssert(stackTrace, message, detailMessage));
        s_ShowAssertDialog(stackTrace, message, detailMessage);
    }
}

First, it appears (without checking into consumers of the ConditionalAttribute attribute) that this code will only be compiled into the Runtime on a "DEBUG" build (i.e., if the DEBUG symbol is defined for the build).

Second, it looks like (again, I'm guessing) it produces a stack trace up to the point where Debug.Assert was called, and then a platform-specific s_ShowAssertDialog is likely called to let the developer know something bombed.
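To see the ConditionalAttribute mechanism in isolation, here's a minimal sketch with my own hypothetical Guard class (not framework code). Calls to the method below, including the evaluation of their arguments, are removed by the compiler unless the DEBUG symbol is defined:

using System;
using System.Diagnostics;

internal static class Guard
{
    [Conditional("DEBUG")]
    public static void MustBeTrue(bool condition, string message)
    {
        if (!condition)
            Console.Error.WriteLine($"Assertion failed: {message}");
    }
}

internal static class Program
{
    private static void Main()
    {
        // In a Debug build this call runs; in a Release build the compiler
        // drops the call site entirely.
        Guard.MustBeTrue(1 + 1 == 2, "arithmetic still works");
    }
}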

Back to the fun. So it looks like once the WindowsConsoleStream class is instantiated, there are three things you can do; read, write, and flush.

We’ll focus on the Write part (though Flush is probably useful too).

public override void Write(byte[] buffer, int offset, int count)
{
    ValidateWrite(buffer, offset, count);

    int errCode = WriteFileNative(_handle, buffer, offset, count, _useFileAPIs);
    if (Interop.Errors.ERROR_SUCCESS != errCode)
        throw Win32Marshal.GetExceptionForWin32Error(errCode);
}

It calls ValidateWrite which ensures the values are correct:

protected void ValidateWrite(byte[] buffer, int offset, int count)
{
    if (buffer == null)
        throw new ArgumentNullException(nameof(buffer));
    if (offset < 0 || count < 0)
        throw new ArgumentOutOfRangeException(offset < 0 ? nameof(offset) : nameof(count), SR.ArgumentOutOfRange_NeedNonNegNum);
    if (buffer.Length - offset < count)
        throw new ArgumentException(SR.Argument_InvalidOffLen);

    if (!_canWrite) throw Error.GetWriteNotSupported();
}

and then it calls a private method called WriteFileNative that uses the _useFileAPIs boolean to determine which APIs need to be used (writing to the console screen or writing to a file).

WriteFileNative gets us to the guts of what happens. Let’s take this a section at a time.

private static unsafe int WriteFileNative(IntPtr hFile, byte[] bytes, int offset, int count, bool useFileAPIs)
{
    Debug.Assert(offset >= 0, "offset >= 0");
    Debug.Assert(count >= 0, "count >= 0");
    Debug.Assert(bytes != null, "bytes != null");
    Debug.Assert(bytes.Length >= offset + count, "bytes.Length >= offset + count");

First, it’s an unsafe method (likely because it’s touching things outside the framework), and the Debug.Asserts rear their heads again.

// You can't use the fixed statement on an array of length 0.
if (bytes.Length == 0)
    return Interop.Errors.ERROR_SUCCESS;

Return ERROR_SUCCESS. That’s good, right? I don’t know what the fixed statement is; but I guess we’ll get to that in a minute.

bool writeSuccess;
fixed (byte* p = &bytes[0])
{
    if (useFileAPIs)
    {
        int numBytesWritten;
        writeSuccess = (0 != Interop.Kernel32.WriteFile(hFile, p + offset, count, out numBytesWritten, IntPtr.Zero));
        // In some cases we have seen numBytesWritten returned that is twice count;
        // so we aren't asserting the value of it. See corefx #24508
    }

Ok, that was quick. A Google search tells me the C# keyword fixed is only used in unsafe code, to tell the runtime that the memory it pins must not be moved. The fixed keyword allows us to do pointer arithmetic in C# without the runtime meddling in our affairs.
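Here's a tiny sketch of my own (it needs to be compiled with unsafe code enabled) of what fixed buys you: pin a managed byte array so the GC can't move it, then walk it with plain pointer arithmetic, the same p + offset trick WriteFileNative uses:

using System;

internal static class FixedDemo
{
    private static unsafe void Main()
    {
        byte[] bytes = { 0x48, 0x49, 0x21 }; // "HI!"

        // fixed pins the array for the duration of the block.
        fixed (byte* p = &bytes[0])
        {
            for (int i = 0; i < bytes.Length; i++)
            {
                Console.Write((char)*(p + i)); // pointer arithmetic, no runtime interference
            }
        }
        Console.WriteLine();
    }
}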

This block of code also points out an interesting bug in the framework where the number of bytes written can come back as twice the count. Or maybe it's counting characters when it should be counting bytes? Who knows?

    else
    {

        // If the code page could be Unicode, we should use ReadConsole instead, e.g.
        // Note that WriteConsoleW has a max limit on num of chars to write (64K)
        // [https://docs.microsoft.com/en-us/windows/console/writeconsole]
        // However, we do not need to worry about that because the StreamWriter in Console has
        // a much shorter buffer size anyway.
        int charsWritten;
        writeSuccess = Interop.Kernel32.WriteConsole(hFile, p + offset, count / BytesPerWChar, out charsWritten, IntPtr.Zero);
        Debug.Assert(!writeSuccess || count / BytesPerWChar == charsWritten);
    }
}
if (writeSuccess)
    return Interop.Errors.ERROR_SUCCESS;

// For pipes that are closing or broken, just stop.
// (E.g. ERROR_NO_DATA ("pipe is being closed") is returned when we write to a console that is closing;
// ERROR_BROKEN_PIPE ("pipe was closed") is returned when stdin was closed, which is not an error, but EOF.)
int errorCode = Marshal.GetLastWin32Error();
if (errorCode == Interop.Errors.ERROR_NO_DATA || errorCode == Interop.Errors.ERROR_BROKEN_PIPE)
    return Interop.Errors.ERROR_SUCCESS;
return errorCode;
}

If we aren't using the file APIs, we call Kernel32's WriteConsole function with the appropriate handle, the pointer offset into the buffer (p + offset), how many characters we're writing (count / BytesPerWChar), an out variable that receives how many characters were written, and IntPtr.Zero. The final piece treats a closing or broken pipe as success: if the pipe is going away, we just stop trying to write to the console instead of raising an error.

Overall, this was a very interesting dive into the .NET Core framework. I’m a bit bummed that I couldn’t see more of what Windows does to actually print a character to the screen; but I feel like I’ll be able to do that in the next blog post, when we dive into how the .NET Core Framework prints to a Unix console.

Metaphors, Agile, and Us

I haven’t found a metaphor for programming that adequately describes its effect on business, maintenance, products, and developer sanity.

Manufacturing is probably the most used metaphor, and that tends to work for about as long as it takes to scream the words “mythical man month”, and with smarter people, that’s the end of that metaphor.

Then there’s gardening, or house building, or art, or carpentry, or some other activity that is superficially like programming. I use golf as a metaphor.  It’s an activity that requires following an exact procedure in a constantly new situation, it is largely mental, and there’s a physical limit to how long you can do it. It’s expensive, requires lots of practice, and to someone who has never done it before, it looks easy. It is also a personal activity, and there’s little use comparing two programmers, even under the exact same conditions.  People who aren’t doing it wonder what the trouble is, and people who do it can’t begin to explain how a leaf in the wrong spot can change everything.

Difficulty

Watching golf on TV is boring if you've never golfed, much like watching someone program. One aspect of golf that doesn't translate to TV is just how hard they hit those golf balls. The ball itself makes sounds as it passes people. I'm not kidding. I can't explain just how amazing watching a good golfer play is, and since you're probably not likely to see it in person, you'll just have to take my word for it.

One area where golf really shines is that when someone is a good golfer, there are statistics to back them up.  Greens hit in regulation, scores, fairways hit, up and downs, all of it. We really don't have the same sort of stats for programmers, do we? Even as I think about it, no statistics come to mind that would be universally applicable.  Bugs per 1kloc? Mean time to feature completion? Revenue per 1kloc? Maintenance score? Story point average? Even if we could normalize these numbers across the industry, there's no authoritative source to say what's valuable.

By and large golf is a solo activity. Others can (and do) help you decide on your strategy, but it’s up to you to execute it. Following someone else’s strategy blindly often results in a bogey, or worse. The effects of cargo-culting play out in real time as your ball goes sailing into the deep stuff. Bad decisions in golf can be cumulative, but after 4 and a half hours, you’ve made all the bad decisions.

All metaphors fail, and I suppose this is where the golf-as-programming metaphor fails. In programming, we should all be so lucky to see the effects of our bad decisions after 4.5 hours. Often the outcomes are small annoyances that add up, or a bad product direction that takes years to correct. There is no fast feedback in programming for bad technical decisions, and for product decisions even less so. Everyone knows this, and everyone reacts to this reality differently. Project managers try risk registers, CTOs try micromanagement, and architects enforce coding standards that have nothing to do with the problem at hand. Each party tries to control what it has dominion over. At least in golf, once the ball is in the air there is nothing you can do to retain control over your bad decision.

Course Management

In golf, there's a phrase used to manage this phenomenon: "course management". At its heart, the strategy is to play to your strengths and to the numbers. If you're playing a 450-yard par 5, you may as well hit three 7-irons and make it onto the green in three. The 7-iron is one of the easier irons to hit, and your rational brain knows you aren't making it in two anyway. It's a lot safer to hit 150-150-150 than it is to try for 225-180-45. Most golfers can't reliably hit woods and long irons (golf clubs go from low numbers, which hit further, to high numbers, which don't hit as far), and the flight of the ball is harder to control. Another side effect of using the 7-iron is that if you need to readjust for your second and third shots, it's easier to do so.

This is a well known strategy, carries little risk, and usually isn’t followed. Reasons vary, but distilled it comes back to ego, pride, and a foolish sense that this time it will be different. I know the strategy and even when nothing is at stake I have a hard time following it. I compare my skills to those next to me and so I try to play the same way they do, even though I’ve broken 100 a grand total of 1 time (I think. That’s 28 over par, in case you were wondering, or averaging 1.5 strokes over par per hole).  I am over-confident in my abilities, all evidence to the contrary. I see myself through their eyes; and want to impress them with my ability, and so I pull the driver out of the bag and use it, even though my hit rate with the Driver is 15% (I’m shockingly bad with a Driver, to the point that I now typically “drive” with a 3-hybrid). Going for the big hit doesn’t work if you’ve not successfully executed big hits in the past on a regular basis.  

Golf has the privilege of being a self-contained activity that's still at the mercy of external forces. It doesn't matter how well you execute your shot if there's a 40mph cross-wind, and yet the best you can do is adjust to changing factors and pray.  It's like business in that regard. Technical decisions aren't made for technical reasons alone. Choosing microservices may be due to a business requirement that teams operate in parallel; or it may be because a monolith wasn't "nextgen" enough. Or both.  Add business and product decisions on top of the technical ones and it's impossible to be sure the decision is correct. At some point you have to swing the club to find out. Except the ball is your MVP, the wind is your business's inertia, and the ball is heading into the woods.

Business decisions are critically important to the software we create, and yet I'd wager that most developers don't realize what their most pressing business objective is right now. That's ok; the code is hard enough. Strangely, the opposite is also true. Business people watch programmers the same way TV viewers watch golf. It's just typing, or it's just a new menu, or it's just restricting what a user can do after business hours. And just like golfers, we tell them it's not that easy, but of course seeing is believing. The bad news is that it takes several people-months to show them how hard it is to fulfill their request. Software development has its own version of course management, called agile. Just like the green-in-three approach, it seeks to reduce risk by putting the effort and expected outcome within reach of everyone. Break the problem down into manageable and achievable chunks and work towards those. We may fail, but the cost of failure is smaller.  It allows software development teams of all skill levels to succeed, even teams in an unknown problem space. It really is the safest way to ensure the chances of failure are minimized and the chances of success are maximized.

Just like course management, agile only works if everyone commits to it: the caddy, the golfer, the people on TV, and the announcers; the management, the programmers, the industry, and the executives. The consulting industry has sold businesses on the idea that 'agile' means 'going faster' or 'doing more with less'. Neither is true, and a great amount of ink has been spilled disproving these myths. Software developers, disillusioned with the promises of agile development (notably the Scrum framework) combined with executives never truly buying in to the change, have decided to become "post-agile". Executives revert to the tried-and-not-so-true method of asking for too much, managers revert to command and control, and the industry apes what it sees, because it thinks that's what success looks like.

Closing

The problems we encounter are at the seams of each discipline.  We understand programming is hard and requires exacting specifications, and we understand that business people need to see a return on their investment.  What we aren't able to do is align all of these forces cohesively. Post-agile is great for developers.  It's great for remote work.  It doesn't solve the problem of measuring return on investment, because that's not what it's meant to do. It doesn't solve the problem of establishing trust between business and development, because that's not what it was meant to do. Agile development, for all its flaws, attempted to establish trust between developers and business, and to make development better for both the business and developers.  The promise of agile has succeeded when it's been adopted by everyone in the organization; when everyone puts aside their egos and recognizes that past performance is an indicator of future performance; when businesses ask for bite-sized pieces of functionality instead of features that take people-months to accomplish.  It has succeeded, and it can succeed.


What we're missing is the metaphor that would convince business people and executives that software is hard; that shipping early and often requires both reducing the vision and investing in tooling; that practice and training are vital to performance, and that practice must come on company time; that working longer hours will make things worse; that adding people to the problem won't deliver features faster; and that not correcting bad technical decisions has disastrous effects. We don't have that metaphor, and without a good metaphor, it's hard to have a shared understanding. We're stuck with the same tools we've always had: build trust, ship, and use personal relationships to convince individuals of what works.  It's not systemic, but it could work.

Microservices after Two Years

At this point, I have two(+) years of experience with Microservices, and I’m not an expert, but I have some hard-earned knowledge distilled from working with them (and making lots of mistakes in the process). Here’s what I learned that I wish I had known going into it.

Microservices are not mini-monoliths

Jim Gaffigan has a rather funny skit about (American) Mexican food. Listen to it here before I butcher the punchline. The punchline of the skit is that all Mexican food basically consists of a tortilla with cheese, meat, or vegetables. We tend to think of deployable software in that same way: it's all code, wrapped up with a deployment script, and sent to production. Monoliths are independent, complete applications that fulfill a business function. So what's a microservice? An independent, complete application that fulfills a business function. So why aren't microservices just 'mini-monoliths'? The answer comes from the idea that microservices collaborate. A monolith does not rely on another monolith for its uptime, data, or resiliency. It is generally a self-contained view of the world, and by its nature it does not care whether anyone else exists. Your company's website is wholly independent of anything else. More critically, though, multiple teams may work on your company's website; they share code, branches, and a single production pipeline. Microservices, on the other hand, are independent, complete applications that each fulfill a single business function, and only one. A monolith fulfills many.

A Microservice understands that while it is independent, there are possibly zero or more people out there interested in what it has to say, and so it is designed with that understanding in mind. A Monolith is not, and does not have to be. Businesses eventually find out that they wish their monolith was designed to share its information in a de-coupled fashion, but often too late to do anything about it easily.

Microservices are not mini-monoliths; they’re collaborators that operate independently when they need to.

Microservices require a different way of thinking about problem solving

Developers love to write code. We’re so enamored with writing code that we’ll write code even when no one needs us to. We’ll write code to solve nagging problems on our own machines, or to automate silly things, or even write code to solve problems in our households. In fact, I have a new side project to set up a Raspberry Pi as a calendar viewer in my house. This is probably not unique to software development (though maybe it is? Do plumbers re-pipe their houses? Do electricians rewire theirs on a whim?) but the tenor of it is so overdone in software development that we exhort new developers to not write code first.

… And then we ask them to work on a monolith. Monoliths make writing more code easy. It gets to a point where the default state is "find problem", "write code", "ship", without understanding whether the problem is best served by a bolt-on or add-on to the existing system. For small things this is not an immediate issue, but those small things add up, and it becomes a problem over time.

For instance, if you've ever tried to add a CSV import to an existing system, you've probably found out within days that the desired "CSV import" feature is really a "CSV + domain-specific logic" import, or, almost as harmful, that a 'bulk' method of inserting wasn't part of the original requirements, necessitating a change to the API. In a monolith, it's really easy to write code that adds this functionality with baked-in assumptions that aren't clear, and to change the API your system exposes or how it presents itself to the user. Because of the ease of 'just' writing code, it is easy to rush the implementation without regard to the design. Writing code quickly is not the job; solving problems without causing more problems is the job, and a monolith makes that hard to do.

A user wants to add a stock to their portfolio…

Microservices, on the other hand, require up-front planning before code is written, every time. Every new service, or any change to a service, may be coupled with completely replacing that service. Anything that has the potential to change a contract in the system (whether with the user or with other services) requires more understanding and up-front design than the same change in a monolith. To go back to our CSV import example: one way to do it with microservices is to stand up a new CSV importer service that takes in a CSV file, does any domain-specific formatting, and emits an event or sends an HTTP request to the correct service, using its existing API for adding/importing information.

And now they want to add multiple through CSV.

Now, these services are necessarily coupled to each other (though the coupling does go in the right direction), and since the contract of the original service has not changed, its guarantees are kept intact. Microservices, done well, make it harder to break existing consumers. The trade-off is that more up-front planning is required when designing a solution in a microservices-based topology.
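As a rough sketch of that CSV importer idea (every type, field, and endpoint name here is hypothetical, and real parsing and error handling are omitted), the importer service owns the domain-specific formatting and then leans on the portfolio service's existing API:

using System;
using System.IO;
using System.Net.Http;
using System.Text;
using System.Text.Json;
using System.Threading.Tasks;

public class CsvPortfolioImporter
{
    private readonly HttpClient _portfolioApi; // points at the existing portfolio service

    public CsvPortfolioImporter(HttpClient portfolioApi) => _portfolioApi = portfolioApi;

    public async Task ImportAsync(Stream csvFile)
    {
        using var reader = new StreamReader(csvFile);
        string line;
        while ((line = await reader.ReadLineAsync()) != null)
        {
            // Domain-specific formatting lives here, inside the importer service.
            var fields = line.Split(',');
            var payload = JsonSerializer.Serialize(new
            {
                Symbol = fields[0].Trim().ToUpperInvariant(),
                Quantity = int.Parse(fields[1])
            });

            // Reuse the portfolio service's existing "add a stock" endpoint;
            // its contract never changes.
            await _portfolioApi.PostAsync("/portfolio/stocks",
                new StringContent(payload, Encoding.UTF8, "application/json"));
        }
    }
}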

Domain boundaries are critical to Microservices success

There are three general flavors of microservice (there may be more, but other types are escaping me right now):
1. Microservices that give new capabilities to an existing domain bounded context (the previous example of adding CSV import for a portfolio service as a separate microservice is an example of this — there are several trade-offs to doing that, and it depends on your constraints and desires)
2. Microservices that represent a stateless process (viz. validating a credit card)
3. Microservices that represent a stateful process or interaction (the portfolio service)

Notice that I said nothing about size of these services; and depending on whom you speak to, the size of a microservice is a mystery. I have opinions on this, of course; but the one invariant I’ve seen is that good microservices topologies ensure the lines are drawn at the domain’s “bounded context“. This is a fancy Domain Driven Design phrase that means to split up models and interactions by what they mean. To sales, a customer interaction is quite a different model and mode of interaction than a customer interaction for customer support. By splitting them up by their ‘context’ (and the boundaries being sales and customer support), the software can maintain independent ideas of how to interact with a customer depending on the context.

Martin Fowler’s Illustration of Bounded Contexts, source: https://martinfowler.com/bliki/BoundedContext.html


For microservices, this typically means that your customer support portal will be a different bounded context than your sales funnel; even if they share the same properties of a customer (at least demographically). There are three ways to handle the above problem:

Method 1: Set up a separate service with an independent customer model for each service (sales, customer support), and one created in one system is not necessarily referenced elsewhere (or it can be; customer_id, customer_support_id, sales_id)

Method 1, illustrated.

Method 2: Set up a “Customer” service, a sales service, and a customer support service, and both sales and customer support get customer information from the “customer” service.

Method 3: Set up a customer service, a sales service, and a customer support service; and sales and customer support have duplicated data (received through events) of things that happen in the customer service, but they maintain their own disparate models for what a customer means to them. From a system perspective the internal identifier is the same; how it’s used varies from system to system. This means having a customer service that has demographic information; a sales service that may or may not have this same demographic information but adds on sales context, and a customer support service that maintains this duplicate information but adds on its customer support pieces.

Each method has its own trade-offs; but you can quickly see the maintenance issues with each:

  1. Method 1 has three different representations of a customer, potentially in different states in each service (a sales person sees a customer before they've signed on the dotted line, while a customer support person always has a "post sale" view of the customer). This is OK until you want sales to have the customer support information, and then you need to do a bit of juggling to ensure a customer from the sales context is indeed the same customer in the customer support context.
  2. Method 2 allows there to be one representation of a customer, and each service can "add on" to that representation; but each downstream service is still beholden to the customer service, and which context does that live in? Both. There is also temporal coupling, since each service "gets" demographic information from the customer service on request.
  3. Method 3 allows each service to be de-coupled from the "customer" service. It allows each service to add its own data to what it means for there to be a customer, and it allows each service to change independently (since each service can listen for customer events and update its own model as it sees fit). But this also means having a unified contract for what defines the demographics of a customer, ensuring each service is set up to listen to events pertaining to customers, and making sure each service can handle customer events it missed while it was down (event sourcing is a possible solution here).

None of these methods is "ideal" from an "easiest to develop" standpoint, and they carry different levels of maintenance cost. The crucial questions a team must answer are: what is the domain context, is this <thing> I'm dealing with talked about differently depending on who I talk to, and what is the maintenance cost of each approach?

If the team chooses method #1, they have a lot of distributed-systems problems that aren't easily solved; they've made interacting with the system harder. If they choose #2, then two services depend on a third (not really 'independent' at that point), and they've added a request/response dependency between services that may not need to exist (and is harder to debug). If they choose approach #3, they have quite a bit of up-front work (defining contracts and patterns), but the maintenance work, reasoning about how services interact, debugging, and future expansion are far easier.

Developer Tooling doesn’t support Microservices as well as Monoliths

We have about 25 years of experience as an industry creating tooling around building and deploying software, though it's only in the last 15-18 years that the tooling has really accelerated. Even so, the solid tooling we do have is for developing and debugging monoliths; debuggers and IDEs take monoliths for granted, as they probably should. If you write microservices that depend on other microservices over REST, you're going to have a bad time debugging services locally. Your choices range from standing up the collaborating parts of the system, to mocking out external dependencies, to dockerizing the system's services so they can be stood up independently. Of course, once you do this you're diving into mixed networking land for Docker, and there's not a lot of tooling that makes that experience seamless. A service running outside of Docker that you're debugging is hard to wire up to services running inside a Docker network, and vice versa. Front-end development is even worse, since Node.js is a requirement for building front-ends these days; try live-debugging with Docker for your UI when the source is kept locally. Not fun. Teams handle this problem in different ways, but the point is that the problem exists, and the solutions are not as mature as debugging a monolith.

If you use microservices, you need to allocate a sizable chunk of time to building the tooling necessary to allow people to develop against those services.

Deployment requires better tooling with Microservices

Deployment considerations are key if you want a fast-moving organization. You can't respond to change without being able to change your software quickly, and even if you can develop changes quickly, you aren't a fast-moving organization if you can't deploy them quickly. Continuous Integration (CI) and Continuous Delivery (CD) are essential to being able to respond to change, but most CI/CD products reflect a monolithic view of the world. Source control is built for it, CI/CD systems are built around it, and pretty much every commercial CD system is built with monoliths in mind. There are several deployment models where microservices are used, and none of them have good tooling:

  1. Deploy on-premises as a packaged solution
  2. Deploy to the cloud independently
  3. Deploy to the cloud as a packaged solution

If you sell your product to customers and they run it in their own data center, you're often dealing with deployment method #1: your solution must be packaged up and deployed as a single unit. Should that force you to develop a monolith? No, it shouldn't. However, if you have microservices, you necessarily have multiple deployable artifacts (whether they're contained in a mono-repository (all services in one source control repository) or micro-repositories (each service in its own source control repository) is a separate matter), and your CD pipeline must take that into account. The trade-offs change between micro-repositories and a mono-repository, but they remain problems not solved by current tooling. For instance, tagging master or a release branch with what is in production, your promotion model to different internal environments, and even local deployments all need to be taken into account by the tooling. If you choose method #2 and combine it with continuous delivery, some of those trade-offs go away, since you can make a rule that the latest in master is always pushed to internal promotion environments and the only tag happens after a particular commit has been pushed to production; but again, the tooling is still lacking to make this a seamless experience.

Microservices deliver on the promises of Object-Oriented Programming

I didn’t understand the hype of object oriented programming. I understood the fundamentals of encapsulation, abstraction, inheritance, message dispatching, and polymorphism, but I didn’t understand why they were so useful (I started with Perl, and then moved to Java, so I had nothing to compare Java’s OO nature to. At the time it just seemed like more work to do the same things I could do in Perl. Ahh, youth). The SOLID principles helped later on, but I always felt like there was more hype to OO than actual benefit. After several jobs maintaining and creating Object-Oriented solutions, I was convinced that Object Oriented Programming was a pipe-dream. To the 80% of us who are not “expert” programmers, it is a fad we can never make full use of and it causes more harm than good.

That was until I started researching microservices. This was it! A fully independent object that had agency that could collaborate with others; but encapsulation was ensured! The Open/Closed principle was a requirement! Single responsibility was almost ensured just by the nature of the service! (It says “micro” on the tin) Inheritance was far simpler — consume what the service gives you and modify it to suit your needs (the CSV example above). You couldn’t share information unless you had a common contract and used some sort of message dispatching!

This was absolutely huge for me. All of those principles that I’d been trying to bring to reality for years in codebases I’ve worked on were here — and best of all they didn’t have the downsides of OOP in practice! It’s really easy when modifying code to do something that breaks encapsulation, and business pressures make it even easier. With Microservices, that was no longer possible. Sure, other business induced pressures might cause problems, but they couldn’t alter the contract of a service; and that allowed the system to be reasoned about in ways OOP promised. Perhaps best of all, microservices put up guard rails that keep the mistakes of OOP from happening, and we’re all better for it.

Contracts, Patterns, and Practices should be Code generated

If you do something once, do it manually. If you do it twice, write down the steps, and do it manually. By the third time, automate it. Producing even a dozen services means either manually enforcing the structure of contracts (the format by which services communicate with each other or with the user), patterns (how you structure common infrastructural concerns), and practices (how you write software), or code-generating them for commonality. If you don't code-generate them, entropy wins: services start to do the same thing in different ways, or you find a new pattern for structuring your events, and depending on which service you're in, you see a different pattern. It's untenable from a development and maintenance perspective.
Method #3 above shows a world where the customer service emits events when a customer is added or updated, allowing interested services to listen for changes and update their own data stores as necessary. Without code generation this would be a tedious, error-prone process. With code generation and schema-defined models, it's a viable development model.

Can you imagine trying to update any model/contract without code generation?
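For a sense of what that looks like in practice, here's a hedged sketch of a schema-defined, code-generated contract; every name below is invented, and a real generator would emit serialization plumbing as well:

using System;

// -- generated from customer-events.schema.json; do not edit by hand --
public sealed record CustomerUpdated(
    Guid CustomerId,
    string FirstName,
    string LastName,
    string Email,
    DateTimeOffset OccurredAt);

// Each interested service (sales, customer support, ...) implements a handler
// that folds the event into its own local model, adding its own context.
public interface ICustomerUpdatedHandler
{
    void Handle(CustomerUpdated evt);
}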


There are only two sane paths: package the commonalities (which can really only be done for dependencies) into utility functions, or code-generate everything.

Packaging utility classes/models (like the customer model and the events above) is a valid approach. The concerns with it are taking on dependencies (even internal ones), the overhead of internal infrastructure, and the fact that every service would be required to use the same programming language.
The latter path (code generation) is exactly what Michael Bryzek advocated in his talk Designing Microservices Architectures the Right Way, and having tried the other paths (packaging common functionality, and doing it manually), I can see its utility. The trade-off, of course, is that developing the code generation tooling is a heavy investment of time. It requires the discipline to develop that tooling first without trying to develop features, and it would likely result in no visible movement on the things the business cares about (features, revenue, etc.). It also means that as long as you have tooling to support a language, you can implement those models in any language you'd like.

You can’t punt on non-functional requirements

There are lots of non-functional requirements in a system that never appear on the roadmap, are never spoken about at sales meetings, and are only tolerated by the product manager. Things like: a user should be signed out after fifteen minutes; the authorization system should incorporate roles and location; some data is transient and not part of the backup strategy while other data needs to be backed up every minute; the system must support 5,000 concurrent users at a time. Those are non-functional requirements: qualities of the system that aren't part of the user-facing features being developed.

In a monolith, there are typically very few places to go to implement a non-functional requirement, and as we've discussed previously, IDE tooling is built for the refactoring necessary to ensure a change takes place everywhere it's needed (at least for statically typed languages; the dynamic folks have their own problems to contend with). Even if you have to implement a new feature, there's generally one place to do it.

Not so with microservices. If you implement authorization, you must implement it across all services. If you implement a timeout, you must implement it across all services. Unless your microservices are on separate hosts, any performance improvement must take into account that each service may share host resources with one or more other services. If every service that uses Postgres shares a single Postgres server instance (even as separate databases within that instance), then performance tuning and backups must take that into account. This greatly complicates performance tuning and dealing with non-functional requirements, and for the system to be easily built, those non-functional requirements need to be known at the beginning! Every delay in implementing a non-functional requirement makes it more likely that disparate changes will be needed across several services, and that will take much longer once the services are built.

Event Driven Programming makes microservices work

In firmware programming, the finite state machine and events got me through the day. Each peripheral has its own states, and transitions are triggered by events that may come from user input or from other peripherals (for instance, seeing a Bluetooth advertisement from a whitelisted address may trigger a connection). Since firmware by and large sits on a single-core System-on-Chip with limited or no threading, using an event loop and finite state machines is one of the best ways to make firmware work.

Finite state machines coupled with event-driven programming also have other nice properties that parlay well into microservices: events ensure each service is de-coupled from the others (there are no direct request/responses between services), and a finite state machine dictates what happens based on the current state of the service plus its input. This makes debugging a matter of knowing which state the service was in and what input it received. That's it. This greatly reduces the complexity of standing up and debugging services, and allows problems to be decomposed into events and states. If you add event sourcing into the mix, you have an event stream that records the events that occurred, so playing back issues is as simple as replaying events.
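Here's a minimal sketch of that idea (an invented order domain, not production code): the next state is a pure function of the current state and the incoming event, which is exactly what makes replaying an event stream so effective for debugging:

using System;

public enum OrderState { Pending, Paid, Shipped, Cancelled }

public abstract record OrderEvent;
public sealed record PaymentReceived : OrderEvent;
public sealed record OrderShipped : OrderEvent;
public sealed record OrderCancelled : OrderEvent;

public static class OrderStateMachine
{
    // Pure transition function: (current state, event) -> next state.
    public static OrderState Next(OrderState current, OrderEvent evt) => (current, evt) switch
    {
        (OrderState.Pending, PaymentReceived) => OrderState.Paid,
        (OrderState.Paid, OrderShipped)       => OrderState.Shipped,
        (OrderState.Pending, OrderCancelled)  => OrderState.Cancelled,
        (OrderState.Paid, OrderCancelled)     => OrderState.Cancelled,
        _ => current // unknown transitions are ignored (or logged), not fatal
    };
}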

This is possible because microservices operate on network boundaries. In a monolith you're forced to debug the entire monolith at once, and hope no one wrote code with disastrous side effects that are impossible to find through normal means. It's easier to find a needle in a small jar of needles than in a giant haystack, and that's what the observable boundaries of microservices, plus patterns that limit how much complexity can lead to a given state, buy you.

If you’re going to start writing microservices; I highly recommend going down the path of event-driven programming, state machines, and some sort of event stream (even if you decide against event sourcing).

Choosing between REST and Events for supporting Microservices is tougher than you may think

If you've read the fallacies of distributed systems, then this section almost writes itself. Microservices are distributed systems, no matter how you shake it. One of the major problems when communicating across a network boundary is: "is that service down, or am I just having a network timeout?" If you're using REST, this means implementing the circuit breaker pattern with some sort of timeout. It also means that if your services communicate with services that in turn communicate with other services over REST, the availability of that chain will eventually hover just above zero (see 00:00-12:31 of the linked talk). As the video rightfully says, don't do that. I'd go so far as to say that, if at all possible, don't make calls to other services through REST.
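For reference, here's a bare-bones sketch of the circuit breaker idea (real libraries such as Polly do this far more robustly): after enough consecutive failures, stop calling the downstream service for a cool-down period and fail fast instead of piling up timeouts:

using System;
using System.Threading.Tasks;

public class CircuitBreaker
{
    private readonly int _failureThreshold;
    private readonly TimeSpan _coolDown;
    private int _consecutiveFailures;
    private DateTimeOffset _openUntil = DateTimeOffset.MinValue;

    public CircuitBreaker(int failureThreshold, TimeSpan coolDown)
    {
        _failureThreshold = failureThreshold;
        _coolDown = coolDown;
    }

    public async Task<T> ExecuteAsync<T>(Func<Task<T>> call)
    {
        if (DateTimeOffset.UtcNow < _openUntil)
            throw new InvalidOperationException("Circuit is open; failing fast.");

        try
        {
            T result = await call();
            _consecutiveFailures = 0;      // a success closes the circuit again
            return result;
        }
        catch
        {
            if (++_consecutiveFailures >= _failureThreshold)
                _openUntil = DateTimeOffset.UtcNow + _coolDown;  // trip the breaker
            throw;
        }
    }
}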

If you need data, have the upstream service publish an event, and consume that event. This sounds great: it's de-coupled, and it's resilient to failure. However, each service must now have the means to publish to a bus, consume events off of a bus, and support whatever serialization scheme you want to use. Oh, and now you need to be able to debug all of the above. If you want runtime resiliency, you must sacrifice development simplicity to get there.

Maintaining Microservices requires strong organizational and technical leadership

"The business" does not care what the topology of your system is. They don't care about its architecture, and they don't care how easy it is to maintain, any more than you care whether they use Excel or QuickBooks for forecasting. The business wants two things (really it's n of 9 things, but work with me here):

  1. Increase Revenue
  2. Reduce costs

They believe more features will increase revenue. It's a fair belief (though correlation does not imply causation), but more features also increase development costs. To "the business", the way to solve this problem is not by reducing costs but by increasing revenue. Again, this is also fair, and in a good number of cases it's the right path.

Earlier, I mentioned that microservices keep those nasty shortcuts that cripple development teams from happening, and that’s a good thing, but, to the business, it can also be a bad thing. See, that crippling shortcut may never happen; but adding that feature (to their way of thinking) will increase revenue. If they have to choose between helping revenue but possibly hurting future maintenance, or delaying that feature by several weeks but helping future maintenance, they’ll pick the path to fastest revenue, every time.

The person or people that keep this from happening are hopefully the organization’s CTO and engineering leadership (VP or Director of Engineering, the Architect, and senior leaders of the team). They’re the people with the cachet and experience to know when this is going to hurt future maintenance, and they hopefully know enough to know it’s probably not a sure revenue bet either. But this requires discipline and trust on the part of the engineering leadership team. They must have gained the trust of the business by delivering what the business wants in the timeframe they want it; and they must be disciplined enough to stick to their guns. If someone says, “Well, we could do this in a week if we just hooked Service A up to Service B’s database”, you have now failed with microservices and are maintaining a future monolith. You’ve also lost the advantages of working with microservices.

Shortcuts are easy to say yes to, and shortcuts can greatly endanger the maintainability and health of a development team and the system.

Microservices are a technical solution to an organizational problem

While developers and consultants tend to espouse microservices in a cloud scenario, they tend to ignore that microservices are orthogonal to their deployment scenario, and orthogonal to technology stacks. Take away all of those advantages, and you're still left with a topology that allows you to segment teams along domain boundaries and have those teams operate independently of one another. At a small enough scale, you could even have individuals own services and scale feature creation out to the number of people in your development organization. The Mythical Man-Month states that adding people to a late project makes it later, and it says that because those people have to communicate with each other. What if they didn't? Or what if you could reduce the amount of communication needed to ship a feature? Microservices let you do that. (I fall firmly in the micro-repository camp as well, so I'm about to conflate the two on purpose.) Microservices development means independent repositories, and fewer issues with merge conflicts, branching, or collaboration needed to push out a particular feature. It also means fewer avenues for the feature to clash with existing features, since by definition the service is independent and autonomous. It means fewer parts to reason about, and that results in faster development time.

Microservices (when architected well) let you go faster and further than you otherwise could, with less need to put organizational guardrails on the development team (code reviews, gated check-ins, code freezes) to resolve team performance issues. They minimize the effect a single developer can have on the whole system. This is a great benefit if the organization does not hire well or pay well (and if every organization did, we'd have low turnover in software development), as it substitutes technology for some of the human training and improvement that organizations should invest in but don't.

If you have all top-notch performers in a high-performing engineering organization with a high performing business with no turnover, you don’t need microservices because you’re not going to make the mistakes that microservices would fix. If, however, you’re in an organization that consists of humans that are fallible, microservices provide a benefit to development that monoliths cannot.

Closing

Microservices are another tool to help make software development better and to make systems easier to maintain. They provide many benefits and have many trade-offs with traditional monoliths, and it’s rarely clear whether or not a system should be developed as a monolith or as microservices. There are several factors that can steer the choice towards one or the other; but those factors depend greatly on the individuals, organizational leadership, business model, constraints, and politics of the organization implementing those services.

These are the things I wish I had known when I started with microservices. What do you wish you had known about Microservices before working with them?

Note: Special thanks to Adam Maras for spending part of his weekend giving me feedback on this post.