Improving Diversity as an Individual Contributor

The book Getting to Diversity: What Works and What Doesn’t by Frank Dobbin & Alexandra Kalev is a great way for organizations to understand what the data says about common practices used to increase diversity. Surprisingly, many of the common ones don’t actually help.

While the book is geared towards HR and other business executives who have the ability to influence company policies, there are some actionable things for individual contributors that the data shows can help improve diversity.

Mentoring

[From chapter 4: Open Networks Up]

Mentoring can help improve diversity at higher levels in the organization by creating intentional social networks and connections. [Figure 4.1] But not all mentoring programs are created equal; two things are needed to successfully democratize them:

  1. Formal Programs
    Formal mentoring programs help ensure that everyone is eligible to be mentored. Informal programs often leave out under-represented groups.
  2. Make matches on interests, not gender and race.
    Only pairing under-represented mentors with mentees of the same under-represented category is a numbers problem – there simply aren’t enough under-represented mentors if our premise is that there is not the desired diversity at higher levels. It’s also a power problem because over-represented groups usually have the most influential network connections and ability to help mentees join those networks.

Take-aways:

  • Higher-level individuals, particularly those from over-represented groups, can help improve diversity by mentoring under-represented individuals to help them grow their careers and professional networks.
  • Utilize established formal mentoring programs where available.

Leverage ERGs for hiring

[From chapter 4: Open Networks Up]

ERGs in and of themselves don’t improve diversity. While they are places for people to find community, they can also create silos – the people who need to know more about a community’s needs aren’t in the room because they aren’t in the ERG. Getting to Diversity shows that the effect of ERGs alone on diversity is weak, with little to no improvement for most under-represented groups. [Figure 4.5]

However, ERGs have a supercharging effect on targeted recruiting. Having ERG members lead or accompany recruiting efforts at universities and other professional groups can have an outsized impact on increasing diversity. [Figure 4.6]

ERGs can also increase diversity through referrals. Targeted referral programs – asking employees for referrals – were turbocharged when they leveraged ERGs, increasing diversity. [Figure 4.7]

Take-aways:

  • As an ERG member, the best way to use the ERG to increase diversity is to assist with recruiting efforts, refer diverse candidates to jobs, and encourage other ERG members to do likewise.
  • ERGs can encourage their members to be a mentor / mentee.
  • High-level individuals in an ERG can be a mentor for other ERG members – ideally these are cross-ERG pairings.

Use flex time

[From chapter 6: Work-life Help for Everyone]

Flex time, the ability to adjust one’s work schedule to better accommodate work/life balance, is a major contributor to increasing diversity. [Figure 6.1] Many companies, including Invitae, support some form of flex time for many roles.

The challenge is that just because a company has flex time policies does not mean that using it is culturally acceptable. People may not take advantage of flex time because they are worried that doing so will negatively impact their career. Creating a culture where it is not only acceptable but encouraged to take advantage of flex time policies can help.

Take-aways:

  • High-level individual contributors can lead by example and take advantage of flex time and other time-off policies (PTO, parental leave, etc) and encourage others to do so.

python, bytecode, and read-only containers

Upon first import, python compiles .py code into bytecode and caches it in .pyc files (stored in __pycache__ directories since Python 3). Subsequent uses of those python sources are read from the .pyc files without needing to re-compile. This makes startup time, but not runtime, faster.

But what about read-only filesystems? If python is running on a read-only filesystem, no .pyc files can be written and every use of a .py file involves compiling the code afresh. Everything still works; the startup time of a script is just a little slower.

Read-only filesystems within a container are a security best practice in production environments. Often in kubernetes deployment manifests you might see something like:

securityContext:
  readOnlyRootFilesystem: true

And if you’re running python only once within the container, there’s just a little overhead at startup as the code is compiled to bytecode; after that everything is in memory and off it goes. But if you’re running python multiple times, or want to make even the single run start faster, we can pre-compile the code when we build the container with python’s built-in compileall module.

# Compile code on sys.path (includes system packages and CWD):
RUN python -c "import compileall; compileall.compile_path(maxlevels=10)"

# Compile code in a directory:
RUN python -m compileall /path/to/code

This moves the compilation overhead to the container build where it happens once, and out of the startup.
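You can sanity-check the build step locally before baking it into a Dockerfile. This minimal sketch (the file name and contents are made up for illustration) compiles a throwaway source tree the same way python -m compileall /path/to/code does, then confirms the bytecode cache was written:

```python
import compileall
import pathlib
import tempfile

# Throwaway source tree standing in for /path/to/code in the Dockerfile.
tree = pathlib.Path(tempfile.mkdtemp())
(tree / "app.py").write_text("def main():\n    return 'hello'\n")

# Same effect as the build step: RUN python -m compileall /path/to/code
ok = compileall.compile_dir(str(tree), quiet=1)

# Python 3 caches bytecode under __pycache__ next to the source.
pycs = list(tree.rglob("__pycache__/*.pyc"))
print(bool(ok), len(pycs))
```

If the .pyc files show up here, they’ll show up in the image layer too, and the read-only filesystem at runtime no longer matters for startup speed.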

Thanks to Itamar Turner-Trauring at https://pythonspeed.com/ for their excellent Production-ready Docker packaging for Python slide deck with this gem.

poetry auth via .netrc

poetry, the python package manager, provides several ways of authenticating against a repository. What isn’t explicitly documented, because it comes from an implicit dependency, is that poetry can also use the ~/.netrc file for authentication when fetching packages.

poetry uses requests under the covers, and requests falls back to the ~/.netrc file when no other credentials are supplied. pip falls back to ~/.netrc for the same reason.
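If you want to sanity-check what credentials requests (and therefore poetry) would pick up, python’s standard-library netrc module parses the same file format. A minimal sketch using a throwaway file; the machine name and credentials below are made up:

```python
import netrc
import tempfile

# Hypothetical .netrc entry for a private package repository;
# the machine name and credentials are made up for illustration.
entry = "machine pypi.example.com\nlogin deploy-user\npassword s3cret\n"

with tempfile.NamedTemporaryFile("w", suffix=".netrc", delete=False) as f:
    f.write(entry)
    path = f.name

# authenticators() returns a (login, account, password) tuple --
# the same lookup requests performs when it falls back to ~/.netrc.
auth = netrc.netrc(path).authenticators("pypi.example.com")
print(auth)
```

If authenticators() returns None for your repository host, poetry won’t find credentials there either.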

There are several (probably bad) reasons why someone would want to do this instead of one of the explicit methods poetry provides. One that comes to mind is needing to install python packages from a private repository inside a docker container by simply volume-mapping the host’s ~/.netrc file so poetry picks up the right creds.

This approach probably won’t work when publishing packages — caveat emptor.

While I’m not suggesting that this is a best practice, it’s good to know that it’s an available method in some extreme edge cases.

Bye ExtraHop, hello Invitae

I’m excited to announce that I am starting a new job with Invitae as a Staff Software Engineer in their Office of the CTO on Monday, Sept 27th. I’ll be helping teams across the engineering organization solve intractable problems, fostering the use of software development best practices, assisting with designing performant & scalable architectures, and just rolling up my sleeves and helping out. I’m looking forward to a new challenge on a team with a broad mission.

I’m very sad to be leaving the wonderful people at ExtraHop, in particular my amazing Perf & Tools team. It’s been a fun 2.734 years and I got to work on some company-changing projects like Reveal(x) Cloud, ExtraHop’s first SaaS product, which we developed and shipped in just 4 months. I still heartily recommend ExtraHop as a company.

My last day at ExtraHop is this Friday, Sept 10th then I have two weeks of funemployment before starting at Invitae.

Accessing Ubuntu desktop UI over SSH+VNC

During this pandemic I’m working from home on my Mac laptop and accessing things on my Ubuntu 18.04-based Linux desktop in the office. For most things this is fine via SSH or sshfs, but there are times you just need access to the desktop UI to get things done.

Specifically I had a 500 MB OVA that I needed to upload to an ESXi system — both of which are in the office. I could have downloaded the OVA to my laptop over the VPN, then uploaded it back over the VPN to ESXi, but that is slow, tedious, and wasteful. Instead, after a bit of googling, I figured out how to get a VNC client on my Mac securely accessing my work X11 display so the transfer stayed local to the office:

On your desktop, install x11vnc:

sudo apt install x11vnc

On your home computer, open an SSH tunnel and start the x11vnc server on your remote system (below as $HOSTNAME):

ssh -t -L 5900:localhost:5900 $HOSTNAME 'x11vnc -localhost -display :0'

Then start a VNC viewer on your home computer (on MacOS I recommend RealVNC) and connect to localhost:5900

Security advisory: when accessing your desktop like this your computer is unlocked and accessible by keyboard and mouse to users who wander by your desk. Granted, in a pandemic when everyone is working from home is this really a problem? Lock your computer when you’re done as if you were walking away from your desk and you’ll be fine.

Preparing your software engineering team for a pandemic

We would be naive if we didn’t consider the possibility that the Coronavirus might flare up into a full pandemic over the next few months. Here are some things you can do to ensure your engineering team keeps humming along if that happens. If the pandemic doesn’t happen at least we’ll be better prepared for next flu season.

For individual contributors

If you’re sick, stay home and take care of yourself.
Your colleagues want you to feel better and also for you to keep your germs to yourself.

Don’t come back to work until you’re fever-free for 24 hours.
If you feel like you can work, work from home rather than come back too soon and risk a relapse or sharing your illness.

Note: the CDC goes further and says to stay home until you’ve been fever- and symptom-free for 24 hours without the help of medication.

Be prepared to work from home.
Whether it’s because you’re sick, you’re taking care of a loved one who is sick, or your kid’s school is closed, be prepared to work effectively from home.

Make sure you have the tools (computer, monitor, keyboard, etc) and access (VPN, routing to AWS resources, etc) you need to do your job effectively. Don’t wait until you’re sick to figure it all out, do so now while you have the energy to tackle some bumps along the way.

Talk with your manager about your team’s WFH policies, procedures, and best practices.

Be prepared for others to work from home.
Add call-in info to all of your meetings and actually call into them. Consider ways for those working from home to participate in stand-ups and other activities (my team does daily chat-based standups, for example).

For managers

Allow & enable your employees to work from home.
Even if your engineering org prefers to have employees in the office, allow your employees more latitude to work from home if a pandemic happens. Some employees will need to stay at home and take care of their children if schools close.

Be sure to provide employees with the resources they need to work effectively from home. That might include computer hardware, VPN software, head sets, etc.

Have WFH policies, procedures, and best practices.
Ensure your employees have clear expectations for when they work from home. Be clear if you have different expectations for employees working from home vs. in the office, such as daily status reports or check-ins. Working remotely can be challenging for individuals who need more structure.

Be prepared for a lot of people to work from home.
Ensure your infrastructure will handle many more people than usual working from home at the same time.

Factor potential sick time into your planning and sprints.
When doing planning, be sure to add in some buffer for people who might be out sick. You may need to take on fewer stretch goals.

And…. ?

What did I miss? What are you doing to ensure your engineering team is prepared in case Coronavirus goes full pandemic?


Bye SFI, hello ExtraHop

After much deliberation and soul-searching I’m changing jobs. Next Monday, November 26th is my last day at Spaceflight Industries. I will then have 3 glorious weeks of vacation before I start my new job at ExtraHop as a lead on their performance team.

I gave notice a month ago but I wanted to stay at SFI to support my team through our first commercial satellite launch — a launch that was supposed to take place today but is now delayed (the challenges of planning around rocket launches were one factor in my decision to leave the aerospace industry).

I’ve learned a great deal during my 2.5 years at Spaceflight Industries. I’ve worked with some brilliant and hardworking people, whom I will miss, and together we solved some really challenging problems in ingenious ways. I appreciate that SFI was willing to take a chance on me being a manager and giving me the flexibility to explore what that looked like for me.

That said, I’m looking forward to stepping back into an individual contributor position. While I’m told I was a good people manager it didn’t feed my soul and I found it really draining. I’ve had some really great managers over the past 18 years and attempting to live up to the high standards I set for myself was exhausting. I’m not ruling out going back into it in the future, but for now I’m excited to sink my teeth into some gnarly technical problems and to sling some code with the rest of the performance team.

I’m also looking forward to working, albeit indirectly, with the esteemed Jeena Khan and her team of writers! Frankly, I’m not certain ExtraHop knows what they’ve gotten themselves into with Jeena and I working together again. The building might not be able to contain our mutual enthusiasm!

Constellation orchestration with Gemini

This is a company blog post I wrote about Gemini, the cloud-based constellation orchestration software my team and I created at Spaceflight Industries. I’m duplicating it here from the original that was posted on 2018/11/12 for posterity.

Constellation Orchestration using the Cloud

Since the launch of Pathfinder-1 two years ago, the BlackSky ground and control team has been working on Gemini, our internal name for our next-generation cloud-based constellation orchestration system. We’ve taken operator interactions with our first demonstration satellite Pathfinder-1 combined with lessons learned from our first-generation software and redesigned the system from the ground-up for fully-automated operations of our Global satellites. From the very beginning, Gemini was designed to scale up with our constellation.

Designed for fully-automated operations

The initial checkout of the satellite post-launch begins with our satellite operators, who use Gemini for manual commanding of Global satellites during launch and early operations to confirm the satellite is healthy in orbit. After checkout is complete, the operators take a step back and the satellite is handed over to Gemini automation. Gemini is responsible for orchestrating tasking and downlink from the satellite, engaging groundstations around the world to communicate with the satellite during contact passes, creating and uploading satellite mission tasking scripts, managing telemetry & health logs, and alerting operators to any anomalous telemetry. The automation is designed to protect the satellite, but as an additional safeguard Gemini alerts operators in the event of anomalous behavior so that they can intervene if needed.

In addition, Gemini also:

  • plans images and tasks them across the entire constellation
  • orchestrates connectivity with multiple satellites around our world-wide network of groundstations
  • manages the radio chain & antenna tracking
  • propagates satellite and equipment telemetry with sub-second latency from groundstations to operator dashboards during contact passes
  • monitors the entire system in real-time and alerts on anomalies
  • provides infrastructure for our image processing pipeline, code-named Obscura internally, that does georeferencing and orthorectification and more
  • exposes web-based UIs to operators for manual satellite commanding in addition to insight into automated activities and constellation health

Cross-team development and validation

Gemini development was a collaborative effort using input from many cross-company teams to ensure that we could test the system in the same way we expected to use it while in space (as they say in aerospace: test what you fly, fly what you test). The Gemini development team worked closely with operators to design a system that provided the control and insight they needed for successful satellite operations. Our development team worked hand-in-hand with flight software and hardware AI&T teams to validate all radio, commanding, and telemetry interfaces. An agile development approach allowed operators and other stakeholders to request features and resolve issues through an iterative testing and release process.

Our validation team created multi-satellite constellations using virtual satellites — a novelty in the aerospace industry — to validate our system’s scalability. They also created automated deployments and tests to run nightly against our physical test satellite (Flatsat) to validate end-to-end radio equipment functionality and full-system integration. This innovative testing showcases the robustness of our constellation automation ahead of launch and allows the cross-functional team to evaluate the space-to-ground system while still on Earth.

Under the hood

Gemini was built leveraging technologies and practices that, while common in many software development shops, are new to aerospace. Our microservices architecture runs on EC2 instances running CoreOS in Amazon GovCloud and in CoreOS virtual machines on top of VMware ESXi hosts in our groundstations around the world, allowing a unified architecture across these disparate environments. Microservices are coded in Python 3.6, primarily with asyncio/aiohttp, with a smattering of node.js and are deployed via Docker containers.

To handle the firehose of critical telemetry, both from the satellite as well as the groundstation systems, we propagate telemetry in real-time using Redis pubsub then store it in KairosDB/Cassandra and expose it to operators in Grafana dashboards. WebSockets are used for real-time service alerts and messages making them available nearly instantaneously to the user. Our Polymer-based operations UI allows for tight coupling between the microservice source of the data and the operator interface all while being presented together as a single cohesive interface. Using encapsulated web components allows quick deployment of new features and easy integration with third party tools.
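The telemetry fan-out described above can be sketched in miniature. The following uses an in-process asyncio queue per subscriber as a stand-in for Redis pub/sub; all names are illustrative and none of this is the actual Gemini code:

```python
import asyncio

# Miniature stand-in for the Redis pub/sub telemetry fan-out: one
# publisher pushes a telemetry sample and every subscriber receives
# its own copy. Names are illustrative, not the real Gemini services.

class TelemetryBus:
    def __init__(self):
        self._subscribers = []

    def subscribe(self):
        queue = asyncio.Queue()
        self._subscribers.append(queue)
        return queue

    def publish(self, sample):
        # Fan out: each subscriber gets the sample on its own queue.
        for queue in self._subscribers:
            queue.put_nowait(sample)

async def main():
    bus = TelemetryBus()
    dashboard = bus.subscribe()  # e.g. feeds an operator dashboard
    archiver = bus.subscribe()   # e.g. feeds the time-series store

    bus.publish({"battery_v": 7.9})
    return await dashboard.get(), await archiver.get()

results = asyncio.run(main())
print(results)
```

In production the queues live in Redis rather than in-process, which is what lets dashboards, the time-series writer, and alerting services on different hosts all subscribe to the same telemetry stream.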

We use the HashiCorp stack (Terraform, Consul, Vault, and Nomad) to manage our infrastructure as code, Gitlab for source management, and Pants/Concourse for builds.

Launch ready

We’re excited to put Gemini to work when the rubber meets the road with the upcoming Global launches!

2 years at Spaceflight Industries

Today is my 2-year anniversary at Spaceflight Industries.

Coincidentally, today I am operating as an Engineering Lead for our 4th mission rehearsal in preparation for commanding Global-1 when it launches in a few months. In the last 18 months my team has built Gemini, a ground & control system, from the ground up (pun intended) to task a constellation of 20+ earth-imaging satellites from our groundstations around the world. The system provides satellite operators with real-time telemetry on the state of the spacecraft during a contact pass.

It’s amazing to think about what we’ve accomplished since I’ve been here and I’m excited about what the next several months have in store!

My questions for new direct reports

My management mantra has always been “what would I like my manager to do in this position?”. That gave rise to the following set of questions that I ask every new person who reports to me, either as a transfer or new-hire, to start off on the right foot.

  • What would you prefer your core work hours to be?
    I’m not monitoring when my reports are in and out of the office every day (far from it), but knowing if they are a morning or evening person helps me know how they work best and when to start getting worried if they don’t show up and I haven’t heard from them.
  • During those core hours, what hours would you like to have meetings?
    Are there certain days of the week or times of the day you would prefer to not have meetings?

    I view one of my primary objectives as a manager to buffer my folks from interruptions. One way I can do that is to make sure I’m scheduling meetings at times that are good for the employee. For example, if they prefer to eat lunch at 11a I’ll try my best not to schedule a meeting with them then. I also try to enforce meeting-free Thursdays to give a solid block of Maker time and enable people to work from home.
  • How often would you like to have one-on-ones?
    Setting up recurring 1:1s is important, as is knowing how frequently the person wants to meet. We may discuss whether their desired frequency is the right amount, but most people know how often they want to check in with their manager.
  • How do you like to communicate? (Slack/email/in-person/phone/etc)
    I think this is one of the most important questions. Part of buffering folks from interruptions is buffering them from my interruptions too. If someone prefers email to Slack, I’ll drop them a more coherent email rather than a train-of-thought IM. If someone would rather I stop by their desk to ask something than send an IM (and I have a couple of folks who prefer this), I’m happy to oblige.

Thus far these questions have been well-received and knowing the answer has improved my ability to effectively manage my employees and communicate with them.

What questions do you ask your direct reports or wish your manager would ask you?