Ben Kamens

Founder @ Spring Discovery. Proud pasts @ Khan Academy, Fog Creek.

Announcing Spring Discovery

I’m proud to announce our team, funding, and ambitious mission at Spring Discovery. We’re fighting back against the diseases of old age by accelerating the discovery of anti-aging therapies.

Why do therapies focused on aging present such a profound opportunity? Because aging is the single greatest risk factor for the most detrimental diseases on Earth — cardiovascular disease, neurodegenerative disease, pulmonary disease, cancer, muscle wasting, and more — and drugs that slow the biological damage accumulated while aging have the potential to reduce the incidences of these diseases, possibly simultaneously.

More about our approach and our launch announcement.

Thankful and lucky to have helped build Khan Academy

In early 2010 I saw Sal Khan doing something inspiring. He had been diligently posting educational video after educational video on YouTube for years (“secret to overnight success…”) and was starting to get a real response from thousands of learners around the world.

I looked at the website at the time, thought I might be able to help, and sent him an email asking if I could.

I now consider myself unreasonably lucky to have spent the last 6 years riding the consequences of that email.

I volunteered until Sal turned Khan Academy into a real, more-than-one-person company, at which point I joined as our first engineer. I felt deep pride in using the engineering and management lessons I’d learned from Joel and Fog Creek to build a team pointed at KA’s epic mission: “A free, world-class education for anyone, anywhere.”

Now we’re a company of ~120 people. And that group has been steadily adding momentum to the pursuit of our mission, increasing Khan Academy’s impact a little here and a lot there…making me thankful while I just try to hold on:

(And while far more wishy-washy, my personal favorite signal of Khan Academy’s impact is the fact that every single day we receive stories like these about how access to a free education changed users’ lives. They are many. And meaningful. It’s actually hard not to become numb to them internally, which is a phenomenon we fight against.)

I never would’ve guessed half the above in 2010. This team has taught me many lessons, none more important than the fact that when a small group of people are willing to square up together against an epic problem, they can make a huge difference.

I bold mentions of this team because it’s taken a special group to accomplish the above. I’m not responsible for most of it, and KA’s highest impact is yet to come. But rest assured I’ll be the dude bragging that “at least I didn’t screw it all up in the early days” until I croak — the opportunity to play a small part alongside this team is a lifelong gift.

I’m off to a new mission, a personal connection of mine. By the numbers, it’s unlikely I’ll ever even approach the same level of world impact I got to have here.

But I’m gonna take a crack at a problem that’s important to me, and I’ll be wielding lessons learned from building Khan Academy — most importantly the fact that small groups can square up, together, and make big dents in epic problems.

Khan Academy’s Engineering Principles

You know those super-frustrating movie scenes where the entire plotline is driven by some sort of completely avoidable communication failure?

Where if the two characters WOULD JUST FRIGGIN’ TALK TO EACH OTHER, the whole shebang could’ve been avoided? And you, you poor shmuck in the audience, just have to sit there and watch the author jump through all sorts of hoops to justify why these two otherwise-humanesque characters seem to be woefully incapable of talking like humans?

The “communication failure” trope seems super contrived when I’m watching a movie. But when playing my role as part of a growing company, I’m part of an intricate system that’s constantly creating communication failures just as epic and just as avoidable.

Somewhere along the way to becoming a 40+ person engineering org, writing down our principles felt like the healthy thing to do to help inoculate against communication failure.

The effort’s paid off for us many times over. We’re sharing our principles and more on our engineering blog with the hope of helping someone else.

Margaret Hamilton and the First Men on the Moon

First Men on the Moon is one of my favorite little educational resources. It’s the closest I’ve ever felt to being part of the Apollo 11 moon landing. They took the actual video filmed outside the window of the lunar module, the audio recorded between astronauts and mission control, and a bunch of data about the landing itself — including Armstrong’s heartrate over time — and synced it all together into a sweet interactive experience.

Looks something like this, ‘cept on the site you hear everything and can scrub back'n'forth from descent to “The Eagle has Landed!”

image

Playing w/ this site is how I learned about a software glitch that almost botched the whole thing. Neil Armstrong is a couple minutes from touching down when he says “program alarm” so calmly you could almost miss it amid the chatter. Goes something like:

Armstrong: "Program alarm."

[4 seconds, unrelated chatter]

Armstrong: "It's a 1202."

Aldrin: "1202."

[16 seconds of silence]

Armstrong (now with urgency in his voice): "Give us a reading on the 1202 program alarm."

Mission control: "Roger. We got...we're go on that alarm."

And on the Mission Control loop in Houston — the one the astronauts can’t hear — they say:

Steve Bales: "If it doesn't reoccur, we'll be go."

Go listen to it! For those of you who haven’t spent your lives in software, let me translate from Super Professional NASA Speak™ into ordinary developer chat for you:

image

This is obviously a silly caricature. NASA are professionals. But I read about this event after listening to the audio. Turns out there’s a bit of truth here — when that 1202 error first showed up, there was a lot of confusion. Only one person, thanks to a previous simulator failure, happened to have a hand-scribbled list of various possible program alarms and was able to suggest that it was safe to keep going. The alarm would pop up three more times before landing.

It’s always been one of my favorite moon-landing stories — almost aborting right at the very end.

So I was really happy to learn about the programmer behind this success: Margaret Hamilton. I’d never heard her name until a recent Reddit thread and a follow-up on Hacker News.

Turns out she’s not only responsible for this alarm not ruining the mission but she actually coined the term ‘software engineering’. Apparently she spent a lot of time arguing for software safety checks that prevented astronauts from accidentally doing the wrong thing, despite being regularly assured that this code wouldn’t be worth its weight because professional astronauts don’t accidentally do the wrong thing.

In Apollo 11’s case, a switch in the lunar landing module was set in the wrong position, causing its computer to process data from two radars instead of just the one pointed at the moon. This was too much to process, so Hamilton’s system began (impressively) dropping its low priority tasks in order to have enough CPU to properly handle the critical lunar surface radar input. Wasn’t a software glitch at all — quite the opposite. A graceful handling of a critical situation.

Rest is history. Enjoy it for yourself, and be sure to thank Margaret when Neil gets to ignore those scary-sounding alarms.

A book hiding in a gist

A while back — after leaving Heroku — Adam Wiggins left this little note on twitter:

I normally don’t just “reblog” someone’s work, but this humble little gist just packs so much wisdom so tight. Anyone building a product team will love the quick read.

Me? I find myself re-skimming these gems every few months. “Make it real.” “Ship it.” “Do it with style.” “Own up to failure.” “Write well.” I like keeping these in mind while we build our team at Khan Academy.

Each little bit in that gist could easily be its own blog post. Altogether there’s a great book hiding.

KA Lite

Choosing what to work on is one of the hardest things we do at Khan Academy. Saying no’s hard for everyone, but it sure does feel tough when the person you’re saying no to is some earnest child trying to learn.

Two of the hardest no’s we’ve ever said were aimed at students who couldn’t speak English and students who don’t have access to the internet.

For a long time we said no to internationalization. i18n is hard. Especially when you have more content than all the Harry Potter books combined, much of it in video form. We were a small team. With a core product to figure out and without the time to make it work in other languages. We’d hear cries from all over: “my daughter needs help but doesn’t speak English,” “my students are dying to use KA but can’t navigate the site,” “we’ll translate everything for you, just please tell us how!” We always knew we’d bite the bullet one day, but for a long time we just had to say no, no, no.

At some point the chips fell into place. And now the whole shebang’s available in Spanish, French, Turkish, Portuguese, and soon many others. Looking at those pages makes me proud. The thought of not-just-English-speakers getting access to free educational content reminds me why we’re doing what we’re doing.

image
“No.”

Unfortunately, those without internet are in a different ship altogether. Despite the fact that any student w/out internet is probably also a student most in need of free educational resources, we simply can’t yet focus on that Ridiculously Hard Problem.

Enter Jamie Alexandre, a Summer 2012 intern at Khan Academy. During our first ever hackathon Jamie hacked a Raspberry Pi to run a modified version of KA entirely on cheap components without needing any access to the internet. Fast forward a few months, and next thing we know he tells us he’s started his own non-profit, the Foundation for Learning Equality, dedicated to bringing their own custom version of KA’s platform (“KA Lite”) and other online educational resources to everybody — especially those offline.

I love this story for about a million and twenty reasons. Selfishly it’s just so cool to see an incredible and entirely separate organization spring out of our internship. Non-selfishly, KA Lite has now been installed in over 120 countries, giving access to students, teachers, and even inmates who don’t get internet and all that comes with it.

Internally we still think the best way Khan Academy itself can improve access for those without internet is to build the best educational product and content we can — doing so will motivate others to help us reach everyone one day. But it sure is a lot easier to say no to kids without internet when Jamie and his crew are tackling the problem in their own way.

Thanks for all you do, KA Lite. Please come back and visit for our next healthy hackathon!

Going over the top

When Andy recently decided to join Khan Academy, he emailed us a secretive app and said to run it on an iPad. When we did we found a fun little toy — if the device was angled in just the right way and the little knobs were in just the right position, the words “I accept” playfully emblazoned themselves on the screen. It was such a cool way to join the team.

Point is, he could’ve just emailed us his signed offer letter and the team would’ve been thrilled. Honestly he could’ve used however many hours he spent on that app to get “real work” done or find a great pizza spot. Nobody would’ve thought twice.

When we recently ran the third Khan Academy Healthy Hackathon, I announced it with a ridiculous Dr. Seuss poem (copied below in case you haven’t rolled your eyes today). I also took the time to make a whole little single-page site dedicated to introducing the hackathon.

Point is, all of that was completely unnecessary. Our hackathon was internal only and a quick email to team@khanacademy.org would’ve saved hours. There were at least twenty-seven better ways I could’ve been spending my time.

But if every single person made every single decision based only on what’s most valuable at that exact moment, we’d always be fighting fires.

I’ve come to appreciate how important it is for an organization to make it safe for folks to go over the top. To express themselves by taking an ordinary task — one that you might think isn’t important — and investing so much passion into it that people say, “that’s nuts.” I wanna work with more of those nuts. And I wanna help build a team that embraces ‘em.




You have hacks in your head
But never the time
The last thing you need
Is to be reading a rhyme.

But lo and behold, an opportunity rises
To hack hacks and make makes and prize all sorts of prizes.
It's the third time around and a chance for some firsts
Hackathon Healthy Academy Khan — now read that reversed.

Some hacks will work well
They'll change lots of lives
We'll ship 'em bug-free
And exchange five high-fives.

Except when they don't.
Because, sometimes, they won't.

I'm afraid that some hacks
Might not even ship
That's ok! Time for fun!
Hack make build, let 'er rip.

Three minute quiz: App Engine datastore performance

As somebody now spending all his time in NoSQL land, my brain perked up when working through The Three Minute SQL Performance Quiz. It was a blast from the past for me, a chance to remember all the little ins'n'outs of SQL performance from a quaint old time when I used to be able to write JOIN statements.

So I thought it’d be fun to develop a similar performance challenge for this new NoSQL life of mine. Since we use App Engine at Khan Academy, we’ll focus on the App Engine datastore. Proceed.

Question 1


Query:

Monkey.all().filter(
    "genus IN",
    ["Ateles", "Cebus", "Aotus"]
  ).fetch(10)

Model:

from google.appengine.ext import db

class Monkey(db.Model):
  genus = db.StringProperty(
    indexed=True)

“Hold on there — a major improvement is possible” or “All looks good to me — don’t go changin’ a thang”

You’re right! Queries that use the IN operator may look like a single query, but they actually run multiple queries behind the scenes, one for each item in the list. That means if you ran the above query, opened Appstats, and looked at this request’s profile, you’d see this:

image

…and that’s not ideal if you really care about this request’s performance. The requests are running asynchronously and overlap as much as possible — which is great — but you’re still running three requests and increasing the likelihood that one of ‘em will slow you down.

If you want this to be blazing fast, you almost certainly want to denormalize this set membership into a property that can be queried without an IN operator. Perhaps you’d add something like is_in_favorite_genus = db.BooleanProperty(indexed=True) to Monkey, set that property to True if the genus is one you’re interested in, and then change your query to Monkey.all().filter("is_in_favorite_genus =", True). That’d be a significant improvement — especially if your IN list contained many items. That’s the App Engine way.
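To make the fan-out concrete, here is a toy in-memory model in plain Python. It is not the App Engine API: `ToyDatastore`, `query_eq`, `query_in`, and the RPC counter are all invented for illustration, and the "one sub-query per IN item" behavior is simply modeled as described above. It compares the simulated RPC count of an IN query with a single query against the denormalized boolean.

```python
FAVORITE_GENERA = ["Ateles", "Cebus", "Aotus"]


class ToyDatastore(object):
    """Hypothetical in-memory stand-in that counts simulated RPCs."""

    def __init__(self, entities):
        self.entities = entities  # list of plain dicts
        self.rpc_count = 0

    def query_eq(self, prop, value, limit):
        """One equality query costs one simulated RPC."""
        self.rpc_count += 1
        matches = [e for e in self.entities if e.get(prop) == value]
        return matches[:limit]

    def query_in(self, prop, values, limit):
        """Modeled IN filter: one equality sub-query per value, then merge."""
        results = []
        for value in values:
            results.extend(self.query_eq(prop, value, limit))
        return results[:limit]


def make_monkey(name, genus):
    # Denormalize set membership into a boolean at write time.
    return {
        "name": name,
        "genus": genus,
        "is_in_favorite_genus": genus in FAVORITE_GENERA,
    }


store = ToyDatastore([
    make_monkey("bob", "Ateles"),
    make_monkey("sue", "Cebus"),
    make_monkey("al", "Macaca"),
    make_monkey("kim", "Aotus"),
])

in_results = store.query_in("genus", FAVORITE_GENERA, 10)
in_rpcs = store.rpc_count  # one per item in the IN list

store.rpc_count = 0
flag_results = store.query_eq("is_in_favorite_genus", True, 10)
flag_rpcs = store.rpc_count  # a single query
```

Both approaches return the same monkeys here; the difference is that the IN version pays once per list item while the boolean version pays once, at the cost of keeping the flag up to date on every write.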

Question 2


Query:

Monkey.all().filter(
    "unique_name =",
    "bob_the_monkey"
  ).get()

Model:

from google.appengine.ext import db

class Monkey(db.Model):
  unique_name = db.StringProperty(
    indexed=True)

“Hold on there — a major improvement is possible” or “All looks good to me — don’t go changin’ a thang”

You’re right! If you’re loading a single entity using a unique identifier, queries aren’t as fast as loading by key or key name. There are multiple ways to load an entity by key that we won’t get into here — just know that if you have unique identifiers for your models, you should strongly consider constructing your entities using the unique identifier as part of your model’s key name — Monkey(key_name="bob_the_monkey", **kwds) — so you can quickly retrieve it later: Monkey.get_by_key_name("bob_the_monkey").

If you’re curious why this is faster than a query, it makes sense if you’re willing to swallow the gross oversimplification that a query is just quickly looking up the key in an index and then fetching the entity using the key. If you’ve already got the key, why query?
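That gross oversimplification can be written down as a few lines of plain Python. This is a sketch, not App Engine code: `ToyKeyValueStore` and its step log are hypothetical, and the two-step "index lookup, then entity lookup" shape of a query is assumed exactly as described above.

```python
class ToyKeyValueStore(object):
    """Hypothetical store: entities live under key names, plus one index."""

    def __init__(self):
        self.entities_by_key = {}    # key name -> entity
        self.unique_name_index = {}  # unique_name value -> key name
        self.steps = []              # log of simulated lookups

    def put(self, key_name, entity):
        self.entities_by_key[key_name] = entity
        self.unique_name_index[entity["unique_name"]] = key_name

    def get_by_key_name(self, key_name):
        # Direct fetch: we already know where the entity lives.
        self.steps.append("entity lookup")
        return self.entities_by_key.get(key_name)

    def query_unique_name(self, value):
        # Query: first find the key in the index, then fetch by that key.
        self.steps.append("index lookup")
        key_name = self.unique_name_index.get(value)
        if key_name is None:
            return None
        return self.get_by_key_name(key_name)


store = ToyKeyValueStore()
store.put("bob_the_monkey", {"unique_name": "bob_the_monkey", "hats": 3})

via_query = store.query_unique_name("bob_the_monkey")
query_steps = list(store.steps)  # index lookup, then entity lookup

store.steps = []
via_key = store.get_by_key_name("bob_the_monkey")
key_steps = list(store.steps)    # entity lookup only
```

In this toy model the query always does everything the key fetch does plus one extra hop, which is the whole argument for using the unique identifier as the key name.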

Question 3


image

“One of the coolest dogs ever” or “Meh, she’s so-so”

You’re right! Shouldn’t need explainin’.

Question 4


Putting an entity:

m = Monkey(
    name="Bob the Monkey",
    favorite_color="blue",
    favorite_food="pizza",
    worst_enemy="honey_badger"
  )

m.put()

Model:

from google.appengine.ext import db

class Monkey(db.Model):
  name = db.StringProperty()
  favorite_color = db.StringProperty()
  favorite_food = db.StringProperty()
  worst_enemy = db.StringProperty()

“Hold on there — a major improvement is possible” or “All looks good to me — don’t go changin’ a thang”

You’re right! Kinda. It depends how you’re querying this entity. All of the datastore properties on Monkey are indexed by default, and that means every time you put() a Monkey, you’re spending time writing new values to all of those indexes.

Now, if you need to be able to run queries to find Monkeys based on their favorite_color, favorite_food, worst_enemy, and so on — then you’re doing the right thing. You need the indexes to be able to query them. If your code base is anything like Khan Academy’s used to be, however, you may have tons of properties w/ automatic indexes that never, ever need to be queried. We fixed this by using db.StringProperty(indexed=False) in these cases (and writing a linter that requires us to explicitly specify indexed=True|False for every datastore property).

Write speed may not matter for your app, but if it does you can speed up writes by not wasting time writing to indexes you don’t need.
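A linter like the one mentioned above could be sketched with Python's standard `ast` module. This is not Khan Academy's actual linter. The function name `find_unindexed_declarations` and the rule it enforces (any call shaped like `db.<Something>Property(...)` must pass an `indexed` keyword) are assumptions made for illustration.

```python
import ast


def find_unindexed_declarations(source):
    """Return sorted (line, property type) pairs for db.*Property(...)
    calls that do not pass an explicit indexed= keyword."""
    problems = []
    for node in ast.walk(ast.parse(source)):
        if not isinstance(node, ast.Call):
            continue
        func = node.func
        is_db_property = (
            isinstance(func, ast.Attribute)
            and isinstance(func.value, ast.Name)
            and func.value.id == "db"
            and func.attr.endswith("Property")
        )
        if not is_db_property:
            continue
        if not any(kw.arg == "indexed" for kw in node.keywords):
            problems.append((node.lineno, func.attr))
    return sorted(problems)


# Sample input is only parsed, never executed, so no App Engine SDK is needed.
SAMPLE = '''
class Monkey(db.Model):
  name = db.StringProperty(indexed=True)
  favorite_color = db.StringProperty(indexed=False)
  favorite_food = db.StringProperty()
  worst_enemy = db.StringProperty()
'''

problems = find_unindexed_declarations(SAMPLE)
for lineno, prop_type in problems:
    print("line %d: %s needs an explicit indexed=True|False" % (lineno, prop_type))
```

A real version would probably want exceptions for property types that are never indexed (such as db.TextProperty) and for models that alias the `db` module under another name; this sketch ignores both.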

Question 5


Query:

query = Monkey.all()
query.filter("name =", "bob")
query.filter("zoo =", "Manhattan")
query.filter("hats >", 5)
query.get()

Model:

from google.appengine.ext import db

class Monkey(db.Model):
  name = db.StringProperty(indexed=True)
  zoo = db.StringProperty(indexed=True)
  hats = db.IntegerProperty(indexed=True)

“Hold on there — a major improvement is possible” or “All looks good to me — don’t go changin’ a thang”

You’re right! This query will work. And when you first start out and only have a couple hundred Monkeys, it’ll be blazing fast. But as your family of Monkeys grows, you may find yourself with a Very Serious™ performance issue. Why? You have all of the properties indexed, you’re only asking for a single entity…what’s the problem?

You don’t have a perfect index defined. Without an index that perfectly covers all of the properties involved in the query’s filters, here’s what App Engine has to do when you ask it to fetch a single entity: start querying the datastore for, say, Monkeys with name == “bob”, retrieving them from the datastore, and filtering through them to find any with zoo == “Manhattan” and more than 5 hats. That’s not what you want, and it looks like this in Appstats:

image

That’s a disaster. Depending on the shape of your data, you could spend hundreds of RPCs just returning a single entity from the datastore. You want App Engine to query the datastore for “Monkeys named bob in the Manhattan zoo who own more than 5 hats” in one swift, single blow — it should only need one RPC. And it’ll do exactly this if you have a perfect index. You’ll need an entry in index.yaml, something like:

- kind: Monkey
  properties:
  - name: name
  - name: zoo
  - name: hats

If you aren’t careful and don’t keep an eye to make sure that your important queries have indices that perfectly cover 'em, you could have a piece of code that runs snappy one day and, months later when your datastore fills up with Monkeys, it’ll suddenly take 5+ seconds to return a single entity. We’ve experienced this at Khan Academy more than once. It’s public enemy number one due to how easy it is to not notice the problem at first and then encounter brutal performance problems later.
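Here is a toy simulation of why the perfect index matters as the data grows. It is plain Python, not the datastore: the function names are hypothetical, "entities examined" stands in for datastore work, and the two strategies are modeled on the description above rather than on App Engine's real query planner.

```python
import bisect


def find_with_name_index_only(monkeys, name, zoo, more_than_hats):
    """Modeled scan: walk every monkey with the right name, fetching and
    filtering each one until a full match turns up."""
    examined = 0
    for monkey in monkeys:
        if monkey["name"] != name:
            continue  # assumed to be skipped cheaply via the name index
        examined += 1  # entity fetched and inspected
        if monkey["zoo"] == zoo and monkey["hats"] > more_than_hats:
            return monkey, examined
    return None, examined


def build_composite_index(monkeys):
    """Sorted (name, zoo, hats, position) rows, like an index on all three."""
    return sorted(
        (m["name"], m["zoo"], m["hats"], i) for i, m in enumerate(monkeys)
    )


def find_with_composite_index(index, monkeys, name, zoo, more_than_hats):
    """Jump straight to the first row with hats > more_than_hats."""
    probe = (name, zoo, more_than_hats, float("inf"))
    start = bisect.bisect_right(index, probe)
    if start < len(index) and index[start][:2] == (name, zoo):
        return monkeys[index[start][3]], 1
    return None, 0


# 500 bobs in the wrong zoo, then the one bob we actually want.
monkeys = [{"name": "bob", "zoo": "Bronx", "hats": i % 10} for i in range(500)]
monkeys.append({"name": "bob", "zoo": "Manhattan", "hats": 7})

scan_result, scan_examined = find_with_name_index_only(
    monkeys, "bob", "Manhattan", 5)
index = build_composite_index(monkeys)
index_result, index_examined = find_with_composite_index(
    index, monkeys, "bob", "Manhattan", 5)
```

Both calls find the same monkey, but the scan inspects every bob on the way there while the composite index lands on the answer directly. With a couple hundred Monkeys you would never notice the difference, which is exactly the trap.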

Question 6


Query:

query = Monkey.all()
query.filter("name =", "bob")
query.filter("zoo =", "Manhattan")
query.filter("hats >", 5)
query.get()

Model:

from google.appengine.ext import db
from google.appengine.ext.db import polymodel

class Animal(polymodel.PolyModel):
  name = db.StringProperty(indexed=True)
  zoo = db.StringProperty(indexed=True)

class Monkey(Animal):
  hats = db.IntegerProperty(indexed=True)

index.yaml:

- kind: Animal
  properties:
  - name: name
  - name: zoo
  - name: hats

“Hold on there — a major improvement is possible” or “All looks good to me — don’t go changin’ a thang”

You’re right! What now?! You’ve got your perfect index on all the queried properties! Well, not really.

This’ll suffer from the exact same Very Serious™ performance issue from Question 5. By using PolyModel and querying specifically for Monkeys — not just any old Animal — you’re implicitly adding another filter to your query. This filter will use PolyModel’s built-in special property, class, to only return Monkey results. If you don’t have an index that covers all filtered properties including class, you ain’t got no perfect index.

- kind: Animal
  properties:
  - name: class
  - name: name
  - name: zoo
  - name: hats
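The implicit filter is easy to forget, so here is a small sketch of a coverage check in plain Python. Everything in it is hypothetical: `effective_filters` and `is_perfect_index` are invented names, and the rule used (equality properties first in any order, the inequality property last) is a simplification, not App Engine's full index selection logic.

```python
def effective_filters(filters, is_polymodel_subclass):
    """Filters the query really runs with. Assumes, as described above, that
    querying a PolyModel subclass adds an equality filter on 'class'."""
    if is_polymodel_subclass:
        return [("class", "=")] + list(filters)
    return list(filters)


def is_perfect_index(index_properties, filters):
    """Simplified check: the index must list exactly the equality-filtered
    properties, followed by the inequality-filtered property (if any)."""
    equality = [prop for prop, op in filters if op == "="]
    inequality = [prop for prop, op in filters if op != "="]
    split = len(index_properties) - len(inequality)
    head, tail = index_properties[:split], index_properties[split:]
    return sorted(head) == sorted(equality) and tail == inequality


written_filters = [("name", "="), ("zoo", "="), ("hats", ">")]
monkey_filters = effective_filters(written_filters, is_polymodel_subclass=True)

index_without_class = ["name", "zoo", "hats"]
index_with_class = ["class", "name", "zoo", "hats"]

covered_before = is_perfect_index(index_without_class, monkey_filters)
covered_after = is_perfect_index(index_with_class, monkey_filters)
```

Under these assumptions the first index fails the check and the second passes, even though the filters written in the query never mention class at all.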



That’s it, you’re done! I could keep going, but by now you’ve spotted a trend in the answers and I’m pretty sure we’re past the 3-minute mark.

If you want more or have other interesting quiz questions I’d love to know. And if you geek out on perf work like me, you know what to do.